Secure API access to AI endpoints
WashU’s Generative AI Environment provides cost-effective, easy-to-use, unified API access to a variety of approved AI endpoints across multiple providers. These endpoints are secure and hosted within WashU’s Azure tenant. They are preconfigured with default content and safety filters, so users can focus on their AI interactions rather than the many configuration settings typically required. Endpoint services are billed at cost based on usage and are intended for researchers, faculty, and staff who need secure access to AI APIs for university-related purposes.
Key Features
- Programmatic API access to several sandboxed AI models in WashU’s secure environment
- Approved for use with sensitive information, including data covered by HIPAA or FERPA
- Access is limited to WashU networks and VPN
- Ability to preset a fixed budget for API usage to avoid surprise bills
- Ability to monitor token usage and budget in real time
- Billed for token usage to the cost center provided when requesting access
Requesting Access
To access the secure AI API endpoints, first complete a ServiceNow request form. When the request is fulfilled, you will receive a Client ID and Client Secret to use when accessing the API endpoints. Please follow this link to request access:
Approved AI Models
The following sandboxed LLMs are currently approved and hosted within WashU’s Secure AI environment.
Click on any model to view more details from within Azure AI Foundry including a model overview, benchmarks, the training data date and more.
Chat Completion Models
| Model Name | Model Version | Requests per Minute (RPM) | Tokens per Minute (TPM) | $ per 1M Tokens |
|---|---|---|---|---|
| gpt-4o | 2024-11-20 | 120 | 120,000 | ~$4.00 |
| gpt-4o-mini | 2024-07-18 | 1,200 | 120,000 | ~$0.30 |
| gpt-4.1 | 2025-04-14 | 100 | 100,000 | ~$3.50 |
| gpt-4.1-mini | 2025-04-14 | 120 | 120,000 | ~$0.70 |
| gpt-4.1-nano | 2025-04-14 | 120 | 120,000 | ~$0.20 |
| gpt-5 | 2025-08-07 | 1,200 | 120,000 | ~$3.50 |
| gpt-5-mini | 2025-08-07 | 120 | 120,000 | ~$0.70 |
| gpt-5-nano | 2025-08-07 | 120 | 120,000 | ~$0.20 |
| grok-3 | 1 | 100 | 100,000 | ~$6.00 |
| grok-3-mini | 1 | 100 | 100,000 | ~$0.50 |
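The per-token prices and TPM limits in the table can be turned into a rough budget estimate. The sketch below is illustrative only: the function names are hypothetical, the prices are the approximate figures copied from the table, and it assumes a single blended rate for all tokens, which the table does not confirm.

```python
# Approximate prices copied from the table above (USD per 1M tokens).
# Assumption: one blended rate applies to every token billed.
PRICE_PER_MILLION_USD = {
    "gpt-4o": 4.00,
    "gpt-4o-mini": 0.30,
    "gpt-5-mini": 0.70,
    "grok-3": 6.00,
}


def estimate_cost_usd(model: str, total_tokens: int) -> float:
    """Rough cost of processing total_tokens with the given model."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_USD[model]


def minutes_at_tpm_limit(total_tokens: int, tokens_per_minute: int) -> float:
    """Lower bound on wall-clock minutes when throttled by a TPM limit."""
    return total_tokens / tokens_per_minute


if __name__ == "__main__":
    tokens = 2_000_000
    print(f"gpt-4o cost estimate: ${estimate_cost_usd('gpt-4o', tokens):.2f}")
    print(f"minimum minutes at 120,000 TPM: {minutes_at_tpm_limit(tokens, 120_000):.1f}")
```

An estimate like this can help choose a sensible fixed budget when requesting access. Actual charges are determined by the usage that is billed.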
Embedding Models
| Model Name | Model Version | Requests per Minute (RPM) | Tokens per Minute (TPM) |
|---|---|---|---|
| text-embedding-3-small | 1 | 120 | 120,000 |
| text-embedding-ada-002 | 2 | 120 | 120,000 |
Models in review
The following models are undergoing an Information Security Review before being released.
| Model Name | Model Version | Requests per Minute (RPM) | Tokens per Minute (TPM) | $ per 1M Tokens |
|---|---|---|---|---|
| gpt-5-chat | 2025-08-07 | 120 | 120,000 | ~$3.70 |
| llama-4-maverick | 1 | | 120,000 | ~$0.60 |
| deepseek-r1-0528 | 1 | 100 | 100,000 | ~$2.40 |
Getting Started
To access the API endpoints, you must be on the WashU network or connected to it through the WashU VPN. VPN instructions can be found at it.wustl.edu/items/connect.
1. Get Microsoft OAuth 2.0 Token
The API endpoints are secured with Microsoft OAuth 2.0 tokens. A token must be retrieved before an API call can be made.
The Client ID and Client Secret are securely provided to you when your registration request is fulfilled.
It is essential that only authorized users have access to these credentials, per the WashU Information Security Authentication, Authorization & Audit policy.
Request (the fields below are sent as application/x-www-form-urlencoded form fields, as the Microsoft identity platform token endpoint expects):
POST https://login.microsoftonline.com/4ccca3b5-71cd-4e6d-974b-4d9beb96c6d6/oauth2/v2.0/token
{
"client_id": "[provided during registration]",
"client_secret": "[provided during registration]",
"scope": "api://bbeee386-60d6-4ba4-b9a7-631763f66065/.default",
"grant_type": "client_credentials"
}
Response: Status Code: 200
{
"body": {
"access_token": "[ACCESS TOKEN TO USE]"
}
}
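The token request above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, and it assumes the fields are sent form-encoded, as the Microsoft identity platform token endpoint expects for the client credentials grant. Because the response shown above nests the token under "body", the parser accepts either that shape or a top-level "access_token".

```python
import json
import urllib.parse
import urllib.request

TENANT_ID = "4ccca3b5-71cd-4e6d-974b-4d9beb96c6d6"
TOKEN_URL = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
SCOPE = "api://bbeee386-60d6-4ba4-b9a7-631763f66065/.default"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) the client-credentials token request."""
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": SCOPE,
        "grant_type": "client_credentials",
    }).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_access_token(client_id: str, client_secret: str) -> str:
    """Send the request and return the access token string."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumption: tolerate both a top-level token and the "body" wrapper.
    return payload.get("body", payload)["access_token"]
```

Read the Client ID and Client Secret from environment variables or a secrets manager rather than hard-coding them in scripts or notebooks.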
2. Assemble API Calls
Once an OAuth token is retrieved, make your API endpoint calls with the model specified either in the request body or in the query string.
See “Work with chat completion models” for additional examples and syntax support.
Example 1 (Request Body): https://api.openai.wustl.edu/models/v1/chat/completions
Headers:
{
"Content-Type": "application/json",
"Authorization": "Bearer [access_token]"
}
Body:
{
"model": "gpt-5-mini",
"messages": [
{
"role": "user",
"content": "Hello world."
}
]
}
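Example 1 can be sketched in Python with the standard library as follows. The function names are hypothetical, `access_token` is the token obtained in step 1, and the response is returned as parsed JSON without assuming anything about its fields.

```python
import json
import urllib.request

CHAT_URL = "https://api.openai.wustl.edu/models/v1/chat/completions"


def build_chat_request(access_token: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with the model in the body."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        CHAT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


def send_chat(access_token: str, model: str, prompt: str) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_chat_request(access_token, model, prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A call such as `send_chat(token, "gpt-5-mini", "Hello world.")` only succeeds from the WashU network or VPN, because access to the endpoints is restricted to those networks.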
Example 2 (Query String): https://api.openai.wustl.edu/models/v1/chat/completions?model=grok-3
Headers:
{
"Content-Type": "application/json",
"Authorization": "Bearer [access_token]"
}
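The query-string variant in Example 2 might look like the sketch below. Example 2 does not show a request body, so this assumes the messages are still sent as JSON in the body while the model moves to the `model` query parameter. The function name is hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.openai.wustl.edu/models/v1/chat/completions"


def build_chat_request_with_query(access_token: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat request that names the model in the query string."""
    url = f"{BASE_URL}?{urllib.parse.urlencode({'model': model})}"
    # Assumption: the messages still travel in the JSON body.
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )
```

Putting the model in the URL can be convenient when a client library fixes the request body but allows a custom base URL.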
New and updated models are released twice per year.
If you need a specific model that is not currently available, please email ithelp@wustl.edu with your request so it can be reviewed and prioritized.
Secure AI API FAQs
Example Python code can be found at github.com/washu-ai/genai-api-demos.
Secure AI API users can collaborate in the Microsoft Teams team GenAI APIM Q&A (access is granted when API access is fulfilled).
To request an increase to your API quota, complete this ServiceNow request: Request Quota Increase.
Models are hosted with default Azure AI Foundry Content Filtering.
Unfiltered models are hosted within WashU’s secure environment and are available on an exception basis. If the default API content filtering is negatively impacting your research, please email ithelp@wustl.edu with your exception request so someone can reach out to discuss further.