Secure API access to AI endpoints

WashU’s Generative AI Environment provides cost-effective, easy-to-use, unified API access to a variety of approved AI endpoints across multiple providers. These endpoints are secure and hosted within WashU’s Azure tenant. They are pre-configured with default content and safety filters so users can focus on their AI interactions rather than the many configuration settings typically required. Endpoint services are billed at cost based on usage and are intended for researchers, faculty & staff who need secure access to AI APIs for university-related purposes.

Key Features

Programmatic API access to several sandboxed AI models in WashU’s secure environment
Approved for use with sensitive information, including data covered by HIPAA or FERPA
Access is limited to WashU networks and VPN
Ability to preset a fixed budget for API usage to avoid surprise bills
Ability to monitor token usage and budget in real-time
Billed for token usage to the cost center provided when requesting access

Requesting Access

To access the secure AI API endpoints, you need to first complete a ServiceNow request form. As part of fulfilling this request, you will be provided a Client ID and Client Secret that will be used when accessing the API endpoints. Please follow this link to request access:

Request access to Secure AI APIs

Approved AI Models

The following sandboxed LLMs are currently approved and hosted within WashU’s Secure AI environment.

Click on any model to view more details from within Azure AI Foundry including a model overview, benchmarks, the training data date and more.

Chat Completion Models

Model Name	Model Version	Requests per Minute (RPM)	Tokens per Minute (TPM)	$ per 1M Tokens
gpt-4o	2024-11-20	120	120,000	~$4.00
gpt-4o-mini	2024-07-18	1,200	120,000	~$0.30
gpt-4.1	2025-04-14	100	100,000	~$3.50
gpt-4.1-mini	2025-04-14	120	120,000	~$0.70
gpt-4.1-nano	2025-04-14	120	120,000	~$0.20
gpt-5	2025-08-07	1,200	120,000	~$3.50
gpt-5-mini	2025-08-07	120	120,000	~$0.70
gpt-5-nano	2025-08-07	120	120,000	~$0.20
grok-3	1	100	100,000	~$6.00
grok-3-mini	1	100	100,000	~$0.50
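
The "$ per 1M Tokens" column can be turned into a rough budget estimate before you request access. The following is a minimal sketch, not an official calculator: the function names are hypothetical, the prices are the approximate values copied from the table above, and it assumes a single blended rate per token because the table does not separate input and output tokens.

```python
# Approximate prices in USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION_USD = {
    "gpt-4o": 4.00,
    "gpt-4o-mini": 0.30,
    "gpt-4.1": 3.50,
    "gpt-4.1-mini": 0.70,
    "gpt-4.1-nano": 0.20,
    "gpt-5": 3.50,
    "gpt-5-mini": 0.70,
    "gpt-5-nano": 0.20,
    "grok-3": 6.00,
    "grok-3-mini": 0.50,
}


def estimate_cost_usd(model: str, total_tokens: int) -> float:
    """Rough cost in US dollars of `total_tokens` tokens on `model`."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_USD[model]


def tokens_within_budget(model: str, budget_usd: float) -> int:
    """Approximate number of tokens that a fixed budget buys on `model`."""
    return int(budget_usd / PRICE_PER_MILLION_USD[model] * 1_000_000)
```

For example, `estimate_cost_usd("gpt-5-mini", 2_000_000)` estimates the cost of two million tokens on gpt-5-mini, and `tokens_within_budget("grok-3", 12.0)` estimates how many grok-3 tokens a $12 budget covers. Actual charges are whatever is billed to your cost center.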

Embedding Models

Model Name	Model Version	Requests per Minute (RPM)	Tokens per Minute (TPM)
text-embedding-3-small	1	120	120,000
text-embedding-ada-002	2	120	120,000
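
Scripts that send many requests in a loop can stay under the Requests per Minute limits in the tables above by pacing themselves on the client side. The following is a minimal sketch under stated assumptions: `RequestPacer` is a hypothetical helper, it spaces requests evenly (which is stricter than the limit requires), and it does not track the Tokens per Minute limit. How the service responds when a limit is exceeded is not described on this page.

```python
import time


class RequestPacer:
    """Spaces calls evenly so that at most `rpm` requests start per minute."""

    def __init__(self, rpm: int, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm  # seconds between request starts
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest start time of the next request

    def wait(self) -> float:
        """Block until the next request may start; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Create one pacer per model, for example `pacer = RequestPacer(rpm=120)`, and call `pacer.wait()` immediately before each API call.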

Models in review

The following models are undergoing an Information Security Review before being released.

Model Name	Model Version	Requests per Minute (RPM)	Tokens per Minute (TPM)	$ per 1M Tokens
gpt-5-chat	2025-08-07	120	120,000	~$3.70
llama-4-maverick	1		120,000	~$0.60
deepseek-r1-0528	1	100	100,000	~$2.40

Getting Started

To access the API endpoints, you must be on the WashU network, either directly or through the WashU VPN. (VPN instructions can be found at it.wustl.edu/items/connect.)

1. Get Microsoft OAuth 2.0 Token

The API endpoints are secured with Microsoft OAuth 2.0 tokens. You must retrieve a token before you can make an API call.

Client ID & Client Secret are securely provided to you when your registration request is fulfilled.

Treat the Client Secret as a password: it is essential that only authorized users have access to it, per the WashU Information Security Authentication, Authorization & Audit policy.

Request (the Microsoft token endpoint expects these fields as an application/x-www-form-urlencoded body):

POST https://login.microsoftonline.com/4ccca3b5-71cd-4e6d-974b-4d9beb96c6d6/oauth2/v2.0/token

{
  "client_id": "[provided during registration]",
  "client_secret": "[provided during registration]",
  "scope": "api://bbeee386-60d6-4ba4-b9a7-631763f66065/.default",
  "grant_type": "client_credentials"
}

Response: Status Code: 200

{
  "body": {
    "access_token": "[ACCESS TOKEN TO USE]"
  }
}
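
The token request above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the official client: the function names are hypothetical, the fields are sent form-encoded as the Microsoft identity platform token endpoint expects, and the token is read from either a top-level `access_token` field or the `body` wrapper shown in the response above.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = (
    "https://login.microsoftonline.com/"
    "4ccca3b5-71cd-4e6d-974b-4d9beb96c6d6/oauth2/v2.0/token"
)
SCOPE = "api://bbeee386-60d6-4ba4-b9a7-631763f66065/.default"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the client-credentials token request as a form-encoded POST."""
    form = urllib.parse.urlencode(
        {
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": SCOPE,
            "grant_type": "client_credentials",
        }
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def extract_access_token(payload: dict) -> str:
    """Return the token whether or not the JSON is wrapped in a "body" object."""
    return payload.get("body", payload)["access_token"]


def get_access_token(client_id: str, client_secret: str) -> str:
    """Request a token; needs network access and valid credentials."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_access_token(json.load(response))
```

Keep the Client ID and Client Secret out of source code, for example by reading them from environment variables or a secrets manager before calling `get_access_token`.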

2. Assemble API Calls

Once an OAuth token is retrieved, make your API endpoint calls with the model specified either in the body or the query string.

See Work with chat completion models for additional examples and syntax support.

Example 1 (Request Body): https://api.openai.wustl.edu/models/v1/chat/completions

Headers:
  {
    "Content-Type": "application/json",
    "Authorization": "Bearer [access_token]"
  }
Body:
  {
    "model": "gpt-5-mini",
    "messages": [
      {
        "role": "user",
        "content": "Hello world."
      }
    ]
  }


Example 2 (Query String): https://api.openai.wustl.edu/models/v1/chat/completions?model=grok-3

Headers:
  {
    "Content-Type": "application/json",
    "Authorization": "Bearer [access_token]"
  }
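
Both examples can be sketched in Python using only the standard library. This is a minimal illustration with hypothetical function names. It assumes the endpoint is called with POST and that the query-string variant still carries the `messages` array in the request body, neither of which is stated explicitly above.

```python
import json
import urllib.parse
import urllib.request

CHAT_URL = "https://api.openai.wustl.edu/models/v1/chat/completions"


def build_chat_request(access_token, model, messages, model_in_query=False):
    """Build a chat completion POST with the model in the body or the query string."""
    url = CHAT_URL
    body = {"messages": messages}
    if model_in_query:
        # Example 2: the model is named in the query string.
        url += "?" + urllib.parse.urlencode({"model": model})
    else:
        # Example 1: the model is named in the request body.
        body["model"] = model
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


def send_chat(request):
    """Send the request and return the decoded JSON (needs WashU network or VPN)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A call might look like `send_chat(build_chat_request(token, "gpt-5-mini", [{"role": "user", "content": "Hello world."}]))`. In OpenAI-style responses the reply text is usually found at `choices[0].message.content`, but confirm the response shape against the linked documentation.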

New and updated models are released twice per year.

If you have a need for a specific model not currently available, please email ithelp@wustl.edu with your request so it can be reviewed & prioritized.

Secure AI API FAQs

Where can I find example code using APIs?

Example Python code can be found at github.com/washu-ai/genai-api-demos

Where can I collaborate with other APIM users and get support?

Secure AI API users can collaborate in the GenAI APIM Q&A team in Microsoft Teams (access is granted when your API access request is fulfilled).

Can I request a quota increase?

You can submit a request to increase your API quota amount by completing this ServiceNow Request: Request Quota Increase.

Can I request a new model be added to APIM?

If you have a need for a specific model not currently available, please email ithelp@wustl.edu with your request so it can be reviewed & prioritized.

What content filtering is set on the APIs?

Models are hosted with default Azure AI Foundry Content Filtering.

The default API content filtering is blocking my research project. Can it be adjusted?

Unfiltered models are hosted within WashU’s secure environment and are available on an exception basis. If the default API content filtering is negatively impacting your research, please email ithelp@wustl.edu with your exception request so someone can reach out to discuss further.