Import an OpenAI-compatible Google Gemini API

APPLIES TO: All API Management tiers

This article shows you how to import an OpenAI-compatible Google Gemini API to access models such as gemini-2.0-flash. For these models, Azure API Management can manage an OpenAI-compatible chat completions endpoint.
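
Because the endpoint is OpenAI-compatible, any OpenAI client library can call it directly. As a minimal sketch (assuming the openai Python package is installed and your Gemini API key is in a GEMINI_API_KEY environment variable), a direct call to the Gemini endpoint looks like this:

    import os
    from openai import OpenAI

    # Point the OpenAI SDK at Gemini's OpenAI-compatible base URL.
    # The SDK sends the key as "Authorization: Bearer <key>" automatically.
    client = OpenAI(
        base_url="https://generativelanguage.googleapis.com/v1beta/openai",
        api_key=os.environ["GEMINI_API_KEY"],
    )

    response = client.chat.completions.create(
        model="gemini-2.0-flash",
        messages=[{"role": "user", "content": "How are you?"}],
    )
    print(response.choices[0].message.content)

Importing the API into API Management puts your gateway in front of this same endpoint.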

Learn more about managing AI APIs in API Management.

Prerequisites

  • An existing API Management instance. Create one if you haven't already.
  • A Gemini API key.

Import an OpenAI-compatible Gemini API using the portal

  1. In the Azure portal, navigate to your API Management instance.

  2. In the left menu, under APIs, select APIs > + Add API.

  3. Under Define a new API, select Language Model API.

    Screenshot of creating a passthrough language model API in the portal.

  4. On the Configure API tab:

    1. Enter a Display name and optional Description for the API.

    2. In URL, enter the following base URL from the Gemini OpenAI compatibility documentation: https://generativelanguage.googleapis.com/v1beta/openai

    3. In Path, append a path that your API Management instance uses to route requests to the Gemini API endpoints.

    4. In Type, select Create OpenAI API.

    5. In Access key, enter the following:

      1. Header name: Authorization.
      2. Header value (key): Bearer followed by a space and your Gemini API key (for example, Bearer <your-gemini-api-key>).

    Screenshot of importing a Gemini LLM API in the portal.

  5. On the remaining tabs, optionally configure policies to manage token consumption, semantic caching, and AI content safety. For details, see Import a language model API.

  6. Select Review.

  7. After settings are validated, select Create.

API Management creates the API and configures the following:

  • A backend resource and a set-backend-service policy that direct API requests to the Google Gemini endpoint.
  • Access to the LLM backend using the Gemini API key you provided. The key is protected as a secret named value in API Management.
  • Optionally, any policies you configured to help you monitor and manage the API.
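
With the API in place, clients call your gateway instead of Google's endpoint, and the gateway attaches the stored Gemini key for you. The following is a minimal sketch using the openai Python package, not part of the article's own steps: the gateway hostname, the path, and the APIM_SUBSCRIPTION_KEY environment variable are placeholders, and it assumes subscription-key authentication is enabled on the API (API Management's default, sent in the Ocp-Apim-Subscription-Key header):

    import os
    from openai import OpenAI

    client = OpenAI(
        # Placeholders: your gateway hostname and the path you set during import.
        base_url="https://<your-apim-instance>.azure-api.net/<your-gemini-path>",
        api_key="unused",  # the gateway injects the Gemini key; this only satisfies the SDK
        default_headers={
            # Assumption: subscription-key auth is enabled on the API.
            "Ocp-Apim-Subscription-Key": os.environ["APIM_SUBSCRIPTION_KEY"],
        },
    )

    response = client.chat.completions.create(
        model="gemini-2.0-flash",
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "How are you?"},
        ],
        max_tokens=50,
    )
    print(response.choices[0].message.content)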

Test Gemini model

After importing the API, you can test the chat completions endpoint for the API.

  1. Select the API that you created in the previous step.

  2. Select the Test tab.

  3. Select the POST Creates a model response for the given chat conversation operation, which is a POST request to the /chat/completions endpoint.

  4. In the Request body section, enter the following JSON to specify the model and an example prompt. In this example, the gemini-2.0-flash model is used.

    {
        "model": "gemini-2.0-flash",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant"
            },
            {
                "role": "user",
                "content": "How are you?"
            }
        ],
        "max_tokens": 50
    }
    

    When the test is successful, the backend responds with a successful HTTP response code and the model's completion. Token usage data is appended to the response to help you monitor and manage your language model token consumption. To run the same request outside the portal, see the sketch after these steps.

    Screenshot of testing a Gemini LLM API in the portal.
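
You can run the same test from your own machine. This sketch sends the request body shown above using only the Python standard library; the gateway hostname, path, and subscription key header are the same placeholders and assumptions as in the earlier sketch:

    import json
    import os
    import urllib.request

    # Placeholders: your gateway hostname and the path you set during import.
    url = "https://<your-apim-instance>.azure-api.net/<your-gemini-path>/chat/completions"

    body = {
        "model": "gemini-2.0-flash",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "How are you?"},
        ],
        "max_tokens": 50,
    }

    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            # Assumption: subscription-key auth is enabled (the API Management default).
            "Ocp-Apim-Subscription-Key": os.environ["APIM_SUBSCRIPTION_KEY"],
        },
    )

    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)

    print(data["choices"][0]["message"]["content"])
    print(data.get("usage"))  # token usage data appended for monitoring consumption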