Azure AI Search doesn't host embedding models, so you're responsible for creating vectors for query inputs and outputs. Choose one of the following approaches:
| Approach | Description |
|---|---|
| Integrated vectorization | Use built-in data chunking and vectorization in Azure AI Search. This approach takes a dependency on indexers, skillsets, and built-in or custom skills that point to external embedding models, such as those in Azure AI Foundry. |
| Manual vectorization | Manage data chunking and vectorization yourself. For indexing, you push prevectorized documents into vector fields in a search index. For querying, you provide precomputed vectors to the search engine. For demos of this approach, see the azure-search-vector-samples GitHub repository. |
We recommend integrated vectorization for most scenarios. Although you can use any supported embedding model, this article uses Azure OpenAI models for illustration.
## How embedding models are used in vector queries
Embedding models generate vectors for both query inputs and query outputs. Query inputs include:
- Text or images that are converted to vectors during query processing. As part of integrated vectorization, a vectorizer performs this task.
- Precomputed vectors. You can generate these vectors by passing the query input to an embedding model of your choice. To avoid rate limiting, implement retry logic in your workload. Our Python demo uses tenacity; a comparable C# sketch follows this list.
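For .NET workloads, a minimal retry sketch might use the Polly library as a rough counterpart to tenacity. The `PostWithRetryAsync` helper, the backoff schedule, and the status-code checks below are illustrative assumptions, not Azure SDK APIs:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly; // NuGet: Polly

static class EmbeddingRetry
{
    // Retry throttled (429) responses and transient server errors with exponential backoff.
    static readonly IAsyncPolicy<HttpResponseMessage> RetryPolicy = Policy
        .Handle<HttpRequestException>()
        .OrResult<HttpResponseMessage>(r => (int)r.StatusCode == 429 || (int)r.StatusCode >= 500)
        .WaitAndRetryAsync(5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

    // The request content is created per attempt because HttpClient
    // disposes request content after sending it.
    public static Task<HttpResponseMessage> PostWithRetryAsync(
        HttpClient client, string uri, Func<HttpContent> contentFactory) =>
        RetryPolicy.ExecuteAsync(() => client.PostAsync(uri, contentFactory()));
}
```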
Based on the query input, the search engine retrieves matching documents from your search index. These documents are the query outputs.
Your search index must already contain documents with one or more vector fields populated by embeddings. You can create these embeddings through integrated or manual vectorization. To ensure accurate results, use the same embedding model for indexing and querying.
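As a hedged illustration of both kinds of query input, the following sketch uses the Azure.Search.Documents SDK (version 11.6.0 or later). The service endpoint, the index name `my-index`, and the vector field `contentVector` are placeholder assumptions; substitute your own resources:

```csharp
using System;
using Azure;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

// Placeholder service, index, and credential values.
var searchClient = new SearchClient(
    new Uri("https://YOUR-SEARCH-SERVICE.search.windows.net"),
    "my-index",
    new AzureKeyCredential("YOUR-SEARCH-API-KEY"));

var options = new SearchOptions
{
    VectorSearch = new()
    {
        Queries =
        {
            // Text input: the index's vectorizer generates the embedding at query time.
            // For a precomputed vector, use new VectorizedQuery(embedding) instead.
            new VectorizableTextQuery("How do I use C# in VS Code?")
            {
                KNearestNeighborsCount = 5,
                Fields = { "contentVector" }
            }
        }
    }
};

// The matching documents (the query outputs) come back ranked by vector similarity.
SearchResults<SearchDocument> results =
    await searchClient.SearchAsync<SearchDocument>(searchText: null, options);
```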
## Tips for embedding model integration

- **Identify use cases.** Evaluate the use cases where embedding model integration adds value to your search solution. Examples include multimodal search (such as matching image content with text content), multilingual search, and similarity search.
- **Design a chunking strategy.** Embedding models have limits on the number of tokens they accept, so data chunking is necessary for large files. For a minimal example, see the sketch after this list.
- **Optimize cost and performance.** Vector search is resource intensive and subject to maximum limits, so vectorize only the fields that contain semantic meaning. Reduce vector size to store more vectors for the same price.
- **Choose the right embedding model.** Select a model for your use case, such as word embeddings for text-based searches or image embeddings for visual searches. Consider pretrained models, such as text-embedding-ada-002 from OpenAI or the Image Retrieval REST API from Azure AI Vision.
- **Normalize vector lengths.** To improve the accuracy and performance of similarity search, normalize vector lengths before you store them in a search index. Most pretrained models already produce normalized embeddings. The sketch after this list includes a normalization helper.
- **Fine-tune the model.** If needed, fine-tune the model on your domain-specific data to improve its performance and relevance to your search application.
- **Test and iterate.** Continuously test and refine the embedding model integration to achieve your desired search performance and user satisfaction.
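As a concrete starting point for the chunking and normalization tips above, here's a minimal sketch. The 512-token chunk size, 64-token overlap, and whitespace tokenization are illustrative assumptions; a real pipeline should count tokens with the embedding model's own tokenizer:

```csharp
using System;
using System.Collections.Generic;

static class VectorPrep
{
    // Naive fixed-size chunking on whitespace, with overlap between chunks
    // so that sentences straddling a boundary appear in both chunks.
    public static IEnumerable<string> Chunk(string text, int maxTokens = 512, int overlap = 64)
    {
        if (maxTokens <= overlap) throw new ArgumentException("maxTokens must exceed overlap.");
        var words = text.Split(new[] { ' ', '\n', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        for (int start = 0; start < words.Length; start += maxTokens - overlap)
        {
            int length = Math.Min(maxTokens, words.Length - start);
            yield return string.Join(" ", words, start, length);
            if (start + length >= words.Length) break;
        }
    }

    // L2 normalization: divide each component by the vector's Euclidean length,
    // so that cosine similarity and dot product agree.
    public static float[] Normalize(float[] vector)
    {
        double sumOfSquares = 0;
        foreach (var v in vector) sumOfSquares += (double)v * v;
        double norm = Math.Sqrt(sumOfSquares);
        if (norm == 0) return vector;
        var normalized = new float[vector.Length];
        for (int i = 0; i < vector.Length; i++) normalized[i] = (float)(vector[i] / norm);
        return normalized;
    }
}
```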
## Create resources in the same region
Although integrated vectorization with Azure OpenAI embedding models doesn't require resources to be in the same region, using the same region can improve performance and reduce latency.
To use the same region for your resources:

1. Check the regional availability of Azure AI Search.
2. Create an Azure OpenAI resource and an Azure AI Search service in the same region.
> [!TIP]
> Want to use semantic ranking for hybrid queries or a machine learning model in a custom skill for AI enrichment? Choose an Azure AI Search region that provides those features.
## Choose an embedding model in Azure AI Foundry
When you add knowledge to an agent workflow in the Azure AI Foundry portal, you have the option of creating a search index. A wizard guides you through the steps.
One step involves selecting an embedding model to vectorize your plain text content. The following models are supported:
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002
- Cohere-embed-v3-english
- Cohere-embed-v3-multilingual
Your model must already be deployed, and you must have permission to access it. For more information, see Deployment overview for Azure AI Foundry Models.
## Generate an embedding for an improvised query
If you don't want to use integrated vectorization, you can manually generate an embedding and paste it into the `vectorQueries.vector` property of a vector query. For more information, see Create a vector query in Azure AI Search.

The following examples assume the text-embedding-ada-002 model. Replace `YOUR-API-KEY` and `YOUR-OPENAI-RESOURCE` with your Azure OpenAI resource details.
```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;

class Program
{
    static async Task Main(string[] args)
    {
        var apiKey = "YOUR-API-KEY";
        var apiBase = "https://YOUR-OPENAI-RESOURCE.openai.azure.com";
        var apiVersion = "2024-02-01";
        var engine = "text-embedding-ada-002";

        var client = new HttpClient();

        // Azure OpenAI expects the key in an api-key header. An Authorization: Bearer
        // header is used only with Microsoft Entra ID tokens, not with API keys.
        client.DefaultRequestHeaders.Add("api-key", apiKey);

        var requestBody = new
        {
            input = "How do I use C# in VS Code?"
        };

        // Call the embeddings endpoint of the deployed model.
        var response = await client.PostAsync(
            $"{apiBase}/openai/deployments/{engine}/embeddings?api-version={apiVersion}",
            new StringContent(JsonConvert.SerializeObject(requestBody), Encoding.UTF8, "application/json")
        );

        var responseBody = await response.Content.ReadAsStringAsync();
        Console.WriteLine(responseBody);
    }
}
```
The output is a vector array of 1,536 dimensions, returned in the response's `data[0].embedding` field.
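To complete the round trip, you can paste that embedding into the `vectorQueries.vector` property of a search request. The following fragment, which reuses the usings from the example above and belongs in the same `Main` method, is a sketch against the 2024-07-01 stable Search REST API. The index name `my-index` and the vector field `contentVector` are placeholder assumptions; substitute your own names:

```csharp
var searchKey = "YOUR-SEARCH-API-KEY";
var searchEndpoint = "https://YOUR-SEARCH-SERVICE.search.windows.net";

var searchClient = new HttpClient();

// Azure AI Search also expects its admin or query key in an api-key header.
searchClient.DefaultRequestHeaders.Add("api-key", searchKey);

var searchBody = new
{
    select = "title",
    vectorQueries = new[]
    {
        new
        {
            kind = "vector",
            // Paste the full 1,536-dimension embedding from the previous response here.
            vector = new[] { 0.0123f, -0.0456f },
            fields = "contentVector",
            k = 5
        }
    }
};

var searchResponse = await searchClient.PostAsync(
    $"{searchEndpoint}/indexes/my-index/docs/search?api-version=2024-07-01",
    new StringContent(JsonConvert.SerializeObject(searchBody), Encoding.UTF8, "application/json")
);
Console.WriteLine(await searchResponse.Content.ReadAsStringAsync());
```

The `k` parameter controls how many nearest neighbors the vector query returns; `fields` must name a vector field whose embeddings were produced by the same model you used for the query.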