Query embedding models

2025-08-08

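This article describes how to write query requests for foundation models that are optimized for embedding tasks and send them to your model serving endpoint.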

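The examples in this article apply to querying foundation models that are made available using either of the following:

Foundation Model APIs, which are referred to as Databricks-hosted foundation models.
External models, which are referred to as foundation models hosted outside of Databricks.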

Requirements

See Requirements.
Install the appropriate package on your cluster based on the querying client option you choose.

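Query examples

The following are embedding requests for the gte-large-en model made available by Foundation Model APIs, using the different client options.

OpenAI client

To use the OpenAI client, specify the model serving endpoint name as the model input.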


from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
openai_client = w.serving_endpoints.get_open_ai_client()

response = openai_client.embeddings.create(
  model="databricks-gte-large-en",
  input="what is databricks"
)

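To query foundation models from outside your workspace, you must use the OpenAI client directly, as shown below. The following example assumes you have a Databricks API token and that openai is installed on your compute. You also need your Databricks workspace instance to connect the OpenAI client to Databricks.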


from openai import OpenAI

client = OpenAI(
    api_key="dapi-your-databricks-token",
    base_url="https://example.staging.cloud.databricks.com/serving-endpoints"
)

response = client.embeddings.create(
  model="databricks-gte-large-en",
  input="what is databricks"
)

SQL

Important

The following example uses the built-in SQL function ai_query. This function is in Public Preview and its definition might change.


SELECT ai_query(
    "databricks-gte-large-en",
    "Can you explain AI in ten words?"
  )

REST API

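Important

The following example uses REST API parameters for querying serving endpoints that serve foundation models and external models. These parameters are in Public Preview and their definition might change. See POST /serving-endpoints/{name}/invocations.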


curl \
-u token:$DATABRICKS_TOKEN \
-X POST \
-H "Content-Type: application/json" \
-d  '{ "input": "Embed this sentence!"}' \
https://<workspace_host>.databricks.com/serving-endpoints/databricks-gte-large-en/invocations
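The same request can also be assembled from Python. The following is a minimal sketch using only the standard library. The helper name build_invocation_request is hypothetical, the host and token are the placeholder values used elsewhere in this article, and the sketch assumes the endpoint accepts the token as a bearer token in the Authorization header rather than through the basic authentication that curl -u uses. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_invocation_request(
    host: str, endpoint: str, token: str, text: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for an embedding endpoint."""
    url = f"{host.rstrip('/')}/serving-endpoints/{endpoint}/invocations"
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the token is passed as a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


request = build_invocation_request(
    host="https://example.staging.cloud.databricks.com",  # placeholder host
    endpoint="databricks-gte-large-en",
    token="dapi-your-databricks-token",  # placeholder token
    text="Embed this sentence!",
)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON body of the response.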

MLflow Deployments SDK

Important

The following example uses the predict() API from the MLflow Deployments SDK.


import os
import mlflow.deployments

# Only needed when running outside a Databricks notebook.
os.environ["DATABRICKS_HOST"] = "https://<workspace_host>.databricks.com"
os.environ["DATABRICKS_TOKEN"] = "dapi-your-databricks-token"

client = mlflow.deployments.get_deploy_client("databricks")

embeddings_response = client.predict(
    endpoint="databricks-gte-large-en",
    inputs={
        "input": "Here is some text to embed"
    }
)

Databricks Python SDK


from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
response = w.serving_endpoints.query(
    name="databricks-gte-large-en",
    input="Embed this sentence!"
)
print(response.data[0].embedding)

LangChain

To use a Databricks Foundation Model APIs model in LangChain as an embedding model, import the DatabricksEmbeddings class and specify the endpoint parameter as follows:

%pip install databricks-langchain

from databricks_langchain import DatabricksEmbeddings

embeddings = DatabricksEmbeddings(endpoint="databricks-gte-large-en")
embeddings.embed_query("Can you explain AI in ten words?")

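The following is the expected request format for an embedding model. For external models, you can include additional parameters that are valid for a given provider and endpoint configuration. See Additional query parameters.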


{
  "input": [
    "embedding text"
  ]
}
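As an illustration of the request format above, the following sketch builds the JSON body from either a single string or a list of strings. The helper name build_embedding_payload is hypothetical, and rejecting empty input is an assumption made for this example, not a documented rule of the endpoint.

```python
import json
from typing import Union


def build_embedding_payload(texts: Union[str, list[str]]) -> str:
    """Return a JSON request body in the {"input": [...]} shape."""
    # Wrap a single string so the payload always carries a list.
    if isinstance(texts, str):
        texts = [texts]
    # Assumption for this sketch: empty input is treated as an error.
    if not texts or any(not isinstance(t, str) or not t for t in texts):
        raise ValueError("input must contain at least one non-empty string")
    return json.dumps({"input": list(texts)}, ensure_ascii=False)


payload = build_embedding_payload("embedding text")
```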

The following is the expected response format:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": []
    }
  ],
  "model": "text-embedding-ada-002-v2",
  "usage": {
    "prompt_tokens": 2,
    "total_tokens": 2
  }
}
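A response in the format above can be reduced to a plain list of vectors. The following is a minimal sketch. The helper name extract_embeddings is hypothetical, and the sample response uses short made-up vectors purely for illustration. Sorting by index is a defensive choice so that each vector lines up with the input at the same position.

```python
import json


def extract_embeddings(response_json: str) -> list[list[float]]:
    """Return the embedding vectors from a response body, ordered by index."""
    response = json.loads(response_json)
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Illustrative response only; real embeddings are much longer.
sample_response = json.dumps(
    {
        "object": "list",
        "data": [
            {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
            {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
        ],
    }
)

vectors = extract_embeddings(sample_response)
```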

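Supported models

See Foundation model types for supported embedding models.

Check if embeddings are normalized

Use the following to check whether the embeddings generated by your model are normalized.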


import numpy as np

def is_normalized(vector: list[float], tol=1e-3) -> bool:
    magnitude = np.linalg.norm(vector)
    return abs(magnitude - 1) < tol
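If the check returns False, one option is to rescale each vector to unit length yourself, because for unit-length vectors the dot product equals the cosine similarity. The following is a minimal sketch that uses only the standard library. The helper names l2_normalize and dot are hypothetical, and the input vectors are made up for illustration.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector so that its Euclidean (L2) norm is 1."""
    magnitude = math.sqrt(sum(x * x for x in vector))
    if magnitude == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / magnitude for x in vector]


def dot(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


unit_a = l2_normalize([3.0, 4.0])
unit_b = l2_normalize([4.0, 3.0])
similarity = dot(unit_a, unit_b)
```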