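Extract key phrases
Key phrase extraction is a feature of Azure AI Language. It identifies the key phrases or main concepts in a piece of text.
You can call the key phrase extraction API in several ways. Here, you use the azure_ai extension to extract key phrases from within SQL queries.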
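Prerequisites
You need an Azure Database for PostgreSQL flexible server with the azure_ai extension enabled. You also need to authorize the extension with Azure Cognitive Services by setting the key and endpoint of a Language resource.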
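The setup described above might look like the following minimal sketch. The setting names azure_cognitive.endpoint and azure_cognitive.subscription_key and the azure_ai.set_setting / azure_ai.get_setting helpers are not stated in this article; they are assumed from the azure_ai extension documentation, and the endpoint and key values are placeholders to replace with those of your own Language resource.

```sql
-- Enable the extension in the current database.
-- Assumption: azure_ai has already been allow-listed on the server.
CREATE EXTENSION IF NOT EXISTS azure_ai;

-- Placeholder values: substitute the endpoint and key of your Language resource.
SELECT azure_ai.set_setting('azure_cognitive.endpoint', 'https://<your-language-resource>.cognitiveservices.azure.com');
SELECT azure_ai.set_setting('azure_cognitive.subscription_key', '<your-key>');

-- Read the endpoint back to confirm it was stored.
SELECT azure_ai.get_setting('azure_cognitive.endpoint');
```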
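Scenarios
Key phrase extraction is useful for a variety of tasks:
- Summarization: use key phrases to reduce longer documents to their core topics, for example to identify the topics discussed in audio transcripts or meeting notes.
- Content classification: use key phrases to index documents for search and browsing. Key phrases can also be used to visualize documents in a word cloud.
- Document clustering: use key phrases to cluster and analyze large volumes of support tickets, product reviews, and other unstructured input.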
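As an illustration of the content classification scenario, the sketch below stores extracted key phrases next to each document and indexes them for lookup. It assumes a table named listings with a text column description, that extract_key_phrases returns a text[] array, and that the extension is already configured. The column name key_phrases and the index name are hypothetical.

```sql
-- Hypothetical column to hold the extracted phrases.
ALTER TABLE listings ADD COLUMN IF NOT EXISTS key_phrases text[];

-- One service call per row, so this can be slow and billable on large tables.
UPDATE listings
SET key_phrases = azure_cognitive.extract_key_phrases(description, 'en-us')
WHERE description IS NOT NULL
  AND key_phrases IS NULL;

-- A GIN index supports array containment lookups on the stored phrases.
CREATE INDEX IF NOT EXISTS listings_key_phrases_idx
    ON listings USING gin (key_phrases);

-- Find documents tagged with a given key phrase.
SELECT description
FROM listings
WHERE key_phrases @> ARRAY['Seattle'];
```

Storing the phrases once avoids calling the service again on every search.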
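Use key phrase extraction in SQL with Azure Cognitive Services
The azure_ai extension for Azure Database for PostgreSQL flexible server provides user-defined functions (UDFs) that give access to AI features directly from within SQL. Use the azure_cognitive.extract_key_phrases function to access the key phrase extraction API: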
azure_cognitive.extract_key_phrases(
text TEXT,
language TEXT,
timeout_ms INTEGER DEFAULT 3600000,
throw_on_error BOOLEAN DEFAULT TRUE,
disable_service_logs BOOLEAN DEFAULT FALSE
)
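The required parameters are text, the input, and language, which specifies the language that text is written in. For example, en-us is US English and fr is French. For the full list of available languages, see Language support.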
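To show how the language parameter changes with the input, here is a small sketch that passes French text with the fr code. The sample sentence is made up for illustration, and no output is shown because the extracted phrases depend on the service.

```sql
-- French input text paired with the 'fr' language code.
SELECT azure_cognitive.extract_key_phrases(
    'La nourriture était délicieuse et le personnel était formidable.',
    'fr'
);
```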
By default, key phrase extraction stops if it has not completed within 3,600,000 milliseconds (1 hour). You can change timeout_ms to customize this limit.
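A shorter timeout can be passed with PostgreSQL named-argument notation, using the parameter name from the signature above. The value of 5000 milliseconds is an arbitrary choice for this sketch.

```sql
-- Give up after 5 seconds instead of the default 1 hour.
SELECT azure_cognitive.extract_key_phrases(
    'The food was delicious and the staff were wonderful.',
    'en-us',
    timeout_ms => 5000
);
```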
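If an error occurs, the default behavior is to raise an exception, which causes the transaction to roll back. You can disable this behavior by setting throw_on_error to false.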
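The following sketch turns off exceptions for a query over a table, so that a single failing row does not abort the whole statement. It assumes a table named listings with a text column description. This article does not say what the function returns for a row that fails, so check the extension documentation before relying on a particular value.

```sql
-- Continue past per-row failures instead of rolling back the transaction.
SELECT description,
       azure_cognitive.extract_key_phrases(
           description,
           'en-us',
           throw_on_error => false
       ) AS key_phrases
FROM listings
LIMIT 10;
```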
For the full parameter documentation, see the Azure Cognitive Services extension documentation.
For example, running this query:
SELECT azure_cognitive.extract_key_phrases('The food was delicious and the staff were wonderful.', 'en-us');
returns the following result:
extract_key_phrases
---------------------
{food,staff}
You can use a table column as the input text:
SELECT description, azure_cognitive.extract_key_phrases(description, 'en-us')
FROM listings LIMIT 1;
This returns the following result (with \x enabled for expanded display):
description | Welcome! If you stay here you will be living in a light filled two bedroom upper and ground level apartment (in a two apartment home). During your stay you will be welcome to share in our fresh eggs from the chickens and garden produce in season! Welcome! Come enjoy your time in Seattle at a lovely urban farmstead. There are two bedrooms each with a queen bed, full bath, living room and kitchen with wood floors throughout. During your stay you will be welcome to eat fresh eggs from the chickens and possibly fruit/veggies from the garden if you are in luck! We are family friendly and have a down to earth atmosphere. There is a large covered back porch and grill for hanging out especially in summer and a treehouse for up in the trees hammock time! Walking distance to Othello Light Rail Station for easy access to downtown. Also nearby is the fantastic Seward Park and the Kubota Gardens for outdoorsy loveliness. New last year is out beautiful Rainier Beach indoor swimming pool comp
extract_key_phrases | {"beautiful Rainier Beach indoor swimming pool","large covered back porch","Othello Light Rail Station","ground level apartment","lovely urban farmstead","fantastic Seward Park","two bedroom upper","two apartment home","two bedrooms","fresh eggs","queen bed","full bath","living room","wood floors","earth atmosphere","Walking distance","easy access","Kubota Gardens","outdoorsy loveliness","garden produce","hammock time",stay,chickens,season,Seattle,kitchen,fruit/veggies,luck,grill,summer,treehouse,trees,downtown,last}
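Because the result is displayed as an array, it can be expanded into rows and aggregated. The sketch below counts the most frequent key phrases across a sample of listings, assuming the function returns text[]. The sample size of 100 rows and the aliases sample, phrase, and occurrences are arbitrary choices for illustration.

```sql
-- Expand each listing's phrases into rows, then count how often each one appears.
SELECT phrase, count(*) AS occurrences
FROM (SELECT description FROM listings LIMIT 100) AS sample
CROSS JOIN LATERAL unnest(
    azure_cognitive.extract_key_phrases(sample.description, 'en-us')
) AS phrase
GROUP BY phrase
ORDER BY occurrences DESC, phrase
LIMIT 20;
```

Limiting the sample keeps the number of service calls small while you explore the data.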
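Summary
Key phrase extraction selects the main concepts from text. The Azure Cognitive Services language models reduce natural language to keywords or phrases. The azure_ai extension for Azure Database for PostgreSQL provides the azure_cognitive.extract_key_phrases API for accessing key phrase extraction directly within SQL queries.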