Understand observability

There are many ways to measure the quality of generative AI responses. In general, you can evaluate and monitor generative AI along three dimensions:

  • Performance and quality evaluators: assess the accuracy, groundedness, and relevance of generated content.
  • Risk and safety evaluators: assess an AI system's predisposition to generate harmful or inappropriate content, so you can safeguard against those risks.
  • Custom evaluators: industry- or scenario-specific metrics tailored to your own needs and goals (see the sketch after this list).
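
For example, with the Azure AI evaluation SDK for Python (azure-ai-evaluation), a custom evaluator can be as simple as a callable that accepts fields from an evaluation row and returns a dictionary of metric values. The following is a minimal sketch; the class name, metric names, and word limit are illustrative assumptions rather than part of the SDK.

```python
# Minimal sketch of a custom evaluator: a plain Python callable that
# returns a dictionary of metric values. The class name, metric names,
# and word limit are illustrative assumptions.
class ResponseLengthEvaluator:
    """Scores whether a generated response stays within a target length."""

    def __init__(self, max_words: int = 150):
        self.max_words = max_words

    def __call__(self, *, response: str, **kwargs) -> dict:
        word_count = len(response.split())
        return {
            "response_length": word_count,
            "within_limit": 1.0 if word_count <= self.max_words else 0.0,
        }


# Usage: call the evaluator directly on a single response.
length_eval = ResponseLengthEvaluator(max_words=100)
print(length_eval(response="Azure AI Foundry supports built-in and custom evaluators."))
```

A callable like this can be run on its own, as shown, or alongside built-in evaluators when scoring a dataset.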

Azure AI Foundry provides observability features that help you improve the performance and trustworthiness of generative AI responses. Evaluators are specialized tools in Azure AI Foundry that measure the quality, safety, and reliability of those responses.

Some of the built-in evaluators include the following; a usage sketch follows the list:

  • Groundedness: measures how consistent the response is with respect to the retrieved context.
  • Relevance: measures how relevant the response is with respect to the query.
  • Fluency: measures natural language quality and readability.
  • Coherence: measures logical consistency and flow of responses.
  • Content safety: assesses responses for categories of harm such as violent, sexual, self-harm-related, and hateful content.
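
As a minimal sketch, assuming the azure-ai-evaluation Python package and an Azure OpenAI deployment acting as the judge model, running the groundedness and relevance evaluators on a single query-response pair could look like this. The environment variable names and example strings are placeholders.

```python
import os

from azure.ai.evaluation import GroundednessEvaluator, RelevanceEvaluator

# Assumption: an Azure OpenAI chat deployment serves as the judge model.
# The environment variable names below are placeholders for your own values.
model_config = {
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_deployment": os.environ["AZURE_OPENAI_DEPLOYMENT"],
}

groundedness = GroundednessEvaluator(model_config)
relevance = RelevanceEvaluator(model_config)

query = "What sources does the assistant answer from?"
context = (
    "The assistant retrieves product documentation from the internal "
    "knowledge base before generating an answer."
)
response = "It answers using product documentation retrieved from the knowledge base."

# Each evaluator returns a dictionary with a score (typically on a 1-5 scale)
# and a short explanation of the rating.
print(groundedness(query=query, context=context, response=response))
print(relevance(query=query, response=response))
```

The same pattern extends to the fluency, coherence, and content safety evaluators, each of which takes the fields relevant to its metric.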

Next, let's try out generative AI capabilities in the Azure AI Foundry portal.