Summary

Completed

In this module, you learned about audio-enabled generative AI models and how to implement chat solutions that include audio-based input.

Audio-enabled models let you create AI solutions that can understand audio and respond to related questions or instructions. Beyond just identifying spoken words, some models can also use reasoning based on what they hear. For instance, they can summarize a message or assess the speaker's sentiment.

Tip

For more information about working with multimodal models in Azure AI Foundry, see How to use image and audio in chat completions with Azure AI model inference and Quickstart: Use speech and audio in your AI chats.