Configure audio format and voices
When synthesizing speech, you can use a SpeechConfig object to customize the audio that is returned by the Azure AI Speech service.
Audio format
The Azure AI Speech service supports multiple output formats for the audio stream that is generated by speech synthesis. Depending on your specific needs, you can choose a format based on the required:
- Audio file type
- Sample rate
- Bit depth
For example, the following Python code sets the speech output format for a previously defined SpeechConfig object named speech_config:
from azure.cognitiveservices.speech import SpeechSynthesisOutputFormat
speech_config.set_speech_synthesis_output_format(SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm)
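The format name in this example spells out the properties listed earlier: a RIFF (WAV) container, a 24 kHz sample rate, 16-bit samples, one channel, and PCM encoding. The following is a minimal sketch in plain Python that splits such a name into its parts. The helper describe_format and the AudioFormat type are hypothetical names introduced for illustration only; they are not part of the Speech SDK, and the pattern is an assumption that covers only names shaped like this example.

```python
import re
from typing import NamedTuple


class AudioFormat(NamedTuple):
    """Properties read from a format name (hypothetical helper type, not SDK API)."""
    container: str
    sample_rate_khz: int
    bit_depth: int
    channels: str
    encoding: str


# Assumed naming pattern: <Container><N>Khz<M>Bit<Mono|Stereo><Encoding>
_FORMAT_PATTERN = re.compile(
    r"^(?P<container>[A-Za-z]+?)"
    r"(?P<rate>\d+)Khz"
    r"(?P<bits>\d+)Bit"
    r"(?P<channels>Mono|Stereo)"
    r"(?P<encoding>\w+)$"
)


def describe_format(name: str) -> AudioFormat:
    """Split a format name such as 'Riff24Khz16BitMonoPcm' into its parts."""
    match = _FORMAT_PATTERN.match(name)
    if match is None:
        # Names that do not follow the assumed pattern are rejected, not guessed at.
        raise ValueError(f"Unrecognized format name: {name!r}")
    return AudioFormat(
        container=match["container"],
        sample_rate_khz=int(match["rate"]),
        bit_depth=int(match["bits"]),
        channels=match["channels"],
        encoding=match["encoding"],
    )


print(describe_format("Riff24Khz16BitMonoPcm"))
```

A helper like this can be useful for logging or validating a chosen format, but the SDK documentation remains the authority on which formats exist.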
For a full list of supported formats and their enumeration values, see the Azure AI Speech SDK documentation.
Voices
The Azure AI Speech service provides multiple voices that you can use to personalize your speech-enabled applications. Voices are identified by names that indicate a locale and a person's name, for example en-GB-George.
The following Python example code sets the voice to be used:
speech_config.speech_synthesis_voice_name = "en-GB-George"
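Because a voice name combines a locale and a person's name, it can be separated at the last hyphen. The following is a minimal sketch; split_voice_name is a hypothetical helper written for illustration, not an SDK function, and it assumes the name follows the locale-then-person shape described above.

```python
from typing import Tuple


def split_voice_name(voice_name: str) -> Tuple[str, str]:
    """Split a voice name into (locale, person) at the last hyphen.

    Hypothetical helper for illustration; assumes the '<locale>-<person>' shape.
    """
    locale, separator, person = voice_name.rpartition("-")
    if not separator or not locale or not person:
        raise ValueError(f"Expected '<locale>-<person>', got {voice_name!r}")
    return locale, person


locale, person = split_voice_name("en-GB-George")
print(locale, person)
```

Splitting the name this way makes it straightforward to group or filter voices by locale before assigning one to speech_config.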
For more information about the available voices, see the Azure AI Speech SDK documentation.