Configure audio format and voices

When synthesizing speech, you can use a SpeechConfig object to customize the audio that is returned by the Azure AI Speech service.

Audio format

The Azure AI Speech service supports multiple output formats for the audio stream that is generated by speech synthesis. Depending on your specific needs, you can choose a format based on the required:

Audio file type
Sample rate
Bit depth

For example, the following Python code sets the speech output format for a previously defined SpeechConfig object named speech_config:

from azure.cognitiveservices.speech import SpeechSynthesisOutputFormat
speech_config.set_speech_synthesis_output_format(SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm)

For a full list of supported formats and their enumeration values, see the Azure AI Speech SDK documentation.
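The enumeration name used above encodes the three criteria listed earlier: file type, sample rate, and bit depth. The following is a minimal sketch, not part of the Speech SDK, that unpacks a name of that shape to make the trade-offs visible. The helper names describe_format and pcm_bytes_per_second are hypothetical, and the pattern only covers names shaped like Riff24Khz16BitMonoPcm; other naming shapes in the SDK will not match.

```python
import re
from typing import NamedTuple


class AudioFormatInfo(NamedTuple):
    container: str       # audio file type, e.g. "Riff" (the WAV container)
    sample_rate_hz: int  # samples per second
    bit_depth: int       # bits per sample
    channels: int        # 1 for mono, 2 for stereo
    encoding: str        # e.g. "Pcm"


# Hypothetical pattern: matches only names shaped like "Riff24Khz16BitMonoPcm".
_FORMAT_NAME = re.compile(
    r"^(?P<container>[A-Z][a-z]+)"
    r"(?P<khz>\d+)Khz"
    r"(?P<bits>\d+)Bit"
    r"(?P<layout>Mono|Stereo)"
    r"(?P<encoding>\w+)$"
)


def describe_format(name: str) -> AudioFormatInfo:
    """Split a format name into file type, sample rate, bit depth, and channels."""
    match = _FORMAT_NAME.match(name)
    if match is None:
        raise ValueError(f"Unrecognized format name: {name!r}")
    return AudioFormatInfo(
        container=match["container"],
        sample_rate_hz=int(match["khz"]) * 1000,
        bit_depth=int(match["bits"]),
        channels=1 if match["layout"] == "Mono" else 2,
        encoding=match["encoding"],
    )


def pcm_bytes_per_second(info: AudioFormatInfo) -> int:
    """Data rate of uncompressed PCM audio: rate x bytes per sample x channels."""
    return info.sample_rate_hz * (info.bit_depth // 8) * info.channels


info = describe_format("Riff24Khz16BitMonoPcm")
print(info)
print(pcm_bytes_per_second(info), "bytes per second")
```

For uncompressed PCM, a higher sample rate or bit depth increases the data rate proportionally, which is one reason to pick the lowest format that still meets your quality requirements.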

Voices

The Azure AI Speech service provides multiple voices that you can use to personalize your speech-enabled applications. Voices are identified by names that indicate a locale and a person's name, for example, en-GB-George.

The following Python example code sets the voice to be used:

speech_config.speech_synthesis_voice_name = "en-GB-George"

For information about voices, see the Azure AI Speech SDK documentation.
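Because a voice name combines a locale with a person's name, it can be useful to separate the two, for example to choose a voice that matches a user's locale. The following is a minimal sketch under that naming assumption; split_voice_name and voices_for_locale are hypothetical helpers, not SDK functions, and the second voice name in the sample list is made up for illustration.

```python
from typing import List, Tuple


def split_voice_name(voice_name: str) -> Tuple[str, str]:
    """Split a name like "en-GB-George" into its locale and person parts."""
    parts = voice_name.split("-", 2)  # language, region, remainder
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected <language>-<REGION>-<name>, got {voice_name!r}")
    language, region, person = parts
    return f"{language}-{region}", person


def voices_for_locale(voice_names: List[str], locale: str) -> List[str]:
    """Return the voice names whose locale part equals the given locale."""
    return [name for name in voice_names if split_voice_name(name)[0] == locale]


# "fr-FR-Example" is a placeholder name used only to show the filtering.
candidates = ["en-GB-George", "fr-FR-Example"]
print(split_voice_name("en-GB-George"))
print(voices_for_locale(candidates, "en-GB"))
```

A name selected this way could then be assigned to speech_config.speech_synthesis_voice_name as shown above.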