Azure AI Speech

0 answers

Using a Custom AI Avatar on the sample code

Hello, I am trying to create a proof of concept using https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/samples/js/browser/avatar. I want to show a custom video of a person in an avatar, with text generated by OpenAI. In other…

asked

It is VMS 130

commented

Manas Mohanty 8,150 Microsoft External Staff Moderator

1 answer

Speech Studio - Error in exporting data from editor to dataset section

Hello all, I am using Azure Speech Studio to train custom speech models. Since last week I have been unable to export data (audio + transcript) from the "Editor" section to the "Training and Testing Dataset" section. I simply see the attached error.…

asked

Shree 0

commented

Ravada Shivaprasad 920 Microsoft External Staff Moderator

2 answers

Azure OpenAI Realtime API

In Azure Foundry, the model "gpt-4o-realtime-preview-2025-06-03" is marked as: "Legacy model version: This deployment is using legacy model gpt-4o-realtime-preview version 2025-06-03 which will be retired on 9/01/2025 local time. When the…

asked

Artur Dereń 0

commented

Pavankumar Purilla 10,350 Microsoft External Staff Moderator

0 answers

Does live chat avatar synthesis support WordBoundary events?

I was trying to set up the WordBoundary event callback for my live chat avatar synthesis but the callback is never run (the avatar speaks in the front-end but I get no events). That brings me to the question - are these events even supported for live…

asked

Mindaugas Giedraitis 0

commented

Mindaugas Giedraitis 0

1 answer

Change AI voice

How can we change the AI voice (audiomodify)?

asked

Charlotte Flint 0

answered

Amira Bedhiafi 35,766 Volunteer Moderator

0 answers

Latency Issue In Speech To Text Realtime API

We are using Azure Speech-to-Text (STT) streaming API from the Central India region and experiencing consistent latency of 1.5 to 2 seconds from audio input to transcription result. Our setup: SDK: JavaScript/Node SDK using SpeechRecognizer with…

asked

Nidoos Solutions 20

1 answer

30 secs timeout on Azure speech to text

Hello, I'm experiencing an issue with Azure Speech-to-Text where, in continuous recognition mode, it outputs a RECOGNIZED result every 30 seconds, regardless of whether speech has stopped. Adjusting settings like Speech_SegmentationSilenceTimeoutMs has…

asked

Nandhu TS 0

commented

Nandhu TS 0

1 answer

How to set up Speech SDK with MAS (AEC) in Unity

Hello everyone, I am trying to implement Acoustic Echo Cancellation (AEC) in a Unity project using the Azure Speech SDK and the Microsoft Audio Stack (MAS), but I cannot get it to work correctly. The speech recognizer continues to pick up and transcribe…

asked

nk 5

answered

Manas Mohanty 8,150 Microsoft External Staff Moderator

0 answers

Issues with Azure Speech Services: Incorrect transcription of "draft" as "draught" and "£" as "lbs" in UK English

I'm using Azure Speech Services with the language set to UK English, and I've noticed two recurring transcription issues: When I dictate the word "draft", it consistently transcribes as "draught", even when the context clearly favors…

asked

Niki Kariappa 0

commented

Manas Mohanty 8,150 Microsoft External Staff Moderator

1 answer

Azure speech to text cannot process spelt-out words

When using real-time speech to text, if the audio spells out a word or name, the result outputs the name as if it had been said as a whole word rather than spelled out. (e.g. the audio says "My name is John. J-O-H-N", but the result I get is "My name is John.…

asked

Tim 0

answered

Jerald Felix 4,450

0 answers

Error while trying to train a 202240228 Whisper Large v2 baseline model

When trying to train a custom speech model using a dataset containing an audio file and its transcript, the model failed to train due to an internal error. Can anyone provide any insights on how to troubleshoot this issue?

asked

Engineering 0

edited a comment

RNareddy 2,505 Microsoft External Staff Moderator

1 answer

Can you help me access the Dustin voice in Azure text to speech studio?

Hello, I created a new Azure account and need to access the Dustin voice, but it is not available. Can you help? Thank you.

asked

Piet Levy 0

answered

Sina Salam 22,576 Volunteer Moderator

1 answer

Is it possible to get subtitles or a timed script with batch synthesis text to speech avatar?

Using the batch text-to-speech or batch avatar API, is it possible to get subtitles on the generated video? Or, even better, to get a script of the text with timestamps? I was hoping to do some front-end shenanigans by creating text highlights, as the…

asked

d m 5

commented

Piyush Paras Tiwari 0

0 answers

Azure speech to text appears very slow

Hi team, We have observed that the Azure speech-to-text is very slow. I am using continuousRecognitionAsync and I observe that Azure takes a total of close to 6s for just 3s audio. The parameters that I've set are: EndSilenceTimeoutMs =…

asked

Sai Vishnu Soudri 65

edited a comment

Sven 0

1 answer

Problem creating SpeechRecognizer with audio stream input using node.js Speech SDK

Using Speech SDK for JavaScript v1.44.0, and following the STT in-memory streaming example, but using the fromEndpoint API to create Recognizer, as recommended in the Release Notes for that SDK version. Node.js is v22 LTS, running in Azure Cloud as an…

asked

Michael Pickering 0

commented

Ravada Shivaprasad 920 Microsoft External Staff Moderator

1 answer

Unexpectedly high TTS character count in Azure Speech Service during live app test

Hi team, We are running a production-ready church translation app using Azure Translator and Azure Speech Services (STT & TTS, neural voice). During a 30 minute live test involving 8 user devices (all using Spanish), we observed the…

asked

Lauren Van Niekerk 20

accepted

Lauren Van Niekerk 20

1 answer

Quota increase on my cognitive services - text to speech usages.

Hello, I can't create a support ticket on Azure because I am on the Basic plan. Is there any way for me to get help increasing the quota? Thank you!

asked

LeetGPT 100

commented

Manas Mohanty 8,150 Microsoft External Staff Moderator

2 answers

Azure Speech Recognition: noise or chatter is recognized as insurance-related texts, rather than just not recognized.

Hello, When noise or background chatter is sent to Azure Speech Recognition, it sends back texts related to insurance policies. Texts which were not spoken at all. For example, responses like (for language nl-NL): ik wil graag een verzekering afsluiten…

asked

Gijsbert Huijsen 21

accepted

Gijsbert Huijsen 21

1 answer

How is the Synthesized Characters count calculated for Azure's Text to Speech service when generating from an SSML?

How is the Synthesized Characters count calculated for Azure's Text to Speech service when generating speech from an SSML document? What are the specific rules? I converted the following SSML file into speech: …

asked

ggg 0

commented

ggg 0

0 answers

I need a local STT and TTS model partner for my CRM product

I need a local STT and TTS model partner for my CRM product. Essential features: 1. Speech to text 2. Support for WebSocket access 3. Support for speaker identification. Key features: 1. Support for real-time translation 2. Support for downloading audio…

asked

Akansha Kapoor 0

2,098 questions with Azure AI Speech tags
