Latency Issue In Speech To Text Realtime API

Nidoos Solutions 20 Reputation points
2025-08-07T12:47:06.6733333+00:00

We are using the Azure Speech-to-Text (STT) streaming API from the Central India region and are seeing a consistent latency of 1.5 to 2 seconds from audio input to transcription result.

Our setup:

SDK: JavaScript/Node SDK using SpeechRecognizer with PushAudioInputStream

Audio Format: 16kHz, mono, 16-bit PCM

Timeouts: Speech_SegmentationSilenceTimeoutMs = 300, EndSilenceTimeoutMs = 500

Using startContinuousRecognitionAsync()

Client is located in India, and we are using the Speech Service key for the Central India region.
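To make the buffering part of this setup concrete, here is a minimal sketch of how chunk sizes follow from the audio format above (16 kHz, mono, 16-bit PCM). The helper names `bytesForDuration` and `splitIntoChunks` and the 100 ms chunk duration are illustrative assumptions, not values taken from our application or from the SDK.

```javascript
// Format values from the setup above: 16 kHz, mono, 16-bit PCM.
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2; // 16-bit samples
const CHANNELS = 1;

// Number of PCM bytes that represent `ms` milliseconds of audio.
function bytesForDuration(ms) {
  return (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * ms) / 1000;
}

// Split a Buffer of raw PCM into chunks of `chunkMs` milliseconds each.
// The last chunk may be shorter. Hypothetical helper for illustration;
// with the SDK, each chunk would then be written to the push stream.
function splitIntoChunks(pcm, chunkMs) {
  const chunkBytes = bytesForDuration(chunkMs);
  const chunks = [];
  for (let start = 0; start < pcm.length; start += chunkBytes) {
    chunks.push(pcm.subarray(start, Math.min(start + chunkBytes, pcm.length)));
  }
  return chunks;
}

// Example: 250 ms of silence split into 100 ms chunks.
const chunkSizes = splitIntoChunks(Buffer.alloc(bytesForDuration(250)), 100)
  .map((chunk) => chunk.length);
console.log(chunkSizes);
```

Each chunk adds its own duration to the delay before the service can see that audio, so smaller chunks trade a little overhead for lower client-side buffering delay.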

We’ve implemented all latency reduction suggestions from this official blog, including optimized buffering, silence detection, and configuration.

We are still seeing 1.5–2 s of latency, which hurts our app’s responsiveness. Please help us determine whether this is expected or whether it can be reduced further (region, configuration, network, etc.).
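For anyone trying to reproduce the numbers, one way to measure this kind of latency is sketched below. It records when each audio chunk was pushed and, when a result arrives, compares the arrival time with the time the audio at the end of that result was pushed. The class name `LatencyProbe` and its methods are hypothetical; the sketch assumes the result offset and duration are reported in 100 ns ticks relative to the start of the stream, and that the audio is 16 kHz, mono, 16-bit PCM.

```javascript
const PCM_BYTES_PER_MS = 32; // 16000 samples/s * 2 bytes * 1 channel / 1000
const TICKS_PER_MS = 10000;  // one tick = 100 ns

// Hypothetical helper: maps positions in the audio stream to the wall-clock
// time at which that audio was pushed, so result latency can be estimated.
class LatencyProbe {
  constructor() {
    this.totalBytes = 0;
    this.marks = []; // one { audioEndMs, wallMs } entry per pushed chunk
  }

  // Call right after each chunk is written to the push stream.
  onAudioPushed(byteLength, wallMs = Date.now()) {
    this.totalBytes += byteLength;
    this.marks.push({ audioEndMs: this.totalBytes / PCM_BYTES_PER_MS, wallMs });
  }

  // Milliseconds between pushing the audio that ends the result and
  // receiving the result. Returns null if that audio was never pushed.
  latencyForResult(offsetTicks, durationTicks, wallMs = Date.now()) {
    const resultEndMs = (offsetTicks + durationTicks) / TICKS_PER_MS;
    const mark = this.marks.find((m) => m.audioEndMs >= resultEndMs);
    return mark ? wallMs - mark.wallMs : null;
  }
}

// Assumed wiring with the Speech SDK (not executed here):
//   recognizer.recognizing = (s, e) =>
//     console.log(probe.latencyForResult(e.result.offset, e.result.duration));
const probe = new LatencyProbe();
probe.onAudioPushed(3200, 1000); // first 100 ms of audio pushed at t = 1000 ms
probe.onAudioPushed(3200, 1100); // next 100 ms pushed at t = 1100 ms
console.log(probe.latencyForResult(0, 1500000, 1900)); // result covers 0 to 150 ms
```

Comparing the figures for intermediate (recognizing) and final (recognized) results should help separate network and service delay from the silence timeouts, since a final result is only produced after the configured segmentation silence has elapsed.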

Azure AI Speech
An Azure service that integrates speech processing into apps and services.