Latency Issue In Speech To Text Realtime API

Nidoos Solutions 20

We are using the Azure Speech-to-Text (STT) streaming API from the Central India region and are seeing a consistent latency of 1.5 to 2 seconds between audio input and the transcription result.

Our setup:

SDK: JavaScript/Node SDK using SpeechRecognizer with PushAudioInputStream

Audio Format: 16kHz, mono, 16-bit PCM

Speech_SegmentationSilenceTimeoutMs = 300, EndSilenceTimeoutMs = 500

Using startContinuousRecognitionAsync()

Client is located in India, and we are using the Speech Service key for the Central India region.
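For reference, a minimal sketch of the setup listed above. The helper name `createRecognizer` is hypothetical, and `sdk` is assumed to be the module object from `require("microsoft-cognitiveservices-speech-sdk")`, passed in as a parameter. The sketch also assumes that `EndSilenceTimeoutMs` above refers to `PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs` and that the region identifier is `centralindia`.

```javascript
// Sketch only, not our exact production code.
// `sdk` is expected to be require("microsoft-cognitiveservices-speech-sdk").
function createRecognizer(sdk, subscriptionKey) {
  // Assumption: "centralindia" is the region identifier for Central India.
  const speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, "centralindia");

  // Silence timeouts from the setup above; property values are passed as strings.
  speechConfig.setProperty(sdk.PropertyId.Speech_SegmentationSilenceTimeoutMs, "300");
  // Assumption: this is the property meant by "EndSilenceTimeoutMs".
  speechConfig.setProperty(sdk.PropertyId.SpeechServiceConnection_EndSilenceTimeoutMs, "500");

  // 16 kHz, 16-bit, mono PCM delivered through a push stream.
  const format = sdk.AudioStreamFormat.getWaveFormatPCM(16000, 16, 1);
  const pushStream = sdk.AudioInputStream.createPushStream(format);
  const audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);

  const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);
  return { recognizer, pushStream };
}
```

Audio chunks would then be written with `pushStream.write(arrayBuffer)` after calling `recognizer.startContinuousRecognitionAsync()`.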

We’ve implemented all latency reduction suggestions from this official blog, including optimized buffering, silence detection, and configuration.

Still, we’re seeing 1.5–2 s of latency, which hurts our app’s responsiveness. Please help investigate whether this is expected or whether it can be reduced further (region, configuration, network, etc.).
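To make the latency figure reproducible, here is a rough sketch of one way it could be measured per result. It assumes audio is pushed at real-time pace with no gaps, so a position in the stream can be mapped onto wall-clock time. `resultLatencyMs` is a hypothetical helper, and the numbers in the example are made up for illustration. The SDK reports `result.offset` and `result.duration` in 100-nanosecond ticks.

```javascript
// SDK offsets and durations are expressed in 100-nanosecond ticks.
const TICKS_PER_MS = 10000;

// streamStartMs: wall-clock time (ms) when the first audio chunk was written
// offsetTicks / durationTicks: result.offset and result.duration from the SDK
// arrivalMs: wall-clock time (ms) when the result event fired
function resultLatencyMs(streamStartMs, offsetTicks, durationTicks, arrivalMs) {
  // Wall-clock time at which the recognized speech ended, assuming the audio
  // was streamed in real time without pauses.
  const speechEndMs = streamStartMs + (offsetTicks + durationTicks) / TICKS_PER_MS;
  return arrivalMs - speechEndMs;
}

// Made-up example: speech starts 2.0 s into the stream, lasts 1.5 s,
// and the result arrives 5.1 s after streaming began (start at t = 1000 ms).
const latency = resultLatencyMs(1000, 20000000, 15000000, 6100);
console.log(latency); // 1600
```

In a `recognized` handler this would be called as `resultLatencyMs(streamStartMs, e.result.offset, e.result.duration, Date.now())`. For final results the figure probably includes the configured segmentation silence timeout, so comparing it against the same measurement taken in `recognizing` (interim) events may help separate segmentation delay from network and service time.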
