Hello GenixPRO,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
I understand that you need JavaScript/TypeScript code for the Voice Live API so that a mobile client can connect directly over WebSockets, and I assume you are using this on Microsoft Azure.
Regarding your questions:
Is there any JS/TS quickstart for Voice Live API implementation? To connect mobile client directly to API using websockets
No official JS/TS quickstart exists for mobile clients directly connecting to Azure Voice Live API via WebSocket. However, there are:
- A Python quickstart that uses aiohttp and websockets - https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-agents-quickstart
- A GitHub sample for web clients that uses Azure Communication Services (ACS) and the Azure Voice Live API - https://github.com/Azure-Samples/call-center-voice-agent-accelerator
The best workaround is to implement a JS/TS client with a standard WebSocket library and follow the Voice Live API's WebSocket protocol.
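As a rough starting point, a client of this kind could be sketched as below. This is only an illustration under assumptions, not an official sample: the names SocketLike, VoiceLiveEvent and connectVoiceLive are hypothetical, and the sketch assumes the service exchanges JSON text messages that carry a `type` field. The socket is injected through a factory so the same code can run against whichever WebSocket implementation your mobile framework provides.

```typescript
// Minimal socket shape used by this sketch (hypothetical interface).
// Handlers are loosely typed so different WebSocket implementations can fit.
interface SocketLike {
  onopen: ((event: any) => void) | null;
  onmessage: ((event: any) => void) | null;
  send(data: string): void;
  close(): void;
}

// Assumption: every message is a JSON object with a string `type` field.
interface VoiceLiveEvent {
  type: string;
  [key: string]: unknown;
}

type EventHandler = (event: VoiceLiveEvent) => void;

// Opens a socket, sends one initial event once connected, and routes each
// incoming JSON text message to the handler registered for its `type`.
function connectVoiceLive(
  openSocket: (url: string) => SocketLike,
  url: string,
  firstEvent: VoiceLiveEvent,
  handlers: Record<string, EventHandler>,
): SocketLike {
  const socket = openSocket(url);

  socket.onopen = () => {
    socket.send(JSON.stringify(firstEvent));
  };

  socket.onmessage = (message) => {
    // Binary frames are ignored in this sketch.
    if (typeof message.data !== "string") return;
    const event = JSON.parse(message.data) as VoiceLiveEvent;
    const handler = handlers[event.type];
    if (handler) handler(event);
  };

  return socket;
}
```

On a platform with a global WebSocket, the factory could be something like `(url) => new WebSocket(url)`. The exact URL and the event types to handle should be taken from the Voice Live documentation rather than from this sketch.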
If no such QS available, is the implementation similar to Assistants API? In that case we specify agent_id, set property & connect using ws:// or wss://
Yes, it is conceptually similar, but there are distinct technical differences:
Feature | Azure Assistants API | Azure Voice Live API |
---|---|---|
Protocol | HTTP/WebSocket | WebSocket only |
Authentication | API Key / Entra ID | Entra ID (recommended) |
Session Start | thread.create | session.update |
Audio Format | Optional | Required (PCM 24kHz) |
Turn Detection | Basic | Advanced (semantic VAD) |
Agent Configuration | Manual | Encapsulated via agent_id |
You can use session.update to configure the agent, voice, audio settings, and turn detection.
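To make that more concrete, here is a hedged sketch of building such a session.update message in TypeScript. The helper buildSessionUpdate and its option names are hypothetical. The JSON field names (voice, input_audio_sampling_rate, turn_detection) and the values "azure-standard" and "azure_semantic_vad" are my assumptions about the payload shape, so please verify them against the Voice Live API reference before relying on them.

```typescript
// Hypothetical options for this sketch; not an official type.
interface SessionUpdateOptions {
  voiceName: string;          // name of the voice you want to use
  sampleRateHz?: number;      // defaults to 24000 (PCM 24 kHz, per the table above)
  turnDetectionType?: string; // defaults to an assumed semantic VAD identifier
}

// Builds the JSON event sent over the WebSocket to configure the session.
// Field names below are assumptions; check them against the API reference.
function buildSessionUpdate(options: SessionUpdateOptions) {
  return {
    type: "session.update",
    session: {
      voice: { name: options.voiceName, type: "azure-standard" },
      input_audio_sampling_rate: options.sampleRateHz ?? 24000,
      turn_detection: {
        type: options.turnDetectionType ?? "azure_semantic_vad",
      },
    },
  };
}

// Serialize the event the same way it would be passed to socket.send(...).
function serializeSessionUpdate(options: SessionUpdateOptions): string {
  return JSON.stringify(buildSessionUpdate(options));
}
```

You would typically send the serialized event right after the socket opens and before streaming any audio.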
Since you're using an Azure service, the best option is to use Azure AI Foundry Agent Service as follows:
- Create an Azure AI Foundry resource in supported regions.
- Create an agent in the Foundry portal.
- Use the agent_id in your WebSocket session payload.
- Use Microsoft Entra ID for secure authentication.
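The steps above could end with something like the following when you open the connection. This is a sketch under assumptions: buildAgentUrl and AgentConnectionOptions are hypothetical names, and the path /voice-live/realtime and the query parameters api-version, agent-project-name and agent-id are what I would expect based on the Python quickstart. How the agent ID and the Entra ID token are actually passed should be confirmed against that quickstart. Token handling is deliberately left out.

```typescript
// Hypothetical options for this sketch; not an official type.
interface AgentConnectionOptions {
  endpoint: string;    // https endpoint of your Azure AI Foundry resource
  apiVersion: string;  // API version string from the documentation
  projectName: string; // Foundry project that owns the agent
  agentId: string;     // ID of the agent created in the Foundry portal
}

// Builds a wss:// URL for the agent session.
// The path and parameter names are assumptions; verify them in the quickstart.
function buildAgentUrl(options: AgentConnectionOptions): string {
  const base = options.endpoint
    .replace(/^https:\/\//, "wss://") // switch to the secure WebSocket scheme
    .replace(/\/+$/, "");             // drop trailing slashes

  const params: Array<[string, string]> = [
    ["api-version", options.apiVersion],
    ["agent-project-name", options.projectName],
    ["agent-id", options.agentId],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");

  return `${base}/voice-live/realtime?${query}`;
}
```

Percent-encoding the values keeps project names with spaces or special characters from breaking the URL.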
The article at https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-agents-quickstart walks through a Voice Live agent quickstart that you can use for your experiment.
I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.
Please don't forget to close out the thread by upvoting and accepting this as the answer if it was helpful.