Hi DMcCrea,
Since you're working with Azure AI Agents (via the Azure OpenAI Agent SDK) and need to handle tool calls and their results in your chat application, you're correct that SSE (Server-Sent Events) is the current default way to get real-time responses and tool-invocation updates. If SSE feels overly complicated, or you want finer control over processing and displaying formatted results, there are a few cleaner alternatives and design approaches you can consider.
Current Behavior with SSE:
Using stream=True with Azure OpenAI's Agent SDK emits tokens and tool events via SSE in real time, and you parse events such as:
· message.delta.content
· message.tool_calls
· message.tool_responses
While functional, handling stream parsing and mapping tool calls/responses can feel cumbersome for structured display or async coordination.
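To make the "cumbersome" part concrete, here is a minimal sketch of what that event routing typically looks like once you have decoded the SSE payloads into dicts. The event type strings mirror the names listed above, and the payload shapes (`delta`, `tool_calls`, `tool_responses` keys) are assumptions for illustration; adjust them to whatever your SDK version actually emits.

```python
def route_event(event, state):
    """Fold one decoded stream event into an accumulating display state."""
    kind = event.get("type")
    if kind == "message.delta.content":
        # Append the token fragment to the running answer text.
        state["text"] += event.get("delta", "")
    elif kind == "message.tool_calls":
        # Remember which tools the agent asked to invoke.
        state["tool_calls"].extend(event.get("tool_calls", []))
    elif kind == "message.tool_responses":
        # Pair tool outputs with their calls for structured display.
        state["tool_responses"].extend(event.get("tool_responses", []))
    return state

state = {"text": "", "tool_calls": [], "tool_responses": []}
for ev in [
    {"type": "message.delta.content", "delta": "Hello"},
    {"type": "message.delta.content", "delta": ", world"},
    {"type": "message.tool_calls", "tool_calls": [{"name": "translate"}]},
]:
    state = route_event(ev, state)

print(state["text"])             # accumulated answer text
print(len(state["tool_calls"]))  # number of tool invocations seen so far
```

Even this simplified version shows the coordination burden: you maintain mutable state across events and must correlate tool calls with their responses yourself, which is exactly what the alternatives below avoid.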
Recommended Alternatives and Simplified Options:
1. Polling Instead of Streaming (Post-Tool-Call Resolution)
You can avoid SSE entirely by:
· Submitting the user message to the agent.
· Waiting for tool calls to be triggered.
· Resolving the tools.
· Then polling for completion of the agent response.
Pros:
· Full control over timing.
· Easy to manage JSON-based logic.
· Cleaner response structure.
Sample Pattern:

import time

# 1. Send user input
message = client.create_message(thread_id, role="user", content="Translate this text...")

# 2. Run agent (non-streaming)
run = client.run_thread(thread_id, assistant_id, stream=False)

# 3. Wait until tool calls are issued (polling or webhook callback)
while run.status not in ["requires_action", "completed", "failed"]:
    run = client.get_run(run.id)
    time.sleep(1)

# 4. Handle tool calls
if run.status == "requires_action":
    tool_outputs = []
    for tool_call in run.required_action.submit_tool_outputs.tool_calls:
        # Process each tool call and collect its output here
        tool_outputs.append({"tool_call_id": tool_call.id, "output": "..."})
    client.submit_tool_outputs(run.id, outputs=tool_outputs)

# 5. Poll again (as in step 3) until the run completes
This gives structured control without relying on parsing streamed events.
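Steps 3–5 of the pattern above can be factored into two small reusable helpers. This is a hedged sketch: the `client.get_run` method name and the dict-shaped run/tool-call payloads are stand-ins for whatever your SDK actually exposes, and `tools` is a hypothetical mapping from tool name to your local implementation.

```python
import time

def resolve_tool_calls(tool_calls, tools):
    """Run each requested tool locally and shape the outputs for submission."""
    outputs = []
    for call in tool_calls:
        fn = tools[call["name"]]  # look up your local implementation
        result = fn(**call.get("arguments", {}))
        outputs.append({"tool_call_id": call["id"], "output": result})
    return outputs

def wait_for_completion(client, run_id, interval=1.0, timeout=60.0):
    """Poll the run until it leaves the in-progress states, with a deadline."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        run = client.get_run(run_id)
        if run["status"] in ("requires_action", "completed", "failed"):
            return run
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} did not settle within {timeout}s")
```

The timeout is worth keeping even in a sketch: a plain `while` loop with no deadline can hang your request handler indefinitely if a run gets stuck server-side.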
2. Webhooks (Experimental or Future Alternative)
Webhook-style callbacks for tool calls are not natively supported in the current Azure OpenAI SDK, but they have been floated as a future direction; they would allow real-time processing without streaming.
Check Azure updates periodically, or raise a feature request through GitHub or support to vote for it.
3. Custom Event Handling in Your Own Middleware Layer
If you're building a chat frontend, consider:
· Wrapping tool call logic in middleware that intercepts tool invocations before submitting them back.
· Rendering display formats on your end (e.g., via WebSocket pushes or frontend state management).
This separates concerns between:
· Agent orchestration (backend).
· Display formatting logic (frontend).
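The separation of concerns above can be sketched as a small interceptor factory. Everything here is illustrative: the `publish` callback stands in for however you reach the frontend (a WebSocket `send`, a state store, etc.), and the payload shapes are hypothetical, not part of any SDK.

```python
def make_tool_middleware(tools, publish):
    """Return an interceptor that executes a tool call, publishes a
    display-ready event to the frontend channel, and returns the raw
    output in the shape the agent backend expects."""
    def intercept(tool_call):
        result = tools[tool_call["name"]](**tool_call.get("arguments", {}))
        # Frontend-facing event: already formatted for display.
        publish({
            "kind": "tool_result",
            "tool": tool_call["name"],
            "pretty": f"{tool_call['name']}: {result}",
        })
        # Agent-facing output: what you would submit back as a tool output.
        return {"tool_call_id": tool_call["id"], "output": result}
    return intercept
```

Because the interceptor is the single place where tool results pass through, the backend never needs to know how results are rendered, and the frontend never needs to parse agent run state.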
Hope this helps. Do let me know if you have any further queries.
Thank you!