Azure OpenAI on Your Data: Streaming Response Performance with JSON Property Filtering

Kaja Sherief 115 Reputation points
2025-08-11T08:51:01.93+00:00

I'm working with Azure OpenAI streaming responses and have run into a performance problem. The responses contain JSON, and I need to extract selected properties from them.

Current Performance:

  • Direct streaming: 3.5-4.5 seconds
  • With post-processing: 6-7 seconds

Technical Details:

  1. Azure OpenAI "On Your Data" returns responses as strings (not native JSON objects)
  2. I convert the string to JSON programmatically
  3. The UI needs only one specific JSON property
  4. The other properties require background processing
  5. I currently wait for the complete response before processing

Current Approach:

Stream Response → Wait for Complete → Parse JSON → Extract Property → Re-chunk → Send to UI
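
To make the flow above concrete, here is a minimal Python sketch of the buffered approach. The function name `buffered_pipeline`, the `answer` property name, and the chunk size are placeholders for illustration only; they are not my production code or part of any Azure SDK. `chunks` stands in for the text deltas that arrive from the stream.

```python
import json
from typing import Iterable, Iterator


def buffered_pipeline(
    chunks: Iterable[str],
    ui_key: str = "answer",   # hypothetical name of the property the UI needs
    chunk_size: int = 16,     # arbitrary re-chunk size for the UI
) -> Iterator[str]:
    """Buffer the whole stream, parse it, then re-chunk one property."""
    # Wait for Complete: nothing reaches the UI until the stream has ended.
    full_text = "".join(chunks)
    # Parse JSON: only possible once the full string is available.
    payload = json.loads(full_text)
    # Extract Property: assumed here to be a string value.
    ui_text = payload[ui_key]
    # Re-chunk -> Send to UI.
    for start in range(0, len(ui_text), chunk_size):
        yield ui_text[start:start + chunk_size]
```

Because `"".join(chunks)` consumes the whole stream before anything is yielded, the first byte cannot reach the UI until the model has finished, which is consistent with the extra latency I'm seeing.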

Questions:

  1. How can I parse and extract specific JSON properties during streaming, without waiting for the complete response?
  2. Are there Azure-native solutions for selective property streaming?
  3. What's the recommended pattern for mixed streaming scenarios (immediate UI + background processing)?

Goal: Keep the 3.5-4.5 second response time while extracting the specific property for the UI and processing the remaining data in the background.
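
For question 1, the kind of incremental extraction I have in mind looks roughly like the sketch below. It is a hand-rolled scanner, not an Azure feature. It assumes the target property is a string, that its key (here the hypothetical `answer`) does not also appear inside an earlier string value, and that the stream is delivered as plain text chunks. The class name `PropertyStreamer` is made up for illustration.

```python
import json
import re
from typing import Optional


class PropertyStreamer:
    """Emit the text of one string property while the JSON is still arriving."""

    def __init__(self, key: str) -> None:
        # Matches `"key": "` with optional whitespace around the colon.
        self._start = re.compile(r'"%s"\s*:\s*"' % re.escape(key))
        self.raw = ""                     # everything received so far
        self._pos: Optional[int] = None   # next unread index inside the value
        self.done = False                 # True once the closing quote is seen

    def feed(self, chunk: str) -> str:
        """Add a chunk; return any newly decoded text of the target property."""
        self.raw += chunk
        if self.done:
            return ""
        if self._pos is None:
            match = self._start.search(self.raw)
            if match is None:
                return ""                 # key not seen yet
            self._pos = match.end()
        out = []
        i, raw = self._pos, self.raw
        while i < len(raw):
            ch = raw[i]
            if ch == '"':                 # unescaped quote ends the value
                self.done = True
                i += 1
                break
            if ch == "\\":
                # Wait until the whole escape sequence has arrived.
                need = 6 if raw[i + 1:i + 2] == "u" else 2
                if i + need > len(raw):
                    break
                # Decode the escape with the json module.
                # Limitation: surrogate pairs are not combined in this sketch.
                out.append(json.loads('"' + raw[i:i + need] + '"'))
                i += need
            else:
                out.append(ch)
                i += 1
        self._pos = i
        return "".join(out)
```

Each call to `feed` returns text that can go to the UI immediately. Once the stream ends, `json.loads(streamer.raw)` still yields the full object, so the remaining properties can be handed to a background task. I have not measured whether this keeps the response time within the 3.5-4.5 second range.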

Any optimization suggestions or alternative approaches would be helpful.

Azure OpenAI Service
