Event Hub Trigger in Python Function: limits on total payload despite high batch size?

Wessel Vonk 45 Reputation points
2025-07-07T08:54:53.7333333+00:00

I'm using an Azure Function in Python with an Event Hub trigger and have configured maxEventBatchSize to a high value (e.g., 3000–5000).

I understand that each individual outbound Event Hubs batch has a maximum size of 1 MB. My question is about inbound batches:

Is there an effective maximum total payload size for a batch of events received via the Event Hub trigger (Python), regardless of the configured maxEventBatchSize?

In other words, if memory and resource usage remain acceptable, can I reliably receive large batches (thousands of small messages)? Or does the platform enforce a cap similar to outbound scenarios — for example, limiting total batch size to ~1 MB?

Any insights specific to the Python worker would be much appreciated.

Azure Event Hubs

1 answer

  1. Smaran Thoomu 28,225 Reputation points Microsoft External Staff Moderator
    2025-07-07T10:29:32.73+00:00

    Hi Wessel Vonk,
    Great question: this is a subtle but important detail when scaling Event Hub-triggered Azure Functions in Python.
    Clarifying Inbound Limits:

    While outbound Event Hubs batches are indeed capped at 1 MB (the publish limit per batch), inbound batches delivered to an Azure Function via the Event Hub trigger are not subject to that same 1 MB cap. However:

    • The actual size of the inbound batch is influenced by:
      • maxEventBatchSize (your configured value, e.g., 3000–5000)
      • Size of each individual event
      • Internal thresholds like function app memory, timeout, and Python worker buffer limits
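    Putting the first two factors together, a quick back-of-envelope estimate (a sketch; the function name and the 1 KiB figure are illustrative) shows how the configured batch size and average event size combine into a per-batch payload:

```python
def estimated_batch_payload_bytes(max_event_batch_size: int, avg_event_bytes: int) -> int:
    """Upper-bound estimate of one inbound batch's total payload:
    the configured batch size times the average event size."""
    return max_event_batch_size * avg_event_bytes

# e.g. 5000 events of ~1 KiB each is roughly 5 MiB per batch
print(estimated_batch_payload_bytes(5000, 1024))  # 5120000
```

    This is the arithmetic behind the "several MB" figures discussed below: a high maxEventBatchSize only translates into a multi-MB batch when the events themselves are large enough.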

    There isn’t a hard, documented limit on total inbound batch payload size, but in practice the runtime may deliver smaller batches than configured:

    • Batches may be split or capped if the total payload size approaches several MBs, or
    • if event processing latency exceeds runtime constraints.

    What This Means in Python:

    • If your events are small (~1 KB or less), you can reliably receive batches of 1000+ events.
    • However, if the total batch size becomes too large (e.g., >5–10 MB), you may notice inconsistent delivery or auto-splitting into smaller batches.
    • The Python worker processes invocations on a single thread by default (configurable via the PYTHON_THREADPOOL_THREAD_COUNT app setting), so processing large batches may also affect throughput and memory usage, especially when deserializing large payloads.
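    One way to keep memory bounded when a large batch does arrive is to process it in fixed-size chunks rather than materializing everything at once. A minimal sketch (pure Python; the chunk size is an assumption you would tune for your workload):

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def chunked(events: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive chunks of at most `size` events from a batch,
    so peak memory stays bounded even with a high maxEventBatchSize."""
    it = iter(events)
    while chunk := list(islice(it, size)):
        yield chunk

# A batch of 7 events processed in chunks of 3:
print(list(chunked(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```

    Inside the trigger function you would iterate `for chunk in chunked(events, 500):` and deserialize one chunk at a time.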

    Recommendation:

    • Keep maxEventBatchSize aligned with event size; test to find the sweet spot between batch size and processing latency.
    • Add monitoring to log len(events) and memory usage.
    • If needed, scale out: the Event Hub trigger scales up to one function instance per partition, and you can tune the trigger's batch settings in host.json (Event Hubs extension v5.x schema, matching the maxEventBatchSize setting you already use):
    {
      "version": "2.0",
      "extensions": {
        "eventHubs": {
          "maxEventBatchSize": 3000,
          "prefetchCount": 5000,
          "batchCheckpointFrequency": 1
        }
      }
    }
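    The monitoring suggestion above can be as simple as logging the count and total payload bytes of each batch. A sketch (inside the trigger function you would build `bodies` as `[e.get_body() for e in events]`; the helper name is illustrative):

```python
import logging

def log_batch_stats(bodies: list) -> tuple:
    """Log and return the cardinality and total payload size of an
    inbound batch. `bodies` is the list of raw event bodies (bytes),
    e.g. [e.get_body() for e in events] in the trigger function."""
    count = len(bodies)
    total_bytes = sum(len(b) for b in bodies)
    logging.info("batch: %d events, %d bytes total", count, total_bytes)
    return count, total_bytes

print(log_batch_stats([b"ab", b"abc"]))  # (2, 5)
```

    Tracking these two numbers over time shows whether the runtime is actually delivering batches near your configured maxEventBatchSize, or splitting them.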
    

    I hope this information helps. Please do let us know if you have any further queries.



