Azure Container App with EventHubConsumerClient gets stuck after idle period – requires restart to resume
I’ve deployed a Python app in Azure Container Apps which uses the EventHubConsumerClient from azure-eventhub to process events. The setup includes:
Azure Event Hub (with blob checkpointing via BlobCheckpointStore)
Azure Container App hosting the Python dispatcher
azure-eventhub async SDK (latest version)
Connection and error logging enabled (uamqp, azure.eventhub log levels: DEBUG)
Flask app with Celery for webhook delivery
Problem: After being idle for a while (e.g., no events for ~10–30 minutes), the container stops receiving new events from Event Hub. No exceptions or disconnections are logged; it just silently stops processing. It only starts consuming events again after I restart the container app.
What I’ve tried:
Using client.receive(...) with starting_position='@latest'
Implementing a heartbeat logger to confirm the app is alive
Verifying no memory/CPU scaling issue in container logs
Logging Azure SDK + AMQP (uamqp) events for connection issues (none so far)
Confirming that Event Hub has new incoming events (checked via metrics)
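A process-level heartbeat only shows that the container is alive, not that the receive loop is still being driven. A minimal sketch of a receive-level liveness check follows. It assumes `receive` is called with its `max_wait_time` keyword, in which case the SDK invokes `on_event` with `event=None` when no event arrives within that interval. `ReceiveLiveness` is a hypothetical helper written for this illustration, not part of azure-eventhub.

```python
import time
from typing import Callable, Optional


class ReceiveLiveness:
    """Records when the receive callback last ran (hypothetical helper, not an SDK class)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self.last_callback: float = clock()       # last time on_event ran at all
        self.last_event: Optional[float] = None   # last time a real event arrived

    def record(self, event: object) -> None:
        now = self._clock()
        self.last_callback = now
        if event is not None:
            self.last_event = now

    def is_stalled(self, limit_seconds: float) -> bool:
        # True when the callback has not run for longer than limit_seconds.
        return self._clock() - self.last_callback > limit_seconds


liveness = ReceiveLiveness()


async def on_event(partition_context, event):
    # Record every invocation, including the empty ones triggered by max_wait_time.
    liveness.record(event)
    if event is None:
        return  # idle tick: nothing to process or checkpoint
    print(f"Received event from partition: {partition_context.partition_id}")
    await partition_context.update_checkpoint(event)
```

With something like `client.receive(on_event=on_event, starting_position="@latest", max_wait_time=30)`, the existing heartbeat logger could also log `liveness.is_stalled(120)`. If the callback keeps firing with `None` while the hub metrics show incoming events, the receive loop is running but not getting data; if the callback stops firing entirely, the loop itself is stuck. The tracker here is shared across partitions, so a stall on a single partition could be masked by activity on the others.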
Questions:
Is there a known idle-timeout behavior in the Event Hubs SDK for Python that silently drops the connection without logging anything?
Should I be manually reinitializing the EventHubConsumerClient in a loop to handle idle disconnects?
Are there any known limitations or recommendations for long-lived consumers in Azure Container Apps with Event Hub?
Can Azure Monitor / Diagnostic settings be used to detect dropped or disconnected AMQP sessions from the Event Hub side?
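For the second question, this is roughly what "reinitializing in a loop" could look like. It is a sketch only, not a confirmed fix or an SDK-recommended pattern. `supervise`, `make_receiver`, and `touch` are hypothetical names: `make_receiver(touch)` is assumed to build a fresh client and run `receive` until it ends, calling `touch()` each time `on_event` is invoked.

```python
import asyncio
import time
from typing import Callable, Coroutine, Optional


async def supervise(
    make_receiver: Callable[[Callable[[], None]], Coroutine],
    stall_limit: float = 300.0,
    check_every: float = 30.0,
    restart_delay: float = 5.0,
    max_restarts: Optional[int] = None,
) -> int:
    """Run the receiver, restarting it when it fails or stops calling touch().

    Returns the number of restarts once the receiver exits cleanly.
    """
    last_seen = time.monotonic()

    def touch() -> None:
        nonlocal last_seen
        last_seen = time.monotonic()

    restarts = 0
    while True:
        touch()  # give each attempt a fresh idle window
        task = asyncio.create_task(make_receiver(touch))
        stalled = False
        try:
            while not task.done():
                await asyncio.wait([task], timeout=check_every)
                if not task.done() and time.monotonic() - last_seen > stall_limit:
                    stalled = True
                    task.cancel()  # give up on this attempt
            await task
        except asyncio.CancelledError:
            if not stalled:
                task.cancel()  # the supervisor itself was cancelled
                raise
        except Exception as exc:
            print(f"Receiver failed: {exc!r}")
        else:
            return restarts  # receiver finished without error

        restarts += 1
        if max_restarts is not None and restarts >= max_restarts:
            # Let the process exit so the platform can restart the container.
            raise RuntimeError(f"Receiver restarted {restarts} times, giving up")
        await asyncio.sleep(restart_delay)
```

In the dispatcher, `make_receiver` would create a new EventHubConsumerClient inside `async with`, call `touch()` at the top of `on_event`, and pass `max_wait_time` to `receive` so that `on_event` also runs (with `event=None`) while the hub is idle. Without `max_wait_time`, a quiet hub and a stuck receiver look the same to this loop, and it would restart a healthy client.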
Any insights, best practices, or SDK configurations to handle this properly would be greatly appreciated.
Thanks!
Dispatcher code:

import asyncio

from azure.eventhub.aio import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblobaio import BlobCheckpointStore


async def on_event(partition_context, event):
    # Called for each received event; checkpoint after every one.
    print(f"Received event from partition: {partition_context.partition_id}")
    await partition_context.update_checkpoint(event)


def run_dispatcher():
    checkpoint_store = BlobCheckpointStore.from_connection_string(
        "<blob-connection-string>", "<container-name>"
    )
    client = EventHubConsumerClient.from_connection_string(
        conn_str="<event-hub-conn-str>",
        consumer_group="$Default",
        eventhub_name="<event-hub-name>",
        checkpoint_store=checkpoint_store
    )
    loop = asyncio.get_event_loop()
    try:
        # receive() keeps running until the client is closed.
        loop.run_until_complete(
            client.receive(
                on_event=on_event,
                starting_position="@latest"
            )
        )
    except Exception as e:
        print(f"Error in dispatcher: {e}")
    finally:
        # Close the client, then the loop, on the way out.
        loop.run_until_complete(client.close())
        loop.close()