Batch Analyze is slow and does not seem to generate a PDF with OCR

Pedro Oliveira 0 Reputation points
2025-06-30T11:31:38.7566667+00:00

Hi,

We are trying to use Document Intelligence Batch Analyze in our document OCR workflow.

However, we have been unable to get the PDF generation feature working with this option. We tried the following C# code to have the PDF created:

string MODEL_ID = "prebuilt-read";
AnalyzeBatchDocumentsOptions options = new AnalyzeBatchDocumentsOptions(
    MODEL_ID,
    new BlobContentSource(sourceContainerUri)
    {
        // Prefix = sourceBlobNamePrefix
    },
    destinationContainerUri
);

// options.ResultPrefix = destinationBlobNamePrefix;
options.Features.Add(DocumentAnalysisFeature.Barcodes);

// can't make this work
options.Output.Add(AnalyzeOutputOption.Pdf);
options.OutputContentFormat = new DocumentContentFormat("pdf");

var result = await _client.AnalyzeBatchDocumentsAsync(
    WaitUntil.Started,
    options,
    cancellationToken
);

In the configured storage container we found no PDF generated from the OCR results. We also looked inside the JSON files, but there was no PDF binary there either. Where can we get the PDF?

Also, the batch takes a long time to process, on the order of minutes. Each individual file is analyzed quite quickly, but there are long delays between files. How can we configure this so that the batch is always handled correctly?

Regards, Pedro Oliveira

Azure AI Document Intelligence

1 answer

  1. Danny Dang 90 Reputation points
    2025-07-01T10:54:36.19+00:00

    Hi Pedro,

    Thank you for contacting Q&A Forum.

    To address the issue you're facing with Document Intelligence Batch Analyze, here are some steps you can follow:

    1. Getting Batch Results: See the AnalyzeBatchResult API reference for how to retrieve the batch results. https://learn.microsoft.com/en-us/dotnet/api/azure.ai.documentintelligence.analyzebatchresult?view=azure-dotnet-preview
    2. Checking Storage Configuration: If the batch jobs succeeded but no result file can be found in the target container, check the storage container configuration and verify that the Document Intelligence resource has the necessary permissions to write files to this container.
    3. Handling Failed Batch Jobs: If the batch jobs failed, use the error messages to identify the issues and make the necessary adjustments before retrying the batch jobs.
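
    As a rough illustration of steps 1 and 3, the sketch below waits for a batch operation that was started with WaitUntil.Started and prints the outcome for each document. The class and method names (BatchResultInspector, PrintBatchOutcomeAsync) are hypothetical, and the property names used on the result (SucceededCount, FailedCount, SkippedCount, Details, SourceUri, ResultUri, Status, Error) are assumptions based on the Azure.AI.DocumentIntelligence 1.0.0 package, so please check them against the API reference linked above for the SDK version you are using.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using Azure;
    using Azure.AI.DocumentIntelligence;

    // Hypothetical helper, not part of the SDK.
    public static class BatchResultInspector
    {
        public static async Task PrintBatchOutcomeAsync(
            Operation<AnalyzeBatchResult> operation,
            CancellationToken cancellationToken = default)
        {
            // The operation was only started, so poll the service until the batch finishes.
            Response<AnalyzeBatchResult> response =
                await operation.WaitForCompletionAsync(cancellationToken);
            AnalyzeBatchResult result = response.Value;

            // Assumed property names; verify against the AnalyzeBatchResult reference.
            Console.WriteLine(
                $"Succeeded: {result.SucceededCount}, " +
                $"failed: {result.FailedCount}, " +
                $"skipped: {result.SkippedCount}");

            foreach (var detail in result.Details)
            {
                Console.WriteLine($"{detail.SourceUri} -> {detail.Status}");

                // ResultUri is expected to point at the result blob in the destination container.
                if (detail.ResultUri != null)
                {
                    Console.WriteLine($"  result: {detail.ResultUri}");
                }

                // Error is expected to be set only for documents that failed.
                if (detail.Error != null)
                {
                    Console.WriteLine($"  error: {detail.Error.Code}: {detail.Error.Message}");
                }
            }
        }
    }
    ```

    With the code from the question, this could be called as `await BatchResultInspector.PrintBatchOutcomeAsync(result, cancellationToken);`. If every document reports success but the target container is still empty, that points back to the permissions check in step 2.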

    For more detailed information on batch jobs, you can refer to the Azure documentation on batch analysis. https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/prebuilt/batch-analysis?view=doc-intel-4.0.0

    If I have answered your question, please accept this answer as a token of appreciation and don't forget to give a thumbs up for "Was it helpful"!

    Best Regards,

