Is the OCR engine different when using a custom extraction model and prebuilt-layout?

Question

Bogdan Pechounov 85

When using a custom extraction model or a prebuilt model, is the OCR engine the same? I noticed differences in the word confidences as well as in the paragraph roles.

Also, does the API version influence OCR results?

Accepted answer

Answer 1

Hi Bogdan Pechounov,

Yes, OCR-related output can differ between a custom extraction model and a prebuilt model such as prebuilt-layout in Azure AI Document Intelligence. Both build on Microsoft's core OCR capabilities, but prebuilt models are optimized and tightly integrated with specific OCR configurations to improve accuracy on general document structures such as forms and invoices.

Custom extraction models, especially those trained on labeled data, may use the OCR results differently. Depending on the training data and model architecture, they can change how elements are grouped, how word confidences are calculated, and how roles such as "paragraph" or "heading" are assigned. That is why you may see different word confidence scores and detected roles for the same document.

The API version can also affect OCR results, because newer versions often include improvements to the OCR engine, layout analysis, and the document understanding pipeline. For consistent results across use cases, account for both the model type and the API version, for example by pinning the API version when you compare outputs.
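If you want to see exactly where the two outputs diverge, one option is to save the `analyzeResult` JSON from each model for the same document and diff them. The following is only a minimal sketch, not an official tool: it assumes the REST response shape (`pages[].words[]` with `content` and `confidence`, `paragraphs[]` with `content` and an optional `role`, and a top-level `apiVersion`), and the helper names `word_confidences`, `paragraph_roles`, and `diff_results` are hypothetical. It pairs items by position, so it is only meaningful when both results split the text the same way.

```python
from typing import Any, Optional


def word_confidences(analyze_result: dict[str, Any]) -> list[tuple[str, float]]:
    """Flatten pages[].words[] into (content, confidence) pairs, in reading order."""
    return [
        (word["content"], word["confidence"])
        for page in analyze_result.get("pages", [])
        for word in page.get("words", [])
    ]


def paragraph_roles(analyze_result: dict[str, Any]) -> list[tuple[str, Optional[str]]]:
    """Return (content, role) per paragraph; role is None when the field is absent."""
    return [
        (para["content"], para.get("role"))
        for para in analyze_result.get("paragraphs", [])
    ]


def diff_results(a: dict[str, Any], b: dict[str, Any], tolerance: float = 0.001) -> dict[str, Any]:
    """Compare two analyzeResult objects (hypothetical helper, positional pairing)."""
    # Words with identical text whose confidence differs by more than `tolerance`.
    word_diffs = [
        (text_a, conf_a, conf_b)
        for (text_a, conf_a), (text_b, conf_b) in zip(word_confidences(a), word_confidences(b))
        if text_a == text_b and abs(conf_a - conf_b) > tolerance
    ]
    # Paragraphs with identical text that were assigned different roles.
    role_diffs = [
        (text_a, role_a, role_b)
        for (text_a, role_a), (text_b, role_b) in zip(paragraph_roles(a), paragraph_roles(b))
        if text_a == text_b and role_a != role_b
    ]
    return {
        # Differing API versions are a likely source of differences on their own.
        "api_versions": (a.get("apiVersion"), b.get("apiVersion")),
        "word_confidence": word_diffs,
        "paragraph_role": role_diffs,
    }
```

To use it, load the two saved responses with `json.load`, pass each response's `analyzeResult` object to `diff_results`, and check `api_versions` first: if the two values differ, rerun both analyses with the same API version before comparing confidences or roles.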
