Hi Bogdan Pechounov,
Yes, OCR output can differ between a custom extraction model and a prebuilt model such as prebuilt-layout in Azure AI Document Intelligence. Both build on Microsoft's core OCR capabilities, but prebuilt models are optimized and tightly integrated with specific OCR configurations to improve accuracy on general document structures such as forms or invoices. Custom extraction models, especially those trained on labeled data, may use the OCR results differently, for example in how elements are grouped, how word confidences are calculated, or how roles like "paragraph" or "heading" are assigned, depending on the training data and model architecture. As a result, you may see variations in the OCR output, including word confidence scores and detected roles. The API version also affects OCR results, because newer versions often include enhancements to the OCR engine, layout analysis, and document understanding pipeline. For consistent results across use cases, compare outputs only between runs that use the same model type and API version, and keep both fixed once you have chosen them.
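If you want to see exactly where the two outputs diverge, here is a minimal sketch that compares two saved results. It assumes you have stored the `analyzeResult` object from each REST response as a Python dict, with `pages[].words[]` carrying `content` and `confidence`, and `paragraphs[]` carrying `content` and an optional `role`. The function names, the custom model ID, and the sample values are hypothetical and only there for illustration. Words are paired by position, so this only works when both models split the page into the same words.

```python
from typing import Any


def word_confidences(analyze_result: dict[str, Any]) -> list[tuple[int, str, float]]:
    """Flatten a result into (page_number, word, confidence) tuples."""
    rows = []
    for page in analyze_result.get("pages", []):
        for word in page.get("words", []):
            rows.append((page.get("pageNumber", 0), word["content"], word["confidence"]))
    return rows


def compare_confidences(
    result_a: dict[str, Any], result_b: dict[str, Any], tolerance: float = 0.01
) -> list[dict[str, Any]]:
    """Report words whose text differs or whose confidence differs by more than tolerance.

    Assumption: words are paired by position, so both results must contain
    the same words in the same order for the pairing to be meaningful.
    """
    diffs = []
    pairs = zip(word_confidences(result_a), word_confidences(result_b))
    for (page, text_a, conf_a), (_, text_b, conf_b) in pairs:
        if text_a != text_b or abs(conf_a - conf_b) > tolerance:
            diffs.append(
                {"page": page, "word_a": text_a, "word_b": text_b,
                 "conf_a": conf_a, "conf_b": conf_b}
            )
    return diffs


def paragraph_roles(analyze_result: dict[str, Any]) -> dict[str, Any]:
    """Map each paragraph's text to its role (None when no role was assigned)."""
    return {p["content"]: p.get("role") for p in analyze_result.get("paragraphs", [])}


# Made-up sample data standing in for two saved analyzeResult objects.
layout_result = {
    "modelId": "prebuilt-layout",
    "pages": [{"pageNumber": 1, "words": [
        {"content": "Invoice", "confidence": 0.99},
        {"content": "Total", "confidence": 0.95},
    ]}],
    "paragraphs": [{"content": "Invoice", "role": "title"}],
}
custom_result = {
    "modelId": "my-custom-model",  # hypothetical custom model ID
    "pages": [{"pageNumber": 1, "words": [
        {"content": "Invoice", "confidence": 0.99},
        {"content": "Total", "confidence": 0.88},
    ]}],
    "paragraphs": [{"content": "Invoice"}],
}

diffs = compare_confidences(layout_result, custom_result)
for d in diffs:
    print(d)
print(paragraph_roles(layout_result), paragraph_roles(custom_result))
```

Running the same comparison on two results from the same model but different API versions should show whether the version change alone accounts for the differences you are seeing.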