How to recognize copyright symbol © using pre-built layout model

Rafael Gomes 0 Reputation points
2025-06-30T09:48:13.0933333+00:00

When using the AI Document Intelligence service with the prebuilt-layout model, we are having issues recognising text that contains the copyright symbol ©. Is there any way to fix or improve the recognition accuracy for this symbol (and others)?

I've attached a simple document as an example of where this happens and what the Document Intelligence Studio outputs when analysing it.

test-symbol.pdf

Document Intelligence Studio analysis result when trying to recognize copyright symbol ©

Azure AI Document Intelligence

1 answer

  1. santoshkc 15,590 Reputation points Microsoft External Staff Moderator
    2025-06-30T10:58:24.77+00:00

    Hi Rafael Gomes,

    Thank you for reporting this. We reproduced the issue with the prebuilt-layout model (2024-02-29 Preview) and observed that the copyright symbol © is recognized as the letter “C”. This behaviour is expected in some cases: the model primarily focuses on extracting general text and may misinterpret certain special characters depending on document quality, font style, or encoding.

    As a workaround, we recommend continuing to use the latest prebuilt-layout model version (2024-02-29 Preview), which recognizes special characters more accurately than previous versions, though it may still have occasional limitations. Ensuring that the document has high resolution, clear fonts, and good contrast can also help improve recognition.


    I hope you understand! Thank you.
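
Because the answer notes that © can come back as the letter "C", one option some users might consider is correcting the extracted text after analysis. The following is only a minimal sketch under stated assumptions, not part of the Document Intelligence API and not something the answer above recommends: it assumes the symbol appears directly before a four-digit year (as in a typical copyright notice), and the function name `restore_copyright` is hypothetical.

```python
import re

# Assumption (not from the service documentation): a standalone "C" or "(C)"
# that directly precedes a four-digit year (19xx or 20xx) is treated as a
# misread copyright symbol. Any other "C" is left untouched.
_MISREAD_COPYRIGHT = re.compile(
    r"(?<!\w)(?:\(\s*[Cc]\s*\)|C)(?=\s*(?:19|20)\d{2}\b)"
)


def restore_copyright(text: str) -> str:
    """Replace a likely misread 'C' or '(C)' before a year with '©'.

    Hypothetical helper for post-processing extracted text.
    """
    return _MISREAD_COPYRIGHT.sub("©", text)


if __name__ == "__main__":
    for sample in ("Copyright C 2025 Contoso", "(C) 2024 Example", "Section C"):
        print(repr(sample), "->", repr(restore_copyright(sample)))
```

A heuristic like this can produce false positives whenever a genuine standalone "C" happens to precede a year-like number, so it would need checking against your own documents before being relied on.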

