Nutrient - PDF OCR (Preview)

Transform scanned documents and images into searchable, editable PDFs with Nutrient Document Converter OCR actions. Seamlessly extract text from PDFs using advanced Optical Character Recognition (OCR) technology for indexing, automation, and content analysis.

This connector is available in the following products and regions:

Service         Class     Regions
Copilot Studio  Premium   All Power Automate regions except the following:
                            - US Government (GCC)
                            - US Government (GCC High)
                            - China Cloud operated by 21Vianet
                            - US Department of Defense (DoD)
Logic Apps      Standard  All Logic Apps regions except the following:
                            - Azure Government regions
                            - Azure China regions
                            - US Department of Defense (DoD)
Power Apps      Premium   All Power Apps regions except the following:
                            - US Government (GCC)
                            - US Government (GCC High)
                            - China Cloud operated by 21Vianet
                            - US Department of Defense (DoD)
Power Automate  Premium   All Power Automate regions except the following:
                            - US Government (GCC)
                            - US Government (GCC High)
                            - China Cloud operated by 21Vianet
                            - US Department of Defense (DoD)

Contact

Name   Nutrient (formerly Muhimbi) Support
URL    https://support.nutrient.io/hc/en-us/requests/new
Email  support+low-code@nutrient.io

Connector Metadata

Publisher       Muhimbi trading as Nutrient
Website         https://www.nutrient.io/low-code/
Privacy policy  https://www.nutrient.io/legal/privacy/
Categories      Collaboration; Content and Files

Perform OCR on images and scanned documents

Use Nutrient Document Converter to run Optical Character Recognition (OCR) on images and scanned files through a REST API or a self-hosted server library.

OCR capabilities

  • Convert images, scans, and faxes into searchable PDFs.
  • Build automated document workflows for text extraction and PDF searchability.

Integration options

Integrate OCR functionality with code samples in your preferred language:

Prerequisites

To use Nutrient Document Converter, you need a Free or Trial account. Refer to the comparison guide to understand the differences between these account types.

Getting started

Follow the steps below to start using the Nutrient Document Converter connector:

Known issues and limitations

Documents protected with IRM, DRM, RMS, or AIP solutions cannot be processed due to security restrictions.

For questions or assistance, contact our Support team.

Throttling Limits

Name                      Calls  Renewal Period
API calls per connection  100    60 seconds
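
A caller that drives the connector from its own code may want to stay under this limit rather than rely on throttled responses. The following is a minimal sketch of a client-side guard for 100 calls per 60 seconds. The class name `SlidingWindowLimiter` and its methods are hypothetical, and the sliding-window behavior is an assumption: this page does not say how the service measures the renewal period.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard allowing at most `max_calls` per `period` seconds.

    Hypothetical helper; the service's own windowing rule is not documented here.
    """

    def __init__(self, max_calls=100, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of calls still inside the window

    def wait_time(self):
        """Return the number of seconds to wait before the next call."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self):
        """Register a call made now."""
        self.calls.append(self.clock())
```

A typical loop would sleep for `wait_time()` seconds, make the call, and then invoke `record()`.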

Actions

Convert to OCRed PDF

Perform OCR on an existing PDF document or an image to create a searchable PDF

Extract text using OCR

Extract text from a PDF file using OCR

Convert to OCRed PDF

Perform OCR on an existing PDF document or an image to create a searchable PDF

Parameters

Name                   Key                  Required  Type     Description
Source file name       source_file_name     True      string   Name of the source file including extension
Source file content    source_file_content  True      byte     Content of the file to OCR
Language               language                       enum     Language
Performance            performance                    enum     Performance
Blacklist / whitelist  characters_option              enum     Characters option
Characters             characters                     string   Characters to blacklist or whitelist
Use pagination         paginate                       boolean  Paginate
Regions                regions                        string   Limit the area to OCR to one or more specific areas
Fail on error          fail_on_error                  boolean  Fail on error

Returns

Response data for all operations
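
To make the parameter list concrete, here is a rough sketch that assembles the inputs for this action as a dictionary. The keys come from the table above. The function name `build_ocr_pdf_request` is hypothetical, and base64-encoding the `byte` field is an assumption based on the usual OpenAPI convention. This page does not list the accepted enum values or the format of `regions`, so the sketch passes them through unchanged. Inside a Power Automate flow the designer fills in these fields for you.

```python
import base64

# Optional keys taken from the parameter table for "Convert to OCRed PDF".
OPTIONAL_KEYS = {
    "language", "performance", "characters_option", "characters",
    "paginate", "regions", "fail_on_error",
}


def build_ocr_pdf_request(file_name, file_bytes, **options):
    """Assemble the action's parameters (hypothetical helper, no network I/O)."""
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    payload = {
        "source_file_name": file_name,  # required, includes the extension
        # Assumption: "byte" fields travel as base64 text in JSON.
        "source_file_content": base64.b64encode(file_bytes).decode("ascii"),
    }
    # Optional parameters are included only when the caller supplies a value.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

Rejecting unknown keys up front catches typos such as `fail_on_errors` before a request is sent.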

Extract text using OCR

Extract text from a PDF file using OCR

Parameters

Name                   Key                  Required  Type     Description
Source file name       source_file_name     True      string   Name of the source file including extension
Source file content    source_file_content  True      byte     Content of the file to OCR
Language               language                       enum     Language
X Coordinate           x                              string   X Coordinate (in Pts, 1/72 of an inch)
Y Coordinate           y                              string   Y Coordinate (in Pts, 1/72 of an inch)
Width                  width                          string   Width of the OCR area (in Pts, 1/72 of an inch)
Height                 height                         string   Height of the OCR area (in Pts, 1/72 of an inch)
Page number            page_number                    string   Page number (leave blank to OCR all pages)
Performance            performance                    enum     Performance
Blacklist / whitelist  characters_option              enum     Characters option
Characters             characters                     string   Characters to blacklist or whitelist
Use pagination         paginate                       boolean  Paginate
Fail on error          fail_on_error                  boolean  Fail on error

Returns

Response data for OCRText operation
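
The area parameters are expressed in points, so a region measured in inches has to be multiplied by 72 first. The sketch below shows that conversion and builds the inputs for this action. The names `inches_to_points` and `build_extract_text_request` are hypothetical, and base64-encoding the `byte` field is an assumption based on the usual OpenAPI convention.

```python
import base64

POINTS_PER_INCH = 72  # the parameter descriptions define 1 pt as 1/72 of an inch


def inches_to_points(inches):
    """Convert a length in inches to points."""
    return inches * POINTS_PER_INCH


def build_extract_text_request(file_name, file_bytes,
                               area_inches=None, page_number=None):
    """Assemble the inputs for "Extract text using OCR" (hypothetical helper).

    area_inches is an (x, y, width, height) tuple in inches, or None to
    leave the area fields out of the request.
    """
    payload = {
        "source_file_name": file_name,
        # Assumption: "byte" fields travel as base64 text in JSON.
        "source_file_content": base64.b64encode(file_bytes).decode("ascii"),
    }
    if area_inches is not None:
        for key, value in zip(("x", "y", "width", "height"), area_inches):
            # The table types these fields as strings, so send them as text.
            payload[key] = f"{inches_to_points(value):g}"
    if page_number is not None:
        # Leaving page_number out corresponds to "leave blank to OCR all pages".
        payload["page_number"] = str(page_number)
    return payload
```

For example, an area that starts 1 inch from the left and 0.5 inch from the top, 4 inches wide and 2 inches high, becomes `x="72"`, `y="36"`, `width="288"`, `height="144"`.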

Definitions

ocr_operation_response

Response data for OCRText operation

Name            Path            Type    Description
Out text        out_text        string  Extracted OCRed text in plain text.
Base file name  base_file_name  string  Name of the input file without the extension.
Result code     result_code     enum    Operation result code.
Result details  result_details  string  Operation result details.
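
Code that consumes this response outside a flow can map it onto a small typed record. The following is only a sketch: the class name `OcrTextResult` is hypothetical, and because this page does not list the possible `result_code` values, the code stores the value without interpreting it.

```python
from dataclasses import dataclass


@dataclass
class OcrTextResult:
    """Typed view of ocr_operation_response (hypothetical helper)."""

    out_text: str
    base_file_name: str
    result_code: str
    result_details: str

    @classmethod
    def from_dict(cls, data):
        # Missing fields default to an empty string instead of raising.
        return cls(**{name: data.get(name, "")
                      for name in cls.__dataclass_fields__})
```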

operation_response

Response data for all operations

Name                    Path                    Type    Description
Processed file content  processed_file_content  byte    File generated by the Muhimbi converter.
Base file name          base_file_name          string  Name of the input file without the extension.
Result code             result_code             enum    Operation result code.
Result details          result_details          string  Operation result details.
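
Because `base_file_name` omits the extension, a caller that saves the converted file has to supply one. Below is a minimal sketch of that step. The function name `save_processed_file` is hypothetical, the `.pdf` default reflects the Convert to OCRed PDF action only, and treating `processed_file_content` as base64 text is an assumption based on the usual OpenAPI convention for `byte` fields.

```python
import base64
from pathlib import Path


def save_processed_file(response, out_dir, extension=".pdf"):
    """Decode processed_file_content and write it to out_dir (hypothetical helper)."""
    content = response.get("processed_file_content")
    if not content:
        # Nothing to write; surface the service's own code and details.
        raise ValueError(
            f"No file returned ({response.get('result_code')}): "
            f"{response.get('result_details')}"
        )
    target = Path(out_dir) / (response["base_file_name"] + extension)
    # Assumption: the "byte" field arrives as base64 text.
    target.write_bytes(base64.b64decode(content))
    return target
```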