User Guide
Nitro Workspace

Intelligent Document Processing (IDP) Tools

PDFs contain valuable data that users often need to extract and use in other applications. This data may include tables of data points or structured forms, which users have traditionally had to transfer and manipulate by hand. Advancements in technology, such as Intelligent Document Processing (IDP), have significantly streamlined this process.

IDP leverages machine learning, natural language processing, and computer vision to automate the extraction, interpretation, and integration of data from documents and PDFs into other tools.

Key Nitro IDP Tools

At Nitro, we believe IDP systems are game-changers for organizations handling large document volumes, offering major boosts in speed, accuracy, and efficiency compared to manual processes.

This is why we created Nitro’s first IDP tools, Table Extract and Form Extract.

Table Extract: Automatically identifies and extracts data from tables in PDFs, even from tables without visible borders or with merged cells. This makes it easy to process data more efficiently in programs like Excel.

Form Extract: Captures text data from forms in PDFs along with labels (e.g., Name: John Doe, Phone: (555) 543 123), simplifying data collection tasks.

Use Cases for Nitro’s Table Extract and Form Extract Tools

Table Extract and Form Extract have many applications to help you extract data quickly and accurately. Here are some common use cases for these helpful tools:

  1. Automated Data Entry: Form Extract can automate the data entry process by extracting information from forms, invoices, and receipts, reducing manual entry errors and increasing efficiency.
  2. Compliance and Auditing: Table Extract can assist in compliance and auditing by extracting relevant data from a variety of documents, such as financial reports and contracts, ensuring that all necessary information is easily accessible and compliant with regulatory requirements.
  3. Loan Processing: In the financial services industry, Form Extract can streamline the processing of loan applications by extracting data, along with its labels, from supporting documents, which speeds up decision-making.
  4. Insurance Claims Processing: Form Extract can accelerate the claims processing workflow in the insurance sector by extracting data from claims forms and supporting documents, then making it available to processing tools.

These use cases illustrate how Nitro’s IDP tools can be leveraged to improve document handling processes, reduce operational costs, and enhance overall efficiency in various sectors.

How to Maximize Performance

Tables and forms can have a large variety of formats, such as labels inside boxes or tables with just lines under the totals. To maximize the performance and accuracy of our data extraction tools, here are some tips:

  1. Make sure the text in your document is in a language supported by our tools: currently English, Spanish, German, Italian, French, and Portuguese.
  2. Use a high-quality PDF with a resolution of at least 150 DPI.
  3. Ensure that tables are clearly separated from other elements on the page, avoiding overlays on images or complex patterns.
  4. Keep the text within the table upright, not rotated in relation to other text on the page.

You may encounter inconsistent results when extracting text from tables if:

  • There are merged table cells across multiple columns.
  • The tables include cells, rows, or columns that differ from other parts of the same table.

Confidence Scores

You will see a confidence score (a number between 0 and 1) alongside each table extracted in the Excel document output. The score indicates the probability that the extracted content is actually a table.

[Image: confidence scores in the Excel output]

Table Extract can detect and extract data from a large variety of table styles, whether they have borders, small fonts, or are embedded in images. Some content is ambiguous: even a human reader might be unsure whether it is a table, and the same is true of our tools. For this reason, each extracted table comes with a score that describes how confident we are that the block of data is a table.

In sensitive situations, such as financial decisions, you might require a confidence score of at least 0.9, whereas for something like archiving handwritten notes, a threshold of 0.5 might be acceptable. If you accept tables with scores below 0.9, consider requiring a higher level of manual review before using the data.
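The thresholding idea above can be sketched in a few lines of Python. This is only an illustration, not part of Nitro's tooling: it assumes you have already copied each table's confidence score out of the Excel output, and the `ExtractedTable` class and `triage_tables` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class ExtractedTable:
    # Hypothetical record: a label you assign plus the score from the Excel output.
    name: str
    confidence: float  # number between 0 and 1


def triage_tables(tables, threshold=0.9):
    """Split tables into (accepted, needs_review) using a confidence threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    accepted = [t for t in tables if t.confidence >= threshold]
    needs_review = [t for t in tables if t.confidence < threshold]
    return accepted, needs_review


if __name__ == "__main__":
    tables = [
        ExtractedTable("Quarterly totals", 0.97),
        ExtractedTable("Handwritten notes", 0.62),
    ]
    accepted, needs_review = triage_tables(tables, threshold=0.9)
    print("Accepted:", [t.name for t in accepted])
    print("Manual review:", [t.name for t in needs_review])
```

A stricter workflow would pass a higher `threshold`. A more tolerant one, such as archival, might pass `0.5` and send fewer tables to manual review.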

Limitations

Currently, the maximum file size for the table and form extraction tools is 25 MB, and a single document can contain at most 100 pages. Additionally, to prevent abuse of the tools and manage heavy use by multiple users, there is a per-account cap on Table Extract and Form Extract usage.
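If you prepare documents in bulk, a quick pre-check against these limits can save a failed upload. The Python sketch below is an assumption-laden illustration rather than a Nitro feature: `check_limits` and `check_file` are hypothetical helpers, the page count must be supplied by you (for example, from your PDF viewer), and the 25 MB limit is interpreted here as 25 × 1024 × 1024 bytes, which may differ from how Nitro measures it.

```python
import os

# Limits taken from this guide; the byte interpretation of "25 MB" is an assumption.
MAX_FILE_SIZE_BYTES = 25 * 1024 * 1024
MAX_PAGES = 100


def check_limits(size_bytes, page_count):
    """Return a list of problems; an empty list means the document is within limits."""
    problems = []
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_FILE_SIZE_BYTES}-byte limit")
    if page_count > MAX_PAGES:
        problems.append(f"document has {page_count} pages, over the {MAX_PAGES}-page limit")
    return problems


def check_file(path, page_count):
    """Convenience wrapper that reads the file size from disk."""
    return check_limits(os.path.getsize(path), page_count)
```

For example, `check_limits(10 * 1024 * 1024, 50)` returns an empty list, while a 120-page document produces one problem message. The per-account usage cap cannot be checked this way.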

Beta Features

These tools are offered as a Beta feature, which means they are fully functional and secure, but Nitro is collecting feedback from our users to optimize them for their needs. Please provide feedback on your experiences with the tools here.

Data Security and Privacy

Your documents are processed only for your needs. The content is not used to train models, and the data is stored only temporarily as part of processing. To learn more about our data handling policies and how we secure your data, please visit our Trust Center.

How to Turn Off Intelligent Document Processing Features

The Nitro account administrator has the ability to turn off IDP features in the tools they use via the Nitro Admin portal. Please refer to the admin portal user guide for instructions on how to do this.

To manage the visibility of IDP tools and services:

  1. Log in to the Nitro Admin portal.
  2. Choose Settings in the sidebar navigation.
  3. Choose the Workspace tab.
  4. Disable the Tools section.

To learn more about managing permissions in your Nitro account, please read our user guide.