IDP - Intelligent Document Processing

What is IDP?

Intelligent Document Processing (IDP) is the effective processing of documents. It requires the ability to use OCR (Optical Character Recognition) and other techniques to extract the data contained on a page so that it is available for business processing.

People can see details in documents – Intelligent Document Processing can do that!

Recognising the text and its context is essential for an automation to add real value to a business.

Accurate extraction offers the potential for “Straight Through Processing” (STP). That is an automation in which no manual work is involved.

Artificial Intelligence (AI) can be deployed to interpret the text extracted from the document.

By implementing a process with a feedback loop, Machine Learning (ML) can be included. Over time, ML can improve the ability of the AI to successfully identify the extracted data.

Why is it difficult to get data from documents?

If it is easy for a person to see data in a document, why is it hard for a program to see the data?

To recognise text, a program has to handle different character sets, different fonts and different character orientations.

Separating text from the background based on colour variations is something the human eye does very well, and programs need to be able to do it too.
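As a rough illustration of separating text from the background, the sketch below splits a tiny grayscale image into "ink" and "background" using a single brightness threshold. It assumes pixels are integers from 0 (black) to 255 (white) and uses the mean brightness as the cut-off; the function name `binarize` and the sample data are made up for this example. Real OCR engines use more robust methods (for example adaptive or Otsu thresholding), so treat this as a sketch of the idea only.

```python
def binarize(pixels, threshold=None):
    """Mark each pixel as ink (1) or background (0).

    pixels: list of rows, each a list of grayscale values 0-255.
    threshold: cut-off brightness; defaults to the mean of the image
               (an assumption made for this sketch).
    """
    flat = [p for row in pixels for p in row]
    if threshold is None:
        threshold = sum(flat) / len(flat)
    # Anything darker than the threshold is treated as text.
    return [[1 if p < threshold else 0 for p in row] for row in pixels]


# Hypothetical 3x4 scan: light paper with a few dark pixels of text.
page = [
    [250, 248, 251, 249],
    [247,  30,  25, 250],
    [249,  28, 252, 248],
]
mask = binarize(page)
for row in mask:
    print(row)
```

A single global threshold works on a clean scan like this one, but it tends to struggle with shadows, coloured backgrounds or faded print, which is part of why this step is harder for a program than for a person.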

Some document files are easier to process than others. PDF documents are among the most common types used in business processes. Many of them, particularly scanned documents, present the page effectively as an image, which is why OCR techniques need to be used.

There are a number of leading OCR engines available from vendors such as Google and Microsoft.

These OCR engines are all able to extract characters, and they get most of them right.

By using AI’s ability to process patterns, and giving it context in terms of the type of document that is expected, the characters extracted by OCR can be corrected for typical mistakes, such as confusing the letter “L” with the number “1”.
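The idea of using context to correct look-alike characters can be sketched with a simple rule: if a field is expected to be numeric, any letter that is commonly mistaken for a digit is swapped for that digit. The look-alike table, the `expected` labels and the function name `correct_field` are assumptions for this example; a real IDP product would more likely rely on a trained model than a fixed table.

```python
# Hypothetical table of letters that OCR commonly confuses with digits.
LOOKALIKE_TO_DIGIT = str.maketrans({
    "l": "1", "I": "1",
    "O": "0", "o": "0",
    "S": "5", "B": "8",
})


def correct_field(raw, expected):
    """Clean up an OCR value using what we expect the field to contain.

    expected: "numeric" for amounts, quantities and similar fields;
              any other label leaves the text unchanged.
    """
    if expected == "numeric":
        return raw.translate(LOOKALIKE_TO_DIGIT)
    return raw


print(correct_field("l2O.5O", "numeric"))      # an invoice total
print(correct_field("Oslo Ltd", "free_text"))  # a company name, left alone
```

The same characters are treated differently depending on the field: the "O" in an amount is almost certainly a zero, while the "O" in a company name is left as it is.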

The use of AI in an extraction process is why the term “Intelligent” is added to Document Processing.

Humans in the loop

Despite the best extraction techniques and AI processing to determine effective pattern matches, there will be some documents that require humans to review them.
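One common way to decide which documents need a person is to attach a confidence score to each extracted value and send anything below a threshold for review. The sketch below assumes such a score exists (many extraction tools report one, but the 0.0 to 1.0 scale, the 0.90 threshold, and the names `Field` and `route` are all hypothetical choices for this example).

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    value: str
    confidence: float  # hypothetical score from 0.0 (guess) to 1.0 (certain)


REVIEW_THRESHOLD = 0.90  # assumed cut-off; would be tuned per process


def route(fields, threshold=REVIEW_THRESHOLD):
    """Return the queue for a document and the fields a person should check."""
    needs_review = [f.name for f in fields if f.confidence < threshold]
    if needs_review:
        return ("human_review", needs_review)
    return ("straight_through", [])


invoice = [
    Field("total", "120.50", 0.99),
    Field("date", "2024-01-05", 0.72),
]
print(route(invoice))
```

Only documents where every value clears the threshold go straight through; a single doubtful value is enough to put the document in front of a person.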

Modern solutions display a page of the document with the data values found on the page highlighted. The person can compare each extracted value with the corresponding highlighted item in the document to check whether it is OK, or whether action needs to be taken to change the extracted value. As part of making the change, the person might identify that the correct value exists in the document but at a different location.

Machine Learning from extraction

When it is necessary for a human to review the extraction of data from documents, there is also the possibility of using Machine Learning.

By capturing information on the success of each extraction, the “Reinforcement” of the mechanism can be enhanced. Where an extracted value is corrected, the pattern analysis can be improved. Where an alternative location for the correct value in the document is used, the location criteria can be adjusted.
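The three outcomes described above (value confirmed, value corrected, value found at a different location) can be captured as simple feedback records. The sketch below is one possible shape for such a record; the `Review` fields, the bounding-box format and the outcome labels are assumptions for illustration, not the format of any particular product.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple

# Hypothetical (left, top, right, bottom) position of a value on the page.
Box = Tuple[int, int, int, int]


@dataclass
class Review:
    field: str
    extracted_value: str
    extracted_box: Box
    final_value: str   # value after the person's review
    final_box: Box     # where the person says the value really is


def outcome(review):
    """Classify a review so it can be fed back as a training signal."""
    if review.final_box != review.extracted_box:
        return "relocated"   # correct value lives elsewhere: adjust location criteria
    if review.final_value != review.extracted_value:
        return "corrected"   # right place, wrong characters: improve pattern analysis
    return "confirmed"       # extraction was right: reinforce it


reviews = [
    Review("total", "120.50", (400, 700, 480, 720), "120.50", (400, 700, 480, 720)),
    Review("date", "2O24-01-05", (50, 40, 150, 60), "2024-01-05", (50, 40, 150, 60)),
    Review("invoice_no", "PO-7781", (50, 80, 150, 100), "INV-0042", (300, 40, 400, 60)),
]
summary = Counter(outcome(r) for r in reviews)
print(summary)
```

Keeping the confirmed cases as well as the corrections matters here, because the successes are part of the signal too.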

Feeding all of this information into the AI model to supplement the model’s training will, over time, enhance the success of the extraction process.

A realistic expectation is that changing the original training of the model will take a lot of instances from Machine Learning. It is not a case of “Correct Once” and seeing instant success.