Patent application lodged for Machine Learning in Data Capture

US firm ancora Software has filed a patent application for the way its ancoraDocs software can correctly identify data fields that must be captured from documents potentially distorted during the scanning process.

ancora Software, Inc., is a developer of intelligent process automation solutions, including Intelligent Document Classification and Advanced Data Capture.

Soaring volumes of documents and data require businesses to find more efficient ways of capturing critical data from the documents they receive. For instance, processing and approving an invoice typically requires accounts payable staff to capture data such as the invoice number, the supplier’s name and address, and the invoice amount and due date.

Legacy data capture systems work well on well-designed, well-behaved documents. They rely on pre-defined layouts for each document type and on correctly identifying keywords such as ‘Loan Number’ and ‘Co-Borrower’ to determine the exact location of the data that must be captured.

These approaches to locating data often fail on badly designed documents or when vertical or horizontal shifts, noise, pre-printed lines, folds, or other distortions occur during the scanning process.
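To make the failure mode concrete, here is a minimal Python sketch of a fixed-zone template reading a field from OCR output. It is an illustration only, not how any particular legacy product works: the `Word` and `Zone` types, the pixel coordinates, and the sample invoice values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: int  # left edge in pixels (hypothetical OCR output)
    y: int  # top edge in pixels


@dataclass(frozen=True)
class Zone:
    x0: int
    y0: int
    x1: int
    y1: int

    def contains(self, word: Word) -> bool:
        return self.x0 <= word.x < self.x1 and self.y0 <= word.y < self.y1


def read_zone(words, zone):
    """Join the text of every word whose top-left corner lies in the zone."""
    return " ".join(w.text for w in words if zone.contains(w))


def shift(words, dx, dy):
    """Simulate a scan that moved the whole page by (dx, dy) pixels."""
    return [Word(w.text, w.x + dx, w.y + dy) for w in words]


# A made-up invoice with two rows of label/value pairs.
clean_page = [
    Word("Invoice", 50, 40), Word("Number", 120, 40), Word("INV-1001", 200, 40),
    Word("Amount", 50, 80), Word("Due", 120, 80), Word("250.00", 200, 80),
]

# Template zone drawn around the invoice number on a clean scan.
invoice_number_zone = Zone(190, 30, 300, 60)

# The same page scanned 40 pixels higher.
shifted_page = shift(clean_page, dx=0, dy=-40)

clean_value = read_zone(clean_page, invoice_number_zone)
shifted_value = read_zone(shifted_page, invoice_number_zone)
print(clean_value)    # INV-1001
print(shifted_value)  # 250.00
```

On the shifted scan the template does not fail loudly. It returns the amount in place of the invoice number, which is the kind of error that then has to be caught and re-keyed by hand.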

ancoraDocs uses patented technology to overcome the challenges caused by distorted images. The software applies a machine learning approach that capitalises on information from already processed images to capture data from as-yet unprocessed images of documents such as invoices, purchase orders, sales orders, bills of lading, remittance documents, and health claims.

“The volume of information that organisations receive is growing every day,” said ancora Software CEO Noel Flynn.

“They cannot afford the time and resources to manually key data on every document that has become distorted during scanning. Using examples of documents from the same source and with the same layout, the technology built into ancoraDocs automatically determines the precise location of data, even when an image has become distorted.”
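The article does not describe how ancoraDocs works internally, so the following Python sketch should not be read as ancora's method. It only illustrates the general idea of using an already processed example from the same source to locate a field on a shifted scan. Words are assumed to arrive as hypothetical `(text, x, y)` tuples from OCR, and the shift is estimated as the median offset of words that appear once on both pages.

```python
from statistics import median


def unique_positions(words):
    """Map each word that occurs exactly once on the page to its (x, y)."""
    seen = {}
    for text, x, y in words:
        seen.setdefault(text, []).append((x, y))
    return {text: pos[0] for text, pos in seen.items() if len(pos) == 1}


def estimate_shift(reference, page):
    """Median (dx, dy) between words shared by the example and the new page."""
    ref = unique_positions(reference)
    new = unique_positions(page)
    shared = ref.keys() & new.keys()
    if not shared:
        return 0, 0  # nothing to align on; fall back to the raw zone
    dx = median(new[t][0] - ref[t][0] for t in shared)
    dy = median(new[t][1] - ref[t][1] for t in shared)
    return dx, dy


def capture(page, zone, offset=(0, 0)):
    """Read the words inside the zone after moving it by the offset."""
    x0, y0, x1, y1 = zone
    dx, dy = offset
    return " ".join(
        text for text, x, y in page
        if x0 + dx <= x < x1 + dx and y0 + dy <= y < y1 + dy
    )


# Already processed example, with the invoice number zone known.
reference = [
    ("Invoice", 50, 40), ("Number", 120, 40), ("INV-1001", 200, 40),
    ("Amount", 50, 80), ("Due", 120, 80), ("250.00", 200, 80),
]
zone = (190, 30, 300, 60)

# New scan from the same supplier: shifted, with slight OCR jitter on "Due".
new_page = [
    ("Invoice", 62, 0), ("Number", 132, 0), ("INV-2002", 212, 0),
    ("Amount", 62, 40), ("Due", 133, 41), ("975.50", 212, 40),
]

offset = estimate_shift(reference, new_page)
uncorrected = capture(new_page, zone)
corrected = capture(new_page, zone, offset)
print(uncorrected)  # 975.50
print(corrected)    # INV-2002
```

The median is used so that a single jittery or misread label does not skew the estimate. A pure translation is the simplest possible distortion model; folds, skew, and noise would need a richer one.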

ancoraDocs’ patented unassisted and assisted machine learning algorithms eliminate the need for document capture templates or a long, complicated setup. ancora Software says the product can be deployed in hours or days, not the weeks or months required for traditional document capture solutions.