The Importance of Capture in Document-Based Process Automation

By Arnold von Büren

Capture is dead, long live capture! 25 years ago, turning a paper document into a digital image was a rather big challenge. One needed these unwieldy devices called scanners and document capture application software.

The capture software would create an image and typically store it as a .tiff file. Then, before these captured images could be put into an electronic archive, a person would need to add index data manually.

This was known as ‘late’ capture. ‘Late’ because paper documents were only captured and electronically archived at the very end of an entirely paper-based (manual) process. The whole purpose was simply to replace paper-based filing systems with an electronic archive for easy storage and retrieval. Such were the times!

The capture of forms and data developed relatively early in the 1990s and was initially called data capture to differentiate it from document capture, because only the data is of interest and the images are often discarded. Different technologies existed, each focused on handling either machine-printed or handwritten content, or special information (e.g., checkboxes).

Around 15 years ago, ‘early’ capture began to surface, meaning that documents are captured (scanned) immediately upon arrival at the organization. Again, an image is produced, but advances in technology mean that OCR is now applied to obtain textual (content) data.

The normalization standard is typically PDF/A, which keeps the image information and holds its text data in a separate layer. ‘Early’ capture is widespread nowadays. It is called digitization and is the important first step toward complete digital processing.

From digitization to intelligent document processing (IDP)

With the arrival of the internet, it became possible to scan documents from any office with so-called desktop scanners, and the most prevalent ‘documents’ became emails. Capturing the body of an email is straightforward. Email attachments, however, are an entirely different matter: the variety of attachment formats is mind-boggling, and that’s before we even get into the topic of emails attached to other emails, or emails containing archive files (of which there are again multiple types, such as .zip and .rar). Recognizing the formats and normalizing them into one single standard is the first challenge.
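To make the attachment problem concrete, here is a minimal sketch in Python, using only the standard library, of the first half of that challenge: recognizing what an email actually contains. It walks an email's attachments, follows emails attached to other emails, and looks inside .zip archives. The function name `inventory` and the path convention are assumptions made for illustration; this is not how any particular capture product works, and other archive types such as .rar would need a third-party library.

```python
import io
import mimetypes
import zipfile
from email.message import EmailMessage


def inventory(msg: EmailMessage, prefix: str = "") -> list[tuple[str, str]]:
    """List (path, content_type) for every attachment in msg.

    Recurses into attached emails (message/rfc822) and lists the members
    of .zip attachments. Illustrative sketch only.
    """
    found = []
    for index, part in enumerate(msg.iter_attachments()):
        # Attachments without a filename get a positional placeholder name.
        path = prefix + (part.get_filename() or f"part{index}")
        ctype = part.get_content_type()
        found.append((path, ctype))

        if ctype == "message/rfc822":
            # An email attached to an email: descend into it.
            found.extend(inventory(part.get_content(), prefix=path + "/"))
            continue
        if part.is_multipart():
            # Other nested multipart containers are recorded but not opened.
            continue

        data = part.get_content()
        # Only open files named .zip; formats such as .docx are also zip
        # containers internally but should be treated as single documents.
        if (
            isinstance(data, bytes)
            and path.lower().endswith(".zip")
            and zipfile.is_zipfile(io.BytesIO(data))
        ):
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                for member in archive.namelist():
                    # Archive members carry no MIME header, so guess the
                    # format from the file extension.
                    guessed = mimetypes.guess_type(member)[0]
                    found.append(
                        (f"{path}/{member}", guessed or "application/octet-stream")
                    )
    return found
```

The resulting list is only an inventory. Normalizing each entry into one standard, for example PDF/A, is a separate step that depends on the conversion tooling available.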

With a personal scanner in everyone's pocket and data entering organizations in an ever-growing variety of formats, organizations needed to be able to extract data from a wide range of sources, turn that data into information, and gather insights.

The criticality of capture

Customers want to choose how they communicate with organizations (format, channel), and organizations want to remove friction from customer-centric processes. To achieve this, organizations must have a system and process in place that treats all incoming documents in the same manner. Only this centralized, standardized approach to normalization at the point of capture keeps downstream processing errors to a minimum.

Of course, I would not write this if the products and solutions provided by TCG Process could not fully provide the necessary functionality.

Capture is neither a thing of the past nor getting less important – quite the contrary. Capture remains a critical aspect of intelligent document processing (IDP) and digitization efforts, serving as the foundation for automating and optimizing business processes in organizations that are driven by document-based information flow.

By implementing effective capture strategies, businesses can unlock the full potential of their data and leverage it to drive digital transformation.

Pay attention to proper capture. TCG Process’ strong roots in capture and IDP mean we know how important it is to get this right, so that you don’t pay for it somewhere downstream, where correction costs explode, process times increase dramatically, and customer satisfaction suffers. Long live capture!

Arnold von Büren is CEO, TCG Process.