IDP. It’s about the Infrastructure, Stupid!

Arnold von Büren, CEO of TCG Process, is a Swiss entrepreneur with three decades of experience in capture and input management. During his visit to Australia in February, IDM asked Arnold to outline his views on the current state of Intelligent Document Processing (IDP).

IDM: What is the source of the technology behind TCG Process?

AvB: It’s all our IP and the only technology we integrate is for document classification. We’ve been developing the platform for 10 years, since going it alone and moving away from external providers, while also expanding internationally into many countries. We have expanded into areas we are happy with, and I am very much a believer that we need a local presence in countries where we are looking to play. (TCG Process’ ANZ subsidiary opened in Sydney in 2020 under Managing Director Frank Volckmar.)

We use the word orchestration quite a bit. Some might refer to it as workflow or Business Process Automation (BPA). We consider ourselves more in the BPA space than the capture space, providing a level of verification that can increase the success of capture and recognition to much higher levels.

IDM: Can you explain in a bit more detail how TCG Process leverages AI and other technologies to deliver fast and accurate outcomes?

AvB: The only place where we use AI at the moment is in our classification technology and the rest is machine learning or self-learning. We are cheating in a way. We always take what we find and compare it with the customer’s data, and that gives us a lot of hints as to what’s right and what’s wrong. That’s another disadvantage when you are somewhere in the cloud and your AI is 100% OCR but it’s not validated against your data. What we try to do is provide very early access to the database and use that data to verify what we find in the document. So, yes, we need access to the data, which is sometimes not that easy because people are reluctant to provide it, but when we have that we get much better results.

We try to differentiate between validation and verification. Validation is making sure the content that was on the document is now in the fields. Verification is making sure the field content corresponds with the database content. That turns business data into business information.

But we are not only comparing against the database; we also establish rules within the system so we can be really sure that when data leaves our system it is absolutely correct, with no more steps needed, and you can run the transaction.
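
As a rough illustration of the distinction drawn above, the following Python sketch separates the three checks: validation (the required fields were captured), verification (the captured values agree with the customer's own records) and a business rule. Everything in it is hypothetical: the field names, the `SUPPLIERS` lookup standing in for the customer database, and the "net plus tax equals total" rule are assumptions for illustration, not a description of the TCG Process platform.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical stand-in for the customer's master data (normally a database).
SUPPLIERS = {"S-1001": {"name": "Acme Pty Ltd", "tax_id": "51824753556"}}

REQUIRED_FIELDS = ("supplier_id", "tax_id", "net", "tax", "total")


def validate(fields):
    """Validation: every required field was captured and is non-empty."""
    return [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if not str(fields.get(name) or "").strip()
    ]


def verify(fields, suppliers):
    """Verification: captured values agree with the customer's records."""
    record = suppliers.get(fields.get("supplier_id"))
    if record is None:
        return ["unknown supplier"]
    if record["tax_id"] != fields.get("tax_id"):
        return ["tax ID does not match supplier record"]
    return []


def apply_rules(fields):
    """Assumed business rule: net plus tax must equal total exactly."""
    try:
        net, tax, total = (Decimal(fields[k]) for k in ("net", "tax", "total"))
    except (KeyError, TypeError, InvalidOperation):
        return ["amounts are not numeric"]
    return [] if net + tax == total else ["net + tax does not equal total"]


def process(fields, suppliers=SUPPLIERS):
    """Run the checks in order; stop at the first stage that reports errors."""
    checks = (validate, lambda f: verify(f, suppliers), apply_rules)
    for check in checks:
        errors = check(fields)
        if errors:
            return "manual_review", errors
    return "straight_through", []
```

Only a document that passes all three stages is released for straight-through processing; anything else is routed to a person along with the reason.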

IDM: Where do you think the main opportunities lie in this region for intelligent document processing solutions?

AvB: Everywhere. We thought that Invoice Processing was only a replacement business, but it’s still a huge opportunity. Many companies are still lagging far behind with digitisation and are upgrading their AP OCR platforms to enterprise IDP so they may use the same platform to automate ingestion of all business information. And you can only automate if you digitise and normalise all the incoming documents and data streams. There are still huge opportunities.

Companies that have “digitised” may have a portal and take things in by email and other means, but they haven’t automated. There is a huge gap between those inputs and their backend systems, with all kinds of administrative staff performing a lot of mundane tasks. That work could be elevated with automation and done much better, without mistakes and errors.

It’s fine to digitise incoming content but you may end up with 59 different document file types, which are impossible to open and review. You need to “normalise” those files so they fit on one screen and then extract the information and run rules and processes as a starting point for automating processes.

Many companies have created a Chief Digitalisation Officer. I think it’s the wrong title; it should be Chief Automation Officer. Digitising is just the first step; you must then normalise and automate.

IDM: Do you think organisations need to take a step back and reconsider the suitability of public cloud capture offerings?

AvB: People are hesitant about going into cloud. First, it must now be a local cloud for the local country, which makes it a little tougher for the suppliers. But it’s no longer an issue about the software technology; the important thing is to have a tool to orchestrate all these services. In the beginning you must do capture well and then deal with the document, the OCR, the AI, the classification.

What’s interesting for me is that everybody gets so excited about the technology and sometimes I say, ‘amateurs are excited about technology, experts talk about the infrastructure’. Everyone’s excited about new technology, everyone’s excited about ChatGPT, but how do you bring it into your system, how do you integrate it? That’s where the strength of our product lies. The interesting thing for me is that many of these AI offerings pretend to have a solution but they only have the OCR, a sliver of the whole project. OCR is the enabler of a solution but not the answer. The cloud offerings are great but they must be orchestrated to create value.

Artificial Intelligence (AI) in a way is a scam, as it’s never right. There are so many places in business where you need to be 100% right. In classification you can get to 70% with AI but that’s as far as you can get, so for 30% you are back to manual and need some diligent eyes and hands somewhere to manually process. When we apply our technology, we get up to 85-90% automated throughput.

IDM: Following events in 2022, information security is now very much front of mind in Australia. You highlight some of the issues in hosting data sets for training AI. Do you think there is a lack of awareness of the risk?

AvB: Absolutely. And something has to happen first before people will realise. Just recently we had Microsoft go down with certain services in the cloud. The risk of a data leak is always there. We see even today that companies are very reluctant to put everything in the cloud. I think it will remain a very hybrid environment with the really crucial data kept on-premise or in a private cloud.

It always makes me nervous when I see hype. It should be all about the integration and the infrastructure; that’s what you need to be concerned about.

IDM: Microsoft has announced a number of solutions such as Syntex, AI Builder and Azure Applied AI Services. Is Microsoft offering a competitive platform for Intelligent Document Processing (IDP)?

AvB: No. I mean, look at how many customers Microsoft has. Billions. They can’t have a technology that fits everybody. It has to be very, very generic. We are very much a Microsoft shop but it’s important to find a solution that works for your organisation. It might include a piece of Microsoft AI, it might be some Google Cloud technology, or something else from a third party. The great thing now is it’s all open so you can integrate almost anything (plus you can also throw out things when you see that they do not work). I hold Satya Nadella in high regard, but it seems almost childish the way they jumped straight onto ChatGPT, which wasn’t even grown in their garden; they were just the quickest to give them massive computing power and now they are hyping it. Then Google brings out Bard.

OCR is now quite commoditised and what’s important and adding value is process automation. Microsoft and Amazon are now providing generic OCR which you can access as a service, but it will only get you part of the way there.

IDM: Can you discuss the future developments and advancements in intelligent document processing technology?

AvB: IDP is excellent for collecting and understanding incoming business information, whether structured, semi-structured or, more recently, unstructured, from forms to invoices to medical reports, and we are focused on continuing to improve the options for how organisations “act” on the information. This is today’s process automation or orchestration value add. As well as ingesting any type of document from any channel, IDP platforms must be able to integrate, at any scale, into a customer’s infrastructure, flexibly, securely and simply, to ensure business information is automatically acted on quickly and cost effectively.

IDP is becoming more about the underlying architecture and fit with infrastructure to ensure organisations are able to leverage every opportunity to verify data and act upon it effectively. The architecture of IDP offerings is evolving and legacy applications will find it increasingly difficult to adapt.

The beauty of our platform is it can use any new technology that is developed, which we can then integrate. Our platform can use Microsoft Cognitive Services if you choose, or one of our competitors’ OCR engines, or Tesseract, or Google; all of those are integrated on the platform. So, whatever you believe is the best OCR engine for your application, we will integrate it and then plug into different information sources and leverage all the data points in your infrastructure to make sure that what you are looking at when you have captured something is right.
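
The plug-in idea described above can be sketched in Python as a registry of interchangeable engines behind one interface, with the result checked against the customer's own data. This is an illustrative sketch only: `OcrEngine`, `register`, `capture` and the two stub engines are hypothetical names, and no real vendor API is called. A real adapter would wrap the chosen engine's own SDK behind the same `recognise` method.

```python
from typing import Dict, Protocol, Set, Tuple


class OcrEngine(Protocol):
    """Common interface every pluggable engine is assumed to implement."""

    def recognise(self, image: bytes) -> str: ...


class StubEngineA:
    """Placeholder engine: pretends the bytes are text and upper-cases them."""

    def recognise(self, image: bytes) -> str:
        return image.decode("utf-8").strip().upper()


class StubEngineB:
    """Placeholder engine: returns the text unchanged apart from trimming."""

    def recognise(self, image: bytes) -> str:
        return image.decode("utf-8").strip()


ENGINES: Dict[str, OcrEngine] = {}


def register(name: str, engine: OcrEngine) -> None:
    """Add or replace an engine; swapping one out is a single call."""
    ENGINES[name] = engine


def capture(image: bytes, engine_name: str, known_values: Set[str]) -> Tuple[str, bool]:
    """Run the chosen engine, then check the result against known data."""
    text = ENGINES[engine_name].recognise(image)
    return text, text in known_values


register("engine_a", StubEngineA())
register("engine_b", StubEngineB())
```

Because the calling code depends only on the `recognise` method, an engine that underperforms can be replaced without touching the verification step that follows it.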

Arnold von Büren was a founding member of DICOM Group plc. and played an instrumental role in the acquisition of Kofax, Inc. USA, becoming Kofax CEO in 2000. Since 2007 he has been CEO of TCG Process, providing leading process automation software to businesses of all sizes and growing the company into a global organization with more than a dozen subsidiaries across Europe, the Americas and Asia Pacific.