Mercatus Introduces AI-based PDF Parser

Mercatus has announced the availability of PDF Parser, a technology-augmented PDF data extraction tool for private markets. The latest in a series of enhancements to the Mercatus platform, the PDF Parser feature significantly mitigates the challenges of data on-boarding by eliminating manual extraction of data from asset reports, investor memos and other custom reporting.

“Scraping data from PDFs is a time-consuming, expensive and manual process that is prone to errors and often leads to poor data quality,” says Mercatus CEO Haresh Patel.

“With PDF Parser, we are eliminating the labour-intensive task of processing, extracting, cleaning and uploading data from PDFs. By automating the process, businesses can do in two or three minutes what traditionally has taken three or more days. That’s a reduction of US$240,000 per reporting cycle for a typical fund manager managing 50 active investments.”

Because PDFs are designed for humans and not computers, they do not have a defined structure that allows users to gather data from them easily. With a solid back-end extraction tool, the Mercatus data management platform allows users to query, search, filter, merge, sort and extract text and images from any PDF document effectively. Features include:

- Document Parser Templates – Leverage configurable document parser templates for automated and repeatable data extraction from assets, performance reports, investor memos and more.

- Batching and Historical Entry – Upload a batch of PDFs at one time to load data for single or multiple entities. Upload decades' worth of data in minutes, not months.

- Auditing and Governance – Construct robust data lineage across an entire investment portfolio. Track and audit where data is coming from, how it is being used and who is using it.

“Private market investors are dependent on diverse and rapidly changing unstructured documents. Traditional optical character recognition (OCR) and data extraction techniques that work in adjacent markets with standardized documents simply do not work,” says Mercatus CTO Jason Adams.

“By blending advancements in AI and machine learning with human-in-the-loop processing, we have provided a solution that answers the demand to interact with these documents at scale. Now critical business data locked away in unstructured documents can finally be accessed and extracted at high volumes in an automated way.”

Last week, Mercatus released a No-Code Integration front-end interface that automates the data-in process by connecting to private market applications and databases.

www.gomercatus.com