ActivePDF offers API for PDF text extraction

ActivePDF has released Xtractor 8.1.0, a .NET API for searching and extracting text and images from PDF files. Formerly available within a bundled package, Xtractor is now accessible as a stand-alone developer tool, designed for high-volume PDF data extraction and digital transformation automation.

Xtractor 8.1.0 lets users specify extraction criteria such as words, invoice data, image formats, and locations of interest, as well as other data matched with regular expressions, such as Medicare numbers and phone numbers. Once the PDF data is extracted, the content immediately becomes available for automation, editing, indexing, and other digital transformation needs.

Developers and IT professionals in healthcare, finance, insurance, property management, and other industries turn to Xtractor for high-volume data extraction when OCR data capture is not required or is not part of the business process.

Key features of Xtractor 8.1.0 include:

  • Classification and Indexing: Quickly and easily automate the extraction of key information. This feature provides an easy solution for categorising or indexing documents for archiving, classification, and more.
  • Extract and Save: Xtractor enables you to save all extracted text to a selected memory stream. Once saved, simply assign the saved data to a specified file name by targeting all images, specified pages, or placement coordinates.
  • Automate PDF Processing: Save time and money by automating high-volume search and extraction. Select data based on contextual information such as keywords, key phrases, location, or what you define with regular expressions, and those targeted PDF data fields will be automatically extracted.
  • Extraction Options: Besides defining data parameters for capture with regular expressions, users can locate and extract a variety of data including invisible text or metadata for PDF indexing and archiving. You can also locate and extract image files such as JPG, TIFF, PNG, or BMP with this .NET API.
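
To make the regular-expression idea above concrete, here is a minimal sketch in Python of pattern-based selection over text that has already been pulled out of a PDF. It does not use or represent Xtractor's .NET API. The patterns, the find_fields helper, and the sample text are all hypothetical and chosen only for illustration; real phone and invoice formats vary and would need their own patterns.

```python
import re

# Hypothetical patterns for illustration only; real-world formats vary.
# US-style phone number such as (555) 867-5309 or 555-123-4567.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
)
# Made-up invoice reference format such as INV-2024-0042.
INVOICE_RE = re.compile(r"\bINV-\d{4}-\d{4}\b")


def find_fields(text):
    """Return every match for each pattern, in order of appearance."""
    return {
        "phones": PHONE_RE.findall(text),
        "invoices": INVOICE_RE.findall(text),
    }


# Stand-in for text extracted from a PDF page.
sample = (
    "Invoice INV-2024-0042. "
    "Call (555) 867-5309 or 555-123-4567 with questions."
)
fields = find_fields(sample)
print(fields)
```

The same approach extends to other fields by adding one compiled pattern per field, which keeps each extraction rule easy to review on its own.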

“Our main focus at ActivePDF is to continually help businesses increase productivity in an efficient and affordable manner,” says Tim Sullivan, ActivePDF Chief Architect and CEO.

“In certain scenarios where OCR data capture can’t be used, Xtractor 8.1.0 is the perfect example of how we are still driving our partners' digital transformation initiatives through smart engineering and thoughtful development. We look forward to further assisting businesses optimize their digital workflow with innovative .NET developer tools such as Xtractor 8.1.0.”

https://www.activepdf.com/products/xtractor