contentCrawler adds automated file compression

DocsCorp has released a new module for contentCrawler, its integrated analysis, processing and reporting framework. The module adds the ability to significantly reduce the size of large document files in a content repository as an automated backend process.

A content repository could be a document management system, SharePoint or a Windows file system. Reducing document file size reduces storage costs and speeds up file transfers when downloading or sending these documents via email.

contentCrawler analyzes documents in a content repository based on a particular search query and compression or OCR thresholds specified by the IT Administrator. It then processes the documents that meet the criteria and saves them back into the content repository, replacing the originals with smaller, fully text-searchable files.
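
The threshold-based selection described above can be illustrated with a minimal Python sketch. It is not DocsCorp's implementation or API: the `Document` class, the `estimated_saving` and `select_for_compression` functions, the 30% threshold, and the use of a trial zlib pass to estimate achievable compression are all assumptions made for illustration.

```python
import os
import zlib
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for a file held in a content repository."""
    name: str
    data: bytes


def estimated_saving(doc: Document) -> float:
    """Return the fraction of bytes saved by a trial zlib compression.

    Assumption: a trial compression pass is used as a rough proxy for
    how much smaller the document could become. Empty files save 0.0.
    """
    if not doc.data:
        return 0.0
    compressed = zlib.compress(doc.data, 6)
    # Clamp at zero: incompressible data can grow slightly when compressed.
    return max(0.0, 1 - len(compressed) / len(doc.data))


def select_for_compression(docs, min_saving=0.3):
    """Keep only the documents whose estimated saving meets the threshold."""
    return [d for d in docs if estimated_saving(d) >= min_saving]


docs = [
    Document("report.txt", b"quarterly figures " * 500),  # highly repetitive
    Document("scan.bin", os.urandom(4096)),               # effectively incompressible
]
selected = select_for_compression(docs)
print([d.name for d in selected])
```

In this sketch only the repetitive file passes the threshold, so the incompressible file would be left untouched rather than reprocessed for no gain.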

“In the case of the Compression module, contentCrawler will identify documents where a certain level of compression is achievable. It then compresses these documents as a backend process freeing up space for other documents to be added,” explains DocsCorp President and co-founder, Dean Sappey.

“Providing a backend service like contentCrawler delivers huge benefits in terms of efficiency, productivity as well as cost savings.”

IT Administrators can combine contentCrawler modules into a single, multi-process service for even greater efficiency and productivity. For example, a combined OCR and Compression service would locate all the non-searchable image-based documents in a content repository, OCR and convert them to text-searchable PDFs, which would then be reduced in file size through compression and downsampling.

It is an end-to-end automated solution that runs 24/7 without staff intervention, taking advantage of 4, 8, 16 or 32 CPU cores for concurrent processing. Staff do not have to worry about OCR or Compression processes or workflows. Instead, contentCrawler works in Backlog mode for legacy documents and Active monitoring for recently profiled documents. It can also work in both modes simultaneously.

contentCrawler integrates with leading document management systems including iManage, HP TRIM, OpenText eDOCS, OpenText Content Server, Worldox and ProLaw, as well as SharePoint and Windows file systems.

The contentCrawler framework currently supports Compression and OCR processing.
