DocsCorp spreads the discovery net
DocsCorp is delivering the ability to search for hidden image-based content in document management systems via a new Content Crawler OCR module for its pdfDocs product line.
Autonomy iManage and Opentext eDOCS are the first two DM platforms that can be “crawled” to detect image-based documents, which are then automatically OCR’d to make them text-searchable.
Documents that arrive by fax, by scanner or as email attachments can bypass the OCR processing that would make them text-searchable. Once in the DMS, these documents become completely “invisible” to the search engines.
“Businesses have made considerable investments in document management and search technologies, but it is estimated that 10-20% of documents in a DMS are non-searchable. This figure represents a significant risk to any business. Its reputation and financial well-being could be impacted simply by failing to produce a specific document on demand,” says David Woolstencroft, DocsCorp President Marketing, Sales and Strategy.
pdfDocs Content Crawler provides a framework for searching an entire DMS database or a subset of documents based on specific DMS queries. The Content Crawler OCR module identifies non-searchable content in image files and PDF files, and even looks inside email attachments. The files are converted to text-searchable PDFs using DocsCorp’s OCR technology and saved back into the DMS. Content Crawler can search and convert backlogs of legacy documents as well as actively monitor newly-profiled documents.
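The crawl-and-convert workflow described above can be sketched roughly as follows. This is an illustrative sketch only, not DocsCorp's implementation or API: the `Document` class, the `needs_ocr`, `walk` and `crawl` functions, and the rule that "an image or a PDF with an empty text layer is non-searchable" are all assumptions made for the example, and the OCR step is passed in as a placeholder function.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List


@dataclass
class Document:
    """Hypothetical stand-in for a DMS document (not a real DMS type)."""
    doc_id: str
    kind: str                      # assumed values: "pdf", "image", "email"
    text_layer: str = ""           # empty string means no searchable text
    attachments: List["Document"] = field(default_factory=list)


def needs_ocr(doc: Document) -> bool:
    # Assumption: images always need OCR; PDFs need it when they carry no text.
    if doc.kind == "image":
        return True
    return doc.kind == "pdf" and not doc.text_layer.strip()


def walk(doc: Document) -> Iterator[Document]:
    # Yield the document itself, then recurse into any attachments.
    yield doc
    for attachment in doc.attachments:
        yield from walk(attachment)


def crawl(documents: Iterable[Document],
          ocr: Callable[[Document], str],
          convert: bool = True) -> List[str]:
    """Return IDs of non-searchable documents; optionally convert them."""
    flagged = []
    for top_level in documents:
        for doc in walk(top_level):
            if needs_ocr(doc):
                flagged.append(doc.doc_id)
                if convert:
                    # Placeholder OCR call; the result becomes the text layer
                    # of a (notionally) text-searchable PDF.
                    doc.text_layer = ocr(doc)
                    doc.kind = "pdf"
    return flagged
```

Calling `crawl(docs, ocr, convert=False)` only reports which documents lack searchable text, while the default `convert=True` also fills in the text layer, so a second pass over the same documents flags nothing.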
Woolstencroft adds: “If you don’t know the extent of the problem, or you are not sure if you have a problem, DocsCorp invites you to use Content Crawler (trial version mode) to provide an audit report of your DMS documents.”
DocsCorp will also market the product for litigation support, where firms are analysing electronic discovery bundles but do not know whether all of the documents in the bundle are searchable. Many firms have invested heavily in search technology and complex litigation support systems, but if the documents these systems are pointed at do not contain searchable text, their effectiveness is reduced.
The current release integrates with Autonomy iManage 8.2 or higher and Opentext eDOCS DM 5.1.05 or higher. Further DMS and Content Repository integrations will follow.