ImageMAKER announces Near Duplication Detection integrating the dtSearch Engine
A specialist in document imaging and eDiscovery solutions, ImageMAKER will now market its Near Duplication Detection as a separate component integrating the dtSearch Engine. dtSearch offers enterprise and developer text retrieval (including the dtSearch Engine) to instantly search terabytes of online and offline data.
ImageMAKER's flagship Discovery Assistant incorporates the Near Duplication Detection system alongside the embedded dtSearch Engine's document filters and search capabilities.
De-duplication technology typically identifies documents and emails that are exact duplicates using criteria like hash value comparison. By contrast, ImageMAKER’s Near Duplication Detection can identify documents and emails that have contextually similar phrases or content, but are not exact matches.
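The difference between exact and near duplicates can be illustrated with a minimal Python sketch. It is not ImageMAKER's algorithm, which the announcement does not describe. It assumes SHA-256 as the hash and uses word-shingle Jaccard similarity as a stand-in near-duplicate measure. The function names and sample sentences are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Exact-duplicate fingerprint: any single changed character alters the hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word sequences (lowercased) found in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Shared shingles divided by all distinct shingles across both texts."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts that differ in one word.
original = "Please review the attached contract before the meeting on Friday"
revised = "Please review the attached contract before the meeting on Monday"

same_hash = content_hash(original) == content_hash(revised)
similarity = jaccard(original, revised)

print(f"Exact duplicate by hash: {same_hash}")
print(f"Shingle similarity: {similarity:.0%}")
```

Hash comparison treats the two sentences as unrelated, while the shingle measure reports that most of their content overlaps.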
The ability to detect near duplicates greatly streamlines eDiscovery, other data review work such as Freedom of Information (FOI) requests, archiving and document management.
Features of Near Duplication Detection include the ability to sequentially link multiple versions of a document and to output the percentage similarity between related documents.
Using the dtSearch Engine document filters along with ImageMAKER’s own technology, Near Duplication Detection can even detect the similarity of files that are fully nested in other documents. For example, Near Duplication Detection might find 87% of Document A contained in Document B, and 93% of Document B contained in Document C.
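A figure such as "87% of Document A contained in Document B" is directional: it measures how much of one document appears in another, not how alike the two are overall. The Python sketch below shows one common way to compute such a figure, using word-pair containment. This is an assumption for illustration, not the product's documented method, and the function names and sample texts are hypothetical.

```python
def shingles(text: str, k: int = 2) -> set:
    """Set of overlapping k-word sequences (lowercased) found in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(inner: str, outer: str, k: int = 2) -> float:
    """Fraction of inner's shingles that also occur in outer (asymmetric)."""
    s_inner, s_outer = shingles(inner, k), shingles(outer, k)
    if not s_inner:
        return 0.0
    return len(s_inner & s_outer) / len(s_inner)


# Hypothetical sample: doc_a is quoted in full inside the longer doc_b.
doc_a = "the quarterly report is due next week"
doc_b = ("reminder that the quarterly report is due next week "
         "so please send your figures")

a_in_b = containment(doc_a, doc_b)
b_in_a = containment(doc_b, doc_a)

print(f"Document A contained in Document B: {a_in_b:.0%}")
print(f"Document B contained in Document A: {b_in_a:.0%}")
```

Because the measure is asymmetric, a short document nested entirely inside a longer one scores 100% in one direction and much lower in the other.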
Near Duplication Detection also includes a visual comparison tool that outputs HTML-formatted documents and can highlight phrase differences in different colors for easy review. The visual comparison works both locally and in shared web-based environments, allowing end-users to compare documents seamlessly.
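As a rough illustration of HTML-based comparison output in general, the sketch below uses `difflib.HtmlDiff` from Python's standard library to render two versions side by side with changed text marked. It is not the ImageMAKER tool. The sample lines and the `draft_v1` and `draft_v2` labels are hypothetical.

```python
import difflib

# Hypothetical versions of the same short document, one line per list item.
version_1 = [
    "Please review the attached contract.",
    "The meeting is on Friday.",
]
version_2 = [
    "Please review the attached contract.",
    "The meeting is on Monday.",
]

# make_file returns a complete HTML page containing a side-by-side table,
# with added, changed and deleted text wrapped in styled <span> elements.
html = difflib.HtmlDiff().make_file(
    version_1, version_2, fromdesc="draft_v1", todesc="draft_v2"
)

print(f"Generated {len(html)} characters of HTML")
```

The resulting string can be saved to a file and opened in a browser, or served from a web application.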
“ImageMAKER chose the dtSearch Engine for use in Discovery Assistant because of its market-leading search technology, including its ability to leverage metadata to ensure successful and fast search results in faceted searching,” says Ken Davies, ImageMAKER’s president. “Customers tell us that the embedded search technology is a ‘lifesaver.’”
“Near Duplicate Detection also incorporates the dtSearch Engine,” continues Davies. “As a standalone product, Near Duplicate Detection will rely on the dtSearch Engine’s robust and extremely fast search speed to provide outstanding response times in identifying close file matches across terabytes of data.”
dtSearch’s core developer component, the dtSearch Engine, can instantly search terabytes of mixed documents, emails plus nested attachments, databases and online data with over 25 different search options.
The dtSearch Engine has its own proprietary document filters for data parsing, extraction, conversion and display (including display with highlighted hits). The dtSearch Engine SDK offers these capabilities through C++, Java and .NET / .NET Core APIs to Windows, Mac and Linux developers, both for “on premises” applications and for online platforms such as Microsoft Azure and Amazon Web Services (AWS).