Rosette 7 delivers next-generation text analysis
Basis Technology Corporation has unveiled the latest generation of the company’s linguistics platform, Rosette 7, offering expanded language coverage, improved entity extraction accuracy and new name matching and translation modules.
Rosette 7 allows global enterprises to deploy document management systems and XML databases capable of smart retrieval and navigation in multiple languages.
Legal teams are able to quickly locate relevant documents buried in multilingual repositories for e-discovery, while financial institutions are increasing accuracy and reducing false positives for anti-money laundering and counter-terrorism financing regulatory compliance.
Businesses of all sizes are exploiting unstructured data to discover trends and anticipate future problems.
Rosette Entity Extractor rapidly locates named entities in large volumes of unstructured text by employing three complementary detection algorithms: rule-based, list-based, and statistical.
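As a rough illustration of how rule-based and list-based detection can complement each other, the Python sketch below merges hits from a small pattern set and a small name list. It is not Rosette's implementation or API: the names GAZETTEER, RULES and extract_entities, the sample patterns, and the overlap policy are all assumptions made for this example, and the statistical detector is left out entirely.

```python
import re

# Hypothetical name list for the list-based pass (not shipped with any product).
GAZETTEER = {
    "Basis Technology": "ORGANIZATION",
    "Carl Hoffman": "PERSON",
}

# Hypothetical patterns for the rule-based pass.
RULES = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),
]


def rule_based(text):
    """Return (start, end, label, surface) for every pattern match."""
    return [
        (m.start(), m.end(), label, m.group())
        for label, pattern in RULES
        for m in pattern.finditer(text)
    ]


def list_based(text):
    """Return (start, end, label, surface) for every gazetteer entry found."""
    hits = []
    for name, label in GAZETTEER.items():
        for m in re.finditer(re.escape(name), text):
            hits.append((m.start(), m.end(), label, name))
    return hits


def extract_entities(text):
    """Merge both passes, keeping the earliest and longest non-overlapping hits."""
    candidates = sorted(
        rule_based(text) + list_based(text),
        key=lambda h: (h[0], -(h[1] - h[0])),
    )
    merged, last_end = [], 0
    for start, end, label, surface in candidates:
        if start >= last_end:  # skip hits that overlap an accepted one
            merged.append((surface, label))
            last_end = end
    return merged


sample = ("Carl Hoffman of Basis Technology wrote to "
          "info@example.com on 2010-05-01 about Rosette.")
print(extract_entities(sample))
```

In a fuller system a statistical model would add candidates to the same merge step, which is where it can pick up entities that neither the patterns nor the list anticipate.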
Rosette 7’s improved extractor delivers gains in both speed and accuracy and shortens the time needed to train its statistical algorithms on new languages or entity types. Search-based applications use entity extraction to generate metadata automatically, which they use to filter search results, enable faceted navigation, deliver alerts, and feed downstream processes.
Agile businesses deploying the popular Apache Lucene/Solr open source search toolkits can now benefit from the same advanced linguistic processing used by high‑end web and enterprise search engines. Rosette easily integrates with Lucene to index and search text in English, French, Italian, German, and Spanish as well as such complex languages as Arabic, Chinese, Farsi, Japanese, Korean, and Russian.
Rosette Name Indexer matches names of people, places, or organizations, regardless of the language in which they are written, against entries in multilingual databases, while processing many types of intentional and unintentional name variants: script (Arabic vs. Hanzi vs. Latin); phonetic; orthographic; missing or disordered name components; formal and informal titles; initials; nicknames and aliases.
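To make a few of these variant types concrete, here is a minimal Python sketch of a name scorer that tolerates titles, nicknames, initials, reordered components, and small spelling differences. It is an illustration only, not the Rosette Name Indexer's algorithm: the NICKNAMES and TITLES tables, the 0.8 credit for a matching initial, and the name_score function are assumptions chosen for this example, and cross-script matching is not attempted.

```python
from difflib import SequenceMatcher

# Hypothetical lookup tables; a real system would need far larger ones.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}
TITLES = {"mr", "mrs", "ms", "dr", "prof", "sir"}


def normalize(name):
    """Lowercase, drop punctuation and titles, expand known nicknames."""
    tokens = [t.strip(".,").lower() for t in name.replace(",", " ").split()]
    tokens = [t for t in tokens if t and t not in TITLES]
    return [NICKNAMES.get(t, t) for t in tokens]


def token_score(a, b):
    """Similarity of two name components in the range 0.0 to 1.0."""
    if a == b:
        return 1.0
    if len(a) == 1 or len(b) == 1:
        # An initial earns partial credit when the first letters agree.
        return 0.8 if a[0] == b[0] else 0.0
    return SequenceMatcher(None, a, b).ratio()  # orthographic closeness


def name_score(query, candidate):
    """Order-independent score; missing components lower the result."""
    q, c = normalize(query), normalize(candidate)
    if not q or not c:
        return 0.0
    short, long_ = (q, c) if len(q) <= len(c) else (c, q)
    remaining = list(long_)
    total = 0.0
    for tok in short:
        # Greedily pair each component with its best unused counterpart.
        best = max(remaining, key=lambda r: token_score(tok, r))
        total += token_score(tok, best)
        remaining.remove(best)
    return total / len(long_)


print(name_score("Dr. Bill Clinton", "Clinton, William"))
print(name_score("W. Clinton", "William Clinton"))
print(name_score("Clinton", "William Clinton"))
```

Dividing by the length of the longer name is one simple way to penalize missing components; a production matcher would likely weight components by how distinctive they are.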
Rosette Name Translator analyses the fundamental linguistic structure of foreign personal names in Arabic, Chinese, Dari, Farsi, Korean, Pushto, or Urdu to produce highly accurate translations into English in compliance with applicable institutional or government standards.
Carl Hoffman, CEO of Basis Technology, said, “We’ve developed a reputation for expertise in computational linguistics, commitment to delivering effective high-quality technology, and dedication to serving our customers’ needs with unparalleled support. Rosette 7 represents the most advanced and innovative linguistics analysis platform available, and it allows our customers to analyze their unstructured data – crucial for today’s global businesses.”