Anomalo adds Compliance Tool for Generative AI

Anomalo has announced a new capability to identify common and business-specific quality and compliance issues in unstructured data targeted for Generative AI workflows.

Anomalo’s platform uses AI to automatically detect issues in both unstructured and structured data, letting teams resolve any hiccups with their data before making decisions, running operations or powering AI and machine learning workflows. 

Elliot Shmukler, co-founder and CEO of Anomalo, said: “Generative AI is the next frontier, but there is no playbook for data quality when it comes to determining the quality of unstructured data feeding Generative AI workflows and LLMs.

“Enterprises need to understand what they have inside their unstructured data collections and which parts of those collections are suitable for Generative AI use. At Anomalo, we’re building this playbook and are working with the world’s largest and most innovative companies to solve this challenge together.”

A recent McKinsey Global Survey found that 65 percent of companies across sizes, geographies and industries now use Generative AI regularly, twice as many as last year.

But no off-the-shelf Generative AI model will "just work" for enterprises. Whether they are building a RAG workflow or powering a customer support chatbot, they need enterprise-specific data to get correct outputs from the LLM.

That means they need to find a way to bring their data to the Generative AI models and, of course, to make sure that data is high-quality and compliant.

The challenge is that most of this data is unstructured, such as documents, call transcripts and order forms. Unlike structured data, unstructured data has no established framework for determining its quality. These documents are often cluttered with duplicates, errors, private information and even abusive language.

Organizations that want to leverage their unstructured data need to be able to identify and resolve its quality issues before the data is incorporated into Generative AI workflows, where it can degrade model performance or the customer experience.

This challenge led Anomalo to expand its data quality platform from structured data to unstructured data in June. With its unstructured data quality monitoring capability, unstructured text documents can be evaluated against out-of-the-box issues including document length, duplicates, topics, tone, language, abusive language, PII and sentiment.

Users are then able to quickly evaluate the quality of a document collection and identify issues in individual documents, dramatically reducing the time needed to profile, curate and leverage high-value unstructured text data.

With its latest announcement, Anomalo is expanding on these capabilities with two major advancements:

- Enterprises can now define custom issues that describe any criteria they want to look for within a document collection, and assign severity weightings to both their custom issues and Anomalo’s out-of-the-box issues.

- Enterprises can now use models that are approved to run within their own cloud environment and hosted by AWS Bedrock, Google Vertex and Microsoft Azure AI, through Anomalo’s cloud-hosted model-as-a-service support. Paired with Anomalo’s existing ability to integrate with cloud providers and run entirely within a Virtual Private Cloud (VPC), this keeps data within enterprise data teams’ control and minimizes the risk that data is ever used to train or fine-tune models.
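
As a rough illustration of how severity weightings on issues might roll up into a per-document score, here is a minimal Python sketch. It is not Anomalo's API: the issue names, the weights, the detector functions and the `score_document` helper are all hypothetical, and the detectors are deliberately naive stand-ins for real checks.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical, deliberately naive detectors. Each returns True when the
# issue is present in the document text.
def too_short(text: str, min_words: int = 5) -> bool:
    return len(text.split()) < min_words

def contains_email(text: str) -> bool:
    # Crude stand-in for PII detection: matches simple email-like strings only.
    return re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", text) is not None

def mentions_competitor(text: str) -> bool:
    # Example of a business-specific custom issue (the name "Acme" is made up).
    return "acme" in text.lower()

# Issue name -> detector. The first two mimic out-of-the-box checks,
# the last one mimics a custom, business-specific check.
ISSUES: Dict[str, Callable[[str], bool]] = {
    "too_short": too_short,
    "contains_email": contains_email,
    "mentions_competitor": mentions_competitor,
}

# Assumed severity weightings: a higher weight means a more severe issue.
WEIGHTS: Dict[str, float] = {
    "too_short": 1.0,
    "contains_email": 5.0,
    "mentions_competitor": 2.0,
}

def score_document(text: str) -> Tuple[float, List[str]]:
    """Return the summed severity and the names of the issues that fired."""
    flagged = [name for name, check in ISSUES.items() if check(text)]
    return sum(WEIGHTS[name] for name in flagged), flagged
```

Under these assumptions, a document with a higher score would be a stronger candidate for exclusion or cleanup before it reaches a Generative AI workflow. How weightings are actually combined in the product is not described in the announcement.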

Anomalo has also announced US$10 million in new funding, which will be used to accelerate investment in R&D for unstructured data monitoring and to deliver the future of data quality for Generative AI applications.

https://www.anomalo.com/

 
