Testing Framework Evaluates GenAI Model Suitability and Performance
Fisent Technologies has unveiled a new framework designed to help objectively evaluate the performance of various GenAI models for specific business process automation use cases. Fisent’s GenAI Efficacy Framework (GEF) enables enterprises to measure, compare, and select the most effective GenAI models based on key metrics like accuracy, speed, cost, and consistency.
Fisent’s GEF empowers enterprises to optimize their GenAI implementations by identifying the best model and parameters for each of their process automation efforts. This data-driven approach helps teams justify their GenAI choices and demonstrate a clear return on investment.
By evaluating models against real-world requirements, GEF reduces the risk of suboptimal performance and, through continuous assessment and optimization, helps organizations keep pace with the rapidly evolving GenAI landscape.
“The idea for GEF sparked as typical AI model evaluation methods, like those that measure Massive Multitask Language Understanding (MMLU), failed to balance the nuanced requirements of our customers’ real-world automation decisions,” explains Adrian Murray, Founder and CEO of Fisent.
“GEF offers a more pragmatic approach by evaluating the most important factors to any given application decision: accuracy, speed, cost, and consistency. Understanding these metrics allows enterprises to make more informed decisions about which LLM to employ for each of their process automation challenges.”
Fisent’s GEF includes a configurator that customers can use to evaluate the tradeoffs inherent in comparing their LLM options. The GEF configurator scores the available LLMs against a given set of requirements to produce a ranked list of the models expected to perform best in a specific situation, along with visual comparison charts for quick analysis.
For example, using the configurator to increase the requirement for accuracy will adjust the rank-order of LLMs under consideration. What’s more, other variables evaluated by the configurator adjust accordingly.
If there is a need for both speed and high accuracy, the cost variable associated with the best-fit LLMs is likely to increase. GEF is useful to Fisent customers when they first implement Fisent BizAI for a specific process automation use case, and again when they evaluate new models or model upgrades.
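The ranking behavior described above can be illustrated with a simple weighted-score sketch. This is not Fisent's implementation, which the announcement does not detail. It assumes each metric has already been normalized to a 0-to-1 scale where higher is better (so a higher cost score means a cheaper model). The model names, metric values, and the `rank_models` helper are all hypothetical placeholders, not measured results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-model scores, each normalized to 0..1 (higher is better)."""
    name: str
    accuracy: float
    speed: float
    cost: float          # higher means cheaper, by assumption
    consistency: float


def rank_models(models, weights):
    """Return models sorted best-first by a weighted average of their metrics.

    `weights` maps metric names to non-negative importance values.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    def score(model):
        return sum(w * getattr(model, metric) for metric, w in weights.items()) / total

    return sorted(models, key=score, reverse=True)


# Placeholder values invented purely for illustration.
MODELS = [
    ModelProfile("model-a", accuracy=0.95, speed=0.40, cost=0.30, consistency=0.90),
    ModelProfile("model-b", accuracy=0.80, speed=0.90, cost=0.85, consistency=0.75),
    ModelProfile("model-c", accuracy=0.88, speed=0.70, cost=0.60, consistency=0.85),
]

balanced = {"accuracy": 1, "speed": 1, "cost": 1, "consistency": 1}
accuracy_first = {"accuracy": 10, "speed": 1, "cost": 1, "consistency": 1}

print([m.name for m in rank_models(MODELS, balanced)])
print([m.name for m in rank_models(MODELS, accuracy_first)])
```

With these placeholder numbers, raising the weight on accuracy moves the most accurate but slowest and most expensive model from last place to first, which mirrors the tradeoff the configurator is described as exposing.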
Key benefits of Fisent’s GEF include:
- Comprehensive evaluation: Assess GenAI models across multiple variables, such as accuracy, speed, cost, and consistency.
- Data-driven insights: Provide actionable recommendations based on objective metrics and statistical analysis.
- Continuous optimization: Enable enterprises to monitor and improve model performance over time.
- Ease of use: Streamline the process of evaluating and selecting GenAI models, even for non-technical users.
“Fisent is committed to driving innovation in the Applied GenAI Process Automation space and empowering enterprises to harness the full potential of this transformative technology,” said Murray.