AI Tool analyses Images to aid Enterprise Workflows

Organisations looking to make the leap into generative AI with Copilot for M365 are now aware that it is not a cheap add-on, and many will need to consider alternatives. A new GenAI release from AI startup Writer promises to accelerate enterprise workflows by analysing images.

Known as Palmyra-Vision, it is a multimodal LLM for visual and language understanding that can analyse images and generate text based on them.

The company says it excels in tasks such as extracting handwritten text, classifying objects, analysing graphs and charts, and answering specific questions based on visual inputs.

Beyond understanding visuals, it can also generate new content based on the images supplied.

Writer benchmarked Palmyra-Vision against VQAv2, a dataset of open-ended questions on over 265,000 images that requires an understanding of vision, language, and common-sense knowledge. Palmyra-Vision achieved a score of 84.4%, outperforming both OpenAI’s GPT-4V and Google’s Gemini 1.0 Ultra, the company states.
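The article does not say how Writer calculated its figure, but VQA-style benchmarks are commonly scored by comparing a model's answer with the answers given by ten human annotators, counting it as fully correct when at least three of them agree with it. The sketch below illustrates that simplified rule only. The function names are hypothetical, and the official evaluation also normalises answers more thoroughly and averages over subsets of annotators.

```python
from collections import Counter


def vqa_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Simplified VQA-style accuracy for one question: min(matches / 3, 1).

    Assumption: answers are compared after trimming and lower-casing only.
    """
    normalised = [answer.strip().lower() for answer in human_answers]
    matches = Counter(normalised)[predicted.strip().lower()]
    return min(matches / 3.0, 1.0)


def benchmark_score(predictions: list[str], references: list[list[str]]) -> float:
    """Mean per-question accuracy across the dataset, as a percentage."""
    scores = [vqa_accuracy(p, refs) for p, refs in zip(predictions, references)]
    return 100.0 * sum(scores) / len(scores)
```

Under this rule an answer matched by only two of ten annotators earns partial credit of two thirds, which is why a headline score such as 84.4% is not simply the share of questions answered exactly right.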

Use cases identified for Palmyra-Vision include insurance companies processing written reports for claims and healthcare companies processing doctors’ notes for medical reports, where it can be used for text extraction even if the handwriting quality is low.

Customer experience teams can use Palmyra-Vision to quickly draft alt text descriptions for images, improving accessibility and SEO performance.

The company believes that enterprise leaders now realise the competitive advantage of implementing generative AI across their businesses, but are also witnessing the risks of using free chatbots such as ChatGPT, including the creation of incorrect content and the leakage of sensitive data.