Mistral's new OCR model beats competitors in 72 percent of blind test cases, company says

· Source: The Decoder · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, quick

Summary

Mistral AI has released OCR 4, a new optical character recognition model designed to extract text from various document types, including PDFs, Word files, and PowerPoint presentations. Unlike previous versions, OCR 4 not only extracts raw text but also identifies the spatial location and semantic role of each element, classifying each as a title, table, equation, or signature. This block classification helps segment documents automatically for search systems or AI agent processing. The model also reports confidence scores at the word and page level, indicating how certain it is of its output. OCR 4 supports 170 languages, including less common ones. According to Mistral, in a blind test involving over 600 documents, independent reviewers preferred OCR 4's results over those of competing models 72 percent of the time. The model is accessible via API, Mistral Studio, and Microsoft Foundry, and costs $4 per 1,000 pages, or $2 in batch mode.

Key takeaway

For AI engineers or product managers integrating OCR, Mistral's OCR 4 goes beyond raw text extraction. Consider its semantic block classification and confidence scores for document processing and AI agent workflows. The 72 percent blind-test preference, as reported by Mistral, and the 170-language support suggest it may suit diverse, complex document types. Weigh the cost of $4 per 1,000 pages ($2 in batch mode) against your project's specific needs.
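To make the cost comparison concrete, here is a minimal Python sketch that estimates spend from the two quoted prices ($4 per 1,000 pages, $2 per 1,000 pages in batch mode). The function name `estimate_cost` is hypothetical, and the sketch assumes pricing scales linearly per page with no minimums, tiers, or rounding rules, none of which the summary confirms.

```python
from decimal import Decimal

# Prices quoted in the summary, in USD per 1,000 pages.
STANDARD_PER_1K = Decimal("4")
BATCH_PER_1K = Decimal("2")


def estimate_cost(pages: int, batch: bool = False) -> Decimal:
    """Estimate OCR cost in USD.

    Assumption: cost is strictly proportional to page count.
    Actual billing rules may differ.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = BATCH_PER_1K if batch else STANDARD_PER_1K
    # Decimal avoids binary floating-point noise in currency amounts.
    return (rate * pages / 1000).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for pages in (500, 25_000, 1_000_000):
        standard = estimate_cost(pages)
        batch = estimate_cost(pages, batch=True)
        print(f"{pages:>9,} pages: ${standard} standard, ${batch} batch")
```

Under these assumptions, batch mode halves the bill at any volume, so the choice mostly depends on whether the workload can tolerate batch latency.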

Key insights

Mistral's OCR 4 extracts text and semantically classifies document elements. According to the company, reviewers preferred its output over competitors' in 72 percent of blind test cases.

Method

OCR 4 identifies text elements, their page location, and semantic role (title, table, equation, signature), then outputs confidence scores for words and pages.
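The following Python sketch illustrates how a downstream pipeline might use this kind of output: grouping blocks by semantic role and diverting low-confidence blocks to manual review. Everything here is an assumption for illustration. The `Word` and `Block` classes, their field names, the 0 to 1 confidence scale, and the `segment_blocks` helper are hypothetical and do not reflect Mistral's actual response schema or API.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # assumed scale: 0.0 (unsure) to 1.0 (certain)


@dataclass
class Block:
    role: str  # e.g. "title", "table", "equation", "signature"
    page: int
    bbox: tuple[float, float, float, float]  # hypothetical (x0, y0, x1, y1)
    words: list[Word]

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)

    def lowest_confidence(self) -> float:
        # A block is only as trustworthy as its least certain word;
        # an empty block is treated as fully uncertain.
        return min((w.confidence for w in self.words), default=0.0)


def segment_blocks(
    blocks: list[Block], min_confidence: float = 0.85
) -> tuple[dict[str, list[Block]], list[Block]]:
    """Group confident blocks by role; collect the rest for review."""
    by_role: dict[str, list[Block]] = defaultdict(list)
    needs_review: list[Block] = []
    for block in blocks:
        if block.lowest_confidence() >= min_confidence:
            by_role[block.role].append(block)
        else:
            needs_review.append(block)
    return dict(by_role), needs_review


if __name__ == "__main__":
    sample = [
        Block("title", 1, (0, 0, 100, 20),
              [Word("Annual", 0.99), Word("Report", 0.97)]),
        Block("signature", 3, (10, 700, 200, 760),
              [Word("J.", 0.62), Word("Doe", 0.91)]),
    ]
    accepted, review = segment_blocks(sample)
    print({role: [b.text for b in bs] for role, bs in accepted.items()})
    print([b.text for b in review])
```

Using the minimum word confidence as the block score is a deliberately conservative design choice; an average or a page-level threshold would be equally plausible depending on how costly a missed error is.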

Best for: AI Architect, NLP Engineer, CTO, AI Engineer, Machine Learning Engineer, AI Product Manager

Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.