Paris‑based Mistral released OCR 4 on June 23, positioning the new model as a purpose‑built solution for the tedious paperwork that fuels enterprise back‑office operations. Unlike conventional optical character recognition tools that return a flat stream of text, OCR 4 delivers a full map of each page: bounding boxes that pinpoint the exact location of every line, block classifications that label titles, tables, equations or signatures, and per‑word confidence scores that let reviewers focus on uncertain sections.
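As a rough illustration of how per-word confidence scores could steer human review, the Python sketch below flags words that fall under a cutoff. The `Word` record, its field names, the bounding-box layout and the 0.80 threshold are all assumptions made for illustration; they are not OCR 4's documented response schema.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical record: field names are illustrative, not Mistral's schema.
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    bbox: tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)


def words_needing_review(words: list[Word], threshold: float = 0.80) -> list[Word]:
    """Return words whose confidence is below the threshold, in input order."""
    return [w for w in words if w.confidence < threshold]


# Made-up sample data standing in for one line of an invoice.
sample = [
    Word("Invoice", 0.99, (10, 10, 80, 24)),
    Word("N0.", 0.41, (85, 10, 110, 24)),
    Word("4471", 0.93, (115, 10, 150, 24)),
]
flagged = words_needing_review(sample)
print([w.text for w in flagged])
```

A reviewer-facing tool could then use each flagged word's bounding box to highlight the uncertain region on the page image.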
The model’s output arrives as clean markdown, making it easy for developers to feed the data into retrieval‑augmented generation (RAG) pipelines or other AI agents that need to understand not just what a document says but how it is organized. Early adopters are already using OCR 4 to digitize archives, extract fields from invoices and pull structured text from scientific reports.
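One way markdown output could be prepared for a RAG pipeline is to split it into chunks at each heading, so that every chunk carries its section title. The sketch below is a deliberately simple version of that idea: the function name and the sample text are hypothetical, and it would misread `#` lines inside fenced code blocks.

```python
def split_markdown_by_heading(markdown: str) -> list[dict[str, str]]:
    """Split markdown into chunks, starting a new chunk at each '#' heading line."""
    chunks: list[dict[str, str]] = []
    heading = ""
    body: list[str] = []
    for line in markdown.splitlines():
        if line.startswith("#"):
            # Close the previous chunk before starting a new section.
            if heading or body:
                chunks.append({"heading": heading, "text": "\n".join(body).strip()})
            heading = line.lstrip("#").strip()
            body = []
        else:
            body.append(line)
    # Flush the final section.
    if heading or body:
        chunks.append({"heading": heading, "text": "\n".join(body).strip()})
    return chunks


# Made-up markdown standing in for OCR output.
sample_md = (
    "# Invoice 4471\n"
    "Total due: 1,200 EUR\n"
    "\n"
    "## Line items\n"
    "| Item | Qty |\n"
    "|---|---|\n"
    "| Widget | 3 |"
)
chunks = split_markdown_by_heading(sample_md)
for chunk in chunks:
    print(chunk["heading"], "->", len(chunk["text"]), "chars")
```

Each chunk can then be embedded and indexed separately, with the heading kept as metadata for retrieval.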
Built for the back office
Back‑office teams demand precision and speed. Mistral reports that Anaqua, a provider of intellectual‑property filing software, sees the new model processing pages roughly four times faster than its previous tool, a gain that can make the difference in high‑volume docketing environments. The system handles PDFs, Microsoft Word, PowerPoint and OpenDocument formats, and reads 170 languages across ten language groups, including low‑resource tongues where rivals struggle.
Pricing reflects the focus on volume work. The API costs $4 per 1,000 pages, dropping to $2 in batch mode, while a higher‑level Document AI offering that reshapes output into custom fields runs $5 per 1,000 pages. A financial‑research firm, Rogo, claimed comparable accuracy to its former provider at about one‑eighth the cost.
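To put the quoted rates in perspective, the short Python sketch below estimates the bill for a given page count. The per-1,000-page prices come from the figures above; the tier labels and the assumption of strictly linear pricing, with no minimums or volume discounts, are illustrative.

```python
# Prices per 1,000 pages as quoted in the article; tier keys are made-up labels.
PRICE_PER_1000_PAGES = {
    "api": 4.00,
    "batch": 2.00,
    "document_ai": 5.00,
}


def estimate_cost(pages: int, tier: str) -> float:
    """Estimate cost in U.S. dollars, assuming strictly linear per-page pricing."""
    return pages / 1000 * PRICE_PER_1000_PAGES[tier]


# Example: a hypothetical 250,000-page archive.
for tier in PRICE_PER_1000_PAGES:
    print(f"{tier}: ${estimate_cost(250_000, tier):,.2f}")
```

Under these assumptions, batch mode halves the cost of the standard API for the same archive.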
Self‑hosting is a core selling point. OCR 4 fits into a single container, allowing banks, hospitals, governments and other data‑sensitive organizations to keep documents on‑premises and avoid the residency concerns tied to U.S. cloud providers. Mistral markets the model as Europe’s sovereign alternative to American AI, even as it partners with Microsoft, which highlighted the launch as a milestone in its collaboration with the French AI champion.
Distribution runs through Mistral’s own studio, Amazon SageMaker, Microsoft’s Foundry and, soon, Snowflake. The model also plugs into Mistral’s new Search Toolkit, an open‑source framework that lets developers feed structured OCR output directly into citation‑ready pipelines.
According to Mistral’s benchmarks, OCR 4 leads public OlmOCRBench scores at 85.20 and wins 72% of human‑judged comparisons against competing systems. Mistral cautions that the figures are “directional,” noting that the benchmark may misjudge math equations and multi‑column layouts. The company also stresses that OCR 4 is a reading tool, not a decision‑making system; it is unsuitable for medical diagnosis, legal judgment or high‑stakes finance without human oversight.
The document‑AI market remains crowded, with Google, AWS and a wave of startups offering competing products. Mistral’s differentiators (structured output, low price and self‑hostability) aim to win over enterprises that need both performance and data sovereignty. Whether the approach can capture a sizable share of back‑office workloads from larger cloud incumbents will become clear over the coming months, but for now the French firm has placed the boring but essential task of document processing at the heart of its AI strategy.
This article was written with the assistance of AI.