Anthropic announced the immediate availability of Claude Opus 4.8, the latest iteration of its flagship artificial‑intelligence model. The company says the upgrade makes the system markedly more honest and reliable, especially in tasks that require autonomous action. Opus 4.8 is four times less likely than its predecessor, Opus 4.7, to let flaws in generated code slip by unnoticed. Early users also note a stronger willingness to flag uncertainty and a drop in unsupported claims.

The new model is priced identically to the previous version—$5 per million input tokens and $25 per million output tokens—and is being deployed across all Anthropic products, including the claude.ai web interface, Claude Code, and the public API. Benchmarks show gains across the board: agentic coding performance rose from 64.3% to 69.2% on Terminal‑Bench 2.1, multidisciplinary reasoning improved from 54.7% to 57.9%, and knowledge‑work scores climbed from 1,753 to 1,890.

Alignment testing puts Opus 4.8 at new highs for prosocial traits such as supporting user autonomy and acting in the user’s best interest. Rates of misaligned behavior—deception, misuse cooperation, and similar issues—dropped substantially, matching the performance of Anthropic’s Claude Mythos Preview, its most aligned model to date.

Several enterprise partners have already put the model through its paces. Cognition, the maker of the Devin AI coding agent, praised Opus 4.8 for cleaner tool usage and the resolution of comment‑verbosity and tool‑calling glitches that plagued Opus 4.7. Cursor, an AI‑powered code editor, recorded improvements at every effort level in its CursorBench evaluation. Legal‑tech firm Harvey said Opus 4.8 delivered the highest score ever on its Legal Agent Benchmark, breaking the 10% all‑pass threshold for the first time. Databricks observed faster handling of deep multistep questions and a 61% reduction in token cost compared with Opus 4.7. Thomson Reuters’ CoCounsel Legal noted better consistency and reasoning, while Hebbia highlighted sharper citation precision and token efficiency in document‑retrieval tasks.

Alongside the model release, Anthropic introduced new features aimed at giving developers finer control. An effort‑control slider in claude.ai and Cowork lets users trade speed for quality by adjusting how much computation the model applies to a response. Claude Code now supports dynamic workflows, enabling it to plan work and launch hundreds of parallel sub‑agents in a single session—useful for large‑scale code migrations. The Messages API was updated to accept system entries inside the message array, allowing mid‑task instruction changes without breaking the prompt cache. Fast mode for Opus 4.8 runs 2.5 times faster than earlier versions and is three times cheaper per token.

Anthropic also teased the upcoming Mythos‑class models, a tier above Opus that promises even higher intelligence. A limited group of roughly 50 partners—including Apple, Google, Microsoft, and Amazon Web Services—has already accessed a preview through Project Glasswing, uncovering more than 10,000 high‑ or critical‑severity software vulnerabilities. Anthropic says broader release will require stronger cybersecurity safeguards, but expects to make the models generally available within weeks.

The Opus 4.8 launch arrives amid a fresh wave of funding. Anthropic closed a $65 billion Series H round, pushing its post‑money valuation to $965 billion, a sharp rise from the $380 billion valuation at its Series G round in February. Revenue, according to the company, grew from about $1 billion at the end of 2024 to an estimated $30 billion annualized run rate in 2026, driven largely by enterprise adoption of Claude.

Geographically, the firm expanded its footprint with a new office in Milan, its sixth European location, and appointed KiYoung Choi as Representative Director of Korea ahead of a planned Seoul office. The moves reflect growing demand for Claude outside the United States.

Anthropic’s upgrade lands in a fiercely competitive frontier‑AI market. OpenAI recently rolled out GPT‑5.5, its first fully retrained base model since GPT‑4.5, while also releasing GPT‑5.4, which set new professional‑benchmark records earlier this year. Google, a major investor in Anthropic with up to $40 billion committed, continues developing its Gemini series. The race now centers on three giants—Anthropic, OpenAI, and Google—each pushing incremental model improvements at a rapid pace.

For Anthropic, the emphasis is shifting from raw capability to reliability. A model that self‑detects mistakes, flags uncertainty, and follows instructions consistently is a more practical tool for agentic workflows where humans cannot supervise every action. Whether the forthcoming Mythos models can preserve that reliability while delivering higher intelligence will be a key test of Anthropic’s strategy to stay ahead in the enterprise AI arena.

This article was written with the assistance of AI.
News Factory SEO helps you automate news content for your site.