Claude Sonnet 5 Review: Anthropic's New Default Model

Claude Sonnet 5 went live June 30, 2026. If you're on a free Claude plan, it's now your default model. That means the bar for what free AI delivers just moved somewhere it hasn't been in a while.

Here's what changed and why it matters.

Six months ago, this cost Opus money

The central claim in Anthropic's announcement is that Sonnet 5 performs at a level that previously required Opus on agentic tasks. The benchmarks back it up.

Previous Sonnet models handled single-turn tasks well: summarizing a document, drafting a reply, answering a factual question. Where they fell short was multi-step sequences: tasks that required tracking state across tool calls, recovering from errors mid-task, and continuing when something went wrong. That work went to Opus.

Sonnet 5 closes most of that gap. On BrowseComp (agentic web research) and OSWorld-Verified (computer use), Sonnet 5 scores closer to Opus 4.8 than any previous Sonnet model. Early partners report it completing pull requests with testing, running insurance claim workflows, and doing legal research review without needing an Opus fallback.

One specific behavior Anthropic highlights: Sonnet 5 checks its own output without being prompted to. For agent work, that's not a small thing. A model that self-audits catches the wrong first answer before downstream steps propagate the error.

The 1M context window, in human terms

The other major spec change is the context window: 1 million tokens by default. At roughly 750 words per thousand tokens, that's about 750,000 words, or about a thousand pages of dense text.

Most users won't hit this limit. But for anyone doing codebase analysis, large document review, or research synthesis, it removes the chunking problem entirely. You can pass a full codebase or document set and ask about the whole thing. The 128K maximum output token limit is also notable for agent workflows generating substantial artifacts in a single turn: code files, reports, structured data.

Pricing, and what August 31 means

Introductory pricing runs through August 31: $2 per million input tokens and $10 per million output. Standard rates kick in after that at $3 per million input and $15 per million output.

Opus 4.8 currently costs $15 per million input and $75 per million output. Sonnet 5 at standard pricing is one-fifth the cost for comparable agentic performance on most workloads. The introductory window is a developer adoption ramp: Anthropic filed its S-1 on June 1 at a $965 billion valuation, and locking in developer usage before standardizing pricing is a well-worn playbook.

The August 31 cliff matters if you're building anything at scale. Price your unit economics at standard rates now, not the introductory ones.

What changes for users

Free plan users now get Sonnet 5 as the default at no cost. Today's free tier is more capable than what Pro users had access to twelve months ago. That's a real shift in where the value is concentrated.

Pro plan users ($20/month) also get Sonnet 5 as the default. Opus 4.8 remains available, but for most tasks — writing, research, coding, analysis — Sonnet 5 is the better practical choice for the majority of work. Reserve Opus for the hardest reasoning problems where you want maximum capability regardless of cost.

API developers should note the model ID is claude-sonnet-5 and plan for the September rate adjustment. If production deployments are still on Sonnet 4.6, migrating during the introductory window is worth doing: the performance improvement is real and the cost increase doesn't start until September.

Safety, specifically for agents

Two aspects of Sonnet 5's safety profile matter for production use.

Resistance to prompt injection is meaningfully better. When Sonnet 5 reads external content — emails, web pages, database records — it's harder to redirect via adversarial text embedded in that content. This is a practical attack vector for any agent processing user-supplied or third-party input, and better resistance here matters for production deployments.

Sonnet 5 was also explicitly not trained on cybersecurity offensive tasks. Anthropic states it shows "substantially poorer performance" on exploit development compared to Opus models. That's a deliberate capability boundary for a model now deployed at free-tier scale, and a reasonable tradeoff.

For a full profile of Claude including Sonnet 5 details, see the Claude entry on chatbot.gallery. And for a broader comparison of where Claude stacks up against other AI assistants, the best AI chatbots guide covers current pricing and capabilities across the major options.

Want a weekly roundup of what matters in AI? Subscribe to the About.chat newsletter.