New: Claude Sonnet 5 - released June 30, 2026

Claude Sonnet 5: the mid-tier model built for agentic coding

The canonical guide to claude-sonnet-5 - the new default on Free and Pro plans. It offers 1M context, 128K output, and performance Anthropic describes as close to Opus 4.8, at roughly 40% lower price on standard rates ($3/$15 versus $5/$25 per MTok).


Claude Sonnet 5 at a glance

claude-sonnet-5
Model ID

Pinned dateless snapshot, released June 30, 2026. No -v1 suffix.

1M / 128K
Context & max output

1M-token context (default and max) with 128K max output (up to 300K via a batches beta header).

$2 / $10
Intro pricing

$2 input / $10 output per MTok through Aug 31, 2026; standard $3/$15 from Sep 1, 2026.

≈ Opus 4.8
Positioning

Mid-tier model that Anthropic says is close to Opus 4.8, slotted between Haiku 4.5 and Opus 4.8.

At a glance: full specification

The verified facts for claude-sonnet-5 in one table, straight from Anthropic's model documentation.

| Specification | Claude Sonnet 5 |
|---|---|
| Model ID | claude-sonnet-5 (Bedrock: anthropic.claude-sonnet-5; OpenRouter: anthropic/claude-sonnet-5-20260630) |
| Release date | June 30, 2026 |
| Context window | 1,000,000 tokens (both default and maximum - no smaller variant) |
| Max output | 128K tokens (up to 300K with the output-300k-2026-03-24 Message Batches beta header) |
| Intro pricing (through Aug 31, 2026) | $2 / MTok input, $10 / MTok output; cache read $0.20 |
| Standard pricing (from Sep 1, 2026) | $3 / MTok input, $15 / MTok output; cache read $0.30 |
| Thinking & effort | Adaptive thinking ON by default; effort low / medium / high / xhigh / max (default high). Manual thinking or non-default temperature/top_p/top_k return HTTP 400. |
| Knowledge cutoff | January 2026 |
| Availability | Default on Free & Pro; Max/Team/Enterprise; Claude Code; Claude API; Amazon Bedrock; Google Cloud Vertex AI; Microsoft Foundry; GitHub Copilot; OpenRouter |

Where Sonnet 5 shines - and where Opus 4.8 still leads

Anthropic frames Sonnet 5 as the best combination of speed and intelligence. It approaches Opus 4.8 without replacing it - here is an honest split.

Where Sonnet 5 shines

  • High-volume agentic coding

    Fast tool loops, refactors and multi-file edits where you want strong quality at a lower per-token cost than Opus.

  • Speed-sensitive interactive work

    Chat, pair-programming and IDE assist where low latency matters as much as raw depth of reasoning.

  • Long-context tasks on a budget

    The full 1M-token window is available by default, so large repos and documents fit without paying Opus rates.

  • Default everyday driver

    As the new default on Free and Pro, it handles the majority of general and coding requests well.

Where Opus 4.8 still leads

  • The hardest coding tasks

    Deep, ambiguous, long-horizon engineering problems still favour Opus 4.8's extra headroom.

  • High-stakes judgment

    Nuanced reasoning, tricky trade-offs and careful review benefit from Opus 4.8's top-tier depth.

  • Cybersecurity & adversarial work

    Opus 4.8 remains ahead on the most demanding security and red-team style reasoning.

  • Absolute peak quality

    When you need the best available answer regardless of price, Opus 4.8 is still the flagship.

Anthropic has not published exact benchmark numbers for Sonnet 5; it says only that performance is qualitatively close to Opus 4.8. Sonnet 5 approaches Opus 4.8; it does not match or beat it.

Pricing summary

Always plan for both the introductory and standard rates - intro pricing ends on Aug 31, 2026.

Through Aug 31, 2026
$2 / $10

Introductory rate

$2 per million input tokens and $10 per million output tokens. Cache read $0.20. This is the launch promotion, not the long-term price.

From Sep 1, 2026
$3 / $15

Standard rate

$3 per million input tokens and $15 per million output tokens. Cache read $0.30. Budget any post-August workload at this rate.

Cache reads
$0.20 / $0.30

Prompt caching

Cache read $0.20 intro / $0.30 standard. A 5-minute cache write is 1.25x base input; a 1-hour cache write is 2x base input.
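To make the two rate cards concrete, here is a minimal cost-estimator sketch. The rates, dates and cache-write multipliers come from the cards above; the function names, the argument layout and the choice to count cached tokens separately from regular input tokens are assumptions made for illustration, not part of any SDK.

```python
from datetime import date

# Rates in USD per million tokens (MTok), copied from the pricing cards above.
INTRO = {"input": 2.00, "output": 10.00, "cache_read": 0.20}
STANDARD = {"input": 3.00, "output": 15.00, "cache_read": 0.30}
INTRO_ENDS = date(2026, 8, 31)  # last day of introductory pricing

# Cache-write cost as a multiple of the base input rate.
CACHE_WRITE_MULTIPLIER = {"5m": 1.25, "1h": 2.0}


def rates_on(day: date) -> dict:
    """Return the rate card in force on a given calendar day."""
    return INTRO if day <= INTRO_ENDS else STANDARD


def request_cost(day, input_tokens=0, output_tokens=0,
                 cache_read_tokens=0, cache_write_tokens=0, cache_ttl="5m"):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cached tokens are counted separately from input_tokens.
    """
    rates = rates_on(day)
    write_rate = rates["input"] * CACHE_WRITE_MULTIPLIER[cache_ttl]
    total = (input_tokens * rates["input"]
             + output_tokens * rates["output"]
             + cache_read_tokens * rates["cache_read"]
             + cache_write_tokens * write_rate)
    return total / 1_000_000
```

Calling request_cost with the same token counts for a date in August and a date in September shows how much a workload's bill moves when the intro period ends.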

⚠️
Tokenizer caveat - read before you compare to Sonnet 4.6

Sonnet 5 uses a new tokenizer: the same text consumes about 30% more tokens than on Sonnet 4.6. On identical text, the $2/$10 intro rate therefore works out to roughly $2.60/$13 per million Sonnet 4.6-sized tokens, about 13% below Sonnet 4.6's $3/$15 - not a 33% discount. At the standard $3/$15 rate, the same text costs about 30% more than on Sonnet 4.6. Compare on real request cost, not sticker price alone.
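The arithmetic behind this caveat can be sketched in a few lines. The 30% figure is the approximate inflation quoted above and is treated here as exact, which is an assumption; real inflation varies with the text. The helper names are hypothetical.

```python
TOKEN_INFLATION = 1.30  # "about 30% more tokens" for the same text, taken as exact here

SONNET_46_INPUT = 3.00          # USD per MTok
SONNET_5_INTRO_INPUT = 2.00     # USD per MTok, through Aug 31, 2026
SONNET_5_STANDARD_INPUT = 3.00  # USD per MTok, from Sep 1, 2026


def effective_rate(rate_per_mtok: float, inflation: float = TOKEN_INFLATION) -> float:
    """Price per million Sonnet 4.6-sized tokens of the same text."""
    return rate_per_mtok * inflation


def change_vs_sonnet_46(new_rate: float, old_rate: float = SONNET_46_INPUT) -> float:
    """Fractional cost change on identical text (negative means cheaper)."""
    return effective_rate(new_rate) / old_rate - 1.0


intro_change = change_vs_sonnet_46(SONNET_5_INTRO_INPUT)        # about -0.13
standard_change = change_vs_sonnet_46(SONNET_5_STANDARD_INPUT)  # about +0.30
```

Output rates scale the same way because the input-to-output price ratio is identical across the three rate cards. Measuring inflation on a sample of your own prompts would give a better estimate than the flat 1.30 used here.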

Which Claude model for which workload

A simple routing framework across the current Claude line-up. Match the model to the job rather than defaulting to the biggest one.

Haiku 4.5

Fastest & cheapest

Reach for Haiku 4.5 on high-volume, latency-critical, low-complexity tasks: classification, extraction, routing and simple edits.

⭐ Sonnet 5

Best speed + intelligence

Sonnet 5 is the default workhorse for most agentic coding and general work - strong quality, fast, close to Opus at ~40% lower price.

Opus 4.8

Peak reasoning

Escalate to Opus 4.8 ($5/$25) for the hardest coding, high-stakes judgment and cybersecurity where you need the flagship.

Fable 5

Specialist flagship

Fable 5 ($10/$50, 1M context, 128K output) targets its own specialist workloads - use it when its particular strengths fit.
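The routing framework above can be expressed as a small lookup. This is a sketch only: the workload labels are hypothetical names invented for the example, the function returns tier names rather than API model IDs (this guide gives an ID only for Sonnet 5), and Fable 5 is left out because the guide does not say which workloads it targets.

```python
# Hypothetical workload labels; illustrative, not an official taxonomy.
LOW_COMPLEXITY = {"classification", "extraction", "routing", "simple_edit"}
FLAGSHIP = {"hardest_coding", "high_stakes_judgment", "cybersecurity"}


def pick_model(workload: str) -> str:
    """Return the Claude tier this guide suggests for a workload label."""
    if workload in LOW_COMPLEXITY:
        return "Haiku 4.5"   # fastest and cheapest
    if workload in FLAGSHIP:
        return "Opus 4.8"    # escalate only where the flagship is needed
    # Everything else, including agentic coding and general work,
    # falls through to the default workhorse.
    return "Sonnet 5"
```

Defaulting to Sonnet 5 and naming the exceptions keeps the router in line with the advice to match the model to the job rather than reaching for the biggest one.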

How to use Sonnet 5 on QCode

QCode gives you claude-sonnet-5 through a single API with no separate Anthropic account to manage.

Point to claude-sonnet-5

Set model to claude-sonnet-5 in any Anthropic-compatible request. The dateless snapshot means no date suffix to track.

Adaptive thinking by default

Thinking is on automatically at effort=high. Tune with effort low / medium / high / xhigh / max - don't send a manual thinking block.

Use it in Claude Code

Select Sonnet 5 as your Claude Code model for fast agentic loops, then escalate to Opus 4.8 only for the hardest steps.

Skip unsupported params

Omit temperature, top_p and top_k - non-default values return HTTP 400, the same rule as Opus 4.7+.

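python - Anthropic-compatible call
# Minimal Sonnet 5 request
import anthropic

# Reads ANTHROPIC_API_KEY from the environment; pass api_key/base_url for other endpoints.
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-5",
    max_tokens=8000,
    messages=[{"role": "user", "content": "Refactor this module"}],
    # adaptive thinking is ON by default (effort="high")
    # do NOT pass thinking={"type":"enabled"} or temperature -> HTTP 400
)
# Thinking blocks may precede the answer, so print only the text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)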
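A pre-flight check can catch the rejected parameters before a request leaves your code. The sketch below is deliberately conservative: it flags any thinking, temperature, top_p or top_k key, even one set to a default value, because this guide does not list the defaults. The helper names are hypothetical, and since the guide does not show where effort sits in the request body, the second helper only validates the value.

```python
EFFORT_LEVELS = ("low", "medium", "high", "xhigh", "max")
DEFAULT_EFFORT = "high"
REJECTED_KEYS = ("thinking", "temperature", "top_p", "top_k")


def rejected_params(params: dict) -> list:
    """Return the request keys that risk HTTP 400 on Sonnet 5 per this guide."""
    return [key for key in REJECTED_KEYS if key in params]


def check_effort(level: str = DEFAULT_EFFORT) -> str:
    """Validate an effort level against the five documented values."""
    if level not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {level!r}")
    return level
```

For example, rejected_params({"model": "claude-sonnet-5", "temperature": 0.2}) reports temperature, which you would then remove before sending.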

Migrating from Sonnet 4.6

Sonnet 5 is the recommended successor and new default - but there is no forced migration and no deadline.

Sonnet 4.6 is NOT retired

claude-sonnet-4-6 status is Active, with a tentative retirement no sooner than February 17, 2027. You can keep using it; Sonnet 5 is simply the recommended successor whenever you choose to move.

1. Swap the model id

Change claude-sonnet-4-6 to claude-sonnet-5, then re-check your token budgets (max_tokens, context usage and cost estimates) - the new tokenizer produces ~30% more tokens for the same text.

2. Drop legacy params

Remove any manual thinking block and non-default temperature/top_p/top_k. Rely on adaptive thinking and the effort parameter instead.
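Both migration steps can be sketched as one function over a request dictionary. This is an illustration under stated assumptions: "128K" is taken to mean 128,000 tokens, the ~30% inflation is applied as a flat 1.3x to max_tokens, and every legacy key is dropped even if it held a default value. The function name is hypothetical.

```python
OLD_MODEL = "claude-sonnet-4-6"
NEW_MODEL = "claude-sonnet-5"
MAX_OUTPUT_TOKENS = 128_000  # assumes "128K" means 128,000 tokens
LEGACY_KEYS = ("thinking", "temperature", "top_p", "top_k")


def migrate_request(params: dict) -> dict:
    """Return a copy of a Sonnet 4.6 request adjusted for Sonnet 5."""
    if params.get("model") != OLD_MODEL:
        return dict(params)  # not a Sonnet 4.6 request; leave it alone

    # Step 2: drop the manual thinking block and sampling parameters.
    migrated = {k: v for k, v in params.items() if k not in LEGACY_KEYS}

    # Step 1: swap the model id.
    migrated["model"] = NEW_MODEL

    # Re-check the output budget: scale by 1.3 (rounded up), capped at the limit.
    if "max_tokens" in migrated:
        scaled = (migrated["max_tokens"] * 130 + 99) // 100
        migrated["max_tokens"] = min(scaled, MAX_OUTPUT_TOKENS)
    return migrated
```

The function returns a new dictionary instead of mutating its argument, so the original Sonnet 4.6 request stays available while you compare both models side by side.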

Frequently asked questions

Is Claude Sonnet 5 available on the free plan?

Yes. Claude Sonnet 5 is the default model on the Free and Pro plans, and it is also available to Max, Team and Enterprise plans, in Claude Code, the Claude API, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry, GitHub Copilot and OpenRouter.

How much does Claude Sonnet 5 cost per token?

Introductory pricing is $2 per million input tokens and $10 per million output tokens through August 31, 2026. From September 1, 2026 the standard rate is $3 input and $15 output per million tokens. Cache reads are $0.20 (intro) / $0.30 (standard); a 5-minute cache write is 1.25x base input and a 1-hour cache write is 2x base input.

When does the Claude Sonnet 5 introductory price end?

The introductory $2/$10 pricing runs through August 31, 2026. On September 1, 2026 pricing moves to the standard $3 input / $15 output per million tokens. Plan any budget past August around the standard rate.

What is the Claude Sonnet 5 context window?

Claude Sonnet 5 has a 1,000,000 (1M) token context window. That figure is both the default and the maximum; there is no smaller-context variant to opt into.

What is the maximum output for Claude Sonnet 5?

Maximum output is 128K tokens, and up to 300K tokens when you send the Message Batches beta header output-300k-2026-03-24.
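As a rough sketch, the two output ceilings can be handled with a helper that clamps the requested budget and adds the beta flag only for batch work. Several things here are assumptions: "128K" and "300K" are read as 128,000 and 300,000 tokens, the flag is assumed to travel in the anthropic-beta request header like other Anthropic beta flags, and the function name is hypothetical.

```python
STANDARD_MAX_OUTPUT = 128_000    # assumes "128K" means 128,000 tokens
BATCH_BETA_MAX_OUTPUT = 300_000  # assumes "300K" means 300,000 tokens
BATCH_BETA_FLAG = "output-300k-2026-03-24"


def output_settings(requested_tokens: int, use_batches: bool = False):
    """Return (max_tokens, extra_headers) for a requested output budget."""
    if requested_tokens <= STANDARD_MAX_OUTPUT:
        return requested_tokens, {}
    if use_batches:
        capped = min(requested_tokens, BATCH_BETA_MAX_OUTPUT)
        # Assumption: the beta flag is sent via the anthropic-beta header.
        return capped, {"anthropic-beta": BATCH_BETA_FLAG}
    # Outside Message Batches the standard ceiling applies.
    return STANDARD_MAX_OUTPUT, {}
```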

Is Claude Sonnet 5 better than Opus 4.8?

No. Anthropic positions Sonnet 5 as a mid-tier model whose performance is close to Opus 4.8 at roughly 40% lower price, but Opus 4.8 still leads on the hardest coding, judgment and cybersecurity tasks. Sonnet 5 approaches Opus 4.8; it does not match or beat it.

What is the Claude Sonnet 5 model id?

The model id is claude-sonnet-5, a pinned dateless snapshot with no -v1 suffix. On Amazon Bedrock it is anthropic.claude-sonnet-5, and on OpenRouter the slug is anthropic/claude-sonnet-5-20260630.
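If your code targets more than one platform, a small lookup keeps the three identifiers above in one place. The platform keys ("anthropic", "bedrock", "openrouter") and the function name are hypothetical labels chosen for this sketch; only the identifier strings come from this guide.

```python
MODEL_IDS = {
    "anthropic": "claude-sonnet-5",
    "bedrock": "anthropic.claude-sonnet-5",
    "openrouter": "anthropic/claude-sonnet-5-20260630",
}


def model_id_for(platform: str) -> str:
    """Return the Sonnet 5 identifier listed in this guide for a platform."""
    try:
        return MODEL_IDS[platform.lower()]
    except KeyError:
        # Other platforms are not covered here; check their own documentation.
        raise ValueError(f"No Sonnet 5 id listed in this guide for {platform!r}") from None
```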

Can I use Claude Sonnet 5 in Claude Code?

Yes. Claude Sonnet 5 is available in Claude Code as well as the Claude API, Amazon Bedrock, Google Cloud Vertex AI and Microsoft Foundry. Adaptive thinking is on by default with effort levels low, medium, high, xhigh and max (default high); manual extended thinking and non-default temperature/top_p/top_k return HTTP 400, so omit them.

Start building with Claude Sonnet 5 on QCode

Get claude-sonnet-5 plus the full Claude line-up - Haiku 4.5, Opus 4.8 and Fable 5 - through one simple API.