Claude Sonnet 5 Pricing: API Rates, Plans, and How to Save

Claude Sonnet 5 launched on June 30, 2026 with introductory API pricing of $2 per million input tokens and $10 per million output tokens through August 31, 2026, after which it moves to standard rates of $3 and $15. Anthropic positions Claude Sonnet 5 as its most agentic Sonnet model yet, and the pricing is the headline: near-Opus performance at a fraction of the cost, according to the company’s official announcement.

That combination matters because Sonnet 5 became the default model for the Free and Pro plans on launch day, while API developers get a limited window to test it cheaply before rates rise in September.

Claude Sonnet 5 API pricing and value

Claude Sonnet 5 API Pricing at a Glance

The clearest way to read Sonnet 5’s cost is the two-phase structure. During the introductory window you pay less; from September 1, 2026 the standard price applies. Cached input reads are dramatically cheaper than fresh input, which is where high-volume agent workloads save the most.

Token typeIntro (through Aug 31, 2026)Standard (from Sep 1, 2026)
Input$2 / 1M$3 / 1M
Output$10 / 1M$15 / 1M
Cache read (hit)$0.20 / 1M$0.30 / 1M

The Claude Platform pricing docs confirm these figures and list the cache-write multipliers as well. Anthropic’s own blog is the authority on the post-introductory price: it settles at $3 input and $15 output, matching what Sonnet-class models have historically cost.

Where the model runs

Developers call the model with the id claude-sonnet-5 through the Claude API, and it is also available in Claude Code and on the broader Claude Platform. The same model powers the consumer apps, so a Free or Pro chat user is talking to the identical engine that API customers pay per token to access.

The tokenizer caveat

There is one subtlety that affects real spend. Sonnet 5 uses an updated tokenizer that produces roughly 30% more tokens for the same text compared with older models. Anthropic set the introductory pricing so the migration from Sonnet 4.6 works out to be roughly cost-neutral in practice, but it is worth measuring your own workloads rather than assuming a like-for-like drop.

How Sonnet 5 Compares to Opus 4.8 and Sonnet 4.6

Pricing only makes sense next to performance. Sonnet 5 sits between the previous Sonnet 4.6 and the flagship Opus 4.8 on both cost and capability, and for most everyday agentic work it lands in the sweet spot.

ModelInput / Output (per 1M)Agentic coding score
Claude Sonnet 5 (intro)$2 / $1063.2%
Claude Sonnet 5 (standard)$3 / $1563.2%
Claude Opus 4.8$5 / $2569.2%
Claude Sonnet 4.6$3 / $1558.1%

Price versus performance

On one agentic coding benchmark, Sonnet 5 scores 63.2%, compared with Opus 4.8’s 69.2% and the previous Sonnet 4.6’s 58.1%. It even slightly surpasses Opus 4.8 on some knowledge-work benchmarks, which is unusual for a midsize model. The point is not that Sonnet 5 beats the flagship everywhere, but that it closes most of the gap for far less money. Anthropic frames the tradeoff plainly in its launch post:

Opus 4.8 is still the model of choice for higher accuracy on these tasks, but Sonnet 5 provides developers with lower-priced options that are of much higher quality than what was previously available.

Anthropic

Because both models expose adjustable effort levels, teams can dial Sonnet 5 up for hard problems or down for cheap, routine automation, then reserve Opus 4.8 for the tasks that genuinely need the flagship’s accuracy.

Self-correction that saves reruns

A quieter cost lever is reliability. Testers report that Sonnet 5 finishes complex, multi-step jobs where earlier Sonnet models would stall, and that it checks its own output without being asked. Fewer failed runs and fewer retries mean fewer wasted tokens, which is a real-world discount that never shows up on the price sheet.

Safety improvements point the same direction. Both new Sonnet models scored 0.0% on a Firefox 147 exploit benchmark that Anthropic developed with Mozilla, with the underlying vulnerabilities patched in Firefox 148. Sonnet 5 is also better at refusing malicious requests and resisting prompt-injection hijack attempts than its predecessor, and it hallucinates and behaves sycophantically at lower rates than Sonnet 4.6. For agent builders, a model that stays on task and declines unsafe instructions cleanly means fewer costly cleanups down the line.

Claude Sonnet 5 API pricing and value

Subscription Plans: Is Claude Sonnet 5 Free?

Yes, in the consumer apps. From launch, Sonnet 5 is the default model on the Free and Pro plans, and it is available to Max, Team, and Enterprise subscribers too. If you only chat through the web or desktop apps, you are not paying per token at all.

The per-token API pricing above applies when you build on the platform: calling the model programmatically, running it in Claude Code, or embedding it in your own product. Those are separate billing tracks from a monthly subscription, so heavy API usage is metered independently of whatever plan you hold.

Anthropic also raised rate limits across Chat, Cowork, Claude Code, and the Claude Platform alongside the launch, specifically to accommodate the higher token usage that comes with running the model at higher effort levels. That headroom matters for teams that plan to lean on Sonnet 5 for sustained autonomous work rather than one-off prompts.

How to Reduce Your Claude Sonnet 5 Costs

For API users, the sticker price is only the starting point. Anthropic ships several mechanisms that cut real spend substantially, and stacking them is where large agent workloads become affordable.

  1. Turn on prompt caching. A cache hit costs just 10% of the standard input price, so repeated system prompts, long documents, or conversation history stop being re-billed at full rate on every call.
  2. Batch non-urgent work. The Batch API applies a 50% discount on both input and output tokens for asynchronous jobs.
  3. Right-size the effort level. Run medium effort for routine tasks and reserve extra-high effort for genuinely hard problems.
  4. Migrate before September 1. Lock in real workload testing during the $2/$10 introductory window rather than after standard pricing kicks in.
  5. Measure token counts on the new tokenizer. Because the tokenizer emits more tokens per unit of text, base your budget on measured usage, not old Sonnet 4.6 numbers.

Used together, caching and batching can drop the effective cost of a repetitive agent pipeline far below the headline rate, especially when most of the context is stable across calls. A pipeline that re-sends the same long system prompt and reference documents on every step is the classic candidate: with caching, that shared context is billed at a tenth of the input rate after the first write, and with batching the whole job runs at half price if it does not need to be real time.

Claude Sonnet 5 API pricing and value

Sonnet 5 vs the Competition

Sonnet 5 is pitched as a direct, lower-cost alternative to OpenAI’s GPT-5.6 Sol and Google’s Gemini 3.5 Flash for everyday agentic tasks. On price it undercuts Opus 4.8, OpenAI’s GPT-5.5, and Google’s Gemini 3.1 Pro, while remaining more expensive than the very cheap Gemini 3.5 Flash. As TechCrunch summarized the launch:

The differentiator in this generation is no longer who can do agentic work, but how cheaply and reliably it can be done without human oversight. Sonnet 5’s $2/$10 introductory rate is Anthropic’s answer to that question, and it is aimed squarely at developers deciding where to run high-volume agents.

For a fuller picture of the current lineup, Anthropic maintains up-to-date figures for every model on its pricing page, including Opus 4.8 at $5/$25 and Haiku 4.5 at $1/$5.

FAQ

keyboard_arrow_up