Anthropic announced Opus 4.6 Fast Mode as "2.5x faster output token generation." What they didn't put in the headline: it costs $30/$150 per million tokens. Standard Opus 4.6 costs $5/$25.
That's 6x more expensive. For 2.5x more speed.
What Anthropic Announced
On February 5, 2026, Anthropic released Opus 4.6 alongside a "Fast Mode" research preview. The announcement framed it as a speed upgrade for developers who need faster iteration cycles in Claude Code.
The pitch: same model, same intelligence, just faster. Toggle /fast in Claude Code, get 2.5x faster responses.
Sounds great. Until you look at the invoice.
The Pricing Reality
Here's what Fast Mode actually costs vs standard Opus 4.6:
| | Standard Opus 4.6 | Fast Mode | Multiplier |
|---|---|---|---|
| Input | $5 / MTok | $30 / MTok | 6x |
| Output | $25 / MTok | $150 / MTok | 6x |
| Speed | ~71 tok/s | ~178 tok/s | 2.5x |
You pay 6x more per token. You get 2.5x faster delivery. The cost-per-second of output is still 2.4x higher than standard mode.
Put differently: for every dollar you spend in standard mode, Fast Mode burns $6 for the same amount of work—just delivered quicker.
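The arithmetic behind that 2.4x figure is worth making explicit. A minimal sketch, using the prices and speeds from the table above:

```python
# Cost comparison: standard Opus 4.6 vs Fast Mode (figures from the table above)
STD_OUT_PRICE = 25.0    # $ per million output tokens, standard
FAST_OUT_PRICE = 150.0  # $ per million output tokens, Fast Mode
STD_SPEED = 71.0        # output tokens per second, standard (approx.)
FAST_SPEED = 178.0      # output tokens per second, Fast Mode (approx.)

price_multiplier = FAST_OUT_PRICE / STD_OUT_PRICE          # 6.0x
speed_multiplier = FAST_SPEED / STD_SPEED                  # ~2.5x
cost_per_unit_speed = price_multiplier / speed_multiplier  # ~2.4x

print(f"{price_multiplier:.1f}x price for {speed_multiplier:.1f}x speed "
      f"= {cost_per_unit_speed:.1f}x the cost per unit of output speed")
```

The point: a 6x price hike divided by a 2.5x speed gain still leaves you paying ~2.4x more per second of generated output.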
But Wait, There's More
Fast Mode isn't the only cost multiplier hiding in Opus 4.6. Stack these up:
1. Adaptive Thinking = More Tokens
Opus 4.6's new adaptive thinking mode (enabled by default at "high" effort) generates significantly more output tokens than Opus 4.5. Artificial Analysis measured 58M output tokens to run their Intelligence Index on Opus 4.6 with max effort: roughly 2x what Opus 4.5 needed (29M tokens).
Same price per token. Twice as many tokens. Your bill doubles before you even touch Fast Mode.
2. Long Context Premium
Using the new 1M token context window? Every request over 200K input tokens gets premium pricing: $10/$37.50 per MTok. That's 2x the standard input price and 1.5x the output price.
3. US-Only Inference
Need data residency in the US? Add a 1.1x multiplier on everything.
4. The Compounding Effect
Stack Fast Mode + long context + US inference, and a single API call could cost you:
- Input: $30 × 2 (long context) × 1.1 (US) = $66 per MTok
- Output: $150 × 1.5 (long context) × 1.1 (US) = $247.50 per MTok
That's 13.2x more on input and 9.9x more on output vs standard Opus 4.6 with a short context window.
The Real-World Math
Let's say you're a dev running Claude Code for 8 hours. Conservative scenario:
Standard Opus 4.6:
- ~500K input tokens, ~200K output tokens per session
- Cost: ($5 × 0.5) + ($25 × 0.2) = $7.50
Fast Mode:
- Same work, same tokens
- Cost: ($30 × 0.5) + ($150 × 0.2) = $45.00
That's $45 vs $7.50 for the same output. Per day.
Over a month (22 working days): $990 vs $165. Fast Mode costs $825 more per month for one developer.
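The session math above reduces to a one-line cost function. A sketch, using the article's assumed token counts (~500K input, ~200K output per 8-hour session):

```python
def session_cost(input_mtok, output_mtok, in_price, out_price):
    """API cost in dollars for one session; prices are $/MTok."""
    return input_mtok * in_price + output_mtok * out_price

# Same workload, two pricing tiers
standard = session_cost(0.5, 0.2, in_price=5, out_price=25)    # $7.50
fast = session_cost(0.5, 0.2, in_price=30, out_price=150)      # $45.00

workdays = 22
print(f"per day:   ${standard:.2f} vs ${fast:.2f}")
print(f"per month: ${standard * workdays:.2f} vs ${fast * workdays:.2f}")
print(f"monthly premium: ${(fast - standard) * workdays:.2f}")  # $825.00
```

Swap in your own token counts; the 6x spread holds regardless of volume because both prices scale linearly.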
Why People on X Are Annoyed
The frustration isn't about Fast Mode existing. Faster inference is genuinely useful for interactive debugging and rapid iteration. The frustration is about how it was presented.
Anthropic's announcement led with "2.5x faster"—the benefit. The 6x price increase was buried in documentation pages. Cat Wu's tweet about it mentioned "2.5x-faster version" without pricing context. The Claude Code docs mention "$30/150 MTok" but only after scrolling past feature descriptions.
Compare this to how Anthropic communicated the Opus 4.5 pricing—they led with "67% cheaper" because that was the good news. With Fast Mode, the good news is speed. The bad news is the price. Guess which one got the spotlight.
The Competitive Context
For context, here's where Opus 4.6 Fast Mode sits in the market:
| Model | Output Price/MTok | Speed (tok/s) | $/1K tokens per second |
|---|---|---|---|
| Opus 4.6 Standard | $25 | ~71 | $0.35 |
| Opus 4.6 Fast | $150 | ~178 | $0.84 |
| GPT-5.2 | $15 | ~90 | $0.17 |
| GPT-5.3-Codex | $15 | ~112 | $0.13 |
| Sonnet 4.5 | $15 | ~90 | $0.17 |
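The last column is a throughput-normalized price: dollars per 1K output tokens divided by speed in K tokens/sec, i.e. what you pay per unit of delivery speed. A sketch that reproduces the table's values (model names and figures taken from the table above):

```python
# Throughput-normalized output cost: ($ per 1K tokens) / (K tokens per second)
models = {
    "Opus 4.6 Standard": (25, 71),    # ($/MTok output, tok/s)
    "Opus 4.6 Fast":     (150, 178),
    "GPT-5.2":           (15, 90),
    "GPT-5.3-Codex":     (15, 112),
    "Sonnet 4.5":        (15, 90),
}

results = {}
for name, (price_per_mtok, tok_per_s) in models.items():
    price_per_ktok = price_per_mtok / 1000        # $ per 1K output tokens
    metric = price_per_ktok / (tok_per_s / 1000)  # normalize by speed in kTok/s
    results[name] = round(metric, 2)
    print(f"{name:18s} ${results[name]:.2f}")
```

By this metric Fast Mode is the most expensive way to buy speed on the list, at more than twice the normalized cost of standard Opus 4.6.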
Even standard Opus 4.6 is expensive compared to competitors. Fast Mode puts it in a category of its own—and not in a good way, cost-wise.
GPT-5.3-Codex delivers 58% more speed than standard Opus at 60% of the output cost. It scored 77.3% on Terminal-Bench 2.0 vs Opus 4.6's 65.4%.
When Fast Mode Actually Makes Sense
To be fair, there are scenarios where paying 6x makes rational economic sense:
Live debugging sessions where a 10-second response vs 25-second response directly translates to developer productivity. If your developer costs $150/hour, waiting 15 extra seconds per query across 100 queries is ~25 minutes of dead time. That's $62.50 in developer salary vs maybe $37 in extra API cost.
Demo environments where you need snappy responses for client presentations.
Time-critical deployments where shipping 2 hours earlier matters more than API costs.
For async workloads, batch processing, or anything that isn't interactive? Standard mode. Every time. Or use Anthropic's Batch API for a 50% discount.
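The live-debugging tradeoff above is a breakeven calculation: Fast Mode pays for itself whenever the developer time it recovers is worth more than the extra API spend. A sketch using the example's assumed figures (100 queries, 15 seconds saved each, $150/hour developer):

```python
def dead_time_value(queries, seconds_saved_per_query, hourly_rate):
    """Dollar value of developer time recovered by faster responses."""
    return queries * seconds_saved_per_query / 3600 * hourly_rate

saved = dead_time_value(queries=100, seconds_saved_per_query=15, hourly_rate=150)
print(f"developer time recovered: ${saved:.2f}")  # $62.50

# Rough extra API spend for those queries, from the example above
extra_api_cost = 37.0
print("Fast Mode wins" if saved > extra_api_cost else "stick with standard")
```

The calculation flips quickly, though: halve the query volume or the hourly rate and standard mode wins again, which is why this only holds for genuinely interactive work.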
The Verdict
Fast Mode is a real product solving a real problem: Opus is smart but slow. The 2.5x speed improvement is legitimate and measurable.
But charging 6x more for 2.5x faster is a terrible value proposition for most developers. And burying the pricing while leading with the speed improvement is the kind of marketing that erodes trust.
The compounding cost story is what really matters: adaptive thinking burns 2x more tokens, long context doubles input prices, and now Fast Mode multiplies everything by 6. An Opus 4.6 power user could easily see 10-12x the API bill of someone running Opus 4.5 three months ago—for the same quality of work.
Anthropic makes great models. But the pricing transparency could use some of that "adaptive thinking" they're so proud of.