the AI bench
VERIFIED MAY 2026

FAST TAKE · 2026-05-28 · ANTHROPIC CLAUDE OPUS 4.8

Claude Opus 4.8 — price held flat, fast mode got 3× cheaper, and it catches its own code bugs now

Anthropic shipped Opus 4.8 on May 28 at unchanged standard pricing ($5 in / $25 out per 1M tokens), with no tokenizer surprise this time: it keeps the 4.7 tokenizer. Two things actually changed: fast mode dropped to $10/$50 (3× cheaper than on 4.6/4.7), and Anthropic claims the model is ~4× less likely to let flaws in code it wrote slip past unremarked. Still hosted-only; nothing changes for local picks today.

Verdict: Frontier reasoning still cloud-only; standard price held flat, fast mode 3× cheaper


The take

The facts, verified against Anthropic primary sources on launch day: Opus 4.8 is GA as of May 28, API id `claude-opus-4-8`. Standard pricing is $5 input / $25 output per 1M tokens, identical to Opus 4.7 and 4.6 on platform.claude.com. Batch stays at $2.50/$12.50, and the full 1M-token context window is billed at standard pricing. Critically, it keeps the 4.7 tokenizer, so there is no fresh token-count inflation: the ~35% effective-cost jump that quietly landed with 4.7 has already happened, and 4.8 does not repeat it.
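To make the standard and batch rates concrete, here is a minimal sketch that prices one workload at both. The rates are the ones quoted above; the `Rate` class, the `cost_usd` helper, and the workload size are illustrative assumptions, not part of any Anthropic SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """USD per 1M tokens, split by direction."""
    input_per_m: float
    output_per_m: float


# Opus 4.8 rates as quoted in this take.
STANDARD = Rate(input_per_m=5.00, output_per_m=25.00)
BATCH = Rate(input_per_m=2.50, output_per_m=12.50)


def cost_usd(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Price a workload: tokens times the per-1M rate, scaled back to USD."""
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


# Hypothetical workload: 2M input tokens, 400k output tokens.
standard_cost = cost_usd(STANDARD, 2_000_000, 400_000)  # 10 + 10 = 20.0
batch_cost = cost_usd(BATCH, 2_000_000, 400_000)        # 5 + 5 = 10.0

print(f"standard: ${standard_cost:.2f}, batch: ${batch_cost:.2f}")
```

Because 4.8 keeps the 4.7 tokenizer, the same text should produce the same token counts on both, so a figure like this carries over from 4.7 unchanged.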

The pricing news is fast mode. On Opus 4.6 and 4.7, fast mode (a research preview that charges a premium for significantly faster output) cost $30 input / $150 output per 1M tokens, 6× the standard rate. On 4.8 it is $10 / $50, 2× the standard rate, which Anthropic describes as "three times cheaper than it was for previous models." If you run latency-sensitive agent loops on fast mode, that is a material drop; if you are on standard pricing, your per-token bill is unchanged from 4.7.
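The three multipliers in that paragraph follow directly from the quoted prices. A quick sketch of the arithmetic (the variable names and the `ratio` helper are illustrative only):

```python
# (input, output) prices in USD per 1M tokens, as quoted above.
STANDARD_RATE = (5.0, 25.0)
FAST_OLD = (30.0, 150.0)  # fast mode on Opus 4.6 / 4.7
FAST_NEW = (10.0, 50.0)   # fast mode on Opus 4.8


def ratio(numerator, denominator):
    """Element-wise price ratio for (input, output) pairs."""
    return tuple(n / d for n, d in zip(numerator, denominator))


old_premium = ratio(FAST_OLD, STANDARD_RATE)  # (6.0, 6.0): 6x standard
new_premium = ratio(FAST_NEW, STANDARD_RATE)  # (2.0, 2.0): 2x standard
price_drop = ratio(FAST_OLD, FAST_NEW)        # (3.0, 3.0): the "three times cheaper"

print(old_premium, new_premium, price_drop)
```

The ratios are the same for input and output, so the 3× drop holds regardless of how a workload splits between the two.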

The capability claim is about self-review. Anthropic says Opus 4.8 is "around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked" — i.e., it is better at catching its own bugs mid-task, which is exactly the failure mode that compounds in long autonomous coding loops. The announcement also claims lower misaligned-behaviour rates and richer, more information-dense outputs. We have not independently benchmarked any of this — treat it as a vendor claim pending community validation, the same posture we took on the 4.7 SWE claims.

Local-side, nothing changes today. Opus 4.8 is hosted-only with no open-weight equivalent, and the frontier-reasoning gap to the best local open-weight models (Qwen 3.6-35B-A3B, GLM-5.1, Kimi K2.6, DeepSeek V4) is exactly where it was a week ago. The cost calculator on this site now carries the Opus 4.8 entry at the same effective rate (perMillion 15) that 4.7 used, because standard pricing did not move. The honest framing: unlike 4.7, which was a stealth cost increase, 4.8 is a quality-and-reliability upgrade at a flat price, plus a genuine fast-mode discount if you use it.
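The take does not say how the calculator derives its `perMillion 15` figure. One reading that reproduces it is an even blend of the input and output rates; the sketch below assumes that reading, and both the `blended_per_million` function and its `output_share` parameter are hypothetical, not the calculator's actual code.

```python
def blended_per_million(input_rate: float, output_rate: float,
                        output_share: float = 0.5) -> float:
    """Blend input and output prices into one USD-per-1M-tokens figure.

    output_share is the assumed fraction of tokens that are output;
    0.5 is an assumption chosen because it reproduces the quoted 15.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    return input_rate * (1.0 - output_share) + output_rate * output_share


# Opus 4.8 standard pricing from this take: $5 in / $25 out per 1M tokens.
opus_48_blended = blended_per_million(5.0, 25.0)  # 15.0 under the 50/50 assumption

print(opus_48_blended)
```

Under this reading the blended figure depends only on list prices and the assumed mix, which is consistent with the entry staying at 15 while standard pricing held flat.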

Where this fits

Models: GLM-5.1 · Kimi K2.6 · Qwen 3.6-35B-A3B · DeepSeek V4-Flash

Hardware: NVIDIA DGX Spark · Dual RTX 5090 · Mac Studio M3 Ultra 96 GB
