
Claude Opus 4.8 Pricing: Same $5/$25 API Rate, Cheaper Fast Mode

Claude Opus 4.8 keeps $5/$25 API pricing while cutting Opus fast mode to $10/$50 per million tokens.

By AI Pricing Guru Editorial Team

AI Pricing Guru articles are maintained by the editorial workflow behind the site: daily pricing snapshots, provider source checks, and review passes for model launches, subscription limits, and billing changes.

Anthropic released Claude Opus 4.8 today, and the pricing story is unusually clean: the regular API rate did not change.

Claude Opus 4.8 costs $5.00 per million input tokens and $25.00 per million output tokens on the Claude API. That is the same standard price as Opus 4.7 and Opus 4.6, so teams already paying for the Opus tier can treat the launch as a quality upgrade first and a procurement event second.

The real billing change is fast mode. Opus 4.8 fast mode is listed at $10.00 per million input tokens and $50.00 per million output tokens. Fast mode for Opus 4.6 and 4.7 was priced at $30.00 input and $150.00 output per million tokens, so Anthropic cut the fast-mode price by two thirds for the new model, taking it from 6x the standard rate to 2x.

For current live rates across every Claude model, see our Anthropic Claude API pricing page or plug your own token mix into the token cost calculator.

Claude Opus 4.8 API pricing

| Item | Claude Opus 4.8 price |
| --- | --- |
| Input tokens | $5.00 / 1M |
| Cache hits and refreshes | $0.50 / 1M |
| 5-minute cache writes | $6.25 / 1M |
| 1-hour cache writes | $10.00 / 1M |
| Output tokens | $25.00 / 1M |
| Batch input | $2.50 / 1M |
| Batch output | $12.50 / 1M |
| Fast mode input | $10.00 / 1M |
| Fast mode output | $50.00 / 1M |
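
Every row in the table above is a simple multiple of the two base rates. The sketch below rebuilds the rows from $5.00 / $25.00. The multipliers are inferred from the listed prices and the names (`MULTIPLIERS`, `derived_rates`) are ours, so treat this as a consistency check rather than an official Anthropic formula.

```python
# Base list prices from the table above, in USD per 1M tokens.
BASE_INPUT = 5.00
BASE_OUTPUT = 25.00

# Multipliers inferred from the table rows (an assumption for illustration,
# not an official Anthropic formula).
MULTIPLIERS = {
    "cache_read": 0.10,      # cache hits and refreshes
    "cache_write_5m": 1.25,  # 5-minute cache writes
    "cache_write_1h": 2.00,  # 1-hour cache writes
    "batch": 0.50,           # Batch API discount
    "fast": 2.00,            # fast mode
}


def derived_rates(base_input: float, base_output: float) -> dict[str, float]:
    """Rebuild the price rows from the two base rates and the multipliers."""
    m = MULTIPLIERS
    return {
        "input": base_input,
        "cache_read": round(base_input * m["cache_read"], 2),
        "cache_write_5m": round(base_input * m["cache_write_5m"], 2),
        "cache_write_1h": round(base_input * m["cache_write_1h"], 2),
        "output": base_output,
        "batch_input": round(base_input * m["batch"], 2),
        "batch_output": round(base_output * m["batch"], 2),
        "fast_input": round(base_input * m["fast"], 2),
        "fast_output": round(base_output * m["fast"], 2),
    }


RATES = derived_rates(BASE_INPUT, BASE_OUTPUT)
for name, price in RATES.items():
    print(f"{name:<16} ${price:.2f} / 1M")
```

If a future price change breaks one of these ratios, only the matching multiplier needs updating.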

The model ID is claude-opus-4-8.

Anthropic’s docs list a 1M-token context window for Opus 4.8 on the Claude API, plus a 128K max output limit for synchronous Messages API calls. The Batch API can support larger output with the extended-output beta header.

What changed from Opus 4.7

The standard token price is unchanged:

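| Model | Input (per 1M) | Cached input (per 1M) | Output (per 1M) | Status |
| --- | --- | --- | --- | --- |
| Claude Opus 4.8 | $5.00 | $0.50 | $25.00 | Current flagship |
| Claude Opus 4.7 | $5.00 | $0.50 | $25.00 | Legacy |
| Claude Opus 4.6 | $5.00 | $0.50 | $25.00 | Legacy |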

Anthropic says Opus 4.8 improves on Opus 4.7 across coding, agentic tasks, practical knowledge work, and reliability. The launch post also highlights better judgement in agentic work and lower rates of unsupported claims.

From a buyer’s perspective, the most important part is that Anthropic is not asking for a higher base token rate. If your product already routes premium work to Opus 4.7, the migration question is quality and compatibility, not a new list price.

Fast mode is the economic headline

Fast mode is Anthropic’s research-preview path for faster Opus responses. With Opus 4.6 and 4.7, fast mode was expensive enough that many teams would reserve it for demos, human-in-the-loop tools, or workflows where latency mattered more than margin.

Opus 4.8 changes that math:

| Fast mode | Input | Output |
| --- | --- | --- |
| Opus 4.6 / 4.7 | $30.00 / 1M | $150.00 / 1M |
| Opus 4.8 | $10.00 / 1M | $50.00 / 1M |

That is still 2x the standard Opus 4.8 price. It is no longer 6x.
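
The multiples quoted here can be checked with a few lines of arithmetic. This is a minimal sketch using only the prices listed in this article; the function names are ours.

```python
import math

# USD per 1M tokens, as listed in this article: (input, output).
STANDARD = (5.00, 25.00)            # standard Opus 4.6 / 4.7 / 4.8
FAST_OLD = (30.00, 150.00)          # fast mode, Opus 4.6 / 4.7
FAST_NEW = (10.00, 50.00)           # fast mode, Opus 4.8


def multiple_of_standard(fast: float, standard: float) -> float:
    """How many times the standard rate the fast-mode rate is."""
    return fast / standard


def price_cut(old: float, new: float) -> float:
    """Fractional reduction from the old price to the new price."""
    return 1 - new / old


for label, idx in (("input", 0), ("output", 1)):
    old_x = multiple_of_standard(FAST_OLD[idx], STANDARD[idx])
    new_x = multiple_of_standard(FAST_NEW[idx], STANDARD[idx])
    cut = price_cut(FAST_OLD[idx], FAST_NEW[idx])
    print(f"{label}: {old_x:.0f}x -> {new_x:.0f}x of standard, price cut {cut:.1%}")
```

Input and output move together: both drop from 6x to 2x of the standard rate, which is a two-thirds cut in the fast-mode price itself.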

For interactive coding agents, research assistants, support copilot workflows, and browser agents, this matters. Latency often affects completion rate and user trust. A 2x token premium is easier to justify when faster responses reduce abandoned runs or human wait time.

Budget examples

Here is the rough API cost for common request shapes:

| Workload | Standard Opus 4.8 | Fast mode Opus 4.8 |
| --- | --- | --- |
| 100K input + 10K output | $0.75 | $1.50 |
| 1M input + 100K output | $7.50 | $15.00 |
| 10M input + 1M output | $75.00 | $150.00 |

Prompt caching can change the input side dramatically. A repeated 1M-token context read from cache costs $0.50 instead of $5.00 before output tokens. Batch processing cuts standard input and output by 50%, which makes Opus 4.8 batch pricing $2.50 input and $12.50 output per million tokens.
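
To run the same math on your own request shapes, a small helper is enough. The sketch below uses the rates listed in this article; `request_cost` and its parameters are hypothetical names, and cached input is only priced for standard mode because the article does not list cache rates for batch or fast mode.

```python
# USD per 1M tokens, as listed in this article: (input rate, output rate).
RATES = {
    "standard": (5.00, 25.00),
    "fast": (10.00, 50.00),
    "batch": (2.50, 12.50),
}
CACHE_READ_RATE = 0.50  # standard-mode cache hits, USD per 1M tokens


def request_cost(
    input_tokens: int,
    output_tokens: int,
    mode: str = "standard",
    cached_input_tokens: int = 0,
) -> float:
    """Estimate the USD cost of one request.

    cached_input_tokens is the part of input_tokens served from the prompt
    cache. Cache writes are ignored to keep the sketch small.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    if cached_input_tokens and mode != "standard":
        # Assumption: we only know the standard-mode cache read rate.
        raise ValueError("cache pricing is only modeled for standard mode")
    in_rate, out_rate = RATES[mode]
    fresh_input = input_tokens - cached_input_tokens
    total = (
        fresh_input * in_rate
        + cached_input_tokens * CACHE_READ_RATE
        + output_tokens * out_rate
    )
    return total / 1_000_000


for inp, out in ((100_000, 10_000), (1_000_000, 100_000), (10_000_000, 1_000_000)):
    std = request_cost(inp, out)
    fast = request_cost(inp, out, mode="fast")
    print(f"{inp:>10,} in + {out:>9,} out: standard ${std:.2f}, fast ${fast:.2f}")
```

For example, `request_cost(1_000_000, 100_000, cached_input_tokens=1_000_000)` prices the fully cached 1M-token context at $0.50 of input plus $2.50 of output.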

How it compares to GPT-5.5 and Sonnet 4.6

Opus 4.8 now sits in a clear premium tier:

| Model | Input (per 1M) | Cached input (per 1M) | Output (per 1M) | Best fit |
| --- | --- | --- | --- | --- |
| Claude Opus 4.8 | $5.00 | $0.50 | $25.00 | Complex coding, agentic work, high-value reasoning |
| GPT-5.5 | $5.00 | $0.50 | $30.00 | Premium OpenAI reasoning and chat workflows |
| Claude Sonnet 4.6 | $3.00 | $0.30 | $15.00 | Most Claude production workloads |
| GPT-5.4 | $2.50 | $0.25 | $15.00 | Lower-cost flagship-class OpenAI work |

Compared with GPT-5.5, Opus 4.8 matches input price and is cheaper on output. Compared with Sonnet 4.6, Opus 4.8 is 67% more expensive on both input and output.
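
The percentages follow directly from the list prices. A minimal sketch, using only the figures in the comparison table; `pct_difference` is a helper name of our own:

```python
# USD per 1M tokens from the comparison table: (input, output).
PRICES = {
    "Claude Opus 4.8": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.4": (2.50, 15.00),
}


def pct_difference(price: float, baseline: float) -> float:
    """Percent by which price exceeds baseline (negative means cheaper)."""
    return (price / baseline - 1) * 100


opus_in, opus_out = PRICES["Claude Opus 4.8"]
for other in ("GPT-5.5", "Claude Sonnet 4.6", "GPT-5.4"):
    other_in, other_out = PRICES[other]
    print(
        f"Opus 4.8 vs {other}: "
        f"input {pct_difference(opus_in, other_in):+.0f}%, "
        f"output {pct_difference(opus_out, other_out):+.0f}%"
    )
```

Against Sonnet 4.6 both ratios are 5/3, which is where the 67% figure comes from; against GPT-5.5 the input rate is identical and the output rate is about 17% lower.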

That keeps Sonnet 4.6 as the default Claude value pick. Opus 4.8 should be reserved for tasks where the model’s extra capability changes the outcome: long-horizon coding, agent planning, difficult reviews, legal or financial document reasoning, and high-autonomy workflows where a mistake is expensive.

What to do now

  1. Add claude-opus-4-8 to your eval harness and compare it against Opus 4.7 on real tasks.
  2. Re-check token volume. Anthropic notes that Opus 4.7 and later use a newer tokenizer that may use more tokens for the same fixed text than earlier models.
  3. Test fast mode separately. It is cheaper than before, but still 2x standard Opus 4.8 pricing.
  4. Keep Sonnet 4.6 in your router. Many production tasks will not need Opus pricing.
  5. Use prompt caching for repeated context, tool specs, memory, retrieval packs, and long system prompts.
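
For step 2, an unchanged list price can still mean a higher bill if the same text becomes more tokens. The sketch below is one way to estimate that effect when moving from a model on the older tokenizer. The `inflation` factor is a hypothetical value you would measure on your own prompts, and applying one factor to both input and output is a simplifying assumption.

```python
def effective_cost(
    old_input_tokens: int,
    old_output_tokens: int,
    inflation: float,
    input_rate: float = 5.00,    # USD per 1M input tokens (standard Opus 4.8)
    output_rate: float = 25.00,  # USD per 1M output tokens (standard Opus 4.8)
) -> float:
    """Estimate USD cost after scaling old token counts by a measured factor.

    inflation is new_token_count / old_token_count for the same text,
    e.g. 1.0 for no change. It is an input you measure, not a published number.
    """
    if inflation <= 0:
        raise ValueError("inflation must be positive")
    new_input = old_input_tokens * inflation
    new_output = old_output_tokens * inflation
    return (new_input * input_rate + new_output * output_rate) / 1_000_000


baseline = effective_cost(1_000_000, 100_000, inflation=1.0)
# 1.2 is a made-up example value, not a figure from Anthropic.
inflated = effective_cost(1_000_000, 100_000, inflation=1.2)
print(f"baseline ${baseline:.2f}, with 1.2x tokens ${inflated:.2f}")
```

Because cost is linear in token count, the bill scales by the same factor as the token count, so measuring the ratio on a representative sample of your own traffic is enough to project the change.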

My read

Opus 4.8 is not a price hike. It is a same-price flagship refresh with a meaningful fast-mode price cut.

That makes the launch commercially useful for Anthropic. Existing Opus buyers get a cleaner upgrade path, and teams that avoided fast mode because it was too expensive now have a reason to test it again.

For builders, the practical move is simple: benchmark Opus 4.8 for the expensive edge cases, keep Sonnet 4.6 or cheaper OpenAI and Google models for bulk work, and only pay the Opus premium where quality or latency directly affects revenue.


Sources: Anthropic Claude Opus 4.8 launch post, Anthropic model pricing docs, and Anthropic model overview.