Claude vs Gemini Pricing (2026)

Claude and Gemini are two of the strongest API choices for teams building writing tools, coding agents, RAG systems, support assistants, and long-document workflows. They are also easy to compare badly.

The short version: Gemini usually wins on listed token price, especially for Flash models and Gemini 2.5 Pro under the long-context threshold. Claude usually wins when your evals reward its writing quality, coding behavior, instruction following, or agent reliability enough to justify higher token rates.

Using AI Pricing Guru’s tracked pricing data updated on June 22, 2026:

Claude Haiku 4.5 costs $1.00 per 1M input tokens, $0.10 per 1M cached input tokens, and $5.00 per 1M output tokens.
Claude Sonnet 4.6 costs $3.00 input, $0.30 cached input, and $15.00 output.
Claude Opus 4.8 costs $5.00 input, $0.50 cached input, and $25.00 output.
Claude Fable 5 and Mythos 5 are listed at $10.00 input, $1.00 cached input, and $50.00 output, but access is currently suspended.
Gemini 2.5 Flash-Lite costs $0.10 input and $0.40 output.
Gemini 2.5 Flash costs $0.30 input, $0.03 cached input, and $2.50 output.
Gemini 2.5 Pro costs $1.25 input, $0.125 cached input, and $10.00 output under the normal tier.
Gemini 2.5 Pro’s tracked >200K-token tier costs $2.50 input, $0.25 cached input, and $15.00 output.
Gemini 3 Pro and Gemini 3.1 Pro are listed at $2.00 input, $0.20 cached input, and $12.00 output.

For live tables, keep the Anthropic Claude pricing page, Google Gemini pricing page, and AI token cost calculator open while you model your own workload. For adjacent context, read our Claude API pricing guide and Google Gemini API pricing guide.

Quick Pricing Comparison

All prices are USD per 1 million tokens.

Provider	Model	Status	Input	Cached input	Output	Best fit
Google	Gemini 2.5 Flash-Lite	Active	$0.10	n/a	$0.40	Cheapest Gemini utility calls
Google	Gemini 2.5 Flash	Active	$0.30	$0.03	$2.50	Fast support, RAG, extraction, multimodal apps
Google	Gemini 2.5 Pro	Active	$1.25	$0.125	$10.00	Lower-cost premium work below the long-context tier
Google	Gemini 2.5 Pro (>200K)	Active	$2.50	$0.25	$15.00	Large-context 2.5 Pro workloads
Google	Gemini 3 Pro / 3.1 Pro	Preview	$2.00	$0.20	$12.00	Premium Google route, long-context and multimodal apps
Anthropic	Claude Haiku 4.5	Active	$1.00	$0.10	$5.00	Claude utility calls and cheap routing
Anthropic	Claude Sonnet 4.6	Active	$3.00	$0.30	$15.00	Default Claude production model
Anthropic	Claude Opus 4.8	Active	$5.00	$0.50	$25.00	Premium Claude reasoning, coding, writing
Anthropic	Claude Fable 5 / Mythos 5	Suspended	$10.00	$1.00	$50.00	Published frontier tier, not practical while suspended

The table shows why Gemini looks attractive in a spreadsheet. Under normal context sizes, Gemini 2.5 Pro ($1.25/$10) is cheaper than Claude Sonnet 4.6 ($3/$15) on both input and output. Gemini 3 Pro ($2/$12), Google’s premium preview tier, still undercuts both Sonnet 4.6 and Claude Opus 4.8 ($5/$25). Gemini 2.5 Flash and Flash-Lite sit far below Claude Haiku 4.5 ($1/$5), Claude’s lowest active API price.

That does not make Gemini the automatic winner. If Claude Sonnet solves a coding, support, or document-reasoning job with fewer retries and less human editing, it can still beat a cheaper Gemini route in production.
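A rough way to test that claim is to price the whole task rather than the single call. The sketch below is a minimal illustration, not a benchmark. The per-token prices come from the table above, while the token counts, the success rates, and the $0.50 cost of handling a failed answer are hypothetical placeholders to replace with your own eval data. It also assumes failed attempts are retried independently until one succeeds.

```python
def cost_per_success(input_price, output_price, input_tokens, output_tokens,
                     success_rate, failure_cost=0.0):
    """Expected USD cost per accepted result.

    Prices are USD per 1M tokens. Assumes independent retries, so the expected
    number of attempts is 1 / success_rate and the expected number of failures
    before a success is (1 - success_rate) / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    attempt_cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    expected_failures = (1 - success_rate) / success_rate
    return attempt_cost / success_rate + failure_cost * expected_failures


# Hypothetical workload: 20K input tokens and 2K output tokens per attempt.
# Success rates and failure_cost are made-up inputs, not measured results.
gemini_25_pro = cost_per_success(1.25, 10.00, 20_000, 2_000,
                                 success_rate=0.85, failure_cost=0.50)
claude_sonnet = cost_per_success(3.00, 15.00, 20_000, 2_000,
                                 success_rate=0.95, failure_cost=0.50)

print(f"Gemini 2.5 Pro:    ${gemini_25_pro:.4f} per accepted answer")
print(f"Claude Sonnet 4.6: ${claude_sonnet:.4f} per accepted answer")
```

With these made-up inputs Sonnet comes out cheaper per accepted answer even though each call costs twice as much. Set failure_cost to 0 and Gemini 2.5 Pro wins again, which is why the cost of a failed output matters more than the retry tokens themselves.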

Context Windows and Long-Context Cost

Claude and Gemini both compete hard on long context, but they price it differently.

Claude’s current positioning is simple: the Anthropic pricing page describes Claude Fable 5, Mythos 5, Opus 4.8, and Sonnet 4.6 as supporting a 1M-token Claude API context window. The catch is availability. Fable 5 and Mythos 5 are published in the price table but suspended, so practical Claude buyers should plan around Sonnet 4.6 and Opus 4.8 unless Anthropic restores access.

Gemini’s long-context economics are tiered. Large context windows are a core part of Gemini’s pitch, but AI Pricing Guru’s tracker explicitly separates Gemini 2.5 Pro’s normal tier from its >200K-token tier. That matters because the normal 2.5 Pro rate is $1.25 input and $10 output, while the >200K tier rises to $2.50 input and $15 output: double the input price and 50% more for output.

For large prompts, do not compare only the model name. Compare the exact context tier you will hit.

Long-context question	Claude answer	Gemini answer
Is long context a core selling point?	Yes, especially Sonnet 4.6 and Opus 4.8 at 1M context	Yes, especially Pro-family Gemini models
Is the cheapest model still cheap at very large context?	Claude prices do not split in the tracked table by token threshold	Gemini 2.5 Pro has a tracked >200K tier with higher rates
What is the main budgeting risk?	Sending huge repeated prompts to Sonnet or Opus without caching	Assuming 2.5 Pro normal rates apply to every large-document call
What should teams do?	Cache stable context and retrieve only the relevant chunks	Model the >200K tier and cache repeated prefixes

The safest planning rule is to calculate with real prompt sizes. A 20K-token support prompt, a 150K-token legal review, and a 700K-token repository context are different products from a cost perspective.
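As a minimal sketch of that rule, the Python below prices a single Gemini 2.5 Pro request at the three prompt sizes just mentioned. The rates are the tracked ones. The function name, the 2K-token output size, and the assumption that the whole request bills at the higher tier once the prompt exceeds 200K tokens are illustrative, so check the provider's billing rules before relying on it.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # tokens; the tracked ">200K" boundary

# USD per 1M tokens, from the tracked table.
GEMINI_25_PRO_TIERS = {
    "normal": {"input": 1.25, "output": 10.00},
    "long": {"input": 2.50, "output": 15.00},
}


def gemini_25_pro_request_cost(prompt_tokens, output_tokens):
    """Return (tier, USD cost) for one request.

    Assumption: a prompt above the threshold moves the whole request,
    input and output, onto the long-context rates.
    """
    tier = "long" if prompt_tokens > LONG_CONTEXT_THRESHOLD else "normal"
    rates = GEMINI_25_PRO_TIERS[tier]
    cost = (prompt_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
    return tier, cost


# The three prompt sizes from the paragraph above, each with a hypothetical 2K-token answer.
for prompt in (20_000, 150_000, 700_000):
    tier, cost = gemini_25_pro_request_cost(prompt, output_tokens=2_000)
    print(f"{prompt:>7} prompt tokens -> {tier} tier, ${cost:.4f} per request")
```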

Scenario 1: Customer Support Assistant

Assume a support assistant uses 100M input tokens and 30M output tokens per month. This covers ticket triage, help-center RAG, suggested replies, and user-visible answers.

Model	Monthly token cost
Gemini 2.5 Flash-Lite	$22
Gemini 2.5 Flash	$105
Gemini 2.5 Pro	$425
Gemini 3 Pro	$560
Claude Haiku 4.5	$250
Claude Sonnet 4.6	$750
Claude Opus 4.8	$1,250

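Gemini wins the raw cost comparison tier for tier. Gemini 2.5 Flash ($105) costs less than half as much as Claude Haiku 4.5 ($250), and both Gemini 2.5 Pro ($425) and Gemini 3 Pro ($560) are cheaper than Claude Sonnet 4.6 ($750). The one crossover is that Haiku 4.5 undercuts both Gemini Pro models.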
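The Scenario 1 figures can be reproduced with a few lines of Python, which also makes it easy to swap in your own volumes. This is a sketch using the tracked prices; the function and variable names are illustrative.

```python
# (input, output) in USD per 1M tokens, from the tracked table.
PRICES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 3 Pro": (2.00, 12.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.8": (5.00, 25.00),
}


def monthly_cost(input_millions, output_millions, input_price, output_price):
    """Monthly token cost in USD; volumes are in millions of tokens."""
    return input_millions * input_price + output_millions * output_price


# Scenario 1: 100M input tokens and 30M output tokens per month.
costs = {name: monthly_cost(100, 30, *price) for name, price in PRICES.items()}

for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name:<22} ${cost:>9,.2f}")
```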

But support quality depends on escalation risk. If the bot answers routine product questions, Gemini Flash or Gemini 2.5 Pro may be a strong fit. If the assistant handles refunds, policy exceptions, regulated topics, or delicate tone, Claude Sonnet 4.6 can be worth testing because the cost of a bad answer may exceed the token savings.

A practical architecture is to start cheap and escalate:

Gemini Flash-Lite or Gemini Flash for intent routing and easy answers.
Gemini 2.5 Pro or Claude Sonnet 4.6 for user-visible answers that need judgment.
Claude Opus 4.8 only for high-value or high-risk cases where Sonnet fails evals.
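The escalation ladder above can be sketched as a simple cascade. Everything here is a hypothetical stand-in: the tier names are plain labels rather than API model identifiers, and call_model and passes_check represent whatever provider client and eval check your stack already has.

```python
SUPPORT_TIERS = ["Gemini 2.5 Flash", "Claude Sonnet 4.6", "Claude Opus 4.8"]


def answer_with_escalation(question, call_model, passes_check, tiers=SUPPORT_TIERS):
    """Try each tier in order; return (model, draft, accepted) for the first passing draft."""
    draft = None
    for model in tiers:
        draft = call_model(model, question)
        if passes_check(model, draft):
            return model, draft, True
    # Nothing passed: return the last draft and flag it for human review.
    return tiers[-1], draft, False


# Stand-ins so the sketch runs without any provider SDK.
def fake_call_model(model, question):
    return f"[{model}] reply to: {question}"


def fake_check(model, draft):
    # Pretend the cheapest tier fails the check on this ticket.
    return model != "Gemini 2.5 Flash"


model, draft, accepted = answer_with_escalation(
    "Can I get a refund?", fake_call_model, fake_check
)
print(model, accepted)
```

Each failed tier still bills its tokens, so a cascade only saves money when the cheap tier passes most of the time. Track the pass rate per tier before committing to this design.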

Scenario 2: Coding Agent with Repeated Context

Now assume a coding agent uses:

100M uncached input tokens
300M cached input tokens
80M output tokens

This shape is common when a tool repeatedly sends repository summaries, coding rules, test logs, or tool schemas.

Model	Monthly token cost
Gemini 2.5 Flash	$239
Gemini 2.5 Pro	$962.50
Gemini 3 Pro	$1,220
Claude Haiku 4.5	$530
Claude Sonnet 4.6	$1,590
Claude Opus 4.8	$2,650

The big lesson is that cached input helps both providers. In the tracked data, every Gemini and Claude model with a cached-input rate bills it at one tenth of the normal input rate, a 90% discount. If your coding agent has a stable prefix, caching is not optional. It is one of the easiest ways to reduce spend without changing models.
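To see how much of each Scenario 2 bill the cache is saving, the sketch below recomputes the table and then reprices the same traffic with the 300M repeated tokens billed at the full input rate. It uses only the tracked per-token prices and ignores any cache-write or cache-storage charges a provider may bill, so treat the savings as an upper bound.

```python
# (input, cached input, output) in USD per 1M tokens, from the tracked table.
PRICES = {
    "Gemini 2.5 Flash": (0.30, 0.03, 2.50),
    "Gemini 2.5 Pro": (1.25, 0.125, 10.00),
    "Gemini 3 Pro": (2.00, 0.20, 12.00),
    "Claude Haiku 4.5": (1.00, 0.10, 5.00),
    "Claude Sonnet 4.6": (3.00, 0.30, 15.00),
    "Claude Opus 4.8": (5.00, 0.50, 25.00),
}


def monthly_cost(uncached_m, cached_m, output_m, prices):
    """Monthly token cost in USD; volumes are in millions of tokens."""
    input_price, cached_price, output_price = prices
    return uncached_m * input_price + cached_m * cached_price + output_m * output_price


report = {}
for name, prices in PRICES.items():
    with_cache = monthly_cost(100, 300, 80, prices)
    # Same traffic, but the 300M repeated tokens are billed as normal input.
    without_cache = monthly_cost(400, 0, 80, prices)
    report[name] = (with_cache, without_cache)
    print(f"{name:<18} ${with_cache:>9,.2f} cached   ${without_cache:>9,.2f} uncached")
```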

The model decision should be driven by accepted patch cost:

tasks solved without human rewrite
test pass rate
tool calls and retries
output token length per successful patch
cost per merged change

If Gemini 2.5 Pro passes your coding evals, it can be much cheaper than Claude Sonnet. If Claude Sonnet produces safer patches, better explanations, or fewer broken edits, the higher rate can still be rational. Do not route all coding to Opus unless the value of the task clearly supports it.
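Cost per merged change is easy to compute once each eval run logs its token cost and whether the patch was merged. The sketch below assumes a hypothetical log format (the field names and every number in runs are invented for illustration) and simply divides total spend by merged patches for each model.

```python
from collections import defaultdict

# Hypothetical eval log: one row per attempted task. Values are invented.
runs = [
    {"model": "gemini-2.5-pro", "token_cost": 0.12, "merged": True},
    {"model": "gemini-2.5-pro", "token_cost": 0.10, "merged": False},
    {"model": "gemini-2.5-pro", "token_cost": 0.14, "merged": False},
    {"model": "gemini-2.5-pro", "token_cost": 0.12, "merged": True},
    {"model": "claude-sonnet-4.6", "token_cost": 0.20, "merged": True},
    {"model": "claude-sonnet-4.6", "token_cost": 0.22, "merged": True},
    {"model": "claude-sonnet-4.6", "token_cost": 0.18, "merged": False},
    {"model": "claude-sonnet-4.6", "token_cost": 0.20, "merged": True},
]


def cost_per_merged_change(rows):
    """Total token spend divided by merged patches, per model."""
    spend = defaultdict(float)
    merged = defaultdict(int)
    for row in rows:
        spend[row["model"]] += row["token_cost"]
        merged[row["model"]] += int(row["merged"])
    # A model with no merged patches gets an infinite cost per merged change.
    return {
        model: (spend[model] / merged[model] if merged[model] else float("inf"))
        for model in spend
    }


result = cost_per_merged_change(runs)
for model, cost in result.items():
    print(f"{model:<18} ${cost:.4f} per merged change")
```

Failed attempts stay in the numerator, which is the point: a cheap model that rarely merges can cost more per merged change than an expensive one that usually does.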

For broader coding-tool spend, compare our Cursor vs GitHub Copilot pricing guide and best AI for coding guide.

Scenario 3: Long-Document RAG

For RAG and document review, input volume dominates. Assume 300M input tokens and 40M output tokens per month. If the app often crosses Gemini 2.5 Pro’s >200K-token threshold, the model choice changes.

Model	Monthly token cost
Gemini 2.5 Flash	$190
Gemini 2.5 Pro normal tier	$775
Gemini 2.5 Pro >200K tier	$1,350
Gemini 3 Pro	$1,080
Claude Sonnet 4.6	$1,500
Claude Opus 4.8	$2,500

Gemini 2.5 Flash is the cheapest route overall, and Gemini 2.5 Pro is the budget winner among Pro-class models when the normal tier applies ($775). Once the >200K tier applies to every request ($1,350), Gemini 3 Pro ($1,080) becomes the cheaper premium Google route in this simplified example. Claude Sonnet remains more expensive on raw tokens, but it may be worth testing where answer quality, citation handling, and synthesis matter more than minimum cost.
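Real traffic usually mixes both tiers, so a blended estimate is more useful than either extreme. The sketch below assumes that a given share of monthly tokens lands in the >200K tier and that Gemini 3 Pro bills a flat rate, as the tracked table lists it. The long_share parameter is a hypothetical input you would measure from your own request logs.

```python
INPUT_M, OUTPUT_M = 300, 40  # millions of tokens per month, as in this scenario

normal_tier = INPUT_M * 1.25 + OUTPUT_M * 10.00   # Gemini 2.5 Pro, normal tier
long_tier = INPUT_M * 2.50 + OUTPUT_M * 15.00     # Gemini 2.5 Pro, >200K tier
gemini_3_pro = INPUT_M * 2.00 + OUTPUT_M * 12.00  # single tracked rate


def blended_25_pro_cost(long_share):
    """Gemini 2.5 Pro monthly cost when long_share of tokens bill at the >200K tier."""
    if not 0 <= long_share <= 1:
        raise ValueError("long_share must be between 0 and 1")
    return (1 - long_share) * normal_tier + long_share * long_tier


# Share of long-context traffic at which Gemini 3 Pro matches blended 2.5 Pro.
breakeven_share = (gemini_3_pro - normal_tier) / (long_tier - normal_tier)

for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{share:.0%} long-context -> ${blended_25_pro_cost(share):,.2f}")
print(f"Gemini 3 Pro: ${gemini_3_pro:,.2f}, break-even share: {breakeven_share:.1%}")
```

Under these assumptions Gemini 2.5 Pro stays cheaper than Gemini 3 Pro until a little over half of the token volume bills at the >200K tier.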

Cost-control moves are the same for both providers:

retrieve fewer chunks
compress session memory
cache stable system prompts and reference material
cap answer length
split large workflows into cheaper classification and premium synthesis steps

Do not send every document in full on every turn. Long context is useful, but it should be reserved for requests that actually need it.
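One way to enforce "retrieve fewer chunks" is a hard token budget on retrieved context. The sketch below is a simple greedy selection under assumed inputs: each chunk is a dictionary with a hypothetical relevance score and a precomputed token count, and the 5,000-token budget is arbitrary.

```python
def select_chunks(chunks, token_budget):
    """Greedily keep the highest-scoring chunks that still fit in token_budget."""
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        if used + chunk["tokens"] <= token_budget:
            selected.append(chunk)
            used += chunk["tokens"]
    return selected, used


# Hypothetical retrieval results; scores and token counts are invented.
retrieved = [
    {"id": "a", "score": 0.9, "tokens": 1200},
    {"id": "b", "score": 0.8, "tokens": 3000},
    {"id": "c", "score": 0.7, "tokens": 800},
    {"id": "d", "score": 0.4, "tokens": 2500},
]

kept, used_tokens = select_chunks(retrieved, token_budget=5000)
print([chunk["id"] for chunk in kept], used_tokens)
```

A budget like this keeps per-request input cost predictable and, for Gemini 2.5 Pro, makes it harder to drift into the >200K tier by accident.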

Feature Comparison

Factor	Claude advantage	Gemini advantage
Raw token price	Stronger than older Claude tiers, but not usually cheapest	Lower prices across Flash, Pro, and preview premium tiers
Context positioning	1M-token Claude API positioning for current high-end models	Large context is a major Gemini selling point; tiering must be modeled
Cached input	90% discount on current tracked models	90% discount on current tracked models
Coding and writing	Often strong in evals for code review, prose, and careful instruction following	Competitive, especially when Gemini Pro passes workload-specific evals
High-volume routing	Haiku is useful, but starts at $1/$5	Flash-Lite and Flash are much cheaper for bulk tasks
Availability risk	Fable and Mythos are suspended; use Sonnet or Opus in practice	Some Gemini 3.x models are preview; validate stability before standardizing
Best default	Sonnet 4.6 for Claude-first production apps	Gemini 2.5 Flash, 2.5 Pro, or 3 Pro depending on quality needs

Claude is easier to explain as a three-step active ladder: Haiku for utility, Sonnet for default production, Opus for premium escalation. Gemini is broader and cheaper, but buyers need to choose among Flash-Lite, Flash, 2.5 Pro, 3 Pro, and context-tiered pricing.

When to Choose Claude

Choose Claude when:

Sonnet or Opus wins your coding and writing evals
failed outputs are expensive
instruction following matters more than raw token savings
you need careful long-document synthesis
your product already depends on Claude behavior
you want a simple Haiku/Sonnet/Opus ladder

Claude’s biggest weakness is price. Its biggest strength is that many teams already trust Sonnet and Opus for work where a wrong answer costs more than the API call.

When to Choose Gemini

Choose Gemini when:

token price is a major constraint
volume is high and easy to evaluate
Flash or Flash-Lite is good enough
2.5 Pro passes evals below the long-context tier
your stack is already on Google Cloud or Vertex AI
you need provider diversification

Gemini’s biggest strength is the breadth of cheap routes. It gives teams more room to push routine work down to Flash and reserve Pro models for the tasks that actually need them.

Best Strategy: Route by Task, Not Brand

For many teams, the best answer is both providers.

Workload	First route	Escalation route
Intent classification	Gemini 2.5 Flash-Lite	Claude Haiku 4.5
Support draft	Gemini 2.5 Flash or Gemini 3 Flash	Claude Sonnet 4.6
Premium customer answer	Gemini 2.5 Pro or Gemini 3 Pro	Claude Sonnet 4.6 or Opus 4.8
Coding triage	Gemini 2.5 Pro	Claude Sonnet 4.6
Hard code review	Claude Sonnet 4.6	Claude Opus 4.8
Large-document RAG	Gemini 2.5 Pro, modeled by tier	Gemini 3 Pro or Claude Sonnet 4.6

Start with the cheapest model that passes evals, then escalate only when confidence, complexity, or user value justifies it.
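The routing table can be expressed as plain configuration. In the sketch below the workload keys, the confidence score, and the 0.8 threshold are hypothetical, and where the table offers two options for a route the code simply picks one of them. How you produce the confidence value (a classifier, a self-check, a heuristic) is up to your stack.

```python
# (first route, escalation route) per workload, following the table above.
ROUTES = {
    "intent_classification": ("Gemini 2.5 Flash-Lite", "Claude Haiku 4.5"),
    "support_draft": ("Gemini 2.5 Flash", "Claude Sonnet 4.6"),
    "premium_customer_answer": ("Gemini 2.5 Pro", "Claude Sonnet 4.6"),
    "coding_triage": ("Gemini 2.5 Pro", "Claude Sonnet 4.6"),
    "hard_code_review": ("Claude Sonnet 4.6", "Claude Opus 4.8"),
    "large_document_rag": ("Gemini 2.5 Pro", "Gemini 3 Pro"),
}


def choose_model(workload, confidence, threshold=0.8):
    """Return the first route unless confidence falls below the threshold."""
    if workload not in ROUTES:
        raise KeyError(f"no route configured for workload: {workload}")
    first_route, escalation_route = ROUTES[workload]
    return first_route if confidence >= threshold else escalation_route


print(choose_model("support_draft", confidence=0.93))
print(choose_model("support_draft", confidence=0.41))
```

Keeping routes in data rather than in code makes it cheap to rerun evals and swap a model in or out when prices or availability change.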

FAQ

Is Gemini cheaper than Claude?

Usually, yes. Gemini 2.5 Flash-Lite, 2.5 Flash, 2.5 Pro, Gemini 3 Pro, and Gemini 3.1 Pro all undercut the closest active Claude tiers on listed token price in the current tracker. The caveat is task quality, not the rate card: Claude can still be cheaper per successful task if it reduces retries and human review.

Which Claude model should I compare with Gemini 3 Pro?

Compare Gemini 3 Pro with Claude Sonnet 4.6 and Claude Opus 4.8. Gemini 3 Pro is $2 input and $12 output per 1M tokens. Claude Sonnet 4.6 is $3/$15, while Claude Opus 4.8 is $5/$25.

Which Gemini model should I compare with Claude Sonnet 4.6?

Start with Gemini 2.5 Pro and Gemini 3 Pro. Gemini 2.5 Pro is cheaper at $1.25 input and $10 output in the normal tier, while Gemini 3 Pro is $2/$12. If your workload is simple, also test Gemini Flash before paying Pro rates.

Does long context make Gemini more expensive?

It can. The current tracker lists Gemini 2.5 Pro at $1.25 input and $10 output in the normal tier, but $2.50 input and $15 output above 200K tokens. At that tier, 2.5 Pro matches Claude Sonnet 4.6 on output price and is only $0.50 per 1M tokens cheaper on input, so model your prompt sizes before launch.

Are Claude Fable 5 and Mythos 5 usable alternatives?

Not for normal production planning while access is suspended. They are listed in pricing data at $10 input and $50 output per 1M tokens, but practical Claude buyers should compare Haiku 4.5, Sonnet 4.6, and Opus 4.8 until Anthropic restores access.

Bottom Line

Gemini is the lower-cost API stack for most common Claude comparisons. Its Flash models are far cheaper than Claude Haiku, Gemini 2.5 Pro beats Claude Sonnet on raw price under normal context sizes, and Gemini 3 Pro undercuts Claude Opus 4.8 by a wide margin.

Claude remains compelling when quality, coding behavior, writing, tool use, or risk reduction matter more than token price. The right production design is usually a router: Gemini for cheap and scalable work, Claude for tasks where evals prove the higher price buys better outcomes.

Last updated: June 22, 2026, using AI Pricing Guru’s tracked pricing data.