How much does the Gemini API cost? Google Gemini pricing runs from $0.25 per 1M tokens (Gemini 3.1 Flash-Lite input) to $4.00 per 1M (Gemini 3.1 Pro input above 200K context). The flagship Gemini 3.1 Pro costs $2.00 / $12.00 per 1M up to 200K context, stepping to $4.00 / $18.00 above — matching GPT-5.4 on price while offering a 2M-token context window. Key change (April 1, 2026): Gemini Pro models are now paid-only. Flash and Flash-Lite retain free tiers with reduced daily quotas. Read the full change summary →
Google's Gemini API stands out in May 2026 for two reasons: the largest production context window (2M tokens) and competitive flagship pricing — Gemini 3.1 Pro matches GPT-5.4 at $2/$12 per 1M. The April 1 free-tier reshuffle removed free Pro-model access but kept Flash and Flash-Lite free for prototyping. For a full breakdown of what changed and how to adapt, see our Gemini free-tier change writeup.
Gemini 3.x (Current Production)
Google's latest generation is now GA and the recommended production tier:
- Gemini 3.1 Pro ($2.00 / $12.00 per 1M under 200K; $4.00 / $18.00 above) — Current flagship. Paid-only as of April 1, 2026. 2M-token context window.
- Gemini 3 Pro ($2.00 input / $12.00 output) — Stable flagship alternative. Paid-only.
- Gemini 3 Flash ($0.50 input / $3.00 output) — New default Flash model. Retains free tier with reduced quota.
- Gemini 3.1 Flash-Lite ($0.25 input / $1.50 output) — Cheapest Tier-1 budget model. Retains free tier with reduced quota.
Gemini 2.5 (Legacy — Paid Tier Only)
The 2.5 family remains available but is moving to legacy status. All 2.5 models are now paid-tier only.
- Gemini 2.5 Pro ($1.25 input / $10.00 output) — Legacy flagship. Still competitive on price but superseded by 3.x.
- Gemini 2.5 Flash ($0.30 input / $2.50 output) — Legacy mid-tier. Paid-only after April 1, 2026.
Free Tier: What Changed April 1, 2026
Google tightened the free tier significantly on April 1, 2026. The new state:
- Pro-tier models (3.1 Pro, 3 Pro, 2.5 Pro): free tier removed. Paid-only.
- Gemini 3 Flash: free tier retained with reduced daily quota.
- Gemini 3.1 Flash-Lite: free tier retained with reduced daily quota.
Google still has the most generous free tier overall — Flash access at scale is unmatched — but the "free Pro" era is over. For flagship-class access without payment, your only remaining option is Claude's $5 trial credits.
Full breakdown: What changed April 1 and three ways to keep prototyping cheaply →
Context Caching: 75-90% Savings
Gemini's context caching is exceptionally aggressive. Gemini 2.5 Pro cached input drops from $1.25 to $0.13/M — a 90% discount. For applications with repeated system prompts or reference documents, this can slash your bill dramatically.
How Gemini Compares
At the flagship tier, Gemini 3.1 Pro ($2.00/M input) matches GPT-5.4 on price and undercuts Claude Opus 4.7 ($5.00/M) by 60%. Gemini still wins decisively on context window — 2M tokens vs Opus 4.7's 200K and GPT-5.4's 270K — and on vision resolution.
At the budget tier, Gemini 3.1 Flash-Lite ($0.25/M input) is the cheapest mainstream model from a Tier-1 provider in 2026. Only DeepSeek's cached pricing is lower, at the cost of non-Tier-1 operational maturity.
For detailed head-to-head benchmarks and cost scenarios, see our flagship comparison guide.
All Google Gemini Models — Price per 1M Tokens
| Model↕ | Provider↕ | Input $/1M↑ | Cached $/1M↕ | Output $/1M↕ |
|---|---|---|---|---|
Gemini 2.5 Pro (>200k tokens) Gemini 2.5 | $2.50 | $0.25 | $15.00 |
- Gemini 2.5 Pro (>200k tokens)Gemini 2.5
- Input
- $2.50
- Cached
- $0.25
- Output
- $15.00
Price History
Track how Google Gemini API pricing has changed over time.
Gemini 2.5 Flash
Gemini 2.5 Pro
Gemini 2.5 Pro (>200k tokens)
Gemini 3 Flash
Gemini 3 Pro
Gemini 3.1 Flash-Lite
Gemini 3.1 Pro
Price history tracking started April 2026. Charts will appear after the first price change is detected.
View pricing changelog →
Frequently asked questions
How much does Gemini 3.1 Pro cost per token?
Gemini 3.1 Pro costs $2.00 per 1M input tokens and $12.00 per 1M output tokens for contexts up to 200K. Above 200K context, pricing jumps to $4.00 input / $18.00 output. Cached input drops to $0.20 per 1M — a 90% discount. Gemini 3.1 Pro is priced head-to-head with GPT-5.4.
Does Google Gemini have a free tier?
Partially. As of April 1, 2026, Google removed Gemini Pro models (including Gemini 3.1 Pro, 3 Pro, and 2.5 Pro) from the free tier — they are paid-only now. Flash models (Gemini 3 Flash and 3.1 Flash-Lite) remain free with reduced daily quotas. If you need free Pro-tier access, your only remaining option is Claude's $5 trial credits.
What changed on April 1, 2026?
Google removed all Pro-tier models from the free API tier — Gemini 3.1 Pro, Gemini 3 Pro, and Gemini 2.5 Pro now require billing to be enabled. Flash and Flash-Lite remain free but with tightened daily quotas. Google did not publish a formal changelog; the change surfaced through API error messages and updated pricing documentation.
How does Gemini compare to OpenAI and Claude?
At the flagship tier, Gemini 3.1 Pro ($2.00/$12.00) matches GPT-5.4 on pricing and undercuts Claude Opus 4.7 ($5.00/$25.00) by roughly 60%. Gemini still wins on context window (2M tokens vs 200K/270K) and vision resolution. For budget volume, Flash-Lite at $0.25/$1.50 is the cheapest Tier-1 budget model on the market.
What Gemini models does Google offer?
Google offers Gemini 3.1 Pro (flagship, tiered pricing), Gemini 3.1 Flash-Lite (budget), Gemini 3 Pro, Gemini 3 Flash (new default), and legacy Gemini 2.5 Pro / Flash / Flash-Lite models. The 3.x family is now the recommended production tier; 2.5 remains available but is moving to legacy status.
What is Gemini 3.1 Pro's context window?
Gemini 3.1 Pro supports a 2-million-token context window — the largest in production among Tier-1 providers. This is 10x Claude Opus 4.7's 200K standard window and ~7x GPT-5.4's 270K window. Pricing tiers at 200K: $2/$12 per 1M under, $4/$18 per 1M over.
How much does Gemini context caching save?
Google offers 90% discounts on cached context. Gemini 3.1 Pro drops from $2.00 to $0.20 per 1M input tokens on cache hits. For applications with large system prompts or reference documents, caching can reduce total input cost by 80-90%.
Methodology
Pricing sourced from https://ai.google.dev/pricing on . All prices in USD per 1 million tokens. Raw data: /api/pricing.json. API docs.
Compare All Providers
See how Google compares to OpenAI, Anthropic, DeepSeek, and more.