Cheapest AI APIs in 2026: Ranked by Cost Per Million Tokens

A complete ranking of the most affordable AI APIs in 2026. Compare token costs across OpenAI, DeepSeek, Google, Mistral, xAI, Meta, and Anthropic.

Looking for the cheapest AI API? We’ve ranked every major model by cost per million tokens so you can find the best deal for your use case.

All prices are current as of March 2026 and updated daily on our pricing page.

Cheapest AI Models: Input Price Ranking

| Rank | Model | Provider | Input / 1M | Output / 1M |
|------|-------|----------|------------|-------------|
| 1 | GPT-4.1 nano | OpenAI | $0.10 | $0.40 |
| 2 | Mistral Small | Mistral | $0.10 | $0.30 |
| 3 | Llama 4 Scout | Meta | $0.15 | $0.15 |
| 4 | GPT-4o mini | OpenAI | $0.15 | $0.60 |
| 5 | Llama 4 Maverick | Meta | $0.20 | $0.20 |
| 6 | GPT-5.4 nano | OpenAI | $0.20 | $1.25 |
| 7 | Grok 4.1 Fast | xAI | $0.20 | $0.50 |
| 8 | GPT-5.4 mini | OpenAI | $0.25 | $2.00 |
| 9 | Gemini 2.5 Flash-Lite | Google | $0.25 | $1.50 |
| 10 | DeepSeek V3.2 | DeepSeek | $0.28 | $0.42 |

The Ultra-Budget Tier (Under $0.50/1M Input)

1. GPT-4.1 nano — $0.10 input / $0.40 output

OpenAI’s cheapest model, with a context window of over 1M tokens. Great for classification, routing, and simple extraction. The context window alone sets it apart from similarly priced competitors.

2. Mistral Small — $0.10 input / $0.30 output

Mistral’s budget model matches GPT-4.1 nano on input and beats it on output pricing. Solid for European-language tasks and general-purpose work.

3. Llama 4 Scout — $0.15 input / $0.15 output

Meta’s open-weight model, available through API partners such as Together and Fireworks. The symmetrical pricing (identical input and output cost) is unusual and great for output-heavy workloads. It also has a massive 10M-token context window.

4. DeepSeek V3.2 — $0.28 input / $0.42 output

DeepSeek offers remarkable performance for the price. The V3.2 update unified their pricing across chat and reasoning modes. With cache hits at just $0.028/1M, heavy reuse workloads become nearly free.
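To see how cache hits change the effective price, here's a minimal sketch using the DeepSeek V3.2 figures above ($0.28/1M uncached, $0.028/1M cached); the 70% hit rate is a hypothetical workload, not a quoted figure:

```python
# Effective DeepSeek V3.2 input cost per 1M tokens at a given cache hit rate.
# Prices from the article: $0.28/1M for cache misses, $0.028/1M for hits.
MISS_PRICE = 0.28   # USD per 1M uncached input tokens
HIT_PRICE = 0.028   # USD per 1M cached input tokens

def effective_input_price(cache_hit_rate: float) -> float:
    """Blended input price per 1M tokens for a cache hit rate in [0, 1]."""
    return cache_hit_rate * HIT_PRICE + (1 - cache_hit_rate) * MISS_PRICE

# A hypothetical workload where 70% of input tokens hit the cache:
print(round(effective_input_price(0.70), 4))  # 0.1036 -> ~63% below list price
```

At a 70% hit rate the blended input price drops to about $0.10/1M, which is why prompt-heavy, repetitive workloads get so cheap.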

Best Value: Price vs. Performance

Raw cost isn’t everything. Here’s our value ranking, weighing price against capability:

Best Overall Value: Gemini 2.5 Flash ($0.30 / $2.50)

Google’s Flash model is fast, supports text + image + video + audio input, and costs less than most competitors. For most applications, this is the sweet spot.

Best for Coding: Grok 4.1 Fast ($0.20 / $0.50)

xAI’s fast model delivers strong coding performance at a fraction of Claude’s price. With a 2M-token context window, it handles large codebases easily.

Best for Reasoning: DeepSeek V3.2 Reasoner ($0.28 / $0.42)

DeepSeek’s reasoning mode delivers thinking capabilities at 1/10th the cost of OpenAI’s o3 or Anthropic’s Opus.

Best for Long Context: Llama 4 Scout ($0.15 / $0.15)

10 million token context window at $0.15/1M. Nothing else comes close for context length vs. price.

Cheapest by Use Case

| Use Case | Best Model | Monthly Cost (10M input + 10M output tokens) |
|----------|------------|----------------------------------------------|
| Classification/routing | GPT-4.1 nano | $5 |
| Chatbots | Mistral Small | $4 |
| Code generation | Grok 4.1 Fast | $7 |
| Document analysis | Llama 4 Scout | $3 |
| Complex reasoning | DeepSeek V3.2 | $7 |
| Multimodal | Gemini 2.5 Flash | $28 |
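Monthly figures like these are just tokens times per-million price. A minimal estimator (the 10M/10M input/output split mirrors the table; your real mix will differ):

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Estimated monthly cost in USD.

    input_price / output_price are USD per 1M tokens, from the pricing tables.
    """
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# 10M input + 10M output on GPT-4.1 nano ($0.10 in / $0.40 out):
print(round(monthly_cost(10_000_000, 10_000_000, 0.10, 0.40), 2))  # 5.0
```

Swap in any model's prices from the ranking table to reproduce the rest of the column.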

How to Save Even More

  1. Use cached input pricing — most providers offer 80-90% discounts on repeated prompts
  2. Batch API — OpenAI offers 50% off for async processing
  3. Right-size your model — don’t use GPT-5.4 for tasks that GPT-4.1 nano can handle
  4. Monitor your usage — use our token calculator to estimate costs before committing
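The first two tips can compound. Here's illustrative arithmetic using GPT-4.1 nano prices from the table; the 80% cache hit rate and 90% cache discount are assumed figures within the ranges above, and whether cache and batch discounts stack varies by provider:

```python
# Rough estimate of combined savings from cached input + batch processing.
# Assumptions: GPT-4.1 nano prices ($0.10 in / $0.40 out per 1M tokens),
# 80% of input tokens cached at a 90% discount, batch = 50% off everything.
INPUT_PRICE, OUTPUT_PRICE = 0.10, 0.40
CACHE_DISCOUNT, BATCH_DISCOUNT = 0.90, 0.50
CACHE_HIT_RATE = 0.80

def cost_per_million_pair(batched: bool) -> float:
    """Cost of 1M input + 1M output tokens under the assumptions above."""
    cached_in = INPUT_PRICE * (1 - CACHE_DISCOUNT) * CACHE_HIT_RATE
    uncached_in = INPUT_PRICE * (1 - CACHE_HIT_RATE)
    total = cached_in + uncached_in + OUTPUT_PRICE
    return total * (1 - BATCH_DISCOUNT) if batched else total

baseline = INPUT_PRICE + OUTPUT_PRICE   # no cache, no batch: $0.50
optimized = cost_per_million_pair(batched=True)
print(round(optimized, 3), f"{1 - optimized / baseline:.0%} saved")
```

Under these assumptions the bill drops from $0.50 to about $0.21 per million-token pair, a savings of more than half before you've changed a single prompt.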

The Expensive Tier (For Reference)

Not everything is about saving money. Here are the premium models and what you pay for:

| Model | Input / 1M | Output / 1M | Why Pay More? |
|-------|------------|-------------|---------------|
| Claude Opus 4.1 | $15.00 | $75.00 | Best reasoning (legacy) |
| Claude Opus 4.6 | $5.00 | $25.00 | Best agentic coding |
| GPT-5.4 | $2.50 | $15.00 | Most capable general-purpose |
| Gemini 3.1 Pro | $2.00 | $12.00 | Latest Google, multimodal |

These models are worth it for complex tasks where quality matters more than cost. But for 80% of production workloads, the budget tier delivers good-enough results.

Compare all models on our pricing comparison page, or calculate your specific costs with our token calculator.