Cheapest AI APIs in 2026: Ranked by Cost Per Million Tokens

A complete ranking of the most affordable AI APIs in 2026. Compare token costs across OpenAI, DeepSeek, Google, Mistral, xAI, Meta, and Anthropic.

Looking for the cheapest AI API? We’ve ranked every major model by cost per million tokens so you can find the best deal for your use case.

All prices are current as of March 2026 and updated daily on our pricing page.

Cheapest AI Models: Input Price Ranking

| Rank | Model | Provider | Input / 1M | Output / 1M |
|------|-------|----------|------------|-------------|
| 1 | GPT-4.1 nano | OpenAI | $0.10 | $0.40 |
| 2 | Mistral Small | Mistral | $0.10 | $0.30 |
| 3 | Llama 4 Scout | Meta | $0.15 | $0.15 |
| 4 | GPT-4o mini | OpenAI | $0.15 | $0.60 |
| 5 | Llama 4 Maverick | Meta | $0.20 | $0.20 |
| 6 | GPT-5.4 nano | OpenAI | $0.20 | $1.25 |
| 7 | Grok 4.1 Fast | xAI | $0.20 | $0.50 |
| 8 | GPT-5.4 mini | OpenAI | $0.25 | $2.00 |
| 9 | Gemini 2.5 Flash-Lite | Google | $0.25 | $1.50 |
| 10 | DeepSeek V3.2 | DeepSeek | $0.28 | $0.42 |

The Ultra-Budget Tier (Under $0.50/1M Input)

1. GPT-4.1 nano — $0.10 input / $0.40 output

OpenAI’s cheapest model, with a context window of over 1M tokens. Great for classification, routing, and simple extraction. The context window alone sets it apart from similarly priced competitors.

2. Mistral Small — $0.10 input / $0.30 output

Mistral’s budget model matches GPT-4.1 nano on input and beats it on output pricing. Solid for European-language tasks and general-purpose work.

3. Llama 4 Scout — $0.15 input / $0.15 output

Meta’s open-weight model, available through API partners such as Together and Fireworks. The symmetrical pricing (identical input and output cost) is unusual and great for output-heavy workloads. It also has a massive 10M-token context window.

4. DeepSeek V3.2 — $0.28 input / $0.42 output

DeepSeek offers remarkable performance for the price. The V3.2 update unified their pricing across chat and reasoning modes. With cache hits at just $0.028/1M, heavy reuse workloads become nearly free.
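To see how cache hits change the effective price, here's a minimal sketch using the DeepSeek V3.2 figures above ($0.28/1M uncached, $0.028/1M cached); the 70% hit rate is a hypothetical workload, not a quoted figure:

```python
# Effective DeepSeek V3.2 input cost per 1M tokens at a given cache hit rate.
# Prices from the article: $0.28/1M for cache misses, $0.028/1M for hits.
MISS_PRICE = 0.28   # USD per 1M uncached input tokens
HIT_PRICE = 0.028   # USD per 1M cached input tokens

def effective_input_price(cache_hit_rate: float) -> float:
    """Blended input price per 1M tokens for a cache hit rate in [0, 1]."""
    return cache_hit_rate * HIT_PRICE + (1 - cache_hit_rate) * MISS_PRICE

# A hypothetical workload where 70% of input tokens hit the cache:
print(round(effective_input_price(0.70), 4))  # 0.1036 -> ~63% below list price
```

At a 70% hit rate the blended input price drops to about $0.10/1M, which is why prompt-heavy, repetitive workloads get so cheap.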

Best Value: Price vs. Performance

Raw cost isn’t everything. Here’s our value ranking, weighing price against capability:

Best Overall Value: Gemini 2.5 Flash ($0.30 / $2.50)

Google’s Flash model is fast, supports text + image + video + audio input, and costs less than most competitors. For most applications, this is the sweet spot.

Best for Coding: Grok 4.1 Fast ($0.20 / $0.50)

xAI’s fast model delivers strong coding performance at a fraction of Claude’s price. With a 2M-token context window, it handles large codebases easily.

Best for Reasoning: DeepSeek V3.2 Reasoner ($0.28 / $0.42)

DeepSeek’s reasoning mode delivers thinking capabilities at 1/10th the cost of OpenAI’s o3 or Anthropic’s Opus.

Best for Long Context: Llama 4 Scout ($0.15 / $0.15)

10 million token context window at $0.15/1M. Nothing else comes close for context length vs. price.

Cheapest by Use Case

| Use Case | Best Model | Monthly Cost (10M input + 10M output tokens) |
|----------|------------|----------------------------------------------|
| Classification/routing | GPT-4.1 nano | $5 |
| Chatbots | Mistral Small | $4 |
| Code generation | Grok 4.1 Fast | $7 |
| Document analysis | Llama 4 Scout | $3 |
| Complex reasoning | DeepSeek V3.2 | $7 |
| Multimodal | Gemini 2.5 Flash | $28 |
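Monthly figures like these are just tokens times per-million price. A minimal estimator (the 10M/10M input/output split mirrors the table; your real mix will differ):

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Estimated monthly cost in USD.

    input_price / output_price are USD per 1M tokens, from the pricing tables.
    """
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# 10M input + 10M output on GPT-4.1 nano ($0.10 in / $0.40 out):
print(round(monthly_cost(10_000_000, 10_000_000, 0.10, 0.40), 2))  # 5.0
```

Swap in any model's prices from the ranking table to reproduce the rest of the column.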

How to Save Even More

  1. Use cached input pricing — most providers offer 80-90% discounts on repeated prompts
  2. Batch API — OpenAI offers 50% off for async processing
  3. Right-size your model — don’t use GPT-5.4 for tasks that GPT-4.1 nano can handle
  4. Monitor your usage — use our token calculator to estimate costs before committing
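The first two tips can compound. Here's illustrative arithmetic using GPT-4.1 nano prices from the table; the 80% cache hit rate and 90% cache discount are assumed figures within the ranges above, and whether cache and batch discounts stack varies by provider:

```python
# Rough estimate of combined savings from cached input + batch processing.
# Assumptions: GPT-4.1 nano prices ($0.10 in / $0.40 out per 1M tokens),
# 80% of input tokens cached at a 90% discount, batch = 50% off everything.
INPUT_PRICE, OUTPUT_PRICE = 0.10, 0.40
CACHE_DISCOUNT, BATCH_DISCOUNT = 0.90, 0.50
CACHE_HIT_RATE = 0.80

def cost_per_million_pair(batched: bool) -> float:
    """Cost of 1M input + 1M output tokens under the assumptions above."""
    cached_in = INPUT_PRICE * (1 - CACHE_DISCOUNT) * CACHE_HIT_RATE
    uncached_in = INPUT_PRICE * (1 - CACHE_HIT_RATE)
    total = cached_in + uncached_in + OUTPUT_PRICE
    return total * (1 - BATCH_DISCOUNT) if batched else total

baseline = INPUT_PRICE + OUTPUT_PRICE   # no cache, no batch: $0.50
optimized = cost_per_million_pair(batched=True)
print(round(optimized, 3), f"{1 - optimized / baseline:.0%} saved")
```

Under these assumptions the bill drops from $0.50 to about $0.21 per million-token pair, a savings of more than half before you've changed a single prompt.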

The Expensive Tier (For Reference)

Not everything is about saving money. Here are the premium models and what you pay for:

| Model | Input / 1M | Output / 1M | Why Pay More? |
|-------|------------|-------------|---------------|
| Claude Opus 4.1 | $15.00 | $75.00 | Best reasoning (legacy) |
| Claude Opus 4.6 | $5.00 | $25.00 | Best agentic coding |
| GPT-5.4 | $2.50 | $15.00 | Most capable general-purpose |
| Gemini 3.1 Pro | $2.00 | $12.00 | Latest Google, multimodal |

These models are worth it for complex tasks where quality matters more than cost. But for 80% of production workloads, the budget tier delivers good-enough results.

Compare all models on our pricing comparison page, or calculate your specific costs with our token calculator.