AI Token Cost Calculator

Enter your expected token usage and instantly see costs across all major AI providers.

How do I calculate AI API costs? Multiply your input tokens by the provider's input rate and your output tokens by the output rate, divide each product by 1 million (rates are quoted per 1M tokens), then add the two results. For example, 500,000 input + 100,000 output on GPT-5.4 ($2.50/$15.00 per 1M) = $1.25 + $1.50 = $2.75 per request cycle. This calculator runs that math across all 55 tracked models so you can spot the cheapest option at a glance.

Input tokens: 5,000
Output tokens: 1,000
Cached input: 0% of input at cached rate
Model                      Provider    Total      Input      Output
Ministral 3B               Mistral     $0.00024   $0.0002    $0.00004
Command R7B                Cohere      $0.000338  $0.000188  $0.00015
Embed v3 English           Cohere      $0.0005    $0.0005    $0.00
Embed v3 Multilingual      Cohere      $0.0005    $0.0005    $0.00
Ministral 8B               Mistral     $0.0006    $0.0005    $0.0001
Mistral Small              Mistral     $0.0008    $0.0005    $0.0003
GPT-4.1 nano               OpenAI      $0.0009    $0.0005    $0.0004
Llama 4 Scout              Meta        $0.0009    $0.00075   $0.00015
Llama 4 Maverick           Meta        $0.0012    $0.001     $0.0002
Sonar Small Online         Perplexity  $0.0012    $0.001     $0.0002
Command R 08-2024          Cohere      $0.00135   $0.00075   $0.0006
GPT-4o mini                OpenAI      $0.00135   $0.00075   $0.0006
Grok 4.1 Fast              xAI         $0.0015    $0.001     $0.0005
DeepSeek V3.2 (Chat)       DeepSeek    $0.00182   $0.0014    $0.00042
DeepSeek V3.2 (Reasoner)   DeepSeek    $0.00182   $0.0014    $0.00042
GPT-5.4 nano               OpenAI      $0.00225   $0.001     $0.00125
Codestral                  Mistral     $0.0024    $0.0015    $0.0009
Gemini 3.1 Flash-Lite      Google      $0.00275   $0.00125   $0.0015
GPT-4.1 mini               OpenAI      $0.0036    $0.002     $0.0016
Gemini 2.5 Flash           Google      $0.004     $0.0015    $0.0025
Llama 3.3 70B (Together)   Together    $0.00528   $0.0044    $0.00088
Gemini 3 Flash             Google      $0.0055    $0.0025    $0.003
Sonar Large Online         Perplexity  $0.006     $0.005     $0.001
Mixtral 8x22B (Together)   Together    $0.0072    $0.006     $0.0012
Qwen 2.5 72B (Together)    Together    $0.0072    $0.006     $0.0012
DeepSeek V3 (Together)     Together    $0.0075    $0.00625   $0.00125
GPT-5.4 mini               OpenAI      $0.00825   $0.00375   $0.0045
o4-mini                    OpenAI      $0.0099    $0.0055    $0.0044
Claude Haiku 4.5           Anthropic   $0.01      $0.005     $0.005
Rerank v3                  Cohere      $0.01      $0.01      $0.00
Grok 4.20                  xAI         $0.016     $0.01      $0.006
Mistral Large              Mistral     $0.016     $0.01      $0.006
Mixtral 8x22B              Mistral     $0.016     $0.01      $0.006
Pixtral Large              Mistral     $0.016     $0.01      $0.006
Gemini 2.5 Pro             Google      $0.0163    $0.00625   $0.01
GPT-4.1                    OpenAI      $0.018     $0.01      $0.008
o3                         OpenAI      $0.018     $0.01      $0.008
Llama 3.1 405B (Together)  Together    $0.021     $0.0175    $0.0035
DeepSeek R1 (Together)     Together    $0.022     $0.015     $0.007
Gemini 3 Pro               Google      $0.022     $0.01      $0.012
Gemini 3.1 Pro             Google      $0.022     $0.01      $0.012
Command R+ 08-2024         Cohere      $0.0225    $0.0125    $0.01
GPT-4o                     OpenAI      $0.0225    $0.0125    $0.01
Mistral Large (Together)   Together    $0.024     $0.015     $0.009
GPT-5.4                    OpenAI      $0.0275    $0.0125    $0.015
Claude Sonnet 4            Anthropic   $0.03      $0.015     $0.015
Claude Sonnet 4.5          Anthropic   $0.03      $0.015     $0.015
Claude Sonnet 4.6          Anthropic   $0.03      $0.015     $0.015
Sonar Pro                  Perplexity  $0.03      $0.015     $0.015
Sonar Huge Online          Perplexity  $0.03      $0.025     $0.005
Claude Opus 4.5            Anthropic   $0.05      $0.025     $0.025
Claude Opus 4.6            Anthropic   $0.05      $0.025     $0.025
Claude Opus 4.7            Anthropic   $0.05      $0.025     $0.025
Claude Opus 4              Anthropic   $0.15      $0.075     $0.075
Claude Opus 4.1            Anthropic   $0.15      $0.075     $0.075
55 models · Costs based on 5,000 input + 1,000 output tokens
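
The cheapest-first ordering can be reproduced with a few lines of Python. This is a minimal sketch, not the calculator's actual code: the function and variable names are hypothetical, and the per-1M rates are back-derived from three of the rows above (row cost divided by token count, times 1,000,000).

```python
# Per-1M-token rates in USD as (input_rate, output_rate).
# Assumption: back-derived from the 5,000 / 1,000 token rows above.
RATES = {
    "GPT-5.4": (2.50, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Ministral 3B": (0.04, 0.04),
}


def request_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Cost in USD of one request, with rates quoted per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def rank_models(input_tokens, output_tokens, rates=RATES):
    """Return (model, cost) pairs sorted cheapest-first."""
    costs = [
        (name, request_cost(input_tokens, output_tokens, in_rate, out_rate))
        for name, (in_rate, out_rate) in rates.items()
    ]
    return sorted(costs, key=lambda pair: pair[1])


for name, cost in rank_models(5_000, 1_000):
    print(f"{name:<18} ${cost:.5f}")
```

With 5,000 input and 1,000 output tokens, the three models come out in the same order as in the listing: Ministral 3B, then Claude Haiku 4.5, then GPT-5.4.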

How do I use the AI token cost calculator?

  1. Enter expected input tokens: one token is roughly 0.75 words or 4 characters, so a 2,000-word prompt is ~2,700 tokens.
  2. Enter expected output tokens: model responses usually run 200–2,000 tokens unless you cap them with max_tokens.
  3. Set monthly request volume: the calculator multiplies the single-request cost by this number to estimate monthly spend.
  4. Compare rows: the table sorts cheapest-first. Cached-input rates cut the input cost for many providers by 75–90%.
  5. Click a model to jump to its provider page for context, FAQ, and rate-limit details.
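
The token estimate in step 1 and the monthly multiplication in step 3 can be sketched as follows. The helper names are hypothetical, the 0.75 words-per-token figure is only the rule of thumb quoted above (real tokenizers vary by model and language), and the 10,000 requests per month is an assumed example volume.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from step 1; not an exact tokenizer count


def estimate_tokens(word_count):
    """Rough token count for English text, rounded up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def monthly_spend(cost_per_request, requests_per_month):
    """Scale a single-request cost (USD) to a monthly estimate."""
    return cost_per_request * requests_per_month


print(estimate_tokens(2_000))                  # 2667, i.e. roughly 2,700 tokens
print(f"${monthly_spend(0.0275, 10_000):.2f}")  # GPT-5.4 row cost at an assumed 10,000 requests
```

For anything close to a budget limit, counting tokens with the provider's own tokenizer is safer than the word-count estimate.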

Methodology

All prices come from the official API pricing pages of each provider, checked daily. The formula for a single request is:

cost = (input_tokens / 1,000,000 * input_rate) + (output_tokens / 1,000,000 * output_rate)

When the cached-input slider is above 0%, the input portion splits into cached and non-cached fractions, each multiplied by the respective rate. Models without a published cached rate use the standard input rate for both.
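
A minimal Python sketch of the formula with the cached split, under stated assumptions: the function and parameter names are hypothetical, cached_fraction stands in for the slider value (0.0 to 1.0), and the $0.25 cached rate in the usage lines is an illustrative figure rather than a published price.

```python
def request_cost_with_cache(input_tokens, output_tokens, input_rate, output_rate,
                            cached_fraction=0.0, cached_rate=None):
    """Cost in USD of one request; all rates are per 1M tokens."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    # Models without a published cached rate fall back to the standard input rate.
    if cached_rate is None:
        cached_rate = input_rate

    cached_tokens = input_tokens * cached_fraction
    fresh_tokens = input_tokens - cached_tokens

    input_cost = (fresh_tokens * input_rate + cached_tokens * cached_rate) / 1_000_000
    output_cost = output_tokens * output_rate / 1_000_000
    return input_cost + output_cost


# GPT-5.4 rates from the example above ($2.50 / $15.00 per 1M).
no_cache = request_cost_with_cache(5_000, 1_000, 2.50, 15.00)
# Half the input cached at an assumed, illustrative rate of $0.25 per 1M.
half_cached = request_cost_with_cache(5_000, 1_000, 2.50, 15.00,
                                      cached_fraction=0.5, cached_rate=0.25)
print(f"{no_cache:.6f} {half_cached:.6f}")
```

With cached_fraction left at 0, the function reduces to the single-request formula above, so the same code covers both cases.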