AI Token Cost Calculator
Enter your expected token usage and instantly see costs across all major AI providers.
How do I calculate AI API costs? Multiply your input tokens by the provider's input rate and your output tokens by the output rate, then divide by 1 million. For example, 500,000 input + 100,000 output on GPT-5.4 ($2.50/$15.00 per 1M) = $1.25 + $1.50 = $2.75 per request cycle. This calculator runs that math across all 112 tracked models so you can spot the cheapest option in one glance.
Tip: type 1M, 500k or 10,000. Cached-input tokens are billed at each model's discounted cached rate.
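The shorthand accepted by the input fields can be sketched as a small parser. This is an illustration only, not the calculator's actual code: the function name `parse_token_count` and the choice to accept decimals such as `1.5k` are assumptions.

```python
import re

# Multipliers for the shorthand suffixes mentioned in the tip (k and M).
_SUFFIXES = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_token_count(text: str) -> int:
    """Parse inputs such as '1M', '500k' or '10,000' into an integer token count."""
    # Drop whitespace and thousands separators, and ignore letter case.
    cleaned = text.strip().replace(",", "").lower()
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([km]?)", cleaned)
    if match is None:
        raise ValueError(f"cannot parse token count: {text!r}")
    number, suffix = match.groups()
    return round(float(number) * _SUFFIXES[suffix])
```

With this sketch, `parse_token_count("1M")`, `parse_token_count("500k")` and `parse_token_count("10,000")` yield 1,000,000, 500,000 and 10,000 respectively.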
24 legacy models hidden.
| Provider | Model | Input cost (per 1M tokens) | Output cost (per 1M tokens) | Total | |
|---|---|---|---|---|---|
| cohere | Embed v3 English Embed | $0.10 | $0.00 | $0.10 | |
| cohere | Embed v3 Multilingual Embed | $0.10 | $0.00 | $0.10 | |
| groq | Llama 3.1 8b Instant groq | $0.05 | $0.08 | $0.13 | |
| together | LFM2 24B A2B (Together) LFM2 | $0.03 | $0.12 | $0.15 | |
| together | Gemma 3n E4B Instruct (Together) Gemma 3n | $0.06 | $0.12 | $0.18 | |
| cohere | Command R7B Command R | $0.0375 | $0.15 | $0.1875 | |
| Mistral | Ministral 3B Ministral | $0.10 | $0.10 | $0.20 | |
| together | GPT-OSS 20B (Together) GPT-OSS | $0.05 | $0.20 | $0.25 | |
| Mistral | Ministral 8B Ministral | $0.15 | $0.15 | $0.30 | |
| Mistral | Mistral NeMo Mistral Open | $0.15 | $0.15 | $0.30 | |
| Mistral | Pixtral 12B Mistral | $0.15 | $0.15 | $0.30 | |
| together | Rnj-1 Instruct (Together) Rnj | $0.15 | $0.15 | $0.30 | |
| groq | GPT OSS Safeguard 20B GPT OSS | $0.075 | $0.30 | $0.375 | |
| groq | Openai/gpt Oss 20b groq | $0.075 | $0.30 | $0.375 | |
| Meta | Llama 4 Scout Llama 4 | $0.08 | $0.30 | $0.38 | |
| Mistral | Devstral Small 2 Devstral | $0.10 | $0.30 | $0.40 | |
| Mistral | Ministral 14B Ministral | $0.20 | $0.20 | $0.40 | |
| Mistral | Mistral Small 4 Mistral Small | $0.10 | $0.30 | $0.40 | |
| DeepSeek | DeepSeek V4 Flash DeepSeek V4 | $0.14 | $0.28 | $0.42 | |
| together | Qwen3.5 9B (Together) Qwen3.5 | $0.17 | $0.25 | $0.42 | |
| OpenAI | GPT-5 nano GPT-5 | $0.05 | $0.40 | $0.45 | |
| groq | Llama 4 Scout 17B 16E Instruct Llama 4 | $0.11 | $0.34 | $0.45 | |
| Google | Gemini 2.5 Flash-Lite Gemini 2.5 | $0.10 | $0.40 | $0.50 | |
| Mistral | Mistral 7B Mistral Open | $0.25 | $0.25 | $0.50 | |
| cohere | Command R 08-2024 Command R | $0.15 | $0.60 | $0.75 | |
| OpenAI | GPT-4o mini GPT-4o | $0.15 | $0.60 | $0.75 | |
| together | GPT-OSS 120B (Together) GPT-OSS | $0.15 | $0.60 | $0.75 | |
| Meta | Llama 4 Maverick Llama 4 | $0.15 | $0.60 | $0.75 | |
| groq | Openai/gpt Oss 120b groq | $0.15 | $0.60 | $0.75 | |
| groq | Qwen3 32B Qwen3 | $0.29 | $0.59 | $0.88 | |
| Mistral | Codestral Mistral | $0.30 | $0.90 | $1.20 | |
| DeepSeek | DeepSeek V4 Pro DeepSeek V4 | $0.435 | $0.87 | $1.31 | |
| groq | Llama 3.3 70b Versatile groq | $0.59 | $0.79 | $1.38 | |
| Mistral | Mixtral 8x7B Mixtral | $0.70 | $0.70 | $1.40 | |
| OpenAI | GPT-5.4 nano GPT-5.4 | $0.20 | $1.25 | $1.45 | |
| together | MiniMax M2.7 (Together) MiniMax M2 | $0.30 | $1.20 | $1.50 | |
| Google | Gemini 3.1 Flash-Lite Gemini 3 | $0.25 | $1.50 | $1.75 | |
| OpenAI | GPT-4.1 mini GPT-4.1 | $0.40 | $1.60 | $2.00 | |
| Mistral | Magistral Small Magistral | $0.50 | $1.50 | $2.00 | |
| Mistral | Mistral Large 3 Mistral | $0.50 | $1.50 | $2.00 | |
| cohere | Rerank v3 Rerank | $2.00 | $0.00 | $2.00 | |
| perplexity | Sonar Sonar | $1.00 | $1.00 | $2.00 | |
| together | Llama 3.3 70B (Together) Llama 3.3 | $1.04 | $1.04 | $2.08 | |
| OpenAI | GPT-5 mini GPT-5 | $0.25 | $2.00 | $2.25 | |
| Mistral | Devstral Medium 2 Devstral | $0.40 | $2.00 | $2.40 | |
| together | Cogito v2.1 671B (Together) Cogito | $1.25 | $1.25 | $2.50 | |
| Google | Gemini 2.5 Flash Gemini 2.5 | $0.30 | $2.50 | $2.80 | |
| Google | Gemini 3 Flash Gemini 3 | $0.50 | $3.00 | $3.50 | |
| xAI | Grok 4.3 Grok 4.3 | $1.25 | $2.50 | $3.75 | |
| together | GLM-5 (Together) GLM | $1.00 | $3.20 | $4.20 | |
| together | Qwen3.5 397B A17B (Together) Qwen3.5 | $0.60 | $3.60 | $4.20 | |
| together | Qwen3.7-Max (Together) Qwen3.7 | $1.25 | $3.75 | $5.00 | |
| OpenAI | GPT-5.4 mini GPT-5.4 | $0.75 | $4.50 | $5.25 | |
| together | Kimi K2.6 (Together) Kimi K2 | $1.20 | $4.50 | $5.70 | |
| together | GLM-5.1 (Together) GLM-5 | $1.40 | $4.40 | $5.80 | |
| Anthropic | Claude Haiku 4.5 Claude 4.5 | $1.00 | $5.00 | $6.00 | |
| together | DeepSeek V4 Pro (Together) DeepSeek V4 | $2.10 | $4.40 | $6.50 | |
| Mistral | Magistral Medium Magistral | $2.00 | $5.00 | $7.00 | |
| Mistral | Mixtral 8x22B Mistral | $2.00 | $6.00 | $8.00 | |
| Mistral | Pixtral Large Mistral | $2.00 | $6.00 | $8.00 | |
| Mistral | Mistral Medium 3.5 Mistral Medium | $1.50 | $7.50 | $9.00 | |
| OpenAI | GPT-4.1 GPT-4.1 | $2.00 | $8.00 | $10.00 | |
| OpenAI | o3 o-series | $2.00 | $8.00 | $10.00 | |
| perplexity | Sonar Deep Research Sonar | $2.00 | $8.00 | $10.00 | |
| perplexity | Sonar Reasoning Pro Sonar | $2.00 | $8.00 | $10.00 | |
| Google | Gemini 3.5 Flash Gemini 3.5 | $1.50 | $9.00 | $10.50 | |
| Google | Gemini 2.5 Pro Gemini 2.5 | $1.25 | $10.00 | $11.25 | |
| OpenAI | GPT-5 GPT-5 | $1.25 | $10.00 | $11.25 | |
| OpenAI | GPT-5.1 GPT-5 | $1.25 | $10.00 | $11.25 | |
| cohere | Command A Command A | $2.50 | $10.00 | $12.50 | |
| cohere | Command R+ 08-2024 Command R | $2.50 | $10.00 | $12.50 | |
| OpenAI | GPT-4o GPT-4o | $2.50 | $10.00 | $12.50 | |
| Google | Gemini 3 Pro Gemini 3 | $2.00 | $12.00 | $14.00 | |
| Google | Gemini 3.1 Pro Gemini 3 | $2.00 | $12.00 | $14.00 | |
| OpenAI | GPT-5.2 GPT-5 | $1.75 | $14.00 | $15.75 | |
| Google | Gemini 2.5 Pro (>200k tokens) Gemini 2.5 | $2.50 | $15.00 | $17.50 | |
| OpenAI | GPT-5.4 GPT-5.4 | $2.50 | $15.00 | $17.50 | |
| Anthropic | Claude Sonnet 4.6 Claude 4.6 | $3.00 | $15.00 | $18.00 | |
| perplexity | Sonar Pro Sonar | $3.00 | $15.00 | $18.00 | |
| Anthropic | Claude Opus 4.8 Claude 4.8 | $5.00 | $25.00 | $30.00 | |
| OpenAI | GPT-5.5 GPT-5.5 | $5.00 | $30.00 | $35.00 | |
| Anthropic | Claude Fable 5 Claude Fable 5 | $10.00 | $50.00 | $60.00 | |
| Anthropic | Claude Mythos 5 Claude Mythos 5 | $10.00 | $50.00 | $60.00 | |
| OpenAI | o3-pro o-series | $20.00 | $80.00 | $100.00 | |
| OpenAI | GPT-5 Pro GPT-5 | $15.00 | $120.00 | $135.00 | |
| OpenAI | GPT-5.2 Pro GPT-5 | $21.00 | $168.00 | $189.00 | |
| OpenAI | GPT-5.4 Pro GPT-5.4 | $30.00 | $180.00 | $210.00 | |
| OpenAI | GPT-5.5 Pro GPT-5.5 | $30.00 | $180.00 | $210.00 |
How do I use the AI token cost calculator?
- Enter expected input tokens: one token is roughly 0.75 words or 4 characters, so a 2,000-word prompt is ~2,700 tokens.
- Enter expected output tokens: model responses are usually 200–2,000 tokens unless you explicitly set max_tokens.
- Set monthly request volume: this multiplies the single-request cost to estimate monthly spend.
- Compare rows: the table sorts cheapest-first. Cached-input rates cut input costs by 75–90% on many providers.
- Click a model to jump to its provider page for context, FAQ, and rate-limit details.
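The steps above can be sketched in a few lines of Python: compute the single-request cost for each model, multiply by monthly volume, and sort cheapest-first. The rates are copied from four rows of the table; the names `ModelRate`, `request_cost` and `rank_by_monthly_cost` are hypothetical and do not reflect how the calculator is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    provider: str
    model: str
    input_rate: float   # USD per 1M input tokens
    output_rate: float  # USD per 1M output tokens


# A small sample of rows from the table above.
RATES = [
    ModelRate("OpenAI", "GPT-5.4", 2.50, 15.00),
    ModelRate("Anthropic", "Claude Haiku 4.5", 1.00, 5.00),
    ModelRate("DeepSeek", "DeepSeek V4 Flash", 0.14, 0.28),
    ModelRate("groq", "Llama 3.1 8b Instant", 0.05, 0.08),
]


def request_cost(rate: ModelRate, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given per-1M-token rates."""
    return (input_tokens / 1_000_000 * rate.input_rate
            + output_tokens / 1_000_000 * rate.output_rate)


def rank_by_monthly_cost(rates, input_tokens, output_tokens, requests_per_month):
    """Return (provider, model, monthly_cost) tuples sorted cheapest-first."""
    rows = [
        (r.provider, r.model,
         request_cost(r, input_tokens, output_tokens) * requests_per_month)
        for r in rates
    ]
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    # 2,700 input tokens, 500 output tokens, 10,000 requests per month.
    for provider, model, monthly in rank_by_monthly_cost(RATES, 2_700, 500, 10_000):
        print(f"{provider:10} {model:22} ${monthly:,.2f}/month")
```

For this workload the sample ranks Llama 3.1 8b Instant first and GPT-5.4 last, matching their order in the table.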
Methodology
All prices come from the official API pricing pages of each provider, checked daily. The formula for a single request is:
cost = (input_tokens / 1,000,000 * input_rate) + (output_tokens / 1,000,000 * output_rate)
When the cached-input slider is above 0%, the input portion splits into cached and non-cached fractions, each multiplied by the respective rate. Models without a published cached rate use the standard input rate for both.
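A minimal sketch of the formula with the cached split, assuming the slider maps to a fraction between 0 and 1. The function and parameter names are illustrative, and the $0.25 cached rate in the usage note is a made-up figure for demonstration, not a published price.

```python
from typing import Optional


def request_cost_with_cache(
    input_tokens: int,
    output_tokens: int,
    input_rate: float,            # USD per 1M non-cached input tokens
    output_rate: float,           # USD per 1M output tokens
    cached_fraction: float = 0.0, # share of input tokens served from cache
    cached_rate: Optional[float] = None,  # USD per 1M cached input tokens
) -> float:
    """Single-request cost in USD, splitting input into cached and non-cached parts."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    # Models without a published cached rate fall back to the standard input rate.
    if cached_rate is None:
        cached_rate = input_rate
    cached_tokens = input_tokens * cached_fraction
    fresh_tokens = input_tokens - cached_tokens
    return (
        fresh_tokens / 1_000_000 * input_rate
        + cached_tokens / 1_000_000 * cached_rate
        + output_tokens / 1_000_000 * output_rate
    )
```

With 500,000 input and 100,000 output tokens at $2.50 / $15.00 per 1M, the uncached cost is $2.75. Setting `cached_fraction=0.5` with the hypothetical `cached_rate=0.25` gives $0.625 + $0.0625 + $1.50 = $2.1875.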
Frequently asked questions about AI token costs
How much do AI tokens cost in 2026?
AI token prices in 2026 range from $0.00 per million output tokens on embedding models like Embed v3 English, which list no output charge, up to $180.00 per million output tokens on flagship reasoning models like GPT-5.5 Pro. Among text-generation models in the table, output pricing starts at $0.08 per million tokens (Llama 3.1 8b Instant on Groq). Most general-purpose APIs sit in the $0.50–$15.00 per million output token range. Input tokens are typically 2–8x cheaper than output tokens, and cached input cuts input costs by another 75–90% on providers that support it.
What is the cheapest AI API in 2026?
As of 2026-06-11, the cheapest mainstream AI API is Embed v3 English at $0.10 per million input tokens and $0.00 per million output tokens. It is an embedding model, though; the cheapest text-generation entry in the table is Llama 3.1 8b Instant on Groq at $0.05 input and $0.08 output. DeepSeek and Google Gemini Flash are also extremely competitive for general workloads, while xAI Grok mini and Anthropic Claude Haiku offer the best price-to-quality on fast, low-latency requests. The calculator above ranks all 112 tracked models cheapest-first so you can see today's leader at a glance.
How do I calculate AI API costs?
Multiply your input tokens by the provider input rate, multiply your output tokens by the provider output rate, then divide each by 1,000,000. For example, 500,000 input + 100,000 output tokens on GPT-5.4 ($2.50 / $15.00 per 1M) costs (500,000 × $2.50 / 1,000,000) + (100,000 × $15.00 / 1,000,000) = $1.25 + $1.50 = $2.75 per request. Multiply by your monthly request volume for an estimated monthly bill.
How many tokens are in 1,000 words?
Roughly 1,330 tokens for English text — the OpenAI rule of thumb is 1 token ≈ 0.75 words, or about 4 characters. Code, JSON, and non-Latin scripts tokenize differently: code is usually denser (1 token ≈ 3.5 chars), and languages like Japanese or Arabic can cost 2–3x more tokens per character than English. For exact counts, use the tokenizer published by your provider (e.g., tiktoken for OpenAI, the Anthropic token counting endpoint, or Google AI Studio).
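The rule of thumb can be turned into a rough estimator. This is a heuristic sketch only (the function names are hypothetical); it rounds up to stay on the conservative side, and it is no substitute for the provider's tokenizer when exact counts matter.

```python
import math


def estimate_tokens_from_words(word_count: int, words_per_token: float = 0.75) -> int:
    """Estimate tokens for English prose using the ~0.75 words-per-token rule."""
    return math.ceil(word_count / words_per_token)


def estimate_tokens_from_chars(char_count: int, chars_per_token: float = 4.0) -> int:
    """Estimate tokens from character count; pass 3.5 for code, per the text above."""
    return math.ceil(char_count / chars_per_token)


if __name__ == "__main__":
    print(estimate_tokens_from_words(1_000))       # about 1,330 tokens
    print(estimate_tokens_from_words(2_000))       # about 2,700 tokens
    print(estimate_tokens_from_chars(3_500, 3.5))  # code at ~3.5 chars per token
```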
Are input tokens and output tokens priced the same?
No. Output tokens are almost always more expensive than input tokens, typically 2x to 8x more. For example, GPT-5.4 charges $2.50 per million input tokens vs $15.00 per million output (6x). Claude Sonnet 4.6 charges $3.00 input vs $15.00 output (5x). DeepSeek V3 sits lower at $0.27 input vs $1.10 output (about 4x), and a few models in the table are at or near parity, such as DeepSeek V4 Flash ($0.14 vs $0.28, 2x) and Ministral 8B ($0.15 for both). The output multiplier is why capping response length usually saves more per token than trimming the prompt.
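To make the multiplier concrete, the sketch below computes the output-to-input ratio and the saving from trimming the same number of tokens on each side. The helper names are hypothetical; the rates are the ones quoted in the answer above.

```python
def output_to_input_ratio(input_rate: float, output_rate: float) -> float:
    """How many times more an output token costs than an input token."""
    return output_rate / input_rate


def saving_from_trimming(tokens_trimmed: int, rate_per_million: float) -> float:
    """USD saved per request by removing `tokens_trimmed` tokens billed at the given rate."""
    return tokens_trimmed / 1_000_000 * rate_per_million


if __name__ == "__main__":
    # GPT-5.4 at $2.50 input / $15.00 output per 1M tokens.
    print(output_to_input_ratio(2.50, 15.00))
    print(saving_from_trimming(1_000, 2.50))   # trimming 1,000 prompt tokens
    print(saving_from_trimming(1_000, 15.00))  # trimming 1,000 response tokens
```

At GPT-5.4 rates, cutting 1,000 output tokens saves about $0.015 per request, six times the roughly $0.0025 saved by cutting 1,000 input tokens.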
What does cached input pricing mean?
Cached input pricing is a discount applied to prompt tokens the provider has already processed in a recent prior request, typically the system prompt, conversation history, or RAG context. OpenAI, Anthropic, and Google offer cached input rates at 10–25% of the standard input rate (a 75–90% discount). If you reuse the same long context across many calls (e.g., chat with a system prompt, agent loops, long documents), enable caching and the input portion of your bill drops sharply. The calculator includes a cached-input slider to model this.
How accurate is this AI token cost calculator?
All 112 model prices in this calculator come from the official API pricing pages of each provider, checked every few hours by our automated pipeline. The last full refresh ran on 2026-06-11. We track 10+ providers including OpenAI, Anthropic, Google, DeepSeek, xAI, Meta, Mistral, Cohere, Perplexity, and Together. If a price shown here ever differs from the provider's page, the provider's page is authoritative and we'll have it corrected within hours.
Why is OpenAI more expensive than DeepSeek for the same task?
DeepSeek (and other lower-priced challengers like Mistral, Together-hosted Llama, and Groq) run on smaller GPU clusters, take thinner margins, and in some cases serve open-weight models that carry no licensing layer. OpenAI's prices factor in brand, latency SLOs, enterprise support, broad ecosystem integrations, and continuous frontier-model R&D. For straightforward chat, summarization, or extraction, DeepSeek V3 typically delivers ~90% of GPT-5.4 quality at <10% of the cost. For complex reasoning, code generation under time pressure, or agentic workflows, the premium models still pull ahead.