Cohere API Pricing (April 2026) — Command R+, Embed v3, Rerank v3
How much does Cohere cost? Cohere API pricing spans $0.0375 to $10.00 per million tokens. Command R+ 08-2024 flagship is $2.50 / $10.00 per 1M, Command R 08-2024 is $0.15 / $0.60, and Command R7B is just $0.0375 / $0.15 per 1M. Embed v3 (English and Multilingual) is $0.10 per 1M input tokens, and Rerank v3 is priced at $2.00 per 1M tokens of search input. Cohere's focus is enterprise RAG — all Command models ship with 128K context and strong long-document recall.
Key facts about Cohere pricing
- Command R+ 08-2024 flagship at $2.50 / $10.00 per 1M tokens — matches GPT-5.4 on input, 33% cheaper on output.
- Command R 08-2024 at $0.15 / $0.60 per 1M tokens, RAG-optimized mid-tier.
- Command R7B at $0.0375 / $0.15 per 1M tokens — one of the cheapest production APIs available.
- Embed v3 English and Embed v3 Multilingual at $0.10 per 1M input tokens; multilingual covers 100+ languages.
- Rerank v3 at $2.00 per 1M tokens, a dedicated neural reranker for boosting vector-search precision.
- All Command models ship with a 128K token context window tuned for long-document RAG.
- Free trial tier with rate-limited access for prototyping; no monthly minimums on paid tiers.
- Enterprise deployments available on AWS, Azure, and Oracle, plus self-hosted options.
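As a rough sketch of how these per-million rates turn into a per-request bill, the snippet below hard-codes the prices quoted on this page. The dictionary keys and the `request_cost` helper are illustrative names (not Cohere API identifiers), and the token counts in the usage line are made up.

```python
# USD per 1M tokens as (input, output), copied from the figures quoted on this page.
# The keys are hypothetical labels chosen for this sketch, not official model IDs.
PRICES_PER_MILLION = {
    "command-r-plus": (2.50, 10.00),
    "command-r": (0.15, 0.60),
    "command-r7b": (0.0375, 0.15),
    "embed-v3": (0.10, 0.00),   # embeddings are input-only
    "rerank-v3": (2.00, 0.00),  # billed on search input only
}


def request_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Return the USD cost of one request at the listed per-million rates."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical request: 4,000 prompt tokens and 500 generated tokens on Command R+.
cost = request_cost("command-r-plus", input_tokens=4_000, output_tokens=500)
print(f"${cost:.4f}")
```

With these assumed token counts the request comes to about $0.015, of which $0.01 is input and $0.005 is output.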
How much does each Cohere model cost per million tokens?
| Model | Family | Provider | Input $/1M | Cached $/1M | Output $/1M |
|---|---|---|---|---|---|
| Command R7B | Command R | Cohere | $0.0375 | n/a | $0.15 |
| Embed v3 Multilingual | Embed | Cohere | $0.10 | n/a | $0.00 |
| Rerank v3 | Rerank | Cohere | $2.00 | n/a | $0.00 |
Why choose Cohere over OpenAI or Anthropic?
Cohere's positioning is narrower than OpenAI or Anthropic: enterprise retrieval-augmented generation. The Command, Embed, and Rerank stack is purpose-built to work together — Embed v3 feeds a vector store, Rerank v3 boosts top-K precision, and Command R/R+ generate grounded answers with inline citations. At $0.10 / 1M for embeddings, $2 / 1M for reranking, and $2.50 / $10 per 1M for flagship generation, the end-to-end RAG cost is often lower than equivalent GPT-5.4 + OpenAI embeddings pipelines.
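To make the end-to-end figure concrete, here is a minimal per-query cost sketch for an Embed, Rerank, and Command R+ pipeline at the rates quoted above. The `RagQuery` fields, the example token counts, and the assumption that the query is counted once in the Rerank input are all hypothetical; the one-off cost of embedding the corpus itself is left out.

```python
from dataclasses import dataclass

# USD per 1M tokens, as quoted on this page.
EMBED_RATE = 0.10
RERANK_RATE = 2.00
GEN_INPUT_RATE = 2.50    # Command R+ input
GEN_OUTPUT_RATE = 10.00  # Command R+ output


@dataclass
class RagQuery:
    """Hypothetical shape of one RAG request; every field is an assumption."""
    query_tokens: int
    candidates: int            # documents returned by vector search
    tokens_per_candidate: int  # average document length
    top_k: int                 # documents passed on to the generator
    answer_tokens: int


def rag_query_cost(q: RagQuery) -> dict[str, float]:
    """Break one query's USD cost into embed, rerank, and generate stages."""
    embed = q.query_tokens * EMBED_RATE / 1_000_000
    # Assumption: the query is counted once, plus every candidate document.
    rerank_tokens = q.query_tokens + q.candidates * q.tokens_per_candidate
    rerank = rerank_tokens * RERANK_RATE / 1_000_000
    prompt_tokens = q.query_tokens + q.top_k * q.tokens_per_candidate
    generate = (prompt_tokens * GEN_INPUT_RATE
                + q.answer_tokens * GEN_OUTPUT_RATE) / 1_000_000
    return {"embed": embed, "rerank": rerank, "generate": generate,
            "total": embed + rerank + generate}


example = RagQuery(query_tokens=50, candidates=100, tokens_per_candidate=300,
                   top_k=5, answer_tokens=400)
for stage, usd in rag_query_cost(example).items():
    print(f"{stage:>8}: ${usd:.6f}")
```

Under these assumed sizes the rerank stage accounts for most of the per-query total, so the number of candidates sent to Rerank is the main lever on cost.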
Command R7B at $0.0375 / $0.15 per 1M tokens is genuinely one of the cheapest production APIs available in April 2026 — 4x cheaper than GPT-5.4 nano on input. For high-volume classification, routing, or lightweight summarization it's a strong default. Cohere also offers enterprise deployments on AWS, Azure, and Oracle with BYOK and private endpoints, which matters in regulated industries.
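For a sense of scale on high-volume workloads, the sketch below projects a monthly Command R7B bill from the rates above. The request volume and token counts are invented for illustration, and `monthly_cost` is a hypothetical helper.

```python
# Command R7B rates in USD per 1M tokens, as quoted on this page.
R7B_INPUT_RATE = 0.0375
R7B_OUTPUT_RATE = 0.15


def monthly_cost(requests_per_day: int, input_tokens: int,
                 output_tokens: int, days: int = 30) -> float:
    """Project a monthly USD bill for a fixed-size request repeated daily."""
    per_request = (input_tokens * R7B_INPUT_RATE
                   + output_tokens * R7B_OUTPUT_RATE) / 1_000_000
    return per_request * requests_per_day * days


# Hypothetical classifier: 1M requests per day, 200-token prompt, 5-token label.
print(f"${monthly_cost(1_000_000, 200, 5):,.2f} per month")
```

At those assumed volumes the projection is roughly $247.50 per month, almost all of it input tokens.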
When Cohere is not the right choice
Command R+ trails GPT-5.4 and Claude 3.5 Sonnet on frontier reasoning, code generation, and tool use. If your workload is agentic or code-heavy, OpenAI or Anthropic is still the better default. Cohere also doesn't ship a vision model in the Command family, so multimodal workloads need a different provider.
Price History
Track how Cohere API pricing has changed over time.
No history yet for any of the six tracked models (Command R+ 08-2024, Command R 08-2024, Command R7B, Embed v3 English, Embed v3 Multilingual, Rerank v3): the first snapshot was taken on 2026-04-16. Price history tracking started in April 2026, and price trends will appear after the first price change is detected.
Frequently asked questions
How much does Cohere Command R+ cost?
Command R+ 08-2024 is priced at $2.50 per million input tokens and $10.00 per million output tokens. That is roughly the same as GPT-5.4 on input and 33% cheaper on output, positioning Command R+ as a competitive flagship option — especially for retrieval-augmented generation where Cohere is purpose-built.
Does Cohere have a free tier?
Yes. Cohere offers a generous free trial tier with rate-limited access to Command, Embed, and Rerank endpoints for prototyping. Production pay-as-you-go pricing kicks in once you exceed trial limits or request production keys — there are no monthly minimums.
How does Cohere compare to OpenAI?
Cohere is narrower in focus than OpenAI but often better at what it does. Command R+ at $2.50 / $10.00 per 1M tokens matches GPT-5.4 on input cost and undercuts it by 33% on output. Embed v3 at $0.10 per 1M tokens is about 23% cheaper than OpenAI text-embedding-3-large ($0.13). Rerank v3 has no OpenAI counterpart: OpenAI does not ship a dedicated reranker.
What are Cohere's embedding model prices?
Embed v3 English and Embed v3 Multilingual are both priced at $0.10 per million input tokens. The multilingual variant supports over 100 languages with near-English quality. Embeddings are input-only, so there is no output token cost.
What's Rerank v3 priced at?
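Rerank v3 costs $2.00 per million tokens of search queries and documents combined. Reranking is typically used after vector retrieval to boost top-K precision: you send 100-1000 candidate documents per query and Rerank scores them. At $2 / 1M, cost scales with the total tokens scored. A 10-query workload with 100 candidate documents each costs about $0.02 only if the documents average roughly 10 tokens; at 500 tokens per document the same workload is about 500K tokens, or roughly $1.00.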
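A small sketch of how the Rerank bill moves with document length, assuming the $2.00 per 1M token rate quoted on this page. The `rerank_cost` helper and the document lengths are hypothetical, and query tokens default to zero to keep the arithmetic easy to follow.

```python
RERANK_RATE_PER_MILLION = 2.00  # USD per 1M tokens, as quoted on this page


def rerank_cost(queries: int, docs_per_query: int, tokens_per_doc: int,
                tokens_per_query: int = 0) -> float:
    """USD cost of reranking, counting each query once plus all its documents."""
    tokens = queries * (tokens_per_query + docs_per_query * tokens_per_doc)
    return tokens * RERANK_RATE_PER_MILLION / 1_000_000


# Same 10-query, 100-document workload at three assumed document lengths.
for tokens_per_doc in (10, 100, 500):
    usd = rerank_cost(queries=10, docs_per_query=100, tokens_per_doc=tokens_per_doc)
    print(f"{tokens_per_doc:>3} tokens/doc -> ${usd:.2f}")
```

The three cases work out to $0.02, $0.20, and $1.00, so average document length matters as much as candidate count when budgeting for reranking.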
What's Command R+'s context window?
Command R+ 08-2024 supports a 128K token context window, matching GPT-5.4. Command R 08-2024 also ships with 128K context. Cohere tunes these models aggressively for long-document RAG, with strong needle-in-a-haystack recall at depths up to 100K tokens.
Methodology
Pricing sourced from https://cohere.com/pricing. All prices are expressed in USD per 1 million tokens. We track 6 Cohere models covering Command (generation), Embed (vectors), and Rerank (neural reranking) endpoints.