Cohere API Pricing (April 2026) — Command R+, Embed v3, Rerank v3

How much does Cohere cost? Cohere API pricing spans $0.0375 to $10.00 per million tokens. The flagship Command R+ 08-2024 is $2.50 input / $10.00 output per 1M tokens, Command R 08-2024 is $0.15 / $0.60, and Command R7B is $0.0375 / $0.15. Embed v3 (English and Multilingual) is $0.10 per 1M input tokens, and Rerank v3 is priced at $2.00 per 1M tokens of search input. Cohere's focus is enterprise RAG: all Command models ship with 128K context and strong long-document recall.

Key facts about Cohere pricing

  • Command R+ 08-2024 flagship at $2.50 / $10.00 per 1M tokens — matches GPT-5.4 on input, 33% cheaper on output.
  • Command R 08-2024 at $0.15 / $0.60 per 1M tokens, RAG-optimized mid-tier.
  • Command R7B at $0.0375 / $0.15 per 1M tokens — one of the cheapest production APIs available.
  • Embed v3 English and Embed v3 Multilingual at $0.10 per 1M input tokens; multilingual covers 100+ languages.
  • Rerank v3 at $2.00 per 1M tokens — unique neural reranker for boosting vector-search precision.
  • All Command models ship with a 128K token context window tuned for long-document RAG.
  • Free developer tier with generous rate limits; no monthly minimums on paid tiers.
  • Enterprise deployments available on AWS, Azure, Oracle, and self-hosted.
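
The per-token prices listed above translate into a request cost with simple arithmetic. The following is a minimal sketch, not an official calculator: the dictionary keys are illustrative labels chosen for this example (not necessarily the model IDs the API expects), and the function name `generation_cost` is hypothetical.

```python
from decimal import Decimal

# USD per 1M tokens as quoted in the list above: (input, output).
# Keys are illustrative labels for this sketch, not confirmed API model IDs.
PRICES_PER_MILLION = {
    "command-r-plus-08-2024": (Decimal("2.50"), Decimal("10.00")),
    "command-r-08-2024": (Decimal("0.15"), Decimal("0.60")),
    "command-r7b": (Decimal("0.0375"), Decimal("0.15")),
}

MILLION = Decimal(1_000_000)


def generation_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one generation call at the quoted prices."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
    for name in PRICES_PER_MILLION:
        print(name, generation_cost(name, 2_000, 500))
```

Decimal is used instead of float so that very small per-request amounts do not pick up binary rounding noise when summed over many calls.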

How much does each Cohere model cost per million tokens?

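  • Command R7B (family: Command R, provider: cohere): Input $0.0375, Cached not listed, Output $0.15
  • Embed v3 Multilingual (family: Embed, provider: cohere): Input $0.10, Cached not listed, Output $0.00
  • Rerank v3 (family: Rerank, provider: cohere): Input $2.00, Cached not listed, Output $0.00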
Showing 3 of 6 models · USD per 1M tokens

Why choose Cohere over OpenAI or Anthropic?

Cohere's positioning is narrower than OpenAI or Anthropic: enterprise retrieval-augmented generation. The Command, Embed, and Rerank stack is purpose-built to work together — Embed v3 feeds a vector store, Rerank v3 boosts top-K precision, and Command R/R+ generate grounded answers with inline citations. At $0.10 / 1M for embeddings, $2 / 1M for reranking, and $2.50 / $10 per 1M for flagship generation, the end-to-end RAG cost is often lower than equivalent GPT-5.4 + OpenAI embeddings pipelines.
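
To see how the three stages add up for a single RAG query, here is a rough sketch using the prices quoted in the paragraph above, with reranking billed per token as this page states. The token counts in the usage example are hypothetical, and the function name `rag_query_cost` is invented for illustration.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# USD per 1M tokens, as quoted on this page.
EMBED_PRICE = Decimal("0.10")        # Embed v3, input only
RERANK_PRICE = Decimal("2.00")       # Rerank v3, query + documents
GEN_INPUT_PRICE = Decimal("2.50")    # Command R+ input
GEN_OUTPUT_PRICE = Decimal("10.00")  # Command R+ output


def rag_query_cost(embed_tokens: int, rerank_tokens: int,
                   prompt_tokens: int, answer_tokens: int) -> dict:
    """Break one RAG query into per-stage USD costs plus a total."""
    costs = {
        "embed": EMBED_PRICE * embed_tokens / MILLION,
        "rerank": RERANK_PRICE * rerank_tokens / MILLION,
        "generate": (GEN_INPUT_PRICE * prompt_tokens
                     + GEN_OUTPUT_PRICE * answer_tokens) / MILLION,
    }
    costs["total"] = sum(costs.values(), Decimal(0))
    return costs


if __name__ == "__main__":
    # Hypothetical query: 20-token question, 5,000 tokens reranked,
    # 4,000-token grounded prompt, 300-token answer.
    for stage, usd in rag_query_cost(20, 5_000, 4_000, 300).items():
        print(f"{stage}: ${usd}")
```

Under these assumed token counts, generation and reranking dominate the bill, while embedding the query is a negligible share.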

Command R7B at $0.0375 / $0.15 per 1M tokens is genuinely one of the cheapest production APIs available in April 2026 — 4x cheaper than GPT-5.4 nano on input. For high-volume classification, routing, or lightweight summarization it's a strong default. Cohere also offers enterprise deployments on AWS, Azure, and Oracle with BYOK and private endpoints, which matters in regulated industries.

When Cohere is not the right choice

Command R+ trails GPT-5.4 and Claude 3.5 Sonnet on frontier reasoning, code generation, and tool use. If your workload is agentic or code-heavy, OpenAI or Anthropic is still the better default. Cohere also doesn't ship a vision model in the Command family, so multimodal workloads need a different provider.

Price History

Track how Cohere API pricing has changed over time.

No history yet for any of the six tracked models (Command R+ 08-2024, Command R 08-2024, Command R7B, Embed v3 English, Embed v3 Multilingual, Rerank v3); first snapshot 2026-04-16. Price trends will appear here as data accumulates.

Price history tracking started April 2026. Charts will appear after the first price change is detected.

Frequently asked questions

How much does Cohere Command R+ cost?

Command R+ 08-2024 is priced at $2.50 per million input tokens and $10.00 per million output tokens. That is roughly the same as GPT-5.4 on input and 33% cheaper on output, positioning Command R+ as a competitive flagship option — especially for retrieval-augmented generation where Cohere is purpose-built.

Does Cohere have a free tier?

Yes. Cohere offers a generous free trial tier with rate-limited access to Command, Embed, and Rerank endpoints for prototyping. Production pay-as-you-go pricing kicks in once you exceed trial limits or request production keys — there are no monthly minimums.

How does Cohere compare to OpenAI?

Cohere is narrower in focus than OpenAI but often better at what it does. Command R+ at $2.50 / $10.00 per 1M tokens matches GPT-5.4 on input cost and undercuts it by 33% on output. Embed v3 at $0.10 per 1M tokens is about 23% cheaper than OpenAI text-embedding-3-large ($0.13). Rerank v3 is unique: OpenAI does not ship a dedicated reranker.

What are Cohere's embedding model prices?

Embed v3 English and Embed v3 Multilingual are both priced at $0.10 per million input tokens. The multilingual variant supports over 100 languages with near-English quality. Embeddings are input-only, so there is no output token cost.
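
Because embeddings are billed on input tokens only, the one-time cost of indexing a corpus is just total tokens times the rate. A small sketch, assuming the $0.10 per 1M price quoted above; the corpus size and average document length are hypothetical, as is the function name `corpus_embedding_cost`.

```python
from decimal import Decimal

EMBED_PRICE_PER_MILLION = Decimal("0.10")  # USD, as quoted on this page
MILLION = Decimal(1_000_000)


def corpus_embedding_cost(num_documents: int, avg_tokens_per_document: int) -> Decimal:
    """Estimate the USD cost of embedding a whole corpus once."""
    total_tokens = num_documents * avg_tokens_per_document
    return EMBED_PRICE_PER_MILLION * total_tokens / MILLION


if __name__ == "__main__":
    # Hypothetical corpus: 1,000,000 documents averaging 500 tokens each.
    print(corpus_embedding_cost(1_000_000, 500))
```

Re-embedding after a model change or a chunking change incurs the same cost again, so it can be worth estimating before committing to a chunk size.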

What's Rerank v3 priced at?

Rerank v3 costs $2.00 per million tokens of search queries + documents combined. Reranking is typically used after vector retrieval to boost top-K precision: you send 100-1000 candidate documents per query and Rerank scores them. At $2 / 1M tokens, the cost depends on how many tokens are scored: $0.02 covers about 10,000 tokens of queries plus documents, so the bill for a 10-query / 100-doc-each workload scales with document length.

What's Command R+'s context window?

Command R+ 08-2024 supports a 128K token context window, matching GPT-5.4. Command R 08-2024 also ships with 128K context. Cohere tunes these models aggressively for long-document RAG, with strong needle-in-a-haystack recall at depths up to 100K tokens.

Methodology

Pricing sourced from https://cohere.com/pricing. All prices are expressed in USD per 1 million tokens. We track 6 Cohere models covering Command (generation), Embed (vectors), and Rerank (neural reranking) endpoints.