Cohere API Pricing (April 2026) — Command R+, Embed v3, Rerank v3

Last updated: 2026-04-15

How much does Cohere cost? Cohere API pricing spans $0.0375 to $10.00 per million tokens. Command R+ 08-2024 flagship is $2.50 / $10.00 per 1M, Command R 08-2024 is $0.15 / $0.60, and Command R7B is just $0.0375 / $0.15 per 1M. Embed v3 (English and Multilingual) is $0.10 per 1M input tokens, and Rerank v3 is priced at $2.00 per 1M tokens of search input. Cohere's focus is enterprise RAG — all Command models ship with 128K context and strong long-document recall.

Key facts about Cohere pricing

Command R+ 08-2024 flagship at $2.50 / $10.00 per 1M tokens — matches GPT-5.4 on input, 33% cheaper on output.
Command R 08-2024 at $0.15 / $0.60 per 1M tokens, RAG-optimized mid-tier.
Command R7B at $0.0375 / $0.15 per 1M tokens — one of the cheapest production APIs available.
Embed v3 English and Embed v3 Multilingual at $0.10 per 1M input tokens; multilingual covers 100+ languages.
Rerank v3 at $2.00 per 1M tokens — unique neural reranker for boosting vector-search precision.
All Command models ship with a 128K token context window tuned for long-document RAG.
Free developer tier with generous rate limits; no monthly minimums on paid tiers.
Enterprise deployments available on AWS, Azure, Oracle, and self-hosted (Cohere for AI).

How much does each Cohere model cost per million tokens?

Show legacy models

Model↕	Provider↕	Input $/1M↑	Cached $/1M↕	Output $/1M↕
Command R7B Command R	cohere	$0.0375	—	$0.15
Embed v3 Multilingual Embed	cohere	$0.10	—	$0.00
Rerank v3 Rerank	cohere	$2.00	—	$0.00

Showing 3 of 6 models · All prices in USD per 1M tokensLast synced: 2026-04-15

Command R7B
Command R
cohere
Input
$0.0375
Cached
—
Output
$0.15
Embed v3 Multilingual
Embed
cohere
Input
$0.10
Cached
—
Output
$0.00
Rerank v3
Rerank
cohere
Input
$2.00
Cached
—
Output
$0.00

Showing 3 of 6 models · USD per 1M tokens

Last synced: 2026-04-15

Why choose Cohere over OpenAI or Anthropic?

Cohere's positioning is narrower than OpenAI or Anthropic: enterprise retrieval-augmented generation. The Command, Embed, and Rerank stack is purpose-built to work together — Embed v3 feeds a vector store, Rerank v3 boosts top-K precision, and Command R/R+ generate grounded answers with inline citations. At $0.10 / 1M for embeddings, $2 / 1M for reranking, and $2.50 / $10 per 1M for flagship generation, the end-to-end RAG cost is often lower than equivalent GPT-5.4 + OpenAI embeddings pipelines.

Command R7B at $0.0375 / $0.15 per 1M tokens is genuinely one of the cheapest production APIs available in April 2026 — 4x cheaper than GPT-5.4 nano on input. For high-volume classification, routing, or lightweight summarization it's a strong default. Cohere also offers enterprise deployments on AWS, Azure, and Oracle with BYOK and private endpoints, which matters in regulated industries.

When Cohere is not the right choice

Command R+ trails GPT-5.4 and Claude 3.5 Sonnet on frontier reasoning, code generation, and tool use. If your workload is agentic or code-heavy, OpenAI or Anthropic are still the better default. Cohere also doesn't ship a vision model in the Command family, so multimodal workloads need a different provider.