Xgenious
developer

AI Token Cost Calculator — Compare API Pricing for GPT, Claude & Gemini

AI API costs are charged per token, a unit of roughly 4 characters or 0.75 words. Input pricing varies by a factor of about 200× across models: Gemini 1.5 Flash costs $0.075 per million input tokens while Claude Opus 4 costs $15 per million. This calculator computes per-request, daily, and monthly costs for 13 major models including GPT-4o, Claude, Gemini, Llama, and Mistral.

Free | No Signup | Runs in Browser | Data Never Uploaded | Popular tool


Calculate API costs for GPT-4o, Claude, Gemini, and other LLMs by token count.

  • Per-request, daily, and monthly cost estimates in one view
  • 13 models across OpenAI, Anthropic, Google, DeepSeek, and Meta
  • Separate input-token and output-token cost breakdown
  • Side-by-side monthly cost comparison across all models, cheapest first
  • Built-in token estimator — paste text to approximate token count
  • Custom pricing override for negotiated or cached-input rates
  • Client-side only — no data uploaded or stored
Features

Everything you need in one AI Token Cost Calculator

GPT, Claude & Gemini cost calculator

Covers 13 models across five providers — from ultra-cheap nano tiers to flagship models — so you can price any LLM workload in one place.

Input vs output cost breakdown

Output tokens cost several times more than input. The tool splits the two so you can see what is really driving your bill at a glance.

Side-by-side model comparison

Every model is ranked by monthly cost for your exact workload, cheapest first, turning a pricing guess into a data-driven architecture decision.

Built-in token estimator

Paste representative text to approximate its token count at roughly 4 characters per token, then load it straight into the calculator.
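The 4-characters-per-token rule of thumb can be sketched in a few lines. This is only an illustration of the heuristic described above, not the tool's actual implementation; the function name `estimate_tokens` and the choice to round up are assumptions.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb from the text; real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Approximate the token count of `text` at ~4 characters per token."""
    if not text:
        return 0
    # Rounding up (an assumption) means any non-empty text counts as >= 1 token.
    return math.ceil(len(text) / CHARS_PER_TOKEN)


sample = "Summarize the following support ticket in two sentences."
print(estimate_tokens(sample))
```

Treat the result as a budgeting estimate only; code, non-English text, and unusual formatting can tokenize quite differently.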

How It Works

How to use AI Token Cost Calculator

01

Select a model

Choose your AI model from the grouped list. Models are organized by provider — OpenAI, Anthropic, Google, Meta, and Mistral.

02

Enter token counts

Input the average number of input tokens (your prompt) and output tokens (the model's response) per API request.

03

Set daily request volume

Enter how many API calls you expect per day. The tool calculates per-request, daily, and 30-day monthly costs automatically.
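The arithmetic behind these three steps is simple enough to sketch. The example below assumes rates are quoted in USD per 1M tokens and uses the GPT-4o list prices from the pricing table on this page; the names `CostEstimate` and `estimate_cost`, and the request volume, are hypothetical.

```python
from dataclasses import dataclass

DAYS_PER_MONTH = 30  # the 30-day month the tool uses


@dataclass
class CostEstimate:
    input_cost: float   # USD per request spent on input tokens
    output_cost: float  # USD per request spent on output tokens

    @property
    def per_request(self) -> float:
        return self.input_cost + self.output_cost


def estimate_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Per-request cost; rates are USD per 1M tokens."""
    return CostEstimate(
        input_cost=input_tokens * input_rate / 1_000_000,
        output_cost=output_tokens * output_rate / 1_000_000,
    )


# GPT-4o: $2.50 input, $10.00 output per 1M tokens
est = estimate_cost(1_000, 500, 2.50, 10.00)
requests_per_day = 10_000  # hypothetical volume
daily = est.per_request * requests_per_day
monthly = daily * DAYS_PER_MONTH

print(f"per request: ${est.per_request:.4f}")  # per request: $0.0075
print(f"daily:       ${daily:,.2f}")           # daily:       $75.00
print(f"monthly:     ${monthly:,.2f}")         # monthly:     $2,250.00
```

In this example the 500 output tokens cost twice as much as the 1,000 input tokens, which is why the input and output costs are kept as separate fields.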

Pricing Comparison

AI Model Pricing Comparison (per 1M tokens, May 2026)

| Model             | Provider  | Input $/1M | Output $/1M | Best For                           |
|-------------------|-----------|------------|-------------|------------------------------------|
| GPT-5.5           | OpenAI    | $5.00      | $30.00      | Current OpenAI flagship            |
| GPT-5             | OpenAI    | $1.25      | $10.00      | Best value in GPT-5 family         |
| GPT-5 mini        | OpenAI    | $0.25      | $2.00       | Affordable GPT-5 tier              |
| GPT-5 nano        | OpenAI    | $0.05      | $0.40       | Ultra-cheap, high-volume tasks     |
| GPT-4o            | OpenAI    | $2.50      | $10.00      | Multimodal, proven general-purpose |
| Claude Sonnet 4   | Anthropic | $3.00      | $15.00      | Code, long-context reasoning       |
| Claude Haiku 4    | Anthropic | $0.80      | $4.00       | Fast, affordable Anthropic tier    |
| Gemini 3.5 Flash  | Google    | $1.50      | $9.00       | Latest Gemini, balanced perf       |
| Gemini 3.1 Lite   | Google    | $0.25      | $1.50       | Cheapest Gemini 3 tier             |
| DeepSeek V4 Pro   | DeepSeek  | $1.74      | $3.48       | Strong reasoning, low output cost  |
| DeepSeek V4 Flash | DeepSeek  | $0.14      | $0.28       | Ultra-cheap, fast inference        |
| Llama 3.3 70B     | Meta/Groq | $0.59      | $0.79       | Open-weight, self-hostable         |

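A cheapest-first ranking for one workload can be reproduced from the table with a short script. The sketch below copies five rows from the table; the workload (1,000 input tokens, 500 output tokens, 10,000 requests per day) and the function names are hypothetical, and the tool itself may compute this differently.

```python
# (input $/1M, output $/1M), copied from five rows of the pricing table
PRICES = {
    "GPT-5": (1.25, 10.00),
    "GPT-5 nano": (0.05, 0.40),
    "Claude Sonnet 4": (3.00, 15.00),
    "DeepSeek V4 Flash": (0.14, 0.28),
    "Llama 3.3 70B": (0.59, 0.79),
}


def monthly_cost(rates, input_tokens, output_tokens, requests_per_day, days=30):
    """Monthly USD cost for one model given its (input, output) rates."""
    input_rate, output_rate = rates
    per_request = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return per_request * requests_per_day * days


def rank_models(input_tokens, output_tokens, requests_per_day):
    """Return (model, monthly cost) pairs sorted cheapest first."""
    costs = {
        name: monthly_cost(rates, input_tokens, output_tokens, requests_per_day)
        for name, rates in PRICES.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


for name, cost in rank_models(1_000, 500, 10_000):
    print(f"{name:<18} ${cost:>10,.2f}")
```

For this workload the order comes out as GPT-5 nano ($75), DeepSeek V4 Flash ($84), Llama 3.3 70B ($295.50), GPT-5 ($1,875), and Claude Sonnet 4 ($3,150) per month.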
Troubleshooting

How to avoid common cost-estimation mistakes

Most inaccurate API cost estimates come from a small set of mistakes. Each one is listed below with a typical example and the fix.

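Treating input and output tokens as equal cost

Wrong: 500 tokens in + 500 tokens out × same rate

Output tokens usually cost more than input tokens. In the pricing table on this page, the output rate ranges from about 1.3× the input rate (Llama 3.3 70B) to 8× (GPT-5, GPT-5 mini, GPT-5 nano). Enter input and output counts separately so the calculator applies the correct rate to each.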

Equating word count to token count

Wrong: "100 words = 100 tokens"

100 English words ≈ 130 tokens due to subword tokenization. Use the built-in estimator or a provider tokenizer (OpenAI tiktoken) for accurate token counts before scaling.

Ignoring system prompt tokens

Wrong: Only counting user message tokens

Every API request includes your system prompt. Count its tokens and add them to the input count — a 500-word system prompt adds ~650 tokens per request, compounding at scale.
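To see how a system prompt compounds at scale, here is a rough sketch using the ~1.3 tokens-per-word ratio quoted on this page. The request volume and the choice of Claude Sonnet 4's $3.00 input rate are illustrative assumptions, and `system_prompt_monthly_cost` is a hypothetical helper name.

```python
TOKENS_PER_WORD = 1.3  # ~130 tokens per 100 English words, per the text


def system_prompt_monthly_cost(prompt_words, requests_per_day, input_rate, days=30):
    """Monthly USD spent only on resending the system prompt.

    input_rate is USD per 1M input tokens.
    """
    tokens_per_request = prompt_words * TOKENS_PER_WORD
    return tokens_per_request * input_rate / 1_000_000 * requests_per_day * days


# 500-word prompt (~650 tokens), 10,000 requests/day (hypothetical),
# Claude Sonnet 4 input rate of $3.00 per 1M tokens
cost = system_prompt_monthly_cost(500, 10_000, 3.00)
print(f"${cost:,.2f} per month")  # $585.00 per month
```

That overhead is paid before any user message or model response is counted, so leaving it out understates the input side of the bill.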

Missing batch API discount

Wrong: Budgeting at real-time list prices for async work

OpenAI Batch API and Anthropic batch mode cost ~50% less than real-time API for offline or async workloads. Select the batch variant in the model list if available.

Comparing models at different context lengths

Wrong: Model A at 500 tokens vs Model B at 4,000 tokens

Use the same input and output token counts when comparing models. Cost scales linearly with tokens, but each model has a different input-to-output price ratio, so a model that is cheapest for a short, output-heavy request may not be cheapest for a long, input-heavy one.
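A small sketch shows why the token mix matters when comparing models. It uses the GPT-5 and DeepSeek V4 Pro rates from the pricing table on this page; the two workloads and the function names are hypothetical.

```python
# (input $/1M, output $/1M) from the pricing table
PAIR = {
    "GPT-5": (1.25, 10.00),
    "DeepSeek V4 Pro": (1.74, 3.48),
}


def per_request_cost(model, input_tokens, output_tokens):
    """USD cost of one request for the given model."""
    input_rate, output_rate = PAIR[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def cheapest(input_tokens, output_tokens):
    """Name of the cheaper model for this token mix."""
    return min(PAIR, key=lambda m: per_request_cost(m, input_tokens, output_tokens))


print(cheapest(500, 500))    # DeepSeek V4 Pro (balanced request)
print(cheapest(4_000, 100))  # GPT-5 (long prompt, short answer)
```

The winner flips because GPT-5 has the lower input rate while DeepSeek V4 Pro has the lower output rate, so a comparison is only meaningful at the token counts the workload will actually use.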

Ignoring prompt caching savings

Wrong: Paying full input price for repeated system prompts

Anthropic prompt caching and OpenAI cached tokens reduce cost by 50–90% for large static contexts that repeat across requests. Apply caching before optimizing elsewhere.
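One way to budget for caching is to compute a blended input rate and enter it as a custom price. The sketch below is a simplification under stated assumptions: the cached fraction and discount are hypothetical inputs, the helper name `effective_input_rate` is made up for illustration, and any cache-write surcharge a provider applies is ignored.

```python
def effective_input_rate(list_rate, cached_fraction, cache_discount):
    """Blended USD per 1M input tokens when part of the prompt is read from cache.

    list_rate:       normal input price, USD per 1M tokens
    cached_fraction: share of input tokens served from cache (0 to 1)
    cache_discount:  discount on cached tokens (0.9 means 90% off)
    """
    if not 0 <= cached_fraction <= 1 or not 0 <= cache_discount <= 1:
        raise ValueError("cached_fraction and cache_discount must be in [0, 1]")
    cached_rate = list_rate * (1 - cache_discount)
    return (1 - cached_fraction) * list_rate + cached_fraction * cached_rate


# Hypothetical: 80% of input tokens are a static prefix, cached reads are
# 90% off, and the list input rate is $3.00 per 1M tokens.
rate = effective_input_rate(3.00, 0.80, 0.90)
print(f"${rate:.2f} per 1M input tokens")  # $0.84 per 1M input tokens
```

Check the provider's current caching terms before relying on a figure like this, since discounts, minimum prompt sizes, and cache lifetimes differ between providers.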

FAQ

Frequently asked questions

What is a token?

A token is the basic unit of text that AI language models process. In English, one token is approximately 4 characters or 0.75 words. "Hello world" is 2 tokens; a typical paragraph of 100 words is about 130 tokens. Tokenizers differ between providers: OpenAI uses o200k_base for GPT-4o (cl100k_base was used for GPT-4 and GPT-3.5 Turbo), and Anthropic uses its own tokenizer. Both providers offer free token-counting tools, so you can count exact tokens for your specific text before committing to API costs at scale.
