Free tool

AI Cost Calculator.

Pick a model, enter your usage, and see the bill before you ship. Covers Claude, GPT-5, and Gemini, with input, output, and prompt caching all factored in.

Prices reflect public list pricing as of 2026 and are approximations. Always check provider pricing pages before committing to a budget.

Inputs

Daily active users: 500
People who use your feature each day
Messages per user per day: 8
Avg API calls each active user triggers
Avg input tokens / message: 1,500
Including system prompt, RAG chunks, conversation history
Avg output tokens / message: 400
Length of the model's reply
Prompt cache hit rate: 60%
System prompts + RAG chunks often cache. 50-80% is realistic.
Estimated monthly cost: $968
with Claude Sonnet 4.6 · 500 DAU
Per message: $0.0081
Per user / month: $1.94
Per day: $32.28
Per year: $11,782

Cost composition (per message)

Input (cached): $0.0012
Input (uncached): $0.0008
Output: $0.0060

3 ways to cut this number in half

  1. Cache aggressively. System prompts and retrieved chunks rarely change, and cached input tokens are typically billed at about a tenth of the normal input price. Pushing the cache hit rate to 70%+ cuts input cost by more than 60%.
  2. Use a smaller model for routing. Run a cheap model first to decide whether the query needs the flagship. Most queries don't.
  3. Cap output tokens. Concise prompts produce concise answers. Cutting output from 400 to 200 tokens often keeps the same quality at half the output cost.
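
To get a feel for how much each of the three levers moves the per-message cost, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name, the flagship prices ($3 and $15 per 1M input and output tokens), the hypothetical small-model prices, the 30% escalation rate, and the rule that cached input costs one tenth of the input price. Substitute current numbers from your provider's pricing page.

```python
def per_message_cost(input_tokens, output_tokens, cache_hit_rate,
                     input_price, output_price, cache_multiplier=0.1):
    """USD cost of one message. Prices are USD per 1M tokens (assumed values)."""
    # Cached tokens are billed at cache_multiplier times the input price.
    effective_input = input_tokens * (1 - cache_hit_rate * (1 - cache_multiplier))
    return (effective_input * input_price + output_tokens * output_price) / 1_000_000


# Placeholder prices, not confirmed provider pricing.
FLAGSHIP = {"input_price": 3.0, "output_price": 15.0}
SMALL = {"input_price": 1.0, "output_price": 5.0}  # hypothetical cheaper model

baseline = per_message_cost(1500, 400, 0.60, **FLAGSHIP)

# Lever 1: raise the cache hit rate from 60% to 80%.
more_caching = per_message_cost(1500, 400, 0.80, **FLAGSHIP)

# Lever 2: run the small model on every message, escalate a fraction to the flagship.
ESCALATION_RATE = 0.30  # assumed share of messages that still need the flagship
small_answer = per_message_cost(1500, 400, 0.60, **SMALL)
routed = small_answer + ESCALATION_RATE * baseline

# Lever 3: cap output at 200 tokens instead of 400.
shorter_output = per_message_cost(1500, 200, 0.60, **FLAGSHIP)

for name, cost in [("baseline", baseline), ("more caching", more_caching),
                   ("routing", routed), ("shorter output", shorter_output)]:
    saving = 1 - cost / baseline
    print(f"{name:15s} ${cost:.5f} per message ({saving:.0%} saved)")
```

Because output tokens make up most of the per-message cost in the default scenario, the output cap and routing tend to move the total more than extra caching does under these placeholder prices. Combining levers is how the total gets close to half.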

How this calculator works

Per-message cost = (input tokens × input price) + (output tokens × output price). We multiply by your messages per user per day and your DAU to get daily spend, then project monthly and annual spend from that. If you set a prompt cache hit rate, that share of the input tokens is billed at the provider's cache-hit price (typically about one tenth of the regular input price) instead of the full input price.
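
As a rough sketch of the arithmetic described above, the following Python reproduces the default scenario. The `Pricing` class, the function names, and the specific prices ($3 and $15 per 1M input and output tokens, cached input at one tenth of the input price) are assumptions chosen because they reproduce the totals shown on this page; they are not confirmed provider pricing. A 30-day month and a 365-day year are likewise assumed, and any surcharge for writing to the cache is ignored.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    input_per_mtok: float               # USD per 1M uncached input tokens (assumed)
    output_per_mtok: float              # USD per 1M output tokens (assumed)
    cache_read_multiplier: float = 0.1  # assumed: cached input billed at 10% of input price


def cost_per_message(input_tokens, output_tokens, cache_hit_rate, pricing):
    """Return the USD cost of a single message."""
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    input_cost = (
        uncached * pricing.input_per_mtok
        + cached * pricing.input_per_mtok * pricing.cache_read_multiplier
    ) / 1_000_000
    output_cost = output_tokens * pricing.output_per_mtok / 1_000_000
    return input_cost + output_cost


def project_spend(dau, messages_per_user_per_day, per_message,
                  days_per_month=30, days_per_year=365):
    """Scale a per-message cost up to daily, monthly, and annual spend."""
    daily = dau * messages_per_user_per_day * per_message
    return {
        "daily": daily,
        "monthly": daily * days_per_month,
        "annual": daily * days_per_year,
        "per_user_month": daily * days_per_month / dau,
    }


pricing = Pricing(input_per_mtok=3.0, output_per_mtok=15.0)
per_message = cost_per_message(1500, 400, 0.60, pricing)
spend = project_spend(dau=500, messages_per_user_per_day=8, per_message=per_message)

print(f"per message: ${per_message:.4f}")
for label, value in spend.items():
    print(f"{label}: ${value:,.2f}")
```

With the default inputs this gives about $0.0081 per message, $32.28 per day, $968 per month, $1.94 per user per month, and $11,782 per year, matching the estimate shown above.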

What this calculator does NOT include

  • Image, audio, or video token costs (those are priced separately by every provider)
  • Fine-tuning training costs
  • Vector database / embedding costs (roughly $0.02 per 1M tokens embedded for smaller embedding models)
  • Your hosting bill (typically <5% of inference cost for serverless deployments)

Pricing sources

Numbers approximate public list prices as of 2026. Authoritative sources: Anthropic, OpenAI, Google.