Free tool

AI Cost Calculator.

Pick a model, enter your usage, and see the bill before you ship. Covers Claude, GPT-5, and Gemini, with input, output, and prompt caching all factored in.

Prices reflect public list pricing as of 2026 and are approximations. Always check provider pricing pages before committing to a budget.

Inputs

Model: Claude Sonnet 4.6

Daily active users: 500

People who use your feature each day

Messages per user per day: 8

Average number of API calls each active user triggers per day

Avg input tokens / message: 1,500

Including system prompt, RAG chunks, conversation history

Avg output tokens / message: 400

Length of the model's reply

Prompt cache hit rate: 60%

System prompts and RAG chunks often cache well. A hit rate of 50-80% is realistic.

Estimated monthly cost

$968 with Claude Sonnet 4.6 · 500 DAU

Per message: $0.0081
Per user / month: $1.94
Per day: $32.28
Per year: $11,782

Cost composition (per message)

Input (cached): $0.0012
Input (uncached): $0.0008
Output: $0.0060

3 ways to cut this number in half

Cache aggressively. System prompts and retrieved chunks rarely change. Push your cache hit rate to 70% or higher: cached input tokens can cost up to 90% less than uncached ones.
Use a smaller model for routing. Run a cheap model first to decide whether the query needs the flagship. Most queries don't.
Cap output tokens. Concise prompts produce concise answers. Cutting output from 400 to 200 tokens often keeps the same quality at half the output cost.
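
To compare these three levers on your own workload, a rough what-if sketch like the one below can help. Everything about pricing here is an assumption, not something this page states: $3 and $15 per 1M input and output tokens for the flagship, $1 and $5 for a hypothetical cheaper model, cache hits billed at 10% of the input price, and 70% of traffic routed to the cheaper model. The routing scenario also ignores the cost of the routing call itself.

```python
def cost_per_message(input_tokens, output_tokens, cache_hit_rate,
                     input_price=3.00, output_price=15.00,
                     cache_multiplier=0.10):
    """Per-message cost in USD. Prices are ASSUMED, in USD per 1M tokens."""
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    # Cached tokens are billed at a fraction of the normal input price.
    input_cost = (uncached + cached * cache_multiplier) * input_price / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    return input_cost + output_cost


# Baseline: the inputs shown on this page (1,500 in, 400 out, 60% cache hits).
baseline = cost_per_message(1500, 400, 0.60)

# Hypothetical cheaper model and routing share; both are illustrative only.
cheap = cost_per_message(1500, 400, 0.60, input_price=1.00, output_price=5.00)
ROUTED_SHARE = 0.70

scenarios = {
    "cache hit rate 60% -> 70%": cost_per_message(1500, 400, 0.70),
    "route 70% to cheaper model": ROUTED_SHARE * cheap + (1 - ROUTED_SHARE) * baseline,
    "output 400 -> 200 tokens": cost_per_message(1500, 200, 0.60),
}

print(f"baseline: ${baseline:.4f} per message")
for name, cost in scenarios.items():
    saving = 1 - cost / baseline
    print(f"{name}: ${cost:.4f} per message ({saving:.0%} saved)")
```

How much each lever saves depends heavily on the ratio of input to output tokens and on the prices you plug in, so swap in the current list prices before drawing conclusions.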

Building this and want a second opinion?

Free 30-min call. We'll audit your prompt, caching strategy, and eval setup. No slides.

Book a free consultation →

How this calculator works

Per-message cost = (input tokens × input price) + (output tokens × output price). We multiply by your messages per user per day and your DAU to get daily spend, then scale that to monthly and annual figures. If you set a prompt cache hit rate above zero, that share of input tokens is billed at the provider's cache-hit price (typically about 10x cheaper than the standard input price).
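
The arithmetic can be sketched in a few lines of Python. This is an illustration, not the calculator's actual code. The prices ($3 and $15 per 1M input and output tokens, cache hits at 10% of the input price) are assumptions that are not stated on this page, as are the 30-day month and 365-day year; together they happen to reproduce the example figures shown above.

```python
# ASSUMED prices in USD per 1M tokens; check the provider's pricing page.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00
CACHE_HIT_MULTIPLIER = 0.10  # cached input billed at 10% of the input price


def cost_per_message(input_tokens, output_tokens, cache_hit_rate):
    """Return the cost of one message in USD."""
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    input_cost = (uncached * INPUT_PRICE
                  + cached * INPUT_PRICE * CACHE_HIT_MULTIPLIER) / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE / 1_000_000
    return input_cost + output_cost


def project(per_message, dau, messages_per_user_per_day):
    """Scale a per-message cost to daily, monthly, and yearly spend."""
    daily = per_message * dau * messages_per_user_per_day
    return {
        "daily": daily,
        "monthly": daily * 30,   # assumed 30-day month
        "yearly": daily * 365,   # assumed 365-day year
    }


per_message = cost_per_message(1500, 400, 0.60)
projection = project(per_message, dau=500, messages_per_user_per_day=8)

print(f"Per message: ${per_message:.4f}")                       # $0.0081
print(f"Per day: ${projection['daily']:.2f}")                   # $32.28
print(f"Per month: ${projection['monthly']:,.0f}")              # $968
print(f"Per user / month: ${projection['monthly'] / 500:.2f}")  # $1.94
print(f"Per year: ${projection['yearly']:,.0f}")                # $11,782
```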

What this calculator does NOT include

Image, audio, or video token costs (those are priced separately by every provider)
Fine-tuning training costs
Vector database / embedding costs (embedding models start at roughly $0.02 per 1M tokens)
Your hosting bill (typically <5% of inference cost for serverless deployments)

Pricing sources

Numbers approximate public list prices as of 2026. Authoritative sources: Anthropic, OpenAI, Google.