Hatrio AI

LLM API Pricing Calculator

Compare Costs Across 25+ AI Models

Calculate and compare costs for OpenAI, Anthropic Claude, Google Gemini, Mistral AI, and more. Make informed decisions for your AI integrations.

Real-time Pricing | 25+ Models | Instant Comparison
Calculator Settings

Enter your usage details to calculate costs

Input Tokens: number of tokens in your input
Output Tokens: number of tokens in the response
API Calls: how many times you'll call the API

Summary

Input Tokens: 1,000
Output Tokens: 1,000
Total per Call: 2,000
API Calls: 1
Total Tokens: 2,000

Cheapest Option: Llama 3.1 8B (Meta), $0.000100
Most Expensive: Claude 3 Opus (Anthropic), $0.0900

Pricing Comparison (24 models)

Prices shown are for your specified usage: 1,000 input + 1,000 output tokens × 1 call

Model | Provider | Description | Context (tokens) | Input ($/1M) | Output ($/1M) | Cost per Call | Total Cost
Llama 3.1 8B | Meta | Fast open model (lowest cost) | 128,000 | $0.05 | $0.05 | $0.000100 | $0.000100
Ministral 8B | Mistral | Ultra efficient | 128,000 | $0.10 | $0.10 | $0.000200 | $0.000200
Pixtral 12B | Mistral | Vision model | 128,000 | $0.15 | $0.15 | $0.000300 | $0.000300
Gemini 1.5 Flash | Google | Fast multimodal | 1,000,000 | $0.075 | $0.30 | $0.000375 | $0.000375
GPT-4o mini | OpenAI | Fast and affordable | 128,000 | $0.15 | $0.60 | $0.000750 | $0.000750
Command R | Cohere | Efficient retrieval | 128,000 | $0.15 | $0.60 | $0.000750 | $0.000750
Llama 3.1 70B | Meta | Popular open model | 128,000 | $0.35 | $0.40 | $0.000750 | $0.000750
Mistral Small 24.09 | Mistral | Efficient model | 128,000 | $0.20 | $0.60 | $0.000800 | $0.000800
Codestral | Mistral | Code generation | 32,000 | $0.20 | $0.60 | $0.000800 | $0.000800
Claude 3.5 Haiku | Anthropic | Fast and efficient | 200,000 | $0.25 | $1.25 | $0.001500 | $0.001500
Claude 3 Haiku | Anthropic | Fast responses | 200,000 | $0.25 | $1.25 | $0.001500 | $0.001500
GPT-3.5 Turbo | OpenAI | Legacy efficient model | 16,385 | $0.50 | $1.50 | $0.002000 | $0.002000
Gemini 1.0 Pro | Google | Previous generation | 32,000 | $0.50 | $1.50 | $0.002000 | $0.002000
Llama 3.1 405B | Meta | Largest open model | 128,000 | $2.70 | $2.70 | $0.005400 | $0.005400
Gemini 1.5 Pro | Google | Largest context window | 2,000,000 | $1.25 | $5.00 | $0.006250 | $0.006250
Mistral Large 2 | Mistral | Flagship model | 128,000 | $2.00 | $6.00 | $0.008000 | $0.008000
GPT-4o | OpenAI | Latest flagship model | 128,000 | $2.50 | $10.00 | $0.0125 | $0.0125
Command R+ | Cohere | Enterprise model | 128,000 | $2.50 | $10.00 | $0.0125 | $0.0125
o1-mini | OpenAI | Efficient reasoning | 128,000 | $3.00 | $12.00 | $0.0150 | $0.0150
Claude 3.5 Sonnet | Anthropic | Latest Claude model | 200,000 | $3.00 | $15.00 | $0.0180 | $0.0180
Claude 3 Sonnet | Anthropic | Balanced performance | 200,000 | $3.00 | $15.00 | $0.0180 | $0.0180
GPT-4 Turbo | OpenAI | Previous flagship | 128,000 | $10.00 | $30.00 | $0.0400 | $0.0400
o1-preview | OpenAI | Advanced reasoning | 128,000 | $15.00 | $60.00 | $0.0750 | $0.0750
Claude 3 Opus | Anthropic | Most capable | 200,000 | $15.00 | $75.00 | $0.0900 | $0.0900
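
The per-call figures above are simple arithmetic on the listed per-million-token rates. Here is a minimal sketch of that calculation in Python, using the GPT-4o rates from the table as an example (the function name is illustrative, not part of any provider SDK):

    def cost_per_call(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
        # Per-million-token prices in USD, as listed in the table above
        return (input_tokens * input_price_per_m + output_tokens * output_price_per_m) / 1_000_000

    # GPT-4o at $2.50/1M input and $10.00/1M output
    per_call = cost_per_call(1_000, 1_000, 2.50, 10.00)
    total = per_call * 1  # multiply by the number of API calls
    print(f"${per_call:.4f} per call, ${total:.4f} total")  # $0.0125 per call, $0.0125 total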

Understanding LLM Pricing

Key factors that affect your AI API costs

Token-Based Pricing

Most LLM APIs charge based on the number of tokens processed. A token is roughly 4 characters or 0.75 words in English.
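
If you need a quick token estimate before calling an API, the rule of thumb above translates directly into code. This is a rough approximation only; exact counts come from the provider's tokenizer (for example, OpenAI's tiktoken library):

    def estimate_tokens(text: str) -> int:
        # Rough estimate: ~4 characters per token in English text
        return max(1, len(text) // 4)

    prompt = "Summarize the quarterly report in three bullet points."
    print(estimate_tokens(prompt))  # ~13 tokens for this 54-character prompt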

Input vs Output

Output tokens typically cost more than input tokens, because output is generated one token at a time, each requiring its own forward pass, while input tokens are processed in parallel when the prompt is ingested.
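For example, at Claude 3.5 Sonnet's listed rates ($3/1M input, $15/1M output), a call with 1,500 input and 500 output tokens costs about $0.012, while the reverse split of 500 input and 1,500 output tokens costs about $0.024, twice the price for the same 2,000 tokens.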

Model Selection

Flagship models offer better quality but cost more; smaller, efficient models are a good value for simpler tasks.

Context Windows

Larger context windows allow more input but may cost more. Choose based on your application's needs for conversation history.
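
In practice this often means trimming conversation history before each request. A minimal sketch, assuming messages are plain dicts with a "content" field and reusing the rough 4-characters-per-token estimate:

    def trim_history(messages, max_tokens=8_000):
        # Keep the most recent messages whose rough token total fits the budget
        kept, used = [], 0
        for msg in reversed(messages):  # walk from newest to oldest
            tokens = max(1, len(msg["content"]) // 4)
            if used + tokens > max_tokens:
                break
            kept.append(msg)
            used += tokens
        return list(reversed(kept))  # restore chronological order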

Cost Optimization

Optimize costs by using efficient models for simple tasks, caching responses, and minimizing unnecessary output tokens.
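
Caching is the easiest of these to add. A minimal in-memory sketch, where call_llm stands in for whichever provider client you actually use:

    import hashlib

    _cache = {}

    def cached_completion(model, prompt, call_llm):
        # Reuse an earlier response when the same model + prompt has been seen before
        key = hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()
        if key not in _cache:
            _cache[key] = call_llm(model, prompt)  # only uncached calls cost money
        return _cache[key]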

Volume Discounts

Many providers offer volume discounts for high-usage customers. Contact providers directly for enterprise pricing.

Ready to Integrate AI Into Your App?

Start Building with Hatrio AI

Use our platform to manage multiple LLM providers, optimize costs automatically, and scale your AI applications with confidence.