How do top AI models compare on MMLU, MATH-500, HumanEval, SWE-bench, and Chatbot Arena? A comprehensive benchmark analysis of 4,587 models across 95 providers.

MMLU (Massive Multitask Language Understanding) tests knowledge across 57 academic subjects. MMLU-Pro is a harder variant requiring deeper reasoning.
| Model | MMLU | MMLU-Pro | Provider | Input $/M |
|---|---|---|---|---|
| GPT-4.1 | ~90% | ~78% | OpenAI | $2.00 |
| Claude Opus 4 | ~90% | ~78% | Anthropic | $15.00 |
| Gemini 2.5 Pro | ~90% | ~78% | Google | $1.25 |
| Claude Sonnet 4 | ~88% | ~76% | Anthropic | $3.00 |
| Grok 3 | ~87% | ~75% | xAI | $3.00 |
| DeepSeek R1 | ~85% | ~72% | DeepSeek | Free |
| Qwen3-235B | ~85% | ~72% | Alibaba | Free |
| Llama 4 Maverick | ~82% | ~68% | Meta | Free |

MATH-500 tests competition-level mathematics. AIME 2024 is an even harder math competition benchmark.

| Model | MATH-500 | AIME 2024 | Provider | Input $/M |
|---|---|---|---|---|
| o3 | ~96% | ~83% | OpenAI | $2.00 |
| o4-mini | ~93% | ~75% | OpenAI | $1.10 |
| DeepSeek R1 | ~92% | ~72% | DeepSeek | Free |
| Gemini 2.5 Pro | ~91% | ~70% | Google | $1.25 |
| Qwen3-235B | ~90% | ~68% | Alibaba | Free |
| Claude Sonnet 4 | ~88% | ~65% | Anthropic | $3.00 |

HumanEval tests Python code generation. SWE-bench tests real GitHub issue resolution, which is more realistic for production use.

| Model | HumanEval | SWE-bench Verified | Provider | Input $/M |
|---|---|---|---|---|
| Claude Sonnet 4 | ~93% | ~72% | Anthropic | $3.00 |
| o3 | ~92% | ~70% | OpenAI | $2.00 |
| GPT-4.1 | ~91% | ~65% | OpenAI | $2.00 |
| Gemini 2.5 Pro | ~90% | ~63% | Google | $1.25 |
| DeepSeek V3 | ~88% | ~55% | DeepSeek | $0.07 |
| Codestral | ~86% | N/A | Mistral | $0.30 |

GPQA (Graduate-Level Google-Proof Q&A) tests expert-level scientific reasoning. Even PhDs with internet access struggle.

| Model | GPQA Diamond | Provider | Input $/M |
|---|---|---|---|
| o3 | ~80% | OpenAI | $2.00 |
| Gemini 2.5 Pro | ~78% | Google | $1.25 |
| Claude Opus 4 | ~75% | Anthropic | $15.00 |
| o4-mini | ~73% | OpenAI | $1.10 |
| DeepSeek R1 | ~71% | DeepSeek | Free |

BFCL (Berkeley Function Calling Leaderboard) tests function calling accuracy, which is critical for AI agents.

| Model | BFCL v3 | Provider | Input $/M |
|---|---|---|---|
| GPT-4.1 | ~88% | OpenAI | $2.00 |
| Claude Sonnet 4 | ~86% | Anthropic | $3.00 |
| Gemini 2.5 Pro | ~85% | Google | $1.25 |
| Grok 3 | ~83% | xAI | $3.00 |
| Gemini 2.5 Flash | ~82% | Google | Free |

LMSYS Chatbot Arena uses blind human comparisons. This is the most practical benchmark for chat quality.

| Model | Arena Score | Provider | Input $/M |
|---|---|---|---|
| GPT-4.1 | ~1380 | OpenAI | $2.00 |
| Claude Sonnet 4 | ~1370 | Anthropic | $3.00 |
| Gemini 2.5 Pro | ~1360 | Google | $1.25 |
| Grok 3 | ~1350 | xAI | $3.00 |
| DeepSeek R1 | ~1330 | DeepSeek | Free |

| Benchmark | Best Free | Best Paid | Best Overall |
|---|---|---|---|
| MMLU | DeepSeek R1 / Qwen3 | Gemini 2.5 Pro ($1.25) | GPT-4.1 |
| MATH | DeepSeek R1 | o4-mini ($1.10) | o3 |
| Coding | DeepSeek V3 ($0.07) | Gemini 2.5 Pro ($1.25) | Claude Sonnet 4 |
| GPQA | DeepSeek R1 | Gemini 2.5 Pro ($1.25) | o3 |
| Tool Calling | Gemini 2.5 Flash | Gemini 2.5 Pro ($1.25) | GPT-4.1 |
| Chat | DeepSeek R1 | Gemini 2.5 Pro ($1.25) | GPT-4.1 |

Side-by-side comparison of AI models: pricing, context windows, tool calling, reasoning, vision, and structured output. Data from 95 providers, 4,587 models.

The top models from each major provider, compared across all key dimensions.
| Model | Provider | Input $/M | Output $/M | Context | Tool Call | Reasoning | Vision | Struct. Output |
|---|---|---|---|---|---|---|---|---|
| GPT-4.1 | OpenAI | $2.00 | $8.00 | 1,047K | ? | ? | ? | ? |
| o3 | OpenAI | $2.00 | $8.00 | 200K | ? | ? | ? | ? |
| o4-mini | OpenAI | $1.10 | $4.40 | 200K | ? | ? | ? | ? |
| Claude Opus 4 | Anthropic | $15.00 | $75.00 | 200K | ? | ? | ? | ? |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200K | ? | ? | ? | ? |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | 200K | ? | ? | ? | ? |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1,048K | ? | ? | ? | ? |
| Gemini 2.5 Flash | Google | Free | Free | 1,048K | ? | ? | ? | ? |
| Grok 3 | xAI | $3.00 | $15.00 | 131K | ? | ? | ? | ? |
| Grok 3 Mini | xAI | $0.30 | $0.50 | 131K | ? | ? | ? | ? |
| DeepSeek R1 | DeepSeek | Free | Free | 164K | ? | ? | ? | ? |
| DeepSeek V3 | DeepSeek | $0.07 | $0.27 | 164K | ? | ? | ? | ? |
| Mistral Large | Mistral | $2.00 | $6.00 | 128K | ? | ? | ? | ? |
| Codestral | Mistral | $0.30 | $0.90 | 256K | ? | ? | ? | ? |
| Qwen3-235B | Alibaba | Free | Free | 128K | ? | ? | ? | ? |
| Command R+ | Cohere | $2.50 | $10.00 | 128K | ? | ? | ? | ? |
| Llama 4 Maverick | Meta | Free | Free | 1,048K | ? | ? | ? | ? |
| Nova Pro | Amazon | $0.80 | $3.20 | 300K | ? | ? | ? | ? |
Models that offer strong capabilities at budget-friendly prices.
| Model | Provider | Input $/M | Output $/M | Context | Tool Call | Reasoning | Vision |
|---|---|---|---|---|---|---|---|
| Gemini 2.5 Flash | Google | Free | Free | 1,048K | ? | ? | ? |
| DeepSeek R1 | DeepSeek | Free | Free | 164K | ? | ? | ? |
| Qwen3-235B | Alibaba | Free | Free | 128K | ? | ? | ? |
| DeepSeek V3 | DeepSeek | $0.07 | $0.27 | 164K | ? | ? | ? |
| Grok 3 Mini | xAI | $0.30 | $0.50 | 131K | ? | ? | ? |
| Codestral | Mistral | $0.30 | $0.90 | 256K | ? | ? | ? |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | 200K | ? | ? | ? |
| Nova Pro | Amazon | $0.80 | $3.20 | 300K | ? | ? | ? |
Models with the largest context windows for processing long documents.
| Model | Provider | Context Window | Input $/M | Tool Call |
|---|---|---|---|---|
| Gemini 2.5 Pro | Google | 1,048,576 | $1.25 | ? |
| Gemini 2.5 Flash | Google | 1,048,576 | Free | ? |
| GPT-4.1 | OpenAI | 1,047,576 | $2.00 | ? |
| Llama 4 Maverick | Meta | 1,048,576 | Free | ? |
| Nova Pro | Amazon | 300,000 | $0.80 | ? |
| Claude Opus/Sonnet 4 | Anthropic | 200,000 | $3-15 | ? |
| o3 / o4-mini | OpenAI | 200,000 | $1.10-2 | ? |
| DeepSeek R1/V3 | DeepSeek | 163,840 | Free | ? |
How many models support each capability across our catalog.
| Capability | Models | Free Models | Cheapest Paid |
|---|---|---|---|
| Tool Calling | 2,350 | 54 | ling-2.6-flash ($0.01/$0.03) |
| Reasoning | 1,306 | 18 | qwen3.5-0.8b ($0.01/$0.05) |
| Vision | 1,487 | 35 | ling-2.6-flash ($0.01/$0.03) |
| Structured Output | 829 | 24 | ling-2.6-flash ($0.01/$0.03) |
| Open Weights | 527 | 81 | Free |
| Image Output | 28 | 5 | Various |
| Audio Input | 118 | 12 | Various |
| Audio Output | 34 | 8 | Various |

| Use Case | Best Model | Why | Cost |
|---|---|---|---|
| AI Agents | GPT-4.1 | #1 tool calling, parallel calls | $2/$8 |
| Coding | Claude Sonnet 4 | #1 SWE-bench, 64K output | $3/$15 |
| Reasoning | o3 | #1 MATH, GPQA | $2/$8 |
| Long Documents | Gemini 2.5 Pro | 1M context, best price | $1.25/$10 |
| Chat | GPT-4.1 | #1 Chatbot Arena | $2/$8 |
| Budget | Gemini 2.5 Flash | Free with 1M context | Free |
| Open Source | Qwen3-235B | Best open-weight model | Free |
| Vision | Gemini 2.5 Pro | Best MMMU, image+video | $1.25/$10 |

Automate AI model data in your CI/CD pipeline. Free, open source, and always up-to-date.

    - name: Get AI Model Data
      uses: i-need-token/ai-models@v0.2.0
      with:
        format: json
        output: models.json

Get structured model data with pricing, context windows, and capabilities for 4,587+ models across 95 providers.

Filter by provider, capability (tool calling, reasoning, vision), pricing tier, or context window size.

Monitor pricing changes across providers. Get alerts when model prices change.

Output as JSON, YAML, CSV, or Markdown table. Use in scripts, docs, or dashboards.

| Input | Description | Default |
|---|---|---|
| format | Output format: json, yaml, csv, markdown | json |
| output | Output file path | models.json |
| provider | Filter by provider name | (all) |
| capability | Filter by capability: tool_call, reasoning, vision, structured_output | (all) |
| free-only | Only include free models | false |
| max-price | Maximum input price per M tokens | (no limit) |
| min-context | Minimum context window size | 0 |

    - name: Get free tool-calling models
      uses: i-need-token/ai-models@v0.2.0
      with:
        format: json
        output: free-tc-models.json
        capability: tool_call
        free-only: true

    - name: Get budget models
      uses: i-need-token/ai-models@v0.2.0
      with:
        format: csv
        output: budget-models.csv
        max-price: 0.50

    - name: Generate comparison table
      uses: i-need-token/ai-models@v0.2.0
      with:
        format: markdown
        output: model-comparison.md
        provider: openai
        capability: reasoning


    name: Price Monitor
    on:
      schedule:
        - cron: '0 6 * * 1'  # Every Monday 6:00 UTC
    jobs:
      check-prices:
        runs-on: ubuntu-latest
        steps:
          - uses: i-need-token/ai-models@v0.2.0
            with:
              format: csv
              output: current-prices.csv
          # last-week-prices.csv must be restored from an earlier run (for example
          # from a cache, an artifact, or the repository); runners start empty.
          - name: Compare with last week
            run: |
              diff last-week-prices.csv current-prices.csv || echo "Prices changed!"
          - name: Save for next week
            run: cp current-prices.csv last-week-prices.csv

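A plain `diff` only reports that the two files differ. As a rough sketch of a more specific comparison, the Python below lists which models changed price between two snapshots. It assumes the CSV columns `id`, `provider`, `input_price`, and `output_price` and that every price field is numeric; the helper names `load_prices` and `price_changes`, and the "this week" price, are hypothetical.

```python
import csv
import io


def load_prices(csv_text):
    """Map (provider, id) -> (input_price, output_price) from catalog CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row["provider"], row["id"]): (
            float(row["input_price"]),
            float(row["output_price"]),
        )
        for row in reader
    }


def price_changes(old_csv, new_csv):
    """Return sorted (key, old_prices, new_prices) for models in both snapshots whose prices differ."""
    old, new = load_prices(old_csv), load_prices(new_csv)
    return sorted(
        (key, old[key], new[key])
        for key in old.keys() & new.keys()
        if old[key] != new[key]
    )


HEADER = "id,provider,input_price,output_price,context_window,tool_call,reasoning\n"
last_week = HEADER + "gpt-4o,openai,2.5,10,128000,true,false\n"
# Hypothetical price drop, used only to exercise the comparison.
this_week = HEADER + "gpt-4o,openai,2.0,10,128000,true,false\n"

for (provider, model_id), before, after in price_changes(last_week, this_week):
    print(f"{provider}/{model_id}: input {before[0]} -> {after[0]}, "
          f"output {before[1]} -> {after[1]}")
```

Models that appear in only one snapshot are ignored here; a real monitor would probably report additions and removals separately.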

    {
      "generated_at": "2025-05-21T12:00:00Z",
      "stats": { "models": 4587, "providers": 95 },
      "models": [
        {
          "id": "gpt-4o",
          "provider": "openai",
          "pricing": { "input": 2.5, "output": 10 },
          "limit": { "context": 128000 },
          "tool_call": true,
          "reasoning": false
        }
      ]
    }

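To illustrate how a script might consume this JSON, here is a minimal Python sketch that filters the `models` array by capability, input price, and context size. It assumes only the fields shown in the sample above; `select_models` is a hypothetical helper and does not claim to reproduce the action's own filtering logic.

```python
import json

# Same shape as the sample output above.
SAMPLE = """
{
  "generated_at": "2025-05-21T12:00:00Z",
  "stats": { "models": 4587, "providers": 95 },
  "models": [
    {
      "id": "gpt-4o",
      "provider": "openai",
      "pricing": { "input": 2.5, "output": 10 },
      "limit": { "context": 128000 },
      "tool_call": true,
      "reasoning": false
    }
  ]
}
"""


def select_models(catalog, capability=None, max_price=None, min_context=0):
    """Return ids of models matching the optional capability, price, and context filters."""
    selected = []
    for model in catalog["models"]:
        # Capability flags are assumed to be top-level booleans, as in the sample.
        if capability and not model.get(capability, False):
            continue
        price = (model.get("pricing") or {}).get("input")
        if max_price is not None and (price is None or price > max_price):
            continue
        context = (model.get("limit") or {}).get("context") or 0
        if context < min_context:
            continue
        selected.append(model["id"])
    return selected


catalog = json.loads(SAMPLE)
print(select_models(catalog, capability="tool_call", max_price=3.0))
print(select_models(catalog, capability="reasoning"))
```

Models with a missing price are excluded whenever `max_price` is set, on the assumption that an unknown price should not pass a budget filter.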
    id,provider,input_price,output_price,context_window,tool_call,reasoning
    gpt-4o,openai,2.5,10,128000,true,false

Answer 4 questions to find the best AI model for your use case. Data from AI Models Catalog: 4,587+ models across 95 providers.

Calculate your monthly AI costs. Compare pricing for 4,587+ models across 95 providers. Real-time cost estimation based on your token usage.

Monthly cost for 1M input + 0.5M output tokens across popular models.

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-image-1-mini | aimlapi | $0.007 | $0.676 | ? |
| mistralai--Mistral-Nemo-Instruct-2407 | klusterai | $0.008 | $0.001 | 131K |
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| openai--gpt-image-1-model | aimlapi | $0.012 | $0.175 | ? |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| meta-llama-3.1-8b-instruct-turbo | deepinfra | $0.02 | $0.03 | 131K |
| meta-llama-3.1-8b-instruct | deepinfra | $0.02 | $0.05 | 131K |
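
The monthly figure for the 1M input + 0.5M output scenario follows directly from the per-million prices in the table. A small Python sketch of that arithmetic, assuming prices are quoted per 1,000,000 tokens (`monthly_cost` is a hypothetical helper name):

```python
def monthly_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in dollars, given token counts and prices per 1,000,000 tokens."""
    return (
        input_tokens / 1_000_000 * input_price_per_m
        + output_tokens / 1_000_000 * output_price_per_m
    )


# ling-2.6-flash from the table above: $0.01 input, $0.03 output per 1M tokens.
cost = monthly_cost(1_000_000, 500_000, 0.01, 0.03)
print(f"${cost:.3f}")
```

For ling-2.6-flash this is 1 × $0.01 + 0.5 × $0.03, or about $0.025 per month at that volume.
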

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
| gpt-oss-20b | inferencenet | $0.03 | $0.15 | 131K |
| schematron-v2-turbo | inferencenet | $0.03 | $0.15 | 131K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| gpt-oss-20b | deepinfra | $0.03 | $0.14 | 131K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | deepinfra | $0.039 | $0.19 | 131K |
| nvidia-nemotron-nano-9b-v2 | deepinfra | $0.04 | $0.16 | 131K |
| openai--gpt-oss-20b | novitaai | $0.04 | $0.15 | 131K |
| nemotron-3-nano-30b-a3b | deepinfra | $0.05 | $0.2 | 262K |

| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| paddlepaddle--paddleocr-vl | novitaai | $0.02 | $0.02 | 16K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| deepseek--deepseek-ocr-2 | novitaai | $0.03 | $0.03 | 8K |
| deepseek--deepseek-ocr | novitaai | $0.03 | $0.03 | 8K |
| reka-edge-2 | reka | $0.03 | $0.1 | 131K |
| zai-org--autoglm-phone-9b-multilingual | novitaai | $0.035 | $0.138 | 65K |
| gemini-1.5-flash-8b | deepinfra | $0.0375 | $0.15 | 1M |
| google-gemma-3-4b | amazon-bedrock | $0.04 | $0.08 | 131K |

All pricing data is sourced from first-party provider APIs. Prices are per million tokens (1M = 1,000,000 tokens). Aggregator providers are excluded from ranking tables to avoid duplicate models. Cache pricing is shown separately where available.

Browse 4,587 AI models across 95 providers. First-party data with real pricing, context windows, and capabilities.

All 95 providers sorted by number of models. Click a provider to see their models.

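
As an illustration of the aggregator exclusion described above, the Python sketch below ranks models by input price while skipping aggregator providers and unpriced entries. The `AGGREGATORS` set is a hypothetical subset of the providers tagged "(aggregator)" in the catalog, the flat `input_price` field is an assumption, and one sample row is invented to show the filter at work.

```python
# Hypothetical subset of the provider names tagged "(aggregator)" in the catalog.
AGGREGATORS = {"nanogpt", "aihubmix", "openrouter", "martian", "requesty"}


def cheapest_first_party(models, limit=10):
    """Rank models by input price, skipping aggregators and unpriced entries."""
    eligible = [
        m for m in models
        if m["provider"] not in AGGREGATORS and m["input_price"] is not None
    ]
    # Ties on price are broken by model id so the order is deterministic.
    eligible.sort(key=lambda m: (m["input_price"], m["id"]))
    return eligible[:limit]


models = [
    {"id": "qwen3.5-0.8b", "provider": "deepinfra", "input_price": 0.01},
    {"id": "ling-2.6-flash", "provider": "inclusionai", "input_price": 0.01},
    # Invented aggregator row: cheaper, but excluded from the ranking.
    {"id": "some-resold-model", "provider": "openrouter", "input_price": 0.005},
    # Entries whose price is unknown (shown as $? in the catalog) are skipped.
    {"id": "whisper-1", "provider": "openai", "input_price": None},
]

for m in cheapest_first_party(models):
    print(m["provider"], m["id"], m["input_price"])
```
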
| Provider | Models | Cheapest Input $/1M | Max Context | Tool Call | Free |
|---|---|---|---|---|---|
| nanogpt (aggregator) | 547 | Aggregator | ? | 0 | |
| aihubmix (aggregator) | 476 | Aggregator | ? | 132 | |
| openrouter (aggregator) | 356 | Aggregator | 10M | 263 | ✓ |
| martian (aggregator) | 304 | Aggregator | ? | 0 | |
| requesty (aggregator) | 277 | Aggregator | 1M | 251 | |
| 302ai (aggregator) | 268 | Aggregator | 2M | 190 | |
| auriko (aggregator) | 181 | Aggregator | 1M | 154 | ✓ |
| llmgateway (aggregator) | 163 | Aggregator | ? | 158 | ✓ |
| aimlapi | 147 | $0.007 | 2M | 21 | ✓ |
| fastrouter (aggregator) | 120 | Aggregator | 2M | 94 | ✓ |
| orcarouter (aggregator) | 120 | Aggregator | 1M | 102 | |
| cortecs (aggregator) | 105 | Aggregator | ? | 97 | |
| novitaai | 104 | $0.02 | 1M | 72 | ✓ |
| vultr | 98 | $0.55 | 1M | 11 | |
| deepinfra | 88 | $0.01 | 1M | 0 | |
| venice (aggregator) | 75 | Aggregator | 2M | 64 | |
| jiekou (aggregator) | 73 | Aggregator | 2M | 73 | |
| meganova (aggregator) | 63 | Aggregator | 1M | 60 | ✓ |
| alibaba | 62 | $0.15 | 1M | 62 | |
| ppio | 60 | $0.2145 | 1M | 46 | ✓ |
| amazon-bedrock | 57 | $0.035 | 1M | 37 | |
| google-vertex | 38 | $0.07 | 1M | 32 | |
| siliconflow-cn | 37 | $0.5 | 262K | 2 | |
| stepfun | 31 | $0.7 | 256K | 0 | ✓ |
| cloudflare | 30 | $0.017 | 327K | 15 | |
| databricks | 29 | $0.05 | 200K | 4 | |
| gmicloud | 29 | $0.07 | 1M | 11 | |
| openai | 28 | $0.02 | 1M | 18 | |
| siliconflow | 27 | $0.04 | 1M | 24 | |
| togetherai | 24 | $0.03 | 262K | 22 | |
| nebius | 23 | $0.02 | 1M | 21 | |
| google | 21 | $0.075 | 2M | 8 | ✓ |
| minimax | 21 | $2.1 | 204K | 0 | |
| voyage | 21 | $0.02 | ? | 0 | ✓ |
| digitalocean | 20 | $0.05 | 1M | 14 | |
| inferencenet | 20 | $0.01 | 131K | 15 | |
| zhipuai | 20 | $0.1 | 1M | 20 | ✓ |
| tencent-tokenhub | 19 | $1 | 1M | 16 | |
| mistral | 16 | $0.04 | 256K | 12 | ✓ |
| moonshotai | 16 | $2 | 262K | 0 | |
| neuralwatt | 14 | $0.03 | ? | 14 | |
| tencent | 14 | $0.5 | 250K | 3 | ✓ |
| scaleway | 13 | $0.15 | 131K | 6 | |
| chutes | 12 | $0.08 | 262K | 12 | |
| clarifai | 12 | $0.09 | 1M | 9 | |
| cloudferro-sherlock | 12 | $0.26 | 1M | 5 | |
| groq | 12 | $0.05 | 131K | 8 | |
| klusterai | 12 | $0.008 | 1M | 4 | |
| meta | 12 | $0.1 | 10M | 9 | |
| microsoft | 12 | $0.075 | 128K | 6 | |
| ovhcloud | 12 | $0.05 | 262K | 0 | |
| anthropic | 11 | $1 | 1M | 11 | |
| baichuan | 11 | $0.98 | 131K | 0 | ✓ |
| cerebras | 11 | $0.1 | 131K | 9 | ✓ |
| hpc-ai | 11 | $0.14 | 1M | 11 | |
| hyperbolic | 11 | $0.1 | 163K | 0 | |
| fireworks | 10 | $0.07 | 1M | 10 | |
| baseten | 9 | $0.1 | 1M | 9 | |
| baidu | 8 | $0.126 | 1M | 7 | ✓ |
| evroc | 8 | $0.1 | 131K | 3 | |
| friendli | 8 | $0.1 | 262K | 8 | |
| upstage | 8 | $0.1 | 128K | 3 | |
| amazon | 7 | $0.035 | 1M | 7 | |
| arcee | 7 | $0.04 | 262K | 6 | ✓ |
| berget | 7 | $0.2 | ? | 7 | |
| morph | 7 | $0.2 | 1M | 5 | |
| nousresearch | 7 | $0.06 | 131K | 7 | |
| sambanova | 7 | $0.22 | 196K | 0 | |
| dinference | 6 | $0.07 | 204K | 3 | |
| iflytek | 6 | $0.8 | 262K | 0 | ✓ |
| submodel | 6 | $0.1 | 262K | 0 | |
| textsynth | 6 | $0.2 | 131K | 0 | |
| writer | 6 | $0.6 | 1M | 3 | |
| xai | 6 | $0.2 | 131K | 6 | |
| 01ai | 5 | $1 | 32K | 4 | |
| aion | 5 | $0.7 | 131K | 0 | |
| bytedance | 5 | $0.07 | 262K | 4 | |
| inception | 5 | $0.25 | 128K | 3 | |
| mixlayer | 5 | $0.1 | 131K | 5 | ✓ |
| privatemode | 5 | $0.43 | 131K | 3 | |
| xiaomi | 5 | $0.1 | 1M | 5 | |
| deepseek | 4 | $0.14 | 1M | 4 | |
| perplexity | 4 | $1 | 200K | 4 | |
| inclusionai | 3 | $0.01 | 262K | 3 | |
| ai21 | 2 | $0.2 | 256K | 0 | |
| reka | 2 | $0.03 | 131K | 1 | |
| wafer | 2 | $0.6 | 262K | 2 | |

GPT-4, GPT-4o, o1, o3: the industry standard for LLMs. 28 models available.

| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| text-embedding-3-small | $0.02 | $0 | 8K | | |
| gpt-4.1-nano | $0.1 | $0.4 | 1M | ✓ | |
| text-embedding-ada-002 | $0.1 | $0 | 8K | | |
| text-embedding-3-large | $0.13 | $0 | 8K | | |
| gpt-4o-mini | $0.15 | $0.6 | 128K | ✓ | |
| gpt-4.1-mini | $0.4 | $1.6 | 1M | ✓ | |
| gpt-3.5-turbo | $0.5 | $1.5 | 16K | ✓ | |
| o3-mini | $1.1 | $4.4 | 200K | ✓ | ✓ |
| o4-mini | $1.1 | $4.4 | 200K | ✓ | ✓ |
| codex-mini | $1.5 | $6 | 192K | | ✓ |
| o1-mini | $1.5 | $6 | 128K | ✓ | ✓ |
| gpt-4.1 | $2 | $8 | 1M | ✓ | |
| gpt-4o-audio | $2.5 | $10 | 128K | ✓ | |
| gpt-4o | $2.5 | $10 | 128K | ✓ | |
| gpt-3.5-turbo-16k | $3 | $4 | 16K | ✓ | |
| gpt-4o-realtime | $5 | $20 | 128K | ✓ | |
| gpt-4-turbo | $10 | $30 | 128K | ✓ | |
| o3 | $10 | $40 | 200K | ✓ | ✓ |
| o1-realtime | $15 | $60 | 200K | ✓ | ✓ |
| o1 | $15 | $60 | 200K | ✓ | ✓ |
| gpt-4 | $30 | $60 | 8K | ✓ | |
| gpt-4-32k | $60 | $120 | 32K | | |
| o1-pro | $150 | $600 | 200K | ✓ | ✓ |
| dall-e-2 | $? | $? | ? | | |
| dall-e-3 | $? | $? | ? | | |
| tts-1-hd | $? | $? | ? | | |
| tts-1 | $? | $? | ? | | |
| whisper-1 | $? | $? | ? | | |

Claude: known for safety, reasoning, and long context. 11 models available.

| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| claude-haiku-4-5 | $1 | $5 | 200K | ✓ | ✓ |
| claude-sonnet-4-0 | $3 | $15 | 1M | ✓ | ✓ |
| claude-sonnet-4-5 | $3 | $15 | 1M | ✓ | ✓ |
| claude-sonnet-4-6 | $3 | $15 | 1M | ✓ | ✓ |
| claude-opus-4-5 | $5 | $25 | 200K | ✓ | ✓ |
| claude-opus-4-6 | $5 | $25 | 1M | ✓ | ✓ |
| claude-opus-4-7 | $5 | $25 | 1M | ✓ | ✓ |
| claude-opus-4-0 | $15 | $75 | 200K | ✓ | ✓ |
| claude-opus-4-1 | $15 | $75 | 200K | ✓ | ✓ |
| claude-opus-4-6-fast | $30 | $150 | 1M | ✓ | ✓ |
| claude-opus-4-7-fast | $30 | $150 | 1M | ✓ | ✓ |

Gemini: multimodal models with massive context windows. 21 models available.

| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| gemini-1.5-flash-8b | $0.075 | $0.3 | 1M | ✓ | |
| gemini-1.5-flash | $0.075 | $0.3 | 1M | ✓ | |
| gemini-2.0-flash-lite | $0.075 | $0.3 | 1M | ✓ | |
| gemini-2.0-flash | $0.1 | $0.4 | 1M | ✓ | |
| gemini-2.5-flash-lite | $0.1 | $0.4 | 1M | ✓ | |
| gemini-2.5-flash | $0.15 | $3.5 | 1M | ✓ | ✓ |
| gemini-1.5-pro | $1.25 | $5 | 2M | ✓ | |
| gemini-2.5-pro | $1.25 | $10 | 1M | ✓ | ✓ |
| chirp-3.0-HD | $? | $? | ? | | |
| gemma-3-12b-it | Free | | 131K | | |
| gemma-3-1b-it | Free | | 131K | | |
| gemma-3-27b-it | Free | | 131K | | |
| gemma-3-4b-it | Free | | 131K | | |
| gemma-3n-E2B-it | Free | | 131K | | |
| gemma-3n-E4B-it | Free | | 131K | | |
| imagen-3.0-fast-generate | $? | $? | ? | | |
| imagen-3.0-generate | $? | $? | ? | | |
| imagen-4.0-fast-generate | $? | $? | ? | | |
| imagen-4.0-generate | $? | $? | ? | | |
| lyria-2.0 | $? | $? | ? | | |
| veo-2.0-generate | $? | $? | ? | | |

Llama: open-weight models you can run anywhere. 12 models available.

| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| meta-llama-3.2-1b | $0.1 | $0.1 | 128K | | |
| meta-llama-3.2-3b | $0.15 | $0.15 | 128K | | |
| meta-llama-3.2-11b-vision | $0.16 | $0.16 | 128K | ✓ | |
| meta-llama-4-scout | $0.17 | $0.66 | 10M | ✓ | |
| meta-llama-3.1-8b | $0.22 | $0.22 | 128K | ✓ | |
| meta-llama-4-maverick | $0.24 | $0.97 | 1M | ✓ | |
| meta-llama-3-8b | $0.3 | $0.6 | 8K | | |
| meta-llama-3.1-70b | $0.72 | $0.72 | 128K | ✓ | |
| meta-llama-3.2-90b-vision | $0.72 | $0.72 | 128K | ✓ | |
| meta-llama-3.3-70b | $0.72 | $0.72 | 128K | ✓ | |
| meta-llama-3.1-405b | $2.4 | $2.4 | 128K | ✓ | |
| meta-llama-3-70b | $2.65 | $3.5 | 8K | ✓ | |

High-performance reasoning at competitive prices. 4 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| deepseek-chat | $0.14 | $0.28 | 1M | ✓ | |
| deepseek-reasoner | $0.14 | $0.28 | 1M | ✓ | ✓ |
| deepseek-v4-flash | $0.14 | $0.28 | 1M | ✓ | ✓ |
| deepseek-v4-pro | $0.435 | $0.87 | 1M | ✓ | ✓ |

European AI with open and commercial models. 16 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| ministral-3b | $0.04 | $0.04 | 128K | ✓ | |
| voxtral-mini | $0.04 | $0.04 | 128K | | |
| ministral-8b | $0.1 | $0.1 | 128K | ✓ | |
| voxtral-small | $0.1 | $0.3 | 128K | | |
| mistral-7b | $0.15 | $0.2 | 32K | | |
| mistral-nemo | $0.15 | $0.15 | 128K | ✓ | |
| mistral-small | $0.2 | $0.6 | 128K | ✓ | |
| mistral-medium | $0.4 | $2 | 128K | ✓ | |
| mixtral-8x7b | $0.45 | $0.7 | 32K | ✓ | |
| magistral-small | $0.5 | $1.5 | 128K | ✓ | ✓ |
| mixtral-8x22b | $0.8 | $1.2 | 64K | ✓ | |
| mistral-large | $2 | $6 | 128K | ✓ | |
| pixtral-large | $2 | $6 | 128K | ✓ | |
| mistral-large-2407 | $4 | $12 | 128K | ✓ | |
| codestral | Free | | 256K | | |
| devstral | Free | | 128K | ✓ | |

Grok: models with real-time knowledge. 6 models available.

| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| xai-grok-4-fast | $0.2 | $0.5 | 131K | ✓ | |
| xai-grok-4.1 | $0.2 | $0.5 | 131K | ✓ | ✓ |
| xai-grok-3-mini | $0.25 | $1.27 | 131K | ✓ | ✓ |
| xai-grok-4.2 | $2 | $6 | 131K | ✓ | ✓ |
| xai-grok-3 | $3 | $15 | 131K | ✓ | ✓ |
| xai-grok-4 | $3 | $15 | 131K | ✓ | ✓ |

Managed access to multiple foundation models. 57 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| amazon-nova-micro | $0.035 | $0.14 | 128K | ✓ | |
| google-gemma-3-4b | $0.04 | $0.08 | 131K | | |
| mistral-voxtral-mini | $0.04 | $0.04 | 128K | | |
| amazon-nova-lite | $0.06 | $0.24 | 300K | ✓ | |
| nvidia-nemotron-nano-2 | $0.06 | $0.23 | 4K | | |
| nvidia-nemotron-nano-3-30b | $0.06 | $0.24 | 4K | | |
| openai-gpt-oss-20b | $0.07 | $0.3 | 131K | ✓ | |
| openai-gpt-oss-safeguard-20b | $0.07 | $0.2 | 131K | ✓ | |
| zai-glm-4-7-flash | $0.07 | $0.4 | 131K | ✓ | |
| google-gemma-3-12b | $0.09 | $0.29 | 131K | | |
| meta-llama-3-2-1b | $0.1 | $0.1 | 128K | | |
| mistral-ministral-3b | $0.1 | $0.1 | 128K | | |
| mistral-voxtral-small | $0.1 | $0.3 | 128K | | |
| meta-llama-3-2-3b | $0.15 | $0.15 | 128K | | |
| mistral-ministral-8b | $0.15 | $0.15 | 128K | | |
| mistral-mistral-7b | $0.15 | $0.2 | 32K | | |
| nvidia-nemotron-3-super-120b | $0.15 | $0.65 | 4K | | |
| openai-gpt-oss-120b | $0.15 | $0.6 | 131K | ✓ | |
| openai-gpt-oss-safeguard-120b | $0.15 | $0.6 | 131K | ✓ | |
| qwen-qwen3-32b | $0.15 | $0.6 | 131K | ✓ | |
| qwen-qwen3-coder-30b-a3b | $0.15 | $0.6 | 131K | ✓ | |
| writer-palmyra-vision-7b | $0.15 | $0.6 | 8K | | |
| meta-llama-3-2-11b | $0.16 | $0.16 | 128K | ✓ | |
| meta-llama-4-scout-17b | $0.17 | $0.66 | 1M | ✓ | |
| mistral-ministral-14b | $0.2 | $0.2 | 128K | | |
| nvidia-nemotron-nano-2-vl | $0.2 | $0.6 | 4K | | |
| meta-llama-3-1-8b | $0.22 | $0.22 | 128K | ✓ | |
| google-gemma-3-27b | $0.23 | $0.38 | 131K | | |
| meta-llama-4-maverick-17b | $0.24 | $0.97 | 1M | ✓ | |
| meta-llama-3-8b | $0.3 | $0.6 | 8K | | |
| minimax-m2-1 | $0.3 | $1.2 | 1M | ✓ | |
| minimax-m2-5 | $0.3 | $1.2 | 1M | ✓ | |
| minimax-m2 | $0.3 | $1.2 | 1M | ✓ | |
| amazon-nova-2-lite | $0.33 | $2.75 | 64K | ✓ | |
| mistral-devstral | $0.4 | $2 | 128K | ✓ | |
| mistral-mixtral-8x7b | $0.45 | $0.7 | 32K | | |
| mistral-magistral-small | $0.5 | $1.5 | 128K | ✓ | |
| mistral-mistral-large-3 | $0.5 | $1.5 | 128K | ✓ | |
| qwen-qwen3-coder-next | $0.5 | $1.2 | 131K | ✓ | |
| qwen-qwen3-vl-235b-a22b | $0.53 | $2.66 | 131K | ✓ | |
| kimi-k2-thinking | $0.6 | $2.5 | 131K | ✓ | |
| moonshot-kimi-k2-5 | $0.6 | $3 | 131K | ✓ | |
| zai-glm-4-7 | $0.6 | $2.2 | 131K | ✓ | |
| deepseek-v3-2 | $0.62 | $1.85 | 65K | ✓ | |
| meta-llama-3-1-70b | $0.72 | $0.72 | 128K | ✓ | |
| meta-llama-3-2-90b | $0.72 | $0.72 | 128K | ✓ | |
| meta-llama-3-3-70b | $0.72 | $0.72 | 128K | ✓ | |
| amazon-nova-pro | $0.8 | $3.2 | 300K | ✓ | |
| meta-llama-3-1-70b-latency-optimized | $0.9 | $0.9 | 128K | ✓ | |
| amazon-nova-pro-latency-optimized | $1 | $4 | 300K | ✓ | |
| mistral-mistral-small | $1 | $3 | 128K | ✓ | |
| zai-glm-5 | $1 | $3.2 | 131K | ✓ | |
| deepseek-r1 | $1.35 | $5.4 | 65K | | |
| mistral-pixtral-large | $2 | $6 | 128K | ✓ | |
| amazon-nova-premier | $2.5 | $12.5 | 1M | ✓ | |
| meta-llama-3-70b | $2.65 | $3.5 | 8K | | |
| mistral-mistral-large | $4 | $12 | 128K | ✓ | |

Ultra-fast inference with LPU hardware. 12 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| llama-3.1-8b-instant | $0.05 | $0.08 | 131K | ✓ | |
| gpt-oss-20b | $0.075 | $0.3 | 131K | ✓ | |
| gpt-oss-safeguard-20b | $0.075 | $0.3 | 131K | ✓ | |
| llama-4-scout-17b-16e-instruct | $0.11 | $0.34 | 131K | ✓ | |
| gpt-oss-120b | $0.15 | $0.6 | 131K | ✓ | |
| qwen3-32b | $0.29 | $0.59 | 131K | ✓ | |
| llama-3.3-70b-versatile | $0.59 | $0.79 | 131K | ✓ | |
| kimi-k2-instruct-0905 | $1 | $3 | 131K | ✓ | |
| orpheus-ar-sa | $? | $? | ? | | |
| orpheus-en | $? | $? | ? | | |
| whisper-large-v3-turbo | $? | $? | ? | | |
| whisper-large-v3 | $? | $? | ? | | |

Open-weight model hosting platform. 24 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| liquid-ai--LFM2-24B-A2B | $0.03 | $0.12 | 131K | ✓ | |
| openai--gpt-oss-20b | $0.05 | $0.2 | 131K | ✓ | |
| google--gemma-3n-E4B-it | $0.06 | $0.12 | 131K | | |
| Qwen--Qwen3.5-9B | $0.1 | $0.15 | 131K | ✓ | |
| meta-llama--Meta-Llama-3.1-8B-Instruct-Lite | $0.1 | $0.1 | 131K | ✓ | |
| essential-ai--Rnj-1-Instruct | $0.15 | $0.15 | 131K | | |
| openai--gpt-oss-120b | $0.15 | $0.6 | 131K | ✓ | |
| Qwen--Qwen3-235B-A22B-FP8-Throughput | $0.2 | $0.6 | 131K | ✓ | |
| MiniMaxAI--MiniMax-M2.5 | $0.3 | $1.2 | 131K | ✓ | |
| MiniMaxAI--MiniMax-M2.7 | $0.3 | $1.2 | 131K | ✓ | |
| Qwen--Qwen2.5-7B-Instruct-Turbo | $0.3 | $0.3 | 131K | ✓ | |
| google--gemma-4-31B-it | $0.39 | $0.97 | 131K | ✓ | |
| Qwen--Qwen3-Coder-Next | $0.5 | $1.2 | 131K | ✓ | |
| Qwen--Qwen3.6-Plus | $0.5 | $3 | 131K | ✓ | |
| moonshotai--Kimi-K2.5 | $0.5 | $2.8 | 131K | ✓ | |
| Qwen--Qwen3.5-397B-A17B | $0.6 | $3.6 | 131K | ✓ | |
| deepseek-ai--DeepSeek-V3.1 | $0.6 | $1.7 | 131K | ✓ | |
| meta-llama--Llama-3.3-70B-Instruct-Turbo | $0.88 | $0.88 | 131K | ✓ | |
| zai-org--GLM-5 | $1 | $3.2 | 131K | ✓ | |
| moonshotai--Kimi-K2.6 | $1.2 | $4.5 | 262K | ✓ | |
| cogito-ai--Cogito-v2.1-671B | $1.25 | $1.25 | 131K | ✓ | ✓ |
| zai-org--GLM-5.1 | $1.4 | $4.4 | 131K | ✓ | |
| Qwen--Qwen3-Coder-480B-A35B-Instruct | $2 | $2 | 131K | ✓ | |
| deepseek-ai--DeepSeek-V4-Pro | $2.1 | $4.4 | 131K | ✓ | ✓ |

Fast inference for open-source models. 10 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| gpt-oss-20b | $0.07 | $0.3 | 131K | ✓ | |
| gpt-oss-120b | $0.15 | $0.6 | 131K | ✓ | |
| llama4-scout-17b-16e-instruct | $0.18 | $0.59 | 131K | ✓ | |
| minimax-m2.5 | $0.3 | $1.2 | 196K | ✓ | |
| minimax-m2.7 | $0.3 | $1.2 | 196K | ✓ | |
| qwen3.6-plus | $0.5 | $3 | 131K | ✓ | |
| kimi-k2.5 | $0.6 | $3 | 262K | ✓ | |
| kimi-k2.6 | $0.95 | $4 | 262K | ✓ | |
| glm-5.1 | $1.4 | $4.4 | 202K | ✓ | |
| deepseek-v4-pro | $1.74 | $3.48 | 1M | ✓ | ✓ |

Wafer-scale inference at extreme speed. 11 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| llama3.1-8b | $0.1 | $0.1 | 131K | ✓ | |
| gpt-oss-120b | $0.35 | $0.75 | 131K | ✓ | |
| qwen3-235b-instruct | $0.6 | $1.2 | 131K | ✓ | |
| zai-glm-4.7 | $2.25 | $2.75 | 131K | ✓ | |
| deepseek-r1-distill-llama-70b | Free | | 131K | | ✓ |
| deepseek-r1-distill-llama-8b | Free | | 131K | | ✓ |
| llama-3.3-70b | Free | | 131K | ✓ | |
| llama-4-scout-17b-16e-instruct | Free | | 131K | ✓ | |
| qwen-2.5-32b | Free | | 131K | ✓ | |
| qwen-2.5-coder-32b | Free | | 131K | ✓ | |
| qwen3-32b | Free | | 131K | ✓ | |

DBRX and enterprise AI models. 29 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| databricks-gpt-5-nano | $0.05 | $0.4 | 200K |  |  |
| databricks-gpt-oss-20b | $0.07 | $0.3 | 131K |  |  |
| databricks-gemma-3-12b | $0.15 | $0.5 | 131K |  |  |
| databricks-gpt-oss-120b | $0.15 | $0.6 | 131K |  |  |
| databricks-meta-llama-3-1-8b-instruct | $0.15 | $0.45 | 131K | ✓ |  |
| databricks-qwen3-next-80b-a3b-instruct | $0.15 | $1.2 | 131K | ✓ |  |
| databricks-gpt-5-4-nano | $0.2 | $1.25 | 128K |  |  |
| databricks-gemini-3-1-flash-lite | $0.25 | $1.5 | 128K |  |  |
| databricks-gpt-5-1-codex-mini | $0.25 | $2 | 200K |  |  |
| databricks-gpt-5-mini | $0.25 | $2 | 200K |  |  |
| databricks-gemini-2-5-flash | $0.3 | $2.5 | 128K |  |  |
| databricks-llama-4-maverick | $0.5 | $1.5 | 131K | ✓ |  |
| databricks-meta-llama-3-3-70b-instruct | $0.5 | $1.5 | 131K | ✓ |  |
| databricks-gemini-3-flash | $0.63 | $3.75 | 128K |  |  |
| databricks-gpt-5-4-mini | $0.75 | $4.5 | 128K |  |  |
| databricks-claude-haiku-4-5 | $1 | $5 | 200K |  |  |
| databricks-gemini-2-5-pro | $1.25 | $10 | 128K |  |  |
| databricks-gpt-5-1-codex-max | $1.25 | $10 | 200K |  |  |
| databricks-gpt-5-1 | $1.25 | $10 | 200K |  |  |
| databricks-gpt-5 | $1.25 | $10 | 200K |  |  |
| databricks-gpt-5-2-codex | $1.75 | $14 | 200K |  |  |
| databricks-gpt-5-2 | $1.75 | $14 | 200K |  |  |
| databricks-gemini-3-1-pro | $2.5 | $15 | 128K |  |  |
| databricks-gpt-5-4 | $2.5 | $15 | 128K |  |  |
| databricks-claude-sonnet-4-5 | $3 | $15 | 200K |  |  |
| databricks-claude-sonnet-4 | $3 | $15 | 200K |  |  |
| databricks-claude-opus-4-5 | $5 | $25 | 200K |  |  |
| databricks-gpt-5-5 | $5 | $30 | 128K |  |  |
| databricks-claude-opus-4-1 | $15 | $75 | 200K |  |  |
Qwen: multilingual models from Alibaba Cloud. 62 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| qwen-flash | $0.15 | $1.5 | ? | ✓ | ✓ |
| qwen3.5-flash-2026-02-23 | $0.2 | $2 | 1M | ✓ |  |
| qwen3.5-flash | $0.2 | $2 | 1M | ✓ |  |
| qwen-flash-character | $0.25 | $1.5 | ? | ✓ | ✓ |
| qwen-turbo | $0.3 | $0.6 | ? | ✓ | ✓ |
| qwen3-0.6b | $0.3 | $1.2 | ? | ✓ | ✓ |
| qwen3-1.7b | $0.3 | $1.2 | ? | ✓ | ✓ |
| qwen3-4b | $0.3 | $1.2 | ? | ✓ | ✓ |
| qwen-omni-turbo | $0.4 | $25 | ? | ✓ | ✓ |
| qwen3.5-35b-a3b | $0.4 | $3.2 | 256K | ✓ |  |
| qwen-long-2025-01-25 | $0.5 | $2 | ? | ✓ | ✓ |
| qwen-long-latest | $0.5 | $2 | ? | ✓ | ✓ |
| qwen-long | $0.5 | $2 | ? | ✓ | ✓ |
| qwen2.5-7b-instruct-1m | $0.5 | $1 | ? | ✓ | ✓ |
| qwen2.5-7b-instruct | $0.5 | $1 | ? | ✓ | ✓ |
| qwen3-8b | $0.5 | $2 | ? | ✓ | ✓ |
| qwen-mt-lite | $0.6 | $1.6 | ? | ✓ | ✓ |
| qwen2.5-omni-7b | $0.6 | $38 | ? | ✓ | ✓ |
| qwen3.5-27b | $0.6 | $4.8 | 256K | ✓ |  |
| qwen-mt-flash | $0.7 | $1.95 | ? | ✓ | ✓ |
| qwen-mt-turbo | $0.7 | $1.95 | ? | ✓ | ✓ |
| qwen3-30b-a3b-instruct-2507 | $0.75 | $3 | ? | ✓ | ✓ |
| qwen3-30b-a3b | $0.75 | $3 | ? | ✓ | ✓ |
| qwen-plus-character | $0.8 | $2 | ? | ✓ | ✓ |
| qwen-plus | $0.8 | $2 | ? | ✓ | ✓ |
| qwen3.5-122b-a10b | $0.8 | $6.4 | 256K | ✓ |  |
| qwen3.5-plus-2026-02-15 | $0.8 | $4.8 | 1M | ✓ |  |
| qwen3.5-plus | $0.8 | $4.8 | 1M | ✓ |  |
| qwen2.5-14b-instruct-1m | $1 | $3 | ? | ✓ | ✓ |
| qwen2.5-14b-instruct | $1 | $3 | ? | ✓ | ✓ |
| qwen3-14b | $1 | $4 | ? | ✓ | ✓ |
| qwen3-coder-flash-2025-07-28 | $1 | $4 | ? | ✓ | ✓ |
| qwen3-coder-flash | $1 | $4 | ? | ✓ | ✓ |
| qwen3-coder-next | $1 | $4 | ? | ✓ | ✓ |
| qwen3-next-80b-a3b-instruct | $1 | $4 | ? | ✓ | ✓ |
| qwen2.5-vl-3b-instruct | $1.2 | $3.6 | ? | ✓ | ✓ |
| qwen3.5-397b-a17b | $1.2 | $7.2 | 256K | ✓ |  |
| qwen3.6-flash-2026-04-16 | $1.2 | $7.2 | 1M | ✓ |  |
| qwen3.6-flash | $1.2 | $7.2 | 1M | ✓ | ✓ |
| qwen3-coder-30b-a3b-instruct | $1.5 | $6 | ? | ✓ | ✓ |
| qwen-mt-plus | $1.8 | $5.4 | ? | ✓ | ✓ |
| qwen2.5-32b-instruct | $2 | $6 | ? | ✓ | ✓ |
| qwen2.5-vl-7b-instruct | $2 | $5 | ? | ✓ | ✓ |
| qwen3-235b-a22b-instruct-2507 | $2 | $8 | ? | ✓ | ✓ |
| qwen3-235b-a22b | $2 | $8 | ? | ✓ | ✓ |
| qwen3-32b | $2 | $8 | ? | ✓ | ✓ |
| qwen3.6-plus-2026-04-02 | $2 | $12 | 1M | ✓ |  |
| qwen3.6-plus | $2 | $12 | 1M | ✓ | ✓ |
| qwen-max | $2.4 | $9.6 | ? | ✓ | ✓ |
| qwen3-max-2026-01-23 | $2.5 | $10 | ? | ✓ | ✓ |
| qwen3-max | $2.5 | $10 | ? | ✓ | ✓ |
| qwen-plus-character-ja | $3.67 | $10.275 | ? | ✓ | ✓ |
| qwen2.5-72b-instruct | $4 | $12 | ? | ✓ | ✓ |
| qwen3-coder-plus-2025-07-22 | $4 | $16 | ? | ✓ | ✓ |
| qwen3-coder-plus-2025-09-23 | $4 | $16 | ? | ✓ | ✓ |
| qwen3-coder-plus | $4 | $16 | ? | ✓ | ✓ |
| qwen3-coder-480b-a35b-instruct | $6 | $24 | ? | ✓ | ✓ |
| qwen3-max-2025-09-23 | $6 | $24 | ? | ✓ | ✓ |
| qwen3-max-preview | $6 | $24 | ? | ✓ | ✓ |
| qwen2.5-vl-32b-instruct | $8 | $24 | ? | ✓ | ✓ |
| qwen3.6-max-preview | $9 | $54 | 256K | ✓ | ✓ |
| qwen2.5-vl-72b-instruct | $16 | $48 | ? | ✓ | ✓ |
Doubao: models from the TikTok parent company. 5 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| seed-1.6-flash | $0.07 | $0.3 | 262K | ✓ | ✓ |
| seed-2.0-mini | $0.1 | $0.4 | 262K | ✓ | ✓ |
| ui-tars-1.5-7b | $0.1 | $0.2 | 128K |  |  |
| seed-1.6 | $0.25 | $2 | 262K | ✓ | ✓ |
| seed-2.0-lite | $0.25 | $2 | 262K | ✓ | ✓ |
Chinese AI startup with competitive models. 21 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| M2-her | $2.1 | $8.4 | 64K |  |  |
| MiniMax-M2.1 | $2.1 | $8.4 | 204K |  |  |
| MiniMax-M2.5 | $2.1 | $8.4 | 204K |  |  |
| MiniMax-M2.7 | $2.1 | $8.4 | 204K |  |  |
| MiniMax-M2 | $2.1 | $8.4 | 204K |  |  |
| MiniMax-M2.1-highspeed | $4.2 | $16.8 | 204K |  |  |
| MiniMax-M2.5-highspeed | $4.2 | $16.8 | 204K |  |  |
| MiniMax-M2.7-highspeed | $4.2 | $16.8 | 204K |  |  |
| MiniMax-Hailuo-02 | $? | $? | ? |  |  |
| MiniMax-Hailuo-2.3-Fast | $? | $? | ? |  |  |
| MiniMax-Hailuo-2.3 | $? | $? | ? |  |  |
| image-01-live | $? | $? | ? |  |  |
| image-01 | $? | $? | ? |  |  |
| music-2.6 | $? | $? | ? |  |  |
| music-cover | $? | $? | ? |  |  |
| speech-02-hd | $? | $? | ? |  |  |
| speech-02-turbo | $? | $? | ? |  |  |
| speech-2.6-hd | $? | $? | ? |  |  |
| speech-2.6-turbo | $? | $? | ? |  |  |
| speech-2.8-hd | $? | $? | ? |  |  |
| speech-2.8-turbo | $? | $? | ? |  |  |
Kimi: long-context Chinese models. 16 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| moonshot-v1-8k-vision-preview | $2 | $10 | 8K |  |  |
| moonshot-v1-8k | $2 | $10 | 8K |  |  |
| kimi-k2-0711-preview | $4 | $16 | 131K |  |  |
| kimi-k2-0905-preview | $4 | $16 | 262K |  |  |
| kimi-k2-thinking | $4 | $16 | 262K |  | ✓ |
| kimi-k2.5 | $4 | $21 | 262K |  | ✓ |
| kimi-vl-a3b-thinking | $4 | $21 | 131K |  | ✓ |
| kimi-vl-a3b | $4 | $21 | 131K |  |  |
| moonshot-v1-32k-vision-preview | $5 | $20 | 32K |  |  |
| moonshot-v1-32k | $5 | $20 | 32K |  |  |
| kimi-k2.6-long | $6.5 | $27 | 262K |  | ✓ |
| kimi-k2.6 | $6.5 | $27 | 262K |  | ✓ |
| kimi-k2-thinking-turbo | $8 | $58 | 262K |  | ✓ |
| kimi-k2-turbo-preview | $8 | $58 | 262K |  |  |
| moonshot-v1-128k-vision-preview | $10 | $30 | 131K |  |  |
| moonshot-v1-128k | $10 | $30 | 131K |  |  |
Step: Chinese AI models with strong capabilities. 31 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| step-3.5-flash-2603 | $0.7 | $2.1 | 256K |  |  |
| step-3.5-flash | $0.7 | $2.1 | 256K |  |  |
| step-2-mini | $1 | $2 | 32K |  |  |
| step-3 | $1.5 | $4 | 64K |  |  |
| step-1o-turbo-vision | $2.5 | $8 | 32K |  |  |
| step-r1-v-mini | $2.5 | $8 | 100K |  |  |
| step-1-8k | $5 | $20 | 8K |  |  |
| step-1v-8k | $5 | $20 | 8K |  |  |
| step-audio-2 | $10 | $70 | ? |  |  |
| stepaudio-2.5-chat | $10 | $25 | ? |  |  |
| stepaudio-2.5-realtime | $10 | $70 | ? |  |  |
| step-1-32k | $15 | $70 | 32K |  |  |
| step-1o-vision-32k | $15 | $70 | 32K |  |  |
| step-1v-32k | $15 | $70 | 32K |  |  |
| step-1o-audio | $25 | $60 | ? |  |  |
| step-2-16k-exp | $38 | $120 | 16K |  |  |
| step-2-16k | $38 | $120 | 16K |  |  |
| step-1x-edit | Free |  | ? |  |  |
| step-1x-medium | $? | $? | ? |  |  |
| step-2x-large | Free |  | ? |  |  |
| step-asr-1.1-stream | $? | $? | ? |  |  |
| step-asr-1.1 | $? | $? | ? |  |  |
| step-asr | $? | $? | ? |  |  |
| step-audio-r1.1 | Free |  | ? |  |  |
| step-gui | Free |  | ? |  |  |
| step-image-edit-2 | $? | $? | ? |  |  |
| step-tts-2 | $? | $? | ? |  |  |
| step-tts-mini | $? | $? | ? |  |  |
| stepaudio-2-asr-pro | $? | $? | ? |  |  |
| stepaudio-2.5-asr | $? | $? | ? |  |  |
| stepaudio-2.5-tts | $? | $? | ? |  |  |
ERNIE: models from China's search giant. 8 models available.
| Model | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|
| deepseek-v4-flash | $0.126 | $0.252 | 1M | ✓ | ✓ |
| deepseek-v3.2 | $0.252 | $0.378 | 131K | ✓ | ✓ |
| minimax-m2.5 | $0.27 | $1.08 | 196K | ✓ | ✓ |
| qianfan-ocr-fast | $0.68 | $2.81 | 65K |  |  |
| glm-5 | $0.7 | $2.24 | 202K | ✓ | ✓ |
| glm-5.1 | $0.98 | $3.08 | 202K | ✓ | ✓ |
| deepseek-v4-pro | $1.521 | $3.042 | 716K | ✓ | ✓ |
| cobuddy | Free |  | 131K | ✓ |  |
All data is sourced from first-party APIs, not third-party aggregators. Pricing, context windows, and capabilities are verified against official provider documentation. Aggregator providers (OpenRouter, Requesty, etc.) are labeled as such because they provide access to other providers' models.
Compare the top AI models for building autonomous agents. 1,080+ models with tool calling, the key capability for agentic workflows.
Models with all three agentic capabilities. Best for complex autonomous workflows.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K |
| hermes-4-llama-3.1-8b | nousresearch | $0.06 | $0.12 | 131K |
| zai-org--glm-4.7-flash | novitaai | $0.07 | $0.4 | 200K |
| Qwen--Qwen3-32B-TEE | chutes | $0.08 | $0.24 | 40K |
| Gemma-3-27b-it | nebius | $0.1 | $0.3 | 96K |
| Qwen3-32B | nebius | $0.1 | $0.3 | 128K |
| xiaomimimo--mimo-v2-flash | novitaai | $0.1 | $0.3 | 262K |
| Qwen--Qwen3-235B-A22B-Thinking-2507 | chutes | $0.11 | $0.6 | 262K |
| deepseek-v4-flash | baidu | $0.126 | $0.252 | 1M |
| google--gemma-4-31B-turbo-TEE | chutes | $0.13 | $0.38 | 131K |
| Hermes-4-70B | nebius | $0.13 | $0.4 | 128K |
| google--gemma-4-26b-a4b-it | novitaai | $0.13 | $0.4 | 262K |
Models that can both call tools and reason about when and how to use them. Essential for ReAct-style agents.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | inferencenet | $0.05 | $0.45 | 131K |
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K |
| qwen3-30b-a3b-fp8 | cloudflare | $0.051 | $0.335 | 40K |
| glm-4.7-flash | cloudflare | $0.06 | $0.4 | 131K |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K |
| hermes-4-llama-3.1-8b | nousresearch | $0.06 | $0.12 | 131K |
| seed-1.6-flash | bytedance | $0.07 | $0.3 | 262K |
| ring-2.6-1t | inclusionai | $0.07 | $0.62 | 262K |
| zai-org--glm-4.7-flash | novitaai | $0.07 | $0.4 | 200K |
| microsoft-phi-4-mini-reasoning | microsoft | $0.075 | $0.3 | 128K |
| Qwen--Qwen3-32B-TEE | chutes | $0.08 | $0.24 | 40K |
| gpt-oss-120b | clarifai | $0.09 | $0.36 | 131K |
Most affordable models with tool calling for budget-conscious agent deployments.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K |
| gpt-oss-20b | inferencenet | $0.03 | $0.15 | 131K |
| schematron-v2-turbo | inferencenet | $0.03 | $0.15 | 131K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| liquid-ai--LFM2-24B-A2B | togetherai | $0.03 | $0.12 | 131K |
| amazon-nova-micro | amazon | $0.035 | $0.14 | 128K |
| amazon-nova-micro | amazon-bedrock | $0.035 | $0.14 | 128K |
| mistral-nemo-12b-instruct--fp-8 | inferencenet | $0.0375 | $0.1 | 131K |
Zero-cost models for building and testing agents.
| Model | Provider | Context | Reasoning | Structured Output |
|---|---|---|---|---|
| openrouter--owl-alpha | openrouter | 1M |  | ✓ |
| deepseek--deepseek-v4-flash--free | openrouter | 1M | ✓ |  |
| qwen--qwen3-coder--free | openrouter | 1M |  |  |
| nvidia--nemotron-3-super-120b-a12b--free | openrouter | 1M | ✓ | ✓ |
| gemma-4-26b-a4b-it | auriko | 262K | ✓ | ✓ |
| gemma-4-31b-it | auriko | 262K | ✓ | ✓ |
| arcee-ai--trinity-large-thinking--free | openrouter | 262K | ✓ |  |
| google--gemma-4-26b-a4b-it--free | openrouter | 262K | ✓ | ✓ |
| google--gemma-4-31b-it--free | openrouter | 262K | ✓ | ✓ |
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | openrouter | 256K | ✓ |  |
Run agent models locally for full privacy and zero API costs at scale.
| Model | Provider | Context | Reasoning | Structured Output |
|---|---|---|---|---|
| google--gemma-4-31b-it | orcarouter | 1M |  |  |
| qwen--qwen3.5-flash-2026-02-23 | orcarouter | 1M |  |  |
| qwen--qwen3.5-flash | orcarouter | 1M |  |  |
| qwen--qwen3.6-flash-2026-04-16 | orcarouter | 1M |  |  |
| qwen--qwen3.6-flash | orcarouter | 1M |  |  |
| meta-llama-4-maverick-17b | amazon-bedrock | 1M |  |  |
| meta-llama-4-scout-17b | amazon-bedrock | 1M |  |  |
| minimax-m2-1 | amazon-bedrock | 1M |  |  |
| minimax-m2-5 | amazon-bedrock | 1M |  |  |
| minimax-m2 | amazon-bedrock | 1M |  |  |
Models with 128K+ context and tool calling for agents that need to process large documents or maintain long conversation history.
| Model | Provider | Context | Input $/1M | Reasoning |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | 262K | $0.01 |  |
| bdc-coder | inferencenet | 131K | $0.01 |  |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | 131K | $0.015 |  |
| granite-4.0-h-micro | cloudflare | 131K | $0.017 |  |
| llama-3.1-8b-instruct--fp-16 | inferencenet | 131K | $0.02 |  |
| schematron-3b | inferencenet | 131K | $0.02 |  |
| schematron-v3 | inferencenet | 131K | $0.02 |  |
| gpt-oss-20b | inferencenet | 131K | $0.03 |  |
| schematron-v2-turbo | inferencenet | 131K | $0.03 |  |
| qwen--qwen3-4b-fp8 | novitaai | 128K | $0.03 | ✓ |
| liquid-ai--LFM2-24B-A2B | togetherai | 131K | $0.03 |  |
| amazon-nova-micro | amazon | 128K | $0.035 |  |
| amazon-nova-micro | amazon-bedrock | 128K | $0.035 |  |
| mistral-nemo-12b-instruct--fp-8 | inferencenet | 131K | $0.0375 |  |
| klusterai--Meta-Llama-3.3-70B-Instruct-Turbo | klusterai | 131K | $0.038 |  |
All data is sourced from first-party APIs. Agentic capability is defined by tool calling (function calling), reasoning (chain-of-thought), and structured output (JSON mode). Aggregator providers are excluded from ranking tables to avoid duplicate models.
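The three-part definition of agentic capability can be sketched as a small classifier. This is only an illustration under stated assumptions: the `Model` record, its field names, and the tier labels are hypothetical and are not the catalog's actual schema.

```python
from dataclasses import dataclass


# Hypothetical record shape; every field name here is an assumption.
@dataclass
class Model:
    name: str
    provider: str
    tool_call: bool = False
    reasoning: bool = False
    structured_output: bool = False
    is_aggregator: bool = False


def agentic_tier(m: Model) -> str:
    """Classify a model by which of the three agentic capabilities it has."""
    if not m.tool_call:
        return "none"  # tool calling is treated as the baseline requirement
    if m.reasoning and m.structured_output:
        return "full"
    if m.reasoning:
        return "tool+reasoning"
    return "tool-only"


def rank_candidates(models):
    """Drop aggregator providers, then keep only models that can call tools."""
    return [m for m in models if not m.is_aggregator and agentic_tier(m) != "none"]


# Placeholder entries, not real catalog rows.
sample = [
    Model("example-a", "provider-1", True, True, True),
    Model("example-b", "provider-2", True, True, False),
    Model("example-c", "provider-3", False, True, True),
    Model("example-d", "some-aggregator", True, True, True, is_aggregator=True),
]
for m in rank_candidates(sample):
    print(m.name, agentic_tier(m))
```

Treating tool calling as the gate mirrors how the tables above are grouped: every agent table requires it, and reasoning and structured output refine the tier.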
Compare the top AI models for code generation, debugging, and software development. Real pricing, context windows, and capabilities from first-party data.
The most capable models for complex coding tasks. Higher price, highest quality.
| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|---|
| gpt-4.1 | openai | $2 | $8 | 1M | ✓ |  |
| gpt-4o | openai | $2.5 | $10 | 128K | ✓ |  |
| gemini-2.5-pro | deepinfra | $1.25 | $10 | 1M |  | ✓ |
| deepseek-r1 | amazon-bedrock | $1.35 | $5.4 | 65K |  |  |
Great coding performance at lower prices. Perfect for high-volume code generation.
| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|---|
| gpt-4o-mini | openai | $0.15 | $0.6 | 128K | ✓ |  |
| gemini-2.5-flash | deepinfra | $0.3 | $2.5 | 1M |  | ✓ |
| deepseek-v3 | deepinfra | $0.32 | $0.89 | 163K |  |  |
| deepseek-r1 | amazon-bedrock | $1.35 | $5.4 | 65K |  |  |
Zero-cost models for learning, prototyping, and personal projects.
| Model | Provider | Context | Tool Call | Reasoning |
|---|---|---|---|---|
| openrouter--owl-alpha | openrouter | 1M | ✓ |  |
| deepseek--deepseek-v4-flash--free | openrouter | 1M | ✓ | ✓ |
| qwen--qwen3-coder--free | openrouter | 1M | ✓ |  |
| nvidia--nemotron-3-super-120b-a12b--free | openrouter | 1M | ✓ | ✓ |
| gemma-4-26b-a4b-it | auriko | 262K | ✓ | ✓ |
| gemma-4-31b-it | auriko | 262K | ✓ | ✓ |
| arcee-ai--trinity-large-thinking--free | openrouter | 262K | ✓ | ✓ |
| google--gemma-4-26b-a4b-it--free | openrouter | 262K | ✓ | ✓ |
| google--gemma-4-31b-it--free | openrouter | 262K | ✓ | ✓ |
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | openrouter | 256K | ✓ | ✓ |
Download and run locally for full privacy and zero API costs at scale.
| Model | Provider | Context | Tool Call | Reasoning |
|---|---|---|---|---|
| google--gemma-4-31b-it | orcarouter | 1M | ✓ |  |
| qwen--qwen3.5-flash-2026-02-23 | orcarouter | 1M | ✓ |  |
| qwen--qwen3.5-flash | orcarouter | 1M | ✓ |  |
| qwen--qwen3.6-flash-2026-04-16 | orcarouter | 1M | ✓ |  |
| qwen--qwen3.6-flash | orcarouter | 1M | ✓ |  |
| meta-llama-4-maverick-17b | amazon-bedrock | 1M | ✓ |  |
| meta-llama-4-scout-17b | amazon-bedrock | 1M | ✓ |  |
| minimax-m2-1 | amazon-bedrock | 1M | ✓ |  |
| minimax-m2-5 | amazon-bedrock | 1M | ✓ |  |
| minimax-m2 | amazon-bedrock | 1M | ✓ |  |
Models with 128K+ context for working with large codebases, multiple files, and long conversations.
| Model | Provider | Context | Input $/1M | Tool Call |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | 262K | $0.01 | ✓ |
| bdc-coder | inferencenet | 131K | $0.01 | ✓ |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | 131K | $0.015 | ✓ |
| granite-4.0-h-micro | cloudflare | 131K | $0.017 | ✓ |
| llama-3.1-8b-instruct--fp-16 | inferencenet | 131K | $0.02 | ✓ |
| schematron-3b | inferencenet | 131K | $0.02 | ✓ |
| schematron-v3 | inferencenet | 131K | $0.02 | ✓ |
| gpt-oss-20b | inferencenet | 131K | $0.03 | ✓ |
| schematron-v2-turbo | inferencenet | 131K | $0.03 | ✓ |
| qwen--qwen3-4b-fp8 | novitaai | 128K | $0.03 | ✓ |
| liquid-ai--LFM2-24B-A2B | togetherai | 131K | $0.03 | ✓ |
| amazon-nova-micro | amazon | 128K | $0.035 | ✓ |
| amazon-nova-micro | amazon-bedrock | 128K | $0.035 | ✓ |
| mistral-nemo-12b-instruct--fp-8 | inferencenet | 131K | $0.0375 | ✓ |
| klusterai--Meta-Llama-3.3-70B-Instruct-Turbo | klusterai | 131K | $0.038 | ✓ |
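A "128K+ context" filter like the one behind this table needs the display labels turned back into numbers first. The sketch below assumes that `K` means thousands and `M` means millions of tokens, that `?` means unknown, and that the threshold is 128,000 tokens. The row keys (`context`, `input_price`, `tool_call`) are hypothetical names chosen for illustration; the sample values are taken from the tables on this page.

```python
from typing import Optional


def parse_context(label: str) -> Optional[int]:
    """Convert a label such as "131K" or "1M" to a token count; "?" gives None."""
    text = label.strip().upper()
    multipliers = {"K": 1_000, "M": 1_000_000}
    if text and text[-1] in multipliers:
        try:
            return int(float(text[:-1]) * multipliers[text[-1]])
        except ValueError:
            return None
    return int(text) if text.isdigit() else None


def long_context_tool_models(rows, min_tokens=128_000):
    """Keep rows with tool calling and a known context >= min_tokens, cheapest input first."""
    kept = [
        r for r in rows
        if r["tool_call"] and (parse_context(r["context"]) or 0) >= min_tokens
    ]
    return sorted(kept, key=lambda r: r["input_price"])


rows = [
    {"name": "qwen3-30b-a3b-fp8", "context": "40K", "input_price": 0.051, "tool_call": True},
    {"name": "granite-4.0-h-micro", "context": "131K", "input_price": 0.017, "tool_call": True},
    {"name": "openai--gpt-oss-20b", "context": "?", "input_price": 0.03, "tool_call": True},
    {"name": "bdc-coder", "context": "131K", "input_price": 0.01, "tool_call": True},
]
print([r["name"] for r in long_context_tool_models(rows)])
```

Rows with an unknown context are dropped rather than guessed, which is the conservative choice when the window size is what the agent depends on.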
Models with tool calling + reasoning, the key capabilities for AI coding agents (Cursor, Copilot, Devin-style).
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | inferencenet | $0.05 | $0.45 | 131K |
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K |
| qwen3-30b-a3b-fp8 | cloudflare | $0.051 | $0.335 | 40K |
| glm-4.7-flash | cloudflare | $0.06 | $0.4 | 131K |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K |
| hermes-4-llama-3.1-8b | nousresearch | $0.06 | $0.12 | 131K |
| seed-1.6-flash | bytedance | $0.07 | $0.3 | 262K |
| ring-2.6-1t | inclusionai | $0.07 | $0.62 | 262K |
| zai-org--glm-4.7-flash | novitaai | $0.07 | $0.4 | 200K |
| microsoft-phi-4-mini-reasoning | microsoft | $0.075 | $0.3 | 128K |
| Qwen--Qwen3-32B-TEE | chutes | $0.08 | $0.24 | 40K |
| gpt-oss-120b | clarifai | $0.09 | $0.36 | 131K |
All data is sourced from first-party APIs. Models are selected based on capabilities relevant to coding: tool calling (for agentic workflows), reasoning (for complex logic), large context (for codebases), and structured output (for parsing). Aggregator providers are excluded from ranking tables.
Compare the top AI models for image generation: DALL·E, Imagen, GPT-5 Image, Gemini, and more. Real pricing and capabilities from first-party data.
Purpose-built models for text-to-image generation. Best for art, design, and visual content creation.
| Model | Provider | Type | Key Feature |
|---|---|---|---|
| imagen-4.0-generate | google | Text → Image | Latest Imagen, highest quality |
| imagen-4.0-fast-generate | google | Text → Image | Fast generation, lower cost |
| imagen-3.0-generate | google | Text → Image | Stable v3, production-ready |
| imagen-3.0-fast-generate | google | Text → Image | Fast v3 variant |
| dall-e-3 | openai | Text → Image | Best prompt adherence, DALL·E quality |
| dall-e-2 | openai | Text → Image | Lower cost, good for simple images |
| step-2x-large | stepfun | Text → Image | High-quality Chinese + English |
| step-1x-medium | stepfun | Text → Image | Mid-tier, good balance |
| step-1x-edit | stepfun | Image Edit | Edit existing images |
| step-image-edit-2 | stepfun | Image Edit | Advanced editing v2 |
| image-01 | minimax | Text → Image | MiniMax image generation |
| image-01-live | minimax | Text → Image | Real-time generation |
Multimodal chat models that can generate images within a conversation. Best for agents and interactive applications.
| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|---|
| gpt-5-image-mini | openrouter | $2.50 | $2 | 400K |  | ✓ |
| gemini-3.1-flash-image | fastrouter | $0.25 | $1.50 | 65K |  | ✓ |
| gemini-2.5-flash-image | fastrouter | $0.30 | $2.50 | 32K |  |  |
| gemini-3.1-flash-image | auriko | $0.50 | $3 | 65K |  | ✓ |
| gemini-2.5-flash-image | auriko | $0.30 | $0.04 | 32K |  |  |
| amazon-nova-2.0-omni | amazon | $0.20 | $1.30 | 64K | ✓ | ✓ |
| gpt-5-image | openrouter | $10 | $10 | 400K |  | ✓ |
| gpt-5.4-image-2 | openrouter | $8 | $15 | 272K |  | ✓ |
| gemini-3-pro-image | fastrouter | $2 | $12 | 65K |  |  |
| gemini-3-pro-image | auriko | $2 | $12 | 131K |  | ✓ |
Most affordable options for high-volume image generation.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| amazon-nova-2.0-omni | amazon | $0.20 | $1.30 | 64K |
| gemini-3.1-flash-image | fastrouter | $0.25 | $1.50 | 65K |
| gemini-2.5-flash-image | fastrouter | $0.30 | $2.50 | 32K |
| gemini-2.5-flash-image | auriko | $0.30 | $0.04 | 32K |
| gemini-3.1-flash-image | auriko | $0.50 | $3 | 65K |
| gpt-5-image-mini | openrouter | $2.50 | $2 | 400K |
Models that support both image generation and function/tool calling, ideal for AI agents that create images.
| Model | Provider | Input $/1M | Output $/1M | Context | Reasoning |
|---|---|---|---|---|---|
| amazon-nova-2.0-omni | amazon | $0.20 | $1.30 | 64K | ✓ |
| gemini-3-pro-image | llmgateway | $2 | $12 | N/A |  |
| gemini-3.1-flash-image | llmgateway | $0.25 | $1.50 | N/A |  |
| gemini-2.5-flash-image | llmgateway | $0.30 | $30 | N/A |  |
Models with 64K+ context for detailed image descriptions, multi-image generation, and long conversations.
| Model | Provider | Context | Input $/1M | Output $/1M |
|---|---|---|---|---|
| gpt-5-image | openrouter | 400K | $10 | $10 |
| gpt-5-image-mini | openrouter | 400K | $2.50 | $2 |
| gpt-5.4-image-2 | openrouter | 272K | $8 | $15 |
| gemini-3-pro-image | auriko | 131K | $2 | $12 |
| gemini-3.1-flash-image | fastrouter | 65K | $0.25 | $1.50 |
| gemini-3-pro-image | fastrouter | 65K | $2 | $12 |
| gemini-3.1-flash-image | auriko | 65K | $0.50 | $3 |
| amazon-nova-2.0-omni | amazon | 64K | $0.20 | $1.30 |
| Use Case | Recommended Model | Why |
|---|---|---|
| Art & creative | imagen-4.0-generate | Highest quality, Google's latest |
| Product images | dall-e-3 | Best prompt adherence, consistent style |
| Chat + images | gpt-5-image-mini | Conversational image gen, 400K context |
| AI agents | amazon-nova-2.0-omni | Tool calling + reasoning + image output |
| High volume / cheap | gemini-2.5-flash-image | Lowest cost per image |
| Image editing | step-image-edit-2 | Purpose-built for editing |
| Chinese content | step-2x-large | Best Chinese + English generation |
All data is sourced from first-party APIs. Models are identified by having `image` in their `modalities.output` field. Dedicated image models (DALL·E, Imagen) have no chat context. Chat models with image output support both text and image generation in conversation. Aggregator providers are excluded from ranking tables.
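The `modalities.output` check described above could look roughly like the following. Only the `modalities.output` field name comes from the text. The dictionary layout, the helper names, and the rule that a dedicated image model outputs images but not text are assumptions made for this sketch.

```python
def has_modality(model: dict, direction: str, kind: str) -> bool:
    """True if `kind` is listed under model["modalities"][direction]; missing fields count as absent."""
    modalities = model.get("modalities") or {}
    return kind in (modalities.get(direction) or [])


def image_generators(models):
    """Models that list "image" among their output modalities."""
    return [m for m in models if has_modality(m, "output", "image")]


def is_dedicated_image_model(model: dict) -> bool:
    # Assumption: "dedicated" means image output without text output.
    return has_modality(model, "output", "image") and not has_modality(model, "output", "text")


# Placeholder entries, not real catalog rows.
catalog = [
    {"name": "chat-with-images", "modalities": {"input": ["text", "image"], "output": ["text", "image"]}},
    {"name": "image-only", "modalities": {"input": ["text"], "output": ["image"]}},
    {"name": "text-only", "modalities": {"input": ["text"], "output": ["text"]}},
    {"name": "no-metadata"},
]
for m in image_generators(catalog):
    kind = "dedicated" if is_dedicated_image_model(m) else "chat"
    print(m["name"], kind)
```

The same helper covers image understanding by switching the direction argument to `"input"`.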
Compare the top vision AI models: GPT-4o, Claude 4, Gemini, and 1,487 models with image understanding. Real pricing and capabilities from first-party data.
The top-tier multimodal models from each major provider, compared on pricing, context, and capabilities.
| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call | Reasoning |
|---|---|---|---|---|---|---|
| gpt-4o | openai | $2.50 | $10 | 128K | ✓ |  |
| gpt-4o-mini | openai | $0.15 | $0.60 | 128K | ✓ |  |
| o3 | openai | $2 | $8 | 200K | ✓ | ✓ |
| o4-mini | openai | $1.10 | $4.40 | 200K | ✓ | ✓ |
| claude-sonnet-4-20250514 | anthropic | $3 | $15 | 200K | ✓ | ✓ |
| claude-opus-4-20250514 | anthropic | $15 | $75 | 200K | ✓ | ✓ |
| gemini-2.5-pro | google | $1.25 | $10 | 1M | ✓ | ✓ |
| gemini-2.5-flash | google | $0.15 | $0.60 | 1M | ✓ | ✓ |
| deepseek-r1 | deepseek | $0.55 | $2.19 | 128K |  | ✓ |
| grok-3 | xai | $3 | $15 | 131K | ✓ | ✓ |
| qwen3-235b-a22b | alibaba | $0.14 | $0.42 | 128K | ✓ | ✓ |
| llama4-maverick | meta | $0.20 | $0.80 | 1M | ✓ |  |
Most affordable models with image understanding, ideal for high-volume applications.
| Model | Provider | Input $/1M | Output $/1M | Context | Tool Call |
|---|---|---|---|---|---|
| gemini-2.0-flash-lite | google | $0.075 | $0.30 | 1M | ✓ |
| gemini-2.5-flash | google | $0.15 | $0.60 | 1M | ✓ |
| gpt-4o-mini | openai | $0.15 | $0.60 | 128K | ✓ |
| qwen3-235b-a22b | alibaba | $0.14 | $0.42 | 128K | ✓ |
| llama4-maverick | meta | $0.20 | $0.80 | 1M | ✓ |
| deepseek-chat | deepseek | $0.14 | $0.28 | 128K |  |
Vision models available at zero cost, perfect for prototyping, learning, and small projects.
| Model | Provider | Context | Tool Call | Reasoning |
|---|---|---|---|---|
| gemini-2.0-flash | google | 1M | ✓ |  |
| gemini-2.5-flash | google | 1M | ✓ | ✓ |
| gemma3-4b | google | 128K |  |  |
| llama4-scout-17b-16e | meta | 10M |  |  |
| qwen3-30b-a3b | alibaba | 128K |  | ✓ |
1,179 models that support both image understanding and function/tool calling, essential for AI agents that process images.
| Model | Provider | Input $/1M | Output $/1M | Context | Reasoning |
|---|---|---|---|---|---|
| gemini-2.0-flash-lite | google | $0.075 | $0.30 | 1M |  |
| gemini-2.5-flash | google | $0.15 | $0.60 | 1M | ✓ |
| gpt-4o-mini | openai | $0.15 | $0.60 | 128K |  |
| qwen3-235b-a22b | alibaba | $0.14 | $0.42 | 128K | ✓ |
| claude-sonnet-4-20250514 | anthropic | $3 | $15 | 200K | ✓ |
| grok-3-mini | xai | $0.30 | $0.50 | 131K | ✓ |
1,267 models with 128K+ context for processing large documents, multiple images, and long conversations.
| Model | Provider | Context | Input $/1M | Output $/1M | Tool Call |
|---|---|---|---|---|---|
| llama4-scout-17b-16e | meta | 10M | N/A | N/A |  |
| gemini-2.5-pro | google | 1M | $1.25 | $10 | ✓ |
| gemini-2.5-flash | google | 1M | $0.15 | $0.60 | ✓ |
| llama4-maverick | meta | 1M | $0.20 | $0.80 | ✓ |
| claude-sonnet-4-20250514 | anthropic | 200K | $3 | $15 | ✓ |
| o3 | openai | 200K | $2 | $8 | ✓ |
| Use Case | Recommended Model | Why |
|---|---|---|
| Document OCR | gemini-2.5-pro | 1M context, best document understanding |
| Image chatbot | gpt-4o-mini | Low cost with tool calling, good quality |
| AI agents | claude-sonnet-4 | Best tool calling + reasoning + vision |
| High volume / cheap | gemini-2.0-flash-lite | Lowest cost at $0.075/M input |
| Medical imaging | o3 | Reasoning + vision for complex analysis |
| Video analysis | gemini-2.5-flash | 1M context + video input + cheap |
| Prototyping | gemini-2.5-flash | Free tier, 1M context, all capabilities |
All data is sourced from first-party APIs. Models are identified by having `image` in their `modalities.input` field. Aggregator providers are excluded from ranking tables to avoid duplicate models. Pricing is per million tokens.
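Because prices are quoted per million tokens, the cost of a workload is the token count divided by one million, times the listed price, summed over input and output. The sketch below is one way to compute that. The function name and signature are hypothetical, and `Decimal` is used so that results do not pick up binary floating-point noise.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, given prices quoted per one million tokens (strings or Decimals)."""
    cost_in = Decimal(input_tokens) / PER_MILLION * Decimal(input_price)
    cost_out = Decimal(output_tokens) / PER_MILLION * Decimal(output_price)
    return cost_in + cost_out


# gpt-4o-mini as listed above: $0.15 input and $0.60 output per million tokens.
cost = request_cost(2_000_000, 500_000, "0.15", "0.60")
print(f"${cost:.2f}")
```

For this workload the arithmetic is 2 × $0.15 + 0.5 × $0.60 = $0.30 + $0.30 = $0.60.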
A comprehensive comparison of 4,587 AI models across 95 providers. Find the best model for your use case, whether you need the cheapest, the most capable, or the best for a specific task.
The most affordable models per million tokens, excluding aggregator providers.
| Model | Provider | Input $/M | Output $/M | Context | Capabilities |
|---|---|---|---|---|---|
| openai--gpt-image-1-mini | aimlapi | $0.007 | $0.676 | ? |  |
| mistralai--Mistral-Nemo-Instruct-2407 | klusterai | $0.008 | $0.001 | 131K |  |
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K | 🧠 Reason 👁️ Vision |
| ling-2.6-flash | inclusionai | $0.01 | $0.03 | 262K | 🔧 Tool |
| bdc-coder | inferencenet | $0.01 | $0.01 | 131K | 🔧 Tool |
| openai--gpt-image-1-model | aimlapi | $0.012 | $0.175 | ? |  |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | $0.015 | $0.02 | 131K | 🔧 Tool |
| granite-4.0-h-micro | cloudflare | $0.017 | $0.112 | 131K | 🔧 Tool |
| meta-llama-3.1-8b-instruct-turbo | deepinfra | $0.02 | $0.03 | 131K |  |
| meta-llama-3.1-8b-instruct | deepinfra | $0.02 | $0.05 | 131K |  |
| mistral-nemo-instruct-2407 | deepinfra | $0.02 | $0.04 | 131K |  |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K | 🧠 Reason 👁️ Vision |
| llama-3.1-8b-instruct--fp-16 | inferencenet | $0.02 | $0.03 | 131K | 🔧 Tool |
| schematron-3b | inferencenet | $0.02 | $0.05 | 131K | 🔧 Tool |
| schematron-v3 | inferencenet | $0.02 | $0.05 | 131K | 🔧 Tool |

81 models available at zero cost. Perfect for testing, prototyping, and learning.

| Model | Provider | Context | Capabilities |
|---|---|---|---|
| openrouter--owl-alpha | openrouter | 1M | 🔧 Tool |
| deepseek--deepseek-v4-flash--free | openrouter | 1M | 🔧 Tool 🧠 Reason |
| google--lyria-3-clip-preview | openrouter | 1M | 👁️ Vision |
| google--lyria-3-pro-preview | openrouter | 1M | 👁️ Vision |
| qwen--qwen3-coder--free | openrouter | 1M | 🔧 Tool |
| nvidia--nemotron-3-super-120b-a12b--free | openrouter | 1M | 🔧 Tool 🧠 Reason |
| gemma-4-26b-a4b-it | auriko | 262K | 🔧 Tool 🧠 Reason 👁️ Vision |
| gemma-4-31b-it | auriko | 262K | 🔧 Tool 🧠 Reason 👁️ Vision |
| arcee-ai--trinity-large-thinking--free | openrouter | 262K | 🔧 Tool 🧠 Reason |
| google--gemma-4-26b-a4b-it--free | openrouter | 262K | 🔧 Tool 🧠 Reason 👁️ Vision |
| google--gemma-4-31b-it--free | openrouter | 262K | 🔧 Tool 🧠 Reason 👁️ Vision |
| codestral | mistral | 256K | |
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | openrouter | 256K | 🔧 Tool 🧠 Reason 👁️ Vision |
| hunyuan-lite | tencent | 250K | |
| minimax--minimax-m2.5--free | openrouter | 204K | 🔧 Tool 🧠 Reason |

0 models optimized for code generation, completion, and understanding.

| Model | Provider | Input $/M | Output $/M | Context | Capabilities |
|---|---|---|---|---|---|

1,080 models with both tool calling and reasoning, the key capabilities for building AI agents.

| Model | Provider | Input $/M | Output $/M | Context | Capabilities |
|---|---|---|---|---|---|
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? | 🔧 Tool 🧠 Reason |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K | 🔧 Tool 🧠 Reason |
| gpt-oss-120b | inferencenet | $0.05 | $0.45 | 131K | 🔧 Tool 🧠 Reason |
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? | 🔧 Tool 🧠 Reason |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K | 🔧 Tool 🧠 Reason |
| qwen3-30b-a3b-fp8 | cloudflare | $0.051 | $0.335 | 40K | 🔧 Tool 🧠 Reason |
| glm-4.7-flash | cloudflare | $0.06 | $0.4 | 131K | 🔧 Tool 🧠 Reason |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K | 🔧 Tool 🧠 Reason |
| hermes-4-llama-3.1-8b | nousresearch | $0.06 | $0.12 | 131K | 🔧 Tool 🧠 Reason |
| seed-1.6-flash | bytedance | $0.07 | $0.3 | 262K | 🔧 Tool 🧠 Reason |

1,306 models with advanced reasoning capabilities.

| Model | Provider | Input $/M | Output $/M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| gpt-oss-20b | deepinfra | $0.03 | $0.14 | 131K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | deepinfra | $0.039 | $0.19 | 131K |
| nvidia-nemotron-nano-9b-v2 | deepinfra | $0.04 | $0.16 | 131K |
| openai--gpt-oss-20b | novitaai | $0.04 | $0.15 | 131K |
| nemotron-3-nano-30b-a3b | deepinfra | $0.05 | $0.2 | 262K |

1,487 models that can understand images and visual content.

| Model | Provider | Input $/M | Output $/M | Context |
|---|---|---|---|---|
| qwen3.5-0.8b | deepinfra | $0.01 | $0.05 | 262K |
| qwen3.5-2b | deepinfra | $0.02 | $0.1 | 262K |
| paddlepaddle--paddleocr-vl | novitaai | $0.02 | $0.02 | 16K |
| qwen3.5-4b | deepinfra | $0.03 | $0.15 | 262K |
| deepseek--deepseek-ocr-2 | novitaai | $0.03 | $0.03 | 8K |
| deepseek--deepseek-ocr | novitaai | $0.03 | $0.03 | 8K |
| reka-edge-2 | reka | $0.03 | $0.1 | 131K |
| zai-org--autoglm-phone-9b-multilingual | novitaai | $0.035 | $0.138 | 65K |
| gemini-1.5-flash-8b | deepinfra | $0.0375 | $0.15 | 1M |
| google-gemma-3-4b | amazon-bedrock | $0.04 | $0.08 | 131K |

Models with the largest context windows for processing long documents.

| Model | Provider | Context | Input $/M | Output $/M |
|---|---|---|---|---|
| meta-llama-4-scout | meta | 10M | $0.17 | $0.66 |
| gemini-1.5-pro | google | 2M | $1.25 | $5 |
| xai--grok-4-fast-non-reasoning | aimlapi | 2M | $0.52 | $1.3 |
| xai--grok-4-fast-reasoning | aimlapi | 2M | $0.52 | $1.3 |
| meta-llama-4-maverick-17b | amazon-bedrock | 1M | $0.24 | $0.97 |
| meta-llama-4-scout-17b | amazon-bedrock | 1M | $0.17 | $0.66 |
| minimax-m2-1 | amazon-bedrock | 1M | $0.3 | $1.2 |
| minimax-m2-5 | amazon-bedrock | 1M | $0.3 | $1.2 |
| minimax-m2 | amazon-bedrock | 1M | $0.3 | $1.2 |
| deepseek-v4-flash | baidu | 1M | $0.126 | $0.252 |

527 models with downloadable weights you can run locally.

| Model | Provider | Context | Capabilities |
|---|---|---|---|
| google--gemma-4-31b-it | orcarouter | 1M | 🔧 Tool |
| qwen--qwen3.5-flash-2026-02-23 | orcarouter | 1M | 🔧 Tool |
| qwen--qwen3.5-flash | orcarouter | 1M | 🔧 Tool |
| qwen--qwen3.6-flash-2026-04-16 | orcarouter | 1M | 🔧 Tool |
| qwen--qwen3.6-flash | orcarouter | 1M | 🔧 Tool |
| MiniMax-Text-01 | 302ai | 1M | |
| llama-4-maverick | 302ai | 1M | |
| llama-4-scout | 302ai | 1M | |
| meta-llama-4-maverick-17b | amazon-bedrock | 1M | 🔧 Tool |
| meta-llama-4-scout-17b | amazon-bedrock | 1M | 🔧 Tool |

All data is sourced from first-party APIs, not third-party aggregators. Pricing, context windows, and capabilities are verified against official provider documentation. Aggregator providers (OpenRouter, Requesty, etc.) are excluded from ranking tables to avoid duplicate models.

Data is auto-scraped and validated with Zod schemas. Last updated: 2025-05-21.

The definitive 2025 comparison: pricing, context windows, capabilities, benchmarks, and API features. GPT-4.1 vs Claude Sonnet 4 vs Gemini 2.5 Pro.

| Feature | GPT-4.1 | Claude Sonnet 4 | Gemini 2.5 Pro |
|---|---|---|---|
| Input price ($/M tokens) | $2.00 | $3.00 | $1.25 |
| Output price ($/M tokens) | $8.00 | $15.00 | $10.00 |
| Cached input ($/M tokens) | $0.50 | $0.30 | $0.07 |
| Context window (tokens) | 1,047,576 | 200,000 | 1,048,576 |
| Max output tokens | 32,768 | 64,000 | 65,536 |
| Free tier | No | Yes (limited) | Yes (generous) |
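
To turn the per-million-token prices above into a per-request figure, a rough calculation like the following may help. It is a simplified sketch: the `request_cost` helper is hypothetical, and it assumes cached input tokens are billed at the cache rate while ignoring cache-write fees, batch discounts, and tiered pricing.

```python
# Prices in USD per million tokens, copied from the pricing table above.
PRICES = {
    "GPT-4.1": {"input": 2.00, "output": 8.00, "cached_input": 0.50},
    "Claude Sonnet 4": {"input": 3.00, "output": 15.00, "cached_input": 0.30},
    "Gemini 2.5 Pro": {"input": 1.25, "output": 10.00, "cached_input": 0.07},
}


def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """USD cost of one request; cached_tokens is the share of the input billed at the cache rate."""
    p = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * p["input"]
        + cached_tokens * p["cached_input"]
        + output_tokens * p["output"]
    )
    return total / 1_000_000


# Example workload (assumed): 10,000 input tokens and 1,000 output tokens, nothing cached.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 10_000, 1_000):.4f}")
```

With this workload the lowest input price also gives the lowest total, but an output-heavy workload can change the ordering because the output prices are ranked differently from the input prices.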
| Capability | +GPT-4.1 | +Claude Sonnet 4 | +Gemini 2.5 Pro | +
|---|---|---|---|
| Tool calling | +โ | +โ | +โ | +
| Structured output | +โ | +โ | +โ | +
| Reasoning (extended thinking) | +โ (use o3) | +โ | +โ | +
| Vision (image input) | +โ | +โ | +โ | +
| Image generation | +โ (DALL-E) | +โ | +โ (Imagen) | +
| Audio input | +โ | +โ | +โ | +
| Audio output | +โ | +โ | +โ | +
| Video input | +โ | +โ | +โ | +
| PDF input | +โ | +โ | +โ | +
| Code execution | +โ | +โ (analysis tool) | +โ | +

| Benchmark | GPT-4.1 | Claude Sonnet 4 | Gemini 2.5 Pro |
|---|---|---|---|
| MMLU | ~90% | ~88% | ~90% |
| MATH-500 | ~85% | ~88% | ~91% |
| HumanEval | ~91% | ~93% | ~90% |
| SWE-bench Verified | ~65% | ~72% | ~63% |
| GPQA Diamond | ~72% | ~70% | ~78% |
| BFCL v3 (tool calling) | ~88% | ~86% | ~85% |
| Chatbot Arena (Elo) | ~1380 | ~1370 | ~1360 |
| Feature | OpenAI | Anthropic | Google |
|---|---|---|---|
| API maturity | +Most mature | +Mature | +Maturing | +
| SDK languages | +Python, Node, Go, etc. | +Python, Node | +Python, Node, Go, etc. | +
| Streaming | +โ SSE | +โ SSE | +โ SSE | +
| Function calling | +Parallel, strict mode | +Parallel, forced tool | +Parallel, auto | +
| Batch API | +โ (50% discount) | +โ (50% discount) | +โ (50% discount) | +
| Fine-tuning | +โ | +โ | +โ (limited) | +
| Rate limits | +Tier-based | +Tier-based | +Per-project | +

| Use Case | Best Budget Option | Price | Why |
|---|---|---|---|
| General chat | Gemini 2.5 Flash | Free | Strong quality at zero cost |
| Coding | DeepSeek V3 | $0.07/$0.27 | Near-frontier coding at 1/30th the price |
| Reasoning | DeepSeek R1 | Free | Top-tier reasoning at zero cost |
| Tool calling | Gemini 2.5 Flash | Free | Strong BFCL scores for free |
| Long context | Gemini 2.5 Flash | Free | 1M context window for free |
| Open source | Qwen3-235B | Free | Best open-weight model |

| If you need... | Choose | Because |
|---|---|---|
| Best overall value | Gemini 2.5 Pro | Lowest input price, 1M context, broadest capabilities |
| Best coding assistant | Claude Sonnet 4 | #1 on SWE-bench, 64K output, analysis tool |
| Best tool calling | GPT-4.1 | #1 on BFCL, parallel calls, strict mode |
| Best free option | Gemini 2.5 Flash | Free with 1M context, strong capabilities |
| Best reasoning | o3 / DeepSeek R1 | Reasoning models outperform standard models on math/science |
| Most mature API | OpenAI | Widest SDK support, fine-tuning, most integrations |

Find the most affordable AI models across 95 providers. All prices are per million tokens, from first-party data. Aggregator providers are excluded to avoid duplicates.

The absolute lowest-priced models across all providers.

+| # | +Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Tool Call | +
|---|---|---|---|---|---|---|
| 1 | +openai--gpt-image-1-mini | +aimlapi | +$0.007 | +$0.676 | +? | ++ |
| 2 | +mistralai--Mistral-Nemo-Instruct-2407 | +klusterai | +$0.008 | +$0.001 | +131K | ++ |
| 3 | +qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | ++ |
| 4 | +ling-2.6-flash | +inclusionai | +$0.01 | +$0.03 | +262K | +โ | +
| 5 | +bdc-coder | +inferencenet | +$0.01 | +$0.01 | +131K | +โ | +
| 6 | +openai--gpt-image-1-model | +aimlapi | +$0.012 | +$0.175 | +? | ++ |
| 7 | +klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | +โ | +
| 8 | +granite-4.0-h-micro | +cloudflare | +$0.017 | +$0.112 | +131K | +โ | +
| 9 | +meta-llama-3.1-8b-instruct-turbo | +deepinfra | +$0.02 | +$0.03 | +131K | ++ |
| 10 | +meta-llama-3.1-8b-instruct | +deepinfra | +$0.02 | +$0.05 | +131K | ++ |
| 11 | +mistral-nemo-instruct-2407 | +deepinfra | +$0.02 | +$0.04 | +131K | ++ |
| 12 | +qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | ++ |
| 13 | +llama-3.1-8b-instruct--fp-16 | +inferencenet | +$0.02 | +$0.03 | +131K | +โ | +
| 14 | +schematron-3b | +inferencenet | +$0.02 | +$0.05 | +131K | +โ | +
| 15 | +schematron-v3 | +inferencenet | +$0.02 | +$0.05 | +131K | +โ | +
| 16 | +Gemma-2-2b-it | +nebius | +$0.02 | +$0.06 | +8K | ++ |
| 17 | +Meta-Llama-3.1-8B-Instruct | +nebius | +$0.02 | +$0.06 | +131K | ++ |
| 18 | +meta-llama--llama-3.1-8b-instruct | +novitaai | +$0.02 | +$0.05 | +16K | ++ |
| 19 | +paddlepaddle--paddleocr-vl | +novitaai | +$0.02 | +$0.02 | +16K | ++ |
| 20 | +text-embedding-3-small | +openai | +$0.02 | +$0 | +8K | ++ |

Most affordable models that support function/tool calling, essential for agents and automation.

+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| ling-2.6-flash | +inclusionai | +$0.01 | +$0.03 | +262K | +
| bdc-coder | +inferencenet | +$0.01 | +$0.01 | +131K | +
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | +
| granite-4.0-h-micro | +cloudflare | +$0.017 | +$0.112 | +131K | +
| llama-3.1-8b-instruct--fp-16 | +inferencenet | +$0.02 | +$0.03 | +131K | +
| schematron-3b | +inferencenet | +$0.02 | +$0.05 | +131K | +
| schematron-v3 | +inferencenet | +$0.02 | +$0.05 | +131K | +
| gpt-oss-20b | +inferencenet | +$0.03 | +$0.15 | +131K | +
| schematron-v2-turbo | +inferencenet | +$0.03 | +$0.15 | +131K | +
| openai--gpt-oss-20b | +neuralwatt | +$0.03 | +$0.16 | +? | +
| qwen--qwen3-4b-fp8 | +novitaai | +$0.03 | +$0.03 | +128K | +
| liquid-ai--LFM2-24B-A2B | +togetherai | +$0.03 | +$0.12 | +131K | +
| amazon-nova-micro | +amazon | +$0.035 | +$0.14 | +128K | +
| amazon-nova-micro | +amazon-bedrock | +$0.035 | +$0.14 | +128K | +
| mistral-nemo-12b-instruct--fp-8 | +inferencenet | +$0.0375 | +$0.1 | +131K | +

Most affordable reasoning models: chain-of-thought for complex problems on a budget.

+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | +
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | +
| gpt-oss-20b | +deepinfra | +$0.03 | +$0.14 | +131K | +
| qwen3.5-4b | +deepinfra | +$0.03 | +$0.15 | +262K | +
| openai--gpt-oss-20b | +neuralwatt | +$0.03 | +$0.16 | +? | +
| qwen--qwen3-4b-fp8 | +novitaai | +$0.03 | +$0.03 | +128K | +
| gpt-oss-120b | +deepinfra | +$0.039 | +$0.19 | +131K | +
| nvidia-nemotron-nano-9b-v2 | +deepinfra | +$0.04 | +$0.16 | +131K | +
| openai--gpt-oss-20b | +novitaai | +$0.04 | +$0.15 | +131K | +
| nemotron-3-nano-30b-a3b | +deepinfra | +$0.05 | +$0.2 | +262K | +
| gpt-oss-120b | +inferencenet | +$0.05 | +$0.45 | +131K | +
| Qwen--Qwen3.6-35B-A3B | +neuralwatt | +$0.05 | +$0.1 | +? | +
| openai--gpt-oss-120b | +novitaai | +$0.05 | +$0.25 | +131K | +
| qwen3-30b-a3b-fp8 | +cloudflare | +$0.051 | +$0.335 | +40K | +
| glm-4.7-flash | +cloudflare | +$0.06 | +$0.4 | +131K | +

Most affordable models that can process images, for OCR, visual Q&A, and multimodal tasks.

+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | +
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | +
| paddlepaddle--paddleocr-vl | +novitaai | +$0.02 | +$0.02 | +16K | +
| qwen3.5-4b | +deepinfra | +$0.03 | +$0.15 | +262K | +
| deepseek--deepseek-ocr-2 | +novitaai | +$0.03 | +$0.03 | +8K | +
| deepseek--deepseek-ocr | +novitaai | +$0.03 | +$0.03 | +8K | +
| reka-edge-2 | +reka | +$0.03 | +$0.1 | +131K | +
| zai-org--autoglm-phone-9b-multilingual | +novitaai | +$0.035 | +$0.138 | +65K | +
| gemini-1.5-flash-8b | +deepinfra | +$0.0375 | +$0.15 | +1M | +
| google-gemma-3-4b | +amazon-bedrock | +$0.04 | +$0.08 | +131K | +
| gemma-3-12b-it | +deepinfra | +$0.04 | +$0.13 | +131K | +
| gemma-3-4b-it | +deepinfra | +$0.04 | +$0.08 | +131K | +
| qwen3.5-9b | +deepinfra | +$0.04 | +$0.15 | +262K | +
| openai--gpt-oss-20b | +novitaai | +$0.04 | +$0.15 | +131K | +
| llama-3.2-11b-vision-instruct | +cloudflare | +$0.049 | +$0.676 | +131K | +

Most affordable models with large context windows, for long documents, codebases, and conversations.

+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| mistralai--Mistral-Nemo-Instruct-2407 | +klusterai | +$0.008 | +$0.001 | +131K | +
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | +
| ling-2.6-flash | +inclusionai | +$0.01 | +$0.03 | +262K | +
| bdc-coder | +inferencenet | +$0.01 | +$0.01 | +131K | +
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | +
| granite-4.0-h-micro | +cloudflare | +$0.017 | +$0.112 | +131K | +
| meta-llama-3.1-8b-instruct-turbo | +deepinfra | +$0.02 | +$0.03 | +131K | +
| meta-llama-3.1-8b-instruct | +deepinfra | +$0.02 | +$0.05 | +131K | +
| mistral-nemo-instruct-2407 | +deepinfra | +$0.02 | +$0.04 | +131K | +
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | +
| llama-3.1-8b-instruct--fp-16 | +inferencenet | +$0.02 | +$0.03 | +131K | +
| schematron-3b | +inferencenet | +$0.02 | +$0.05 | +131K | +
| schematron-v3 | +inferencenet | +$0.02 | +$0.05 | +131K | +
| Meta-Llama-3.1-8B-Instruct | +nebius | +$0.02 | +$0.06 | +131K | +
| llama-3.2-1b-instruct | +cloudflare | +$0.027 | +$0.201 | +131K | +

The most affordable model from each provider. Find the best deal from your preferred provider.

+| Provider | +Cheapest Model | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| 01ai | +yi-lightning | +$1 | +$1 | +16K | +
| ai21 | +jamba-mini-2-2026-01 | +$0.2 | +$0.4 | +256K | +
| aimlapi | +openai--gpt-image-1-mini | +$0.007 | +$0.676 | +? | +
| aion | +aion-1.0-mini | +$0.7 | +$1.4 | +131K | +
| alibaba | +qwen-flash | +$0.15 | +$1.5 | +? | +
| amazon | +amazon-nova-micro | +$0.035 | +$0.14 | +128K | +
| amazon-bedrock | +amazon-nova-micro | +$0.035 | +$0.14 | +128K | +
| anthropic | +claude-haiku-4-5 | +$1 | +$5 | +200K | +
| arcee | +trinity-mini | +$0.04 | +$0.15 | +131K | +
| baichuan | +baichuan4-air | +$0.98 | +$0.98 | +32K | +
| baidu | +deepseek-v4-flash | +$0.126 | +$0.252 | +1M | +
| baseten | +gpt-oss-120b | +$0.1 | +$0.5 | +131K | +
| berget | +meta-llama--Llama-3.1-8B-Instruct | +$0.2 | +$0.2 | +? | +
| bytedance | +seed-1.6-flash | +$0.07 | +$0.3 | +262K | +
| cerebras | +llama3.1-8b | +$0.1 | +$0.1 | +131K | +
| chutes | +Qwen--Qwen3-32B-TEE | +$0.08 | +$0.24 | +40K | +
| clarifai | +gpt-oss-120b | +$0.09 | +$0.36 | +131K | +
| cloudferro-sherlock | +minimax-m2.5 | +$0.26 | +$1.04 | +1M | +
| cloudflare | +granite-4.0-h-micro | +$0.017 | +$0.112 | +131K | +
| databricks | +databricks-gpt-5-nano | +$0.05 | +$0.4 | +200K | +
| deepinfra | +qwen3.5-0.8b | +$0.01 | +$0.05 | +262K | +
| deepseek | +deepseek-chat | +$0.14 | +$0.28 | +1M | +
| digitalocean | +openai-gpt-oss-20b | +$0.05 | +$0.45 | +131K | +
| dinference | +gpt-oss-20b | +$0.07 | +$0.25 | +131K | +
| evroc | +Qwen--Qwen3-30B-A3B-Instruct | +$0.1 | +$0.8 | +40K | +
| fireworks | +gpt-oss-20b | +$0.07 | +$0.3 | +131K | +
| friendli | +meta-llama-3.1-8b-instruct | +$0.1 | +$0.1 | +131K | +
| gmicloud | +openai--gpt-oss-120b | +$0.07 | +$0.28 | +131K | +
| google | gemini-1.5-flash-8b | $0.075 | $0.3 | 1M |
| google-vertex | +gpt-oss-20b | +$0.07 | +$0.25 | +131K | +
| groq | +llama-3.1-8b-instant | +$0.05 | +$0.08 | +131K | +
| hpc-ai | +deepseek--deepseek-v4-flash | +$0.14 | +$0.28 | +1M | +
| hyperbolic | +meta-llama--Llama-3.1-8B-BF16-Base | +$0.1 | +$0.1 | +131K | +
| iflytek | +spark-ultra | +$0.8 | +$0.8 | +131K | +
| inception | +mercury-2 | +$0.25 | +$0.75 | +128K | +
| inclusionai | +ling-2.6-flash | +$0.01 | +$0.03 | +262K | +
| inferencenet | +bdc-coder | +$0.01 | +$0.01 | +131K | +
| klusterai | +mistralai--Mistral-Nemo-Instruct-2407 | +$0.008 | +$0.001 | +131K | +
| meta | +meta-llama-3.2-1b | +$0.1 | +$0.1 | +128K | +
| microsoft | +microsoft-phi-4-mini-reasoning | +$0.075 | +$0.3 | +128K | +
| minimax | +M2-her | +$2.1 | +$8.4 | +64K | +
| mistral | +ministral-3b | +$0.04 | +$0.04 | +128K | +
| mixlayer | +qwen--qwen3.5-9b | +$0.1 | +$0.4 | +131K | +
| moonshotai | +moonshot-v1-8k-vision-preview | +$2 | +$10 | +8K | +
| morph | +morph-compact | +$0.2 | +$0.5 | +1M | +
| nebius | +Gemma-2-2b-it | +$0.02 | +$0.06 | +8K | +
| neuralwatt | +openai--gpt-oss-20b | +$0.03 | +$0.16 | +? | +
| nousresearch | +hermes-3-llama-3.1-8b | +$0.06 | +$0.12 | +131K | +
| novitaai | +meta-llama--llama-3.1-8b-instruct | +$0.02 | +$0.05 | +16K | +
| openai | +text-embedding-3-small | +$0.02 | +$0 | +8K | +
| ovhcloud | +gpt-oss-20b | +$0.05 | +$0.18 | +131K | +
| perplexity | +sonar | +$1 | +$1 | +127K | +
| ppio | +qwen--qwen3-4b-fp8 | +$0.2145 | +$0.2145 | +128K | +
| privatemode | +gpt-oss-120b | +$0.43 | +$1.7 | +131K | +
| reka | +reka-edge-2 | +$0.03 | +$0.1 | +131K | +
| sambanova | +gpt-oss-120b | +$0.22 | +$0.59 | +131K | +
| scaleway | +gpt-oss-120b | +$0.15 | +$0.6 | +131K | +
| siliconflow | +gpt-oss-20b | +$0.04 | +$0.18 | +131K | +
| siliconflow-cn | +ling-mini-2.0 | +$0.5 | +$2 | +131K | +
| stepfun | +step-3.5-flash-2603 | +$0.7 | +$2.1 | +256K | +
| submodel | +openai--gpt-oss-120b | +$0.1 | +$0.5 | +131K | +
| tencent | +hunyuan-a13b | +$0.5 | +$2 | +224K | +
| tencent-tokenhub | +deepseek-v4-flash | +$1 | +$2 | +1M | +
| textsynth | +EleutherAI--gpt-j-6B | +$0.2 | +$2 | +2K | +
| togetherai | +liquid-ai--LFM2-24B-A2B | +$0.03 | +$0.12 | +131K | +
| upstage | +solar-embedding-1-large | +$0.1 | +$0 | +? | +
| voyage | +rerank-2.5-lite | +$0.02 | +$0 | +? | +
| vultr | +cosmos-reason-2-2b | +$0.55 | +$2.75 | +131K | +
| wafer | +Qwen3.5-397B-A17B | +$0.6 | +$3.6 | +262K | +
| writer | +palmyra-x5 | +$0.6 | +$6 | +1M | +
| xai | +xai-grok-4-fast | +$0.2 | +$0.5 | +131K | +
| xiaomi | +mimo-v2-flash | +$0.1 | +$0.3 | +262K | +
| zhipuai | +glm-4-flashx-250414 | +$0.1 | +$0.1 | +128K | +

All data is sourced from first-party APIs, not third-party aggregators. Prices are per million tokens as listed by each provider. Aggregator providers (OpenRouter, Requesty, etc.) are excluded from ranking tables to avoid duplicate models. Actual costs may vary based on usage patterns, caching, and batch discounts.
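
One way usage patterns change the picture: ranking by input price alone can mislead when a workload produces a meaningful share of output tokens. The sketch below re-ranks four rows from the tables above by a blended price. The `blended_price` helper and the `output_share` parameter are illustrative assumptions, not part of the catalog.

```python
# (model, provider, input $/1M, output $/1M), taken from the pricing tables above.
MODELS = [
    ("openai--gpt-image-1-mini", "aimlapi", 0.007, 0.676),
    ("mistralai--Mistral-Nemo-Instruct-2407", "klusterai", 0.008, 0.001),
    ("bdc-coder", "inferencenet", 0.01, 0.01),
    ("qwen3.5-0.8b", "deepinfra", 0.01, 0.05),
]


def blended_price(input_price, output_price, output_share):
    """Average $/1M tokens when `output_share` (0..1) of all tokens are output tokens."""
    return (1 - output_share) * input_price + output_share * output_price


def rank(models, output_share):
    """Sort models from cheapest to most expensive for the given workload mix."""
    return sorted(models, key=lambda m: blended_price(m[2], m[3], output_share))


# Compare an input-only ranking with a workload where a quarter of the tokens are output.
for share in (0.0, 0.25):
    print(share, [m[0] for m in rank(MODELS, share)])
```

In this sample the model with the lowest input price drops to last place once a quarter of the tokens are output, because its output price is far higher than the others.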

Compare context windows across 4,587 AI models. Find the largest context LLMs for your use case, from 1M+ token monsters to compact 8K models.

+ +| # | +Model | +Provider | +Context | +Input $/1M | +Tool Call | +
|---|---|---|---|---|---|
| 1 | +meta-llama-4-scout | +meta | +10M | +$0.17 | +โ | +
| 2 | gemini-1.5-pro | google | 2M | $1.25 | ✅ |
| 3 | +xai--grok-4-fast-non-reasoning | +aimlapi | +2M | +$0.52 | +โ | +
| 4 | +xai--grok-4-fast-reasoning | +aimlapi | +2M | +$0.52 | +โ | +
| 5 | +meta-llama-4-maverick-17b | +amazon-bedrock | +1M | +$0.24 | +โ | +
| 6 | +meta-llama-4-scout-17b | +amazon-bedrock | +1M | +$0.17 | +โ | +
| 7 | +minimax-m2-1 | +amazon-bedrock | +1M | +$0.3 | +โ | +
| 8 | +minimax-m2-5 | +amazon-bedrock | +1M | +$0.3 | +โ | +
| 9 | +minimax-m2 | +amazon-bedrock | +1M | +$0.3 | +โ | +
| 10 | +deepseek-v4-flash | +baidu | +1M | +$0.126 | +โ | +
| 11 | +minimax-m2-5 | +baseten | +1M | +$0.3 | +โ | +
| 12 | +gpt-5-1 | +clarifai | +1M | +$1.5625 | +โ | +
| 13 | +deepseek-v4-flash | +deepinfra | +1M | +$0.14 | ++ |
| 14 | +llama-4-maverick-17b-128e-instruct-fp8 | +deepinfra | +1M | +$0.15 | ++ |
| 15 | +mimo-v2.5-pro | +deepinfra | +1M | +$1 | ++ |
| 16 | +llama-4-maverick | +digitalocean | +1M | +$0.25 | +โ | +
| 17 | +deepseek-v4-pro | +fireworks | +1M | +$1.74 | +โ | +
| 18 | +meta-llama--Llama-4-Maverick-17B-128E-Instruct-FP8 | +gmicloud | +1M | +$0.25 | +โ | +
| 19 | gemini-1.5-flash-8b | google | 1M | $0.075 | ✅ |
| 20 | gemini-1.5-flash | google | 1M | $0.075 | ✅ |
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| meta-llama-4-scout | +meta | +10M | +$0.17 | +$0.66 | +โ | ++ |
| gemini-1.5-pro | google | 2M | $1.25 | $5 | ✅ | |
| xai--grok-4-fast-non-reasoning | +aimlapi | +2M | +$0.52 | +$1.3 | +โ | ++ |
| xai--grok-4-fast-reasoning | +aimlapi | +2M | +$0.52 | +$1.3 | +โ | ++ |
| meta-llama-4-maverick-17b | +amazon-bedrock | +1M | +$0.24 | +$0.97 | +โ | ++ |
| meta-llama-4-scout-17b | +amazon-bedrock | +1M | +$0.17 | +$0.66 | +โ | ++ |
| minimax-m2-1 | +amazon-bedrock | +1M | +$0.3 | +$1.2 | +โ | ++ |
| minimax-m2-5 | +amazon-bedrock | +1M | +$0.3 | +$1.2 | +โ | ++ |
| minimax-m2 | +amazon-bedrock | +1M | +$0.3 | +$1.2 | +โ | ++ |
| deepseek-v4-flash | +baidu | +1M | +$0.126 | +$0.252 | +โ | +โ | +
| minimax-m2-5 | +baseten | +1M | +$0.3 | +$1.2 | +โ | ++ |
| gpt-5-1 | +clarifai | +1M | +$1.5625 | +$12.5 | +โ | ++ |
| deepseek-v4-flash | +deepinfra | +1M | +$0.14 | +$0.28 | ++ | โ | +
| llama-4-maverick-17b-128e-instruct-fp8 | +deepinfra | +1M | +$0.15 | +$0.6 | ++ | + |
| mimo-v2.5-pro | +deepinfra | +1M | +$1 | +$3 | ++ | โ | +
| ... and 78 more models | +||||||
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| deepseek-v4-pro | +baidu | +716K | +$1.521 | +$3.042 | +โ | +โ | +
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| openai--gpt-5-chat | +aimlapi | +400K | +$1.625 | +$13 | ++ | + |
| openai--gpt-5-mini | +aimlapi | +400K | +$0.325 | +$2.6 | +โ | ++ |
| openai--gpt-5-nano | +aimlapi | +400K | +$0.065 | +$0.52 | +โ | ++ |
| openai--gpt-5.1-chat-latest | +aimlapi | +400K | +$1.625 | +$13 | +โ | ++ |
| openai--gpt-5.1 | +aimlapi | +400K | +$1.625 | +$13 | +โ | ++ |
| openai--gpt-5.2 | +aimlapi | +400K | +$2.275 | +$18.2 | +โ | ++ |
| openai--gpt-5 | +aimlapi | +400K | +$1.625 | +$13 | +โ | ++ |
| llama-4-scout-17b-16e-instruct | +cloudflare | +327K | +$0.27 | +$0.85 | +โ | ++ |
| llama-4-scout-17b-16e-instruct | +deepinfra | +327K | +$0.08 | +$0.3 | ++ | + |
| meta-llama--Llama-4-Scout-17B-16E-Instruct | +gmicloud | +327K | +$0.08 | +$0.5 | +โ | ++ |
| llama-4-scout-17b-16e-instruct | +vultr | +327K | +$0.55 | +$2.75 | +โ | ++ |
| llama-4-scout-17b-16e | +vultr | +327K | +$0.55 | +$2.75 | ++ | + |
| amazon-nova-lite | +amazon | +300K | +$0.06 | +$0.24 | +โ | ++ |
| amazon-nova-pro | +amazon | +300K | +$0.8 | +$3.2 | +โ | ++ |
| amazon-nova-lite | +amazon-bedrock | +300K | +$0.06 | +$0.24 | +โ | ++ |
| ... and 172 more models | +||||||
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| hunyuan-lite | +tencent | +250K | +Free | ++ | + | + |
| hunyuan-a13b | +tencent | +224K | +$0.5 | +$2 | ++ | โ | +
| minimax-m2.5 | +dinference | +204K | +$0.22 | +$0.88 | ++ | + |
| minimax--minimax-m2.5 | +hpc-ai | +204K | +$0.3 | +$1.2 | +โ | +โ | +
| MiniMax-M2.1-highspeed | +minimax | +204K | +$4.2 | +$16.8 | ++ | + |
| MiniMax-M2.1 | +minimax | +204K | +$2.1 | +$8.4 | ++ | + |
| MiniMax-M2.5-highspeed | +minimax | +204K | +$4.2 | +$16.8 | ++ | + |
| MiniMax-M2.5 | +minimax | +204K | +$2.1 | +$8.4 | ++ | + |
| MiniMax-M2.7-highspeed | +minimax | +204K | +$4.2 | +$16.8 | ++ | + |
| MiniMax-M2.7 | +minimax | +204K | +$2.1 | +$8.4 | ++ | + |
| MiniMax-M2 | +minimax | +204K | +$2.1 | +$8.4 | ++ | + |
| minimax--minimax-m2.1 | +novitaai | +204K | +$0.3 | +$1.2 | +โ | ++ |
| minimax--minimax-m2.5-highspeed | +novitaai | +204K | +$0.6 | +$2.4 | +โ | +โ | +
| minimax--minimax-m2.5 | +novitaai | +204K | +$0.3 | +$1.2 | +โ | +โ | +
| minimax--minimax-m2.7 | +novitaai | +204K | +$0.3 | +$1.2 | +โ | +โ | +
| ... and 670 more models | +||||||
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| sonar | +perplexity | +127K | +$1 | +$1 | +โ | ++ |
| baidu--ernie-4.5-300b-a47b-paddle | +novitaai | +123K | +$0.28 | +$1.1 | ++ | + |
| baidu--ernie-4.5-vl-424b-a47b | +novitaai | +123K | +$0.42 | +$1.25 | ++ | โ | +
| baidu--ernie-4.5-300b-a47b-paddle | +ppio | +123K | +$2 | +$7 | ++ | + |
| baidu--ernie-4.5-vl-424b-a47b | +ppio | +123K | +$3 | +$9 | ++ | + |
| baidu--ernie-4.5-0.3b | +aimlapi | +120K | +Free | ++ | โ | ++ |
| baidu--ernie-4.5-21B-a3b | +novitaai | +120K | +$0.07 | +$0.28 | +โ | ++ |
| baidu--ernie-4.5-0.3b | +ppio | +120K | +Free | ++ | + | + |
| baidu--ernie-4.5-21B-a3b | +ppio | +120K | +$0.5 | +$2 | ++ | + |
| qwen3.6-27b | +vultr | +120K | +$0.55 | +$2.75 | ++ | + |
| step-r1-v-mini | +stepfun | +100K | +$2.5 | +$8 | ++ | + |
| google--gemma-3-27b-it | +novitaai | +98K | +$0.119 | +$0.2 | ++ | + |
| Gemma-3-27b-it | +nebius | +96K | +$0.1 | +$0.3 | +โ | +โ | +
| gemma-3-27b | +privatemode | +96K | +$0.77 | +$1.27 | ++ | + |
| gemma-4-31b | +privatemode | +96K | +$0.77 | +$1.27 | ++ | + |
| ... and 41 more models | +||||||
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| mistralai--mistral-nemo | +novitaai | +60K | +$0.04 | +$0.17 | ++ | + |
| Qwen--Qwen3-32B-TEE | +chutes | +40K | +$0.08 | +$0.24 | +โ | +โ | +
| qwen3-30b-a3b-fp8 | +cloudflare | +40K | +$0.051 | +$0.335 | +โ | +โ | +
| qwen3-14b | +deepinfra | +40K | +$0.12 | +$0.24 | ++ | โ | +
| qwen3-30b-a3b | +deepinfra | +40K | +$0.09 | +$0.45 | ++ | โ | +
| qwen3-32b | +deepinfra | +40K | +$0.08 | +$0.28 | ++ | โ | +
| Qwen--Qwen3-30B-A3B-Instruct | +evroc | +40K | +$0.1 | +$0.8 | ++ | + |
| Qwen--Qwen3-VL-30B-A3B-Instruct | +evroc | +40K | +$0.2 | +$0.8 | ++ | + |
| Qwen--Qwen3-30B-A3B | +gmicloud | +40K | +$0.08 | +$0.25 | ++ | + |
| Qwen--Qwen3-32B-FP8 | +gmicloud | +40K | +$0.1 | +$0.6 | ++ | + |
| Qwen--Qwen3-235B-A22B-FP8 | +klusterai | +40K | +$0.13 | +$2 | +โ | +โ | +
| mistralai--Magistral-Small-2506 | +klusterai | +40K | +$0.1 | +$0.3 | ++ | + |
| qwen--qwen3-235b-a22b-fp8 | +novitaai | +40K | +$0.2 | +$0.8 | ++ | โ | +
| qwen--qwen3-30b-a3b-fp8 | +novitaai | +40K | +$0.09 | +$0.45 | +โ | +โ | +
| qwen--qwen3-32b-fp8 | +novitaai | +40K | +$0.1 | +$0.45 | ++ | โ | +
| ... and 59 more models | +||||||
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| baidu--ernie-4.5-vl-28b-a3b | +novitaai | +30K | +$0.14 | +$0.56 | +โ | +โ | +
| baidu--ernie-4.5-vl-28b-a3b | +ppio | +30K | +$1 | +$4 | ++ | + |
| gpt-oss-120b | +vultr | +30K | +$0.55 | +$2.75 | ++ | + |
| gpt-oss-20b | +vultr | +30K | +$0.55 | +$2.75 | ++ | + |
| hunyuan-large-role-latest | +tencent | +28K | +$2.4 | +$9.6 | ++ | + |
| hunyuan-t1-vision | +tencent | +28K | +$3 | +$9 | ++ | โ | +
| hunyuan-role | +tencent-tokenhub | +28K | +$2.4 | +$9.6 | ++ | + |
| hunyuan-turbos-vision-video | +tencent | +24K | +$3 | +$9 | ++ | + |
| hunyuan-turbos-vision | +tencent | +24K | +$3 | +$9 | ++ | + |
| hunyuan-vision-1.5-instruct | +tencent | +24K | +$3 | +$9 | ++ | + |
| autoglm-phone | +zhipuai | +20K | +Free | ++ | โ | ++ |
| gpt-3.5-turbo-16k | +openai | +16K | +$3 | +$4 | +โ | ++ |
| gpt-3.5-turbo | +openai | +16K | +$0.5 | +$1.5 | +โ | ++ |
| yi-lightning | +01ai | +16K | +$1 | +$1 | +โ | ++ |
| yi-medium | +01ai | +16K | +$2.5 | +$2.5 | +โ | ++ |
| ... and 64 more models | +||||||
| Model | +Provider | +Context | +Input $/1M | +Output $/1M | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| nvidia-nemotron-3-super-120b | +amazon-bedrock | +4K | +$0.15 | +$0.65 | ++ | + |
| nvidia-nemotron-nano-2-vl | +amazon-bedrock | +4K | +$0.2 | +$0.6 | ++ | + |
| nvidia-nemotron-nano-2 | +amazon-bedrock | +4K | +$0.06 | +$0.23 | ++ | + |
| nvidia-nemotron-nano-3-30b | +amazon-bedrock | +4K | +$0.06 | +$0.24 | ++ | + |
| llama-2-7b-chat-fp16 | +cloudflare | +4K | +$0.556 | +$6.667 | ++ | + |
| mythomax-l2-13b | +deepinfra | +4K | +$0.4 | +$0.4 | ++ | + |
| nvidia-nemotron-3-super-120b | +digitalocean | +4K | +$0.3 | +$0.65 | ++ | + |
| nemotron3-super | +inferencenet | +4K | +$2.5 | +$5 | ++ | + |
| gryphe--mythomax-l2-13b | +novitaai | +4K | +$0.09 | +$0.09 | ++ | + |
| nemotron-3-super-120b-a12b-bf16 | +vultr | +4K | +$0.55 | +$2.75 | ++ | + |
| hunyuan-translation-lite | +tencent | +4K | +$1 | +$3 | ++ | + |
| hunyuan-translation | +tencent | +4K | +$1.2 | +$3.6 | ++ | + |
| EleutherAI--gpt-j-6B | +textsynth | +2K | +$0.2 | +$2 | ++ | + |
Find the most affordable model in each context window tier.

| Context Tier | Cheapest Model | Provider | Context | Input $/1M |
|---|---|---|---|---|
| 1M+ Tokens | gemini-1.5-flash-8b | deepinfra | 1M | $0.0375 |
| 512K–1M Tokens | deepseek-v4-pro | baidu | 716K | $1.521 |
| 256K–512K Tokens | qwen3.5-0.8b | deepinfra | 262K | $0.01 |
| 128K–256K Tokens | mistralai--Mistral-Nemo-Instruct-2407 | klusterai | 131K | $0.008 |
| 64K–128K Tokens | zai-org--autoglm-phone-9b-multilingual | novitaai | 65K | $0.035 |
| 32K–64K Tokens | meta-llama--llama-3.2-3b-instruct | novitaai | 32K | $0.03 |
| 8K–32K Tokens | Gemma-2-2b-it | nebius | 8K | $0.02 |
All data is sourced from first-party APIs, not third-party aggregators. Context windows are as reported by each provider. Aggregator providers are excluded from ranking tables to avoid duplicate models.
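The tier ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the catalog's actual pipeline: the `Model` record, the tier lower bounds (powers of two below 1M), the exclusion of free models, and the exact token counts behind labels such as "262K" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    provider: str
    context_tokens: int      # assumed exact token count behind "131K", "262K", ...
    input_usd_per_m: float   # input price per 1M tokens; 0.0 means free


# (label, inclusive lower bound in tokens), widest tier first.
# The bounds are assumptions chosen so that 131K lands in 128K-256K, etc.
TIERS = [
    ("1M+", 1_000_000),
    ("512K-1M", 512 * 1024),
    ("256K-512K", 256 * 1024),
    ("128K-256K", 128 * 1024),
    ("64K-128K", 64 * 1024),
    ("32K-64K", 32 * 1024),
    ("8K-32K", 8 * 1024),
]


def tier_of(context_tokens):
    """Return the label of the widest tier this context window reaches, or None."""
    for label, lower_bound in TIERS:
        if context_tokens >= lower_bound:
            return label
    return None


def cheapest_per_tier(models):
    """Map each tier label to its cheapest paid model (first one wins on ties)."""
    best = {}
    for m in models:
        if m.input_usd_per_m <= 0:   # assumption: free models are ranked separately
            continue
        label = tier_of(m.context_tokens)
        if label is None:
            continue
        if label not in best or m.input_usd_per_m < best[label].input_usd_per_m:
            best[label] = m
    return best


# A few rows taken from the tables in this catalog.
SAMPLE = [
    Model("gemini-1.5-flash-8b", "deepinfra", 1_000_000, 0.0375),
    Model("qwen3.5-0.8b", "deepinfra", 262_144, 0.01),
    Model("ling-2.6-flash", "inclusionai", 262_144, 0.01),
    Model("mistralai--Mistral-Nemo-Instruct-2407", "klusterai", 131_072, 0.008),
    Model("bdc-coder", "inferencenet", 131_072, 0.01),
    Model("hunyuan-lite", "tencent", 250_000, 0.0),
    Model("Gemma-2-2b-it", "nebius", 8_192, 0.02),
]

for label, model in cheapest_per_tier(SAMPLE).items():
    print(f"{label}: {model.name} ({model.provider}) ${model.input_usd_per_m}/1M")
```

Ties keep the first model seen, so the result depends on input order; a real ranking would need an explicit tie-breaker such as output price.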
A complete, verified list of 81 AI models you can use for free: no credit card, no hidden fees. Data is sourced from first-party provider APIs.

These free models can handle the longest inputs, which makes them well suited to processing documents, codebases, and long conversations.
+| Model | +Provider | +Context | +Tool Call | +Reasoning | +Vision | +
|---|---|---|---|---|---|
| openrouter--owl-alpha | +openrouter | +1M | +โ | ++ | + |
| deepseek--deepseek-v4-flash--free | +openrouter | +1M | +โ | +โ | ++ |
| google--lyria-3-clip-preview | +openrouter | +1M | ++ | + | โ | +
| google--lyria-3-pro-preview | +openrouter | +1M | ++ | + | โ | +
| qwen--qwen3-coder--free | +openrouter | +1M | +โ | ++ | + |
| nvidia--nemotron-3-super-120b-a12b--free | +openrouter | +1M | +โ | +โ | ++ |
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | +โ | +โ | +
| gemma-4-31b-it | +auriko | +262K | +โ | +โ | +โ | +
| arcee-ai--trinity-large-thinking--free | +openrouter | +262K | +โ | +โ | ++ |
| google--gemma-4-26b-a4b-it--free | +openrouter | +262K | +โ | +โ | +โ | +
| google--gemma-4-31b-it--free | +openrouter | +262K | +โ | +โ | +โ | +
| codestral | +mistral | +256K | ++ | + | + |
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | +openrouter | +256K | +โ | +โ | +โ | +
| hunyuan-lite | +tencent | +250K | ++ | + | + |
| minimax--minimax-m2.5--free | +openrouter | +204K | +โ | +โ | ++ |
54 free models support tool/function calling, which is essential for building AI agents.
+| Model | +Provider | +Context | +Reasoning | +Vision | +
|---|---|---|---|---|
| openrouter--owl-alpha | +openrouter | +1M | ++ | + |
| deepseek--deepseek-v4-flash--free | +openrouter | +1M | +โ | ++ |
| qwen--qwen3-coder--free | +openrouter | +1M | ++ | + |
| nvidia--nemotron-3-super-120b-a12b--free | +openrouter | +1M | +โ | ++ |
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | +โ | +
| gemma-4-31b-it | +auriko | +262K | +โ | +โ | +
| arcee-ai--trinity-large-thinking--free | +openrouter | +262K | +โ | ++ |
| google--gemma-4-26b-a4b-it--free | +openrouter | +262K | +โ | +โ | +
| google--gemma-4-31b-it--free | +openrouter | +262K | +โ | +โ | +
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | +openrouter | +256K | +โ | +โ | +
| minimax--minimax-m2.5--free | +openrouter | +204K | +โ | ++ |
| z-ai--glm-5.1 | +openrouter | +202K | +โ | ++ |
| glm-4.5-flash | +auriko | +200K | +โ | ++ |
| glm-4.7-flash | +zhipuai | +200K | ++ | + |
| cobuddy | +baidu | +131K | ++ | + |
33 free models with advanced reasoning capabilities.
+| Model | +Provider | +Context | +Tool Call | +
|---|---|---|---|
| deepseek--deepseek-v4-flash--free | +openrouter | +1M | +โ | +
| nvidia--nemotron-3-super-120b-a12b--free | +openrouter | +1M | +โ | +
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | +
| gemma-4-31b-it | +auriko | +262K | +โ | +
| arcee-ai--trinity-large-thinking--free | +openrouter | +262K | +โ | +
| google--gemma-4-26b-a4b-it--free | +openrouter | +262K | +โ | +
| google--gemma-4-31b-it--free | +openrouter | +262K | +โ | +
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | +openrouter | +256K | +โ | +
| minimax--minimax-m2.5--free | +openrouter | +204K | +โ | +
| z-ai--glm-5.1 | +openrouter | +202K | +โ | +
27 free models that can understand images.
| Model | Provider | Context | Tool Call |
|---|---|---|---|
10 free models with downloadable weights you can run locally.
+| Model | +Provider | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|
| hunyuan-lite | +tencent | +250K | ++ | + |
| deepseek-r1-distill-llama-70b | +cerebras | +131K | ++ | โ | +
| deepseek-r1-distill-llama-8b | +cerebras | +131K | ++ | โ | +
| llama-3.3-70b | +cerebras | +131K | +โ | ++ |
| llama-4-scout-17b-16e-instruct | +cerebras | +131K | +โ | ++ |
| qwen-2.5-32b | +cerebras | +131K | +โ | ++ |
| qwen-2.5-coder-32b | +cerebras | +131K | +โ | ++ |
| qwen3-32b | +cerebras | +131K | +โ | ++ |
| qwen--qwen3.5-4b-free | +mixlayer | +131K | +โ | +โ | +
| voyage-4-nano | +voyage | +? | ++ | + |
Complete list of every free AI model in our catalog, sorted by context window size.
+| Model | +Provider | +Context | +Tool Call | +Reasoning | +Vision | +Open Weights | +
|---|---|---|---|---|---|---|
| openrouter--owl-alpha | +openrouter | +1M | +โ | ++ | + | + |
| deepseek--deepseek-v4-flash--free | +openrouter | +1M | +โ | +โ | ++ | + |
| google--lyria-3-clip-preview | +openrouter | +1M | ++ | + | โ | ++ |
| google--lyria-3-pro-preview | +openrouter | +1M | ++ | + | โ | ++ |
| qwen--qwen3-coder--free | +openrouter | +1M | +โ | ++ | + | + |
| nvidia--nemotron-3-super-120b-a12b--free | +openrouter | +1M | +โ | +โ | ++ | + |
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | +โ | +โ | ++ |
| gemma-4-31b-it | +auriko | +262K | +โ | +โ | +โ | ++ |
| arcee-ai--trinity-large-thinking--free | +openrouter | +262K | +โ | +โ | ++ | + |
| google--gemma-4-26b-a4b-it--free | +openrouter | +262K | +โ | +โ | +โ | ++ |
| google--gemma-4-31b-it--free | +openrouter | +262K | +โ | +โ | +โ | ++ |
| codestral | +mistral | +256K | ++ | + | + | + |
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | +openrouter | +256K | +โ | +โ | +โ | ++ |
| hunyuan-lite | +tencent | +250K | ++ | + | + | โ | +
| minimax--minimax-m2.5--free | +openrouter | +204K | +โ | +โ | ++ | + |
| z-ai--glm-5.1 | +openrouter | +202K | +โ | +โ | ++ | + |
| glm-4.5-flash | +auriko | +200K | +โ | +โ | ++ | + |
| glm-4.7-flash | +zhipuai | +200K | +โ | ++ | + | + |
| spotlight | +arcee | +131K | ++ | + | โ | ++ |
| cobuddy | +baidu | +131K | +โ | ++ | + | + |
| deepseek-r1-distill-llama-70b | +cerebras | +131K | ++ | โ | ++ | โ | +
| deepseek-r1-distill-llama-8b | +cerebras | +131K | ++ | โ | ++ | โ | +
| llama-3.3-70b | +cerebras | +131K | +โ | ++ | + | โ | +
| llama-4-scout-17b-16e-instruct | +cerebras | +131K | +โ | ++ | + | โ | +
| qwen-2.5-32b | +cerebras | +131K | +โ | ++ | + | โ | +
| qwen-2.5-coder-32b | +cerebras | +131K | +โ | ++ | + | โ | +
| qwen3-32b | +cerebras | +131K | +โ | ++ | + | โ | +
| gemma-3-12b-it | google | 131K | | | ✓ | |
| gemma-3-1b-it | google | 131K | | | | |
| gemma-3-27b-it | google | 131K | | | ✓ | |
| gemma-3-4b-it | google | 131K | | | ✓ | |
| gemma-3n-E2B-it | google | 131K | | | ✓ | |
| gemma-3n-E4B-it | google | 131K | | | ✓ | |
| glm-4-7-flash | +meganova | +131K | +โ | ++ | + | + |
| manta-flash-1.0 | +meganova | +131K | +โ | ++ | + | + |
| manta-mini-1.0 | +meganova | +131K | +โ | ++ | + | + |
| manta-pro-1.0 | +meganova | +131K | +โ | ++ | + | + |
| qwen--qwen3.5-4b-free | +mixlayer | +131K | +โ | +โ | ++ | โ | +
| baidu--cobuddy--free | +openrouter | +131K | +โ | +โ | ++ | + |
| openai--gpt-oss-120b--free | +openrouter | +131K | +โ | +โ | ++ | + |
| openai--gpt-oss-20b--free | +openrouter | +131K | +โ | +โ | ++ | + |
| poolside--laguna-m.1--free | +openrouter | +131K | +โ | +โ | ++ | + |
| poolside--laguna-xs.2--free | +openrouter | +131K | +โ | +โ | ++ | + |
| z-ai--glm-4.5-air--free | +openrouter | +131K | +โ | +โ | ++ | + |
| glm-4.6v-flash | +auriko | +128K | +โ | +โ | +โ | ++ |
| sarvam--sarvam-105b | +fastrouter | +128K | +โ | +โ | ++ | + |
| sarvam--sarvam-30b | +fastrouter | +128K | +โ | +โ | ++ | + |
| devstral | +mistral | +128K | +โ | ++ | + | + |
| nvidia--nemotron-nano-12b-v2-vl--free | +openrouter | +128K | +โ | +โ | +โ | ++ |
| glm-4-flash-250414 | +zhipuai | +128K | +โ | ++ | + | + |
| glm-4.6v-flash | +zhipuai | +128K | +โ | ++ | โ | ++ |
| baidu--ernie-4.5-0.3b | +aimlapi | +120K | +โ | ++ | + | + |
| baidu--ernie-4.5-0.3b | +ppio | +120K | ++ | + | + | + |
| qwen--qwen3-omni-30b-a3b-instruct | +novitaai | +65K | +โ | ++ | โ | ++ |
| qwen--qwen3-omni-30b-a3b-thinking | +novitaai | +65K | +โ | +โ | +โ | ++ |
| glm-4.1v-thinking-flash | +zhipuai | +64K | +โ | +โ | +โ | ++ |
| baichuan4 | +baichuan | +32K | ++ | + | + | + |
| autoglm-phone | +zhipuai | +20K | +โ | ++ | โ | ++ |
| glm-4v-flash | +zhipuai | +16K | +โ | ++ | โ | ++ |
| spark-lite | +iflytek | +8K | ++ | + | + | + |
| nvidia--nemotron-3-nano-omni | +aimlapi | +? | ++ | + | + | + |
| glm-4.7-flash | +auriko | +? | +โ | +โ | ++ | + |
| glm-4.5-flash | +llmgateway | +? | +โ | ++ | + | + |
| glm-4.6v-flash | +llmgateway | +? | +โ | +โ | +โ | ++ |
| glm-4.7-flash | +llmgateway | +? | +โ | +โ | ++ | + |
| cognitivecomputations--dolphin-mistral-24b-venice-edition--free | +openrouter | +? | ++ | + | + | + |
| liquid--lfm-2.5-1.2b-instruct--free | +openrouter | +? | ++ | + | + | + |
| liquid--lfm-2.5-1.2b-thinking--free | +openrouter | +? | ++ | โ | ++ | + |
| meta-llama--llama-3.2-3b-instruct--free | +openrouter | +? | ++ | + | + | + |
| meta-llama--llama-3.3-70b-instruct--free | +openrouter | +? | +โ | ++ | + | + |
| nousresearch--hermes-3-llama-3.1-405b--free | +openrouter | +? | ++ | + | + | + |
| nvidia--nemotron-3-nano-30b-a3b--free | +openrouter | +? | +โ | +โ | ++ | + |
| nvidia--nemotron-nano-9b-v2--free | +openrouter | +? | +โ | +โ | ++ | + |
| openrouter--free | +openrouter | +? | +โ | +โ | +โ | ++ |
| qwen--qwen3-next-80b-a3b-instruct--free | +openrouter | +? | +โ | ++ | + | + |
| step-1x-edit | +stepfun | +? | ++ | + | โ | ++ |
| step-2x-large | +stepfun | +? | ++ | + | โ | ++ |
| step-audio-r1.1 | +stepfun | +? | ++ | + | + | + |
| step-gui | +stepfun | +? | ++ | + | โ | ++ |
| voyage-4-nano | +voyage | +? | ++ | + | + | โ | +
| glm-ocr | +zhipuai | +? | +โ | ++ | โ | ++ |
Real pricing data for 4,587 AI models across 95 providers. All prices are per million tokens, sourced from first-party APIs. No third-party aggregators.

The most affordable models per million input tokens.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| openai--gpt-image-1-mini | +aimlapi | +$0.007 | +$0.676 | +? | +
| mistralai--Mistral-Nemo-Instruct-2407 | +klusterai | +$0.008 | +$0.001 | +131K | +
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | +
| ling-2.6-flash | +inclusionai | +$0.01 | +$0.03 | +262K | +
| bdc-coder | +inferencenet | +$0.01 | +$0.01 | +131K | +
| openai--gpt-image-1-model | +aimlapi | +$0.012 | +$0.175 | +? | +
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | +
| granite-4.0-h-micro | +cloudflare | +$0.017 | +$0.112 | +131K | +
| meta-llama-3.1-8b-instruct-turbo | +deepinfra | +$0.02 | +$0.03 | +131K | +
| meta-llama-3.1-8b-instruct | +deepinfra | +$0.02 | +$0.05 | +131K | +
| mistral-nemo-instruct-2407 | +deepinfra | +$0.02 | +$0.04 | +131K | +
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | +
| llama-3.1-8b-instruct--fp-16 | +inferencenet | +$0.02 | +$0.03 | +131K | +
| schematron-3b | +inferencenet | +$0.02 | +$0.05 | +131K | +
| schematron-v3 | +inferencenet | +$0.02 | +$0.05 | +131K | +
The most affordable models that support function/tool calling, which is essential for AI agents.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| ling-2.6-flash | +inclusionai | +$0.01 | +$0.03 | +262K | +
| bdc-coder | +inferencenet | +$0.01 | +$0.01 | +131K | +
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | +
| granite-4.0-h-micro | +cloudflare | +$0.017 | +$0.112 | +131K | +
| llama-3.1-8b-instruct--fp-16 | +inferencenet | +$0.02 | +$0.03 | +131K | +
| schematron-3b | +inferencenet | +$0.02 | +$0.05 | +131K | +
| schematron-v3 | +inferencenet | +$0.02 | +$0.05 | +131K | +
| gpt-oss-20b | +inferencenet | +$0.03 | +$0.15 | +131K | +
| schematron-v2-turbo | +inferencenet | +$0.03 | +$0.15 | +131K | +
| openai--gpt-oss-20b | +neuralwatt | +$0.03 | +$0.16 | +? | +
The most affordable models with advanced reasoning capabilities.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | +
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | +
| gpt-oss-20b | +deepinfra | +$0.03 | +$0.14 | +131K | +
| qwen3.5-4b | +deepinfra | +$0.03 | +$0.15 | +262K | +
| openai--gpt-oss-20b | +neuralwatt | +$0.03 | +$0.16 | +? | +
| qwen--qwen3-4b-fp8 | +novitaai | +$0.03 | +$0.03 | +128K | +
| gpt-oss-120b | +deepinfra | +$0.039 | +$0.19 | +131K | +
| nvidia-nemotron-nano-9b-v2 | +deepinfra | +$0.04 | +$0.16 | +131K | +
| openai--gpt-oss-20b | +novitaai | +$0.04 | +$0.15 | +131K | +
| nemotron-3-nano-30b-a3b | +deepinfra | +$0.05 | +$0.2 | +262K | +
The most affordable models that can understand images.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | +
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | +
| paddlepaddle--paddleocr-vl | +novitaai | +$0.02 | +$0.02 | +16K | +
| qwen3.5-4b | +deepinfra | +$0.03 | +$0.15 | +262K | +
| deepseek--deepseek-ocr-2 | +novitaai | +$0.03 | +$0.03 | +8K | +
| deepseek--deepseek-ocr | +novitaai | +$0.03 | +$0.03 | +8K | +
| reka-edge-2 | +reka | +$0.03 | +$0.1 | +131K | +
| zai-org--autoglm-phone-9b-multilingual | +novitaai | +$0.035 | +$0.138 | +65K | +
| gemini-1.5-flash-8b | +deepinfra | +$0.0375 | +$0.15 | +1M | +
| google-gemma-3-4b | +amazon-bedrock | +$0.04 | +$0.08 | +131K | +
The most affordable models with large context windows (128K+ tokens).
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| mistralai--Mistral-Nemo-Instruct-2407 | +klusterai | +$0.008 | +$0.001 | +131K | +
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | +
| ling-2.6-flash | +inclusionai | +$0.01 | +$0.03 | +262K | +
| bdc-coder | +inferencenet | +$0.01 | +$0.01 | +131K | +
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | +
| granite-4.0-h-micro | +cloudflare | +$0.017 | +$0.112 | +131K | +
| meta-llama-3.1-8b-instruct-turbo | +deepinfra | +$0.02 | +$0.03 | +131K | +
| meta-llama-3.1-8b-instruct | +deepinfra | +$0.02 | +$0.05 | +131K | +
| mistral-nemo-instruct-2407 | +deepinfra | +$0.02 | +$0.04 | +131K | +
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | +
How much do the top AI models cost? A side-by-side comparison of the most popular models.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| gpt-4.1 | +openai | +$2 | +$8 | +1M | +
| gpt-4o | +openai | +$2.5 | +$10 | +128K | +
| gpt-4o-mini | +openai | +$0.15 | +$0.6 | +128K | +
| gemini-2.5-pro | +deepinfra | +$1.25 | +$10 | +1M | +
| gemini-2.5-flash | +deepinfra | +$0.3 | +$2.5 | +1M | +
| llama-4-maverick | +digitalocean | +$0.25 | +$0.87 | +1M | +
| deepseek-r1 | +amazon-bedrock | +$1.35 | +$5.4 | +65K | +
| deepseek-v3 | +deepinfra | +$0.32 | +$0.89 | +163K | +
No models in the catalog currently list cache pricing, which can significantly reduce costs for repeated prompts. Where it is offered, cache pricing is typically 50-90% cheaper than standard input pricing.

| Model | Provider | Input $/1M | Cache $/1M | Savings |
|---|---|---|---|---|
All pricing data is sourced from first-party APIs, not third-party aggregators. Prices are per million tokens, with input and output priced separately. Aggregator providers (OpenRouter, Requesty, etc.) are excluded from ranking tables to avoid duplicate models. Cache pricing is shown where available.

Data is auto-scraped and validated with Zod schemas. Last updated: 2025-05-21.
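Because input and output are priced separately per million tokens, the cost of a single request is a weighted sum of its token counts. The sketch below is a rough illustration under stated assumptions: the function name and parameters are hypothetical, the `gpt-4.1` prices ($2 input, $8 output) come from the comparison table above, and the $1 cache price is an invented 50% discount, since no cache prices are listed in this catalog.

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_usd_per_m, output_usd_per_m,
                     cached_tokens=0, cache_usd_per_m=None):
    """Estimate the USD cost of one request from per-million-token prices.

    cached_tokens is the part of input_tokens billed at the cache rate.
    If no cache rate is given, cached tokens are billed at the input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    if cache_usd_per_m is None:
        cache_usd_per_m = input_usd_per_m
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * input_usd_per_m
             + cached_tokens * cache_usd_per_m
             + output_tokens * output_usd_per_m)
    return total / 1_000_000


# 10,000 input tokens and 2,000 output tokens at $2 / $8 per 1M tokens.
base = request_cost_usd(10_000, 2_000, 2.0, 8.0)

# Same request with 8,000 input tokens served from a cache at a
# hypothetical $1 per 1M tokens (50% below the input price).
with_cache = request_cost_usd(10_000, 2_000, 2.0, 8.0,
                              cached_tokens=8_000, cache_usd_per_m=1.0)

print(f"without cache: ${base:.4f}")
print(f"with cache:    ${with_cache:.4f}")
```

Output tokens often dominate the bill for generation-heavy workloads, so comparing models on input price alone can be misleading.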
1,548 models that see, hear, speak, and create, compared by pricing, context windows, and capabilities

The most capable multimodal models across all providers:

| Model | Provider | Context | Input | Output | Tool Call | Price (in/out per 1M) |
|---|---|---|---|---|---|---|
| gpt-4o | OpenAI | 128K | text, image | text | โ | $2.50/$10 |
| gpt-4.1 | OpenAI | 1M | text, image | text | โ | $2/$8 |
| claude-sonnet-4 | Anthropic | 200K | text, image | text | โ | $3/$15 |
| gemini-2.5-pro | Google | 1M | text, image, audio, video | text | โ | $1.25/$10 |
| gemini-2.5-flash | Google | 1M | text, image, audio, video | text | โ | $0.15/$0.60 |
| llama-4-maverick | Meta | 1M | text, image | text | โ | Varies |
| qwen3-235b-a22b | Alibaba | 128K | text, image | text | โ | Varies |
1,487 models can accept images as input alongside text. These are the most common type of multimodal model:

→ See all 1,487 vision models compared

118 models can process audio input for transcription, voice analysis, and audio understanding:

| Model | Provider | Audio Capabilities | Context |
|---|---|---|---|
| gemini-2.5-pro | Google | Audio understanding + transcription | 1M |
| gemini-2.5-flash | Google | Audio understanding + transcription | 1M |
| gpt-4o-audio-preview | OpenAI | Audio input + output | 128K |
| claude-sonnet-4 | Anthropic | Audio transcription | 200K |
28 models can generate images from text descriptions. This is a rapidly growing category:

| Model | Provider | Capabilities |
|---|---|---|
| gpt-image-1 | OpenAI | Text-to-image, image editing |
| dall-e-3 | OpenAI | Text-to-image generation |
| flux-1.1-pro | Black Forest Labs | High-quality text-to-image |
| stable-diffusion-3.5 | Stability AI | Open-weight text-to-image |
→ See all 28 image generation models

34 models can generate audio output for text-to-speech, voice cloning, and audio generation:

167 models can process video input for video analysis, summarization, and content understanding:

gemma-3-27b-it (free) or gpt-4o
gemini-2.5-flash (cheapest multimodal) or gemini-2.5-pro
gpt-image-1 or flux-1.1-pro
gemini-2.5-pro (best video understanding)
llama-4-maverick or claude-sonnet-4
gemini-2.5-flash ($0.15/$0.60 per 1M tokens)
gemma-3-27b-it (Google, free) or qwen3-32b (Alibaba, free)
527 open-weight LLMs compared by pricing, context windows, tool calling, reasoning, and vision capabilities

The most capable open-weight models available today, from leading AI labs:

| Model | Provider | Context | Tool Call | Reasoning | Price (in/out per 1M) |
|---|---|---|---|---|---|
| llama-4-maverick | Meta | 1M | โ | โ | Varies |
| llama-4-scout | Meta | 10M | โ | โ | Varies |
| deepseek-r1 | DeepSeek | 128K | โ | โ | Varies |
| deepseek-v3 | DeepSeek | 128K | โ | โ | Varies |
| qwen3-235b-a22b | Alibaba | 128K | โ | โ | Varies |
| qwen3-32b | Alibaba | 128K | โ | โ | Varies |
| llama-3.3-70b-instruct | Meta | 128K | โ | โ | Varies |
| gemma-3-27b-it | Google | 128K | โ | โ | Free |
| phi-4 | Microsoft | 16K | โ | โ | Varies |
| command-a | Cohere | 256K | โ | โ | Varies |
| mistral-large-2411 | Mistral | 128K | โ | โ | Varies |
81 open-weight models you can use for free through their provider APIs. These are ideal for prototyping, testing, and learning:

| Model | Provider | Context | Tool Call | Reasoning |
|---|---|---|---|---|
| gemma-3-27b-it | Google | 128K | โ | โ |
| gemma-3-12b-it | Google | 128K | โ | โ |
| gemma-3-4b-it | Google | 128K | โ | โ |
| gemma-3-1b-it | Google | 128K | โ | โ |
| qwen3-235b-a22b | Alibaba | 128K | โ | โ |
| qwen3-30b-a3b | Alibaba | 128K | โ | โ |
| qwen3-32b | Alibaba | 128K | โ | โ |
| qwen3-14b | Alibaba | 128K | โ | โ |
| qwen3-8b | Alibaba | 128K | โ | โ |
| qwen3-4b | Alibaba | 128K | โ | โ |
| qwen3-1.7b | Alibaba | 128K | โ | โ |
| qwen3-0.6b | Alibaba | 128K | โ | โ |
| llama-4-maverick | Meta | 1M | โ | โ |
| llama-4-scout | Meta | 10M | โ | โ |
| llama-3.3-70b-instruct | Meta | 128K | โ | โ |
→ See all 81 free AI models (including non-open-weight)

375 open-weight models support tool/function calling, which is essential for AI agents and agentic workflows:

→ See all 2,350 tool-calling models

231 open-weight models have reasoning capabilities: they can "think step by step" through complex tasks:

| Model | Provider | Context | Tool Call | Key Strength |
|---|---|---|---|---|
| deepseek-r1 | DeepSeek | 128K | โ | Best open-weight reasoning, rivals o1 |
| qwen3-235b-a22b | Alibaba | 128K | โ | MoE architecture, thinking mode |
| qwen3-32b | Alibaba | 128K | โ | Dense reasoning, strong benchmarks |
| qwen3-30b-a3b | Alibaba | 128K | โ | Lightweight MoE reasoning |
| qwen3-14b | Alibaba | 128K | โ | Mid-size reasoning model |
| qwen3-8b | Alibaba | 128K | โ | Small but capable reasoning |
→ See all 1,306 reasoning models

269 open-weight models can process images alongside text, which is useful for document analysis, visual Q&A, and multimodal applications:

→ See all 1,487 vision models

Open-weight models with the largest context windows, which are essential for processing long documents, codebases, and multi-turn conversations:

| Model | Provider | Context Window | Tool Call | Reasoning |
|---|---|---|---|---|
| llama-4-scout | Meta | 10M | โ | โ |
| llama-4-maverick | Meta | 1M | โ | โ |
| command-a | Cohere | 256K | โ | โ |
| deepseek-r1 | DeepSeek | 128K | โ | โ |
| deepseek-v3 | DeepSeek | 128K | โ | โ |
| qwen3-235b-a22b | Alibaba | 128K | โ | โ |
| llama-3.3-70b-instruct | Meta | 128K | โ | โ |
| gemma-3-27b-it | Google | 128K | โ | โ |
| mistral-large-2411 | Mistral | 128K | โ | โ |
→ See all models with context window comparison

gemma-3-27b-it (Google, free) or qwen3-32b (Alibaba, free)
llama-4-maverick (1M context + tool calling) or deepseek-r1 (reasoning + tools)
llama-4-scout (10M context) or llama-4-maverick (1M context)
deepseek-r1 (best open-weight reasoning) or qwen3-235b-a22b
llama-4-maverick or gemma-3-27b-it
qwen3-0.6b or gemma-3-1b-it (smallest open-weight)
command-a (256K context, optimized for RAG + tools)
| Aspect | Open Weights | Proprietary |
|---|---|---|
| Self-hosting | Run on your own hardware | Cloud API only |
| Data privacy | Full control over data | Data sent to provider |
| Customization | Fine-tune on your data | Limited (prompt-based) |
| Cost at scale | Fixed infra cost | Per-token pricing |
| Latest capabilities | ~3–6 months behind | Cutting-edge |
| Convenience | Requires infra setup | Instant API access |
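The "Cost at scale" row compares a fixed infrastructure cost with per-token pricing, which implies a break-even volume. The sketch below shows that arithmetic under loud assumptions: the $1,000 monthly hosting cost is invented for illustration, the $0.25 per 1M tokens figure is the `llama-4-maverick` input price listed in this catalog, and output tokens, utilization, and engineering time are ignored.

```python
def breakeven_tokens_per_month(fixed_monthly_usd, api_usd_per_m):
    """Monthly token volume at which a fixed hosting cost equals API spend."""
    if api_usd_per_m <= 0:
        raise ValueError("api_usd_per_m must be positive")
    return fixed_monthly_usd / api_usd_per_m * 1_000_000


def cheaper_option(tokens_per_month, fixed_monthly_usd, api_usd_per_m):
    """Return 'self-hosted' if the fixed cost is below the API bill, else 'api'."""
    api_cost = tokens_per_month / 1_000_000 * api_usd_per_m
    return "self-hosted" if fixed_monthly_usd < api_cost else "api"


FIXED_MONTHLY_USD = 1_000.0   # hypothetical GPU server cost, not from the catalog
API_USD_PER_M = 0.25          # llama-4-maverick input price from the tables above

volume = breakeven_tokens_per_month(FIXED_MONTHLY_USD, API_USD_PER_M)
print(f"break-even: {volume:,.0f} tokens per month")
print(cheaper_option(1_000_000_000, FIXED_MONTHLY_USD, API_USD_PER_M))
print(cheaper_option(10_000_000_000, FIXED_MONTHLY_USD, API_USD_PER_M))
```

With these placeholder numbers, self-hosting only pays off above several billion tokens per month; real hardware and staffing costs shift the threshold considerably.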
Looking for alternatives to OpenAI? Compare 87 AI providers with 4,587 models. Real pricing, real capabilities, first-party data.

How do alternative providers compare on price? All prices are per million tokens.
| Provider | Model | Input $/1M | Output $/1M | Context | Tool Call |
|---|---|---|---|---|---|
| OpenAI | gpt-4.1 | $2 | $8 | 1M | ✓ |
| OpenAI | gpt-4o | $2.5 | $10 | 128K | ✓ |
| OpenAI | gpt-4o-mini | $0.15 | $0.6 | 128K | ✓ |
| Google | gemini-2.5-pro | $1.25 | $10 | 1M | |
| Google | gemini-2.5-flash | $0.3 | $2.5 | 1M | |
| DeepSeek | deepseek-r1 | $1.35 | $5.4 | 65K | |
| DeepSeek | deepseek-v3 | $0.32 | $0.89 | 163K | |
| Meta | llama-4-maverick | $0.25 | $0.87 | 1M | ✓ |
Anthropic's Claude models are known for their reasoning ability, safety focus, and long context windows. Claude is a strong alternative for complex tasks, coding, and analysis.
+| Model | +Input $/1M | +Output $/1M | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|
| claude-haiku-4-5 | +$1 | +$5 | +200K | +โ | +โ | +
| databricks-claude-haiku-4-5 | +$1 | +$5 | +200K | ++ | + |
| claude-haiku-4-5 | +$1 | +$5 | +200K | ++ | โ | +
| claude-haiku-4-5 | +$1 | +$5 | +200K | +โ | ++ |
| claude-sonnet-4-0 | +$3 | +$15 | +1M | +โ | +โ | +
| claude-sonnet-4-5 | +$3 | +$15 | +1M | +โ | +โ | +
| claude-sonnet-4-6 | +$3 | +$15 | +1M | +โ | +โ | +
| databricks-claude-sonnet-4-5 | +$3 | +$15 | +200K | ++ | + |
| databricks-claude-sonnet-4 | +$3 | +$15 | +200K | ++ | + |
| claude-sonnet-4-6 | +$3 | +$15 | +1M | ++ | โ | +
| claude-sonnet-4-5 | +$3 | +$15 | +200K | +โ | ++ |
| claude-sonnet-4-6 | +$3 | +$15 | +200K | +โ | ++ |
| claude-sonnet-4 | +$3 | +$15 | +200K | +โ | ++ |
| claude-opus-4-5 | +$5 | +$25 | +200K | +โ | +โ | +
| claude-opus-4-6 | +$5 | +$25 | +1M | +โ | +โ | +
| claude-opus-4-7 | +$5 | +$25 | +1M | +โ | +โ | +
| databricks-claude-opus-4-5 | +$5 | +$25 | +200K | ++ | + |
| claude-opus-4-7 | +$5 | +$25 | +1M | ++ | โ | +
| claude-opus-4-5 | +$5 | +$25 | +200K | +โ | ++ |
| claude-opus-4-6 | +$5 | +$25 | +200K | +โ | ++ |
| claude-opus-4-7 | +$5 | +$25 | +200K | +โ | ++ |
| claude-opus-4-5 | +$6.25 | +$31.25 | +200K | +โ | ++ |
| claude-opus-4-0 | +$15 | +$75 | +200K | +โ | +โ | +
| claude-opus-4-1 | +$15 | +$75 | +200K | +โ | +โ | +
| databricks-claude-opus-4-1 | +$15 | +$75 | +200K | ++ | + |
| claude-opus-4-1 | +$15 | +$75 | +200K | +โ | ++ |
| claude-opus-4 | +$15 | +$75 | +200K | +โ | ++ |
| claude-opus-4-6-fast | +$30 | +$150 | +1M | +โ | +โ | +
| claude-opus-4-7-fast | +$30 | +$150 | +1M | +โ | +โ | +
Google's Gemini models offer multimodal capabilities (text, image, audio, video) with competitive pricing and massive context windows.
+| Model | +Input $/1M | +Output $/1M | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|
| gemini-1.5-flash-8b | +$0.0375 | +$0.15 | +1M | ++ | + |
| gemini-1.5-flash | +$0.075 | +$0.3 | +1M | ++ | + |
| gemini-1.5-flash-8b | +$0.075 | +$0.3 | +1M | +โ | ++ |
| gemini-1.5-flash | +$0.075 | +$0.3 | +1M | +โ | ++ |
| gemini-2.0-flash-lite | +$0.075 | +$0.3 | +1M | +โ | ++ |
| gemini-2-0-flash-lite | +$0.075 | +$0.3 | +1M | +โ | ++ |
| gemini-2.0-flash | +$0.1 | +$0.4 | +1M | +โ | ++ |
| gemini-2.5-flash-lite | +$0.1 | +$0.4 | +1M | +โ | ++ |
| gemini-2-5-flash-lite | +$0.1 | +$0.4 | +1M | +โ | ++ |
| gemini-2.5-flash | +$0.15 | +$3.5 | +1M | +โ | +โ | +
| gemini-2-0-flash | +$0.15 | +$0.6 | +1M | +โ | ++ |
| databricks-gemini-3-1-flash-lite | +$0.25 | +$1.5 | +128K | ++ | + |
| gemini-3.1-flash-lite | +$0.25 | +$1.5 | +1M | ++ | โ | +
| gemini-3-1-flash-lite | +$0.25 | +$1.5 | +200K | +โ | ++ |
| databricks-gemini-2-5-flash | +$0.3 | +$2.5 | +128K | ++ | + |
| gemini-2.5-flash | +$0.3 | +$2.5 | +1M | ++ | โ | +
| gemini-2-5-flash | +$0.3 | +$2.5 | +1M | +โ | ++ |
| gemini-3-flash | +$0.5 | +$3 | +200K | +โ | ++ |
| databricks-gemini-3-flash | +$0.63 | +$3.75 | +128K | ++ | + |
| databricks-gemini-2-5-pro | +$1.25 | +$10 | +128K | ++ | + |
| gemini-2.5-pro | +$1.25 | +$10 | +1M | ++ | โ | +
| gemini-1.5-pro | +$1.25 | +$5 | +2M | +โ | ++ |
| gemini-2.5-pro | +$1.25 | +$10 | +1M | +โ | +โ | +
| gemini-2-5-pro | +$1.25 | +$10 | +1M | +โ | ++ |
| gemini-3.1-pro | +$2 | +$12 | +1M | ++ | โ | +
| gemini-3-pro | +$2 | +$12 | +200K | +โ | ++ |
| databricks-gemini-3-1-pro | +$2.5 | +$15 | +128K | ++ | + |
| chirp-3.0-HD | +$? | +$? | +? | ++ | + |
| gemma-3-12b-it | +Free | ++ | 131K | ++ | + |
| gemma-3-1b-it | +Free | ++ | 131K | ++ | + |
| gemma-3-27b-it | +Free | ++ | 131K | ++ | + |
| gemma-3-4b-it | +Free | ++ | 131K | ++ | + |
| gemma-3n-E2B-it | +Free | ++ | 131K | ++ | + |
| gemma-3n-E4B-it | +Free | ++ | 131K | ++ | + |
| imagen-3.0-fast-generate | +$? | +$? | +? | ++ | + |
| imagen-3.0-generate | +$? | +$? | +? | ++ | + |
| imagen-4.0-fast-generate | +$? | +$? | +? | ++ | + |
| imagen-4.0-generate | +$? | +$? | +? | ++ | + |
| lyria-2.0 | +$? | +$? | +? | ++ | + |
| veo-2.0-generate | +$? | +$? | +? | ++ | + |
Meta's Llama models are open-weight, meaning you can download and run them locally. They are well suited to privacy-sensitive applications and cost optimization.
+| Model | +Provider | +Context | +Tool Call | +Reasoning | +Open Weights | +
|---|---|---|---|---|---|
| meta-llama-4-scout | +meta | +10M | +โ | ++ | + |
| meta-llama-4-maverick-17b | +amazon-bedrock | +1M | +โ | ++ | โ | +
| meta-llama-4-scout-17b | +amazon-bedrock | +1M | +โ | ++ | โ | +
| llama-4-maverick-17b-128e-instruct-fp8 | +deepinfra | +1M | ++ | + | + |
| llama-4-maverick | +digitalocean | +1M | +โ | ++ | โ | +
| meta-llama--Llama-4-Maverick-17B-128E-Instruct-FP8 | +gmicloud | +1M | +โ | ++ | โ | +
| llama-4-maverick | +google-vertex | +1M | +โ | ++ | โ | +
| llama-4-scout | +google-vertex | +1M | +โ | ++ | โ | +
| meta-llama--Llama-4-Maverick-17B-128E-Instruct-FP8 | +klusterai | +1M | ++ | + | โ | +
| meta-llama--llama-4-maverick-17b-128e-instruct-fp8 | +novitaai | +1M | ++ | + | + |
| meta-llama-4-maverick | +meta | +1M | +โ | ++ | + |
| llama-4-scout-17b-16e-instruct | +cloudflare | +327K | +โ | ++ | โ | +
| llama-4-scout-17b-16e-instruct | +deepinfra | +327K | ++ | + | + |
| meta-llama--Llama-4-Scout-17B-16E-Instruct | +gmicloud | +327K | +โ | ++ | โ | +
| llama-4-scout-17b-16e-instruct | +vultr | +327K | +โ | ++ | โ | +
DeepSeek offers high-performance reasoning and chat models at significantly lower prices than OpenAI. DeepSeek-R1 rivals o1 on reasoning benchmarks.
+| Model | +Input $/1M | +Output $/1M | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|
| deepseek--deepseek-ocr-2 | +$0.03 | +$0.03 | +8K | ++ | + |
| deepseek--deepseek-ocr | +$0.03 | +$0.03 | +8K | ++ | + |
| deepseek--deepseek-r1-0528-qwen3-8b | +$0.06 | +$0.09 | +128K | ++ | + |
| deepseek-ai--DeepSeek-R1-Distill-Qwen-7B | +$0.1 | +$0.2 | +131K | ++ | โ | +
| deepseek-v4-flash | +$0.126 | +$0.252 | +1M | +โ | +โ | +
| deepseek-v4-flash | +$0.14 | +$0.28 | +1M | ++ | โ | +
| deepseek-chat | +$0.14 | +$0.28 | +1M | +โ | ++ |
| deepseek-reasoner | +$0.14 | +$0.28 | +1M | +โ | +โ | +
| deepseek-v4-flash | +$0.14 | +$0.28 | +1M | +โ | +โ | +
| deepseek-ai--DeepSeek-R1-Distill-Llama-8B | +$0.14 | +$0.39 | +131K | ++ | โ | +
| deepseek--deepseek-v4-flash | +$0.14 | +$0.28 | +1M | +โ | +โ | +
| deepseek--deepseek-v4-flash | +$0.14 | +$0.28 | +1M | +โ | +โ | +
| deepseek-v4-flash | +$0.14 | +$0.28 | +1M | +โ | +โ | +
| deepseek--deepseek-r1-distill-qwen-14b | +$0.15 | +$0.15 | +32K | ++ | + |
| deepseek--deepseek-4-flash | +$0.182 | +$0.364 | +? | ++ | + |
| deepseek-v3-0324 | +$0.2 | +$0.77 | +163K | ++ | + |
| deepseek-ai--DeepSeek-R1-Distill-Qwen-14B | +$0.2 | +$0.2 | +131K | ++ | โ | +
| deepseek--DeepSeek-R1 | +$0.2 | +$0.8 | +163K | ++ | โ | +
| deepseek--DeepSeek-V3.1 | +$0.2 | +$0.8 | +163K | ++ | + |
| deepseek-v3.1 | +$0.21 | +$0.79 | +163K | ++ | โ | +
| deepseek--deepseek-ocr-2 | +$0.216 | +$0.216 | +8K | +โ | ++ |
| deepseek-ai--DeepSeek-R1-Distill-Llama-70B | +$0.25 | +$0.75 | +131K | ++ | โ | +
| deepseek-v3.2 | +$0.252 | +$0.378 | +131K | +โ | +โ | +
| deepseek-v3.2 | +$0.26 | +$0.38 | +163K | ++ | + |
| deepseek--deepseek-v3.2 | +$0.269 | +$0.4 | +163K | +โ | +โ | +
| deepseek-v3.1-terminus | +$0.27 | +$0.95 | +163K | ++ | โ | +
| deepseek-ai--DeepSeek-V3.1-Terminus | +$0.27 | +$1 | +163K | ++ | + |
| deepseek-ai--DeepSeek-V3.1 | +$0.27 | +$1 | +163K | ++ | + |
| deepseek-ai--DeepSeek-V3.2-Exp | +$0.27 | +$0.41 | +163K | ++ | + |
| deepseek--deepseek-v3-0324 | +$0.27 | +$1.12 | +163K | +โ | ++ |
| deepseek--deepseek-v3.1-terminus | +$0.27 | +$1 | +131K | +โ | +โ | +
| deepseek--deepseek-v3.1 | +$0.27 | +$1 | +131K | +โ | +โ | +
| deepseek--deepseek-v3.2-exp | +$0.27 | +$0.41 | +163K | +โ | +โ | +
| deepseek-v3.1-nex-n1 | +$0.27 | +$1 | +131K | +โ | ++ |
| deepseek-v3.1-terminus | +$0.27 | +$1 | +163K | +โ | ++ |
| deepseek-v3.2-exp | +$0.27 | +$0.41 | +163K | +โ | ++ |
| deepseek-v3.2 | +$0.27 | +$0.42 | +163K | +โ | ++ |
| deepseek-ai--DeepSeek-V3.2-TEE | +$0.28 | +$0.42 | +131K | +โ | +โ | +
| deepseek-ai--DeepSeek-V3-0324 | +$0.28 | +$0.88 | +163K | ++ | + |
| deepseek-ai--DeepSeek-V3-0324 | +$0.28 | +$1.14 | +163K | +โ | ++ |
| deepseek--deepseek-v3.1-release | +$0.294 | +$0.441 | +? | ++ | + |
| deepseek--deepseek-v3.1 | +$0.294 | +$0.441 | +? | ++ | + |
| DeepSeek-V3.2 | +$0.3 | +$0.45 | +160K | +โ | +โ | +
| deepseek--deepseek-r1-distill-qwen-32b | +$0.3 | +$0.3 | +64K | ++ | + |
| deepseek-v3 | +$0.32 | +$0.89 | +163K | ++ | + |
| deepseek--deepseek-v3.1-terminus | +$0.364 | +$0.546 | +? | ++ | + |
| deepseek--deepseek-v3.2-exp-non-thinking | +$0.364 | +$0.546 | +128K | +โ | ++ |
| deepseek--deepseekโv3.2โexp-thinking | +$0.364 | +$0.546 | +? | ++ | + |
| deepseek--deepseek-v3.2-speciale | +$0.36855 | +$0.56186 | +? | ++ | + |
| deepseek-v3--fp-8 | +$0.4 | +$1.2 | +131K | +โ | ++ |
| deepseek--deepseek-v3-turbo | +$0.4 | +$1.3 | +64K | +โ | ++ |
| deepseek-v4-pro | +$0.435 | +$0.87 | +1M | +โ | +โ | +
| deepseek-r1-distill-qwen-32b | +$0.497 | +$4.881 | +131K | ++ | โ | +
| deepseek-v3-1 | +$0.5 | +$1.5 | +131K | +โ | ++ |
| deepseek-r1-0528 | +$0.5 | +$2.15 | +163K | ++ | โ | +
| deepseek-3.2 | +$0.5 | +$1.6 | +131K | +โ | ++ |
| deepseek-ai--DeepSeek-V3.2 | +$0.5 | +$1.5 | +163K | +โ | ++ |
| deepseek-ai--DeepSeek-Prover-V2-671B | +$0.5 | +$2.18 | +163K | ++ | โ | +
| deepseek-ai--DeepSeek-R1-Distill-Qwen-32B | +$0.5 | +$0.9 | +131K | ++ | โ | +
| deepseek-ai--DeepSeek-R1 | +$0.5 | +$2.18 | +163K | ++ | โ | +
| deepseek-r1-0528--fp-8 | +$0.5 | +$2.15 | +131K | ++ | โ | +
| deepseek-r1-0528 | +$0.55 | +$2.75 | +131K | ++ | โ | +
| deepseek-r1-distill-llama-70b | +$0.55 | +$2.75 | +131K | ++ | โ | +
| deepseek-r1-distill-llama-8b | +$0.55 | +$2.75 | +131K | ++ | โ | +
| deepseek-r1-distill-qwen-1.5b | +$0.55 | +$2.75 | +32K | ++ | โ | +
| deepseek-r1-distill-qwen-14b | +$0.55 | +$2.75 | +131K | ++ | โ | +
| deepseek-r1-distill-qwen-32b | +$0.55 | +$2.75 | +131K | ++ | โ | +
| deepseek-r1-distill-qwen-7b | +$0.55 | +$2.75 | +32K | ++ | โ | +
| deepseek-v32-exp | +$0.55 | +$2.75 | +131K | ++ | + |
| deepseek-v32-speciale | +$0.55 | +$2.75 | +131K | ++ | + |
| deepseek-v32 | +$0.55 | +$2.75 | +131K | ++ | + |
| deepseek-v4-flash | +$0.55 | +$2.75 | +131K | ++ | + |
| deepseek-v4-pro | +$0.55 | +$2.75 | +131K | ++ | โ | +
| deepseek-v3-2 | +$0.56 | +$1.68 | +65K | +โ | ++ |
| deepseek-v3-1 | +$0.6 | +$1.7 | +65K | ++ | + |
| deepseek-ai--DeepSeek-V3.1 | +$0.6 | +$1.7 | +131K | +โ | ++ |
| deepseek-v3-2 | +$0.62 | +$1.85 | +65K | +โ | ++ |
| deepseek-r1-distill-llama-70b | +$0.7 | +$0.8 | +131K | ++ | โ | +
| deepseek-ai--DeepSeek-R1-0528 | +$0.7 | +$2.3 | +163K | ++ | โ | +
| deepseek--deepseek-prover-v2-671b | +$0.7 | +$2.5 | +160K | ++ | + |
| deepseek--deepseek-r1-0528 | +$0.7 | +$2.5 | +163K | +โ | +โ | +
| deepseek--deepseek-r1-turbo | +$0.7 | +$2.5 | +64K | +โ | +โ | +
| deepseek--deepseek-r1-distill-llama-70b | +$0.8 | +$0.8 | +8K | ++ | โ | +
| deepseek-r1-distill-llama-70b | +$0.99 | +$0.99 | +131K | ++ | โ | +
| deepseek-r1-0528-turbo | +$1 | +$3 | +32K | ++ | โ | +
| deepseek--deepseek-v4-flash | +$1 | +$2 | +1M | +โ | ++ |
| deepseek-v4-flash | +$1 | +$2 | +131K | ++ | โ | +
| deepseek-v4-flash | +$1 | +$2 | +1M | +โ | ++ |
| deepseek-r1 | +$1.35 | +$5.4 | +65K | ++ | + |
| deepseek-r1 | +$1.35 | +$5.4 | +65K | ++ | + |
| deepseek-v4-pro | +$1.521 | +$3.042 | +716K | +โ | +โ | +
| deepseek--deepseek-v4-pro | +$1.67 | +$3.38 | +1M | +โ | +โ | +
| deepseek-v4 | +$1.74 | +$3.48 | +131K | +โ | +โ | +
| deepseek-v4-pro | +$1.74 | +$3.48 | +65K | ++ | + |
| deepseek-v4-pro | +$1.74 | +$3.48 | +163K | +โ | +โ | +
| deepseek-v4-pro | +$1.74 | +$3.48 | +1M | +โ | +โ | +
| deepseek--deepseek-v4-pro | +$1.74 | +$3.48 | +1M | +โ | +โ | +
| deepseek-v4-pro | +$1.74 | +$3.48 | +1M | +โ | +โ | +
| DeepSeek-V4-Pro | +$1.75 | +$3.5 | +1M | +โ | +โ | +
| deepseek-ai--DeepSeek-V2.5 | +$2 | +$2 | +163K | ++ | + |
| deepseek--deepseek-v3--community | +$2 | +$8 | +64K | +โ | ++ |
| deepseek--deepseek-v3-0324 | +$2 | +$8 | +163K | +โ | ++ |
| deepseek--deepseek-v3-turbo | +$2 | +$8 | +64K | +โ | ++ |
| deepseek--deepseek-v3.2-exp | +$2 | +$3 | +163K | +โ | ++ |
| deepseek--deepseek-v3.2 | +$2 | +$3 | +163K | +โ | ++ |
| deepseek-v3.2 | +$2 | +$3 | +163K | +โ | ++ |
| deepseek-v3 | +$2 | +$8 | +163K | +โ | ++ |
| deepseek-v3-0324 | +$2 | +$8 | +128K | +โ | ++ |
| deepseek-v3.2 | +$2 | +$3 | +128K | +โ | ++ |
| deepseek-ai--DeepSeek-V4-Pro | +$2.1 | +$4.4 | +131K | +โ | +โ | +
| deepseek--deepseek-v4-pro | +$2.262 | +$4.524 | +? | ++ | + |
| deepseek-r1-distill-llama-70b | +$2.44 | +$2.44 | +131K | ++ | โ | +
| deepseek-ai--DeepSeek-R1-0528 | +$2.5 | +$5 | +163K | ++ | โ | +
| DeepSeek-V3.1 | +$3 | +$4.5 | +131K | ++ | + |
| DeepSeek-V3.2 | +$3 | +$4.5 | +32K | ++ | + |
| deepseek--deepseek-prover-v2-671b | +$4 | +$16 | +160K | +โ | +โ | +
| deepseek--deepseek-r1--community | +$4 | +$16 | +64K | +โ | +โ | +
| deepseek--deepseek-r1-0528 | +$4 | +$16 | +163K | +โ | +โ | +
| deepseek--deepseek-r1-turbo | +$4 | +$16 | +64K | +โ | +โ | +
| deepseek--deepseek-v3.1-terminus | +$4 | +$12 | +131K | +โ | ++ |
| deepseek--deepseek-v3.1 | +$4 | +$12 | +131K | +โ | ++ |
| deepseek-r1 | +$4 | +$16 | +163K | ++ | โ | +
| deepseek-v3.1-terminus | +$4 | +$12 | +163K | ++ | โ | +
| deepseek-r1-0528 | +$4 | +$16 | +128K | +โ | +โ | +
| deepseek-v3.1 | +$4 | +$12 | +128K | +โ | ++ |
| deepseek--deepseek-r1-distill-llama-70b | +$5.8 | +$5.8 | +32K | +โ | +โ | +
| deepseek--deepseek-v4-pro | +$12 | +$24 | +1M | +โ | ++ |
| deepseek-v4-pro | +$12 | +$24 | +128K | +โ | +โ | +
| deepseek-r1-distill-llama-70b | +Free | ++ | 131K | ++ | โ | +
| deepseek-r1-distill-llama-8b | +Free | ++ | 131K | ++ | โ | +
+ Mistral offers both open-weight and commercial models. Known for efficiency and European data + sovereignty. +
+| Model | +Input $/1M | +Output $/1M | +Context | +Tool Call | +Open Weights | +
|---|---|---|---|---|---|
| mistralai--Mistral-Nemo-Instruct-2407 | +$0.008 | +$0.001 | +131K | ++ | โ | +
| mistral-nemo-instruct-2407 | +$0.02 | +$0.04 | +131K | ++ | + |
| mistral-nemo-12b-instruct--fp-8 | +$0.0375 | +$0.1 | +131K | +โ | +โ | +
| ministral-3b | +$0.04 | +$0.04 | +128K | +โ | ++ |
| voxtral-mini | +$0.04 | +$0.04 | +128K | ++ | + |
| mistralai--mistral-nemo | +$0.04 | +$0.17 | +60K | ++ | + |
| mistral-small-24b-instruct-2501 | +$0.05 | +$0.08 | +32K | ++ | + |
| mistralai--Mistral-Small-24B-Instruct-2501 | +$0.05 | +$0.08 | +32K | ++ | โ | +
| mistral-small-3.2-24b-instruct-2506 | +$0.075 | +$0.2 | +128K | ++ | + |
| mistralai--Devstral-Small-2-24B-Instruct | +$0.1 | +$0.4 | +131K | +โ | +โ | +
| mistral-small-3-1 | +$0.1 | +$0.3 | +128K | +โ | ++ |
| ministral-8b | +$0.1 | +$0.1 | +128K | +โ | ++ |
| voxtral-small | +$0.1 | +$0.3 | +128K | ++ | + |
| Mistral-Small-3.2-24B-Instruct-2506 | +$0.1 | +$0.31 | +131K | ++ | + |
| mistral-7b-instruct-v0.1 | +$0.11 | +$0.19 | +32K | ++ | โ | +
| Mistral-7B-Instruct-v0.3 | +$0.11 | +$0.11 | +65K | ++ | + |
| Mistral-Nemo-Instruct-2407 | +$0.14 | +$0.14 | +65K | ++ | + |
| mistral-mistral-7b | +$0.15 | +$0.2 | +32K | ++ | โ | +
| mistral-7b | +$0.15 | +$0.2 | +32K | ++ | + |
| mistral-nemo | +$0.15 | +$0.15 | +128K | +โ | ++ |
| mistral-small-3.2-24b-instruct-2506 | +$0.15 | +$0.35 | +131K | +โ | +โ | +
| mistral-small | +$0.2 | +$0.6 | +128K | +โ | ++ |
| mistral-nemo-instruct-2407 | +$0.2 | +$0.2 | +131K | ++ | โ | +
| mistralai--Mistral-7B | +$0.2 | +$2 | +8K | ++ | โ | +
| mistralai--Mistral-Small-3.2-24B-Instruct-2506 | +$0.3 | +$0.3 | +? | +โ | ++ |
| mistral-small-3.1-24b-instruct | +$0.351 | +$0.555 | +131K | +โ | +โ | +
| mistral-medium-3 | +$0.4 | +$2 | +128K | +โ | ++ |
| mistral-medium | +$0.4 | +$2 | +128K | +โ | ++ |
| mixtral-8x7b | +$0.45 | +$0.7 | +32K | +โ | ++ |
| mistral-mistral-large-3 | +$0.5 | +$1.5 | +128K | +โ | ++ |
| mistralai--Magistral-Small | +$0.5 | +$2 | +131K | ++ | โ | +
| magistral-small | +$0.5 | +$1.5 | +128K | +โ | ++ |
| mistral-large-3-675b-instruct-2512 | +$0.55 | +$2.75 | +262K | +โ | +โ | +
| mistral-small-4-119b-2603 | +$0.55 | +$2.75 | +131K | +โ | +โ | +
| mixtral-8x22b | +$0.8 | +$1.2 | +64K | +โ | ++ |
| mistral-mistral-small | +$1 | +$3 | +128K | +โ | ++ |
| mistral-ai--mixtral-8x22b | +$1.26 | +$1.26 | +? | ++ | + |
| mistral-small-24b-instruct-2501 | +$1.26 | +$1.26 | +32K | +โ | ++ |
| mistral-large | +$2 | +$6 | +128K | +โ | ++ |
| pixtral-large | +$2 | +$6 | +128K | +โ | ++ |
| mistral-mistral-large | +$4 | +$12 | +128K | +โ | ++ |
| mistral-large-2407 | +$4 | +$12 | +128K | +โ | ++ |
| codestral | +Free | ++ | 256K | ++ | + |
| devstral | +Free | ++ | 128K | +โ | ++ |
+ 81 models are available at zero cost, perfect for testing, prototyping, and learning. Many + support tool calling and have large context windows. +
+| Model | +Provider | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|
| openrouter--owl-alpha | +openrouter | +1M | +โ | ++ |
| deepseek--deepseek-v4-flash--free | +openrouter | +1M | +โ | +โ | +
| google--lyria-3-clip-preview | +openrouter | +1M | ++ | + |
| google--lyria-3-pro-preview | +openrouter | +1M | ++ | + |
| qwen--qwen3-coder--free | +openrouter | +1M | +โ | ++ |
| nvidia--nemotron-3-super-120b-a12b--free | +openrouter | +1M | +โ | +โ | +
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | +โ | +
| gemma-4-31b-it | +auriko | +262K | +โ | +โ | +
| arcee-ai--trinity-large-thinking--free | +openrouter | +262K | +โ | +โ | +
| google--gemma-4-26b-a4b-it--free | +openrouter | +262K | +โ | +โ | +
| google--gemma-4-31b-it--free | +openrouter | +262K | +โ | +โ | +
| codestral | +mistral | +256K | ++ | + |
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | +openrouter | +256K | +โ | +โ | +
| hunyuan-lite | +tencent | +250K | ++ | + |
| minimax--minimax-m2.5--free | +openrouter | +204K | +โ | +โ | +
+ 527 models with downloadable weights you can run locally or on your own infrastructure. No API + dependency, full privacy control. +
+| Model | +Provider | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|
| google--gemma-4-31b-it | +orcarouter | +1M | +โ | ++ |
| qwen--qwen3.5-flash-2026-02-23 | +orcarouter | +1M | +โ | ++ |
| qwen--qwen3.5-flash | +orcarouter | +1M | +โ | ++ |
| qwen--qwen3.6-flash-2026-04-16 | +orcarouter | +1M | +โ | ++ |
| qwen--qwen3.6-flash | +orcarouter | +1M | +โ | ++ |
| MiniMax-Text-01 | +302ai | +1M | ++ | + |
| llama-4-maverick | +302ai | +1M | ++ | + |
| llama-4-scout | +302ai | +1M | ++ | + |
| meta-llama-4-maverick-17b | +amazon-bedrock | +1M | +โ | ++ |
| meta-llama-4-scout-17b | +amazon-bedrock | +1M | +โ | ++ |
| minimax-m2-1 | +amazon-bedrock | +1M | +โ | ++ |
| minimax-m2-5 | +amazon-bedrock | +1M | +โ | ++ |
| minimax-m2 | +amazon-bedrock | +1M | +โ | ++ |
| minimax-m2-5 | +baseten | +1M | +โ | ++ |
| llama-4-maverick | +digitalocean | +1M | +โ | ++ |
+ All data is sourced from first-party APIs, not third-party aggregators. Pricing, + context windows, and capabilities are verified against official provider documentation. + Aggregator providers are excluded from ranking tables to avoid duplicate models. +
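The duplicate problem that the aggregator exclusion addresses can be pictured with a small Python sketch. It assumes that the same model may appear under several identifier spellings (a `vendor--` prefix, mixed case) and keeps the cheapest listing per canonical name. `VENDOR_PREFIXES`, `canonical_id`, and `cheapest_per_model` are hypothetical names for illustration, not part of the catalog; the rows are copied from the DeepSeek pricing table above.

```python
# Hypothetical set of vendor prefixes seen before "--" in model identifiers.
VENDOR_PREFIXES = {"deepseek", "deepseek-ai"}


def canonical_id(model_id: str) -> str:
    """Strip a known vendor prefix and lowercase the identifier."""
    head, sep, rest = model_id.partition("--")
    if sep and head.lower() in VENDOR_PREFIXES:
        model_id = rest
    return model_id.lower()


def cheapest_per_model(rows):
    """Keep the listing with the lowest input price for each canonical id."""
    best = {}
    for model_id, input_price, output_price in rows:
        key = canonical_id(model_id)
        if key not in best or input_price < best[key][1]:
            best[key] = (model_id, input_price, output_price)
    return best


# (identifier, input $/1M, output $/1M) as listed in the DeepSeek table.
rows = [
    ("deepseek-v3.2", 0.252, 0.378),
    ("deepseek--deepseek-v3.2", 0.269, 0.4),
    ("DeepSeek-V3.2", 0.3, 0.45),
    ("deepseek-ai--DeepSeek-V3.2", 0.5, 1.5),
    ("deepseek-v3--fp-8", 0.4, 1.2),  # "--fp-8" is a variant suffix, not a vendor
]

for key, (listed_as, inp, out) in cheapest_per_model(rows).items():
    print(f"{key}: cheapest listing {listed_as!r} at ${inp}/${out}")
```

The prefix set is kept explicit because `--` also separates variant suffixes such as `--fp-8` and `--free`, so a blanket split on `--` would merge or mangle distinct entries.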
+ ++ Compare 1,306 reasoning models across 95 providers. Find the best chain-of-thought + model for math, science, coding, and complex analysis. +
+ +The top reasoning models compared side by side.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Tool Call | +
|---|---|---|---|---|---|
| o3 | +openai | +$10 | +$40 | +200K | +โ | +
| o3-mini | +openai | +$1.1 | +$4.4 | +200K | +โ | +
| o4-mini | +openai | +$1.1 | +$4.4 | +200K | +โ | +
| o1 | +openai | +$15 | +$60 | +200K | +โ | +
| o1-mini | +openai | +$1.5 | +$6 | +128K | +โ | +
| o1-pro | +openai | +$150 | +$600 | +200K | +โ | +
| deepseek-r1-distill-llama-70b | +cerebras | +Free | ++ | 131K | ++ |
| gemini-2.5-pro | +deepinfra | +$1.25 | +$10 | +1M | ++ |
| gemini-2.5-flash | +deepinfra | +$0.3 | +$2.5 | +1M | ++ |
| qwen3-235b-a22b | +alibaba | +$2 | +$8 | +? | +โ | +
Reasoning on a budget: the most affordable models with reasoning capability.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Tool Call | +
|---|---|---|---|---|---|
| qwen3.5-0.8b | +deepinfra | +$0.01 | +$0.05 | +262K | ++ |
| qwen3.5-2b | +deepinfra | +$0.02 | +$0.1 | +262K | ++ |
| gpt-oss-20b | +deepinfra | +$0.03 | +$0.14 | +131K | ++ |
| qwen3.5-4b | +deepinfra | +$0.03 | +$0.15 | +262K | ++ |
| openai--gpt-oss-20b | +neuralwatt | +$0.03 | +$0.16 | +? | +โ | +
| qwen--qwen3-4b-fp8 | +novitaai | +$0.03 | +$0.03 | +128K | +โ | +
| gpt-oss-120b | +deepinfra | +$0.039 | +$0.19 | +131K | ++ |
| nvidia-nemotron-nano-9b-v2 | +deepinfra | +$0.04 | +$0.16 | +131K | ++ |
| openai--gpt-oss-20b | +novitaai | +$0.04 | +$0.15 | +131K | ++ |
| nemotron-3-nano-30b-a3b | +deepinfra | +$0.05 | +$0.2 | +262K | ++ |
| gpt-oss-120b | +inferencenet | +$0.05 | +$0.45 | +131K | +โ | +
| Qwen--Qwen3.6-35B-A3B | +neuralwatt | +$0.05 | +$0.1 | +? | +โ | +
| openai--gpt-oss-120b | +novitaai | +$0.05 | +$0.25 | +131K | +โ | +
| qwen3-30b-a3b-fp8 | +cloudflare | +$0.051 | +$0.335 | +40K | +โ | +
| glm-4.7-flash | +cloudflare | +$0.06 | +$0.4 | +131K | +โ | +
33 reasoning models at zero cost, perfect for learning and prototyping.
+| Model | +Provider | +Context | +Tool Call | +
|---|---|---|---|
| deepseek--deepseek-v4-flash--free | +openrouter | +1M | +โ | +
| nvidia--nemotron-3-super-120b-a12b--free | +openrouter | +1M | +โ | +
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | +
| gemma-4-31b-it | +auriko | +262K | +โ | +
| arcee-ai--trinity-large-thinking--free | +openrouter | +262K | +โ | +
| google--gemma-4-26b-a4b-it--free | +openrouter | +262K | +โ | +
| google--gemma-4-31b-it--free | +openrouter | +262K | +โ | +
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | +openrouter | +256K | +โ | +
| minimax--minimax-m2.5--free | +openrouter | +204K | +โ | +
| z-ai--glm-5.1 | +openrouter | +202K | +โ | +
120 reasoning models you can run locally for full privacy and zero API costs.
+| Model | +Provider | +Context | +Tool Call | +
|---|---|---|---|
| xiaomi--mimo-v2.5-pro | +hpc-ai | +1M | +โ | +
| xiaomi--mimo-v2.5 | +hpc-ai | +1M | +โ | +
| deepseek--deepseek-v4-flash | +hpc-ai | +1M | +โ | +
| deepseek--deepseek-v4-pro | +hpc-ai | +1M | +โ | +
| DeepSeek-V4-Pro | +nebius | +1M | +โ | +
| trinity-large-thinking | +arcee | +262K | +โ | +
| qwen3-next-80b-a3b-thinking | +clarifai | +262K | +โ | +
| gemma-4-26b-a4b-it | +cloudflare | +262K | +โ | +
| kimi-k2.5 | +cloudflare | +262K | +โ | +
| kimi-k2.6 | +cloudflare | +262K | +โ | +
+ Models with both reasoning and tool calling: the most capable for agentic workflows that need + complex planning. +
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +
|---|---|---|---|---|
| openai--gpt-oss-20b | +neuralwatt | +$0.03 | +$0.16 | +? | +
| qwen--qwen3-4b-fp8 | +novitaai | +$0.03 | +$0.03 | +128K | +
| gpt-oss-120b | +inferencenet | +$0.05 | +$0.45 | +131K | +
| Qwen--Qwen3.6-35B-A3B | +neuralwatt | +$0.05 | +$0.1 | +? | +
| openai--gpt-oss-120b | +novitaai | +$0.05 | +$0.25 | +131K | +
| qwen3-30b-a3b-fp8 | +cloudflare | +$0.051 | +$0.335 | +40K | +
| glm-4.7-flash | +cloudflare | +$0.06 | +$0.4 | +131K | +
| Nemotron-3-Nano-Omni | +nebius | +$0.06 | +$0.24 | +128K | +
| hermes-4-llama-3.1-8b | +nousresearch | +$0.06 | +$0.12 | +131K | +
| seed-1.6-flash | +bytedance | +$0.07 | +$0.3 | +262K | +
| ring-2.6-1t | +inclusionai | +$0.07 | +$0.62 | +262K | +
| zai-org--glm-4.7-flash | +novitaai | +$0.07 | +$0.4 | +200K | +
| microsoft-phi-4-mini-reasoning | +microsoft | +$0.075 | +$0.3 | +128K | +
| Qwen--Qwen3-32B-TEE | +chutes | +$0.08 | +$0.24 | +40K | +
| gpt-oss-120b | +clarifai | +$0.09 | +$0.36 | +131K | +
+ Reasoning models with 128K+ context, for analyzing long documents, large codebases, and + complex multi-step problems. +
+| Model | +Provider | +Context | +Input $/1M | +Tool Call | +
|---|---|---|---|---|
| qwen3.5-0.8b | +deepinfra | +262K | +$0.01 | ++ |
| qwen3.5-2b | +deepinfra | +262K | +$0.02 | ++ |
| gpt-oss-20b | +deepinfra | +131K | +$0.03 | ++ |
| qwen3.5-4b | +deepinfra | +262K | +$0.03 | ++ |
| qwen--qwen3-4b-fp8 | +novitaai | +128K | +$0.03 | +โ | +
| gpt-oss-120b | +deepinfra | +131K | +$0.039 | ++ |
| nvidia-nemotron-nano-9b-v2 | +deepinfra | +131K | +$0.04 | ++ |
| openai--gpt-oss-20b | +novitaai | +131K | +$0.04 | ++ |
| nemotron-3-nano-30b-a3b | +deepinfra | +262K | +$0.05 | ++ |
| gpt-oss-120b | +inferencenet | +131K | +$0.05 | +โ | +
| openai--gpt-oss-120b | +novitaai | +131K | +$0.05 | +โ | +
| glm-4.7-flash | +cloudflare | +131K | +$0.06 | +โ | +
| glm-4.7-flash | +deepinfra | +202K | +$0.06 | ++ |
| Nemotron-3-Nano-Omni | +nebius | +128K | +$0.06 | +โ | +
| hermes-4-llama-3.1-8b | +nousresearch | +131K | +$0.06 | +โ | +
+ All data is sourced from first-party APIs. Reasoning capability is defined by the + provider's own classification: models that use chain-of-thought, extended thinking, or + similar techniques. Aggregator providers are excluded from ranking tables to avoid duplicate + models. +
+ ++ Complete guide to small language models for edge deployment, mobile apps, and cost-efficient + production. All data from + AI Models Catalog, first-party data + only. +
+ ++ Small Language Models (SLMs) are AI models with fewer than ~10 billion parameters, designed + for efficiency, low latency, and deployment on resource-constrained hardware, from + smartphones to edge servers. They offer a practical alternative to large frontier models + when cost, speed, or privacy matters. +
+Key advantages of SLMs:
+Best value SLMs for AI agents and tool-use workflows (first-party providers only):
+| Model | +Provider | +Input $/M | +Output $/M | +Context | +Reasoning | +
|---|---|---|---|---|---|
| ling-2.6-flash | +ling | +$0.01 | +$0.03 | +262K | ++ |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | ++ |
| granite-4.0-h-micro | +ibm | +$0.017 | +$0.112 | +131K | ++ |
| llama-3.1-8b-instruct--fp-16 | +fireworks | +$0.02 | +$0.03 | +131K | ++ |
| schematron-3b | +fireworks | +$0.02 | +$0.05 | +131K | ++ |
48 small models available at zero cost, perfect for prototyping and development:
+| Model | +Provider | +Context | +Tool Calling | +Reasoning | +
|---|---|---|---|---|
| deepseek-r1-distill-llama-8b | +cerebras | +131K | ++ | โ | +
| llama-4-scout-17b-16e-instruct | +cerebras | +131K | +โ | ++ |
| qwen-2.5-32b | +cerebras | +131K | +โ | ++ |
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | ++ |
| glm-4.5-flash | +auriko | +200K | +โ | ++ |
| glm-4.6v-flash | +auriko | +128K | +โ | ++ |
| baidu--ernie-4.5-0.3b | +aimlapi | +120K | +โ | ++ |
+ 557 small models with reasoning capabilities, ideal for math, logic, and step-by-step + problem solving: +
+| Model | +Provider | +Input $/M | +Output $/M | +Context | +Tool Calling | +
|---|---|---|---|---|---|
| qwen3.5-0.8b | +qwen | +$0.01 | +$0.05 | +262K | ++ |
| qwen3.5-2b | +qwen | +$0.02 | +$0.10 | +262K | ++ |
| qwen--qwen3-4b-fp8 | +fireworks | +$0.03 | +$0.03 | +128K | ++ |
| qwen3.5-4b | +qwen | +$0.03 | +$0.15 | +262K | ++ |
| deepseek-r1-distill-llama-8b | +cerebras | +Free | +Free | +131K | ++ |
+ ling-2.6-flash ($0.01/$0.03/M): cheapest tool-calling model with 262K + context. Perfect for high-volume agent workflows. +
+ ++ Qwen3.5 0.8B: ultra-compact reasoning model. + Gemma 4 27B IT: free with vision + tool calling. +
+ ++ bdc-coder ($0.01/$0.01/M): cheapest coding model. + Qwen3 4B ($0.03/$0.15/M): open-source with reasoning. +
+ ++ DeepSeek R1 Distill Llama 8B: free reasoning model. + Qwen3.5 0.8B ($0.01/$0.05/M): cheapest reasoning. +
+ ++ GPT-4.1-nano ($0.10/$0.40/M): fast, cheap, reliable. + Qwen3 4B ($0.03/$0.15/M): open-source alternative. +
+ +| Factor | +Small Model (SLM) | +Large Model (LLM) | +
|---|---|---|
| Cost per 1M tokens | +$0.01 to $0.20 | +$1 to $40 | +
| Latency (first token) | +50 to 200ms | +200 to 2000ms | +
| Deployment | +On-device, edge, cloud | +Cloud only | +
| Privacy | +Data stays on device | +Data sent to cloud | +
| Customization | +Easy fine-tuning | +Expensive fine-tuning | +
| Complex reasoning | +Good for simple tasks | +Superior for complex tasks | +
| Best for | +High-volume, real-time, edge | +Complex, nuanced, creative | +
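To make the cost row concrete, here is a minimal Python sketch that turns per-million-token prices into a monthly bill. The workload (one million requests, 1,000 input and 200 output tokens each) is a made-up assumption, and `monthly_cost` is a hypothetical helper. The prices are the catalog's listed figures for Qwen3.5 0.8B ($0.01/$0.05) and claude-sonnet-4 ($3/$15).

```python
def monthly_cost(requests, input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of a workload; prices are USD per 1M tokens."""
    input_millions = requests * input_tokens / 1_000_000
    output_millions = requests * output_tokens / 1_000_000
    return input_millions * input_price + output_millions * output_price


# Hypothetical workload, chosen only for illustration.
REQUESTS = 1_000_000
INPUT_TOKENS = 1_000
OUTPUT_TOKENS = 200

slm = monthly_cost(REQUESTS, INPUT_TOKENS, OUTPUT_TOKENS, 0.01, 0.05)   # Qwen3.5 0.8B
llm = monthly_cost(REQUESTS, INPUT_TOKENS, OUTPUT_TOKENS, 3.00, 15.00)  # claude-sonnet-4

print(f"SLM: ${slm:,.2f}  LLM: ${llm:,.2f}  ratio: {llm / slm:.0f}x")
```

Under these assumptions the small model costs about $20 a month and the large one about $6,000. Real bills depend on the actual token mix and on any caching or batch discounts a provider offers.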
+ A data-driven analysis of 4,587 AI models across 95 providers: pricing trends, capability + adoption, context window growth, and the rise of open-source AI. +
++ The AI model ecosystem spans 95 providers, from tech giants to specialized startups. The top + 15 providers account for the majority of models: +
+| Provider | +Models | +Notable Models | +
|---|---|---|
| OpenRouter | +415 | +Aggregator: routes to 100+ models | +
| Google | +261 | +Gemini 2.5 Pro/Flash, Gemma 3 | +
| Requesty | +234 | +Aggregator: unified API | +
| Cohere | +197 | +Command R+, Embed v3 | +
| xAI | +193 | +Grok 3, Grok 3 Mini | +
| DeepSeek | +184 | +DeepSeek R1, V3 | +
| Meta | +163 | +Llama 4 Maverick/Scout | +
| Mistral | +155 | +Mistral Large, Codestral | +
| Alibaba (Qwen) | +139 | +Qwen3-235B, QwQ | +
| Anthropic | +121 | +Claude Sonnet 4, Opus 4 | +
| OpenAI | +115 | +GPT-4.1, o3, o4-mini | +
| Microsoft | +99 | +Phi-4, Florence 2 | +
| Amazon | +96 | +Nova Pro, Titan | +
| NVIDIA | +87 | +Nemotron, Llama Nemotron | +
| 01.ai | +83 | +Yi-Lightning, Yi-VL | +
+ AI model pricing varies dramatically, from completely free to over $15 per million input + tokens. Here is the breakdown of the 4,587 models: +
+ ++ Modern AI models increasingly support advanced capabilities beyond basic text generation: +
+| Capability | +Models | +% of Total | +Avg Input $/M | +
|---|---|---|---|
| Tool Calling | +2,350 | +51.2% | +$1.50 | +
| Reasoning | +1,306 | +28.5% | +$2.10 | +
| Structured Output | +829 | +18.1% | +$1.80 | +
| Vision (Image Input) | +1,487 | +32.4% | +$1.50 | +
| Open Weights | +527 | +11.5% | +Free or low-cost | +
| Image Generation | +28 | +0.6% | +$3.00+ | +
| Audio Input | +118 | +2.6% | +$2.50+ | +
| Audio Output | +34 | +0.7% | +$3.00+ | +
| Video Input | +167 | +3.6% | +$2.00+ | +
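The "% of Total" column can be reproduced from the model counts and the catalog total of 4,587. This short Python sketch does that. The dictionary and the `share` helper are illustrative names, while the counts are taken from the table above.

```python
TOTAL_MODELS = 4587  # catalog size stated in the report

# Counts copied from the capability table.
capability_counts = {
    "Tool Calling": 2350,
    "Reasoning": 1306,
    "Structured Output": 829,
    "Vision (Image Input)": 1487,
    "Open Weights": 527,
    "Image Generation": 28,
    "Audio Input": 118,
    "Audio Output": 34,
    "Video Input": 167,
}


def share(count, total=TOTAL_MODELS):
    """Percentage of the catalog, rounded to one decimal place."""
    return round(100 * count / total, 1)


for name, count in capability_counts.items():
    print(f"{name:<22}{count:>6}{share(count):>7.1f}%")
```

The shares do not sum to 100% because a single model can have several capabilities at once.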
+ Context windows have grown rapidly. The average context window across all models is + now approximately 200K tokens: +
+ +| Model | +Context | +Provider | +
|---|---|---|
| Google Gemini 2.5 Pro | +1,048,576 | +Google | +
| Google Gemini 2.5 Flash | +1,048,576 | +Google | +
| Meta Llama 4 Scout | +10,000,000 | +Meta | +
| Meta Llama 4 Maverick | +1,048,576 | +Meta | +
| Google Gemma 3 27B | +131,072 | +Google | +
+ 81 models are completely free to use, and 527 have open weights. Here are the most capable + free models: +
+| Model | +Context | +Capabilities | +Provider | +
|---|---|---|---|
| Google Gemini 2.5 Flash | +1M | +TC, Reasoning, Vision, SO | +Google | +
| DeepSeek R1 | +128K | +Reasoning, TC | +DeepSeek | +
| Meta Llama 4 Maverick | +1M | +TC, Vision | +Meta | +
| Alibaba Qwen3-235B | +128K | +TC, Reasoning, SO | +Alibaba | +
| Google Gemma 3 27B | +131K | +Vision, TC | +Google | +
| Use Case | +Best Free | +Best Paid (Cheapest) | +Best Overall | +
|---|---|---|---|
| General Chat | +Gemini 2.5 Flash | +DeepSeek V3 ($0.07/$0.28) | +Claude Sonnet 4 | +
| Coding | +DeepSeek R1 | +DeepSeek V3 ($0.07/$0.28) | +Claude Sonnet 4 | +
| AI Agents | +Gemini 2.5 Flash | +Grok 3 Mini ($0.30/$0.50) | +Claude Sonnet 4 | +
| Reasoning | +DeepSeek R1 | +Grok 3 Mini ($0.30/$0.50) | +o3 | +
| Vision | +Gemini 2.5 Flash | +Gemma 3 4B (free) | +Gemini 2.5 Pro | +
| Large Context | +Llama 4 Scout (10M) | +Gemini 2.5 Flash ($0.15/$0.60) | +Gemini 2.5 Pro | +
+ Compare 829 AI models with structured output / JSON mode support. GPT-4o, Claude, Gemini, and + more, with real pricing and capabilities from first-party data. +
+ ++ The top-tier models from each major provider, all supporting structured output with tool + calling. +
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|---|---|
| gpt-4o | +openai | +$2.50 | +$10 | +128K | +✓ | + |
| gpt-4o-mini | +openai | +$0.15 | +$0.60 | +128K | +✓ | + |
| o3 | +openai | +$2 | +$8 | +200K | +✓ | +✓ | +
| o4-mini | +openai | +$1.10 | +$4.40 | +200K | +✓ | +✓ | +
| claude-sonnet-4-20250514 | +anthropic | +$3 | +$15 | +200K | +✓ | +✓ | +
| claude-opus-4-20250514 | +anthropic | +$15 | +$75 | +200K | +✓ | +✓ | +
| gemini-2.5-pro | +google | +$1.25 | +$10 | +1M | +✓ | +✓ | +
| gemini-2.5-flash | +google | +$0.15 | +$0.60 | +1M | +✓ | +✓ | +
| deepseek-r1 | +deepseek | +$0.55 | +$2.19 | +128K | + | ✓ | +
| grok-3 | +xai | +$3 | +$15 | +131K | +✓ | +✓ | +
| qwen3-235b-a22b | +alibaba | +$0.14 | +$0.42 | +128K | +✓ | +✓ | +
| llama4-maverick | +meta | +$0.20 | +$0.80 | +1M | +✓ | + |
+ Most affordable models with structured output, ideal for high-volume production applications. +
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Tool Call | +
|---|---|---|---|---|---|
| gemini-2.0-flash-lite | +google | +$0.075 | +$0.30 | +1M | +✓ | +
| gemini-2.5-flash | +google | +$0.15 | +$0.60 | +1M | +✓ | +
| gpt-4o-mini | +openai | +$0.15 | +$0.60 | +128K | +✓ | +
| qwen3-235b-a22b | +alibaba | +$0.14 | +$0.42 | +128K | +✓ | +
| llama4-maverick | +meta | +$0.20 | +$0.80 | +1M | +✓ | +
| deepseek-chat | +deepseek | +$0.14 | +$0.28 | +128K | + |
+ Structured output models available at zero cost, perfect for prototyping JSON-mode + applications. +
+| Model | +Provider | +Context | +Tool Call | +Reasoning | +
|---|---|---|---|---|
| gemini-2.0-flash | +google | +1M | +✓ | + |
| gemini-2.5-flash | +google | +1M | +✓ | +✓ | +
| llama4-scout-17b-16e | +meta | +10M | + | + |
| qwen3-30b-a3b | +alibaba | +128K | + | ✓ | +
+ 780 models that support both structured output and tool calling, the ideal combination for + building AI agents that return structured data from function calls. +
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Reasoning | +
|---|---|---|---|---|---|
| gemini-2.0-flash-lite | +google | +$0.075 | +$0.30 | +1M | + |
| gemini-2.5-flash | +google | +$0.15 | +$0.60 | +1M | +✓ | +
| gpt-4o-mini | +openai | +$0.15 | +$0.60 | +128K | + |
| qwen3-235b-a22b | +alibaba | +$0.14 | +$0.42 | +128K | +✓ | +
| claude-sonnet-4-20250514 | +anthropic | +$3 | +$15 | +200K | +✓ | +
| grok-3-mini | +xai | +$0.30 | +$0.50 | +131K | +✓ | +
+ 672 models with both structured output and reasoning capabilities, for complex tasks that + require both thinking and structured responses. +
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Tool Call | +
|---|---|---|---|---|---|
| gemini-2.5-flash | +google | +$0.15 | +$0.60 | +1M | +✓ | +
| qwen3-235b-a22b | +alibaba | +$0.14 | +$0.42 | +128K | +✓ | +
| deepseek-chat | +deepseek | +$0.14 | +$0.28 | +128K | + |
| deepseek-r1 | +deepseek | +$0.55 | +$2.19 | +128K | + |
| o4-mini | +openai | +$1.10 | +$4.40 | +200K | +✓ | +
| o3 | +openai | +$2 | +$8 | +200K | +✓ | +
| claude-sonnet-4-20250514 | +anthropic | +$3 | +$15 | +200K | +✓ | +
| Use Case | +Recommended Model | +Why | +
|---|---|---|
| API response parsing | +gpt-4o-mini | +Low cost with SO + tool calling | +
| Data extraction | +gemini-2.5-flash | +1M context + SO + reasoning + cheap | +
| AI agents | +claude-sonnet-4 | +Best tool calling + SO + reasoning | +
| High volume / cheap | +gemini-2.0-flash-lite | +Lowest cost at $0.075/M input | +
| Complex reasoning | +o3 | +Best reasoning + SO + tool calling | +
| Prototyping | +gemini-2.5-flash | +Free tier, 1M context, all capabilities | +
+ All data is sourced from first-party APIs. Models are identified by having
+ structured_output: true in their metadata. Aggregator providers are excluded from
+ ranking tables to avoid duplicate models. Pricing is per million tokens.
+
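The selection rule described above can be sketched in a few lines of Python: keep records whose metadata has `structured_output: true`, drop aggregator providers, and rank by input price. Only the `structured_output` flag comes from the methodology. The other field names (`id`, `provider`, `input_price`), the `AGGREGATORS` set, and the two `example-*` records are assumptions made for illustration.

```python
# Providers described as aggregators in the provider table (assumed identifiers).
AGGREGATORS = {"openrouter", "requesty"}

models = [
    {"id": "gemini-2.0-flash-lite", "provider": "google", "input_price": 0.075, "structured_output": True},
    {"id": "gpt-4o-mini", "provider": "openai", "input_price": 0.15, "structured_output": True},
    {"id": "deepseek-chat", "provider": "deepseek", "input_price": 0.14, "structured_output": True},
    # Hypothetical records that the filter should drop.
    {"id": "example-routed-model", "provider": "openrouter", "input_price": 0.05, "structured_output": True},
    {"id": "example-no-so", "provider": "openai", "input_price": 0.01, "structured_output": False},
]


def rank_structured_output(records):
    """Filter to first-party structured-output models, cheapest input price first."""
    eligible = [
        m for m in records
        if m.get("structured_output") is True and m["provider"] not in AGGREGATORS
    ]
    return sorted(eligible, key=lambda m: m["input_price"])


for m in rank_structured_output(models):
    print(f'{m["id"]:<24}{m["provider"]:<10}${m["input_price"]}/M input')
```

Prices are per million input tokens, matching the tables. A fuller ranking would presumably also weigh output price and context size.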
+ Compare 2,350 AI models with tool/function calling across 95 providers. Find the best + model for agents, automation, and API integration. +
+ +The top models with tool calling compared side by side.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Reasoning | +
|---|---|---|---|---|---|
| gpt-4o | +openai | +$2.5 | +$10 | +128K | ++ |
| gpt-4o-mini | +openai | +$0.15 | +$0.6 | +128K | ++ |
| gpt-4.1 | +openai | +$2 | +$8 | +1M | ++ |
| gpt-4.1-mini | +openai | +$0.4 | +$1.6 | +1M | ++ |
| gpt-4.1-nano | +openai | +$0.1 | +$0.4 | +1M | ++ |
| o3 | +openai | +$10 | +$40 | +200K | +✓ | +
| o3-mini | +openai | +$1.1 | +$4.4 | +200K | +✓ | +
| o4-mini | +openai | +$1.1 | +$4.4 | +200K | +✓ | +
| gemini-2.0-flash | +google | +$0.1 | +$0.4 | +1M | + |
| deepseek-chat | +deepseek | +$0.14 | +$0.28 | +1M | + |
| qwen3-235b-a22b | +alibaba | +$2 | +$8 | +? | +✓ | +
| llama-4-maverick | +digitalocean | +$0.25 | +$0.87 | +1M | ++ |
| llama-4-scout | +google-vertex | +$0.25 | +$0.7 | +1M | ++ |
Most affordable models with tool calling, for cost-sensitive agents and automation.
+| Model | +Provider | +Input $/1M | +Output $/1M | +Context | +Reasoning | +
|---|---|---|---|---|---|
| ling-2.6-flash | +inclusionai | +$0.01 | +$0.03 | +262K | ++ |
| bdc-coder | +inferencenet | +$0.01 | +$0.01 | +131K | ++ |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | +klusterai | +$0.015 | +$0.02 | +131K | ++ |
| granite-4.0-h-micro | +cloudflare | +$0.017 | +$0.112 | +131K | ++ |
| llama-3.1-8b-instruct--fp-16 | +inferencenet | +$0.02 | +$0.03 | +131K | ++ |
| schematron-3b | +inferencenet | +$0.02 | +$0.05 | +131K | ++ |
| schematron-v3 | +inferencenet | +$0.02 | +$0.05 | +131K | ++ |
| gpt-oss-20b | +inferencenet | +$0.03 | +$0.15 | +131K | ++ |
| schematron-v2-turbo | +inferencenet | +$0.03 | +$0.15 | +131K | ++ |
| openai--gpt-oss-20b | +neuralwatt | +$0.03 | +$0.16 | +? | +โ | +
| qwen--qwen3-4b-fp8 | +novitaai | +$0.03 | +$0.03 | +128K | +โ | +
| liquid-ai--LFM2-24B-A2B | +togetherai | +$0.03 | +$0.12 | +131K | ++ |
| amazon-nova-micro | +amazon | +$0.035 | +$0.14 | +128K | ++ |
| amazon-nova-micro | +amazon-bedrock | +$0.035 | +$0.14 | +128K | ++ |
| mistral-nemo-12b-instruct--fp-8 | +inferencenet | +$0.0375 | +$0.1 | +131K | ++ |
54 models with tool calling at zero cost, perfect for prototyping agents.
+| Model | +Provider | +Context | +Reasoning | +
|---|---|---|---|
| openrouter--owl-alpha | +openrouter | +1M | ++ |
| deepseek--deepseek-v4-flash--free | +openrouter | +1M | +โ | +
| qwen--qwen3-coder--free | +openrouter | +1M | ++ |
| nvidia--nemotron-3-super-120b-a12b--free | +openrouter | +1M | +โ | +
| gemma-4-26b-a4b-it | +auriko | +262K | +โ | +
| gemma-4-31b-it | +auriko | +262K | +โ | +
| arcee-ai--trinity-large-thinking--free | +openrouter | +262K | +โ | +
| google--gemma-4-26b-a4b-it--free | +openrouter | +262K | +โ | +
| google--gemma-4-31b-it--free | +openrouter | +262K | +โ | +
| nvidia--nemotron-3-nano-omni-30b-a3b-reasoning--free | +openrouter | +256K | +โ | +
278 models with tool calling you can run locally, for privacy-first agents.
+| Model | +Provider | +Context | +Reasoning | +
|---|---|---|---|
| google--gemma-4-31b-it | +orcarouter | +1M | ++ |
| qwen--qwen3.5-flash-2026-02-23 | +orcarouter | +1M | ++ |
| qwen--qwen3.5-flash | +orcarouter | +1M | ++ |
| qwen--qwen3.6-flash-2026-04-16 | +orcarouter | +1M | ++ |
| qwen--qwen3.6-flash | +orcarouter | +1M | ++ |
| meta-llama-4-maverick-17b | +amazon-bedrock | +1M | ++ |
| meta-llama-4-scout-17b | +amazon-bedrock | +1M | ++ |
| minimax-m2-1 | +amazon-bedrock | +1M | ++ |
| minimax-m2-5 | +amazon-bedrock | +1M | ++ |
| minimax-m2 | +amazon-bedrock | +1M | ++ |
+ Models with both tool calling and reasoning: the most capable for complex agentic workflows + that need planning and execution. +
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| openai--gpt-oss-20b | neuralwatt | $0.03 | $0.16 | ? |
| qwen--qwen3-4b-fp8 | novitaai | $0.03 | $0.03 | 128K |
| gpt-oss-120b | inferencenet | $0.05 | $0.45 | 131K |
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K |
| qwen3-30b-a3b-fp8 | cloudflare | $0.051 | $0.335 | 40K |
| glm-4.7-flash | cloudflare | $0.06 | $0.4 | 131K |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K |
| hermes-4-llama-3.1-8b | nousresearch | $0.06 | $0.12 | 131K |
| seed-1.6-flash | bytedance | $0.07 | $0.3 | 262K |
| ring-2.6-1t | inclusionai | $0.07 | $0.62 | 262K |
| zai-org--glm-4.7-flash | novitaai | $0.07 | $0.4 | 200K |
| microsoft-phi-4-mini-reasoning | microsoft | $0.075 | $0.3 | 128K |
| Qwen--Qwen3-32B-TEE | chutes | $0.08 | $0.24 | 40K |
| gpt-oss-120b | clarifai | $0.09 | $0.36 | 131K |
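To make the per-million-token prices easier to compare, the following is a minimal sketch of how they translate into the cost of a single request. The function name `request_cost` and the token counts are illustrative assumptions; only the two prices come from the table (gpt-oss-120b on inferencenet).

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Prices for gpt-oss-120b on inferencenet, taken from the table above.
# The token counts (10,000 in, 2,000 out) are made up for illustration.
cost = request_cost(10_000, 2_000, input_price_per_m=0.05, output_price_per_m=0.45)
print(f"${cost:.4f}")
```

Because agent workloads often generate long outputs, the output price can dominate even when the input price looks negligible, so it is worth checking both columns.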
Models with tool calling and image understanding, for agents that need to see and act.
| Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|
| Qwen--Qwen3.6-35B-A3B | neuralwatt | $0.05 | $0.1 | ? |
| qwen3.6-35b-fast | neuralwatt | $0.05 | $0.1 | ? |
| openai--gpt-oss-120b | novitaai | $0.05 | $0.25 | 131K |
| amazon-nova-lite | amazon | $0.06 | $0.24 | 300K |
| amazon-nova-lite | amazon-bedrock | $0.06 | $0.24 | 300K |
| Nemotron-3-Nano-Omni | nebius | $0.06 | $0.24 | 128K |
| openai--gpt-5-nano | aimlapi | $0.065 | $0.52 | 400K |
| seed-1.6-flash | bytedance | $0.07 | $0.3 | 262K |
| gemini-1.5-flash-8b | | $0.075 | $0.3 | 1M |
| gemini-1.5-flash | | $0.075 | $0.3 | 1M |
| gemini-2.0-flash-lite | | $0.075 | $0.3 | 1M |
| gemini-2-0-flash-lite | google-vertex | $0.075 | $0.3 | 1M |
| microsoft-phi-4-mini-multimodal | microsoft | $0.08 | $0.32 | 128K |
| qwen--qwen3-vl-8b-instruct | novitaai | $0.08 | $0.5 | 131K |
| seed-2.0-mini | bytedance | $0.1 | $0.4 | 262K |
Models with tool calling and large context windows, for agents processing long documents or complex multi-step tasks.
| Model | Provider | Context | Input $/1M | Reasoning |
|---|---|---|---|---|
| ling-2.6-flash | inclusionai | 262K | $0.01 | |
| bdc-coder | inferencenet | 131K | $0.01 | |
| klusterai--Meta-Llama-3.1-8B-Instruct-Turbo | klusterai | 131K | $0.015 | |
| granite-4.0-h-micro | cloudflare | 131K | $0.017 | |
| llama-3.1-8b-instruct--fp-16 | inferencenet | 131K | $0.02 | |
| schematron-3b | inferencenet | 131K | $0.02 | |
| schematron-v3 | inferencenet | 131K | $0.02 | |
| gpt-oss-20b | inferencenet | 131K | $0.03 | |
| schematron-v2-turbo | inferencenet | 131K | $0.03 | |
| qwen--qwen3-4b-fp8 | novitaai | 128K | $0.03 | ✓ |
| liquid-ai--LFM2-24B-A2B | togetherai | 131K | $0.03 | |
| amazon-nova-micro | amazon | 128K | $0.035 | |
| amazon-nova-micro | amazon-bedrock | 128K | $0.035 | |
| mistral-nemo-12b-instruct--fp-8 | inferencenet | 131K | $0.0375 | |
| klusterai--Meta-Llama-3.3-70B-Instruct-Turbo | klusterai | 131K | $0.038 | |
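When filtering these rows by context size in a script, the shorthand labels (`131K`, `1M`, `?`) first need to be turned into numbers. The sketch below is one way to do that; the helper names `parse_context` and `meets_context` are hypothetical, and it assumes decimal multipliers (K = 1,000, M = 1,000,000). The catalog may be rounding binary sizes such as 131,072, so treat the results as approximate.

```python
from typing import Optional

# Assumed decimal multipliers; the page does not state how labels were rounded.
_SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_context(label: str) -> Optional[int]:
    """Convert a label such as '262K' or '1M' into an approximate token count.

    Returns None for '?' or any other label that does not parse.
    """
    label = label.strip().upper()
    if not label or label[-1] not in _SUFFIXES:
        return None
    try:
        return int(float(label[:-1]) * _SUFFIXES[label[-1]])
    except ValueError:
        return None


def meets_context(label: str, minimum_tokens: int) -> bool:
    """True when the label parses and is at least minimum_tokens."""
    tokens = parse_context(label)
    return tokens is not None and tokens >= minimum_tokens


for label in ["262K", "131K", "1M", "?"]:
    print(label, parse_context(label))
```

Rows with an unknown context (`?`) are treated as not meeting any minimum, which is the conservative choice for an agent that must fit a long document.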
All data is sourced from first-party APIs. Tool calling capability is defined by the provider's own classification: models that support function/tool calling via their API. Aggregator providers are excluded from ranking tables to avoid duplicate models.
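The selection rule described here can be sketched as a small filter. This is not the catalog's actual pipeline: the `ModelEntry` fields, the `rank_tool_models` function, and the `example-aggregator` provider name are all assumptions made for illustration, as are the third and fourth sample rows.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ModelEntry:
    # Hypothetical record shape; the real catalog schema is not shown on this page.
    model: str
    provider: str
    input_price: float  # USD per 1M input tokens
    tool_calling: bool  # the provider's own classification


def rank_tool_models(entries: Iterable[ModelEntry],
                     aggregators: Set[str]) -> List[ModelEntry]:
    """Keep tool-calling models from non-aggregator providers, cheapest first."""
    eligible = [e for e in entries
                if e.tool_calling and e.provider not in aggregators]
    return sorted(eligible, key=lambda e: (e.input_price, e.model))


entries = [
    ModelEntry("gpt-oss-120b", "inferencenet", 0.05, True),
    ModelEntry("qwen--qwen3-4b-fp8", "novitaai", 0.03, True),
    # The two rows below are invented to exercise the filter.
    ModelEntry("gpt-oss-120b", "example-aggregator", 0.04, True),
    ModelEntry("example-model", "novitaai", 0.01, False),
]

ranked = rank_tool_models(entries, aggregators={"example-aggregator"})
for entry in ranked:
    print(entry.model, entry.provider, entry.input_price)
```

Excluding aggregators before sorting keeps the same underlying model from appearing several times under reseller prices, which is the duplication the methodology note refers to.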