🔗 Website • 📖 Docs • 🧪 Playground • 🗝️ API Key • 📚 Blog
One full-stack platform for Compute, Inference, Fine-tuning, and RAG on Open Source Models.
Llama 3.3 70B Instruct General • Instruction |
Nemotron Nano 30B Efficient • Enterprise |
DeepSeek R1 Reasoning • Advanced |
DeepSeek R1 Distill 70B Reasoning • Efficient |
DeepSeek V3 Multilingual • General |
DeepSeek V3.2 Improved • Multilingual |
DeepSeek V4 Flash Fast • Low Latency |
DeepSeek V4 Pro High Performance • Reasoning |
GLM 4.7 Chat • Multilingual |
GLM 5 Reasoning • Multilingual |
Kimi K2 Thinking Reasoning • Long Context |
Fara 7B Lightweight • Enterprise |
Minimax M2.5 Multimodal • Chat |
Mistral 7B Efficient • Open |
Kimi K2 Instruct Instruction • Long Context |
Nemotron Nano Omni Multimodal • Efficient |
Nemotron Super 120B Enterprise • High Performance |
GPT OSS 120B Open Model • Large |
Qwen3 Max General • Multilingual |
Qwen3 Next 80B Reasoning • Advanced |
Qwen3 Coder Next Advanced • Coding |
Qwen3 Coder Plus Balanced • Coding |
Claude Haiku 4.5 Fast • Efficient |
Claude Opus 4.5 Premium • Reasoning |
Claude Opus 4.6 Advanced • Reasoning |
GPT-4o Multimodal • Fast |
GPT-4o Mini Lightweight • Fast |
GPT-5.4 Next Gen • Reasoning |
GPT-5.4 Mini Efficient • Fast |
GPT-5.4 Nano Ultra Fast • Lightweight |
Claude Opus 4.7 Top Tier • Reasoning |
Claude Sonnet 4.5 Balanced • Chat |
Claude Sonnet 4.6 Balanced • Reasoning |
Gemini 2.5 Flash Fast • Multimodal |
Gemini 2.5 Pro Advanced • Multimodal |
Gemini 3 Flash Preview • Fast |
Gemini 3.1 Pro Preview • Advanced |
Kimi K2.5 Long Context • Chat |
Kimi K2.6 Improved • Long Context |
GPT-4.1 General • Reliable |
Qwen3 Plus Balanced • Multilingual |
Qwen3.6 Max Preview • High Performance |
Qwen3 Coder 30B Coding • Instruct |
Qwen3 Coder 480B Coding • Large |
Qwen3 Coder Flash Fast • Coding |
Qwen3 VL 235B Instruct Vision • Instruct |
Qwen3 VL 235B Thinking Vision • Reasoning |
Qwen3 VL 30B Vision • Efficient |
Qwen3 VL 8B Vision • Lightweight |
Qwen3 VL Flash Vision • Fast |
Qwen3 VL Plus Vision • Advanced |
Qwen3.5 122B Large • Multilingual |
Qwen3.5 27B Balanced • Efficient |
Qwen3.5 35B Reasoning • Balanced |
Qwen3.5 397B Massive • Advanced |
Qwen3.5 Flash Fast • Efficient |
Qwen3.5 Plus Balanced • Multilingual |
Qwen3.6 27B Efficient • Multilingual |
Qwen3.6 35B Reasoning • Balanced |
Qwen3.6 Plus Advanced • Multilingual |
Tencent Hunyuan OCR OCR • Vision |
Run powerful AI models via simple APIs - no infrastructure required.
We handle routing, scaling, tuning, and reliability so your team can focus on building.
Need higher performance or predictable workloads?
Launch dedicated GPU instances with better latency, control, and consistent performance.
As demand grows, scale to high-performance infrastructure.
Move to bare metal and AI appliances for maximum performance and lower cost at scale.
From zero setup → dedicated compute → hyperscale infrastructure - all in one platform.
