Qubrid AI

🔗 Website • 📖 Docs • 🧪 Playground • 🗝️ API Key • 📚 Blog

One fullstack platform for Compute, Inference, Fine-tuning, and RAG on Open Source Models.

🚀 Model Gallery

Llama 3.3 70B Instruct _{General • Instruction}	Nemotron Nano 30B _{Efficient • Enterprise}	DeepSeek R1 _{Reasoning • Advanced}	DeepSeek R1 Distill 70B _{Reasoning • Efficient}	DeepSeek V3 _{Multilingual • General}
DeepSeek V3.2 _{Improved • Multilingual}	DeepSeek V4 Flash _{Fast • Low Latency}	DeepSeek V4 Pro _{High Performance • Reasoning}	GLM 4.7 _{Chat • Multilingual}	GLM 5 _{Reasoning • Multilingual}
Kimi K2 Thinking _{Reasoning • Long Context}	Fara 7B _{Lightweight • Enterprise}	Minimax M2.5 _{Multimodal • Chat}	Mistral 7B _{Efficient • Open}	Kimi K2 Instruct _{Instruction • Long Context}
Nemotron Nano Omni _{Multimodal • Efficient}	Nemotron Super 120B _{Enterprise • High Performance}	GPT OSS 120B _{Open Model • Large}	Qwen3 Max _{General • Multilingual}	Qwen3 Next 80B _{Reasoning • Advanced}
Qwen3 Coder Next _{Advanced • Coding}	Qwen3 Coder Plus _{Balanced • Coding}	Claude Haiku 4.5 _{Fast • Efficient}	Claude Opus 4.5 _{Premium • Reasoning}	Claude Opus 4.6 _{Advanced • Reasoning}
GPT-4o _{Multimodal • Fast}	GPT-4o Mini _{Lightweight • Fast}	GPT-5.4 _{Next Gen • Reasoning}	GPT-5.4 Mini _{Efficient • Fast}	GPT-5.4 Nano _{Ultra Fast • Lightweight}
Claude Opus 4.7 _{Top Tier • Reasoning}	Claude Sonnet 4.5 _{Balanced • Chat}	Claude Sonnet 4.6 _{Balanced • Reasoning}	Gemini 2.5 Flash _{Fast • Multimodal}	Gemini 2.5 Pro _{Advanced • Multimodal}
Gemini 3 Flash _{Preview • Fast}	Gemini 3.1 Pro _{Preview • Advanced}	Kimi K2.5 _{Long Context • Chat}	Kimi K2.6 _{Improved • Long Context}	GPT-4.1 _{General • Reliable}
Qwen3 Plus _{Balanced • Multilingual}	Qwen3.6 Max _{Preview • High Performance}	Qwen3 Coder 30B _{Coding • Instruct}	Qwen3 Coder 480B _{Coding • Large}	Qwen3 Coder Flash _{Fast • Coding}
Qwen3 VL 235B Instruct _{Vision • Instruct}	Qwen3 VL 235B Thinking _{Vision • Reasoning}	Qwen3 VL 30B _{Vision • Efficient}	Qwen3 VL 8B _{Vision • Lightweight}	Qwen3 VL Flash _{Vision • Fast}
Qwen3 VL Plus _{Vision • Advanced}	Qwen3.5 122B _{Large • Multilingual}	Qwen3.5 27B _{Balanced • Efficient}	Qwen3.5 35B _{Reasoning • Balanced}	Qwen3.5 397B _{Massive • Advanced}
Qwen3.5 Flash _{Fast • Efficient}	Qwen3.5 Plus _{Balanced • Multilingual}	Qwen3.6 27B _{Efficient • Multilingual}	Qwen3.6 35B _{Reasoning • Balanced}	Qwen3.6 Plus _{Advanced • Multilingual}
Tencent Hunyuan OCR _{OCR • Vision}

🚀 What You Can Do with Qubrid

⚡ 1. Serverless API Inference

Run powerful AI models via simple APIs - no infrastructure required.
We handle routing, scaling, tuning, and reliability so your team can focus on building.

🖥️ 2. Deploy on GPU VMs

Need higher performance or predictable workloads?
Launch dedicated GPU instances with better latency, control, and consistent performance.

🏭 3. Scale with AI Factory

As demand grows, scale to high-performance infrastructure.
Move to bare metal and AI appliances for maximum performance and lower cost at scale.

From zero setup → dedicated compute → hyperscale infrastructure - all in one platform.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Qubrid AI

🚀 Model Gallery

🚀 What You Can Do with Qubrid

⚡ 1. Serverless API Inference

🖥️ 2. Deploy on GPU VMs

🏭 3. Scale with AI Factory

Pinned Loading

Repositories

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

People

Top languages

Uh oh!

Most used topics

Uh oh!