This quickstart serves a small language model on CPUs using the vLLM inference runtime.
Updated Jun 3, 2026 - Go Template
Deploys Llama 3.2-3B on vLLM with Llama Stack and MCP servers in OpenShift AI.
AI quickstart for deploying an LLM with tool calling enabled on top of OpenShift AI.
Dynamically route user prompts to LoRA adapters or a base LLM using semantic evaluation on Red Hat OpenShift AI with LiteLLM and vLLM.
An easy way to add many useful, community-provided custom workbench images.