:orphan:
Welcome to the NVIDIA RAG Blueprint documentation. This documentation explains how to get started with the RAG Blueprint, how to customize it, and how to troubleshoot it.
- To view this documentation on docs.nvidia.com, browse to NVIDIA RAG Blueprint Documentation.
- To view this documentation on GitHub, browse to NVIDIA RAG Blueprint Documentation.
For the release notes, refer to Release Notes.
For hardware requirements and other information, refer to the Support Matrix.
- Follow the procedures in Get Started to begin using the NVIDIA RAG Blueprint quickly.
- Experiment and test in the Web User Interface.
- Explore the notebooks that demonstrate how to use the APIs. For details, refer to Notebooks.
You can deploy the RAG Blueprint with Docker, Helm, or NIM Operator, and target dedicated hardware or a Kubernetes cluster. Use the following documentation to deploy the blueprint.
- Deploy with Docker (Self-Hosted Models)
- Deploy with Docker (NVIDIA-Hosted Models)
- Deploy on Kubernetes with Helm
- Deploy on Kubernetes with Helm from the repository
- Deploy on Kubernetes with Helm and MIG Support
After you deploy the RAG Blueprint, you can customize it for your use cases.
- Common configurations
  - Best Practices for Common Settings
  - Change the LLM or Embedding Model
  - Customize LLM Parameters at Runtime
  - Customize Prompts
  - Model Profiles for Hardware Configurations
  - Multi-Collection Retrieval
  - Multi-Turn Conversation Support
  - Reasoning in Nemotron LLM model
  - Self-reflection to improve accuracy
  - Summarization
- Data Ingestion & Processing
- Vector Database and Retrieval
- Multimodal and Advanced Generation
- Evaluation
- Governance
- Observability and Telemetry