Production-style RAG system for SRE/DevOps knowledge. It ingests runbooks and docs into Chroma, serves answers through a FastAPI + Streamlit Ops Copilot, and is designed for GKE deployment with Terraform and GitHub Actions CI/CD.
flowchart LR
A[Runbooks and Docs] --> B[Ingestor Service]
B --> C[Embeddings]
C --> D[(Chroma Vector DB)]
E[Retriever API] --> D
F[DevOps Agent API] --> E
F --> D
F --> G[LLM Provider]
F --> H[(Cloud SQL Postgres Logs/Feedback)]
I[Streamlit UI] --> F
U[User] --> I
Flow: documents are embedded and indexed in Chroma, the UI sends questions to the DevOps agent, and the agent retrieves context, calls the LLM, and logs interactions.
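The request path described above (retrieve context, call the LLM, log the interaction) can be sketched in a few lines. This is a minimal illustration, not the project's actual code: `Interaction`, `build_prompt`, `answer_question`, `keyword_retrieve`, and `fake_llm` are hypothetical names, and the retriever, LLM, and log store are in-memory stand-ins for Chroma, the LLM provider, and Postgres.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Interaction:
    """One logged question/answer exchange (hypothetical record shape)."""
    question: str
    context: list[str]
    answer: str


def build_prompt(question: str, context: list[str]) -> str:
    """Join retrieved chunks and the question into a single prompt string."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Context:\n{joined}\n\nQuestion: {question}"


def answer_question(
    question: str,
    retrieve: Callable[[str, int], list[str]],
    generate: Callable[[str], str],
    log: Callable[[Interaction], None],
    top_k: int = 3,
) -> str:
    """Retrieve context, call the LLM, log the interaction, return the answer."""
    context = retrieve(question, top_k)
    answer = generate(build_prompt(question, context))
    log(Interaction(question, context, answer))
    return answer


# Stand-ins so the sketch runs without Chroma, an LLM provider, or Postgres.
DOCS = [
    "Restart the ingress controller with kubectl rollout restart.",
    "Rotate database credentials every 90 days.",
    "Check pod logs before restarting a deployment.",
]


def keyword_retrieve(question: str, top_k: int) -> list[str]:
    """Rank documents by shared lowercase words (not real vector search)."""
    words = set(question.lower().split())
    ranked = sorted(
        DOCS, key=lambda d: len(words & set(d.lower().split())), reverse=True
    )
    return ranked[:top_k]


def fake_llm(prompt: str) -> str:
    """Placeholder for the LLM call; counts the context lines it received."""
    chunks = sum(line.startswith("- ") for line in prompt.splitlines())
    return f"answer based on {chunks} chunks"


history: list[Interaction] = []
reply = answer_question(
    "How do I restart the ingress controller?",
    keyword_retrieve,
    fake_llm,
    history.append,
    top_k=2,
)
print(reply)
```

Passing the retriever, generator, and logger as callables keeps the orchestration step testable without any of the real backends running.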
uv sync --all-groups
docker compose up --build
Optional local runs (without Docker):
make ingest-local
make run-retriever
make run-devops-agent
make run-ui
Metrics endpoints:
- DevOps Agent: http://localhost:8080/metrics
- Retriever: http://localhost:8081/metrics
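For a quick check that both services are exporting metrics, a small script along these lines may help. It assumes the endpoints serve the Prometheus text exposition format without trailing timestamps, which this README does not state explicitly. `parse_metrics` and `fetch_metrics` are hypothetical helper names.

```python
from __future__ import annotations

import urllib.request

# URLs taken from the list above.
ENDPOINTS = {
    "devops-agent": "http://localhost:8080/metrics",
    "retriever": "http://localhost:8081/metrics",
}


def parse_metrics(text: str) -> dict[str, float]:
    """Map each sample (metric name plus labels) to its numeric value.

    Assumes Prometheus text format with no trailing timestamp per line.
    Comment lines (# HELP, # TYPE) and unparsable lines are skipped.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label values containing spaces survive.
        name, _, value = line.rpartition(" ")
        try:
            samples[name] = float(value)
        except ValueError:
            continue
    return samples


def fetch_metrics(url: str, timeout: float = 5.0) -> dict[str, float]:
    """Download one metrics page and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

With the stack running, `fetch_metrics(ENDPOINTS["retriever"])` should return a dictionary of samples. A connection error usually means the service is not up yet.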
Run the RAG evaluation pipeline:
make eval
Run three experiment variants and compare them in MLflow:
make eval-experiments
make mlflow-ui
Results are saved under eval/results/YYYY-MM-DD.json.
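Because result files are named by date, the most recent run can be found by sorting filenames. The sketch below shows one way to do that. `latest_results` and `load_latest` are hypothetical helpers, and nothing is assumed about the JSON schema beyond it being valid JSON.

```python
from __future__ import annotations

import json
import re
from pathlib import Path

# Matches the YYYY-MM-DD.json naming convention only.
DATE_NAME = re.compile(r"^\d{4}-\d{2}-\d{2}\.json$")


def latest_results(results_dir: str | Path = "eval/results") -> Path | None:
    """Return the newest dated results file, or None if there is none.

    ISO dates sort chronologically as plain strings, so a lexical sort works.
    """
    candidates = sorted(
        p for p in Path(results_dir).glob("*.json") if DATE_NAME.match(p.name)
    )
    return candidates[-1] if candidates else None


def load_latest(results_dir: str | Path = "eval/results"):
    """Load the newest results file as parsed JSON, or None if none exists."""
    path = latest_results(results_dir)
    if path is None:
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Run from the repository root, `load_latest()` returns the parsed contents of the newest file. Files that do not match the date pattern are ignored.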
Validation checklist and latest run results: VALIDATION.md
Low-cost single-cluster infrastructure is defined in infra/terraform.
See infra/README.md for:
- Terraform apply flow
- required GitHub Actions variables/secrets
- DNS + ingress setup
- observability stack rollout (Grafana + in-cluster Prometheus + GKE managed metrics/SLOs)