Role-based RAG system with LangChain and ChromaDB for the BadCompany game.
pip install -r requirements.txt
Copy and configure .env file:
copy .env.example .env
Template for .env file:
OPENROUTER_API_KEY=your_api_key
OPENROUTER_BASE_URL=your_project_url
TAVILY_API_KEY=''
HF_TOKEN=your_api_key
USE_HF_EMBEDDINGS=true
HF_LOGGING=true
HF_API_URL=https://api-inference.huggingface.co/models/BAAI/bge-small-en-v1.5
USER_AGENT=RAG-System/1.0
See SETUP_GUIDE.md for detailed configuration instructions.
uvicorn server:app --reload --port 8000
curl http://127.0.0.1:8000/debug/status
Server will be available at http://localhost:8000
- HuggingFace Embeddings Integration
  - Automatic endpoint discovery and fallback
  - Batch processing for efficiency
  - Support for multiple embedding models
  - Fallback to local sentence-transformers
- Role-Based Access Control
  - Public, worker, and admin document access
  - Separate vector stores per role
  - Prevents privilege escalation
- Robust Error Handling
  - Clear error messages for missing credentials
  - Automatic retry with endpoint variations
  - Graceful degradation
- GET / - Health check
- GET /debug/status - Diagnostic information (embeddings, docs, config)
- POST /session/start - Initialize session
- POST /agent/chat - Chat with RAG system
- POST /judge/evaluate - Evaluate attack attempts
- GET /scenarios - List available scenarios
- server.py - FastAPI server with RAG endpoints
- core/ - RAG pipeline (embeddings, retrieval, LLM, vectorstore)
  - embeddings.py - HuggingFace embeddings wrapper with auto-discovery
  - retrieval.py - Document loading and RAG chain
  - vectorstore.py - ChromaDB integration
- data/ - Documents organized by access level (public, worker, admin)
- config/ - Settings and credentials
- vector_store_*/ - ChromaDB vector stores per role
- public: Access to public documents only
- worker: Access to public + worker documents
- admin: Access to all documents
Documents in data/admin/ contain sensitive information that attackers try to extract.
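The access levels above can be sketched as a small lookup table. This is only an illustration of the rule "each role sees its own level plus the levels below it"; the names `ROLE_LEVELS`, `allowed_dirs`, and `can_read` are hypothetical and are not taken from the project's code.

```python
from pathlib import Path

# Assumed mapping of each role to the data/ subdirectories it may read.
ROLE_LEVELS = {
    "public": ["public"],
    "worker": ["public", "worker"],
    "admin": ["public", "worker", "admin"],
}


def allowed_dirs(role: str, data_root: str = "data") -> list[Path]:
    """Return the document directories a role may read; unknown roles raise."""
    if role not in ROLE_LEVELS:
        raise ValueError(f"unknown role: {role!r}")
    return [Path(data_root) / level for level in ROLE_LEVELS[role]]


def can_read(role: str, level: str) -> bool:
    """True if the role is allowed to read documents at the given level."""
    return level in ROLE_LEVELS.get(role, [])
```

Rejecting unknown roles instead of defaulting them to some level is one way to avoid accidental privilege escalation: a misspelled or forged role name gets no documents at all.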
The system uses the HuggingFace Inference API by default, with automatic endpoint discovery:
- Automatic Format Detection: Tries array and string payload formats
- Batch Processing: Embeds multiple texts in a single API call when possible
- Smart Fallbacks: Tests multiple endpoint variations automatically
- Local Fallback: Can use sentence-transformers if the HF API is unavailable
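The payload-format fallback described above might look roughly like the following. It is a minimal sketch, not the code in core/embeddings.py: `embed_batch` is a hypothetical name, and `post` is a hypothetical stand-in for the HTTP call so the logic can be read (and tested) without network access. The `inputs` key follows the usual Inference API request shape, but treat the exact payload as an assumption.

```python
from typing import Callable, Sequence

Vector = list[float]
# Hypothetical stand-in for the HTTP request: takes a JSON-style payload and
# returns the decoded response, raising an exception if the call fails.
PostFn = Callable[[dict], object]


def embed_batch(texts: Sequence[str], post: PostFn) -> list[Vector]:
    """Try one batched call with an array payload, then fall back to one call per string."""
    try:
        result = post({"inputs": list(texts)})
        # Accept the batched response only if it has one vector per input text.
        if isinstance(result, list) and len(result) == len(texts):
            return result
    except Exception:
        pass  # Array payload rejected; fall through to the string format.
    # String format: one request per text (slower, but more widely accepted).
    return [post({"inputs": text}) for text in texts]
```

The batched call is attempted first because it needs a single round trip; the per-string loop is only the fallback path.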
Supported models:
- sentence-transformers/all-MiniLM-L6-v2 (default, fast)
- BAAI/bge-small-en-v1.5 (better quality)
- BAAI/bge-base-en-v1.5 (best quality)
See SETUP_GUIDE.md for configuration options.
No embeddings / token errors:
- Ensure HF_TOKEN is set in .env
- Get token from https://huggingface.co/settings/tokens
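A quick pre-flight check can catch missing credentials before the server starts. The sketch below assumes the .env values have already been loaded into the process environment, and it assumes that `OPENROUTER_API_KEY` and `HF_TOKEN` are the required keys; `missing_credentials` is a hypothetical helper, not part of the project.

```python
import os
from typing import Mapping

# Assumption: these two keys are required; adjust to match the real settings.
REQUIRED_KEYS = ("OPENROUTER_API_KEY", "HF_TOKEN")


def missing_credentials(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return required keys that are unset, blank, or still a your_* placeholder."""
    missing = []
    for key in REQUIRED_KEYS:
        # Strip whitespace and surrounding quotes so '' counts as empty.
        value = env.get(key, "").strip().strip("'\"")
        if not value or value.startswith("your_"):
            missing.append(key)
    return missing
```

Calling `missing_credentials()` at startup and printing the returned names gives a clearer message than a failed API call later on.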
Slow first request:
- HF models have 30-60s cold start on first request
- Subsequent requests are fast (model stays loaded)
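One way to ride out the cold start is to retry while the endpoint reports that the model is still loading. The sketch below assumes a 503 status signals "model loading" and uses a hypothetical `request` callable that returns a `(status, body)` pair; `call_with_warmup` and its default timings are illustrative, not taken from the project.

```python
import time
from typing import Callable


def call_with_warmup(
    request: Callable[[], tuple[int, object]],
    max_wait: float = 90.0,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Retry while the endpoint answers 503 (assumed to mean 'model loading')."""
    waited = 0.0
    while True:
        status, body = request()
        if status != 503:
            # Any other status is handed back to the caller unchanged.
            return body
        if waited >= max_wait:
            raise TimeoutError(f"model still loading after {waited:.0f}s")
        sleep(delay)
        waited += delay
```

The `sleep` parameter is injectable so the loop can be exercised in tests without actually waiting.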
Document loading issues:
- Check /debug/status endpoint for diagnostics
- Verify files exist in data/ subdirectories
Full troubleshooting guide: See SETUP_GUIDE.md
# Install dev dependencies
pip install -r requirements.txt
# Run with auto-reload
uvicorn server:app --reload --port 8000
# Check logs for HF embedding calls
# (Set HF_LOGGING=true in .env)