Local RAG (Retrieval-Augmented Generation) system for the OpenClaw AI agent platform. Provides semantic search over transcripts, digests, world-scans, and state files using a locally hosted embedding model and vector database.
Zero external API calls. Zero cost. Full privacy.
- BGE-M3 embedding model (BAAI/bge-m3) — 1024-dim, multilingual, SOTA quality, runs on CPU
- Qdrant vector database — standalone binary, no Docker required
- Section-aware chunking — custom strategies per document type (not naive fixed-size)
- FastAPI service with query, context injection, and ingest endpoints
- File watcher — automatic re-indexing when files change
- OpenClaw integration — bootstrap context + CLI tool for agents
- Systemd services — auto-start on boot, auto-restart on failure
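To illustrate what "section-aware" means in contrast to fixed-size chunking, here is a minimal sketch that splits a Markdown document at its headings and only subdivides a section (on paragraph boundaries) when it grows past a size limit. The names `Chunk`, `chunk_by_section`, and `max_chars` are hypothetical and chosen for illustration; the strategies actually used live in `service/chunker.py` and may differ.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str  # heading of the section the text came from ("" if none)
    text: str


HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_section(markdown: str, max_chars: int = 2000) -> list[Chunk]:
    """Split at Markdown headings; split oversized sections between paragraphs.

    Illustrative only: a single paragraph longer than max_chars is kept whole.
    """
    # Pass 1: group lines into (heading, body) sections.
    sections: list[tuple[str, str]] = []
    heading, lines = "", []
    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            if any(l.strip() for l in lines):
                sections.append((heading, "\n".join(lines).strip()))
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    if any(l.strip() for l in lines):
        sections.append((heading, "\n".join(lines).strip()))

    # Pass 2: pack paragraphs of each section into chunks up to max_chars.
    chunks: list[Chunk] = []
    for heading, body in sections:
        current = ""
        for para in re.split(r"\n\s*\n", body):
            candidate = f"{current}\n\n{para}" if current else para
            if current and len(candidate) > max_chars:
                chunks.append(Chunk(heading, current))
                current = para
            else:
                current = candidate
        if current:
            chunks.append(Chunk(heading, current))
    return chunks
```

Keeping the heading alongside each chunk means a retrieved passage still carries the context of the section it came from, which a fixed-size window would lose.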
- Linux (tested on Ubuntu 24.04)
- Python 3.12+
- 8 GB RAM (4 GB swap recommended)
- 4 GB free disk (model + venv + index)
- No GPU required
git clone https://github.com/danielis1975/openclaw-rag.git
cd openclaw-rag
sudo bash install.sh

The installer will:
- Create 4 GB swap (if not present)
- Download and start Qdrant
- Create Python venv with CPU-only PyTorch (no CUDA bloat)
- Download BGE-M3 model (~2.2 GB)
- Start RAG service on localhost:18790
- Install the `rag-query` CLI tool
- Set up auto-refresh cron job
- Configure UFW firewall rules
# Search transcripts and digests from last 3 days
rag-query "Daniel psychological signals" --type transcript,digest_person --days 3
# Search world-scan signals
rag-query "geopolitical risks health AI governance" --type world_scan --days 7
# Search neuroplasticity data (no date filter)
rag-query "active pathways mutation candidates" --type neuroplasticity --days 0
# Get raw JSON instead of formatted context
rag-query "search query" --format json

# Health check
curl http://localhost:18790/health
# Index stats
curl http://localhost:18790/stats
# Query
curl -X POST http://localhost:18790/query \
-H "Content-Type: application/json" \
-d '{"query": "search text", "top_k": 10, "max_tokens": 4000}'
# Get formatted context for prompt injection
curl -X POST http://localhost:18790/context \
-H "Content-Type: application/json" \
-d '{"query": "search text", "filters": {"doc_type": ["transcript"]}}'
# Force reindex
curl -X POST http://localhost:18790/reindex

source /root/.openclaw/rag/venv/bin/activate
cd /root/.openclaw/rag/service
# Full reindex
python3 cli.py reindex --force
# Reindex specific type
python3 cli.py reindex --type transcripts
# Status
python3 cli.py status
# Query
python3 cli.py query "search text" --days 7 --top-k 5

| Type | Description | Chunking Strategy |
|---|---|---|
| `transcript` | Daily transcript mirrors | Section-aware, merge related sections |
| `digest_person` | 3-day person digests | Section-aware + semantic grouping |
| `digest_system` | 3-day system digests | Section-aware + semantic grouping |
| `world_scan` | Daily world-scan signals | Per-bullet chunking |
| `human_intel` | Human-intelligence intake | Section-aware |
| `neuroplasticity` | Pathway catalog/protocol | Table-row + section chunking |
| `state` | General state files | Section-aware or whole-file |

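
As an example of a per-type strategy, per-bullet chunking for world-scan files could look roughly like the sketch below: each top-level bullet becomes one chunk, and indented continuation lines are folded into the bullet they belong to. The function name `chunk_by_bullet` and the exact bullet syntax it accepts are assumptions for illustration, not the code in `service/chunker.py`.

```python
import re

# Matches "-", "*" or "+" bullets with optional leading whitespace.
BULLET_RE = re.compile(r"^\s*[-*+]\s+(.*)$")


def chunk_by_bullet(text: str) -> list[str]:
    """Return one chunk per bullet; non-bullet lines outside a bullet are skipped."""
    chunks: list[str] = []
    current = None  # text of the bullet being assembled, if any
    for line in text.splitlines():
        match = BULLET_RE.match(line)
        if match:
            # A new bullet closes the previous one.
            if current is not None:
                chunks.append(current)
            current = match.group(1).strip()
        elif current is not None and line[:1] in (" ", "\t") and line.strip():
            # Indented, non-empty line: continuation of the current bullet.
            current += " " + line.strip()
        elif current is not None:
            # Blank or unindented line ends the current bullet.
            chunks.append(current)
            current = None
    if current is not None:
        chunks.append(current)
    return chunks
```

One chunk per bullet suits signal lists because each bullet is a self-contained statement, so a query retrieves individual signals rather than a whole day's file.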
| Operation | Time |
|---|---|
| Embed 1 chunk | ~130 ms |
| Query (embed + search) | ~180 ms |
| Full reindex (~800 chunks) | ~15 min |
| File watcher re-index | ~2-5 s per file |
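
To check the query latency on your own hardware, a small client along these lines may help. It posts to the `/query` endpoint with the same JSON fields as the curl example and measures the round trip. The helper names `build_query_request` and `timed_query` are hypothetical, and the sketch assumes the endpoint returns a JSON body; the response schema is not documented here, so it is returned unparsed beyond `json.loads`.

```python
import json
import time
import urllib.request

RAG_URL = "http://localhost:18790"  # service address used throughout this README


def build_query_request(query: str, top_k: int = 10, max_tokens: int = 4000):
    """Build a POST request for /query with the fields shown in the curl example."""
    payload = json.dumps(
        {"query": query, "top_k": top_k, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{RAG_URL}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def timed_query(query: str, **kwargs):
    """Send the query and return (parsed response, elapsed milliseconds).

    Assumes the service is running and replies with JSON.
    """
    request = build_query_request(query, **kwargs)
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.loads(response.read().decode("utf-8"))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return body, elapsed_ms


# Example (requires the RAG service to be up):
#   _, ms = timed_query("search text", top_k=5)
#   print(f"query took {ms:.0f} ms")
```

The measured time includes HTTP overhead on top of embedding and vector search, so expect it to be slightly higher than the service-side figure.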
/root/.openclaw/rag/
├── service/ # Python RAG service
│ ├── main.py # FastAPI app
│ ├── ingest.py # Ingest pipeline
│ ├── chunker.py # Chunking strategies
│ ├── embedder.py # BGE-M3 wrapper
│ ├── retriever.py # Query engine
│ ├── watcher.py # File watcher
│ ├── config.py # Configuration
│ └── cli.py # CLI tool
├── qdrant/
│ ├── bin/qdrant # Qdrant binary
│ ├── storage/ # Vector data
│ └── config.yaml # Qdrant config
├── venv/ # Python virtual env
├── logs/ # Service + refresh logs
├── .env # Qdrant API key
└── rag-refresh-context.sh # Auto-refresh script
See docs/OPENCLAW_INTEGRATION.md for:
- Bootstrap context setup (passive — auto-injected into sessions)
- Agent TOOLS.md instructions (active — agent queries on demand)
- Cron job configuration
- API endpoint reference
See docs/ARCHITECTURE.md for component diagram and data flow.
Edit service/config.py to customize:
- Watched paths and directories
- Document type detection
- Chunking parameters (max tokens, overlap)
- Qdrant connection settings
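
As a rough idea of what document type detection can look like, the sketch below maps file paths to the type names from the table of document types using glob patterns. Everything here is an assumption for illustration: the pattern list, the directory names, the `detect_doc_type` function, and the fallback to `state` are hypothetical, and the actual rules in `service/config.py` may be structured differently.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns; the directory layout is assumed, not taken
# from the real configuration. First match wins.
DOC_TYPE_PATTERNS = [
    ("*/transcripts/*.md", "transcript"),
    ("*/digests/person/*.md", "digest_person"),
    ("*/digests/system/*.md", "digest_system"),
    ("*/world-scan/*.md", "world_scan"),
]
DEFAULT_DOC_TYPE = "state"  # assumed fallback for files matching no pattern


def detect_doc_type(path: str) -> str:
    """Return the document type for a file path, based on glob patterns."""
    for pattern, doc_type in DOC_TYPE_PATTERNS:
        if fnmatchcase(path, pattern):
            return doc_type
    return DEFAULT_DOC_TYPE
```

Ordering the patterns from most to least specific keeps the lookup predictable when a path could match more than one rule.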
# Check services
systemctl status qdrant rag-service
# View logs
journalctl -u rag-service -f
journalctl -u qdrant -f
tail -f /root/.openclaw/rag/logs/service.log
# Restart
systemctl restart qdrant rag-service
# Manual reindex
cd /root/.openclaw/rag/service
/root/.openclaw/rag/venv/bin/python3 cli.py reindex --force

License: MIT