A custom Claude Code Router (CCR) transformer that injects persistent memory context into every Claude Code session.
Universal by design — works with any memory service implementing a simple /context/inject endpoint. Includes setup guide for Mnemi-ai.
Every time Claude Code makes an API request, this transformer:
- Extracts the user's message
- Queries your memory service via `POST /context/inject`
- Injects the returned memory context as a system message
This gives the model awareness of facts, entities, and context from previous sessions.
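The core of these steps can be sketched as two small helpers. These are illustrative only (the names and the OpenAI-style message shape are assumptions, not the actual internals of memory-transformer.js):

```javascript
// Pull the latest user message out of an OpenAI-style message list;
// the transformer uses it as the relevance query for the memory service.
function latestUserMessage(messages) {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") return messages[i].content;
  }
  return "";
}

// Prepend the returned memory context as a system message. A null/empty
// context (e.g. the service was unavailable) leaves the request untouched.
function injectMemoryContext(messages, memoryContext) {
  if (!memoryContext) return messages;
  return [{ role: "system", content: memoryContext }, ...messages];
}
```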
```
Claude Code → CCR → Memory Transformer → OpenRouter → Model
                           ↑
                    Memory Service
                   /context/inject
```
- Claude Code Router installed
- A memory service implementing the `/context/inject` endpoint (see below)
- Node.js
```bash
mkdir -p ~/.claude-code-router/transformers
cp memory-transformer.js ~/.claude-code-router/transformers/
```

Add to your `~/.claude-code-router/config.json`:
```json
{
  "transformers": [
    {
      "path": "/path/to/memory-transformer.js"
    }
  ],
  "Providers": [{
    "name": "openrouter",
    "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
    "api_key": "sk-or-v1-YOUR-KEY",
    "models": ["qwen/qwen3.6-plus:free"],
    "transformer": { "use": ["openrouter", "memory"] }
  }],
  "Router": {
    "default": "openrouter,qwen/qwen3.6-plus:free"
  }
}
```

```bash
ccr start
ccr code
```

The transformer expects any memory service to implement:
Request:

```json
{
  "query": "user's current message (for relevance scoring)",
  "entity": "user identifier (e.g., 'doug')",
  "max_tokens": 8000
}
```

Response:

```json
{
  "context": "--- memory_context\n...\n--- end memory_context ---",
  "overflowed": false
}
```

That's it. Any service that returns this contract can be used with this transformer.
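As a sketch, the request body above can be assembled from the environment variables documented below. The variable names and defaults match this README, but the helper itself is hypothetical, not part of the transformer's public API:

```javascript
// Build the POST /context/inject body from env config, falling back to
// the documented defaults. The NaN guard covers a malformed override.
function buildInjectRequest(query, env = process.env) {
  const maxTokens = parseInt(env.MEMORY_MAX_TOKENS || "8000", 10);
  return {
    query,
    entity: env.MEMORY_ENTITY || "doug",
    max_tokens: Number.isNaN(maxTokens) ? 8000 : maxTokens,
  };
}
```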
The transformer reads from environment variables:
| Variable | Default | Description |
|---|---|---|
| `MEMORY_SERVICE_URL` | `http://localhost:8765` | Base URL of memory service |
| `MEMORY_SERVICE_ENDPOINT` | `/context/inject` | Endpoint path |
| `MEMORY_ENTITY` | `doug` | User/entity identifier |
| `MEMORY_MAX_TOKENS` | `8000` | Max tokens for context |
Example:

```bash
export MEMORY_SERVICE_URL="http://your-memory-service:8080"
export MEMORY_ENTITY="alice"
export MEMORY_MAX_TOKENS="12000"
```

Mnemi-ai is a memory system that implements the `/context/inject` contract.
```bash
git clone https://github.com/yourusername/mnemi.git
cd mnemi
pip install -r requirements.txt
python server.py &
```

Mnemi runs on `http://localhost:8765` by default.
```bash
curl http://localhost:8765/health
```

```bash
export MEMORY_SERVICE_URL="http://localhost:8765"
export MEMORY_SERVICE_ENDPOINT="/context/inject"
export MEMORY_ENTITY="doug"
export MEMORY_MAX_TOKENS="8000"
```

```bash
ccr restart
ccr code
```

Mnemi provides:
- `POST /context/inject` — Returns formatted memory context
- `GET /facts?query=&entity=&limit=` — Query stored facts
- `GET /graph/neighbors?entity=` — Entity relationships
- `GET /vec/search?query=&top_k=` — Semantic vector search
- `GET /episodes/recent?limit=` — Session summaries
See Mnemi docs for full API.
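If you query the extra endpoints directly, a small URL builder keeps the parameter names straight. This helper is hypothetical (not part of Mnemi or the transformer); only the endpoint path and parameter names come from the list above:

```javascript
// Build a Mnemi GET /facts query URL with the documented parameters.
function factsUrl(base, { query, entity, limit }) {
  const url = new URL("/facts", base);
  url.searchParams.set("query", query);
  url.searchParams.set("entity", entity);
  url.searchParams.set("limit", String(limit));
  return url.toString();
}
```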
The injected context looks like:
```
--- memory_context
[entity_graph]
user likes: coffee, coding, memory systems
[facts]
[8] user is a software engineer
[7] prefers working in the morning
[episodes]
[2026-04-04] Set up CCR with OpenRouter...
[recent_messages]
assistant: here's the memory system design
user: perfect, let's implement it
--- end memory_context ---
You have the above memory context from previous sessions. Use it to inform your responses.
```
- Non-fatal: If the memory service is unavailable, requests continue without memory context
- No errors shown: The model simply won't have memory context for that request
- Token budget: Configurable via `MEMORY_MAX_TOKENS`
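The non-fatal behavior can be sketched as a fetch wrapper that resolves to `null` on every failure mode. The helper name and the 2-second timeout are illustrative assumptions, not the transformer's actual values:

```javascript
// Query the memory service; any failure (service down, timeout, non-2xx,
// bad JSON) resolves to null so the model request proceeds without memory.
async function fetchMemoryContext(query, env = process.env) {
  const base = env.MEMORY_SERVICE_URL || "http://localhost:8765";
  const path = env.MEMORY_SERVICE_ENDPOINT || "/context/inject";
  try {
    const res = await fetch(base + path, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        query,
        entity: env.MEMORY_ENTITY || "doug",
        max_tokens: parseInt(env.MEMORY_MAX_TOKENS || "8000", 10),
      }),
      signal: AbortSignal.timeout(2000), // don't stall the model request
    });
    if (!res.ok) return null;
    const data = await res.json();
    return data.context || null;
  } catch {
    return null; // unreachable service is silently ignored
  }
}
```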
```
memory-transformer.js   # The CCR transformer (universal)
.env.example            # Environment variable template
README.md               # This file
```
Implement the contract in any language:
```python
@app.post("/context/inject")
def inject_context(request: ContextInjectRequest):
    # Query your memory store (SQLite, ChromaDB, etc.)
    facts = search_facts(request.query, request.entity)
    graph = get_entity_neighbors(request.entity)
    episodes = get_recent_sessions(request.entity)
    context = format_context(facts, graph, episodes)
    return {"context": context, "overflowed": False}
```

Return a formatted string under the `"context"` key.
MIT