Advanced Topics
For advanced use cases, customisation, and optimisation of Fold.
A-MEM automatically suggests and creates links between related memories. You can tune how aggressive this linking is.
# How many neighbours to consider for linking
A_MEM_NEIGHBOUR_COUNT=5 # Default: 5
# Minimum confidence score for auto-linking
A_MEM_MIN_CONFIDENCE=0.75 # Default: 0.75 (0-1)
# Whether to auto-create suggested links
A_MEM_AUTO_ACCEPT_LINKS=false # Default: false (manual review)

When a new memory is created:
- Find the A_MEM_NEIGHBOUR_COUNT nearest neighbours via vector similarity
- Ask the LLM: "Should we link these memories?"
- The LLM returns suggestions with confidence scores
- If A_MEM_AUTO_ACCEPT_LINKS=true, create links automatically
- Otherwise, store them as pending suggestions for manual approval
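The decision flow above can be sketched as follows. This is an illustrative model only: the function name and the `LinkSuggestion` shape are assumptions, not Fold's actual internals.

```python
# Illustrative sketch of the A-MEM linking decision flow.
from dataclasses import dataclass

@dataclass
class LinkSuggestion:
    source_id: str
    target_id: str
    confidence: float  # 0-1 score returned by the LLM

def process_new_memory(memory_id, neighbours, llm_scores,
                       min_confidence=0.75, auto_accept=False):
    """Decide which neighbour links to create now vs. queue for review."""
    created, pending = [], []
    for target_id in neighbours:
        confidence = llm_scores.get(target_id, 0.0)
        if confidence < min_confidence:
            continue  # below A_MEM_MIN_CONFIDENCE: drop the suggestion
        suggestion = LinkSuggestion(memory_id, target_id, confidence)
        # A_MEM_AUTO_ACCEPT_LINKS decides create-now vs. manual review
        (created if auto_accept else pending).append(suggestion)
    return created, pending

created, pending = process_new_memory(
    "mem_1", ["mem_2", "mem_3"], {"mem_2": 0.9, "mem_3": 0.5})
# With the defaults, mem_2 (0.9) is queued for review; mem_3 (0.5) is dropped.
```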
Aggressive linking (startups, fast-moving projects):
A_MEM_NEIGHBOUR_COUNT=10
A_MEM_MIN_CONFIDENCE=0.60
A_MEM_AUTO_ACCEPT_LINKS=true

Conservative linking (stable projects, mature codebases):
A_MEM_NEIGHBOUR_COUNT=3
A_MEM_MIN_CONFIDENCE=0.85
A_MEM_AUTO_ACCEPT_LINKS=false

Fine-tune how recent and frequently-accessed memories are prioritised.
# How long until memory strength halves (days)
DECAY_HALF_LIFE_DAYS=30
# Blend factor: 0=pure semantic, 1=pure strength
DECAY_STRENGTH_WEIGHT=0.3

strength = recency × access_boost
recency = exp(-age / half_life)
access_boost = log(retrieval_count + 1)
combined_score = (1 - weight) × relevance + weight × strength
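The formulas above can be worked through directly; this is just the math as given, not Fold's implementation:

```python
# Worked example of the decay scoring formulas above.
import math

def combined_score(relevance, age_days, retrieval_count,
                   half_life_days=30, weight=0.3):
    recency = math.exp(-age_days / half_life_days)
    access_boost = math.log(retrieval_count + 1)
    strength = recency * access_boost
    return (1 - weight) * relevance + weight * strength

# A brand-new, never-retrieved memory has strength 0 (log(0 + 1) = 0),
# so its score reduces to (1 - weight) * relevance.
score = combined_score(relevance=0.8, age_days=0, retrieval_count=0)
print(score)  # ≈ 0.56
```

Note that with equal retrieval counts, an older memory always scores lower: the `recency` term shrinks exponentially with age.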
Per-project configuration in fold/project.toml:
[decay]
half_life_days = 30 # How quickly memories fade
strength_weight = 0.3 # How much decay affects ranking

Examples:
Fast-moving projects (recent context matters):
[decay]
half_life_days = 7 # Fade in a week
strength_weight = 0.5 # 50% weight to recency

Reference projects (decisions are timeless):
[decay]
half_life_days = 365 # Fade in a year
strength_weight = 0.1 # Only 10% weight to recency

Fold supports multiple LLM providers with automatic fallback.
GOOGLE_API_KEY=... # Try Gemini first
ANTHROPIC_API_KEY=... # Fallback to Claude
OPENAI_API_KEY=... # Fallback to OpenAI
OPENROUTER_API_KEY=... # Last resort

Fold tries providers in order. If Gemini times out, it tries Claude. If Claude is rate-limited, it tries OpenAI.
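The try-in-order behaviour amounts to a simple fallback loop. A minimal sketch, with stand-in provider functions (not Fold's real provider API):

```python
# Sketch of ordered provider fallback: first success wins.
def flaky_gemini(prompt):
    raise TimeoutError("Gemini timed out")

def working_claude(prompt):
    return f"claude: {prompt}"

def complete_with_fallback(prompt, providers):
    """Try each configured provider in order; return the first success."""
    errors = {}
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # timeout, rate limit, auth error, ...
            errors[name] = exc    # record and fall through to the next one
    raise RuntimeError(f"all providers failed: {errors}")

providers = [("gemini", flaky_gemini), ("claude", working_claude)]
print(complete_with_fallback("hello", providers))  # → claude: hello
```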
Minimise costs (Gemini free tier):
GOOGLE_API_KEY=... # Only set this
# Don't set others

Automatic failover (redundancy):
GOOGLE_API_KEY=... # Primary (cheapest)
OPENROUTER_API_KEY=... # Fallback (more expensive)

Load balancing (OpenRouter):
OPENROUTER_API_KEY=... # Single endpoint, multiple models

# Model parameters (per-provider)
# Set via environment or code configuration
LLM_TEMPERATURE=0.7 # Creativity (0=deterministic, 1=creative)
LLM_MAX_TOKENS=1024 # Max output length

Configure which embedding model to use for semantic search.
# Gemini embeddings (fast, 768 dimensions)
EMBEDDING_PROVIDER=gemini
EMBEDDING_MODEL=gemini-embedding-001
# OpenAI embeddings (high quality, 1536 dimensions)
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small

Larger dimensions = more expressive but slower:
| Dimensions | Speed | Quality | Use Case |
|---|---|---|---|
| 384 | Fast | Good | Code and technical docs |
| 768 | Medium | Very Good | General purpose |
| 1536 | Slower | Excellent | Complex semantic understanding |
Memories are stored in fold/ as markdown files with git-native storage.
fold/
├── a/b/hash1.md # Hash-based storage
├── a/c/hash2.md
├── 9/a/hash3.md
└── project.toml # Per-project config
Why hash-based:
- Repo path determines identity (SHA256 first 16 chars)
- Stable identity across content changes
- Deterministic paths
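The path scheme above can be sketched as follows. The exact string Fold hashes is an assumption here; what the docs state is SHA256, a 16-character identity, and sharded directories:

```python
# Sketch of the hash-based storage layout: SHA256 of the repo path,
# first 16 hex chars as the identity, first two chars as directory shards.
import hashlib

def memory_path(repo_path: str) -> str:
    digest = hashlib.sha256(repo_path.encode("utf-8")).hexdigest()[:16]
    return f"fold/{digest[0]}/{digest[1]}/{digest}.md"

print(memory_path("src/lib.rs"))
# Same input always yields the same path, so identity is deterministic
# and survives content changes to the memory itself.
```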
Create fold/project.toml:
[project]
id = "proj_abc123"
slug = "my-app"
name = "My Application"
[indexing]
# Include patterns (customise as needed)
# See Configuration docs for full list of 50+ supported file types
include = [
"**/*.ts", "**/*.tsx", "**/*.js", "**/*.jsx",
"**/*.py", "**/*.rs", "**/*.go", "**/*.java",
"**/*.cs", "**/*.kt", "**/*.swift", "**/*.rb",
"**/*.md", "**/*.json", "**/*.yaml", "**/*.txt"
]
# Exclude patterns
exclude = [
"node_modules/**",
"dist/**",
"*.test.ts",
"*.spec.ts"
]
# Skip large files (KB)
max_file_size = 100
[embedding]
provider = "gemini"
model = "gemini-embedding-001"
dimension = 768
[decay]
half_life_days = 30
strength_weight = 0.3

If your SQLite database becomes corrupted, rebuild it from fold/:
# Fold will auto-detect and rebuild on startup
# Or manually trigger:
curl -X POST http://localhost:8765/api/projects/my-app/index/rebuild \
-H "Authorization: Bearer $TOKEN"

Background jobs (indexing, embedding generation) are queued and processed asynchronously.
# Max concurrent jobs
JOB_WORKER_THREADS=4
# Job timeout (seconds)
JOB_TIMEOUT=300 # 5 minutes
# Max retry attempts
JOB_MAX_RETRIES=3
# Retry backoff (exponential)
JOB_INITIAL_BACKOFF=5 # Start with 5 seconds
JOB_MAX_BACKOFF=600 # Cap at 10 minutes

# Check job queue status
curl http://localhost:8765/status/jobs

# Get job details
curl http://localhost:8765/status/jobs/{job_id}

| Job Type | Purpose | Typical Duration |
|---|---|---|
| index_repo | Index files from push | 5-60 seconds |
| reindex_repo | Full reindex | 1-10 minutes |
| process_webhook | Handle webhook events | <1 second |
| generate_embedding | Create vector embedding | 1-5 seconds |
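The retry settings above imply a backoff schedule like the following. The doubling factor is an assumption (the docs say only "exponential"); the initial delay and cap come from `JOB_INITIAL_BACKOFF` and `JOB_MAX_BACKOFF`:

```python
# Exponential backoff schedule: start at `initial` seconds, double each
# retry, never exceed `cap` seconds.
def backoff_schedule(initial=5, cap=600, retries=3):
    delays, delay = [], initial
    for _ in range(retries):
        delays.append(min(delay, cap))
        delay *= 2
    return delays

print(backoff_schedule())           # [5, 10, 20]
print(backoff_schedule(retries=9))  # hits the 600 s cap after 7 doublings
```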
Control API usage and prevent abuse.
# Requests per minute per token
RATE_LIMIT_REQUESTS=60
# Rate limit window (seconds)
RATE_LIMIT_WINDOW=60
# Burst allowance
RATE_LIMIT_BURST=10

When rate limited, responses include these headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1643817000
Wait the number of seconds given in the Retry-After header before retrying.
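Client-side, that retry logic looks like this. A minimal sketch assuming a standard 429 response; `fetch` is a stand-in for your actual HTTP call:

```python
# Honour the Retry-After header when the API returns 429.
import time

def request_with_retry(fetch, max_attempts=3):
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status != 429:
            return body
        # Sleep for the server-specified delay before the next attempt
        time.sleep(int(headers.get("Retry-After", "1")))
    raise RuntimeError("still rate limited after retries")

responses = iter([
    (429, {"Retry-After": "0"}, None),  # first call is rate limited
    (200, {}, "ok"),                    # second call succeeds
])
print(request_with_retry(lambda: next(responses)))  # → ok
```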
Control log verbosity:
# Detailed logs
RUST_LOG=fold=debug,tower_http=debug
# Production (minimal)
RUST_LOG=fold=info
# Specific modules
RUST_LOG=fold::services::memory=debug,fold::api=info

Expose Prometheus metrics:
# Metrics endpoint
GET http://localhost:8765/metrics

Key metrics:
- fold_memories_indexed_total - Total memories created
- fold_search_latency_seconds - Search response time
- fold_embeddings_generated_total - Embeddings created
- fold_jobs_processed_total - Background jobs completed
# Basic health check
curl http://localhost:8765/health
# Detailed status
curl http://localhost:8765/status/jobs

Rotate LLM API keys regularly:
# Update .env or environment
GOOGLE_API_KEY=new-key-here
# Restart Fold (no downtime with load balancer)
docker restart fold

# API tokens stored as hashed values
# Only shown once at creation
# Rotate tokens regularly
# Settings → API Tokens → Delete old tokens
# Use short-lived tokens where possible

# Daily backup (encrypted)
0 2 * * * tar czf - /var/lib/fold | \
openssl enc -aes-256-cbc > /backups/fold-$(date +%Y%m%d).tar.gz.enc
# Test restore monthly

Enable caching for frequently-accessed memories:
# Redis cache (optional)
REDIS_URL=redis://localhost:6379
# Cache TTL
CACHE_TTL_SECONDS=3600

Index multiple files at once:
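What the cache buys you is the get-or-compute pattern: serve a hot memory from the cache, and only do the expensive fetch on a miss or after the TTL expires. Sketched here with a plain dict standing in for Redis (in production this would be Redis GET/SETEX):

```python
# Get-or-compute cache with TTL bookkeeping; dict as a Redis stand-in.
import time

class TTLCache:
    def __init__(self, ttl_seconds=3600):
        self.ttl, self.store = ttl_seconds, {}

    def get_or_compute(self, key, compute):
        entry = self.store.get(key)
        if entry and time.monotonic() - entry[1] < self.ttl:
            return entry[0]                       # cache hit
        value = compute()                         # miss: do the expensive work
        self.store[key] = (value, time.monotonic())
        return value

cache = TTLCache(ttl_seconds=3600)
calls = []
fetch = lambda: calls.append(1) or "memory-body"
cache.get_or_compute("mem_1", fetch)
cache.get_or_compute("mem_1", fetch)  # second call served from cache
print(len(calls))  # → 1
```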
# Batch indexing reduces overhead
POST /api/projects/slug/memories/batch
{
"memories": [
{ "type": "codebase", "content": "...", "file_path": "..." },
{ "type": "codebase", "content": "...", "file_path": "..." }
]
}

Configure Qdrant for performance:
# In docker-compose.yml
qdrant:
environment:
QDRANT_HNSW_EF_CONSTRUCT: 400 # Build index quality
QDRANT_HNSW_M: 16 # Connections per node
QDRANT_HNSW_EF_SEARCH: 100 # Search quality

Higher values = better quality but slower indexing.
Check A-MEM is enabled:
# Verify A_MEM configuration
echo $A_MEM_AUTO_ACCEPT_LINKS
# Check pending suggestions
curl http://localhost:8765/api/projects/slug/memories/{id}/suggested-links

Verify decay is enabled:
# Check decay weight
echo $DECAY_STRENGTH_WEIGHT
# Search with decay disabled to compare
POST /api/projects/slug/search
{
"query": "...",
"include_decay": false
}

Check the LLM provider:
# Verify provider is responding
curl "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_API_KEY"
# Check job queue for stuck jobs
curl http://localhost:8765/status/jobs?status=processing