Neboy72/openclaw-nexus-memory

OpenClaw Nexus Memory 🧠

One prompt — your bot never forgets again.

You talk to your bot. Next day it asks you the same thing. Nexus Memory fixes that. Permanently.

Your agent remembers facts, decisions, and patterns across sessions. Hybrid retrieval (BM25 + Vector) makes RAG poisoning much harder. Drift detection flags stale memories automatically. No nonsense. No bloat. Just memory that works.

Python 3.11+ · Qdrant v1.17+

Bot Self-Install: Tell your AI: "Read AGENTS.md and install Nexus Memory." It does the rest.

No API keys needed. Local by default — cloud if you want.


⭐️ If this project helps your agent remember — drop a star so others find it too. Takes 2 seconds.


Architecture


Quick Start

🤖 Tell your agent to install it

Send this prompt to your OpenClaw agent:

Read https://raw.githubusercontent.com/Neboy72/openclaw-nexus-memory/main/AGENTS.md and follow the installation instructions.

🛠️ Install as Python package

# Create a virtual environment
python3 -m venv ~/.openclaw/nexus-venv
source ~/.openclaw/nexus-venv/bin/activate

# Install with all dependencies (includes Qdrant client, BM25, analytics)
pip install -e "git+https://github.com/Neboy72/openclaw-nexus-memory.git#egg=openclaw-nexus-memory[all]"

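# Or install base only (health checks, patterns; no Qdrant needed)
# Editable VCS installs need the #egg= fragment so pip can detect the package name
pip install -e "git+https://github.com/Neboy72/openclaw-nexus-memory.git#egg=openclaw-nexus-memory"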

🚀 One-liner install

curl -sL https://raw.githubusercontent.com/Neboy72/openclaw-nexus-memory/main/install.sh | bash

First steps

# Health check (works without Qdrant)
python3 scripts/openclaw_nexus_health.py

# Run auto-discovery (requires Qdrant)
openclaw-nexus-discover

# Graph analytics report (requires Qdrant)
openclaw-nexus-graph-report

# Export memories
nexus-export

What's New

v2.1.0 — Auto-Discovery + Graph Analytics 🔄

Feature What it does Why it matters
🔄 Auto-Discovery AutoDiscovery.discover_all() — scans all canonical facts, finds similarity via Qdrant vector search (O(n·k)), classifies relations heuristically (wikilinks, "see also"/"cf.", keyword overlap), deduplicates against SQLite, stores as active (≥0.85) or proposed (<0.85) Zero-token relation discovery. No LLM, no new dependencies. Facts connect themselves — no manual edges needed.
📊 Graph Analytics GraphAnalytics.full_report() — Hub scores (most-connected facts), isolation scores, knowledge gaps, connected components (weakly), relation distribution Understand your knowledge graph. See at a glance which facts are isolated (= knowledge gaps) and which are most connected.
🧲 Hybrid Retrieval BM25 + Vector + Reciprocal Rank Fusion + Source-Tier Boosting + Graph Boost Anti-RAG-poisoning. Keyword precision + semantic search combined. Trusted sources rank higher. Connected facts rank higher.
🔗 Edge Lifecycle EdgeRelation (6 types: SUPERSEDES, CONTRADICTS, SUPPORTS, ALTERNATIVE_TO, DEPENDS_ON, REFERENCES) × EdgeStatus (4 states: PROPOSED, ACTIVE, DEPRECATED, REJECTED) Safe auto-discovered relations. Proposed edges don't affect retrieval until promoted. Full audit trail.
🛡️ Graph Boost HybridRetriever.search_hybrid(graph_boost=True) — boosts results by 1.0 + degree * 0.05 based on graph connectivity Connected facts rank higher. A fact with 10 edges gets 1.5× boost, an isolated one stays at 1.0×.
🤖 Embedding Auto-Detection Reads memorySearch.provider from OpenClaw config (openclaw.json). Default: sentence-transformers (local, no API key). Supports Voyage and Ollama. Zero config needed. Works with whatever embedding provider you already use.
📦 CLI Commands openclaw-nexus-discover (auto-discovery), openclaw-nexus-graph-report (analytics), nexus-export (skill export) Run discovery and analytics from terminal.
🏗️ Package Structure Migrated from scripts-only to proper nexus/ Python package with pyproject.toml pip install instead of copying scripts. Proper imports, proper CLI entry points.
🐛 Bugfix index_memories() handles dict content gracefully (was crashing on non-string Qdrant payloads) No more crashes on dict content from Qdrant entries.
🧪 219 Unit Tests Full coverage for discovery, analytics, hybrid retrieval, graph, edges, provenance All green on Python 3.12.

v2.0.0 — SkillGraph: Edge Store + Query Layer 🔗

Feature What it does Why it matters
🔗 SkillGraph Edge Store SQLite-backed directed graph — add_edge() with 6 relation types, 4 lifecycle statuses, partial unique index WHERE status='active' Store WHY facts relate, not just WHAT they are. No duplicate active edges. Full audit trail.
🕸️ Graph Query Layer NetworkX in-memory cache — BFS with depth-limit, DFS chain detection, symmetric contradicts edges auto-inserted. Incremental updates. Query relationships in milliseconds. Multi-hop paths for reasoning chains.
🏛️ Schema-First Design EdgeRelation enum + EdgeStatus enum. SQLite = Source of Truth, NetworkX = readonly cache. Data integrity before performance. Everything validates against the schema before touching the store.
Earlier releases (v1.0.0 – v1.9.0)

v1.9.0 — Skill Export 🎯

Feature What it does Why it matters
🎯 Skill Export export_skill() searches canonical facts by topic → clusters into Steps/Pitfalls/Prerequisites/Verification → generates complete SKILL.md Turn learned facts into reusable agent skills. No more manual SKILL.md editing.
🖥️ nexus-export CLI --list shows exportable topics, --skill name exports as .md, --deploy writes directly to skills directory One command from Nexus to deployable skill.
🔍 Legacy-Compat Handles both dict content payloads and legacy format. Paginated scroll scans. Works on existing databases without migration.

v1.8.0 — Fact Lifecycle Model 🧬

Feature What it does Why it matters
🧬 Fact Lifecycle Model Append-only state machine: `pending → canonical deprecated
🏗️ Staging Area create_pending() → review → promote() to canonical. list_pending() shows update-drafts. Stage changes before they go live. Review queue for manual approval.
🔄 Rollback rollback() creates a rolled_back marker + restores previous canonical. Both stay in history. Undo mistakes without losing evidence.
🛡️ Concurrency Guard promote() verifies pending.supersedes == current_canonical.version_id before writes. Stale pendings raise ValueError. Prevents lost-update / fork scenarios.
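
The concurrency guard can be illustrated with a small sketch. It is not the Nexus implementation: the `FactVersion` dataclass, its field layout, and this standalone `promote()` function are hypothetical stand-ins. Only the check itself (`pending.supersedes` must equal the current canonical's `version_id`, otherwise `ValueError`) comes from the description above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FactVersion:
    # Hypothetical record layout, for illustration only.
    version_id: str
    content: str
    state: str                        # "pending" or "canonical" in this sketch
    supersedes: Optional[str] = None  # version_id this draft was based on


def promote(pending: FactVersion, current_canonical: Optional[FactVersion]) -> FactVersion:
    """Promote a pending draft only if it was based on the current canonical."""
    expected = current_canonical.version_id if current_canonical else None
    if pending.supersedes != expected:
        raise ValueError(
            f"stale pending {pending.version_id}: based on {pending.supersedes}, "
            f"current canonical is {expected}"
        )
    # Return a new record instead of mutating, in the spirit of append-only history.
    return replace(pending, state="canonical")


canonical_v2 = FactVersion("v2", "Kimi K2.6 is the fallback", "canonical", supersedes="v1")
fresh = FactVersion("v3", "Kimi K2.6 is the primary", "pending", supersedes="v2")
stale = FactVersion("v3b", "DeepSeek V4 is the fallback", "pending", supersedes="v1")

promoted = promote(fresh, canonical_v2)
try:
    promote(stale, canonical_v2)
    stale_rejected = False
except ValueError:
    stale_rejected = True
```

The stale draft was written against `v1`, so promoting it would silently discard whatever `v2` changed. Rejecting it forces a re-review against the current canonical.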

v1.7.2 — Hybrid Search CLI

Feature What it does Why it matters
🔍 Hybrid Search — search_hybrid() BM25 + Vector + RRF + Tier-Boost — auto-detection from config. Falls back to BM25-only. Anti-RAG-poisoning: keyword precision + semantic search combined. Source-tier boost pushes trusted sources higher.
🎯 3 Embedding Providers voyage (cloud, 1024d), sentence-transformers (local, 384d), ollama (local, 768d) Choose what fits your setup: cloud quality or local privacy.

v1.7.1 — Provenance + Orphan Check

Feature What it does Why it matters
🔍 Provenance Scan scan_provenance() scrolls all entries and reports source types, confidence distribution, criticality markers Full transparency on where every memory comes from.
🔗 Wikilink Orphan Detection find_wikilink_orphans() finds [[wikilinks]] that don't resolve anywhere Ensures your wiki has no dead links.

v1.7.0 — Memory Expiry + Enrichment

Feature What it does Why it matters
📅 Memory Expiry 3 policies: static (never), normal (90d), volatile (7d). DriftDetector flags expired entries. valid_until overrides policy. Stale configs, old paths and dead API keys finally get caught.
📊 Tiered Enrichment 3 tiers: T1 (store), T2 (+keywords), T3 (+linking). Hybrid decision: caller override → importance → category → content heuristics → T1 default Low-value logs stay lean, critical facts get full enrichment.
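
A minimal sketch of the three expiry policies, assuming each memory carries a creation timestamp. The function name `is_expired` and its parameters are illustrative assumptions, not the `DriftDetector` API. The TTL values and the `valid_until` override come from the table above.

```python
from datetime import datetime, timedelta, timezone

# TTLs from the policy list above; None means "never expires".
POLICY_TTL = {
    "static": None,
    "normal": timedelta(days=90),
    "volatile": timedelta(days=7),
}


def is_expired(created_at, policy="normal", valid_until=None, now=None):
    """Return True if a memory should be flagged as expired."""
    now = now or datetime.now(timezone.utc)
    if valid_until is not None:
        # An explicit valid_until overrides whatever the policy says.
        return now > valid_until
    ttl = POLICY_TTL[policy]
    return ttl is not None and now - created_at > ttl


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
now = datetime(2026, 1, 10, tzinfo=timezone.utc)  # nine days later

volatile_expired = is_expired(created, "volatile", now=now)
normal_expired = is_expired(created, "normal", now=now)
override_expired = is_expired(
    created, "static", valid_until=datetime(2026, 1, 5, tzinfo=timezone.utc), now=now
)
```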

v1.6.0 — Grounding Scoring

Feature What it does Why it matters
RAG Grounding Scoring 5-signal evaluation: similarity, dominance, grounding, factual overlap, coverage Know how reliable an answer really is — 🟢 Very High → ⛔ Very Low
🎯 3-Provider Embedding Grounding Scorer uses the same provider as your system (voyage/sentence-transformers/ollama) No dimension mismatch — works with any configuration.
🧹 Retrieval Filter search_vector() filters on type: memory — no session turns in results Clean results, only facts, no chat history noise.

v1.5.0 — Authority Chain

Feature What it does Why it matters
⚖️ Authority Chain 6-level priority: Direct Instruction > Fixed Rules > Recent Decisions > Sourced Memory > Semantic Search > Summarized Resolves conflicting facts automatically — knows which one to trust.
🔍 resolve_authority(facts) Picks the highest-authority fact from a list One call, no manual ranking.
⚖️ nexus_resolve_conflict(facts) Returns winner + runner-up + reasoning Full transparency on tiebreak decisions.
Timestamp Tiebreaker Among equal authorities, newer fact wins "Direct instruction now" beats "policy from yesterday".
🤖 Auto-Detection Reads authority level from category, source, and content Zero config — works out of the box.
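
The priority order and the timestamp tiebreaker can be sketched as follows. This is an illustration under assumptions, not the code behind `resolve_authority()`: the level identifiers, the dict keys `authority` and `timestamp`, and the helper name `pick_highest_authority` are all hypothetical.

```python
# The six levels from the table above, highest authority first.
# The identifier spellings are assumptions for this sketch.
AUTHORITY_LEVELS = [
    "direct_instruction",
    "fixed_rule",
    "recent_decision",
    "sourced_memory",
    "semantic_search",
    "summarized",
]
AUTHORITY_RANK = {level: rank for rank, level in enumerate(AUTHORITY_LEVELS)}


def pick_highest_authority(facts):
    """Return the fact with the best authority level; the newer timestamp wins ties."""
    return min(
        facts,
        key=lambda fact: (AUTHORITY_RANK[fact["authority"]], -fact["timestamp"]),
    )


facts = [
    {"text": "Use model-a", "authority": "sourced_memory", "timestamp": 300},
    {"text": "Use model-b", "authority": "fixed_rule", "timestamp": 100},
    {"text": "Use model-c", "authority": "fixed_rule", "timestamp": 200},
]
winner = pick_highest_authority(facts)
```

Both `fixed_rule` facts outrank the sourced memory even though it is newer. Between the two, the later timestamp decides.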

v1.4.0 — Provenance

Feature What it does Why it matters
🏷️ Provenance Support Parse <!-- source:TYPE, by:WHO, trust:0.0-1.0 --> annotations from memory files Know where facts come from and how trustworthy they are.
Criticality Marking Detect markers for high-importance facts Never lose track of critical knowledge.
🔗 Wikilink Orphan Check Find broken [[links]] in MEMORY.md and wiki files Catch dead references before they confuse your agent.
📊 Provenance Report New --provenance and --orphans CLI flags On-demand trust and dependency analysis.

v1.3.0 — Version Sync

Feature What it does Why it matters
🔄 Version Bump All versions synced to 1.3.0 Consistent across README, pyproject, CHANGELOG.
🔗 Related Projects Cross-reference to Hermes Nexus Memory Discover the sibling project.

v1.2.0 — Semantic Contradiction Detection

Feature What it does Why it matters
🧠 Semantic Contradiction Detection Embedding similarity + sentiment heuristic (Phase 2) Find contradictions your regex missed.
Two-Phase Pipeline Phase 1 (regex) + Phase 2 (embeddings) Deep and shallow detection combined.
🛡️ Graceful Degradation No embedder → regex only, never crashes Always works, improves with more data.

v1.1.0 — Auto-Fix

Feature What it does Why it matters
🔧 Auto-Fix Mark stale MEMORY.md entries as HISTORICAL Fix problems, not just find them.
🕐 Scheduled Maintenance Weekly cron auto-runs with --auto-fix Set it and forget it.
🔍 Dry-Run --dry-run previews auto-fix changes Safe before committing.
📊 Embedding Providers sentence-transformers / ollama / voyage Choose local or cloud.

v1.0.0 — Initial Release

Feature What it does Why it matters
🩺 Health Checks Stale patterns, contradictions, health score 0–10 Know if your memory is drifting.
Contradiction Detection Pattern pairs find conflicting claims Catch "X is active" vs "X is disabled".
📚 Knowledge Gaps Entities mentioned often but never documented Know what you're missing.
🔗 Co-occurrence Analysis Find entities that appear together Discover hidden relationships.
📊 Growth Stats Track memory size and health over time Spot trends before they become problems.
📝 Wiki Layer Obsidian-compatible templates with [[Wikilinks]] Community knowledge layer.
🏛️ Bi-temporal Tracking Valid from / Valid until markers Never silently overwrite decisions.

Hybrid Retrieval 🛡️

Pure vector search is vulnerable to RAG poisoning — adversarial documents that rank high semantically but contain garbage. Hybrid retrieval fixes this by blending two search strategies:

| Method | Strengths | Weaknesses |
| --- | --- | --- |
| BM25 | Keyword-exact, poison-resistant | Misses semantics |
| Vector | Semantic matching, fuzzy queries | Vulnerable to poisoning |
| Hybrid (RRF) | Best of both | |

How it works

Query → ┌─ BM25 Index ──────→ Keyword Rankings
        │                          │
        └─ Vector Embeddings ──→ Semantic Rankings
                                       │
                              RRF Fusion ───→ Combined Rankings
                                       │
                              Tier Boost ───→ Final Results

Reciprocal Rank Fusion (RRF): Each result gets 1/(k + rank) points from each method. Sum across methods. Simple, effective, no tuning needed.
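
A minimal sketch of the fusion step, assuming each method returns a best-first list of document IDs. The function name `rrf_fuse` is hypothetical. `k=60` is a commonly used RRF default chosen here for illustration; the value Nexus uses is not stated in this README.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several best-first ID lists with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each method contributes 1 / (k + rank) for every result it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_ranking = ["a", "b", "c"]    # keyword ranking
vector_ranking = ["b", "c", "a"]  # semantic ranking
fused = rrf_fuse([bm25_ranking, vector_ranking])
fused_ids = [doc_id for doc_id, _ in fused]
```

Document `b` ends up first because it ranks well in both lists, while `a` tops only one of them. This is why a document has to do well in both rankings to reach the top.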

Source Tiers

| Tier | Sources | Boost | Example |
| --- | --- | --- | --- |
| 🟢 Tier 1 | Agent, user, config, official docs | 1.2× | Your agent's own memory |
| 🟡 Tier 2 | Curated external sources | 1.0× | Medium, arXiv, GitHub READMEs |
| 🔴 Tier 3 | Uncurated / unknown sources | 0.8× | Reddit, Twitter, random forums |

Your own data always wins. Untrusted sources get penalized. Poisoning becomes statistically unlikely.
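
As a rough illustration of how the tier multipliers reorder results, here is a sketch that applies them to already-fused scores. The tuple layout and the name `apply_tier_boost` are assumptions. Treating a missing or unrecognized tier as Tier 3 is also an assumption, based on the "unknown sources" wording in the table.

```python
# Multipliers from the source-tier table above.
TIER_BOOST = {1: 1.2, 2: 1.0, 3: 0.8}


def apply_tier_boost(results, default_tier=3):
    """Re-rank (doc_id, score, tier) tuples by tier-boosted score."""
    boosted = [
        (doc_id, score * TIER_BOOST.get(tier, TIER_BOOST[default_tier]))
        for doc_id, score, tier in results
    ]
    return sorted(boosted, key=lambda item: -item[1])


results = [
    ("forum-post", 0.030, 3),  # higher raw score, untrusted source
    ("agent-note", 0.025, 1),  # lower raw score, the agent's own memory
]
ranked = apply_tier_boost(results)
```

The forum post starts ahead on raw score, but after boosting (0.030 × 0.8 versus 0.025 × 1.2) the agent's own note ranks first.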

Graph Boost

Facts connected by edges in the SkillGraph get a relevance bonus: 1.0 + degree * 0.05. A fact with 10 edges gets 1.5× boost, an isolated one stays at 1.0×.
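
The formula is easy to check by hand. The sketch below applies it as a multiplier, which is how the "1.5×" wording reads. The helper names and the `(source, target, status)` edge tuples are hypothetical. Counting only ACTIVE edges toward the degree is an assumption that follows the Edge Lifecycle rules, not a confirmed detail of `search_hybrid()`.

```python
def graph_boost(score, degree, per_edge=0.05):
    """Scale a relevance score by 1.0 + degree * per_edge."""
    return score * (1.0 + degree * per_edge)


def active_degree(fact_id, edges):
    """Count ACTIVE edges touching fact_id; edges are (source, target, status) tuples."""
    return sum(
        1
        for source, target, status in edges
        if status == "ACTIVE" and fact_id in (source, target)
    )


edges = [
    ("f1", "f2", "ACTIVE"),
    ("f1", "f3", "PROPOSED"),  # ignored: proposed edges carry no boost
    ("f4", "f1", "ACTIVE"),
]
degree = active_degree("f1", edges)
boosted = graph_boost(0.5, degree)
```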

Use it standalone

from nexus.retrieval import HybridRetriever

retriever = HybridRetriever(qdrant_host="localhost", qdrant_port=6333)
retriever.index_memories()                          # build BM25 index from Qdrant
results = retriever.search_bm25("fallback routing") # keyword search

# vec is the query embedding from your configured provider (not defined in this snippet)
results = retriever.search_hybrid("fallback routing", query_vector=vec)  # full hybrid

Or without Qdrant (for testing):

from nexus.retrieval import HybridRetriever

retriever = HybridRetriever()
retriever.index_from_texts(
    texts=["DeepSeek V4 is disabled", "Kimi K2.6 is the fallback"],
    ids=["1", "2"],
)
results = retriever.search_bm25("deepseek fallback")

Auto-Discovery

Automatically finds relationships between your facts. Candidates come from Qdrant vector search and are classified with regex and keyword heuristics, so discovery has zero LLM cost.

from nexus.discovery import AutoDiscovery

ad = AutoDiscovery(qdrant_url="http://localhost:6333", collection="openclaw-memory")
ad.initialize()
result = ad.discover_all()
# Returns: total_facts_scanned, candidates_found, inserted_active, inserted_proposed

Discovered edges that score below the 0.85 threshold are stored as PROPOSED. They don't affect retrieval until you promote them:

from nexus.graph.store import EdgeStore

store = EdgeStore()
store.promote_edge(edge_id, reason="verified manually")

Edge Lifecycle

| Status | Meaning | Search boost |
| --- | --- | --- |
| PROPOSED | Auto-discovered, awaiting confirmation | None |
| ACTIVE | Promoted/verified | Yes |
| DEPRECATED | No longer relevant | None |
| REJECTED | Explicitly rejected | None |

Edge relations: SUPERSEDES, CONTRADICTS, SUPPORTS, ALTERNATIVE_TO, DEPENDS_ON, REFERENCES
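
For orientation, the two enums could look roughly like this. The member names come from this README. The lowercase string values are an assumption (only `'active'` appears elsewhere, in the partial unique index), and the helper `initial_status` is hypothetical. It applies the 0.85 threshold that auto-discovery is described as using.

```python
from enum import Enum


class EdgeRelation(str, Enum):
    SUPERSEDES = "supersedes"
    CONTRADICTS = "contradicts"
    SUPPORTS = "supports"
    ALTERNATIVE_TO = "alternative_to"
    DEPENDS_ON = "depends_on"
    REFERENCES = "references"


class EdgeStatus(str, Enum):
    PROPOSED = "proposed"
    ACTIVE = "active"
    DEPRECATED = "deprecated"
    REJECTED = "rejected"


def initial_status(similarity, threshold=0.85):
    """Status for a newly discovered edge: ACTIVE at or above the threshold."""
    return EdgeStatus.ACTIVE if similarity >= threshold else EdgeStatus.PROPOSED


def affects_retrieval(status):
    """Only ACTIVE edges contribute to the search boost."""
    return status is EdgeStatus.ACTIVE


status = initial_status(0.80)
```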


Embedding Providers

One toolkit. Three backends. Same config, different trade-offs between quality and setup effort.

| Provider | Type | Setup | Dims | Quality |
| --- | --- | --- | --- | --- |
| sentence-transformers | In-process Python | pip install sentence-transformers | 384 | Good ✅ (default) |
| ollama | Local service | Ollama running + ollama pull nomic-embed-text | 768 | Better |
| voyage | Cloud API | Get API key → .env | 1024 | Best |

Default is sentence-transformers — no account, no API key, works immediately.

Nexus auto-detects your embedding provider from OpenClaw's openclaw.json config. If memorySearch.provider is set to "voyage", Nexus uses Voyage automatically.
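
The detection logic might look roughly like the sketch below. It assumes `openclaw.json` is plain JSON with a top-level `memorySearch` object. The function name `detect_provider` and the fallback behavior for missing files or unknown providers are assumptions, not the actual Nexus code.

```python
import json
import tempfile
from pathlib import Path

DEFAULT_PROVIDER = "sentence-transformers"
# Embedding dimensions as listed in the provider table.
PROVIDER_DIMS = {"sentence-transformers": 384, "ollama": 768, "voyage": 1024}


def detect_provider(config_path):
    """Read memorySearch.provider from a JSON config, falling back to the default."""
    try:
        config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return DEFAULT_PROVIDER  # missing or unreadable config
    memory_search = config.get("memorySearch") if isinstance(config, dict) else None
    provider = memory_search.get("provider") if isinstance(memory_search, dict) else None
    return provider if provider in PROVIDER_DIMS else DEFAULT_PROVIDER


# Usage with a throwaway config file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "openclaw.json"
    path.write_text(json.dumps({"memorySearch": {"provider": "voyage"}}), encoding="utf-8")
    detected = detect_provider(path)
    missing = detect_provider(Path(tmp) / "does-not-exist.json")
```

Knowing the provider also fixes the vector size (`PROVIDER_DIMS[detected]`), which matters because a Qdrant collection is created for one dimension.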


Health Checks 🩺

| Check | Method | Example |
| --- | --- | --- |
| Stale patterns | Regex patterns for outdated facts | "GPT-3.5 is primary", but it was replaced |
| Contradictions | Pattern pairs across files | "X is enabled" vs "X is disabled" |
| Score | Weighted 0–10 | 🟢 <1 = healthy · 🟡 1–3 = attention · 🔴 >3 = action needed |

# Full health check
python3 scripts/openclaw_nexus_health.py

# Auto-fix stale entries
python3 scripts/openclaw_nexus_health.py --auto-fix

# Preview fixes without changes
python3 scripts/openclaw_nexus_health.py --dry-run

Configuration

Copy config/default_config.yaml to config/nexus_config.yaml and customize:

workspace: ~/.openclaw/workspace

stale_patterns:
  - pattern: "\\bold-model\\b"
    explanation: "old-model is no longer the primary"

contradiction_pairs:
  - pattern_a: "\\bmodel-a\\b.*primary"
    pattern_b: "\\bmodel-b\\b.*primary"
    description: "Multiple models claimed as primary"

entity_patterns:
  - name: "MyAgent"
    pattern: "\\bMyAgent\\b"

historical_markers:
  - HISTORICAL
  - RESOLVED
  - ARCHIVED

embed_provider: sentence-transformers
wiki_path: null

See examples/full_config.yaml for all options.
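
To show how these config entries are meant to be read, here is a sketch that applies `stale_patterns` and `contradiction_pairs` to a few lines of text. The config is written as a Python dict that mirrors the YAML above, so the example needs no YAML parser. The function names are hypothetical, and skipping lines that already carry a historical marker is an assumption about how the health check behaves.

```python
import re

# Mirrors the YAML above; raw strings hold the same regexes.
config = {
    "stale_patterns": [
        {"pattern": r"\bold-model\b", "explanation": "old-model is no longer the primary"},
    ],
    "contradiction_pairs": [
        {
            "pattern_a": r"\bmodel-a\b.*primary",
            "pattern_b": r"\bmodel-b\b.*primary",
            "description": "Multiple models claimed as primary",
        },
    ],
    "historical_markers": ["HISTORICAL", "RESOLVED", "ARCHIVED"],
}


def find_stale(lines, config):
    """Return (line_number, explanation) for lines matching a stale pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in config["historical_markers"]):
            continue  # already marked as historical, nothing to flag
        for entry in config["stale_patterns"]:
            if re.search(entry["pattern"], line):
                hits.append((number, entry["explanation"]))
    return hits


def find_contradictions(text, config):
    """Return descriptions of pairs where both patterns match somewhere in the text."""
    return [
        pair["description"]
        for pair in config["contradiction_pairs"]
        if re.search(pair["pattern_a"], text) and re.search(pair["pattern_b"], text)
    ]


lines = [
    "old-model is primary",
    "HISTORICAL: old-model was primary",
    "model-a is primary",
    "model-b is primary",
]
stale_hits = find_stale(lines, config)
contradictions = find_contradictions("\n".join(lines), config)
```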


Provenance

Nexus tracks where facts come from and how trustworthy they are:

GPT-5.2 is the primary model <!-- source:chat, by:Nebo, trust:1.0 -->
Server runs on Mac mini M4 <!-- source:manual, by:Miosha, trust:0.8 -->

| Source | Trust | Description |
| --- | --- | --- |
| chat | 1.0 | Directly confirmed in conversation |
| ingest | 0.9 | Imported from trusted external source |
| cron | 0.8 | Auto-detected by scheduled job |
| manual | 0.7 | Manually written, not verified |
| inferred | 0.5 | Deduced from context |

Facts that carry a criticality marker get the highest priority.
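
A minimal sketch of parsing the annotation format shown above. The regex and the name `parse_provenance` are illustrative assumptions about how such a comment could be read, not the parser Nexus ships. Falling back to the per-source default trust when no annotation is present is left out for brevity.

```python
import re

# Matches: <!-- source:TYPE, by:WHO, trust:0.0-1.0 -->
PROVENANCE_RE = re.compile(
    r"<!--\s*source:(?P<source>[\w-]+)\s*,"
    r"\s*by:(?P<by>[^,]+?)\s*,"
    r"\s*trust:(?P<trust>[01](?:\.\d+)?)\s*-->"
)


def parse_provenance(line):
    """Split a memory line into the fact text and its provenance fields."""
    match = PROVENANCE_RE.search(line)
    if match is None:
        return None  # no annotation on this line
    return {
        "fact": line[: match.start()].strip(),
        "source": match["source"],
        "by": match["by"],
        "trust": float(match["trust"]),
    }


parsed = parse_provenance(
    "Server runs on Mac mini M4 <!-- source:manual, by:Miosha, trust:0.8 -->"
)
unannotated = parse_provenance("Plain line without provenance")
```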


Requirements

  • Python 3.11+ with pyyaml
  • Qdrant (for hybrid retrieval, auto-discovery, graph analytics) — or skip for health checks only
  • Embedding provider (default: sentence-transformers, no API key needed)
  • No external database required for health checks, patterns, and auto-fix

Full install (with Qdrant + all features)

pip install "openclaw-nexus-memory[all]"

Minimal install (health checks only, no Qdrant)

pip install openclaw-nexus-memory

Troubleshooting

| Symptom | Check | Fix |
| --- | --- | --- |
| No stale patterns found | Config has stale_patterns? | Add patterns specific to your agent's setup |
| Health score always 0 | Any patterns match? | Customize nexus_config.yaml with real patterns |
| ModuleNotFoundError: yaml | pip3 show pyyaml | pip3 install pyyaml |
| Auto-fix not changing anything | Stale entries in MEMORY.md? | Auto-fix only modifies MEMORY.md, not daily logs |
| Qdrant connection refused | Qdrant running? | docker run -p 6333:6333 qdrant/qdrant |
| Collection not found | Create it first | Run openclaw-nexus-discover or create manually |
| dict content in Qdrant | v2.1.0 handles this | Update to latest version |

vs Other OpenClaw Memory Solutions

🏆 Feature Memory Engine (ClawHub) mem0 mem9 Nexus 🏆
🔀 Hybrid retrieval ✅ BM25+Vector+RRF
🔄 Auto-discovery ✅ Zero LLM cost
📊 Graph analytics ✅ Hubs, gaps, clusters
🔗 Edge lifecycle ✅ Proposed→Active→Deprecated
🛡️ Source-tier boosting ✅ Anti-RAG-poisoning
🚀 Graph boost ✅ Connected facts rank higher
🩺 Stale detection ✅ Regex + auto-fix
Contradiction detection ✅ Pattern pairs + semantic
🧩 Knowledge gaps ✅ Entity counting
⚖️ Authority chain ✅ 6-level hierarchy
🏷️ Provenance ✅ Source+Trust+Criticality
🎯 Grounding scoring ✅ 5-signal evaluation
🧬 Fact lifecycle ✅ Pending→Canonical→Deprecated
📤 Skill export ✅ Facts→SKILL.md
🕐 Bi-temporal ✅ Valid from/until
🗄️ Database needed ✅ Cloud API ✅ Cloud API Qdrant (optional)
💰 Cost Free (local) Free tier + paid Free tier + paid Free (local)

The others store memories. Nexus manages them. Use it alongside any memory backend — builtin, Memory Engine, mem0, or mem9.


Related Projects

  • Hermes Nexus Memory — Sister project: the original codebase with Hermes-specific adaptations. Same core, different agent framework.


License

MIT — use it, modify it, ship it.


Built by Nebo · May 2026 · v2.1.0 — Hybrid Memory Management for OpenClaw