
Consolidation

Varun Pratap Bhardwaj edited this page Mar 30, 2026 · 1 revision

Consolidation: Sleep-Time Memory Processing

Consolidation is SLM's background process that compresses, promotes, and pre-compiles memory during idle periods. The biological metaphor: during sleep, the brain replays recent experiences, compresses them into long-term memory, strengthens important connections, and prunes weak ones. SLM does the same.

The headline feature is Core Memory blocks: always-in-context working memory that is loaded at the start of every session, before any retrieval happens. This is inspired by Letta's Core Memory architecture.

Core Memory Blocks

Core Memory blocks are pre-compiled summaries of your most important knowledge, organized into five categories:

| Block Type | What It Contains | Source |
|---|---|---|
| user_profile | Your key traits and preferences | Top semantic/opinion facts by access count |
| project_context | Current project state and recent activity | Top episodic facts by recency |
| behavioral_patterns | Learned patterns from your usage | Top behavioral patterns by confidence |
| active_decisions | Decisions you reference frequently | Decision-typed facts with access count >= 3 |
| learned_preferences | Your confirmed preferences | Opinion facts with trust >= 0.7 |

Each block is capped at 500 characters. Total across all blocks is capped at 2000 characters. This keeps the working memory footprint small enough to fit alongside other context without exhausting the token budget.
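The two caps can be sketched in Python. This is an illustrative budget-enforcement routine, not SLM's actual API; the constant names mirror the config keys `block_char_limit` and `core_memory_char_limit`, and `fit_blocks` is a hypothetical name:

```python
BLOCK_CHAR_LIMIT = 500         # consolidation.block_char_limit
CORE_MEMORY_CHAR_LIMIT = 2000  # consolidation.core_memory_char_limit

def fit_blocks(blocks: dict[str, str]) -> dict[str, str]:
    """Truncate each block to 500 chars, then trim until the total fits 2000."""
    fitted = {k: v[:BLOCK_CHAR_LIMIT] for k, v in blocks.items()}
    total = sum(len(v) for v in fitted.values())
    if total > CORE_MEMORY_CHAR_LIMIT:
        # Trim blocks proportionally so every category keeps some content.
        scale = CORE_MEMORY_CHAR_LIMIT / total
        fitted = {k: v[:int(len(v) * scale)] for k, v in fitted.items()}
    return fitted
```

With five blocks at 500 characters each (2500 total), proportional trimming brings each down to 400, so no single category is dropped entirely.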

Session Injection Order

At session start (session_init), context is loaded in this order:

  1. Core Memory blocks loaded from core_memory_blocks table (cached in RAM for the session)
  2. Auto-Invoke runs with Core Memory already in context, improving retrieval quality
  3. Combined context returned to the AI

Core Memory provides the persistent "who am I working with and what are we doing" context. Auto-Invoke provides the "what is specifically relevant to this query" context. Together, they give the AI both general and specific knowledge.
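The injection order can be sketched as follows. `load_core_memory_blocks` and `auto_invoke` are stand-in stubs for illustration, not SLM's real functions:

```python
def load_core_memory_blocks(profile_id: str) -> str:
    # Stand-in: in SLM this reads the core_memory_blocks table and caches in RAM.
    return f"[core memory for {profile_id}]"

def auto_invoke(query: str, context: str) -> str:
    # Stand-in: retrieval that already sees Core Memory in its context.
    return f"[facts relevant to '{query}']"

def session_init(profile_id: str, query: str) -> str:
    core = load_core_memory_blocks(profile_id)    # 1. always-in-context blocks
    retrieved = auto_invoke(query, context=core)  # 2. retrieval runs second
    return core + "\n\n" + retrieved              # 3. combined context for the AI
```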

Mode A: Rules-Based Compilation

In Mode A (no LLM), Core Memory blocks are compiled using deterministic rules:

  • user_profile: Top 5 semantic or opinion facts, sorted by access_count descending
  • project_context: Top 5 episodic facts, sorted by created_at descending (most recent)
  • behavioral_patterns: Top 5 behavioral patterns, sorted by confidence descending
  • active_decisions: Facts with signal_type = 'decision' and access count >= 3
  • learned_preferences: Opinion facts with trust >= 0.7

Facts are joined with --- separators and truncated to the character limit. No LLM is called.
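As one example, the `user_profile` rule could be expressed like this; the fact fields (`fact_type`, `access_count`, `text`) are assumptions about the schema, not the documented one:

```python
# Mode A sketch: pure sorting and filtering, no LLM call.
def compile_user_profile(facts: list[dict], char_limit: int = 500) -> str:
    """Top 5 semantic/opinion facts by access_count, joined with --- separators."""
    top = sorted(
        (f for f in facts if f["fact_type"] in ("semantic", "opinion")),
        key=lambda f: f["access_count"],
        reverse=True,
    )[:5]
    return " --- ".join(f["text"] for f in top)[:char_limit]
```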

Mode B/C: LLM-Assisted Compilation

In Mode B and C, the user_profile and project_context blocks are compiled by passing the top facts to the LLM Summarizer, which produces a coherent summary rather than raw concatenation. The behavioral, decisions, and preferences blocks still use rules-based compilation (the LLM adds no value for structured pattern data).

If the LLM fails, compilation falls back to Mode A rules automatically.
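The fallback can be as simple as a try/except around the summarizer call. `summarize_with_llm` and `compile_rules_based` below are hypothetical stand-ins (the stub deliberately raises to simulate an outage):

```python
def summarize_with_llm(facts: list[str]) -> str:
    raise RuntimeError("LLM unavailable")  # stand-in simulating a failure

def compile_rules_based(facts: list[str]) -> str:
    return " --- ".join(facts)  # Mode A concatenation

def compile_block(facts: list[str], use_llm: bool = True) -> str:
    """Try the LLM summarizer; fall back to Mode A rules on any failure."""
    if use_llm:
        try:
            return summarize_with_llm(facts)
        except Exception:
            pass  # automatic fallback, as described above
    return compile_rules_based(facts)
```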

The 6-Step Consolidation Cycle

A full consolidation runs six steps in sequence:

Step 1: Compress

Deduplicate near-identical facts. The algorithm:

  1. Get all active facts for the profile
  2. Group by high cosine similarity (>= 0.85 threshold) using VectorStore
  3. For each group of 3+ near-identical facts:
    • Create a summary fact (Mode A: heuristic; Mode B/C: LLM)
    • Set original facts' lifecycle to archived
    • Link summary to originals via association_edges (type: consolidation)

No facts are deleted. Originals are archived, not removed.
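A minimal grouping sketch, assuming a greedy single pass over fact embeddings (the real VectorStore grouping may differ):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def group_near_duplicates(embeddings: list[list[float]], threshold: float = 0.85):
    """Return index groups of 3+ near-identical facts (cosine >= threshold)."""
    groups, assigned = [], set()
    for i, ei in enumerate(embeddings):
        if i in assigned:
            continue
        group = [i]
        for j in range(i + 1, len(embeddings)):
            if j not in assigned and cosine(ei, embeddings[j]) >= threshold:
                group.append(j)
                assigned.add(j)
        assigned.add(i)
        if len(group) >= 3:  # only groups of 3+ are summarized
            groups.append(group)
    return groups
```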

Step 2: Compile Core Memory Blocks

Recompile all 5 Core Memory block types from current data (as described above). Uses INSERT OR REPLACE on the UNIQUE constraint (profile_id, block_type), ensuring idempotency.
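The idempotent write can be sketched with sqlite3; the schema below is illustrative, showing only the columns needed for the constraint:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE core_memory_blocks (
        profile_id TEXT NOT NULL,
        block_type TEXT NOT NULL,
        content    TEXT NOT NULL,
        UNIQUE (profile_id, block_type)
    )
""")

def write_block(profile_id: str, block_type: str, content: str) -> None:
    # INSERT OR REPLACE hits the UNIQUE constraint, so recompiling
    # overwrites the old block instead of accumulating rows.
    conn.execute(
        "INSERT OR REPLACE INTO core_memory_blocks VALUES (?, ?, ?)",
        (profile_id, block_type, content),
    )

write_block("default", "user_profile", "v1")
write_block("default", "user_profile", "v2")  # replaces v1, still one row
```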

Step 3: Auto-Promote

Promote frequently accessed facts to higher lifecycle states:

  1. Find facts with access count >= promotion_min_access (default: 3)
  2. Filter: only temporally valid facts (not expired)
  3. Filter: only facts with trust >= promotion_min_trust (default: 0.5)
  4. Update lifecycle state based on access patterns

The temporal validity check (L12 audit fix) ensures that contradicted facts are not promoted, even if they have high access counts.
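The three filters can be sketched as a single comprehension; the field names (`access_count`, `expired`, `trust`) are assumptions about the fact schema:

```python
PROMOTION_MIN_ACCESS = 3   # consolidation.promotion_min_access
PROMOTION_MIN_TRUST = 0.5  # consolidation.promotion_min_trust

def promotion_candidates(facts: list[dict]) -> list[dict]:
    """Apply the Step 3 filters: access count, temporal validity, trust."""
    return [
        f for f in facts
        if f["access_count"] >= PROMOTION_MIN_ACCESS
        and not f["expired"]             # temporal validity (L12 audit fix)
        and f["trust"] >= PROMOTION_MIN_TRUST
    ]
```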

Step 4: Decay Unused Edges

Delegate to AutoLinker.decay_unused() from the Association Graph. Edges in association_edges that have not been strengthened in 30+ days have their weights reduced:

new_weight = current_weight * exp(-0.01 * days_inactive)

Edges below weight 0.05 are deleted.

Step 5: Recompute PageRank and Communities

Delegate to GraphAnalyzer.compute_and_store() from the Association Graph. This recomputes PageRank scores and Label Propagation community IDs across the full graph (both graph_edges and association_edges). Results are stored in fact_importance.

Step 6: Derive New Associations

Run AutoLinker.link_new_fact() on any summary facts created in Step 1. This connects the new summaries to the broader memory graph.

Triggers

Consolidation can be triggered through five mechanisms:

Session End

When a Claude Code session ends (Stop hook), full consolidation runs in a background daemon thread. This does not block the session shutdown.

Idle Timer

After 5 minutes of inactivity (no store or recall operations), a lightweight consolidation runs automatically. The timer resets on every user action.
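One way to implement such a reset-on-activity timer (a sketch, not SLM's code; `IdleTrigger` and `touch` are hypothetical names):

```python
import threading

IDLE_TIMEOUT_SECONDS = 300  # consolidation.idle_timeout_seconds

class IdleTrigger:
    """Fire a callback after a period of inactivity; every user action
    restarts the countdown."""

    def __init__(self, on_idle, timeout: float = IDLE_TIMEOUT_SECONDS):
        self.on_idle = on_idle
        self.timeout = timeout
        self.timer = None

    def touch(self):
        # Called on every store()/recall(); cancels and restarts the countdown.
        if self.timer is not None:
            self.timer.cancel()
        self.timer = threading.Timer(self.timeout, self.on_idle)
        self.timer.daemon = True  # never block process shutdown
        self.timer.start()
```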

Step-Count Trigger

Every 50 store() calls, a lightweight consolidation runs synchronously. Lightweight consolidation executes only Step 2 (refresh Core Memory blocks) and Step 4 (decay edges). This is fast enough (< 100ms) to run without noticeable delay.

# After each store():
if store_count % 50 == 0:
    consolidation_engine.consolidate(profile_id, lightweight=True)

Manual Trigger

Via CLI:

slm consolidate                          # Full consolidation, active profile
slm consolidate --profile work           # Full consolidation, specific profile
slm consolidate --lightweight            # Lightweight: blocks + decay only

Via MCP tool:

{
  "tool": "consolidate",
  "arguments": {
    "profile_id": "default",
    "lightweight": false
  }
}

Scheduled

Full consolidation runs automatically every N sessions (default: 5). The session count is tracked per profile.

Idempotency

Consolidation is designed to be safe to run multiple times. Running consolidate() twice produces identical database state. This is guaranteed by:

  • Compression: Uses INSERT OR IGNORE for deduplication (content hash prevents duplicates)
  • Block compilation: Uses INSERT OR REPLACE on UNIQUE(profile_id, block_type)
  • Promotion: Checks current lifecycle state before updating
  • PageRank: Deterministic given the same graph
  • Edge decay: Monotonic (weights only decrease or stay the same)

You can safely run slm consolidate at any time without worrying about corrupting memory state.

Configuration

Enable consolidation:

slm config set consolidation.enabled true

Adjust trigger thresholds:

# Step-count trigger interval
slm config set consolidation.step_count_trigger 50

# Idle timeout in seconds
slm config set consolidation.idle_timeout_seconds 300

# Scheduled full consolidation interval (sessions)
slm config set consolidation.scheduled_sessions 5

Adjust block compilation parameters:

# Per-block character limit
slm config set consolidation.block_char_limit 500

# Total Core Memory character limit
slm config set consolidation.core_memory_char_limit 2000

Adjust promotion and decay:

# Minimum access count for promotion
slm config set consolidation.promotion_min_access 3

# Minimum trust score for promotion
slm config set consolidation.promotion_min_trust 0.5

# Edge decay threshold in days
slm config set consolidation.decay_days_threshold 30

# Compression similarity threshold
slm config set consolidation.compression_similarity 0.85

Full Configuration Reference

| Parameter | Default | Description |
|---|---|---|
| enabled | false | Feature flag. Must be true to activate. |
| step_count_trigger | 50 | Lightweight consolidation every N stores |
| session_trigger | true | Run full consolidation on session end |
| idle_timeout_seconds | 300 | Idle timer threshold (5 minutes) |
| scheduled_sessions | 5 | Full consolidation every N sessions |
| core_memory_char_limit | 2000 | Total character budget for all blocks |
| block_char_limit | 500 | Per-block character limit |
| compression_similarity | 0.85 | Cosine threshold for deduplication |
| promotion_min_access | 3 | Minimum access count for promotion |
| promotion_min_trust | 0.5 | Minimum trust for promotion |
| decay_days_threshold | 30 | Days before edge decay begins |

Lightweight vs. Full Consolidation

| | Lightweight | Full |
|---|---|---|
| Steps | 2 (blocks) + 4 (decay) | All 6 steps |
| Duration | < 100ms | Seconds (depends on memory size) |
| Trigger | Step-count (every 50 stores), idle timer | Session end, manual, scheduled |
| Blocking | Synchronous (fast) | Background thread (non-blocking) |
| When to use | Keeping blocks fresh during active work | Periodic deep maintenance |

Dependencies on Other V3.2 Features

Consolidation integrates with all other v3.2 features:

  • Auto-Invoke (Phase 2): Core Memory is injected before auto-invoke at session start
  • Association Graph (Phase 3): Steps 4 and 5 delegate to AutoLinker and GraphAnalyzer; Step 6 links new summaries
  • Temporal Intelligence (Phase 4): Step 3 checks temporal validity before promoting facts

If any dependency is disabled (feature flag off), consolidation gracefully skips the relevant operations.


Part of Qualixar | Created by Varun Pratap Bhardwaj
