meme

High-performance long-term memory for AI agents — production-grade pipeline with semantic compression, lifecycle reconciliation, full CRUD, hybrid retrieval, and persistent vector storage, written in Rust.

meme implements a production-grade memory pipeline with a Rust core: (1) Semantic Structured Compression extracts lossless, disambiguated memory entries from dialogues or raw facts via LLM, (2) Lifecycle Reconciliation deduplicates and manages ADD/UPDATE/DELETE/NOOP via LLM-driven conflict resolution at write time, and (3) Intent-Aware Retrieval Planning combines semantic, lexical (FTS), and structured metadata search with LLM-driven reflection. Memory is stored persistently on disk via LanceDB with full change history tracking.

Quick Start

Install the CLI

Shell (macOS / Linux):

curl -fsSL https://sh.qntx.fun/meme | sh

PowerShell (Windows):

irm https://sh.qntx.fun/meme/ps | iex

CLI

# Initialize configuration
meme init

# Add dialogues
meme add -s Alice "I'll be in Tokyo next Monday for the conference."
meme add -s Bob "Let's meet at Shibuya station at 3pm."

# Add raw facts (no speaker needed)
meme add "Alice prefers coffee over tea"

# Import from JSONL file
meme add --file conversation.jsonl

# Ask questions
meme ask "Where will Alice and Bob meet?"

# Semantic search
meme search "Alice travel plans"

# CRUD operations
meme get <uuid>
meme update <uuid> "Updated content here"
meme delete <uuid>

# View change history
meme history <uuid>

# List stored memories
meme list
meme list --json --limit 50

# Export / import
meme export -o memories.json
meme import memories.json
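The JSONL schema expected by `meme add --file` is not documented above; a plausible shape, assuming one JSON object per line with hypothetical `speaker` and `content` fields (verify the actual field names against your meme-cli version):

```jsonl
{"speaker": "Alice", "content": "I'll be in Tokyo next Monday for the conference."}
{"speaker": "Bob", "content": "Let's meet at Shibuya station at 3pm."}
```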

Library

use meme::MemeBuilder;

let meme = MemeBuilder::new()
    .api_key("sk-...")
    .model("gpt-4.1-mini")
    .build()
    .await?;

// Dialogue-based ingestion — automatically extracted into structured memory entries.
meme.add_dialogue("Alice", "Let's meet at 2pm tomorrow", None).await?;
meme.add_dialogue("Bob", "Sure, I'll bring the Q3 report", None).await?;
meme.finalize().await?;

// Direct fact ingestion — bypasses dialogue windowing.
meme.add("Alice prefers coffee over tea").await?;

// CRUD operations.
let results = meme.search("Alice meeting").await?;
let entry = meme.get(results[0].id).await?;
meme.update(results[0].id, "Alice prefers tea over coffee").await?;
meme.delete(results[0].id).await?;

// Change history tracking.
let events = meme.history(results[0].id).await?;

// Q&A — hybrid retrieval + LLM answer generation.
let answer = meme.ask("When will Alice and Bob meet?").await?;

See examples/ for more: basic, batch import.

Feature Flags

Feature Default Description
api-embedding yes Remote OpenAI-compatible embedding API
onnx no Local ONNX embedding via fastembed — auto-downloads models from Hugging Face Hub

Configuration

No configuration file is required. The library is configured entirely through MemeBuilder:

let meme = MemeBuilder::new()
    .api_key("sk-...")
    .model("gpt-4.1-mini")
    .base_url("https://api.openai.com/v1")
    .user_id("alice")           // multi-tenant isolation
    .session_id("session-001")  // multi-session isolation
    .build()
    .await?;

For full control, pass a Config struct directly:

use meme::config::{Config, LlmConfig, EmbeddingConfig, StoreConfig, PipelineConfig};

let config = Config {
    llm: LlmConfig { api_key: Some("sk-...".into()), ..Default::default() },
    embedding: EmbeddingConfig { model: "text-embedding-3-small".into(), dimension: 1536, ..Default::default() },
    store: StoreConfig { lancedb_path: "/custom/path/lancedb".into(), ..Default::default() },
    pipeline: PipelineConfig { semantic_top_k: 25, enable_reflection: true, ..Default::default() },
};

let meme = MemeBuilder::new().config(config).build().await?;

The CLI tool (meme-cli) optionally reads ~/.meme/config.toml. Environment variables override any file or default values:

Env Var Overrides Default
MEME_LLM_API_KEY llm.api_key (required)
MEME_LLM_BASE_URL llm.base_url https://api.openai.com/v1
MEME_LLM_MODEL llm.model gpt-4.1-mini
MEME_EMBEDDING_PROVIDER embedding.provider api

Full config.toml reference

[llm]
api_key = "sk-..."
base_url = "https://api.openai.com/v1"
model = "gpt-4.1-mini"
temperature = 0.1
max_retries = 3

[embedding]
provider = "api"                        # "api" or "onnx"
model = "text-embedding-3-small"        # API model name or fastembed model code
dimension = 1024                        # vector dimension (auto-detected for onnx)

[store]
lancedb_path = "~/.meme/lancedb"
table_name = "memories"

[pipeline]
window_size = 40                        # dialogues per extraction window
overlap_size = 2                        # overlap between consecutive windows
semantic_top_k = 25                     # max semantic search results
keyword_top_k = 5                       # max keyword search results
structured_top_k = 5                    # max structured search results
enable_planning = true                  # LLM-driven query analysis
enable_reflection = true                # iterative completeness checking
max_reflection_rounds = 2
max_build_workers = 16                  # parallel extraction workers
max_retrieval_workers = 8               # parallel search workers
enable_rerank = false                   # LLM-based reranking
# custom_extraction_prompt = "..."      # override built-in extraction prompt
# custom_answer_prompt = "..."          # override built-in answer prompt

Architecture

flowchart TB
    subgraph Write["Write Path"]
        D["Dialogues / Facts"] --> W[Windowing]
        W --> LLM1["LLM Extraction<br/><i>Semantic Structured Compression</i>"]
        LLM1 --> E[MemoryEntry]
        E --> EMB1[Embedding]
        EMB1 --> RC{"LLM Reconciliation<br/><i>ADD / UPDATE / DELETE / NOOP</i>"}
        RC --> VS[(VectorStore<br/>LanceDB)]
        RC --> HS[(HistoryStore<br/>Change Tracking)]
    end

    subgraph CRUD["CRUD API"]
        GA["get(id)"] --> VS
        UA["update(id, content)"] --> VS
        DA["delete(id)"] --> VS
        SA["search(query)"] --> VS
        HA["history(id)"] --> HS
    end

    subgraph Read["Read Path"]
        Q[Query] --> P["LLM Planning<br/><i>Intent-Aware Retrieval</i>"]
        P --> S1[Semantic Search<br/>dense vectors]
        P --> S2[Keyword Search<br/>FTS / Tantivy]
        P --> S3[Structured Search<br/>metadata filters]
        S1 & S2 & S3 --> M[Merge + Deduplicate]
        M --> R{Reflection}
        R -->|incomplete| P
        R -->|complete| G["LLM Answer Generation"]
    end

    VS -.-> S1 & S2 & S3

Each MemoryEntry is a self-contained, unambiguous unit of knowledge stored with three index layers:

Index Layer Type Purpose Implementation
Semantic Dense vector Conceptual similarity 1024-d embeddings via OpenAI or local ONNX
Lexical Inverted index Exact term matching FTS (Tantivy) + BM25-style keywords
Symbolic Structured metadata Filtered lookup Timestamp, location, persons, entities, topic

Pipeline

Stage 1: Semantic Structured Compression

Raw dialogues (or direct facts via add()) are split into overlapping windows and sent to an LLM. The LLM extracts atomic, self-contained memory entries — each entry is a complete, independent fact with all pronouns resolved and all timestamps converted to absolute ISO 8601 format.

Each entry contains:

  • Lossless restatement — complete sentence (no pronouns, no relative time)
  • Keywords — core terms for BM25-style lexical matching
  • Structured metadata — ISO 8601 timestamp, location, person names, entity names, topic phrase
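The overlapping-window split can be sketched in a few lines. This is an illustrative stand-in for the pipeline's `window_size` / `overlap_size` behavior, not the crate's actual implementation:

```rust
/// Split `items` into overlapping windows, mirroring the pipeline's
/// `window_size` / `overlap_size` parameters.
fn windows<T: Clone>(items: &[T], window_size: usize, overlap: usize) -> Vec<Vec<T>> {
    assert!(window_size > overlap, "window must be larger than the overlap");
    let step = window_size - overlap;
    let mut out = Vec::new();
    let mut start = 0;
    while start < items.len() {
        let end = (start + window_size).min(items.len());
        out.push(items[start..end].to_vec());
        if end == items.len() {
            break; // final (possibly short) window reached
        }
        start += step;
    }
    out
}

fn main() {
    // 6 dialogues, window of 4, overlap of 2 → windows [0..4] and [2..6].
    let dialogues: Vec<u32> = (0..6).collect();
    let w = windows(&dialogues, 4, 2);
    assert_eq!(w, vec![vec![0, 1, 2, 3], vec![2, 3, 4, 5]]);
}
```

With the defaults above (window_size = 40, overlap_size = 2), consecutive windows share their last two dialogues so facts spanning a window boundary are not lost.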

Stage 2: Lifecycle Reconciliation

New entries are reconciled against existing memories in a single LLM call. For each new fact, the LLM decides:

Action When Effect
ADD Genuinely new information Store the new entry
UPDATE Supersedes an existing memory Delete old + store new
DELETE Contradicts an existing memory Remove the obsolete entry
NOOP Duplicate of existing memory Skip (no storage)

All write operations are tracked in the HistoryStore for audit and debugging.
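The decision table above maps naturally onto an enum. A sketch of how a caller might apply each action to a store — the type and field names here are illustrative, not the crate's public API:

```rust
use std::collections::HashMap;

/// The four reconciliation outcomes from the table above.
#[derive(Debug, PartialEq)]
enum ReconcileAction {
    Add,
    Update { target_id: String },
    Delete { target_id: String },
    Noop,
}

/// Apply one LLM decision to an in-memory stand-in for the vector store
/// (id -> content).
fn apply(store: &mut HashMap<String, String>, new_id: &str, new_content: &str,
         action: ReconcileAction) {
    match action {
        // Genuinely new information: store the new entry.
        ReconcileAction::Add => {
            store.insert(new_id.to_string(), new_content.to_string());
        }
        // Supersedes an existing memory: delete old + store new.
        ReconcileAction::Update { target_id } => {
            store.remove(&target_id);
            store.insert(new_id.to_string(), new_content.to_string());
        }
        // Contradicts an existing memory: remove the obsolete entry.
        ReconcileAction::Delete { target_id } => {
            store.remove(&target_id);
        }
        // Duplicate: skip, nothing is stored.
        ReconcileAction::Noop => {}
    }
}

fn main() {
    let mut store = HashMap::new();
    store.insert("m1".to_string(), "Alice prefers coffee over tea".to_string());
    apply(&mut store, "m2", "Alice prefers tea over coffee",
          ReconcileAction::Update { target_id: "m1".into() });
    assert!(!store.contains_key("m1"));
    assert_eq!(store["m2"], "Alice prefers tea over coffee");
}
```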

Stage 3: Intent-Aware Retrieval Planning

A single LLM call analyzes the user's query and produces a unified retrieval plan:

  1. Query analysis — extract keywords, person names, entities, time expressions, and question type
  2. Search planning — generate 1–3 targeted search queries for semantic retrieval
  3. Information requirements — identify what specific facts are needed for a complete answer

The plan drives parallel execution of all three search layers (semantic, keyword, structured). Results are merged via ID-based deduplication.

When reflection is enabled, the system iteratively assesses completeness: if retrieved context is insufficient, additional targeted queries are generated and executed until the information requirement is satisfied or the max reflection rounds are reached.
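The ID-based merge step can be sketched as follows, assuming simple `(id, content)` pairs in place of the crate's internal result types:

```rust
use std::collections::HashSet;

/// Merge results from the three search layers, keeping the first occurrence
/// of each id — the ID-based deduplication described above.
fn merge_dedup(layers: Vec<Vec<(u64, &str)>>) -> Vec<(u64, &str)> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for layer in layers {
        for (id, content) in layer {
            // `insert` returns false if an earlier layer already contributed this id.
            if seen.insert(id) {
                merged.push((id, content));
            }
        }
    }
    merged
}

fn main() {
    let semantic = vec![(1, "fact a"), (2, "fact b")];
    let keyword = vec![(2, "fact b"), (3, "fact c")];
    let structured = vec![(1, "fact a")];
    let merged = merge_dedup(vec![semantic, keyword, structured]);
    let ids: Vec<u64> = merged.iter().map(|(id, _)| *id).collect();
    assert_eq!(ids, vec![1, 2, 3]);
}
```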

Benchmark

meme-bench evaluates memory quality using the LOCOMO benchmark format:

MEME_LLM_API_KEY=sk-... meme-bench run --dataset locomo10.json
meme-bench run --dataset data.json --model gpt-4.1-mini --output report.json
meme-bench sample -o sample_bench.json  # generate sample dataset

Metrics: token-level F1, precision, recall, exact match — per question category (single-hop, temporal, commonsense, open-domain, adversarial).
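Token-level F1 is the standard SQuAD-style overlap metric; a sketch, assuming whitespace tokenization and lowercasing (meme-bench's exact normalization may differ):

```rust
use std::collections::HashMap;

/// Token-level F1 between a predicted and a gold answer: precision and
/// recall over the multiset intersection of their tokens.
fn token_f1(pred: &str, gold: &str) -> f64 {
    let count = |s: &str| {
        let mut m: HashMap<String, u32> = HashMap::new();
        for t in s.split_whitespace() {
            *m.entry(t.to_lowercase()).or_insert(0) += 1;
        }
        m
    };
    let (p, g) = (count(pred), count(gold));
    // Multiset intersection size.
    let overlap: u32 = p.iter()
        .map(|(t, &c)| c.min(*g.get(t).unwrap_or(&0)))
        .sum();
    if overlap == 0 {
        return 0.0;
    }
    let precision = overlap as f64 / p.values().sum::<u32>() as f64;
    let recall = overlap as f64 / g.values().sum::<u32>() as f64;
    2.0 * precision * recall / (precision + recall)
}

fn main() {
    // 2 shared tokens, 3 predicted, 2 gold → P = 2/3, R = 1, F1 = 0.8.
    let f1 = token_f1("at Shibuya station", "Shibuya station");
    assert!((f1 - 0.8).abs() < 1e-9);
}
```

Exact match is the degenerate case: it scores 1 only when the normalized strings are identical.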

License

Licensed under either of:

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT License (LICENSE-MIT)

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project shall be dual-licensed as above, without any additional terms or conditions.


A QNTX open-source project.

Code is law. We write both.
