An addressable hierarchical memory system for LLMs. Compress a 10k-token context into 2k active tokens without losing information — by making the rest addressable on demand.
MindCity is a research proof-of-concept that explores a new direction for LLM long-context memory: instead of trying to fit everything into the context window, make everything addressable at the right granularity, on demand.
The core thesis:
The "lossless" compression of a long LLM context does not mean fitting everything into fewer tokens. It means making the full content addressable on demand while exposing, by default, only what is relevant at the relevant level of granularity.
MindCity combines:
- Global entity deduplication — a shared dictionary of entities and concepts across documents, strictly reversible
- Hierarchical spatial metaphor — a "city" structure (districts, buildings, apartments, rooms, drawers) that gives the LLM a stable mental model for navigation
- A minimal navigation DSL — around ten verbs (`enter`, `list`, `zoom`, `follow`, `search_local`, etc.) exposed via standard tool calls, no fine-tuning required
- A zero-copy binary storage layer — Apache Arrow + Kùzu + LanceDB for physical speed
The LLM never sees the binary. It walks through the city, zooming in where it needs detail. Dense summaries at each level prevent it from being overwhelmed.
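To make the navigation loop concrete, here is a minimal sketch in Python. The verb names (`enter`, `list`, `zoom`) come from the README above; the nested-dict "city", the path convention, and the handler signatures are illustrative assumptions, not the real DSL defined in `src/mindcity/dsl/`.

```python
# A toy "city": each node carries a dense summary plus named children.
# Structure and field names are hypothetical, for illustration only.
CITY = {
    "summary": "2 districts: work, personal",
    "children": {
        "work": {
            "summary": "Projects and meetings",
            "children": {
                "project-x": {"summary": "Launch planning notes", "children": {}},
            },
        },
        "personal": {"summary": "Travel and finances", "children": {}},
    },
}

def resolve(path):
    """Walk a /-separated path from the city root down to a node."""
    node = CITY
    for part in filter(None, path.split("/")):
        node = node["children"][part]
    return node

# Each verb returns only summary-level text, never raw content, so the
# LLM sees the minimum it needs to decide its next move.
VERBS = {
    "enter": lambda path: resolve(path)["summary"],
    "list": lambda path: sorted(resolve(path)["children"]),
    "zoom": lambda path: {k: v["summary"] for k, v in resolve(path)["children"].items()},
}

def call_tool(verb, path):
    """Dispatch one tool call, as an LLM tool-use runtime would."""
    return VERBS[verb](path)
```

For example, `call_tool("list", "/work")` returns `["project-x"]`, and a follow-up `zoom` on that path exposes one more level of summaries — detail is paid for only where the model asks for it.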
This is a research proof-of-concept, not a production system.
Current phase: Phase 1 — Ingestion & entity dictionary (3/5 workstreams complete).
What's done:
- Full project scaffolding (config, logging, types, tests, CI)
- Synthetic corpus generator with 10 diverse topic templates
- `corpus_tiny` (10 conversations) generated and committed
- Ingestion pipeline: loader (multi-format JSON), normalizer (MindCity/Claude/ChatGPT exports), chunker (1 message = 1 chunk)
- Entity system: spaCy NER + pattern matching extractor, LMDB-backed entity dictionary, mention resolver with alias tracking
- 39 unit tests passing, lint clean (ruff + black)
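The shape of the entity system above can be sketched in a few lines. This is an in-memory stand-in for the LMDB-backed dictionary — the class name, ID format, and method signatures are assumptions for illustration, not MindCity's actual API.

```python
# In-memory stand-in for the LMDB-backed entity dictionary: every surface
# form (or registered alias) maps to one canonical entity ID, so repeated
# mentions across documents collapse to a single shared entry.
class EntityDictionary:
    def __init__(self):
        self.entities = {}  # entity_id -> {"canonical": str, "aliases": set}
        self.index = {}     # lowercased surface form -> entity_id
        self._next = 0

    def resolve(self, mention, aliases=()):
        """Return the entity ID for a mention, creating it on first sight."""
        key = mention.lower()
        if key not in self.index:
            ent_id = f"ent:{self._next:04x}"
            self._next += 1
            self.entities[ent_id] = {"canonical": mention, "aliases": set()}
            self.index[key] = ent_id
        ent_id = self.index[key]
        for alias in aliases:  # register aliases so later mentions dedupe too
            self.index.setdefault(alias.lower(), ent_id)
            self.entities[ent_id]["aliases"].add(alias)
        return ent_id
```

Resolving "Apache Arrow" with alias "Arrow" and later resolving the bare mention "arrow" both yield the same ID — which is what lets the dictionary be shared globally across documents.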
What's next: entity pointer encoding/decoding (`@ent:xxx`), full pipeline orchestration, and Phase 1 exit criteria validation.
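Only the `@ent:xxx` pointer syntax appears in the source; everything else below — the function names, the regex, the longest-mention-first ordering — is a hypothetical sketch of what "strictly reversible" encoding could look like.

```python
import re

# Hypothetical sketch of @ent:xxx pointer substitution: encoding replaces
# known entity mentions with compact pointers; decoding is the exact
# inverse, which is what makes the compression strictly reversible.
def encode(text, dictionary):
    """Replace each known mention with its @ent:xxx pointer (longest first,
    so 'Apache Arrow' wins over a bare 'Arrow' substring)."""
    for mention, ent_id in sorted(dictionary.items(), key=lambda kv: -len(kv[0])):
        text = text.replace(mention, f"@{ent_id}")
    return text

def decode(text, dictionary):
    """Expand every @ent:xxx pointer back to its canonical surface form."""
    reverse = {ent_id: mention for mention, ent_id in dictionary.items()}
    return re.sub(r"@(ent:[0-9a-f]+)", lambda m: reverse[m.group(1)], text)
```

A round trip (`decode(encode(text, d), d) == text`) is the natural unit test for the reversibility claim; a production version would also need to handle aliases that share an ID and mentions that overlap token boundaries.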
See CLAUDE.md for the current state of each phase and PLAN.md for the roadmap.
Start here, in order:
| Document | Purpose |
|---|---|
| VISION.md | The thesis, the three levers, the v1/v2 scope |
| BENCHMARK.md | Evaluation protocol, metrics, baselines |
| PLAN.md | Implementation roadmap, stack, phase-by-phase milestones |
| CLAUDE.md | Living context file, updated every session |
| paper/main.tex | The accompanying research paper (drafted alongside the code) |
MindCity is structured around seven explicit research questions (see VISION.md section 9):
- RQ1 — Compression. What effective compression ratio does MindCity achieve versus a naive RAG, at equivalent answer quality?
- RQ2 — Quality. At a fixed token budget, does MindCity's answer quality match or exceed naive RAG?
- RQ3 — Addressability. What fraction of hard questions require explicit zooming? Does the LLM learn zero-shot to zoom at the right moment?
- RQ4 — Hierarchy. Does the spatial metaphor (city/district/building/…) improve navigation over abstract hierarchies?
- RQ5 — Deduplication. What gain does the entity dictionary bring in isolation? Does it combine linearly with the hierarchy?
- RQ6 — Latency. How many tool calls does MindCity need per query on average?
- RQ7 — Scaling. How do metrics evolve from 100 to 10,000 conversations?
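For RQ1, one plausible reading of the metric — an assumption, since BENCHMARK.md defines the protocol — is the ratio between the tokens a naive baseline puts in context and the active tokens MindCity actually exposes (summaries, zoomed spans, tool traffic) at equal answer quality:

```python
# Assumed RQ1-style metric, not taken from BENCHMARK.md: effective
# compression = baseline context tokens / MindCity's active tokens.
def effective_compression_ratio(baseline_tokens, active_tokens):
    if active_tokens <= 0:
        raise ValueError("active token count must be positive")
    return baseline_tokens / active_tokens

# The README's headline claim: a 10k-token context served as 2k active tokens.
ratio = effective_compression_ratio(10_000, 2_000)  # -> 5.0
```

Counting tool-call traffic inside `active_tokens` matters: it is what keeps the metric honest against RQ6 (a system could otherwise "compress" by shifting tokens into navigation overhead).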
| System | Structure | Compression | LLM-driven navigation |
|---|---|---|---|
| Naive RAG | flat chunks | top-k | no |
| LLMLingua | prompt-level | perplexity pruning | no |
| Gist tokens / ICAE | fine-tuned memory slots | learned | no |
| MemGPT / Letta | paginated memory | coarse | yes (page-level) |
| GraphRAG | hierarchical communities | multi-level summaries | no |
| HippoRAG | concept graph + PageRank | none (retrieval only) | no |
| MindCity | spatial hierarchy | deduplication + addressable zoom | yes, fine-grained |
MindCity is the only system combining global deduplication, LLM-driven hierarchical navigation, and a cognitive metaphor that the LLM can use zero-shot.
```bash
# Clone and install
git clone https://github.com/berch-t/mindcity.git
cd mindcity
uv venv --python 3.12 .venv
source .venv/bin/activate
uv pip install -e ".[dev]"

# Verify everything works
make lint   # ruff + black
make test   # 39 unit tests

# Generate the synthetic test corpus
python scripts/generate_synthetic_corpus.py --size tiny

# (Coming soon) Ingest and benchmark
make ingest
make benchmark
```

```
mindcity/
├── VISION.md, BENCHMARK.md, PLAN.md, PROMPT.md, CLAUDE.md  # research docs
├── src/mindcity/     # core code
│   ├── ingestion/    # loading, normalizing, chunking
│   ├── entities/     # deduplication dictionary (lever 1)
│   ├── hierarchy/    # city construction (lever 2)
│   ├── storage/      # Kùzu + LanceDB + Arrow wrappers
│   ├── dsl/          # navigation verbs + LLM loop
│   └── api/          # FastAPI exposure
├── baselines/        # raw context, RAG, BM25, GraphRAG-lite
├── benchmarks/       # questions, judge, metrics
├── scripts/          # run_benchmark, generate_corpus, etc.
├── tests/            # pytest suite
├── data/             # corpora (synthetic committed, real gitignored)
├── results/          # benchmark outputs, committed
└── paper/            # LaTeX research paper, drafted along the way
```
This is a research project in active development. Issues, discussions, and pull requests are welcome, especially around:
- Alternative clustering strategies for hierarchy construction
- Improvements to the DSL specification
- New baselines to compare against
- Additional corpora for robustness evaluation
See CLAUDE.md for the current state and open questions.
If you use or reference MindCity in your research, please cite the paper (when available) or this repository:
```bibtex
@misc{mindcity2026,
  author = {Berchet, Thomas},
  title  = {MindCity: An Addressable Hierarchical Memory System for LLMs},
  year   = {2026},
  url    = {https://github.com/berch-t/mindcity}
}
```

MIT — see LICENSE.
This project builds conceptually on the excellent prior work of LLMLingua (Microsoft), GraphRAG (Microsoft Research), MemGPT / Letta, HippoRAG, and the broader LLM long-context research community. MindCity's contribution is to combine addressability with cognitive metaphor and global deduplication in a single evaluated system.
"Quand on veut, on peut." — "Where there's a will, there's a way."