Mnexa

A disciplined wiki maintainer for a personal markdown knowledge base. Throw any file at it — local, a folder, a Google Drive URL, or a Granola meeting note — and an LLM reads it and maintains a structured wiki of source / entity / concept pages with cross-references, an index, and a log. You curate; the LLM does the bookkeeping.

Implementation of the pattern in Andrej Karpathy's LLM Wiki gist — read that first; it's the design spec.

Why

Most LLM document tools are RAG: retrieve chunks at query time, generate from the chunks, throw the synthesis away. Mnexa treats the wiki as a persistent, compounding artifact: each ingest updates entity and concept pages once, and each query runs against the accumulated synthesis instead of re-deriving it from raw sources. Open the wiki in Obsidian, Logseq, VS Code, or any markdown editor. The LLM is the maintainer; you are the curator.

Install

Requires Python 3.12+. Get a Gemini API key at https://aistudio.google.com/apikey.

# from PyPI
uv tool install mnexa            # or: pip install mnexa
# or for development
git clone https://github.com/jiashuoz/mnexa && cd mnexa && uv sync

Set GOOGLE_API_KEY in your shell or in a .env file at the vault root. See .env.example.

Use

# Create a new vault
mnexa init ~/my-vault
cd ~/my-vault

# Ingest anything — local file, local folder, Google Drive URL, or Granola
mnexa ingest paper.pdf
mnexa ingest ~/Documents/papers/
mnexa ingest "https://drive.google.com/drive/folders/<id>"
mnexa ingest "https://drive.google.com/file/d/<id>"
mnexa ingest "https://app.granola.ai/notes/<id>"
mnexa ingest granola                              # all your Granola notes
mnexa ingest granola --since 2026-04-01           # incremental

# Ask the wiki a question
mnexa query "what does this paper claim?"

# Audit the wiki
mnexa lint

Folder ingests support --yes / -y to skip confirmation and --limit N to cap files per run. Re-running an ingest on a folder skips files whose source hasn't changed (Drive: by modifiedTime; local: by content hash).
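The skip-unchanged behaviour for local folders can be pictured with a small sketch. This is not Mnexa's actual code: the function names, the choice of SHA-256, and the `seen` mapping (relative path to previously recorded hash) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Optional


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_ingest(
    folder: Path, seen: dict, limit: Optional[int] = None
) -> list:
    """List files whose hash differs from the recorded one, capped at `limit`.

    `seen` maps a folder-relative path to the hash recorded at the last
    ingest (a hypothetical storage format, not Mnexa's).
    """
    pending = []
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        if limit is not None and len(pending) >= limit:
            break  # mirrors the idea of --limit N
        key = path.relative_to(folder).as_posix()
        if seen.get(key) == content_hash(path):
            continue  # unchanged since the last ingest, so skip it
        pending.append(path)
    return pending
```

A file whose bytes change gets a new digest and is picked up on the next run, even if its name and timestamp stay the same.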

Vault layout

my-vault/
├── .git/
├── .gitignore                  # ignores .mnexa/ and .env
├── .mnexa/                     # Mnexa local state (lint reports)
├── CLAUDE.md                   # the schema — edit §6 to customize
├── raw/                        # immutable source documents
└── wiki/
    ├── index.md                # categorized table of contents
    ├── log.md                  # append-only activity log
    ├── sources/                # one page per ingested document
    ├── entities/               # people, orgs, products, places
    └── concepts/               # ideas, techniques, recurring topics

Every successful ingest is a git commit. Free undo, free history, free diff.

How it works

Ingest is a two-stage pipeline:

1. Analyze: the LLM reads the source plus the schema, the index, and obviously related existing pages, and produces a structured analysis (entities, concepts, claims, contradictions). This analysis is internal scratch.
2. Generate: the LLM emits FILE blocks for the new and updated wiki pages. Mnexa parses them, validates paths and frontmatter, substring-verifies that every ⟦"..."⟧ source-quote marker appears verbatim in the source, then atomically writes and commits.

The substring verifier is the anti-hallucination floor. If the LLM invents a biographical detail not present in the source, the marker check fails and the ingest aborts with no on-disk changes.
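The idea behind the verifier fits in a few lines. The sketch below is an illustration, not Mnexa's implementation: the regular expression, the function name, and the decision to compare without any whitespace normalization are assumptions.

```python
import re

# Matches the ⟦"..."⟧ source-quote markers described above.
MARKER = re.compile(r'⟦"(.*?)"⟧', re.DOTALL)


def ungrounded_quotes(page_text: str, source_text: str) -> list:
    """Return every marked quote that does not appear verbatim in the source."""
    return [q for q in MARKER.findall(page_text) if q not in source_text]


source = "Alice joined Acme in 2019 as an engineer."
page = 'Alice ⟦"joined Acme in 2019"⟧ and later ⟦"founded Acme"⟧.'

# A non-empty result would be the signal to abort the ingest.
print(ungrounded_quotes(page, source))
```

Here the second quote is not a substring of the source, so it is reported; under the behaviour described above, the ingest would abort before anything is written.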

Query is a single LLM call against index.md + the top-N pages by keyword overlap, streamed to stdout with inline [[wikilink]] citations. Drive-sourced pages carry drive_url: in their frontmatter, so query answers naturally surface clickable Drive links when relevant — no separate "find files" command.
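One plausible reading of "top-N pages by keyword overlap" is sketched below. The tokenizer, the scoring (count of shared distinct terms), and the tie-break by slug are all assumptions; Mnexa's real ranking may differ.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_pages(question: str, pages: dict, n: int = 5) -> list:
    """Rank page slugs by how many distinct query terms each page body shares."""
    query_terms = tokenize(question)
    scored = []
    for slug, body in pages.items():
        overlap = len(query_terms & tokenize(body))
        if overlap:
            scored.append((overlap, slug))
    # Highest overlap first; ties are broken alphabetically by slug.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [slug for _, slug in scored[:n]]


pages = {
    "concepts/rag": "retrieval augmented generation retrieves chunks",
    "entities/alice-smith": "alice smith leads the design review",
    "concepts/wiki": "a persistent wiki of pages",
}
print(top_pages("who leads the design review?", pages, n=2))
```

The selected pages, together with index.md, would then form the context for the single LLM call.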

Lint runs deterministic checks first (broken links, frontmatter, index/wiki sync, orphans, ungrounded pages, slug style), then one LLM call for semantic checks (contradictions, stale claims, missing pages, slug typos). Output: .mnexa/lint/<timestamp>.md.
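As an illustration of the deterministic tier, a broken-link check might look like the sketch below. It assumes that a [[wikilink]] resolves by page slug (the file stem) anywhere under wiki/, which may not match Mnexa's actual resolution rules; the function name is hypothetical.

```python
import re
from pathlib import Path

# Captures the link target, stopping at an alias (|) or heading anchor (#).
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def broken_links(wiki_dir: Path) -> list:
    """Return (page slug, missing target) pairs for links with no matching page."""
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    problems = []
    for path in sorted(pages.values()):
        for target in WIKILINK.findall(path.read_text(encoding="utf-8")):
            if target.strip() not in pages:
                problems.append((path.stem, target.strip()))
    return problems
```

Checks like this need no model call, which is why they can run first and cheaply before the semantic pass.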

Google Drive

Drive is a transport, not a separate concept. The same mnexa ingest command takes a Drive file URL or a folder URL; Mnexa fetches the content in memory, ingests it, and stores Drive metadata (drive_file_id, drive_modified, drive_url, drive_path, mime_type) in the resulting source page's frontmatter. Originals stay in Drive; nothing is downloaded to raw/.

Re-syncing is idempotent: walking a folder again skips files whose drive_modified matches what's already on disk. Source-page depth adapts to content — a paper gets a full structured page; a tax form or receipt gets a brief one without entity/concept synthesis.

One-time GCP setup (required for Drive):

1. Create a project at https://console.cloud.google.com and enable the Google Drive API.
2. Create OAuth credentials → "Desktop app" → download the JSON.
3. Set MNEXA_GOOGLE_CLIENT_ID and MNEXA_GOOGLE_CLIENT_SECRET in your .env.
4. On the OAuth consent screen, set User Type = External, Publishing status = Testing, scope = drive.readonly, and add yourself as a test user.

First Drive ingest opens a browser for OAuth; the refresh token is cached at ~/.config/mnexa/google-token.json and used silently after that.

Granola

Granola meeting notes work the same way: same mnexa ingest command, transport hidden. Auth is just a Bearer token — no OAuth dance.

Setup:

1. Generate a personal API key at https://app.granola.ai (Business or Enterprise plan required; this is a Granola-side limitation).
2. Set GRANOLA_API_KEY in your .env.
3. Run mnexa ingest granola://note/not_<14-char-id> to ingest one meeting, or mnexa ingest granola to walk your entire notes list. Granola's web share URLs (notes.granola.ai/d/<uuid>) use a different identifier than the API, so you need the not_* note ID, not the share URL.

The big win for this source type is that participants become entity pages. After 30 ingested meetings, entities/alice-smith.md synthesises every topic you've discussed with her, with verifiable quotes from the transcripts. That's exactly what the wiki pattern is for.

Frontmatter on a Granola-sourced page:

type: source
slug: 2026-04-15-design-review
source_path: granola://not_1d3tmYTlCICgjy
granola_note_id: not_1d3tmYTlCICgjy
granola_created: "2026-04-15T14:00:00Z"
granola_updated: "2026-04-15T15:30:00Z"
granola_url: https://notes.granola.ai/d/<uuid>
attendees: ["Alice Smith", "Bob Jones"]
granola_folders: ["Engineering"]

mnexa ingest granola is idempotent — it walks the notes list, reads existing source-page frontmatter, and skips notes whose granola_updated matches. Use --since YYYY-MM-DD to only fetch notes updated after a given date.
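The skip logic described above can be sketched as follows. The shape of the notes listing (dicts with `id` and `updated`), the function name, and the treatment of `--since` as inclusive of the given day are assumptions for illustration, not Granola's API or Mnexa's code. The note IDs are made up.

```python
from datetime import date, datetime
from typing import Optional


def notes_to_fetch(listing: list, known: dict, since: Optional[date] = None) -> list:
    """Return IDs of notes that are new or whose update timestamp changed.

    `listing` holds dicts with "id" and "updated" (ISO 8601) keys.
    `known` maps granola_note_id to the granola_updated value read from
    existing source-page frontmatter.
    """
    pending = []
    for note in listing:
        updated = datetime.fromisoformat(note["updated"].replace("Z", "+00:00"))
        if since is not None and updated.date() < since:
            continue  # older than the --since cutoff
        if known.get(note["id"]) == note["updated"]:
            continue  # already ingested at this revision
        pending.append(note["id"])
    return pending


listing = [
    {"id": "not_a", "updated": "2026-04-15T15:30:00Z"},
    {"id": "not_b", "updated": "2026-03-01T09:00:00Z"},
    {"id": "not_c", "updated": "2026-04-20T10:00:00Z"},
]
known = {"not_a": "2026-04-15T15:30:00Z"}
print(notes_to_fetch(listing, known, since=date(2026, 4, 1)))
```

Because the comparison uses the timestamp stored in frontmatter, running the same command twice in a row fetches nothing the second time.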

LLM

Provider-agnostic via a small LLMClient protocol. v0 ships Google Gemini (default gemini-3-flash-preview). Set MNEXA_MODEL to any gemini-* model; set MNEXA_PROVIDER to override the auto-inference. Adding Anthropic or OpenAI is ~80 lines plus an extras entry — not shipped because no one needs it yet.
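A structural protocol of this kind might look like the sketch below. Only the name LLMClient comes from the text above; the `complete` method, its parameters, and the `EchoClient` stand-in are hypothetical and do not describe Mnexa's real interface.

```python
from typing import Protocol


class LLMClient(Protocol):
    """Anything with a matching `complete` method satisfies the protocol."""

    def complete(self, system: str, prompt: str) -> str:
        ...


class EchoClient:
    """Stand-in provider for tests; it makes no network call."""

    def complete(self, system: str, prompt: str) -> str:
        return f"[{system}] {prompt}"


def summarize(client: LLMClient, text: str) -> str:
    """Caller code depends only on the protocol, not on a concrete provider."""
    return client.complete("Summarize briefly.", text)


print(summarize(EchoClient(), "hello"))
```

Since the protocol is structural, a new provider does not need to inherit from anything; it only needs to expose the same method.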

Status


| Feature | Status |
| --- | --- |
| `mnexa init` | ✅ |
| `mnexa ingest` (local file / folder) | ✅ `.md`, `.txt`, `.pdf`, `.docx` |
| `mnexa ingest` (Google Drive file / folder) | ✅ adaptive-depth, idempotent re-sync |
| `mnexa ingest` (Granola meeting notes) | ✅ single note or full list, incremental via `--since` |
| `mnexa query` | ✅ |
| `mnexa lint` | ✅ |
| `mnexa lint --fix` | not yet (v0.1) |
| save query answer as wiki page | not yet (v0.1) |
| Anthropic / OpenAI providers | not yet |
| Notion / other sources | planned |

Develop

uv sync --all-extras
uv run pytest         # 54 tests
uv run ruff check .
uv run pyright        # strict

Prompts live as files in src/mnexa/prompts/ and load via importlib.resources. Edit them, rerun, iterate.
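Loading a packaged prompt through importlib.resources typically looks like the sketch below. The package name `mnexa.prompts` is inferred from the src/mnexa/prompts/ path, and both the helper name and the prompt file name are hypothetical.

```python
from importlib import resources


def load_prompt(name: str, package: str = "mnexa.prompts") -> str:
    """Read a prompt file shipped inside a package, installed or in development."""
    return resources.files(package).joinpath(name).read_text(encoding="utf-8")


# Example call, assuming a prompt file with this (hypothetical) name exists:
# analyze_prompt = load_prompt("analyze.md")
```

Reading through importlib.resources rather than a path relative to the working directory means the same call works from a source checkout and from an installed wheel.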

Design notes

Pure markdown is the canonical store. No SQLite, no vector index, no FTS5. Karpathy's gist argues index.md is enough at moderate scale; we believe it until measurements say otherwise.
Two-stage ingest is borrowed from nashsu/llm_wiki; the deterministic-then-LLM lint tier is borrowed from SamurAIGPT/llm-wiki-agent. The substring-grounding verifier is novel — neither reference project does it.
Atomic-ish writes via stage-then-rename + git checkout HEAD -- rollback on failure. The git commit is the durability barrier.
Gemini context caching is a no-op at our schema size (~3k tokens, below the threshold). The protocol still expresses intent so other providers can honor it.
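
The stage-then-rename step mentioned in the notes above can be sketched as follows. This is a generic pattern, not Mnexa's code: the function name and the temp-file naming are assumptions, and the git rollback and commit are not shown.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, text: str) -> None:
    """Write to a temp file beside the target, then rename it into place."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stage in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=".stage-", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace swaps the file in one step, overwriting any old version.
        os.replace(tmp_name, target)
    except BaseException:
        # Leave no staged debris behind if anything failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A reader of the target path sees either the old page or the new one, never a half-written file; a failure across several pages is what the git rollback covers.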
