Warning
Katalyst is in its earliest days. I'm actively building it in the open, which means things are incomplete, rough in places, and likely to change without notice. APIs, commands, config formats, and concepts can all break between commits. Please don't rely on it for anything important yet. If you have feedback, open an issue, or open a pull request if you have a fix.
Katalyst is a content consistency layer, designed for people and agents who curate persistent memory, wikis, and knowledge bases.
Katalyst gives you and your agent tools to solve problems like these:
- "My agent takes a long time to find things, and sometimes burns a ton of tokens."
- "I've repeatedly told my agent how to organize content and it still gets it wrong."
- "Sometimes when I go back and look, I discover that my agent has completely lost important information."
- "The content in my knowledge base isn't just text. It also includes metadata that I need to be able to categorize, score, filter, sort, etc."
- "My agent is supposed to store and curate notes for me, but I spend way too much time checking its work."
- "I want to change how I'm storing my data, but migration would be a big pain."
If you want to be confident that your content/data/memory is always in good shape, even when it's maintained by sometimes-sloppy agents and sometimes-sloppy humans, then Katalyst is for you.
Need a quick shape profile of an existing wiki before authoring rules? Use
katalyst inspect <path> to report frontmatter, markdown, and filename
conventions as evidence, and, once you've configured collections,
katalyst inspect <collection> to profile a collection's items. The full
command surface (including inspect) is in the
command reference.
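To make the idea of a "shape profile" concrete, here is a minimal Python sketch of the kind of evidence such a report might gather: how many files use each frontmatter key. It is an illustration only, not how katalyst inspect is implemented. The names frontmatter_keys and profile are hypothetical, and the parser assumes simple top-level `key: value` frontmatter lines rather than full YAML.

```python
from collections import Counter


def frontmatter_keys(text):
    """Return top-level keys from a leading '---' frontmatter block.

    Assumption: only simple 'key: value' lines are recognized; this is
    not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Skip blank lines, indented (nested) lines, and comments.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return []  # no closing delimiter: treat as no frontmatter


def profile(docs):
    """docs maps filename -> file text. Count how many files use each key."""
    counts = Counter()
    for text in docs.values():
        counts.update(set(frontmatter_keys(text)))
    return {"files": len(docs), "key_counts": dict(counts)}


# Hypothetical in-memory sample standing in for files on disk.
sample = {
    "alpha.md": "---\ntitle: Alpha\ntags: [a]\n---\n# Alpha\n",
    "beta.md": "---\ntitle: Beta\n---\n# Beta\n",
    "notes.md": "# No frontmatter here\n",
}
report = profile(sample)
print(report)
```

A report like this shows which fields are already near-universal and which are sporadic, which is useful evidence when deciding what to require.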
Tip
New to Katalyst? Get started: install the CLI, scaffold a .katalyst/ project, and run your first checks in a few minutes.
Katalyst comes with tools and skills to take stock of your content, no matter what state it's in today. It can help you (and your agents) figure out what you've got, map out the important concepts, and, if needed, get more organized.
Compared to having an LLM scan every file or write its own bash scripts, this approach can save a ton of tokens. It also lets you take advantage of skills, tools and strategies curated by a community of people who've faced similar challenges.
Curation always requires shared language and consistent structure. Katalyst provides tools for declaring the structure and rules that govern the content in your knowledge base.
- Markdown content: required sections, naming conventions, templates, etc.
- File structure: naming conventions, preferred and required extensions, directory structures, etc.
- Metadata: required fields, types, enums, numeric ranges, and full JSON Schema validation of frontmatter.
- Object relationships: links, summaries, tables of contents, sequential numbering, etc.
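As a rough illustration of the metadata checks listed above (required fields, types, enums, numeric ranges), the following Python sketch validates a parsed frontmatter dictionary against a small rule table. The rule format, the RULES table, and the check_metadata function are all assumptions made for this example; they are not Katalyst's schema format or API.

```python
def check_metadata(meta, rules):
    """Return a list of problems; an empty list means the metadata passes."""
    problems = []
    for field, rule in rules.items():
        if field not in meta:
            if rule.get("required", False):
                problems.append(f"{field}: missing required field")
            continue
        value = meta[field]
        expected = rule.get("type")
        if expected is not None and not isinstance(value, expected):
            problems.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
            continue  # range and enum checks are meaningless on the wrong type
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{field}: {value!r} not in {sorted(rule['enum'])}")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{field}: {value} below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{field}: {value} above maximum {rule['max']}")
    return problems


# Hypothetical rules for a notes collection.
RULES = {
    "title": {"required": True, "type": str},
    "status": {"required": True, "type": str, "enum": ["draft", "published"]},
    "score": {"type": int, "min": 0, "max": 10},
}

good = {"title": "Alpha", "status": "draft", "score": 7}
bad = {"status": "archived", "score": 42}

print(check_metadata(good, RULES))
print(check_metadata(bad, RULES))
```

Returning a list of messages instead of raising on the first failure lets a human or an agent see every problem in one pass.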
As your content evolves, Katalyst gives you tools to navigate change.
- Add or change checks
- Change the structure of your content
- Change your storage layer
You can run Katalyst as a linter, a CLI, or a server. Use only the infrastructure that you need for your particular use case.
I'm building Katalyst to work with a variety of filesystems and databases. It isn't tied to any one data store.
Similarly, you choose which model to use.
Express the same rules in a project's own vocabulary and conventions.
Ergonomics matter, especially for agents. An agent should be able to read the rules, find what it needs, and extend them without ceremony:
- Speed: fast enough to run on every write.
- Discoverability: an agent can find the schemas and structure on its own.
- Readability: rules and content stay legible to humans and agents alike.
- Extensibility: add new check types as needs grow.
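One common way to get the extensibility described above is a registry of named checks, so that adding a check type means adding one function. The Python sketch below shows that pattern under stated assumptions: the CHECKS registry, the check decorator, and both example checks are hypothetical and do not describe Katalyst's internals.

```python
CHECKS = {}


def check(name):
    """Decorator that registers a check function under a name."""
    def register(fn):
        CHECKS[name] = fn
        return fn
    return register


@check("required_sections")
def required_sections(text, sections):
    """Report headings from `sections` that do not appear in `text`."""
    headings = {
        line.lstrip("#").strip()
        for line in text.splitlines()
        if line.startswith("#")
    }
    return [f"missing section: {s}" for s in sections if s not in headings]


@check("max_lines")
def max_lines(text, limit):
    """Report documents longer than `limit` lines."""
    n = len(text.splitlines())
    return [f"too long: {n} lines (limit {limit})"] if n > limit else []


def run(text, config):
    """config is a list of (check_name, argument) pairs."""
    problems = []
    for name, arg in config:
        problems.extend(CHECKS[name](text, arg))
    return problems


doc = "# Title\n\n## Summary\nShort note.\n"
result = run(doc, [("required_sections", ["Summary", "Sources"]), ("max_lines", 100)])
print(result)
```

Because the configuration refers to checks by name, the same registry can also be listed at runtime, which helps with discoverability.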
Curating your content with the right structure makes it more useful, but it also takes work. Historically, defining the right structure for knowledge was heavy: high-cost, high-risk, and sometimes technically demanding. This was doubly true when changing needs required updates to structure.
As a result, most structured data systems were rigid and hard to change. Most unstructured knowledge bases were either chronically outdated, or very limited in scope.
As AI starts to infuse our work, curating knowledge is going to become even more important, a massive potential unlock for people who want to work more productively and creatively with agents.
What if structure were light: easy to add, easy to maintain, easy to change?
In a world of unbounded creative collaboration with agents, the limiting factor isn't generating new ideas or gathering more information: it's having a shared language and structure to organize what we've learned, and to act on it together.
Install the latest release with Go (1.25+):

    go install github.com/abegong/katalyst@latest

Or build from source:

    git clone https://github.com/abegong/katalyst.git
    cd katalyst
    make build # produces ./katalyst

Then point it at a directory to profile its structure:

    katalyst inspect ./my-knowledge-base

Katalyst ships a family of skills that teach a Claude/Cowork agent to drive the CLI: cataloging content, defining collections and schemas, and deploying enforcement. Each is attached to a GitHub Release as a .skill you install through your client's "Save skill" flow; no clone required. Maintainers package them from source with make skills (writing bin/*.skill); make skill SKILL=<name> packages one.
- Build the CLI and run your first checks.
- Contribute: dev setup, plus how we plan and document changes.
I'm Abe Gong, a technical founder with a deep love for data/ML/AI and open source. I'm the co-creator of Great Expectations, the leading open-source tool for data quality.
I'm fascinated by AI and the way it's changing how we work and collaborate, and I'm building Katalyst in the open to explore it. I take user feedback seriously: open an issue for bugs, questions, or requests, or open a pull request if you have a concrete fix.
More about me: LinkedIn · twitter/x · personal site.
Katalyst is licensed under the Apache License 2.0.