why use many token when few do trick
Before/After · Install · Levels · Skills · Benchmarks · Evals
🪨 Caveman Ecosystem · caveman talk less (you are here) · cavemem remember more · cavekit build better
A Claude Code skill/plugin and Codex plugin that makes agent talk like caveman, cutting ~75% of output tokens while keeping full technical accuracy. Now with 文言文 mode, terse commits / one-line reviews / lifetime stats, and a compression tool that cuts ~46% of input tokens every session.
Based on the viral observation that caveman-speak dramatically reduces LLM token usage without losing technical substance. So we made it a one-line install.
Same fix. 75% less word. Brain still big.
Pick your level of grunt:
Same answer. You pick how many word.
```
┌───────────────────────────────────────┐
│ TOKENS SAVED        ████████   75%    │
│ TECHNICAL ACCURACY  ████████  100%    │
│ SPEED INCREASE      ████████   ~3x    │
│ VIBES               ████████   OOG    │
└───────────────────────────────────────┘
```
- Faster response: less token to generate = speed go brrr
- Easier to read: no wall of text, just answer
- Same accuracy: all technical info kept, only fluff dropped (science say so)
- Save money: 65% mean output reduction across our benchmarks (range 22-87%)
- Fun: every code review become comedy
One line. Detect every agent. Install for each.
```bash
# macOS / Linux / WSL / Git Bash
curl -fsSL https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.ps1 | iex
```

Detects 30+ agents (Claude Code, Gemini CLI, Codex, Cursor, Windsurf, Cline, Copilot, Continue, Kilo, Roo, Augment, Aider Desk, Amp, Bob, Crush, Devin, Droid, ForgeCode, Goose, iFlow, JetBrains Junie, Kiro CLI, Mistral Vibe, OpenHands, opencode, Qwen Code, Qoder, Rovo Dev, Tabnine, Trae, Warp, Replit Agent, Antigravity, …). Runs each one's native install. Skips what you not have. Safe to re-run.
By default the installer wires Claude Code's hooks + statusline + stats badge and registers the caveman-shrink MCP proxy on top of the plugin install. Pass `--minimal` to skip the extras and just install the plugin/extension. Pass `--all` to also drop per-repo rule files into the current directory.
| Flag | What |
|---|---|
| `--all` | Plugin + hooks + statusline + MCP shrink + per-repo rule files in `$PWD`. The full ride. |
| `--minimal` | Plugin/extension only. No hooks, no MCP shrink, no per-repo rules. |
| `--dry-run` | Preview, write nothing |
| `--only <agent>` | One target only (repeatable) |
| `--with-hooks` | Claude Code: also wire standalone hooks + statusline + stats badge. On by default. |
| `--with-mcp-shrink` | Claude Code: register the caveman-shrink MCP proxy via `npx caveman-shrink`. On by default. |
| `--with-init` | Drop always-on rule files into the current repo (Cursor / Windsurf / Cline / Copilot / AGENTS.md). Off by default; turned on by `--all`. |
| `--list` | Print full agent matrix and exit |
| `--force` | Re-run even if already installed |
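Flags compose. A sketch of two common combinations, assuming the standard `bash -s --` argument forwarding for piped scripts (the `claude-code` slug below is illustrative; run with `--list` for the real slug names):

```bash
# Preview the full install (plugin + hooks + statusline + MCP shrink + repo rules), write nothing
curl -fsSL https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh | bash -s -- --all --dry-run

# Install for a single agent only (slug illustrative; see install.sh --list)
curl -fsSL https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh | bash -s -- --only claude-code
```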
`install.sh --help` for full reference.
Manual install per agent:

| Agent | Command |
|---|---|
| Claude Code | `claude plugin marketplace add JuliusBrussee/caveman && claude plugin install caveman@caveman` |
| Gemini CLI | `gemini extensions install https://github.com/JuliusBrussee/caveman` |
| Cursor / Windsurf / Cline / Copilot | `npx skills add JuliusBrussee/caveman -a <cursor\|windsurf\|cline\|github-copilot>` |
| Codex / opencode / Roo / Amp / Goose / Kiro / Augment / Aider Desk / Continue / Kilo / Junie / Trae / Warp / Tabnine / Mistral / Qwen / Devin / Droid / ForgeCode / Bob / Crush / iFlow / OpenHands / Qoder / Rovo Dev / Replit / Antigravity | `npx skills add JuliusBrussee/caveman -a <profile>` (see `install.sh --list` for the full slug list) |
| Anything else (40+ agents) | `npx skills add JuliusBrussee/caveman` (auto-detect) |
Standalone Claude Code hooks (without plugin): `bash <(curl -s https://raw.githubusercontent.com/JuliusBrussee/caveman/main/hooks/install.sh)`. Windows: `irm https://raw.githubusercontent.com/JuliusBrussee/caveman/main/hooks/install.ps1 | iex`. Manual fallback for stubborn Windows envs lives in docs/install-windows.md.
Uninstall: disable the Claude plugin, `gemini extensions uninstall caveman`, or `npx skills remove caveman`.
| Feature | Claude Code | Codex | Gemini CLI | Cursor / Windsurf | Cline / Copilot | Others* |
|---|---|---|---|---|---|---|
| Caveman mode | Y | Y | Y | Y | Y | Y |
| Auto-activate every session | Y | Y¹ | Y | with `--with-init` | with `--with-init` | with `--with-init` |
| `/caveman` command | Y | Y¹ | Y | — | — | — |
| Mode switching (lite/full/ultra) | Y | Y¹ | Y | Y² | — | — |
| Statusline badge | Y | — | — | — | — | — |
| caveman-commit / caveman-review | Y | — | Y | Y | Y | Y |
| caveman-compress / caveman-help | Y | Y³ | Y | Y | Y | Y |
| caveman-stats | Y | — | — | — | — | — |
| cavecrew (subagents) | Y | — | — | — | — | — |
* opencode, Roo, Amp, Goose, Kiro CLI, Augment, Aider Desk, Continue, Kilo, Junie (JetBrains), Trae, Warp, Tabnine, Mistral, Qwen, Devin, Droid, ForgeCode, Bob, Crush, iFlow, OpenHands, Qoder, Rovo Dev, Replit, Antigravity, and more via `npx skills`. AGENTS.md / IDE rule files reach Zed, generic agents, etc. via `--with-init`.

¹ Codex uses `$caveman` instead of `/caveman`. Auto-start ships when you run Codex inside this repo (via `.codex/hooks.json`); for other repos, copy the hook or use `$caveman` manually. ² Mode switching is on-demand via the skill, no slash command. ³ Compress only.
`--with-init` writes `.cursor/rules/caveman.mdc`, `.windsurf/rules/caveman.md`, `.clinerules/caveman.md`, `.github/copilot-instructions.md`, and `AGENTS.md` into the current repo so caveman auto-starts there.
Trigger with:
- `/caveman` (or Codex `$caveman`)
- "talk like caveman"
- "caveman mode"
- "less tokens please"
Stop with: "stop caveman" or "normal mode"
| Level | Trigger | What it do |
|---|---|---|
| Lite | `/caveman lite` | Drop filler, keep grammar. Professional but no fluff |
| Full | `/caveman full` | Default caveman. Drop articles, fragments, full grunt |
| Ultra | `/caveman ultra` | Maximum compression. Telegraphic. Abbreviate everything |
Classical Chinese literary compression: same technical accuracy, but in the most token-efficient written language humans ever invented.
| Level | Trigger | What it do |
|---|---|---|
| Wenyan-Lite | `/caveman wenyan-lite` | Semi-classical. Grammar intact, filler gone |
| Wenyan-Full | `/caveman wenyan` | Full 文言文. Maximum classical terseness |
| Wenyan-Ultra | `/caveman wenyan-ultra` | Extreme. Ancient scholar on a budget |
Level stick until you change it or session end.
| Skill | What |
|---|---|
| `/caveman-commit` | Terse commit messages. Conventional Commits, ≤50 char subject. Why over what. |
| `/caveman-review` | One-line PR comments: `L42: 🔴 bug: user null. Add guard.` No throat-clearing. |
| `/caveman-help` | Quick-reference card. All modes, skills, commands. |
| `/caveman-stats` | Real session token usage + estimated savings + USD. Lifetime aggregation via `--all`, time window via `--since 7d`, tweetable line via `--share`. Reads the Claude Code session JSONL directly, no model-side guessing. Claude Code only. |
| `/caveman:compress <file>` | Rewrites a memory file (e.g. CLAUDE.md) into caveman-speak. Saves backup as `<file>.original.md`. Cuts ~46% of input tokens every session start. Code/URLs/paths preserved byte-for-byte. |
| `cavecrew-investigator/builder/reviewer` | Caveman subagents for Claude Code. Subagent tool output gets injected back into main context; these emit ~60% fewer tokens than vanilla Explore / reviewer agents, so main context lasts longer across long sessions. Investigator (read-only locator, haiku), builder (1-2 file surgical edit, refuses 3+), reviewer (one-line findings, haiku). |
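A few sample invocations, as a sketch using only the flags documented above (combinations not exhaustively tested):

```
/caveman-stats --share        # tweetable savings one-liner
/caveman-stats --all          # lifetime aggregation across sessions
/caveman-stats --since 7d     # only the last 7 days
/caveman:compress CLAUDE.md   # rewrite; backup lands at CLAUDE.md.original.md
```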
Statusline savings badge: on by default. After your first `/caveman-stats` run the statusline appends `[CAVEMAN] ↓ 12.4k` (lifetime tokens saved) and updates every time `/caveman-stats` runs. Don't want it? Set `CAVEMAN_STATUSLINE_SAVINGS=0` to silence.
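For example, to keep the badge off everywhere, assuming the statusline reads the variable from the environment Claude Code launches in:

```bash
# in ~/.bashrc or ~/.zshrc: silence the caveman statusline savings badge
export CAVEMAN_STATUSLINE_SAVINGS=0
```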
| File | Original (tokens) | Compressed (tokens) | Saved |
|---|---|---|---|
| `claude-md-preferences.md` | 706 | 285 | 59.6% |
| `project-notes.md` | 1145 | 535 | 53.3% |
| `claude-md-project.md` | 1122 | 636 | 43.3% |
| `todo-list.md` | 627 | 388 | 38.1% |
| `mixed-with-code.md` | 888 | 560 | 36.9% |
| **Average** | 898 | 481 | 46% |
Full docs: caveman-compress README. Snyk false-positive note.
Stdio proxy that wraps any MCP server, intercepts `tools/list` / `prompts/list` / `resources/list` responses, and compresses the description fields. Code, URLs, paths, identifiers stay byte-for-byte identical.
Published on npm as `caveman-shrink`. V1 does not touch tool-call response bodies or request payloads. Auto-registered by install.sh (use `--minimal` to skip). Full docs: mcp-servers/caveman-shrink/.
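Example wrapper config (from the repo docs), shrinking the filesystem MCP server's listings; the `fs-shrunk` key is just a name:

```json
{
  "mcpServers": {
    "fs-shrunk": {
      "command": "npx",
      "args": ["caveman-shrink", "npx", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
    }
  }
}
```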
Real token counts from the Claude API (reproduce it yourself):
| Task | Normal (tokens) | Caveman (tokens) | Saved |
|---|---|---|---|
| Explain React re-render bug | 1180 | 159 | 87% |
| Fix auth middleware token expiry | 704 | 121 | 83% |
| Set up PostgreSQL connection pool | 2347 | 380 | 84% |
| Explain git rebase vs merge | 702 | 292 | 58% |
| Refactor callback to async/await | 387 | 301 | 22% |
| Architecture: microservices vs monolith | 446 | 310 | 30% |
| Review PR for security issues | 678 | 398 | 41% |
| Docker multi-stage build | 1042 | 290 | 72% |
| Debug PostgreSQL race condition | 1200 | 232 | 81% |
| Implement React error boundary | 3454 | 456 | 87% |
| Average | 1214 | 294 | 65% |
Range: 22%–87% savings across prompts. Two averages appear in this README: 65% is the unweighted mean of the per-prompt percentages above, while pooling the totals gives 1 − 294/1214 ≈ 76%, in line with the ~75% headline.
> [!IMPORTANT]
> Caveman only affects output tokens; thinking/reasoning tokens are untouched. Caveman no make brain smaller. Caveman make mouth smaller. Biggest win is readability and speed; cost savings are a bonus.
A March 2026 paper "Brevity Constraints Reverse Performance Hierarchies in Language Models" found that constraining large models to brief responses improved accuracy by 26 percentage points on certain benchmarks and completely reversed performance hierarchies. Verbose not always better. Sometimes less word = more correct.
Caveman not just claim 75%. Caveman prove it.
The `evals/` directory has a three-arm eval harness that measures real token compression against a proper control: not just "verbose vs skill" but "terse vs skill". Because comparing caveman to verbose Claude conflates the skill with generic terseness. That cheating. Caveman not cheat.
```bash
# Run the eval (needs claude CLI)
uv run python evals/llm_run.py

# Read results (no API key, runs offline)
uv run --with tiktoken python evals/measure.py
```

If caveman save you mass token, mass money, leave mass star. ⭐
Three tools. One philosophy: agent do more with less.
| Repo | What | One-liner |
|---|---|---|
| caveman (you are here) | Output compression skill | why use many token when few do trick: ~75% fewer output tokens across Claude Code, Cursor, Gemini, Codex |
| cavemem | Cross-agent persistent memory | why agent forget when agent can remember: compressed SQLite + MCP, local by default |
| cavekit | Spec-driven autonomous build loop | why agent guess when agent can know: natural language → kits → parallel build → verified |
They compose: cavekit orchestrates the build, caveman compresses what the agent says, cavemem compresses what the agent remembers. Install one, some, or all; each stands alone.
- Revu: local-first macOS study app with FSRS spaced repetition, decks, exams, and study guides. revu.cards
MIT: free like mass mammoth on open plain.

{ "mcpServers": { "fs-shrunk": { "command": "npx", "args": ["caveman-shrink", "npx", "@modelcontextprotocol/server-filesystem", "/path/to/dir"] } } }