Releases: Mibayy/token-savior

v4.0.0 — single 'optimized' profile, 97.9% @ −80% tokens

18 May 10:53

What's new

v4.0 simplifies Token Savior to a single recommended profile that just works.

pip install "token-savior-recall[mcp]"
{
  "mcpServers": {
    "token-savior-recall": {
      "command": "/path/to/venv/bin/token-savior",
      "env": {
        "TOKEN_SAVIOR_PROFILE": "optimized",
        "WORKSPACE_ROOTS": "/path/to/project"
      }
    }
  }
}

That's it. The optimized profile bundles tiny_plus, thin schemas, and disabled capture.

Bench results

| | Plain Opus 4.7 | Token Savior v4.0 |
|---|---|---|
| Score | 78.3% | 97.9% (188/192) |
| Active tokens / task | 17 221 | 3 395 (−80%) |
| Wall time / task | 110.6 s | 18.9 s (−83%) |
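
The percentages follow directly from the raw figures. A quick check, using only the numbers reported in the results above (the helper name `pct_change` is just for illustration):

```python
def pct_change(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100


score = 188 / 192 * 100                    # share of benchmark points earned
tokens_delta = pct_change(17_221, 3_395)   # active tokens per task
wall_delta = pct_change(110.6, 18.9)       # wall time per task, in seconds

print(f"score {score:.1f}%  tokens {tokens_delta:.0f}%  wall {wall_delta:.0f}%")
# prints: score 97.9%  tokens -80%  wall -83%
```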

Honest correction vs v3.5

v3.5 framed the ts CLI as the headline ("drop the MCP"). The Run D mini-bench showed that CLI mode actually uses 337% more tokens than MCP on Claude Code: Bash wrapping and the lack of a protocol-native compact stub eat the gain.

The v4.0 pitch is honest: MCP is the win on Claude Code. The CLI is kept as a portability utility for agents that don't speak MCP (Cursor, Aider, scripts), nothing more.

Migration from v3.x

  • TS_PROFILE=tiny_plus + TS_THIN_SCHEMAS=1 → simply TOKEN_SAVIOR_PROFILE=optimized
  • tiny_plus / tiny / lean / ultra / code_mode still work (legacy aliases)
  • No breaking change in the dispatcher itself
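
For anyone scripting the migration, the first bullet can be sketched as a small environment rewrite. `migrate_env` is a hypothetical helper, not something shipped in the package; it only collapses the exact v3.x pair named above and leaves every other variable alone:

```python
def migrate_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env with the v3.x pair collapsed into the v4.0 profile."""
    out = dict(env)
    if out.get("TS_PROFILE") == "tiny_plus" and out.get("TS_THIN_SCHEMAS") == "1":
        # Both legacy switches are replaced by the single profile variable.
        del out["TS_PROFILE"], out["TS_THIN_SCHEMAS"]
        out["TOKEN_SAVIOR_PROFILE"] = "optimized"
    return out


old = {"TS_PROFILE": "tiny_plus", "TS_THIN_SCHEMAS": "1", "WORKSPACE_ROOTS": "/path/to/project"}
print(migrate_env(old))
# prints: {'WORKSPACE_ROOTS': '/path/to/project', 'TOKEN_SAVIOR_PROFILE': 'optimized'}
```

Legacy profile names are left untouched here because they keep working as aliases.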

PyPI: https://pypi.org/project/token-savior-recall/4.0.0/
Bench: https://github.com/Mibayy/tsbench/blob/main/BENCHMARK-SUMMARY.md

v3.5.0 — CLI release (drop the MCP overhead)

18 May 10:20

What's new

Token Savior is now a CLI binary drop-in for any AI coding agent. The MCP server still works (legacy path), but the CLI avoids the ~15k tokens / task structural overhead Claude Code injects with an attached MCP server.

pip install token-savior-recall
ts use /path/to/project
ts get my_function

Bench results (tsbench v3.0)

| | Plain Opus 4.7 | Token Savior v3.5.0 |
|---|---|---|
| Score | 78.3% | 97.9% (188/192) |
| Active tokens / task | 17 221 | 3 395 (−80%) |
| Wall time / task | 110.6 s | 18.9 s (−83%) |

Compared with v2.9 (the Apr 26 record, 192/192): −14% tokens, −29% wall time, −2% score.

Key changes

  • NEW ts CLI binary + daemon mode (Unix socket, 10× faster per-call vs cold fork)
  • NEW TS_THIN_SCHEMAS=1 strips the sub-property descriptions from each inputSchema (−44% manifest)
  • FIX 5/6 memory hooks now respect TS_MEMORY_DISABLE=1 (was leaking ~10k tokens / task)
  • PERF mcp.* imports are now lazy: cold start 1.5 s → 685 ms (−55%)
  • NEW systemd service scripts/ts-daemon.service
  • NEW ts daemon warm preloads the symbol cache for a project
  • DOC README CLI-first, MCP setup moved to Legacy section

Backward compat

The MCP server is unchanged. An existing Claude Code mcp-config still works and benefits from the same internal optimizations. Set TS_PROFILE=tiny_plus, TS_THIN_SCHEMAS=1, TS_CAPTURE_DISABLED=1, and TS_MEMORY_DISABLE=1 in your env to get the numbers above.

Install

pip install token-savior-recall          # CLI mode (recommended)
pip install "token-savior-recall[mcp]"   # CLI + MCP legacy

Full README: https://github.com/Mibayy/token-savior#readme
Benchmark methodology: https://github.com/Mibayy/tsbench/blob/main/BENCHMARK-SUMMARY.md

v3.4.0 — auto profile

16 May 23:55

One profile to rule them all

The new auto profile sizes its manifest from your actual telemetry instead of asking you to pick a hand-tuned static subset.

How it works

  • Essentials (always exposed): switch_project, list_projects, get_git_status, ts_search, ts_execute. ~1 KT.
  • Hot core: top-K tools (default K=10) ranked by your persisted tool_call_counts. Tune K with TS_AUTO_HOT_K.
  • Cold start: falls back to the tiny_plus baseline so the first session is still usable before any telemetry exists.

Resulting manifest: ~15-18 tools / ~2-3 KT, converging to your personal usage after a few sessions.
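
The selection logic described above can be sketched roughly as follows. This is an illustration, not the shipped implementation: `auto_manifest` and its `baseline` parameter are hypothetical, and the sketch assumes the essentials do not consume hot-core slots.

```python
ESSENTIALS = ["switch_project", "list_projects", "get_git_status", "ts_search", "ts_execute"]


def auto_manifest(tool_call_counts: dict[str, int], baseline: list[str], hot_k: int = 10) -> list[str]:
    """Essentials first, then the top-K tools by call count (or the baseline on cold start)."""
    if tool_call_counts:
        # Most-called first; ties broken by name so the result is deterministic.
        ranked = sorted(tool_call_counts, key=lambda t: (-tool_call_counts[t], t))
        hot = [t for t in ranked if t not in ESSENTIALS][:hot_k]
    else:
        # Cold start: no telemetry yet, so fall back to the static baseline.
        hot = [t for t in baseline if t not in ESSENTIALS]
    return ESSENTIALS + hot


counts = {"find_symbol": 9, "get_dependents": 4, "ts_search": 50, "get_function_source": 7}
print(auto_manifest(counts, baseline=["find_symbol"], hot_k=2)[-2:])
# prints: ['find_symbol', 'get_function_source']
```

With the default K=10 this yields 15 tools once telemetry exists, which matches the low end of the range quoted above.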

Use it

export TOKEN_SAVIOR_PROFILE=auto

Profile cleanup

  • core, nav, lean, ultra, tiny, tiny_plus are now deprecated. Setting any of them prints a stderr notice pointing to auto. They will be removed in v4.0.0.
  • full stays as a first-class debug / power-user profile.
  • code_mode keeps its own slot — it's an execution-mode switch (JS sandbox), orthogonal to manifest size.

Numbers

  • Tests: 1476 passed, 2 skipped (+7 new auto-profile cases)
  • 8 profiles → effective 3 (auto, full, code_mode)
  • Manifest with telemetry data: ~2-3 KT vs full ~9 KT

Install

pip install -U token-savior-recall

v3.3.0 — warm worker pool + 6-axis quality pass

16 May 23:25

Highlights

⚡ Warm worker pool for ts_execute — 31× faster

A long-lived Node worker handles all ts_execute calls in sequence, with an isolated vm.createContext() per script. Cold spawn 67.96 ms/call → warm reuse 2.20 ms/call.

🧹 Six quality improvements

  • Test isolation: latency.py writes redirected to per-session temp DB so pytest never pollutes prod memory.db. Fixed 2 memory_viewer FTS flakes.
  • Quieter logs: HuggingFace UserWarning from embeddings.py silenced.
  • Node 24 CI: actions/checkout@v5 + setup-python@v6 — clears deprecation banner.
  • README: full Code Mode section with discovery flow example.
  • set_project_root guard: idempotent on already-registered projects (audit showed 32% of calls did redundant reindex). New force=true flag preserves rebuild path.
  • Process hardening: scripts/preflight.sh (ruff + pytest) mandatory before push.
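
The set_project_root guard above amounts to an idempotent registration with an explicit escape hatch. A minimal sketch, assuming a simple in-memory registry (`ProjectRegistry` and `_build_index` are illustrative names, not Token Savior's internals):

```python
import os


class ProjectRegistry:
    """Hypothetical registry used only to illustrate the guard."""

    def __init__(self) -> None:
        self.index_builds: dict[str, int] = {}  # root -> number of index builds

    def _build_index(self, root: str) -> None:
        # Stand-in for the expensive indexing pass.
        self.index_builds[root] = self.index_builds.get(root, 0) + 1

    def set_project_root(self, path: str, force: bool = False) -> bool:
        """Register path; return True only if an index build actually ran."""
        root = os.path.abspath(path)
        if root in self.index_builds and not force:
            return False  # already registered: skip the redundant reindex
        self._build_index(root)
        return True


registry = ProjectRegistry()
first = registry.set_project_root("demo-project")
second = registry.set_project_root("demo-project")
forced = registry.set_project_root("demo-project", force=True)
print(first, second, forced)
# prints: True False True
```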

Numbers

  • ts_execute latency: 31× faster warm vs cold
  • Tools advertised (full): 68
  • Tests: 1469 passed, 2 skipped

Install

pip install -U token-savior-recall

v3.2.0 — Code Mode complete

16 May 22:54

Code Mode is now fully operational

v3.1.0 shipped the sandbox + ts_execute. v3.2.0 makes the discovery flow + manifest savings work end-to-end.

🎨 Typed facade auto-generated from MCP inputSchema

Every tool's TypeScript signature is now derived from its real inputSchema. Required fields are non-optional, optional fields use ?, enums become string-literal unions, arrays use Array<T>. Before:

find_symbol: (args?: Record<string, unknown>) => Promise<unknown>;

After:

find_symbol: (args?: { name?: string; names?: Array<string>; level?: number; hints?: boolean; project?: string }) => Promise<unknown>;
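
The mapping rules can be illustrated with a short generator. This is a sketch of the rules as stated, not the project's actual code: `ts_type`, `ts_signature`, and the example schema are assumptions, with the schema reconstructed so that it reproduces the "After" signature.

```python
import json
from typing import Any

_PRIMITIVES = {"string": "string", "number": "number", "integer": "number", "boolean": "boolean"}


def ts_type(schema: dict[str, Any]) -> str:
    """Map one JSON Schema fragment to a TypeScript type expression."""
    if "enum" in schema:
        return " | ".join(json.dumps(v) for v in schema["enum"])  # string-literal union
    kind = schema.get("type")
    if kind == "array":
        return f"Array<{ts_type(schema.get('items', {}))}>"
    return _PRIMITIVES.get(kind, "unknown")


def ts_signature(name: str, input_schema: dict[str, Any]) -> str:
    """Render a one-line facade signature for a tool."""
    required = set(input_schema.get("required", []))
    fields = [
        f"{prop}{'' if prop in required else '?'}: {ts_type(sub)}"
        for prop, sub in input_schema.get("properties", {}).items()
    ]
    if not fields:
        return f"{name}: (args?: Record<string, unknown>) => Promise<unknown>;"
    body = "; ".join(fields)
    optional_args = "" if required else "?"  # args may be omitted if nothing is required
    return f"{name}: (args{optional_args}: {{ {body} }}) => Promise<unknown>;"


# Hypothetical schema, shaped to match the signature shown above.
find_symbol_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "names": {"type": "array", "items": {"type": "string"}},
        "level": {"type": "integer"},
        "hints": {"type": "boolean"},
        "project": {"type": "string"},
    },
}
print(ts_signature("find_symbol", find_symbol_schema))
```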

📦 New `code_mode` profile (4 tools, ~1.5 KT manifest)

`TOKEN_SAVIOR_PROFILE=code_mode` exposes only `ts_execute`, `ts_search`, `switch_project`, `list_projects`. −83% manifest tokens vs full. Discovery on demand via ts_search.

🔎 `ts_search(format='ts')`

ts_search now accepts `format='ts'` which returns one-line TS signatures instead of full JSONSchema — roughly half the tokens per match. Auto-selected when profile=code_mode.

Suggested flow under code_mode

```
1. ts_search(query='find symbol then deps', format='ts')
   → returns find_symbol + get_dependents signatures
2. ts_execute(script=`
     const sym = await tools.find_symbol({ name: 'foo' });
     const deps = await tools.get_dependents({ name: sym.symbol });
     return { sym, deps };
   `)
   → 1 round-trip, full result returned
```

Numbers

  • Tools advertised (full profile): 68
  • Tests: 1464 passed, 2 skipped (+5 since v3.1.0)
  • Manifest math: full ~9 KT, lean ~7 KT, tiny_plus ~2.5 KT, code_mode ~1.5 KT

Install

```bash
pip install -U token-savior-recall
```

v3.1.0 — Code Mode + Windows fix + optim pass

16 May 22:46

Highlights

🚀 Code Mode (new!)

A single new tool, ts_execute(script, timeout_ms), runs a JS function body in a Node sandbox with a typed 34-tool facade. Chain await tools.find_symbol(...), await tools.get_function_source(...), and await tools.get_dependents(...) in one round-trip instead of three. Adapted from Cloudflare's Code Mode for MCP pattern.

  • Cold subprocess: ~50 ms · 3-tool chain: ~50 ms total
  • Disabled with TS_CODE_MODE_DISABLE=1
  • Allowlist: 34 navigation + graph + edit + git + audit tools

🪟 Windows MCP unblocked (#27)

Every subprocess.run in production code now passes stdin=subprocess.DEVNULL, fixing the IOCP ProactorEventLoop deadlock on Python 3.14 Windows. Credit to @arkancrow for the root-cause analysis.
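
The shape of the fix, for anyone applying the same pattern elsewhere: pass `stdin=subprocess.DEVNULL` so the child never inherits the parent's stdin, which for an MCP server is the protocol pipe. `run_quiet` is a hypothetical wrapper name; the `subprocess` arguments are standard library.

```python
import subprocess
import sys


def run_quiet(cmd: list[str], timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run cmd with stdin detached so the child cannot block on the parent's stdin."""
    return subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,  # the child sees EOF immediately
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )


# The child tries to read all of stdin and gets an empty string right away.
result = run_quiet([sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"])
print(result.stdout.strip())
# prints: ''
```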

🧰 Operational hardening

  • Hooks: 5 sites in memory-pretooluse.sh / memory-userprompt.sh switched to stdin pipe — eliminates 1620 recurring json.JSONDecodeError entries in hook-errors.log.
  • Sessions table: empty-row guard added to session_end. Initial purge cleared 11 582 rows of bench noise (11 612 → 30, DB 8.6 → 6.2 MB).
  • Latency tracer: new tool_latency SQLite table records per-call (ts, tool, project, duration_ms, status, error_type). Overhead measured at < 1 ms.
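
A minimal sketch of what such a latency tracer can look like. The column names come from the bullet above; the column types, the `ok`/`error` status values, and the `traced` helper are assumptions made for illustration.

```python
import sqlite3
import time
from contextlib import contextmanager

SCHEMA = """
CREATE TABLE IF NOT EXISTS tool_latency (
    ts REAL NOT NULL,
    tool TEXT NOT NULL,
    project TEXT,
    duration_ms REAL NOT NULL,
    status TEXT NOT NULL,
    error_type TEXT
)
"""


@contextmanager
def traced(conn: sqlite3.Connection, tool: str, project: str):
    """Time the wrapped call and record one row, whether it succeeds or raises."""
    start = time.perf_counter()
    status, error_type = "ok", None
    try:
        yield
    except Exception as exc:
        status, error_type = "error", type(exc).__name__
        raise
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        conn.execute(
            "INSERT INTO tool_latency VALUES (?, ?, ?, ?, ?, ?)",
            (time.time(), tool, project, duration_ms, status, error_type),
        )
        conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
with traced(conn, "find_symbol", "demo"):
    pass  # the real tool call would go here
try:
    with traced(conn, "get_dependents", "demo"):
        raise KeyError("unknown symbol")
except KeyError:
    pass
rows = conn.execute("SELECT tool, status, error_type FROM tool_latency ORDER BY rowid").fetchall()
print(rows)
# prints: [('find_symbol', 'ok', None), ('get_dependents', 'error', 'KeyError')]
```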

Numbers

  • Tools advertised: 68 (was 67)
  • Tests: 1459 passed, 2 skipped (was 1450)

Install

pip install -U token-savior-recall

Or with uvx:

uvx token-savior-recall

v2.6.0 — Memory Engine Phase 1+2 + tsbench 97.8%

20 Apr 14:11

tsbench (90 paired tasks · Opus 4.7)

| | Plain Claude | Token Savior | Δ |
|---|---|---|---|
| Score | 120/180 (66.7%) | 176/180 (97.8%) | +31.1pp |
| Active tokens | 1.55M | 821k | −47% |
| Wall time | 166min | 36min | −78% |

W/T/L: 40 / 48 / 2

8 of 11 categories at 100% (audit, bug_fixing, code_generation, code_review, config_infra, documentation, git, refactoring, writing_tests).

Bench-driven fixes

  • CLAUDE_PROJECT_ROOT env auto-promotes active project at boot
  • Explicit project= hint auto-promotes active project on first call
  • TS_WARM_START=1 pre-builds index at server start
  • get_full_context defaults to compact mode (source head 80 lines + names-only deps)
  • Empty-result _suggestion on search_codebase and get_dependents
  • Lower defaults on noisy analyses (analyze_config, find_dead_code, find_semantic_duplicates)
  • lean profile (59 tools) confirmed as bench default

Memory Engine (Phase 1+2)

  • <private> tag stripper, content_hash dedup O(1), ts://obs/{id} citation URIs
  • PreToolUse-Read hook → file-context injection
  • Structured session-end rollup (FTS5, 6 fields)
  • Formalized progressive disclosure (Layer 1/2/3)
  • sqlite-vec hybrid search + RRF fusion (graceful FTS fallback)
  • Web viewer opt-in 127.0.0.1:$TS_VIEWER_PORT (htmx + SSE)
  • LLM auto-extraction PostToolUse (opt-in TS_AUTO_EXTRACT=1)
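
For readers unfamiliar with RRF (reciprocal rank fusion), the idea behind the hybrid search bullet is to merge the FTS and vector rankings by summing 1 / (k + rank) per result. A minimal sketch, assuming the commonly used constant k = 60; `rrf_fuse` and the observation ids are illustrative, and the engine's actual parameters are not stated here.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked id lists into one using reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


fts_hits = ["obs-3", "obs-1", "obs-7"]     # keyword (FTS) ranking
vector_hits = ["obs-1", "obs-9", "obs-3"]  # embedding ranking
print(rrf_fuse([fts_hits, vector_hits]))
# prints: ['obs-1', 'obs-3', 'obs-9', 'obs-7']

# Graceful fallback: with no vector ranking, the FTS order is returned unchanged.
print(rrf_fuse([fts_hits]))
# prints: ['obs-3', 'obs-1', 'obs-7']
```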

Stats

  • Tools: 105 (lean profile: 59)
  • Tests: 1318/1318 ✅
  • Vector search: sqlite-vec + sentence-transformers/all-MiniLM-L6-v2

Install

pip install token-savior-recall==2.6.0

Full changelog: CHANGELOG.md

v2.1.1 — Token Savior Recall

13 Apr 10:49

Highlights

75 MCP tools · 891/891 tests · 97% token reduction · Persistent memory across sessions

This is the recommended release of the v2.1.x line. v2.1.0 was tagged but never published as a Release — use v2.1.1 instead (CI fixes + private assets removed, no functional change).


Phase 2 — Advanced Context Engine

  • Program slicing via backward Data Dependency Graph (Python AST)
    → get_backward_slice(name, variable, line) returns the minimal instructions affecting a variable at a given line. 92% token reduction on debug workflows.
  • Knapsack context packing (greedy fractional, Dantzig 1957)
    → pack_context(query, budget_tokens) returns the optimal symbol bundle within a fixed token budget.
  • PageRank / RWR on the dependency graph (Tong, Faloutsos, Pan 2006)
    → get_relevance_cluster(name, budget) ranks symbols by mathematical relevance, capturing indirect dependencies BFS misses.
  • Markov predictive prefetching with daemon warm cache
    → After each tool call, the next most-likely tool is pre-computed in background. 77.8% prediction accuracy on common chains.
  • Proof-carrying edits via static-analysis EditSafety certificate
    → `verify_edit` and `apply_symbol_change_and_validate` now attach a non-blocking cert: signature preservation, exception unchanged, side-effect unchanged.
  • Semantic AST hash (alpha-conversion + docstring stripping)
    → `find_semantic_duplicates` detects functions equivalent modulo variable renaming. Falls back to text hash on syntax errors.
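
To make the last item concrete, here is a rough sketch of a semantic AST hash: strip the docstring, rename arguments and locally assigned names to canonical placeholders (alpha-conversion), then hash the AST dump. It illustrates the technique only; `semantic_hash` and `_Canonicalize` are hypothetical names and the real implementation may differ in scope and detail.

```python
import ast
import hashlib


class _Canonicalize(ast.NodeTransformer):
    """Rename local names to v0, v1, ... in order of first appearance."""

    def __init__(self, local_names: set[str]) -> None:
        self.local_names = local_names
        self.mapping: dict[str, str] = {}

    def _rename(self, name: str) -> str:
        if name not in self.local_names:
            return name  # globals and builtins keep their identity
        return self.mapping.setdefault(name, f"v{len(self.mapping)}")

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self._rename(node.arg)
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self._rename(node.id)
        return node


def semantic_hash(source: str) -> str:
    """Hash a function modulo variable renaming and docstrings."""
    text_hash = hashlib.sha256(source.encode()).hexdigest()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return text_hash  # fall back to a plain text hash
    if not tree.body or not isinstance(tree.body[0], (ast.FunctionDef, ast.AsyncFunctionDef)):
        return text_hash
    func = tree.body[0]
    if ast.get_docstring(func) is not None:
        func.body = func.body[1:] or [ast.Pass()]
    local_names = {n.arg for n in ast.walk(func) if isinstance(n, ast.arg)}
    local_names |= {
        n.id for n in ast.walk(func) if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }
    func.name = "_"  # the function's own name is not part of its semantics here
    _Canonicalize(local_names).visit(func)
    return hashlib.sha256(ast.dump(func).encode()).hexdigest()


a = "def add(a, b):\n    '''Add two numbers.'''\n    total = a + b\n    return total\n"
b = "def plus(x, y):\n    s = x + y\n    return s\n"
c = "def sub(x, y):\n    s = x - y\n    return s\n"
print(semantic_hash(a) == semantic_hash(b), semantic_hash(b) == semantic_hash(c))
# prints: True False
```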

Phase 1 — Core Optimizations

  • Symbol-level content hashing — 19x reindex speedup on targeted edits
  • 2-level semantic hash (signature + body) — precise breaking change detection
  • Conversation Symbol Cache (CSC) — 93% token savings on re-accessed symbols
  • Lattice of Abstractions L0→L3 — 94-97% compression vs full source

Memory Engine

  • 16 memory tools, 8 lifecycle hooks, 12 observation types
  • LRU scoring: `0.4 × recency + 0.3 × access + 0.3 × type_priority`
  • Delta injection at SessionStart — only changed observations are re-injected (50-70% savings vs full refresh)
  • Auto-promotion (note × 5 → convention, warning × 5 → guardrail)
  • Contradiction detection, auto-linking, TTL per type
  • Mode system (`code` / `review` / `debug` / `infra` / `silent`) with auto-detection
  • CLI `ts memory {status,list,search,get,save,top,why,doctor,relink}`
  • Telegram feed for critical observations (optional)
  • Markdown export + git versioning
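
The scoring formula in the list above is a plain weighted sum. A small sketch, assuming each component is already normalized to the range 0 to 1 (the normalization itself is not described here, and `observation_score` is an illustrative name):

```python
import math


def observation_score(recency: float, access: float, type_priority: float) -> float:
    """Weighted score from the formula above; inputs are assumed to lie in [0, 1]."""
    return 0.4 * recency + 0.3 * access + 0.3 * type_priority


# A fresh, rarely read guardrail versus a stale but frequently read note.
fresh_guardrail = observation_score(recency=1.0, access=0.1, type_priority=1.0)
stale_note = observation_score(recency=0.1, access=1.0, type_priority=0.2)
print(fresh_guardrail > stale_note)
# prints: True
```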

Refactor

  • `_build_line_offsets` extracted to `models.py` as shared helper (15 annotators dedup'd)

Manifest optimization

  • 80 → 75 tools (-6%), 42K → 36K chars (-14%), ~1500 tokens/session saved

v2.1.1 patch (vs v2.1.0)

  • Resolve 6 ruff errors (5 E731 lambda assignments inlined in `_warm_cache_async`, 1 F401 unused import)
  • Remove `assets/` directory from repo (private marketing assets accidentally committed) and add to `.gitignore`
  • No functional change vs v2.1.0

Install

```bash
uvx token-savior-recall
```

Or development install: see README.

v1.0.0 — Production Stable

11 Apr 15:35

Token Savior v1.0.0 -- Release Notes

25 files changed, 217 symbols affected. 865 tests passing.

Architecture overhaul

The server has been restructured from a monolithic 2,400-line server.py into focused modules:

  • tool_schemas.py -- all 53 MCP tool schemas extracted (server.py reduced to 1,002 lines)
  • cache_ops.py -- CacheManager class for persistent JSON cache (save, load, legacy migration)
  • slot_manager.py -- SlotManager + _ProjectSlot for multi-project lifecycle
  • brace_matcher.py -- shared find_brace_end_* for C, C#, Rust, Go annotators
  • query_api.py -- ProjectQueryEngine class (22 methods + as_dict()) replaces 705-line closure

Performance

  • LazyLines: file content lazy-loaded from disk on demand instead of stored in cache. Cache size reduced ~57%, idle RAM reduced proportionally.
  • Manual serialization: CacheManager.index_to_dict() does zero-copy field-by-field serialization instead of dataclasses.asdict().
  • scandir batching: _check_mtime_changes uses os.scandir() per directory.
  • Regex cache: module-level _WORD_BOUNDARY_CACHE avoids recompiling patterns.
  • File limits: ProjectIndexer gains max_files param (env: TOKEN_SAVIOR_MAX_FILES, default 10,000).
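
As an illustration of the LazyLines idea (keep only the path in the cache and read the file on first access), here is a minimal sketch. The class name matches the bullet above, but the body is an assumption about how such a wrapper could be written, not the project's code.

```python
import os
import tempfile
from collections.abc import Sequence


class LazyLines(Sequence):
    """Sequence of a file's lines that touches the disk only on first access."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._lines: list[str] | None = None

    def _load(self) -> list[str]:
        if self._lines is None:
            with open(self._path, encoding="utf-8", errors="replace") as fh:
                self._lines = fh.read().splitlines()
        return self._lines

    def __getitem__(self, index):
        return self._load()[index]

    def __len__(self) -> int:
        return len(self._load())

    @property
    def loaded(self) -> bool:
        return self._lines is not None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.py")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("a\nb\nc\n")
    lines = LazyLines(path)
    loaded_before = lines.loaded  # nothing read yet
    second_line = lines[1]        # first access triggers the read
    loaded_after = lines.loaded
    line_count = len(lines)

print(loaded_before, second_line, loaded_after, line_count)
# prints: False b True 3
```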

Bug fixes

  • Path traversal: create_checkpoint validates paths with os.path.commonpath.
  • Triple save: _dirty flag pattern ensures _save_cache called at most once per execution path.
  • Output truncation: get_dependents and get_change_impact gained max_total_chars (default 50,000).
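
The path traversal check can be sketched with `os.path.commonpath`: resolve both paths, then require that their common prefix is the root itself. `is_within` is a hypothetical helper; how create_checkpoint wires this in is not shown in these notes.

```python
import os
import tempfile


def is_within(root: str, candidate: str) -> bool:
    """True if candidate, resolved relative to root, stays inside root."""
    root_real = os.path.realpath(root)
    candidate_real = os.path.realpath(os.path.join(root_real, candidate))
    try:
        return os.path.commonpath([root_real, candidate_real]) == root_real
    except ValueError:
        return False  # e.g. paths on different drives on Windows


root = tempfile.mkdtemp()
inside = is_within(root, os.path.join("src", "app.py"))
escaped = is_within(root, os.path.join("..", "outside.txt"))
absolute = is_within(root, "/etc/passwd")
print(inside, escaped, absolute)
# prints: True False False
os.rmdir(root)
```

Resolving with `realpath` first means symlinks and `..` segments are collapsed before the comparison, so a prefix-style string match cannot be fooled.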

Tool fusions

  • get_changed_symbols: now accepts optional ref parameter (replaces get_changed_symbols_since_ref)
  • apply_symbol_change_and_validate: now accepts rollback_on_failure parameter (replaces apply_symbol_change_validate_with_rollback)

Deprecated (removal in v1.1.0)

| Deprecated tool | Use instead |
|---|---|
| get_changed_symbols_since_ref | get_changed_symbols(ref=...) |
| apply_symbol_change_validate_with_rollback | apply_symbol_change_and_validate(rollback_on_failure=true) |

Both inject a _deprecated field in their response with migration instructions. Their schemas are marked "deprecated": true.

Annotator refactoring

  • annotate_rust: 6 extracted handlers (_handle_rust_impl, _handle_rust_macro, _handle_rust_struct, _handle_rust_enum, _handle_rust_trait, _handle_rust_fn). Complexity dropped from 211 to under 150.
  • annotate_csharp: 8 extracted handlers (_handle_csharp_namespace, _handle_csharp_type, _extract_type_methods, etc.). Complexity dropped from 201 to under 150.
  • AnnotatorProtocol: typing.Protocol + runtime_checkable for annotator type safety.
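
For context on the last bullet, `typing.Protocol` combined with `runtime_checkable` allows structural `isinstance` checks. The sketch below assumes a single `annotate` method; the real AnnotatorProtocol's members are not listed in these notes, so the method name, its signature, and both example classes are illustrative.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class AnnotatorProtocol(Protocol):
    """Assumed shape: anything with an annotate(source) method qualifies."""

    def annotate(self, source: str) -> list[str]: ...


class RustFnAnnotator:
    """Toy annotator that lists lines starting a Rust function."""

    def annotate(self, source: str) -> list[str]:
        return [line.strip() for line in source.splitlines() if line.lstrip().startswith("fn ")]


class NotAnAnnotator:
    pass


conforms = isinstance(RustFnAnnotator(), AnnotatorProtocol)
rejected = isinstance(NotAnAnnotator(), AnnotatorProtocol)
print(conforms, rejected)
# prints: True False
```

Note that a runtime check only verifies that the method exists; argument and return types are still the job of a static type checker.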

New modules

| Module | Purpose |
|---|---|
| src/token_savior/tool_schemas.py | 53 tool schemas + DEPRECATED_TOOLS set |
| src/token_savior/cache_ops.py | CacheManager (save/load/migrate) |
| src/token_savior/slot_manager.py | SlotManager + _ProjectSlot |
| src/token_savior/brace_matcher.py | Per-language brace matching |

Test coverage

| Suite | Tests |
|---|---|
| test_cache_ops.py | 12 |
| test_slot_manager.py | 13 |
| test_server_integration.py | 5 |
| test_annotator_protocol.py | 4 |
| test_tool_schemas.py | 9 |
| Total project | 865 |

Benchmarks

  • benchmarks/run_benchmarks.py: automated benchmarks on FastAPI + CPython measuring index time, RAM, query response time, and cache size.
  • .github/workflows/benchmark.yml: GitHub Action for release benchmarks.

v0.9.0 — Initial release

05 Apr 23:43

First public release of Token Savior — structural code intelligence MCP server.

What's included

  • Symbol-level code navigation (get_function_source, get_class_source, find_symbol)
  • Dependency graphs (get_dependencies, get_dependents, get_change_impact)
  • Git-aware structural diff (get_changed_symbols, build_commit_summary)
  • Dead code detection and hotspot analysis
  • Multi-project support with sub-millisecond queries
  • 87% token reduction on large codebases