Trammel helps AI coding assistants plan, verify, and remember multi-step tasks.
Instead of letting your AI assistant wing it on complex changes, Trammel breaks goals into ordered steps, verifies each one works, learns from failures, and saves successful strategies for reuse. It works with Claude Code, Cursor, and any MCP-compatible editor — or standalone via Python and CLI.
Trammel is a tool for LLMs, not a tool that calls LLMs. It provides the planning discipline that coding assistants need to tackle multi-step tasks reliably.
- Breaks down goals into steps — Analyzes your project's code structure, figures out what depends on what, and creates an ordered plan
- Tries multiple approaches — Explores different strategies in parallel (bottom-up, top-down, risk-first, and more) to find what works best
- Verifies as it goes — In the Python harness, runs your tests after each step in an isolated copy so bad changes don't pile up. In MCP mode, `verify_step` is a static/heuristic check; run your test suite yourself between steps, or call `plan_and_execute` from Python for real isolated test execution.
- Learns from mistakes — Records what failed and why, then blocks the same mistake from happening again
- Remembers what worked — Saves successful strategies as reusable recipes, so similar tasks get solved faster next time
- Coordinates multiple agents — Provides step claiming, dependency tracking, and DAG metrics for multi-agent workflows
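In MCP mode the advice above is to run your test suite yourself between steps. A minimal sketch of such a check follows. `run_tests` is a hypothetical helper written for this example, not part of Trammel, and the command you pass in is whatever your project already uses.

```python
import subprocess
import sys


def run_tests(test_cmd, cwd="."):
    """Run a test command and report whether it passed.

    Hypothetical helper (not a Trammel API): a thin wrapper over subprocess.run.
    """
    completed = subprocess.run(
        test_cmd,
        cwd=cwd,
        capture_output=True,
        text=True,
    )
    return {
        "passed": completed.returncode == 0,
        "returncode": completed.returncode,
        "output": completed.stdout + completed.stderr,
    }


# Stand-in command for illustration; substitute e.g. ["pytest", "-x", "-q"].
result = run_tests([sys.executable, "-c", "print('tests would run here')"])
print("passed" if result["passed"] else "failed")
```

Calling something like this after each edit and stopping on the first failure approximates the per-step check that the Python harness performs in an isolated copy.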
15 languages supported: Python, TypeScript, JavaScript, Go, Rust, C/C++, Java/Kotlin, C#, Ruby, PHP, Swift, Dart, Zig
pip install trammel # core library (no dependencies beyond Python stdlib)
pip install trammel[mcp] # with MCP server for Claude Code / Cursor
Or from source:
git clone https://github.com/IronAdamant/Trammel.git
cd Trammel && pip install -e '.[mcp]'
Add to your .claude/.mcp.json (Claude Code) or equivalent MCP config:
{
"mcpServers": {
"trammel": {
"command": "trammel-mcp",
"args": []
}
}
}
Your AI assistant now has access to 30+ planning tools: decompose goals, create plans, claim steps, verify work, and save recipes. See SYSTEM_PROMPT.md for the full orchestration guide.
python -m trammel "refactor auth module" --root /path/to/project --beams 3
python -m trammel "fix tests" --test-cmd "pytest -x -q"
python -m trammel "explore auth" --dry-run # explore strategies without verification
python -m trammel "fix auth" --root /monorepo --scope services/auth
from trammel import plan_and_execute, explore, synthesize
# Full pipeline: decompose → plan → explore → verify → store recipe
result = plan_and_execute("your goal", "/path/to/project", num_beams=3)
# Explore only (no verification)
strategy = explore("refactor auth", "/path/to/project")
# Save a verified strategy as a reusable recipe
synthesize("refactor auth", verified_strategy)
Trammel treats planning as a structured search problem:
- Decompose — Analyzes imports and builds a dependency graph, then generates ordered steps with rationale. Supports scaffold definitions for new files that don't exist yet.
- Explore — Creates multiple strategy variants (bottom-up, top-down, risk-first, critical-path, cohesion, minimal-change, and more) and runs them in parallel.
- Verify — Applies edits in isolated temp copies and runs tests per-step. Extracts structured failure analysis when something breaks.
- Constrain — Records failure reasons as persistent constraints that prevent the same mistake across sessions.
- Remember — Stores successful strategies as recipes, retrieved later by text similarity + file overlap + success ratio.
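The Remember step ranks recipes by text similarity, file overlap, and success ratio. The sketch below shows one way those three signals could be blended. The function names, the recipe dict keys (`pattern`, `files`, `successes`, `failures`), and the weights are assumptions made for illustration; Trammel's actual scoring may differ.

```python
import math
from collections import Counter


def trigrams(text):
    """Count character trigrams of a lowercased, whitespace-normalized string."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(files_a, files_b):
    """Jaccard overlap between two collections of file paths."""
    a, b = set(files_a), set(files_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def score_recipe(goal, files, recipe, weights=(0.5, 0.3, 0.2)):
    """Blend the three signals; the weights are illustrative assumptions."""
    w_text, w_files, w_success = weights
    attempts = recipe["successes"] + recipe["failures"]
    success_ratio = recipe["successes"] / attempts if attempts else 0.0
    return (
        w_text * cosine(trigrams(goal), trigrams(recipe["pattern"]))
        + w_files * jaccard(files, recipe["files"])
        + w_success * success_ratio
    )
```

A recipe whose goal text, files, and track record all match the query scores close to 1.0 under these weights, while an unrelated, untried recipe scores 0.0.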
All state lives in a local SQLite database (trammel.db) — plans, steps, constraints, recipes, and telemetry.
Trammel doesn't require MCP. The same capabilities are available through multiple surfaces:
| Surface | Best for |
|---|---|
| MCP server (`trammel-mcp`) | Claude Code, Cursor, and other MCP-aware editors |
| Python API (`plan_and_execute`, `explore`, etc.) | Scripts, CI pipelines, custom orchestrators |
| CLI (`python -m trammel`) | Shell automation, quick one-off plans |
| SQLite (`trammel.db`) | Direct queries, external dashboards, cross-process coordination |
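For the SQLite surface, a dashboard can open the database read-only and inspect it with the standard library. The sketch below counts rows per table by reading `sqlite_master`, so it assumes nothing about column names. Both helper names are hypothetical, and the `trammel.db` path is simply the default file name mentioned above.

```python
import sqlite3


def open_read_only(db_path):
    """Open a SQLite file read-only so a dashboard cannot modify planner state."""
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Usage (assumes a trammel.db file exists in the working directory):
#   conn = open_read_only("trammel.db")
#   print(table_row_counts(conn))
```

Opening with `mode=ro` keeps an external reader from taking write locks, which matters when agents are coordinating through the same file.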
When decomposing with a scaffold, Trammel returns DAG metrics for dispatching work across multiple agents:
| Metric | What it tells you |
|---|---|
| `max_parallelism` | Peak number of agents you can run at once |
| `layer_widths` | How many files can be worked on per round (e.g. `[7, 12, 6, 10, 3, 2]`) |
| `critical_path_length` | The longest chain, i.e. your minimum number of rounds |
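To make the three metrics concrete, here is a sketch that derives them from a plain dependency mapping by layering the graph with Kahn's algorithm. `dag_metrics` and its input shape are assumptions for illustration, and the sketch takes `critical_path_length` to be the number of layers; Trammel's own computation may differ in detail.

```python
from collections import defaultdict, deque


def dag_metrics(deps):
    """Compute layer-based metrics for a dependency mapping.

    deps maps each file to the files it depends on, e.g. {"b.py": ["a.py"]}.
    Raises ValueError if the graph contains a cycle.
    """
    nodes = set(deps) | {d for ds in deps.values() for d in ds}
    dependents = defaultdict(list)
    remaining = {}
    for node in nodes:
        node_deps = set(deps.get(node, ()))
        remaining[node] = len(node_deps)
        for dep in node_deps:
            dependents[dep].append(node)

    # Kahn's algorithm: a node's layer is one more than its deepest dependency.
    layer = {node: 0 for node in nodes if remaining[node] == 0}
    queue = deque(layer)
    processed = 0
    while queue:
        current = queue.popleft()
        processed += 1
        for child in dependents[current]:
            layer[child] = max(layer.get(child, 0), layer[current] + 1)
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    if processed != len(nodes):
        raise ValueError("dependency cycle detected")

    widths = [0] * (max(layer.values(), default=-1) + 1)
    for depth in layer.values():
        widths[depth] += 1
    return {
        "layer_widths": widths,
        "max_parallelism": max(widths, default=0),
        "critical_path_length": len(widths),
    }
```

For a diamond-shaped graph (`a` first, then `b` and `c`, then `d`), this yields layer widths `[1, 2, 1]`: at most two agents can work at once, and at least three rounds are needed.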
Agents coordinate via claim_step / release_step / available_steps. Claims auto-expire after 10 minutes for stale agent recovery.
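The expiry rule can be illustrated with an in-memory model. This is a sketch of the claim semantics only, not Trammel's implementation: `try_claim` is a hypothetical name, and a plain dict stands in for the claim data that Trammel keeps in its SQLite store.

```python
import time

CLAIM_TTL_SECONDS = 10 * 60  # the 10-minute expiry described above


def try_claim(claims, step_id, agent_id, now=None):
    """Claim a step unless another agent holds a non-expired claim.

    claims maps step_id -> (agent_id, claimed_at); hypothetical in-memory model.
    """
    now = time.time() if now is None else now
    holder = claims.get(step_id)
    if holder is not None:
        claimed_by, claimed_at = holder
        still_valid = now - claimed_at < CLAIM_TTL_SECONDS
        if still_valid and claimed_by != agent_id:
            return False
    # Either unclaimed, expired, or a renewal by the same agent.
    claims[step_id] = (agent_id, now)
    return True
```

Because a stale claim is simply overwritten by the next agent that asks, a crashed agent never needs explicit cleanup.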
Trammel works standalone. When co-installed with Stele (context retrieval) and Chisel (code analysis), all three cooperate through the MCP tool layer — no cross-dependencies between packages.
| Tool | Role |
|---|---|
| Stele | Persistent context retrieval and semantic indexing |
| Chisel | Code analysis, churn, coupling, risk mapping |
| Trammel | Planning, verification, failure learning, recipe memory |
Trammel has controls to keep plans accurate and sub-agents aligned:
- `strict_greenfield` — Fails decomposition if a new-work goal has no scaffold, preventing vague improvised plans
- `relevant_only` — Filters steps to only what matters for the goal
- Relevance tiers — Each step gets `high`/`medium`/`low` relevance so prioritization is transparent
Optional project config via pyproject.toml ([tool.trammel] section) or .trammel.json:
[tool.trammel]
default_scope = "src/"
focus_keywords = ["auth", "login"]
max_files = 50
trammel/ Importable package
__init__.py Public API: plan_and_execute, explore, synthesize
core.py Planner: decomposition, constraints, step generation
store.py RecipeStore: SQLite persistence (8 tables), telemetry
store_recipes.py Recipe methods: save, retrieve, list, prune
strategies.py 9 built-in beam strategies
harness.py Execution harness: temp copies, test runner
analyzer_specs.py Declarative regex specs for 13 regex-based analyzers
analyzer_engine.py RegexAnalyzerEngine + import resolvers + backward-compat shims
analyzers.py Python + TypeScript analyzers, language detection
analyzers_ext.py Backward-compat shim (re-exports from analyzer_engine)
analyzers_ext2.py Backward-compat shim (re-exports from analyzer_engine)
project_config.py Config merging (pyproject.toml + .trammel.json)
utils.py Trigrams, cosine, failure extraction, shared helpers
cli.py CLI entry point
mcp_server.py MCP tool schemas and dispatch
mcp_stdio.py MCP stdio server entry point
plan_merge.py Plan merging engine: conflict detection + 4 resolution strategies
store_agents.py Multi-agent coordination: step claiming, availability, proximity warnings
tests/ 361 tests across 6 modules (stdlib unittest)
SYSTEM_PROMPT.md Reference orchestration guide for LLM clients
| Table | Purpose |
|---|---|
| `recipes` | Successful strategies with pattern, constraints, success/failure counts |
| `recipe_trigrams` | Inverted trigram index for fast recipe retrieval |
| `recipe_files` | File paths for structural matching (Jaccard overlap) |
| `plans` | Goals + strategy snapshots + scaffold with step progress |
| `steps` | Work units with dependencies, rationale, verification results |
| `constraints` | Failure records that prevent known-bad repetition |
| `trajectories` | Run logs per beam: outcome, steps completed, failure reason |
| `failure_patterns` | Historical failure signatures with resolution history |
| `usage_events` | Telemetry: tool calls, recipe hit rates, strategy win rates |
Contributions welcome. Please open an issue first to discuss changes.
- Fork the repository
- Create a feature branch (`git checkout -b feature/your-feature`)
- Make your changes (core must remain stdlib-only; tests use `unittest` only)
- Run tests: `python -m unittest discover -q -s tests -p 'test_*.py'`
- Open a pull request
Releases use Trusted Publishing (GitHub OIDC → PyPI). No API tokens needed.
- Bump `version` in `pyproject.toml`
- Run tests, commit, push to `main`
- Tag: `git tag -a vX.Y.Z -m "Release X.Y.Z" && git push origin vX.Y.Z`
- Create GitHub Release (triggers PyPI publish automatically)
Full Changelog
Structural refactor with no public API changes. All existing `from trammel...` imports keep working; re-exports preserve legacy paths.
- Every module now under 500 LOC. 9 files were over the project-convention target before this release (worst at 1223); zero after. New modules created in the split:
  - `recipe_fingerprints.py`, `store_retrieval.py`, `store_scaffolds.py` (extracted from `store_recipes.py`, which drops from 1223 → 373).
  - `store_plans.py`, `store_telemetry.py` (extracted from `store.py`).
  - `tool_schemas.py` (extracted from `mcp_server.py`).
  - `implicit_deps_engines.py`, `pattern_learner.py` (extracted from `implicit_deps.py`).
  - `analyzer_resolvers.py` (extracted from `analyzer_engine.py`).
  - `text_similarity.py`, `scaffold_validation.py` (extracted from `utils.py`).
  - `language_detection.py` (extracted from `analyzers.py`).
  - `scaffold_creation.py` (extracted from `scaffold_logic.py`).
  - `planner_helpers.py` (extracted from `core.py`).
- Pattern tables externalized. `NAMING_CONVENTION_RULES`, `INFRASTRUCTURE_PATTERNS`, file-role patterns, and goal-role patterns now live in `trammel/data/patterns.json` and are loaded via a new `pattern_config.py`. Edit the JSON to tune inference without code changes.
- MCP dispatch integration tests. New `tests/test_mcp_dispatch.py` with 11 tests covering schema↔dispatch parity, category-registry sync, unknown-tool error shape, previously-uncovered handlers (`record_steps`, `claim/release/available_steps`, `merge_plans`, `usage_stats`, `list_strategies`, `resolve_failure`), and schema-driven int coercion.
- Public-API docstrings. Added to `Planner` (class + every method), `ExecutionHarness` (class + `__init__`), `RecipeStore` (class + context-manager lifecycle), every MCP `_handle_*`, the constraint/trajectory methods on the store, and all 11 language-analyzer subclasses.
- Docs parity. "Verifies as it goes" clarified in README (Python harness runs tests; MCP `verify_step` is static/heuristic). `SYSTEM_PROMPT.md` gains an explicit "`decompose` vs `explore` vs `create_plan` — which do I call?" decision block. `pyproject.toml` version bumped to 3.12.0 (closes the drift where README already documented a v3.11.3 entry but the package version still said 3.11.2).
- 406 tests passing (395 pre-refactor + 11 new dispatch tests).
v3.11.3 — Robustness improvements for test-heavy projects, explore fallback, and strategy trajectory data
- `over_constrained` no longer fatal: `validate_scaffold` treats `>4` dependencies as a warning rather than a fatal error. Test files and facades with many dependencies no longer block decomposition.
- Partial-plan recovery: When scaffold validation detects issues (self-referential, missing dependencies, etc.), `decompose` now returns whatever steps can still be inferred from the scaffold instead of an empty plan.
- `explore` fallback mode: If `explore` receives an empty or error result from `decompose`, it automatically retries with `suppress_creation_hints=true` and `expand_repo=false` to produce a usable strategy.
- Auto-persisting strategy trajectories: `complete_plan` now logs a trajectory entry automatically, so `list_strategies` accumulates empirical success/failure data over time.
- Improved MCP tool descriptions: `strict_greenfield` and `suppress_creation_hints` descriptions now explicitly mention test-heavy projects and refactor goals.
- 361 tests passing.
v3.11.2 — Major robustness improvements: scaffold-less decomposition, plan merging, recipe matching, verify_step depth
- Scaffold-less decomposition improvements:
  - Lowered scaffold-recipe lookup threshold (0.15) with auto-inflation for historical scaffold reuse.
  - Expanded architectural template registry with CLI command, middleware pipeline, event-driven, and auth system patterns.
  - Added sibling convention cloning in creation hints (e.g., `UserService.js` → `RecipeService.js`).
  - Capped fallback full-repo expansions to 15 files when no scaffold is provided, preventing the "176 steps" problem.
- Plan merging system:
  - New `merge_plans` MCP tool with four strategies: `sequential`, `interleave`, `priority`, `unified`.
  - Conflict detection engine identifies file overlap, action clashes, dependency inversions, and cycle introduction.
  - `claim_step` now returns proximity warnings when other active plans have pending steps targeting the same file.
- Recipe matching enhancements:
  - Architecture-shape MinHash index: recipes are indexed by their structural role fingerprint, enabling shape-based retrieval (e.g., model→service→route→test).
  - Scaffold-derived structural matching: when `decompose` is called with a scaffold, the scaffold fingerprint is used for recipe comparison.
  - Success-weighted pre-filtering: candidates with zero successes are deprioritized.
  - Technical thesaurus expansion for retrieval (`auth` → `authentication`, `authorization`, `login`, `token`, etc.).
- `verify_step` depth improvements:
  - Python AST pre-flight validation catches `SyntaxError` and possible undefined names before running tests.
  - Import integrity check validates that relative imports resolve to existing files.
  - Symbol reference validation warns when edits remove symbols that project dependents may reference.
  - Expanded `static_analysis` heuristics: mixed indentation, TODO/FIXME density, empty files, duplicate symbol names.
  - Test command dry-run fails fast when the test runner executable is missing.
- 358 tests passing.
v3.11.1 — Review findings: plan validation, ambiguity detection, verify_step static analysis, MinHash recipe retrieval
- `create_plan` pre-flight validation: Added topological cycle detection to `create_plan`. Plans with cyclic step dependencies are rejected with a clear `circular_dependency` error instead of being persisted.
- Ambiguity detection in `decompose`: New `ambiguity` field in `analysis_meta` scores goals for vague phrasing (`real-time`, `AI-powered`, `conflict resolution`, etc.), scope words, and conjunction density. Helps flag underspecified goals before planning.
- `verify_step` static analysis: Step verification now includes a `static_analysis` object with file-path convention checks and test-coverage heuristics (e.g., missing matching test files), surfacing warnings even when tests pass.
- MinHash-enhanced recipe retrieval: `retrieve_best_recipe` and `retrieve_near_matches` now union candidates from both the TF-IDF inverted word index and the MinHash LSH index, improving recall for synonym/reordered goals.
- 346 tests passing.
- core.py modularization: Extracted monolith into `scoring.py`, `scaffold_logic.py`, `goal_nlp.py`, `constraints.py`, `scaffold_templates.py`. Orchestrator reduced from ~1,700 LOC to ~570 LOC.
- Recipe index migration: New `recipe_index.py` with zero-dep inverted word index (TF-IDF) and MinHash LSH for deduplication / approximate nearest neighbors. Schema extended with `recipe_terms` and `recipe_signatures`; integrated into `store_recipes.py` (trigram tables kept for transition safety).
- Analyzer farm collapse: 15 regex analyzers collapsed into declarative `analyzer_specs.py` + single `RegexAnalyzerEngine` in `analyzer_engine.py`. `analyzers_ext.py` / `analyzers_ext2.py` now backward-compat shims. All class names remain unchanged.
- 358 tests passing.
- README: New "Scaffold DAG Metrics for Multi-Agent Dispatch" section documenting `max_parallelism`, `layer_widths`, `critical_path_length`, and `max_dependency_depth` with usage guidance for parallel agent workflows.
- Spec: Added scaffold DAG metrics documentation to §5 Planner.
- RecipeLab Review Ten closure: Zero bugs found. Validated 40-step / 59-edge scaffold decomposition with correct topological ordering across 6 feature trees. Largest scaffold and most complex DAG ever tested — flawless.
- No API changes — documentation update only.
- Bug fix: `create_plan` failed with `"table plans has no column named scaffold"` on existing databases. The schema migration was targeting the wrong table (`steps` instead of `plans`). Fixed in `store.py`.
- Impact: Unblocks full plan execution workflow (`create_plan` → `claim_step` → `record_steps` → `complete_plan`).
- Planner / MCP: `suppress_creation_hints`, `scaffold_dag_metrics`, `skipped_existing_scaffold`; refactor-verb guard for creation inference; `summary_only` surfaces DAG + skip blocks.
- Tests: `tests/test_findings_checklist.py` encodes validation matrix A–C (305 tests). Core remains stdlib-only (zero third-party runtime deps).
- README: Integration surfaces (API / CLI / SQLite vs MCP), roadmap notes, release checklist for Trusted Publishing.
- Wiki / spec: §1.1–1.2 alignment; `SYSTEM_PROMPT.md` tracked in repo (sub-agents without MCP).
- No API changes — doc and packaging metadata refresh for PyPI long description.
- Bug fixes: CLI goal validation (`str(None)` bug), missing `created` column in step queries, mixin stubs now raise `NotImplementedError`, `_sql_in` empty-input guard.
- Dead code: Removed unused `_is_ignored_dir` import from `analyzers_ext2.py`.
- Hardening: Replaced all positional `sqlite3.Row` unpacking with named column access (8 sites). Pre-compiled Rust/Cargo/Maven regex patterns at module level.
- Simplification: Beam-count capping, step description helper, list_strategies comprehension, `_validate_registries()` wrapper.
- Modernization: Walrus operator in `_extract_step_files`, `TYPE_CHECKING` imports, private `_cosine` naming, KeyboardInterrupt handling in MCP stdio.
- 248 tests (all passing).
- DRY step dicts: Extracted `_STEP_COLUMNS` + `_step_to_dict()` in store.py, eliminating duplicated step-dict construction in `get_plan()` and `get_step()`.
- Simplified helpers: `word_jaccard`, `cosine`, `_strip_php_comments` in utils.py condensed (reduced total LOC).
- Consistent isinstance: Fixed `type()` vs `isinstance()` inconsistency in `PythonAnalyzer.collect_typed_symbols`.
- 248 tests (all passing).
- Bug hardening: Fixed potential `ZeroDivisionError` in `explore_trajectories`, `ValueError` with 0 beams, replaced fragile `assert` with `RuntimeError` in MCP server.
- Dead code removal: Removed unused `_default_beam_count`, dead `TYPE_CHECKING: pass` block.
- Simplification: Consolidated 4 SQL COUNT queries into 1, extracted `_sql_in()` helper (deduplicates 8 sites), eliminated redundant dict copies, simplified confusing list-unpacking comprehension.
- Modernization: `asyncio.to_thread` replaces `run_in_executor`, logger configured at module level for early error capture, lazy `_get_analyzer_registry()` replaces late inline imports.
- 248 tests (all passing).
- Deduplication: Extracted shared trigram/file helpers in `store_recipes.py`, deduplicated Swift SPM scanning.
- Modernization: `_count_importers` uses `Counter`, set-based symbol deduplication, f-string continuation.
- Hardening: Narrowed telemetry exception to `sqlite3.Error`, fixed type annotation, named constants for magic numbers.
- Cleanup: Marked unused MCP handler params, fixed Dart import mutual exclusivity.
- 248 tests (all passing).
- Bug fixes: Fixed `_inject_orderings` None-key dict comprehension, recipe mutation in `decompose()`, hardcoded DB path in `synthesize()`, missing `claimed_by`/`claimed_at` in `get_step()`.
- Cross-platform: Normalized Swift analyzer path separator handling.
- 248 tests (all passing).
- Type safety: Added typed attribute stubs to `RecipeStoreMixin` and `AgentStoreMixin` for type-checker compatibility.
- Robustness: `claim_step` rejects non-pending steps; parallel beam fallback narrowed to `OSError` only with debug logging.
- Performance: `run_incremental` optimized from O(K^2) to O(K) via persistent base copy.
- Test coverage: 9 new `collect_typed_symbols` tests (Rust, C++, Java, C#, Ruby, PHP, Swift, Dart, Zig).
- 248 tests (all passing).
- 8 bug fixes: Critical-path cycle leak, `log_event` bare commit, schema migration over-broad catch, inconsistent win-rate formulas, `trigram_signature` fingerprint type bug, Rust cargo relpath bug, Java import comment stripping, PHP grouped-use alias stripping.
- Dead code: Removed `ExecutionHarness.run()` alias.
- Simplification: N+1 query fix in recipe retrieval, SQL-side recipe pruning, `_walk_project_sources` shared generator, `DEFAULT_DB_PATH` constant consolidation.
- Modernization: `_SUPPORTED_LANGUAGES` derived from registry, language/analyzer sync assertion, mid-file imports moved to top, premature Python 3.14 classifier removed.
- 239 tests (all passing).
- Bug fixes: Fixed TOCTOU race in `record_failure_pattern` and `resolve_failure_pattern` (wrapped in transactions). Fixed `JavaAnalyzer.pick_test_cmd` fallback to use system `gradle` instead of nonexistent `./gradlew`.
- Duplication elimination: Extracted `_count_importers()`, `_is_claimed_by_other()`, `_try_resolve()` helpers. `run()` now delegates to `verify_step()`. Removed redundant `_collect_files` wrappers. Merged two transactions in `validate_recipes`.
- Dead code removal: Always-true guard, unnecessary `or ""`, redundant assignment, ineffective deferred import.
- Modernization: `collections.abc.Callable`/`Generator` imports, PEP 561 `py.typed` marker, `get_analyzer` added to `__all__`, `.mypy_cache`/`.ruff_cache` in `.gitignore`.
- Simplifications: Single-pass `_split_active_skipped`, defensive `.get()` for constraint descriptions, simplified test assertions.
- 239 tests (all passing).
- Consolidated comment strippers: removed duplicate `_strip_js_comments` and `_strip_cpp_comments`, all analyzers now use the shared `_strip_c_comments` from utils
- Consistent import analysis: all regex-based analyzers now strip comments before extracting imports (fixes false positives from commented-out imports in Ruby, Dart, Zig, C#, PHP)
- Fixed PHP method pattern overlap: method pattern now requires at least one access modifier (changed `*` to `+`), preventing duplicate symbol entries with the function pattern
- Fixed Dart function false positives: added negative lookahead to exclude control flow keywords (`if`, `for`, `while`, etc.)
- Improved `max()` type safety: `detect_language` extension counting now uses `lambda k: counts[k]` instead of `counts.get`
- Modernized `str.endswith`: uses native tuple form throughout `_collect_*` helpers
- Removed redundant `store` parameter from `Planner.explore_trajectories` (uses `self.store`)
- MCP status handler refactored: now delegates to `RecipeStore.get_status_summary()` instead of raw SQL
- Schema/dispatch sync assertion: module-load assertion ensures `_TOOL_SCHEMAS` and `_DISPATCH` keys stay in sync
- Store improvements: removed unreliable `__del__`, narrowed schema migration exception to `sqlite3.OperationalError`, fixed `list_plans` status filter, fixed `get_strategy_stats` return type, batch-fetched recipe files in `list_recipes` (N+1 fix)
- Fixed float equality: recipe text similarity early-exit now uses `>= 0.9999` instead of `== 1.0`
- Logging setup fix: `mcp_stdio.py` now calls `logging.basicConfig` before server construction
- CLI hardening: added JSON parse error handling for stdin input
- Language-agnostic messages: "No Python symbols" fallback message now says "No symbols found"
- `__main__.py` guard: added `if __name__ == "__main__"` protection
- Multi-agent step coordination: New `claim_step`, `release_step`, and `available_steps` MCP tools (27 tools total). Agents claim steps before working on them; other agents see claimed steps and skip them. Claims auto-expire after 10 minutes (stale agent recovery). `available_steps` returns only steps whose dependencies are satisfied AND aren't claimed by another agent.
- New `store_agents.py` mixin: `AgentStoreMixin` with `claim_step()`, `release_step()`, `get_available_steps()`. `RecipeStore` inherits from both `RecipeStoreMixin` and `AgentStoreMixin`.
- Schema migration: Steps table gains `claimed_by` and `claimed_at` columns (safe migration for existing databases).
- 242 tests (all passing).
- Failure pattern learning: New `failure_patterns` SQLite table (9 tables total) accumulates structured failure signatures across sessions. When a step fails with verification data, the file + error type + message are auto-recorded. Patterns track occurrence count, first/last seen timestamps, and resolution history.
- `failure_history` MCP tool: Query historical failure patterns by file or project-wide. Shows which files fail frequently, what error types occur, and what resolutions worked. Use before modifying a file to avoid known pitfalls.
- `resolve_failure` MCP tool: Record what fixed a known failure pattern. Builds institutional memory of what works for specific error types on specific files. 24 MCP tools total.
- Auto-recording: `update_step` with `status="failed"` automatically extracts failure analysis and records the pattern; no manual instrumentation is needed.
- 242 tests (all passing).
- C++ nested template parsing: Template patterns now handle 2 levels of nesting (`template<typename T, std::vector<int>>`) instead of breaking at the first `>`. Also fixed in Rust `impl<>` and Java/Kotlin generic `fun<>` patterns.
- Rust workspace + relative imports: `analyze_imports` now resolves `use super::`, `use self::`, and workspace crate imports (reads `Cargo.toml` `[workspace] members` and member crate names). Previously only `use crate::` was handled.
- PHP grouped use statements: `use Foo\{Bar, Baz, Qux};` (PHP 7.0+) now correctly expanded and resolved. Previously only simple `use Foo\Bar;` was parsed.
- Swift SPM-aware module mapping: Detects `Sources/<Module>/` and `Tests/<Module>/` directory structure for Swift Package Manager projects. Falls back to parent-directory mapping for non-SPM projects.
- TypeScript monorepo workspace support: Reads `package.json` `workspaces` field (npm, yarn, pnpm patterns), discovers workspace packages, resolves bare imports (`import { x } from '@scope/pkg'`) to workspace package entry points.
- 242 tests (all passing).
- Usage telemetry: New `usage_events` SQLite table (8 tables total) with `log_event()` and `get_usage_stats()` methods on `RecipeStore`. Tool calls, recipe hit/miss rates, and strategy win rates tracked automatically. New `usage_stats` MCP tool (22 tools total) returns aggregated telemetry over a configurable time window.
- Dispatch refactor: Replaced 153-line `match/case` in `mcp_server.py` with dispatch-dict pattern. Each tool has a dedicated handler function, looked up via `_DISPATCH` dict. Adding new tools now requires only: handler function + schema + dict entry.
- Analyzer improvements: PHP class methods now detected (was major gap). Java 16+ `record` keyword supported. Dart factory/named constructors detected.
- Comment stripping for 7 languages: Added `_strip_c_comments` (shared by Go, Rust, Java, C#, Swift, Dart, Zig), `_strip_hash_comments` (Ruby), and `_strip_php_comments` (PHP). Previously only Python (AST), TypeScript, and C/C++ stripped comments before symbol detection.
- 5 new sample repos: Jekyll (Ruby), Flame (Dart), ZLS (Zig), Laravel (PHP), Ktor (Kotlin) cloned to `sample_file_test/` for analyzer validation across all 15 supported languages.
- 242 tests (all passing).
- Strategy module extraction: Beam strategy registry and 9 built-in orderings extracted from `core.py` into new `strategies.py` (~280 LOC). `core.py` reduced from 595 to 324 LOC, well under the 500 LOC limit.
- Eliminated circular dependency workarounds: Moved shared `_collect_symbols_regex` and `_collect_typed_symbols_regex` helpers from `analyzers.py` to `utils.py`. Removed `functools.cache` lazy-import wrappers from `analyzers_ext.py` and `analyzers_ext2.py` (12 lines each), replaced with direct imports.
- DRY regex patterns: Derived `_*_SYMBOL_PATTERNS` from `_*_TYPED_PATTERNS` via `[p for p, _ in _*_TYPED_PATTERNS]` for TypeScript, Rust, Java, C#, Ruby, PHP, Swift, and Dart (8 languages). Eliminates ~70 lines of duplicated regex definitions.
- DRY Python AST walking: Extracted `PythonAnalyzer._iter_ast()` generator, shared by both `collect_symbols` and `collect_typed_symbols`, eliminating duplicated file-walking and AST-parsing code.
- Import ordering fix: Fixed stdlib import ordering in `store_recipes.py` (`import os` moved before local imports).
- All files under 500 LOC: `analyzers.py` reduced from 559 to 463 LOC. Total source: 3,876 to 3,771 LOC (105 lines net reduction through deduplication).
- 242 tests (unchanged, all passing).
- Typed symbol analysis: New `collect_typed_symbols()` method on all 15 analyzers returns symbols with type classification (function, class, interface, enum, struct, trait, etc.).
- 3 new beam strategies (9 total): `leaf_first` (zero-importer files first), `hub_first` (network hub files by in*out degree), `test_adjacent` (files with matching test files first).
- Analyzer fixes: Ruby basename overwriting, Swift overly broad directory mapping, Java packageless file gap.
- Store refactor: `_init_schema()` decomposed into class-level SQL constants.
- 242 tests (12 new).
- Bug fix: `retrieve_best_recipe` scoring bug where `best_score` was updated before JSON validation, so corrupted entries could shadow valid recipes.
- Dead code removed: unused `total_files` variable in `estimate` tool, redundant `get_analyzer` re-import.
- Code simplification: `detect_language` replaced 8 extension alias constants and 12-branch if/elif with data-driven loop. Lambda replaced with `def` in `_detect_from_config`. Redundant `DartAnalyzer.pick_test_cmd` branch removed.
- Performance: eliminated double file reads in C#/PHP analyzers; Go analyzer reduced from two walks to one; PHP namespace lookup optimized from O(n) to O(1).
- Modernization: added return type annotations to `_get_collect_symbols_regex`. Moved `_SUPPORTED` to module-level `_SUPPORTED_LANGUAGES` frozenset. Simplified `__main__.py`.
- 230 tests (unchanged).
- Six new language analyzers: `CSharpAnalyzer` (.cs), `RubyAnalyzer` (.rb), `PhpAnalyzer` (.php), `SwiftAnalyzer` (.swift), `DartAnalyzer` (.dart), `ZigAnalyzer` (.zig) in new `analyzers_ext2.py` (~480 LOC). Total: 15 supported languages (Python, TypeScript, JavaScript, Go, Rust, C/C++, Java/Kotlin, C#, Ruby, PHP, Swift, Dart, Zig).
- Config-file detection expanded: Package.swift (swift), build.zig (zig), pubspec.yaml (dart), .csproj/.sln (csharp), Gemfile (ruby), composer.json (php).
- Extension counting expanded: `.cs`, `.rb`, `.php`, `.swift`, `.dart`, `.zig` all counted in `detect_language` fallback.
- Registry expanded: 15 languages in `_ANALYZER_REGISTRY`. MCP `_LANGUAGES` list updated to match.
- Exports expanded: `CSharpAnalyzer`, `DartAnalyzer`, `PhpAnalyzer`, `RubyAnalyzer`, `SwiftAnalyzer`, `ZigAnalyzer` exported from `__init__.py`.
- All 21 sample repos re-tested: zero errors across all languages and scopes.
- 230 tests (15 new: symbols/imports for C#, Ruby, PHP, Swift, Dart, Zig + 6 config detection).
- Abbreviation handling in recipe matching: New `_ABBREVIATIONS` dict in `utils.py` with ~40 common coding abbreviations (gc, db, auth, api, etc.). `normalize_goal` expands abbreviations before applying verb synonyms. Recipe matching that previously failed (e.g., "optimize GC" vs "optimize garbage collector") now works with 0.86+ similarity.
- Analysis timing metadata: `decompose` now returns `analysis_meta` in the response with `language`, `scope`, `files_analyzed`, `dep_files`, `dep_edges`, `timing_s` (symbols, imports, total), and optional `warning` for unsupported language fallbacks.
- `estimate` MCP tool: Quick file count for a project or scope without running full analysis. Returns `language`, `matching_files`, `recommendation` ("use scope" if >5000 files, "full analysis OK" otherwise). Helps LLMs decide whether to scope before analyzing large repos. 21 MCP tools total.
- Iterative critical_path strategy: Converted recursive `_longest` depth computation to iterative stack-based DFS with cycle detection via `in_stack` set. Fixes stack overflow on deep dependency graphs (Guava's 1.66M-edge Java import graph was crashing).
- 215 tests (5 new: 3 abbreviation, 1 analysis meta, 1 estimate tool).
- Monorepo scope support: New `scope` parameter on `decompose`, `explore`, `plan_and_execute`, CLI (`--scope`), and MCP tools. Limits analysis to a subdirectory while keeping the full project available for test execution. Example: `--scope services/auth` analyzes only `services/auth/`.
- Concurrent write safety: Validated with threading tests (4 threads x 5 operations). Plans, recipes, and constraints all survive concurrent access without errors.
- 206 tests (6 new: 3 concurrent, 3 scope).
- Plan resumption: New `get_plan_progress(plan_id)` returns accumulated `prior_edits` from passed steps and `remaining_steps` for continuing failed plans. New `resume` MCP tool.
- Recipe validation: New `validate_recipes(project_root)` checks recipe file entries against current project, removes stale entries, prunes fully-stale recipes. New `validate_recipes` MCP tool.
- Config-file language detection: `detect_language()` now checks config files first (Cargo.toml, go.mod, tsconfig.json, package.json, build.gradle, CMakeLists.txt, pyproject.toml) before falling back to extension counting. Config detection takes priority.
- Shared file-collection helper: New `_collect_project_files(root, extensions)` in `utils.py` replaces duplicated `os.walk` patterns in TypeScriptAnalyzer, CppAnalyzer, RustAnalyzer.
- `verify_step` language support: MCP tool gains `language` parameter for auto-detecting test command and error patterns.
- `explore_trajectories` now uses existing `_split_active_skipped` helper instead of reimplementing inline.
- Full MCP dispatch test coverage: All 20 tools now have explicit dispatch tests.
- 200 tests (25 new).
- Codebase cleanup: Removed unused imports (`Any` from `analyzers.py`, `json` from `analyzers_ext.py`, `ExecutionHarness`/`dumps_json` from `test_strategies.py`). Replaced fragile lazy-import global in `analyzers_ext.py` with `functools.cache` (thread-safe, simpler). Fixed overly broad `BaseException` → `Exception` in transaction rollback (`utils.py`). Modernized `conn.commit()`/`conn.rollback()`. Optimized `_order_cohesion` set creation. Updated `SYSTEM_PROMPT.md` (tool count 17→18, added C/C++/Java/Kotlin to multi-language section).
- 175 tests (unchanged).
- Store module split: Extracted recipe methods (`save_recipe`, `retrieve_best_recipe`, `list_recipes`, `prune_recipes`, `_rebuild_trigram_index`, `_backfill_files`) into `store_recipes.py` as `RecipeStoreMixin` (~210 LOC). `RecipeStore` in `store.py` now inherits from it (~342 LOC, down from 540).
- Expanded C++ symbol detection: Replaced single function pattern in `CppAnalyzer` with 5 targeted patterns: template functions, qualified functions (static/inline/constexpr), operator overloading, constructor/destructor detection, macro-prefixed functions (EXPORT_API etc).
- Java/Kotlin source root detection: New `JavaAnalyzer._detect_source_roots(project_root)` reads `build.gradle`/`build.gradle.kts` and `pom.xml` to find standard source directories (`src/main/java`, `src/main/kotlin`, etc). Falls back to project root. `analyze_imports` now walks detected source roots instead of project root.
- MCP tool `prune_recipes`: Exposed recipe pruning as MCP tool with `max_age_days` and `min_success_ratio` parameters. 18 MCP tools total.
- 175 tests (9 new: 5 C++ expansion, 3 Java source roots, 2 MCP prune, minus 1 renamed).
- Parallel beam execution: `plan_and_execute` runs beams concurrently via `concurrent.futures.ProcessPoolExecutor` (stdlib). Falls back to sequential on systems where process spawning fails.
- C/C++ and Java/Kotlin analyzers: New `CppAnalyzer` (.c/.cpp/.cc/.cxx/.h/.hpp/.hxx, class/struct/namespace/enum/typedef/function symbols, `#include "..."` resolution) and `JavaAnalyzer` (.java/.kt/.kts, class/interface/enum/fun/object/@interface symbols, package-based import resolution) in `analyzers_ext.py`. MCP language enum expanded to 9 entries.
- Analyzer module split: `analyzers.py` split into `analyzers.py` (~370 LOC) + `analyzers_ext.py` (~400 LOC) to stay under 500 LOC per file. All existing imports preserved via re-export.
- Recipe pruning: New `RecipeStore.prune_recipes(max_age_days=90, min_success_ratio=0.1)` removes stale, low-quality recipes with cascade deletes to `recipe_trigrams` and `recipe_files`.
- Harness base-copy caching: New `prepare_base(project_root)` and `run_from_base(edits, base_dir)` create one filtered base copy; beams copy from it instead of re-filtering per beam.
- `--dry-run` and `--language` CLI flags: `--dry-run` runs `explore()` instead of `plan_and_execute()`.
- Decomposed `_apply_constraints`: 85-line function in `core.py` split into `_parse_constraints`, `_mark_avoided`, `_inject_orderings`, `_mark_incompatible`, `_add_prerequisites`.
- 166 tests (20 new).
- Code cleanup: Removed unused `import sys` from `harness.py`. Eliminated duplicate `set(symbols) | set(dep_graph)` computation in `Planner.decompose` (`core.py`). Added defensive `json.loads` error handling in `retrieve_best_recipe` (`store.py`). Made failure default explicit in `get_strategy_stats` (`store.py`). Fixed `register_strategy` parameter order in `spec-project.md` documentation.
- 146 tests (unchanged).
- Go and Rust support: New `GoAnalyzer` (regex-based, reads `go.mod` for module path, resolves internal imports) and `RustAnalyzer` (regex-based, resolves `use crate::` and `mod` declarations). Shared `_collect_symbols_regex` helper for regex-based analyzers. `detect_language` expanded to count `.go`/`.rs` files. Registry now supports 5 languages.
- TypeScript enhancements: `_strip_c_comments` for comment stripping before symbol/import detection. Namespace pattern added to `_TS_SYMBOL_PATTERNS`.
- Improved beam strategies: `_order_bottom_up` stable-sorts by ascending dependency count (files with fewer deps first). `_order_top_down` stable-sorts by descending dependency count (most consumer-facing files first). Both now genuinely use the `dep_graph` parameter.
- Better recipe matching: New `word_substring_score(a, b)` for partial word matching. `goal_similarity` reweighted: 0.3 trigram cosine + 0.4 word Jaccard + 0.3 substring (was 0.4/0.6).
- Store improvements: Merged duplicated SQL branches in `save_recipe`, `list_plans`, `get_active_constraints`. Composite scoring gains recency weighting (30-day half-life). New weights: text 0.4, files 0.25, success 0.15, recency 0.2. File trimmed from 534 to 516 lines.
- MCP server refactor: `_schema()` and `_prop()` helpers reduce from 507 to 255 lines. Added "go" and "rust" to language enums.
- 146 tests (17 new: 4 Go, 3 Rust, 2 TS enhancement, 2 detection, 2 strategy, 4 matching).
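
The composite scoring weights listed above (text 0.4, files 0.25, success 0.15, recency 0.2, with a 30-day half-life) can be combined roughly as follows. This is a sketch under stated assumptions, not the code in `store.py`: the weighted sum, the success-ratio definition, and the helper names `jaccard`, `recency_weight`, and `composite_score` are all hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """File-overlap score: |a ∩ b| / |a ∪ b|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recency_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a fresh recipe, 0.5 after one half-life."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def composite_score(
    text_sim: float,
    recipe_files: set[str],
    context_files: set[str],
    successes: int,
    failures: int,
    age_days: float,
) -> float:
    """Assumed combination: a plain weighted sum of the four components."""
    attempts = successes + failures
    success_ratio = successes / attempts if attempts else 0.0
    return (
        0.40 * text_sim
        + 0.25 * jaccard(recipe_files, context_files)
        + 0.15 * success_ratio
        + 0.20 * recency_weight(age_days)
    )
```

Because the four weights sum to 1.0, a fresh recipe with identical text, identical files, and no failures scores 1.0 under these assumptions.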
- Code cleanup: Extracted `_split_active_skipped` helper in `core.py`, shared by all 6 beam strategy functions (eliminates duplicated active/skipped split pattern). Modernized `_VERB_SYNONYMS` in `utils.py` from imperative loop to dict comprehension (eliminates leaked module-level variables `_canonical`, `_variants`, `_v`).
- Documentation fixes: Corrected canonical verb form examples in glossary and spec (was "refactor", now correctly "restructure"). Fixed `register_strategy` parameter order in glossary (`(name, description, fn)` not `(name, fn, description)`).
- 129 tests (unchanged).
- Improved recipe matching: New `_VERB_SYNONYMS` (40+ verb variants to 9 canonical forms), `normalize_goal`, `word_jaccard`, and `goal_similarity` (0.4 trigram cosine + 0.6 word Jaccard on normalized text) in `utils.py`. `save_recipe` normalizes before trigram indexing. `retrieve_best_recipe` uses `goal_similarity`. `_backfill_trigrams` renamed to `_rebuild_trigram_index` (rebuilds with normalized text on init).
- New beam strategies (6 total, 3 new): `critical_path` (longest dependency chain first: bottleneck feedback), `cohesion` (flood-fill connected components, largest first, toposort within), `minimal_change` (fewest symbols first: quick wins).
- TypeScript analyzer improvements: `_TS_SYMBOL_PATTERNS` list replacing single regex (interface, enum, const enum, type alias, abstract class, decorated class, function expression). Expanded import detection (re-exports, barrel exports, type re-exports, dynamic imports). New `_TS_ALIAS_IMPORT_RE`, `_read_ts_path_aliases`, `_resolve_alias`. Added `.mts`/`.mjs` extensions.
- 129 tests (34 new: 10 recipe matching, 9 strategy, 15 TypeScript).
- Code cleanup: Removed dead `json` import from `core.py`. Eliminated duplicated error patterns between `utils.py` and `PythonAnalyzer`. Removed duplicated `_pick_test_cmd` from `harness.py` (falls back to `PythonAnalyzer`). Removed `analyze_imports` backward-compat wrapper from `utils.py`.
- 95 tests (1 obsolete backward-compat test removed).
- Reference LLM integration: New `SYSTEM_PROMPT.md` providing a reference orchestration guide for LLM clients (plan-verify-store loop).
- New MCP tools: `update_plan_status` (exposes existing store method), `deactivate_constraint` (exposes existing store method). `status` tool now includes `tools` count in response. Tool count 16 → 17.
- 96 tests (4 new).
- Structural recipe matching: New `recipe_files` table in SQLite schema (7 tables total) with indexes on both columns. `save_recipe` populates `recipe_files` with file paths from strategy steps. `_backfill_files()` auto-migrates existing databases.
- Composite scoring: `retrieve_best_recipe` accepts optional `context_files` for composite scoring: text similarity (0.5), file overlap via Jaccard (0.3), success ratio (0.2). Without `context_files`, scoring is backward-compatible (text-only).
- Planner integration: `Planner.decompose` passes project file context to recipe retrieval (two-phase: text-only fast path, then structural).
- New MCP tools: `list_recipes(limit=20)`, `get_recipe` gains `context_files` parameter. Tool count 14 → 16.
- 92 tests (9 new recipe tests).
- Multi-language support: New `trammel/analyzers.py` with `LanguageAnalyzer` protocol, `PythonAnalyzer`, `TypeScriptAnalyzer` (regex-based, stdlib-only), `detect_language()`, `get_analyzer()`.
- Planner integration: `Planner` accepts optional `analyzer` parameter, auto-detects language. `ExecutionHarness` accepts optional `analyzer` for language-specific test commands and error patterns.
- Refactored analysis: `_collect_python_symbols` removed from `core.py` (moved to `PythonAnalyzer`). `analyze_imports` in `utils.py` now a backward-compat wrapper delegating to `PythonAnalyzer`. `ast` import removed from `utils.py`.
- `analyze_failure` accepts optional `error_patterns` parameter. `_IGNORED_DIRS` expanded: `.next`, `.nuxt`, `coverage`, `.turbo`, `.parcel-cache`. `language` parameter added to `plan_and_execute`, `explore`, and MCP `decompose`/`explore` tools.
- New exports: `PythonAnalyzer`, `TypeScriptAnalyzer`, `detect_language` from `trammel.__init__`.
- 83 tests (13 new in `tests/test_analyzers.py`).
- Pluggable strategy registry: New `register_strategy()` and `get_strategies()` API in `core.py` with `StrategyFn` and `StrategyEntry` types. Three built-in strategies (`bottom_up`, `top_down`, `risk_first`) auto-registered at module load. Strategy functions use unified signature `(steps, dep_graph) -> steps`.
- Strategy learning: `explore_trajectories` accepts optional `store` for learning feedback. When provided, strategies are sorted by historical success rate from trajectory data. `plan_and_execute` and `explore` pass store to enable learning.
- Strategy stats: `RecipeStore.get_strategy_stats()` aggregates trajectory outcomes by variant (success/failure counts per strategy).
- New MCP tool: `list_strategies` returns registered strategy names with success/failure stats. Tool count 13 → 14.
- New exports: `register_strategy` and `get_strategies` exported from `trammel.__init__`.
- Test reorganization: New `tests/test_strategies.py` with 8 strategy-focused tests. `TestBeamStrategies` moved from `test_trammel_extra.py`. Test count 62 → 70.
- Simplified symbol collection: `_collect_python_symbols` returns symbol name strings instead of redundant dicts; unused `file`, `type`, `line` fields removed (only `name` was consumed downstream).
- Removed dead parameter: `_step_rationale` no longer accepts unused `filepath` argument.
- Inlined beam descriptions: Removed `_BEAM_STRATEGIES` module-level constant; descriptions inlined at usage site, eliminating fragile index-based coupling.
- Documentation fix: Removed duplicated extension point line in `spec-project.md`.
- Concurrent write protection: All mutating `RecipeStore` methods wrapped in explicit `BEGIN IMMEDIATE` transactions with exponential backoff retry on `SQLITE_BUSY`. Multi-statement operations like `create_plan` are now atomic. `db_connect` sets `timeout=5.0`.
- Recipe retrieval at scale: Inverted trigram index (`recipe_trigrams` table with B-tree index). `retrieve_best_recipe` now queries candidate recipes by shared trigrams before computing exact cosine, avoiding full table scans. Existing databases auto-backfill on schema init.
- Constraint propagation: New `_apply_constraints` enforces active constraints during decomposition: `avoid` skips files, `dependency` injects ordering, `incompatible` marks conflict metadata, `requires` adds prerequisite steps. Strategy output now includes `constraints_applied`.
- Constraint-aware beam strategies: `_order_bottom_up` and `_order_top_down` place skipped steps at end. `_order_risk_first` isolates incompatible steps and batches by package directory. `explore_trajectories` excludes skipped steps from beam edits.
- RecipeStore context manager: Added `close()`, `__enter__`/`__exit__`, and `__del__` safety net. All public API functions and MCP server use `with RecipeStore(...)`.
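
The concurrent write protection described above follows a common SQLite pattern, sketched here with the standard `sqlite3` module. The `immediate_transaction` helper, its `retries` and `base_delay` parameters, and the string check on the error message are assumptions for illustration, not Trammel's implementation. The sketch assumes the connection was opened with `isolation_level=None` so that `BEGIN IMMEDIATE` can be issued explicitly.

```python
import sqlite3
import time
from contextlib import contextmanager


@contextmanager
def immediate_transaction(conn: sqlite3.Connection, retries: int = 5,
                          base_delay: float = 0.05):
    """Hypothetical helper: run a block inside BEGIN IMMEDIATE ... COMMIT.

    BEGIN IMMEDIATE takes the write lock up front; if another writer holds
    it, retry with exponentially growing sleeps before giving up.
    """
    for attempt in range(retries + 1):
        try:
            conn.execute("BEGIN IMMEDIATE")
            break
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            is_busy = "locked" in message or "busy" in message
            if not is_busy or attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.05s, 0.1s, 0.2s, ...
    try:
        yield conn
    except Exception:
        conn.execute("ROLLBACK")  # undo every statement in the block
        raise
    else:
        conn.execute("COMMIT")
```

Wrapping a multi-statement write in `with immediate_transaction(conn):` makes it all-or-nothing: an exception inside the block rolls back every statement issued since `BEGIN IMMEDIATE`.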
- Import consistency: Converted absolute imports in `mcp_stdio.py` to relative imports, matching the rest of the package.
- Dead exception handling: Removed unreachable `UnicodeDecodeError` from `_collect_python_symbols` except clause (`core.py`). Files are opened with `errors="replace"`, so the exception can never be raised.
- Ignored directories: Added `.chisel` to `_IGNORED_DIRS` in `utils.py` (tool cache directory, same category as `.mypy_cache`, `.ruff_cache`).
- Dead code removal: Removed unused test imports (`explore`, `synthesize`, `analyze_imports`, `cosine`, `trigram_bag_cosine`, `trigram_signature`). Removed dead `goal_slice` parameter from `_collect_python_symbols`, which was computed per symbol but never consumed downstream.
- Simplified topological sort: Removed redundant `rev.setdefault()` call where keys are guaranteed to exist from pre-initialization.
- Documentation fix: `plan_and_execute` API signature in spec now includes `test_cmd` parameter.
- Version from metadata: `__version__` now derived from `importlib.metadata` at runtime, eliminating version duplication between `pyproject.toml` and source code.
- Match/case dispatch: `dispatch_tool` in `mcp_server.py` converted from 13-branch if/elif chain to Python 3.10+ `match`/`case`.
- Configurable test command: `ExecutionHarness` accepts `test_cmd` parameter for custom test runners (e.g. pytest). Propagated through `plan_and_execute`, CLI (`--test-cmd`), and MCP `verify_step` tool.
- Recipe retrieval optimization: `retrieve_best_recipe` short-circuits on exact match (similarity 1.0).
- Tests package: Added `tests/__init__.py`.
- Dead code removal: Removed unused `advance_plan_step` method from `RecipeStore`, unused `json`/`os` imports from `mcp_server.py`, unused `json` import from tests.
- Consolidated ignored-dirs: Unified hardcoded directory skip list in `harness.py` with `_IGNORED_DIRS` from `utils.py` via new `_is_ignored_dir` helper. Fixed `egg-info` pattern that could never match actual `*.egg-info` directories.
- Performance: `topological_sort` uses `collections.deque` instead of `list.pop(0)` for O(1) queue operations.
- Simplified core: Replaced verbose loop in `Planner.decompose` with set union for `all_files` construction.
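
The `collections.deque` change noted above matters because `list.pop(0)` shifts every remaining element, while `deque.popleft()` runs in constant time. A minimal sketch of a queue-based (Kahn-style) topological sort follows. The function name `toposort_deps` and the graph shape (file mapped to the files it imports) are assumptions for illustration, and Trammel's own `topological_sort` may differ in detail.

```python
from collections import deque


def toposort_deps(dep_graph: dict[str, list[str]]) -> list[str]:
    """Hypothetical helper: order files so dependencies come first."""
    nodes = set(dep_graph)
    for deps in dep_graph.values():
        nodes.update(deps)

    unmet = {n: 0 for n in nodes}        # count of unresolved dependencies
    dependents = {n: [] for n in nodes}  # reverse edges: dep -> importers
    for node, deps in dep_graph.items():
        for dep in set(deps):
            unmet[node] += 1
            dependents[dep].append(node)

    # Start with files that depend on nothing; sorting keeps output stable.
    queue = deque(sorted(n for n in nodes if unmet[n] == 0))
    order: list[str] = []
    while queue:
        node = queue.popleft()  # O(1), unlike list.pop(0)
        order.append(node)
        for importer in sorted(dependents[node]):
            unmet[importer] -= 1
            if unmet[importer] == 0:
                queue.append(importer)

    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order
```

For a graph where `app.py` imports `core.py` and `utils.py`, and `core.py` imports `utils.py`, this sketch yields `utils.py`, then `core.py`, then `app.py`.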
- Dependency-aware planning: Import analysis via AST, topological sort, steps with ordering rationale and dependencies.
- Real beam branching: Three strategies (`bottom_up`, `top_down`, `risk_first`) instead of label variations.
- Incremental verification: Per-step harness with `verify_step()` and `run_incremental()`.
- Failure analysis: Structured error extraction (type, message, file, line, suggestion).
- Constraint propagation: Persistent failure constraints that block repetition across sessions.
- MCP server: 13 tools exposed via stdio transport, matching Stele/Chisel pattern (expanded to 17 by v2.0.0).
- Enriched schema: Recipes store strategies + constraints + failure counts. Plans track step-level status. New `steps` and `constraints` tables.
- Recipe retrieval requires minimum similarity threshold (0.3).
- `_collect_python_symbols` collects `async def` and skips ignored directories.
- Deduplicated trigram computation; removed dead code.
- Correct trigram similarity for recipe retrieval; tie-break on stored success counts.
- Test subprocess uses `sys.executable`; SQLite `foreign_keys=ON`.
- Beam `edits` include `path`; JSON serialization centralized via `dumps_json`.
- `__version__` and CLI `--version`.
