Releases: emrecdr/devt

v0.44.0

18 May 21:02

Nine-commit wave layering four concerns on top of v0.43.0's graphify integration: alignment-drift cleanup, JSON-sidecar contract completion for the highest-traffic markdown-only artifact, memory-layer module split, and post-integration polish for graphify + claude-mem MCP routing. Two silent failure modes closed: validateConsistency was recording NO_STATUS_LINE warnings on every code-review verify-phase advance because code-reviewer.md emits ## Verdict while extractStatus() only matched ## Status; and loadGraph silently degraded when graph.json exceeded the 100MB cap, with no signal in /devt:forensics. One forensic-preservation gap closed: dispatch-warnings.jsonl was being deleted by state reset despite CLAUDE.md promising it survives. One routing bug fixed: the claude-mem harvest pre-step was routing to observation_search, a tool that does not work in the canonical (worker-mode) install. Twelve new smoke gates added across the wave. Smoke: 454 passed, 0 failed (was 438 at the start of the wave).

Added — alignment cleanup (a9ecfdf)

  • Documentation-discipline smoke gate in scripts/smoke-test.sh: scans agents/, workflows/, skills/, and docs/ for devt-internal version refs (v0.X.Y patterns and since v[0-9] markers), excluding CHANGELOG.md and docs/superpowers/plans/ as legitimate historical homes. Catches the class of drift where version refs leak into source-of-truth surfaces despite the existing rule.
  • RESET_EXEMPT preservation smoke gate: state-reset functional test asserts both preflight-denies.jsonl and dispatch-warnings.jsonl survive a reset, locking in the forensic-preservation contract.

Fixed — alignment cleanup (a9ecfdf)

  • bin/modules/state.cjs::RESET_EXEMPT now preserves dispatch-warnings.jsonl alongside preflight-denies.jsonl. The forensic-preservation claim for /devt:forensics already promised this; the implementation now honours it.
  • agents/retro.md frontmatter adds skills: [] for contract consistency with agents/io-contracts.yaml (where retro.frontmatter_skills already declared []).
  • docs/MEMORY.md: removed two devt-internal version refs (header note and multi-root prose), added lesson to the doc_type enum comment, deleted the empty ## Version Notes section header.
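
The preservation contract for state reset can be sketched as follows. This is a minimal illustration, not the code in bin/modules/state.cjs: the `resetStateDir` helper, the use of a `Set`, and the flat directory layout are assumptions; only the two exempt file names come from the notes.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Files that must survive a reset (names taken from the release notes).
const RESET_EXEMPT = new Set(["preflight-denies.jsonl", "dispatch-warnings.jsonl"]);

// Hypothetical reset: removes every entry in stateDir except the exempt ones.
function resetStateDir(stateDir) {
  const removed = [];
  for (const name of fs.readdirSync(stateDir)) {
    if (RESET_EXEMPT.has(name)) continue;
    fs.rmSync(path.join(stateDir, name), { recursive: true, force: true });
    removed.push(name);
  }
  return removed;
}

// Demo in a throwaway directory.
const stateDir = fs.mkdtempSync(path.join(os.tmpdir(), "devt-state-"));
for (const name of ["workflow.yaml", "review.md", ...RESET_EXEMPT]) {
  fs.writeFileSync(path.join(stateDir, name), "");
}
resetStateDir(stateDir);
console.log(fs.readdirSync(stateDir).sort());
// [ 'dispatch-warnings.jsonl', 'preflight-denies.jsonl' ]
```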

Added — review.json JSON sidecar (8dff301)

  • review.json sidecar completes the JSON-sidecar contract for the highest-traffic markdown-only artifact. Joins the existing impl-summary.json, test-summary.json, and verification.json. Schema split mirrors impl-summary: status ∈ {DONE, BLOCKED} for workflow routing, verdict ∈ {APPROVED, APPROVED_WITH_NOTES, NEEDS_WORK} for the review outcome.
  • Wired end-to-end: bin/modules/state.cjs (JSON_SIDECAR_SCHEMAS entry + SIDECAR_FOR_MARKDOWN mapping + review.md removed from ARTIFACT_SCHEMA), agents/io-contracts.yaml (code-reviewer.outputs.sidecar = review.json), agents/code-reviewer.md (stub-first protocol writing review.json first, finalizing with status+verdict+agent+score+counts+timestamp), workflows/code-review.md (artifact pre-gate requires both .md and .json), workflows/next.md (three review-routing branches now state read-sidecar review.json instead of text-matching the markdown).
  • Seven smoke gates covering registry presence, mapping presence, ARTIFACT_SCHEMA absence, io-contracts declaration, agent emission, workflow consumption, and end-to-end schema validation of all three flags.
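
A minimal sketch of the status/verdict split for review.json, assuming a hypothetical `validateReviewSidecar` helper. The real JSON_SIDECAR_SCHEMAS entry is not shown in these notes; only the two enum sets below are taken from them.

```javascript
// Enum values from the release notes; the schema object shape is an assumption.
const REVIEW_SCHEMA = {
  status: ["DONE", "BLOCKED"], // workflow routing
  verdict: ["APPROVED", "APPROVED_WITH_NOTES", "NEEDS_WORK"], // review outcome
};

// Hypothetical validator: reports every field whose value is outside its enum.
function validateReviewSidecar(sidecar) {
  if (sidecar === null || typeof sidecar !== "object") {
    return { ok: false, errors: ["sidecar must be an object"] };
  }
  const errors = [];
  for (const [field, allowed] of Object.entries(REVIEW_SCHEMA)) {
    if (!allowed.includes(sidecar[field])) {
      errors.push(`${field} must be one of ${allowed.join(", ")}`);
    }
  }
  return { ok: errors.length === 0, errors };
}

console.log(validateReviewSidecar({ status: "DONE", verdict: "NEEDS_WORK" }).ok); // true
console.log(validateReviewSidecar({ status: "APPROVED", verdict: "NEEDS_WORK" }).ok); // false
```

Keeping the two fields separate means a review can finish (status DONE) while still asking for changes (verdict NEEDS_WORK), which a single combined field could not express.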

Fixed — review.json JSON sidecar (8dff301)

  • Silent extractStatus warnings on review.md eliminated as side effect. Before the sidecar wire, validateConsistency ran extractStatus() on review.md, which only matched ## Status headings — but the code-reviewer template emits ## Verdict. Every code-review verify-phase advance silently persisted NO_STATUS_LINE to workflow.yaml::validation_warnings. Sidecar routing via SIDECAR_FOR_MARKDOWN bypasses extractStatus entirely.
  • Generalized the ARTIFACT_SCHEMA drift gate to recognize both "Status field is one of" and "Verdict field is one of" agent doc patterns, with the sidecar field-kind resolved from whichever matched. Prevents the same class of drift across all sidecar-routed artifacts going forward.
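
The heading mismatch behind the silent warning can be reproduced with a simplified stand-in for extractStatus(). The regex and the `extractStatusSketch` name are assumptions for illustration; the real function is not shown in these notes.

```javascript
// Simplified stand-in: finds a "## Status" heading and returns the first
// non-empty line after it, or null when no such heading exists.
function extractStatusSketch(markdown) {
  const match = markdown.match(/^## Status[ \t]*\r?\n+[ \t]*(\S[^\r\n]*)/m);
  return match ? match[1].trim() : null;
}

const implMd = "# Summary\n\n## Status\nDONE\n";
const reviewMd = "# Review\n\n## Verdict\nAPPROVED\n";

console.log(extractStatusSketch(implMd)); // DONE
console.log(extractStatusSketch(reviewMd)); // null
```

A null result for a well-formed review.md is what would surface as a missing-status warning, which is why routing that artifact through its sidecar sidesteps the problem.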

Added — polish pass (867c005)

  • docs/COMMANDS.md scope_trust + graph_stats coverage added in the preflight section. The JSON sidecar fields (scope_hint, scope_trust, graph_stats) were documented in CLAUDE.md (dev-facing) but absent from COMMANDS.md (user-facing). Three new paragraphs covering: sidecar shape, <scope_trust> dispatch signal semantics (low-confidence treatment when trust ∈ {sparse, empty} or lag_commits > 10), and graph_stats source.
  • End-to-end smoke gate for review.json sidecar routing: constructs the exact scenario the silent bug occurred in (review.md without ## Status heading, valid review.json) and asserts state validate's JSON output contains no review.md no_status_line mismatch. Exercises the SIDECAR_FOR_MARKDOWN code path in validateConsistency, not just wiring presence.

Fixed — polish pass (867c005)

  • agents/researcher.md status pattern: the template wrote Status: DONE | ... as plain text under the ## Confidence heading. extractStatus() only matches ## Status headings, so the status line was unparseable. research.md is in ARTIFACT_SCHEMA and is consumed by workflows/dev-workflow.md; the misalignment was latent because research is not in PHASE_ARTIFACT_MAP, but adding research as a routed phase would have reintroduced the same silent-warning class fixed for review.md. Promoted to a proper ## Status heading with a block-form value.

Changed — memory module split (af90cf0)

  • bin/modules/memory.cjs extracted into three files. The 1884-line module was well past the 700-line informal threshold. Two clean extraction boundaries identified after dependency mapping:
    • bin/modules/memory-graph.cjs (135 lines): graph traversal over the links table — getLinks, getSubgraphTriples, getBacklinks, findOrphans, findStaleLinks. Only needs the DB handle, obtained via withDb.
    • bin/modules/memory-bundle.cjs (251 lines): portable JSON bundle export/import — resolveExportPath, resolveImportPath, readDocFile, exportBundle, importBundle. Uses parser/validation helpers from the core module.
    • bin/modules/memory.cjs (1576 lines, was 1884): everything else — paths, frontmatter parsing/validation, DB lifecycle, queries, upsertDoc, symbol validation, CLI dispatcher.
  • Lazy-require pattern breaks the load-time circular dep: sub-modules require("./memory.cjs") inside function bodies, so memory.cjs's top-level require of the sub-modules resolves cleanly. Public API unchanged via re-exports; existing consumers (devt-tools.cjs, devt-memory-mcp.cjs, discovery.cjs, preflight.cjs, health.cjs) need zero call-site changes.
  • Four helpers now formally exported from memory.cjs to support the sibling-module contract: withDb, findProjectRoot, parseYamlSubset, serializeFrontmatter. These are internal-but-shared utilities that sub-modules need; the export is the contract that lets them stay in their natural homes instead of being moved to a shared base module.
  • Net: -308 lines from memory.cjs (-16%), +386 lines across two sibling modules, +78 lines total codebase (import boilerplate cost — acceptable tradeoff for module health).

Fixed — claude-mem MCP harvest routing (26033b9)

  • claude-mem harvest pre-step in three workflows (dev-workflow.md, quick-implement.md, lesson-extraction.md) was routing to observation_search, which requires CLAUDE_MEM_RUNTIME=server-beta and silently no-ops in the canonical worker-mode install. Re-targeted to search — the worker-mode equivalent exposed identically by both runtimes. Parsing instructions refined to extract only numeric-ID rows from the markdown index (the search tool returns observations + sessions + prompts under the same result count) and to map the emoji column (⚖️ → decision, 🔵 → discovery) to obs_type, dropping session-telemetry types that don't promote.
  • Negative smoke gate added; existing positive harvest gate retargeted to the new tool name.

Added — graphify polish (6ec5cf3 / 7d1f080 / 667f50b)

  • loadGraph size-cap forensic record (6ec5cf3): when graph.json exceeds the 100MB cap, appends one JSONL record to .devt/state/preflight-denies.jsonl with source="graph_loader", path, size, cap, and ISO timestamp. Per-process dedupe via a path set so one workflow that calls multiple graphify wrappers writes one record, not N. Skipping the full readFileSync on oversize files is a side benefit. Two new fixture-test assertions.
  • graphify.godNodes() public function (7d1f080): discovery.cjs::harvestGraphifyGodNodes and preflight.cjs's Cross-Cutting Concerns renderer were both reading god-nodes by regex-scraping graphify-out/GRAPH_REPORT.md. That path lags the actual graph because graphify update rewrites graph.json but leaves GRAPH_REPORT.md alone unless cluster-only also runs. The local _topByDegree() already computes god-nodes from graph.json adjacency with matching filters; wrapping it as a public godNodes() lets both consumers read live data. CLI: new graphify god-nodes [--limit=N] subcommand. Five new fixture-test assertions.
  • docs/graphify-helpers/SKILL.md MCP table aligned to v0.8.11 (667f50b): upstream graphify v0.8.11 ships 10 MCP tools; the skill's table listed 7 (pre-v0.8.8). Now lists all 10 (query_graph, get_node, get_neighbors, shortest_path, god_nodes, get_community, graph_stats, get_pr_impact, list_prs, triage_prs). Decision-tree step "Probe graphify --help -> exit 0?" was dead text (devt reads graph.json directly in-process); consolidated to graphify status which combines enabled-flag + graph.json existence in one call.
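
The size-cap record with per-process dedupe might look roughly like this. It is a sketch, not the loadGraph code: the `checkGraphSize` helper and the `timestamp` field name are assumptions, while source="graph_loader", path, size, and cap follow the description above.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

const CAP_BYTES = 100 * 1024 * 1024; // 100MB cap from the release notes
const reportedPaths = new Set(); // per-process dedupe

// Returns true when the file exceeds the cap; appends at most one JSONL
// record per path for the lifetime of the process.
function checkGraphSize(graphPath, logPath, capBytes = CAP_BYTES) {
  const { size } = fs.statSync(graphPath); // stat only, the file is never read
  if (size <= capBytes) return false;
  if (!reportedPaths.has(graphPath)) {
    reportedPaths.add(graphPath);
    const record = {
      source: "graph_loader",
      path: graphPath,
      size,
      cap: capBytes,
      timestamp: new Date().toISOString(), // field name is an assumption
    };
    fs.appendFileSync(logPath, JSON.stringify(record) + "\n");
  }
  return true;
}

// Demo with a tiny cap so no large file is needed.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "graph-cap-"));
const graphPath = path.join(demoDir, "graph.json");
const logPath = path.join(demoDir, "preflight-denies.jsonl");
fs.writeFileSync(graphPath, JSON.stringify({ nodes: [], links: [] }));
checkGraphSize(graphPath, logPath, 4);
checkGraphSize(graphPath, logPath, 4);
const lines = fs.readFileSync(logPath, "utf8").trim().split("\n");
console.log(lines.length); // 1
```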

Fixed — setup reinit reconci...

v0.43.0

18 May 15:48

Eleven-commit wave addressing structural integration drift against upstream graphify and claude-mem. Three silent failure modes closed: graphify wrappers shelling out to subcommands with --json flags that don't exist upstream; code-reviewer never consuming mcp__graphify__get_pr_impact during PR reviews; claude-mem mcp --db invocation invalid against claude-mem v13 (produced "Unknown IDE: --db" error every Claude Code session). Two new agent signals added end-to-end: graph trust verdict + freshness lag in the preflight sidecar, with workflow caching + 7 agent body paragraphs implementing low-confidence treatment on sparse / stale graphs. One budget-protection mechanism: code-reviewer applies a community filter for large PR reviews when pr-impact.md is present, deferring out-of-community files to a follow-up dispatch. Smoke: 438 passed, 0 failed (was 427 at the start of the wave). Architectural through-line: devt's Node code never reaches into upstream MCP/CLI directly — file artifacts for Node, orchestrator-mediated MCP for agents, deletion when a path is unrecoverable.

Graphify wrapper migration to direct graph.json reads (bin/modules/graphify.cjs). Every structured-query call (queryGraph, getNode, getNeighbors, shortestPath, blastRadius) previously shelled out to graphify query <text> --json / graphify query <sym> --neighbors --direction=... --depth=... — flags that don't exist in upstream's CLI surface. Verified against safishamsi/graphify upstream source: the CLI accepts only --dfs/--budget/--context/--graph for query, and the MCP server's tool handlers return types.TextContent(type="text", text=str) blobs, not structured objects. Every call was silently triggering the grep fallback via safeJsonParse failure on non-JSON output. The migration replaces all 5 functions with pure-Node algorithms over the deterministic graphify-out/graph.json NetworkX node-link artifact ({nodes, links/edges, hyperedges}). One in-process tree walk replaces 2N subprocess spawns per blastRadius call. status() decoupled from binary presence — state === "ready" now depends only on graphify.enabled + graph.json exists, so projects with a checked-in or CI-built graph work without graphify on PATH. Smoke: 428 passed, 0 failed (+1 gate over the prior 427 baseline).

Added (Phase A)

  • graph.json in-process reader (bin/modules/graphify.cjs::loadGraph): memoized by (path, mtimeMs) so repeated calls within a workflow turn parse the file once. Builds {out, inc, nodeMap} adjacency maps for O(1) neighbor lookup. Handles both links (modern NetworkX node_link_data(G, edges="links")) and edges (legacy NetworkX) field names. Caps file size at 100 MB via safeJsonParse.
  • Pure-Node algorithms for the 5 structured-query functions: substring/case-insensitive label+id resolution (_resolveOne, _resolveMany), direction-aware BFS (_bfs with direction: in|out|both, configurable depth), and directed shortest-path (BFS along outgoing edges only). blastRadius now walks depth-2 incoming in one tree traversal per symbol instead of issuing two subprocess calls per symbol.
  • scripts/test-graphify.cjs (new, 16 assertions): fixture-based test runner matching the scripts/test-locking.cjs convention. Builds a 4-node / 4-edge NetworkX-format graph.json in a temp project and exercises every public function: status (ready/graph_missing/disabled paths) / query (exact/substring/empty) / neighbors (in/out/both/depth=2) / path (connected/no-route) / blast-radius (shape contract + direct dependent count) / legacy edges field name compatibility / graph-missing degradation. Wired into scripts/smoke-test.sh so CI runs it as part of the standard gate.
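
The reader and the direction-aware BFS described above can be sketched as follows. The function names `loadGraphSketch` and `bfsSketch` are hypothetical, the size cap is omitted, and the real _bfs may differ in its return shape; the {out, inc, nodeMap} maps, the (path, mtimeMs) memoization, and the links/edges fallback follow the notes.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

const cache = new Map(); // graph path -> { mtimeMs, graph }

// Parses graph.json once per (path, mtimeMs) and builds adjacency maps.
function loadGraphSketch(graphPath) {
  const { mtimeMs } = fs.statSync(graphPath);
  const hit = cache.get(graphPath);
  if (hit && hit.mtimeMs === mtimeMs) return hit.graph;

  const raw = JSON.parse(fs.readFileSync(graphPath, "utf8"));
  const edges = raw.links || raw.edges || []; // modern vs legacy field name
  const out = new Map();
  const inc = new Map();
  const nodeMap = new Map();
  for (const node of raw.nodes || []) {
    nodeMap.set(node.id, node);
    out.set(node.id, []);
    inc.set(node.id, []);
  }
  for (const { source, target } of edges) {
    if (!out.has(source)) out.set(source, []);
    if (!inc.has(target)) inc.set(target, []);
    out.get(source).push(target);
    inc.get(target).push(source);
  }
  const graph = { out, inc, nodeMap };
  cache.set(graphPath, { mtimeMs, graph });
  return graph;
}

// Breadth-first walk up to `depth` hops; returns reached ids without the start.
function bfsSketch(graph, start, direction = "both", depth = 1) {
  const seen = new Set([start]);
  let frontier = [start];
  for (let d = 0; d < depth && frontier.length > 0; d++) {
    const next = [];
    for (const id of frontier) {
      const neighbors = [
        ...(direction !== "in" ? graph.out.get(id) || [] : []),
        ...(direction !== "out" ? graph.inc.get(id) || [] : []),
      ];
      for (const n of neighbors) {
        if (!seen.has(n)) {
          seen.add(n);
          next.push(n);
        }
      }
    }
    frontier = next;
  }
  seen.delete(start);
  return [...seen];
}

const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "graph-")), "graph.json");
fs.writeFileSync(file, JSON.stringify({
  nodes: [{ id: "a" }, { id: "b" }, { id: "c" }],
  links: [{ source: "a", target: "b" }, { source: "b", target: "c" }],
}));
const g = loadGraphSketch(file);
console.log(bfsSketch(g, "c", "in", 2)); // [ 'b', 'a' ]
console.log(loadGraphSketch(file) === g); // true (memoized)
```

The depth-2 incoming walk in the demo mirrors how a blast-radius query can collect direct and transitive dependents in a single traversal.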

Changed (Phase A)

  • graphify.cjs::status() no longer gates on binary presence. Previously required graphify --help to exit 0 even though devt's read path never invokes the binary. State enum collapsed to "ready" | "disabled" | "graph_missing" (removed "binary_missing"). The graphify binary is still required to generate graph.json via graphify update ., but devt's consumption is now binary-independent. probeBinary is kept for setup.cjs's MCP-server registration logic — that path legitimately needs to know whether the binary is installed.
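
A minimal sketch of the collapsed state enum, assuming a hypothetical `statusSketch` signature and check order; the actual status() may return more fields. The point it illustrates is that no binary probe is involved.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// State depends only on the enabled flag and on graph.json existing.
function statusSketch(config, graphPath) {
  if (!config.enabled) return { state: "disabled" };
  if (!fs.existsSync(graphPath)) return { state: "graph_missing" };
  return { state: "ready" };
}

const statusDir = fs.mkdtempSync(path.join(os.tmpdir(), "graphify-status-"));
const statusGraph = path.join(statusDir, "graph.json");
console.log(statusSketch({ enabled: false }, statusGraph).state); // disabled
console.log(statusSketch({ enabled: true }, statusGraph).state); // graph_missing
fs.writeFileSync(statusGraph, "{}");
console.log(statusSketch({ enabled: true }, statusGraph).state); // ready
```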

Removed (Phase A)

  • callGraphify subprocess wrapper. The function passed --json and other flags that don't exist in upstream's CLI argparse — every invocation either returned exit 2 ("unrecognized arguments") or non-JSON text that failed safeJsonParse. All structured-query operations now read graph.json directly. The export is gone from module.exports; no other module imported it.

Added (Phase B-1 — PR-impact MCP wiring)

  • Orchestrator fetches mcp__graphify__get_pr_impact during code-review context_init (workflows/code-review.md): when REVIEW_SCOPE mentions a PR number ("PR #N", "pull request N", or a PR arg), the orchestrator (main session, which has the project's MCP allowlist) calls the tool once and Writes the response verbatim to .devt/state/pr-impact.md. Skip-silently semantics — no PR number / no graphify MCP / call errors all proceed without the file, and the agent falls back to scope_hint + raw file list. The orchestrator pattern is necessary because the code-reviewer agent's allowlist is Read, Bash, Glob, Grep (no MCP), so the main session does the MCP fetch and the agent consumes the persisted file.
  • Code-reviewer agent Reads .devt/state/pr-impact.md when present (agents/code-reviewer.md::context_loading): instructions to prioritize files in affected communities ahead of unrelated files in the scope list, and weight finding severity by structural impact rather than diff size alone. Graphify's structured map (files changed, communities affected, blast radius) is treated as authoritative for "what does this PR actually touch in the graph".
  • ADR Compliance section gains a PR-impact item (both workflows/code-review.md and agents/code-reviewer.md): when reviewing a PR, the structured impact map is consulted alongside memory affects / memory rejected-keywords / get_neighbors so reviewers can weight findings by graph community rather than file count.
  • 2 new smoke-test gates that pin the wiring: workflow file references both get_pr_impact and pr-impact.md; agent file references both. Prevents silent regression if a future audit strips the guidance.

Notes (Phase B-1)

  • The companion MCP tools mcp__graphify__list_prs and mcp__graphify__triage_prs exist in upstream but apply to PR-selection ("which PR should I review next?"), not per-review work. They are deliberately not wired into this workflow — review-selection is a separate concern handled outside /devt:review.
  • The fetch step does not write any tool name into Bash. The MCP call is a tool-use directive to the orchestrator (a natural-language instruction), not an mcp call shell command; devt's Node code remains MCP-client-free per the architectural invariant from Phase A.

Removed (Phase C-1 — broken claude-mem CLI integration)

  • discovery.cjs::harvestClaudeMem + claudeMemAvailable removed entirely (~60 LOC). Modern claude-mem (v13.x) does not expose a query CLI command — its surface is status / search <query> / mcp <ide> / install / repair / start / stop / restart / server / worker / adopt / cleanup / transcript, all positional with no --tags or --json flags. devt's invocation claude-mem query --tags decision,discovery --json was returning exit 2 ("Unknown command") on every modern install. Source #1 of the discovery harvest has been silently producing [] for any user past the v13 upgrade.
  • Per-project .mcp.json scaffolding for claude-mem removed from setup.cjs::scaffoldProject. The previous entry {command: "claude-mem", args: ["mcp", "--db", ".claude-mem/mem.db"]} is doubly wrong: claude-mem mcp is an IDE-installer subcommand that takes an IDE identifier (claude-code, cursor, etc.) — not flags. The scaffolded entry triggered an "Unknown IDE: --db" error on every Claude Code session for users with both devt and claude-mem installed. Modern claude-mem self-registers as a Claude Code plugin (its package.json declares plugin/.mcp.json + plugin/.claude-plugin), so per-project registration is also redundant.
  • .claude-mem/mem.db entry removed from the setup.cjs .gitignore scaffold. Modern claude-mem uses ~/.claude-mem/ (per-user) for its database; the per-project path is upstream-obsolete.
  • discovery claude-mem-status CLI subcommand removed from bin/devt-tools.cjs + bin/modules/discovery.cjs::run dispatcher. It probed a capability that has no consumer.
  • Documentation cleanup: 24 references to "claude-mem ⚖️/🔵 harvest" / "claude-mem absent" / "claude-mem timeout" across workflows/dev-workflow.md, workflows/quick-implement.md, workflows/lesson-extraction.md, workflows/memory-promote.md, workflows/uninstall.md, agents/curator.md, and the JSDoc/comment sites in bin/modules/discovery.cjs + bin/modules/memory.cjs. Replaced with accurate descriptions of the remaining 3 sources (#KNOWLEDGE-CANDIDATE scratchpad tags, .devt/state/decisions.md DEC-xxx entries, Graphify god-nodes when available).
  • Stale smoke gate ("discovery claude-mem-status returns boolean") removed. It exercised the deleted CLI subcommand.

Added (Phase C-1)

  • 2 smoke gates that pin the removal: discovery.cjs MUST NOT spawnSync("claude-mem"...); setup.cjs MUST NOT scaffold a claude-mem MCP entry with --db or "mcp" args. Prevents future regressions where someone re-adds the broken shellout.
  • An explanatory comment in setup.cjs documenting why the per-project claude-mem entry is intentionally absent: "claude-mem v13+ self-registers as a Claude Code plugin under ~/.claude/plugins/ — no per-project entry needed."

N...

v0.42.0

17 May 09:10

Tier-aware skill preloading + Agent IO Contracts registry + inline-loading coverage completion across write-agents + memory_signal cache hoist + over-cap skill description trim. The coordinated wave extends the existing inline-prefix pattern (<governing_rules>, <guardrails_inline>) from the original 3 read-only agents to the 3 write-agents (programmer/tester/architect) that re-read CLAUDE.md + rule files on every retry iteration; adds rubric body inlining for the verifier; collapses 7 per-dispatch memory query --signal CLI calls into a single workflow-start cache; trims 8 SKILL.md descriptions back under the 1024-char soft cap. Validated via node bin/devt-tools.cjs token-report showing aggregate cache_hit_rate = 93.05% across 5 recent sessions — empirical evidence that the existing prefix patterns were already hitting the prompt cache, so this wave focused on closing the coverage gaps the original inline-loading wave intentionally restricted to read-only agents. Smoke: 401 passed, 0 failed.

Added

  • Tier buckets in skill-index.yaml. Per-agent skill assignments now split across three sibling keys at the existing indent-4 level — skills (always loaded), skills_standard (added when state.tier is STANDARD or COMPLEX), skills_complex (added only at COMPLEX). Heavy specialist skills (strategic-analysis ~8K chars, complexity-assessment ~10K, autoskill ~12K) demoted out of the always bucket so SIMPLE/TRIVIAL dispatches skip them. The hand-rolled YAML parser at init.cjs::parseSkillIndex already accepted arbitrary indent-4 keys — no parser change required. User overrides at .devt/config.json::agent_skills.<agent> keep accepting a flat array (= always loaded, ignores tier) so existing project configs don't break.
  • bin/modules/init.cjs::mergeSkillsForTier (NEW): merges the three buckets per agent against a tier (TRIVIAL/SIMPLE/STANDARD/COMPLEX/null), normalizes case, dedupes. Null/unknown tier returns the full union for backward-compatible default behavior. resolveSkills gained a tier parameter wired through initWorkflow; the call site seeds tier from state.tier (set by complexity-assessment once it runs) or falls back to detectTier(task) so the very first dispatch in a fresh workflow still gets tier-aware loading. The init payload now surfaces the resolved tier at the top level (tier: "trivial"|"simple"|"standard"|"complex") for transparency.
  • agents/io-contracts.yaml (NEW): single source of truth declaring per-agent frontmatter_skills, index_buckets, outputs.{primary,sidecar}, and inputs.context_blocks. Currently 10 dev agents covered. Three smoke-test gates assert no drift against (a) agents/<name>.md frontmatter skills:, (b) declared sidecars exist in state.cjs::JSON_SIDECAR_SCHEMAS, (c) every contracted agent has a backing .md file. The class of bug it catches: memory-pre-flight had been preloaded by 9 agents via frontmatter for several releases but was missing from skill-index.yaml — the kind of three-surface drift that's silently corrosive until something snaps.
  • memory-pre-flight added to skill-index.yaml for the 9 dev agents that already preload it via frontmatter (programmer/tester/code-reviewer/docs-writer/architect/verifier/researcher/debugger — plus devt-coordinator already had it). Closes the registry-vs-reality gap that the new contracts gate would have failed on.
  • graphify-helpers added to architect's skills bucket to match its frontmatter (architect was the second drift case).
  • Three new smoke-test gates under == Tier-aware skill resolution == and == Agent IO Contracts registry drift ==: empirical verification that a typo-style task seeds tier=trivial and prunes complex-tier skills from the programmer's resolved set; a refactor-style task seeds tier=complex and loads the full union; the io-contracts.yaml file agrees with all three drift surfaces.
  • bin/modules/init.cjs::loadInlineRubrics (NEW): mirrors loadInlineGuardrails for the per-workflow-type pinned rubric files (references/rubrics/<filename> resolved via the same three-layer order as grader.cjs::resolveRubricPath — absolute → project-local .devt/rubrics/ → plugin defaults). 32 KB cap. Surfaced at top-level inline_rubrics in the init payload as a {workflow_type: content} map. Verifier dispatches in dev-workflow.md and code-review.md now embed <rubric_content>{inline_rubrics.dev} / {inline_rubrics.code_review} alongside the existing <rubric_path> block — agent body instructs prefer-inline-over-path-read.
  • <governing_rules> block extended to programmer + tester + architect dispatches. Original wave covered the 3 read-only agents (code-reviewer / verifier / researcher); this completes the coverage to write-agents that re-read CLAUDE.md + 1-3 rule files on every retry iteration. Per-agent sub-tag sets vary based on which files each agent actually needs: programmer (claude_md + coding_standards + architecture + quality_gates), tester (claude_md + quality_gates + testing_patterns), architect (claude_md + architecture). All three agent bodies updated with the prefer-inline instruction listing exactly the sub-tags they accept.
  • <guardrails_inline> block extended to tester + architect. Tester preloads only golden-rules.md; architect preloads golden-rules.md + engineering-principles.md. Programmer already had the full 3-file inline block from the original wave.
  • quick-implement.md programmer + tester dispatches now carry inline blocks. Was the most cache-unfriendly dispatcher in the codebase pre-wave — had zero inline blocks despite being the "lightweight fast path".
  • Parallel-bash pairing for Step 2 (scan) + Step 2.5 (regression_baseline) in dev-workflow.md. New <!-- parallel-bash: ... --> marker comment documents the pattern (mirrors the existing <!-- parallel-dispatch: researcher + architect --> marker for Task subagent parallelism). The two steps share no state (distinct artifacts, no overlapping state update keys), so when regression_baseline would run a slow test suite the orchestrator can launch it with run_in_background=true and proceed to scan in the foreground. Wall-clock savings up to the full test-suite duration on projects with slow tests.
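
The tier-bucket merge can be sketched as below. This is not the init.cjs implementation: `mergeSkillsForTierSketch` is a hypothetical name, case normalization is applied to the tier only, and the placement of skills in the standard and complex buckets in the demo is illustrative.

```javascript
const TIER_RANK = { TRIVIAL: 0, SIMPLE: 1, STANDARD: 2, COMPLEX: 3 };

// Merges the three buckets for one agent against a tier and dedupes.
// A null or unknown tier returns the full union (backward-compatible default).
function mergeSkillsForTierSketch(buckets, tier) {
  const always = buckets.skills || [];
  const standard = buckets.skills_standard || [];
  const complex = buckets.skills_complex || [];
  const key = typeof tier === "string" ? tier.toUpperCase() : null;
  const known = key !== null && Object.prototype.hasOwnProperty.call(TIER_RANK, key);

  let merged;
  if (!known) {
    merged = [...always, ...standard, ...complex];
  } else {
    merged = [...always];
    if (TIER_RANK[key] >= TIER_RANK.STANDARD) merged.push(...standard);
    if (TIER_RANK[key] >= TIER_RANK.COMPLEX) merged.push(...complex);
  }
  return [...new Set(merged)];
}

// Illustrative buckets; only the three always-loaded names come from the notes.
const programmer = {
  skills: ["codebase-scan", "scratchpad", "memory-pre-flight"],
  skills_standard: ["complexity-assessment"],
  skills_complex: ["strategic-analysis"],
};

console.log(mergeSkillsForTierSketch(programmer, "trivial").length); // 3
console.log(mergeSkillsForTierSketch(programmer, "STANDARD").length); // 4
console.log(mergeSkillsForTierSketch(programmer, null).length); // 5
```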

Changed

  • Skill preload behavior is tier-conditional from dispatch #1. Previously every dispatch loaded the full per-agent skill union regardless of complexity. Now init.cjs seeds tier via detectTier(task) (heuristic; refined by complexity-assessment once the workflow runs). Concrete effect: a trivial typo fix gives the programmer 3 preloaded skills (codebase-scan, scratchpad, memory-pre-flight) instead of the prior 6+, shrinking the per-dispatch prefix by ~28K chars. The full union still loads for COMPLEX-tier work — no regression for non-trivial flows.
  • memory_signal computation hoisted from per-dispatch CLI calls to workflow context_init cache across dev-workflow.md, quick-implement.md, and code-review.md. The same memory query "<task>" --signal=3 --json-compact aggregate was previously computed 7 times across the 3 workflows (3 dispatches in dev: programmer/code-reviewer/verifier; 2 in quick-implement: programmer/code-reviewer; 2 in code-review: code-reviewer/verifier). Now computed once at context_init, persisted to workflow.yaml::memory_signal_json, and read back via state read | jq -r '.memory_signal_json' in each dispatch's orchestrator-prep block. Saves up to 6 subprocess calls per workflow + makes the <memory_signal> block byte-stable across iterations (no risk of mid-workflow index-mutation producing different ordering across dispatches).
  • 8 over-cap skill descriptions trimmed back under the 1024-char soft cap. Was 1030-1233 chars (pre-folded-scalar parser fix masked 4 of them); now 740-900 chars. Trimmed redundant trigger-phrase repetition while preserving discoverability triggers and scope-boundary statements. Each preload-injected description appears in agent system prompts on every dispatch, so even small per-skill trims compound. Skills affected: lesson-extraction (1233→825), autoskill (1131→743), verification-patterns (1076→900), architecture-health-scanner (1049→777), code-review-guide (1049→900), memory-curation (1049→~900), council (1042→844), strategic-analysis (1030→740).
  • Agent body context-loading instructions updated for programmer / tester / architect / verifier. The numbered "Read X" steps now consolidate the inline-prefer language into the load step itself, listing exactly which <governing_rules> / <guardrails_inline> / <rubric_content> sub-tags each agent recognizes. Agents fall back to disk Reads only when the inline block is absent or a specific sub-tag is empty.
  • regression_baseline added to the "Valid phases" enumeration in dev-workflow.md (3 occurrences: --to validation, --only validation, error message template). Previously /devt:workflow --to regression_baseline would reject as invalid even though the step is wired into the workflow.
  • status.md routing entry added for phase=regression_baseline, sibling of the existing phase=scan row. Previously /devt:status would show a blank suggestion line for a workflow stopped at the baseline phase.

Fixed

  • scripts/smoke-test.sh Agent IO Contracts gate ROOT env propagation bug. ROOT="$(cd ...)" was set but never exported; the gate's node -e "..." subprocess saw process.env.ROOT === undefined, crashed with TypeError [ERR_INVALID_ARG_TYPE]: path argument must be of type string on the first path.join(root, ...) call, and silently aborted the entire smoke test mid-run (at 321/401 passes, no FAIL line emitted because the abort happened before any fail() call). Two months of green-looking smoke runs were actually hiding a hard-aborting check. Fixed by prefixing the node call with ROOT="$ROOT" to scope-pass the var without mutating ...

v0.41.0

16 May 12:19

Sidecar migration wave + deterministic pre-verifier gate. Test-summary joins impl-summary and verification as a sidecar-routed artifact; impl-summary gains structured gates.{lint,typecheck,test} fields capturing the programmer's quality-gate execution; a new zero-dep grader.cjs runs as a pre-verifier gate that short-circuits the LLM verifier on red-test cycles (saves ~5–15K input tokens per failed iteration, up to ~45K per 3-iteration cycle). The grader gate's routing logic distinguishes three envelope shapes — I/O failures (sidecar missing/malformed/non-object, rubric missing, rubric ## Deterministic Gates JSON malformed, path-traversal in config) route to BLOCKED so the programmer isn't retried on something they can't fix; constraint violations route to RETRY/PRUNE under the verify_iteration cap; greens dispatch the LLM verifier. The two-call merge precedence is documented explicitly: strictest outcome wins (any ok:false → BLOCKED). Project-local rubric overrides land at .devt/rubrics/<file> and are picked up before plugin defaults; relative paths are scoped to their trusted root, absolute paths bypass the check (operator opt-in). Pre-flight deny messages now carry an explicit recovery template so agents without the memory-pre-flight skill can recover from the deny output alone. Skill frontmatter is now structurally validated at smoke time per Anthropic's official Skills guide. Smoke: 398 passed, 0 failed.

Added

  • test-summary.json sidecar registered in state.cjs::JSON_SIDECAR_SCHEMAS (status enum mirrors the prior markdown ARTIFACT_SCHEMA, verdict enum mirrors impl-summary's {PASS, FAIL, INDETERMINATE}, agent gated to tester). The tester now emits both .md (human review) and .json (workflow routing) at gate time. SIDECAR_FOR_MARKDOWN maps test-summary.md → test-summary.json; validateConsistency() reads status through the sidecar for this artifact. JSON shape includes tests.{added,passed,failed,skipped}_count, test_files[], failures[], and concerns[] — the count fields feed the new deterministic grader directly. workflows/dev-workflow.md and quick-implement.md updated to read status via state read-sidecar test-summary.json instead of grepping the markdown's ## Status header.
  • impl-summary.json::gates schema extends the programmer sidecar with structured quality-gate execution fields: gates.lint.{ran, passed, errors, warnings}, gates.typecheck.{ran, passed, errors}, gates.test.{ran, passed, passed_count, failed_count, skipped_count}. Converts "did the programmer run tests" from prose-in-the-markdown into machine-readable fields the deterministic grader inspects directly. Existing verdict/status/requirements_* fields unchanged.
  • bin/modules/grader.cjs (NEW, zero-dep stdlib only): extracts the ## Deterministic Gates JSON block from a rubric markdown, walks the constraint tree against a sidecar's parsed JSON, returns {pass, gate_failures: [{field, expected, got}]}. Constraint leaves: scalar (equality), array (oneOf). Nested objects recurse with a dotted field path. CLI: node bin/devt-tools.cjs grade <workflow_type> <sidecar.json> (exit 0 on pass, 1 on fail). Rubrics without a Deterministic Gates section short-circuit to pass:true (no enforcement).
  • references/rubrics/dev.v1.md Deterministic Gates section declares the dev-workflow constraints: test-summary.json.verdict = "PASS" + tests.failed_count = 0; impl-summary.json.verdict = "PASS" + gates.{lint,typecheck,test}.{ran,passed} = true. Projects override per-project in .devt/config.json::rubrics.dev by pointing at a customized rubric file.
  • Pre-verifier gate wired into workflows/dev-workflow.md — runs the grader against test-summary + impl-summary BEFORE the LLM verifier Task dispatch. Three-way envelope routing: {ok:false} → BLOCKED (I/O failure, not retryable); {ok:true, pass:false} → RETRY/PRUNE under verify_iteration cap; {ok:true, pass:true} → LLM verifier dispatches. On constraint-violation pass:false, participates in the same verify_iteration counter the LLM verifier path uses: under workflow.max_iterations cap (default 3) routes to programmer re-dispatch with gate_failures as <review_feedback>; at cap routes to PRUNE with gate_failures written to scratchpad and status=DONE_WITH_CONCERNS. Existence pre-gate extended to check JSON sidecars alongside the markdown artifacts — missing sidecars surface as BLOCKED early instead of through a generic grader I/O error. Skips the LLM verifier entirely on red-test cycles. Verifier's job under deterministic-gating narrows to semantic verification — did the implementation solve the task? — rather than re-grading test results the grader already proved.
  • Skill frontmatter smoke gate per Anthropic's "Complete Guide to Building Skills for Claude" (2026). Hard-fails on structural rules that would break Claude's skill loader: SKILL.md case-sensitive presence, no README.md inside skill folders, YAML frontmatter present, name = folder name in kebab-case, no claude/anthropic reserved name prefix, no XML angle brackets in frontmatter (security: frontmatter loads into Claude's system prompt). Soft-warns (informational, does not fail) on description >1024 chars or body >5000 words — the PDF lists these as guidelines, not loader requirements. Current state: 12 of 16 devt skills fully clean; 4 surface as soft-warn (architecture-health-scanner / autoskill / lesson-extraction / memory-curation — all are slightly over the 1024-char soft cap because they carry rich trigger-phrase lists, which the PDF also recommends for reliable triggering). Drift prevention surface for future skill additions.
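
The constraint walk described for bin/modules/grader.cjs can be sketched as below. This is a minimal illustration under stated assumptions, not the module's source: the function name gradeConstraints and its signature are hypothetical, and only the leaf semantics (scalar equality, array oneOf, nested recursion with a dotted field path) and the {pass, gate_failures} result shape come from the notes above.

```javascript
// Hypothetical sketch of the constraint walk; not the actual grader.cjs source.
function gradeConstraints(constraints, sidecar, prefix = "") {
  const gate_failures = [];
  for (const [key, expected] of Object.entries(constraints)) {
    const field = prefix ? `${prefix}.${key}` : key;
    const isObj = sidecar !== null && typeof sidecar === "object";
    const got = isObj ? sidecar[key] : undefined;
    if (Array.isArray(expected)) {
      // Array leaf: the sidecar value must be one of the listed options.
      if (!expected.includes(got)) gate_failures.push({ field, expected, got });
    } else if (expected !== null && typeof expected === "object") {
      // Nested object: recurse, extending the dotted field path.
      gate_failures.push(...gradeConstraints(expected, got, field).gate_failures);
    } else if (got !== expected) {
      // Scalar leaf: strict equality.
      gate_failures.push({ field, expected, got });
    }
  }
  return { pass: gate_failures.length === 0, gate_failures };
}

// Example constraints shaped like the dev rubric's test-summary gates.
const gates = { verdict: "PASS", tests: { failed_count: 0 } };
const sidecarJson = { verdict: "FAIL", tests: { failed_count: 2 } };
console.log(gradeConstraints(gates, sidecarJson));
```

In this example both verdict and tests.failed_count are reported as gate failures, each with its dotted field path.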

Changed

  • impl-summary gate-check now routes through the JSON sidecar. workflows/dev-workflow.md and workflows/quick-implement.md previously instructed the orchestrator to "Read .devt/state/impl-summary.md and check status", but impl-summary.md has carried no ## Status header since v0.33.0 (sidecar-only routing contract). The instruction worked anyway via implicit Claude adaptation, but the documented routing was stale. Migrated to explicit state read-sidecar impl-summary.json — same pattern Phase 1 established for test-summary. All 3 sidecar-covered artifacts (impl-summary, test-summary, verification) now route uniformly through the CLI helper.

Fixed

  • Pre-flight deny message includes explicit recovery template. hooks/pre-flight-guard.sh replaces the single-paragraph reason with action-led multi-line output: leads with the literal PREFLIGHT <ISO-8601-timestamp> edit <path> :: <governing-IDs> template + a single-word ungoverned fallback keyword. Agents that haven't preloaded devt:memory-pre-flight can recover from the deny output alone instead of looping on the bare "missing PREFLIGHT line" diagnosis (root cause of the 9-deny stuck pattern surfaced in /devt:status field reports). The deny-record reason field in preflight-denies.jsonl is unchanged (terse for log scanning); only the user-facing message is enriched. Smoke gate added asserting the deny stdout contains the literal template substrings.
  • Grader propagates I/O errors as ok:false, not silently as pass:false. bin/modules/grader.cjs::run previously emitted {ok:true, pass:false} when the rubric file was missing on disk, masking an operator-level problem as a constraint violation that would re-dispatch the programmer in an infinite loop (bounded by max_iterations, but burning 3 dispatches before PRUNE). Now propagates the error field from gradeArtifact as the top-level ok:false envelope. Sidecar-missing/malformed already returned ok:false; this fix completes the symmetry — all I/O failures from the grader route to BLOCKED. CLAUDE.md "Deterministic pre-verifier gate" entry expanded to document the three envelope shapes + the custom-agent / no-test-runner friction with copy-pasteable rubric override example.
  • Rubric path resolution supports project-local overrides. grader.cjs::resolveRubricPath previously hardcoded path.join(PLUGIN_ROOT, "references", "rubrics", file), which meant path.isAbsolute paths were silently mangled and ../ escapes resolved inside the plugin tree — the documented escape hatch was non-functional. Now uses three-layer resolution: (1) absolute paths in config → use directly, (2) project-local <projectRoot>/.devt/rubrics/<file> if it exists → that, (3) plugin default fallback. Users hitting the gates friction can drop a lenient rubric at .devt/rubrics/dev-lenient.md and reference it by name in .devt/config.json::rubrics.dev. CLAUDE.md updated with the corrected mechanics.
  • Malformed ## Deterministic Gates JSON surfaces as ok:false, not silent pass. extractDeterministicGates previously returned null on both "section missing" (by design, no enforcement) AND "JSON malformed" (silent bug, gate enforcement disabled with zero operator visibility). Now distinguishes the two: missing section still returns null (silent pass:true by design), but missing fence / malformed JSON / non-object root returns {error: "..."}, which gradeArtifact propagates as {pass:false, error} and run lifts to ok:false. Operator edits that break the rubric now fail loudly instead of silently.
  • Two-call merge precedence documented explicitly in workflow text. workflows/dev-workflow.md verify step now states the strictest-outcome-wins rule for combining GRADE_TS + GRADE_IS routing: ok:false (BLOCKED) > pass:false (RETRY/PRUNE) > pass:true (proceed). Without this, Claude could misroute when the two calls return different envelope shapes (e.g. one ok:false, other ok:true, pass:true).
  • Non-object sidecar payloads surface as ok:false. state.cjs::readSidecar previously crashed with TypeError: Cannot read properties of null (reading 'status') when a sidecar file contained literal null, a JSON array, or a scalar — the ...
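
The three envelope shapes and the strictest-outcome-wins merge from the entries above can be sketched as follows. This is an illustration only: in devt the routing is carried by workflow text, so mergeEnvelopes and routeMerged are hypothetical names, the "VERIFY" label is an assumed stand-in for "dispatch the LLM verifier", and the iteration comparison assumes a counter that routes to PRUNE once it reaches the cap.

```javascript
// Hypothetical helpers; names and the 'VERIFY' label are assumptions.
function mergeEnvelopes(gradeTs, gradeIs) {
  // Strictest outcome wins: any ok:false dominates, then any pass:false.
  const ok = gradeTs.ok === true && gradeIs.ok === true;
  const pass = ok && gradeTs.pass === true && gradeIs.pass === true;
  return { ok, pass };
}

function routeMerged(envelope, verifyIteration, maxIterations = 3) {
  if (!envelope.ok) return "BLOCKED"; // I/O failure, not retryable
  if (!envelope.pass) {
    // Constraint violation: retry under the cap, prune at the cap.
    return verifyIteration < maxIterations ? "RETRY" : "PRUNE";
  }
  return "VERIFY"; // both graders green: hand off to the LLM verifier
}

// One call hit an I/O failure, the other was green: the merge must block.
const merged = mergeEnvelopes({ ok: false }, { ok: true, pass: true });
console.log(routeMerged(merged, 1));
```

Merging before routing keeps the precedence in one place, so a mixed pair such as ok:false plus pass:true cannot be routed as a pass.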

v0.40.0

13 May 09:44


Graphify Cross-Cutting Concerns + god-node candidate seeding + CI hardening. The Pre-Flight Brief and discovery harvest now read graphify-out/GRAPH_REPORT.md to surface structural couplings before changes start and to seed the underused CON-* tier with high-fanin concept candidates. Smoke: 383 passed, 0 failed.

Added

  • Pre-Flight Brief absorbs GRAPH_REPORT.md sections. bin/modules/graphify.cjs::parseReportSections(reportPath) is a 4 MB-capped markdown header parser that pulls God Nodes, Surprising Connections, and Knowledge Gaps out of graphify's report. bin/modules/preflight.cjs::generate calls it once per Brief and renderBrief emits a new ## Cross-Cutting Concerns (graphify) section between Blast Radius and Recommendations — filtered to entries whose symbols overlap the topic (case-insensitive substring, ≥3 chars), capped at 5 god-nodes and 5 surprising connections. Section is omitted entirely when graphify isn't ready, the report is missing, or no entries overlap — Brief layout stays byte-stable for non-graphify projects.
  • Discovery seeds curator concept candidates from graphify god-nodes. bin/modules/discovery.cjs::harvestGraphifyGodNodes() reads the same parseReportSections output, strips trailing parens, skips private/module-shaped symbols, caps at top 10 by edge count, and filters out symbols already covered by an active CON/ADR via memory.affectsSymbol(). Composes alongside the existing 3 sources in harvest(); REJ tombstone keyword suppression and dedup against existing memory docs apply unchanged. Closes the gap where CON-* docs starved of candidates because session-time ⚖️/🔵 signals rarely surface structural concepts.
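
The topic-overlap filter for the Cross-Cutting Concerns section can be pictured roughly as below. This is a sketch under assumptions: overlapsTopic and filterEntries are hypothetical names, the entry shape ({symbol}) is invented for illustration, and "case-insensitive substring, ≥3 chars" is read here as "any topic term of at least 3 characters appears inside the symbol", which may differ from the real matching rule.

```javascript
// Hypothetical sketch; the real matching rule in preflight.cjs may differ.
function overlapsTopic(symbol, topic) {
  const haystack = symbol.toLowerCase();
  return topic
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((term) => term.length >= 3) // ignore terms shorter than 3 chars
    .some((term) => haystack.includes(term));
}

function filterEntries(entries, topic, cap = 5) {
  // Keep overlapping entries only, then apply the per-section cap.
  return entries.filter((e) => overlapsTopic(e.symbol, topic)).slice(0, cap);
}

const godNodes = [{ symbol: "AuthService" }, { symbol: "Db" }, { symbol: "authMiddleware" }];
console.log(filterEntries(godNodes, "fix auth in db"));
```

When no entry overlaps, the filtered list is empty, which corresponds to the case where the section is omitted entirely.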

Fixed

  • token-report --regression emits a stable JSON contract. When no Claude Code session logs exist for a project (fresh CI checkout), the missing-session-dir early-return now still emits the top-level regression block with zero counts, so the --fail-on-regression consumer and downstream automation can rely on the field shape. Previously the block was silently dropped on that branch, causing the smoke gate to fail in CI.
  • Release workflow promotes Latest by highest semver. .github/workflows/release.yml computes the highest stable tag including the current $TAG and passes --latest=true only when $TAG is that maximum. Prereleases keep their existing --prerelease path and are never flagged Latest. Guards against retags of older versions or hotfixes of older series stealing "Latest" from a higher release.
  • deferred list --tags=CSV filter works (was DEF-017). The list subcommand previously read --tag (singular) and only matched the first tag; now --tags=a,b,c parses to an array, OR-filters across items whose tags[] include any requested tag, and aligns with the documented canonical form.
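
The "Latest only when the current tag is the highest stable semver" rule from the release-workflow fix can be illustrated as follows. The workflow itself lives in .github/workflows/release.yml; this JavaScript is only a sketch of the comparison, and parseStable, compareVersions, and shouldMarkLatest are hypothetical names.

```javascript
// Hypothetical illustration of the comparison; not the workflow's own script.
function parseStable(tag) {
  // Accept vX.Y.Z only; prerelease or malformed tags yield null.
  const m = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(tag);
  return m ? m.slice(1).map(Number) : null;
}

function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

function shouldMarkLatest(currentTag, existingTags) {
  const current = parseStable(currentTag);
  if (!current) return false; // prereleases are never flagged Latest
  return [...existingTags, currentTag]
    .map(parseStable)
    .filter(Boolean)
    .every((v) => compareVersions(current, v) >= 0);
}

// A hotfix on an older series must not take Latest from a higher release.
console.log(shouldMarkLatest("v0.38.2", ["v0.39.0", "v0.38.1"]));
```

Comparing numeric components rather than strings matters once a component reaches two digits, for example v0.10.0 against v0.9.0.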

v0.39.0

13 May 06:01


Observability foundation. The MCP trace records now carry the active workflow's context, and mcp-stats gains three filter flags so per-workflow / per-phase / per-type slicing is possible — unlocks measuring whether the <memory_signal> extensions from v0.38.x actually save the predicted MCP round trips. Smoke: 379 passed, 0 failed. Locking: 3/3.

Added

  • Workflow context on MCP trace records. bin/devt-memory-mcp.cjs gains a readWorkflowContext() helper that reads .devt/state/workflow.yaml on demand with mtime-invalidated caching — one stat() syscall per MCP call when nothing changed, full re-read on workflow transitions. Each trace record emitted while a workflow is active now carries workflow_id, workflow_type, and phase fields merged into the existing schema. Records emitted outside any workflow omit these fields entirely (cleanest signal). Existing record fields (ts, tool, ok, duration_ms, …) take precedence on the unlikely collision.
  • Three new filter flags in bin/devt-tools.cjs mcp-stats: --workflow-id=<UUID>, --workflow-type=<dev|code_review|…>, --phase=<implement|verify|…>. Filters compose conjunctively with the existing --since and --tool — e.g. mcp-stats --workflow-type=dev --phase=verify --tool=query_fts shows verifier-phase memory lookups across all dev workflows. Trace records lacking a field are excluded when its filter is set; bare aggregate behavior is unchanged.
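
The conjunctive filter semantics for mcp-stats can be sketched like this. It is a minimal illustration, assuming trace records are plain objects carrying the workflow_id, workflow_type, phase, and tool fields named above; filterTraces and the fixture are hypothetical and not taken from bin/devt-tools.cjs.

```javascript
// Hypothetical sketch of conjunctive trace filtering.
function filterTraces(records, filters) {
  // Only filters that were actually supplied take part in the match.
  const active = Object.entries(filters).filter(([, want]) => want !== undefined);
  // A record lacking a filtered field compares as undefined and is excluded.
  return records.filter((rec) => active.every(([field, want]) => rec[field] === want));
}

// Illustrative fixture: the last record was emitted outside any workflow.
const traces = [
  { tool: "query_fts", workflow_id: "wf-A", workflow_type: "dev", phase: "implement" },
  { tool: "query_fts", workflow_id: "wf-A", workflow_type: "dev", phase: "verify" },
  { tool: "get_doc", workflow_id: "wf-B", workflow_type: "code_review", phase: "verify" },
  { tool: "get_doc" },
];

console.log(filterTraces(traces, { workflow_type: "dev", phase: "verify" }).length);
```

With no filters set, every record passes, which matches the unchanged bare aggregate behavior.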

Internal

  • 5 new smoke-test gates: bare aggregate over a synthesized 4-record fixture (counts all 4), --workflow-id=wf-A (narrows to 2), conjunctive --workflow-type=dev --phase=verify (narrows to 1), unknown workflow_id returns 0 cleanly, and a live MCP server boot test that fires a real tools/call JSON-RPC request and asserts the resulting trace record carries workflow_id + workflow_type + phase from the active workflow.yaml.
  • Workflow-context regexes hardcoded (not built via new RegExp(varName)) so Semgrep's ReDoS analysis can prove the patterns are bounded.

v0.38.1

13 May 07:49


Small composing additions: memory signal extended to more dispatches, narrow git destructive patterns added to the Bash safety hook, and a new input-JSON schema registry validates handoff.json for resume reliability.

Added

  • <memory_signal> extended to programmer + code-reviewer dispatches across workflows/dev-workflow.md, workflows/code-review.md, and workflows/quick-implement.md — five new dispatch sites total. Each uses the same orchestrator-prep memory query --signal=3 pattern shipped earlier for verifiers, so programmer and code-reviewer skip per-doc memory round trips on their initial scan. agents/programmer.md and agents/code-reviewer.md instruct preferring the inline block over fresh queries; programmer uses it to confirm which ADRs/CONs apply to the code path, code-reviewer uses it to flag REJ-tombstone matches and ADR violations. KEEP-IN-SYNC discipline extended to cover the 5-site cluster.
  • Git-destructive Bash patterns (source: "git_destructive") in bin/modules/bash-guard.cjs. Three narrow patterns with zero legitimate dev use: (1) force-push to a protected branch (main, master, release/*, prod*, develop) — --force-with-lease to the same branches is explicitly allowed as the safer variant; (2) git clean -x (any flag combo containing x) — nukes gitignored files including .env; (3) git checkout -- . or git checkout -- * mass-discard. git reset --hard is deliberately NOT denied so devt's own self-update flow in workflows/update.md continues to work.
  • JSON_INPUT_SCHEMAS registry in bin/modules/state.cjs with a validateInputJson(body, schema) helper. Distinct from JSON_SIDECAR_SCHEMAS — sidecars validate enum membership (status/verdict/agent); input schemas validate required + recommended top-level fields. handoff.json is the first registered entry: required = [task, phase, paused_at], recommended = [tier, iteration, last_commit, remaining_tasks, next_action]. state validate now surfaces a missing_required_field mismatch when handoff.json exists but lacks a required field, catching resume-after-pause breakage before it silently corrupts a routing decision.
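
A required-plus-recommended check of the kind described for handoff.json might look like the sketch below. The field lists are copied from the entry above; the return shape ({missing_required, missing_recommended}) is an assumption made for illustration, and the real validateInputJson in bin/modules/state.cjs may report mismatches differently.

```javascript
// Field lists from the release notes; everything else here is hypothetical.
const HANDOFF_SCHEMA = {
  required: ["task", "phase", "paused_at"],
  recommended: ["tier", "iteration", "last_commit", "remaining_tasks", "next_action"],
};

function validateInputJson(body, schema) {
  // Only plain objects can carry top-level fields.
  const isObject = body !== null && typeof body === "object" && !Array.isArray(body);
  const has = (field) => isObject && Object.prototype.hasOwnProperty.call(body, field);
  return {
    missing_required: schema.required.filter((f) => !has(f)),
    missing_recommended: schema.recommended.filter((f) => !has(f)),
  };
}

// A handoff written without paused_at would be flagged before resume.
console.log(validateInputJson({ task: "add login", phase: "verify" }, HANDOFF_SCHEMA));
```

Keeping required and recommended separate lets a caller fail hard on the first list while only warning on the second.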

Internal

  • 9 new smoke-test gates covering the 5 <memory_signal> dispatch sites, 3 orchestrator-prep step invocations, agent guidance presence in programmer + code-reviewer, force-push deny + --force-with-lease allow, git clean -fdx deny, mass-discard deny, devt self-update compatibility (regression guard), JSON_INPUT_SCHEMAS registry export, validateInputJson happy path, and end-to-end state validate surfacing missing required field.
  • New MISMATCH_REASONS.MISSING_REQUIRED_FIELD entry — used by validateConsistency when an input JSON parses but lacks a contractually required field.

v0.37.0

12 May 20:52


Cache-friendliness, CI hardening, and a strict documentation discipline pass. Smoke: 340 passed, 0 failed. Locking: 3/3.

Added

  • Cache-friendly dispatch ordering across every workflow. Every Task(subagent_type="devt:*", ...) dispatch in workflows/*.md (25 dispatches across 11 files) was reordered so the per-task dynamic block (<task> or, for workflows/debug.md, <bug>) appears AFTER </context>. Static blocks (<governing_rules>, <guardrails_inline>, <workflow_type>, <rubric_path>) now lead the prompt so the Anthropic prompt-cache prefix is byte-stable across retry iterations within the 5-minute TTL. Subagent dispatches in a single workflow run now cache-hit each other's prefixes, paying ~10% of full input price on the static portion.
  • Cache-ordering smoke gate. New scripts/check-dispatch-ordering.cjs walks every dispatch block and rejects any <task> that precedes </context>. Wired into scripts/smoke-test.sh as a new section; runs on every CI push.
  • devt:tokens --regression mode. bin/modules/token-report.cjs exposes detectRegressions(records, opts) and the --regression, --regression-min-input, --regression-streak CLI flags. The detector scans per-turn JSONL records for streaks of "cold" turns (cache_read_tokens == 0 with input_tokens >= min_input_tokens, default 5000) running ≥4-in-a-row (default). A streak is a near-certain signature of a dispatch-template ordering regression. Output: sessions_with_regression, total_cold_turns, est_wasted_input_tokens, offending_sessions[].streaks[]. Documented in workflows/tokens.md.
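
The cold-streak detection behind --regression can be sketched for a single session as below. This assumes each per-turn record exposes cache_read_tokens and input_tokens as described above; findColdStreaks, its option names, and the returned streak shape are hypothetical and simpler than the real detectRegressions output.

```javascript
// Hypothetical single-session sketch; not the token-report.cjs implementation.
function findColdStreaks(turns, { minInputTokens = 5000, minStreak = 4 } = {}) {
  const streaks = [];
  let start = -1; // index where the current cold run began, or -1 if none

  const close = (end) => {
    if (start !== -1 && end - start >= minStreak) {
      const run = turns.slice(start, end);
      streaks.push({
        start,
        length: end - start,
        input_tokens: run.reduce((sum, t) => sum + t.input_tokens, 0),
      });
    }
    start = -1;
  };

  turns.forEach((turn, i) => {
    // A turn is "cold" when nothing was read from cache despite a large input.
    const cold = turn.cache_read_tokens === 0 && turn.input_tokens >= minInputTokens;
    if (cold) {
      if (start === -1) start = i;
    } else {
      close(i);
    }
  });
  close(turns.length); // flush a run that reaches the end of the session
  return streaks;
}

const warm = { cache_read_tokens: 4000, input_tokens: 6000 };
const cold = { cache_read_tokens: 0, input_tokens: 6000 };
console.log(findColdStreaks([warm, cold, cold, cold, cold, warm, cold, cold, cold]));
```

In this example only the four-turn run qualifies; the trailing three-turn run stays below the default streak length.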

Changed

  • Codebase-wide version/option/wave/D-NN reference removal. Every devt-internal version marker (v0.X.Y+, since v0.A.B, Phase N (v0.X.Y+), Option N, Wave A, D-NN, CCA v27 §X, roadmap pointers) has been stripped from every .md, .cjs, and .sh comment / prose surface — agents, workflows, skills, hooks, guardrails, references, docs, READMEs, CLAUDE.md. The codebase is no longer a parallel changelog; CHANGELOG.md + git log are the canonical sources for "when did X land". Third-party version markers (Graphify v0.7.10+, Node v22, model IDs) are preserved.
  • CLAUDE.md "Key Conventions" extended with three new rules: cache-friendly dispatch ordering, documentation discipline (no version refs in code), and comment discipline (comments reserved for non-obvious WHY).

Fixed

  • CI smoke-test exit 1 on Node 22 / Node 24. The memory.upsertDoc + MCP write surface smoke check captured stderr (2>&1) which let Node 22/24's node:sqlite ExperimentalWarning contaminate UPSERT_OUT, breaking the JSON.parse that validates the upsert result. Switched to 2>/dev/null to match the surrounding captures. Locally on Node 26 (where the warning is silent) the check always passed; CI on Node 22/24 now matches.
  • GitHub Actions Node-20 deprecation warnings. Bumped actions/checkout@v4 → @v5 and actions/setup-node@v4 → @v5 in .github/workflows/ci.yml and .github/workflows/release.yml.

v0.36.0

12 May 12:43


Two waves consolidated into one release. v0.35.0's Wave A (6 options) was authored in a prior session but never published; v0.36.0's wave adds 3 more options on top. Ships 9 architectural improvements drawn from the ticklish-mapping-backus.md 12-option roadmap. Smoke: 336 passed (was 325 pre-wave). Locking: 3/3. Plugin contract surface: stable (no breaking changes to commands, agents, or hooks).

Added — v0.36.0 wave (Options 9a, 10, 11)

  • Parallel researcher + arch_health dispatch (Option 9a). COMPLEX-tier dev flows now dispatch the researcher and (when arch_health is opted-in via risk-signal AskUserQuestion) the architect in one message with two Task tool calls from workflows/dev-workflow.md Step 2.5. The arch_health architect dispatch reads .devt/state/scan-results.md only — the plan.md dependency dropped since the plan does not yet exist at parallel-dispatch time. Inline Auto-Plan consumes both research.md AND arch-health-scan.md. Workflow carries <!-- parallel-dispatch: researcher + architect (arch_health mode) --> marker; smoke test asserts presence + absence of regressions.
  • Memory Graph subgraph in Pre-Flight Brief (Option 10). bin/modules/memory.cjs::getSubgraphTriples(seedIds, depth=2, maxTriples=50) reshapes per-seed getLinks rows into a deduped, sorted {source, predicate, target} array. bin/modules/preflight.cjs::renderBrief emits a new ## Memory Graph (2-hop subgraph) section between Governing Documentation and Rejected Approaches. Agents scan structural relationships (supersedes, depends_on, relates_to, etc.) without per-doc get_doc round-trips. Smoke: 2 linked ADRs produce 2 expected triples.
  • Pinned rubric versions (Option 11). references/rubrics/dev.md renamed to dev.v1.md. New bin/modules/config.cjs::DEFAULTS.rubrics block (default { dev: "dev.v1.md" }) exposed at the top of the init payload as rubrics. workflows/dev-workflow.md verifier dispatch injects <rubric_path>references/rubrics/{rubrics.dev}</rubric_path>; agents/verifier.md prefers that block over computing the path from <workflow_type>. Future rubric updates ship as new files (dev.v2.md); projects opt in by overriding rubrics.dev in .devt/config.json. Naming convention: <workflow_type>.v<N>.md.
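
The 2-hop Memory Graph reshaping can be pictured with the sketch below. It is not the memory.cjs implementation: subgraphTriples is a hypothetical name, links are passed in as an in-memory array instead of coming from getLinks, and the traversal treats links as undirected when expanding the frontier, which is an assumption.

```javascript
// Hypothetical sketch: dedupe, expand up to `depth` hops, sort, then cap.
function subgraphTriples(links, seedIds, depth = 2, maxTriples = 50) {
  const triples = new Map(); // keyed by the triple itself to drop duplicates
  const visited = new Set(seedIds);
  let frontier = new Set(seedIds);

  for (let hop = 0; hop < depth && frontier.size > 0; hop++) {
    const next = new Set();
    for (const { source, predicate, target } of links) {
      if (!frontier.has(source) && !frontier.has(target)) continue;
      triples.set(JSON.stringify([source, predicate, target]), { source, predicate, target });
      for (const id of [source, target]) {
        if (!visited.has(id)) {
          visited.add(id);
          next.add(id);
        }
      }
    }
    frontier = next;
  }

  const cmp = (x, y) => (x < y ? -1 : x > y ? 1 : 0);
  return [...triples.values()]
    .sort((a, b) => cmp(a.source, b.source) || cmp(a.predicate, b.predicate) || cmp(a.target, b.target))
    .slice(0, maxTriples);
}

const links = [
  { source: "ADR-001", predicate: "supersedes", target: "ADR-000" },
  { source: "ADR-000", predicate: "relates_to", target: "CON-001" },
  { source: "ADR-001", predicate: "supersedes", target: "ADR-000" }, // duplicate row
  { source: "CON-002", predicate: "depends_on", target: "CON-003" }, // unreachable
];
console.log(subgraphTriples(links, ["ADR-001"]));
```

Sorting the deduped triples keeps the rendered section stable between runs for the same memory state.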

Added — v0.35.0 carryover wave (Options 1, 2, 4, 5, 6, 8)

  • Hot-path read cache: governing rules wiring (Option 1). bin/modules/init.cjs::loadGoverningRules returns the project's CLAUDE.md + .devt/rules/*.md contents inline in the init payload as governing_rules: {content, paths_included, paths_excluded, rules_hash, total_bytes}. Cap is 96 KB total. Workflows dev-workflow.md, quick-implement.md, code-review.md, research-task.md inject a <governing_rules rules_hash="..."> block (with <claude_md>, <coding_standards>, <architecture>, <quality_gates>, <review_checklist> sub-tags) into code-reviewer, verifier, and researcher dispatches. Those agents prefer inline content over on-disk Reads when present. rules_hash (SHA-256 first 16 chars) lets agents detect mid-workflow drift.
  • MCP write surface for curator (Option 2). bin/modules/memory.cjs::upsertDoc({frontmatter, body}) atomically writes a .devt/memory/<subdir>/<ID>-<slug>.md file AND refreshes the FTS5 index in one call. Validates frontmatter BEFORE touching disk; rolls back file write if index rebuild fails. bin/devt-memory-mcp.cjs exposes memory_upsert_doc tool gated by DEVT_MCP_ALLOW_WRITES=1 (set by plugin's .mcp.json env block by default). listTools() filters out write tools when the flag is unset; callTool() re-checks at handler level. agents/curator.md instructs the curator to call memory_upsert_doc first and fall back to the legacy 3-tool ritual on WRITES_DISABLED error.
  • Sidecar-only status routing (Option 4). impl-summary.md + verification.md no longer carry a ## Status header in their markdown templates. JSON sidecars (impl-summary.json / verification.json) are the single source of truth for workflow routing. bin/modules/state.cjs::SIDECAR_FOR_MARKDOWN maps markdown → sidecar; validateConsistency() reads the sidecar's status field for these artifacts. Other 7 ARTIFACT_SCHEMA artifacts keep the markdown ## Status header until backfilled with their own sidecars in a future wave (Path A of Option 4, deferred).
  • Stable-prefix invariant smoke test (Option 5). Asserts that the byte-prefix of init payloads is stable across task-string variations — guards against accidentally moving task-text into a prefix-position that would defeat cache hits.
  • Memory query aggregate flags (Option 6). bin/modules/memory.cjs::queryFTS accepts a mode option — "full" (default), "count", "top", "domain-counts". CLI surfaces: memory query "<terms>" --count|--top=N|--domain-counts|--json-compact. MCP exposes query_fts_count, query_fts_top, query_fts_by_domain. Aggregates return ~50–500 B vs ~1.5–15 KB for full payloads — memory-pre-flight skill documents the "aggregate-first" probe pattern.
  • Hook profile docs resync (Option 8). Updated the hook-profile table in CLAUDE.md to reflect the current minimal | standard | full set.
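
The rules_hash drift check from the governing-rules entry can be sketched with Node's built-in crypto module. The notes only say "SHA-256 first 16 chars"; that this means the first 16 hex characters of the digest, and that the hash covers the concatenated rule contents, are assumptions, and rulesHash and hasDrifted are hypothetical names.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: first 16 hex chars of the SHA-256 digest (assumed encoding).
function rulesHash(content) {
  return crypto.createHash("sha256").update(content, "utf8").digest("hex").slice(0, 16);
}

// An agent could compare the hash it was dispatched with against a fresh one.
function hasDrifted(dispatchedHash, currentContent) {
  return rulesHash(currentContent) !== dispatchedHash;
}

const dispatched = rulesHash("# CLAUDE.md\nUse tabs.\n");
console.log(hasDrifted(dispatched, "# CLAUDE.md\nUse spaces.\n"));
```

A truncated hash is enough for drift detection because it only needs to distinguish successive versions of the same rule set, not resist deliberate collisions.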

Changed

  • references/rubrics/dev.md → references/rubrics/dev.v1.md (rename, full content preserved). Future rubric revisions ship as new versioned files.
  • workflows/dev-workflow.md Step 2.7 deleted — its risk-signal detection + user prompt logic moved into Step 2.5's parallel-dispatch block. Step 3's architect review prompt updated to reference the parallel dispatch instead of the deleted Step 2.7.
  • agents/verifier.md: prefers the dispatch-injected <rubric_path> over computing the path from <workflow_type>; falls back to <workflow_type>.v1.md lookup when the block is absent.
  • agents/code-reviewer.md, agents/verifier.md, agents/researcher.md: prefer the <governing_rules> dispatch block over on-disk Reads of CLAUDE.md + .devt/rules/*.md.
  • agents/programmer.md, agents/verifier.md: emit BOTH .md (narrative) AND .json (workflow-routing sidecar) per Option 4's sidecar-only contract. Markdown templates no longer carry ## Status for these two artifacts.
  • agents/curator.md: instructs the curator to call memory_upsert_doc first and fall back to the legacy 3-tool ritual on WRITES_DISABLED error.
  • bin/devt-memory-mcp.cjs: adds query_fts_count, query_fts_top, query_fts_by_domain, memory_upsert_doc tools; write tools filtered out via listTools() when DEVT_MCP_ALLOW_WRITES is unset.
  • bin/modules/state.cjs: new SIDECAR_FOR_MARKDOWN registry; validateConsistency() reads sidecar status for sidecar-covered artifacts.

Smoke

  • +11 new assertions in scripts/smoke-test.sh:
    • Option 9a (4): parallel-dispatch marker comment present; Step 2.7 deleted; arch_health dispatch reads scan-results.md only (no plan.md); no stale "from Step 2.7" references.
    • Option 10 (4): preflight Brief generated with seeded ADRs; Brief contains Memory Graph section header; section renders source → predicate → target triples; getSubgraphTriples returns flat {source, predicate, target} array.
    • Option 11 (3): verifier rubric resolved via DEFAULTS.rubrics.dev exists (dev.v1.md); init payload exposes rubrics.dev; dev-workflow verifier dispatch injects <rubric_path>.
  • 336 total pass (was 325 pre-wave). 3/3 locking assertions still pass.

Docs

  • CLAUDE.md — six new architecture doc blocks covering each shipped option, plus an updated entry for Option 11's rubrics config key.
  • docs/MEMORY.md — added aggregate-flag CLI variants under "CLI Surface"; added query_fts_count / query_fts_top / query_fts_by_domain / memory_upsert_doc rows under "MCP Server"; added Memory Graph bullet under "Tier 1 — Topic Pre-Flight".
  • README.md — added rubrics.dev config row under "Basic configuration".
  • skills/memory-pre-flight/SKILL.md — documents the aggregate-first probe pattern and the Memory Graph Brief section.

Notes for projects upgrading from v0.34.1

  • No config migration required. .devt/config.json keeps working unchanged.
  • Projects that subclassed references/rubrics/dev.md directly need to update their path — point to dev.v1.md (or override rubrics.dev in .devt/config.json).
  • MCP write surface (Option 2) is enabled by default via DEVT_MCP_ALLOW_WRITES=1 in the plugin's .mcp.json. Set to "0" or remove the env var to disable and force the legacy 3-tool path.

v0.30.5

08 May 22:19


Added

  • Forensic deny log for the pre-flight guard (hooks/pre-flight-guard.sh, skills/memory-pre-flight/SKILL.md). Every decision: "deny" (block mode) and every advisory (warn mode) now appends one line to .devt/state/preflight-denies.log — single-writer, append-only, gitignored under existing .devt/state/ rules. Format: <mode> <ISO-ts> <action> <file_path> :: missing PREFLIGHT line. Closes the silent-stall failure mode where a subagent dispatched without the devt:memory-pre-flight skill received a deny it didn't know how to satisfy, then went silent (no streaming output) for 600s until Claude Code's stream watchdog killed it. With the log, recovering agents read .devt/state/preflight-denies.log first to see their own prior denied attempts, then write the missing PREFLIGHT lines to scratchpad in order. Hook stays stateless — log is pure side-effect, never read by the hook itself. Wrapped in try-catch so a log failure can never block the deny path. Survives state reset via the v0.30.4 archive ring buffer (.devt/state/.archive/<ts>/preflight-denies.log), so post-mortem of stalled workflows is possible after the workflow finishes.
  • memory-pre-flight skill documents the deny-recovery sequence with explicit Read-log → Append-PREFLIGHT-lines → Retry steps. All 8 dev agents preload the skill, so the recovery protocol propagates without per-agent prompt edits.
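
The deny-log line format and its never-block-the-deny-path guarantee can be sketched as below. The real hook is hooks/pre-flight-guard.sh; this JavaScript only illustrates the documented format, and formatDenyLine, appendDenyLog, and the literal mode value "block" are assumptions.

```javascript
const fs = require("node:fs");

// Hypothetical formatter for: <mode> <ISO-ts> <action> <file_path> :: missing PREFLIGHT line
function formatDenyLine(mode, action, filePath, now = new Date()) {
  return `${mode} ${now.toISOString()} ${action} ${filePath} :: missing PREFLIGHT line`;
}

// Append-only write; any failure is swallowed so the deny decision still goes out.
function appendDenyLog(logPath, line) {
  try {
    fs.appendFileSync(logPath, line + "\n");
    return true;
  } catch {
    return false; // a logging failure must never block the deny path
  }
}

const line = formatDenyLine("block", "edit", "src/app.js");
console.log(line);
```

Returning a boolean instead of throwing mirrors the stated design goal that the log is a pure side effect of the hook.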

Changed

  • workflows/dev-workflow.md Step 1 gains a CONTRACT callout above the state update line: "Execute the next bash block VERBATIM. Do not paraphrase workflow_type=dev to workflow_type=workflow (the slash-command name)." Addresses the orchestrator-deviation bug where an agent invoked /devt:workflow and improvised workflow_type=workflow instead of executing the workflow file's workflow_type=dev literal — the entire downstream stall traced back to this single deviation. The v0.30.4 alias hint catches drift after the fact; this callout prevents drift in the first place.
  • CLAUDE.md Two-Tier Pre-Flight Protocol entry updated to mention the forensic deny log and point at the skill's recovery section.

Fixed

  • Silent watchdog stalls when subagent hits a deny without the memory-pre-flight skill loaded — fixed by giving the agent its own deny history via the new log so it can break out of silent reasoning and write the missing PREFLIGHT lines on retry. Root cause was orchestrator drift (separate fix above) plus agents lacking forensic visibility into hook denies (this fix).

Smoke

  • 2 new assertions (scripts/smoke-test.sh): hook deny appends correctly to preflight-denies.log; deny JSON contract (decision: "deny") still emitted alongside the log so the existing hook protocol is preserved. 273 total pass (was 271).