From 319b8e9108f65e7ac2660cb12b6d44b09f7bd50b Mon Sep 17 00:00:00 2001
From: David Abram
Date: Wed, 4 Mar 2026 10:44:43 +0100
Subject: [PATCH 01/39] cli: git agent trace attribution plan

---
 .../agent-trace-attribution-no-git-wrapper.md | 202 ++++++++++++++++++
 1 file changed, 202 insertions(+)
 create mode 100644 context/plans/agent-trace-attribution-no-git-wrapper.md

diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md
new file mode 100644
index 00000000..d628f45a
--- /dev/null
+++ b/context/plans/agent-trace-attribution-no-git-wrapper.md
@@ -0,0 +1,202 @@
+# Plan: agent-trace-attribution-no-git-wrapper
+
+## 1) Change summary
+Implement a no-git-wrapper attribution platform that preserves normal developer Git workflows while producing commit-level Agent Trace records, storing line-level attribution ranges, and reconciling rewritten commits across local and hosted rewrite events.
+
+## 2) Success criteria
+- Generated and stored trace data is compliant with the Agent Trace RFC (`https://agent-trace.dev/`) for required structure and semantics.
+- Every emitted trace record contains required fields (`version`, `id`, `timestamp`, `files`) and uses RFC 3339 timestamps plus UUID record IDs.
+- `vcs` data is valid Agent Trace shape (`type`, `revision`) and local implementation pins `vcs.type = "git"`.
+- File attribution shape matches spec nesting (`files[].conversations[].ranges[]`) with 1-indexed line ranges and valid contributor types (`human`, `ai`, `mixed`, `unknown`).
+- Conversation links use URI-formatted `url` values; optional `related[]` links are preserved when present.
+- AI contributor `model_id` values follow models.dev provider/model convention when available.
+- Developers keep standard workflows (`git commit`, `git rebase`, IDE commits) without replacing `git` on `PATH`.
+- Each finalized commit has one canonical Agent Trace record (`version = "0.1.0"`) attached to `refs/notes/agent-trace` and mirrored to backend storage.
+- Local rewrite events (`rebase`, `amend`) remap trace attribution with auditable method/confidence metadata.
+- Hosted rewrite events (GitHub/GitLab PR/MR updates and force-pushes) reconcile old/new commit identity with deterministic idempotency keys and replay-safe behavior.
+- Co-author trailer behavior uses only canonical identity `Co-authored-by: SCE ` when SCE contribution is present, with idempotent insertion.
+- Persistence schema supports trace storage, flattened range analytics, reconciliation runs, and rewrite mappings with quality states (`final`, `partial`, `needs_review`).
+
+## 3) Constraints and non-goals
+- In scope: local hook-based capture/finalization, notes distribution, DB ingestion/indexing, hosted reconciliation worker, confidence policies, and operational observability.
+- In scope: Agent Trace JSON as canonical interchange and source of truth for line-level attribution, with schema/field compliance to `https://agent-trace.dev/`.
+- In scope: MIME and distribution alignment for trace payloads (`application/vnd.agent-trace.record+json` in notes and persisted records).
+- In scope: one fixed SCE co-author identity for commit trailer UX metadata.
+- Out of scope: legal ownership/copyright inference, model training provenance, and polished real-time IDE UX.
+- Out of scope: replacing native git invocation, overriding human author/committer identity, or introducing multiple agent co-author identities.
+
+## 4) Task stack (T01..T15)
+- [ ] T01: Finalize implementation contract baseline (status:todo)
+  - Task ID: T01
+  - Goal: Translate architecture/hooks/identity/reconciler/schema into one implementation contract with strict invariants.
+  - Boundaries (in/out of scope):
+    - In: command contracts, metadata keys, confidence thresholds, failure policy, rollout acceptance gates, and Agent Trace field-level compliance matrix.
+    - Out: production code changes.
+  - Done when:
+    - One contract artifact exists and removes cross-doc ambiguity.
+    - Contract includes a normative mapping table from internal attribution structures to Agent Trace schema objects/fields.
+  - Verification notes (commands or checks):
+    - Structured contract checklist covering all source sections plus Agent Trace RFC required/optional field mapping.
+
+- [ ] T02: Define trace payload schema adapter and canonical metadata mapping (status:todo)
+  - Task ID: T02
+  - Goal: Create a schema adapter that maps internal attribution structures to Agent Trace-compliant record shape.
+  - Boundaries (in/out of scope):
+    - In: `vcs` fields, metadata reverse-domain keys, quality status mapping, contributor enum rules, and canonical field mapping.
+    - Out: runtime persistence and hook execution paths.
+  - Done when:
+    - A single adapter contract maps all required/optional Agent Trace fields used by this system.
+    - Adapter output contract is deterministic and reusable by local finalize and rewrite flows.
+  - Verification notes (commands or checks):
+    - Mapping tests for required fields and extension metadata placement.
+
+- [ ] T03: Implement trace payload builder and compliance validation suite (status:todo)
+  - Task ID: T03
+  - Goal: Implement payload construction and schema-validation tests on top of the adapter.
+  - Boundaries (in/out of scope):
+    - In: deterministic serialization, URI/date-time formatting, model_id normalization, and JSON schema compliance checks.
+    - Out: hook orchestration and DB/note write side effects.
+  - Done when:
+    - One builder path generates deterministic payloads for finalize and rewrite flows.
+    - Builder output passes JSON schema validation against the published Agent Trace trace-record schema.
+  - Verification notes (commands or checks):
+    - Unit tests for serialization determinism and metadata correctness.
+    - Schema-compliance tests for required fields, enum validation, URI/date-time format, and `files[].conversations[].ranges[]` nesting.
+
+- [ ] T04: Implement `pre-commit` staged checkpoint finalization contract (status:todo)
+  - Task ID: T04
+  - Goal: Bind pending checkpoints to staged content only and capture index/tree anchors.
+  - Boundaries (in/out of scope):
+    - In: no-op behavior for disabled/missing CLI/bare repo and staged-only attribution enforcement.
+    - Out: commit-note writes.
+  - Done when:
+    - Unstaged edits cannot be attributed during commit finalization.
+  - Verification notes (commands or checks):
+    - Hook fixture tests with mixed staged/unstaged edits.
+
+- [ ] T05: Implement `commit-msg` canonical co-author trailer policy (status:todo)
+  - Task ID: T05
+  - Goal: Add idempotent canonical SCE trailer injection when SCE-attributed staged changes exist.
+  - Boundaries (in/out of scope):
+    - In: `SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, dedupe behavior, canonical identity format.
+    - Out: rewriting human author/committer identity.
+  - Done when:
+    - Exactly one canonical trailer appears in all allowed SCE cases.
+  - Verification notes (commands or checks):
+    - Identity acceptance checklist scenarios 1-5, 8, and 10.
+
+- [ ] T06: Implement `post-commit` trace finalize and dual-write path (status:todo)
+  - Task ID: T06
+  - Goal: Emit commit trace after commit creation and write to notes + DB (or queue fallback).
+  - Boundaries (in/out of scope):
+    - In: parent SHA handling, notes ref policy, emission idempotency, and MIME tagging (`application/vnd.agent-trace.record+json`).
+    - Out: hosted reconciliation flow.
+  - Done when:
+    - New HEAD always produces a trace record with durable persistence semantics.
+  - Verification notes (commands or checks):
+    - End-to-end local commit tests including transient DB or notes outage.
+
+- [ ] T07: Add hook install and health validation (`sce doctor`) for local rollout (status:todo)
+  - Task ID: T07
+  - Goal: Provide deterministic setup validation for per-repo and global hook-path installs.
+  - Boundaries (in/out of scope):
+    - In: hook presence/permissions/config checks and actionable diagnostics.
+    - Out: hosted provider integration.
+  - Done when:
+    - Operators can verify hook readiness before enabling attribution enforcement.
+  - Verification notes (commands or checks):
+    - Doctor output tests for healthy, missing, and misconfigured hook states.
+
+- [ ] T08: Implement `post-rewrite` local remap ingestion pipeline (status:todo)
+  - Task ID: T08
+  - Goal: Ingest old->new SHA pairs from rewrite events and trigger remap pipeline.
+  - Boundaries (in/out of scope):
+    - In: rewrite type capture, temporary pairs-file parsing, idempotent replay behavior.
+    - Out: remote webhook event processing.
+  - Done when:
+    - Rebase/amend rewrites trigger deterministic remap processing without duplicate artifacts.
+  - Verification notes (commands or checks):
+    - Local rewrite fixture tests across amend and interactive/non-interactive rebase outcomes.
+
+- [ ] T09: Implement rewrite trace transformation semantics (status:todo)
+  - Task ID: T09
+  - Goal: Materialize new trace records for rewritten SHAs with explicit rewrite metadata.
+  - Boundaries (in/out of scope):
+    - In: new record ID/timestamp, `rewrite_from`, `rewrite_method`, `rewrite_confidence`, quality status logic, and preservation of RFC-compliant trace structure on rewritten commits.
+    - Out: provider-specific mapping heuristics.
+  - Done when:
+    - Rewritten traces preserve attribution continuity and auditability.
+  - Verification notes (commands or checks):
+    - Integration tests asserting metadata integrity and notes/DB parity.
+
+- [ ] T10: Ship core schema migrations (`repositories`, `commits`, `trace_records`, `trace_ranges`) (status:todo)
+  - Task ID: T10
+  - Goal: Establish foundational persistence tables, constraints, and indexes.
+  - Boundaries (in/out of scope):
+    - In: migration authoring and upgrade-safe execution.
+    - Out: reconciliation-run tables and mapping pipeline logic.
+  - Done when:
+    - Core schema applies cleanly and supports local commit ingestion.
+  - Verification notes (commands or checks):
+    - Migration tests with empty and preexisting DB states.
+
+- [ ] T11: Ship reconciliation schema and ingestion (`reconciliation_runs`, `rewrite_mappings`, `conversations`) (status:todo)
+  - Task ID: T11
+  - Goal: Add hosted rewrite persistence and idempotency-backed run bookkeeping.
+  - Boundaries (in/out of scope):
+    - In: run status lifecycle, mapping persistence, idempotency uniqueness, and indexes.
+    - Out: provider webhook transport implementation.
+  - Done when:
+    - Reconciliation runs and mappings can be stored and queried reproducibly.
+  - Verification notes (commands or checks):
+    - Referential-integrity tests and representative mapping/replay query checks.
+
+- [ ] T12: Implement hosted event intake and run orchestration (status:todo)
+  - Task ID: T12
+  - Goal: Accept GitHub/GitLab webhook events, verify signatures, and create replay-safe runs.
+  - Boundaries (in/out of scope):
+    - In: provider event parsing, old/new head resolution, idempotency key generation.
+    - Out: mapping heuristic internals.
+  - Done when:
+    - Duplicate events do not create duplicate side effects.
+  - Verification notes (commands or checks):
+    - Webhook signature and replay tests per provider.
+
+- [ ] T13: Implement mapping engine (patch-id, range-diff, fuzzy fallback) (status:todo)
+  - Task ID: T13
+  - Goal: Map old commits to new commits using strict staged matching with confidence scoring.
+  - Boundaries (in/out of scope):
+    - In: patch-id exact, range-diff hints, fuzzy thresholding (`>= 0.60`) and unresolved handling.
+    - Out: manual reviewer UI.
+  - Done when:
+    - Mapping outcomes are explainable, reproducible, and confidence-classified.
+  - Verification notes (commands or checks):
+    - Deterministic fixture tests for exact, ambiguous, unmatched, and low-confidence cases.
+
+- [ ] T14: Implement notes write-back fallback, retry queue, and observability metrics (status:todo)
+  - Task ID: T14
+  - Goal: Guarantee no trace loss when notes pushes fail and expose reconciliation/runtime telemetry.
+  - Boundaries (in/out of scope):
+    - In: DB-first fallback queue, retry processor, run metrics (`mapped/unmapped`, histogram, runtime/error class).
+    - Out: full operational dashboard productization.
+  - Done when:
+    - Failed notes pushes recover via retry and metrics expose operational state.
+  - Verification notes (commands or checks):
+    - Fault-injection and recovery tests with metric emission assertions.
+
+- [ ] T15: Validation and cleanup (status:todo)
+  - Task ID: T15
+  - Goal: Run full-system validation, sync context/docs, and leave implementation evidence for handoff.
+  - Boundaries (in/out of scope):
+    - In: local commit + rewrite + hosted rewrite + outage/retry scenario verification and context sync.
+    - Out: scope expansion beyond this architecture set.
+  - Done when:
+    - Every success criterion has evidence and no unresolved blocker remains.
+    - Plan checkboxes/status and verification evidence are fully updated.
+  - Verification notes (commands or checks):
+    - End-to-end scenario runbook with idempotent replay and confidence policy validation.
+    - Agent Trace compliance test report covering required fields, formats, nesting, enum constraints, and MIME expectations.
+    - Context sync review across architecture/overview/glossary/patterns to match resulting code truth.
+
+## 5) Open questions
+- Agent Trace RFC page shows a potential version-format mismatch (`version` schema pattern appears two-segment while examples and document header use `0.1.0`); implementation currently plans to emit `0.1.0` and keep parser tolerant.

From fb6d01e057938b850ba662dc7f6d52fabe108922 Mon Sep 17 00:00:00 2001
From: David Abram
Date: Wed, 4 Mar 2026 11:03:43 +0100
Subject: [PATCH 02/39] sce: Finalize Agent Trace implementation contract baseline

Create a canonical no-git-wrapper Agent Trace contract in context/sce/agent-trace-implementation-contract.md with normative invariants, hook/reconciliation contracts, metadata keys, confidence policy, failure/idempotency rules, compliance matrix, and internal-to-Agent-Trace mapping.

---
 context/context-map.md                        |   1 +
 context/glossary.md                           |   3 +-
 context/overview.md                           |   2 +
 .../agent-trace-attribution-no-git-wrapper.md |   3 +-
 .../agent-trace-implementation-contract.md    | 167 ++++++++++++++++++
 5 files changed, 173 insertions(+), 3 deletions(-)
 create mode 100644 context/sce/agent-trace-implementation-contract.md

diff --git a/context/context-map.md b/context/context-map.md
index 17cce365..4d30a97a 100644
--- a/context/context-map.md
+++ b/context/context-map.md
@@ -16,6 +16,7 @@ Feature/domain context:
 - `context/sce/workflow-token-footprint-manifest.json` (T05 canonical machine-readable surface manifest for workflow token counting, including scope extraction rules and conditional flags)
 - `context/sce/workflow-token-count-workflow.md` (root flake app contract for workflow token counting and its runtime wiring to evals script execution)
 - `context/sce/atomic-commit-workflow.md` (canonical `/commit` command + `sce-atomic-commit` skill contract and naming decision)
+- `context/sce/agent-trace-implementation-contract.md` (normative T01 implementation contract for no-git-wrapper Agent Trace attribution invariants, compliance matrix, and internal-to-Agent-Trace mapping)
 
 Working areas:
 - `context/plans/` (active plan execution artifacts, not durable history)
diff --git a/context/glossary.md b/context/glossary.md
index 97943ff4..99b7af2e 100644
--- a/context/glossary.md
+++ b/context/glossary.md
@@ -52,5 +52,4 @@
 - `change-to-plan thin orchestration contract`: `/change-to-plan` command-body pattern where the command stays wrapper-level and delegates clarification/ambiguity handling plus plan-shape contracts (including one-task/one-atomic-commit task slicing) to `sce-plan-authoring`, while keeping plan creation confirmation and `/next-task` handoff explicit.
 - `one-task/one-atomic-commit planning contract`: `sce-plan-authoring` requirement that each executable plan task represents one coherent commit unit; broad multi-commit tasks must be split into sequential atomic tasks before execution handoff.
 - `commit thin orchestration contract`: `/commit` command-body pattern where the command keeps staged-confirmation and proposal-only constraints, while `sce-atomic-commit` owns commit grammar and atomic split guidance.
-- `workflow token-count script` (T06): TypeScript implementation at `evals/token-count-workflows.ts` that reads `context/sce/workflow-token-footprint-manifest.json`, applies `entire-file`/`canonical-body-subsection` extraction rules, counts tokens with `o200k_base` fallback `cl100k_base`, and emits deterministic report artifacts.
-- `workflow token-count command` (T06): Bun script entry `token-count-workflows` in `evals/package.json`; canonical invocation is from `evals/` via `bun run token-count-workflows` with optional `--run-id`, `--baseline`, `--manifest`, and `--tokenizer` flags.
+- `agent trace implementation contract`: Canonical context artifact at `context/sce/agent-trace-implementation-contract.md` defining no-git-wrapper attribution invariants, hook/workflow contracts, confidence/quality policy, Agent Trace compliance matrix, and normative internal-to-Agent-Trace mapping for `agent-trace-attribution-no-git-wrapper`.
diff --git a/context/overview.md b/context/overview.md
index 63127579..c3fe01f0 100644
--- a/context/overview.md
+++ b/context/overview.md
@@ -22,6 +22,7 @@ The `/next-task` command body is intentionally thin orchestration: readiness gat
 Context sync now uses an important-change gate: cross-cutting/policy/architecture/terminology changes require root shared-file edits, while localized tasks run verify-only root checks without default churn.
 The `/change-to-plan` command body is also intentionally thin orchestration: it delegates clarification and plan-shape contracts to `sce-plan-authoring` (including one-task/one-atomic-commit task slicing) while keeping wrapper-level plan output and handoff obligations explicit.
 The `/commit` command body is intentionally thin orchestration: it retains staged-confirmation and proposal-only constraints while delegating commit grammar and atomic split guidance to `sce-atomic-commit`.
+The no-git-wrapper Agent Trace initiative baseline contract is defined in `context/sce/agent-trace-implementation-contract.md`, including normative invariants, compliance matrix, and canonical internal-to-Agent-Trace mapping for downstream implementation tasks.
 
 ## Repository model
 
@@ -72,3 +73,4 @@ Lightweight post-task verification baseline (required after each completed task)
 - Use `context/sce/workflow-token-count-workflow.md` for the root flake app contract (`nix run .#token-count-workflows`) and runtime wiring to the evals token-count script.
 - Use `evals/token-count-workflows.ts` (run via `nix run .#token-count-workflows` from repo root, or `bun run token-count-workflows` from `evals/`) for T06 static workflow token counting that emits deterministic reports to `context/tmp/token-footprint/`.
 - Use `context/sce/atomic-commit-workflow.md` for canonical `/commit` behavior, `sce-atomic-commit` naming, and proposal-only commit planning constraints.
+- Use `context/sce/agent-trace-implementation-contract.md` for canonical Agent Trace implementation invariants and field-level mapping guidance (`agent-trace-attribution-no-git-wrapper` T01 baseline).
diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md
index d628f45a..1ef73f35 100644
--- a/context/plans/agent-trace-attribution-no-git-wrapper.md
+++ b/context/plans/agent-trace-attribution-no-git-wrapper.md
@@ -26,7 +26,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer
 - Out of scope: replacing native git invocation, overriding human author/committer identity, or introducing multiple agent co-author identities.
 
 ## 4) Task stack (T01..T15)
-- [ ] T01: Finalize implementation contract baseline (status:todo)
+- [x] T01: Finalize implementation contract baseline (status:done)
   - Task ID: T01
   - Goal: Translate architecture/hooks/identity/reconciler/schema into one implementation contract with strict invariants.
   - Boundaries (in/out of scope):
@@ -37,6 +37,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer
     - Contract includes a normative mapping table from internal attribution structures to Agent Trace schema objects/fields.
   - Verification notes (commands or checks):
     - Structured contract checklist covering all source sections plus Agent Trace RFC required/optional field mapping.
+    - Contract artifact: `context/sce/agent-trace-implementation-contract.md`.
 
 - [ ] T02: Define trace payload schema adapter and canonical metadata mapping (status:todo)
   - Task ID: T02
diff --git a/context/sce/agent-trace-implementation-contract.md b/context/sce/agent-trace-implementation-contract.md
new file mode 100644
index 00000000..42042415
--- /dev/null
+++ b/context/sce/agent-trace-implementation-contract.md
@@ -0,0 +1,167 @@
+# Agent Trace Implementation Contract (No Git Wrapper)
+
+## Status
+- Plan: `agent-trace-attribution-no-git-wrapper`
+- Task: `T01`
+- Scope: implementation contract baseline only (no production code changes)
+- Normative keywords: `MUST`, `SHOULD`, `MAY`
+
+## 1. Objective
+Define one canonical, implementation-ready contract for Agent Trace attribution in this repository so later tasks (`T02`..`T15`) execute against a single set of invariants.
+
+## 2. Core invariants
+- Native Git workflows are preserved. Developers MUST continue to use normal Git entrypoints (`git commit`, `git rebase`, IDE commit UIs). This system MUST NOT replace `git` on `PATH`.
+- Canonical interchange is Agent Trace JSON. Local and hosted flows MUST treat Agent Trace records as the source of truth for line-level attribution.
+- Local VCS identity is fixed. Emitted records MUST set `vcs.type = "git"`.
+- One canonical finalized trace per commit. Each finalized commit SHA MUST map to one canonical Agent Trace record (`version = "0.1.0"`) attached to `refs/notes/agent-trace` and mirrored to backend persistence.
+- Co-author behavior is metadata-only UX. Human author/committer identity MUST NOT be rewritten by this system.
+- SCE co-author trailer, when applicable, MUST use exactly `Co-authored-by: SCE ` with idempotent insertion.
+
+## 3. Command and workflow contracts
+
+### 3.1 Local hook contracts
+- `pre-commit`
+  - MUST finalize attribution checkpoints from staged content only.
+  - MUST capture index/tree anchors needed for later commit binding.
+  - MUST no-op safely when disabled, missing CLI, or bare repository conditions apply.
+- `commit-msg`
+  - MUST apply canonical trailer policy only when staged SCE-attributed changes exist.
+  - MUST honor `SCE_DISABLED` and `SCE_COAUTHOR_ENABLED` controls.
+  - MUST deduplicate canonical trailer entries.
+- `post-commit`
+  - MUST build and finalize a trace for new `HEAD`.
+  - MUST dual-write to Git notes (`refs/notes/agent-trace`) and backend storage (or queue fallback on transient failures).
+  - MUST emit canonical trace media type `application/vnd.agent-trace.record+json`.
+- `post-rewrite`
+  - MUST ingest old->new commit pairs from rewrite events.
+  - MUST trigger deterministic remap processing with replay-safe idempotency.
+
+### 3.2 Hosted reconciliation contracts
+- Hosted intake (GitHub/GitLab PR/MR updates, force-push) MUST produce deterministic idempotency keys for replay-safe orchestration.
+- Reconciliation runs MUST preserve auditable old->new identity mapping and emit explicit confidence and quality outcomes.
+- Hosted rewrites MUST NOT mutate canonical attribution semantics beyond declared rewrite metadata fields.
+
+## 4. Canonical metadata keys
+
+All extension metadata keys MUST use reverse-domain namespaced keys under `dev.crocoder.sce.*`.
+
+Reserved key set:
+- `dev.crocoder.sce.quality_status` -> one of `final | partial | needs_review`
+- `dev.crocoder.sce.rewrite_from` -> previous commit SHA when record is rewritten
+- `dev.crocoder.sce.rewrite_method` -> rewrite method enum (for example `amend`, `rebase`, `force_push_reconcile`)
+- `dev.crocoder.sce.rewrite_confidence` -> normalized score `0.00`..`1.00`
+- `dev.crocoder.sce.idempotency_key` -> deterministic replay key for hosted/local remap orchestration
+- `dev.crocoder.sce.notes_ref` -> `refs/notes/agent-trace` when persisted via Git notes
+- `dev.crocoder.sce.content_type` -> `application/vnd.agent-trace.record+json`
+
+Rules:
+- Unknown `dev.crocoder.sce.*` keys MAY be added later but MUST be forward-compatible and ignored safely by consumers.
+- `quality_status`, `rewrite_*`, and `idempotency_key` fields MUST be preserved end-to-end if present.
+
+## 5. Confidence and quality policy
+
+### 5.1 Confidence scoring thresholds
+- `>= 0.90`: high confidence; eligible for `final` quality when all required invariants pass.
+- `0.60..0.89`: medium confidence; default `partial` unless explicit strict mapping criteria are met.
+- `< 0.60`: low confidence; MUST set quality `needs_review`.
+
+### 5.2 Quality status contract
+- `final`
+  - Required fields valid.
+  - Deterministic commit identity resolution complete.
+  - Attribution ranges structurally valid.
+- `partial`
+  - Required fields valid, but one or more confidence or remap guarantees are incomplete.
+- `needs_review`
+  - Any unresolved/low-confidence mapping, structural anomaly, or policy violation requiring operator inspection.
+
+## 6. Failure policy
+- Never lose trace intent:
+  - If notes write fails and DB write succeeds, system MUST enqueue retry for notes sync.
+  - If DB write fails and notes write succeeds, system MUST enqueue DB ingest retry.
+  - If both fail, system MUST persist retry intent with deterministic idempotency and emit operational error metrics.
+- Commit flow behavior:
+  - Attribution tooling SHOULD be fail-open for normal developer commit completion unless explicitly configured otherwise.
+  - Failures MUST be observable and replayable.
+- Idempotency:
+  - All finalize/rewrite pipelines MUST be safe to retry without duplicate canonical records for the same `(repo, commit_sha, trace_version)` tuple.
+
+## 7. Rollout acceptance gates
+
+Before enforcement is considered enabled in a repository, the following MUST pass:
+- Hook installation and health checks (`sce doctor`) report ready state.
+- At least one local commit path demonstrates staged-only attribution and canonical trace creation.
+- Notes + backend dual-write path is verified, including one forced transient outage scenario with retry success.
+- Local rewrite (`amend` and `rebase`) remap evidence shows deterministic old->new mapping outcomes.
+- Hosted replay/idempotency evidence demonstrates duplicate events do not produce duplicate side effects.
+- Compliance validation confirms Agent Trace required field presence and structural nesting.
+
+## 8. Agent Trace field-level compliance matrix
+
+### 8.1 Required Agent Trace fields
+
+| Agent Trace field | Requirement | Local contract rule |
+| --- | --- | --- |
+| `version` | required | MUST emit `0.1.0` |
+| `id` | required | MUST be UUID |
+| `timestamp` | required | MUST be RFC 3339 date-time |
+| `files` | required | MUST be non-empty when attributed file changes exist |
+
+### 8.2 VCS block
+
+| Agent Trace field | Requirement | Local contract rule |
+| --- | --- | --- |
+| `vcs.type` | required (when `vcs` present) | MUST be `git` |
+| `vcs.revision` | required (when `vcs` present) | MUST be finalized commit SHA |
+
+### 8.3 File attribution nesting
+
+| Agent Trace path | Requirement | Local contract rule |
+| --- | --- | --- |
+| `files[].conversations[]` | optional but used | SHOULD be present for attributed edits |
+| `files[].conversations[].url` | required in conversation object | MUST be URI-formatted |
+| `files[].conversations[].ranges[]` | required when conversation carries ranges | MUST exist and be valid |
+| `files[].conversations[].ranges[].start_line` | required | MUST be 1-indexed integer >= 1 |
+| `files[].conversations[].ranges[].end_line` | required | MUST be integer >= `start_line` |
+| `files[].conversations[].ranges[].contributor.type` | required | MUST be one of `human|ai|mixed|unknown` |
+| `files[].conversations[].ranges[].contributor.model_id` | conditional | AI entries SHOULD use `provider/model` (models.dev convention) when known |
+
+### 8.4 Optional links
+
+| Agent Trace field | Requirement | Local contract rule |
+| --- | --- | --- |
+| `related[]` | optional | MUST preserve when present |
+
+## 9. Normative mapping: internal attribution model -> Agent Trace
+
+| Internal model element | Agent Trace destination | Mapping rule |
+| --- | --- | --- |
+| `TraceDraft.version` | `version` | Constant `0.1.0` |
+| `TraceDraft.record_uuid` | `id` | UUID v4 string |
+| `TraceDraft.emitted_at` | `timestamp` | RFC 3339 UTC timestamp |
+| `CommitIdentity.sha` | `vcs.revision` | Finalized commit SHA |
+| `CommitIdentity.vcs_kind` | `vcs.type` | Constant `git` |
+| `FileAttribution.path` | `files[].path` | Repository-relative normalized path |
+| `ConversationRef.url` | `files[].conversations[].url` | Valid URI string |
+| `ConversationRef.related_urls[]` | `related[]` (or conversation-level related field if schema supports) | Preserve order and values |
+| `LineRange.start` | `files[].conversations[].ranges[].start_line` | 1-indexed integer |
+| `LineRange.end` | `files[].conversations[].ranges[].end_line` | Inclusive integer >= start |
+| `RangeAttribution.kind` | `files[].conversations[].ranges[].contributor.type` | Enum map: `human|ai|mixed|unknown` |
+| `RangeAttribution.model` | `...contributor.model_id` | `provider/model` when available |
+| `RewriteInfo.from_sha` | `metadata[dev.crocoder.sce.rewrite_from]` | Only for rewritten commits |
+| `RewriteInfo.method` | `metadata[dev.crocoder.sce.rewrite_method]` | Enum string |
+| `RewriteInfo.confidence` | `metadata[dev.crocoder.sce.rewrite_confidence]` | Decimal `0.00`..`1.00` |
+| `QualityState` | `metadata[dev.crocoder.sce.quality_status]` | `final|partial|needs_review` |
+| `TransportInfo.content_type` | `metadata[dev.crocoder.sce.content_type]` | Constant media type |
+| `TransportInfo.notes_ref` | `metadata[dev.crocoder.sce.notes_ref]` | Constant `refs/notes/agent-trace` |
+| `ReplayInfo.idempotency_key` | `metadata[dev.crocoder.sce.idempotency_key]` | Deterministic key |
+
+## 10. Version-format interoperability note
+- Known ambiguity: public RFC page currently shows a possible pattern/example mismatch for `version` formatting.
+- Contract decision: emit `version = "0.1.0"` canonically, and keep readers tolerant to equivalent semver-like variants where needed.
+
+## 11. Implementation sequencing implications
+- `T02` MUST implement schema adapter outputs matching section 8 and section 9.
+- `T03` MUST prove deterministic serialization + compliance validation against Agent Trace schema.
+- `T04`..`T09` MUST preserve invariants in sections 2 through 7.
+- `T10`..`T14` MUST persist metadata needed to support section 5 and section 6 without semantic loss.

From bf5ad10120e75ee811ddace3531b2ed17f4da057 Mon Sep 17 00:00:00 2001
From: David Abram
Date: Wed, 4 Mar 2026 11:29:45 +0100
Subject: [PATCH 03/39] cli: Implement Agent Trace schema adapter mapping contract

Add a deterministic adapter in services/agent_trace that maps internal attribution inputs to Agent Trace-shaped records. Pin git VCS identity, centralize reserved dev.crocoder.sce metadata keys, and cover required-field/contributor/metadata mapping with unit tests.

---
 cli/src/services/agent_trace.rs               | 355 ++++++++++++++++++
 cli/src/services/mod.rs                       |   1 +
 context/architecture.md                       |   1 +
 context/cli/placeholder-foundation.md         |   4 +-
 context/context-map.md                        |   1 +
 context/glossary.md                           |   1 +
 context/overview.md                           |   2 +
 .../agent-trace-attribution-no-git-wrapper.md |   5 +-
 context/sce/agent-trace-schema-adapter.md     |  46 +++
 9 files changed, 414 insertions(+), 2 deletions(-)
 create mode 100644 cli/src/services/agent_trace.rs
 create mode 100644 context/sce/agent-trace-schema-adapter.md

diff --git a/cli/src/services/agent_trace.rs b/cli/src/services/agent_trace.rs
new file mode 100644
index 00000000..ac24fddd
--- /dev/null
+++ b/cli/src/services/agent_trace.rs
@@ -0,0 +1,355 @@
+#![allow(dead_code)]
+
+use std::collections::BTreeMap;
+
+pub const TRACE_VERSION: &str = "0.1.0";
+pub const VCS_TYPE_GIT: &str = "git";
+pub const NOTES_REF: &str = "refs/notes/agent-trace";
+pub const TRACE_CONTENT_TYPE: &str = "application/vnd.agent-trace.record+json";
+
+pub const METADATA_QUALITY_STATUS: &str = "dev.crocoder.sce.quality_status";
+pub const METADATA_REWRITE_FROM: &str = "dev.crocoder.sce.rewrite_from";
+pub const METADATA_REWRITE_METHOD: &str = "dev.crocoder.sce.rewrite_method";
+pub const METADATA_REWRITE_CONFIDENCE: &str = "dev.crocoder.sce.rewrite_confidence";
+pub const METADATA_IDEMPOTENCY_KEY: &str = "dev.crocoder.sce.idempotency_key";
+pub const METADATA_NOTES_REF: &str = "dev.crocoder.sce.notes_ref";
+pub const METADATA_CONTENT_TYPE: &str = "dev.crocoder.sce.content_type";
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct TraceAdapterInput {
+    pub record_id: String,
+    pub timestamp_rfc3339: String,
+    pub commit_sha: String,
+    pub files: Vec<FileAttributionInput>,
+    pub quality_status: QualityStatus,
+    pub rewrite: Option<RewriteInfo>,
+    pub idempotency_key: Option<String>,
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct FileAttributionInput {
+    pub path: String,
+    pub conversations: Vec<ConversationInput>,
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct ConversationInput {
pub url: String, + pub related: Vec, + pub ranges: Vec, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RangeInput { + pub start_line: u32, + pub end_line: u32, + pub contributor: ContributorInput, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct ContributorInput { + pub kind: ContributorType, + pub model_id: Option, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RewriteInfo { + pub from_sha: String, + pub method: String, + pub confidence: String, +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum QualityStatus { + Final, + Partial, + NeedsReview, +} + +impl QualityStatus { + pub fn as_str(self) -> &'static str { + match self { + Self::Final => "final", + Self::Partial => "partial", + Self::NeedsReview => "needs_review", + } + } +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum ContributorType { + Human, + Ai, + Mixed, + Unknown, +} + +impl ContributorType { + pub fn as_str(self) -> &'static str { + match self { + Self::Human => "human", + Self::Ai => "ai", + Self::Mixed => "mixed", + Self::Unknown => "unknown", + } + } +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct AgentTraceRecord { + pub version: String, + pub id: String, + pub timestamp: String, + pub vcs: AgentTraceVcs, + pub files: Vec, + pub metadata: BTreeMap, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct AgentTraceVcs { + pub r#type: String, + pub revision: String, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct AgentTraceFile { + pub path: String, + pub conversations: Vec, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct AgentTraceConversation { + pub url: String, + pub related: Vec, + pub ranges: Vec, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct AgentTraceRange { + pub start_line: u32, + pub end_line: u32, + pub contributor: AgentTraceContributor, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct AgentTraceContributor { + pub r#type: String, + pub model_id: Option, +} + +pub fn 
adapt_trace_payload(input: TraceAdapterInput) -> AgentTraceRecord { + let mut metadata = BTreeMap::new(); + metadata.insert( + METADATA_QUALITY_STATUS.to_string(), + input.quality_status.as_str().to_string(), + ); + metadata.insert(METADATA_NOTES_REF.to_string(), NOTES_REF.to_string()); + metadata.insert( + METADATA_CONTENT_TYPE.to_string(), + TRACE_CONTENT_TYPE.to_string(), + ); + + if let Some(rewrite) = input.rewrite { + metadata.insert(METADATA_REWRITE_FROM.to_string(), rewrite.from_sha); + metadata.insert(METADATA_REWRITE_METHOD.to_string(), rewrite.method); + metadata.insert(METADATA_REWRITE_CONFIDENCE.to_string(), rewrite.confidence); + } + + if let Some(idempotency_key) = input.idempotency_key { + metadata.insert(METADATA_IDEMPOTENCY_KEY.to_string(), idempotency_key); + } + + let files = input + .files + .into_iter() + .map(|file| AgentTraceFile { + path: file.path, + conversations: file + .conversations + .into_iter() + .map(|conversation| AgentTraceConversation { + url: conversation.url, + related: conversation.related, + ranges: conversation + .ranges + .into_iter() + .map(|range| AgentTraceRange { + start_line: range.start_line, + end_line: range.end_line, + contributor: AgentTraceContributor { + r#type: range.contributor.kind.as_str().to_string(), + model_id: range.contributor.model_id, + }, + }) + .collect(), + }) + .collect(), + }) + .collect(); + + AgentTraceRecord { + version: TRACE_VERSION.to_string(), + id: input.record_id, + timestamp: input.timestamp_rfc3339, + vcs: AgentTraceVcs { + r#type: VCS_TYPE_GIT.to_string(), + revision: input.commit_sha, + }, + files, + metadata, + } +} + +#[cfg(test)] +mod tests { + use super::{ + adapt_trace_payload, ContributorInput, ContributorType, ConversationInput, + FileAttributionInput, QualityStatus, RangeInput, RewriteInfo, TraceAdapterInput, + METADATA_CONTENT_TYPE, METADATA_IDEMPOTENCY_KEY, METADATA_NOTES_REF, + METADATA_QUALITY_STATUS, METADATA_REWRITE_CONFIDENCE, METADATA_REWRITE_FROM, + 
METADATA_REWRITE_METHOD, + }; + + #[test] + fn adapter_maps_required_fields_and_vcs_contract() { + let record = adapt_trace_payload(TraceAdapterInput { + record_id: "f8cabb2a-18e4-4e52-a6df-cf5bf8c0fbe7".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "cli/src/services/agent_trace.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/123".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 1, + end_line: 3, + contributor: ContributorInput { + kind: ContributorType::Human, + model_id: None, + }, + }], + }], + }], + quality_status: QualityStatus::Final, + rewrite: None, + idempotency_key: None, + }); + + assert_eq!(record.version, "0.1.0"); + assert_eq!(record.id, "f8cabb2a-18e4-4e52-a6df-cf5bf8c0fbe7"); + assert_eq!(record.timestamp, "2026-03-04T10:11:12Z"); + assert_eq!(record.vcs.r#type, "git"); + assert_eq!(record.vcs.revision, "abc123def456"); + assert_eq!(record.files.len(), 1); + } + + #[test] + fn adapter_places_extension_metadata_in_reserved_reverse_domain_keys() { + let record = adapt_trace_payload(TraceAdapterInput { + record_id: "f8cabb2a-18e4-4e52-a6df-cf5bf8c0fbe7".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "README.md".to_string(), + conversations: vec![], + }], + quality_status: QualityStatus::Partial, + rewrite: Some(RewriteInfo { + from_sha: "oldsha".to_string(), + method: "rebase".to_string(), + confidence: "0.91".to_string(), + }), + idempotency_key: Some("repo:oldsha:newsha".to_string()), + }); + + assert_eq!( + record.metadata.get(METADATA_QUALITY_STATUS), + Some(&"partial".to_string()) + ); + assert_eq!( + record.metadata.get(METADATA_NOTES_REF), + Some(&"refs/notes/agent-trace".to_string()) + ); + assert_eq!( + record.metadata.get(METADATA_CONTENT_TYPE), + 
Some(&"application/vnd.agent-trace.record+json".to_string()) + ); + assert_eq!( + record.metadata.get(METADATA_REWRITE_FROM), + Some(&"oldsha".to_string()) + ); + assert_eq!( + record.metadata.get(METADATA_REWRITE_METHOD), + Some(&"rebase".to_string()) + ); + assert_eq!( + record.metadata.get(METADATA_REWRITE_CONFIDENCE), + Some(&"0.91".to_string()) + ); + assert_eq!( + record.metadata.get(METADATA_IDEMPOTENCY_KEY), + Some(&"repo:oldsha:newsha".to_string()) + ); + } + + #[test] + fn adapter_maps_contributor_types_and_optional_model_ids() { + let record = adapt_trace_payload(TraceAdapterInput { + record_id: "f8cabb2a-18e4-4e52-a6df-cf5bf8c0fbe7".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/c/1".to_string(), + related: vec!["https://example.test/c/2".to_string()], + ranges: vec![ + RangeInput { + start_line: 4, + end_line: 9, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }, + RangeInput { + start_line: 10, + end_line: 10, + contributor: ContributorInput { + kind: ContributorType::Mixed, + model_id: None, + }, + }, + RangeInput { + start_line: 11, + end_line: 12, + contributor: ContributorInput { + kind: ContributorType::Unknown, + model_id: None, + }, + }, + ], + }], + }], + quality_status: QualityStatus::NeedsReview, + rewrite: None, + idempotency_key: None, + }); + + let ranges = &record.files[0].conversations[0].ranges; + assert_eq!(ranges[0].contributor.r#type, "ai"); + assert_eq!( + ranges[0].contributor.model_id, + Some("openai/gpt-5.3-codex".to_string()) + ); + assert_eq!(ranges[1].contributor.r#type, "mixed"); + assert_eq!(ranges[1].contributor.model_id, None); + assert_eq!(ranges[2].contributor.r#type, "unknown"); + assert_eq!(ranges[2].contributor.model_id, None); + 
assert_eq!( + record.files[0].conversations[0].related, + vec!["https://example.test/c/2".to_string()] + ); + } +} diff --git a/cli/src/services/mod.rs b/cli/src/services/mod.rs index b26594a9..25de34c5 100644 --- a/cli/src/services/mod.rs +++ b/cli/src/services/mod.rs @@ -1,3 +1,4 @@ +pub mod agent_trace; pub mod hooks; pub mod local_db; pub mod mcp; diff --git a/context/architecture.md b/context/architecture.md index 740cca5b..e1edf3b1 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -76,6 +76,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization and async execute/query smoke checks for in-memory and file-backed targets. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. - `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. +- `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter contract (`adapt_trace_payload`) that maps internal attribution structures into Agent Trace-shaped records with fixed git VCS identity and reserved reverse-domain metadata keys. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. 
- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) with placeholder-safe no-op recording. - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. diff --git a/context/cli/placeholder-foundation.md b/context/cli/placeholder-foundation.md index 3e72a250..d78d764f 100644 --- a/context/cli/placeholder-foundation.md +++ b/context/cli/placeholder-foundation.md @@ -11,7 +11,7 @@ The repository now includes a Rust CLI crate at `cli/` for SCE automation work. - Command contract catalog: `cli/src/command_surface.rs` - Dependency contract snapshot: `cli/src/dependency_contract.rs` - Local Turso adapter: `cli/src/services/local_db.rs` -- Service domains: `cli/src/services/{setup,mcp,hooks,sync}.rs` +- Service domains: `cli/src/services/{agent_trace,setup,mcp,hooks,sync}.rs` - Shared test temp-path helper: `cli/src/test_support.rs` (`TestTempDir`, test-only module) ## Onboarding documentation @@ -70,6 +70,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro ## Service contracts - `cli/src/services/setup.rs` defines setup parsing/selection contracts plus runtime install orchestration (`run_setup_for_mode`) over the embedded asset install engine. +- `cli/src/services/agent_trace.rs` defines the task-scoped schema adapter contract (`adapt_trace_payload`) from internal attribution input structs to Agent Trace-shaped record structs, including fixed git `vcs` mapping, contributor type mapping, and reserved `dev.crocoder.sce.*` metadata placement. - `cli/src/services/mcp.rs` defines `McpService`, a `McpCapabilitySnapshot` model (primary + supported transports), and `CachePolicy` defaults for future file-cache workflows (`cache-put`/`cache-get`) with `runnable: false` placeholders. 
- `cli/src/services/hooks.rs` defines `HookService` plus hook-event/generated-region event placeholders (`HookEventModel`, `HookEvent`, `GeneratedRegionEvent`) and keeps placeholder recording path compile-safe by consuming hook/lifecycle variants without enabling production hook actions. - `cli/src/services/sync.rs` defines cloud-sync abstraction points (`CloudSyncGateway`, `CloudSyncRequest`, `CloudSyncPlan`) layered after the local Turso smoke gate. @@ -91,6 +92,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro - `cli/src/services/local_db.rs` tests cover in-memory and file-backed local Turso initialization plus execute/query smoke checks. - `cli/src/services/sync.rs` test confirms `sync` runs the local smoke gate and returns deterministic placeholder messaging. - `cli/src/services/{setup,mcp,hooks,sync}.rs` include contract-focused tests for setup flag parsing/validation, interactive selection/cancellation dispatch, setup run messaging, and non-runnable capability/event plans. +- `cli/src/services/agent_trace.rs` includes adapter mapping tests for required field projection, contributor enum/model_id handling, and extension metadata placement under reserved reverse-domain keys. - `cli/src/services/setup.rs` tests also verify embedded-manifest completeness against runtime `config/` trees, deterministic sorted path normalization, target-scoped iterator behavior (`OpenCode`, `Claude`, `Both`), install backup creation/replacement, and rollback restoration after injected swap failures. - `cli/src/services/setup.rs` and `cli/src/services/local_db.rs` now share temporary path setup through `crate::test_support::TestTempDir` to keep filesystem test fixtures consistent and cleanup deterministic. 
diff --git a/context/context-map.md b/context/context-map.md index 4d30a97a..0829e009 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -17,6 +17,7 @@ Feature/domain context: - `context/sce/workflow-token-count-workflow.md` (root flake app contract for workflow token counting and its runtime wiring to evals script execution) - `context/sce/atomic-commit-workflow.md` (canonical `/commit` command + `sce-atomic-commit` skill contract and naming decision) - `context/sce/agent-trace-implementation-contract.md` (normative T01 implementation contract for no-git-wrapper Agent Trace attribution invariants, compliance matrix, and internal-to-Agent-Trace mapping) +- `context/sce/agent-trace-schema-adapter.md` (T02 schema adapter contract and code-level mapping surface in `cli/src/services/agent_trace.rs`) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 99b7af2e..37a8773a 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -53,3 +53,4 @@ - `one-task/one-atomic-commit planning contract`: `sce-plan-authoring` requirement that each executable plan task represents one coherent commit unit; broad multi-commit tasks must be split into sequential atomic tasks before execution handoff. - `commit thin orchestration contract`: `/commit` command-body pattern where the command keeps staged-confirmation and proposal-only constraints, while `sce-atomic-commit` owns commit grammar and atomic split guidance. - `agent trace implementation contract`: Canonical context artifact at `context/sce/agent-trace-implementation-contract.md` defining no-git-wrapper attribution invariants, hook/workflow contracts, confidence/quality policy, Agent Trace compliance matrix, and normative internal-to-Agent-Trace mapping for `agent-trace-attribution-no-git-wrapper`. 
+- `agent trace schema adapter`: Task-scoped mapping contract implemented in `cli/src/services/agent_trace.rs` (`adapt_trace_payload`) and documented in `context/sce/agent-trace-schema-adapter.md`; maps internal attribution inputs to Agent Trace-shaped records with fixed `vcs.type = git` and reserved `dev.crocoder.sce.*` metadata placement. diff --git a/context/overview.md b/context/overview.md index c3fe01f0..970b45b4 100644 --- a/context/overview.md +++ b/context/overview.md @@ -23,6 +23,7 @@ Context sync now uses an important-change gate: cross-cutting/policy/architectur The `/change-to-plan` command body is also intentionally thin orchestration: it delegates clarification and plan-shape contracts to `sce-plan-authoring` (including one-task/one-atomic-commit task slicing) while keeping wrapper-level plan output and handoff obligations explicit. The `/commit` command body is intentionally thin orchestration: it retains staged-confirmation and proposal-only constraints while delegating commit grammar and atomic split guidance to `sce-atomic-commit`. The no-git-wrapper Agent Trace initiative baseline contract is defined in `context/sce/agent-trace-implementation-contract.md`, including normative invariants, compliance matrix, and canonical internal-to-Agent-Trace mapping for downstream implementation tasks. +The CLI now includes a task-scoped Agent Trace schema adapter contract in `cli/src/services/agent_trace.rs`, with deterministic mapping of internal attribution input to Agent Trace-shaped record structures documented in `context/sce/agent-trace-schema-adapter.md`. ## Repository model @@ -74,3 +75,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `evals/token-count-workflows.ts` (run via `nix run .#token-count-workflows` from repo root, or `bun run token-count-workflows` from `evals/`) for T06 static workflow token counting that emits deterministic reports to `context/tmp/token-footprint/`. 
- Use `context/sce/atomic-commit-workflow.md` for canonical `/commit` behavior, `sce-atomic-commit` naming, and proposal-only commit planning constraints. - Use `context/sce/agent-trace-implementation-contract.md` for canonical Agent Trace implementation invariants and field-level mapping guidance (`agent-trace-attribution-no-git-wrapper` T01 baseline). +- Use `context/sce/agent-trace-schema-adapter.md` for the implemented T02 adapter contract and canonical mapping surface in `cli/src/services/agent_trace.rs`. diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 1ef73f35..a00745d6 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -39,7 +39,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Structured contract checklist covering all source sections plus Agent Trace RFC required/optional field mapping. - Contract artifact: `context/sce/agent-trace-implementation-contract.md`. -- [ ] T02: Define trace payload schema adapter and canonical metadata mapping (status:todo) +- [x] T02: Define trace payload schema adapter and canonical metadata mapping (status:done) - Task ID: T02 - Goal: Create a schema adapter that maps internal attribution structures to Agent Trace-compliant record shape. - Boundaries (in/out of scope): @@ -50,6 +50,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Adapter output contract is deterministic and reusable by local finalize and rewrite flows. - Verification notes (commands or checks): - Mapping tests for required fields and extension metadata placement. 
+ - `cargo test --manifest-path cli/Cargo.toml` + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T03: Implement trace payload builder and compliance validation suite (status:todo) - Task ID: T03 diff --git a/context/sce/agent-trace-schema-adapter.md b/context/sce/agent-trace-schema-adapter.md new file mode 100644 index 00000000..aab4427a --- /dev/null +++ b/context/sce/agent-trace-schema-adapter.md @@ -0,0 +1,46 @@ +# Agent Trace Schema Adapter + +## Scope + +- Plan/task: `agent-trace-attribution-no-git-wrapper` / `T02`. +- Purpose: define a deterministic adapter contract that maps internal attribution inputs to Agent Trace record shape, without persistence or hook side effects. + +## Canonical code location + +- `cli/src/services/agent_trace.rs` + +## Adapter contract (current state) + +- Input contract is `TraceAdapterInput` with commit identity, timestamp, record id, file attribution payload, quality status, and optional rewrite/idempotency metadata. +- Output contract is `AgentTraceRecord` with: + - required top-level fields (`version`, `id`, `timestamp`, `files`) + - fixed local VCS block (`vcs.type = "git"`, `vcs.revision = `) + - reverse-domain metadata keys under `dev.crocoder.sce.*` +- Canonical constants are centralized for trace/media/reference values: + - `TRACE_VERSION = "0.1.0"` + - `NOTES_REF = "refs/notes/agent-trace"` + - `TRACE_CONTENT_TYPE = "application/vnd.agent-trace.record+json"` + +## Mapping guarantees in this slice + +- Contributor enum mapping is explicit and constrained to `human|ai|mixed|unknown`. +- Conversation links preserve `url` and optional `related` values. 
+- Extension metadata placement uses reserved keys: + - `dev.crocoder.sce.quality_status` + - `dev.crocoder.sce.rewrite_from` + - `dev.crocoder.sce.rewrite_method` + - `dev.crocoder.sce.rewrite_confidence` + - `dev.crocoder.sce.idempotency_key` + - `dev.crocoder.sce.notes_ref` + - `dev.crocoder.sce.content_type` + +## Verification evidence + +- `cargo test --manifest-path cli/Cargo.toml` includes adapter mapping tests in `services::agent_trace::tests`. +- `cargo fmt --manifest-path cli/Cargo.toml -- --check`. +- `cargo build --manifest-path cli/Cargo.toml`. + +## Out of scope (deferred) + +- JSON schema compliance/runtime format validation and deterministic serialization checks (`T03`). +- Hook orchestration, notes/DB writes, and rewrite execution flows (`T04+`). From 567b3e9558a53241f52dd8c2384ce9a779fb5d82 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 11:52:50 +0100 Subject: [PATCH 04/39] cli: Implement Agent Trace payload builder and compliance validation suite Add a canonical build_trace_payload path on top of the adapter, normalize AI model_id values toward provider/model format when inferable, and add deterministic serialization + schema compliance tests (including URI/date-time format validation). 
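Illustrative note (not part of the change itself): "normalize toward provider/model format when inferable" could look roughly like the sketch below. This is an assumption about the general shape only. The function name `normalize_model_id`, the prefix table, and the fall-through behavior are all hypothetical and are not taken from the implementation in this patch.

```rust
/// Hypothetical sketch (assumption, not the patch's implementation): returns a
/// `provider/model` string when the provider can be guessed from a well-known
/// model-name prefix, and otherwise returns the trimmed input unchanged.
fn normalize_model_id(raw: &str) -> String {
    let trimmed = raw.trim();

    // Already in provider/model form: leave it alone.
    if trimmed.contains('/') {
        return trimmed.to_string();
    }

    // Illustrative prefix table only; a real table would need to be maintained
    // against the models.dev catalog.
    const PREFIXES: &[(&str, &str)] = &[
        ("gpt-", "openai"),
        ("claude-", "anthropic"),
        ("gemini-", "google"),
    ];

    for (prefix, provider) in PREFIXES {
        if trimmed.starts_with(*prefix) {
            return format!("{provider}/{trimmed}");
        }
    }

    // Provider not inferable: keep the value rather than guessing.
    trimmed.to_string()
}

fn main() {
    for raw in ["openai/gpt-5.3-codex", "gpt-5.3-codex", "mystery-model"] {
        println!("{raw} -> {}", normalize_model_id(raw));
    }
}
```

Leaving unrecognized values untouched keeps the normalization lossless, which seems in line with the "when inferable" wording above.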
--- cli/Cargo.lock | 638 +++++++++++++++++- cli/Cargo.toml | 4 + cli/src/services/agent_trace.rs | 497 +++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 2 + context/overview.md | 2 + .../agent-trace-attribution-no-git-wrapper.md | 5 +- .../agent-trace-payload-builder-validation.md | 37 + context/sce/agent-trace-schema-adapter.md | 4 + 10 files changed, 1184 insertions(+), 8 deletions(-) create mode 100644 context/sce/agent-trace-payload-builder-validation.md diff --git a/cli/Cargo.lock b/cli/Cargo.lock index cd0114a9..fde60b03 100644 --- a/cli/Cargo.lock +++ b/cli/Cargo.lock @@ -47,6 +47,20 @@ dependencies = [ "subtle", ] +[[package]] +name = "ahash" +version = "0.8.12" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5a15f179cd60c4584b8a8c596927aadc462e27f2ca70c04e0071964a73ba7a75" +dependencies = [ + "cfg-if", + "getrandom 0.3.4", + "once_cell", + "serde", + "version_check", + "zerocopy", +] + [[package]] name = "aho-corasick" version = "1.1.4" @@ -80,6 +94,12 @@ dependencies = [ "rustversion", ] +[[package]] +name = "atomic-waker" +version = "1.1.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1505bd5d3d116872e7271a6d4e16d81d0c8570876c8de68093a09ac269d8aac0" + [[package]] name = "autocfg" version = "1.5.0" @@ -115,6 +135,21 @@ dependencies = [ "which", ] +[[package]] +name = "bit-set" +version = "0.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "08807e080ed7f9d5433fa9b275196cfc35414f66a0c79d864dc51a0d825231a3" +dependencies = [ + "bit-vec", +] + +[[package]] +name = "bit-vec" +version = "0.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5e764a1d40d510daf35e07be9eb06e75770908c27d411ee6c92109c9840eaaf7" + [[package]] name = "bitflags" version = "1.3.2" @@ -127,6 +162,12 @@ version = "2.11.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = 
"843867be96c8daad0d758b57df9392b6d8d271134fce549de6ce169ff98a92af" +[[package]] +name = "borrow-or-share" +version = "0.2.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "dc0b364ead1874514c8c2855ab558056ebfeb775653e7ae45ff72f28f8f3166c" + [[package]] name = "branches" version = "0.4.4" @@ -152,6 +193,12 @@ version = "3.20.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "5d20789868f4b01b2f2caec9f5c4e0213b41e3e5702a50157d699ae31ced2fcb" +[[package]] +name = "bytecount" +version = "0.6.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "175812e0be2bccb6abe50bb8d566126198344f707e304f45c648fd8f2cc0365e" + [[package]] name = "bytemuck" version = "1.25.0" @@ -318,7 +365,7 @@ dependencies = [ "bitflags 1.3.2", "crossterm_winapi", "libc", - "mio", + "mio 0.8.11", "parking_lot", "signal-hook", "signal-hook-mio", @@ -386,6 +433,15 @@ version = "1.15.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "48c757948c5ede0e46177b7add2e67155f70e33c07fea8284df6576da70b3719" +[[package]] +name = "email_address" +version = "0.2.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e079f19b08ca6239f47f8ba8509c11cf3ea30095831f7fed61441475edd8c449" +dependencies = [ + "serde", +] + [[package]] name = "env_filter" version = "1.0.0" @@ -427,6 +483,17 @@ version = "0.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2acce4a10f12dc2fb14a218589d4f1f62ef011b2d0cc4b3cb1bba8e94da14649" +[[package]] +name = "fancy-regex" +version = "0.16.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "998b056554fbe42e03ae0e152895cd1a7e1002aec800fdc6635d20270260c46f" +dependencies = [ + "bit-set", + "regex-automata", + "regex-syntax", +] + [[package]] name = "fastbloom" version = "0.14.1" @@ -451,6 +518,17 @@ version = "0.1.9" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = 
"5baebc0774151f905a1a2cc41989300b1e6fbb29aff0ceffa1064fdd3088d582" +[[package]] +name = "fluent-uri" +version = "0.3.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1918b65d96df47d3591bed19c5cca17e3fa5d0707318e4b5ef2eae01764df7e5" +dependencies = [ + "borrow-or-share", + "ref-cast", + "serde", +] + [[package]] name = "foldhash" version = "0.1.5" @@ -466,6 +544,65 @@ dependencies = [ "percent-encoding", ] +[[package]] +name = "fraction" +version = "0.15.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0f158e3ff0a1b334408dc9fb811cd99b446986f4d8b741bb08f9df1604085ae7" +dependencies = [ + "lazy_static", + "num", +] + +[[package]] +name = "futures-channel" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "07bbe89c50d7a535e539b8c17bc0b49bdb77747034daa8087407d655f3f7cc1d" +dependencies = [ + "futures-core", + "futures-sink", +] + +[[package]] +name = "futures-core" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7e3450815272ef58cec6d564423f6e755e25379b217b0bc688e295ba24df6b1d" + +[[package]] +name = "futures-io" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cecba35d7ad927e23624b22ad55235f2239cfa44fd10428eecbeba6d6a717718" + +[[package]] +name = "futures-sink" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c39754e157331b013978ec91992bde1ac089843443c49cbc7f46150b0fad0893" + +[[package]] +name = "futures-task" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "037711b3d59c33004d3856fbdc83b99d4ff37a24768fa1be9ce3538a1cde4393" + +[[package]] +name = "futures-util" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "389ca41296e6190b48053de0321d02a77f32f8a5d2461dd38762c0593805c6d6" +dependencies = [ + "futures-core", + 
"futures-io", + "futures-sink", + "futures-task", + "memchr", + "pin-project-lite", + "slab", +] + [[package]] name = "fuzzy-matcher" version = "0.3.7" @@ -626,6 +763,79 @@ dependencies = [ "itoa", ] +[[package]] +name = "http-body" +version = "1.0.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1efedce1fb8e6913f23e0c92de8e62cd5b772a67e7b3946df930a62566c93184" +dependencies = [ + "bytes", + "http", +] + +[[package]] +name = "http-body-util" +version = "0.1.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b021d93e26becf5dc7e1b75b1bed1fd93124b374ceb73f43d4d4eafec896a64a" +dependencies = [ + "bytes", + "futures-core", + "http", + "http-body", + "pin-project-lite", +] + +[[package]] +name = "httparse" +version = "1.10.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6dbf3de79e51f3d586ab4cb9d5c3e2c14aa28ed23d180cf89b4df0454a69cc87" + +[[package]] +name = "hyper" +version = "1.8.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2ab2d4f250c3d7b1c9fcdff1cece94ea4e2dfbec68614f7b87cb205f24ca9d11" +dependencies = [ + "atomic-waker", + "bytes", + "futures-channel", + "futures-core", + "http", + "http-body", + "httparse", + "itoa", + "pin-project-lite", + "pin-utils", + "smallvec", + "tokio", + "want", +] + +[[package]] +name = "hyper-util" +version = "0.1.20" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "96547c2556ec9d12fb1578c4eaf448b04993e7fb79cbaad930a656880a6bdfa0" +dependencies = [ + "base64", + "bytes", + "futures-channel", + "futures-util", + "http", + "http-body", + "hyper", + "ipnet", + "libc", + "percent-encoding", + "pin-project-lite", + "socket2", + "tokio", + "tower-service", + "tracing", +] + [[package]] name = "iana-time-zone" version = "0.1.65" @@ -816,6 +1026,22 @@ dependencies = [ "libc", ] +[[package]] +name = "ipnet" +version = "2.12.0" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "d98f6fed1fde3f8c21bc40a1abb88dd75e67924f9cffc3ef95607bad8017f8e2" + +[[package]] +name = "iri-string" +version = "0.7.10" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c91338f0783edbd6195decb37bae672fd3b165faffb89bf7b9e6942f8b1a731a" +dependencies = [ + "memchr", + "serde", +] + [[package]] name = "itertools" version = "0.12.1" @@ -860,6 +1086,33 @@ dependencies = [ "wasm-bindgen", ] +[[package]] +name = "jsonschema" +version = "0.33.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d46662859bc5f60a145b75f4632fbadc84e829e45df6c5de74cfc8e05acb96b5" +dependencies = [ + "ahash", + "base64", + "bytecount", + "email_address", + "fancy-regex", + "fraction", + "idna", + "itoa", + "num-cmp", + "num-traits", + "once_cell", + "percent-encoding", + "referencing", + "regex", + "regex-syntax", + "reqwest", + "serde", + "serde_json", + "uuid-simd", +] + [[package]] name = "lazy_static" version = "1.5.0" @@ -1046,6 +1299,17 @@ dependencies = [ "windows-sys 0.48.0", ] +[[package]] +name = "mio" +version = "1.1.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a69bcab0ad47271a0234d9422b131806bf3968021e5dc9328caf2d4cd58557fc" +dependencies = [ + "libc", + "wasi", + "windows-sys 0.61.2", +] + [[package]] name = "newline-converter" version = "0.3.0" @@ -1074,12 +1338,82 @@ dependencies = [ "windows-sys 0.61.2", ] +[[package]] +name = "num" +version = "0.4.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "35bd024e8b2ff75562e5f34e7f4905839deb4b22955ef5e73d2fea1b9813cb23" +dependencies = [ + "num-bigint", + "num-complex", + "num-integer", + "num-iter", + "num-rational", + "num-traits", +] + +[[package]] +name = "num-bigint" +version = "0.4.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a5e44f723f1133c9deac646763579fdb3ac745e418f2a7af9cd0c431da1f20b9" 
+dependencies = [ + "num-integer", + "num-traits", +] + +[[package]] +name = "num-cmp" +version = "0.1.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "63335b2e2c34fae2fb0aa2cecfd9f0832a1e24b3b32ecec612c3426d46dc8aaa" + +[[package]] +name = "num-complex" +version = "0.4.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "73f88a1307638156682bada9d7604135552957b7818057dcef22705b4d509495" +dependencies = [ + "num-traits", +] + [[package]] name = "num-conv" version = "0.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "cf97ec579c3c42f953ef76dbf8d55ac91fb219dde70e49aa4a6b7d74e9919050" +[[package]] +name = "num-integer" +version = "0.1.46" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7969661fd2958a5cb096e56c8e1ad0444ac2bbcd0061bd28660485a44879858f" +dependencies = [ + "num-traits", +] + +[[package]] +name = "num-iter" +version = "0.1.45" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1429034a0490724d0075ebb2bc9e875d6503c3cf69e235a8941aa757d83ef5bf" +dependencies = [ + "autocfg", + "num-integer", + "num-traits", +] + +[[package]] +name = "num-rational" +version = "0.4.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f83d14da390562dca69fc84082e73e548e1ad308d24accdedd2720017cb37824" +dependencies = [ + "num-bigint", + "num-integer", + "num-traits", +] + [[package]] name = "num-traits" version = "0.2.19" @@ -1101,6 +1435,12 @@ version = "0.3.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c08d65885ee38876c4f86fa503fb49d7b507c2b62552df7c70b2fce627e06381" +[[package]] +name = "outref" +version = "0.5.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1a80800c0488c3a21695ea981a54918fbb37abf04f4d0720c453632255e2ff0e" + [[package]] name = "pack1" version = "1.0.0" @@ -1151,6 +1491,12 @@ version = "0.2.17" source = 
"registry+https://github.com/rust-lang/crates.io-index" checksum = "a89322df9ebe1c1578d689c92318e070967d1042b512afbe49518723f4e6d5cd" +[[package]] +name = "pin-utils" +version = "0.1.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8b870d8c151b6f2fb93e84a13146138f05d02ed11c7e7c54f8826aaaf7c9f184" + [[package]] name = "pkg-config" version = "0.3.32" @@ -1320,6 +1666,40 @@ dependencies = [ "bitflags 2.11.0", ] +[[package]] +name = "ref-cast" +version = "1.0.25" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f354300ae66f76f1c85c5f84693f0ce81d747e2c3f21a45fef496d89c960bf7d" +dependencies = [ + "ref-cast-impl", +] + +[[package]] +name = "ref-cast-impl" +version = "1.0.25" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b7186006dcb21920990093f30e3dea63b7d6e977bf1256be20c3563a5db070da" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "referencing" +version = "0.33.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9e9c261f7ce75418b3beadfb3f0eb1299fe8eb9640deba45ffa2cb783098697d" +dependencies = [ + "ahash", + "fluent-uri", + "once_cell", + "parking_lot", + "percent-encoding", + "serde_json", +] + [[package]] name = "regex" version = "1.12.3" @@ -1349,6 +1729,40 @@ version = "0.8.10" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "dc897dd8d9e8bd1ed8cdad82b5966c3e0ecae09fb1907d58efaa013543185d0a" +[[package]] +name = "reqwest" +version = "0.12.28" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "eddd3ca559203180a307f12d114c268abf583f59b03cb906fd0b3ff8646c1147" +dependencies = [ + "base64", + "bytes", + "futures-channel", + "futures-core", + "futures-util", + "http", + "http-body", + "http-body-util", + "hyper", + "hyper-util", + "js-sys", + "log", + "percent-encoding", + "pin-project-lite", + "serde", + "serde_json", + "serde_urlencoded", + "sync_wrapper", + 
"tokio", + "tower", + "tower-http", + "tower-service", + "url", + "wasm-bindgen", + "wasm-bindgen-futures", + "web-sys", +] + [[package]] name = "roaring" version = "0.11.3" @@ -1424,7 +1838,9 @@ version = "0.1.0" dependencies = [ "anyhow", "inquire", + "jsonschema", "lexopt", + "serde_json", "tokio", "turso", ] @@ -1484,6 +1900,18 @@ dependencies = [ "zmij", ] +[[package]] +name = "serde_urlencoded" +version = "0.7.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d3491c14715ca2294c4d6a88f15e84739788c1d030eed8c110436aafdaa2f3fd" +dependencies = [ + "form_urlencoded", + "itoa", + "ryu", + "serde", +] + [[package]] name = "sha1_smol" version = "1.0.1" @@ -1522,7 +1950,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b75a19a7a740b25bc7944bdee6172368f988763b744e3d4dfe753f6b4ece40cc" dependencies = [ "libc", - "mio", + "mio 0.8.11", "signal-hook", ] @@ -1551,12 +1979,28 @@ version = "1.0.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "b2aa850e253778c88a04c3d7323b043aeda9d3e30d5971937c1855769763678e" +[[package]] +name = "slab" +version = "0.4.12" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0c790de23124f9ab44544d7ac05d60440adc586479ce501c1d6d7da3cd8c9cf5" + [[package]] name = "smallvec" version = "1.15.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "67b1b7a3b5fe4f1376887184045fcf45c69e92af734b7aaddc05fb777b6fbd03" +[[package]] +name = "socket2" +version = "0.6.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "86f4aa3ad99f2088c990dfa82d367e19cb29268ed67c574d10d0a4bfe71f07e0" +dependencies = [ + "libc", + "windows-sys 0.60.2", +] + [[package]] name = "softaes" version = "0.1.3" @@ -1608,6 +2052,15 @@ dependencies = [ "unicode-ident", ] +[[package]] +name = "sync_wrapper" +version = "1.0.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"0bf256ce5efdfa370213c1dabab5935a12e49f2c58d15e9eac2870d3b4f27263" +dependencies = [ + "futures-core", +] + [[package]] name = "synstructure" version = "0.13.2" @@ -1708,9 +2161,58 @@ version = "1.49.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "72a2903cd7736441aac9df9d7688bd0ce48edccaadf181c3b90be801e81d3d86" dependencies = [ + "libc", + "mio 1.1.1", + "pin-project-lite", + "socket2", + "windows-sys 0.61.2", +] + +[[package]] +name = "tower" +version = "0.5.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ebe5ef63511595f1344e2d5cfa636d973292adc0eec1f0ad45fae9f0851ab1d4" +dependencies = [ + "futures-core", + "futures-util", "pin-project-lite", + "sync_wrapper", + "tokio", + "tower-layer", + "tower-service", ] +[[package]] +name = "tower-http" +version = "0.6.8" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d4e6559d53cc268e5031cd8429d05415bc4cb4aefc4aa5d6cc35fbf5b924a1f8" +dependencies = [ + "bitflags 2.11.0", + "bytes", + "futures-util", + "http", + "http-body", + "iri-string", + "pin-project-lite", + "tower", + "tower-layer", + "tower-service", +] + +[[package]] +name = "tower-layer" +version = "0.3.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "121c2a6cda46980bb0fcd1647ffaf6cd3fc79a013de288782836f6df9c48780e" + +[[package]] +name = "tower-service" +version = "0.3.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8df9b6e13f2d32c91b9bd719c00d1958837bc7dec474d94952798cc8e69eeec3" + [[package]] name = "tracing" version = "0.1.44" @@ -1784,6 +2286,12 @@ dependencies = [ "tracing-log", ] +[[package]] +name = "try-lock" +version = "0.2.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e421abadd41a4225275504ea4d6566923418b7f05506fbc9c0fe86ba7396114b" + [[package]] name = "turso" version = "0.4.4" @@ -2045,6 +2553,17 @@ dependencies = [ "wasm-bindgen", ] +[[package]] +name = 
"uuid-simd" +version = "0.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "23b082222b4f6619906941c17eb2297fff4c2fb96cb60164170522942a200bd8" +dependencies = [ + "outref", + "uuid", + "vsimd", +] + [[package]] name = "valuable" version = "0.1.1" @@ -2063,6 +2582,21 @@ version = "0.9.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a" +[[package]] +name = "vsimd" +version = "0.8.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5c3082ca00d5a5ef149bb8b555a72ae84c9c59f7250f013ac822ac2e49b19c64" + +[[package]] +name = "want" +version = "0.3.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "bfa7760aed19e106de2c7c0b581b509f2f25d3dacaf737cb82ac61bc6d760b0e" +dependencies = [ + "try-lock", +] + [[package]] name = "wasi" version = "0.11.1+wasi-snapshot-preview1" @@ -2100,6 +2634,20 @@ dependencies = [ "wasm-bindgen-shared", ] +[[package]] +name = "wasm-bindgen-futures" +version = "0.4.64" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e9c5522b3a28661442748e09d40924dfb9ca614b21c00d3fd135720e48b67db8" +dependencies = [ + "cfg-if", + "futures-util", + "js-sys", + "once_cell", + "wasm-bindgen", + "web-sys", +] + [[package]] name = "wasm-bindgen-macro" version = "0.2.114" @@ -2166,6 +2714,16 @@ dependencies = [ "semver", ] +[[package]] +name = "web-sys" +version = "0.3.91" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "854ba17bb104abfb26ba36da9729addc7ce7f06f5c0f90f3c391f8461cca21f9" +dependencies = [ + "js-sys", + "wasm-bindgen", +] + [[package]] name = "which" version = "4.4.2" @@ -2277,6 +2835,15 @@ dependencies = [ "windows-targets 0.52.6", ] +[[package]] +name = "windows-sys" +version = "0.60.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"f2f500e4d28234f72040990ec9d39e3a6b950f9f22d3dba18416c35882612bcb" +dependencies = [ + "windows-targets 0.53.5", +] + [[package]] name = "windows-sys" version = "0.61.2" @@ -2310,13 +2877,30 @@ dependencies = [ "windows_aarch64_gnullvm 0.52.6", "windows_aarch64_msvc 0.52.6", "windows_i686_gnu 0.52.6", - "windows_i686_gnullvm", + "windows_i686_gnullvm 0.52.6", "windows_i686_msvc 0.52.6", "windows_x86_64_gnu 0.52.6", "windows_x86_64_gnullvm 0.52.6", "windows_x86_64_msvc 0.52.6", ] +[[package]] +name = "windows-targets" +version = "0.53.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "4945f9f551b88e0d65f3db0bc25c33b8acea4d9e41163edf90dcd0b19f9069f3" +dependencies = [ + "windows-link", + "windows_aarch64_gnullvm 0.53.1", + "windows_aarch64_msvc 0.53.1", + "windows_i686_gnu 0.53.1", + "windows_i686_gnullvm 0.53.1", + "windows_i686_msvc 0.53.1", + "windows_x86_64_gnu 0.53.1", + "windows_x86_64_gnullvm 0.53.1", + "windows_x86_64_msvc 0.53.1", +] + [[package]] name = "windows_aarch64_gnullvm" version = "0.48.5" @@ -2329,6 +2913,12 @@ version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "32a4622180e7a0ec044bb555404c800bc9fd9ec262ec147edd5989ccd0c02cd3" +[[package]] +name = "windows_aarch64_gnullvm" +version = "0.53.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a9d8416fa8b42f5c947f8482c43e7d89e73a173cead56d044f6a56104a6d1b53" + [[package]] name = "windows_aarch64_msvc" version = "0.48.5" @@ -2341,6 +2931,12 @@ version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "09ec2a7bb152e2252b53fa7803150007879548bc709c039df7627cabbd05d469" +[[package]] +name = "windows_aarch64_msvc" +version = "0.53.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b9d782e804c2f632e395708e99a94275910eb9100b2114651e04744e9b125006" + [[package]] name = "windows_i686_gnu" version = "0.48.5" @@ -2353,12 +2949,24 @@ version = 
"0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "8e9b5ad5ab802e97eb8e295ac6720e509ee4c243f69d781394014ebfe8bbfa0b" +[[package]] +name = "windows_i686_gnu" +version = "0.53.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "960e6da069d81e09becb0ca57a65220ddff016ff2d6af6a223cf372a506593a3" + [[package]] name = "windows_i686_gnullvm" version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "0eee52d38c090b3caa76c563b86c3a4bd71ef1a819287c19d586d7334ae8ed66" +[[package]] +name = "windows_i686_gnullvm" +version = "0.53.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "fa7359d10048f68ab8b09fa71c3daccfb0e9b559aed648a8f95469c27057180c" + [[package]] name = "windows_i686_msvc" version = "0.48.5" @@ -2371,6 +2979,12 @@ version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "240948bc05c5e7c6dabba28bf89d89ffce3e303022809e73deaefe4f6ec56c66" +[[package]] +name = "windows_i686_msvc" +version = "0.53.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "1e7ac75179f18232fe9c285163565a57ef8d3c89254a30685b57d83a38d326c2" + [[package]] name = "windows_x86_64_gnu" version = "0.48.5" @@ -2383,6 +2997,12 @@ version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "147a5c80aabfbf0c7d901cb5895d1de30ef2907eb21fbbab29ca94c5b08b1a78" +[[package]] +name = "windows_x86_64_gnu" +version = "0.53.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9c3842cdd74a865a8066ab39c8a7a473c0778a3f29370b5fd6b4b9aa7df4a499" + [[package]] name = "windows_x86_64_gnullvm" version = "0.48.5" @@ -2395,6 +3015,12 @@ version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "24d5b23dc417412679681396f2b49f3de8c1473deb516bd34410872eff51ed0d" +[[package]] +name = "windows_x86_64_gnullvm" +version = "0.53.1" +source 
= "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0ffa179e2d07eee8ad8f57493436566c7cc30ac536a3379fdf008f47f6bb7ae1" + [[package]] name = "windows_x86_64_msvc" version = "0.48.5" @@ -2407,6 +3033,12 @@ version = "0.52.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "589f6da84c646204747d1270a2a5661ea66ed1cced2631d546fdfb155959f9ec" +[[package]] +name = "windows_x86_64_msvc" +version = "0.53.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "d6bbff5f0aada427a1e5a6da5f1f98158182f26556f345ac9e04d36d0ebed650" + [[package]] name = "wit-bindgen" version = "0.51.0" diff --git a/cli/Cargo.toml b/cli/Cargo.toml index 5eafa66c..d9facb02 100644 --- a/cli/Cargo.toml +++ b/cli/Cargo.toml @@ -18,3 +18,7 @@ inquire = "0.7" lexopt = "0.3" tokio = { version = "1", default-features = false, features = ["rt"] } turso = "0" + +[dev-dependencies] +jsonschema = "0.33" +serde_json = "1" diff --git a/cli/src/services/agent_trace.rs b/cli/src/services/agent_trace.rs index ac24fddd..b842d1de 100644 --- a/cli/src/services/agent_trace.rs +++ b/cli/src/services/agent_trace.rs @@ -137,6 +137,61 @@ pub struct AgentTraceContributor { pub model_id: Option, } +pub fn build_trace_payload(input: TraceAdapterInput) -> AgentTraceRecord { + let mut record = adapt_trace_payload(input); + normalize_record_model_ids(&mut record); + record +} + +fn normalize_record_model_ids(record: &mut AgentTraceRecord) { + for file in &mut record.files { + for conversation in &mut file.conversations { + for range in &mut conversation.ranges { + if range.contributor.r#type == "ai" { + range.contributor.model_id = + normalize_model_id(range.contributor.model_id.take()); + } + } + } + } +} + +fn normalize_model_id(model_id: Option) -> Option { + let raw = model_id?; + let trimmed = raw.trim(); + if trimmed.is_empty() { + return None; + } + + let canonical = trimmed + .replace(':', "/") + .split_whitespace() + .collect::>() + .join("-"); + + if 
canonical.is_empty() { + return None; + } + + let mut segments = canonical.split('/'); + let provider = segments.next(); + let model = segments.next(); + let has_more = segments.next().is_some(); + if !has_more { + if let (Some(provider), Some(model)) = (provider, model) { + if !provider.is_empty() && !model.is_empty() { + return Some(format!( + "{}/{}", + provider.to_ascii_lowercase(), + model.to_ascii_lowercase() + )); + } + } + } + + Some(canonical) +} + pub fn adapt_trace_payload(input: TraceAdapterInput) -> AgentTraceRecord { let mut metadata = BTreeMap::new(); metadata.insert( @@ -202,14 +257,282 @@ pub fn adapt_trace_payload(input: TraceAdapterInput) -> AgentTraceRecord { #[cfg(test)] mod tests { + use jsonschema::draft202012; + use serde_json::{Map, Value}; + use super::{ - adapt_trace_payload, ContributorInput, ContributorType, ConversationInput, - FileAttributionInput, QualityStatus, RangeInput, RewriteInfo, TraceAdapterInput, - METADATA_CONTENT_TYPE, METADATA_IDEMPOTENCY_KEY, METADATA_NOTES_REF, + adapt_trace_payload, build_trace_payload, ContributorInput, ContributorType, + ConversationInput, FileAttributionInput, QualityStatus, RangeInput, RewriteInfo, + TraceAdapterInput, METADATA_CONTENT_TYPE, METADATA_IDEMPOTENCY_KEY, METADATA_NOTES_REF, METADATA_QUALITY_STATUS, METADATA_REWRITE_CONFIDENCE, METADATA_REWRITE_FROM, METADATA_REWRITE_METHOD, }; + const AGENT_TRACE_SCHEMA: &str = r##"{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://agent-trace.dev/schemas/v1/trace-record.json", + "title": "Agent Trace Record", + "type": "object", + "required": ["version", "id", "timestamp", "files"], + "properties": { + "version": { + "type": "string", + "pattern": "^[0-9]+\\.[0-9]+$" + }, + "id": { + "type": "string", + "format": "uuid" + }, + "timestamp": { + "type": "string", + "format": "date-time" + }, + "vcs": { + "$ref": "#/$defs/vcs" + }, + "tool": { + "$ref": "#/$defs/tool" + }, + "files": { + "type": "array", + "items": { + 
"$ref": "#/$defs/file" + } + }, + "metadata": { + "type": "object" + } + }, + "$defs": { + "vcs": { + "type": "object", + "required": ["type", "revision"], + "properties": { + "type": { + "type": "string", + "enum": ["git", "jj", "hg", "svn"] + }, + "revision": { + "type": "string" + } + } + }, + "tool": { + "type": "object", + "properties": { + "name": { "type": "string" }, + "version": { "type": "string" } + } + }, + "file": { + "type": "object", + "required": ["path", "conversations"], + "properties": { + "path": { + "type": "string" + }, + "conversations": { + "type": "array", + "items": { + "$ref": "#/$defs/conversation" + } + } + } + }, + "contributor": { + "type": "object", + "required": ["type"], + "properties": { + "type": { + "type": "string", + "enum": ["human", "ai", "mixed", "unknown"] + }, + "model_id": { + "type": "string", + "maxLength": 250 + } + } + }, + "conversation": { + "type": "object", + "required": ["ranges"], + "properties": { + "url": { + "type": "string", + "format": "uri" + }, + "contributor": { + "$ref": "#/$defs/contributor" + }, + "ranges": { + "type": "array", + "items": { + "$ref": "#/$defs/range" + } + }, + "related": { + "type": "array", + "items": { + "type": "object", + "required": ["type", "url"], + "properties": { + "type": { "type": "string" }, + "url": { "type": "string", "format": "uri" } + } + } + } + } + }, + "range": { + "type": "object", + "required": ["start_line", "end_line"], + "properties": { + "start_line": { "type": "integer", "minimum": 1 }, + "end_line": { "type": "integer", "minimum": 1 }, + "content_hash": { + "type": "string" + }, + "contributor": { + "$ref": "#/$defs/contributor" + } + } + } + } +}"##; + + fn record_to_json_value(record: &super::AgentTraceRecord) -> Value { + let mut root = Map::new(); + root.insert("version".to_string(), Value::String(record.version.clone())); + root.insert("id".to_string(), Value::String(record.id.clone())); + root.insert( + "timestamp".to_string(), + 
Value::String(record.timestamp.clone()), + ); + + let mut vcs = Map::new(); + vcs.insert("type".to_string(), Value::String(record.vcs.r#type.clone())); + vcs.insert( + "revision".to_string(), + Value::String(record.vcs.revision.clone()), + ); + root.insert("vcs".to_string(), Value::Object(vcs)); + + let files = record + .files + .iter() + .map(|file| { + let mut file_obj = Map::new(); + file_obj.insert("path".to_string(), Value::String(file.path.clone())); + file_obj.insert( + "conversations".to_string(), + Value::Array( + file.conversations + .iter() + .map(|conversation| { + let mut conv_obj = Map::new(); + conv_obj.insert( + "url".to_string(), + Value::String(conversation.url.clone()), + ); + + let ranges = conversation + .ranges + .iter() + .map(|range| { + let mut range_obj = Map::new(); + range_obj.insert( + "start_line".to_string(), + Value::Number(range.start_line.into()), + ); + range_obj.insert( + "end_line".to_string(), + Value::Number(range.end_line.into()), + ); + + let mut contributor_obj = Map::new(); + contributor_obj.insert( + "type".to_string(), + Value::String(range.contributor.r#type.clone()), + ); + if let Some(model_id) = &range.contributor.model_id { + contributor_obj.insert( + "model_id".to_string(), + Value::String(model_id.clone()), + ); + } + range_obj.insert( + "contributor".to_string(), + Value::Object(contributor_obj), + ); + + Value::Object(range_obj) + }) + .collect::>(); + + conv_obj.insert("ranges".to_string(), Value::Array(ranges)); + if !conversation.related.is_empty() { + conv_obj.insert( + "related".to_string(), + Value::Array( + conversation + .related + .iter() + .map(|url| { + let mut related_obj = Map::new(); + related_obj.insert( + "type".to_string(), + Value::String("related".to_string()), + ); + related_obj.insert( + "url".to_string(), + Value::String(url.clone()), + ); + Value::Object(related_obj) + }) + .collect::>(), + ), + ); + } + + Value::Object(conv_obj) + }) + .collect::>(), + ), + ); + + Value::Object(file_obj) 
+ }) + .collect::>(); + root.insert("files".to_string(), Value::Array(files)); + + if !record.metadata.is_empty() { + let metadata = record + .metadata + .iter() + .map(|(key, value)| (key.clone(), Value::String(value.clone()))) + .collect::>(); + root.insert("metadata".to_string(), Value::Object(metadata)); + } + + Value::Object(root) + } + + fn patched_agent_trace_schema() -> Value { + let mut schema: Value = + serde_json::from_str(AGENT_TRACE_SCHEMA).expect("published schema JSON should parse"); + if let Some(version_pattern) = schema.pointer_mut("/properties/version/pattern") { + *version_pattern = Value::String("^[0-9]+\\.[0-9]+(?:\\.[0-9]+)?$".to_string()); + } + schema + } + + fn schema_validator() -> jsonschema::Validator { + draft202012::options() + .should_validate_formats(true) + .build(&patched_agent_trace_schema()) + .expect("schema compilation should work") + } + #[test] fn adapter_maps_required_fields_and_vcs_contract() { let record = adapt_trace_payload(TraceAdapterInput { @@ -352,4 +675,172 @@ mod tests { vec!["https://example.test/c/2".to_string()] ); } + + #[test] + fn builder_normalizes_ai_model_id_to_provider_model_when_possible() { + let record = build_trace_payload(TraceAdapterInput { + record_id: "f8cabb2a-18e4-4e52-a6df-cf5bf8c0fbe7".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/c/1".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 1, + end_line: 3, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some(" OpenAI:GPT-5.3-CODEX ".to_string()), + }, + }], + }], + }], + quality_status: QualityStatus::Final, + rewrite: None, + idempotency_key: None, + }); + + assert_eq!( + record.files[0].conversations[0].ranges[0] + .contributor + .model_id, + Some("openai/gpt-5.3-codex".to_string()) + ); + 
} + + #[test] + fn builder_serialization_is_deterministic_for_identical_input() { + let input = TraceAdapterInput { + record_id: "f8cabb2a-18e4-4e52-a6df-cf5bf8c0fbe7".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/c/1".to_string(), + related: vec!["https://example.test/c/2".to_string()], + ranges: vec![RangeInput { + start_line: 1, + end_line: 2, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + quality_status: QualityStatus::Final, + rewrite: Some(RewriteInfo { + from_sha: "oldsha".to_string(), + method: "rebase".to_string(), + confidence: "0.95".to_string(), + }), + idempotency_key: Some("repo:old:new".to_string()), + }; + + let first = + serde_json::to_string(&record_to_json_value(&build_trace_payload(input.clone()))) + .expect("first JSON serialization should succeed"); + let second = serde_json::to_string(&record_to_json_value(&build_trace_payload(input))) + .expect("second JSON serialization should succeed"); + + assert_eq!(first, second); + } + + #[test] + fn builder_output_passes_agent_trace_schema_validation() { + let record = build_trace_payload(TraceAdapterInput { + record_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/1".to_string(), + related: vec!["https://example.test/session/1".to_string()], + ranges: vec![ + RangeInput { + start_line: 1, + end_line: 5, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }, + RangeInput { + 
start_line: 6, + end_line: 8, + contributor: ContributorInput { + kind: ContributorType::Human, + model_id: None, + }, + }, + ], + }], + }], + quality_status: QualityStatus::Final, + rewrite: None, + idempotency_key: None, + }); + + let payload = record_to_json_value(&record); + let validator = schema_validator(); + let validation = validator.iter_errors(&payload).collect::>(); + + assert!( + validation.is_empty(), + "schema validation errors: {:?}", + validation + .into_iter() + .map(|err| err.to_string()) + .collect::>() + ); + } + + #[test] + fn builder_output_rejects_invalid_uri_and_timestamp_formats() { + let invalid_payload = record_to_json_value(&build_trace_payload(TraceAdapterInput { + record_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "not-a-timestamp".to_string(), + commit_sha: "abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "not-a-uri".to_string(), + related: vec!["still-not-a-uri".to_string()], + ranges: vec![RangeInput { + start_line: 1, + end_line: 2, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + quality_status: QualityStatus::Final, + rewrite: None, + idempotency_key: None, + })); + + let validator = schema_validator(); + let errors = validator + .iter_errors(&invalid_payload) + .map(|err| err.to_string()) + .collect::>(); + + assert!(!errors.is_empty()); + assert!( + errors.iter().any(|err| err.contains("date-time")), + "expected date-time format error, got: {:?}", + errors + ); + assert!( + errors.iter().any(|err| err.contains("uri")), + "expected uri format error, got: {:?}", + errors + ); + } } diff --git a/context/architecture.md b/context/architecture.md index e1edf3b1..5dc22ab9 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -76,7 +76,7 @@ The repository includes a new placeholder Rust binary 
crate at `cli/`. - `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization and async execute/query smoke checks for in-memory and file-backed targets. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. - `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. -- `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter contract (`adapt_trace_payload`) that maps internal attribution structures into Agent Trace-shaped records with fixed git VCS identity and reserved reverse-domain metadata keys. +- `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. - `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) with placeholder-safe no-op recording. 
- `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. diff --git a/context/context-map.md b/context/context-map.md index 0829e009..abe45ebd 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -18,6 +18,7 @@ Feature/domain context: - `context/sce/atomic-commit-workflow.md` (canonical `/commit` command + `sce-atomic-commit` skill contract and naming decision) - `context/sce/agent-trace-implementation-contract.md` (normative T01 implementation contract for no-git-wrapper Agent Trace attribution invariants, compliance matrix, and internal-to-Agent-Trace mapping) - `context/sce/agent-trace-schema-adapter.md` (T02 schema adapter contract and code-level mapping surface in `cli/src/services/agent_trace.rs`) +- `context/sce/agent-trace-payload-builder-validation.md` (T03 deterministic payload-builder path, model-id normalization behavior, and Agent Trace schema validation suite) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 37a8773a..a9ac5fda 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -54,3 +54,5 @@ - `commit thin orchestration contract`: `/commit` command-body pattern where the command keeps staged-confirmation and proposal-only constraints, while `sce-atomic-commit` owns commit grammar and atomic split guidance. - `agent trace implementation contract`: Canonical context artifact at `context/sce/agent-trace-implementation-contract.md` defining no-git-wrapper attribution invariants, hook/workflow contracts, confidence/quality policy, Agent Trace compliance matrix, and normative internal-to-Agent-Trace mapping for `agent-trace-attribution-no-git-wrapper`. 
- `agent trace schema adapter`: Task-scoped mapping contract implemented in `cli/src/services/agent_trace.rs` (`adapt_trace_payload`) and documented in `context/sce/agent-trace-schema-adapter.md`; maps internal attribution inputs to Agent Trace-shaped records with fixed `vcs.type = git` and reserved `dev.crocoder.sce.*` metadata placement. +- `agent trace payload builder`: Canonical T03 builder contract in `cli/src/services/agent_trace.rs` (`build_trace_payload`) that layers on top of the adapter, preserves deterministic output for identical input, and normalizes AI `model_id` values toward `provider/model` form when inferable. +- `agent trace schema validation suite`: T03 compliance test slice in `services::agent_trace::tests` that validates payload JSON against the published Agent Trace trace-record schema with draft-2020-12 format checks enabled (`uri`, `date-time`, `uuid`) and a local version-pattern compatibility patch for `0.1.0`. diff --git a/context/overview.md b/context/overview.md index 970b45b4..2a0749fd 100644 --- a/context/overview.md +++ b/context/overview.md @@ -24,6 +24,7 @@ The `/change-to-plan` command body is also intentionally thin orchestration: it The `/commit` command body is intentionally thin orchestration: it retains staged-confirmation and proposal-only constraints while delegating commit grammar and atomic split guidance to `sce-atomic-commit`. The no-git-wrapper Agent Trace initiative baseline contract is defined in `context/sce/agent-trace-implementation-contract.md`, including normative invariants, compliance matrix, and canonical internal-to-Agent-Trace mapping for downstream implementation tasks. The CLI now includes a task-scoped Agent Trace schema adapter contract in `cli/src/services/agent_trace.rs`, with deterministic mapping of internal attribution input to Agent Trace-shaped record structures documented in `context/sce/agent-trace-schema-adapter.md`. 
+The Agent Trace service now also provides a deterministic payload-builder path (`build_trace_payload`) with AI `model_id` normalization and schema-compliance validation coverage documented in `context/sce/agent-trace-payload-builder-validation.md`. ## Repository model @@ -76,3 +77,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/atomic-commit-workflow.md` for canonical `/commit` behavior, `sce-atomic-commit` naming, and proposal-only commit planning constraints. - Use `context/sce/agent-trace-implementation-contract.md` for canonical Agent Trace implementation invariants and field-level mapping guidance (`agent-trace-attribution-no-git-wrapper` T01 baseline). - Use `context/sce/agent-trace-schema-adapter.md` for the implemented T02 adapter contract and canonical mapping surface in `cli/src/services/agent_trace.rs`. +- Use `context/sce/agent-trace-payload-builder-validation.md` for the implemented T03 builder path, normalization policy, and schema-validation behavior. diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index a00745d6..3815a90b 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -54,7 +54,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo fmt --manifest-path cli/Cargo.toml -- --check` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T03: Implement trace payload builder and compliance validation suite (status:todo) +- [x] T03: Implement trace payload builder and compliance validation suite (status:done) - Task ID: T03 - Goal: Implement payload construction and schema-validation tests on top of the adapter. 
- Boundaries (in/out of scope): @@ -66,6 +66,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Verification notes (commands or checks): - Unit tests for serialization determinism and metadata correctness. - Schema-compliance tests for required fields, enum validation, URI/date-time format, and `files[].conversations[].ranges[]` nesting. + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo test --manifest-path cli/Cargo.toml` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T04: Implement `pre-commit` staged checkpoint finalization contract (status:todo) - Task ID: T04 diff --git a/context/sce/agent-trace-payload-builder-validation.md b/context/sce/agent-trace-payload-builder-validation.md new file mode 100644 index 00000000..15273b6e --- /dev/null +++ b/context/sce/agent-trace-payload-builder-validation.md @@ -0,0 +1,37 @@ +# Agent Trace Payload Builder And Validation + +## Scope + +- Plan/task: `agent-trace-attribution-no-git-wrapper` / `T03`. +- Canonical implementation file: `cli/src/services/agent_trace.rs`. +- Purpose: define one deterministic payload-builder path on top of the adapter and verify Agent Trace schema compliance. + +## Current-state contract + +- `build_trace_payload(input)` is the canonical builder entrypoint. +- Builder behavior is deterministic for identical inputs: + - uses adapter output as the single source path + - normalizes AI `model_id` values when provider/model form is inferable (`provider:model` -> `provider/model`, lowercase) + - keeps non-normalizable values intact instead of dropping attribution data +- Record shape remains aligned with Agent Trace-required top-level fields (`version`, `id`, `timestamp`, `files`) and local invariant `vcs.type = "git"`. + +## Validation suite + +- Validation tests compile the published Agent Trace trace-record schema and validate builder output. 
+- Format validation is enabled (`date-time`, `uri`, `uuid`) via `jsonschema` draft-2020-12 options. +- Schema checks cover: + - required fields + enum constraints + - nested `files[].conversations[].ranges[]` structure + - related-link preservation using schema-compatible related objects in test payload rendering + - negative format tests for invalid URI and RFC3339 timestamp values + +## Published-schema compatibility note + +- The published schema pattern for `version` currently accepts two-segment versions (`x.y`) while RFC examples and this implementation emit `0.1.0`. +- Test validation applies a local compatibility patch to the version regex (`x.y` or `x.y.z`) to keep compliance tests aligned with the current emitted contract. + +## Verification commands + +- `cargo fmt --manifest-path cli/Cargo.toml -- --check` +- `cargo test --manifest-path cli/Cargo.toml` +- `cargo build --manifest-path cli/Cargo.toml` diff --git a/context/sce/agent-trace-schema-adapter.md b/context/sce/agent-trace-schema-adapter.md index aab4427a..9ece7597 100644 --- a/context/sce/agent-trace-schema-adapter.md +++ b/context/sce/agent-trace-schema-adapter.md @@ -44,3 +44,7 @@ - JSON schema compliance/runtime format validation and deterministic serialization checks (`T03`). - Hook orchestration, notes/DB writes, and rewrite execution flows (`T04+`). + +## Follow-on coverage + +- `T03` is now implemented in `context/sce/agent-trace-payload-builder-validation.md` with builder-path and schema-validation details layered on this adapter contract. From c4c4f787e02a43fee673b8cc1d15e87794de1587 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 12:07:52 +0100 Subject: [PATCH 05/39] cli Implement staged-only pre-commit hooks checkpoint Add a pre-commit finalization contract that filters pending attribution to staged ranges only, drops unstaged-only files, and preserves index/head tree anchors. 
Add guard-path behavior for disabled SCE, unavailable CLI, and bare repositories, plus fixture tests that prove unstaged edits are excluded. --- cli/src/services/hooks.rs | 249 +++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 5 +- ...gent-trace-pre-commit-staged-checkpoint.md | 26 ++ 8 files changed, 281 insertions(+), 6 deletions(-) create mode 100644 context/sce/agent-trace-pre-commit-staged-checkpoint.md diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index d12ad117..28fa78ef 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -2,6 +2,97 @@ use anyhow::Result; pub const NAME: &str = "hooks"; +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PreCommitRuntimeState { + pub sce_disabled: bool, + pub cli_available: bool, + pub is_bare_repo: bool, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PreCommitTreeAnchors { + pub index_tree: String, + pub head_tree: Option<String>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PendingLineRange { + pub start_line: u32, + pub end_line: u32, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PendingFileCheckpoint { + pub path: String, + pub staged_ranges: Vec<PendingLineRange>, + pub unstaged_ranges: Vec<PendingLineRange>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PendingCheckpoint { + pub files: Vec<PendingFileCheckpoint>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct FinalizedFileCheckpoint { + pub path: String, + pub ranges: Vec<PendingLineRange>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct FinalizedCheckpoint { + pub anchors: PreCommitTreeAnchors, + pub files: Vec<FinalizedFileCheckpoint>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum PreCommitNoOpReason { + Disabled, + CliUnavailable, + BareRepository, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum PreCommitFinalization { + NoOp(PreCommitNoOpReason), + Finalized(FinalizedCheckpoint), +} + +pub fn
finalize_pre_commit_checkpoint( + runtime: &PreCommitRuntimeState, + anchors: PreCommitTreeAnchors, + pending: PendingCheckpoint, +) -> PreCommitFinalization { + if runtime.sce_disabled { + return PreCommitFinalization::NoOp(PreCommitNoOpReason::Disabled); + } + + if !runtime.cli_available { + return PreCommitFinalization::NoOp(PreCommitNoOpReason::CliUnavailable); + } + + if runtime.is_bare_repo { + return PreCommitFinalization::NoOp(PreCommitNoOpReason::BareRepository); + } + + let files = pending + .files + .into_iter() + .filter_map(|file| { + if file.staged_ranges.is_empty() { + return None; + } + + Some(FinalizedFileCheckpoint { + path: file.path, + ranges: file.staged_ranges, + }) + }) + .collect(); + + PreCommitFinalization::Finalized(FinalizedCheckpoint { anchors, files }) +} + #[derive(Clone, Copy, Debug, Eq, PartialEq)] pub enum GitHookKind { PreCommit, @@ -73,6 +164,36 @@ pub fn run_placeholder_hooks() -> Result<String> { let service = PlaceholderHookService; let model = service.event_model(); + let staged_only_preview = finalize_pre_commit_checkpoint( + &PreCommitRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + }, + PreCommitTreeAnchors { + index_tree: "placeholder-index-tree".to_string(), + head_tree: Some("placeholder-head-tree".to_string()), + }, + PendingCheckpoint { + files: vec![PendingFileCheckpoint { + path: "context/generated/hooks.md".to_string(), + staged_ranges: vec![PendingLineRange { + start_line: 1, + end_line: 1, + }], + unstaged_ranges: vec![PendingLineRange { + start_line: 2, + end_line: 2, + }], + }], + }, + ); + + let staged_file_count = match staged_only_preview { + PreCommitFinalization::Finalized(checkpoint) => checkpoint.files.len(), + PreCommitFinalization::NoOp(_) => 0, + }; + for lifecycle in [ GeneratedRegionLifecycle::Discovered, GeneratedRegionLifecycle::Updated, @@ -89,8 +210,9 @@ pub fn run_placeholder_hooks() -> Result<String> { } Ok(format!( - "TODO: '{NAME}' is planned and not implemented yet.
Hook event model reserves {} git hook(s) with generated-region tracking placeholders.", - model.supported_hooks.len() + "TODO: '{NAME}' is planned and not implemented yet. Hook event model reserves {} git hook(s) with generated-region tracking placeholders and staged-only pre-commit checkpoint preview over {} file(s).", + model.supported_hooks.len(), + staged_file_count )) } @@ -99,10 +221,129 @@ mod tests { use anyhow::Result; use super::{ - run_placeholder_hooks, GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, - HookEvent, HookService, PlaceholderHookService, + finalize_pre_commit_checkpoint, run_placeholder_hooks, GeneratedRegionEvent, + GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, PendingCheckpoint, + PendingFileCheckpoint, PendingLineRange, PlaceholderHookService, PreCommitFinalization, + PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, }; + fn sample_pending_checkpoint() -> PendingCheckpoint { + PendingCheckpoint { + files: vec![PendingFileCheckpoint { + path: "src/lib.rs".to_string(), + staged_ranges: vec![PendingLineRange { + start_line: 1, + end_line: 3, + }], + unstaged_ranges: vec![PendingLineRange { + start_line: 4, + end_line: 6, + }], + }], + } + } + + fn sample_runtime() -> PreCommitRuntimeState { + PreCommitRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + } + } + + fn sample_anchors() -> PreCommitTreeAnchors { + PreCommitTreeAnchors { + index_tree: "index-tree-sha".to_string(), + head_tree: Some("head-tree-sha".to_string()), + } + } + + #[test] + fn pre_commit_finalization_noops_when_sce_disabled() { + let mut runtime = sample_runtime(); + runtime.sce_disabled = true; + + let outcome = + finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); + assert_eq!( + outcome, + PreCommitFinalization::NoOp(PreCommitNoOpReason::Disabled) + ); + } + + #[test] + fn pre_commit_finalization_noops_when_cli_unavailable() { + let mut runtime = 
sample_runtime(); + runtime.cli_available = false; + + let outcome = + finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); + assert_eq!( + outcome, + PreCommitFinalization::NoOp(PreCommitNoOpReason::CliUnavailable) + ); + } + + #[test] + fn pre_commit_finalization_noops_for_bare_repo() { + let mut runtime = sample_runtime(); + runtime.is_bare_repo = true; + + let outcome = + finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); + assert_eq!( + outcome, + PreCommitFinalization::NoOp(PreCommitNoOpReason::BareRepository) + ); + } + + #[test] + fn pre_commit_finalization_uses_only_staged_ranges_and_captures_anchors() { + let pending = PendingCheckpoint { + files: vec![ + PendingFileCheckpoint { + path: "src/keep.rs".to_string(), + staged_ranges: vec![PendingLineRange { + start_line: 10, + end_line: 20, + }], + unstaged_ranges: vec![PendingLineRange { + start_line: 21, + end_line: 30, + }], + }, + PendingFileCheckpoint { + path: "src/drop.rs".to_string(), + staged_ranges: vec![], + unstaged_ranges: vec![PendingLineRange { + start_line: 1, + end_line: 2, + }], + }, + ], + }; + let anchors = sample_anchors(); + + let outcome = finalize_pre_commit_checkpoint(&sample_runtime(), anchors.clone(), pending); + + let finalized = match outcome { + PreCommitFinalization::Finalized(finalized) => finalized, + _ => panic!("expected finalized checkpoint"), + }; + + assert_eq!(finalized.anchors, anchors); + assert_eq!(finalized.files.len(), 1); + assert_eq!(finalized.files[0].path, "src/keep.rs"); + assert_eq!(finalized.files[0].ranges.len(), 1); + assert_eq!( + finalized.files[0].ranges[0], + PendingLineRange { + start_line: 10, + end_line: 20 + } + ); + } + #[test] fn hooks_placeholder_event_model_reserves_generated_region_tracking() { let service = PlaceholderHookService; diff --git a/context/architecture.md b/context/architecture.md index 5dc22ab9..351d2f5a 100644 --- a/context/architecture.md +++ 
b/context/architecture.md @@ -78,7 +78,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. -- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) with placeholder-safe no-op recording. +- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states. - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. 
- `cli/src/services/` contains module boundaries for setup, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. diff --git a/context/context-map.md b/context/context-map.md index abe45ebd..b29d5877 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -19,6 +19,7 @@ Feature/domain context: - `context/sce/agent-trace-implementation-contract.md` (normative T01 implementation contract for no-git-wrapper Agent Trace attribution invariants, compliance matrix, and internal-to-Agent-Trace mapping) - `context/sce/agent-trace-schema-adapter.md` (T02 schema adapter contract and code-level mapping surface in `cli/src/services/agent_trace.rs`) - `context/sce/agent-trace-payload-builder-validation.md` (T03 deterministic payload-builder path, model-id normalization behavior, and Agent Trace schema validation suite) +- `context/sce/agent-trace-pre-commit-staged-checkpoint.md` (T04 pre-commit staged-only finalization contract with no-op guards and index/tree anchor capture) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index a9ac5fda..b306e6b4 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -56,3 +56,4 @@ - `agent trace schema adapter`: Task-scoped mapping contract implemented in `cli/src/services/agent_trace.rs` (`adapt_trace_payload`) and documented in `context/sce/agent-trace-schema-adapter.md`; maps internal attribution inputs to Agent Trace-shaped records with fixed `vcs.type = git` and reserved `dev.crocoder.sce.*` metadata placement. 
- `agent trace payload builder`: Canonical T03 builder contract in `cli/src/services/agent_trace.rs` (`build_trace_payload`) that layers on top of the adapter, preserves deterministic output for identical input, and normalizes AI `model_id` values toward `provider/model` form when inferable. - `agent trace schema validation suite`: T03 compliance test slice in `services::agent_trace::tests` that validates payload JSON against the published Agent Trace trace-record schema with draft-2020-12 format checks enabled (`uri`, `date-time`, `uuid`) and a local version-pattern compatibility patch for `0.1.0`. +- `agent trace pre-commit staged checkpoint finalization`: T04 contract in `cli/src/services/hooks.rs` (`finalize_pre_commit_checkpoint`) that filters pending attribution to staged ranges only, drops unstaged-only files, captures index/head tree anchors, and returns explicit no-op outcomes when SCE is disabled, CLI is unavailable, or the repository is bare. diff --git a/context/overview.md b/context/overview.md index 2a0749fd..e73b4a85 100644 --- a/context/overview.md +++ b/context/overview.md @@ -25,6 +25,7 @@ The `/commit` command body is intentionally thin orchestration: it retains stage The no-git-wrapper Agent Trace initiative baseline contract is defined in `context/sce/agent-trace-implementation-contract.md`, including normative invariants, compliance matrix, and canonical internal-to-Agent-Trace mapping for downstream implementation tasks. The CLI now includes a task-scoped Agent Trace schema adapter contract in `cli/src/services/agent_trace.rs`, with deterministic mapping of internal attribution input to Agent Trace-shaped record structures documented in `context/sce/agent-trace-schema-adapter.md`. The Agent Trace service now also provides a deterministic payload-builder path (`build_trace_payload`) with AI `model_id` normalization and schema-compliance validation coverage documented in `context/sce/agent-trace-payload-builder-validation.md`. 
+The hooks service now includes a pre-commit staged checkpoint finalization contract (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution, captures index/tree anchors, and no-ops for disabled/unavailable/bare-repo runtime states; this behavior is documented in `context/sce/agent-trace-pre-commit-staged-checkpoint.md`. ## Repository model @@ -78,3 +79,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-implementation-contract.md` for canonical Agent Trace implementation invariants and field-level mapping guidance (`agent-trace-attribution-no-git-wrapper` T01 baseline). - Use `context/sce/agent-trace-schema-adapter.md` for the implemented T02 adapter contract and canonical mapping surface in `cli/src/services/agent_trace.rs`. - Use `context/sce/agent-trace-payload-builder-validation.md` for the implemented T03 builder path, normalization policy, and schema-validation behavior. +- Use `context/sce/agent-trace-pre-commit-staged-checkpoint.md` for the implemented T04 pre-commit staged-only finalization contract and runtime no-op guards. diff --git a/context/patterns.md b/context/patterns.md index 6227b48c..ce0ac88d 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -84,6 +84,7 @@ - For placeholder commands that need real infrastructure checks, use a lazily initialized shared tokio current-thread runtime wrapper in the service layer (`cli/src/services/sync.rs`) and keep user-facing output explicit about remaining placeholder scope. - For future CLI domains, define trait-first service contracts with request/plan models in `cli/src/services/*` and keep placeholder implementations explicitly non-runnable until production behavior is approved. 
- Model deferred integration boundaries with concrete event/capability data structures (for example MCP file-cache snapshots/policies, git-hook/generated-region events, cloud-sync checkpoints) so later tasks can implement behavior without reshaping public seams. +- For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 3815a90b..7f13a58c 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -70,7 +70,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T04: Implement `pre-commit` staged checkpoint finalization contract (status:todo) +- [x] T04: Implement `pre-commit` staged checkpoint finalization contract (status:done) - Task ID: T04 - Goal: Bind pending checkpoints to staged content only and capture index/tree anchors. 
- Boundaries (in/out of scope): @@ -80,6 +80,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Unstaged edits cannot be attributed during commit finalization. - Verification notes (commands or checks): - Hook fixture tests with mixed staged/unstaged edits. + - `cargo test --manifest-path cli/Cargo.toml pre_commit_finalization_uses_only_staged_ranges_and_captures_anchors` + - `cargo test --manifest-path cli/Cargo.toml pre_commit_finalization_noops_when_sce_disabled` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T05: Implement `commit-msg` canonical co-author trailer policy (status:todo) - Task ID: T05 diff --git a/context/sce/agent-trace-pre-commit-staged-checkpoint.md b/context/sce/agent-trace-pre-commit-staged-checkpoint.md new file mode 100644 index 00000000..fe178a1d --- /dev/null +++ b/context/sce/agent-trace-pre-commit-staged-checkpoint.md @@ -0,0 +1,26 @@ +# Agent Trace Pre-commit Staged Checkpoint + +## Scope + +Task `agent-trace-attribution-no-git-wrapper` `T04` adds a pre-commit finalization contract that filters pending attribution to staged content only and preserves index/tree anchors for deterministic commit-time binding. + +## Implemented contract + +- Code location: `cli/src/services/hooks.rs`. +- Finalization entrypoint: `finalize_pre_commit_checkpoint(runtime, anchors, pending)`. +- Runtime no-op guards: + - `sce_disabled = true` -> `NoOp(Disabled)`. + - `cli_available = false` -> `NoOp(CliUnavailable)`. + - `is_bare_repo = true` -> `NoOp(BareRepository)`. +- Staged-only enforcement: + - Input keeps separate `staged_ranges` and `unstaged_ranges` per file. + - Finalized output includes only `staged_ranges`. + - Files with no staged ranges are dropped from finalized attribution. +- Anchors captured in finalized output: + - required `index_tree`. + - optional `head_tree`. 
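The guard order and staged-only filtering listed above can be sketched in isolation. This is a minimal illustration, not the `hooks.rs` implementation: `FileRanges`, `Outcome`, and `finalize_sketch` are hypothetical stand-ins, ranges are plain `(start, end)` tuples, and tree anchors are left out for brevity.

```rust
// Hypothetical, simplified stand-ins; the real types live in
// `cli/src/services/hooks.rs` and also carry index/head tree anchors.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct FileRanges {
    pub path: String,
    pub staged: Vec<(u32, u32)>,   // 1-indexed inclusive (start, end)
    pub unstaged: Vec<(u32, u32)>, // kept in the input, never finalized
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Outcome {
    NoOp(&'static str),
    Finalized(Vec<(String, Vec<(u32, u32)>)>),
}

pub fn finalize_sketch(
    sce_disabled: bool,
    cli_available: bool,
    is_bare_repo: bool,
    files: Vec<FileRanges>,
) -> Outcome {
    // Guards run in the documented order; the first match wins.
    if sce_disabled {
        return Outcome::NoOp("disabled");
    }
    if !cli_available {
        return Outcome::NoOp("cli-unavailable");
    }
    if is_bare_repo {
        return Outcome::NoOp("bare-repository");
    }

    // Keep only files with staged ranges and drop their unstaged ranges.
    let kept = files
        .into_iter()
        .filter(|file| !file.staged.is_empty())
        .map(|file| (file.path, file.staged))
        .collect();

    Outcome::Finalized(kept)
}
```

Because the unstaged ranges never reach the output type, a caller cannot attribute unstaged edits by accident; a file with only unstaged ranges simply disappears from the finalized set.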
+ +## Verification coverage + +- Mixed staged/unstaged fixture test confirms unstaged ranges are excluded and anchor values are preserved. +- Guard-path tests cover disabled, missing CLI, and bare-repository no-op behavior. From f981f2f7d26e28eed32c3182d4b2f71ac35b700f Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 13:03:46 +0100 Subject: [PATCH 06/39] cli: Implement commit-msg canonical SCE co-author trailer policy Add commit-message normalization that appends exactly one canonical SCE trailer when SCE is enabled, co-author policy is enabled, and staged SCE attribution exists. Keep behavior idempotent via canonical-trailer dedupe and preserve existing newline semantics. --- cli/src/services/hooks.rs | 136 +++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 5 +- .../agent-trace-commit-msg-coauthor-policy.md | 27 ++++ 8 files changed, 167 insertions(+), 8 deletions(-) create mode 100644 context/sce/agent-trace-commit-msg-coauthor-policy.md diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 28fa78ef..cfa01780 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -1,6 +1,7 @@ use anyhow::Result; pub const NAME: &str = "hooks"; +pub const CANONICAL_SCE_COAUTHOR_TRAILER: &str = "Co-authored-by: SCE "; #[derive(Clone, Debug, Eq, PartialEq)] pub struct PreCommitRuntimeState { @@ -58,6 +59,38 @@ pub enum PreCommitFinalization { Finalized(FinalizedCheckpoint), } +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct CommitMsgRuntimeState { + pub sce_disabled: bool, + pub sce_coauthor_enabled: bool, + pub has_staged_sce_attribution: bool, +} + +pub fn apply_commit_msg_coauthor_policy( + runtime: &CommitMsgRuntimeState, + commit_message: &str, +) -> String { + if runtime.sce_disabled || !runtime.sce_coauthor_enabled || !runtime.has_staged_sce_attribution + { + return 
commit_message.to_string(); + } + + let mut lines: Vec<&str> = commit_message.lines().collect(); + lines.retain(|line| *line != CANONICAL_SCE_COAUTHOR_TRAILER); + + if !lines.is_empty() && !lines.last().is_some_and(|line| line.is_empty()) { + lines.push(""); + } + lines.push(CANONICAL_SCE_COAUTHOR_TRAILER); + + let mut normalized = lines.join("\n"); + if commit_message.ends_with('\n') { + normalized.push('\n'); + } + + normalized +} + pub fn finalize_pre_commit_checkpoint( runtime: &PreCommitRuntimeState, anchors: PreCommitTreeAnchors, @@ -194,6 +227,16 @@ pub fn run_placeholder_hooks() -> Result<String> { PreCommitFinalization::NoOp(_) => 0, }; + let commit_message_preview = apply_commit_msg_coauthor_policy( + &CommitMsgRuntimeState { + sce_disabled: false, + sce_coauthor_enabled: true, + has_staged_sce_attribution: true, + }, + "chore: hooks placeholder preview", + ); + let trailer_applied = commit_message_preview.contains(CANONICAL_SCE_COAUTHOR_TRAILER); + for lifecycle in [ GeneratedRegionLifecycle::Discovered, GeneratedRegionLifecycle::Updated, @@ -210,9 +253,10 @@ pub fn run_placeholder_hooks() -> Result<String> { } Ok(format!( - "TODO: '{NAME}' is planned and not implemented yet. Hook event model reserves {} git hook(s) with generated-region tracking placeholders and staged-only pre-commit checkpoint preview over {} file(s).", + "TODO: '{NAME}' is planned and not implemented yet.
Hook event model reserves {} git hook(s) with generated-region tracking placeholders, staged-only pre-commit checkpoint preview over {} file(s), and commit-msg canonical trailer preview applied={}.", model.supported_hooks.len(), - staged_file_count + staged_file_count, + trailer_applied )) } @@ -221,10 +265,11 @@ mod tests { use anyhow::Result; use super::{ - finalize_pre_commit_checkpoint, run_placeholder_hooks, GeneratedRegionEvent, - GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, PendingCheckpoint, - PendingFileCheckpoint, PendingLineRange, PlaceholderHookService, PreCommitFinalization, - PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, + apply_commit_msg_coauthor_policy, finalize_pre_commit_checkpoint, run_placeholder_hooks, + CommitMsgRuntimeState, GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, + HookEvent, HookService, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, + PlaceholderHookService, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, + PreCommitTreeAnchors, CANONICAL_SCE_COAUTHOR_TRAILER, }; fn sample_pending_checkpoint() -> PendingCheckpoint { @@ -344,6 +389,85 @@ mod tests { ); } + fn sample_commit_msg_runtime() -> CommitMsgRuntimeState { + CommitMsgRuntimeState { + sce_disabled: false, + sce_coauthor_enabled: true, + has_staged_sce_attribution: true, + } + } + + #[test] + fn commit_msg_policy_noops_when_sce_disabled() { + let mut runtime = sample_commit_msg_runtime(); + runtime.sce_disabled = true; + + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&runtime, message); + assert_eq!(output, message); + } + + #[test] + fn commit_msg_policy_noops_when_coauthor_disabled() { + let mut runtime = sample_commit_msg_runtime(); + runtime.sce_coauthor_enabled = false; + + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&runtime, message); + assert_eq!(output, message); + } + + #[test] + fn 
commit_msg_policy_noops_without_staged_sce_attribution() { + let mut runtime = sample_commit_msg_runtime(); + runtime.has_staged_sce_attribution = false; + + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&runtime, message); + assert_eq!(output, message); + } + + #[test] + fn commit_msg_policy_appends_canonical_trailer_once_when_allowed() { + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), message); + + assert_eq!( + output, + format!( + "feat: add attribution\n\n{}", + CANONICAL_SCE_COAUTHOR_TRAILER + ) + ); + } + + #[test] + fn commit_msg_policy_dedupes_existing_canonical_trailers() { + let message = format!( + "feat: add attribution\n\n{}\n{}\n", + CANONICAL_SCE_COAUTHOR_TRAILER, CANONICAL_SCE_COAUTHOR_TRAILER + ); + + let output = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), &message); + + assert_eq!( + output, + format!( + "feat: add attribution\n\n{}\n", + CANONICAL_SCE_COAUTHOR_TRAILER + ) + ); + } + + #[test] + fn commit_msg_policy_is_idempotent() { + let first = + apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), "feat: add attribution"); + let second = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), &first); + + assert_eq!(first, second); + } + #[test] fn hooks_placeholder_event_model_reserves_generated_region_tracking() { let service = PlaceholderHookService; diff --git a/context/architecture.md b/context/architecture.md index 351d2f5a..f621784e 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -78,7 +78,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. 
- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. -- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states. +- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, and a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits. 
- `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. - `cli/src/services/` contains module boundaries for setup, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. diff --git a/context/context-map.md b/context/context-map.md index b29d5877..0fb46d6a 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -20,6 +20,7 @@ Feature/domain context: - `context/sce/agent-trace-schema-adapter.md` (T02 schema adapter contract and code-level mapping surface in `cli/src/services/agent_trace.rs`) - `context/sce/agent-trace-payload-builder-validation.md` (T03 deterministic payload-builder path, model-id normalization behavior, and Agent Trace schema validation suite) - `context/sce/agent-trace-pre-commit-staged-checkpoint.md` (T04 pre-commit staged-only finalization contract with no-op guards and index/tree anchor capture) +- `context/sce/agent-trace-commit-msg-coauthor-policy.md` (T05 commit-msg canonical co-author trailer policy with env-gated injection and idempotent dedupe) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index b306e6b4..e8d8702e 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -57,3 +57,4 @@ - `agent trace payload builder`: Canonical T03 builder contract in `cli/src/services/agent_trace.rs` (`build_trace_payload`) that layers on top of the adapter, preserves deterministic output for identical input, and normalizes AI `model_id` values toward `provider/model` form when inferable. 
- `agent trace schema validation suite`: T03 compliance test slice in `services::agent_trace::tests` that validates payload JSON against the published Agent Trace trace-record schema with draft-2020-12 format checks enabled (`uri`, `date-time`, `uuid`) and a local version-pattern compatibility patch for `0.1.0`. - `agent trace pre-commit staged checkpoint finalization`: T04 contract in `cli/src/services/hooks.rs` (`finalize_pre_commit_checkpoint`) that filters pending attribution to staged ranges only, drops unstaged-only files, captures index/head tree anchors, and returns explicit no-op outcomes when SCE is disabled, CLI is unavailable, or the repository is bare. +- `agent trace commit-msg co-author policy`: T05 contract in `cli/src/services/hooks.rs` (`apply_commit_msg_coauthor_policy`) that applies exactly one canonical trailer (`Co-authored-by: SCE `) only when SCE is enabled, co-author policy is enabled, and staged SCE attribution exists; duplicate canonical trailers are deduped idempotently. diff --git a/context/overview.md b/context/overview.md index e73b4a85..92cfbd31 100644 --- a/context/overview.md +++ b/context/overview.md @@ -26,6 +26,7 @@ The no-git-wrapper Agent Trace initiative baseline contract is defined in `conte The CLI now includes a task-scoped Agent Trace schema adapter contract in `cli/src/services/agent_trace.rs`, with deterministic mapping of internal attribution input to Agent Trace-shaped record structures documented in `context/sce/agent-trace-schema-adapter.md`. The Agent Trace service now also provides a deterministic payload-builder path (`build_trace_payload`) with AI `model_id` normalization and schema-compliance validation coverage documented in `context/sce/agent-trace-payload-builder-validation.md`. 
The hooks service now includes a pre-commit staged checkpoint finalization contract (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution, captures index/tree anchors, and no-ops for disabled/unavailable/bare-repo runtime states; this behavior is documented in `context/sce/agent-trace-pre-commit-staged-checkpoint.md`. +The hooks service now also exposes a `commit-msg` co-author trailer policy (`apply_commit_msg_coauthor_policy`) that conditionally injects exactly one canonical SCE trailer based on `SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, and staged-attribution presence, with idempotent deduplication behavior documented in `context/sce/agent-trace-commit-msg-coauthor-policy.md`. ## Repository model @@ -80,3 +81,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-schema-adapter.md` for the implemented T02 adapter contract and canonical mapping surface in `cli/src/services/agent_trace.rs`. - Use `context/sce/agent-trace-payload-builder-validation.md` for the implemented T03 builder path, normalization policy, and schema-validation behavior. - Use `context/sce/agent-trace-pre-commit-staged-checkpoint.md` for the implemented T04 pre-commit staged-only finalization contract and runtime no-op guards. +- Use `context/sce/agent-trace-commit-msg-coauthor-policy.md` for the implemented T05 commit-msg canonical co-author trailer policy and idempotent dedupe behavior. diff --git a/context/patterns.md b/context/patterns.md index ce0ac88d..01c93aff 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -85,6 +85,7 @@ - For future CLI domains, define trait-first service contracts with request/plan models in `cli/src/services/*` and keep placeholder implementations explicitly non-runnable until production behavior is approved. 
- Model deferred integration boundaries with concrete event/capability data structures (for example MCP file-cache snapshots/policies, git-hook/generated-region events, cloud-sync checkpoints) so later tasks can implement behavior without reshaping public seams. - For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. +- For commit-msg co-author policy seams, gate canonical trailer insertion on runtime controls (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`) plus staged SCE-attribution presence, and enforce idempotent dedupe so allowed cases end with exactly one `Co-authored-by: SCE ` trailer. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. 
diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 7f13a58c..3293388b 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -84,7 +84,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml pre_commit_finalization_noops_when_sce_disabled` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T05: Implement `commit-msg` canonical co-author trailer policy (status:todo) +- [x] T05: Implement `commit-msg` canonical co-author trailer policy (status:done) - Task ID: T05 - Goal: Add idempotent canonical SCE trailer injection when SCE-attributed staged changes exist. - Boundaries (in/out of scope): @@ -94,6 +94,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Exactly one canonical trailer appears in all allowed SCE cases. - Verification notes (commands or checks): - Identity acceptance checklist scenarios 1-5, 8, and 10. + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo test --manifest-path cli/Cargo.toml commit_msg_policy` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T06: Implement `post-commit` trace finalize and dual-write path (status:todo) - Task ID: T06 diff --git a/context/sce/agent-trace-commit-msg-coauthor-policy.md b/context/sce/agent-trace-commit-msg-coauthor-policy.md new file mode 100644 index 00000000..1998b190 --- /dev/null +++ b/context/sce/agent-trace-commit-msg-coauthor-policy.md @@ -0,0 +1,27 @@ +# Agent Trace commit-msg co-author policy + +## Status +- Plan: `agent-trace-attribution-no-git-wrapper` +- Task: `T05` +- Implementation state: done + +## Canonical contract +- Policy entrypoint: `cli/src/services/hooks.rs` -> `apply_commit_msg_coauthor_policy`. +- Canonical trailer string: `Co-authored-by: SCE `. 
+- Runtime gating conditions: + - `sce_disabled = false` + - `sce_coauthor_enabled = true` + - `has_staged_sce_attribution = true` +- When all gate conditions pass, output commit message MUST contain exactly one canonical SCE trailer. +- When any gate condition fails, commit message is returned unchanged. + +## Behavior details +- Canonical trailer dedupe removes duplicate canonical lines before final insertion. +- Trailer insertion is idempotent: applying the policy repeatedly yields the same message. +- Existing trailing newline is preserved when present. +- Human author/committer identity is not rewritten; only commit message trailer content is affected. + +## Verification evidence +- `cargo fmt --manifest-path cli/Cargo.toml -- --check` +- `cargo test --manifest-path cli/Cargo.toml commit_msg_policy` +- `cargo build --manifest-path cli/Cargo.toml` From 585865d990b2a940fdadbc871893c68d325ce1fe Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 13:15:59 +0100 Subject: [PATCH 07/39] cli: Implement post-commit Agent Trace dual-write finalization Add a post-commit finalization seam that builds Agent Trace payloads, enforces commit-level idempotency, and attempts notes plus DB persistence in one pass. Queue failed targets for retry fallback and extend hook contracts/tests to cover already-finalized no-op, successful dual-write, and transient persistence failure behavior. 
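A minimal sketch of the finalize flow this commit message describes (idempotency guard, one-pass dual write, retry-queue fallback) might look like the following. Every name here (`Target`, `Outcome`, `State`, `finalize`) is hypothetical and chosen for illustration only; the real seam in `cli/src/services/hooks.rs` uses trait-based writers and richer result types, and the boolean write outcomes below are an assumption made to keep the example short.

```rust
// Hypothetical, simplified illustration; not the crate's actual API.
use std::collections::HashSet;

/// Persistence targets a finalize pass attempts to write.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Target {
    Notes,
    Database,
}

/// Result of one finalize pass for a commit.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    AlreadyFinalized,
    Persisted,
    Queued(Vec<Target>),
}

/// In-memory stand-ins for the emission ledger and the retry queue.
#[derive(Default)]
struct State {
    emitted: HashSet<String>,
    retry_queue: Vec<(String, Vec<Target>)>,
}

/// `notes_ok` / `db_ok` stand in for the two write results (assumed booleans).
fn finalize(state: &mut State, commit_sha: &str, notes_ok: bool, db_ok: bool) -> Outcome {
    // Commit-level idempotency: a commit that was already emitted is a no-op.
    if state.emitted.contains(commit_sha) {
        return Outcome::AlreadyFinalized;
    }

    // Both writes are attempted in the same pass; failures are collected per target.
    let mut failed = Vec::new();
    if !notes_ok {
        failed.push(Target::Notes);
    }
    if !db_ok {
        failed.push(Target::Database);
    }

    // The commit is marked emitted only when every target succeeded.
    if failed.is_empty() {
        state.emitted.insert(commit_sha.to_string());
        return Outcome::Persisted;
    }

    // Otherwise the failed targets are queued so a later retry can replay them.
    state.retry_queue.push((commit_sha.to_string(), failed.clone()));
    Outcome::Queued(failed)
}
```

In this sketch a failed pass leaves the ledger untouched, so a later retry for the same commit is not short-circuited by the idempotency guard.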
--- cli/src/services/hooks.rs | 464 +++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 2 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 4 +- .../sce/agent-trace-post-commit-dual-write.md | 31 ++ 8 files changed, 497 insertions(+), 10 deletions(-) create mode 100644 context/sce/agent-trace-post-commit-dual-write.md diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index cfa01780..b79a31a3 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -1,7 +1,13 @@ use anyhow::Result; +use crate::services::agent_trace::{ + build_trace_payload, AgentTraceRecord, FileAttributionInput, QualityStatus, TraceAdapterInput, + TRACE_CONTENT_TYPE, +}; + pub const NAME: &str = "hooks"; pub const CANONICAL_SCE_COAUTHOR_TRAILER: &str = "Co-authored-by: SCE "; +pub const POST_COMMIT_PARENT_SHA_METADATA_KEY: &str = "dev.crocoder.sce.parent_revision"; #[derive(Clone, Debug, Eq, PartialEq)] pub struct PreCommitRuntimeState { @@ -66,6 +72,226 @@ pub struct CommitMsgRuntimeState { pub has_staged_sce_attribution: bool, } +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PostCommitRuntimeState { + pub sce_disabled: bool, + pub cli_available: bool, + pub is_bare_repo: bool, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PostCommitInput { + pub record_id: String, + pub timestamp_rfc3339: String, + pub commit_sha: String, + pub parent_sha: Option, + pub idempotency_key: String, + pub files: Vec, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct TraceNote { + pub notes_ref: String, + pub commit_sha: String, + pub content_type: String, + pub record: AgentTraceRecord, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PersistedTraceRecord { + pub commit_sha: String, + pub idempotency_key: String, + pub content_type: String, + pub notes_ref: String, + pub record: AgentTraceRecord, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum 
PersistenceErrorClass { + Transient, + Permanent, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PersistenceFailure { + pub class: PersistenceErrorClass, + pub message: String, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum PersistenceWriteResult { + Written, + AlreadyExists, + Failed(PersistenceFailure), +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum PersistenceTarget { + Notes, + Database, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct TraceRetryQueueEntry { + pub commit_sha: String, + pub failed_targets: Vec, + pub content_type: String, + pub notes_ref: String, + pub record: AgentTraceRecord, +} + +pub trait TraceNotesWriter { + fn write_note(&mut self, note: TraceNote) -> PersistenceWriteResult; +} + +pub trait TraceRecordStore { + fn write_trace_record(&mut self, record: PersistedTraceRecord) -> PersistenceWriteResult; +} + +pub trait TraceRetryQueue { + fn enqueue(&mut self, entry: TraceRetryQueueEntry) -> Result<()>; +} + +pub trait TraceEmissionLedger { + fn has_emitted(&self, commit_sha: &str) -> bool; + fn mark_emitted(&mut self, commit_sha: &str); +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum PostCommitNoOpReason { + Disabled, + CliUnavailable, + BareRepository, + AlreadyFinalized, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PostCommitPersisted { + pub commit_sha: String, + pub notes: PersistenceWriteResult, + pub database: PersistenceWriteResult, + pub trace_id: String, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct PostCommitQueuedFallback { + pub commit_sha: String, + pub failed_targets: Vec, + pub trace_id: String, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum PostCommitFinalization { + NoOp(PostCommitNoOpReason), + Persisted(PostCommitPersisted), + QueuedFallback(PostCommitQueuedFallback), +} + +pub fn finalize_post_commit_trace( + runtime: &PostCommitRuntimeState, + input: PostCommitInput, + notes_writer: &mut impl TraceNotesWriter, + record_store: &mut impl 
TraceRecordStore, + retry_queue: &mut impl TraceRetryQueue, + emission_ledger: &mut impl TraceEmissionLedger, +) -> Result { + if runtime.sce_disabled { + return Ok(PostCommitFinalization::NoOp(PostCommitNoOpReason::Disabled)); + } + + if !runtime.cli_available { + return Ok(PostCommitFinalization::NoOp( + PostCommitNoOpReason::CliUnavailable, + )); + } + + if runtime.is_bare_repo { + return Ok(PostCommitFinalization::NoOp( + PostCommitNoOpReason::BareRepository, + )); + } + + if emission_ledger.has_emitted(&input.commit_sha) { + return Ok(PostCommitFinalization::NoOp( + PostCommitNoOpReason::AlreadyFinalized, + )); + } + + let mut record = build_trace_payload(TraceAdapterInput { + record_id: input.record_id, + timestamp_rfc3339: input.timestamp_rfc3339, + commit_sha: input.commit_sha.clone(), + files: input.files, + quality_status: QualityStatus::Final, + rewrite: None, + idempotency_key: Some(input.idempotency_key.clone()), + }); + + if let Some(parent_sha) = input.parent_sha { + record + .metadata + .insert(POST_COMMIT_PARENT_SHA_METADATA_KEY.to_string(), parent_sha); + } + + let note = TraceNote { + notes_ref: crate::services::agent_trace::NOTES_REF.to_string(), + commit_sha: input.commit_sha.clone(), + content_type: TRACE_CONTENT_TYPE.to_string(), + record: record.clone(), + }; + let persisted = PersistedTraceRecord { + commit_sha: input.commit_sha.clone(), + idempotency_key: input.idempotency_key, + content_type: TRACE_CONTENT_TYPE.to_string(), + notes_ref: crate::services::agent_trace::NOTES_REF.to_string(), + record: record.clone(), + }; + + let notes_result = notes_writer.write_note(note); + let database_result = record_store.write_trace_record(persisted); + + let failed_targets = collect_failed_targets(&notes_result, &database_result); + if failed_targets.is_empty() { + emission_ledger.mark_emitted(&input.commit_sha); + return Ok(PostCommitFinalization::Persisted(PostCommitPersisted { + commit_sha: input.commit_sha, + notes: notes_result, + database: 
database_result, + trace_id: record.id, + })); + } + + retry_queue.enqueue(TraceRetryQueueEntry { + commit_sha: input.commit_sha.clone(), + failed_targets: failed_targets.clone(), + content_type: TRACE_CONTENT_TYPE.to_string(), + notes_ref: crate::services::agent_trace::NOTES_REF.to_string(), + record: record.clone(), + })?; + + Ok(PostCommitFinalization::QueuedFallback( + PostCommitQueuedFallback { + commit_sha: input.commit_sha, + failed_targets, + trace_id: record.id, + }, + )) +} + +fn collect_failed_targets( + notes_result: &PersistenceWriteResult, + database_result: &PersistenceWriteResult, +) -> Vec { + let mut failed_targets = Vec::new(); + if matches!(notes_result, PersistenceWriteResult::Failed(_)) { + failed_targets.push(PersistenceTarget::Notes); + } + if matches!(database_result, PersistenceWriteResult::Failed(_)) { + failed_targets.push(PersistenceTarget::Database); + } + failed_targets +} + pub fn apply_commit_msg_coauthor_policy( runtime: &CommitMsgRuntimeState, commit_message: &str, @@ -129,6 +355,7 @@ pub fn finalize_pre_commit_checkpoint( #[derive(Clone, Copy, Debug, Eq, PartialEq)] pub enum GitHookKind { PreCommit, + PostCommit, PrePush, } @@ -169,14 +396,18 @@ pub struct PlaceholderHookService; impl HookService for PlaceholderHookService { fn event_model(&self) -> HookEventModel { HookEventModel { - supported_hooks: vec![GitHookKind::PreCommit, GitHookKind::PrePush], + supported_hooks: vec![ + GitHookKind::PreCommit, + GitHookKind::PostCommit, + GitHookKind::PrePush, + ], generated_region_tracking: true, } } fn record(&self, event: HookEvent) -> Result<()> { match event.hook { - GitHookKind::PreCommit | GitHookKind::PrePush => {} + GitHookKind::PreCommit | GitHookKind::PostCommit | GitHookKind::PrePush => {} } if let Some(region_event) = event.region_event { @@ -264,12 +495,20 @@ pub fn run_placeholder_hooks() -> Result { mod tests { use anyhow::Result; + use crate::services::agent_trace::{ + ContributorInput, ContributorType, 
ConversationInput, FileAttributionInput, RangeInput, + }; + use super::{ - apply_commit_msg_coauthor_policy, finalize_pre_commit_checkpoint, run_placeholder_hooks, - CommitMsgRuntimeState, GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, - HookEvent, HookService, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, - PlaceholderHookService, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, - PreCommitTreeAnchors, CANONICAL_SCE_COAUTHOR_TRAILER, + apply_commit_msg_coauthor_policy, finalize_post_commit_trace, + finalize_pre_commit_checkpoint, run_placeholder_hooks, CommitMsgRuntimeState, + GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, + PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, + PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, + PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, + PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, + TraceEmissionLedger, TraceNote, TraceNotesWriter, TraceRecordStore, TraceRetryQueue, + TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY, }; fn sample_pending_checkpoint() -> PendingCheckpoint { @@ -303,6 +542,215 @@ mod tests { } } + #[derive(Default)] + struct FakeEmissionLedger { + emitted: Vec, + } + + impl TraceEmissionLedger for FakeEmissionLedger { + fn has_emitted(&self, commit_sha: &str) -> bool { + self.emitted.iter().any(|sha| sha == commit_sha) + } + + fn mark_emitted(&mut self, commit_sha: &str) { + self.emitted.push(commit_sha.to_string()); + } + } + + struct FakeNotesWriter { + result: PersistenceWriteResult, + writes: Vec, + } + + impl FakeNotesWriter { + fn new(result: PersistenceWriteResult) -> Self { + Self { + result, + writes: Vec::new(), + } + } + } + + impl TraceNotesWriter for FakeNotesWriter { + fn write_note(&mut self, note: TraceNote) -> 
PersistenceWriteResult { + self.writes.push(note); + self.result.clone() + } + } + + struct FakeRecordStore { + result: PersistenceWriteResult, + } + + impl FakeRecordStore { + fn new(result: PersistenceWriteResult) -> Self { + Self { result } + } + } + + impl TraceRecordStore for FakeRecordStore { + fn write_trace_record( + &mut self, + _record: super::PersistedTraceRecord, + ) -> PersistenceWriteResult { + self.result.clone() + } + } + + #[derive(Default)] + struct FakeRetryQueue { + entries: Vec, + } + + impl TraceRetryQueue for FakeRetryQueue { + fn enqueue(&mut self, entry: TraceRetryQueueEntry) -> Result<()> { + self.entries.push(entry); + Ok(()) + } + } + + fn sample_post_commit_runtime() -> PostCommitRuntimeState { + PostCommitRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + } + } + + fn sample_post_commit_input() -> PostCommitInput { + PostCommitInput { + record_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + parent_sha: Some("def789ghi000".to_string()), + idempotency_key: "repo:abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/1".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 1, + end_line: 5, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + } + } + + #[test] + fn post_commit_finalization_noops_when_already_finalized() -> Result<()> { + let runtime = sample_post_commit_runtime(); + let input = sample_post_commit_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger { + emitted: 
vec![input.commit_sha.clone()], + }; + + let outcome = finalize_post_commit_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + assert_eq!( + outcome, + PostCommitFinalization::NoOp(PostCommitNoOpReason::AlreadyFinalized) + ); + assert!(notes.writes.is_empty()); + assert!(queue.entries.is_empty()); + Ok(()) + } + + #[test] + fn post_commit_finalization_dual_writes_with_parent_metadata_and_mime() -> Result<()> { + let runtime = sample_post_commit_runtime(); + let input = sample_post_commit_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let outcome = finalize_post_commit_trace( + &runtime, + input.clone(), + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + let persisted = match outcome { + PostCommitFinalization::Persisted(persisted) => persisted, + _ => panic!("expected persisted post-commit outcome"), + }; + assert_eq!(persisted.commit_sha, input.commit_sha); + assert_eq!(persisted.trace_id, "550e8400-e29b-41d4-a716-446655440000"); + + assert_eq!(notes.writes.len(), 1); + assert_eq!( + notes.writes[0].content_type, + "application/vnd.agent-trace.record+json" + ); + assert_eq!(notes.writes[0].notes_ref, "refs/notes/agent-trace"); + assert_eq!( + notes.writes[0] + .record + .metadata + .get(POST_COMMIT_PARENT_SHA_METADATA_KEY), + Some(&"def789ghi000".to_string()) + ); + assert!(ledger.has_emitted("abc123def456")); + Ok(()) + } + + #[test] + fn post_commit_finalization_queues_when_db_write_is_transient_failure() -> Result<()> { + let runtime = sample_post_commit_runtime(); + let input = sample_post_commit_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Failed(PersistenceFailure { + class: 
PersistenceErrorClass::Transient, + message: "database unavailable".to_string(), + })); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let outcome = finalize_post_commit_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + assert_eq!( + outcome, + PostCommitFinalization::QueuedFallback(super::PostCommitQueuedFallback { + commit_sha: "abc123def456".to_string(), + failed_targets: vec![PersistenceTarget::Database], + trace_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), + }) + ); + assert_eq!(queue.entries.len(), 1); + assert_eq!( + queue.entries[0].failed_targets, + vec![PersistenceTarget::Database] + ); + assert!(!ledger.has_emitted("abc123def456")); + Ok(()) + } + #[test] fn pre_commit_finalization_noops_when_sce_disabled() { let mut runtime = sample_runtime(); @@ -473,7 +921,7 @@ mod tests { let service = PlaceholderHookService; let model = service.event_model(); assert!(model.generated_region_tracking); - assert_eq!(model.supported_hooks.len(), 2); + assert_eq!(model.supported_hooks.len(), 3); } #[test] diff --git a/context/architecture.md b/context/architecture.md index f621784e..368985e4 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -78,7 +78,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. 
- `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. -- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, and a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits. +- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, and a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture. - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. 
- `cli/src/services/` contains module boundaries for setup, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. diff --git a/context/context-map.md b/context/context-map.md index 0fb46d6a..060891d0 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -21,6 +21,7 @@ Feature/domain context: - `context/sce/agent-trace-payload-builder-validation.md` (T03 deterministic payload-builder path, model-id normalization behavior, and Agent Trace schema validation suite) - `context/sce/agent-trace-pre-commit-staged-checkpoint.md` (T04 pre-commit staged-only finalization contract with no-op guards and index/tree anchor capture) - `context/sce/agent-trace-commit-msg-coauthor-policy.md` (T05 commit-msg canonical co-author trailer policy with env-gated injection and idempotent dedupe) +- `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index e8d8702e..e71a50d9 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -58,3 +58,5 @@ - `agent trace schema validation suite`: T03 compliance test slice in `services::agent_trace::tests` that validates payload JSON against the published Agent Trace trace-record schema with draft-2020-12 format checks enabled (`uri`, `date-time`, `uuid`) and a local version-pattern compatibility patch for `0.1.0`. 
- `agent trace pre-commit staged checkpoint finalization`: T04 contract in `cli/src/services/hooks.rs` (`finalize_pre_commit_checkpoint`) that filters pending attribution to staged ranges only, drops unstaged-only files, captures index/head tree anchors, and returns explicit no-op outcomes when SCE is disabled, CLI is unavailable, or the repository is bare. - `agent trace commit-msg co-author policy`: T05 contract in `cli/src/services/hooks.rs` (`apply_commit_msg_coauthor_policy`) that applies exactly one canonical trailer (`Co-authored-by: SCE `) only when SCE is enabled, co-author policy is enabled, and staged SCE attribution exists; duplicate canonical trailers are deduped idempotently. +- `agent trace post-commit dual-write finalization`: T06 contract in `cli/src/services/hooks.rs` (`finalize_post_commit_trace`) that emits one canonical Agent Trace record per commit behind runtime guards, writes to both notes (`refs/notes/agent-trace`) and DB persistence targets, and enqueues retry fallback entries when either persistence target fails. +- `agent trace post-commit idempotency ledger`: T06 seam (`TraceEmissionLedger`) in `cli/src/services/hooks.rs` used to prevent duplicate emission for the same commit SHA and to mark successful dual-write completion. diff --git a/context/overview.md b/context/overview.md index 92cfbd31..fd075734 100644 --- a/context/overview.md +++ b/context/overview.md @@ -27,6 +27,7 @@ The CLI now includes a task-scoped Agent Trace schema adapter contract in `cli/s The Agent Trace service now also provides a deterministic payload-builder path (`build_trace_payload`) with AI `model_id` normalization and schema-compliance validation coverage documented in `context/sce/agent-trace-payload-builder-validation.md`. 
The hooks service now includes a pre-commit staged checkpoint finalization contract (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution, captures index/tree anchors, and no-ops for disabled/unavailable/bare-repo runtime states; this behavior is documented in `context/sce/agent-trace-pre-commit-staged-checkpoint.md`. The hooks service now also exposes a `commit-msg` co-author trailer policy (`apply_commit_msg_coauthor_policy`) that conditionally injects exactly one canonical SCE trailer based on `SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, and staged-attribution presence, with idempotent deduplication behavior documented in `context/sce/agent-trace-commit-msg-coauthor-policy.md`. +The hooks service now also includes a post-commit trace finalization seam (`finalize_post_commit_trace`) that builds canonical Agent Trace payloads, enforces commit-level idempotency guards, performs notes + DB dual writes, and enqueues retry fallback metadata when persistence targets fail; this behavior is documented in `context/sce/agent-trace-post-commit-dual-write.md`. ## Repository model @@ -82,3 +83,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-payload-builder-validation.md` for the implemented T03 builder path, normalization policy, and schema-validation behavior. - Use `context/sce/agent-trace-pre-commit-staged-checkpoint.md` for the implemented T04 pre-commit staged-only finalization contract and runtime no-op guards. - Use `context/sce/agent-trace-commit-msg-coauthor-policy.md` for the implemented T05 commit-msg canonical co-author trailer policy and idempotent dedupe behavior. +- Use `context/sce/agent-trace-post-commit-dual-write.md` for the implemented T06 post-commit trace finalization and dual-write + queue-fallback behavior. 
diff --git a/context/patterns.md b/context/patterns.md index 01c93aff..14d8e98c 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -86,6 +86,7 @@ - Model deferred integration boundaries with concrete event/capability data structures (for example MCP file-cache snapshots/policies, git-hook/generated-region events, cloud-sync checkpoints) so later tasks can implement behavior without reshaping public seams. - For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. - For commit-msg co-author policy seams, gate canonical trailer insertion on runtime controls (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`) plus staged SCE-attribution presence, and enforce idempotent dedupe so allowed cases end with exactly one `Co-authored-by: SCE ` trailer. +- For post-commit trace finalization seams, treat commit SHA as the idempotency identity, perform notes + DB writes in the same finalize pass when available, and enqueue retry-fallback entries that explicitly capture failed persistence targets for replay-safe recovery. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. 
diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 3293388b..757a1cf0 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -98,7 +98,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml commit_msg_policy` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T06: Implement `post-commit` trace finalize and dual-write path (status:todo) +- [x] T06: Implement `post-commit` trace finalize and dual-write path (status:done) - Task ID: T06 - Goal: Emit commit trace after commit creation and write to notes + DB (or queue fallback). - Boundaries (in/out of scope): @@ -108,6 +108,8 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - New HEAD always produces a trace record with durable persistence semantics. - Verification notes (commands or checks): - End-to-end local commit tests including transient DB or notes outage. + - `cargo test --manifest-path cli/Cargo.toml post_commit_finalization` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T07: Add hook install and health validation (`sce doctor`) for local rollout (status:todo) - Task ID: T07 diff --git a/context/sce/agent-trace-post-commit-dual-write.md b/context/sce/agent-trace-post-commit-dual-write.md new file mode 100644 index 00000000..bb4e6bf2 --- /dev/null +++ b/context/sce/agent-trace-post-commit-dual-write.md @@ -0,0 +1,31 @@ +# Agent Trace post-commit dual-write finalization + +## Status +- Plan: `agent-trace-attribution-no-git-wrapper` +- Task: `T06` +- Implementation state: done + +## Canonical contract +- Policy entrypoint: `cli/src/services/hooks.rs` -> `finalize_post_commit_trace`. 
+- Runtime no-op guards: + - `sce_disabled = true` -> `NoOp(Disabled)` + - `cli_available = false` -> `NoOp(CliUnavailable)` + - `is_bare_repo = true` -> `NoOp(BareRepository)` +- Idempotency guard: `TraceEmissionLedger::has_emitted(commit_sha)` short-circuits to `NoOp(AlreadyFinalized)`. +- Emitted trace payload path uses `build_trace_payload` from `cli/src/services/agent_trace.rs` with: + - `quality_status = final` + - `metadata[dev.crocoder.sce.idempotency_key]` populated + - optional `metadata[dev.crocoder.sce.parent_revision]` when a parent SHA is available +- Notes write policy is fixed to `refs/notes/agent-trace` with MIME `application/vnd.agent-trace.record+json`. + +## Dual-write and fallback behavior +- Finalization attempts both targets in one pass: + - notes write via `TraceNotesWriter` + - DB persistence via `TraceRecordStore` +- Successful writes (`Written` or `AlreadyExists`) on both targets mark commit emission in `TraceEmissionLedger` and return `Persisted`. +- Any failed target (`PersistenceWriteResult::Failed`) enqueues one retry item via `TraceRetryQueue` with explicit failed target list and returns `QueuedFallback`. +- Retry queue entries carry the full trace record, MIME type, notes ref, and failed target list to support replay-safe recovery. + +## Verification evidence +- `cargo test --manifest-path cli/Cargo.toml post_commit_finalization` +- `cargo build --manifest-path cli/Cargo.toml` From 267d766a91f8477a49caf40116c4f5c231dae63c Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 13:52:59 +0100 Subject: [PATCH 08/39] cli: Implement hook rollout doctor readiness checks Add an implemented `sce doctor` command that resolves effective hook-path source and validates required hooks (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions before local attribution rollout. 
--- cli/README.md | 16 +- cli/src/app.rs | 9 + cli/src/command_surface.rs | 7 +- cli/src/services/doctor.rs | 340 ++++++++++++++++++ cli/src/services/mod.rs | 1 + context/architecture.md | 5 +- context/cli/placeholder-foundation.md | 12 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 5 +- context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 6 +- context/sce/agent-trace-hook-doctor.md | 48 +++ 13 files changed, 439 insertions(+), 13 deletions(-) create mode 100644 cli/src/services/doctor.rs create mode 100644 context/sce/agent-trace-hook-doctor.md diff --git a/cli/README.md b/cli/README.md index 4b226c89..4efe451a 100644 --- a/cli/README.md +++ b/cli/README.md @@ -1,17 +1,18 @@ -# sce CLI (placeholder foundation) +# sce CLI (foundation) This crate provides the early command-surface scaffold for the Shared Context Engineering CLI (`sce`). Current scope is intentionally narrow: deterministic command dispatch, an -implemented repository `setup` flow, and explicit placeholders for commands -that are still deferred. +implemented repository `setup` flow, implemented local rollout health checks +via `doctor`, and explicit placeholders for commands that are still deferred. ## Quick start ```bash cargo run --manifest-path cli/Cargo.toml -- --help cargo run --manifest-path cli/Cargo.toml -- setup +cargo run --manifest-path cli/Cargo.toml -- doctor cargo run --manifest-path cli/Cargo.toml -- mcp cargo run --manifest-path cli/Cargo.toml -- hooks cargo run --manifest-path cli/Cargo.toml -- sync @@ -56,6 +57,12 @@ Crates.io is prepared but intentionally disabled in this phase. 
`config/.claude/**` - installation writes to repository-root `.opencode/` and/or `.claude/` using backup-and-replace safety with rollback on swap failures +- `doctor` is implemented and validates hook rollout readiness: + - detects effective hooks directory for default, per-repo `core.hooksPath`, + and global `core.hooksPath` installs + - validates required hooks (`pre-commit`, `commit-msg`, `post-commit`) for + presence and executable permissions + - reports actionable diagnostics for missing or misconfigured hooks - `mcp` is a placeholder for future file-cache tooling contracts (`cache-put`/`cache-get`). - `hooks` is a placeholder for future git hook event and generated-region @@ -66,7 +73,7 @@ Crates.io is prepared but intentionally disabled in this phase. ## Safety and limitations - `mcp`, `hooks`, and `sync` remain placeholders and do not perform MCP - transport, hook installation, or cloud sync. + transport or cloud sync. - `sync` only validates local adapter wiring and does not require remote auth. - This crate is scaffolding for incremental delivery and should not be treated as production-ready workflow automation. @@ -74,6 +81,7 @@ Crates.io is prepared but intentionally disabled in this phase. 
## Near-term roadmap mapping - Repository setup automation seam: `cli/src/services/setup.rs` +- Hook install health validation seam: `cli/src/services/doctor.rs` - MCP file-cache seam: `cli/src/services/mcp.rs` - Hook event and generated-region seam: `cli/src/services/hooks.rs` - Cloud sync seam + local Turso gate: `cli/src/services/sync.rs` diff --git a/cli/src/app.rs b/cli/src/app.rs index bc57048b..5e06c1ad 100644 --- a/cli/src/app.rs +++ b/cli/src/app.rs @@ -9,6 +9,7 @@ enum Command { Help, Setup(services::setup::SetupMode), SetupHelp, + Doctor, Mcp, Hooks, Sync, @@ -91,6 +92,7 @@ fn parse_subcommand(value: String, tail_args: Vec) -> Result { match value.as_str() { "help" => Ok(Command::Help), "setup" => parse_setup_subcommand(tail_args), + "doctor" => parse_non_setup_subcommand(Command::Doctor, tail_args), "mcp" => parse_non_setup_subcommand(Command::Mcp, tail_args), "hooks" => parse_non_setup_subcommand(Command::Hooks, tail_args), "sync" => parse_non_setup_subcommand(Command::Sync, tail_args), @@ -156,6 +158,7 @@ fn dispatch(command: Command) -> Result<()> { } } Command::SetupHelp => println!("{}", services::setup::setup_usage_text()), + Command::Doctor => println!("{}", services::doctor::run_doctor()?), Command::Mcp => println!("{}", services::mcp::run_placeholder_mcp()?), Command::Hooks => println!("{}", services::hooks::run_placeholder_hooks()?), Command::Sync => println!("{}", services::sync::run_placeholder_sync()?), @@ -184,6 +187,12 @@ mod tests { assert_eq!(code, ExitCode::SUCCESS); } + #[test] + fn doctor_command_exits_success() { + let code = run(vec!["sce".to_string(), "doctor".to_string()]); + assert_eq!(code, ExitCode::SUCCESS); + } + #[test] fn setup_help_exits_success() { let code = run(vec![ diff --git a/cli/src/command_surface.rs b/cli/src/command_surface.rs index 251aeee3..4b5d27fa 100644 --- a/cli/src/command_surface.rs +++ b/cli/src/command_surface.rs @@ -24,6 +24,11 @@ pub const COMMANDS: &[CommandContract] = &[ status: 
ImplementationStatus::Implemented, purpose: "Prepare local repository/workspace prerequisites", }, + CommandContract { + name: services::doctor::NAME, + status: ImplementationStatus::Implemented, + purpose: "Validate local git-hook installation readiness", + }, CommandContract { name: services::mcp::NAME, status: ImplementationStatus::Placeholder, @@ -65,7 +70,7 @@ Usage:\n sce [command]\n\n\ Setup usage:\n sce setup [--opencode|--claude|--both]\n\n\ Commands:\n{}\n\n\ Setup defaults to interactive target selection when no setup target flag is passed.\n\ -`setup` is implemented; `mcp`, `hooks`, and `sync` remain placeholder-oriented.\n", +`setup` and `doctor` are implemented; `mcp`, `hooks`, and `sync` remain placeholder-oriented.\n", command_rows ) } diff --git a/cli/src/services/doctor.rs b/cli/src/services/doctor.rs new file mode 100644 index 00000000..71d9e2c2 --- /dev/null +++ b/cli/src/services/doctor.rs @@ -0,0 +1,340 @@ +use std::fs; +use std::path::{Path, PathBuf}; +use std::process::Command; + +use anyhow::Result; + +pub const NAME: &str = "doctor"; + +const REQUIRED_HOOKS: [&str; 3] = ["pre-commit", "commit-msg", "post-commit"]; + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +enum Readiness { + Ready, + NotReady, +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +enum HookPathSource { + Default, + LocalConfig, + GlobalConfig, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +struct HookFileHealth { + name: &'static str, + path: PathBuf, + exists: bool, + executable: bool, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +struct HookDoctorReport { + readiness: Readiness, + repository_root: Option<PathBuf>, + hook_path_source: HookPathSource, + hooks_directory: Option<PathBuf>, + hooks: Vec<HookFileHealth>, + diagnostics: Vec<String>, +} + +pub fn run_doctor() -> Result<String> { + let report = build_report(); + Ok(format_report(&report)) +} + +fn build_report() -> HookDoctorReport { + let repository_root = run_git_command(&["rev-parse", "--show-toplevel"]).map(PathBuf::from); + let hooks_directory =
run_git_command(&["rev-parse", "--git-path", "hooks"]).map(PathBuf::from); + + let local_hooks_path = run_git_command(&["config", "--local", "--get", "core.hooksPath"]); + let global_hooks_path = run_git_command(&["config", "--global", "--get", "core.hooksPath"]); + + let hook_path_source = if local_hooks_path.is_some() { + HookPathSource::LocalConfig + } else if global_hooks_path.is_some() { + HookPathSource::GlobalConfig + } else { + HookPathSource::Default + }; + + let mut diagnostics = Vec::new(); + let hooks = match hooks_directory.as_deref() { + Some(directory) => collect_hook_health(directory, &mut diagnostics), + None => { + diagnostics.push( + "Unable to resolve git hooks directory. Run this command inside a git repository." + .to_string(), + ); + Vec::new() + } + }; + + let readiness = if diagnostics.is_empty() { + Readiness::Ready + } else { + Readiness::NotReady + }; + + HookDoctorReport { + readiness, + repository_root, + hook_path_source, + hooks_directory, + hooks, + diagnostics, + } +} + +fn collect_hook_health(directory: &Path, diagnostics: &mut Vec<String>) -> Vec<HookFileHealth> { + if !directory.exists() { + diagnostics.push(format!( + "Hooks directory '{}' does not exist.", + directory.display() + )); + } + + REQUIRED_HOOKS + .iter() + .map(|hook_name| { + let hook_path = directory.join(hook_name); + let metadata = fs::metadata(&hook_path).ok(); + let exists = metadata.is_some(); + let executable = metadata + .as_ref() + .is_some_and(|entry| entry.is_file() && is_executable(entry)); + + if !exists { + diagnostics.push(format!( + "Missing required hook '{}' at '{}'.", + hook_name, + hook_path.display() + )); + } else if !executable { + diagnostics.push(format!( + "Hook '{}' exists but is not executable.
Run 'chmod +x {}' to fix it.", + hook_name, + hook_path.display() + )); + } + + HookFileHealth { + name: hook_name, + path: hook_path, + exists, + executable, + } + }) + .collect() +} + +#[cfg(unix)] +fn is_executable(metadata: &fs::Metadata) -> bool { + use std::os::unix::fs::PermissionsExt; + + metadata.permissions().mode() & 0o111 != 0 +} + +#[cfg(not(unix))] +fn is_executable(metadata: &fs::Metadata) -> bool { + metadata.is_file() +} + +fn run_git_command(args: &[&str]) -> Option<String> { + let output = Command::new("git").args(args).output().ok()?; + if !output.status.success() { + return None; + } + + let stdout = String::from_utf8(output.stdout).ok()?; + let trimmed = stdout.trim(); + if trimmed.is_empty() { + None + } else { + Some(trimmed.to_string()) + } +} + +fn format_report(report: &HookDoctorReport) -> String { + let mut lines = Vec::new(); + lines.push(format!( + "SCE doctor: {}", + match report.readiness { + Readiness::Ready => "ready", + Readiness::NotReady => "not ready", + } + )); + + lines.push(format!( + "Hooks path source: {}", + match report.hook_path_source { + HookPathSource::Default => "default (.git/hooks)", + HookPathSource::LocalConfig => "per-repo core.hooksPath", + HookPathSource::GlobalConfig => "global core.hooksPath", + } + )); + + lines.push(format!( + "Repository root: {}", + report + .repository_root + .as_ref() + .map(|path| path.display().to_string()) + .unwrap_or_else(|| "(not detected)".to_string()) + )); + + lines.push(format!( + "Effective hooks directory: {}", + report + .hooks_directory + .as_ref() + .map(|path| path.display().to_string()) + .unwrap_or_else(|| "(not detected)".to_string()) + )); + + lines.push("Required hooks:".to_string()); + for hook in &report.hooks { + let state = if hook.exists && hook.executable { + "ok" + } else if !hook.exists { + "missing" + } else { + "misconfigured" + }; + lines.push(format!( + "- {}: {} ({})", + hook.name, + state, + hook.path.display() + )); + } + + if report.diagnostics.is_empty()
{ + lines.push("Diagnostics: none".to_string()); + } else { + lines.push("Diagnostics:".to_string()); + for diagnostic in &report.diagnostics { + lines.push(format!("- {diagnostic}")); + } + } + + lines.join("\n") +} + +#[cfg(all(test, unix))] +mod tests { + use std::fs; + use std::os::unix::fs::PermissionsExt; + + use anyhow::Result; + + use crate::test_support::TestTempDir; + + use super::{collect_hook_health, format_report, HookDoctorReport, HookPathSource, Readiness}; + + #[test] + fn doctor_output_reports_healthy_state_when_all_required_hooks_exist() -> Result<()> { + let temp_dir = TestTempDir::new("doctor-healthy")?; + let hooks_dir = temp_dir.path().join("hooks"); + fs::create_dir_all(&hooks_dir)?; + + for hook in ["pre-commit", "commit-msg", "post-commit"] { + let hook_path = hooks_dir.join(hook); + fs::write(&hook_path, "#!/bin/sh\n")?; + fs::set_permissions(&hook_path, fs::Permissions::from_mode(0o755))?; + } + + let mut diagnostics = Vec::new(); + let hooks = collect_hook_health(&hooks_dir, &mut diagnostics); + let report = HookDoctorReport { + readiness: if diagnostics.is_empty() { + Readiness::Ready + } else { + Readiness::NotReady + }, + repository_root: Some(temp_dir.path().to_path_buf()), + hook_path_source: HookPathSource::LocalConfig, + hooks_directory: Some(hooks_dir), + hooks, + diagnostics, + }; + + let output = format_report(&report); + assert!(output.contains("SCE doctor: ready")); + assert!(output.contains("pre-commit: ok")); + assert!(output.contains("commit-msg: ok")); + assert!(output.contains("post-commit: ok")); + assert!(output.contains("Diagnostics: none")); + Ok(()) + } + + #[test] + fn doctor_output_reports_missing_hook_state() -> Result<()> { + let temp_dir = TestTempDir::new("doctor-missing")?; + let hooks_dir = temp_dir.path().join("hooks"); + fs::create_dir_all(&hooks_dir)?; + + for hook in ["pre-commit", "post-commit"] { + let hook_path = hooks_dir.join(hook); + fs::write(&hook_path, "#!/bin/sh\n")?; + fs::set_permissions(&hook_path,
fs::Permissions::from_mode(0o755))?; + } + + let mut diagnostics = Vec::new(); + let hooks = collect_hook_health(&hooks_dir, &mut diagnostics); + let report = HookDoctorReport { + readiness: if diagnostics.is_empty() { + Readiness::Ready + } else { + Readiness::NotReady + }, + repository_root: Some(temp_dir.path().to_path_buf()), + hook_path_source: HookPathSource::GlobalConfig, + hooks_directory: Some(hooks_dir), + hooks, + diagnostics, + }; + + let output = format_report(&report); + assert!(output.contains("SCE doctor: not ready")); + assert!(output.contains("commit-msg: missing")); + assert!(output.contains("Missing required hook 'commit-msg'")); + Ok(()) + } + + #[test] + fn doctor_output_reports_misconfigured_hook_permissions() -> Result<()> { + let temp_dir = TestTempDir::new("doctor-misconfigured")?; + let hooks_dir = temp_dir.path().join("hooks"); + fs::create_dir_all(&hooks_dir)?; + + for hook in ["pre-commit", "commit-msg", "post-commit"] { + let hook_path = hooks_dir.join(hook); + fs::write(&hook_path, "#!/bin/sh\n")?; + let mode = if hook == "post-commit" { 0o644 } else { 0o755 }; + fs::set_permissions(&hook_path, fs::Permissions::from_mode(mode))?; + } + + let mut diagnostics = Vec::new(); + let hooks = collect_hook_health(&hooks_dir, &mut diagnostics); + let report = HookDoctorReport { + readiness: if diagnostics.is_empty() { + Readiness::Ready + } else { + Readiness::NotReady + }, + repository_root: Some(temp_dir.path().to_path_buf()), + hook_path_source: HookPathSource::Default, + hooks_directory: Some(hooks_dir), + hooks, + diagnostics, + }; + + let output = format_report(&report); + assert!(output.contains("SCE doctor: not ready")); + assert!(output.contains("post-commit: misconfigured")); + assert!(output.contains("Hook 'post-commit' exists but is not executable")); + Ok(()) + } +} diff --git a/cli/src/services/mod.rs b/cli/src/services/mod.rs index 25de34c5..21778d35 100644 --- a/cli/src/services/mod.rs +++ b/cli/src/services/mod.rs @@ -1,4 +1,5 
@@ pub mod agent_trace; +pub mod doctor; pub mod hooks; pub mod local_db; pub mod mcp; diff --git a/context/architecture.md b/context/architecture.md index 368985e4..e4e2751e 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -72,15 +72,16 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/main.rs` is the executable entrypoint (`sce`) and delegates to `app::run`. - `cli/src/app.rs` provides a `lexopt`-based argument parser and dispatch loop with deterministic help, setup installation execution, and consistent `anyhow`-driven error exits. -- `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. +- `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. - `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization and async execute/query smoke checks for in-memory and file-backed targets. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. - `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. 
+- `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. - `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, and a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture. - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. -- `cli/src/services/` contains module boundaries for setup, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. 
+- `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. - `cli/flake.nix` applies `rust-overlay` (`oxalica/rust-overlay`) to nixpkgs, selects `rust-bin.stable.latest.default` with `rustfmt`, and routes CLI check/build derivations through `makeRustPlatform` so toolchain selection is explicit and deterministic. - `cli/flake.nix` exposes release install/run surfaces as `packages.sce` (`packages.default = packages.sce`) and `apps.sce` targeting `${packages.sce}/bin/sce`, enabling packaged CLI build/run via `nix build ./cli#default` and `nix run ./cli#sce -- ...`. diff --git a/context/cli/placeholder-foundation.md b/context/cli/placeholder-foundation.md index d78d764f..08c11414 100644 --- a/context/cli/placeholder-foundation.md +++ b/context/cli/placeholder-foundation.md @@ -11,12 +11,12 @@ The repository now includes a Rust CLI crate at `cli/` for SCE automation work. - Command contract catalog: `cli/src/command_surface.rs` - Dependency contract snapshot: `cli/src/dependency_contract.rs` - Local Turso adapter: `cli/src/services/local_db.rs` -- Service domains: `cli/src/services/{agent_trace,setup,mcp,hooks,sync}.rs` +- Service domains: `cli/src/services/{agent_trace,setup,doctor,mcp,hooks,sync}.rs` - Shared test temp-path helper: `cli/src/test_support.rs` (`TestTempDir`, test-only module) ## Onboarding documentation -- `cli/README.md` includes quick-start commands for `help`, `setup`, `mcp`, `hooks`, and `sync`. +- `cli/README.md` includes quick-start commands for `help`, `setup`, `doctor`, `mcp`, `hooks`, and `sync`. - The README explicitly distinguishes implemented behavior from placeholders and maps future work to module contracts. 
- Verification guidance in the README uses crate-local `cargo check`, `cargo test`, and `cargo build` commands, plus release/install commands for current installability (`cargo build --manifest-path cli/Cargo.toml --release`, `cargo install --path cli --locked`). @@ -42,6 +42,7 @@ The repository now includes a Rust CLI crate at `cli/` for SCE automation work. - `help`: implemented - `setup`: implemented +- `doctor`: implemented - `mcp`: placeholder - `hooks`: placeholder - `sync`: placeholder @@ -52,6 +53,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro `setup` now also exposes compile-time embedded config assets for OpenCode/Claude targets, sourced from `config/.opencode/**` and `config/.claude/**` via `cli/build.rs` with normalized forward-slash relative paths and target-scoped iteration APIs. `setup` additionally includes a repository-root install engine (`install_embedded_setup_assets`) that stages embedded files and applies backup-and-replace safety for `.opencode/`/`.claude/` with rollback restoration if staged swap fails. `setup` now executes end-to-end and prints deterministic completion details including selected target(s), per-target install count, and backup actions. +`doctor` now executes end-to-end and reports hook rollout readiness by validating effective hook-path source plus required hook presence/executable permissions. `sync` includes a local Turso smoke gate backed by a lazily initialized shared tokio current-thread runtime and a placeholder cloud-sync gateway plan. ## Command loop and error model @@ -63,18 +65,20 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro - Interactive `sce setup` prompt cancellation/interrupt exits cleanly with: `Setup cancelled. No files were changed.` - Command handlers return deterministic status messaging: - `setup`: `Setup completed successfully.` plus selected targets, per-target install destinations/counts, and backup status lines. 
+ - `doctor`: `SCE doctor: ready|not ready` plus hook-path source, required hook checks, and actionable diagnostics. - `TODO: 'mcp' is planned and not implemented yet. MCP file-cache surface defines 2 placeholder tool contract(s) with max 1024 entries.` - - `TODO: 'hooks' is planned and not implemented yet. Hook event model reserves 2 git hook(s) with generated-region tracking placeholders.` + - `TODO: 'hooks' is planned and not implemented yet. Hook event model reserves 3 git hook(s) with generated-region tracking placeholders, staged-only pre-commit checkpoint preview over 1 file(s), and commit-msg canonical trailer preview applied=true.` - `TODO: 'sync' cloud workflows are planned and not implemented yet. Local Turso smoke check succeeded (1) row inserted; cloud sync placeholder enumerates 3 phase(s) and plan holds 3 checkpoint(s).` ## Service contracts - `cli/src/services/setup.rs` defines setup parsing/selection contracts plus runtime install orchestration (`run_setup_for_mode`) over the embedded asset install engine. +- `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) with path-source detection (default/local/global) and required-hook presence/executable checks. - `cli/src/services/agent_trace.rs` defines the task-scoped schema adapter contract (`adapt_trace_payload`) from internal attribution input structs to Agent Trace-shaped record structs, including fixed git `vcs` mapping, contributor type mapping, and reserved `dev.crocoder.sce.*` metadata placement. - `cli/src/services/mcp.rs` defines `McpService`, a `McpCapabilitySnapshot` model (primary + supported transports), and `CachePolicy` defaults for future file-cache workflows (`cache-put`/`cache-get`) with `runnable: false` placeholders. 
- `cli/src/services/hooks.rs` defines `HookService` plus hook-event/generated-region event placeholders (`HookEventModel`, `HookEvent`, `GeneratedRegionEvent`) and keeps placeholder recording path compile-safe by consuming hook/lifecycle variants without enabling production hook actions. - `cli/src/services/sync.rs` defines cloud-sync abstraction points (`CloudSyncGateway`, `CloudSyncRequest`, `CloudSyncPlan`) layered after the local Turso smoke gate. -- `cli/src/app.rs` dispatches `setup`, `mcp`, and `hooks` through service-level modules so runtime messages are sourced from domain modules instead of inline strings. +- `cli/src/app.rs` dispatches `setup`, `doctor`, `mcp`, and `hooks` through service-level modules so runtime messages are sourced from domain modules instead of inline strings. ## Local Turso adapter behavior diff --git a/context/context-map.md b/context/context-map.md index 060891d0..728fa523 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -22,6 +22,7 @@ Feature/domain context: - `context/sce/agent-trace-pre-commit-staged-checkpoint.md` (T04 pre-commit staged-only finalization contract with no-op guards and index/tree anchor capture) - `context/sce/agent-trace-commit-msg-coauthor-policy.md` (T05 commit-msg canonical co-author trailer policy with env-gated injection and idempotent dedupe) - `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) +- `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index e71a50d9..9af6aa10 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -60,3 +60,4 @@ - `agent trace commit-msg co-author policy`: T05 contract in 
`cli/src/services/hooks.rs` (`apply_commit_msg_coauthor_policy`) that applies exactly one canonical trailer (`Co-authored-by: SCE `) only when SCE is enabled, co-author policy is enabled, and staged SCE attribution exists; duplicate canonical trailers are deduped idempotently. - `agent trace post-commit dual-write finalization`: T06 contract in `cli/src/services/hooks.rs` (`finalize_post_commit_trace`) that emits one canonical Agent Trace record per commit behind runtime guards, writes to both notes (`refs/notes/agent-trace`) and DB persistence targets, and enqueues retry fallback entries when either persistence target fails. - `agent trace post-commit idempotency ledger`: T06 seam (`TraceEmissionLedger`) in `cli/src/services/hooks.rs` used to prevent duplicate emission for the same commit SHA and to mark successful dual-write completion. +- `sce doctor` hook rollout validation: Implemented CLI command in `cli/src/services/doctor.rs` (`run_doctor`) that reports readiness for local Agent Trace rollout by resolving hook-path source (default `.git/hooks`, per-repo `core.hooksPath`, or global `core.hooksPath`) and validating required hook presence plus executable permissions. diff --git a/context/overview.md b/context/overview.md index fd075734..71e870d4 100644 --- a/context/overview.md +++ b/context/overview.md @@ -6,10 +6,11 @@ It also includes an early Rust CLI foundation at `cli/` for Shared Context Engin The crate ships onboarding and usage documentation at `cli/README.md` that reflects current implemented vs placeholder behavior. The CLI crate currently enforces a minimal dependency contract: `anyhow`, `inquire`, `lexopt`, `tokio`, and `turso`. -Its command loop is implemented with `lexopt` argument parsing and `anyhow` error handling, with real setup orchestration plus placeholder dispatch for non-setup commands through explicit service contracts. 
+Its command loop is implemented with `lexopt` argument parsing and `anyhow` error handling, with real setup orchestration, implemented `doctor` rollout validation, and placeholder dispatch for deferred commands through explicit service contracts. The `setup` command includes an `inquire`-backed target-selection flow: default interactive selection for OpenCode/Claude/both, explicit non-interactive target flags (`--opencode`, `--claude`, `--both`), deterministic mutually-exclusive validation, and non-destructive cancellation exits. The CLI now compiles an embedded setup asset manifest from `config/.opencode/**` and `config/.claude/**` via `cli/build.rs`; `cli/src/services/setup.rs` exposes deterministic normalized relative paths plus file bytes and target-scoped iteration without runtime reads from `config/`. The setup service also provides repository-root install orchestration: it resolves interactive or flag-based target selection, installs embedded assets, and reports deterministic completion details (selected target(s), installed file counts, and backup actions). +The `doctor` command now validates Agent Trace local rollout readiness by resolving effective git hook-path source (default, per-repo `core.hooksPath`, or global `core.hooksPath`) and checking required hook presence/executable permissions with actionable diagnostics. The `mcp` placeholder contract is now scoped to future file-cache workflows (`cache-put`/`cache-get`) and remains intentionally non-runnable. The `sync` placeholder performs a local Turso smoke check through a lazily initialized shared tokio current-thread runtime and then reports a deferred cloud-sync plan from a placeholder gateway contract. The nested CLI flake (`cli/flake.nix`) now applies a Rust overlay-backed stable toolchain (with `rustfmt`) and uses that toolchain contract for CLI check/build derivations. 
@@ -28,6 +29,7 @@ The Agent Trace service now also provides a deterministic payload-builder path ( The hooks service now includes a pre-commit staged checkpoint finalization contract (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution, captures index/tree anchors, and no-ops for disabled/unavailable/bare-repo runtime states; this behavior is documented in `context/sce/agent-trace-pre-commit-staged-checkpoint.md`. The hooks service now also exposes a `commit-msg` co-author trailer policy (`apply_commit_msg_coauthor_policy`) that conditionally injects exactly one canonical SCE trailer based on `SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, and staged-attribution presence, with idempotent deduplication behavior documented in `context/sce/agent-trace-commit-msg-coauthor-policy.md`. The hooks service now also includes a post-commit trace finalization seam (`finalize_post_commit_trace`) that builds canonical Agent Trace payloads, enforces commit-level idempotency guards, performs notes + DB dual writes, and enqueues retry fallback metadata when persistence targets fail; this behavior is documented in `context/sce/agent-trace-post-commit-dual-write.md`. +The CLI now also includes a hook rollout doctor contract documented in `context/sce/agent-trace-hook-doctor.md`. ## Repository model @@ -84,3 +86,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-pre-commit-staged-checkpoint.md` for the implemented T04 pre-commit staged-only finalization contract and runtime no-op guards. - Use `context/sce/agent-trace-commit-msg-coauthor-policy.md` for the implemented T05 commit-msg canonical co-author trailer policy and idempotent dedupe behavior. - Use `context/sce/agent-trace-post-commit-dual-write.md` for the implemented T06 post-commit trace finalization and dual-write + queue-fallback behavior. 
+- Use `context/sce/agent-trace-hook-doctor.md` for the implemented T07 hook install and health validation behavior (`sce doctor`) across default/per-repo/global hook-path installs. diff --git a/context/patterns.md b/context/patterns.md index 14d8e98c..ba743a23 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -82,6 +82,7 @@ - Keep dependency additions explicit and minimal in `cli/Cargo.toml`, and anchor dependency intent in lightweight compile-time code references (`cli/src/dependency_contract.rs`). - Route local Turso access through a dedicated adapter module (`cli/src/services/local_db.rs`) so command handlers do not expose low-level `turso` API details. - For placeholder commands that need real infrastructure checks, use a lazily initialized shared tokio current-thread runtime wrapper in the service layer (`cli/src/services/sync.rs`) and keep user-facing output explicit about remaining placeholder scope. +- For rollout health commands, prefer deterministic local diagnostics over implicit pass/fail behavior: report hook-path source, effective directories, required-hook checks, and actionable remediation text (`cli/src/services/doctor.rs`). - For future CLI domains, define trait-first service contracts with request/plan models in `cli/src/services/*` and keep placeholder implementations explicitly non-runnable until production behavior is approved. - Model deferred integration boundaries with concrete event/capability data structures (for example MCP file-cache snapshots/policies, git-hook/generated-region events, cloud-sync checkpoints) so later tasks can implement behavior without reshaping public seams. - For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. 
diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 757a1cf0..8b52af4d 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -111,7 +111,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml post_commit_finalization` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T07: Add hook install and health validation (`sce doctor`) for local rollout (status:todo) +- [x] T07: Add hook install and health validation (`sce doctor`) for local rollout (status:done) - Task ID: T07 - Goal: Provide deterministic setup validation for per-repo and global hook-path installs. - Boundaries (in/out of scope): @@ -121,6 +121,10 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Operators can verify hook readiness before enabling attribution enforcement. - Verification notes (commands or checks): - Doctor output tests for healthy, missing, and misconfigured hook states. + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo test --manifest-path cli/Cargo.toml doctor_output_reports` + - `cargo test --manifest-path cli/Cargo.toml doctor_command_exits_success` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T08: Implement `post-rewrite` local remap ingestion pipeline (status:todo) - Task ID: T08 diff --git a/context/sce/agent-trace-hook-doctor.md b/context/sce/agent-trace-hook-doctor.md new file mode 100644 index 00000000..a21e3630 --- /dev/null +++ b/context/sce/agent-trace-hook-doctor.md @@ -0,0 +1,48 @@ +# Agent Trace hook rollout doctor + +## Scope + +Task `agent-trace-attribution-no-git-wrapper` `T07` adds local rollout validation through `sce doctor` so operators can verify hook readiness before enabling attribution enforcement. 
+
+## Command contract
+
+- Entrypoint: `sce doctor`
+- Service implementation: `cli/src/services/doctor.rs`
+- Command dispatch: `cli/src/app.rs` (`Command::Doctor`)
+- Command surface status: implemented in `cli/src/command_surface.rs`
+
+`sce doctor` always returns a deterministic text report with:
+
+- readiness verdict (`ready` or `not ready`)
+- hook-path source (`default (.git/hooks)`, per-repo `core.hooksPath`, or global `core.hooksPath`)
+- detected repository root and effective hooks directory
+- required hook checks for `pre-commit`, `commit-msg`, `post-commit`
+- actionable diagnostics for missing or misconfigured hooks
+
+## Health validation rules
+
+`sce doctor` resolves git state using CLI git commands:
+
+- `git rev-parse --show-toplevel`
+- `git rev-parse --git-path hooks`
+- `git config --local --get core.hooksPath`
+- `git config --global --get core.hooksPath`
+
+Readiness is `not ready` when any required check fails:
+
+- hooks directory cannot be resolved
+- hooks directory is missing
+- any required hook file is missing
+- any required hook exists but is not executable
+
+If no diagnostics are present, readiness is `ready`.
+
+## Verification coverage
+
+`cli/src/services/doctor.rs` includes explicit doctor output tests for:
+
+- healthy state (all required hooks present and executable)
+- missing state (required hook absent)
+- misconfigured state (required hook present but non-executable)
+
+`cli/src/app.rs` includes command-level routing/exit success coverage for `sce doctor`.
From 07fe4fac7954a020e50c06ce01f1d75291454461 Mon Sep 17 00:00:00 2001
From: David Abram
Date: Wed, 4 Mar 2026 14:03:55 +0100
Subject: [PATCH 09/39] hooks: Implement post-rewrite remap ingestion pipeline

Parses post-rewrite old/new SHA pairs with strict validation, normalizes rewrite methods, and dispatches replay-safe remap requests using deterministic per-pair idempotency keys.
---
 cli/src/services/hooks.rs                     | 293 +++++++++++++++++-
 context/architecture.md                       |   2 +-
 context/context-map.md                        |   1 +
 context/glossary.md                           |   1 +
 context/overview.md                           |   2 +
 context/patterns.md                           |   1 +
 .../agent-trace-attribution-no-git-wrapper.md |   5 +-
 ...race-post-rewrite-local-remap-ingestion.md |  63 ++++
 8 files changed, 363 insertions(+), 5 deletions(-)
 create mode 100644 context/sce/agent-trace-post-rewrite-local-remap-ingestion.md

diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs
index b79a31a3..864f3c65 100644
--- a/cli/src/services/hooks.rs
+++ b/cli/src/services/hooks.rs
@@ -187,6 +187,172 @@ pub enum PostCommitFinalization {
     QueuedFallback(PostCommitQueuedFallback),
 }
 
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct PostRewriteRuntimeState {
+    pub sce_disabled: bool,
+    pub cli_available: bool,
+    pub is_bare_repo: bool,
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub enum PostRewriteNoOpReason {
+    Disabled,
+    CliUnavailable,
+    BareRepository,
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub enum RewriteMethod {
+    Amend,
+    Rebase,
+    Other(String),
+}
+
+impl RewriteMethod {
+    fn canonical_label(&self) -> &str {
+        match self {
+            RewriteMethod::Amend => "amend",
+            RewriteMethod::Rebase => "rebase",
+            RewriteMethod::Other(method) => method.as_str(),
+        }
+    }
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct RewritePair {
+    pub old_sha: String,
+    pub new_sha: String,
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct RewriteRemapRequest {
+    pub rewrite_method: RewriteMethod,
+    pub old_sha: String,
+    pub new_sha: String,
+    pub idempotency_key: String,
+}
+
+pub trait RewriteRemapIngestion {
+    fn ingest(&mut self, request: RewriteRemapRequest) -> Result<bool>;
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub struct PostRewriteIngested {
+    pub rewrite_method: RewriteMethod,
+    pub total_pairs: usize,
+    pub ingested_pairs: usize,
+    pub skipped_pairs: usize,
+}
+
+#[derive(Clone, Debug, Eq, PartialEq)]
+pub enum PostRewriteFinalization {
+    NoOp(PostRewriteNoOpReason),
+    Ingested(PostRewriteIngested),
+}
+
+pub fn finalize_post_rewrite_remap(
+    runtime: &PostRewriteRuntimeState,
+    rewrite_method: &str,
+    pairs_file_contents: &str,
+    remap_ingestion: &mut impl RewriteRemapIngestion,
+) -> Result<PostRewriteFinalization> {
+    if runtime.sce_disabled {
+        return Ok(PostRewriteFinalization::NoOp(
+            PostRewriteNoOpReason::Disabled,
+        ));
+    }
+
+    if !runtime.cli_available {
+        return Ok(PostRewriteFinalization::NoOp(
+            PostRewriteNoOpReason::CliUnavailable,
+        ));
+    }
+
+    if runtime.is_bare_repo {
+        return Ok(PostRewriteFinalization::NoOp(
+            PostRewriteNoOpReason::BareRepository,
+        ));
+    }
+
+    let method = normalize_rewrite_method(rewrite_method);
+    let pairs = parse_post_rewrite_pairs(pairs_file_contents)?;
+
+    let mut ingested_pairs = 0_usize;
+    for pair in &pairs {
+        let idempotency_key = format!(
+            "post-rewrite:{}:{}:{}",
+            method.canonical_label(),
+            pair.old_sha,
+            pair.new_sha
+        );
+        let accepted = remap_ingestion.ingest(RewriteRemapRequest {
+            rewrite_method: method.clone(),
+            old_sha: pair.old_sha.clone(),
+            new_sha: pair.new_sha.clone(),
+            idempotency_key,
+        })?;
+        if accepted {
+            ingested_pairs += 1;
+        }
+    }
+
+    let total_pairs = pairs.len();
+    Ok(PostRewriteFinalization::Ingested(PostRewriteIngested {
+        rewrite_method: method,
+        total_pairs,
+        ingested_pairs,
+        skipped_pairs: total_pairs.saturating_sub(ingested_pairs),
+    }))
+}
+
+fn parse_post_rewrite_pairs(contents: &str) -> Result<Vec<RewritePair>> {
+    let mut pairs = Vec::new();
+
+    for (line_index, line) in contents.lines().enumerate() {
+        let trimmed = line.trim();
+        if trimmed.is_empty() {
+            continue;
+        }
+
+        let mut fields = trimmed.split_whitespace();
+        let Some(old_sha) = fields.next() else {
+            continue;
+        };
+        let Some(new_sha) = fields.next() else {
+            anyhow::bail!(
+                "Invalid post-rewrite pair format on line {}: expected '<old_sha> <new_sha>'",
+                line_index + 1
+            );
+        };
+
+        if fields.next().is_some() {
+            anyhow::bail!(
+                "Invalid post-rewrite pair format on line {}: expected exactly two fields",
+                line_index + 1
+            );
+        }
+
+        if old_sha == new_sha {
+            continue;
+        }
+
+        pairs.push(RewritePair {
+            old_sha: old_sha.to_string(),
+            new_sha: new_sha.to_string(),
+        });
+    }
+
+    Ok(pairs)
+}
+
+fn normalize_rewrite_method(method: &str) -> RewriteMethod {
+    match method.trim().to_ascii_lowercase().as_str() {
+        "amend" => RewriteMethod::Amend,
+        "rebase" => RewriteMethod::Rebase,
+        other => RewriteMethod::Other(other.to_string()),
+    }
+}
+
 pub fn finalize_post_commit_trace(
     runtime: &PostCommitRuntimeState,
     input: PostCommitInput,
@@ -500,15 +666,17 @@ mod tests {
     };
 
     use super::{
-        apply_commit_msg_coauthor_policy, finalize_post_commit_trace,
+        apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap,
         finalize_pre_commit_checkpoint, run_placeholder_hooks, CommitMsgRuntimeState,
         GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService,
         PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass,
         PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PlaceholderHookService,
         PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState,
+        PostRewriteFinalization, PostRewriteNoOpReason, PostRewriteRuntimeState,
         PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors,
-        TraceEmissionLedger, TraceNote, TraceNotesWriter, TraceRecordStore, TraceRetryQueue,
-        TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY,
+        RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, TraceEmissionLedger, TraceNote,
+        TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry,
+        CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY,
     };
 
     fn sample_pending_checkpoint() -> PendingCheckpoint {
@@ -602,6 +770,24 @@ mod tests {
         entries: Vec<TraceRetryQueueEntry>,
     }
 
+    #[derive(Default)]
+    struct FakeRewriteRemapIngestion {
+        seen_requests: Vec<RewriteRemapRequest>,
+        duplicate_keys: Vec<String>,
+        seen_keys: std::collections::BTreeSet<String>,
+    }
+
+    impl RewriteRemapIngestion for FakeRewriteRemapIngestion {
+        fn ingest(&mut self, request: RewriteRemapRequest) -> Result<bool> {
+            let accepted = self.seen_keys.insert(request.idempotency_key.clone());
+            if !accepted {
+                self.duplicate_keys.push(request.idempotency_key.clone());
+            }
+            self.seen_requests.push(request);
+            Ok(accepted)
+        }
+    }
+
     impl TraceRetryQueue for FakeRetryQueue {
         fn enqueue(&mut self, entry: TraceRetryQueueEntry) -> Result<()> {
             self.entries.push(entry);
@@ -617,6 +803,14 @@ mod tests {
         }
     }
 
+    fn sample_post_rewrite_runtime() -> PostRewriteRuntimeState {
+        PostRewriteRuntimeState {
+            sce_disabled: false,
+            cli_available: true,
+            is_bare_repo: false,
+        }
+    }
+
     fn sample_post_commit_input() -> PostCommitInput {
         PostCommitInput {
             record_id: "550e8400-e29b-41d4-a716-446655440000".to_string(),
@@ -751,6 +945,99 @@ mod tests {
         Ok(())
     }
 
+    #[test]
+    fn post_rewrite_finalization_noops_when_sce_disabled() -> Result<()> {
+        let mut runtime = sample_post_rewrite_runtime();
+        runtime.sce_disabled = true;
+        let mut ingestion = FakeRewriteRemapIngestion::default();
+
+        let outcome =
+            finalize_post_rewrite_remap(&runtime, "amend", "old1 new1\n", &mut ingestion)?;
+
+        assert_eq!(
+            outcome,
+            PostRewriteFinalization::NoOp(PostRewriteNoOpReason::Disabled)
+        );
+        assert!(ingestion.seen_requests.is_empty());
+        Ok(())
+    }
+
+    #[test]
+    fn post_rewrite_finalization_parses_amend_pairs_and_derives_idempotency() -> Result<()> {
+        let runtime = sample_post_rewrite_runtime();
+        let mut ingestion = FakeRewriteRemapIngestion::default();
+
+        let outcome = finalize_post_rewrite_remap(
+            &runtime,
+            "amend",
+            "oldsha1 newsha1\noldsha2 newsha2\n",
+            &mut ingestion,
+        )?;
+
+        assert_eq!(
+            outcome,
+            PostRewriteFinalization::Ingested(super::PostRewriteIngested {
+                rewrite_method: RewriteMethod::Amend,
+                total_pairs: 2,
+                ingested_pairs: 2,
+                skipped_pairs: 0,
+            })
+        );
+        assert_eq!(ingestion.seen_requests.len(), 2);
+        assert_eq!(
+            ingestion.seen_requests[0].idempotency_key,
+            "post-rewrite:amend:oldsha1:newsha1"
+        );
+        assert_eq!(
+            ingestion.seen_requests[1].idempotency_key,
+            "post-rewrite:amend:oldsha2:newsha2"
+        );
+        Ok(())
+    }
+
+    #[test]
+    fn post_rewrite_finalization_skips_duplicate_pairs_with_rebase_method() -> Result<()> {
+        let runtime = sample_post_rewrite_runtime();
+        let mut ingestion = FakeRewriteRemapIngestion::default();
+
+        let outcome = finalize_post_rewrite_remap(
+            &runtime,
+            "rebase",
+            "oldsha1 newsha1\noldsha1 newsha1\n",
+            &mut ingestion,
+        )?;
+
+        assert_eq!(
+            outcome,
+            PostRewriteFinalization::Ingested(super::PostRewriteIngested {
+                rewrite_method: RewriteMethod::Rebase,
+                total_pairs: 2,
+                ingested_pairs: 1,
+                skipped_pairs: 1,
+            })
+        );
+        assert_eq!(ingestion.seen_requests.len(), 2);
+        assert_eq!(ingestion.duplicate_keys.len(), 1);
+        assert_eq!(
+            ingestion.duplicate_keys[0],
+            "post-rewrite:rebase:oldsha1:newsha1"
+        );
+        Ok(())
+    }
+
+    #[test]
+    fn post_rewrite_finalization_rejects_invalid_pair_line_format() {
+        let runtime = sample_post_rewrite_runtime();
+        let mut ingestion = FakeRewriteRemapIngestion::default();
+
+        let error =
+            finalize_post_rewrite_remap(&runtime, "amend", "missing_new_sha\n", &mut ingestion)
+                .expect_err("invalid pair format should return error");
+
+        assert!(error.to_string().contains("expected '<old_sha> <new_sha>'"));
+        assert!(ingestion.seen_requests.is_empty());
+    }
+
     #[test]
     fn pre_commit_finalization_noops_when_sce_disabled() {
         let mut runtime = sample_runtime();
diff --git a/context/architecture.md b/context/architecture.md
index e4e2751e..d85f26cc 100644
--- a/context/architecture.md
+++ b/context/architecture.md
@@ -79,7 +79,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`.
- `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. -- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, and a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture. 
+- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, and a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch. - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. - `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. 
diff --git a/context/context-map.md b/context/context-map.md index 728fa523..ff36a89b 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -23,6 +23,7 @@ Feature/domain context: - `context/sce/agent-trace-commit-msg-coauthor-policy.md` (T05 commit-msg canonical co-author trailer policy with env-gated injection and idempotent dedupe) - `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) - `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) +- `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 9af6aa10..5c3fcd24 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -61,3 +61,4 @@ - `agent trace post-commit dual-write finalization`: T06 contract in `cli/src/services/hooks.rs` (`finalize_post_commit_trace`) that emits one canonical Agent Trace record per commit behind runtime guards, writes to both notes (`refs/notes/agent-trace`) and DB persistence targets, and enqueues retry fallback entries when either persistence target fails. - `agent trace post-commit idempotency ledger`: T06 seam (`TraceEmissionLedger`) in `cli/src/services/hooks.rs` used to prevent duplicate emission for the same commit SHA and to mark successful dual-write completion. 
 - `sce doctor` hook rollout validation: Implemented CLI command in `cli/src/services/doctor.rs` (`run_doctor`) that reports readiness for local Agent Trace rollout by resolving hook-path source (default `.git/hooks`, per-repo `core.hooksPath`, or global `core.hooksPath`) and validating required hook presence plus executable permissions.
+- `agent trace post-rewrite local remap ingestion`: T08 contract in `cli/src/services/hooks.rs` (`finalize_post_rewrite_remap`) that parses git `post-rewrite` old/new SHA pairs, captures normalized rewrite method (`amend`, `rebase`, or lowercase passthrough), derives deterministic `post-rewrite:<method>:<old_sha>:<new_sha>` idempotency keys, and dispatches replay-safe remap ingestion requests.
diff --git a/context/overview.md b/context/overview.md
index 71e870d4..3f1e4627 100644
--- a/context/overview.md
+++ b/context/overview.md
@@ -30,6 +30,7 @@ The hooks service now includes a pre-commit staged checkpoint finalization contr
 The hooks service now also exposes a `commit-msg` co-author trailer policy (`apply_commit_msg_coauthor_policy`) that conditionally injects exactly one canonical SCE trailer based on `SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, and staged-attribution presence, with idempotent deduplication behavior documented in `context/sce/agent-trace-commit-msg-coauthor-policy.md`.
 The hooks service now also includes a post-commit trace finalization seam (`finalize_post_commit_trace`) that builds canonical Agent Trace payloads, enforces commit-level idempotency guards, performs notes + DB dual writes, and enqueues retry fallback metadata when persistence targets fail; this behavior is documented in `context/sce/agent-trace-post-commit-dual-write.md`.
 The CLI now also includes a hook rollout doctor contract documented in `context/sce/agent-trace-hook-doctor.md`.
+The hooks service now also includes a post-rewrite local remap ingestion seam (`finalize_post_rewrite_remap`) that parses `post-rewrite` old->new SHA pairs, normalizes rewrite method capture, and derives deterministic per-pair idempotency keys before remap dispatch; this behavior is documented in `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md`. ## Repository model @@ -87,3 +88,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-commit-msg-coauthor-policy.md` for the implemented T05 commit-msg canonical co-author trailer policy and idempotent dedupe behavior. - Use `context/sce/agent-trace-post-commit-dual-write.md` for the implemented T06 post-commit trace finalization and dual-write + queue-fallback behavior. - Use `context/sce/agent-trace-hook-doctor.md` for the implemented T07 hook install and health validation behavior (`sce doctor`) across default/per-repo/global hook-path installs. +- Use `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` for the implemented T08 post-rewrite local remap ingestion pipeline (`post-rewrite` pair parsing, rewrite-method normalization, and deterministic idempotency-key derivation). diff --git a/context/patterns.md b/context/patterns.md index ba743a23..3be0aedc 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -88,6 +88,7 @@ - For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. - For commit-msg co-author policy seams, gate canonical trailer insertion on runtime controls (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`) plus staged SCE-attribution presence, and enforce idempotent dedupe so allowed cases end with exactly one `Co-authored-by: SCE ` trailer. 
- For post-commit trace finalization seams, treat commit SHA as the idempotency identity, perform notes + DB writes in the same finalize pass when available, and enqueue retry-fallback entries that explicitly capture failed persistence targets for replay-safe recovery. +- For post-rewrite remap ingestion seams, parse ` ` pairs from hook input strictly, ignore empty/no-op self-mapping rows, normalize rewrite method labels to lowercase (`amend`/`rebase` when recognized), and derive deterministic per-pair idempotency keys before dispatching remap requests. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 8b52af4d..973adc4d 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -126,7 +126,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml doctor_command_exits_success` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T08: Implement `post-rewrite` local remap ingestion pipeline (status:todo) +- [x] T08: Implement `post-rewrite` local remap ingestion pipeline (status:done) - Task ID: T08 - Goal: Ingest old->new SHA pairs from rewrite events and trigger remap pipeline. 
- Boundaries (in/out of scope): @@ -136,6 +136,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Rebase/amend rewrites trigger deterministic remap processing without duplicate artifacts. - Verification notes (commands or checks): - Local rewrite fixture tests across amend and interactive/non-interactive rebase outcomes. + - `cargo test --manifest-path cli/Cargo.toml post_rewrite_finalization` + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T09: Implement rewrite trace transformation semantics (status:todo) - Task ID: T09 diff --git a/context/sce/agent-trace-post-rewrite-local-remap-ingestion.md b/context/sce/agent-trace-post-rewrite-local-remap-ingestion.md new file mode 100644 index 00000000..1501983f --- /dev/null +++ b/context/sce/agent-trace-post-rewrite-local-remap-ingestion.md @@ -0,0 +1,63 @@ +# Agent Trace Post-Rewrite Local Remap Ingestion (T08) + +## Status +- Plan: `agent-trace-attribution-no-git-wrapper` +- Task: `T08` +- Scope: local `post-rewrite` ingestion pipeline only (no hosted webhook processing) + +## Implemented surface +- Code: `cli/src/services/hooks.rs` +- Primary entrypoint: `finalize_post_rewrite_remap` +- Hook intent: consume local git `post-rewrite` input (` ` pairs) and emit deterministic remap-ingestion requests. + +## Runtime gating + +`finalize_post_rewrite_remap` returns `NoOp` and performs no ingestion when any of these guards apply: + +- `sce_disabled = true` +- `cli_available = false` +- `is_bare_repo = true` + +## Pair parsing contract + +- Input is a newline-delimited payload where each non-empty line must contain exactly two whitespace-separated fields: ` `. +- Empty lines are ignored. +- Self-mapping lines (`old_sha == new_sha`) are ignored as no-op rewrites. +- Any non-empty malformed line fails the call with an error; no partial best-effort parsing for that invocation. 
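The pair parsing contract above can be restated as a small std-only sketch. This is an illustration under stated assumptions, not the code from `cli/src/services/hooks.rs`: the function name `parse_pairs_sketch`, the tuple return type, and the `String` error are hypothetical stand-ins chosen so the example compiles without `anyhow`.

```rust
// Illustrative sketch only: `parse_pairs_sketch` and its `String` error type are
// assumptions for this example, not names used by the hooks service.

/// Parses newline-delimited `old new` SHA pairs.
/// Blank lines and self-mapping lines are skipped; any other malformed line
/// fails the whole call, so no partial result is returned.
pub fn parse_pairs_sketch(contents: &str) -> Result<Vec<(String, String)>, String> {
    let mut pairs = Vec::new();
    for (index, line) in contents.lines().enumerate() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        match fields.as_slice() {
            // Empty or whitespace-only lines are ignored.
            [] => continue,
            // Self-mapping rows are no-op rewrites and are ignored.
            [old, new] if old == new => continue,
            // Exactly two distinct fields form a valid pair.
            [old, new] => pairs.push((old.to_string(), new.to_string())),
            // One field, or three or more, is a malformed line.
            _ => return Err(format!("malformed pair on line {}", index + 1)),
        }
    }
    Ok(pairs)
}
```

Under this sketch, an input containing `a1 b1`, a blank line, `c2 c2`, and `d3 e3` yields two pairs (`a1 -> b1` and `d3 -> e3`), while an input with a single-field or three-field line returns an error for the whole call.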
+
+## Rewrite-method normalization
+
+- Hook argument values are normalized to lowercase.
+- Recognized values map to typed methods:
+  - `amend` -> `RewriteMethod::Amend`
+  - `rebase` -> `RewriteMethod::Rebase`
+- All other values are preserved as lowercase in `RewriteMethod::Other(String)`.
+
+## Idempotency and dispatch
+
+- For each parsed pair, the ingestion request derives one deterministic key:
+  - `post-rewrite:<method>:<old_sha>:<new_sha>`
+- The method token uses normalized labels (`amend`, `rebase`, or lowercase passthrough).
+- Requests are dispatched through `RewriteRemapIngestion::ingest`.
+- The ingestion response is interpreted as:
+  - `true`: pair accepted as a new ingestion
+  - `false`: pair skipped as replay/duplicate
+- Finalization returns aggregate counters: total pairs, ingested pairs, and skipped pairs.
+
+## Current boundaries
+
+- In scope: local hook-side normalization, strict parsing, deterministic per-pair replay keys, and ingestion dispatch seam.
+- Out of scope: rewrite trace transformation semantics (`T09`), hosted intake (`T12`), and mapping engine heuristics (`T13`).
+
+## Verification evidence
+
+- `cargo test --manifest-path cli/Cargo.toml post_rewrite_finalization`
+- `cargo fmt --manifest-path cli/Cargo.toml -- --check`
+- `cargo build --manifest-path cli/Cargo.toml`
+
+## Tests added
+
+- No-op behavior when SCE is disabled.
+- Amend-pair ingestion with deterministic idempotency-key derivation.
+- Rebase duplicate replay behavior (second identical pair skipped).
+- Strict malformed-line rejection (`<old_sha> <new_sha>` required).
From 169b4c68afa12761ad48f72ff8e93efdbd6ceb10 Mon Sep 17 00:00:00 2001
From: David Abram
Date: Wed, 4 Mar 2026 14:20:49 +0100
Subject: [PATCH 10/39] hooks: Implement rewritten commit trace finalization semantics

Add `finalize_rewrite_trace` to materialize Agent Trace records for rewritten SHAs with canonical rewrite metadata and confidence-derived quality status.
Preserve post-commit persistence behavior by keeping notes/DB dual-write parity, retry fallback, and idempotent no-op handling for already finalized revisions. --- cli/src/services/hooks.rs | 375 +++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 5 +- ...gent-trace-rewrite-trace-transformation.md | 62 +++ 8 files changed, 436 insertions(+), 13 deletions(-) create mode 100644 context/sce/agent-trace-rewrite-trace-transformation.md diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 864f3c65..e2aaf104 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -1,8 +1,8 @@ use anyhow::Result; use crate::services::agent_trace::{ - build_trace_payload, AgentTraceRecord, FileAttributionInput, QualityStatus, TraceAdapterInput, - TRACE_CONTENT_TYPE, + build_trace_payload, AgentTraceRecord, FileAttributionInput, QualityStatus, RewriteInfo, + TraceAdapterInput, TRACE_CONTENT_TYPE, }; pub const NAME: &str = "hooks"; @@ -194,6 +194,50 @@ pub struct PostRewriteRuntimeState { pub is_bare_repo: bool, } +#[derive(Clone, Debug, PartialEq)] +pub struct RewriteTraceInput { + pub record_id: String, + pub timestamp_rfc3339: String, + pub rewritten_commit_sha: String, + pub rewrite_from_sha: String, + pub rewrite_method: RewriteMethod, + pub rewrite_confidence: f32, + pub idempotency_key: String, + pub files: Vec, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum RewriteTraceNoOpReason { + Disabled, + CliUnavailable, + BareRepository, + AlreadyFinalized, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RewriteTracePersisted { + pub commit_sha: String, + pub trace_id: String, + pub quality_status: QualityStatus, + pub notes: PersistenceWriteResult, + pub database: PersistenceWriteResult, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RewriteTraceQueuedFallback { + pub 
commit_sha: String, + pub trace_id: String, + pub quality_status: QualityStatus, + pub failed_targets: Vec, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum RewriteTraceFinalization { + NoOp(RewriteTraceNoOpReason), + Persisted(RewriteTracePersisted), + QueuedFallback(RewriteTraceQueuedFallback), +} + #[derive(Clone, Debug, Eq, PartialEq)] pub enum PostRewriteNoOpReason { Disabled, @@ -305,6 +349,125 @@ pub fn finalize_post_rewrite_remap( })) } +pub fn finalize_rewrite_trace( + runtime: &PostRewriteRuntimeState, + input: RewriteTraceInput, + notes_writer: &mut impl TraceNotesWriter, + record_store: &mut impl TraceRecordStore, + retry_queue: &mut impl TraceRetryQueue, + emission_ledger: &mut impl TraceEmissionLedger, +) -> Result { + if runtime.sce_disabled { + return Ok(RewriteTraceFinalization::NoOp( + RewriteTraceNoOpReason::Disabled, + )); + } + + if !runtime.cli_available { + return Ok(RewriteTraceFinalization::NoOp( + RewriteTraceNoOpReason::CliUnavailable, + )); + } + + if runtime.is_bare_repo { + return Ok(RewriteTraceFinalization::NoOp( + RewriteTraceNoOpReason::BareRepository, + )); + } + + if emission_ledger.has_emitted(&input.rewritten_commit_sha) { + return Ok(RewriteTraceFinalization::NoOp( + RewriteTraceNoOpReason::AlreadyFinalized, + )); + } + + let confidence = normalize_rewrite_confidence(input.rewrite_confidence)?; + let quality_status = quality_status_for_confidence(input.rewrite_confidence); + let record = build_trace_payload(TraceAdapterInput { + record_id: input.record_id, + timestamp_rfc3339: input.timestamp_rfc3339, + commit_sha: input.rewritten_commit_sha.clone(), + files: input.files, + quality_status, + rewrite: Some(RewriteInfo { + from_sha: input.rewrite_from_sha, + method: input.rewrite_method.canonical_label().to_string(), + confidence, + }), + idempotency_key: Some(input.idempotency_key.clone()), + }); + + let note = TraceNote { + notes_ref: crate::services::agent_trace::NOTES_REF.to_string(), + commit_sha: 
input.rewritten_commit_sha.clone(), + content_type: TRACE_CONTENT_TYPE.to_string(), + record: record.clone(), + }; + let persisted = PersistedTraceRecord { + commit_sha: input.rewritten_commit_sha.clone(), + idempotency_key: input.idempotency_key, + content_type: TRACE_CONTENT_TYPE.to_string(), + notes_ref: crate::services::agent_trace::NOTES_REF.to_string(), + record: record.clone(), + }; + + let notes_result = notes_writer.write_note(note); + let database_result = record_store.write_trace_record(persisted); + + let failed_targets = collect_failed_targets(&notes_result, &database_result); + if failed_targets.is_empty() { + emission_ledger.mark_emitted(&input.rewritten_commit_sha); + return Ok(RewriteTraceFinalization::Persisted(RewriteTracePersisted { + commit_sha: input.rewritten_commit_sha, + trace_id: record.id, + quality_status, + notes: notes_result, + database: database_result, + })); + } + + retry_queue.enqueue(TraceRetryQueueEntry { + commit_sha: input.rewritten_commit_sha.clone(), + failed_targets: failed_targets.clone(), + content_type: TRACE_CONTENT_TYPE.to_string(), + notes_ref: crate::services::agent_trace::NOTES_REF.to_string(), + record: record.clone(), + })?; + + Ok(RewriteTraceFinalization::QueuedFallback( + RewriteTraceQueuedFallback { + commit_sha: input.rewritten_commit_sha, + trace_id: record.id, + quality_status, + failed_targets, + }, + )) +} + +fn normalize_rewrite_confidence(confidence: f32) -> Result { + if !confidence.is_finite() { + anyhow::bail!("rewrite confidence must be finite") + } + + if !(0.0..=1.0).contains(&confidence) { + anyhow::bail!("rewrite confidence must be within [0.0, 1.0]") + } + + Ok(format!("{confidence:.2}")) +} + +fn quality_status_for_confidence(confidence: f32) -> QualityStatus { + if confidence >= 0.90 { + return QualityStatus::Final; + } + + if confidence >= 0.60 { + return QualityStatus::Partial; + } + + QualityStatus::NeedsReview +} + fn parse_post_rewrite_pairs(contents: &str) -> Result> { let mut pairs =
Vec::new(); @@ -663,19 +826,22 @@ mod tests { use crate::services::agent_trace::{ ContributorInput, ContributorType, ConversationInput, FileAttributionInput, RangeInput, + METADATA_QUALITY_STATUS, METADATA_REWRITE_CONFIDENCE, METADATA_REWRITE_FROM, + METADATA_REWRITE_METHOD, }; use super::{ apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, - finalize_pre_commit_checkpoint, run_placeholder_hooks, CommitMsgRuntimeState, - GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, - PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, - PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, - PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, - PostRewriteFinalization, PostRewriteNoOpReason, PostRewriteRuntimeState, - PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, - RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, TraceEmissionLedger, TraceNote, - TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry, + finalize_pre_commit_checkpoint, finalize_rewrite_trace, run_placeholder_hooks, + CommitMsgRuntimeState, GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, + HookEvent, HookService, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, + PersistenceErrorClass, PersistenceFailure, PersistenceTarget, PersistenceWriteResult, + PlaceholderHookService, PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, + PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, + PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, + PreCommitTreeAnchors, RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, + RewriteTraceFinalization, RewriteTraceInput, RewriteTraceNoOpReason, TraceEmissionLedger, + TraceNote, TraceNotesWriter, TraceRecordStore, TraceRetryQueue, 
TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY, }; @@ -811,6 +977,33 @@ mod tests { } } + fn sample_rewrite_trace_input() -> RewriteTraceInput { + RewriteTraceInput { + record_id: "660e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "2026-03-04T11:12:13Z".to_string(), + rewritten_commit_sha: "newsha123".to_string(), + rewrite_from_sha: "oldsha456".to_string(), + rewrite_method: RewriteMethod::Rebase, + rewrite_confidence: 0.91, + idempotency_key: "post-rewrite:rebase:oldsha456:newsha123".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/rewritten".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 3, + end_line: 7, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + } + } + fn sample_post_commit_input() -> PostCommitInput { PostCommitInput { record_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), @@ -1038,6 +1231,166 @@ mod tests { assert!(ingestion.seen_requests.is_empty()); } + #[test] + fn rewrite_trace_finalization_persists_metadata_and_notes_db_parity() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let input = sample_rewrite_trace_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let outcome = finalize_rewrite_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + let persisted = match outcome { + RewriteTraceFinalization::Persisted(persisted) => persisted, + _ => panic!("expected persisted rewrite trace outcome"), + }; + + assert_eq!(persisted.commit_sha, "newsha123"); + assert_eq!(persisted.trace_id, 
"660e8400-e29b-41d4-a716-446655440000"); + assert_eq!(persisted.quality_status, super::QualityStatus::Final); + assert_eq!(notes.writes.len(), 1); + assert_eq!(notes.writes[0].record.vcs.revision, "newsha123"); + assert_eq!( + notes.writes[0].record.metadata.get(METADATA_REWRITE_FROM), + Some(&"oldsha456".to_string()) + ); + assert_eq!( + notes.writes[0].record.metadata.get(METADATA_REWRITE_METHOD), + Some(&"rebase".to_string()) + ); + assert_eq!( + notes.writes[0] + .record + .metadata + .get(METADATA_REWRITE_CONFIDENCE), + Some(&"0.91".to_string()) + ); + assert_eq!( + notes.writes[0].record.metadata.get(METADATA_QUALITY_STATUS), + Some(&"final".to_string()) + ); + assert!(queue.entries.is_empty()); + assert!(ledger.has_emitted("newsha123")); + Ok(()) + } + + #[test] + fn rewrite_trace_finalization_applies_quality_thresholds() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let mut medium = sample_rewrite_trace_input(); + medium.record_id = "760e8400-e29b-41d4-a716-446655440000".to_string(); + medium.rewritten_commit_sha = "newsha-medium".to_string(); + medium.rewrite_confidence = 0.75; + let medium_outcome = finalize_rewrite_trace( + &runtime, + medium, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + assert!(matches!( + medium_outcome, + RewriteTraceFinalization::Persisted(super::RewriteTracePersisted { + quality_status: super::QualityStatus::Partial, + .. 
+ }) + )); + + let mut low = sample_rewrite_trace_input(); + low.record_id = "860e8400-e29b-41d4-a716-446655440000".to_string(); + low.rewritten_commit_sha = "newsha-low".to_string(); + low.rewrite_confidence = 0.40; + let low_outcome = finalize_rewrite_trace( + &runtime, + low, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + assert!(matches!( + low_outcome, + RewriteTraceFinalization::Persisted(super::RewriteTracePersisted { + quality_status: super::QualityStatus::NeedsReview, + .. + }) + )); + + Ok(()) + } + + #[test] + fn rewrite_trace_finalization_rejects_confidence_outside_zero_to_one() { + let runtime = sample_post_rewrite_runtime(); + let mut input = sample_rewrite_trace_input(); + input.rewrite_confidence = 1.2; + + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let error = finalize_rewrite_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + ) + .expect_err("out-of-range confidence must fail"); + + assert!(error + .to_string() + .contains("rewrite confidence must be within [0.0, 1.0]")); + assert!(notes.writes.is_empty()); + assert!(queue.entries.is_empty()); + } + + #[test] + fn rewrite_trace_finalization_noops_when_commit_already_finalized() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let input = sample_rewrite_trace_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger { + emitted: vec!["newsha123".to_string()], + }; + + let outcome = finalize_rewrite_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + assert_eq!( + outcome, + 
RewriteTraceFinalization::NoOp(RewriteTraceNoOpReason::AlreadyFinalized) + ); + assert!(notes.writes.is_empty()); + assert!(queue.entries.is_empty()); + Ok(()) + } + #[test] fn pre_commit_finalization_noops_when_sce_disabled() { let mut runtime = sample_runtime(); diff --git a/context/architecture.md b/context/architecture.md index d85f26cc..c31709de 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -79,7 +79,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. 
-- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, and a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch. +- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. 
- `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. - `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. diff --git a/context/context-map.md b/context/context-map.md index ff36a89b..a2450e8d 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -24,6 +24,7 @@ Feature/domain context: - `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) - `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) - `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) +- `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 5c3fcd24..281db542 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -62,3 +62,4 @@ - `agent trace post-commit idempotency ledger`: T06 seam (`TraceEmissionLedger`) in `cli/src/services/hooks.rs` used to prevent duplicate emission for the same commit SHA and to mark successful dual-write 
completion. - `sce doctor` hook rollout validation: Implemented CLI command in `cli/src/services/doctor.rs` (`run_doctor`) that reports readiness for local Agent Trace rollout by resolving hook-path source (default `.git/hooks`, per-repo `core.hooksPath`, or global `core.hooksPath`) and validating required hook presence plus executable permissions. - `agent trace post-rewrite local remap ingestion`: T08 contract in `cli/src/services/hooks.rs` (`finalize_post_rewrite_remap`) that parses git `post-rewrite` old/new SHA pairs, captures normalized rewrite method (`amend`, `rebase`, or lowercase passthrough), derives deterministic `post-rewrite:::` idempotency keys, and dispatches replay-safe remap ingestion requests. +- `agent trace rewrite trace transformation`: T09 contract in `cli/src/services/hooks.rs` (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, enforces confidence range normalization (`0.00`..`1.00`), maps quality status thresholds (`final`/`partial`/`needs_review`), and preserves notes+DB persistence parity with retry fallback. diff --git a/context/overview.md b/context/overview.md index 3f1e4627..422de0f2 100644 --- a/context/overview.md +++ b/context/overview.md @@ -31,6 +31,7 @@ The hooks service now also exposes a `commit-msg` co-author trailer policy (`app The hooks service now also includes a post-commit trace finalization seam (`finalize_post_commit_trace`) that builds canonical Agent Trace payloads, enforces commit-level idempotency guards, performs notes + DB dual writes, and enqueues retry fallback metadata when persistence targets fail; this behavior is documented in `context/sce/agent-trace-post-commit-dual-write.md`. The CLI now also includes a hook rollout doctor contract documented in `context/sce/agent-trace-hook-doctor.md`. 
The hooks service now also includes a post-rewrite local remap ingestion seam (`finalize_post_rewrite_remap`) that parses `post-rewrite` old->new SHA pairs, normalizes rewrite method capture, and derives deterministic per-pair idempotency keys before remap dispatch; this behavior is documented in `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md`. +The hooks service now also includes rewrite trace transformation finalization (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, confidence-threshold quality mapping (`final`/`partial`/`needs_review`), and notes+DB persistence parity with retry fallback; this behavior is documented in `context/sce/agent-trace-rewrite-trace-transformation.md`. ## Repository model @@ -89,3 +90,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-post-commit-dual-write.md` for the implemented T06 post-commit trace finalization and dual-write + queue-fallback behavior. - Use `context/sce/agent-trace-hook-doctor.md` for the implemented T07 hook install and health validation behavior (`sce doctor`) across default/per-repo/global hook-path installs. - Use `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` for the implemented T08 post-rewrite local remap ingestion pipeline (`post-rewrite` pair parsing, rewrite-method normalization, and deterministic idempotency-key derivation). +- Use `context/sce/agent-trace-rewrite-trace-transformation.md` for the implemented T09 rewritten-SHA trace transformation path (`finalize_rewrite_trace`), confidence-based quality status mapping, and rewrite metadata persistence semantics. 
diff --git a/context/patterns.md b/context/patterns.md index 3be0aedc..524c33b4 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -89,6 +89,7 @@ - For commit-msg co-author policy seams, gate canonical trailer insertion on runtime controls (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`) plus staged SCE-attribution presence, and enforce idempotent dedupe so allowed cases end with exactly one `Co-authored-by: SCE ` trailer. - For post-commit trace finalization seams, treat commit SHA as the idempotency identity, perform notes + DB writes in the same finalize pass when available, and enqueue retry-fallback entries that explicitly capture failed persistence targets for replay-safe recovery. - For post-rewrite remap ingestion seams, parse ` ` pairs from hook input strictly, ignore empty/no-op self-mapping rows, normalize rewrite method labels to lowercase (`amend`/`rebase` when recognized), and derive deterministic per-pair idempotency keys before dispatching remap requests. +- For rewrite trace transformation seams, materialize rewritten records through the canonical Agent Trace builder path, require finite confidence in `[0.0, 1.0]`, normalize confidence to two-decimal metadata strings, map quality thresholds to `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`), and preserve notes+DB dual-write plus retry-fallback parity. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). 
- In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 973adc4d..e70fac50 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -140,7 +140,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo fmt --manifest-path cli/Cargo.toml -- --check` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T09: Implement rewrite trace transformation semantics (status:todo) +- [x] T09: Implement rewrite trace transformation semantics (status:done) - Task ID: T09 - Goal: Materialize new trace records for rewritten SHAs with explicit rewrite metadata. - Boundaries (in/out of scope): @@ -150,6 +150,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Rewritten traces preserve attribution continuity and auditability. - Verification notes (commands or checks): - Integration tests asserting metadata integrity and notes/DB parity. 
+ - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo test --manifest-path cli/Cargo.toml rewrite_trace_finalization` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T10: Ship core schema migrations (`repositories`, `commits`, `trace_records`, `trace_ranges`) (status:todo) - Task ID: T10 diff --git a/context/sce/agent-trace-rewrite-trace-transformation.md b/context/sce/agent-trace-rewrite-trace-transformation.md new file mode 100644 index 00000000..624d9e72 --- /dev/null +++ b/context/sce/agent-trace-rewrite-trace-transformation.md @@ -0,0 +1,62 @@ +# Agent Trace Rewrite Trace Transformation (T09) + +## Status +- Plan: `agent-trace-attribution-no-git-wrapper` +- Task: `T09` +- Scope: rewrite trace transformation semantics for rewritten SHAs + +## Implemented surface +- Code: `cli/src/services/hooks.rs` +- Primary entrypoint: `finalize_rewrite_trace` +- Purpose: materialize rewritten-commit Agent Trace records with explicit rewrite metadata and deterministic quality classification. + +## Runtime gating and idempotency + +`finalize_rewrite_trace` returns `NoOp` without persistence when any guard applies: + +- `sce_disabled = true` +- `cli_available = false` +- `is_bare_repo = true` +- rewritten commit SHA is already marked emitted in `TraceEmissionLedger` + +## Rewrite record transformation contract + +- Rewritten traces are emitted through the canonical builder path (`build_trace_payload`) to preserve Agent Trace-required structure. +- The rewritten commit identity maps to `vcs.revision = `. +- Rewrite lineage metadata is always attached via reserved keys: + - `dev.crocoder.sce.rewrite_from` + - `dev.crocoder.sce.rewrite_method` + - `dev.crocoder.sce.rewrite_confidence` +- The method value uses canonical labels from `RewriteMethod` (`amend`, `rebase`, lowercase passthrough for `Other`). + +## Confidence and quality logic + +- Confidence input must be finite and inside `[0.0, 1.0]`; otherwise finalization errors before writes. 
+- Confidence is normalized to a fixed two-decimal metadata string (`0.00`..`1.00`). +- Quality status mapping: + - `>= 0.90` -> `final` + - `0.60..0.89` -> `partial` + - `< 0.60` -> `needs_review` + +## Persistence semantics + +- Rewritten trace finalization follows the same notes+DB persistence contract as post-commit traces. +- On dual-write success: + - commit SHA is marked emitted in `TraceEmissionLedger` + - outcome is `RewriteTraceFinalization::Persisted` +- On any target failure: + - failed targets are captured in a retry queue entry + - outcome is `RewriteTraceFinalization::QueuedFallback` + +## Verification evidence + +- `cargo fmt --manifest-path cli/Cargo.toml -- --check` +- `cargo test --manifest-path cli/Cargo.toml rewrite_trace_finalization` +- `cargo build --manifest-path cli/Cargo.toml` + +## Tests added + +- Metadata integrity and notes/DB parity for persisted rewrite traces. +- Confidence-threshold quality mapping (`final`, `partial`, `needs_review`). +- Confidence range validation errors for out-of-range input. +- No-op behavior when rewritten commit was already finalized. From 497d00360d42e1e89e68711c5a95ae5f99fd16df Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 14:44:25 +0100 Subject: [PATCH 11/39] agent-trace: Implement core local schema migrations Add an idempotent migration entrypoint for local Turso storage that creates foundational Agent Trace tables and indexes for repository, commit, record, and range persistence. Include focused migration tests that verify clean creation on empty databases and safe reapplication on preexisting state. 
--- cli/src/services/local_db.rs | 203 +++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 4 +- .../sce/agent-trace-core-schema-migrations.md | 42 ++++ 8 files changed, 252 insertions(+), 4 deletions(-) create mode 100644 context/sce/agent-trace-core-schema-migrations.md diff --git a/cli/src/services/local_db.rs b/cli/src/services/local_db.rs index 2a73b411..e6faa9c4 100644 --- a/cli/src/services/local_db.rs +++ b/cli/src/services/local_db.rs @@ -3,6 +3,62 @@ use std::path::Path; use anyhow::{anyhow, ensure, Result}; use turso::Builder; +const CORE_SCHEMA_STATEMENTS: &[&str] = &[ + "CREATE TABLE IF NOT EXISTS repositories (\ + id INTEGER PRIMARY KEY,\ + vcs_provider TEXT NOT NULL DEFAULT 'git',\ + canonical_root TEXT NOT NULL UNIQUE,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))\ + )", + "CREATE TABLE IF NOT EXISTS commits (\ + id INTEGER PRIMARY KEY,\ + repository_id INTEGER NOT NULL,\ + commit_sha TEXT NOT NULL,\ + parent_sha TEXT,\ + committed_at TEXT,\ + author_name TEXT,\ + author_email TEXT,\ + idempotency_key TEXT,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + FOREIGN KEY(repository_id) REFERENCES repositories(id) ON DELETE CASCADE,\ + UNIQUE(repository_id, commit_sha),\ + UNIQUE(repository_id, idempotency_key)\ + )", + "CREATE TABLE IF NOT EXISTS trace_records (\ + id INTEGER PRIMARY KEY,\ + repository_id INTEGER NOT NULL,\ + commit_id INTEGER NOT NULL,\ + trace_id TEXT NOT NULL UNIQUE,\ + version TEXT NOT NULL,\ + content_type TEXT NOT NULL,\ + notes_ref TEXT NOT NULL,\ + payload_json TEXT NOT NULL,\ + quality_status TEXT NOT NULL,\ + idempotency_key TEXT NOT NULL,\ + recorded_at TEXT NOT NULL,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + FOREIGN KEY(repository_id) REFERENCES repositories(id) ON DELETE 
CASCADE,\ + FOREIGN KEY(commit_id) REFERENCES commits(id) ON DELETE CASCADE,\ + UNIQUE(repository_id, idempotency_key),\ + UNIQUE(commit_id)\ + )", + "CREATE TABLE IF NOT EXISTS trace_ranges (\ + id INTEGER PRIMARY KEY,\ + trace_record_id INTEGER NOT NULL,\ + file_path TEXT NOT NULL,\ + conversation_url TEXT NOT NULL,\ + start_line INTEGER NOT NULL,\ + end_line INTEGER NOT NULL,\ + contributor_type TEXT NOT NULL,\ + contributor_model_id TEXT,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + FOREIGN KEY(trace_record_id) REFERENCES trace_records(id) ON DELETE CASCADE\ + )", + "CREATE INDEX IF NOT EXISTS idx_commits_repository_commit_sha ON commits(repository_id, commit_sha)", + "CREATE INDEX IF NOT EXISTS idx_trace_records_repository_commit ON trace_records(repository_id, commit_id)", + "CREATE INDEX IF NOT EXISTS idx_trace_ranges_record_file ON trace_ranges(trace_record_id, file_path)", +]; + #[derive(Clone, Copy, Debug)] #[allow(dead_code)] pub enum LocalDatabaseTarget<'a> { @@ -15,7 +71,12 @@ pub struct SmokeCheckOutcome { pub inserted_rows: u64, } -pub async fn run_smoke_check(target: LocalDatabaseTarget<'_>) -> Result { +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub struct CoreSchemaMigrationOutcome { + pub executed_statements: usize, +} + +async fn connect_local(target: LocalDatabaseTarget<'_>) -> Result { let location = match target { LocalDatabaseTarget::InMemory => ":memory:".to_string(), LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), @@ -23,6 +84,24 @@ pub async fn run_smoke_check(target: LocalDatabaseTarget<'_>) -> Result, +) -> Result { + let conn = connect_local(target).await?; + for statement in CORE_SCHEMA_STATEMENTS { + conn.execute(statement, ()).await?; + } + + Ok(CoreSchemaMigrationOutcome { + executed_statements: CORE_SCHEMA_STATEMENTS.len(), + }) +} + +pub async fn run_smoke_check(target: LocalDatabaseTarget<'_>) -> Result { + let conn = connect_local(target).await?; conn.execute( 
"CREATE TABLE IF NOT EXISTS sce_smoke (id INTEGER PRIMARY KEY, label TEXT NOT NULL)", @@ -56,7 +135,45 @@ mod tests { use crate::test_support::TestTempDir; use anyhow::Result; - use super::{run_smoke_check, LocalDatabaseTarget}; + use super::{apply_core_schema_migrations, run_smoke_check, LocalDatabaseTarget}; + + fn row_exists_query(kind: &str, name: &str) -> String { + format!("SELECT 1 FROM sqlite_master WHERE type = '{kind}' AND name = '{name}' LIMIT 1") + } + + async fn sqlite_object_exists( + target: LocalDatabaseTarget<'_>, + kind: &str, + name: &str, + ) -> Result { + let location = match target { + LocalDatabaseTarget::InMemory => ":memory:".to_string(), + LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), + }; + let db = turso::Builder::new_local(&location).build().await?; + let conn = db.connect()?; + let mut rows = conn.query(&row_exists_query(kind, name), ()).await?; + Ok(rows.next().await?.is_some()) + } + + async fn repository_count(target: LocalDatabaseTarget<'_>) -> Result { + let location = match target { + LocalDatabaseTarget::InMemory => ":memory:".to_string(), + LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), + }; + let db = turso::Builder::new_local(&location).build().await?; + let conn = db.connect()?; + let mut rows = conn.query("SELECT COUNT(*) FROM repositories", ()).await?; + let row = rows + .next() + .await? 
+ .ok_or_else(|| anyhow::anyhow!("repository count query returned no rows"))?; + let count = row.get_value(0)?; + let count = *count + .as_integer() + .ok_or_else(|| anyhow::anyhow!("repository count query returned non-integer"))?; + Ok(count as u64) + } #[test] fn in_memory_smoke_check_succeeds() -> Result<()> { @@ -76,4 +193,86 @@ mod tests { assert!(path.exists()); Ok(()) } + + #[test] + fn core_schema_migrations_create_required_tables_and_indexes() -> Result<()> { + let temp = TestTempDir::new("sce-core-schema-tests")?; + let path = temp.path().join("core-schema.db"); + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + + let outcome = runtime.block_on(apply_core_schema_migrations(LocalDatabaseTarget::Path( + &path, + )))?; + assert_eq!( + outcome.executed_statements, 7, + "expected all core migration statements to execute" + ); + + for table in ["repositories", "commits", "trace_records", "trace_ranges"] { + assert!(runtime.block_on(sqlite_object_exists( + LocalDatabaseTarget::Path(&path), + "table", + table, + ))?); + } + + for index in [ + "idx_commits_repository_commit_sha", + "idx_trace_records_repository_commit", + "idx_trace_ranges_record_file", + ] { + assert!(runtime.block_on(sqlite_object_exists( + LocalDatabaseTarget::Path(&path), + "index", + index, + ))?); + } + + Ok(()) + } + + #[test] + fn core_schema_migrations_are_upgrade_safe_for_preexisting_state() -> Result<()> { + let temp = TestTempDir::new("sce-core-schema-upgrade-tests")?; + let path = temp.path().join("preexisting.db"); + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + + runtime.block_on(async { + let db = turso::Builder::new_local(path.to_string_lossy().as_ref()) + .build() + .await?; + let conn = db.connect()?; + conn.execute( + "CREATE TABLE IF NOT EXISTS repositories (\ + id INTEGER PRIMARY KEY,\ + vcs_provider TEXT NOT NULL DEFAULT 'git',\ + canonical_root TEXT NOT NULL UNIQUE,\ + created_at TEXT NOT NULL DEFAULT 
(strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))\ + )", + (), + ) + .await?; + conn.execute( + "INSERT INTO repositories (canonical_root) VALUES (?1)", + ["/tmp/example-repo"], + ) + .await?; + Ok::<(), anyhow::Error>(()) + })?; + + runtime.block_on(apply_core_schema_migrations(LocalDatabaseTarget::Path( + &path, + )))?; + runtime.block_on(apply_core_schema_migrations(LocalDatabaseTarget::Path( + &path, + )))?; + + let repository_rows = + runtime.block_on(repository_count(LocalDatabaseTarget::Path(&path)))?; + assert_eq!( + repository_rows, 1, + "preexisting repository rows should remain" + ); + Ok(()) + } } diff --git a/context/architecture.md b/context/architecture.md index c31709de..59e19999 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -73,7 +73,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/main.rs` is the executable entrypoint (`sce`) and delegates to `app::run`. - `cli/src/app.rs` provides a `lexopt`-based argument parser and dispatch loop with deterministic help, setup installation execution, and consistent `anyhow`-driven error exits. - `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. -- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization and async execute/query smoke checks for in-memory and file-backed targets. +- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent core schema migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`). 
- `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. - `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. 
diff --git a/context/context-map.md b/context/context-map.md index a2450e8d..8bcecc9b 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -25,6 +25,7 @@ Feature/domain context: - `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) - `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) - `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) +- `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 281db542..c2feb675 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -63,3 +63,4 @@ - `sce doctor` hook rollout validation: Implemented CLI command in `cli/src/services/doctor.rs` (`run_doctor`) that reports readiness for local Agent Trace rollout by resolving hook-path source (default `.git/hooks`, per-repo `core.hooksPath`, or global `core.hooksPath`) and validating required hook presence plus executable permissions. - `agent trace post-rewrite local remap ingestion`: T08 contract in `cli/src/services/hooks.rs` (`finalize_post_rewrite_remap`) that parses git `post-rewrite` old/new SHA pairs, captures normalized rewrite method (`amend`, `rebase`, or lowercase passthrough), derives deterministic `post-rewrite:::` idempotency keys, and dispatches replay-safe remap ingestion requests. 
- `agent trace rewrite trace transformation`: T09 contract in `cli/src/services/hooks.rs` (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, enforces confidence range normalization (`0.00`..`1.00`), maps quality status thresholds (`final`/`partial`/`needs_review`), and preserves notes+DB persistence parity with retry fallback. +- `agent trace core schema migrations`: T10 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that applies idempotent local DB table/index creation for foundational Agent Trace entities (`repositories`, `commits`, `trace_records`, `trace_ranges`) and supports upgrade-safe reapplication on preexisting database state. diff --git a/context/overview.md b/context/overview.md index 422de0f2..24c654fa 100644 --- a/context/overview.md +++ b/context/overview.md @@ -32,6 +32,7 @@ The hooks service now also includes a post-commit trace finalization seam (`fina The CLI now also includes a hook rollout doctor contract documented in `context/sce/agent-trace-hook-doctor.md`. The hooks service now also includes a post-rewrite local remap ingestion seam (`finalize_post_rewrite_remap`) that parses `post-rewrite` old->new SHA pairs, normalizes rewrite method capture, and derives deterministic per-pair idempotency keys before remap dispatch; this behavior is documented in `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md`. The hooks service now also includes rewrite trace transformation finalization (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, confidence-threshold quality mapping (`final`/`partial`/`needs_review`), and notes+DB persistence parity with retry fallback; this behavior is documented in `context/sce/agent-trace-rewrite-trace-transformation.md`. 
+The local DB service now includes core Agent Trace persistence schema migrations (`apply_core_schema_migrations`) that install idempotent foundational tables and indexes for `repositories`, `commits`, `trace_records`, and `trace_ranges`; this behavior is documented in `context/sce/agent-trace-core-schema-migrations.md`. ## Repository model @@ -91,3 +92,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-hook-doctor.md` for the implemented T07 hook install and health validation behavior (`sce doctor`) across default/per-repo/global hook-path installs. - Use `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` for the implemented T08 post-rewrite local remap ingestion pipeline (`post-rewrite` pair parsing, rewrite-method normalization, and deterministic idempotency-key derivation). - Use `context/sce/agent-trace-rewrite-trace-transformation.md` for the implemented T09 rewritten-SHA trace transformation path (`finalize_rewrite_trace`), confidence-based quality status mapping, and rewrite metadata persistence semantics. +- Use `context/sce/agent-trace-core-schema-migrations.md` for the implemented T10 core local schema migration contract (`apply_core_schema_migrations`) and table/index ownership across foundational Agent Trace persistence entities. diff --git a/context/patterns.md b/context/patterns.md index 524c33b4..052ee56e 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -90,6 +90,7 @@ - For post-commit trace finalization seams, treat commit SHA as the idempotency identity, perform notes + DB writes in the same finalize pass when available, and enqueue retry-fallback entries that explicitly capture failed persistence targets for replay-safe recovery. 
- For post-rewrite remap ingestion seams, parse ` ` pairs from hook input strictly, ignore empty/no-op self-mapping rows, normalize rewrite method labels to lowercase (`amend`/`rebase` when recognized), and derive deterministic per-pair idempotency keys before dispatching remap requests. - For rewrite trace transformation seams, materialize rewritten records through the canonical Agent Trace builder path, require finite confidence in `[0.0, 1.0]`, normalize confidence to two-decimal metadata strings, map quality thresholds to `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`), and preserve notes+DB dual-write plus retry-fallback parity. +- For local persistence rollout, ship core schema changes as idempotent `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` statements so migration reapplication is upgrade-safe across empty and preexisting local Turso DB states. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. 
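Note on the rewrite trace transformation pattern in the hunk above: the confidence-to-quality mapping it describes can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the code in `cli/src/services/hooks.rs`. The names `QualityStatus` and `classify_confidence` are hypothetical, and applying the thresholds to the two-decimal normalized value (rather than the raw input) is an assumption drawn from the `0.60..0.89` wording.

```rust
/// Quality states named in the pattern text (`final`, `partial`, `needs_review`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QualityStatus {
    Final,
    Partial,
    NeedsReview,
}

/// Hypothetical helper: validates a rewrite confidence, normalizes it to two
/// decimals, and maps it to a quality status plus its metadata string form.
fn classify_confidence(confidence: f64) -> Result<(QualityStatus, String), String> {
    // Require a finite value inside [0.0, 1.0]; NaN and infinities are rejected.
    if !confidence.is_finite() || !(0.0..=1.0).contains(&confidence) {
        return Err(format!("confidence out of range: {confidence}"));
    }

    // Assumption: thresholds are evaluated on the two-decimal value.
    let normalized = (confidence * 100.0).round() / 100.0;

    let status = if normalized >= 0.90 {
        QualityStatus::Final
    } else if normalized >= 0.60 {
        QualityStatus::Partial
    } else {
        QualityStatus::NeedsReview
    };

    Ok((status, format!("{normalized:.2}")))
}
```

Under these assumptions, a call such as `classify_confidence(0.98)` yields the `Final` status together with the metadata string `0.98`, while an out-of-range or non-finite input is rejected before any mapping happens.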
diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index e70fac50..c9d11c68 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -154,7 +154,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml rewrite_trace_finalization` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T10: Ship core schema migrations (`repositories`, `commits`, `trace_records`, `trace_ranges`) (status:todo) +- [x] T10: Ship core schema migrations (`repositories`, `commits`, `trace_records`, `trace_ranges`) (status:done) - Task ID: T10 - Goal: Establish foundational persistence tables, constraints, and indexes. - Boundaries (in/out of scope): @@ -164,6 +164,8 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Core schema applies cleanly and supports local commit ingestion. - Verification notes (commands or checks): - Migration tests with empty and preexisting DB states. + - `cargo test --manifest-path cli/Cargo.toml core_schema_migrations` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T11: Ship reconciliation schema and ingestion (`reconciliation_runs`, `rewrite_mappings`, `conversations`) (status:todo) - Task ID: T11 diff --git a/context/sce/agent-trace-core-schema-migrations.md b/context/sce/agent-trace-core-schema-migrations.md new file mode 100644 index 00000000..5409ed4b --- /dev/null +++ b/context/sce/agent-trace-core-schema-migrations.md @@ -0,0 +1,42 @@ +# Agent Trace Core Schema Migrations + +## Scope + +- Implements T10 for plan `agent-trace-attribution-no-git-wrapper`. +- Defines foundational local persistence schema for Agent Trace ingestion. +- Covers only core entities: `repositories`, `commits`, `trace_records`, `trace_ranges`. 
+ +## Code ownership + +- Migration entrypoint: `cli/src/services/local_db.rs` (`apply_core_schema_migrations`). +- Shared local DB connection helper: `cli/src/services/local_db.rs` (`connect_local`). + +## Migration contract + +- Migrations are idempotent and upgrade-safe via `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS`. +- Reapplying migrations must succeed on both empty and preexisting local DB states. +- Core schema statements are deterministic and owned in one ordered list (`CORE_SCHEMA_STATEMENTS`). + +## Core tables + +- `repositories`: repository identity root (`canonical_root`) plus VCS provider marker. +- `commits`: per-repository commit identity (`commit_sha`), optional parent SHA, and idempotency key capture. +- `trace_records`: canonical stored Agent Trace payload envelope per commit (content type, notes ref, payload JSON, quality status, recorded timestamp). +- `trace_ranges`: flattened line-range attribution rows linked to a trace record. + +## Indexes + +- `idx_commits_repository_commit_sha` on `commits(repository_id, commit_sha)`. +- `idx_trace_records_repository_commit` on `trace_records(repository_id, commit_id)`. +- `idx_trace_ranges_record_file` on `trace_ranges(trace_record_id, file_path)`. + +## Verification evidence + +- `cargo test --manifest-path cli/Cargo.toml core_schema_migrations` +- `cargo build --manifest-path cli/Cargo.toml` + +## Related context + +- `context/sce/agent-trace-post-commit-dual-write.md` +- `context/sce/agent-trace-rewrite-trace-transformation.md` +- `context/plans/agent-trace-attribution-no-git-wrapper.md` From 1eff2b87c51adcf360befa8f0912eb12fb870621 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 14:58:55 +0100 Subject: [PATCH 12/39] agent-trace: Add reconciliation schema for hosted rewrite ingestion Extend local DB migrations with reconciliation persistence tables for runs, rewrite mappings, and conversations, including replay-safe idempotency keys and lookup indexes. 
Add targeted migration tests for upgrade safety, uniqueness, and representative reconciliation queries. --- cli/src/services/local_db.rs | 194 +++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 6 +- ...t-trace-reconciliation-schema-ingestion.md | 50 +++++ 8 files changed, 249 insertions(+), 8 deletions(-) create mode 100644 context/sce/agent-trace-reconciliation-schema-ingestion.md diff --git a/cli/src/services/local_db.rs b/cli/src/services/local_db.rs index e6faa9c4..dcdd7a26 100644 --- a/cli/src/services/local_db.rs +++ b/cli/src/services/local_db.rs @@ -54,9 +54,48 @@ const CORE_SCHEMA_STATEMENTS: &[&str] = &[ created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ FOREIGN KEY(trace_record_id) REFERENCES trace_records(id) ON DELETE CASCADE\ )", + "CREATE TABLE IF NOT EXISTS reconciliation_runs (\ + id INTEGER PRIMARY KEY,\ + repository_id INTEGER NOT NULL,\ + provider TEXT NOT NULL,\ + idempotency_key TEXT NOT NULL,\ + status TEXT NOT NULL,\ + initiated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + completed_at TEXT,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + FOREIGN KEY(repository_id) REFERENCES repositories(id) ON DELETE CASCADE,\ + UNIQUE(repository_id, idempotency_key)\ + )", + "CREATE TABLE IF NOT EXISTS rewrite_mappings (\ + id INTEGER PRIMARY KEY,\ + reconciliation_run_id INTEGER NOT NULL,\ + repository_id INTEGER NOT NULL,\ + old_commit_sha TEXT NOT NULL,\ + new_commit_sha TEXT,\ + mapping_status TEXT NOT NULL,\ + confidence REAL,\ + idempotency_key TEXT NOT NULL,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + FOREIGN KEY(reconciliation_run_id) REFERENCES reconciliation_runs(id) ON DELETE CASCADE,\ + FOREIGN KEY(repository_id) REFERENCES repositories(id) ON DELETE CASCADE,\ + 
UNIQUE(repository_id, idempotency_key)\ + )", + "CREATE TABLE IF NOT EXISTS conversations (\ + id INTEGER PRIMARY KEY,\ + repository_id INTEGER NOT NULL,\ + url TEXT NOT NULL,\ + source TEXT NOT NULL,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + FOREIGN KEY(repository_id) REFERENCES repositories(id) ON DELETE CASCADE,\ + UNIQUE(repository_id, url)\ + )", "CREATE INDEX IF NOT EXISTS idx_commits_repository_commit_sha ON commits(repository_id, commit_sha)", "CREATE INDEX IF NOT EXISTS idx_trace_records_repository_commit ON trace_records(repository_id, commit_id)", "CREATE INDEX IF NOT EXISTS idx_trace_ranges_record_file ON trace_ranges(trace_record_id, file_path)", + "CREATE INDEX IF NOT EXISTS idx_reconciliation_runs_repository_status ON reconciliation_runs(repository_id, status)", + "CREATE INDEX IF NOT EXISTS idx_rewrite_mappings_run_old_sha ON rewrite_mappings(reconciliation_run_id, old_commit_sha)", + "CREATE INDEX IF NOT EXISTS idx_rewrite_mappings_repository_old_sha ON rewrite_mappings(repository_id, old_commit_sha)", + "CREATE INDEX IF NOT EXISTS idx_conversations_repository_source ON conversations(repository_id, source)", ]; #[derive(Clone, Copy, Debug)] @@ -84,6 +123,7 @@ async fn connect_local(target: LocalDatabaseTarget<'_>) -> Result, query: &str) -> Result { + let location = match target { + LocalDatabaseTarget::InMemory => ":memory:".to_string(), + LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), + }; + let db = turso::Builder::new_local(&location).build().await?; + let conn = db.connect()?; + let mut rows = conn.query(query, ()).await?; + let row = rows + .next() + .await? 
+ .ok_or_else(|| anyhow::anyhow!("integer query returned no rows"))?; + let value = row.get_value(0)?; + let value = *value + .as_integer() + .ok_or_else(|| anyhow::anyhow!("integer query returned non-integer"))?; + Ok(value) + } + #[test] fn in_memory_smoke_check_succeeds() -> Result<()> { let runtime = tokio::runtime::Builder::new_current_thread().build()?; @@ -204,11 +263,20 @@ mod tests { &path, )))?; assert_eq!( - outcome.executed_statements, 7, + outcome.executed_statements, + super::CORE_SCHEMA_STATEMENTS.len(), "expected all core migration statements to execute" ); - for table in ["repositories", "commits", "trace_records", "trace_ranges"] { + for table in [ + "repositories", + "commits", + "trace_records", + "trace_ranges", + "reconciliation_runs", + "rewrite_mappings", + "conversations", + ] { assert!(runtime.block_on(sqlite_object_exists( LocalDatabaseTarget::Path(&path), "table", @@ -220,6 +288,10 @@ mod tests { "idx_commits_repository_commit_sha", "idx_trace_records_repository_commit", "idx_trace_ranges_record_file", + "idx_reconciliation_runs_repository_status", + "idx_rewrite_mappings_run_old_sha", + "idx_rewrite_mappings_repository_old_sha", + "idx_conversations_repository_source", ] { assert!(runtime.block_on(sqlite_object_exists( LocalDatabaseTarget::Path(&path), @@ -238,10 +310,7 @@ mod tests { let runtime = tokio::runtime::Builder::new_current_thread().build()?; runtime.block_on(async { - let db = turso::Builder::new_local(path.to_string_lossy().as_ref()) - .build() - .await?; - let conn = db.connect()?; + let conn = super::connect_local(LocalDatabaseTarget::Path(&path)).await?; conn.execute( "CREATE TABLE IF NOT EXISTS repositories (\ id INTEGER PRIMARY KEY,\ @@ -275,4 +344,117 @@ mod tests { ); Ok(()) } + + #[test] + fn reconciliation_schema_supports_replay_safe_runs_and_mapping_queries() -> Result<()> { + let temp = TestTempDir::new("sce-reconciliation-schema-tests")?; + let path = temp.path().join("reconciliation.db"); + let runtime = 
tokio::runtime::Builder::new_current_thread().build()?; + + runtime.block_on(apply_core_schema_migrations(LocalDatabaseTarget::Path( + &path, + )))?; + + runtime.block_on(async { + let db = turso::Builder::new_local(path.to_string_lossy().as_ref()) + .build() + .await?; + let conn = db.connect()?; + + conn.execute( + "INSERT INTO repositories (canonical_root) VALUES (?1)", + ["/tmp/reconciliation-repo"], + ) + .await?; + + conn.execute( + "INSERT INTO reconciliation_runs (repository_id, provider, idempotency_key, status) \ + VALUES (?1, ?2, ?3, ?4)", + (1_i64, "github", "run:key:1", "completed"), + ) + .await?; + + conn.execute( + "INSERT INTO conversations (repository_id, url, source) VALUES (?1, ?2, ?3)", + (1_i64, "https://example.dev/conversations/abc", "github"), + ) + .await?; + + conn.execute( + "INSERT INTO rewrite_mappings (\ + reconciliation_run_id, repository_id, old_commit_sha, new_commit_sha,\ + mapping_status, confidence, idempotency_key\ + ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)", + ( + 1_i64, + 1_i64, + "1111111111111111111111111111111111111111", + "2222222222222222222222222222222222222222", + "mapped", + 0.98_f64, + "map:key:1", + ), + ) + .await?; + + let duplicate_run = conn + .execute( + "INSERT INTO reconciliation_runs (repository_id, provider, idempotency_key, status) \ + VALUES (?1, ?2, ?3, ?4)", + (1_i64, "github", "run:key:1", "completed"), + ) + .await; + assert!(duplicate_run.is_err(), "run idempotency key should be unique"); + + let duplicate_mapping = conn + .execute( + "INSERT INTO rewrite_mappings (\ + reconciliation_run_id, repository_id, old_commit_sha, new_commit_sha,\ + mapping_status, confidence, idempotency_key\ + ) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)", + ( + 1_i64, + 1_i64, + "1111111111111111111111111111111111111111", + "3333333333333333333333333333333333333333", + "mapped", + 0.70_f64, + "map:key:1", + ), + ) + .await; + assert!( + duplicate_mapping.is_err(), + "mapping idempotency key should be unique" + ); + + Ok::<(), 
anyhow::Error>(()) + })?; + + let run_count = runtime.block_on(fetch_single_integer( + LocalDatabaseTarget::Path(&path), + "SELECT COUNT(*) FROM reconciliation_runs WHERE repository_id = 1 AND status = 'completed'", + ))?; + assert_eq!(run_count, 1); + + let mapped_count = runtime.block_on(fetch_single_integer( + LocalDatabaseTarget::Path(&path), + "SELECT COUNT(*) FROM rewrite_mappings WHERE repository_id = 1 AND old_commit_sha = '1111111111111111111111111111111111111111'", + ))?; + assert_eq!(mapped_count, 1); + + let joined_mapping_count = runtime.block_on(fetch_single_integer( + LocalDatabaseTarget::Path(&path), + "SELECT COUNT(*) FROM rewrite_mappings m JOIN reconciliation_runs r ON r.id = m.reconciliation_run_id JOIN repositories repo ON repo.id = m.repository_id WHERE r.repository_id = repo.id AND m.old_commit_sha = '1111111111111111111111111111111111111111'", + ))?; + assert_eq!(joined_mapping_count, 1); + + let conversation_count = runtime.block_on(fetch_single_integer( + LocalDatabaseTarget::Path(&path), + "SELECT COUNT(*) FROM conversations WHERE repository_id = 1 AND source = 'github'", + ))?; + assert_eq!(conversation_count, 1); + + Ok(()) + } } diff --git a/context/architecture.md b/context/architecture.md index 59e19999..5cf55038 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -73,7 +73,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/main.rs` is the executable entrypoint (`sce`) and delegates to `app::run`. - `cli/src/app.rs` provides a `lexopt`-based argument parser and dispatch loop with deterministic help, setup installation execution, and consistent `anyhow`-driven error exits. - `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. 
-- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent core schema migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`). +- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`) plus reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`) with replay/query indexes. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. - `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. 
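Note on the reconciliation tables in this patch: replay safety rests on `UNIQUE(repository_id, idempotency_key)`, which the new test exercises by asserting that a duplicate insert fails. The sketch below models that constraint in memory with the standard library only. It is an illustration, not the Turso-backed implementation, and `RunLedger` and `record` are hypothetical names.

```rust
use std::collections::HashSet;

/// In-memory stand-in for a table with UNIQUE(repository_id, idempotency_key).
#[derive(Default)]
struct RunLedger {
    seen: HashSet<(i64, String)>,
}

impl RunLedger {
    /// Records a run; a replayed (repository_id, idempotency_key) pair is
    /// rejected, mirroring the constraint violation the database would raise.
    fn record(&mut self, repository_id: i64, idempotency_key: &str) -> Result<(), String> {
        let key = (repository_id, idempotency_key.to_string());
        // `HashSet::insert` returns false when the value was already present.
        if self.seen.insert(key) {
            Ok(())
        } else {
            Err(format!(
                "duplicate idempotency key for repository {repository_id}: {idempotency_key}"
            ))
        }
    }
}
```

Because uniqueness is scoped per repository, the same key string can be recorded once for each repository; only a repeat within one repository is treated as a replay.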
diff --git a/context/context-map.md b/context/context-map.md index 8bcecc9b..1196b8a2 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -26,6 +26,7 @@ Feature/domain context: - `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) - `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) - `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) +- `context/sce/agent-trace-reconciliation-schema-ingestion.md` (T11 reconciliation persistence schema for `reconciliation_runs`, `rewrite_mappings`, and `conversations` with replay-safe idempotency and query indexes) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index c2feb675..04772c40 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -64,3 +64,4 @@ - `agent trace post-rewrite local remap ingestion`: T08 contract in `cli/src/services/hooks.rs` (`finalize_post_rewrite_remap`) that parses git `post-rewrite` old/new SHA pairs, captures normalized rewrite method (`amend`, `rebase`, or lowercase passthrough), derives deterministic `post-rewrite:::` idempotency keys, and dispatches replay-safe remap ingestion requests. 
- `agent trace rewrite trace transformation`: T09 contract in `cli/src/services/hooks.rs` (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, enforces confidence range normalization (`0.00`..`1.00`), maps quality status thresholds (`final`/`partial`/`needs_review`), and preserves notes+DB persistence parity with retry fallback. - `agent trace core schema migrations`: T10 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that applies idempotent local DB table/index creation for foundational Agent Trace entities (`repositories`, `commits`, `trace_records`, `trace_ranges`) and supports upgrade-safe reapplication on preexisting database state. +- `agent trace reconciliation schema ingestion`: T11 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that extends local DB migrations with hosted rewrite reconciliation entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), per-repository idempotency uniqueness, and query indexes for run status and old->new mapping lookup. diff --git a/context/overview.md b/context/overview.md index 24c654fa..183442b8 100644 --- a/context/overview.md +++ b/context/overview.md @@ -33,6 +33,7 @@ The CLI now also includes a hook rollout doctor contract documented in `context/ The hooks service now also includes a post-rewrite local remap ingestion seam (`finalize_post_rewrite_remap`) that parses `post-rewrite` old->new SHA pairs, normalizes rewrite method capture, and derives deterministic per-pair idempotency keys before remap dispatch; this behavior is documented in `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md`. 
The hooks service now also includes rewrite trace transformation finalization (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, confidence-threshold quality mapping (`final`/`partial`/`needs_review`), and notes+DB persistence parity with retry fallback; this behavior is documented in `context/sce/agent-trace-rewrite-trace-transformation.md`. The local DB service now includes core Agent Trace persistence schema migrations (`apply_core_schema_migrations`) that install idempotent foundational tables and indexes for `repositories`, `commits`, `trace_records`, and `trace_ranges`; this behavior is documented in `context/sce/agent-trace-core-schema-migrations.md`. +The local DB service now also includes reconciliation persistence schema coverage in the same migration entrypoint for hosted rewrite bookkeeping tables (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay/query indexes; this behavior is documented in `context/sce/agent-trace-reconciliation-schema-ingestion.md`. ## Repository model @@ -93,3 +94,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` for the implemented T08 post-rewrite local remap ingestion pipeline (`post-rewrite` pair parsing, rewrite-method normalization, and deterministic idempotency-key derivation). - Use `context/sce/agent-trace-rewrite-trace-transformation.md` for the implemented T09 rewritten-SHA trace transformation path (`finalize_rewrite_trace`), confidence-based quality status mapping, and rewrite metadata persistence semantics. - Use `context/sce/agent-trace-core-schema-migrations.md` for the implemented T10 core local schema migration contract (`apply_core_schema_migrations`) and table/index ownership across foundational Agent Trace persistence entities. 
+- Use `context/sce/agent-trace-reconciliation-schema-ingestion.md` for the implemented T11 reconciliation schema contract (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay-safe idempotency/index coverage. diff --git a/context/patterns.md b/context/patterns.md index 052ee56e..c747735d 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -91,6 +91,7 @@ - For post-rewrite remap ingestion seams, parse ` ` pairs from hook input strictly, ignore empty/no-op self-mapping rows, normalize rewrite method labels to lowercase (`amend`/`rebase` when recognized), and derive deterministic per-pair idempotency keys before dispatching remap requests. - For rewrite trace transformation seams, materialize rewritten records through the canonical Agent Trace builder path, require finite confidence in `[0.0, 1.0]`, normalize confidence to two-decimal metadata strings, map quality thresholds to `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`), and preserve notes+DB dual-write plus retry-fallback parity. - For local persistence rollout, ship core schema changes as idempotent `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` statements so migration reapplication is upgrade-safe across empty and preexisting local Turso DB states. +- For hosted rewrite reconciliation persistence, extend the same migration seam (`apply_core_schema_migrations`) with deterministic schema/index statements and per-repository idempotency uniqueness for run/mapping replay safety. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). 
- In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index c9d11c68..5a8b546f 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -167,7 +167,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml core_schema_migrations` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T11: Ship reconciliation schema and ingestion (`reconciliation_runs`, `rewrite_mappings`, `conversations`) (status:todo) +- [x] T11: Ship reconciliation schema and ingestion (`reconciliation_runs`, `rewrite_mappings`, `conversations`) (status:done) - Task ID: T11 - Goal: Add hosted rewrite persistence and idempotency-backed run bookkeeping. - Boundaries (in/out of scope): @@ -177,6 +177,10 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Reconciliation runs and mappings can be stored and queried reproducibly. - Verification notes (commands or checks): - Referential-integrity tests and representative mapping/replay query checks. 
+ - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo test --manifest-path cli/Cargo.toml core_schema_migrations` + - `cargo test --manifest-path cli/Cargo.toml reconciliation_schema_supports_replay_safe_runs_and_mapping_queries` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T12: Implement hosted event intake and run orchestration (status:todo) - Task ID: T12 diff --git a/context/sce/agent-trace-reconciliation-schema-ingestion.md b/context/sce/agent-trace-reconciliation-schema-ingestion.md new file mode 100644 index 00000000..ca4730a7 --- /dev/null +++ b/context/sce/agent-trace-reconciliation-schema-ingestion.md @@ -0,0 +1,50 @@ +# Agent Trace Reconciliation Schema Ingestion + +## Scope + +- Implements T11 for plan `agent-trace-attribution-no-git-wrapper`. +- Adds hosted-rewrite persistence schema slices for reconciliation bookkeeping and replay-safe mapping ingestion. +- Covers schema/migration behavior only; provider webhook transport remains out of scope. + +## Code ownership + +- Migration entrypoint: `cli/src/services/local_db.rs` (`apply_core_schema_migrations`). +- Schema statement source of truth: `cli/src/services/local_db.rs` (`CORE_SCHEMA_STATEMENTS`). + +## Migration contract + +- New reconciliation entities are installed idempotently with `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS`. +- Reapplying migrations on preexisting DB state remains upgrade-safe. +- Run-level and mapping-level replay protection is enforced through per-repository idempotency uniqueness. + +## Reconciliation tables + +- `reconciliation_runs`: stores provider, run status lifecycle, run timing, and per-repository idempotency key. +- `rewrite_mappings`: stores old/new commit SHA mapping outcomes, confidence, mapping status, and per-repository idempotency key per mapping row. +- `conversations`: stores canonical conversation URLs per repository/source for hosted reconciliation linkage. 
+ +## Indexes + +- `idx_reconciliation_runs_repository_status` on `reconciliation_runs(repository_id, status)`. +- `idx_rewrite_mappings_run_old_sha` on `rewrite_mappings(reconciliation_run_id, old_commit_sha)`. +- `idx_rewrite_mappings_repository_old_sha` on `rewrite_mappings(repository_id, old_commit_sha)`. +- `idx_conversations_repository_source` on `conversations(repository_id, source)`. + +## Validation coverage + +- Schema existence + index presence checks in `core_schema_migrations_create_required_tables_and_indexes`. +- Upgrade-safe reapplication checks in `core_schema_migrations_are_upgrade_safe_for_preexisting_state`. +- Reconciliation replay/query checks in `reconciliation_schema_supports_replay_safe_runs_and_mapping_queries`. + +## Verification evidence + +- `cargo fmt --manifest-path cli/Cargo.toml -- --check` +- `cargo test --manifest-path cli/Cargo.toml core_schema_migrations` +- `cargo test --manifest-path cli/Cargo.toml reconciliation_schema_supports_replay_safe_runs_and_mapping_queries` +- `cargo build --manifest-path cli/Cargo.toml` + +## Related context + +- `context/sce/agent-trace-core-schema-migrations.md` +- `context/sce/agent-trace-rewrite-trace-transformation.md` +- `context/plans/agent-trace-attribution-no-git-wrapper.md` From 9f94f38e49f82341e7ef9441da1b580757189da0 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 15:33:33 +0100 Subject: [PATCH 13/39] agent-trace: Implement hosted webhook intake and replay-safe run orchestration Verifies GitHub and GitLab webhook signatures before parsing rewrite payloads, normalizes provider events into deterministic reconciliation run requests, and enforces replay-safe created-vs-duplicate outcomes via idempotency keys. Adds focused hosted intake tests for signature validation, required field handling, deterministic key derivation, and duplicate replay behavior. 
--- cli/src/services/hosted_reconciliation.rs | 573 ++++++++++++++++++ cli/src/services/mod.rs | 1 + context/architecture.md | 1 + context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 5 +- ...trace-hosted-event-intake-orchestration.md | 47 ++ 9 files changed, 631 insertions(+), 1 deletion(-) create mode 100644 cli/src/services/hosted_reconciliation.rs create mode 100644 context/sce/agent-trace-hosted-event-intake-orchestration.md diff --git a/cli/src/services/hosted_reconciliation.rs b/cli/src/services/hosted_reconciliation.rs new file mode 100644 index 00000000..301ac1db --- /dev/null +++ b/cli/src/services/hosted_reconciliation.rs @@ -0,0 +1,573 @@ +use anyhow::{bail, ensure, Result}; + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum HostedProvider { + GitHub, + GitLab, +} + +impl HostedProvider { + fn as_str(&self) -> &'static str { + match self { + HostedProvider::GitHub => "github", + HostedProvider::GitLab => "gitlab", + } + } +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct HostedWebhookRequest { + pub provider: HostedProvider, + pub event: String, + pub signature: String, + pub delivery_id: Option<String>, + pub shared_secret: String, + pub payload_json: String, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct HostedReconciliationRunRequest { + pub provider: HostedProvider, + pub repository: String, + pub event: String, + pub old_head: String, + pub new_head: String, + pub idempotency_key: String, +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum ReconciliationRunInsertOutcome { + Created, + Duplicate, +} + +pub trait ReconciliationRunStore { + fn insert_run( + &mut self, + request: HostedReconciliationRunRequest, + ) -> Result<ReconciliationRunInsertOutcome>; +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum HostedIntakeOutcome { + Created(HostedReconciliationRunRequest), + Duplicate(HostedReconciliationRunRequest), +} + +pub fn
ingest_hosted_rewrite_event( + request: HostedWebhookRequest, + run_store: &mut impl ReconciliationRunStore, +) -> Result<HostedIntakeOutcome> { + verify_signature(&request)?; + let run_request = parse_run_request(&request)?; + + let outcome = run_store.insert_run(run_request.clone())?; + match outcome { + ReconciliationRunInsertOutcome::Created => Ok(HostedIntakeOutcome::Created(run_request)), + ReconciliationRunInsertOutcome::Duplicate => { + Ok(HostedIntakeOutcome::Duplicate(run_request)) + } + } +} + +fn parse_run_request(request: &HostedWebhookRequest) -> Result<HostedReconciliationRunRequest> { + let old_head = find_required_json_string(&request.payload_json, "before")?; + let new_head = find_required_json_string(&request.payload_json, "after")?; + + ensure!( + is_sha_like(&old_head), + "invalid hosted event payload: 'before' is not a git SHA" + ); + ensure!( + is_sha_like(&new_head), + "invalid hosted event payload: 'after' is not a git SHA" + ); + + let repository = match request.provider { + HostedProvider::GitHub => find_required_json_string(&request.payload_json, "full_name")?, + HostedProvider::GitLab => { + find_required_json_string(&request.payload_json, "path_with_namespace")? 
+ } + }; + + let idempotency_key = derive_idempotency_key( + request.provider, + &request.event, + &repository, + &old_head, + &new_head, + request.delivery_id.as_deref(), + ); + + Ok(HostedReconciliationRunRequest { + provider: request.provider, + repository, + event: request.event.clone(), + old_head, + new_head, + idempotency_key, + }) +} + +fn verify_signature(request: &HostedWebhookRequest) -> Result<()> { + ensure!( + !request.signature.trim().is_empty(), + "missing hosted event signature" + ); + + match request.provider { + HostedProvider::GitHub => { + let expected = github_signature(&request.shared_secret, &request.payload_json); + ensure!( + constant_time_eq(request.signature.as_bytes(), expected.as_bytes()), + "hosted event signature verification failed for github" + ); + } + HostedProvider::GitLab => { + ensure!( + constant_time_eq( + request.signature.as_bytes(), + request.shared_secret.as_bytes() + ), + "hosted event signature verification failed for gitlab" + ); + } + } + + Ok(()) +} + +fn derive_idempotency_key( + provider: HostedProvider, + event: &str, + repository: &str, + old_head: &str, + new_head: &str, + delivery_id: Option<&str>, +) -> String { + let delivery = delivery_id.unwrap_or("no-delivery-id"); + let material = format!( + "provider={};event={};repo={};before={};after={};delivery={}", + provider.as_str(), + event, + repository, + old_head, + new_head, + delivery + ); + let digest = hex_lower(&sha256(material.as_bytes())); + format!("hosted:{}:{}", provider.as_str(), digest) +} + +fn find_required_json_string(payload: &str, key: &str) -> Result<String> { + let key_pattern = format!("\"{}\"", key); + let Some(key_start) = payload.find(&key_pattern) else { + bail!("invalid hosted event payload: missing '{}' field", key); + }; + + let mut idx = key_start + key_pattern.len(); + while idx < payload.len() && payload.as_bytes()[idx].is_ascii_whitespace() { + idx += 1; + } + + ensure!( + idx < payload.len() && payload.as_bytes()[idx] == b':', + 
"invalid hosted event payload: malformed '{}' field", + key + ); + idx += 1; + + while idx < payload.len() && payload.as_bytes()[idx].is_ascii_whitespace() { + idx += 1; + } + + ensure!( + idx < payload.len() && payload.as_bytes()[idx] == b'"', + "invalid hosted event payload: '{}' field must be a string", + key + ); + idx += 1; + + let mut value = String::new(); + let mut escaped = false; + while idx < payload.len() { + let byte = payload.as_bytes()[idx]; + idx += 1; + if escaped { + value.push(byte as char); + escaped = false; + continue; + } + + if byte == b'\\' { + escaped = true; + continue; + } + + if byte == b'"' { + return Ok(value); + } + + value.push(byte as char); + } + + bail!( + "invalid hosted event payload: unterminated '{}' string", + key + ) +} + +fn is_sha_like(value: &str) -> bool { + value.len() == 40 && value.chars().all(|ch| ch.is_ascii_hexdigit()) +} + +fn github_signature(secret: &str, payload: &str) -> String { + let mac = hmac_sha256(secret.as_bytes(), payload.as_bytes()); + format!("sha256={}", hex_lower(&mac)) +} + +fn constant_time_eq(left: &[u8], right: &[u8]) -> bool { + if left.len() != right.len() { + return false; + } + + let mut diff: u8 = 0; + for (lhs, rhs) in left.iter().zip(right.iter()) { + diff |= lhs ^ rhs; + } + + diff == 0 +} + +fn hmac_sha256(key: &[u8], message: &[u8]) -> [u8; 32] { + const BLOCK_SIZE: usize = 64; + let mut key_block = [0_u8; BLOCK_SIZE]; + + if key.len() > BLOCK_SIZE { + let hashed = sha256(key); + key_block[..hashed.len()].copy_from_slice(&hashed); + } else { + key_block[..key.len()].copy_from_slice(key); + } + + let mut inner_pad = [0_u8; BLOCK_SIZE]; + let mut outer_pad = [0_u8; BLOCK_SIZE]; + for idx in 0..BLOCK_SIZE { + inner_pad[idx] = key_block[idx] ^ 0x36; + outer_pad[idx] = key_block[idx] ^ 0x5c; + } + + let mut inner_input = Vec::with_capacity(BLOCK_SIZE + message.len()); + inner_input.extend_from_slice(&inner_pad); + inner_input.extend_from_slice(message); + let inner_hash = 
sha256(&inner_input); + + let mut outer_input = Vec::with_capacity(BLOCK_SIZE + inner_hash.len()); + outer_input.extend_from_slice(&outer_pad); + outer_input.extend_from_slice(&inner_hash); + + sha256(&outer_input) +} + +fn sha256(input: &[u8]) -> [u8; 32] { + const K: [u32; 64] = [ + 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, + 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, + 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, + 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, + 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, + 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, + 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, + 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, + 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, + 0xc67178f2, + ]; + + let mut h: [u32; 8] = [ + 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, + 0x5be0cd19, + ]; + + let mut padded = input.to_vec(); + let bit_len = (padded.len() as u64) * 8; + padded.push(0x80); + while (padded.len() % 64) != 56 { + padded.push(0); + } + padded.extend_from_slice(&bit_len.to_be_bytes()); + + let mut message_schedule = [0_u32; 64]; + for chunk in padded.chunks_exact(64) { + for (idx, word) in chunk.chunks_exact(4).take(16).enumerate() { + message_schedule[idx] = u32::from_be_bytes([word[0], word[1], word[2], word[3]]); + } + + for idx in 16..64 { + let s0 = message_schedule[idx - 15].rotate_right(7) + ^ message_schedule[idx - 15].rotate_right(18) + ^ (message_schedule[idx - 15] >> 3); + let s1 = message_schedule[idx - 2].rotate_right(17) + ^ message_schedule[idx - 2].rotate_right(19) + ^ (message_schedule[idx - 2] >> 10); + message_schedule[idx] = 
message_schedule[idx - 16] + .wrapping_add(s0) + .wrapping_add(message_schedule[idx - 7]) + .wrapping_add(s1); + } + + let mut a = h[0]; + let mut b = h[1]; + let mut c = h[2]; + let mut d = h[3]; + let mut e = h[4]; + let mut f = h[5]; + let mut g = h[6]; + let mut hh = h[7]; + + for idx in 0..64 { + let s1 = e.rotate_right(6) ^ e.rotate_right(11) ^ e.rotate_right(25); + let ch = (e & f) ^ ((!e) & g); + let temp1 = hh + .wrapping_add(s1) + .wrapping_add(ch) + .wrapping_add(K[idx]) + .wrapping_add(message_schedule[idx]); + let s0 = a.rotate_right(2) ^ a.rotate_right(13) ^ a.rotate_right(22); + let maj = (a & b) ^ (a & c) ^ (b & c); + let temp2 = s0.wrapping_add(maj); + + hh = g; + g = f; + f = e; + e = d.wrapping_add(temp1); + d = c; + c = b; + b = a; + a = temp1.wrapping_add(temp2); + } + + h[0] = h[0].wrapping_add(a); + h[1] = h[1].wrapping_add(b); + h[2] = h[2].wrapping_add(c); + h[3] = h[3].wrapping_add(d); + h[4] = h[4].wrapping_add(e); + h[5] = h[5].wrapping_add(f); + h[6] = h[6].wrapping_add(g); + h[7] = h[7].wrapping_add(hh); + } + + let mut output = [0_u8; 32]; + for (idx, value) in h.iter().enumerate() { + output[idx * 4..idx * 4 + 4].copy_from_slice(&value.to_be_bytes()); + } + output +} + +fn hex_lower(bytes: &[u8]) -> String { + const HEX: &[u8; 16] = b"0123456789abcdef"; + let mut output = String::with_capacity(bytes.len() * 2); + for byte in bytes { + output.push(HEX[(byte >> 4) as usize] as char); + output.push(HEX[(byte & 0x0f) as usize] as char); + } + output +} + +#[cfg(test)] +mod tests { + use std::collections::HashSet; + + use anyhow::Result; + + use super::{ + derive_idempotency_key, github_signature, ingest_hosted_rewrite_event, HostedIntakeOutcome, + HostedProvider, HostedReconciliationRunRequest, HostedWebhookRequest, + ReconciliationRunInsertOutcome, ReconciliationRunStore, + }; + + #[derive(Default)] + struct FakeReconciliationRunStore { + inserted: Vec<HostedReconciliationRunRequest>, + seen_keys: HashSet<String>, + } + + impl ReconciliationRunStore for 
FakeReconciliationRunStore { + fn insert_run( + &mut self, + request: HostedReconciliationRunRequest, + ) -> Result<ReconciliationRunInsertOutcome> { + self.inserted.push(request.clone()); + let inserted = self.seen_keys.insert(request.idempotency_key); + Ok(if inserted { + ReconciliationRunInsertOutcome::Created + } else { + ReconciliationRunInsertOutcome::Duplicate + }) + } + } + + fn github_payload() -> String { + "{\"before\":\"1111111111111111111111111111111111111111\",\"after\":\"2222222222222222222222222222222222222222\",\"repository\":{\"full_name\":\"acme/sce\"}}".to_string() + } + + fn gitlab_payload() -> String { + "{\"before\":\"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\",\"after\":\"bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\",\"project\":{\"path_with_namespace\":\"group/sce\"}}".to_string() + } + + #[test] + fn github_intake_verifies_signature_and_creates_run() -> Result<()> { + let payload = github_payload(); + let mut store = FakeReconciliationRunStore::default(); + let request = HostedWebhookRequest { + provider: HostedProvider::GitHub, + event: "push".to_string(), + signature: github_signature("super-secret", &payload), + delivery_id: Some("delivery-1".to_string()), + shared_secret: "super-secret".to_string(), + payload_json: payload, + }; + + let outcome = ingest_hosted_rewrite_event(request, &mut store)?; + match outcome { + HostedIntakeOutcome::Created(run) => { + assert_eq!(run.repository, "acme/sce"); + assert_eq!(run.old_head, "1111111111111111111111111111111111111111"); + assert_eq!(run.new_head, "2222222222222222222222222222222222222222"); + assert!(run.idempotency_key.starts_with("hosted:github:")); + } + other => panic!("unexpected outcome: {other:?}"), + } + + assert_eq!(store.inserted.len(), 1); + Ok(()) + } + + #[test] + fn github_intake_rejects_invalid_signature() { + let payload = github_payload(); + let mut store = FakeReconciliationRunStore::default(); + let request = HostedWebhookRequest { + provider: HostedProvider::GitHub, + event: "push".to_string(), + 
signature: "sha256=deadbeef".to_string(), + delivery_id: Some("delivery-1".to_string()), + shared_secret: "super-secret".to_string(), + payload_json: payload, + }; + + let error = ingest_hosted_rewrite_event(request, &mut store).expect_err("must fail"); + assert!(error + .to_string() + .contains("hosted event signature verification failed for github")); + assert!(store.inserted.is_empty()); + } + + #[test] + fn gitlab_intake_verifies_token_and_creates_run() -> Result<()> { + let mut store = FakeReconciliationRunStore::default(); + let request = HostedWebhookRequest { + provider: HostedProvider::GitLab, + event: "push".to_string(), + signature: "gitlab-secret".to_string(), + delivery_id: Some("event-42".to_string()), + shared_secret: "gitlab-secret".to_string(), + payload_json: gitlab_payload(), + }; + + let outcome = ingest_hosted_rewrite_event(request, &mut store)?; + match outcome { + HostedIntakeOutcome::Created(run) => { + assert_eq!(run.repository, "group/sce"); + assert_eq!(run.old_head, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"); + assert_eq!(run.new_head, "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"); + assert!(run.idempotency_key.starts_with("hosted:gitlab:")); + } + other => panic!("unexpected outcome: {other:?}"), + } + + assert_eq!(store.inserted.len(), 1); + Ok(()) + } + + #[test] + fn duplicate_hosted_event_is_replay_safe() -> Result<()> { + let payload = github_payload(); + let signature = github_signature("super-secret", &payload); + let mut store = FakeReconciliationRunStore::default(); + + let first = HostedWebhookRequest { + provider: HostedProvider::GitHub, + event: "push".to_string(), + signature: signature.clone(), + delivery_id: Some("delivery-1".to_string()), + shared_secret: "super-secret".to_string(), + payload_json: payload.clone(), + }; + let second = HostedWebhookRequest { + provider: HostedProvider::GitHub, + event: "push".to_string(), + signature, + delivery_id: Some("delivery-1".to_string()), + shared_secret: 
"super-secret".to_string(), + payload_json: payload, + }; + + let first_outcome = ingest_hosted_rewrite_event(first, &mut store)?; + let second_outcome = ingest_hosted_rewrite_event(second, &mut store)?; + + assert!(matches!(first_outcome, HostedIntakeOutcome::Created(_))); + assert!(matches!(second_outcome, HostedIntakeOutcome::Duplicate(_))); + assert_eq!(store.inserted.len(), 2); + Ok(()) + } + + #[test] + fn intake_requires_before_after_and_repository_fields() { + let payload = "{\"after\":\"2222222222222222222222222222222222222222\"}".to_string(); + let mut store = FakeReconciliationRunStore::default(); + let request = HostedWebhookRequest { + provider: HostedProvider::GitHub, + event: "push".to_string(), + signature: github_signature("super-secret", &payload), + delivery_id: Some("delivery-1".to_string()), + shared_secret: "super-secret".to_string(), + payload_json: payload, + }; + + let error = ingest_hosted_rewrite_event(request, &mut store).expect_err("must fail"); + assert!(error + .to_string() + .contains("invalid hosted event payload: missing 'before' field")); + } + + #[test] + fn idempotency_key_is_deterministic() { + let key_a = derive_idempotency_key( + HostedProvider::GitHub, + "push", + "acme/sce", + "1111111111111111111111111111111111111111", + "2222222222222222222222222222222222222222", + Some("delivery-1"), + ); + let key_b = derive_idempotency_key( + HostedProvider::GitHub, + "push", + "acme/sce", + "1111111111111111111111111111111111111111", + "2222222222222222222222222222222222222222", + Some("delivery-1"), + ); + let key_c = derive_idempotency_key( + HostedProvider::GitHub, + "push", + "acme/sce", + "1111111111111111111111111111111111111111", + "3333333333333333333333333333333333333333", + Some("delivery-1"), + ); + + assert_eq!(key_a, key_b); + assert_ne!(key_a, key_c); + } +} diff --git a/cli/src/services/mod.rs b/cli/src/services/mod.rs index 21778d35..a6337780 100644 --- a/cli/src/services/mod.rs +++ b/cli/src/services/mod.rs @@ -1,6 
+1,7 @@ pub mod agent_trace; pub mod doctor; pub mod hooks; +pub mod hosted_reconciliation; pub mod local_db; pub mod mcp; pub mod setup; diff --git a/context/architecture.md b/context/architecture.md index 5cf55038..21c02924 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -80,6 +80,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. - `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. 
+- `cli/src/services/hosted_reconciliation.rs` defines hosted intake/orchestration seams (`ingest_hosted_rewrite_event`, `ReconciliationRunStore`) that verify provider signatures (GitHub HMAC-SHA256 and GitLab token equality), parse provider payload old/new heads, and normalize deterministic idempotency-backed reconciliation run requests. - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. - `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. diff --git a/context/context-map.md b/context/context-map.md index 1196b8a2..912b4857 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -27,6 +27,7 @@ Feature/domain context: - `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) - `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) - `context/sce/agent-trace-reconciliation-schema-ingestion.md` (T11 reconciliation persistence schema for `reconciliation_runs`, `rewrite_mappings`, and `conversations` with replay-safe idempotency and query indexes) +- `context/sce/agent-trace-hosted-event-intake-orchestration.md` (T12 hosted GitHub/GitLab event intake contract with signature verification, old/new head resolution, and deterministic reconciliation-run idempotency keys) Working areas: - 
`context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 04772c40..da83ea25 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -65,3 +65,4 @@ - `agent trace rewrite trace transformation`: T09 contract in `cli/src/services/hooks.rs` (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, enforces confidence range normalization (`0.00`..`1.00`), maps quality status thresholds (`final`/`partial`/`needs_review`), and preserves notes+DB persistence parity with retry fallback. - `agent trace core schema migrations`: T10 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that applies idempotent local DB table/index creation for foundational Agent Trace entities (`repositories`, `commits`, `trace_records`, `trace_ranges`) and supports upgrade-safe reapplication on preexisting database state. - `agent trace reconciliation schema ingestion`: T11 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that extends local DB migrations with hosted rewrite reconciliation entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), per-repository idempotency uniqueness, and query indexes for run status and old->new mapping lookup. +- `agent trace hosted event intake orchestration`: T12 contract in `cli/src/services/hosted_reconciliation.rs` (`ingest_hosted_rewrite_event`) that verifies GitHub/GitLab webhook signatures, resolves provider payload old/new heads, normalizes provider/repo/event identity into deterministic `hosted:<provider>:<digest>` reconciliation run idempotency keys, and returns replay-safe created-vs-duplicate run outcomes through `ReconciliationRunStore`. 
diff --git a/context/overview.md b/context/overview.md index 183442b8..89ef60b0 100644 --- a/context/overview.md +++ b/context/overview.md @@ -34,6 +34,7 @@ The hooks service now also includes a post-rewrite local remap ingestion seam (` The hooks service now also includes rewrite trace transformation finalization (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, confidence-threshold quality mapping (`final`/`partial`/`needs_review`), and notes+DB persistence parity with retry fallback; this behavior is documented in `context/sce/agent-trace-rewrite-trace-transformation.md`. The local DB service now includes core Agent Trace persistence schema migrations (`apply_core_schema_migrations`) that install idempotent foundational tables and indexes for `repositories`, `commits`, `trace_records`, and `trace_ranges`; this behavior is documented in `context/sce/agent-trace-core-schema-migrations.md`. The local DB service now also includes reconciliation persistence schema coverage in the same migration entrypoint for hosted rewrite bookkeeping tables (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay/query indexes; this behavior is documented in `context/sce/agent-trace-reconciliation-schema-ingestion.md`. +The CLI now also includes a hosted event intake/orchestration seam in `cli/src/services/hosted_reconciliation.rs` that verifies provider signatures, resolves old/new commit heads from GitHub/GitLab payloads, and creates deterministic replay-safe reconciliation run requests; this behavior is documented in `context/sce/agent-trace-hosted-event-intake-orchestration.md`. 
## Repository model @@ -95,3 +96,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-rewrite-trace-transformation.md` for the implemented T09 rewritten-SHA trace transformation path (`finalize_rewrite_trace`), confidence-based quality status mapping, and rewrite metadata persistence semantics. - Use `context/sce/agent-trace-core-schema-migrations.md` for the implemented T10 core local schema migration contract (`apply_core_schema_migrations`) and table/index ownership across foundational Agent Trace persistence entities. - Use `context/sce/agent-trace-reconciliation-schema-ingestion.md` for the implemented T11 reconciliation schema contract (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay-safe idempotency/index coverage. +- Use `context/sce/agent-trace-hosted-event-intake-orchestration.md` for the implemented T12 hosted intake contract (GitHub/GitLab signature verification, old/new head resolution, deterministic reconciliation-run idempotency keys, and replay-safe run insertion outcomes). diff --git a/context/patterns.md b/context/patterns.md index c747735d..93e3ff5d 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -92,6 +92,7 @@ - For rewrite trace transformation seams, materialize rewritten records through the canonical Agent Trace builder path, require finite confidence in `[0.0, 1.0]`, normalize confidence to two-decimal metadata strings, map quality thresholds to `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`), and preserve notes+DB dual-write plus retry-fallback parity. - For local persistence rollout, ship core schema changes as idempotent `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` statements so migration reapplication is upgrade-safe across empty and preexisting local Turso DB states. 
- For hosted rewrite reconciliation persistence, extend the same migration seam (`apply_core_schema_migrations`) with deterministic schema/index statements and per-repository idempotency uniqueness for run/mapping replay safety. +- For hosted event intake seams, verify provider signatures before payload parsing (GitHub `sha256=` HMAC over body, GitLab token-equality secret check), resolve old/new heads from provider payload fields, and derive deterministic reconciliation run idempotency keys from provider+event+repo+head tuple material. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 5a8b546f..4d28ce1c 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -182,7 +182,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml reconciliation_schema_supports_replay_safe_runs_and_mapping_queries` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T12: Implement hosted event intake and run orchestration (status:todo) +- [x] T12: Implement hosted event intake and run orchestration (status:done) - Task ID: T12 - Goal: Accept GitHub/GitLab webhook events, verify signatures, and create replay-safe runs. 
- Boundaries (in/out of scope): @@ -192,6 +192,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Duplicate events do not create duplicate side effects. - Verification notes (commands or checks): - Webhook signature and replay tests per provider. + - `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation` + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T13: Implement mapping engine (patch-id, range-diff, fuzzy fallback) (status:todo) - Task ID: T13 diff --git a/context/sce/agent-trace-hosted-event-intake-orchestration.md b/context/sce/agent-trace-hosted-event-intake-orchestration.md new file mode 100644 index 00000000..8106537e --- /dev/null +++ b/context/sce/agent-trace-hosted-event-intake-orchestration.md @@ -0,0 +1,47 @@ +# Agent Trace Hosted Event Intake Orchestration + +## Scope + +- Implements T12 for plan `agent-trace-attribution-no-git-wrapper`. +- Accepts hosted provider rewrite events and turns them into replay-safe reconciliation run requests. +- Covers provider parsing/signature/idempotency intake only; mapping heuristics remain out of scope (`T13`). + +## Code ownership + +- Hosted intake service: `cli/src/services/hosted_reconciliation.rs`. +- Public intake seam: `ingest_hosted_rewrite_event`. +- Service module registration: `cli/src/services/mod.rs`. + +## Intake contract + +- Provider coverage is explicit for GitHub and GitLab (`HostedProvider`). +- GitHub webhook signatures use HMAC-SHA256 and require `sha256=` match against payload body. +- GitLab webhook signatures use token equality against the configured shared secret. +- Intake requires resolvable rewrite heads (`before`, `after`) and provider-specific repository identity (`full_name` for GitHub, `path_with_namespace` for GitLab). +- `before` and `after` values must be SHA-like 40-char hex commit IDs. 
+ +## Reconciliation run orchestration contract + +- Provider events are normalized into `HostedReconciliationRunRequest` with provider, repo, event, old/new heads, and deterministic idempotency key. +- Deterministic replay key derivation uses provider + event + repo + old/new heads + delivery ID material and SHA256 digesting. +- Run storage is abstracted behind `ReconciliationRunStore`; ingestion returns created vs duplicate outcome (`ReconciliationRunInsertOutcome`) for replay-safe semantics. + +## Validation coverage + +- GitHub signature verification + run creation. +- GitLab token verification + run creation. +- Duplicate event replay behavior returns duplicate outcome without creating a new side effect class. +- Required payload field validation for old/new head resolution. +- Deterministic idempotency key stability for identical inputs. + +## Verification evidence + +- `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation` +- `cargo fmt --manifest-path cli/Cargo.toml -- --check` +- `cargo build --manifest-path cli/Cargo.toml` + +## Related context + +- `context/sce/agent-trace-implementation-contract.md` +- `context/sce/agent-trace-reconciliation-schema-ingestion.md` +- `context/plans/agent-trace-attribution-no-git-wrapper.md` From a919ed2fbd6a0fe6d08b55b17b12fa9d884116ab Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 15:43:21 +0100 Subject: [PATCH 14/39] agent-trace: Implement deterministic rewrite mapping engine Add hosted rewrite mapping resolution with strict precedence: patch-id exact match, then range-diff scoring, then fuzzy fallback. 
Classify unresolved outcomes for ambiguous, unmatched, and low-confidence cases using a 0.60 threshold, and add deterministic fixture tests --- cli/src/services/hosted_reconciliation.rs | 380 +++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + .../agent-trace-attribution-no-git-wrapper.md | 5 +- .../sce/agent-trace-rewrite-mapping-engine.md | 77 ++++ 8 files changed, 464 insertions(+), 5 deletions(-) create mode 100644 context/sce/agent-trace-rewrite-mapping-engine.md diff --git a/cli/src/services/hosted_reconciliation.rs b/cli/src/services/hosted_reconciliation.rs index 301ac1db..fa5133b3 100644 --- a/cli/src/services/hosted_reconciliation.rs +++ b/cli/src/services/hosted_reconciliation.rs @@ -54,6 +54,255 @@ pub enum HostedIntakeOutcome { Duplicate(HostedReconciliationRunRequest), } +pub const FUZZY_MAPPING_THRESHOLD: f32 = 0.60; + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum MappingMethod { + PatchIdExact, + RangeDiffHint, + FuzzyFallback, +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum MappingQuality { + Final, + Partial, + NeedsReview, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RewriteSourceCommit { + pub old_commit_sha: String, + pub patch_id: Option<String>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RewriteCandidateCommit { + pub new_commit_sha: String, + pub patch_id: Option<String>, + pub range_diff_score: Option<Score>, + pub fuzzy_score: Option<Score>, +} + +#[derive(Clone, Copy, Debug, PartialEq)] +pub struct Score(f32); + +impl Eq for Score {} + +impl Score { + pub fn new(value: f32) -> Result<Self> { + ensure!(value.is_finite(), "mapping score must be finite"); + ensure!( + (0.0..=1.0).contains(&value), + "mapping score must be within [0.0, 1.0]" + ); + Ok(Self(value)) + } + + fn value(self) -> f32 { + self.0 + } +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RewrittenCommitMapping { + pub old_commit_sha: String, + pub
new_commit_sha: String, + pub method: MappingMethod, + pub confidence: Score, + pub quality: MappingQuality, +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum UnresolvedMappingKind { + Ambiguous, + Unmatched, + LowConfidence, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct UnresolvedRewriteMapping { + pub old_commit_sha: String, + pub kind: UnresolvedMappingKind, + pub reason: String, + pub candidate_new_shas: Vec<String>, + pub best_confidence: Option<Score>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum RewriteMappingOutcome { + Mapped(RewrittenCommitMapping), + Unresolved(UnresolvedRewriteMapping), +} + +pub fn map_rewritten_commit( + source: &RewriteSourceCommit, + candidates: &[RewriteCandidateCommit], +) -> RewriteMappingOutcome { + if candidates.is_empty() { + return RewriteMappingOutcome::Unresolved(UnresolvedRewriteMapping { + old_commit_sha: source.old_commit_sha.clone(), + kind: UnresolvedMappingKind::Unmatched, + reason: "no candidate rewritten commits were provided".to_string(), + candidate_new_shas: Vec::new(), + best_confidence: None, + }); + } + + let mut sorted = candidates.to_vec(); + sorted.sort_by(|left, right| left.new_commit_sha.cmp(&right.new_commit_sha)); + + if let Some(source_patch_id) = source.patch_id.as_deref() { + let patch_matches: Vec<&RewriteCandidateCommit> = sorted + .iter() + .filter(|candidate| candidate.patch_id.as_deref() == Some(source_patch_id)) + .collect(); + + if patch_matches.len() == 1 { + return RewriteMappingOutcome::Mapped(RewrittenCommitMapping { + old_commit_sha: source.old_commit_sha.clone(), + new_commit_sha: patch_matches[0].new_commit_sha.clone(), + method: MappingMethod::PatchIdExact, + confidence: Score(1.0), + quality: MappingQuality::Final, + }); + } + + if patch_matches.len() > 1 { + return RewriteMappingOutcome::Unresolved(UnresolvedRewriteMapping { + old_commit_sha: source.old_commit_sha.clone(), + kind: UnresolvedMappingKind::Ambiguous, + reason: "multiple rewritten commits matched exact
patch-id".to_string(), + candidate_new_shas: patch_matches + .iter() + .map(|candidate| candidate.new_commit_sha.clone()) + .collect(), + best_confidence: Some(Score(1.0)), + }); + } + } + + if let Some(range_decision) = + select_scored_candidate(&sorted, |candidate| candidate.range_diff_score) + { + return outcome_from_score_decision(source, range_decision, MappingMethod::RangeDiffHint); + } + + if let Some(fuzzy_decision) = + select_scored_candidate(&sorted, |candidate| candidate.fuzzy_score) + { + return outcome_from_score_decision(source, fuzzy_decision, MappingMethod::FuzzyFallback); + } + + RewriteMappingOutcome::Unresolved(UnresolvedRewriteMapping { + old_commit_sha: source.old_commit_sha.clone(), + kind: UnresolvedMappingKind::Unmatched, + reason: "no range-diff or fuzzy mapping signals were available".to_string(), + candidate_new_shas: sorted + .iter() + .map(|candidate| candidate.new_commit_sha.clone()) + .collect(), + best_confidence: None, + }) +} + +#[derive(Clone, Debug)] +struct ScoreDecision { + top_new_sha: String, + top_score: Score, + tied_new_shas: Vec<String>, +} + +fn select_scored_candidate( + candidates: &[RewriteCandidateCommit], + score_of: impl Fn(&RewriteCandidateCommit) -> Option<Score>, +) -> Option<ScoreDecision> { + let mut top_new_sha: Option<String> = None; + let mut top_score: Option<Score> = None; + let mut tied_new_shas: Vec<String> = Vec::new(); + + for candidate in candidates { + let Some(score) = score_of(candidate) else { + continue; + }; + + match top_score { + None => { + top_score = Some(score); + top_new_sha = Some(candidate.new_commit_sha.clone()); + tied_new_shas.push(candidate.new_commit_sha.clone()); + } + Some(current) if score.value() > current.value() => { + top_score = Some(score); + top_new_sha = Some(candidate.new_commit_sha.clone()); + tied_new_shas.clear(); + tied_new_shas.push(candidate.new_commit_sha.clone()); + } + Some(current) if score.value() == current.value() => { + tied_new_shas.push(candidate.new_commit_sha.clone()); + } + Some(_) => {} + } + } + +
match (top_new_sha, top_score) { + (Some(top_new_sha), Some(top_score)) => Some(ScoreDecision { + top_new_sha, + top_score, + tied_new_shas, + }), + _ => None, + } +} + +fn outcome_from_score_decision( + source: &RewriteSourceCommit, + decision: ScoreDecision, + method: MappingMethod, +) -> RewriteMappingOutcome { + if decision.tied_new_shas.len() > 1 { + return RewriteMappingOutcome::Unresolved(UnresolvedRewriteMapping { + old_commit_sha: source.old_commit_sha.clone(), + kind: UnresolvedMappingKind::Ambiguous, + reason: "multiple rewritten commits tied for best score".to_string(), + candidate_new_shas: decision.tied_new_shas, + best_confidence: Some(decision.top_score), + }); + } + + if decision.top_score.value() < FUZZY_MAPPING_THRESHOLD { + return RewriteMappingOutcome::Unresolved(UnresolvedRewriteMapping { + old_commit_sha: source.old_commit_sha.clone(), + kind: UnresolvedMappingKind::LowConfidence, + reason: format!( + "best mapping score {:.2} is below threshold {:.2}", + decision.top_score.value(), + FUZZY_MAPPING_THRESHOLD + ), + candidate_new_shas: vec![decision.top_new_sha], + best_confidence: Some(decision.top_score), + }); + } + + RewriteMappingOutcome::Mapped(RewrittenCommitMapping { + old_commit_sha: source.old_commit_sha.clone(), + new_commit_sha: decision.top_new_sha, + method, + confidence: decision.top_score, + quality: quality_for_confidence(decision.top_score), + }) +} + +fn quality_for_confidence(confidence: Score) -> MappingQuality { + if confidence.value() >= 0.90 { + MappingQuality::Final + } else if confidence.value() >= FUZZY_MAPPING_THRESHOLD { + MappingQuality::Partial + } else { + MappingQuality::NeedsReview + } +} + pub fn ingest_hosted_rewrite_event( request: HostedWebhookRequest, run_store: &mut impl ReconciliationRunStore, @@ -380,9 +629,11 @@ mod tests { use anyhow::Result; use super::{ - derive_idempotency_key, github_signature, ingest_hosted_rewrite_event, HostedIntakeOutcome, - HostedProvider, HostedReconciliationRunRequest, 
HostedWebhookRequest, - ReconciliationRunInsertOutcome, ReconciliationRunStore, + derive_idempotency_key, github_signature, ingest_hosted_rewrite_event, + map_rewritten_commit, HostedIntakeOutcome, HostedProvider, HostedReconciliationRunRequest, + HostedWebhookRequest, MappingMethod, MappingQuality, ReconciliationRunInsertOutcome, + ReconciliationRunStore, RewriteCandidateCommit, RewriteMappingOutcome, RewriteSourceCommit, + Score, UnresolvedMappingKind, }; #[derive(Default)] @@ -570,4 +821,127 @@ mod tests { assert_eq!(key_a, key_b); assert_ne!(key_a, key_c); } + + fn score(value: f32) -> Score { + Score::new(value).expect("score fixture must be valid") + } + + #[test] + fn mapping_engine_prefers_exact_patch_id_match() { + let source = RewriteSourceCommit { + old_commit_sha: "old-1".to_string(), + patch_id: Some("patch-abc".to_string()), + }; + let candidates = vec![ + RewriteCandidateCommit { + new_commit_sha: "new-b".to_string(), + patch_id: Some("patch-other".to_string()), + range_diff_score: Some(score(0.98)), + fuzzy_score: Some(score(0.98)), + }, + RewriteCandidateCommit { + new_commit_sha: "new-a".to_string(), + patch_id: Some("patch-abc".to_string()), + range_diff_score: Some(score(0.65)), + fuzzy_score: Some(score(0.64)), + }, + ]; + + let outcome = map_rewritten_commit(&source, &candidates); + match outcome { + RewriteMappingOutcome::Mapped(mapped) => { + assert_eq!(mapped.new_commit_sha, "new-a"); + assert_eq!(mapped.method, MappingMethod::PatchIdExact); + assert_eq!(mapped.confidence, score(1.0)); + assert_eq!(mapped.quality, MappingQuality::Final); + } + other => panic!("expected mapped outcome, got {other:?}"), + } + } + + #[test] + fn mapping_engine_reports_ambiguous_on_tied_best_scores() { + let source = RewriteSourceCommit { + old_commit_sha: "old-2".to_string(), + patch_id: None, + }; + let candidates = vec![ + RewriteCandidateCommit { + new_commit_sha: "new-z".to_string(), + patch_id: None, + range_diff_score: Some(score(0.82)), + fuzzy_score: 
Some(score(0.40)), + }, + RewriteCandidateCommit { + new_commit_sha: "new-a".to_string(), + patch_id: None, + range_diff_score: Some(score(0.82)), + fuzzy_score: Some(score(0.79)), + }, + ]; + + let outcome = map_rewritten_commit(&source, &candidates); + match outcome { + RewriteMappingOutcome::Unresolved(unresolved) => { + assert_eq!(unresolved.kind, UnresolvedMappingKind::Ambiguous); + assert_eq!( + unresolved.candidate_new_shas, + vec!["new-a".to_string(), "new-z".to_string()] + ); + assert_eq!(unresolved.best_confidence, Some(score(0.82))); + } + other => panic!("expected ambiguous unresolved outcome, got {other:?}"), + } + } + + #[test] + fn mapping_engine_reports_unmatched_when_no_signals_exist() { + let source = RewriteSourceCommit { + old_commit_sha: "old-3".to_string(), + patch_id: None, + }; + let candidates = vec![RewriteCandidateCommit { + new_commit_sha: "new-a".to_string(), + patch_id: None, + range_diff_score: None, + fuzzy_score: None, + }]; + + let outcome = map_rewritten_commit(&source, &candidates); + match outcome { + RewriteMappingOutcome::Unresolved(unresolved) => { + assert_eq!(unresolved.kind, UnresolvedMappingKind::Unmatched); + assert_eq!( + unresolved.reason, + "no range-diff or fuzzy mapping signals were available" + ); + assert_eq!(unresolved.best_confidence, None); + } + other => panic!("expected unmatched unresolved outcome, got {other:?}"), + } + } + + #[test] + fn mapping_engine_reports_low_confidence_below_threshold() { + let source = RewriteSourceCommit { + old_commit_sha: "old-4".to_string(), + patch_id: None, + }; + let candidates = vec![RewriteCandidateCommit { + new_commit_sha: "new-a".to_string(), + patch_id: None, + range_diff_score: None, + fuzzy_score: Some(score(0.59)), + }]; + + let outcome = map_rewritten_commit(&source, &candidates); + match outcome { + RewriteMappingOutcome::Unresolved(unresolved) => { + assert_eq!(unresolved.kind, UnresolvedMappingKind::LowConfidence); + assert_eq!(unresolved.candidate_new_shas, 
vec!["new-a".to_string()]); + assert_eq!(unresolved.best_confidence, Some(score(0.59))); + } + other => panic!("expected low-confidence unresolved outcome, got {other:?}"), + } + } } diff --git a/context/architecture.md b/context/architecture.md index 21c02924..fefd4bbf 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -80,7 +80,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. - `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. 
-- `cli/src/services/hosted_reconciliation.rs` defines hosted intake/orchestration seams (`ingest_hosted_rewrite_event`, `ReconciliationRunStore`) that verify provider signatures (GitHub HMAC-SHA256 and GitLab token equality), parse provider payload old/new heads, and normalize deterministic idempotency-backed reconciliation run requests. +- `cli/src/services/hosted_reconciliation.rs` defines hosted intake/orchestration seams (`ingest_hosted_rewrite_event`, `ReconciliationRunStore`) that verify provider signatures (GitHub HMAC-SHA256 and GitLab token equality), parse provider payload old/new heads, normalize deterministic idempotency-backed reconciliation run requests, and resolve deterministic old->new rewrite mappings (`map_rewritten_commit`) with patch-id exact precedence, range-diff/fuzzy fallback scoring, and explicit unresolved classifications. - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. - `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. 
diff --git a/context/context-map.md b/context/context-map.md index 912b4857..e20b2b91 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -28,6 +28,7 @@ Feature/domain context: - `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) - `context/sce/agent-trace-reconciliation-schema-ingestion.md` (T11 reconciliation persistence schema for `reconciliation_runs`, `rewrite_mappings`, and `conversations` with replay-safe idempotency and query indexes) - `context/sce/agent-trace-hosted-event-intake-orchestration.md` (T12 hosted GitHub/GitLab event intake contract with signature verification, old/new head resolution, and deterministic reconciliation-run idempotency keys) +- `context/sce/agent-trace-rewrite-mapping-engine.md` (T13 hosted rewrite mapping engine contract with patch-id exact precedence, range-diff/fuzzy scoring, and deterministic unresolved outcomes) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index da83ea25..4e165ac2 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -66,3 +66,4 @@ - `agent trace core schema migrations`: T10 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that applies idempotent local DB table/index creation for foundational Agent Trace entities (`repositories`, `commits`, `trace_records`, `trace_ranges`) and supports upgrade-safe reapplication on preexisting database state. - `agent trace reconciliation schema ingestion`: T11 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that extends local DB migrations with hosted rewrite reconciliation entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), per-repository idempotency uniqueness, and query indexes for run status and old->new mapping lookup. 
- `agent trace hosted event intake orchestration`: T12 contract in `cli/src/services/hosted_reconciliation.rs` (`ingest_hosted_rewrite_event`) that verifies GitHub/GitLab webhook signatures, resolves provider payload old/new heads, normalizes provider/repo/event identity into deterministic `hosted::` reconciliation run idempotency keys, and returns replay-safe created-vs-duplicate run outcomes through `ReconciliationRunStore`. +- `agent trace rewrite mapping engine`: T13 contract in `cli/src/services/hosted_reconciliation.rs` (`map_rewritten_commit`) that deterministically maps old->new rewritten commits using strict patch-id exact precedence, then range-diff scoring, then fuzzy fallback with `>= 0.60` threshold gating and explicit unresolved outcomes (`ambiguous`, `unmatched`, `low_confidence`). diff --git a/context/overview.md b/context/overview.md index 89ef60b0..41800765 100644 --- a/context/overview.md +++ b/context/overview.md @@ -35,6 +35,7 @@ The hooks service now also includes rewrite trace transformation finalization (` The local DB service now includes core Agent Trace persistence schema migrations (`apply_core_schema_migrations`) that install idempotent foundational tables and indexes for `repositories`, `commits`, `trace_records`, and `trace_ranges`; this behavior is documented in `context/sce/agent-trace-core-schema-migrations.md`. The local DB service now also includes reconciliation persistence schema coverage in the same migration entrypoint for hosted rewrite bookkeeping tables (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay/query indexes; this behavior is documented in `context/sce/agent-trace-reconciliation-schema-ingestion.md`. 
The CLI now also includes a hosted event intake/orchestration seam in `cli/src/services/hosted_reconciliation.rs` that verifies provider signatures, resolves old/new commit heads from GitHub/GitLab payloads, and creates deterministic replay-safe reconciliation run requests; this behavior is documented in `context/sce/agent-trace-hosted-event-intake-orchestration.md`. +The hosted reconciliation service now also includes a deterministic rewrite mapping engine (`map_rewritten_commit`) that resolves old->new commit identity using patch-id exact precedence, then range-diff hints, then fuzzy fallback with a `>= 0.60` mapping threshold and explicit ambiguous/unmatched/low-confidence unresolved outcomes; this behavior is documented in `context/sce/agent-trace-rewrite-mapping-engine.md`. ## Repository model @@ -97,3 +98,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-core-schema-migrations.md` for the implemented T10 core local schema migration contract (`apply_core_schema_migrations`) and table/index ownership across foundational Agent Trace persistence entities. - Use `context/sce/agent-trace-reconciliation-schema-ingestion.md` for the implemented T11 reconciliation schema contract (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay-safe idempotency/index coverage. - Use `context/sce/agent-trace-hosted-event-intake-orchestration.md` for the implemented T12 hosted intake contract (GitHub/GitLab signature verification, old/new head resolution, deterministic reconciliation-run idempotency keys, and replay-safe run insertion outcomes). +- Use `context/sce/agent-trace-rewrite-mapping-engine.md` for the implemented T13 hosted mapping engine contract (patch-id exact matching, range-diff/fuzzy scoring precedence, confidence thresholds, and deterministic unresolved handling). 
diff --git a/context/patterns.md b/context/patterns.md index 93e3ff5d..a84ea7c6 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -93,6 +93,7 @@ - For local persistence rollout, ship core schema changes as idempotent `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` statements so migration reapplication is upgrade-safe across empty and preexisting local Turso DB states. - For hosted rewrite reconciliation persistence, extend the same migration seam (`apply_core_schema_migrations`) with deterministic schema/index statements and per-repository idempotency uniqueness for run/mapping replay safety. - For hosted event intake seams, verify provider signatures before payload parsing (GitHub `sha256=` HMAC over body, GitLab token-equality secret check), resolve old/new heads from provider payload fields, and derive deterministic reconciliation run idempotency keys from provider+event+repo+head tuple material. +- For hosted rewrite mapping seams, resolve candidates deterministically in strict precedence order (patch-id exact, then range-diff score, then fuzzy score), classify top-score ties as `ambiguous`, enforce low-confidence unresolved behavior below `0.60`, and preserve stable outcome ordering via canonical candidate SHA sorting. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. 
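The precedence rules in the hosted rewrite mapping pattern entry above can be sketched in a condensed, std-only form. This is not the `map_rewritten_commit` code from the patch: `Candidate`, `Outcome`, `map_commit`, and the method strings are hypothetical simplifications, and scores are plain `f32` values assumed to already lie in `[0.0, 1.0]`.

```rust
/// Hypothetical, simplified candidate; scores are assumed to lie in [0.0, 1.0].
#[derive(Clone, Debug)]
struct Candidate {
    new_sha: String,
    patch_id: Option<String>,
    range_diff_score: Option<f32>,
    fuzzy_score: Option<f32>,
}

/// Hypothetical outcome type used only by this sketch.
#[derive(Debug, PartialEq)]
enum Outcome {
    Mapped { new_sha: String, method: &'static str },
    Ambiguous(Vec<String>),
    LowConfidence(String),
    Unmatched,
}

const THRESHOLD: f32 = 0.60;

/// Returns the top score for one signal plus every SHA holding it,
/// or None when no candidate carries that signal.
fn top_scored(
    sorted: &[Candidate],
    score_of: impl Fn(&Candidate) -> Option<f32>,
) -> Option<(f32, Vec<String>)> {
    let mut top: Option<f32> = None;
    let mut shas: Vec<String> = Vec::new();
    for candidate in sorted {
        let Some(score) = score_of(candidate) else {
            continue;
        };
        match top {
            Some(current) if score < current => {}
            Some(current) if score == current => shas.push(candidate.new_sha.clone()),
            _ => {
                // First score seen, or a strictly better one: restart the tie list.
                top = Some(score);
                shas = vec![candidate.new_sha.clone()];
            }
        }
    }
    top.map(|score| (score, shas))
}

/// Ties are checked before the threshold, so a tie is always `Ambiguous`.
fn decide(method: &'static str, score: f32, mut shas: Vec<String>) -> Outcome {
    if shas.len() > 1 {
        return Outcome::Ambiguous(shas);
    }
    let new_sha = shas.remove(0); // `top_scored` never returns an empty list
    if score < THRESHOLD {
        Outcome::LowConfidence(new_sha)
    } else {
        Outcome::Mapped { new_sha, method }
    }
}

fn map_commit(source_patch_id: Option<&str>, candidates: &[Candidate]) -> Outcome {
    // Canonical SHA ordering keeps tie lists stable regardless of input order.
    let mut sorted = candidates.to_vec();
    sorted.sort_by(|a, b| a.new_sha.cmp(&b.new_sha));

    // 1. Exact patch-id match wins outright; several matches are ambiguous.
    if let Some(pid) = source_patch_id {
        let hits: Vec<String> = sorted
            .iter()
            .filter(|c| c.patch_id.as_deref() == Some(pid))
            .map(|c| c.new_sha.clone())
            .collect();
        match hits.len() {
            0 => {}
            1 => return Outcome::Mapped { new_sha: hits[0].clone(), method: "patch_id_exact" },
            _ => return Outcome::Ambiguous(hits),
        }
    }

    // 2. Range-diff decides if any candidate carries it; 3. otherwise fuzzy.
    if let Some((score, shas)) = top_scored(&sorted, |c| c.range_diff_score) {
        return decide("range_diff", score, shas);
    }
    if let Some((score, shas)) = top_scored(&sorted, |c| c.fuzzy_score) {
        return decide("fuzzy", score, shas);
    }
    Outcome::Unmatched
}
```

Note that the first signal present on any candidate is decisive: a low range-diff score yields `LowConfidence` rather than falling through to fuzzy scoring, which mirrors the early returns in the patch.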
diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 4d28ce1c..b2e5e086 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -196,7 +196,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo fmt --manifest-path cli/Cargo.toml -- --check` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T13: Implement mapping engine (patch-id, range-diff, fuzzy fallback) (status:todo) +- [x] T13: Implement mapping engine (patch-id, range-diff, fuzzy fallback) (status:done) - Task ID: T13 - Goal: Map old commits to new commits using strict staged matching with confidence scoring. - Boundaries (in/out of scope): @@ -206,6 +206,9 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Mapping outcomes are explainable, reproducible, and confidence-classified. - Verification notes (commands or checks): - Deterministic fixture tests for exact, ambiguous, unmatched, and low-confidence cases. 
+ - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T14: Implement notes write-back fallback, retry queue, and observability metrics (status:todo) - Task ID: T14 diff --git a/context/sce/agent-trace-rewrite-mapping-engine.md b/context/sce/agent-trace-rewrite-mapping-engine.md new file mode 100644 index 00000000..a1516fbd --- /dev/null +++ b/context/sce/agent-trace-rewrite-mapping-engine.md @@ -0,0 +1,77 @@ +# Agent Trace Rewrite Mapping Engine + +## Status +- Plan: `agent-trace-attribution-no-git-wrapper` +- Task: `T13` +- Code surface: `cli/src/services/hosted_reconciliation.rs` + +## Goal +Resolve hosted/local rewrite old->new commit identity with deterministic, explainable mapping outcomes before downstream rewrite persistence and trace transformation. + +## Engine contract +- Entry point: `map_rewritten_commit(source, candidates)`. +- Inputs: + - `RewriteSourceCommit`: old commit SHA + optional source patch-id. + - `RewriteCandidateCommit`: new commit SHA + optional `patch_id`, `range_diff_score`, `fuzzy_score`. +- Score contract: + - `Score` is constrained to finite `[0.0, 1.0]`. + - Mapping threshold: `FUZZY_MAPPING_THRESHOLD = 0.60`. +- Determinism: + - Candidates are sorted by `new_commit_sha` before decisioning. + - Tied top-score outcomes are returned in stable SHA order. + +## Decision precedence +1. Patch-id exact match + - If exactly one candidate patch-id matches source patch-id, map with: + - `method = PatchIdExact` + - `confidence = 1.0` + - `quality = final` + - If multiple exact patch-id matches exist, return unresolved `ambiguous`. +2. Range-diff scoring + - Select highest `range_diff_score` when no patch-id mapping exists. + - Tie for highest score returns unresolved `ambiguous`. + - Highest score `< 0.60` returns unresolved `low_confidence`. +3. 
Fuzzy fallback scoring + - Applied only when no patch-id or range-diff resolution exists. + - Uses same tie and threshold behavior as range-diff. +4. Unmatched + - If no usable range-diff or fuzzy signals exist, return unresolved `unmatched`. + +## Confidence to quality mapping +- `>= 0.90` -> `final` +- `0.60..0.89` -> `partial` +- `< 0.60` -> unresolved `low_confidence` (mapped output is not emitted) + +## Unresolved outcomes +- `ambiguous`: two or more candidates tied for best score or multiple patch-id exact matches. +- `unmatched`: candidates exist but no scoring signal exists. +- `low_confidence`: best available score is below `0.60`. + +## Verification coverage +- Exact match fixture: patch-id match wins over stronger non-exact scores. +- Ambiguous fixture: tied best range-diff scores return deterministic unresolved candidates. +- Unmatched fixture: no score signals produces `unmatched`. +- Low-confidence fixture: fuzzy best score `< 0.60` returns `low_confidence`. + +## Mapping flow +```mermaid +flowchart TD + A[Sort candidates by new_commit_sha] --> B{Source patch_id present?} + B -- Yes --> C{Exact patch_id matches} + C -- 1 --> D[Mapped: PatchIdExact confidence=1.0] + C -- >1 --> E[Unresolved: ambiguous] + C -- 0 --> F[Range-diff scoring] + B -- No --> F + F --> G{Top range-diff candidate exists?} + G -- No --> H[Fuzzy scoring] + G -- Yes tie --> E + G -- Yes single --> I{score >= 0.60?} + I -- Yes --> J[Mapped: RangeDiffHint] + I -- No --> K[Unresolved: low_confidence] + H --> L{Top fuzzy candidate exists?} + L -- No --> M[Unresolved: unmatched] + L -- Yes tie --> E + L -- Yes single --> N{score >= 0.60?} + N -- Yes --> O[Mapped: FuzzyFallback] + N -- No --> K +``` From 41ef52317f97ea121a6ba7c7ba4475a8926f11a7 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 15:54:35 +0100 Subject: [PATCH 15/39] agent-trace: Implement retry replay and reconciliation observability Implements DB-first retry queue recovery for failed notes/DB writes, adds 
bounded retry processing with per-attempt runtime/error metrics, and introduces reconciliation mapped/unmapped plus confidence histogram telemetry. --- cli/src/services/hooks.rs | 255 +++++++++++++++++- cli/src/services/hosted_reconciliation.rs | 158 ++++++++++- cli/src/services/local_db.rs | 32 +++ context/architecture.md | 6 +- context/context-map.md | 1 + context/glossary.md | 2 + context/overview.md | 2 + context/patterns.md | 2 + .../agent-trace-attribution-no-git-wrapper.md | 9 +- .../agent-trace-retry-queue-observability.md | 45 ++++ 10 files changed, 491 insertions(+), 21 deletions(-) create mode 100644 context/sce/agent-trace-retry-queue-observability.md diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index e2aaf104..25cd103a 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -1,8 +1,10 @@ use anyhow::Result; +use std::collections::HashSet; +use std::time::Instant; use crate::services::agent_trace::{ build_trace_payload, AgentTraceRecord, FileAttributionInput, QualityStatus, RewriteInfo, - TraceAdapterInput, TRACE_CONTENT_TYPE, + TraceAdapterInput, METADATA_IDEMPOTENCY_KEY, TRACE_CONTENT_TYPE, }; pub const NAME: &str = "hooks"; @@ -150,6 +152,27 @@ pub trait TraceRecordStore { pub trait TraceRetryQueue { fn enqueue(&mut self, entry: TraceRetryQueueEntry) -> Result<()>; + fn dequeue_next(&mut self) -> Result<Option<TraceRetryQueueEntry>>; +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RetryProcessingMetric { + pub commit_sha: String, + pub trace_id: String, + pub runtime_ms: u128, + pub error_class: Option<PersistenceErrorClass>, + pub failed_targets: Vec<PersistenceTarget>, +} + +pub trait RetryMetricsSink { + fn record_retry_metric(&mut self, metric: RetryProcessingMetric); +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub struct RetryQueueProcessSummary { + pub attempted: usize, + pub recovered: usize, + pub requeued: usize, } pub trait TraceEmissionLedger { @@ -621,6 +644,106 @@ fn collect_failed_targets( failed_targets } +pub fn process_trace_retry_queue( +
retry_queue: &mut impl TraceRetryQueue, + notes_writer: &mut impl TraceNotesWriter, + record_store: &mut impl TraceRecordStore, + metrics_sink: &mut impl RetryMetricsSink, + max_items: usize, +) -> Result<RetryQueueProcessSummary> { + let mut processed_trace_ids = HashSet::new(); + let mut summary = RetryQueueProcessSummary { + attempted: 0, + recovered: 0, + requeued: 0, + }; + + for _ in 0..max_items { + let Some(entry) = retry_queue.dequeue_next()? else { + break; + }; + + if !processed_trace_ids.insert(entry.record.id.clone()) { + retry_queue.enqueue(entry)?; + break; + } + + summary.attempted += 1; + let started = Instant::now(); + + let notes_result = if entry.failed_targets.contains(&PersistenceTarget::Notes) { + notes_writer.write_note(TraceNote { + notes_ref: entry.notes_ref.clone(), + commit_sha: entry.commit_sha.clone(), + content_type: entry.content_type.clone(), + record: entry.record.clone(), + }) + } else { + PersistenceWriteResult::AlreadyExists + }; + + let database_result = if entry.failed_targets.contains(&PersistenceTarget::Database) { + let idempotency_key = entry + .record + .metadata + .get(METADATA_IDEMPOTENCY_KEY) + .cloned() + .unwrap_or_else(|| format!("retry:{}:{}", entry.commit_sha, entry.record.id)); + record_store.write_trace_record(PersistedTraceRecord { + commit_sha: entry.commit_sha.clone(), + idempotency_key, + content_type: entry.content_type.clone(), + notes_ref: entry.notes_ref.clone(), + record: entry.record.clone(), + }) + } else { + PersistenceWriteResult::AlreadyExists + }; + + let failed_targets = collect_failed_targets(&notes_result, &database_result); + let error_class = first_failure_class(&notes_result, &database_result); + + metrics_sink.record_retry_metric(RetryProcessingMetric { + commit_sha: entry.commit_sha.clone(), + trace_id: entry.record.id.clone(), + runtime_ms: started.elapsed().as_millis(), + error_class, + failed_targets: failed_targets.clone(), + }); + + if failed_targets.is_empty() { + summary.recovered += 1; + continue; + } + +
summary.requeued += 1; + retry_queue.enqueue(TraceRetryQueueEntry { + commit_sha: entry.commit_sha, + failed_targets, + content_type: entry.content_type, + notes_ref: entry.notes_ref, + record: entry.record, + })?; + } + + Ok(summary) +} + +fn first_failure_class( + notes_result: &PersistenceWriteResult, + database_result: &PersistenceWriteResult, +) -> Option<PersistenceErrorClass> { + match notes_result { + PersistenceWriteResult::Failed(failure) => return Some(failure.class.clone()), + PersistenceWriteResult::Written | PersistenceWriteResult::AlreadyExists => {} + } + + match database_result { + PersistenceWriteResult::Failed(failure) => Some(failure.class.clone()), + PersistenceWriteResult::Written | PersistenceWriteResult::AlreadyExists => None, + } +} + pub fn apply_commit_msg_coauthor_policy( runtime: &CommitMsgRuntimeState, commit_message: &str, @@ -825,23 +948,25 @@ mod tests { use anyhow::Result; use crate::services::agent_trace::{ - ContributorInput, ContributorType, ConversationInput, FileAttributionInput, RangeInput, + build_trace_payload, ContributorInput, ContributorType, ConversationInput, + FileAttributionInput, QualityStatus, RangeInput, TraceAdapterInput, METADATA_QUALITY_STATUS, METADATA_REWRITE_CONFIDENCE, METADATA_REWRITE_FROM, METADATA_REWRITE_METHOD, }; use super::{ apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, - finalize_pre_commit_checkpoint, finalize_rewrite_trace, run_placeholder_hooks, - CommitMsgRuntimeState, GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, - HookEvent, HookService, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, - PersistenceErrorClass, PersistenceFailure, PersistenceTarget, PersistenceWriteResult, - PlaceholderHookService, PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, - PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, - PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, -
PreCommitTreeAnchors, RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, - RewriteTraceFinalization, RewriteTraceInput, RewriteTraceNoOpReason, TraceEmissionLedger, - TraceNote, TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry, + finalize_pre_commit_checkpoint, finalize_rewrite_trace, process_trace_retry_queue, + run_placeholder_hooks, CommitMsgRuntimeState, GeneratedRegionEvent, + GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, PendingCheckpoint, + PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, PersistenceFailure, + PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, PostCommitFinalization, + PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, PostRewriteFinalization, + PostRewriteNoOpReason, PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, + PreCommitRuntimeState, PreCommitTreeAnchors, RetryMetricsSink, RetryProcessingMetric, + RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, RewriteTraceFinalization, + RewriteTraceInput, RewriteTraceNoOpReason, TraceEmissionLedger, TraceNote, + TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY, }; @@ -936,6 +1061,11 @@ mod tests { entries: Vec<TraceRetryQueueEntry>, } + #[derive(Default)] + struct FakeRetryMetricsSink { + events: Vec<RetryProcessingMetric>, + } + #[derive(Default)] struct FakeRewriteRemapIngestion { seen_requests: Vec, @@ -959,6 +1089,54 @@ mod tests { self.entries.push(entry); Ok(()) } + + fn dequeue_next(&mut self) -> Result<Option<TraceRetryQueueEntry>> { + if self.entries.is_empty() { + return Ok(None); + } + + Ok(Some(self.entries.remove(0))) + } + } + + impl RetryMetricsSink for FakeRetryMetricsSink { + fn record_retry_metric(&mut self, metric: RetryProcessingMetric) { + self.events.push(metric); + } + } + + fn sample_retry_entry_with_target(target: PersistenceTarget) -> TraceRetryQueueEntry { + let record = build_trace_payload(TraceAdapterInput { + record_id:
"990e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "2026-03-04T12:13:14Z".to_string(), + commit_sha: "retrysha123".to_string(), + files: vec![FileAttributionInput { + path: "src/retry.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/retry".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 4, + end_line: 6, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + quality_status: QualityStatus::Final, + rewrite: None, + idempotency_key: Some("retry:key:retrysha123".to_string()), + }); + + TraceRetryQueueEntry { + commit_sha: "retrysha123".to_string(), + failed_targets: vec![target], + content_type: "application/vnd.agent-trace.record+json".to_string(), + notes_ref: "refs/notes/agent-trace".to_string(), + record, + } } fn sample_post_commit_runtime() -> PostCommitRuntimeState { @@ -1138,6 +1316,59 @@ mod tests { Ok(()) } + #[test] + fn retry_processor_recovers_failed_notes_write_and_emits_success_metric() -> Result<()> { + let mut queue = FakeRetryQueue { + entries: vec![sample_retry_entry_with_target(PersistenceTarget::Notes)], + }; + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut metrics = FakeRetryMetricsSink::default(); + + let summary = + process_trace_retry_queue(&mut queue, &mut notes, &mut store, &mut metrics, 4)?; + + assert_eq!(summary.attempted, 1); + assert_eq!(summary.recovered, 1); + assert_eq!(summary.requeued, 0); + assert!(queue.entries.is_empty()); + assert_eq!(metrics.events.len(), 1); + assert_eq!(metrics.events[0].error_class, None); + assert!(metrics.events[0].failed_targets.is_empty()); + Ok(()) + } + + #[test] + fn retry_processor_requeues_when_db_write_still_fails() -> Result<()> { + let mut queue = FakeRetryQueue { + entries: 
vec![sample_retry_entry_with_target(PersistenceTarget::Database)], + }; + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Failed(PersistenceFailure { + class: PersistenceErrorClass::Permanent, + message: "database still unavailable".to_string(), + })); + let mut metrics = FakeRetryMetricsSink::default(); + + let summary = + process_trace_retry_queue(&mut queue, &mut notes, &mut store, &mut metrics, 4)?; + + assert_eq!(summary.attempted, 1); + assert_eq!(summary.recovered, 0); + assert_eq!(summary.requeued, 1); + assert_eq!(queue.entries.len(), 1); + assert_eq!( + queue.entries[0].failed_targets, + vec![PersistenceTarget::Database] + ); + assert_eq!(metrics.events.len(), 1); + assert_eq!( + metrics.events[0].error_class, + Some(PersistenceErrorClass::Permanent) + ); + Ok(()) + } + #[test] fn post_rewrite_finalization_noops_when_sce_disabled() -> Result<()> { let mut runtime = sample_post_rewrite_runtime(); diff --git a/cli/src/services/hosted_reconciliation.rs b/cli/src/services/hosted_reconciliation.rs index fa5133b3..c7c633e8 100644 --- a/cli/src/services/hosted_reconciliation.rs +++ b/cli/src/services/hosted_reconciliation.rs @@ -135,6 +135,83 @@ pub enum RewriteMappingOutcome { Unresolved(UnresolvedRewriteMapping), } +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum ReconciliationErrorClass { + Signature, + Payload, + Store, +} + +#[derive(Clone, Copy, Debug, Default, Eq, PartialEq)] +pub struct ConfidenceHistogram { + pub high: u64, + pub medium: u64, + pub low: u64, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct ReconciliationMetricsSnapshot { + pub mapped: u64, + pub unmapped: u64, + pub confidence_histogram: ConfidenceHistogram, + pub runtime_ms: u128, + pub error_class: Option<ReconciliationErrorClass>, +} + +pub fn summarize_reconciliation_metrics( + outcomes: &[RewriteMappingOutcome], + runtime_ms: u128, + error_class: Option<ReconciliationErrorClass>, +) -> ReconciliationMetricsSnapshot { + let mut mapped
= 0_u64; + let mut unmapped = 0_u64; + let mut confidence_histogram = ConfidenceHistogram::default(); + + for outcome in outcomes { + match outcome { + RewriteMappingOutcome::Mapped(mapping) => { + mapped += 1; + classify_histogram_bucket(mapping.confidence, &mut confidence_histogram); + } + RewriteMappingOutcome::Unresolved(unresolved) => { + unmapped += 1; + if let Some(score) = unresolved.best_confidence { + classify_histogram_bucket(score, &mut confidence_histogram); + } + } + } + } + + ReconciliationMetricsSnapshot { + mapped, + unmapped, + confidence_histogram, + runtime_ms, + error_class, + } +} + +pub fn classify_reconciliation_error(error: &anyhow::Error) -> ReconciliationErrorClass { + let message = error.to_string().to_ascii_lowercase(); + if message.contains("signature") { + return ReconciliationErrorClass::Signature; + } + if message.contains("payload") || message.contains("before") || message.contains("after") { + return ReconciliationErrorClass::Payload; + } + ReconciliationErrorClass::Store +} + +fn classify_histogram_bucket(score: Score, histogram: &mut ConfidenceHistogram) { + if score.value() >= 0.90 { + histogram.high += 1; + } else if score.value() >= FUZZY_MAPPING_THRESHOLD { + histogram.medium += 1; + } else { + histogram.low += 1; + } +} + pub fn map_rewritten_commit( source: &RewriteSourceCommit, candidates: &[RewriteCandidateCommit], @@ -629,11 +706,12 @@ mod tests { use anyhow::Result; use super::{ - derive_idempotency_key, github_signature, ingest_hosted_rewrite_event, - map_rewritten_commit, HostedIntakeOutcome, HostedProvider, HostedReconciliationRunRequest, - HostedWebhookRequest, MappingMethod, MappingQuality, ReconciliationRunInsertOutcome, - ReconciliationRunStore, RewriteCandidateCommit, RewriteMappingOutcome, RewriteSourceCommit, - Score, UnresolvedMappingKind, + classify_reconciliation_error, derive_idempotency_key, github_signature, + ingest_hosted_rewrite_event, map_rewritten_commit, summarize_reconciliation_metrics, + 
ConfidenceHistogram, HostedIntakeOutcome, HostedProvider, HostedReconciliationRunRequest, + HostedWebhookRequest, MappingMethod, MappingQuality, ReconciliationErrorClass, + ReconciliationRunInsertOutcome, ReconciliationRunStore, RewriteCandidateCommit, + RewriteMappingOutcome, RewriteSourceCommit, Score, UnresolvedMappingKind, }; #[derive(Default)] @@ -944,4 +1022,74 @@ mod tests { other => panic!("expected low-confidence unresolved outcome, got {other:?}"), } } + + #[test] + fn reconciliation_metrics_capture_mapped_unmapped_histogram_runtime_and_error_class() { + let outcomes = vec![ + RewriteMappingOutcome::Mapped(super::RewrittenCommitMapping { + old_commit_sha: "old-high".to_string(), + new_commit_sha: "new-high".to_string(), + method: MappingMethod::PatchIdExact, + confidence: score(1.0), + quality: MappingQuality::Final, + }), + RewriteMappingOutcome::Mapped(super::RewrittenCommitMapping { + old_commit_sha: "old-medium".to_string(), + new_commit_sha: "new-medium".to_string(), + method: MappingMethod::FuzzyFallback, + confidence: score(0.65), + quality: MappingQuality::Partial, + }), + RewriteMappingOutcome::Unresolved(super::UnresolvedRewriteMapping { + old_commit_sha: "old-low".to_string(), + kind: UnresolvedMappingKind::LowConfidence, + reason: "below threshold".to_string(), + candidate_new_shas: vec!["new-low".to_string()], + best_confidence: Some(score(0.40)), + }), + RewriteMappingOutcome::Unresolved(super::UnresolvedRewriteMapping { + old_commit_sha: "old-unmatched".to_string(), + kind: UnresolvedMappingKind::Unmatched, + reason: "none".to_string(), + candidate_new_shas: vec![], + best_confidence: None, + }), + ]; + + let snapshot = + summarize_reconciliation_metrics(&outcomes, 123, Some(ReconciliationErrorClass::Store)); + + assert_eq!(snapshot.mapped, 2); + assert_eq!(snapshot.unmapped, 2); + assert_eq!( + snapshot.confidence_histogram, + ConfidenceHistogram { + high: 1, + medium: 1, + low: 1, + } + ); + assert_eq!(snapshot.runtime_ms, 123); + 
assert_eq!(snapshot.error_class, Some(ReconciliationErrorClass::Store)); + } + + #[test] + fn reconciliation_error_classification_labels_signature_and_payload_failures() { + let signature_error = anyhow::anyhow!("hosted event signature verification failed"); + let payload_error = anyhow::anyhow!("invalid hosted event payload: missing 'before' field"); + let store_error = anyhow::anyhow!("run store insert failed"); + + assert_eq!( + classify_reconciliation_error(&signature_error), + ReconciliationErrorClass::Signature + ); + assert_eq!( + classify_reconciliation_error(&payload_error), + ReconciliationErrorClass::Payload + ); + assert_eq!( + classify_reconciliation_error(&store_error), + ReconciliationErrorClass::Store + ); + } } diff --git a/cli/src/services/local_db.rs b/cli/src/services/local_db.rs index dcdd7a26..6dd2ced7 100644 --- a/cli/src/services/local_db.rs +++ b/cli/src/services/local_db.rs @@ -89,6 +89,32 @@ const CORE_SCHEMA_STATEMENTS: &[&str] = &[ FOREIGN KEY(repository_id) REFERENCES repositories(id) ON DELETE CASCADE,\ UNIQUE(repository_id, url)\ )", + "CREATE TABLE IF NOT EXISTS trace_retry_queue (\ + id INTEGER PRIMARY KEY,\ + trace_id TEXT NOT NULL UNIQUE,\ + commit_sha TEXT NOT NULL,\ + failed_targets TEXT NOT NULL,\ + content_type TEXT NOT NULL,\ + notes_ref TEXT NOT NULL,\ + payload_json TEXT NOT NULL,\ + attempts INTEGER NOT NULL DEFAULT 0,\ + last_error_class TEXT,\ + last_error_message TEXT,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),\ + updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))\ + )", + "CREATE TABLE IF NOT EXISTS reconciliation_metrics (\ + id INTEGER PRIMARY KEY,\ + run_id INTEGER,\ + mapped_count INTEGER NOT NULL,\ + unmapped_count INTEGER NOT NULL,\ + histogram_high INTEGER NOT NULL,\ + histogram_medium INTEGER NOT NULL,\ + histogram_low INTEGER NOT NULL,\ + runtime_ms INTEGER NOT NULL,\ + error_class TEXT,\ + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 
'now'))\ + )", "CREATE INDEX IF NOT EXISTS idx_commits_repository_commit_sha ON commits(repository_id, commit_sha)", "CREATE INDEX IF NOT EXISTS idx_trace_records_repository_commit ON trace_records(repository_id, commit_id)", "CREATE INDEX IF NOT EXISTS idx_trace_ranges_record_file ON trace_ranges(trace_record_id, file_path)", @@ -96,6 +122,8 @@ const CORE_SCHEMA_STATEMENTS: &[&str] = &[ "CREATE INDEX IF NOT EXISTS idx_rewrite_mappings_run_old_sha ON rewrite_mappings(reconciliation_run_id, old_commit_sha)", "CREATE INDEX IF NOT EXISTS idx_rewrite_mappings_repository_old_sha ON rewrite_mappings(repository_id, old_commit_sha)", "CREATE INDEX IF NOT EXISTS idx_conversations_repository_source ON conversations(repository_id, source)", + "CREATE INDEX IF NOT EXISTS idx_trace_retry_queue_created_at ON trace_retry_queue(created_at)", + "CREATE INDEX IF NOT EXISTS idx_reconciliation_metrics_created_at ON reconciliation_metrics(created_at)", ]; #[derive(Clone, Copy, Debug)] @@ -276,6 +304,8 @@ mod tests { "reconciliation_runs", "rewrite_mappings", "conversations", + "trace_retry_queue", + "reconciliation_metrics", ] { assert!(runtime.block_on(sqlite_object_exists( LocalDatabaseTarget::Path(&path), @@ -292,6 +322,8 @@ mod tests { "idx_rewrite_mappings_run_old_sha", "idx_rewrite_mappings_repository_old_sha", "idx_conversations_repository_source", + "idx_trace_retry_queue_created_at", + "idx_reconciliation_metrics_created_at", ] { assert!(runtime.block_on(sqlite_object_exists( LocalDatabaseTarget::Path(&path), diff --git a/context/architecture.md b/context/architecture.md index fefd4bbf..06a98ecb 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -73,14 +73,14 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/main.rs` is the executable entrypoint (`sce`) and delegates to `app::run`. 
- `cli/src/app.rs` provides a `lexopt`-based argument parser and dispatch loop with deterministic help, setup installation execution, and consistent `anyhow`-driven error exits. - `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. -- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`) plus reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`) with replay/query indexes. +- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`), reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), and T14 retry/observability storage (`trace_retry_queue`, `reconciliation_metrics`) with replay/query indexes. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. 
- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. 
-- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. -- `cli/src/services/hosted_reconciliation.rs` defines hosted intake/orchestration seams (`ingest_hosted_rewrite_event`, `ReconciliationRunStore`) that verify provider signatures (GitHub HMAC-SHA256 and GitLab token equality), parse provider payload old/new heads, normalize deterministic idempotency-backed reconciliation run requests, and resolve deterministic old->new rewrite mappings (`map_rewritten_commit`) with patch-id exact precedence, range-diff/fuzzy fallback scoring, and explicit unresolved classifications. 
+- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a retry replay seam (`process_trace_retry_queue`) that re-attempts only failed persistence targets and emits per-attempt runtime/error-class metrics, a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. +- `cli/src/services/hosted_reconciliation.rs` defines hosted intake/orchestration seams (`ingest_hosted_rewrite_event`, `ReconciliationRunStore`) that verify provider signatures (GitHub HMAC-SHA256 and GitLab token equality), parse provider payload old/new heads, normalize deterministic idempotency-backed reconciliation run requests, resolve deterministic old->new rewrite mappings (`map_rewritten_commit`) with patch-id exact precedence, range-diff/fuzzy fallback scoring, and explicit unresolved classifications, and summarize mapped/unmapped confidence/runtime/error-class telemetry (`summarize_reconciliation_metrics`). - `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. 
- `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. - `cli/README.md` is the crate-local onboarding and usage source of truth for placeholder behavior, safety limitations, and roadmap mapping back to service contracts. diff --git a/context/context-map.md b/context/context-map.md index e20b2b91..60aa762f 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -29,6 +29,7 @@ Feature/domain context: - `context/sce/agent-trace-reconciliation-schema-ingestion.md` (T11 reconciliation persistence schema for `reconciliation_runs`, `rewrite_mappings`, and `conversations` with replay-safe idempotency and query indexes) - `context/sce/agent-trace-hosted-event-intake-orchestration.md` (T12 hosted GitHub/GitLab event intake contract with signature verification, old/new head resolution, and deterministic reconciliation-run idempotency keys) - `context/sce/agent-trace-rewrite-mapping-engine.md` (T13 hosted rewrite mapping engine contract with patch-id exact precedence, range-diff/fuzzy scoring, and deterministic unresolved outcomes) +- `context/sce/agent-trace-retry-queue-observability.md` (T14 retry queue recovery contract plus reconciliation/runtime observability metrics and DB-first queue schema additions) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 4e165ac2..b24c9bae 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -67,3 +67,5 @@ - `agent trace reconciliation schema ingestion`: T11 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that extends local DB migrations with hosted rewrite reconciliation entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), per-repository idempotency uniqueness, and query indexes for run status and old->new mapping lookup. 
- `agent trace hosted event intake orchestration`: T12 contract in `cli/src/services/hosted_reconciliation.rs` (`ingest_hosted_rewrite_event`) that verifies GitHub/GitLab webhook signatures, resolves provider payload old/new heads, normalizes provider/repo/event identity into deterministic `hosted::` reconciliation run idempotency keys, and returns replay-safe created-vs-duplicate run outcomes through `ReconciliationRunStore`. - `agent trace rewrite mapping engine`: T13 contract in `cli/src/services/hosted_reconciliation.rs` (`map_rewritten_commit`) that deterministically maps old->new rewritten commits using strict patch-id exact precedence, then range-diff scoring, then fuzzy fallback with `>= 0.60` threshold gating and explicit unresolved outcomes (`ambiguous`, `unmatched`, `low_confidence`). +- `agent trace retry replay processor`: T14 contract in `cli/src/services/hooks.rs` (`process_trace_retry_queue`) that dequeues fallback queue entries, retries only previously failed persistence targets (notes and/or DB), requeues remaining failures, and emits per-attempt runtime/error-class metrics via `RetryMetricsSink`. +- `reconciliation metrics snapshot`: T14 contract in `cli/src/services/hosted_reconciliation.rs` (`summarize_reconciliation_metrics`) that reports mapped/unmapped counts, confidence histogram buckets (`high`/`medium`/`low`), runtime (`runtime_ms`), and normalized error class (`signature`/`payload`/`store`) for hosted rewrite runs. 
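Note on the `reconciliation metrics snapshot` entry above: the bucketing it describes can be illustrated with a small standalone sketch. This is not the code in `hosted_reconciliation.rs`; it assumes plain `f64` scores in place of the patch's `Score` type, and `Outcome`, `bump`, `summarize`, and `HIGH_THRESHOLD` are hypothetical names introduced only for illustration. The thresholds (`0.90` and `0.60`) are taken from the patch.

```rust
// Illustrative sketch only: plain f64 scores stand in for the patch's `Score` type.
const HIGH_THRESHOLD: f64 = 0.90; // hypothetical name; value from the quality mapping
const MAPPING_THRESHOLD: f64 = 0.60; // mirrors FUZZY_MAPPING_THRESHOLD in the patch

#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct ConfidenceHistogram {
    pub high: u64,
    pub medium: u64,
    pub low: u64,
}

/// Hypothetical simplified outcome: a mapping with its confidence, or an
/// unresolved result that may still carry a best-candidate score.
#[derive(Clone, Copy, Debug)]
pub enum Outcome {
    Mapped(f64),
    Unresolved(Option<f64>),
}

/// Increment the bucket that `score` falls into.
fn bump(score: f64, histogram: &mut ConfidenceHistogram) {
    if score >= HIGH_THRESHOLD {
        histogram.high += 1;
    } else if score >= MAPPING_THRESHOLD {
        histogram.medium += 1;
    } else {
        histogram.low += 1;
    }
}

/// Count mapped and unmapped outcomes and bucket every available score.
/// Unresolved outcomes without a score are counted but not bucketed.
pub fn summarize(outcomes: &[Outcome]) -> (u64, u64, ConfidenceHistogram) {
    let mut mapped = 0_u64;
    let mut unmapped = 0_u64;
    let mut histogram = ConfidenceHistogram::default();

    for outcome in outcomes {
        match outcome {
            Outcome::Mapped(score) => {
                mapped += 1;
                bump(*score, &mut histogram);
            }
            Outcome::Unresolved(best) => {
                unmapped += 1;
                if let Some(score) = best {
                    bump(*score, &mut histogram);
                }
            }
        }
    }

    (mapped, unmapped, histogram)
}
```

One consequence worth keeping in mind when reading the metrics: the histogram total can be smaller than `mapped + unmapped`, because an `unmatched` outcome carries no score to bucket.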
diff --git a/context/overview.md b/context/overview.md index 41800765..1689a6b9 100644 --- a/context/overview.md +++ b/context/overview.md @@ -36,6 +36,7 @@ The local DB service now includes core Agent Trace persistence schema migrations The local DB service now also includes reconciliation persistence schema coverage in the same migration entrypoint for hosted rewrite bookkeeping tables (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay/query indexes; this behavior is documented in `context/sce/agent-trace-reconciliation-schema-ingestion.md`. The CLI now also includes a hosted event intake/orchestration seam in `cli/src/services/hosted_reconciliation.rs` that verifies provider signatures, resolves old/new commit heads from GitHub/GitLab payloads, and creates deterministic replay-safe reconciliation run requests; this behavior is documented in `context/sce/agent-trace-hosted-event-intake-orchestration.md`. The hosted reconciliation service now also includes a deterministic rewrite mapping engine (`map_rewritten_commit`) that resolves old->new commit identity using patch-id exact precedence, then range-diff hints, then fuzzy fallback with a `>= 0.60` mapping threshold and explicit ambiguous/unmatched/low-confidence unresolved outcomes; this behavior is documented in `context/sce/agent-trace-rewrite-mapping-engine.md`. +The hooks service now also includes retry-queue replay processing (`process_trace_retry_queue`) with per-attempt runtime/error-class metric emission, and the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. 
## Repository model @@ -99,3 +100,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-reconciliation-schema-ingestion.md` for the implemented T11 reconciliation schema contract (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay-safe idempotency/index coverage. - Use `context/sce/agent-trace-hosted-event-intake-orchestration.md` for the implemented T12 hosted intake contract (GitHub/GitLab signature verification, old/new head resolution, deterministic reconciliation-run idempotency keys, and replay-safe run insertion outcomes). - Use `context/sce/agent-trace-rewrite-mapping-engine.md` for the implemented T13 hosted mapping engine contract (patch-id exact matching, range-diff/fuzzy scoring precedence, confidence thresholds, and deterministic unresolved handling). +- Use `context/sce/agent-trace-retry-queue-observability.md` for the implemented T14 retry replay contract (notes/DB target-scoped recovery, per-attempt runtime/error-class metrics, reconciliation mapped/unmapped + confidence histogram snapshots, and DB-first retry/metrics schema additions). diff --git a/context/patterns.md b/context/patterns.md index a84ea7c6..3157a4ed 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -88,12 +88,14 @@ - For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. - For commit-msg co-author policy seams, gate canonical trailer insertion on runtime controls (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`) plus staged SCE-attribution presence, and enforce idempotent dedupe so allowed cases end with exactly one `Co-authored-by: SCE ` trailer. 
- For post-commit trace finalization seams, treat commit SHA as the idempotency identity, perform notes + DB writes in the same finalize pass when available, and enqueue retry-fallback entries that explicitly capture failed persistence targets for replay-safe recovery. +- For retry replay seams, process fallback queue entries in bounded batches, avoid same-pass duplicate trace processing, retry only failed targets, and emit per-attempt runtime + persistence error-class metrics for operational visibility. - For post-rewrite remap ingestion seams, parse ` ` pairs from hook input strictly, ignore empty/no-op self-mapping rows, normalize rewrite method labels to lowercase (`amend`/`rebase` when recognized), and derive deterministic per-pair idempotency keys before dispatching remap requests. - For rewrite trace transformation seams, materialize rewritten records through the canonical Agent Trace builder path, require finite confidence in `[0.0, 1.0]`, normalize confidence to two-decimal metadata strings, map quality thresholds to `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`), and preserve notes+DB dual-write plus retry-fallback parity. - For local persistence rollout, ship core schema changes as idempotent `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` statements so migration reapplication is upgrade-safe across empty and preexisting local Turso DB states. - For hosted rewrite reconciliation persistence, extend the same migration seam (`apply_core_schema_migrations`) with deterministic schema/index statements and per-repository idempotency uniqueness for run/mapping replay safety. - For hosted event intake seams, verify provider signatures before payload parsing (GitHub `sha256=` HMAC over body, GitLab token-equality secret check), resolve old/new heads from provider payload fields, and derive deterministic reconciliation run idempotency keys from provider+event+repo+head tuple material. 
- For hosted rewrite mapping seams, resolve candidates deterministically in strict precedence order (patch-id exact, then range-diff score, then fuzzy score), classify top-score ties as `ambiguous`, enforce low-confidence unresolved behavior below `0.60`, and preserve stable outcome ordering via canonical candidate SHA sorting. +- For hosted reconciliation observability, publish run-level mapped/unmapped counts, confidence histogram buckets, runtime timing, and normalized error-class labels so retry/quality drift can be monitored without requiring a full dashboard surface. - Keep crate-local onboarding docs in `cli/README.md` and sanity-check command examples against actual `sce` output whenever command messaging changes. - Keep targeted CLI command-surface verification in flake checks: `checks..cli-setup-command-surface` runs from `cli/` and executes `cargo fmt --check` plus focused setup-related tests (`help_text_mentions_setup_target_flags`, `parser_routes_setup`, `run_setup_reports`). - In `cli/flake.nix`, select the Rust toolchain via an explicit Rust overlay (`rust-overlay`) and thread that toolchain through `makeRustPlatform` so CLI check/build derivations do not rely on implicit nixpkgs Rust defaults. 
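Editorial sketch (not part of the patch): the rewrite trace transformation pattern above requires a finite confidence in `[0.0, 1.0]`, a two-decimal metadata string, and quality states `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`). The Rust fragment below is one way to express that mapping. `TraceQuality` and `classify_confidence` are hypothetical names, and classifying on the raw value (so that anything from `0.60` up to but excluding `0.90` is `partial`) is an assumption about how the gap between `0.89` and `0.90` is handled.

```rust
/// Hypothetical quality states mirroring the `final` / `partial` / `needs_review` labels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TraceQuality {
    Final,
    Partial,
    NeedsReview,
}

impl TraceQuality {
    /// Label as it would appear in persisted metadata.
    fn as_str(self) -> &'static str {
        match self {
            TraceQuality::Final => "final",
            TraceQuality::Partial => "partial",
            TraceQuality::NeedsReview => "needs_review",
        }
    }
}

/// Returns the quality state plus the two-decimal confidence string,
/// or `None` when the confidence is non-finite or outside `[0.0, 1.0]`.
fn classify_confidence(confidence: f64) -> Option<(TraceQuality, String)> {
    if !confidence.is_finite() || !(0.0..=1.0).contains(&confidence) {
        return None;
    }

    // Assumption: thresholds are applied to the raw value, before formatting.
    let quality = if confidence >= 0.90 {
        TraceQuality::Final
    } else if confidence >= 0.60 {
        TraceQuality::Partial
    } else {
        TraceQuality::NeedsReview
    };

    Some((quality, format!("{confidence:.2}")))
}
```

Rejecting out-of-range input up front keeps the threshold branches free of NaN and overflow cases, which is why the sketch returns `Option` rather than clamping.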
diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index b2e5e086..7d04fc5a 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -210,7 +210,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T14: Implement notes write-back fallback, retry queue, and observability metrics (status:todo) +- [x] T14: Implement notes write-back fallback, retry queue, and observability metrics (status:done) - Task ID: T14 - Goal: Guarantee no trace loss when notes pushes fail and expose reconciliation/runtime telemetry. - Boundaries (in/out of scope): @@ -220,6 +220,13 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - Failed notes pushes recover via retry and metrics expose operational state. - Verification notes (commands or checks): - Fault-injection and recovery tests with metric emission assertions. 
+ - `cargo fmt --manifest-path cli/Cargo.toml` + - `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_recovers_failed_notes_write_and_emits_success_metric` + - `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_requeues_when_db_write_still_fails` + - `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation::tests::reconciliation_metrics_capture_mapped_unmapped_histogram_runtime_and_error_class` + - `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation::tests::reconciliation_error_classification_labels_signature_and_payload_failures` + - `cargo test --manifest-path cli/Cargo.toml local_db::tests::core_schema_migrations_create_required_tables_and_indexes` + - `cargo build --manifest-path cli/Cargo.toml` - [ ] T15: Validation and cleanup (status:todo) - Task ID: T15 diff --git a/context/sce/agent-trace-retry-queue-observability.md b/context/sce/agent-trace-retry-queue-observability.md new file mode 100644 index 00000000..07699c5f --- /dev/null +++ b/context/sce/agent-trace-retry-queue-observability.md @@ -0,0 +1,45 @@ +# Agent Trace retry queue and observability metrics + +## Status +- Plan: `agent-trace-attribution-no-git-wrapper` +- Task: `T14` +- Implementation state: done + +## Canonical contract +- Retry processing entrypoint: `cli/src/services/hooks.rs` -> `process_trace_retry_queue`. +- Queue contract now supports dequeue + enqueue replay via `TraceRetryQueue::{dequeue_next, enqueue}`. +- Retry pass processes up to `max_items` entries per invocation and avoids same-pass duplicate processing for the same trace ID. +- Recovery write behavior is target-scoped: + - Failed notes target retries through `TraceNotesWriter`. + - Failed DB target retries through `TraceRecordStore` using metadata idempotency key (`dev.crocoder.sce.idempotency_key`) when present. 
+- Retry metrics are emitted per attempted replay through `RetryMetricsSink` with: + - `commit_sha` + - `trace_id` + - runtime histogram input (`runtime_ms`) + - `error_class` (from `PersistenceFailure.class` when writes fail) + - remaining failed targets. + +## Reconciliation metrics contract +- Reconciliation mapping metrics entrypoint: `cli/src/services/hosted_reconciliation.rs` -> `summarize_reconciliation_metrics`. +- Snapshot shape (`ReconciliationMetricsSnapshot`) tracks: + - `mapped` / `unmapped` counts + - confidence histogram buckets (`high`/`medium`/`low`) + - run runtime (`runtime_ms`) + - optional error class (`signature`, `payload`, `store`). +- Error-class normalization helper: `classify_reconciliation_error`. + +## Persistence schema additions +- `cli/src/services/local_db.rs` core migrations now include: + - `trace_retry_queue` (DB-first fallback queue storage) + - `reconciliation_metrics` (runtime metric snapshots) +- Added indexes: + - `idx_trace_retry_queue_created_at` + - `idx_reconciliation_metrics_created_at` + +## Verification evidence +- `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_recovers_failed_notes_write_and_emits_success_metric` +- `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_requeues_when_db_write_still_fails` +- `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation::tests::reconciliation_metrics_capture_mapped_unmapped_histogram_runtime_and_error_class` +- `cargo test --manifest-path cli/Cargo.toml hosted_reconciliation::tests::reconciliation_error_classification_labels_signature_and_payload_failures` +- `cargo test --manifest-path cli/Cargo.toml local_db::tests::core_schema_migrations_create_required_tables_and_indexes` +- `cargo build --manifest-path cli/Cargo.toml` From a51d5929df267428802dab93b6b9be502128891c Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 16:07:55 +0100 Subject: [PATCH 16/39] agent-trace: mark 
agent-trace-attribution-no-git-wrapper as done --- .../agent-trace-attribution-no-git-wrapper.md | 20 ++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/context/plans/agent-trace-attribution-no-git-wrapper.md b/context/plans/agent-trace-attribution-no-git-wrapper.md index 7d04fc5a..75bd2938 100644 --- a/context/plans/agent-trace-attribution-no-git-wrapper.md +++ b/context/plans/agent-trace-attribution-no-git-wrapper.md @@ -228,7 +228,7 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - `cargo test --manifest-path cli/Cargo.toml local_db::tests::core_schema_migrations_create_required_tables_and_indexes` - `cargo build --manifest-path cli/Cargo.toml` -- [ ] T15: Validation and cleanup (status:todo) +- [x] T15: Validation and cleanup (status:done) - Task ID: T15 - Goal: Run full-system validation, sync context/docs, and leave implementation evidence for handoff. - Boundaries (in/out of scope): @@ -241,6 +241,24 @@ Implement a no-git-wrapper attribution platform that preserves normal developer - End-to-end scenario runbook with idempotent replay and confidence policy validation. - Agent Trace compliance test report covering required fields, formats, nesting, enum constraints, and MIME expectations. - Context sync review across architecture/overview/glossary/patterns to match resulting code truth. 
+ - Execution evidence (2026-03-04): + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` (exit 0) + - `cargo test --manifest-path cli/Cargo.toml` (exit 0; 93 passed, 0 failed) + - `cargo build --manifest-path cli/Cargo.toml` (exit 0; existing non-fatal dead-code warnings in placeholder seams) + - `nix run .#pkl-check-generated` (exit 0; generated outputs up to date) + - `nix flake check` (exit 0; `cli-setup-command-surface` evaluated/built successfully) + - Context sync verification completed for `context/overview.md`, `context/architecture.md`, `context/glossary.md`, `context/patterns.md`, and `context/context-map.md`; no drift found, so no content edits required. + - Failed checks and follow-ups: + - None. Validation commands exited successfully. + - Success-criteria verification summary: + - Agent Trace schema/required-field/format compliance is covered by passing `services::agent_trace` tests including `builder_output_passes_agent_trace_schema_validation` and `builder_output_rejects_invalid_uri_and_timestamp_formats`. + - Commit finalization, notes+DB persistence, MIME expectations, and idempotency are covered by passing `services::hooks::tests::post_commit_finalization_*` tests. + - Local rewrite remap ingestion and rewrite metadata/quality mapping are covered by passing `services::hooks::tests::post_rewrite_finalization_*` and `rewrite_trace_finalization_*` tests. + - Hosted intake idempotency/signature verification and deterministic rewrite mapping outcomes are covered by passing `services::hosted_reconciliation::tests::*` intake/mapping/metrics tests. + - Canonical co-author policy behavior is covered by passing `services::hooks::tests::commit_msg_policy_*` tests. + - Persistence schema coverage for trace storage, ranges, reconciliation runs/mappings, and retry/metrics tables is covered by passing `services::local_db::tests::core_schema_migrations_*` and reconciliation schema tests. 
+ - Residual risks: + - Existing documented RFC version-pattern ambiguity remains tracked in Open Questions and is unchanged by this task. ## 5) Open questions - Agent Trace RFC page shows a potential version-format mismatch (`version` schema pattern appears two-segment while examples and document header use `0.1.0`); implementation currently plans to emit `0.1.0` and keep parser tolerant. From b55a63d745343100cd10b4629587ea660bcdc0c5 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 23:24:20 +0100 Subject: [PATCH 17/39] sce: Define setup git-hooks install contract Document the canonical `sce setup --hooks` behavior for path resolution, idempotent outcomes, backup/rollback guarantees, and failure diagnostics so T02-T05 have a stable implementation target. --- context/context-map.md | 1 + context/plans/sce-setup-githooks-any-repo.md | 90 ++++++++++++++ .../sce/setup-githooks-install-contract.md | 111 ++++++++++++++++++ 3 files changed, 202 insertions(+) create mode 100644 context/plans/sce-setup-githooks-any-repo.md create mode 100644 context/sce/setup-githooks-install-contract.md diff --git a/context/context-map.md b/context/context-map.md index 60aa762f..39504212 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -23,6 +23,7 @@ Feature/domain context: - `context/sce/agent-trace-commit-msg-coauthor-policy.md` (T05 commit-msg canonical co-author trailer policy with env-gated injection and idempotent dedupe) - `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) - `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) +- `context/sce/setup-githooks-install-contract.md` (T01 canonical `sce setup --hooks` install contract for target-path resolution, idempotent outcomes, backup/rollback, and doctor-readiness 
alignment) - `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) - `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) - `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) diff --git a/context/plans/sce-setup-githooks-any-repo.md b/context/plans/sce-setup-githooks-any-repo.md new file mode 100644 index 00000000..abaa74e7 --- /dev/null +++ b/context/plans/sce-setup-githooks-any-repo.md @@ -0,0 +1,90 @@ +# Plan: sce-setup-githooks-any-repo +## 1) Change summary +Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `commit-msg`, `post-commit`) for any target repository with deterministic, idempotent behavior and safe failure handling. +## 2) Success criteria +- One command installs required hooks for arbitrary repositories without manual chmod or git config steps. +- Re-running setup is idempotent and reports deterministic installed/skipped/updated outcomes. +- Existing hooks are backed up or safely merged according to the defined install contract. +- `sce doctor` reports `ready` immediately after successful setup for supported hook-path configurations. +## 3) Constraints and non-goals +- In scope: setup contract definition, embedded hook asset packaging, setup install orchestration, CLI flags/UX, doctor integration, and verification coverage. +- In scope: repo-local default hooks path and configured custom `core.hooksPath` handling. +- In scope: backup-and-replace semantics with non-destructive rollback on install failure. 
+- Out of scope: introducing new hook types beyond `pre-commit`, `commit-msg`, and `post-commit`. +- Out of scope: changing Agent Trace payload semantics unrelated to hook installation lifecycle. +- Non-goal: requiring manual operator intervention for normal setup/upgrade paths. +## 4) Task stack (T01..T06) +- [x] T01: Define setup hook-install contract (status:done) + - Task ID: T01 + - Goal: Establish canonical `sce setup` hook-install contract covering target path resolution, idempotency rules, backup/rollback behavior, and failure diagnostics. + - Boundaries (in/out of scope): + - In: repo-local vs custom `core.hooksPath` resolution policy, hook ownership/update policy, CLI UX contract, failure policy, and diagnostic vocabulary. + - Out: implementation of file writes or CLI parser wiring. + - Done when: + - Contract document defines expected behavior for fresh install, upgrade, existing hook preservation/merge or backup strategy, and rollback guarantees. + - Contract includes deterministic user-facing outcomes and actionable failure diagnostics. + - Verification notes (commands or checks): + - Documentation parity review against existing `setup` + `doctor` behavior and planned acceptance tests. + - Verification evidence: + - Added canonical contract document at `context/sce/setup-githooks-install-contract.md` covering target-path resolution, deterministic per-hook outcomes (`installed`/`updated`/`skipped`), backup-and-replace semantics, rollback guarantees, and actionable failure diagnostics. + - Linked the new contract in `context/context-map.md` so setup-hook behavior remains discoverable for follow-on implementation tasks. + - Performed parity review against current setup/doctor boundaries (`cli/src/services/setup.rs` and `cli/src/services/doctor.rs`) to keep T01 contract language aligned with existing command semantics and T02-T05 verification targets. 
+- [ ] T02: Implement hook asset packaging for setup (status:todo) + - Task ID: T02 + - Goal: Embed canonical hook templates with deterministic paths/content so setup can install hooks without runtime config reads. + - Boundaries (in/out of scope): + - In: compile-time asset inclusion, path normalization, deterministic content accessors, and target-hook mapping. + - Out: runtime hook install orchestration and CLI output formatting. + - Done when: + - Setup service can enumerate required hook assets and bytes from embedded sources only. + - Asset manifest behavior is deterministic across builds for unchanged inputs. + - Verification notes (commands or checks): + - Unit tests for manifest completeness, normalized paths, and stable lookup semantics. +- [ ] T03: Implement `sce setup` hook installation flow (status:todo) + - Task ID: T03 + - Goal: Add idempotent hook install/update orchestration that writes required hooks, preserves executable bits, and performs safe backup-and-replace with rollback on failure. + - Boundaries (in/out of scope): + - In: hook target path resolution, write/update decisions, permission preservation, backup creation, failure rollback, and upgrade-path behavior. + - Out: doctor readiness reporting and CLI flag contract expansion. + - Done when: + - Fresh install and upgrade paths work for arbitrary repositories under repo-local and custom hook-path configurations. + - Failure paths restore prior hook state non-destructively when replacement cannot complete. + - Verification notes (commands or checks): + - Service/integration tests for install, re-run idempotency, upgrade, and rollback scenarios. +- [ ] T04: Add CLI flags and UX for hook setup (status:todo) + - Task ID: T04 + - Goal: Add `sce setup --hooks` (and optional `--repo `) with deterministic output and compatible option validation. 
+ - Boundaries (in/out of scope): + - In: parser/validation updates, command dispatch wiring, deterministic installed/skipped/updated messaging, and option compatibility rules. + - Out: expanding setup to unrelated install domains or interactive redesign outside hook scope. + - Done when: + - Flag combinations validate deterministically with actionable errors for invalid mixes. + - Successful runs emit clear per-hook outcomes suitable for humans and automation logs. + - Verification notes (commands or checks): + - Command-surface tests for parsing, invalid combinations, and stable message snapshots. +- [ ] T05: Integrate with doctor and add verification tests (status:todo) + - Task ID: T05 + - Goal: Ensure `sce doctor` reports ready after successful hook setup and add targeted test coverage for missing/misconfigured/existing hooks plus idempotent re-run behavior. + - Boundaries (in/out of scope): + - In: doctor readiness checks alignment with setup contract, focused tests for hook states, and verification harness updates. + - Out: unrelated doctor domains or non-hook readiness policies. + - Done when: + - Post-setup doctor output reports ready across supported hook-path modes. + - Targeted test suite covers missing/misconfigured/existing hooks and idempotent reruns. + - Verification notes (commands or checks): + - `cargo test` (targeted hook/setup/doctor slices), `cargo fmt --check`, and `cargo build` from `cli/`. +- [ ] T06: Validation and cleanup (status:todo) + - Task ID: T06 + - Goal: Run end-to-end validation, ensure cleanup of temporary artifacts, and confirm code/context alignment for this plan scope. + - Boundaries (in/out of scope): + - In: final verification pass, deterministic result capture, and context sync confirmation for any behavior changes. + - Out: net-new feature work outside the approved task stack. + - Done when: + - All success criteria are satisfied with verification evidence. 
+ - Temporary artifacts are removed or documented with retention rationale. + - No known context drift remains for changed setup/doctor hook behavior. + - Verification notes (commands or checks): + - `cargo fmt --check && cargo build && cargo test` (from `cli/`). + - Focused `sce setup --hooks` fresh + rerun checks and post-setup `sce doctor` readiness checks. +## 5) Open questions +- None. diff --git a/context/sce/setup-githooks-install-contract.md b/context/sce/setup-githooks-install-contract.md new file mode 100644 index 00000000..e2b77416 --- /dev/null +++ b/context/sce/setup-githooks-install-contract.md @@ -0,0 +1,111 @@ +# SCE setup git-hooks install contract + +## Scope + +Task `sce-setup-githooks-any-repo` `T01` defines the canonical behavior contract for git-hook setup via `sce setup`. +This document is the implementation target for T02-T05. + +In scope for this contract: + +- target repository and hooks-path resolution policy +- required hook ownership and idempotent update rules +- backup-and-replace lifecycle with rollback guarantees +- deterministic outcome vocabulary and failure diagnostics +- `sce doctor` readiness alignment after successful install + +Out of scope for this contract task: + +- runtime implementation details of file writes +- CLI parser wiring and final flag surface implementation + +## Command surface contract + +- Canonical operator command: `sce setup --hooks` +- Optional explicit repository target: `sce setup --hooks --repo ` +- Default repository target: current working repository when `--repo` is omitted + +`--hooks` mode installs and manages exactly three required hooks: + +- `pre-commit` +- `commit-msg` +- `post-commit` + +No additional hook types are installed by this workflow. + +## Target path resolution + +For a selected target repository, setup resolves effective hook destination using git truth: + +1. repository root (`git rev-parse --show-toplevel`) +2. effective hooks path (`git rev-parse --git-path hooks`) +3. 
hook-path source classification via config checks: + - default (`.git/hooks`) + - per-repo `core.hooksPath` + - global `core.hooksPath` + +Install behavior must write required hooks into the effective hooks directory returned by git, not by path guessing. + +## Hook ownership and idempotency rules + +Each required hook has one canonical SCE-managed payload. + +Per hook, setup reports exactly one deterministic outcome: + +- `installed`: hook was missing and is now present +- `updated`: hook existed and was replaced with newer canonical content +- `skipped`: hook already matched canonical content + +Re-running setup with unchanged canonical assets must be idempotent and produce `skipped` for all already-synced hooks. + +## Preservation and backup policy + +When setup needs to replace an existing hook file, it must preserve prior state through backup-and-replace: + +- create a deterministic backup before replacement +- perform replacement through a staged write/swap flow +- preserve executable permissions required by git hooks + +Backups are considered part of successful update/install reporting and must be discoverable in command output. + +## Rollback guarantees + +If replacement fails after backup creation but before successful finalization, setup must: + +- restore the prior hook content from backup +- restore prior permissions when restoration is possible +- clean temporary staged artifacts used for failed replacement +- report failure with explicit rollback status + +Partial writes that leave required hooks in unknown state are not allowed for successful exits. + +## Failure diagnostics contract + +Failure output must be actionable and deterministic. 
Diagnostics should identify: + +- repository resolution failures (not a git repo, inaccessible repo) +- effective hooks-path resolution failures +- filesystem write/permission failures +- backup creation failures +- rollback success/failure status when recovery is attempted + +Diagnostics should include affected hook name and target path whenever available. + +## Doctor alignment contract + +After successful `sce setup --hooks`, `sce doctor` should report `ready` for supported hook-path modes when no external modifications occur between setup and doctor runs. + +Supported modes for this alignment: + +- default hooks path +- per-repo `core.hooksPath` +- global `core.hooksPath` + +## Verification targets for downstream tasks + +T02-T05 implementation and tests must verify this contract across: + +- fresh install in empty hook directories +- rerun idempotency with unchanged assets +- upgrade path from older/non-canonical hook content +- rollback behavior under injected replacement failures +- post-setup `sce doctor` readiness From 835f78972ea9e3e1c2043bd2e04ed3faf18c31f5 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 23:37:04 +0100 Subject: [PATCH 18/39] setup: Implement embedded git-hook asset packaging for setup Embed canonical pre-commit, commit-msg, and post-commit templates at compile time and expose deterministic setup-service accessors for required-hook lookup and iteration. Add focused tests that verify completeness, normalized/sorted manifest paths, and stable hook resolution. 
---
 cli/assets/hooks/commit-msg | 4 ++
 cli/assets/hooks/post-commit | 4 ++
 cli/assets/hooks/pre-commit | 4 ++
 cli/build.rs | 5 ++
 cli/src/services/setup.rs | 72 +++++++++++++++++--
 context/architecture.md | 2 +-
 context/context-map.md | 1 +
 context/glossary.md | 3 +-
 context/overview.md | 4 +-
 context/patterns.md | 2 +-
 context/plans/sce-setup-githooks-any-repo.md | 6 +-
 .../setup-githooks-hook-asset-packaging.md | 37 ++++++++++
 12 files changed, 135 insertions(+), 9 deletions(-)
 create mode 100644 cli/assets/hooks/commit-msg
 create mode 100644 cli/assets/hooks/post-commit
 create mode 100644 cli/assets/hooks/pre-commit
 create mode 100644 context/sce/setup-githooks-hook-asset-packaging.md

diff --git a/cli/assets/hooks/commit-msg b/cli/assets/hooks/commit-msg
new file mode 100644
index 00000000..66bbf7e1
--- /dev/null
+++ b/cli/assets/hooks/commit-msg
@@ -0,0 +1,4 @@
+#!/bin/sh
+set -eu
+
+exec sce hooks commit-msg "$@"
diff --git a/cli/assets/hooks/post-commit b/cli/assets/hooks/post-commit
new file mode 100644
index 00000000..0ad697aa
--- /dev/null
+++ b/cli/assets/hooks/post-commit
@@ -0,0 +1,4 @@
+#!/bin/sh
+set -eu
+
+exec sce hooks post-commit "$@"
diff --git a/cli/assets/hooks/pre-commit b/cli/assets/hooks/pre-commit
new file mode 100644
index 00000000..42c5cb0e
--- /dev/null
+++ b/cli/assets/hooks/pre-commit
@@ -0,0 +1,4 @@
+#!/bin/sh
+set -eu
+
+exec sce hooks pre-commit "$@"
diff --git a/cli/build.rs b/cli/build.rs
index c87d3964..b32f9f5f 100644
--- a/cli/build.rs
+++ b/cli/build.rs
@@ -15,6 +15,11 @@ const TARGETS: &[TargetSpec] = &[
         relative_root: "config/.claude",
         include_prefix: "/../config/.claude/",
     },
+    TargetSpec {
+        const_name: "HOOK_EMBEDDED_ASSETS",
+        relative_root: "cli/assets/hooks",
+        include_prefix: "/assets/hooks/",
+    },
 ];
 
 struct TargetSpec {
diff --git a/cli/src/services/setup.rs b/cli/src/services/setup.rs
index 44d3d9ab..38a37ed3 100644
--- a/cli/src/services/setup.rs
+++ b/cli/src/services/setup.rs
@@ -22,8 +22,31 @@ pub struct EmbeddedAsset {
     pub bytes: &'static [u8],
 }
 
+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
+pub enum RequiredHookAsset {
+    PreCommit,
+    CommitMsg,
+    PostCommit,
+}
+
 include!(concat!(env!("OUT_DIR"), "/setup_embedded_assets.rs"));
 
+pub fn iter_required_hook_assets() -> std::slice::Iter<'static, EmbeddedAsset> {
+    HOOK_EMBEDDED_ASSETS.iter()
+}
+
+pub fn get_required_hook_asset(hook: RequiredHookAsset) -> Option<&'static EmbeddedAsset> {
+    let hook_name = match hook {
+        RequiredHookAsset::PreCommit => "pre-commit",
+        RequiredHookAsset::CommitMsg => "commit-msg",
+        RequiredHookAsset::PostCommit => "post-commit",
+    };
+
+    HOOK_EMBEDDED_ASSETS
+        .iter()
+        .find(|asset| asset.relative_path == hook_name)
+}
+
 pub enum EmbeddedAssetSelectionIter {
     One(std::slice::Iter<'static, EmbeddedAsset>),
     Both(
@@ -523,10 +546,11 @@ mod tests {
     use anyhow::Result;
 
     use super::{
-        install_embedded_setup_assets, install_embedded_setup_assets_with_rename,
-        iter_embedded_assets_for_setup_target, parse_setup_cli_options, resolve_setup_dispatch,
-        resolve_setup_mode, run_setup_for_mode, setup_usage_text, SetupCliOptions, SetupDispatch,
-        SetupMode, SetupTarget,
+        get_required_hook_asset, install_embedded_setup_assets,
+        install_embedded_setup_assets_with_rename, iter_embedded_assets_for_setup_target,
+        iter_required_hook_assets, parse_setup_cli_options, resolve_setup_dispatch,
+        resolve_setup_mode, run_setup_for_mode, setup_usage_text, RequiredHookAsset,
+        SetupCliOptions, SetupDispatch, SetupMode, SetupTarget,
     };
 
     #[derive(Clone, Copy, Debug)]
@@ -722,6 +746,38 @@ mod tests {
         assert_eq!(iter_both_count, opencode_count + claude_count);
     }
 
+    #[test]
+    fn embedded_hook_manifest_is_complete_sorted_and_normalized() {
+        let hooks: Vec<&super::EmbeddedAsset> = iter_required_hook_assets().collect();
+        let paths: Vec<&str> = hooks.iter().map(|asset| asset.relative_path).collect();
+
+        assert_eq!(paths, vec!["commit-msg", "post-commit", "pre-commit"]);
+
+        for hook in hooks {
+            assert!(!hook.relative_path.is_empty());
+            assert!(!hook.relative_path.contains('/'));
+            assert!(!hook.relative_path.contains('\\'));
+            assert!(!hook.bytes.is_empty());
+            assert!(
+                hook.bytes.starts_with(b"#!/bin/sh\n"),
+                "embedded hook should start with shell shebang"
+            );
+        }
+    }
+
+    #[test]
+    fn required_hook_lookup_resolves_each_canonical_hook() {
+        for hook in [
+            RequiredHookAsset::PreCommit,
+            RequiredHookAsset::CommitMsg,
+            RequiredHookAsset::PostCommit,
+        ] {
+            let asset = get_required_hook_asset(hook).expect("required hook asset should exist");
+            assert_eq!(asset.relative_path, hook_filename(hook));
+            assert!(!asset.bytes.is_empty());
+        }
+    }
+
     #[test]
     fn install_engine_replaces_existing_target_with_backup() -> Result<()> {
         let temp = TestTempDir::new("sce-setup-install-tests")?;
@@ -887,4 +943,12 @@ mod tests {
 
         Ok(())
     }
+
+    fn hook_filename(hook: RequiredHookAsset) -> &'static str {
+        match hook {
+            RequiredHookAsset::PreCommit => "pre-commit",
+            RequiredHookAsset::CommitMsg => "commit-msg",
+            RequiredHookAsset::PostCommit => "post-commit",
+        }
+    }
 }
diff --git a/context/architecture.md b/context/architecture.md
index 06a98ecb..63a3abc9 100644
--- a/context/architecture.md
+++ b/context/architecture.md
@@ -75,7 +75,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`.
 - `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status.
- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`), reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), and T14 retry/observability storage (`trace_retry_queue`, `reconciliation_metrics`) with replay/query indexes. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. -- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators) generated by `cli/build.rs` from `config/.opencode/**` and `config/.claude/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. +- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators, required-hook asset iterators/lookups) generated by `cli/build.rs` from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. 
- `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. diff --git a/context/context-map.md b/context/context-map.md index 39504212..1fb84c4d 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -24,6 +24,7 @@ Feature/domain context: - `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) - `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) - `context/sce/setup-githooks-install-contract.md` (T01 canonical `sce setup --hooks` install contract for target-path resolution, idempotent outcomes, backup/rollback, and doctor-readiness alignment) +- `context/sce/setup-githooks-hook-asset-packaging.md` (T02 compile-time `sce setup --hooks` required-hook template packaging contract and setup-service accessor surface) - `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) - `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA 
trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) - `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) diff --git a/context/glossary.md b/context/glossary.md index b24c9bae..5e96ac2d 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -29,7 +29,8 @@ - `setup mode contract`: `cli/src/services/setup.rs` model where `SetupMode::Interactive` is the default and `SetupMode::NonInteractive(SetupTarget)` is selected only when exactly one target flag is provided. - `setup interactive target prompt`: `inquire::Select` flow in `cli/src/services/setup.rs` (`InquireSetupTargetPrompter`) that presents OpenCode, Claude, and Both when `sce setup` runs without target flags. - `setup dispatch outcome`: Execution model in `cli/src/services/setup.rs` (`SetupDispatch`) where setup either proceeds with a selected/non-interactive target or exits as cancelled without file changes. -- `setup embedded asset manifest`: Compile-time generated file index emitted by `cli/build.rs` into `OUT_DIR/setup_embedded_assets.rs`, embedding bytes from `config/.opencode/**` and `config/.claude/**` as deterministic normalized relative-path entries consumed by `cli/src/services/setup.rs`. +- `setup embedded asset manifest`: Compile-time generated file index emitted by `cli/build.rs` into `OUT_DIR/setup_embedded_assets.rs`, embedding bytes from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` as deterministic normalized relative-path entries consumed by `cli/src/services/setup.rs`. +- `setup required-hook embedded assets`: Setup-service accessors in `cli/src/services/setup.rs` (`iter_required_hook_assets`, `get_required_hook_asset`) that expose canonical embedded templates for `pre-commit`, `commit-msg`, and `post-commit` without runtime config reads. 
- `setup install engine`: Installer in `cli/src/services/setup.rs` (`install_embedded_setup_assets`) that writes embedded setup assets into per-target staging directories and swaps them into repository-root `.opencode/`/`.claude/` destinations. - `setup backup-and-replace`: Replacement choreography in `cli/src/services/setup.rs` where existing install targets are renamed to unique `.backup` paths before staged content is promoted; on swap failure, the engine restores the original target from backup and cleans temporary staging paths. - `MCP capability snapshot`: Placeholder capability model in `cli/src/services/mcp.rs` that captures planned file-cache transport/tool contracts (`cache-put`, `cache-get`) and cache policy defaults without enabling runtime MCP execution. diff --git a/context/overview.md b/context/overview.md index 1689a6b9..9d8ab89a 100644 --- a/context/overview.md +++ b/context/overview.md @@ -8,7 +8,7 @@ The crate ships onboarding and usage documentation at `cli/README.md` that refle The CLI crate currently enforces a minimal dependency contract: `anyhow`, `inquire`, `lexopt`, `tokio`, and `turso`. Its command loop is implemented with `lexopt` argument parsing and `anyhow` error handling, with real setup orchestration, implemented `doctor` rollout validation, and placeholder dispatch for deferred commands through explicit service contracts. The `setup` command includes an `inquire`-backed target-selection flow: default interactive selection for OpenCode/Claude/both, explicit non-interactive target flags (`--opencode`, `--claude`, `--both`), deterministic mutually-exclusive validation, and non-destructive cancellation exits. -The CLI now compiles an embedded setup asset manifest from `config/.opencode/**` and `config/.claude/**` via `cli/build.rs`; `cli/src/services/setup.rs` exposes deterministic normalized relative paths plus file bytes and target-scoped iteration without runtime reads from `config/`. 
+The CLI now compiles an embedded setup asset manifest from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` via `cli/build.rs`; `cli/src/services/setup.rs` exposes deterministic normalized relative paths plus file bytes and target-scoped iteration without runtime reads from `config/`. The setup service also provides repository-root install orchestration: it resolves interactive or flag-based target selection, installs embedded assets, and reports deterministic completion details (selected target(s), installed file counts, and backup actions). The `doctor` command now validates Agent Trace local rollout readiness by resolving effective git hook-path source (default, per-repo `core.hooksPath`, or global `core.hooksPath`) and checking required hook presence/executable permissions with actionable diagnostics. The `mcp` placeholder contract is now scoped to future file-cache workflows (`cache-put`/`cache-get`) and remains intentionally non-runnable. @@ -37,6 +37,7 @@ The local DB service now also includes reconciliation persistence schema coverag The CLI now also includes a hosted event intake/orchestration seam in `cli/src/services/hosted_reconciliation.rs` that verifies provider signatures, resolves old/new commit heads from GitHub/GitLab payloads, and creates deterministic replay-safe reconciliation run requests; this behavior is documented in `context/sce/agent-trace-hosted-event-intake-orchestration.md`. The hosted reconciliation service now also includes a deterministic rewrite mapping engine (`map_rewritten_commit`) that resolves old->new commit identity using patch-id exact precedence, then range-diff hints, then fuzzy fallback with a `>= 0.60` mapping threshold and explicit ambiguous/unmatched/low-confidence unresolved outcomes; this behavior is documented in `context/sce/agent-trace-rewrite-mapping-engine.md`. 
The hooks service now also includes retry-queue replay processing (`process_trace_retry_queue`) with per-attempt runtime/error-class metric emission, and the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. +The setup service now also exposes deterministic required-hook embedded asset accessors (`iter_required_hook_assets`, `get_required_hook_asset`) backed by canonical templates in `cli/assets/hooks/` for `pre-commit`, `commit-msg`, and `post-commit`; this behavior is documented in `context/sce/setup-githooks-hook-asset-packaging.md`. ## Repository model @@ -101,3 +102,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-hosted-event-intake-orchestration.md` for the implemented T12 hosted intake contract (GitHub/GitLab signature verification, old/new head resolution, deterministic reconciliation-run idempotency keys, and replay-safe run insertion outcomes). - Use `context/sce/agent-trace-rewrite-mapping-engine.md` for the implemented T13 hosted mapping engine contract (patch-id exact matching, range-diff/fuzzy scoring precedence, confidence thresholds, and deterministic unresolved handling). - Use `context/sce/agent-trace-retry-queue-observability.md` for the implemented T14 retry replay contract (notes/DB target-scoped recovery, per-attempt runtime/error-class metrics, reconciliation mapped/unmapped + confidence histogram snapshots, and DB-first retry/metrics schema additions). +- Use `context/sce/setup-githooks-hook-asset-packaging.md` for the implemented `sce-setup-githooks-any-repo` T02 compile-time hook-template packaging contract and setup-service required-hook embedded accessor surface. 
diff --git a/context/patterns.md b/context/patterns.md index 3157a4ed..24fec7e9 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -75,7 +75,7 @@ - For setup-style command contracts, keep interactive mode as the zero-flag default and enforce mutually-exclusive explicit target flags for non-interactive automation. - For interactive setup flows, isolate prompt handling behind a service-layer prompter seam so selection mapping and cancellation behavior can be tested without a live TTY. - Treat setup prompt cancellation/interrupt as a non-destructive exit path with explicit user messaging (no file mutations and no partial side effects). -- For setup install prep, generate compile-time embedded asset manifests from `config/.opencode/**` and `config/.claude/**` in `cli/build.rs`, keep relative paths normalized to forward-slash form, and expose target-scoped iterators from the setup service layer for installer wiring. +- For setup install prep, generate compile-time embedded asset manifests from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` in `cli/build.rs`, keep relative paths normalized to forward-slash form, and expose target-scoped iterators/lookups from the setup service layer for installer wiring. - For setup install execution, write selected embedded assets into a per-target staging directory first, then swap into repository-root `.opencode/`/`.claude/` with backup-and-replace semantics; when swap fails after backup creation, restore the original target path from backup and clean staging directories. - For setup command messaging, emit deterministic completion output that includes selected target(s), per-target install counts, and whether backup was created. - Keep module seams for future domains present and compile-safe even when behavior is deferred. 
diff --git a/context/plans/sce-setup-githooks-any-repo.md b/context/plans/sce-setup-githooks-any-repo.md index abaa74e7..78dccb02 100644 --- a/context/plans/sce-setup-githooks-any-repo.md +++ b/context/plans/sce-setup-githooks-any-repo.md @@ -29,7 +29,7 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Added canonical contract document at `context/sce/setup-githooks-install-contract.md` covering target-path resolution, deterministic per-hook outcomes (`installed`/`updated`/`skipped`), backup-and-replace semantics, rollback guarantees, and actionable failure diagnostics. - Linked the new contract in `context/context-map.md` so setup-hook behavior remains discoverable for follow-on implementation tasks. - Performed parity review against current setup/doctor boundaries (`cli/src/services/setup.rs` and `cli/src/services/doctor.rs`) to keep T01 contract language aligned with existing command semantics and T02-T05 verification targets. -- [ ] T02: Implement hook asset packaging for setup (status:todo) +- [x] T02: Implement hook asset packaging for setup (status:done) - Task ID: T02 - Goal: Embed canonical hook templates with deterministic paths/content so setup can install hooks without runtime config reads. - Boundaries (in/out of scope): @@ -40,6 +40,10 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Asset manifest behavior is deterministic across builds for unchanged inputs. - Verification notes (commands or checks): - Unit tests for manifest completeness, normalized paths, and stable lookup semantics. + - Verification evidence: + - Added canonical hook templates at `cli/assets/hooks/pre-commit`, `cli/assets/hooks/commit-msg`, and `cli/assets/hooks/post-commit` with deterministic `sce hooks ` entrypoints for setup packaging. 
+ - Extended compile-time embedded manifest generation in `cli/build.rs` with `HOOK_EMBEDDED_ASSETS`, keeping deterministic sorted path normalization for hook assets without runtime config reads. + - Added setup service hook-asset accessors in `cli/src/services/setup.rs` (`iter_required_hook_assets`, `get_required_hook_asset`) plus coverage for manifest completeness, normalization, ordering, and canonical hook lookup semantics. - [ ] T03: Implement `sce setup` hook installation flow (status:todo) - Task ID: T03 - Goal: Add idempotent hook install/update orchestration that writes required hooks, preserves executable bits, and performs safe backup-and-replace with rollback on failure. diff --git a/context/sce/setup-githooks-hook-asset-packaging.md b/context/sce/setup-githooks-hook-asset-packaging.md new file mode 100644 index 00000000..3a4d8a50 --- /dev/null +++ b/context/sce/setup-githooks-hook-asset-packaging.md @@ -0,0 +1,37 @@ +# SCE setup git-hooks embedded asset packaging + +## Scope + +Task `sce-setup-githooks-any-repo` `T02` defines how required git-hook templates are packaged for `sce setup --hooks` without runtime reads from `config/`. + +## Canonical embedded hook sources + +`cli/build.rs` now embeds three canonical hook templates from `cli/assets/hooks/`: + +- `pre-commit` +- `commit-msg` +- `post-commit` + +These templates are emitted into `OUT_DIR/setup_embedded_assets.rs` as `HOOK_EMBEDDED_ASSETS` with deterministic sorted relative paths. 
+ +## Setup-service accessor surface + +`cli/src/services/setup.rs` exposes hook-template access through: + +- `iter_required_hook_assets()` for deterministic full-set iteration +- `get_required_hook_asset(RequiredHookAsset)` for stable per-hook lookup + +`RequiredHookAsset` is the canonical hook mapping enum for this packaging layer: + +- `PreCommit` +- `CommitMsg` +- `PostCommit` + +## Determinism and validation + +Packaging determinism is enforced by setup tests in `cli/src/services/setup.rs`: + +- `embedded_hook_manifest_is_complete_sorted_and_normalized` +- `required_hook_lookup_resolves_each_canonical_hook` + +These tests verify manifest completeness (exactly three required hooks), normalized relative paths, sorted ordering, and stable hook lookup semantics. From f312cbf55fb064e21deafecd53869055de8c48c1 Mon Sep 17 00:00:00 2001 From: David Abram Date: Wed, 4 Mar 2026 23:54:25 +0100 Subject: [PATCH 19/39] setup: Implement required git-hook install flow with idempotent outcomes and rollback Resolve hooks destination from git truth (rev-parse --show-toplevel, rev-parse --git-path hooks) and install canonical required hooks with deterministic Installed/Updated/Skipped status. Add staged-swap backup/restore safety, executable-bit enforcement, targeted setup-service tests for default/custom hook paths plus rollback failures, and sync T03 context/plan state to current behavior. 
--- cli/src/services/setup.rs | 473 ++++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + context/plans/sce-setup-githooks-any-repo.md | 6 +- context/sce/setup-githooks-install-flow.md | 63 +++ 8 files changed, 545 insertions(+), 4 deletions(-) create mode 100644 context/sce/setup-githooks-install-flow.md diff --git a/cli/src/services/setup.rs b/cli/src/services/setup.rs index 38a37ed3..ee1122be 100644 --- a/cli/src/services/setup.rs +++ b/cli/src/services/setup.rs @@ -4,6 +4,7 @@ use lexopt::{Arg, ValueExt}; use std::{ fs, io, path::{Component, Path, PathBuf}, + process::Command, time::{SystemTime, UNIX_EPOCH}, }; @@ -172,6 +173,312 @@ pub struct SetupInstallOutcome { pub target_results: Vec, } +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub enum RequiredHookInstallStatus { + Installed, + Updated, + Skipped, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RequiredHookInstallResult { + pub hook_name: String, + pub hook_path: PathBuf, + pub status: RequiredHookInstallStatus, + pub backup_path: Option<PathBuf>, +} + +#[derive(Clone, Debug, Eq, PartialEq)] +pub struct RequiredHooksInstallOutcome { + pub repository_root: PathBuf, + pub hooks_directory: PathBuf, + pub hook_results: Vec<RequiredHookInstallResult>, +} + +pub fn install_required_git_hooks(repository_root: &Path) -> Result<RequiredHooksInstallOutcome> { + install_required_git_hooks_with_rename(repository_root, |from, to| fs::rename(from, to)) +} + +fn install_required_git_hooks_with_rename<F>( + repository_root: &Path, + mut rename_fn: F, +) -> Result<RequiredHooksInstallOutcome> +where + F: FnMut(&Path, &Path) -> io::Result<()>, +{ + let resolved_repository_root = resolve_git_repository_root(repository_root)?; + let hooks_directory = resolve_git_hooks_directory(&resolved_repository_root)?; + fs::create_dir_all(&hooks_directory).with_context(|| { + format!( + "Failed to create git hooks directory '{}'", + hooks_directory.display() + ) + })?; + + let mut hook_results = Vec::new(); + for
hook_asset in iter_required_hook_assets() { + let hook_result = + install_single_required_hook_with_rename(&hooks_directory, hook_asset, &mut rename_fn)?; + hook_results.push(hook_result); + } + + Ok(RequiredHooksInstallOutcome { + repository_root: resolved_repository_root, + hooks_directory, + hook_results, + }) +} + +fn install_single_required_hook_with_rename<F>( + hooks_directory: &Path, + hook_asset: &EmbeddedAsset, + rename_fn: &mut F, +) -> Result<RequiredHookInstallResult> +where + F: FnMut(&Path, &Path) -> io::Result<()>, +{ + validate_embedded_relative_path(hook_asset.relative_path)?; + + let hook_path = hooks_directory.join(hook_asset.relative_path); + let existing_metadata = fs::metadata(&hook_path).ok(); + + if existing_metadata + .as_ref() + .is_some_and(|metadata| metadata.is_file()) + { + let existing_bytes = fs::read(&hook_path) + .with_context(|| format!("Failed to read existing hook '{}'", hook_path.display()))?; + let executable = is_executable_file(&hook_path)?; + + if existing_bytes == hook_asset.bytes && executable { + return Ok(RequiredHookInstallResult { + hook_name: hook_asset.relative_path.to_string(), + hook_path, + status: RequiredHookInstallStatus::Skipped, + backup_path: None, + }); + } + } else if existing_metadata.is_some() { + bail!( + "Existing hook target '{}' is not a file", + hook_path.display() + ); + } + + let hook_staging_path = create_hook_staging_path(hooks_directory, hook_asset.relative_path)?; + if let Err(error) = write_hook_payload_to_staging(&hook_staging_path, hook_asset.bytes) { + cleanup_path_if_exists(&hook_staging_path); + return Err(error); + } + + if existing_metadata.is_none() { + if let Err(error) = rename_fn(&hook_staging_path, &hook_path).with_context(|| { + format!( + "Failed to install required hook '{}' at '{}'", + hook_asset.relative_path, + hook_path.display() + ) + }) { + cleanup_path_if_exists(&hook_staging_path); + return Err(error); + } + + return Ok(RequiredHookInstallResult { + hook_name: hook_asset.relative_path.to_string(),
+ hook_path, + status: RequiredHookInstallStatus::Installed, + backup_path: None, + }); + } + + let backup_path = next_backup_path(&hook_path)?; + rename_fn(&hook_path, &backup_path).with_context(|| { + format!( + "Failed to back up existing hook '{}' to '{}'", + hook_path.display(), + backup_path.display() + ) + })?; + + if let Err(error) = rename_fn(&hook_staging_path, &hook_path).with_context(|| { + format!( + "Failed to update required hook '{}' at '{}'", + hook_asset.relative_path, + hook_path.display() + ) + }) { + cleanup_path_if_exists(&hook_staging_path); + + if !hook_path.exists() { + if let Err(restore_error) = rename_fn(&backup_path, &hook_path) { + return Err(error.context(format!( + "Rollback failed while restoring hook '{}' from backup '{}': {}", + hook_path.display(), + backup_path.display(), + restore_error + ))); + } + } + + return Err(error); + } + + Ok(RequiredHookInstallResult { + hook_name: hook_asset.relative_path.to_string(), + hook_path, + status: RequiredHookInstallStatus::Updated, + backup_path: Some(backup_path), + }) +} + +fn write_hook_payload_to_staging(staging_path: &Path, bytes: &[u8]) -> Result<()> { + fs::write(staging_path, bytes).with_context(|| { + format!( + "Failed to write staged hook payload '{}'", + staging_path.display() + ) + })?; + ensure_executable_permissions(staging_path)?; + Ok(()) +} + +fn create_hook_staging_path(hooks_directory: &Path, hook_name: &str) -> Result<PathBuf> { + let epoch_nanos = SystemTime::now() + .duration_since(UNIX_EPOCH) + .context("System clock is before UNIX_EPOCH")?
+ .as_nanos(); + let sanitized_hook_name = hook_name.replace('/', "-"); + + for attempt in 0..1000_u16 { + let candidate = hooks_directory.join(format!( + ".sce-hook-staging-{sanitized_hook_name}-{epoch_nanos}-{}-{attempt}", + std::process::id() + )); + + match fs::OpenOptions::new() + .create_new(true) + .write(true) + .open(&candidate) + { + Ok(_) => return Ok(candidate), + Err(error) if error.kind() == io::ErrorKind::AlreadyExists => continue, + Err(error) => { + return Err(error).with_context(|| { + format!( + "Failed to allocate hook staging file '{}'", + candidate.display() + ) + }); + } + } + } + + bail!( + "Could not allocate a unique hook staging file under '{}'", + hooks_directory.display() + ) +} + +fn resolve_git_repository_root(repository_root: &Path) -> Result<PathBuf> { + let repository_root_output = run_git_command_in_directory( + repository_root, + &["rev-parse", "--show-toplevel"], + "Failed to resolve repository root. Ensure '--repo' points to an accessible git repository.", + )?; + Ok(PathBuf::from(repository_root_output)) +} + +fn resolve_git_hooks_directory(repository_root: &Path) -> Result<PathBuf> { + let hooks_directory_output = run_git_command_in_directory( + repository_root, + &["rev-parse", "--git-path", "hooks"], + "Failed to resolve effective git hooks path.", + )?; + + let hooks_directory = PathBuf::from(&hooks_directory_output); + if hooks_directory.is_absolute() { + return Ok(hooks_directory); + } + + Ok(repository_root.join(hooks_directory)) +} + +fn run_git_command_in_directory( + repository_root: &Path, + args: &[&str], + context_message: &str, +) -> Result<String> { + let output = Command::new("git") + .args(args) + .current_dir(repository_root) + .output() + .with_context(|| { + format!( + "{} (directory: '{}')", + context_message, + repository_root.display() + ) + })?; + + if !output.status.success() { + let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string(); + let diagnostic = if stderr.is_empty() { + "git command exited with a
non-zero status".to_string() + } else { + stderr + }; + bail!("{} {}", context_message, diagnostic); + } + + let stdout = String::from_utf8(output.stdout) + .context("git command output contained invalid UTF-8")? + .trim() + .to_string(); + if stdout.is_empty() { + bail!("{} git command returned empty output", context_message); + } + + Ok(stdout) +} + +#[cfg(unix)] +fn ensure_executable_permissions(path: &Path) -> Result<()> { + use std::os::unix::fs::PermissionsExt; + + let metadata = fs::metadata(path) + .with_context(|| format!("Failed to read metadata for '{}'", path.display()))?; + let mut permissions = metadata.permissions(); + permissions.set_mode(permissions.mode() | 0o111); + fs::set_permissions(path, permissions).with_context(|| { + format!( + "Failed to set executable permissions for '{}'", + path.display() + ) + })?; + Ok(()) +} + +#[cfg(not(unix))] +fn ensure_executable_permissions(_path: &Path) -> Result<()> { + Ok(()) +} + +#[cfg(unix)] +fn is_executable_file(path: &Path) -> Result<bool> { + use std::os::unix::fs::PermissionsExt; + + let metadata = fs::metadata(path) + .with_context(|| format!("Failed to read metadata for '{}'", path.display()))?; + Ok(metadata.is_file() && metadata.permissions().mode() & 0o111 != 0) +} + +#[cfg(not(unix))] +fn is_executable_file(path: &Path) -> Result<bool> { + let metadata = fs::metadata(path) + .with_context(|| format!("Failed to read metadata for '{}'", path.display()))?; + Ok(metadata.is_file()) +} + pub fn install_embedded_setup_assets( repository_root: &Path, target: SetupTarget, @@ -540,6 +847,7 @@ mod tests { cell::Cell, fs, io, path::{Path, PathBuf}, + process::Command, }; use crate::test_support::TestTempDir; @@ -547,10 +855,11 @@ mod tests { use super::{ get_required_hook_asset, install_embedded_setup_assets, - install_embedded_setup_assets_with_rename, iter_embedded_assets_for_setup_target, + install_embedded_setup_assets_with_rename, install_required_git_hooks, + install_required_git_hooks_with_rename,
iter_embedded_assets_for_setup_target, iter_required_hook_assets, parse_setup_cli_options, resolve_setup_dispatch, resolve_setup_mode, run_setup_for_mode, setup_usage_text, RequiredHookAsset, - SetupCliOptions, SetupDispatch, SetupMode, SetupTarget, + RequiredHookInstallStatus, SetupCliOptions, SetupDispatch, SetupMode, SetupTarget, }; #[derive(Clone, Copy, Debug)] @@ -881,6 +1190,166 @@ mod tests { Ok(()) } + #[test] + fn required_hook_install_installs_missing_hooks_in_default_directory() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let outcome = install_required_git_hooks(temp.path())?; + assert_eq!(outcome.repository_root, temp.path().to_path_buf()); + assert_eq!(outcome.hook_results.len(), 3); + for hook in outcome.hook_results { + assert_eq!(hook.status, RequiredHookInstallStatus::Installed); + assert!(hook.hook_path.exists()); + assert!(hook.backup_path.is_none()); + assert_hook_is_executable(&hook.hook_path)?; + } + + Ok(()) + } + + #[test] + fn required_hook_install_rerun_reports_skipped_for_unchanged_hooks() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let first = install_required_git_hooks(temp.path())?; + assert!(first + .hook_results + .iter() + .all(|hook| hook.status == RequiredHookInstallStatus::Installed)); + + let second = install_required_git_hooks(temp.path())?; + assert!(second + .hook_results + .iter() + .all(|hook| hook.status == RequiredHookInstallStatus::Skipped)); + assert!(second + .hook_results + .iter() + .all(|hook| hook.backup_path.is_none())); + + Ok(()) + } + + #[test] + fn required_hook_install_updates_noncanonical_hook_in_custom_hooks_path() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + run_git_in_repo(temp.path(), &["config", "core.hooksPath", ".githooks"])?; + + let custom_hooks_directory = temp.path().join(".githooks"); + 
fs::create_dir_all(&custom_hooks_directory)?; + let commit_msg_path = custom_hooks_directory.join("commit-msg"); + fs::write(&commit_msg_path, b"#!/bin/sh\necho legacy\n")?; + set_test_file_mode(&commit_msg_path, 0o644)?; + + let outcome = install_required_git_hooks(temp.path())?; + assert_eq!(outcome.hooks_directory, custom_hooks_directory); + + let updated = outcome + .hook_results + .iter() + .find(|hook| hook.hook_name == "commit-msg") + .expect("commit-msg result should exist"); + assert_eq!(updated.status, RequiredHookInstallStatus::Updated); + let backup_path = updated + .backup_path + .as_ref() + .expect("updated hook should retain backup path"); + assert!(backup_path.exists()); + assert_eq!(fs::read(backup_path)?, b"#!/bin/sh\necho legacy\n"); + assert_hook_is_executable(&updated.hook_path)?; + + Ok(()) + } + + #[test] + fn required_hook_install_rolls_back_when_hook_swap_fails() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let hooks_directory = temp.path().join(".git/hooks"); + fs::create_dir_all(&hooks_directory)?; + let commit_msg_path = hooks_directory.join("commit-msg"); + fs::write(&commit_msg_path, b"#!/bin/sh\necho legacy\n")?; + + let rename_calls = Cell::new(0_u8); + let error = install_required_git_hooks_with_rename(temp.path(), |from, to| { + rename_calls.set(rename_calls.get() + 1); + if rename_calls.get() == 2 { + return Err(io::Error::other("injected hook swap failure")); + } + + fs::rename(from, to) + }) + .expect_err("hook swap failure should bubble up"); + + assert!(error + .to_string() + .contains("Failed to update required hook 'commit-msg'")); + assert!(commit_msg_path.exists()); + assert_eq!(fs::read(&commit_msg_path)?, b"#!/bin/sh\necho legacy\n"); + assert!(!hooks_directory.join("commit-msg.backup").exists()); + + for entry in fs::read_dir(&hooks_directory)? 
{ + let entry = entry?; + let name = entry.file_name(); + let name = name.to_string_lossy(); + assert!( + !name.starts_with(".sce-hook-staging-"), + "hook staging file should be cleaned up after failure" + ); + } + + Ok(()) + } + + fn init_git_repo(repository_root: &Path) -> Result<()> { + run_git_in_repo(repository_root, &["init", "-q"])?; + Ok(()) + } + + fn run_git_in_repo(repository_root: &Path, args: &[&str]) -> Result<()> { + let status = Command::new("git") + .args(args) + .current_dir(repository_root) + .status()?; + if !status.success() { + anyhow::bail!("git command failed for test repository") + } + Ok(()) + } + + #[cfg(unix)] + fn set_test_file_mode(path: &Path, mode: u32) -> Result<()> { + use std::os::unix::fs::PermissionsExt; + + fs::set_permissions(path, fs::Permissions::from_mode(mode))?; + Ok(()) + } + + #[cfg(not(unix))] + fn set_test_file_mode(_path: &Path, _mode: u32) -> Result<()> { + Ok(()) + } + + #[cfg(unix)] + fn assert_hook_is_executable(path: &Path) -> Result<()> { + use std::os::unix::fs::PermissionsExt; + + let metadata = fs::metadata(path)?; + assert!(metadata.permissions().mode() & 0o111 != 0); + Ok(()) + } + + #[cfg(not(unix))] + fn assert_hook_is_executable(path: &Path) -> Result<()> { + assert!(path.exists()); + Ok(()) + } + fn runtime_target_root(target: SetupTarget) -> PathBuf { let target_relative = match target { SetupTarget::OpenCode => "config/.opencode", diff --git a/context/architecture.md b/context/architecture.md index 63a3abc9..e49ec22e 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -75,7 +75,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. 
- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`), reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), and T14 retry/observability storage (`trace_retry_queue`, `reconciliation_metrics`) with replay/query indexes. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. -- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators, required-hook asset iterators/lookups) generated by `cli/build.rs` from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**`, and a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging. 
+- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators, required-hook asset iterators/lookups) generated by `cli/build.rs` from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**`, a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging, plus required-hook install orchestration (`install_required_git_hooks`) that resolves effective hooks directories from git and applies per-hook installed/updated/skipped outcomes with backup/rollback safety. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. 
diff --git a/context/context-map.md b/context/context-map.md index 1fb84c4d..5cf0eba5 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -25,6 +25,7 @@ Feature/domain context: - `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) - `context/sce/setup-githooks-install-contract.md` (T01 canonical `sce setup --hooks` install contract for target-path resolution, idempotent outcomes, backup/rollback, and doctor-readiness alignment) - `context/sce/setup-githooks-hook-asset-packaging.md` (T02 compile-time `sce setup --hooks` required-hook template packaging contract and setup-service accessor surface) +- `context/sce/setup-githooks-install-flow.md` (T03 setup-service required-hook install orchestration with git-truth hooks-path resolution, per-hook installed/updated/skipped outcomes, and backup/rollback semantics) - `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) - `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) - `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) diff --git a/context/glossary.md b/context/glossary.md index 5e96ac2d..2105e723 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -31,6 +31,7 @@ - `setup dispatch outcome`: Execution model in `cli/src/services/setup.rs` (`SetupDispatch`) where setup either proceeds with a selected/non-interactive target or exits as cancelled without file changes. 
- `setup embedded asset manifest`: Compile-time generated file index emitted by `cli/build.rs` into `OUT_DIR/setup_embedded_assets.rs`, embedding bytes from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` as deterministic normalized relative-path entries consumed by `cli/src/services/setup.rs`. - `setup required-hook embedded assets`: Setup-service accessors in `cli/src/services/setup.rs` (`iter_required_hook_assets`, `get_required_hook_asset`) that expose canonical embedded templates for `pre-commit`, `commit-msg`, and `post-commit` without runtime config reads. +- `setup required-hook install orchestration`: Setup-service flow in `cli/src/services/setup.rs` (`install_required_git_hooks`) that resolves repository root + effective hooks directory via git truth, installs canonical required hooks with deterministic per-hook outcomes (`Installed`, `Updated`, `Skipped`), enforces executable permissions, and performs backup-and-restore rollback when hook swap fails. - `setup install engine`: Installer in `cli/src/services/setup.rs` (`install_embedded_setup_assets`) that writes embedded setup assets into per-target staging directories and swaps them into repository-root `.opencode/`/`.claude/` destinations. - `setup backup-and-replace`: Replacement choreography in `cli/src/services/setup.rs` where existing install targets are renamed to unique `.backup` paths before staged content is promoted; on swap failure, the engine restores the original target from backup and cleans temporary staging paths. - `MCP capability snapshot`: Placeholder capability model in `cli/src/services/mcp.rs` that captures planned file-cache transport/tool contracts (`cache-put`, `cache-get`) and cache policy defaults without enabling runtime MCP execution. 
diff --git a/context/overview.md b/context/overview.md index 9d8ab89a..dbee1fbd 100644 --- a/context/overview.md +++ b/context/overview.md @@ -38,6 +38,7 @@ The CLI now also includes a hosted event intake/orchestration seam in `cli/src/s The hosted reconciliation service now also includes a deterministic rewrite mapping engine (`map_rewritten_commit`) that resolves old->new commit identity using patch-id exact precedence, then range-diff hints, then fuzzy fallback with a `>= 0.60` mapping threshold and explicit ambiguous/unmatched/low-confidence unresolved outcomes; this behavior is documented in `context/sce/agent-trace-rewrite-mapping-engine.md`. The hooks service now also includes retry-queue replay processing (`process_trace_retry_queue`) with per-attempt runtime/error-class metric emission, and the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. The setup service now also exposes deterministic required-hook embedded asset accessors (`iter_required_hook_assets`, `get_required_hook_asset`) backed by canonical templates in `cli/assets/hooks/` for `pre-commit`, `commit-msg`, and `post-commit`; this behavior is documented in `context/sce/setup-githooks-hook-asset-packaging.md`. +The setup service now also includes required-hook install orchestration (`install_required_git_hooks`) that resolves repository root and effective hooks path from git truth, enforces deterministic per-hook outcomes (`Installed`/`Updated`/`Skipped`), and performs backup-and-restore rollback on swap failures; this behavior is documented in `context/sce/setup-githooks-install-flow.md`. 
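The git-truth resolution described above can be sketched roughly as follows. This is a minimal illustration, not the code in `cli/src/services/setup.rs`: the names `resolve_hooks_directory`, `git_rev_parse`, and `effective_hooks_directory` are hypothetical, and the sketch assumes the two `git rev-parse` invocations named in this patch are run from the repository toplevel.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Hypothetical helper: join git's reported hooks path onto the toplevel.
/// `Path::join` returns an absolute argument unchanged, so an absolute
/// `core.hooksPath` value passes through untouched.
pub fn resolve_hooks_directory(toplevel: &Path, git_path_output: &str) -> PathBuf {
    toplevel.join(git_path_output.trim())
}

/// Hypothetical helper: run `git rev-parse <args>` in `dir` and return trimmed stdout.
pub fn git_rev_parse(dir: &Path, args: &[&str]) -> io::Result<String> {
    let output = Command::new("git")
        .arg("rev-parse")
        .args(args)
        .current_dir(dir)
        .output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "git rev-parse failed"));
    }
    Ok(String::from_utf8_lossy(&output.stdout).trim().to_string())
}

/// Resolve the toplevel first, then ask git where hooks live for that repository.
pub fn effective_hooks_directory(repo: &Path) -> io::Result<PathBuf> {
    let toplevel = PathBuf::from(git_rev_parse(repo, &["--show-toplevel"])?);
    let hooks = git_rev_parse(&toplevel, &["--git-path", "hooks"])?;
    Ok(resolve_hooks_directory(&toplevel, &hooks))
}
```

Keeping the join in a pure function means the relative-versus-absolute behavior can be checked without a git binary; only `effective_hooks_directory` shells out.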
## Repository model @@ -103,3 +104,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-rewrite-mapping-engine.md` for the implemented T13 hosted mapping engine contract (patch-id exact matching, range-diff/fuzzy scoring precedence, confidence thresholds, and deterministic unresolved handling). - Use `context/sce/agent-trace-retry-queue-observability.md` for the implemented T14 retry replay contract (notes/DB target-scoped recovery, per-attempt runtime/error-class metrics, reconciliation mapped/unmapped + confidence histogram snapshots, and DB-first retry/metrics schema additions). - Use `context/sce/setup-githooks-hook-asset-packaging.md` for the implemented `sce-setup-githooks-any-repo` T02 compile-time hook-template packaging contract and setup-service required-hook embedded accessor surface. +- Use `context/sce/setup-githooks-install-flow.md` for the implemented `sce-setup-githooks-any-repo` T03 required-hook install orchestration contract (git-truth hooks-path resolution, per-hook installed/updated/skipped outcomes, and backup/rollback behavior). diff --git a/context/patterns.md b/context/patterns.md index 24fec7e9..5aae2ee2 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -77,6 +77,7 @@ - Treat setup prompt cancellation/interrupt as a non-destructive exit path with explicit user messaging (no file mutations and no partial side effects). - For setup install prep, generate compile-time embedded asset manifests from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` in `cli/build.rs`, keep relative paths normalized to forward-slash form, and expose target-scoped iterators/lookups from the setup service layer for installer wiring. 
- For setup install execution, write selected embedded assets into a per-target staging directory first, then swap into repository-root `.opencode/`/`.claude/` with backup-and-replace semantics; when swap fails after backup creation, restore the original target path from backup and clean staging directories. +- For required-hook setup execution, resolve repository root and effective hooks directory from git (`rev-parse --show-toplevel`, `rev-parse --git-path hooks`), then apply deterministic per-hook outcomes (`Installed`, `Updated`, `Skipped`) with staged writes, executable-bit enforcement, and backup-and-restore rollback on swap failures. - For setup command messaging, emit deterministic completion output that includes selected target(s), per-target install counts, and whether backup was created. - Keep module seams for future domains present and compile-safe even when behavior is deferred. - Keep dependency additions explicit and minimal in `cli/Cargo.toml`, and anchor dependency intent in lightweight compile-time code references (`cli/src/dependency_contract.rs`). diff --git a/context/plans/sce-setup-githooks-any-repo.md b/context/plans/sce-setup-githooks-any-repo.md index 78dccb02..f8fae847 100644 --- a/context/plans/sce-setup-githooks-any-repo.md +++ b/context/plans/sce-setup-githooks-any-repo.md @@ -44,7 +44,7 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Added canonical hook templates at `cli/assets/hooks/pre-commit`, `cli/assets/hooks/commit-msg`, and `cli/assets/hooks/post-commit` with deterministic `sce hooks ` entrypoints for setup packaging. - Extended compile-time embedded manifest generation in `cli/build.rs` with `HOOK_EMBEDDED_ASSETS`, keeping deterministic sorted path normalization for hook assets without runtime config reads. 
- Added setup service hook-asset accessors in `cli/src/services/setup.rs` (`iter_required_hook_assets`, `get_required_hook_asset`) plus coverage for manifest completeness, normalization, ordering, and canonical hook lookup semantics. -- [ ] T03: Implement `sce setup` hook installation flow (status:todo) +- [x] T03: Implement `sce setup` hook installation flow (status:done) - Task ID: T03 - Goal: Add idempotent hook install/update orchestration that writes required hooks, preserves executable bits, and performs safe backup-and-replace with rollback on failure. - Boundaries (in/out of scope): @@ -55,6 +55,10 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Failure paths restore prior hook state non-destructively when replacement cannot complete. - Verification notes (commands or checks): - Service/integration tests for install, re-run idempotency, upgrade, and rollback scenarios. + - Verification evidence: + - Added required-hook install orchestration in `cli/src/services/setup.rs` via `install_required_git_hooks`, including git-truth path resolution (`rev-parse --show-toplevel` and `rev-parse --git-path hooks`), per-hook deterministic outcomes (`installed`/`updated`/`skipped`), executable-bit enforcement, and backup-and-restore rollback on swap failures. + - Added focused setup-service coverage for default hook-path fresh install, rerun idempotency (`skipped` outcomes), custom `core.hooksPath` upgrades with backup retention, and injected swap-failure rollback cleanup in `services::setup::tests`. + - Verification run: `cargo test services::setup::tests`; light checks/build run: `cargo fmt --check` and `cargo build` (from `cli/`). - [ ] T04: Add CLI flags and UX for hook setup (status:todo) - Task ID: T04 - Goal: Add `sce setup --hooks` (and optional `--repo `) with deterministic output and compatible option validation. 
diff --git a/context/sce/setup-githooks-install-flow.md b/context/sce/setup-githooks-install-flow.md new file mode 100644 index 00000000..e7c17985 --- /dev/null +++ b/context/sce/setup-githooks-install-flow.md @@ -0,0 +1,63 @@ +# SCE setup git-hooks install flow + +## Scope + +Task `sce-setup-githooks-any-repo` `T03` implements the required-hook installation orchestration for `sce setup --hooks` at the setup-service layer. + +## Implemented setup-service surface + +`cli/src/services/setup.rs` now provides: + +- `install_required_git_hooks(repository_root: &Path)` +- `RequiredHooksInstallOutcome` +- `RequiredHookInstallResult` +- `RequiredHookInstallStatus` (`Installed`, `Updated`, `Skipped`) + +This flow is independent from setup target install (`.opencode`/`.claude`) and is scoped to required git hooks. + +## Path resolution and repository targeting + +For the provided repository path, setup resolves git truth before any writes: + +1. `git rev-parse --show-toplevel` +2. `git rev-parse --git-path hooks` + +If the hooks path is relative, it is resolved against the git toplevel. + +This keeps behavior compatible with: + +- default `.git/hooks` +- per-repo `core.hooksPath` +- global `core.hooksPath` (when git resolves it for the selected repo) + +## Per-hook installation contract + +The flow iterates canonical embedded required hooks (`pre-commit`, `commit-msg`, `post-commit`) and applies deterministic per-hook outcomes: + +- `Installed`: hook was absent and is now present. +- `Updated`: hook existed but content and/or executable bit did not match canonical state. +- `Skipped`: hook already matched canonical bytes and executable state. 
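The three outcomes listed above could be decided by a read-only check along these lines. This is a hedged sketch under stated assumptions: `HookStatus`, `classify_hook`, and `is_executable` are hypothetical names that only mirror the `Installed`/`Updated`/`Skipped` contract, and the real flow in `cli/src/services/setup.rs` may structure the comparison differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical mirror of the per-hook outcome names used in the contract.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum HookStatus {
    Installed,
    Updated,
    Skipped,
}

/// Decide which outcome an install pass would report for one hook,
/// without writing anything to disk.
pub fn classify_hook(hook_path: &Path, canonical: &[u8]) -> io::Result<HookStatus> {
    // Absent hook: a fresh install is needed.
    if !hook_path.exists() {
        return Ok(HookStatus::Installed);
    }
    // Present hook: skip only when both bytes and executable state match.
    let current = fs::read(hook_path)?;
    if current.as_slice() == canonical && is_executable(hook_path)? {
        Ok(HookStatus::Skipped)
    } else {
        Ok(HookStatus::Updated)
    }
}

#[cfg(unix)]
fn is_executable(path: &Path) -> io::Result<bool> {
    use std::os::unix::fs::PermissionsExt;
    // Any of the user/group/other execute bits counts as executable.
    Ok(fs::metadata(path)?.permissions().mode() & 0o111 != 0)
}

#[cfg(not(unix))]
fn is_executable(path: &Path) -> io::Result<bool> {
    // No executable bit to inspect on this platform; presence is the only check.
    Ok(path.exists())
}
```

Separating classification from the write step keeps a rerun side-effect free when every hook already matches, which is the property the `Skipped` outcome is meant to report.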
+ +## Backup, staged write, and rollback behavior + +When replacing an existing hook: + +- setup creates a deterministic backup path via `next_backup_path(...)` +- writes canonical bytes to a unique staging file in the hooks directory +- enforces executable permissions on the staged payload +- swaps staging file into final hook path + +If swap fails after backup: + +- staging artifact is removed +- previous hook is restored from backup when the destination path is absent +- failure returns explicit rollback context in the error chain + +## Verification coverage + +`cli/src/services/setup.rs` includes T03-focused tests for: + +- fresh install in default hooks directory +- rerun idempotency (`Skipped` outcomes) +- upgrade in custom `core.hooksPath` with backup retention +- injected swap failure with rollback and staging cleanup checks From 4a4350ee34d1b6e47a4b515e15f7658b29f2541b Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 00:03:54 +0100 Subject: [PATCH 20/39] setup: Add --hooks setup mode with optional --repo targeting Add setup command parsing/dispatch for required hook installation with deterministic validation (`--repo` requires `--hooks`; `--hooks` cannot mix with target flags), and emit stable per-hook installed/updated/skipped + backup-status output. 
--- cli/src/app.rs | 73 +++++++++- cli/src/command_surface.rs | 4 +- cli/src/services/setup.rs | 144 ++++++++++++++++++- context/architecture.md | 2 +- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 2 + context/patterns.md | 1 + context/plans/sce-setup-githooks-any-repo.md | 7 +- context/sce/setup-githooks-cli-ux.md | 41 ++++++ 10 files changed, 268 insertions(+), 8 deletions(-) create mode 100644 context/sce/setup-githooks-cli-ux.md diff --git a/cli/src/app.rs b/cli/src/app.rs index 5e06c1ad..2fc3f4d2 100644 --- a/cli/src/app.rs +++ b/cli/src/app.rs @@ -4,10 +4,11 @@ use crate::{command_surface, dependency_contract, services}; use anyhow::{bail, Context, Result}; use lexopt::ValueExt; -#[derive(Clone, Copy, Debug, Eq, PartialEq)] +#[derive(Clone, Debug, Eq, PartialEq)] enum Command { Help, Setup(services::setup::SetupMode), + SetupHooks(Option), SetupHelp, Doctor, Mcp, @@ -119,6 +120,13 @@ fn parse_setup_subcommand(args: Vec) -> Result { return Ok(Command::SetupHelp); } + if options.hooks { + let repo_path = services::setup::resolve_setup_hooks_repository(&options)?; + return Ok(Command::SetupHooks(repo_path)); + } + + services::setup::resolve_setup_hooks_repository(&options)?; + let mode = services::setup::resolve_setup_mode(options)?; Ok(Command::Setup(mode)) } @@ -157,6 +165,12 @@ fn dispatch(command: Command) -> Result<()> { } } } + Command::SetupHooks(repo_path) => { + let current_dir = + std::env::current_dir().context("Failed to determine current directory")?; + let repository_root = repo_path.as_deref().unwrap_or(current_dir.as_path()); + println!("{}", services::setup::run_setup_hooks(repository_root)?); + } Command::SetupHelp => println!("{}", services::setup::setup_usage_text()), Command::Doctor => println!("{}", services::doctor::run_doctor()?), Command::Mcp => println!("{}", services::mcp::run_placeholder_mcp()?), @@ -277,6 +291,33 @@ mod tests { assert_eq!(command, Command::Setup(SetupMode::Interactive)); } + #[test] + 
fn parser_routes_setup_hooks_without_repo() { + let command = parse_command(vec![ + "sce".to_string(), + "setup".to_string(), + "--hooks".to_string(), + ]) + .expect("command should parse"); + assert_eq!(command, Command::SetupHooks(None)); + } + + #[test] + fn parser_routes_setup_hooks_with_repo() { + let command = parse_command(vec![ + "sce".to_string(), + "setup".to_string(), + "--hooks".to_string(), + "--repo".to_string(), + "../demo-repo".to_string(), + ]) + .expect("command should parse"); + assert_eq!( + command, + Command::SetupHooks(Some(std::path::PathBuf::from("../demo-repo"))) + ); + } + #[test] fn parser_rejects_setup_mutually_exclusive_flags() { let error = parse_command(vec![ @@ -292,6 +333,36 @@ mod tests { ); } + #[test] + fn parser_rejects_setup_repo_without_hooks() { + let error = parse_command(vec![ + "sce".to_string(), + "setup".to_string(), + "--repo".to_string(), + "../demo-repo".to_string(), + ]) + .expect_err("--repo without --hooks should fail"); + assert_eq!( + error.to_string(), + "Option '--repo' requires '--hooks'. Run 'sce setup --help' to see valid usage." + ); + } + + #[test] + fn parser_rejects_hooks_with_target_flag() { + let error = parse_command(vec![ + "sce".to_string(), + "setup".to_string(), + "--hooks".to_string(), + "--opencode".to_string(), + ]) + .expect_err("--hooks with target flag should fail"); + assert_eq!( + error.to_string(), + "Option '--hooks' cannot be combined with '--opencode', '--claude', or '--both'. Run 'sce setup --help' to see valid usage." 
+ ); + } + #[test] fn parser_rejects_unknown_command() { let error = parse_command(vec!["sce".to_string(), "nope".to_string()]) diff --git a/cli/src/command_surface.rs b/cli/src/command_surface.rs index 4b5d27fa..83b82cd2 100644 --- a/cli/src/command_surface.rs +++ b/cli/src/command_surface.rs @@ -67,9 +67,10 @@ pub fn help_text() -> String { format!( "sce - Shared Context Engineering CLI (placeholder foundation)\n\n\ Usage:\n sce [command]\n\n\ -Setup usage:\n sce setup [--opencode|--claude|--both]\n\n\ +Setup usage:\n sce setup [--opencode|--claude|--both]\n sce setup --hooks [--repo ]\n\n\ Commands:\n{}\n\n\ Setup defaults to interactive target selection when no setup target flag is passed.\n\ +Use '--hooks' to install required git hooks for the current repository or '--repo ' for a specific repository.\n\ `setup` and `doctor` are implemented; `mcp`, `hooks`, and `sync` remain placeholder-oriented.\n", command_rows ) @@ -94,5 +95,6 @@ mod tests { fn help_text_mentions_setup_target_flags() { let help = help_text(); assert!(help.contains("sce setup [--opencode|--claude|--both]")); + assert!(help.contains("sce setup --hooks [--repo ]")); } } diff --git a/cli/src/services/setup.rs b/cli/src/services/setup.rs index ee1122be..3b1d9085 100644 --- a/cli/src/services/setup.rs +++ b/cli/src/services/setup.rs @@ -93,12 +93,32 @@ pub enum SetupDispatch { Cancelled, } -#[derive(Clone, Copy, Debug, Default, Eq, PartialEq)] +#[derive(Clone, Debug, Default, Eq, PartialEq)] pub struct SetupCliOptions { pub help: bool, pub opencode: bool, pub claude: bool, pub both: bool, + pub hooks: bool, + pub repo_path: Option, +} + +pub fn resolve_setup_hooks_repository(options: &SetupCliOptions) -> Result> { + if options.hooks { + if options.opencode || options.claude || options.both { + bail!( + "Option '--hooks' cannot be combined with '--opencode', '--claude', or '--both'. Run 'sce setup --help' to see valid usage." 
+ ); + } + + return Ok(options.repo_path.clone()); + } + + if options.repo_path.is_some() { + bail!("Option '--repo' requires '--hooks'. Run 'sce setup --help' to see valid usage."); + } + + Ok(None) } pub fn run_setup_for_mode(repository_root: &Path, mode: SetupMode) -> Result { @@ -119,6 +139,12 @@ pub fn run_setup_for_mode(repository_root: &Path, mode: SetupMode) -> Result Result { + let outcome = install_required_git_hooks(repository_root) + .context("Hook setup failed while installing required git hooks")?; + Ok(format_required_hook_install_success_message(&outcome)) +} + fn format_setup_install_success_message(outcome: &SetupInstallOutcome) -> String { let selected_targets = outcome .target_results @@ -152,6 +178,38 @@ fn format_setup_install_success_message(outcome: &SetupInstallOutcome) -> String lines.join("\n") } +fn format_required_hook_install_success_message(outcome: &RequiredHooksInstallOutcome) -> String { + let mut lines = vec![ + "Hook setup completed successfully.".to_string(), + format!("Repository root: '{}'", outcome.repository_root.display()), + format!("Hooks directory: '{}'", outcome.hooks_directory.display()), + ]; + + for result in &outcome.hook_results { + lines.push(format!( + "- {}: {} at '{}'", + result.hook_name, + required_hook_status_label(result.status), + result.hook_path.display() + )); + + match result.backup_path.as_ref() { + Some(backup_path) => lines.push(format!(" backup: '{}'", backup_path.display())), + None => lines.push(" backup: not needed".to_string()), + } + } + + lines.join("\n") +} + +fn required_hook_status_label(status: RequiredHookInstallStatus) -> &'static str { + match status { + RequiredHookInstallStatus::Installed => "installed", + RequiredHookInstallStatus::Updated => "updated", + RequiredHookInstallStatus::Skipped => "skipped", + } +} + fn setup_target_label(target: SetupTarget) -> &'static str { match target { SetupTarget::OpenCode => "OpenCode", @@ -778,7 +836,7 @@ pub fn setup_cancelled_text() -> 
&'static str { } pub fn setup_usage_text() -> &'static str { - "Usage: sce setup [--opencode|--claude|--both]\n\nWithout a target flag, setup defaults to interactive target selection.\nTarget flags are mutually exclusive and intended for non-interactive automation." + "Usage:\n sce setup [--opencode|--claude|--both]\n sce setup --hooks [--repo ]\n\nWithout a target flag, setup defaults to interactive target selection.\nTarget flags are mutually exclusive and intended for non-interactive automation.\n'--hooks' installs required git hooks for the current repository by default, or for '--repo ' when provided." } pub fn parse_setup_cli_options(args: I) -> Result @@ -793,6 +851,18 @@ where Arg::Long("opencode") => options.opencode = true, Arg::Long("claude") => options.claude = true, Arg::Long("both") => options.both = true, + Arg::Long("hooks") => options.hooks = true, + Arg::Long("repo") => { + let value = parser + .value() + .context("Option '--repo' requires a path value")?; + if options.repo_path.is_some() { + bail!( + "Option '--repo' may only be provided once. Run 'sce setup --help' to see valid usage." 
+ ); + } + options.repo_path = Some(PathBuf::from(value.string()?)); + } Arg::Long("help") | Arg::Short('h') => options.help = true, Arg::Long(option) => { bail!( @@ -858,8 +928,9 @@ mod tests { install_embedded_setup_assets_with_rename, install_required_git_hooks, install_required_git_hooks_with_rename, iter_embedded_assets_for_setup_target, iter_required_hook_assets, parse_setup_cli_options, resolve_setup_dispatch, - resolve_setup_mode, run_setup_for_mode, setup_usage_text, RequiredHookAsset, - RequiredHookInstallStatus, SetupCliOptions, SetupDispatch, SetupMode, SetupTarget, + resolve_setup_hooks_repository, resolve_setup_mode, run_setup_for_mode, run_setup_hooks, + setup_usage_text, RequiredHookAsset, RequiredHookInstallStatus, SetupCliOptions, + SetupDispatch, SetupMode, SetupTarget, }; #[derive(Clone, Copy, Debug)] @@ -907,6 +978,8 @@ mod tests { opencode: true, claude: true, both: false, + hooks: false, + repo_path: None, }) .expect_err("multiple target flags should fail"); @@ -920,6 +993,69 @@ mod tests { fn setup_usage_contract_mentions_target_flags() { let usage = setup_usage_text(); assert!(usage.contains("--opencode|--claude|--both")); + assert!(usage.contains("sce setup --hooks [--repo ]")); + } + + #[test] + fn setup_options_parse_hooks_without_repo() -> Result<()> { + let options = parse_setup_cli_options(vec!["--hooks".to_string()])?; + let repo = resolve_setup_hooks_repository(&options)?; + assert_eq!(repo, None); + Ok(()) + } + + #[test] + fn setup_options_parse_hooks_with_repo() -> Result<()> { + let options = parse_setup_cli_options(vec![ + "--hooks".to_string(), + "--repo".to_string(), + "tmp/repo".to_string(), + ])?; + let repo = resolve_setup_hooks_repository(&options)?; + assert_eq!(repo, Some(PathBuf::from("tmp/repo"))); + Ok(()) + } + + #[test] + fn setup_options_reject_repo_without_hooks() { + let options = parse_setup_cli_options(vec!["--repo".to_string(), "tmp/repo".to_string()]) + .expect("parsing --repo should succeed before 
validation"); + let error = resolve_setup_hooks_repository(&options) + .expect_err("--repo without --hooks should fail"); + assert_eq!( + error.to_string(), + "Option '--repo' requires '--hooks'. Run 'sce setup --help' to see valid usage." + ); + } + + #[test] + fn setup_options_reject_hooks_with_target_flags() { + let options = + parse_setup_cli_options(vec!["--hooks".to_string(), "--opencode".to_string()]) + .expect("parsing should succeed before validation"); + let error = resolve_setup_hooks_repository(&options) + .expect_err("--hooks with target flags should fail"); + assert_eq!( + error.to_string(), + "Option '--hooks' cannot be combined with '--opencode', '--claude', or '--both'. Run 'sce setup --help' to see valid usage." + ); + } + + #[test] + fn run_setup_hooks_reports_per_hook_statuses() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let message = run_setup_hooks(temp.path())?; + assert!(message.contains("Hook setup completed successfully.")); + assert!(message.contains("Repository root:")); + assert!(message.contains("Hooks directory:")); + assert!(message.contains("commit-msg: installed")); + assert!(message.contains("post-commit: installed")); + assert!(message.contains("pre-commit: installed")); + assert!(message.contains("backup: not needed")); + + Ok(()) } #[test] diff --git a/context/architecture.md b/context/architecture.md index e49ec22e..d304ef0d 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -75,7 +75,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. 
- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`), reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), and T14 retry/observability storage (`trace_retry_queue`, `reconciliation_metrics`) with replay/query indexes. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. -- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators, required-hook asset iterators/lookups) generated by `cli/build.rs` from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**`, a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging, plus required-hook install orchestration (`install_required_git_hooks`) that resolves effective hooks directories from git and applies per-hook installed/updated/skipped outcomes with backup/rollback safety. 
+- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators, required-hook asset iterators/lookups) generated by `cli/build.rs` from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**`, a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging, plus required-hook install orchestration (`install_required_git_hooks`) and command-surface hook mode helpers (`run_setup_hooks`, `resolve_setup_hooks_repository`) used by `sce setup --hooks [--repo ]` with deterministic option compatibility validation and per-hook outcome messaging. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. - `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. 
diff --git a/context/context-map.md b/context/context-map.md index 5cf0eba5..59a53403 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -26,6 +26,7 @@ Feature/domain context: - `context/sce/setup-githooks-install-contract.md` (T01 canonical `sce setup --hooks` install contract for target-path resolution, idempotent outcomes, backup/rollback, and doctor-readiness alignment) - `context/sce/setup-githooks-hook-asset-packaging.md` (T02 compile-time `sce setup --hooks` required-hook template packaging contract and setup-service accessor surface) - `context/sce/setup-githooks-install-flow.md` (T03 setup-service required-hook install orchestration with git-truth hooks-path resolution, per-hook installed/updated/skipped outcomes, and backup/rollback semantics) +- `context/sce/setup-githooks-cli-ux.md` (T04 `sce setup --hooks` / `--repo` command-surface contract, option compatibility validation, and deterministic per-hook output semantics) - `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` (T08 `post-rewrite` local remap ingestion contract with strict pair parsing, rewrite-method normalization, and deterministic replay-key derivation) - `context/sce/agent-trace-rewrite-trace-transformation.md` (T09 rewritten-SHA trace transformation contract with rewrite metadata, confidence-to-quality mapping, and notes+DB persistence parity) - `context/sce/agent-trace-core-schema-migrations.md` (T10 core local schema migration contract for `repositories`, `commits`, `trace_records`, and `trace_ranges` with upgrade-safe idempotent create semantics) diff --git a/context/glossary.md b/context/glossary.md index 2105e723..8bd3e96f 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -32,6 +32,7 @@ - `setup embedded asset manifest`: Compile-time generated file index emitted by `cli/build.rs` into `OUT_DIR/setup_embedded_assets.rs`, embedding bytes from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` as deterministic normalized 
relative-path entries consumed by `cli/src/services/setup.rs`. - `setup required-hook embedded assets`: Setup-service accessors in `cli/src/services/setup.rs` (`iter_required_hook_assets`, `get_required_hook_asset`) that expose canonical embedded templates for `pre-commit`, `commit-msg`, and `post-commit` without runtime config reads. - `setup required-hook install orchestration`: Setup-service flow in `cli/src/services/setup.rs` (`install_required_git_hooks`) that resolves repository root + effective hooks directory via git truth, installs canonical required hooks with deterministic per-hook outcomes (`Installed`, `Updated`, `Skipped`), enforces executable permissions, and performs backup-and-restore rollback when hook swap fails. +- `setup hooks CLI mode`: `sce setup` mode activated by `--hooks` with optional `--repo `; implemented in `cli/src/app.rs` + `cli/src/services/setup.rs`, enforces deterministic compatibility validation (`--repo` requires `--hooks`; `--hooks` cannot be combined with target flags), and emits per-hook `installed`/`updated`/`skipped` + backup-status output. - `setup install engine`: Installer in `cli/src/services/setup.rs` (`install_embedded_setup_assets`) that writes embedded setup assets into per-target staging directories and swaps them into repository-root `.opencode/`/`.claude/` destinations. - `setup backup-and-replace`: Replacement choreography in `cli/src/services/setup.rs` where existing install targets are renamed to unique `.backup` paths before staged content is promoted; on swap failure, the engine restores the original target from backup and cleans temporary staging paths. - `MCP capability snapshot`: Placeholder capability model in `cli/src/services/mcp.rs` that captures planned file-cache transport/tool contracts (`cache-put`, `cache-get`) and cache policy defaults without enabling runtime MCP execution. 
diff --git a/context/overview.md b/context/overview.md index dbee1fbd..55e8f1ee 100644 --- a/context/overview.md +++ b/context/overview.md @@ -39,6 +39,7 @@ The hosted reconciliation service now also includes a deterministic rewrite mapp The hooks service now also includes retry-queue replay processing (`process_trace_retry_queue`) with per-attempt runtime/error-class metric emission, and the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. The setup service now also exposes deterministic required-hook embedded asset accessors (`iter_required_hook_assets`, `get_required_hook_asset`) backed by canonical templates in `cli/assets/hooks/` for `pre-commit`, `commit-msg`, and `post-commit`; this behavior is documented in `context/sce/setup-githooks-hook-asset-packaging.md`. The setup service now also includes required-hook install orchestration (`install_required_git_hooks`) that resolves repository root and effective hooks path from git truth, enforces deterministic per-hook outcomes (`Installed`/`Updated`/`Skipped`), and performs backup-and-restore rollback on swap failures; this behavior is documented in `context/sce/setup-githooks-install-flow.md`. +The setup command parser/dispatch now also supports `sce setup --hooks` with optional `--repo `, enforces deterministic compatibility validation (`--repo` requires `--hooks`; `--hooks` incompatible with setup target flags), and emits deterministic per-hook setup outcome messaging (`installed`/`updated`/`skipped` with backup status); this behavior is documented in `context/sce/setup-githooks-cli-ux.md`. 
## Repository model @@ -105,3 +106,4 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-retry-queue-observability.md` for the implemented T14 retry replay contract (notes/DB target-scoped recovery, per-attempt runtime/error-class metrics, reconciliation mapped/unmapped + confidence histogram snapshots, and DB-first retry/metrics schema additions). - Use `context/sce/setup-githooks-hook-asset-packaging.md` for the implemented `sce-setup-githooks-any-repo` T02 compile-time hook-template packaging contract and setup-service required-hook embedded accessor surface. - Use `context/sce/setup-githooks-install-flow.md` for the implemented `sce-setup-githooks-any-repo` T03 required-hook install orchestration contract (git-truth hooks-path resolution, per-hook installed/updated/skipped outcomes, and backup/rollback behavior). +- Use `context/sce/setup-githooks-cli-ux.md` for the implemented `sce-setup-githooks-any-repo` T04 setup command-surface contract (`--hooks`, optional `--repo`), compatibility validation rules, and deterministic hook setup messaging. diff --git a/context/patterns.md b/context/patterns.md index 5aae2ee2..55a10870 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -78,6 +78,7 @@ - For setup install prep, generate compile-time embedded asset manifests from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` in `cli/build.rs`, keep relative paths normalized to forward-slash form, and expose target-scoped iterators/lookups from the setup service layer for installer wiring. - For setup install execution, write selected embedded assets into a per-target staging directory first, then swap into repository-root `.opencode/`/`.claude/` with backup-and-replace semantics; when swap fails after backup creation, restore the original target path from backup and clean staging directories. 
- For required-hook setup execution, resolve repository root and effective hooks directory from git (`rev-parse --show-toplevel`, `rev-parse --git-path hooks`), then apply deterministic per-hook outcomes (`Installed`, `Updated`, `Skipped`) with staged writes, executable-bit enforcement, and backup-and-restore rollback on swap failures. +- For hook setup CLI UX, treat `--hooks` as a dedicated setup mode with optional `--repo `, enforce deterministic option compatibility (`--repo` requires `--hooks`; no `--hooks` + target-flag mixes), and emit stable per-hook status + backup lines for automation-friendly logs. - For setup command messaging, emit deterministic completion output that includes selected target(s), per-target install counts, and whether backup was created. - Keep module seams for future domains present and compile-safe even when behavior is deferred. - Keep dependency additions explicit and minimal in `cli/Cargo.toml`, and anchor dependency intent in lightweight compile-time code references (`cli/src/dependency_contract.rs`). diff --git a/context/plans/sce-setup-githooks-any-repo.md b/context/plans/sce-setup-githooks-any-repo.md index f8fae847..c64f8047 100644 --- a/context/plans/sce-setup-githooks-any-repo.md +++ b/context/plans/sce-setup-githooks-any-repo.md @@ -59,7 +59,7 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Added required-hook install orchestration in `cli/src/services/setup.rs` via `install_required_git_hooks`, including git-truth path resolution (`rev-parse --show-toplevel` and `rev-parse --git-path hooks`), per-hook deterministic outcomes (`installed`/`updated`/`skipped`), executable-bit enforcement, and backup-and-restore rollback on swap failures. - Added focused setup-service coverage for default hook-path fresh install, rerun idempotency (`skipped` outcomes), custom `core.hooksPath` upgrades with backup retention, and injected swap-failure rollback cleanup in `services::setup::tests`. 
- Verification run: `cargo test services::setup::tests`; light checks/build run: `cargo fmt --check` and `cargo build` (from `cli/`). -- [ ] T04: Add CLI flags and UX for hook setup (status:todo) +- [x] T04: Add CLI flags and UX for hook setup (status:done) - Task ID: T04 - Goal: Add `sce setup --hooks` (and optional `--repo `) with deterministic output and compatible option validation. - Boundaries (in/out of scope): @@ -70,6 +70,11 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Successful runs emit clear per-hook outcomes suitable for humans and automation logs. - Verification notes (commands or checks): - Command-surface tests for parsing, invalid combinations, and stable message snapshots. + - Verification evidence: + - Extended setup CLI parsing in `cli/src/services/setup.rs` with `--hooks` and optional `--repo `, including deterministic compatibility validation (`--repo` requires `--hooks`; `--hooks` is incompatible with `--opencode|--claude|--both`). + - Wired hook setup dispatch in `cli/src/app.rs` via `Command::SetupHooks`, preserving existing target-asset setup behavior while routing hook installs through `run_setup_hooks(...)`. + - Added deterministic hook setup success output (repository root, hooks directory, per-hook `installed|updated|skipped` lines, backup status) and updated command/help usage text in `cli/src/services/setup.rs` and `cli/src/command_surface.rs`. + - Verification run: `cargo test app::tests && cargo test command_surface::tests && cargo test services::setup::tests && cargo fmt --check && cargo build` (from `cli/`). - [ ] T05: Integrate with doctor and add verification tests (status:todo) - Task ID: T05 - Goal: Ensure `sce doctor` reports ready after successful hook setup and add targeted test coverage for missing/misconfigured/existing hooks plus idempotent re-run behavior. 
diff --git a/context/sce/setup-githooks-cli-ux.md b/context/sce/setup-githooks-cli-ux.md new file mode 100644 index 00000000..3e5d7cb1 --- /dev/null +++ b/context/sce/setup-githooks-cli-ux.md @@ -0,0 +1,41 @@ +# SCE setup git-hooks CLI UX + +## Scope + +Task `sce-setup-githooks-any-repo` `T04` defines the `sce setup` command-surface behavior for required-hook setup mode. + +## Command surface + +- `sce setup --hooks` +- `sce setup --hooks --repo ` + +`--hooks` runs required-hook installation (`pre-commit`, `commit-msg`, `post-commit`) through the setup service hook installer. +When `--repo` is omitted, setup targets the current working directory. + +## Option compatibility and validation + +Validation is deterministic and enforced during setup option resolution: + +- `--repo` requires `--hooks` +- `--hooks` cannot be combined with `--opencode`, `--claude`, or `--both` +- `--repo` may only be provided once and must include a value + +Target-install mode remains unchanged: + +- `sce setup` defaults to interactive target selection +- `--opencode`, `--claude`, and `--both` remain mutually exclusive for non-interactive target install + +## Output contract + +Successful hook setup emits deterministic human/automation-friendly output including: + +- repository root +- effective hooks directory +- per-hook outcome lines with canonical lowercase statuses (`installed`, `updated`, `skipped`) +- backup status per hook (`backup: ''` or `backup: not needed`) + +## Implementation anchors + +- `cli/src/app.rs` +- `cli/src/services/setup.rs` +- `cli/src/command_surface.rs` From c5d18b5a4e66d922eb6490f54c4e3b86d070f99e Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 00:16:24 +0100 Subject: [PATCH 21/39] doctor: Resolve hook checks against inspected repository root Anchor git command execution to the inspected repository root and normalize relative hook paths to absolute paths so `sce doctor` reports readiness correctly for default and per-repo `core.hooksPath` 
setups. Add post-setup readiness tests and align hook-doctor context contract text. --- cli/src/services/doctor.rs | 96 +++++++++++++++++--- context/plans/sce-setup-githooks-any-repo.md | 18 +++- context/sce/agent-trace-hook-doctor.md | 4 + 3 files changed, 105 insertions(+), 13 deletions(-) diff --git a/cli/src/services/doctor.rs b/cli/src/services/doctor.rs index 71d9e2c2..49ae763c 100644 --- a/cli/src/services/doctor.rs +++ b/cli/src/services/doctor.rs @@ -2,7 +2,7 @@ use std::fs; use std::path::{Path, PathBuf}; use std::process::Command; -use anyhow::Result; +use anyhow::{Context, Result}; pub const NAME: &str = "doctor"; @@ -40,16 +40,34 @@ struct HookDoctorReport { } pub fn run_doctor() -> Result<String> { - let report = build_report(); + let repository_root = + std::env::current_dir().context("Failed to determine current directory")?; + let report = build_report(&repository_root); Ok(format_report(&report)) } -fn build_report() -> HookDoctorReport { - let repository_root = run_git_command(&["rev-parse", "--show-toplevel"]).map(PathBuf::from); - let hooks_directory = run_git_command(&["rev-parse", "--git-path", "hooks"]).map(PathBuf::from); +fn build_report(repository_root: &Path) -> HookDoctorReport { + let detected_repository_root = + run_git_command(repository_root, &["rev-parse", "--show-toplevel"]).map(PathBuf::from); + let hooks_directory = detected_repository_root.as_ref().and_then(|resolved_root| { + run_git_command(resolved_root, &["rev-parse", "--git-path", "hooks"]).map(|value| { + let path = PathBuf::from(value); + if path.is_absolute() { + path + } else { + resolved_root.join(path) + } + }) + }); - let local_hooks_path = run_git_command(&["config", "--local", "--get", "core.hooksPath"]); - let global_hooks_path = run_git_command(&["config", "--global", "--get", "core.hooksPath"]); + let local_hooks_path = run_git_command( + repository_root, + &["config", "--local", "--get", "core.hooksPath"], + ); + let global_hooks_path = run_git_command( + 
repository_root, + &["config", "--global", "--get", "core.hooksPath"], + ); let hook_path_source = if local_hooks_path.is_some() { HookPathSource::LocalConfig @@ -79,7 +97,7 @@ fn build_report() -> HookDoctorReport { HookDoctorReport { readiness, - repository_root, + repository_root: detected_repository_root, hook_path_source, hooks_directory, hooks, @@ -141,8 +159,12 @@ fn is_executable(metadata: &fs::Metadata) -> bool { metadata.is_file() } -fn run_git_command(args: &[&str]) -> Option<String> { - let output = Command::new("git").args(args).output().ok()?; +fn run_git_command(repository_root: &Path, args: &[&str]) -> Option<String> { + let output = Command::new("git") + .args(args) + .current_dir(repository_root) + .output() + .ok()?; if !output.status.success() { return None; } @@ -226,12 +248,18 @@ fn format_report(report: &HookDoctorReport) -> String { mod tests { use std::fs; use std::os::unix::fs::PermissionsExt; + use std::path::Path; + use std::process::Command; use anyhow::Result; + use crate::services::setup::install_required_git_hooks; use crate::test_support::TestTempDir; - use super::{collect_hook_health, format_report, HookDoctorReport, HookPathSource, Readiness}; + use super::{ + build_report, collect_hook_health, format_report, HookDoctorReport, HookPathSource, + Readiness, + }; #[test] fn doctor_output_reports_healthy_state_when_all_required_hooks_exist() -> Result<()> { @@ -337,4 +365,50 @@ mod tests { assert!(output.contains("Hook 'post-commit' exists but is not executable")); Ok(()) } + + #[test] + fn doctor_reports_ready_after_setup_hook_install() -> Result<()> { + let temp_dir = TestTempDir::new("doctor-ready-after-setup")?; + init_git_repo(temp_dir.path())?; + + install_required_git_hooks(temp_dir.path())?; + + let output = format_report(&build_report(temp_dir.path())); + assert!(output.contains("SCE doctor: ready")); + assert!(output.contains("pre-commit: ok")); + assert!(output.contains("commit-msg: ok")); + assert!(output.contains("post-commit: ok")); + 
Ok(()) + } + + #[test] + fn doctor_reports_ready_for_custom_repo_hooks_path_after_setup() -> Result<()> { + let temp_dir = TestTempDir::new("doctor-ready-custom-hooks-path")?; + init_git_repo(temp_dir.path())?; + run_git_in_repo(temp_dir.path(), &["config", "core.hooksPath", ".githooks"])?; + + install_required_git_hooks(temp_dir.path())?; + + let output = format_report(&build_report(temp_dir.path())); + assert!(output.contains("SCE doctor: ready")); + assert!(output.contains("Hooks path source: per-repo core.hooksPath")); + assert!(output.contains(".githooks")); + Ok(()) + } + + fn init_git_repo(repository_root: &Path) -> Result<()> { + run_git_in_repo(repository_root, &["init", "-q"]) + } + + fn run_git_in_repo(repository_root: &Path, args: &[&str]) -> Result<()> { + let status = Command::new("git") + .args(args) + .current_dir(repository_root) + .status()?; + if !status.success() { + anyhow::bail!("git command failed for test repository"); + } + + Ok(()) + } } diff --git a/context/plans/sce-setup-githooks-any-repo.md b/context/plans/sce-setup-githooks-any-repo.md index c64f8047..33d214fb 100644 --- a/context/plans/sce-setup-githooks-any-repo.md +++ b/context/plans/sce-setup-githooks-any-repo.md @@ -75,7 +75,7 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Wired hook setup dispatch in `cli/src/app.rs` via `Command::SetupHooks`, preserving existing target-asset setup behavior while routing hook installs through `run_setup_hooks(...)`. - Added deterministic hook setup success output (repository root, hooks directory, per-hook `installed|updated|skipped` lines, backup status) and updated command/help usage text in `cli/src/services/setup.rs` and `cli/src/command_surface.rs`. - Verification run: `cargo test app::tests && cargo test command_surface::tests && cargo test services::setup::tests && cargo fmt --check && cargo build` (from `cli/`). 
-- [ ] T05: Integrate with doctor and add verification tests (status:todo) +- [x] T05: Integrate with doctor and add verification tests (status:done) - Task ID: T05 - Goal: Ensure `sce doctor` reports ready after successful hook setup and add targeted test coverage for missing/misconfigured/existing hooks plus idempotent re-run behavior. - Boundaries (in/out of scope): @@ -86,7 +86,11 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Targeted test suite covers missing/misconfigured/existing hooks and idempotent reruns. - Verification notes (commands or checks): - `cargo test` (targeted hook/setup/doctor slices), `cargo fmt --check`, and `cargo build` from `cli/`. -- [ ] T06: Validation and cleanup (status:todo) + - Verification evidence: + - Updated `cli/src/services/doctor.rs` to resolve git commands relative to the inspected repository root and normalize effective hook-directory paths against that root, aligning doctor readiness checks with setup hook install behavior. + - Added doctor coverage for post-setup readiness in supported hook-path modes: default hooks install readiness and per-repo `core.hooksPath` readiness after `install_required_git_hooks`. + - Verified targeted behavior with `cargo test services::doctor::tests` and `cargo test services::setup::tests`, then ran light checks/build with `cargo fmt --check` and `cargo build` (from `cli/`). +- [x] T06: Validation and cleanup (status:done) - Task ID: T06 - Goal: Run end-to-end validation, ensure cleanup of temporary artifacts, and confirm code/context alignment for this plan scope. - Boundaries (in/out of scope): @@ -99,5 +103,15 @@ Enable `sce setup` to install and manage required Git hooks (`pre-commit`, `comm - Verification notes (commands or checks): - `cargo fmt --check && cargo build && cargo test` (from `cli/`). - Focused `sce setup --hooks` fresh + rerun checks and post-setup `sce doctor` readiness checks. 
+ - Verification evidence: + - Command report (exit code 0): `cargo fmt --check && cargo build && cargo test` (from `cli/`); key output: build succeeded and test result `ok` with `110 passed; 0 failed`. + - Command report (exit code 0): `nix run .#pkl-check-generated && nix flake check` (from repo root); key output: generated outputs up to date and flake checks evaluated/built successfully. + - Executed focused end-to-end hook setup validation using temporary repositories under `context/tmp/`: fresh `sce setup --hooks`, rerun idempotency checks (`skipped` outcomes), and `sce doctor` readiness checks for both default `.git/hooks` and per-repo `core.hooksPath` (`.githooks`) modes. + - Command report (exit code 0): local binary checks in temp repos confirmed deterministic setup outcomes (`installed` on first run, `skipped` on rerun) and `SCE doctor: ready` in both default and per-repo hooks-path modes. + - Confirmed cleanup by removing temporary validation repositories from `context/tmp/` after verification; no retained task artifacts required. + - Completed context sync verification for this task scope with no additional behavior drift requiring root shared-file edits. + - Failed checks and follow-ups: none. + - Success-criteria verification summary: one-command hook install works for arbitrary repos, reruns are idempotent with deterministic outcomes, backup semantics remain in contract, and post-setup doctor reports ready in supported hook-path modes. + - Residual risks: repository emits existing compile-time dead-code warnings in unrelated modules; no task-scoped functional regressions observed. ## 5) Open questions - None. 
diff --git a/context/sce/agent-trace-hook-doctor.md b/context/sce/agent-trace-hook-doctor.md index a21e3630..a9dc9513 100644 --- a/context/sce/agent-trace-hook-doctor.md +++ b/context/sce/agent-trace-hook-doctor.md @@ -28,6 +28,8 @@ Task `agent-trace-attribution-no-git-wrapper` `T07` adds local rollout validatio - `git config --local --get core.hooksPath` - `git config --global --get core.hooksPath` +Git command resolution is repository-root anchored for the inspected repo, and the effective hooks directory is normalized to an absolute path when git returns a relative hook path. + Readiness is `not ready` when any required check fails: - hooks directory cannot be resolved @@ -44,5 +46,7 @@ If no diagnostics are present, readiness is `ready`. - healthy state (all required hooks present and executable) - missing state (required hook absent) - misconfigured state (required hook present but non-executable) +- post-setup ready state after required hooks are installed +- post-setup ready state for per-repo custom `core.hooksPath` `cli/src/app.rs` includes command-level routing/exit success coverage for `sce doctor`. From 93abbf7bab5768fa56638c9dcaff80ad27929dd4 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 00:32:29 +0100 Subject: [PATCH 22/39] hosted-reconciliation: Replace handwritten crypto with hmac and sha2 Replace custom SHA-256/HMAC internals with crate-backed hashing for GitHub signature verification and hosted idempotency key derivation. Align the CLI dependency contract and docs context to the new crypto baseline while preserving hosted intake behavior and existing test expectations. 
--- cli/Cargo.lock | 42 ++++++ cli/Cargo.toml | 2 + cli/src/dependency_contract.rs | 9 +- cli/src/services/hosted_reconciliation.rs | 133 +----------------- context/architecture.md | 2 +- context/cli/placeholder-foundation.md | 2 +- context/glossary.md | 2 +- context/overview.md | 2 +- .../sce-cli-rust-idiomatic-hardening-pass.md | 103 ++++++++++++++ ...trace-hosted-event-intake-orchestration.md | 4 +- 10 files changed, 168 insertions(+), 133 deletions(-) create mode 100644 context/plans/sce-cli-rust-idiomatic-hardening-pass.md diff --git a/cli/Cargo.lock b/cli/Cargo.lock index fde60b03..6578820a 100644 --- a/cli/Cargo.lock +++ b/cli/Cargo.lock @@ -162,6 +162,15 @@ version = "2.11.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "843867be96c8daad0d758b57df9392b6d8d271134fce549de6ce169ff98a92af" +[[package]] +name = "block-buffer" +version = "0.10.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3078c7629b62d3f0439517fa394996acacc5cbc91c5a20d8c658e77abd503a71" +dependencies = [ + "generic-array", +] + [[package]] name = "borrow-or-share" version = "0.2.4" @@ -410,6 +419,17 @@ dependencies = [ "powerfmt", ] +[[package]] +name = "digest" +version = "0.10.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9ed9a281f7bc9b7576e61468ba615a66a5c8cfdff42420a70aa82701a3b1e292" +dependencies = [ + "block-buffer", + "crypto-common", + "subtle", +] + [[package]] name = "displaydoc" version = "0.2.5" @@ -744,6 +764,15 @@ version = "0.4.3" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "7f24254aa9a54b5c858eaee2f5bccdb46aaf0e486a595ed5fd8f86ba55232a70" +[[package]] +name = "hmac" +version = "0.12.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6c49c37c09c17a53d937dfbb742eb3a961d65a994e6bcdcf37e7399d0cc8ab5e" +dependencies = [ + "digest", +] + [[package]] name = "home" version = "0.5.12" @@ -1837,10 +1866,12 @@ name = "sce" 
version = "0.1.0" dependencies = [ "anyhow", + "hmac", "inquire", "jsonschema", "lexopt", "serde_json", + "sha2", "tokio", "turso", ] @@ -1918,6 +1949,17 @@ version = "1.0.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "bbfa15b3dddfee50a0fff136974b3e1bde555604ba463834a7eb7deb6417705d" +[[package]] +name = "sha2" +version = "0.10.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a7507d819769d01a365ab707794a4084392c824f54a7a6a7862f8c3d0892b283" +dependencies = [ + "cfg-if", + "cpufeatures", + "digest", +] + [[package]] name = "sharded-slab" version = "0.1.7" diff --git a/cli/Cargo.toml b/cli/Cargo.toml index d9facb02..be2abbb4 100644 --- a/cli/Cargo.toml +++ b/cli/Cargo.toml @@ -14,8 +14,10 @@ categories = ["command-line-utilities", "development-tools"] [dependencies] anyhow = "1" +hmac = "0.12" inquire = "0.7" lexopt = "0.3" +sha2 = "0.10" tokio = { version = "1", default-features = false, features = ["rt"] } turso = "0" diff --git a/cli/src/dependency_contract.rs b/cli/src/dependency_contract.rs index dac0c8e1..e5e98064 100644 --- a/cli/src/dependency_contract.rs +++ b/cli/src/dependency_contract.rs @@ -4,11 +4,15 @@ pub fn dependency_contract_snapshot() -> ( &'static str, &'static str, &'static str, + &'static str, + &'static str, ) { ( Ok(()), + std::any::type_name::>(), std::any::type_name::(), std::any::type_name::(), + std::any::type_name::(), std::any::type_name::(), std::any::type_name::(), ) @@ -20,10 +24,13 @@ mod tests { #[test] fn dependency_contract_snapshot_references_agreed_crates() { - let (result, inquire_ty, lexopt_ty, tokio_ty, turso_ty) = dependency_contract_snapshot(); + let (result, hmac_ty, inquire_ty, lexopt_ty, sha2_ty, tokio_ty, turso_ty) = + dependency_contract_snapshot(); assert!(result.is_ok()); + assert!(hmac_ty.contains("hmac::")); assert!(inquire_ty.contains("inquire::")); assert!(lexopt_ty.contains("lexopt::")); + assert!(sha2_ty.contains("sha2::")); 
assert!(tokio_ty.contains("tokio::")); assert!(turso_ty.contains("turso::")); } diff --git a/cli/src/services/hosted_reconciliation.rs b/cli/src/services/hosted_reconciliation.rs index c7c633e8..2f8665a6 100644 --- a/cli/src/services/hosted_reconciliation.rs +++ b/cli/src/services/hosted_reconciliation.rs @@ -1,4 +1,6 @@ use anyhow::{bail, ensure, Result}; +use hmac::{Hmac, Mac}; +use sha2::{Digest, Sha256}; #[derive(Clone, Copy, Debug, Eq, PartialEq)] pub enum HostedProvider { @@ -481,7 +483,7 @@ fn derive_idempotency_key( new_head, delivery ); - let digest = hex_lower(&sha256(material.as_bytes())); + let digest = hex_lower(&Sha256::digest(material.as_bytes())); format!("hosted:{}:{}", provider.as_str(), digest) } @@ -548,7 +550,10 @@ fn is_sha_like(value: &str) -> bool { } fn github_signature(secret: &str, payload: &str) -> String { - let mac = hmac_sha256(secret.as_bytes(), payload.as_bytes()); + let mut mac = Hmac::<Sha256>::new_from_slice(secret.as_bytes()) + .expect("HMAC-SHA256 accepts any key length"); + mac.update(payload.as_bytes()); + let mac = mac.finalize().into_bytes(); format!("sha256={}", hex_lower(&mac)) } @@ -565,130 +570,6 @@ fn constant_time_eq(left: &[u8], right: &[u8]) -> bool { diff == 0 } -fn hmac_sha256(key: &[u8], message: &[u8]) -> [u8; 32] { - const BLOCK_SIZE: usize = 64; - let mut key_block = [0_u8; BLOCK_SIZE]; - - if key.len() > BLOCK_SIZE { - let hashed = sha256(key); - key_block[..hashed.len()].copy_from_slice(&hashed); - } else { - key_block[..key.len()].copy_from_slice(key); - } - - let mut inner_pad = [0_u8; BLOCK_SIZE]; - let mut outer_pad = [0_u8; BLOCK_SIZE]; - for idx in 0..BLOCK_SIZE { - inner_pad[idx] = key_block[idx] ^ 0x36; - outer_pad[idx] = key_block[idx] ^ 0x5c; - } - - let mut inner_input = Vec::with_capacity(BLOCK_SIZE + message.len()); - inner_input.extend_from_slice(&inner_pad); - inner_input.extend_from_slice(message); - let inner_hash = sha256(&inner_input); - - let mut outer_input = Vec::with_capacity(BLOCK_SIZE + 
inner_hash.len()); - outer_input.extend_from_slice(&outer_pad); - outer_input.extend_from_slice(&inner_hash); - - sha256(&outer_input) -} - -fn sha256(input: &[u8]) -> [u8; 32] { - const K: [u32; 64] = [ - 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, - 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, - 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, - 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, - 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, - 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, - 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, - 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, - 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, - 0xc67178f2, - ]; - - let mut h: [u32; 8] = [ - 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, - 0x5be0cd19, - ]; - - let mut padded = input.to_vec(); - let bit_len = (padded.len() as u64) * 8; - padded.push(0x80); - while (padded.len() % 64) != 56 { - padded.push(0); - } - padded.extend_from_slice(&bit_len.to_be_bytes()); - - let mut message_schedule = [0_u32; 64]; - for chunk in padded.chunks_exact(64) { - for (idx, word) in chunk.chunks_exact(4).take(16).enumerate() { - message_schedule[idx] = u32::from_be_bytes([word[0], word[1], word[2], word[3]]); - } - - for idx in 16..64 { - let s0 = message_schedule[idx - 15].rotate_right(7) - ^ message_schedule[idx - 15].rotate_right(18) - ^ (message_schedule[idx - 15] >> 3); - let s1 = message_schedule[idx - 2].rotate_right(17) - ^ message_schedule[idx - 2].rotate_right(19) - ^ (message_schedule[idx - 2] >> 10); - message_schedule[idx] = message_schedule[idx - 16] - .wrapping_add(s0) - .wrapping_add(message_schedule[idx - 
7]) - .wrapping_add(s1); - } - - let mut a = h[0]; - let mut b = h[1]; - let mut c = h[2]; - let mut d = h[3]; - let mut e = h[4]; - let mut f = h[5]; - let mut g = h[6]; - let mut hh = h[7]; - - for idx in 0..64 { - let s1 = e.rotate_right(6) ^ e.rotate_right(11) ^ e.rotate_right(25); - let ch = (e & f) ^ ((!e) & g); - let temp1 = hh - .wrapping_add(s1) - .wrapping_add(ch) - .wrapping_add(K[idx]) - .wrapping_add(message_schedule[idx]); - let s0 = a.rotate_right(2) ^ a.rotate_right(13) ^ a.rotate_right(22); - let maj = (a & b) ^ (a & c) ^ (b & c); - let temp2 = s0.wrapping_add(maj); - - hh = g; - g = f; - f = e; - e = d.wrapping_add(temp1); - d = c; - c = b; - b = a; - a = temp1.wrapping_add(temp2); - } - - h[0] = h[0].wrapping_add(a); - h[1] = h[1].wrapping_add(b); - h[2] = h[2].wrapping_add(c); - h[3] = h[3].wrapping_add(d); - h[4] = h[4].wrapping_add(e); - h[5] = h[5].wrapping_add(f); - h[6] = h[6].wrapping_add(g); - h[7] = h[7].wrapping_add(hh); - } - - let mut output = [0_u8; 32]; - for (idx, value) in h.iter().enumerate() { - output[idx * 4..idx * 4 + 4].copy_from_slice(&value.to_be_bytes()); - } - output -} - fn hex_lower(bytes: &[u8]) -> String { const HEX: &[u8; 16] = b"0123456789abcdef"; let mut output = String::with_capacity(bytes.len() * 2); diff --git a/context/architecture.md b/context/architecture.md index d304ef0d..94482411 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -89,7 +89,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `flake.nix` (root) keeps nested CLI input wiring aligned by forwarding `nixpkgs`, `flake-utils`, and `rust-overlay` into the `cli` path input so repository-level `nix flake check` can evaluate nested CLI checks deterministically. 
- `cli/Cargo.toml` keeps crates.io-ready package metadata populated while `publish = false` remains the current policy; local Cargo release/install verification targets `cargo build --manifest-path cli/Cargo.toml --release` and `cargo install --path cli --locked`. Tokio is intentionally constrained to `default-features = false` with `features = ["rt"]` to match current runtime API usage. -This phase establishes compile-safe extension seams with a minimal dependency baseline (`anyhow`, `inquire`, `lexopt`, `tokio`, `turso`); local Turso connectivity smoke checks now exist, while broader runtime integrations remain deferred. +This phase establishes compile-safe extension seams with a minimal dependency baseline (`anyhow`, `hmac`, `inquire`, `lexopt`, `sha2`, `tokio`, `turso`); local Turso connectivity smoke checks now exist, while broader runtime integrations remain deferred. ## Shared Context Drift parity mapping diff --git a/context/cli/placeholder-foundation.md b/context/cli/placeholder-foundation.md index 08c11414..29d6d049 100644 --- a/context/cli/placeholder-foundation.md +++ b/context/cli/placeholder-foundation.md @@ -102,7 +102,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro ## Dependency baseline -- `cli/Cargo.toml` declares only: `anyhow`, `inquire`, `lexopt`, `tokio`, and `turso`. +- `cli/Cargo.toml` declares only: `anyhow`, `hmac`, `inquire`, `lexopt`, `sha2`, `tokio`, and `turso`. - `tokio` is pinned with `default-features = false` and `features = ["rt"]` to match current runtime usage (current-thread runtime builder and `Runtime::block_on` without broader async feature surface). - `cli/src/dependency_contract.rs` keeps compile-time crate references centralized for this placeholder slice. 
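Reviewer note: the dependency contract described above works by naming one type per agreed crate through `std::any::type_name`, so the build fails if a crate disappears from `Cargo.toml`. The following is a minimal sketch of that technique only. It uses standard-library types as stand-ins for the real crate types, and `contract_snapshot` is a hypothetical name, not the function in `cli/src/dependency_contract.rs`. The exact strings returned by `type_name` are best-effort and not guaranteed stable, so the checks only look for a substring.

```rust
use std::any::type_name;

/// Hypothetical stand-in for a dependency contract snapshot.
/// Each entry names a concrete type; if the type (or its crate) were
/// removed, this function would stop compiling.
fn contract_snapshot() -> (&'static str, &'static str) {
    (
        // Stand-in for a type from the first "agreed" dependency.
        type_name::<std::collections::BTreeMap<String, String>>(),
        // Stand-in for a type from the second "agreed" dependency.
        type_name::<std::time::Duration>(),
    )
}

/// Substring check in the style of the test in the diff: the type name
/// should mention the expected identifier somewhere in its path.
fn mentions(type_name_text: &str, expected: &str) -> bool {
    type_name_text.contains(expected)
}
```

The real value of the pattern is the compile-time reference; the runtime substring assertions are a secondary sanity check.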
diff --git a/context/glossary.md b/context/glossary.md index 8bd3e96f..26f033d1 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -21,7 +21,7 @@ - `sce` (CLI foundation): Rust binary crate at `cli/` with implemented setup installation flow and placeholder behavior for other command domains. - `command surface contract`: The static command catalog in `cli/src/command_surface.rs` that marks each top-level command as `implemented` or `placeholder`. - `command loop`: The `lexopt` parser + dispatcher in `cli/src/app.rs` that routes `help`, `setup`, `mcp`, `hooks`, and `sync`, executes setup installation, emits TODO placeholders for non-implemented commands, and returns deterministic actionable errors for invalid invocation. -- `sce dependency contract`: Minimal crate dependency baseline declared in `cli/Cargo.toml` and referenced via `cli/src/dependency_contract.rs` (`anyhow`, `inquire`, `lexopt`, `tokio`, `turso`). +- `sce dependency contract`: Minimal crate dependency baseline declared in `cli/Cargo.toml` and referenced via `cli/src/dependency_contract.rs` (`anyhow`, `hmac`, `inquire`, `lexopt`, `sha2`, `tokio`, `turso`). - `local Turso adapter`: Async data-layer module in `cli/src/services/local_db.rs` that initializes local DB targets with `turso::Builder::new_local(...)` and runs execute/query smoke checks. - `sync Turso smoke gate`: Behavior in `cli/src/services/sync.rs` where the `sync` placeholder command runs an in-memory local Turso smoke check under a lazily initialized shared tokio current-thread runtime before returning placeholder cloud-sync messaging. - `setup service orchestration`: Setup execution logic in `cli/src/services/setup.rs` that resolves target selection, installs embedded assets, and emits deterministic success messaging per target. 
diff --git a/context/overview.md b/context/overview.md index 55e8f1ee..f2fd5625 100644 --- a/context/overview.md +++ b/context/overview.md @@ -5,7 +5,7 @@ This repository maintains shared assistant configuration for OpenCode and Claude It also includes an early Rust CLI foundation at `cli/` for Shared Context Engineering workflows. The crate ships onboarding and usage documentation at `cli/README.md` that reflects current implemented vs placeholder behavior. -The CLI crate currently enforces a minimal dependency contract: `anyhow`, `inquire`, `lexopt`, `tokio`, and `turso`. +The CLI crate currently enforces a minimal dependency contract: `anyhow`, `hmac`, `inquire`, `lexopt`, `sha2`, `tokio`, and `turso`. Its command loop is implemented with `lexopt` argument parsing and `anyhow` error handling, with real setup orchestration, implemented `doctor` rollout validation, and placeholder dispatch for deferred commands through explicit service contracts. The `setup` command includes an `inquire`-backed target-selection flow: default interactive selection for OpenCode/Claude/both, explicit non-interactive target flags (`--opencode`, `--claude`, `--both`), deterministic mutually-exclusive validation, and non-destructive cancellation exits. The CLI now compiles an embedded setup asset manifest from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` via `cli/build.rs`; `cli/src/services/setup.rs` exposes deterministic normalized relative paths plus file bytes and target-scoped iteration without runtime reads from `config/`. 
diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md new file mode 100644 index 00000000..f21c736d --- /dev/null +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -0,0 +1,103 @@ +# Plan: sce-cli-rust-idiomatic-hardening-pass + +## 1) Change summary + +Apply an idiomatic and safety-focused Rust hardening pass across hosted reconciliation, local DB path handling, parser/runtime ergonomics, and large test-module maintainability. Replace brittle handwritten primitives (crypto and JSON parsing), remove broad suppression patterns, and stage an incremental test split for oversized service files. + +Locked clarification decisions: +- Dependency policy: add runtime crates as needed (`hmac`, `sha2`, `serde_json`) and update dependency-contract/context references. +- Test split scope: incremental extraction only (target highest-churn test slices now; full migration deferred). +- Float tie policy: use a small epsilon tie window and document deterministic behavior. + +## 2) Success criteria + +- Hosted signature and idempotency hashing in `cli/src/services/hosted_reconciliation.rs` use vetted crates (`hmac` + `sha2`) with no handwritten SHA-256/HMAC implementation remaining. +- Hosted webhook payload field extraction no longer uses string scanning (`find_required_json_string`); parsing uses `serde_json` value/typed access with deterministic error messages for missing/invalid fields. +- Rewrite score tie handling avoids direct `f32 == f32` comparison and applies documented epsilon-based tie semantics. +- Local DB connection and test helpers avoid lossy path conversion where possible; any required UTF-8 conversion is explicit and contextualized. +- `cli/src/services/agent_trace.rs` no longer uses crate-wide `#![allow(dead_code)]`; any remaining allow is narrowly scoped and justified by placeholder contract needs. 
+- Top-level argument parsing in `cli/src/app.rs` no longer clones `tail_args` just to initialize `lexopt`. +- `cli/src/services/sync.rs` uses idiomatic `OnceLock` initialization flow (`get_or_init`/`get_or_try_init` style) instead of manual get/set/get choreography. +- `cli/src/services/hooks.rs` and `cli/src/services/setup.rs` have an incremental test/runtime separation pass applied (targeted test-module extraction to smaller files/modules) with behavior preserved. + +## 3) Constraints and non-goals + +Constraints: +- Preserve current user-facing command behavior and error semantics unless a safety fix requires an intentional update covered by tests. +- Keep hosted reconciliation mapping and signature verification contracts stable while changing internals. +- Maintain deterministic outcomes for tie resolution and unresolved mapping reporting. +- Keep task slicing one-task/one-atomic-commit. + +Non-goals (deferred): +- Full migration of all tests in `hooks.rs` and `setup.rs` into integration tests under `cli/tests/`. +- Broad architecture redesign of service boundaries beyond targeted extraction needed for maintainability. +- Functional feature expansion outside listed refactor/safety concerns. + +## 4) Task stack (`T01..T09`) + +- [x] T01: Replace handwritten hosted crypto with vetted crates and align dependency contract (status:done) + - Goal: Remove manual `sha256`/`hmac_sha256` internals in hosted reconciliation and wire `hmac` + `sha2` crate usage for signature/idempotency hashing. + - Boundaries (in): `cli/src/services/hosted_reconciliation.rs`, `cli/Cargo.toml`, `cli/src/dependency_contract.rs`, and related unit tests. + - Boundaries (out): Changing provider signature policy semantics (GitHub/GitLab contract must stay equivalent). + - Done when: handcrafted crypto helpers are removed/replaced; dependency contract compiles and tests validate equivalent signature/hash behavior. 
+ - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hosted_reconciliation::tests` and `cargo check --manifest-path cli/Cargo.toml`. + +- [ ] T02: Replace fragile hosted JSON string scanning with structured parsing (status:todo) + - Goal: Replace `find_required_json_string` usage with `serde_json` parsing (typed/value extraction) for `before`, `after`, and provider-specific repository fields. + - Boundaries (in): Hosted payload parse path and parse-focused tests in `cli/src/services/hosted_reconciliation.rs`; dependency additions in `cli/Cargo.toml` as needed. + - Boundaries (out): New provider support or webhook schema expansion beyond existing GitHub/GitLab fields. + - Done when: no manual substring search parser remains on hosted intake path; missing/invalid-field failures are deterministic and covered by tests. + - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hosted_reconciliation::tests` and `cargo check --manifest-path cli/Cargo.toml`. + +- [ ] T03: Introduce epsilon-based tie handling for rewrite score comparison (status:todo) + - Goal: Remove direct float equality check in candidate tie detection and apply explicit epsilon tie-window semantics. + - Boundaries (in): Tie detection and mapping-outcome tests in `cli/src/services/hosted_reconciliation.rs`. + - Boundaries (out): Replacing score model or threshold policy (`FUZZY_MAPPING_THRESHOLD`) beyond tie logic. + - Done when: tie behavior is epsilon-based, deterministic, and tested for near-equal/clearly-different scores. + - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hosted_reconciliation::tests` and `cargo check --manifest-path cli/Cargo.toml`. + +- [ ] T04: Eliminate lossy DB path string conversion in local DB service/tests (status:todo) + - Goal: Refactor local DB target path handling to avoid `to_string_lossy()` for DB location construction, using `Path`-native or explicit fallible conversion with context. 
+ - Boundaries (in): `cli/src/services/local_db.rs` runtime and test helpers. + - Boundaries (out): Turso API redesign assumptions or broader filesystem abstraction rewrite. + - Done when: targeted lossy conversions at current call sites are removed/replaced with explicit safe handling and tests still pass. + - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::local_db::tests` and `cargo check --manifest-path cli/Cargo.toml`. + +- [ ] T05: Remove broad dead-code suppression from agent trace module (status:todo) + - Goal: Remove `#![allow(dead_code)]` from `cli/src/services/agent_trace.rs` and apply narrow item-level handling only where required. + - Boundaries (in): `cli/src/services/agent_trace.rs` and directly affected tests/usages. + - Boundaries (out): Large-scale pruning of placeholder Agent Trace contracts not required to satisfy compiler hygiene. + - Done when: crate-level dead-code allow is absent and compile/test remain green without broad suppression. + - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::agent_trace::tests` and `cargo check --manifest-path cli/Cargo.toml`. + +- [ ] T06: Remove avoidable `tail_args` clone in top-level parser (status:todo) + - Goal: Restructure top-level parsing so `lexopt` consumes arguments without cloning `tail_args` solely for parser initialization. + - Boundaries (in): `cli/src/app.rs` parse flow and parser tests. + - Boundaries (out): Command-surface behavioral changes unrelated to clone removal. + - Done when: `parse_command` no longer clones `tail_args` for `Parser::from_args`, with behavior preserved and tests passing. + - Verification notes: run `cargo test --manifest-path cli/Cargo.toml app::tests` and `cargo check --manifest-path cli/Cargo.toml`. 
+ +- [ ] T07: Simplify sync runtime initialization with idiomatic OnceLock API (status:todo) + - Goal: Replace manual get/set/get runtime init in `shared_runtime` with `OnceLock` idioms (`get_or_try_init` or equivalent safe pattern). + - Boundaries (in): `cli/src/services/sync.rs` runtime init path and relevant tests. + - Boundaries (out): Async architecture changes beyond runtime initialization style. + - Done when: runtime initialization code is single-flow and atomic in style, preserving current error context and reuse behavior. + - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::sync::tests` and `cargo check --manifest-path cli/Cargo.toml`. + +- [ ] T08: Apply incremental test/runtime separation in hooks/setup modules (status:todo) + - Goal: Improve maintainability by extracting selected large in-file test sections from `hooks.rs` and `setup.rs` into focused sibling test modules/files while preserving current test semantics. + - Boundaries (in): test module organization and local helper placement for `cli/src/services/hooks.rs` and `cli/src/services/setup.rs`. + - Boundaries (out): Full integration-test migration and non-test production refactors not needed for extraction. + - Done when: high-churn/large test slices are moved out of primary runtime files, module compiles cleanly, and affected test suites pass. + - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hooks::tests services::setup::tests` and `cargo check --manifest-path cli/Cargo.toml`. + +- [ ] T09: Validation and cleanup (status:todo) + - Goal: Execute full verification sweep, confirm behavior parity for touched domains, and sync context artifacts to current state (including dependency contract references). + - Boundaries (in): formatting/build/test checks, plan status finalization, and required context updates in `context/`. + - Boundaries (out): New feature work beyond this hardening pass. 
+ - Done when: all verification checks pass, no temporary scaffolding remains, and context files reflect final behavior/contracts. + - Verification notes: run `cargo fmt --manifest-path cli/Cargo.toml --all -- --check`, `cargo test --manifest-path cli/Cargo.toml`, `cargo build --manifest-path cli/Cargo.toml`, and repo baseline checks `nix run .#pkl-check-generated` plus `nix flake check` when context/pkl artifacts are touched. + +## 5) Open questions (if any) + +None. Scope, dependency direction, tie policy, and test-split depth were resolved during clarification. diff --git a/context/sce/agent-trace-hosted-event-intake-orchestration.md b/context/sce/agent-trace-hosted-event-intake-orchestration.md index 8106537e..43c68ad9 100644 --- a/context/sce/agent-trace-hosted-event-intake-orchestration.md +++ b/context/sce/agent-trace-hosted-event-intake-orchestration.md @@ -15,7 +15,7 @@ ## Intake contract - Provider coverage is explicit for GitHub and GitLab (`HostedProvider`). -- GitHub webhook signatures use HMAC-SHA256 and require `sha256=` match against payload body. +- GitHub webhook signatures use crate-backed HMAC-SHA256 (`hmac` + `sha2`) and require `sha256=` match against payload body. - GitLab webhook signatures use token equality against the configured shared secret. - Intake requires resolvable rewrite heads (`before`, `after`) and provider-specific repository identity (`full_name` for GitHub, `path_with_namespace` for GitLab). - `before` and `after` values must be SHA-like 40-char hex commit IDs. @@ -23,7 +23,7 @@ ## Reconciliation run orchestration contract - Provider events are normalized into `HostedReconciliationRunRequest` with provider, repo, event, old/new heads, and deterministic idempotency key. -- Deterministic replay key derivation uses provider + event + repo + old/new heads + delivery ID material and SHA256 digesting. 
+- Deterministic replay key derivation uses provider + event + repo + old/new heads + delivery ID material and crate-backed SHA256 digesting (`sha2`). - Run storage is abstracted behind `ReconciliationRunStore`; ingestion returns created vs duplicate outcome (`ReconciliationRunInsertOutcome`) for replay-safe semantics. ## Validation coverage From 6eca244160cb27bddd6905962301f068ffba44ea Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 00:42:12 +0100 Subject: [PATCH 23/39] hosted-reconciliation: Replace webhook string scanning with structured serde_json extraction Parse hosted webhook payloads through serde_json::Value instead of manual substring scanning. Enforce deterministic missing/type/object-shape validation for required fields and keep SHA checks unchanged Add parse-hardening tests and align dependency/context references with the runtime serde_json contract --- cli/Cargo.toml | 2 +- cli/src/dependency_contract.rs | 5 +- cli/src/services/hosted_reconciliation.rs | 147 +++++++++++------- context/architecture.md | 2 +- context/glossary.md | 2 +- context/overview.md | 2 +- .../sce-cli-rust-idiomatic-hardening-pass.md | 2 +- ...trace-hosted-event-intake-orchestration.md | 5 +- 8 files changed, 106 insertions(+), 61 deletions(-) diff --git a/cli/Cargo.toml b/cli/Cargo.toml index be2abbb4..7eb23c70 100644 --- a/cli/Cargo.toml +++ b/cli/Cargo.toml @@ -17,10 +17,10 @@ anyhow = "1" hmac = "0.12" inquire = "0.7" lexopt = "0.3" +serde_json = "1" sha2 = "0.10" tokio = { version = "1", default-features = false, features = ["rt"] } turso = "0" [dev-dependencies] jsonschema = "0.33" -serde_json = "1" diff --git a/cli/src/dependency_contract.rs b/cli/src/dependency_contract.rs index e5e98064..2af40095 100644 --- a/cli/src/dependency_contract.rs +++ b/cli/src/dependency_contract.rs @@ -6,12 +6,14 @@ pub fn dependency_contract_snapshot() -> ( &'static str, &'static str, &'static str, + &'static str, ) { ( Ok(()), std::any::type_name::>(), 
std::any::type_name::(), std::any::type_name::(), + std::any::type_name::(), std::any::type_name::(), std::any::type_name::(), std::any::type_name::(), @@ -24,12 +26,13 @@ mod tests { #[test] fn dependency_contract_snapshot_references_agreed_crates() { - let (result, hmac_ty, inquire_ty, lexopt_ty, sha2_ty, tokio_ty, turso_ty) = + let (result, hmac_ty, inquire_ty, lexopt_ty, serde_json_ty, sha2_ty, tokio_ty, turso_ty) = dependency_contract_snapshot(); assert!(result.is_ok()); assert!(hmac_ty.contains("hmac::")); assert!(inquire_ty.contains("inquire::")); assert!(lexopt_ty.contains("lexopt::")); + assert!(serde_json_ty.contains("serde_json::")); assert!(sha2_ty.contains("sha2::")); assert!(tokio_ty.contains("tokio::")); assert!(turso_ty.contains("turso::")); diff --git a/cli/src/services/hosted_reconciliation.rs b/cli/src/services/hosted_reconciliation.rs index 2f8665a6..bac0eadd 100644 --- a/cli/src/services/hosted_reconciliation.rs +++ b/cli/src/services/hosted_reconciliation.rs @@ -1,5 +1,6 @@ use anyhow::{bail, ensure, Result}; use hmac::{Hmac, Mac}; +use serde_json::Value; use sha2::{Digest, Sha256}; #[derive(Clone, Copy, Debug, Eq, PartialEq)] @@ -399,8 +400,10 @@ pub fn ingest_hosted_rewrite_event( } fn parse_run_request(request: &HostedWebhookRequest) -> Result { - let old_head = find_required_json_string(&request.payload_json, "before")?; - let new_head = find_required_json_string(&request.payload_json, "after")?; + let payload = parse_payload_json(&request.payload_json)?; + + let old_head = find_required_field_string(&payload, "before")?; + let new_head = find_required_field_string(&payload, "after")?; ensure!( is_sha_like(&old_head), @@ -412,9 +415,11 @@ fn parse_run_request(request: &HostedWebhookRequest) -> Result find_required_json_string(&request.payload_json, "full_name")?, + HostedProvider::GitHub => { + find_required_nested_field_string(&payload, "repository", "full_name")? 
+ } HostedProvider::GitLab => { - find_required_json_string(&request.payload_json, "path_with_namespace")? + find_required_nested_field_string(&payload, "project", "path_with_namespace")? } }; @@ -487,62 +492,56 @@ fn derive_idempotency_key( format!("hosted:{}:{}", provider.as_str(), digest) } -fn find_required_json_string(payload: &str, key: &str) -> Result<String> { - let key_pattern = format!("\"{}\"", key); - let Some(key_start) = payload.find(&key_pattern) else { - bail!("invalid hosted event payload: missing '{}' field", key); - }; - - let mut idx = key_start + key_pattern.len(); - while idx < payload.len() && payload.as_bytes()[idx].is_ascii_whitespace() { - idx += 1; - } - - ensure!( - idx < payload.len() && payload.as_bytes()[idx] == b':', - "invalid hosted event payload: malformed '{}' field", - key - ); - idx += 1; +fn parse_payload_json(payload_json: &str) -> Result<Value> { + serde_json::from_str(payload_json) + .map_err(|_| anyhow::anyhow!("invalid hosted event payload: malformed JSON")) +} - while idx < payload.len() && payload.as_bytes()[idx].is_ascii_whitespace() { - idx += 1; - } +fn find_required_field<'a>(payload: &'a Value, field: &str) -> Result<&'a Value> { + payload + .get(field) + .ok_or_else(|| anyhow::anyhow!("invalid hosted event payload: missing '{}' field", field)) +} - ensure!( - idx < payload.len() && payload.as_bytes()[idx] == b'"', - "invalid hosted event payload: '{}' field must be a string", - key - ); - idx += 1; - - let mut value = String::new(); - let mut escaped = false; - while idx < payload.len() { - let byte = payload.as_bytes()[idx]; - idx += 1; - if escaped { - value.push(byte as char); - escaped = false; - continue; - } +fn find_required_field_string(payload: &Value, field: &str) -> Result<String> { + let value = find_required_field(payload, field)?; + let Some(as_str) = value.as_str() else { + bail!( + "invalid hosted event payload: '{}' field must be a string", + field + ); + }; - if byte == b'\\' { - escaped = true; - continue; - } + 
Ok(as_str.to_string()) +} - if byte == b'"' { - return Ok(value); - } +fn find_required_nested_field_string( + payload: &Value, + parent_field: &str, + nested_field: &str, +) -> Result<String> { + let parent = find_required_field(payload, parent_field)?; + let Some(parent_object) = parent.as_object() else { + bail!( + "invalid hosted event payload: '{}' field must be an object", + parent_field + ); + }; - value.push(byte as char); - } + let value = parent_object.get(nested_field).ok_or_else(|| { + anyhow::anyhow!( + "invalid hosted event payload: missing '{}' field", + nested_field + ) + })?; + let Some(as_str) = value.as_str() else { + bail!( + "invalid hosted event payload: '{}' field must be a string", + nested_field + ); + }; - bail!( - "invalid hosted event payload: unterminated '{}' string", - key - ) + Ok(as_str.to_string()) } fn is_sha_like(value: &str) -> bool { @@ -750,6 +749,46 @@ mod tests { .contains("invalid hosted event payload: missing 'before' field")); } + #[test] + fn intake_requires_before_and_after_to_be_strings() { + let payload = + "{\"before\":123,\"after\":\"2222222222222222222222222222222222222222\",\"repository\":{\"full_name\":\"acme/sce\"}}" + .to_string(); + let mut store = FakeReconciliationRunStore::default(); + let request = HostedWebhookRequest { + provider: HostedProvider::GitHub, + event: "push".to_string(), + signature: github_signature("super-secret", &payload), + delivery_id: Some("delivery-1".to_string()), + shared_secret: "super-secret".to_string(), + payload_json: payload, + }; + + let error = ingest_hosted_rewrite_event(request, &mut store).expect_err("must fail"); + assert!(error + .to_string() + .contains("invalid hosted event payload: 'before' field must be a string")); + } + + #[test] + fn intake_requires_provider_repository_object_shape() { + let payload = "{\"before\":\"1111111111111111111111111111111111111111\",\"after\":\"2222222222222222222222222222222222222222\",\"repository\":\"acme/sce\"}".to_string(); + let mut store 
= FakeReconciliationRunStore::default(); + let request = HostedWebhookRequest { + provider: HostedProvider::GitHub, + event: "push".to_string(), + signature: github_signature("super-secret", &payload), + delivery_id: Some("delivery-1".to_string()), + shared_secret: "super-secret".to_string(), + payload_json: payload, + }; + + let error = ingest_hosted_rewrite_event(request, &mut store).expect_err("must fail"); + assert!(error + .to_string() + .contains("invalid hosted event payload: 'repository' field must be an object")); + } + #[test] fn idempotency_key_is_deterministic() { let key_a = derive_idempotency_key( diff --git a/context/architecture.md b/context/architecture.md index 94482411..e8fbb2f0 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -89,7 +89,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `flake.nix` (root) keeps nested CLI input wiring aligned by forwarding `nixpkgs`, `flake-utils`, and `rust-overlay` into the `cli` path input so repository-level `nix flake check` can evaluate nested CLI checks deterministically. - `cli/Cargo.toml` keeps crates.io-ready package metadata populated while `publish = false` remains the current policy; local Cargo release/install verification targets `cargo build --manifest-path cli/Cargo.toml --release` and `cargo install --path cli --locked`. Tokio is intentionally constrained to `default-features = false` with `features = ["rt"]` to match current runtime API usage. -This phase establishes compile-safe extension seams with a minimal dependency baseline (`anyhow`, `hmac`, `inquire`, `lexopt`, `sha2`, `tokio`, `turso`); local Turso connectivity smoke checks now exist, while broader runtime integrations remain deferred. 
+This phase establishes compile-safe extension seams with a minimal dependency baseline (`anyhow`, `hmac`, `inquire`, `lexopt`, `serde_json`, `sha2`, `tokio`, `turso`); local Turso connectivity smoke checks now exist, while broader runtime integrations remain deferred. ## Shared Context Drift parity mapping diff --git a/context/glossary.md b/context/glossary.md index 26f033d1..47fc6b93 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -21,7 +21,7 @@ - `sce` (CLI foundation): Rust binary crate at `cli/` with implemented setup installation flow and placeholder behavior for other command domains. - `command surface contract`: The static command catalog in `cli/src/command_surface.rs` that marks each top-level command as `implemented` or `placeholder`. - `command loop`: The `lexopt` parser + dispatcher in `cli/src/app.rs` that routes `help`, `setup`, `mcp`, `hooks`, and `sync`, executes setup installation, emits TODO placeholders for non-implemented commands, and returns deterministic actionable errors for invalid invocation. -- `sce dependency contract`: Minimal crate dependency baseline declared in `cli/Cargo.toml` and referenced via `cli/src/dependency_contract.rs` (`anyhow`, `hmac`, `inquire`, `lexopt`, `sha2`, `tokio`, `turso`). +- `sce dependency contract`: Minimal crate dependency baseline declared in `cli/Cargo.toml` and referenced via `cli/src/dependency_contract.rs` (`anyhow`, `hmac`, `inquire`, `lexopt`, `serde_json`, `sha2`, `tokio`, `turso`). - `local Turso adapter`: Async data-layer module in `cli/src/services/local_db.rs` that initializes local DB targets with `turso::Builder::new_local(...)` and runs execute/query smoke checks. - `sync Turso smoke gate`: Behavior in `cli/src/services/sync.rs` where the `sync` placeholder command runs an in-memory local Turso smoke check under a lazily initialized shared tokio current-thread runtime before returning placeholder cloud-sync messaging. 
- `setup service orchestration`: Setup execution logic in `cli/src/services/setup.rs` that resolves target selection, installs embedded assets, and emits deterministic success messaging per target. diff --git a/context/overview.md b/context/overview.md index f2fd5625..f76e400e 100644 --- a/context/overview.md +++ b/context/overview.md @@ -5,7 +5,7 @@ This repository maintains shared assistant configuration for OpenCode and Claude It also includes an early Rust CLI foundation at `cli/` for Shared Context Engineering workflows. The crate ships onboarding and usage documentation at `cli/README.md` that reflects current implemented vs placeholder behavior. -The CLI crate currently enforces a minimal dependency contract: `anyhow`, `hmac`, `inquire`, `lexopt`, `sha2`, `tokio`, and `turso`. +The CLI crate currently enforces a minimal dependency contract: `anyhow`, `hmac`, `inquire`, `lexopt`, `serde_json`, `sha2`, `tokio`, and `turso`. Its command loop is implemented with `lexopt` argument parsing and `anyhow` error handling, with real setup orchestration, implemented `doctor` rollout validation, and placeholder dispatch for deferred commands through explicit service contracts. The `setup` command includes an `inquire`-backed target-selection flow: default interactive selection for OpenCode/Claude/both, explicit non-interactive target flags (`--opencode`, `--claude`, `--both`), deterministic mutually-exclusive validation, and non-destructive cancellation exits. The CLI now compiles an embedded setup asset manifest from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` via `cli/build.rs`; `cli/src/services/setup.rs` exposes deterministic normalized relative paths plus file bytes and target-scoped iteration without runtime reads from `config/`. 
diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md index f21c736d..c8c8413b 100644 --- a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -42,7 +42,7 @@ Non-goals (deferred): - Done when: handcrafted crypto helpers are removed/replaced; dependency contract compiles and tests validate equivalent signature/hash behavior. - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hosted_reconciliation::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T02: Replace fragile hosted JSON string scanning with structured parsing (status:todo) +- [x] T02: Replace fragile hosted JSON string scanning with structured parsing (status:done) - Goal: Replace `find_required_json_string` usage with `serde_json` parsing (typed/value extraction) for `before`, `after`, and provider-specific repository fields. - Boundaries (in): Hosted payload parse path and parse-focused tests in `cli/src/services/hosted_reconciliation.rs`; dependency additions in `cli/Cargo.toml` as needed. - Boundaries (out): New provider support or webhook schema expansion beyond existing GitHub/GitLab fields. diff --git a/context/sce/agent-trace-hosted-event-intake-orchestration.md b/context/sce/agent-trace-hosted-event-intake-orchestration.md index 43c68ad9..eb16c7eb 100644 --- a/context/sce/agent-trace-hosted-event-intake-orchestration.md +++ b/context/sce/agent-trace-hosted-event-intake-orchestration.md @@ -17,7 +17,9 @@ - Provider coverage is explicit for GitHub and GitLab (`HostedProvider`). - GitHub webhook signatures use crate-backed HMAC-SHA256 (`hmac` + `sha2`) and require `sha256=` match against payload body. - GitLab webhook signatures use token equality against the configured shared secret. 
-- Intake requires resolvable rewrite heads (`before`, `after`) and provider-specific repository identity (`full_name` for GitHub, `path_with_namespace` for GitLab). +- Intake payload parsing uses structured `serde_json::Value` extraction (no manual substring scanning) for `before`, `after`, and provider-specific repository identity. +- Intake requires resolvable rewrite heads (`before`, `after`) and provider-specific repository identity (`repository.full_name` for GitHub, `project.path_with_namespace` for GitLab). +- Missing fields, invalid container types, and non-string required values fail with deterministic `invalid hosted event payload: ...` messages. - `before` and `after` values must be SHA-like 40-char hex commit IDs. ## Reconciliation run orchestration contract @@ -32,6 +34,7 @@ - GitLab token verification + run creation. - Duplicate event replay behavior returns duplicate outcome without creating a new side effect class. - Required payload field validation for old/new head resolution. +- Required payload field type/object-shape validation for deterministic parse failures. - Deterministic idempotency key stability for identical inputs. ## Verification evidence From f0bdf82ab30766119868c3d77e597c94dbaa064a Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 00:49:02 +0100 Subject: [PATCH 24/39] hosted-reconciliation: Use epsilon tie window for rewrite score ranking Replace direct float equality in rewrite candidate scoring with an explicit 1e-5 tie window so near-equal scores are handled deterministically as ambiguous instead of precision-sensitive winners. Add boundary tests for within-epsilon and outside-epsilon score comparisons while preserving existing threshold and candidate-ordering behavior. 
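
For illustration, the tie-window comparison described above can be sketched as follows. This is a minimal standalone sketch that uses bare `f32` values rather than the crate's `Score` type; the names `TIE_EPSILON` and `compare_with_tie_window` are hypothetical and chosen only for this example.

```rust
use std::cmp::Ordering;

// Assumed tie window for this sketch, matching the 1e-5 value named above.
const TIE_EPSILON: f32 = 0.000_01;

/// Compare two scores, treating any delta within the tie window as a tie.
fn compare_with_tie_window(left: f32, right: f32) -> Ordering {
    let delta = left - right;
    if delta.abs() <= TIE_EPSILON {
        // Near-equal scores are reported as equal so the caller can mark
        // the outcome ambiguous instead of picking a precision-sensitive winner.
        Ordering::Equal
    } else if delta > 0.0 {
        Ordering::Greater
    } else {
        Ordering::Less
    }
}
```

Note that an epsilon comparison of this kind is not transitive: a can tie with b and b with c while a and c fall outside the window, so callers should not assume it behaves like a total order.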
--- cli/src/services/hosted_reconciliation.rs | 111 ++++++++++++++++-- .../sce-cli-rust-idiomatic-hardening-pass.md | 2 +- .../sce/agent-trace-rewrite-mapping-engine.md | 7 +- 3 files changed, 107 insertions(+), 13 deletions(-) diff --git a/cli/src/services/hosted_reconciliation.rs b/cli/src/services/hosted_reconciliation.rs index bac0eadd..91f0285c 100644 --- a/cli/src/services/hosted_reconciliation.rs +++ b/cli/src/services/hosted_reconciliation.rs @@ -2,6 +2,7 @@ use anyhow::{bail, ensure, Result}; use hmac::{Hmac, Mac}; use serde_json::Value; use sha2::{Digest, Sha256}; +use std::cmp::Ordering; #[derive(Clone, Copy, Debug, Eq, PartialEq)] pub enum HostedProvider { @@ -58,6 +59,7 @@ pub enum HostedIntakeOutcome { } pub const FUZZY_MAPPING_THRESHOLD: f32 = 0.60; +const SCORE_TIE_EPSILON: f32 = 0.000_01; #[derive(Clone, Copy, Debug, Eq, PartialEq)] pub enum MappingMethod { @@ -105,6 +107,14 @@ impl Score { fn value(self) -> f32 { self.0 } + + fn max(self, other: Self) -> Self { + if self.value() >= other.value() { + self + } else { + other + } + } } #[derive(Clone, Debug, Eq, PartialEq)] @@ -312,16 +322,19 @@ fn select_scored_candidate( top_new_sha = Some(candidate.new_commit_sha.clone()); tied_new_shas.push(candidate.new_commit_sha.clone()); } - Some(current) if score.value() > current.value() => { - top_score = Some(score); - top_new_sha = Some(candidate.new_commit_sha.clone()); - tied_new_shas.clear(); - tied_new_shas.push(candidate.new_commit_sha.clone()); - } - Some(current) if score.value() == current.value() => { - tied_new_shas.push(candidate.new_commit_sha.clone()); - } - Some(_) => {} + Some(current) => match compare_scores_with_tie_window(score, current) { + Ordering::Greater => { + top_score = Some(score); + top_new_sha = Some(candidate.new_commit_sha.clone()); + tied_new_shas.clear(); + tied_new_shas.push(candidate.new_commit_sha.clone()); + } + Ordering::Equal => { + top_score = Some(score.max(current)); + 
tied_new_shas.push(candidate.new_commit_sha.clone()); + } + Ordering::Less => {} + }, } } @@ -335,6 +348,17 @@ fn select_scored_candidate( } } +fn compare_scores_with_tie_window(left: Score, right: Score) -> Ordering { + let delta = left.value() - right.value(); + if delta.abs() <= SCORE_TIE_EPSILON { + Ordering::Equal + } else if delta > 0.0 { + Ordering::Greater + } else { + Ordering::Less + } +} + fn outcome_from_score_decision( source: &RewriteSourceCommit, decision: ScoreDecision, @@ -892,6 +916,73 @@ mod tests { } } + #[test] + fn mapping_engine_uses_epsilon_window_for_near_equal_ties() { + let source = RewriteSourceCommit { + old_commit_sha: "old-2b".to_string(), + patch_id: None, + }; + let candidates = vec![ + RewriteCandidateCommit { + new_commit_sha: "new-a".to_string(), + patch_id: None, + range_diff_score: Some(score(0.82000)), + fuzzy_score: None, + }, + RewriteCandidateCommit { + new_commit_sha: "new-z".to_string(), + patch_id: None, + range_diff_score: Some(score(0.820009)), + fuzzy_score: None, + }, + ]; + + let outcome = map_rewritten_commit(&source, &candidates); + match outcome { + RewriteMappingOutcome::Unresolved(unresolved) => { + assert_eq!(unresolved.kind, UnresolvedMappingKind::Ambiguous); + assert_eq!( + unresolved.candidate_new_shas, + vec!["new-a".to_string(), "new-z".to_string()] + ); + assert_eq!(unresolved.best_confidence, Some(score(0.820009))); + } + other => panic!("expected ambiguous unresolved outcome, got {other:?}"), + } + } + + #[test] + fn mapping_engine_distinguishes_scores_outside_epsilon_window() { + let source = RewriteSourceCommit { + old_commit_sha: "old-2c".to_string(), + patch_id: None, + }; + let candidates = vec![ + RewriteCandidateCommit { + new_commit_sha: "new-a".to_string(), + patch_id: None, + range_diff_score: Some(score(0.82000)), + fuzzy_score: None, + }, + RewriteCandidateCommit { + new_commit_sha: "new-z".to_string(), + patch_id: None, + range_diff_score: Some(score(0.82002)), + fuzzy_score: None, + }, + 
]; + + let outcome = map_rewritten_commit(&source, &candidates); + match outcome { + RewriteMappingOutcome::Mapped(mapped) => { + assert_eq!(mapped.new_commit_sha, "new-z"); + assert_eq!(mapped.method, MappingMethod::RangeDiffHint); + assert_eq!(mapped.confidence, score(0.82002)); + } + other => panic!("expected mapped outcome, got {other:?}"), + } + } + #[test] fn mapping_engine_reports_unmatched_when_no_signals_exist() { let source = RewriteSourceCommit { diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md index c8c8413b..a5f08c30 100644 --- a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -49,7 +49,7 @@ Non-goals (deferred): - Done when: no manual substring search parser remains on hosted intake path; missing/invalid-field failures are deterministic and covered by tests. - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hosted_reconciliation::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T03: Introduce epsilon-based tie handling for rewrite score comparison (status:todo) +- [x] T03: Introduce epsilon-based tie handling for rewrite score comparison (status:done) - Goal: Remove direct float equality check in candidate tie detection and apply explicit epsilon tie-window semantics. - Boundaries (in): Tie detection and mapping-outcome tests in `cli/src/services/hosted_reconciliation.rs`. - Boundaries (out): Replacing score model or threshold policy (`FUZZY_MAPPING_THRESHOLD`) beyond tie logic. 
diff --git a/context/sce/agent-trace-rewrite-mapping-engine.md b/context/sce/agent-trace-rewrite-mapping-engine.md index a1516fbd..13c35a16 100644 --- a/context/sce/agent-trace-rewrite-mapping-engine.md +++ b/context/sce/agent-trace-rewrite-mapping-engine.md @@ -16,6 +16,7 @@ Resolve hosted/local rewrite old->new commit identity with deterministic, explai - Score contract: - `Score` is constrained to finite `[0.0, 1.0]`. - Mapping threshold: `FUZZY_MAPPING_THRESHOLD = 0.60`. + - Tie window: score comparisons use `SCORE_TIE_EPSILON = 0.00001`; score deltas within epsilon are treated as ties. - Determinism: - Candidates are sorted by `new_commit_sha` before decisioning. - Tied top-score outcomes are returned in stable SHA order. @@ -29,11 +30,11 @@ Resolve hosted/local rewrite old->new commit identity with deterministic, explai - If multiple exact patch-id matches exist, return unresolved `ambiguous`. 2. Range-diff scoring - Select highest `range_diff_score` when no patch-id mapping exists. - - Tie for highest score returns unresolved `ambiguous`. + - Near-equal top scores (within epsilon tie window) return unresolved `ambiguous`. - Highest score `< 0.60` returns unresolved `low_confidence`. 3. Fuzzy fallback scoring - Applied only when no patch-id or range-diff resolution exists. - - Uses same tie and threshold behavior as range-diff. + - Uses the same epsilon tie-window and threshold behavior as range-diff. 4. Unmatched - If no usable range-diff or fuzzy signals exist, return unresolved `unmatched`. @@ -50,6 +51,8 @@ Resolve hosted/local rewrite old->new commit identity with deterministic, explai ## Verification coverage - Exact match fixture: patch-id match wins over stronger non-exact scores. - Ambiguous fixture: tied best range-diff scores return deterministic unresolved candidates. +- Epsilon fixture: near-equal range-diff scores within epsilon return unresolved `ambiguous` with deterministic candidate ordering. 
+- Distinct fixture: range-diff scores outside epsilon resolve to a single mapped winner. - Unmatched fixture: no score signals produces `unmatched`. - Low-confidence fixture: fuzzy best score `< 0.60` returns `low_confidence`. From e3ce44f28ec1b140c404d9e7050df4536310effb Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 00:53:31 +0100 Subject: [PATCH 25/39] local_db: Eliminate lossy DB path conversion in Turso target handling Refactor local DB target resolution to use an explicit fallible UTF-8 path conversion instead of to_string_lossy, returning contextual errors for invalid paths. --- cli/src/services/local_db.rs | 43 +++++++------------ .../sce-cli-rust-idiomatic-hardening-pass.md | 2 +- 2 files changed, 17 insertions(+), 28 deletions(-) diff --git a/cli/src/services/local_db.rs b/cli/src/services/local_db.rs index 6dd2ced7..d85b142b 100644 --- a/cli/src/services/local_db.rs +++ b/cli/src/services/local_db.rs @@ -144,17 +144,22 @@ pub struct CoreSchemaMigrationOutcome { } async fn connect_local(target: LocalDatabaseTarget<'_>) -> Result { - let location = match target { - LocalDatabaseTarget::InMemory => ":memory:".to_string(), - LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), - }; - - let db = Builder::new_local(&location).build().await?; + let location = target_location(target)?; + let db = Builder::new_local(location).build().await?; let conn = db.connect()?; conn.execute("PRAGMA foreign_keys = ON", ()).await?; Ok(conn) } +fn target_location(target: LocalDatabaseTarget<'_>) -> Result<&str> { + match target { + LocalDatabaseTarget::InMemory => Ok(":memory:"), + LocalDatabaseTarget::Path(path) => path + .to_str() + .ok_or_else(|| anyhow!("Local DB path must be valid UTF-8: {}", path.display())), + } +} + pub async fn apply_core_schema_migrations( target: LocalDatabaseTarget<'_>, ) -> Result { @@ -214,23 +219,13 @@ mod tests { kind: &str, name: &str, ) -> Result { - let location = match target { - 
LocalDatabaseTarget::InMemory => ":memory:".to_string(), - LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), - }; - let db = turso::Builder::new_local(&location).build().await?; - let conn = db.connect()?; + let conn = super::connect_local(target).await?; let mut rows = conn.query(&row_exists_query(kind, name), ()).await?; Ok(rows.next().await?.is_some()) } async fn repository_count(target: LocalDatabaseTarget<'_>) -> Result { - let location = match target { - LocalDatabaseTarget::InMemory => ":memory:".to_string(), - LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), - }; - let db = turso::Builder::new_local(&location).build().await?; - let conn = db.connect()?; + let conn = super::connect_local(target).await?; let mut rows = conn.query("SELECT COUNT(*) FROM repositories", ()).await?; let row = rows .next() @@ -244,12 +239,7 @@ mod tests { } async fn fetch_single_integer(target: LocalDatabaseTarget<'_>, query: &str) -> Result { - let location = match target { - LocalDatabaseTarget::InMemory => ":memory:".to_string(), - LocalDatabaseTarget::Path(path) => path.to_string_lossy().into_owned(), - }; - let db = turso::Builder::new_local(&location).build().await?; - let conn = db.connect()?; + let conn = super::connect_local(target).await?; let mut rows = conn.query(query, ()).await?; let row = rows .next() @@ -388,9 +378,8 @@ mod tests { )))?; runtime.block_on(async { - let db = turso::Builder::new_local(path.to_string_lossy().as_ref()) - .build() - .await?; + let location = super::target_location(LocalDatabaseTarget::Path(&path))?; + let db = turso::Builder::new_local(location).build().await?; let conn = db.connect()?; conn.execute( diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md index a5f08c30..d0b10b50 100644 --- a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -56,7 +56,7 @@ 
Non-goals (deferred): - Done when: tie behavior is epsilon-based, deterministic, and tested for near-equal/clearly-different scores. - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hosted_reconciliation::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T04: Eliminate lossy DB path string conversion in local DB service/tests (status:todo) +- [x] T04: Eliminate lossy DB path string conversion in local DB service/tests (status:done) - Goal: Refactor local DB target path handling to avoid `to_string_lossy()` for DB location construction, using `Path`-native or explicit fallible conversion with context. - Boundaries (in): `cli/src/services/local_db.rs` runtime and test helpers. - Boundaries (out): Turso API redesign assumptions or broader filesystem abstraction rewrite. From ae7e72b726901bcb37c52d9c1859c72b5eff2ce5 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 00:57:42 +0100 Subject: [PATCH 26/39] agent_trace: Remove crate-level dead_code suppression Drop the module-wide `#![allow(dead_code)]` from `services/agent_trace.rs` so dead-code hygiene is no longer broadly suppressed for this subsystem. 
--- cli/src/services/agent_trace.rs | 2 -- context/plans/sce-cli-rust-idiomatic-hardening-pass.md | 2 +- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/cli/src/services/agent_trace.rs b/cli/src/services/agent_trace.rs index b842d1de..d6ab0c55 100644 --- a/cli/src/services/agent_trace.rs +++ b/cli/src/services/agent_trace.rs @@ -1,5 +1,3 @@ -#![allow(dead_code)] - use std::collections::BTreeMap; pub const TRACE_VERSION: &str = "0.1.0"; diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md index d0b10b50..69ce2364 100644 --- a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -63,7 +63,7 @@ Non-goals (deferred): - Done when: targeted lossy conversions at current call sites are removed/replaced with explicit safe handling and tests still pass. - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::local_db::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T05: Remove broad dead-code suppression from agent trace module (status:todo) +- [x] T05: Remove broad dead-code suppression from agent trace module (status:done) - Goal: Remove `#![allow(dead_code)]` from `cli/src/services/agent_trace.rs` and apply narrow item-level handling only where required. - Boundaries (in): `cli/src/services/agent_trace.rs` and directly affected tests/usages. - Boundaries (out): Large-scale pruning of placeholder Agent Trace contracts not required to satisfy compiler hygiene. From 24118fd9cfffdf430ee3d4a29757b762855e1403 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 01:03:40 +0100 Subject: [PATCH 27/39] cli: Remove top-level parser tail_args clone Avoid cloning tail_args just to initialize lexopt::Parser by passing a borrowed iterator (iter().map(String::as_str)), preserving existing parse behavior while reducing avoidable allocation. 
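
As a rough illustration of the borrowed-iterator pattern, the sketch below uses a hypothetical stand-in (`collect_args`) for a parser constructor that accepts any iterator whose items convert into `OsString`. It is not the `lexopt` API itself, only the same shape of bound, and all names here are assumptions made for the example.

```rust
use std::ffi::OsString;

// Hypothetical stand-in for a parser constructor: accepts any iterator whose
// items can be converted into OsString.
fn collect_args<I>(args: I) -> Vec<OsString>
where
    I: IntoIterator,
    I::Item: Into<OsString>,
{
    args.into_iter().map(Into::into).collect()
}

// Feed the borrowed arguments through as &str values; the caller keeps
// ownership of `tail_args` and no clone of the Vec<String> is needed.
fn first_arg_without_clone(tail_args: &[String]) -> Option<OsString> {
    collect_args(tail_args.iter().map(String::as_str))
        .into_iter()
        .next()
}
```

Because only borrowed `&str` values are handed over, `tail_args` remains available afterwards for checks such as `tail_args.len()`.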
--- cli/src/app.rs | 2 +- context/plans/sce-cli-rust-idiomatic-hardening-pass.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/cli/src/app.rs b/cli/src/app.rs index 2fc3f4d2..e2b4a015 100644 --- a/cli/src/app.rs +++ b/cli/src/app.rs @@ -52,7 +52,7 @@ where return Ok(Command::Help); } - let mut parser = lexopt::Parser::from_args(tail_args.clone()); + let mut parser = lexopt::Parser::from_args(tail_args.iter().map(String::as_str)); match parser.next()? { Some(lexopt::Arg::Long("help")) => { if tail_args.len() == 1 { diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md index 69ce2364..af40b470 100644 --- a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -70,7 +70,7 @@ Non-goals (deferred): - Done when: crate-level dead-code allow is absent and compile/test remain green without broad suppression. - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::agent_trace::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T06: Remove avoidable `tail_args` clone in top-level parser (status:todo) +- [x] T06: Remove avoidable `tail_args` clone in top-level parser (status:done) - Goal: Restructure top-level parsing so `lexopt` consumes arguments without cloning `tail_args` solely for parser initialization. - Boundaries (in): `cli/src/app.rs` parse flow and parser tests. - Boundaries (out): Command-surface behavioral changes unrelated to clone removal. From 2e982acfb335efd3ec7655679e3a60714cbd141c Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 01:07:24 +0100 Subject: [PATCH 28/39] sync: Simplify shared runtime initialization with OnceLock get_or_init Replace the manual set-then-get flow with a single get_or_init path after runtime construction. 
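
The initialization pattern can be sketched with the standard library alone. In this sketch a `String` stands in for the tokio runtime, and `SHARED`, `build_value`, and `shared_value` are hypothetical names used only for illustration; the early `get()` check is likewise an assumption about how callers avoid rebuilding the value.

```rust
use std::sync::OnceLock;

// Stand-in for the shared runtime cell; a String replaces the real runtime.
static SHARED: OnceLock<String> = OnceLock::new();

// Hypothetical fallible constructor, standing in for the runtime builder.
fn build_value() -> Result<String, String> {
    Ok("runtime".to_string())
}

fn shared_value() -> Result<&'static String, String> {
    // Fast path: reuse the value if it has already been initialized.
    if let Some(existing) = SHARED.get() {
        return Ok(existing);
    }
    // Build first so a construction error can be returned with `?`.
    let value = build_value()?;
    // A single get_or_init call replaces the set-then-get sequence. If another
    // thread won the race, its value is returned and ours is dropped.
    Ok(SHARED.get_or_init(|| value))
}
```

Building the value before calling `get_or_init` keeps the fallible step outside the closure, which matters because the stable `get_or_init` closure cannot return an error.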
--- cli/src/services/sync.rs | 6 +----- context/plans/sce-cli-rust-idiomatic-hardening-pass.md | 2 +- 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/cli/src/services/sync.rs b/cli/src/services/sync.rs index 9a4dd141..e2a887a7 100644 --- a/cli/src/services/sync.rs +++ b/cli/src/services/sync.rs @@ -105,11 +105,7 @@ fn shared_runtime() -> Result<&'static tokio::runtime::Runtime> { .build() .context("failed to create shared tokio runtime for sync placeholder")?; - let _ = SYNC_RUNTIME.set(runtime); - - SYNC_RUNTIME - .get() - .context("shared tokio runtime for sync placeholder is unavailable") + Ok(SYNC_RUNTIME.get_or_init(|| runtime)) } pub fn run_placeholder_sync() -> Result { diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md index af40b470..5cf75629 100644 --- a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -77,7 +77,7 @@ Non-goals (deferred): - Done when: `parse_command` no longer clones `tail_args` for `Parser::from_args`, with behavior preserved and tests passing. - Verification notes: run `cargo test --manifest-path cli/Cargo.toml app::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T07: Simplify sync runtime initialization with idiomatic OnceLock API (status:todo) +- [x] T07: Simplify sync runtime initialization with idiomatic OnceLock API (status:done) - Goal: Replace manual get/set/get runtime init in `shared_runtime` with `OnceLock` idioms (`get_or_try_init` or equivalent safe pattern). - Boundaries (in): `cli/src/services/sync.rs` runtime init path and relevant tests. - Boundaries (out): Async architecture changes beyond runtime initialization style. 
From 0ef5cd9da8d5b8a8c7e48ab0c664fc02f7f4fc11 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 01:27:24 +0100 Subject: [PATCH 29/39] services: Refactor hooks and setup test suites into dedicated test modules Move large inline #[cfg(test)] blocks out of hooks.rs and setup.rs into sibling tests.rs files. --- cli/src/services/hooks.rs | 874 +----------------- cli/src/services/hooks/tests.rs | 866 +++++++++++++++++ cli/src/services/setup.rs | 646 +------------ cli/src/services/setup/tests.rs | 639 +++++++++++++ .../sce-cli-rust-idiomatic-hardening-pass.md | 26 +- 5 files changed, 1531 insertions(+), 1520 deletions(-) create mode 100644 cli/src/services/hooks/tests.rs create mode 100644 cli/src/services/setup/tests.rs diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 25cd103a..33dd0aaa 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -944,876 +944,4 @@ pub fn run_placeholder_hooks() -> Result { } #[cfg(test)] -mod tests { - use anyhow::Result; - - use crate::services::agent_trace::{ - build_trace_payload, ContributorInput, ContributorType, ConversationInput, - FileAttributionInput, QualityStatus, RangeInput, TraceAdapterInput, - METADATA_QUALITY_STATUS, METADATA_REWRITE_CONFIDENCE, METADATA_REWRITE_FROM, - METADATA_REWRITE_METHOD, - }; - - use super::{ - apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, - finalize_pre_commit_checkpoint, finalize_rewrite_trace, process_trace_retry_queue, - run_placeholder_hooks, CommitMsgRuntimeState, GeneratedRegionEvent, - GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, PendingCheckpoint, - PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, PersistenceFailure, - PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, PostCommitFinalization, - PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, PostRewriteFinalization, - PostRewriteNoOpReason, PostRewriteRuntimeState, 
PreCommitFinalization, PreCommitNoOpReason, - PreCommitRuntimeState, PreCommitTreeAnchors, RetryMetricsSink, RetryProcessingMetric, - RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, RewriteTraceFinalization, - RewriteTraceInput, RewriteTraceNoOpReason, TraceEmissionLedger, TraceNote, - TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry, - CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY, - }; - - fn sample_pending_checkpoint() -> PendingCheckpoint { - PendingCheckpoint { - files: vec![PendingFileCheckpoint { - path: "src/lib.rs".to_string(), - staged_ranges: vec![PendingLineRange { - start_line: 1, - end_line: 3, - }], - unstaged_ranges: vec![PendingLineRange { - start_line: 4, - end_line: 6, - }], - }], - } - } - - fn sample_runtime() -> PreCommitRuntimeState { - PreCommitRuntimeState { - sce_disabled: false, - cli_available: true, - is_bare_repo: false, - } - } - - fn sample_anchors() -> PreCommitTreeAnchors { - PreCommitTreeAnchors { - index_tree: "index-tree-sha".to_string(), - head_tree: Some("head-tree-sha".to_string()), - } - } - - #[derive(Default)] - struct FakeEmissionLedger { - emitted: Vec, - } - - impl TraceEmissionLedger for FakeEmissionLedger { - fn has_emitted(&self, commit_sha: &str) -> bool { - self.emitted.iter().any(|sha| sha == commit_sha) - } - - fn mark_emitted(&mut self, commit_sha: &str) { - self.emitted.push(commit_sha.to_string()); - } - } - - struct FakeNotesWriter { - result: PersistenceWriteResult, - writes: Vec, - } - - impl FakeNotesWriter { - fn new(result: PersistenceWriteResult) -> Self { - Self { - result, - writes: Vec::new(), - } - } - } - - impl TraceNotesWriter for FakeNotesWriter { - fn write_note(&mut self, note: TraceNote) -> PersistenceWriteResult { - self.writes.push(note); - self.result.clone() - } - } - - struct FakeRecordStore { - result: PersistenceWriteResult, - } - - impl FakeRecordStore { - fn new(result: PersistenceWriteResult) -> Self { - Self { result } - } - 
} - - impl TraceRecordStore for FakeRecordStore { - fn write_trace_record( - &mut self, - _record: super::PersistedTraceRecord, - ) -> PersistenceWriteResult { - self.result.clone() - } - } - - #[derive(Default)] - struct FakeRetryQueue { - entries: Vec, - } - - #[derive(Default)] - struct FakeRetryMetricsSink { - events: Vec, - } - - #[derive(Default)] - struct FakeRewriteRemapIngestion { - seen_requests: Vec, - duplicate_keys: Vec, - seen_keys: std::collections::BTreeSet, - } - - impl RewriteRemapIngestion for FakeRewriteRemapIngestion { - fn ingest(&mut self, request: RewriteRemapRequest) -> Result { - let accepted = self.seen_keys.insert(request.idempotency_key.clone()); - if !accepted { - self.duplicate_keys.push(request.idempotency_key.clone()); - } - self.seen_requests.push(request); - Ok(accepted) - } - } - - impl TraceRetryQueue for FakeRetryQueue { - fn enqueue(&mut self, entry: TraceRetryQueueEntry) -> Result<()> { - self.entries.push(entry); - Ok(()) - } - - fn dequeue_next(&mut self) -> Result> { - if self.entries.is_empty() { - return Ok(None); - } - - Ok(Some(self.entries.remove(0))) - } - } - - impl RetryMetricsSink for FakeRetryMetricsSink { - fn record_retry_metric(&mut self, metric: RetryProcessingMetric) { - self.events.push(metric); - } - } - - fn sample_retry_entry_with_target(target: PersistenceTarget) -> TraceRetryQueueEntry { - let record = build_trace_payload(TraceAdapterInput { - record_id: "990e8400-e29b-41d4-a716-446655440000".to_string(), - timestamp_rfc3339: "2026-03-04T12:13:14Z".to_string(), - commit_sha: "retrysha123".to_string(), - files: vec![FileAttributionInput { - path: "src/retry.rs".to_string(), - conversations: vec![ConversationInput { - url: "https://example.test/conversation/retry".to_string(), - related: vec![], - ranges: vec![RangeInput { - start_line: 4, - end_line: 6, - contributor: ContributorInput { - kind: ContributorType::Ai, - model_id: Some("openai/gpt-5.3-codex".to_string()), - }, - }], - }], - }], - 
quality_status: QualityStatus::Final, - rewrite: None, - idempotency_key: Some("retry:key:retrysha123".to_string()), - }); - - TraceRetryQueueEntry { - commit_sha: "retrysha123".to_string(), - failed_targets: vec![target], - content_type: "application/vnd.agent-trace.record+json".to_string(), - notes_ref: "refs/notes/agent-trace".to_string(), - record, - } - } - - fn sample_post_commit_runtime() -> PostCommitRuntimeState { - PostCommitRuntimeState { - sce_disabled: false, - cli_available: true, - is_bare_repo: false, - } - } - - fn sample_post_rewrite_runtime() -> PostRewriteRuntimeState { - PostRewriteRuntimeState { - sce_disabled: false, - cli_available: true, - is_bare_repo: false, - } - } - - fn sample_rewrite_trace_input() -> RewriteTraceInput { - RewriteTraceInput { - record_id: "660e8400-e29b-41d4-a716-446655440000".to_string(), - timestamp_rfc3339: "2026-03-04T11:12:13Z".to_string(), - rewritten_commit_sha: "newsha123".to_string(), - rewrite_from_sha: "oldsha456".to_string(), - rewrite_method: RewriteMethod::Rebase, - rewrite_confidence: 0.91, - idempotency_key: "post-rewrite:rebase:oldsha456:newsha123".to_string(), - files: vec![FileAttributionInput { - path: "src/lib.rs".to_string(), - conversations: vec![ConversationInput { - url: "https://example.test/conversation/rewritten".to_string(), - related: vec![], - ranges: vec![RangeInput { - start_line: 3, - end_line: 7, - contributor: ContributorInput { - kind: ContributorType::Ai, - model_id: Some("openai/gpt-5.3-codex".to_string()), - }, - }], - }], - }], - } - } - - fn sample_post_commit_input() -> PostCommitInput { - PostCommitInput { - record_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), - timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), - commit_sha: "abc123def456".to_string(), - parent_sha: Some("def789ghi000".to_string()), - idempotency_key: "repo:abc123def456".to_string(), - files: vec![FileAttributionInput { - path: "src/lib.rs".to_string(), - conversations: vec![ConversationInput { 
- url: "https://example.test/conversation/1".to_string(), - related: vec![], - ranges: vec![RangeInput { - start_line: 1, - end_line: 5, - contributor: ContributorInput { - kind: ContributorType::Ai, - model_id: Some("openai/gpt-5.3-codex".to_string()), - }, - }], - }], - }], - } - } - - #[test] - fn post_commit_finalization_noops_when_already_finalized() -> Result<()> { - let runtime = sample_post_commit_runtime(); - let input = sample_post_commit_input(); - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); - let mut queue = FakeRetryQueue::default(); - let mut ledger = FakeEmissionLedger { - emitted: vec![input.commit_sha.clone()], - }; - - let outcome = finalize_post_commit_trace( - &runtime, - input, - &mut notes, - &mut store, - &mut queue, - &mut ledger, - )?; - - assert_eq!( - outcome, - PostCommitFinalization::NoOp(PostCommitNoOpReason::AlreadyFinalized) - ); - assert!(notes.writes.is_empty()); - assert!(queue.entries.is_empty()); - Ok(()) - } - - #[test] - fn post_commit_finalization_dual_writes_with_parent_metadata_and_mime() -> Result<()> { - let runtime = sample_post_commit_runtime(); - let input = sample_post_commit_input(); - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); - let mut queue = FakeRetryQueue::default(); - let mut ledger = FakeEmissionLedger::default(); - - let outcome = finalize_post_commit_trace( - &runtime, - input.clone(), - &mut notes, - &mut store, - &mut queue, - &mut ledger, - )?; - - let persisted = match outcome { - PostCommitFinalization::Persisted(persisted) => persisted, - _ => panic!("expected persisted post-commit outcome"), - }; - assert_eq!(persisted.commit_sha, input.commit_sha); - assert_eq!(persisted.trace_id, "550e8400-e29b-41d4-a716-446655440000"); - - assert_eq!(notes.writes.len(), 1); - assert_eq!( - 
notes.writes[0].content_type, - "application/vnd.agent-trace.record+json" - ); - assert_eq!(notes.writes[0].notes_ref, "refs/notes/agent-trace"); - assert_eq!( - notes.writes[0] - .record - .metadata - .get(POST_COMMIT_PARENT_SHA_METADATA_KEY), - Some(&"def789ghi000".to_string()) - ); - assert!(ledger.has_emitted("abc123def456")); - Ok(()) - } - - #[test] - fn post_commit_finalization_queues_when_db_write_is_transient_failure() -> Result<()> { - let runtime = sample_post_commit_runtime(); - let input = sample_post_commit_input(); - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Failed(PersistenceFailure { - class: PersistenceErrorClass::Transient, - message: "database unavailable".to_string(), - })); - let mut queue = FakeRetryQueue::default(); - let mut ledger = FakeEmissionLedger::default(); - - let outcome = finalize_post_commit_trace( - &runtime, - input, - &mut notes, - &mut store, - &mut queue, - &mut ledger, - )?; - - assert_eq!( - outcome, - PostCommitFinalization::QueuedFallback(super::PostCommitQueuedFallback { - commit_sha: "abc123def456".to_string(), - failed_targets: vec![PersistenceTarget::Database], - trace_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), - }) - ); - assert_eq!(queue.entries.len(), 1); - assert_eq!( - queue.entries[0].failed_targets, - vec![PersistenceTarget::Database] - ); - assert!(!ledger.has_emitted("abc123def456")); - Ok(()) - } - - #[test] - fn retry_processor_recovers_failed_notes_write_and_emits_success_metric() -> Result<()> { - let mut queue = FakeRetryQueue { - entries: vec![sample_retry_entry_with_target(PersistenceTarget::Notes)], - }; - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); - let mut metrics = FakeRetryMetricsSink::default(); - - let summary = - process_trace_retry_queue(&mut queue, &mut notes, &mut store, &mut metrics, 
4)?; - - assert_eq!(summary.attempted, 1); - assert_eq!(summary.recovered, 1); - assert_eq!(summary.requeued, 0); - assert!(queue.entries.is_empty()); - assert_eq!(metrics.events.len(), 1); - assert_eq!(metrics.events[0].error_class, None); - assert!(metrics.events[0].failed_targets.is_empty()); - Ok(()) - } - - #[test] - fn retry_processor_requeues_when_db_write_still_fails() -> Result<()> { - let mut queue = FakeRetryQueue { - entries: vec![sample_retry_entry_with_target(PersistenceTarget::Database)], - }; - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Failed(PersistenceFailure { - class: PersistenceErrorClass::Permanent, - message: "database still unavailable".to_string(), - })); - let mut metrics = FakeRetryMetricsSink::default(); - - let summary = - process_trace_retry_queue(&mut queue, &mut notes, &mut store, &mut metrics, 4)?; - - assert_eq!(summary.attempted, 1); - assert_eq!(summary.recovered, 0); - assert_eq!(summary.requeued, 1); - assert_eq!(queue.entries.len(), 1); - assert_eq!( - queue.entries[0].failed_targets, - vec![PersistenceTarget::Database] - ); - assert_eq!(metrics.events.len(), 1); - assert_eq!( - metrics.events[0].error_class, - Some(PersistenceErrorClass::Permanent) - ); - Ok(()) - } - - #[test] - fn post_rewrite_finalization_noops_when_sce_disabled() -> Result<()> { - let mut runtime = sample_post_rewrite_runtime(); - runtime.sce_disabled = true; - let mut ingestion = FakeRewriteRemapIngestion::default(); - - let outcome = - finalize_post_rewrite_remap(&runtime, "amend", "old1 new1\n", &mut ingestion)?; - - assert_eq!( - outcome, - PostRewriteFinalization::NoOp(PostRewriteNoOpReason::Disabled) - ); - assert!(ingestion.seen_requests.is_empty()); - Ok(()) - } - - #[test] - fn post_rewrite_finalization_parses_amend_pairs_and_derives_idempotency() -> Result<()> { - let runtime = sample_post_rewrite_runtime(); - let mut ingestion = 
FakeRewriteRemapIngestion::default(); - - let outcome = finalize_post_rewrite_remap( - &runtime, - "amend", - "oldsha1 newsha1\noldsha2 newsha2\n", - &mut ingestion, - )?; - - assert_eq!( - outcome, - PostRewriteFinalization::Ingested(super::PostRewriteIngested { - rewrite_method: RewriteMethod::Amend, - total_pairs: 2, - ingested_pairs: 2, - skipped_pairs: 0, - }) - ); - assert_eq!(ingestion.seen_requests.len(), 2); - assert_eq!( - ingestion.seen_requests[0].idempotency_key, - "post-rewrite:amend:oldsha1:newsha1" - ); - assert_eq!( - ingestion.seen_requests[1].idempotency_key, - "post-rewrite:amend:oldsha2:newsha2" - ); - Ok(()) - } - - #[test] - fn post_rewrite_finalization_skips_duplicate_pairs_with_rebase_method() -> Result<()> { - let runtime = sample_post_rewrite_runtime(); - let mut ingestion = FakeRewriteRemapIngestion::default(); - - let outcome = finalize_post_rewrite_remap( - &runtime, - "rebase", - "oldsha1 newsha1\noldsha1 newsha1\n", - &mut ingestion, - )?; - - assert_eq!( - outcome, - PostRewriteFinalization::Ingested(super::PostRewriteIngested { - rewrite_method: RewriteMethod::Rebase, - total_pairs: 2, - ingested_pairs: 1, - skipped_pairs: 1, - }) - ); - assert_eq!(ingestion.seen_requests.len(), 2); - assert_eq!(ingestion.duplicate_keys.len(), 1); - assert_eq!( - ingestion.duplicate_keys[0], - "post-rewrite:rebase:oldsha1:newsha1" - ); - Ok(()) - } - - #[test] - fn post_rewrite_finalization_rejects_invalid_pair_line_format() { - let runtime = sample_post_rewrite_runtime(); - let mut ingestion = FakeRewriteRemapIngestion::default(); - - let error = - finalize_post_rewrite_remap(&runtime, "amend", "missing_new_sha\n", &mut ingestion) - .expect_err("invalid pair format should return error"); - - assert!(error.to_string().contains("expected ' '")); - assert!(ingestion.seen_requests.is_empty()); - } - - #[test] - fn rewrite_trace_finalization_persists_metadata_and_notes_db_parity() -> Result<()> { - let runtime = sample_post_rewrite_runtime(); - let 
input = sample_rewrite_trace_input(); - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); - let mut queue = FakeRetryQueue::default(); - let mut ledger = FakeEmissionLedger::default(); - - let outcome = finalize_rewrite_trace( - &runtime, - input, - &mut notes, - &mut store, - &mut queue, - &mut ledger, - )?; - - let persisted = match outcome { - RewriteTraceFinalization::Persisted(persisted) => persisted, - _ => panic!("expected persisted rewrite trace outcome"), - }; - - assert_eq!(persisted.commit_sha, "newsha123"); - assert_eq!(persisted.trace_id, "660e8400-e29b-41d4-a716-446655440000"); - assert_eq!(persisted.quality_status, super::QualityStatus::Final); - assert_eq!(notes.writes.len(), 1); - assert_eq!(notes.writes[0].record.vcs.revision, "newsha123"); - assert_eq!( - notes.writes[0].record.metadata.get(METADATA_REWRITE_FROM), - Some(&"oldsha456".to_string()) - ); - assert_eq!( - notes.writes[0].record.metadata.get(METADATA_REWRITE_METHOD), - Some(&"rebase".to_string()) - ); - assert_eq!( - notes.writes[0] - .record - .metadata - .get(METADATA_REWRITE_CONFIDENCE), - Some(&"0.91".to_string()) - ); - assert_eq!( - notes.writes[0].record.metadata.get(METADATA_QUALITY_STATUS), - Some(&"final".to_string()) - ); - assert!(queue.entries.is_empty()); - assert!(ledger.has_emitted("newsha123")); - Ok(()) - } - - #[test] - fn rewrite_trace_finalization_applies_quality_thresholds() -> Result<()> { - let runtime = sample_post_rewrite_runtime(); - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); - let mut queue = FakeRetryQueue::default(); - let mut ledger = FakeEmissionLedger::default(); - - let mut medium = sample_rewrite_trace_input(); - medium.record_id = "760e8400-e29b-41d4-a716-446655440000".to_string(); - medium.rewritten_commit_sha = "newsha-medium".to_string(); - 
medium.rewrite_confidence = 0.75; - let medium_outcome = finalize_rewrite_trace( - &runtime, - medium, - &mut notes, - &mut store, - &mut queue, - &mut ledger, - )?; - assert!(matches!( - medium_outcome, - RewriteTraceFinalization::Persisted(super::RewriteTracePersisted { - quality_status: super::QualityStatus::Partial, - .. - }) - )); - - let mut low = sample_rewrite_trace_input(); - low.record_id = "860e8400-e29b-41d4-a716-446655440000".to_string(); - low.rewritten_commit_sha = "newsha-low".to_string(); - low.rewrite_confidence = 0.40; - let low_outcome = finalize_rewrite_trace( - &runtime, - low, - &mut notes, - &mut store, - &mut queue, - &mut ledger, - )?; - assert!(matches!( - low_outcome, - RewriteTraceFinalization::Persisted(super::RewriteTracePersisted { - quality_status: super::QualityStatus::NeedsReview, - .. - }) - )); - - Ok(()) - } - - #[test] - fn rewrite_trace_finalization_rejects_confidence_outside_zero_to_one() { - let runtime = sample_post_rewrite_runtime(); - let mut input = sample_rewrite_trace_input(); - input.rewrite_confidence = 1.2; - - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); - let mut queue = FakeRetryQueue::default(); - let mut ledger = FakeEmissionLedger::default(); - - let error = finalize_rewrite_trace( - &runtime, - input, - &mut notes, - &mut store, - &mut queue, - &mut ledger, - ) - .expect_err("out-of-range confidence must fail"); - - assert!(error - .to_string() - .contains("rewrite confidence must be within [0.0, 1.0]")); - assert!(notes.writes.is_empty()); - assert!(queue.entries.is_empty()); - } - - #[test] - fn rewrite_trace_finalization_noops_when_commit_already_finalized() -> Result<()> { - let runtime = sample_post_rewrite_runtime(); - let input = sample_rewrite_trace_input(); - let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); - let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); - 
let mut queue = FakeRetryQueue::default(); - let mut ledger = FakeEmissionLedger { - emitted: vec!["newsha123".to_string()], - }; - - let outcome = finalize_rewrite_trace( - &runtime, - input, - &mut notes, - &mut store, - &mut queue, - &mut ledger, - )?; - - assert_eq!( - outcome, - RewriteTraceFinalization::NoOp(RewriteTraceNoOpReason::AlreadyFinalized) - ); - assert!(notes.writes.is_empty()); - assert!(queue.entries.is_empty()); - Ok(()) - } - - #[test] - fn pre_commit_finalization_noops_when_sce_disabled() { - let mut runtime = sample_runtime(); - runtime.sce_disabled = true; - - let outcome = - finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); - assert_eq!( - outcome, - PreCommitFinalization::NoOp(PreCommitNoOpReason::Disabled) - ); - } - - #[test] - fn pre_commit_finalization_noops_when_cli_unavailable() { - let mut runtime = sample_runtime(); - runtime.cli_available = false; - - let outcome = - finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); - assert_eq!( - outcome, - PreCommitFinalization::NoOp(PreCommitNoOpReason::CliUnavailable) - ); - } - - #[test] - fn pre_commit_finalization_noops_for_bare_repo() { - let mut runtime = sample_runtime(); - runtime.is_bare_repo = true; - - let outcome = - finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); - assert_eq!( - outcome, - PreCommitFinalization::NoOp(PreCommitNoOpReason::BareRepository) - ); - } - - #[test] - fn pre_commit_finalization_uses_only_staged_ranges_and_captures_anchors() { - let pending = PendingCheckpoint { - files: vec![ - PendingFileCheckpoint { - path: "src/keep.rs".to_string(), - staged_ranges: vec![PendingLineRange { - start_line: 10, - end_line: 20, - }], - unstaged_ranges: vec![PendingLineRange { - start_line: 21, - end_line: 30, - }], - }, - PendingFileCheckpoint { - path: "src/drop.rs".to_string(), - staged_ranges: vec![], - unstaged_ranges: vec![PendingLineRange { - 
start_line: 1, - end_line: 2, - }], - }, - ], - }; - let anchors = sample_anchors(); - - let outcome = finalize_pre_commit_checkpoint(&sample_runtime(), anchors.clone(), pending); - - let finalized = match outcome { - PreCommitFinalization::Finalized(finalized) => finalized, - _ => panic!("expected finalized checkpoint"), - }; - - assert_eq!(finalized.anchors, anchors); - assert_eq!(finalized.files.len(), 1); - assert_eq!(finalized.files[0].path, "src/keep.rs"); - assert_eq!(finalized.files[0].ranges.len(), 1); - assert_eq!( - finalized.files[0].ranges[0], - PendingLineRange { - start_line: 10, - end_line: 20 - } - ); - } - - fn sample_commit_msg_runtime() -> CommitMsgRuntimeState { - CommitMsgRuntimeState { - sce_disabled: false, - sce_coauthor_enabled: true, - has_staged_sce_attribution: true, - } - } - - #[test] - fn commit_msg_policy_noops_when_sce_disabled() { - let mut runtime = sample_commit_msg_runtime(); - runtime.sce_disabled = true; - - let message = "feat: add attribution"; - let output = apply_commit_msg_coauthor_policy(&runtime, message); - assert_eq!(output, message); - } - - #[test] - fn commit_msg_policy_noops_when_coauthor_disabled() { - let mut runtime = sample_commit_msg_runtime(); - runtime.sce_coauthor_enabled = false; - - let message = "feat: add attribution"; - let output = apply_commit_msg_coauthor_policy(&runtime, message); - assert_eq!(output, message); - } - - #[test] - fn commit_msg_policy_noops_without_staged_sce_attribution() { - let mut runtime = sample_commit_msg_runtime(); - runtime.has_staged_sce_attribution = false; - - let message = "feat: add attribution"; - let output = apply_commit_msg_coauthor_policy(&runtime, message); - assert_eq!(output, message); - } - - #[test] - fn commit_msg_policy_appends_canonical_trailer_once_when_allowed() { - let message = "feat: add attribution"; - let output = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), message); - - assert_eq!( - output, - format!( - "feat: add 
attribution\n\n{}", - CANONICAL_SCE_COAUTHOR_TRAILER - ) - ); - } - - #[test] - fn commit_msg_policy_dedupes_existing_canonical_trailers() { - let message = format!( - "feat: add attribution\n\n{}\n{}\n", - CANONICAL_SCE_COAUTHOR_TRAILER, CANONICAL_SCE_COAUTHOR_TRAILER - ); - - let output = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), &message); - - assert_eq!( - output, - format!( - "feat: add attribution\n\n{}\n", - CANONICAL_SCE_COAUTHOR_TRAILER - ) - ); - } - - #[test] - fn commit_msg_policy_is_idempotent() { - let first = - apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), "feat: add attribution"); - let second = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), &first); - - assert_eq!(first, second); - } - - #[test] - fn hooks_placeholder_event_model_reserves_generated_region_tracking() { - let service = PlaceholderHookService; - let model = service.event_model(); - assert!(model.generated_region_tracking); - assert_eq!(model.supported_hooks.len(), 3); - } - - #[test] - fn hooks_placeholder_message_mentions_event_model() -> Result<()> { - let message = run_placeholder_hooks()?; - assert!(message.contains("Hook event model reserves")); - Ok(()) - } - - #[test] - fn hooks_placeholder_accepts_generated_region_events() -> Result<()> { - let service = PlaceholderHookService; - let event = HookEvent { - hook: GitHookKind::PreCommit, - region_event: Some(GeneratedRegionEvent { - file_path: "context/plans/example.md".to_string(), - marker_id: "generated:example".to_string(), - lifecycle: GeneratedRegionLifecycle::Updated, - }), - }; - - service.record(event) - } -} +mod tests; diff --git a/cli/src/services/hooks/tests.rs b/cli/src/services/hooks/tests.rs new file mode 100644 index 00000000..6e631115 --- /dev/null +++ b/cli/src/services/hooks/tests.rs @@ -0,0 +1,866 @@ +use anyhow::Result; + +use crate::services::agent_trace::{ + build_trace_payload, ContributorInput, ContributorType, ConversationInput, + 
FileAttributionInput, QualityStatus, RangeInput, TraceAdapterInput, METADATA_QUALITY_STATUS, + METADATA_REWRITE_CONFIDENCE, METADATA_REWRITE_FROM, METADATA_REWRITE_METHOD, +}; + +use super::{ + apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, + finalize_pre_commit_checkpoint, finalize_rewrite_trace, process_trace_retry_queue, + run_placeholder_hooks, CommitMsgRuntimeState, GeneratedRegionEvent, GeneratedRegionLifecycle, + GitHookKind, HookEvent, HookService, PendingCheckpoint, PendingFileCheckpoint, + PendingLineRange, PersistenceErrorClass, PersistenceFailure, PersistenceTarget, + PersistenceWriteResult, PlaceholderHookService, PostCommitFinalization, PostCommitInput, + PostCommitNoOpReason, PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, + PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, + PreCommitTreeAnchors, RetryMetricsSink, RetryProcessingMetric, RewriteMethod, + RewriteRemapIngestion, RewriteRemapRequest, RewriteTraceFinalization, RewriteTraceInput, + RewriteTraceNoOpReason, TraceEmissionLedger, TraceNote, TraceNotesWriter, TraceRecordStore, + TraceRetryQueue, TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, + POST_COMMIT_PARENT_SHA_METADATA_KEY, +}; + +fn sample_pending_checkpoint() -> PendingCheckpoint { + PendingCheckpoint { + files: vec![PendingFileCheckpoint { + path: "src/lib.rs".to_string(), + staged_ranges: vec![PendingLineRange { + start_line: 1, + end_line: 3, + }], + unstaged_ranges: vec![PendingLineRange { + start_line: 4, + end_line: 6, + }], + }], + } +} + +fn sample_runtime() -> PreCommitRuntimeState { + PreCommitRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + } +} + +fn sample_anchors() -> PreCommitTreeAnchors { + PreCommitTreeAnchors { + index_tree: "index-tree-sha".to_string(), + head_tree: Some("head-tree-sha".to_string()), + } +} + +#[derive(Default)] +struct FakeEmissionLedger { + 
emitted: Vec<String>, +} + +impl TraceEmissionLedger for FakeEmissionLedger { + fn has_emitted(&self, commit_sha: &str) -> bool { + self.emitted.iter().any(|sha| sha == commit_sha) + } + + fn mark_emitted(&mut self, commit_sha: &str) { + self.emitted.push(commit_sha.to_string()); + } +} + +struct FakeNotesWriter { + result: PersistenceWriteResult, + writes: Vec<TraceNote>, +} + +impl FakeNotesWriter { + fn new(result: PersistenceWriteResult) -> Self { + Self { + result, + writes: Vec::new(), + } + } +} + +impl TraceNotesWriter for FakeNotesWriter { + fn write_note(&mut self, note: TraceNote) -> PersistenceWriteResult { + self.writes.push(note); + self.result.clone() + } +} + +struct FakeRecordStore { + result: PersistenceWriteResult, +} + +impl FakeRecordStore { + fn new(result: PersistenceWriteResult) -> Self { + Self { result } + } +} + +impl TraceRecordStore for FakeRecordStore { + fn write_trace_record( + &mut self, + _record: super::PersistedTraceRecord, + ) -> PersistenceWriteResult { + self.result.clone() + } +} + +#[derive(Default)] +struct FakeRetryQueue { + entries: Vec<TraceRetryQueueEntry>, +} + +#[derive(Default)] +struct FakeRetryMetricsSink { + events: Vec<RetryProcessingMetric>, +} + +#[derive(Default)] +struct FakeRewriteRemapIngestion { + seen_requests: Vec<RewriteRemapRequest>, + duplicate_keys: Vec<String>, + seen_keys: std::collections::BTreeSet<String>, +} + +impl RewriteRemapIngestion for FakeRewriteRemapIngestion { + fn ingest(&mut self, request: RewriteRemapRequest) -> Result<bool> { + let accepted = self.seen_keys.insert(request.idempotency_key.clone()); + if !accepted { + self.duplicate_keys.push(request.idempotency_key.clone()); + } + self.seen_requests.push(request); + Ok(accepted) + } +} + +impl TraceRetryQueue for FakeRetryQueue { + fn enqueue(&mut self, entry: TraceRetryQueueEntry) -> Result<()> { + self.entries.push(entry); + Ok(()) + } + + fn dequeue_next(&mut self) -> Result<Option<TraceRetryQueueEntry>> { + if self.entries.is_empty() { + return Ok(None); + } + + Ok(Some(self.entries.remove(0))) + } +} + +impl RetryMetricsSink for FakeRetryMetricsSink { + fn 
record_retry_metric(&mut self, metric: RetryProcessingMetric) { + self.events.push(metric); + } +} + +fn sample_retry_entry_with_target(target: PersistenceTarget) -> TraceRetryQueueEntry { + let record = build_trace_payload(TraceAdapterInput { + record_id: "990e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "2026-03-04T12:13:14Z".to_string(), + commit_sha: "retrysha123".to_string(), + files: vec![FileAttributionInput { + path: "src/retry.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/retry".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 4, + end_line: 6, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + quality_status: QualityStatus::Final, + rewrite: None, + idempotency_key: Some("retry:key:retrysha123".to_string()), + }); + + TraceRetryQueueEntry { + commit_sha: "retrysha123".to_string(), + failed_targets: vec![target], + content_type: "application/vnd.agent-trace.record+json".to_string(), + notes_ref: "refs/notes/agent-trace".to_string(), + record, + } +} + +fn sample_post_commit_runtime() -> PostCommitRuntimeState { + PostCommitRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + } +} + +fn sample_post_rewrite_runtime() -> PostRewriteRuntimeState { + PostRewriteRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + } +} + +fn sample_rewrite_trace_input() -> RewriteTraceInput { + RewriteTraceInput { + record_id: "660e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "2026-03-04T11:12:13Z".to_string(), + rewritten_commit_sha: "newsha123".to_string(), + rewrite_from_sha: "oldsha456".to_string(), + rewrite_method: RewriteMethod::Rebase, + rewrite_confidence: 0.91, + idempotency_key: "post-rewrite:rebase:oldsha456:newsha123".to_string(), + files: vec![FileAttributionInput { + path: 
"src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/rewritten".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 3, + end_line: 7, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + } +} + +fn sample_post_commit_input() -> PostCommitInput { + PostCommitInput { + record_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), + timestamp_rfc3339: "2026-03-04T10:11:12Z".to_string(), + commit_sha: "abc123def456".to_string(), + parent_sha: Some("def789ghi000".to_string()), + idempotency_key: "repo:abc123def456".to_string(), + files: vec![FileAttributionInput { + path: "src/lib.rs".to_string(), + conversations: vec![ConversationInput { + url: "https://example.test/conversation/1".to_string(), + related: vec![], + ranges: vec![RangeInput { + start_line: 1, + end_line: 5, + contributor: ContributorInput { + kind: ContributorType::Ai, + model_id: Some("openai/gpt-5.3-codex".to_string()), + }, + }], + }], + }], + } +} + +#[test] +fn post_commit_finalization_noops_when_already_finalized() -> Result<()> { + let runtime = sample_post_commit_runtime(); + let input = sample_post_commit_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger { + emitted: vec![input.commit_sha.clone()], + }; + + let outcome = finalize_post_commit_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + assert_eq!( + outcome, + PostCommitFinalization::NoOp(PostCommitNoOpReason::AlreadyFinalized) + ); + assert!(notes.writes.is_empty()); + assert!(queue.entries.is_empty()); + Ok(()) +} + +#[test] +fn post_commit_finalization_dual_writes_with_parent_metadata_and_mime() -> Result<()> { + let runtime = 
sample_post_commit_runtime(); + let input = sample_post_commit_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let outcome = finalize_post_commit_trace( + &runtime, + input.clone(), + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + let persisted = match outcome { + PostCommitFinalization::Persisted(persisted) => persisted, + _ => panic!("expected persisted post-commit outcome"), + }; + assert_eq!(persisted.commit_sha, input.commit_sha); + assert_eq!(persisted.trace_id, "550e8400-e29b-41d4-a716-446655440000"); + + assert_eq!(notes.writes.len(), 1); + assert_eq!( + notes.writes[0].content_type, + "application/vnd.agent-trace.record+json" + ); + assert_eq!(notes.writes[0].notes_ref, "refs/notes/agent-trace"); + assert_eq!( + notes.writes[0] + .record + .metadata + .get(POST_COMMIT_PARENT_SHA_METADATA_KEY), + Some(&"def789ghi000".to_string()) + ); + assert!(ledger.has_emitted("abc123def456")); + Ok(()) +} + +#[test] +fn post_commit_finalization_queues_when_db_write_is_transient_failure() -> Result<()> { + let runtime = sample_post_commit_runtime(); + let input = sample_post_commit_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Failed(PersistenceFailure { + class: PersistenceErrorClass::Transient, + message: "database unavailable".to_string(), + })); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let outcome = finalize_post_commit_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + assert_eq!( + outcome, + PostCommitFinalization::QueuedFallback(super::PostCommitQueuedFallback { + commit_sha: "abc123def456".to_string(), + failed_targets: 
vec![PersistenceTarget::Database], + trace_id: "550e8400-e29b-41d4-a716-446655440000".to_string(), + }) + ); + assert_eq!(queue.entries.len(), 1); + assert_eq!( + queue.entries[0].failed_targets, + vec![PersistenceTarget::Database] + ); + assert!(!ledger.has_emitted("abc123def456")); + Ok(()) +} + +#[test] +fn retry_processor_recovers_failed_notes_write_and_emits_success_metric() -> Result<()> { + let mut queue = FakeRetryQueue { + entries: vec![sample_retry_entry_with_target(PersistenceTarget::Notes)], + }; + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut metrics = FakeRetryMetricsSink::default(); + + let summary = process_trace_retry_queue(&mut queue, &mut notes, &mut store, &mut metrics, 4)?; + + assert_eq!(summary.attempted, 1); + assert_eq!(summary.recovered, 1); + assert_eq!(summary.requeued, 0); + assert!(queue.entries.is_empty()); + assert_eq!(metrics.events.len(), 1); + assert_eq!(metrics.events[0].error_class, None); + assert!(metrics.events[0].failed_targets.is_empty()); + Ok(()) +} + +#[test] +fn retry_processor_requeues_when_db_write_still_fails() -> Result<()> { + let mut queue = FakeRetryQueue { + entries: vec![sample_retry_entry_with_target(PersistenceTarget::Database)], + }; + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Failed(PersistenceFailure { + class: PersistenceErrorClass::Permanent, + message: "database still unavailable".to_string(), + })); + let mut metrics = FakeRetryMetricsSink::default(); + + let summary = process_trace_retry_queue(&mut queue, &mut notes, &mut store, &mut metrics, 4)?; + + assert_eq!(summary.attempted, 1); + assert_eq!(summary.recovered, 0); + assert_eq!(summary.requeued, 1); + assert_eq!(queue.entries.len(), 1); + assert_eq!( + queue.entries[0].failed_targets, + vec![PersistenceTarget::Database] + ); + 
assert_eq!(metrics.events.len(), 1); + assert_eq!( + metrics.events[0].error_class, + Some(PersistenceErrorClass::Permanent) + ); + Ok(()) +} + +#[test] +fn post_rewrite_finalization_noops_when_sce_disabled() -> Result<()> { + let mut runtime = sample_post_rewrite_runtime(); + runtime.sce_disabled = true; + let mut ingestion = FakeRewriteRemapIngestion::default(); + + let outcome = finalize_post_rewrite_remap(&runtime, "amend", "old1 new1\n", &mut ingestion)?; + + assert_eq!( + outcome, + PostRewriteFinalization::NoOp(PostRewriteNoOpReason::Disabled) + ); + assert!(ingestion.seen_requests.is_empty()); + Ok(()) +} + +#[test] +fn post_rewrite_finalization_parses_amend_pairs_and_derives_idempotency() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let mut ingestion = FakeRewriteRemapIngestion::default(); + + let outcome = finalize_post_rewrite_remap( + &runtime, + "amend", + "oldsha1 newsha1\noldsha2 newsha2\n", + &mut ingestion, + )?; + + assert_eq!( + outcome, + PostRewriteFinalization::Ingested(super::PostRewriteIngested { + rewrite_method: RewriteMethod::Amend, + total_pairs: 2, + ingested_pairs: 2, + skipped_pairs: 0, + }) + ); + assert_eq!(ingestion.seen_requests.len(), 2); + assert_eq!( + ingestion.seen_requests[0].idempotency_key, + "post-rewrite:amend:oldsha1:newsha1" + ); + assert_eq!( + ingestion.seen_requests[1].idempotency_key, + "post-rewrite:amend:oldsha2:newsha2" + ); + Ok(()) +} + +#[test] +fn post_rewrite_finalization_skips_duplicate_pairs_with_rebase_method() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let mut ingestion = FakeRewriteRemapIngestion::default(); + + let outcome = finalize_post_rewrite_remap( + &runtime, + "rebase", + "oldsha1 newsha1\noldsha1 newsha1\n", + &mut ingestion, + )?; + + assert_eq!( + outcome, + PostRewriteFinalization::Ingested(super::PostRewriteIngested { + rewrite_method: RewriteMethod::Rebase, + total_pairs: 2, + ingested_pairs: 1, + skipped_pairs: 1, + }) + ); + 
assert_eq!(ingestion.seen_requests.len(), 2); + assert_eq!(ingestion.duplicate_keys.len(), 1); + assert_eq!( + ingestion.duplicate_keys[0], + "post-rewrite:rebase:oldsha1:newsha1" + ); + Ok(()) +} + +#[test] +fn post_rewrite_finalization_rejects_invalid_pair_line_format() { + let runtime = sample_post_rewrite_runtime(); + let mut ingestion = FakeRewriteRemapIngestion::default(); + + let error = finalize_post_rewrite_remap(&runtime, "amend", "missing_new_sha\n", &mut ingestion) + .expect_err("invalid pair format should return error"); + + assert!(error.to_string().contains("expected ' '")); + assert!(ingestion.seen_requests.is_empty()); +} + +#[test] +fn rewrite_trace_finalization_persists_metadata_and_notes_db_parity() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let input = sample_rewrite_trace_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let outcome = finalize_rewrite_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + let persisted = match outcome { + RewriteTraceFinalization::Persisted(persisted) => persisted, + _ => panic!("expected persisted rewrite trace outcome"), + }; + + assert_eq!(persisted.commit_sha, "newsha123"); + assert_eq!(persisted.trace_id, "660e8400-e29b-41d4-a716-446655440000"); + assert_eq!(persisted.quality_status, super::QualityStatus::Final); + assert_eq!(notes.writes.len(), 1); + assert_eq!(notes.writes[0].record.vcs.revision, "newsha123"); + assert_eq!( + notes.writes[0].record.metadata.get(METADATA_REWRITE_FROM), + Some(&"oldsha456".to_string()) + ); + assert_eq!( + notes.writes[0].record.metadata.get(METADATA_REWRITE_METHOD), + Some(&"rebase".to_string()) + ); + assert_eq!( + notes.writes[0] + .record + .metadata + .get(METADATA_REWRITE_CONFIDENCE), + 
Some(&"0.91".to_string()) + ); + assert_eq!( + notes.writes[0].record.metadata.get(METADATA_QUALITY_STATUS), + Some(&"final".to_string()) + ); + assert!(queue.entries.is_empty()); + assert!(ledger.has_emitted("newsha123")); + Ok(()) +} + +#[test] +fn rewrite_trace_finalization_applies_quality_thresholds() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let mut medium = sample_rewrite_trace_input(); + medium.record_id = "760e8400-e29b-41d4-a716-446655440000".to_string(); + medium.rewritten_commit_sha = "newsha-medium".to_string(); + medium.rewrite_confidence = 0.75; + let medium_outcome = finalize_rewrite_trace( + &runtime, + medium, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + assert!(matches!( + medium_outcome, + RewriteTraceFinalization::Persisted(super::RewriteTracePersisted { + quality_status: super::QualityStatus::Partial, + .. + }) + )); + + let mut low = sample_rewrite_trace_input(); + low.record_id = "860e8400-e29b-41d4-a716-446655440000".to_string(); + low.rewritten_commit_sha = "newsha-low".to_string(); + low.rewrite_confidence = 0.40; + let low_outcome = finalize_rewrite_trace( + &runtime, + low, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + assert!(matches!( + low_outcome, + RewriteTraceFinalization::Persisted(super::RewriteTracePersisted { + quality_status: super::QualityStatus::NeedsReview, + .. 
+ }) + )); + + Ok(()) +} + +#[test] +fn rewrite_trace_finalization_rejects_confidence_outside_zero_to_one() { + let runtime = sample_post_rewrite_runtime(); + let mut input = sample_rewrite_trace_input(); + input.rewrite_confidence = 1.2; + + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger::default(); + + let error = finalize_rewrite_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + ) + .expect_err("out-of-range confidence must fail"); + + assert!(error + .to_string() + .contains("rewrite confidence must be within [0.0, 1.0]")); + assert!(notes.writes.is_empty()); + assert!(queue.entries.is_empty()); +} + +#[test] +fn rewrite_trace_finalization_noops_when_commit_already_finalized() -> Result<()> { + let runtime = sample_post_rewrite_runtime(); + let input = sample_rewrite_trace_input(); + let mut notes = FakeNotesWriter::new(PersistenceWriteResult::Written); + let mut store = FakeRecordStore::new(PersistenceWriteResult::Written); + let mut queue = FakeRetryQueue::default(); + let mut ledger = FakeEmissionLedger { + emitted: vec!["newsha123".to_string()], + }; + + let outcome = finalize_rewrite_trace( + &runtime, + input, + &mut notes, + &mut store, + &mut queue, + &mut ledger, + )?; + + assert_eq!( + outcome, + RewriteTraceFinalization::NoOp(RewriteTraceNoOpReason::AlreadyFinalized) + ); + assert!(notes.writes.is_empty()); + assert!(queue.entries.is_empty()); + Ok(()) +} + +#[test] +fn pre_commit_finalization_noops_when_sce_disabled() { + let mut runtime = sample_runtime(); + runtime.sce_disabled = true; + + let outcome = + finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); + assert_eq!( + outcome, + PreCommitFinalization::NoOp(PreCommitNoOpReason::Disabled) + ); +} + +#[test] +fn 
pre_commit_finalization_noops_when_cli_unavailable() { + let mut runtime = sample_runtime(); + runtime.cli_available = false; + + let outcome = + finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); + assert_eq!( + outcome, + PreCommitFinalization::NoOp(PreCommitNoOpReason::CliUnavailable) + ); +} + +#[test] +fn pre_commit_finalization_noops_for_bare_repo() { + let mut runtime = sample_runtime(); + runtime.is_bare_repo = true; + + let outcome = + finalize_pre_commit_checkpoint(&runtime, sample_anchors(), sample_pending_checkpoint()); + assert_eq!( + outcome, + PreCommitFinalization::NoOp(PreCommitNoOpReason::BareRepository) + ); +} + +#[test] +fn pre_commit_finalization_uses_only_staged_ranges_and_captures_anchors() { + let pending = PendingCheckpoint { + files: vec![ + PendingFileCheckpoint { + path: "src/keep.rs".to_string(), + staged_ranges: vec![PendingLineRange { + start_line: 10, + end_line: 20, + }], + unstaged_ranges: vec![PendingLineRange { + start_line: 21, + end_line: 30, + }], + }, + PendingFileCheckpoint { + path: "src/drop.rs".to_string(), + staged_ranges: vec![], + unstaged_ranges: vec![PendingLineRange { + start_line: 1, + end_line: 2, + }], + }, + ], + }; + let anchors = sample_anchors(); + + let outcome = finalize_pre_commit_checkpoint(&sample_runtime(), anchors.clone(), pending); + + let finalized = match outcome { + PreCommitFinalization::Finalized(finalized) => finalized, + _ => panic!("expected finalized checkpoint"), + }; + + assert_eq!(finalized.anchors, anchors); + assert_eq!(finalized.files.len(), 1); + assert_eq!(finalized.files[0].path, "src/keep.rs"); + assert_eq!(finalized.files[0].ranges.len(), 1); + assert_eq!( + finalized.files[0].ranges[0], + PendingLineRange { + start_line: 10, + end_line: 20 + } + ); +} + +fn sample_commit_msg_runtime() -> CommitMsgRuntimeState { + CommitMsgRuntimeState { + sce_disabled: false, + sce_coauthor_enabled: true, + has_staged_sce_attribution: true, + } +} + +#[test] 
+fn commit_msg_policy_noops_when_sce_disabled() { + let mut runtime = sample_commit_msg_runtime(); + runtime.sce_disabled = true; + + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&runtime, message); + assert_eq!(output, message); +} + +#[test] +fn commit_msg_policy_noops_when_coauthor_disabled() { + let mut runtime = sample_commit_msg_runtime(); + runtime.sce_coauthor_enabled = false; + + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&runtime, message); + assert_eq!(output, message); +} + +#[test] +fn commit_msg_policy_noops_without_staged_sce_attribution() { + let mut runtime = sample_commit_msg_runtime(); + runtime.has_staged_sce_attribution = false; + + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&runtime, message); + assert_eq!(output, message); +} + +#[test] +fn commit_msg_policy_appends_canonical_trailer_once_when_allowed() { + let message = "feat: add attribution"; + let output = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), message); + + assert_eq!( + output, + format!( + "feat: add attribution\n\n{}", + CANONICAL_SCE_COAUTHOR_TRAILER + ) + ); +} + +#[test] +fn commit_msg_policy_dedupes_existing_canonical_trailers() { + let message = format!( + "feat: add attribution\n\n{}\n{}\n", + CANONICAL_SCE_COAUTHOR_TRAILER, CANONICAL_SCE_COAUTHOR_TRAILER + ); + + let output = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), &message); + + assert_eq!( + output, + format!( + "feat: add attribution\n\n{}\n", + CANONICAL_SCE_COAUTHOR_TRAILER + ) + ); +} + +#[test] +fn commit_msg_policy_is_idempotent() { + let first = + apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), "feat: add attribution"); + let second = apply_commit_msg_coauthor_policy(&sample_commit_msg_runtime(), &first); + + assert_eq!(first, second); +} + +#[test] +fn hooks_placeholder_event_model_reserves_generated_region_tracking() { + let 
service = PlaceholderHookService; + let model = service.event_model(); + assert!(model.generated_region_tracking); + assert_eq!(model.supported_hooks.len(), 3); +} + +#[test] +fn hooks_placeholder_message_mentions_event_model() -> Result<()> { + let message = run_placeholder_hooks()?; + assert!(message.contains("Hook event model reserves")); + Ok(()) +} + +#[test] +fn hooks_placeholder_accepts_generated_region_events() -> Result<()> { + let service = PlaceholderHookService; + let event = HookEvent { + hook: GitHookKind::PreCommit, + region_event: Some(GeneratedRegionEvent { + file_path: "context/plans/example.md".to_string(), + marker_id: "generated:example".to_string(), + lifecycle: GeneratedRegionLifecycle::Updated, + }), + }; + + service.record(event) +} diff --git a/cli/src/services/setup.rs b/cli/src/services/setup.rs index 3b1d9085..0de189bc 100644 --- a/cli/src/services/setup.rs +++ b/cli/src/services/setup.rs @@ -912,648 +912,4 @@ pub fn resolve_setup_mode(options: SetupCliOptions) -> Result<SetupMode> { } #[cfg(test)] -mod tests { - use std::{ - cell::Cell, - fs, io, - path::{Path, PathBuf}, - process::Command, - }; - - use crate::test_support::TestTempDir; - use anyhow::Result; - - use super::{ - get_required_hook_asset, install_embedded_setup_assets, - install_embedded_setup_assets_with_rename, install_required_git_hooks, - install_required_git_hooks_with_rename, iter_embedded_assets_for_setup_target, - iter_required_hook_assets, parse_setup_cli_options, resolve_setup_dispatch, - resolve_setup_hooks_repository, resolve_setup_mode, run_setup_for_mode, run_setup_hooks, - setup_usage_text, RequiredHookAsset, RequiredHookInstallStatus, SetupCliOptions, - SetupDispatch, SetupMode, SetupTarget, - }; - - #[derive(Clone, Copy, Debug)] - struct MockPrompter { - response: SetupDispatch, - } - - impl super::SetupTargetPrompter for MockPrompter { - fn prompt_target(&self) -> Result<SetupDispatch> { - Ok(self.response) - } - } - - #[test] - fn run_setup_rejects_unresolved_interactive_mode() {
- let temp = TestTempDir::new("sce-setup-install-tests").expect("temp dir should be created"); - let error = run_setup_for_mode(temp.path(), SetupMode::Interactive) - .expect_err("interactive mode should be resolved before install"); - assert_eq!( - error.to_string(), - "Interactive setup mode must be resolved before installation" - ); - } - - #[test] - fn setup_options_default_to_interactive_mode() -> Result<()> { - let options = parse_setup_cli_options(Vec::<String>::new())?; - let mode = resolve_setup_mode(options)?; - assert_eq!(mode, SetupMode::Interactive); - Ok(()) - } - - #[test] - fn setup_options_parse_opencode_flag() -> Result<()> { - let options = parse_setup_cli_options(vec!["--opencode".to_string()])?; - let mode = resolve_setup_mode(options)?; - assert_eq!(mode, SetupMode::NonInteractive(SetupTarget::OpenCode)); - Ok(()) - } - - #[test] - fn setup_options_reject_mutually_exclusive_flags() { - let error = resolve_setup_mode(SetupCliOptions { - help: false, - opencode: true, - claude: true, - both: false, - hooks: false, - repo_path: None, - }) - .expect_err("multiple target flags should fail"); - - assert_eq!( - error.to_string(), - "Options '--opencode', '--claude', and '--both' are mutually exclusive. Choose exactly one target flag or none for interactive mode."
- ); - } - - #[test] - fn setup_usage_contract_mentions_target_flags() { - let usage = setup_usage_text(); - assert!(usage.contains("--opencode|--claude|--both")); - assert!(usage.contains("sce setup --hooks [--repo ]")); - } - - #[test] - fn setup_options_parse_hooks_without_repo() -> Result<()> { - let options = parse_setup_cli_options(vec!["--hooks".to_string()])?; - let repo = resolve_setup_hooks_repository(&options)?; - assert_eq!(repo, None); - Ok(()) - } - - #[test] - fn setup_options_parse_hooks_with_repo() -> Result<()> { - let options = parse_setup_cli_options(vec![ - "--hooks".to_string(), - "--repo".to_string(), - "tmp/repo".to_string(), - ])?; - let repo = resolve_setup_hooks_repository(&options)?; - assert_eq!(repo, Some(PathBuf::from("tmp/repo"))); - Ok(()) - } - - #[test] - fn setup_options_reject_repo_without_hooks() { - let options = parse_setup_cli_options(vec!["--repo".to_string(), "tmp/repo".to_string()]) - .expect("parsing --repo should succeed before validation"); - let error = resolve_setup_hooks_repository(&options) - .expect_err("--repo without --hooks should fail"); - assert_eq!( - error.to_string(), - "Option '--repo' requires '--hooks'. Run 'sce setup --help' to see valid usage." - ); - } - - #[test] - fn setup_options_reject_hooks_with_target_flags() { - let options = - parse_setup_cli_options(vec!["--hooks".to_string(), "--opencode".to_string()]) - .expect("parsing should succeed before validation"); - let error = resolve_setup_hooks_repository(&options) - .expect_err("--hooks with target flags should fail"); - assert_eq!( - error.to_string(), - "Option '--hooks' cannot be combined with '--opencode', '--claude', or '--both'. Run 'sce setup --help' to see valid usage." 
- ); - } - - #[test] - fn run_setup_hooks_reports_per_hook_statuses() -> Result<()> { - let temp = TestTempDir::new("sce-setup-hook-install-tests")?; - init_git_repo(temp.path())?; - - let message = run_setup_hooks(temp.path())?; - assert!(message.contains("Hook setup completed successfully.")); - assert!(message.contains("Repository root:")); - assert!(message.contains("Hooks directory:")); - assert!(message.contains("commit-msg: installed")); - assert!(message.contains("post-commit: installed")); - assert!(message.contains("pre-commit: installed")); - assert!(message.contains("backup: not needed")); - - Ok(()) - } - - #[test] - fn setup_help_option_sets_help_flag() -> Result<()> { - let options = parse_setup_cli_options(vec!["--help".to_string()])?; - assert!(options.help); - Ok(()) - } - - #[test] - fn run_setup_reports_selected_target_and_backup_status() -> Result<()> { - let temp = TestTempDir::new("sce-setup-install-tests")?; - fs::create_dir_all(temp.path().join(".opencode/legacy"))?; - fs::write(temp.path().join(".opencode/legacy/config.txt"), b"legacy")?; - - let message = run_setup_for_mode( - temp.path(), - SetupMode::NonInteractive(SetupTarget::OpenCode), - )?; - assert!(message.contains("Setup completed successfully.")); - assert!(message.contains("Selected target(s): OpenCode")); - assert!(message.contains("OpenCode: installed")); - assert!(message.contains("backup: existing target moved to")); - assert!(message.contains(".opencode.backup")); - - Ok(()) - } - - #[test] - fn run_setup_reports_both_targets() -> Result<()> { - let temp = TestTempDir::new("sce-setup-install-tests")?; - let message = - run_setup_for_mode(temp.path(), SetupMode::NonInteractive(SetupTarget::Both))?; - assert!(message.contains("Selected target(s): OpenCode, Claude")); - assert!(message.contains("OpenCode: installed")); - assert!(message.contains("Claude: installed")); - assert!(message.contains("backup: not needed (no existing target)")); - Ok(()) - } - - #[test] - fn 
interactive_dispatch_maps_selected_target() -> Result<()> { - let dispatch = resolve_setup_dispatch( - SetupMode::Interactive, - &MockPrompter { - response: SetupDispatch::Proceed(SetupMode::NonInteractive(SetupTarget::Claude)), - }, - )?; - - assert_eq!( - dispatch, - SetupDispatch::Proceed(SetupMode::NonInteractive(SetupTarget::Claude)) - ); - Ok(()) - } - - #[test] - fn interactive_dispatch_returns_cancelled_without_side_effects() -> Result<()> { - let dispatch = resolve_setup_dispatch( - SetupMode::Interactive, - &MockPrompter { - response: SetupDispatch::Cancelled, - }, - )?; - - assert_eq!(dispatch, SetupDispatch::Cancelled); - Ok(()) - } - - #[test] - fn embedded_manifest_paths_are_sorted_and_normalized() { - for target in [SetupTarget::OpenCode, SetupTarget::Claude] { - let assets = assets_for_target(target); - - assert!(!assets.is_empty(), "embedded asset set should not be empty"); - - let paths: Vec<&str> = assets.iter().map(|asset| asset.relative_path).collect(); - assert_eq!(paths.len(), assets.len()); - - for asset in assets { - assert!(!asset.relative_path.is_empty()); - assert!(!asset.relative_path.starts_with('/')); - assert!(!asset.relative_path.contains('\\')); - assert!(!asset.relative_path.starts_with("config/")); - assert!( - !asset.bytes.is_empty(), - "embedded files should have content bytes" - ); - } - - let mut sorted = paths.clone(); - sorted.sort_unstable(); - assert_eq!( - paths, sorted, - "embedded paths should be deterministic and sorted" - ); - } - } - - #[test] - fn embedded_manifest_matches_runtime_config_tree() -> Result<()> { - let opencode_expected = - collect_runtime_relative_paths(runtime_target_root(SetupTarget::OpenCode))?; - let claude_expected = - collect_runtime_relative_paths(runtime_target_root(SetupTarget::Claude))?; - - let opencode_actual: Vec<String> = assets_for_target(SetupTarget::OpenCode) - .iter() - .map(|asset| asset.relative_path.to_string()) - .collect(); - let claude_actual: Vec<String> =
assets_for_target(SetupTarget::Claude) - .iter() - .map(|asset| asset.relative_path.to_string()) - .collect(); - - assert_eq!(opencode_actual, opencode_expected); - assert_eq!(claude_actual, claude_expected); - Ok(()) - } - - #[test] - fn embedded_setup_target_iterator_scopes_assets_per_target() { - let opencode_count = assets_for_target(SetupTarget::OpenCode).len(); - let claude_count = assets_for_target(SetupTarget::Claude).len(); - - let iter_opencode_count = - iter_embedded_assets_for_setup_target(SetupTarget::OpenCode).count(); - let iter_claude_count = iter_embedded_assets_for_setup_target(SetupTarget::Claude).count(); - let iter_both_count = iter_embedded_assets_for_setup_target(SetupTarget::Both).count(); - - assert_eq!(iter_opencode_count, opencode_count); - assert_eq!(iter_claude_count, claude_count); - assert_eq!(iter_both_count, opencode_count + claude_count); - } - - #[test] - fn embedded_hook_manifest_is_complete_sorted_and_normalized() { - let hooks: Vec<&super::EmbeddedAsset> = iter_required_hook_assets().collect(); - let paths: Vec<&str> = hooks.iter().map(|asset| asset.relative_path).collect(); - - assert_eq!(paths, vec!["commit-msg", "post-commit", "pre-commit"]); - - for hook in hooks { - assert!(!hook.relative_path.is_empty()); - assert!(!hook.relative_path.contains('/')); - assert!(!hook.relative_path.contains('\\')); - assert!(!hook.bytes.is_empty()); - assert!( - hook.bytes.starts_with(b"#!/bin/sh\n"), - "embedded hook should start with shell shebang" - ); - } - } - - #[test] - fn required_hook_lookup_resolves_each_canonical_hook() { - for hook in [ - RequiredHookAsset::PreCommit, - RequiredHookAsset::CommitMsg, - RequiredHookAsset::PostCommit, - ] { - let asset = get_required_hook_asset(hook).expect("required hook asset should exist"); - assert_eq!(asset.relative_path, hook_filename(hook)); - assert!(!asset.bytes.is_empty()); - } - } - - #[test] - fn install_engine_replaces_existing_target_with_backup() -> Result<()> { - let temp = 
TestTempDir::new("sce-setup-install-tests")?; - let existing_target = temp.path().join(".opencode"); - fs::create_dir_all(existing_target.join("legacy"))?; - fs::write(existing_target.join("legacy/config.txt"), b"legacy")?; - - let outcome = install_embedded_setup_assets(temp.path(), SetupTarget::OpenCode)?; - assert_eq!(outcome.target_results.len(), 1); - - let result = &outcome.target_results[0]; - assert_eq!(result.target, SetupTarget::OpenCode); - assert_eq!(result.destination_root, temp.path().join(".opencode")); - assert_eq!( - result.installed_file_count, - assets_for_target(SetupTarget::OpenCode).len() - ); - - let backup_root = result - .backup_root - .as_ref() - .expect("existing target should have backup path"); - assert!(backup_root.exists()); - assert!(backup_root.join("legacy/config.txt").exists()); - - let installed_paths = collect_runtime_relative_paths(result.destination_root.clone())?; - let expected_paths: Vec<String> = assets_for_target(SetupTarget::OpenCode) - .iter() - .map(|asset| asset.relative_path.to_string()) - .collect(); - assert_eq!(installed_paths, expected_paths); - Ok(()) - } - - #[test] - fn install_engine_installs_both_targets() -> Result<()> { - let temp = TestTempDir::new("sce-setup-install-tests")?; - - let outcome = install_embedded_setup_assets(temp.path(), SetupTarget::Both)?; - assert_eq!(outcome.target_results.len(), 2); - - let opencode_paths = collect_runtime_relative_paths(temp.path().join(".opencode"))?; - let claude_paths = collect_runtime_relative_paths(temp.path().join(".claude"))?; - - let expected_opencode: Vec<String> = assets_for_target(SetupTarget::OpenCode) - .iter() - .map(|asset| asset.relative_path.to_string()) - .collect(); - let expected_claude: Vec<String> = assets_for_target(SetupTarget::Claude) - .iter() - .map(|asset| asset.relative_path.to_string()) - .collect(); - - assert_eq!(opencode_paths, expected_opencode); - assert_eq!(claude_paths, expected_claude); - Ok(()) - } - - #[test] - fn
install_engine_rolls_back_when_swap_fails() -> Result<()> { - let temp = TestTempDir::new("sce-setup-install-tests")?; - let destination = temp.path().join(".opencode"); - fs::create_dir_all(&destination)?; - fs::write(destination.join("legacy.txt"), b"legacy")?; - - let rename_calls = Cell::new(0_u8); - let error = install_embedded_setup_assets_with_rename( - temp.path(), - SetupTarget::OpenCode, - |from, to| { - rename_calls.set(rename_calls.get() + 1); - if rename_calls.get() == 2 { - return Err(io::Error::new( - io::ErrorKind::Other, - "injected swap failure", - )); - } - - fs::rename(from, to) - }, - ) - .expect_err("swap failure should bubble up as an error"); - - assert!(error.to_string().contains("Failed to swap staged install")); - assert!(destination.exists()); - assert!(destination.join("legacy.txt").exists()); - - let backup = temp.path().join(".opencode.backup"); - assert!(!backup.exists(), "rollback should restore original path"); - - for entry in fs::read_dir(temp.path())? 
{ - let entry = entry?; - let name = entry.file_name(); - let name = name.to_string_lossy(); - assert!( - !name.starts_with(".sce-setup-staging-opencode-"), - "staging directory should be cleaned up after failure" - ); - } - - Ok(()) - } - - #[test] - fn required_hook_install_installs_missing_hooks_in_default_directory() -> Result<()> { - let temp = TestTempDir::new("sce-setup-hook-install-tests")?; - init_git_repo(temp.path())?; - - let outcome = install_required_git_hooks(temp.path())?; - assert_eq!(outcome.repository_root, temp.path().to_path_buf()); - assert_eq!(outcome.hook_results.len(), 3); - for hook in outcome.hook_results { - assert_eq!(hook.status, RequiredHookInstallStatus::Installed); - assert!(hook.hook_path.exists()); - assert!(hook.backup_path.is_none()); - assert_hook_is_executable(&hook.hook_path)?; - } - - Ok(()) - } - - #[test] - fn required_hook_install_rerun_reports_skipped_for_unchanged_hooks() -> Result<()> { - let temp = TestTempDir::new("sce-setup-hook-install-tests")?; - init_git_repo(temp.path())?; - - let first = install_required_git_hooks(temp.path())?; - assert!(first - .hook_results - .iter() - .all(|hook| hook.status == RequiredHookInstallStatus::Installed)); - - let second = install_required_git_hooks(temp.path())?; - assert!(second - .hook_results - .iter() - .all(|hook| hook.status == RequiredHookInstallStatus::Skipped)); - assert!(second - .hook_results - .iter() - .all(|hook| hook.backup_path.is_none())); - - Ok(()) - } - - #[test] - fn required_hook_install_updates_noncanonical_hook_in_custom_hooks_path() -> Result<()> { - let temp = TestTempDir::new("sce-setup-hook-install-tests")?; - init_git_repo(temp.path())?; - - run_git_in_repo(temp.path(), &["config", "core.hooksPath", ".githooks"])?; - - let custom_hooks_directory = temp.path().join(".githooks"); - fs::create_dir_all(&custom_hooks_directory)?; - let commit_msg_path = custom_hooks_directory.join("commit-msg"); - fs::write(&commit_msg_path, b"#!/bin/sh\necho legacy\n")?; 
- set_test_file_mode(&commit_msg_path, 0o644)?; - - let outcome = install_required_git_hooks(temp.path())?; - assert_eq!(outcome.hooks_directory, custom_hooks_directory); - - let updated = outcome - .hook_results - .iter() - .find(|hook| hook.hook_name == "commit-msg") - .expect("commit-msg result should exist"); - assert_eq!(updated.status, RequiredHookInstallStatus::Updated); - let backup_path = updated - .backup_path - .as_ref() - .expect("updated hook should retain backup path"); - assert!(backup_path.exists()); - assert_eq!(fs::read(backup_path)?, b"#!/bin/sh\necho legacy\n"); - assert_hook_is_executable(&updated.hook_path)?; - - Ok(()) - } - - #[test] - fn required_hook_install_rolls_back_when_hook_swap_fails() -> Result<()> { - let temp = TestTempDir::new("sce-setup-hook-install-tests")?; - init_git_repo(temp.path())?; - - let hooks_directory = temp.path().join(".git/hooks"); - fs::create_dir_all(&hooks_directory)?; - let commit_msg_path = hooks_directory.join("commit-msg"); - fs::write(&commit_msg_path, b"#!/bin/sh\necho legacy\n")?; - - let rename_calls = Cell::new(0_u8); - let error = install_required_git_hooks_with_rename(temp.path(), |from, to| { - rename_calls.set(rename_calls.get() + 1); - if rename_calls.get() == 2 { - return Err(io::Error::other("injected hook swap failure")); - } - - fs::rename(from, to) - }) - .expect_err("hook swap failure should bubble up"); - - assert!(error - .to_string() - .contains("Failed to update required hook 'commit-msg'")); - assert!(commit_msg_path.exists()); - assert_eq!(fs::read(&commit_msg_path)?, b"#!/bin/sh\necho legacy\n"); - assert!(!hooks_directory.join("commit-msg.backup").exists()); - - for entry in fs::read_dir(&hooks_directory)? 
{ - let entry = entry?; - let name = entry.file_name(); - let name = name.to_string_lossy(); - assert!( - !name.starts_with(".sce-hook-staging-"), - "hook staging file should be cleaned up after failure" - ); - } - - Ok(()) - } - - fn init_git_repo(repository_root: &Path) -> Result<()> { - run_git_in_repo(repository_root, &["init", "-q"])?; - Ok(()) - } - - fn run_git_in_repo(repository_root: &Path, args: &[&str]) -> Result<()> { - let status = Command::new("git") - .args(args) - .current_dir(repository_root) - .status()?; - if !status.success() { - anyhow::bail!("git command failed for test repository") - } - Ok(()) - } - - #[cfg(unix)] - fn set_test_file_mode(path: &Path, mode: u32) -> Result<()> { - use std::os::unix::fs::PermissionsExt; - - fs::set_permissions(path, fs::Permissions::from_mode(mode))?; - Ok(()) - } - - #[cfg(not(unix))] - fn set_test_file_mode(_path: &Path, _mode: u32) -> Result<()> { - Ok(()) - } - - #[cfg(unix)] - fn assert_hook_is_executable(path: &Path) -> Result<()> { - use std::os::unix::fs::PermissionsExt; - - let metadata = fs::metadata(path)?; - assert!(metadata.permissions().mode() & 0o111 != 0); - Ok(()) - } - - #[cfg(not(unix))] - fn assert_hook_is_executable(path: &Path) -> Result<()> { - assert!(path.exists()); - Ok(()) - } - - fn runtime_target_root(target: SetupTarget) -> PathBuf { - let target_relative = match target { - SetupTarget::OpenCode => "config/.opencode", - SetupTarget::Claude => "config/.claude", - SetupTarget::Both => unreachable!("both is not a concrete filesystem root"), - }; - - PathBuf::from(env!("CARGO_MANIFEST_DIR")) - .parent() - .expect("cli crate should be nested under repository root") - .join(target_relative) - } - - fn assets_for_target(target: SetupTarget) -> &'static [super::EmbeddedAsset] { - match target { - SetupTarget::OpenCode => super::OPENCODE_EMBEDDED_ASSETS, - SetupTarget::Claude => super::CLAUDE_EMBEDDED_ASSETS, - SetupTarget::Both => unreachable!("both is not a single embedded target"), - } - 
} - - fn collect_runtime_relative_paths(root: PathBuf) -> Result<Vec<String>> { - let mut files = Vec::new(); - collect_runtime_files(&root, &root, &mut files)?; - - files.sort_unstable(); - - let stable_paths = files - .into_iter() - .map(|path| { - path.to_str() - .expect("runtime config path should be UTF-8") - .replace('\\', "/") - }) - .collect(); - - Ok(stable_paths) - } - - fn collect_runtime_files( - base_root: &Path, - current_dir: &Path, - output: &mut Vec<PathBuf>, - ) -> Result<()> { - for entry in fs::read_dir(current_dir)? { - let entry = entry?; - let path = entry.path(); - - if entry.file_type()?.is_dir() { - collect_runtime_files(base_root, &path, output)?; - continue; - } - - let relative = path - .strip_prefix(base_root) - .expect("relative path should be under root") - .to_path_buf(); - output.push(relative); - } - - Ok(()) - } - - fn hook_filename(hook: RequiredHookAsset) -> &'static str { - match hook { - RequiredHookAsset::PreCommit => "pre-commit", - RequiredHookAsset::CommitMsg => "commit-msg", - RequiredHookAsset::PostCommit => "post-commit", - } - } -} +mod tests; diff --git a/cli/src/services/setup/tests.rs b/cli/src/services/setup/tests.rs new file mode 100644 index 00000000..1d7bb8ab --- /dev/null +++ b/cli/src/services/setup/tests.rs @@ -0,0 +1,639 @@ +use std::{ + cell::Cell, + fs, io, + path::{Path, PathBuf}, + process::Command, +}; + +use crate::test_support::TestTempDir; +use anyhow::Result; + +use super::{ + get_required_hook_asset, install_embedded_setup_assets, + install_embedded_setup_assets_with_rename, install_required_git_hooks, + install_required_git_hooks_with_rename, iter_embedded_assets_for_setup_target, + iter_required_hook_assets, parse_setup_cli_options, resolve_setup_dispatch, + resolve_setup_hooks_repository, resolve_setup_mode, run_setup_for_mode, run_setup_hooks, + setup_usage_text, RequiredHookAsset, RequiredHookInstallStatus, SetupCliOptions, SetupDispatch, + SetupMode, SetupTarget, +}; + +#[derive(Clone, Copy, Debug)] +struct
MockPrompter { + response: SetupDispatch, +} + +impl super::SetupTargetPrompter for MockPrompter { + fn prompt_target(&self) -> Result<SetupDispatch> { + Ok(self.response) + } +} + +#[test] +fn run_setup_rejects_unresolved_interactive_mode() { + let temp = TestTempDir::new("sce-setup-install-tests").expect("temp dir should be created"); + let error = run_setup_for_mode(temp.path(), SetupMode::Interactive) + .expect_err("interactive mode should be resolved before install"); + assert_eq!( + error.to_string(), + "Interactive setup mode must be resolved before installation" + ); +} + +#[test] +fn setup_options_default_to_interactive_mode() -> Result<()> { + let options = parse_setup_cli_options(Vec::<String>::new())?; + let mode = resolve_setup_mode(options)?; + assert_eq!(mode, SetupMode::Interactive); + Ok(()) +} + +#[test] +fn setup_options_parse_opencode_flag() -> Result<()> { + let options = parse_setup_cli_options(vec!["--opencode".to_string()])?; + let mode = resolve_setup_mode(options)?; + assert_eq!(mode, SetupMode::NonInteractive(SetupTarget::OpenCode)); + Ok(()) +} + +#[test] +fn setup_options_reject_mutually_exclusive_flags() { + let error = resolve_setup_mode(SetupCliOptions { + help: false, + opencode: true, + claude: true, + both: false, + hooks: false, + repo_path: None, + }) + .expect_err("multiple target flags should fail"); + + assert_eq!( + error.to_string(), + "Options '--opencode', '--claude', and '--both' are mutually exclusive. Choose exactly one target flag or none for interactive mode."
+ ); +} + +#[test] +fn setup_usage_contract_mentions_target_flags() { + let usage = setup_usage_text(); + assert!(usage.contains("--opencode|--claude|--both")); + assert!(usage.contains("sce setup --hooks [--repo ]")); +} + +#[test] +fn setup_options_parse_hooks_without_repo() -> Result<()> { + let options = parse_setup_cli_options(vec!["--hooks".to_string()])?; + let repo = resolve_setup_hooks_repository(&options)?; + assert_eq!(repo, None); + Ok(()) +} + +#[test] +fn setup_options_parse_hooks_with_repo() -> Result<()> { + let options = parse_setup_cli_options(vec![ + "--hooks".to_string(), + "--repo".to_string(), + "tmp/repo".to_string(), + ])?; + let repo = resolve_setup_hooks_repository(&options)?; + assert_eq!(repo, Some(PathBuf::from("tmp/repo"))); + Ok(()) +} + +#[test] +fn setup_options_reject_repo_without_hooks() { + let options = parse_setup_cli_options(vec!["--repo".to_string(), "tmp/repo".to_string()]) + .expect("parsing --repo should succeed before validation"); + let error = + resolve_setup_hooks_repository(&options).expect_err("--repo without --hooks should fail"); + assert_eq!( + error.to_string(), + "Option '--repo' requires '--hooks'. Run 'sce setup --help' to see valid usage." + ); +} + +#[test] +fn setup_options_reject_hooks_with_target_flags() { + let options = parse_setup_cli_options(vec!["--hooks".to_string(), "--opencode".to_string()]) + .expect("parsing should succeed before validation"); + let error = resolve_setup_hooks_repository(&options) + .expect_err("--hooks with target flags should fail"); + assert_eq!( + error.to_string(), + "Option '--hooks' cannot be combined with '--opencode', '--claude', or '--both'. Run 'sce setup --help' to see valid usage." 
+ ); +} + +#[test] +fn run_setup_hooks_reports_per_hook_statuses() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let message = run_setup_hooks(temp.path())?; + assert!(message.contains("Hook setup completed successfully.")); + assert!(message.contains("Repository root:")); + assert!(message.contains("Hooks directory:")); + assert!(message.contains("commit-msg: installed")); + assert!(message.contains("post-commit: installed")); + assert!(message.contains("pre-commit: installed")); + assert!(message.contains("backup: not needed")); + + Ok(()) +} + +#[test] +fn setup_help_option_sets_help_flag() -> Result<()> { + let options = parse_setup_cli_options(vec!["--help".to_string()])?; + assert!(options.help); + Ok(()) +} + +#[test] +fn run_setup_reports_selected_target_and_backup_status() -> Result<()> { + let temp = TestTempDir::new("sce-setup-install-tests")?; + fs::create_dir_all(temp.path().join(".opencode/legacy"))?; + fs::write(temp.path().join(".opencode/legacy/config.txt"), b"legacy")?; + + let message = run_setup_for_mode( + temp.path(), + SetupMode::NonInteractive(SetupTarget::OpenCode), + )?; + assert!(message.contains("Setup completed successfully.")); + assert!(message.contains("Selected target(s): OpenCode")); + assert!(message.contains("OpenCode: installed")); + assert!(message.contains("backup: existing target moved to")); + assert!(message.contains(".opencode.backup")); + + Ok(()) +} + +#[test] +fn run_setup_reports_both_targets() -> Result<()> { + let temp = TestTempDir::new("sce-setup-install-tests")?; + let message = run_setup_for_mode(temp.path(), SetupMode::NonInteractive(SetupTarget::Both))?; + assert!(message.contains("Selected target(s): OpenCode, Claude")); + assert!(message.contains("OpenCode: installed")); + assert!(message.contains("Claude: installed")); + assert!(message.contains("backup: not needed (no existing target)")); + Ok(()) +} + +#[test] +fn 
interactive_dispatch_maps_selected_target() -> Result<()> { + let dispatch = resolve_setup_dispatch( + SetupMode::Interactive, + &MockPrompter { + response: SetupDispatch::Proceed(SetupMode::NonInteractive(SetupTarget::Claude)), + }, + )?; + + assert_eq!( + dispatch, + SetupDispatch::Proceed(SetupMode::NonInteractive(SetupTarget::Claude)) + ); + Ok(()) +} + +#[test] +fn interactive_dispatch_returns_cancelled_without_side_effects() -> Result<()> { + let dispatch = resolve_setup_dispatch( + SetupMode::Interactive, + &MockPrompter { + response: SetupDispatch::Cancelled, + }, + )?; + + assert_eq!(dispatch, SetupDispatch::Cancelled); + Ok(()) +} + +#[test] +fn embedded_manifest_paths_are_sorted_and_normalized() { + for target in [SetupTarget::OpenCode, SetupTarget::Claude] { + let assets = assets_for_target(target); + + assert!(!assets.is_empty(), "embedded asset set should not be empty"); + + let paths: Vec<&str> = assets.iter().map(|asset| asset.relative_path).collect(); + assert_eq!(paths.len(), assets.len()); + + for asset in assets { + assert!(!asset.relative_path.is_empty()); + assert!(!asset.relative_path.starts_with('/')); + assert!(!asset.relative_path.contains('\\')); + assert!(!asset.relative_path.starts_with("config/")); + assert!( + !asset.bytes.is_empty(), + "embedded files should have content bytes" + ); + } + + let mut sorted = paths.clone(); + sorted.sort_unstable(); + assert_eq!( + paths, sorted, + "embedded paths should be deterministic and sorted" + ); + } +} + +#[test] +fn embedded_manifest_matches_runtime_config_tree() -> Result<()> { + let opencode_expected = + collect_runtime_relative_paths(runtime_target_root(SetupTarget::OpenCode))?; + let claude_expected = collect_runtime_relative_paths(runtime_target_root(SetupTarget::Claude))?; + + let opencode_actual: Vec<String> = assets_for_target(SetupTarget::OpenCode) + .iter() + .map(|asset| asset.relative_path.to_string()) + .collect(); + let claude_actual: Vec<String> = assets_for_target(SetupTarget::Claude) +
.iter() + .map(|asset| asset.relative_path.to_string()) + .collect(); + + assert_eq!(opencode_actual, opencode_expected); + assert_eq!(claude_actual, claude_expected); + Ok(()) +} + +#[test] +fn embedded_setup_target_iterator_scopes_assets_per_target() { + let opencode_count = assets_for_target(SetupTarget::OpenCode).len(); + let claude_count = assets_for_target(SetupTarget::Claude).len(); + + let iter_opencode_count = iter_embedded_assets_for_setup_target(SetupTarget::OpenCode).count(); + let iter_claude_count = iter_embedded_assets_for_setup_target(SetupTarget::Claude).count(); + let iter_both_count = iter_embedded_assets_for_setup_target(SetupTarget::Both).count(); + + assert_eq!(iter_opencode_count, opencode_count); + assert_eq!(iter_claude_count, claude_count); + assert_eq!(iter_both_count, opencode_count + claude_count); +} + +#[test] +fn embedded_hook_manifest_is_complete_sorted_and_normalized() { + let hooks: Vec<&super::EmbeddedAsset> = iter_required_hook_assets().collect(); + let paths: Vec<&str> = hooks.iter().map(|asset| asset.relative_path).collect(); + + assert_eq!(paths, vec!["commit-msg", "post-commit", "pre-commit"]); + + for hook in hooks { + assert!(!hook.relative_path.is_empty()); + assert!(!hook.relative_path.contains('/')); + assert!(!hook.relative_path.contains('\\')); + assert!(!hook.bytes.is_empty()); + assert!( + hook.bytes.starts_with(b"#!/bin/sh\n"), + "embedded hook should start with shell shebang" + ); + } +} + +#[test] +fn required_hook_lookup_resolves_each_canonical_hook() { + for hook in [ + RequiredHookAsset::PreCommit, + RequiredHookAsset::CommitMsg, + RequiredHookAsset::PostCommit, + ] { + let asset = get_required_hook_asset(hook).expect("required hook asset should exist"); + assert_eq!(asset.relative_path, hook_filename(hook)); + assert!(!asset.bytes.is_empty()); + } +} + +#[test] +fn install_engine_replaces_existing_target_with_backup() -> Result<()> { + let temp = TestTempDir::new("sce-setup-install-tests")?; + let 
existing_target = temp.path().join(".opencode"); + fs::create_dir_all(existing_target.join("legacy"))?; + fs::write(existing_target.join("legacy/config.txt"), b"legacy")?; + + let outcome = install_embedded_setup_assets(temp.path(), SetupTarget::OpenCode)?; + assert_eq!(outcome.target_results.len(), 1); + + let result = &outcome.target_results[0]; + assert_eq!(result.target, SetupTarget::OpenCode); + assert_eq!(result.destination_root, temp.path().join(".opencode")); + assert_eq!( + result.installed_file_count, + assets_for_target(SetupTarget::OpenCode).len() + ); + + let backup_root = result + .backup_root + .as_ref() + .expect("existing target should have backup path"); + assert!(backup_root.exists()); + assert!(backup_root.join("legacy/config.txt").exists()); + + let installed_paths = collect_runtime_relative_paths(result.destination_root.clone())?; + let expected_paths: Vec<String> = assets_for_target(SetupTarget::OpenCode) + .iter() + .map(|asset| asset.relative_path.to_string()) + .collect(); + assert_eq!(installed_paths, expected_paths); + Ok(()) +} + +#[test] +fn install_engine_installs_both_targets() -> Result<()> { + let temp = TestTempDir::new("sce-setup-install-tests")?; + + let outcome = install_embedded_setup_assets(temp.path(), SetupTarget::Both)?; + assert_eq!(outcome.target_results.len(), 2); + + let opencode_paths = collect_runtime_relative_paths(temp.path().join(".opencode"))?; + let claude_paths = collect_runtime_relative_paths(temp.path().join(".claude"))?; + + let expected_opencode: Vec<String> = assets_for_target(SetupTarget::OpenCode) + .iter() + .map(|asset| asset.relative_path.to_string()) + .collect(); + let expected_claude: Vec<String> = assets_for_target(SetupTarget::Claude) + .iter() + .map(|asset| asset.relative_path.to_string()) + .collect(); + + assert_eq!(opencode_paths, expected_opencode); + assert_eq!(claude_paths, expected_claude); + Ok(()) +} + +#[test] +fn install_engine_rolls_back_when_swap_fails() -> Result<()> { + let temp = 
TestTempDir::new("sce-setup-install-tests")?; + let destination = temp.path().join(".opencode"); + fs::create_dir_all(&destination)?; + fs::write(destination.join("legacy.txt"), b"legacy")?; + + let rename_calls = Cell::new(0_u8); + let error = install_embedded_setup_assets_with_rename( + temp.path(), + SetupTarget::OpenCode, + |from, to| { + rename_calls.set(rename_calls.get() + 1); + if rename_calls.get() == 2 { + return Err(io::Error::new( + io::ErrorKind::Other, + "injected swap failure", + )); + } + + fs::rename(from, to) + }, + ) + .expect_err("swap failure should bubble up as an error"); + + assert!(error.to_string().contains("Failed to swap staged install")); + assert!(destination.exists()); + assert!(destination.join("legacy.txt").exists()); + + let backup = temp.path().join(".opencode.backup"); + assert!(!backup.exists(), "rollback should restore original path"); + + for entry in fs::read_dir(temp.path())? { + let entry = entry?; + let name = entry.file_name(); + let name = name.to_string_lossy(); + assert!( + !name.starts_with(".sce-setup-staging-opencode-"), + "staging directory should be cleaned up after failure" + ); + } + + Ok(()) +} + +#[test] +fn required_hook_install_installs_missing_hooks_in_default_directory() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let outcome = install_required_git_hooks(temp.path())?; + assert_eq!(outcome.repository_root, temp.path().to_path_buf()); + assert_eq!(outcome.hook_results.len(), 3); + for hook in outcome.hook_results { + assert_eq!(hook.status, RequiredHookInstallStatus::Installed); + assert!(hook.hook_path.exists()); + assert!(hook.backup_path.is_none()); + assert_hook_is_executable(&hook.hook_path)?; + } + + Ok(()) +} + +#[test] +fn required_hook_install_rerun_reports_skipped_for_unchanged_hooks() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let first = 
install_required_git_hooks(temp.path())?; + assert!(first + .hook_results + .iter() + .all(|hook| hook.status == RequiredHookInstallStatus::Installed)); + + let second = install_required_git_hooks(temp.path())?; + assert!(second + .hook_results + .iter() + .all(|hook| hook.status == RequiredHookInstallStatus::Skipped)); + assert!(second + .hook_results + .iter() + .all(|hook| hook.backup_path.is_none())); + + Ok(()) +} + +#[test] +fn required_hook_install_updates_noncanonical_hook_in_custom_hooks_path() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + run_git_in_repo(temp.path(), &["config", "core.hooksPath", ".githooks"])?; + + let custom_hooks_directory = temp.path().join(".githooks"); + fs::create_dir_all(&custom_hooks_directory)?; + let commit_msg_path = custom_hooks_directory.join("commit-msg"); + fs::write(&commit_msg_path, b"#!/bin/sh\necho legacy\n")?; + set_test_file_mode(&commit_msg_path, 0o644)?; + + let outcome = install_required_git_hooks(temp.path())?; + assert_eq!(outcome.hooks_directory, custom_hooks_directory); + + let updated = outcome + .hook_results + .iter() + .find(|hook| hook.hook_name == "commit-msg") + .expect("commit-msg result should exist"); + assert_eq!(updated.status, RequiredHookInstallStatus::Updated); + let backup_path = updated + .backup_path + .as_ref() + .expect("updated hook should retain backup path"); + assert!(backup_path.exists()); + assert_eq!(fs::read(backup_path)?, b"#!/bin/sh\necho legacy\n"); + assert_hook_is_executable(&updated.hook_path)?; + + Ok(()) +} + +#[test] +fn required_hook_install_rolls_back_when_hook_swap_fails() -> Result<()> { + let temp = TestTempDir::new("sce-setup-hook-install-tests")?; + init_git_repo(temp.path())?; + + let hooks_directory = temp.path().join(".git/hooks"); + fs::create_dir_all(&hooks_directory)?; + let commit_msg_path = hooks_directory.join("commit-msg"); + fs::write(&commit_msg_path, b"#!/bin/sh\necho legacy\n")?; + + 
let rename_calls = Cell::new(0_u8); + let error = install_required_git_hooks_with_rename(temp.path(), |from, to| { + rename_calls.set(rename_calls.get() + 1); + if rename_calls.get() == 2 { + return Err(io::Error::other("injected hook swap failure")); + } + + fs::rename(from, to) + }) + .expect_err("hook swap failure should bubble up"); + + assert!(error + .to_string() + .contains("Failed to update required hook 'commit-msg'")); + assert!(commit_msg_path.exists()); + assert_eq!(fs::read(&commit_msg_path)?, b"#!/bin/sh\necho legacy\n"); + assert!(!hooks_directory.join("commit-msg.backup").exists()); + + for entry in fs::read_dir(&hooks_directory)? { + let entry = entry?; + let name = entry.file_name(); + let name = name.to_string_lossy(); + assert!( + !name.starts_with(".sce-hook-staging-"), + "hook staging file should be cleaned up after failure" + ); + } + + Ok(()) +} + +fn init_git_repo(repository_root: &Path) -> Result<()> { + run_git_in_repo(repository_root, &["init", "-q"])?; + Ok(()) +} + +fn run_git_in_repo(repository_root: &Path, args: &[&str]) -> Result<()> { + let status = Command::new("git") + .args(args) + .current_dir(repository_root) + .status()?; + if !status.success() { + anyhow::bail!("git command failed for test repository") + } + Ok(()) +} + +#[cfg(unix)] +fn set_test_file_mode(path: &Path, mode: u32) -> Result<()> { + use std::os::unix::fs::PermissionsExt; + + fs::set_permissions(path, fs::Permissions::from_mode(mode))?; + Ok(()) +} + +#[cfg(not(unix))] +fn set_test_file_mode(_path: &Path, _mode: u32) -> Result<()> { + Ok(()) +} + +#[cfg(unix)] +fn assert_hook_is_executable(path: &Path) -> Result<()> { + use std::os::unix::fs::PermissionsExt; + + let metadata = fs::metadata(path)?; + assert!(metadata.permissions().mode() & 0o111 != 0); + Ok(()) +} + +#[cfg(not(unix))] +fn assert_hook_is_executable(path: &Path) -> Result<()> { + assert!(path.exists()); + Ok(()) +} + +fn runtime_target_root(target: SetupTarget) -> PathBuf { + let target_relative = 
match target { + SetupTarget::OpenCode => "config/.opencode", + SetupTarget::Claude => "config/.claude", + SetupTarget::Both => unreachable!("both is not a concrete filesystem root"), + }; + + PathBuf::from(env!("CARGO_MANIFEST_DIR")) + .parent() + .expect("cli crate should be nested under repository root") + .join(target_relative) +} + +fn assets_for_target(target: SetupTarget) -> &'static [super::EmbeddedAsset] { + match target { + SetupTarget::OpenCode => super::OPENCODE_EMBEDDED_ASSETS, + SetupTarget::Claude => super::CLAUDE_EMBEDDED_ASSETS, + SetupTarget::Both => unreachable!("both is not a single embedded target"), + } +} + +fn collect_runtime_relative_paths(root: PathBuf) -> Result<Vec<String>> { + let mut files = Vec::new(); + collect_runtime_files(&root, &root, &mut files)?; + + files.sort_unstable(); + + let stable_paths = files + .into_iter() + .map(|path| { + path.to_str() + .expect("runtime config path should be UTF-8") + .replace('\\', "/") + }) + .collect(); + + Ok(stable_paths) +} + +fn collect_runtime_files( + base_root: &Path, + current_dir: &Path, + output: &mut Vec<PathBuf>, +) -> Result<()> { + for entry in fs::read_dir(current_dir)? 
{ + let entry = entry?; + let path = entry.path(); + + if entry.file_type()?.is_dir() { + collect_runtime_files(base_root, &path, output)?; + continue; + } + + let relative = path + .strip_prefix(base_root) + .expect("relative path should be under root") + .to_path_buf(); + output.push(relative); + } + + Ok(()) +} + +fn hook_filename(hook: RequiredHookAsset) -> &'static str { + match hook { + RequiredHookAsset::PreCommit => "pre-commit", + RequiredHookAsset::CommitMsg => "commit-msg", + RequiredHookAsset::PostCommit => "post-commit", + } +} diff --git a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md index 5cf75629..b40991e4 100644 --- a/context/plans/sce-cli-rust-idiomatic-hardening-pass.md +++ b/context/plans/sce-cli-rust-idiomatic-hardening-pass.md @@ -84,14 +84,14 @@ Non-goals (deferred): - Done when: runtime initialization code is single-flow and atomic in style, preserving current error context and reuse behavior. - Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::sync::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T08: Apply incremental test/runtime separation in hooks/setup modules (status:todo) +- [x] T08: Apply incremental test/runtime separation in hooks/setup modules (status:done) - Goal: Improve maintainability by extracting selected large in-file test sections from `hooks.rs` and `setup.rs` into focused sibling test modules/files while preserving current test semantics. - Boundaries (in): test module organization and local helper placement for `cli/src/services/hooks.rs` and `cli/src/services/setup.rs`. - Boundaries (out): Full integration-test migration and non-test production refactors not needed for extraction. - Done when: high-churn/large test slices are moved out of primary runtime files, module compiles cleanly, and affected test suites pass. 
- Verification notes: run `cargo test --manifest-path cli/Cargo.toml services::hooks::tests services::setup::tests` and `cargo check --manifest-path cli/Cargo.toml`. -- [ ] T09: Validation and cleanup (status:todo) +- [x] T09: Validation and cleanup (status:done) - Goal: Execute full verification sweep, confirm behavior parity for touched domains, and sync context artifacts to current state (including dependency contract references). - Boundaries (in): formatting/build/test checks, plan status finalization, and required context updates in `context/`. - Boundaries (out): New feature work beyond this hardening pass. @@ -101,3 +101,25 @@ Non-goals (deferred): ## 5) Open questions (if any) None. Scope, dependency direction, tie policy, and test-split depth were resolved during clarification. + +## 6) Validation report (T09) + +- Commands run: + - `cargo fmt --manifest-path cli/Cargo.toml --all -- --check` (exit 0) + - `cargo test --manifest-path cli/Cargo.toml` (exit 0, 114 passed) + - `cargo build --manifest-path cli/Cargo.toml` (exit 0, placeholder dead-code warnings only) + - `nix run .#pkl-check-generated` (exit 0, generated outputs up to date) + - `nix flake check` (initial failure: Nix git-source omitted untracked extracted test modules; after tracking `cli/src/services/hooks/tests.rs` and `cli/src/services/setup/tests.rs`, rerun exit 0) + +- Cleanup actions: + - No temporary scaffolding under `context/tmp/` was required for this task. + - Ensured extracted test modules are tracked so Nix flake source evaluation matches compile-time module layout. + +- Success-criteria verification: + - Hosted crypto + structured JSON parsing + epsilon tie handling: preserved and passing in full suite. + - Local DB path handling, parser/runtime ergonomics, and OnceLock flow: preserved and passing in full suite. + - Incremental hooks/setup test extraction: validated in Cargo and Nix checks once extracted module files were tracked. 
+ - Context alignment + plan finalization: task marked done and root context files verified as current-state accurate for this localized finalization pass. + +- Residual risk: + - `cargo build` still emits expected placeholder-surface dead-code warnings in Agent Trace/hosted/hooks/local-db seams; no functional regressions observed. From 9c66e74ef706c08c5daa09a38ca851c343f9a232 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 09:21:29 +0100 Subject: [PATCH 30/39] cli: Add clippy app and check in Nix flake Enable the clippy toolchain extension and add dedicated `apps.clippy` and `checks.cli-clippy` entries. This makes linting available as a first-class Nix app/check for the CLI crate. --- cli/flake.nix | 49 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 48 insertions(+), 1 deletion(-) diff --git a/cli/flake.nix b/cli/flake.nix index 9f02f3f9..6fb1c1d9 100644 --- a/cli/flake.nix +++ b/cli/flake.nix @@ -29,7 +29,10 @@ }; rustToolchain = pkgs.rust-bin.stable.latest.default.override { - extensions = [ "rustfmt" ]; + extensions = [ + "rustfmt" + "clippy" + ]; }; rustPlatform = pkgs.makeRustPlatform { @@ -65,6 +68,18 @@ }; }; + apps.clippy = { + type = "app"; + program = toString ( + pkgs.writeShellScript "sce-clippy" '' + exec ${rustToolchain}/bin/cargo clippy --manifest-path cli/Cargo.toml --all-targets --all-features "$@" + '' + ); + meta = { + description = "Run clippy for the sce CLI crate"; + }; + }; + checks.cli-setup-command-surface = rustPlatform.buildRustPackage { pname = "sce-cli-setup-command-surface-check"; version = "0.1.0"; @@ -99,6 +114,38 @@ runHook postInstall ''; }; + + checks.cli-clippy = rustPlatform.buildRustPackage { + pname = "sce-cli-clippy-check"; + version = "0.1.0"; + inherit src; + sourceRoot = "source/cli"; + + cargoLock = { + lockFile = ../cli/Cargo.lock; + }; + + nativeBuildInputs = [ rustToolchain ]; + + buildPhase = '' + runHook preBuild + runHook postBuild + ''; + + checkPhase = '' + runHook preCheck + + 
cargo clippy --all-targets --all-features + + runHook postCheck + ''; + + installPhase = '' + runHook preInstall + mkdir -p "$out" + runHook postInstall + ''; + }; } ); } From 67f8d08846217bf006e861dcbf9255c81af35e67 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 09:31:54 +0100 Subject: [PATCH 31/39] agent-trace: Freeze local hooks MVP contract and gap matrix --- context/context-map.md | 1 + context/glossary.md | 1 + context/overview.md | 1 + .../agent-trace-local-hooks-production-mvp.md | 156 ++++++++++++++++++ ...trace-hosted-event-intake-orchestration.md | 6 + ...ace-local-hooks-mvp-contract-gap-matrix.md | 90 ++++++++++ 6 files changed, 255 insertions(+) create mode 100644 context/plans/agent-trace-local-hooks-production-mvp.md create mode 100644 context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md diff --git a/context/context-map.md b/context/context-map.md index 59a53403..b32e7126 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -34,6 +34,7 @@ Feature/domain context: - `context/sce/agent-trace-hosted-event-intake-orchestration.md` (T12 hosted GitHub/GitLab event intake contract with signature verification, old/new head resolution, and deterministic reconciliation-run idempotency keys) - `context/sce/agent-trace-rewrite-mapping-engine.md` (T13 hosted rewrite mapping engine contract with patch-id exact precedence, range-diff/fuzzy scoring, and deterministic unresolved outcomes) - `context/sce/agent-trace-retry-queue-observability.md` (T14 retry queue recovery contract plus reconciliation/runtime observability metrics and DB-first queue schema additions) +- `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` (T01 Local Hooks MVP production contract freeze and deterministic gap matrix for `agent-trace-local-hooks-production-mvp`) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 47fc6b93..58b3b224 100644 
--- a/context/glossary.md +++ b/context/glossary.md @@ -72,3 +72,4 @@ - `agent trace rewrite mapping engine`: T13 contract in `cli/src/services/hosted_reconciliation.rs` (`map_rewritten_commit`) that deterministically maps old->new rewritten commits using strict patch-id exact precedence, then range-diff scoring, then fuzzy fallback with `>= 0.60` threshold gating and explicit unresolved outcomes (`ambiguous`, `unmatched`, `low_confidence`). - `agent trace retry replay processor`: T14 contract in `cli/src/services/hooks.rs` (`process_trace_retry_queue`) that dequeues fallback queue entries, retries only previously failed persistence targets (notes and/or DB), requeues remaining failures, and emits per-attempt runtime/error-class metrics via `RetryMetricsSink`. - `reconciliation metrics snapshot`: T14 contract in `cli/src/services/hosted_reconciliation.rs` (`summarize_reconciliation_metrics`) that reports mapped/unmapped counts, confidence histogram buckets (`high`/`medium`/`low`), runtime (`runtime_ms`), and normalized error class (`signature`/`payload`/`store`) for hosted rewrite runs. +- `agent trace local hooks MVP contract and gap matrix`: T01 context artifact at `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` that freezes local production boundaries/decisions for `agent-trace-local-hooks-production-mvp` and maps current seam-level code truth to required runtime completion tasks (`T02`..`T10`). diff --git a/context/overview.md b/context/overview.md index f76e400e..9ad897bb 100644 --- a/context/overview.md +++ b/context/overview.md @@ -104,6 +104,7 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-hosted-event-intake-orchestration.md` for the implemented T12 hosted intake contract (GitHub/GitLab signature verification, old/new head resolution, deterministic reconciliation-run idempotency keys, and replay-safe run insertion outcomes). 
- Use `context/sce/agent-trace-rewrite-mapping-engine.md` for the implemented T13 hosted mapping engine contract (patch-id exact matching, range-diff/fuzzy scoring precedence, confidence thresholds, and deterministic unresolved handling). - Use `context/sce/agent-trace-retry-queue-observability.md` for the implemented T14 retry replay contract (notes/DB target-scoped recovery, per-attempt runtime/error-class metrics, reconciliation mapped/unmapped + confidence histogram snapshots, and DB-first retry/metrics schema additions). +- Use `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` for the frozen T01 Local Hooks MVP production contract and deterministic gap matrix that maps current seam-level code truth to the remaining implementation stack (`T02`..`T10`). - Use `context/sce/setup-githooks-hook-asset-packaging.md` for the implemented `sce-setup-githooks-any-repo` T02 compile-time hook-template packaging contract and setup-service required-hook embedded accessor surface. - Use `context/sce/setup-githooks-install-flow.md` for the implemented `sce-setup-githooks-any-repo` T03 required-hook install orchestration contract (git-truth hooks-path resolution, per-hook installed/updated/skipped outcomes, and backup/rollback behavior). - Use `context/sce/setup-githooks-cli-ux.md` for the implemented `sce-setup-githooks-any-repo` T04 setup command-surface contract (`--hooks`, optional `--repo`), compatibility validation rules, and deterministic hook setup messaging. 
diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md new file mode 100644 index 00000000..c7a365e3 --- /dev/null +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -0,0 +1,156 @@ +# Plan: agent-trace-local-hooks-production-mvp +## 1) Change summary +Connect the existing Agent Trace service seams into a fully functional local Git-hook pipeline for production readiness: real `sce hooks` subcommand execution, end-to-end hook data flow (`pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`), canonical notes + local DB persistence, retry recovery, and operator-facing rollout/validation guarantees. + +## 2) Success criteria +- Local hooks MVP is production-functional: required hooks execute real behavior through `sce hooks ` instead of placeholder output. +- End-to-end local commit flow is validated: staged-only pre-commit checkpointing, commit-msg co-author policy, post-commit canonical Agent Trace creation, and post-rewrite remap + rewrite-trace handling. +- Persistence contract is operational: canonical writes to `refs/notes/agent-trace` plus local DB persistence with deterministic idempotency and replay-safe retry behavior. +- Hard release gates pass for this scope: no dead-code warnings in Agent Trace/local-hooks production modules, deterministic tests for happy/failure/idempotency paths, and rollout docs/checklists are updated. + +## 3) Constraints and non-goals +- In scope: local Git-hook productionization for Agent Trace (`pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`), hook command wiring, local notes/DB persistence adapters, retry processing, doctor/setup alignment, and release validation artifacts. +- In scope: production decisions needed for local persistence/runtime policy (for example local DB path resolution policy, schema bootstrap timing, and hook runtime guard behavior). 
+- In scope: reducing current dead-code warnings by wiring currently isolated seams into executable production paths for this MVP slice. +- Out of scope: hosted webhook ingestion/orchestration and hosted reconciliation pipelines (T12+ equivalent behavior remains future scope). +- Out of scope: making `mcp` or cloud `sync` production-ready. +- Non-goal: broad architecture changes unrelated to local hooks attribution and persistence. + +## 4) Task stack (T01..T10) +- [x] T01: Freeze Local Hooks MVP production contract and gap matrix (status:done) + - Task ID: T01 + - Goal: Define the exact production MVP contract and map current seam-level implementation to missing runtime wiring/gates. + - Boundaries (in/out of scope): + - In: explicit local flow boundaries, required runtime guards, persistence policy decisions, and module ownership for hook runtime adapters. + - Out: implementing code paths; this is contract/gap finalization only. + - Done when: + - A current-state contract artifact captures Local Hooks MVP behavior and acceptance boundaries. + - A deterministic gap matrix lists each missing runtime piece needed to move from placeholder to functional behavior. + - Verification notes (commands or checks): + - Context review parity against `cli/src/services/{hooks,agent_trace,local_db,setup,doctor}.rs` and relevant context artifacts. + +- [ ] T02: Implement real `sce hooks` command routing and hook argument handling (status:todo) + - Task ID: T02 + - Goal: Replace placeholder-only hooks dispatch with concrete subcommand routing for `pre-commit`, `commit-msg`, `post-commit`, and `post-rewrite` execution. + - Boundaries (in/out of scope): + - In: parser/dispatch updates, deterministic error handling for invalid hook invocations, and wiring to concrete runtime handlers. + - Out: deep persistence logic internals (handled in later tasks). + - Done when: + - `sce hooks ` executes the corresponding production path instead of placeholder messaging. 
+ - Hook argument/STDIN contracts are validated and surfaced with actionable deterministic errors. + - Verification notes (commands or checks): + - `cargo test --manifest-path cli/Cargo.toml app::tests` + - Focused hook command-surface tests for valid/invalid hook invocations. + +- [ ] T03: Wire pre-commit runtime finalization to real staged attribution inputs (status:todo) + - Task ID: T03 + - Goal: Connect `finalize_pre_commit_checkpoint` to real runtime data collection and deterministic checkpoint persistence handoff. + - Boundaries (in/out of scope): + - In: runtime guard evaluation, staged/unstaged extraction integration, anchor capture, and finalized checkpoint handoff/store seam. + - Out: post-commit persistence and rewrite flow behavior. + - Done when: + - Pre-commit path produces staged-only finalized checkpoint artifacts for downstream commit binding. + - No-op guard outcomes remain explicit and test-covered. + - Verification notes (commands or checks): + - `cargo test --manifest-path cli/Cargo.toml pre_commit` + - End-to-end local repo fixture test proving unstaged ranges are excluded. + +- [ ] T04: Wire commit-msg hook file mutation to canonical co-author policy (status:todo) + - Task ID: T04 + - Goal: Connect `apply_commit_msg_coauthor_policy` to real commit message file IO in hook runtime with idempotent trailer handling. + - Boundaries (in/out of scope): + - In: commit message file read/transform/write flow, newline preservation, and policy gate wiring. + - Out: author identity rewriting or non-canonical trailer behavior. + - Done when: + - Commit-msg runtime mutates message files only when policy gates pass and preserves idempotency/newline semantics. + - Invalid message-file scenarios return deterministic actionable failures. + - Verification notes (commands or checks): + - `cargo test --manifest-path cli/Cargo.toml commit_msg_policy` + - Hook-runtime integration test with on-disk commit message fixture. 
+ +- [ ] T05: Implement post-commit production persistence adapters (notes + DB + ledger + queue) (status:todo) + - Task ID: T05 + - Goal: Connect `finalize_post_commit_trace` to concrete production adapters for notes writes, DB writes, emission ledger, and retry queue enqueue. + - Boundaries (in/out of scope): + - In: notes write adapter, DB write adapter, idempotency ledger storage behavior, fallback queue enqueue path, and runtime error classification mapping. + - Out: hosted reconciliation workflows. + - Done when: + - Post-commit path persists canonical records to both targets or deterministically enqueues failed-target fallback. + - Duplicate commit emission is prevented by ledger checks. + - Verification notes (commands or checks): + - `cargo test --manifest-path cli/Cargo.toml post_commit_finalization` + - Local integration test validating notes content type/ref and DB persistence parity. + +- [ ] T06: Productionize local DB runtime policy and schema bootstrap (status:todo) + - Task ID: T06 + - Goal: Establish and implement production local DB location/bootstrap policy for Linux and other supported local platforms, then wire schema migration lifecycle. + - Boundaries (in/out of scope): + - In: deterministic DB path policy, path creation/error handling, startup migration execution, and migration idempotency behavior. + - Out: hosted database/service infrastructure. + - Done when: + - Hook runtime uses a deterministic persistent DB target (not in-memory) for production paths. + - Core/reconciliation/retry schema migrations are automatically ensured before writes. + - Verification notes (commands or checks): + - `cargo test --manifest-path cli/Cargo.toml local_db::tests` + - Integration test proving persisted data survives process restart with configured local DB path. 
+ +- [ ] T07: Wire post-rewrite runtime flow (remap ingestion + rewrite trace finalization) (status:todo) + - Task ID: T07 + - Goal: Connect `post-rewrite` hook runtime input parsing and rewrite-method normalization to real remap ingestion and rewritten-trace emission paths. + - Boundaries (in/out of scope): + - In: old/new SHA pair input ingestion, rewrite method handling, confidence/quality mapping flow, and fallback queue behavior for rewritten traces. + - Out: hosted webhook event intake. + - Done when: + - Local amend/rebase rewrite scenarios emit deterministic remap ingestion requests and rewritten trace records. + - Malformed input and duplicate replay scenarios are deterministic and test-covered. + - Verification notes (commands or checks): + - `cargo test --manifest-path cli/Cargo.toml post_rewrite_finalization` + - `cargo test --manifest-path cli/Cargo.toml rewrite_trace_finalization` + +- [ ] T08: Wire retry replay processor into operational runtime and observability outputs (status:todo) + - Task ID: T08 + - Goal: Ensure retry queue processing is invokable in production local workflow with deterministic metrics emission and target-scoped recovery. + - Boundaries (in/out of scope): + - In: retry trigger strategy for local runtime, queue dequeue/requeue lifecycle, and metrics sink output integration. + - Out: external metrics backends beyond current local/runtime contract. + - Done when: + - Failed-target retries are processed and recovered/requeued as expected with emitted runtime/error metrics. + - Replay loops avoid same-pass duplicate processing for identical trace IDs. 
+ - Verification notes (commands or checks): + - `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_recovers_failed_notes_write_and_emits_success_metric` + - `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_requeues_when_db_write_still_fails` + +- [ ] T09: Hardening pass for production gates (warnings, docs, rollout/runbook) (status:todo) + - Task ID: T09 + - Goal: Satisfy hard release gates by eliminating dead-code warnings in MVP modules through real wiring, tightening failure diagnostics, and updating operator docs. + - Boundaries (in/out of scope): + - In: dead-code cleanup for Local Hooks MVP modules, CLI/help/readme/doctor/setup docs updates, and rollout checklist updates. + - Out: cleanup of unrelated placeholder domains not needed for this MVP release. + - Done when: + - `clippy` for the target crate no longer reports dead-code warnings for local hooks production modules. + - Operator docs clearly specify install, health checks, expected artifacts, and failure recovery workflow. + - Verification notes (commands or checks): + - `nix run ./cli#clippy` + - `cargo test --manifest-path cli/Cargo.toml` + - Documentation parity review across `cli/README.md` and context artifacts. + +- [ ] T10: Validation and cleanup (status:todo) + - Task ID: T10 + - Goal: Execute final end-to-end validation, evidence capture, artifact cleanup, and context sync verification for production-readiness signoff. + - Boundaries (in/out of scope): + - In: full verification suite, temporary artifact cleanup, and context/code alignment checks for changed behavior. + - Out: net-new feature additions after validation freeze. + - Done when: + - End-to-end local commit and rewrite flows pass with deterministic evidence for success and failure/retry scenarios. + - Required checks pass and context is synchronized to current behavior. + - Residual risks and deferred items are explicitly documented. 
+ - Verification notes (commands or checks): + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` + - `cargo build --manifest-path cli/Cargo.toml` + - `cargo test --manifest-path cli/Cargo.toml` + - `nix run ./cli#clippy` + - `nix run .#pkl-check-generated` + - `nix flake check` + +## 5) Open questions +- None. diff --git a/context/sce/agent-trace-hosted-event-intake-orchestration.md b/context/sce/agent-trace-hosted-event-intake-orchestration.md index eb16c7eb..779338e7 100644 --- a/context/sce/agent-trace-hosted-event-intake-orchestration.md +++ b/context/sce/agent-trace-hosted-event-intake-orchestration.md @@ -6,6 +6,12 @@ - Accepts hosted provider rewrite events and turns them into replay-safe reconciliation run requests. - Covers provider parsing/signature/idempotency intake only; mapping heuristics remain out of scope (`T13`). +## Deployment model + +- GitHub and GitLab webhook handling is owned by a hosted reconciliation server (not local per-repo `.git/hooks`). +- The hosted reconciliation server is the cross-machine synchronization point for attribution/rewrite reconciliation data coming from all user computers. +- Local hooks remain responsible for local capture/finalization, while hosted webhooks reconcile repository-history changes that must be synchronized across users. + ## Code ownership - Hosted intake service: `cli/src/services/hosted_reconciliation.rs`. 
diff --git a/context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md b/context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md new file mode 100644 index 00000000..8908c0ab --- /dev/null +++ b/context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md @@ -0,0 +1,90 @@ +# Agent Trace Local Hooks MVP Contract and Gap Matrix + +## Status +- Plan: `agent-trace-local-hooks-production-mvp` +- Task: `T01` +- Scope: contract and gap freeze only (no production code changes) +- Normative keywords: `MUST`, `SHOULD`, `MAY` + +## Objective +Freeze one implementation-ready contract for Local Hooks MVP productionization and map current code-truth seams to missing runtime wiring required by tasks `T02`..`T10`. + +## Local MVP boundary +- In scope: `sce hooks` runtime command flow for `pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`; local notes + DB persistence; retry replay; rollout readiness alignment with `sce setup --hooks` and `sce doctor`. +- In scope: local runtime guard behavior (`SCE_DISABLED`, CLI availability, bare-repo safety), deterministic idempotency, and actionable failure diagnostics. +- Out of scope: hosted webhook ingestion/reconciliation execution (`T12+` equivalent scope), MCP productionization, cloud sync productionization. + +## Production contract (frozen for T02..T10) + +### 1) Command/runtime entrypoints +- `sce hooks` MUST become an implemented command surface and MUST route to concrete hook subcommands: `pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`. +- Invalid hook invocations MUST return deterministic actionable usage errors. +- Hook runtime handlers MUST support Git hook argument/STDIN contracts without placeholder output. + +### 2) Pre-commit contract +- Runtime MUST collect staged-vs-unstaged attribution inputs and pass staged-only data into `finalize_pre_commit_checkpoint`. +- Runtime MUST capture index/head tree anchors and persist finalized checkpoint artifacts for downstream binding. 
+- Runtime MUST preserve explicit no-op outcomes for disabled, CLI-unavailable, and bare-repo states. + +### 3) Commit-msg contract +- Runtime MUST read, transform, and write the real commit message file path passed by Git. +- Runtime MUST apply canonical trailer policy only when gates pass (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, staged-attribution present). +- Runtime MUST preserve idempotency and newline semantics during file mutation. + +### 4) Post-commit contract +- Runtime MUST materialize canonical Agent Trace payloads through `build_trace_payload` and finalize via notes + DB dual-write adapters. +- Runtime MUST gate duplicate emission through commit-level ledger checks. +- Runtime MUST enqueue target-scoped retry entries when either notes or DB write fails. + +### 5) Post-rewrite contract +- Runtime MUST parse `post-rewrite` old/new SHA input, normalize rewrite method, and ingest deterministic remap requests. +- Runtime MUST emit rewritten-SHA trace finalization with rewrite metadata and confidence-to-quality mapping. +- Runtime MUST preserve idempotent replay behavior for duplicate rewrite events. + +### 6) Persistence and schema contract +- Local production runtime MUST use a deterministic persistent DB path policy (not in-memory). +- Schema bootstrap (`apply_core_schema_migrations`) MUST run before production write paths. +- Notes persistence target MUST remain `refs/notes/agent-trace` with content type `application/vnd.agent-trace.record+json`. + +### 7) Retry/observability contract +- Retry processor MUST be invokable in local production workflow and MUST recover only failed targets per entry. +- Retry processing MUST emit per-attempt runtime/error-class metrics. +- Same-pass duplicate trace processing MUST be prevented. + +### 8) Rollout/health contract +- `sce setup --hooks` remains canonical install path for required hook scripts. +- `sce doctor` remains canonical health/readiness validator for required hook files and executable state. 
+- Operator docs MUST describe install, health verification, expected artifacts, and recovery workflow. + +## Deterministic policy decisions frozen in T01 +- DB location policy target for production local writes: platform state-data location under `sce/agent-trace/local.db` (Linux baseline: `${XDG_STATE_HOME:-~/.local/state}/sce/agent-trace/local.db`); non-Linux follows equivalent per-user state-data root. +- Runtime failure posture for local hooks: fail-open for commit progression by default, while preserving retry-safe persistence intent and diagnostics. +- Idempotency unit for finalized local commit traces: one canonical finalized record per commit SHA. + +## Module ownership map (code truth) +- CLI command parsing/dispatch: `cli/src/app.rs` +- Command surface status/help text: `cli/src/command_surface.rs` +- Hook-domain contracts/finalizers/retry processor: `cli/src/services/hooks.rs` +- Agent Trace payload adapter/builder/schema contract: `cli/src/services/agent_trace.rs` +- Local DB connection/migrations/smoke helpers: `cli/src/services/local_db.rs` +- Hook installation orchestration: `cli/src/services/setup.rs` +- Hook readiness diagnostics: `cli/src/services/doctor.rs` + +## Gap matrix (current code truth -> required runtime completion) + +| MVP area | Current state (code truth) | Required completion target | Planned task(s) | +| --- | --- | --- | --- | +| `sce hooks` command routing | `hooks` command dispatches `run_placeholder_hooks()` and rejects extra args as plain subcommand extras (`cli/src/app.rs`, `cli/src/command_surface.rs`). | Implement concrete subcommand parser/dispatcher for `pre-commit`/`commit-msg`/`post-commit`/`post-rewrite` with deterministic errors and no placeholder messaging. | `T02` | +| Pre-commit runtime wiring | Finalizer exists but has no real Git runtime data collection or persistence handoff (`cli/src/services/hooks.rs`). 
| Add runtime staged-data collection, anchor capture from Git state, and finalized checkpoint storage handoff for downstream commit binding. | `T03` | +| Commit-msg file IO wiring | Policy transformer exists as pure string function only (`apply_commit_msg_coauthor_policy`). | Wire real commit message file read/transform/write flow with newline/idempotency guarantees and deterministic file-path failures. | `T04` | +| Post-commit persistence adapters | Finalizer contracts exist as traits/in-memory seams; no production adapters bound to git notes/local DB/ledger queue in command runtime path. | Implement production adapters for notes write, DB write, emission ledger, queue enqueue, and runtime error-class mapping. | `T05` | +| Local DB persistent runtime policy | DB helpers support in-memory/path targets and migrations, but no production path policy/bootstrap lifecycle wiring for hooks runtime. | Add deterministic persistent path resolution + directory creation and run schema migrations before write paths. | `T06` | +| Post-rewrite runtime orchestration | Remap and rewrite finalizers exist, but no implemented hook command path binds STDIN/method args to these flows. | Implement runtime ingestion of Git `post-rewrite` inputs and wire remap + rewritten-trace finalization path end to end. | `T07` | +| Retry replay operational trigger | Retry processor exists but not integrated into operational hook runtime trigger strategy. | Wire retry execution into local workflow with bounded batch processing and metrics outputs. | `T08` | +| Release hardening gates | Placeholder command/help text and dead-code suppression (`#[allow(dead_code)]` in local DB target enum) indicate incomplete production wiring. | Remove local-hooks-module dead-code warnings through real wiring, tighten diagnostics, and update operator docs/runbooks. | `T09` | +| End-to-end validation signoff | No full local commit/rewrite production evidence bundle yet for this MVP scope. 
| Execute full verification suite, capture deterministic evidence, clean temporary artifacts, and sync context to final code truth. | `T10` | + +## Acceptance checklist for T01 +- [x] One current-state contract artifact defines Local Hooks MVP production boundaries and behavioral requirements. +- [x] Deterministic gap matrix maps code-truth seams to remaining runtime work for `T02`..`T10`. +- [x] Ownership mapping is explicit for command/runtime/persistence/doctor/setup modules. From f02b2f4bdbcd394906d761fa4d73b1325ce62322 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 09:53:24 +0100 Subject: [PATCH 32/39] hooks: Implement sce hooks subcommand routing entrypoints Replace placeholder sce hooks dispatch with explicit pre-commit, commit-msg, post-commit, and post-rewrite subcommand parsing and execution paths. Add deterministic invocation validation and parser/runtime coverage tests so invalid hook usage fails with actionable errors while valid invocations route through the new runtime entrypoints. 
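Note on the `post-rewrite` STDIN contract: Git's hook documentation describes the input as one line per rewritten commit in the form `<old-sha> <new-sha> [<extra-info>]`. The following is a minimal sketch of parsing that input with malformed-line rejection and duplicate-pair skipping. It is not the body of `finalize_post_rewrite_remap`, which this patch does not show; `RewritePair`, `parse_rewrite_pairs`, and `is_hex_object_name` are hypothetical names used only for illustration.

```rust
use std::collections::HashSet;

/// One old -> new commit mapping reported by Git on `post-rewrite` STDIN.
/// Hypothetical type; the real request type in this patch is `RewriteRemapRequest`.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
struct RewritePair {
    old_sha: String,
    new_sha: String,
}

/// Accepts 40-character (SHA-1) or 64-character (SHA-256) hex object names.
fn is_hex_object_name(value: &str) -> bool {
    matches!(value.len(), 40 | 64) && value.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Hypothetical helper: parses `<old-sha> <new-sha> [<extra-info>]` lines.
/// Returns the unique pairs in input order plus the count of skipped duplicates.
/// Any malformed line fails the whole parse with a line-numbered message.
fn parse_rewrite_pairs(input: &str) -> Result<(Vec<RewritePair>, usize), String> {
    let mut seen = HashSet::new();
    let mut pairs = Vec::new();
    let mut skipped = 0;

    for (index, line) in input.lines().enumerate() {
        // Tolerate blank lines such as a trailing newline-only line.
        if line.trim().is_empty() {
            continue;
        }

        let mut fields = line.split_whitespace();
        let (Some(old_sha), Some(new_sha)) = (fields.next(), fields.next()) else {
            return Err(format!(
                "line {}: expected '<old-sha> <new-sha>'",
                index + 1
            ));
        };

        if !is_hex_object_name(old_sha) || !is_hex_object_name(new_sha) {
            return Err(format!("line {}: invalid object name", index + 1));
        }

        let pair = RewritePair {
            old_sha: old_sha.to_string(),
            new_sha: new_sha.to_string(),
        };

        // `insert` returns false when the identical pair was already seen.
        if seen.insert(pair.clone()) {
            pairs.push(pair);
        } else {
            skipped += 1;
        }
    }

    Ok((pairs, skipped))
}
```

Failing the whole parse on the first malformed line keeps the outcome the same for the same input, which is one way to read the "deterministic" requirement for malformed input; a real implementation might instead skip bad lines and report them.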
--- cli/src/app.rs | 63 +++++- cli/src/command_surface.rs | 6 +- cli/src/services/hooks.rs | 189 +++++++++++++++++- cli/src/services/hooks/tests.rs | 61 +++++- context/cli/placeholder-foundation.md | 9 +- context/context-map.md | 1 + context/glossary.md | 6 +- context/overview.md | 4 +- .../agent-trace-local-hooks-production-mvp.md | 2 +- .../sce/agent-trace-hooks-command-routing.md | 27 +++ 10 files changed, 340 insertions(+), 28 deletions(-) create mode 100644 context/sce/agent-trace-hooks-command-routing.md diff --git a/cli/src/app.rs b/cli/src/app.rs index e2b4a015..799903d0 100644 --- a/cli/src/app.rs +++ b/cli/src/app.rs @@ -12,7 +12,7 @@ enum Command { SetupHelp, Doctor, Mcp, - Hooks, + Hooks(services::hooks::HookSubcommand), Sync, } @@ -95,7 +95,7 @@ fn parse_subcommand(value: String, tail_args: Vec) -> Result { "setup" => parse_setup_subcommand(tail_args), "doctor" => parse_non_setup_subcommand(Command::Doctor, tail_args), "mcp" => parse_non_setup_subcommand(Command::Mcp, tail_args), - "hooks" => parse_non_setup_subcommand(Command::Hooks, tail_args), + "hooks" => parse_hooks_subcommand(tail_args), "sync" => parse_non_setup_subcommand(Command::Sync, tail_args), _ => { if command_surface::is_known_command(&value) { @@ -142,6 +142,11 @@ fn parse_non_setup_subcommand(command: Command, tail_args: Vec) -> Resul ); } +fn parse_hooks_subcommand(args: Vec) -> Result { + let subcommand = services::hooks::parse_hooks_subcommand(args)?; + Ok(Command::Hooks(subcommand)) +} + fn dispatch(command: Command) -> Result<()> { match command { Command::Help => println!("{}", command_surface::help_text()), @@ -174,7 +179,9 @@ fn dispatch(command: Command) -> Result<()> { Command::SetupHelp => println!("{}", services::setup::setup_usage_text()), Command::Doctor => println!("{}", services::doctor::run_doctor()?), Command::Mcp => println!("{}", services::mcp::run_placeholder_mcp()?), - Command::Hooks => println!("{}", services::hooks::run_placeholder_hooks()?), + 
Command::Hooks(subcommand) => { + println!("{}", services::hooks::run_hooks_subcommand(subcommand)?) + } Command::Sync => println!("{}", services::sync::run_placeholder_sync()?), } @@ -196,9 +203,9 @@ mod tests { } #[test] - fn hooks_command_exits_success() { + fn hooks_command_without_subcommand_exits_non_zero() { let code = run(vec!["sce".to_string(), "hooks".to_string()]); - assert_eq!(code, ExitCode::SUCCESS); + assert_eq!(code, ExitCode::from(2)); } #[test] @@ -236,10 +243,48 @@ mod tests { } #[test] - fn parser_routes_placeholder_command() { - let command = parse_command(vec!["sce".to_string(), "hooks".to_string()]) - .expect("command should parse"); - assert_eq!(command, Command::Hooks); + fn parser_routes_hooks_pre_commit_subcommand() { + let command = parse_command(vec![ + "sce".to_string(), + "hooks".to_string(), + "pre-commit".to_string(), + ]) + .expect("command should parse"); + assert_eq!( + command, + Command::Hooks(crate::services::hooks::HookSubcommand::PreCommit) + ); + } + + #[test] + fn parser_routes_hooks_commit_msg_subcommand_with_path() { + let command = parse_command(vec![ + "sce".to_string(), + "hooks".to_string(), + "commit-msg".to_string(), + ".git/COMMIT_EDITMSG".to_string(), + ]) + .expect("command should parse"); + assert_eq!( + command, + Command::Hooks(crate::services::hooks::HookSubcommand::CommitMsg { + message_file: std::path::PathBuf::from(".git/COMMIT_EDITMSG"), + }) + ); + } + + #[test] + fn parser_rejects_hooks_unknown_subcommand() { + let error = parse_command(vec![ + "sce".to_string(), + "hooks".to_string(), + "unknown".to_string(), + ]) + .expect_err("unknown hook subcommand should fail"); + assert_eq!( + error.to_string(), + "Unknown hook subcommand 'unknown'. Run 'sce hooks --help' to see valid usage." 
+ ); } #[test] diff --git a/cli/src/command_surface.rs b/cli/src/command_surface.rs index 83b82cd2..7cca2c46 100644 --- a/cli/src/command_surface.rs +++ b/cli/src/command_surface.rs @@ -36,8 +36,8 @@ pub const COMMANDS: &[CommandContract] = &[ }, CommandContract { name: services::hooks::NAME, - status: ImplementationStatus::Placeholder, - purpose: "Manage git-hook listener and generated-region awareness", + status: ImplementationStatus::Implemented, + purpose: "Run git-hook runtime entrypoints for local Agent Trace flows", }, CommandContract { name: services::sync::NAME, @@ -71,7 +71,7 @@ Setup usage:\n sce setup [--opencode|--claude|--both]\n sce setup --hooks [--r Commands:\n{}\n\n\ Setup defaults to interactive target selection when no setup target flag is passed.\n\ Use '--hooks' to install required git hooks for the current repository or '--repo ' for a specific repository.\n\ -`setup` and `doctor` are implemented; `mcp`, `hooks`, and `sync` remain placeholder-oriented.\n", +`setup`, `doctor`, and `hooks` are implemented; `mcp` and `sync` remain placeholder-oriented.\n", command_rows ) } diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 33dd0aaa..b90cdebc 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -1,5 +1,8 @@ -use anyhow::Result; +use anyhow::{bail, Context, Result}; use std::collections::HashSet; +use std::fs; +use std::path::PathBuf; +use std::str::FromStr; use std::time::Instant; use crate::services::agent_trace::{ @@ -11,6 +14,190 @@ pub const NAME: &str = "hooks"; pub const CANONICAL_SCE_COAUTHOR_TRAILER: &str = "Co-authored-by: SCE "; pub const POST_COMMIT_PARENT_SHA_METADATA_KEY: &str = "dev.crocoder.sce.parent_revision"; +#[derive(Clone, Debug, Eq, PartialEq)] +pub enum HookSubcommand { + PreCommit, + CommitMsg { message_file: PathBuf }, + PostCommit, + PostRewrite { rewrite_method: String }, +} + +pub fn hooks_usage_text() -> &'static str { + "Usage:\n sce hooks pre-commit\n sce hooks commit-msg 
\n sce hooks post-commit\n sce hooks post-rewrite \n\nGit executes hook scripts with these subcommands. `post-rewrite` reads rewrite pairs from STDIN." +} + +pub fn parse_hooks_subcommand(args: Vec) -> Result { + if args.is_empty() { + bail!("Missing hook subcommand. Run 'sce hooks --help' to see valid usage."); + } + + if args.len() == 1 && (args[0] == "--help" || args[0] == "-h") { + bail!("{}", hooks_usage_text()); + } + + match args[0].as_str() { + "pre-commit" => { + ensure_no_extra_hook_args("pre-commit", &args[1..])?; + Ok(HookSubcommand::PreCommit) + } + "commit-msg" => { + if args.len() < 2 { + bail!( + "Missing required argument '' for 'commit-msg'. Run 'sce hooks --help' to see valid usage." + ); + } + + if args.len() > 2 { + bail!( + "Unexpected extra argument '{}' for 'commit-msg'. Run 'sce hooks --help' to see valid usage.", + args[2] + ); + } + + Ok(HookSubcommand::CommitMsg { + message_file: PathBuf::from_str(&args[1])?, + }) + } + "post-commit" => { + ensure_no_extra_hook_args("post-commit", &args[1..])?; + Ok(HookSubcommand::PostCommit) + } + "post-rewrite" => { + if args.len() < 2 { + bail!( + "Missing required argument '' for 'post-rewrite'. Run 'sce hooks --help' to see valid usage." + ); + } + + if args.len() > 2 { + bail!( + "Unexpected extra argument '{}' for 'post-rewrite'. Run 'sce hooks --help' to see valid usage.", + args[2] + ); + } + + Ok(HookSubcommand::PostRewrite { + rewrite_method: args[1].clone(), + }) + } + unknown => bail!( + "Unknown hook subcommand '{}'. Run 'sce hooks --help' to see valid usage.", + unknown + ), + } +} + +fn ensure_no_extra_hook_args(hook: &str, args: &[String]) -> Result<()> { + if args.is_empty() { + return Ok(()); + } + + bail!( + "Unexpected extra argument '{}' for '{}'. 
Run 'sce hooks --help' to see valid usage.", + args[0], + hook + ) +} + +pub fn run_hooks_subcommand(subcommand: HookSubcommand) -> Result<String> { + match subcommand { + HookSubcommand::PreCommit => run_pre_commit_subcommand(), + HookSubcommand::CommitMsg { message_file } => run_commit_msg_subcommand(message_file), + HookSubcommand::PostCommit => run_post_commit_subcommand(), + HookSubcommand::PostRewrite { rewrite_method } => { + run_post_rewrite_subcommand(&rewrite_method) + } + } +} + +fn run_pre_commit_subcommand() -> Result<String> { + let outcome = finalize_pre_commit_checkpoint( + &PreCommitRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + }, + PreCommitTreeAnchors { + index_tree: "pending-index-tree".to_string(), + head_tree: None, + }, + PendingCheckpoint { files: Vec::new() }, + ); + + let message = match outcome { + PreCommitFinalization::NoOp(reason) => { + format!("pre-commit hook executed with no-op runtime state: {reason:?}") + } + PreCommitFinalization::Finalized(checkpoint) => format!( + "pre-commit hook executed and finalized staged checkpoint for {} file(s).", + checkpoint.files.len() + ), + }; + + Ok(message) +} + +fn run_commit_msg_subcommand(message_file: PathBuf) -> Result<String> { + let metadata = fs::metadata(&message_file).with_context(|| { + format!( + "Invalid commit message file '{}': file does not exist or is not readable.", + message_file.display() + ) + })?; + + if !metadata.is_file() { + bail!( + "Invalid commit message file '{}': expected a regular file path.", + message_file.display() + ); + } + + Ok(format!( + "commit-msg hook accepted message file '{}'.", + message_file.display() + )) +} + +fn run_post_commit_subcommand() -> Result<String> { + Ok("post-commit hook accepted runtime invocation.".to_string()) +} + +fn run_post_rewrite_subcommand(rewrite_method: &str) -> Result<String> { + let stdin = std::io::read_to_string(std::io::stdin()) + .context("Failed to read post-rewrite pair input from STDIN")?; + let mut ingestion = 
AcceptAllRewriteRemapIngestion; + let outcome = finalize_post_rewrite_remap( + &PostRewriteRuntimeState { + sce_disabled: false, + cli_available: true, + is_bare_repo: false, + }, + rewrite_method, + &stdin, + &mut ingestion, + )?; + + match outcome { + PostRewriteFinalization::NoOp(reason) => Ok(format!( + "post-rewrite hook executed with no-op runtime state: {reason:?}" + )), + PostRewriteFinalization::Ingested(ingested) => Ok(format!( + "post-rewrite hook ingested {} pair(s), skipped {} duplicate pair(s), method='{}'.", + ingested.ingested_pairs, + ingested.skipped_pairs, + ingested.rewrite_method.canonical_label() + )), + } +} + +struct AcceptAllRewriteRemapIngestion; + +impl RewriteRemapIngestion for AcceptAllRewriteRemapIngestion { + fn ingest(&mut self, _request: RewriteRemapRequest) -> Result { + Ok(true) + } +} + #[derive(Clone, Debug, Eq, PartialEq)] pub struct PreCommitRuntimeState { pub sce_disabled: bool, diff --git a/cli/src/services/hooks/tests.rs b/cli/src/services/hooks/tests.rs index 6e631115..f59f2daf 100644 --- a/cli/src/services/hooks/tests.rs +++ b/cli/src/services/hooks/tests.rs @@ -8,12 +8,13 @@ use crate::services::agent_trace::{ use super::{ apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, - finalize_pre_commit_checkpoint, finalize_rewrite_trace, process_trace_retry_queue, - run_placeholder_hooks, CommitMsgRuntimeState, GeneratedRegionEvent, GeneratedRegionLifecycle, - GitHookKind, HookEvent, HookService, PendingCheckpoint, PendingFileCheckpoint, - PendingLineRange, PersistenceErrorClass, PersistenceFailure, PersistenceTarget, - PersistenceWriteResult, PlaceholderHookService, PostCommitFinalization, PostCommitInput, - PostCommitNoOpReason, PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, + finalize_pre_commit_checkpoint, finalize_rewrite_trace, parse_hooks_subcommand, + process_trace_retry_queue, run_hooks_subcommand, run_placeholder_hooks, CommitMsgRuntimeState, + 
GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, + HookSubcommand, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, + PersistenceErrorClass, PersistenceFailure, PersistenceTarget, PersistenceWriteResult, + PlaceholderHookService, PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, + PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, RetryMetricsSink, RetryProcessingMetric, RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, RewriteTraceFinalization, RewriteTraceInput, @@ -864,3 +865,51 @@ fn hooks_placeholder_accepts_generated_region_events() -> Result<()> { service.record(event) } + +#[test] +fn parse_hooks_subcommand_routes_pre_commit() -> Result<()> { + let parsed = parse_hooks_subcommand(vec!["pre-commit".to_string()])?; + assert_eq!(parsed, HookSubcommand::PreCommit); + Ok(()) +} + +#[test] +fn parse_hooks_subcommand_rejects_missing_hook_name() { + let error = parse_hooks_subcommand(Vec::new()) + .expect_err("missing hook subcommand should return usage error"); + assert_eq!( + error.to_string(), + "Missing hook subcommand. Run 'sce hooks --help' to see valid usage." + ); +} + +#[test] +fn parse_hooks_subcommand_requires_commit_msg_path() { + let error = parse_hooks_subcommand(vec!["commit-msg".to_string()]) + .expect_err("commit-msg requires "); + assert_eq!( + error.to_string(), + "Missing required argument '' for 'commit-msg'. Run 'sce hooks --help' to see valid usage." 
+ ); +} + +#[test] +fn run_hooks_subcommand_commit_msg_rejects_missing_file() { + let missing = std::env::temp_dir().join(format!( + "sce-hooks-missing-{}-{}.msg", + std::process::id(), + "nope" + )); + let error = run_hooks_subcommand(HookSubcommand::CommitMsg { + message_file: missing.clone(), + }) + .expect_err("missing commit message file should fail deterministically"); + + assert_eq!( + error.to_string(), + format!( + "Invalid commit message file '{}': file does not exist or is not readable.", + missing.display() + ) + ); +} diff --git a/context/cli/placeholder-foundation.md b/context/cli/placeholder-foundation.md index 29d6d049..2f919179 100644 --- a/context/cli/placeholder-foundation.md +++ b/context/cli/placeholder-foundation.md @@ -44,11 +44,12 @@ The repository now includes a Rust CLI crate at `cli/` for SCE automation work. - `setup`: implemented - `doctor`: implemented - `mcp`: placeholder -- `hooks`: placeholder +- `hooks`: implemented - `sync`: placeholder Placeholder commands currently acknowledge planned behavior and do not claim production implementation. -`mcp`, `hooks`, and `sync` route through explicit service-contract placeholders. +`mcp` and `sync` route through explicit service-contract placeholders. +`hooks` routes through implemented subcommand parsing/dispatch for `pre-commit`, `commit-msg`, `post-commit`, and `post-rewrite`. `setup` defaults to an `inquire` interactive target selection (OpenCode, Claude, Both) and accepts mutually-exclusive non-interactive target flags (`--opencode`, `--claude`, `--both`). `setup` now also exposes compile-time embedded config assets for OpenCode/Claude targets, sourced from `config/.opencode/**` and `config/.claude/**` via `cli/build.rs` with normalized forward-slash relative paths and target-scoped iteration APIs. 
`setup` additionally includes a repository-root install engine (`install_embedded_setup_assets`) that stages embedded files and applies backup-and-replace safety for `.opencode/`/`.claude/` with rollback restoration if staged swap fails. @@ -67,7 +68,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro - `setup`: `Setup completed successfully.` plus selected targets, per-target install destinations/counts, and backup status lines. - `doctor`: `SCE doctor: ready|not ready` plus hook-path source, required hook checks, and actionable diagnostics. - `TODO: 'mcp' is planned and not implemented yet. MCP file-cache surface defines 2 placeholder tool contract(s) with max 1024 entries.` - - `TODO: 'hooks' is planned and not implemented yet. Hook event model reserves 3 git hook(s) with generated-region tracking placeholders, staged-only pre-commit checkpoint preview over 1 file(s), and commit-msg canonical trailer preview applied=true.` + - `hooks`: deterministic hook subcommand status messaging for runtime entrypoint invocation and argument/STDIN contract validation. - `TODO: 'sync' cloud workflows are planned and not implemented yet. Local Turso smoke check succeeded (1) row inserted; cloud sync placeholder enumerates 3 phase(s) and plan holds 3 checkpoint(s).` ## Service contracts @@ -76,7 +77,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) with path-source detection (default/local/global) and required-hook presence/executable checks. - `cli/src/services/agent_trace.rs` defines the task-scoped schema adapter contract (`adapt_trace_payload`) from internal attribution input structs to Agent Trace-shaped record structs, including fixed git `vcs` mapping, contributor type mapping, and reserved `dev.crocoder.sce.*` metadata placement. 
- `cli/src/services/mcp.rs` defines `McpService`, a `McpCapabilitySnapshot` model (primary + supported transports), and `CachePolicy` defaults for future file-cache workflows (`cache-put`/`cache-get`) with `runnable: false` placeholders. -- `cli/src/services/hooks.rs` defines `HookService` plus hook-event/generated-region event placeholders (`HookEventModel`, `HookEvent`, `GeneratedRegionEvent`) and keeps placeholder recording path compile-safe by consuming hook/lifecycle variants without enabling production hook actions. +- `cli/src/services/hooks.rs` defines hook runtime command parsing/dispatch (`HookSubcommand`, `parse_hooks_subcommand`, `run_hooks_subcommand`) and retains hook-event/generated-region placeholder contracts (`HookEventModel`, `HookEvent`, `GeneratedRegionEvent`) for future listener-oriented integrations. - `cli/src/services/sync.rs` defines cloud-sync abstraction points (`CloudSyncGateway`, `CloudSyncRequest`, `CloudSyncPlan`) layered after the local Turso smoke gate. - `cli/src/app.rs` dispatches `setup`, `doctor`, `mcp`, and `hooks` through service-level modules so runtime messages are sourced from domain modules instead of inline strings. 
diff --git a/context/context-map.md b/context/context-map.md index b32e7126..8c0ecc2b 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -35,6 +35,7 @@ Feature/domain context: - `context/sce/agent-trace-rewrite-mapping-engine.md` (T13 hosted rewrite mapping engine contract with patch-id exact precedence, range-diff/fuzzy scoring, and deterministic unresolved outcomes) - `context/sce/agent-trace-retry-queue-observability.md` (T14 retry queue recovery contract plus reconciliation/runtime observability metrics and DB-first queue schema additions) - `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` (T01 Local Hooks MVP production contract freeze and deterministic gap matrix for `agent-trace-local-hooks-production-mvp`) +- `context/sce/agent-trace-hooks-command-routing.md` (T02 implemented `sce hooks` command routing, subcommand contracts, and deterministic invocation validation behavior) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 58b3b224..a37c7c84 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -18,9 +18,9 @@ - `cli cargo local install contract`: Supported local CLI install command `cargo install --path cli --locked`, aligned with deterministic lockfile use for reproducible installs. - `cli crates.io readiness policy`: Current Cargo package posture in `cli/Cargo.toml` where crates.io-facing metadata is maintained but `publish = false` remains set until first-publish prerequisites are explicitly approved. - `root-to-cli flake input coherence`: Root `flake.nix` contract that forwards `nixpkgs`, `flake-utils`, and `rust-overlay` to the nested `cli` path input (`cli.inputs..follows`) so `nix flake check` can evaluate nested CLI outputs without missing-input failures. -- `sce` (CLI foundation): Rust binary crate at `cli/` with implemented setup installation flow and placeholder behavior for other command domains. 
+- `sce` (CLI foundation): Rust binary crate at `cli/` with implemented setup installation flow, implemented `hooks` subcommand routing/validation entrypoints, and placeholder behavior for `mcp` and `sync`. - `command surface contract`: The static command catalog in `cli/src/command_surface.rs` that marks each top-level command as `implemented` or `placeholder`. -- `command loop`: The `lexopt` parser + dispatcher in `cli/src/app.rs` that routes `help`, `setup`, `mcp`, `hooks`, and `sync`, executes setup installation, emits TODO placeholders for non-implemented commands, and returns deterministic actionable errors for invalid invocation. +- `command loop`: The `lexopt` parser + dispatcher in `cli/src/app.rs` that routes `help`, `setup`, `doctor`, `mcp`, `hooks`, and `sync`, executes implemented setup/doctor/hooks flows, emits TODO placeholders for non-implemented commands, and returns deterministic actionable errors for invalid invocation. - `sce dependency contract`: Minimal crate dependency baseline declared in `cli/Cargo.toml` and referenced via `cli/src/dependency_contract.rs` (`anyhow`, `hmac`, `inquire`, `lexopt`, `serde_json`, `sha2`, `tokio`, `turso`). - `local Turso adapter`: Async data-layer module in `cli/src/services/local_db.rs` that initializes local DB targets with `turso::Builder::new_local(...)` and runs execute/query smoke checks. - `sync Turso smoke gate`: Behavior in `cli/src/services/sync.rs` where the `sync` placeholder command runs an in-memory local Turso smoke check under a lazily initialized shared tokio current-thread runtime before returning placeholder cloud-sync messaging. @@ -36,7 +36,7 @@ - `setup install engine`: Installer in `cli/src/services/setup.rs` (`install_embedded_setup_assets`) that writes embedded setup assets into per-target staging directories and swaps them into repository-root `.opencode/`/`.claude/` destinations. 
- `setup backup-and-replace`: Replacement choreography in `cli/src/services/setup.rs` where existing install targets are renamed to unique `.backup` paths before staged content is promoted; on swap failure, the engine restores the original target from backup and cleans temporary staging paths. - `MCP capability snapshot`: Placeholder capability model in `cli/src/services/mcp.rs` that captures planned file-cache transport/tool contracts (`cache-put`, `cache-get`) and cache policy defaults without enabling runtime MCP execution. -- `hook event model placeholder`: Contract set in `cli/src/services/hooks.rs` defining git-hook event envelopes and generated-region lifecycle placeholders for future listener integration. +- `hooks command routing contract` (T02): Implemented hook command parser/dispatcher in `cli/src/services/hooks.rs` (`HookSubcommand`, `parse_hooks_subcommand`, `run_hooks_subcommand`) that supports `pre-commit`, `commit-msg `, `post-commit`, and `post-rewrite ` with deterministic invocation validation and usage errors. - `cloud sync gateway placeholder`: Abstraction in `cli/src/services/sync.rs` (`CloudSyncGateway`) that returns deferred cloud-sync checkpoints while `sync` remains non-production. - `sce CLI onboarding guide`: Crate-local documentation at `cli/README.md` that defines runnable placeholder commands, non-goals/safety limits, and roadmap mapping to service modules. - `plan/code overlap map`: Context artifact at `context/sce/plan-code-overlap-map.md` that classifies Shared Context Plan/Code, `/change-to-plan`, `/next-task`, `/commit`, and core skills into role-specific vs shared-reusable instruction blocks with explicit dedup targets. 
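Reviewer note on the `hooks command routing contract` glossary entry above: the following is a minimal sketch of how a parser of this kind could map the arguments after `sce hooks` onto a `HookSubcommand` value and reject malformed invocations with an error that points to `sce hooks --help`. The variant fields (`message_file`, `rewrite_method`), the `&[&str]` argument shape, the `String` error type, and the error wording are assumptions made for illustration only; the real `parse_hooks_subcommand` in `cli/src/services/hooks.rs` is not shown in this part of the series and may differ.

```rust
use std::path::PathBuf;

/// Hypothetical mirror of the `HookSubcommand` enum named in the glossary entry.
/// Field names are illustrative assumptions, not taken from the patch.
#[derive(Clone, Debug, Eq, PartialEq)]
pub enum HookSubcommand {
    PreCommit,
    CommitMsg { message_file: PathBuf },
    PostCommit,
    PostRewrite { rewrite_method: String },
}

/// Parse the arguments that follow `sce hooks` (sketch, not the shipped parser).
/// Every error message ends with the same hint so failures stay deterministic.
pub fn parse_hooks_subcommand(args: &[&str]) -> Result<HookSubcommand, String> {
    const HELP_HINT: &str = "Run 'sce hooks --help' for usage.";

    let (name, rest) = args
        .split_first()
        .ok_or_else(|| format!("Missing hook subcommand. {HELP_HINT}"))?;

    match (*name, rest) {
        // Hooks that take no positional arguments.
        ("pre-commit", []) => Ok(HookSubcommand::PreCommit),
        ("post-commit", []) => Ok(HookSubcommand::PostCommit),
        // Hooks that take exactly one positional argument.
        ("commit-msg", [file]) => Ok(HookSubcommand::CommitMsg {
            message_file: PathBuf::from(*file),
        }),
        ("post-rewrite", [method]) => Ok(HookSubcommand::PostRewrite {
            rewrite_method: (*method).to_string(),
        }),
        // Known subcommand, wrong number of arguments.
        ("pre-commit" | "post-commit" | "commit-msg" | "post-rewrite", _) => {
            Err(format!("Invalid arguments for '{name}'. {HELP_HINT}"))
        }
        // Anything else is an unknown subcommand.
        (other, _) => Err(format!("Unknown hook subcommand '{other}'. {HELP_HINT}")),
    }
}
```

Matching on the pair of subcommand name and remaining arguments keeps arity checks next to the routing table, which is one simple way to get the deterministic usage errors the glossary entry describes.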
diff --git a/context/overview.md b/context/overview.md index 9ad897bb..ddd8c617 100644 --- a/context/overview.md +++ b/context/overview.md @@ -6,7 +6,7 @@ It also includes an early Rust CLI foundation at `cli/` for Shared Context Engin The crate ships onboarding and usage documentation at `cli/README.md` that reflects current implemented vs placeholder behavior. The CLI crate currently enforces a minimal dependency contract: `anyhow`, `hmac`, `inquire`, `lexopt`, `serde_json`, `sha2`, `tokio`, and `turso`. -Its command loop is implemented with `lexopt` argument parsing and `anyhow` error handling, with real setup orchestration, implemented `doctor` rollout validation, and placeholder dispatch for deferred commands through explicit service contracts. +Its command loop is implemented with `lexopt` argument parsing and `anyhow` error handling, with real setup orchestration, implemented `doctor` rollout validation, implemented `hooks` subcommand routing/validation entrypoints, and placeholder dispatch for deferred commands (`mcp`, `sync`) through explicit service contracts. The `setup` command includes an `inquire`-backed target-selection flow: default interactive selection for OpenCode/Claude/both, explicit non-interactive target flags (`--opencode`, `--claude`, `--both`), deterministic mutually-exclusive validation, and non-destructive cancellation exits. The CLI now compiles an embedded setup asset manifest from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**` via `cli/build.rs`; `cli/src/services/setup.rs` exposes deterministic normalized relative paths plus file bytes and target-scoped iteration without runtime reads from `config/`. The setup service also provides repository-root install orchestration: it resolves interactive or flag-based target selection, installs embedded assets, and reports deterministic completion details (selected target(s), installed file counts, and backup actions). 
@@ -37,6 +37,7 @@ The local DB service now also includes reconciliation persistence schema coverag The CLI now also includes a hosted event intake/orchestration seam in `cli/src/services/hosted_reconciliation.rs` that verifies provider signatures, resolves old/new commit heads from GitHub/GitLab payloads, and creates deterministic replay-safe reconciliation run requests; this behavior is documented in `context/sce/agent-trace-hosted-event-intake-orchestration.md`. The hosted reconciliation service now also includes a deterministic rewrite mapping engine (`map_rewritten_commit`) that resolves old->new commit identity using patch-id exact precedence, then range-diff hints, then fuzzy fallback with a `>= 0.60` mapping threshold and explicit ambiguous/unmatched/low-confidence unresolved outcomes; this behavior is documented in `context/sce/agent-trace-rewrite-mapping-engine.md`. The hooks service now also includes retry-queue replay processing (`process_trace_retry_queue`) with per-attempt runtime/error-class metric emission, and the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. +The hooks command surface now also supports concrete runtime subcommand routing (`pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`) with deterministic argument and STDIN contract validation owned by `cli/src/services/hooks.rs`; this behavior is documented in `context/sce/agent-trace-hooks-command-routing.md`. 
The setup service now also exposes deterministic required-hook embedded asset accessors (`iter_required_hook_assets`, `get_required_hook_asset`) backed by canonical templates in `cli/assets/hooks/` for `pre-commit`, `commit-msg`, and `post-commit`; this behavior is documented in `context/sce/setup-githooks-hook-asset-packaging.md`. The setup service now also includes required-hook install orchestration (`install_required_git_hooks`) that resolves repository root and effective hooks path from git truth, enforces deterministic per-hook outcomes (`Installed`/`Updated`/`Skipped`), and performs backup-and-restore rollback on swap failures; this behavior is documented in `context/sce/setup-githooks-install-flow.md`. The setup command parser/dispatch now also supports `sce setup --hooks` with optional `--repo `, enforces deterministic compatibility validation (`--repo` requires `--hooks`; `--hooks` incompatible with setup target flags), and emits deterministic per-hook setup outcome messaging (`installed`/`updated`/`skipped` with backup status); this behavior is documented in `context/sce/setup-githooks-cli-ux.md`. @@ -105,6 +106,7 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-rewrite-mapping-engine.md` for the implemented T13 hosted mapping engine contract (patch-id exact matching, range-diff/fuzzy scoring precedence, confidence thresholds, and deterministic unresolved handling). - Use `context/sce/agent-trace-retry-queue-observability.md` for the implemented T14 retry replay contract (notes/DB target-scoped recovery, per-attempt runtime/error-class metrics, reconciliation mapped/unmapped + confidence histogram snapshots, and DB-first retry/metrics schema additions). 
- Use `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` for the frozen T01 Local Hooks MVP production contract and deterministic gap matrix that maps current seam-level code truth to the remaining implementation stack (`T02`..`T10`). +- Use `context/sce/agent-trace-hooks-command-routing.md` for the implemented T02 `sce hooks` command routing contract (subcommand parsing, deterministic invocation errors, and initial runtime entrypoint behavior). - Use `context/sce/setup-githooks-hook-asset-packaging.md` for the implemented `sce-setup-githooks-any-repo` T02 compile-time hook-template packaging contract and setup-service required-hook embedded accessor surface. - Use `context/sce/setup-githooks-install-flow.md` for the implemented `sce-setup-githooks-any-repo` T03 required-hook install orchestration contract (git-truth hooks-path resolution, per-hook installed/updated/skipped outcomes, and backup/rollback behavior). - Use `context/sce/setup-githooks-cli-ux.md` for the implemented `sce-setup-githooks-any-repo` T04 setup command-surface contract (`--hooks`, optional `--repo`), compatibility validation rules, and deterministic hook setup messaging. diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index c7a365e3..5bba867f 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -29,7 +29,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - Verification notes (commands or checks): - Context review parity against `cli/src/services/{hooks,agent_trace,local_db,setup,doctor}.rs` and relevant context artifacts. 
-- [ ] T02: Implement real `sce hooks` command routing and hook argument handling (status:todo) +- [x] T02: Implement real `sce hooks` command routing and hook argument handling (status:done) - Task ID: T02 - Goal: Replace placeholder-only hooks dispatch with concrete subcommand routing for `pre-commit`, `commit-msg`, `post-commit`, and `post-rewrite` execution. - Boundaries (in/out of scope): diff --git a/context/sce/agent-trace-hooks-command-routing.md b/context/sce/agent-trace-hooks-command-routing.md new file mode 100644 index 00000000..d6ee4090 --- /dev/null +++ b/context/sce/agent-trace-hooks-command-routing.md @@ -0,0 +1,27 @@ +# Agent Trace Hooks Command Routing + +## Scope +- Plan: `agent-trace-local-hooks-production-mvp` +- Task: `T02` +- Focus: implemented `sce hooks` subcommand routing and hook invocation contract validation. + +## Implemented command surface +- `sce hooks pre-commit` +- `sce hooks commit-msg ` +- `sce hooks post-commit` +- `sce hooks post-rewrite ` (reads rewrite pairs from STDIN) + +## Parser and dispatch behavior +- `cli/src/app.rs` routes `hooks` through dedicated hook-subcommand parsing instead of generic no-arg subcommand parsing. +- `cli/src/services/hooks.rs` now owns hook CLI usage text, deterministic parse errors, and runtime dispatch through `HookSubcommand` + `run_hooks_subcommand`. +- Invalid and ambiguous invocations return deterministic actionable errors pointing to `sce hooks --help`. + +## Current runtime entrypoint behavior +- `pre-commit`: executes the pre-commit runtime entrypoint and reports staged-checkpoint finalization outcome. +- `commit-msg`: validates that `` exists and is a regular file before accepting invocation. +- `post-commit`: accepts runtime invocation through implemented dispatch entrypoint. +- `post-rewrite`: reads hook pair input from STDIN, validates pair format through remap finalization parsing, and reports ingested/skipped outcomes. 
+ +## Notes for next tasks +- T02 implements routing and invocation contracts only. +- Deep runtime wiring for staged attribution, commit-msg mutation, post-commit persistence adapters, and rewrite-trace persistence remains in `T03+`. From 4edb40050352e866949c0ddeb7a6fed00fb09eb0 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 10:35:12 +0100 Subject: [PATCH 33/39] hooks: Implement pre-commit runtime staged attribution checkpoint persistence Resolve pre-commit runtime state from real repository signals, collect staged vs unstaged unified-diff ranges, and finalize checkpoints from live git anchors. Persist finalized staged-only attribution to the git-resolved pre-commit checkpoint artifact and keep runtime behavior fail-open on collection/persist errors. --- cli/src/services/hooks.rs | 378 +++++++++++++++++- cli/src/services/hooks/tests.rs | 79 +++- .../agent-trace-local-hooks-production-mvp.md | 2 +- ...gent-trace-pre-commit-staged-checkpoint.md | 17 + 4 files changed, 456 insertions(+), 20 deletions(-) diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index b90cdebc..77c6b845 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -1,7 +1,8 @@ use anyhow::{bail, Context, Result}; -use std::collections::HashSet; +use std::collections::{BTreeMap, HashSet}; use std::fs; -use std::path::PathBuf; +use std::path::{Path, PathBuf}; +use std::process::Command; use std::str::FromStr; use std::time::Instant; @@ -13,6 +14,7 @@ use crate::services::agent_trace::{ pub const NAME: &str = "hooks"; pub const CANONICAL_SCE_COAUTHOR_TRAILER: &str = "Co-authored-by: SCE "; pub const POST_COMMIT_PARENT_SHA_METADATA_KEY: &str = "dev.crocoder.sce.parent_revision"; +const PRE_COMMIT_CHECKPOINT_GIT_PATH: &str = "sce/pre-commit-checkpoint.json"; #[derive(Clone, Debug, Eq, PartialEq)] pub enum HookSubcommand { @@ -111,32 +113,372 @@ pub fn run_hooks_subcommand(subcommand: HookSubcommand) -> Result { } fn run_pre_commit_subcommand() -> 
Result<String> { - let outcome = finalize_pre_commit_checkpoint( - &PreCommitRuntimeState { - sce_disabled: false, - cli_available: true, - is_bare_repo: false, - }, - PreCommitTreeAnchors { - index_tree: "pending-index-tree".to_string(), - head_tree: None, - }, - PendingCheckpoint { files: Vec::new() }, - ); + let repository_root = std::env::current_dir() + .context("Failed to determine current directory for pre-commit runtime invocation.")?; + run_pre_commit_subcommand_in_repo(&repository_root) +} + +fn run_pre_commit_subcommand_in_repo(repository_root: &Path) -> Result<String> { + let runtime = resolve_pre_commit_runtime_state(repository_root); + + if runtime.sce_disabled || !runtime.cli_available || runtime.is_bare_repo { + let reason = if runtime.sce_disabled { + PreCommitNoOpReason::Disabled + } else if !runtime.cli_available { + PreCommitNoOpReason::CliUnavailable + } else { + PreCommitNoOpReason::BareRepository + }; + + return Ok(format!( + "pre-commit hook executed with no-op runtime state: {reason:?}" + )); + } + + let anchors = match capture_pre_commit_tree_anchors(repository_root) { + Ok(anchors) => anchors, + Err(error) => { + return Ok(format!( + "pre-commit hook skipped checkpoint finalization: failed to capture git anchors ({error})" + )); + } + }; + + let pending = match collect_pending_checkpoint(repository_root) { + Ok(pending) => pending, + Err(error) => { + return Ok(format!( + "pre-commit hook skipped checkpoint finalization: failed to collect staged attribution ({error})" + )); + } + }; + + let outcome = finalize_pre_commit_checkpoint(&runtime, anchors, pending); let message = match outcome { PreCommitFinalization::NoOp(reason) => { format!("pre-commit hook executed with no-op runtime state: {reason:?}") } - PreCommitFinalization::Finalized(checkpoint) => format!( - "pre-commit hook executed and finalized staged checkpoint for {} file(s).", - checkpoint.files.len() - ), + PreCommitFinalization::Finalized(checkpoint) => { + if let Err(error) = 
write_finalized_checkpoint(repository_root, &checkpoint) { + return Ok(format!( + "pre-commit hook finalized staged checkpoint for {} file(s) but failed to persist handoff artifact ({error})", + checkpoint.files.len() + )); + } + format!( + "pre-commit hook executed and finalized staged checkpoint for {} file(s).", + checkpoint.files.len() + ) + } }; Ok(message) } +fn resolve_pre_commit_runtime_state(repository_root: &Path) -> PreCommitRuntimeState { + PreCommitRuntimeState { + sce_disabled: env_flag_is_truthy("SCE_DISABLED"), + cli_available: git_command_success(repository_root, &["--version"]), + is_bare_repo: git_command_output(repository_root, &["rev-parse", "--is-bare-repository"]) + .is_some_and(|output| output == "true"), + } +} + +fn env_flag_is_truthy(name: &str) -> bool { + std::env::var(name).ok().is_some_and(|value| { + matches!( + value.trim().to_ascii_lowercase().as_str(), + "1" | "true" | "yes" | "on" + ) + }) +} + +fn git_command_success(repository_root: &Path, args: &[&str]) -> bool { + Command::new("git") + .args(args) + .current_dir(repository_root) + .output() + .map(|output| output.status.success()) + .unwrap_or(false) +} + +fn git_command_output(repository_root: &Path, args: &[&str]) -> Option<String> { + let output = Command::new("git") + .args(args) + .current_dir(repository_root) + .output() + .ok()?; + + if !output.status.success() { + return None; + } + + let stdout = String::from_utf8(output.stdout).ok()?; + Some(stdout.trim().to_string()) +} + +fn run_git_command(repository_root: &Path, args: &[&str], context_message: &str) -> Result<String> { + let output = Command::new("git") + .args(args) + .current_dir(repository_root) + .output() + .with_context(|| { + format!( + "{} (directory: '{}')", + context_message, + repository_root.display() + ) + })?; + + if !output.status.success() { + let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string(); + let diagnostic = if stderr.is_empty() { + "git command exited with a non-zero 
status".to_string() + } else { + stderr + }; + bail!("{} {}", context_message, diagnostic); + } + + String::from_utf8(output.stdout) + .context("git command output contained invalid UTF-8") + .map(|stdout| stdout.trim().to_string()) +} + +fn run_git_command_allow_empty( + repository_root: &Path, + args: &[&str], + context_message: &str, +) -> Result<String> { + let output = Command::new("git") + .args(args) + .current_dir(repository_root) + .output() + .with_context(|| { + format!( + "{} (directory: '{}')", + context_message, + repository_root.display() + ) + })?; + + if !output.status.success() { + let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string(); + let diagnostic = if stderr.is_empty() { + "git command exited with a non-zero status".to_string() + } else { + stderr + }; + bail!("{} {}", context_message, diagnostic); + } + + String::from_utf8(output.stdout).context("git command output contained invalid UTF-8") +} + +fn capture_pre_commit_tree_anchors(repository_root: &Path) -> Result<PreCommitTreeAnchors> { + let index_tree = run_git_command( + repository_root, + &["write-tree"], + "Failed to capture index tree anchor for pre-commit checkpoint.", + )?; + let head_tree = git_command_output(repository_root, &["rev-parse", "--verify", "HEAD^{tree}"]); + + Ok(PreCommitTreeAnchors { + index_tree, + head_tree, + }) +} + +fn collect_pending_checkpoint(repository_root: &Path) -> Result<PendingCheckpoint> { + let staged_diff = run_git_command_allow_empty( + repository_root, + &[ + "diff", + "--cached", + "--unified=0", + "--no-color", + "--no-ext-diff", + ], + "Failed to collect staged diff for pre-commit attribution.", + )?; + let unstaged_diff = run_git_command_allow_empty( + repository_root, + &["diff", "--unified=0", "--no-color", "--no-ext-diff"], + "Failed to collect unstaged diff for pre-commit attribution.", + )?; + + let staged_ranges = parse_unified_zero_diff_ranges(&staged_diff)?; + let unstaged_ranges = parse_unified_zero_diff_ranges(&unstaged_diff)?; + + let mut all_paths = 
BTreeMap::new(); + for path in staged_ranges.keys() { + all_paths.insert(path.clone(), ()); + } + for path in unstaged_ranges.keys() { + all_paths.insert(path.clone(), ()); + } + + let files = all_paths + .keys() + .map(|path| PendingFileCheckpoint { + path: path.clone(), + staged_ranges: staged_ranges.get(path).cloned().unwrap_or_default(), + unstaged_ranges: unstaged_ranges.get(path).cloned().unwrap_or_default(), + }) + .collect(); + + Ok(PendingCheckpoint { files }) +} + +fn parse_unified_zero_diff_ranges( + contents: &str, +) -> Result<BTreeMap<String, Vec<PendingLineRange>>> { + let mut ranges_by_path: BTreeMap<String, Vec<PendingLineRange>> = BTreeMap::new(); + let mut current_path: Option<String> = None; + + for line in contents.lines() { + if let Some(path) = line.strip_prefix("+++ b/") { + current_path = Some(path.to_string()); + continue; + } + + if line.starts_with("+++") { + current_path = None; + continue; + } + + if !line.starts_with("@@") { + continue; + } + + let Some(path) = current_path.clone() else { + continue; + }; + + if let Some(range) = parse_hunk_new_range(line)? { + ranges_by_path.entry(path).or_default().push(range); + } + } + + Ok(ranges_by_path) +} + +fn parse_hunk_new_range(header_line: &str) -> Result<Option<PendingLineRange>> { + let mut fields = header_line.split_whitespace(); + let _ = fields.next(); + let _ = fields.next(); + let Some(new_range_field) = fields.next() else { + bail!( + "Invalid unified diff hunk header '{}': missing new-range field", + header_line + ); + }; + + let Some(range_body) = new_range_field.strip_prefix('+') else { + bail!( + "Invalid unified diff hunk header '{}': malformed new-range field", + header_line + ); + }; + + let mut parts = range_body.split(','); + let start_line: u32 = parts + .next() + .context("Unified diff hunk is missing start line")? 
+ .parse() + .with_context(|| { + format!( + "Invalid hunk start line in '{}': expected integer", + header_line + ) + })?; + let line_count: u32 = parts + .next() + .map(str::parse) + .transpose() + .with_context(|| { + format!( + "Invalid hunk line count in '{}': expected integer", + header_line + ) + })? + .unwrap_or(1); + + if line_count == 0 { + return Ok(None); + } + + Ok(Some(PendingLineRange { + start_line, + end_line: start_line + line_count - 1, + })) +} + +fn resolve_pre_commit_checkpoint_path(repository_root: &Path) -> Result<PathBuf> { + let resolved = run_git_command( + repository_root, + &["rev-parse", "--git-path", PRE_COMMIT_CHECKPOINT_GIT_PATH], + "Failed to resolve pre-commit checkpoint handoff path.", + )?; + let path = PathBuf::from(resolved); + + if path.is_absolute() { + return Ok(path); + } + + Ok(repository_root.join(path)) +} + +fn write_finalized_checkpoint( + repository_root: &Path, + checkpoint: &FinalizedCheckpoint, +) -> Result<()> { + let checkpoint_path = resolve_pre_commit_checkpoint_path(repository_root)?; + let parent = checkpoint_path + .parent() + .context("Resolved pre-commit checkpoint path has no parent directory")?; + fs::create_dir_all(parent).with_context(|| { + format!( + "Failed to create pre-commit checkpoint directory '{}'.", + parent.display() + ) + })?; + + let mut files = Vec::new(); + for file in &checkpoint.files { + let mut ranges = Vec::new(); + for range in &file.ranges { + ranges.push(serde_json::json!({ + "start_line": range.start_line, + "end_line": range.end_line, + })); + } + files.push(serde_json::json!({ + "path": file.path, + "ranges": ranges, + })); + } + + let payload = serde_json::json!({ + "version": 1, + "anchors": { + "index_tree": checkpoint.anchors.index_tree.clone(), + "head_tree": checkpoint.anchors.head_tree.clone(), + }, + "files": files, + }); + + let serialized = serde_json::to_vec_pretty(&payload) + .context("Failed to serialize pre-commit checkpoint artifact")?; + fs::write(&checkpoint_path, 
serialized).with_context(|| { + format!( + "Failed to persist pre-commit checkpoint artifact '{}'.", + checkpoint_path.display() + ) + }) +} + fn run_commit_msg_subcommand(message_file: PathBuf) -> Result<String> { let metadata = fs::metadata(&message_file).with_context(|| { format!( diff --git a/cli/src/services/hooks/tests.rs b/cli/src/services/hooks/tests.rs index f59f2daf..61461294 100644 --- a/cli/src/services/hooks/tests.rs +++ b/cli/src/services/hooks/tests.rs @@ -1,4 +1,9 @@ use anyhow::Result; +use std::fs; +use std::path::Path; +use std::path::PathBuf; +use std::process::Command; +use std::time::{SystemTime, UNIX_EPOCH}; use crate::services::agent_trace::{ build_trace_payload, ContributorInput, ContributorType, ConversationInput, @@ -9,7 +14,8 @@ use crate::services::agent_trace::{ use super::{ apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, finalize_pre_commit_checkpoint, finalize_rewrite_trace, parse_hooks_subcommand, - process_trace_retry_queue, run_hooks_subcommand, run_placeholder_hooks, CommitMsgRuntimeState, + process_trace_retry_queue, resolve_pre_commit_checkpoint_path, run_hooks_subcommand, + run_placeholder_hooks, run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, HookSubcommand, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, PersistenceFailure, PersistenceTarget, PersistenceWriteResult, @@ -23,6 +29,39 @@ use super::{ POST_COMMIT_PARENT_SHA_METADATA_KEY, }; +fn run_git_in_repo(repo: &Path, args: &[&str]) -> Result<()> { + let output = Command::new("git").args(args).current_dir(repo).output()?; + if output.status.success() { + return Ok(()); + } + + let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string(); + anyhow::bail!( + "git {:?} failed in '{}': {}", + args, + repo.display(), + if stderr.is_empty() { + "git command exited non-zero".to_string() + } else { + stderr + 
} + ) +} + +fn create_temp_repo() -> Result<PathBuf> { + let unique = format!( + "sce-hooks-tests-{}-{}", + std::process::id(), + SystemTime::now().duration_since(UNIX_EPOCH)?.as_nanos() + ); + let repo = std::env::temp_dir().join(unique); + fs::create_dir_all(&repo)?; + run_git_in_repo(&repo, &["init"])?; + run_git_in_repo(&repo, &["config", "user.name", "SCE Test"])?; + run_git_in_repo(&repo, &["config", "user.email", "sce@example.test"])?; + Ok(repo) +} + fn sample_pending_checkpoint() -> PendingCheckpoint { PendingCheckpoint { files: vec![PendingFileCheckpoint { @@ -757,6 +796,44 @@ fn pre_commit_finalization_uses_only_staged_ranges_and_captures_anchors() { ); } +#[test] +fn pre_commit_runtime_persists_staged_only_checkpoint_artifact() -> Result<()> { + let repo = create_temp_repo()?; + let tracked_file = repo.join("src").join("lib.rs"); + fs::create_dir_all( + tracked_file + .parent() + .expect("tracked file path should have parent"), + )?; + fs::write(&tracked_file, "one\ntwo\nthree\nfour\n")?; + run_git_in_repo(&repo, &["add", "."])?; + run_git_in_repo(&repo, &["commit", "-m", "initial"])?; + + fs::write(&tracked_file, "one\ntwo-staged\nthree\nfour\n")?; + run_git_in_repo(&repo, &["add", "src/lib.rs"])?; + fs::write(&tracked_file, "one\ntwo-staged\nthree\nfour-unstaged\n")?; + + let message = run_pre_commit_subcommand_in_repo(&repo)?; + assert_eq!( + message, + "pre-commit hook executed and finalized staged checkpoint for 1 file(s)." 
+ ); + + let checkpoint_path = resolve_pre_commit_checkpoint_path(&repo)?; + let checkpoint = serde_json::from_slice::<serde_json::Value>(&fs::read(&checkpoint_path)?)?; + + assert_eq!(checkpoint["version"], 1); + assert_eq!(checkpoint["files"].as_array().map(Vec::len), Some(1)); + assert_eq!(checkpoint["files"][0]["path"], "src/lib.rs"); + assert_eq!( + checkpoint["files"][0]["ranges"].as_array().map(Vec::len), + Some(1) + ); + assert_eq!(checkpoint["files"][0]["ranges"][0]["start_line"], 2); + assert_eq!(checkpoint["files"][0]["ranges"][0]["end_line"], 2); + Ok(()) +} + fn sample_commit_msg_runtime() -> CommitMsgRuntimeState { CommitMsgRuntimeState { sce_disabled: false, diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index 5bba867f..2bbf1890 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -42,7 +42,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml app::tests` - Focused hook command-surface tests for valid/invalid hook invocations. -- [ ] T03: Wire pre-commit runtime finalization to real staged attribution inputs (status:todo) +- [x] T03: Wire pre-commit runtime finalization to real staged attribution inputs (status:done) - Task ID: T03 - Goal: Connect `finalize_pre_commit_checkpoint` to real runtime data collection and deterministic checkpoint persistence handoff. - Boundaries (in/out of scope): diff --git a/context/sce/agent-trace-pre-commit-staged-checkpoint.md b/context/sce/agent-trace-pre-commit-staged-checkpoint.md index fe178a1d..291af5df 100644 --- a/context/sce/agent-trace-pre-commit-staged-checkpoint.md +++ b/context/sce/agent-trace-pre-commit-staged-checkpoint.md @@ -8,19 +8,36 @@ Task `agent-trace-attribution-no-git-wrapper` `T04` adds a pre-commit finalizati - Code location: `cli/src/services/hooks.rs`. 
- Finalization entrypoint: `finalize_pre_commit_checkpoint(runtime, anchors, pending)`. +- Runtime hook entrypoint: `run_pre_commit_subcommand` -> `run_pre_commit_subcommand_in_repo(repository_root)`. - Runtime no-op guards: - `sce_disabled = true` -> `NoOp(Disabled)`. - `cli_available = false` -> `NoOp(CliUnavailable)`. - `is_bare_repo = true` -> `NoOp(BareRepository)`. +- Runtime state resolution: + - `SCE_DISABLED` truthy env values (`1`, `true`, `yes`, `on`) set disabled mode. + - CLI availability checks `git --version` in the repository context. + - Bare-repository guard uses `git rev-parse --is-bare-repository`. - Staged-only enforcement: - Input keeps separate `staged_ranges` and `unstaged_ranges` per file. - Finalized output includes only `staged_ranges`. - Files with no staged ranges are dropped from finalized attribution. +- Runtime staged/unstaged extraction: + - Staged hunks from `git diff --cached --unified=0 --no-color --no-ext-diff`. + - Unstaged hunks from `git diff --unified=0 --no-color --no-ext-diff`. + - Unified-diff hunks are parsed into deterministic line ranges per file path. - Anchors captured in finalized output: - required `index_tree`. - optional `head_tree`. +- Anchor capture source: + - `index_tree` from `git write-tree`. + - `head_tree` from `git rev-parse --verify HEAD^{tree}` (optional for repos without `HEAD`). +- Finalized checkpoint handoff artifact: + - Persisted as JSON at Git-resolved path `$(git rev-parse --git-path sce/pre-commit-checkpoint.json)`. + - Payload shape: `version`, `anchors`, and staged-only `files[].ranges[]`. + - Runtime remains fail-open: checkpoint collection/persist failures return deterministic diagnostics without blocking commit flow. ## Verification coverage - Mixed staged/unstaged fixture test confirms unstaged ranges are excluded and anchor values are preserved. - Guard-path tests cover disabled, missing CLI, and bare-repository no-op behavior. 
+- Runtime fixture test validates persisted pre-commit checkpoint artifact contains staged-only ranges when both staged and unstaged edits exist for the same file. From 2509cd0e70eff2c6416e313b00acf0692b33c33d Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 10:59:44 +0100 Subject: [PATCH 34/39] hooks: Wire commit-msg runtime to apply co-author trailer policy Route commit-msg runtime through repository-aware gate resolution, detect staged attribution from the pre-commit checkpoint, and mutate COMMIT_EDITMSG only when canonical trailer insertion is required. --- cli/src/services/hooks.rs | 93 ++++++++++++++-- cli/src/services/hooks/tests.rs | 104 +++++++++++++++--- context/context-map.md | 2 +- .../agent-trace-local-hooks-production-mvp.md | 2 +- .../agent-trace-commit-msg-coauthor-policy.md | 9 ++ .../sce/agent-trace-hooks-command-routing.md | 8 +- 6 files changed, 191 insertions(+), 27 deletions(-) diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 77c6b845..f51d9c7c 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -186,12 +186,23 @@ fn resolve_pre_commit_runtime_state(repository_root: &Path) -> PreCommitRuntimeS } fn env_flag_is_truthy(name: &str) -> bool { - std::env::var(name).ok().is_some_and(|value| { - matches!( - value.trim().to_ascii_lowercase().as_str(), - "1" | "true" | "yes" | "on" - ) - }) + std::env::var(name) + .ok() + .is_some_and(|value| env_value_is_truthy(&value)) +} + +fn env_flag_is_enabled_by_default(name: &str) -> bool { + match std::env::var(name) { + Ok(value) => env_value_is_truthy(&value), + Err(_) => true, + } +} + +fn env_value_is_truthy(value: &str) -> bool { + matches!( + value.trim().to_ascii_lowercase().as_str(), + "1" | "true" | "yes" | "on" + ) } fn git_command_success(repository_root: &Path, args: &[&str]) -> bool { @@ -480,7 +491,16 @@ fn write_finalized_checkpoint( } fn run_commit_msg_subcommand(message_file: PathBuf) -> Result<String> { - let metadata = 
fs::metadata(&message_file).with_context(|| { + let repository_root = std::env::current_dir() + .context("Failed to determine current directory for commit-msg runtime invocation.")?; + run_commit_msg_subcommand_in_repo(&repository_root, &message_file) +} + +fn run_commit_msg_subcommand_in_repo( + repository_root: &Path, + message_file: &Path, +) -> Result<String> { + let metadata = fs::metadata(message_file).with_context(|| { format!( "Invalid commit message file '{}': file does not exist or is not readable.", message_file.display() @@ -494,12 +514,67 @@ fn run_commit_msg_subcommand(message_file: PathBuf) -> Result<String> { ); } + let runtime = resolve_commit_msg_runtime_state(repository_root); + let original = fs::read_to_string(message_file).with_context(|| { + format!( + "Invalid commit message file '{}': failed to read UTF-8 content.", + message_file.display() + ) + })?; + + let transformed = apply_commit_msg_coauthor_policy(&runtime, &original); + let gate_passed = + !runtime.sce_disabled && runtime.sce_coauthor_enabled && runtime.has_staged_sce_attribution; + let trailer_applied = gate_passed && transformed != original; + + if trailer_applied { + fs::write(message_file, transformed.as_bytes()).with_context(|| { + format!( + "Failed to update commit message file '{}' with canonical co-author trailer.", + message_file.display() + ) + })?; + } + Ok(format!( - "commit-msg hook accepted message file '{}'.", - message_file.display() + "commit-msg hook processed message file '{}' (policy_gate_passed={}, trailer_applied={}).", + message_file.display(), + gate_passed, + trailer_applied )) } +fn resolve_commit_msg_runtime_state(repository_root: &Path) -> CommitMsgRuntimeState { + CommitMsgRuntimeState { + sce_disabled: env_flag_is_truthy("SCE_DISABLED"), + sce_coauthor_enabled: env_flag_is_enabled_by_default("SCE_COAUTHOR_ENABLED"), + has_staged_sce_attribution: staged_sce_attribution_present(repository_root), + } +} + +fn staged_sce_attribution_present(repository_root: &Path) -> 
bool { + let Ok(checkpoint_path) = resolve_pre_commit_checkpoint_path(repository_root) else { + return false; + }; + + let Ok(payload) = fs::read_to_string(&checkpoint_path) else { + return false; + }; + let Ok(json) = serde_json::from_str::<serde_json::Value>(&payload) else { + return false; + }; + + json.get("files") + .and_then(serde_json::Value::as_array) + .is_some_and(|files| { + files.iter().any(|file| { + file.get("ranges") + .and_then(serde_json::Value::as_array) + .is_some_and(|ranges| !ranges.is_empty()) + }) + }) +} + fn run_post_commit_subcommand() -> Result<String> { Ok("post-commit hook accepted runtime invocation.".to_string()) } diff --git a/cli/src/services/hooks/tests.rs b/cli/src/services/hooks/tests.rs index 61461294..5db93b1b 100644 --- a/cli/src/services/hooks/tests.rs +++ b/cli/src/services/hooks/tests.rs @@ -14,19 +14,19 @@ use crate::services::agent_trace::{ use super::{ apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, finalize_pre_commit_checkpoint, finalize_rewrite_trace, parse_hooks_subcommand, - process_trace_retry_queue, resolve_pre_commit_checkpoint_path, run_hooks_subcommand, - run_placeholder_hooks, run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, - GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, - HookSubcommand, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, - PersistenceErrorClass, PersistenceFailure, PersistenceTarget, PersistenceWriteResult, - PlaceholderHookService, PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, - PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, - PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, - PreCommitTreeAnchors, RetryMetricsSink, RetryProcessingMetric, RewriteMethod, - RewriteRemapIngestion, RewriteRemapRequest, RewriteTraceFinalization, RewriteTraceInput, - RewriteTraceNoOpReason, TraceEmissionLedger, TraceNote, TraceNotesWriter, TraceRecordStore,
- TraceRetryQueue, TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, - POST_COMMIT_PARENT_SHA_METADATA_KEY, + process_trace_retry_queue, resolve_pre_commit_checkpoint_path, + run_commit_msg_subcommand_in_repo, run_hooks_subcommand, run_placeholder_hooks, + run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, GeneratedRegionEvent, + GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, HookSubcommand, + PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, + PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, + PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, + PostRewriteFinalization, PostRewriteNoOpReason, PostRewriteRuntimeState, PreCommitFinalization, + PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, RetryMetricsSink, + RetryProcessingMetric, RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, + RewriteTraceFinalization, RewriteTraceInput, RewriteTraceNoOpReason, TraceEmissionLedger, + TraceNote, TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry, + CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY, }; fn run_git_in_repo(repo: &Path, args: &[&str]) -> Result<()> { @@ -842,6 +842,36 @@ fn sample_commit_msg_runtime() -> CommitMsgRuntimeState { } } +fn write_staged_checkpoint_artifact(repo: &Path) -> Result<()> { + let checkpoint_path = resolve_pre_commit_checkpoint_path(repo)?; + if let Some(parent) = checkpoint_path.parent() { + fs::create_dir_all(parent)?; + } + fs::write( + checkpoint_path, + r#"{ + "version": 1, + "anchors": { + "index_tree": "index-tree", + "head_tree": "head-tree" + }, + "files": [ + { + "path": "src/lib.rs", + "ranges": [ + { + "start_line": 1, + "end_line": 1 + } + ] + } + ] +} +"#, + )?; + Ok(()) +} + #[test] fn commit_msg_policy_noops_when_sce_disabled() { let mut runtime = sample_commit_msg_runtime(); @@ -913,6 +943,54 @@ fn 
commit_msg_policy_is_idempotent() { assert_eq!(first, second); } +#[test] +fn commit_msg_runtime_mutates_message_file_when_policy_gate_passes() -> Result<()> { + let repo = create_temp_repo()?; + write_staged_checkpoint_artifact(&repo)?; + let message_file = repo.join("COMMIT_EDITMSG"); + fs::write(&message_file, "feat: add attribution\n")?; + + let message = run_commit_msg_subcommand_in_repo(&repo, &message_file)?; + assert_eq!( + message, + format!( + "commit-msg hook processed message file '{}' (policy_gate_passed=true, trailer_applied=true).", + message_file.display() + ) + ); + + let mutated = fs::read_to_string(&message_file)?; + assert_eq!( + mutated, + format!( + "feat: add attribution\n\n{}\n", + CANONICAL_SCE_COAUTHOR_TRAILER + ) + ); + Ok(()) +} + +#[test] +fn commit_msg_runtime_noops_when_staged_attribution_checkpoint_missing() -> Result<()> { + let repo = create_temp_repo()?; + let message_file = repo.join("COMMIT_EDITMSG"); + let original = "feat: add attribution\n"; + fs::write(&message_file, original)?; + + let message = run_commit_msg_subcommand_in_repo(&repo, &message_file)?; + assert_eq!( + message, + format!( + "commit-msg hook processed message file '{}' (policy_gate_passed=false, trailer_applied=false).", + message_file.display() + ) + ); + + let persisted = fs::read_to_string(&message_file)?; + assert_eq!(persisted, original); + Ok(()) +} + #[test] fn hooks_placeholder_event_model_reserves_generated_region_tracking() { let service = PlaceholderHookService; diff --git a/context/context-map.md b/context/context-map.md index 8c0ecc2b..4cd5b679 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -35,7 +35,7 @@ Feature/domain context: - `context/sce/agent-trace-rewrite-mapping-engine.md` (T13 hosted rewrite mapping engine contract with patch-id exact precedence, range-diff/fuzzy scoring, and deterministic unresolved outcomes) - `context/sce/agent-trace-retry-queue-observability.md` (T14 retry queue recovery contract plus 
reconciliation/runtime observability metrics and DB-first queue schema additions) - `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` (T01 Local Hooks MVP production contract freeze and deterministic gap matrix for `agent-trace-local-hooks-production-mvp`) -- `context/sce/agent-trace-hooks-command-routing.md` (T02 implemented `sce hooks` command routing, subcommand contracts, and deterministic invocation validation behavior) +- `context/sce/agent-trace-hooks-command-routing.md` (implemented `sce hooks` command routing plus current runtime entrypoint behavior, including commit-msg policy gate/file mutation wiring) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index 2bbf1890..44cebb9b 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -55,7 +55,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml pre_commit` - End-to-end local repo fixture test proving unstaged ranges are excluded. -- [ ] T04: Wire commit-msg hook file mutation to canonical co-author policy (status:todo) +- [x] T04: Wire commit-msg hook file mutation to canonical co-author policy (status:done) - Task ID: T04 - Goal: Connect `apply_commit_msg_coauthor_policy` to real commit message file IO in hook runtime with idempotent trailer handling. 
- Boundaries (in/out of scope): diff --git a/context/sce/agent-trace-commit-msg-coauthor-policy.md b/context/sce/agent-trace-commit-msg-coauthor-policy.md index 1998b190..70541226 100644 --- a/context/sce/agent-trace-commit-msg-coauthor-policy.md +++ b/context/sce/agent-trace-commit-msg-coauthor-policy.md @@ -4,24 +4,33 @@ - Plan: `agent-trace-attribution-no-git-wrapper` - Task: `T05` - Implementation state: done +- Runtime hook wiring: `agent-trace-local-hooks-production-mvp` `T04` (done) ## Canonical contract - Policy entrypoint: `cli/src/services/hooks.rs` -> `apply_commit_msg_coauthor_policy`. +- Runtime entrypoint: `cli/src/services/hooks.rs` -> `run_commit_msg_subcommand` / `run_commit_msg_subcommand_in_repo`. - Canonical trailer string: `Co-authored-by: SCE `. - Runtime gating conditions: - `sce_disabled = false` - `sce_coauthor_enabled = true` - `has_staged_sce_attribution = true` +- Runtime gate source mapping: + - `sce_disabled` resolves from `SCE_DISABLED` truthy evaluation. + - `sce_coauthor_enabled` resolves from `SCE_COAUTHOR_ENABLED` with enabled-by-default semantics. + - `has_staged_sce_attribution` resolves from staged pre-commit checkpoint artifact content (`files[].ranges[]` non-empty). - When all gate conditions pass, output commit message MUST contain exactly one canonical SCE trailer. - When any gate condition fails, commit message is returned unchanged. ## Behavior details +- Hook runtime reads commit message file content as UTF-8 and returns deterministic actionable errors for missing/non-file/non-UTF-8 paths. - Canonical trailer dedupe removes duplicate canonical lines before final insertion. - Trailer insertion is idempotent: applying the policy repeatedly yields the same message. - Existing trailing newline is preserved when present. +- Commit-msg runtime writes the file only when policy gates pass and transformed content differs from original content. 
- Human author/committer identity is not rewritten; only commit message trailer content is affected. ## Verification evidence - `cargo fmt --manifest-path cli/Cargo.toml -- --check` - `cargo test --manifest-path cli/Cargo.toml commit_msg_policy` +- `cargo test --manifest-path cli/Cargo.toml commit_msg_runtime` - `cargo build --manifest-path cli/Cargo.toml` diff --git a/context/sce/agent-trace-hooks-command-routing.md b/context/sce/agent-trace-hooks-command-routing.md index d6ee4090..dcd520e0 100644 --- a/context/sce/agent-trace-hooks-command-routing.md +++ b/context/sce/agent-trace-hooks-command-routing.md @@ -18,10 +18,12 @@ ## Current runtime entrypoint behavior - `pre-commit`: executes the pre-commit runtime entrypoint and reports staged-checkpoint finalization outcome. -- `commit-msg`: validates that `` exists and is a regular file before accepting invocation. +- `commit-msg`: validates ``, resolves runtime gates (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, staged checkpoint presence), applies canonical co-author policy, and writes back only when trailer mutation is required. - `post-commit`: accepts runtime invocation through implemented dispatch entrypoint. - `post-rewrite`: reads hook pair input from STDIN, validates pair format through remap finalization parsing, and reports ingested/skipped outcomes. ## Notes for next tasks -- T02 implements routing and invocation contracts only. -- Deep runtime wiring for staged attribution, commit-msg mutation, post-commit persistence adapters, and rewrite-trace persistence remains in `T03+`. +- T02 established routing and invocation contracts. +- T03 implemented pre-commit staged-checkpoint runtime wiring. +- T04 implemented commit-msg file IO mutation wiring to canonical co-author policy. +- Post-commit persistence adapters and rewrite-trace persistence remain in `T05+`. 
From 9863467e17c6a3db4bf7d508e00abb5340d06b94 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 11:22:46 +0100 Subject: [PATCH 35/39] hooks: Implement post-commit trace finalization runtime adapters Wire the post-commit runtime to collect commit attribution input and execute finalize_post_commit_trace with concrete persistence adapters for git notes, local JSONL record storage, retry queue, and emission ledger. --- cli/src/services/hooks.rs | 808 +++++++++++++++++- cli/src/services/hooks/tests.rs | 108 ++- .../agent-trace-local-hooks-production-mvp.md | 2 +- .../sce/agent-trace-hooks-command-routing.md | 5 +- .../sce/agent-trace-post-commit-dual-write.md | 15 + 5 files changed, 921 insertions(+), 17 deletions(-) diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index f51d9c7c..f26ee264 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -1,14 +1,17 @@ use anyhow::{bail, Context, Result}; use std::collections::{BTreeMap, HashSet}; use std::fs; +use std::io::Write; use std::path::{Path, PathBuf}; use std::process::Command; use std::str::FromStr; use std::time::Instant; use crate::services::agent_trace::{ - build_trace_payload, AgentTraceRecord, FileAttributionInput, QualityStatus, RewriteInfo, - TraceAdapterInput, METADATA_IDEMPOTENCY_KEY, TRACE_CONTENT_TYPE, + build_trace_payload, AgentTraceContributor, AgentTraceConversation, AgentTraceFile, + AgentTraceRange, AgentTraceRecord, AgentTraceVcs, ContributorInput, ContributorType, + ConversationInput, FileAttributionInput, QualityStatus, RangeInput, RewriteInfo, + TraceAdapterInput, METADATA_IDEMPOTENCY_KEY, TRACE_CONTENT_TYPE, TRACE_VERSION, VCS_TYPE_GIT, }; pub const NAME: &str = "hooks"; @@ -576,7 +579,300 @@ fn staged_sce_attribution_present(repository_root: &Path) -> bool { } fn run_post_commit_subcommand() -> Result { - Ok("post-commit hook accepted runtime invocation.".to_string()) + let repository_root = std::env::current_dir() + .context("Failed to 
determine current directory for post-commit runtime invocation.")?; + run_post_commit_subcommand_in_repo(&repository_root) +} + +fn run_post_commit_subcommand_in_repo(repository_root: &Path) -> Result { + let runtime = resolve_post_commit_runtime_state(repository_root); + + if runtime.sce_disabled || !runtime.cli_available || runtime.is_bare_repo { + let reason = if runtime.sce_disabled { + PostCommitNoOpReason::Disabled + } else if !runtime.cli_available { + PostCommitNoOpReason::CliUnavailable + } else { + PostCommitNoOpReason::BareRepository + }; + + return Ok(format!( + "post-commit hook executed with no-op runtime state: {reason:?}" + )); + } + + let runtime_paths = match resolve_post_commit_runtime_paths(repository_root) { + Ok(paths) => paths, + Err(error) => { + return Ok(format!( + "post-commit hook skipped trace finalization: failed to resolve persistence targets ({error})" + )); + } + }; + + let input = match build_post_commit_input(repository_root) { + Ok(input) => input, + Err(error) => { + return Ok(format!( + "post-commit hook skipped trace finalization: failed to collect commit attribution input ({error})" + )); + } + }; + + let mut notes_writer = GitNotesTraceWriter { + repository_root: repository_root.to_path_buf(), + }; + let mut record_store = JsonFileTraceRecordStore { + path: runtime_paths.trace_records_path, + }; + let mut retry_queue = JsonFileTraceRetryQueue { + path: runtime_paths.retry_queue_path, + }; + let mut emission_ledger = FileTraceEmissionLedger { + path: runtime_paths.emission_ledger_path, + }; + + let outcome = match finalize_post_commit_trace( + &runtime, + input, + &mut notes_writer, + &mut record_store, + &mut retry_queue, + &mut emission_ledger, + ) { + Ok(outcome) => outcome, + Err(error) => { + return Ok(format!( + "post-commit hook skipped trace finalization: finalizer execution failed ({error})" + )); + } + }; + + let message = match outcome { + PostCommitFinalization::NoOp(reason) => { + format!("post-commit hook 
executed with no-op runtime state: {reason:?}") + } + PostCommitFinalization::Persisted(persisted) => format!( + "post-commit hook finalized trace for commit '{}' (trace_id='{}', notes={:?}, database={:?}).", + persisted.commit_sha, persisted.trace_id, persisted.notes, persisted.database + ), + PostCommitFinalization::QueuedFallback(queued) => format!( + "post-commit hook enqueued fallback for commit '{}' (trace_id='{}', failed_targets={:?}).", + queued.commit_sha, queued.trace_id, queued.failed_targets + ), + }; + + Ok(message) +} + +fn resolve_post_commit_runtime_state(repository_root: &Path) -> PostCommitRuntimeState { + PostCommitRuntimeState { + sce_disabled: env_flag_is_truthy("SCE_DISABLED"), + cli_available: git_command_success(repository_root, &["--version"]), + is_bare_repo: git_command_output(repository_root, &["rev-parse", "--is-bare-repository"]) + .is_some_and(|output| output == "true"), + } +} + +struct PostCommitRuntimePaths { + trace_records_path: PathBuf, + retry_queue_path: PathBuf, + emission_ledger_path: PathBuf, +} + +fn resolve_post_commit_runtime_paths(repository_root: &Path) -> Result { + let trace_records_path = resolve_git_path(repository_root, "sce/trace-records.jsonl")?; + let retry_queue_path = resolve_git_path(repository_root, "sce/trace-retry-queue.jsonl")?; + let emission_ledger_path = resolve_git_path(repository_root, "sce/trace-emission-ledger.txt")?; + + Ok(PostCommitRuntimePaths { + trace_records_path, + retry_queue_path, + emission_ledger_path, + }) +} + +fn resolve_git_path(repository_root: &Path, git_path: &str) -> Result { + let resolved = run_git_command( + repository_root, + &["rev-parse", "--git-path", git_path], + "Failed to resolve git persistence path.", + )?; + let path = PathBuf::from(resolved); + if path.is_absolute() { + return Ok(path); + } + + Ok(repository_root.join(path)) +} + +fn build_post_commit_input(repository_root: &Path) -> Result { + let commit_sha = run_git_command( + repository_root, + &["rev-parse", 
"--verify", "HEAD"], + "Failed to resolve post-commit HEAD SHA.", + )?; + let parent_sha = git_command_output(repository_root, &["rev-parse", "--verify", "HEAD^"]); + let timestamp_rfc3339 = run_git_command( + repository_root, + &["show", "-s", "--format=%cI", "HEAD"], + "Failed to resolve post-commit timestamp.", + )?; + let files = collect_post_commit_file_attribution(repository_root)?; + let idempotency_key = format!("post-commit:{commit_sha}"); + let record_id = deterministic_uuid_v4_from_seed(&format!("{commit_sha}:{timestamp_rfc3339}")); + + Ok(PostCommitInput { + record_id, + timestamp_rfc3339, + commit_sha, + parent_sha, + idempotency_key, + files, + }) +} + +fn collect_post_commit_file_attribution( + repository_root: &Path, +) -> Result> { + let checkpoint_files = load_post_commit_checkpoint_files(repository_root)?; + if !checkpoint_files.is_empty() { + return Ok(checkpoint_files); + } + + let changed_paths = run_git_command_allow_empty( + repository_root, + &["show", "--pretty=format:", "--name-only", "HEAD"], + "Failed to resolve changed files for post-commit attribution.", + )?; + + let mut files = Vec::new(); + for line in changed_paths.lines() { + let path = line.trim(); + if path.is_empty() { + continue; + } + + files.push(FileAttributionInput { + path: path.to_string(), + conversations: vec![ConversationInput { + url: "https://crocoder.dev/sce/local-hooks/post-commit".to_string(), + related: Vec::new(), + ranges: vec![RangeInput { + start_line: 1, + end_line: 1, + contributor: ContributorInput { + kind: ContributorType::Unknown, + model_id: None, + }, + }], + }], + }); + } + + Ok(files) +} + +fn load_post_commit_checkpoint_files(repository_root: &Path) -> Result> { + let checkpoint_path = resolve_pre_commit_checkpoint_path(repository_root)?; + let payload = match fs::read_to_string(&checkpoint_path) { + Ok(payload) => payload, + Err(error) if error.kind() == std::io::ErrorKind::NotFound => return Ok(Vec::new()), + Err(error) => { + bail!( + "Failed 
to read pre-commit checkpoint '{}' for post-commit finalization: {}", + checkpoint_path.display(), + error + ) + } + }; + + let checkpoint = serde_json::from_str::(&payload).with_context(|| { + format!( + "Failed to parse pre-commit checkpoint '{}' as JSON.", + checkpoint_path.display() + ) + })?; + + let Some(files_json) = checkpoint + .get("files") + .and_then(serde_json::Value::as_array) + else { + return Ok(Vec::new()); + }; + + let mut files = Vec::new(); + for file_json in files_json { + let Some(path) = file_json.get("path").and_then(serde_json::Value::as_str) else { + continue; + }; + + let ranges = file_json + .get("ranges") + .and_then(serde_json::Value::as_array) + .map(|ranges| { + ranges + .iter() + .filter_map(|range_json| { + let start_line = range_json + .get("start_line") + .and_then(serde_json::Value::as_u64) + .map(|value| value as u32)?; + let end_line = range_json + .get("end_line") + .and_then(serde_json::Value::as_u64) + .map(|value| value as u32)?; + + Some(RangeInput { + start_line, + end_line, + contributor: ContributorInput { + kind: ContributorType::Unknown, + model_id: None, + }, + }) + }) + .collect::>() + }) + .unwrap_or_default(); + + if ranges.is_empty() { + continue; + } + + files.push(FileAttributionInput { + path: path.to_string(), + conversations: vec![ConversationInput { + url: "https://crocoder.dev/sce/local-hooks/pre-commit-checkpoint".to_string(), + related: Vec::new(), + ranges, + }], + }); + } + + Ok(files) +} + +fn deterministic_uuid_v4_from_seed(seed: &str) -> String { + use sha2::{Digest, Sha256}; + + let digest = Sha256::digest(seed.as_bytes()); + let mut bytes = [0_u8; 16]; + bytes.copy_from_slice(&digest[..16]); + + bytes[6] = (bytes[6] & 0x0f) | 0x40; + bytes[8] = (bytes[8] & 0x3f) | 0x80; + + format!( + "{:08x}-{:04x}-{:04x}-{:04x}-{:012x}", + u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]), + u16::from_be_bytes([bytes[4], bytes[5]]), + u16::from_be_bytes([bytes[6], bytes[7]]), + 
u16::from_be_bytes([bytes[8], bytes[9]]), + u64::from_be_bytes([ + 0, 0, bytes[10], bytes[11], bytes[12], bytes[13], bytes[14], bytes[15] + ]) + ) } fn run_post_rewrite_subcommand(rewrite_method: &str) -> Result { @@ -784,6 +1080,512 @@ pub trait TraceEmissionLedger { fn mark_emitted(&mut self, commit_sha: &str); } +struct GitNotesTraceWriter { + repository_root: PathBuf, +} + +impl TraceNotesWriter for GitNotesTraceWriter { + fn write_note(&mut self, note: TraceNote) -> PersistenceWriteResult { + let payload = match serialize_note_payload(&note) { + Ok(payload) => payload, + Err(error) => { + return PersistenceWriteResult::Failed(PersistenceFailure { + class: PersistenceErrorClass::Permanent, + message: format!("failed to serialize trace note payload: {error}"), + }); + } + }; + + let existing = Command::new("git") + .args([ + "notes", + "--ref", + note.notes_ref.as_str(), + "show", + note.commit_sha.as_str(), + ]) + .current_dir(&self.repository_root) + .output(); + if let Ok(output) = &existing { + if output.status.success() { + let existing_payload = String::from_utf8_lossy(&output.stdout).trim().to_string(); + if existing_payload == payload { + return PersistenceWriteResult::AlreadyExists; + } + } + } + + match Command::new("git") + .args([ + "notes", + "--ref", + note.notes_ref.as_str(), + "add", + "-f", + "-m", + payload.as_str(), + note.commit_sha.as_str(), + ]) + .current_dir(&self.repository_root) + .output() + { + Ok(output) if output.status.success() => PersistenceWriteResult::Written, + Ok(output) => PersistenceWriteResult::Failed(PersistenceFailure { + class: classify_persistence_error_class_from_stderr(&String::from_utf8_lossy( + &output.stderr, + )), + message: format!( + "failed to write git note for commit '{}': {}", + note.commit_sha, + String::from_utf8_lossy(&output.stderr).trim() + ), + }), + Err(error) => PersistenceWriteResult::Failed(PersistenceFailure { + class: classify_persistence_error_class_from_io(&error), + message: format!( + "failed
to execute git notes command for commit '{}': {}", + note.commit_sha, error + ), + }), + } + } +} + +struct JsonFileTraceRecordStore { + path: PathBuf, +} + +impl TraceRecordStore for JsonFileTraceRecordStore { + fn write_trace_record(&mut self, record: PersistedTraceRecord) -> PersistenceWriteResult { + if let Some(existing_key) = + load_record_with_idempotency_key(&self.path, &record.idempotency_key) + { + if existing_key == record.idempotency_key { + return PersistenceWriteResult::AlreadyExists; + } + } + + if let Some(parent) = self.path.parent() { + if let Err(error) = fs::create_dir_all(parent) { + return PersistenceWriteResult::Failed(PersistenceFailure { + class: classify_persistence_error_class_from_io(&error), + message: format!( + "failed to create trace record store directory '{}': {}", + parent.display(), + error + ), + }); + } + } + + let line = serde_json::json!({ + "commit_sha": record.commit_sha, + "idempotency_key": record.idempotency_key, + "content_type": record.content_type, + "notes_ref": record.notes_ref, + "record": trace_record_to_json(&record.record), + }) + .to_string(); + + match append_jsonl_line(&self.path, &line) { + Ok(()) => PersistenceWriteResult::Written, + Err(error) => PersistenceWriteResult::Failed(PersistenceFailure { + class: classify_persistence_error_class_from_io(&error), + message: format!( + "failed to write trace record into local JSON store '{}': {}", + self.path.display(), + error + ), + }), + } + } +} + +struct JsonFileTraceRetryQueue { + path: PathBuf, +} + +impl TraceRetryQueue for JsonFileTraceRetryQueue { + fn enqueue(&mut self, entry: TraceRetryQueueEntry) -> Result<()> { + if let Some(parent) = self.path.parent() { + fs::create_dir_all(parent).with_context(|| { + format!( + "Failed to create retry queue directory '{}'", + parent.display() + ) + })?; + } + + let line = serde_json::json!({ + "commit_sha": entry.commit_sha, + "failed_targets": entry + .failed_targets + .iter() + .map(persistence_target_label) + 
.collect::>(), + "content_type": entry.content_type, + "notes_ref": entry.notes_ref, + "record": trace_record_to_json(&entry.record), + }) + .to_string(); + append_jsonl_line(&self.path, &line)?; + + Ok(()) + } + + fn dequeue_next(&mut self) -> Result> { + if !self.path.exists() { + return Ok(None); + } + + let payload = fs::read_to_string(&self.path).with_context(|| { + format!( + "Failed to read retry queue file '{}' for dequeue.", + self.path.display() + ) + })?; + + let mut lines = payload.lines(); + let Some(first_line) = lines.next() else { + return Ok(None); + }; + + let mut remaining = String::new(); + for line in lines { + remaining.push_str(line); + remaining.push('\n'); + } + fs::write(&self.path, remaining).with_context(|| { + format!( + "Failed to rewrite retry queue file '{}' after dequeue.", + self.path.display() + ) + })?; + + let parsed = serde_json::from_str::(first_line) + .context("Failed to parse retry queue entry JSON during dequeue")?; + let commit_sha = parsed + .get("commit_sha") + .and_then(serde_json::Value::as_str) + .context("Retry queue entry missing 'commit_sha' string")? + .to_string(); + let content_type = parsed + .get("content_type") + .and_then(serde_json::Value::as_str) + .context("Retry queue entry missing 'content_type' string")? + .to_string(); + let notes_ref = parsed + .get("notes_ref") + .and_then(serde_json::Value::as_str) + .context("Retry queue entry missing 'notes_ref' string")? 
+ .to_string(); + let record = trace_record_from_json( + parsed + .get("record") + .context("Retry queue entry missing 'record' object")?, + )?; + + let failed_targets = parsed + .get("failed_targets") + .and_then(serde_json::Value::as_array) + .into_iter() + .flatten() + .filter_map(|value| value.as_str()) + .filter_map(persistence_target_from_label) + .collect::>(); + + Ok(Some(TraceRetryQueueEntry { + commit_sha, + failed_targets, + content_type, + notes_ref, + record, + })) + } +} + +struct FileTraceEmissionLedger { + path: PathBuf, +} + +impl TraceEmissionLedger for FileTraceEmissionLedger { + fn has_emitted(&self, commit_sha: &str) -> bool { + fs::read_to_string(&self.path) + .ok() + .is_some_and(|contents| contents.lines().any(|line| line.trim() == commit_sha)) + } + + fn mark_emitted(&mut self, commit_sha: &str) { + if self.has_emitted(commit_sha) { + return; + } + + if let Some(parent) = self.path.parent() { + if fs::create_dir_all(parent).is_err() { + return; + } + } + + if let Ok(mut file) = fs::OpenOptions::new() + .create(true) + .append(true) + .open(&self.path) + { + let _ = writeln!(file, "{commit_sha}"); + } + } +} + +fn append_jsonl_line(path: &Path, line: &str) -> std::io::Result<()> { + let mut file = fs::OpenOptions::new() + .create(true) + .append(true) + .open(path)?; + writeln!(file, "{line}")?; + Ok(()) +} + +fn load_record_with_idempotency_key(path: &Path, idempotency_key: &str) -> Option { + let payload = fs::read_to_string(path).ok()?; + for line in payload.lines() { + let parsed = serde_json::from_str::(line).ok()?; + let Some(existing_key) = parsed + .get("idempotency_key") + .and_then(serde_json::Value::as_str) + else { + continue; + }; + if existing_key == idempotency_key { + return Some(existing_key.to_string()); + } + } + + None +} + +fn classify_persistence_error_class_from_io(error: &std::io::Error) -> PersistenceErrorClass { + match error.kind() { + std::io::ErrorKind::Interrupted + | std::io::ErrorKind::WouldBlock + | 
std::io::ErrorKind::TimedOut + | std::io::ErrorKind::ConnectionRefused + | std::io::ErrorKind::ConnectionReset + | std::io::ErrorKind::ConnectionAborted + | std::io::ErrorKind::NotConnected => PersistenceErrorClass::Transient, + _ => PersistenceErrorClass::Permanent, + } +} + +fn classify_persistence_error_class_from_stderr(stderr: &str) -> PersistenceErrorClass { + let lowered = stderr.to_ascii_lowercase(); + if lowered.contains("timed out") + || lowered.contains("temporar") + || lowered.contains("try again") + || lowered.contains("index.lock") + { + return PersistenceErrorClass::Transient; + } + + PersistenceErrorClass::Permanent +} + +fn persistence_target_label(target: &PersistenceTarget) -> &'static str { + match target { + PersistenceTarget::Notes => "notes", + PersistenceTarget::Database => "database", + } +} + +fn persistence_target_from_label(label: &str) -> Option<PersistenceTarget> { + match label { + "notes" => Some(PersistenceTarget::Notes), + "database" => Some(PersistenceTarget::Database), + _ => None, + } +} + +fn serialize_note_payload(note: &TraceNote) -> Result<String> { + serde_json::to_string_pretty(&serde_json::json!({ + "content_type": note.content_type, + "record": trace_record_to_json(&note.record), + })) + .context("Failed to serialize trace note payload") +} + +fn trace_record_to_json(record: &AgentTraceRecord) -> serde_json::Value { + serde_json::json!({ + "version": record.version, + "id": record.id, + "timestamp": record.timestamp, + "vcs": { + "type": record.vcs.r#type, + "revision": record.vcs.revision, + }, + "files": record.files.iter().map(|file| { + serde_json::json!({ + "path": file.path, + "conversations": file.conversations.iter().map(|conversation| { + serde_json::json!({ + "url": conversation.url, + "related": conversation.related, + "ranges": conversation.ranges.iter().map(|range| { + serde_json::json!({ + "start_line": range.start_line, + "end_line": range.end_line, + "contributor": { + "type": range.contributor.r#type, + "model_id":
range.contributor.model_id, + }, + }) + }).collect::>(), + }) + }).collect::>(), + }) + }).collect::>(), + "metadata": record.metadata, + }) +} + +fn trace_record_from_json(value: &serde_json::Value) -> Result { + let version = value + .get("version") + .and_then(serde_json::Value::as_str) + .unwrap_or(TRACE_VERSION) + .to_string(); + let id = value + .get("id") + .and_then(serde_json::Value::as_str) + .context("trace record JSON missing id")? + .to_string(); + let timestamp = value + .get("timestamp") + .and_then(serde_json::Value::as_str) + .context("trace record JSON missing timestamp")? + .to_string(); + + let vcs = value + .get("vcs") + .and_then(serde_json::Value::as_object) + .context("trace record JSON missing vcs object")?; + let vcs_type = vcs + .get("type") + .and_then(serde_json::Value::as_str) + .unwrap_or(VCS_TYPE_GIT) + .to_string(); + let vcs_revision = vcs + .get("revision") + .and_then(serde_json::Value::as_str) + .context("trace record JSON missing vcs.revision")? + .to_string(); + + let files_json = value + .get("files") + .and_then(serde_json::Value::as_array) + .cloned() + .unwrap_or_default(); + let mut files = Vec::new(); + for file in files_json { + let Some(path) = file.get("path").and_then(serde_json::Value::as_str) else { + continue; + }; + let mut conversations = Vec::new(); + for conversation in file + .get("conversations") + .and_then(serde_json::Value::as_array) + .into_iter() + .flatten() + { + let Some(url) = conversation.get("url").and_then(serde_json::Value::as_str) else { + continue; + }; + let related = conversation + .get("related") + .and_then(serde_json::Value::as_array) + .into_iter() + .flatten() + .filter_map(|value| value.as_str().map(ToString::to_string)) + .collect::>(); + let mut ranges = Vec::new(); + for range in conversation + .get("ranges") + .and_then(serde_json::Value::as_array) + .into_iter() + .flatten() + { + let Some(start_line) = range + .get("start_line") + .and_then(serde_json::Value::as_u64) + 
.map(|value| value as u32) + else { + continue; + }; + let Some(end_line) = range + .get("end_line") + .and_then(serde_json::Value::as_u64) + .map(|value| value as u32) + else { + continue; + }; + let contributor = range + .get("contributor") + .and_then(serde_json::Value::as_object) + .cloned() + .unwrap_or_default(); + ranges.push(AgentTraceRange { + start_line, + end_line, + contributor: AgentTraceContributor { + r#type: contributor + .get("type") + .and_then(serde_json::Value::as_str) + .unwrap_or("unknown") + .to_string(), + model_id: contributor + .get("model_id") + .and_then(serde_json::Value::as_str) + .map(ToString::to_string), + }, + }); + } + + conversations.push(AgentTraceConversation { + url: url.to_string(), + related, + ranges, + }); + } + + files.push(AgentTraceFile { + path: path.to_string(), + conversations, + }); + } + + let metadata = value + .get("metadata") + .and_then(serde_json::Value::as_object) + .map(|map| { + map.iter() + .filter_map(|(key, value)| { + value.as_str().map(|value| (key.clone(), value.to_string())) + }) + .collect::>() + }) + .unwrap_or_default(); + + Ok(AgentTraceRecord { + version, + id, + timestamp, + vcs: AgentTraceVcs { + r#type: vcs_type, + revision: vcs_revision, + }, + files, + metadata, + }) +} + #[derive(Clone, Debug, Eq, PartialEq)] pub enum PostCommitNoOpReason { Disabled, diff --git a/cli/src/services/hooks/tests.rs b/cli/src/services/hooks/tests.rs index 5db93b1b..30c6b10a 100644 --- a/cli/src/services/hooks/tests.rs +++ b/cli/src/services/hooks/tests.rs @@ -16,17 +16,18 @@ use super::{ finalize_pre_commit_checkpoint, finalize_rewrite_trace, parse_hooks_subcommand, process_trace_retry_queue, resolve_pre_commit_checkpoint_path, run_commit_msg_subcommand_in_repo, run_hooks_subcommand, run_placeholder_hooks, - run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, GeneratedRegionEvent, - GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, HookSubcommand, - PendingCheckpoint, PendingFileCheckpoint, 
PendingLineRange, PersistenceErrorClass, - PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, - PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, - PostRewriteFinalization, PostRewriteNoOpReason, PostRewriteRuntimeState, PreCommitFinalization, - PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, RetryMetricsSink, - RetryProcessingMetric, RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, - RewriteTraceFinalization, RewriteTraceInput, RewriteTraceNoOpReason, TraceEmissionLedger, - TraceNote, TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry, - CANONICAL_SCE_COAUTHOR_TRAILER, POST_COMMIT_PARENT_SHA_METADATA_KEY, + run_post_commit_subcommand_in_repo, run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, + GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, + HookSubcommand, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, + PersistenceErrorClass, PersistenceFailure, PersistenceTarget, PersistenceWriteResult, + PlaceholderHookService, PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, + PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, + PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, + PreCommitTreeAnchors, RetryMetricsSink, RetryProcessingMetric, RewriteMethod, + RewriteRemapIngestion, RewriteRemapRequest, RewriteTraceFinalization, RewriteTraceInput, + RewriteTraceNoOpReason, TraceEmissionLedger, TraceNote, TraceNotesWriter, TraceRecordStore, + TraceRetryQueue, TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, + POST_COMMIT_PARENT_SHA_METADATA_KEY, }; fn run_git_in_repo(repo: &Path, args: &[&str]) -> Result<()> { @@ -48,6 +49,35 @@ fn run_git_in_repo(repo: &Path, args: &[&str]) -> Result<()> { ) } +fn run_git_output_in_repo(repo: &Path, args: &[&str]) -> Result<String> { + let output = 
Command::new("git").args(args).current_dir(repo).output()?; + if !output.status.success() { + let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string(); + anyhow::bail!( + "git {:?} failed in '{}': {}", + args, + repo.display(), + if stderr.is_empty() { + "git command exited non-zero".to_string() + } else { + stderr + } + ); + } + + Ok(String::from_utf8_lossy(&output.stdout).trim().to_string()) +} + +fn resolve_git_path_in_repo(repo: &Path, git_path: &str) -> Result<PathBuf> { + let resolved = run_git_output_in_repo(repo, &["rev-parse", "--git-path", git_path])?; + let path = PathBuf::from(resolved); + if path.is_absolute() { + return Ok(path); + } + + Ok(repo.join(path)) +} + fn create_temp_repo() -> Result { let unique = format!( "sce-hooks-tests-{}-{}", @@ -991,6 +1021,62 @@ fn commit_msg_runtime_noops_when_staged_attribution_checkpoint_missing() -> Resu Ok(()) } +#[test] +fn post_commit_runtime_persists_notes_and_local_record_store() -> Result<()> { + let repo = create_temp_repo()?; + let tracked_file = repo.join("src").join("lib.rs"); + fs::create_dir_all( + tracked_file + .parent() + .expect("tracked file path should have parent"), + )?; + fs::write(&tracked_file, "one\ntwo\n")?; + run_git_in_repo(&repo, &["add", "."])?; + run_git_in_repo(&repo, &["commit", "-m", "initial"])?; + + fs::write(&tracked_file, "one\ntwo\nthree\n")?; + run_git_in_repo(&repo, &["add", "src/lib.rs"])?; + run_git_in_repo(&repo, &["commit", "-m", "feat: update file"])?; + write_staged_checkpoint_artifact(&repo)?; + + let message = run_post_commit_subcommand_in_repo(&repo)?; + assert!(message.contains("post-commit hook finalized trace")); + + let head_sha = run_git_output_in_repo(&repo, &["rev-parse", "--verify", "HEAD"])?; + let note = run_git_output_in_repo( + &repo, + &[ + "notes", + "--ref", + "refs/notes/agent-trace", + "show", + &head_sha, + ], + )?; + let note_json = serde_json::from_str::<serde_json::Value>(&note)?; + assert_eq!( + note_json + .get("content_type") + 
.and_then(serde_json::Value::as_str), + Some("application/vnd.agent-trace.record+json") + ); + assert_eq!( + note_json + .get("record") + .and_then(|record| record.get("metadata")) + .and_then(|metadata| metadata.get("dev.crocoder.sce.notes_ref")) + .and_then(serde_json::Value::as_str), + Some("refs/notes/agent-trace") + ); + + let records_path = resolve_git_path_in_repo(&repo, "sce/trace-records.jsonl")?; + let persisted = fs::read_to_string(records_path)?; + assert!(persisted.contains("post-commit:")); + assert!(persisted.contains("application/vnd.agent-trace.record+json")); + + Ok(()) +} + #[test] fn hooks_placeholder_event_model_reserves_generated_region_tracking() { let service = PlaceholderHookService; diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index 44cebb9b..9d007161 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -68,7 +68,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml commit_msg_policy` - Hook-runtime integration test with on-disk commit message fixture. -- [ ] T05: Implement post-commit production persistence adapters (notes + DB + ledger + queue) (status:todo) +- [x] T05: Implement post-commit production persistence adapters (notes + DB + ledger + queue) (status:done) - Task ID: T05 - Goal: Connect `finalize_post_commit_trace` to concrete production adapters for notes writes, DB writes, emission ledger, and retry queue enqueue. 
- Boundaries (in/out of scope): diff --git a/context/sce/agent-trace-hooks-command-routing.md b/context/sce/agent-trace-hooks-command-routing.md index dcd520e0..a20e2fab 100644 --- a/context/sce/agent-trace-hooks-command-routing.md +++ b/context/sce/agent-trace-hooks-command-routing.md @@ -19,11 +19,12 @@ ## Current runtime entrypoint behavior - `pre-commit`: executes the pre-commit runtime entrypoint and reports staged-checkpoint finalization outcome. - `commit-msg`: validates ``, resolves runtime gates (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, staged checkpoint presence), applies canonical co-author policy, and writes back only when trailer mutation is required. -- `post-commit`: accepts runtime invocation through implemented dispatch entrypoint. +- `post-commit`: resolves runtime guards, builds commit attribution input from git + pre-commit checkpoint artifacts, executes `finalize_post_commit_trace`, writes canonical note payloads to `refs/notes/agent-trace`, persists trace records to git-path JSONL storage (`sce/trace-records.jsonl`), maintains commit-level emission ledger (`sce/trace-emission-ledger.txt`), and enqueues fallback entries (`sce/trace-retry-queue.jsonl`) when a persistence target fails. - `post-rewrite`: reads hook pair input from STDIN, validates pair format through remap finalization parsing, and reports ingested/skipped outcomes. ## Notes for next tasks - T02 established routing and invocation contracts. - T03 implemented pre-commit staged-checkpoint runtime wiring. - T04 implemented commit-msg file IO mutation wiring to canonical co-author policy. -- Post-commit persistence adapters and rewrite-trace persistence remain in `T05+`. +- T05 implemented post-commit persistence adapters and runtime wiring. +- Rewrite-trace production runtime wiring remains in `T07+`. 
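The fail-open post-commit behavior described in the routing notes above (attempt both persistence targets, enqueue a fallback entry for whichever target fails, never abort the commit) can be sketched roughly as follows. This is a minimal illustration, not the code in `cli/src/services/hooks.rs`: `WriteResult`, `Outcome`, and `dual_write` are hypothetical, simplified stand-ins for the patch's types, and treating `AlreadyExists` as a success is an assumption.

```rust
// Simplified, hypothetical stand-ins for the persistence types in the patch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PersistenceTarget {
    Notes,
    Database,
}

#[derive(Debug, PartialEq, Eq)]
enum WriteResult {
    Written,
    AlreadyExists,
    Failed(String),
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Persisted,
    // Carries the explicit list of targets that need a retry-queue entry.
    QueuedFallback(Vec<PersistenceTarget>),
}

/// Attempts both writes and never aborts: a failure on one target does not
/// prevent the other write, it only marks that target for retry.
fn dual_write(
    write_notes: impl FnOnce() -> WriteResult,
    write_db: impl FnOnce() -> WriteResult,
) -> Outcome {
    let mut failed = Vec::new();
    if let WriteResult::Failed(_) = write_notes() {
        failed.push(PersistenceTarget::Notes);
    }
    if let WriteResult::Failed(_) = write_db() {
        failed.push(PersistenceTarget::Database);
    }
    if failed.is_empty() {
        Outcome::Persisted
    } else {
        Outcome::QueuedFallback(failed)
    }
}

fn main() {
    let outcome = dual_write(
        || WriteResult::Written,
        || WriteResult::Failed("database is locked".to_string()),
    );
    println!("{outcome:?}");
}
```

Passing the two writes as closures keeps the sketch independent of git and the database, which is roughly the role the `TraceNotesWriter` and `TraceRecordStore` seams play in the patch.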
diff --git a/context/sce/agent-trace-post-commit-dual-write.md b/context/sce/agent-trace-post-commit-dual-write.md index bb4e6bf2..7f6eef2d 100644 --- a/context/sce/agent-trace-post-commit-dual-write.md +++ b/context/sce/agent-trace-post-commit-dual-write.md @@ -26,6 +26,21 @@ - Any failed target (`PersistenceWriteResult::Failed`) enqueues one retry item via `TraceRetryQueue` with explicit failed target list and returns `QueuedFallback`. - Retry queue entries carry the full trace record, MIME type, notes ref, and failed target list to support replay-safe recovery. +## Local hook runtime adapter wiring +- Runtime entrypoint: `cli/src/services/hooks.rs` -> `run_post_commit_subcommand_in_repo`. +- Runtime input assembly: + - resolves `HEAD` + optional `HEAD^` via git + - derives commit timestamp from `git show -s --format=%cI HEAD` + - derives file attribution from the pre-commit checkpoint artifact first, then falls back to changed-file discovery (`git show --name-only HEAD`) + - derives deterministic idempotency (`post-commit:`) and deterministic UUIDv4 trace IDs from commit/timestamp seed +- Production adapters currently bound in runtime: + - notes adapter: `GitNotesTraceWriter` writes canonical JSON note payloads to `refs/notes/agent-trace` + - local record store adapter: `JsonFileTraceRecordStore` writes trace records to git-path JSONL storage at `sce/trace-records.jsonl` + - emission ledger adapter: `FileTraceEmissionLedger` stores emitted commit SHAs at `sce/trace-emission-ledger.txt` + - retry queue adapter: `JsonFileTraceRetryQueue` appends failed-target fallback entries to `sce/trace-retry-queue.jsonl` +- Runtime posture remains fail-open: operational errors return deterministic skip/fallback messages instead of aborting commit progression. 
+ ## Verification evidence - `cargo test --manifest-path cli/Cargo.toml post_commit_finalization` +- `cargo test --manifest-path cli/Cargo.toml post_commit` - `cargo build --manifest-path cli/Cargo.toml` From b585dde3a0df7a8f47df88934578174b2af67420 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 11:52:09 +0100 Subject: [PATCH 36/39] hooks: Implement persistent local DB bootstrap for post-commit traces Replace git-path JSONL trace record persistence with a Turso-backed persistent local DB target. Ensure runtime path resolution, directory creation, and schema migration bootstrap run before post-commit writes, and extend tests to validate runtime persistence and restart durability. --- cli/src/services/hooks.rs | 266 ++++++++++++++---- cli/src/services/hooks/tests.rs | 41 ++- cli/src/services/local_db.rs | 102 ++++++- context/architecture.md | 2 +- context/context-map.md | 2 +- context/glossary.md | 3 +- context/overview.md | 4 +- context/patterns.md | 1 + .../agent-trace-local-hooks-production-mvp.md | 2 +- .../sce/agent-trace-hooks-command-routing.md | 2 +- .../sce/agent-trace-post-commit-dual-write.md | 5 +- 11 files changed, 346 insertions(+), 84 deletions(-) diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index f26ee264..9c84751b 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -11,8 +11,10 @@ use crate::services::agent_trace::{ build_trace_payload, AgentTraceContributor, AgentTraceConversation, AgentTraceFile, AgentTraceRange, AgentTraceRecord, AgentTraceVcs, ContributorInput, ContributorType, ConversationInput, FileAttributionInput, QualityStatus, RangeInput, RewriteInfo, - TraceAdapterInput, METADATA_IDEMPOTENCY_KEY, TRACE_CONTENT_TYPE, TRACE_VERSION, VCS_TYPE_GIT, + TraceAdapterInput, METADATA_IDEMPOTENCY_KEY, METADATA_QUALITY_STATUS, TRACE_CONTENT_TYPE, + TRACE_VERSION, VCS_TYPE_GIT, }; +use crate::services::local_db::ensure_agent_trace_local_db_ready_blocking; pub const NAME: &str = "hooks"; pub 
const CANONICAL_SCE_COAUTHOR_TRAILER: &str = "Co-authored-by: SCE "; @@ -622,8 +624,9 @@ fn run_post_commit_subcommand_in_repo(repository_root: &Path) -> Result let mut notes_writer = GitNotesTraceWriter { repository_root: repository_root.to_path_buf(), }; - let mut record_store = JsonFileTraceRecordStore { - path: runtime_paths.trace_records_path, + let mut record_store = LocalDbTraceRecordStore { + repository_root: repository_root.to_path_buf(), + db_path: runtime_paths.local_db_path, }; let mut retry_queue = JsonFileTraceRetryQueue { path: runtime_paths.retry_queue_path, @@ -675,18 +678,18 @@ fn resolve_post_commit_runtime_state(repository_root: &Path) -> PostCommitRuntim } struct PostCommitRuntimePaths { - trace_records_path: PathBuf, + local_db_path: PathBuf, retry_queue_path: PathBuf, emission_ledger_path: PathBuf, } fn resolve_post_commit_runtime_paths(repository_root: &Path) -> Result { - let trace_records_path = resolve_git_path(repository_root, "sce/trace-records.jsonl")?; + let local_db_path = ensure_agent_trace_local_db_ready_blocking()?; let retry_queue_path = resolve_git_path(repository_root, "sce/trace-retry-queue.jsonl")?; let emission_ledger_path = resolve_git_path(repository_root, "sce/trace-emission-ledger.txt")?; Ok(PostCommitRuntimePaths { - trace_records_path, + local_db_path, retry_queue_path, emission_ledger_path, }) @@ -1151,56 +1154,204 @@ impl TraceNotesWriter for GitNotesTraceWriter { } } -struct JsonFileTraceRecordStore { - path: PathBuf, +struct LocalDbTraceRecordStore { + repository_root: PathBuf, + db_path: PathBuf, } -impl TraceRecordStore for JsonFileTraceRecordStore { +impl TraceRecordStore for LocalDbTraceRecordStore { fn write_trace_record(&mut self, record: PersistedTraceRecord) -> PersistenceWriteResult { - if let Some(existing_key) = - load_record_with_idempotency_key(&self.path, &record.idempotency_key) - { - if existing_key == record.idempotency_key { - return PersistenceWriteResult::AlreadyExists; - } - } - - if let 
Some(parent) = self.path.parent() { - if let Err(error) = fs::create_dir_all(parent) { + let runtime = match tokio::runtime::Builder::new_current_thread().build() { + Ok(runtime) => runtime, + Err(error) => { return PersistenceWriteResult::Failed(PersistenceFailure { - class: classify_persistence_error_class_from_io(&error), - message: format!( - "failed to create trace record store directory '{}': {}", - parent.display(), - error - ), - }); + class: PersistenceErrorClass::Permanent, + message: format!("failed to initialize local DB runtime: {error}"), + }) } - } - - let line = serde_json::json!({ - "commit_sha": record.commit_sha, - "idempotency_key": record.idempotency_key, - "content_type": record.content_type, - "notes_ref": record.notes_ref, - "record": trace_record_to_json(&record.record), - }) - .to_string(); + }; - match append_jsonl_line(&self.path, &line) { - Ok(()) => PersistenceWriteResult::Written, + match runtime.block_on(write_trace_record_to_local_db( + &self.db_path, + &self.repository_root, + &record, + )) { + Ok(written) => { + if written { + PersistenceWriteResult::Written + } else { + PersistenceWriteResult::AlreadyExists + } + } Err(error) => PersistenceWriteResult::Failed(PersistenceFailure { - class: classify_persistence_error_class_from_io(&error), + class: classify_persistence_error_class_from_message(&error.to_string()), message: format!( - "failed to write trace record into local JSON store '{}': {}", - self.path.display(), - error + "failed to persist trace record in local DB '{}': {error}", + self.db_path.display() ), }), } } } +async fn write_trace_record_to_local_db( + db_path: &Path, + repository_root: &Path, + record: &PersistedTraceRecord, +) -> Result<bool> { + let location = db_path.to_str().ok_or_else(|| { + anyhow::anyhow!("Local DB path must be valid UTF-8: {}", db_path.display()) + })?; + let db = turso::Builder::new_local(location).build().await?; + let conn = db.connect()?; + conn.execute("PRAGMA foreign_keys = ON", ()).await?; 
+ + let canonical_root = repository_root + .canonicalize() + .unwrap_or_else(|_| repository_root.to_path_buf()) + .to_string_lossy() + .to_string(); + + conn.execute( + "INSERT OR IGNORE INTO repositories (canonical_root) VALUES (?1)", + [canonical_root.as_str()], + ) + .await?; + + let repository_id = { + let mut rows = conn + .query( + "SELECT id FROM repositories WHERE canonical_root = ?1 LIMIT 1", + [canonical_root.as_str()], + ) + .await?; + let row = rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("repository id query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("repository id query returned non-integer"))? + }; + + conn.execute( + "INSERT OR IGNORE INTO commits (repository_id, commit_sha, idempotency_key) VALUES (?1, ?2, ?3)", + ( + repository_id, + record.commit_sha.as_str(), + record.idempotency_key.as_str(), + ), + ) + .await?; + + let commit_id = { + let mut rows = conn + .query( + "SELECT id FROM commits WHERE repository_id = ?1 AND commit_sha = ?2 LIMIT 1", + (repository_id, record.commit_sha.as_str()), + ) + .await?; + let row = rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("commit id query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("commit id query returned non-integer"))? + }; + + let existing = { + let mut rows = conn + .query( + "SELECT COUNT(*) FROM trace_records WHERE repository_id = ?1 AND (commit_id = ?2 OR idempotency_key = ?3)", + (repository_id, commit_id, record.idempotency_key.as_str()), + ) + .await?; + let row = rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("existing trace count query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("existing trace count query returned non-integer"))? 
+ }; + if existing > 0 { + return Ok(false); + } + + let payload_json = serde_json::to_string(&trace_record_to_json(&record.record)) + .context("failed to serialize trace record JSON payload")?; + let quality_status = record + .record + .metadata + .get(METADATA_QUALITY_STATUS) + .cloned() + .unwrap_or_else(|| "final".to_string()); + + conn.execute( + "INSERT INTO trace_records (repository_id, commit_id, trace_id, version, content_type, notes_ref, payload_json, quality_status, idempotency_key, recorded_at) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10)", + ( + repository_id, + commit_id, + record.record.id.as_str(), + record.record.version.as_str(), + record.content_type.as_str(), + record.notes_ref.as_str(), + payload_json.as_str(), + quality_status.as_str(), + record.idempotency_key.as_str(), + record.record.timestamp.as_str(), + ), + ) + .await?; + + let trace_record_id = { + let mut rows = conn + .query( + "SELECT id FROM trace_records WHERE trace_id = ?1 LIMIT 1", + [record.record.id.as_str()], + ) + .await?; + let row = rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("trace record id query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("trace record id query returned non-integer"))? 
+ }; + + for file in &record.record.files { + for conversation in &file.conversations { + for range in &conversation.ranges { + conn.execute( + "INSERT INTO trace_ranges (trace_record_id, file_path, conversation_url, start_line, end_line, contributor_type, contributor_model_id) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)", + ( + trace_record_id, + file.path.as_str(), + conversation.url.as_str(), + i64::from(range.start_line), + i64::from(range.end_line), + range.contributor.r#type.as_str(), + range.contributor.model_id.as_deref(), + ), + ) + .await?; + } + } + } + + Ok(true) +} + struct JsonFileTraceRetryQueue { path: PathBuf, } @@ -1345,24 +1496,6 @@ fn append_jsonl_line(path: &Path, line: &str) -> std::io::Result<()> { Ok(()) } -fn load_record_with_idempotency_key(path: &Path, idempotency_key: &str) -> Option { - let payload = fs::read_to_string(path).ok()?; - for line in payload.lines() { - let parsed = serde_json::from_str::(line).ok()?; - let Some(existing_key) = parsed - .get("idempotency_key") - .and_then(serde_json::Value::as_str) - else { - continue; - }; - if existing_key == idempotency_key { - return Some(existing_key.to_string()); - } - } - - None -} - fn classify_persistence_error_class_from_io(error: &std::io::Error) -> PersistenceErrorClass { match error.kind() { std::io::ErrorKind::Interrupted @@ -1389,6 +1522,19 @@ fn classify_persistence_error_class_from_stderr(stderr: &str) -> PersistenceErro PersistenceErrorClass::Permanent } +fn classify_persistence_error_class_from_message(message: &str) -> PersistenceErrorClass { + let lowered = message.to_ascii_lowercase(); + if lowered.contains("locked") + || lowered.contains("timed out") + || lowered.contains("temporar") + || lowered.contains("try again") + { + return PersistenceErrorClass::Transient; + } + + PersistenceErrorClass::Permanent +} + fn persistence_target_label(target: &PersistenceTarget) -> &'static str { match target { PersistenceTarget::Notes => "notes", diff --git a/cli/src/services/hooks/tests.rs 
b/cli/src/services/hooks/tests.rs index 30c6b10a..54c432fa 100644 --- a/cli/src/services/hooks/tests.rs +++ b/cli/src/services/hooks/tests.rs @@ -10,6 +10,7 @@ use crate::services::agent_trace::{ FileAttributionInput, QualityStatus, RangeInput, TraceAdapterInput, METADATA_QUALITY_STATUS, METADATA_REWRITE_CONFIDENCE, METADATA_REWRITE_FROM, METADATA_REWRITE_METHOD, }; +use crate::services::local_db::resolve_agent_trace_local_db_path; use super::{ apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, @@ -68,16 +69,6 @@ fn run_git_output_in_repo(repo: &Path, args: &[&str]) -> Result { Ok(String::from_utf8_lossy(&output.stdout).trim().to_string()) } -fn resolve_git_path_in_repo(repo: &Path, git_path: &str) -> Result { - let resolved = run_git_output_in_repo(repo, &["rev-parse", "--git-path", git_path])?; - let path = PathBuf::from(resolved); - if path.is_absolute() { - return Ok(path); - } - - Ok(repo.join(path)) -} - fn create_temp_repo() -> Result { let unique = format!( "sce-hooks-tests-{}-{}", @@ -1069,10 +1060,32 @@ fn post_commit_runtime_persists_notes_and_local_record_store() -> Result<()> { Some("refs/notes/agent-trace") ); - let records_path = resolve_git_path_in_repo(&repo, "sce/trace-records.jsonl")?; - let persisted = fs::read_to_string(records_path)?; - assert!(persisted.contains("post-commit:")); - assert!(persisted.contains("application/vnd.agent-trace.record+json")); + let db_path = resolve_agent_trace_local_db_path()?; + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + let persisted_count = runtime.block_on(async { + let location = db_path + .to_str() + .ok_or_else(|| anyhow::anyhow!("test DB path must be UTF-8"))?; + let db = turso::Builder::new_local(location).build().await?; + let conn = db.connect()?; + let mut rows = conn + .query( + "SELECT COUNT(*) FROM trace_records tr JOIN commits c ON c.id = tr.commit_id WHERE c.commit_sha = ?1", + [head_sha.as_str()], + ) + .await?; + let row = rows 
+ .next() + .await? + .ok_or_else(|| anyhow::anyhow!("trace record count query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("trace record count query returned non-integer")) + })?; + + assert_eq!(persisted_count, 1); Ok(()) } diff --git a/cli/src/services/local_db.rs b/cli/src/services/local_db.rs index d85b142b..c28f8c1a 100644 --- a/cli/src/services/local_db.rs +++ b/cli/src/services/local_db.rs @@ -1,6 +1,6 @@ -use std::path::Path; +use std::path::{Path, PathBuf}; -use anyhow::{anyhow, ensure, Result}; +use anyhow::{anyhow, ensure, Context, Result}; use turso::Builder; const CORE_SCHEMA_STATEMENTS: &[&str] = &[ @@ -143,6 +143,29 @@ pub struct CoreSchemaMigrationOutcome { pub executed_statements: usize, } +pub fn resolve_agent_trace_local_db_path() -> Result { + let state_root = resolve_state_data_root()?; + Ok(state_root.join("sce").join("agent-trace").join("local.db")) +} + +pub fn ensure_agent_trace_local_db_ready_blocking() -> Result { + let db_path = resolve_agent_trace_local_db_path()?; + if let Some(parent) = db_path.parent() { + std::fs::create_dir_all(parent).with_context(|| { + format!( + "Failed to create Agent Trace local DB directory '{}'.", + parent.display() + ) + })?; + } + + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + runtime.block_on(apply_core_schema_migrations(LocalDatabaseTarget::Path( + &db_path, + )))?; + Ok(db_path) +} + async fn connect_local(target: LocalDatabaseTarget<'_>) -> Result { let location = target_location(target)?; let db = Builder::new_local(location).build().await?; @@ -160,6 +183,52 @@ fn target_location(target: LocalDatabaseTarget<'_>) -> Result<&str> { } } +fn resolve_state_data_root() -> Result { + #[cfg(target_os = "windows")] + { + if let Some(local_app_data) = std::env::var_os("LOCALAPPDATA") { + return Ok(PathBuf::from(local_app_data)); + } + if let Some(app_data) = std::env::var_os("APPDATA") { + return 
Ok(PathBuf::from(app_data)); + } + } + + #[cfg(target_os = "macos")] + { + return Ok(resolve_home_dir()? + .join("Library") + .join("Application Support")); + } + + #[cfg(target_os = "linux")] + { + if let Some(xdg_state_home) = std::env::var_os("XDG_STATE_HOME") { + return Ok(PathBuf::from(xdg_state_home)); + } + return Ok(resolve_home_dir()?.join(".local").join("state")); + } + + #[cfg(not(any(target_os = "windows", target_os = "macos", target_os = "linux")))] + { + Ok(resolve_home_dir()?.join(".local").join("state")) + } +} + +fn resolve_home_dir() -> Result { + if let Some(home) = std::env::var_os("HOME") { + return Ok(PathBuf::from(home)); + } + + if let Some(user_profile) = std::env::var_os("USERPROFILE") { + return Ok(PathBuf::from(user_profile)); + } + + Err(anyhow!( + "Unable to resolve home directory from HOME or USERPROFILE environment variables" + )) +} + pub async fn apply_core_schema_migrations( target: LocalDatabaseTarget<'_>, ) -> Result { @@ -478,4 +547,33 @@ mod tests { Ok(()) } + + #[test] + fn persistent_target_survives_process_restart() -> Result<()> { + let temp = TestTempDir::new("sce-persistent-local-db-tests")?; + let path = temp.path().join("persistent.db"); + + { + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + runtime.block_on(apply_core_schema_migrations(LocalDatabaseTarget::Path( + &path, + )))?; + runtime.block_on(async { + let conn = super::connect_local(LocalDatabaseTarget::Path(&path)).await?; + conn.execute( + "INSERT INTO repositories (canonical_root) VALUES (?1)", + ["/tmp/restart-proof-repo"], + ) + .await?; + Ok::<(), anyhow::Error>(()) + })?; + } + + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + let repository_rows = + runtime.block_on(repository_count(LocalDatabaseTarget::Path(&path)))?; + assert_eq!(repository_rows, 1); + + Ok(()) + } } diff --git a/context/architecture.md b/context/architecture.md index e8fbb2f0..d36251b7 100644 --- a/context/architecture.md +++ 
b/context/architecture.md @@ -73,7 +73,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/main.rs` is the executable entrypoint (`sce`) and delegates to `app::run`. - `cli/src/app.rs` provides a `lexopt`-based argument parser and dispatch loop with deterministic help, setup installation execution, and consistent `anyhow`-driven error exits. - `cli/src/command_surface.rs` is the source of truth for top-level command contract metadata (`help`, `setup`, `doctor`, `mcp`, `hooks`, `sync`) and explicit implemented-vs-placeholder status. -- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`), reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), and T14 retry/observability storage (`trace_retry_queue`, `reconciliation_metrics`) with replay/query indexes. +- `cli/src/services/local_db.rs` provides the local Turso data adapter, including `Builder::new_local(...)` initialization, deterministic persistent runtime DB target resolution/bootstrap (`ensure_agent_trace_local_db_ready_blocking`), async execute/query smoke checks for in-memory and file-backed targets, and idempotent migration application for Agent Trace persistence foundations (`repositories`, `commits`, `trace_records`, `trace_ranges`), reconciliation ingestion entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), and T14 retry/observability storage (`trace_retry_queue`, `reconciliation_metrics`) with replay/query indexes. - `cli/src/test_support.rs` provides a shared test-only temp-directory helper (`TestTempDir`) used by service tests that need filesystem fixtures. 
- `cli/src/services/setup.rs` defines the setup command contract (`SetupMode`, `SetupTarget`, CLI flag parser/validator), an `inquire`-backed interactive target prompter (`InquireSetupTargetPrompter`), setup dispatch outcomes (proceed/cancelled), compile-time embedded asset access (`EmbeddedAsset`, target-scoped iterators, required-hook asset iterators/lookups) generated by `cli/build.rs` from `config/.opencode/**`, `config/.claude/**`, and `cli/assets/hooks/**`, a target-scoped install engine/orchestrator that stages embedded files, performs backup-and-replace with rollback restoration on swap failure, and formats deterministic completion messaging, plus required-hook install orchestration (`install_required_git_hooks`) and command-surface hook mode helpers (`run_setup_hooks`, `resolve_setup_hooks_repository`) used by `sce setup --hooks [--repo ]` with deterministic option compatibility validation and per-hook outcome messaging. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. 
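The persistent DB path policy that `cli/src/services/local_db.rs` applies on Linux (prefer `XDG_STATE_HOME`, otherwise fall back to `~/.local/state`, then append `sce/agent-trace/local.db`) can be sketched as a pure function. This is an illustrative sketch only: `linux_local_db_path` is a hypothetical name, and it takes the environment values as string arguments for testability, whereas the patch reads them with `std::env::var_os`.

```rust
use std::path::PathBuf;

/// Hypothetical helper mirroring the Linux branch of the patch's state-root
/// resolution. Returns `None` when neither variable is available.
fn linux_local_db_path(xdg_state_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    let state_root = match xdg_state_home {
        // An explicit XDG_STATE_HOME wins over the HOME-based default.
        Some(dir) => PathBuf::from(dir),
        // Otherwise fall back to ~/.local/state; `?` yields None if HOME is unset.
        None => PathBuf::from(home?).join(".local").join("state"),
    };
    Some(state_root.join("sce").join("agent-trace").join("local.db"))
}

fn main() {
    let xdg = std::env::var("XDG_STATE_HOME").ok();
    let home = std::env::var("HOME").ok();
    match linux_local_db_path(xdg.as_deref(), home.as_deref()) {
        Some(path) => println!("{}", path.display()),
        None => println!("could not resolve a state directory"),
    }
}
```

Keeping the resolution free of direct environment access makes the fallback order easy to test without mutating process-wide state.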
diff --git a/context/context-map.md b/context/context-map.md index 4cd5b679..96943335 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -21,7 +21,7 @@ Feature/domain context: - `context/sce/agent-trace-payload-builder-validation.md` (T03 deterministic payload-builder path, model-id normalization behavior, and Agent Trace schema validation suite) - `context/sce/agent-trace-pre-commit-staged-checkpoint.md` (T04 pre-commit staged-only finalization contract with no-op guards and index/tree anchor capture) - `context/sce/agent-trace-commit-msg-coauthor-policy.md` (T05 commit-msg canonical co-author trailer policy with env-gated injection and idempotent dedupe) -- `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) +- `context/sce/agent-trace-post-commit-dual-write.md` (T06 post-commit trace finalization contract, persistent local DB bootstrap/path policy, notes+DB dual-write behavior, idempotency ledger guard, and retry-queue fallback semantics) - `context/sce/agent-trace-hook-doctor.md` (T07 `sce doctor` hook install/health validation contract for default, per-repo, and global hook-path rollout) - `context/sce/setup-githooks-install-contract.md` (T01 canonical `sce setup --hooks` install contract for target-path resolution, idempotent outcomes, backup/rollback, and doctor-readiness alignment) - `context/sce/setup-githooks-hook-asset-packaging.md` (T02 compile-time `sce setup --hooks` required-hook template packaging contract and setup-service accessor surface) diff --git a/context/glossary.md b/context/glossary.md index a37c7c84..08885867 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -61,7 +61,8 @@ - `agent trace schema validation suite`: T03 compliance test slice in `services::agent_trace::tests` that validates payload JSON against the published Agent Trace trace-record schema with draft-2020-12 
format checks enabled (`uri`, `date-time`, `uuid`) and a local version-pattern compatibility patch for `0.1.0`. - `agent trace pre-commit staged checkpoint finalization`: T04 contract in `cli/src/services/hooks.rs` (`finalize_pre_commit_checkpoint`) that filters pending attribution to staged ranges only, drops unstaged-only files, captures index/head tree anchors, and returns explicit no-op outcomes when SCE is disabled, CLI is unavailable, or the repository is bare. - `agent trace commit-msg co-author policy`: T05 contract in `cli/src/services/hooks.rs` (`apply_commit_msg_coauthor_policy`) that applies exactly one canonical trailer (`Co-authored-by: SCE `) only when SCE is enabled, co-author policy is enabled, and staged SCE attribution exists; duplicate canonical trailers are deduped idempotently. -- `agent trace post-commit dual-write finalization`: T06 contract in `cli/src/services/hooks.rs` (`finalize_post_commit_trace`) that emits one canonical Agent Trace record per commit behind runtime guards, writes to both notes (`refs/notes/agent-trace`) and DB persistence targets, and enqueues retry fallback entries when either persistence target fails. +- `agent trace post-commit dual-write finalization`: T06 contract in `cli/src/services/hooks.rs` (`finalize_post_commit_trace`) that emits one canonical Agent Trace record per commit behind runtime guards, writes to both notes (`refs/notes/agent-trace`) and DB persistence targets, enqueues retry fallback entries when either persistence target fails, and relies on runtime bootstrap of the persistent local DB target before DB writes. 
+- `agent trace local DB runtime bootstrap`: T06 runtime policy in `cli/src/services/local_db.rs` (`ensure_agent_trace_local_db_ready_blocking`) that resolves the deterministic per-user state path (`${XDG_STATE_HOME:-~/.local/state}/sce/agent-trace/local.db` on Linux; platform-equivalent user state root elsewhere), creates parent directories, and applies `apply_core_schema_migrations` before post-commit DB persistence. - `agent trace post-commit idempotency ledger`: T06 seam (`TraceEmissionLedger`) in `cli/src/services/hooks.rs` used to prevent duplicate emission for the same commit SHA and to mark successful dual-write completion. - `sce doctor` hook rollout validation: Implemented CLI command in `cli/src/services/doctor.rs` (`run_doctor`) that reports readiness for local Agent Trace rollout by resolving hook-path source (default `.git/hooks`, per-repo `core.hooksPath`, or global `core.hooksPath`) and validating required hook presence plus executable permissions. - `agent trace post-rewrite local remap ingestion`: T08 contract in `cli/src/services/hooks.rs` (`finalize_post_rewrite_remap`) that parses git `post-rewrite` old/new SHA pairs, captures normalized rewrite method (`amend`, `rebase`, or lowercase passthrough), derives deterministic `post-rewrite:::` idempotency keys, and dispatches replay-safe remap ingestion requests. diff --git a/context/overview.md b/context/overview.md index ddd8c617..880ce3a2 100644 --- a/context/overview.md +++ b/context/overview.md @@ -28,7 +28,7 @@ The CLI now includes a task-scoped Agent Trace schema adapter contract in `cli/s The Agent Trace service now also provides a deterministic payload-builder path (`build_trace_payload`) with AI `model_id` normalization and schema-compliance validation coverage documented in `context/sce/agent-trace-payload-builder-validation.md`. 
The hooks service now includes a pre-commit staged checkpoint finalization contract (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution, captures index/tree anchors, and no-ops for disabled/unavailable/bare-repo runtime states; this behavior is documented in `context/sce/agent-trace-pre-commit-staged-checkpoint.md`. The hooks service now also exposes a `commit-msg` co-author trailer policy (`apply_commit_msg_coauthor_policy`) that conditionally injects exactly one canonical SCE trailer based on `SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, and staged-attribution presence, with idempotent deduplication behavior documented in `context/sce/agent-trace-commit-msg-coauthor-policy.md`. -The hooks service now also includes a post-commit trace finalization seam (`finalize_post_commit_trace`) that builds canonical Agent Trace payloads, enforces commit-level idempotency guards, performs notes + DB dual writes, and enqueues retry fallback metadata when persistence targets fail; this behavior is documented in `context/sce/agent-trace-post-commit-dual-write.md`. +The hooks service now also includes a post-commit trace finalization seam (`finalize_post_commit_trace`) that builds canonical Agent Trace payloads, enforces commit-level idempotency guards, performs notes + DB dual writes, and enqueues retry fallback metadata when persistence targets fail; post-commit runtime now also enforces persistent local DB readiness (`.../sce/agent-trace/local.db`) with automatic schema bootstrap before DB writes, documented in `context/sce/agent-trace-post-commit-dual-write.md`. The CLI now also includes a hook rollout doctor contract documented in `context/sce/agent-trace-hook-doctor.md`. 
The hooks service now also includes a post-rewrite local remap ingestion seam (`finalize_post_rewrite_remap`) that parses `post-rewrite` old->new SHA pairs, normalizes rewrite method capture, and derives deterministic per-pair idempotency keys before remap dispatch; this behavior is documented in `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md`. The hooks service now also includes rewrite trace transformation finalization (`finalize_rewrite_trace`) that materializes rewritten-SHA Agent Trace records with `rewrite_from`/`rewrite_method`/`rewrite_confidence` metadata, confidence-threshold quality mapping (`final`/`partial`/`needs_review`), and notes+DB persistence parity with retry fallback; this behavior is documented in `context/sce/agent-trace-rewrite-trace-transformation.md`. @@ -96,7 +96,7 @@ Lightweight post-task verification baseline (required after each completed task) - Use `context/sce/agent-trace-payload-builder-validation.md` for the implemented T03 builder path, normalization policy, and schema-validation behavior. - Use `context/sce/agent-trace-pre-commit-staged-checkpoint.md` for the implemented T04 pre-commit staged-only finalization contract and runtime no-op guards. - Use `context/sce/agent-trace-commit-msg-coauthor-policy.md` for the implemented T05 commit-msg canonical co-author trailer policy and idempotent dedupe behavior. -- Use `context/sce/agent-trace-post-commit-dual-write.md` for the implemented T06 post-commit trace finalization and dual-write + queue-fallback behavior. +- Use `context/sce/agent-trace-post-commit-dual-write.md` for the implemented T06 post-commit trace finalization and dual-write + queue-fallback behavior, including persistent local DB path/bootstrap policy for runtime writes. - Use `context/sce/agent-trace-hook-doctor.md` for the implemented T07 hook install and health validation behavior (`sce doctor`) across default/per-repo/global hook-path installs. 
- Use `context/sce/agent-trace-post-rewrite-local-remap-ingestion.md` for the implemented T08 post-rewrite local remap ingestion pipeline (`post-rewrite` pair parsing, rewrite-method normalization, and deterministic idempotency-key derivation). - Use `context/sce/agent-trace-rewrite-trace-transformation.md` for the implemented T09 rewritten-SHA trace transformation path (`finalize_rewrite_trace`), confidence-based quality status mapping, and rewrite metadata persistence semantics. diff --git a/context/patterns.md b/context/patterns.md index 55a10870..87490f8d 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -94,6 +94,7 @@ - For post-rewrite remap ingestion seams, parse ` ` pairs from hook input strictly, ignore empty/no-op self-mapping rows, normalize rewrite method labels to lowercase (`amend`/`rebase` when recognized), and derive deterministic per-pair idempotency keys before dispatching remap requests. - For rewrite trace transformation seams, materialize rewritten records through the canonical Agent Trace builder path, require finite confidence in `[0.0, 1.0]`, normalize confidence to two-decimal metadata strings, map quality thresholds to `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`), and preserve notes+DB dual-write plus retry-fallback parity. - For local persistence rollout, ship core schema changes as idempotent `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` statements so migration reapplication is upgrade-safe across empty and preexisting local Turso DB states. +- For local hooks production writes, resolve one deterministic per-user persistent DB target (Linux: `${XDG_STATE_HOME:-~/.local/state}/sce/agent-trace/local.db`; platform-equivalent state roots elsewhere), create parent directories before first use, and run schema bootstrap before DB write attempts. 
- For hosted rewrite reconciliation persistence, extend the same migration seam (`apply_core_schema_migrations`) with deterministic schema/index statements and per-repository idempotency uniqueness for run/mapping replay safety. - For hosted event intake seams, verify provider signatures before payload parsing (GitHub `sha256=` HMAC over body, GitLab token-equality secret check), resolve old/new heads from provider payload fields, and derive deterministic reconciliation run idempotency keys from provider+event+repo+head tuple material. - For hosted rewrite mapping seams, resolve candidates deterministically in strict precedence order (patch-id exact, then range-diff score, then fuzzy score), classify top-score ties as `ambiguous`, enforce low-confidence unresolved behavior below `0.60`, and preserve stable outcome ordering via canonical candidate SHA sorting. diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index 9d007161..504747c9 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -81,7 +81,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml post_commit_finalization` - Local integration test validating notes content type/ref and DB persistence parity. -- [ ] T06: Productionize local DB runtime policy and schema bootstrap (status:todo) +- [x] T06: Productionize local DB runtime policy and schema bootstrap (status:done) - Task ID: T06 - Goal: Establish and implement production local DB location/bootstrap policy for Linux and other supported local platforms, then wire schema migration lifecycle. 
- Boundaries (in/out of scope): diff --git a/context/sce/agent-trace-hooks-command-routing.md b/context/sce/agent-trace-hooks-command-routing.md index a20e2fab..6d31d12d 100644 --- a/context/sce/agent-trace-hooks-command-routing.md +++ b/context/sce/agent-trace-hooks-command-routing.md @@ -19,7 +19,7 @@ ## Current runtime entrypoint behavior - `pre-commit`: executes the pre-commit runtime entrypoint and reports staged-checkpoint finalization outcome. - `commit-msg`: validates ``, resolves runtime gates (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, staged checkpoint presence), applies canonical co-author policy, and writes back only when trailer mutation is required. -- `post-commit`: resolves runtime guards, builds commit attribution input from git + pre-commit checkpoint artifacts, executes `finalize_post_commit_trace`, writes canonical note payloads to `refs/notes/agent-trace`, persists trace records to git-path JSONL storage (`sce/trace-records.jsonl`), maintains commit-level emission ledger (`sce/trace-emission-ledger.txt`), and enqueues fallback entries (`sce/trace-retry-queue.jsonl`) when a persistence target fails. +- `post-commit`: resolves runtime guards, builds commit attribution input from git + pre-commit checkpoint artifacts, executes `finalize_post_commit_trace`, writes canonical note payloads to `refs/notes/agent-trace`, ensures persistent local DB readiness (`.../sce/agent-trace/local.db`) with migrations before write attempts, persists trace records to local Turso-backed tables, maintains commit-level emission ledger (`sce/trace-emission-ledger.txt`), and enqueues fallback entries (`sce/trace-retry-queue.jsonl`) when a persistence target fails. - `post-rewrite`: reads hook pair input from STDIN, validates pair format through remap finalization parsing, and reports ingested/skipped outcomes. 
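As a rough illustration of the `post-rewrite` pair validation step, the sketch below parses the `old-sha new-sha` rows that git writes to the hook's STDIN. `RewritePair` and `parse_rewrite_pairs` are hypothetical names, not the types used by `finalize_post_rewrite_remap`; skipping blank and self-mapping rows and ignoring any trailing extra-info field are assumptions of this sketch.

```rust
/// Hypothetical pair type; the real remap request carries more fields.
#[derive(Debug, PartialEq, Eq)]
struct RewritePair {
    old_sha: String,
    new_sha: String,
}

/// Parses `post-rewrite` STDIN rows into old/new SHA pairs.
/// Blank rows and no-op self-mappings are skipped; rows with fewer
/// than two fields are rejected with a line-numbered error.
fn parse_rewrite_pairs(input: &str) -> Result<Vec<RewritePair>, String> {
    let mut pairs = Vec::new();
    for (index, line) in input.lines().enumerate() {
        if line.trim().is_empty() {
            continue;
        }
        let mut fields = line.split_whitespace();
        let (Some(old_sha), Some(new_sha)) = (fields.next(), fields.next()) else {
            return Err(format!("line {}: expected two SHAs", index + 1));
        };
        // A row mapping a commit to itself carries no rewrite information.
        if old_sha == new_sha {
            continue;
        }
        pairs.push(RewritePair {
            old_sha: old_sha.to_string(),
            new_sha: new_sha.to_string(),
        });
    }
    Ok(pairs)
}
```

Returning an error for malformed rows keeps validation strict, while the caller stays free to convert that error into a fail-open skip message.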
## Notes for next tasks diff --git a/context/sce/agent-trace-post-commit-dual-write.md b/context/sce/agent-trace-post-commit-dual-write.md index 7f6eef2d..e17fda11 100644 --- a/context/sce/agent-trace-post-commit-dual-write.md +++ b/context/sce/agent-trace-post-commit-dual-write.md @@ -35,9 +35,12 @@ - derives deterministic idempotency (`post-commit:`) and deterministic UUIDv4 trace IDs from commit/timestamp seed - Production adapters currently bound in runtime: - notes adapter: `GitNotesTraceWriter` writes canonical JSON note payloads to `refs/notes/agent-trace` - - local record store adapter: `JsonFileTraceRecordStore` writes trace records to git-path JSONL storage at `sce/trace-records.jsonl` + - local record store adapter: `LocalDbTraceRecordStore` writes trace records and flattened ranges into the persistent Turso target at `.../sce/agent-trace/local.db` - emission ledger adapter: `FileTraceEmissionLedger` stores emitted commit SHAs at `sce/trace-emission-ledger.txt` - retry queue adapter: `JsonFileTraceRetryQueue` appends failed-target fallback entries to `sce/trace-retry-queue.jsonl` +- Runtime schema bootstrap is mandatory before post-commit persistence: + - `resolve_post_commit_runtime_paths` calls `ensure_agent_trace_local_db_ready_blocking`. + - `ensure_agent_trace_local_db_ready_blocking` resolves platform state-data DB path (`${XDG_STATE_HOME:-~/.local/state}/sce/agent-trace/local.db` on Linux, platform-equivalent user state root elsewhere), creates parent directories, and applies `apply_core_schema_migrations` before writes. - Runtime posture remains fail-open: operational errors return deterministic skip/fallback messages instead of aborting commit progression. 
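The Linux path policy above (`${XDG_STATE_HOME:-~/.local/state}/sce/agent-trace/local.db`) can be sketched as a pure function. `resolve_local_db_path` is a hypothetical helper that takes its inputs explicitly so it is easy to test; it is not the signature of `ensure_agent_trace_local_db_ready_blocking`, and directory creation and schema migrations are left out.

```rust
use std::path::PathBuf;

/// Hypothetical helper: resolves the Linux DB path from explicit inputs
/// instead of reading the process environment directly.
fn resolve_local_db_path(xdg_state_home: Option<&str>, home: &str) -> PathBuf {
    // Mirror the shell `${VAR:-default}` form: fall back when unset or empty.
    let state_root = match xdg_state_home {
        Some(value) if !value.is_empty() => PathBuf::from(value),
        _ => PathBuf::from(home).join(".local").join("state"),
    };
    state_root.join("sce").join("agent-trace").join("local.db")
}
```

A caller could feed it `std::env::var("XDG_STATE_HOME").ok().as_deref()` together with the user's home directory, then create the parent directory before opening the database.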
## Verification evidence From 09a5dd8dbee15f6df72e646a073b1d24a788f3d4 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 12:32:14 +0100 Subject: [PATCH 37/39] hooks: Implement post-rewrite remap ingestion and rewrite trace finalization Connect the post-rewrite runtime to local DB-backed remap ingestion with idempotency checks, then finalize rewritten traces through the existing notes/DB persistence pipeline with retry and emission-ledger handling. Add runtime tests that verify successful rewrite-trace persistence and duplicate pair replay skipping. --- cli/src/services/hooks.rs | 300 ++++++++++++++++-- cli/src/services/hooks/tests.rs | 183 ++++++++++- context/context-map.md | 2 +- context/glossary.md | 2 +- context/overview.md | 2 +- .../agent-trace-local-hooks-production-mvp.md | 2 +- .../sce/agent-trace-hooks-command-routing.md | 9 +- 7 files changed, 462 insertions(+), 38 deletions(-) diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 9c84751b..755176ac 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -743,10 +743,22 @@ fn collect_post_commit_file_attribution( return Ok(checkpoint_files); } + collect_commit_file_attribution( + repository_root, + "HEAD", + "https://crocoder.dev/sce/local-hooks/post-commit", + ) +} + +fn collect_commit_file_attribution( + repository_root: &Path, + revision: &str, + conversation_url: &str, +) -> Result<Vec<FileAttributionInput>> { let changed_paths = run_git_command_allow_empty( repository_root, - &["show", "--pretty=format:", "--name-only", "HEAD"], - "Failed to resolve changed files for post-commit attribution.", + &["show", "--pretty=format:", "--name-only", revision], + "Failed to resolve changed files for commit attribution.", )?; let mut files = Vec::new(); @@ -759,7 +771,7 @@ fn collect_post_commit_file_attribution( files.push(FileAttributionInput { path: path.to_string(), conversations: vec![ConversationInput { - url: 
"https://crocoder.dev/sce/local-hooks/post-commit".to_string(), + url: 
conversation_url.to_string(), related: Vec::new(), ranges: vec![RangeInput { start_line: 1, @@ -879,39 +891,291 @@ fn deterministic_uuid_v4_from_seed(seed: &str) -> String { } fn run_post_rewrite_subcommand(rewrite_method: &str) -> Result<String> { + let repository_root = std::env::current_dir() + .context("Failed to determine current directory for post-rewrite runtime invocation.")?; let stdin = std::io::read_to_string(std::io::stdin()) .context("Failed to read post-rewrite pair input from STDIN")?; - let mut ingestion = AcceptAllRewriteRemapIngestion; - let outcome = finalize_post_rewrite_remap( - &PostRewriteRuntimeState { - sce_disabled: false, - cli_available: true, - is_bare_repo: false, - }, + + run_post_rewrite_subcommand_in_repo(&repository_root, rewrite_method, &stdin) +} + +fn run_post_rewrite_subcommand_in_repo( + repository_root: &Path, + rewrite_method: &str, + pairs_file_contents: &str, +) -> Result<String> { + let runtime = resolve_post_rewrite_runtime_state(repository_root); + + if runtime.sce_disabled || !runtime.cli_available || runtime.is_bare_repo { + let reason = if runtime.sce_disabled { + PostRewriteNoOpReason::Disabled + } else if !runtime.cli_available { + PostRewriteNoOpReason::CliUnavailable + } else { + PostRewriteNoOpReason::BareRepository + }; + + return Ok(format!( + "post-rewrite hook executed with no-op runtime state: {reason:?}" + )); + } + + let runtime_paths = match resolve_post_commit_runtime_paths(repository_root) { + Ok(paths) => paths, + Err(error) => { + return Ok(format!( + "post-rewrite hook skipped rewrite finalization: failed to resolve persistence targets ({error})" + )); + } + }; + + let mut ingestion = LocalDbRewriteRemapIngestion { + repository_root: repository_root.to_path_buf(), + db_path: runtime_paths.local_db_path.clone(), + accepted_requests: Vec::new(), + }; + + let outcome = match finalize_post_rewrite_remap( + &runtime, rewrite_method, - &stdin, + pairs_file_contents, &mut ingestion, - )?; + ) { + 
Ok(outcome) => outcome, + 
Err(error) => { + return Ok(format!( + "post-rewrite hook skipped rewrite finalization: remap ingestion failed ({error})" + )); + } + }; + + let mut notes_writer = GitNotesTraceWriter { + repository_root: repository_root.to_path_buf(), + }; + let mut record_store = LocalDbTraceRecordStore { + repository_root: repository_root.to_path_buf(), + db_path: runtime_paths.local_db_path, + }; + let mut retry_queue = JsonFileTraceRetryQueue { + path: runtime_paths.retry_queue_path, + }; + let mut emission_ledger = FileTraceEmissionLedger { + path: runtime_paths.emission_ledger_path, + }; + + let mut rewrite_persisted = 0_usize; + let mut rewrite_queued = 0_usize; + let mut rewrite_noops = 0_usize; + let mut rewrite_failures = 0_usize; + + for request in &ingestion.accepted_requests { + let input = match build_rewrite_trace_input(repository_root, request) { + Ok(input) => input, + Err(_) => { + rewrite_failures += 1; + continue; + } + }; + + match finalize_rewrite_trace( + &runtime, + input, + &mut notes_writer, + &mut record_store, + &mut retry_queue, + &mut emission_ledger, + ) { + Ok(RewriteTraceFinalization::Persisted(_)) => rewrite_persisted += 1, + Ok(RewriteTraceFinalization::QueuedFallback(_)) => rewrite_queued += 1, + Ok(RewriteTraceFinalization::NoOp(_)) => rewrite_noops += 1, + Err(_) => rewrite_failures += 1, + } + } match outcome { PostRewriteFinalization::NoOp(reason) => Ok(format!( "post-rewrite hook executed with no-op runtime state: {reason:?}" )), PostRewriteFinalization::Ingested(ingested) => Ok(format!( - "post-rewrite hook ingested {} pair(s), skipped {} duplicate pair(s), method='{}'.", + "post-rewrite hook ingested {} pair(s), skipped {} duplicate pair(s), method='{}', rewrite_traces=(persisted={}, queued={}, no_op={}, failed={}).", ingested.ingested_pairs, ingested.skipped_pairs, - ingested.rewrite_method.canonical_label() + ingested.rewrite_method.canonical_label(), + rewrite_persisted, + rewrite_queued, + rewrite_noops, + rewrite_failures )), } } 
-struct AcceptAllRewriteRemapIngestion; +fn resolve_post_rewrite_runtime_state(repository_root: &Path) -> PostRewriteRuntimeState { + PostRewriteRuntimeState { + sce_disabled: env_flag_is_truthy("SCE_DISABLED"), + cli_available: git_command_success(repository_root, &["--version"]), + is_bare_repo: git_command_output(repository_root, &["rev-parse", "--is-bare-repository"]) + .is_some_and(|output| output == "true"), + } +} + +fn build_rewrite_trace_input( + repository_root: &Path, + request: &RewriteRemapRequest, +) -> Result<RewriteTraceInput> { + let timestamp_rfc3339 = run_git_command( + repository_root, + &["show", "-s", "--format=%cI", request.new_sha.as_str()], + "Failed to resolve rewritten commit timestamp.", + )?; + let files = collect_commit_file_attribution( + repository_root, + request.new_sha.as_str(), + "https://crocoder.dev/sce/local-hooks/post-rewrite", + )?; + + Ok(RewriteTraceInput { + record_id: deterministic_uuid_v4_from_seed(&format!( + "{}:{}", + request.idempotency_key, timestamp_rfc3339 + )), + timestamp_rfc3339, + rewritten_commit_sha: request.new_sha.clone(), + rewrite_from_sha: request.old_sha.clone(), + rewrite_method: request.rewrite_method.clone(), + rewrite_confidence: 1.0, + idempotency_key: request.idempotency_key.clone(), + files, + }) +} + +struct LocalDbRewriteRemapIngestion { + repository_root: PathBuf, + db_path: PathBuf, + accepted_requests: Vec<RewriteRemapRequest>, +} + +impl RewriteRemapIngestion for LocalDbRewriteRemapIngestion { + fn ingest(&mut self, request: RewriteRemapRequest) -> Result<bool> { + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + let accepted = runtime.block_on(ingest_rewrite_mapping_to_local_db( + &self.repository_root, + &self.db_path, + &request, + ))?; + if accepted { + self.accepted_requests.push(request); + } + Ok(accepted) + } +} + +async fn ingest_rewrite_mapping_to_local_db( + repository_root: &Path, + db_path: &Path, + request: &RewriteRemapRequest, +) -> Result<bool> { + 
let location = db_path.to_str().ok_or_else(|| { + 
anyhow::anyhow!("Local DB path must be valid UTF-8: {}", db_path.display()) + })?; + let db = turso::Builder::new_local(location).build().await?; + let conn = db.connect()?; + conn.execute("PRAGMA foreign_keys = ON", ()).await?; + + let canonical_root = repository_root + .canonicalize() + .unwrap_or_else(|_| repository_root.to_path_buf()) + .to_string_lossy() + .to_string(); + + conn.execute( + "INSERT OR IGNORE INTO repositories (canonical_root) VALUES (?1)", + [canonical_root.as_str()], + ) + .await?; + + let repository_id = { + let mut rows = conn + .query( + "SELECT id FROM repositories WHERE canonical_root = ?1 LIMIT 1", + [canonical_root.as_str()], + ) + .await?; + let row = rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("repository id query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("repository id query returned non-integer"))? + }; -impl RewriteRemapIngestion for AcceptAllRewriteRemapIngestion { - fn ingest(&mut self, _request: RewriteRemapRequest) -> Result { - Ok(true) + let existing = { + let mut rows = conn + .query( + "SELECT COUNT(*) FROM rewrite_mappings WHERE repository_id = ?1 AND idempotency_key = ?2", + (repository_id, request.idempotency_key.as_str()), + ) + .await?; + let row = rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("rewrite mapping count query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("rewrite mapping count query returned non-integer"))? 
+ }; + if existing > 0 { + return Ok(false); } + + let reconciliation_key = format!( + "local-post-rewrite:{}:{}", + request.rewrite_method.canonical_label(), + request.new_sha + ); + conn.execute( + "INSERT OR IGNORE INTO reconciliation_runs (repository_id, provider, idempotency_key, status) VALUES (?1, ?2, ?3, ?4)", + (repository_id, "local-hook", reconciliation_key.as_str(), "completed"), + ) + .await?; + + let run_id = { + let mut rows = conn + .query( + "SELECT id FROM reconciliation_runs WHERE repository_id = ?1 AND idempotency_key = ?2 LIMIT 1", + (repository_id, reconciliation_key.as_str()), + ) + .await?; + let row = rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("reconciliation run id query returned no rows"))?; + let value = row.get_value(0)?; + value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("reconciliation run id query returned non-integer"))? + }; + + conn.execute( + "INSERT INTO rewrite_mappings (reconciliation_run_id, repository_id, old_commit_sha, new_commit_sha, mapping_status, confidence, idempotency_key) VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)", + ( + run_id, + repository_id, + request.old_sha.as_str(), + request.new_sha.as_str(), + "mapped", + 1.0_f64, + request.idempotency_key.as_str(), + ), + ) + .await?; + + Ok(true) } #[derive(Clone, Debug, Eq, PartialEq)] diff --git a/cli/src/services/hooks/tests.rs b/cli/src/services/hooks/tests.rs index 54c432fa..df130949 100644 --- a/cli/src/services/hooks/tests.rs +++ b/cli/src/services/hooks/tests.rs @@ -3,6 +3,7 @@ use std::fs; use std::path::Path; use std::path::PathBuf; use std::process::Command; +use std::sync::{Mutex, OnceLock}; use std::time::{SystemTime, UNIX_EPOCH}; use crate::services::agent_trace::{ @@ -17,18 +18,18 @@ use super::{ finalize_pre_commit_checkpoint, finalize_rewrite_trace, parse_hooks_subcommand, process_trace_retry_queue, resolve_pre_commit_checkpoint_path, run_commit_msg_subcommand_in_repo, run_hooks_subcommand, run_placeholder_hooks, - 
run_post_commit_subcommand_in_repo, run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, - GeneratedRegionEvent, GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, - HookSubcommand, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, - PersistenceErrorClass, PersistenceFailure, PersistenceTarget, PersistenceWriteResult, - PlaceholderHookService, PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, - PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, - PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, - PreCommitTreeAnchors, RetryMetricsSink, RetryProcessingMetric, RewriteMethod, - RewriteRemapIngestion, RewriteRemapRequest, RewriteTraceFinalization, RewriteTraceInput, - RewriteTraceNoOpReason, TraceEmissionLedger, TraceNote, TraceNotesWriter, TraceRecordStore, - TraceRetryQueue, TraceRetryQueueEntry, CANONICAL_SCE_COAUTHOR_TRAILER, - POST_COMMIT_PARENT_SHA_METADATA_KEY, + run_post_commit_subcommand_in_repo, run_post_rewrite_subcommand_in_repo, + run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, GeneratedRegionEvent, + GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, HookSubcommand, + PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, + PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, + PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, + PostRewriteFinalization, PostRewriteNoOpReason, PostRewriteRuntimeState, PreCommitFinalization, + PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, RetryMetricsSink, + RetryProcessingMetric, RewriteMethod, RewriteRemapIngestion, RewriteRemapRequest, + RewriteTraceFinalization, RewriteTraceInput, RewriteTraceNoOpReason, TraceEmissionLedger, + TraceNote, TraceNotesWriter, TraceRecordStore, TraceRetryQueue, TraceRetryQueueEntry, + CANONICAL_SCE_COAUTHOR_TRAILER, 
POST_COMMIT_PARENT_SHA_METADATA_KEY, }; fn run_git_in_repo(repo: &Path, args: &[&str]) -> Result<()> { @@ -83,6 +84,11 @@ fn create_temp_repo() -> Result { Ok(repo) } +fn agent_trace_db_test_lock() -> &'static Mutex<()> { + static LOCK: OnceLock<Mutex<()>> = OnceLock::new(); + LOCK.get_or_init(|| Mutex::new(())) +} + fn sample_pending_checkpoint() -> PendingCheckpoint { PendingCheckpoint { files: vec![PendingFileCheckpoint { @@ -1014,6 +1020,10 @@ fn commit_msg_runtime_noops_when_staged_attribution_checkpoint_missing() -> Resu #[test] fn post_commit_runtime_persists_notes_and_local_record_store() -> Result<()> { + let _db_guard = agent_trace_db_test_lock() + .lock() + .expect("agent trace DB test lock poisoned"); + let repo = create_temp_repo()?; let tracked_file = repo.join("src").join("lib.rs"); fs::create_dir_all( @@ -1090,6 +1100,155 @@ fn post_commit_runtime_persists_notes_and_local_record_store() -> Result<()> { Ok(()) } +#[test] +fn post_rewrite_runtime_ingests_remap_and_persists_rewrite_trace() -> Result<()> { + let _db_guard = agent_trace_db_test_lock() + .lock() + .expect("agent trace DB test lock poisoned"); + + let repo = create_temp_repo()?; + let tracked_file = repo.join("src").join("lib.rs"); + fs::create_dir_all( + tracked_file + .parent() + .expect("tracked file path should have parent"), + )?; + fs::write(&tracked_file, "one\ntwo\n")?; + run_git_in_repo(&repo, &["add", "."])?; + run_git_in_repo(&repo, &["commit", "-m", "initial"])?; + + fs::write(&tracked_file, "one\ntwo\nthree\n")?; + run_git_in_repo(&repo, &["add", "src/lib.rs"])?; + run_git_in_repo(&repo, &["commit", "-m", "feat: rewrite target"])?; + + let old_sha = run_git_output_in_repo(&repo, &["rev-parse", "--verify", "HEAD"])?; + run_git_in_repo(&repo, &["commit", "--amend", "-m", "feat: rewrite amended"])?; + let new_sha = run_git_output_in_repo(&repo, &["rev-parse", "--verify", "HEAD"])?; + + let message = + run_post_rewrite_subcommand_in_repo(&repo, "amend", &format!("{} {}\n", old_sha, 
new_sha))?; + assert!( + message.contains("post-rewrite hook ingested 1 pair(s), skipped 0 duplicate pair(s)"), + "unexpected message: {message}" + ); + assert!( + message.contains("rewrite_traces=(persisted=1, queued=0, no_op=0, failed=0)"), + "unexpected message: {message}" + ); + + let note = run_git_output_in_repo( + &repo, + &["notes", "--ref", "refs/notes/agent-trace", "show", &new_sha], + )?; + let note_json = serde_json::from_str::<serde_json::Value>(&note)?; + assert_eq!( + note_json + .get("record") + .and_then(|record| record.get("metadata")) + .and_then(|metadata| metadata.get("dev.crocoder.sce.rewrite_from")) + .and_then(serde_json::Value::as_str), + Some(old_sha.as_str()) + ); + assert_eq!( + note_json + .get("record") + .and_then(|record| record.get("metadata")) + .and_then(|metadata| metadata.get("dev.crocoder.sce.rewrite_method")) + .and_then(serde_json::Value::as_str), + Some("amend") + ); + + let db_path = resolve_agent_trace_local_db_path()?; + let runtime = tokio::runtime::Builder::new_current_thread().build()?; + let (rewrite_mapping_count, rewrite_trace_count) = runtime.block_on(async { + let location = db_path + .to_str() + .ok_or_else(|| anyhow::anyhow!("test DB path must be UTF-8"))?; + let db = turso::Builder::new_local(location).build().await?; + let conn = db.connect()?; + + let mut mapping_rows = conn + .query( + "SELECT COUNT(*) FROM rewrite_mappings WHERE old_commit_sha = ?1 AND new_commit_sha = ?2", + (old_sha.as_str(), new_sha.as_str()), + ) + .await?; + let mapping_row = mapping_rows + .next() + .await? 
+ .ok_or_else(|| anyhow::anyhow!("rewrite mapping count query returned no rows"))?; + let mapping_value = mapping_row.get_value(0)?; + let mapping_count = mapping_value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("rewrite mapping count query returned non-integer"))?; + + let mut trace_rows = conn + .query( + "SELECT COUNT(*) FROM trace_records tr JOIN commits c ON c.id = tr.commit_id WHERE c.commit_sha = ?1", + [new_sha.as_str()], + ) + .await?; + let trace_row = trace_rows + .next() + .await? + .ok_or_else(|| anyhow::anyhow!("rewrite trace count query returned no rows"))?; + let trace_value = trace_row.get_value(0)?; + let trace_count = trace_value + .as_integer() + .copied() + .ok_or_else(|| anyhow::anyhow!("rewrite trace count query returned non-integer"))?; + + Ok::<(i64, i64), anyhow::Error>((mapping_count, trace_count)) + })?; + + assert_eq!(rewrite_mapping_count, 1); + assert_eq!(rewrite_trace_count, 1); + + Ok(()) +} + +#[test] +fn post_rewrite_runtime_skips_duplicate_pair_replay() -> Result<()> { + let _db_guard = agent_trace_db_test_lock() + .lock() + .expect("agent trace DB test lock poisoned"); + + let repo = create_temp_repo()?; + let tracked_file = repo.join("src").join("lib.rs"); + fs::create_dir_all( + tracked_file + .parent() + .expect("tracked file path should have parent"), + )?; + fs::write(&tracked_file, "one\n")?; + run_git_in_repo(&repo, &["add", "."])?; + run_git_in_repo(&repo, &["commit", "-m", "initial"])?; + + fs::write(&tracked_file, "one\ntwo\n")?; + run_git_in_repo(&repo, &["add", "src/lib.rs"])?; + run_git_in_repo(&repo, &["commit", "-m", "feat: rewrite target"])?; + + let old_sha = run_git_output_in_repo(&repo, &["rev-parse", "--verify", "HEAD"])?; + run_git_in_repo(&repo, &["commit", "--amend", "-m", "feat: rewrite amended"])?; + let new_sha = run_git_output_in_repo(&repo, &["rev-parse", "--verify", "HEAD"])?; + let pair_input = format!("{} {}\n", old_sha, new_sha); + + let _first = 
run_post_rewrite_subcommand_in_repo(&repo, "amend", &pair_input)?; + let second = run_post_rewrite_subcommand_in_repo(&repo, "amend", &pair_input)?; + + assert!( + second.contains("post-rewrite hook ingested 0 pair(s), skipped 1 duplicate pair(s)"), + "unexpected message: {second}" + ); + assert!( + second.contains("rewrite_traces=(persisted=0, queued=0, no_op=0, failed=0)"), + "unexpected message: {second}" + ); + + Ok(()) +} + #[test] fn hooks_placeholder_event_model_reserves_generated_region_tracking() { let service = PlaceholderHookService; diff --git a/context/context-map.md b/context/context-map.md index 96943335..ac9539b5 100644 --- a/context/context-map.md +++ b/context/context-map.md @@ -35,7 +35,7 @@ Feature/domain context: - `context/sce/agent-trace-rewrite-mapping-engine.md` (T13 hosted rewrite mapping engine contract with patch-id exact precedence, range-diff/fuzzy scoring, and deterministic unresolved outcomes) - `context/sce/agent-trace-retry-queue-observability.md` (T14 retry queue recovery contract plus reconciliation/runtime observability metrics and DB-first queue schema additions) - `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` (T01 Local Hooks MVP production contract freeze and deterministic gap matrix for `agent-trace-local-hooks-production-mvp`) -- `context/sce/agent-trace-hooks-command-routing.md` (implemented `sce hooks` command routing plus current runtime entrypoint behavior, including commit-msg policy gate/file mutation wiring) +- `context/sce/agent-trace-hooks-command-routing.md` (implemented `sce hooks` command routing plus current runtime entrypoint behavior, including commit-msg policy gating/file mutation and post-rewrite remap+rewrite finalization wiring) Working areas: - `context/plans/` (active plan execution artifacts, not durable history) diff --git a/context/glossary.md b/context/glossary.md index 08885867..3a695d26 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -36,7 +36,7 @@ - `setup 
install engine`: Installer in `cli/src/services/setup.rs` (`install_embedded_setup_assets`) that writes embedded setup assets into per-target staging directories and swaps them into repository-root `.opencode/`/`.claude/` destinations. - `setup backup-and-replace`: Replacement choreography in `cli/src/services/setup.rs` where existing install targets are renamed to unique `.backup` paths before staged content is promoted; on swap failure, the engine restores the original target from backup and cleans temporary staging paths. - `MCP capability snapshot`: Placeholder capability model in `cli/src/services/mcp.rs` that captures planned file-cache transport/tool contracts (`cache-put`, `cache-get`) and cache policy defaults without enabling runtime MCP execution. -- `hooks command routing contract` (T02): Implemented hook command parser/dispatcher in `cli/src/services/hooks.rs` (`HookSubcommand`, `parse_hooks_subcommand`, `run_hooks_subcommand`) that supports `pre-commit`, `commit-msg `, `post-commit`, and `post-rewrite ` with deterministic invocation validation and usage errors. +- `hooks command routing contract` (T02, T07): Implemented hook command parser/dispatcher plus runtime wiring in `cli/src/services/hooks.rs` (`HookSubcommand`, `parse_hooks_subcommand`, `run_hooks_subcommand`) that supports `pre-commit`, `commit-msg `, `post-commit`, and `post-rewrite ` with deterministic invocation validation/usage errors and production post-rewrite remap + rewritten-trace finalization execution. - `cloud sync gateway placeholder`: Abstraction in `cli/src/services/sync.rs` (`CloudSyncGateway`) that returns deferred cloud-sync checkpoints while `sync` remains non-production. - `sce CLI onboarding guide`: Crate-local documentation at `cli/README.md` that defines runnable placeholder commands, non-goals/safety limits, and roadmap mapping to service modules. 
- `plan/code overlap map`: Context artifact at `context/sce/plan-code-overlap-map.md` that classifies Shared Context Plan/Code, `/change-to-plan`, `/next-task`, `/commit`, and core skills into role-specific vs shared-reusable instruction blocks with explicit dedup targets. diff --git a/context/overview.md b/context/overview.md index 880ce3a2..549c3f82 100644 --- a/context/overview.md +++ b/context/overview.md @@ -37,7 +37,7 @@ The local DB service now also includes reconciliation persistence schema coverag The CLI now also includes a hosted event intake/orchestration seam in `cli/src/services/hosted_reconciliation.rs` that verifies provider signatures, resolves old/new commit heads from GitHub/GitLab payloads, and creates deterministic replay-safe reconciliation run requests; this behavior is documented in `context/sce/agent-trace-hosted-event-intake-orchestration.md`. The hosted reconciliation service now also includes a deterministic rewrite mapping engine (`map_rewritten_commit`) that resolves old->new commit identity using patch-id exact precedence, then range-diff hints, then fuzzy fallback with a `>= 0.60` mapping threshold and explicit ambiguous/unmatched/low-confidence unresolved outcomes; this behavior is documented in `context/sce/agent-trace-rewrite-mapping-engine.md`. The hooks service now also includes retry-queue replay processing (`process_trace_retry_queue`) with per-attempt runtime/error-class metric emission, and the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. 
-The hooks command surface now also supports concrete runtime subcommand routing (`pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`) with deterministic argument and STDIN contract validation owned by `cli/src/services/hooks.rs`; this behavior is documented in `context/sce/agent-trace-hooks-command-routing.md`. +The hooks command surface now also supports concrete runtime subcommand routing (`pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`) with deterministic argument/STDIN validation and production post-rewrite runtime wiring (local remap ingestion plus rewritten-trace finalization through notes+DB adapters) owned by `cli/src/services/hooks.rs`; this behavior is documented in `context/sce/agent-trace-hooks-command-routing.md`. The setup service now also exposes deterministic required-hook embedded asset accessors (`iter_required_hook_assets`, `get_required_hook_asset`) backed by canonical templates in `cli/assets/hooks/` for `pre-commit`, `commit-msg`, and `post-commit`; this behavior is documented in `context/sce/setup-githooks-hook-asset-packaging.md`. The setup service now also includes required-hook install orchestration (`install_required_git_hooks`) that resolves repository root and effective hooks path from git truth, enforces deterministic per-hook outcomes (`Installed`/`Updated`/`Skipped`), and performs backup-and-restore rollback on swap failures; this behavior is documented in `context/sce/setup-githooks-install-flow.md`. The setup command parser/dispatch now also supports `sce setup --hooks` with optional `--repo `, enforces deterministic compatibility validation (`--repo` requires `--hooks`; `--hooks` incompatible with setup target flags), and emits deterministic per-hook setup outcome messaging (`installed`/`updated`/`skipped` with backup status); this behavior is documented in `context/sce/setup-githooks-cli-ux.md`. 
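The subcommand routing with deterministic argument validation described in the overview hunk above can be pictured with a small sketch. This is an illustration only, not the code in `cli/src/services/hooks.rs`: the names `SketchHookCommand` and `parse_sketch_hook_command` are hypothetical, and the error strings are assumptions rather than the CLI's real usage messages.

```rust
/// Hypothetical subcommand model for illustration; not the crate's `HookSubcommand`.
#[derive(Debug, PartialEq, Eq)]
enum SketchHookCommand {
    PreCommit,
    CommitMsg { message_file: String },
    PostCommit,
    PostRewrite { method: String },
}

/// Maps raw arguments to a subcommand, rejecting any other argument shape.
fn parse_sketch_hook_command(args: &[&str]) -> Result<SketchHookCommand, String> {
    match args {
        ["pre-commit"] => Ok(SketchHookCommand::PreCommit),
        ["commit-msg", file] => Ok(SketchHookCommand::CommitMsg {
            message_file: file.to_string(),
        }),
        ["post-commit"] => Ok(SketchHookCommand::PostCommit),
        // The rewrite method label is lowercased so "Amend" and "amend" route identically.
        ["post-rewrite", method] => Ok(SketchHookCommand::PostRewrite {
            method: method.to_lowercase(),
        }),
        [] => Err("missing hook subcommand".to_string()),
        // Unknown names and known names with the wrong argument count both land here.
        [name, ..] => Err(format!("invalid invocation of hook '{name}'")),
    }
}
```

Matching on the whole argument slice keeps each accepted invocation shape in one place, so a wrong argument count fails the same way on every run.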
diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index 504747c9..d16725d2 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -94,7 +94,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml local_db::tests` - Integration test proving persisted data survives process restart with configured local DB path. -- [ ] T07: Wire post-rewrite runtime flow (remap ingestion + rewrite trace finalization) (status:todo) +- [x] T07: Wire post-rewrite runtime flow (remap ingestion + rewrite trace finalization) (status:done) - Task ID: T07 - Goal: Connect `post-rewrite` hook runtime input parsing and rewrite-method normalization to real remap ingestion and rewritten-trace emission paths. - Boundaries (in/out of scope): diff --git a/context/sce/agent-trace-hooks-command-routing.md b/context/sce/agent-trace-hooks-command-routing.md index 6d31d12d..5416dd45 100644 --- a/context/sce/agent-trace-hooks-command-routing.md +++ b/context/sce/agent-trace-hooks-command-routing.md @@ -2,8 +2,8 @@ ## Scope - Plan: `agent-trace-local-hooks-production-mvp` -- Task: `T02` -- Focus: implemented `sce hooks` subcommand routing and hook invocation contract validation. +- Tasks: `T02`, `T07` +- Focus: implemented `sce hooks` subcommand routing plus production post-rewrite runtime wiring. ## Implemented command surface - `sce hooks pre-commit` @@ -20,11 +20,12 @@ - `pre-commit`: executes the pre-commit runtime entrypoint and reports staged-checkpoint finalization outcome. - `commit-msg`: validates ``, resolves runtime gates (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`, staged checkpoint presence), applies canonical co-author policy, and writes back only when trailer mutation is required. 
- `post-commit`: resolves runtime guards, builds commit attribution input from git + pre-commit checkpoint artifacts, executes `finalize_post_commit_trace`, writes canonical note payloads to `refs/notes/agent-trace`, ensures persistent local DB readiness (`.../sce/agent-trace/local.db`) with migrations before write attempts, persists trace records to local Turso-backed tables, maintains commit-level emission ledger (`sce/trace-emission-ledger.txt`), and enqueues fallback entries (`sce/trace-retry-queue.jsonl`) when a persistence target fails. -- `post-rewrite`: reads hook pair input from STDIN, validates pair format through remap finalization parsing, and reports ingested/skipped outcomes. +- `post-rewrite`: resolves runtime guards, ensures persistence paths are available, ingests parsed rewrite remap pairs into local DB-backed `rewrite_mappings` with deterministic idempotency, and runs rewritten-trace finalization (`finalize_rewrite_trace`) per accepted remap using canonical notes + DB writers, shared emission ledger, and retry queue adapters. +- `post-rewrite` output now reports both remap ingestion counters and rewrite trace finalization counters (`persisted`, `queued`, `no_op`, `failed`) for deterministic operator diagnostics. ## Notes for next tasks - T02 established routing and invocation contracts. - T03 implemented pre-commit staged-checkpoint runtime wiring. - T04 implemented commit-msg file IO mutation wiring to canonical co-author policy. - T05 implemented post-commit persistence adapters and runtime wiring. -- Rewrite-trace production runtime wiring remains in `T07+`. +- T07 implemented production post-rewrite runtime orchestration (remap ingestion + rewritten trace emission). 
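The post-rewrite behavior documented above (pairs read from STDIN, idempotent ingestion, ingested/skipped counters) can be sketched as follows. This is a simplified illustration under stated assumptions: `ingest_rewrite_pairs`, `SketchIngestSummary`, and the `method:old:new` key format are hypothetical, an in-memory `HashSet` stands in for the DB-backed `rewrite_mappings` table, and malformed lines are silently skipped here where a stricter parser would report them.

```rust
use std::collections::HashSet;

/// Hypothetical counters mirroring the "ingested N pair(s), skipped M duplicate pair(s)" output.
#[derive(Debug, Default, PartialEq, Eq)]
struct SketchIngestSummary {
    ingested: usize,
    skipped_duplicates: usize,
}

/// Reads "<old-sha> <new-sha>" lines and records each pair at most once.
/// `seen` plays the role of persistent storage in this sketch.
fn ingest_rewrite_pairs(
    input: &str,
    method: &str,
    seen: &mut HashSet<String>,
) -> SketchIngestSummary {
    let mut summary = SketchIngestSummary::default();
    for line in input.lines() {
        let mut parts = line.split_whitespace();
        // Lines without two fields are skipped in this sketch (a simplifying assumption).
        let (Some(old_sha), Some(new_sha)) = (parts.next(), parts.next()) else {
            continue;
        };
        // Self-mapping rows carry no rewrite information.
        if old_sha == new_sha {
            continue;
        }
        // Assumed key shape; the real idempotency key derivation may differ.
        let key = format!("{method}:{old_sha}:{new_sha}");
        if seen.insert(key) {
            summary.ingested += 1;
        } else {
            summary.skipped_duplicates += 1;
        }
    }
    summary
}
```

Feeding the same pair twice against the same `seen` set yields one ingested pair on the first call and one skipped duplicate on the second, which is the replay behavior the `post_rewrite_runtime_skips_duplicate_pair_replay` test asserts on the real hook output.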
From 7473d297335ee65d71580e1593fed42f3fb3710a Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 13:06:18 +0100 Subject: [PATCH 38/39] hooks: Wire runtime retry replay into post-commit and post-rewrite hooks Process up to 16 queued retry entries after finalization, aggregate transient/permanent failure metrics, and append deterministic retry summary output to both hook flows. --- cli/src/services/hooks.rs | 99 ++++++++++++++++++- context/architecture.md | 2 +- context/glossary.md | 2 +- context/overview.md | 2 +- context/patterns.md | 2 +- .../agent-trace-local-hooks-production-mvp.md | 2 +- .../agent-trace-retry-queue-observability.md | 3 + 7 files changed, 102 insertions(+), 10 deletions(-) diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 755176ac..240dd442 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -20,6 +20,7 @@ pub const NAME: &str = "hooks"; pub const CANONICAL_SCE_COAUTHOR_TRAILER: &str = "Co-authored-by: SCE "; pub const POST_COMMIT_PARENT_SHA_METADATA_KEY: &str = "dev.crocoder.sce.parent_revision"; const PRE_COMMIT_CHECKPOINT_GIT_PATH: &str = "sce/pre-commit-checkpoint.json"; +const RETRY_QUEUE_MAX_ITEMS_PER_RUN: usize = 16; #[derive(Clone, Debug, Eq, PartialEq)] pub enum HookSubcommand { @@ -651,17 +652,31 @@ fn run_post_commit_subcommand_in_repo(repository_root: &Path) -> Result } }; + let retry_report = + match process_runtime_retry_queue(&mut retry_queue, &mut notes_writer, &mut record_store) { + Ok(report) => report, + Err(error) => { + return Ok(format!( + "post-commit hook completed trace finalization but retry replay failed ({error})" + )); + } + }; + let message = match outcome { PostCommitFinalization::NoOp(reason) => { format!("post-commit hook executed with no-op runtime state: {reason:?}") } PostCommitFinalization::Persisted(persisted) => format!( - "post-commit hook finalized trace for commit '{}' (trace_id='{}', notes={:?}, database={:?}).", + "post-commit hook finalized 
trace for commit '{}' (trace_id='{}', notes={:?}, database={:?}) {}.", persisted.commit_sha, persisted.trace_id, persisted.notes, persisted.database + , retry_report.summary_text() ), PostCommitFinalization::QueuedFallback(queued) => format!( - "post-commit hook enqueued fallback for commit '{}' (trace_id='{}', failed_targets={:?}).", - queued.commit_sha, queued.trace_id, queued.failed_targets + "post-commit hook enqueued fallback for commit '{}' (trace_id='{}', failed_targets={:?}) {}.", + queued.commit_sha, + queued.trace_id, + queued.failed_targets, + retry_report.summary_text() ), }; @@ -992,23 +1007,97 @@ fn run_post_rewrite_subcommand_in_repo( } } + let retry_report = + match process_runtime_retry_queue(&mut retry_queue, &mut notes_writer, &mut record_store) { + Ok(report) => report, + Err(error) => { + return Ok(format!( + "post-rewrite hook completed rewrite finalization but retry replay failed ({error})" + )); + } + }; + match outcome { PostRewriteFinalization::NoOp(reason) => Ok(format!( "post-rewrite hook executed with no-op runtime state: {reason:?}" )), PostRewriteFinalization::Ingested(ingested) => Ok(format!( - "post-rewrite hook ingested {} pair(s), skipped {} duplicate pair(s), method='{}', rewrite_traces=(persisted={}, queued={}, no_op={}, failed={}).", + "post-rewrite hook ingested {} pair(s), skipped {} duplicate pair(s), method='{}', rewrite_traces=(persisted={}, queued={}, no_op={}, failed={}) {}.", ingested.ingested_pairs, ingested.skipped_pairs, ingested.rewrite_method.canonical_label(), rewrite_persisted, rewrite_queued, rewrite_noops, - rewrite_failures + rewrite_failures, + retry_report.summary_text() )), } } +#[derive(Default)] +struct InMemoryRetryMetricsSink { + events: Vec<RetryProcessingMetric>, +} + +impl RetryMetricsSink for InMemoryRetryMetricsSink { + fn record_retry_metric(&mut self, metric: RetryProcessingMetric) { + self.events.push(metric); + } +} + +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +struct RuntimeRetryReport { + summary: 
RetryQueueProcessSummary, + transient_failures: usize, + permanent_failures: usize, +} + +impl RuntimeRetryReport { + fn summary_text(&self) -> String { + format!( + "retry_queue=(attempted={}, recovered={}, requeued={}, transient_failures={}, permanent_failures={})", + self.summary.attempted, + self.summary.recovered, + self.summary.requeued, + self.transient_failures, + self.permanent_failures + ) + } +} + +fn process_runtime_retry_queue( + retry_queue: &mut impl TraceRetryQueue, + notes_writer: &mut impl TraceNotesWriter, + record_store: &mut impl TraceRecordStore, +) -> Result<RuntimeRetryReport> { + let mut metrics_sink = InMemoryRetryMetricsSink::default(); + let summary = process_trace_retry_queue( + retry_queue, + notes_writer, + record_store, + &mut metrics_sink, + RETRY_QUEUE_MAX_ITEMS_PER_RUN, + )?; + + let mut transient_failures = 0_usize; + let mut permanent_failures = 0_usize; + + for metric in metrics_sink.events { + match metric.error_class { + Some(PersistenceErrorClass::Transient) => transient_failures += 1, + Some(PersistenceErrorClass::Permanent) => permanent_failures += 1, + None => {} + } + } + + Ok(RuntimeRetryReport { + summary, + transient_failures, + permanent_failures, + }) +} + fn resolve_post_rewrite_runtime_state(repository_root: &Path) -> PostRewriteRuntimeState { PostRewriteRuntimeState { sce_disabled: env_flag_is_truthy("SCE_DISABLED"), diff --git a/context/architecture.md b/context/architecture.md index d36251b7..857be1f0 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -79,7 +79,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. 
- `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. -- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a retry replay seam (`process_trace_retry_queue`) that re-attempts only failed persistence targets and emits per-attempt runtime/error-class metrics, a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. 
+- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a retry replay seam (`process_trace_retry_queue`) that re-attempts only failed persistence targets and emits per-attempt runtime/error-class metrics, bounded operational retry replay invocation from post-commit/post-rewrite flows (`process_runtime_retry_queue`), a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. - `cli/src/services/hosted_reconciliation.rs` defines hosted intake/orchestration seams (`ingest_hosted_rewrite_event`, `ReconciliationRunStore`) that verify provider signatures (GitHub HMAC-SHA256 and GitLab token equality), parse provider payload old/new heads, normalize deterministic idempotency-backed reconciliation run requests, resolve deterministic old->new rewrite mappings (`map_rewritten_commit`) with patch-id exact precedence, range-diff/fuzzy fallback scoring, and explicit unresolved classifications, and summarize mapped/unmapped confidence/runtime/error-class telemetry (`summarize_reconciliation_metrics`). 
- `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. - `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. diff --git a/context/glossary.md b/context/glossary.md index 3a695d26..e7113322 100644 --- a/context/glossary.md +++ b/context/glossary.md @@ -71,6 +71,6 @@ - `agent trace reconciliation schema ingestion`: T11 contract in `cli/src/services/local_db.rs` (`apply_core_schema_migrations`) that extends local DB migrations with hosted rewrite reconciliation entities (`reconciliation_runs`, `rewrite_mappings`, `conversations`), per-repository idempotency uniqueness, and query indexes for run status and old->new mapping lookup. - `agent trace hosted event intake orchestration`: T12 contract in `cli/src/services/hosted_reconciliation.rs` (`ingest_hosted_rewrite_event`) that verifies GitHub/GitLab webhook signatures, resolves provider payload old/new heads, normalizes provider/repo/event identity into deterministic `hosted::` reconciliation run idempotency keys, and returns replay-safe created-vs-duplicate run outcomes through `ReconciliationRunStore`. - `agent trace rewrite mapping engine`: T13 contract in `cli/src/services/hosted_reconciliation.rs` (`map_rewritten_commit`) that deterministically maps old->new rewritten commits using strict patch-id exact precedence, then range-diff scoring, then fuzzy fallback with `>= 0.60` threshold gating and explicit unresolved outcomes (`ambiguous`, `unmatched`, `low_confidence`). 
-- `agent trace retry replay processor`: T14 contract in `cli/src/services/hooks.rs` (`process_trace_retry_queue`) that dequeues fallback queue entries, retries only previously failed persistence targets (notes and/or DB), requeues remaining failures, and emits per-attempt runtime/error-class metrics via `RetryMetricsSink`. +- `agent trace retry replay processor`: T14/T08 operational contract in `cli/src/services/hooks.rs` where `process_trace_retry_queue` dequeues fallback queue entries, retries only previously failed persistence targets (notes and/or DB), requeues remaining failures, and emits per-attempt runtime/error-class metrics via `RetryMetricsSink`; production local hook runtime invokes bounded replay (`max_items=16`) after post-commit and post-rewrite finalization with deterministic retry summary output. - `reconciliation metrics snapshot`: T14 contract in `cli/src/services/hosted_reconciliation.rs` (`summarize_reconciliation_metrics`) that reports mapped/unmapped counts, confidence histogram buckets (`high`/`medium`/`low`), runtime (`runtime_ms`), and normalized error class (`signature`/`payload`/`store`) for hosted rewrite runs. - `agent trace local hooks MVP contract and gap matrix`: T01 context artifact at `context/sce/agent-trace-local-hooks-mvp-contract-gap-matrix.md` that freezes local production boundaries/decisions for `agent-trace-local-hooks-production-mvp` and maps current seam-level code truth to required runtime completion tasks (`T02`..`T10`). 
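The retry replay contract in the glossary hunk above (bounded pass, retry only the failed targets, requeue what still fails) can be sketched roughly as follows. This is an assumption-laden illustration, not `process_trace_retry_queue` itself: `SketchRetryEntry`, `SketchRetrySummary`, and `replay_bounded` are hypothetical names, and plain closures returning `bool` stand in for the `TraceNotesWriter` and `TraceRecordStore` seams.

```rust
use std::collections::VecDeque;

/// Hypothetical queue entry: which persistence targets failed for a trace.
#[derive(Clone, Debug, PartialEq, Eq)]
struct SketchRetryEntry {
    trace_id: String,
    notes_failed: bool,
    db_failed: bool,
}

/// Hypothetical counters matching the attempted/recovered/requeued summary fields.
#[derive(Debug, Default, PartialEq, Eq)]
struct SketchRetrySummary {
    attempted: usize,
    recovered: usize,
    requeued: usize,
}

/// Replays at most `max_items` entries; each writer closure returns `true` on success.
fn replay_bounded(
    queue: &mut VecDeque<SketchRetryEntry>,
    max_items: usize,
    mut write_notes: impl FnMut(&str) -> bool,
    mut write_db: impl FnMut(&str) -> bool,
) -> SketchRetrySummary {
    let mut summary = SketchRetrySummary::default();
    // Still-failing entries are held aside so they are not picked up again in this pass.
    let mut still_failing = Vec::new();
    while summary.attempted < max_items {
        let Some(mut entry) = queue.pop_front() else { break };
        summary.attempted += 1;
        // Only targets that failed earlier are retried.
        if entry.notes_failed {
            entry.notes_failed = !write_notes(&entry.trace_id);
        }
        if entry.db_failed {
            entry.db_failed = !write_db(&entry.trace_id);
        }
        if entry.notes_failed || entry.db_failed {
            still_failing.push(entry);
        } else {
            summary.recovered += 1;
        }
    }
    summary.requeued = still_failing.len();
    queue.extend(still_failing);
    summary
}
```

Under these assumptions, a hook run would call `replay_bounded(&mut queue, 16, ...)` once after finalization, matching the `max_items=16` bound quoted above. Holding requeued entries until the loop ends is one simple way to avoid processing the same trace twice in a pass.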
diff --git a/context/overview.md b/context/overview.md index 549c3f82..059bedcb 100644 --- a/context/overview.md +++ b/context/overview.md @@ -36,7 +36,7 @@ The local DB service now includes core Agent Trace persistence schema migrations The local DB service now also includes reconciliation persistence schema coverage in the same migration entrypoint for hosted rewrite bookkeeping tables (`reconciliation_runs`, `rewrite_mappings`, `conversations`) and replay/query indexes; this behavior is documented in `context/sce/agent-trace-reconciliation-schema-ingestion.md`. The CLI now also includes a hosted event intake/orchestration seam in `cli/src/services/hosted_reconciliation.rs` that verifies provider signatures, resolves old/new commit heads from GitHub/GitLab payloads, and creates deterministic replay-safe reconciliation run requests; this behavior is documented in `context/sce/agent-trace-hosted-event-intake-orchestration.md`. The hosted reconciliation service now also includes a deterministic rewrite mapping engine (`map_rewritten_commit`) that resolves old->new commit identity using patch-id exact precedence, then range-diff hints, then fuzzy fallback with a `>= 0.60` mapping threshold and explicit ambiguous/unmatched/low-confidence unresolved outcomes; this behavior is documented in `context/sce/agent-trace-rewrite-mapping-engine.md`. -The hooks service now also includes retry-queue replay processing (`process_trace_retry_queue`) with per-attempt runtime/error-class metric emission, and the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. 
+The hooks service now also includes operational retry-queue replay processing (`process_trace_retry_queue`) invoked from post-commit and post-rewrite runtime flows with bounded same-pass replay and deterministic retry summary output, plus per-attempt runtime/error-class metric emission; the hosted reconciliation service now includes mapped/unmapped + confidence histogram metric snapshots (`summarize_reconciliation_metrics`), with DB-first queue/metrics schema coverage in `apply_core_schema_migrations`; this behavior is documented in `context/sce/agent-trace-retry-queue-observability.md`. The hooks command surface now also supports concrete runtime subcommand routing (`pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`) with deterministic argument/STDIN validation and production post-rewrite runtime wiring (local remap ingestion plus rewritten-trace finalization through notes+DB adapters) owned by `cli/src/services/hooks.rs`; this behavior is documented in `context/sce/agent-trace-hooks-command-routing.md`. The setup service now also exposes deterministic required-hook embedded asset accessors (`iter_required_hook_assets`, `get_required_hook_asset`) backed by canonical templates in `cli/assets/hooks/` for `pre-commit`, `commit-msg`, and `post-commit`; this behavior is documented in `context/sce/setup-githooks-hook-asset-packaging.md`. The setup service now also includes required-hook install orchestration (`install_required_git_hooks`) that resolves repository root and effective hooks path from git truth, enforces deterministic per-hook outcomes (`Installed`/`Updated`/`Skipped`), and performs backup-and-restore rollback on swap failures; this behavior is documented in `context/sce/setup-githooks-install-flow.md`. 
diff --git a/context/patterns.md b/context/patterns.md index 87490f8d..b5b2d495 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -90,7 +90,7 @@ - For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. - For commit-msg co-author policy seams, gate canonical trailer insertion on runtime controls (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`) plus staged SCE-attribution presence, and enforce idempotent dedupe so allowed cases end with exactly one `Co-authored-by: SCE ` trailer. - For post-commit trace finalization seams, treat commit SHA as the idempotency identity, perform notes + DB writes in the same finalize pass when available, and enqueue retry-fallback entries that explicitly capture failed persistence targets for replay-safe recovery. -- For retry replay seams, process fallback queue entries in bounded batches, avoid same-pass duplicate trace processing, retry only failed targets, and emit per-attempt runtime + persistence error-class metrics for operational visibility. +- For retry replay seams, process fallback queue entries in bounded batches, avoid same-pass duplicate trace processing, retry only failed targets, emit per-attempt runtime + persistence error-class metrics for operational visibility, and run a bounded replay pass from production post-commit/post-rewrite hook runtime with deterministic summary output. - For post-rewrite remap ingestion seams, parse ` ` pairs from hook input strictly, ignore empty/no-op self-mapping rows, normalize rewrite method labels to lowercase (`amend`/`rebase` when recognized), and derive deterministic per-pair idempotency keys before dispatching remap requests. 
- For rewrite trace transformation seams, materialize rewritten records through the canonical Agent Trace builder path, require finite confidence in `[0.0, 1.0]`, normalize confidence to two-decimal metadata strings, map quality thresholds to `final` (`>= 0.90`), `partial` (`0.60..0.89`), and `needs_review` (`< 0.60`), and preserve notes+DB dual-write plus retry-fallback parity. - For local persistence rollout, ship core schema changes as idempotent `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` statements so migration reapplication is upgrade-safe across empty and preexisting local Turso DB states. diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index d16725d2..7cdf6f64 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -107,7 +107,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml post_rewrite_finalization` - `cargo test --manifest-path cli/Cargo.toml rewrite_trace_finalization` -- [ ] T08: Wire retry replay processor into operational runtime and observability outputs (status:todo) +- [x] T08: Wire retry replay processor into operational runtime and observability outputs (status:done) - Task ID: T08 - Goal: Ensure retry queue processing is invokable in production local workflow with deterministic metrics emission and target-scoped recovery. - Boundaries (in/out of scope): diff --git a/context/sce/agent-trace-retry-queue-observability.md b/context/sce/agent-trace-retry-queue-observability.md index 07699c5f..6686d33b 100644 --- a/context/sce/agent-trace-retry-queue-observability.md +++ b/context/sce/agent-trace-retry-queue-observability.md @@ -7,8 +7,10 @@ ## Canonical contract - Retry processing entrypoint: `cli/src/services/hooks.rs` -> `process_trace_retry_queue`. 
+- Production runtime invocation now runs after both `sce hooks post-commit` and `sce hooks post-rewrite` finalization paths through `process_runtime_retry_queue`. - Queue contract now supports dequeue + enqueue replay via `TraceRetryQueue::{dequeue_next, enqueue}`. - Retry pass processes up to `max_items` entries per invocation and avoids same-pass duplicate processing for the same trace ID. +- Runtime retry passes currently use a bounded `max_items = 16` per hook invocation. - Recovery write behavior is target-scoped: - Failed notes target retries through `TraceNotesWriter`. - Failed DB target retries through `TraceRecordStore` using metadata idempotency key (`dev.crocoder.sce.idempotency_key`) when present. @@ -18,6 +20,7 @@ - runtime histogram input (`runtime_ms`) - `error_class` (from `PersistenceFailure.class` when writes fail) - remaining failed targets. +- Hook command output now includes deterministic retry observability summary text: attempted/recovered/requeued counts plus transient/permanent failure counts. ## Reconciliation metrics contract - Reconciliation mapping metrics entrypoint: `cli/src/services/hosted_reconciliation.rs` -> `summarize_reconciliation_metrics`. From 8a899f0bf34e18b3dc6346326af03e9dea1ddbd4 Mon Sep 17 00:00:00 2001 From: David Abram Date: Thu, 5 Mar 2026 13:24:21 +0100 Subject: [PATCH 39/39] services: Remove placeholder hook-event scaffolding and harden local hook docs Drop obsolete placeholder hook event/generated-region contracts from production hooks runtime and update tests to reflect real subcommand flows. Clean up warning-prone/test-only paths and small lint nits while documenting production `sce hooks` subcommands in the CLI README. 
--- cli/README.md | 35 +++-- cli/src/services/agent_trace.rs | 1 + cli/src/services/hooks.rs | 139 ------------------ cli/src/services/hooks/tests.rs | 40 +---- cli/src/services/local_db.rs | 2 +- cli/src/services/setup.rs | 2 + cli/src/services/setup/tests.rs | 5 +- context/architecture.md | 2 +- context/cli/placeholder-foundation.md | 4 +- context/patterns.md | 2 +- .../agent-trace-local-hooks-production-mvp.md | 28 +++- .../sce/agent-trace-hooks-command-routing.md | 1 + 12 files changed, 64 insertions(+), 197 deletions(-) diff --git a/cli/README.md b/cli/README.md index 4efe451a..44af2d45 100644 --- a/cli/README.md +++ b/cli/README.md @@ -3,9 +3,11 @@ This crate provides the early command-surface scaffold for the Shared Context Engineering CLI (`sce`). -Current scope is intentionally narrow: deterministic command dispatch, an -implemented repository `setup` flow, implemented local rollout health checks -via `doctor`, and explicit placeholders for commands that are still deferred. +Current scope is intentionally narrow: deterministic command dispatch, +implemented repository `setup` flows (including hook installation), +implemented local rollout health checks via `doctor`, production local +`hooks` runtime execution, and explicit placeholders for commands that are +still deferred. ## Quick start @@ -14,7 +16,8 @@ cargo run --manifest-path cli/Cargo.toml -- --help cargo run --manifest-path cli/Cargo.toml -- setup cargo run --manifest-path cli/Cargo.toml -- doctor cargo run --manifest-path cli/Cargo.toml -- mcp -cargo run --manifest-path cli/Cargo.toml -- hooks +cargo run --manifest-path cli/Cargo.toml -- hooks pre-commit +cargo run --manifest-path cli/Cargo.toml -- hooks commit-msg .git/COMMIT_EDITMSG cargo run --manifest-path cli/Cargo.toml -- sync ``` @@ -57,6 +60,9 @@ Crates.io is prepared but intentionally disabled in this phase. 
`config/.claude/**` - installation writes to repository-root `.opencode/` and/or `.claude/` using backup-and-replace safety with rollback on swap failures + - required local hooks can be installed with `sce setup --hooks` (optionally + `--repo `) with deterministic per-hook + `installed`/`updated`/`skipped` outcomes - `doctor` is implemented and validates hook rollout readiness: - detects effective hooks directory for default, per-repo `core.hooksPath`, and global `core.hooksPath` installs @@ -65,25 +71,32 @@ Crates.io is prepared but intentionally disabled in this phase. - reports actionable diagnostics for missing or misconfigured hooks - `mcp` is a placeholder for future file-cache tooling contracts (`cache-put`/`cache-get`). -- `hooks` is a placeholder for future git hook event and generated-region - tracking integration. +- `hooks` is implemented for local Git hook execution: + - `sce hooks pre-commit` captures staged-only checkpoint attribution + - `sce hooks commit-msg ` enforces canonical co-author trailer + policy when runtime gates pass + - `sce hooks post-commit` finalizes Agent Trace records and performs + notes+DB persistence with retry fallback + - `sce hooks post-rewrite ` ingests rewrite pairs from + STDIN, applies rewrite remap + rewritten-trace finalization, and runs + bounded retry replay - `sync` is a placeholder that runs a local Turso smoke check, then reports a deferred cloud-sync plan. ## Safety and limitations -- `mcp`, `hooks`, and `sync` remain placeholders and do not perform MCP - transport or cloud sync. +- `mcp` and `sync` remain placeholders and do not perform MCP transport or + cloud sync. - `sync` only validates local adapter wiring and does not require remote auth. -- This crate is scaffolding for incremental delivery and should not be treated - as production-ready workflow automation. +- Hosted reconciliation intake/mapping paths are not wired to public CLI + commands yet. 
## Near-term roadmap mapping - Repository setup automation seam: `cli/src/services/setup.rs` - Hook install health validation seam: `cli/src/services/doctor.rs` - MCP file-cache seam: `cli/src/services/mcp.rs` -- Hook event and generated-region seam: `cli/src/services/hooks.rs` +- Local hook runtime + persistence seam: `cli/src/services/hooks.rs` - Cloud sync seam + local Turso gate: `cli/src/services/sync.rs` - Command catalog and placeholder status: `cli/src/command_surface.rs` diff --git a/cli/src/services/agent_trace.rs b/cli/src/services/agent_trace.rs index d6ab0c55..a2af4db0 100644 --- a/cli/src/services/agent_trace.rs +++ b/cli/src/services/agent_trace.rs @@ -74,6 +74,7 @@ impl QualityStatus { } } +#[cfg_attr(not(test), allow(dead_code))] #[derive(Clone, Copy, Debug, Eq, PartialEq)] pub enum ContributorType { Human, diff --git a/cli/src/services/hooks.rs b/cli/src/services/hooks.rs index 240dd442..c86bac66 100644 --- a/cli/src/services/hooks.rs +++ b/cli/src/services/hooks.rs @@ -2709,144 +2709,5 @@ pub fn finalize_pre_commit_checkpoint( PreCommitFinalization::Finalized(FinalizedCheckpoint { anchors, files }) } -#[derive(Clone, Copy, Debug, Eq, PartialEq)] -pub enum GitHookKind { - PreCommit, - PostCommit, - PrePush, -} - -#[derive(Clone, Copy, Debug, Eq, PartialEq)] -pub enum GeneratedRegionLifecycle { - Discovered, - Updated, - Removed, -} - -#[derive(Clone, Debug, Eq, PartialEq)] -pub struct GeneratedRegionEvent { - pub file_path: String, - pub marker_id: String, - pub lifecycle: GeneratedRegionLifecycle, -} - -#[derive(Clone, Debug, Eq, PartialEq)] -pub struct HookEvent { - pub hook: GitHookKind, - pub region_event: Option, -} - -#[derive(Clone, Debug, Eq, PartialEq)] -pub struct HookEventModel { - pub supported_hooks: Vec, - pub generated_region_tracking: bool, -} - -pub trait HookService { - fn event_model(&self) -> HookEventModel; - fn record(&self, event: HookEvent) -> Result<()>; -} - -#[derive(Clone, Copy, Debug, Default)] -pub struct 
PlaceholderHookService; - -impl HookService for PlaceholderHookService { - fn event_model(&self) -> HookEventModel { - HookEventModel { - supported_hooks: vec![ - GitHookKind::PreCommit, - GitHookKind::PostCommit, - GitHookKind::PrePush, - ], - generated_region_tracking: true, - } - } - - fn record(&self, event: HookEvent) -> Result<()> { - match event.hook { - GitHookKind::PreCommit | GitHookKind::PostCommit | GitHookKind::PrePush => {} - } - - if let Some(region_event) = event.region_event { - match region_event.lifecycle { - GeneratedRegionLifecycle::Discovered - | GeneratedRegionLifecycle::Updated - | GeneratedRegionLifecycle::Removed => {} - } - - let _ = (region_event.file_path, region_event.marker_id); - } - - Ok(()) - } -} - -pub fn run_placeholder_hooks() -> Result { - let service = PlaceholderHookService; - let model = service.event_model(); - - let staged_only_preview = finalize_pre_commit_checkpoint( - &PreCommitRuntimeState { - sce_disabled: false, - cli_available: true, - is_bare_repo: false, - }, - PreCommitTreeAnchors { - index_tree: "placeholder-index-tree".to_string(), - head_tree: Some("placeholder-head-tree".to_string()), - }, - PendingCheckpoint { - files: vec![PendingFileCheckpoint { - path: "context/generated/hooks.md".to_string(), - staged_ranges: vec![PendingLineRange { - start_line: 1, - end_line: 1, - }], - unstaged_ranges: vec![PendingLineRange { - start_line: 2, - end_line: 2, - }], - }], - }, - ); - - let staged_file_count = match staged_only_preview { - PreCommitFinalization::Finalized(checkpoint) => checkpoint.files.len(), - PreCommitFinalization::NoOp(_) => 0, - }; - - let commit_message_preview = apply_commit_msg_coauthor_policy( - &CommitMsgRuntimeState { - sce_disabled: false, - sce_coauthor_enabled: true, - has_staged_sce_attribution: true, - }, - "chore: hooks placeholder preview", - ); - let trailer_applied = commit_message_preview.contains(CANONICAL_SCE_COAUTHOR_TRAILER); - - for lifecycle in [ - 
GeneratedRegionLifecycle::Discovered, - GeneratedRegionLifecycle::Updated, - GeneratedRegionLifecycle::Removed, - ] { - service.record(HookEvent { - hook: GitHookKind::PreCommit, - region_event: Some(GeneratedRegionEvent { - file_path: "context/generated/hooks.md".to_string(), - marker_id: "placeholder-generated-region".to_string(), - lifecycle, - }), - })?; - } - - Ok(format!( - "TODO: '{NAME}' is planned and not implemented yet. Hook event model reserves {} git hook(s) with generated-region tracking placeholders, staged-only pre-commit checkpoint preview over {} file(s), and commit-msg canonical trailer preview applied={}.", - model.supported_hooks.len(), - staged_file_count, - trailer_applied - )) -} - #[cfg(test)] mod tests; diff --git a/cli/src/services/hooks/tests.rs b/cli/src/services/hooks/tests.rs index df130949..7c8bd76e 100644 --- a/cli/src/services/hooks/tests.rs +++ b/cli/src/services/hooks/tests.rs @@ -17,12 +17,10 @@ use super::{ apply_commit_msg_coauthor_policy, finalize_post_commit_trace, finalize_post_rewrite_remap, finalize_pre_commit_checkpoint, finalize_rewrite_trace, parse_hooks_subcommand, process_trace_retry_queue, resolve_pre_commit_checkpoint_path, - run_commit_msg_subcommand_in_repo, run_hooks_subcommand, run_placeholder_hooks, - run_post_commit_subcommand_in_repo, run_post_rewrite_subcommand_in_repo, - run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, GeneratedRegionEvent, - GeneratedRegionLifecycle, GitHookKind, HookEvent, HookService, HookSubcommand, - PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, PersistenceErrorClass, - PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PlaceholderHookService, + run_commit_msg_subcommand_in_repo, run_hooks_subcommand, run_post_commit_subcommand_in_repo, + run_post_rewrite_subcommand_in_repo, run_pre_commit_subcommand_in_repo, CommitMsgRuntimeState, + HookSubcommand, PendingCheckpoint, PendingFileCheckpoint, PendingLineRange, + PersistenceErrorClass, 
PersistenceFailure, PersistenceTarget, PersistenceWriteResult, PostCommitFinalization, PostCommitInput, PostCommitNoOpReason, PostCommitRuntimeState, PostRewriteFinalization, PostRewriteNoOpReason, PostRewriteRuntimeState, PreCommitFinalization, PreCommitNoOpReason, PreCommitRuntimeState, PreCommitTreeAnchors, RetryMetricsSink, @@ -1249,36 +1247,6 @@ fn post_rewrite_runtime_skips_duplicate_pair_replay() -> Result<()> { Ok(()) } -#[test] -fn hooks_placeholder_event_model_reserves_generated_region_tracking() { - let service = PlaceholderHookService; - let model = service.event_model(); - assert!(model.generated_region_tracking); - assert_eq!(model.supported_hooks.len(), 3); -} - -#[test] -fn hooks_placeholder_message_mentions_event_model() -> Result<()> { - let message = run_placeholder_hooks()?; - assert!(message.contains("Hook event model reserves")); - Ok(()) -} - -#[test] -fn hooks_placeholder_accepts_generated_region_events() -> Result<()> { - let service = PlaceholderHookService; - let event = HookEvent { - hook: GitHookKind::PreCommit, - region_event: Some(GeneratedRegionEvent { - file_path: "context/plans/example.md".to_string(), - marker_id: "generated:example".to_string(), - lifecycle: GeneratedRegionLifecycle::Updated, - }), - }; - - service.record(event) -} - #[test] fn parse_hooks_subcommand_routes_pre_commit() -> Result<()> { let parsed = parse_hooks_subcommand(vec!["pre-commit".to_string()])?; diff --git a/cli/src/services/local_db.rs b/cli/src/services/local_db.rs index c28f8c1a..f98d1657 100644 --- a/cli/src/services/local_db.rs +++ b/cli/src/services/local_db.rs @@ -206,7 +206,7 @@ fn resolve_state_data_root() -> Result { if let Some(xdg_state_home) = std::env::var_os("XDG_STATE_HOME") { return Ok(PathBuf::from(xdg_state_home)); } - return Ok(resolve_home_dir()?.join(".local").join("state")); + Ok(resolve_home_dir()?.join(".local").join("state")) } #[cfg(not(any(target_os = "windows", target_os = "macos", target_os = "linux")))] diff --git 
a/cli/src/services/setup.rs b/cli/src/services/setup.rs index 0de189bc..02223ed1 100644 --- a/cli/src/services/setup.rs +++ b/cli/src/services/setup.rs @@ -23,6 +23,7 @@ pub struct EmbeddedAsset { pub bytes: &'static [u8], } +#[cfg_attr(not(test), allow(dead_code))] #[derive(Clone, Copy, Debug, Eq, PartialEq)] pub enum RequiredHookAsset { PreCommit, @@ -36,6 +37,7 @@ pub fn iter_required_hook_assets() -> std::slice::Iter<'static, EmbeddedAsset> { HOOK_EMBEDDED_ASSETS.iter() } +#[cfg_attr(not(test), allow(dead_code))] pub fn get_required_hook_asset(hook: RequiredHookAsset) -> Option<&'static EmbeddedAsset> { let hook_name = match hook { RequiredHookAsset::PreCommit => "pre-commit", diff --git a/cli/src/services/setup/tests.rs b/cli/src/services/setup/tests.rs index 1d7bb8ab..20ee5fdf 100644 --- a/cli/src/services/setup/tests.rs +++ b/cli/src/services/setup/tests.rs @@ -376,10 +376,7 @@ fn install_engine_rolls_back_when_swap_fails() -> Result<()> { |from, to| { rename_calls.set(rename_calls.get() + 1); if rename_calls.get() == 2 { - return Err(io::Error::new( - io::ErrorKind::Other, - "injected swap failure", - )); + return Err(io::Error::other("injected swap failure")); } fs::rename(from, to) diff --git a/context/architecture.md b/context/architecture.md index 857be1f0..c9a32563 100644 --- a/context/architecture.md +++ b/context/architecture.md @@ -79,7 +79,7 @@ The repository includes a new placeholder Rust binary crate at `cli/`. - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) that resolves effective git hook-path source (default, local `core.hooksPath`, global `core.hooksPath`) and validates required hook files (`pre-commit`, `commit-msg`, `post-commit`) for presence and executable permissions. 
- `cli/src/services/agent_trace.rs` defines the Agent Trace schema adapter and builder contracts (`adapt_trace_payload`, `build_trace_payload`), including fixed git VCS identity, reserved reverse-domain metadata keys, and deterministic AI `model_id` normalization before schema-compliance validation. - `cli/src/services/mcp.rs` defines MCP file-cache capability contracts (`McpService`, transport/capability snapshots, cache policy) with non-runnable placeholder tool declarations. -- `cli/src/services/hooks.rs` defines hook-event and generated-region tracking contracts (`HookService`, `HookEventModel`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a retry replay seam (`process_trace_retry_queue`) that re-attempts only failed persistence targets and emits per-attempt runtime/error-class metrics, bounded operational retry replay invocation from post-commit/post-rewrite flows (`process_runtime_retry_queue`), a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. 
+- `cli/src/services/hooks.rs` defines production local hook runtime parsing/dispatch (`HookSubcommand`, `parse_hooks_subcommand`, `run_hooks_subcommand`) plus a pre-commit staged-checkpoint finalization seam (`finalize_pre_commit_checkpoint`) that enforces staged-only attribution and carries index/tree anchors with explicit no-op guard states, a commit-msg co-author policy seam (`apply_commit_msg_coauthor_policy`) that injects one canonical SCE trailer only for allowed attributed commits, a post-commit trace finalization seam (`finalize_post_commit_trace`) that performs notes+DB dual writes with idempotency ledger guards and retry-queue fallback capture, a retry replay seam (`process_trace_retry_queue`) that re-attempts only failed persistence targets and emits per-attempt runtime/error-class metrics, bounded operational retry replay invocation from post-commit/post-rewrite flows (`process_runtime_retry_queue`), a post-rewrite remap-ingestion seam (`finalize_post_rewrite_remap`) that parses old->new SHA pairs and derives deterministic replay keys for remap dispatch, and a rewrite trace transformation seam (`finalize_rewrite_trace`) that emits rewritten-SHA Agent Trace records with rewrite metadata plus confidence-based quality status. - `cli/src/services/hosted_reconciliation.rs` defines hosted intake/orchestration seams (`ingest_hosted_rewrite_event`, `ReconciliationRunStore`) that verify provider signatures (GitHub HMAC-SHA256 and GitLab token equality), parse provider payload old/new heads, normalize deterministic idempotency-backed reconciliation run requests, resolve deterministic old->new rewrite mappings (`map_rewritten_commit`) with patch-id exact precedence, range-diff/fuzzy fallback scoring, and explicit unresolved classifications, and summarize mapped/unmapped confidence/runtime/error-class telemetry (`summarize_reconciliation_metrics`). 
- `cli/src/services/sync.rs` runs the local adapter through a lazily initialized shared tokio current-thread runtime and composes a placeholder cloud-sync abstraction (`CloudSyncGateway`) so local Turso validation and deferred cloud planning remain separated. - `cli/src/services/` contains module boundaries for setup, doctor, MCP, hooks, sync, and local DB adapters with explicit trait seams for future implementations. diff --git a/context/cli/placeholder-foundation.md b/context/cli/placeholder-foundation.md index 2f919179..240fe2df 100644 --- a/context/cli/placeholder-foundation.md +++ b/context/cli/placeholder-foundation.md @@ -77,7 +77,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro - `cli/src/services/doctor.rs` defines hook rollout health validation (`run_doctor`) with path-source detection (default/local/global) and required-hook presence/executable checks. - `cli/src/services/agent_trace.rs` defines the task-scoped schema adapter contract (`adapt_trace_payload`) from internal attribution input structs to Agent Trace-shaped record structs, including fixed git `vcs` mapping, contributor type mapping, and reserved `dev.crocoder.sce.*` metadata placement. - `cli/src/services/mcp.rs` defines `McpService`, a `McpCapabilitySnapshot` model (primary + supported transports), and `CachePolicy` defaults for future file-cache workflows (`cache-put`/`cache-get`) with `runnable: false` placeholders. -- `cli/src/services/hooks.rs` defines hook runtime command parsing/dispatch (`HookSubcommand`, `parse_hooks_subcommand`, `run_hooks_subcommand`) and retains hook-event/generated-region placeholder contracts (`HookEventModel`, `HookEvent`, `GeneratedRegionEvent`) for future listener-oriented integrations. 
+- `cli/src/services/hooks.rs` defines production local hook runtime parsing/dispatch (`HookSubcommand`, `parse_hooks_subcommand`, `run_hooks_subcommand`) for `pre-commit`, `commit-msg`, `post-commit`, and `post-rewrite`, plus checkpoint/persistence/retry finalization seams used by hook entrypoints. - `cli/src/services/sync.rs` defines cloud-sync abstraction points (`CloudSyncGateway`, `CloudSyncRequest`, `CloudSyncPlan`) layered after the local Turso smoke gate. - `cli/src/app.rs` dispatches `setup`, `doctor`, `mcp`, and `hooks` through service-level modules so runtime messages are sourced from domain modules instead of inline strings. @@ -96,7 +96,7 @@ Placeholder commands currently acknowledge planned behavior and do not claim pro - `cli/src/app.rs` additionally validates setup contract routing for interactive default, explicit target flags, and mutually-exclusive setup flag failures. - `cli/src/services/local_db.rs` tests cover in-memory and file-backed local Turso initialization plus execute/query smoke checks. - `cli/src/services/sync.rs` test confirms `sync` runs the local smoke gate and returns deterministic placeholder messaging. -- `cli/src/services/{setup,mcp,hooks,sync}.rs` include contract-focused tests for setup flag parsing/validation, interactive selection/cancellation dispatch, setup run messaging, and non-runnable capability/event plans. +- `cli/src/services/{setup,mcp,hooks,sync}.rs` include contract-focused tests for setup flag parsing/validation, interactive selection/cancellation dispatch, setup run messaging, and hook runtime argument/IO/finalization behavior. - `cli/src/services/agent_trace.rs` includes adapter mapping tests for required field projection, contributor enum/model_id handling, and extension metadata placement under reserved reverse-domain keys. 
- `cli/src/services/setup.rs` tests also verify embedded-manifest completeness against runtime `config/` trees, deterministic sorted path normalization, target-scoped iterator behavior (`OpenCode`, `Claude`, `Both`), install backup creation/replacement, and rollback restoration after injected swap failures. - `cli/src/services/setup.rs` and `cli/src/services/local_db.rs` now share temporary path setup through `crate::test_support::TestTempDir` to keep filesystem test fixtures consistent and cleanup deterministic. diff --git a/context/patterns.md b/context/patterns.md index b5b2d495..b1b7e5e2 100644 --- a/context/patterns.md +++ b/context/patterns.md @@ -86,7 +86,7 @@ - For placeholder commands that need real infrastructure checks, use a lazily initialized shared tokio current-thread runtime wrapper in the service layer (`cli/src/services/sync.rs`) and keep user-facing output explicit about remaining placeholder scope. - For rollout health commands, prefer deterministic local diagnostics over implicit pass/fail behavior: report hook-path source, effective directories, required-hook checks, and actionable remediation text (`cli/src/services/doctor.rs`). - For future CLI domains, define trait-first service contracts with request/plan models in `cli/src/services/*` and keep placeholder implementations explicitly non-runnable until production behavior is approved. -- Model deferred integration boundaries with concrete event/capability data structures (for example MCP file-cache snapshots/policies, git-hook/generated-region events, cloud-sync checkpoints) so later tasks can implement behavior without reshaping public seams. +- Model deferred integration boundaries with concrete event/capability data structures (for example MCP file-cache snapshots/policies and cloud-sync checkpoints) so later tasks can implement behavior without reshaping public seams. 
- For pre-commit attribution finalization seams, keep pending staged and unstaged ranges explicitly separated in input models and finalize from staged ranges only, while carrying index/tree anchors for deterministic commit-time attribution binding. - For commit-msg co-author policy seams, gate canonical trailer insertion on runtime controls (`SCE_DISABLED`, `SCE_COAUTHOR_ENABLED`) plus staged SCE-attribution presence, and enforce idempotent dedupe so allowed cases end with exactly one `Co-authored-by: SCE ` trailer. - For post-commit trace finalization seams, treat commit SHA as the idempotency identity, perform notes + DB writes in the same finalize pass when available, and enqueue retry-fallback entries that explicitly capture failed persistence targets for replay-safe recovery. diff --git a/context/plans/agent-trace-local-hooks-production-mvp.md b/context/plans/agent-trace-local-hooks-production-mvp.md index 7cdf6f64..f29f0fa3 100644 --- a/context/plans/agent-trace-local-hooks-production-mvp.md +++ b/context/plans/agent-trace-local-hooks-production-mvp.md @@ -120,7 +120,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_recovers_failed_notes_write_and_emits_success_metric` - `cargo test --manifest-path cli/Cargo.toml hooks::tests::retry_processor_requeues_when_db_write_still_fails` -- [ ] T09: Hardening pass for production gates (warnings, docs, rollout/runbook) (status:todo) +- [x] T09: Hardening pass for production gates (warnings, docs, rollout/runbook) (status:done) - Task ID: T09 - Goal: Satisfy hard release gates by eliminating dead-code warnings in MVP modules through real wiring, tightening failure diagnostics, and updating operator docs. 
- Boundaries (in/out of scope): @@ -134,7 +134,7 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `cargo test --manifest-path cli/Cargo.toml` - Documentation parity review across `cli/README.md` and context artifacts. -- [ ] T10: Validation and cleanup (status:todo) +- [x] T10: Validation and cleanup (status:done) - Task ID: T10 - Goal: Execute final end-to-end validation, evidence capture, artifact cleanup, and context sync verification for production-readiness signoff. - Boundaries (in/out of scope): @@ -151,6 +151,30 @@ Connect the existing Agent Trace service seams into a fully functional local Git - `nix run ./cli#clippy` - `nix run .#pkl-check-generated` - `nix flake check` + - Completion evidence: + - 2026-03-05: All listed verification commands executed in this session; checks passed. + - `cargo fmt --check` initially reported formatting drift in `cli/src/services/hooks/tests.rs`; `cargo fmt` applied and re-check passed. + - Residual risk/deferred item: `cargo build` and `nix run ./cli#clippy` still report dead-code warnings in hosted reconciliation scaffolding (`cli/src/services/hosted_reconciliation.rs`); this remains outside Local Hooks MVP scope. + - Validation report: + - Commands run + outcomes: + - `cargo fmt --manifest-path cli/Cargo.toml -- --check` -> exit 1 initially (formatting drift in `cli/src/services/hooks/tests.rs`), then exit 0 after `cargo fmt --manifest-path cli/Cargo.toml`. + - `cargo build --manifest-path cli/Cargo.toml` -> exit 0 (build succeeded; warnings only). + - `cargo test --manifest-path cli/Cargo.toml` -> exit 0 (`124 passed; 0 failed`). + - `nix run ./cli#clippy` -> exit 0 (clippy completed; warnings only). + - `nix run .#pkl-check-generated` -> exit 0 (`Generated outputs are up to date.`). + - `nix flake check` -> exit 0 (all configured checks evaluated/built successfully). 
+ - Failed checks + follow-up: + - Initial `cargo fmt --check` failure resolved in-task by running `cargo fmt` and rerunning `cargo fmt --check`. + - Success-criteria verification summary: + - Local hooks MVP runtime criteria remain satisfied by passing full CLI test suite, including local hook runtime paths (`pre-commit`, `commit-msg`, `post-commit`, `post-rewrite`) and retry behavior coverage. + - Persistence/readiness criteria remain validated by tests and successful build/lint/check gates. + - Hard release gates for this final validation task executed with deterministic command evidence recorded above. + - Cleanup: + - No task-scoped temporary scaffolding/artifacts required cleanup in `context/tmp/` for this session. + - Context sync verification: + - Shared root files verified against current code truth: `context/overview.md`, `context/architecture.md`, `context/glossary.md`, `context/patterns.md`, and `context/context-map.md`. + - Verify-only root pass applied (no additional root context edits needed for this validation-only task). + - Feature discoverability links confirmed in `context/context-map.md` for Local Hooks MVP behavior and related Agent Trace domain files. ## 5) Open questions - None. diff --git a/context/sce/agent-trace-hooks-command-routing.md b/context/sce/agent-trace-hooks-command-routing.md index 5416dd45..ba24005b 100644 --- a/context/sce/agent-trace-hooks-command-routing.md +++ b/context/sce/agent-trace-hooks-command-routing.md @@ -14,6 +14,7 @@ ## Parser and dispatch behavior - `cli/src/app.rs` routes `hooks` through dedicated hook-subcommand parsing instead of generic no-arg subcommand parsing. - `cli/src/services/hooks.rs` now owns hook CLI usage text, deterministic parse errors, and runtime dispatch through `HookSubcommand` + `run_hooks_subcommand`. +- Placeholder hook-event/generated-region scaffolding has been removed from production hook modules; local hook runtime behavior is driven only by production entrypoint/finalization seams. 
- Invalid and ambiguous invocations return deterministic actionable errors pointing to `sce hooks --help`. ## Current runtime entrypoint behavior