diff --git a/.agents/DOCUMENTATION_GUIDE.md b/.agents/DOCUMENTATION_GUIDE.md
new file mode 100644
index 0000000..7dc9237
--- /dev/null
+++ b/.agents/DOCUMENTATION_GUIDE.md
@@ -0,0 +1,169 @@
+# BES Documentation Placement Guide
+
+Status: canonical shared guidance, 2026-04-29.
+
+Purpose: make every Codex, Claude, and Symphony worker put documentation in the
+right place. Read this before creating, moving, archiving, or publishing docs,
+specs, plans, audits, or board records.
+
+## Core Rule
+
+Use `.agents/` for agent orchestration. Use repo-native `docs/` paths for
+durable product knowledge.
+
+When unsure, create a draft task/audit spec under `.agents/specs/` and ask for
+owner approval before moving anything into public or product docs.
+
+## What Goes In `.agents/`
+
+| Path | Use for | Notes |
+|---|---|---|
+| `.agents/specs/` | Agent task specs, audit proposals, migration proposals, Symphony-dispatched execution specs | Default location for work-control specs. These are executable contracts for agents. |
+| `.agents/specs/SPEC.template.md` | Shared spec template | Copy/update for non-trivial tasks. |
+| `.agents/skills/` | Codex/shared skill procedures | Canonical shared skills. Claude copies mirror these under `.claude/skills/`. |
+| `.agents/workflows/` | Shared workflow wrappers | Canonical command/workflow prompts. |
+| `.agents/BOARD_SEED.md` | Initial tracker backlog shape | Root only unless a repo needs its own board seed. |
+| `.agents/DOCUMENTATION_GUIDE.md` | This placement policy | Keep copies aligned across active repos. |
+| `.agents/archive/` | Superseded agent-control artifacts | Not product history unless explicitly promoted. |
+
+Do not put long-lived product architecture only in `.agents/`. If it matters to
+humans or contributors, graduate it into the repo's native docs path after
+owner approval.
+
+## What Goes In Root Docs
+
+The root checkout is the company control plane, not a product monorepo.
+
+| Path | Use for |
+|---|---|
+| `AGENTS.md` | Root agent entrypoint and policy summary |
+| `CLAUDE.md` | Claude entrypoint importing root policy |
+| `WORKFLOW.md` | Company-level Symphony workflow contract |
+| `.agents/OPERATING_MODEL.md` | Canonical fleet operating model |
+| `.agents/BOARD_SEED.md` | Tracker/Symphony seed backlog |
+| `.agents/specs/` | Company-control specs and audit proposals |
+
+Do not add product implementation docs to root unless the work is truly
+cross-company. Cross-company ideas should usually become a control-plane spec
+that links to repo-local specs.
+
+## Repo-Native Product Doc Locations
+
+### ACTOCCATUD
+
+| Path | Use for |
+|---|---|
+| `.agents/specs/` | Agent/Symphony audit and task-control specs |
+| `docs/plans/` | V1 sequencing, active plans, supersession records |
+| `docs/systems/` | Durable system specs |
+| `docs/engineering/` | Engineering research, process, gap registries |
+| `docs/content/` | Content specs and catalogs |
+| `docs/creative/` / `docs/design/` | Creative/design-system surfaces |
+| `docs/reviews/` | Audit and review findings |
+
+Dispatch authority currently starts from `docs/plans/2026-04-27-v1-truth.md`.
+
+### Floom
+
+| Path | Use for |
+|---|---|
+| `.agents/specs/` | Agent/Symphony audit and task-control specs |
+| `docs/superpowers/specs/` | Durable product/compiler specs |
+| `docs/superpowers/plans/` | Active and historical implementation plans |
+| `docs/superpowers/research/` | Research records |
+| `docs/concepts/` | First-principles concept documents |
+| `docs/architecture.md` / `docs/getting-started.md` | Public-facing technical docs |
+
+Demo/product work that becomes durable architecture belongs in
+`docs/superpowers/specs/`; orchestration specs stay in `.agents/specs/`.
+
+### UsefulIdiots
+
+| Path | Use for |
+|---|---|
+| `.agents/specs/` | Agent/Symphony audit and task-control specs |
+| `docs/specs/` | Durable system architecture specs |
+| `docs/01-concept.md` through `docs/07-narrative.md` | Locked game-design authority |
+| `docs/LOCKED.md` | Decisions that should not be re-litigated |
+| `docs/glossary.md` | Canonical terminology |
+
+No product code should be written from skeleton specs. Detailed system specs
+must be approved first.
+
+### IKTO
+
+| Path | Use for |
+|---|---|
+| `.agents/specs/` | Agent/Symphony audit and task-control specs |
+| `docs/superpowers/specs/` | Durable design/product specs |
+| `docs/superpowers/memos/` | Decision and research memos |
+| `docs/superpowers/research/` | Research records |
+| `docs/plans/` | Phase plans, audits, open-question catalogs |
+| `docs/content/` | Content/category specs |
+
+Post-pivot work should first clarify whether older docs are historical,
+superseded, or still active.
+
+### Wick
+
+Public OSS caution applies.
+
+| Path | Use for |
+|---|---|
+| `.agents/specs/` | Agent/Symphony audit and task-control specs; keep local unless approved |
+| `docs/` | Public contributor/user documentation |
+| `docs/planning/` | Historical and active project plans |
+| `docs/public-testing/` | Public-test reports |
+| `SECURITY.md`, `CONTRIBUTING.md`, `CHANGELOG.md` | Standard public OSS surfaces |
+
+Do not publish internal BES agent-control language into Wick docs unless it is
+intentionally contributor-facing.
+
+### Mimir
+
+Public OSS caution applies.
+
+| Path | Use for |
+|---|---|
+| `.agents/specs/` | Agent/Symphony audit and task-control specs; keep local unless approved |
+| `docs/concepts/` | Durable product architecture specs |
+| `docs/integrations/` | Integration setup docs |
+| `docs/blog/` | Public article drafts/content |
+| `docs/launch-readiness.md` | Launch checklist and promise audit |
+| `docs/launch-posting-plan.md` | Public launch/posting plan |
+| `.planning/planning/` | Historical planning archive |
+
+Mimir product docs may mention memory, hooks, and checkpoint features. BES
+fleet operating docs must not use Mimir hooks or raw memory as authority until
+a new spec-authority integration is approved.
+
+## Promotion Rules
+
+Use this promotion path:
+
+1. Start with `.agents/specs/-/SPEC.md` for task control.
+2. If the work creates durable product knowledge, add or update the repo-native
+   docs path listed above.
+3. Keep audit notes and completion evidence in the task spec.
+4. Do not move audit prose wholesale into public docs. Rewrite it for the
+   intended audience.
+5. Public OSS repos require owner approval before publishing agent workflow or
+   internal planning language.
+
+## Archive And Supersession Rules
+
+- Prefer supersession headers over deletion when old docs explain why a
+  decision changed.
+- Delete only when the file is a one-time bootstrap, generated scratch, or an
+  owner-approved removal.
+- If a doc is historical, say which doc supersedes it.
+- If a doc is active, say where its implementation work is tracked.
+- If a doc is rejected, preserve the rejection reason in a spec, ADR, or plan.
+
+## Memory And Evidence
+
+Raw chat history, Claude memory, Codex memory, Mimir drafts, and old agent notes
+are evidence only. They do not decide document placement.
+
+Durable rules must cite a checked-in file, approved spec, command output, issue,
+PR, or direct owner instruction.
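The Core Rule in `.agents/DOCUMENTATION_GUIDE.md` can be sketched as a small path classifier. This is a minimal illustration and not part of the patch: the function name and the three labels are hypothetical, and it looks only at the first path component, so root-level files such as `AGENTS.md` fall through to the "ask the owner" branch.

```python
from pathlib import PurePosixPath

# Hypothetical labels; the guide describes these categories in prose only.
AGENT_CONTROL = "agent-control"
PRODUCT_DOCS = "product-docs"
NEEDS_OWNER = "draft-spec-and-ask-owner"


def placement_kind(path: str) -> str:
    """Classify a repo-relative path under the guide's Core Rule."""
    parts = PurePosixPath(path).parts
    if not parts:
        return NEEDS_OWNER
    if parts[0] == ".agents":
        # Agent orchestration: specs, skills, workflows, archive.
        return AGENT_CONTROL
    if parts[0] == "docs":
        # Repo-native location for durable product knowledge.
        return PRODUCT_DOCS
    # "When unsure": draft a spec under .agents/specs/ and ask the owner.
    return NEEDS_OWNER
```

A real check would also need the per-repo tables, since paths like `.planning/planning/` or `SECURITY.md` are valid locations that this sketch does not model.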
diff --git a/.agents/skills/code-review/SKILL.md b/.agents/skills/code-review/SKILL.md
new file mode 100644
index 0000000..bcc22d0
--- /dev/null
+++ b/.agents/skills/code-review/SKILL.md
@@ -0,0 +1,36 @@
+---
+name: code-review
+description: Use for reviewing local diffs or PRs. Prioritizes bugs, regressions, missing tests, unsafe assumptions, and broken repo contracts over summaries.
+---
+
+# Code Review
+
+Use this when asked to review.
+
+## Review Focus
+
+- Correctness bugs.
+- Behavioral regressions.
+- Missing or weak tests.
+- Security, privacy, or secret-handling risks.
+- Broken architecture boundaries.
+- Drift from `AGENTS.md`, approved specs, or public docs.
+- Verification gaps.
+
+## Output
+
+Findings first, ordered by severity. Each finding should include:
+
+- file and line reference when available
+- the concrete risk
+- why the current change causes it
+- a practical fix direction
+
+Then include open questions and a brief summary only after findings.
+
+## Hard Rules
+
+- If there are no findings, say that clearly and list residual risk.
+- Do not lead with praise or broad summaries.
+- Do not request stylistic churn unless it affects correctness,
+  maintainability, or repo contracts.
diff --git a/.agents/skills/implementation-execution/SKILL.md b/.agents/skills/implementation-execution/SKILL.md
new file mode 100644
index 0000000..3b4cbf4
--- /dev/null
+++ b/.agents/skills/implementation-execution/SKILL.md
@@ -0,0 +1,34 @@
+---
+name: implementation-execution
+description: Use when implementing an approved BES SPEC.md. Keeps edits scoped, preserves user work, updates directly coupled tests/docs, and stops when new facts change scope.
+---
+
+# Implementation Execution
+
+Use only after a spec is approved by the owner or controlling workflow.
+
+## Steps
+
+1. Re-read the approved `SPEC.md`.
+2. Re-read the repo `AGENTS.md` and relevant docs.
+3. Confirm branch/worktree state with `git status --short --branch`.
+4. Edit only files named by the spec or directly required by the change.
+5. Add or update tests before or with production changes when behavior changes.
+6. Keep unrelated refactors out of scope.
+7. Run the spec acceptance commands.
+8. Prepare the completion report requested by the spec.
+
+## Stop Conditions
+
+- New facts materially change scope.
+- Required files contain unrelated local changes that make safe editing
+  ambiguous.
+- Verification requires unavailable secrets or infrastructure.
+- The spec's acceptance criteria are not testable.
+
+## Hard Rules
+
+- Preserve unrelated user changes.
+- Do not silently expand scope.
+- Do not bypass hooks, CI, or verification gates.
+- Do not claim completion without fresh verification evidence.
diff --git a/.agents/skills/release-pr/SKILL.md b/.agents/skills/release-pr/SKILL.md
new file mode 100644
index 0000000..c508d26
--- /dev/null
+++ b/.agents/skills/release-pr/SKILL.md
@@ -0,0 +1,27 @@
+---
+name: release-pr
+description: Use when preparing commits, PRs, release handoff, or merge cleanup. Enforces explicit staging, conventional commits, PR evidence, and worktree hygiene.
+---
+
+# Release And PR
+
+Use when moving finished work toward review or merge.
+
+## Steps
+
+1. Confirm branch and tracking state.
+2. Review `git status --short` and `git diff`.
+3. Stage explicit files by path.
+4. Use the repo's commit convention.
+5. Write a PR body with summary, verification output, risk, and links.
+6. Confirm CI/check status only after local verification is complete.
+7. After merge, clean worktrees and stale local branches according to repo
+   instructions.
+
+## Hard Rules
+
+- No `git add .` unless explicitly approved for the batch.
+- No AI attribution in commits, PRs, docs, or generated output.
+- No force-push, branch deletion, hook bypass, or merge without approval when
+  the repo requires it.
+- Do not burn CI minutes as a substitute for local verification.
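The "stage explicit files by path" step and the `git add .` rule in the release-pr skill can be illustrated with a small guard that builds the staging command. This is a sketch under assumptions: `build_stage_command` is a hypothetical helper, and rejecting directory-style paths is stricter than the skill itself states.

```python
def build_stage_command(paths):
    """Return an argv list for `git add` that names every file explicitly."""
    # Targets that would stage more than the caller named one by one.
    banned = {".", "-A", "--all", "-u", "--update"}
    cleaned = [p.strip() for p in paths if p.strip()]
    if not cleaned:
        raise ValueError("name at least one file to stage")
    for p in cleaned:
        # Assumption: trailing-slash (directory) targets are also refused.
        if p in banned or p.endswith("/"):
            raise ValueError(f"refusing broad staging target: {p!r}")
    # "--" ends option parsing so every entry is treated as a path.
    return ["git", "add", "--", *cleaned]
```

The returned list can be handed to `subprocess.run` unchanged; the sketch deliberately does not execute anything, so it is safe to call while reviewing a plan.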
diff --git a/.agents/skills/repo-orientation/SKILL.md b/.agents/skills/repo-orientation/SKILL.md
new file mode 100644
index 0000000..9b6ef2a
--- /dev/null
+++ b/.agents/skills/repo-orientation/SKILL.md
@@ -0,0 +1,36 @@
+---
+name: repo-orientation
+description: Use at the start of work in any BES repo to build a current, cited map of instructions, repo state, verification gates, active plans, and likely risk before editing.
+---
+
+# Repo Orientation
+
+Use this before planning or editing.
+
+## Steps
+
+1. Read the nearest `AGENTS.md`.
+2. If present, read `CLAUDE.md`, `WORKFLOW.md`, `STATUS.md`,
+   `.agents/DOCUMENTATION_GUIDE.md`, and the docs linked by `AGENTS.md`.
+3. Check git state with `git status --short --branch`.
+4. Identify the active branch, tracking branch, untracked files, and unrelated
+   local changes.
+5. Identify the repo's verification gate and any hook setup requirements.
+6. Locate the task's likely files with `rg` and `rg --files`.
+7. Report only verified facts. Cite files or command output.
+
+## Output
+
+- Target repo and branch.
+- Source-of-truth docs read.
+- Relevant files or directories.
+- Verification commands.
+- Documentation placement constraints for this task.
+- Local changes that must be preserved.
+- Open questions before implementation.
+
+## Hard Rules
+
+- Do not edit during orientation.
+- Do not rely on memory when repo docs can answer the question.
+- If instructions conflict, stop and report the conflict.
diff --git a/.agents/skills/spec-driven-development/SKILL.md b/.agents/skills/spec-driven-development/SKILL.md
new file mode 100644
index 0000000..57af240
--- /dev/null
+++ b/.agents/skills/spec-driven-development/SKILL.md
@@ -0,0 +1,45 @@
+---
+name: spec-driven-development
+description: "Use when planning, reviewing, implementing, or verifying non-trivial work in BES repos. Enforces the BES spec-first operating model: author an executable SPEC.md, review it, implement only approved scope, verify with concrete commands, and route durable lessons into spec evidence."
+---
+
+# Spec-Driven Development
+
+Use this skill for non-trivial work in BES repos.
+
+## Workflow
+
+1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present,
+   `.agents/DOCUMENTATION_GUIDE.md` if present, and the relevant project docs.
+2. Create or update a task spec from `.agents/specs/SPEC.template.md`.
+3. Verify the spec has goals, non-goals, current facts with citations,
+   desired behavior, safety invariants, acceptance commands, rollback, and open
+   questions.
+4. Do not implement until the spec is approved by the owner or controlling
+   workflow.
+5. Execute only the approved spec. Stop if new facts materially change scope.
+6. Run acceptance commands and the repo's normal verification gate.
+7. Report files changed, commands run, verification output, residual risk, and
+   spec evidence candidates.
+
+## Hard Rules
+
+- Specs are executable contracts, not brainstorming notes.
+- Raw memories and chat history are evidence only.
+- Project docs and `AGENTS.md` beat generated memory.
+- Durable cross-project instructions go through approved specs and delivery
+  evidence records.
+- Put task-control specs in `.agents/specs/`; put durable product docs in the
+  repo-native docs path defined by `.agents/DOCUMENTATION_GUIDE.md`.
+- No silent scope expansion.
+- No completion claim without fresh verification.
+
+## Spec Review Checklist
+
+- Problem is specific and cites current evidence.
+- Goals and non-goals draw a clean boundary.
+- Executor can identify exact files and interfaces.
+- Test plan is runnable on this machine.
+- Safety invariants protect user work and repo rules.
+- Open questions are resolved before implementation.
+- Acceptance criteria are objective.
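Workflow step 3 of the spec-driven-development skill lists the parts a spec must have. A rough completeness check might look like the sketch below. It assumes each part shows up as a Markdown heading, which the skill does not require; `missing_sections` and the keyword patterns are illustrative only.

```python
import re

# Part names come from workflow step 3; the regexes are assumptions about
# how headings are worded in a given SPEC.md.
REQUIRED_PARTS = {
    "goals": r"(?<!non-)goals",          # must not be satisfied by "Non-Goals"
    "non-goals": r"non-goals",
    "current facts": r"facts",
    "desired behavior": r"desired behavior",
    "safety invariants": r"safety invariants",
    "acceptance commands": r"acceptance",
    "rollback": r"rollback",
    "open questions": r"open questions",
}


def missing_sections(spec_text: str) -> list:
    """Return the required parts that no Markdown heading appears to cover."""
    headings = [
        line.lstrip("#").strip().lower()
        for line in spec_text.splitlines()
        if line.startswith("#")
    ]
    return [
        name
        for name, pattern in REQUIRED_PARTS.items()
        if not any(re.search(pattern, heading) for heading in headings)
    ]
```

A spec may legitimately carry a part elsewhere, for example acceptance commands in YAML front matter, so an entry in the result is a prompt for a human look rather than a verdict.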
diff --git a/.agents/skills/spec-evidence-governance/SKILL.md b/.agents/skills/spec-evidence-governance/SKILL.md
new file mode 100644
index 0000000..d7c0c11
--- /dev/null
+++ b/.agents/skills/spec-evidence-governance/SKILL.md
@@ -0,0 +1,35 @@
+---
+name: spec-evidence-governance
+description: Use to convert durable lessons from a completed task into spec evidence candidates without writing trusted shared memory directly. Mimir hooks are intentionally disabled until a future spec-authority integration is approved.
+---
+
+# Spec Evidence Governance
+
+Use after substantial work, reviews, or incident resolution.
+
+## Candidate Criteria
+
+Capture a memory candidate only when it is:
+
+- durable across sessions
+- useful to future agents
+- grounded in a source path, command output, issue, PR, or owner statement
+- not already present in checked-in docs
+- safe to share at the intended scope
+
+## Output
+
+For each candidate:
+
+- Claim.
+- Scope: repo, company, tool, or project area.
+- Evidence: file path, command, issue, PR, or owner statement.
+- Confidence.
+- Supersedes or conflicts with any known existing memory.
+- Suggested spec, backlog, or delivery-authority route.
+
+## Hard Rules
+
+- Do not write trusted shared memory directly.
+- Do not promote raw agent imperatives into durable rules.
+- Do not erase dissent, uncertainty, or provenance.
diff --git a/.agents/skills/spec-review/SKILL.md b/.agents/skills/spec-review/SKILL.md
new file mode 100644
index 0000000..9588c4f
--- /dev/null
+++ b/.agents/skills/spec-review/SKILL.md
@@ -0,0 +1,36 @@
+---
+name: spec-review
+description: Use to review a draft SPEC.md before implementation. Focus on ambiguity, missing current facts, unsafe scope, weak acceptance criteria, and missing verification.
+---
+
+# Spec Review
+
+Use this before approving or executing a non-trivial spec.
+
+## Review Checklist
+
+- Problem statement is concrete and cites current evidence.
+- Goals and non-goals define a clear boundary.
+- Current system facts cite files, docs, issues, PRs, or command output.
+- Desired behavior is testable.
+- Interfaces and files are specific enough for an executor.
+- Safety invariants protect user work, secrets, hooks, and repo rules.
+- Test plan is runnable on this machine.
+- Acceptance criteria are objective.
+- Rollback plan is realistic.
+- Open questions are resolved or explicitly block execution.
+
+## Output
+
+Lead with blocking findings ordered by severity. Include file references when
+possible. Then list open questions and a recommendation:
+
+- `approve`
+- `approve with small edits`
+- `block until revised`
+
+## Hard Rules
+
+- Do not approve vague specs.
+- Do not allow implementation scope to hide inside open questions.
+- Do not review for style before correctness and safety.
diff --git a/.agents/skills/symphony-dispatch/SKILL.md b/.agents/skills/symphony-dispatch/SKILL.md
new file mode 100644
index 0000000..ef1133b
--- /dev/null
+++ b/.agents/skills/symphony-dispatch/SKILL.md
@@ -0,0 +1,33 @@
+---
+name: symphony-dispatch
+description: Use when preparing or auditing Symphony-compatible issue dispatch. Checks WORKFLOW.md, issue eligibility, workspace isolation, Codex runner settings, and observability expectations.
+---
+
+# Symphony Dispatch
+
+Use when running or preparing autonomous worker dispatch.
+
+## Checklist
+
+- `WORKFLOW.md` exists in the runner cwd or an explicit workflow path is set.
+- YAML front matter has `tracker`, `polling`, `workspace`, `hooks`, `agent`,
+  and `codex` sections.
+- `tracker.kind` is `linear` and `tracker.api_key` resolves from
+  `LINEAR_API_KEY`.
+- `tracker.project_slug`, active states, and terminal states match the board.
+- `workspace.root` is absolute and outside product repo working trees.
+- Hooks are documented; failures have the right abort/ignore behavior.
+- `codex.command` is `codex app-server` unless a tested wrapper is in use.
+- Concurrency is bounded for the machine and CI budget.
+- Running workers use isolated branches or worktrees.
+- Logs and completion reports include issue identifier, session, commands, and
+  verification evidence.
+
+## Hard Rules
+
+- Do not dispatch when the target repo is unclear.
+- Do not dispatch multiple write-capable workers into the same worktree.
+- Do not allow unsupported tool calls or user-input-required turns to stall
+  indefinitely.
+- Treat Symphony as a trusted-environment runner unless stronger sandboxing is
+  explicitly configured.
diff --git a/.agents/skills/verification/SKILL.md b/.agents/skills/verification/SKILL.md
new file mode 100644
index 0000000..156c605
--- /dev/null
+++ b/.agents/skills/verification/SKILL.md
@@ -0,0 +1,33 @@
+---
+name: verification
+description: Use before reporting done. Runs the narrowest relevant checks first, then the repo gate when warranted, and records fresh evidence plus residual risk.
+---
+
+# Verification
+
+Use before claiming work is complete.
+
+## Steps
+
+1. Read the spec acceptance commands and repo `AGENTS.md` verification section.
+2. Run the narrowest relevant test or lint first.
+3. Run the broader repo gate when the change touches shared behavior,
+   interfaces, CI, docs contracts, or release surfaces.
+4. Capture command, result, and important output.
+5. If a command fails, diagnose whether the failure is caused by the change,
+   existing repo state, missing dependency, sandbox/network limits, or secrets.
+6. Re-run only after a meaningful fix or environment change.
+
+## Output
+
+- Commands run.
+- Pass/fail result.
+- Key output lines or summarized failures.
+- Residual risk.
+- Checks not run and why.
+
+## Hard Rules
+
+- Do not say "should pass" as verification.
+- Do not hide failing checks.
+- Do not spend CI minutes when local gates are required first.
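Step 4 of the verification skill ("capture command, result, and important output") could be done with a thin wrapper like the one below. It is a sketch, not the repo's tooling: `run_check` and the shape of the returned record are hypothetical, and the stand-in command exists only so the example runs anywhere.

```python
import subprocess
import sys


def run_check(argv, tail_lines=5):
    """Run one verification command and return fresh evidence about it."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    output = (proc.stdout + proc.stderr).splitlines()
    return {
        "command": " ".join(argv),
        "passed": proc.returncode == 0,
        "returncode": proc.returncode,
        # Keep only the last few lines as the "important output".
        "tail": output[-tail_lines:],
    }


if __name__ == "__main__":
    # Stand-in command; a real run would use a spec acceptance command.
    print(run_check([sys.executable, "-c", "print('ok')"]))
```

Recording the exit code alongside the output tail keeps a failing check visible in the completion report instead of collapsing it into a summary.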
diff --git a/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md b/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md
new file mode 100644
index 0000000..f225244
--- /dev/null
+++ b/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md
@@ -0,0 +1,350 @@
+---
+id: mimir-parallel-handoff-closeout-2026-04-29
+status: owner-paused
+owner: HasNoBeef
+repo: Mimir
+source_specs:
+  - root:.agents/specs/2026-04-29-fleet-realignment-and-handoff/SPEC.md
+  - .agents/specs/2026-04-29-realignment-handoff/SPEC.md
+  - .agents/specs/2026-04-29-repo-audit/SPEC.md
+branch_policy: local-only-public-oss-parallel-lane
+risk: medium
+requires_network: false
+requires_secrets: []
+acceptance_commands:
+  - "node ../.agents/scripts/preflight.mjs"
+  - "git status --short --branch --untracked-files=all"
+  - "cargo build --workspace"
+  - "cargo test --workspace"
+  - "cargo test --workspace --all-features"
+  - "cargo fmt --all -- --check"
+  - "cargo clippy --all-targets --all-features -- -D warnings"
+  - "cargo deny check"
+  - "cargo doc --workspace --no-deps"
+---
+
+# SPEC: Mimir Parallel Handoff Closeout
+
+## 1. Problem
+
+Mimir still has local handoff/setup work after the BES fleet realignment pass.
+The repo is public OSS and CI-budget-sensitive, so the remaining closeout must
+preserve local work, avoid public noise, and define the exact point at which a
+green room product evaluation may safely begin.
+
+This spec is a local agent-control handoff artifact. It does not approve product
+code, public docs, commits, pushes, tags, releases, or publication.
+
+## 2. Goals
+
+- Record the current Mimir branch, head, dirty state, in-flight work, and
+  verification gates from fresh command output.
+- Define a public-OSS-safe parallel closeout lane that is disjoint from other
+  BES handoff workers.
+- Identify owner decisions needed before any Mimir product closeout, public docs
+  change, PR, push, tag, or release.
+- Define stop conditions that protect user work, public OSS posture, and CI
+  quota.
+- State that Mimir green room evaluation may begin only after this closeout is
+  done or the owner explicitly marks it paused.
+
+## 3. Non-Goals
+
+- Do not edit product code.
+- Do not edit root files or sibling repos.
+- Do not commit, push, tag, publish, open PRs, or mutate GitHub/Linear.
+- Do not re-enable BES use of Mimir hooks, MCP servers, or raw memory as work
+  authority.
+- Do not resolve launch-readiness documentation drift in this lane unless the
+  owner expands scope in a new approved spec.
+
+## 4. Current System Facts
+
+- Root `AGENTS.md` says the root checkout is the company control plane, product
+  code lives in active child repos, non-trivial work starts with a spec, and
+  public OSS repos including Mimir must not receive public agent-control churn
+  without owner-approved low-noise PR planning.
+- `.agents/OPERATING_MODEL.md` requires root preflight, spec-first execution,
+  isolated workspaces/branches for parallel writers, explicit verification, and
+  public OSS release hygiene.
+- `.agents/GREEN_ROOM_EVALUATION.md` says remaining repo handoffs should run in
+  parallel only where write scopes are disjoint, and green room evaluation for a
+  repo may start only after that repo's handoff lane is closed or owner-paused.
+- `.agents/MODEL_ROUTING.md` routes public OSS release/spec work through Codex
+  `gpt-5.5` with Claude Opus 4.7 independent review when useful, and says
+  write-capable agents need disjoint file ownership or worktree boundaries.
+- Root preflight command `node .agents/scripts/preflight.mjs` passed with zero
+  warnings on 2026-04-29.
+- Mimir `AGENTS.md` says Mimir is public pre-1.0 active development, requires
+  local verification before pushing, and each tracked branch/PR push triggers a
+  costly GitHub Actions matrix run.
+- Mimir `WORKFLOW.md` lists the canonical local verify command as
+  `cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings`.
+- Mimir `STATUS.md` says workspace version is `0.1.0`, no release tag exists,
+  and `v0.1.0` must wait for owner approval.
+- `README.md` and `STATUS.md` both limit public claims: no production-ready
+  claim, no stable storage/API/MCP schema claim, no hosted-service claim, no
+  benchmark-proven superiority, and no direct agent writes to canonical memory.
+- `docs/launch-readiness.md` records local cargo gates, `cargo deny check`,
+  `cargo doc --workspace --no-deps`, crate dry-run expectations, recovery
+  benchmark checks, public-surface sweeps, and first tag target `v0.1.0` after
+  owner approval.
+- `docs/README.md`, `README.md`, `STATUS.md`, `AGENTS.md`, and
+  `docs/launch-readiness.md` still reference `docs/launch-posting-plan.md`.
+  Command `test -e docs/launch-posting-plan.md` returned exit code 1, and
+  recent `git log --oneline --decorate -n 12` shows
+  `4d38614 Delete docs/launch-posting-plan.md (#16)`.
+- Current Mimir branch command output:
+
+```text
+## main...origin/main
+ M .gitignore
+ M AGENTS.md
+?? .agents/DOCUMENTATION_GUIDE.md
+?? .agents/skills/code-review/SKILL.md
+?? .agents/skills/implementation-execution/SKILL.md
+?? .agents/skills/release-pr/SKILL.md
+?? .agents/skills/repo-orientation/SKILL.md
+?? .agents/skills/spec-driven-development/SKILL.md
+?? .agents/skills/spec-evidence-governance/SKILL.md
+?? .agents/skills/spec-review/SKILL.md
+?? .agents/skills/symphony-dispatch/SKILL.md
+?? .agents/skills/verification/SKILL.md
+?? .agents/specs/2026-04-29-realignment-handoff/SPEC.md
+?? .agents/specs/2026-04-29-repo-audit/SPEC.md
+?? .agents/specs/SPEC.template.md
+?? .agents/workflows/author-spec.md
+?? .agents/workflows/execute-spec.md
+?? .agents/workflows/orient.md
+?? .agents/workflows/release-pr.md
+?? .agents/workflows/review-diff.md
+?? .agents/workflows/review-spec.md
+?? .agents/workflows/spec-evidence.md
+?? .agents/workflows/symphony-dispatch-check.md
+?? .agents/workflows/verify-spec.md
+?? .claude/commands/author-spec.md
+?? .claude/commands/execute-spec.md
+?? .claude/commands/orient.md
+?? .claude/commands/release-pr.md
+?? .claude/commands/review-diff.md
+?? .claude/commands/review-spec.md
+?? .claude/commands/spec-evidence.md
+?? .claude/commands/symphony-dispatch-check.md
+?? .claude/commands/verify-spec.md
+?? .claude/settings.json
+?? .claude/skills/code-review/SKILL.md
+?? .claude/skills/implementation-execution/SKILL.md
+?? .claude/skills/release-pr/SKILL.md
+?? .claude/skills/repo-orientation/SKILL.md
+?? .claude/skills/spec-driven-development/SKILL.md
+?? .claude/skills/spec-evidence-governance/SKILL.md
+?? .claude/skills/spec-review/SKILL.md
+?? .claude/skills/symphony-dispatch/SKILL.md
+?? .claude/skills/verification/SKILL.md
+?? CLAUDE.md
+?? WORKFLOW.md
+```
+
+- Current Mimir head is `9e81c0f` on `main`, tracking `origin/main`.
+- `git diff --name-status` currently reports modified tracked files
+  `.gitignore` and `AGENTS.md`.
+- `git diff -- AGENTS.md` shows the local tracked change adds BES fleet
+  operating model instructions to Mimir's operating manual.
+- `git diff -- .gitignore` shows local tracked changes that stop ignoring
+  `.claude/settings.json` and `.claude/skills/`, and add `.codex`,
+  `.mcp.json`, and `.mcp.local.json` ignores.
+- Local MCP posture remains zero-default: no Mimir `.mcp.json` is present.
+
+## 5. Desired Behavior
+
+The next Mimir worker can close or pause the handoff without disturbing other
+parallel lanes. It must know which local files are pre-existing work, which work
+requires owner decisions, which gates are required before public activity, and
+when green room evaluation is allowed to start.
+
+## 6. Domain Model / Contract
+
+Closeout states:
+
+- `preserve`: local or user work that must not be touched by this lane.
+- `verify`: work that may be complete but needs fresh local gates.
+- `owner-decision`: work that cannot proceed without HasNoBeef selecting a
+  public or product posture.
+- `ready-for-spec`: future work that needs its own approved spec before edits.
+- `closed`: no unresolved in-flight work remains for the handoff lane.
+- `owner-paused`: the owner explicitly allows green room evaluation to begin
+  while named closeout work remains paused.
+
+## 7. In-Flight Work
+
+| Item | State | Required next action |
+| --- | --- | --- |
+| BES agent-control setup in `.agents/`, `.claude/`, `CLAUDE.md`, `WORKFLOW.md`, `.gitignore`, and `AGENTS.md` | preserve | Keep local/draft unless owner approves a low-noise public OSS PR plan. |
+| Existing realignment handoff and repo-audit specs | preserve | Use as local source material; do not publish unless owner approves. |
+| Active processing adapters at head `9e81c0f` | verify | Run full local cargo gate before any product release, tag, or PR closeout claim. |
+| Missing `docs/launch-posting-plan.md` with remaining references | owner-decision | Decide whether to restore, replace, or remove stale references in a separate public-doc-safe spec. |
+| Pre-1.0 launch cleanup and `v0.1.0` tag | owner-decision | Requires explicit owner approval, local green gates, and low-noise push/tag plan. |
+| BES spec-authority integration research | ready-for-spec | Design only; do not re-enable hooks/MCP/memory authority without approved scope. |
+| Green room product evaluation | blocked | May begin only after this handoff is `closed` or explicitly `owner-paused`. |
+
+## 8. Public-OSS Safe Parallel Lane
+
+The current lane may write only:
+
+- `.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md`
+
+Rules for any continuation:
+
+- Keep all output local.
+- Do not push or publish.
+- Do not edit product code, public docs, root files, or sibling repos.
+- Do not stage files or commit.
+- Do not normalize or delete untracked `.agents/` or `.claude/` work from other
+  agents.
+- If owner expands scope beyond this SPEC, create or switch to a dedicated
+  branch/worktree and name the exact allowed files before editing.
+- If another write-capable agent is active in Mimir, stop unless file ownership
+  and worktree boundaries are explicit.
+
+## 9. Required Verification Gates
+
+For this local handoff spec:
+
+```bash
+node ../.agents/scripts/preflight.mjs
+git status --short --branch --untracked-files=all
+```
+
+For any future product, public-doc, PR, push, tag, or release closeout:
+
+```bash
+cargo build --workspace
+cargo test --workspace
+cargo test --workspace --all-features
+cargo fmt --all -- --check
+cargo clippy --all-targets --all-features -- -D warnings
+cargo deny check
+cargo doc --workspace --no-deps
+```
+
+Additional launch-readiness checks to run when release/public-doc scope is
+approved:
+
+```bash
+cargo publish --dry-run -p mimir-core --allow-dirty
+cargo test -p mimir-harness --test recovery_benchmark
+python3 benchmarks/recovery/test_bench.py
+rg -n 'production-ready|stable API|benchmark-proven|hosted service|direct agent writes|mimir hook-context|mimir-checkpoint' README.md STATUS.md docs crates plugins
+rg -n 'launch-posting-plan.md' README.md STATUS.md AGENTS.md docs
+```
+
+Do not run GitHub Actions retries or remote CI probes unless the owner approves
+the CI-budget tradeoff.
+
+## 10. Owner Decisions
+
+- Owner triage approval on 2026-04-30 marks Mimir `owner-paused` for local BES
+  setup and public release actions. Green room evaluation may run local-only
+  after at least one private repo validates the protocol.
+- Public docs, PRs, pushes, tags, releases, CI-triggering work, and publication
+  remain blocked until a separate owner-approved public OSS spec exists.
+- Should Mimir close out launch cleanup first, or should spec-authority research + be designed first? +- Should the BES integration pause remain root-only, or should Mimir public docs + mention it in public-facing language? +- Should the local BES agent-control setup be committed to Mimir at all, and if + yes, what is the low-noise public OSS PR plan? +- Should `docs/launch-posting-plan.md` be restored, replaced, or removed from + remaining references after PR #16 deleted it? +- What release posture is acceptable after local gates pass: no tag, `v0.1.0`, + or a new pre-release? +- Is green room evaluation allowed to start now as `owner-paused`, or only after + the local handoff/setup state is closed? + +## 11. Stop Conditions + +Stop and report before editing if any of these occur: + +- The requested file scope expands beyond this SPEC without an approved spec or + explicit owner instruction. +- `git status` changes in files this lane did not touch and the change affects + closeout facts. +- A command would push, publish, tag, re-enable Actions, mutate GitHub/Linear, + install tools, or use network/secrets. +- Product docs or code need edits to resolve the `launch-posting-plan.md` + references. +- Any source conflicts on whether Mimir hooks/MCP/raw memory are active BES work + authority. +- Another write-capable Mimir worker is assigned overlapping files without a + branch/worktree boundary. + +## 12. Execution Plan + +1. Preserve this SPEC as the current Mimir closeout handoff artifact. +2. Ask the owner to resolve the decisions in section 10. +3. If the owner chooses closeout, draft the next executable spec with exact + files and public OSS posture. +4. Run the required local gates before any public-facing PR/push/tag/release. +5. Mark this handoff `closed` only after owner decisions are resolved or + explicitly deferred and there is no unresolved closeout work blocking green + room evaluation. +6. 
If the owner chooses not to close now, mark the unresolved items + `owner-paused`; only then may green room evaluation begin. + +## 13. Safety Invariants + +- Mimir remains public OSS; internal BES agent-control output stays local until + owner-approved public wording and CI-cost posture exist. +- The librarian remains the product write boundary; no agent writes trusted + shared memory directly. +- BES fleet operation remains spec-first and zero-default-MCP until a future + approved spec changes it. +- Existing local changes and untracked files are user/agent work and must be + preserved. +- CI quota is protected by local verification and batched public activity. + +## 14. Acceptance Criteria + +- [x] Current branch, head, and dirty state are refreshed before closeout. +- [x] Owner decisions in section 10 are resolved or explicitly paused. +- [x] No product code, public docs, root files, or sibling repos are edited by + this lane. +- [x] Required verification gates are run for any approved product/public-doc + closeout. +- [x] Completion report lists commands, results, residual risk, and files + changed. +- [x] Green room evaluation starts only after closeout is `closed` or + `owner-paused`. + +## 15. Rollback Plan + +If this handoff spec is rejected, delete only: + +```text +.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md +``` + +Do not revert, delete, stage, or normalize any other local Mimir changes as part +of rollback. + +## 16. Completion Report + +- Files changed: + - `.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md` +- Commands run: + - `node ../.agents/scripts/preflight.mjs` - passed with 0 warnings before + this owner-pause decision. + - `git status --short --branch --untracked-files=all` - captured branch + `main...origin/main`, tracked `.gitignore` and `AGENTS.md` edits, and + untracked local agent/Claude setup files. +- Verification result: local control-plane handoff is owner-paused. 
Product, + public-doc, release, tag, and publication gates were intentionally not run + because no public OSS action is approved. +- Anything intentionally left untouched: product code, public docs, launch + references, root files, sibling repos, untracked `.agents/**`, `.claude/**`, + `CLAUDE.md`, and `WORKFLOW.md`. +- Residual risk: Mimir remains public-OSS sensitive; all public actions remain + blocked until a low-noise owner-approved public spec exists. +- Spec evidence candidates: + - Public OSS green room packets can be local-only after owner pause, but + publication and public-doc changes need a separate public-facing approval. diff --git a/.agents/specs/2026-04-29-realignment-handoff/SPEC.md b/.agents/specs/2026-04-29-realignment-handoff/SPEC.md new file mode 100644 index 0000000..8c9297d --- /dev/null +++ b/.agents/specs/2026-04-29-realignment-handoff/SPEC.md @@ -0,0 +1,122 @@ +--- +id: mimir-realignment-handoff-2026-04-29 +status: draft-handoff +owner: HasNoBeef +repo: Mimir +source_spec: root:.agents/specs/2026-04-29-fleet-realignment-and-handoff/SPEC.md +branch_policy: local-only-public-oss +risk: medium +requires_network: false +requires_secrets: [] +acceptance_commands: + - "cargo build --workspace" + - "cargo test --workspace" + - "cargo test --workspace --all-features" + - "cargo fmt --all -- --check" + - "cargo clippy --all-targets --all-features -- -D warnings" + - "cargo deny check" + - "cargo doc --workspace --no-deps" +--- + +# SPEC: Mimir Realignment Handoff + +## 1. Handoff Purpose + +Mimir is a public pre-1.0 product, while BES company agents have temporarily +moved to spec evidence instead of Mimir hook authority. This handoff keeps those +two facts separate so product work does not get accidentally deprecated and BES +agent policy does not drift back to memory authority. + +## 2. Current Branch And Dirty State + +Observed on 2026-04-29: + +```text +## main...origin/main + M .gitignore + M AGENTS.md +?? .agents/ +?? .claude/ +?? 
CLAUDE.md +?? WORKFLOW.md +``` + +Recent head: + +```text +9e81c0f feat(librarian): support active processing adapters +4d38614 Delete docs/launch-posting-plan.md (#16) +1650d18 ci: make release publishing idempotent +``` + +Local MCP posture: no repo-local `.mcp.json` is present. Mimir product docs may +still mention MCP and hook features as product surfaces, but BES root operating +policy currently uses zero default MCP servers. + +## 3. Source Docs Read + +- `AGENTS.md` +- `CLAUDE.md` +- `WORKFLOW.md` +- `STATUS.md` +- `.agents/specs/2026-04-29-repo-audit/SPEC.md` + +## 4. Preserve + +- Public pre-1.0 honesty and launch-readiness discipline. +- Mimir product features around governed memory, librarian-mediated writes, + recovery mirroring, hooks, MCP, and benchmarks. +- The BES distinction: spec evidence is current company authority; raw memory + and hooks are not active work authority. +- Root-installed agent surfaces and all existing local/untracked work. + +## 5. Work Classification + +| Item | State | Required next action | +| --- | --- | --- | +| Shared agent setup | preserve | Keep local/draft until owner approves public-facing PR posture. | +| Active processing adapters commit | verify | Confirm local cargo gates before any release/PR closeout. | +| Pre-1.0 launch cleanup | ready-for-dispatch | Use a public-OSS-aware spec and avoid noisy CI churn. | +| BES integration pause note | owner-decision | Decide whether this belongs in Mimir product docs or root-only docs. | +| Spec authority research design | ready-for-dispatch | Design only; do not re-enable hooks without approval. | +| OSS release rollout | owner-decision | Requires explicit owner approval for tag/publish/push. | + +## 6. 
Verification Gate + +Run before claiming Mimir product work complete: + +```bash +cargo build --workspace +cargo test --workspace +cargo test --workspace --all-features +cargo fmt --all -- --check +cargo clippy --all-targets --all-features -- -D warnings +cargo deny check +cargo doc --workspace --no-deps +``` + +This handoff did not run product gates because it changed only agent-control +handoff documentation. + +## 7. Recommended Next Agent Engagement + +Start from inside `Mimir` and ask the agent to: + +```text +Orient with repo-orientation. Read AGENTS.md, CLAUDE.md, WORKFLOW.md, STATUS.md, +docs/launch-readiness.md, and this handoff. Draft a closeout SPEC for the next +Mimir public-OSS-safe step. Do not push, tag, publish, or add BES hook authority +without owner approval. +``` + +## 8. Owner Decisions Before Execution + +- Should Mimir resume launch cleanup first, or should spec-authority research + happen first? +- Should the BES integration pause be visible in Mimir public docs? +- What release/tag posture is acceptable after local gates pass? + +## 9. Residual Risk + +Mimir is externally visible. Even doc-only agent scaffolding can create public +noise, so keep all new work draft/local until a low-noise PR plan is approved. 
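The verification gate in section 6 of this handoff is a fixed command sequence, so a worker could wrap it in a small fail-fast runner. The following is a minimal Python sketch, not a script that exists in Mimir: `GATES` and `run_gates` are hypothetical names, and the command list is copied verbatim from section 6.

```python
import shlex
import subprocess

# Gate commands copied from section 6 of the handoff; the runner is hypothetical.
GATES = [
    "cargo build --workspace",
    "cargo test --workspace",
    "cargo test --workspace --all-features",
    "cargo fmt --all -- --check",
    "cargo clippy --all-targets --all-features -- -D warnings",
    "cargo deny check",
    "cargo doc --workspace --no-deps",
]


def run_gates(commands, runner=subprocess.run):
    """Run each command in order and stop at the first failure.

    Returns a list of (command, exit_code) pairs for the commands that ran.
    `runner` is injectable so the sequencing can be exercised without cargo.
    """
    results = []
    for command in commands:
        completed = runner(shlex.split(command))
        results.append((command, completed.returncode))
        if completed.returncode != 0:
            break  # fail fast: later gates say little once one has failed
    return results
```

Calling `run_gates(GATES)` from the repo root would run the real gates locally. A run counts as green only if every entry in `GATES` appears in the result with exit code 0. The sketch never touches the network, which matches the local-only posture of this handoff.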
diff --git a/.agents/specs/2026-04-29-repo-audit/SPEC.md b/.agents/specs/2026-04-29-repo-audit/SPEC.md new file mode 100644 index 0000000..a9cb819 --- /dev/null +++ b/.agents/specs/2026-04-29-repo-audit/SPEC.md @@ -0,0 +1,110 @@ +--- +id: mimir-repo-audit-2026-04-29 +status: draft +owner: HasNoBeef +repo: Mimir +branch_policy: local-only-public-oss +risk: medium +requires_network: false +requires_secrets: [] +acceptance_commands: + - cargo build --workspace + - cargo test --workspace + - cargo test --workspace --all-features + - cargo fmt --all -- --check + - cargo clippy --all-targets --all-features -- -D warnings + - cargo deny check + - cargo doc --workspace --no-deps + - rg -n 'production-ready|stable API|benchmark-proven|hosted service|direct agent writes|mimir hook-context|mimir-checkpoint' README.md STATUS.md docs crates plugins +--- + +# SPEC: Mimir Repo Audit And Spec Migration + +## 1. Problem + +Mimir is a public pre-1.0 memory governance product. BES is temporarily +removing Mimir hook/setup surfaces from the active agent operating layer while +the company moves to spec-first Symphony dispatch. The repo needs to preserve +its product roadmap while making clear that BES agents do not currently rely on +Mimir hooks or raw memory as source-of-truth. + +## 2. Current Facts + +- `STATUS.md` says Mimir is in pre-1.0 public launch cleanup, version `0.1.0`, + with no release tag yet. +- `README.md` says agents may propose memory, but do not write trusted shared + memory directly. +- `docs/launch-readiness.md` records OSS readiness, engineering quality gates, + promise-audit boundaries, and deferred work. +- Product surfaces include core append-only store, librarian, harness, operator + tools, Claude/Codex setup paths, MCP, recovery mirroring, and benchmarks. +- Public claims are explicitly limited: no production-ready claim, no stable + API/storage claim, no hosted-service claim, no benchmark-proven claim. 
+- Code inventory from `rg --files`: 63 Rust files, 1 Python file, 10 TOML + files, and 54 Markdown files. +- Active BES agent operating surfaces now use `spec-evidence-governance`; the + repo product may still legitimately mention Mimir hook and checkpoint + features in code/docs/tests. + +## 3. Preserve + +- Product mission: local-first governed memory, append-only canonical store, + librarian-mediated writes, transparent harness, and explicit recovery. +- Launch-readiness checklist and promise-audit discipline. +- Public pre-1.0 honesty. +- Product docs/tests for `mimir hook-context`, `mimir-checkpoint`, and native + setup paths, because those are Mimir product features. + +## 4. Archive Or Supersede + +- Do not archive Mimir's product memory-governance docs. They remain product + architecture. +- Do supersede any BES operating instruction that tells agents to use Mimir + hooks or raw memory as work authority. +- Future BES integration should be designed as a spec/delivery evidence system + before re-enabling hooks. + +## 5. Proposed New Executable Specs + +1. **BES Integration Pause Note** + - Scope: add a small product/docs note, if owner approves, clarifying that + BES company agents currently use spec evidence instead of Mimir hooks. + - Acceptance: no claim that Mimir product functionality is deprecated. + +2. **Pre-1.0 Launch Cleanup Batch** + - Scope: continue launch-readiness gates, public-surface scrub, crate/docs + dry-runs, and owner-approved release tagging. + - Acceptance: all `docs/launch-readiness.md` local gates pass. + +3. **Spec Authority Research Design** + - Scope: design how Mimir could later store and govern spec evidence, + delivery records, and supersession decisions instead of generic memories. + - Acceptance: design spec only; no hook re-enable until approved. + +4. **Benchmark Claim Evidence** + - Scope: live recovery benchmark report with transcripts and scorecards. 
+ - Acceptance: benchmark claim is either supported by evidence or kept out of + public copy. + +5. **OSS Release Rollout** + - Scope: batched public changes, crates.io order, docs.rs expectations, + launch article/posting plan, and CI-cost-aware push. + - Acceptance: no remote push/tag until owner approves. + +## 6. Open Questions + +- Should Mimir's first post-audit work be launch cleanup or spec-authority + research? +- Do you want a visible docs note about BES pausing Mimir hooks, or should that + stay in the company control-plane docs only? +- When public launch resumes, should the release be `v0.1.0` exactly or a new + pre-release tag after the current local audit batch? + +## 7. Verification Status + +This audit read docs and performed a lightweight code inventory only. Cargo +gates were not run because no Mimir product code changed in this session and +public OSS CI churn is intentionally avoided. + +The one-time migration bootstrap for Mimir has been actioned locally and should +be deleted in this local audit batch. diff --git a/.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md b/.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md new file mode 100644 index 0000000..2c4f3f7 --- /dev/null +++ b/.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md @@ -0,0 +1,622 @@ +# Mimir Green Room Product Evaluation + +## Repo State + +| Field | Value | +|---|---| +| Repo | `Mimir` | +| Branch | `main` | +| Head | `9e81c0f` | +| Tracking | `origin/main` | +| Dirty state | Yes — `M .gitignore`, `M AGENTS.md`, ~40 untracked agent-control/setup files | +| Public/private | **Public OSS** (Apache-2.0) | +| Handoff status | `owner-paused` per triage SPEC 2026-04-29 | +| Version | `0.1.0` (no release tag yet) | + +Dirty state detail captured from +`Mimir/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md` Section 4: + +- `M .gitignore` — BES agent-control addition. +- `M AGENTS.md` — BES operating manual update. 
+- Untracked: `.agents/` tree (~25 files including specs, skills, workflows, + scripts, mcp config), `.claude/` tree (~10 files including commands, settings, + skills), `CLAUDE.md`, `WORKFLOW.md`, `.mcp.example.json`. +- No tracked product source files are modified. + +## Primary Agent + +| Field | Value | +|---|---| +| Agent | Claude Code | +| Model | `claude-opus-4-6` (Claude Opus 4.6) | +| Reasoning mode | default (xhigh not explicitly set for this run) | +| Date | 2026-04-30 | +| Network used | No (local file reads only; git commands to Mimir denied by sandbox) | + +## Sources Read + +Root authority: + +| File | Purpose | +|---|---| +| `.agents/GREEN_ROOM_EVALUATION.md` | Green room protocol | +| `.agents/OPERATING_MODEL.md` | Fleet operating contract | +| `.agents/MODEL_ROUTING.md` | Agent/model routing policy | +| `.agents/DOCUMENTATION_GUIDE.md` | Documentation placement rules | +| `.agents/WORKSPACE_LAYOUT.md` | Root workspace layout | +| `.agents/specs/2026-04-29-green-room-product-evaluations/SPEC.md` | Fleet evaluation dispatch spec | +| `.agents/specs/2026-04-29-handoff-triage/SPEC.md` | Handoff triage decisions | + +Mimir authority: + +| File | Purpose | +|---|---| +| `Mimir/AGENTS.md` | Repo operating manual, architectural invariants, design space | +| `Mimir/CLAUDE.md` | Claude entry point | +| `Mimir/WORKFLOW.md` | Symphony workflow contract | +| `Mimir/STATUS.md` | Current phase, CI state, launch work order | +| `Mimir/README.md` | Public entry point, what-works / what-is-not-claimed | +| `Mimir/PRINCIPLES.md` | Engineering principles, testing strategy, error handling | +| `Mimir/CHANGELOG.md` | Unreleased section (first 100 lines) | +| `Mimir/RELEASING.md` | Release runbook | +| `Mimir/Cargo.toml` | Workspace members, dependencies, lint config | +| `Mimir/docs/README.md` | Public docs index | +| `Mimir/docs/concepts/README.md` | Authoritative implementation spec catalog | +| `Mimir/docs/launch-readiness.md` | OSS readiness checklist, engineering 
gate evidence | +| `Mimir/.github/workflows/ci.yml` | CI pipeline (first 80 lines) | + +Mimir handoff/audit specs: + +| File | Purpose | +|---|---| +| `Mimir/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md` | Captured git state, in-flight work table, owner decisions | +| `Mimir/.agents/specs/2026-04-29-realignment-handoff/SPEC.md` | Branch/dirty state, work classification | +| `Mimir/.agents/specs/2026-04-29-repo-audit/SPEC.md` | Code inventory, proposed specs | + +Code inventory (via `ls`, `wc -l`, `find -type f`): + +| Path | Files | LOC | +|---|---|---| +| `mimir-core/src/` | 22 | ~18,054 | +| `mimir-librarian/src/` | 12 | ~16,139 | +| `mimir-harness/src/` | 2 | ~10,980 | +| `mimir-mcp/src/` | 3 | ~1,357 | +| `mimir-cli/src/` | 2 | ~964 | +| **Source total** | **41** | **~47,494** | +| Test files | 13 | ~8,575 | +| Total `.rs` files | 162 | — | + +## Commands Run + +| Command | Result | +|---|---| +| `ls Mimir/` | Success — listed repo top-level contents | +| `ls Mimir/mimir-core/src/` | Success — 22 source files | +| `ls Mimir/mimir-librarian/src/` | Success — 12 source files | +| `ls Mimir/mimir-harness/src/` | Success — 2 source files | +| `ls Mimir/mimir-mcp/src/` | Success — 3 source files | +| `ls Mimir/mimir-cli/src/` | Success — 2 source files | +| `wc -l Mimir/mimir-core/src/*.rs` | Success — 18,054 LOC | +| `wc -l Mimir/mimir-librarian/src/*.rs` | Success — 16,139 LOC | +| `wc -l Mimir/mimir-harness/src/*.rs` | Success — 10,980 LOC | +| `wc -l Mimir/mimir-mcp/src/*.rs` | Success — 1,357 LOC | +| `wc -l Mimir/mimir-cli/src/*.rs` | Success — 964 LOC | +| `find Mimir/mimir-core/tests -type f -name '*.rs'` | Success — 5 test files | +| `find Mimir/mimir-harness/tests -type f -name '*.rs'` | Success — 4 test files | +| `find Mimir/mimir-cli/tests -type f -name '*.rs'` | Success — 2 test files | +| `find Mimir/mimir-librarian/tests -type f -name '*.rs'` | Success — 2 test files | +| `wc -l` on all test files | Success — 8,575 LOC total | +| `find 
Mimir -type f -name '*.rs' \| wc -l` | Success — 162 files | +| `mkdir -p Mimir/.agents/specs/2026-04-30-green-room-product-evaluation/` | Success | +| `git -C Mimir status` | **Denied** — sandbox blocked cd-then-git | +| `git -C Mimir log` | **Denied** — sandbox blocked cd-then-git | +| `git -C Mimir diff` | **Denied** — sandbox blocked cd-then-git | + +**Skipped gates:** + +| Gate | Reason | +|---|---| +| `cargo build --workspace` | Bash denied for Mimir git/cargo; dispatch predicted this; launch-readiness.md records pass on 2026-04-28 | +| `cargo test --workspace` | Same; launch-readiness.md records pass on 2026-04-28 | +| `cargo fmt --all -- --check` | Same | +| `cargo clippy --all-targets --all-features -- -D warnings` | Same | +| `cargo deny check` | Same | +| `cargo doc --no-deps --all-features` | Same | +| `cargo publish --dry-run` per crate | Same | + +All cargo gate evidence is from `docs/launch-readiness.md` (2026-04-28 pass) +and `STATUS.md` (CI green on main after PR #11). This is **second-hand +evidence**, not a fresh gate run. The verifier should note this gap. + +## Product Thesis and Target User + +**Thesis** (from AGENTS.md mandate update 2026-04-24): Mimir is an +experimental local-first memory governance system for AI coding agents. It +provides a librarian-mediated, append-only, symbol-tracking memory store that +agents write to through a structured IR surface and read from through governed +retrieval. The goal is durable, auditable, cross-agent memory that survives +context resets, model changes, and session boundaries. + +**Target user**: AI coding agents (Codex, Claude Code, Cursor, Copilot) and +the developers who operate them. Mimir is infrastructure, not a direct +end-user product. The human operator configures and audits; agents interact +through the harness, MCP server, or Codex plugin. 
+ +**Differentiation**: single-writer gate (librarian), append-only canonical +store, agent-native IR (not human-readable prose), compiler-shaped pipeline, +bi-temporal model, confidence decay, cross-agent consensus quorum, transparent +harness wrapping, and scoped memory isolation with governed promotion. + +## Current Status vs. Last Known Roadmap + +`STATUS.md` (last updated 2026-04-28) places Mimir in **pre-1.0 public launch +cleanup**: + +| Status item | Evidence | +|---|---| +| CI state | Main green after PR #11 (2026-04-28) | +| Core store | Implemented (canonical.rs 2090 LOC, store.rs 2089 LOC) | +| Librarian pipeline | Implemented (pipeline.rs 2727 LOC, full lex→parse→bind→semantic→emit chain) | +| Harness | Implemented (lib.rs 8212 LOC, main.rs 2768 LOC) | +| Operator controls | Implemented (bounded context, operator memory controls, project doctor, hook validation) | +| MCP server | Implemented (server.rs 1179 LOC, rmcp 1.5.0) | +| Recovery framework | Implemented (benchmarks/recovery/ with scenarios, scoring, test_bench.py) | +| Codex plugin | Implemented (plugins/mimir/ with skills and marketplace entry) | +| Adapters | Processing adapters present (copilot_session_store.rs 1489 LOC in librarian) | +| Fuzz targets | Present (fuzz/ directory with corpus) | +| Launch readiness | All checklist items Done per docs/launch-readiness.md | +| Release tag | **Not created** — pending owner approval | +| Public publish | **Not done** — crates.io publish paused | + +The launch work order in STATUS.md is: +1. Public surface scrub +2. README/docs cleanup +3. OSS readiness +4. Marketing +5. Local verification +6. Batched commit/push + +Steps 1–5 appear substantially complete based on launch-readiness.md evidence. +Step 6 (batched commit/push) has not happened — the dirty state confirms local +work is uncommitted/unpushed. + +**Drift from roadmap**: The BES fleet realignment paused Mimir before the +final batched commit/push. 
The handoff triage spec marks Mimir +`owner-paused`. The product is closer to launch-ready than any other BES repo +but the public release action was blocked by fleet policy, not by engineering +gaps. + +## Engineering Quality + +### Architecture + +**Strong.** The architecture is well-specified across 14 authoritative concept +docs plus 2 drafts (scope-model, consensus-quorum). The crate structure +cleanly separates concerns: + +| Crate | Role | +|---|---| +| `mimir-core` | Canonical store, IR pipeline, read/write, decay, inference, symbol tracking | +| `mimir-librarian` | Single-writer gate, LLM integration, drafts, quorum, processing adapters | +| `mimir-harness` | Transparent agent wrapper, operator controls, context management | +| `mimir-mcp` | MCP server exposing governed memory tools | +| `mimir-cli` | Operator CLI for direct store interaction | + +Eight non-negotiable architectural invariants are documented in AGENTS.md: +1. Librarian-mediated writes (single writer gate) +2. Append-only canonical store +3. Bi-temporal model (assertion time + valid time) +4. Agent-native IR (not human-readable) +5. Structured write surface (Lisp S-expression) +6. Compiler-shaped pipeline (lex→parse→bind→semantic→emit) +7. Symbol-tracking IR with stable identity +8. Confidence decay with grounding-aware parameters + +These invariants are enforced through the type system and pipeline design, not +just documentation. + +**Risk**: `mimir-librarian/src/main.rs` at 8,455 LOC and +`mimir-harness/src/lib.rs` at 8,212 LOC are large single files. These are +complexity hot spots that will resist review, refactoring, and onboarding. + +### Build and Test Health + +**Good, with caveats.** + +- Workspace lint config is strict: `missing_docs = "deny"`, + `unsafe_code = "forbid"`, clippy pedantic with `unwrap_used`, + `expect_used`, `panic`, `todo`, `dbg_macro` all denied. +- CI matrix covers Ubuntu, macOS, Windows with fmt, clippy, test, deny. 
+- CI uses pinned action SHAs, Swatinem/rust-cache, concurrency groups. +- Test corpus: ~8,575 LOC across 13 test files, plus property tests (proptest) + and fuzz targets. +- Engineering gate passed locally on 2026-04-28 per launch-readiness.md. + +**Caveats**: +- No fresh gate run during this evaluation (sandbox restrictions). +- Test-to-source ratio is ~18% by LOC — adequate for a pre-1.0 project but + the large librarian and harness files likely have lower coverage density. +- No code coverage tooling configured. +- No integration test for the full harness→librarian→store pipeline visible + in the test file listing (harness tests exist but the integration boundary + is unclear without reading test content). + +### CI + +**Well-configured.** Cross-platform matrix (Ubuntu, macOS, Windows), cargo-deny +for dependency auditing, fmt and clippy checks, all-features test runs with +doctests. Concurrency groups prevent parallel CI waste. + +**CI cost constraint**: AGENTS.md documents a hard rule — monthly GitHub +Actions budget is capped. The repo must not generate churn that wastes CI +minutes. This is a real operational constraint for any post-evaluation work. + +### Dependency Risk + +**Low.** `Cargo.toml` shows a focused dependency set: +- Core: `thiserror`, `ulid`, `sha2`, `serde`, `serde_json`, `toml` +- Async: `tokio` +- Database: `rusqlite` (bundled SQLite) +- MCP: `rmcp` (pinned =1.5.0) +- Testing: `proptest`, `tempfile`, `criterion`, `wait-timeout` +- Observability: `tracing`, `tracing-subscriber` +- Schema: `schemars` +- Other: `anyhow`, `getrandom` + +`cargo-deny` is integrated into CI. The `rmcp` pin at `=1.5.0` is tight and +may need updating as the MCP ecosystem evolves, but for pre-1.0 this is +acceptable. + +### Observability + +**Present.** `tracing` and `tracing-subscriber` are workspace dependencies. +`PRINCIPLES.md` documents structured event logging with privacy rules. 
The +core crate (`mimir-core`) includes a `log.rs` (607 LOC) and the librarian has +`test_tracing.rs`. Actual observability depth requires code-level review. + +### Security + +**Good posture for pre-1.0.** `unsafe_code = "forbid"` workspace-wide. +`SECURITY.md` exists (per launch-readiness.md). Dependency auditing via +`cargo-deny`. No network-facing attack surface in the core library — the MCP +server is the primary external boundary. `rusqlite` bundled mode avoids system +SQLite version issues. Sanitisation boundary is documented in +`docs/concepts/`. + +### Release Posture + +**Ready but blocked.** `RELEASING.md` documents a full tag-triggered release +pipeline: verify-version → dry-run-publish → smoke-install → build-binaries +(5 targets) → github-release → crates-publish. `cargo publish --dry-run` +passed locally on 2026-04-28. The release workflow exists in +`.github/workflows/` (inferred from RELEASING.md). No tag has been created. +Owner approval is required for the first `v0.1.0` tag. + +### Operational Risk + +**Low for pre-1.0.** No production users, no hosted service, no SLA. The +primary operational risk is CI cost — any PR churn costs Actions minutes +against a capped monthly budget. + +## Code Quality + +### Maintainability + +**Mixed.** The crate boundaries and type system are strong. The lint +configuration is among the strictest possible in Rust (forbid unsafe, deny +missing_docs, pedantic clippy). But two files dominate the codebase: + +| File | LOC | Risk | +|---|---|---| +| `mimir-librarian/src/main.rs` | 8,455 | God-file risk; combines CLI entry, server logic, and processing in one file | +| `mimir-harness/src/lib.rs` | 8,212 | Large lib with harness logic, context management, operator controls | + +These files are individually larger than many entire crates. They will be +difficult to review, test in isolation, and onboard new contributors to. + +### Test Coverage + +**Adequate for pre-1.0, with gaps.** + +- 13 test files, ~8,575 LOC. 
+- Property tests via proptest. +- Fuzz targets present. +- Snapshot tests implied by the testing strategy in PRINCIPLES.md. +- No coverage tooling configured — coverage density of the large files is + unknown. +- Recovery benchmark framework provides structured scenario testing but is + separate from the unit/integration test suite. + +### Complexity Hot Spots + +1. `mimir-librarian/src/main.rs` (8,455 LOC) — needs decomposition. +2. `mimir-harness/src/lib.rs` (8,212 LOC) — needs decomposition. +3. `mimir-core/src/pipeline.rs` (2,727 LOC) — large but arguably appropriate + for a compiler pipeline; should be reviewed for internal modularity. +4. `mimir-core/src/canonical.rs` (2,090 LOC) and `store.rs` (2,089 LOC) — + core storage; size is proportional to responsibility but warrants review. + +### Stale Code + +- `docs/launch-posting-plan.md` was deleted in PR #16 but is still referenced + from `docs/README.md` (confirmed via docs index read). This is a known + broken link. +- `.planning/planning/` contains 14 historical planning docs. These are + archive material and should not be treated as authority. + +### Duplication + +Cannot assess without code-level review. The large file sizes suggest +potential internal duplication within librarian and harness, but this is +speculative. + +### Unsafe Assumptions + +- `unsafe_code = "forbid"` eliminates Rust-level unsafe. +- The LLM integration in `mimir-librarian/src/llm.rs` (1,015 LOC) is a + correctness boundary — LLM outputs fed into the compiler pipeline must be + validated. The validator exists (`validator.rs`, 149 LOC) but its coverage + of adversarial LLM output is unknown. +- The `rmcp` pin at `=1.5.0` assumes API stability of an early-stage MCP + library. + +### Correctness Risks + +- The compiler pipeline (lex→parse→bind→semantic→emit) is the core + correctness boundary. Property tests exist for some stages. Fuzz targets + exist. 
But the pipeline is 2,727 LOC and correctness of the full chain + depends on integration testing that was not directly observed. +- Confidence decay (`decay.rs`, 1,000 LOC) implements a mathematical model; + correctness requires property tests and potentially formal verification. + Property tests are present (proptest dependency) but depth is unknown. +- Bi-temporal model correctness in the canonical store is critical — temporal + queries must respect both assertion and valid time. Testing depth unknown. + +## Product Quality + +### Feature Completeness + +**Core feature set is implemented.** Per STATUS.md and README.md: + +| Feature | Status | +|---|---| +| Canonical append-only store | Implemented | +| Librarian-mediated writes | Implemented | +| Compiler pipeline (lex→parse→bind→semantic→emit) | Implemented | +| Agent-native IR | Implemented | +| Symbol tracking with stable identity | Implemented | +| Confidence decay | Implemented | +| Bi-temporal model | Implemented | +| Transparent agent harness | Implemented | +| Operator controls (bounded context, memory controls, project doctor) | Implemented | +| MCP server | Implemented | +| Recovery benchmarks | Implemented | +| Codex plugin | Implemented | +| CLI operator tools | Implemented | +| Processing adapters (Copilot session store) | Implemented | +| BC/DR restore drill | Implemented | + +**Not implemented / not claimed** (per README.md): + +| Feature | Status | +|---|---| +| Production readiness | Not claimed | +| Stable API | Not claimed | +| Hosted service | Not claimed | +| Benchmark-proven superiority | Not claimed | +| Relationship/timeline APIs | Deferred | +| OCI/MCPB package | Deferred | +| Broader client recipes | Deferred | +| OpenSSF Scorecard / Best Practices Badge | Deferred | + +### Demo / Showcase Readiness + +**Moderate.** The transparent harness (`mimir [agent args...]`) is the +primary demo surface. The Codex plugin bundle provides an integration path. 
+The MCP server enables tool-based interaction. But there is no recorded demo +script, no video, no interactive tutorial beyond the README quickstart. For an +infrastructure product targeting AI agent operators, the current entry point +is adequate for technical early adopters but not for broader discovery. + +### Asset and Content Readiness + +**Good for pre-1.0 OSS.** + +- README with quickstart. +- Docs index with concept specs, integration guides, observability docs. +- Launch article draft (docs/blog/). +- CHANGELOG with detailed unreleased section. +- Contributing guide, Code of Conduct, Security policy, issue/PR templates. +- CODEOWNERS, Dependabot config, Citation file. + +### User-Facing Gaps + +1. No recorded demo or tutorial beyond README quickstart. +2. Broken link: `docs/launch-posting-plan.md` deleted but still referenced. +3. No published crate — users cannot `cargo install` yet. +4. No release tag — users cannot pin a version. +5. The agent-native IR is intentionally not human-readable, which is a design + choice but creates an onboarding barrier for operators who want to inspect + memory state. + +## Roadmap Assessment + +### What Is Done + +- Core architecture implemented across 5 crates (~47.5k LOC). +- All 14 authoritative concept specs have corresponding implementations. +- Engineering quality gates pass locally (2026-04-28 evidence). +- CI is green on main. +- Launch readiness checklist is fully Done. +- Public surface (README, docs, legal, community) is prepared. +- Release pipeline exists and dry-run passed. + +### What Is Blocked + +1. **Release tag and crates.io publish**: blocked on owner approval. +2. **Public docs/PR/CI actions**: blocked by BES fleet policy (public OSS + constraint). +3. **Batched commit of local agent-control files**: blocked on owner approval + of what to include vs. exclude. +4. **BES spec-authority integration**: paused per fleet realignment; design + research needed before resuming. 
+ +### What Is Stale + +- `docs/launch-posting-plan.md` reference in `docs/README.md` — the file was + deleted in PR #16. +- `.planning/planning/` historical archive — 14 docs from pre-mandate + planning. Not harmful but not authority. +- The `STATUS.md` launch work order implies a linear sequence that was + interrupted by fleet realignment. The remaining steps (batched commit/push) + are valid but the context has changed. + +### Critical Path + +The critical path to v0.1.0 public launch: + +1. Owner decides on local agent-control file commit scope. +2. Owner decides on launch-posting-plan reference cleanup. +3. Owner approves release tag posture (v0.1.0). +4. Batched commit/push of approved changes. +5. Tag v0.1.0 and trigger release pipeline. +6. Verify crates.io publish and binary artifacts. +7. Announce (launch article is drafted). + +### What Can Be Cut + +- OpenSSF Scorecard / Best Practices Badge — deferred, not blocking launch. +- OCI/MCPB package — deferred. +- Broader client recipes — deferred. +- Relationship/timeline APIs — deferred. +- Live benchmark report hosting — deferred. +- BES spec-authority integration — explicitly paused; separate concern from + launch. + +## Next-Build Plan + +The smallest sequence of specs to move Mimir measurably toward green: + +### Spec 1: Local Agent-Control Commit Plan + +Decide which untracked agent-control files to commit, which to gitignore, and +which to remove. This unblocks the batched commit/push without polluting the +public repo with internal fleet language. + +### Spec 2: Pre-1.0 Launch Cleanup Batch + +Fix the broken `docs/launch-posting-plan.md` reference, verify all docs links, +run a fresh full engineering gate, and prepare the batched commit. This is the +final cleanup before tagging. + +### Spec 3: v0.1.0 Release Tag and Publish + +Create the v0.1.0 tag, trigger the release pipeline, verify crates.io +publish, verify binary artifacts for all 5 targets, and execute the launch +announcement plan. 
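
The stale-reference check described in Spec 2 could be sketched roughly as follows. This is an illustrative helper, not an existing script in the repo: the function name `find_stale_refs` is hypothetical, and it relies on plain `grep` rather than any tool the repo mandates.

```shell
# Hypothetical helper (an assumption, not part of the Mimir repo): print every
# line in the given files that still mentions a deleted document.
# Returns 0 when no reference remains, 1 when stale references were found,
# and 2 when grep itself failed (for example, a file is missing).
find_stale_refs() {
  local needle="$1"
  shift
  local status=0
  # -F: fixed-string match, -n: line numbers, -H: always show the file name.
  grep -nHF -- "$needle" "$@" || status=$?
  case "$status" in
    0) return 1 ;;  # matches were printed: stale references remain
    1) return 0 ;;  # no matches: the files are clean
    *) return 2 ;;  # grep error
  esac
}
```

A cleanup run might call it as `find_stale_refs "launch-posting-plan.md" README.md docs/README.md` and treat a nonzero exit as a failed gate. The file list here is only an example; the actual set of files to scan would come from the approved cleanup spec.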
+ +### Stretch Spec: Librarian/Harness Decomposition + +Split `mimir-librarian/src/main.rs` (8,455 LOC) and +`mimir-harness/src/lib.rs` (8,212 LOC) into focused modules. This is not +blocking launch but is the highest-impact maintainability improvement for +post-launch development. + +## Proposed Issue List + +| # | Title | Depends on | Risk | Verification gate | Model routing | +|---|---|---|---|---|---| +| 1 | Local agent-control commit plan | Owner decisions | Medium | `git status` shows only approved files staged | Any frontier model | +| 2 | Pre-1.0 launch cleanup batch | #1 | Low | Full cargo gate pass + docs link check | Any frontier model | +| 3 | v0.1.0 release tag and publish | #2 + owner approval | High | Release pipeline succeeds, crates.io publish verified, binaries downloadable | Codex `gpt-5.5` primary, Claude Opus verification | +| 4 | Librarian main.rs decomposition | None (can parallel after #1) | Medium | All tests pass, no public API change | Any frontier model | +| 5 | Harness lib.rs decomposition | None (can parallel after #1) | Medium | All tests pass, no public API change | Any frontier model | +| 6 | BES spec-authority research design | Owner approval to resume | High | Design doc reviewed by second model | Frontier model primary + different family verification | +| 7 | Test coverage tooling setup | None | Low | Coverage report generated, baseline established | Sonnet or fast model | +| 8 | Benchmark claim evidence | None (can parallel) | Low | Benchmark results recorded with methodology | Any model | + +## Owner Decisions Needed + +1. **Local agent-control commit scope**: which of the ~40 untracked + agent-control files should be committed to the public repo, which should + be gitignored, and which should be removed? + +2. **Launch-posting-plan reference**: the file was deleted in PR #16 but + `docs/README.md` still links to it. Should the reference be removed, + redirected, or should the file be restored? + +3. 
**Release tag posture**: is the owner ready to approve v0.1.0 tagging and + crates.io publish? What conditions must be met first? + +4. **BES integration pause visibility**: should the public repo acknowledge + that BES/Mimir spec-authority integration is paused, or should this remain + internal? + +5. **CI budget for post-evaluation work**: how many PR cycles can Mimir + consume from the monthly GitHub Actions budget for cleanup and release? + +6. **Librarian/harness decomposition priority**: should the large file + decomposition happen before or after v0.1.0 launch? + +## Residual Risks + +1. **No fresh cargo gate run**: all engineering gate evidence is from + 2026-04-28. Any changes since then (even agent-control file additions) + could affect the build. The verifier should note this gap. + +2. **Large file complexity**: the two 8k+ LOC files are technical debt that + will compound. Post-launch development velocity will suffer without + decomposition. + +3. **LLM integration boundary**: the librarian's LLM integration + (`llm.rs` + `validator.rs`) is a correctness boundary where adversarial + or malformed LLM output could corrupt the memory store. Testing depth at + this boundary is unknown. + +4. **rmcp version pin**: `=1.5.0` is a tight pin on an early-stage library. + MCP ecosystem changes may require updates that affect the server's API + surface. + +5. **Public OSS exposure**: once published, any internal agent-control + language accidentally committed becomes public. The commit scope decision + (#1 above) is critical. + +6. **Single-maintainer risk**: the repo appears to have one maintainer + (HasNoBeef). Bus factor is 1. + +7. **No code coverage baseline**: without coverage tooling, it is impossible + to quantify which paths are tested and which are not. + +## Evidence Gaps + +1. **Test content not read**: test files were counted and measured by LOC but + their content was not read. Coverage quality, assertion depth, and edge + case handling are unknown. 
+ +2. **Librarian main.rs and harness lib.rs not read**: the two largest files + were not read due to context constraints. Internal structure, duplication, + and specific complexity risks are inferred from size only. + +3. **LLM integration not read**: `llm.rs`, `validator.rs`, `quorum.rs`, + and `drafts.rs` in the librarian were not read. The correctness of LLM + output validation is unknown. + +4. **CI workflow not fully read**: only the first 80 lines of `ci.yml` were + read. Release workflow, Dependabot config, and other workflow files were + not read. + +5. **Git history not available**: sandbox restrictions prevented git log, git + diff, and git status commands against the Mimir repo. All git state is + from captured handoff spec evidence dated 2026-04-29. + +6. **No live cargo gate**: build, test, clippy, fmt, deny, doc, and + publish-dry-run were not executed. All pass evidence is second-hand from + `docs/launch-readiness.md` (2026-04-28). + +7. **CHANGELOG.md partially read**: only the first 100 lines (Unreleased + section header and early entries). Full change history since last release + is not captured. + +8. **Adapter coverage unknown**: `copilot_session_store.rs` (1,489 LOC) is + the only visible adapter. Whether other agent adapters exist or are + planned is unclear from the files read. diff --git a/.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md b/.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md new file mode 100644 index 0000000..119130a --- /dev/null +++ b/.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md @@ -0,0 +1,173 @@ +# Mimir Green Room Roadmap + +## Current State + +Mimir is a pre-1.0 public OSS memory governance system for AI coding agents. +The core architecture is implemented across 5 Rust crates (~47.5k LOC source, +~8.5k LOC tests). CI is green on main (2026-04-28). All launch readiness +checklist items are Done. The release pipeline exists and dry-run passed. 
No +release tag has been created. No crates.io publish has occurred. The repo is +`owner-paused` per BES fleet triage. Local dirty state consists of ~40 +untracked agent-control/setup files and two modified files (`.gitignore`, +`AGENTS.md`); no tracked product source is modified. + +The product is closer to public launch than any other BES repo. The gap is +owner decisions and a final cleanup pass, not engineering work. + +## Milestones + +### M1: Commit Scope Resolution + +Decide and execute which local agent-control files to commit, gitignore, or +remove. This is the gate for all subsequent work — the dirty state must be +resolved before any clean batched commit/push. + +**Owner decision required.** + +### M2: Pre-Launch Cleanup + +Fix the broken `docs/launch-posting-plan.md` reference. Verify all docs links. +Run a fresh full engineering gate (`cargo build/test/fmt/clippy/deny/doc` + +`cargo publish --dry-run` for all crates). Prepare the batched commit of +approved changes. + +**Depends on: M1.** + +### M3: v0.1.0 Release + +Create the v0.1.0 tag. Trigger the release pipeline. Verify crates.io publish +and binary artifacts for all 5 targets (per RELEASING.md). Execute the launch +announcement. + +**Depends on: M2 + owner approval for tag and publish.** + +### M4: Post-Launch Maintainability + +Decompose `mimir-librarian/src/main.rs` (8,455 LOC) and +`mimir-harness/src/lib.rs` (8,212 LOC) into focused modules. Set up code +coverage tooling. Establish a coverage baseline. + +**Depends on: M3 (or can start after M1 if owner approves parallel work).** + +### M5: BES Integration Research + +Resume the paused BES spec-authority integration design. Requires a separate +approved research spec and second-model verification before implementation. + +**Depends on: owner decision to resume. Independent of M1–M4.** + +## Critical Path + +``` +M1 (commit scope) → M2 (cleanup) → M3 (release) +``` + +All three milestones are sequentially dependent. M1 is owner-blocked. 
The +total engineering work for M1–M3 is small — the blocking constraint is owner +decisions, not implementation effort. + +## Parallelizable Work + +The following can proceed in parallel with the critical path after M1 is +resolved: + +| Work | Can start after | Blocks | +|---|---|---| +| Librarian main.rs decomposition | M1 | Nothing on critical path | +| Harness lib.rs decomposition | M1 | Nothing on critical path | +| Test coverage tooling setup | M1 | Nothing on critical path | +| Benchmark claim evidence collection | Any time | Nothing on critical path | +| Adapter inventory and planning | Any time (read-only) | Nothing on critical path | + +## Work That Should Not Start Yet + +| Work | Why not yet | +|---|---| +| BES spec-authority integration | Owner has not approved resumption; requires separate research spec | +| New feature development (relationship/timeline APIs, OCI package, hosted service) | Pre-1.0 — launch first | +| Public PR churn for agent-control scaffolding | CI budget constraint; owner must approve PR plan | +| OpenSSF Scorecard / Best Practices Badge | Deferred per launch-readiness.md; post-launch concern | +| Broader client recipes beyond Codex plugin | Post-launch; current Codex plugin and MCP server are sufficient for v0.1.0 | + +## First Three Executable Specs + +### Spec 1: Local Agent-Control Commit Plan + +**Goal**: Resolve the dirty state by classifying all ~40 untracked files as +commit, gitignore, or remove. + +**Scope**: +- Inventory every untracked file and its purpose. +- Propose a classification for owner review. +- After owner approval, stage only approved files and commit. +- Ensure no internal fleet language leaks into the public repo. + +**Verification**: `git status` shows clean working tree with only approved +changes committed. No internal agent-control language in committed files +visible to public. 
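
The inventory step in the scope above might start from a grouped listing like the sketch below. It assumes paths arrive one per line on stdin, for example from `git ls-files --others --exclude-standard`. The function name `summarize_by_top_level` is hypothetical, and grouping by first path component is only one possible way to present the files for owner review.

```shell
# Hypothetical helper (an assumption, not part of the Mimir repo): count paths
# per top-level directory, or per file name for root-level entries.
# Reads paths from stdin, one per line; prints "name<TAB>count" sorted by name.
summarize_by_top_level() {
  awk -F/ '
    NF { counts[$1]++ }                       # skip blank lines
    END { for (k in counts) printf "%s\t%d\n", k, counts[k] }
  ' | LC_ALL=C sort
}
```

Piping the untracked-file list through it, as in `git ls-files --others --exclude-standard | summarize_by_top_level`, gives a compact table to attach to the classification proposal. It does not classify anything itself; the commit, gitignore, or remove decision stays with the owner.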
+ +**Risk**: Medium — wrong classification could expose internal fleet language +publicly or lose useful local configuration. + +**Model routing**: Any frontier model. Low-risk enough for Sonnet with +frontier verification. + +--- + +### Spec 2: Pre-1.0 Launch Cleanup Batch + +**Goal**: Fix all known documentation drift, verify engineering gates, and +prepare the final batched commit before tagging. + +**Scope**: +- Remove or redirect the broken `docs/launch-posting-plan.md` reference in + `docs/README.md`. +- Verify all documentation links resolve. +- Run the full engineering gate: + `cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings && cargo deny check && cargo doc --no-deps --all-features`. +- Run `cargo publish --dry-run` for each crate in dependency order. +- Fix any issues found. +- Prepare the batched commit. + +**Verification**: Full cargo gate passes. No broken documentation links. +`cargo publish --dry-run` succeeds for all crates. + +**Risk**: Low — this is cleanup, not feature work. The gate passed on +2026-04-28; regressions are unlikely unless dependency updates occurred. + +**Model routing**: Any frontier model. + +--- + +### Spec 3: v0.1.0 Release Tag and Publish + +**Goal**: Execute the first public release of Mimir. + +**Scope**: +- Push the batched commit from Spec 2. +- Create the `v0.1.0` tag per RELEASING.md. +- Verify the tag-triggered release pipeline: + - verify-version + - dry-run-publish + - smoke-install + - build-binaries (5 targets) + - github-release + - crates-publish (dependency order: mimir-core → mimir-librarian → + mimir-harness → mimir-mcp → mimir-cli) +- Verify crates.io pages and binary artifact downloads. +- Execute launch announcement per the drafted launch article. + +**Verification**: Release pipeline succeeds. All 5 crates published on +crates.io. Binary artifacts downloadable for all targets. 
`cargo install mimir-cli` works from a clean environment. + +**Risk**: High — first public release; irreversible once crates.io publish +completes. Requires owner approval at multiple gates. + +**Model routing**: Codex `gpt-5.5` primary with Claude Opus verification, per +MODEL_ROUTING.md public OSS release guidance. + +**Owner approval gates**: +1. Approve the tag name and version. +2. Approve the crates.io publish. +3. Approve the launch announcement wording. +4. Approve the CI budget spend for the release pipeline. diff --git a/.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md b/.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md new file mode 100644 index 0000000..285b2a7 --- /dev/null +++ b/.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md @@ -0,0 +1,170 @@ +# Mimir Green Room Verification + +## Verifier Metadata + +| Field | Value | +|---|---| +| Verifier | Codex | +| Model | GPT-5 | +| Reasoning mode | high | +| Date | 2026-04-30 | +| Scope | Independent second-model verification of `EVALUATION.md` and `ROADMAP.md`; no implementation or public action | +| Final status | `verified-with-changes` | + +This file is verifier output only. It is not owner approval to implement the +roadmap, stage files, push, tag, publish, open PRs, spend CI minutes, or change +public docs. Owner-approved executable specs are still required before any +roadmap item becomes implementation work. + +## Sources Checked + +- Root policy: `AGENTS.md`, `.agents/OPERATING_MODEL.md`, + `.agents/GREEN_ROOM_EVALUATION.md`, `.agents/MODEL_ROUTING.md`. +- Mimir policy/status: `AGENTS.md`, `CLAUDE.md`, `WORKFLOW.md`, `STATUS.md`, + `README.md`, `.agents/DOCUMENTATION_GUIDE.md`, + `docs/launch-readiness.md`. +- Primary packet: + `.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md` and + `.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md`. 
+- Supporting evidence: `PRINCIPLES.md`, `docs/README.md`, `Cargo.toml`, + `.github/workflows/ci.yml`, `.github/workflows/release.yml`, + `RELEASING.md`. + +## Predicted Failure Classification + +| Constraint | Prediction | Result | Classification | +|---|---|---|---| +| Public actions | Pushes, PRs, tags, publish, release workflows blocked without owner approval | Not attempted | expected | +| CI budget | GitHub Actions should not be triggered by this audit | Not triggered | expected | +| Cargo gates | Could be slow/heavy for a Rust workspace | Completed quickly locally | predicted but did not fail | +| Optional tools | `cargo deny` might be unavailable | Available and passed | predicted but did not fail | +| Dirty worktree | Existing uncommitted/untracked agent-control work must be preserved | Preserved; only this file written | expected | + +## Evidence Commands + +| Command | Result | +|---|---| +| `node .agents/scripts/preflight.mjs` from root | Pass, 0 warnings | +| `git status --short --branch --untracked-files=all` | `main...origin/main`; existing `M .gitignore`, `M AGENTS.md`, and many untracked agent-control files | +| `git log --oneline --decorate -n 12` | `HEAD` at `9e81c0f feat(librarian): support active processing adapters`; prior public cleanup/release commits present | +| `git diff --name-status` | Existing tracked diffs only: `.gitignore`, `AGENTS.md` | +| `git diff --stat` | Existing tracked diffs: 32 insertions, 2 deletions | +| `git ls-files \| wc -l` | 182 tracked files | +| `find crates -path '*/src/*.rs' ... wc -l` | 41 source files, 47,494 LOC | +| `find crates -path '*/tests/*.rs' ... 
wc -l` | 18 crate test files, 10,173 LOC | +| `cargo fmt --all -- --check` | Pass | +| `cargo build --workspace` | Pass | +| `cargo test --workspace` | Pass | +| `cargo test --workspace --all-features` | Pass | +| `cargo clippy --all-targets --all-features -- -D warnings` | Pass | +| `cargo deny check` | Pass: advisories, bans, licenses, sources ok | +| `cargo doc --workspace --no-deps` | Pass; docs generated under ignored `target/doc` | +| `rg -n "launch-posting-plan..." docs README.md STATUS.md CHANGELOG.md RELEASING.md AGENTS.md` | Found stale references in `AGENTS.md`, `STATUS.md`, `README.md`, `docs/README.md`, and `docs/launch-readiness.md` | +| `ls -la docs/launch-posting-plan.md` | Failed: file does not exist | + +## Agreement With Primary Findings + +I agree with the primary evaluation's main conclusions: + +- Mimir is public OSS, pre-1.0, and should not receive public release or PR + churn without owner approval. +- The architecture and public product claims are internally consistent with + `AGENTS.md`, `STATUS.md`, `README.md`, `PRINCIPLES.md`, and + `docs/launch-readiness.md`. +- The critical path is owner-gated release/commit scope work, not new feature + engineering. +- `mimir-librarian/src/main.rs` and `mimir-harness/src/lib.rs` are real + maintainability hot spots by size. +- `docs/launch-posting-plan.md` is missing while public docs still reference + it. +- Release/tag/publish work is high risk and must remain owner-approved. + +## Required Changes Before Dispatch + +1. Expand the broken-link cleanup scope. The primary packet and roadmap call + out `docs/README.md`, but the missing `docs/launch-posting-plan.md` is also + referenced from `AGENTS.md`, `STATUS.md`, `README.md`, and + `docs/launch-readiness.md`. A cleanup spec must either restore the file or + update all stale references. + +2. Correct the test inventory in future operator summaries. The verifier + counted 18 crate test files and 10,173 test LOC, not 13 files and ~8,575 + LOC. 
This does not weaken the roadmap; it improves the evidence baseline. + +3. Replace the primary packet's "no fresh cargo gate" residual risk with the + fresh verifier evidence above when using the roadmap for dispatch. The + green-room packet now has fresh local passes for fmt, build, tests, + all-features tests, clippy, deny, and docs. + +4. Keep Spec 1 explicitly owner-driven. The public OSS constraint is respected + only if the owner decides which agent-control files, if any, belong in the + public repository. + +## Findings + +### High + +None. + +### Medium + +- The roadmap's pre-launch cleanup item is underspecified for the missing + launch-posting plan. It must cover all stale public references or restore the + deleted file; otherwise public docs remain inconsistent after the proposed + cleanup. + +- The release spec remains owner-blocking. Creating `v0.1.0`, publishing + crates, or executing announcements would be irreversible/public and cannot + be inferred from verifier approval. + +### Low + +- The primary evaluation understated test inventory and missed visible + `mimir-mcp` crate tests in its test-file count. This is an evidence + correction, not a rejection. + +- The primary evaluation's git/cargo sandbox limitations are no longer true + for this verifier run. Fresh commands succeeded locally. + +## Owner Decisions Still Required + +- Commit scope for `.agents/`, `.claude/`, `CLAUDE.md`, `WORKFLOW.md`, + `.gitignore`, and `AGENTS.md` changes in a public OSS repo. +- Whether to restore `docs/launch-posting-plan.md` or remove/redirect all + references to it. +- Whether `v0.1.0` is the intended first public tag, and whether crates.io + publishing is approved. +- CI budget for any cleanup PR/release workflow runs. +- Whether large-file decomposition should happen before or after first public + release. +- Whether BES spec-authority integration remains paused or receives a new + research spec. 
+ +## Public OSS Constraint Check + +The primary evaluation and roadmap keep internal agent-control output under +`.agents/specs/` and do not edit public docs or product code. They repeatedly +require owner approval before public PR, CI, tag, publish, and announcement +actions. This satisfies the green-room public OSS constraint. + +One caution: because the repo is public, committing `.agents/` or `.claude/` +content is itself a public documentation decision. That must remain an owner +decision, not an implied verifier conclusion. + +## Residual Risks + +- This verifier did not run `cargo publish --dry-run`; release dry-runs and + real publish remain owner-approved release-spec work. +- This verifier did not inspect every large source file deeply; large-file + maintainability and LLM-boundary risk remain valid follow-up topics. +- Local green checks do not prove GitHub's cross-platform matrix or release + workflow will pass on the next push/tag. +- Existing dirty/untracked files predate this verifier run and were preserved. + +## Final Status + +`verified-with-changes` + +The roadmap is evidence-based, current after the verifier corrections above, +and internally consistent enough to become the basis for owner-approved +executable specs. It is not itself implementation approval. diff --git a/.agents/specs/2026-04-30-remove-launch-posting-plan-references/SPEC.md b/.agents/specs/2026-04-30-remove-launch-posting-plan-references/SPEC.md new file mode 100644 index 0000000..3532d3e --- /dev/null +++ b/.agents/specs/2026-04-30-remove-launch-posting-plan-references/SPEC.md @@ -0,0 +1,321 @@ +--- +id: remove-launch-posting-plan-references +status: ready-for-review +owner: HasNoBeef +repo: Mimir +branch_policy: worktree-preferred +risk: low +requires_network: false +requires_secrets: [] +acceptance_commands: + - test ! -e docs/launch-posting-plan.md + - '! 
rg -n "docs/launch-posting-plan\\.md|\\(launch-posting-plan\\.md\\)|launch-posting-plan\\.md" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md' + - "bash -lc 'rg -n \"launch-posting-plan|launch posting|posting plan|Publishing plan\" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md || test $? -eq 1'" + - git diff --check -- AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md +--- + +# SPEC: Remove Launch Posting Plan References + +## 1. Problem + +`docs/launch-posting-plan.md` is absent from the public Mimir tree, but current +repo documentation still links to it or describes it as an active launch +posting plan. Because Mimir is public OSS, a broken launch/publishing reference +creates public documentation drift and can imply a release or announcement path +that is not owner-approved. + +This spec is local agent-control work only. It authorizes a future +implementation to remove or redirect stale references to the missing file; it +does not authorize restoring a launch posting plan, committing local +agent-control files, pushing, opening a PR, tagging, publishing crates, spending +CI minutes, or announcing a release. + +## 2. Goals + +- Remove or redirect every current reference that points readers to the missing + `docs/launch-posting-plan.md` file. +- Keep public wording factual, pre-1.0, and consistent with current approved + release boundaries. +- Preserve the owner decision that launch/posting/release execution remains + separate release-pr work. +- Keep `.agents/` and other BES agent-control artifacts local-only unless a + separate owner-approved public rollout spec says otherwise. + +## 3. Non-Goals + +- Do not restore, recreate, rewrite, or replace `docs/launch-posting-plan.md`. +- Do not create a new launch-posting, marketing, social, Show HN, crates.io, + docs.rs, MCP Registry, or announcement plan. 
+- Do not tag `v0.1.0`, publish crates, open PRs, push branches, mutate tracker + state, or trigger release workflows. +- Do not resolve broader dirty-state questions for `.agents/`, `.claude/`, + `CLAUDE.md`, `WORKFLOW.md`, `.gitignore`, or existing modified `AGENTS.md`. +- Do not edit product code, Cargo metadata, CI workflows, release automation, + benchmark assets, or unrelated docs. + +## 4. Current System Facts + +- Owner instruction for this task: remove or redirect stale + launch-posting-plan references; do not restore a launch-posting plan; + agent-control files stay local-only; public release/publish decisions remain + separate release-pr work. +- Root `AGENTS.md`: public OSS repos `Wick` and `Mimir` must not receive + internal agent-control language unless the owner approves a public-facing + rollout; product code lives in child repos, not the root. +- Root `.agents/OPERATING_MODEL.md`: non-trivial work starts with an + executable spec, public OSS repos require extra release hygiene, and public + OSS doc-only churn must not be pushed without an intentional owner-approved + low-noise PR plan. +- Root `.agents/GREEN_ROOM_EVALUATION.md`: green-room packets are local, + isolated, and do not authorize implementation, public docs publication, PRs, + tags, or releases. +- Root + `.agents/specs/2026-04-29-green-room-product-evaluations/CROSS_PRODUCT_SEQUENCE.md`: + Mimir is actionable as local-only public-OSS-safe cleanup to remove or + redirect stale launch-posting-plan references; agent-control files remain + local-only; public release/publish decisions remain separate release-pr work. +- Mimir `AGENTS.md`: the repo is public OSS, uses BES spec-first operation, and + has a hard CI quota rule requiring local verification before any push. +- Mimir `WORKFLOW.md`: canonical verification is + `cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings`. 
+- Mimir `STATUS.md`: Mimir is pre-1.0 public active development; release tags + are absent; `v0.1.0` may be tagged only after owner approval. +- Mimir `README.md`: public claims are intentionally limited and do not claim + production readiness, stable APIs, hosted service availability, benchmark + superiority, direct agent writes, or ungoverned cross-project promotion. +- Mimir `docs/launch-readiness.md`: release/tag/publish state remains + pre-release; docs.rs and crates.io publishing wait for release workflow. +- Mimir green-room verifier + `.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md`: + `docs/launch-posting-plan.md` is missing, and stale references were found in + `AGENTS.md`, `STATUS.md`, `README.md`, `docs/README.md`, and + `docs/launch-readiness.md`. +- Command: `git -C Mimir log --oneline --decorate -n 5` from the workspace + root showed `4d38614 Delete docs/launch-posting-plan.md (#16)` immediately + before current `HEAD` `9e81c0f`, so the missing file was deliberately deleted + in repository history. +- Command: + `test -e docs/launch-posting-plan.md; printf 'docs/launch-posting-plan.md exists: %s\n' "$?"` + from `Mimir` returned `docs/launch-posting-plan.md exists: 1`, confirming the + file does not exist in the current worktree. +- Command: + `rg -n "launch-posting-plan|posting plan|launch posting|Publishing plan" .` + from `Mimir` found current references in: + - `AGENTS.md` + - `STATUS.md` + - `README.md` + - `docs/README.md` + - `docs/launch-readiness.md` + - `CHANGELOG.md` +- The same search shows `CHANGELOG.md` mentions launch posting assets in the + Unreleased historical change summary but does not link to + `docs/launch-posting-plan.md`. +- Command: `git -C Mimir status --short --branch --untracked-files=all` showed + branch `main...origin/main`, existing modified `.gitignore` and `AGENTS.md`, + and many untracked `.agents/`, `.claude/`, `CLAUDE.md`, and `WORKFLOW.md` + files. 
Those existing changes predate this spec and must be preserved. + +## 5. Desired Behavior + +After implementation: + +- No current public or operating document points to the missing + `docs/launch-posting-plan.md` path. +- `docs/launch-posting-plan.md` remains absent. +- The public docs continue to point users to existing launch/release status + surfaces, primarily `STATUS.md`, `docs/README.md`, + `docs/launch-readiness.md`, `RELEASING.md`, and the existing launch article + only where those links already exist or are directly relevant. +- Any replacement wording is descriptive and non-promissory. Public-facing + replacements may say that release, tag, publish, listing, and announcement + steps remain pending or are handled separately by the normal release process. + They must not introduce internal terms such as `release-pr`, `agent-control`, + or `BES fleet`, and must not provide new channel strategy, marketing copy, + publish order, launch timing, ownership/approval language, or + public-readiness claims. +- If an implementer believes a user-facing sentence needs subjective marketing + or launch-positioning judgment, the implementer must stop and mark that + sentence owner-blocking instead of inventing wording. + +## 6. Domain Model / Contract + +- `docs/launch-posting-plan.md`: deleted public doc. It is not an implementation + target and must not be restored by this cleanup. +- Stale reference: any link, path mention, index row, status row, or active + current-state claim that directs a reader to `docs/launch-posting-plan.md` or + says the missing plan is the active launch/posting/listing plan. +- Redirect: replacing a stale reference with a link to an existing current + document that already carries the relevant authority, without adding new + launch strategy. Acceptable redirect targets are: + - `STATUS.md` for current state and release tags. + - `docs/launch-readiness.md` for OSS readiness and promise audit. 
+ - `RELEASING.md` for release mechanics, if the surrounding context is release + workflow rather than launch announcement. + - `docs/blog/2026-04-28-agent-memory-compiler-pipeline.md` only for the + existing public article reference, not as a posting plan replacement. +- Removal: deleting the stale bullet, table entry, or sentence when no existing + public doc is an objective replacement. +- Historical changelog entry: a past-tense release-note statement that does not + link to the missing file. It may remain unchanged unless implementation + evidence shows it is now misleading as a current-state claim. + +## 7. Interfaces And Files + +Expected implementation touch points: + +- `AGENTS.md`: remove or redirect the `Where to Look` row that currently + includes `docs/launch-posting-plan.md`. +- `STATUS.md`: remove or redirect the `References` bullet for + `docs/launch-posting-plan.md`. +- `README.md`: remove or redirect the `Documentation` bullet for + `docs/launch-posting-plan.md`. +- `docs/README.md`: remove or redirect the `Start Here` bullet and the + `Launch Execution` paragraph that reference `launch-posting-plan.md`. +- `docs/launch-readiness.md`: update the `Publishing plan` row so it does not + claim the missing file is done or authoritative. + +Files to inspect but not edit unless implementation evidence proves a current +stale reference: + +- `CHANGELOG.md`: current evidence shows only a historical Unreleased summary + mentioning posting assets, with no missing-file link. + +Files and directories out of scope: + +- `docs/launch-posting-plan.md` +- Product source files and tests. +- `Cargo.toml`, `Cargo.lock`, `.github/`, `RELEASING.md`, release workflows, + benchmark assets, and package metadata. +- `.agents/` files other than this spec, `.claude/`, `CLAUDE.md`, + `WORKFLOW.md`, `.gitignore`, git metadata, and tracker state. + +Public interfaces affected: + +- Public README/documentation navigation only. 
+- No CLI, API, storage, MCP, package, or release interface changes. + +## 8. Execution Plan + +1. Reconfirm the worktree state with + `git status --short --branch --untracked-files=all` and preserve all + pre-existing modified/untracked files. +2. Reconfirm the missing-file reference set with + `rg -n "launch-posting-plan|launch posting|posting plan|Publishing plan" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md`. +3. Edit only the expected implementation touch points listed in Section 7. +4. For each stale reference, either remove it or redirect it to an existing + current authority document using neutral wording. +5. Do not edit `CHANGELOG.md` unless the implementation search proves it + contains a current missing-file path or active-current-state claim rather + than historical release-note language. +6. Do not create `docs/launch-posting-plan.md`. +7. Run the acceptance commands in Section 10. +8. Report changed files, command results, unchanged pre-existing dirty files, + and any owner-blocking wording decisions encountered. + +## 9. Safety Invariants + +- Do not overwrite or revert pre-existing modifications in `AGENTS.md`, + `.gitignore`, `.agents/`, `.claude/`, `CLAUDE.md`, `WORKFLOW.md`, or any other + file. +- Do not stage, commit, push, open a PR, tag, publish, mutate tracker state, or + run public release workflows. +- Do not restore the deleted launch-posting plan. +- Do not introduce internal BES fleet details into public-facing docs beyond + already-existing local agent-control surfaces. +- Do not add AI attribution to docs, commits, release notes, or generated + output. +- Do not broaden this cleanup into broader launch-readiness, release-pr, + package, CI, benchmark, or public-announcement work. +- If implementation needs subjective launch messaging, owner review is required + before the wording is written. + +## 10. 
Test Plan + +Run from `/var/home/hasnobeef/buildepicshit/Mimir` after implementation: + +```bash +git status --short --branch --untracked-files=all +test ! -e docs/launch-posting-plan.md +! rg -n "docs/launch-posting-plan\\.md|\\(launch-posting-plan\\.md\\)|launch-posting-plan\\.md" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md +bash -lc 'rg -n "launch-posting-plan|launch posting|posting plan|Publishing plan" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md || test $? -eq 1' +git diff --check -- AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md +``` + +Expected results: + +- `git status` shows only the implementation files plus pre-existing dirty and + untracked files; no unrelated files appear. +- `test ! -e docs/launch-posting-plan.md` passes. +- The negative `rg` command returns no matches. +- The broad `rg` evidence command returns either no matches in current docs or + only explicitly reviewed non-stale historical language such as `CHANGELOG.md`. + Exit code 1 from no matches is acceptable; any other `rg` failure is not. +- `git diff --check` passes. + +Do not run cargo gates for this link-only cleanup unless implementation touches +Rust, Cargo, CI, release, package, benchmark, or generated documentation +surfaces. The full local gate remains mandatory before any future push or +release-pr work. + +Manual checks: + +- Open each changed diff hunk and confirm it either removes the stale reference + or redirects to an existing file. +- Confirm no replacement sentence invents a public release date, launch + channel strategy, posting copy, publish approval, or benchmark/performance + claim. +- Confirm `docs/launch-readiness.md` no longer marks the missing publishing + plan as `Done` evidence. + +## 11. Acceptance Criteria + +- [ ] `docs/launch-posting-plan.md` remains absent. 
+- [ ] No active current-state doc among `AGENTS.md`, `STATUS.md`, `README.md`, + `docs/README.md`, or `docs/launch-readiness.md` references + `docs/launch-posting-plan.md` or `launch-posting-plan.md`. +- [ ] Any remaining "posting plan", "launch posting", or "Publishing plan" + wording is either removed, redirected to an existing authority document, + or explicitly historical and non-actionable. +- [ ] `docs/launch-readiness.md` does not claim a missing publishing plan is + complete evidence. +- [ ] No new launch/posting plan, marketing strategy, release approval, tag, + publish, PR, or CI-spend action is introduced. +- [ ] Only files required by Section 7 are edited, except `CHANGELOG.md` may be + edited only if implementation evidence proves it contains a stale current + reference. +- [ ] Pre-existing dirty/untracked work is preserved. +- [ ] Acceptance commands pass or any failure is classified as expected, + new, or owner-blocking with exact output. +- [ ] Completion report lists files changed, commands run and results, + intentionally untouched files, residual risks, and any spec evidence + candidates. + +## 12. Rollback Plan + +Before commit or PR work, rollback is a normal file-level revert of only the +future implementation hunks in the touched public docs. Do not use +`git reset --hard` or broad checkout commands because the worktree already +contains unrelated owner/agent changes that must be preserved. + +If a later review rejects a redirect target, replace only that sentence or +bullet with a removal or owner-approved redirect. Do not restore +`docs/launch-posting-plan.md` as rollback. + +## 13. Open Questions + +- [ ] None before spec review. Owner has already decided to remove or redirect + stale launch-posting-plan references and not restore the plan. 
+- [ ] Owner-blocking during implementation: any proposed replacement wording + that requires subjective launch messaging, public positioning, channel + strategy, publish timing, or public-readiness judgment. + +## 14. Completion Report + +To be filled by the executor/verifier: + +- Files changed: +- Commands run: +- Verification result: +- Intentionally untouched: +- Residual risk: +- Spec evidence candidates: diff --git a/.agents/specs/SPEC.template.md b/.agents/specs/SPEC.template.md new file mode 100644 index 0000000..187252c --- /dev/null +++ b/.agents/specs/SPEC.template.md @@ -0,0 +1,107 @@ +--- +id: replace-with-short-stable-id +status: draft +owner: HasNoBeef +repo: replace-with-repo-name +branch_policy: worktree-preferred +risk: low +requires_network: false +requires_secrets: [] +acceptance_commands: [] +--- + +# SPEC: Replace With Task Name + +## 1. Problem + +Describe the problem in concrete terms. Include observed behavior, affected +users, affected files, and why the change matters now. + +## 2. Goals + +- Goal 1. +- Goal 2. + +## 3. Non-Goals + +- Explicitly excluded scope. +- Related work deferred to another spec. + +## 4. Current System Facts + +List only verified facts. Cite files, docs, command output, issues, PRs, or +owner statements. + +- `path/to/file`: fact. +- Command: `example command` -> observed result. + +## 5. Desired Behavior + +State the target behavior in terms an executor can implement and a verifier can +test. + +## 6. Domain Model / Contract + +Define entities, states, schemas, invariants, inputs, outputs, or file formats +the implementation must preserve. + +## 7. Interfaces And Files + +Expected touch points: + +- `path/to/file` +- `path/to/other-file` + +Public interfaces affected: + +- CLI/API/tool/user workflow. + +## 8. Execution Plan + +1. Step one. +2. Step two. +3. Step three. + +## 9. Safety Invariants + +- Invariant that must remain true. +- Files or directories that must not be touched. 
+- Destructive actions that require explicit approval. + +## 10. Test Plan + +Commands: + +```bash +# fill in repo-specific verification +``` + +Manual checks: + +- Check 1. + +## 11. Acceptance Criteria + +- [ ] Behavior matches Desired Behavior. +- [ ] Tests pass. +- [ ] Docs or operating instructions updated if needed. +- [ ] No unrelated changes. +- [ ] Completion report includes verification output. + +## 12. Rollback Plan + +Describe how to revert or disable the change safely. + +## 13. Open Questions + +- [ ] Question that must be answered before approval. + +## 14. Completion Report + +To be filled by the executor/verifier: + +- Files changed: +- Commands run: +- Verification result: +- Residual risk: +- Spec evidence candidates: diff --git a/.agents/workflows/author-spec.md b/.agents/workflows/author-spec.md new file mode 100644 index 0000000..ca1815d --- /dev/null +++ b/.agents/workflows/author-spec.md @@ -0,0 +1,12 @@ +# Author Spec + +Use this workflow to turn owner intent into an executable `SPEC.md`. + +1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present, and + relevant docs. +2. Inspect the codebase before proposing implementation details. +3. Use `.agents/specs/SPEC.template.md`. +4. Fill `Current System Facts` with cited facts only. +5. Keep implementation steps concrete enough for another agent to execute. +6. Mark unresolved decisions as open questions instead of guessing. +7. Stop at the spec unless the owner explicitly approves implementation. diff --git a/.agents/workflows/execute-spec.md b/.agents/workflows/execute-spec.md new file mode 100644 index 0000000..27ca609 --- /dev/null +++ b/.agents/workflows/execute-spec.md @@ -0,0 +1,13 @@ +# Execute Spec + +Use this workflow only after a `SPEC.md` is approved. + +1. Re-read the approved spec and repo instructions. +2. Confirm the worktree or branch state. +3. Implement only the approved scope. +4. Preserve unrelated user changes. +5. 
Update directly coupled tests/docs only when required by the spec. +6. Run the spec acceptance commands. +7. Prepare the completion report requested by the spec. + +If new facts materially change scope, stop and request a spec update. diff --git a/.agents/workflows/orient.md b/.agents/workflows/orient.md new file mode 100644 index 0000000..c0a50f9 --- /dev/null +++ b/.agents/workflows/orient.md @@ -0,0 +1,11 @@ +# Orient + +Use this workflow before planning or editing. + +1. Read `AGENTS.md`. +2. Read `WORKFLOW.md`, `CLAUDE.md`, `STATUS.md`, and linked docs when present. +3. Run `git status --short --branch`. +4. Identify current branch, local changes, likely files, and verification gate. +5. Report verified facts, risks, and open questions. + +Use the `repo-orientation` skill. diff --git a/.agents/workflows/release-pr.md b/.agents/workflows/release-pr.md new file mode 100644 index 0000000..762e4aa --- /dev/null +++ b/.agents/workflows/release-pr.md @@ -0,0 +1,13 @@ +# Release PR + +Use this workflow after implementation and verification. + +1. Confirm branch and worktree state. +2. Review `git diff` and `git status --short`. +3. Stage explicit files by path. +4. Commit with the repo's convention. +5. Prepare a PR body with summary, verification output, risk, and links. +6. Check CI only after local verification. +7. Clean worktrees and stale branches after merge according to repo rules. + +Use the `release-pr` skill. diff --git a/.agents/workflows/review-diff.md b/.agents/workflows/review-diff.md new file mode 100644 index 0000000..6c99472 --- /dev/null +++ b/.agents/workflows/review-diff.md @@ -0,0 +1,11 @@ +# Review Diff + +Use this workflow to review a local diff or PR. + +1. Read repo instructions and the relevant spec. +2. Inspect the diff before summarizing. +3. Prioritize bugs, regressions, missing tests, and broken repo contracts. +4. Lead with findings ordered by severity. +5. If no findings exist, say so and list residual risk. 
+ +Use the `code-review` skill. diff --git a/.agents/workflows/review-spec.md b/.agents/workflows/review-spec.md new file mode 100644 index 0000000..3a937a9 --- /dev/null +++ b/.agents/workflows/review-spec.md @@ -0,0 +1,17 @@ +# Review Spec + +Use this workflow before implementation. + +Check the target `SPEC.md` for: + +- Ambiguous goals or missing non-goals. +- Uncited system facts. +- Hidden architecture decisions. +- Missing safety invariants. +- Missing or non-runnable acceptance commands. +- Missing rollback plan. +- Open questions that block execution. +- Scope that conflicts with `AGENTS.md` or project docs. + +Return findings first, ordered by severity. If the spec is executable, say so +clearly and list any residual risk. diff --git a/.agents/workflows/spec-evidence.md b/.agents/workflows/spec-evidence.md new file mode 100644 index 0000000..65b2e07 --- /dev/null +++ b/.agents/workflows/spec-evidence.md @@ -0,0 +1,12 @@ +# Spec Evidence + +Use this workflow after substantial work or incident resolution. + +1. Review the completion report and verification output. +2. Identify durable lessons that are useful to future agents. +3. Exclude anything already covered by repo docs. +4. Record claim, scope, evidence, confidence, conflicts, and suggested spec or + delivery-authority route. +5. Do not write trusted shared memory directly. + +Use the `spec-evidence-governance` skill. diff --git a/.agents/workflows/symphony-dispatch-check.md b/.agents/workflows/symphony-dispatch-check.md new file mode 100644 index 0000000..1928e80 --- /dev/null +++ b/.agents/workflows/symphony-dispatch-check.md @@ -0,0 +1,12 @@ +# Symphony Dispatch Check + +Use this workflow before enabling or auditing autonomous dispatch. + +1. Confirm `WORKFLOW.md` exists in the runner cwd. +2. Validate tracker, workspace, hooks, agent, and Codex config sections. +3. Confirm Linear project slug and active/terminal states. +4. Confirm workspace isolation and concurrency limits. +5. 
Confirm completion reports include verification evidence and residual risk. +6. Do not dispatch if the target repo or acceptance gate is unclear. + +Use the `symphony-dispatch` skill. diff --git a/.agents/workflows/verify-spec.md b/.agents/workflows/verify-spec.md new file mode 100644 index 0000000..f464489 --- /dev/null +++ b/.agents/workflows/verify-spec.md @@ -0,0 +1,12 @@ +# Verify Spec + +Use this workflow after implementation. + +1. Inspect the diff against the approved spec. +2. Confirm no unrelated files changed. +3. Run all acceptance commands listed in the spec. +4. Run the repo's normal verification gate if different. +5. Check docs and instructions for consistency. +6. Report exact command results, residual risk, and Mimir memory candidates. + +Do not say complete unless verification ran in the current session. diff --git a/.claude/commands/author-spec.md b/.claude/commands/author-spec.md new file mode 100644 index 0000000..ca1815d --- /dev/null +++ b/.claude/commands/author-spec.md @@ -0,0 +1,12 @@ +# Author Spec + +Use this workflow to turn owner intent into an executable `SPEC.md`. + +1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present, and + relevant docs. +2. Inspect the codebase before proposing implementation details. +3. Use `.agents/specs/SPEC.template.md`. +4. Fill `Current System Facts` with cited facts only. +5. Keep implementation steps concrete enough for another agent to execute. +6. Mark unresolved decisions as open questions instead of guessing. +7. Stop at the spec unless the owner explicitly approves implementation. diff --git a/.claude/commands/execute-spec.md b/.claude/commands/execute-spec.md new file mode 100644 index 0000000..27ca609 --- /dev/null +++ b/.claude/commands/execute-spec.md @@ -0,0 +1,13 @@ +# Execute Spec + +Use this workflow only after a `SPEC.md` is approved. + +1. Re-read the approved spec and repo instructions. +2. Confirm the worktree or branch state. +3. Implement only the approved scope. +4. 
Preserve unrelated user changes. +5. Update directly coupled tests/docs only when required by the spec. +6. Run the spec acceptance commands. +7. Prepare the completion report requested by the spec. + +If new facts materially change scope, stop and request a spec update. diff --git a/.claude/commands/orient.md b/.claude/commands/orient.md new file mode 100644 index 0000000..c0a50f9 --- /dev/null +++ b/.claude/commands/orient.md @@ -0,0 +1,11 @@ +# Orient + +Use this workflow before planning or editing. + +1. Read `AGENTS.md`. +2. Read `WORKFLOW.md`, `CLAUDE.md`, `STATUS.md`, and linked docs when present. +3. Run `git status --short --branch`. +4. Identify current branch, local changes, likely files, and verification gate. +5. Report verified facts, risks, and open questions. + +Use the `repo-orientation` skill. diff --git a/.claude/commands/release-pr.md b/.claude/commands/release-pr.md new file mode 100644 index 0000000..762e4aa --- /dev/null +++ b/.claude/commands/release-pr.md @@ -0,0 +1,13 @@ +# Release PR + +Use this workflow after implementation and verification. + +1. Confirm branch and worktree state. +2. Review `git diff` and `git status --short`. +3. Stage explicit files by path. +4. Commit with the repo's convention. +5. Prepare a PR body with summary, verification output, risk, and links. +6. Check CI only after local verification. +7. Clean worktrees and stale branches after merge according to repo rules. + +Use the `release-pr` skill. diff --git a/.claude/commands/review-diff.md b/.claude/commands/review-diff.md new file mode 100644 index 0000000..6c99472 --- /dev/null +++ b/.claude/commands/review-diff.md @@ -0,0 +1,11 @@ +# Review Diff + +Use this workflow to review a local diff or PR. + +1. Read repo instructions and the relevant spec. +2. Inspect the diff before summarizing. +3. Prioritize bugs, regressions, missing tests, and broken repo contracts. +4. Lead with findings ordered by severity. +5. 
If no findings exist, say so and list residual risk. + +Use the `code-review` skill. diff --git a/.claude/commands/review-spec.md b/.claude/commands/review-spec.md new file mode 100644 index 0000000..3a937a9 --- /dev/null +++ b/.claude/commands/review-spec.md @@ -0,0 +1,17 @@ +# Review Spec + +Use this workflow before implementation. + +Check the target `SPEC.md` for: + +- Ambiguous goals or missing non-goals. +- Uncited system facts. +- Hidden architecture decisions. +- Missing safety invariants. +- Missing or non-runnable acceptance commands. +- Missing rollback plan. +- Open questions that block execution. +- Scope that conflicts with `AGENTS.md` or project docs. + +Return findings first, ordered by severity. If the spec is executable, say so +clearly and list any residual risk. diff --git a/.claude/commands/spec-evidence.md b/.claude/commands/spec-evidence.md new file mode 100644 index 0000000..65b2e07 --- /dev/null +++ b/.claude/commands/spec-evidence.md @@ -0,0 +1,12 @@ +# Spec Evidence + +Use this workflow after substantial work or incident resolution. + +1. Review the completion report and verification output. +2. Identify durable lessons that are useful to future agents. +3. Exclude anything already covered by repo docs. +4. Record claim, scope, evidence, confidence, conflicts, and suggested spec or + delivery-authority route. +5. Do not write trusted shared memory directly. + +Use the `spec-evidence-governance` skill. diff --git a/.claude/commands/symphony-dispatch-check.md b/.claude/commands/symphony-dispatch-check.md new file mode 100644 index 0000000..1928e80 --- /dev/null +++ b/.claude/commands/symphony-dispatch-check.md @@ -0,0 +1,12 @@ +# Symphony Dispatch Check + +Use this workflow before enabling or auditing autonomous dispatch. + +1. Confirm `WORKFLOW.md` exists in the runner cwd. +2. Validate tracker, workspace, hooks, agent, and Codex config sections. +3. Confirm Linear project slug and active/terminal states. +4. 
Confirm workspace isolation and concurrency limits. +5. Confirm completion reports include verification evidence and residual risk. +6. Do not dispatch if the target repo or acceptance gate is unclear. + +Use the `symphony-dispatch` skill. diff --git a/.claude/commands/verify-spec.md b/.claude/commands/verify-spec.md new file mode 100644 index 0000000..f464489 --- /dev/null +++ b/.claude/commands/verify-spec.md @@ -0,0 +1,12 @@ +# Verify Spec + +Use this workflow after implementation. + +1. Inspect the diff against the approved spec. +2. Confirm no unrelated files changed. +3. Run all acceptance commands listed in the spec. +4. Run the repo's normal verification gate if different. +5. Check docs and instructions for consistency. +6. Report exact command results, residual risk, and Mimir memory candidates. + +Do not say complete unless verification ran in the current session. diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 0000000..8af7fbe --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,3 @@ +{ + "autoMemoryEnabled": false +} diff --git a/.claude/skills/code-review/SKILL.md b/.claude/skills/code-review/SKILL.md new file mode 100644 index 0000000..bcc22d0 --- /dev/null +++ b/.claude/skills/code-review/SKILL.md @@ -0,0 +1,36 @@ +--- +name: code-review +description: Use for reviewing local diffs or PRs. Prioritizes bugs, regressions, missing tests, unsafe assumptions, and broken repo contracts over summaries. +--- + +# Code Review + +Use this when asked to review. + +## Review Focus + +- Correctness bugs. +- Behavioral regressions. +- Missing or weak tests. +- Security, privacy, or secret-handling risks. +- Broken architecture boundaries. +- Drift from `AGENTS.md`, approved specs, or public docs. +- Verification gaps. + +## Output + +Findings first, ordered by severity. 
Each finding should include: + +- file and line reference when available +- the concrete risk +- why the current change causes it +- a practical fix direction + +Then include open questions and a brief summary only after findings. + +## Hard Rules + +- If there are no findings, say that clearly and list residual risk. +- Do not lead with praise or broad summaries. +- Do not request stylistic churn unless it affects correctness, + maintainability, or repo contracts. diff --git a/.claude/skills/implementation-execution/SKILL.md b/.claude/skills/implementation-execution/SKILL.md new file mode 100644 index 0000000..3b4cbf4 --- /dev/null +++ b/.claude/skills/implementation-execution/SKILL.md @@ -0,0 +1,34 @@ +--- +name: implementation-execution +description: Use when implementing an approved BES SPEC.md. Keeps edits scoped, preserves user work, updates directly coupled tests/docs, and stops when new facts change scope. +--- + +# Implementation Execution + +Use only after a spec is approved by the owner or controlling workflow. + +## Steps + +1. Re-read the approved `SPEC.md`. +2. Re-read the repo `AGENTS.md` and relevant docs. +3. Confirm branch/worktree state with `git status --short --branch`. +4. Edit only files named by the spec or directly required by the change. +5. Add or update tests before or with production changes when behavior changes. +6. Keep unrelated refactors out of scope. +7. Run the spec acceptance commands. +8. Prepare the completion report requested by the spec. + +## Stop Conditions + +- New facts materially change scope. +- Required files contain unrelated local changes that make safe editing + ambiguous. +- Verification requires unavailable secrets or infrastructure. +- The spec's acceptance criteria are not testable. + +## Hard Rules + +- Preserve unrelated user changes. +- Do not silently expand scope. +- Do not bypass hooks, CI, or verification gates. +- Do not claim completion without fresh verification evidence. 
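
The implementation-execution rule above about editing only files named by the spec can be checked mechanically. The following is a minimal sketch, assuming the changed paths arrive one per line on stdin (for example from `git diff --name-only`). `out_of_scope` is a hypothetical helper name, not an existing BES tool.

```bash
# Hypothetical helper (not part of the repo): print every changed path
# (read from stdin, one per line) that is not in the allow-list passed
# as arguments.
out_of_scope() {
  local path allowed found
  while IFS= read -r path; do
    # Skip blank lines.
    [ -n "$path" ] || continue
    found=0
    for allowed in "$@"; do
      if [ "$path" = "$allowed" ]; then
        found=1
        break
      fi
    done
    # Report only paths that matched nothing in the allow-list.
    [ "$found" -eq 1 ] || printf '%s\n' "$path"
  done
}
```

A possible use is `git diff --name-only | out_of_scope README.md STATUS.md`, where empty output means no tracked change fell outside the listed files. Files that were already dirty before the task started would need to be added to the allow-list or snapshotted beforehand, since this sketch cannot tell old changes from new ones.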
diff --git a/.claude/skills/release-pr/SKILL.md b/.claude/skills/release-pr/SKILL.md new file mode 100644 index 0000000..c508d26 --- /dev/null +++ b/.claude/skills/release-pr/SKILL.md @@ -0,0 +1,27 @@ +--- +name: release-pr +description: Use when preparing commits, PRs, release handoff, or merge cleanup. Enforces explicit staging, conventional commits, PR evidence, and worktree hygiene. +--- + +# Release And PR + +Use when moving finished work toward review or merge. + +## Steps + +1. Confirm branch and tracking state. +2. Review `git status --short` and `git diff`. +3. Stage explicit files by path. +4. Use the repo's commit convention. +5. Write a PR body with summary, verification output, risk, and links. +6. Confirm CI/check status only after local verification is complete. +7. After merge, clean worktrees and stale local branches according to repo + instructions. + +## Hard Rules + +- No `git add .` unless explicitly approved for the batch. +- No AI attribution in commits, PRs, docs, or generated output. +- No force-push, branch deletion, hook bypass, or merge without approval when + the repo requires it. +- Do not burn CI minutes as a substitute for local verification. diff --git a/.claude/skills/repo-orientation/SKILL.md b/.claude/skills/repo-orientation/SKILL.md new file mode 100644 index 0000000..9b6ef2a --- /dev/null +++ b/.claude/skills/repo-orientation/SKILL.md @@ -0,0 +1,36 @@ +--- +name: repo-orientation +description: Use at the start of work in any BES repo to build a current, cited map of instructions, repo state, verification gates, active plans, and likely risk before editing. +--- + +# Repo Orientation + +Use this before planning or editing. + +## Steps + +1. Read the nearest `AGENTS.md`. +2. If present, read `CLAUDE.md`, `WORKFLOW.md`, `STATUS.md`, + `.agents/DOCUMENTATION_GUIDE.md`, and the docs linked by `AGENTS.md`. +3. Check git state with `git status --short --branch`. +4. 
Identify the active branch, tracking branch, untracked files, and unrelated + local changes. +5. Identify the repo's verification gate and any hook setup requirements. +6. Locate the task's likely files with `rg` and `rg --files`. +7. Report only verified facts. Cite files or command output. + +## Output + +- Target repo and branch. +- Source-of-truth docs read. +- Relevant files or directories. +- Verification commands. +- Documentation placement constraints for this task. +- Local changes that must be preserved. +- Open questions before implementation. + +## Hard Rules + +- Do not edit during orientation. +- Do not rely on memory when repo docs can answer the question. +- If instructions conflict, stop and report the conflict. diff --git a/.claude/skills/spec-driven-development/SKILL.md b/.claude/skills/spec-driven-development/SKILL.md new file mode 100644 index 0000000..57af240 --- /dev/null +++ b/.claude/skills/spec-driven-development/SKILL.md @@ -0,0 +1,45 @@ +--- +name: spec-driven-development +description: "Use when planning, reviewing, implementing, or verifying non-trivial work in BES repos. Enforces the BES spec-first operating model: author an executable SPEC.md, review it, implement only approved scope, verify with concrete commands, and route durable lessons into spec evidence." +--- + +# Spec-Driven Development + +Use this skill for non-trivial work in BES repos. + +## Workflow + +1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present, + `.agents/DOCUMENTATION_GUIDE.md` if present, and the relevant project docs. +2. Create or update a task spec from `.agents/specs/SPEC.template.md`. +3. Verify the spec has goals, non-goals, current facts with citations, + desired behavior, safety invariants, acceptance commands, rollback, and open + questions. +4. Do not implement until the spec is approved by the owner or controlling + workflow. +5. Execute only the approved spec. Stop if new facts materially change scope. +6. 
Run acceptance commands and the repo's normal verification gate. +7. Report files changed, commands run, verification output, residual risk, and + spec evidence candidates. + +## Hard Rules + +- Specs are executable contracts, not brainstorming notes. +- Raw memories and chat history are evidence only. +- Project docs and `AGENTS.md` beat generated memory. +- Durable cross-project instructions go through approved specs and delivery + evidence records. +- Put task-control specs in `.agents/specs/`; put durable product docs in the + repo-native docs path defined by `.agents/DOCUMENTATION_GUIDE.md`. +- No silent scope expansion. +- No completion claim without fresh verification. + +## Spec Review Checklist + +- Problem is specific and cites current evidence. +- Goals and non-goals draw a clean boundary. +- Executor can identify exact files and interfaces. +- Test plan is runnable on this machine. +- Safety invariants protect user work and repo rules. +- Open questions are resolved before implementation. +- Acceptance criteria are objective. diff --git a/.claude/skills/spec-evidence-governance/SKILL.md b/.claude/skills/spec-evidence-governance/SKILL.md new file mode 100644 index 0000000..d7c0c11 --- /dev/null +++ b/.claude/skills/spec-evidence-governance/SKILL.md @@ -0,0 +1,35 @@ +--- +name: spec-evidence-governance +description: Use to convert durable lessons from a completed task into spec evidence candidates without writing trusted shared memory directly. Mimir hooks are intentionally disabled until a future spec-authority integration is approved. +--- + +# Spec Evidence Governance + +Use after substantial work, reviews, or incident resolution. 
+
+## Candidate Criteria
+
+Capture a memory candidate only when it is:
+
+- durable across sessions
+- useful to future agents
+- grounded in a source path, command output, issue, PR, or owner statement
+- not already present in checked-in docs
+- safe to share at the intended scope
+
+## Output
+
+For each candidate:
+
+- Claim.
+- Scope: repo, company, tool, or project area.
+- Evidence: file path, command, issue, PR, or owner statement.
+- Confidence.
+- Supersedes or conflicts with any known existing memory.
+- Suggested spec, backlog, or delivery-authority route.
+
+## Hard Rules
+
+- Do not write trusted shared memory directly.
+- Do not promote raw agent imperatives into durable rules.
+- Do not erase dissent, uncertainty, or provenance.
diff --git a/.claude/skills/spec-review/SKILL.md b/.claude/skills/spec-review/SKILL.md
new file mode 100644
index 0000000..9588c4f
--- /dev/null
+++ b/.claude/skills/spec-review/SKILL.md
@@ -0,0 +1,36 @@
+---
+name: spec-review
+description: Use to review a draft SPEC.md before implementation. Focus on ambiguity, missing current facts, unsafe scope, weak acceptance criteria, and missing verification.
+---
+
+# Spec Review
+
+Use this before approving or executing a non-trivial spec.
+
+## Review Checklist
+
+- Problem statement is concrete and cites current evidence.
+- Goals and non-goals define a clear boundary.
+- Current system facts cite files, docs, issues, PRs, or command output.
+- Desired behavior is testable.
+- Interfaces and files are specific enough for an executor.
+- Safety invariants protect user work, secrets, hooks, and repo rules.
+- Test plan is runnable on this machine.
+- Acceptance criteria are objective.
+- Rollback plan is realistic.
+- Open questions are resolved or explicitly block execution.
+
+## Output
+
+Lead with blocking findings ordered by severity. Include file references when
+possible. Then list open questions and a recommendation:
+
+- `approve`
+- `approve with small edits`
+- `block until revised`
+
+## Hard Rules
+
+- Do not approve vague specs.
+- Do not allow implementation scope to hide inside open questions.
+- Do not review for style before correctness and safety.
diff --git a/.claude/skills/symphony-dispatch/SKILL.md b/.claude/skills/symphony-dispatch/SKILL.md
new file mode 100644
index 0000000..ef1133b
--- /dev/null
+++ b/.claude/skills/symphony-dispatch/SKILL.md
@@ -0,0 +1,33 @@
+---
+name: symphony-dispatch
+description: Use when preparing or auditing Symphony-compatible issue dispatch. Checks WORKFLOW.md, issue eligibility, workspace isolation, Codex runner settings, and observability expectations.
+---
+
+# Symphony Dispatch
+
+Use when running or preparing autonomous worker dispatch.
+
+## Checklist
+
+- `WORKFLOW.md` exists in the runner cwd or an explicit workflow path is set.
+- YAML front matter has `tracker`, `polling`, `workspace`, `hooks`, `agent`,
+  and `codex` sections.
+- `tracker.kind` is `linear` and `tracker.api_key` resolves from
+  `LINEAR_API_KEY`.
+- `tracker.project_slug`, active states, and terminal states match the board.
+- `workspace.root` is absolute and outside product repo working trees.
+- Hooks are documented; failures have the right abort/ignore behavior.
+- `codex.command` is `codex app-server` unless a tested wrapper is in use.
+- Concurrency is bounded for the machine and CI budget.
+- Running workers use isolated branches or worktrees.
+- Logs and completion reports include issue identifier, session, commands, and
+  verification evidence.
+
+## Hard Rules
+
+- Do not dispatch when the target repo is unclear.
+- Do not dispatch multiple write-capable workers into the same worktree.
+- Do not allow unsupported tool calls or user-input-required turns to stall
+  indefinitely.
+- Treat Symphony as a trusted-environment runner unless stronger sandboxing is
+  explicitly configured.
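Reviewer sketch (not part of the patch): the mechanical items in the Symphony Dispatch checklist above could be checked with a small helper like the one below. It is a hypothetical illustration in Python, assuming the `WORKFLOW.md` front matter has already been parsed into a dictionary and that `workspace.root` is a POSIX path. The function name `check_front_matter` and the problem messages are invented here, and this is not Symphony's own validation logic.

```python
from pathlib import PurePosixPath

# Section names come from the checklist; everything else in this file is an
# illustrative assumption, not Symphony's real validation code.
REQUIRED_SECTIONS = ("tracker", "polling", "workspace", "hooks", "agent", "codex")


def check_front_matter(config: dict) -> list[str]:
    """Return human-readable problems found in already-parsed front matter."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if section not in config:
            problems.append(f"missing section: {section}")

    tracker = config.get("tracker") or {}
    if tracker.get("kind") != "linear":
        problems.append("tracker.kind is not 'linear'")
    if tracker.get("api_key") != "$LINEAR_API_KEY":
        problems.append("tracker.api_key does not reference $LINEAR_API_KEY")

    # Assumes POSIX-style paths, as in the workspace root used by this repo.
    root = (config.get("workspace") or {}).get("root") or ""
    if not PurePosixPath(root).is_absolute():
        problems.append("workspace.root is not an absolute path")

    command = (config.get("codex") or {}).get("command")
    if command != "codex app-server":
        problems.append("codex.command is not 'codex app-server'")

    return problems
```

An empty list means only that these mechanical items pass. Judgment items such as hook failure behavior, concurrency budget, and worktree isolation still need a human or agent review, and a non-default `codex.command` is acceptable when a tested wrapper is in use.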
diff --git a/.claude/skills/verification/SKILL.md b/.claude/skills/verification/SKILL.md
new file mode 100644
index 0000000..156c605
--- /dev/null
+++ b/.claude/skills/verification/SKILL.md
@@ -0,0 +1,33 @@
+---
+name: verification
+description: Use before reporting done. Runs the narrowest relevant checks first, then the repo gate when warranted, and records fresh evidence plus residual risk.
+---
+
+# Verification
+
+Use before claiming work is complete.
+
+## Steps
+
+1. Read the spec acceptance commands and repo `AGENTS.md` verification section.
+2. Run the narrowest relevant test or lint first.
+3. Run the broader repo gate when the change touches shared behavior,
+   interfaces, CI, docs contracts, or release surfaces.
+4. Capture command, result, and important output.
+5. If a command fails, diagnose whether the failure is caused by the change,
+   existing repo state, missing dependency, sandbox/network limits, or secrets.
+6. Re-run only after a meaningful fix or environment change.
+
+## Output
+
+- Commands run.
+- Pass/fail result.
+- Key output lines or summarized failures.
+- Residual risk.
+- Checks not run and why.
+
+## Hard Rules
+
+- Do not say "should pass" as verification.
+- Do not hide failing checks.
+- Do not spend CI minutes when local gates are required first.
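Reviewer sketch (not part of the patch): step 4 of the verification skill above ("Capture command, result, and important output") could be recorded roughly as follows. This is a minimal Python illustration under stated assumptions: `CheckResult` and `run_check` are hypothetical names, and keeping the last few output lines as "key output" is a simplification chosen here, not something the skill prescribes.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    command: list[str]      # exactly what was run
    passed: bool            # True when the exit code was 0
    key_output: list[str]   # last few lines of combined stdout and stderr


def run_check(command: list[str], tail: int = 5) -> CheckResult:
    """Run one check and keep the evidence the Output section asks for."""
    proc = subprocess.run(command, capture_output=True, text=True)
    lines = (proc.stdout + proc.stderr).splitlines()
    return CheckResult(command=command, passed=proc.returncode == 0,
                       key_output=lines[-tail:])


if __name__ == "__main__":
    # A harmless stand-in for "the narrowest relevant test or lint".
    print(run_check([sys.executable, "-c", "print('ok')"]))
```

A failing result is recorded the same way as a passing one, which keeps the "do not hide failing checks" rule easy to follow. Residual risk and checks not run still have to be written by whoever reports the work.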
diff --git a/.gitignore b/.gitignore
index 0bc78fe..7548870 100644
--- a/.gitignore
+++ b/.gitignore
@@ -31,10 +31,11 @@ fuzz/coverage/
 *.profraw
 *.profdata
 
-# ----- Machine-local Claude Code state (user-specific, never committed) -----
+# ----- Local user-specific agent state (not committed) -----
 .claude/settings.local.json
-.claude/settings.json
-.claude/skills/
+.codex
+.mcp.json
+.mcp.local.json
 
 # ----- Mimir operator-local state (per-operator workspace data, never committed) -----
 .mimir/
diff --git a/AGENTS.md b/AGENTS.md
index 7bf1996..41d35da 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -2,6 +2,35 @@
 
 > Cross-framework operating manual for Mimir following the [AGENTS.md](https://agents.md/) standard. Read automatically by Claude Code, Codex, Cursor, Copilot, Gemini CLI, and any agent framework supporting the standard.
 
+## BES Fleet Operating Model
+
+This repo participates in the BES spec-first agent fleet. The machine-level
+contract is `/var/home/hasnobeef/buildepicshit/.agents/OPERATING_MODEL.md`;
+repo-local copies of the shared spec template, workflows, and skills live under
+`.agents/`.
+
+Documentation placement rules live in `.agents/DOCUMENTATION_GUIDE.md`. Read it
+before creating, moving, archiving, or publishing docs/specs. In short:
+`.agents/specs/` is for agent/Symphony task control; durable product docs live
+in this repo's native docs path.
+
+Non-trivial work requires an approved executable `SPEC.md` before
+implementation. Use `.agents/specs/SPEC.template.md`, then run the lifecycle:
+orient, author spec, review spec, approve, execute, verify, report, and route
+durable lessons to Mimir as governed evidence drafts. Raw Claude/Codex memories
+are supporting evidence only; checked-in docs and this file remain authoritative.
+
+Claude must enter through `CLAUDE.md`, which imports this file. Codex and other
+AGENTS-aware tools read this file directly. Keep both surfaces aligned.
+
+Shared task skills live in `.agents/skills/`; Claude-native copies live in
+`.claude/skills/`. Use `.agents/skills/repo-orientation` at task start,
+`.agents/skills/spec-driven-development` for non-trivial work,
+`.agents/skills/verification` before completion, and
+`.agents/skills/spec-evidence-governance` only to propose evidence candidates.
+Do not build from raw memory. Build from approved specs, repo docs, and fresh
+verification evidence.
+
 > **CI quota constraint (HARD RULE — read before pushing).** This org has a **finite monthly GitHub Actions budget**. Extra Actions usage was added on 2026-04-27 and the owner approved re-enabling Actions for this repo, but every push to a tracked branch / PR still triggers a full matrix run (~30 runner-minutes for 8 jobs across 3 OSes). The 2026-04-20 session burned through the entire monthly quota in heavy iteration — do not repeat that pattern.
 >
 > **Verify locally before pushing.** Always run the full local gate before `git push`:
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..112857e
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,25 @@
+# CLAUDE.md — Mimir workspace guide
+
+@AGENTS.md
+
+## Claude Entry Protocol
+
+Use the BES spec-first model for non-trivial work. Start by reading
+`AGENTS.md`, `STATUS.md`, and the relevant concept docs.
+
+Project commands are available in `.claude/commands/`:
+
+- `/orient`
+- `/author-spec`
+- `/review-spec`
+- `/execute-spec`
+- `/verify-spec`
+- `/review-diff`
+- `/release-pr`
+- `/spec-evidence`
+- `/symphony-dispatch-check`
+
+Use the repo-local skills in `.claude/skills/` for orientation, spec work,
+implementation, verification, review, PR handoff, Symphony dispatch readiness,
+and spec evidence governance. Treat Claude memory as evidence only; Mimir's
+librarian path remains the product boundary for durable shared memory.
diff --git a/WORKFLOW.md b/WORKFLOW.md
new file mode 100644
index 0000000..ad32868
--- /dev/null
+++ b/WORKFLOW.md
@@ -0,0 +1,74 @@
+---
+tracker:
+  kind: linear
+  endpoint: https://api.linear.app/graphql
+  api_key: $LINEAR_API_KEY
+  project_slug: mimir
+  active_states:
+    - Todo
+    - In Progress
+    - In Review
+  terminal_states:
+    - Done
+    - Canceled
+    - Duplicate
+polling:
+  interval_ms: 30000
+workspace:
+  root: /var/home/hasnobeef/buildepicshit/.symphony/workspaces/Mimir
+hooks:
+  after_create: |
+    git clone git@github.com:buildepicshit/Mimir.git .
+  before_run: null
+  after_run: null
+  before_remove: null
+  timeout_ms: 60000
+agent:
+  max_concurrent_agents: 1
+  max_turns: 20
+  max_retry_backoff_ms: 300000
+codex:
+  command: codex app-server
+  approval_policy: on-request
+  thread_sandbox: workspace-write
+  turn_timeout_ms: 3600000
+  read_timeout_ms: 5000
+  stall_timeout_ms: 300000
+bes:
+  repo: Mimir
+  default_branch: main
+  canonical_verify: cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings
+---
+
+# Mimir Workflow
+
+You are working on Mimir under the BES spec-first model.
+
+## Issue
+
+- Identifier: `{{ issue.identifier }}`
+- Title: `{{ issue.title }}`
+- State: `{{ issue.state }}`
+- Priority: `{{ issue.priority }}`
+- URL: `{{ issue.url }}`
+- Attempt: `{{ attempt }}`
+
+## Required Procedure
+
+1. Read `AGENTS.md`, `WORKFLOW.md`, `.agents/DOCUMENTATION_GUIDE.md`,
+   `STATUS.md`, and the relevant concept docs.
+2. For non-trivial work, create or update an executable `SPEC.md` from
+   `.agents/specs/SPEC.template.md`.
+3. Preserve librarian-mediated writes, append-only canonical storage, and
+   provenance-preserving memory governance.
+4. Verify locally before pushing to protect CI quota.
+5. Report files changed, commands run, verification result, residual risk, and
+   spec evidence candidates.
+
+## Safety
+
+- Do not write trusted shared memory directly.
+- Do not bypass the librarian boundary.
+- Do not push before running the local gate unless the owner explicitly says
+  the CI budget tradeoff is acceptable.
+- Do not treat quorum majority as truth.
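Reviewer sketch (not part of the patch): the `{{ issue.identifier }}` style placeholders in the Issue section of `WORKFLOW.md` imply that the prompt body is rendered per issue before a worker sees it. The Python below is only a rough illustration of that substitution. The `render` function, the regular expression, and the sample identifier `MIM-1` are hypothetical, and the template engine the runner actually uses is not shown in this diff and may behave differently.

```python
import re

# Matches placeholders such as {{ issue.identifier }} or {{ attempt }}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace each placeholder with the value found by walking dotted keys."""

    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            # Unknown names raise KeyError instead of rendering as empty text.
            value = value[key]
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


if __name__ == "__main__":
    # "MIM-1" is a made-up identifier used only for this example.
    context = {"issue": {"identifier": "MIM-1"}, "attempt": 1}
    print(render("- Identifier: `{{ issue.identifier }}`", context))
```

Failing on an unknown placeholder is a deliberate choice in this sketch: a prompt with a silently blank issue identifier would make the completion report harder to trace back to its issue.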