From 5a6d1deaff8f71d837c585755bf6eb36401b693e Mon Sep 17 00:00:00 2001 From: HasNoBeef Date: Thu, 30 Apr 2026 08:27:12 -0700 Subject: [PATCH 1/2] chore: keep local agent control files out of git --- .gitignore | 13 ++++-- AGENTS.md | 133 ----------------------------------------------------- 2 files changed, 10 insertions(+), 136 deletions(-) delete mode 100644 AGENTS.md diff --git a/.gitignore b/.gitignore index 0bc78fe..499b223 100644 --- a/.gitignore +++ b/.gitignore @@ -31,10 +31,17 @@ fuzz/coverage/ *.profraw *.profdata -# ----- Machine-local Claude Code state (user-specific, never committed) ----- +# ----- Local agent/control-plane state (not committed) ----- +.agents/ +.claude/ +AGENTS.md +CLAUDE.md +WORKFLOW.md .claude/settings.local.json -.claude/settings.json -.claude/skills/ +.codex +.mcp.json +.mcp.example.json +.mcp.local.json # ----- Mimir operator-local state (per-operator workspace data, never committed) ----- .mimir/ diff --git a/AGENTS.md b/AGENTS.md deleted file mode 100644 index 7bf1996..0000000 --- a/AGENTS.md +++ /dev/null @@ -1,133 +0,0 @@ -# AGENTS.md — Mimir Operating Manual - -> Cross-framework operating manual for Mimir following the [AGENTS.md](https://agents.md/) standard. Read automatically by Claude Code, Codex, Cursor, Copilot, Gemini CLI, and any agent framework supporting the standard. - -> **CI quota constraint (HARD RULE — read before pushing).** This org has a **finite monthly GitHub Actions budget**. Extra Actions usage was added on 2026-04-27 and the owner approved re-enabling Actions for this repo, but every push to a tracked branch / PR still triggers a full matrix run (~30 runner-minutes for 8 jobs across 3 OSes). The 2026-04-20 session burned through the entire monthly quota in heavy iteration — do not repeat that pattern. 
-> -> **Verify locally before pushing.** Always run the full local gate before `git push`: -> -> ```bash -> cargo build --workspace -> cargo test --workspace -> cargo fmt --all -- --check -> cargo clippy --all-targets --all-features -- -D warnings -> ``` -> -> Other quota-conserving rules: -> - **Batch related changes into one push** instead of 4-5 small "fix the previous fix" commits. -> - **Do not push `--allow-empty` retry commits** — they consume the same minutes as a real run. -> - **Do not retry on transient infra hiccups** without first asking the owner. -> - If CI is currently disabled (`gh api /repos/buildepicshit/Mimir/actions/permissions` returns `enabled: false`), do not re-enable without asking. The owner-approved exception was 2026-04-27 after adding more Actions usage. -> - Dependabot is set to **monthly** cadence (not weekly) and groups all non-major updates into one PR per ecosystem per cycle, for the same quota reason. - -> **Naming history.** Public name `Mimir` (Norse: Mímir, the wise being whose preserved head Odin consulted for counsel). Pre-cutover codename was `engram`; the Mimir cutover happened 2026-04-20 (see [`.planning/planning/2026-04-19-roadmap-to-prime-time.md` § Naming + cutover history](.planning/planning/2026-04-19-roadmap-to-prime-time.md#naming--cutover-history)). Pre-cutover history lives in the archived `buildepicshit/Engram` repo; this repo's git history starts at the cutover commit. Use `mimir-*` everywhere in new code, prose, env vars, and tool names. - -## What Mimir Is - -Mimir is an experimental agent-first memory system. The public name refers to Mimir, the wise figure from Norse myth. The pre-cutover codename was `engram`, the neuroscience term for the physical substrate of a memory trace; that thesis still matters: agent memory should be stored in a form optimized for agent consumption, not human legibility. - -**Mandate update — 2026-04-24.** Mimir's mission is now a multi-agent memory governance/control plane. 
Claude, Codex, MCP clients, and future harnesses may all contribute memory drafts, but no agent writes trusted shared memory directly. The primary user entry point should be a transparent launch harness: `mimir [agent args...]` preserves the native agent UI while wrapping the session with Mimir memory, bootstrap, capture, and governance. Mimir ingests raw memories, cleans and validates them through the librarian, separates observations from instructions, files records by scope, and promotes reusable knowledge only through explicit provenance-preserving governance. Mimir may also orchestrate cross-agent, cross-model consensus quorums; those deliberation outputs are governed evidence drafts, not direct canonical memory writes. - -The design space Mimir explores: - -- **Agent-native IR.** Canonical storage in a bytecode-like format, tokenizer-aligned for Claude. Not markdown, not English, not JSON. Humans access through a decoder tool, never directly. -- **Librarian-mediated writes.** A single-writer gate enforces schema, symbol identity, supersession, and write conflicts. Agents never write to the canonical store directly. -- **Bifurcated reads.** Agents read the canonical store directly on the hot path. They escalate to the librarian on conflict, low confidence, or stale-symbol flag. -- **Compiler-style architecture (Roslyn analog).** The librarian is a compiler pipeline — lexer, parser, binder, semantic analyzer, emit. Deterministic code runs the pipeline; small ML only for semantic fuzziness (dedup, synonymy, supersession candidates). -- **Bi-temporal append-only store.** Four clocks per memory: `valid_at` / `invalid_at` / `committed_at` / `observed_at`. Supersession via edge invalidation — never in-place overwrite. -- **Symbol-tracking IR (Roslyn-grade).** Stable symbol identity across references. Rename propagation, alias chains, retirement flags. 
-- **Grounding-aware deterministic confidence decay.** Exponential; parameterized by `(memory-type × grounding × symbol-kind)`. Activity-weighted in v1. Pinning suspends. -- **Checkpoint-triggered write batches.** Writes happen at agent context-pressure events. Each checkpoint is one Episode (atomic rollback unit). -- **Scoped isolation with governed promotion.** Raw agent and project memories are isolated by default. Cross-project, operator-level, or ecosystem-level reuse happens only through explicit librarian promotion with provenance, scope, trust tier, and revocation. -- **Transparent agent harness.** Users launch normal agents through `mimir [agent args...]`; Mimir preserves native terminal flows while adding governed rehydration, capture, and draft submission. -- **Cross-agent consensus quorum.** Claude, Codex, and future adapters can be asked to reason over a question from explicit personas, critique each other, vote, preserve dissent, and emit a structured result. Quorum results enter Mimir as provenance-rich drafts or review artifacts, never as automatic truth. -- **DAG supersession.** Bi-temporal edge invalidation with four edge kinds (Supersedes / Corrects / StaleParent / Reconfirms). -- **Four memory types.** Semantic, episodic, procedural, inferential. Ephemeral tier alongside for intra-session state. - -**Current state:** see [`STATUS.md`](STATUS.md). - -## Architectural Invariants (Non-negotiable) - -These are load-bearing and not up for casual revision: - -1. **Librarian is the single writer.** Agents never write directly to the canonical store. -2. **Agent-native IR is not human-readable.** Inspection routes through a decoder tool; operability is a tooling concern, not a format concession. -3. **Append-only canonical store.** No in-place overwrite. Supersession via bi-temporal edge invalidation. -4. **Precision and consistency over speed and token cost.** Compute overhead is acceptable in exchange for determinism. -5. 
**Adapter-mediated agent surfaces.** Claude and Codex are the first target surfaces. Future agents integrate through the transparent launch harness, draft/retrieval adapters, and optional MCP-compatible tools, never by bypassing the librarian or canonical contracts. -6. **Every write crosses a validated boundary.** The librarian parses, binds, typechecks, and supersession-detects every write before commit. -7. **Memory is local until governed.** Drafts and raw memories remain isolated at their origin scope. A memory may cross agent, project, operator, or ecosystem boundaries only after librarian validation, explicit scope assignment or promotion, provenance retention, trust classification, and revocable append-only lineage. -8. **Consensus is governed evidence, not truth.** Cross-agent quorum outputs must preserve participant identity, prompts, votes, dissent, and provenance. They can propose memory drafts or decisions, but canonical memory still requires the librarian path. - -## Engineering Standards - -1. **TDD.** No production code without a failing test first. RED → GREEN → REFACTOR. -2. **Primary sources.** Design claims cite primary literature (papers, specs). Training-memory claims are flagged "pending verification" until checked against the real source. -3. **Verification before claiming complete.** Tests passing is not correctness. Claim "done" only with fresh verification output. -4. **Small commits.** Atomic, reviewable, frequent. -5. **Conventional Commits.** `<type>(<scope>): <description>`. Types: `feat`, `fix`, `chore`, `docs`, `test`, `refactor`, `ci`, `perf`, `build`, `research`, `spec`. -6. **No AI attribution.** Commits, PRs, and project output carry no `Co-Authored-By` lines, no "Generated with" footers, no tool-attribution emojis. -7. **Squash merge only.** Linear history on main. -8. **PR-only workflow.** No direct pushes to main. - -## Engagement Protocol - -Agent operation is high-touch. No autonomous drift. - -1. **Propose** in 2–3 sentences. What and why. -2. 
**Wait** for yes / change / no. -3. **Execute** the single concrete action. No scope expansion. -4. **Report** what shipped plus the logical next step. Do not auto-roll. -5. **Stop.** Owner directs the next step. - -## Anti-Patterns (Explicitly Disallowed) - -- Human-readable format concessions in the canonical IR. -- Agent direct writes bypassing the librarian. -- In-place memory overwrite. -- Bare English entity mentions (every reference resolves to a stable symbol ID). -- Optimizing for latency at the cost of precision. -- Claiming a design decision without primary-source verification. -- Raw shared-memory namespaces that agents append to directly. -- Cross-scope promotion without provenance, trust tier, and revocation semantics. -- Treating agent-authored imperatives as durable operator instructions without review. -- Treating quorum majority as truth, erasing dissent, or reporting one model playing multiple personas as cross-model agreement. -- Forcing a separate setup ceremony before the requested agent can launch; first-run bootstrap belongs inside the wrapped agent session. -- Making MCP, hooks, or native client configuration the foundational trust boundary. They are adapter conveniences; the session harness and librarian boundary carry the product. -- AI attribution anywhere in git history or project output. -- Skipping commit hooks (`--no-verify`, `--no-gpg-sign`) without explicit owner approval. - -## Commit Conventions - -[Conventional Commits](https://www.conventionalcommits.org/): - -- `feat:` — new feature -- `fix:` — bug fix -- `refactor:` — restructure without behaviour change -- `test:` — add or update tests -- `docs:` — documentation only -- `chore:` — build, tooling, dependency updates -- `research:` — exploratory or measurement work (tokenizer bake-offs, literature surveys) -- `spec:` — design specification updates - -Commit bodies stay under 15 lines. Depth belongs in the spec or PR description. 
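The commit rules above could be checked mechanically before a push. The following is a minimal sketch, not tooling that exists in the repository: the function name `check_commit_message`, the exact regular expression, and the choice to count only non-blank body lines are all assumptions made for illustration. The type list, the 15-line body limit, and the ban on `Co-Authored-By` lines come from the manual itself.

```python
import re

# Types named in the Engineering Standards and Commit Conventions sections.
ALLOWED_TYPES = (
    "feat", "fix", "chore", "docs", "test", "refactor",
    "ci", "perf", "build", "research", "spec",
)

# Assumed subject shape: type, optional "(scope)", colon, space, description.
SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?: (?P<description>\S.*)$"
)

MAX_BODY_LINES = 15  # "Commit bodies stay under 15 lines."


def check_commit_message(message):
    """Return a list of problems; an empty list means the message passes."""
    problems = []
    lines = message.strip("\n").split("\n")

    match = SUBJECT_RE.match(lines[0])
    if match is None:
        problems.append("subject is not 'type(scope): description'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append("unknown type: " + match.group("type"))

    # Assumption: only non-blank lines after the subject count as body lines.
    body = [line for line in lines[1:] if line.strip()]
    if len(body) >= MAX_BODY_LINES:
        problems.append("body should stay under 15 lines")

    # The manual disallows Co-Authored-By lines in commits.
    if any(line.lower().startswith("co-authored-by:") for line in body):
        problems.append("attribution trailer is not allowed")

    return problems
```

Run against the two subjects in this patch series, `chore: keep local agent control files out of git` and `chore(repo): track agent control standard`, the sketch reports no problems. A check like this could run in a local hook, which fits the manual's preference for verifying locally before spending CI minutes.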
- -## Studio Context - -Mimir is a [BES Studios](https://github.com/buildepicshit) project. Sibling flagships: Floom (generative world design), Wick (Godot MCP with Roslyn-enriched C# exception telemetry), UsefulIdiots. - -## Where to Look - -| Concern | Where | -|---|---| -| Current phase, next milestone | [`STATUS.md`](STATUS.md) | -| Architectural invariants | this file | -| Engineering principles & tooling policy | [`PRINCIPLES.md`](PRINCIPLES.md) | -| Design specs | [`docs/concepts/`](docs/concepts/) — 14 authoritative implementation specs plus draft [`scope-model.md`](docs/concepts/scope-model.md) and [`consensus-quorum.md`](docs/concepts/consensus-quorum.md) | -| Multi-agent mandate | [`docs/concepts/scope-model.md`](docs/concepts/scope-model.md) and [`docs/concepts/consensus-quorum.md`](docs/concepts/consensus-quorum.md) | -| Transparent agent harness | [`README.md`](README.md#running-mimir) and [`docs/first-run.md`](docs/first-run.md) | -| Observability schema | [`docs/observability.md`](docs/observability.md) | -| Prior art attribution | [`docs/attribution.md`](docs/attribution.md) — primary-source verified | -| Roadmap to v1.0 / public launch | [`STATUS.md`](STATUS.md), [`docs/launch-readiness.md`](docs/launch-readiness.md), and [`docs/launch-posting-plan.md`](docs/launch-posting-plan.md) | -| Experimental measurement | `benchmarks/recovery/` for public benchmark assets; ignored local scratch stays out of the public tree | -| Historical planning archive | [`.planning/planning/`](.planning/planning/) | From 532cee2847e995aa6da3547b563061ffd95f7edb Mon Sep 17 00:00:00 2001 From: HasNoBeef Date: Thu, 30 Apr 2026 11:13:28 -0700 Subject: [PATCH 2/2] chore(repo): track agent control standard --- .agents/DOCUMENTATION_GUIDE.md | 169 +++++ .agents/skills/code-review/SKILL.md | 36 + .../skills/implementation-execution/SKILL.md | 34 + .agents/skills/release-pr/SKILL.md | 27 + .agents/skills/repo-orientation/SKILL.md | 36 + 
.../skills/spec-driven-development/SKILL.md | 45 ++ .../skills/spec-evidence-governance/SKILL.md | 35 + .agents/skills/spec-review/SKILL.md | 36 + .agents/skills/symphony-dispatch/SKILL.md | 33 + .agents/skills/verification/SKILL.md | 33 + .../SPEC.md | 350 ++++++++++ .../2026-04-29-realignment-handoff/SPEC.md | 122 ++++ .agents/specs/2026-04-29-repo-audit/SPEC.md | 110 ++++ .../EVALUATION.md | 622 ++++++++++++++++++ .../ROADMAP.md | 173 +++++ .../VERIFICATION.md | 170 +++++ .../SPEC.md | 321 +++++++++ .agents/specs/SPEC.template.md | 107 +++ .agents/workflows/author-spec.md | 12 + .agents/workflows/execute-spec.md | 13 + .agents/workflows/orient.md | 11 + .agents/workflows/release-pr.md | 13 + .agents/workflows/review-diff.md | 11 + .agents/workflows/review-spec.md | 17 + .agents/workflows/spec-evidence.md | 12 + .agents/workflows/symphony-dispatch-check.md | 12 + .agents/workflows/verify-spec.md | 12 + .claude/commands/author-spec.md | 12 + .claude/commands/execute-spec.md | 13 + .claude/commands/orient.md | 11 + .claude/commands/release-pr.md | 13 + .claude/commands/review-diff.md | 11 + .claude/commands/review-spec.md | 17 + .claude/commands/spec-evidence.md | 12 + .claude/commands/symphony-dispatch-check.md | 12 + .claude/commands/verify-spec.md | 12 + .claude/settings.json | 3 + .claude/skills/code-review/SKILL.md | 36 + .../skills/implementation-execution/SKILL.md | 34 + .claude/skills/release-pr/SKILL.md | 27 + .claude/skills/repo-orientation/SKILL.md | 36 + .../skills/spec-driven-development/SKILL.md | 45 ++ .../skills/spec-evidence-governance/SKILL.md | 35 + .claude/skills/spec-review/SKILL.md | 36 + .claude/skills/symphony-dispatch/SKILL.md | 33 + .claude/skills/verification/SKILL.md | 33 + .gitignore | 8 +- AGENTS.md | 162 +++++ CLAUDE.md | 25 + WORKFLOW.md | 74 +++ 50 files changed, 3265 insertions(+), 7 deletions(-) create mode 100644 .agents/DOCUMENTATION_GUIDE.md create mode 100644 .agents/skills/code-review/SKILL.md create mode 100644 
.agents/skills/implementation-execution/SKILL.md create mode 100644 .agents/skills/release-pr/SKILL.md create mode 100644 .agents/skills/repo-orientation/SKILL.md create mode 100644 .agents/skills/spec-driven-development/SKILL.md create mode 100644 .agents/skills/spec-evidence-governance/SKILL.md create mode 100644 .agents/skills/spec-review/SKILL.md create mode 100644 .agents/skills/symphony-dispatch/SKILL.md create mode 100644 .agents/skills/verification/SKILL.md create mode 100644 .agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md create mode 100644 .agents/specs/2026-04-29-realignment-handoff/SPEC.md create mode 100644 .agents/specs/2026-04-29-repo-audit/SPEC.md create mode 100644 .agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md create mode 100644 .agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md create mode 100644 .agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md create mode 100644 .agents/specs/2026-04-30-remove-launch-posting-plan-references/SPEC.md create mode 100644 .agents/specs/SPEC.template.md create mode 100644 .agents/workflows/author-spec.md create mode 100644 .agents/workflows/execute-spec.md create mode 100644 .agents/workflows/orient.md create mode 100644 .agents/workflows/release-pr.md create mode 100644 .agents/workflows/review-diff.md create mode 100644 .agents/workflows/review-spec.md create mode 100644 .agents/workflows/spec-evidence.md create mode 100644 .agents/workflows/symphony-dispatch-check.md create mode 100644 .agents/workflows/verify-spec.md create mode 100644 .claude/commands/author-spec.md create mode 100644 .claude/commands/execute-spec.md create mode 100644 .claude/commands/orient.md create mode 100644 .claude/commands/release-pr.md create mode 100644 .claude/commands/review-diff.md create mode 100644 .claude/commands/review-spec.md create mode 100644 .claude/commands/spec-evidence.md create mode 100644 .claude/commands/symphony-dispatch-check.md create mode 100644 
.claude/commands/verify-spec.md create mode 100644 .claude/settings.json create mode 100644 .claude/skills/code-review/SKILL.md create mode 100644 .claude/skills/implementation-execution/SKILL.md create mode 100644 .claude/skills/release-pr/SKILL.md create mode 100644 .claude/skills/repo-orientation/SKILL.md create mode 100644 .claude/skills/spec-driven-development/SKILL.md create mode 100644 .claude/skills/spec-evidence-governance/SKILL.md create mode 100644 .claude/skills/spec-review/SKILL.md create mode 100644 .claude/skills/symphony-dispatch/SKILL.md create mode 100644 .claude/skills/verification/SKILL.md create mode 100644 AGENTS.md create mode 100644 CLAUDE.md create mode 100644 WORKFLOW.md diff --git a/.agents/DOCUMENTATION_GUIDE.md b/.agents/DOCUMENTATION_GUIDE.md new file mode 100644 index 0000000..7dc9237 --- /dev/null +++ b/.agents/DOCUMENTATION_GUIDE.md @@ -0,0 +1,169 @@ +# BES Documentation Placement Guide + +Status: canonical shared guidance, 2026-04-29. + +Purpose: make every Codex, Claude, and Symphony worker put documentation in the +right place. Read this before creating, moving, archiving, or publishing docs, +specs, plans, audits, or board records. + +## Core Rule + +Use `.agents/` for agent orchestration. Use repo-native `docs/` paths for +durable product knowledge. + +When unsure, create a draft task/audit spec under `.agents/specs/` and ask for +owner approval before moving anything into public or product docs. + +## What Goes In `.agents/` + +| Path | Use for | Notes | +|---|---|---| +| `.agents/specs/` | Agent task specs, audit proposals, migration proposals, Symphony-dispatched execution specs | Default location for work-control specs. These are executable contracts for agents. | +| `.agents/specs/SPEC.template.md` | Shared spec template | Copy/update for non-trivial tasks. | +| `.agents/skills/` | Codex/shared skill procedures | Canonical shared skills. Claude copies mirror these under `.claude/skills/`. 
| +| `.agents/workflows/` | Shared workflow wrappers | Canonical command/workflow prompts. | +| `.agents/BOARD_SEED.md` | Initial tracker backlog shape | Root only unless a repo needs its own board seed. | +| `.agents/DOCUMENTATION_GUIDE.md` | This placement policy | Keep copies aligned across active repos. | +| `.agents/archive/` | Superseded agent-control artifacts | Not product history unless explicitly promoted. | + +Do not put long-lived product architecture only in `.agents/`. If it matters to +humans or contributors, graduate it into the repo's native docs path after +owner approval. + +## What Goes In Root Docs + +The root checkout is the company control plane, not a product monorepo. + +| Path | Use for | +|---|---| +| `AGENTS.md` | Root agent entrypoint and policy summary | +| `CLAUDE.md` | Claude entrypoint importing root policy | +| `WORKFLOW.md` | Company-level Symphony workflow contract | +| `.agents/OPERATING_MODEL.md` | Canonical fleet operating model | +| `.agents/BOARD_SEED.md` | Tracker/Symphony seed backlog | +| `.agents/specs/` | Company-control specs and audit proposals | + +Do not add product implementation docs to root unless the work is truly +cross-company. Cross-company ideas should usually become a control-plane spec +that links to repo-local specs. + +## Repo-Native Product Doc Locations + +### ACTOCCATUD + +| Path | Use for | +|---|---| +| `.agents/specs/` | Agent/Symphony audit and task-control specs | +| `docs/plans/` | V1 sequencing, active plans, supersession records | +| `docs/systems/` | Durable system specs | +| `docs/engineering/` | Engineering research, process, gap registries | +| `docs/content/` | Content specs and catalogs | +| `docs/creative/` / `docs/design/` | Creative/design-system surfaces | +| `docs/reviews/` | Audit and review findings | + +Dispatch authority currently starts from `docs/plans/2026-04-27-v1-truth.md`. 
+ +### Floom + +| Path | Use for | +|---|---| +| `.agents/specs/` | Agent/Symphony audit and task-control specs | +| `docs/superpowers/specs/` | Durable product/compiler specs | +| `docs/superpowers/plans/` | Active and historical implementation plans | +| `docs/superpowers/research/` | Research records | +| `docs/concepts/` | First-principles concept documents | +| `docs/architecture.md` / `docs/getting-started.md` | Public-facing technical docs | + +Demo/product work that becomes durable architecture belongs in +`docs/superpowers/specs/`; orchestration specs stay in `.agents/specs/`. + +### UsefulIdiots + +| Path | Use for | +|---|---| +| `.agents/specs/` | Agent/Symphony audit and task-control specs | +| `docs/specs/` | Durable system architecture specs | +| `docs/01-concept.md` through `docs/07-narrative.md` | Locked game-design authority | +| `docs/LOCKED.md` | Decisions that should not be re-litigated | +| `docs/glossary.md` | Canonical terminology | + +No product code should be written from skeleton specs. Detailed system specs +must be approved first. + +### IKTO + +| Path | Use for | +|---|---| +| `.agents/specs/` | Agent/Symphony audit and task-control specs | +| `docs/superpowers/specs/` | Durable design/product specs | +| `docs/superpowers/memos/` | Decision and research memos | +| `docs/superpowers/research/` | Research records | +| `docs/plans/` | Phase plans, audits, open-question catalogs | +| `docs/content/` | Content/category specs | + +Post-pivot work should first clarify whether older docs are historical, +superseded, or still active. + +### Wick + +Public OSS caution applies. 
+ +| Path | Use for | +|---|---| +| `.agents/specs/` | Agent/Symphony audit and task-control specs; keep local unless approved | +| `docs/` | Public contributor/user documentation | +| `docs/planning/` | Historical and active project plans | +| `docs/public-testing/` | Public-test reports | +| `SECURITY.md`, `CONTRIBUTING.md`, `CHANGELOG.md` | Standard public OSS surfaces | + +Do not publish internal BES agent-control language into Wick docs unless it is +intentionally contributor-facing. + +### Mimir + +Public OSS caution applies. + +| Path | Use for | +|---|---| +| `.agents/specs/` | Agent/Symphony audit and task-control specs; keep local unless approved | +| `docs/concepts/` | Durable product architecture specs | +| `docs/integrations/` | Integration setup docs | +| `docs/blog/` | Public article drafts/content | +| `docs/launch-readiness.md` | Launch checklist and promise audit | +| `docs/launch-posting-plan.md` | Public launch/posting plan | +| `.planning/planning/` | Historical planning archive | + +Mimir product docs may mention memory, hooks, and checkpoint features. BES +fleet operating docs must not use Mimir hooks or raw memory as authority until +a new spec-authority integration is approved. + +## Promotion Rules + +Use this promotion path: + +1. Start with `.agents/specs/-/SPEC.md` for task control. +2. If the work creates durable product knowledge, add or update the repo-native + docs path listed above. +3. Keep audit notes and completion evidence in the task spec. +4. Do not move audit prose wholesale into public docs. Rewrite it for the + intended audience. +5. Public OSS repos require owner approval before publishing agent workflow or + internal planning language. + +## Archive And Supersession Rules + +- Prefer supersession headers over deletion when old docs explain why a + decision changed. +- Delete only when the file is a one-time bootstrap, generated scratch, or an + owner-approved removal. 
+- If a doc is historical, say which doc supersedes it. +- If a doc is active, say where its implementation work is tracked. +- If a doc is rejected, preserve the rejection reason in a spec, ADR, or plan. + +## Memory And Evidence + +Raw chat history, Claude memory, Codex memory, Mimir drafts, and old agent notes +are evidence only. They do not decide document placement. + +Durable rules must cite a checked-in file, approved spec, command output, issue, +PR, or direct owner instruction. diff --git a/.agents/skills/code-review/SKILL.md b/.agents/skills/code-review/SKILL.md new file mode 100644 index 0000000..bcc22d0 --- /dev/null +++ b/.agents/skills/code-review/SKILL.md @@ -0,0 +1,36 @@ +--- +name: code-review +description: Use for reviewing local diffs or PRs. Prioritizes bugs, regressions, missing tests, unsafe assumptions, and broken repo contracts over summaries. +--- + +# Code Review + +Use this when asked to review. + +## Review Focus + +- Correctness bugs. +- Behavioral regressions. +- Missing or weak tests. +- Security, privacy, or secret-handling risks. +- Broken architecture boundaries. +- Drift from `AGENTS.md`, approved specs, or public docs. +- Verification gaps. + +## Output + +Findings first, ordered by severity. Each finding should include: + +- file and line reference when available +- the concrete risk +- why the current change causes it +- a practical fix direction + +Then include open questions and a brief summary only after findings. + +## Hard Rules + +- If there are no findings, say that clearly and list residual risk. +- Do not lead with praise or broad summaries. +- Do not request stylistic churn unless it affects correctness, + maintainability, or repo contracts. 
diff --git a/.agents/skills/implementation-execution/SKILL.md b/.agents/skills/implementation-execution/SKILL.md new file mode 100644 index 0000000..3b4cbf4 --- /dev/null +++ b/.agents/skills/implementation-execution/SKILL.md @@ -0,0 +1,34 @@ +--- +name: implementation-execution +description: Use when implementing an approved BES SPEC.md. Keeps edits scoped, preserves user work, updates directly coupled tests/docs, and stops when new facts change scope. +--- + +# Implementation Execution + +Use only after a spec is approved by the owner or controlling workflow. + +## Steps + +1. Re-read the approved `SPEC.md`. +2. Re-read the repo `AGENTS.md` and relevant docs. +3. Confirm branch/worktree state with `git status --short --branch`. +4. Edit only files named by the spec or directly required by the change. +5. Add or update tests before or with production changes when behavior changes. +6. Keep unrelated refactors out of scope. +7. Run the spec acceptance commands. +8. Prepare the completion report requested by the spec. + +## Stop Conditions + +- New facts materially change scope. +- Required files contain unrelated local changes that make safe editing + ambiguous. +- Verification requires unavailable secrets or infrastructure. +- The spec's acceptance criteria are not testable. + +## Hard Rules + +- Preserve unrelated user changes. +- Do not silently expand scope. +- Do not bypass hooks, CI, or verification gates. +- Do not claim completion without fresh verification evidence. diff --git a/.agents/skills/release-pr/SKILL.md b/.agents/skills/release-pr/SKILL.md new file mode 100644 index 0000000..c508d26 --- /dev/null +++ b/.agents/skills/release-pr/SKILL.md @@ -0,0 +1,27 @@ +--- +name: release-pr +description: Use when preparing commits, PRs, release handoff, or merge cleanup. Enforces explicit staging, conventional commits, PR evidence, and worktree hygiene. +--- + +# Release And PR + +Use when moving finished work toward review or merge. + +## Steps + +1. 
Confirm branch and tracking state. +2. Review `git status --short` and `git diff`. +3. Stage explicit files by path. +4. Use the repo's commit convention. +5. Write a PR body with summary, verification output, risk, and links. +6. Confirm CI/check status only after local verification is complete. +7. After merge, clean worktrees and stale local branches according to repo + instructions. + +## Hard Rules + +- No `git add .` unless explicitly approved for the batch. +- No AI attribution in commits, PRs, docs, or generated output. +- No force-push, branch deletion, hook bypass, or merge without approval when + the repo requires it. +- Do not burn CI minutes as a substitute for local verification. diff --git a/.agents/skills/repo-orientation/SKILL.md b/.agents/skills/repo-orientation/SKILL.md new file mode 100644 index 0000000..9b6ef2a --- /dev/null +++ b/.agents/skills/repo-orientation/SKILL.md @@ -0,0 +1,36 @@ +--- +name: repo-orientation +description: Use at the start of work in any BES repo to build a current, cited map of instructions, repo state, verification gates, active plans, and likely risk before editing. +--- + +# Repo Orientation + +Use this before planning or editing. + +## Steps + +1. Read the nearest `AGENTS.md`. +2. If present, read `CLAUDE.md`, `WORKFLOW.md`, `STATUS.md`, + `.agents/DOCUMENTATION_GUIDE.md`, and the docs linked by `AGENTS.md`. +3. Check git state with `git status --short --branch`. +4. Identify the active branch, tracking branch, untracked files, and unrelated + local changes. +5. Identify the repo's verification gate and any hook setup requirements. +6. Locate the task's likely files with `rg` and `rg --files`. +7. Report only verified facts. Cite files or command output. + +## Output + +- Target repo and branch. +- Source-of-truth docs read. +- Relevant files or directories. +- Verification commands. +- Documentation placement constraints for this task. +- Local changes that must be preserved. 
+- Open questions before implementation. + +## Hard Rules + +- Do not edit during orientation. +- Do not rely on memory when repo docs can answer the question. +- If instructions conflict, stop and report the conflict. diff --git a/.agents/skills/spec-driven-development/SKILL.md b/.agents/skills/spec-driven-development/SKILL.md new file mode 100644 index 0000000..57af240 --- /dev/null +++ b/.agents/skills/spec-driven-development/SKILL.md @@ -0,0 +1,45 @@ +--- +name: spec-driven-development +description: "Use when planning, reviewing, implementing, or verifying non-trivial work in BES repos. Enforces the BES spec-first operating model: author an executable SPEC.md, review it, implement only approved scope, verify with concrete commands, and route durable lessons into spec evidence." +--- + +# Spec-Driven Development + +Use this skill for non-trivial work in BES repos. + +## Workflow + +1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present, + `.agents/DOCUMENTATION_GUIDE.md` if present, and the relevant project docs. +2. Create or update a task spec from `.agents/specs/SPEC.template.md`. +3. Verify the spec has goals, non-goals, current facts with citations, + desired behavior, safety invariants, acceptance commands, rollback, and open + questions. +4. Do not implement until the spec is approved by the owner or controlling + workflow. +5. Execute only the approved spec. Stop if new facts materially change scope. +6. Run acceptance commands and the repo's normal verification gate. +7. Report files changed, commands run, verification output, residual risk, and + spec evidence candidates. + +## Hard Rules + +- Specs are executable contracts, not brainstorming notes. +- Raw memories and chat history are evidence only. +- Project docs and `AGENTS.md` beat generated memory. +- Durable cross-project instructions go through approved specs and delivery + evidence records. 
+- Put task-control specs in `.agents/specs/`; put durable product docs in the + repo-native docs path defined by `.agents/DOCUMENTATION_GUIDE.md`. +- No silent scope expansion. +- No completion claim without fresh verification. + +## Spec Review Checklist + +- Problem is specific and cites current evidence. +- Goals and non-goals draw a clean boundary. +- Executor can identify exact files and interfaces. +- Test plan is runnable on this machine. +- Safety invariants protect user work and repo rules. +- Open questions are resolved before implementation. +- Acceptance criteria are objective. diff --git a/.agents/skills/spec-evidence-governance/SKILL.md b/.agents/skills/spec-evidence-governance/SKILL.md new file mode 100644 index 0000000..d7c0c11 --- /dev/null +++ b/.agents/skills/spec-evidence-governance/SKILL.md @@ -0,0 +1,35 @@ +--- +name: spec-evidence-governance +description: Use to convert durable lessons from a completed task into spec evidence candidates without writing trusted shared memory directly. Mimir hooks are intentionally disabled until a future spec-authority integration is approved. +--- + +# Spec Evidence Governance + +Use after substantial work, reviews, or incident resolution. + +## Candidate Criteria + +Capture a memory candidate only when it is: + +- durable across sessions +- useful to future agents +- grounded in a source path, command output, issue, PR, or owner statement +- not already present in checked-in docs +- safe to share at the intended scope + +## Output + +For each candidate: + +- Claim. +- Scope: repo, company, tool, or project area. +- Evidence: file path, command, issue, PR, or owner statement. +- Confidence. +- Supersedes or conflicts with any known existing memory. +- Suggested spec, backlog, or delivery-authority route. + +## Hard Rules + +- Do not write trusted shared memory directly. +- Do not promote raw agent imperatives into durable rules. +- Do not erase dissent, uncertainty, or provenance. 
diff --git a/.agents/skills/spec-review/SKILL.md b/.agents/skills/spec-review/SKILL.md new file mode 100644 index 0000000..9588c4f --- /dev/null +++ b/.agents/skills/spec-review/SKILL.md @@ -0,0 +1,36 @@ +--- +name: spec-review +description: Use to review a draft SPEC.md before implementation. Focus on ambiguity, missing current facts, unsafe scope, weak acceptance criteria, and missing verification. +--- + +# Spec Review + +Use this before approving or executing a non-trivial spec. + +## Review Checklist + +- Problem statement is concrete and cites current evidence. +- Goals and non-goals define a clear boundary. +- Current system facts cite files, docs, issues, PRs, or command output. +- Desired behavior is testable. +- Interfaces and files are specific enough for an executor. +- Safety invariants protect user work, secrets, hooks, and repo rules. +- Test plan is runnable on this machine. +- Acceptance criteria are objective. +- Rollback plan is realistic. +- Open questions are resolved or explicitly block execution. + +## Output + +Lead with blocking findings ordered by severity. Include file references when +possible. Then list open questions and a recommendation: + +- `approve` +- `approve with small edits` +- `block until revised` + +## Hard Rules + +- Do not approve vague specs. +- Do not allow implementation scope to hide inside open questions. +- Do not review for style before correctness and safety. diff --git a/.agents/skills/symphony-dispatch/SKILL.md b/.agents/skills/symphony-dispatch/SKILL.md new file mode 100644 index 0000000..ef1133b --- /dev/null +++ b/.agents/skills/symphony-dispatch/SKILL.md @@ -0,0 +1,33 @@ +--- +name: symphony-dispatch +description: Use when preparing or auditing Symphony-compatible issue dispatch. Checks WORKFLOW.md, issue eligibility, workspace isolation, Codex runner settings, and observability expectations. +--- + +# Symphony Dispatch + +Use when running or preparing autonomous worker dispatch. 
+ +## Checklist + +- `WORKFLOW.md` exists in the runner cwd or an explicit workflow path is set. +- YAML front matter has `tracker`, `polling`, `workspace`, `hooks`, `agent`, + and `codex` sections. +- `tracker.kind` is `linear` and `tracker.api_key` resolves from + `LINEAR_API_KEY`. +- `tracker.project_slug`, active states, and terminal states match the board. +- `workspace.root` is absolute and outside product repo working trees. +- Hooks are documented; failures have the right abort/ignore behavior. +- `codex.command` is `codex app-server` unless a tested wrapper is in use. +- Concurrency is bounded for the machine and CI budget. +- Running workers use isolated branches or worktrees. +- Logs and completion reports include issue identifier, session, commands, and + verification evidence. + +## Hard Rules + +- Do not dispatch when the target repo is unclear. +- Do not dispatch multiple write-capable workers into the same worktree. +- Do not allow unsupported tool calls or user-input-required turns to stall + indefinitely. +- Treat Symphony as a trusted-environment runner unless stronger sandboxing is + explicitly configured. diff --git a/.agents/skills/verification/SKILL.md b/.agents/skills/verification/SKILL.md new file mode 100644 index 0000000..156c605 --- /dev/null +++ b/.agents/skills/verification/SKILL.md @@ -0,0 +1,33 @@ +--- +name: verification +description: Use before reporting done. Runs the narrowest relevant checks first, then the repo gate when warranted, and records fresh evidence plus residual risk. +--- + +# Verification + +Use before claiming work is complete. + +## Steps + +1. Read the spec acceptance commands and repo `AGENTS.md` verification section. +2. Run the narrowest relevant test or lint first. +3. Run the broader repo gate when the change touches shared behavior, + interfaces, CI, docs contracts, or release surfaces. +4. Capture command, result, and important output. +5. 
If a command fails, diagnose whether the failure is caused by the change, + existing repo state, missing dependency, sandbox/network limits, or secrets. +6. Re-run only after a meaningful fix or environment change. + +## Output + +- Commands run. +- Pass/fail result. +- Key output lines or summarized failures. +- Residual risk. +- Checks not run and why. + +## Hard Rules + +- Do not say "should pass" as verification. +- Do not hide failing checks. +- Do not spend CI minutes when local gates are required first. diff --git a/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md b/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md new file mode 100644 index 0000000..f225244 --- /dev/null +++ b/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md @@ -0,0 +1,350 @@ +--- +id: mimir-parallel-handoff-closeout-2026-04-29 +status: owner-paused +owner: HasNoBeef +repo: Mimir +source_specs: + - root:.agents/specs/2026-04-29-fleet-realignment-and-handoff/SPEC.md + - .agents/specs/2026-04-29-realignment-handoff/SPEC.md + - .agents/specs/2026-04-29-repo-audit/SPEC.md +branch_policy: local-only-public-oss-parallel-lane +risk: medium +requires_network: false +requires_secrets: [] +acceptance_commands: + - "node ../.agents/scripts/preflight.mjs" + - "git status --short --branch --untracked-files=all" + - "cargo build --workspace" + - "cargo test --workspace" + - "cargo test --workspace --all-features" + - "cargo fmt --all -- --check" + - "cargo clippy --all-targets --all-features -- -D warnings" + - "cargo deny check" + - "cargo doc --workspace --no-deps" +--- + +# SPEC: Mimir Parallel Handoff Closeout + +## 1. Problem + +Mimir still has local handoff/setup work after the BES fleet realignment pass. +The repo is public OSS and CI-budget-sensitive, so the remaining closeout must +preserve local work, avoid public noise, and define the exact point at which a +green room product evaluation may safely begin. + +This spec is a local agent-control handoff artifact. 
It does not approve product +code, public docs, commits, pushes, tags, releases, or publication. + +## 2. Goals + +- Record the current Mimir branch, head, dirty state, in-flight work, and + verification gates from fresh command output. +- Define a public-OSS-safe parallel closeout lane that is disjoint from other + BES handoff workers. +- Identify owner decisions needed before any Mimir product closeout, public docs + change, PR, push, tag, or release. +- Define stop conditions that protect user work, public OSS posture, and CI + quota. +- State that Mimir green room evaluation may begin only after this closeout is + done or the owner explicitly marks it paused. + +## 3. Non-Goals + +- Do not edit product code. +- Do not edit root files or sibling repos. +- Do not commit, push, tag, publish, open PRs, or mutate GitHub/Linear. +- Do not re-enable BES use of Mimir hooks, MCP servers, or raw memory as work + authority. +- Do not resolve launch-readiness documentation drift in this lane unless the + owner expands scope in a new approved spec. + +## 4. Current System Facts + +- Root `AGENTS.md` says the root checkout is the company control plane, product + code lives in active child repos, non-trivial work starts with a spec, and + public OSS repos including Mimir must not receive public agent-control churn + without owner-approved low-noise PR planning. +- `.agents/OPERATING_MODEL.md` requires root preflight, spec-first execution, + isolated workspaces/branches for parallel writers, explicit verification, and + public OSS release hygiene. +- `.agents/GREEN_ROOM_EVALUATION.md` says remaining repo handoffs should run in + parallel only where write scopes are disjoint, and green room evaluation for a + repo may start only after that repo's handoff lane is closed or owner-paused. 
+- `.agents/MODEL_ROUTING.md` routes public OSS release/spec work through Codex + `gpt-5.5` with Claude Opus 4.7 independent review when useful, and says + write-capable agents need disjoint file ownership or worktree boundaries. +- Root preflight command `node .agents/scripts/preflight.mjs` passed with zero + warnings on 2026-04-29. +- Mimir `AGENTS.md` says Mimir is public pre-1.0 active development, requires + local verification before pushing, and each tracked branch/PR push triggers a + costly GitHub Actions matrix run. +- Mimir `WORKFLOW.md` lists the canonical local verify command as + `cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings`. +- Mimir `STATUS.md` says workspace version is `0.1.0`, no release tag exists, + and `v0.1.0` must wait for owner approval. +- `README.md` and `STATUS.md` both limit public claims: no production-ready + claim, no stable storage/API/MCP schema claim, no hosted-service claim, no + benchmark-proven superiority, and no direct agent writes to canonical memory. +- `docs/launch-readiness.md` records local cargo gates, `cargo deny check`, + `cargo doc --workspace --no-deps`, crate dry-run expectations, recovery + benchmark checks, public-surface sweeps, and first tag target `v0.1.0` after + owner approval. +- `docs/README.md`, `README.md`, `STATUS.md`, `AGENTS.md`, and + `docs/launch-readiness.md` still reference `docs/launch-posting-plan.md`. + Command `test -e docs/launch-posting-plan.md` returned exit code 1, and + recent `git log --oneline --decorate -n 12` shows + `4d38614 Delete docs/launch-posting-plan.md (#16)`. +- Current Mimir branch command output: + +```text +## main...origin/main + M .gitignore + M AGENTS.md +?? .agents/DOCUMENTATION_GUIDE.md +?? .agents/skills/code-review/SKILL.md +?? .agents/skills/implementation-execution/SKILL.md +?? .agents/skills/release-pr/SKILL.md +?? .agents/skills/repo-orientation/SKILL.md +?? 
.agents/skills/spec-driven-development/SKILL.md +?? .agents/skills/spec-evidence-governance/SKILL.md +?? .agents/skills/spec-review/SKILL.md +?? .agents/skills/symphony-dispatch/SKILL.md +?? .agents/skills/verification/SKILL.md +?? .agents/specs/2026-04-29-realignment-handoff/SPEC.md +?? .agents/specs/2026-04-29-repo-audit/SPEC.md +?? .agents/specs/SPEC.template.md +?? .agents/workflows/author-spec.md +?? .agents/workflows/execute-spec.md +?? .agents/workflows/orient.md +?? .agents/workflows/release-pr.md +?? .agents/workflows/review-diff.md +?? .agents/workflows/review-spec.md +?? .agents/workflows/spec-evidence.md +?? .agents/workflows/symphony-dispatch-check.md +?? .agents/workflows/verify-spec.md +?? .claude/commands/author-spec.md +?? .claude/commands/execute-spec.md +?? .claude/commands/orient.md +?? .claude/commands/release-pr.md +?? .claude/commands/review-diff.md +?? .claude/commands/review-spec.md +?? .claude/commands/spec-evidence.md +?? .claude/commands/symphony-dispatch-check.md +?? .claude/commands/verify-spec.md +?? .claude/settings.json +?? .claude/skills/code-review/SKILL.md +?? .claude/skills/implementation-execution/SKILL.md +?? .claude/skills/release-pr/SKILL.md +?? .claude/skills/repo-orientation/SKILL.md +?? .claude/skills/spec-driven-development/SKILL.md +?? .claude/skills/spec-evidence-governance/SKILL.md +?? .claude/skills/spec-review/SKILL.md +?? .claude/skills/symphony-dispatch/SKILL.md +?? .claude/skills/verification/SKILL.md +?? CLAUDE.md +?? WORKFLOW.md +``` + +- Current Mimir head is `9e81c0f` on `main`, tracking `origin/main`. +- `git diff --name-status` currently reports modified tracked files + `.gitignore` and `AGENTS.md`. +- `git diff -- AGENTS.md` shows the local tracked change adds BES fleet + operating model instructions to Mimir's operating manual. 
+- `git diff -- .gitignore` shows local tracked changes that stop ignoring + `.claude/settings.json` and `.claude/skills/`, and add `.codex`, + `.mcp.json`, and `.mcp.local.json` ignores. +- Local MCP posture remains zero-default: no Mimir `.mcp.json` is present. + +## 5. Desired Behavior + +The next Mimir worker can close or pause the handoff without disturbing other +parallel lanes. It must know which local files are pre-existing work, which work +requires owner decisions, which gates are required before public activity, and +when green room evaluation is allowed to start. + +## 6. Domain Model / Contract + +Closeout states: + +- `preserve`: local or user work that must not be touched by this lane. +- `verify`: work that may be complete but needs fresh local gates. +- `owner-decision`: work that cannot proceed without HasNoBeef selecting a + public or product posture. +- `ready-for-spec`: future work that needs its own approved spec before edits. +- `closed`: no unresolved in-flight work remains for the handoff lane. +- `owner-paused`: the owner explicitly allows green room evaluation to begin + while named closeout work remains paused. + +## 7. In-Flight Work + +| Item | State | Required next action | +| --- | --- | --- | +| BES agent-control setup in `.agents/`, `.claude/`, `CLAUDE.md`, `WORKFLOW.md`, `.gitignore`, and `AGENTS.md` | preserve | Keep local/draft unless owner approves a low-noise public OSS PR plan. | +| Existing realignment handoff and repo-audit specs | preserve | Use as local source material; do not publish unless owner approves. | +| Active processing adapters at head `9e81c0f` | verify | Run full local cargo gate before any product release, tag, or PR closeout claim. | +| Missing `docs/launch-posting-plan.md` with remaining references | owner-decision | Decide whether to restore, replace, or remove stale references in a separate public-doc-safe spec. 
| +| Pre-1.0 launch cleanup and `v0.1.0` tag | owner-decision | Requires explicit owner approval, local green gates, and low-noise push/tag plan. | +| BES spec-authority integration research | ready-for-spec | Design only; do not re-enable hooks/MCP/memory authority without approved scope. | +| Green room product evaluation | blocked | May begin only after this handoff is `closed` or explicitly `owner-paused`. | + +## 8. Public-OSS Safe Parallel Lane + +The current lane may write only: + +- `.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md` + +Rules for any continuation: + +- Keep all output local. +- Do not push or publish. +- Do not edit product code, public docs, root files, or sibling repos. +- Do not stage files or commit. +- Do not normalize or delete untracked `.agents/` or `.claude/` work from other + agents. +- If owner expands scope beyond this SPEC, create or switch to a dedicated + branch/worktree and name the exact allowed files before editing. +- If another write-capable agent is active in Mimir, stop unless file ownership + and worktree boundaries are explicit. + +## 9. 
Required Verification Gates + +For this local handoff spec: + +```bash +node ../.agents/scripts/preflight.mjs +git status --short --branch --untracked-files=all +``` + +For any future product, public-doc, PR, push, tag, or release closeout: + +```bash +cargo build --workspace +cargo test --workspace +cargo test --workspace --all-features +cargo fmt --all -- --check +cargo clippy --all-targets --all-features -- -D warnings +cargo deny check +cargo doc --workspace --no-deps +``` + +Additional launch-readiness checks to run when release/public-doc scope is +approved: + +```bash +cargo publish --dry-run -p mimir-core --allow-dirty +cargo test -p mimir-harness --test recovery_benchmark +python3 benchmarks/recovery/test_bench.py +rg -n 'production-ready|stable API|benchmark-proven|hosted service|direct agent writes|mimir hook-context|mimir-checkpoint' README.md STATUS.md docs crates plugins +rg -n 'launch-posting-plan.md' README.md STATUS.md AGENTS.md docs +``` + +Do not run GitHub Actions retries or remote CI probes unless the owner approves +the CI-budget tradeoff. + +## 10. Owner Decisions + +- Owner triage approval on 2026-04-30 marks Mimir `owner-paused` for local BES + setup and public release actions. Green room evaluation may run local-only + after at least one private repo validates the protocol. +- Public docs, PRs, pushes, tags, releases, CI-triggering work, and publication + remain blocked until a separate owner-approved public OSS spec exists. +- Should Mimir close out launch cleanup first, or should spec-authority research + be designed first? +- Should the BES integration pause remain root-only, or should Mimir public docs + mention it in public-facing language? +- Should the local BES agent-control setup be committed to Mimir at all, and if + yes, what is the low-noise public OSS PR plan? +- Should `docs/launch-posting-plan.md` be restored, replaced, or removed from + remaining references after PR #16 deleted it? 
+- What release posture is acceptable after local gates pass: no tag, `v0.1.0`, + or a new pre-release? +- Is green room evaluation allowed to start now as `owner-paused`, or only after + the local handoff/setup state is closed? + +## 11. Stop Conditions + +Stop and report before editing if any of these occur: + +- The requested file scope expands beyond this SPEC without an approved spec or + explicit owner instruction. +- `git status` changes in files this lane did not touch and the change affects + closeout facts. +- A command would push, publish, tag, re-enable Actions, mutate GitHub/Linear, + install tools, or use network/secrets. +- Product docs or code need edits to resolve the `launch-posting-plan.md` + references. +- Any source conflicts on whether Mimir hooks/MCP/raw memory are active BES work + authority. +- Another write-capable Mimir worker is assigned overlapping files without a + branch/worktree boundary. + +## 12. Execution Plan + +1. Preserve this SPEC as the current Mimir closeout handoff artifact. +2. Ask the owner to resolve the decisions in section 10. +3. If the owner chooses closeout, draft the next executable spec with exact + files and public OSS posture. +4. Run the required local gates before any public-facing PR/push/tag/release. +5. Mark this handoff `closed` only after owner decisions are resolved or + explicitly deferred and there is no unresolved closeout work blocking green + room evaluation. +6. If the owner chooses not to close now, mark the unresolved items + `owner-paused`; only then may green room evaluation begin. + +## 13. Safety Invariants + +- Mimir remains public OSS; internal BES agent-control output stays local until + owner-approved public wording and CI-cost posture exist. +- The librarian remains the product write boundary; no agent writes trusted + shared memory directly. +- BES fleet operation remains spec-first and zero-default-MCP until a future + approved spec changes it. 
+- Existing local changes and untracked files are user/agent work and must be + preserved. +- CI quota is protected by local verification and batched public activity. + +## 14. Acceptance Criteria + +- [x] Current branch, head, and dirty state are refreshed before closeout. +- [x] Owner decisions in section 10 are resolved or explicitly paused. +- [x] No product code, public docs, root files, or sibling repos are edited by + this lane. +- [x] Required verification gates are run for any approved product/public-doc + closeout. +- [x] Completion report lists commands, results, residual risk, and files + changed. +- [x] Green room evaluation starts only after closeout is `closed` or + `owner-paused`. + +## 15. Rollback Plan + +If this handoff spec is rejected, delete only: + +```text +.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md +``` + +Do not revert, delete, stage, or normalize any other local Mimir changes as part +of rollback. + +## 16. Completion Report + +- Files changed: + - `.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md` +- Commands run: + - `node ../.agents/scripts/preflight.mjs` - passed with 0 warnings before + this owner-pause decision. + - `git status --short --branch --untracked-files=all` - captured branch + `main...origin/main`, tracked `.gitignore` and `AGENTS.md` edits, and + untracked local agent/Claude setup files. +- Verification result: local control-plane handoff is owner-paused. Product, + public-doc, release, tag, and publication gates were intentionally not run + because no public OSS action is approved. +- Anything intentionally left untouched: product code, public docs, launch + references, root files, sibling repos, untracked `.agents/**`, `.claude/**`, + `CLAUDE.md`, and `WORKFLOW.md`. +- Residual risk: Mimir remains public-OSS sensitive; all public actions remain + blocked until a low-noise owner-approved public spec exists. 
+- Spec evidence candidates: + - Public OSS green room packets can be local-only after owner pause, but + publication and public-doc changes need a separate public-facing approval. diff --git a/.agents/specs/2026-04-29-realignment-handoff/SPEC.md b/.agents/specs/2026-04-29-realignment-handoff/SPEC.md new file mode 100644 index 0000000..8c9297d --- /dev/null +++ b/.agents/specs/2026-04-29-realignment-handoff/SPEC.md @@ -0,0 +1,122 @@ +--- +id: mimir-realignment-handoff-2026-04-29 +status: draft-handoff +owner: HasNoBeef +repo: Mimir +source_spec: root:.agents/specs/2026-04-29-fleet-realignment-and-handoff/SPEC.md +branch_policy: local-only-public-oss +risk: medium +requires_network: false +requires_secrets: [] +acceptance_commands: + - "cargo build --workspace" + - "cargo test --workspace" + - "cargo test --workspace --all-features" + - "cargo fmt --all -- --check" + - "cargo clippy --all-targets --all-features -- -D warnings" + - "cargo deny check" + - "cargo doc --workspace --no-deps" +--- + +# SPEC: Mimir Realignment Handoff + +## 1. Handoff Purpose + +Mimir is a public pre-1.0 product, while BES company agents have temporarily +moved to spec evidence instead of Mimir hook authority. This handoff keeps those +two facts separate so product work does not get accidentally deprecated and BES +agent policy does not drift back to memory authority. + +## 2. Current Branch And Dirty State + +Observed on 2026-04-29: + +```text +## main...origin/main + M .gitignore + M AGENTS.md +?? .agents/ +?? .claude/ +?? CLAUDE.md +?? WORKFLOW.md +``` + +Recent head: + +```text +9e81c0f feat(librarian): support active processing adapters +4d38614 Delete docs/launch-posting-plan.md (#16) +1650d18 ci: make release publishing idempotent +``` + +Local MCP posture: no repo-local `.mcp.json` is present. Mimir product docs may +still mention MCP and hook features as product surfaces, but BES root operating +policy currently uses zero default MCP servers. + +## 3. 
Source Docs Read + +- `AGENTS.md` +- `CLAUDE.md` +- `WORKFLOW.md` +- `STATUS.md` +- `.agents/specs/2026-04-29-repo-audit/SPEC.md` + +## 4. Preserve + +- Public pre-1.0 honesty and launch-readiness discipline. +- Mimir product features around governed memory, librarian-mediated writes, + recovery mirroring, hooks, MCP, and benchmarks. +- The BES distinction: spec evidence is current company authority; raw memory + and hooks are not active work authority. +- Root-installed agent surfaces and all existing local/untracked work. + +## 5. Work Classification + +| Item | State | Required next action | +| --- | --- | --- | +| Shared agent setup | preserve | Keep local/draft until owner approves public-facing PR posture. | +| Active processing adapters commit | verify | Confirm local cargo gates before any release/PR closeout. | +| Pre-1.0 launch cleanup | ready-for-dispatch | Use a public-OSS-aware spec and avoid noisy CI churn. | +| BES integration pause note | owner-decision | Decide whether this belongs in Mimir product docs or root-only docs. | +| Spec authority research design | ready-for-dispatch | Design only; do not re-enable hooks without approval. | +| OSS release rollout | owner-decision | Requires explicit owner approval for tag/publish/push. | + +## 6. Verification Gate + +Run before claiming Mimir product work complete: + +```bash +cargo build --workspace +cargo test --workspace +cargo test --workspace --all-features +cargo fmt --all -- --check +cargo clippy --all-targets --all-features -- -D warnings +cargo deny check +cargo doc --workspace --no-deps +``` + +This handoff did not run product gates because it changed only agent-control +handoff documentation. + +## 7. Recommended Next Agent Engagement + +Start from inside `Mimir` and ask the agent to: + +```text +Orient with repo-orientation. Read AGENTS.md, CLAUDE.md, WORKFLOW.md, STATUS.md, +docs/launch-readiness.md, and this handoff. Draft a closeout SPEC for the next +Mimir public-OSS-safe step. 
Do not push, tag, publish, or add BES hook authority +without owner approval. +``` + +## 8. Owner Decisions Before Execution + +- Should Mimir resume launch cleanup first, or should spec-authority research + happen first? +- Should the BES integration pause be visible in Mimir public docs? +- What release/tag posture is acceptable after local gates pass? + +## 9. Residual Risk + +Mimir is externally visible. Even doc-only agent scaffolding can create public +noise, so keep all new work draft/local until a low-noise PR plan is approved. diff --git a/.agents/specs/2026-04-29-repo-audit/SPEC.md b/.agents/specs/2026-04-29-repo-audit/SPEC.md new file mode 100644 index 0000000..a9cb819 --- /dev/null +++ b/.agents/specs/2026-04-29-repo-audit/SPEC.md @@ -0,0 +1,110 @@ +--- +id: mimir-repo-audit-2026-04-29 +status: draft +owner: HasNoBeef +repo: Mimir +branch_policy: local-only-public-oss +risk: medium +requires_network: false +requires_secrets: [] +acceptance_commands: + - cargo build --workspace + - cargo test --workspace + - cargo test --workspace --all-features + - cargo fmt --all -- --check + - cargo clippy --all-targets --all-features -- -D warnings + - cargo deny check + - cargo doc --workspace --no-deps + - rg -n 'production-ready|stable API|benchmark-proven|hosted service|direct agent writes|mimir hook-context|mimir-checkpoint' README.md STATUS.md docs crates plugins +--- + +# SPEC: Mimir Repo Audit And Spec Migration + +## 1. Problem + +Mimir is a public pre-1.0 memory governance product. BES is temporarily +removing Mimir hook/setup surfaces from the active agent operating layer while +the company moves to spec-first Symphony dispatch. The repo needs to preserve +its product roadmap while making clear that BES agents do not currently rely on +Mimir hooks or raw memory as source-of-truth. + +## 2. Current Facts + +- `STATUS.md` says Mimir is in pre-1.0 public launch cleanup, version `0.1.0`, + with no release tag yet. 
+- `README.md` says agents may propose memory, but do not write trusted shared + memory directly. +- `docs/launch-readiness.md` records OSS readiness, engineering quality gates, + promise-audit boundaries, and deferred work. +- Product surfaces include core append-only store, librarian, harness, operator + tools, Claude/Codex setup paths, MCP, recovery mirroring, and benchmarks. +- Public claims are explicitly limited: no production-ready claim, no stable + API/storage claim, no hosted-service claim, no benchmark-proven claim. +- Code inventory from `rg --files`: 63 Rust files, 1 Python file, 10 TOML + files, and 54 Markdown files. +- Active BES agent operating surfaces now use `spec-evidence-governance`; the + repo product may still legitimately mention Mimir hook and checkpoint + features in code/docs/tests. + +## 3. Preserve + +- Product mission: local-first governed memory, append-only canonical store, + librarian-mediated writes, transparent harness, and explicit recovery. +- Launch-readiness checklist and promise-audit discipline. +- Public pre-1.0 honesty. +- Product docs/tests for `mimir hook-context`, `mimir-checkpoint`, and native + setup paths, because those are Mimir product features. + +## 4. Archive Or Supersede + +- Do not archive Mimir's product memory-governance docs. They remain product + architecture. +- Do supersede any BES operating instruction that tells agents to use Mimir + hooks or raw memory as work authority. +- Future BES integration should be designed as a spec/delivery evidence system + before re-enabling hooks. + +## 5. Proposed New Executable Specs + +1. **BES Integration Pause Note** + - Scope: add a small product/docs note, if owner approves, clarifying that + BES company agents currently use spec evidence instead of Mimir hooks. + - Acceptance: no claim that Mimir product functionality is deprecated. + +2. 
**Pre-1.0 Launch Cleanup Batch** + - Scope: continue launch-readiness gates, public-surface scrub, crate/docs + dry-runs, and owner-approved release tagging. + - Acceptance: all `docs/launch-readiness.md` local gates pass. + +3. **Spec Authority Research Design** + - Scope: design how Mimir could later store and govern spec evidence, + delivery records, and supersession decisions instead of generic memories. + - Acceptance: design spec only; no hook re-enable until approved. + +4. **Benchmark Claim Evidence** + - Scope: live recovery benchmark report with transcripts and scorecards. + - Acceptance: benchmark claim is either supported by evidence or kept out of + public copy. + +5. **OSS Release Rollout** + - Scope: batched public changes, crates.io order, docs.rs expectations, + launch article/posting plan, and CI-cost-aware push. + - Acceptance: no remote push/tag until owner approves. + +## 6. Open Questions + +- Should Mimir's first post-audit work be launch cleanup or spec-authority + research? +- Do you want a visible docs note about BES pausing Mimir hooks, or should that + stay in the company control-plane docs only? +- When public launch resumes, should the release be `v0.1.0` exactly or a new + pre-release tag after the current local audit batch? + +## 7. Verification Status + +This audit read docs and performed a lightweight code inventory only. Cargo +gates were not run because no Mimir product code changed in this session and +public OSS CI churn is intentionally avoided. + +The one-time migration bootstrap for Mimir has been actioned locally and should +be deleted in this local audit batch. 
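Reviewer note (not part of the patch): the cargo gates listed in `acceptance_commands` above could be run locally, in order, with a small wrapper that stops at the first failure and prints a pass/fail line for each command. The sketch below is a minimal illustration under assumptions: `run_gates` and `mimir_gates` are hypothetical helper names that do not exist in the Mimir repo, and the script assumes it is run from the workspace root with `cargo`, `cargo-deny`, and the clippy/rustfmt components already installed.

```bash
#!/usr/bin/env sh
# Hypothetical helper (not in the Mimir repo): run each gate command in order,
# stop at the first failure, and print one PASS/FAIL line per command so the
# output can be pasted into a completion report.
run_gates() {
  for _gate_cmd in "$@"; do
    printf '==> %s\n' "$_gate_cmd"
    # Each gate is a single command string copied from the spec.
    if ! sh -c "$_gate_cmd"; then
      printf 'FAIL: %s\n' "$_gate_cmd" >&2
      return 1
    fi
    printf 'PASS: %s\n' "$_gate_cmd"
  done
  printf 'all %d gates passed\n' "$#"
}

# The cargo gates from acceptance_commands, in the order the spec lists them.
mimir_gates() {
  run_gates \
    "cargo build --workspace" \
    "cargo test --workspace" \
    "cargo test --workspace --all-features" \
    "cargo fmt --all -- --check" \
    "cargo clippy --all-targets --all-features -- -D warnings" \
    "cargo deny check" \
    "cargo doc --workspace --no-deps"
}
```

Calling `mimir_gates` from the Mimir checkout would run the gates sequentially; because the wrapper returns non-zero at the first failing command, later gates are skipped and the failing command is named on stderr, which fits the specs' rule against reporting "should pass" as verification.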
diff --git a/.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md b/.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md new file mode 100644 index 0000000..2c4f3f7 --- /dev/null +++ b/.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md @@ -0,0 +1,622 @@ +# Mimir Green Room Product Evaluation + +## Repo State + +| Field | Value | +|---|---| +| Repo | `Mimir` | +| Branch | `main` | +| Head | `9e81c0f` | +| Tracking | `origin/main` | +| Dirty state | Yes — `M .gitignore`, `M AGENTS.md`, ~40 untracked agent-control/setup files | +| Public/private | **Public OSS** (Apache-2.0) | +| Handoff status | `owner-paused` per triage SPEC 2026-04-29 | +| Version | `0.1.0` (no release tag yet) | + +Dirty state detail captured from +`Mimir/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md` Section 4: + +- `M .gitignore` — BES agent-control addition. +- `M AGENTS.md` — BES operating manual update. +- Untracked: `.agents/` tree (~25 files including specs, skills, workflows, + scripts, mcp config), `.claude/` tree (~10 files including commands, settings, + skills), `CLAUDE.md`, `WORKFLOW.md`, `.mcp.example.json`. +- No tracked product source files are modified. 
+
+## Primary Agent
+
+| Field | Value |
+|---|---|
+| Agent | Claude Code |
+| Model | `claude-opus-4-6` (Claude Opus 4.6) |
+| Reasoning mode | default (xhigh not explicitly set for this run) |
+| Date | 2026-04-30 |
+| Network used | No (local file reads only; git commands to Mimir denied by sandbox) |
+
+## Sources Read
+
+Root authority:
+
+| File | Purpose |
+|---|---|
+| `.agents/GREEN_ROOM_EVALUATION.md` | Green room protocol |
+| `.agents/OPERATING_MODEL.md` | Fleet operating contract |
+| `.agents/MODEL_ROUTING.md` | Agent/model routing policy |
+| `.agents/DOCUMENTATION_GUIDE.md` | Documentation placement rules |
+| `.agents/WORKSPACE_LAYOUT.md` | Root workspace layout |
+| `.agents/specs/2026-04-29-green-room-product-evaluations/SPEC.md` | Fleet evaluation dispatch spec |
+| `.agents/specs/2026-04-29-handoff-triage/SPEC.md` | Handoff triage decisions |
+
+Mimir authority:
+
+| File | Purpose |
+|---|---|
+| `Mimir/AGENTS.md` | Repo operating manual, architectural invariants, design space |
+| `Mimir/CLAUDE.md` | Claude entry point |
+| `Mimir/WORKFLOW.md` | Symphony workflow contract |
+| `Mimir/STATUS.md` | Current phase, CI state, launch work order |
+| `Mimir/README.md` | Public entry point, what-works / what-is-not-claimed |
+| `Mimir/PRINCIPLES.md` | Engineering principles, testing strategy, error handling |
+| `Mimir/CHANGELOG.md` | Unreleased section (first 100 lines) |
+| `Mimir/RELEASING.md` | Release runbook |
+| `Mimir/Cargo.toml` | Workspace members, dependencies, lint config |
+| `Mimir/docs/README.md` | Public docs index |
+| `Mimir/docs/concepts/README.md` | Authoritative implementation spec catalog |
+| `Mimir/docs/launch-readiness.md` | OSS readiness checklist, engineering gate evidence |
+| `Mimir/.github/workflows/ci.yml` | CI pipeline (first 80 lines) |
+
+Mimir handoff/audit specs:
+
+| File | Purpose |
+|---|---|
+| `Mimir/.agents/specs/2026-04-29-parallel-handoff-closeout/SPEC.md` | Captured git state, in-flight work table, owner decisions |
+| `Mimir/.agents/specs/2026-04-29-realignment-handoff/SPEC.md` | Branch/dirty state, work classification |
+| `Mimir/.agents/specs/2026-04-29-repo-audit/SPEC.md` | Code inventory, proposed specs |
+
+Code inventory (via `ls`, `wc -l`, `find -type f`):
+
+| Path | Files | LOC |
+|---|---|---|
+| `mimir-core/src/` | 22 | ~18,054 |
+| `mimir-librarian/src/` | 12 | ~16,139 |
+| `mimir-harness/src/` | 2 | ~10,980 |
+| `mimir-mcp/src/` | 3 | ~1,357 |
+| `mimir-cli/src/` | 2 | ~964 |
+| **Source total** | **41** | **~47,494** |
+| Test files | 13 | ~8,575 |
+| Total `.rs` files | 162 | — |
+
+## Commands Run
+
+| Command | Result |
+|---|---|
+| `ls Mimir/` | Success — listed repo top-level contents |
+| `ls Mimir/mimir-core/src/` | Success — 22 source files |
+| `ls Mimir/mimir-librarian/src/` | Success — 12 source files |
+| `ls Mimir/mimir-harness/src/` | Success — 2 source files |
+| `ls Mimir/mimir-mcp/src/` | Success — 3 source files |
+| `ls Mimir/mimir-cli/src/` | Success — 2 source files |
+| `wc -l Mimir/mimir-core/src/*.rs` | Success — 18,054 LOC |
+| `wc -l Mimir/mimir-librarian/src/*.rs` | Success — 16,139 LOC |
+| `wc -l Mimir/mimir-harness/src/*.rs` | Success — 10,980 LOC |
+| `wc -l Mimir/mimir-mcp/src/*.rs` | Success — 1,357 LOC |
+| `wc -l Mimir/mimir-cli/src/*.rs` | Success — 964 LOC |
+| `find Mimir/mimir-core/tests -type f -name '*.rs'` | Success — 5 test files |
+| `find Mimir/mimir-harness/tests -type f -name '*.rs'` | Success — 4 test files |
+| `find Mimir/mimir-cli/tests -type f -name '*.rs'` | Success — 2 test files |
+| `find Mimir/mimir-librarian/tests -type f -name '*.rs'` | Success — 2 test files |
+| `wc -l` on all test files | Success — 8,575 LOC total |
+| `find Mimir -type f -name '*.rs' \| wc -l` | Success — 162 files |
+| `mkdir -p Mimir/.agents/specs/2026-04-30-green-room-product-evaluation/` | Success |
+| `git -C Mimir status` | **Denied** — sandbox blocked cd-then-git |
+| `git -C Mimir log` | **Denied** — sandbox blocked cd-then-git |
+| `git -C Mimir diff` | **Denied** — sandbox blocked cd-then-git |
+
+**Skipped gates:**
+
+| Gate | Reason |
+|---|---|
+| `cargo build --workspace` | Bash denied for Mimir git/cargo; dispatch predicted this; launch-readiness.md records pass on 2026-04-28 |
+| `cargo test --workspace` | Same; launch-readiness.md records pass on 2026-04-28 |
+| `cargo fmt --all -- --check` | Same |
+| `cargo clippy --all-targets --all-features -- -D warnings` | Same |
+| `cargo deny check` | Same |
+| `cargo doc --no-deps --all-features` | Same |
+| `cargo publish --dry-run` per crate | Same |
+
+All cargo gate evidence is from `docs/launch-readiness.md` (2026-04-28 pass)
+and `STATUS.md` (CI green on main after PR #11). This is **second-hand
+evidence**, not a fresh gate run. The verifier should note this gap.
+
+## Product Thesis and Target User
+
+**Thesis** (from AGENTS.md mandate update 2026-04-24): Mimir is an
+experimental local-first memory governance system for AI coding agents. It
+provides a librarian-mediated, append-only, symbol-tracking memory store that
+agents write to through a structured IR surface and read from through governed
+retrieval. The goal is durable, auditable, cross-agent memory that survives
+context resets, model changes, and session boundaries.
+
+**Target user**: AI coding agents (Codex, Claude Code, Cursor, Copilot) and
+the developers who operate them. Mimir is infrastructure, not a direct
+end-user product. The human operator configures and audits; agents interact
+through the harness, MCP server, or Codex plugin.
+
+**Differentiation**: single-writer gate (librarian), append-only canonical
+store, agent-native IR (not human-readable prose), compiler-shaped pipeline,
+bi-temporal model, confidence decay, cross-agent consensus quorum, transparent
+harness wrapping, and scoped memory isolation with governed promotion.
+
+## Current Status vs. Last Known Roadmap
+
+`STATUS.md` (last updated 2026-04-28) places Mimir in **pre-1.0 public launch
+cleanup**:
+
+| Status item | Evidence |
+|---|---|
+| CI state | Main green after PR #11 (2026-04-28) |
+| Core store | Implemented (canonical.rs 2090 LOC, store.rs 2089 LOC) |
+| Librarian pipeline | Implemented (pipeline.rs 2727 LOC, full lex→parse→bind→semantic→emit chain) |
+| Harness | Implemented (lib.rs 8212 LOC, main.rs 2768 LOC) |
+| Operator controls | Implemented (bounded context, operator memory controls, project doctor, hook validation) |
+| MCP server | Implemented (server.rs 1179 LOC, rmcp 1.5.0) |
+| Recovery framework | Implemented (benchmarks/recovery/ with scenarios, scoring, test_bench.py) |
+| Codex plugin | Implemented (plugins/mimir/ with skills and marketplace entry) |
+| Adapters | Processing adapters present (copilot_session_store.rs 1489 LOC in librarian) |
+| Fuzz targets | Present (fuzz/ directory with corpus) |
+| Launch readiness | All checklist items Done per docs/launch-readiness.md |
+| Release tag | **Not created** — pending owner approval |
+| Public publish | **Not done** — crates.io publish paused |
+
+The launch work order in STATUS.md is:
+1. Public surface scrub
+2. README/docs cleanup
+3. OSS readiness
+4. Marketing
+5. Local verification
+6. Batched commit/push
+
+Steps 1–5 appear substantially complete based on launch-readiness.md evidence.
+Step 6 (batched commit/push) has not happened — the dirty state confirms local
+work is uncommitted/unpushed.
+
+**Drift from roadmap**: The BES fleet realignment paused Mimir before the
+final batched commit/push. The handoff triage spec marks Mimir
+`owner-paused`. The product is closer to launch-ready than any other BES repo
+but the public release action was blocked by fleet policy, not by engineering
+gaps.
+
+## Engineering Quality
+
+### Architecture
+
+**Strong.** The architecture is well-specified across 14 authoritative concept
+docs plus 2 drafts (scope-model, consensus-quorum). The crate structure
+cleanly separates concerns:
+
+| Crate | Role |
+|---|---|
+| `mimir-core` | Canonical store, IR pipeline, read/write, decay, inference, symbol tracking |
+| `mimir-librarian` | Single-writer gate, LLM integration, drafts, quorum, processing adapters |
+| `mimir-harness` | Transparent agent wrapper, operator controls, context management |
+| `mimir-mcp` | MCP server exposing governed memory tools |
+| `mimir-cli` | Operator CLI for direct store interaction |
+
+Eight non-negotiable architectural invariants are documented in AGENTS.md:
+1. Librarian-mediated writes (single writer gate)
+2. Append-only canonical store
+3. Bi-temporal model (assertion time + valid time)
+4. Agent-native IR (not human-readable)
+5. Structured write surface (Lisp S-expression)
+6. Compiler-shaped pipeline (lex→parse→bind→semantic→emit)
+7. Symbol-tracking IR with stable identity
+8. Confidence decay with grounding-aware parameters
+
+These invariants are enforced through the type system and pipeline design, not
+just documentation.
+
+**Risk**: `mimir-librarian/src/main.rs` at 8,455 LOC and
+`mimir-harness/src/lib.rs` at 8,212 LOC are large single files. These are
+complexity hot spots that will resist review, refactoring, and onboarding.
+
+### Build and Test Health
+
+**Good, with caveats.**
+
+- Workspace lint config is strict: `missing_docs = "deny"`,
+  `unsafe_code = "forbid"`, clippy pedantic with `unwrap_used`,
+  `expect_used`, `panic`, `todo`, `dbg_macro` all denied.
+- CI matrix covers Ubuntu, macOS, Windows with fmt, clippy, test, deny.
+- CI uses pinned action SHAs, Swatinem/rust-cache, concurrency groups.
+- Test corpus: ~8,575 LOC across 13 test files, plus property tests (proptest)
+  and fuzz targets.
+- Engineering gate passed locally on 2026-04-28 per launch-readiness.md.
+
+**Caveats**:
+- No fresh gate run during this evaluation (sandbox restrictions).
+- Test-to-source ratio is ~18% by LOC — adequate for a pre-1.0 project but
+  the large librarian and harness files likely have lower coverage density.
+- No code coverage tooling configured.
+- No integration test for the full harness→librarian→store pipeline visible
+  in the test file listing (harness tests exist but the integration boundary
+  is unclear without reading test content).
+
+### CI
+
+**Well-configured.** Cross-platform matrix (Ubuntu, macOS, Windows), cargo-deny
+for dependency auditing, fmt and clippy checks, all-features test runs with
+doctests. Concurrency groups prevent parallel CI waste.
+
+**CI cost constraint**: AGENTS.md documents a hard rule — monthly GitHub
+Actions budget is capped. The repo must not generate churn that wastes CI
+minutes. This is a real operational constraint for any post-evaluation work.
+
+### Dependency Risk
+
+**Low.** `Cargo.toml` shows a focused dependency set:
+- Core: `thiserror`, `ulid`, `sha2`, `serde`, `serde_json`, `toml`
+- Async: `tokio`
+- Database: `rusqlite` (bundled SQLite)
+- MCP: `rmcp` (pinned =1.5.0)
+- Testing: `proptest`, `tempfile`, `criterion`, `wait-timeout`
+- Observability: `tracing`, `tracing-subscriber`
+- Schema: `schemars`
+- Other: `anyhow`, `getrandom`
+
+`cargo-deny` is integrated into CI. The `rmcp` pin at `=1.5.0` is tight and
+may need updating as the MCP ecosystem evolves, but for pre-1.0 this is
+acceptable.
+
+### Observability
+
+**Present.** `tracing` and `tracing-subscriber` are workspace dependencies.
+`PRINCIPLES.md` documents structured event logging with privacy rules. The
+harness includes a `log.rs` (607 LOC in core) and the librarian has
+`test_tracing.rs`. Actual observability depth requires code-level review.
+
+### Security
+
+**Good posture for pre-1.0.** `unsafe_code = "forbid"` workspace-wide.
+`SECURITY.md` exists (per launch-readiness.md). Dependency auditing via
+`cargo-deny`. No network-facing attack surface in the core library — the MCP
+server is the primary external boundary. `rusqlite` bundled mode avoids system
+SQLite version issues. Sanitisation boundary is documented in
+`docs/concepts/`.
+
+### Release Posture
+
+**Ready but blocked.** `RELEASING.md` documents a full tag-triggered release
+pipeline: verify-version → dry-run-publish → smoke-install → build-binaries
+(5 targets) → github-release → crates-publish. `cargo publish --dry-run`
+passed locally on 2026-04-28. The release workflow exists in
+`.github/workflows/` (inferred from RELEASING.md). No tag has been created.
+Owner approval is required for the first `v0.1.0` tag.
+
+### Operational Risk
+
+**Low for pre-1.0.** No production users, no hosted service, no SLA. The
+primary operational risk is CI cost — any PR churn costs Actions minutes
+against a capped monthly budget.
+
+## Code Quality
+
+### Maintainability
+
+**Mixed.** The crate boundaries and type system are strong. The lint
+configuration is among the strictest possible in Rust (forbid unsafe, deny
+missing_docs, pedantic clippy). But two files dominate the codebase:
+
+| File | LOC | Risk |
+|---|---|---|
+| `mimir-librarian/src/main.rs` | 8,455 | God-file risk; combines CLI entry, server logic, and processing in one file |
+| `mimir-harness/src/lib.rs` | 8,212 | Large lib with harness logic, context management, operator controls |
+
+These files are individually larger than many entire crates. They will be
+difficult to review, test in isolation, and onboard new contributors to.
+
+### Test Coverage
+
+**Adequate for pre-1.0, with gaps.**
+
+- 13 test files, ~8,575 LOC.
+- Property tests via proptest.
+- Fuzz targets present.
+- Snapshot tests implied by the testing strategy in PRINCIPLES.md.
+- No coverage tooling configured — coverage density of the large files is
+  unknown.
+- Recovery benchmark framework provides structured scenario testing but is
+  separate from the unit/integration test suite.
+
+### Complexity Hot Spots
+
+1. `mimir-librarian/src/main.rs` (8,455 LOC) — needs decomposition.
+2. `mimir-harness/src/lib.rs` (8,212 LOC) — needs decomposition.
+3. `mimir-core/src/pipeline.rs` (2,727 LOC) — large but arguably appropriate
+   for a compiler pipeline; should be reviewed for internal modularity.
+4. `mimir-core/src/canonical.rs` (2,090 LOC) and `store.rs` (2,089 LOC) —
+   core storage; size is proportional to responsibility but warrants review.
+
+### Stale Code
+
+- `docs/launch-posting-plan.md` was deleted in PR #16 but is still referenced
+  from `docs/README.md` (confirmed via docs index read). This is a known
+  broken link.
+- `.planning/planning/` contains 14 historical planning docs. These are
+  archive material and should not be treated as authority.
+
+### Duplication
+
+Cannot assess without code-level review. The large file sizes suggest
+potential internal duplication within librarian and harness, but this is
+speculative.
+
+### Unsafe Assumptions
+
+- `unsafe_code = "forbid"` eliminates Rust-level unsafe.
+- The LLM integration in `mimir-librarian/src/llm.rs` (1,015 LOC) is a
+  correctness boundary — LLM outputs fed into the compiler pipeline must be
+  validated. The validator exists (`validator.rs`, 149 LOC) but its coverage
+  of adversarial LLM output is unknown.
+- The `rmcp` pin at `=1.5.0` assumes API stability of an early-stage MCP
+  library.
+
+### Correctness Risks
+
+- The compiler pipeline (lex→parse→bind→semantic→emit) is the core
+  correctness boundary. Property tests exist for some stages. Fuzz targets
+  exist. But the pipeline is 2,727 LOC and correctness of the full chain
+  depends on integration testing that was not directly observed.
+- Confidence decay (`decay.rs`, 1,000 LOC) implements a mathematical model;
+  correctness requires property tests and potentially formal verification.
+  Property tests are present (proptest dependency) but depth is unknown.
+- Bi-temporal model correctness in the canonical store is critical — temporal
+  queries must respect both assertion and valid time. Testing depth unknown.
+
+## Product Quality
+
+### Feature Completeness
+
+**Core feature set is implemented.** Per STATUS.md and README.md:
+
+| Feature | Status |
+|---|---|
+| Canonical append-only store | Implemented |
+| Librarian-mediated writes | Implemented |
+| Compiler pipeline (lex→parse→bind→semantic→emit) | Implemented |
+| Agent-native IR | Implemented |
+| Symbol tracking with stable identity | Implemented |
+| Confidence decay | Implemented |
+| Bi-temporal model | Implemented |
+| Transparent agent harness | Implemented |
+| Operator controls (bounded context, memory controls, project doctor) | Implemented |
+| MCP server | Implemented |
+| Recovery benchmarks | Implemented |
+| Codex plugin | Implemented |
+| CLI operator tools | Implemented |
+| Processing adapters (Copilot session store) | Implemented |
+| BC/DR restore drill | Implemented |
+
+**Not implemented / not claimed** (per README.md):
+
+| Feature | Status |
+|---|---|
+| Production readiness | Not claimed |
+| Stable API | Not claimed |
+| Hosted service | Not claimed |
+| Benchmark-proven superiority | Not claimed |
+| Relationship/timeline APIs | Deferred |
+| OCI/MCPB package | Deferred |
+| Broader client recipes | Deferred |
+| OpenSSF Scorecard / Best Practices Badge | Deferred |
+
+### Demo / Showcase Readiness
+
+**Moderate.** The transparent harness (`mimir [agent args...]`) is the
+primary demo surface. The Codex plugin bundle provides an integration path.
+The MCP server enables tool-based interaction. But there is no recorded demo
+script, no video, no interactive tutorial beyond the README quickstart. For an
+infrastructure product targeting AI agent operators, the current entry point
+is adequate for technical early adopters but not for broader discovery.
+
+### Asset and Content Readiness
+
+**Good for pre-1.0 OSS.**
+
+- README with quickstart.
+- Docs index with concept specs, integration guides, observability docs.
+- Launch article draft (docs/blog/).
+- CHANGELOG with detailed unreleased section.
+- Contributing guide, Code of Conduct, Security policy, issue/PR templates.
+- CODEOWNERS, Dependabot config, Citation file.
+
+### User-Facing Gaps
+
+1. No recorded demo or tutorial beyond README quickstart.
+2. Broken link: `docs/launch-posting-plan.md` deleted but still referenced.
+3. No published crate — users cannot `cargo install` yet.
+4. No release tag — users cannot pin a version.
+5. The agent-native IR is intentionally not human-readable, which is a design
+   choice but creates an onboarding barrier for operators who want to inspect
+   memory state.
+
+## Roadmap Assessment
+
+### What Is Done
+
+- Core architecture implemented across 5 crates (~47.5k LOC).
+- All 14 authoritative concept specs have corresponding implementations.
+- Engineering quality gates pass locally (2026-04-28 evidence).
+- CI is green on main.
+- Launch readiness checklist is fully Done.
+- Public surface (README, docs, legal, community) is prepared.
+- Release pipeline exists and dry-run passed.
+
+### What Is Blocked
+
+1. **Release tag and crates.io publish**: blocked on owner approval.
+2. **Public docs/PR/CI actions**: blocked by BES fleet policy (public OSS
+   constraint).
+3. **Batched commit of local agent-control files**: blocked on owner approval
+   of what to include vs. exclude.
+4. **BES spec-authority integration**: paused per fleet realignment; design
+   research needed before resuming.
+
+### What Is Stale
+
+- `docs/launch-posting-plan.md` reference in `docs/README.md` — the file was
+  deleted in PR #16.
+- `.planning/planning/` historical archive — 14 docs from pre-mandate
+  planning. Not harmful but not authority.
+- The `STATUS.md` launch work order implies a linear sequence that was
+  interrupted by fleet realignment. The remaining steps (batched commit/push)
+  are valid but the context has changed.
+
+### Critical Path
+
+The critical path to v0.1.0 public launch:
+
+1. Owner decides on local agent-control file commit scope.
+2. Owner decides on launch-posting-plan reference cleanup.
+3. Owner approves release tag posture (v0.1.0).
+4. Batched commit/push of approved changes.
+5. Tag v0.1.0 and trigger release pipeline.
+6. Verify crates.io publish and binary artifacts.
+7. Announce (launch article is drafted).
+
+### What Can Be Cut
+
+- OpenSSF Scorecard / Best Practices Badge — deferred, not blocking launch.
+- OCI/MCPB package — deferred.
+- Broader client recipes — deferred.
+- Relationship/timeline APIs — deferred.
+- Live benchmark report hosting — deferred.
+- BES spec-authority integration — explicitly paused; separate concern from
+  launch.
+
+## Next-Build Plan
+
+The smallest sequence of specs to move Mimir measurably toward green:
+
+### Spec 1: Local Agent-Control Commit Plan
+
+Decide which untracked agent-control files to commit, which to gitignore, and
+which to remove. This unblocks the batched commit/push without polluting the
+public repo with internal fleet language.
+
+### Spec 2: Pre-1.0 Launch Cleanup Batch
+
+Fix the broken `docs/launch-posting-plan.md` reference, verify all docs links,
+run a fresh full engineering gate, and prepare the batched commit. This is the
+final cleanup before tagging.
+
+### Spec 3: v0.1.0 Release Tag and Publish
+
+Create the v0.1.0 tag, trigger the release pipeline, verify crates.io
+publish, verify binary artifacts for all 5 targets, and execute the launch
+announcement plan.
+
+### Stretch Spec: Librarian/Harness Decomposition
+
+Split `mimir-librarian/src/main.rs` (8,455 LOC) and
+`mimir-harness/src/lib.rs` (8,212 LOC) into focused modules. This is not
+blocking launch but is the highest-impact maintainability improvement for
+post-launch development.
+
+## Proposed Issue List
+
+| # | Title | Depends on | Risk | Verification gate | Model routing |
+|---|---|---|---|---|---|
+| 1 | Local agent-control commit plan | Owner decisions | Medium | `git status` shows only approved files staged | Any frontier model |
+| 2 | Pre-1.0 launch cleanup batch | #1 | Low | Full cargo gate pass + docs link check | Any frontier model |
+| 3 | v0.1.0 release tag and publish | #2 + owner approval | High | Release pipeline succeeds, crates.io publish verified, binaries downloadable | Codex `gpt-5.5` primary, Claude Opus verification |
+| 4 | Librarian main.rs decomposition | None (can parallel after #1) | Medium | All tests pass, no public API change | Any frontier model |
+| 5 | Harness lib.rs decomposition | None (can parallel after #1) | Medium | All tests pass, no public API change | Any frontier model |
+| 6 | BES spec-authority research design | Owner approval to resume | High | Design doc reviewed by second model | Frontier model primary + different family verification |
+| 7 | Test coverage tooling setup | None | Low | Coverage report generated, baseline established | Sonnet or fast model |
+| 8 | Benchmark claim evidence | None (can parallel) | Low | Benchmark results recorded with methodology | Any model |
+
+## Owner Decisions Needed
+
+1. **Local agent-control commit scope**: which of the ~40 untracked
+   agent-control files should be committed to the public repo, which should
+   be gitignored, and which should be removed?
+
+2. **Launch-posting-plan reference**: the file was deleted in PR #16 but
+   `docs/README.md` still links to it. Should the reference be removed,
+   redirected, or should the file be restored?
+
+3. **Release tag posture**: is the owner ready to approve v0.1.0 tagging and
+   crates.io publish? What conditions must be met first?
+
+4. **BES integration pause visibility**: should the public repo acknowledge
+   that BES/Mimir spec-authority integration is paused, or should this remain
+   internal?
+
+5. **CI budget for post-evaluation work**: how many PR cycles can Mimir
+   consume from the monthly GitHub Actions budget for cleanup and release?
+
+6. **Librarian/harness decomposition priority**: should the large file
+   decomposition happen before or after v0.1.0 launch?
+
+## Residual Risks
+
+1. **No fresh cargo gate run**: all engineering gate evidence is from
+   2026-04-28. Any changes since then (even agent-control file additions)
+   could affect the build. The verifier should note this gap.
+
+2. **Large file complexity**: the two 8k+ LOC files are technical debt that
+   will compound. Post-launch development velocity will suffer without
+   decomposition.
+
+3. **LLM integration boundary**: the librarian's LLM integration
+   (`llm.rs` + `validator.rs`) is a correctness boundary where adversarial
+   or malformed LLM output could corrupt the memory store. Testing depth at
+   this boundary is unknown.
+
+4. **rmcp version pin**: `=1.5.0` is a tight pin on an early-stage library.
+   MCP ecosystem changes may require updates that affect the server's API
+   surface.
+
+5. **Public OSS exposure**: once published, any internal agent-control
+   language accidentally committed becomes public. The commit scope decision
+   (#1 above) is critical.
+
+6. **Single-maintainer risk**: the repo appears to have one maintainer
+   (HasNoBeef). Bus factor is 1.
+
+7. **No code coverage baseline**: without coverage tooling, it is impossible
+   to quantify which paths are tested and which are not.
+
+## Evidence Gaps
+
+1. **Test content not read**: test files were counted and measured by LOC but
+   their content was not read. Coverage quality, assertion depth, and edge
+   case handling are unknown.
+
+2. **Librarian main.rs and harness lib.rs not read**: the two largest files
+   were not read due to context constraints. Internal structure, duplication,
+   and specific complexity risks are inferred from size only.
+
+3. **LLM integration not read**: `llm.rs`, `validator.rs`, `quorum.rs`,
+   and `drafts.rs` in the librarian were not read. The correctness of LLM
+   output validation is unknown.
+
+4. **CI workflow not fully read**: only the first 80 lines of `ci.yml` were
+   read. Release workflow, Dependabot config, and other workflow files were
+   not read.
+
+5. **Git history not available**: sandbox restrictions prevented git log, git
+   diff, and git status commands against the Mimir repo. All git state is
+   from captured handoff spec evidence dated 2026-04-29.
+
+6. **No live cargo gate**: build, test, clippy, fmt, deny, doc, and
+   publish-dry-run were not executed. All pass evidence is second-hand from
+   `docs/launch-readiness.md` (2026-04-28).
+
+7. **CHANGELOG.md partially read**: only the first 100 lines (Unreleased
+   section header and early entries). Full change history since last release
+   is not captured.
+
+8. **Adapter coverage unknown**: `copilot_session_store.rs` (1,489 LOC) is
+   the only visible adapter. Whether other agent adapters exist or are
+   planned is unclear from the files read.
diff --git a/.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md b/.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md
new file mode 100644
index 0000000..119130a
--- /dev/null
+++ b/.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md
@@ -0,0 +1,173 @@
+# Mimir Green Room Roadmap
+
+## Current State
+
+Mimir is a pre-1.0 public OSS memory governance system for AI coding agents.
+The core architecture is implemented across 5 Rust crates (~47.5k LOC source,
+~8.5k LOC tests). CI is green on main (2026-04-28). All launch readiness
+checklist items are Done. The release pipeline exists and dry-run passed. No
+release tag has been created. No crates.io publish has occurred. The repo is
+`owner-paused` per BES fleet triage. Local dirty state consists of ~40
+untracked agent-control/setup files and two modified files (`.gitignore`,
+`AGENTS.md`); no tracked product source is modified.
+
+The product is closer to public launch than any other BES repo. The gap is
+owner decisions and a final cleanup pass, not engineering work.
+
+## Milestones
+
+### M1: Commit Scope Resolution
+
+Decide and execute which local agent-control files to commit, gitignore, or
+remove. This is the gate for all subsequent work — the dirty state must be
+resolved before any clean batched commit/push.
+
+**Owner decision required.**
+
+### M2: Pre-Launch Cleanup
+
+Fix the broken `docs/launch-posting-plan.md` reference. Verify all docs links.
+Run a fresh full engineering gate (`cargo build/test/fmt/clippy/deny/doc` +
+`cargo publish --dry-run` for all crates). Prepare the batched commit of
+approved changes.
+
+**Depends on: M1.**
+
+### M3: v0.1.0 Release
+
+Create the v0.1.0 tag. Trigger the release pipeline. Verify crates.io publish
+and binary artifacts for all 5 targets (per RELEASING.md). Execute the launch
+announcement.
+
+**Depends on: M2 + owner approval for tag and publish.**
+
+### M4: Post-Launch Maintainability
+
+Decompose `mimir-librarian/src/main.rs` (8,455 LOC) and
+`mimir-harness/src/lib.rs` (8,212 LOC) into focused modules. Set up code
+coverage tooling. Establish a coverage baseline.
+
+**Depends on: M3 (or can start after M1 if owner approves parallel work).**
+
+### M5: BES Integration Research
+
+Resume the paused BES spec-authority integration design. Requires a separate
+approved research spec and second-model verification before implementation.
+
+**Depends on: owner decision to resume. Independent of M1–M4.**
+
+## Critical Path
+
+```
+M1 (commit scope) → M2 (cleanup) → M3 (release)
+```
+
+All three milestones are sequentially dependent. M1 is owner-blocked. The
+total engineering work for M1–M3 is small — the blocking constraint is owner
+decisions, not implementation effort.
+
+## Parallelizable Work
+
+The following can proceed in parallel with the critical path after M1 is
+resolved:
+
+| Work | Can start after | Blocks |
+|---|---|---|
+| Librarian main.rs decomposition | M1 | Nothing on critical path |
+| Harness lib.rs decomposition | M1 | Nothing on critical path |
+| Test coverage tooling setup | M1 | Nothing on critical path |
+| Benchmark claim evidence collection | Any time | Nothing on critical path |
+| Adapter inventory and planning | Any time (read-only) | Nothing on critical path |
+
+## Work That Should Not Start Yet
+
+| Work | Why not yet |
+|---|---|
+| BES spec-authority integration | Owner has not approved resumption; requires separate research spec |
+| New feature development (relationship/timeline APIs, OCI package, hosted service) | Pre-1.0 — launch first |
+| Public PR churn for agent-control scaffolding | CI budget constraint; owner must approve PR plan |
+| OpenSSF Scorecard / Best Practices Badge | Deferred per launch-readiness.md; post-launch concern |
+| Broader client recipes beyond Codex plugin | Post-launch; current Codex plugin and MCP server are sufficient for v0.1.0 |
+
+## First Three Executable Specs
+
+### Spec 1: Local Agent-Control Commit Plan
+
+**Goal**: Resolve the dirty state by classifying all ~40 untracked files as
+commit, gitignore, or remove.
+
+**Scope**:
+- Inventory every untracked file and its purpose.
+- Propose a classification for owner review.
+- After owner approval, stage only approved files and commit.
+- Ensure no internal fleet language leaks into the public repo.
+
+**Verification**: `git status` shows clean working tree with only approved
+changes committed. No internal agent-control language in committed files
+visible to public.
+
+**Risk**: Medium — wrong classification could expose internal fleet language
+publicly or lose useful local configuration.
+
+**Model routing**: Any frontier model. Low-risk enough for Sonnet with
+frontier verification.
+
+---
+
+### Spec 2: Pre-1.0 Launch Cleanup Batch
+
+**Goal**: Fix all known documentation drift, verify engineering gates, and
+prepare the final batched commit before tagging.
+
+**Scope**:
+- Remove or redirect the broken `docs/launch-posting-plan.md` reference in
+  `docs/README.md`.
+- Verify all documentation links resolve.
+- Run the full engineering gate:
+  `cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings && cargo deny check && cargo doc --no-deps --all-features`.
+- Run `cargo publish --dry-run` for each crate in dependency order.
+- Fix any issues found.
+- Prepare the batched commit.
+
+**Verification**: Full cargo gate passes. No broken documentation links.
+`cargo publish --dry-run` succeeds for all crates.
+
+**Risk**: Low — this is cleanup, not feature work. The gate passed on
+2026-04-28; regressions are unlikely unless dependency updates occurred.
+
+**Model routing**: Any frontier model.
+
+---
+
+### Spec 3: v0.1.0 Release Tag and Publish
+
+**Goal**: Execute the first public release of Mimir.
+
+**Scope**:
+- Push the batched commit from Spec 2.
+- Create the `v0.1.0` tag per RELEASING.md.
+- Verify the tag-triggered release pipeline:
+  - verify-version
+  - dry-run-publish
+  - smoke-install
+  - build-binaries (5 targets)
+  - github-release
+  - crates-publish (dependency order: mimir-core → mimir-librarian →
+    mimir-harness → mimir-mcp → mimir-cli)
+- Verify crates.io pages and binary artifact downloads.
+- Execute launch announcement per the drafted launch article.
+
+**Verification**: Release pipeline succeeds. All 5 crates published on
+crates.io. Binary artifacts downloadable for all targets. `cargo install mimir-cli` works from a clean environment.
+
+**Risk**: High — first public release; irreversible once crates.io publish
+completes. Requires owner approval at multiple gates.
+ +**Model routing**: Codex `gpt-5.5` primary with Claude Opus verification, per +MODEL_ROUTING.md public OSS release guidance. + +**Owner approval gates**: +1. Approve the tag name and version. +2. Approve the crates.io publish. +3. Approve the launch announcement wording. +4. Approve the CI budget spend for the release pipeline. diff --git a/.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md b/.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md new file mode 100644 index 0000000..285b2a7 --- /dev/null +++ b/.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md @@ -0,0 +1,170 @@ +# Mimir Green Room Verification + +## Verifier Metadata + +| Field | Value | +|---|---| +| Verifier | Codex | +| Model | GPT-5 | +| Reasoning mode | high | +| Date | 2026-04-30 | +| Scope | Independent second-model verification of `EVALUATION.md` and `ROADMAP.md`; no implementation or public action | +| Final status | `verified-with-changes` | + +This file is verifier output only. It is not owner approval to implement the +roadmap, stage files, push, tag, publish, open PRs, spend CI minutes, or change +public docs. Owner-approved executable specs are still required before any +roadmap item becomes implementation work. + +## Sources Checked + +- Root policy: `AGENTS.md`, `.agents/OPERATING_MODEL.md`, + `.agents/GREEN_ROOM_EVALUATION.md`, `.agents/MODEL_ROUTING.md`. +- Mimir policy/status: `AGENTS.md`, `CLAUDE.md`, `WORKFLOW.md`, `STATUS.md`, + `README.md`, `.agents/DOCUMENTATION_GUIDE.md`, + `docs/launch-readiness.md`. +- Primary packet: + `.agents/specs/2026-04-30-green-room-product-evaluation/EVALUATION.md` and + `.agents/specs/2026-04-30-green-room-product-evaluation/ROADMAP.md`. +- Supporting evidence: `PRINCIPLES.md`, `docs/README.md`, `Cargo.toml`, + `.github/workflows/ci.yml`, `.github/workflows/release.yml`, + `RELEASING.md`. 
+ +## Predicted Failure Classification + +| Constraint | Prediction | Result | Classification | +|---|---|---|---| +| Public actions | Pushes, PRs, tags, publish, release workflows blocked without owner approval | Not attempted | expected | +| CI budget | GitHub Actions should not be triggered by this audit | Not triggered | expected | +| Cargo gates | Could be slow/heavy for a Rust workspace | Completed quickly locally | predicted but did not fail | +| Optional tools | `cargo deny` might be unavailable | Available and passed | predicted but did not fail | +| Dirty worktree | Existing uncommitted/untracked agent-control work must be preserved | Preserved; only this file written | expected | + +## Evidence Commands + +| Command | Result | +|---|---| +| `node .agents/scripts/preflight.mjs` from root | Pass, 0 warnings | +| `git status --short --branch --untracked-files=all` | `main...origin/main`; existing `M .gitignore`, `M AGENTS.md`, and many untracked agent-control files | +| `git log --oneline --decorate -n 12` | `HEAD` at `9e81c0f feat(librarian): support active processing adapters`; prior public cleanup/release commits present | +| `git diff --name-status` | Existing tracked diffs only: `.gitignore`, `AGENTS.md` | +| `git diff --stat` | Existing tracked diffs: 32 insertions, 2 deletions | +| `git ls-files \| wc -l` | 182 tracked files | +| `find crates -path '*/src/*.rs' ... wc -l` | 41 source files, 47,494 LOC | +| `find crates -path '*/tests/*.rs' ... wc -l` | 18 crate test files, 10,173 LOC | +| `cargo fmt --all -- --check` | Pass | +| `cargo build --workspace` | Pass | +| `cargo test --workspace` | Pass | +| `cargo test --workspace --all-features` | Pass | +| `cargo clippy --all-targets --all-features -- -D warnings` | Pass | +| `cargo deny check` | Pass: advisories, bans, licenses, sources ok | +| `cargo doc --workspace --no-deps` | Pass; docs generated under ignored `target/doc` | +| `rg -n "launch-posting-plan..." 
docs README.md STATUS.md CHANGELOG.md RELEASING.md AGENTS.md` | Found stale references in `AGENTS.md`, `STATUS.md`, `README.md`, `docs/README.md`, and `docs/launch-readiness.md` | +| `ls -la docs/launch-posting-plan.md` | Failed: file does not exist | + +## Agreement With Primary Findings + +I agree with the primary evaluation's main conclusions: + +- Mimir is public OSS, pre-1.0, and should not receive public release or PR + churn without owner approval. +- The architecture and public product claims are internally consistent with + `AGENTS.md`, `STATUS.md`, `README.md`, `PRINCIPLES.md`, and + `docs/launch-readiness.md`. +- The critical path is owner-gated release/commit scope work, not new feature + engineering. +- `mimir-librarian/src/main.rs` and `mimir-harness/src/lib.rs` are real + maintainability hot spots by size. +- `docs/launch-posting-plan.md` is missing while public docs still reference + it. +- Release/tag/publish work is high risk and must remain owner-approved. + +## Required Changes Before Dispatch + +1. Expand the broken-link cleanup scope. The primary packet and roadmap call + out `docs/README.md`, but the missing `docs/launch-posting-plan.md` is also + referenced from `AGENTS.md`, `STATUS.md`, `README.md`, and + `docs/launch-readiness.md`. A cleanup spec must either restore the file or + update all stale references. + +2. Correct the test inventory in future operator summaries. The verifier + counted 18 crate test files and 10,173 test LOC, not 13 files and ~8,575 + LOC. This does not weaken the roadmap; it improves the evidence baseline. + +3. Replace the primary packet's "no fresh cargo gate" residual risk with the + fresh verifier evidence above when using the roadmap for dispatch. The + green-room packet now has fresh local passes for fmt, build, tests, + all-features tests, clippy, deny, and docs. + +4. Keep Spec 1 explicitly owner-driven. 
The public OSS constraint is respected + only if the owner decides which agent-control files, if any, belong in the + public repository. + +## Findings + +### High + +None. + +### Medium + +- The roadmap's pre-launch cleanup item is underspecified for the missing + launch-posting plan. It must cover all stale public references or restore the + deleted file; otherwise public docs remain inconsistent after the proposed + cleanup. + +- The release spec remains owner-blocking. Creating `v0.1.0`, publishing + crates, or executing announcements would be irreversible/public and cannot + be inferred from verifier approval. + +### Low + +- The primary evaluation understated test inventory and missed visible + `mimir-mcp` crate tests in its test-file count. This is an evidence + correction, not a rejection. + +- The primary evaluation's git/cargo sandbox limitations are no longer true + for this verifier run. Fresh commands succeeded locally. + +## Owner Decisions Still Required + +- Commit scope for `.agents/`, `.claude/`, `CLAUDE.md`, `WORKFLOW.md`, + `.gitignore`, and `AGENTS.md` changes in a public OSS repo. +- Whether to restore `docs/launch-posting-plan.md` or remove/redirect all + references to it. +- Whether `v0.1.0` is the intended first public tag, and whether crates.io + publishing is approved. +- CI budget for any cleanup PR/release workflow runs. +- Whether large-file decomposition should happen before or after first public + release. +- Whether BES spec-authority integration remains paused or receives a new + research spec. + +## Public OSS Constraint Check + +The primary evaluation and roadmap keep internal agent-control output under +`.agents/specs/` and do not edit public docs or product code. They repeatedly +require owner approval before public PR, CI, tag, publish, and announcement +actions. This satisfies the green-room public OSS constraint. 
+ +One caution: because the repo is public, committing `.agents/` or `.claude/` +content is itself a public documentation decision. That must remain an owner +decision, not an implied verifier conclusion. + +## Residual Risks + +- This verifier did not run `cargo publish --dry-run`; release dry-runs and + real publish remain owner-approved release-spec work. +- This verifier did not inspect every large source file deeply; large-file + maintainability and LLM-boundary risk remain valid follow-up topics. +- Local green checks do not prove GitHub's cross-platform matrix or release + workflow will pass on the next push/tag. +- Existing dirty/untracked files predate this verifier run and were preserved. + +## Final Status + +`verified-with-changes` + +The roadmap is evidence-based, current after the verifier corrections above, +and internally consistent enough to become the basis for owner-approved +executable specs. It is not itself implementation approval. diff --git a/.agents/specs/2026-04-30-remove-launch-posting-plan-references/SPEC.md b/.agents/specs/2026-04-30-remove-launch-posting-plan-references/SPEC.md new file mode 100644 index 0000000..3532d3e --- /dev/null +++ b/.agents/specs/2026-04-30-remove-launch-posting-plan-references/SPEC.md @@ -0,0 +1,321 @@ +--- +id: remove-launch-posting-plan-references +status: ready-for-review +owner: HasNoBeef +repo: Mimir +branch_policy: worktree-preferred +risk: low +requires_network: false +requires_secrets: [] +acceptance_commands: + - test ! -e docs/launch-posting-plan.md + - '! rg -n "docs/launch-posting-plan\\.md|\\(launch-posting-plan\\.md\\)|launch-posting-plan\\.md" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md' + - "bash -lc 'rg -n \"launch-posting-plan|launch posting|posting plan|Publishing plan\" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md || test $? 
-eq 1'" + - git diff --check -- AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md +--- + +# SPEC: Remove Launch Posting Plan References + +## 1. Problem + +`docs/launch-posting-plan.md` is absent from the public Mimir tree, but current +repo documentation still links to it or describes it as an active launch +posting plan. Because Mimir is public OSS, a broken launch/publishing reference +creates public documentation drift and can imply a release or announcement path +that is not owner-approved. + +This spec is local agent-control work only. It authorizes a future +implementation to remove or redirect stale references to the missing file; it +does not authorize restoring a launch posting plan, committing local +agent-control files, pushing, opening a PR, tagging, publishing crates, spending +CI minutes, or announcing a release. + +## 2. Goals + +- Remove or redirect every current reference that points readers to the missing + `docs/launch-posting-plan.md` file. +- Keep public wording factual, pre-1.0, and consistent with current approved + release boundaries. +- Preserve the owner decision that launch/posting/release execution remains + separate release-pr work. +- Keep `.agents/` and other BES agent-control artifacts local-only unless a + separate owner-approved public rollout spec says otherwise. + +## 3. Non-Goals + +- Do not restore, recreate, rewrite, or replace `docs/launch-posting-plan.md`. +- Do not create a new launch-posting, marketing, social, Show HN, crates.io, + docs.rs, MCP Registry, or announcement plan. +- Do not tag `v0.1.0`, publish crates, open PRs, push branches, mutate tracker + state, or trigger release workflows. +- Do not resolve broader dirty-state questions for `.agents/`, `.claude/`, + `CLAUDE.md`, `WORKFLOW.md`, `.gitignore`, or existing modified `AGENTS.md`. +- Do not edit product code, Cargo metadata, CI workflows, release automation, + benchmark assets, or unrelated docs. + +## 4. 
Current System Facts + +- Owner instruction for this task: remove or redirect stale + launch-posting-plan references; do not restore a launch-posting plan; + agent-control files stay local-only; public release/publish decisions remain + separate release-pr work. +- Root `AGENTS.md`: public OSS repos `Wick` and `Mimir` must not receive + internal agent-control language unless the owner approves a public-facing + rollout; product code lives in child repos, not the root. +- Root `.agents/OPERATING_MODEL.md`: non-trivial work starts with an + executable spec, public OSS repos require extra release hygiene, and public + OSS doc-only churn must not be pushed without an intentional owner-approved + low-noise PR plan. +- Root `.agents/GREEN_ROOM_EVALUATION.md`: green-room packets are local, + isolated, and do not authorize implementation, public docs publication, PRs, + tags, or releases. +- Root + `.agents/specs/2026-04-29-green-room-product-evaluations/CROSS_PRODUCT_SEQUENCE.md`: + Mimir is actionable as local-only public-OSS-safe cleanup to remove or + redirect stale launch-posting-plan references; agent-control files remain + local-only; public release/publish decisions remain separate release-pr work. +- Mimir `AGENTS.md`: the repo is public OSS, uses BES spec-first operation, and + has a hard CI quota rule requiring local verification before any push. +- Mimir `WORKFLOW.md`: canonical verification is + `cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings`. +- Mimir `STATUS.md`: Mimir is pre-1.0 public active development; release tags + are absent; `v0.1.0` may be tagged only after owner approval. +- Mimir `README.md`: public claims are intentionally limited and do not claim + production readiness, stable APIs, hosted service availability, benchmark + superiority, direct agent writes, or ungoverned cross-project promotion. 
+- Mimir `docs/launch-readiness.md`: release/tag/publish state remains + pre-release; docs.rs and crates.io publishing wait for release workflow. +- Mimir green-room verifier + `.agents/specs/2026-04-30-green-room-product-evaluation/VERIFICATION.md`: + `docs/launch-posting-plan.md` is missing, and stale references were found in + `AGENTS.md`, `STATUS.md`, `README.md`, `docs/README.md`, and + `docs/launch-readiness.md`. +- Command: `git -C Mimir log --oneline --decorate -n 5` from the workspace + root showed `4d38614 Delete docs/launch-posting-plan.md (#16)` immediately + before current `HEAD` `9e81c0f`, so the missing file was deliberately deleted + in repository history. +- Command: + `test -e docs/launch-posting-plan.md; printf 'docs/launch-posting-plan.md exists: %s\n' "$?"` + from `Mimir` returned `docs/launch-posting-plan.md exists: 1`, confirming the + file does not exist in the current worktree. +- Command: + `rg -n "launch-posting-plan|posting plan|launch posting|Publishing plan" .` + from `Mimir` found current references in: + - `AGENTS.md` + - `STATUS.md` + - `README.md` + - `docs/README.md` + - `docs/launch-readiness.md` + - `CHANGELOG.md` +- The same search shows `CHANGELOG.md` mentions launch posting assets in the + Unreleased historical change summary but does not link to + `docs/launch-posting-plan.md`. +- Command: `git -C Mimir status --short --branch --untracked-files=all` showed + branch `main...origin/main`, existing modified `.gitignore` and `AGENTS.md`, + and many untracked `.agents/`, `.claude/`, `CLAUDE.md`, and `WORKFLOW.md` + files. Those existing changes predate this spec and must be preserved. + +## 5. Desired Behavior + +After implementation: + +- No current public or operating document points to the missing + `docs/launch-posting-plan.md` path. +- `docs/launch-posting-plan.md` remains absent. 
+- The public docs continue to point users to existing launch/release status + surfaces, primarily `STATUS.md`, `docs/README.md`, + `docs/launch-readiness.md`, `RELEASING.md`, and the existing launch article + only where those links already exist or are directly relevant. +- Any replacement wording is descriptive and non-promissory. Public-facing + replacements may say that release, tag, publish, listing, and announcement + steps remain pending or are handled separately by the normal release process. + They must not introduce internal terms such as `release-pr`, `agent-control`, + or `BES fleet`, and must not provide new channel strategy, marketing copy, + publish order, launch timing, ownership/approval language, or + public-readiness claims. +- If an implementer believes a user-facing sentence needs subjective marketing + or launch-positioning judgment, the implementer must stop and mark that + sentence owner-blocking instead of inventing wording. + +## 6. Domain Model / Contract + +- `docs/launch-posting-plan.md`: deleted public doc. It is not an implementation + target and must not be restored by this cleanup. +- Stale reference: any link, path mention, index row, status row, or active + current-state claim that directs a reader to `docs/launch-posting-plan.md` or + says the missing plan is the active launch/posting/listing plan. +- Redirect: replacing a stale reference with a link to an existing current + document that already carries the relevant authority, without adding new + launch strategy. Acceptable redirect targets are: + - `STATUS.md` for current state and release tags. + - `docs/launch-readiness.md` for OSS readiness and promise audit. + - `RELEASING.md` for release mechanics, if the surrounding context is release + workflow rather than launch announcement. + - `docs/blog/2026-04-28-agent-memory-compiler-pipeline.md` only for the + existing public article reference, not as a posting plan replacement. 
+- Removal: deleting the stale bullet, table entry, or sentence when no existing + public doc is an objective replacement. +- Historical changelog entry: a past-tense release-note statement that does not + link to the missing file. It may remain unchanged unless implementation + evidence shows it is now misleading as a current-state claim. + +## 7. Interfaces And Files + +Expected implementation touch points: + +- `AGENTS.md`: remove or redirect the `Where to Look` row that currently + includes `docs/launch-posting-plan.md`. +- `STATUS.md`: remove or redirect the `References` bullet for + `docs/launch-posting-plan.md`. +- `README.md`: remove or redirect the `Documentation` bullet for + `docs/launch-posting-plan.md`. +- `docs/README.md`: remove or redirect the `Start Here` bullet and the + `Launch Execution` paragraph that reference `launch-posting-plan.md`. +- `docs/launch-readiness.md`: update the `Publishing plan` row so it does not + claim the missing file is done or authoritative. + +Files to inspect but not edit unless implementation evidence proves a current +stale reference: + +- `CHANGELOG.md`: current evidence shows only a historical Unreleased summary + mentioning posting assets, with no missing-file link. + +Files and directories out of scope: + +- `docs/launch-posting-plan.md` +- Product source files and tests. +- `Cargo.toml`, `Cargo.lock`, `.github/`, `RELEASING.md`, release workflows, + benchmark assets, and package metadata. +- `.agents/` files other than this spec, `.claude/`, `CLAUDE.md`, + `WORKFLOW.md`, `.gitignore`, git metadata, and tracker state. + +Public interfaces affected: + +- Public README/documentation navigation only. +- No CLI, API, storage, MCP, package, or release interface changes. + +## 8. Execution Plan + +1. Reconfirm the worktree state with + `git status --short --branch --untracked-files=all` and preserve all + pre-existing modified/untracked files. +2. 
Reconfirm the missing-file reference set with + `rg -n "launch-posting-plan|launch posting|posting plan|Publishing plan" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md`. +3. Edit only the expected implementation touch points listed in Section 7. +4. For each stale reference, either remove it or redirect it to an existing + current authority document using neutral wording. +5. Do not edit `CHANGELOG.md` unless the implementation search proves it + contains a current missing-file path or active-current-state claim rather + than historical release-note language. +6. Do not create `docs/launch-posting-plan.md`. +7. Run the acceptance commands in Section 10. +8. Report changed files, command results, unchanged pre-existing dirty files, + and any owner-blocking wording decisions encountered. + +## 9. Safety Invariants + +- Do not overwrite or revert pre-existing modifications in `AGENTS.md`, + `.gitignore`, `.agents/`, `.claude/`, `CLAUDE.md`, `WORKFLOW.md`, or any other + file. +- Do not stage, commit, push, open a PR, tag, publish, mutate tracker state, or + run public release workflows. +- Do not restore the deleted launch-posting plan. +- Do not introduce internal BES fleet details into public-facing docs beyond + already-existing local agent-control surfaces. +- Do not add AI attribution to docs, commits, release notes, or generated + output. +- Do not broaden this cleanup into broader launch-readiness, release-pr, + package, CI, benchmark, or public-announcement work. +- If implementation needs subjective launch messaging, owner review is required + before the wording is written. + +## 10. Test Plan + +Run from `/var/home/hasnobeef/buildepicshit/Mimir` after implementation: + +```bash +git status --short --branch --untracked-files=all +test ! -e docs/launch-posting-plan.md +! 
rg -n "docs/launch-posting-plan\\.md|\\(launch-posting-plan\\.md\\)|launch-posting-plan\\.md" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md +bash -lc 'rg -n "launch-posting-plan|launch posting|posting plan|Publishing plan" AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md || test $? -eq 1' +git diff --check -- AGENTS.md STATUS.md README.md docs/README.md docs/launch-readiness.md CHANGELOG.md +``` + +Expected results: + +- `git status` shows only the implementation files plus pre-existing dirty and + untracked files; no unrelated files appear. +- `test ! -e docs/launch-posting-plan.md` passes. +- The negative `rg` command returns no matches. +- The broad `rg` evidence command returns either no matches in current docs or + only explicitly reviewed non-stale historical language such as `CHANGELOG.md`. + Exit code 1 from no matches is acceptable; any other `rg` failure is not. +- `git diff --check` passes. + +Do not run cargo gates for this link-only cleanup unless implementation touches +Rust, Cargo, CI, release, package, benchmark, or generated documentation +surfaces. The full local gate remains mandatory before any future push or +release-pr work. + +Manual checks: + +- Open each changed diff hunk and confirm it either removes the stale reference + or redirects to an existing file. +- Confirm no replacement sentence invents a public release date, launch + channel strategy, posting copy, publish approval, or benchmark/performance + claim. +- Confirm `docs/launch-readiness.md` no longer marks the missing publishing + plan as `Done` evidence. + +## 11. Acceptance Criteria + +- [ ] `docs/launch-posting-plan.md` remains absent. +- [ ] No active current-state doc among `AGENTS.md`, `STATUS.md`, `README.md`, + `docs/README.md`, or `docs/launch-readiness.md` references + `docs/launch-posting-plan.md` or `launch-posting-plan.md`. 
+- [ ] Any remaining "posting plan", "launch posting", or "Publishing plan" + wording is either removed, redirected to an existing authority document, + or explicitly historical and non-actionable. +- [ ] `docs/launch-readiness.md` does not claim a missing publishing plan is + complete evidence. +- [ ] No new launch/posting plan, marketing strategy, release approval, tag, + publish, PR, or CI-spend action is introduced. +- [ ] Only files required by Section 7 are edited, except `CHANGELOG.md` may be + edited only if implementation evidence proves it contains a stale current + reference. +- [ ] Pre-existing dirty/untracked work is preserved. +- [ ] Acceptance commands pass or any failure is classified as expected, + new, or owner-blocking with exact output. +- [ ] Completion report lists files changed, commands run and results, + intentionally untouched files, residual risks, and any spec evidence + candidates. + +## 12. Rollback Plan + +Before commit or PR work, rollback is a normal file-level revert of only the +future implementation hunks in the touched public docs. Do not use +`git reset --hard` or broad checkout commands because the worktree already +contains unrelated owner/agent changes that must be preserved. + +If a later review rejects a redirect target, replace only that sentence or +bullet with a removal or owner-approved redirect. Do not restore +`docs/launch-posting-plan.md` as rollback. + +## 13. Open Questions + +- [ ] None before spec review. Owner has already decided to remove or redirect + stale launch-posting-plan references and not restore the plan. +- [ ] Owner-blocking during implementation: any proposed replacement wording + that requires subjective launch messaging, public positioning, channel + strategy, publish timing, or public-readiness judgment. + +## 14. 
Completion Report + +To be filled by the executor/verifier: + +- Files changed: +- Commands run: +- Verification result: +- Intentionally untouched: +- Residual risk: +- Spec evidence candidates: diff --git a/.agents/specs/SPEC.template.md b/.agents/specs/SPEC.template.md new file mode 100644 index 0000000..187252c --- /dev/null +++ b/.agents/specs/SPEC.template.md @@ -0,0 +1,107 @@ +--- +id: replace-with-short-stable-id +status: draft +owner: HasNoBeef +repo: replace-with-repo-name +branch_policy: worktree-preferred +risk: low +requires_network: false +requires_secrets: [] +acceptance_commands: [] +--- + +# SPEC: Replace With Task Name + +## 1. Problem + +Describe the problem in concrete terms. Include observed behavior, affected +users, affected files, and why the change matters now. + +## 2. Goals + +- Goal 1. +- Goal 2. + +## 3. Non-Goals + +- Explicitly excluded scope. +- Related work deferred to another spec. + +## 4. Current System Facts + +List only verified facts. Cite files, docs, command output, issues, PRs, or +owner statements. + +- `path/to/file`: fact. +- Command: `example command` -> observed result. + +## 5. Desired Behavior + +State the target behavior in terms an executor can implement and a verifier can +test. + +## 6. Domain Model / Contract + +Define entities, states, schemas, invariants, inputs, outputs, or file formats +the implementation must preserve. + +## 7. Interfaces And Files + +Expected touch points: + +- `path/to/file` +- `path/to/other-file` + +Public interfaces affected: + +- CLI/API/tool/user workflow. + +## 8. Execution Plan + +1. Step one. +2. Step two. +3. Step three. + +## 9. Safety Invariants + +- Invariant that must remain true. +- Files or directories that must not be touched. +- Destructive actions that require explicit approval. + +## 10. Test Plan + +Commands: + +```bash +# fill in repo-specific verification +``` + +Manual checks: + +- Check 1. + +## 11. Acceptance Criteria + +- [ ] Behavior matches Desired Behavior. 
+- [ ] Tests pass. +- [ ] Docs or operating instructions updated if needed. +- [ ] No unrelated changes. +- [ ] Completion report includes verification output. + +## 12. Rollback Plan + +Describe how to revert or disable the change safely. + +## 13. Open Questions + +- [ ] Question that must be answered before approval. + +## 14. Completion Report + +To be filled by the executor/verifier: + +- Files changed: +- Commands run: +- Verification result: +- Residual risk: +- Spec evidence candidates: diff --git a/.agents/workflows/author-spec.md b/.agents/workflows/author-spec.md new file mode 100644 index 0000000..ca1815d --- /dev/null +++ b/.agents/workflows/author-spec.md @@ -0,0 +1,12 @@ +# Author Spec + +Use this workflow to turn owner intent into an executable `SPEC.md`. + +1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present, and + relevant docs. +2. Inspect the codebase before proposing implementation details. +3. Use `.agents/specs/SPEC.template.md`. +4. Fill `Current System Facts` with cited facts only. +5. Keep implementation steps concrete enough for another agent to execute. +6. Mark unresolved decisions as open questions instead of guessing. +7. Stop at the spec unless the owner explicitly approves implementation. diff --git a/.agents/workflows/execute-spec.md b/.agents/workflows/execute-spec.md new file mode 100644 index 0000000..27ca609 --- /dev/null +++ b/.agents/workflows/execute-spec.md @@ -0,0 +1,13 @@ +# Execute Spec + +Use this workflow only after a `SPEC.md` is approved. + +1. Re-read the approved spec and repo instructions. +2. Confirm the worktree or branch state. +3. Implement only the approved scope. +4. Preserve unrelated user changes. +5. Update directly coupled tests/docs only when required by the spec. +6. Run the spec acceptance commands. +7. Prepare the completion report requested by the spec. + +If new facts materially change scope, stop and request a spec update. 
diff --git a/.agents/workflows/orient.md b/.agents/workflows/orient.md new file mode 100644 index 0000000..c0a50f9 --- /dev/null +++ b/.agents/workflows/orient.md @@ -0,0 +1,11 @@ +# Orient + +Use this workflow before planning or editing. + +1. Read `AGENTS.md`. +2. Read `WORKFLOW.md`, `CLAUDE.md`, `STATUS.md`, and linked docs when present. +3. Run `git status --short --branch`. +4. Identify current branch, local changes, likely files, and verification gate. +5. Report verified facts, risks, and open questions. + +Use the `repo-orientation` skill. diff --git a/.agents/workflows/release-pr.md b/.agents/workflows/release-pr.md new file mode 100644 index 0000000..762e4aa --- /dev/null +++ b/.agents/workflows/release-pr.md @@ -0,0 +1,13 @@ +# Release PR + +Use this workflow after implementation and verification. + +1. Confirm branch and worktree state. +2. Review `git diff` and `git status --short`. +3. Stage explicit files by path. +4. Commit with the repo's convention. +5. Prepare a PR body with summary, verification output, risk, and links. +6. Check CI only after local verification. +7. Clean worktrees and stale branches after merge according to repo rules. + +Use the `release-pr` skill. diff --git a/.agents/workflows/review-diff.md b/.agents/workflows/review-diff.md new file mode 100644 index 0000000..6c99472 --- /dev/null +++ b/.agents/workflows/review-diff.md @@ -0,0 +1,11 @@ +# Review Diff + +Use this workflow to review a local diff or PR. + +1. Read repo instructions and the relevant spec. +2. Inspect the diff before summarizing. +3. Prioritize bugs, regressions, missing tests, and broken repo contracts. +4. Lead with findings ordered by severity. +5. If no findings exist, say so and list residual risk. + +Use the `code-review` skill. 
diff --git a/.agents/workflows/review-spec.md b/.agents/workflows/review-spec.md new file mode 100644 index 0000000..3a937a9 --- /dev/null +++ b/.agents/workflows/review-spec.md @@ -0,0 +1,17 @@ +# Review Spec + +Use this workflow before implementation. + +Check the target `SPEC.md` for: + +- Ambiguous goals or missing non-goals. +- Uncited system facts. +- Hidden architecture decisions. +- Missing safety invariants. +- Missing or non-runnable acceptance commands. +- Missing rollback plan. +- Open questions that block execution. +- Scope that conflicts with `AGENTS.md` or project docs. + +Return findings first, ordered by severity. If the spec is executable, say so +clearly and list any residual risk. diff --git a/.agents/workflows/spec-evidence.md b/.agents/workflows/spec-evidence.md new file mode 100644 index 0000000..65b2e07 --- /dev/null +++ b/.agents/workflows/spec-evidence.md @@ -0,0 +1,12 @@ +# Spec Evidence + +Use this workflow after substantial work or incident resolution. + +1. Review the completion report and verification output. +2. Identify durable lessons that are useful to future agents. +3. Exclude anything already covered by repo docs. +4. Record claim, scope, evidence, confidence, conflicts, and suggested spec or + delivery-authority route. +5. Do not write trusted shared memory directly. + +Use the `spec-evidence-governance` skill. diff --git a/.agents/workflows/symphony-dispatch-check.md b/.agents/workflows/symphony-dispatch-check.md new file mode 100644 index 0000000..1928e80 --- /dev/null +++ b/.agents/workflows/symphony-dispatch-check.md @@ -0,0 +1,12 @@ +# Symphony Dispatch Check + +Use this workflow before enabling or auditing autonomous dispatch. + +1. Confirm `WORKFLOW.md` exists in the runner cwd. +2. Validate tracker, workspace, hooks, agent, and Codex config sections. +3. Confirm Linear project slug and active/terminal states. +4. Confirm workspace isolation and concurrency limits. +5. 
Confirm completion reports include verification evidence and residual risk. +6. Do not dispatch if the target repo or acceptance gate is unclear. + +Use the `symphony-dispatch` skill. diff --git a/.agents/workflows/verify-spec.md b/.agents/workflows/verify-spec.md new file mode 100644 index 0000000..f464489 --- /dev/null +++ b/.agents/workflows/verify-spec.md @@ -0,0 +1,12 @@ +# Verify Spec + +Use this workflow after implementation. + +1. Inspect the diff against the approved spec. +2. Confirm no unrelated files changed. +3. Run all acceptance commands listed in the spec. +4. Run the repo's normal verification gate if different. +5. Check docs and instructions for consistency. +6. Report exact command results, residual risk, and Mimir memory candidates. + +Do not say complete unless verification ran in the current session. diff --git a/.claude/commands/author-spec.md b/.claude/commands/author-spec.md new file mode 100644 index 0000000..ca1815d --- /dev/null +++ b/.claude/commands/author-spec.md @@ -0,0 +1,12 @@ +# Author Spec + +Use this workflow to turn owner intent into an executable `SPEC.md`. + +1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present, and + relevant docs. +2. Inspect the codebase before proposing implementation details. +3. Use `.agents/specs/SPEC.template.md`. +4. Fill `Current System Facts` with cited facts only. +5. Keep implementation steps concrete enough for another agent to execute. +6. Mark unresolved decisions as open questions instead of guessing. +7. Stop at the spec unless the owner explicitly approves implementation. diff --git a/.claude/commands/execute-spec.md b/.claude/commands/execute-spec.md new file mode 100644 index 0000000..27ca609 --- /dev/null +++ b/.claude/commands/execute-spec.md @@ -0,0 +1,13 @@ +# Execute Spec + +Use this workflow only after a `SPEC.md` is approved. + +1. Re-read the approved spec and repo instructions. +2. Confirm the worktree or branch state. +3. Implement only the approved scope. +4. 
Preserve unrelated user changes. +5. Update directly coupled tests/docs only when required by the spec. +6. Run the spec acceptance commands. +7. Prepare the completion report requested by the spec. + +If new facts materially change scope, stop and request a spec update. diff --git a/.claude/commands/orient.md b/.claude/commands/orient.md new file mode 100644 index 0000000..c0a50f9 --- /dev/null +++ b/.claude/commands/orient.md @@ -0,0 +1,11 @@ +# Orient + +Use this workflow before planning or editing. + +1. Read `AGENTS.md`. +2. Read `WORKFLOW.md`, `CLAUDE.md`, `STATUS.md`, and linked docs when present. +3. Run `git status --short --branch`. +4. Identify current branch, local changes, likely files, and verification gate. +5. Report verified facts, risks, and open questions. + +Use the `repo-orientation` skill. diff --git a/.claude/commands/release-pr.md b/.claude/commands/release-pr.md new file mode 100644 index 0000000..762e4aa --- /dev/null +++ b/.claude/commands/release-pr.md @@ -0,0 +1,13 @@ +# Release PR + +Use this workflow after implementation and verification. + +1. Confirm branch and worktree state. +2. Review `git diff` and `git status --short`. +3. Stage explicit files by path. +4. Commit with the repo's convention. +5. Prepare a PR body with summary, verification output, risk, and links. +6. Check CI only after local verification. +7. Clean worktrees and stale branches after merge according to repo rules. + +Use the `release-pr` skill. diff --git a/.claude/commands/review-diff.md b/.claude/commands/review-diff.md new file mode 100644 index 0000000..6c99472 --- /dev/null +++ b/.claude/commands/review-diff.md @@ -0,0 +1,11 @@ +# Review Diff + +Use this workflow to review a local diff or PR. + +1. Read repo instructions and the relevant spec. +2. Inspect the diff before summarizing. +3. Prioritize bugs, regressions, missing tests, and broken repo contracts. +4. Lead with findings ordered by severity. +5. 
If no findings exist, say so and list residual risk. + +Use the `code-review` skill. diff --git a/.claude/commands/review-spec.md b/.claude/commands/review-spec.md new file mode 100644 index 0000000..3a937a9 --- /dev/null +++ b/.claude/commands/review-spec.md @@ -0,0 +1,17 @@ +# Review Spec + +Use this workflow before implementation. + +Check the target `SPEC.md` for: + +- Ambiguous goals or missing non-goals. +- Uncited system facts. +- Hidden architecture decisions. +- Missing safety invariants. +- Missing or non-runnable acceptance commands. +- Missing rollback plan. +- Open questions that block execution. +- Scope that conflicts with `AGENTS.md` or project docs. + +Return findings first, ordered by severity. If the spec is executable, say so +clearly and list any residual risk. diff --git a/.claude/commands/spec-evidence.md b/.claude/commands/spec-evidence.md new file mode 100644 index 0000000..65b2e07 --- /dev/null +++ b/.claude/commands/spec-evidence.md @@ -0,0 +1,12 @@ +# Spec Evidence + +Use this workflow after substantial work or incident resolution. + +1. Review the completion report and verification output. +2. Identify durable lessons that are useful to future agents. +3. Exclude anything already covered by repo docs. +4. Record claim, scope, evidence, confidence, conflicts, and suggested spec or + delivery-authority route. +5. Do not write trusted shared memory directly. + +Use the `spec-evidence-governance` skill. diff --git a/.claude/commands/symphony-dispatch-check.md b/.claude/commands/symphony-dispatch-check.md new file mode 100644 index 0000000..1928e80 --- /dev/null +++ b/.claude/commands/symphony-dispatch-check.md @@ -0,0 +1,12 @@ +# Symphony Dispatch Check + +Use this workflow before enabling or auditing autonomous dispatch. + +1. Confirm `WORKFLOW.md` exists in the runner cwd. +2. Validate tracker, workspace, hooks, agent, and Codex config sections. +3. Confirm Linear project slug and active/terminal states. +4. 
Confirm workspace isolation and concurrency limits. +5. Confirm completion reports include verification evidence and residual risk. +6. Do not dispatch if the target repo or acceptance gate is unclear. + +Use the `symphony-dispatch` skill. diff --git a/.claude/commands/verify-spec.md b/.claude/commands/verify-spec.md new file mode 100644 index 0000000..f464489 --- /dev/null +++ b/.claude/commands/verify-spec.md @@ -0,0 +1,12 @@ +# Verify Spec + +Use this workflow after implementation. + +1. Inspect the diff against the approved spec. +2. Confirm no unrelated files changed. +3. Run all acceptance commands listed in the spec. +4. Run the repo's normal verification gate if different. +5. Check docs and instructions for consistency. +6. Report exact command results, residual risk, and Mimir memory candidates. + +Do not say complete unless verification ran in the current session. diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 0000000..8af7fbe --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,3 @@ +{ + "autoMemoryEnabled": false +} diff --git a/.claude/skills/code-review/SKILL.md b/.claude/skills/code-review/SKILL.md new file mode 100644 index 0000000..bcc22d0 --- /dev/null +++ b/.claude/skills/code-review/SKILL.md @@ -0,0 +1,36 @@ +--- +name: code-review +description: Use for reviewing local diffs or PRs. Prioritizes bugs, regressions, missing tests, unsafe assumptions, and broken repo contracts over summaries. +--- + +# Code Review + +Use this when asked to review. + +## Review Focus + +- Correctness bugs. +- Behavioral regressions. +- Missing or weak tests. +- Security, privacy, or secret-handling risks. +- Broken architecture boundaries. +- Drift from `AGENTS.md`, approved specs, or public docs. +- Verification gaps. + +## Output + +Findings first, ordered by severity. 
Each finding should include: + +- file and line reference when available +- the concrete risk +- why the current change causes it +- a practical fix direction + +Then include open questions and a brief summary only after findings. + +## Hard Rules + +- If there are no findings, say that clearly and list residual risk. +- Do not lead with praise or broad summaries. +- Do not request stylistic churn unless it affects correctness, + maintainability, or repo contracts. diff --git a/.claude/skills/implementation-execution/SKILL.md b/.claude/skills/implementation-execution/SKILL.md new file mode 100644 index 0000000..3b4cbf4 --- /dev/null +++ b/.claude/skills/implementation-execution/SKILL.md @@ -0,0 +1,34 @@ +--- +name: implementation-execution +description: Use when implementing an approved BES SPEC.md. Keeps edits scoped, preserves user work, updates directly coupled tests/docs, and stops when new facts change scope. +--- + +# Implementation Execution + +Use only after a spec is approved by the owner or controlling workflow. + +## Steps + +1. Re-read the approved `SPEC.md`. +2. Re-read the repo `AGENTS.md` and relevant docs. +3. Confirm branch/worktree state with `git status --short --branch`. +4. Edit only files named by the spec or directly required by the change. +5. Add or update tests before or with production changes when behavior changes. +6. Keep unrelated refactors out of scope. +7. Run the spec acceptance commands. +8. Prepare the completion report requested by the spec. + +## Stop Conditions + +- New facts materially change scope. +- Required files contain unrelated local changes that make safe editing + ambiguous. +- Verification requires unavailable secrets or infrastructure. +- The spec's acceptance criteria are not testable. + +## Hard Rules + +- Preserve unrelated user changes. +- Do not silently expand scope. +- Do not bypass hooks, CI, or verification gates. +- Do not claim completion without fresh verification evidence. 
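Note on the `implementation-execution` skill above: steps 3 and 4 and the second stop condition amount to comparing the worktree state against the files the spec names. A minimal sketch of that comparison follows, assuming plain `git status --short --branch` output on stdin and the spec's file list as arguments. The function name `out_of_scope_changes` is hypothetical and not part of this patch, and the parsing ignores quoted paths.

```shell
# Hypothetical helper (not part of this patch): read `git status --short`
# output on stdin and print every changed path that is NOT listed in "$@".
# Assumes the short format "XY PATH"; quoted or unusual paths are not handled.
out_of_scope_changes() {
  while IFS= read -r status_line; do
    # Skip the "## branch...upstream" header and blank lines.
    case "$status_line" in
      '##'*|'') continue ;;
    esac
    # Drop the two status letters and the separating space.
    changed_path=${status_line#???}
    # For renames ("old -> new"), keep only the new path.
    changed_path=${changed_path##* -> }
    in_scope=no
    for allowed_path in "$@"; do
      if [ "$changed_path" = "$allowed_path" ]; then
        in_scope=yes
      fi
    done
    if [ "$in_scope" = no ]; then
      printf '%s\n' "$changed_path"
    fi
  done
}
```

One possible use is `git status --short --branch | out_of_scope_changes path/named/by/spec.rs`, where the argument is a placeholder for the spec's file list. Any output would be a reason to stop and ask rather than edit.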
diff --git a/.claude/skills/release-pr/SKILL.md b/.claude/skills/release-pr/SKILL.md new file mode 100644 index 0000000..c508d26 --- /dev/null +++ b/.claude/skills/release-pr/SKILL.md @@ -0,0 +1,27 @@ +--- +name: release-pr +description: Use when preparing commits, PRs, release handoff, or merge cleanup. Enforces explicit staging, conventional commits, PR evidence, and worktree hygiene. +--- + +# Release And PR + +Use when moving finished work toward review or merge. + +## Steps + +1. Confirm branch and tracking state. +2. Review `git status --short` and `git diff`. +3. Stage explicit files by path. +4. Use the repo's commit convention. +5. Write a PR body with summary, verification output, risk, and links. +6. Confirm CI/check status only after local verification is complete. +7. After merge, clean worktrees and stale local branches according to repo + instructions. + +## Hard Rules + +- No `git add .` unless explicitly approved for the batch. +- No AI attribution in commits, PRs, docs, or generated output. +- No force-push, branch deletion, hook bypass, or merge without approval when + the repo requires it. +- Do not burn CI minutes as a substitute for local verification. diff --git a/.claude/skills/repo-orientation/SKILL.md b/.claude/skills/repo-orientation/SKILL.md new file mode 100644 index 0000000..9b6ef2a --- /dev/null +++ b/.claude/skills/repo-orientation/SKILL.md @@ -0,0 +1,36 @@ +--- +name: repo-orientation +description: Use at the start of work in any BES repo to build a current, cited map of instructions, repo state, verification gates, active plans, and likely risk before editing. +--- + +# Repo Orientation + +Use this before planning or editing. + +## Steps + +1. Read the nearest `AGENTS.md`. +2. If present, read `CLAUDE.md`, `WORKFLOW.md`, `STATUS.md`, + `.agents/DOCUMENTATION_GUIDE.md`, and the docs linked by `AGENTS.md`. +3. Check git state with `git status --short --branch`. +4. 
Identify the active branch, tracking branch, untracked files, and unrelated + local changes. +5. Identify the repo's verification gate and any hook setup requirements. +6. Locate the task's likely files with `rg` and `rg --files`. +7. Report only verified facts. Cite files or command output. + +## Output + +- Target repo and branch. +- Source-of-truth docs read. +- Relevant files or directories. +- Verification commands. +- Documentation placement constraints for this task. +- Local changes that must be preserved. +- Open questions before implementation. + +## Hard Rules + +- Do not edit during orientation. +- Do not rely on memory when repo docs can answer the question. +- If instructions conflict, stop and report the conflict. diff --git a/.claude/skills/spec-driven-development/SKILL.md b/.claude/skills/spec-driven-development/SKILL.md new file mode 100644 index 0000000..57af240 --- /dev/null +++ b/.claude/skills/spec-driven-development/SKILL.md @@ -0,0 +1,45 @@ +--- +name: spec-driven-development +description: "Use when planning, reviewing, implementing, or verifying non-trivial work in BES repos. Enforces the BES spec-first operating model: author an executable SPEC.md, review it, implement only approved scope, verify with concrete commands, and route durable lessons into spec evidence." +--- + +# Spec-Driven Development + +Use this skill for non-trivial work in BES repos. + +## Workflow + +1. Read `AGENTS.md`, `CLAUDE.md` if present, `STATUS.md` if present, + `.agents/DOCUMENTATION_GUIDE.md` if present, and the relevant project docs. +2. Create or update a task spec from `.agents/specs/SPEC.template.md`. +3. Verify the spec has goals, non-goals, current facts with citations, + desired behavior, safety invariants, acceptance commands, rollback, and open + questions. +4. Do not implement until the spec is approved by the owner or controlling + workflow. +5. Execute only the approved spec. Stop if new facts materially change scope. +6. 
Run acceptance commands and the repo's normal verification gate. +7. Report files changed, commands run, verification output, residual risk, and + spec evidence candidates. + +## Hard Rules + +- Specs are executable contracts, not brainstorming notes. +- Raw memories and chat history are evidence only. +- Project docs and `AGENTS.md` beat generated memory. +- Durable cross-project instructions go through approved specs and delivery + evidence records. +- Put task-control specs in `.agents/specs/`; put durable product docs in the + repo-native docs path defined by `.agents/DOCUMENTATION_GUIDE.md`. +- No silent scope expansion. +- No completion claim without fresh verification. + +## Spec Review Checklist + +- Problem is specific and cites current evidence. +- Goals and non-goals draw a clean boundary. +- Executor can identify exact files and interfaces. +- Test plan is runnable on this machine. +- Safety invariants protect user work and repo rules. +- Open questions are resolved before implementation. +- Acceptance criteria are objective. diff --git a/.claude/skills/spec-evidence-governance/SKILL.md b/.claude/skills/spec-evidence-governance/SKILL.md new file mode 100644 index 0000000..d7c0c11 --- /dev/null +++ b/.claude/skills/spec-evidence-governance/SKILL.md @@ -0,0 +1,35 @@ +--- +name: spec-evidence-governance +description: Use to convert durable lessons from a completed task into spec evidence candidates without writing trusted shared memory directly. Mimir hooks are intentionally disabled until a future spec-authority integration is approved. +--- + +# Spec Evidence Governance + +Use after substantial work, reviews, or incident resolution. 
+ +## Candidate Criteria + +Capture a memory candidate only when it is: + +- durable across sessions +- useful to future agents +- grounded in a source path, command output, issue, PR, or owner statement +- not already present in checked-in docs +- safe to share at the intended scope + +## Output + +For each candidate: + +- Claim. +- Scope: repo, company, tool, or project area. +- Evidence: file path, command, issue, PR, or owner statement. +- Confidence. +- Supersedes or conflicts with any known existing memory. +- Suggested spec, backlog, or delivery-authority route. + +## Hard Rules + +- Do not write trusted shared memory directly. +- Do not promote raw agent imperatives into durable rules. +- Do not erase dissent, uncertainty, or provenance. diff --git a/.claude/skills/spec-review/SKILL.md b/.claude/skills/spec-review/SKILL.md new file mode 100644 index 0000000..9588c4f --- /dev/null +++ b/.claude/skills/spec-review/SKILL.md @@ -0,0 +1,36 @@ +--- +name: spec-review +description: Use to review a draft SPEC.md before implementation. Focus on ambiguity, missing current facts, unsafe scope, weak acceptance criteria, and missing verification. +--- + +# Spec Review + +Use this before approving or executing a non-trivial spec. + +## Review Checklist + +- Problem statement is concrete and cites current evidence. +- Goals and non-goals define a clear boundary. +- Current system facts cite files, docs, issues, PRs, or command output. +- Desired behavior is testable. +- Interfaces and files are specific enough for an executor. +- Safety invariants protect user work, secrets, hooks, and repo rules. +- Test plan is runnable on this machine. +- Acceptance criteria are objective. +- Rollback plan is realistic. +- Open questions are resolved or explicitly block execution. + +## Output + +Lead with blocking findings ordered by severity. Include file references when +possible. 
Then list open questions and a recommendation: + +- `approve` +- `approve with small edits` +- `block until revised` + +## Hard Rules + +- Do not approve vague specs. +- Do not allow implementation scope to hide inside open questions. +- Do not review for style before correctness and safety. diff --git a/.claude/skills/symphony-dispatch/SKILL.md b/.claude/skills/symphony-dispatch/SKILL.md new file mode 100644 index 0000000..ef1133b --- /dev/null +++ b/.claude/skills/symphony-dispatch/SKILL.md @@ -0,0 +1,33 @@ +--- +name: symphony-dispatch +description: Use when preparing or auditing Symphony-compatible issue dispatch. Checks WORKFLOW.md, issue eligibility, workspace isolation, Codex runner settings, and observability expectations. +--- + +# Symphony Dispatch + +Use when running or preparing autonomous worker dispatch. + +## Checklist + +- `WORKFLOW.md` exists in the runner cwd or an explicit workflow path is set. +- YAML front matter has `tracker`, `polling`, `workspace`, `hooks`, `agent`, + and `codex` sections. +- `tracker.kind` is `linear` and `tracker.api_key` resolves from + `LINEAR_API_KEY`. +- `tracker.project_slug`, active states, and terminal states match the board. +- `workspace.root` is absolute and outside product repo working trees. +- Hooks are documented; failures have the right abort/ignore behavior. +- `codex.command` is `codex app-server` unless a tested wrapper is in use. +- Concurrency is bounded for the machine and CI budget. +- Running workers use isolated branches or worktrees. +- Logs and completion reports include issue identifier, session, commands, and + verification evidence. + +## Hard Rules + +- Do not dispatch when the target repo is unclear. +- Do not dispatch multiple write-capable workers into the same worktree. +- Do not allow unsupported tool calls or user-input-required turns to stall + indefinitely. +- Treat Symphony as a trusted-environment runner unless stronger sandboxing is + explicitly configured. 
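Note on the `symphony-dispatch` checklist above: the `workspace.root` item ("absolute and outside product repo working trees") can be checked mechanically. The sketch below is one way to do it, assuming a purely lexical comparison of paths. The function name `workspace_root_ok` is hypothetical, and symlinks or `..` components are not resolved.

```shell
# Hypothetical check (not part of this patch): succeed only if $1 is an
# absolute path that is neither equal to nor nested under any of the repo
# working trees passed as the remaining arguments. Lexical comparison only.
workspace_root_ok() {
  ws_root=${1%/}
  shift
  # Reject relative paths (and the bare root directory).
  case "$ws_root" in
    /*) ;;
    *) return 1 ;;
  esac
  for repo_tree in "$@"; do
    repo_tree=${repo_tree%/}
    # The trailing slash keeps "/a/repo-ws" from matching "/a/repo".
    case "$ws_root/" in
      "$repo_tree"/*) return 1 ;;
    esac
  done
  return 0
}
```

A dispatch audit could call `workspace_root_ok "$root" "$repo"` and refuse to dispatch on a non-zero status. Both variables here are placeholders for values read from `WORKFLOW.md` and the local checkout.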
diff --git a/.claude/skills/verification/SKILL.md b/.claude/skills/verification/SKILL.md new file mode 100644 index 0000000..156c605 --- /dev/null +++ b/.claude/skills/verification/SKILL.md @@ -0,0 +1,33 @@ +--- +name: verification +description: Use before reporting done. Runs the narrowest relevant checks first, then the repo gate when warranted, and records fresh evidence plus residual risk. +--- + +# Verification + +Use before claiming work is complete. + +## Steps + +1. Read the spec acceptance commands and repo `AGENTS.md` verification section. +2. Run the narrowest relevant test or lint first. +3. Run the broader repo gate when the change touches shared behavior, + interfaces, CI, docs contracts, or release surfaces. +4. Capture command, result, and important output. +5. If a command fails, diagnose whether the failure is caused by the change, + existing repo state, missing dependency, sandbox/network limits, or secrets. +6. Re-run only after a meaningful fix or environment change. + +## Output + +- Commands run. +- Pass/fail result. +- Key output lines or summarized failures. +- Residual risk. +- Checks not run and why. + +## Hard Rules + +- Do not say "should pass" as verification. +- Do not hide failing checks. +- Do not spend CI minutes when local gates are required first. 
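Note on the `verification` skill above: steps 2 to 4 describe running checks from narrowest to broadest and recording each command with its result. A minimal sketch of that loop follows, assuming each check is passed as a single command string in narrowest-first order. The function name `run_checks` and the `RUN`/`PASS`/`FAIL` line format are illustrative, not something this patch defines.

```shell
# Hypothetical helper (not part of this patch): run each check in the order
# given (narrowest first), print the command and its result, and stop at the
# first failure so broader gates are not run on a known-bad change.
run_checks() {
  for check_cmd in "$@"; do
    printf 'RUN  %s\n' "$check_cmd"
    if sh -c "$check_cmd"; then
      printf 'PASS %s\n' "$check_cmd"
    else
      check_status=$?
      printf 'FAIL %s (exit %s)\n' "$check_cmd" "$check_status"
      return "$check_status"
    fi
  done
}
```

With the local gate from `AGENTS.md`, a call might look like `run_checks "cargo fmt --all -- --check" "cargo test --workspace"`. The printed lines can then be pasted into the completion report as fresh evidence.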
diff --git a/.gitignore b/.gitignore index 499b223..7548870 100644 --- a/.gitignore +++ b/.gitignore @@ -31,16 +31,10 @@ fuzz/coverage/ *.profraw *.profdata -# ----- Local agent/control-plane state (not committed) ----- -.agents/ -.claude/ -AGENTS.md -CLAUDE.md -WORKFLOW.md +# ----- Local user-specific agent state (not committed) ----- .claude/settings.local.json .codex .mcp.json -.mcp.example.json .mcp.local.json # ----- Mimir operator-local state (per-operator workspace data, never committed) ----- diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..41d35da --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,162 @@ +# AGENTS.md — Mimir Operating Manual + +> Cross-framework operating manual for Mimir following the [AGENTS.md](https://agents.md/) standard. Read automatically by Claude Code, Codex, Cursor, Copilot, Gemini CLI, and any agent framework supporting the standard. + +## BES Fleet Operating Model + +This repo participates in the BES spec-first agent fleet. The machine-level +contract is `/var/home/hasnobeef/buildepicshit/.agents/OPERATING_MODEL.md`; +repo-local copies of the shared spec template, workflows, and skills live under +`.agents/`. + +Documentation placement rules live in `.agents/DOCUMENTATION_GUIDE.md`. Read it +before creating, moving, archiving, or publishing docs/specs. In short: +`.agents/specs/` is for agent/Symphony task control; durable product docs live +in this repo's native docs path. + +Non-trivial work requires an approved executable `SPEC.md` before +implementation. Use `.agents/specs/SPEC.template.md`, then run the lifecycle: +orient, author spec, review spec, approve, execute, verify, report, and route +durable lessons to Mimir as governed evidence drafts. Raw Claude/Codex memories +are supporting evidence only; checked-in docs and this file remain authoritative. + +Claude must enter through `CLAUDE.md`, which imports this file. Codex and other +AGENTS-aware tools read this file directly. Keep both surfaces aligned. 
+ +Shared task skills live in `.agents/skills/`; Claude-native copies live in +`.claude/skills/`. Use `.agents/skills/repo-orientation` at task start, +`.agents/skills/spec-driven-development` for non-trivial work, +`.agents/skills/verification` before completion, and +`.agents/skills/spec-evidence-governance` only to propose evidence candidates. +Do not build from raw memory. Build from approved specs, repo docs, and fresh +verification evidence. + +> **CI quota constraint (HARD RULE — read before pushing).** This org has a **finite monthly GitHub Actions budget**. Extra Actions usage was added on 2026-04-27 and the owner approved re-enabling Actions for this repo, but every push to a tracked branch / PR still triggers a full matrix run (~30 runner-minutes for 8 jobs across 3 OSes). The 2026-04-20 session burned through the entire monthly quota in heavy iteration — do not repeat that pattern. +> +> **Verify locally before pushing.** Always run the full local gate before `git push`: +> +> ```bash +> cargo build --workspace +> cargo test --workspace +> cargo fmt --all -- --check +> cargo clippy --all-targets --all-features -- -D warnings +> ``` +> +> Other quota-conserving rules: +> - **Batch related changes into one push** instead of 4-5 small "fix the previous fix" commits. +> - **Do not push `--allow-empty` retry commits** — they consume the same minutes as a real run. +> - **Do not retry on transient infra hiccups** without first asking the owner. +> - If CI is currently disabled (`gh api /repos/buildepicshit/Mimir/actions/permissions` returns `enabled: false`), do not re-enable without asking. The owner-approved exception was 2026-04-27 after adding more Actions usage. +> - Dependabot is set to **monthly** cadence (not weekly) and groups all non-major updates into one PR per ecosystem per cycle, for the same quota reason. + +> **Naming history.** Public name `Mimir` (Norse: Mímir, the wise being whose preserved head Odin consulted for counsel). 
Pre-cutover codename was `engram`; the Mimir cutover happened 2026-04-20 (see [`.planning/planning/2026-04-19-roadmap-to-prime-time.md` § Naming + cutover history](.planning/planning/2026-04-19-roadmap-to-prime-time.md#naming--cutover-history)). Pre-cutover history lives in the archived `buildepicshit/Engram` repo; this repo's git history starts at the cutover commit. Use `mimir-*` everywhere in new code, prose, env vars, and tool names. + +## What Mimir Is + +Mimir is an experimental agent-first memory system. The public name refers to Mimir, the wise figure from Norse myth. The pre-cutover codename was `engram`, the neuroscience term for the physical substrate of a memory trace; that thesis still matters: agent memory should be stored in a form optimized for agent consumption, not human legibility. + +**Mandate update — 2026-04-24.** Mimir's mission is now a multi-agent memory governance/control plane. Claude, Codex, MCP clients, and future harnesses may all contribute memory drafts, but no agent writes trusted shared memory directly. The primary user entry point should be a transparent launch harness: `mimir [agent args...]` preserves the native agent UI while wrapping the session with Mimir memory, bootstrap, capture, and governance. Mimir ingests raw memories, cleans and validates them through the librarian, separates observations from instructions, files records by scope, and promotes reusable knowledge only through explicit provenance-preserving governance. Mimir may also orchestrate cross-agent, cross-model consensus quorums; those deliberation outputs are governed evidence drafts, not direct canonical memory writes. + +The design space Mimir explores: + +- **Agent-native IR.** Canonical storage in a bytecode-like format, tokenizer-aligned for Claude. Not markdown, not English, not JSON. Humans access through a decoder tool, never directly. 
+- **Librarian-mediated writes.** A single-writer gate enforces schema, symbol identity, supersession, and write conflicts. Agents never write to the canonical store directly. +- **Bifurcated reads.** Agents read the canonical store directly on the hot path. They escalate to the librarian on conflict, low confidence, or stale-symbol flag. +- **Compiler-style architecture (Roslyn analog).** The librarian is a compiler pipeline — lexer, parser, binder, semantic analyzer, emit. Deterministic code runs the pipeline; small ML only for semantic fuzziness (dedup, synonymy, supersession candidates). +- **Bi-temporal append-only store.** Four clocks per memory: `valid_at` / `invalid_at` / `committed_at` / `observed_at`. Supersession via edge invalidation — never in-place overwrite. +- **Symbol-tracking IR (Roslyn-grade).** Stable symbol identity across references. Rename propagation, alias chains, retirement flags. +- **Grounding-aware deterministic confidence decay.** Exponential; parameterized by `(memory-type × grounding × symbol-kind)`. Activity-weighted in v1. Pinning suspends. +- **Checkpoint-triggered write batches.** Writes happen at agent context-pressure events. Each checkpoint is one Episode (atomic rollback unit). +- **Scoped isolation with governed promotion.** Raw agent and project memories are isolated by default. Cross-project, operator-level, or ecosystem-level reuse happens only through explicit librarian promotion with provenance, scope, trust tier, and revocation. +- **Transparent agent harness.** Users launch normal agents through `mimir [agent args...]`; Mimir preserves native terminal flows while adding governed rehydration, capture, and draft submission. +- **Cross-agent consensus quorum.** Claude, Codex, and future adapters can be asked to reason over a question from explicit personas, critique each other, vote, preserve dissent, and emit a structured result. 
Quorum results enter Mimir as provenance-rich drafts or review artifacts, never as automatic truth. +- **DAG supersession.** Bi-temporal edge invalidation with four edge kinds (Supersedes / Corrects / StaleParent / Reconfirms). +- **Four memory types.** Semantic, episodic, procedural, inferential. Ephemeral tier alongside for intra-session state. + +**Current state:** see [`STATUS.md`](STATUS.md). + +## Architectural Invariants (Non-negotiable) + +These are load-bearing and not up for casual revision: + +1. **Librarian is the single writer.** Agents never write directly to the canonical store. +2. **Agent-native IR is not human-readable.** Inspection routes through a decoder tool; operability is a tooling concern, not a format concession. +3. **Append-only canonical store.** No in-place overwrite. Supersession via bi-temporal edge invalidation. +4. **Precision and consistency over speed and token cost.** Compute overhead is acceptable in exchange for determinism. +5. **Adapter-mediated agent surfaces.** Claude and Codex are the first target surfaces. Future agents integrate through the transparent launch harness, draft/retrieval adapters, and optional MCP-compatible tools, never by bypassing the librarian or canonical contracts. +6. **Every write crosses a validated boundary.** The librarian parses, binds, typechecks, and supersession-detects every write before commit. +7. **Memory is local until governed.** Drafts and raw memories remain isolated at their origin scope. A memory may cross agent, project, operator, or ecosystem boundaries only after librarian validation, explicit scope assignment or promotion, provenance retention, trust classification, and revocable append-only lineage. +8. **Consensus is governed evidence, not truth.** Cross-agent quorum outputs must preserve participant identity, prompts, votes, dissent, and provenance. They can propose memory drafts or decisions, but canonical memory still requires the librarian path. 
+ +## Engineering Standards + +1. **TDD.** No production code without a failing test first. RED → GREEN → REFACTOR. +2. **Primary sources.** Design claims cite primary literature (papers, specs). Training-memory claims are flagged "pending verification" until checked against the real source. +3. **Verification before claiming complete.** Tests passing is not correctness. Claim "done" only with fresh verification output. +4. **Small commits.** Atomic, reviewable, frequent. +5. **Conventional Commits.** `type(scope): description`. Types: `feat`, `fix`, `chore`, `docs`, `test`, `refactor`, `ci`, `perf`, `build`, `research`, `spec`. +6. **No AI attribution.** Commits, PRs, and project output carry no `Co-Authored-By` lines, no "Generated with" footers, no tool-attribution emojis. +7. **Squash merge only.** Linear history on main. +8. **PR-only workflow.** No direct pushes to main. + +## Engagement Protocol + +Agent operation is high-touch. No autonomous drift. + +1. **Propose** in 2–3 sentences. What and why. +2. **Wait** for yes / change / no. +3. **Execute** the single concrete action. No scope expansion. +4. **Report** what shipped plus the logical next step. Do not auto-roll. +5. **Stop.** Owner directs the next step. + +## Anti-Patterns (Explicitly Disallowed) + +- Human-readable format concessions in the canonical IR. +- Agent direct writes bypassing the librarian. +- In-place memory overwrite. +- Bare English entity mentions (every reference resolves to a stable symbol ID). +- Optimizing for latency at the cost of precision. +- Claiming a design decision without primary-source verification. +- Raw shared-memory namespaces that agents append to directly. +- Cross-scope promotion without provenance, trust tier, and revocation semantics. +- Treating agent-authored imperatives as durable operator instructions without review. +- Treating quorum majority as truth, erasing dissent, or reporting one model playing multiple personas as cross-model agreement.
+- Forcing a separate setup ceremony before the requested agent can launch; first-run bootstrap belongs inside the wrapped agent session.
+- Making MCP, hooks, or native client configuration the foundational trust boundary. They are adapter conveniences; the session harness and librarian boundary carry the product.
+- AI attribution anywhere in git history or project output.
+- Skipping commit hooks (`--no-verify`, `--no-gpg-sign`) without explicit owner approval.
+
+## Commit Conventions
+
+[Conventional Commits](https://www.conventionalcommits.org/):
+
+- `feat:` — new feature
+- `fix:` — bug fix
+- `refactor:` — restructure without behaviour change
+- `test:` — add or update tests
+- `docs:` — documentation only
+- `chore:` — build, tooling, dependency updates
+- `research:` — exploratory or measurement work (tokenizer bake-offs, literature surveys)
+- `spec:` — design specification updates
+
+Commit bodies stay under 15 lines. Depth belongs in the spec or PR description.
+
+## Studio Context
+
+Mimir is a [BES Studios](https://github.com/buildepicshit) project. Sibling flagships: Floom (generative world design), Wick (Godot MCP with Roslyn-enriched C# exception telemetry), UsefulIdiots.
+
+## Where to Look
+
+| Concern | Where |
+|---|---|
+| Current phase, next milestone | [`STATUS.md`](STATUS.md) |
+| Architectural invariants | this file |
+| Engineering principles & tooling policy | [`PRINCIPLES.md`](PRINCIPLES.md) |
+| Design specs | [`docs/concepts/`](docs/concepts/) — 14 authoritative implementation specs plus draft [`scope-model.md`](docs/concepts/scope-model.md) and [`consensus-quorum.md`](docs/concepts/consensus-quorum.md) |
+| Multi-agent mandate | [`docs/concepts/scope-model.md`](docs/concepts/scope-model.md) and [`docs/concepts/consensus-quorum.md`](docs/concepts/consensus-quorum.md) |
+| Transparent agent harness | [`README.md`](README.md#running-mimir) and [`docs/first-run.md`](docs/first-run.md) |
+| Observability schema | [`docs/observability.md`](docs/observability.md) |
+| Prior art attribution | [`docs/attribution.md`](docs/attribution.md) — primary-source verified |
+| Roadmap to v1.0 / public launch | [`STATUS.md`](STATUS.md), [`docs/launch-readiness.md`](docs/launch-readiness.md), and [`docs/launch-posting-plan.md`](docs/launch-posting-plan.md) |
+| Experimental measurement | `benchmarks/recovery/` for public benchmark assets; ignored local scratch stays out of the public tree |
+| Historical planning archive | [`.planning/planning/`](.planning/planning/) |
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..112857e
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,25 @@
+# CLAUDE.md — Mimir workspace guide
+
+@AGENTS.md
+
+## Claude Entry Protocol
+
+Use the BES spec-first model for non-trivial work. Start by reading
+`AGENTS.md`, `STATUS.md`, and the relevant concept docs.
+
+Project commands are available in `.claude/commands/`:
+
+- `/orient`
+- `/author-spec`
+- `/review-spec`
+- `/execute-spec`
+- `/verify-spec`
+- `/review-diff`
+- `/release-pr`
+- `/spec-evidence`
+- `/symphony-dispatch-check`
+
+Use the repo-local skills in `.claude/skills/` for orientation, spec work,
+implementation, verification, review, PR handoff, Symphony dispatch readiness,
+and spec evidence governance. Treat Claude memory as evidence only; Mimir's
+librarian path remains the product boundary for durable shared memory.
diff --git a/WORKFLOW.md b/WORKFLOW.md
new file mode 100644
index 0000000..ad32868
--- /dev/null
+++ b/WORKFLOW.md
@@ -0,0 +1,74 @@
+---
+tracker:
+  kind: linear
+  endpoint: https://api.linear.app/graphql
+  api_key: $LINEAR_API_KEY
+  project_slug: mimir
+  active_states:
+    - Todo
+    - In Progress
+    - In Review
+  terminal_states:
+    - Done
+    - Canceled
+    - Duplicate
+polling:
+  interval_ms: 30000
+workspace:
+  root: /var/home/hasnobeef/buildepicshit/.symphony/workspaces/Mimir
+hooks:
+  after_create: |
+    git clone git@github.com:buildepicshit/Mimir.git .
+  before_run: null
+  after_run: null
+  before_remove: null
+  timeout_ms: 60000
+agent:
+  max_concurrent_agents: 1
+  max_turns: 20
+  max_retry_backoff_ms: 300000
+codex:
+  command: codex app-server
+  approval_policy: on-request
+  thread_sandbox: workspace-write
+  turn_timeout_ms: 3600000
+  read_timeout_ms: 5000
+  stall_timeout_ms: 300000
+bes:
+  repo: Mimir
+  default_branch: main
+  canonical_verify: cargo build --workspace && cargo test --workspace && cargo fmt --all -- --check && cargo clippy --all-targets --all-features -- -D warnings
+---
+
+# Mimir Workflow
+
+You are working on Mimir under the BES spec-first model.
+
+## Issue
+
+- Identifier: `{{ issue.identifier }}`
+- Title: `{{ issue.title }}`
+- State: `{{ issue.state }}`
+- Priority: `{{ issue.priority }}`
+- URL: `{{ issue.url }}`
+- Attempt: `{{ attempt }}`
+
+## Required Procedure
+
+1. Read `AGENTS.md`, `WORKFLOW.md`, `.agents/DOCUMENTATION_GUIDE.md`,
+   `STATUS.md`, and the relevant concept docs.
+2. For non-trivial work, create or update an executable `SPEC.md` from
+   `.agents/specs/SPEC.template.md`.
+3. Preserve librarian-mediated writes, append-only canonical storage, and
+   provenance-preserving memory governance.
+4. Verify locally before pushing to protect CI quota.
+5. Report files changed, commands run, verification result, residual risk, and
+   spec evidence candidates.
+
+## Safety
+
+- Do not write trusted shared memory directly.
+- Do not bypass the librarian boundary.
+- Do not push before running the local gate unless the owner explicitly says
+  the CI budget tradeoff is acceptable.
+- Do not treat quorum majority as truth.