Meta
type: EngineeringScaffold
stage: draft
maturity: L1
created: 2026-05-10
inputs:
- "OpenAI harness engineering (2026-02) — golden principles, doc-gardening agent, recurring cleanup"
- "#53 — architecture fitness linting"
- "#33 — ADR template (knowledge base foundation)"
related: ["#53", "#29", "#33"]
Purpose. Prevent architectural drift and knowledge rot through a system of codified "golden principles" and a recurring lightweight agent-based drift correction process.
Context
OpenAI's harness engineering article states the problem directly:
"Full agent autonomy also leads to novel problems. Codex replicates patterns that already exist in the repository — including uneven or suboptimal ones. Over time, this inevitably leads to drift."
"We began encoding what we call 'golden principles' directly into the repository and building a recurring cleanup process. These are opinionated, mechanical rules that keep the codebase readable and consistent for future agent runs."
"At regular intervals we run a series of Codex background tasks that search for deviations, update quality scores, and open targeted refactoring PRs. Most of these can be reviewed and auto-merged in under a minute."
"Technical debt is a high-interest loan: it's almost always better to pay it down continuously in small increments rather than letting it accumulate and then addressing it in painful bursts."
The runtime starts small but will accumulate agent-generated code rapidly once the SAO (#43) is operational. Encoding the drift correction process early prevents it from becoming a manual "Friday cleanup" burden.
Golden principles (initial set)
These are the codified invariants that form the runtime's quality baseline. Each is machine-checkable.
| Principle | Enforcement |
|---|---|
| Prefer shared utility functions over per-file helpers | Lint: detect local function duplication across ≥3 files |
| Parse at boundaries, trust internally | Lint: Zod/schema validation only at port boundaries; no ad-hoc JSON.parse in domain code |
| Every exported type has a test | Structural test: exported types appear in at least one test file |
| Dead exports are removed | CI: ts-prune or equivalent detects unused exports |
| Docs reference real code | CI: docs cross-link validator checks referenced symbols exist |
| ADR per major decision | Process: any merged PR that changes a port, layer boundary, or public type must reference an ADR |
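As an illustration of how one of these principles could be checked mechanically, the sketch below approximates the "Prefer shared utility functions over per-file helpers" rule. It is a heuristic under stated assumptions, not the lint from #53: it compares helpers by name only (not by body), it only sees `function` declarations that start a line, and `SourceFiles`, `LOCAL_FN`, and `findDuplicatedHelpers` are hypothetical names introduced here.

```typescript
// Hypothetical input shape: file path -> file contents.
type SourceFiles = Record<string, string>;

// Matches non-exported `function name` declarations at the start of a line.
// Lines beginning with `export` do not match because the pattern is anchored.
const LOCAL_FN = /^[ \t]*(?:async[ \t]+)?function[ \t]+([A-Za-z_$][\w$]*)/gm;

/** Returns helper names declared locally in at least `threshold` files. */
function findDuplicatedHelpers(
  files: SourceFiles,
  threshold = 3,
): Map<string, string[]> {
  const filesByName = new Map<string, Set<string>>();

  for (const [path, source] of Object.entries(files)) {
    for (const match of source.matchAll(LOCAL_FN)) {
      const name = match[1];
      const seen = filesByName.get(name) ?? new Set<string>();
      seen.add(path); // a Set counts each file once per helper name
      filesByName.set(name, seen);
    }
  }

  const duplicated = new Map<string, string[]>();
  for (const [name, paths] of filesByName) {
    if (paths.size >= threshold) {
      duplicated.set(name, [...paths].sort());
    }
  }
  return duplicated;
}
```

A real rule would more likely compare normalized function bodies through an AST, since identical names do not guarantee identical behaviour; the name-based version is only meant to show the shape of the check.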
Doc-gardening process
Inspired by OpenAI's recurring doc-gardening agent:
What it checks:
- ADRs that reference superseded decisions (detect status: accepted ADRs whose implementation is no longer present)
- docs/ files that reference types, modules, or files that no longer exist
- Issue cross-references in related: frontmatter that point to closed or renamed issues
- AGENTS.md entries that reference outdated section numbers
How it runs:
- Triggered on a schedule (weekly) or manually
- Runs as a CI job that outputs a report; optionally opens targeted fix PRs
- Each finding includes: file, line, problem description, suggested fix — agent-actionable format
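The finding format and one of the checks listed above (docs that reference files that no longer exist) could be sketched as follows. This is a minimal illustration, not the actual gardening job: the `Finding` field names, the `PATH_REF` pattern, and the assumption that docs mention paths rooted at `src/`, `docs/`, or `specs/` are all hypothetical. The caller is assumed to supply the doc contents and the set of paths that currently exist.

```typescript
// One agent-actionable finding; the fields mirror the list above.
interface Finding {
  file: string;
  line: number; // 1-based line number within `file`
  problem: string;
  suggestedFix: string;
}

// Assumption: docs mention repository paths rooted at src/, docs/, or specs/.
// The trailing \w keeps sentence punctuation out of the captured path.
const PATH_REF = /\b(?:src|docs|specs)\/[\w./-]*\w/g;

function findDeadPathReferences(
  docs: Record<string, string>,
  existingPaths: ReadonlySet<string>,
): Finding[] {
  const findings: Finding[] = [];

  for (const [file, text] of Object.entries(docs)) {
    const lines = text.split(/\r?\n/);
    lines.forEach((lineText, index) => {
      for (const match of lineText.matchAll(PATH_REF)) {
        const ref = match[0];
        if (existingPaths.has(ref)) continue;
        findings.push({
          file,
          line: index + 1,
          problem: `Reference to missing path: ${ref}`,
          suggestedFix: `Update or remove the reference to ${ref}`,
        });
      }
    });
  }
  return findings;
}
```

Returning plain data rather than printing keeps the same function usable for both the report output and any follow-up step that opens fix PRs.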
Quality score tracking
Maintain a docs/QUALITY.md that scores each major component area:
## Quality Score — 2026-05-10
| Area | Score | Trend | Notes |
|---|---|---|---|
| Runtime kernel | 85 | ↑ | #14 ratified; awaiting Hello World validation |
| Data model | 80 | → | #15 stable; LLMRequest/Response shapes L2 |
| Orchestrator engine | 50 | → | #43 in design; no implementation yet |
| Engineering scaffold | 70 | ↑ | #53 lint rules added |
| Docs coverage | 60 | ↑ | ADR template landed; 3 ADRs pending |
Updated by the doc-gardening run and by humans when significant work lands.
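One way a gardening run might regenerate the table is sketched below. The trend rule is an assumption made for illustration (compare against the previous score; ↑ for higher, ↓ for lower, → for unchanged or new areas); the source only shows ↑ and → and does not define how trends are derived. `AreaScore`, `trendArrow`, and `renderQualityTable` are hypothetical names.

```typescript
interface AreaScore {
  area: string;
  score: number; // 0-100, as in the table above
  notes: string;
}

// Assumed rule: compare with the previous run; new areas count as flat.
function trendArrow(previous: number | undefined, current: number): string {
  if (previous === undefined || current === previous) return "→";
  return current > previous ? "↑" : "↓";
}

/** Renders the header and one row per area in the table format used above. */
function renderQualityTable(
  current: AreaScore[],
  previous: ReadonlyMap<string, number>,
): string {
  const rows = current.map(
    ({ area, score, notes }) =>
      `| ${area} | ${score} | ${trendArrow(previous.get(area), score)} | ${notes} |`,
  );
  return ["| Area | Score | Trend | Notes |", "|---|---|---|---|", ...rows].join("\n");
}
```

Keeping the previous scores as an explicit input means the trend column is reproducible from two snapshots rather than edited by hand.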
AGENTS.md as a map, not an encyclopedia
Per the harness engineering article, AGENTS.md must be a short (~100 line) table of contents, not an instruction manual:
"Instead of treating AGENTS.md as an encyclopedia, we treat it as the table of contents."
"Context is a scarce resource. A massive instruction file displaces the actual task, the code, and the relevant docs — causing the agent to either miss important constraints or begin optimising for the wrong ones."
For this repository, AGENTS.md should:
- Be ≤ 150 lines
- Reference docs/ARCHITECTURE.md, docs/QUALITY.md, docs/adr/, and specs/ by path
- Include a short "start here" map for the three most common agent task types: (1) adding a feature, (2) fixing a bug, (3) updating a design document
- Be enforced by a CI check: fail if AGENTS.md exceeds 150 lines
The docs/ directory is the system of record. AGENTS.md is the entry point.
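The size limit and the required path references could be checked with something like the sketch below. It is a sketch under assumptions: the function operates on file contents passed in by the caller, and `checkAgentsFile`, `countLines`, and the result shape are hypothetical names, not an existing CI step in this repository.

```typescript
const MAX_AGENTS_LINES = 150;

// Paths that AGENTS.md is expected to reference, per the list above.
const REQUIRED_REFERENCES = [
  "docs/ARCHITECTURE.md",
  "docs/QUALITY.md",
  "docs/adr/",
  "specs/",
];

function countLines(text: string): number {
  if (text.length === 0) return 0;
  const lines = text.split(/\r?\n/);
  // A trailing newline terminates the last line; it does not start a new one.
  return lines[lines.length - 1] === "" ? lines.length - 1 : lines.length;
}

interface AgentsCheckResult {
  lines: number;
  tooLong: boolean;
  missingReferences: string[];
}

function checkAgentsFile(
  text: string,
  maxLines: number = MAX_AGENTS_LINES,
): AgentsCheckResult {
  const lines = countLines(text);
  return {
    lines,
    tooLong: lines > maxLines,
    missingReferences: REQUIRED_REFERENCES.filter((ref) => !text.includes(ref)),
  };
}
```

A CI wrapper would read AGENTS.md, call `checkAgentsFile`, and exit non-zero when `tooLong` is true or `missingReferences` is non-empty.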
Acceptance
- docs/QUALITY.md template created and populated with initial scores
- AGENTS.md size limit enforced by CI
- AGENTS.md refactored to map pattern (≤ 150 lines, references to docs/)