Merged
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,26 @@
# Changelog

## 0.27.0 — 2026-06-08

### fix: ship + deploy the dangling P1/P2/P6/P7 hook scripts; scaffold METALAYER + schemas; document the two-flow bootstrap (BRO-1431)

Surfaced by building a from-scratch RCS/RSI template (rcs-template) as a gap-discovery probe for bstack's own bootstrap. Found a real safety bug plus two scaffold omissions, and the absence of a generative flow.

### Fixed

- **Dangling hook scripts (safety bug).** `settings.json.snippet` wired the P1 (conversation-bridge), P2 (control-gate), P6 (knowledge-catalog-refresh), and P7 (skill-freshness) hooks at `${BROOMVA_WORKSPACE}/scripts/*.sh`, but bstack shipped none of those scripts and bootstrap never copied them. Result: every workspace bootstrapped anywhere but the bstack origin had a **non-functional control gate (P2)** — the safety shield silently no-op'd because Claude Code invoked a script that did not exist. `doctor §7` detected the gap but nothing closed it. Now bstack **ships** all four (`scripts/{control-gate,skill-freshness,conversation-bridge,knowledge-catalog-refresh}-hook.sh`, self-contained + `$CLAUDE_PROJECT_DIR`-portable) and **deploys** them into `$WORKSPACE/scripts/` — `bootstrap` Phase 3.1 and `repair`'s `deploy_workspace_hooks` (idempotent, never overwrites). Dogfood proof: on a fresh workspace with no bstack on PATH, the deployed control-gate blocks `git push --force` (exit 2) and allows `git status` (exit 0).
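The exit-code contract in the dogfood proof (exit 2 blocks, exit 0 allows) can be sketched as follows. This is an illustration only: the real `control-gate-hook.sh` is not shown in this changelog, and the deny-pattern list and the function name `gate_exit_code` are hypothetical.

```python
import re

# Hypothetical deny list; the real control-gate policy is not shown here.
DENY_PATTERNS = [
    re.compile(r"\bgit\s+push\b.*(--force\b|\s-f\b)"),
]


def gate_exit_code(command: str) -> int:
    """Return 2 to block a command and 0 to allow it."""
    for pattern in DENY_PATTERNS:
        if pattern.search(command):
            return 2
    return 0
```

A gate that is wired but missing on disk never reaches this decision at all, which is why the missing scripts amounted to a disabled shield rather than a permissive one.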

### Added

- **`bootstrap` Phase 2 now scaffolds `METALAYER.md` + `schemas/{state,action,trace,evaluator,egri-event}.schema.json`** (the control-systems manifest + typed interfaces). Templates added under `assets/templates/` + `assets/templates/schemas/`. Previously only CLAUDE/AGENTS/policy/arcs scaffolded; the manifest and typed contract were omitted.
- **Two-flow bootstrap doc (SKILL.md).** Names and sequences the **structured flow** (deterministic scaffold — the lossless floor, no LLM) → **generative flow** (agent-authored, workspace-tailored setup — bespoke, to-the-ceiling) → **verify** (`bstack doctor` gates both). Mirrors the P18 Audience Category-B-projection vs Category-C-generative split, applied to workspace setup. Invariant: never wire a hook whose script isn't deployed.

### Notes

- Primitive count unchanged (**20**). This is a bootstrap correctness + completeness fix, not a new P-row.
- Hook resolution is now consistent for the workspace-resolved hooks (all four are deployed into `$WORKSPACE/scripts/` and are self-contained on a fresh clone); the L0/L1 audit hooks remain `$BSTACK_REPO`-referenced, which already resolved correctly.
- `VERSION` 0.26.0 → 0.27.0.

## 0.26.0 — 2026-06-05

### feat: `bstack skills audit --require-tests` — skill-script test gate (BRO-1411 slice 2)
35 changes: 35 additions & 0 deletions SKILL.md
@@ -18,6 +18,8 @@ bstack ships two complementary layers:

Installing the substrate without the mode = the workspace has primitives but no entry point to engage them. Invoking the mode without the substrate = wishful thinking. Compounded: `/bstack bootstrap` installs the substrate, then `/autonomous` is the standing operating mode for substantive work units.

Bootstrap itself is **two-flow** — a deterministic structured scaffold (the floor) plus an agent-authored generative tailoring pass (the bespoke layer). See [Two-flow workspace setup](#two-flow-workspace-setup-structured--generative).

## Quick start

Install:
@@ -249,6 +251,39 @@ Future sessions inspect this for state. `bootstrap_status: failed` is captured t

**Self-application**: when `/bstack bootstrap` is invoked in an existing workspace, the bootstrap itself runs under `/autonomous` discipline — state snapshot, dep-chain trace, validation plan, PR pipeline. The bootstrap that installs the discipline embodies the contract it ships.

**Two-flow**: bootstrap is not a single deterministic pass. The structured scaffold above is the *floor*; the agent then runs a generative tailoring pass on top of it. See [Two-flow workspace setup](#two-flow-workspace-setup-structured--generative) below for the full model and canonical sequence.

### Two-flow workspace setup (structured + generative)

`bstack bootstrap` is a *two-flow* operation, not a single deterministic pass. This mirrors the Audience (P18) split bstack already applies to documents — a deterministic Category-B *projection floor* plus a context-aware Category-C *bespoke authoring* layer — now applied to workspace setup itself.

```
bstack bootstrap  =  STRUCTURED flow           →  GENERATIVE flow               →  VERIFY
                     (deterministic, no LLM)      (agent-authored, contextual)     (deterministic)
                     scripts + templates          tailoring of THIS workspace      bstack doctor
                     + shipped+deployed hooks
                     + gates + .control
```

**1. Structured flow (the floor — reproducible, no LLM).** `bstack bootstrap` runs the idempotent scaffold: installs skills; scaffolds governance from `assets/templates/*` (CLAUDE.md, AGENTS.md, METALAYER.md, `.control/policy.yaml`, `.control/arcs.yaml`, `.control/rcs-parameters.toml`, `schemas/`); **deploys** the hook scripts into the workspace (control-gate / skill-freshness / conversation-bridge / knowledge-catalog-refresh + the L0/L1 audit hooks); wires `.claude/settings.json`; installs the L3 rate gate + CI gate. This flow must be COMPLETE and CORRECT — every wired hook must have a backing script deployed (no dangling references). It is the lossless baseline: same inputs → same workspace, every time.
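The "idempotent, never overwrites" deploy step can be sketched like this. It is a minimal illustration under assumptions, not the actual `bootstrap` or `repair` code: the function name `deploy_hooks` and the `*-hook.sh` glob are hypothetical stand-ins for whatever the shipped scripts do.

```python
import shutil
from pathlib import Path


def deploy_hooks(source_dir: Path, workspace: Path) -> list:
    """Copy *-hook.sh files into <workspace>/scripts, skipping existing ones."""
    target_dir = workspace / "scripts"
    target_dir.mkdir(parents=True, exist_ok=True)
    deployed = []
    for script in sorted(source_dir.glob("*-hook.sh")):
        target = target_dir / script.name
        if target.exists():
            continue  # never overwrite: a file already in the workspace wins
        shutil.copy2(script, target)
        # Ensure the deployed copy is executable.
        target.chmod(target.stat().st_mode | 0o111)
        deployed.append(script.name)
    return deployed
```

Running the function a second time copies nothing and returns an empty list, which is the property that makes the structured flow safe to re-run.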

**2. Generative flow (the bespoke layer — agent-authored, to-the-ceiling).** After the structured scaffold, the agent does a context-aware pass that templates cannot produce. Concretely the agent:

a. **Detects the stack + project intent** — signals: language/build files, existing code, README, the user's stated goal.
b. **Tailors the scaffolded governance prose to THIS project** — rewrites the generic CLAUDE.md / AGENTS.md placeholders into project-specific invariants, conventions, and architecture notes (not generic template text).
c. **Authors a project-specific CI workflow** — the structured flow ships the L3-stability gate; the agent generates the test/lint/build job that matches the detected stack.
d. **Fills the Dogfood Plan (Empirical, P11)** with the real entry surfaces + evidence anchors for this project's stack (per [references/dogfood-patterns.md](references/dogfood-patterns.md)).
e. **For RCS/RSI or control-systems repos** — optionally lays down a runnable L0–L3 substrate + a HIERARCHY/instantiation map, so the workspace doesn't merely DESCRIBE a control system, it RUNS one. For ordinary repos, this step is skipped.
f. **Files the initial knowledge-graph entities / decision log (Bookkeeping, P6)** for the new workspace — proactively, never asking permission.

**3. Verify (deterministic).** `bstack doctor` gates BOTH flows: the structured contract (governance files, hooks wired+deployed, gates, schemas) AND the generative output (the doctor surfaces gaps if the agent's tailoring left a hole). Generative output is always checked by the structured contract — never trusted blind.

**Key principles:**

- This mirrors the established Audience (P18) discipline: the STRUCTURED flow is the Category-B *projection floor* (deterministic, lossless, reproducible); the GENERATIVE flow is Category-C *bespoke authoring* (context-aware, to-the-ceiling). The same structured-vs-generative split bstack already applies to documents, now applied to workspace setup.
- The structured flow must never wire a hook whose script isn't deployed — the *dangling-hook* failure mode this work fixes. "Wired but dangling" is forbidden: every hook reference resolves to a real, executable, deployed script.
- The agent runs **structured FIRST** (idempotent floor), **THEN generative** (bespoke), **THEN doctor** (verify). Never generative-without-structured (no floor) or structured-without-generative (generic, untailored workspace).
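The "never wire a hook whose script isn't deployed" invariant lends itself to a mechanical check. The sketch below is not the `doctor` implementation: it assumes hook commands are stored under `command` keys somewhere in the settings JSON and that script paths end in `.sh`. The names `iter_commands` and `dangling_hooks` are hypothetical.

```python
import json
import os
import re
import string
from pathlib import Path

# Assumption: a hook script reference is any whitespace-free token ending in .sh
SCRIPT_RE = re.compile(r"\S+\.sh\b")


def iter_commands(node):
    """Yield every string stored under a "command" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "command" and isinstance(value, str):
                yield value
            else:
                yield from iter_commands(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_commands(item)


def dangling_hooks(settings_path: Path, env: dict) -> list:
    """Return referenced script paths that are missing or not executable."""
    settings = json.loads(settings_path.read_text())
    missing = []
    for command in iter_commands(settings):
        for raw in SCRIPT_RE.findall(command):
            # Expand ${VAR} / $VAR references using the supplied environment.
            expanded = string.Template(raw).safe_substitute(env)
            path = Path(expanded.strip("\"'"))
            if not (path.is_file() and os.access(path, os.X_OK)):
                missing.append(str(path))
    return missing
```

An empty result means every wired hook resolves to a deployed, executable script; any entry in the list is a "wired but dangling" reference.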

### `doctor` — verify primitive contract

`scripts/doctor.sh`. Eight check sections:
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-0.26.0
+0.27.0
69 changes: 69 additions & 0 deletions assets/templates/schemas/action.schema.json
@@ -0,0 +1,69 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Agentic Control Kernel — Action Schema",
  "description": "Typed control directive (θ_t) emitted by the LLM agent. Not raw actuation — parameterizes deterministic controllers.",
  "type": "object",
  "required": ["directive_id", "timestamp", "directive_type"],
  "properties": {
    "directive_id": {
      "type": "string",
      "description": "Unique identifier for this control directive"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time",
      "description": "ISO 8601 timestamp of directive emission"
    },
    "directive_type": {
      "type": "string",
      "enum": [
        "setpoint_update",
        "constraint_update",
        "mode_switch",
        "parameter_update",
        "module_selection",
        "experiment_request",
        "model_update_trigger",
        "plan_update"
      ],
      "description": "Type of control directive"
    },
    "target_controller": {
      "type": "string",
      "description": "Which controller module this directive targets"
    },
    "payload": {
      "type": "object",
      "description": "Directive-specific payload (setpoints, parameters, etc.)",
      "additionalProperties": true
    },
    "rationale": {
      "type": "string",
      "description": "LLM's reasoning for this directive (for audit/ledger)"
    },
    "priority": {
      "type": "string",
      "enum": ["critical", "high", "normal", "low"],
      "default": "normal",
      "description": "Execution priority"
    },
    "requires_approval": {
      "type": "boolean",
      "default": false,
      "description": "Whether this directive needs human approval before execution"
    },
    "rollback_directive_id": {
      "type": ["string", "null"],
      "description": "Directive to execute if this one needs to be rolled back"
    },
    "budget_impact": {
      "type": "object",
      "description": "Estimated resource consumption",
      "properties": {
        "tokens": { "type": "integer" },
        "compute_s": { "type": "number" },
        "cost_usd": { "type": "number" }
      }
    }
  }
}
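To make the action schema concrete, here is a hedged sketch of a directive instance plus a hand-rolled check of its `required` and `enum` rules. It is not a full JSON Schema validator, and the example values (`d-001`, `rate-limiter`, the payload key) are invented for illustration; the field names and enum members are copied from the schema above.

```python
from datetime import datetime, timezone

# Copied from the schema's "required" and "enum" entries.
REQUIRED = ("directive_id", "timestamp", "directive_type")
DIRECTIVE_TYPES = {
    "setpoint_update", "constraint_update", "mode_switch", "parameter_update",
    "module_selection", "experiment_request", "model_update_trigger", "plan_update",
}
PRIORITIES = {"critical", "high", "normal", "low"}


def check_directive(directive: dict) -> list:
    """Return a list of human-readable problems; empty means the checks passed."""
    errors = [f"missing required field: {name}" for name in REQUIRED if name not in directive]
    if "directive_type" in directive and directive["directive_type"] not in DIRECTIVE_TYPES:
        errors.append(f"unknown directive_type: {directive['directive_type']}")
    # "priority" is optional and defaults to "normal" in the schema.
    if directive.get("priority", "normal") not in PRIORITIES:
        errors.append(f"unknown priority: {directive['priority']}")
    return errors


# Hypothetical directive; all values are illustrative.
example = {
    "directive_id": "d-001",
    "timestamp": datetime(2026, 6, 8, tzinfo=timezone.utc).isoformat(),
    "directive_type": "setpoint_update",
    "target_controller": "rate-limiter",
    "payload": {"max_requests_per_min": 60},
    "requires_approval": False,
}
```

In practice a draft 2020-12 validator would replace `check_directive`; the hand-rolled version only keeps the example dependency-free.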
76 changes: 76 additions & 0 deletions assets/templates/schemas/egri-event.schema.json
@@ -0,0 +1,76 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://broomva.tech/schemas/egri-event.schema.json",
  "title": "EGRI Trial Event",
  "description": "Payload for autoany EGRI trial records persisted to Lago via EventKind::Custom with 'egri.' prefix",
  "type": "object",
  "required": ["event_type", "trial"],
  "properties": {
    "event_type": {
      "type": "string",
      "pattern": "^egri\\.",
      "description": "Event type with 'egri.' prefix, e.g. 'egri.trial'"
    },
    "trial": {
      "$ref": "#/$defs/TrialRecord"
    },
    "session_id": {
      "type": ["string", "null"],
      "description": "Optional Arcan session ID for cross-reference"
    }
  },
  "$defs": {
    "TrialRecord": {
      "type": "object",
      "required": ["trial_id", "timestamp", "parent_state", "mutation", "outcome", "decision"],
      "properties": {
        "trial_id": { "type": "string" },
        "timestamp": { "type": "string", "format": "date-time" },
        "parent_state": { "type": "string" },
        "mutation": { "$ref": "#/$defs/Mutation" },
        "execution": { "$ref": "#/$defs/ExecutionResult" },
        "outcome": { "$ref": "#/$defs/Outcome" },
        "decision": { "$ref": "#/$defs/Decision" },
        "strategy_notes": { "type": ["string", "null"] }
      }
    },
    "Mutation": {
      "type": "object",
      "required": ["operator", "description"],
      "properties": {
        "operator": { "type": "string" },
        "description": { "type": "string" },
        "diff": { "type": ["string", "null"] },
        "hypothesis": { "type": ["string", "null"] }
      }
    },
    "ExecutionResult": {
      "type": ["object", "null"],
      "properties": {
        "duration_secs": { "type": "number" },
        "exit_code": { "type": "integer" },
        "error": { "type": ["string", "null"] },
        "output": {}
      }
    },
    "Outcome": {
      "type": "object",
      "required": ["score", "constraints_passed"],
      "properties": {
        "score": {},
        "constraints_passed": { "type": "boolean" },
        "constraint_violations": { "type": "array", "items": { "type": "string" } },
        "evaluator_metadata": {}
      }
    },
    "Decision": {
      "type": "object",
      "required": ["action", "reason"],
      "properties": {
        "action": { "type": "string", "enum": ["promoted", "discarded", "branched", "escalated"] },
        "reason": { "type": "string" },
        "new_state_id": { "type": ["string", "null"] }
      }
    }
  }
}
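A sketch of what a conforming EGRI event might look like, with a partial check covering the `egri.` prefix, the required `TrialRecord` fields, and the `Decision.action` enum. The check is deliberately incomplete (it does not validate `Mutation` or `Outcome`), and every value in `example_event` is made up for illustration.

```python
import re

EVENT_TYPE_RE = re.compile(r"^egri\.")  # same pattern as the schema
TRIAL_REQUIRED = ("trial_id", "timestamp", "parent_state", "mutation", "outcome", "decision")
DECISION_ACTIONS = {"promoted", "discarded", "branched", "escalated"}


def check_egri_event(event: dict) -> list:
    """Return a list of problems found by this partial check."""
    errors = []
    event_type = event.get("event_type")
    if not isinstance(event_type, str) or not EVENT_TYPE_RE.search(event_type):
        errors.append("event_type must be a string starting with 'egri.'")
    trial = event.get("trial")
    if not isinstance(trial, dict):
        return errors + ["trial must be an object"]
    errors += [f"trial is missing {name}" for name in TRIAL_REQUIRED if name not in trial]
    decision = trial.get("decision")
    if isinstance(decision, dict) and decision.get("action") not in DECISION_ACTIONS:
        errors.append(f"unknown decision action: {decision.get('action')}")
    return errors


# Hypothetical event; identifiers and scores are illustrative only.
example_event = {
    "event_type": "egri.trial",
    "trial": {
        "trial_id": "t-0001",
        "timestamp": "2026-06-08T12:00:00Z",
        "parent_state": "s-0000",
        "mutation": {"operator": "prompt_edit", "description": "tighten system prompt"},
        "outcome": {"score": 0.82, "constraints_passed": True},
        "decision": {"action": "promoted", "reason": "score improved", "new_state_id": "s-0001"},
    },
    "session_id": None,
}
```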
114 changes: 114 additions & 0 deletions assets/templates/schemas/evaluator.schema.json
@@ -0,0 +1,114 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Agentic Control Kernel — Evaluator Schema",
  "description": "Score vectors and promotion decisions for EGRI-compatible evaluators.",
  "type": "object",
  "required": ["evaluator_id", "timestamp", "scores", "decision"],
  "properties": {
    "evaluator_id": {
      "type": "string",
      "description": "Identifier of the evaluator instance"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time"
    },
    "trial_id": {
      "type": "string",
      "description": "EGRI trial being evaluated"
    },
    "controller_version": {
      "type": "string",
      "description": "Controller version under evaluation"
    },
    "scores": {
      "type": "object",
      "description": "Score vector — scalar or multi-dimensional metrics",
      "properties": {
        "primary": {
          "type": "number",
          "description": "Primary objective score (the one driving promotion)"
        },
        "secondary": {
          "type": "object",
          "description": "Additional metrics tracked but not driving promotion",
          "additionalProperties": { "type": "number" }
        }
      },
      "required": ["primary"]
    },
    "baseline": {
      "type": "object",
      "description": "Baseline scores for comparison",
      "properties": {
        "primary": { "type": "number" },
        "secondary": {
          "type": "object",
          "additionalProperties": { "type": "number" }
        }
      }
    },
    "constraints": {
      "type": "object",
      "required": ["all_passed"],
      "properties": {
        "all_passed": {
          "type": "boolean",
          "description": "Whether all hard constraints were satisfied"
        },
        "violations": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "constraint_id": { "type": "string" },
              "measured": { "description": "Measured value" },
              "threshold": { "description": "Threshold that was violated" },
              "severity": {
                "type": "string",
                "enum": ["hard", "soft"]
              }
            }
          }
        }
      }
    },
    "decision": {
      "type": "object",
      "required": ["action"],
      "properties": {
        "action": {
          "type": "string",
          "enum": ["promoted", "discarded", "branched", "escalated"],
          "description": "Promotion decision"
        },
        "reason": {
          "type": "string",
          "description": "Why this decision was made"
        },
        "new_controller_version": {
          "type": ["string", "null"],
          "description": "Version ID of promoted controller (null if discarded)"
        },
        "rollback_target": {
          "type": ["string", "null"],
          "description": "Version to rollback to if this promotion fails in deployment"
        }
      }
    },
    "scenario_coverage": {
      "type": "object",
      "description": "Which scenarios were evaluated",
      "properties": {
        "total_scenarios": { "type": "integer" },
        "passed": { "type": "integer" },
        "failed": { "type": "integer" },
        "holdout_passed": {
          "type": "integer",
          "description": "Anti-gaming holdout scenarios passed"
        },
        "holdout_total": { "type": "integer" }
      }
    }
  }
}
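The schema records a decision but does not say how one is reached. The sketch below shows one possible policy that fills the `decision` object from `scores`, `baseline`, and `constraints`. The policy itself is an assumption for illustration (higher `primary` is better, any failed hard constraint discards the trial); neither the schema nor this PR defines it, and the function name `decide` is hypothetical.

```python
def decide(scores: dict, baseline: dict, constraints: dict) -> dict:
    """Build a `decision` object under an assumed higher-is-better policy."""
    # Assumption: a failed hard constraint always discards the trial.
    if not constraints.get("all_passed", False):
        return {"action": "discarded", "reason": "hard constraint violated"}
    # Assumption: with no baseline recorded, any primary score counts as an improvement.
    if scores["primary"] > baseline.get("primary", float("-inf")):
        return {"action": "promoted", "reason": "primary score beats baseline"}
    return {"action": "discarded", "reason": "primary score does not beat baseline"}
```

The returned `action` values are drawn from the schema's enum; `branched` and `escalated` are left out because their trigger conditions are not described in the source.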