Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1,170 changes: 1,170 additions & 0 deletions docs/superpowers/plans/2026-05-30-tui-config-review.md

Large diffs are not rendered by default.

144 changes: 144 additions & 0 deletions docs/superpowers/specs/2026-05-30-tui-config-review-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,144 @@
# TUI Session Config Review Screen — Design

**Issue:** [#201](https://github.com/windoliver/grove/issues/201) — feat: TUI session config screen
**Date:** 2026-05-30
**Depends on:** #198 (session records store full config), #199 (PolicyEnforcer reads config from session), #200 (GROVE.md optional) — all merged.

## Goal

Add a config review/edit screen between preset selection and goal input so an operator
can tweak a session's resolved config before launching — not just pick a preset. The
screen shows the config resolved from the chosen preset and lets the operator edit a
focused set of scalar/enum fields. Everything else is shown read-only.

## Scope

**Editable (v1, "Tasks-faithful"):**
- **Mode** — `evaluation ⇄ exploration` toggle.
- **Stop conditions (numeric thresholds)** — `maxRoundsWithoutImprovement`,
`budget.maxContributions`, `budget.maxWallClockSeconds`, and `targetMetric.value`
(only when a target metric is defined).
- **Concurrency** — `maxActiveClaims`, `maxClaimsPerAgent`.

**Read-only summaries:**
- Topology — role names + edge count (one line).
- Metrics — `name (direction, gate)` list.
- Gates — `type → target` list.
- `quorumReviewScore` / `deliberationLimit` stop conditions (shown if present).
- Concurrency `maxClaimsPerTarget` (fixed at 1 — not rendered).

**Non-goals (explicitly out of scope):**
- Full visual topology editor.
- Defining metrics from scratch; editing the metrics map.
- Adding/removing gates (deferred — the issue's prose mentioned it, but the Tasks
checklist and non-goals narrow v1 to scalar/enum edits).
- Editing execution / rate-limit / retry / gossip / hooks / admission sections.

## Flow & Routing

New screen `config-review` inserted per the issue's ordering:

```
preset-select → config-review → goal-input → launch-preview → spawning → running → complete
```

- Add `"config-review"` to the `Screen` union (`src/tui/screens/screen-manager.tsx`) and to
`PageKind` (`src/tui/data/pages-store.ts`).
- `handlePresetSelect` (in `screen-manager.tsx`) changes: after setting `selectedPreset`
and the baseline topology, resolve the **baseline contract for the selected preset** via
`presetToSessionConfig(getPreset(name), name)` (from `src/cli/presets/index.js`) rather
than reusing the boot-time `appProps.contract` (which reflects the *previous* grove.json
preset). Store the result as `state.editedConfig` and `pages.push({ kind: "config-review" })`.
- **Bypass rule:** if the selected preset resolves to no contract *and* `appProps.contract`
is also undefined (possible on some remote/nexus backends where the contract probe
returns nothing), skip `config-review` and go straight to `goal-input` — there is nothing
to edit. This is the data-level "use defaults / optional screen" escape hatch.

## Component — `src/tui/screens/config-review.tsx`

Reuses the `agent-detect.tsx` interaction model (cursor list + inline edit buffer). Renders
three zones:

1. **Editable field list** (navigable cursor): Mode, then the present stop-condition
scalars, then concurrency scalars.
2. **Read-only summaries:** topology, metrics, gates.
3. **Hint bar:** active keybindings.

### Keybindings

| Key | Action |
| --- | --- |
| `j` / `k` / ↑ / ↓ | Move cursor between editable fields |
| `e` | Edit focused scalar field (enters inline edit mode) |
| `space` | Toggle the Mode enum (when Mode is focused) |
| `d` | Reset all edits back to preset defaults |
| `Enter` (not editing) | Confirm & continue → `goal-input` |
| `Esc` (not editing) | Back → `preset-select` |
| digits / backspace | (in edit mode) modify the numeric buffer |
| `Enter` (editing) | Commit the field |
| `Esc` (editing) | Cancel just this field's edit |

`Enter` is dual-meaning — it commits the focused field while editing, and confirms the
screen otherwise — matching the existing `goal-input` / `agent-detect` convention. Scalar
edit mode is entered only with `e` (not `Enter`), so `Enter` is unambiguously "confirm
screen" at the list level.

## State & Persistence

- `ScreenState` gains `editedConfig?: GroveContract | undefined`.
- ConfigReview is a controlled-ish screen: it receives the baseline `config` and the
read-only `topology`, holds a working draft in local state, and emits
`onConfirm(updatedConfig: GroveContract)` and `onBack()`.
- `handleConfigReviewConfirm(updatedConfig)` → `setState({ editedConfig: updatedConfig,
screen: "goal-input" })` + `pages.push({ kind: "goal-input" })`.
- `handleConfigReviewBack` → `pages.pop()` (returns to `preset-select`).
- `spawnAgents` (`screen-manager.tsx` ~line 682): build `sessionConfig` from
`state.editedConfig ?? contract` instead of `contract`. The existing
`provider.createSession({ goal, presetName, topology, config })` path already persists the
config into the session record (#198) — no store changes required.
- Component map in `screen-manager.tsx` gains a `config-review` entry wiring the component
to `topology`, `state.editedConfig`, and the two handlers.

## Validation & Error Handling

- **Per-field commit** validates against the Zod bounds defined in `src/core/contract.ts`:
- `concurrency.maxActiveClaims`: int 1–1000
- `concurrency.maxClaimsPerAgent`: int 0–100
- `stopConditions.maxRoundsWithoutImprovement`: int 1–1000
- `budget.maxContributions`: int ≥ 1
- `budget.maxWallClockSeconds`: int ≥ 1
- `targetMetric.value`: number (no bound)

Bounds are mirrored as named constants in the component with a
`// keep in sync with src/core/contract.ts` comment. Invalid input shows an inline error
and remains in edit mode (does not commit).
- **Unset semantics:** clearing a numeric field to empty unsets that optional field. If
**both** budget fields are cleared, the `budget` object is dropped entirely (its Zod
`refine` requires at least one of the two) so an invalid empty budget is never emitted.
- **No new cross-field hazards:** execution / rate-limit / gossip constraints are not
editable here, so their cross-field validations cannot be newly violated.

## Testing

- **Component test** (`config-review.test.tsx`): render with a representative contract, drive
key events — navigate, edit a numeric field (valid + out-of-range), toggle mode, `d` reset,
`Enter` confirm — and assert: the `onConfirm` payload reflects edits, out-of-range input is
rejected (no commit, error shown), clearing both budget fields drops `budget`. Follows the
existing TUI screen-test harness.
- **Flow test** (extends `screen-manager` tests): assert the transition
`preset-select → config-review → goal-input`, and that an edited contract reaches
`createSession` via `spawnAgents`.
- **Manual TUI smoke:** `grove up` → pick preset → edit a stop-condition threshold → confirm
→ verify the edited value lands in the session record (consistent with the project's
E2E-first validation practice; manual smoke is tracked as a follow-up, not auto-run).

## Files Touched

| File | Change |
| --- | --- |
| `src/tui/screens/config-review.tsx` | **New** screen component |
| `src/tui/screens/screen-manager.tsx` | `Screen` union, `editedConfig` state, `handlePresetSelect` rewrite, confirm/back handlers, component-map entry, `spawnAgents` config source |
| `src/tui/data/pages-store.ts` | `PageKind` adds `"config-review"` |
| `src/tui/components/pages-router.tsx` | Route `config-review` (breadcrumb/hints) if router needs an explicit entry |
| `src/tui/screens/config-review.test.tsx` | **New** component test |
| `screen-manager` test file | Flow-transition assertions |
18 changes: 16 additions & 2 deletions src/server/routes/sessions.ts
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,11 @@ const createSessionSchema = z.object({
// Topology accepted as opaque object — parsed in handler to support
// both snake_case wire format (external) and camelCase (TUI/internal)
topology: z.record(z.string(), z.unknown()).optional(),
// Operator-edited session config (TUI config-review screen, #201). Accepted
// opaquely as a camelCase GroveContract and cast in the handler, mirroring
// the topology handling. When present it takes precedence over the server's
// GROVE.md contract / preset defaults.
config: z.record(z.string(), z.unknown()).optional(),
});

const addContributionSchema = z.object({
Expand Down Expand Up @@ -103,14 +108,23 @@ sessions.post("/", async (c) => {
// Normalize: accept both "preset" and "presetName"
const presetName = parsed.data.preset ?? parsed.data.presetName;

// A client-supplied config (the TUI config-review screen, #201) is the
// operator's edited session config and takes precedence over the server's
// GROVE.md contract and preset defaults. Accepted opaquely (camelCase
// GroveContract from the TUI) and cast, mirroring the topology handling
// below. Clients that omit `config` fall through to the prior behavior.
const clientConfig = parsed.data.config as GroveContract | undefined;
// Resolve base config: frozen snapshot source for this session.
// - Prefer a server-loaded GROVE.md (runtime.contract) when present.
// - Prefer a client-supplied config (#201) when present.
// - Then prefer a server-loaded GROVE.md (runtime.contract) when present.
// - Otherwise fall back to preset resolution; #198/#199 mean the
// session's frozen config is what drives enforcement, so callers
// who supply a preset don't need a contract on the server.
// - Reject when neither is available so misconfiguration surfaces.
let baseConfig: GroveContract | undefined;
if (contract) {
if (clientConfig) {
baseConfig = clientConfig;
} else if (contract) {
baseConfig = contract;
} else if (presetName) {
const preset = getPreset(presetName);
Expand Down
2 changes: 2 additions & 0 deletions src/tui/components/breadcrumb-bar.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ export interface BreadcrumbBarProps {
const PAGE_LABELS: Record<PageKind, string> = {
"preset-select": "Preset Select",
"goal-input": "Goal",
"config-review": "Config Review",
"agent-detect": "Launch Preview",
"launch-preview": "Launch Preview",
spawning: "Spawning",
Expand Down Expand Up @@ -164,6 +165,7 @@ const SCREEN_LABELS: Record<Screen, string> = {
"agent-detect": "Launch Preview",
"launch-preview": "Launch Preview",
"goal-input": "Goal",
"config-review": "Config Review",
spawning: "Spawning",
running: "Running",
inspect: "Inspect",
Expand Down
2 changes: 2 additions & 0 deletions src/tui/data/hint-map.ts
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
*/

import { COMPLETE_HINTS } from "../views/complete-hints.js";
import { CONFIG_REVIEW_HINTS } from "../views/config-review-hints.js";
import { GOAL_INPUT_HINTS } from "../views/goal-input-hints.js";
import { INSPECT_HINTS } from "../views/inspect-hints.js";
import { LAUNCH_PREVIEW_HINTS } from "../views/launch-preview-hints.js";
Expand Down Expand Up @@ -63,6 +64,7 @@ const STATIC: Readonly<Record<string, readonly KeyAction[]>> = Object.freeze({
// Top-level page kinds
"preset-select": PRESET_SELECT_HINTS,
"goal-input": GOAL_INPUT_HINTS,
"config-review": CONFIG_REVIEW_HINTS,
"agent-detect": LAUNCH_PREVIEW_HINTS,
"launch-preview": LAUNCH_PREVIEW_HINTS,
// Spawning is transient and has NO active keyboard handlers
Expand Down
1 change: 1 addition & 0 deletions src/tui/data/pages-store.ts
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import { debugLog } from "../debug-log.js";
export type PageKind =
| "preset-select"
| "goal-input"
| "config-review"
| "agent-detect"
| "launch-preview"
| "spawning"
Expand Down
121 changes: 121 additions & 0 deletions src/tui/screens/config-edit.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
import { describe, expect, test } from "bun:test";
import type { GroveContract } from "../../core/contract.js";
import { getEditableFields, setNumericField, toggleMode } from "./config-edit.js";

const FULL: GroveContract = {
contractVersion: 2,
name: "test",
mode: "evaluation",
metrics: { latency: { direction: "minimize", gate: 100 } },
gates: [{ type: "min_score", metric: "latency", threshold: 0.8 }],
stopConditions: {
maxRoundsWithoutImprovement: 5,
targetMetric: { metric: "latency", value: 50 },
budget: { maxContributions: 200, maxWallClockSeconds: 3600 },
},
concurrency: { maxActiveClaims: 4, maxClaimsPerAgent: 2 },
};

describe("getEditableFields", () => {
test("lists mode + stop + concurrency fields with targetMetric when present", () => {
const ids = getEditableFields(FULL).map((f) => f.id);
expect(ids).toEqual([
"mode",
"stop.maxRoundsWithoutImprovement",
"stop.targetMetric.value",
"stop.budget.maxContributions",
"stop.budget.maxWallClockSeconds",
"concurrency.maxActiveClaims",
"concurrency.maxClaimsPerAgent",
]);
});

test("omits targetMetric.value when no target metric is defined", () => {
const noTarget: GroveContract = { ...FULL, stopConditions: { maxRoundsWithoutImprovement: 5 } };
const ids = getEditableFields(noTarget).map((f) => f.id);
expect(ids).not.toContain("stop.targetMetric.value");
});

test("shows (unset) for absent optional numerics", () => {
const bare: GroveContract = { contractVersion: 2, name: "bare" };
const fields = getEditableFields(bare);
const rounds = fields.find((f) => f.id === "stop.maxRoundsWithoutImprovement");
expect(rounds?.display).toBe("(unset)");
});
});

describe("toggleMode", () => {
test("evaluation -> exploration", () => {
expect(toggleMode(FULL).mode).toBe("exploration");
});
test("exploration -> evaluation", () => {
expect(toggleMode({ ...FULL, mode: "exploration" }).mode).toBe("evaluation");
});
test("undefined -> evaluation", () => {
expect(toggleMode({ contractVersion: 2, name: "x" }).mode).toBe("evaluation");
});
});

describe("setNumericField", () => {
test("sets a valid concurrency value", () => {
const { config, error } = setNumericField(FULL, "concurrency.maxActiveClaims", "7");
expect(error).toBeUndefined();
expect(config.concurrency?.maxActiveClaims).toBe(7);
});

test("rejects out-of-range maxClaimsPerAgent", () => {
const { config, error } = setNumericField(FULL, "concurrency.maxClaimsPerAgent", "200");
expect(error).toContain("≤ 100");
expect(config).toBe(FULL); // unchanged reference on error
});

test("rejects non-integer for an integer field", () => {
const { error } = setNumericField(FULL, "stop.maxRoundsWithoutImprovement", "1.5");
expect(error).toContain("whole number");
});

test("empty unsets an optional field and prunes its empty parent", () => {
const { config } = setNumericField(
{ contractVersion: 2, name: "x", concurrency: { maxActiveClaims: 4 } },
"concurrency.maxActiveClaims",
"",
);
expect(config.concurrency).toBeUndefined();
});

test("clearing one budget field keeps the other", () => {
const { config } = setNumericField(FULL, "stop.budget.maxContributions", "");
expect(config.stopConditions?.budget).toEqual({ maxWallClockSeconds: 3600 });
});

test("clearing the last budget field drops the budget object", () => {
const oneBudget: GroveContract = {
...FULL,
stopConditions: { budget: { maxContributions: 200 } },
};
const { config } = setNumericField(oneBudget, "stop.budget.maxContributions", "");
expect(config.stopConditions?.budget).toBeUndefined();
});

test("targetMetric.value is required — empty is an error", () => {
const { error } = setNumericField(FULL, "stop.targetMetric.value", "");
expect(error).toContain("required");
});

test("targetMetric.value accepts a float and preserves the metric name", () => {
const { config } = setNumericField(FULL, "stop.targetMetric.value", "12.5");
expect(config.stopConditions?.targetMetric).toEqual({ metric: "latency", value: 12.5 });
});

test("rejects a non-finite targetMetric.value", () => {
const { config, error } = setNumericField(FULL, "stop.targetMetric.value", "Infinity");
expect(error).toBeDefined();
expect(config).toBe(FULL);
});

test("does not mutate the input contract", () => {
const snapshot = JSON.parse(JSON.stringify(FULL));
setNumericField(FULL, "concurrency.maxActiveClaims", "9");
expect(FULL).toEqual(snapshot);
});
});
Loading
Loading