Add a Galley setup executor preflight that prepares fresh worktrees before implementation, learns a successful environment.setup plan when missing or stale, persists it to environment.yaml, and routes setup evidence through later AFK… by shinpr · Pull Request #71 · shinpr/galley

shinpr · 2026-05-25T11:22:24Z

Goal

Add a Galley setup executor preflight that prepares fresh worktrees before implementation, learns a successful environment.setup plan when missing or stale, persists it to environment.yaml, and routes setup evidence through later AFK execution.

Acceptance Criteria

AC1 environment.yaml shall support a first-class setup field that describes how Galley prepares a fresh task worktree before implementation begins.
- Verification: Add environment profile schema, validation, example, and documentation coverage for environment.setup. Validate examples/environment-local.yaml and bundled environment schema references.
- Status: satisfied
AC2 When environment.setup is present, Galley shall run that setup plan before acceptance skeleton preflight and before the implementation executor.
- Verification: Add daemon coverage proving setup runs after worktree/input-file preparation and before acceptance skeleton preflight or implementation executor startup.
- Status: satisfied
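
The ordering in AC2 can be pictured as a fixed phase list that stops at the first failure. This is a minimal sketch, not Galley's daemon code: `Phase`, `runPhases`, `defaultPhases`, `errSetupFailed`, and the phase names are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errSetupFailed is a hypothetical sentinel used only by this sketch.
var errSetupFailed = errors.New("setup failed")

// Phase is a hypothetical named step in a task run.
type Phase struct {
	Name string
	Run  func() error
}

// runPhases executes phases in order and stops at the first failure.
// It returns the names of the phases that completed.
func runPhases(phases []Phase) ([]string, error) {
	var done []string
	for _, p := range phases {
		if err := p.Run(); err != nil {
			return done, fmt.Errorf("phase %s: %w", p.Name, err)
		}
		done = append(done, p.Name)
	}
	return done, nil
}

// defaultPhases places setup after worktree preparation and before the
// acceptance skeleton preflight and the implementation executor.
func defaultPhases(setup func() error) []Phase {
	ok := func() error { return nil }
	return []Phase{
		{Name: "worktree", Run: ok},
		{Name: "setup", Run: setup},
		{Name: "acceptance_skeleton", Run: ok},
		{Name: "implementation", Run: ok},
	}
}

func main() {
	done, err := runPhases(defaultPhases(func() error { return errSetupFailed }))
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

Because the loop returns on the first error, a failing setup phase leaves the skeleton and implementation phases untouched.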
AC3 When environment.setup is absent, Galley shall run a setup executor that receives the full environment.commands map, repository manifests, lockfiles, setup docs, and quality required checks as context, then attempts to make the worktree ready.
- Verification: Add daemon/setup-executor tests for a profile without setup, asserting the setup executor is invoked with environment.commands and repository setup evidence.
- Status: satisfied
AC4 Galley shall provide both Claude and Codex setup executor prompt paths with the same setup-result contract, selecting the setup executor provider from the task executor backend.
- Verification: Add prompt/embed, command-plan, and fake-provider tests for Claude and Codex setup executor runs. Assert both parse the same setup result shape.
- Status: satisfied
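
One way to read AC4 is: the prompt differs per provider, but a single parser accepts the output of both. The sketch below assumes a JSON result; `SetupResult`, its field names, `setupPromptFor`, `parseSetupResult`, and the prompt paths are hypothetical placeholders, not Galley's actual schema or file layout.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SetupResult is a hypothetical shared result shape; the field names are
// illustrative and not taken from Galley's schema.
type SetupResult struct {
	Status   string   `json:"status"`
	Commands []string `json:"commands"`
}

// setupPromptFor picks a provider-specific prompt from the task's executor
// backend. The returned paths are placeholders.
func setupPromptFor(backend string) (string, error) {
	switch backend {
	case "claude":
		return "prompts/claude/setup-executor.md", nil
	case "codex":
		return "prompts/codex/setup-executor.md", nil
	default:
		return "", fmt.Errorf("unsupported executor backend %q", backend)
	}
}

// parseSetupResult decodes provider output with one parser, so both
// providers are held to the same contract.
func parseSetupResult(raw []byte) (SetupResult, error) {
	var r SetupResult
	if err := json.Unmarshal(raw, &r); err != nil {
		return SetupResult{}, fmt.Errorf("decode setup result: %w", err)
	}
	if r.Status == "" {
		return SetupResult{}, fmt.Errorf("setup result is missing status")
	}
	return r, nil
}

func main() {
	raw := []byte(`{"status":"ready","commands":["go mod download"]}`)
	for _, backend := range []string{"claude", "codex"} {
		prompt, err := setupPromptFor(backend)
		if err != nil {
			panic(err)
		}
		r, err := parseSetupResult(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(backend, prompt, r.Status, r.Commands)
	}
}
```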
AC5 The Claude setup executor prompt shall be authored from the existing Claude executor, Claude supervisor, and Claude/test-creator prompt patterns, and the Codex setup executor prompt shall be authored from the existing Codex executor, Codex supervisor, and Codex/test-creator prompt patterns; neither prompt shall be a mechanical diff or direct wording transplant from the other provider.
- Verification: Add focused prompt/reference tests or review evidence showing the Claude setup prompt follows Claude-specific runtime/output framing and the Codex setup prompt follows Codex-specific runtime/output framing, while both preserve the shared setup-result contract.
- Status: satisfied
AC6 The setup executor shall first consider any existing environment.setup plan and environment.commands entries, but if those commands do not make the worktree ready, it may discover and return the command sequence that actually succeeds.
- Verification: Add a regression test where the profile provides an incorrect setup command or only generic commands, the setup executor returns a different successful setup plan, and Galley records both attempted and successful commands.
- Status: satisfied
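
The fallback behaviour in AC6 can be sketched as "try the profile's plan, then a discovered plan, and log every attempt". This is a simplification under stated assumptions: in Galley the discovery is done by the setup executor, whereas here the discovered plan is passed in as a plain slice, and `Attempt`, `runPlan`, `resolvePlan`, and `errCommandFailed` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errCommandFailed is a hypothetical sentinel used by the fake runner below.
var errCommandFailed = errors.New("command failed")

// Attempt records one command the setup phase tried (hypothetical shape).
type Attempt struct {
	Command string
	Source  string // "profile" or "discovered"
	OK      bool
}

// runPlan runs each command in order, appends an Attempt for every command
// it ran, and reports whether the whole plan succeeded.
func runPlan(cmds []string, source string, run func(string) error, log *[]Attempt) bool {
	for _, c := range cmds {
		err := run(c)
		*log = append(*log, Attempt{Command: c, Source: source, OK: err == nil})
		if err != nil {
			return false
		}
	}
	return true
}

// resolvePlan tries the profile's plan first and falls back to a discovered
// plan. It returns the plan that succeeded (nil if none did), every attempt
// made, and whether the successful plan came from discovery.
func resolvePlan(profile, discovered []string, run func(string) error) (plan []string, attempts []Attempt, changed bool) {
	if len(profile) > 0 && runPlan(profile, "profile", run, &attempts) {
		return profile, attempts, false
	}
	if len(discovered) > 0 && runPlan(discovered, "discovered", run, &attempts) {
		return discovered, attempts, true
	}
	return nil, attempts, false
}

func main() {
	// Fake runner: the profile's command is wrong, the discovered one works.
	run := func(cmd string) error {
		if cmd == "make deps" {
			return errCommandFailed
		}
		return nil
	}
	plan, attempts, changed := resolvePlan([]string{"make deps"}, []string{"go mod download"}, run)
	fmt.Println("plan:", plan, "changed:", changed)
	for _, a := range attempts {
		fmt.Printf("%-16s source=%-10s ok=%v\n", a.Command, a.Source, a.OK)
	}
}
```

Keeping the failed profile attempt in the log is what lets the evidence show both the attempted and the successful commands.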
AC7 When setup succeeds with a new or changed setup plan, Galley shall update the repository environment.yaml setup field so subsequent tasks use the successful setup without rediscovery.
- Verification: Add profile update tests asserting environment.yaml is atomically rewritten with the returned setup field, unrelated profile content is preserved, validation runs after update, and a second daemon run uses the persisted setup without invoking discovery.
- Status: satisfied
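
The "atomically rewritten" part of AC7 is commonly done by writing a temporary file in the same directory and renaming it over the target. The sketch below shows only that step, under the assumption that the new YAML bytes have already been produced and validated elsewhere; `writeFileAtomic` and `rewriteInTempDir` are hypothetical helpers, and the YAML shown is illustrative rather than Galley's actual profile layout.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic replaces path with data by writing a temporary file in the
// same directory and renaming it over the target, so readers see either the
// old content or the new content, never a partial write.
func writeFileAtomic(path string, data []byte, perm os.FileMode) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	// Remove the temporary file if any later step fails.
	defer func() {
		if err != nil {
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	// Flush to disk before the rename makes the new content visible.
	if err = tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	if err = os.Chmod(tmp.Name(), perm); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// rewriteInTempDir writes before to a scratch environment.yaml, atomically
// replaces it with after, and returns the final file content.
func rewriteInTempDir(before, after string) (string, error) {
	dir, err := os.MkdirTemp("", "galley-sketch-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "environment.yaml")
	if err := os.WriteFile(path, []byte(before), 0o644); err != nil {
		return "", err
	}
	if err := writeFileAtomic(path, []byte(after), 0o644); err != nil {
		return "", err
	}
	body, err := os.ReadFile(path)
	return string(body), err
}

func main() {
	before := "base: main\n"
	after := "base: main\nsetup:\n  commands:\n    - touch setup.sentinel\n"
	body, err := rewriteInTempDir(before, after)
	if err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```

Creating the temporary file next to the target keeps the rename on one filesystem, which is what makes the replacement a single step on POSIX systems.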
AC8 Setup success evidence shall be persisted and routed to the implementation executor, supervisor evidence, task verification history, and PR/task output when the setup plan was added or changed.
- Verification: Assert runs//setup_result.json and environment_update.json are written, the implementation work order includes setup readiness evidence, and task/PR evidence records setup profile changes when they occurred.
- Status: satisfied
AC9 If setup cannot make the worktree ready, Galley shall fail the setup phase before spending implementation attempts, with structured evidence that identifies attempted commands, command source, inspected files, stdout/stderr excerpts, and next repair guidance.
- Verification: Add failure-path coverage where setup commands fail and assert the task latest error uses phase setup, kind setup_failed, with setup_result.json available for troubleshooting.
- Status: satisfied
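
The evidence listed in AC9 could be carried in a structure like the one below. Only `phase setup`, `kind setup_failed`, and the `setup_result.json` file name come from this PR; the struct names, JSON field names, the `"ready"`/`"failed"` status values, and the `excerpt` and `classify` helpers are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CommandAttempt is a hypothetical record of one setup command.
type CommandAttempt struct {
	Command  string `json:"command"`
	Source   string `json:"source"` // where the command came from
	ExitCode int    `json:"exit_code"`
	Stderr   string `json:"stderr_excerpt,omitempty"`
}

// SetupResult is a hypothetical shape for setup_result.json.
type SetupResult struct {
	Status         string           `json:"status"` // "ready" or "failed" (assumed values)
	Attempted      []CommandAttempt `json:"attempted_commands"`
	InspectedFiles []string         `json:"inspected_files"`
	RepairGuidance string           `json:"repair_guidance,omitempty"`
}

// TaskError carries the phase/kind classification recorded on the task.
type TaskError struct {
	Phase   string `json:"phase"`
	Kind    string `json:"kind"`
	Message string `json:"message"`
}

// excerpt truncates s to at most max runes so logs stay readable.
func excerpt(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max]) + "..."
}

// classify maps a non-ready setup result to a setup-phase failure.
func classify(r SetupResult) *TaskError {
	if r.Status == "ready" {
		return nil
	}
	return &TaskError{
		Phase:   "setup",
		Kind:    "setup_failed",
		Message: fmt.Sprintf("%d setup command(s) attempted; see setup_result.json", len(r.Attempted)),
	}
}

func main() {
	r := SetupResult{
		Status: "failed",
		Attempted: []CommandAttempt{{
			Command:  "make deps",
			Source:   "environment.setup",
			ExitCode: 2,
			Stderr:   excerpt("make: *** No rule to make target 'deps'.  Stop.", 40),
		}},
		InspectedFiles: []string{"go.mod", "Makefile"},
		RepairGuidance: "update environment.setup to a command that exists in this repository",
	}
	out, err := json.MarshalIndent(r, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	if e := classify(r); e != nil {
		fmt.Printf("phase=%s kind=%s\n", e.Phase, e.Kind)
	}
}
```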
AC10 Setup readiness checks shall exclude acceptance skeleton obligations, so setup does not fail merely because task-specific skeleton tests have not been implemented yet.
- Verification: Add coverage for a task with acceptance skeleton preflight enabled and assert setup runs before skeleton creation and records only repository readiness evidence.
- Status: satisfied
AC11 Galley troubleshooting guidance shall cover setup_failed runs and explain how to use setup_result.json and environment_update.json to repair environment.setup when the learned setup later fails.
- Verification: Update the Galley skill troubleshooting reference and skill description if needed; verify the packaged skill reference includes setup failure diagnosis.
- Status: satisfied

Final Verification

go test ./internal/daemon -count=1: passed
go run ./cmd/galley profile validate --kind environment examples/environment-local.yaml: passed
go vet ./...: passed
gofmt -l .: passed
go build ./...: passed
test -z "$(find . -name '*.go' -not -path './.git/*' -print | xargs gofmt -l)": passed
go test ./...: passed
go build -o /tmp/galley ./cmd/galley: passed
go run ./cmd/galley schema check: passed
go run ./cmd/galley task validate examples/afk-task.yaml: passed
go run ./cmd/galley profile validate --kind quality examples/quality-default.yaml: passed
python3 -m json.tool schemas/claude-result.schema.json >/dev/null: passed
./scripts/smoke-local.sh: passed

Key Decisions

claude-decision-1 How to make TestSetupPreflightSequencesBeforeSkeletonAndExecutor actually exercise the setup preflight, given that SetupExecutorPreflight short-circuits when Profiles.Environment is nil? -> Write a real authored environment profile (with setup.commands: touch setup.sentinel) and pass it via Options.EnvironmentProfileFile so the daemon resolves a Profiles.Environment and runs the authored plan, which creates the sentinel and yields the 'environment.yaml=unchanged' excerpt the test asserts. The previous runner override is reshaped into a t.Fatalf trap so any drift back to discovery fails loudly.
- Rationale: AC2's contract is sequencing (setup before skeleton/executor); the authored-plan path satisfies that contract directly and matches the existing assertion that the excerpt records 'environment.yaml=unchanged'. Switching paths does not weaken the test: the assertions remain (sentinel-required-before-skeleton, setup verification status=passed, unchanged-env excerpt), and the new defensive runner trap makes silent fallback impossible.
- Reversibility: high
claude-decision-2 How to reconcile TestSetupPreflightAtomicProfileRewriteAndSecondRunReuse's strings.Contains(body, 'base: main') assertion with the YAML node-style round-trip? -> Change the test input from quoted base: "main" to unquoted base: main so the assertion tests the contract it was written for: that the YAML node style of unrelated scalars is preserved across the atomic rewrite.
- Rationale: The assertion's intent is to verify that unrelated content survives the rewrite; the input/assertion mismatch was a test typo. Correcting the input keeps the assertion as-is rather than weakening it, and the failure log already confirmed the rewrite preserves node style.
- Reversibility: high

Discussion Items

discussion-1 Scope expansion: Accepted diff includes changes outside task.scope.allowed_paths: internal/supervisor/adapter.go, internal/supervisor/supervisor.go. Review whether this scope expansion belongs in this PR.
- Human decision required: true

Risks

R1 contract: environment.setup is a new environment profile contract and must stay synchronized across Go structs, validation, generated schemas, examples, docs, and bundled skill references.
- Mitigation: Run schema generation/checks, profile validation examples, and focused profile update tests.
R2 runtime: Setup executor runs before implementation and can update repository environment profiles, so evidence ownership, update ordering, and failure classification must be explicit.
- Mitigation: Persist setup_result.json and environment_update.json, route setup evidence to work orders and supervisor evidence, and add failure-path daemon coverage.
R3 maintainability: Claude and Codex setup prompts must be independently shaped from their provider-specific prompt families rather than copying one provider's wording into the other.
- Mitigation: Add tests or review evidence that each prompt follows the corresponding provider's existing executor/supervisor/test-creator prompt conventions.

galley: task-20260525094613-f0e611-add-setup-executor-preflight-and-l…

89cc4e3

…earned-environment-setup

shinpr self-assigned this May 25, 2026

shinpr added 5 commits May 25, 2026 21:18

Harden setup executor command execution

4491836

Clarify setup executor output contract

2daf3b7

Update Galley plugin setup troubleshooting

de4af5e

Fix Codex setup executor output schema

87fe9c7

Make PowerShell install smoke use local checkout

42bf2c8

shinpr merged commit 8522cbd into main May 25, 2026
4 checks passed

shinpr deleted the agent/task-20260525094613-f0e611-add-setup-executor-preflight-and-learned-environment-setup branch May 25, 2026 13:18
