Add a Galley setup executor preflight that prepares fresh worktrees before implementation, learns a successful environment.setup plan when missing or stale, persists it to environment.yaml, and routes setup evidence through later AFK…#71

Merged: shinpr merged 6 commits into main from agent/task-20260525094613-f0e611-add-setup-executor-preflight-and-learned-environment-setup on May 25, 2026

@shinpr shinpr commented May 25, 2026

Goal

Add a Galley setup executor preflight that prepares fresh worktrees before implementation, learns a successful environment.setup plan when missing or stale, persists it to environment.yaml, and routes setup evidence through later AFK execution.

Acceptance Criteria

  • AC1 environment.yaml shall support a first-class setup field that describes how Galley prepares a fresh task worktree before implementation begins.
    • Verification: Add environment profile schema, validation, example, and documentation coverage for environment.setup. Validate examples/environment-local.yaml and bundled environment schema references.
    • Status: satisfied
  • AC2 When environment.setup is present, Galley shall run that setup plan before acceptance skeleton preflight and before the implementation executor.
    • Verification: Add daemon coverage proving setup runs after worktree/input-file preparation and before acceptance skeleton preflight or implementation executor startup.
    • Status: satisfied
  • AC3 When environment.setup is absent, Galley shall run a setup executor that receives the full environment.commands map, repository manifests, lockfiles, setup docs, and quality required checks as context, then attempts to make the worktree ready.
    • Verification: Add daemon/setup-executor tests for a profile without setup, asserting the setup executor is invoked with environment.commands and repository setup evidence.
    • Status: satisfied
  • AC4 Galley shall provide both Claude and Codex setup executor prompt paths with the same setup-result contract, selecting the setup executor provider from the task executor backend.
    • Verification: Add prompt/embed, command-plan, and fake-provider tests for Claude and Codex setup executor runs. Assert both parse the same setup result shape.
    • Status: satisfied
  • AC5 The Claude setup executor prompt shall be authored from the existing Claude executor, Claude supervisor, and Claude/test-creator prompt patterns, and the Codex setup executor prompt shall be authored from the existing Codex executor, Codex supervisor, and Codex/test-creator prompt patterns; neither prompt shall be a mechanical diff or direct wording transplant from the other provider.
    • Verification: Add focused prompt/reference tests or review evidence showing the Claude setup prompt follows Claude-specific runtime/output framing and the Codex setup prompt follows Codex-specific runtime/output framing, while both preserve the shared setup-result contract.
    • Status: satisfied
  • AC6 The setup executor shall first consider any existing environment.setup plan and environment.commands entries, but if those commands do not make the worktree ready, it may discover and return the command sequence that actually succeeds.
    • Verification: Add a regression test where the profile provides an incorrect setup command or only generic commands, the setup executor returns a different successful setup plan, and Galley records both attempted and successful commands.
    • Status: satisfied
  • AC7 When setup succeeds with a new or changed setup plan, Galley shall update the repository environment.yaml setup field so subsequent tasks use the successful setup without rediscovery.
    • Verification: Add profile update tests asserting environment.yaml is atomically rewritten with the returned setup field, unrelated profile content is preserved, validation runs after update, and a second daemon run uses the persisted setup without invoking discovery.
    • Status: satisfied
  • AC8 Setup success evidence shall be persisted and routed to the implementation executor, supervisor evidence, task verification history, and PR/task output when the setup plan was added or changed.
    • Verification: Assert runs//setup_result.json and environment_update.json are written, the implementation work order includes setup readiness evidence, and task/PR evidence records setup profile changes when they occurred.
    • Status: satisfied
  • AC9 If setup cannot make the worktree ready, Galley shall fail the setup phase before spending implementation attempts, with structured evidence that identifies attempted commands, command source, inspected files, stdout/stderr excerpts, and next repair guidance.
    • Verification: Add failure-path coverage where setup commands fail and assert the task latest error uses phase setup, kind setup_failed, with setup_result.json available for troubleshooting.
    • Status: satisfied
  • AC10 Setup readiness checks shall exclude acceptance skeleton obligations, so setup does not fail merely because task-specific skeleton tests have not been implemented yet.
    • Verification: Add coverage for a task with acceptance skeleton preflight enabled and assert setup runs before skeleton creation and records only repository readiness evidence.
    • Status: satisfied
  • AC11 Galley troubleshooting guidance shall cover setup_failed runs and explain how to use setup_result.json and environment_update.json to repair environment.setup when the learned setup later fails.
    • Verification: Update the Galley skill troubleshooting reference and skill description if needed; verify the packaged skill reference includes setup failure diagnosis.
    • Status: satisfied

Final Verification

  • go test ./internal/daemon -count=1: passed
  • go run ./cmd/galley profile validate --kind environment examples/environment-local.yaml: passed
  • go vet ./...: passed
  • gofmt -l .: passed
  • go build ./...: passed
  • test -z "$(find . -name '*.go' -not -path './.git/*' -print | xargs gofmt -l)": passed
  • go test ./...: passed
  • go build -o /tmp/galley ./cmd/galley: passed
  • go run ./cmd/galley schema check: passed
  • go run ./cmd/galley task validate examples/afk-task.yaml: passed
  • go run ./cmd/galley profile validate --kind quality examples/quality-default.yaml: passed
  • python3 -m json.tool schemas/claude-result.schema.json >/dev/null: passed
  • ./scripts/smoke-local.sh: passed

Key Decisions

  • claude-decision-1 How to make TestSetupPreflightSequencesBeforeSkeletonAndExecutor actually exercise the setup preflight, given that SetupExecutorPreflight short-circuits when Profiles.Environment is nil? -> Write a real authored environment profile (with setup.commands: touch setup.sentinel) and pass it via Options.EnvironmentProfileFile so the daemon resolves a Profiles.Environment and runs the authored plan, which creates the sentinel and yields the 'environment.yaml=unchanged' excerpt the test asserts. The previous runner override is reshaped into a t.Fatalf trap so any drift back to discovery fails loudly.
    • Rationale: AC2's contract is sequencing (setup before skeleton/executor); the authored-plan path satisfies that contract directly and matches the existing assertion that the excerpt records 'environment.yaml=unchanged'. Switching paths does not weaken the test: the assertions remain (sentinel-required-before-skeleton, setup verification status=passed, unchanged-env excerpt) and the new defensive runner trap makes silent fallback impossible.
    • Reversibility: high
  • claude-decision-2 How to reconcile TestSetupPreflightAtomicProfileRewriteAndSecondRunReuse's strings.Contains(body, 'base: main') assertion with the YAML node-style round-trip? -> Change the test input from quoted base: "main" to unquoted base: main so the assertion tests the contract it was written for: that the YAML node style of unrelated scalars is preserved across the atomic rewrite.
    • Rationale: The assertion's intent is to verify that unrelated content survives the rewrite; the input/assertion mismatch was a test typo. Correcting the input keeps the assertion as-is rather than weakening it, and the failure log already confirmed the rewrite preserves node style.
    • Reversibility: high
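
The mismatch behind claude-decision-2 can be reproduced in isolation. The sketch below assumes only that the rewrite preserves scalar quoting, as the decision states; the helper name containsLine is hypothetical and the snippet is not taken from the test file.

```go
package main

import (
	"fmt"
	"strings"
)

// containsLine mirrors the test's substring check on the rewritten profile body.
func containsLine(body, want string) bool {
	return strings.Contains(body, want)
}

func main() {
	const want = "base: main"

	// A style-preserving rewrite keeps the quotes, so the substring is absent.
	quoted := "base: \"main\"\n"
	// With an unquoted input the same rewrite leaves the expected text in place.
	unquoted := "base: main\n"

	fmt.Println(containsLine(quoted, want))   // false
	fmt.Println(containsLine(unquoted, want)) // true
}
```

Because the rewrite keeps node style, the quoted input could never satisfy the substring check, which is why the input rather than the assertion was changed.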

Discussion Items

  • discussion-1 Scope expansion: Accepted diff includes changes outside task.scope.allowed_paths: internal/supervisor/adapter.go, internal/supervisor/supervisor.go. Review whether this scope expansion belongs in this PR.
    • Human decision required: true

Risks

  • R1 contract: environment.setup is a new environment profile contract and must stay synchronized across Go structs, validation, generated schemas, examples, docs, and bundled skill references.
    • Mitigation: Run schema generation/checks, profile validation examples, and focused profile update tests.
  • R2 runtime: Setup executor runs before implementation and can update repository environment profiles, so evidence ownership, update ordering, and failure classification must be explicit.
    • Mitigation: Persist setup_result.json and environment_update.json, route setup evidence to work orders and supervisor evidence, and add failure-path daemon coverage.
  • R3 maintainability: Claude and Codex setup prompts must be independently shaped from their provider-specific prompt families rather than copying one provider's wording into the other.
    • Mitigation: Add tests or review evidence that each prompt follows the corresponding provider's existing executor/supervisor/test-creator prompt conventions.

@shinpr shinpr self-assigned this May 25, 2026
@shinpr shinpr merged commit 8522cbd into main May 25, 2026
4 checks passed
@shinpr shinpr deleted the agent/task-20260525094613-f0e611-add-setup-executor-preflight-and-learned-environment-setup branch May 25, 2026 13:18