Verification can report success when core checks are blocked by transient infra/tool failures (false-green risk) #50

@danshapiro

Problem

Stage success can be emitted even when core verification commands fail due to transient infra/tooling issues, creating false-green outcomes.

In run 01KJR25QS6RY52D7VS2ZRAAXMK (worker_1), verification commands failed but stage status was still written as success.

Why this matters

  • Reduces trust in stage/run success as a quality signal.
  • Makes it harder to distinguish “implemented + verified” from “implemented + verification skipped/blocked.”
  • Allows important quality gates to silently degrade under transient infra conditions.

Evidence

Artifacts:

  • ~/.local/state/kilroy/attractor/runs/01KJR25QS6RY52D7VS2ZRAAXMK/parallel/work_pool/pass1/02-worker_1/worker_1/stdout.log
  • .../parallel/work_pool/pass1/02-worker_1/worker_1/output.json

Concrete log lines (stdout.log):

  • :25 command failure: python: command not found (exit 127)
  • :143 npx -y tsc ... failed with EAI_AGAIN / DNS failure to registry.npmjs.org (exit 1)
  • :156 and :160 stage status written/read as "status":"success" with verification note acknowledging TS compile blocked by network

output.json also states that the TS compile could not run because npx was network-restricted, yet its summary still reports successful completion.
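
To make the distinction between a blocked check and a genuinely failing check concrete, here is a minimal sketch of how the two failures above could be classified. It is illustrative only and not Kilroy's implementation: `classify_check`, `CheckState`, and the `INFRA_MARKERS` list are hypothetical names. It assumes that exit codes 126/127 (the shell's "cannot execute" and "command not found" statuses) and network error codes such as `EAI_AGAIN` in stderr indicate that the check never really ran.

```python
from enum import Enum


class CheckState(Enum):
    PASSED = "passed"
    FAILED = "failed"    # the check ran and reported a real problem
    BLOCKED = "blocked"  # the check could not run (tooling or infra)


# Hypothetical list of transient network error codes; extend as needed.
INFRA_MARKERS = ("EAI_AGAIN", "ENOTFOUND", "ETIMEDOUT", "ECONNRESET")


def classify_check(exit_code: int, stderr: str) -> CheckState:
    """Classify one verification command from its exit code and stderr."""
    if exit_code == 0:
        return CheckState.PASSED
    # 126 and 127 are the shell's "cannot execute" / "command not found" codes.
    if exit_code in (126, 127):
        return CheckState.BLOCKED
    # Assumption: a known network error code in stderr means the tool never ran.
    if any(marker in stderr for marker in INFRA_MARKERS):
        return CheckState.BLOCKED
    return CheckState.FAILED
```

Under these assumptions, both the `python: command not found` (exit 127) failure and the `npx -y tsc` failure with `EAI_AGAIN` would be classified as blocked rather than passed or failed.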

Steps to reproduce / observe

  1. Inspect worker_1 logs in run 01KJR25QS6RY52D7VS2ZRAAXMK.
  2. Confirm non-zero exits for verification-related commands.
  3. Confirm status file still reports success.
  4. Compare verification failures vs final stage outcome classification.

Scope boundaries

This issue is about outcome semantics and verification reporting under transient infra failures.

This issue is not:

  • About fixing npm/DNS/environment reliability globally.
  • About forcing fail-closed behavior for every optional check.

Potential directions (non-prescriptive)

  • Separate implementation outcome from verification outcome in status schema.
  • Make required-vs-optional verification explicit and machine-readable.
  • Add degraded-success classification when required verification is blocked.
  • Surface verification execution matrix in stage/run summaries.
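
As a rough illustration of the first three directions, the sketch below separates the implementation outcome from per-check verification results and derives a stage outcome from them. All names (`StageStatus`, `CheckResult`, `degraded_success`) are hypothetical and not part of any existing schema. The rule that a failed required check yields `failure` is an assumption; the issue only proposes a degraded classification for blocked required checks.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CheckState(Enum):
    PASSED = "passed"
    FAILED = "failed"
    BLOCKED = "blocked"


@dataclass
class CheckResult:
    name: str
    required: bool          # required-vs-optional is explicit and machine-readable
    state: CheckState
    reason: str = ""        # e.g. why the check was blocked


@dataclass
class StageStatus:
    implementation_ok: bool
    checks: List[CheckResult] = field(default_factory=list)

    def outcome(self) -> str:
        """Derive the stage outcome; optional checks never change it."""
        if not self.implementation_ok:
            return "failure"
        required = [c for c in self.checks if c.required]
        # Assumption: a required check that ran and failed fails the stage.
        if any(c.state is CheckState.FAILED for c in required):
            return "failure"
        if any(c.state is CheckState.BLOCKED for c in required):
            return "degraded_success"
        return "success"
```

With this shape, a stage whose required TS compile was blocked by the network would report `degraded_success`, while a stage where every required check ran and passed would still report plain `success`.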

Definition of done

  • A stage cannot be reported as an unqualified success unless every required verification check both executed and passed.
  • Verification results are structured, explicit, and queryable.
  • Operators can clearly see whether verification passed, failed, or was blocked and why.
  • Existing true-green path remains unchanged when checks execute and pass.
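
One possible reading of "structured, explicit, and queryable" is sketched below: check records are plain dicts, a helper renders them as a text matrix for operators, and a second helper answers the question "which required checks did not run?". The record keys (`name`, `required`, `state`, `reason`) and both function names are assumptions made for illustration, not an existing format.

```python
from collections import Counter


def verification_summary(checks: list) -> str:
    """Render a plain-text verification matrix from check records.

    Each record is a dict with hypothetical keys:
    name (str), required (bool), state (str), reason (str, optional).
    """
    lines = []
    for c in checks:
        kind = "required" if c["required"] else "optional"
        reason = f" ({c['reason']})" if c.get("reason") else ""
        lines.append(f"{c['name']:<12} {kind:<9} {c['state']}{reason}")
    # Append a per-state tally so operators can scan the result quickly.
    counts = Counter(c["state"] for c in checks)
    totals = ", ".join(f"{state}={n}" for state, n in sorted(counts.items()))
    lines.append(f"totals: {totals}")
    return "\n".join(lines)


def blocked_required(checks: list) -> list:
    """Query helper: names of required checks that were blocked."""
    return [c["name"] for c in checks if c["required"] and c["state"] == "blocked"]
```

A run summary could call `verification_summary` once per stage, and a gate could refuse an unqualified success whenever `blocked_required` returns a non-empty list.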
