Symptom
When a step's test_suite contract fails (e.g. go test ./... exits non-zero), the run-detail UI shows:
non-retryable failure class "canceled", skipping remaining retries
Operators read "canceled" as "someone clicked stop" — but actually it just means "the contract validator categorised this as non-retryable so retries were skipped."
Concrete trace
wave-test-hardening-20260429-161707-ab9d — harden step ran go test ./..., hit a flaky/broken test in main (NOT caused by harden's writes), the test_suite contract failed, FailureClassifier labelled it canceled, no retry. UI surfaces "canceled" → user thinks Wave aborted the run.
Root
internal/pipeline/failure_modes.go (or wherever FailureClassifier lives) returns FailureClassCanceled for contract-validator non-retryable cases. The class is doing double duty: "user-cancelled" AND "contract said no point retrying". Two different operator stories collapsed into one label.
Fix
Split the class into two:
FailureClassUserCanceled — operator clicked cancel / SIGINT / explicit abort
FailureClassContractFailure — contract validator categorised the result as non-retryable (test_suite failed, json_schema failed with must_pass:true, etc)
Update UI rendering to display:
- canceled → grey badge, message "Run was canceled"
- contract_failure → red badge, message "Step failed: <contract_type> —
Details
"
Backward-compat: existing DB rows with the old combined value continue to render correctly (default to contract_failure since user-cancellation is rarer).
Severity
Low. Cosmetic / UX clarity. No functional impact.
Symptom
When a step's
test_suitecontract fails (e.g.go test ./...exits non-zero), the run-detail UI shows:Operators read "canceled" as "someone clicked stop" — but actually it just means "the contract validator categorised this as non-retryable so retries were skipped."
Concrete trace
wave-test-hardening-20260429-161707-ab9d— harden step rango test ./..., hit a flaky/broken test in main (NOT caused by harden's writes), thetest_suitecontract failed, FailureClassifier labelled itcanceled, no retry. UI surfaces "canceled" → user thinks Wave aborted the run.Root
internal/pipeline/failure_modes.go(or wherever FailureClassifier lives) returnsFailureClassCanceledfor contract-validator non-retryable cases. The class is doing double duty: "user-cancelled" AND "contract said no point retrying". Two different operator stories collapsed into one label.Fix
Split the class into two:
FailureClassUserCanceled— operator clicked cancel / SIGINT / explicit abortFailureClassContractFailure— contract validator categorised the result as non-retryable (test_suite failed, json_schema failed with must_pass:true, etc)Update UI rendering to display:
Details
"Backward-compat: existing DB rows with the old combined value continue to render correctly (default to contract_failure since user-cancellation is rarer).
Severity
Low. Cosmetic / UX clarity. No functional impact.