Context
Raised by Munger in PR #1735 (round-1 review). Currently the _async_migrations.status enum collapses operator-initiated cancellation into failed, which makes triage harder ("did this die because we killed it, or because of a real fault?").
Proposal
Add a cancelled status distinct from failed in:
cmd/ingestor/async_migration.go: defer in RunAsyncMigration detects errors.Is(err, context.Canceled) and writes status='cancelled' instead of failed.
cmd/server/async_migrations.go: mapAsyncStatus maps the stored cancelled value to a new cancelled enum value.
public/warmup-banner.js: surface cancelled as an info-tone line (not error), and treat it as auto-dismiss-eligible.
Why
failed should mean "the migration tried and could not complete due to an error worth investigating." Operator-initiated cancellation (shutdown, SIGTERM) is not that.