deploy.yml: no failure-notification step — failed deploys are silent #44

@amiller

Symptom

PR #41 (`approver: silence ⚠️ timeout for lobby rooms nobody joined`) merged 2026-05-25, was auto-tagged v0.7.03, and kicked off a deploy that failed at the heads-up step (stale token, see related issue). The fix sat undeployed for 2 days. I noticed only because the very spam the PR was meant to silence was still landing.

There is no mechanism in deploy.yml that posts to Matrix, opens a GitHub issue, or pages anyone when the workflow fails.

Proposed fix

Add an `if: failure()` step at the end of the deploy job. Which channel it posts to depends on the heads-up channel decision (see related issue). Skeleton:

- name: notify on failure
  if: failure()
  env:
    KNOCK_APPROVER_TOKEN: ${{ secrets.KNOCK_APPROVER_TOKEN }}
    ADMIN_COMMAND_ROOM: ${{ secrets.ADMIN_COMMAND_ROOM }}
  run: |
    # post "❌ deploy {{ref}} ({{sha}}) failed — see {{run_url}}"
    # whichever channel issue #TBD picks
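
If the Matrix option is picked, the skeleton could be filled in roughly as below. This is a sketch under several assumptions, not a tested step: that `KNOCK_APPROVER_TOKEN` is a Matrix access token, that `ADMIN_COMMAND_ROOM` holds a room ID, and that the homeserver base URL is available in a secret named `MATRIX_HOMESERVER_URL` (a hypothetical name, not something deploy.yml has today). It uses the Matrix client-server send-message endpoint and relies on `jq` and `curl` being present on the runner.

```yaml
- name: notify on failure
  if: failure()
  env:
    KNOCK_APPROVER_TOKEN: ${{ secrets.KNOCK_APPROVER_TOKEN }}
    ADMIN_COMMAND_ROOM: ${{ secrets.ADMIN_COMMAND_ROOM }}
    # Hypothetical secret: homeserver base URL, e.g. https://matrix.example.org
    MATRIX_HOMESERVER_URL: ${{ secrets.MATRIX_HOMESERVER_URL }}
  run: |
    run_url="${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}"
    msg="❌ deploy ${GITHUB_REF_NAME} (${GITHUB_SHA::7}) failed, see ${run_url}"
    # Room IDs contain '!' and ':', so percent-encode before using them in the path.
    room="$(jq -rn --arg v "$ADMIN_COMMAND_ROOM" '$v | @uri')"
    # The transaction ID must be unique per message; run ID plus attempt number is.
    txn="deploy-${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}"
    # Build the JSON body with jq so the message is escaped correctly.
    jq -n --arg body "$msg" '{msgtype: "m.text", body: $body}' |
      curl --fail --silent --show-error -X PUT \
        -H "Authorization: Bearer ${KNOCK_APPROVER_TOKEN}" \
        -H "Content-Type: application/json" \
        --data-binary @- \
        "${MATRIX_HOMESERVER_URL}/_matrix/client/v3/rooms/${room}/send/m.room.message/${txn}"
```

`--fail` makes the step itself go red if the homeserver rejects the request, so a broken notifier is at least visible in the run log.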

Caveat: if KNOCK_APPROVER_TOKEN itself is the thing that's stale, this notification step will also fail. The pre-flight validation issue is the more reliable backstop for that specific class.
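
One way to sidestep that caveat is a notifier that does not touch `KNOCK_APPROVER_TOKEN` at all, for example opening a GitHub issue with the workflow's own token. The fragment below is a sketch only. It assumes the job is named `deploy`, that it runs on a GitHub-hosted runner (where the `gh` CLI is preinstalled), and that granting `issues: write` to the job is acceptable.

```yaml
jobs:
  deploy:
    # Assumption: job name. Setting permissions here replaces the defaults for
    # this job, so also list anything else the existing deploy steps need.
    permissions:
      contents: read
      issues: write
    steps:
      # ... existing deploy steps ...
      - name: open issue on failure
        if: failure()
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          run_url="${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}"
          gh issue create \
            --repo "$GITHUB_REPOSITORY" \
            --title "deploy ${GITHUB_REF_NAME} failed" \
            --body "Deploy of ${GITHUB_SHA} failed. See ${run_url}"
```

Repeated failures would open one issue per failed run, so some de-duplication may be wanted if this route is taken.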

Worth considering: GitHub Actions already sends email on workflow failure, and the mobile app pushes notifications for failed runs. These may be enough on their own once the team confirms they actually receive them.
