Problem
microsoft/apm is APM's canonical reference repo, yet we do not run apm audit --ci in our own pipeline. We also document an integration-drift CI gate to users (integrations/ci-cd.md "Verify deployed primitives") that we don't use ourselves. Both omissions are self-inflicted: the features exist, the docs exist, the dogfooding doesn't.
PRs #874, #875, and the initial commit of #878 all silently desynced .apm/instructions/cicd.instructions.md (canonical source) from .github/instructions/cicd.instructions.md (regenerated output) because the maintainer (me) edited the regenerated file directly. The next apm install run would have overwritten the new content with the stale source. That should never have made it past PR-time CI.
What's missing
Two CI gates that we ship as features and document publicly, but don't enforce on this repo:
Gate A: apm audit --ci
Documented at https://microsoft.github.io/apm/integrations/ci-cd/#verify-deployed-primitives. Runs the six baseline checks (lockfile-exists, ref-consistency, deployed-files-present, no-orphaned-packages, config-consistency, content-integrity) plus optional --policy checks (unmanaged-files, etc.).
Catches today, even without #684:
- Edited apm.yml without re-running install (ref drift)
- Deleted a file that's still in lockfile.deployed_files
- Sideloaded files in governance directories (with --policy)
- Hidden Unicode injection in deployed package content
Does not catch (today): content drift on already-deployed files. That's the scope of #684.
Gate B: Integration drift check (the "regeneration drift" gate)
The exact gate documented at integrations/ci-cd.md "Verify deployed primitives":
apm install
if [ -n "$(git status --porcelain -- .github/ .claude/ .cursor/ .opencode/)" ]; then
  echo "APM integration files are out of date. Run 'apm install' and commit."
  exit 1
fi
This is the gate that would have caught #874, #875, and the initial commit of #878. It catches the case where a maintainer edited a regenerated output file (.github/...) without updating the canonical authored source (.apm/...).
Strictly speaking this is a producer-side drift check (for repos that author APM packages), distinct from apm audit --ci's consumer-side install-fidelity check. Both should run; they catch different failure modes.
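The failure mode Gate B targets can be reproduced in a throwaway repository without APM installed. The sketch below is an illustration under stated assumptions: regenerate is a hypothetical stand-in for apm install that only copies .apm/instructions/ into .github/instructions/ (the real integration step does more than a copy), and drift_status is a hypothetical wrapper around the documented git status condition.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for `apm install`: copy canonical sources to the output tree.
regenerate() {
  mkdir -p .github/instructions
  cp .apm/instructions/*.md .github/instructions/
}

# Same condition as the documented gate, limited to .github/ for brevity.
drift_status() {
  if [ -n "$(git status --porcelain -- .github/)" ]; then
    echo "drift"
  else
    echo "clean"
  fi
}

# Work in a scratch repository so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Start with source and regenerated output in sync.
mkdir -p .apm/instructions
echo "rule v1" > .apm/instructions/cicd.instructions.md
regenerate
git add -A
git commit -q -m "source and output in sync"

# The mistake: edit the regenerated output, leave the source untouched, commit.
echo "rule v2" > .github/instructions/cicd.instructions.md
git commit -q -am "edit output only"

# CI regenerates from the stale source, which reverts the committed edit
# in the working tree and therefore shows up in `git status`.
regenerate
result="$(drift_status)"
echo "$result"   # prints "drift"
```

The commit itself looks clean; the divergence only becomes visible once the output is regenerated from the canonical source, which is why the gate has to run the install step first.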
Proposal
Add a single PR-time job to ci.yml (Tier 1, no secrets needed) that runs both gates via the official microsoft/apm-action. Dogfooding the action is itself part of the value: any consumer reading our ci.yml sees the same one-line setup we recommend they use.
apm-self-check:
  name: APM Self-Check
  runs-on: ubuntu-24.04
  steps:
    - uses: actions/checkout@v4

    # Installs the APM CLI (latest stable) and runs `apm install` against
    # this repo's apm.yml. Auto-detects target from the existing .github/
    # directory and re-integrates local .apm/ content, regenerating
    # .github/instructions/, .github/agents/, .github/skills/, etc.
    # Adds `apm` to PATH for subsequent steps.
    - uses: microsoft/apm-action@v1

    # Gate A: lockfile / install fidelity (consumer-side).
    # Verifies every file in lockfile.deployed_files exists, ref consistency
    # between apm.yml and apm.lock.yaml, no orphan packages, and
    # content-integrity (hidden Unicode) on deployed package content.
    # Does NOT verify deployed-file content vs lockfile (see #684).
    - name: apm audit --ci
      run: apm audit --ci

    # Gate B: regeneration drift (producer-side).
    # The action's `apm install` step re-integrated local .apm/ into
    # .github/ via target auto-detection. If anything in the governed
    # integration directories changed, someone edited the regenerated
    # output without updating the canonical .apm/ source.
    - name: Check APM integration drift
      run: |
        if [ -n "$(git status --porcelain -- .github/ .claude/ .cursor/ .opencode/)" ]; then
          echo "::error::APM integration files are out of date."
          echo "Run 'apm install' locally (with .github/ present) and commit the result."
          git --no-pager diff -- .github/ .claude/ .cursor/ .opencode/
          exit 1
        fi
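One detail of the drift step is worth noting: git status --porcelain reports untracked files, but git diff only covers tracked ones, so a newly generated file that was never committed would fail the job without appearing in the printed diff. A possible helper that reports both is sketched below. The function name apm_drift_report is hypothetical and not part of APM or the action, and the sketch assumes apm install has already been run.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report drift in the given directories, or in the four
# documented integration directories when called without arguments.
# Returns 0 when the directories are clean, 1 when anything differs.
apm_drift_report() {
  if [ "$#" -eq 0 ]; then
    set -- .github/ .claude/ .cursor/ .opencode/
  fi
  local changes
  changes="$(git status --porcelain -- "$@")"
  if [ -z "$changes" ]; then
    return 0
  fi
  echo "APM integration files are out of date:"
  # The status listing includes untracked files (prefixed "??"),
  # which `git diff` would omit.
  echo "$changes"
  git --no-pager diff -- "$@"
  return 1
}
```

In a CI step or a local pre-push hook, the inline if block could then be reduced to apm_drift_report || exit 1.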
Notes on action choice:
- The action runs apm install automatically (no --target input; install relies on auto-detection from existing .github/). That matches what we want here: re-integrate local .apm/ into the existing .github/ tree and surface any divergence.
- The action does NOT run apm audit --ci; its own audit-report mode runs apm audit -f sarif (Unicode-only security scan), a different surface. We add audit --ci as an explicit step.
- apm-version is intentionally unpinned. If a future change requires a specific CLI behavior, pin then; for now we want to catch breakage on latest stable as part of release readiness.
- A future job that needs to validate unreleased CLI behavior should be added alongside this one (built from source), not instead of it. Keep this job representative of what users run.
Add APM Self-Check to merge-gate.yml's EXPECTED_CHECKS list so it becomes a required check via the gate aggregator.
Pre-requisite cleanup
Running this gate on main today would fail because of pre-existing producer-side drift that it would now catch:
- .github/agents/auth-expert.agent.md: older than canonical .apm/agents/auth-expert.agent.md (from #856, "feat(auth): Azure DevOps authentication via Entra ID (AAD) bearer tokens", the ADO bearer work)
- .github/skills/auth/SKILL.md: same provenance
- .github/skills/pr-description-skill/*: multiple files drifted vs .apm/
A small precursor PR should run apm install locally and commit the regenerated .github/ tree before the self-check job is wired into CI. Otherwise the first PR-time run blocks every subsequent PR.
Acceptance criteria
- [...] .github/ to canonical .apm/ for the known drifted files (auth + pr-description-skill).
- ci.yml uses microsoft/apm-action@v1 and runs apm audit --ci on every PR; failures fail the job.
- [...] git status --porcelain on the regenerated directories after the action's install step; non-empty diff fails the job with a clear "run apm install and commit" message.
- merge-gate.yml EXPECTED_CHECKS updated to include the new check name; aggregator gate waits for it.
- [...] .github/instructions/cicd.instructions.md without touching .apm/. Self-check job should fail with the diff visible in the log.
- [...] apm.yml without re-running install. Self-check should fail with ref-consistency violation.
- [...] integrations/ci-cd.md pointing at our own ci.yml as a reference implementation.
Out of scope (deliberately)
- apm audit --drift / per-deployed-file content verification. Tracked in #684 (audit --ci: verify deployed file content, not just existence). This issue uses only features that exist today.
- [...] apm subcommand. That would be a follow-up after #684 lands; for now the documented apm install + git status shell pattern is what we ship to users.
Why now
We documented two CI gates and don't run them. Every PR that drifts .apm/ and .github/ is an embarrassment for a project whose value prop is "treat governance as code". The pattern caught us three times in a row, across #874, #875, and #878. The fix is trivial and the signal is high, both as a correctness gate and as marketing evidence ("we use our own action in our own required CI").
Related
- apm audit --ci: verify deployed file content, not just existence (the underlying feature gap)
- apm audit --drift (public-facing name for #684's scope)
- integrations/ci-cd.md "Verify deployed primitives"
- microsoft/apm-action: the official one-line CI setup; this issue dogfoods it