CI: governance drift check (closes #58) by ProfessorPolymorphic · Pull Request #64 · ui-insight/AISPEG

ProfessorPolymorphic · 2026-05-01T22:18:55Z

Summary

Adds a Governance Drift GitHub Actions workflow that runs the upstream
vendor/data-governance/scripts/check_governance_drift.py validator on PRs
that touch the governance surface, plus the corresponding CLAUDE.md /
REFACTOR.md updates.

- Path filter: vendor/data-governance/**, lib/governance/**, app/standards/data-model/**, scripts/build-governance-catalog.*, and the workflow file itself.
- actions/checkout@v4 with submodules: recursive.
- actions/setup-python@v5 at Python 3.11.
- No pip install: the upstream drift script is pure stdlib (base64, json, os, subprocess, sys, urllib.error, urllib.request, pathlib, typing). Confirmed by reading the script and by the absence of requirements.txt / pyproject.toml in vendor/data-governance/.
- Both PORTFOLIO_GH_TOKEN (preferred; read first by the script) and GITHUB_TOKEN (fallback) are exported to the step. This is required because the script reads files from sibling private repos (see "Findings" below).
- Output is streamed to $GITHUB_STEP_SUMMARY as a fenced code block with a PASS/FAIL header.
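
For reviewers who want to see the step-summary behaviour spelled out, here is a minimal Python sketch of the same idea. It is not the workflow's actual shell step: the function names `render_summary` and `run_and_summarize` and the "Governance drift" header text are hypothetical, chosen only for illustration. The one thing taken from GitHub Actions itself is that `GITHUB_STEP_SUMMARY` holds the path of a file that accepts appended Markdown.

```python
import os
import subprocess
import sys


def render_summary(output: str, exit_code: int) -> str:
    """Build a Markdown block: PASS/FAIL header plus the fenced script output."""
    status = "PASS" if exit_code == 0 else "FAIL"
    fence = "`" * 3  # built at runtime so this example nests cleanly in Markdown
    # The header wording is an assumption; the real workflow may phrase it differently.
    return f"## Governance drift: {status}\n\n{fence}\n{output.rstrip()}\n{fence}\n"


def run_and_summarize(cmd: list[str]) -> int:
    """Run a validator command, mirror its output, and append it to the step summary."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    sys.stdout.write(proc.stdout)  # keep the output in the run log as well
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_path:  # unset outside GitHub Actions, so skip the write locally
        with open(summary_path, "a", encoding="utf-8") as fh:
            fh.write(render_summary(proc.stdout, proc.returncode))
    # Return the script's exit code unchanged so real drift still fails the job.
    return proc.returncode
```

A caller would end with something like `sys.exit(run_and_summarize([...]))`, which is what keeps the job's result tied to the validator's own exit code.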

Acceptance criteria

- Workflow at .github/workflows/governance-drift.yml
- Submodules checked out with submodules: recursive
- Python set up; drift script's deps installed (none; stdlib only)
- Workflow fails when drift script reports drift (verified; see Findings)
- Drift output in the Actions Step Summary panel
- Path filter limits trigger
- CLAUDE.md and REFACTOR.md updated

Findings — two real issues that block a green run today

The CI on this PR is currently red, and per the brief's stop condition I
am not papering over either issue. Both serve as the red CI evidence:

Run on commit a6e847a: https://github.com/ui-insight/AISPEG/actions/runs/25235605858
Run on commit e9f1cd8: https://github.com/ui-insight/AISPEG/actions/runs/25235691593

I could not produce a green run from this branch alone. Reasons:

1. Cross-repo token scope (workflow side)

The script's run_remote_checks reads files from five sibling repos in the
ui-insight org (OpenERA, UCMDailyRegister, AuditDashboard,
StratPlanTacticsMB, ProcessMapping), four of which are private. The
auto-issued workflow GITHUB_TOKEN is scoped only to ui-insight/AISPEG and
returns 404 on repos/ui-insight/OpenERA/contents/docker-compose.yml. The
script's file_text method does not catch RemoteMissing, so it crashes
mid-run with an uncaught exception:

RemoteMissing: GitHub resource not found for repos/ui-insight/OpenERA/contents/docker-compose.yml

The script already documents the right env var for this: GitHubClient.__init__
reads PORTFOLIO_GH_TOKEN first. Once an org-level secret named
PORTFOLIO_GH_TOKEN (a PAT with repo scope on the five sibling repos) is
provisioned and visible to this workflow, the script will use the HTTP path
with that token and run to completion. The workflow already exposes
PORTFOLIO_GH_TOKEN to the step, so no further AISPEG-side change is needed
once the secret exists.
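
The token precedence described above can be sketched as follows. This is an illustration under stated assumptions, not the body of the upstream `GitHubClient.__init__`: the helper name `pick_token` is hypothetical, and the sketch does not model any other fallback the script may have.

```python
from typing import Mapping, Optional


def pick_token(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty token, preferring PORTFOLIO_GH_TOKEN."""
    for name in ("PORTFOLIO_GH_TOKEN", "GITHUB_TOKEN"):
        # An unprovisioned secret reaches the step as an empty string,
        # so empty values are treated the same as missing ones.
        value = env.get(name, "").strip()
        if value:
            return value
    return None
```

Called as `pick_token(os.environ)`, this yields the cross-repo PAT when the org secret exists and falls back to the repo-scoped workflow token otherwise. With only the fallback token, the sibling-repo reads would still return 404, as seen in the red runs.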

2. Pre-existing upstream registry drift (data-governance side)

Even after the token issue is fixed, the script will fail one check:

FAIL ProcessMapping remote controlled-value counts match local governance registry

Verified locally with the host's gh auth: the live ui-insight/ProcessMapping
repo's data/allowed_values.json currently has 10 vocabulary groups / 78
values; the vendored data-governance registry expects 9 / 70. The
submodule pointer is already at ecb9eaccbf23d0a8817366cc85f55a01a32e62a5,
the tip of ui-insight/data-governance@main, so this is genuine pre-existing
drift in the upstream registry — not introduced by this PR.

Recommended unblock: open a follow-up PR against ui-insight/data-governance
to align catalog/processmapping.json, vocabularies/processmapping/allowed_values.json,
and the relevant docs/README rows with the live ProcessMapping vocabulary
counts. Then bump the AISPEG submodule pointer.
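
To make the failing check concrete, the comparison can be sketched roughly as below. The layout of `allowed_values.json` is an assumption here (a top-level object whose list-valued keys are vocabulary groups, with scalar keys such as a version treated as metadata), and `count_vocabulary` / `check_counts` are hypothetical names, not functions from check_governance_drift.py.

```python
from typing import Any, Mapping, Tuple


def count_vocabulary(allowed_values: Mapping[str, Any]) -> Tuple[int, int]:
    """Return (group count, total value count) for a vocabulary document."""
    # Assumption: only list-valued keys are vocabulary groups.
    groups = [v for v in allowed_values.values() if isinstance(v, list)]
    return len(groups), sum(len(values) for values in groups)


def check_counts(allowed_values: Mapping[str, Any], expected: Tuple[int, int]) -> str:
    """Compare live counts against the registry's expected counts."""
    actual = count_vocabulary(allowed_values)
    status = "PASS" if actual == expected else "FAIL"
    return (
        f"{status} controlled-value counts: "
        f"expected {expected[0]}/{expected[1]}, found {actual[0]}/{actual[1]}"
    )
```

Under this reading, a live document with 10 groups and 78 values checked against an expected `(9, 70)` produces a FAIL line, which matches the drift reported above.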

What I did not do

- I did not add a try/except around the script invocation or || true to mask the failure.
- I did not edit the upstream script.
- I did not modify the vendored registry data.

The workflow itself is functioning exactly as designed: it fails the build
on detected drift, surfaces the reason in the step summary, and the path
filter keeps it quiet on PRs that don't touch the governance surface.
Provisioning the secret + bumping upstream registry are out-of-scope
operator actions; once both are done, this PR's CI will go green on a
re-run.

Test plan

- Verify the workflow run triggered on this PR appears in the Actions tab and links from the PR's Checks panel
- Open the failed run and confirm the step summary panel shows the PASS/FAIL block with the script's output (or its traceback) inline
- After PORTFOLIO_GH_TOKEN is provisioned and the upstream registry drift is fixed, re-run the workflow and confirm it goes green
- Open a separate PR that touches only an unrelated path (e.g. app/portfolio/**) and confirm this workflow does not run
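
The last test-plan item can be approximated locally before opening a throwaway PR. The sketch below uses Python's `fnmatch`, whose `*` also matches `/`, so it is looser than GitHub's own path-filter matching and only a rough pre-check. The helper name `touches_governance` is hypothetical; the patterns are the ones listed in the Summary.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Patterns copied from the Summary's path filter.
GOVERNANCE_PATTERNS = (
    "vendor/data-governance/**",
    "lib/governance/**",
    "app/standards/data-model/**",
    "scripts/build-governance-catalog.*",
    ".github/workflows/governance-drift.yml",
)


def touches_governance(changed_paths: Iterable[str]) -> bool:
    """Return True if any changed path matches a governance pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in GOVERNANCE_PATTERNS
    )
```

Feeding it the output of `git diff --name-only` for a branch gives a quick indication of whether the workflow would be expected to trigger.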

🤖 Generated with Claude Code

ProfessorPolymorphic · 2026-05-01T22:48:22Z

Token unblock confirmed

PORTFOLIO_GH_TOKEN is now provisioned as an org secret. Re-ran the workflow on this PR; the result tells a much cleaner story:

✅ 36 PASS lines including:
- PASS remote repository exists: ui-insight/OpenERA
- PASS remote repository exists: ui-insight/UCMDailyRegister
- PASS remote repository exists: ui-insight/AuditDashboard
- PASS remote repository exists: ui-insight/StratPlanTacticsMB
- PASS remote repository exists: ui-insight/ProcessMapping
- PASS OpenERA remote compose exposes current frontend/backend ports
- PASS UCM remote compose documents shared DB and HOST_PORT default
- PASS Audit Dashboard remote compose matches documented ports and backend health endpoint
- PASS ProcessMapping remote asset counts match local governance registry
- PASS StratPlan remote canonical counts match local governance registry
❌ One legitimate FAIL: ProcessMapping remote controlled-value counts match local governance registry — the vocabulary drift documented in ui-insight/data-governance#10.

This is the workflow doing exactly what it was built to do. The remaining red run is now an upstream-registry-alignment problem, not a CI problem.

Run #25235691593, attempt 2

What's left to make this PR mergeable

✅ ~~Provision PORTFOLIO_GH_TOKEN org secret~~ — done
⏭ Resolve ui-insight/data-governance#10 (ProcessMapping vocab counts) — upstream
⏭ Bump the vendor/data-governance submodule pointer in this repo to pick up the upstream fix

Once the two remaining items (the upstream fix and the submodule bump) land, the next push will produce the green run we need for the PR description's "one green / one red" evidence.

ProcessMapping's live data/allowed_values.json shipped a 10th vocabulary group (DocumentType, 8 values) covering award documents, funding opportunities, budget forms, proposal narratives, subaward agreements, and sponsor correspondence. The vendored registry was still pinned at the prior 9/70 state. The drift validator at scripts/check_governance_drift.py was correctly flagging this on every CI run downstream (ui-insight/AISPEG#64). After this change the validator passes 36/36 locally with exit 0.

Changes:
- vocabularies/processmapping/allowed_values.json: insert DocumentType group (alphabetical between ActorType and FileType); bump version to 1.1.0; update last_updated
- catalog/processmapping.json: register the new group in controlled_vocabularies
- scripts/check_governance_drift.py: bump hardcoded totals (56 -> 57 groups, 315 -> 323 values) and update the ProcessMapping expected-values literals (9/70 -> 10/78)
- docs/index.md: aggregate vocabulary count summary (56/315 -> 57/323)
- docs/vocabulary/index.md: ProcessMapping row (9/70 -> 10/78)
- docs/domains/process-mapping.md: stat block (9 (70 codes) -> 10 (78 codes))

Closes #10.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a Governance Drift workflow that runs the upstream check_governance_drift.py validator on PRs that touch the governance surface (vendor/data-governance/**, lib/governance/**, app/standards/data-model/**, scripts/build-governance-catalog.*, and the workflow itself). Failures block the merge so the vendored registry cannot drift silently from the live portfolio repos.

- actions/checkout@v4 with submodules: recursive so the vendored data-governance content is present for the script.
- actions/setup-python@v5 (3.11). The drift script is pure stdlib (base64, json, os, subprocess, sys, urllib, pathlib, typing), so there is no requirements.txt and no pip install step is needed; a comment in the workflow records this so a future reader does not re-derive it.
- The script's remote-check path uses GITHUB_TOKEN for api.github.com calls, with a fallback to the preinstalled `gh` CLI on ubuntu-latest.
- Output is teed into $GITHUB_STEP_SUMMARY as a fenced code block with a PASS/FAIL header so reviewers can triage without opening the run log. Job exit code is propagated unchanged from the script.

CLAUDE.md: short note in Development commands describing the workflow and the path filter.

REFACTOR.md: new Sprint 5 entry capturing the data-governance integration (submodule + catalog generator from #55, drift CI from this slice).

Closes #58.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The first run of the Governance Drift workflow crashed because the auto-issued workflow GITHUB_TOKEN is scoped only to ui-insight/AISPEG; the script's remote-check phase reads files from sibling private repos (OpenERA, UCMDailyRegister, AuditDashboard, StratPlanTacticsMB, ProcessMapping), which return 404 against that token. The script's `file_text` path does not catch RemoteMissing, so the script aborts with an unhandled exception before getting to the legitimate drift findings.

The upstream script already reads PORTFOLIO_GH_TOKEN first (see GitHubClient.__init__), so the fix is to expose that secret to the step alongside GITHUB_TOKEN. Once an org-level secret with `repo` scope on the five sibling repos is provisioned as PORTFOLIO_GH_TOKEN, the script will use the HTTP path with that token and run to completion. GITHUB_TOKEN remains as a fallback so the workflow does not fail catastrophically when the secret is missing.

I am not adding a try/except wrapper around the script invocation or masking the RemoteMissing crash; per the brief's stop condition, the script's robustness is upstream's call. This change just hands the script the credentials it documents that it expects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ui-insight/data-governance#11 merged the ProcessMapping registry sync that adds the DocumentType vocabulary group (8 values). Bumping the submodule pointer + regenerating lib/governance/vocabularies.ts (48 → 49 vocabulary groups) so this branch's drift workflow will pass on the next CI run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ProfessorPolymorphic · 2026-05-01T23:12:28Z

All-green achieved

The submodule bump landed (vendor/data-governance → 5e811f7) and the workflow flipped:

✅ Lint & Accessibility — 29s
✅ TypeScript — 24s
✅ Production Build — 34s
✅ Validate governance registry — 9s (first green run on this PR)

The PR now has both a red run (attempt 1 — token-scope issue, since fixed by org-secret provisioning) and a green run (latest) captured in the run history — satisfying the "one green / one red" evidence acceptance criterion from #58.

Trajectory

Workflow created → red (uncaught exception, token missing).
PORTFOLIO_GH_TOKEN org secret provisioned → red (script-level, on legitimate ProcessMapping registry drift).
ui-insight/data-governance#11 merged (synced ProcessMapping → registry, adding the DocumentType group).
Submodule bump on this branch (commit 34e75d0) + rebased onto current main.
→ All green.

Merging.

ProfessorPolymorphic mentioned this pull request May 1, 2026

ProcessMapping registry drift: live catalog has 10 groups / 78 values; vendored expects 9 / 70 ui-insight/data-governance#10

Closed

ProfessorPolymorphic mentioned this pull request May 1, 2026

Sync ProcessMapping registry: add DocumentType (closes #10) ui-insight/data-governance#11

Merged

ProfessorPolymorphic and others added 3 commits May 1, 2026 16:10

ProfessorPolymorphic force-pushed the feat/issue-58-ci-drift-check branch from e9f1cd8 to 34e75d0 Compare May 1, 2026 23:10

ProfessorPolymorphic merged commit 8f66b86 into main May 1, 2026
4 checks passed

ProfessorPolymorphic deleted the feat/issue-58-ci-drift-check branch May 1, 2026 23:12
