CI: governance drift check (closes #58) #64
Merged
Token unblock confirmed
This is the workflow doing exactly what it was built to do. The remaining red run is now an upstream-registry-alignment problem, not a CI problem.
What's left to make this PR mergeable
Once 2 + 3 land, the next push will produce the green run we need for the PR description's "one green / one red" evidence.
ProfessorPolymorphic added a commit to ui-insight/data-governance that referenced this pull request May 1, 2026
ProcessMapping's live data/allowed_values.json shipped a 10th vocabulary group (DocumentType, 8 values) covering award documents, funding opportunities, budget forms, proposal narratives, subaward agreements, and sponsor correspondence. The vendored registry was still pinned at the prior 9/70 state. The drift validator at scripts/check_governance_drift.py was correctly flagging this on every CI run downstream (ui-insight/AISPEG#64). After this change the validator passes 36/36 locally with exit 0.

Changes:
- vocabularies/processmapping/allowed_values.json: insert DocumentType group (alphabetical between ActorType and FileType); bump version to 1.1.0; update last_updated
- catalog/processmapping.json: register the new group in controlled_vocabularies
- scripts/check_governance_drift.py: bump hardcoded totals (56 -> 57 groups, 315 -> 323 values) and update the ProcessMapping expected-values literals (9/70 -> 10/78)
- docs/index.md: aggregate vocabulary count summary (56/315 -> 57/323)
- docs/vocabulary/index.md: ProcessMapping row (9/70 -> 10/78)
- docs/domains/process-mapping.md: stat block (9 (70 codes) -> 10 (78 codes))

Closes #10.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
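The group/value comparison described in this commit can be illustrated with a short sketch. This is not the logic of `scripts/check_governance_drift.py`: the JSON shape (group name mapped to a list of values), the helper names `count_vocab` and `find_drift`, and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed shape: {"GroupName": ["VALUE_A", "VALUE_B", ...], ...}
Vocab = Dict[str, List[str]]


def count_vocab(vocab: Vocab) -> Tuple[int, int]:
    """Return (number of groups, total number of values)."""
    return len(vocab), sum(len(values) for values in vocab.values())


def find_drift(vendored: Vocab, live: Vocab) -> List[str]:
    """List readable differences between vendored and live vocabularies."""
    findings = []
    for group in sorted(set(live) - set(vendored)):
        findings.append(f"group missing from vendored registry: {group}")
    for group in sorted(set(vendored) - set(live)):
        findings.append(f"group missing from live repo: {group}")
    for group in sorted(set(vendored) & set(live)):
        if set(vendored[group]) != set(live[group]):
            findings.append(f"values differ in group: {group}")
    return findings


# Toy data, not the real ProcessMapping vocabulary.
vendored = {"ActorType": ["PERSON", "SYSTEM"], "FileType": ["PDF"]}
live = {**vendored, "DocumentType": ["AWARD", "BUDGET_FORM"]}

print(count_vocab(vendored))  # (2, 3)
print(count_vocab(live))      # (3, 5)
print(find_drift(vendored, live))
```

Comparing group names as well as totals means a rename that leaves the counts unchanged would still be reported, which a totals-only check would miss.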
Add a Governance Drift workflow that runs the upstream check_governance_drift.py validator on PRs that touch the governance surface (vendor/data-governance/**, lib/governance/**, app/standards/data-model/**, scripts/build-governance-catalog.*, and the workflow itself). Failures block the merge so the vendored registry cannot drift silently from the live portfolio repos.

- actions/checkout@v4 with submodules: recursive so the vendored data-governance content is present for the script.
- actions/setup-python@v5 (3.11). The drift script is pure stdlib (base64, json, os, subprocess, sys, urllib, pathlib, typing), so no requirements.txt and no pip install step are needed; a comment in the workflow records this so a future reader does not re-derive it.
- The script's remote-check path uses GITHUB_TOKEN for api.github.com calls, with a fallback to the preinstalled `gh` CLI on ubuntu-latest.
- Output is teed into $GITHUB_STEP_SUMMARY as a fenced code block with a PASS/FAIL header so reviewers can triage without opening the run log. Job exit code is propagated unchanged from the script.

CLAUDE.md: short note in Development commands describing the workflow and the path filter.
REFACTOR.md: new Sprint 5 entry capturing the data-governance integration (submodule + catalog generator from #55, drift CI from this slice).

Closes #58.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
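The summary-and-exit-code behaviour described above can be sketched in Python. The workflow itself does this in a shell step; the function name `run_with_summary`, the heading text, and the stand-in command below are hypothetical. Only the `GITHUB_STEP_SUMMARY` file convention comes from GitHub Actions.

```python
import os
import subprocess
import sys
import tempfile
from typing import List


def run_with_summary(cmd: List[str], summary_path: str) -> int:
    """Run cmd, append its output to summary_path, return its exit code."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
    )
    header = "PASS" if proc.returncode == 0 else "FAIL"
    fence = "`" * 3  # markdown code fence, built here to avoid nesting
    with open(summary_path, "a", encoding="utf-8") as fh:
        fh.write(f"### Governance drift: {header}\n\n")
        fh.write(f"{fence}\n{proc.stdout.rstrip()}\n{fence}\n")
    sys.stdout.write(proc.stdout)  # keep the output in the run log too
    return proc.returncode


# Demo with a stand-in command that prints a finding and exits 1.
fake_check = [sys.executable, "-c", "print('drift found'); raise SystemExit(1)"]
with tempfile.TemporaryDirectory() as tmp:
    summary_file = os.path.join(tmp, "summary.md")
    exit_code = run_with_summary(fake_check, summary_file)
    with open(summary_file, encoding="utf-8") as fh:
        summary_text = fh.read()

print(exit_code)
```

In a real step the summary path would come from `os.environ["GITHUB_STEP_SUMMARY"]` and the returned code would be passed to `sys.exit`, so a failing check still fails the job.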
The first run of the Governance Drift workflow crashed because the auto-issued workflow GITHUB_TOKEN is scoped only to ui-insight/AISPEG; the script's remote-check phase reads files from sibling private repos (OpenERA, UCMDailyRegister, AuditDashboard, StratPlanTacticsMB, ProcessMapping), which return 404 against that token. The script's `file_text` path does not catch RemoteMissing, so the script aborts with an unhandled exception before getting to the legitimate drift findings.

The upstream script already reads PORTFOLIO_GH_TOKEN first (see GitHubClient.__init__), so the fix is to expose that secret to the step alongside GITHUB_TOKEN. Once an org-level secret with `repo` scope on the five sibling repos is provisioned as PORTFOLIO_GH_TOKEN, the script will use the HTTP path with that token and run to completion. GITHUB_TOKEN remains as a fallback so the workflow does not fail catastrophically when the secret is missing.

I am not adding a try/except wrapper around the script invocation or masking the RemoteMissing crash; per the brief's stop condition, the script's robustness is upstream's call. This change just hands the script the credentials it documents that it expects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
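The failure mode described in this commit message can be reproduced with a toy model. The names `RemoteMissing` and `file_text` are taken from the text above; their bodies, the `run_checks` loop, and the `visible_repos` parameter are hypothetical stand-ins, not the upstream implementation. The `try/except` exists only so the demo can record the outcome and is not a proposed change to the workflow.

```python
from typing import List, Optional, Set


class RemoteMissing(Exception):
    """Stand-in for the upstream exception of the same name."""


def file_text(repo: str, path: str, visible_repos: Set[str]) -> str:
    """Hypothetical fetch: raises when the token cannot see the repo (404)."""
    if repo not in visible_repos:
        raise RemoteMissing(f"404: {repo}/{path}")
    return "contents"


def run_checks(repos: List[str], visible_repos: Set[str]) -> List[str]:
    """Fetch one file per repo with no exception handling around the fetch."""
    completed = []
    for repo in repos:
        file_text(repo, "docker-compose.yml", visible_repos)  # may raise
        completed.append(repo)
    return completed


repos = ["ui-insight/OpenERA", "ui-insight/ProcessMapping"]

# Repo-scoped token: no sibling repo is visible, so the first fetch aborts.
aborted_on: Optional[str] = None
try:
    run_checks(repos, visible_repos=set())
except RemoteMissing as exc:
    aborted_on = str(exc)

# Broader token: every repo is visible and all checks complete.
completed = run_checks(repos, visible_repos=set(repos))

print(aborted_on)
print(len(completed))
```

With the narrow token nothing after the first fetch runs, which is why the first CI attempt never reached the real drift findings.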
ui-insight/data-governance#11 merged the ProcessMapping registry sync that adds the DocumentType vocabulary group (8 values). Bumping the submodule pointer + regenerating lib/governance/vocabularies.ts (48 → 49 vocabulary groups) so this branch's drift workflow will pass on the next CI run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
e9f1cd8 to 34e75d0
All-green achieved
The submodule bump landed (
The PR now has both a red run (attempt 1: token-scope issue, since fixed by org-secret provisioning) and a green run (latest) captured in the run history, satisfying the "one green / one red" evidence acceptance criterion from #58.
Trajectory
Merging.
Summary
Adds a Governance Drift GitHub Actions workflow that runs the upstream `vendor/data-governance/scripts/check_governance_drift.py` validator on PRs that touch the governance surface, plus the corresponding CLAUDE.md / REFACTOR.md updates.

- Path filter: `vendor/data-governance/**`, `lib/governance/**`, `app/standards/data-model/**`, `scripts/build-governance-catalog.*`, and the workflow file itself
- `actions/checkout@v4` with `submodules: recursive`
- `actions/setup-python@v5` at Python 3.11
- No `pip install`: the upstream drift script is pure stdlib (`base64`, `json`, `os`, `subprocess`, `sys`, `urllib.error`, `urllib.request`, `pathlib`, `typing`); confirmed by reading the script and by the absence of `requirements.txt` / `pyproject.toml` in `vendor/data-governance/`
- `PORTFOLIO_GH_TOKEN` (preferred; read first by the script) and `GITHUB_TOKEN` (fallback) are exported to the step; this is required because the script reads files from sibling private repos (see "Findings" below)
- Output is teed into `$GITHUB_STEP_SUMMARY` as a fenced code block with a PASS/FAIL header

Acceptance criteria
- `.github/workflows/governance-drift.yml`
- `submodules: recursive`
- `CLAUDE.md` and `REFACTOR.md` updated

Findings: two real issues that block a green run today
The CI on this PR is currently red, and per the brief's stop condition I
am not papering over either issue. Both serve as the red CI evidence:
I could not produce a green run from this branch alone. Reasons:
1. Cross-repo token scope (workflow side)
The script's `run_remote_checks` reads files from five sibling repos in the `ui-insight` org (OpenERA, UCMDailyRegister, AuditDashboard, StratPlanTacticsMB, ProcessMapping), four of which are private. The auto-issued workflow `GITHUB_TOKEN` is scoped only to `ui-insight/AISPEG` and returns 404 on `repos/ui-insight/OpenERA/contents/docker-compose.yml`. The script's `file_text` method does not catch `RemoteMissing`, so it crashes mid-run with an uncaught exception.
The script already documents the right env var for this: `GitHubClient.__init__` reads `PORTFOLIO_GH_TOKEN` first. Once an org-level secret named `PORTFOLIO_GH_TOKEN` (a PAT with `repo` scope on the five sibling repos) is provisioned and visible to this workflow, the script will use the HTTP path with that token and run to completion. The workflow already exposes `PORTFOLIO_GH_TOKEN` to the step, so no further AISPEG-side change is needed once the secret exists.
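The token precedence described above amounts to a short lookup. The sketch below is an illustration rather than the body of `GitHubClient.__init__`: the function name `resolve_token` is hypothetical, and treating an empty value as missing is an assumption (it matters in GitHub Actions because an unprovisioned secret expands to an empty string).

```python
from typing import Mapping, Optional


def resolve_token(env: Mapping[str, str]) -> Optional[str]:
    """Prefer PORTFOLIO_GH_TOKEN, fall back to GITHUB_TOKEN, else None."""
    for name in ("PORTFOLIO_GH_TOKEN", "GITHUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:  # assumption: empty or whitespace-only counts as unset
            return value
    return None


# Secret provisioned: the portfolio token wins.
both = resolve_token({"PORTFOLIO_GH_TOKEN": "pat-123", "GITHUB_TOKEN": "ghs-456"})
# Secret not provisioned: Actions passes an empty string, so the fallback is used.
fallback = resolve_token({"PORTFOLIO_GH_TOKEN": "", "GITHUB_TOKEN": "ghs-456"})
# Neither set: the caller has to take another path or fail.
neither = resolve_token({})

print(both, fallback, neither)
```

In the script itself the mapping would be `os.environ`; passing it in as a parameter keeps the sketch easy to exercise without touching the real environment.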
2. Pre-existing upstream registry drift (data-governance side)
Even after the token issue is fixed, the script will fail one check.
Verified locally with the host's `gh` auth: the live `ui-insight/ProcessMapping` repo's `data/allowed_values.json` currently has 10 vocabulary groups / 78 values; the vendored data-governance registry expects 9 / 70. The submodule pointer is already at `ecb9eaccbf23d0a8817366cc85f55a01a32e62a5`, the tip of `ui-insight/data-governance@main`, so this is genuine pre-existing drift in the upstream registry, not introduced by this PR.
Recommended unblock: open a follow-up PR against `ui-insight/data-governance` to align `catalog/processmapping.json`, `vocabularies/processmapping/allowed_values.json`, and the relevant docs/README rows with the live ProcessMapping vocabulary counts. Then bump the AISPEG submodule pointer.
What I did not do
- Add a `try/except` around the script invocation or `|| true` to mask the failure.
The workflow itself is functioning exactly as designed: it fails the build
on detected drift, surfaces the reason in the step summary, and the path
filter keeps it quiet on PRs that don't touch the governance surface.
Provisioning the secret + bumping upstream registry are out-of-scope
operator actions; once both are done, this PR's CI will go green on a
re-run.
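The path filter's effect can be approximated locally. The patterns below are the ones listed in this PR; the helper name `touches_governance` and the sample paths are hypothetical, and `fnmatch` is only an approximation of GitHub's own filter syntax (in `fnmatch`, `*` also matches `/`, which is looser than GitHub's single `*`).

```python
from fnmatch import fnmatchcase
from typing import Iterable

GOVERNANCE_PATTERNS = (
    "vendor/data-governance/**",
    "lib/governance/**",
    "app/standards/data-model/**",
    "scripts/build-governance-catalog.*",
    ".github/workflows/governance-drift.yml",
)


def touches_governance(changed_paths: Iterable[str]) -> bool:
    """Return True if any changed path matches a governance pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in GOVERNANCE_PATTERNS
    )


# Hypothetical change sets for illustration.
print(touches_governance(["vendor/data-governance/catalog/processmapping.json"]))
print(touches_governance(["scripts/build-governance-catalog.ts"]))
print(touches_governance(["app/portfolio/page.tsx", "README.md"]))
```

The first two change sets would trigger the workflow and the third would not, which is the quiet behaviour the path filter is meant to provide.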
Test plan
- Once `PORTFOLIO_GH_TOKEN` is provisioned and the upstream registry drift is fixed, re-run the workflow and confirm it goes green
- Push a change outside the governance surface (e.g. `app/portfolio/**`) and confirm this workflow does not run

🤖 Generated with Claude Code