CI: governance drift check (closes #58) #64

Merged
ProfessorPolymorphic merged 3 commits into main from feat/issue-58-ci-drift-check
May 1, 2026

ProfessorPolymorphic commented May 1, 2026

Summary

Adds a Governance Drift GitHub Actions workflow that runs the upstream
vendor/data-governance/scripts/check_governance_drift.py validator on PRs
that touch the governance surface, plus the corresponding CLAUDE.md /
REFACTOR.md updates.

  • Path filter: vendor/data-governance/**, lib/governance/**, app/standards/data-model/**, scripts/build-governance-catalog.*, and the workflow file itself
  • actions/checkout@v4 with submodules: recursive
  • actions/setup-python@v5 at Python 3.11
  • No pip install — the upstream drift script is pure stdlib (base64, json, os, subprocess, sys, urllib.error, urllib.request, pathlib, typing); confirmed by reading the script and confirmed by the absence of requirements.txt / pyproject.toml in vendor/data-governance/
  • Both PORTFOLIO_GH_TOKEN (preferred — read first by the script) and GITHUB_TOKEN (fallback) are exported to the step; this is required because the script reads files from sibling private repos (see "Findings" below)
  • Output streamed to $GITHUB_STEP_SUMMARY as a fenced code block with a PASS/FAIL header
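
For reviewers who want to reason about which PRs trigger the workflow, here is a minimal Python sketch of the path filter listed above. The helper name `touches_governance_surface` is hypothetical, and `fnmatchcase` only approximates GitHub's glob rules (its `*` also crosses `/` boundaries), so treat it as an illustration of intent rather than the matcher Actions actually uses.

```python
from fnmatch import fnmatchcase

# Patterns copied from the path filter in this PR description.
GOVERNANCE_PATHS = [
    "vendor/data-governance/**",
    "lib/governance/**",
    "app/standards/data-model/**",
    "scripts/build-governance-catalog.*",
    ".github/workflows/governance-drift.yml",
]


def touches_governance_surface(changed_files):
    """Return True if any changed file matches a governance path pattern.

    Approximation only: fnmatchcase is not GitHub's glob implementation.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in GOVERNANCE_PATHS
    )


if __name__ == "__main__":
    print(touches_governance_surface(["lib/governance/vocabularies.ts"]))
    print(touches_governance_surface(["app/portfolio/page.tsx"]))
```

Under these assumptions, a PR that only edits files under app/portfolio/ would not match any pattern, which is the behaviour the test plan checks for.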

Acceptance criteria

  • Workflow at .github/workflows/governance-drift.yml
  • Submodules checked out with submodules: recursive
  • Python set up; drift script's deps installed (none — stdlib only)
  • Workflow fails when drift script reports drift (verified — see Findings)
  • Drift output in the Actions Step Summary panel
  • Path filter limits trigger
  • CLAUDE.md and REFACTOR.md updated

Findings — two real issues that block a green run today

The CI on this PR is currently red, and per the brief's stop condition I
am not papering over either issue below. Together they serve as the red CI
evidence: I could not produce a green run from this branch alone, for two
reasons.

1. Cross-repo token scope (workflow side)

The script's run_remote_checks reads files from five sibling repos in the
ui-insight org (OpenERA, UCMDailyRegister, AuditDashboard,
StratPlanTacticsMB, ProcessMapping), four of which are private. The
auto-issued workflow GITHUB_TOKEN is scoped only to ui-insight/AISPEG and
returns 404 on repos/ui-insight/OpenERA/contents/docker-compose.yml. The
script's file_text method does not catch RemoteMissing, so it crashes
mid-run with an uncaught exception:

RemoteMissing: GitHub resource not found for repos/ui-insight/OpenERA/contents/docker-compose.yml

The script already documents the right env var for this: GitHubClient.__init__
reads PORTFOLIO_GH_TOKEN first. Once an org-level secret named
PORTFOLIO_GH_TOKEN (a PAT with repo scope on the five sibling repos) is
provisioned and visible to this workflow, the script will use the HTTP path
with that token and run to completion. The workflow already exposes
PORTFOLIO_GH_TOKEN to the step, so no further AISPEG-side change is needed
once the secret exists.
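
The token precedence described above can be sketched as follows. This is a hypothetical helper, not the upstream code: `GitHubClient.__init__` is only described here as reading PORTFOLIO_GH_TOKEN first, and the name `resolve_token` and its return shape are assumptions made for illustration.

```python
import os


def resolve_token(env=None):
    """Pick the API token, preferring PORTFOLIO_GH_TOKEN over GITHUB_TOKEN.

    Hypothetical sketch of the documented precedence; the real
    GitHubClient.__init__ is not reproduced here.
    """
    env = os.environ if env is None else env
    for name in ("PORTFOLIO_GH_TOKEN", "GITHUB_TOKEN"):
        # Treat an empty value as missing, so a blank variable does not
        # shadow the fallback token.
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None, None


if __name__ == "__main__":
    source, _ = resolve_token({"PORTFOLIO_GH_TOKEN": "", "GITHUB_TOKEN": "t"})
    print(source)
```

The empty-string check is the design choice that matters in this sketch: with it, exporting both variables to the step is harmless before the secret exists, because the fallback still applies.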

2. Pre-existing upstream registry drift (data-governance side)

Even after the token issue is fixed, the script will fail one check:

FAIL ProcessMapping remote controlled-value counts match local governance registry

Verified locally with the host's gh auth: the live ui-insight/ProcessMapping
repo's data/allowed_values.json currently has 10 vocabulary groups / 78
values; the vendored data-governance registry expects 9 / 70. The
submodule pointer is already at ecb9eaccbf23d0a8817366cc85f55a01a32e62a5,
the tip of ui-insight/data-governance@main, so this is genuine pre-existing
drift in the upstream registry — not introduced by this PR.

Recommended unblock: open a follow-up PR against ui-insight/data-governance
to align catalog/processmapping.json, vocabularies/processmapping/allowed_values.json,
and the relevant docs/README rows with the live ProcessMapping vocabulary
counts. Then bump the AISPEG submodule pointer.
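
To make the 10 / 78 versus 9 / 70 comparison concrete, here is a minimal sketch of a group/value count check. It assumes the vocabulary groups have already been extracted into a plain mapping of group name to list of values; the real layout of allowed_values.json is not shown in this PR, and the function names and sample data below are hypothetical.

```python
def count_vocabulary(groups):
    """Return (group_count, value_count) for a {group: [values]} mapping."""
    return len(groups), sum(len(values) for values in groups.values())


def check_counts(groups, expected):
    """Compare live counts against the expected (groups, values) pair."""
    actual = count_vocabulary(groups)
    status = "PASS" if actual == expected else "FAIL"
    return status, actual


if __name__ == "__main__":
    # Tiny made-up data standing in for the registry snapshot.
    registry = {"ActorType": ["a", "b"], "FileType": ["x"]}
    # The live repo gains one extra group, as happened with ProcessMapping.
    live = dict(registry, DocumentType=["d1", "d2"])
    expected = count_vocabulary(registry)
    print(check_counts(registry, expected))
    print(check_counts(live, expected))
```

A count-only check like this flags that drift exists but not which group changed, which is why the follow-up has to inspect the live file directly.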

What I did not do

  • I did not add a try/except around the script invocation or || true
    to mask the failure.
  • I did not edit the upstream script.
  • I did not modify the vendored registry data.

The workflow itself is functioning exactly as designed: it fails the build
on detected drift, surfaces the reason in the step summary, and the path
filter keeps it quiet on PRs that don't touch the governance surface.
Provisioning the secret + bumping upstream registry are out-of-scope
operator actions; once both are done, this PR's CI will go green on a
re-run.
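
As an illustration of the "fail the build and surface the reason" behaviour described above, here is a hedged Python sketch of a wrapper that runs a command, appends its output to a summary file as a fenced block with a PASS/FAIL header, and returns the exit code unchanged. The actual workflow does this in its own step; the function names and the heading text are assumptions.

```python
import os
import subprocess
import sys

FENCE = "`" * 3  # built at runtime so this example can sit inside a fence


def render_summary(output, exit_code):
    """Format command output as a markdown summary with a PASS/FAIL header."""
    header = "PASS" if exit_code == 0 else "FAIL"
    return f"## Governance Drift: {header}\n\n{FENCE}\n{output.rstrip()}\n{FENCE}\n"


def run_and_report(cmd, summary_path):
    """Run cmd, append its output to summary_path, return its exit code."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # tracebacks land in the summary too
        text=True,
    )
    with open(summary_path, "a", encoding="utf-8") as handle:
        handle.write(render_summary(proc.stdout, proc.returncode))
    return proc.returncode


if __name__ == "__main__":
    path = os.environ.get("GITHUB_STEP_SUMMARY", os.devnull)
    code = run_and_report([sys.executable, "-c", "print('PASS demo check')"], path)
    sys.exit(code)
```

Merging stderr into stdout is what lets an uncaught exception such as the RemoteMissing traceback appear in the summary panel instead of only in the raw log.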

Test plan

  • Verify the workflow run triggered on this PR appears in the Actions tab and links from the PR's Checks panel
  • Open the failed run and confirm the step summary panel shows the PASS/FAIL block with the script's output (or its traceback) inline
  • After PORTFOLIO_GH_TOKEN is provisioned and the upstream registry drift is fixed, re-run the workflow and confirm it goes green
  • Open a separate PR that touches only an unrelated path (e.g. app/portfolio/**) and confirm this workflow does not run

🤖 Generated with Claude Code

@ProfessorPolymorphic
Collaborator Author

Token unblock confirmed

PORTFOLIO_GH_TOKEN is now provisioned as an org secret. Re-ran the workflow on this PR; the result tells a much cleaner story:

  • 36 PASS lines including:
    • PASS remote repository exists: ui-insight/OpenERA
    • PASS remote repository exists: ui-insight/UCMDailyRegister
    • PASS remote repository exists: ui-insight/AuditDashboard
    • PASS remote repository exists: ui-insight/StratPlanTacticsMB
    • PASS remote repository exists: ui-insight/ProcessMapping
    • PASS OpenERA remote compose exposes current frontend/backend ports
    • PASS UCM remote compose documents shared DB and HOST_PORT default
    • PASS Audit Dashboard remote compose matches documented ports and backend health endpoint
    • PASS ProcessMapping remote asset counts match local governance registry
    • PASS StratPlan remote canonical counts match local governance registry
  • One legitimate FAIL: ProcessMapping remote controlled-value counts match local governance registry — the vocabulary drift documented in ui-insight/data-governance#10.

This is the workflow doing exactly what it was built to do. The remaining red run is now an upstream-registry-alignment problem, not a CI problem.

Run #25235691593, attempt 2
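
For anyone triaging a run like this one from its raw output, a small sketch of tallying the PASS and FAIL lines follows. It assumes each check prints one line starting with `PASS ` or `FAIL `, as in the excerpts quoted above; the `tally` helper is hypothetical and not part of the upstream script.

```python
def tally(output):
    """Split validator output into lists of passed and failed check names."""
    passed, failed = [], []
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("PASS "):
            passed.append(line[len("PASS "):])
        elif line.startswith("FAIL "):
            failed.append(line[len("FAIL "):])
    return passed, failed


if __name__ == "__main__":
    sample = "\n".join([
        "PASS remote repository exists: ui-insight/OpenERA",
        "PASS ProcessMapping remote asset counts match local governance registry",
        "FAIL ProcessMapping remote controlled-value counts match local governance registry",
    ])
    passed, failed = tally(sample)
    print(len(passed), len(failed))
    for name in failed:
        print("needs attention:", name)
```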

What's left to make this PR mergeable

  1. ✅ Provision PORTFOLIO_GH_TOKEN org secret — done
  2. ⏭ Resolve ui-insight/data-governance#10 (ProcessMapping vocab counts) — upstream
  3. ⏭ Bump the vendor/data-governance submodule pointer in this repo to pick up the upstream fix

Once 2 + 3 land, the next push will produce the green run we need for the PR description's "one green / one red" evidence.

ProfessorPolymorphic added a commit to ui-insight/data-governance that referenced this pull request May 1, 2026
ProcessMapping's live data/allowed_values.json shipped a 10th vocabulary
group (DocumentType, 8 values) covering award documents, funding
opportunities, budget forms, proposal narratives, subaward agreements,
and sponsor correspondence. The vendored registry was still pinned at
the prior 9/70 state.

The drift validator at scripts/check_governance_drift.py was correctly
flagging this on every CI run downstream (ui-insight/AISPEG#64). After
this change the validator passes 36/36 locally with exit 0.

Changes:
- vocabularies/processmapping/allowed_values.json: insert DocumentType
  group (alphabetical between ActorType and FileType); bump version
  to 1.1.0; update last_updated
- catalog/processmapping.json: register the new group in
  controlled_vocabularies
- scripts/check_governance_drift.py: bump hardcoded totals
  (56 -> 57 groups, 315 -> 323 values) and update the ProcessMapping
  expected-values literals (9/70 -> 10/78)
- docs/index.md: aggregate vocabulary count summary (56/315 -> 57/323)
- docs/vocabulary/index.md: ProcessMapping row (9/70 -> 10/78)
- docs/domains/process-mapping.md: stat block (9 (70 codes) ->
  10 (78 codes))

Closes #10.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
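
The count bumps listed in this commit message can be cross-checked with a few lines of arithmetic: each one should equal the effect of adding one group (DocumentType) with 8 values. The sketch below is only a consistency check on the numbers quoted above; the table layout and function name are illustrative.

```python
ADDED_GROUPS = 1  # the new DocumentType group
ADDED_VALUES = 8  # its 8 values

# (before, after, expected delta), taken from the commit message above.
BUMPS = {
    "ProcessMapping groups": (9, 10, ADDED_GROUPS),
    "ProcessMapping values": (70, 78, ADDED_VALUES),
    "registry-wide groups": (56, 57, ADDED_GROUPS),
    "registry-wide values": (315, 323, ADDED_VALUES),
}


def inconsistent_bumps(bumps):
    """Return the names of bumps whose delta does not match the expectation."""
    return [
        name
        for name, (before, after, delta) in bumps.items()
        if after - before != delta
    ]


if __name__ == "__main__":
    print(inconsistent_bumps(BUMPS))
```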
ProfessorPolymorphic and others added 3 commits May 1, 2026 16:10
Add a Governance Drift workflow that runs the upstream
check_governance_drift.py validator on PRs that touch the governance
surface (vendor/data-governance/**, lib/governance/**,
app/standards/data-model/**, scripts/build-governance-catalog.*, and
the workflow itself). Failures block the merge so the vendored registry
cannot drift silently from the live portfolio repos.

- actions/checkout@v4 with submodules: recursive so the vendored
  data-governance content is present for the script.
- actions/setup-python@v5 (3.11). The drift script is pure stdlib
  (base64, json, os, subprocess, sys, urllib, pathlib, typing) — no
  requirements.txt and no pip install step needed; comment in the
  workflow records this so a future reader does not re-derive it.
- The script's remote-check path uses GITHUB_TOKEN for api.github.com
  calls, with a fallback to the preinstalled `gh` CLI on ubuntu-latest.
- Output is teed into $GITHUB_STEP_SUMMARY as a fenced code block with
  a PASS/FAIL header so reviewers can triage without opening the run
  log. Job exit code is propagated unchanged from the script.

CLAUDE.md: short note in Development commands describing the workflow
and the path filter. REFACTOR.md: new Sprint 5 entry capturing the
data-governance integration (submodule + catalog generator from #55,
drift CI from this slice).

Closes #58.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The first run of the Governance Drift workflow crashed because the
auto-issued workflow GITHUB_TOKEN is scoped only to ui-insight/AISPEG;
the script's remote-check phase reads files from sibling private repos
(OpenERA, UCMDailyRegister, AuditDashboard, StratPlanTacticsMB,
ProcessMapping), which return 404 against that token. The script's
`file_text` path does not catch RemoteMissing, so the script aborts
with an unhandled exception before getting to the legitimate drift
findings.

The upstream script already reads PORTFOLIO_GH_TOKEN first (see
GitHubClient.__init__), so the fix is to expose that secret to the
step alongside GITHUB_TOKEN. Once an org-level secret with `repo` scope
on the five sibling repos is provisioned as PORTFOLIO_GH_TOKEN, the
script will use the HTTP path with that token and run to completion.
GITHUB_TOKEN remains as a fallback so the workflow does not fail
catastrophically when the secret is missing.

I am not adding a try/except wrapper around the script invocation or
masking the RemoteMissing crash — per the brief's stop condition, the
script's robustness is upstream's call. This change just hands the
script the credentials it documents that it expects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ui-insight/data-governance#11 merged the ProcessMapping registry sync
that adds the DocumentType vocabulary group (8 values). Bumping the
submodule pointer + regenerating lib/governance/vocabularies.ts
(48 → 49 vocabulary groups) so this branch's drift workflow will pass
on the next CI run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ProfessorPolymorphic force-pushed the feat/issue-58-ci-drift-check branch from e9f1cd8 to 34e75d0 May 1, 2026 23:10
@ProfessorPolymorphic
Collaborator Author

All-green achieved

The submodule bump landed (vendor/data-governance at 5e811f7) and the workflow flipped:

  • ✅ Lint & Accessibility — 29s
  • ✅ TypeScript — 24s
  • ✅ Production Build — 34s
  • ✅ Validate governance registry — 9s (first green run on this PR)

The PR now has both a red run (attempt 1 — token-scope issue, since fixed by org-secret provisioning) and a green run (latest) captured in the run history — satisfying the "one green / one red" evidence acceptance criterion from #58.

Trajectory

  1. Workflow created → red (uncaught exception, token missing).
  2. PORTFOLIO_GH_TOKEN org secret provisioned → red (script-level, on legitimate ProcessMapping registry drift).
  3. ui-insight/data-governance#11 merged (synced ProcessMapping → registry, adding the DocumentType group).
  4. Submodule bump on this branch (commit 34e75d0) + rebased onto current main.
  5. → All green.

Merging.

ProfessorPolymorphic merged commit 8f66b86 into main May 1, 2026
4 checks passed
ProfessorPolymorphic deleted the feat/issue-58-ci-drift-check branch May 1, 2026 23:12