Port 8 ProcessMapping workflows: ffr-management, prior-approval, effort-reporting, proposal-doc-completeness, award-compliance, proposal-budget-personnel, foa-checklist, subaward by StringTheoryDev · Pull Request #33 · AI4RA/prompt-library

StringTheoryDev · 2026-04-30T18:11:03Z

Summary

Adds 8 new components and 8 new workflows that port the next batch of ui-insight/ProcessMapping Vandalizer workflows into the prompt-library. Each pair (component + workflow) follows the pattern established by rfa-checklist-extraction-udm in #31: a canonical single-call prompt as the harness invocation surface, a JSON-Schema 2020-12 contract, a manifest-driven Vandalizer workflow that mirrors the source ProcessMapping topology one-for-one, and a workflow-local evals scaffold.

This PR ports 8 of the 14 ProcessMapping workflows still remaining after #31. Combined with the previously merged rfa-checklist-extraction, this brings prompt-library coverage to 9 of the 15 ProcessMapping workflows. The remaining 6 are deferred to future PRs because they introduce patterns this PR does not yet model: the multi-source compliance-personnel-verification and section2-personnel-eligibility, the classifier-branching award-modification-intake, the drafting workflow budget-justification-generator, and the wider export-to-banner-extraction and risk-domain-assessment.

Why

Each of these workflows already exists as a runtime configuration inside a Vandalizer instance and as a JSON description in ProcessMapping/workflows/<slug>/workflow.json. Until now they were not (a) versioned component contracts, (b) regenerable Vandalizer manifestations, or (c) catalog-discoverable entries in component_catalog.json. This PR makes all three true for the eight simplest remaining workflows in the same shape PR #31 used for rfa-checklist-extraction.

What ships, per workflow

| # | Component (`<slug>-udm`) | Workflow (`<slug>`) | Source | Topology mirrored | Schema fields |
|---|---|---|---|---|---|
| 1 | `ffr-management-extraction-udm@0.1.0` | `workflows/ffr-management-extraction` v0.1.0 | `WF-FFR-MANAGEMENT-EXTRACTION` | 1 Extraction + 1 Consolidation | 5 nested buckets (submission_schedule, submission_system, required_financial_data, compliance_consequences, preparation_timeline) + 2 scalars |
| 2 | `prior-approval-extraction-udm@0.1.0` | `workflows/prior-approval-extraction` v0.1.0 | `WF-PRIOR-APPROVAL-EXTRACTION` | 1 Extraction + 1 Consolidation | 3 buckets (budget_approvals, scope_timeline_approvals, approval_procedures) + rtc_waivers |
| 3 | `effort-reporting-extraction-udm@0.1.0` | `workflows/effort-reporting-extraction` v0.1.0 | `WF-EFFORT-REPORTING-EXTRACTION` | 1 Extraction + 1 Consolidation | 13 fields incl. `key_personnel_commitments` table + 2 enums |
| 4 | `proposal-document-completeness-udm@0.1.0` | `workflows/proposal-document-completeness` v0.1.0 | `WF-PROPOSAL-DOC-COMPLETENESS` | 2 parallel Extraction + 1 Consolidation/Gap-Analysis | 3 layers: as-found inventory, sponsor requirements, gap analysis (per-person + per-subawardee matrices) |
| 5 | `award-compliance-extraction-udm@0.1.0` | `workflows/award-compliance-extraction` v0.1.0 | `WF-AWARD-COMPLIANCE-EXTRACTION` | 2 parallel Extraction + 1 Consolidation | 2 blocks (compliance_framework × 10 + financial_management × 10) |
| 6 | `proposal-budget-personnel-extraction-udm@0.1.0` | `workflows/proposal-budget-personnel-extraction` v0.1.0 | `WF-PROPOSAL-BUDGET-PERSONNEL-EXTRACTION` | 2 parallel Extraction + 1 Consolidation | personnel listings + 4 derivable boolean triggers + budget structure |
| 7 | `foa-checklist-extraction-udm@0.1.0` | `workflows/foa-checklist-extraction` v0.1.0 | `WF-FOA-CHECKLIST-EXTRACTION` | 6 parallel Extraction + 1 Consolidation | 31 fields across 8 FOA reference sections; sibling of `rfa-checklist-extraction-udm` |
| 8 | `subaward-extraction-udm@0.1.0` | `workflows/subaward-extraction` v0.1.0 | `WF-SUBAWARD-EXTRACTION` | 6 parallel Extraction + 1 Consolidation | 6 blocks; 18 flat contact items composed into 6 `{name, email, phone}` objects |
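As a rough illustration of the last row, composing flat contact items into `{name, email, phone}` objects could look like the sketch below. The role prefixes (`pi`, `admin`) and the `<role>_<part>` key pattern are hypothetical; the real searchset item titles are defined in the subaward workflow manifest and are not reproduced here.

```python
from typing import Dict, List, Optional

CONTACT_PARTS = ("name", "email", "phone")


def compose_contacts(
    flat: Dict[str, Optional[str]], roles: List[str]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Group flat '<role>_<part>' items into one contact object per role.

    Parts that were not extracted come back as None rather than being dropped,
    so every contact object has the same three keys.
    """
    return {
        role: {part: flat.get(f"{role}_{part}") for part in CONTACT_PARTS}
        for role in roles
    }


# Hypothetical flat items; the real workflow has 18 (6 roles x 3 parts).
flat_items = {
    "pi_name": "Pat Example",
    "pi_email": "pat@example.edu",
    "pi_phone": "555-0100",
    "admin_name": "Sam Example",
    "admin_email": "sam@example.edu",
    # "admin_phone" was not found in the document
}

contacts = compose_contacts(flat_items, ["pi", "admin"])
print(contacts)
```

Keeping all three keys on every object, even when a part is missing, keeps the shape stable for schema validation downstream.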

Each component ships:

prompt.md — frontmatter (semver-locked at 0.1.0) + canonical single-call extraction prompt with explicit encoding rules
schema.json — JSON Schema 2020-12, with extensive UDM column bindings on leaf fields (preserved verbatim from the source ProcessMapping workflow's UDM_Column annotations)
README.md — overview, contract scope, runtime topology, triad integration, sibling-component relationships
CHANGELOG.md — initial 0.1.0 entry documenting source-workflow lineage, enum-value carryover, and UDM bindings
evals/README.md — planned-cases breakdown (each component lists 4–5 cases that exercise distinct structural features)

Each workflow ships:

manifest.yaml — declarative Vandalizer-workflow source-of-truth (each Extraction task carries an embedded SearchSet whose item titles mirror the component schema field names; enums propagated from the source workflow's Enum_Values)
<slug>.vandalizer.json — generated by scripts/build_vandalizer_workflows.py; never hand-edited
README.md, CHANGELOG.md, evals/README.md
evals/cases/<stub>/metadata.yaml — placeholder shell flagged validated_by: pending-sponsored-programs-review to be replaced with an authorized, de-identified case before promotion to stable

Workflow-local eval posture (per `docs/contracts.md`)

All 8 workflows declare evals.workflow_local: true. None is a 1:1 repackaging of the canonical component prompt — each Extraction task carries a focused per-section prompt_inline body, and each Consolidation Prompt does substantial work that emerges from the multi-task topology and cannot be covered by component-level evals alone:

#1 ffr: collapses flat searchset items into nested submission_schedule / submission_system / compliance_consequences objects and normalizes the platform enum.
#2 prior-approval: collapses flat items into nested budget_approvals / scope_timeline_approvals and normalizes the per-row approval_procedures table.
#3 effort: normalizes the reporting_frequency and certification_method enums and enforces the PI-mirror rule (the PI's row in key_personnel_commitments must mirror pi_committed_effort and pi_person_months exactly).
#4 proposal-doc-completeness: joins as-found inventory with sponsor requirements, derives present/triggered booleans, computes per-person and per-subawardee missing lists, and ranks prioritized_missing (compliance-critical first).
#5 award-compliance: merges/dedupes compliance_calendar entries across both upstream fragments, normalizes the audit_requirements and record_retention enums, and enforces the CFR-01 reconciliation sum(budget_period_amounts) == total_award_amount.
#6 proposal-budget-personnel: derives four boolean compliance triggers (has_postdocs_or_grad_students, mentoring_plan_required, has_subawards, has_equipment_over_5k) from list lengths and combines fa_rate + fa_base into the nested fa_rate_and_base object.
#7 foa-checklist: assembles six fragments and enforces cross-field consistency (chronological critical_dates; expected_awards * max(award_range) <= total_funding).
#8 subaward: composes 18 flat contact searchset items into six structured {name, email, phone} objects, normalizes the cost_type and invoicing_frequency enums, and enforces the CFR-01 reconciliation between amount_funded, total_direct_costs, and total_indirect_costs.
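The cross-field rules named for #5 and #7 can be sketched as plain predicates. This is only an illustration of the arithmetic, not the consolidation prompts themselves: the field shapes are assumptions (amounts as JSON numbers, `award_range` as a list of numbers, `critical_dates` as ISO `YYYY-MM-DD` strings), and the cent-level tolerance is a choice made here, not something the source workflows specify.

```python
from datetime import date
from typing import List


def budget_periods_reconcile(
    budget_period_amounts: List[float],
    total_award_amount: float,
    tolerance: float = 0.005,  # assumed: allow sub-cent float rounding
) -> bool:
    """CFR-01 style check: sum(budget_period_amounts) == total_award_amount."""
    return abs(sum(budget_period_amounts) - total_award_amount) <= tolerance


def funding_is_consistent(
    expected_awards: int, award_range: List[float], total_funding: float
) -> bool:
    """FOA check: expected_awards * max(award_range) <= total_funding."""
    return expected_awards * max(award_range) <= total_funding


def dates_are_chronological(critical_dates: List[str]) -> bool:
    """True when ISO-formatted dates appear in non-decreasing order."""
    parsed = [date.fromisoformat(d) for d in critical_dates]
    return all(earlier <= later for earlier, later in zip(parsed, parsed[1:]))


print(budget_periods_reconcile([100000, 150000], 250000))
print(funding_is_consistent(3, [100000, 250000], 750000))
print(dates_are_chronological(["2026-01-15", "2026-03-01", "2026-03-01"]))
```

A workflow-local eval case could run checks like these against the post-consolidation JSON to confirm the consolidator actually enforced the rule.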

Each workflow ships exactly one scaffolded stub case under evals/cases/ with validated_against_version set to the workflow's MINOR.PATCH and component_versions_at_validation recording the pinned component version. These stubs satisfy the workflow_local: true requirement per docs/contracts.md and are flagged for replacement with sponsored-programs-validated cases before promotion to stable.

Build script

The kind: Extraction + validation_plan passthrough extension that landed in PR #31 covers all 8 workflows, so this PR makes no build-script changes. scripts/build_vandalizer_workflows.py --check confirms all 12 manifests round-trip to identical bytes:

$ python3 scripts/build_vandalizer_workflows.py --check
All 12 workflow export(s) up to date.
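For readers unfamiliar with this kind of check, the idea of a byte-identical round-trip can be sketched as follows. This is not the code in scripts/build_vandalizer_workflows.py: `render_export`, `stale_exports`, and the manifest contents are hypothetical stand-ins, and the manifest is assumed to be already loaded into a dict.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def render_export(manifest: dict) -> bytes:
    """Hypothetical stand-in for the export step: deterministic JSON bytes."""
    text = json.dumps(manifest, indent=2, sort_keys=True, ensure_ascii=False)
    return (text + "\n").encode("utf-8")


def stale_exports(manifests: Dict[str, dict], export_dir: Path) -> List[str]:
    """Return slugs whose on-disk export is missing or differs from a fresh render."""
    stale = []
    for slug, manifest in sorted(manifests.items()):
        path = export_dir / f"{slug}.vandalizer.json"
        if not path.exists() or path.read_bytes() != render_export(manifest):
            stale.append(slug)
    return stale


# Demo against a temporary directory with one in-memory manifest.
manifests = {"subaward-extraction": {"version": "0.1.0", "tasks": []}}
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    before = stale_exports(manifests, out_dir)  # export not written yet
    for slug, manifest in manifests.items():
        (out_dir / f"{slug}.vandalizer.json").write_bytes(render_export(manifest))
    after = stale_exports(manifests, out_dir)  # now byte-identical

print(before, after)
```

Comparing bytes rather than parsed JSON is what makes the check catch hand edits, key reordering, and whitespace drift in the generated files.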

Triad integration

prompt-library: 8 new components + 8 new workflows, all catalog-discoverable. component_catalog.json regenerated from manifests + overrides. component_catalog_overrides.yaml adds 8 curated entries with output_contract, triad_integration.harness_notes (describing each runtime topology and what the consolidator does), triad_integration.udm_alignment, and related_components cross-references (e.g., award-compliance-extraction-udm ↔ ffr-management-extraction-udm and prior-approval-extraction-udm as drilldown targets; proposal-budget-personnel-extraction-udm ↔ proposal-document-completeness-udm as producer/consumer; foa-checklist-extraction-udm ↔ rfa-checklist-extraction-udm as siblings).
evaluation-data-sets: none yet — every component has triad_integration.evaluation_datasets: []. Each workflow's eval-stub metadata.yaml describes the structural features and workflow features the case will exercise once a real authorized, de-identified case is selected.
evaluation-harness: the canonical prompt.md remains the single-call harness invocation surface; per-workflow scoring (post-consolidation JSON) is the right signal for the v0.1.0 runtimes. Component READMEs and workflow READMEs both note this distinction — campaign authors should record both signals when both are available.
AI4RA-UDM: scalar metadata + UDM-column leaf bindings preserved verbatim from the source ProcessMapping workflows' UDM_Column annotations. See each component's CHANGELOG.md for the full per-component binding list.

Test plan

python3 scripts/build_component_catalog.py — Wrote component_catalog.json (and --check confirms no drift)
python3 scripts/build_vandalizer_workflows.py --check — All 12 workflow export(s) up to date
python3 .github/scripts/lint_components.py — Linted 22 component(s), 12 workflow(s). Only the 2 pre-existing eval-version-lag warnings on nsf-award-notice-extraction-udm and rfp-extraction remain — unrelated to this PR.
python3 scripts/build_docs.py — 22 components rendered, mkdocs.yml nav spliced.
python3 -m mkdocs build --strict — Documentation built in 2.69 seconds, 0 warnings.
All 8 workflows: source-workflow Validation_Plan mirrored verbatim into the manifest's top-level validation_plan: and round-tripped into the export envelope.
All 8 workflows: source-workflow Enum_Values mirrored verbatim into the manifest's searchset.items[].enum_values (submission_system_platform, reporting_frequency, certification_method, review_type, audit_requirements, record_retention, cost_type, invoicing_frequency, federal_agency, fa_base).
Reviewer to verify by importing each <slug>.vandalizer.json into a Vandalizer instance and running it against representative federal award / proposal documents; output JSON should validate against the corresponding components/<slug>-udm/schema.json.
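A reviewer without a JSON Schema validator at hand could do a first-pass check of workflow output along these lines. This is a deliberately partial, stdlib-only sketch: it looks only at top-level `required`, primitive `type`, and `enum`, and is not a substitute for a full JSON Schema 2020-12 validator. The schema fragment, its snake_case field names, and the enum values are hypothetical, not copied from any component's schema.json.

```python
from typing import Any, Dict, List

_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def quick_check(instance: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Report top-level required/type/enum problems; nested schemas are ignored."""
    problems = []
    for key in schema.get("required", []):
        if key not in instance:
            problems.append(f"{key}: missing required field")
    for key, rule in schema.get("properties", {}).items():
        if key not in instance:
            continue
        value = instance[key]
        expected = rule.get("type")
        py_type = _JSON_TYPES.get(expected) if isinstance(expected, str) else None
        if py_type is not None:
            # bool is a subclass of int in Python, so exclude it for numeric types.
            bool_as_number = isinstance(value, bool) and expected in ("number", "integer")
            if not isinstance(value, py_type) or bool_as_number:
                problems.append(f"{key}: expected {expected}, got {type(value).__name__}")
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{key}: {value!r} not in enum")
    return problems


# Hypothetical schema fragment and workflow output.
schema_fragment = {
    "required": ["federal_award_number", "amount_funded"],
    "properties": {
        "federal_award_number": {"type": "string"},
        "amount_funded": {"type": "number"},
        "invoicing_frequency": {"type": "string", "enum": ["Monthly", "Quarterly"]},
    },
}
output = {
    "federal_award_number": "Not specified in the document",
    "amount_funded": "250000",  # quoted amount: the mismatch a number-typed field rejects
    "invoicing_frequency": "Monthly",
}
problems = quick_check(output, schema_fragment)
print(problems)
```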

🤖 Generated with Claude Code

…rt-reporting, proposal-document-completeness, award-compliance, proposal-budget-personnel, foa-checklist, subaward

Adds 8 new components and 8 new workflows that port the next batch of ProcessMapping (`ui-insight/ProcessMapping`) workflows into the prompt-library. Each pair (component + Vandalizer workflow) follows the pattern established by `rfa-checklist-extraction-udm` (PR #31): a canonical single-call prompt, a JSON-Schema 2020-12 contract, a manifest-driven Vandalizer workflow that mirrors the source ProcessMapping topology one-for-one, and a workflow-local evals scaffold.

This is 8 of the 14 ProcessMapping workflows still remaining after PR #31. Combined with the previously merged `rfa-checklist-extraction`, this brings prompt-library coverage to 9/15 ProcessMapping workflows.

Verification (all 5 scripts clean):
- python scripts/build_component_catalog.py -> Wrote catalog
- python scripts/build_vandalizer_workflows.py -> 12/12 up to date
- python .github/scripts/lint_components.py -> 22 components, 12 workflows; only the 2 pre-existing eval-version-lag warnings remain
- python scripts/build_docs.py -> 22 components rendered
- python -m mkdocs build --strict -> 0 warnings

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@ProfessorPolymorphic

…rovenance SHA

Three review comments from @ProfessorPolymorphic on #33:

1. Monetary number-vs-string mismatch (award-compliance, but pattern audited across the four components with `number`-typed schema fields). Component prompts and workflow manifest prompt_inline bodies told the extractor to quote dollar amounts verbatim as strings, but the schemas require JSON numbers. Updated the encoding rules in:
   - components/award-compliance-extraction-udm/prompt.md
   - components/proposal-budget-personnel-extraction-udm/prompt.md
   - components/foa-checklist-extraction-udm/prompt.md
   - components/subaward-extraction-udm/prompt.md
   - workflows/award-compliance-extraction/manifest.yaml
   - workflows/proposal-budget-personnel-extraction/manifest.yaml
   - workflows/foa-checklist-extraction/manifest.yaml
   - workflows/subaward-extraction/manifest.yaml

   Number-typed fields ($-amounts, integers) are now explicitly rendered as JSON numbers (no quotes / no currency symbol / no thousand-separator); string-typed fields (rates, dates, narrative) keep verbatim quotation.

2. Pinned provenance SHA. Updated the Provenance section of all 8 component READMEs and all 8 workflow READMEs to record `ui-insight/ProcessMapping` commit `b7176b0c913833a205efdb5e4ba00c17ff88af0f` instead of floating `main`.

3. Subaward requiredness. The source workflow marks `Federal_Award_Number`, `Federal_Awarding_Agency`, and `Invoicing_Frequency` as `Is_Required: true` but the port marked them optional. Restored source requiredness in components/subaward-extraction-udm/{schema.json, prompt.md} and workflows/subaward-extraction/manifest.yaml searchset items. Audit sweep flagged the same drift in two other ports:
   - ffr-management-extraction-udm: `submission_schedule.annual_ffr_due` and `final_ffr_due` were nullable; source marks them required. Fixed schema + prompt.
   - effort-reporting-extraction-udm: `award_number`, `pi_name`, `project_title`, `reporting_frequency` were optional; source marks them required. Fixed schema + prompt.

For required string fields where the document may not state a value, the prompt now instructs the LLM to use "Not specified in the document" rather than null, matching the source workflow's `Not_Found_Value`.

CHANGELOGs updated for ffr / effort / subaward components to document the requiredness alignment.

Verification clean (no new lint or mkdocs warnings):
- python scripts/build_component_catalog.py -> Wrote catalog
- python scripts/build_vandalizer_workflows.py -> 4 rebuilt (--check confirms: All 12 workflow export(s) up to date)
- python .github/scripts/lint_components.py -> 22 components, 12 workflows, 2 pre-existing warnings only
- python scripts/build_docs.py -> regenerated
- python -m mkdocs build --strict -> clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
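The two encoding rules in this commit message (number-typed fields as bare JSON numbers, required strings falling back to the not-found sentinel) can be illustrated with a small post-processing sketch. In the PR these rules are given to the LLM as prompt instructions; the helper functions below are hypothetical and only show what conforming output looks like. The snake_case key `federal_award_number` is an assumed spelling of the source's `Federal_Award_Number`.

```python
import json
import re
from typing import Optional, Union

NOT_FOUND = "Not specified in the document"  # sentinel quoted in the commit message
_NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def to_json_number(raw: str) -> Union[int, float, None]:
    """Strip the currency symbol and thousand separators; None if not numeric."""
    cleaned = raw.replace("$", "").replace(",", "").strip()
    if not _NUMERIC.fullmatch(cleaned):
        return None
    return float(cleaned) if "." in cleaned else int(cleaned)


def required_string(value: Optional[str]) -> str:
    """Required string fields never serialize as null; use the sentinel instead."""
    return value.strip() if value and value.strip() else NOT_FOUND


record = {
    "amount_funded": to_json_number("$1,250,000"),
    "federal_award_number": required_string(None),
}
encoded = json.dumps(record)
print(encoded)
```

Rates such as "8.5%" are string-typed in these schemas, so a helper like `to_json_number` would not be applied to them; here it simply returns `None` for such input.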

ProfessorPolymorphic

Verified Labib’s fixes locally: schema numeric/string rules, restored source requiredness, pinned ProcessMapping provenance, regenerated catalog/workflow/docs outputs, and mkdocs strict build all pass. Good to merge.

ProfessorPolymorphic reviewed Apr 30, 2026

Comment thread workflows/award-compliance-extraction/manifest.yaml Outdated

ProfessorPolymorphic reviewed Apr 30, 2026

Comment thread components/effort-reporting-extraction-udm/README.md

ProfessorPolymorphic reviewed Apr 30, 2026

Comment thread workflows/subaward-extraction/manifest.yaml

ProfessorPolymorphic approved these changes Apr 30, 2026

ProfessorPolymorphic merged commit cfda45a into main Apr 30, 2026
1 check passed

StringTheoryDev mentioned this pull request May 4, 2026

Fix Vandalizer SearchSet import: emit cross_field_rules: [] (not null) #34

Merged

6 tasks

ProfessorPolymorphic merged 2 commits into main from eight-additional-workflows

2 participants
