Port 8 ProcessMapping workflows: ffr-management, prior-approval, effort-reporting, proposal-doc-completeness, award-compliance, proposal-budget-personnel, foa-checklist, subaward #33
Merged
…rt-reporting, proposal-document-completeness, award-compliance, proposal-budget-personnel, foa-checklist, subaward

Adds 8 new components and 8 new workflows that port the next batch of ProcessMapping (`ui-insight/ProcessMapping`) workflows into the prompt-library. Each pair (component + Vandalizer workflow) follows the pattern established by `rfa-checklist-extraction-udm` (PR #31): a canonical single-call prompt, a JSON-Schema 2020-12 contract, a manifest-driven Vandalizer workflow that mirrors the source ProcessMapping topology one-for-one, and a workflow-local evals scaffold.

This is 8 of the 14 ProcessMapping workflows still remaining after PR #31. Combined with the previously merged `rfa-checklist-extraction`, this brings prompt-library coverage to 9/15 ProcessMapping workflows.

Verification (all 5 scripts clean):
- python scripts/build_component_catalog.py -> Wrote catalog
- python scripts/build_vandalizer_workflows.py -> 12/12 up to date
- python .github/scripts/lint_components.py -> 22 components, 12 workflows; only the 2 pre-existing eval-version-lag warnings remain
- python scripts/build_docs.py -> 22 components rendered
- python -m mkdocs build --strict -> 0 warnings

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rovenance SHA

Three review comments from @ProfessorPolymorphic on #33:

1. Monetary number-vs-string mismatch (award-compliance, but pattern audited across the four components with `number`-typed schema fields). Component prompts and workflow manifest prompt_inline bodies told the extractor to quote dollar amounts verbatim as strings, but the schemas require JSON numbers. Updated the encoding rules in:
   - components/award-compliance-extraction-udm/prompt.md
   - components/proposal-budget-personnel-extraction-udm/prompt.md
   - components/foa-checklist-extraction-udm/prompt.md
   - components/subaward-extraction-udm/prompt.md
   - workflows/award-compliance-extraction/manifest.yaml
   - workflows/proposal-budget-personnel-extraction/manifest.yaml
   - workflows/foa-checklist-extraction/manifest.yaml
   - workflows/subaward-extraction/manifest.yaml

   Number-typed fields ($-amounts, integers) are now explicitly rendered as JSON numbers (no quotes / no currency symbol / no thousand-separator); string-typed fields (rates, dates, narrative) keep verbatim quotation.

2. Pinned provenance SHA. Updated the Provenance section of all 8 component READMEs and all 8 workflow READMEs to record `ui-insight/ProcessMapping` commit `b7176b0c913833a205efdb5e4ba00c17ff88af0f` instead of floating `main`.

3. Subaward requiredness. The source workflow marks `Federal_Award_Number`, `Federal_Awarding_Agency`, and `Invoicing_Frequency` as `Is_Required: true` but the port marked them optional. Restored source requiredness in components/subaward-extraction-udm/{schema.json, prompt.md} and workflows/subaward-extraction/manifest.yaml searchset items.

Audit sweep flagged the same drift in two other ports:
- ffr-management-extraction-udm: `submission_schedule.annual_ffr_due` and `final_ffr_due` were nullable; source marks them required. Fixed schema + prompt.
- effort-reporting-extraction-udm: `award_number`, `pi_name`, `project_title`, `reporting_frequency` were optional; source marks them required. Fixed schema + prompt.

For required string fields where the document may not state a value, the prompt now instructs the LLM to use "Not specified in the document" rather than null, matching the source workflow's `Not_Found_Value`.

CHANGELOGs updated for ffr / effort / subaward components to document the requiredness alignment.

Verification clean (no new lint or mkdocs warnings):
- python scripts/build_component_catalog.py -> Wrote catalog
- python scripts/build_vandalizer_workflows.py -> 4 rebuilt (--check confirms: All 12 workflow export(s) up to date)
- python .github/scripts/lint_components.py -> 22 components, 12 workflows, 2 pre-existing warnings only
- python scripts/build_docs.py -> regenerated
- python -m mkdocs build --strict -> clean

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
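The encoding rules described in this commit (number-typed fields as bare JSON numbers, required strings falling back to the source workflow's `Not_Found_Value`) can be illustrated with a small post-processing sketch. This is not code from the repository: the helper names `to_json_number` and `required_string` are hypothetical, and the assumption that amounts arrive as verbatim strings such as `"$1,250,000.00"` is made only for illustration.

```python
import json
import re
from decimal import Decimal, InvalidOperation

# Sentinel quoted in the commit message; the constant name is hypothetical.
NOT_FOUND_VALUE = "Not specified in the document"


def to_json_number(raw: str):
    """Turn a verbatim amount like "$1,250,000.00" into an int or float.

    Strips the currency symbol, thousands separators, and whitespace so the
    value serializes as a bare JSON number rather than a quoted string.
    """
    cleaned = re.sub(r"[$,\s]", "", raw)
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        raise ValueError(f"not a plain numeric amount: {raw!r}")
    if not value.is_finite():
        raise ValueError(f"not a finite amount: {raw!r}")
    # Whole-dollar amounts become ints; anything else becomes a float.
    return int(value) if value == value.to_integral_value() else float(value)


def required_string(value):
    """Return the stripped string, or the not-found sentinel instead of null."""
    if value is None or not value.strip():
        return NOT_FOUND_VALUE
    return value.strip()


if __name__ == "__main__":
    # Illustrative record only; it mixes field names mentioned in this PR.
    record = {
        "total_award_amount": to_json_number("$1,250,000.00"),
        "award_number": required_string(None),
    }
    print(json.dumps(record))
```

Going through `Decimal` rather than `float` directly is a design choice of this sketch: it rejects malformed input cleanly and lets whole-dollar amounts serialize as integers.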
ProfessorPolymorphic
approved these changes
Apr 30, 2026
Contributor
ProfessorPolymorphic
left a comment
Verified Labib’s fixes locally: schema numeric/string rules, restored source requiredness, pinned ProcessMapping provenance, regenerated catalog/workflow/docs outputs, and mkdocs strict build all pass. Good to merge.
Summary
Adds 8 new components and 8 new workflows that port the next batch of `ui-insight/ProcessMapping` Vandalizer workflows into the prompt-library. Each pair (component + workflow) follows the pattern established by `rfa-checklist-extraction-udm` in #31: a canonical single-call prompt as the harness invocation surface, a JSON-Schema 2020-12 contract, a manifest-driven Vandalizer workflow that mirrors the source ProcessMapping topology one-for-one, and a workflow-local evals scaffold.

This is 8 of the 14 ProcessMapping workflows still remaining after #31. Combined with the previously merged `rfa-checklist-extraction`, this brings prompt-library coverage to 9 of the 15 ProcessMapping workflows. The remaining 6 (multi-source `compliance-personnel-verification` & `section2-personnel-eligibility`, the classifier-branching `award-modification-intake`, the drafting workflow `budget-justification-generator`, and the wider `export-to-banner-extraction` & `risk-domain-assessment`) are deferred for future PRs because they introduce patterns this PR doesn't yet model.

Why
Each of these workflows already exists as a runtime configuration inside a Vandalizer instance and as a JSON description in `ProcessMapping/workflows/<slug>/workflow.json`. Until now they were not (a) versioned component contracts, (b) regenerable Vandalizer manifestations, or (c) catalog-discoverable entries in `component_catalog.json`. This PR makes all three true for the eight simplest remaining workflows in the same shape PR #31 used for `rfa-checklist-extraction`.

What ships, per workflow
| Component (`<slug>-udm`) | Workflow (`<slug>`) | Workflow ID | … |
| --- | --- | --- | --- |
| `ffr-management-extraction-udm@0.1.0` | `workflows/ffr-management-extraction` v0.1.0 | `WF-FFR-MANAGEMENT-EXTRACTION` | |
| `prior-approval-extraction-udm@0.1.0` | `workflows/prior-approval-extraction` v0.1.0 | `WF-PRIOR-APPROVAL-EXTRACTION` | |
| `effort-reporting-extraction-udm@0.1.0` | `workflows/effort-reporting-extraction` v0.1.0 | `WF-EFFORT-REPORTING-EXTRACTION` | … `key_personnel_commitments` table + 2 enums |
| `proposal-document-completeness-udm@0.1.0` | `workflows/proposal-document-completeness` v0.1.0 | `WF-PROPOSAL-DOC-COMPLETENESS` | |
| `award-compliance-extraction-udm@0.1.0` | `workflows/award-compliance-extraction` v0.1.0 | `WF-AWARD-COMPLIANCE-EXTRACTION` | |
| `proposal-budget-personnel-extraction-udm@0.1.0` | `workflows/proposal-budget-personnel-extraction` v0.1.0 | `WF-PROPOSAL-BUDGET-PERSONNEL-EXTRACTION` | |
| `foa-checklist-extraction-udm@0.1.0` | `workflows/foa-checklist-extraction` v0.1.0 | `WF-FOA-CHECKLIST-EXTRACTION` | … `rfa-checklist-extraction-udm` |
| `subaward-extraction-udm@0.1.0` | `workflows/subaward-extraction` v0.1.0 | `WF-SUBAWARD-EXTRACTION` | … `{name, email, phone}` objects |

Each component ships:
- `prompt.md`: frontmatter (semver-locked at `0.1.0`) + canonical single-call extraction prompt with explicit encoding rules
- `schema.json`: JSON Schema 2020-12, with extensive UDM column bindings on leaf fields (preserved verbatim from the source ProcessMapping workflow's `UDM_Column` annotations)
- `README.md`: overview, contract scope, runtime topology, triad integration, sibling-component relationships
- `CHANGELOG.md`: initial 0.1.0 entry documenting source-workflow lineage, enum-value carryover, and UDM bindings
- `evals/README.md`: planned-cases breakdown (each component lists 4–5 cases that exercise distinct structural features)

Each workflow ships:
- `manifest.yaml`: declarative Vandalizer-workflow source-of-truth (each Extraction task carries an embedded SearchSet whose item titles mirror the component schema field names; enums propagated from the source workflow's `Enum_Values`)
- `<slug>.vandalizer.json`: generated by `scripts/build_vandalizer_workflows.py`; never hand-edited
- `README.md`, `CHANGELOG.md`, `evals/README.md`
- `evals/cases/<stub>/metadata.yaml`: placeholder shell flagged `validated_by: pending-sponsored-programs-review` to be replaced with an authorized, de-identified case before promotion to `stable`

Workflow-local eval posture (per `docs/contracts.md`)

All 8 workflows declare `evals.workflow_local: true`. None is a 1:1 repackaging of the canonical component prompt: each Extraction task carries a focused per-section `prompt_inline` body, and each Consolidation Prompt does substantial work that emerges from the multi-task topology and cannot be covered by component-level evals alone:

- … `submission_schedule` / `submission_system` / `compliance_consequences` objects and normalizes the platform enum.
- … `budget_approvals` / `scope_timeline_approvals` and normalizes the per-row `approval_procedures` table.
- … `reporting_frequency` and `certification_method` enums and enforces the PI-mirror rule (the PI's row in `key_personnel_commitments` must mirror `pi_committed_effort` and `pi_person_months` exactly).
- … `present` / `triggered` booleans, computes per-person and per-subawardee `missing` lists, and ranks `prioritized_missing` (compliance-critical first).
- … `compliance_calendar` entries across both upstream fragments, normalizes the `audit_requirements` and `record_retention` enums, and enforces the CFR-01 reconciliation `sum(budget_period_amounts) == total_award_amount`.
- … `has_postdocs_or_grad_students`, `mentoring_plan_required`, `has_subawards`, `has_equipment_over_5k`) from list lengths and combines `fa_rate` + `fa_base` into the nested `fa_rate_and_base` object.
- … `critical_dates`; `expected_awards * max(award_range) <= total_funding`).
- … `{name, email, phone}` objects, normalizes the `cost_type` and `invoicing_frequency` enums, and enforces the CFR-01 reconciliation between `amount_funded`, `total_direct_costs`, and `total_indirect_costs`.

Each workflow ships exactly one scaffolded stub case under `evals/cases/` with `validated_against_version` set to the workflow's MINOR.PATCH and `component_versions_at_validation` recording the pinned component version. These stubs satisfy the `workflow_local: true` requirement per `docs/contracts.md` and are flagged for replacement with sponsored-programs-validated cases before promotion to `stable`.

Build script
The `kind: Extraction` + `validation_plan` passthrough extension that landed in PR #31 covers all 8 workflows, so the build script is unchanged in this PR. `scripts/build_vandalizer_workflows.py --check` confirms all 12 manifests round-trip to identical bytes (`All 12 workflow export(s) up to date`).

Triad integration
- `component_catalog.json` regenerated from manifests + overrides.
- `component_catalog_overrides.yaml` adds 8 curated entries with `output_contract`, `triad_integration.harness_notes` (describing each runtime topology and what the consolidator does), `triad_integration.udm_alignment`, and `related_components` cross-references (e.g., `award-compliance-extraction-udm` ↔ `ffr-management-extraction-udm` and `prior-approval-extraction-udm` as drilldown targets; `proposal-budget-personnel-extraction-udm` ↔ `proposal-document-completeness-udm` as producer/consumer; `foa-checklist-extraction-udm` ↔ `rfa-checklist-extraction-udm` as siblings).
- … `triad_integration.evaluation_datasets: []`. Each workflow's eval-stub `metadata.yaml` describes the structural features and workflow features the case will exercise once a real authorized, de-identified case is selected.
- … `prompt.md` remains the single-call harness invocation surface; per-workflow scoring (post-consolidation JSON) is the right signal for the v0.1.0 runtimes. Component READMEs and workflow READMEs both note this distinction; campaign authors should record both signals when both are available.
- … `UDM_Column` annotations. See each component's CHANGELOG.md for the full per-component binding list.

Test plan
- `python3 scripts/build_component_catalog.py`: `Wrote component_catalog.json` (and `--check` confirms no drift)
- `python3 scripts/build_vandalizer_workflows.py --check`: `All 12 workflow export(s) up to date`
- `python3 .github/scripts/lint_components.py`: `Linted 22 component(s), 12 workflow(s)`. Only the 2 pre-existing eval-version-lag warnings on `nsf-award-notice-extraction-udm` and `rfp-extraction` remain, unrelated to this PR.
- `python3 scripts/build_docs.py`: 22 components rendered, `mkdocs.yml` nav spliced.
- `python3 -m mkdocs build --strict`: `Documentation built in 2.69 seconds`, 0 warnings.
- … `Validation_Plan` mirrored verbatim into the manifest's top-level `validation_plan:` and round-tripped into the export envelope.
- … `Enum_Values` mirrored verbatim into the manifest's `searchset.items[].enum_values` (`submission_system_platform`, `reporting_frequency`, `certification_method`, `review_type`, `audit_requirements`, `record_retention`, `cost_type`, `invoicing_frequency`, `federal_agency`, `fa_base`).
- … `<slug>.vandalizer.json` into a Vandalizer instance and running it against representative federal award / proposal documents; output JSON should validate against the corresponding `components/<slug>-udm/schema.json`.

🤖 Generated with Claude Code
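As an illustration of the kind of post-consolidation check a workflow-local eval case might exercise, the CFR-01 reconciliation `sum(budget_period_amounts) == total_award_amount` could be checked roughly as below. This is a sketch under assumptions, not code from the repository: it assumes `budget_period_amounts` is a flat list of JSON numbers, the function name `check_cfr01` is hypothetical, and the half-cent tolerance is an addition of this sketch to absorb floating-point rounding (the PR states the rule as an exact equality).

```python
import json
import math


def check_cfr01(output: dict, tolerance: float = 0.005):
    """Compare the sum of budget-period amounts with the stated award total.

    Returns (ok, difference), where difference = sum(periods) - total.
    Assumes both fields are already JSON numbers, per the encoding rules.
    """
    periods = output["budget_period_amounts"]
    total = output["total_award_amount"]
    # math.fsum avoids accumulating rounding error across many periods.
    difference = math.fsum(periods) - total
    return math.isclose(difference, 0.0, abs_tol=tolerance), difference


if __name__ == "__main__":
    # Hypothetical consolidated output fragment, not taken from a real award.
    sample = json.loads(
        '{"budget_period_amounts": [100000, 125000.5, 74999.5],'
        ' "total_award_amount": 300000}'
    )
    ok, difference = check_cfr01(sample)
    print("CFR-01 reconciles:", ok, "difference:", difference)
```

Returning the signed difference alongside the boolean lets a reviewer see by how much the periods and the total disagree, not only that they do.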