feat(xfa): extract orphan scripts + scope design doc #10
Merged
Previously, script extraction was driven by walkSubformChildren via the per-node attachFieldScripts helper, so any script whose owner node was suppressed by emitField / emitDraw was silently dropped. The known cases: event-bearing <draw> elements (status indicators), bind="none" non-AddAttachment buttons (Help Text / Show Intro triggers), <pageArea> events, and per-option scripts on <field>s flattened into an <exclGroup>'s Options. The b44a6f9 FormScript comment acknowledged this gap and deferred the fix.

Split script extraction from question emission:

- New extractAllScripts walks the entire xfaNode tree post-Section walk and emits a FormScript for every event-bearing node, regardless of whether the node was emitted as a Question or Section. OwnerPath is set to the SOM path of the owning node; OwnerID is left empty.
- New populateScriptBackRefs indexes the resulting scripts by OwnerPath, fills in OwnerID and Question.Scripts / FormSection.Scripts back-refs whenever the owner was also emitted as a Question/Section, and leaves orphans with empty OwnerID.
- exclGroup OptionEvents is a parallel slice to Options that preserves per-option events through the flatten so they reach extractAllScripts.
- pageArea and exclGroup are now valid event-stack targets in the parseXFATemplate state machine (previously only subform).

The attachFieldScripts / appendScripts helpers and the xfaNode.QuestionID field are removed: back-refs are now populated by path lookup in a single post-pass rather than threaded through emission.

Tests cover all four orphan cases (pageArea, bind=none button, draw with event, exclGroup per-option) and assert that the corresponding nodes are still NOT emitted as Questions; only their scripts are surfaced.

FormScript doc comment rewritten: orphan scripts are now first-class (OwnerPath set, OwnerID empty) rather than missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
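The extraction / back-ref split described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Node`, `extractAll`, `populateBackRefs`, and the `emitted` map are hypothetical stand-ins for xfaNode, extractAllScripts, and populateScriptBackRefs, and the dotted path is a simplified approximation of a SOM path.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for the PR's xfaNode (field names are assumptions).
type Node struct {
	Name     string
	Events   []string // event activities declared on this node, e.g. "click"
	Children []*Node
}

// FormScript keeps only the fields the commit message talks about.
type FormScript struct {
	OwnerPath string // simplified dotted path of the owning node
	OwnerID   string // empty until a back-ref is resolved; stays empty for orphans
	Event     string
}

// extractAll walks the whole tree and emits one script per event,
// regardless of whether the node is surfaced as a Question or Section.
func extractAll(n *Node, parent string, out []FormScript) []FormScript {
	path := n.Name
	if parent != "" {
		path = parent + "." + n.Name
	}
	for _, ev := range n.Events {
		out = append(out, FormScript{OwnerPath: path, Event: ev})
	}
	for _, c := range n.Children {
		out = extractAll(c, path, out)
	}
	return out
}

// populateBackRefs fills OwnerID by path lookup. The emitted map (path -> ID)
// is an assumed representation of the Questions/Sections that were emitted.
func populateBackRefs(scripts []FormScript, emitted map[string]string) {
	for i := range scripts {
		if id, ok := emitted[scripts[i].OwnerPath]; ok {
			scripts[i].OwnerID = id
		}
	}
}

// buildDemo runs both passes on a tiny tree: one emitted field, one suppressed draw.
func buildDemo() []FormScript {
	root := &Node{Name: "form1", Children: []*Node{
		{Name: "name", Events: []string{"exit"}},
		{Name: "statusDraw", Events: []string{"ready"}},
	}}
	scripts := extractAll(root, "", nil)
	populateBackRefs(scripts, map[string]string{"form1.name": "q1"})
	return scripts
}

func main() {
	for _, s := range buildDemo() {
		fmt.Printf("%s owner=%q event=%s\n", s.OwnerPath, s.OwnerID, s.Event)
	}
}
```

In this sketch the draw's script survives with an empty `OwnerID` because nothing in the emitted set matches its path, which is the orphan behavior the commit describes.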
Captures the high-level design plan for pdfer's XFA surface:

- The scope principle: extract structure and surface logic, don't execute logic. Schema is a projection of the template DOM, not a snapshot of a Form DOM. Runtime model (instance counts, presence toggles, calculation order) is the caller's responsibility.
- P1 roadmap: orphan-script extraction (done in 88b6989), a parallel Elements collection for non-question template nodes with visual presence or events, and <occur> / <bind> metadata on Sections and Questions for dynamic XFA.
- P2 drafts: SOM path parser and schema resolver, SOM-keyed data-DOM cursor API, and <validate><script> child-element capture.
- Explicit non-goals: script execution, merge algorithm, instance management, calculate dependency tracking, layout engine, and script-body parsing. These belong in the runtime layer.

Lives under docs/design/; this is the first design doc in this repo and introduces the convention. Status header marks it as P1 committed, P2 draft, and open for discussion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up polish on 88b6989 from a self-review pass.

- Per-option script OwnerPath now keys off the option <field>'s name attribute (e.g. "group.optA") rather than the option's data value, which can contain arbitrary text from <items>. Adds an OptionFieldNames parallel slice on xfaNode, populated during the exclGroup flatten. The resulting path is a real, SOM-resolvable expression.
- populateScriptBackRefs collapsed to a single pass over Questions: a preceding section walk records each question's containing section path; the question pass then computes the full SOM path and calls assign. The previous covered-map / two-pass split is gone.
- New TestNestedSubformFieldBackRef puts a field event three subforms deep and asserts both OwnerID and Question.Scripts resolve through the nested sections. This locks in that the path built by extractAllScripts matches the one built by populateScriptBackRefs at arbitrary nesting depth.
- TestExclGroupOptionScriptsExtracted updated to use field names (optA/optB) distinct from option values (a/b), with a negative assertion that the old value-keyed path does NOT appear.
- FormScript doc comment and docs/design/xfa-scope.md §1 reframed around the stability contract: OwnerID empty means "owner is not currently surfaced as a typed schema entity." That signal is stable in meaning even as §2 expands the set of typed entities (Elements, etc.); what shrinks is the orphan set, which is the direction audit-style consumers want anyway. The previous enumeration of four orphan cases read like a permanent classification.
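A rough sketch of the parallel-slice idea from the first bullet, assuming hypothetical types: `optionField`, `flatGroup`, `flatten`, and `optionOwnerPaths` are illustrative names, not the PR's. Only the slice names Options, OptionFieldNames, and OptionEvents come from the commit messages.

```go
package main

import "fmt"

// optionField is a hypothetical model of one <field> inside an <exclGroup>.
type optionField struct {
	Name   string   // the field's name attribute, e.g. "optA"
	Value  string   // the option's data value from <items>; may be arbitrary text
	Events []string // event activities declared on the option field
}

// flatGroup holds parallel slices: index i in each slice describes the same option.
type flatGroup struct {
	Name             string
	Options          []string
	OptionFieldNames []string
	OptionEvents     [][]string
}

// flatten collapses the option fields into the group, keeping all three
// slices the same length so per-option data stays aligned by index.
func flatten(name string, fields []optionField) flatGroup {
	g := flatGroup{Name: name}
	for _, f := range fields {
		g.Options = append(g.Options, f.Value)
		g.OptionFieldNames = append(g.OptionFieldNames, f.Name)
		g.OptionEvents = append(g.OptionEvents, f.Events)
	}
	return g
}

// optionOwnerPaths returns one owner path per option event, keyed by the
// field name rather than the option value.
func optionOwnerPaths(g flatGroup) []string {
	var paths []string
	for i, events := range g.OptionEvents {
		for range events {
			paths = append(paths, g.Name+"."+g.OptionFieldNames[i])
		}
	}
	return paths
}

func main() {
	g := flatten("group", []optionField{
		{Name: "optA", Value: "a", Events: []string{"click"}},
		{Name: "optB", Value: "b"},
	})
	fmt.Println(optionOwnerPaths(g))
}
```

Keying on the field name keeps the path independent of whatever text the option value holds, which matches the motivation given in the commit message.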
Summary
- Split script extraction from question emission so scripts on nodes pdfer doesn't surface as Questions or Sections (`<pageArea>` events, `bind="none"` non-AddAttachment buttons, event-bearing `<draw>`s, per-option events flattened out of an `<exclGroup>`) appear in `FormSchema.Scripts` instead of being silently dropped. Orphan scripts carry their SOM `OwnerPath` with an empty `OwnerID`; scripts whose owner is an emitted Question/Section get back-refs populated in a single post-pass.
- Design doc under `docs/design/`: captures the XFA scope principle ("extract structure and surface logic, don't execute logic"), marks orphan-script extraction as P1 #1 done, and lays out the rest of the P1 / P2 roadmap (Elements collection, `<occur>` / `<bind>` metadata, SOM resolver, data-DOM cursor API, `<validate><script>` capture).
- Follow-up polish: per-option script `OwnerPath` now keys off the `<field>` name attribute (real SOM, e.g. `group.optA`) rather than the option's `<items>` value (which can be arbitrary text); `populateScriptBackRefs` collapsed to a single question pass; new `TestNestedSubformFieldBackRef` locks in that the path produced by `extractAllScripts` matches the one produced by `populateScriptBackRefs` at arbitrary nesting depth; `FormScript` doc + design doc §1 reframed around the stability contract for the empty-`OwnerID` signal so the enumeration of orphan cases isn't read as a permanent classification.
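For a consumer, the empty-`OwnerID` contract might be used along these lines. This is a hedged sketch: the `FormScript` struct here carries only the two fields named in this description, and `splitByOwnership` is a hypothetical helper, not part of pdfer's API.

```go
package main

import "fmt"

// FormScript carries only the two fields named in the summary;
// the real struct presumably has more.
type FormScript struct {
	OwnerPath string
	OwnerID   string
}

// splitByOwnership separates scripts whose owner is a typed schema entity
// from orphans. It tests only for an empty OwnerID and never inspects the
// node kind, so it keeps working if more node kinds become typed entities.
func splitByOwnership(scripts []FormScript) (owned, orphans []FormScript) {
	for _, s := range scripts {
		if s.OwnerID == "" {
			orphans = append(orphans, s)
		} else {
			owned = append(owned, s)
		}
	}
	return owned, orphans
}

func main() {
	scripts := []FormScript{
		{OwnerPath: "form1.page1.name", OwnerID: "q-name"},
		{OwnerPath: "form1.pageArea1"},
	}
	owned, orphans := splitByOwnership(scripts)
	fmt.Println(len(owned), len(orphans))
}
```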
Test plan
- […] `TestDrawEventScriptExtracted`, `TestExclGroupOptionScriptsExtracted` assert the four orphan cases survive AND the corresponding nodes are still NOT emitted as Questions
- `TestNestedSubformFieldBackRef` asserts `OwnerID` and `Question.Scripts` resolve at arbitrary nesting depth
- […] pass with the new back-ref machinery