Defer tag writes past fixed-length neighbors #193

Draft
hhugo wants to merge 3 commits into ocaml-community:master from hhugo:prefer-end-tag-element-length

Conversation

@hhugo
Collaborator

@hhugo hhugo commented Apr 1, 2026

Summary

  • Prefer the end tag over the start tag in the fixed-length element optimization, which delays the tag write until after the element
  • Allocate boundary tags for fixed-length tuple elements without a right-position anchor, so that as-binding tags fire as late as possible in the automaton
  • Dead tag elimination pass (Sedlex.optimize) strips unused boundary tags and remaps live ones to a dense range
  • Inner tuples communicate boundary anchors to enclosing aliases via aux's return triple
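
To illustrate the dead tag elimination item, here is a minimal sketch of stripping unused tags and renumbering the live ones densely. It is not the actual `Sedlex.optimize` code: the `transition` record, the `live` set, and the signature are hypothetical and only show the remapping idea.

```ocaml
(* Hypothetical sketch, not the real Sedlex.optimize. *)
module IntSet = Set.Make (Int)
module IntMap = Map.Make (Int)

(* Assumed shape: a transition lists the tags it writes. *)
type transition = { target : int; tags : int list }

(* [live] holds the tags that some as-binding actually reads. *)
let optimize ~(live : IntSet.t) (transitions : transition list) =
  (* Assign dense ids 0, 1, 2, ... to live tags in increasing order. *)
  let remap, count =
    IntSet.fold
      (fun tag (m, next) -> (IntMap.add tag next m, next + 1))
      live (IntMap.empty, 0)
  in
  (* Drop dead tags and rename live ones on every transition. *)
  let rewrite tr =
    { tr with
      tags = List.filter_map (fun t -> IntMap.find_opt t remap) tr.tags }
  in
  (List.map rewrite transitions, count)
```

With no live tags, `count` is 0 and every transition loses its tag writes, which is the behaviour the test plan checks for patterns without aliases.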

Test plan

  • Existing expect tests pass (codegen, realistic, basic)
  • New test: deferred start tag past inner prefix (("0x", Plus hexa) as x)
  • Verify dead boundary tags are eliminated (mem_cells unchanged for patterns without aliases)

🤖 Generated with Claude Code

@pmetzger
Member

pmetzger commented Apr 1, 2026

I am unlikely to be able to meaningfully review this, but I will happily do the merge if people tell me to.

@hhugo hhugo force-pushed the prefer-end-tag-element-length branch 7 times, most recently from 12a4f1f to bf1abc9 Compare April 1, 2026 23:17
@hhugo hhugo force-pushed the prefer-end-tag-element-length branch from bf1abc9 to 91cf66e Compare April 7, 2026 22:50
@hhugo
Collaborator Author

hhugo commented Apr 7, 2026

rebased on top of #196

@pmetzger
Member

@hhugo I added you to the writers for this repository, so you should be able to update such things when you believe it is appropriate. Please be careful about testing and correctness, especially with machine-generated code.

hhugo and others added 3 commits April 13, 2026 22:46
Delay tag writes as late as possible: when a fixed-length element
needs only one tag, use the end tag instead of the start tag. This
reduces redundant tag operations in loops (e.g., self-loop on 'a'
no longer writes the tag on every iteration).
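
A minimal sketch of why a single end tag is enough for a fixed-length element: the start can be recovered by subtracting the length. The constructor names come from this PR, but their exact semantics, the sign convention of `offset`, and the helpers `resolve` and `bounds_of_end_tag` are assumptions made for illustration.

```ocaml
(* Position expressions named in this PR; the semantics below are assumed. *)
type pos =
  | Start_plus of int                  (* token start + n *)
  | End_minus of int                   (* token end - n *)
  | Tag of { tag : int; offset : int } (* value of a tag cell + offset *)

(* Resolve a position against the token bounds and the tag cells. *)
let resolve ~token_start ~token_end ~(cells : int array) = function
  | Start_plus n -> token_start + n
  | End_minus n -> token_end - n
  | Tag { tag; offset } -> cells.(tag) + offset

(* Both bounds of a fixed-length element, derived from its end tag only. *)
let bounds_of_end_tag ~tag ~len =
  (Tag { tag; offset = -len }, Tag { tag; offset = 0 })
```

Because the tag sits after the element rather than before it, it is written once the element has been consumed instead of on every transition that could still lead into it.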

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allocate boundary tags for fixed-length tuple elements that lack a
right-position anchor, so that as-binding tags fire as late as
possible in the automaton.

During the rights computation (right-to-left pass), when retreat
breaks at a variable-length element but the current element has a
fixed length, a boundary tag is allocated at the element's end.
This tag becomes a concrete anchor: elements further left can
compute their positions via Tag{tag; offset}.  Dead boundary tags
(unreferenced by any as-binding) are eliminated by a new
Sedlex.optimize pass that strips unused tags and remaps live ones
to a dense range.
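
The right-to-left pass described above could look roughly like the sketch below. It is a simplified model, not the code in this branch: tuple elements are reduced to an optional fixed length, and `rights`, `retreat`, and the negative `offset` convention are assumptions.

```ocaml
type pos =
  | End_minus of int
  | Tag of { tag : int; offset : int }

(* Move a known position n characters to the left. *)
let retreat p n =
  match p with
  | End_minus k -> End_minus (k + n)
  | Tag { tag; offset } -> Tag { tag; offset = offset - n }

(* [Some n] is a fixed-length element, [None] a variable-length one.
   Returns the right position of each element (left-to-right order)
   and the number of boundary tags allocated. *)
let rights (lens : int option list) : pos option list * int =
  let next_tag = ref 0 in
  let step len (anchor, acc) =
    let right =
      match anchor, len with
      | Some _, _ -> anchor
      | None, Some _ ->
          (* Retreat broke further right: allocate a boundary tag
             at this fixed-length element's end. *)
          let tag = !next_tag in
          incr next_tag;
          Some (Tag { tag; offset = 0 })
      | None, None -> None
    in
    (* The anchor survives to the left only across a fixed length. *)
    let left =
      match right, len with
      | Some p, Some n -> Some (retreat p n)
      | _ -> None
    in
    (left, right :: acc)
  in
  let _, rs = List.fold_right step lens (Some (End_minus 0), []) in
  (rs, !next_tag)
```

In this model a boundary tag is allocated even when no alias ends up reading it, which is why a separate elimination pass is needed afterwards.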

Inner tuples communicate boundary anchors to an enclosing alias
via the third element of aux's return triple: (start_anchor,
end_anchor).  The alias picks whichever expression fires latest:
- Start_plus/End_minus always win (no tag needed).
- For the start boundary: inner anchor > outer left context.
- For the end boundary: outer right context > inner anchor.
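
The selection rules above can be sketched as follows. The helper names (`is_tagless`, `pick`, `pick_start`, `pick_end`) and the use of `option` for a missing anchor are hypothetical; when both candidates need no tag, this sketch simply keeps the first one.

```ocaml
type pos =
  | Start_plus of int
  | End_minus of int
  | Tag of { tag : int; offset : int }

(* Positions that need no tag at all. *)
let is_tagless = function
  | Start_plus _ | End_minus _ -> true
  | Tag _ -> false

(* A tagless candidate always wins; otherwise [first] beats [second]. *)
let pick ~first ~second =
  match first, second with
  | Some p, _ when is_tagless p -> first
  | _, Some p when is_tagless p -> second
  | Some _, _ -> first
  | None, _ -> second

(* Start boundary: the inner anchor outranks the outer left context. *)
let pick_start ~inner ~outer_left = pick ~first:inner ~second:outer_left

(* End boundary: the outer right context outranks the inner anchor. *)
let pick_end ~inner ~outer_right = pick ~first:outer_right ~second:inner
```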

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Simplify `prefer` to use a fixed ranking (Start_plus > End_minus > Tag)
instead of a caller-chosen `~best` parameter. Start_plus positions
require no runtime computation, making them strictly better than
End_minus (which needs the token end).
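
A fixed ranking of this kind might be written as below. This is a sketch under assumptions, not the code from the commit: the `rank` helper and the tie-breaking rule (keep the first argument) are hypothetical.

```ocaml
type pos =
  | Start_plus of int
  | End_minus of int
  | Tag of { tag : int; offset : int }

(* Lower rank is better: Start_plus > End_minus > Tag. *)
let rank = function
  | Start_plus _ -> 0
  | End_minus _ -> 1
  | Tag _ -> 2

(* Keep [a] unless [b] ranks strictly better. *)
let prefer a b = if rank b < rank a then b else a
```

Callers no longer pass a `~best` argument, so the same pair of candidates yields the same choice at every call site.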

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hhugo hhugo force-pushed the prefer-end-tag-element-length branch from 91cf66e to 245360b Compare April 13, 2026 20:53
@hhugo hhugo marked this pull request as draft April 13, 2026 20:53

2 participants