Tag delay optimization #179

Draft
hhugo wants to merge 3 commits into ocaml-community:master from hhugo:set-prev

hhugo (Collaborator) commented Mar 11, 2026

Summary

  • When a Set_position tag fires on every transition entering a cycle state, remove it from those transitions and instead emit Set_position with offset 1 (pos - 1) on the exit transitions. A loop that runs n times then performs one tag write on exit instead of n writes.
  • Handle multi-state cycles via SCC analysis (Tarjan's algorithm), and multi-exit cycles: an exit from another cycle state is safe when it only reaches rules that don't read the tag.
  • Set_position carries { cell; offset }: offset 0 records the current position, and offset 1 records the previous code point (the offset is subtracted from pos).
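
To make the first and last points concrete, here is a minimal sketch of the idea on a toy lexer for 'a'+ 'b'. Everything except the Set_position { cell; offset } shape is hypothetical: the state numbering, the run function, and the assumption that every code point advances pos by exactly 1 are for illustration only and are not sedlex's generated code.

```ocaml
(* Simplified model of a tag action; sedlex's real types may differ. *)
type action = Set_position of { cell : int; offset : int }

(* Apply a tag action: store [pos - offset] into the tag cell. *)
let apply cells pos (Set_position { cell; offset }) = cells.(cell) <- pos - offset

(* Toy automaton for 'a'+ 'b': state 1 loops on 'a', and 'b' exits the loop.
   [on_enter] runs on every transition entering state 1, [on_exit] on the
   transition leaving it. Returns the final tag value and the number of tag
   writes, or None when the input does not match. *)
let run ~on_enter ~on_exit input =
  let cells = [| -1 |] and writes = ref 0 in
  let fire pos acts = List.iter (fun a -> incr writes; apply cells pos a) acts in
  let n = String.length input in
  let rec loop state pos =
    if pos >= n then None
    else
      let pos' = pos + 1 in (* position after consuming one code point *)
      match state, input.[pos] with
      | (0 | 1), 'a' -> fire pos' on_enter; loop 1 pos'
      | 1, 'b' -> fire pos' on_exit; Some (cells.(0), !writes)
      | _ -> None
  in
  loop 0 0

(* Eager: write the current position on every entry into the loop state. *)
let eager = run ~on_enter:[ Set_position { cell = 0; offset = 0 } ] ~on_exit:[]

(* Delayed: write once on exit, one code point back from the current position. *)
let delayed = run ~on_enter:[] ~on_exit:[ Set_position { cell = 0; offset = 1 } ]
```

On the input "aaab" both variants leave 3 in the cell; the eager one performs three writes and the delayed one performs a single write.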

Test plan

  • Codegen expect tests (DOT graphs + generated code)
  • Runtime tests for multi-state cycles, multi-exit cycles, backtracking
  • test_realistic.ml multi-rule lexer baseline

🤖 Generated with Claude Code

hhugo marked this pull request as draft March 25, 2026 09:31
hhugo changed the title from "Self-loop tag delay (Set_prev)" to "Tag delay optimization" Mar 27, 2026
hhugo marked this pull request as ready for review March 27, 2026 15:14
When a Set_position tag fires on every transition entering a cycle
state, remove it from those transitions and emit Set_position with
offset 1 (pos - 1) on exit transitions. This turns O(n) tag writes
per loop into O(1) on exit.

Handles multi-state cycles via SCC analysis (Tarjan's algorithm) and
multi-exit cycles: exits from other cycle states are safe when they
only reach rules that don't read the tag.
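
For readers unfamiliar with the SCC step, the following is a sketch of Tarjan's algorithm over a state graph given as adjacency lists. The graph representation and the names tarjan_scc and cyclic_components are assumptions made for this example; the PR's actual implementation may be organised differently.

```ocaml
(* Tarjan's strongly connected components: succ.(v) lists the successors
   of state v. Returns the components as lists of states. *)
let tarjan_scc (succ : int list array) : int list list =
  let n = Array.length succ in
  let index = Array.make n (-1) in (* discovery index, -1 = unvisited *)
  let lowlink = Array.make n 0 in
  let on_stack = Array.make n false in
  let stack = ref [] and counter = ref 0 and sccs = ref [] in
  let rec visit v =
    index.(v) <- !counter;
    lowlink.(v) <- !counter;
    incr counter;
    stack := v :: !stack;
    on_stack.(v) <- true;
    List.iter
      (fun w ->
        if index.(w) < 0 then begin
          visit w;
          lowlink.(v) <- min lowlink.(v) lowlink.(w)
        end
        else if on_stack.(w) then lowlink.(v) <- min lowlink.(v) index.(w))
      succ.(v);
    if lowlink.(v) = index.(v) then begin
      (* v is the root of a component: pop the stack down to v. *)
      let rec pop acc =
        match !stack with
        | [] -> acc
        | w :: rest ->
            stack := rest;
            on_stack.(w) <- false;
            if w = v then w :: acc else pop (w :: acc)
      in
      sccs := pop [] :: !sccs
    end
  in
  for v = 0 to n - 1 do
    if index.(v) < 0 then visit v
  done;
  !sccs

(* Keep only components that contain a cycle: either several states, or a
   single state with a self-loop. *)
let cyclic_components succ =
  List.filter
    (function [ v ] -> List.mem v succ.(v) | _ -> true)
    (tarjan_scc succ)
```

For the graph [| [1]; [2]; [1; 3]; [3]; [] |], states 1 and 2 form a two-state cycle and state 3 is a self-loop, so those are the two components that a cycle-based optimization would consider.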

Set_position carries { cell; offset }: offset 0 records the current
position, and offset 1 records the previous code point (the offset is
subtracted from pos).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hhugo (Collaborator, Author) commented Mar 27, 2026

@toots, this PR is ready for review

hhugo and others added 2 commits March 27, 2026 17:24
Tags are allocated during regexp_of_pattern in the PPX, not during
compile_re. The tag_starts approach incorrectly mapped all tags to
rule 0. Use an NFA walk (tag_owners) to correctly discover which
rule each tag belongs to.
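
As an illustration of what such a walk could look like, here is a sketch over a toy NFA. Only the name tag_owners comes from the commit message; the node type, its fields, and the assumption that rules do not share nodes are hypothetical.

```ocaml
(* Toy NFA node; sedlex's real representation differs. *)
type node = {
  id : int;
  tag : int option; (* tag cell written when passing through this node *)
  mutable next : node list; (* successors, whatever the transition label *)
}

(* Walk the NFA from each rule's entry node and record, for every tag met,
   the index of the rule whose sub-automaton contains it. *)
let tag_owners (entries : node array) : (int, int) Hashtbl.t =
  let owners = Hashtbl.create 16 in
  Array.iteri
    (fun rule entry ->
      let seen = Hashtbl.create 16 in
      let rec visit n =
        if not (Hashtbl.mem seen n.id) then begin
          Hashtbl.add seen n.id ();
          Option.iter (fun t -> Hashtbl.replace owners t rule) n.tag;
          List.iter visit n.next
        end
      in
      visit entry)
    entries;
  owners

(* Two rules: rule 0 has no tag, rule 1 contains tag 0 inside a cycle. *)
let mk id tag = { id; tag; next = [] }

let owners =
  let r0 = mk 0 None and r0_end = mk 1 None in
  r0.next <- [ r0_end ];
  let r1 = mk 2 None and r1_tag = mk 3 (Some 0) in
  r1.next <- [ r1_tag ];
  r1_tag.next <- [ r1 ]; (* back edge, to show the walk terminates *)
  tag_owners [| r0; r1 |]
```

Here tag 0 is attributed to rule 1, whereas a scheme that maps every tag to rule 0 would get this case wrong.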

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With the tag binding on rule 1 (not rule 0), incorrect tag ownership
would allow an unsafe delay: the check would see rule 0 as unreachable
from the non-s exit (correct) but miss that rule 1 IS reachable there.
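
The failure mode described here can be sketched as follows. This assumes, for illustration only, that the safety check amounts to "the rule owning the tag is not reachable from the exit"; exit_is_safe, wrong_owner, and right_owner are hypothetical names, not functions from the PR.

```ocaml
module IntSet = Set.Make (Int)

(* Hypothetical check: delaying [tag] past an exit is treated as safe only if
   the rule that owns (reads) the tag cannot be reached from that exit. *)
let exit_is_safe ~(owner : int -> int) ~(reachable : IntSet.t) tag =
  not (IntSet.mem (owner tag) reachable)

(* Scenario from the commit message: the tag is bound in rule 1, and rule 1
   (but not rule 0) is reachable from the exit under consideration. *)
let reachable = IntSet.singleton 1
let wrong_owner _tag = 0 (* every tag attributed to rule 0 *)
let right_owner _tag = 1 (* ownership discovered by walking the NFA *)

let unsafe_accept = exit_is_safe ~owner:wrong_owner ~reachable 0
let correct_reject = exit_is_safe ~owner:right_owner ~reachable 0
```

With the wrong ownership the check accepts the delay (unsafe_accept is true); with the correct ownership it rejects it (correct_reject is false), which is the behaviour the test is meant to pin down.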

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hhugo marked this pull request as draft March 30, 2026 12:42