
Add basic support for named pattern #1

Closed
hhugo wants to merge 24 commits into debug from name-minimal

Owner

@hhugo hhugo commented Feb 8, 2026

This PR is part of a multi-PR chain to implement efficient named patterns in sedlex.

This commit does not include any optimization whatsoever. It includes many new tests that will change as we implement optimizations.

Replace ocaml-community#112
Fix #5

@hhugo hhugo force-pushed the name-minimal branch 2 times, most recently from 14d80de to b0c1d9e, March 11, 2026 16:12
hhugo and others added 7 commits March 11, 2026 22:20
Add two example calculators showing how to bridge Sedlexing.lexbuf
with ocamlyacc/menhir parsers, and document the pattern in the README.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The upper bound of the surrogate rejection range was 0xdf00 instead of
0xdfff, which would have allowed U+DF01..U+DFFF through. In practice
the bug was masked by the local Uchar.of_int wrapper, but fix it for
correctness. Add comments explaining why only check_three needs the
surrogate check, and add an expect test for surrogate rejection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
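
The surrogate fix described in the commit above can be illustrated with a small standalone sketch. This is not sedlex's code: `is_surrogate`, `is_surrogate_buggy`, `decode_three` and `check_three_sketch` are hypothetical names chosen for illustration, and only `Uchar.is_valid` comes from the OCaml standard library. Surrogates occupy U+D800..U+DFFF, which lies entirely inside the range encoded by three-byte UTF-8 sequences (U+0800..U+FFFF), so only the three-byte path can produce one.

```ocaml
(* Hypothetical helpers for illustration; not the actual sedlex source. *)

(* Correct surrogate range: U+D800..U+DFFF inclusive. *)
let is_surrogate cp = cp >= 0xd800 && cp <= 0xdfff

(* The variant described in the commit message, with 0xdf00 as upper bound:
   it fails to flag U+DF01..U+DFFF. *)
let is_surrogate_buggy cp = cp >= 0xd800 && cp <= 0xdf00

(* Decode a structurally well-formed three-byte UTF-8 sequence
   (1110xxxx 10xxxxxx 10xxxxxx) into a code point. *)
let decode_three b0 b1 b2 =
  ((b0 land 0x0f) lsl 12) lor ((b1 land 0x3f) lsl 6) lor (b2 land 0x3f)

(* Reject three-byte sequences that decode to a surrogate. *)
let check_three_sketch b0 b1 b2 =
  let cp = decode_three b0 b1 b2 in
  if is_surrogate cp then None else Some cp
```

With these definitions, the bytes `ED BF BF` (which decode to U+DFFF) are rejected, while `E2 82 AC` (U+20AC) is accepted. The standard library's `Uchar.is_valid` also rejects the whole surrogate range, which is consistent with the commit's remark that a `Uchar.of_int` wrapper masked the bug.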
…nity#176)

* Support nested let..in for [%sedlex.regexp?] definitions (ocaml-community#41)

Allow users to define named regexps using nested let statements, e.g.:
  let int_lit =
    let digit = [%sedlex.regexp? '0'..'9'] in
    [%sedlex.regexp? Plus digit]

Add eval_regexp_expr method that recursively evaluates let..in chains
of regexp definitions, used by both the expression handler and
structure_with_regexps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add comment to ast match

* Update documentation

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
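
The recursive evaluation of let..in chains mentioned in the commit above can be modelled on a toy AST. This is only a sketch under stated assumptions: the `regexp` and `expr` types, `resolve` and `eval` are hypothetical and much simpler than the Parsetree values the real `eval_regexp_expr` method works on. The idea it shows is that each `let` evaluates its bound regexp, extends an environment with that name, and then evaluates the body.

```ocaml
(* Toy model; the types and function names here are hypothetical. *)
module Env = Map.Make (String)

type regexp =
  | Chars of char * char (* a character range such as '0'..'9' *)
  | Plus of regexp
  | Seq of regexp * regexp
  | Ref of string (* reference to a previously named regexp *)

type expr =
  | Regexp of regexp (* stands for [%sedlex.regexp? ...] *)
  | Let of string * expr * expr (* stands for: let name = ... in ... *)

(* Replace every named reference by its definition from [env]. *)
let rec resolve env = function
  | Chars _ as r -> r
  | Plus r -> Plus (resolve env r)
  | Seq (a, b) -> Seq (resolve env a, resolve env b)
  | Ref name -> (
      match Env.find_opt name env with
      | Some r -> r
      | None -> failwith ("unbound regexp: " ^ name))

(* Walk a let..in chain, extending the environment at each binding. *)
let rec eval env = function
  | Regexp r -> resolve env r
  | Let (name, bound, body) ->
      let r = eval env bound in
      eval (Env.add name r env) body

(* Mirrors the int_lit example from the commit message. *)
let int_lit =
  Let ("digit", Regexp (Chars ('0', '9')), Regexp (Plus (Ref "digit")))
```

Evaluating `int_lit` in an empty environment yields `Plus (Chars ('0', '9'))`: the inner name `digit` is resolved and does not escape the `let`.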
hhugo and others added 4 commits March 18, 2026 18:53
The default branch in a match%sedlex is not a regexp — it fires when
no rule matches, so zero characters are consumed and the lexeme is "".
To catch unexpected characters, use `any` instead.

Closes ocaml-community#51

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
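
The behaviour of the default branch described in the commit above can be modelled without the ppx. The following is a dependency-free toy, not sedlex's generated code: `rule`, `digits`, `any` and `step` are hypothetical, and rules are tried in order rather than by longest match, which is a deliberate simplification. It shows that the default fires with an empty lexeme, whereas a catch-all `any` rule consumes one character.

```ocaml
(* Toy model: a rule is tried at [pos] and returns how many characters it
   matched, or None if it does not match there. *)
type rule = string -> int -> int option

(* One or more ASCII digits. *)
let digits : rule =
 fun s pos ->
  let n = String.length s in
  let rec go i =
    if i < n && s.[i] >= '0' && s.[i] <= '9' then go (i + 1) else i
  in
  let stop = go pos in
  if stop > pos then Some (stop - pos) else None

(* Any single character. *)
let any : rule = fun s pos -> if pos < String.length s then Some 1 else None

(* Return the name of the branch that fires and the lexeme it consumed.
   When no rule matches, the default fires and nothing is consumed. *)
let step rules s pos =
  let rec try_rules = function
    | [] -> ("default", "")
    | (name, r) :: rest -> (
        match r s pos with
        | Some len -> (name, String.sub s pos len)
        | None -> try_rules rest)
  in
  try_rules rules
```

On the input `"x1"`, `step [ ("digits", digits) ]` selects the default branch with lexeme `""`, so a lexer loop that relies on the default to skip bad input would not advance. Adding `("any", any)` as a later rule makes the same call return the lexeme `"x"`.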
…-community#181)

* Document regexp operator precedence (fixes ocaml-community#35)

Since sedlex regexps are OCaml patterns, they follow OCaml's pattern
precedence: | (lowest) < , < constructor application (highest).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Doc: add new sub sections

* cleanup

* cleanup

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
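
The precedence rule stated in the commit above is that of ordinary OCaml patterns, so it can be checked with plain OCaml and no ppx. The function names below are hypothetical. By the same rule, a sedlex regexp such as `Plus 'a', 'b' | 'c'` should read as `((Plus 'a'), 'b') | 'c'`.

```ocaml
(* `|` binds loosest: this pattern is (1, 2) | (3, 4), not 1, (2 | 3), 4.
   The scrutinee is therefore a pair, not a triple. *)
let is_known_pair = function
  | 1, 2 | 3, 4 -> true
  | _ -> false

(* Constructor application binds tightest: Some 1, 2 is (Some 1), 2,
   so the scrutinee has type int option * int. *)
let is_some_one_then_two = function
  | Some 1, 2 -> true
  | _ -> false
```

When a different grouping is intended, explicit parentheses make it unambiguous, for example `1, (2 | 3)`.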
- Add ppx_sedlex.mli with minimal public surface
- Replace table_counter/partition_counter refs with Hashtbl.length
- Expose reset_state instead of raw partitions/tables hashtables
- Bake builtin_regexps and Fun.id into handle_sedlex_match
- Comment out unused extensions value
- Remove StringMap, builtin_regexps, regexp_of_pattern from interface

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
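
Replacing counter refs with `Hashtbl.length`, as listed in the commit above, might look like the sketch below. The names `tables` and `table_id`, the key type, and the exact body of `reset_state` are assumptions made for illustration; the real ppx_sedlex.ml may differ. The technique relies on each key being added exactly once, so that the number of bindings doubles as the next fresh index.

```ocaml
(* Hypothetical cache from a table's contents (a string here, for
   simplicity) to the index assigned to it. *)
let tables : (string, int) Hashtbl.t = Hashtbl.create 16

(* Return the index for [key], allocating a fresh one on first use.
   Hashtbl.length replaces a separate mutable counter. *)
let table_id key =
  match Hashtbl.find_opt tables key with
  | Some id -> id
  | None ->
      let id = Hashtbl.length tables in
      Hashtbl.add tables key id;
      id

(* Expose a reset function instead of the raw hashtable. *)
let reset_state () = Hashtbl.reset tables
```

Keeping the table private and exporting only `reset_state` lets tests start from a clean state without depending on how the cache is represented.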
@hhugo hhugo force-pushed the name-minimal branch 2 times, most recently from e7ac0a6 to 79f3579, March 24, 2026 13:27
hhugo and others added 2 commits March 24, 2026 16:41
Add [%compile_error] test extension that applies the sedlex mapper to
an expression, catches errors, and prints them with OCaml's caret
display (line numbers stripped for stability). Expose map_expression
in ppx_sedlex for this purpose.

27 expect tests in test/codegen/test_errors.ml covering every error
path in ppx_sedlex.ml: as-binding restrictions, operator misuse,
malformed strings, invalid patterns, match structure, and regexp
definition errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add section covering `as` binding syntax, submatch extraction
functions, or-pattern support, and operator restrictions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hhugo hhugo closed this Apr 2, 2026

2 participants