By passing exact-length unpadded inputs to regncomp, we can catch out-of-bounds reads using -fsanitize=address.
Many tools expect source files to have a valid encoding (like UTF-8); this makes them happy.
These were all found with the latest retest.c using AddressSanitizer, with: CFLAGS="-O1 -g -fsanitize=address -fno-omit-frame-pointer" LDFLAGS="-fsanitize=address"
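As a rough illustration of why an exact-length, unpadded input matters, here is a minimal sketch. The helper name exact_copy is hypothetical and is not taken from retest.c; it only shows the idea of giving the pattern a heap block with no terminator and no slack, so that any read one past the end is reported by AddressSanitizer as a heap-buffer-overflow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from retest.c): copy the first n bytes of src
 * into a heap block of exactly n bytes. No NUL terminator is appended and
 * no padding is added, so under -fsanitize=address a read of buf[n] lands
 * in the redzone and is reported. */
static char *exact_copy(const char *src, size_t n)
{
    char *buf;

    assert(src != NULL);
    buf = malloc(n);
    /* malloc(0) may legitimately return NULL, so only copy when there is
     * something to copy and the allocation succeeded. */
    if (buf != NULL && n > 0)
        memcpy(buf, src, n);
    return buf;
}
```

In a test harness, the returned buffer and its length n would then be handed to regncomp instead of the original NUL-terminated string, and freed after the call returns. A stack array or a string literal is a weaker choice here, since neighbouring bytes are more likely to be readable without any report.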
dag-erling
requested changes
Jan 31, 2026
Comment on lines +1244 to +1245
+ if (ctx->re >= ctx->re_end)
+     return REG_BADPAT;
Collaborator
This is redundant since we perform the exact same check immediately upon entering the loop.
  #ifdef REG_LITERAL
- if (*(ctx->re + 1) == L'Q')
+ if (*(ctx->re + 1) == L'Q' || *(ctx->re + 1) == L'U')
Collaborator
This seems unrelated... What is \U supposed to mean? Also, \Q...\E appears to be undocumented. Also also, the TRE website is down.
Updated retest so regncomp is called with a strictly non-padded input buffer, rather than passing a NUL-terminated buffer. With that in place, AddressSanitizer exposed several out-of-bounds reads in the parser.
Added a few additional parser test cases around atoms, bracket expressions, and escapes.
Then fixed the parser, adding explicit bounds checks and better error handling.
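For readers who want a picture of what such a bounds check can look like, the following is only a sketch and not the code from this PR. It models just the two cursor fields visible in the diff (re and re_end) in a hypothetical sketch_ctx struct, and guards the one-character lookahead that a test like *(ctx->re + 1) == L'Q' relies on. The function names peek_next and at_literal_start are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the parser context; only the two cursor
 * fields that appear in the diff are modeled. */
typedef struct {
    const wchar_t *re;      /* next unread character */
    const wchar_t *re_end;  /* one past the last character */
} sketch_ctx;

/* Store the character after the current one in *out and return 1 if it
 * lies inside the buffer; return 0 without reading anything otherwise. */
static int peek_next(const sketch_ctx *ctx, wchar_t *out)
{
    assert(ctx != NULL && out != NULL);
    if (ctx->re >= ctx->re_end || ctx->re_end - ctx->re < 2)
        return 0;
    *out = ctx->re[1];
    return 1;
}

/* Return 1 if the cursor sits on a backslash followed by 'Q', checking
 * the bounds before each dereference. */
static int at_literal_start(const sketch_ctx *ctx)
{
    wchar_t next;

    return ctx->re < ctx->re_end && *ctx->re == L'\\'
        && peek_next(ctx, &next) && next == L'Q';
}
```

With an unpadded buffer that ends in a lone backslash, at_literal_start returns 0 instead of reading past re_end, which is the kind of input the exact-length tests are meant to exercise.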