Implement IsRegExp check via Symbol.match#624
Conversation
Replace IsRegExpValue with two distinct operations: IsRegExpInstance
(brand check for RegExp prototype methods) and IsRegExp (Symbol.match
lookup for the RegExp constructor and String.prototype guards). The
RegExp constructor now reads .source/.flags from any object with a
truthy Symbol.match, and String.prototype.{startsWith,endsWith,includes}
reject regexp-like arguments via the same check. The replaceAll/matchAll
global-flag guards now read the flags property per spec instead of the
.global accessor.
Closes #610
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
|
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR implements Symbol.match-based RegExp detection throughout the engine: replaces IsRegExpValue with IsRegExp/IsRegExpInstance, marks created RegExp objects, updates RegExp constructor/construct, updates RegExp.prototype receiver checks, adjusts String methods to validate/extract flags per spec, adds error strings, and extends tests. ChangesRegExp Symbol.match Detection
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested labelsspec compliance, internal 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
Suite TimingTest Runner (interpreted: 9,242 passed; bytecode: 9,242 passed)
MemoryGC rows aggregate the main thread plus all worker thread-local GCs. Test runner worker shutdown frees thread-local heaps in bulk; that shutdown reclamation is not counted as GC collections or collected objects.
Benchmarks (interpreted: 407; bytecode: 407)
MemoryGC rows aggregate the main thread plus all worker thread-local GCs. Benchmark runner performs explicit between-file collections, so collection and collected-object counts can be much higher than the test runner.
Measured on ubuntu-latest x64. |
Benchmark Results407 benchmarks Interpreted: 🟢 9 improved · 🔴 380 regressed · 18 unchanged · avg -12.2% arraybuffer.js — Interp: 🔴 13, 1 unch. · avg -13.9% · Bytecode: 🟢 7, 7 unch. · avg +2.9%
arrays.js — Interp: 🔴 19 · avg -16.5% · Bytecode: 🟢 7, 12 unch. · avg +1.5%
async-await.js — Interp: 🔴 5, 1 unch. · avg -15.1% · Bytecode: 🟢 1, 5 unch. · avg -1.3%
async-generators.js — Interp: 🔴 1, 1 unch. · avg -13.9% · Bytecode: 🟢 1, 1 unch. · avg -1.4%
base64.js — Interp: 🔴 10 · avg -12.8% · Bytecode: 🟢 4, 6 unch. · avg +2.3%
classes.js — Interp: 🔴 29, 2 unch. · avg -11.6% · Bytecode: 🟢 4, 🔴 4, 23 unch. · avg +1.1%
closures.js — Interp: 🔴 10, 1 unch. · avg -13.0% · Bytecode: 🟢 5, 6 unch. · avg +1.9%
collections.js — Interp: 🔴 11, 1 unch. · avg -12.8% · Bytecode: 🟢 1, 🔴 1, 10 unch. · avg +0.1%
csv.js — Interp: 🔴 13 · avg -12.2% · Bytecode: 🟢 4, 9 unch. · avg +2.6%
destructuring.js — Interp: 🔴 22 · avg -13.5% · Bytecode: 🟢 9, 13 unch. · avg +3.3%
fibonacci.js — Interp: 🔴 8 · avg -15.2% · Bytecode: 8 unch. · avg +0.6%
float16array.js — Interp: 🔴 32 · avg -12.6% · Bytecode: 🟢 16, 🔴 1, 15 unch. · avg +3.4%
for-of.js — Interp: 🔴 7 · avg -20.2% · Bytecode: 🟢 1, 6 unch. · avg +1.5%
generators.js — Interp: 🔴 4 · avg -8.6% · Bytecode: 🟢 1, 3 unch. · avg +0.3%
iterators.js — Interp: 🔴 36, 6 unch. · avg -8.0% · Bytecode: 🟢 31, 11 unch. · avg +4.9%
json.js — Interp: 🔴 19, 1 unch. · avg -12.1% · Bytecode: 🔴 8, 12 unch. · avg -2.1%
jsx.jsx — Interp: 🔴 21 · avg -17.8% · Bytecode: 🟢 5, 16 unch. · avg +2.0%
modules.js — Interp: 🔴 9 · avg -16.0% · Bytecode: 🟢 2, 🔴 1, 6 unch. · avg +0.1%
numbers.js — Interp: 🔴 11 · avg -11.6% · Bytecode: 🔴 2, 9 unch. · avg -1.6%
objects.js — Interp: 🔴 7 · avg -15.4% · Bytecode: 7 unch. · avg +0.4%
promises.js — Interp: 🔴 12 · avg -15.0% · Bytecode: 🟢 6, 6 unch. · avg +3.1%
regexp.js — Interp: 🔴 11 · avg -12.5% · Bytecode: 🟢 2, 🔴 1, 8 unch. · avg +0.6%
strings.js — Interp: 🔴 19 · avg -17.5% · Bytecode: 🟢 1, 🔴 1, 17 unch. · avg +0.0%
tsv.js — Interp: 🔴 9 · avg -15.3% · Bytecode: 🔴 1, 8 unch. · avg +1.2%
typed-arrays.js — Interp: 🟢 4, 🔴 17, 1 unch. · avg -5.7% · Bytecode: 🟢 10, 🔴 3, 9 unch. · avg +13.1%
uint8array-encoding.js — Interp: 🟢 5, 🔴 11, 2 unch. · avg +11.0% · Bytecode: 🟢 5, 🔴 3, 10 unch. · avg -5.9%
weak-collections.js — Interp: 🔴 14, 1 unch. · avg -21.9% · Bytecode: 🟢 9, 🔴 1, 5 unch. · avg +4.2%
Deterministic profile diffDeterministic profile diff: no significant changes. Measured on ubuntu-latest x64. Benchmark ranges compare cached main-branch min/max ops/sec with the PR run; overlapping ranges are treated as unchanged noise. Percentage deltas are secondary context. |
test262 Conformance
Areas closest to 100%
Per-test deltas (+20 / -0)Newly passing (20):
Steady-state failures are non-blocking; regressions vs the cached main baseline (lower total pass count, or any PASS → non-PASS transition) fail the conformance gate. Measured on ubuntu-latest x64, bytecode mode. Areas grouped by the first two test262 path components; minimum 25 attempted tests, areas already at 100% excluded. Δ vs main compares against the most recent cached |
…l cleanup - RegExpConstructorFn step 2b now checks pattern.constructor === RegExp before returning pattern as-is, per §22.2.3.1 step 2.b.i–ii. - replaceAll/matchAll flag checks now throw TypeError via RequireObjectCoercible(flags) when flags is null/undefined, before the 'g' check, per §22.1.3.20 step 2.b.ii. - matchAll caches AArgs.GetElement(0) in a local variable instead of re-evaluating it multiple times. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
source/units/Goccia.Values.StringObjectValue.pas (2)
126-141:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
CoerceRegExpValueno longer matchesRegExpCreatesemantics.Switching this helper to
IsRegExpInstancemakes truthy-Symbol.matchobjects fall back to stringification instead of using their.source/.flags, and it still ignoresANewFlagsfor realRegExpinstances. That breaks the fallback paths forString.prototype.match,matchAll,search, andsplitonce the well-known symbol method is absent. A case like"abc".match({ source: "a", flags: "", [Symbol.match]: true })will now build/[object Object]/instead of/a/.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/units/Goccia.Values.StringObjectValue.pas` around lines 126 - 141, CoerceRegExpValue must follow RegExpCreate semantics: when AValue is a real RegExp instance or an object that claims to be a RegExp via a truthy Symbol.match, use that object's .source and .flags rather than stringifying it; only use CloneRegExpObject when AValue is a RegExp instance and ANewFlags is empty — otherwise call CreateRegExpObject with the extracted source and either ANewFlags (if provided) or the object's flags. Update CoerceRegExpValue to detect RegExp instances (IsRegExpInstance) and Symbol.match-claiming objects, extract .source/.flags, respect ANewFlags, and only fall back to ToStringLiteral when neither case applies.
1089-1129:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon’t keep RegExp replacement semantics after
@@replacelookup fails.At this point
GetMethod(searchValue, @@replace)has already returnedundefined, so the spec fallback is the plain string-replacement path. This branch still performs regex replacement for actualRegExpinstances, which makesconst r = /a/; r[Symbol.replace] = undefined; "a".replace(r, "x")replace when it should treatras the search string. The same issue is repeated inStringReplaceAllMethodbelow.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/units/Goccia.Values.StringObjectValue.pas` around lines 1089 - 1129, The code treats any RegExp instance as needing RegExp replacement even when the @@replace method lookup returned undefined; change the condition that enters the RegExp branch so it only runs when the object's @@replace method is not undefined (i.e. check the previously obtained ReplaceMethod or call GetMethod(SearchArg, PROP_REPLACE) and ensure it is not nil/undefined before doing the RegExp-specific logic in the current replace implementation and likewise update StringReplaceAllMethod to use the same guard).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@source/units/Goccia.RegExp.Runtime.pas`:
- Around line 121-126: IsRegExpInstance is spoofable because it only checks
HasOwnProperty(PROP_SOURCE) and PROP_FLAGS; change it to test a real internal
brand/internal slot instead of public property presence: add an internal
non-enumerable marker (e.g., a hidden/internal flag on TGocciaObjectValue such
as FIsRegExp or a hidden PROP_REGEXP_BRAND) and update IsRegExpInstance to
return that internal flag (use a new TGocciaObjectValue method like
HasInternalFlag/IsRegExpBrand), then ensure the RegExp creation path (the RegExp
constructor/factory) sets that internal brand on instances while keeping it
non-writable and invisible to HasOwnProperty so ordinary objects like
{source,flags} cannot spoof it.
---
Outside diff comments:
In `@source/units/Goccia.Values.StringObjectValue.pas`:
- Around line 126-141: CoerceRegExpValue must follow RegExpCreate semantics:
when AValue is a real RegExp instance or an object that claims to be a RegExp
via a truthy Symbol.match, use that object's .source and .flags rather than
stringifying it; only use CloneRegExpObject when AValue is a RegExp instance and
ANewFlags is empty — otherwise call CreateRegExpObject with the extracted source
and either ANewFlags (if provided) or the object's flags. Update
CoerceRegExpValue to detect RegExp instances (IsRegExpInstance) and
Symbol.match-claiming objects, extract .source/.flags, respect ANewFlags, and
only fall back to ToStringLiteral when neither case applies.
- Around line 1089-1129: The code treats any RegExp instance as needing RegExp
replacement even when the @@replace method lookup returned undefined; change the
condition that enters the RegExp branch so it only runs when the object's
@@replace method is not undefined (i.e. check the previously obtained
ReplaceMethod or call GetMethod(SearchArg, PROP_REPLACE) and ensure it is not
nil/undefined before doing the RegExp-specific logic in the current replace
implementation and likewise update StringReplaceAllMethod to use the same
guard).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f46549b9-d4ed-45f5-a2fd-7c62309d2617
📒 Files selected for processing (13)
source/units/Goccia.Builtins.GlobalRegExp.passource/units/Goccia.Builtins.TestingLibrary.passource/units/Goccia.Error.Messages.passource/units/Goccia.Error.Suggestions.passource/units/Goccia.REPL.Formatter.passource/units/Goccia.RegExp.Runtime.passource/units/Goccia.Values.StringObjectValue.pastests/built-ins/RegExp/constructor.jstests/built-ins/String/prototype/endsWith.jstests/built-ins/String/prototype/includes.jstests/built-ins/String/prototype/matchAll.jstests/built-ins/String/prototype/replaceAll.jstests/built-ins/String/prototype/startsWith.js
IsRegExpInstance now checks the internal HasRegExpData flag (set by
CreateRegExpObject) instead of probing for own source/flags properties.
Plain objects like {source: 'a', flags: ''} can no longer spoof a
RegExp instance past prototype method guards.
String.prototype.replace and replaceAll no longer fall through to regex
execution when @@replace is undefined — per §22.1.3.19 step 4, the
searchValue is ToString'd and treated as a plain string when the Symbol
method lookup fails.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
source/units/Goccia.Values.StringObjectValue.pas (2)
126-141:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHonor
IsRegExp, not just branded instances, in fallback coercion.This helper now stringifies regexp-like objects with a truthy
Symbol.matchunless they also carry the internal brand. That breaks theRegExpCreate-style fallback used byString.prototype.match,search, andmatchAll, so inputs like{ [Symbol.match]: true, source: 'a', flags: '' }no longer participate as regexes there.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/units/Goccia.Values.StringObjectValue.pas` around lines 126 - 141, The fallback coercion in CoerceRegExpValue incorrectly checks only for branded regexp instances via IsRegExpInstance, causing objects that are RegExp-like (truthy Symbol.match) to be treated as strings; change the initial branch to use IsRegExp(AValue) instead of IsRegExpInstance(AValue) so CloneRegExpObject/CreateRegExpObject paths honor RegExp-like objects (e.g., objects with Symbol.match true and source/flags properties) when deciding whether to clone or construct a RegExp.
1397-1437:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't keep regex semantics after
@@splitis absent.By the time execution reaches this branch,
GetMethod(separator, @@split)has already returnedundefined. Spec fallback from there is string splitting, but thisIsRegExpInstancepath still executes regex splitting for branded regexes, so overriding or deletingSymbol.spliton aRegExpinstance won't take effect.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/units/Goccia.Values.StringObjectValue.pas` around lines 1397 - 1437, This branch treats a separator as a RegExp even when its @@split method was resolved to undefined earlier; ensure regex semantics are only used if the object's GetMethod(separator, @@split) returned a callable. Modify the logic around IsRegExpInstance / CoerceRegExpValue so that before calling CoerceRegExpValue and MatchRegExpObjectValue you also check the previously-resolved split method (Symbol.split/@@split) and skip this RegExp branch when that method was undefined, preserving ordinary string-splitting semantics when @@split was removed or overridden.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@source/units/Goccia.Values.StringObjectValue.pas`:
- Around line 126-141: The fallback coercion in CoerceRegExpValue incorrectly
checks only for branded regexp instances via IsRegExpInstance, causing objects
that are RegExp-like (truthy Symbol.match) to be treated as strings; change the
initial branch to use IsRegExp(AValue) instead of IsRegExpInstance(AValue) so
CloneRegExpObject/CreateRegExpObject paths honor RegExp-like objects (e.g.,
objects with Symbol.match true and source/flags properties) when deciding
whether to clone or construct a RegExp.
- Around line 1397-1437: This branch treats a separator as a RegExp even when
its @@split method was resolved to undefined earlier; ensure regex semantics are
only used if the object's GetMethod(separator, @@split) returned a callable.
Modify the logic around IsRegExpInstance / CoerceRegExpValue so that before
calling CoerceRegExpValue and MatchRegExpObjectValue you also check the
previously-resolved split method (Symbol.split/@@split) and skip this RegExp
branch when that method was undefined, preserving ordinary string-splitting
semantics when @@split was removed or overridden.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c72fcabd-f95c-4102-a61b-267f7685c1a3
📒 Files selected for processing (4)
source/units/Goccia.RegExp.Runtime.passource/units/Goccia.Values.ObjectValue.passource/units/Goccia.Values.StringObjectValue.pastests/built-ins/RegExp/constructor.js
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/built-ins/RegExp/constructor.js
CoerceRegExpValue now checks IsRegExp (Symbol.match) instead of IsRegExpInstance (brand), and reads .source/.flags with undefined handling. ANewFlags is applied when provided, fixing the matchAll fallback which previously ignored the forced 'g' flag on clones. StringSplit no longer falls through to regex execution when @@split is undefined — per §22.1.3.23 step 5, the separator is ToString'd and used as a plain string delimiter. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
IsRegExpabstract operation, which detects regexp-like objects viaSymbol.matchinstead of a property-based brand check.IsRegExpValue→IsRegExpInstancefor clarity: the brand check is now distinct from the spec'sIsRegExp(Symbol.match lookup).new RegExp(...)andRegExp(...)) to useIsRegExp, so objects with a truthySymbol.matchhave their.source/.flagsread, and objects with a falsySymbol.matchare stringified.IsRegExpguards toString.prototype.{startsWith, endsWith, includes}that throwTypeErrorfor regexp-like arguments, per spec.replaceAll/matchAllglobal-flag checks to useIsRegExpand read theflagsproperty (per spec) instead of the.globalaccessor.Testing