Skip to content

feat(scalar): push byte_length validity down to its input#8328

Open
joseph-isaacs wants to merge 3 commits into
developfrom
claude/scalar-validity-optimization-k3lxyr
Open

feat(scalar): push byte_length validity down to its input#8328
joseph-isaacs wants to merge 3 commits into
developfrom
claude/scalar-validity-optimization-k3lxyr

Conversation

@joseph-isaacs

Copy link
Copy Markdown
Contributor

byte_length is null-preserving: the result is null exactly when the
input is null. Previously, ByteLength did not implement the
ScalarFnVTable::validity rule, so requesting the validity of a
byte_length array fell back to is_not_null(byte_length(input)). That
fallback forces the whole function to be evaluated just to extract the
validity mask, which for compressed encodings like FSST means
decompressing the codes into a canonical VarBinView.

Implement ByteLength::validity to return the input's validity directly.
For FSST this reads the validity buffer straight from the array via
is_not_null without touching the compressed codes.

Signed-off-by: Joe Isaacs joe.isaacs@live.co.uk

claude added 3 commits June 9, 2026 17:07
byte_length is null-preserving: the result is null exactly when the
input is null. Previously, ByteLength did not implement the
ScalarFnVTable::validity rule, so requesting the validity of a
byte_length array fell back to is_not_null(byte_length(input)). That
fallback forces the whole function to be evaluated just to extract the
validity mask, which for compressed encodings like FSST means
decompressing the codes into a canonical VarBinView.

Implement ByteLength::validity to return the input's validity directly.
For FSST this reads the validity buffer straight from the array via
is_not_null without touching the compressed codes.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
ScalarFn::validity always returned Validity::Array, even when the result
was fully valid or fully invalid. That defeats every downstream fast-path
keyed off Validity::NonNullable | AllValid | AllInvalid (for example the
FSST byte_length kernel), and for non-nullable results it pointlessly
built and evaluated a validity expression to produce a constant-true
array.

Improve the ScalarFn validity vtable to:
- short-circuit a non-nullable result dtype to Validity::NonNullable
  without building or evaluating any validity expression;
- collapse a literal validity expression (e.g. functions that return
  lit(true)) directly to AllValid/AllInvalid; and
- collapse a constant validity result array to the most specific variant.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Add a regression test for a compound validity expression (a == b, whose
validity is and(valid(a), valid(b))) and document why the validity
expression is evaluated eagerly: eager evaluation lets a provably
all-valid/all-invalid expression fold to a constant that collapses into
a precise Validity variant, keeping no_nulls() reliable. A lazy validity
tree would hide such constants behind an unevaluated ScalarFn array.

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
@joseph-isaacs joseph-isaacs requested a review from a team June 10, 2026 10:15
@codspeed-hq

codspeed-hq Bot commented Jun 10, 2026

Copy link
Copy Markdown

Merging this PR will improve performance by 27.23%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 189 improved benchmarks
❌ 3 regressed benchmarks
✅ 1334 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation bitwise_not_vortex_buffer_mut[128] 216.9 ns 275.3 ns -21.19%
Simulation bitwise_not_vortex_buffer_mut[1024] 278.6 ns 336.9 ns -17.31%
Simulation bitwise_not_vortex_buffer_mut[2048] 342.2 ns 400.6 ns -14.56%
Simulation compare[63] 360.5 µs 244.8 µs +47.29%
Simulation chunked_bool_canonical_into[(1000, 10)] 46.4 µs 31.7 µs +46.04%
Simulation compare[56] 327.6 µs 224.8 µs +45.69%
Simulation compare[62] 363.9 µs 250.1 µs +45.51%
Simulation compare[63] 371.2 µs 255.6 µs +45.26%
Simulation compare[60] 353.8 µs 243.7 µs +45.16%
Simulation compare[56] 332.4 µs 229.6 µs +44.79%
Simulation compare[62] 368.7 µs 254.8 µs +44.71%
Simulation compare[61] 362.7 µs 250.8 µs +44.61%
Simulation compare[60] 358.6 µs 248.5 µs +44.31%
Simulation compare[58] 347.3 µs 241 µs +44.15%
Simulation compare[59] 354.5 µs 246.3 µs +43.97%
Simulation compare[61] 367.5 µs 255.5 µs +43.84%
Simulation compare[58] 352.2 µs 245.7 µs +43.33%
Simulation compare[57] 345.9 µs 241.5 µs +43.25%
Simulation compare[59] 359.3 µs 251 µs +43.15%
Simulation compare[54] 330.4 µs 231.5 µs +42.71%
... ... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing claude/scalar-validity-optimization-k3lxyr (9e799eb) with develop (47d2041)

Open in CodSpeed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants