feat(scalar): push byte_length validity down to its input#8328
Open
joseph-isaacs wants to merge 3 commits into
Open
feat(scalar): push byte_length validity down to its input#8328joseph-isaacs wants to merge 3 commits into
joseph-isaacs wants to merge 3 commits into
Conversation
byte_length is null-preserving: the result is null exactly when the input is null. Previously, ByteLength did not implement the ScalarFnVTable::validity rule, so requesting the validity of a byte_length array fell back to is_not_null(byte_length(input)). That fallback forces the whole function to be evaluated just to extract the validity mask, which for compressed encodings like FSST means decompressing the codes into a canonical VarBinView. Implement ByteLength::validity to return the input's validity directly. For FSST this reads the validity buffer straight from the array via is_not_null without touching the compressed codes. Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
ScalarFn::validity always returned Validity::Array, even when the result was fully valid or fully invalid. That defeats every downstream fast-path keyed off Validity::NonNullable | AllValid | AllInvalid (for example the FSST byte_length kernel), and for non-nullable results it pointlessly built and evaluated a validity expression to produce a constant-true array. Improve the ScalarFn validity vtable to: - short-circuit a non-nullable result dtype to Validity::NonNullable without building or evaluating any validity expression; - collapse a literal validity expression (e.g. functions that return lit(true)) directly to AllValid/AllInvalid; and - collapse a constant validity result array to the most specific variant. Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Add a regression test for a compound validity expression (a == b, whose validity is and(valid(a), valid(b))) and document why the validity expression is evaluated eagerly: eager evaluation lets a provably all-valid/all-invalid expression fold to a constant that collapses into a precise Validity variant, keeping no_nulls() reliable. A lazy validity tree would hide such constants behind an unevaluated ScalarFn array. Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
myrrc
approved these changes
Jun 10, 2026
robert3005
approved these changes
Jun 10, 2026
Merging this PR will improve performance by 27.23%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[128] |
216.9 ns | 275.3 ns | -21.19% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[1024] |
278.6 ns | 336.9 ns | -17.31% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[2048] |
342.2 ns | 400.6 ns | -14.56% |
| ⚡ | Simulation | compare[63] |
360.5 µs | 244.8 µs | +47.29% |
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] |
46.4 µs | 31.7 µs | +46.04% |
| ⚡ | Simulation | compare[56] |
327.6 µs | 224.8 µs | +45.69% |
| ⚡ | Simulation | compare[62] |
363.9 µs | 250.1 µs | +45.51% |
| ⚡ | Simulation | compare[63] |
371.2 µs | 255.6 µs | +45.26% |
| ⚡ | Simulation | compare[60] |
353.8 µs | 243.7 µs | +45.16% |
| ⚡ | Simulation | compare[56] |
332.4 µs | 229.6 µs | +44.79% |
| ⚡ | Simulation | compare[62] |
368.7 µs | 254.8 µs | +44.71% |
| ⚡ | Simulation | compare[61] |
362.7 µs | 250.8 µs | +44.61% |
| ⚡ | Simulation | compare[60] |
358.6 µs | 248.5 µs | +44.31% |
| ⚡ | Simulation | compare[58] |
347.3 µs | 241 µs | +44.15% |
| ⚡ | Simulation | compare[59] |
354.5 µs | 246.3 µs | +43.97% |
| ⚡ | Simulation | compare[61] |
367.5 µs | 255.5 µs | +43.84% |
| ⚡ | Simulation | compare[58] |
352.2 µs | 245.7 µs | +43.33% |
| ⚡ | Simulation | compare[57] |
345.9 µs | 241.5 µs | +43.25% |
| ⚡ | Simulation | compare[59] |
359.3 µs | 251 µs | +43.15% |
| ⚡ | Simulation | compare[54] |
330.4 µs | 231.5 µs | +42.71% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing claude/scalar-validity-optimization-k3lxyr (9e799eb) with develop (47d2041)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
byte_length is null-preserving: the result is null exactly when the
input is null. Previously, ByteLength did not implement the
ScalarFnVTable::validity rule, so requesting the validity of a
byte_length array fell back to is_not_null(byte_length(input)). That
fallback forces the whole function to be evaluated just to extract the
validity mask, which for compressed encodings like FSST means
decompressing the codes into a canonical VarBinView.
Implement ByteLength::validity to return the input's validity directly.
For FSST this reads the validity buffer straight from the array via
is_not_null without touching the compressed codes.
Signed-off-by: Joe Isaacs joe.isaacs@live.co.uk