Make ScalarFn array validity lazy when the function defines a validity expression by joseph-isaacs · Pull Request #8336 · vortex-data/vortex

joseph-isaacs · 2026-06-10T13:22:49Z

Summary

PR 4 of a 4-PR stack (stacked on #8335). This is the payoff: lazy validity for ScalarFn arrays.

Previously ValidityVTable<ScalarFn> always eagerly executed the validity expression via the legacy session. Now, when the scalar function provides a validity expression over its inputs, the expression is converted into a lazy ScalarFn array DAG instead:

- Literal nodes → ConstantArray
- ArrayExpr leaves → unwrap to the child array they hold
- Interior nodes → lazy ScalarFn arrays via Array::<ScalarFn>::try_new
- Constant results are folded back into AllValid/AllInvalid via child_to_validity
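
The mapping above can be sketched with a toy model. This is only an illustration under stated assumptions: `Expr`, `LazyArray`, `Validity` and `to_lazy` are hypothetical stand-ins rather than the Vortex types, and `child_to_validity` borrows the name used in this PR but is reimplemented here in its simplest possible form.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a validity expression tree (not the Vortex type).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(bool),
    ArrayLeaf(Rc<Vec<bool>>),      // models an ArrayExpr leaf holding a child array
    Call(&'static str, Vec<Expr>), // interior node: a function applied to inputs
}

// Hypothetical stand-in for the lazy array DAG; nothing here is executed.
#[derive(Debug, Clone, PartialEq)]
enum LazyArray {
    Constant(bool, usize),
    Child(Rc<Vec<bool>>),
    ScalarFn(&'static str, Vec<LazyArray>),
}

#[derive(Debug, PartialEq)]
enum Validity {
    AllValid,
    AllInvalid,
    Array(LazyArray),
}

// Convert the expression node by node without computing any values.
fn to_lazy(expr: &Expr, len: usize) -> LazyArray {
    match expr {
        // Literal nodes become constant arrays of the right length.
        Expr::Literal(v) => LazyArray::Constant(*v, len),
        // Leaves unwrap to the child array they already hold.
        Expr::ArrayLeaf(child) => LazyArray::Child(Rc::clone(child)),
        // Interior nodes stay lazy: record the function and its lazy inputs.
        Expr::Call(name, args) => {
            LazyArray::ScalarFn(*name, args.iter().map(|a| to_lazy(a, len)).collect())
        }
    }
}

// Fold constant results back into the cheap AllValid/AllInvalid forms.
fn child_to_validity(array: LazyArray) -> Validity {
    match array {
        LazyArray::Constant(true, _) => Validity::AllValid,
        LazyArray::Constant(false, _) => Validity::AllInvalid,
        other => Validity::Array(other),
    }
}
```

In this model, a literal validity expression collapses to `AllValid` or `AllInvalid`, while anything involving a child array stays an unevaluated `ScalarFn` node.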

Why the eager path remains for some functions

Functions that don't define a validity expression (for example Kleene and/or, where validity depends on the computed values) keep the eager path. The erased fallback for these is is_not_null(expr), which is self-referential: lazily materializing it means resolving the validity of the inner node, which spawns another is_not_null DAG over a fresh copy of the same node, so the recursion never shrinks. This manifested as a stack overflow in element-wise execution paths (execute_scalar → is_invalid → validity()), caught by test_bool_consistency. The new ScalarFnRef::validity_opt exposes whether a function defines its own validity expression, so the vtable can pick the right path.
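
A minimal sketch of the path selection, assuming a toy model: `ScalarFnImpl`, `Not`, `KleeneOr`, `ValidityPlan`, `plan_validity` and `validity_with_fallback` are hypothetical names, not the Vortex API. The `Option` return plays the role the text assigns to `validity_opt`: `Some` means the function defines validity over its inputs, `None` means only the self-referential fallback is available.

```rust
// Hypothetical expression type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(&'static str),
    IsNotNull(Box<Expr>),
    Call(&'static str, Vec<Expr>),
}

trait ScalarFnImpl {
    fn name(&self) -> &'static str;
    /// Some(expr) when validity can be expressed over the inputs alone.
    fn validity(&self, inputs: &[Expr]) -> Option<Expr>;
}

struct Not; // assumed here: valid exactly where its single input is valid
struct KleeneOr; // validity depends on computed values, so no expression

impl ScalarFnImpl for Not {
    fn name(&self) -> &'static str {
        "not"
    }
    fn validity(&self, inputs: &[Expr]) -> Option<Expr> {
        // Sketch only: assumes exactly one input is supplied.
        Some(Expr::IsNotNull(Box::new(inputs[0].clone())))
    }
}

impl ScalarFnImpl for KleeneOr {
    fn name(&self) -> &'static str {
        "or"
    }
    fn validity(&self, _inputs: &[Expr]) -> Option<Expr> {
        None
    }
}

#[derive(Debug, PartialEq)]
enum ValidityPlan {
    Lazy(Expr), // build a lazy DAG from this expression
    Eager,      // execute the function and inspect the result
}

// Pick the lazy path only when the function defines its own validity.
fn plan_validity(f: &dyn ScalarFnImpl, inputs: &[Expr]) -> ValidityPlan {
    match f.validity(inputs) {
        Some(expr) => ValidityPlan::Lazy(expr),
        None => ValidityPlan::Eager,
    }
}

// The fallback wraps the very call whose validity is being asked for,
// which is why materializing it lazily would never make progress.
fn validity_with_fallback(f: &dyn ScalarFnImpl, inputs: &[Expr]) -> Expr {
    f.validity(inputs)
        .unwrap_or_else(|| Expr::IsNotNull(Box::new(Expr::Call(f.name(), inputs.to_vec()))))
}
```

Note that the fallback expression for `KleeneOr` contains the original `or` call as its only child, so asking for its validity lazily would pose the same question again.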

Checks

- cargo nextest run -p vortex-array (2962 passed)
- cargo nextest run --workspace --exclude vortex-cuda --exclude vortex-nvcomp --exclude vortex-tensor --exclude vortex-duckdb (passed)
- cargo nextest run -p vortex-duckdb (196 passed; 2 network-dependent tests excluded)
- cargo clippy -p vortex-array --all-targets, cargo +nightly fmt --all

https://claude.ai/code/session_01VPQ7dfZtijfrsjAipwXvEj

Generated by Claude Code

…y expression

Previously ValidityVTable<ScalarFn> always eagerly executed the validity expression via the legacy session. Now, when the scalar function provides a validity expression over its inputs, the expression is converted into a lazy ScalarFn array DAG instead: Literal nodes become ConstantArrays, ArrayExpr leaves unwrap to the child arrays they hold, and interior nodes become lazy ScalarFn arrays. Constant results are folded back into AllValid/AllInvalid via child_to_validity.

Functions that do not define a validity expression (e.g. Kleene logic and/or, where validity depends on the computed values) keep the eager path. The erased fallback for these is is_not_null over the expression itself, so a lazy representation would be self-referential: resolving the validity of the inner node spawns another is_not_null DAG, which recurses without ever shrinking (this manifested as a stack overflow in element-wise execution paths). ScalarFnRef::validity_opt is added to expose whether a function defines its own validity expression.

https://claude.ai/code/session_01VPQ7dfZtijfrsjAipwXvEj

Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>


codspeed-hq · 2026-06-10T13:29:52Z

Merging this PR will degrade performance by 17.27%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

❌ 5 regressed benchmarks
✅ 1521 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | `varbinview_zip_block_mask` | 2.9 ms | 3.7 ms | -21.57% |
| ❌ | Simulation | `bitwise_not_vortex_buffer_mut[128]` | 216.9 ns | 275.3 ns | -21.19% |
| ❌ | Simulation | `bitwise_not_vortex_buffer_mut[1024]` | 278.6 ns | 336.9 ns | -17.31% |
| ❌ | Simulation | `bitwise_not_vortex_buffer_mut[2048]` | 342.2 ns | 400.6 ns | -14.56% |
| ❌ | Simulation | `varbinview_zip_fragmented_mask` | 6.2 ms | 6.9 ms | -11.27% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

Comparing claude/cool-bardeen-l8jlsy-4-lazy-scalarfn-validity (d72800b) with claude/cool-bardeen-l8jlsy-3-definitely-all-invalid (365401e)¹

¹ No successful run was found on claude/cool-bardeen-l8jlsy-3-definitely-all-invalid (97b60ac) during the generation of this report, so ff6018d was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

joseph-isaacs · 2026-06-10T14:44:00Z

Investigated the CodSpeed varbinview_zip_* regressions. They are a codegen-layout artifact, not a real cost of this change:

1. The changed code never runs in these benchmarks. Instrumenting ScalarFn::validity with a counter shows 0 calls during varbinview_zip_block_mask/varbinview_zip_fragmented_mask (vs ~47k calls in the bool-consistency tests, confirming the instrumentation works). The benches go straight through the dedicated ZipKernel for VarBinView and never query a ScalarFn array's validity.
2. Bisecting the two files in this PR reproduces the full ~19% delta locally only when arrays/scalar_fn/vtable/validity.rs changes, which is the file whose code is never executed here. The scalar_fn/erased.rs change alone shows no delta.
3. codegen-units=1 erases the regression entirely (branch 3: 505µs median vs branch 3 + this PR's files: 494µs). With default codegen units, adding code to that module shifts LLVM's CGU partitioning and changes inlining in the unrelated hot zip loop, which CodSpeed's instruction-counting simulation faithfully reports.
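
For readers who want to repeat the first check, the counter instrumentation can be sketched roughly as below. This is a standalone illustration, not the code used in the investigation: `VALIDITY_CALLS`, `validity`, `zip_kernel` and `calls_during` are hypothetical stand-ins for the instrumented function and a benchmark body that never reaches it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Global call counter; Relaxed ordering is enough for a simple tally.
static VALIDITY_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the function under suspicion.
fn validity(values: &[Option<i32>]) -> Vec<bool> {
    VALIDITY_CALLS.fetch_add(1, Ordering::Relaxed);
    values.iter().map(|v| v.is_some()).collect()
}

// Hypothetical stand-in for a benchmark body that never queries validity.
fn zip_kernel(mask: &[bool], if_true: &[i32], if_false: &[i32]) -> Vec<i32> {
    mask.iter()
        .zip(if_true.iter().zip(if_false))
        .map(|(&m, (&t, &f))| if m { t } else { f })
        .collect()
}

// Run a closure and report how many times `validity` was called inside it.
fn calls_during(body: impl FnOnce()) -> usize {
    let before = VALIDITY_CALLS.load(Ordering::Relaxed);
    body();
    VALIDITY_CALLS.load(Ordering::Relaxed) - before
}
```

Wrapping a known caller of `validity` in `calls_during` as a control, as the bool-consistency tests did here, guards against a counter that reads zero only because it is wired up incorrectly.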

The same flavor of noise shows in the rest of the stack: the bitwise_not_vortex_buffer_mut and chunked_*_canonical_into benchmarks flip between ±20–47% on adjacent PRs that don't touch vortex-buffer or the chunked builders at all.

I can't acknowledge regressions on the CodSpeed dashboard from this session; that needs someone with dashboard access.

https://claude.ai/code/session_01VPQ7dfZtijfrsjAipwXvEj

Generated by Claude Code

robert3005 · 2026-06-10T15:26:33Z

    /// Transforms the expression into one representing the validity of this expression.
    pub fn validity(&self, expr: &Expression) -> VortexResult<Expression> {
-        Ok(self.0.validity(expr)?.unwrap_or_else(|| {
+        Ok(self.validity_opt(expr)?.unwrap_or_else(|| {


do you want to remove the TODO?

joseph-isaacs added the changelog/feature A new feature label Jun 10, 2026 — with Claude

Merge branch 'claude/cool-bardeen-l8jlsy-3-definitely-all-invalid' in…

d72800b

…to claude/cool-bardeen-l8jlsy-4-lazy-scalarfn-validity


joseph-isaacs wants to merge 2 commits into claude/cool-bardeen-l8jlsy-3-definitely-all-invalid from claude/cool-bardeen-l8jlsy-4-lazy-scalarfn-validity
