Rename Validity::no_nulls to definitely_no_nulls and add execute_no_nulls#8333
Rename Validity::no_nulls to definitely_no_nulls and add execute_no_nulls#8333joseph-isaacs wants to merge 1 commit into
Conversation
…ulls The boolean returned by no_nulls() only proves the absence of nulls for the NonNullable and AllValid variants; a Validity::Array may still resolve to all-valid once executed. As validity arrays become lazy this distinction matters, so the cheap check is renamed to definitely_no_nulls() and documented as conservative. For call sites that need a definitive answer, add execute_no_nulls(), which executes the validity into a Mask via execute_mask(). The CUDA ListView export ensure-check is switched to the exact variant since a false answer there is a hard error rather than a slow path. https://claude.ai/code/session_01VPQ7dfZtijfrsjAipwXvEj Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
| /// resolve to all-valid once executed. Callers must treat `false` as "unknown without | ||
| /// compute" and either fall back to a null-handling path or execute the validity. | ||
| #[inline] | ||
| pub fn no_nulls(&self) -> bool { |
There was a problem hiding this comment.
This is an approximate function. Do you think has_no_nulls says that?
There was a problem hiding this comment.
They all sound vague to me, does it really need to be a function vs allowing callers to just have it explicitly there?
Merging this PR will improve performance by 11.73%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ⚡ | Simulation | chunked_varbinview_canonical_into[(100, 100)] |
309 µs | 274.5 µs | +12.59% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] |
275.3 ns | 246.1 ns | +11.85% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] |
362.5 µs | 327.2 µs | +10.76% |
Tip
Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.
Comparing claude/cool-bardeen-l8jlsy-1-definitely-no-nulls (4906359) with develop (0a41704)
Summary
PR 1 of a 4-PR stack preparing
Validityfor lazy validity arrays, where aValidity::Arraymay resolve to all-true/all-false once executed.Validity::no_nulls()→definitely_no_nulls(). The boolean only proves the absence of nulls forNonNullable/AllValid; for a (lazy)Arrayvariant,falsemeans "unknown without compute", not "has nulls". The new name and doc comment make the conservative semantics explicit.Validity::execute_no_nulls(length, ctx), the exact counterpart built onexecute_mask(), for call sites that need a definitive answer.vortex_ensure!, where afalseanswer is a hard error rather than a slow path — that one is switched toexecute_no_nullsso a lazy all-valid child validity no longer spuriously fails the export.Stack
no_nulls()→definitely_no_nulls()+execute_no_nulls()mask_eq()semantics fix for mixed variantsdefinitely_all_invalid()helperScalarFnarray validityChecks
cargo build -p vortex-array -p vortex-zstd -p vortex-parquet-variantcargo nextest run -p vortex-array(2962 passed)cargo clippy -p vortex-array -p vortex-zstd -p vortex-parquet-variant --all-targetscargo +nightly fmt --allvortex-cudalocally: itsvortex-nvcompbuild script requires the CUDA toolkit, which is unavailable in this environment.https://claude.ai/code/session_01VPQ7dfZtijfrsjAipwXvEj
Generated by Claude Code