compute: Add WASM-compiled scalar expression evaluation (milestone 1) by antiguru · Pull Request #35250 · MaterializeInc/materialize

antiguru · 2026-02-27T14:40:13Z

Introduce the mz-expr-compiler crate, which compiles a subset of MirScalarExpr trees to WebAssembly modules that operate on columnar data.
This is milestone 1: an end-to-end proof compiling a + b for Int64 columns.

The crate has four modules:

analyze — determines whether an expression tree is compilable (milestone 1: Column, Literal(Int64), CallBinary(AddInt64))
codegen — emits a WASM module via wasm-encoder with a row loop that reads columnar Int64 inputs, performs i64.add, and writes columnar output with null propagation
columnar — columnar buffer types (ColumnBatch, TypedColumn, ResultColumn) with row-to-column and column-to-row conversions
engine — wraps wasmtime to compile and instantiate the generated WASM, exposing ExprEngine::compile() and CompiledExpr::evaluate()
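
The row loop and null propagation described for `codegen` and `columnar` can be illustrated in plain Rust. This is only a sketch of the idea, not the crate's code: `Int64Column`, `to_columns`, and `add_columns` are hypothetical names, and storing one null flag per row is an assumed layout.

```rust
/// Hypothetical stand-in for a columnar Int64 buffer:
/// one value and one null flag per row.
struct Int64Column {
    values: Vec<i64>,
    nulls: Vec<bool>,
}

/// Converts row-major input into `width` columns.
/// A missing or `None` cell becomes a null entry with a placeholder value of 0.
fn to_columns(rows: &[Vec<Option<i64>>], width: usize) -> Vec<Int64Column> {
    (0..width)
        .map(|c| {
            let mut col = Int64Column { values: Vec::new(), nulls: Vec::new() };
            for row in rows {
                let cell = row.get(c).copied().flatten();
                col.nulls.push(cell.is_none());
                col.values.push(cell.unwrap_or(0));
            }
            col
        })
        .collect()
}

/// Row loop for `a + b`: the output row is null if either input is null.
/// `wrapping_add` matches the milestone 1 behavior of a bare i64.add.
fn add_columns(a: &Int64Column, b: &Int64Column) -> Int64Column {
    let len = a.values.len().min(b.values.len());
    let mut out = Int64Column {
        values: Vec::with_capacity(len),
        nulls: Vec::with_capacity(len),
    };
    for i in 0..len {
        let is_null = a.nulls[i] || b.nulls[i];
        out.nulls.push(is_null);
        out.values.push(if is_null { 0 } else { a.values[i].wrapping_add(b.values[i]) });
    }
    out
}
```

Keeping the null flags separate from the values lets the loop body stay branch-light, which is one plausible reason to choose a columnar layout for generated code.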

The compute layer integration adds an ENABLE_COMPILED_EXPRESSIONS dyncfg (default: false) and a try_compile_mfp_expressions hook in as_collection_core() that attempts WASM compilation when the flag is enabled.
Actual batch evaluation replacing the interpreter is a follow-up.

Known limitation: the generated WASM uses wrapping i64.add without overflow detection.
The interpreter uses checked_add and returns NumericFieldOverflow.
Overflow detection will be added in milestone 2.

Tests added: 19 tests in mz-expr-compiler — analysis unit tests, codegen validation, columnar round-trip tests, engine integration tests, and proptest differential tests comparing compiled vs interpreted on random Int64 data.

🤖 Generated with Claude Code

Introduce the `mz-expr-compiler` crate, which compiles a subset of `MirScalarExpr` trees to WebAssembly modules that operate on columnar data. This is milestone 1: an end-to-end proof compiling `a + b` for Int64 columns.

The crate has four modules:

* `analyze`: determines whether an expression tree is compilable (milestone 1: Column, Literal(Int64), CallBinary(AddInt64))
* `codegen`: emits a WASM module via `wasm-encoder` with a row loop that reads columnar Int64 inputs, performs i64.add, and writes columnar output with null propagation
* `columnar`: columnar buffer types (`ColumnBatch`, `TypedColumn`, `ResultColumn`) with `rows_to_columns` / `columns_to_rows` conversions
* `engine`: wraps `wasmtime` to compile and instantiate the generated WASM, exposing `ExprEngine::compile()` and `CompiledExpr::evaluate()`

The compute layer integration adds:

* `ENABLE_COMPILED_EXPRESSIONS` dyncfg (default: false) in `compute-types`
* A `try_compile_mfp_expressions` hook in `as_collection_core()` that attempts WASM compilation when the flag is enabled and logs the result.

Actual batch evaluation is a follow-up.

Known limitation: the generated WASM uses wrapping i64.add without overflow detection; the interpreter uses checked_add and returns NumericFieldOverflow. Overflow detection will be added in milestone 2.

Tests: 19 tests including 7 analysis unit tests, 3 codegen tests, 2 columnar round-trip tests, 4 engine integration tests, and 3 proptest differential tests comparing compiled vs interpreted on random data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-27T14:40:23Z

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

The PR title is descriptive and will make sense in the git log.
This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

* Align wasm-encoder version with wasmtime's internal dependency (0.244.0) to avoid duplicate crate in cargo deny
* Add gimli and linux-raw-sys to deny.toml skip list for wasmtime deps
* Add wasmtime crates as wrappers for the banned `log` crate
* Refactor proptests to use closure form with #[mz_ore::test] to satisfy the test-attribute lint
* Regenerate workspace-hack

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace the `try_compile_mfp_expressions` compilation probe in `as_collection_core()` with an actual `MfpEvaluator` dispatch that routes between the interpreter and WASM-compiled evaluation.

Add `CompiledExprSession` for per-row WASM evaluation of a single expression through a cached instance, and `CompiledMfp`, which wraps an `MfpPlan` and dispatches each expression to WASM or the interpreter. The temporal bounds and projection logic remain interpreted.

The `MfpEvaluator` enum in the flat_map closure selects between `MfpPlan::evaluate` (interpreted) and `CompiledMfp::evaluate` (compiled) based on the `ENABLE_COMPILED_EXPRESSIONS` feature flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The WASM codegen for AddInt64 previously emitted a bare i64.add which wraps on overflow, while the interpreter uses checked_add and returns EvalError::NumericFieldOverflow. This caused the compiled path to silently produce wrong results on overflow.

Add inline overflow detection after i64.add in emit_add_int64. Two new i64 locals (local_a, local_b) save operands before the add via local.tee. After the add, the standard signed overflow check ((a ^ result) & (b ^ result)) < 0 detects when both operands share a sign but the result differs. The check is guarded by !is_null to skip garbage values from null propagation.

Tighten assert_compiled_matches_interpreted in the proptests to require exact error agreement between compiled and interpreted paths, removing the lenient blocks that previously tolerated wrapping.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
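
The signed overflow check quoted in this commit message can be written out in Rust and compared against `i64::checked_add`. This is a sketch of the arithmetic only; the function name is hypothetical and it is not the emitted WASM.

```rust
/// Wrapping add followed by the sign-based overflow test
/// `((a ^ result) & (b ^ result)) < 0`.
/// Returns `None` on overflow, mirroring `i64::checked_add`.
fn add_with_overflow_check(a: i64, b: i64) -> Option<i64> {
    // Produces the same bits as a bare WASM i64.add.
    let result = a.wrapping_add(b);
    // The sign bit of (a ^ result) is set when a and result differ in sign;
    // likewise for b. Both set means the operands shared a sign and the
    // result flipped it, which is exactly signed overflow.
    if ((a ^ result) & (b ^ result)) < 0 {
        None
    } else {
        Some(result)
    }
}
```

Operands of opposite sign can never overflow on addition, and the check reflects that: at least one of the two XOR terms then has a clear sign bit.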

Expand the expr-compiler from only AddInt64 to all Int64 binary arithmetic (sub, mul, div, mod), bitwise (and, or, xor), and unary (neg, bitnot, abs) operations.

Infrastructure changes:

* Error codes instead of boolean error flag: the WASM error byte now carries a discriminant (0=ok, 1=NumericFieldOverflow, 2=DivisionByZero, 3=Int64OutOfRange) that the host maps to the appropriate EvalError.
* Fix local clobbering in nested expressions: operands are saved to locals only after both children are evaluated, preventing inner calls from overwriting local_a/local_b before the outer operation reads them.
* Fix null/error precedence: each fallible operation saves and restores is_null around child evaluation so that errors in sibling subtrees propagate even when another subtree produces null, matching the interpreter's semantics where errors take precedence over nulls.
* Add CallUnary support to is_compilable, collect_columns, and emit_expr.
* Introduce EmitLocals struct and preamble/postamble helpers to reduce parameter threading across emit functions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
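
A host-side view of the error discriminants and of the "errors take precedence over nulls" rule might look like the following sketch. `ErrorKind`, `decode_error_code`, and `add_int64` are hypothetical names for illustration (the PR maps to the real `EvalError`), and reporting the left child's error first when both children fail is an assumption.

```rust
/// Hypothetical mirror of the three error discriminants listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorKind {
    NumericFieldOverflow,
    DivisionByZero,
    Int64OutOfRange,
}

/// Maps the error byte to an outcome: `Ok(None)` for 0 (no error),
/// `Ok(Some(kind))` for a known error, `Err(code)` for an unknown byte.
fn decode_error_code(code: u8) -> Result<Option<ErrorKind>, u8> {
    match code {
        0 => Ok(None),
        1 => Ok(Some(ErrorKind::NumericFieldOverflow)),
        2 => Ok(Some(ErrorKind::DivisionByZero)),
        3 => Ok(Some(ErrorKind::Int64OutOfRange)),
        other => Err(other),
    }
}

/// Result of evaluating a subtree: an error, a null (`None`), or a value.
type Evaluated = Result<Option<i64>, ErrorKind>;

/// Both children are evaluated before the operation runs.
/// An error in either child wins over a null in the other.
fn add_int64(left: Evaluated, right: Evaluated) -> Evaluated {
    let l = left?; // assumption: the left error is reported first
    let r = right?;
    match (l, r) {
        (Some(a), Some(b)) => a
            .checked_add(b)
            .map(Some)
            .ok_or(ErrorKind::NumericFieldOverflow),
        _ => Ok(None), // null propagation only once no child has failed
    }
}
```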

Add milestone 3 of the WASM expression compiler: comparison operators, Bool as a WASM output type, and compiled predicate evaluation.

Type inference (`infer_type`) determines expression output types from input column types, enabling generic comparison operators (Eq, NotEq, Lt, Lte, Gt, Gte) to be compiled when both operands are Int64. WASM comparisons return i32, widened to i64 via `i64.extend_i32_u` to maintain the stack convention. Bool values use i64 encoding (0=false, 1=true).

`is_compilable` now accepts `input_types` to gate comparisons on operand types. Bool unary `Not` uses `i64.eqz` + extend. `Datum::True`/`False` compile to `i64.const 1`/`0`. Host-side decoding maps i64 results to `Datum::True`/`Datum::False` or `TypedColumn::Bool` based on inferred output type.

`CompiledMfp::try_new` accepts `input_types` and compiles predicates alongside expressions. `evaluate_inner` uses compiled predicate sessions, falling back to the interpreter for non-compiled predicates. The compute call site passes `&[]` for now (no regressions).

Added 10 proptests covering all comparison operators, Not, nested comparisons with arithmetic, and comparisons with literals.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
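
To make the type-gating idea concrete, here is a small sketch of type inference over a toy expression tree. `Expr`, `ColType`, and `decode_bool` are hypothetical and much smaller than `MirScalarExpr`. The `infer_type` signature here is assumed for illustration, and treating any nonzero i64 as true on decode is also an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColType {
    Int64,
    Bool,
}

/// Toy expression tree standing in for the compilable subset.
enum Expr {
    Column(usize),
    Int64Literal(i64),
    BoolLiteral(bool),
    AddInt64(Box<Expr>, Box<Expr>),
    Lt(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
}

/// Returns the output type if the expression is well-typed for the given
/// input column types, or `None` if it cannot be typed (and so not compiled).
fn infer_type(expr: &Expr, input_types: &[ColType]) -> Option<ColType> {
    match expr {
        Expr::Column(i) => input_types.get(*i).copied(),
        Expr::Int64Literal(_) => Some(ColType::Int64),
        Expr::BoolLiteral(_) => Some(ColType::Bool),
        Expr::AddInt64(l, r) => match (infer_type(l, input_types), infer_type(r, input_types)) {
            (Some(ColType::Int64), Some(ColType::Int64)) => Some(ColType::Int64),
            _ => None,
        },
        // A comparison compiles only when both operands are Int64.
        Expr::Lt(l, r) => match (infer_type(l, input_types), infer_type(r, input_types)) {
            (Some(ColType::Int64), Some(ColType::Int64)) => Some(ColType::Bool),
            _ => None,
        },
        Expr::Not(e) => match infer_type(e, input_types) {
            Some(ColType::Bool) => Some(ColType::Bool),
            _ => None,
        },
    }
}

/// Decodes the i64 Bool encoding (0 = false, 1 = true).
fn decode_bool(raw: i64) -> bool {
    raw != 0
}
```

In this sketch, an empty `input_types` slice leaves every column reference untyped, so comparisons over columns are rejected and would stay on the interpreted path.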

…inference

Add WASM-compiled support for VariadicFunc::And/Or with three-valued SQL null logic (FALSE dominates AND, TRUE dominates OR), UnaryFunc::IsNull/IsTrue/IsFalse as null-consuming operations that never produce null output, and Datum::True/False input handling.

Add infer_input_types_from_mfp() to derive column types from expression usage patterns (e.g., AddInt64 implies Int64 operands), replacing the hardcoded &[] at the compute call site so comparisons and predicates can now compile to WASM at runtime.

The And/Or codegen saves and restores tracking locals on the WASM stack around each child evaluation to support arbitrary nesting depth. Proptests cover all new operations including nested And-of-Or and Or-of-And combinations, plus deterministic edge-case tests for null dominance semantics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
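
The three-valued logic named in this commit can be stated compactly with `Option<bool>`, where `None` stands for SQL NULL. This is a reference sketch of the semantics, not the codegen; the function names are hypothetical.

```rust
/// Variadic AND: any FALSE wins, otherwise NULL if any input was NULL,
/// otherwise TRUE.
fn and3(args: &[Option<bool>]) -> Option<bool> {
    let mut saw_null = false;
    for arg in args {
        match arg {
            Some(false) => return Some(false), // FALSE dominates AND
            None => saw_null = true,
            Some(true) => {}
        }
    }
    if saw_null { None } else { Some(true) }
}

/// Variadic OR: any TRUE wins, otherwise NULL if any input was NULL,
/// otherwise FALSE.
fn or3(args: &[Option<bool>]) -> Option<bool> {
    let mut saw_null = false;
    for arg in args {
        match arg {
            Some(true) => return Some(true), // TRUE dominates OR
            None => saw_null = true,
            Some(false) => {}
        }
    }
    if saw_null { None } else { Some(false) }
}

/// Null-consuming tests: these always return a definite bool.
fn is_null(v: Option<bool>) -> bool {
    v.is_none()
}

fn is_true(v: Option<bool>) -> bool {
    v == Some(true)
}

fn is_false(v: Option<bool>) -> bool {
    v == Some(false)
}
```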

…uation

Add mfp_eval benchmark with 7 scenarios (arithmetic, predicates, boolean logic, null filtering) across 3 batch sizes comparing CompiledMfp against SafeMfpPlan throughput per row.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add eval_batch() that processes N rows in a single WASM call instead of one call per row. The method reuses the existing WASM instance, grows memory automatically when the batch exceeds capacity, and writes input data in column-major layout matching the generated WASM function's expectations.

Benchmarks show 8-17x speedup over interpreted evaluation and 15-40x over per-row compiled evaluation, confirming WASM call overhead was the bottleneck in the per-row path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
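
Column-major layout means all values of column 0 come first, then all values of column 1, and so on. The sketch below packs a batch that way and runs one pass over it. It assumes a flat `i64` buffer with no null flags, which is a simplification; the actual memory layout used by `eval_batch()` is not described here, and both function names are hypothetical.

```rust
/// Packs row-major input into a flat column-major buffer:
/// element (row r, column c) lands at index `c * num_rows + r`.
/// Panics if a row has fewer than `num_cols` entries.
fn pack_column_major(rows: &[Vec<i64>], num_cols: usize) -> Vec<i64> {
    let num_rows = rows.len();
    let mut buf = vec![0i64; num_rows * num_cols];
    for (r, row) in rows.iter().enumerate() {
        for c in 0..num_cols {
            buf[c * num_rows + r] = row[c];
        }
    }
    buf
}

/// One call handles the whole batch: adds column 0 and column 1 of a
/// two-column buffer. Wrapping add keeps the sketch free of error handling.
fn add_batch(buf: &[i64], num_rows: usize) -> Vec<i64> {
    let (a, b) = buf.split_at(num_rows);
    a.iter().zip(b).map(|(x, y)| x.wrapping_add(*y)).collect()
}
```

With this layout each column is a contiguous slice, so the per-call overhead is paid once per batch rather than once per row.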

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

antiguru and others added 9 commits February 27, 2026 15:58

compute: Add design doc for WASM-compiled scalar expression evaluation

4a1cd2e

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

antiguru wants to merge 10 commits into MaterializeInc:main from antiguru:compiled_mse
