[PERF] evaluation: args definitions at compile time #8806
Open
LucasLefevre wants to merge 6 commits into
LucasLefevre commented Jun 3, 2026
Review taken into account.
Task: 6254300
A number of checks and pre/post-processing steps are performed for each executed function. Some of them can be determined at compile time, since they depend only on the function and the number of arguments.
With this commit, we pre-compute these once at compile time to minimize the work done when executing the function.

Bunch report
------------
cells imported in
  09ccc24: Mean: 2841.65 ms, StdErr: 3.91 ms
  c0ed960: Mean: 2758.60 ms, StdErr: 6.32 ms → 🟢 (vs prev: -3%, Δ=-83.05 ms, combined StdErr=7.43 ms, |Δ|/SE=-11.2x; n=20)
evaluate all cells
  09ccc24: Mean: 2678.65 ms, StdErr: 8.38 ms
  c0ed960: Mean: 2458.64 ms, StdErr: 11.72 ms → 🟢 (vs prev: -8%, Δ=-220.02 ms, combined StdErr=14.41 ms, |Δ|/SE=-15.3x; n=20)
Model created in
  09ccc24: Mean: 6470.76 ms, StdErr: 9.44 ms
  c0ed960: Mean: 6158.89 ms, StdErr: 14.90 ms → 🟢 (vs prev: -5%, Δ=-311.87 ms, combined StdErr=17.64 ms, |Δ|/SE=-17.7x; n=20)

Large formula dataset
---------------------
cells imported in
  09ccc24: Mean: 2703.84 ms, StdErr: 14.52 ms
  c0ed960: Mean: 2704.29 ms, StdErr: 14.18 ms → ⚫ (vs prev: +0%, Δ=+0.45 ms, combined StdErr=20.30 ms, |Δ|/SE=0.0x; n=20)
evaluate all cells
  09ccc24: Mean: 400.51 ms, StdErr: 2.94 ms
  c0ed960: Mean: 375.49 ms, StdErr: 2.43 ms → 🟢 (vs prev: -6%, Δ=-25.02 ms, combined StdErr=3.81 ms, |Δ|/SE=-6.6x; n=20)
Model created in
  09ccc24: Mean: 3441.85 ms, StdErr: 14.57 ms
  c0ed960: Mean: 3419.21 ms, StdErr: 14.61 ms → ⚫ (vs prev: -1%, Δ=-22.64 ms, combined StdErr=20.63 ms, |Δ|/SE=-1.1x; n=20)

Other production spreadsheet (heavily vectorized)
-------------------------------------------------
⚫ no measurable change anywhere

Yet another production spreadsheet (with some vectorization)
------------------------------------------------------------
cells imported in
  09ccc24: Mean: 491.04 ms, StdErr: 6.53 ms
  c0ed960: Mean: 493.73 ms, StdErr: 4.17 ms → ⚫ (vs prev: +1%, Δ=+2.69 ms, combined StdErr=7.75 ms, |Δ|/SE=0.3x; n=20)
evaluate all cells
  09ccc24: Mean: 660.53 ms, StdErr: 3.85 ms
  c0ed960: Mean: 633.88 ms, StdErr: 3.46 ms → 🟢 (vs prev: -4%, Δ=-26.65 ms, combined StdErr=5.18 ms, |Δ|/SE=-5.1x; n=20)
Model created in
  09ccc24: Mean: 1423.05 ms, StdErr: 9.20 ms
  c0ed960: Mean: 1396.71 ms, StdErr: 7.02 ms → 🟢 (vs prev: -2%, Δ=-26.33 ms, combined StdErr=11.57 ms, |Δ|/SE=-2.3x; n=20)

Legend:
  ⚫: no measurable change (|Δ|/SE < 2)
  🔴: slower (Δ > 0 and |Δ|/SE >= 2)
  🟢: faster (Δ < 0 and |Δ|/SE >= 2)

Task: 6254300
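To make the idea of moving per-call work to compile time concrete, here is a minimal sketch. It is not the code from this PR: `ArgSpec`, `FunctionSpec`, `CompiledCall`, and `compileCall` are hypothetical names, and the sketch assumes that a repeating argument, if any, is declared last. The arity check and the mapping from argument position to argument definition depend only on the function and the argument count, so they are resolved once per call site.

```typescript
// Hypothetical shapes for illustration only; none of these names come from the PR.
interface ArgSpec {
  name: string;
  optional?: boolean;
  repeating?: boolean; // assumed to be the last declared argument
}

interface FunctionSpec {
  name: string;
  args: ArgSpec[];
  compute: (...args: number[]) => number;
}

interface CompiledCall {
  // Resolved once: the definition that applies to each argument position.
  argSpecs: ArgSpec[];
  run: (...args: number[]) => number;
}

// Runs once per call site at compile time, not once per evaluation.
function compileCall(fn: FunctionSpec, argCount: number): CompiledCall {
  const required = fn.args.filter((a) => !a.optional && !a.repeating).length;
  const hasRepeating = fn.args.some((a) => a.repeating);
  const max = hasRepeating ? Infinity : fn.args.length;
  if (argCount < required || argCount > max) {
    throw new Error(`Invalid number of arguments for ${fn.name}`);
  }
  const argSpecs: ArgSpec[] = [];
  for (let i = 0; i < argCount; i++) {
    // Positions past the declared list reuse the last (repeating) definition.
    argSpecs.push(fn.args[Math.min(i, fn.args.length - 1)]);
  }
  return { argSpecs, run: fn.compute };
}

// Usage: a SUM-like function with one required and one repeating argument.
const SUM: FunctionSpec = {
  name: "SUM",
  args: [{ name: "value1" }, { name: "value2", optional: true, repeating: true }],
  compute: (...values) => values.reduce((total, v) => total + v, 0),
};
const compiledSum = compileCall(SUM, 3);
```

At evaluation time, only `compiledSum.run(...)` is called; the arity check and definition lookup are not repeated.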
Remove the layer that adapted the return value of the compute functions; each compute function definition now creates the values object directly. This simplifies typing by reducing the number of return cases that the compute functions had to handle.
Task: 6254300
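The change described above can be pictured with a before/after sketch. This is an assumption about the general shape, not the PR's actual code: `FunctionResult`, `LooseResult`, `adaptResult`, `absLoose`, and `absDirect` are hypothetical names.

```typescript
// Hypothetical result shape; the real "values object" may carry other fields.
interface FunctionResult {
  value: number | string;
  format?: string;
}

// Before (assumed): a compute function could return a raw value or an object,
// and an adapter normalized the result after every call.
type LooseResult = number | string | FunctionResult;

function adaptResult(result: LooseResult): FunctionResult {
  return typeof result === "object" ? result : { value: result };
}

const absLoose = (x: number): LooseResult => Math.abs(x);

// After: the compute function builds the object itself, so there is a single
// return type and no adapter on the execution path.
const absDirect = (x: number): FunctionResult => ({ value: Math.abs(x) });
```

With a single return type, callers no longer need a normalization step such as `adaptResult(absLoose(-2))`; they can use `absDirect(-2)` as is.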
Introduce a type-level distinction between formulas that return a scalar value and formulas that return an array. Previously, all formulas shared the same return type signature, making it impossible to enforce correct usage at compile time. This distinction will help in future commits to know, at compile time, what can be vectorized and what cannot.
Task: 6254300
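One common way to express such a distinction in TypeScript is a discriminated union. The sketch below is illustrative only: `ScalarFunction`, `ArrayFunction`, the `returns` tag, and `isScalarFunction` are hypothetical names, and the source does not say how the distinction maps to vectorization decisions.

```typescript
// Hypothetical types; the PR's actual type names are not shown in the source.
interface FunctionResult {
  value: number;
}
type Matrix<T> = T[][];

interface ScalarFunction {
  returns: "scalar";
  compute: (...args: number[]) => FunctionResult;
}

interface ArrayFunction {
  returns: "array";
  compute: (...args: number[]) => Matrix<FunctionResult>;
}

type AnyFunction = ScalarFunction | ArrayFunction;

// Type predicate: after this check the compiler knows the exact return type.
function isScalarFunction(fn: AnyFunction): fn is ScalarFunction {
  return fn.returns === "scalar";
}

const ABS: ScalarFunction = {
  returns: "scalar",
  compute: (x) => ({ value: Math.abs(x) }),
};

// A SEQUENCE-like function returning a single column of n values.
const SEQUENCE: ArrayFunction = {
  returns: "array",
  compute: (n) => Array.from({ length: n }, (_, i) => [{ value: i + 1 }]),
};
```

Because the tag is part of the type, assigning an array-returning `compute` to a `ScalarFunction` is rejected by the type checker instead of failing at run time.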
It is no longer necessary to check in each compute function whether vectorization is required: the compilation step now tells us whether a formula needs to be vectorized.
Task: 6254300
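As a rough illustration of deciding once, at compile time, whether a call needs vectorization, consider the sketch below. It is not the PR's implementation: `compile`, `argIsArray`, and the one-dimensional `Arg` type are simplifying assumptions, and array arguments are assumed to have the same length.

```typescript
type Scalar = number;
type Arg = Scalar | Scalar[]; // simplified: one-dimensional arrays only
type Compute = (...args: Scalar[]) => Scalar;

// `argIsArray` is assumed to be known statically for each argument position.
function compile(compute: Compute, argIsArray: boolean[]): (...args: Arg[]) => Arg {
  const needsVectorization = argIsArray.some(Boolean);
  if (!needsVectorization) {
    // Plain path: no per-call inspection of the arguments.
    return (...args) => compute(...(args as Scalar[]));
  }
  // Vectorized path: apply the scalar function element by element.
  return (...args) => {
    const length = Math.max(...args.map((a) => (Array.isArray(a) ? a.length : 1)));
    const out: Scalar[] = [];
    for (let i = 0; i < length; i++) {
      out.push(compute(...args.map((a) => (Array.isArray(a) ? a[i] : a))));
    }
    return out;
  };
}

const add: Compute = (a, b) => a + b;
const plainAdd = compile(add, [false, false]);
const vectorizedAdd = compile(add, [true, false]);
```

Here `plainAdd(1, 2)` takes the direct path, while `vectorizedAdd([1, 2, 3], 10)` maps `add` over the array; the choice between the two was made when the call was compiled.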
…ized args
Instead of storing a boolean array where each index indicates whether an argument should be vectorized, store only the indices of arguments that need vectorization. This avoids iterating over all arguments and directly targets only the relevant ones.
Task: 6254300
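The difference between the two representations can be sketched as follows. The helper names `toVectorizedIndices` and `argsForElement` are hypothetical, and arrays are simplified to one dimension.

```typescript
type Arg = number | number[];

// Compile time: convert the per-argument flags into a list of positions, once.
function toVectorizedIndices(flags: boolean[]): number[] {
  const indices: number[] = [];
  for (let i = 0; i < flags.length; i++) {
    if (flags[i]) {
      indices.push(i);
    }
  }
  return indices;
}

// Run time: for a given element, only the listed positions are replaced;
// the other arguments are passed through without being inspected.
function argsForElement(args: Arg[], vectorizedIndices: number[], element: number): Arg[] {
  const result = args.slice();
  for (const index of vectorizedIndices) {
    result[index] = (args[index] as number[])[element];
  }
  return result;
}

const indices = toVectorizedIndices([false, true, false, true]);
const secondElementArgs = argsForElement([10, [1, 2], 20, [3, 4]], indices, 1);
```

With flags, every evaluation loops over all four arguments; with indices, the loop visits only positions 1 and 3.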
