Use a per-benchmark baseline instead of the last fully successful run #8332
Signed-off-by: Adam Gutglick <adam@spiraldb.com>
Merging this PR will improve performance by 26.95%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[128] | 216.9 ns | 275.3 ns | -21.19% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[1024] | 278.6 ns | 336.9 ns | -17.31% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[2048] | 342.2 ns | 400.6 ns | -14.56% |
| ❌ | WallTime | cuda/bitpacked_u8/unpack/3bw[100M] | 299.2 µs | 348 µs | -14.02% |
| ⚡ | Simulation | compare[63] | 360.5 µs | 244.8 µs | +47.23% |
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] | 46.4 µs | 31.7 µs | +46.03% |
| ⚡ | Simulation | compare[56] | 327.6 µs | 224.9 µs | +45.63% |
| ⚡ | Simulation | compare[62] | 363.9 µs | 250 µs | +45.54% |
| ⚡ | Simulation | compare[63] | 371.2 µs | 255.5 µs | +45.27% |
| ⚡ | Simulation | compare[60] | 353.8 µs | 243.7 µs | +45.18% |
| ⚡ | Simulation | compare[56] | 332.4 µs | 229.7 µs | +44.69% |
| ⚡ | Simulation | compare[62] | 368.7 µs | 254.9 µs | +44.68% |
| ⚡ | Simulation | compare[61] | 362.7 µs | 250.8 µs | +44.64% |
| ⚡ | Simulation | compare[60] | 358.6 µs | 248.5 µs | +44.32% |
| ⚡ | Simulation | compare[58] | 347.3 µs | 240.9 µs | +44.16% |
| ⚡ | Simulation | compare[59] | 354.5 µs | 246.3 µs | +43.93% |
| ⚡ | Simulation | compare[61] | 367.5 µs | 255.6 µs | +43.82% |
| ⚡ | Simulation | compare[57] | 345.9 µs | 241.4 µs | +43.28% |
| ⚡ | Simulation | compare[58] | 352.2 µs | 245.8 µs | +43.27% |
| ⚡ | Simulation | compare[59] | 359.3 µs | 251.1 µs | +43.1% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Comparing adamg/benchmarks-baseline (6a4663d) with develop (47d2041)
Benchmarks: PolarSignals Profiling
Vortex (geomean): 0.969x ➖
datafusion / vortex-file-compressed (0.969x ➖, 0↑ 0↓)
No file size changes detected.
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.786x ✅, 22↑ 0↓)
datafusion / vortex-compact (0.821x ✅, 21↑ 0↓)
datafusion / parquet (0.898x ✅, 12↑ 2↓)
datafusion / arrow (0.755x ✅, 19↑ 0↓)
duckdb / vortex-file-compressed (0.809x ✅, 20↑ 0↓)
duckdb / vortex-compact (0.845x ✅, 22↑ 0↓)
duckdb / parquet (0.916x ➖, 10↑ 2↓)
duckdb / duckdb (0.873x ✅, 14↑ 0↓)
File Size Changes (10 files changed, +0.0% overall, 7↑ 3↓)
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.107x ❌, 0↑ 3↓)
datafusion / vortex-compact (1.045x ➖, 0↑ 1↓)
datafusion / parquet (1.052x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.077x ➖, 0↑ 3↓)
duckdb / vortex-compact (1.045x ➖, 0↑ 0↓)
duckdb / parquet (1.046x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.1% overall, 0↑ 1↓)
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.017x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.010x ➖, 2↑ 1↓)
datafusion / parquet (1.014x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (1.013x ➖, 0↑ 4↓)
duckdb / vortex-compact (1.017x ➖, 1↑ 2↓)
duckdb / parquet (1.009x ➖, 0↑ 1↓)
duckdb / duckdb (1.014x ➖, 0↑ 0↓)
File Size Changes (6 files changed, -0.0% overall, 2↑ 4↓)
Benchmarks: FineWeb S3
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.002x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.560x ❌, 0↑ 4↓)
datafusion / parquet (0.982x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.952x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.120x ➖, 0↑ 0↓)
duckdb / parquet (0.998x ➖, 0↑ 0↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.994x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.989x ➖, 0↑ 0↓)
duckdb / parquet (0.987x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.350x ❌, 0↑ 22↓)
datafusion / vortex-compact (1.109x ❌, 0↑ 10↓)
datafusion / parquet (1.209x ❌, 0↑ 16↓)
datafusion / arrow (1.008x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.022x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.001x ➖, 0↑ 0↓)
duckdb / parquet (1.024x ➖, 0↑ 0↓)
duckdb / duckdb (1.001x ➖, 0↑ 0↓)
File Size Changes (26 files changed, +0.0% overall, 14↑ 12↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.026x ➖, 0↑ 1↓)
datafusion / parquet (1.039x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.012x ➖, 3↑ 1↓)
duckdb / parquet (1.016x ➖, 0↑ 2↓)
duckdb / duckdb (1.007x ➖, 0↑ 0↓)
File Size Changes (104 files changed, +0.0% overall, 55↑ 49↓)
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.330x ❌, 0↑ 13↓)
datafusion / vortex-compact (1.204x ➖, 0↑ 13↓)
datafusion / parquet (0.981x ➖, 2↑ 2↓)
duckdb / vortex-file-compressed (1.034x ➖, 0↑ 4↓)
duckdb / vortex-compact (0.967x ➖, 0↑ 0↓)
duckdb / parquet (1.043x ➖, 0↑ 1↓)
Benchmarks: Random Access
Vortex (geomean): 0.909x ➖
unknown / unknown (0.990x ➖, 7↑ 1↓)
Benchmarks: Appian on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.890x ✅, 5↑ 0↓)
datafusion / parquet (0.925x ➖, 4↑ 0↓)
duckdb / vortex-file-compressed (0.906x ➖, 5↑ 0↓)
duckdb / parquet (0.912x ➖, 3↑ 0↓)
duckdb / duckdb (0.912x ➖, 3↑ 0↓)
File Size Changes (4 files changed, -0.0% overall, 1↑ 3↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.049x ➖, 0↑ 1↓)
datafusion / vortex-compact (0.936x ➖, 0↑ 0↓)
datafusion / parquet (1.137x ➖, 0↑ 6↓)
duckdb / vortex-file-compressed (0.983x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.966x ➖, 0↑ 0↓)
duckdb / parquet (0.987x ➖, 0↑ 0↓)
Benchmarks: Compression
Vortex (geomean): 0.997x ➖
unknown / unknown (0.979x ➖, 4↑ 2↓)
Summary
Currently, when a post-merge benchmark fails for any reason, the baseline we use is the last fully successful run, so every benchmark is compared against that older run even if it succeeded more recently. This PR changes the baseline to the last successful run of that specific benchmark.
Tested by running it while one post-merge benchmark was in a failed state and visually verifying the results.
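For readers skimming the description, here is a minimal sketch of the difference between the two baseline strategies. It is not the code in this PR: the `Run` and `BenchmarkResult` types, the function names, and the commit strings are all hypothetical, and it assumes the run history is available as a list ordered oldest to newest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BenchmarkResult:
    """Outcome of one benchmark within one post-merge run (hypothetical model)."""
    name: str
    succeeded: bool


@dataclass(frozen=True)
class Run:
    """One post-merge run on the base branch, with one result per benchmark."""
    commit: str
    results: tuple[BenchmarkResult, ...]


def last_fully_successful(runs: list[Run]) -> Optional[str]:
    """Old strategy: newest run in which every benchmark succeeded."""
    for run in reversed(runs):  # runs are ordered oldest to newest
        if run.results and all(r.succeeded for r in run.results):
            return run.commit
    return None


def per_benchmark_baseline(runs: list[Run], benchmark: str) -> Optional[str]:
    """New strategy: newest run in which this specific benchmark succeeded."""
    for run in reversed(runs):
        if any(r.name == benchmark and r.succeeded for r in run.results):
            return run.commit
    return None


# Hypothetical history: the newest run failed only for "clickbench".
runs = [
    Run("aaa111", (BenchmarkResult("tpch", True), BenchmarkResult("clickbench", True))),
    Run("bbb222", (BenchmarkResult("tpch", True), BenchmarkResult("clickbench", False))),
]
```

With this history, the old strategy falls back to `aaa111` for every benchmark, while the per-benchmark strategy keeps `tpch` on the newer `bbb222` and only `clickbench` falls back to `aaa111`.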