feat(benchmarks): improve measurement accuracy and add missing benchmarks #6

tbhb · 2025-12-28T21:12:44Z

Summary

- Restructure compact benchmarks to pre-populate history/tombstones in setup, so that only compact() itself is measured
- Fix delete benchmark to use unique keys per iteration via counter pattern (no restore cycle)
- Add benchmarks for find_one, items, and reload methods
- Add EDGE_PARAMS (scale=0, scale=1) for boundary testing on appropriate tests
- Add generator helpers: create_extended_test_table, create_table_with_history, create_table_with_tombstones
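
The two measurement fixes above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `ToyTable`, `toy_table_with_history`, `time_compact`, and `make_unique_key_deleter` are stand-ins written for this example, not the project's table class or its generator helpers. The point is only the shape of the pattern: build history outside the timed region, and draw delete keys from a counter so no iteration has to restore what it removed.

```python
import itertools
import time


class ToyTable:
    """Hypothetical stand-in for the benchmarked table (not the project's API)."""

    def __init__(self):
        self.live = {}  # key -> current value
        self.log = []   # append-only history of (key, value or None)

    def put(self, key, value):
        self.live[key] = value
        self.log.append((key, value))

    def delete(self, key):
        del self.live[key]
        self.log.append((key, None))  # None marks a tombstone

    def compact(self):
        # Keep exactly one log entry per live key; drop history and tombstones.
        self.log = list(self.live.items())


def toy_table_with_history(scale, updates=3):
    """Setup helper (assumed shape): write every key several times."""
    table = ToyTable()
    for key in range(scale):
        for version in range(updates):
            table.put(key, version)
    return table


def time_compact(scale):
    table = toy_table_with_history(scale)  # setup, deliberately not timed
    start = time.perf_counter()
    table.compact()                        # the only call inside the timed region
    return time.perf_counter() - start, table


def make_unique_key_deleter(table):
    """Counter pattern: each call deletes a fresh key, so nothing is restored."""
    counter = itertools.count()

    def delete_next():
        table.delete(next(counter))

    return delete_next


elapsed, compacted = time_compact(scale=100)

populated = toy_table_with_history(scale=10, updates=1)
delete_next = make_unique_key_deleter(populated)
for _ in range(5):
    delete_next()
```

With the counter pattern the table must be pre-populated with at least as many keys as there are benchmark iterations; calling `time_compact(0)` or `time_compact(1)` exercises the same boundary cases that EDGE_PARAMS targets.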


codecov · 2025-12-28T21:14:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


codspeed-hq · 2025-12-28T21:31:58Z

CodSpeed Performance Report

Merging #6 will create unknown performance changes

Comparing benchmarks (4fb6a33) with main (0326a5a)¹

Summary

🆕 173 new
⏩ 1 skipped²

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Efficiency
🆕	`test_put_update_record[small-int-100]`	N/A	3.7 ms	N/A
🆕	`test_batch_put_10[small-str-100]`	N/A	35.9 ms	N/A
🆕	`test_compact_with_tombstones[small-tuple-1k]`	N/A	94.2 ms	N/A
🆕	`test_batch_put_10[small-int-1k]`	N/A	321.1 ms	N/A
🆕	`test_batch_put_100[small-str-100]`	N/A	494.3 ms	N/A
🆕	`test_put_new_record[small-int-100]`	N/A	3.7 ms	N/A
🆕	`test_batch_put_10[small-str-1k]`	N/A	321.6 ms	N/A
🆕	`test_batch_put_100[small-int-100]`	N/A	492.1 ms	N/A
🆕	`test_compact_with_tombstones[small-str-1k]`	N/A	60.2 ms	N/A
🆕	`test_batch_put_10[small-tuple-1k]`	N/A	376.4 ms	N/A
🆕	`test_put_update_record[small-int-1k]`	N/A	33.6 ms	N/A
🆕	`test_put_new_record[small-int-1k]`	N/A	33.6 ms	N/A
🆕	`test_put_new_record[small-tuple-1k]`	N/A	39.4 ms	N/A
🆕	`test_put_new_record[small-tuple-100]`	N/A	4.3 ms	N/A
🆕	`test_batch_put_10[small-int-100]`	N/A	35.6 ms	N/A
🆕	`test_put_new_record[small-str-100]`	N/A	4.2 ms	N/A
🆕	`test_compact_with_history[small-int-100]`	N/A	6.9 ms	N/A
🆕	`test_put_update_record[small-str-1k]`	N/A	33.6 ms	N/A
🆕	`test_compact_with_history[small-str-100]`	N/A	7.1 ms	N/A
🆕	`test_put_update_record[small-tuple-1k]`	N/A	39.4 ms	N/A
...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

¹ No successful run was found on main (caa1c0d) during the generation of this report, so 0326a5a was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
² 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, archive it to remove it from the performance reports.

Split benchmarks into 11 shards (from 4) to ensure each completes within the 6-minute timeout. Max shard size is now 43 tests (find) which should complete in ~5.7 minutes at ~8s per test.

Shard distribution:
- load-reload: 38 tests
- get: 25 tests
- all: 25 tests
- find: 43 tests
- find-one: 18 tests
- write-compact: 32 tests
- keys-delete: 31 tests
- items: 25 tests
- count: 25 tests
- has: 31 tests
- memory: 18 tests
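
The ~5.7 minute figure follows directly from the shard sizes and the ~8s per-test estimate. A small sketch of that arithmetic, using only the numbers quoted in this commit message (the variable and function names are made up for the example):

```python
TIMEOUT_MINUTES = 6
SECONDS_PER_TEST = 8  # rough per-test figure quoted in the commit message

# Shard sizes as listed in the commit message.
shards = {
    "load-reload": 38,
    "get": 25,
    "all": 25,
    "find": 43,
    "find-one": 18,
    "write-compact": 32,
    "keys-delete": 31,
    "items": 25,
    "count": 25,
    "has": 31,
    "memory": 18,
}


def estimated_minutes(test_count, seconds_per_test=SECONDS_PER_TEST):
    """Estimated wall time for a shard, assuming a uniform per-test cost."""
    return test_count * seconds_per_test / 60


largest = max(shards, key=shards.get)
worst_case = estimated_minutes(shards[largest])
print(f"{largest}: {shards[largest]} tests, ~{worst_case:.1f} min "
      f"(timeout {TIMEOUT_MINUTES} min)")
```

The estimate assumes every test costs about the same, which leaves little headroom on the largest shard if some tests run slower than the average.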

Split benchmarks into 15 shards to ensure each completes within 5 minutes. Separated slow vs non-slow tests for expensive operations (all, items, keys) using the slow marker.

Shard distribution (estimated times):
- get: 25 tests (~3m)
- count: 25 tests (~3m)
- has: 31 tests (~3m)
- find-one-delete: 24 tests (~2m)
- write-compact: 32 tests (~5m)
- load: 19 tests (~4m)
- reload: 19 tests (~4m)
- keys-ci: 12 tests (~2.5m)
- keys-slow: 13 tests (~2.5m)
- all-ci: 12 tests (~3m)
- all-slow: 13 tests (~3.5m)
- items-ci: 12 tests (~3m)
- items-slow: 13 tests (~3.5m)
- find-high: 19 tests (~3m)
- find-other: 24 tests (~4m)

Memory tests excluded for now due to memray profiling overhead.

Exclude slow tests from PR benchmarks to ensure completion under 5 minutes. Slow tests take ~24s each vs ~7s for CI tests.

Shard distribution (7 shards, ~176 tests total):
- load-reload: 12 tests (~1.5m)
- get: 12 tests (~1.5m)
- find: 30 tests (~3.5m)
- find-one-delete: 24 tests (~2m)
- write-compact: 32 tests (~5m)
- all-keys-items: 36 tests (~4m)
- count-has: 30 tests (~3.5m)
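
As a quick consistency check on this distribution, the sketch below totals the listed shard sizes and derives the per-test time each estimate implies. It uses only the figures from this commit message; the dictionary layout and names are illustrative.

```python
# (test count, estimated minutes) per shard, as listed in the commit message.
shards = {
    "load-reload": (12, 1.5),
    "get": (12, 1.5),
    "find": (30, 3.5),
    "find-one-delete": (24, 2.0),
    "write-compact": (32, 5.0),
    "all-keys-items": (36, 4.0),
    "count-has": (30, 3.5),
}

total_tests = sum(count for count, _ in shards.values())

# Seconds per test implied by each shard's estimate.
implied_seconds = {
    name: minutes * 60 / count for name, (count, minutes) in shards.items()
}

for name, seconds in implied_seconds.items():
    print(f"{name}: ~{seconds:.1f}s per test")
print(f"total: {total_tests} tests")
```

Most shards land near the ~7s per-test figure quoted for CI tests; write-compact is the outlier at roughly 9.4s per test, which is why it sits right at the 5-minute target.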

tbhb added 2 commits December 28, 2025 16:10

chore: release v0.1.0a3

3d6b9b1

tbhb added 2 commits December 28, 2025 16:24

fix: register limit_memory pytest marker for memray

30ba69e

perf(ci): shard benchmarks into 4 parallel jobs for faster CI

68580f7

tbhb added 3 commits December 28, 2025 17:05

tbhb merged commit 75ed41c into main Dec 31, 2025
30 checks passed

tbhb deleted the benchmarks branch December 31, 2025 04:27
