Implement plan 05: test coverage gaps #960

Merged
blooop merged 24 commits into main from test/coverage-plan-05
Jun 12, 2026

@blooop blooop commented Jun 12, 2026

Owner

Summary

Implements plans/05-test-coverage.md (from #959): direct unit tests for the previously indirectly-tested results/ and core modules, de-flaked file server tests, and resolved long-skipped tests. One module per commit, as the plan prescribes.

Coverage: 90% → 91% (missed lines 1050 → 1001 of 10776). Tests: 1474 passed / 5 skipped → 1651 passed / 2 skipped (+177 tests across 16 new test files).
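
The headline percentages can be cross-checked from the missed-line counts quoted above. This is only an arithmetic sketch; the helper name `coverage_percent` is made up for illustration and is not part of the repository.

```python
TOTAL_LINES = 10776  # total measured lines, from the summary above


def coverage_percent(missed, total=TOTAL_LINES):
    """Line coverage as a percentage, given the number of missed lines."""
    return 100.0 * (total - missed) / total


before = coverage_percent(1050)  # roughly 90.26, reported as 90%
after = coverage_percent(1001)   # roughly 90.71, reported as 91%
newly_covered = 1050 - 1001      # 49 lines newly covered
new_tests = 1651 - 1474          # 177 additional passing tests
```

So the one-point gain in the rounded figure corresponds to 49 newly covered lines; most of the 177 new tests assert behavior on lines that were already executed by the integration suite.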

Task 1 — unit tests for untested result modules

| Module | Coverage before → after |
| --- | --- |
| results/bench_result.py | 75% → 83% |
| results/histogram_result.py | 83% → 83% (behavioral assertions added) |
| results/optimize_result.py | 97% → 100% |
| results/dataset_result.py | 64% → 86% |
| holoview_results/scatter_result.py | 76% → 100% |
| holoview_results/bar_result.py | 96% → 96% |
| holoview_results/curve_result.py | 100% (direct tests added) |
| holoview_results/band_result.py | 71% → 95% |
| holoview_results/table_result.py | 100% (direct tests added) |
| distribution_result/ box/violin/jitter | 100% (direct tests added) |
| results/volume_result.py | 100% (direct tests added) |
| composable_container_dataframe.py | 100% (data-intactness tests added) |

Several of these were already at high line coverage purely through generated-example integration tests; the new tests assert behavior directly — element types, kdims/vdims, title/ylabel/units propagation, values matching the worker function, filter rejection paths, and NaN robustness (NaN is the missing-value default since v1.105).

Task 2 — core modules

  • test_bench_cfg.py (42 tests): defaults, flag round-trips, deprecated level= mapping, persistent hash behavior, describe/sentence helpers, optuna target partition. bench_cfg.py 97% → 99%.
  • test_worker_job.py (16 tests): input assembly, canonical form, hash stability/sensitivity.
  • video_writer.py: no new file — the plan's target (write path with synthetic frames, file exists and non-empty) is already covered by test_video_writer_extended.py.
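
To illustrate the hash properties these tests pin (stable across dim ordering, sensitive to values, dim names, and tag), here is a minimal sketch of an order-independent hash. It is not the WorkerJob implementation: `canonical_hash` and its parameters are hypothetical names. It uses `hashlib` rather than the built-in `hash()`, because Python randomizes string hashes per process by default, which would make the result unusable as a persistent cache key.

```python
import hashlib


def canonical_hash(inputs, tag=""):
    """Hash a dict of sweep inputs so that key order does not matter.

    Hypothetical helper for illustration only.
    """
    # Sorting by dimension name gives a canonical ordering; the tag is
    # included so that two otherwise identical jobs can be kept apart.
    canonical = repr((tag, sorted(inputs.items())))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


a = canonical_hash({"x": 1.0, "y": 2})
b = canonical_hash({"y": 2, "x": 1.0})  # same inputs, different dim order
```

With this construction `a == b`, while changing a value, a dim name, or the tag yields a different digest, which is the shape of assertion that test_worker_job.py is described as making.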

Task 3 — de-flake test_file_server.py

Replaced each time.sleep(0.3) with a wait_for_port() TCP poll (5s timeout, 0.1s step, TimeoutError on failure). Verified stable across 10 consecutive runs (all passed).
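
For reference, a minimal sketch of what a helper like wait_for_port could look like. Only the 5s timeout, the 0.1s step, and the TimeoutError on failure come from the description above; the signature, the default host, and the error message are assumptions, and the actual helper in test_file_server.py may differ.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=5.0, step=0.1):
    """Block until a TCP connect to (host, port) succeeds, else raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the server is accepting connections.
            with socket.create_connection((host, port), timeout=step):
                return
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"{host}:{port} not accepting connections after {timeout}s"
                ) from None
            time.sleep(step)
```

Polling returns as soon as the server accepts a connection, so a fast start waits less than a fixed 0.3s sleep, while a slow start is tolerated up to the timeout instead of failing the test.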

Task 4 — long-skipped tests (with deviations; the plan's named tests don't exist)

  • test_combinations.py: not an empty placeholder — a 184-line hypothesis class fully @pytest.mark.skip since 2023. Deleted per the plan's intent.
  • The skipped test in test_sweep_vars.py is test_int_sweep_samples_all (not test_missing_default_value): an exact line-for-line duplicate of the passing test above it. Deleted.
  • test_bencher.py::test_unique_file_names (not test_plot_all_permutations): bare skip replaced with a self-describing reason pointing at the plan.
  • test_usability.py: deprecated bn.ResultVar → bn.ResultFloat (verified identical behavior).

Bugs found while writing tests (not fixed here — test-only PR)

  1. ScatterResult.to_scatter filters result_types=(ResultVar,), but ResultVar is the deprecated shim — modern ResultFloat sweeps silently return None (the shipped scatter example appends None to its report). Pinned by test_to_scatter_result_float_returns_none; the filter likely wants ResultFloat.
  2. DistributionResult._plot_distribution forwards title inside **kwargs and as an explicit kwarg into .opts() — any caller passing title= would hit "multiple values for keyword argument".
  3. Histogram output never shows result-var units (only the var name); tests assert the implemented behavior.
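
Bug 2 is a general Python pitfall that can be reproduced without any plotting library. The sketch below uses stand-in functions (`opts`, `plot_buggy`, `plot_fixed` are hypothetical names, not the bencher or HoloViews API) to show the failure mode and one possible remedy; it is not the fix applied to DistributionResult.

```python
def opts(title=None, **options):
    """Stand-in for a plotting .opts() call; returns what it received."""
    return {"title": title, **options}


def plot_buggy(default_title, **kwargs):
    # title is passed explicitly AND may also be present in kwargs,
    # so a caller-supplied title= reaches opts() twice.
    return opts(title=default_title, **kwargs)


def plot_fixed(default_title, **kwargs):
    # Pop the caller's title first so it is forwarded exactly once.
    title = kwargs.pop("title", default_title)
    return opts(title=title, **kwargs)
```

Calling `plot_buggy("default", title="mine")` raises a TypeError about multiple values for the keyword argument, while `plot_fixed` lets the caller's title override the default. Both behave the same when no title is passed, which is why the problem stays hidden until a caller supplies one.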

Notes

  • fail_under (plan 01) is not in place yet, so there is nothing to raise; when plan 01 lands it can start at 91%.
  • Full pixi run ci passes locally.

🤖 Generated with Claude Code

Summary by Sourcery

Increase unit test coverage for core result and configuration modules and stabilize flaky file server tests.

Bug Fixes:

  • Deflake file server integration tests by replacing fixed sleeps with port polling and clarify a previously bare skip reason.
  • Remove redundant and long-skipped sweep/combinations tests and update usability tests to use the non-deprecated ResultFloat API.

Enhancements:

  • Add focused unit tests for BenchRunCfg/BenchCfg behavior including defaults, hashing, and descriptive helpers.
  • Introduce direct behavioral tests for BenchResult and various Holoviews-based result types (curve, bar, band, histogram, scatter, box/violin/jitter, volume, table, dataset).
  • Add tests for WorkerJob input assembly and hashing and for DataFrame-based composable container datasets, including NaN robustness and data integrity.

blooop added 19 commits June 12, 2026 07:18
…leeps

Replace each time.sleep(0.3) with wait_for_port(), which polls a TCP
connect to the server port (5s timeout, 0.1s step) and raises
TimeoutError on failure. Verified stable across 10 consecutive runs.

Plan 05, task 3.

- Delete test/test_combinations.py: the whole class has been
  @pytest.mark.skip since 2023; its hypothesis sweep duplicates what
  test_bencher.py and the generated-example suite now cover.
- Delete test_sweep_vars.py::test_int_sweep_samples_all: it was an
  exact line-for-line duplicate of the passing test_int_sweep_samples
  directly above it, skipped with no reason.
- test_bencher.py::test_unique_file_names: replace the bare skip with a
  self-describing reason pointing at plans/05-test-coverage.md task 4.
- test_usability.py: use bn.ResultFloat instead of the deprecated
  bn.ResultVar shim (identical behavior, verified before/after).

Plan 05, task 4.

Cover to(result_type) conversion (values plotted match the worker),
to_auto plot_list/remove_plots/failing-callback handling, to_auto_plots
summary placement, plot() callback dispatch, default_plot_callbacks,
from_existing state copies, and NaN-point robustness.

Plan 05, task 1.

Cover viewer construction from a small ResultDataSet sweep, dataset_list
round-tripping worker DataFrames unchanged, xarray index-to-frame
mapping, and ds_to_container unwrapping the stored frame.

Plan 05, task 1.

Cover 3-float volume construction (Plotly pane, single Volume trace,
axis titles with units, trace values verified against the worker
formula, isomin/isomax), the unsupported-shape rejection path, the
over_time short-circuit, and NaN robustness (finite-only isomin/isomax).

Plan 05, task 1.

…_dataframe.py)

Assert data intactness through composition: append order/identity,
single-DataFrame passthrough, right/down/sequence concat values and
coords, overlay elementwise mean with skipna semantics, and
label_formatter output. Complements test_composable_container_dataset.py
which covers dims/sizes only.

Plan 05, task 1.

Cover defaults, flag round-trips (repeats/over_time/cache flags),
deprecated level= mapping, with_defaults semantics, persistent hash
behavior (tag/bench_name/const-vars/include_repeats), describe_benchmark
and sweep_sentence content, panel helpers, optimized/unoptimized input
partition, optuna targets, and DimsCfg construction.

Plan 05, task 2.

Cover construction defaults, function_input/canonical_input assembly
(constants merged but excluded from canonical form), fn_inputs_sorted,
and hash behavior: stable across dim ordering, sensitive to values/dim
names/tag/constants, independent of bench_cfg_sample_hash.

video_writer.py needs no new file: the core write path is already
covered by test_video_writer_extended.py.

Plan 05, task 2.

Cover the repeats>1 Curve+Spread overlay (labels, kdims/vdims, std
band), to_plot delegation, categorical groupby via to_curve_ds, the
repeats=1 filter rejection, and NaN robustness.

Plan 05, task 1.

Cover static percentile-band composition (two Areas, median Curve,
samples Scatter, percentile vdim names), default/explicit titles and
unit-bearing ylabel, enable_scatter=False, categorical flattening into
the sample pool, the over_time band path, regression-report
suppression of to_band_ds, and NaN handling (nanpercentile + scatter
mask drops NaN points).

Plan 05, task 1.

Cover hv.Table construction (kdims/vdims), row count = samples x
repeats, values matching the worker, repeat-dim squeeze at repeats=1,
and NaN row preservation.

Plan 05, task 1.

Cover _to_scatter_ds element type and title, the public to_scatter path,
2-cat NdOverlay grouping, float-input rejection, and NaN robustness.

Also pins a discovered quirk: to_scatter filters result_types on the
deprecated ResultVar, so modern ResultFloat sweeps silently return None
(test_to_scatter_result_float_returns_none documents current behavior;
the filter likely wants ResultFloat).

Plan 05, task 1.

Cover to_bar Row/HoloViews/Bars structure, to_plot delegation, dims and
unit-bearing ylabel via to_bar_ds, the ResultBool repeats=2 REDUCE
scenario, 2-cat grouped kdims, float-input rejection, and NaN
robustness.

Plan 05, task 1.

…r_result.py)

Cover overlay/element types, kdims/vdims and unit-bearing ylabel/title,
raw repeats present per category (no aggregation), 2-cat kdims,
repeats=1 filter rejection, and NaN robustness.

Plan 05, task 1.

Cover overlay/element types, label/units propagation, raw repeats per
category, filter rejection, and NaN robustness through the shared
DistributionResult base.

Plan 05, task 1.

…_jitter_result.py)

Cover jitter scatter element type, label propagation, per-category raw
sample integrity with repeats=3, its stricter cat_range filter (2-cat
rejection), and NaN robustness.

Plan 05, task 1.

Cover element structure (kdim/vdim names), bin frequencies summing to
the sample count, bins= kwarg forwarding, title/ylabel/xrotation opts,
the native 0-input repeats filter and float-input rejection, and NaN
samples being dropped from bin counts without crashing.

Note: result-var units never appear in histogram output (only the var
name); tests assert the implemented behavior.

Plan 05, task 1.

Cover exact best_value/best_params and summary() lines for
single-objective studies, Pareto-front membership and the
single-objective-only RuntimeError for multi-objective, sweep-driven
structure (trial counts, target_names, directions), and NaN worker
output marking the trial FAIL while the study continues.

Plan 05, task 1.

@sourcery-ai

sourcery-ai Bot commented Jun 12, 2026

Contributor

Reviewer's Guide

Adds extensive new unit tests for core result and configuration modules, replaces flaky file server sleeps with a port polling helper, removes or clarifies long-skipped tests, and tightens behavioral guarantees (including NaN handling and plotting/filtering behavior) without changing production code.

File-Level Changes

Change: Stabilized file server tests by replacing fixed sleeps with a TCP port wait helper.
Details:
  • Added a wait_for_port helper that polls a TCP connection until the server accepts connections or times out.
  • Updated file server tests to use wait_for_port instead of time.sleep, improving reliability without extending total runtime.
Files: test/test_file_server.py

Change: Removed or clarified long-skipped tests to align with the coverage plan and reduce dead code.
Details:
  • Deleted the fully skipped, long-unused hypothesis test module for combinations.
  • Removed a duplicated skipped hypothesis test in sweep var tests that duplicated behavior of an existing passing test.
  • Replaced a bare skip in bencher tests with a descriptive reason referencing the coverage plan.
  • Updated usability tests to use ResultFloat instead of deprecated ResultVar for output, preserving behavior while modernizing APIs.
Files: test/test_combinations.py, test/test_sweep_vars.py, test/test_bencher.py, test/test_usability.py

Change: Introduced comprehensive unit tests for bench configuration objects and worker job hashing to pin defaults and hash semantics.
Details:
  • Added tests for BenchPlotSrvCfg and BenchRunCfg defaults, deprecated level mapping, with_defaults behavior, and persistent hash behavior in BenchCfg, including optuna target partitioning and DimsCfg extraction.
  • Added tests for WorkerJob construction, canonical input/constant merging, and hash stability/sensitivity to inputs, dim names, constants, tags, and ordering.
Files: test/test_bench_cfg.py, test/test_worker_job.py

Change: Added behavioral tests for BenchResult conversion paths and holoviews-based result types, focusing on kdims/vdims, labels, title/ylabel, and NaN robustness.
Details:
  • Added tests for BenchResult.to, to_auto, plotting callbacks, default plot callbacks, clone semantics, and NaN handling across plotting paths.
  • Introduced direct tests for BandResult, CurveResult, BarResult, ScatterResult, ScatterJitterResult, BoxWhiskerResult, ViolinResult, HistogramResult, TableResult, VolumeResult, and DistributionResult-derived classes, checking filter behavior, element composition, grouping, and handling of NaNs.
  • Documented and pinned the current ScatterResult filter bug where ResultFloat sweeps return None due to result_types=(ResultVar,) and ensured tests reflect existing behavior without changing production code.
Files: test/test_bench_result.py, test/test_band_result.py, test/test_curve_result.py, test/test_bar_result.py, test/test_scatter_result.py, test/test_scatter_jitter_result.py, test/test_box_whisker_result.py, test/test_violin_result.py, test/test_histogram_result.py, test/test_volume_result.py, test/test_table_result.py

Change: Strengthened dataset and composable container behavior tests to ensure data integrity and dataset result handling.
Details:
  • Added tests for DataSetResult to validate that worker-produced pandas DataFrames are stored, indexed, and recovered intact from xarray datasets and viewer paths.
  • Added tests for ComposableContainerDataset created from pandas DataFrames to ensure composition methods (right, down, sequence, overlay) preserve values, coordinates, NaN semantics, and var name/value labelling.
Files: test/test_dataset_result.py, test/test_composable_container_dataframe.py

@sourcery-ai sourcery-ai Bot left a comment

Contributor

Hey - I've found 1 issue, and left some high level feedback:

  • Several test helpers (e.g. unwrap_hv, _inner_element, _run_sweep, run_cfg_with) are reimplemented with very similar logic across multiple new test modules; consider moving these into a shared test utility module to avoid duplication and keep behavior consistent.
  • The new wait_for_port helper in test_file_server.py is specific but generic enough to be reused; if other tests also spin up HTTP/TCP servers it may be worth centralizing this function in a common test helper to avoid diverging polling logic and timeouts.
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="test/test_band_result.py" line_range="123-132" />
<code_context>
+class TestBandResult:
</code_context>
<issue_to_address>
**suggestion (testing):** Add a negative test to pin the BandResult filter behavior when repeats=1 or otherwise not matching the filter criteria.

Current BandResult tests only cover valid shapes and special cases; they don't verify what happens when the filter rejects input (e.g., sweeps that don't meet `float_range`/`repeats_range` like `repeats>=2`). For other result types, we assert that `to_*` with `override=False` returns a diagnostic/None instead of a plot when the filter fails. Please add a similar negative test here that:

- builds a Band-style sweep that should be rejected (e.g., `repeats=1`),
- calls `to_band(override=False)`, and
- asserts that the result is not an `hv.Overlay`/HoloViews pane (and optionally checks the diagnostic text when `print_debug` is enabled).

This will lock in the rejection behavior for unsupported shapes and prevent regressions where BandResult might silently emit misleading plots.

Suggested implementation:

```python
class TestBandResult:
    def test_to_band_overlay_composition(self, res_1d):
        """to_band yields two percentile Areas, a median Curve and a samples Scatter."""
        plot = res_1d.to_band()
        assert plot is not None
        overlay = unwrap_hv(plot)
        assert isinstance(overlay, hv.Overlay)
        # exact types: hv.Area is a subclass of hv.Curve, so isinstance would double count
        assert len([el for el in overlay if type(el) is hv.Area]) == 2
        assert len([el for el in overlay if type(el) is hv.Curve]) == 1
        assert len([el for el in overlay if type(el) is hv.Scatter]) == 1

    def test_to_band_rejected_when_filter_fails(self, res_1d_repeats_1):
        """to_band(override=False) should not emit a Band plot if the filter rejects the sweep.

        In particular, a Band-style sweep with repeats=1 (or otherwise outside the
        allowed repeats/float ranges) should be rejected and yield a diagnostic
        instead of an hv.Overlay/HoloViews pane.
        """
        # When the filter fails, to_band(override=False) should *not* return a plot.
        # With print_debug enabled, we also expect a human-readable diagnostic.
        result = res_1d_repeats_1.to_band(override=False, print_debug=True)

        # Unwrap any Panel/HoloViews wrappers, then assert that we did *not* get an Overlay.
        unwrapped = unwrap_hv(result)
        assert not isinstance(unwrapped, hv.Overlay)

        # Optional: pin a bit of the diagnostic so regressions in the filter behavior are caught.
        # Adjust the substring below to match the actual diagnostic message emitted by BandResult.
        assert "repeats" in str(result) or "filter" in str(result)

```

To make this test pass, you also need to:

1. Add a `res_1d_repeats_1` fixture in this file (or a shared conftest) that:
   - Builds a Band-style sweep identical (or very similar) to `res_1d`, but with `repeats=1` or otherwise configured so that the BandResult filter rejects it (e.g., by violating `float_range`/`repeats_range` such as `repeats>=2`).
   - Returns the corresponding `BandResult` instance.

   For example (pseudocode, adapt to your actual API):

   ```python
   @pytest.fixture
   def res_1d_repeats_1(bench, run_cfg):
       run_cfg_single = dataclasses.replace(run_cfg, repeats=1)  # or whatever your API is
       res = bench.plot_sweep(
           "band_time",
           input_vars=["size"],
           result_vars=["throughput"],
           run_cfg=run_cfg_single,
           time_src="2026-06-10 snap0000",
       )
       return res
   ```

2. Adjust the diagnostic assertion in the test if the actual `BandResult.to_band` rejection message differs:
   - Change the `"repeats"` / `"filter"` substring checks to match the concrete text your implementation emits when the filter fails.
3. If `to_band(override=False, print_debug=True)` returns a structured object (e.g. `(diagnostic, plot)` or a Panel pane) instead of a single value, update the test accordingly:
   - Unpack the return value, and apply the `unwrap_hv` / `isinstance(..., hv.Overlay)` checks to the plot component.
   - Apply the substring assertion to the diagnostic component.
</issue_to_address>

@github-actions

Performance Report for f8eee40

| Metric | Value |
| --- | --- |
| Total tests | 1653 |
| Total time | 123.24s |
| Mean | 0.0746s |
| Median | 0.0020s |

Top 10 slowest tests

| Test | Time (s) |
| --- | --- |
| test.test_bench_examples.TestBenchExamples::test_example_meta | 16.957 |
| test.test_over_time_save_perf::test_save_faster_without_aggregated_tab | 4.982 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py] | 4.544 |
| test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] | 2.949 |
| test.test_split_render_examples::test_split_render_subprocess_media | 2.940 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py] | 2.830 |
| test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] | 2.807 |
| test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] | 2.795 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py] | 2.777 |
| test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max | 2.398 |

Full report

Updated by Performance Tracking workflow

@blooop

blooop commented Jun 12, 2026

Owner Author

"Histogram output never shows result-var units (only the var name); tests assert the implemented behaviour" this is not desired. All graphs should show units. check this across the codebase (make a follow up pr if appropriate)

blooop added 2 commits June 12, 2026 16:01
Addresses review feedback that all graphs should display units. The
histogram x-axis carries the result variable but only showed its name;
now uses the '{name} [{units}]' convention shared with band/bar/heatmap.

Move the copy-pasted unwrap_hv, inner_element and run_cfg_with helpers
into a shared module, and add a BandResult negative test pinning that a
non-scalar (vector) result is rejected rather than silently plotted.
@blooop

blooop commented Jun 12, 2026

Owner Author

Good catch — fixed the histogram in this PR (e9a91fc): the x-axis now shows {name} [{units}] like band/bar/heatmap, and the test asserts value [m] instead of pinning the old behavior.

Audited the rest of the codebase per your request. Units are already shown by band, bar, heatmap, surface, and the distribution plots (box/violin/jitter inherit from DistributionResult). The remaining gaps are curve_result.py, line_result.py, and scatter_result.py, which don't set an axis label with units at all. I'll address those in a follow-up PR since they're production-only changes outside this test-coverage PR's scope.
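
For the follow-up, the `{name} [{units}]` convention mentioned above can be sketched as a tiny formatting helper. The function name `label_with_units` and the fallback to the bare name when no units are set are assumptions for illustration; the existing band/bar/heatmap code may build the label differently.

```python
def label_with_units(name, units=""):
    """Format an axis label as '{name} [{units}]'.

    Falls back to the bare name when units is empty (an assumed behavior).
    """
    return f"{name} [{units}]" if units else name


x_label = label_with_units("value", "m")  # the label the histogram test asserts
```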

Also addressed the Sourcery feedback: consolidated the duplicated unwrap_hv / _inner_element / run_cfg_with helpers into test/helpers.py, and added a BandResult negative test (the suggested repeats=1 case isn't rejected by band's filter, so the test pins the real rejection path — a non-scalar vector result).

@github-actions

Performance Report for e9a91fc

| Metric | Value |
| --- | --- |
| Total tests | 1656 |
| Total time | 120.29s |
| Mean | 0.0726s |
| Median | 0.0020s |

Top 10 slowest tests

| Test | Time (s) |
| --- | --- |
| test.test_bench_examples.TestBenchExamples::test_example_meta | 16.002 |
| test.test_over_time_save_perf::test_save_faster_without_aggregated_tab | 4.634 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py] | 4.377 |
| test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] | 3.036 |
| test.test_split_render_examples::test_split_render_subprocess_media | 3.023 |
| test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] | 2.869 |
| test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] | 2.706 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py] | 2.584 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py] | 2.253 |
| test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max | 2.238 |

Full report

Updated by Performance Tracking workflow

blooop added 3 commits June 12, 2026 16:38
Addresses the Sourcery review on #960: the per-file _run_sweep bodies in
the bar/scatter/box-whisker/violin/scatter-jitter result tests were
near-identical copies. Add run_named_sweep and run_dist_sweep to
test/helpers.py and delegate to them, so the run-config and plot-callback
setup lives in one place.

Also bump version to 1.106.0.
@blooop blooop enabled auto-merge June 12, 2026 15:52
@github-actions

Performance Report for 8c8b3a6

| Metric | Value |
| --- | --- |
| Total tests | 1659 |
| Total time | 124.12s |
| Mean | 0.0748s |
| Median | 0.0020s |

Top 10 slowest tests

| Test | Time (s) |
| --- | --- |
| test.test_bench_examples.TestBenchExamples::test_example_meta | 17.088 |
| test.test_over_time_save_perf::test_save_faster_without_aggregated_tab | 5.142 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_drift.py] | 4.564 |
| test.test_split_render_examples::test_split_render_subprocess_media | 2.986 |
| test.test_generated_examples::test_generated_example[cartesian_animation/example_cartesian_animation.py] | 2.938 |
| test.test_hash_persistent.TestCrossProcessDeterminism::test_hash_stable_across_two_processes[ResultBool] | 2.855 |
| test.test_generated_examples::test_generated_example[result_types/result_image/example_result_image_to_video.py] | 2.826 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_noise.py] | 2.782 |
| test.test_generated_examples::test_generated_example[regression/example_regression_tuning_step.py] | 2.492 |
| test.test_over_time_repeats.TestMaxSliderPoints::test_default_subsampling_caps_at_max | 2.414 |

Full report

Updated by Performance Tracking workflow

@blooop blooop merged commit 73fca69 into main Jun 12, 2026
7 of 8 checks passed
@blooop blooop deleted the test/coverage-plan-05 branch June 12, 2026 15:58
@blooop

blooop commented Jun 12, 2026

Owner Author

Follow-up PR for the remaining axis-unit gaps (curve/line/scatter) is up: #963
