add option to filter runs #17
Marius1311 left a comment:
Summary
Useful feature — being able to slice the WandB sweep down to runs matching specific config values is a real gap in the aggregator API. The implementation is straightforward and lands in the right place (fetch_runs → _filter_params → _process_runs).
A few items below before this can merge. CI red is the blocker; the rest are review-level questions.
Blocking
1. CI is red — 3 ruff errors introduced by this PR
pre-commit.ci fails on ruff check. Local reproduction:
```
UP007 src/scembed/aggregation.py:21 — use `X | Y` for type annotations
UP007 src/scembed/aggregation.py:22 — use `X | Y` for type annotations
C419 src/scembed/aggregation.py:195 — unnecessary list comprehension inside any()
```
Suggested patch:
```diff
-from typing import Union
-...
-# Allowed scalar types for filtering
-AllowedScalar = Union[int, float, str, bool]
-AllowedFilterValue = Union[AllowedScalar, list[AllowedScalar]]
+# Allowed scalar types for filtering
+AllowedScalar = int | float | str | bool
+AllowedFilterValue = AllowedScalar | list[AllowedScalar]
```

and for the C419 hit:

```diff
-    if any([param_value == x for x in target_value]):
+    if any(param_value == x for x in target_value):
```

(Or just `if param_value in target_value:` — semantically identical and simpler.)
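As a quick sanity check (a standalone sketch, not the PR's code), the generator expression and the `in` form agree for these scalar filters, since both compare element-wise with `==`:

```python
# Standalone demonstration: any() over a generator expression and the
# `in` operator both use == element-wise, so they give the same answer.
param_value = "scvi"
target_value = ["scvi", "harmony"]

via_any = any(param_value == x for x in target_value)
via_in = param_value in target_value
assert via_any == via_in  # both True here

# The equivalence also holds when there is no match.
assert any("scanvi" == x for x in target_value) is False
assert ("scanvi" in target_value) is False
```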
2. No tests for the new filter behavior
Per REVIEW_GUIDE.md (Testing → New code): new behavior must be covered. tests/test_aggregation.py has no test that exercises filter_params / filter_allow_none.
The existing tests hit a live W&B project, which makes adding a hermetic test for _filter_params a bit awkward — but the method itself is pure: it operates on self.raw_df. A direct unit test is easy:
```python
def test_filter_params_scalar_and_list(self):
    self.agg.raw_df = pd.DataFrame({
        "config": [{"method": "scvi"}, {"method": "harmony"}, {"method": None}, None],
    })
    self.agg.filter_allow_none = True
    self.agg._filter_params("method", ["scvi", "harmony"])
    assert len(self.agg.raw_df) == 3  # two matches + one None kept
    # ... and a filter_allow_none=False variant, plus a scalar target_value case
```

A parametrize with `(target_value, filter_allow_none, expected_n)` rows would cover the four meaningful branches in one test.
Review-level questions
3. _filter_params lacks type hints
Public fetch_runs is fully typed but _filter_params(self, target_key, target_value) is bare. It is private, but for consistency:
```python
def _filter_params(self, target_key: str, target_value: AllowedFilterValue) -> None:
```

4. Implicit "rows with no config" behavior
```python
config = row.get("config")
if config is None or not isinstance(config, dict):
    continue
```

Rows with missing/malformed config are silently dropped — they end up in neither `keep_runs` nor `discard_runs`. The comment says "Missing config runs will be filtered out later in `_process_runs()`", which is correct for the `n_runs_filtered` counter, but the in-loop `continue` here means those rows are not in `self.raw_df` after the `loc[keep_runs]` reassignment. So `_process_runs` will not see them and will not bump `missing_config_runs`.
Two options:
- Append to `keep_runs` (let `_process_runs` handle it as before, preserving the existing accounting).
- Append to `discard_runs` and add them to `self.missing_config_runs` here.

Right now it's a third hidden path. Worth choosing one explicitly.
5. State leak via self.filter_allow_none
fetch_runs writes self.filter_params / self.filter_allow_none and _filter_params reads self.filter_allow_none. Passing filter_allow_none as an argument to _filter_params would keep _filter_params self-contained and avoid leaving filter state on the instance after fetch (relevant if fetch_runs is called twice with different filters).
```python
def _filter_params(self, target_key, target_value, *, filter_allow_none: bool) -> None:
```

6. Public API surface
AllowedScalar / AllowedFilterValue end up on the public fetch_runs signature, which makes them effectively part of the public API even though they are not re-exported from __init__.py. Per AGENTS.md (public-API guidance) this is fine — just flagging it so the names are intentional. AllowedFilterValue is reasonable; consider whether AllowedScalar needs to exist at all (it's only used inside AllowedFilterValue).
7. Docstring
fetch_runs docstring is a single line and does not document the two new parameters. A short numpydoc block describing filter_params (mapping of config key → allowed value or list of values) and filter_allow_none (whether runs with None/missing values for the filtered key are kept) would help users discover the feature.
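A sketch of what such a numpydoc block could look like. The signature and default values here are assumptions for illustration, not copied from the diff:

```python
def fetch_runs(self, filter_params=None, filter_allow_none=True):
    """Fetch sweep runs from W&B, optionally filtering by config values.

    Parameters
    ----------
    filter_params
        Mapping of config key to an allowed value or list of values;
        only runs whose config matches are kept.
    filter_allow_none
        Whether runs with ``None``/missing values for a filtered key
        are kept rather than discarded.
    """
```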
Non-blocking nits
- `discard_runs` is collected only to log `len(discard_runs)`. A counter is enough; the list is unused.
- `if not isinstance(target_value, list): ... else: ...` could be unified by normalizing `target_value` to a list at entry: `targets = target_value if isinstance(target_value, list) else [target_value]`, then a single `in` check.
- The type alias allows `bool`, but `bool` is a subclass of `int` — passing `True` against an int-typed config field will match `1`. Probably fine in practice; just noting.
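The `bool`/`int` overlap is easy to verify in a REPL; this standalone snippet shows why a boolean filter value would match an int-typed config field:

```python
# bool is a subclass of int, so True compares equal to 1 and hashes
# the same; membership tests and dict lookups cannot tell them apart.
assert isinstance(True, int)
assert True == 1
assert True in [1, 2]          # membership uses ==, so this matches
assert hash(True) == hash(1)   # they even collide as dict/set keys
```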
Documentation Impact
No README.md, docs/api.md, or notebook updates needed — fetch_runs is already in the autosummary via scIBAggregator, and the new kwargs surface there once the docstring (point 7) is filled in.
No AGENTS.md invariant is touched (this respects the "W&B SDK drift isolated in aggregation.py helpers" rule — filtering is layered on top, not inside the SDK helpers). Good.
Codecov Report
❌ Patch coverage is low on this PR (7 of the 35 new lines are hit; see the diff below).
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main      #17      +/-  ##
==========================================
- Coverage   75.74%   74.28%    -1.46%
==========================================
  Files          11       11
  Lines        1303     1338      +35
==========================================
+ Hits          987      994       +7
- Misses        316      344      +28
==========================================
```
This PR introduces an option to filter runs so that only those meeting certain criteria are kept. This way one can restrict the benchmarking to, e.g., only tools that are label-aware vs. label-unaware.