Currently, our benchmark script maintains a hard-coded mapping of data-generating functions to the CI tests that can consume them. Every time we add a new generator, we must also update this central mapping, which adds boilerplate and risks the mapping falling out of sync with the generators.
Proposal:
Build a lightweight registry framework that lets each data-generating function declare, via a decorator, which CI tests it supports. At import time, the decorator will register the function into a global lookup table keyed by test name. The benchmark runner then discovers all generators for a given test by querying this registry.
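# registry.py
from collections import defaultdict

# Maps each CI test name to the generator functions registered for it.
_GENERATORS_BY_TEST = defaultdict(list)

def data_generator(*test_names):
    """Register the decorated function as a generator for the given CI tests."""
    def decorator(fn):
        # Accumulate names so stacked decorators extend rather than overwrite.
        fn.supported_tests = getattr(fn, "supported_tests", []) + list(test_names)
        for name in test_names:
            _GENERATORS_BY_TEST[name].append(fn)
        return fn
    return decorator

def get_generators_for(test_name):
    """Return a copy of the generators registered for test_name."""
    # .get() avoids inserting an empty entry for unknown test names.
    return list(_GENERATORS_BY_TEST.get(test_name, []))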
Data generating method:
# linear_gaussian.py
from registry import data_generator
import numpy as np

@data_generator("pearsonr", "pillai", "gcm")
def linear_gaussian():
    ...
Benefits:
- Modularity: Each generator self-documents which tests it supports.
- Extensibility: Adding a new generator requires only a decorated function; no changes to the CI scripts are required.
- Maintainability: Eliminates a growing, error-prone central mapping.
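One consequence of registering at import time is that a generator only appears in the registry once its module has been imported. A possible way to handle this, assuming the generators live together in a single package, is to import every submodule before the runner queries the registry. The helper name `import_generator_modules` and the package name `generators` in the usage note are assumptions for illustration.

```python
import importlib
import pkgutil


def import_generator_modules(package_name):
    """Import each direct submodule of a package so its decorators run.

    Returns the sorted list of imported module names. Nested subpackages
    are not walked; pkgutil.walk_packages could be used for that instead.
    """
    package = importlib.import_module(package_name)
    imported = []
    prefix = package.__name__ + "."
    for module_info in pkgutil.iter_modules(package.__path__, prefix=prefix):
        importlib.import_module(module_info.name)
        imported.append(module_info.name)
    return sorted(imported)
```

The runner would call something like `import_generator_modules("generators")` once at startup, before its first call to `get_generators_for`.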