When throughput (specimen releases per episode) is the primary metric, run the throughput_sla task. The coordination pack (coord_risk) targets safety, security, and resilience under injections; its cells may report zero throughput until coordination methods and scale setup produce release-capable work (see the Coordination benchmark card and the non-zero throughput plan).
- Run the benchmark with the scripted baseline (default for throughput_sla in the baseline registry):
  labtrust run-benchmark --task throughput_sla --episodes 10 --out ./out/throughput_sla.json
  The baseline registry maps throughput_sla to scripted_ops_v1 (scripted agents that perform accept, process, and release). No coordination method is used; the task uses a fixed set of scripted agents and an initial state with specimens already in accepted status.
- Read throughput from the result JSON: each episode in results.json has episodes[].metrics.throughput (count of RELEASE_RESULT per episode). Aggregate as needed (e.g. mean over episodes).
- Optional: Use labtrust throughput-compare to run throughput_sla with default 10 episodes and print mean throughput (writes throughput_compare_results.json by default; use --episodes N and --out <path> to customize).
- Optional: Use labtrust run-summary --run <dir> or labtrust summarize-results to get one-line or tabular stats including throughput for a run directory.
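
The aggregation step can be sketched in a few lines of Python. This is a minimal illustration, not part of labtrust: it assumes only the result layout described above (a top-level episodes list whose entries carry metrics.throughput), and the helper names mean_throughput and load_mean_throughput are hypothetical.

```python
import json
from statistics import mean


def mean_throughput(result: dict) -> float:
    """Return the mean of episodes[].metrics.throughput (0.0 if no episodes).

    Assumes the layout described in the text; any other keys are ignored.
    """
    values = [ep["metrics"]["throughput"] for ep in result.get("episodes", [])]
    return float(mean(values)) if values else 0.0


def load_mean_throughput(path: str) -> float:
    """Hypothetical helper: load a result JSON file and aggregate it."""
    with open(path, encoding="utf-8") as fh:
        return mean_throughput(json.load(fh))


if __name__ == "__main__":
    # Inline sample in the assumed shape, so the sketch runs without a file.
    sample = {
        "episodes": [
            {"metrics": {"throughput": 4}},
            {"metrics": {"throughput": 6}},
        ]
    }
    print(mean_throughput(sample))
```

To aggregate a real run, pass the file written by --out to load_mean_throughput, for example ./out/throughput_sla.json from the command above.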
If you want to compare a coordination method (e.g. kernel_auction_whca) on a setup that can produce releases, use the coord_scale or coord_risk task only after ensuring the scale has non-empty device queues (e.g. pre-populated initial queues or a reception path that enqueues work). See the coordination benchmark card and scale operational limits for scale configs and horizon_steps. The coordination pack’s default scales (small_smoke, medium_stress_signed_bus) require initial queue population or coordinator-driven accept/queue actions to achieve non-zero throughput.
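
Before comparing a coordination method against the scripted baseline, it may help to check that the coordination run produced any releases at all. The sketch below is illustrative only and rests on an assumption: that the coordination result exposes the same episodes[].metrics.throughput field as throughput_sla. The coordination benchmark card refers to perf.throughput, so the field path may need adjusting. The function names are hypothetical.

```python
from statistics import mean
from typing import Optional


def throughputs(result: dict) -> list:
    """Collect per-episode throughput, treating missing values as 0.

    Assumption: results follow the episodes[].metrics.throughput layout.
    """
    return [
        ep.get("metrics", {}).get("throughput", 0)
        for ep in result.get("episodes", [])
    ]


def releases_observed(result: dict) -> bool:
    """True if at least one episode released at least one specimen."""
    return any(value > 0 for value in throughputs(result))


def throughput_delta(baseline: dict, candidate: dict) -> Optional[float]:
    """Mean candidate throughput minus mean baseline throughput.

    Returns None when either run has no releases, since a setup with
    empty device queues cannot produce a meaningful comparison.
    """
    if not (releases_observed(baseline) and releases_observed(candidate)):
        return None
    return float(mean(throughputs(candidate)) - mean(throughputs(baseline)))


if __name__ == "__main__":
    baseline = {"episodes": [{"metrics": {"throughput": 4}}]}
    empty_queues = {"episodes": [{"metrics": {"throughput": 0}}]}
    print(throughput_delta(baseline, empty_queues))  # None: nothing to compare
```

A None result points to the setup (empty device queues) rather than the coordination method, which is the case the paragraph above warns about.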
- Coordination benchmark card – perf.throughput definition and when the pack reports it.
- Benchmark card – throughput_sla task and baselines.
- Official benchmark pack – full pipeline including throughput_sla baseline runs.