Releases: Quitetall/blut
Releases · Quitetall/blut
blut v0.10.0 — never-OOM-the-box orchestrator
First tagged release of the blut engine. The domain-agnostic orchestrator has graduated from the 0.1 prototype to a never-OOM-the-box, parallel DAG orchestrator with a distributed launch seam, run lineage, and declarative scheduling.
blutis the engine crate (Stage → Plan → Recipe → Registry + CLI/TUI + resource broker). The user-facing binaries live in the cookbook crates —blut-lamquant(binaryblut, neural EEG codec) andblut-lamu(binaryblut-lamu, generic LLM) — which depend on this crate.
Highlights
- Never-OOM-the-box resource broker (ADR 0046). Every memory-spending stage runs inside a cgroup-capped systemd unit, sized from a scaling footprint model and gated by a live admission probe. A too-small estimate is caught by the cgroup as a clean unit kill — never the OS OOM killer taking the host.
- Parallel DAG execution. The
ParallelExecutorruns independent stages concurrently (the long-missing piece), with capacity-aware memory admission so concurrency never overcommits the box.
Added
- Resource broker — system probe, conservative-high scaling RAM estimate (
workers × prefetch + tier + latent + batch + in_ch), fail-closed admission gate that refuses or right-sizes against live free RAM. - Calibration store + self-healing OOM retry — measured cgroup peaks are MAX-merged per config key so the broker stops over-refusing; an OOM records an
OomCorrectedlower bound that the next admission escalates (box-fit clamped so an escalation can never push the cgroup past the host). - Capacity-aware parallel admission — each stage reserves its footprint before being scheduled concurrently.
in_chfootprint term — a 168-channel fullband run is no longer billed like a 21-channel L3 run (the under-bill behind admit-then-cgroup-kill on every fullband launch).- Distributed launch seam —
blut … --launcher local|slurm|ray; a backend-agnosticWrappedCommand+LaunchTargetroute a stage to a cluster, whileLocalkeeps the cgroup-contained this-box path byte-identical. - LineageDB — an embedded, rebuildable SQLite index of every run with git + hardware provenance;
blut lineage trace <hash>/reindex. - Declarative recipe schedules via
blut-managed systemd timers. register_recipe!macro + schema helpers for typed cookbook recipes.- Runtime DAG mutation —
KILL-on-NaNbranch pruning (a no-op policy path is byte-identical to static execution). - Watchdog + heartbeat, per-stage retry policy and soft/hard timeouts.
- CLI —
--jsonflags,blut runs diff, cache-stats + artifact inspection, and a build-commit stamp (blut --version→0.10.0+<commit>[-dirty]) with a stale-binary warning at startup (the "git pull, forgotcargo install --force" trap).
Fixed
- Single cross-crate source of truth for footprint drivers (fixed a calibration key-parity bug where RESOLVE and RECORD keyed differently).
- The warm fullband-cache precompute stage (cookbook) was the last uncontained stage and could OOM the box; it now runs contained, fork-pool-sized, box-fit clamped, with a non-fatal OOM path.
- Footprint right-sized to the measured envelope (tier-3 ~35G → ~23G); many review-driven hardening fixes across broker / executor / launcher / lineage.
Known limitations
- No mid-epoch durable resume yet (BLUT-API Phase D) — stages retry on OOM, but a killed run restarts from scratch. Does not block normal runs.
- Warm fork-pool sizing is clamped against total box RAM, not live-free RAM — on a heavily loaded box the warm default may not complete (the run falls back to a slower decode-bound path and warns); tune the worker count down per run where needed.
- The LamQuant python payload still lives under this crate's
python/transitionally; it moves toblut-lamquantin a later migration.
Full changelog: CHANGELOG.md