Releases: Gerrux/xrun
xrun v0.7.1
Added
- `xrun install skill --codex|--claude` installs repository-local agent
  instructions for Codex and Claude.
- Installers now install the Python TUI by default, check for Python 3.11+
  and pip, and can try `ensurepip` via `--install-pip` / `-InstallPip`.
- `xrun update` checks GitHub Releases, prompts before installing, and runs
  the official installer. Interactive `xrun` startup also prompts when a
  newer release exists; set `XRUN_NO_UPDATE_CHECK=1` to skip that check.
Changed
- Installer and README links now target the `master` branch.
- The xrun agent skill text is harness-neutral instead of Claude-only.
xrun v0.7.0
Pluggable metric sinks. WandB joins MLflow as a first-class fan-out
target; the poller now mirrors one xrun run to N tracking servers in
parallel, with per-sink failure isolation. Comet ML and TensorBoard
plug into the same trait in v0.8 with no further refactor.
Skipped 0.6.x — the field-feedback ramp was already absorbed into 0.5.4
and the next milestone deserves a clear minor bump.
Added
MetricSink trait + WandB
- `xrun_core::metric_sink::MetricSink` (async via `async-trait`):
  `name` / `open_run` / `log_metrics_batch` / `log_artifact` / `finalize`.
  Backends are independent crates; adding one is a single
  `impl MetricSink for X` plus a registry entry in `MetricFanOut::new`.
- `OpenRunCtx<'_>` carries everything a tracking server needs
  (run_id, experiment, run_name, vendor, instance_id, config, tags),
  borrowed so callers don't clone hashmaps per tick.
- `RemoteRunHandle { sink_name, remote_run_id, remote_url }`: the opaque
  token. `remote_url` lets `xrun show` / `xrun metrics --mlflow-url`
  build deep links without rebuilding from raw config (which broke when
  the user later edited `mlflow.url`).
- `MetricSinkError` is granular enough to retry on `Network` / `Server`,
  fail loud on `Auth` / `Config`, and swallow on `Disabled`.
- `crates/xrun-wandb/`: new crate. `WandbClient` wraps three operations
  on `api.wandb.ai`:
  - `viewer { entity }` GraphQL probe for default-entity resolution.
  - `mutation upsertBucket` to create / reuse a run row (idempotent on
    run name, so a daemon restart doesn't double-create).
  - `POST /files/{entity}/{project}/{run}/file_stream` to append to
    `wandb-history.jsonl` and signal `complete + exitcode` on finalize
    (same path the official wandb Python SDK uses internally).
- `WandbSink: MetricSink` groups `MetricPoint`s by step before posting
  so multi-key updates within one training iteration land on the same
  wandb x-axis tick. Tracks an `AtomicU64` history offset per run for
  retry-safe appends.
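The group-by-step behaviour described for `WandbSink` can be illustrated with a minimal Python sketch. This is not the Rust implementation: the `MetricPoint` shape, the `group_by_step` helper, and the `_step` row field are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class MetricPoint(NamedTuple):
    # Hypothetical shape; the real MetricPoint is a Rust type in xrun.
    key: str
    value: float
    step: int


def group_by_step(points):
    """Merge points that share a step into one history row per step."""
    rows = defaultdict(dict)
    for p in points:
        rows[p.step][p.key] = p.value  # last write wins for a repeated key
    # One row per step, ordered by step; "_step" is an illustrative field name.
    return [{"_step": step, **values} for step, values in sorted(rows.items())]


points = [
    MetricPoint("loss", 0.9, 1),
    MetricPoint("acc", 0.5, 1),
    MetricPoint("loss", 0.7, 2),
]
rows = group_by_step(points)
```

With this grouping, `loss` and `acc` logged at step 1 end up in a single row, which is what lets them share one x-axis tick.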
Poller fan-out
- `MetricFanOut` (was `MlflowMirror`) holds `Vec<Box<dyn MetricSink>>`.
  `start()` opens runs on every enabled sink in sequence; `log_metrics`
  fans batches out concurrently within one `block` boundary, so total
  latency is `max(per_sink)`, not the sum. Per-sink `warned` latches
  surface the first failure and swallow the rest.
- `MetricSinksConfig { mlflow: Option<MlflowSubConfig>, wandb: Option<WandbSubConfig>, … }`:
  toggle each sub independently.
- Sinks are gated by both `[metrics] sinks = […]` (config) and
  credential presence; a sink listed without creds warns and is dropped
  (no fail), so a partial setup still gives the user whatever sinks
  remain, including local-only.
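The fan-out semantics above (concurrent delivery, per-sink failure isolation, warn once per sink) can be sketched in Python with `asyncio.gather`. This is only an analogue of the Rust poller: `FlakySink` and `FanOut` are hypothetical stand-ins, not the real `MetricSink` or `MetricFanOut` types.

```python
import asyncio


class FlakySink:
    """Hypothetical stand-in for a sink; not the real MetricSink trait."""

    def __init__(self, name, delay, fail=False):
        self.name, self.delay, self.fail = name, delay, fail
        self.received = []

    async def log_metrics_batch(self, batch):
        await asyncio.sleep(self.delay)
        if self.fail:
            raise RuntimeError(f"{self.name} unreachable")
        self.received.append(batch)


class FanOut:
    def __init__(self, sinks):
        self.sinks = sinks
        self.warned = set()   # per-sink latch: names that already warned
        self.warnings = []

    async def log_metrics(self, batch):
        # All sinks run concurrently; return_exceptions=True keeps one
        # failing sink from cancelling the others.
        results = await asyncio.gather(
            *(s.log_metrics_batch(batch) for s in self.sinks),
            return_exceptions=True,
        )
        for sink, result in zip(self.sinks, results):
            if isinstance(result, Exception) and sink.name not in self.warned:
                self.warned.add(sink.name)  # surface the first failure only
                self.warnings.append(f"{sink.name}: {result}")


async def demo():
    ok = FlakySink("mlflow", 0.01)
    bad = FlakySink("wandb", 0.01, fail=True)
    fan = FanOut([ok, bad])
    await fan.log_metrics([("loss", 0.9, 1)])
    await fan.log_metrics([("loss", 0.7, 2)])
    return ok, fan


ok_sink, fan = asyncio.run(demo())
```

Because the sinks are awaited together, one tick costs roughly the slowest sink's latency, and the failing sink produces a single warning while the healthy sink still receives both batches.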
Persistence
- Schema migration `005_sink_run_ids.sql`:
  - `mlflow_run_url` (resolved at sink-open time).
  - `wandb_run_id`, `wandb_run_url` (paired write, never half-populated).
- `Store::set_mlflow_run_url`, `Store::set_wandb_run` setters.
  `MetricFanOut::start` routes the new `RemoteRunHandle` back to the
  right setter via a sink-name dispatch.
CLI
- `xrun init --wandb-key -` flag, mirrors `--vast-key` / `--kaggle-token`
  with `-` to read a trimmed line from stdin (one stdin read max per
  invocation).
- `xrun config probe --vendor wandb`: lands in the existing probe
  surface, mirroring vast / kaggle / mlflow shape. Reads
  `XRUN_PROBE_WANDB_KEY` from env so the wizard / TUI can validate a
  pasted key without writing it to disk first. Hits `viewer { entity }`,
  returns 401 cleanly on a bad key.
- `xrun init-manifest --vendor <X> --sink <Y>`: generate a manifest
  skeleton for any (vendor × sink) combination. Every editable spot
  is marked `TODO_<field>`; `grep TODO_ <path>` lights up everything
  needing review before launch. Vendors: `vast`, `kaggle`, `local`,
  `ssh`. Sinks: `mlflow`, `wandb`, `none`. Closes the loop on the
  user-facing goal: Claude Code can produce a working manifest
  without knowing the schema.
- `xrun init --sink wandb` now accepted (was a v0.8 rejection).
- `xrun config show` includes `wandb.api_key` (masked unless
  `--secrets`); `xrun config set wandb.api_key …` works through the
  schema-driven setter.
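For tooling that wants the `grep TODO_ <path>` review step as a programmatic check, a minimal Python sketch follows. The skeleton text and its field names are made up for illustration; only the `TODO_<field>` marker convention comes from the notes above.

```python
import re

# Matches the TODO_<field> marker convention described above.
TODO_RE = re.compile(r"TODO_\w+")


def find_todos(text):
    """Return (line_number, token) for every TODO_<field> marker."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in TODO_RE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical manifest skeleton; real generated field names may differ.
skeleton = """\
vendor: local
name: TODO_run_name
command: python train.py --epochs TODO_epochs
"""
todos = find_todos(skeleton)
```

An empty result would mean no markers remain and the manifest is ready for review by `xrun doctor --manifest <path>`.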
TUI
- New `screens/sinks.py` (Python Textual). Cards: MLflow, WandB,
  Comet (`[v0.8]` disabled). Status pill with five distinct states:
  EMPTY / PAUSED / CHECK / READY / ERROR. PAUSED makes the
  "key set but not in metrics.sinks list" mistake visible at a glance.
  Actions: edit / test / toggle-default / revoke. Routing: `g m` chord,
  `Go: Sinks` palette entry.
- `SinkEditScreen` with masked inputs: 4 fields for MLflow (url goes
  to global config, auth fields to credentials.toml), 1 for WandB
  (entity is auto-probed at first launch).
- Wizard catalog: wandb `available_now` flag flipped to `True`. Wizard
  step still treats wandb as "tick to add to default list, configure
  key on Sinks screen"; the full wizard form is a follow-up.
Schema
- `WandbCredentials { api_key }` in `Credentials`.
- `Run` carries `mlflow_run_url`, `wandb_run_id`, `wandb_run_url`
  (`Option<String>`).
Changed
- `mlflow_mirror.rs` → `metric_fanout.rs`. `MlflowMirror` →
  `MetricFanOut`, `MlflowMirrorConfig` → `MetricSinksConfig`.
  `Poller::with_mlflow` → `Poller::with_metric_sinks`. The
  wiremock-based poller integration tests still validate the same
  wire-level MLflow calls (create / log-batch / update-run); they don't
  care that the path goes through a trait now.
Tests
- 11 new unit + wiremock tests in `xrun-wandb` covering viewer-probe,
  upsertBucket, file_stream batching, finalize exit code, 401 auth
  surfacing.
- 8 unit tests in `init_manifest` covering the (vendor × sink) matrix
  and the `TODO_` token contract.
- 1 multi-sink integration test in `xrun-poller` runs the poller
  end-to-end against two wiremock servers (one MLflow, one wandb) and
  asserts both received the metric batch.
- Full workspace: 404 passing, 0 failures across 50 suites.
Live smoke
- Run `01KQWJSX0VDNEEMBQGPK8KH6HN`: quickstart.yaml with
  `metrics.sinks = ["wandb"]`. WandB run created, finalized, page
  exists at `https://wandb.ai/data_force/quickstart/runs/xrun-…`
  (HTTP 200), DB row populated with `wandb_run_id` + `wandb_run_url`.
- `xrun init-manifest --vendor local --sink mlflow` parses cleanly
  through `xrun doctor --manifest <path>`.
Not yet shipped
- WandB artifacts API: `log_artifact` returns `Disabled` for now.
  Needs the separate `manifests + S3-presigned PUT` surface; deferred
  to v0.7.x patch or v0.8.
- Wizard form for wandb (just `api_key` input): the checkbox enables
  the sink in `metrics.sinks` but credential entry still goes through
  the Sinks screen.
- Comet ML: slot in TUI catalog, no impl. v0.8.
xrun v0.5.3
Field-feedback sweep: closes the eight items in ISSUES.md from the
arborust evening session, plus one latent Store::open bug surfaced
while wiring live telemetry. End-to-end live metrics + events on Kaggle
now work via the existing MLflow side-channel.
Added
- `xrun pull` is no longer a stub: it resolves run → vendor adapter →
  `adapter.pull()`, defaults destination to `runs/<id>/artifacts/`,
  supports `--into`, and reports matching files for
  `--ckpt best|latest|all`. (#4)
- Live telemetry on Kaggle: `xrun_hook`'s log streamer now also tails
  `events.jsonl` / `metrics.jsonl` and pushes them as MLflow artifact
  chunks alongside stdout. The Kaggle adapter ingests new chunks every
  poll tick, so events and metrics appear in `xrun events` / `xrun metrics`
  while the kernel is still running. Requires `mlflow.url` configured.
  (#8)
- Streamer also mirrors metric records to MLflow's native
  `/api/2.0/mlflow/runs/log-batch` endpoint so the MLflow UI's Metrics
  tab plots them. The artifact JSONL is still the source of truth for
  the local poller; the native logging is purely for the human
  dashboard. (Without it, the UI shows "No model metrics recorded".)
- Kaggle dataset version pinning: after `datasets status = ready`, the
  adapter resolves the dataset's `currentVersionNumber` via the REST
  API and rewrites slugs to `owner/name/N`, so kernels never mount a
  stale snapshot due to the kernel-creation cache lag. Slugs that
  already carry an explicit `/N` are left alone. (#1)
- TUI Artifacts screen: walks the run's local artifacts dir and lists
  real files (was previously a `xrun pull not yet implemented` stub).
  Empty state hints "press `a` to pull"; `a` runs the real CLI and
  auto-refreshes. (Surfaced after #4 unblocked it.)
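The slug-pinning rule for Kaggle datasets described above can be sketched in a few lines of Python. The `pin_slug` helper is hypothetical and the version number is passed in directly; in xrun the adapter resolves `currentVersionNumber` through the REST API, which is not reproduced here.

```python
def pin_slug(slug, current_version):
    """Rewrite owner/name to owner/name/N; leave explicit versions alone."""
    parts = slug.split("/")
    if len(parts) == 3 and parts[2].isdigit():
        return slug  # already carries an explicit /N
    return f"{slug}/{current_version}"


pinned = pin_slug("owner/name", 7)
untouched = pin_slug("owner/name/3", 7)
```

Here `pinned` is `owner/name/7`, while `untouched` keeps its explicit `/3`.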
Fixed
- `xrun launch --detach` no longer hangs after `kaggle kernels push`
  finishes. The push subprocess's pipes are now drained on background
  threads, so `try_wait` returns instead of waiting on a full OS pipe
  buffer. (#3)
- Duplicate `running:start` events after a poll-daemon restart. The
  Kaggle adapter rehydrates its last kernel state from the DB on the
  first poll so already-emitted transitions don't re-fire. (#6)
- TUI `⚠ stale` warning for healthy runs whose poll-daemon died
  mid-session: the app now runs auto-resume every 60 s, so a poller
  killed by a binary upgrade self-heals without user action. (#7)
- `xrun_hook` install path on Kaggle: the wheel is base64-embedded in
  the kernel `main.py` and bootstrapped before user `setup`, so the
  resolution-atomic `pip install` no longer drops siblings when
  `xrun_hook` isn't on PyPI. (#2/#5)
- Latent: `Store::open` was being called on the data directory in
  three Kaggle adapter callsites (post-pull ingest, kernel-state
  recovery, live telemetry ingest), failing silently with
  `unable to open database file`. New `db_path()` helper appends
  `runs.db`; without this the live-telemetry ingest never persisted
  any events or metrics.
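The pipe-drain fix for `xrun launch --detach` follows a general pattern: a child that writes more than the OS pipe buffer holds will block until someone reads, so the parent's wait never completes. A minimal Python analogue of the pattern is sketched below; the actual fix lives in Rust, and `drain` and `run_drained` are illustrative names only.

```python
import subprocess
import sys
import threading


def drain(stream, sink):
    # Read until EOF so the child never blocks on a full pipe buffer.
    for line in stream:
        sink.append(line)
    stream.close()


def run_drained(cmd):
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    out, err = [], []
    threads = [
        threading.Thread(target=drain, args=(proc.stdout, out), daemon=True),
        threading.Thread(target=drain, args=(proc.stderr, err), daemon=True),
    ]
    for t in threads:
        t.start()
    code = proc.wait()  # returns promptly because both pipes are being read
    for t in threads:
        t.join()
    return code, "".join(out), "".join(err)


# The child writes well past a typical 64 KiB pipe buffer.
child = [sys.executable, "-c", "print('x' * 200000)"]
code, out, err = run_drained(child)
```

Without the reader threads, waiting on this child before reading its stdout could hang, which is the same symptom the changelog entry describes.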
Changed
- `crates/xrun-kaggle/src/log_stream.rs` exports
  `parse_chunk_seq_with(stem, ext, path)` so adapters can reuse the
  same chunk reassembly path for `logs/`, `events/`, and `metrics/`
  prefixes.
xrun v0.5.1
Release 0.5.1
xrun v0.5.0
Added
- `xrun-local` adapter: run manifests directly on the host as a first-class
  vendor (`vendor: local`). Full lifecycle parity with vast (provision →
  upload → execute → tail → pull → destroy), cross-platform (bash/sh on Unix,
  pwsh/powershell on Windows), idempotent destroy via PID file.
- `xrun-ssh` adapter: same lifecycle against your own server / NAS / VPS over
  SSH; reuses the local execution model.
- `xrun init`: first-run wizard. TTY → spawns `xrun-tui --wizard` (4 steps:
  local capabilities → vendors → logging mode → recap with live `xrun doctor`).
  Non-interactive flags for the Claude skill / CI: `--probe-local --json`,
  `--non-interactive --mark-completed --sink mlflow`. Credential flags
  (`--vast-key`, `--kaggle-token`, `--kaggle-username`, `--kaggle-key`) accept
  `-` to read one stdin line, so secrets stay out of shell history.
- `xrun doctor` categorized output: env / vendors / manifests groups with
  per-category counts, `--json` for skill consumption.
- `xrun config probe --vendor <name>`: validate credentials read from
  `XRUN_PROBE_*` env (used by the wizard).
- `xrun metrics --per-key`: emit one chart per metric key instead of overlay.
- `xrun_hook.metrics(values: dict, step: int)`: batch shortcut that writes one
  row per key sharing a single timestamp. Avoids N separate `metric()` calls
  in the training loop.
- Kaggle live log streaming end-to-end via `xrun_hook` → MLflow chunked
  artifacts (the public Kaggle API has no live-log endpoint).
- `exp/templates/`: starter manifests + train.py for common ML tasks:
  `quickstart` (zero-config smoke test), `classification`
  (loss/acc/f1_macro/precision/recall), `regression` (loss/mae/rmse/r2).
  Templates run end-to-end without torch so they smoke-test the structure
  before adaptation.
- TUI Vendors screen: brand-coloured cards (vast orange `#ff6b35`, kaggle
  cyan `#20beff` via `border-left thick` accent), status pills
  (READY/CHECK/ERROR/EMPTY), pulse animation on the connectivity probe dot,
  double-click opens edit.
- TUI Vast edit screen: "Region filter" section reusing the existing
  `CountryExcludeScreen`; pulls and saves `search.exclude_countries` via
  `xrun config`, surfaces current exclusions inline.
- TUI country picker: pills tinted by continent (EU blue, AS red, NA green,
  ME orange, AF violet, OC cyan, SA amber) so country codes render and group
  visually in any terminal, including Windows Terminal, which doesn't compose
  regional-indicator codepoints into flag glyphs.
- TUI `u` binding on Vendors screen: opens the vendor's quota/billing page
  in a browser (`kaggle.com/settings`, `cloud.vast.ai/billing`).
- TUI wizard split into a sub-package with `image_view` / `metrics_view` /
  `report_view` widgets for reuse in run detail.
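The `xrun_hook.metrics(values, step)` batch shortcut listed above is described as writing one row per key with a shared timestamp. A minimal sketch of that idea, assuming a hypothetical row layout (`key`, `value`, `step`, `ts`) that may not match what `xrun_hook` actually writes:

```python
import time


def metrics_rows(values, step, now=None):
    """Build one row per metric key, all sharing a single timestamp."""
    ts = time.time() if now is None else now  # read the clock once per batch
    return [
        {"key": key, "value": value, "step": step, "ts": ts}
        for key, value in values.items()
    ]


batch = metrics_rows({"loss": 0.5, "acc": 0.8}, step=3, now=100.0)
```

Reading the clock once per batch, rather than once per key, is what keeps the rows of a single training iteration aligned on the same timestamp.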
Changed
- Wizard rebuilt for keyboard-first UX: `Checkbox` widgets (Tab/Space toggle),
  `o` opens API-key page of focused card (works before selecting), Esc-skip
  now requires Y/N confirmation, probe shows a loading indicator, Recap runs
  `xrun doctor --json` and prints ✓/⚠/✗ per check. Toggling no longer rebuilds
  the body, so pasted keys keep focus.
- TUI Kaggle card surfaces free-tier limits (CPU ∞ 24h/session,
  GPU 30h/wk 12h/session, TPU 20h/wk 9h/session) instead of an opaque
  "competitions visible: N" probe artifact.
- TUI vendor cards: flat `Horizontal` rows replaced with `Vertical` cards;
  `vendor-row` / `vendor-dot` / `vendor-name-col` CSS retired in favour of
  `vendor-card` / `vendor-card-head` / `vendor-card-foot`. Old class names
  removed.
- `xrun gc` filters non-vast records; local cleanup is via `xrun stop`.
Removed
- `xrun init --vendor` flag. It was informational-only (echoed in JSON, never
  wrote anything). The wizard now relies on `--sink` and the credential flags
  for non-interactive setup.
Configuration
- `[ui] wizard_completed: bool` (default false): TUI auto-launches the
  wizard when false; finishing the wizard sets both this flag and
  `[metrics] sinks` via the CLI.
- `[metrics] sinks: Vec<String>` (default `["mlflow"]`): editable through
  `xrun config set`. WandB and Comet sink checkboxes are visible but disabled
  with a `[v0.8]` badge.