fix(swarm): make the multi-persona campaign actually work (stdout was corrupting ACP) #16
Merged
…s corrupting ACP
The campaign's "theatrical swarm" was a protocol bug, not fake work. Every
worker subprocess was doing real work (git worktree, applied patch, measured
diff, signed results) and SUCCEEDING — but the leader recorded each one as
`crashed` (exit_code=-1). The cause: stdout pollution corrupting the ACP
envelope channel the leader parses as JSON. The worker's first reply was
preceded by a log line ("2026-…") so the leader's first read failed to parse,
hit EOF, and stamped a false crash before reading any result. This affected
every persona in both deterministic and ollama campaigns.
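The failure mode described above can be sketched with the standard library alone. This is only an illustration, not the leader's actual reader: `first_read` and `FirstRead` are hypothetical names, and a first-byte check stands in for real JSON parsing.

```rust
use std::io::BufRead;

/// Outcome of the leader's first read from a worker's stdout.
#[derive(Debug, PartialEq)]
enum FirstRead {
    /// The first line looks like a JSON object and could be parsed as an envelope.
    Envelope(String),
    /// The first line is something else, for example a log line.
    NotJson(String),
    /// The stream ended before any line arrived.
    Eof,
}

/// Hypothetical stand-in for the leader's first envelope read.
/// A real reader would run a JSON parser; this only inspects the first byte.
fn first_read<R: BufRead>(reader: &mut R) -> std::io::Result<FirstRead> {
    let mut line = String::new();
    if reader.read_line(&mut line)? == 0 {
        return Ok(FirstRead::Eof);
    }
    let line = line.trim_end().to_string();
    if line.starts_with('{') {
        Ok(FirstRead::Envelope(line))
    } else {
        Ok(FirstRead::NotJson(line))
    }
}
```

If a log line arrives first, a strict reader sees `NotJson` and never reaches the valid envelope on the next line, which matches the false-crash symptom.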
The worker already sent a correct TerminationReport (with the real
success/doom_loop status) — it was simply never readable. The real fixes are
to stdout discipline:
- korg-core/telemetry.rs: init_tracing defaulted to STDOUT (while checking
stderr for ANSI). Logs now go to STDERR; stdout stays a clean ACP channel.
- korg-runtime/harness.rs: stray `[TelemetryEmitter]` println! -> eprintln!;
the legacy run() path's println!s -> eprintln! (same latent hazard).
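As a rough sketch of the stdout discipline these two changes enforce, the worker can treat the protocol channel and the log channel as separate writers. Everything here is an assumption for illustration: `send_envelope`, `log_line`, `run_worker`, and the JSON field names are hypothetical and are not korg's actual code.

```rust
use std::io::{self, Write};

/// Writes one protocol message as a single JSON line on the protocol channel.
fn send_envelope<W: Write>(protocol: &mut W, json: &str) -> io::Result<()> {
    writeln!(protocol, "{json}")?;
    // Flush so the leader sees each envelope as soon as it is written.
    protocol.flush()
}

/// Writes a human-readable diagnostic on the log channel only.
fn log_line<W: Write>(logs: &mut W, message: &str) -> io::Result<()> {
    writeln!(logs, "[worker] {message}")
}

/// Hypothetical worker body: diagnostics and envelopes never share a writer.
fn run_worker<P: Write, L: Write>(protocol: &mut P, logs: &mut L) -> io::Result<()> {
    log_line(logs, "starting")?;
    send_envelope(protocol, r#"{"kind":"result"}"#)?;
    log_line(logs, "done")?;
    send_envelope(protocol, r#"{"kind":"termination_report","status":"success"}"#)
}
```

In a real worker the call would be along the lines of `run_worker(&mut io::stdout().lock(), &mut io::stderr().lock())`, so that stdout carries nothing but envelopes.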
Campaign provider selection + live reliability:
- src/main.rs: global --provider/--model/--base-url, exported as env at startup
so every worker subprocess builds the selected provider via KorgConfig::load()
(no config threading; covers TUI/web). run-once unified onto these flags
(behavior unchanged: no flag -> deterministic).
- korg-runtime/session.rs: worker spawn forwards the LLM env to the child.
- korg-runtime/personas.rs: implementer personas (Benjamin/Lucas) request
response_format=json_object so live models reliably emit a parseable mutations
block; prose personas unchanged; the deterministic stub ignores it.
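A minimal sketch of the env-based provider selection follows, using only `std::process::Command`. The variable names (`KORG_PROVIDER`, `KORG_MODEL`, `KORG_BASE_URL`) and the `LlmSelection` type are assumptions made for this example; the real names are not given here.

```rust
use std::process::Command;

// Hypothetical variable names, chosen only for this sketch.
const ENV_PROVIDER: &str = "KORG_PROVIDER";
const ENV_MODEL: &str = "KORG_MODEL";
const ENV_BASE_URL: &str = "KORG_BASE_URL";

/// Values taken from the global --provider/--model/--base-url flags.
struct LlmSelection {
    provider: Option<String>,
    model: Option<String>,
    base_url: Option<String>,
}

impl LlmSelection {
    /// Environment pairs to export; no --provider flag means "deterministic".
    fn env_pairs(&self) -> Vec<(&'static str, String)> {
        let provider = self
            .provider
            .clone()
            .unwrap_or_else(|| "deterministic".to_string());
        let mut pairs = vec![(ENV_PROVIDER, provider)];
        if let Some(model) = &self.model {
            pairs.push((ENV_MODEL, model.clone()));
        }
        if let Some(url) = &self.base_url {
            pairs.push((ENV_BASE_URL, url.clone()));
        }
        pairs
    }
}

/// Builds a worker command that carries the selection to the child process.
fn worker_command(program: &str, selection: &LlmSelection) -> Command {
    let mut cmd = Command::new(program);
    cmd.envs(selection.env_pairs());
    cmd
}
```

Passing the selection through the environment means the child can rebuild its provider from its own config load, without the parent threading a config object through every spawn site.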
Proven: deterministic campaign — all workers terminate success, Benjamin
attests a real measured mutations=1, DAG data-flow real. Live ollama campaign —
all personas complete on a real model; an implementer produced a real mutation.
A gated (#[ignore]) e2e test, tests/campaign_e2e.rs, guards it. Also renames the
non-standard Tests/ -> tests/ so the integration test is discoverable cross-platform.
Honesty unchanged: the mutations count is the real git-diff measurement; this
only fixes whether a worker is RECORDED as done and which provider it uses.
What
The campaign's "theatrical swarm" was a stdout-pollution protocol bug, not fake work. Every `korg worker` subprocess was doing real work (git worktree, applied patch, real measured diff, signed results) and succeeding, but the leader recorded each one as `crashed` (`exit_code=-1`).

Root cause: the leader parses each worker's stdout as newline-delimited ACP/JSON. Log output was going to stdout too, so the worker's first reply was preceded by a tracing line (`2026-…`) → the leader's first `read_acp_envelope` failed to parse → EOF → false crash, before reading any result. This hit every persona in both deterministic and ollama campaigns. SP1–SP4 fixed the in-process paths; this was the subprocess path, invisible until a real campaign's worker stdout was read.

The fix (stdout discipline)
- `korg-core/telemetry.rs`: `init_tracing` had no `.with_writer` → defaulted to stdout (while ironically checking stderr for ANSI). Logs now go to stderr.
- `korg-runtime/harness.rs`: a stray `[TelemetryEmitter]` `println!` → `eprintln!`; the legacy `run()` path's `println!`s → `eprintln!` (same latent hazard).

The worker already sent the authoritative `TerminationReport` (real `success`/`doom_loop` status + `terminal_tx_id`); it was just never readable. (An earlier extra report I added was a misdiagnosis; review caught the double-send and it was removed. The e2e still passes, proving the stdout fix alone is the cure.)

Campaign provider selection + live reliability
- `src/main.rs`: global `--provider`/`--model`/`--base-url`, exported as env at startup so every worker subprocess builds the selected provider via `KorgConfig::load()` (no config threading; covers TUI/web). `run-once` unified onto these flags (no behavior change: no flag → `deterministic`).
- `korg-runtime/session.rs`: worker spawn forwards the LLM env to the child.
- `korg-runtime/personas.rs`: implementer personas (Benjamin/Lucas) request `response_format: json_object` so live models reliably emit a parseable mutations block; prose personas unchanged; deterministic stub ignores it.

Proven
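- Deterministic campaign: all workers terminate `success`, Benjamin attests a real measured `mutations=1`, DAG data-flow real (Benjamin's payload carries Captain+Harper output; Lucas's carries Benjamin's mutation).
- Live ollama campaign: all personas complete on a real model; an implementer produced a real mutation (`mutations=1`).

Test plan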
- `tests/campaign_e2e.rs` (`#[ignore]`): runs the real `korg` campaign, asserts workers complete, none crash, Benjamin attests `mutations=1` (passes in 71s locally)
- `cargo test -p korg-runtime`: 143 pass; `cargo test -p korg-core` pass
- … `TerminationReport`)
- … `#[ignore]`, not run in CI)

Also
`Tests/` → `tests/` so integration tests are discoverable cross-platform.

Honesty unchanged: the `mutations` count is the real `git diff` measurement; this only fixes whether a worker is recorded as done and which provider it uses.

Known pre-existing follow-ups (not introduced here)
- `doom_loop` exit status maps to `crashed=true` (`session.rs:467` / `workers.rs:870`); a controlled doom-loop signal shouldn't be a crash.
- `read_acp_envelope` in `run_as_stdio_worker` has no timeout (works in practice: the leader drops stdin → EOF).
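
One possible shape for the first follow-up is sketched below: the leader keeps a controlled doom-loop signal distinct from a crash. The `WorkerOutcome` enum and `classify` function are hypothetical and are not the code at the cited locations; only the `success` and `doom_loop` status strings come from this PR.

```rust
/// How the leader could record a finished worker.
#[derive(Debug, PartialEq, Clone, Copy)]
enum WorkerOutcome {
    /// The worker reported `success`.
    Success,
    /// The worker stopped itself on purpose after detecting a doom loop.
    DoomLoop,
    /// No readable report arrived, or the status was unrecognized.
    Crashed,
}

/// Classifies a worker from the status in its termination report, if any.
fn classify(reported_status: Option<&str>) -> WorkerOutcome {
    match reported_status {
        Some("success") => WorkerOutcome::Success,
        Some("doom_loop") => WorkerOutcome::DoomLoop,
        // Anything else is treated as a crash in this sketch.
        _ => WorkerOutcome::Crashed,
    }
}

/// Only a missing or unknown report counts as a crash.
fn is_crash(outcome: WorkerOutcome) -> bool {
    outcome == WorkerOutcome::Crashed
}
```

With a split like this, a doom-loop exit would still be visible in the campaign record without inflating the crash count.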