feat(swarm): SP4 — honest demo + visible honest pipeline (korg run-once) #12
Closed
New1Direction wants to merge 5 commits into
…ture A new run-once subcommand and korg_runtime::run_once::run_once_honest drive the SP1 honest pipeline directly (below the broken campaign orchestration): build Benjamin's system+user messages, ask the hermetic DeterministicProvider for a patch, parse + apply_mutations to a real git worktree, then measure reality (numstat + cargo_check). The attested mutation count equals the real git-diff file count by construction — the fixture task yields 1 (compiling), an unrelated task yields an honest null (0, no fabrication). It writes a verifiable korg-ledger@v1 JSONL journal (hash-chained via the conformance-tested korg-ledger primitives re-exported through korg_registry::ledger_chain) that korg-verify accepts (journal VALID). TDD: tests/run_once.rs asserts both cases (fixture->1, unrelated->0) and the attested==numstat invariant. Default repo = temp git-inited copy of fixtures/honest-demo-repo, mirroring the keystone test setup.
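To make the "hash-chained" property of the journal concrete, here is a minimal sketch of a hash chain with verification and rewind. It is an illustration only, not the `korg-ledger@v1` format: the `Entry` struct, `append`, `verify`, and the payload strings are hypothetical, and the standard library's non-cryptographic `DefaultHasher` stands in for whatever digest the real ledger uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical ledger entry; the real korg-ledger@v1 schema may differ.
#[derive(Debug, Clone)]
struct Entry {
    seq: u64,
    prev_hash: u64,
    payload: String,
    hash: u64,
}

/// Stand-in digest over (seq, prev_hash, payload).
/// DefaultHasher is NOT cryptographic; it only illustrates the chaining.
fn entry_hash(seq: u64, prev_hash: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    seq.hash(&mut h);
    prev_hash.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

/// Append an entry whose hash commits to the previous entry's hash.
fn append(chain: &mut Vec<Entry>, payload: &str) {
    let seq = chain.len() as u64;
    let prev_hash = chain.last().map(|e| e.hash).unwrap_or(0);
    let hash = entry_hash(seq, prev_hash, payload);
    chain.push(Entry { seq, prev_hash, payload: payload.to_string(), hash });
}

/// Recompute every hash and check each link back to its predecessor.
fn verify(chain: &[Entry]) -> bool {
    let mut expected_prev = 0u64;
    for (i, e) in chain.iter().enumerate() {
        let links_ok = e.seq == i as u64 && e.prev_hash == expected_prev;
        let hash_ok = e.hash == entry_hash(e.seq, e.prev_hash, &e.payload);
        if !(links_ok && hash_ok) {
            return false;
        }
        expected_prev = e.hash;
    }
    true
}

fn main() {
    let mut chain = Vec::new();
    for payload in ["task", "patch", "measure", "attest"] {
        append(&mut chain, payload);
    }
    assert!(verify(&chain));

    // Rewinding to a prior sequence point is a truncation; the prefix still verifies.
    chain.truncate(2);
    assert!(verify(&chain));

    // Editing a recorded payload breaks the chain.
    chain[1].payload = "tampered".to_string();
    assert!(!verify(&chain));
}
```

Because each entry commits to its predecessor's hash, any prefix of a valid chain is itself valid, which is what makes rewinding to an earlier sequence point cheap to check.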
- Hero alt text: drop the false 'fork' claim → 'record, verify, and rewind an AI agent session as a hash-chained ledger' (what the binary actually does).
- Reversibility bullet: 'rewind, fork, or branch' → 'rewind the ledger to any prior sequence point' (only rewind is shipped).
- 'Rewind & Fork' section: the phantom 'korg checkpoints list|restore' commands (no such variant in enum Commands) replaced with the real run-once + verify flow, and a note marks fork/checkpoints as planned, not shipped.
demo.tape now types and runs ONLY real commands: korg run-once (real patch + real cargo check + honest attestation), korg-verify on the emitted korg-ledger@v1 journal (independent green verdict), cat of the real applied fix, and a real korg rewind. Every Type line is the actual binary; there is no simulation script and no pre-scripted output. demo-sim.sh (the fabrication source, which printed fake seq/timestamps/mutations) is deleted so the tape can't be re-pointed at it. demo.gif/mp4/webp regenerated with vhs from the real binaries; frames verified to show the real HONEST ATTESTATION block and the real fixed add(). (webp built via gif2webp from the same real GIF, since this ffmpeg has no webp encoder.)
…real CLI flags) Review of SP4 found the README still marked Speculative branches + Execution checkpoints as shipped (✅ in the comparison table, [x] in the status checklist), contradicting the 'planned, not yet shipped' disclaimer 3 lines above. Also corrected phantom runnable commands: korg-tui→korg, 'korg goal/run --preview/--mode'→top-level --goal/--preview/--mode flags (there is no run/goal subcommand).
New1Direction added a commit that referenced this pull request Jun 15, 2026
… run-once) Lands the stacked branch 'feat/swarm-honest-demo' (PR #12) onto main.
Owner
Author
Landed on
## Track B SP4: stop faking the demo, expose the honest pipeline

**Pivotal finding (from grounding):** the orchestrated `korg campaign` does NOT do real work: its workers idle, get killed, get a faked recovery, and attest `total_mutations_so_far: 0` (Benjamin's DAG package is literally `"Implement (simulate-crash): …"`). The SP1 honest pipeline is real but only fires below this broken orchestration. So an honest demo can't record the campaign; it needs a real, working entrypoint. (Fixing the campaign itself is SP2.)

### What shipped
**`korg run-once`: the honest pipeline made visible (`6cf38af`)**

- `Commands::RunOnce { task, repo }` → drives the SP1 pipeline on a temp git-inited copy of `fixtures/honest-demo-repo`: `DeterministicProvider` → apply → real `git diff --numstat` → real `cargo check` → honest attestation.
- … `files_changed=N · cargo check=PASSED · attested mutation count=N (== real git diff)` and a `✓ SP1 invariant holds` equality check.
- … `korg-verify` independently accepts (`✓ journal VALID — 4 events, hash-chain + DAG intact`).
- … `cargo check` (both cases).

**README honesty (`d88c83c`, `113057a`)**

- … (`korg fork`, `korg checkpoints list|restore`; no such variants) and corrected `korg-tui` → `korg`, `korg goal/run --preview/--mode` → the real top-level `--goal`/`--preview`/`--mode` flags.
- … `🚧 planned`.

**Honest demo recording (`78640d1`)**

- `demo.tape` rewritten to type+run only the real binary (`korg run-once` → `korg-verify` → real `korg rewind`); deleted `demo-sim.sh` (the fabrication source).
- … `demo.gif`/`.mp4`/`.webp` from real output, verified frame-by-frame to show the genuine honest-attestation block, not sim. (The webp encoder failure that almost re-shipped the old sim GIF was caught and corrected.)

### Independent review

Built by a fresh implementer, then independently reviewed: `run-once` honesty confirmed (attested == real numstat, no fabrication), ledger verified, tape de-simulated, 0 regressions. The review caught the README table/checklist contradiction (an honesty defect), now fixed.
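The "attested == real numstat" invariant that the review confirmed can be sketched as a small pure check. This is an illustration under stated assumptions, not the code in `korg_runtime::run_once`: the function names `count_changed_files` and `check_invariant` are hypothetical, and the sample numstat text is made up. The only external fact relied on is the `git diff --numstat` line format: added count, deleted count, and path separated by tabs, with `-` in both count columns for binary files.

```rust
/// Count the files listed in `git diff --numstat` output.
/// Each non-empty line has the form "<added>\t<deleted>\t<path>";
/// binary files show "-" for both counts but still count as one file.
fn count_changed_files(numstat: &str) -> usize {
    numstat
        .lines()
        .filter(|line| line.split('\t').count() >= 3)
        .count()
}

/// Hypothetical check: the attested mutation count must equal the
/// number of files that git reports as changed.
fn check_invariant(attested: usize, numstat: &str) -> Result<(), String> {
    let real = count_changed_files(numstat);
    if attested == real {
        Ok(())
    } else {
        Err(format!("attested {attested} != real git diff {real}"))
    }
}

fn main() {
    // Illustrative numstat text for a one-file fix (not captured from a real run).
    let fixture = "1\t1\tsrc/lib.rs\n";

    // Fixture-style case: one file changed, one mutation attested.
    assert!(check_invariant(1, fixture).is_ok());

    // Honest null: nothing changed, so nothing may be attested.
    assert!(check_invariant(0, "").is_ok());

    // A fabricated count is rejected.
    assert!(check_invariant(3, fixture).is_err());
}
```

Keeping the check a pure function of the diff text means it can be exercised without a git worktree; the real pipeline, per the description above, measures an actual worktree with `git diff --numstat` and `cargo check`.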
### Test plan

- `cargo test -p korg-runtime --test run_once`: both fixture→1 and unrelated→0 cases pass against real git+cargo
- `korg run-once "Fix the add function in src/lib.rs so it adds"` → honest attestation + ledger that `korg-verify` accepts
- `cargo test --workspace`: 32 binaries, 0 failures

Stacked on `feat/swarm-honest-pipeline` (SP1). For review; not intended to merge ahead of the stack.

Spec: `docs/superpowers/specs/2026-06-14-korg-swarm-sp4-honest-demo-design.md`