Skip to content

fix: codex loop robustness#2

Open
dickymoore wants to merge 2 commits into
NathanJ60:mainfrom
dickymoore:fix/codex-support
Open

fix: codex loop robustness#2
dickymoore wants to merge 2 commits into
NathanJ60:mainfrom
dickymoore:fix/codex-support

Conversation

@dickymoore
Copy link
Copy Markdown
Contributor

Summary

  • Use raw yq output when reading statuses/keys so story lookups don't get quoted.
  • Avoid set -e abort by replacing ((processed++))/((failed++)) with safe arithmetic.

Why

  • yq was returning quoted keys (e.g., "1-1"), which made status checks fall back to unknown and skipped dev-story.
  • ((processed++)) returns exit status 1 on the first iteration, which terminates the loop under set -e. This caused only one story to run per invocation.

Testing

  • CODEX_HOME=~/bmad-sandbox/.codex RALPH_PROJECT_ROOT=~/bmad-sandbox RALPH_LOG_DIR=~/bmad-sandbox/logs ./codex-ralph-loop.sh --dry-run

@dickymoore
Copy link
Copy Markdown
Contributor Author

Hey @NathanJ60. There were a couple of bugs in the codex support so this should fix it. I haven't tested the changes with Claude though as I don't have access. If you're able to have a quick test that'll be ace. Thanks, Dicky

Copy link
Copy Markdown
Owner

@NathanJ60 NathanJ60 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Dicky! Thanks again for the contributions, really appreciate the effort here.

The two bug fixes (yq -r flag and the ((processed++)) arithmetic issue under set -e) are solid and I'd love to get those merged ASAP. However, this PR has grown quite a bit beyond those fixes — the parallel execution engine is a massive addition (~1900 lines) and I think it deserves its own dedicated PR so it can get proper review.

Could you split this into separate PRs?

  1. Bug fixes only (commits 1-2) — I'll merge those right away
  2. Parallel execution feature — as its own PR, with a few things I'd like to see addressed:
    • Modularization: ralph-loop-core.sh grows to 2250+ lines with 40+ functions. It'd be great to split the parallel logic into separate sourced files (lib/parallel.sh, lib/worktree.sh, etc.)
    • Security: the source "$previous_result_file" pattern for worker results is risky — if the file gets tampered with between write and read, it's arbitrary code execution. Explicit parsing (grep/cut) would be safer. Same concern with the script snapshot defaulting to /tmp with predictable filenames — mktemp -d with restricted permissions would be better.
    • Return code convention: code_review_requires_dev returns 0 for "needs more dev" and 1 for "clean", which inverts the standard bash convention. Easy to introduce bugs when reading the code.
    • wait_for_worker_completion signature: 9 nameref parameters is hard to follow — maybe use a result file or associative array instead?

The parallel feature itself is a really cool idea and I'd be happy to work with you on getting it merged once these are addressed. Let me know if you have any questions!

Cheers,
Nathan

@dickymoore
Copy link
Copy Markdown
Contributor Author

Hi @NathanJ60, and huge apologies. I’ve made a bit of a rookie error here and only just realised this branch was the one backing the open PR, so I accidentally pushed the newer commits onto it.

Completely agree they should be split out. I’ll separate out the original bug fixes so you can merge those, then I’ll put the parallel execution changes into a dedicated PR and work through the review points there. The parallel execution commits were originally only intended to live on my fork while I tested the idea, but if you think the direction is useful I’m happy to open a PR from a cleaned-up branch once I’ve ironed out the remaining issues.

Really appreciate the thoughtful feedback, and I’m glad the first two fixes look good.

Dicky

@dickymoore
Copy link
Copy Markdown
Contributor Author

dickymoore commented Apr 27, 2026

Hey Nathan. I hope you're well. I've narrowed this PR back down to the two original Codex bug fixes you called out:

  • use raw yq output for story/status keys
  • avoid post-increment arithmetic under set -e

The larger parallel/runtime work has been moved out of this PR path and preserved separately for a dedicated follow-up branch/PR. Local verification passes with bash -n across the shell scripts, and the branch now contains 2 commits, 1 changed file.

@dickymoore
Copy link
Copy Markdown
Contributor Author

Hi Nathan, quick update on the split.

I’ve kept this PR narrowed to the two Codex bug fixes you said looked good, and opened the parallel work as a separate stacked set so it is easier to review:

  1. fix: codex loop robustness #2 - Codex bug fixes only. This should go first.
  2. feat: add parallel story execution MVP #4 - Parallel execution MVP: controller/worker mode, story worktrees, dependency-aware scheduling, and serial integration.
  3. feat: add parallel runtime controls and recovery #5 - Runtime controls/recovery: pause/resume/drain/stop, idle timeouts, safe run snapshots, and retry/salvage context. I opened this as draft so it does not get merged before feat: add parallel story execution MVP #4.
  4. refactor: modularize parallel runtime and harden safety #6 - Safety/modularization cleanup: no source result.env, private snapshot dirs, standard review-gate return codes, simpler worker wait state, and the core split into lib/runtime.sh, lib/review-loop.sh, and lib/parallel.sh. Also draft until the earlier stack entries land.

Testing I ran on the split branches:

  • bash -n on the core, wrappers, installer, and lib modules where present.
  • wrapper --help smoke checks.
  • snapshot and simulated installed-layout checks for the modularized branch.
  • worker result parser rejection test for unexpected/malicious result keys.
  • fake Codex parallel smoke test on each branch with RALPH_CONCURRENCY=2 in a temporary git repo: two ready-for-dev stories ran through worker worktrees, fake dev/code-review workflows completed, the controller integrated both worker outputs, and both stories ended as done in sprint status.

I did not run live Claude/Codex agent workflows for these checks; the provider was faked to avoid burning real model calls during branch validation.

Thanks again for the detailed review notes. I’ve tried to map the follow-up PRs directly to them so you can take them one at a time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants