Rewrite ralph script in Python #23

@rjernst

branch: python-ralph

Spec: Rewrite ralph script in Python

Overview

Rewrite scripts/ralph from zsh (~500 lines) to Python 3 (stdlib only), following the same pattern established by the ta-wt rewrite. The script's CLI interface, behavior, and exit codes must remain identical. The docker/ralph/entrypoint.sh and docker/ralph/Dockerfile are NOT part of this rewrite — they stay as-is.

Key changes beyond a straight port:

  • Drop jq host dependency — Python handles JSON natively via json module
  • Replace gh ... -q '<jq-expression>' calls with gh --json <fields> + json.loads() in Python
  • Structure code with helper classes (Git, Docker, GitHub) as done in ta-wt

Architecture

scripts/ralph              ← Python 3 rewrite (stdlib only)
docker/ralph/entrypoint.sh ← unchanged (bash, runs inside container)
docker/ralph/Dockerfile    ← unchanged
tests/test_ralph.bats      ← integration tests, adapted for Python
tests/test_ralph.py        ← unit tests for pure functions (new, pytest)
tests/conftest.py          ← shared import helper for scripts/ (new)

Python module structure (single file):

#!/usr/bin/env python3
# Helper classes
class Git:        # git subprocess wrapper (same pattern as ta-wt)
class Docker:     # image build, container run
class GitHub:     # gh CLI wrapper, issue CRUD, label management

# Pure functions
parse_duration()
parse_frontmatter()
parse_issue_branch()

# Orchestration
ensure_worktree()
check_dependencies()
unblock_ready_specs()
process_issue()
poll_loop()
main()

if __name__ == "__main__":
    main()

The if __name__ == "__main__" guard is required so that pytest can import individual functions for unit testing.


1. CLI Interface (must match exactly)

Usage: ralph [options]

Options:
  --issue <number>      Execute a single GitHub Issue spec
  --poll                Poll for status:ready issues and process them
  --interval <duration> Poll interval (default: 30s, requires --poll)
  --timeout <duration>  Limit poll duration (e.g. 30m, 4h, 1d; requires --poll)
  --packages "pkg ..."  Extra apt packages baked into image
  --push                Git push after each iteration
  --model <model>       Claude model (default: sonnet)
  -h, --help            Show usage

Validation rules (exit code 2):

  • --poll and --issue mutually exclusive
  • --interval requires --poll
  • --timeout requires --poll
  • No mode specified → error with usage
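A minimal sketch of how these rules could be expressed with argparse. Only the "no mode specified" message is confirmed elsewhere in this spec; the other message texts, the order of the checks, and the names build_parser and validate are assumptions for illustration. Leaving --interval without an argparse default is one way to tell "flag given" from "flag omitted"; the 30s default would then be applied after validation.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ralph")
    parser.add_argument("--issue", type=int, metavar="<number>")
    parser.add_argument("--poll", action="store_true")
    # No default here, so validate() can detect explicit use of the flag.
    parser.add_argument("--interval", metavar="<duration>")
    parser.add_argument("--timeout", metavar="<duration>")
    parser.add_argument("--packages", default="")
    parser.add_argument("--push", action="store_true")
    parser.add_argument("--model", default="sonnet")
    return parser


def validate(args: argparse.Namespace) -> Optional[str]:
    """Return an error message for an invalid flag combination, else None."""
    if args.poll and args.issue is not None:
        return "--poll and --issue are mutually exclusive"  # assumed wording
    if args.interval is not None and not args.poll:
        return "--interval requires --poll"  # assumed wording
    if args.timeout is not None and not args.poll:
        return "--timeout requires --poll"  # assumed wording
    if not args.poll and args.issue is None:
        return "no mode specified"
    return None
```

A caller would print the message with the ralph: prefix to stderr, show usage, and exit with status 2.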

Duration format:

<number>[s|m|h|d] — bare number = seconds. Invalid → exit 2.
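The duration grammar is small enough to sketch directly. This version assumes the function takes a string and returns whole seconds, and that it signals bad input with ValueError so that main() can map it to exit code 2; the real signature may differ.

```python
import re

# Seconds per unit suffix; a bare number is treated as seconds.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert '<number>[s|m|h|d]' to seconds; raise ValueError if malformed."""
    match = re.fullmatch(r"([0-9]+)([smhd]?)", text)
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    number, unit = match.groups()
    # An empty suffix falls back to a multiplier of 1 (seconds).
    return int(number) * _UNIT_SECONDS.get(unit, 1)
```

For example, parse_duration("30m") gives 1800 and parse_duration("90") gives 90, while an empty string or "5x" raises ValueError.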

Prerequisite checks (exit code 1):

  • docker must be on PATH
  • gh must be on PATH
  • (jq check is REMOVED — Python handles JSON)
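One possible shape for the prerequisite check, using shutil.which. The helper name missing_tools and the injectable which parameter are assumptions made so the logic can be unit-tested without touching PATH; the message format follows the ralph: docker is not installed example in Conventions.

```python
import shutil
import sys


def missing_tools(tools=("docker", "gh"), which=shutil.which):
    """Return the required tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def check_prerequisites() -> None:
    """Exit with status 1 if any required tool is missing."""
    missing = missing_tools()
    for tool in missing:
        print(f"ralph: {tool} is not installed", file=sys.stderr)
    if missing:
        sys.exit(1)
```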

2. Docker Image Building

  • Dockerfile path: ../docker/ralph/ relative to the script's real path (resolve symlinks)
  • Without --packages: tag = ralph:uid-<uid>
  • With --packages: tag = ralph:custom-<sha256(packages:uid)[:12]>
  • Cache: skip build if docker image inspect <tag> succeeds
  • Build args: EXTRA_PACKAGES, HOST_UID
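The tag and cache rules could be sketched as follows. The exact bytes that the zsh script hashes (for example, whether a trailing newline is included) are not stated here, so the f"{packages}:{uid}" input below is an assumption that should be checked against the original if existing cached images are meant to be reused. The function names are illustrative.

```python
import hashlib
import subprocess


def image_tag(packages: str, uid: int) -> str:
    """Compute the image tag from the optional package list and the host uid."""
    if not packages:
        return f"ralph:uid-{uid}"
    # Assumed hash input: "<packages>:<uid>" with no trailing newline.
    digest = hashlib.sha256(f"{packages}:{uid}".encode()).hexdigest()
    return f"ralph:custom-{digest[:12]}"


def image_exists(tag: str) -> bool:
    """True if `docker image inspect <tag>` succeeds, meaning the build can be skipped."""
    result = subprocess.run(
        ["docker", "image", "inspect", tag],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```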

3. Auth

Extract OAuth token from macOS Keychain and pass as env var to the container.

  • Read credentials: security find-generic-password -s "Claude Code-credentials" -w
  • Extract access token from the JSON (currently done with jq on the host — Python rewrite uses json module instead)
  • Pass to container as CLAUDE_CODE_OAUTH_TOKEN env var
  • If no credentials found → error with message to run claude to log in first
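A rough sketch of the Keychain lookup with the json module in place of jq. This spec does not state where the access token sits inside the credentials JSON, so extract_token takes the key path as a parameter; the caller must supply the path used by the existing jq expression. Both function names are hypothetical.

```python
import json
import subprocess
from typing import Optional, Sequence


def read_keychain_secret(service: str = "Claude Code-credentials") -> Optional[str]:
    """Return the raw Keychain entry, or None if the lookup fails."""
    result = subprocess.run(
        ["security", "find-generic-password", "-s", service, "-w"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def extract_token(raw: str, key_path: Sequence[str]) -> Optional[str]:
    """Walk key_path through the decoded JSON; None if any step is missing."""
    try:
        node = json.loads(raw)
    except json.JSONDecodeError:
        return None
    for key in key_path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    # Only a non-empty string counts as a usable token.
    return node if isinstance(node, str) and node else None
```

If either step returns None, the script would print the "run claude to log in first" error and exit 1; otherwise the token is passed to the container as CLAUDE_CODE_OAUTH_TOKEN.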

4. Frontmatter Parsing

Hand-rolled parser (no YAML library). Extracts fields from --- delimited frontmatter in issue body.

  • branch: <name> → string
  • base: <name> → string (optional)
  • depends: [11, 17] → list of ints; also supports scalar depends: 11

Returns None when the requested field is missing. Must handle: missing frontmatter, missing field, bracket lists, scalar values, whitespace around values.
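A minimal sketch of such a parser, assuming a parse_frontmatter(body, field) signature and assuming the frontmatter must open on the first non-blank line of the body. How the original treats an unterminated block or a non-numeric depends value is not specified; this version reads to the end of the body in the first case and raises ValueError in the second.

```python
def parse_frontmatter(body: str, field: str):
    """Return the value of `field` from ----delimited frontmatter, or None."""
    lines = body.lstrip().splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter at all
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter: stop before the spec text
        key, sep, value = line.partition(":")
        if not sep or key.strip() != field:
            continue
        value = value.strip()
        if field == "depends":
            # Accept both "[11, 17]" and a bare scalar "11".
            items = value.strip("[]").split(",")
            return [int(item) for item in items if item.strip()]
        return value or None
    return None
```

With a body starting "---\nbranch: python-ralph\ndepends: [11, 17]\n---", the call parse_frontmatter(body, "branch") returns "python-ralph" and parse_frontmatter(body, "depends") returns [11, 17].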

5. Dependency System

  • check_dependencies(deps, repo) → returns list of unmet dep numbers. Uses gh issue view <num> --repo <repo> --json labels + check for status:done in labels.
  • unblock_ready_specs(repo) → fetches status:blocked + spec issues, checks each's depends frontmatter, transitions to status:ready if all deps met.
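The dependency check could look roughly like this. The extra fetch parameter is an assumption added so that tests can substitute a fake instead of calling gh; treating a failed gh lookup as an unmet dependency is also an assumption, since the intended behaviour for that case is only named as a test case in this spec.

```python
import json
import subprocess


def labels_from_json(payload: str):
    """Extract label names from `gh issue view --json labels` output."""
    return [label["name"] for label in json.loads(payload).get("labels", [])]


def fetch_labels(num: int, repo: str):
    """Ask gh for the labels on one issue."""
    result = subprocess.run(
        ["gh", "issue", "view", str(num), "--repo", repo, "--json", "labels"],
        capture_output=True,
        text=True,
        check=True,
    )
    return labels_from_json(result.stdout)


def check_dependencies(deps, repo, fetch=fetch_labels):
    """Return the dependency issue numbers that are not labelled status:done."""
    unmet = []
    for num in deps:
        try:
            labels = fetch(num, repo)
        except (subprocess.CalledProcessError, OSError):
            labels = []  # assumption: a failed lookup counts as unmet
        if "status:done" not in labels:
            unmet.append(num)
    return unmet
```

unblock_ready_specs(repo) can then be built on top: for each blocked spec issue, parse depends from the body and relabel to status:ready when check_dependencies returns an empty list.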

6. Worktree Management

ensure_worktree(branch, base=None) — find or create a git worktree:

  • Parse git worktree list --porcelain to find existing worktree for branch
  • Remote preference: upstream > origin
  • Default branch: from refs/remotes/<remote>/HEAD, fallback to current HEAD
  • base overrides default branch
  • Remote branch exists → git worktree add --track -b <branch> <path> <remote>/<branch>
  • No remote branch → git worktree add -b <branch> <path> <default>
  • Path: ../<repo>-<sanitized-branch> (slashes → hyphens)
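Two of the pieces above are pure and easy to sketch: finding an existing worktree in the porcelain output, and computing the path for a new one. The function names are illustrative, and reading ../<repo>-<sanitized-branch> as "a sibling of the repository root" is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_worktree(porcelain: str, branch: str) -> Optional[str]:
    """Return the worktree path checked out on `branch`, or None."""
    path = None
    for line in porcelain.splitlines():
        if line.startswith("worktree "):
            # Each porcelain record starts with its worktree path.
            path = line[len("worktree "):]
        elif line == f"branch refs/heads/{branch}":
            return path
    return None


def worktree_path(repo_root: str, branch: str) -> str:
    """Build the sibling directory path, replacing slashes with hyphens."""
    root = Path(repo_root)
    return str(root.parent / f"{root.name}-{branch.replace('/', '-')}")
```

The porcelain text would come from the Git helper running git worktree list --porcelain; matching on the full refs/heads/<branch> line avoids treating feature/x as a match for feature/x-2.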

7. Issue Processing

process_issue(issue_number):

  1. Resolve repo from origin remote
  2. Fetch issue title + body via gh issue view --json
  3. Parse branch from frontmatter (fallback: [branch] prefix in title)
  4. Parse optional base and depends
  5. Dependency check: unmet → label status:blocked, return
  6. Ensure worktree
  7. Label status:in-progress
  8. Iteration loop:
    • Write body to temp file
    • Resolve worktree git dir for mount
    • Record HEAD before
    • Run container with mounts: worktree, git dir, ssh, gitconfig, spec file
    • Container env: PUSH, PROMPT_FILE=/tmp/spec.md, MODEL, GIT_USER, GIT_EMAIL, CLAUDE_CODE_OAUTH_TOKEN
    • Container failure → status:needs-attention, return 1
    • No new commit → status:done, unblock dependents, break
    • New commit → update issue body from spec file, optional push, continue
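The control flow of the iteration loop in step 8 can be sketched with the side effects injected as callables. All parameter names here are hypothetical; in the real script they would be backed by the Docker, Git, and GitHub helpers, and the temp-file and mount setup would happen inside run_container.

```python
def run_iterations(run_container, head, on_commit, set_label, unblock) -> int:
    """Repeat container runs until one produces no new commit or fails."""
    while True:
        before = head()  # record HEAD before the run
        if run_container() != 0:
            set_label("status:needs-attention")
            return 1
        if head() == before:
            # No new commit: the spec is considered finished.
            set_label("status:done")
            unblock()
            return 0
        # New commit: sync the issue body (and optionally push), then loop.
        on_commit()
```

Keeping the loop free of direct subprocess calls makes the three outcomes (failure, done, continue) testable with plain fakes.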

8. Poll Mode

  1. Resolve repo, compute deadline from --timeout
  2. Signal handling: SIGINT/SIGTERM → clean exit
  3. Loop:
    • Check deadline
    • unblock_ready_specs(repo)
    • gh issue list --label "spec,status:ready" --author "@me" --repo <repo> --json number
    • Process each issue
    • Idle: print ralph: no ready issues found (last checked at HH:MM:SS), using a carriage return so each update overwrites the previous status line
    • Sleep interval
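A sketch of the poll loop's timing and signal handling under a few assumptions: the per-cycle work (unblock, list, process, idle message) is passed in as a tick callable, the clock and sleep functions are injectable for testing, a signal ends the process with status 0, and the carriage return comes at the start of the idle line. None of these details are confirmed by this spec.

```python
import signal
import sys
import time


def install_signal_handlers() -> None:
    """Exit cleanly on SIGINT/SIGTERM (exit status 0 is an assumption)."""
    def handler(signum, frame):
        print()  # end the carriage-return status line before exiting
        sys.exit(0)
    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)


def idle_message(when=None) -> str:
    """Build the idle status line; `when` is a struct_time, default local now."""
    stamp = time.strftime("%H:%M:%S", when or time.localtime())
    return f"\rralph: no ready issues found (last checked at {stamp})"


def poll_loop(tick, interval, timeout=None, now=time.time, sleep=time.sleep):
    """Call tick() every `interval` seconds until the optional deadline passes."""
    deadline = None if timeout is None else now() + timeout
    while deadline is None or now() < deadline:
        tick()
        sleep(interval)
```

The idle line would be written with print(idle_message(), end="", flush=True) so that the next update overwrites it.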

Implementation Plan

Step 1: Write complete Python rewrite of scripts/ralph ✅

Files:

  • scripts/ralph — Complete rewrite from zsh to Python

Implement:

  1. Read the current scripts/ralph zsh script thoroughly to understand every behavior
  2. Read scripts/ta-wt to follow the established Python conventions (Git helper class, argparse patterns, subprocess usage)
  3. Write the Python replacement with:
    • Git class (subprocess wrapper, same as ta-wt)
    • Docker class (image build with caching, container run)
    • GitHub class (gh CLI wrapper: issue view/edit/list, label management, repo resolution)
    • Pure functions: parse_duration(), parse_frontmatter(), parse_issue_branch()
    • Orchestration: ensure_worktree(), check_dependencies(), unblock_ready_specs(), process_issue(), poll_loop()
    • main() with argparse, validation, and mode dispatch
    • if __name__ == "__main__" guard
  4. Ensure identical CLI behavior: same flags, same defaults, same error messages (prefixed with ralph:), same exit codes (0=success, 1=runtime, 2=usage)
  5. Replace all gh ... -q '<jq>' patterns with gh --json <fields> + json.loads() in Python
  6. Remove jq prerequisite check entirely
  7. Auth: use subprocess to call security find-generic-password and parse JSON with json module
  8. Use hashlib.sha256 for package hash, time.time() for epoch, signal for SIGINT/SIGTERM

Test: Script runs: ralph --help shows usage, ralph with no args exits 2.

Verify: python3 scripts/ralph --help prints usage and exits 0. python3 scripts/ralph 2>&1; echo $? prints the "no mode specified" error followed by an exit status of 2.

Review: Check for: identical CLI behavior, proper subprocess error handling, no jq usage, stdlib-only imports, if __name__ == "__main__" guard present.

Address feedback: Fix all review findings. Re-verify.

Step 2: Create pytest unit tests for pure functions ✅

Files:

  • tests/conftest.py — Shared import helper for importing scripts without .py extension
  • tests/test_ralph.py — Pytest unit tests for pure functions

Implement:

  1. Create tests/conftest.py with a helper that uses importlib to import scripts by path (no .py extension). This will be reusable as other scripts (ta-workspace, ta-tmux, etc.) are rewritten to Python.
  2. Create tests/test_ralph.py with pytest tests covering every function previously unit-tested via zsh -c 'eval "$(sed -n ...)"' in the bats file: the three pure functions plus the two helpers that call gh. These are:
    • parse_duration() — plain number, s/m/h/d suffixes, invalid input, empty string
    • parse_frontmatter() — scalar values, missing field, no frontmatter, branch/base extraction, whitespace handling, extra fields
    • parse_issue_branch() — branches with slashes, numbers, hyphens; malformed titles
    • check_dependencies() — all done, some not done, gh failure (requires subprocess mocking)
    • unblock_ready_specs() — transitions blocked→ready when deps met, leaves blocked when unmet, unblocks when no depends field (requires subprocess mocking)
  3. For pure functions (parse_duration, parse_frontmatter, parse_issue_branch): test by direct function call — no mocking needed.
  4. For functions that call gh (check_dependencies, unblock_ready_specs): use unittest.mock.patch to mock the GitHub class methods, or pass a mock GitHub instance if the design supports it.
  5. Preserve all test semantics — same edge cases, same expected outputs as the original bats tests.
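The import helper from item 1 could be sketched as below. A plain importlib.util.spec_from_file_location call does not recognise files without a .py suffix, so this version passes an explicit SourceFileLoader. The function name import_script is an assumption; registering the module in sys.modules is optional but lets unittest.mock.patch("ralph.GitHub") style targets resolve.

```python
import importlib.util
import sys
from importlib.machinery import SourceFileLoader


def import_script(name: str, path: str):
    """Import a Python script that has no .py extension as module `name`."""
    loader = SourceFileLoader(name, str(path))
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    # Register before executing so patch() targets and re-imports find it.
    sys.modules[name] = module
    loader.exec_module(module)
    return module
```

In tests/test_ralph.py this might be used as ralph = import_script("ralph", "scripts/ralph"), after which ralph.parse_duration and the other functions can be called directly. Because the module's __name__ is "ralph" rather than "__main__", the guard keeps main() from running on import.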

Test: Run pytest tests/test_ralph.py -v

Verify: All pytest tests pass.

Review: Check that every zsh -c unit test from the original bats file has a pytest equivalent, and that assertions match the original expected values.

Address feedback: Fix all review findings. Re-run tests.

Step 3: Adapt bats integration tests for Python ✅

Files:

  • tests/test_ralph.bats — Adapt integration tests for the Python script

Implement:

  1. Read all existing tests carefully to understand what each one verifies
  2. Integration tests: Change run zsh "$RALPH" <args> to run "$RALPH" <args> — the Python shebang handles execution
  3. Remove unit tests: Delete the tests that used run zsh -c 'eval "$(sed -n ...)"' — these are now covered by tests/test_ralph.py
  4. Remove jq checks: Delete any tests that verify jq is required, and remove jq from test stubs if present
  5. The setup() function: remove jq from any prerequisite stubs. Keep docker/gh/git/security stubs.
  6. Preserve all integration test semantics — same assertions, same stubs, same expected outputs
  7. The auth tests that grep for CLAUDE_CODE_OAUTH_TOKEN in docker commands should still work since the Python script passes the same env var.

Test: Run bats tests/test_ralph.bats — all tests should pass.

Verify: bats tests/test_ralph.bats — 0 failures.

Review: Check that no integration test semantics were changed (only the invocation mechanism), no integration tests were deleted (except jq-specific ones), and all removed unit tests have pytest equivalents in test_ralph.py.

Address feedback: Fix all review findings. Re-run tests. Re-review if changes were substantial.

Step 4: Run all checks ✅

Implement:

  1. Run pytest tests/test_ralph.py -v — fix any failures
  2. Run bats tests/test_ralph.bats — fix any failures
  3. Run bats tests/test_ralph_entrypoint.bats — ensure entrypoint tests still pass (they test the bash entrypoint, should be unaffected)
  4. Run python3 -m py_compile scripts/ralph — verify no syntax errors
  5. Run any other project checks (shellcheck on other scripts, etc.)

Verify: All checks pass clean.

Step 5: Create commit

Implement:

  1. Stage scripts/ralph, tests/test_ralph.bats, tests/test_ralph.py, and tests/conftest.py
  2. Create a commit: "Rewrite ralph in Python (stdlib only)"

Verify: git log -1 shows the commit. git diff HEAD~1 --stat shows the expected files changed.


Conventions

  • Language: Python 3 (stdlib only — no pip dependencies)
  • Tests: Two layers:
    • pytest (tests/test_ralph.py) — unit tests for pure functions (parse_duration, parse_frontmatter, etc.) and functions with mockable dependencies
    • BATS (tests/test_ralph.bats) — integration tests that run the script as a subprocess with docker/gh/git stubs
    • conftest (tests/conftest.py) — shared import helper for scripts without .py extension (reusable for future rewrites)
  • Error messages: Prefix with ralph: (e.g., ralph: docker is not installed)
  • Exit codes: 0=success, 1=runtime error, 2=usage error
  • Style: Follow ta-wt patterns: Git helper class, dataclasses where useful, argparse with subcommands/flags, subprocess.run with check=True

Metadata

Labels: spec (Ralph spec for automated execution)