Releases: voidborne-d/hermit-agent
v0.1.50 — chrome-launcher: deterministic ports + IPv4 lock + collision detect
Three-layer fix for the silent CDP-port-share failure
Root cause
`find_free_port` had a TOCTOU race: between `lsof -i :PORT` returning empty (during a brief Chrome restart) and the next agent binding PORT, a different agent could grab the same port. And Chrome's macOS behavior on a bind conflict is to silently fall back to `[::1]:PORT` instead of failing — so two agents end up "running on port 19900," one on IPv4 and one on IPv6, with `chrome.json` recording the requested port (not what Chrome actually bound to). Tools connecting via `127.0.0.1:19900` then talk to whichever agent won the IPv4 race.
Layer 1 — Force IPv4-only
--remote-debugging-address=127.0.0.1
Chrome no longer silently rebinds to IPv6 on conflict; it hard-fails and the launcher knows.
Layer 2 — Deterministic per-agent port
```sh
deterministic_port() {
  local offset=$(printf '%s' "$1" | cksum | awk '{print $1 % 100}')
  echo $((19900 + offset))
}
```

The same agent name always gets the same port, so the race window doesn't exist in the common case. The launcher falls back to `find_free_port` only on a real collision (hash collision or an external process holding the port).
`find_free_port` now also cross-checks sibling agents' `chrome.json` files via `sibling_owns_port`, so a port owned by a live sibling whose Chrome is between restarts is still skipped.
Layer 3 — Detection
multi-agent-status-report.sh scans every sibling's chrome.json and emits:
🟥 chrome-cdp · collision · port 19900: agent-a ↔ agent-b
if two live agents claim the same port. Catches anything the first two layers miss.
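The scan itself is small. A minimal Python sketch, assuming the layout `<agents_root>/<agent>/.claude/state/chrome.json` with a top-level `port` key (the shipped script is shell; this only illustrates the grouping logic):

```python
import json
from pathlib import Path

def find_port_collisions(agents_root):
    """Group sibling agents by the CDP port their chrome.json claims;
    return only ports claimed by more than one agent."""
    by_port = {}
    for cfg in Path(agents_root).glob("*/.claude/state/chrome.json"):
        agent = cfg.parts[-4]  # the <agent> directory name
        try:
            port = json.loads(cfg.read_text()).get("port")
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or half-written chrome.json: skip
        if port is not None:
            by_port.setdefault(port, []).append(agent)
    return {p: agents for p, agents in by_port.items() if len(agents) > 1}
```

Each entry in the result maps a contested port to the list of agents claiming it, which is exactly what the 🟥 digest line reports.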
Files
- `template/scripts/chrome-launcher.sh` — `deterministic_port`, `sibling_owns_port`, `--remote-debugging-address=127.0.0.1`
- `template/scripts/multi-agent-status-report.sh` — collision check
- `test/smoke.mjs` — +4 → 115/115
v0.1.49 — block AskUserQuestion (TUI hang fix)
Stop AskUserQuestion from silently hanging headless agents
Claude Code's built-in AskUserQuestion tool renders a TUI modal directly to stdin/stdout. In hermit-agent's headless tmux + Telegram setup the user is on Telegram and never sees the modal, so the agent hangs forever waiting on stdin. The Telegram channel process stays alive — looks like the agent has gone silent, but the tool just never reaches it.
Fix: PreToolUse hook + AGENTS.md rule
scripts/hook-block-askuserquestion.sh denies any AskUserQuestion call and surfaces a permissionDecisionReason back to the model:
> Send a Telegram reply via `mcp__plugin_telegram_telegram__reply` with the options as numbered lines, end the turn, and let the user answer in the next inbound message.
The model adapts mid-turn — no human intervention needed. AGENTS.md "Telegram Replies — Hard Rules" gains a third rule explicitly forbidding the tool, so the model avoids it in the first place.
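A PreToolUse deny is plain JSON on the hook's stdout. A Python sketch of the payload shape (the shipped hook is a shell script; the reason text here is illustrative):

```python
import json

def deny_ask_user_question(reason):
    """Build the PreToolUse hook response that blocks the tool call and
    surfaces `reason` back to the model as permissionDecisionReason."""
    return json.dumps({
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": reason,
        }
    })

# The hook prints this to stdout and exits 0; the host reads it and
# relays the reason to the model mid-turn.
```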
Origin incident
2026-05-15 master-skill called AskUserQuestion to pick between two skill-regeneration strategies, sat silent in tmux for hours until the operator manually inspected the pane. Bot was alive (PID 30147), Telegram channel was healthy — the tool just renders to a place no one was looking.
Files
- `template/scripts/hook-block-askuserquestion.sh`
- `template/.claude/settings.local.json.tmpl` — wires the `AskUserQuestion` matcher
- `template/AGENTS.md` — third hard rule under Telegram Replies
- `test/smoke.mjs` — +3 assertions → 111/111
v0.1.48 — status-reporter accuracy fixes
Three digest bugs fixed
Surfaced after a kernel-panic reboot exposed them all at once.
1. Self-report noise
The agent's own status-reporter LaunchAgent was being picked up by the auto-discover loop, always reading "ran 0s ago" — useless signal (by definition, if you're reading the digest, the reporter just fired). Now skipped via SELF_STATUS_REPORTER_LABEL.
2. launchctl runs counter resets on every load
After a reboot (or a plist edit / bootstrap), every task that had been running for weeks suddenly reads "🟨 never ran" because the runs counter is back at 0. Fix: `check_interval_agent` now accepts an optional `log_path` and falls back to the log's mtime when `runs == 0` but the log exists. The auto-discover loop derives `<agent>/.claude/state/<task>.log` from the label, so any plist following the standard log layout gets this for free.
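The fallback logic fits in a few lines. A Python sketch (the real helper is shell; names here are hypothetical):

```python
import os

def effective_last_run(runs, launchctl_last_run, log_path):
    """When launchctl's runs counter is non-zero, trust its timestamp.
    When it has reset to 0 (reboot / plist reload) but the task's log
    survives, fall back to the log's mtime. None means 'never ran'."""
    if runs > 0:
        return launchctl_last_run
    if log_path and os.path.exists(log_path):
        return os.path.getmtime(log_path)  # counter reset, log survives
    return None  # genuinely "🟨 never ran"
```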
3. Calendar tasks were silently invisible
reap-dead-sessions (added in 0.1.47) uses `StartCalendarInterval`, so the auto-discover loop's `[ -z "$interval" ] && continue` filtered it out — it never appeared in the digest. Fix: a new `check_cron_mtime` helper plus an explicit hookup that fires when the per-agent reaper plist is installed.
Files
- `template/scripts/multi-agent-status-report.sh`
- `test/smoke.mjs` — +3 → 108/108
v0.1.47 — dead-session reaper
Daily Trash sweep of stale Claude session JSONLs
Each Claude Code session writes a JSONL under ~/.claude/projects/<encoded-AGENTS_ROOT>-<agent>/ and these files accumulate fast. New master-only LaunchAgent (macOS) / systemd timer (Linux) runs daily at 04:10 and ships JSONLs (plus their sidecar subdirs) to the OS recycle bin once they're no longer needed.
Reap criteria — ALL three must hold
- `session_id` ≠ `<agent>/.claude/state/session-status.json.session_id` (active)
- `session_id` ≠ `<agent>/.claude/state/paused.json.session_id` (hibernated wake target)
- JSONL `mtime` older than `REAP_AGE_DAYS` (default 3 — buffer for `claude --resume`)
Protection check runs before the mtime check, so active sessions and hibernated wake targets are safe regardless of age.
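The decision can be sketched in Python (the real reaper is shell; the session-id-equals-filename convention is an assumption for illustration):

```python
import os, time

def should_reap(jsonl_path, active_sid, paused_sid, age_days=3):
    """Mirror the criteria above, protection first: a session that is
    active or is the hibernated wake target survives regardless of age;
    everything else is reaped once its mtime exceeds age_days."""
    sid = os.path.basename(jsonl_path)
    if sid.endswith(".jsonl"):
        sid = sid[:-len(".jsonl")]
    if sid in (active_sid, paused_sid):
        return False  # protected regardless of age
    age_sec = time.time() - os.path.getmtime(jsonl_path)
    return age_sec > age_days * 86400
```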
Recycle bin, not rm
- macOS: `/usr/bin/trash`
- Linux: `gio trash`
Without one of those, the script refuses to run rather than silently mv-shuffling files around.
Manual ops
```sh
scripts/reap-dead-sessions.sh --dry-run       # preview
scripts/reap-dead-sessions.sh --age-days 14   # wider buffer
tail -f .claude/state/reap-dead-sessions.log
```

Origin incident
Backlog of ~833 stale JSONLs / 916 MB across 8 sibling agents in the local asst hub before this landed. After daily sweeps the working set should stay flat.
Files
- `template/scripts/reap-dead-sessions.sh`
- `template/launchd/reap-dead-sessions.plist.tmpl`
- `template/systemd/reap-dead-sessions.{service,timer}.tmpl`
- `bin/create-hermit-agent.js` — `installDeadSessionReaper()` (master-only)
- `template/AGENTS.md` — new "Dead-session reaper" section
- `test/smoke.mjs` — +7 assertions → 105/105
v0.1.46: fleet hibernation (idle-hibernator + wake-poller)
Fleet-wide agent hibernation. Long-idle hermits get paused (claude + bun + chrome killed, session JSONL preserved); inbound Telegram traffic auto-wakes them within 60s.
What's new
- `scripts/hibernate-agent.sh` — pauses an idle agent: saves session_id + tmux remain-on-exit + kills claude/bun/chrome, leaves `paused.json` for the wake side
- `scripts/wake-agent.sh` — `tmux respawn-pane` with `claude --resume <session_id>`, auto-dismisses the "Resume from summary" modal that fires for old/large sessions (picks summary; full resume can cost several USD per wake)
- `scripts/idle-hibernator.sh` — LaunchAgent / systemd timer (10 min); hibernates siblings idle beyond `IDLE_THRESHOLD_SEC` (default 48h). Self-excludes via `HIBERNATOR_SELF`
- `scripts/wake-poller.sh` — LaunchAgent / systemd timer (60s); peeks each paused agent's Telegram bot queue with `getUpdates` (no ACK), wakes on any pending update. Fast-path exits in <100ms when nothing is hibernated
- `multi-agent-status-report.sh` — distinguishes hibernated (💤) from down (⚫) by checking `paused.json`
- `template/AGENTS.md` — new Hibernation section with state machine + manual ops
- `installHibernationSystem` in `create-hermit-agent` — auto-installs on macOS launchd + Linux systemd-user for master role only (workers skip; the status-reporter convention is mirrored)
States
- `agent.pid` present, `paused.json` absent → alive
- `agent.pid` absent, `paused.json` present → hibernated (💤)
- both absent → down (⚫)
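The classification the status reporter performs can be sketched as (a sketch, assuming both files live in the agent's state directory; the real check is shell):

```python
import os

def agent_state(state_dir):
    """Classify a sibling from its state files: pid file → alive,
    paused.json → hibernated, neither → down."""
    alive = os.path.exists(os.path.join(state_dir, "agent.pid"))
    paused = os.path.exists(os.path.join(state_dir, "paused.json"))
    if alive and not paused:
        return "alive"
    if paused and not alive:
        return "hibernated"  # 💤
    return "down"  # ⚫ (the inconsistent both-present case also lands here)
```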
Tested
- 98/98 smoke checks (+13 hibernation-specific)
- End-to-end on the port-source workspace (asst): hibernate → status-reporter shows 💤 → wake → resume-modal auto-dismiss → back to idle. The wake-poller fast path was exercised live; the "pending update triggers wake" path is straightforward but only fires with a real Telegram inbound (verifies organically on next use)
v0.1.42: codex-hermit timeout fix — process-group kill + 30 min ceiling
Fixes a wedge condition in the codex-flavor bridge daemon discovered when a 30-minute README-generation turn hung the daemon for 26+ minutes with no error reaching the user.
Root cause
`subprocess.run(timeout=600)` only signals the immediate child. `codex exec` spawns helpers underneath, so SIGKILL to the top process leaves the helpers running. `subprocess.run`'s internal wait() never returns; the daemon stays stuck inside the catch-the-timeout path. The orphan codex process gets reparented to PPID=1 and keeps talking to the OpenAI backend silently.
Fixes
- `run_codex` now uses `Popen + start_new_session=True` so the daemon owns a fresh process group. On `TimeoutExpired` we `os.killpg(SIGKILL)` to take the whole tree, with a bounded 10s reap. The exception is re-raised so `handle()` can message the user.
- `CODEX_TIMEOUT_SEC` bumped from 600s (10 min) to 1800s (30 min). For prompts that legitimately need codex to write files, run git, install deps, etc., 10 min was too tight. Cron still wraps with `with-timeout.sh 1200` (20 min) per AGENTS.md Cron Safety.
- Timeout reply text now tells the user the thread is preserved — they can send another message to continue from where codex got stuck.
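A minimal sketch of the process-group-kill pattern (not the daemon's exact code):

```python
import os, signal, subprocess

def run_with_group_kill(cmd, timeout_sec=1800):
    """Start the child in a fresh process group (start_new_session=True)
    so a timeout can SIGKILL the whole tree — codex exec plus any
    helpers it spawned — not just the top process. Re-raises
    TimeoutExpired so the caller can message the user."""
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_sec)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)  # pid == pgid after setsid
        proc.wait(timeout=10)                # bounded reap
        raise
```

This is POSIX-only (`os.killpg`), which matches the daemon's macOS/Linux targets.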
Smoke
88/88 (one new assertion verifies `Popen` + `start_new_session` + `killpg` + 1800s ceiling).
Verified
Killed the orphan codex on the dev workspace (pid 40702, etime 26 min) and restarted the daemon. Next turn comes back clean. The new `Popen` + pgroup-kill path will catch this case automatically going forward.
v0.1.41: codex-hermit gets full sandbox + approval bypass
Fixes a silent capability gap on codex flavor: agents could read but not write or hit the network.
What was wrong
Codex CLI defaults `--sandbox` to read-only and `--ask-for-approval` to a mode that prompts for any write. The bridge daemon was inheriting both defaults, so codex hermits could load AGENTS.md, recall persona, and reply — but they couldn't write `memory/YYYY-MM-DD.md`, run scripts, install packages, generate images into the workspace, or fetch URLs. Symptom: agents repeatedly explained "current environment is read-only sandbox" and gave up halfway through any task that required a side effect.
Fix
`scripts/tg-bridge.py` and `scripts/run-cron.sh` now pass `--dangerously-bypass-approvals-and-sandbox` (alias `--yolo`) to every `codex exec` invocation. This is the codex-side equivalent of claude flavor's `--dangerously-skip-permissions`:
- `--sandbox=danger-full-access` (full filesystem + network)
- `--ask-for-approval=never` (no prompts mid-turn)
Same threat model as claude flavor — the user owns the machine and runs the agent as themselves; the agent inherits the user's authority.
Smoke
87/87 (one new assertion verifies both tg-bridge.py and run-cron.sh carry the flag).
v0.1.40: README documents Codex CLI host (no code change)
README-only update: makes the Codex CLI host (--host codex, shipped in v0.1.38) first-class on the README instead of buried in release notes.
Changes
Both README.md and README.zh-CN.md:
- Hero tagline names both hosts: Claude Code (default) and OpenAI Codex CLI
- Two host badges instead of one
- "Why a hermit crab?" credits table adds Codex CLI row
- 30-second quickstart shows both: `npx create-hermit-agent` (claude) and `npx create-hermit-agent <name> --host codex` (codex)
- Feature matrix gains a top "Host choice" row; each capability row notes how claude vs codex flavor implements it
- New "Host choice: Claude Code or Codex" section — side-by-side comparison table covering CLI / cost / Telegram bridge / hooks / slash commands / skills / MCP / cron / state, plus when-to-pick-which guidance
- Install section: dedicated "codex flavor" prereq block (codex CLI, python3, no bun), scaffold + start commands shown for both flavors
No code change
This is a docs-only release. v0.1.39 already shipped the actual sendPhoto fix; v0.1.40 just makes sure people landing on the npm page or GitHub repo can see the codex story up front.
v0.1.39: codex-hermit relays generated images via sendPhoto
Fixes a silent failure for codex-flavor hermits: when Codex's built-in `image_gen` tool produced a png in `~/.codex/generated_images/<thread>/`, the bridge daemon only posted the text reply — the user saw "image generated successfully" but never received the actual file. It looked like a generation error but was actually a missing transport.
Fix
- `snapshot_generated()` walks `~/.codex/generated_images/` recursively and returns the set of png/jpg/webp/gif files. Called before and after each `codex exec` turn; the difference is the artifact set this turn produced.
- `send_photo()` builds `multipart/form-data` manually (stdlib only, no `requests` dep) and POSTs to `/sendPhoto`. The bot token stays in env, never on argv — token-safety hard rule preserved.
- `handle()` now dispatches new images via `send_photo` with an `upload_photo` chat_action pulse first so users see "is uploading photo…" before the bytes arrive. The text reply follows.
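The before/after snapshot diff can be sketched like this (a sketch of the idea, not the daemon's exact code; the extension set matches the one above):

```python
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".webp", ".gif"}

def snapshot_generated(root):
    """Set of image paths under root, recursively. Taken before and
    after a codex exec turn; the set difference is this turn's output."""
    root = Path(root)
    if not root.is_dir():
        return set()
    return {p for p in root.rglob("*") if p.suffix.lower() in IMAGE_EXTS}

# new_images = snapshot_generated(root_after) - snapshot_before
# → each one dispatched via sendPhoto
```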
Compatibility
No breaking changes. Pure addition to the codex flavor's bridge daemon. Claude flavor untouched.
Smoke
86/86 (added 1 assertion covering send_photo + snapshot_generated + upload_photo wiring).
Verified end-to-end
Tested on the dev workspace /Users/mac/claudeclaw/codex-demo with a live ChatGPT image-generation request before porting into the template.
v0.1.38: --host codex flag (run hermit on ChatGPT subscription)
A new flag for npx create-hermit-agent lets users opt into a Codex CLI flavor instead of Claude Code. The codex hermit uses the user's ChatGPT subscription (via the codex CLI) instead of Claude API spend, while keeping the same hermit pattern: persona files (SOUL/IDENTITY/USER/AGENTS/TOOLS/MEMORY), daily logs, hooks, cron, Telegram bridge.
Usage
```sh
# Default (unchanged):
npx create-hermit-agent my-agent --bot-token <T> --user-id <CID> --persona "..." --yes

# Codex flavor (new):
npx create-hermit-agent my-agent --host codex --bot-token <T> --user-id <CID> --persona "..." --yes
```
Default stays --host claude. Codex is opt-in only.
What's different in --host codex
- No Claude Code dependency. Prereqs are codex CLI + python3 + tmux + curl + jq. No bun, no claude.
- Python Telegram bridge (`scripts/tg-bridge.py`) replaces the `--channels plugin:telegram@claude-plugins-official` model. Polls `getUpdates`, runs `codex exec [resume <thread_id>]` per message, posts the captured `--output-last-message` back via `sendMessage`.
- Hooks in `scripts/hooks/{boot,pre-run,post-run}.sh` recreate the equivalents of Claude Code's SessionStart / UserPromptSubmit / Stop. pre-run does image safety; post-run strips markdown + writes the daily log; boot fires the FIRST_RUN.md welcome on first launch.
- Admin commands intercepted by the daemon: `/help`, `/status`, `/reset` (new codex thread), `/restart` (full bridge respawn).
- Cron uses `scripts/run-cron.sh`, wrapping `codex exec` in `with-timeout.sh 1200`. The pattern mirrors the claude-side cron.
- AGENTS.md is auto-loaded by Codex (native behavior — no need for a `CLAUDE.md` entry file).
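The daemon's admin-versus-codex dispatch can be sketched as (function names are hypothetical stand-ins for the bridge internals):

```python
ADMIN_COMMANDS = {"/help", "/status", "/reset", "/restart"}

def dispatch(text, run_codex, run_admin):
    """Admin commands are intercepted before codex sees anything;
    every other inbound message becomes a codex exec turn."""
    cmd = text.split()[0] if text.strip() else ""
    if cmd in ADMIN_COMMANDS:
        return run_admin(cmd)
    return run_codex(text)
```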
Provision skills
- `provision-agent`: passes `--host codex` only when the user explicitly asks for it. Default stays claude.
- `provision-clone`: clones inherit the parent's host. Codex-flavored `--clone-of` is not yet supported (the symlink layout assumes Claude Code's `.claude/` tree); fall back to a fresh `provision-agent --host codex`.
Why
asst (a hermit) was burning ~$350/day on Claude API to run 8 agents. ChatGPT Pro 20x ($200/mo) gives 300–1600 messages per 5h window — by that math, the codex flavor can cover most hermit workloads at roughly 50× lower cost. Default stays claude because (a) Anthropic has been the upstream investment, (b) the official telegram plugin's `--channels` integration is genuinely nicer than a polling daemon, and (c) some hermit workflows (`cron -p`, MCP plugin marketplace) lean on Claude Code features that codex doesn't have direct equivalents for.
Smoke
85/85 (68 claude flavor + 17 codex flavor assertions). Codex flavor verified end-to-end on a separate workspace (codex-demo) before extraction into template-codex/.
Notes
- The Codex bridge daemon is intentionally simple (~250 lines Python). Future iterations may grow it.
- `~/.codex/sessions/` accumulates a jsonl per thread; users on long conversations can `/reset` for a fresh thread to avoid token bloat.
- ChatGPT subscription rate-limits are per-window (5h); users running heavy crons + chat might want to monitor `codex login status` and the daily ccusage breakdown.