Skip to content

feat(tlon): add core platform adapter#2

Draft
wca4a wants to merge 67 commits into
mainfrom
tlon/core-platform
Draft

feat(tlon): add core platform adapter#2
wca4a wants to merge 67 commits into
mainfrom
tlon/core-platform

Conversation

@wca4a

@wca4a wca4a commented May 15, 2026

Copy link
Copy Markdown
Owner

Split from NousResearch#26300.

Scope:

  • Register Platform.TLON in gateway config, CLI setup/status, channel directory, toolsets, and send_message routing.
  • Add the initial Tlon adapter with auth, SSE connection, DM/channel send and receive, settings/discovery helper modules, and core adapter tests.

Validation:

  • scripts/run_tests.sh tests/gateway/test_tlon_adapter.py tests/tools/test_send_message_tool.py

Result: 127 passed.

stephenschoettler and others added 30 commits May 14, 2026 14:28
Adds optional channel-context backfill for Discord shared-channel sessions
so the agent can see recent messages it missed between its own turns
(typically when require_mention=true filters out most traffic).

Previously the agent only saw the @mention message that triggered it, which
led to disorienting replies in active multi-user channels where the
conversation context was invisible. With backfill enabled, a configurable
number of recent messages are fetched per-turn and prepended to the trigger
message as a context block, kept separate from sender-prefix logic so
attribution remains clean.

This re-opens the work from NousResearch#13063 (approved by @OutThisLife on 2026-04-20,
closed when I closed the branch to address the simpolism:main head-branch
issue plus an ordering bug I caught later in live use). Filing against the
freshly-rewritten problem statement in NousResearch#13054 so the design is grounded in
the failure mode rather than the implementation shape.

The implementation follows the **push-mode last-self-anchored** design from
the two options laid out in NousResearch#13054. See the issue for the trade-off
discussion vs pull-mode (NousResearch#13120 was an earlier closed PR using that shape).
Treating this as a reference implementation — happy to rewrite as
last-trigger anchoring or as a hybrid with NousResearch#13120 if maintainers prefer.

Changes:

- gateway/platforms/discord.py:
  - new `_discord_history_backfill()` / `_discord_history_backfill_limit()`
    helpers (config.extra > env > default), mirroring the existing
    `_discord_require_mention()` shape
  - new `_fetch_channel_context()` that scans `channel.history()` backwards
    from the trigger to the bot's last message (or limit), formats as
    `[Recent channel messages] / [name] msg / ...`, respects DISCORD_ALLOW_BOTS,
    skips system messages
  - per-channel `_last_self_message_id` cache to narrow the fetch window
    on hot paths (avoids full history scan when the bot has spoken recently)
  - **IMPORTANT**: passes `oldest_first=False` explicitly to `channel.history()`.
    discord.py 2.x silently flips the default to True when `after=` is supplied,
    which would select the EARLIEST N messages after our last response instead
    of the LATEST N before the trigger. In high-traffic windows this would
    return stale tool traces and drop the actual final answer the user is
    asking about. See regression test below. Caught in live use during a
    Codex tool-trace burst on May 13 2026.
- gateway/config.py: discord_history_backfill + discord_history_backfill_limit
  settings + yaml→env bridge
- gateway/platforms/base.py: channel_context field on MessageEvent
- gateway/run.py: prepend channel_context after sender-prefix so the
  [sender name] tag applies to the trigger message alone, not to the backfill
- hermes_cli/config.py: defaults for new discord.history_backfill and
  discord.history_backfill_limit keys
- cli-config.yaml.example: documented defaults
- tests/gateway/test_discord_free_response.py: 7 new tests covering
  cold-start backfill, self-message stop boundary, other-bot filtering,
  cache hot-path narrowing, stale-cache fallback, shared-channel +
  per-user backfill paths, and the ordering regression test
  (`test_fetch_channel_context_cache_uses_latest_window_when_after_set`)
- tests/gateway/test_config.py: yaml→env bridge tests
- tests/gateway/test_session.py: prefix-order edge cases
- website/docs/user-guide/messaging/discord.md: env vars + config keys +
  usage docs

Tested on Ubuntu 24.04 — empirically validated in my own multi-bot Discord
research server for the past three weeks.

Fixes NousResearch#13054
Supersedes NousResearch#13063 (closed)
Follow-up to snav's PR NousResearch#25463 contribution: flip default to on, broaden
scope so backfill fires whenever require_mention gates the bot (not just
shared-session channels).

Why:
- The mention-gate creates a session-transcript gap regardless of whether
  the channel is shared or per-user. In per-user sessions, Alice's session
  is still missing other participants' messages and her own pre-mention
  messages — backfill fills both gaps.
- Threads naturally scope to thread-only history because discord.py's
  channel.history() on a thread returns only that thread's messages.
- DMs still skip — every DM triggers the bot, so the session transcript
  is already complete.

Changes:
- hermes_cli/config.py: discord.history_backfill default → true
- gateway/platforms/discord.py: drop the _is_shared gate, keep _is_dm
  skip and _needed_mention gate; env var DISCORD_HISTORY_BACKFILL
  default → 'true'
- cli-config.yaml.example + website docs: update defaults and prose;
  add the DISCORD_HISTORY_BACKFILL / _LIMIT env var rows that were
  documented in the PR description but missing from the env-var table
- tests/gateway/test_discord_free_response.py:
  - flip test_discord_per_user_channel_does_not_backfill →
    test_discord_per_user_channel_backfills_too (new behavior)
  - add test_discord_dm_does_not_backfill (DM skip is invariant)
  - give FakeThread a no-op history() so existing thread tests don't hit
    a fake discord.Forbidden when backfill now fires on threads too

Tests: 160/160 in target files; 400/400 across all tests/gateway/ -k discord.
The prebuild step used `rm -rf` and `cp -r`, which fail on Windows
(`'rm' is not recognized`). Replace with an inline Node one-liner
using fs.rmSync / fs.cpSync so the build works on Windows, macOS,
and Linux without adding a dependency.
…Research#25978)

Pre-existing diagnostics below an edit point used to surface as 'LSP
diagnostics introduced by this edit' whenever the edit deleted or
inserted lines.  The delta-filter key included the diagnostic's
range, so the same logical error reported at a different line in
the post-edit snapshot looked like a brand new diagnostic.

Concrete case: deleting 14 lines in cli.py caused Pyright errors at
lines 9873, 10590, 12413, 13004 (unrelated to the edit) to be
reported as introduced by it.

Fix: build a piecewise-linear line-shift map (via difflib's
SequenceMatcher) from pre and post content, and remap baseline
diagnostics into post-edit coordinates before the set-difference.
Diagnostics in deleted regions drop out cleanly; diagnostics below
the edit shift by the right amount; diagnostics above are untouched.
The strict (range-aware) equality key stays — so a genuinely new
instance of an identical error class at a different line still
surfaces as new.

Pieces:
- agent/lsp/range_shift.py — build_line_shift, shift_diagnostic_range,
  shift_baseline.  Pure functions, no LSP state.
- agent/lsp/manager.py — LSPService.get_diagnostics_sync gains an
  optional line_shift kwarg; baseline is shift_baseline'd before
  computing the seen-set.  _diag_key keeps the strict range key.
- tools/file_operations.py — write_file captures pre_content for any
  LSP-handled extension (not just LINTERS_INPROC) and passes pre/post
  to _maybe_lsp_diagnostics, which builds the shift map.
- New _lsp_handles_extension helper guards the pre_content read.

Trade-offs preserved:
- Genuinely new same-class errors at different lines still surface
  (content-only key would have swallowed them).
- Pre-existing errors at unshifted positions still get filtered
  (covered by the strict-key path with no shift).
- Best-effort: when pre_content can't be captured (file didn't
  exist, permissions), the unshifted comparison still catches
  most pre-existing errors; the edge case it misses is a new file
  with a non-empty baseline, which is structurally impossible.
Three Windows-only bugs in the web-dashboard build path. Each is small,
scoped, and verified end-to-end on Windows 11 — including under a stock
cmd.exe / PowerShell console with its default cp1252 encoding.

1. `sync-assets` shells out to Unix-only commands

   web/package.json hard-codes `rm -rf … && cp -r …`. Neither exists on
   Windows cmd.exe. `hermes_cli/main.py::_build_web_ui` runs npm via
   subprocess (which on Windows defaults to cmd.exe), so the prebuild
   hook crashed before Vite ever ran and the dashboard never built.

   Fix: web/scripts/sync-assets.mjs — ~20 lines of Node using fs.rmSync
   + fs.cpSync (stdlib, Node >= 16.7). No new deps, identical behavior
   on POSIX and Windows.

2. Build failures were silent

   _build_web_ui ran both subprocess calls with capture_output=True and
   never relayed the captured buffers on failure. Users saw 'Web UI
   build failed' and nothing else — no stdout, no stderr, no hint that
   the real problem was 'rm is not recognized'.

   Fix: inner _relay() helper that decodes and prints stdout + stderr
   (utf-8, errors='replace') whenever a step returns non-zero. Replaces
   the existing stderr_tail-only relay on the build path; success path
   is unchanged. (stderr_tail is preserved for the stale-dist fallback
   branch added by NousResearch#23817.)

Salvaged from NousResearch#13368 by @johnisag onto current main. Conflict
resolution preserves main's improvements:
- _run_npm_install_deterministic() (replaces bare subprocess.run for
  npm install)
- npm-build retry-after-sleep for Windows boot-time races (NousResearch#23817)
- stale-dist fallback for non-interactive callers (NousResearch#23817)

Closes NousResearch#25073, NousResearch#13368.
Codex review pointed out that even with the sync-assets fix applied,
_build_web_ui still crashes on a stock Windows console before reaching
npm: Python stdout defaults to cp1252 (or similar) and raises
UnicodeEncodeError when print() hits the arrow/check glyphs used for
status messages (→, ✗, ⚠, ✓). Reproduced locally in PowerShell:

    $ PYTHONIOENCODING=cp1252 python -c "from hermes_cli.main import _build_web_ui; _build_web_ui(Path('web'), fatal=True)"
    UnicodeEncodeError: 'charmap' codec can't encode character '\u2192' ...

The previous PR body claimed "end-to-end verified on Windows 11", but
that was under the venv's default (utf-8) stdout. A plain `py` or
PowerShell invocation would still fail before sync-assets ever ran.

Fix: inner _say() helper that falls back to
  text.encode(sys.stdout.encoding, errors="replace")
when print() raises UnicodeEncodeError. Glyphs degrade to '?' on
ASCII / cp1252 consoles; utf-8 consoles are unaffected. Verified the
full build pipeline runs to completion with PYTHONIOENCODING=cp1252.

Scoped tightly to _build_web_ui (the function this PR already touches);
other call sites in the codebase with the same risk are out of scope.
…gnal_handler

The call site at line 246 is already wrapped in try/except NotImplementedError
(added in NousResearch#25969). The checker just doesn't peek at surrounding context.
Mark with the suppression comment so the blocking check passes.
The 'sessions' command has been registered in the central command
registry since NousResearch#20805 (May 2025) and surfaces in /help and tab-completion,
but the classic CLI's process_command() never had an elif branch for it.
The canonical name fell through and printed 'Unknown command: sessions'.
The TUI side was wired up correctly via the SessionPicker overlay; only
the legacy CLI was missing the dispatch.

Adds _handle_sessions_command() which mirrors /resume's no-arg behavior
inline (the CLI has no overlay primitive equivalent to the TUI picker):

- /sessions and /sessions list  → print the recent-sessions table
- /sessions <id_or_title>       → delegates to _handle_resume_command

Includes regression tests covering the dispatcher wiring (the original
bug) plus the three handler branches.
…-ci-unblocker-after-21012

fix(ci): stabilize shared test state after 21012
AGENT_BROWSER_CHROME_FLAGS is not read by agent-browser CLI.
The correct env var is AGENT_BROWSER_ARGS, with comma-separated values.

This fixes Chrome 'No usable sandbox' crash on Ubuntu 23.10+ systems
where AppArmor restricts unprivileged user namespaces. The detection
logic was correct but the fix used the wrong environment variable name
and space-separated instead of comma-separated args.
Follow-up to the sandbox-bypass env-var fix:

- Update the opt-out gate so a user-provided AGENT_BROWSER_ARGS is also
  respected, not just the legacy AGENT_BROWSER_CHROME_FLAGS. Previously
  the gate only checked the broken legacy var, so a user who pre-set
  AGENT_BROWSER_ARGS would still get clobbered by Hermes's auto-injection.
- Document AGENT_BROWSER_ARGS in .env.example, the browser feature page,
  and the env var reference, with notes about the auto-injection on
  AppArmor-restricted systems (Ubuntu 23.10+, DGX Spark, containers).
- Add Anadi Jaggia to AUTHOR_MAP.
On macOS with uv-managed cPython 3.11, the default kqueue selector cannot
register fd 0, so prompt_toolkit's loop.add_reader raises
OSError(EINVAL) ("[Errno 22] Invalid argument") from kqueue.control()
and the agent crashes immediately on startup (NousResearch#5884, also reported in
NousResearch#6393).

Probe KqueueSelector.register(0, EVENT_READ) before launching
prompt_toolkit. If it fails, install an event-loop policy that returns a
SelectorEventLoop backed by SelectSelector — select() works fine on
stdin in this Python build, so add_reader succeeds and the agent
launches normally.

Also extend the existing NousResearch#6393 fallback handler to recognize EINVAL /
EBADF / "Invalid argument" so that any future selector failure on stdin
shows the friendly "reinstall Python via pyenv or Homebrew" guidance
instead of an opaque traceback.

Verified on macOS (Darwin 24.6.0) with uv-managed cPython 3.11.15: the
kqueue probe fails, the policy switch fires, and `hermes` launches
cleanly. No effect on platforms where kqueue can register fd 0.
When the auxiliary client falls through Nous (e.g. no stored auth, or
runtime credential mint failed), users currently see only `debug`-level
lines, so the next provider in the fallback chain takes over silently.
Promote the no-auth path to a warning that tells operators to run
`hermes auth`, and add a debug breadcrumb on the rarer
mint-failed-but-stored-auth-still-present fallback path so the existing
behavior (use the raw stored token) is preserved while staying
investigable.

Salvaged from NousResearch#23881 by @0xharryriddle. The contributor's original
patch also short-circuited the second branch with a return, which broke
the pool-entry fallback path covered by
`test_try_nous_uses_pool_entry` — kept the warning intent, dropped the
return so the fallback still works. Dropped the contributor's changes
to `hermes_cli/goals.py` because the goal-pause path is unreachable
when the auxiliary client is None (`judge_goal` returns
`parse_failed=False`, which resets `consecutive_parse_failures`),
so the reason string they added never surfaces in the pause message.

Refs NousResearch#23876
The ACP Registry manifest (acp_registry/agent.json), the npm launcher
package.json, and the launcher's HERMES_AGENT_VERSION constant must all
match pyproject.toml exactly — tests/acp/test_registry_manifest.py
enforces this lockstep.

Without a release-script hook, the next weekly version bump fails that
test until someone hand-edits four files. Extend update_version_files()
to drive the ACP bump alongside __init__.py and pyproject.toml, and
add tests covering the lockstep and the missing-files no-op path.

Also map adam.manning@gmail.com -> am423 for the salvage commit.
…ousResearch#26106)

* chore: remove Atropos RL environments, tools, tests, skill, and tinker-atropos submodule

Delete:
- environments/ (43 files — base env, agent loop, tool call parsers, benchmarks)
- rl_cli.py (standalone RL training CLI)
- tools/rl_training_tool.py (all 10 rl_* tools)
- tests: test_rl_training_tool, test_tool_call_parsers, test_managed_server_tool_support,
  test_agent_loop, test_agent_loop_vllm, test_agent_loop_tool_calling,
  test_terminalbench2_env_security
- optional-skills/mlops/hermes-atropos-environments/
- tinker-atropos git submodule + .gitmodules

* chore: remove RL/Atropos references from Python source

- toolsets.py: remove rl toolset block + update comment
- model_tools.py: remove rl_tools group + update async bridging comment
- hermes_cli/tools_config.py: remove RL display entry, _DEFAULT_OFF_TOOLSETS,
  setup block, and rl_training post-setup handler
- tools/budget_config.py: remove RL environment reference in docstring
- tests/test_model_tools.py: remove rl_tools from expected groups
- tests/run_agent/test_streaming_tool_call_repair.py: fix stale cross-reference

* chore: remove rl/yc-bench extras and tinker-atropos refs from pyproject.toml

- Remove rl extra (atroposlib, tinker, fastapi, uvicorn, wandb)
- Remove yc-bench extra
- Remove rl_cli from py-modules
- Remove [tool.ty.src] exclude for tinker-atropos
- Remove [tool.ruff] exclude for tinker-atropos
- Regenerate uv.lock

* chore: remove tinker-atropos from install/setup scripts

- setup-hermes.sh: remove entire tinker-atropos submodule install block
- scripts/install.sh: remove both tinker-atropos blocks (Termux + standard)
- scripts/install.ps1: remove tinker-atropos block
- nix/hermes-agent.nix: remove tinker-atropos pip install line

* chore: remove RL references from cli-config.yaml.example

* docs: remove Atropos/RL references from README, CONTRIBUTING, AGENTS.md

* docs: remove RL/Atropos references from website

- Delete: environments.md, rl-training.md, mlops-hermes-atropos-environments.md
- sidebars.ts: remove rl-training and environments sidebar entries
- optional-skills-catalog.md: remove hermes-atropos-environments row
- tools-reference.md: remove entire rl toolset section
- toolsets-reference.md: remove rl row + update example
- integrations/index.md: remove RL Training bullet
- architecture.md: remove environments/ from tree + RL section
- contributing.md: remove tinker-atropos setup
- updating.md: remove tinker-atropos install + stale submodule update

* chore: remove remaining RL/Atropos stragglers

- hermes_cli/config.py: remove TINKER_API_KEY + WANDB_API_KEY env var defs
- hermes_cli/doctor.py: remove Submodules check section (tinker-atropos)
- hermes_cli/setup.py: remove RL Training status check
- hermes_cli/status.py: remove Tinker + WandB from API key status display
- agent/display.py: remove both rl_* tool preview/activity blocks
- website/docs: remove RL references from providers.md + env-variables.md
- tests: remove TINKER_API_KEY from conftest, set_config_value, setup_script

* chore: remove RL training section from .env.example
The ACP Registry schema supports uvx as a first-class distribution method
alongside npx and binary. Pointing the registry directly at the existing
hermes-agent PyPI release removes:

- the @NousResearch npm scope (we don't own it)
- a separate npm publish step on every weekly release
- 90 lines of Node launcher + tests in packages/hermes-agent-acp/

The Zed registry now installs Hermes via:

  uvx --from 'hermes-agent[acp]==<version>' hermes-acp

This is the same command the npm launcher was shelling out to anyway, so
end-user behavior is unchanged. Registry CI validates the PyPI URL +
version-pin exact match automatically.

Changes:
- acp_registry/agent.json: distribution.npx -> distribution.uvx
- delete packages/hermes-agent-acp/ entirely
- scripts/release.py: drop npm-launcher bump paths, keep manifest lockstep
- tests/acp/test_registry_manifest.py: assert uvx shape + version pin
- tests/scripts/test_release_acp_registry.py: rewrite for uvx-only shape
- docs (user-guide + dev-guide): drop all npm-launcher references
- delete docs/plans/acp-registry-zed-integration.md (stale, npm-shaped)

Validated against agentclientprotocol/registry agent.schema.json via
jsonschema. hermes-agent==0.13.0 is already live on PyPI.
…chments

Discord's CDN serves attachments with Content-Encoding: br. aiohttp's
compression_utils tries 'import brotlicffi as brotli' first and falls back
to google's Brotli, but Brotli<1.2.0's Decompressor.process() is 1-arg
while aiohttp calls it with 2 args (data, max_length). Result: every
.txt/.md/.doc uploaded to a Discord-gateway session fails to decode at
att.read() with 'Can not decode content-encoding: br' / 'TypeError:
process() takes exactly 1 argument (2 given)', the agent never sees the
bytes, and falls back to filesystem guessing.

Pin brotlicffi==1.2.0.1 in both surfaces:

  - tools/lazy_deps.py 'platform.discord' tuple: Discord users on the
    lazy-install path get it on first discord.py import.
  - pyproject.toml [messaging] extra: users who explicitly install
    hermes-agent[messaging] (skipping the lazy path) get it eagerly.

brotlicffi wins aiohttp's import race regardless of what else is
installed (try brotlicffi / except: import brotli), so existing setups
that already pulled google's Brotli transitively don't change behavior
beyond the bug fix. ~1.5 MB wheel, manylinux/macOS/Windows coverage.

E2E verified: round-trip decode of Brotli-compressed payload via
aiohttp.compression_utils.brotli succeeds with brotlicffi pinned; same
test against Brotli==1.1.0 alone reproduces the reported TypeError.

Credit to @Korkyzer for the original diagnosis and fix shape in NousResearch#15744;
the lazy-deps gating layer was added on top to keep brotlicffi out of
the install path for users who don't run a Discord gateway.

Fixes NousResearch#12511.
Closes NousResearch#15744.

Co-authored-by: Korky <korkyzer@gmail.com>
Two long-standing prompt_toolkit bugs in the base hermes CLI:

1. Resize duplication. Column-shrink resize used to push 40+ rows of
   duplicate chrome (status bar, input rules) into terminal scrollback
   every resize. Same wall as pt issues NousResearch#29 (open since 2014), NousResearch#1675,
   NousResearch#1933 — aider/xonsh/ipython all use alt-screen to dodge it.

   Root cause (verified by reading prompt_toolkit/renderer.py):
   _output_screen_diff (renderer.py L232-242) deliberately moves the
   cursor to the bottom of the canvas after every paint 'to make sure
   the terminal scrolls up'. In non-fullscreen mode this scrolls chrome
   content into terminal scrollback on every render — not just on
   resize.

   Fix: monkey-patch prompt_toolkit.renderer._output_screen_diff to
   bypass the reserve-vertical-space cursor move. When pt's logic checks
   'if current_height > previous_screen.height', we inflate the previous
   screen height so the branch falls through. ~30-line wrapper, no fork
   of pt, no alt-screen, no DECSTBM scroll region.

   Verified empirically in real Terminal.app: 10 resizes (mixed
   shrinks/widens 1300→500→1400) during streaming produced ZERO
   scrollback delta, full agent response preserved, status bar pinned
   at bottom, no visible duplicates. pt is pinned to ==3.0.52 so the
   private-function patch is safe; future pt bumps will need to
   re-verify the signature matches.

2. Light-mode terminal visibility. Hardcoded skin colors (#FFF8DC
   cornsilk, #FFD700 gold, #B8860B dark goldenrod) are tuned for dark
   Terminal.app — invisible on light/cream backgrounds.

   Port ui-tui/src/theme.ts detectLightMode() to Python so the base CLI
   adapts. Detection priority: HERMES_LIGHT/HERMES_TUI_LIGHT env →
   HERMES_TUI_THEME=light|dark → HERMES_TUI_BACKGROUND=#RRGGBB →
   COLORFGBG env (xterm/Konsole/urxvt) → OSC 11 query
   (\x1b]11;?\x1b\\) with 100ms timeout → default dark. OSC 11 is
   tty-gated so gateway/cron/batch/subagent code paths don't pay the
   timeout cost.

   When light mode is detected, dark-mode colors auto-remap to readable
   equivalents (#FFF8DC → #1A1A1A, #FFD700 → #9A6B00, etc). Hooked at
   three points:
   - _hex_to_ansi() — auto-remaps any color emitted via the ANSI helper
   - _build_tui_style_dict() — rewrites pt style strings (chrome bg/fg)
   - SkinConfig.get_color() — wrapped at module load so Rich Panel
     borders/body text get the remap too

   Status-bar foreground colors (#C0C0C0, #888888, etc.) are explicitly
   skipped because they're paired with a dark navy bg — remapping them
   would make them invisible in dark mode.

3. Other visibility fixes: [thinking] reasoning preview now uses ANSI
   dim+italic (\x1b[2;3m) instead of #B8860B so it inherits terminal
   default fg color. Input/prompt area defaults to terminal default fg
   (was #FFF8DC cornsilk → invisible on cream).

Co-authored-by: Brooklyn Nicholson <brooklyn.bb.nicholson@gmail.com>
Adds 16 unit tests covering the light/dark terminal detection path
introduced in the previous commit:

- Env override priority (HERMES_LIGHT, HERMES_TUI_LIGHT,
  HERMES_TUI_THEME, HERMES_TUI_BACKGROUND, COLORFGBG)
- Detection cache stickiness
- _maybe_remap_for_light_mode() no-op in dark mode
- Known dark-mode color remap (#FFF8DC -> #1A1A1A etc)
- Case-insensitive lookup
- Unknown color passthrough
- Status-bar paired colors (#C0C0C0, #888888, #555555, #8B8682) are
  intentionally NOT remapped — regression guard for the patch-11 fix,
  since remapping them would produce dark-on-dark on the status bar's
  navy bg
- SkinConfig.get_color() wrapper is installed and idempotent
- SkinConfig.get_color() does remap in light mode and passes through
  in dark mode

We don't try to fake an OSC 11 reply — that path is exercised
end-to-end in real Terminal.app; the env-override path covers the
algorithmic logic.
…store full-width borders (NousResearch#26163)

NousResearch#25975 (salvaging NousResearch#24403) clamped decorative scrollback Panels and
streaming box rules to `max(32, min(width, 56))` as a defense against
terminal-emulator reflow when columns shrink. On any modern wide
terminal this made the response/reasoning borders look stubby — 56
cols inside a 200-col viewport.

NousResearch#26137 (salvaging NousResearch#25981, by @OutThisLife) landed a more fundamental
fix: prompt_toolkit's `_output_screen_diff` is monkey-patched so its
reserve-vertical-space cursor move no longer pushes chrome into
scrollback at all. With that in place, the clamp is no longer
load-bearing for the chrome-into-scrollback class of bugs — the
remaining risk is purely cosmetic reflow of *already stamped*
Panel borders during an aggressive column shrink, which we now
accept as a tradeoff for restoring proper full-width rendering.

Changes:
- `_scrollback_box_width()` returns `max(32, width)` (just the floor,
  no upper cap). All 10 call sites stay valid.
- Updated `test_scrollback_box_width_caps_to_resize_safe_value` to
  the new `test_scrollback_box_width_returns_viewport_width` asserting
  full-width passthrough above the 32-col floor.

Floor of 32 is kept so `'─' * (w - 2)` math stays positive on tiny
terminals.

Refs NousResearch#18449 NousResearch#19280 NousResearch#22976 (the original reflow class) and NousResearch#25975
(the clamp this reverts).
The freeform /goal judge was capped at max_tokens=200, which reliably
truncated the JSON verdict on reasoning-heavy models (deepseek-v4-pro,
qwq, etc.) — the model burns tokens on hidden reasoning before emitting
visible content, and the first /goal turn's prompt is larger than later
turns, blowing past 200. Symptom: agent.log shows
`judge reply was not JSON: '{"done": true, "reason": "The agent successfully'`
followed by repeated `judge returned empty response` lines, then the
goal pauses with a misleading 'judge model isn't returning the required
JSON verdict' message.

Diagnosed live by @helix4u — empirically verified that raising the
budget on an unmodified worktree makes the failures go away on the
exact configs users were hitting on Nous Plus subscription paths.

Changes:
- DEFAULT_JUDGE_MAX_TOKENS = 4096 (up from 200)
- New auxiliary.goal_judge.max_tokens config knob for tuning in
  specifically constrained setups
- _goal_judge_max_tokens() resolves the value with fail-open semantics
  (non-int / non-positive / load failure → default). load_config() is
  mtime-cached so per-turn lookup is cheap.

Scoped narrowly to the verified root cause — does not introduce a
submit_verdict tool-call schema (see NousResearch#26162 / NousResearch#23671 for that direction;
they can land separately if we want them).

Tests: tests/hermes_cli/test_goals.py + tests/cli/test_cli_goal_interrupt.py
+ tests/gateway/test_goal_verdict_send.py — 62/62 passing.

E2E verified: config override honored (8192), missing/garbage/zero
values fall back to 4096, no-auxiliary-section falls back to 4096.

Co-authored-by: helix4u <4317663+helix4u@users.noreply.github.com>

Credits:
- @helix4u (Gille) — diagnosed the max_tokens=200 truncation via live
  testing on an unmodified worktree, drafted the original fix shape
  in NousResearch#26162.
- @AhmetArif0 — flagged the freeform judge fragility in NousResearch#23671 from
  the tool-call angle.
- @0xharryriddle (HarryRiddle.eth) — reported the issue from a Nous
  Plus subscription setup in NousResearch#23876 with full debug reports.

Closes NousResearch#23876
Supersedes NousResearch#26162, NousResearch#23671, NousResearch#23881
…sResearch#26148)

* ci(pypi): add publish workflow for automated PyPI releases

Triggered by CalVer tag pushes from scripts/release.py (v20* pattern).
Three jobs: build (uv build) → publish (OIDC trusted publishing) → sign
(Sigstore + attach to existing GitHub Release).

- workflow_dispatch as manual escape hatch
- skip-existing for safe re-runs
- Graceful skip when GitHub Release not found (sign job)
- Top-level permissions: contents: read (CodeQL compliant)

Requires one-time setup: PyPI trusted publisher + GitHub pypi environment.

Co-authored-by: dmahan93 <44207705+dmahan93@users.noreply.github.com>

* fix(release): address review findings

- Stage acp_registry/agent.json in version bump commit (was silently left unstaged)
- Add missing return when no previous tags found without --first-release
- Fix get_pr_number return type annotation (str -> str | None)
- Prefer uv build over python -m build (matches CI workflow), with fallback
- Use unit separator (%x1f) in git log format to handle | in author names
- Add explicit encoding='utf-8' to .release_notes.md write

Workflow hardening:
- Gracefully skip signing when GitHub Release not found (env var gate
  instead of exit 1, so PyPI publish still shows green)

* fix(ci): harden PyPI workflow — SHA-pin actions, guard workflow_dispatch, explicit build flags

- Pin all actions to commit SHAs (supply-chain hardening for id-token:write)
- workflow_dispatch now requires confirm_tag input + checks out that tag
- Both uv build paths explicitly pass --sdist --wheel

---------

Co-authored-by: dmahan93 <44207705+dmahan93@users.noreply.github.com>
buntingszn and others added 25 commits May 15, 2026 01:36
Cron mutation operations (run/pause/resume/remove) and 'hermes cron edit'
now accept a job name in addition to the hex ID, with case-insensitive
matching. Before this, 'hermes cron run my_job_name' died with
'Job with ID my_job_name not found' and forced the user to look up the
hex ID first.

The original PR matched by name but silently picked the first match when
two jobs shared a name. This version refuses to act on an ambiguous name
and surfaces every matching job (id, name, schedule, next_run_at) so the
caller can pick a specific ID.

- cron/jobs.py:
  - get_job() stays ID-only (preserves existing call-site semantics for
    web_server/api_server/curator/scheduler/test code that always passes
    real IDs).
  - resolve_job_ref() is the new name-or-ID resolver, used by pause/
    resume/trigger/remove_job. Exact ID match wins over a name match
    even if a different job's name happens to equal that ID. Ambiguous
    name match raises AmbiguousJobReference with all candidate IDs.
- tools/cronjob_tools.py: dispatch site uses resolve_job_ref, surfaces
  ambiguous matches as a structured error with the matching IDs.
- hermes_cli/cron.py: 'cron edit' uses resolve_job_ref so editing by
  name works and ambiguous names are reported with IDs.
- tests/cron/test_jobs.py: new TestResolveJobRef covering ID match,
  case-insensitive name match, ID-wins-over-name, ambiguous refusal,
  and that pause/resume/trigger/remove all refuse on ambiguity.

Closes NousResearch#2627
…gistry installs

The Zed ACP Registry path (uvx --from 'hermes-agent[acp]==X' hermes-acp)
gets a Python-only install. Browser tools depend on the agent-browser npm
package + Chromium, neither of which are in the wheel. Without an
explicit bootstrap, registry users have no path to working browser tools.

Ship a bundled, idempotent bootstrap script (Linux/macOS bash + Windows
PowerShell) inside acp_adapter/bootstrap/ as wheel package-data. New
entry points:

  hermes acp --setup-browser        # interactive; prompts before Chromium download
  hermes acp --setup-browser --yes  # non-interactive
  hermes-acp --setup-browser

The terminal-auth flow (hermes acp --setup) also offers the browser
bootstrap as a follow-up after model selection, so first-run registry
users get the option without knowing the flag exists.

Key design choices:
- npm install -g --prefix $NODE_PREFIX so we never need sudo. System Node
  on PATH is respected; only the install target is redirected to the
  user-writable Hermes-managed Node prefix.
- tools/browser_tool.py::_browser_candidate_path_dirs() already walks
  $HERMES_HOME/node/bin, so installed binaries are discovered with no
  agent-side code change.
- System Chrome/Chromium detection short-circuits the ~400 MB Playwright
  download when a suitable browser already exists.
- Bash + PowerShell live as ONE copy each under acp_adapter/bootstrap/.
  Not duplicated under scripts/. install.sh and install.ps1 keep their
  inline browser blocks for the source-checkout path.

E2E validated end-to-end:
  bash bootstrap_browser_tools.sh --skip-chromium
    → installs agent-browser into ~/.hermes/node/bin/
  tools.browser_tool._find_agent_browser()
    → returns the installed path
  check_browser_requirements()
    → returns True (browser tools register)

Tests:
- tests/acp/test_entry.py: 11 tests covering --setup-browser dispatch
  (linux + windows + --yes forwarding + failure propagation), the
  terminal-auth follow-up prompt path, and a package-data wheel-shipping
  assertion that catches any future pyproject.toml regression.

Docs: website/docs/user-guide/features/acp.md gains a 'Browser tools
(optional)' subsection with the two-line install + what-it-does.
SimpleX Chat (https://simplex.chat) is a private, decentralised messenger
with no persistent user IDs — every contact is identified by an opaque
internal ID generated at connection time. This adds it as a Hermes
gateway platform via the plugin system.

The adapter connects to a local simplex-chat daemon via WebSocket,
listens for inbound messages, and sends replies. Originally proposed in
PR NousResearch#2558 as a core-modifying integration; reshaped here as a self-
contained plugin under plugins/platforms/simplex/ with no edits to any
core file. Discovery is filesystem-based (scanned by gateway.config),
and the platform identity is resolved on demand via Platform("simplex").

Plugin contract:
- check_requirements() requires SIMPLEX_WS_URL AND the websockets package
- validate_config() / is_connected() accept env or config.yaml input
- _env_enablement() seeds PlatformConfig.extra (ws_url + home_channel)
- _standalone_send() supports out-of-process cron delivery
- interactive_setup() provides a stdin wizard for hermes gateway setup
- register() wires the adapter into the registry with required_env,
  install_hint, cron_deliver_env_var, allowed_users_env, and a
  platform_hint for the LLM.

Lazy dependency: the websockets Python package is imported inside the
functions that need it. The plugin is importable and discoverable even
when websockets is missing — check_requirements() simply returns False
until `pip install websockets` is run. No new pyproject extras are
introduced.

Environment variables:
  SIMPLEX_WS_URL             WebSocket URL of the daemon (required)
  SIMPLEX_ALLOWED_USERS      Comma-separated allowed contact IDs
  SIMPLEX_ALLOW_ALL_USERS    Set true to allow all contacts
  SIMPLEX_HOME_CHANNEL       Default contact for cron delivery
  SIMPLEX_HOME_CHANNEL_NAME  Human label for the home channel

Closes NousResearch#2557.
- Adds plugins/platforms/simplex docs page to the messaging sidebar
  between LINE and Open WebUI.
- Maps louismichalot@hotmail.com -> Mibayy in scripts/release.py so the
  attribution check on the salvage PR passes.
When running with --yolo, all dangerous command approvals are bypassed.
Make this state visible so users don't forget:

- Banner: '⚠ YOLO mode — all approval prompts bypassed' line in red, only
  shown when YOLO is active. Default case is silent (no extra line, no
  always-on 'restricted' label).
- Status bar: '⚠ YOLO' fragment appended in red (#FF4444 bold) across all
  three width tiers (<52, <76, ≥76) in both the plain-text fallback and
  the fragments builder.

Closes NousResearch#2663

Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>
Replace O(n²) string concatenation of truncated_response_prefix in the
length-continuation retry loop with a list + ''.join(). Functionally
equivalent: same partial response on early return, same prepend on
final assembly. The legacy retry path is capped at 3 iterations, so
the practical wall-clock win is small, but the new idiom matches the
rest of the codebase and removes a needless repeated allocation.

Salvaged from PR NousResearch#2717 (the run_conversation portion only — trajectory
refactor dropped because it silently rewrote </tool_response> to </think>).

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Some catalog endpoints (OpenCode Zen, etc.) sit behind a WAF that
returns 403 for the default Python-urllib/<ver> User-Agent.  The
generic profile-based live fetch in providers/base.py was silently
failing for any such provider — falling through to the static catalog
and missing newly-launched models.

Set a generic 'hermes-cli/<version>' UA on the catalog probe so every
api_key provider profile benefits.  Verified live against opencode-zen:
before this change, profile.fetch_models() raised HTTP 403; after, it
returns 42 models including gpt-5.5, gpt-5.5-pro, kimi-k2.6, glm-5.1
and the *-free variants the static catalog doesn't list.

Also strip the now-stale comment in validate_requested_model() claiming
opencode-zen's /models returns 404 against the HTML marketing site —
the API endpoint at /zen/v1/models returns 200 with valid JSON.

Surfaced by NousResearch#2651 (@aashizpoudel) — fixes the same user-facing gap
their PR targeted, applied at the right layer so all api_key provider
profiles get live catalogs through the same code path.

Co-authored-by: Aashish Poudel <mr.aashiz@gmail.com>
Remove redundant inner `import re` and regex recompilation on every call in
_interpolate_env_vars. Add module-level _ENV_VAR_PATTERN compiled once.

Replace the separate _interpolate_value() in mcp_config.py (which used \w+
and would silently fail on env vars containing hyphens or dots) with the
shared _ENV_VAR_PATTERN from mcp_tool.py. Remove now-unused import re.
Replaces bare `except Exception: pass` with debug-level logging
so failures in local endpoint model discovery are diagnosable
instead of silently hidden.
Three asyncio.gather() calls in tools/web_tools.py ran without
return_exceptions=True. A single failing task (e.g. LLM rate limit on
one URL) would raise out of gather() and discard every other
successfully fetched/summarized result.

Pass return_exceptions=True and filter BaseException entries with a
warning log before unpacking. Affects:

- chunk summarization gather (large web_extract pages)
- firecrawl per-result LLM post-processing
- tavily crawl per-result LLM post-processing

Closes NousResearch#2744
PR NousResearch#2751 salvage. CI requires AUTHOR_MAP coverage for all
contributor commit emails.
When a user sends a Slack message like '/hermes   ' (trailing whitespace
after the slash) the legacy subcommand router hit `text.split()[0]` with
a truthy-but-whitespace-only `text`. `'   '.split()` returns `[]` →
IndexError, blowing up the slash handler before fallthrough to `/help`.

Switch to a two-step guard that materializes the parts list first and
indexes only if non-empty.

Salvaged from PR NousResearch#2752 by @nidhi-singh02. The PR's other two hunks
(`tools/file_operations.py`, `agent/anthropic_adapter.py`) are
unreachable in current code — `LINTERS` is a hardcoded constant dict
with no empty values, and the anthropic version-detection site is
already guarded by a `result.stdout.strip()` truthy check — so only the
slack hunk is taken.

Closes NousResearch#2745

Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
Wrap requests.post() in create_session() for browser_use, browserbase,
and firecrawl providers with requests.RequestException handling.
Connection timeouts and DNS resolution failures now surface as clean
RuntimeError messages instead of raw requests exception tracebacks.

Browser Use managed-gateway mode preserves raw exception propagation
so the existing idempotency-key retry semantics keep working.

Closes NousResearch#2746

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
…_HOME into config.toml

Builds on @steezkelly's Bug A fix (NousResearch#25857, top-level default_permissions
via _insert_managed_block_at_top_level) by addressing the other two
config-corruption bugs described in NousResearch#26250:

Bug B (duplicate [plugins.X] tables)
  - Codex itself writes [plugins."<name>@<marketplace>"] tables to
    config.toml when the user runs `codex plugins enable` directly,
    before hermes-agent's managed block exists. On the next migrate run,
    _query_codex_plugins() re-discovers the same plugins via plugin/list
    and render_codex_toml_section() re-emits them inside the managed
    block. Codex's strict TOML parser then rejects the duplicate table
    header on startup.
  - Add _strip_unmanaged_plugin_tables() that drops [plugins.*] tables
    from the user-content portion of the file. Only run it when
    plugin/list succeeded — if the RPC failed we can't re-emit and
    must preserve the user's tables. plugin/list is the source of
    truth when it answers.

Bug C (HERMES_HOME pytest-tempdir leak into ~/.codex/config.toml)
  - _build_hermes_tools_mcp_entry() read HERMES_HOME directly from
    os.environ, so a sibling pytest's monkeypatch.setenv("HERMES_HOME",
    tmp_path) silently burned a transient pytest tempdir into the
    user's real ~/.codex/config.toml. After pytest reaped the tempdir,
    every codex-routed hermes-tools tool call failed silently.
  - Derive HERMES_HOME from get_hermes_home() (the canonical resolver
    that goes through the profile-aware path) and refuse to emit
    obvious test-tempdir paths via _looks_like_test_tempdir() as
    belt-and-suspenders for any other callsite that forgets to patch
    migrate().
  - test_enable_succeeds_when_codex_present in test_codex_runtime_switch.py
    invoked the real migrate() (no mock), writing to Path.home() / .codex
    using whatever HERMES_HOME the running pytest session had set. Add
    the same migrate patch the other apply() tests already use, so the
    suite stops touching the user's real ~/.codex/config.toml.

E2E verification (replicating the issue's repro):
  - Pre-state config.toml with user [mcp_servers.omx_team_run] +
    codex-installed [plugins."tasks@openai-curated"],
    HERMES_HOME="/private/var/folders/.../pytest-of-.../..."
  - On origin/main: tomllib refuses to load the result with
    "Cannot declare ('plugins', 'tasks@openai-curated') twice" AND
    the pytest-tempdir HERMES_HOME is burned in.
  - On this branch: file parses cleanly, default_permissions is
    top-level, exactly one [plugins."tasks@openai-curated"] table
    inside the managed block, no HERMES_HOME in the MCP env.

7 new regression tests covering all three bugs + the test-leak guard.
`bash scripts/run_tests.sh tests/hermes_cli/test_codex_runtime_*.py` —
95 passed, 0 failed.

Closes NousResearch#26250
…2345 salvage (NousResearch#26319)

PR NousResearch#22345 by @btorresgil authors commits as 'Brian Conklin
<brian@dralth.com>' (git config carries a different name/email than the
GitHub account). GitHub's commit-author mapping correctly attributes these
commits to @btorresgil based on the public-key registration, but Hermes'
release attribution audit reads the raw commit email, not the GitHub
mapping. Without this AUTHOR_MAP entry, salvaging NousResearch#22345 would fail
`scripts/contributor_audit.py` strict mode at release time.

Prerequisite for the langfuse trace fix salvage that cherry-picks
@btorresgil's commits onto current main.
…placeholder credentials (closes NousResearch#22342, NousResearch#22763) (NousResearch#26320)

* fix(langfuse): reject placeholder credentials with one-shot warning

When operators leave HERMES_LANGFUSE_PUBLIC_KEY / HERMES_LANGFUSE_SECRET_KEY
at a template value like 'placeholder', 'test-key', or 'your-langfuse-key',
the Langfuse SDK silently accepts the credentials at construction time and
drops every trace at flush time. No warning, no error — just an empty
Langfuse dashboard the operator only notices hours later.

Add prefix-based validation in _get_langfuse() against the documented
'pk-lf-' / 'sk-lf-' prefixes that Langfuse always issues server-side.
Anything else fires a single warning naming the offending env var(s)
with a log-safe value preview (full string for short placeholders so the
operator knows which template they left in place; truncated for long
values so a real secret pasted into the wrong field never hits the log),
then short-circuits via the existing _INIT_FAILED cache so the warning
fires once per process, not once per hook invocation.

The check sits after the 'Langfuse is None' SDK-installed guard so hosts
without the optional langfuse SDK don't see misleading 'set real keys'
hints when the actionable fix is 'pip install langfuse'. Missing
credentials remains the documented opt-out path and stays silent — no
log noise for unconfigured installs.

Fixes NousResearch#22763
Fixes NousResearch#23823

* fix(langfuse): use actual API request messages for generation input

on_pre_llm_request previously used the messages kwarg alone, which
could be None when Hermes passes the payload via request_messages,
conversation_history, or user_message instead. Add _coerce_request_messages
to pick the first available list across all variants, falling back to a
synthetic user message. Generations now show the real outbound payload
rather than an empty input.

* fix(langfuse): record tool call outputs in traces

Tool observations showed input (arguments) but output was always
undefined. Root cause: when tool_call_id is empty, pre_tool_call stored
observations under a unique time-based key that post_tool_call could
never reconstruct, so every tool span was closed without output by the
_finish_trace sweep.

Fix pre/post matching by routing empty-tool_call_id tools through a
per-name FIFO queue (pending_tools_by_name) instead of the time-based
key. Tools with a tool_call_id continue to use the id-keyed dict.

Also:
 - Preserve OpenAI-style nested function shape in serialized tool calls
   so Langfuse renders name/arguments correctly
 - Keep name + tool_call_id on role:tool messages for proper pairing
 - Backfill tool results onto the matching turn_tool_calls entry so the
   generation's tool-call record carries the result alongside arguments
 - Coerce request messages from whichever field the runtime provides
   (request_messages, messages, conversation_history, user_message)

* fix(langfuse): salvage-review polish — drop dead is_first_turn, shallow-copy request_messages, real threaded FIFO test

Self-review of the combined NousResearch#22345 + NousResearch#23831 salvage surfaced three issues
worth fixing in the same PR rather than as follow-ups:

1. Drop is_first_turn from the pre_api_request hook. The boolean expression
   `not bool(conversation_history)` was wrong: conversation_history is
   reassigned to None mid-run after compression (5 sites in run_agent.py),
   so the value flips False -> True mid-conversation on every post-compression
   API call. The langfuse plugin never consumed it, so the kwarg was both
   misleading AND dead.

2. Replace copy.deepcopy(request_messages) with shallow list() copy. The
   pre_api_request hook contract discards return values (invoke_hook never
   writes back to api_kwargs), and the langfuse plugin's _serialize_messages
   already builds its own snapshot dicts via _safe_value. A deepcopy on every
   API call would walk every tool result and base64 image — significant
   overhead for no real isolation benefit. Shallow copy of the outer list
   protects against later mutations of api_messages without paying for the
   inner-dict walk.

3. Rename test_empty_tool_call_id_concurrent_fifo_order ->
   test_empty_tool_call_id_observations_are_fifo_within_tool_name and add a
   real test_threaded_post_calls_preserve_fifo_under_lock that spawns 8
   threads behind a barrier to actually exercise _STATE_LOCK on the
   pending_tools_by_name queue. The original test was sequential and only
   validated Python list semantics; this one validates the lock discipline.

4. Fix stale 'Cleared by reset_cache_for_tests()' comment on _INIT_FAILED —
   that function does not exist. Tests reload the module via sys.modules.pop
   + importlib.import_module instead.

Tests: 37 langfuse plugin tests pass, 658 plugin tests overall pass.

---------

Co-authored-by: xxxigm <tuancanhnguyen706@gmail.com>
Co-authored-by: Brian Conklin <brian@dralth.com>
…sResearch#26071) (NousResearch#26327)

* feat(process-registry): add format_process_notification shared helper

* feat(process-registry): add drain_notifications method

* refactor(cli): use shared drain_notifications and format_process_notification

* feat(tui): add background notification poller for completion_queue

* feat(tui): wire notification poller into session init/finalize

* refactor(tui): add post-turn drain using shared helper as safety net
…put renders correctly (NousResearch#26011)

* fix(tui): restrict fast-echo bypass to ASCII so Vietnamese/CJK/IME input renders correctly

The composer's fast-echo path (canFastAppend / canFastBackspace) writes
characters straight to stdout to skip an Ink re-render on the hot
typing path. The previous guard only checked
'stringWidth(text) === text.length', which lets a lot of non-ASCII
through:

  - Vietnamese precomposed letters (ề, ắ, ờ, ự, ...) report width 1 and
    length 1, but a Vietnamese Telex / IME stack produces them across
    multiple keystrokes; the intermediate composition state must be
    drawn by Ink so the rendered cell, the stored value, and the
    cursor column stay in lockstep when the final commit replaces the
    preview.
  - NFD combining marks (U+0300..U+036F) are zero-width but length 1,
    so even a passing equality lets them slip and silently desync the
    cell column.
  - CJK/East-Asian wide and emoji rejected only because their length
    differs, but the boundary was shape-shaped, not intent-shaped.

User-visible bug from the original report:
  Example: eê noiói nge neène
  -> the bypass committed the IME preview char before the diacritic
     replaced it, leaving doubled letters on screen.

Fix: gate fast-echo on pure printable ASCII (0x20-0x7e). The
performance-critical English typing path is unchanged; everything else
goes through the normal Ink render path so layout stays accurate.

Also extracts the shape preconditions as pure exported helpers
(canFastAppendShape / canFastBackspaceShape) so the regression matrix
is testable without spinning up a TextInput.

Tests: ui-tui/src/__tests__/textInputFastEcho.test.ts adds 20 cases
covering ASCII still works, Vietnamese precomposed + NFD, CJK, emoji,
NBSP / Latin-1, ANSI / control bytes, multi-line, and end-of-line
preconditions. Verified RED on the previous guard (11 of 20 fail) and
GREEN on the new guard.

Refs: NousResearch#5221, NousResearch#7443, NousResearch#17602, NousResearch#17603 (similar wide-char rendering bugs).

* docs(tui): clarify Vietnamese char terminology in regression comment

Address Copilot review: 'single byte width' implied UTF-8 byte semantics,
but the relevant property is JS code units (`text.length === 1`) and
display width (`stringWidth === 1`). Reworded to match.
@github-actions

Copy link
Copy Markdown

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@github-actions

Copy link
Copy Markdown

🔎 Lint report: tlon/core-platform vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8320 on HEAD, 8367 on base (✅ -47)

🆕 New issues (44):

Rule Count
unresolved-attribute 14
unresolved-import 11
unsupported-operator 9
invalid-assignment 3
invalid-argument-type 3
unresolved-reference 1
unused-type-ignore-comment 1
call-non-callable 1
invalid-method-override 1
First entries
tests/tools/test_process_registry.py:902: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["watch pattern \"ERROR\""]` and `str | None`
tests/acp/test_entry.py:6: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
gateway/platforms/discord.py:3677: [unresolved-attribute] unresolved-attribute: Attribute `Object` is not defined on `None` in union `Unknown | None`
cli.py:1489: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_hermes_light_mode_hook_installed` on type `<class 'SkinConfig'>`.
tests/tools/test_process_registry.py:889: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["Output:\ndone]"]` and `str | None`
gateway/platforms/tlon.py:2045: [unresolved-attribute] unresolved-attribute: Attribute `poke` is not defined on `None` in union `TlonSSEClient | None`
tests/gateway/test_tlon_adapter.py:376: [unresolved-attribute] unresolved-attribute: Object of type `bound method TlonAdapter.handle_message(event) -> CoroutineType[Any, Any, None]` has no attribute `assert_not_awaited`
acp_adapter/auth.py:40: [unresolved-import] unresolved-import: Cannot resolve imported module `acp.schema`
tests/tools/test_process_registry.py:925: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["[IMPORTANT: Watch disabled for proc_xyz"]` and `str | None`
tests/gateway/test_tlon_adapter.py:4: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/gateway/test_simplex_plugin.py:313: [unresolved-import] unresolved-import: Cannot resolve imported module `websockets.client`
tests/gateway/test_tlon_adapter.py:379: [unresolved-attribute] unresolved-attribute: Object of type `bound method TlonAdapter.send(chat_id: str, content: str, reply_to: str | None = None, metadata: dict[str, Any] | None = None) -> CoroutineType[Any, Any, SendResult]` has no attribute `assert_awaited_once`
tests/tools/test_process_registry.py:903: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["Matched output:\nERROR: disk full"]` and `str | None`
tests/gateway/test_tlon_adapter.py:359: [invalid-assignment] invalid-assignment: Object of type `AsyncMock` is not assignable to attribute `handle_message` of type `def handle_message(self, event) -> CoroutineType[Any, Any, None]`
gateway/platforms/tlon.py:920: [unresolved-attribute] unresolved-attribute: Attribute `delete` is not defined on `None` in union `Any | None`
tests/tools/test_process_registry.py:886: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["[IMPORTANT: Background process proc_abc completed"]` and `str | None`
tests/gateway/test_tlon_adapter.py:315: [unresolved-attribute] unresolved-attribute: Object of type `bound method TlonAdapter.handle_message(event) -> CoroutineType[Any, Any, None]` has no attribute `assert_awaited_once`
gateway/platforms/tlon.py:974: [unresolved-reference] unresolved-reference: Name `aiohttp` used when not defined
gateway/platforms/yuanbao.py:2621: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `MessageType`, found `Literal[MessageType.DOCUMENT] | Any | None`
gateway/platforms/tlon.py:871: [unresolved-attribute] unresolved-attribute: Attribute `put` is not defined on `None` in union `Any | None`
cli.py:13400: [unresolved-import] unresolved-import: Cannot resolve imported module `prompt_toolkit.renderer`
gateway/platforms/tlon.py:1819: [unresolved-attribute] unresolved-attribute: Attribute `scry` is not defined on `None` in union `TlonSSEClient | None`
tests/gateway/test_tlon_adapter.py:360: [invalid-assignment] invalid-assignment: Object of type `AsyncMock` is not assignable to attribute `send` of type `def send(self, chat_id: str, content: str, reply_to: str | None = None, metadata: dict[str, Any] | None = None) -> CoroutineType[Any, Any, SendResult]`
gateway/platforms/tlon_media.py:195: [unresolved-import] unresolved-import: Cannot resolve imported module `aiohttp`
tests/tools/test_process_registry.py:888: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `Literal["Command: sleep 5"]` and `str | None`
... and 19 more

✅ Fixed issues (107):

Rule Count
unresolved-import 62
invalid-assignment 13
unresolved-attribute 13
invalid-argument-type 12
not-subscriptable 2
invalid-parameter-default 2
unsupported-operator 1
invalid-return-type 1
unresolved-reference 1
First entries
tools/rl_training_tool.py:973: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["history"]` and value of type `list[dict[Unknown, Unknown]]` on object of type `dict[str, str]`
tools/rl_training_tool.py:1130: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(i: SupportsIndex, /) -> dict[str, str | int | float], (s: slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> list[dict[str, str | int | float]]]` cannot be called with key of type `Literal["max_num_workers"]` on object of type `list[dict[str, str | int | float]]`
tools/rl_training_tool.py:1276: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["accuracy"]` and value of type `int | float` on object of type `dict[str, str | list[Unknown] | int]`
tools/rl_training_tool.py:693: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `list[dict[str, str | int | float]]`, `bool` in union `dict[str, str | int | float] | list[dict[str, str | int | float]] | bool`
hermes_cli/tools_config.py:2469: [unsupported-operator] unsupported-operator: Operator `in` is not supported between objects of type `dict[Unknown, Unknown]` and `str | list[dict[str, str | list[Unknown] | bool | list[str]] | dict[str, str | list[Unknown]] | dict[str, str | list[dict[str, str]]]] | list[dict[str, str | list[Unknown] | bool | list[str]] | dict[str, str | list[dict[str, str]]]] | ... omitted 4 union elements`
tools/rl_training_tool.py:337: [unresolved-attribute] unresolved-attribute: Unresolved attribute `api_log_file` on type `RunState`
tools/rl_training_tool.py:774: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["wandb_project"]` and value of type `Any` on object of type `list[dict[str, str | int | float]]`
tools/rl_training_tool.py:1129: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(i: SupportsIndex, /) -> dict[str, str | int | float], (s: slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> list[dict[str, str | int | float]]]` cannot be called with key of type `Literal["max_token_length"]` on object of type `list[dict[str, str | int | float]]`
tools/rl_training_tool.py:1131: [not-subscriptable] not-subscriptable: Cannot subscript object of type `bool` with no `__getitem__` method
tests/hermes_cli/test_tools_config.py:534: [invalid-argument-type] invalid-argument-type: Argument to function `_configure_provider` is incorrect: Expected `dict[Unknown, Unknown]`, found `str | dict[str, str | list[Unknown] | bool | list[str]] | dict[str, str | list[Unknown]] | ... omitted 3 union elements`
tests/tools/test_managed_server_tool_support.py:22: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib`
environments/web_research_env.py:67: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.base`
environments/hermes_base_env.py:343: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.server_handling.openai_server`
environments/web_research_env.py:68: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.server_handling.server_manager`
environments/agentic_opd_env.py:83: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.base`
environments/benchmarks/yc_bench/yc_bench_env.py:62: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.base`
environments/benchmarks/terminalbench_2/terminalbench2_env.py:57: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.base`
environments/benchmarks/yc_bench/yc_bench_env.py:63: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.server_handling.server_manager`
environments/hermes_swe_env/hermes_swe_env.py:45: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.base`
tests/hermes_cli/test_tools_config.py:720: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `str`, `int` in union `str | dict[str, str | list[Unknown] | bool | list[str]] | dict[str, str | list[Unknown]] | ... omitted 3 union elements`
environments/benchmarks/terminalbench_2/terminalbench2_env.py:58: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.server_handling.server_manager`
environments/hermes_swe_env/hermes_swe_env.py:46: [unresolved-import] unresolved-import: Cannot resolve imported module `atroposlib.envs.server_handling.server_manager`
environments/web_research_env.py:51: [unresolved-import] unresolved-import: Cannot resolve imported module `pydantic`
tools/rl_training_tool.py:775: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["wandb_run_name"]` and value of type `str` on object of type `list[dict[str, str | int | float]]`
tests/run_agent/test_agent_loop_tool_calling.py:29: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
... and 82 more

Unchanged: 4299 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

wca4a pushed a commit that referenced this pull request May 22, 2026
Three issues flagged by the Copilot review on this PR:

1. Double JSON emit on stage failure (Copilot #1, #2). When -Stage <name>
   ran a worker that threw, Invoke-Stage's finally emitted a JSON result
   frame AND the entry-point catch emitted a second error frame --
   producing two concatenated JSON objects on stdout and breaking the
   one-line-per-invocation contract that drivers parse against. Same
   issue applied to -Json mode on a full install (every stage's finally
   plus a final error frame missing duration_ms/skipped).

   Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame
   when it emits a failure frame; the entry-point catch checks the flag
   and skips its own emit, still exit 1.

2. $prevEAP uninitialized on early try-block throw (Copilot #3). In
   Install-Uv, Test-Python, Test-Node's winget fallback,
   _Run-NpmInstall, and the playwright block, '$prevEAP =
   $ErrorActionPreference' lived as the first statement INSIDE the
   try. If anything between 'try {' and that line threw (Write-Info on
   an unusual host, the npx-finding loop, etc.), the catch's
   'if ($prevEAP) { ... }' restore was a no-op and EAP could remain
   relaxed.

   Fix: hoist '$prevEAP = $ErrorActionPreference' to the line
   immediately before 'try {' in all five sites. Catch's restore is
   now always meaningful regardless of where in the try the throw
   originated.

No change to Invoke-Stage's success path or to the four lint-clean EAP
sites (Test-Node was the only winget-related catch). All 19 metadata
smoke tests still pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.