fix: support Docker-backed Hermes installs in SSH mode #435

Open
dergachoff wants to merge 2 commits into fathah:main from dergachoff:fix/ssh-docker-hermes

@dergachoff

Fixes #432

Problem

SSH mode assumes Hermes is installed directly on the remote host:

  • the hermes CLI is on PATH or in known host venv paths
  • Hermes data lives under ~/.hermes

Docker-backed Hermes Agent installs keep the CLI inside the container and persist data through the container /opt/data mount. In that setup, SSH doctor/version commands can fail, while sessions, config, env, logs, models, and profiles appear empty because desktop SSH helpers read the wrong host paths.

Root cause

src/main/ssh-remote.ts hardcoded host install assumptions in two places:

  • CLI commands used host venv probes, then bare hermes
  • file reads/writes used ~/.hermes paths directly

That misses official Docker image deployments where the working CLI is inside the container and the real Hermes home is the host mount backing /opt/data.
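The mount mapping described above can be sketched as a small pure function. This is only an illustration, not the code in the PR: `hostPathForContainerMount` is a hypothetical name, and it assumes the caller has already captured the JSON output of `docker inspect <container>` over SSH. The `Mounts` array with `Source` and `Destination` fields is the shape `docker inspect` prints.

```typescript
// Shape of one entry in the "Mounts" array printed by `docker inspect`.
interface DockerMount {
  Type: string;
  Source: string; // path on the host
  Destination: string; // path inside the container
}

// Hypothetical helper (not from the PR): given the JSON text of
// `docker inspect <container>`, return the host path that backs
// `containerPath`, or null when nothing is mounted there.
function hostPathForContainerMount(
  inspectJson: string,
  containerPath = "/opt/data",
): string | null {
  const parsed: unknown = JSON.parse(inspectJson);
  // `docker inspect` prints an array with one object per inspected container.
  const first = Array.isArray(parsed) ? parsed[0] : parsed;
  const mounts: DockerMount[] =
    (first as { Mounts?: DockerMount[] } | undefined)?.Mounts ?? [];
  const match = mounts.find((m) => m.Destination === containerPath);
  return match ? match.Source : null;
}
```

A null result would mean the container keeps /opt/data on its own writable layer, in which case there is no host directory for the SSH helpers to read.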

Fix

  • Add remote Hermes home discovery for SSH mode:
    • keep existing host install behavior
    • detect a running nousresearch/hermes-agent container
    • map the container /opt/data mount back to its host path
  • Add Docker-backed CLI fallback via docker exec.
  • Route SSH config/env/session/profile/memory/log/skill/model paths through the resolved Hermes home.
  • Reuse the existing CLI command builder for SSH skill/profile CLI operations.
  • Document Docker-backed SSH requirements in docs/SSH-TUNNEL-VPS.md.
  • Add focused tests for Docker fallback, Docker home mapping, and shell-safe argument quoting.
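As a rough illustration of the Docker CLI fallback and the shell-safe quoting mentioned in the list above, the sketch below builds a `docker exec` command string. The names `shellQuote` and `buildDockerHermesCommand` are hypothetical and are not taken from src/main/ssh-remote.ts; the sketch also assumes the CLI inside the container is invoked as `hermes`.

```typescript
// Quote one argument for a POSIX shell: wrap it in single quotes and
// rewrite each embedded ' as '\'' (close quote, escaped quote, reopen).
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Hypothetical builder (assumption, not the PR's code): run the Hermes CLI
// inside a named container instead of on the host.
function buildDockerHermesCommand(container: string, args: string[]): string {
  return ["docker", "exec", shellQuote(container), "hermes", ...args.map(shellQuote)].join(" ");
}

// Example: an argument containing an apostrophe stays a single shell word.
const example = buildDockerHermesCommand("hermes-agent", ["profile", "it's mine"]);
```

Quoting every argument, including ones that look harmless, keeps the builder simple and avoids deciding case by case which characters are safe.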

Validation

  • npm test -- --run tests/ssh-remote.test.ts tests/ssh-remote-paths.test.ts
  • npm run typecheck
  • npm test
  • npx eslint src/main/ssh-remote.ts tests/ssh-remote.test.ts --quiet
  • npm run build:unpack
  • Smoke-tested the unpacked macOS app with a temporary HERMES_HOME and SSH config; the app-created SSH tunnel reached the remote Hermes /health endpoint.

Note: full npm run lint -- --quiet still reports unrelated existing lint errors under scripts/*.js. The files touched by this PR pass ESLint directly.

@pmos69 pmos69 left a comment

Collaborator

Thanks for jumping on #432. I agree with the direction: the data-home discovery plus Docker exec fallback addresses the core “chat works but Desktop reads the wrong filesystem / cannot find the CLI” report. I also checked this against the staged plan I had for the issue and against the earlier WSL/Docker probe I ran: for a single official nousresearch/hermes-agent container with /opt/data mounted to the host, the new discovery path is the right shape and the generated Docker CLI command worked in that live probe.

Requesting changes for a couple of blocking gaps before this lands:

  1. tests/ssh-remote.test.ts:120 now unconditionally executes bash. On a clean Windows dev environment this focused suite fails with Error: spawnSync bash ENOENT:

    npm test -- --run tests/ssh-remote.test.ts
    FAIL tests/ssh-remote.test.ts > buildRemoteHermesCmd venv probe > quotes the remote bash script safely...
    Error: spawnSync bash ENOENT
    

    This should either skip the execution assertion when bash is unavailable, resolve a known Git Bash path on Windows, or rewrite the test so the Windows test suite does not require a POSIX shell on PATH.

  2. Docker-backed gateway lifecycle is still routed through the host install path. buildGatewayStartCommand, buildGatewayStopCommand, and buildGatewayStatusCommand still use bare hermes gateway ... and $HOME/.hermes/gateway.pid / $HOME/.hermes/gateway.log instead of the resolved Hermes home and/or buildRemoteHermesCmd. In the Docker/Coolify case this means the Gateway page can report stopped even while the container gateway is running, and Start/Stop targets a nonexistent host CLI or the wrong host data directory. This was the main remaining surface in my staged plan: either route these controls through the detected Docker CLI/home, or explicitly treat Docker/Coolify gateways as externally managed so Desktop does not show misleading controls.
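One possible shape for the second point, sketched under assumptions: the gateway command and its pid/log paths are derived from a resolved target rather than from bare `hermes` and `$HOME/.hermes`. The `HermesTarget` type and the function names here are hypothetical, and the `hermes gateway <action>` subcommand form is inferred from the builder names mentioned above rather than confirmed.

```typescript
// Hypothetical description of where Hermes lives on the remote host.
type HermesTarget =
  | { kind: "host"; home: string }
  | { kind: "docker"; container: string; home: string };

type GatewayAction = "start" | "stop" | "status";

// Single-quote a value for a POSIX shell, escaping embedded apostrophes.
function quoteForShell(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build the lifecycle command for the resolved target (assumed CLI form).
function buildGatewayCommand(target: HermesTarget, action: GatewayAction): string {
  const cli =
    target.kind === "docker"
      ? `docker exec ${quoteForShell(target.container)} hermes`
      : "hermes";
  return `${cli} gateway ${action}`;
}

// Derive pid/log locations from the resolved home instead of $HOME/.hermes.
function gatewayFiles(target: HermesTarget): { pidFile: string; logFile: string } {
  return {
    pidFile: `${target.home}/gateway.pid`,
    logFile: `${target.home}/gateway.log`,
  };
}
```

The alternative named in the review, treating Docker/Coolify gateways as externally managed, would instead hide or disable these controls when `target.kind` is `"docker"`.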

One related design risk: dockerHermesContainerProbe() picks the first running nousresearch/hermes-agent container, and the home cache key only uses SSH host/user/port/key. If a Coolify host has more than one Hermes container, Desktop may tunnel chat to one configured remotePort but read/write config/sessions via another container’s mount. That may be acceptable as a follow-up if this PR is deliberately “single official container only”, but it should at least be documented or made explicit with a container override/selection path.
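To make the cache-key concern concrete, here is a minimal sketch of a key that also includes the selected container. The `SshHermesTarget` interface and `homeCacheKey` function are hypothetical names for illustration, not the identifiers used in the PR.

```typescript
// Hypothetical description of an SSH connection plus optional Docker target.
interface SshHermesTarget {
  host: string;
  user: string;
  port: number;
  keyPath: string;
  container?: string; // undefined for a plain host install
}

// Build a cache key that changes whenever the selected container changes,
// so two containers on one SSH host never share a resolved Hermes home.
function homeCacheKey(target: SshHermesTarget): string {
  // JSON.stringify of a fixed-order array avoids ambiguity from separators
  // that might also appear inside a field value.
  return JSON.stringify([
    target.host,
    target.user,
    target.port,
    target.keyPath,
    target.container ?? null,
  ]);
}
```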

Validation run locally:

  • npm test -- --run tests/ssh-remote.test.ts fails on Windows as above.
  • npm run typecheck passes.
  • npm run build passes.

@pmos69

pmos69 commented May 28, 2026

Collaborator

Small clarification on my review: I am not asking for the implementation to follow my earlier staged plan specifically. Any shape is fine if the Docker-backed SSH mode works reliably.

The two concrete blockers I meant are behavior-level issues:

  • The focused SSH test suite currently fails on Windows because the new test unconditionally executes bash.
  • For Docker-backed installs, the Gateway lifecycle controls should either work through the detected Docker/container path, or be clearly treated as externally managed/unsupported so Desktop does not show misleading Start/Stop/Status behavior that still targets the host hermes CLI / $HOME/.hermes.

The “first matching container” point is more of a design caveat/follow-up unless it causes common Coolify multi-container setups to read/write the wrong container’s data.

@dergachoff dergachoff force-pushed the fix/ssh-docker-hermes branch from 88f2d60 to 7fe6561 Compare May 30, 2026 05:56
@dergachoff

Author

@pmos69 thanks for the review!

Lack of Windows compatibility was my mistake, thanks for noticing. As for multiple containers, I decided it would be more valuable for the end user if the solution weren't half-assed: we should be able to detect multiple containers and let the user pick.

So I updated the patch to avoid first-container selection by adding an explicit SSH Docker target selector in Settings/Welcome. The selected container is now threaded through CLI, home resolution, cache keys, and gateway start/stop/status. I also made the bash/python execution test skip when those tools are unavailable, while keeping the portable assertions.

Validated with targeted SSH/config/preload/ipc tests, 'npm run typecheck', targeted ESLint on touched files, 'npm run build', and 'git diff --check', plus a local build and verification on my remote container setup.

@pmos69

pmos69 commented May 30, 2026

Collaborator

Live-tested the updated branch against a WSL SSH host with a Docker-style Hermes setup, including multi-container selection and gateway lifecycle routing through the selected container. Everything behaved as expected.

One small remaining change before merge: the focused SSH suite still has one Windows-local failure because tests/ssh-remote.test.ts still calls execFileSync("bash", ...) directly in the test "executes apostrophe-bearing arguments safely when bash is available". The earlier Bash-availability guard fixed the other cases, but this one should use the same bashIt/availability pattern or otherwise skip when Bash is not on PATH.
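An availability guard of the kind described above might look like the sketch below. It is an assumption-laden illustration: `isRunnable` and `bashAvailable` are hypothetical names, and the actual `bashIt` helper in tests/ssh-remote.test.ts may be written differently.

```typescript
import { spawnSync } from "node:child_process";

// Returns true when `command` can be spawned and exits with status 0.
function isRunnable(command: string, args: string[] = ["--version"]): boolean {
  const result = spawnSync(command, args, { stdio: "ignore" });
  // `error` is set (for example ENOENT) when the executable cannot be spawned.
  return result.error === undefined && result.status === 0;
}

// Evaluated once at module load so every test can consult the same flag.
const bashAvailable = isRunnable("bash");
```

If the suite runs on Vitest (an assumption, since the thread does not name the test runner), `it.skipIf(!bashAvailable)("executes ...", () => { ... })` would skip the execution assertion on machines without Bash while leaving the string assertions in place.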

After that I’m comfortable with this PR.

@dergachoff dergachoff force-pushed the fix/ssh-docker-hermes branch from 7fe6561 to b7bbcc6 Compare May 31, 2026 13:23
@pmos69

pmos69 commented May 31, 2026

Collaborator

Thanks, this now addresses the main behavior-level concerns I had: Docker-backed gateway start/stop/status is routed through the selected Docker target/resolved Hermes home, and the Settings/Welcome flow now has explicit target inspection/selection instead of blindly using the first matching container.

One requested change is still not fully met: the focused SSH suite still fails on Windows.

I reran:

npm test -- --run tests/ssh-remote.test.ts tests/connection-config-security.test.ts

Current result: tests/connection-config-security.test.ts passes, but tests/ssh-remote.test.ts has 2 failures:

  1. executes apostrophe-bearing arguments safely when bash is available

    This is now wrapped in bashIt, which is the right direction, but bashPath is currently the string "bash". Inside the test, the spawned process overrides PATH to /usr/bin:/bin, so on Windows the nested execFileSync(bashPath, ...) can no longer resolve bash and fails with spawnSync bash ENOENT.

  2. matches only Docker containers published on the configured remote host port

    This test creates a fake docker shim under a Windows temp path and prepends that path to PATH, then executes the generated script through Bash/Python. On this Windows setup, Bash/Python do not resolve that Windows temp path as a POSIX executable path, so the fake Docker shim is never found and dockerContainers comes back empty.

Suggested implementation:

  • Make resolveExecutable("bash") return the absolute executable path from where.exe bash / Get-Command bash, not just "bash".
  • For execution tests that run inside Bash/Python and need fake commands on PATH, either:
    • create the shim in a shell-native temp directory returned by that Bash environment, or
    • convert the Windows temp/bin path to a POSIX path before passing it into the Bash-side PATH, or
    • skip those integration-style execution tests on Windows unless the test can prove the shim is visible from the selected Bash/Python environment.
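For the first suggestion, one way to get an absolute path is sketched below. Note that this walks PATH directly instead of shelling out to `where.exe` or `Get-Command`, which is a swapped-in technique, and that the signature shown is an assumption rather than the existing `resolveExecutable` in the test file. The `exists` parameter is injectable so the lookup can be exercised without touching the real filesystem.

```typescript
import { existsSync } from "node:fs";
import * as path from "node:path";

// Sketch: return the absolute path of `name` found on PATH, or null.
// Only existence is checked; the POSIX executable bit is not inspected.
function resolveExecutable(
  name: string,
  env: { PATH?: string; PATHEXT?: string } = process.env,
  platform: string = process.platform,
  exists: (candidate: string) => boolean = existsSync,
): string | null {
  const isWindows = platform === "win32";
  const p = isWindows ? path.win32 : path.posix;
  const dirs = (env.PATH ?? "").split(p.delimiter).filter(Boolean);
  // On Windows, try each executable extension; elsewhere use the bare name.
  const exts = isWindows ? (env.PATHEXT ?? ".EXE;.CMD;.BAT").split(";") : [""];
  for (const dir of dirs) {
    for (const ext of exts) {
      const candidate = p.join(dir, name + ext);
      if (exists(candidate)) return candidate;
    }
  }
  return null;
}
```

Because the returned value is absolute, a nested `execFileSync(bashPath, ...)` no longer depends on the PATH that the test overrides to /usr/bin:/bin.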

The portable string-assertion tests are fine; the remaining issue is just that the execution tests still assume a POSIX-compatible path model after we intentionally override PATH.
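The path-model mismatch can be illustrated with a small converter from a Windows path to the form a Bash environment expects. This is a sketch, not code from the PR: `windowsPathToPosix` is a hypothetical helper, and the drive prefix is left as a parameter because Git Bash (MSYS) maps `C:\` to `/c` while WSL maps it to `/mnt/c`.

```typescript
// Convert a Windows path such as C:\Users\me\tmp\bin into a POSIX-style path.
// `drivePrefix` is "" for Git Bash (/c/...) and "/mnt" for WSL (/mnt/c/...).
function windowsPathToPosix(winPath: string, drivePrefix = ""): string {
  const match = /^([A-Za-z]):[\\/]*(.*)$/.exec(winPath);
  // No drive letter: only normalise the separators.
  if (!match) return winPath.replace(/\\/g, "/");
  const [, drive, rest] = match;
  const tail = rest.replace(/\\/g, "/");
  return `${drivePrefix}/${drive.toLowerCase()}${tail ? "/" + tail : ""}`;
}
```

A test that prepends a fake `docker` shim directory to the Bash-side PATH would pass the converted form, so the shim becomes visible to the shell that actually runs the generated script.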

@swarthyplacebo

Hi — hitting this exact issue with a nousresearch/hermes-agent Docker deployment on a home NAS (OMV7). SSH tunnel connects fine, health probes succeed, but sessions/models/config/logs all come up empty and Doctor reports "hermes CLI not found on remote PATH or in any known venv location." Chat works if I fall back to Remote mode with a manual SSH forward, but that's a workaround rather than a fix.

PR #435 looks like it addresses this directly. Is there anything blocking a merge — conflicts to resolve, a review pass still needed, or is it in the queue? Would be happy to test against a Docker install if that would help move things forward.

Development

Successfully merging this pull request may close these issues.

bug: SSH mode does not support Docker/Coolify Hermes Agent deployments

3 participants