diff --git a/docs/local-dev.md b/docs/local-dev.md
index ba2b644..98bf4be 100644
--- a/docs/local-dev.md
+++ b/docs/local-dev.md
@@ -57,7 +57,7 @@ sh -n scripts/install-amesh-node.sh
 - The published remote bootstrap path is `curl .../install-amesh-node.sh | ... bash`, so the installer must keep working when Bash reads it from stdin instead of from a file.
 - The installer now logs whether it is reusing or creating config/state, and on systemd hosts it fails the install if the user service does not remain active after startup. When that happens it prints both `systemctl --user status` and recent `journalctl --user -u amesh-node` output.
 - `install-amesh-node.sh` also normalizes `~/.acpx/config.json` so ACPX non-interactive health probes start from a valid baseline on first install.
-- Detected agents now persist the registering shell's `PATH` into node config. This avoids later service-only regressions where a systemd user unit resolves a different `node` binary than the interactive shell that successfully ran the same agent CLI.
+- Detected agents now persist the registering shell's `PATH` into node config and prepend the resolved executable directories for the detected agent CLI and `node`. This avoids later service-only regressions where a systemd user unit or an `fnm` multishell shim resolves a different or stale Node runtime than the interactive shell that successfully ran the same agent CLI.
 - ACP aliases for external clients can be served locally with `go run ./cmd/amesh acp `. The default alias registry is `~/.config/amesh/acp.json`:
 
 ```json
@@ -133,3 +133,17 @@ amesh-node update
 Authenticated admins can also trigger the same node-side updater from the dashboard. The control plane sends a `node.update` command over the existing node websocket, the daemon runs `amesh-node update`, and a managed systemd service should restart back into the new binary after the process exits.
 - The dashboard only shows the update action when the node reports an installed release tag and that tag differs from the control plane's latest known GitHub release tag.
+- Daemon-triggered self-updates reuse the node's active `server`, `config`, and `state` paths and deliberately avoid `systemctl stop` during the update run. The daemon exits after the installer finishes and systemd restarts it into the new binary.
+
+## Remote reinstall
+
+```bash
+amesh-node reinstall
+```
+
+The shared CLI also exposes the same command as `amesh reinstall`.
+
+`reinstall` is the destructive recovery path for a stale or suspect node install. It stops and disables the managed user service, removes the node service file, durable node state, detected agent config, installed `amesh-node` and `amesh` binaries, and the managed `~/.local/share/amesh` payload, then runs the installer again from scratch.
+- Use `reinstall` when you suspect stale node state, stale detected agent inventory, or broken managed ACPX/node wiring.
+- `reinstall` preserves the user ACPX config at `~/.acpx/config.json`; it only wipes amesh-managed node artifacts.
+- On success, the installer re-detects agents, re-registers the node, rewrites the service, and starts the managed daemon again.
diff --git a/docs/past-failures.md b/docs/past-failures.md
index 309e942..a23cebd 100644
--- a/docs/past-failures.md
+++ b/docs/past-failures.md
@@ -113,6 +113,12 @@
 - Consequence: a fresh node could advertise agents, yet the dashboard showed runtime errors like `/usr/bin/env: 'node': No such file or directory` or `toSorted is not a function` once the daemon tried to execute them.
 - Mitigation: detected agent configs now persist the working shell `PATH`, and the installer now fails fast unless `node` `22.x+` is available before it installs the daemon service. Covered by a Go detection test that asserts the saved agent env includes the original `PATH`.
+## 2026-05-15: `fnm` multishell shims made saved agent PATH entries go stale
+
+- Symptom: daemon-side health probes failed with `/usr/bin/env: 'node': No such file or directory` even though detection succeeded in an `fnm` shell and the saved config already included `PATH`.
+- Cause: detection persisted the shell's raw PATH order. In `fnm` environments that can put transient multishell shim directories ahead of the stable Node installation path, so later daemon runs reused a dead shim directory.
+- Mitigation: detected agent env now prepends the resolved executable directories for both the agent CLI and `node`, then appends the original shell `PATH` as fallback. Covered by a Go regression test that simulates `fnm`-style symlink shims.
+
 ## 2026-05-11: Node inventory had no lightweight way to express multiple working directories
 
 - The node config only described base agents, so a single machine could not advertise the same local agent across multiple useful workspaces without hand-editing duplicate agent entries.
diff --git a/docs/testing.md b/docs/testing.md
index 90ad494..e7d7cc4 100644
--- a/docs/testing.md
+++ b/docs/testing.md
@@ -38,3 +38,5 @@
 - The web app also covers the top-bar MCP config panel so the copy-paste client snippets stay aligned with the server endpoint and scope headers.
 - The Go daemon owns table-driven tests for config loading, reconnect logic, update, detect, exposed-path command dispatch, and `acpx` process lifecycle including streamed output and cancellation.
 - The dev helper script also has a regression shell test for the stale local reconnect-token path, so local `pnpm dev:daemon` re-registers automatically after a fresh control-plane reset.
+- The Go daemon also covers the shared `reinstall` subcommand and verifies that reinstall mode passes the destructive reset flag through to the installer.
+- `scripts/test-install-amesh-node.sh` also covers remote self-update and full reinstall flows, including reinstall-time cleanup of stale node state, config, service, binaries, and managed amesh home. diff --git a/install-amesh-node.sh b/install-amesh-node.sh index 5a6e44c..16b6163 100644 --- a/install-amesh-node.sh +++ b/install-amesh-node.sh @@ -18,6 +18,8 @@ SERVICE_PATH="${SERVICE_PATH:-$HOME/.config/systemd/user/${SERVICE_NAME}.service SERVER_URL="${SERVER_URL:-}" REGISTRATION_TOKEN="${REGISTRATION_TOKEN:-}" NODE_ID="${NODE_ID:-$(hostname)-amesh}" +SELF_UPDATE="${AMESH_NODE_SELF_UPDATE:-0}" +REINSTALL="${AMESH_NODE_REINSTALL:-0}" log() { printf '%s\n' "$*" >&2 @@ -194,7 +196,7 @@ main() { need_cmd install need_cmd mkdir - if [[ -z "$SERVER_URL" ]]; then + if [[ -z "$SERVER_URL" && ! -f "$STATE_PATH" ]]; then fail "SERVER_URL is required" fi @@ -219,6 +221,16 @@ main() { tmp_dir="$(mktemp -d)" trap 'rm -rf "${tmp_dir}"' EXIT + if [[ "$REINSTALL" == "1" ]]; then + log "reinstall requested; removing existing node install artifacts" + if command -v systemctl >/dev/null 2>&1; then + systemctl --user stop "$SERVICE_NAME" >/dev/null 2>&1 || true + systemctl --user disable "$SERVICE_NAME" >/dev/null 2>&1 || true + fi + rm -f "$SERVICE_PATH" "$STATE_PATH" "$CONFIG_PATH" "$binary_path" "$cli_binary_path" + rm -rf "$AMESH_HOME" + fi + mkdir -p "${install_dir}" mkdir -p "${AMESH_HOME}" mkdir -p "$(dirname "$STATE_PATH")" @@ -246,7 +258,7 @@ main() { install -m 0755 "${extract_dir}/${cli_binary_name}" "${cli_binary_path}" fi - if command -v systemctl >/dev/null 2>&1; then + if command -v systemctl >/dev/null 2>&1 && [[ "$SELF_UPDATE" != "1" ]]; then systemctl --user stop "$SERVICE_NAME" >/dev/null 2>&1 || true fi @@ -309,16 +321,21 @@ EOF if command -v systemctl >/dev/null 2>&1; then systemctl --user daemon-reload - systemctl --user enable --now "$SERVICE_NAME" - sleep 2 - if ! 
systemctl --user --quiet is-active "$SERVICE_NAME"; then - log "service failed to stay active: $SERVICE_NAME" - systemctl --user --no-pager --full status "$SERVICE_NAME" >&2 || true - journalctl --user -u "$SERVICE_NAME" -n 80 --no-pager >&2 || true - fail "amesh-node user service did not reach active state" + if [[ "$SELF_UPDATE" == "1" ]]; then + systemctl --user enable "$SERVICE_NAME" + log "prepared user service restart after self-update: $SERVICE_NAME" + else + systemctl --user enable --now "$SERVICE_NAME" + sleep 2 + if ! systemctl --user --quiet is-active "$SERVICE_NAME"; then + log "service failed to stay active: $SERVICE_NAME" + systemctl --user --no-pager --full status "$SERVICE_NAME" >&2 || true + journalctl --user -u "$SERVICE_NAME" -n 80 --no-pager >&2 || true + fail "amesh-node user service did not reach active state" + fi + log "installed and started user service: $SERVICE_NAME" + log "service logs: journalctl --user -u ${SERVICE_NAME} -f" fi - log "installed and started user service: $SERVICE_NAME" - log "service logs: journalctl --user -u ${SERVICE_NAME} -f" else log "systemctl not found; service file written to $SERVICE_PATH" log "start manually: AMESH_ACPX_PATH='${ACPX_BIN}' '${binary_path}' run --state '${STATE_PATH}'" diff --git a/internal/app/app.go b/internal/app/app.go index 123a95b..fc14f52 100644 --- a/internal/app/app.go +++ b/internal/app/app.go @@ -41,7 +41,16 @@ type sleeper func(ctx context.Context, delay time.Duration) error type capabilityProber func(ctx context.Context, agent nodeconfig.AgentConfig) error -type updateRunner func(ctx context.Context, stdout, stderr io.Writer) error +type nodeUpdateOptions struct { + ServerURL string + NodeID string + ConfigPath string + StatePath string + SelfUpdate bool + Reinstall bool +} + +type updateRunner func(ctx context.Context, stdout, stderr io.Writer, options nodeUpdateOptions) error type detectRunner func(ctx context.Context, configPath string) error type retryableDaemonError struct { @@ 
-88,7 +97,7 @@ func Run(ctx context.Context, args []string) error { func run(ctx context.Context, args []string, update updateRunner, detect detectRunner) error { if len(args) == 0 { - return errors.New("expected subcommand: register, run, detect, update, or acp") + return errors.New("expected subcommand: register, run, detect, update, reinstall, logs, or acp") } switch args[0] { @@ -99,7 +108,9 @@ func run(ctx context.Context, args []string, update updateRunner, detect detectR case "detect": return runDetectCommand(ctx, args[1:], detect) case "update": - return update(ctx, os.Stdout, os.Stderr) + return update(ctx, os.Stdout, os.Stderr, nodeUpdateOptions{}) + case "reinstall": + return update(ctx, os.Stdout, os.Stderr, nodeUpdateOptions{Reinstall: true}) case "acp": return runACPBridge(ctx, args[1:], os.Stdin, os.Stdout) case "logs": @@ -159,7 +170,16 @@ func runACPBridge(ctx context.Context, args []string, stdin io.Reader, stdout io return bridge.Serve(ctx, stdin, stdout) } -func runUpdate(ctx context.Context, stdout, stderr io.Writer) error { +func runUpdate(ctx context.Context, stdout, stderr io.Writer, options nodeUpdateOptions) error { + return runInstaller(ctx, stdout, stderr, options, options.Reinstall) +} + +func runReinstall(ctx context.Context, stdout, stderr io.Writer, options nodeUpdateOptions) error { + options.Reinstall = true + return runInstaller(ctx, stdout, stderr, options, true) +} + +func runInstaller(ctx context.Context, stdout, stderr io.Writer, options nodeUpdateOptions, reinstall bool) error { if _, err := exec.LookPath("bash"); err != nil { return errors.New("required CLI missing: bash") } @@ -180,14 +200,39 @@ func runUpdate(ctx context.Context, stdout, stderr io.Writer) error { cmd.Stdout = stdout cmd.Stderr = stderr cmd.Env = append(os.Environ(), "AMESH_INSTALL_URL="+installerURL) + if strings.TrimSpace(options.ServerURL) != "" && os.Getenv("SERVER_URL") == "" { + cmd.Env = append(cmd.Env, "SERVER_URL="+options.ServerURL) + } + if 
strings.TrimSpace(options.NodeID) != "" && os.Getenv("NODE_ID") == "" { + cmd.Env = append(cmd.Env, "NODE_ID="+options.NodeID) + } + if strings.TrimSpace(options.ConfigPath) != "" && os.Getenv("CONFIG_PATH") == "" { + cmd.Env = append(cmd.Env, "CONFIG_PATH="+options.ConfigPath) + } + if strings.TrimSpace(options.StatePath) != "" && os.Getenv("STATE_PATH") == "" { + cmd.Env = append(cmd.Env, "STATE_PATH="+options.StatePath) + } + if options.SelfUpdate { + cmd.Env = append(cmd.Env, "AMESH_NODE_SELF_UPDATE=1") + } + if reinstall { + cmd.Env = append(cmd.Env, "AMESH_NODE_REINSTALL=1") + } if os.Getenv("INSTALL_DIR") == "" { if installDir, ok := currentInstallDir(); ok { cmd.Env = append(cmd.Env, "INSTALL_DIR="+installDir) } } - fmt.Fprintf(stdout, "updating amesh-node from %s\n", installerURL) + action := "updating" + if reinstall { + action = "reinstalling" + } + fmt.Fprintf(stdout, "%s amesh-node from %s\n", action, installerURL) if err := cmd.Run(); err != nil { + if reinstall { + return fmt.Errorf("reinstall failed: %w", err) + } return fmt.Errorf("update failed: %w", err) } return nil @@ -326,7 +371,7 @@ func verifiedOpenClawEnv(ctx context.Context, runner acpx.Runner, fallback map[s } baseEntries := filepath.SplitList(os.Getenv("PATH")) - nodeDirs := lookPathDir("node") + nodeDirs := commandPathDirs("node") for _, dir := range candidateDirs { pathEntries := uniquePathEntries([]string{dir}, nodeDirs, baseEntries) env := map[string]string{ @@ -365,21 +410,22 @@ func openClawPathDirs() []string { if err != nil || info.IsDir() || info.Mode()&0o111 == 0 { continue } - clean := filepath.Clean(dir) - if _, ok := seen[clean]; ok { - continue + for _, candidateDir := range executableDirs(path) { + if _, ok := seen[candidateDir]; ok { + continue + } + seen[candidateDir] = struct{}{} + dirs = append(dirs, candidateDir) } - seen[clean] = struct{}{} - dirs = append(dirs, clean) } return dirs } func detectedAgentEnv(candidate detectableAgent) map[string]string { pathEntries := 
uniquePathEntries( + commandPathDirs(candidate.ACPXAgent), + commandPathDirs("node"), filepath.SplitList(os.Getenv("PATH")), - lookPathDir(candidate.ACPXAgent), - lookPathDir("node"), ) if len(pathEntries) == 0 { return map[string]string{} @@ -389,7 +435,7 @@ func detectedAgentEnv(candidate detectableAgent) map[string]string { } } -func lookPathDir(command string) []string { +func commandPathDirs(command string) []string { if strings.TrimSpace(command) == "" { return nil } @@ -397,11 +443,35 @@ func lookPathDir(command string) []string { if err != nil { return nil } - dir := strings.TrimSpace(filepath.Dir(path)) - if dir == "" { + return executableDirs(path) +} + +func executableDirs(path string) []string { + path = strings.TrimSpace(path) + if path == "" { return nil } - return []string{dir} + + dirs := make([]string, 0, 2) + add := func(dir string) { + dir = strings.TrimSpace(dir) + if dir == "" { + return + } + dir = filepath.Clean(dir) + for _, existing := range dirs { + if existing == dir { + return + } + } + dirs = append(dirs, dir) + } + + if resolved, err := filepath.EvalSymlinks(path); err == nil { + add(filepath.Dir(resolved)) + } + add(filepath.Dir(path)) + return dirs } func uniquePathEntries(groups ...[]string) []string { @@ -655,6 +725,7 @@ func runDaemon(ctx context.Context, args []string, update updateRunner, detect d *nodeID, *reconnectToken, *configPath, + *statePath, runner, sessions, func(serverURL string) daemonClient { @@ -929,6 +1000,7 @@ func runDaemonLoop( nodeID string, reconnectToken string, configPath string, + statePath string, runner acpx.Runner, sessions *sessionStore, clientFactory daemonClientFactory, @@ -947,6 +1019,7 @@ func runDaemonLoop( nodeID, reconnectToken, configPath, + statePath, runner, sessions, clientFactory, @@ -982,6 +1055,7 @@ func runDaemonSession( nodeID string, reconnectToken string, configPath string, + statePath string, runner acpx.Runner, sessions *sessionStore, clientFactory daemonClientFactory, @@ -1135,8 
+1209,18 @@ func runDaemonSession( } case "node.update": logf("update command node=%s", nodeID) - sendNodeLog(sessionCtx, client, nodeID, "warn", "node update requested", nil) - if err := update(sessionCtx, os.Stdout, os.Stderr); err != nil { + sendNodeLog(sessionCtx, client, nodeID, "warn", "node update requested", map[string]any{ + "serverUrl": serverURL, + "config": configPath, + "state": statePath, + }) + if err := update(sessionCtx, os.Stdout, os.Stderr, nodeUpdateOptions{ + ServerURL: serverURL, + NodeID: nodeID, + ConfigPath: configPath, + StatePath: statePath, + SelfUpdate: true, + }); err != nil { sendNodeLog(sessionCtx, client, nodeID, "error", "node update failed", map[string]any{ "error": err.Error(), }) diff --git a/internal/app/app_test.go b/internal/app/app_test.go index b2e7558..f36f340 100644 --- a/internal/app/app_test.go +++ b/internal/app/app_test.go @@ -68,6 +68,7 @@ func TestRunDaemonLoopReconnectsAfterDisconnect(t *testing.T) { "node-a", "token-a", writeConfig(t, nodeconfig.File{NodeName: "node-a"}), + filepath.Join(t.TempDir(), "node-state.json"), acpx.Runner{}, newSessionStore(), func(_ string) daemonClient { @@ -89,7 +90,7 @@ func TestRunDaemonLoopReconnectsAfterDisconnect(t *testing.T) { } return nil }, - func(context.Context, io.Writer, io.Writer) error { + func(context.Context, io.Writer, io.Writer, nodeUpdateOptions) error { t.Fatal("unexpected update invocation") return nil }, @@ -125,7 +126,7 @@ func TestRunDispatchesUpdateSubcommand(t *testing.T) { err := run( context.Background(), []string{"update"}, - func(context.Context, io.Writer, io.Writer) error { + func(context.Context, io.Writer, io.Writer, nodeUpdateOptions) error { called = true return nil }, @@ -139,6 +140,30 @@ func TestRunDispatchesUpdateSubcommand(t *testing.T) { } } +func TestRunDispatchesReinstallSubcommand(t *testing.T) { + t.Parallel() + + called := false + err := run( + context.Background(), + []string{"reinstall"}, + func(_ context.Context, _ io.Writer, _ 
io.Writer, options nodeUpdateOptions) error { + called = true + if !options.Reinstall { + t.Fatal("expected reinstall flag to be set") + } + return nil + }, + func(context.Context, string) error { return nil }, + ) + if err != nil { + t.Fatalf("run() error = %v", err) + } + if !called { + t.Fatal("expected update runner to be called for reinstall") + } +} + func TestRunDispatchesDetectSubcommand(t *testing.T) { t.Parallel() @@ -147,7 +172,7 @@ func TestRunDispatchesDetectSubcommand(t *testing.T) { err := run( context.Background(), []string{"detect", "--config", configPath}, - func(context.Context, io.Writer, io.Writer) error { return nil }, + func(context.Context, io.Writer, io.Writer, nodeUpdateOptions) error { return nil }, func(_ context.Context, path string) error { called = path == configPath return nil @@ -196,6 +221,7 @@ func TestRunDaemonSessionHandlesNodeUpdate(t *testing.T) { }, } + statePath := filepath.Join(t.TempDir(), "node-state.json") called := false err := runDaemonSession( context.Background(), @@ -203,12 +229,25 @@ func TestRunDaemonSessionHandlesNodeUpdate(t *testing.T) { "node-a", "token-a", writeConfig(t, nodeconfig.File{NodeName: "node-a"}), + statePath, acpx.Runner{}, newSessionStore(), func(string) daemonClient { return client }, func(context.Context, nodeconfig.AgentConfig) error { return nil }, - func(context.Context, io.Writer, io.Writer) error { + func(_ context.Context, _ io.Writer, _ io.Writer, options nodeUpdateOptions) error { called = true + if options.ServerURL != "ws://example.invalid/ws?role=node" { + t.Fatalf("update server url = %q", options.ServerURL) + } + if options.NodeID != "node-a" { + t.Fatalf("update node id = %q", options.NodeID) + } + if options.StatePath != statePath { + t.Fatalf("update state path = %q, want %q", options.StatePath, statePath) + } + if !options.SelfUpdate { + t.Fatal("expected self update flag") + } return nil }, func(context.Context, string) error { @@ -225,6 +264,85 @@ func 
TestRunDaemonSessionHandlesNodeUpdate(t *testing.T) { assertEnvelopeTypes(t, client.sent, []string{"node.resume", "node.capabilities.sync"}) } +func TestRunUpdatePassesRuntimeContextToInstaller(t *testing.T) { + binDir := t.TempDir() + envLogPath := filepath.Join(t.TempDir(), "installer-env.log") + writeExecutable(t, filepath.Join(binDir, "curl"), fmt.Sprintf(`#!/bin/sh + printf 'SERVER_URL=%%s\nNODE_ID=%%s\nCONFIG_PATH=%%s\nSTATE_PATH=%%s\nAMESH_NODE_SELF_UPDATE=%%s\n' \ + "$SERVER_URL" "$NODE_ID" "$CONFIG_PATH" "$STATE_PATH" "$AMESH_NODE_SELF_UPDATE" > %q + printf '%%s\n' '#!/bin/sh' + printf '%%s\n' 'exit 0' +`, envLogPath)) + t.Setenv("PATH", binDir+string(os.PathListSeparator)+os.Getenv("PATH")) + t.Setenv("AMESH_INSTALL_URL", "https://example.invalid/install-amesh-node.sh") + + var stdout bytes.Buffer + err := runUpdate(context.Background(), &stdout, io.Discard, nodeUpdateOptions{ + ServerURL: "ws://example.invalid/ws?role=node", + NodeID: "node-a", + ConfigPath: "/srv/amesh/agents.json", + StatePath: "/srv/amesh/node-state.json", + SelfUpdate: true, + }) + if err != nil { + t.Fatalf("runUpdate() error = %v", err) + } + + bytes, err := os.ReadFile(envLogPath) + if err != nil { + t.Fatalf("read env log: %v", err) + } + got := string(bytes) + for _, want := range []string{ + "SERVER_URL=ws://example.invalid/ws?role=node", + "NODE_ID=node-a", + "CONFIG_PATH=/srv/amesh/agents.json", + "STATE_PATH=/srv/amesh/node-state.json", + "AMESH_NODE_SELF_UPDATE=1", + } { + if !strings.Contains(got, want) { + t.Fatalf("installer env = %q, want %q", got, want) + } + } +} + +func TestRunReinstallPassesResetModeToInstaller(t *testing.T) { + binDir := t.TempDir() + envLogPath := filepath.Join(t.TempDir(), "installer-env.log") + writeExecutable(t, filepath.Join(binDir, "curl"), fmt.Sprintf(`#!/bin/sh + printf 'AMESH_NODE_REINSTALL=%%s\nSERVER_URL=%%s\nSTATE_PATH=%%s\n' \ + "$AMESH_NODE_REINSTALL" "$SERVER_URL" "$STATE_PATH" > %q + printf '%%s\n' '#!/bin/sh' + printf '%%s\n' 'exit 
0' +`, envLogPath)) + t.Setenv("PATH", binDir+string(os.PathListSeparator)+os.Getenv("PATH")) + t.Setenv("AMESH_INSTALL_URL", "https://example.invalid/install-amesh-node.sh") + + var stdout bytes.Buffer + err := runReinstall(context.Background(), &stdout, io.Discard, nodeUpdateOptions{ + ServerURL: "ws://example.invalid/ws?role=node", + StatePath: "/srv/amesh/node-state.json", + }) + if err != nil { + t.Fatalf("runReinstall() error = %v", err) + } + + bytes, err := os.ReadFile(envLogPath) + if err != nil { + t.Fatalf("read env log: %v", err) + } + got := string(bytes) + for _, want := range []string{ + "AMESH_NODE_REINSTALL=1", + "SERVER_URL=ws://example.invalid/ws?role=node", + "STATE_PATH=/srv/amesh/node-state.json", + } { + if !strings.Contains(got, want) { + t.Fatalf("installer env = %q, want %q", got, want) + } + } +} + func TestRunDaemonSessionHandlesNodeDetect(t *testing.T) { t.Parallel() @@ -251,11 +369,12 @@ func TestRunDaemonSessionHandlesNodeDetect(t *testing.T) { {ID: "agent-a", Name: "Agent A", ACPXAgent: "claude"}, }, }), + filepath.Join(t.TempDir(), "node-state.json"), acpx.Runner{}, newSessionStore(), func(string) daemonClient { return client }, func(context.Context, nodeconfig.AgentConfig) error { return nil }, - func(context.Context, io.Writer, io.Writer) error { return nil }, + func(context.Context, io.Writer, io.Writer, nodeUpdateOptions) error { return nil }, func(_ context.Context, path string) error { called = true cancel() @@ -306,6 +425,7 @@ func TestRunDaemonSessionHandlesNodePathUpdate(t *testing.T) { "node-a", "token-a", configPath, + filepath.Join(t.TempDir(), "node-state.json"), acpx.Runner{}, newSessionStore(), func(string) daemonClient { return client }, @@ -313,7 +433,7 @@ func TestRunDaemonSessionHandlesNodePathUpdate(t *testing.T) { cancel() return nil }, - func(context.Context, io.Writer, io.Writer) error { return nil }, + func(context.Context, io.Writer, io.Writer, nodeUpdateOptions) error { return nil }, func(context.Context, 
string) error { return nil }, ) if err != nil { @@ -368,13 +488,14 @@ func TestRunDaemonSessionHandlesNodePathBrowse(t *testing.T) { "node-a", "token-a", configPath, + filepath.Join(t.TempDir(), "node-state.json"), acpx.Runner{}, newSessionStore(), func(string) daemonClient { return client }, func(context.Context, nodeconfig.AgentConfig) error { return nil }, - func(context.Context, io.Writer, io.Writer) error { return nil }, + func(context.Context, io.Writer, io.Writer, nodeUpdateOptions) error { return nil }, func(context.Context, string) error { return nil }, ) }() @@ -491,6 +612,7 @@ func TestRunDaemonSessionSendsHealthProbeLogs(t *testing.T) { {ID: "agent-openclaw", Name: "OpenClaw", ACPXAgent: "openclaw"}, }, }), + filepath.Join(t.TempDir(), "node-state.json"), acpx.Runner{}, newSessionStore(), func(string) daemonClient { return client }, @@ -498,7 +620,7 @@ func TestRunDaemonSessionSendsHealthProbeLogs(t *testing.T) { cancel() return errors.New("ACP metadata is missing") }, - func(context.Context, io.Writer, io.Writer) error { return nil }, + func(context.Context, io.Writer, io.Writer, nodeUpdateOptions) error { return nil }, func(context.Context, string) error { return nil }, ) if err != nil { @@ -638,6 +760,56 @@ exit 1 } } +func TestDetectAgentsPrefersResolvedExecutableDirsForFNMStyleShims(t *testing.T) { + home := t.TempDir() + shimDir := filepath.Join(t.TempDir(), "fnm-multishell") + stableDir := filepath.Join(t.TempDir(), "fnm-installation", "bin") + t.Setenv("HOME", home) + t.Setenv("PATH", shimDir) + t.Setenv("AMESH_ACPX_PATH", "") + + managed := filepath.Join(home, ".local", "share", "amesh", "acpx", "bin", "acpx") + writeExecutable(t, managed, `#!/bin/sh +if [ "$1" = "--help" ]; then +cat <<'EOF' +Commands: + codex [options] [prompt...] 
Use codex agent +EOF +exit 0 +fi +exit 1 +`) + writeExecutable(t, filepath.Join(stableDir, "node"), "#!/bin/sh\nexit 0\n") + writeExecutable(t, filepath.Join(stableDir, "codex"), "#!/bin/sh\nexit 0\n") + if err := os.MkdirAll(shimDir, 0o755); err != nil { + t.Fatalf("mkdir %s: %v", shimDir, err) + } + if err := os.Symlink(filepath.Join(stableDir, "node"), filepath.Join(shimDir, "node")); err != nil { + t.Fatalf("symlink node shim: %v", err) + } + if err := os.Symlink(filepath.Join(stableDir, "codex"), filepath.Join(shimDir, "codex")); err != nil { + t.Fatalf("symlink codex shim: %v", err) + } + + got := detectAgents(context.Background(), acpx.Runner{}) + want := []nodeconfig.AgentConfig{ + { + ID: "agent-codex", + Name: "Codex", + ACPXAgent: "codex", + Command: managed, + Args: []string{}, + Env: map[string]string{ + "PATH": strings.Join([]string{stableDir, shimDir}, string(os.PathListSeparator)), + }, + Labels: []string{"detected"}, + }, + } + if !reflect.DeepEqual(got, want) { + t.Fatalf("detectAgents() = %#v, want %#v", got, want) + } +} + func TestDetectAgentsVerifiesOpenClawACPReadinessAcrossPathCandidates(t *testing.T) { home := t.TempDir() badDir := filepath.Join(t.TempDir(), "bad-bin") diff --git a/scripts/test-install-amesh-node.sh b/scripts/test-install-amesh-node.sh index f9e0654..6bb1a7a 100644 --- a/scripts/test-install-amesh-node.sh +++ b/scripts/test-install-amesh-node.sh @@ -173,3 +173,329 @@ assert_contains 'Environment="AMESH_ACPX_PATH=' "$stdin_env_dir/amesh-node.servi assert_contains 'Environment="AMESH_NODE_VERSION=test-tag"' "$stdin_env_dir/amesh-node.service" assert_contains "$stdin_space_dir" "$stdin_env_dir/amesh-node.service" test -x "$stdin_env_dir/bin/amesh" + +self_stub_dir="$tmp_dir/self-update-bin" +mkdir -p "$self_stub_dir" + +cat <<'EOF' >"$self_stub_dir/curl" +#!/usr/bin/env bash +set -euo pipefail +archive="${@: -1}" +printf 'stub archive' >"$archive" +EOF +chmod +x "$self_stub_dir/curl" + +cat <<'EOF' >"$self_stub_dir/npm" 
+#!/usr/bin/env bash
+set -euo pipefail
+exit 0
+EOF
+chmod +x "$self_stub_dir/npm"
+
+cat <<'EOF' >"$self_stub_dir/systemctl"
+#!/usr/bin/env bash
+set -euo pipefail
+printf '%s\n' "$*" >>"${SYSTEMCTL_LOG:?}"
+verb=
+for arg in "$@"; do
+  case "$arg" in
+    --user|--now|--quiet|--no-pager|--full)
+      continue
+      ;;
+    *)
+      verb="$arg"
+      break
+      ;;
+  esac
+done
+case "$verb" in
+  daemon-reload|enable)
+    exit 0
+    ;;
+  *)
+    exit 99
+    ;;
+esac
+EOF
+chmod +x "$self_stub_dir/systemctl"
+
+cat <<'EOF' >"$self_stub_dir/uname"
+#!/usr/bin/env bash
+set -euo pipefail
+case "${1:-}" in
+  -m)
+    printf 'x86_64\n'
+    ;;
+  *)
+    printf 'Linux\n'
+    ;;
+esac
+EOF
+chmod +x "$self_stub_dir/uname"
+
+cat <<'EOF' >"$self_stub_dir/mktemp"
+#!/usr/bin/env bash
+set -euo pipefail
+dir="${TMPDIR:-/tmp}/amesh-test-self-update"
+mkdir -p "$dir"
+printf '%s\n' "$dir"
+EOF
+chmod +x "$self_stub_dir/mktemp"
+
+cat <<'EOF' >"$self_stub_dir/tar"
+#!/usr/bin/env bash
+set -euo pipefail
+target_dir=
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    -C)
+      target_dir="$2"
+      shift 2
+      ;;
+    *)
+      shift
+      ;;
+  esac
+done
+mkdir -p "$target_dir"
+cat <<'BIN' >"$target_dir/amesh-node"
+#!/usr/bin/env bash
+set -euo pipefail
+exit 0
+BIN
+chmod +x "$target_dir/amesh-node"
+EOF
+chmod +x "$self_stub_dir/tar"
+
+cat <<'EOF' >"$self_stub_dir/install"
+#!/usr/bin/env bash
+set -euo pipefail
+src="${@: -2:1}"
+dest="${@: -1}"
+cp "$src" "$dest"
+chmod 0755 "$dest"
+EOF
+chmod +x "$self_stub_dir/install"
+
+cat <<'EOF' >"$self_stub_dir/node"
+#!/usr/bin/env bash
+set -euo pipefail
+case "${1:-}" in
+  -v)
+    printf 'v24.13.1\n'
+    ;;
+  -p)
+    printf '24\n'
+    ;;
+  *)
+    exit 0
+    ;;
+esac
+EOF
+chmod +x "$self_stub_dir/node"
+
+self_env_dir="$tmp_dir/self-update-env"
+mkdir -p "$self_env_dir"
+printf '{}\n' >"$self_env_dir/agents.json"
+printf '{"nodeId":"node-a","reconnectToken":"token","serverUrl":"ws://saved.invalid/ws?role=node","configPath":"%s"}\n' "$self_env_dir/agents.json" >"$self_env_dir/node-state.json"
+
+self_systemctl_log="$tmp_dir/self-update-systemctl.log"
+self_log="$tmp_dir/self-update.log"
+if ! PATH="$self_stub_dir:$PATH" \
+  SYSTEMCTL_LOG="$self_systemctl_log" \
+  AMESH_NODE_SELF_UPDATE='1' \
+  AMESH_VERSION_TAG='test-tag' \
+  INSTALL_DIR="$self_env_dir/bin" \
+  AMESH_HOME="$self_env_dir/home" \
+  ACPX_PREFIX="$self_env_dir/acpx" \
+  ACPX_CONFIG_PATH="$self_env_dir/acpx-config.json" \
+  CONFIG_PATH="$self_env_dir/agents.json" \
+  STATE_PATH="$self_env_dir/node-state.json" \
+  SERVICE_PATH="$self_env_dir/amesh-node.service" \
+  NODE_ID='self-update-node' \
+  bash <"$ROOT_DIR/install-amesh-node.sh" >"$self_log" 2>&1; then
+  printf 'expected self-update installer execution without SERVER_URL to succeed\n' >&2
+  cat "$self_log" >&2
+  exit 1
+fi
+
+assert_contains 'daemon-reload' "$self_systemctl_log"
+assert_contains 'enable amesh-node' "$self_systemctl_log"
+if grep -F 'stop amesh-node' "$self_systemctl_log" >/dev/null 2>&1; then
+  printf 'self-update must not stop its own service\n' >&2
+  cat "$self_systemctl_log" >&2
+  exit 1
+fi
+
+reinstall_stub_dir="$tmp_dir/reinstall-bin"
+mkdir -p "$reinstall_stub_dir"
+
+cat <<'EOF' >"$reinstall_stub_dir/curl"
+#!/usr/bin/env bash
+set -euo pipefail
+archive="${@: -1}"
+printf 'stub archive' >"$archive"
+EOF
+chmod +x "$reinstall_stub_dir/curl"
+
+cat <<'EOF' >"$reinstall_stub_dir/npm"
+#!/usr/bin/env bash
+set -euo pipefail
+exit 0
+EOF
+chmod +x "$reinstall_stub_dir/npm"
+
+cat <<'EOF' >"$reinstall_stub_dir/systemctl"
+#!/usr/bin/env bash
+set -euo pipefail
+printf '%s\n' "$*" >>"${SYSTEMCTL_LOG:?}"
+verb=
+for arg in "$@"; do
+  case "$arg" in
+    --user|--now|--quiet|--no-pager|--full)
+      continue
+      ;;
+    *)
+      verb="$arg"
+      break
+      ;;
+  esac
+done
+case "$verb" in
+  stop|disable|daemon-reload|enable|is-active)
+    if [[ "$verb" == "is-active" ]]; then
+      exit 0
+    fi
+    exit 0
+    ;;
+  *)
+    exit 99
+    ;;
+esac
+EOF
+chmod +x "$reinstall_stub_dir/systemctl"
+
+cat <<'EOF' >"$reinstall_stub_dir/uname"
+#!/usr/bin/env bash
+set -euo pipefail
+case "${1:-}" in
+  -m)
+    printf 'x86_64\n'
+    ;;
+  *)
+    printf 'Linux\n'
+    ;;
+esac
+EOF
+chmod +x "$reinstall_stub_dir/uname"
+
+cat <<'EOF' >"$reinstall_stub_dir/mktemp"
+#!/usr/bin/env bash
+set -euo pipefail
+dir="${TMPDIR:-/tmp}/amesh-test-reinstall"
+mkdir -p "$dir"
+printf '%s\n' "$dir"
+EOF
+chmod +x "$reinstall_stub_dir/mktemp"
+
+cat <<'EOF' >"$reinstall_stub_dir/tar"
+#!/usr/bin/env bash
+set -euo pipefail
+target_dir=
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    -C)
+      target_dir="$2"
+      shift 2
+      ;;
+    *)
+      shift
+      ;;
+  esac
+done
+mkdir -p "$target_dir"
+cat <<'BIN' >"$target_dir/amesh-node"
+#!/usr/bin/env bash
+set -euo pipefail
+exit 0
+BIN
+chmod +x "$target_dir/amesh-node"
+cat <<'BIN' >"$target_dir/amesh"
+#!/usr/bin/env bash
+set -euo pipefail
+exit 0
+BIN
+chmod +x "$target_dir/amesh"
+EOF
+chmod +x "$reinstall_stub_dir/tar"
+
+cat <<'EOF' >"$reinstall_stub_dir/install"
+#!/usr/bin/env bash
+set -euo pipefail
+src="${@: -2:1}"
+dest="${@: -1}"
+cp "$src" "$dest"
+chmod 0755 "$dest"
+EOF
+chmod +x "$reinstall_stub_dir/install"
+
+cat <<'EOF' >"$reinstall_stub_dir/node"
+#!/usr/bin/env bash
+set -euo pipefail
+case "${1:-}" in
+  -v)
+    printf 'v24.13.1\n'
+    ;;
+  -p)
+    printf '24\n'
+    ;;
+  *)
+    exit 0
+    ;;
+esac
+EOF
+chmod +x "$reinstall_stub_dir/node"
+
+reinstall_env_dir="$tmp_dir/reinstall-env"
+mkdir -p "$reinstall_env_dir/home/keep-me"
+mkdir -p "$reinstall_env_dir/bin"
+printf '{"old":true}\n' >"$reinstall_env_dir/agents.json"
+printf '{"nodeId":"node-a","reconnectToken":"token","serverUrl":"ws://saved.invalid/ws?role=node","configPath":"%s"}\n' "$reinstall_env_dir/agents.json" >"$reinstall_env_dir/node-state.json"
+printf '[Unit]\nDescription=old service\n' >"$reinstall_env_dir/amesh-node.service"
+printf 'old binary\n' >"$reinstall_env_dir/bin/amesh-node"
+printf 'old cli\n' >"$reinstall_env_dir/bin/amesh"
+printf 'stale managed home\n' >"$reinstall_env_dir/home/keep-me/stale.txt"
+
+reinstall_systemctl_log="$tmp_dir/reinstall-systemctl.log"
+reinstall_log="$tmp_dir/reinstall.log"
+if ! PATH="$reinstall_stub_dir:$PATH" \
+  SYSTEMCTL_LOG="$reinstall_systemctl_log" \
+  AMESH_NODE_REINSTALL='1' \
+  AMESH_VERSION_TAG='test-tag' \
+  INSTALL_DIR="$reinstall_env_dir/bin" \
+  AMESH_HOME="$reinstall_env_dir/home" \
+  ACPX_PREFIX="$reinstall_env_dir/acpx" \
+  ACPX_CONFIG_PATH="$reinstall_env_dir/acpx-config.json" \
+  CONFIG_PATH="$reinstall_env_dir/agents.json" \
+  STATE_PATH="$reinstall_env_dir/node-state.json" \
+  SERVICE_PATH="$reinstall_env_dir/amesh-node.service" \
+  NODE_ID='reinstall-node' \
+  SERVER_URL='wss://example.invalid/ws?role=node' \
+  REGISTRATION_TOKEN='token' \
+  bash <"$ROOT_DIR/install-amesh-node.sh" >"$reinstall_log" 2>&1; then
+  printf 'expected reinstall installer execution to succeed\n' >&2
+  cat "$reinstall_log" >&2
+  exit 1
+fi
+
+assert_contains 'stop amesh-node' "$reinstall_systemctl_log"
+assert_contains 'disable amesh-node' "$reinstall_systemctl_log"
+assert_contains 'enable --now amesh-node' "$reinstall_systemctl_log"
+if [[ -f "$reinstall_env_dir/home/keep-me/stale.txt" ]]; then
+  printf 'reinstall should remove previous managed amesh home\n' >&2
+  exit 1
+fi
+if grep -F '"old":true' "$reinstall_env_dir/agents.json" >/dev/null 2>&1; then
+  printf 'reinstall should replace stale agent config\n' >&2
+  exit 1
+fi
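
Review note, not part of the patch: the `executableDirs` change in `internal/app/app.go` resolves symlinks so that the saved agent `PATH` starts with the stable install directory instead of a transient shim directory. The standalone Go program below is a minimal sketch of that ordering, under stated assumptions. `resolvedDirs` and `mergePathEntries` are hypothetical names introduced here for illustration. `resolvedDirs` mirrors the logic shown for `executableDirs`, while `mergePathEntries` is only a guess at what `uniquePathEntries` does, since its body does not appear in this diff.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// resolvedDirs (hypothetical name) returns the directory of the
// symlink-resolved target first, then the directory of the path as given.
// Duplicate directories are dropped.
func resolvedDirs(path string) []string {
	path = strings.TrimSpace(path)
	if path == "" {
		return nil
	}
	var dirs []string
	add := func(dir string) {
		dir = filepath.Clean(dir)
		for _, existing := range dirs {
			if existing == dir {
				return
			}
		}
		dirs = append(dirs, dir)
	}
	// If the path cannot be resolved, fall back to its literal directory only.
	if resolved, err := filepath.EvalSymlinks(path); err == nil {
		add(filepath.Dir(resolved))
	}
	add(filepath.Dir(path))
	return dirs
}

// mergePathEntries (hypothetical; an assumption about how PATH groups are
// combined) concatenates groups in order and keeps the first occurrence of
// each non-empty entry.
func mergePathEntries(groups ...[]string) []string {
	seen := map[string]struct{}{}
	var out []string
	for _, group := range groups {
		for _, entry := range group {
			entry = strings.TrimSpace(entry)
			if entry == "" {
				continue
			}
			if _, ok := seen[entry]; ok {
				continue
			}
			seen[entry] = struct{}{}
			out = append(out, entry)
		}
	}
	return out
}

// run builds a throwaway shim layout: a "stable" bin directory holding the
// real file and a "multishell" directory holding only a symlink to it.
func run() error {
	root, err := os.MkdirTemp("", "path-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	stableDir := filepath.Join(root, "install", "bin")
	shimDir := filepath.Join(root, "multishell")
	for _, dir := range []string{stableDir, shimDir} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return err
		}
	}
	target := filepath.Join(stableDir, "node")
	if err := os.WriteFile(target, []byte("#!/bin/sh\nexit 0\n"), 0o755); err != nil {
		return err
	}
	shim := filepath.Join(shimDir, "node")
	if err := os.Symlink(target, shim); err != nil {
		return err
	}

	// Resolved directories go first; the original PATH (here just the shim
	// directory) follows as a fallback.
	entries := mergePathEntries(resolvedDirs(shim), []string{shimDir})
	fmt.Println(strings.Join(entries, string(os.PathListSeparator)))
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

With this ordering, a daemon that starts after the shim directory has disappeared should still find `node` through the first entry, which appears to be the behavior `TestDetectAgentsPrefersResolvedExecutableDirsForFNMStyleShims` asserts.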