Skip to content

feat: L-FRESH-BOOT fresh-user smoke + isolated dev-server helper#1506

Open
ystknsh wants to merge 11 commits into
mainfrom
feat/e2e-live-fresh-boot
Open

feat: L-FRESH-BOOT fresh-user smoke + isolated dev-server helper#1506
ystknsh wants to merge 11 commits into
mainfrom
feat/e2e-live-fresh-boot

Conversation

@ystknsh
Copy link
Copy Markdown
Contributor

@ystknsh ystknsh commented May 25, 2026

Summary

  • mulmoclaude を 任意の workspace dir + 隔離 HOME + 隔離 port + pinned auth token で spawn できる test infra を追加 (e2e-live/fixtures/isolated-dev-server.tsspawnIsolatedDevServer)
  • これを使って、 新規ユーザーと同等の状態 (空 workspace + 空 HOME + 認証コピーのみ) から 1 ターン chat を完走するか を end-to-end で検証する live e2e (L-FRESH-BOOT) を追加
  • なお、 この infra は本 PR の L-FRESH-BOOT 以外にも以下に流用できる:
    • 開発者の手元 ~/mulmoclaude/ を一切汚さず e2e-live を走らせる (= e2e-live suite 全体の self-contained 化、 plan で TODO 化済)
    • 後続シナリオ (L-10 Gemini key 未設定 / L-13 server 再接続 / L-FRESH-PRESET-SKILL 等) を同 helper を extend して書く
    • 1 host 上で複数の独立 workspace に対する並列 test (port 衝突を helper が free port 探索で吸う)

検証する 5 段 (L-FRESH-BOOT spec assertion):

  • (a) /api/health が 200 応答
  • (b) workspace dir 自動初期化 (conversations/chat, config/helps, .session-token)
  • (c) SPA index.html に <meta name="mulmoclaude-auth"> token が注入
  • (d) 1 ターン (Reply with the single word: okfresh-<nonce>) を assistant body が echo
  • (e) cleanup 後に host ~/mulmoclaude/ / ~/.claude/skills/ の mtime が test 前と変わらない (recursive walk で in-place 上書きまで検出)

Items to Confirm / Review

reviewer に重点的に見てほしい設計判断:

  1. production code への test seam 追加server/index.tsMULMOCLAUDE_CLIENT_DIR env を読む分岐を追加。 用途は「test spawner が dist/client/ を指すため」 のみで、 production user (npx mulmoclaude tarball) は env 未設定 → default の <__dirname>/../client/ を踏むので動作変化なし。 ただし「test のために production code を触っている」 構造であることは事実。 代替案 (worktree への symlink、 prepare-dist 起動) も検討したが、 副作用最小の env seam を採用。 → resolveClientDir を pure function に切り出し、 env set / undefined / 空文字列 / 相対 path / whitespace-only の 5 ブランチを unit test (test/server/utils/test_clientDir.ts) で被覆。 default 経路 (npx mulmoclaude ユーザーが実際に踏む) の退行は unit test で防御
  2. buildServerEnv が parent process.env を全 inherit — spawned isolated server は親 shell の GEMINI_API_KEY / ANTHROPIC_API_KEY / X_BEARER_TOKEN 等を継承する。 厳密な「fresh user」 simulation ではない (= 開発者 shell の export 状態に左右される)。 ただし PATH 不在で tsx が起動しない等を避けるため一律 strip はしていない
  3. DISABLE_SANDBOX=1 を helper が常に set — fresh-boot は boot path 連動性のみを検証、 sandbox 経路は L-FRESH-SANDBOX-BUILD (別 PR、 未着手) で扱う前提
  4. assertHostUntouched は top-level dir mtime のみ対応済/codex-cross-review 3 ラウンドで以下を実装 (commits 9477c9b3 + b6a90637): (a) 再帰 walk で全 descendant の mtime を Map<absolutePath, mtimeMs> に snapshot、 (b) symlinked dir も follow (stat().isDirectory() 判定)、 (c) ancestor-realpath chain で cycle 検出 (aliased symlink は両方 walk される)、 (d) MAX_WALK_DEPTH=64 safety net、 (e) EACCES/EPERM/ENOENT を tolerate、 (f) drift は modified / created / removed / existence-flip の 4 種を双方向 diff。 test/e2e-live/test_isolated_dev_server_drift.ts で 8 ケース被覆 (in-place rewrite / aliased symlink / cycle / removed / created / existence-flip 等)
  5. CI no-LLM matrix への登録なしfresh-boot.spec.ts は自前 server を spawn する構造のため、 MULMOCLAUDE_FAKE_AGENT=1 で起動する shared dev server を前提とした既存 matrix に乗らない。 docker.spec.ts と同じ位置付け
  6. 初回 run で yarn build:client を helper が自動起動dist/client/index.html 不在時のみ。 cold cache で +10-20s、 warm では no-op

User Prompt

L-FRESH-BOOT (新規ユーザー smoke) を実装。重要度 A (first-run UX = ユーザー獲得関門)。test 専用 dev server spawn infra (HOME + MULMOCLAUDE_WORKSPACE_PATH + MULMOCLAUDE_PORT で隔離 + 認証 copy) を含む。launcher 抽象化を最小限で済ませて 1 PR に収める方針、 後の L-10 / L-13 / L-FRESH-* で同 infra を再利用する。

実装の詳細

コミット構成

commit 内容
525b43ed server/index.tsMULMOCLAUDE_CLIENT_DIR env を読む分岐を追加
946c6133 spawnIsolatedDevServer helper + fresh-boot.spec.ts + package.json script
96f44883 plans/feat-e2e-live.md を ✅ 化、 docs/e2e-live-testing.md に fresh-user 例外 callout 追加
77f4becc plans に suite-level self-contained 化 TODO 追記
aabbbe75 resolveClientDir を pure function に切り出し + unit test 5 ブランチ

Helper の env 設定 (buildServerEnv)

env 役割
HOME=<temp> claude CLI auth lookup / ~/.claude/skills/ を temp dir に振る
MULMOCLAUDE_WORKSPACE_PATH=<temp> workspace dir 自動初期化先
PORT=<random> ユーザーの yarn dev (3001) との衝突回避
MULMOCLAUDE_AUTH_TOKEN=<random hex> bearer token を test 側で事前生成して pin
NODE_ENV=production express が SPA + index.html を直接 serve
MULMOCLAUDE_CLIENT_DIR=<repo>/dist/client source 配置で index.html を見つけるため
DISABLE_SANDBOX=1 sandbox 経路を回避 (L-FRESH-SANDBOX-BUILD scope)

設計指針 (plan の階層モデル)

plans/feat-e2e-live.md の「環境を壊さず再現する設計指針」 より:

  • 階層 1 (共有 dev server + unique slug seed): 既存 L-01 〜 L-32 が採用
  • 階層 2 (test 専用 dev server spawn): 本 PR の L-FRESH-BOOT が初採用 — 「空 workspace 起動」 を再現するには階層 1 では原理的に不可能 (= 既存 workspace のままだと「空」 を作れない)
  • 階層 3 (ソース改修 + env 新設): MULMOCLAUDE_CLIENT_DIR の追加が該当 (最小限)

後続シナリオでの再利用設計

  • L-FRESH-PRESET-SKILL: 同 helper で空 workspace 起動 → preset migration 経路を end-to-end 検証
  • L-10 / L-13: helper を env 引数で extend (Gemini key 削除 / server restart trigger)
  • L-FRESH-SANDBOX-BUILD: helper + MULMOCLAUDE_SANDBOX_IMAGE env 新設 (別 PR)
  • suite-level self-contained 化: 同 helper を Playwright globalSetup / webServer で 1 回 spawn する形に格上げすれば全 e2e-live が host 無汚染になる構想 (plans に TODO 化、 別 PR)

動作確認

ローカル (macOS, host dev = port 3001 起動中) で実行:

```bash
yarn test:e2e:live:fresh-boot
```

  • wall time: ~15-19s (warm cache)
  • 連続 2 回 pass で安定確認
  • spawned server は random free port、 host yarn dev と非衝突
  • cleanup 後 host ~/mulmoclaude/ / ~/.claude/skills/ mtime 不変

unit test:

```bash
./node_modules/.bin/tsx --test test/server/utils/test_clientDir.ts
```

  • 5 ブランチ全 pass

CI チェック: yarn format / yarn lint / yarn typecheck / yarn build 全通過。

関連

Summary by CodeRabbit

  • Documentation

    • Clarified the fresh-user smoke testing pattern, isolation approach, and updated roadmap status.
  • Tests

    • Added isolated dev-server test infrastructure, a fresh-user smoke E2E verifying first-time boot and SPA auth flow, and drift-detection tests to ensure host filesystem isolation.
    • Added unit tests for production client-directory resolution.
  • Chores

    • Added an npm script to run the fresh-user smoke E2E.
  • Improvements

    • Production server can override the client bundle directory for testing.

Review Change Stack

ystknsh added 4 commits May 25, 2026 13:09
…ion static path

Default static path (`<server>/../client/`) resolves correctly only in
the prepared package layout that `bin/prepare-dist.js` copies
`dist/client/` into. When running the server straight from a source
checkout (fresh-user smoke tests spawn `tsx server/index.ts` without
that copy step), the path resolves to a non-existent `<repo-root>/client/`.

The new `MULMOCLAUDE_CLIENT_DIR` env lets callers point the production
static path at any directory — fresh-boot tests use it to target
`<repo-root>/dist/client/`. Empty string is treated as "not set" so
the default keeps working without ceremony.
…-server helper

`spawnIsolatedDevServer` (e2e-live/fixtures/isolated-dev-server.ts)
boots a fully isolated `tsx server/index.ts` subprocess with four
overrides:

  - HOME=<temp-home>                       -> claude CLI auth lookup, ~/.claude/skills/
  - MULMOCLAUDE_WORKSPACE_PATH=<temp-ws>   -> workspace dir creation
  - PORT=<random-free>                     -> no collision with `yarn dev`
  - MULMOCLAUDE_AUTH_TOKEN=<pinned hex>    -> bearer token known up-front

`~/.claude/.credentials.json` is copied into the test HOME so the
spawned claude CLI subprocess has auth. `DISABLE_SANDBOX=1` and
`NODE_ENV=production` are set: the spec is about boot path / first-run
UX, not sandbox semantics (L-FRESH-SANDBOX-BUILD covers the Docker
path separately) and production mode makes express serve index.html
itself (with the bearer token substituted into the meta tag).

`fresh-boot.spec.ts` (L-FRESH-BOOT) drives the spawn helper and asserts
5 layers of first-run integrity:

  (a) /api/health responds 200 (re-issued with the pinned bearer so the
      trace shows the green light verbatim)
  (b) workspace auto-init created `conversations/chat`, `config/helps`,
      and `.session-token`
  (c) GET / serves index.html with the `<meta name="mulmoclaude-auth">`
      token substituted (placeholder must be gone)
  (d) one chat turn (`Reply with the single word: okfresh-<nonce>`)
      completes — assistant body contains the marker, thinking
      indicator clears
  (e) AFTER server stop, host `~/mulmoclaude/` and `~/.claude/skills/`
      mtimes match the pre-spawn baseline — the strongest isolation
      contract; any stray `homedir()` call that bypassed the env
      override surfaces here

`yarn test:e2e:live:fresh-boot` runs it standalone. Wall time ~15-19s
on warm cache (server boot ~5s + 1 LLM turn ~5s + Vite client build
amortised across runs). CI no-LLM matrix is intentionally not updated
because the spec self-spawns its server (same pattern as docker.spec.ts).
…lated-server helper

Update the implementation status table, the 2026-05-23 re-eval
matrix, the implementation order roadmap, and the open TODO list to
reflect L-FRESH-BOOT landing along with the reusable
`spawnIsolatedDevServer` helper that L-10 / L-13 / L-FRESH-* can
extend rather than reinventing.

`docs/e2e-live-testing.md` gets a one-paragraph callout describing
the fresh-user smoke exception to the "shared E2E_LIVE_BASE_URL
server" rule, so future spec authors know when self-spawning is the
acceptable pattern versus when they should stick with the shared
boot.
…e-live

Capture the design follow-up surfaced in PR review: `spawnIsolatedDevServer`
could be promoted to a suite-level Playwright `webServer` / `globalSetup`
spawn so the entire e2e-live suite stops depending on the developer's
`yarn dev` and stops touching the host `~/mulmoclaude/`. Records the
trade-offs (one-time boot cost, loss of host workspace coverage, dual
configs for dev / pre-release modes, Docker matrix switch shape) and
explicitly notes L-FRESH-BOOT must stay per-test spawn because its
"empty workspace on every run" assertion would be weakened by a
suite-shared isolated server.
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @ystknsh, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an isolated dev-server test fixture, integrates client-dir override into server startup, introduces Playwright fresh-boot smoke test and drift/unit tests for host snapshot/assertion, updates docs/plans, and adds a package script to run the fresh-boot e2e spec.

Changes

Fresh Boot E2E Test Implementation

Layer / File(s) Summary
Client directory resolution utility
server/utils/clientDir.ts, test/server/utils/test_clientDir.ts, server/index.ts
New resolveClientDir utility enables environment-driven override (MULMOCLAUDE_CLIENT_DIR) of the production SPA directory, tested with unit cases covering env passthrough and default fallback, and integrated into server startup.
Isolated dev server fixture
e2e-live/fixtures/isolated-dev-server.ts
Comprehensive test fixture for spawning and managing isolated dev servers: builds client bundle on-demand, allocates free ports, snapshots filesystem baselines for isolation verification (mtime walk with symlink handling), constructs sanitized env with auth token pinning and HOME/workspace overrides, spawns tsx server/index.ts, forwards logs, polls /api/health with retry/timeout, gracefully shuts down with SIGTERM→SIGKILL escalation, and asserts host filesystem remains untouched via mtime comparison. Exports SpawnOptions, IsolatedServerHandle, HostFsBaseline types and spawnIsolatedDevServer, stopIsolatedDevServer, assertHostUntouched, and time constants.
Fresh boot smoke test scenario
e2e-live/tests/fresh-boot.spec.ts
New Playwright e2e test validating fresh user boot path (L-FRESH-BOOT) using isolated server: verifies workspace auto-initialization, health endpoint, HTML auth meta tag injection, SPA session creation, single-turn LLM marker echo within timeout, and thinking indicator hide; ensures cleanup and host filesystem isolation via mtime baselines in finally block.
Host-drift tests for fixture
test/e2e-live/test_isolated_dev_server_drift.ts
Unit/e2e tests covering snapshot/assertHostUntouched behaviors: no-op, mtime advanced, created/removed files, absent→present flip, symlink traversal/alias handling, and symlink cycle resilience/idempotence.
Documentation and project status
docs/e2e-live-testing.md, plans/feat-e2e-live.md, package.json
Documents fresh-boot isolation pattern in e2e-live testing guide, updates feat-e2e-live plan to mark L-FRESH-BOOT as implemented with isolation strategy details (env overrides, production serving seam, sandbox disable, mtime checks), records completed dev-server infra and adjusted roadmap, and adds test:e2e:live:fresh-boot npm script for standalone test execution.

🎯 4 (Complex) | ⏱️ ~40 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Test
  participant Fixture as spawnIsolatedDevServer
  participant Builder as ensureClientBuilt
  participant Env as buildServerEnv
  participant Child as tsx_server_index
  participant Health as /api/health

  Test->>Fixture: spawnIsolatedDevServer(testId, opts)
  Fixture->>Builder: ensureClientBuilt()
  Fixture->>Fixture: probeFreePort(), snapshotHostFs(), makeTestDirs()
  Fixture->>Env: buildServerEnv(...)
  Fixture->>Child: spawn tsx server/index.ts (env, pinned token)
  Fixture->>Child: forwardChildLogs()
  Fixture->>Health: waitForHealth(baseUrl, authToken)
  Health-->>Fixture: HTTP 200 OK
  Fixture-->>Test: return IsolatedServerHandle (baseUrl, token, baselines, _process)
Loading

Possibly Related PRs

  • receptron/mulmoclaude#1411: Modifies package.json to chain yarn ensure:playwright-browsers into test:e2e:live* scripts, which complements this PR's added test:e2e:live:fresh-boot script.

Suggested Reviewers

  • isamu

Poem

🐰 A fresh boot springs forth from the dev server's nest,
With isolated homes and workspaces gently pressed,
Tokens pinned tight while mtimes mark the trace,
A single-turn echo blooms in tidy space,
Host untouched, the rabbit bows—tests passed with grace.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 18.18% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding L-FRESH-BOOT fresh-user smoke test and the isolated dev-server helper fixture, which are the primary deliverables of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/e2e-live-fresh-boot

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

* isolation contract is broken (probably a stray `homedir()` call
* that bypassed the env override).
*/
export async function assertHostUntouched(baselines: readonly HostMtimeBaseline[]): Promise<void> {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertHostUntouched currently snapshots only the directory mtimes of ~/mulmoclaude and ~/.claude/skills. That misses an important contamination class: writes to existing files under those directories (for example ~/mulmoclaude/config/settings.json), because file-content updates do not change the parent directory mtime. So this can report "host untouched" even when host files were modified. Can we make the baseline recursive (or at least include critical file mtimes/hashes) so the isolation guarantee matches the comment?

@github-actions
Copy link
Copy Markdown

CODEX VERDICT: CHANGES REQUESTED

  • e2e-live/fixtures/isolated-dev-server.ts: assertHostUntouched checks only top-level directory mtimes (~/mulmoclaude, ~/.claude/skills), which can miss mutations to existing files under those trees because parent directory mtime does not change on file-content writes. This weakens the core isolation guarantee and can yield false negatives. Please make the host-baseline check recursive or include critical-file mtimes/hashes so contamination is reliably detected.

…branches

The MULMOCLAUDE_CLIENT_DIR env override branch was reachable end-to-end
via L-FRESH-BOOT (the spec helper sets the env), but the DEFAULT branch
(env unset / empty string — what every `npx mulmoclaude` user hits)
had no automated coverage. A future refactor that accidentally changed
the default path would silently break every production install while
every test (which sets the env) still passes.

Extract the resolver into a pure function `resolveClientDir(baseDir,
envValue)` in server/utils/clientDir.ts and unit-test five branches:

  - env non-empty -> verbatim (override)
  - env undefined -> <baseDir>/../client (default, what npx users hit)
  - env empty string -> falls back to default
  - env relative path -> verbatim (no auto-absolutize)
  - env whitespace-only -> verbatim (intentional, length > 0 check)

server/index.ts now delegates to the resolver; behavior unchanged.
export async function assertHostUntouched(baselines: readonly HostMtimeBaseline[]): Promise<void> {
const drift: string[] = [];
for (const baseline of baselines) {
const current = await snapshotMtime(baseline.path);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertHostUntouched still compares only the top-level directory mtimes. That can false-pass when a file under those trees is modified in place (for example ~/mulmoclaude/config/settings.json), because file-content writes do not update the parent directory mtime. Since this helper is the main proof of host isolation, can we make this check recursive (or snapshot a critical-file set) so contamination is reliably detected?

@github-actions
Copy link
Copy Markdown

CODEX VERDICT: CHANGES REQUESTED

  • assertHostUntouched still only checks top-level directory mtimes (~/mulmoclaude, ~/.claude/skills) and can miss in-place mutations to existing files under those trees (e.g. config/settings.json), so the isolation guarantee can false-pass. Please make the baseline recursive or include a critical-file mtime/hash set.

@ystknsh ystknsh changed the title Feat/e2e live fresh boot feat: L-FRESH-BOOT fresh-user smoke + isolated dev-server helper May 25, 2026
@ystknsh ystknsh marked this pull request as ready for review May 25, 2026 07:43
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @ystknsh, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
e2e-live/fixtures/isolated-dev-server.ts (1)

182-212: ⚡ Quick win

Refactor oversized helpers to satisfy the 20-line function limit.

buildServerEnv (starts at Line 182) and spawnIsolatedDevServer (starts at Line 224) exceed the configured cap. Please split setup steps (env inheritance, token/bootstrap prep, child spawn/wait lifecycle) into smaller functions.

As per coding guidelines, Keep functions under 20 lines; split into smaller functions if needed.

Also applies to: 224-252

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e-live/fixtures/isolated-dev-server.ts` around lines 182 - 212,
buildServerEnv and spawnIsolatedDevServer are too large; split them into smaller
helpers so each function stays under 20 lines. Extract the environment
construction into two functions (e.g., buildInheritedEnv(process.env) that
returns Record<string,string> and buildOverrides(env: ServerEnv) that returns
the fixed keys like HOME, PORT, MULMOCLAUDE_AUTH_TOKEN, DISABLE_SANDBOX), move
token/bootstrap preparation into a prepareAuthAndBootstrap(env: ServerEnv)
helper, and move child process creation and lifecycle/wait logic from
spawnIsolatedDevServer into spawnAndAwaitChild(command, args, options). Replace
the original code by composing these helpers (buildInheritedEnv + buildOverrides
=> buildServerEnv, prepareAuthAndBootstrap before spawnAndAwaitChild) so each
named function (buildServerEnv, spawnIsolatedDevServer, buildInheritedEnv,
buildOverrides, prepareAuthAndBootstrap, spawnAndAwaitChild) is small and
focused.
test/server/utils/test_clientDir.ts (1)

25-66: ⚡ Quick win

Split the describe callback into smaller helpers to meet the function-length cap.

On Line 25, the suite callback is currently long; moving repeated assertions into tiny helper functions keeps the body under the 20-line threshold and makes failures easier to scan.

As per coding guidelines, Keep functions under 20 lines; split into smaller functions if needed.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/server/utils/test_clientDir.ts` around lines 25 - 66, The
describe("resolveClientDir", ...) callback is too long; refactor the test by
extracting repeated assertions into small helper functions (e.g.,
assertEnvVerbatim, assertFallbackDefault, assertWhitespaceHandled) and call
those helpers from the describe block so the suite callback body stays under 20
lines; keep references to resolveClientDir and SAMPLE_BASE_DIR/EXPECTED_DEFAULT
in the helpers and preserve the same assertions and messages so behavior and
test semantics remain unchanged.
e2e-live/tests/fresh-boot.spec.ts (1)

36-127: ⚡ Quick win

Break the main test body into small exported step helpers.

On Line 36, this callback now packs multiple independent phases (health, workspace init, auth meta validation, one-turn chat, teardown checks). Splitting them into exported helpers will satisfy the 20-line cap and improve test reuse/readability.

As per coding guidelines, Keep functions under 20 lines; split into smaller functions if needed and Extract pure logic into exported helpers for testability.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e-live/tests/fresh-boot.spec.ts` around lines 36 - 127, The test callback
in the "L-FRESH-BOOT" test is too large and should be split into small exported
step helpers: extract the health check, workspace-init assertions, auth-meta
validation, SPA one-turn chat (new-session, send, assert response using
getCurrentSessionId and marker), and teardown/host-checks (stopIsolatedDevServer
+ assertHostUntouched) into separate exported async functions (e.g.,
checkHealth, verifyWorkspaceInit, validateAuthMeta, runOneTurnChat,
finalizeAndAssertHostUntouched) and call them from the test body; keep each
exported helper under 20 lines, accept the server/page/testInfo or specific
params they need, and reuse existing helpers like
spawnIsolatedDevServer/stopIsolatedDevServer/getCurrentSessionId to preserve
behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@e2e-live/fixtures/isolated-dev-server.ts`:
- Around line 182-212: buildServerEnv and spawnIsolatedDevServer are too large;
split them into smaller helpers so each function stays under 20 lines. Extract
the environment construction into two functions (e.g.,
buildInheritedEnv(process.env) that returns Record<string,string> and
buildOverrides(env: ServerEnv) that returns the fixed keys like HOME, PORT,
MULMOCLAUDE_AUTH_TOKEN, DISABLE_SANDBOX), move token/bootstrap preparation into
a prepareAuthAndBootstrap(env: ServerEnv) helper, and move child process
creation and lifecycle/wait logic from spawnIsolatedDevServer into
spawnAndAwaitChild(command, args, options). Replace the original code by
composing these helpers (buildInheritedEnv + buildOverrides => buildServerEnv,
prepareAuthAndBootstrap before spawnAndAwaitChild) so each named function
(buildServerEnv, spawnIsolatedDevServer, buildInheritedEnv, buildOverrides,
prepareAuthAndBootstrap, spawnAndAwaitChild) is small and focused.

In `@e2e-live/tests/fresh-boot.spec.ts`:
- Around line 36-127: The test callback in the "L-FRESH-BOOT" test is too large
and should be split into small exported step helpers: extract the health check,
workspace-init assertions, auth-meta validation, SPA one-turn chat (new-session,
send, assert response using getCurrentSessionId and marker), and
teardown/host-checks (stopIsolatedDevServer + assertHostUntouched) into separate
exported async functions (e.g., checkHealth, verifyWorkspaceInit,
validateAuthMeta, runOneTurnChat, finalizeAndAssertHostUntouched) and call them
from the test body; keep each exported helper under 20 lines, accept the
server/page/testInfo or specific params they need, and reuse existing helpers
like spawnIsolatedDevServer/stopIsolatedDevServer/getCurrentSessionId to
preserve behavior.

In `@test/server/utils/test_clientDir.ts`:
- Around line 25-66: The describe("resolveClientDir", ...) callback is too long;
refactor the test by extracting repeated assertions into small helper functions
(e.g., assertEnvVerbatim, assertFallbackDefault, assertWhitespaceHandled) and
call those helpers from the describe block so the suite callback body stays
under 20 lines; keep references to resolveClientDir and
SAMPLE_BASE_DIR/EXPECTED_DEFAULT in the helpers and preserve the same assertions
and messages so behavior and test semantics remain unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0999de2-a7cb-4ca9-9cd9-175274acb307

📥 Commits

Reviewing files that changed from the base of the PR and between f3157fd and aabbbe7.

📒 Files selected for processing (8)
  • docs/e2e-live-testing.md
  • e2e-live/fixtures/isolated-dev-server.ts
  • e2e-live/tests/fresh-boot.spec.ts
  • package.json
  • plans/feat-e2e-live.md
  • server/index.ts
  • server/utils/clientDir.ts
  • test/server/utils/test_clientDir.ts

ystknsh and others added 2 commits May 25, 2026 17:20
Codex GHA (PR #1506 iter-1 + iter-2 both repeated the same finding):

  `assertHostUntouched` only checked top-level directory mtimes
  (`~/mulmoclaude`, `~/.claude/skills`), so an in-place rewrite of an
  existing file under either tree (e.g. `config/settings.json`) would
  leave the parent dir mtime unchanged and the assertion would false-
  pass — the very isolation guarantee it exists to prove.

Switch `snapshotMtime` / `snapshotHostBaselines` to a recursive walk
that captures every descendant entry's mtime into a flat
`Map<absolutePath, mtimeMs>`, and rewrite `assertHostUntouched` to
diff baseline vs current three ways (modified mtime, created entry,
removed entry). Also surface absent→present existence flips as a
distinct failure mode.

Drift output is capped at 20 entries (with a "...and N more" tail)
so a runaway leak does not produce an unreadable trace.

Add `test/e2e-live/test_isolated_dev_server_drift.ts` covering the
four detection shapes against a real tmpdir fixture — including the
specific in-place rewrite case Codex flagged. The regression net
will fail immediately if a future refactor re-introduces shallow
mtime checking.

Known false-positive risk documented inline: an external process
mutating the host roots concurrently (developer's own `yarn dev`,
editor autosave) will surface as drift. Acceptable because L-FRESH-
BOOT's target environments are CI and dev-server-off local runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex flagged two follow-up correctness gaps in the recursive walk
introduced by 9477c9b, then approved after the fix:

iter 2 — [must-fix] symlink-to-dir recursion gap:
  - walkMtimeTree gated recursion on entry.isDirectory() (Dirent
    type, returns false for symlink-to-dir), so in-place rewrites
    under a symlinked subtree of ~/.claude/skills/ or ~/mulmoclaude/
    were still missed. Switched to stat(full).isDirectory() which
    follows the symlink target.

iter 2 — [must-fix] global dev:ino dedupe caused false drift:
  - The cycle guard added in iter 1 (dedup by dev:ino) was too
    aggressive: two sibling symlinks pointing at the same real dir
    would have only ONE alias traversed, in a readdir-order-dependent
    way. Replaced with a per-branch ancestor-realpath chain: a child
    is skipped only when its realpath matches an ancestor in the
    current recursion stack (true cycle), not when any other branch
    has previously visited the same inode.

iter 2 — [question] EACCES/EPERM tolerance:
  - Extended per-entry error tolerance from ENOENT-only to
    {ENOENT, EACCES, EPERM} so symlinks into restricted trees
    (/etc, foreign home, unreadable mount) skip the subtree rather
    than aborting the walk. Documented the blind spot explicitly
    in iter 3 (POSIX write permission is independent of read
    permission, so the assertion is "untouched where readable" not
    "untouched everywhere").

Defensive: MAX_WALK_DEPTH = 64 belt-and-braces guard against
pathological chains the realpath check might miss. Helpers split
into readEntries / visitEntry / safeRealpath to stay under the
project's cognitive-complexity threshold.

Unit tests added:
  - follows symlinked directories so in-place rewrites under them
    are detected
  - records both aliases when two symlinks point at the same real
    dir (no global inode dedupe)
  - does not loop on a symlink cycle (ancestor-realpath guard)

Codex iter 3 final verdict: APPROVED (no blocking findings).
Local checks: yarn format / lint / typecheck:e2e-live / 13 unit
tests / L-FRESH-BOOT e2e all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

CODEX VERDICT: LGTM

Re-checked the current diff against prior thread history. The earlier Codex concerns about top-level-only mtime checks are resolved in the latest patch via recursive host snapshots (snapshotHostFs/assertHostUntouched) plus focused regression tests (test_isolated_dev_server_drift.ts) covering in-place rewrites, add/remove drift, symlink traversal, aliasing, and cycle handling.

I did not find additional correctness, security, accessibility, i18n-lockstep, or test-coverage issues that remain outstanding in this PR.

@ystknsh ystknsh requested a review from isamu May 25, 2026 22:32
…ad of (baseDir, envValue)

Review feedback on PR #1506: the previous signature took a `baseDir`
argument that the resolver silently discarded under the override
branch — a parameter that one of two execution paths ignores is a
smell that hides the conditional dependency at the call site.

Move the default-path construction to the caller and have the
resolver pick between two ready values (env override vs caller-
provided default). Both arguments now flow into the output
symmetrically; reading the signature tells you exactly what each
argument is for.

  // Before
  const clientDir = resolveClientDir(__dirname, process.env.MULMOCLAUDE_CLIENT_DIR);
  // After
  const clientDir = resolveClientDir(
    process.env.MULMOCLAUDE_CLIENT_DIR,
    path.join(__dirname, "../client"),
  );

Unit tests updated to the new signature; added one extra case
(`does not touch defaultDir when env is set`) that pins the
intended symmetry with a sentinel `defaultDir` value so a future
regression to the old "argument-ignored" shape fails immediately.

Local checks: yarn format / lint / typecheck:server+test / 6 unit
tests / L-FRESH-BOOT e2e all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

CODEX VERDICT: LGTM

Re-checked the current diff against prior review history.

  • The earlier CODEX VERDICT: CHANGES REQUESTED item about top-level-only mtime checks is resolved: host-drift detection is now recursive (snapshotHostFs/assertHostUntouched) and covered by dedicated regression tests (test_isolated_dev_server_drift.ts) including in-place rewrite, create/remove, symlink traversal, aliasing, and cycle scenarios.
  • I did not find additional unresolved correctness, security (XSS/SSRF/path traversal/prompt injection), accessibility, i18n-lockstep, or boundary-case test coverage issues in the current patch set.

Note: I could not execute the new tests locally in this environment because node_modules/tsx is not installed.

@github-actions
Copy link
Copy Markdown

CODEX VERDICT: LGTM

Re-reviewed the latest patch set and cross-checked prior thread history.

  • The earlier CODEX VERDICT: CHANGES REQUESTED items about top-level-only host mtime checks are resolved: host drift detection is now recursive (snapshotHostFs + assertHostUntouched) with targeted regression coverage in test/e2e-live/test_isolated_dev_server_drift.ts.
  • I did not find additional outstanding issues in correctness, edge cases, security (XSS/SSRF/path traversal/prompt injection), accessibility, i18n lockstep, or test-scope consistency for the current diff.

Note: I could not execute the new tests locally in this environment because ./node_modules/.bin/tsx is not present.

@ystknsh
Copy link
Copy Markdown
Contributor Author

ystknsh commented May 26, 2026

resolveClientDir を削って process.env.MULMOCLAUDE_CLIENT_DIR || path.join(__dirname, "../client") に寄せる方針自体は維持しつつ、回帰防止テストが消えていた点だけ補強しました。

  • server/index.ts:
    • client dir 解決を再度 resolveClientDir(...) 経由に変更(挙動は env || default のまま)
  • server/utils/clientDir.ts:
    • 最小純関数として復活
    • 仕様は明示的に envValue || defaultDir のみ
  • test/server/utils/test_clientDir.ts:
    • 仕様を固定する最小3ケースを追加
      1. env 非空文字なら env を使う
      2. env undefined なら default に fallback
      3. env 空文字 "" なら default に fallback

これで、実装はシンプルなままにしつつ、MULMOCLAUDE_CLIENT_DIR 優先・空文字 fallback の期待動作を直接テストでピン留めできるようにしました。

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/server/utils/test_clientDir.ts (1)

6-20: ⚡ Quick win

Add regression coverage for whitespace-only and relative env values.

This suite currently misses two valuable branches: whitespace-only env input and relative path passthrough.

Proposed test additions
 describe("resolveClientDir", () => {
   const DEFAULT_DIR = "/abs/repo/client";
 
+  it("uses env value when relative", () => {
+    assert.equal(resolveClientDir("dist/client", DEFAULT_DIR), "dist/client");
+  });
+
   it("uses env value when non-empty", () => {
     assert.equal(resolveClientDir("/custom/client", DEFAULT_DIR), "/custom/client");
   });
@@
   it("falls back to default when env is empty string", () => {
     assert.equal(resolveClientDir("", DEFAULT_DIR), DEFAULT_DIR);
   });
+
+  it("falls back to default when env is whitespace-only", () => {
+    assert.equal(resolveClientDir("   ", DEFAULT_DIR), DEFAULT_DIR);
+  });
 });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/server/utils/test_clientDir.ts` around lines 6 - 20, Add two test cases
for resolveClientDir: one verifying that a whitespace-only env string (e.g., "  
") is treated as empty and falls back to DEFAULT_DIR, and another verifying that
a relative path env value (e.g., "./client" or "client") is passed through
unchanged. Locate the test suite for resolveClientDir in test_clientDir.ts and
add two it blocks that call resolveClientDir with the whitespace-only string and
with a relative path, asserting the expected fallback to DEFAULT_DIR for the
former and exact passthrough for the latter.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/utils/clientDir.ts`:
- Line 2: The current fallback uses envValue || defaultDir which treats
whitespace-only strings as valid; change the logic to trim envValue before
deciding to fall back by computing a trimmed value (e.g., const trimmed =
envValue?.trim()) and return trimmed if non-empty, otherwise return defaultDir;
reference the existing envValue and defaultDir variables to locate and replace
the return expression.

---

Nitpick comments:
In `@test/server/utils/test_clientDir.ts`:
- Around line 6-20: Add two test cases for resolveClientDir: one verifying that
a whitespace-only env string (e.g., "   ") is treated as empty and falls back to
DEFAULT_DIR, and another verifying that a relative path env value (e.g.,
"./client" or "client") is passed through unchanged. Locate the test suite for
resolveClientDir in test_clientDir.ts and add two it blocks that call
resolveClientDir with the whitespace-only string and with a relative path,
asserting the expected fallback to DEFAULT_DIR for the former and exact
passthrough for the latter.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f5370c2c-09dd-4e4a-b69c-fdb704aa8c2b

📥 Commits

Reviewing files that changed from the base of the PR and between 0a7189b and b3793b9.

📒 Files selected for processing (3)
  • server/index.ts
  • server/utils/clientDir.ts
  • test/server/utils/test_clientDir.ts

Comment thread server/utils/clientDir.ts Outdated
Comment thread server/utils/clientDir.ts Outdated
@@ -0,0 +1,4 @@
export function resolveClientDir(envValue: string | undefined, defaultDir: string): string {
if (typeof envValue === "string" && envValue.trim().length > 0) return envValue;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveClientDir validates envValue.trim().length > 0 but returns the original untrimmed string. If someone sets MULMOCLAUDE_CLIENT_DIR with accidental surrounding whitespace (e.g. " /repo/dist/client "), this passes the guard but express.static() gets a path with spaces and fails to resolve the client bundle. Returning the trimmed value here would make this branch robust to that misconfiguration.

try {
await waitForHealth(baseUrl, authToken);
} catch (err) {
await killChild(child);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spawnIsolatedDevServer allocates a temp root (mkdtemp) before waiting for health, but on boot failure this catch only kills the child and rethrows. Because no handle is returned, callers cannot invoke stopIsolatedDevServer, so failed boots leak temp directories under os.tmpdir(). Consider cleaning up the temp root in this failure path as well (similar to stopIsolatedDevServer) so repeated failures do not accumulate orphaned dirs.

@github-actions
Copy link
Copy Markdown

CODEX VERDICT: CHANGES REQUESTED

  • server/utils/clientDir.ts (resolveClientDir): whitespace-only env values now fall back correctly, but non-empty values are returned untrimmed. A misconfigured value with surrounding spaces (e.g. " /repo/dist/client ") passes validation and then points express.static() at a non-existent path.
  • e2e-live/fixtures/isolated-dev-server.ts (spawnIsolatedDevServer): if boot fails after temp dirs are created, the catch path kills the child but does not remove the temp root. Because no handle is returned on failure, caller cleanup cannot run, so repeated failing boots can leak orphaned directories under os.tmpdir().

Prior CODEX VERDICT: CHANGES REQUESTED items about top-level-only host mtime checks are resolved in the current diff, so I did not re-flag them.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/utils/clientDir.ts`:
- Line 2: The validation currently checks typeof envValue === "string" and
envValue.trim().length > 0 but returns the untrimmed envValue; change the return
to the trimmed string (e.g., return envValue.trim()) so callers (functions using
envValue with express.static/path.join) receive a normalized path; locate the
check in the function that validates environment client directory (the envValue
branch) and replace the returned value accordingly.

In `@test/server/utils/test_clientDir.ts`:
- Around line 20-23: Add a test to assert behavior when env contains a non-empty
value with leading/trailing whitespace (e.g., "  /custom/client  ") to document
whether resolveClientDir should trim or preserve whitespace; update
test_clientDir.ts to call resolveClientDir with that string and assert against
the expected outcome (using DEFAULT_DIR and resolveClientDir function names to
locate the code), and if the intended behavior is to trim, adjust
server/utils/clientDir.ts (resolveClientDir) to trim env values before returning
so the test passes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 48734884-1562-4f95-b4e6-0dc210bebddf

📥 Commits

Reviewing files that changed from the base of the PR and between b3793b9 and 9c359f4.

📒 Files selected for processing (2)
  • server/utils/clientDir.ts
  • test/server/utils/test_clientDir.ts

Comment thread server/utils/clientDir.ts Outdated
Comment thread test/server/utils/test_clientDir.ts
@github-actions
Copy link
Copy Markdown

CODEX VERDICT: LGTM

Re-checked the current diff against prior thread history.

  • Prior CODEX VERDICT: CHANGES REQUESTED item 1 (client-dir whitespace handling) is resolved: resolveClientDir now trims non-empty env values and falls back for empty/whitespace-only values, with coverage in test/server/utils/test_clientDir.ts.
  • Prior CODEX VERDICT: CHANGES REQUESTED item 2 (temp-dir leak on boot failure) is resolved: spawnIsolatedDevServer now removes the temp root in the health-failure catch path before rethrow.

I did not find additional outstanding issues in correctness, security (XSS/SSRF/path traversal/prompt injection), accessibility, i18n lockstep impact, or test coverage for the new behavior.

Note: I could not run the added tests locally here because ./node_modules/.bin/tsx is not present in this environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant