test(e2e): add real-flow.mjs — daemon-driven end-to-end test (no API injection)#912

Open
vivekchand wants to merge 4 commits into main from feat/real-flow-e2e-test
@vivekchand
Owner

Summary

Follow-up to #909. The cloud-contract spec verifies API behavior; this adds a true end-to-end test where events flow through the real OSS daemon — same path a pip-install user takes. No `/ingest/events` injection.

What it does

  1. Find the Python interpreter that has clawmetry installed by reading the `clawmetry` CLI's shebang (install.sh uses `~/.clawmetry/bin/python3`, not the system Python).
  2. Register a fresh test account.
  3. Build `/tmp/cm-real-flow-openclaw/` with a `sessions/sessions.json` index + one session JSONL containing a realistic transcript (user → assistant → tool_call → tool_result → assistant) using the OpenClaw v3 schema verified against `~/.openclaw/`.
  4. Write daemon config at `/tmp/cm-real-flow-home/.clawmetry/config.json`.
  5. Spawn the real daemon (`python -m clawmetry.sync`) with `HOME` and `OPENCLAW_HOME` overridden to point at the temp dirs.
  6. Poll `/api/cloud/account` until heartbeat lands (~5-15s).
  7. Open the dashboard URL in a browser, walk every free-tier tab, then the Pro-gated tabs at the end.
  8. Assert Brain shows the user message ('postmortem') and the assistant reply ('oom kills') from the synthesized session — proves the event made the round-trip from disk → daemon → `/ingest/events` → cloud → SSE stream → browser decryption.
  9. Clean up: SIGTERM the daemon and remove the temp dirs (skip with `KEEP_TEMP=1`).
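
Step 1 can be illustrated with a small helper. This is a minimal sketch, not the code in `real-flow.mjs`: the function name `interpreterFromShebang` is hypothetical, and it assumes the caller has already read the text of the installed `clawmetry` CLI script.

```javascript
// Hypothetical helper: extract the interpreter path from a script's shebang.
// Returns null when the text does not start with "#!".
function interpreterFromShebang(scriptText) {
  const firstLine = scriptText.split("\n", 1)[0].trim();
  if (!firstLine.startsWith("#!")) return null;
  const parts = firstLine.slice(2).trim().split(/\s+/);
  // "#!/usr/bin/env python3" names the interpreter in the second token.
  if (parts[0].endsWith("/env") && parts.length > 1) return parts[1];
  return parts[0] || null;
}
```

Reading the shebang rather than calling `python3` from `PATH` matters because the installer's interpreter is the one that actually has the `clawmetry` package available.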

Findings while building this

Real cloud UX bug: the Pro-feature paywall modal (Flow / Alerts / Notifications) does not auto-dismiss when you navigate to a different tab. On a free-tier account, clicking Flow opens the modal; clicking Brain afterwards renders Brain behind the modal, where you can't interact with it. The test now walks free-tier tabs first and dismisses the modal between Pro-tab clicks. Worth filing as a separate dashboard bug.
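
The workaround amounts to an ordering rule plus a dismissal after each Pro tab. The sketch below shows that shape under stated assumptions: `orderTabs` and `walkTabs` are hypothetical names, and `clickTab` / `dismissModal` stand in for whatever browser actions the test performs (their selectors are not shown in this PR).

```javascript
// Put free-tier tabs first and Pro-gated tabs last, preserving relative order.
function orderTabs(tabs, proTabs) {
  const pro = new Set(proTabs);
  return [...tabs.filter((t) => !pro.has(t)), ...tabs.filter((t) => pro.has(t))];
}

// Visit every tab; after each Pro tab, dismiss the paywall modal so it
// cannot cover whichever tab is clicked next.
async function walkTabs(tabs, proTabs, { clickTab, dismissModal }) {
  const pro = new Set(proTabs);
  for (const tab of orderTabs(tabs, proTabs)) {
    await clickTab(tab);
    if (pro.has(tab)) await dismissModal();
  }
}
```

If the dashboard bug is fixed so the modal closes on navigation, the `dismissModal` call becomes redundant, but the free-tier-first ordering still keeps screenshots of free tabs clean.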

Run

```bash
cd tests/e2e
npm run test:real-flow # headless
HEADLESS=0 npm run test:real-flow # show browser
DAEMON_LOG=1 KEEP_TEMP=1 npm run test:real-flow # debug
```

Verified locally: 46/46 checks pass when the cloud's per-IP register rate limit isn't hit.

Why a separate test (and not part of cloud-contract.mjs)

  • cloud-contract.mjs runs in <30s. It's the deploy gate. Fast = mandatory on every cloud deploy.
  • real-flow.mjs spawns a real Python daemon, takes ~45s, requires clawmetry to be installed locally. Heavier. Right shape for a manual / scheduled run, not every deploy.

Both share the same `tests/e2e/package.json` so adding new contract checks remains a single-file edit.

vivekchand added 4 commits May 8, 2026 00:08
…injection)

Builds on cloud-contract.mjs with a true end-to-end scenario: spawn the
real OSS sync daemon (`python -m clawmetry.sync`) against a synthesized
OpenClaw workspace, let it heartbeat + sync events naturally, then walk
the dashboard. No /ingest/events injection — events arrive the same way
they would for any pip-install user.
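
A minimal sketch of the spawn and cleanup steps, assuming Node's `child_process.spawn`. The helper names (`daemonEnv`, `startDaemon`, `stopDaemon`) are hypothetical, and treating the `log` option as "show daemon output" is an assumption about what `DAEMON_LOG=1` does.

```javascript
import { spawn } from "node:child_process";

// Inherit the parent environment, then redirect the two directories the
// daemon reads from so it never touches the developer's real files.
function daemonEnv(baseEnv, homeDir, openclawDir) {
  return { ...baseEnv, HOME: homeDir, OPENCLAW_HOME: openclawDir };
}

// Start `python -m clawmetry.sync` with the overridden environment.
function startDaemon(pythonPath, homeDir, openclawDir, { log = false } = {}) {
  return spawn(pythonPath, ["-m", "clawmetry.sync"], {
    env: daemonEnv(process.env, homeDir, openclawDir),
    stdio: log ? "inherit" : "ignore", // assumption: log=true mirrors DAEMON_LOG=1
  });
}

// Send SIGTERM and resolve once the process has exited.
function stopDaemon(child) {
  return new Promise((resolve) => {
    if (child.exitCode !== null || child.signalCode !== null) return resolve();
    child.once("exit", () => resolve());
    child.kill("SIGTERM");
  });
}
```

Overriding `HOME` as well as `OPENCLAW_HOME` is what lets the daemon pick up the temporary `.clawmetry/config.json` instead of the one belonging to the person running the test.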

Per-run setup:
  1. Find the python interpreter that has clawmetry installed (reads the
     `clawmetry` CLI's shebang — install.sh uses ~/.clawmetry/bin/python3,
     not system python).
  2. Register a fresh test account against /api/register.
  3. Build /tmp/cm-real-flow-openclaw/ with a sessions/sessions.json
     index + one session JSONL containing a realistic chat transcript
     (user → assistant → tool_call → tool_result → assistant) using the
     OpenClaw v3 schema verified against ~/.openclaw/.
  4. Write daemon config at /tmp/cm-real-flow-home/.clawmetry/config.json
     with the test account's api_key + a fresh AES-256-GCM enc_key.
  5. Spawn the real daemon with HOME and OPENCLAW_HOME pointing at the
     temp dirs.
  6. Poll /api/cloud/account until usage_stats.nodes >= 1 (heartbeat
     landed) — up to 25s.
  7. Open the dashboard URL in a browser, walk free-tier tabs, then
     Pro-gated tabs (Flow / Alerts / Notifications) at the end. Pro
     tabs trigger upsell modals that DON'T auto-dismiss on tab change
     (real cloud UX bug — file separately), so dismissing them between
     clicks keeps screenshots clean.
  8. Assert Brain shows the user message ('postmortem') and the
     assistant reply ('oom kills') from the synthesized session — proves
     the event made the round-trip from disk → daemon → /ingest/events
     → cloud → SSE stream → browser decryption.
  9. Cleanup: SIGTERM the daemon, rm tempdirs (skip with KEEP_TEMP=1).
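
Step 6 is a poll-with-deadline loop. The following sketch shows one way to write it; `waitForHeartbeat` is a hypothetical name, and `fetchAccount` is a caller-supplied function that returns the parsed JSON from `/api/cloud/account` (how the request is authenticated is not shown here). The 1-second interval is an assumption; the 25-second limit comes from the step above.

```javascript
// Poll until the account reports at least one node, or give up at the deadline.
async function waitForHeartbeat(fetchAccount, { timeoutMs = 25_000, intervalMs = 1_000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    try {
      const account = await fetchAccount();
      if ((account?.usage_stats?.nodes ?? 0) >= 1) return account;
    } catch {
      // Treat request errors as "not ready yet" and keep polling.
    }
    // Stop if another sleep would run past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`no heartbeat within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Throwing on timeout, rather than returning a falsy value, makes a daemon that never connects fail the run at this step instead of surfacing later as a confusing empty dashboard.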

cloud-contract.mjs: thicker tab walk + Brain/Flow/Tokens/Crons clicks
— catches dashboard render regressions, not just API contract breaks.

Run:
  cd tests/e2e && npm run test:real-flow                     # headless
  HEADLESS=0 npm run test:real-flow                          # show browser
  DAEMON_LOG=1 KEEP_TEMP=1 npm run test:real-flow            # debug
…ink UI test

Covers the regression that vivekchand/clawmetry-cloud#641 fixed —
auto-registered users typing a valid OTP into the 'Complete your
account' modal getting back a misleading 'Invalid token'. Without an
end-to-end test, that bug shipped silently for who knows how long.

Flow:
  1. POST /api/register (mirrors what `curl install.sh | bash` does)
  2. Open /cloud?token=<api_key> with Playwright
  3. Wait for the Complete-your-account modal
  4. Type cm-e2e-test+<random>@clawmetry.com (server-side whitelist)
  5. Click 'Send verification code'
  6. Read OTP via /api/auth/_test/peek-otp (defense-in-depth gated:
     env-var enabled + shared-secret header + email whitelist —
     vivekchand/clawmetry-cloud#642)
  7. Type OTP, click Verify
  8. Assert: modal closes, header email updates, no 'Invalid token'
     error, no JS errors

Skips cleanly (exit 0) when CM_E2E_TEST_SECRET isn't set in env, since
the cloud-side peek-otp endpoint won't respond without the matching
secret on the cloud side anyway.
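
The skip behaviour can be sketched as a guard that runs before any browser is launched. The function names and the log message are hypothetical; only the `CM_E2E_TEST_SECRET` variable and the exit code 0 come from the description above.

```javascript
// Return a human-readable reason to skip, or null when the test can run.
function skipReason(env) {
  if (!env.CM_E2E_TEST_SECRET) {
    return "CM_E2E_TEST_SECRET is not set, so peek-otp cannot be called";
  }
  return null;
}

// Exit with status 0 (a skip, not a failure) when the secret is missing.
// `exit` is injectable so the guard can be exercised without ending the process.
function skipIfUnconfigured(env = process.env, exit = (code) => process.exit(code)) {
  const reason = skipReason(env);
  if (reason) {
    console.log(`SKIP signup-flow: ${reason}`);
    exit(0);
  }
}
```

Exiting 0 keeps the script safe to include in a runner that lacks the secret, at the cost that a misconfigured environment looks like a pass, so the printed reason is worth keeping visible in logs.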

Run:
  CM_E2E_TEST_SECRET=<secret> npm run test:signup-flow
  HEADLESS=0 CM_E2E_TEST_SECRET=<secret> npm run test:signup-flow
- Capture the email the browser actually sends (not what we typed) so
  if peek-otp can't find it later we know whether the input mangled it
- Use cm-e2e-test- (hyphen) instead of cm-e2e-test+ (plus). Both pass
  the server-side whitelist; hyphen is safer against any input-validator
  weirdness
- Filter expected 401s on /api/cloud/* immediately after the link
  step: link-email returns a NEW api_key stored in localStorage, but
  the current page's URL still has the OLD ?token=, so in-flight
  loadAll polls 401 until reload. Real product fix: reload after link.
  Test should not fail on it.
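
One way to express the 401 filter is a pure predicate applied to each failed response the test observes. This is a sketch under assumptions: `isExpected401` is a hypothetical name, the record shape (`url`, `status`, `at`) is invented for illustration, and the 5-second grace window is a guess at what "immediately after" means.

```javascript
// Decide whether a failed response is the known post-link 401 noise.
// `linkedAt` is the timestamp (ms) when the link step finished, or null
// if it has not happened yet. `graceMs` is an assumed window.
function isExpected401({ url, status, at }, linkedAt, graceMs = 5_000) {
  if (status !== 401 || linkedAt === null) return false;
  const path = new URL(url).pathname;
  return path.startsWith("/api/cloud/") && at >= linkedAt && at - linkedAt <= graceMs;
}
```

Scoping the filter by path, status, and time keeps it narrow: a 401 on another endpoint, or one that appears before the link step, would still fail the test.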

Now 10/10 checks pass against the deployed cloud.