test(e2e): add real-flow.mjs — daemon-driven end-to-end test (no API injection)#912
Open
vivekchand wants to merge 4 commits into main
added 4 commits
May 8, 2026 00:08
…injection)
Builds on cloud-contract.mjs with a true end-to-end scenario: spawn the
real OSS sync daemon (`python -m clawmetry.sync`) against a synthesized
OpenClaw workspace, let it heartbeat + sync events naturally, then walk
the dashboard. No /ingest/events injection — events arrive the same way
they would for any pip-install user.
Per-run setup:
1. Find the python interpreter that has clawmetry installed (reads the
`clawmetry` CLI's shebang — install.sh uses ~/.clawmetry/bin/python3,
not system python).
2. Register a fresh test account against /api/register.
3. Build /tmp/cm-real-flow-openclaw/ with a sessions/sessions.json
index + one session JSONL containing a realistic chat transcript
(user → assistant → tool_call → tool_result → assistant) using the
OpenClaw v3 schema verified against ~/.openclaw/.
4. Write daemon config at /tmp/cm-real-flow-home/.clawmetry/config.json
with the test account's api_key + a fresh AES-256-GCM enc_key.
5. Spawn the real daemon with HOME and OPENCLAW_HOME pointing at the
temp dirs.
6. Poll /api/cloud/account until usage_stats.nodes >= 1 (heartbeat
landed) — up to 25s.
7. Open the dashboard URL in a browser, walk free-tier tabs, then
Pro-gated tabs (Flow / Alerts / Notifications) at the end. Pro
tabs trigger upsell modals that DON'T auto-dismiss on tab change
(real cloud UX bug — file separately), so dismissing them between
clicks keeps screenshots clean.
8. Assert Brain shows the user message ('postmortem') and the
assistant reply ('oom kills') from the synthesized session — proves
the event made the round-trip from disk → daemon → /ingest/events
→ cloud → SSE stream → browser decryption.
9. Cleanup: SIGTERM the daemon, rm tempdirs (skip with KEEP_TEMP=1).
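
Step 6 above can be sketched as a generic poll-with-deadline helper. This is a minimal sketch, not the code in real-flow.mjs: `pollUntil`, `hasHeartbeat`, and the 500 ms interval are hypothetical, and it assumes `usage_stats` sits at the top level of the /api/cloud/account response body.

```javascript
// Poll an async check until it returns a truthy value or the deadline passes.
// `check` is supplied by the caller; nothing here talks to the network.
async function pollUntil(check, { timeoutMs = 25_000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result;
    // Stop if another sleep would overshoot the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Assumed response shape: { usage_stats: { nodes: <number> } }.
function hasHeartbeat(account) {
  return (account?.usage_stats?.nodes ?? 0) >= 1;
}
```

In a test, `check` would fetch /api/cloud/account and pass the parsed body to `hasHeartbeat`. How that request is authenticated is left out because the commit message does not say.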
cloud-contract.mjs: thicker tab walk + Brain/Flow/Tokens/Crons clicks
— catches dashboard render regressions, not just API contract breaks.
Run:
cd tests/e2e && npm run test:real-flow # headless
HEADLESS=0 npm run test:real-flow # show browser
DAEMON_LOG=1 KEEP_TEMP=1 npm run test:real-flow # debug
…ink UI test
Covers the regression that vivekchand/clawmetry-cloud#641 fixed —
auto-registered users typing a valid OTP into the 'Complete your
account' modal getting back a misleading 'Invalid token'. Without an
end-to-end test, that bug shipped silently for who knows how long.
Flow:
1. POST /api/register (mirrors what `curl install.sh | bash` does)
2. Open /cloud?token=<api_key> with Playwright
3. Wait for the Complete-your-account modal
4. Type cm-e2e-test+<random>@clawmetry.com (server-side whitelist)
5. Click 'Send verification code'
6. Read OTP via /api/auth/_test/peek-otp (defense-in-depth gated:
env-var enabled + shared-secret header + email whitelist —
vivekchand/clawmetry-cloud#642)
7. Type OTP, click Verify
8. Assert: modal closes, header email updates, no 'Invalid token'
error, no JS errors
Skips cleanly (exit 0) when CM_E2E_TEST_SECRET isn't set in the environment,
since the peek-otp endpoint won't respond unless the cloud is configured
with the matching secret.
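
The skip behaviour might look roughly like the guard below. It is a sketch under assumptions: `skipReason` and `runOrSkip` are hypothetical names, and only the CM_E2E_TEST_SECRET variable comes from this commit message.

```javascript
// Return a human-readable skip reason, or null when the test can run.
function skipReason(env) {
  const secret = env.CM_E2E_TEST_SECRET;
  if (!secret || secret.trim() === '') {
    return 'CM_E2E_TEST_SECRET is not set; skipping signup-flow test';
  }
  return null;
}

// Run the test body only when the secret is present.
// `runTest` is a caller-supplied async function that receives the secret.
async function runOrSkip(env, runTest) {
  const reason = skipReason(env);
  if (reason) {
    console.log(`SKIP: ${reason}`);
    return 'skipped';
  }
  await runTest(env.CM_E2E_TEST_SECRET);
  return 'ran';
}
```

An entry point could call `runOrSkip(process.env, main)` and exit with status 0 in both cases, so a missing secret never turns CI red.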
Run:
CM_E2E_TEST_SECRET=<secret> npm run test:signup-flow
HEADLESS=0 CM_E2E_TEST_SECRET=<secret> npm run test:signup-flow
- Capture the email the browser actually sends (not what we typed), so if peek-otp can't find it later we know whether the input mangled it.
- Use cm-e2e-test- (hyphen) instead of cm-e2e-test+ (plus). Both pass the server-side whitelist; the hyphen is safer against any input-validator weirdness.
- Filter expected 401s on /api/cloud/* immediately after the link step: link-email returns a NEW api_key stored in localStorage, but the current page's URL still has the OLD ?token=, so in-flight loadAll polls 401 until reload. Real product fix: reload after link. The test should not fail on it.

Now 10/10 passes against deployed cloud.
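
The 401 filter described above could be expressed as a small predicate. This is an illustrative sketch, not the code in the commit: `isExpectedStale401` and `unexpectedFailures` are hypothetical names, and responses are modelled as plain `{ url, status }` objects (with Playwright, those values would come from `response.url()` and `response.status()`).

```javascript
// True for the 401s that are expected once link-email has issued a new api_key
// while the page is still polling with the old ?token=.
function isExpectedStale401(response, linked) {
  if (!linked || response.status !== 401) return false;
  const { pathname } = new URL(response.url);
  return pathname.startsWith('/api/cloud/');
}

// Keep only the failed responses that should count against the test.
function unexpectedFailures(responses, linked) {
  return responses.filter(
    (r) => r.status >= 400 && !isExpectedStale401(r, linked)
  );
}
```

Gating on `linked` keeps the filter narrow: a 401 on /api/cloud/* before the link step still fails the test.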
Summary
Follow-up to #909. The cloud-contract spec verifies API behavior; this adds a true end-to-end test where events flow through the real OSS daemon — same path a pip-install user takes. No `/ingest/events` injection.
What it does
Findings while building this
Real cloud UX bug: the Pro-feature paywall modal (Flow / Alerts / Notifications) does NOT auto-dismiss when you navigate to a different tab. Result: clicking Flow on a free-tier account → modal opens → clicking Brain → Brain renders behind the modal but you can't interact with it. Test now walks free-tier tabs first and dismisses modals between Pro-tab clicks. Worth filing as a separate dashboard bug.
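
The workaround can be sketched as an ordering step plus a dismiss hook. This is a hedged illustration rather than the test's actual code: `orderTabs`, `walkTabs`, `clickTab`, and `dismissModal` are hypothetical, and only the three Pro tab names come from this PR.

```javascript
// Pro-gated tabs named in this PR.
const PRO_TABS = ['Flow', 'Alerts', 'Notifications'];

// Free-tier tabs first, Pro-gated tabs last, preserving relative order.
function orderTabs(tabs, proTabs = PRO_TABS) {
  const pro = new Set(proTabs);
  return [
    ...tabs.filter((t) => !pro.has(t)),
    ...tabs.filter((t) => pro.has(t)),
  ];
}

// Click each tab; after a Pro tab, dismiss the upsell modal before moving on.
// `clickTab` and `dismissModal` are caller-supplied (e.g. Playwright actions).
async function walkTabs(tabs, { clickTab, dismissModal }, proTabs = PRO_TABS) {
  const pro = new Set(proTabs);
  const visited = [];
  for (const tab of orderTabs(tabs, proTabs)) {
    await clickTab(tab);
    visited.push(tab);
    if (pro.has(tab)) await dismissModal();
  }
  return visited;
}
```

Passing the page actions in as callbacks keeps the ordering logic testable without a browser; the real selectors for the tabs and the modal's close button are not given in this PR.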
Run
```bash
cd tests/e2e
npm run test:real-flow # headless
HEADLESS=0 npm run test:real-flow # show browser
DAEMON_LOG=1 KEEP_TEMP=1 npm run test:real-flow # debug
```
Verified locally: 46/46 checks pass when the cloud's per-IP register rate limit isn't hit.
Why a separate test (and not part of cloud-contract.mjs)
Both share the same `tests/e2e/package.json` so adding new contract checks remains a single-file edit.