feat(release): integrate gameplay telemetry hardening for v0.2.0 #197
tuxerrante wants to merge 4 commits into
Track started, completed, and abandoned sessions as append-only telemetry so admin analytics reflect real gameplay outcomes instead of leaderboard submissions alone. Expose the resulting aggregates through a new admin API and UI while keeping the JSON and MSSQL storage paths aligned. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Made-with: Cursor
Unifies PR #153 telemetry integration with mainline hardening by binding AI routes to server-owned session scenario context, tightening runtime timeout behavior, and strengthening release/deploy gating before v0.2.0. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Pull request overview
This PR prepares the v0.2.0 release by layering gameplay lifecycle telemetry/admin analytics on top of main, hardening scenario/session trust boundaries for AI routes, adding bounded AI timeouts, and tightening deploy/release gates (notably requiring Helm integration success).
Changes:
- Add server-owned scenario context persistence to sessions and enforce it in chat/command flows, plus introduce AI timeout controls.
- Gate persistent leaderboard writes behind `PERSISTENT_LEADERBOARD_ENABLED` (default off in Helm values).
- Add an Admin Analytics UI surface and strengthen Helm/CI deploy checks, plus version/changelog updates for `0.2.0`.
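
The leaderboard gate described above could look roughly like the sketch below. Only the variable name `PERSISTENT_LEADERBOARD_ENABLED` comes from this PR; `isPersistentLeaderboardEnabled`, `submitScore`, and the strict `"true"` comparison are assumptions made for illustration, not the code in `backend/src/routes/scores.ts`.

```typescript
// Hypothetical helpers; only the env var name is taken from the PR.
type Env = Record<string, string | undefined>;

// Default off: anything other than an explicit "true" leaves the gate closed.
function isPersistentLeaderboardEnabled(env: Env): boolean {
  const raw = env["PERSISTENT_LEADERBOARD_ENABLED"];
  return raw !== undefined && raw.trim().toLowerCase() === "true";
}

// Hypothetical write path: the persistent store is only touched when the gate is open.
function submitScore(env: Env, persist: () => void): "persisted" | "skipped" {
  if (!isPersistentLeaderboardEnabled(env)) {
    return "skipped";
  }
  persist();
  return "persisted";
}
```

Taking the environment as a parameter keeps the helper easy to test; a real route would presumably pass `process.env`.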
Reviewed changes
Copilot reviewed 31 out of 33 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| helm/sre-simulator/values.yaml | Adds new Helm values for proxy trust, admin analytics visibility, leaderboard gate, and AI timeouts. |
| helm/sre-simulator/templates/tests/test-connection.yaml | Keeps failed Helm test pods for post-failure diagnostics. |
| helm/sre-simulator/templates/frontend-deployment.yaml | Wires new frontend env vars (public origin, proxy trust, admin analytics flag). |
| helm/sre-simulator/templates/configmap.yaml | Adds backend env vars for proxy trust, leaderboard gate, and AI timeouts. |
| helm/sre-simulator/Chart.yaml | Bumps chart/app versions to 0.2.0. |
| frontend/src/lib/telemetry/capture.ts | Prevents Sentry/telemetry failures from breaking user flows. |
| frontend/src/lib/release.ts | Updates frontend-reported app version to v0.2.0. |
| frontend/src/app/page.tsx | Conditionally renders an Admin Analytics link based on a public flag. |
| frontend/src/app/admin/page.tsx | Adds the Admin Analytics page UI and data-fetch flow. |
| frontend/package.json | Bumps frontend package version and updates Next.js dependency range. |
| frontend/package-lock.json | Locks updated frontend version/dependency changes. |
| frontend/.env.local.example | Documents new NEXT_PUBLIC_ADMIN_ANALYTICS_ENABLED flag. |
| CHANGELOG.md | Adds 0.2.0 release notes (features, changes, security). |
| backend/src/routes/scores.ts | Adds PERSISTENT_LEADERBOARD_ENABLED gate to persistent leaderboard writes. |
| backend/src/routes/scores.test.ts | Ensures tests set/restore PERSISTENT_LEADERBOARD_ENABLED. |
| backend/src/routes/scenario.ts | Stores scenario payload in session + adds scenario generation timeout handling. |
| backend/src/routes/command.ts | Prefers server-stored scenario payload and adds integrity checks. |
| backend/src/routes/command.test.ts | Updates unit test sessions to include stored scenario fields. |
| backend/src/routes/command.route.test.ts | Updates route-level test sessions to include stored scenario fields. |
| backend/src/routes/chat.ts | Prefers server-stored scenario payload, adds chat streaming abort/timeout behavior. |
| backend/src/routes/chat.test.ts | Updates unit test sessions to include stored scenario fields. |
| backend/src/routes/chat.mock-mode.test.ts | Updates mock-mode test sessions to include stored scenario fields. |
| backend/src/lib/storage/types.ts | Extends GameSession and creation input with scenarioId and scenarioPayload. |
| backend/src/lib/storage/mssql-stores.test.ts | Updates MSSQL store tests for new session columns/fields. |
| backend/src/lib/storage/mssql-session-store.ts | Persists/reads scenario_id and scenario_payload in MSSQL sessions. |
| backend/src/lib/storage/migrations/005_session_scenario_context.sql | Adds MSSQL columns for server-owned scenario identity/payload. |
| backend/src/lib/storage/json-session-store.ts | Stores new scenario fields in JSON session store. |
| backend/src/integration/game-flow.test.ts | Updates integration tests to set/restore leaderboard gate env var. |
| backend/package.json | Bumps backend package version to 0.2.0. |
| backend/package-lock.json | Locks backend version bump. |
| backend/.env.local.example | Documents TRUST_PROXY_HEADERS, leaderboard gate, and AI timeout env vars. |
| .github/workflows/helm-integration.yml | Improves Helm test log capture (--logs) and failure diagnostics. |
| .github/workflows/deploy-prod.yml | Adds a production deploy gate requiring Helm integration check success. |
Files not reviewed (2)
- backend/package-lock.json: Language not supported
- frontend/package-lock.json: Language not supported
    TRUST_PROXY_HEADERS: "{{ ternary "true" "false" .Values.backend.trustProxyHeaders }}"
    PERSISTENT_LEADERBOARD_ENABLED: "{{ ternary "true" "false" .Values.backend.persistentLeaderboardEnabled }}"

      }
    } else {
      scenario = rawScenario ?? null;
      if (scenario && scenario.difficulty !== session.difficulty) {

      }
    } else {
      scenario = rawScenario ?? null;
      if (scenario && scenario.difficulty !== session.difficulty) {

    try {
      const response = await fetch("/api/gameplay/admin");
      const raw = await response.text();
      const parsed = JSON.parse(raw) as GameplayAnalytics | { error?: string };
      if (!response.ok) {
        throw new Error("error" in parsed ? parsed.error : "Failed to load gameplay analytics");
      }
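
One thing to watch in the fetch snippet above: `JSON.parse(raw)` runs before the `response.ok` check, so a non-JSON body (for example an HTML error page from a proxy) would surface as a `SyntaxError` rather than the intended message. A minimal sketch of a more tolerant parsing step follows; `parseAnalyticsBody` and `ParseResult` are hypothetical names, and the fallback message is copied from the snippet.

```typescript
// Hypothetical result type; the real page works with a GameplayAnalytics type not shown in this PR view.
type ParseResult =
  | { ok: true; data: unknown }
  | { ok: false; message: string };

// Pull an `error` string out of a parsed body, if there is one.
function extractErrorMessage(parsed: unknown): string | undefined {
  if (typeof parsed !== "object" || parsed === null) {
    return undefined;
  }
  const candidate = (parsed as Record<string, unknown>)["error"];
  return typeof candidate === "string" ? candidate : undefined;
}

// Takes the two facts the caller already has: response.ok and the raw body text.
function parseAnalyticsBody(responseOk: boolean, raw: string): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    parsed = undefined; // JSON.parse never returns undefined, so this marks "not JSON"
  }
  if (!responseOk) {
    return { ok: false, message: extractErrorMessage(parsed) ?? "Failed to load gameplay analytics" };
  }
  if (parsed === undefined) {
    return { ok: false, message: "Gameplay analytics response was not valid JSON" };
  }
  return { ok: true, data: parsed };
}
```

Keeping the function free of `fetch` makes it straightforward to unit test with plain strings.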
Fail closed when stored session scenario payloads are invalid, enforce title+difficulty parity in legacy fallback paths, and keep proxy-header trust opt-in by default to preserve request identity guarantees. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Align Helm env templating with valid quote rendering, tighten legacy scenario fallback parity with session IDs, and make admin analytics prompt for a token so secured backend reads succeed from the UI. Signed-off-by: Alessandro Affinito <aaffinit@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
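
The fail-closed behavior described in the two commits above can be sketched as follows. This is an illustration under stated assumptions, not the code in `command.ts` or `chat.ts`: the names `scenarioId`, `scenarioPayload`, `title`, and `difficulty` appear in this PR, while `resolveScenario`, `SessionContext`, `scenarioTitle`, and storing the payload as a JSON string are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface Scenario {
  id?: string;
  title: string;
  difficulty: string;
}

interface SessionContext {
  difficulty: string;
  scenarioId?: string;
  scenarioTitle?: string; // hypothetical field used for title parity
  scenarioPayload?: string | null; // assumed to be a JSON string
}

type ScenarioResolution =
  | { ok: true; scenario: Scenario | null }
  | { ok: false; reason: string };

function isScenario(value: unknown): value is Scenario {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const idOk = candidate["id"] === undefined || typeof candidate["id"] === "string";
  return idOk && typeof candidate["title"] === "string" && typeof candidate["difficulty"] === "string";
}

function resolveScenario(session: SessionContext, rawScenario: Scenario | null | undefined): ScenarioResolution {
  // Preferred path: the server-owned payload. An invalid payload is rejected, never replaced by client input.
  if (session.scenarioPayload != null) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(session.scenarioPayload);
    } catch {
      return { ok: false, reason: "stored scenario payload is not valid JSON" };
    }
    if (!isScenario(parsed)) {
      return { ok: false, reason: "stored scenario payload has an unexpected shape" };
    }
    return { ok: true, scenario: parsed };
  }

  // Legacy fallback: accept the client-supplied scenario only if it agrees with the session.
  const scenario = rawScenario ?? null;
  if (scenario === null) return { ok: true, scenario: null };
  if (scenario.difficulty !== session.difficulty) return { ok: false, reason: "difficulty mismatch" };
  if (session.scenarioTitle !== undefined && scenario.title !== session.scenarioTitle) {
    return { ok: false, reason: "title mismatch" };
  }
  if (session.scenarioId !== undefined && scenario.id !== session.scenarioId) {
    return { ok: false, reason: "scenario id mismatch" };
  }
  return { ok: true, scenario };
}
```

Returning a tagged result rather than throwing leaves the choice of HTTP status to the route handler.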
    const streamController = new AbortController();
    const streamTimeoutMs = getChatTimeoutMs();
    const streamTimeout = setTimeout(() => {
      streamController.abort(new ChatStreamTimeoutError(streamTimeoutMs));
    }, streamTimeoutMs);

    const storeToken = (token: string): void => {
      if (typeof window === "undefined") {
        return;
      }
      window.localStorage.setItem("gameplayAdminToken", token);
    };

    const promptedToken = window.prompt("Enter gameplay admin token");
    if (promptedToken && promptedToken.trim()) {
      token = promptedToken.trim();
      storeToken(token);
      response = await fetch("/api/gameplay/admin", {
        headers: { "x-gameplay-admin-token": token },
      });
    }
Addressed Copilot and multi-role review findings in follow-up commits:
Validation rerun after fixes:
Current PR checks are green.
Pull request overview
Copilot reviewed 31 out of 33 changed files in this pull request and generated 3 comments.
Files not reviewed (2)
- backend/package-lock.json: Language not supported
- frontend/package-lock.json: Language not supported
Comments suppressed due to low confidence (1)
backend/src/routes/chat.ts:241
- The route intentionally aborts the AI request on `req.close` (client disconnect) and on timeout, but the inner `catch` always calls `captureBackendRouteError`. This will report expected aborts (e.g., user navigates away) as errors in Sentry and can create noisy telemetry. Consider detecting abort/disconnect cases (e.g., `streamController.signal.aborted` / `error.name === "AbortError"`) and either skip capturing or downgrade to a debug log, while still capturing genuine provider failures.

    } catch (error) {
      captureBackendRouteError(req, error, "Chat stream failed");
      const errorMessage = error instanceof ChatStreamTimeoutError
        ? "Chat stream timed out. Please retry."
        : "Chat stream failed";
      res.write(`data: ${JSON.stringify({ error: errorMessage })}\n\n`);
      res.end();
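
A minimal sketch of the classification the suppressed comment asks for, assuming the route records why it aborted. `AbortCause`, `classifyStreamFailure`, and `shouldCapture` are hypothetical names; whether timeouts should still be reported is a policy choice this sketch makes, not something the PR states.

```typescript
// Hypothetical names throughout; the route would set the cause where it calls abort().
type AbortCause = "timeout" | "client-disconnect";
type StreamFailureKind = AbortCause | "provider-failure";

function classifyStreamFailure(error: unknown, abortCause: AbortCause | null): StreamFailureKind {
  // If the route itself aborted the stream, trust the cause it recorded.
  if (abortCause !== null) {
    return abortCause;
  }
  // A bare AbortError with no recorded cause is treated as a disconnect.
  if (error instanceof Error && error.name === "AbortError") {
    return "client-disconnect";
  }
  return "provider-failure";
}

// Assumed policy: skip expected disconnects, keep timeouts and genuine provider failures.
function shouldCapture(kind: StreamFailureKind): boolean {
  return kind !== "client-disconnect";
}
```

In the `catch` block, the call to `captureBackendRouteError` would then be wrapped in `if (shouldCapture(kind))`, with disconnects going to a debug log instead.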
    const streamController = new AbortController();
    const streamTimeoutMs = getChatTimeoutMs();
    const streamTimeout = setTimeout(() => {
      streamController.abort(new ChatStreamTimeoutError(streamTimeoutMs));
    }, streamTimeoutMs);
    const onClientClose = () => {
      streamController.abort(new Error("Chat client disconnected"));
    };
    req.on("close", onClientClose);

    const stream = streamAiText({
      maxTokens: MAX_CHAT_TOKENS,
      system: systemPrompt,
      messages: compaction.messages,
      route: "chat",
      cacheKey: scenario?.title ?? "no-scenario",
      signal: streamController.signal,
      compactionMeta: {

    helm_integration_conclusion="$(
      gh api \
        "repos/${GITHUB_REPOSITORY}/commits/${RELEASE_SHA}/check-runs" \
        --jq '
          .check_runs[] |
          select(.name == "helm-test") |
          .conclusion
        ' | \
        head -n 1
    )"
    if [[ "${helm_integration_conclusion}" != "success" ]]; then
      echo "Helm integration check did not pass for ${RELEASE_TAG}."
      echo "Expected check run name: helm-test"
      echo "Commit: ${RELEASE_SHA}"
      exit 1
    fi

          messages: [
            {
              role: "user",
              content: `Generate a ${difficulty} difficulty ARO incident scenario.`,
            },
          ],
        }),
        getScenarioTimeoutMs(),
      );
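
The scenario snippet above passes an AI call and `getScenarioTimeoutMs()` to a wrapper that this view does not show. One plausible shape for such a wrapper is sketched below. `withTimeout` and `parseTimeoutMs` are hypothetical names, and the use of `Promise.race` is an assumption rather than the PR's implementation.

```typescript
// Hypothetical helper: reject if `work` has not settled within `timeoutMs`.
async function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`Operation timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so a finished request does not keep the event loop alive.
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  }
}

// Hypothetical parser for a timeout env value: fall back when the value is missing or not a positive number.
function parseTimeoutMs(raw: string | undefined, fallbackMs: number): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? Math.floor(parsed) : fallbackMs;
}
```

Note that racing against a timer only stops waiting; it does not cancel the underlying request. The chat route's `AbortController` approach is what actually cancels in-flight work.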
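Summary
- `main` with conflict-safe behavior.
- `PERSISTENT_LEADERBOARD_ENABLED` (default false).
- `v0.2.0`.

Test plan
- `make install`
- `make validate`
- `make test`
- `make test-integration`
- `make verify-release-version TAG=v0.2.0`
- `make e2e-azure-route` (blocked locally: missing required env vars in `backend/.env.local`)

Made with Cursor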
make installmake validatemake testmake test-integrationmake verify-release-version TAG=v0.2.0make e2e-azure-route(blocked locally: missing required env vars inbackend/.env.local)Made with Cursor