Skip to content

Feat/eng 2964#725

Merged
bao-byterover merged 13 commits into
proj/analytics-system-tool-modefrom
feat/ENG-2964
May 28, 2026
Merged

Feat/eng 2964#725
bao-byterover merged 13 commits into
proj/analytics-system-tool-modefrom
feat/ENG-2964

Conversation

@cuongdo-byterover
Copy link
Copy Markdown
Collaborator

Summary

  • Problem:
  • Why it matters:
  • What changed:
  • What did NOT change (scope boundary):

Type of change

  • Bug fix
  • New feature
  • Refactor (no behavior change)
  • Documentation
  • Test
  • Chore (build, dependencies, CI)

Scope (select all touched areas)

  • TUI / REPL
  • Agent / Tools
  • LLM Providers
  • Server / Daemon
  • Shared (constants, types, transport events)
  • CLI Commands (oclif)
  • Hub / Connectors
  • Cloud Sync
  • CI/CD / Infra

Linked issues

  • Closes #
  • Related #

Root cause (bug fixes only, otherwise write N/A)

  • Root cause:
  • Why this was not caught earlier:

Test plan

  • Coverage added:
    • Unit test
    • Integration test
    • Manual verification only
  • Test file(s):
  • Key scenario(s) covered:

User-visible changes

List user-visible changes (including defaults, config, or CLI output).
If none, write None.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording

Checklist

  • Tests added or updated and passing (npm test)
  • Lint passes (npm run lint)
  • Type check passes (npm run typecheck)
  • Build succeeds (npm run build)
  • Commits follow Conventional Commits format
  • Documentation updated (if applicable)
  • No breaking changes (or clearly documented above)
  • Branch is up to date with main

Risks and mitigations

List real risks for this PR. If none, write None.

  • Risk:
    • Mitigation:

…sk types

Adds curate-tool-mode, query-tool-mode, dream-scan, dream-finalize to
TaskTypes + TASK_TYPE_VALUES. After ENG-2925's rename the daemon
dispatches these four types but the analytics enum still predates v4.0
— task_created / task_completed / task_failed silently rejected every
tool-mode emit at the Zod boundary.

- TaskTypes keeps the legacy 'curate' / 'query' / 'dream' values for
  back-compat with any constructor still building those payloads
- per-event schemas (task_created / task_completed / task_failed) pick
  up the new types automatically via z.enum(TASK_TYPE_VALUES)
- M12 per-flavor schemas (curate_run_completed / query_completed) still
  hardcode their own literals and continue to reject tool-mode types
  here — M14.2 migrates them to the canonical enum as a follow-up

TDD:
- new task-types tests assert TaskCreated/Completed/Failed accept all
  four new types
- regression tests pin the M12 schemas' continued rejection so M14.2
  has a clear flip point
…sk_type to TASK_TYPE_VALUES

Migrates the two M12 per-flavor schemas from hardcoded literal task_type
values to the canonical TASK_TYPE_VALUES enum so v4.0 tool-mode tasks
round-trip the wire boundary instead of being silently Zod-rejected
inside AnalyticsHook.

- curate_run_completed: z.enum(['curate', 'curate-folder']) →
  z.enum(TASK_TYPE_VALUES). curate-tool-mode payloads now validate.
- query_completed: z.literal('query') → z.enum(TASK_TYPE_VALUES).
  query-tool-mode payloads now validate.

The schemas no longer structurally constrain task_type to the curate
or query family; the hook is trusted to only emit each event for the
right flavor. Docblocks call out the widening for the next maintainer.

TDD:
- curate-run-completed.test asserts curate-tool-mode + legacy values
  both succeed; an unknown task_type still rejects
- query-completed.test mirrors the same coverage for query-tool-mode
- task-types.test M14.1 regression assertions flipped from rejection
  to acceptance for the M12 schemas; M14.1 docblock comments updated

Without this, M14.3's hook code would land but the M12 emits that fire
alongside the new generic task_* emits would silently disappear on
every tool-mode task — that's the bug operators noticed in Mixpanel.
…ed in AnalyticsHook

Adds the three generic funnel-event emits described in M14's customer
ask. Every daemon task (curate / curate-folder / curate-tool-mode
(aliased from curate-html-direct) / query / query-tool-mode /
dream-scan / dream-finalize / search) now produces, in order:

  task_created      onTaskCreate (funnel entry — unconditional)
  ...optional per-op + M12 per-flavor terminal emit
  task_completed    onTaskCompleted (terminal-event-last on success)
  task_failed       onTaskError / onTaskCancelled (terminal-event-last)

Three coupled changes shipped together so the wire stays consistent:

1. analytics-hook.ts grows three emits and a `toAnalyticsTaskType`
   alias-translator. The daemon still dispatches the pre-ENG-2925 name
   `curate-html-direct`; analytics canonicalises it to the post-rename
   `curate-tool-mode` so the wire enum matches TASK_TYPE_VALUES. Once
   ENG-2925 lands, the alias becomes a no-op identity.
2. CURATE_TASK_TYPES + QUERY_TASK_TYPES gain the tool-mode names so M12
   per-flavor state init kicks in for tool-mode curates / queries. M12
   counters stay all-zero on tool-mode today (no LLM tool calls fire) —
   that's a separate follow-up (FU-1 in plans/analytics-m14/follow-ups).
3. Both M12 payload builders route `task_type` through the same alias
   so curate-tool-mode tasks emit `task_type='curate-tool-mode'` on the
   curate_run_completed / query_completed events too.

TDD:
- new analytics-hook-m14.test.ts: 15 simulation tests covering every
  task type, the curate-html-direct → curate-tool-mode alias, has_files
  / has_folder semantics, both terminal paths, and the ordering invariant
- existing analytics-hook.test.ts updated to filter out the new generic
  emits via `filterM12()` so M12-focused assertions keep their intent
- integration stress tests updated for the new 13-event sequence
  (task_created → ops → curate_run_completed → task_completed)
- repo full suite: 9121 passing, 0 failing
…ords/tags, structured related_paths

Tightens the curate / query M12 payloads per code review:

curate_operation_applied:
- rename absolute_path → relative_path (project-relative via path.relative
  against the task's projectPath; falls back to identity when projectPath
  is unset so search tasks at the daemon root still emit a usable string)
- keywords / tags promoted from optional to required arrays (default [])
  so the wire shape stays uniform regardless of frontmatter read success

query_completed.read_paths_with_metadata:
- same absolute_path → relative_path rename
- same keywords / tags promotion (required arrays, default [])
- flat related: string[] → structured related_paths: [{relative_path,
  keywords, tags}] so each linked topic carries its own metadata slot.
  Keywords/tags default to [] until a future FU cascade-reads each
  linked file's frontmatter — the wire shape doesn't need to change
  when that lands.

Hook implementation:
- new `toRelativePath(filePath, projectPath)` helper using node:path.relative
- CurateTaskAnalyticsState stores projectPath captured at onTaskCreate so
  per-op emits can relativize without threading task through processToolResult
- all four payload sites (curate-op, curate-run-completed, query-completed
  read_paths_with_metadata) route through the helper

Inspection test added at test/unit/.../analytics-hook-toolmode-inspection
that pretty-prints every event + payload for curate-tool-mode / query
/ query-tool-mode flows — gives PMs a single place to verify the wire
shape end-to-end.

Privacy win: relative paths drop the /Users/{name} prefix from every
file-touched event, keeping host-identifiable PII off the analytics
wire while still letting PMs reason about which file inside a project
an operation touched.

Tests: full repo 9132 passing.
…re_kind on task_failed

Delivers M15.6's customer-facing piece: the actual wiring of AnalyticsHook
into the daemon's lifecycleHooks[] (which M14.3 built emit logic for but
never plumbed) plus the coarse failure_kind classifier on task_failed.

Per the M14 / M15.6 alignment (Option A): M14.1-M14.3 ship as the
foundation. M15.6 adds the wiring + failure_kind. M15.6's stated
"scaffolded only" stance on task_created + per-flavor schemas is
intentionally relaxed — M14 already produces those events and the
review-tightening on curate_operation_applied / read_paths_with_metadata
delivers PM-visible value that the M15.6 description deferred.

Changes:

src/server/infra/daemon/brv-server.ts:
- import AnalyticsHook
- construct it BEFORE TransportHandlers so it can land in lifecycleHooks[]
  alongside curate-log / query-log / task-history (now 4 peers, not 3)
- the isAnalyticsEnabled gate from setupFeatureHandlers binds in via a
  closure ref that defers lookup until emit-time
- capture setupFeatureHandlers's return value (was discarded), then
  setAnalyticsClient(analyticsClient) on the pre-registered hook

src/shared/analytics/events/task-failed.ts:
- new FailureKindValues + FailureKind type: 'cancelled' | 'timeout' |
  'agent_error' | 'unknown' (coarse vocab, ≤64 chars, never raw
  error.message)
- failure_kind added to TaskFailedSchema as a required field
- docblock notes the M15.6 privacy contract

src/server/infra/process/analytics-hook.ts:
- new classifyFailureKind(errorMessage) helper. Substring sentinels
  for 'timeout' / 'timed out' / 'deadline exceeded' → 'timeout';
  'agent' / 'llm' / 'provider' / 'tool' → 'agent_error'; default
  'unknown'. Raw error string NEVER ends up on the wire.
- emitTaskFailed now takes a FailureKind argument
- onTaskCancelled → emitTaskFailed(..., 'cancelled')
- onTaskError → emitTaskFailed(..., classifyFailureKind(errorMessage))

TDD:
- task-failed.test: 3 new tests pin the failure_kind shape (accepts the
  4 canonical values, rejects out-of-vocab, rejects missing)
- analytics-hook-m14.test: 3 new tests verify the classifier paths land
  on the wire — 'kaboom' → 'unknown', timeout strings → 'timeout',
  agent strings → 'agent_error'
- new analytics-hook-lifecycle-wiring.test integration covers the
  end-to-end task:create → analyticsHook.onTaskCreate path through a
  real TaskRouter, plus all four task-type variants going through
  task_created → task_completed in sequence

Full suite: 9145 passing, 0 failing. Lint clean.
#1 classifyFailureKind — word-boundary regex + pinned precedence:
  Substring matching let 'tooltip' / 'engagement' / 'urgent' bucket into
  agent_error. Tightened to /\b(agent|llm|provider|tool)\b/ and
  /\b(timeout|timed out|deadline exceeded)\b/. Docblock pins precedence
  (timeout > agent_error > unknown) so future if-order shuffles can't
  silently rebucket the funnel.

#2 toRelativePath outside-project guard (privacy bug):
  path.relative('/proj','/Users/dev/other/x.md') returned
  '../../Users/dev/other/x.md' — still encoded the host layout. Same
  hole when projectPath was undefined (fell back to raw absolute path).
  Now returns '<outside-project>/<basename>' in both cases; preserves
  enough signal for backend grouping without becoming PII. Same guard
  also runs against absolute-path tails (Windows drive-letter switches).

#3 toAnalyticsTaskType drift guard:
  Replaced `daemonType as TaskType` (per CLAUDE.md anti-cast rule) with
  a TASK_TYPE_SET membership check + processLog warning + 'unknown'
  sentinel fallback. A future un-enumerated dispatch now lands a
  debuggable warning at the daemon instead of disappearing at the
  backend Zod check.

#4 dumpEvents test pollution:
  Gated behind DUMP_ANALYTICS=1. `npm test` runs the shape assertions
  silently; opt-in dump still works via
  `DUMP_ANALYTICS=1 npx mocha test/unit/.../analytics-hook-toolmode-inspection.test.ts`.

#5 FU-1 forward-compat comment:
  Added a block-comment note above the tool-mode "operations_*: 0"
  assertions so when FU-1 lands and the counters flip non-zero, the
  failure reads as "feature, update the test" rather than a regression.

#6 curate_operation_applied.related asymmetry:
  Added a TODO(M15.x) marker on the `related` field flagging the
  wire-shape asymmetry with `query_completed.read_paths_with_metadata
  .related_paths` (structured). Restructure is consumer-migration
  territory — own ticket.

#7 brv-server.ts loud-fail assertion:
  Added an explicit throw if setupFeatureHandlers returns without
  analyticsClient. Future refactors that drop the field will explode
  here instead of silently no-op'ing every emit forever.

Tests added to lock in #1, #2, #3 behavior so the review intent
doesn't drift. Existing stress-test fixtures moved from `/A/op-N.md`
under-the-root paths to `/proj/A/op-N.md` so they exercise the
in-project case (out-of-project is now its own focused test).

Full suite: 9227 passing, 0 failing. Lint clean.
#722 re-review)

Re-review caught a real bug in the previous fixup: `toAnalyticsTaskType`
returned `'unknown' as TaskType` for un-enumerated daemon types, but
`'unknown'` was NOT in `TASK_TYPE_VALUES`. The docblock promised "keeps
the event on the wire instead of silently failing the Zod check at the
backend" — but the wire-side `z.enum(TASK_TYPE_VALUES)` actually still
rejected the row. The local Sinon trackStub doesn't run Zod, which is
why the m14 test passed even though the runtime path was broken.

Also the `as TaskType` cast moved sites but didn't disappear — the
exact anti-pattern the original review flagged.

Fix:
- Add `TaskTypes.UNKNOWN: 'unknown'` to the canonical vocabulary.
- Append `TaskTypes.UNKNOWN` to `TASK_TYPE_VALUES` so every task_*
  schema accepts the sentinel via `z.enum(TASK_TYPE_VALUES)`.
- Introduce `isCanonicalTaskType()` predicate so `toAnalyticsTaskType`
  narrows without the `as TaskType` cast. Sentinel returned as
  `TaskTypes.UNKNOWN` directly.
- Update `task-types.test` to assert the new `UNKNOWN` key/value pair.
  Existing negative test for an unknown task_type already uses
  `'not-a-real-type'` (not `'unknown'`), so it stays green.

Cosmetic: dropped the extra-space drift in the OUTSIDE_PROJECT_PATH
docblock noted in the same re-review.

Tests: 9227 passing, 0 failing.
… A e2e

E2E shake-down caught a regression my own M15.6 commit (6983e88)
introduced: when I refactored `setupFeatureHandlers`'s return capture
from destructured `{analyticsClient, analyticsFlushScheduler,
isAnalyticsEnabled}` to a bundled `featureHandlers` object, I dropped
the call to `analyticsFlushScheduler.start()` that lived right after
the destructure (originally landed in M4.3 / ENG-2645) AND lost the
`analyticsFinalFlush` shutdown hook bound to the ShutdownHandler.

Symptom: every `track()` call still landed in JSONL with status='pending',
but the periodic 30s flush tick never armed (start() un-called) and
shutdown never force-flushed (analyticsFinalFlush undefined). Events
sat on disk forever. Production-affecting; the existing
dev-beta.e2e.ts kept passing only because its CI run uses a real
backend that the periodic flush never reached.

Fix in brv-server.ts:
- Re-add `let analyticsFinalFlush: (() => Promise<void>) | undefined`
  before ShutdownHandler construction so the shutdown closure can
  reference it via late-bind (same mutable-holder pattern M4.3 used).
- Pass `analyticsFinalFlush: () => analyticsFinalFlush?.() ?? Promise.resolve()`
  to ShutdownHandler.
- After the M15.6 late-bind of setAnalyticsClient, call
  `featureHandlers.analyticsFlushScheduler.start()` and assign the
  final-flush closure (`stop() + flushFinal({timeoutMs: 3000})`).

E2E coverage (Phase A — 5 cancel scenarios):
- New test/e2e/analytics/lifecycle-wire.e2e.ts drives real daemon +
  transport client + in-process HTTP capture stub.
- Each scenario: task:create → 50ms wait → task:cancel → wait for the
  natural 30s flush tick → assert wire payload.
- Coverage: curate-html-direct (alias → curate-tool-mode on wire),
  curate-folder (via the same flow), query-tool-mode, dream-scan,
  dream-finalize, search.
- Assertions: task_created + task_failed{failure_kind=cancelled}
  reach the wire with the right task_type. dream / search assert no
  per-flavor M12 row (only the generic task_* pair).
- Wire-shape sanity per request: device-id header, brv-cli/* UA,
  super-properties (cli_version, os, node_version, environment,
  device_id) on every event payload.

Gated by `npm run test:e2e:lifecycle` (180s timeout) — not part of
default `npm test` glob. ~2m end-to-end. Matches dev-beta.e2e.ts
precedent.

Phase B (super-props deep checks, JSONL/HTTP parity) deferred to a
follow-up commit.
… into feat/ENG-2964

# Conflicts:
#	src/server/infra/daemon/brv-server.ts
#	test/integration/infra/process/analytics-hook-lifecycle-wiring.test.ts
Adds `npm run test:e2e:db` covering the FULL chain from `brv` CLI through
real HTTP to the local telemetry container into postgres `raw_events`.
Complements the Phase A `test:e2e:lifecycle` in-process capture suite —
that one stops at the daemon's HTTP request body; this one proves the
records actually land in the partitioned `raw_events` table with the
documented event names, task_type alias translation, failure_kind, and
shared device_id.

Three scenarios:
- curate-tool-mode (alias-translated from curate-html-direct)
- query-tool-mode
- search (no per-flavor M12 event, only task_created + task_failed)

Test reaches postgres via `docker exec ... psql -t -A` against the
existing byterover-telemetry compose stack (no host psql dependency).
A `before()` precondition skips the suite with a clear migration hint
when raw_events is missing, and `afterEach` preserves the scenario
temp dirs on failure so daemon logs + JSONL can be inspected post-hoc.

Not picked up by `npm test` (default glob excludes test/e2e/).
…erage

Grows `npm run test:e2e:db` from 3 → 7 scenarios. Adds dream-scan +
dream-finalize roundtrips (previously absent from DB coverage), a
deep wire-shape sanity check on a curate scenario, and a two-task
isolation guard that confirms task_id selectors don't bleed events
across concurrent runs.

New per-row sanity helper (`assertRowShape`) is invoked from every
scenario and pins:
  - schema_version = 1
  - cli_version matches semver
  - os ∈ {darwin, linux, win32}
  - node_version starts with `v<digit>`
  - environment ∈ {development, production}
  - device_id non-empty + identical across all rows of a task
  - user_id null (anonymous emits — test never authenticates)

Per-flavor payload assertions added:
  - curate_run_completed: outcome='cancelled', duration_ms ≥ 0, all
    operations_* counters and pending_review_count = 0 (we cancel
    before any tool calls land)
  - query_completed: outcome='cancelled', duration_ms ≥ 0,
    read_tool_call_count = 0, search_call_count = 0
  - task_created: has_files=false, has_folder=false
  - task_failed: failure_kind='cancelled'

Wire-shape sanity scenario adds:
  - client_timestamp ≤ received_at + 5s budget
  - both timestamps fall in [test_start - 10s, test_end + 10s]
  - properties.device_id === identity_device_id column (M4.1 stamp
    matches the request header)

Isolation scenario:
  - drives a curate-tool-mode task and a query-tool-mode task in
    sequence against the same daemon
  - asserts no task_id bleed, task_type partitions cleanly along
    task_id, both share device_id (single daemon → single config)

Cutover note: switched all e2e dispatches from 'curate-html-direct'
to 'curate-tool-mode' after the proj branch merge — TaskTypeSchema
only accepts the canonical name on the wire now. The legacy alias
remains in AnalyticsHook's `toAnalyticsTaskType` and is exercised
by the integration test directly, not via the transport.

Suite runtime: ~5 min (7 scenarios × ~30s each), all green.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

Claude finished @cuongdo-byterover's task in 5m 44s —— View job


Claude PR Review — feat/ENG-2964 (M15.6 e2e + final-flush wiring)

Overall score: 2 / 5 — The shutdown-flush restoration in brv-server.ts is sound, and the DB e2e is well-structured. However, the wire e2e and the in-PR integration-test edit both use a task type ('curate-html-direct') that the transport schema rejects, which means the very paths they claim to cover never execute. Looks like the merge resolution in 438efff flipped the integration test from a passing value to a broken one and was never re-run.


Findings

Blockers (must fix before merge):

# File Why
1 test/integration/infra/process/analytics-hook-lifecycle-wiring.test.ts:225 type: 'curate-html-direct' is rejected by TaskRouter.isValidTaskType (task-router.ts:917) because the value is not in TaskTypeSchema (schemas.ts:427-436). Lifecycle hooks never fire → expect(created).to.not.equal(undefined) should fail. (inline)
2 test/e2e/analytics/lifecycle-wire.e2e.ts:296 Same root cause — client.requestWithAck('task:create', {type:'curate-html-direct'}) is rejected at the transport boundary; waitFor for 2 events will time out every run. The DB e2e in this same PR (lifecycle-db.e2e.ts:381-384) actually documents this exact rejection and uses the canonical name. (inline)

Substantive:

  • src/server/infra/daemon/brv-server.ts:704-713 — commit message says "restore analyticsFlushScheduler.start()", but the comment claims start() should run AFTER first track and the call sits before any track happens. Either the scheduler defers internally (comment is stale wording) or the original M4.3 contract is being weakened. Worth pinning down. (inline)
  • test/e2e/analytics/lifecycle-wire.e2e.ts:154-195 — task ids are e2e-…-${Date.now()} but TaskCreateRequestSchema.taskId is declared z.string().uuid(). Not validated at runtime today (TaskRouter doesn't .parse()), but a latent break if anyone wires that validator. Use crypto.randomUUID(). (inline)
  • test/e2e/analytics/lifecycle-db.e2e.ts:147-166fetchEvents interpolates taskIdLike into a raw SQL string. Safe today (programmatic ids), but trivially injectable if a future caller derives the id from env/fixtures. Switch to psql -v quoted variables. (inline)

Minor / nitpicks:

  • test/e2e/analytics/lifecycle-db.e2e.ts:339-347cleanupDirs.length = 0 on failure preserves all queued dirs, not just the failing scenario's. (inline)
  • test/e2e/analytics/lifecycle-wire.e2e.ts:81-86runBrv populates a reason field that the chai assertion form drops; switch to expect(r.ok, r.reason).to.equal(true) for usable failures. (inline)

What looks good

  • brv-server.ts restores the M4.3 final-flush hook in a way that survives partial-startup crashes (hook short-circuits when the closure target is undefined). The PR Feat/eng 2964 #722 review's defensive throw on missing analyticsClient is preserved.
  • DB e2e is well-architected: real telemetry + postgres, three skip gates (telemetry reachable, postgres reachable, raw_events table exists) with actionable repair instructions in the skip messages. Per-row assertRowShape + per-test deep checks keep duplication low.
  • Isolation + wire-shape scenarios in lifecycle-db.e2e.ts:577-606 and :531-575 add valuable coverage (task-id partitioning, client_timestamp ≤ received_at, identity self-consistency) that goes beyond dev-beta.e2e.ts.
  • package.json adds test:e2e:lifecycle + test:e2e:db scripts with sensible timeouts and the same npm run build && mocha … chain as the existing analytics e2e.

Verification suggestion

Before merging, please run locally with the two task-type edits applied:

npm run build && npx mocha --forbid-only "test/integration/infra/process/analytics-hook-lifecycle-wiring.test.ts"
# expected: 7/7 passing
npm run test:e2e:lifecycle -- --grep "P0 — curate"
# expected: P0 curate-tool-mode case asserts task_created + task_failed

· branch feat/ENG-2964

Comment thread test/integration/infra/process/analytics-hook-lifecycle-wiring.test.ts Outdated
Comment thread test/e2e/analytics/lifecycle-wire.e2e.ts Outdated
Comment thread test/e2e/analytics/lifecycle-db.e2e.ts
Comment thread test/e2e/analytics/lifecycle-db.e2e.ts
Comment thread test/e2e/analytics/lifecycle-wire.e2e.ts
Comment thread test/e2e/analytics/lifecycle-wire.e2e.ts
Comment thread src/server/infra/daemon/brv-server.ts
… tests

After the proj/analytics-system-tool-mode merge, TaskTypeSchema only
accepts the canonical 'curate-tool-mode' over the wire — ENG-2925's
rename fully landed and 'curate-html-direct' is no longer enumerated
on `TaskCreateRequest` / `TaskExecute`. Tests that dispatch through
TaskRouter (the integration lifecycle-wiring suite + the e2e wire
capture suite) were rejected at the Zod boundary, never reached
AnalyticsHook, and asserted on undefined emits.

The merge resolution I picked for the integration test was wrong — I
kept the alias in HEAD on the assumption it exercised the alias
translation, but the transport boundary now strips it long before
AnalyticsHook gets a chance to translate.

The AnalyticsHook unit tests (analytics-hook-m14.test.ts, et al)
still exercise the alias path by constructing TaskInfo directly and
invoking `hook.onTaskCreate(task)` — those bypass TaskRouter, so
'curate-html-direct' remains a valid test input there and the
`toAnalyticsTaskType` aliasing is still covered.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

Claude finished @cuongdo-byterover's task in 2m 59s —— View job


Claude PR Review — feat/ENG-2964 (re-review after 3f42369)

Overall score: 4 / 5 — Up from 2/5 after the prior round. Both blockers fixed in 3f42369 ("use 'curate-tool-mode' through transport in lifecycle tests"). One substantive comment-vs-code inconsistency in brv-server.ts remains; nitpicks from the prior round are unaddressed but non-blocking.

  • Pull latest changes and diff against base
  • Verify previous-review blockers
  • Re-check brv-server.ts scheduler wiring
  • Audit new e2e files for remaining issues
  • Post inline + summary review

Findings

Blockers (status):

# Status Notes
1 ✅ Fixed analytics-hook-lifecycle-wiring.test.ts:225 now uses type: 'curate-tool-mode' (commit 3f42369).
2 ✅ Fixed lifecycle-wire.e2e.ts:296 now uses 'curate-tool-mode'; the describe label and inline comment were also updated to drop the misleading "alias-translated" framing.

Substantive (still open):

  • src/server/infra/daemon/brv-server.ts:704-713repeat from previous round, not addressed in 3f42369. The comment still claims start() is called "AFTER the first track lands so the initial 30s window aligns with real traffic", but AnalyticsFlushScheduler.start() arms setTimeout(...nextIntervalMs()) immediately (analytics-flush-scheduler.ts:205-209). There is no notifyPushed-gated deferral. The empty-pending guard (tryFlush at :292) means an empty first tick is a cheap no-op — so this isn't a correctness bug — but the comment is stale and misleading. Suggested rewording posted as inline comment. (inline)

Minor (unchanged, low priority — not blocking merge):

  • test/e2e/analytics/lifecycle-db.e2e.ts:164taskIdLike still interpolated into raw SQL. Safe today (programmatic ids), but the prior review's psql -v :'task_id' suggestion still applies. Worth a follow-up, not a blocker.
  • test/e2e/analytics/lifecycle-wire.e2e.ts:240-241,292,326,... and lifecycle-db.e2e.ts:365,534,580-581,... — task ids are still e2e-...-${Date.now()}. TaskCreateRequestSchema.taskId is z.string().uuid() (schemas.ts:788). Not validated at runtime today, but a latent break if anyone wires that validator at the transport boundary. crypto.randomUUID() is a one-liner fix.
  • test/e2e/analytics/lifecycle-wire.e2e.ts:240-241,335-336runBrv populates reason on failure, but .to.deep.include({ok: true}) drops it. Switch to expect(r.ok, r.reason).to.equal(true) for usable diagnostics.

What's good

  • The merge-resolution regression is correctly identified and fixed; commit message explains exactly what happened and why alias coverage still lives at the unit layer (analytics-hook-m14.test.ts:72,88).
  • DB e2e (lifecycle-db.e2e.ts) is well-architected: three skip gates with actionable repair instructions, assertRowShape factored out, three deep-check scenarios (curate roundtrip, wire-shape, two-task isolation) that go meaningfully beyond dev-beta.e2e.ts.
  • Wire e2e (lifecycle-wire.e2e.ts) uses an in-process capture stub at a random localhost port — clean, hermetic, no flake-prone network dependencies.
  • package.json adds test:e2e:lifecycle (180s) + test:e2e:db (300s) with sensible timeouts and the same npm run build && mocha … chain as the existing analytics e2e.
  • brv-server.ts final-flush wiring is sound — stop() first, then flushFinal({timeoutMs: 3000}). Hook gate at :698-700 preserves PR Feat/eng 2964 #722's defensive throw.

Verification suggestion before merge

npm run build && npx mocha --forbid-only "test/integration/infra/process/analytics-hook-lifecycle-wiring.test.ts"
# expected: all cases passing now that the transport accepts the type
npm run test:e2e:lifecycle -- --grep "P0 — curate"
# expected: task_created + task_failed{failure_kind=cancelled,task_type=curate-tool-mode} land via the capture stub

· branch feat/ENG-2964 ·

Comment thread src/server/infra/daemon/brv-server.ts
@bao-byterover bao-byterover merged commit 2f418cc into proj/analytics-system-tool-mode May 28, 2026
7 checks passed
@bao-byterover bao-byterover deleted the feat/ENG-2964 branch May 28, 2026 03:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants