fix(analytics): define top-session active duration as clamped idle gaps (alt to #867) #869
Open
mariusvniekerk wants to merge 8 commits into
Top Sessions "By Duration" introduced an active-duration metric that summed only the gaps following a tool-use message. That captured tool-execution latency but discarded all model generation and thinking time, including the stretch before the first tool call, so it systematically undercounted how long a session was actually worked on.

Redefine active duration as the sum of consecutive inter-message gaps with each gap capped at five minutes, which is exactly the definition the velocity "active minutes" metric already uses. A gap below the cap counts in full (generation, tool execution, or a quick human turnaround); only stretches longer than the cap are bounded as idle. Timestamps alone cannot distinguish a long tool run from a human stepping away, so capping fails gracefully in both directions instead of guessing.

Hoist the cap to a single shared constant (ActiveGapCapSec / ActiveGapCapMs) so the two "active" definitions cannot drift, guarded by a test that the seconds and milliseconds forms agree. The active-duration SQL drops the has_tool_use filter and the trailing gap to ended_at to match the velocity computation across SQLite, PostgreSQL, and DuckDB, and the SQLite Go fallback runs the same clamp.

Alternative implementation to #867, built on its commits.
Alternative to #867. This branch is built on @rodboev's commits (kept in
the history, so the original work and authorship are preserved); it changes
only how "active duration" is computed.
What changed
#867 computes a session's active duration by summing only the gaps that
follow a tool-use message. That captures tool-execution latency but
discards all model generation and thinking time — including the time spent
before the first tool call — so a session that reasons for minutes and then
fires one quick tool is scored as nearly idle.
This branch instead defines active duration the same way the existing
velocity "active minutes" metric already does: the sum of consecutive
inter-message gaps, with each gap capped at 5 minutes. Every gap counts —
model generation, tool execution, and quick human turnarounds — and only
stretches longer than the cap are bounded as idle.
Why
Message timestamps alone can't tell a 20-minute subagent run from a
20-minute coffee break; both are one long gap. Capping each gap fails
gracefully in both directions (a long idle gap and a long active gap each
contribute at most 5 minutes) instead of guessing, and it keeps the metric
robust to messy data: resumed sessions, machine sleep, and API stalls
produce the largest gaps, and those would otherwise dominate the sum.
It also unifies the definition. The 5-minute cap is hoisted to a single
shared constant (db.ActiveGapCapSec/ActiveGapCapMs) used by both the
velocity metric and Top Sessions, so the two "active" numbers on the
dashboard can't drift; a test asserts the seconds and milliseconds forms
agree.
Tradeoff
A genuinely long active operation — a 15-minute test run, a 20-minute
subagent — is also capped at 5 minutes, so it is undercounted. That is the
deliberate cost of not guessing whether a long gap was work or idle; the
error is bounded and symmetric.
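
As a worked illustration of the undercount, write the metric with symbols introduced here for convenience (they do not appear in the code): message timestamps $t_1 \le \dots \le t_n$, gaps $g_i$, and cap $C$.

```math
D_{\text{active}} = \sum_{i=1}^{n-1} \min(g_i,\, C), \qquad g_i = t_{i+1} - t_i, \qquad C = 5\ \text{min}
```

For the 15-minute test run, $\min(15, 5) = 5$, so that gap is undercounted by $15 - 5 = 10$ minutes; for the 20-minute subagent, $\min(20, 5) = 5$, an undercount of $15$ minutes. In both cases the gap's contribution to the total is exactly $C$.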
Where to look
internal/db/analytics.go: shared constant, SQLite active-duration SQL,
and the velocity metric now sourcing the same cap.
internal/db/timing.go: SQLite Go fallback (timezone-aware ranking path)
running the same clamp.
internal/postgres/analytics.go, internal/duckdb/analytics_usage.go:
PostgreSQL and DuckDB twins.
All three SQL backends drop the has_tool_use filter and the trailing gap
to ended_at so the value matches the velocity computation.

generated by a clanker