Is your feature request related to a problem? Please describe.
Currently, the application measures query response latency on the backend using:
record_query_response_time(time.perf_counter() - started_at)
However, this metric is only recorded internally and never exposed to users. As a result:
Users have no visibility into how long a response took to generate.
Performance improvements are difficult for users to notice.
Power users cannot easily compare latency across different queries or models.
Existing response time measurements are effectively discarded after being logged.
The response duration is already available during request processing, making this a missed opportunity to provide useful UX feedback with minimal implementation effort.
Describe the solution you'd like
Add a lightweight Per-Message Response Time Badge that displays the generation time for each assistant response.
Backend Changes
Include the response duration in the SSE done event emitted by ask_question_stream().
Example:
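elapsed_ms = round((time.perf_counter() - started_at) * 1000)
# Build the payload first: a multi-line expression inside a single-quoted f-string fails before Python 3.12.
payload = json.dumps({"type": "done", "response_time_ms": elapsed_ms})
yield f"data: {payload}\n\n"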
This avoids database schema changes while leveraging the existing timing measurements.
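For illustration, here is a minimal, self-contained sketch of how the backend change could fit together. The real `ask_question_stream()` is not shown in this issue, so the generator body, the `token` event shape, the `sse_event` helper, and the stubbed `record_query_response_time` below are all assumptions. The point is only that the duration is measured once and reused for both the internal metric and the `done` event.

```python
import json
import time
from typing import Iterator, List

recorded: List[float] = []  # stand-in for the real metrics sink (assumption)


def record_query_response_time(seconds: float) -> None:
    """Hypothetical stub; the real function lives in the application."""
    recorded.append(seconds)


def sse_event(payload: dict) -> str:
    """Serialize one SSE message: a data line followed by a blank line."""
    return f"data: {json.dumps(payload)}\n\n"


def ask_question_stream(tokens: List[str]) -> Iterator[str]:
    """Simplified stand-in for the real generator; tokens replace the model call."""
    started_at = time.perf_counter()
    for token in tokens:
        # The "token" event type and "content" field are illustrative only.
        yield sse_event({"type": "token", "content": token})
    # Measure once so the metric and the badge report the same duration.
    elapsed = time.perf_counter() - started_at
    record_query_response_time(elapsed)
    yield sse_event({"type": "done", "response_time_ms": round(elapsed * 1000)})


events = list(ask_question_stream(["Hello", " world"]))
done = json.loads(events[-1][len("data: "):])
```

Here `done` is the decoded final event, carrying `type` and an integer `response_time_ms`.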
Frontend Changes
Capture the value from the SSE done event in ChatPanel.
} else if (event.type === "done") {
  setMessages((prev) =>
    prev.map((m) =>
      m.id === assistantId
        ? {
            ...m,
            isStreaming: false,
            response_time_ms: event.response_time_ms,
          }
        : m
    )
  );
}
Pass the value to MessageBubble and render a small latency badge.
{message.response_time_ms != null && (
  <span>⚡ {(message.response_time_ms / 1000).toFixed(1)}s</span>
)}
Expected Result
Users will see a compact badge such as:
⚡ 1.4s
⚡ 2.8s
⚡ 0.9s
next to completed assistant responses.
Describe alternatives you've considered
- Global Session Statistics
Display average response times in a dashboard or settings page.
Pros
Useful for long-term analytics.
Cons
Does not provide message-level feedback.
Less visible during normal usage.
- Database Persistence
Store response times within chat history records.
Pros
Historical analysis becomes possible.
Cons
Requires schema updates and migrations.
Adds complexity for a feature that can be implemented entirely through SSE events.
- Developer-Only Metrics
Keep latency measurements available only through logs and monitoring dashboards.
Pros
No UI changes required.
Cons
End users receive no benefit from existing measurements.
Additional Context
Why this feature is valuable
Improves transparency by showing how long responses take to generate.
Helps users understand system performance and model behavior.
Provides immediate feedback during optimization efforts.
Uses data that is already being collected.
Requires only a small, focused implementation with minimal risk.
Technical Scope
Backend: SSE streaming event enhancement.
Frontend: Message state update and badge rendering.
No database migration required.
No breaking API changes: the done event only gains an additional field.
Estimated implementation size: fewer than 30 lines of code.
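The change stays non-breaking only if consumers treat the new field as optional. As a rough sketch in Python, a tolerant reader of the stream might look like the following. It can also serve as a quick check of the backend change without the UI. The function name `response_time_from_stream` and the `token` event shape are hypothetical; only the `done` event with `response_time_ms` comes from this proposal.

```python
import json
from typing import Iterable, Optional


def response_time_from_stream(lines: Iterable[str]) -> Optional[int]:
    """Return response_time_ms from the first done event, or None if absent."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        event = json.loads(line[len("data:"):].strip())
        if event.get("type") == "done":
            # .get() tolerates servers that do not send the field yet.
            return event.get("response_time_ms")
    return None


new_server = [
    'data: {"type": "token", "content": "Hi"}',
    "",
    'data: {"type": "done", "response_time_ms": 1400}',
    "",
]
old_server = ['data: {"type": "done"}', ""]

print(response_time_from_stream(new_server))  # prints 1400
print(response_time_from_stream(old_server))  # prints None
```

A frontend that follows the same rule simply hides the badge when the value is missing.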
Benefits for Contributors
This feature touches multiple parts of the stack:
Server-side streaming (SSE)
Frontend state management
React UI components
making it an excellent beginner-to-intermediate contribution while still delivering meaningful user value.
GSSoC '26