
feat(dashboard): stream live audit and usage logs#334

Open
SantiagoDePolonia wants to merge 3 commits into main from feat/low-latency-logs

Conversation

@SantiagoDePolonia
Contributor

@SantiagoDePolonia SantiagoDePolonia commented May 15, 2026

Summary

  • add an in-process live log broker and SSE endpoint for dashboard audit/usage previews
  • stream sequential audit lifecycle updates with compact rows and lazy detail loading
  • mark live workflow chart current steps in blue and only mark audit/usage persisted after flush events
  • add dashboard runtime config and docs for live logs controls

Tests

  • go test ./...
  • node --test internal/admin/dashboard/static/js/modules/*.test.cjs
  • commit hook: make test-race, make lint, dashboard JS tests, perf guard

Summary by CodeRabbit

  • New Features

    • Dashboard: real-time streaming of audit and usage logs with SSE-based live updates, bounded replay/cursor support, and single-audit detail retrieval.
    • Dashboard: visual highlight for the current workflow step during live requests.
  • Configuration

    • New runtime/env options to enable live logs and tune buffer size, replay limit, and heartbeat interval.
  • Documentation

    • README and docs updated with live-logs settings and behavior.


@coderabbitai
Contributor

coderabbitai Bot commented May 15, 2026

📝 Walkthrough

Walkthrough

Adds end-to-end realtime dashboard live logs: new Admin config flags, an in-process Broker with bounded replay, audit/usage live-event emission, SSE endpoints (/admin/live/logs, /admin/audit/detail), App wiring to publish events, a browser module to consume/merge live events, tests, and documentation updates.
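
The broker described above can be sketched roughly as follows. `Broker`, `Event`, `Publish`, and `Subscribe` here are illustrative stand-ins inferred from the walkthrough, not the PR's actual types:

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a minimal live-log event: a monotonically increasing sequence
// number plus a type such as "audit.completed" or "usage.flushed".
type Event struct {
	Seq  uint64
	Type string
	Data string
}

// Broker fans events out to subscribers and keeps a bounded replay window
// so reconnecting clients can resume from a cursor.
type Broker struct {
	mu     sync.Mutex
	seq    uint64
	replay []Event // bounded window of recent events
	limit  int
	subs   map[chan Event]struct{}
}

func NewBroker(replayLimit int) *Broker {
	return &Broker{limit: replayLimit, subs: make(map[chan Event]struct{})}
}

// Publish assigns the next sequence number, appends to the replay window
// (evicting the oldest event when full), and fans out without blocking.
func (b *Broker) Publish(evType, data string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.seq++
	ev := Event{Seq: b.seq, Type: evType, Data: data}
	b.replay = append(b.replay, ev)
	if len(b.replay) > b.limit {
		b.replay = b.replay[1:]
	}
	for ch := range b.subs {
		select {
		case ch <- ev:
		default: // drop for slow subscribers instead of blocking publishers
		}
	}
}

// Subscribe returns buffered events newer than the cursor plus a live channel.
func (b *Broker) Subscribe(cursor uint64) ([]Event, chan Event) {
	b.mu.Lock()
	defer b.mu.Unlock()
	var backlog []Event
	for _, ev := range b.replay {
		if ev.Seq > cursor {
			backlog = append(backlog, ev)
		}
	}
	ch := make(chan Event, 16)
	b.subs[ch] = struct{}{}
	return backlog, ch
}

func main() {
	b := NewBroker(2)
	b.Publish("audit.started", "req-1")
	b.Publish("audit.completed", "req-1")
	b.Publish("audit.flushed", "req-1")
	backlog, _ := b.Subscribe(0) // replay window bounded to 2, oldest evicted
	fmt.Println(len(backlog))    // 2
	fmt.Println(backlog[0].Type) // audit.completed
}
```

Dropping events for slow subscribers (the `default` branch) is one common fan-out policy; the actual PR may handle backpressure differently.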

Changes

Dashboard Live Logs Streaming Feature

Layer / Files | Summary

  • Configuration and Live Broker Core (config/admin.go, config/config.go, config/config_test.go, internal/live/broker.go, internal/live/broker_test.go): New AdminConfig fields and defaults for live logs. Adds Broker with sequence/replay window, Subscription handle, publish/replay logic, and tests for replay/reset/close/preview shaping.
  • Audit and Usage Event Publishing (internal/auditlog/auditlog.go, internal/auditlog/constants.go, internal/auditlog/logger.go, internal/auditlog/middleware.go, internal/usage/usage.go, internal/usage/logger.go): Live-event constants and publisher/emitter interfaces. Audit and usage loggers now support setting a live publisher and emit lifecycle events (started/updated/completed/flushed/removed and usage equivalents). Middleware and enrichment helpers emit updates.
  • App-Level Broker Integration (internal/app/app.go, internal/app/app_test.go): Broker instantiated from Admin config in App.New, injected into loggers when supported, passed into admin init/handler, exposed via dashboard runtime config, and closed on shutdown, with tests validating subscriber closure ordering.
  • Admin API Endpoints and Routes (internal/admin/handler.go, internal/admin/handler_audit.go, internal/admin/handler_live.go, internal/admin/routes.go, internal/admin/routes_test.go, internal/admin/handler_test.go): Adds the SSE LiveLogs handler (GET /admin/live/logs) with cursor/type filtering, replay/reset, and heartbeat emission; AuditLogDetail (GET /admin/audit/detail) to fetch a single enriched audit entry; updated routes and tests; handler wiring for WithLiveBroker and the runtime flag.
  • Dashboard Live Logs Module and Visualization (internal/admin/dashboard/static/js/modules/live-logs.js, internal/admin/dashboard/static/js/modules/live-logs.test.cjs, internal/admin/dashboard/static/js/dashboard.js, internal/admin/dashboard/static/js/modules/audit-list.js, internal/admin/dashboard/static/js/modules/workflows.js, internal/admin/dashboard/static/js/modules/workflows.test.cjs, internal/admin/dashboard/static/css/dashboard.css, internal/admin/dashboard/templates/layout.html): The browser dashboardLiveLogsModule streams SSE frames, parses and merges audit/usage previews into lists with live/pending/flushed flags, fetches audit detail, schedules reconnect/backoff, and integrates with workflow chart live-step rendering and the new .workflow-node-current styles. Dashboard init stops and restarts live logs around credential changes.
  • Configuration Documentation (CLAUDE.md, README.md, docs/advanced/*, .env.template): Documents DASHBOARD_LIVE_LOGS_ENABLED and related tuning env vars (buffer size, replay limit, heartbeat) and describes replay/reset behavior and defaults.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • ENTERPILOT/GoModel#148: Overlaps with changes to internal/auditlog/middleware.go and enrichment logic, potentially related to live-event emission wiring.

Suggested labels

release:internal

Poem

🐰 I hop, I watch the live logs stream,

Blue current nodes where events do gleam,
Broker hums and sequences float,
Dashboards brighten with each note,
A rabbit cheers this realtime dream!

🚥 Pre-merge checks | ✅ 4 passed | ❌ 1 failed

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 42.86%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Description Check ✅: skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅: the title 'feat(dashboard): stream live audit and usage logs' is specific, concise, and accurately describes the primary change: adding live streaming of audit and usage logs to the dashboard.
  • Linked Issues Check ✅: skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check ✅: skipped because no linked issues were found for this pull request.



Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.



@greptile-apps

greptile-apps Bot commented May 15, 2026

Greptile Summary

This PR adds live dashboard log streaming for audit and usage events. The main changes are:

  • An in-process live broker with replay and SSE streaming.
  • New dashboard live-log client code for audit and usage previews.
  • Audit detail loading for expanded live rows.
  • Workflow chart styling for current live request steps.
  • Runtime config and docs for live dashboard log controls.

Confidence Score: 3/5

The following issues should be fixed before merging:

  • Normal dashboard streams do not receive idle heartbeats with the default type filter.
  • Live usage rows can report incorrect cached-token accounting.
  • Live audit workflow charts can show incomplete workflow state until detail is loaded.

Focus on internal/admin/handler_live.go, internal/live/broker.go, and the live dashboard merge/chart paths.

Important Files Changed

  • internal/admin/handler_live.go: Adds the SSE endpoint and type filtering for replay, live events, and heartbeats.
  • internal/live/broker.go: Adds live event fan-out plus compact audit and usage payload serialization.
  • internal/admin/dashboard/static/js/modules/live-logs.js: Adds dashboard stream consumption and live audit/usage row merging.
  • internal/admin/dashboard/static/js/modules/workflows.js: Updates workflow chart node states for live current-step and flush status.

Sequence Diagram

sequenceDiagram
    participant Request as API request
    participant Audit as audit/usage logger
    participant Broker as live broker
    participant SSE as /admin/live/logs
    participant UI as dashboard
    Request->>Audit: lifecycle and usage entries
    Audit->>Broker: compact live events
    UI->>SSE: "subscribe with types=audit,usage"
    SSE->>Broker: subscribe with cursor
    Broker-->>SSE: replay and live events
    SSE-->>UI: SSE frames
    UI->>UI: merge live rows and workflow chart state

Reviews (1): Last reviewed commit: "feat(dashboard): stream live audit and u..."

Comment on lines +64 to +72
case <-ctx.Done():
	return nil
case event, ok := <-sub.Events:
	if !ok {
		return nil
	}
	if !filter.matches(event.Type) {
		continue
	}

P1 Heartbeat gets filtered

The dashboard opens this endpoint with types=audit,usage, but heartbeat events carry neither an audit. nor a usage. prefix. Because this branch applies the type filter before writing each live event, idle dashboard streams never receive the configured heartbeat. During quiet periods, proxies or browsers can then close the SSE connection, and the dashboard falls into reconnect churn instead of keeping a single live stream open.
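
One minimal way to address this, assuming a filter shaped like the snippet above, is to exempt heartbeat frames before prefix matching is applied (the `heartbeat` type name here is an assumption):

```go
package main

import (
	"fmt"
	"strings"
)

const eventHeartbeat = "heartbeat" // assumed heartbeat event type

// typeFilter matches events whose type starts with an allowed prefix
// (e.g. "audit." or "usage."), but always lets heartbeats through so
// idle SSE streams stay alive through proxies and browsers.
type typeFilter struct {
	prefixes []string
}

func (f typeFilter) matches(eventType string) bool {
	if eventType == eventHeartbeat {
		return true // never filter keep-alive frames
	}
	if len(f.prefixes) == 0 {
		return true // no filter configured: pass everything
	}
	for _, p := range f.prefixes {
		if strings.HasPrefix(eventType, p) {
			return true
		}
	}
	return false
}

func main() {
	f := typeFilter{prefixes: []string{"audit.", "usage."}}
	fmt.Println(f.matches("audit.completed")) // true
	fmt.Println(f.matches("heartbeat"))       // true: exempt from filtering
	fmt.Println(f.matches("other.event"))     // false
}
```

Alternatively, the handler could emit heartbeats from its own ticker instead of routing them through the broker, which sidesteps the filter entirely.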

Comment thread internal/live/broker.go
Comment on lines +309 to +330
func usagePreviewFromEntry(entry *usage.UsageEntry) usage.UsageLogEntry {
	return usage.UsageLogEntry{
		ID:                     entry.ID,
		RequestID:              entry.RequestID,
		ProviderID:             entry.ProviderID,
		Timestamp:              entry.Timestamp.UTC(),
		Model:                  entry.Model,
		Provider:               entry.Provider,
		ProviderName:           entry.ProviderName,
		Endpoint:               entry.Endpoint,
		UserPath:               entry.UserPath,
		CacheType:              entry.CacheType,
		InputTokens:            entry.InputTokens,
		OutputTokens:           entry.OutputTokens,
		TotalTokens:            entry.TotalTokens,
		InputCost:              entry.InputCost,
		OutputCost:             entry.OutputCost,
		TotalCost:              entry.TotalCost,
		CostSource:             entry.CostSource,
		CostsCalculationCaveat: entry.CostsCalculationCaveat,
	}
}

P1 Raw usage is dropped

Live usage previews copy the top-level token and cost fields but omit RawData. The persisted usage summary uses raw_data to split cached prompt reads and cache writes, while the live dashboard then has to synthesize cached_input_tokens: 0 from this payload. A cached OpenAI or Anthropic request will show all prompt tokens as uncached, and cache savings will be wrong until a REST reload replaces the live row.
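
A sketch of the direction this suggests: carry the cached-token split from the persisted entry into the live preview instead of letting the dashboard default it to zero. The field names below are hypothetical stand-ins for the PR's actual structs:

```go
package main

import "fmt"

// Hypothetical shapes: field names mirror the review's description,
// not the PR's actual types.
type UsageEntry struct {
	InputTokens      int
	CachedReadTokens int // split out of raw_data in the persisted summary
	CacheWriteTokens int
}

type UsagePreview struct {
	InputTokens       int `json:"input_tokens"`
	CachedInputTokens int `json:"cached_input_tokens"`
	CacheWriteTokens  int `json:"cache_write_tokens"`
}

// previewFromEntry carries the cached-token split into the live preview
// so the dashboard does not have to synthesize cached_input_tokens: 0.
func previewFromEntry(e *UsageEntry) UsagePreview {
	return UsagePreview{
		InputTokens:       e.InputTokens,
		CachedInputTokens: e.CachedReadTokens,
		CacheWriteTokens:  e.CacheWriteTokens,
	}
}

func main() {
	p := previewFromEntry(&UsageEntry{InputTokens: 1200, CachedReadTokens: 1000})
	fmt.Println(p.CachedInputTokens) // 1000: cache savings visible in the live row
}
```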

Comment thread internal/live/broker.go
Comment on lines +259 to +307
	AuthKeyID    string `json:"auth_key_id,omitempty"`
	AuthMethod   string `json:"auth_method,omitempty"`
	ClientIP     string `json:"client_ip,omitempty"`
	Method       string `json:"method,omitempty"`
	Path         string `json:"path,omitempty"`
	UserPath     string `json:"user_path,omitempty"`
	Stream       bool   `json:"stream,omitempty"`
	ErrorType    string `json:"error_type,omitempty"`
	ErrorMessage string `json:"error_message,omitempty"`
	LiveState    string `json:"_live_state,omitempty"`
	LivePending  bool   `json:"_live_pending,omitempty"`
}

func auditPreviewFromEntry(eventType string, entry *auditlog.LogEntry) auditPreview {
	preview := auditPreview{
		ID:                entry.ID,
		RequestID:         entry.RequestID,
		Timestamp:         entry.Timestamp.UTC(),
		RequestedModel:    entry.RequestedModel,
		ResolvedModel:     entry.ResolvedModel,
		Provider:          entry.Provider,
		ProviderName:      entry.ProviderName,
		AliasUsed:         entry.AliasUsed,
		WorkflowVersionID: entry.WorkflowVersionID,
		CacheType:         entry.CacheType,
		AuthKeyID:         entry.AuthKeyID,
		AuthMethod:        entry.AuthMethod,
		ClientIP:          entry.ClientIP,
		Method:            entry.Method,
		Path:              entry.Path,
		UserPath:          entry.UserPath,
		Stream:            entry.Stream,
		ErrorType:         entry.ErrorType,
		LiveState:         eventType,
		LivePending:       eventType != EventAuditFlushed,
	}
	if entry.DurationNs > 0 {
		duration := entry.DurationNs
		preview.DurationNs = &duration
	}
	if entry.StatusCode > 0 {
		status := entry.StatusCode
		preview.StatusCode = &status
	}
	if entry.Data != nil {
		preview.ErrorMessage = entry.Data.ErrorMessage
	}
	return preview
}

P1 Workflow metadata is missing

The live audit preview omits the compact workflow data stored under entry.Data, including workflow_features and failover. The workflow chart reads those fields from entry.data, and live rows only fetch full detail when expanded. A collapsed live row for a workflow request can therefore hide or misstate lanes such as usage, cache, budget, and fallback until the row is expanded or the page reloads.

@codecov-commenter

codecov-commenter commented May 15, 2026

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 49.35065% with 195 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/admin/handler_live.go 0.00% 89 Missing ⚠️
internal/live/broker.go 82.87% 18 Missing and 13 partials ⚠️
internal/admin/handler_audit.go 0.00% 21 Missing ⚠️
internal/app/app.go 15.78% 16 Missing ⚠️
internal/auditlog/middleware.go 36.36% 12 Missing and 2 partials ⚠️
internal/auditlog/logger.go 45.00% 10 Missing and 1 partial ⚠️
internal/usage/logger.go 47.36% 9 Missing and 1 partial ⚠️
internal/admin/handler.go 25.00% 3 Missing ⚠️


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/auditlog/logger.go (1)

205-215: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Emit a terminal live event when batch persistence fails.

At Line 205, WriteBatch failure drops the whole batch, but only the success path emits LiveEventAuditFlushed. That leaves previously published audit.completed entries stuck as pending in live clients forever.

Suggested fix
 if err := l.store.WriteBatch(ctx, batch); err != nil {
     slog.Error("failed to write audit log batch",
         "error", err,
         "count", len(batch),
     )
+    for _, entry := range batch {
+        l.PublishLiveEvent(LiveEventAuditRemoved, entry)
+    }
     return
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/auditlog/logger.go` around lines 205 - 215, The WriteBatch error
path currently logs the failure but does not notify live clients, leaving prior
audit.completed events pending; after the slog.Error call in the WriteBatch
error branch (where l.store.WriteBatch is invoked), iterate the batch and call
l.PublishLiveEvent for each entry with a terminal failure event (e.g.,
LiveEventAuditFailed) so live clients receive a terminal state; use the same
entry objects from batch and the existing l.PublishLiveEvent method to mark them
failed (mirroring the success loop that uses LiveEventAuditFlushed) so pending
entries are cleared.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CLAUDE.md`:
- Line 121: The documentation lists only defaults for DASHBOARD_LIVE_LOGS_* but
lacks actionable tuning guidance; update the CLAUDE.md entry for
DASHBOARD_LIVE_LOGS_ENABLED / DASHBOARD_LIVE_LOGS_BUFFER_SIZE /
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT / DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS to
include one-line guidance for each: when to increase (e.g., high traffic, large
spikes, long client reconnect windows), when to decrease (e.g., memory/latency
constraints), and minimal example thresholds (e.g., buffer size increase for
>1000 msgs/sec, replay limit for long reconnects, heartbeat lower for frequent
client liveness) so operators know how to tune rather than only seeing defaults.

In `@config/config_test.go`:
- Around line 200-217: The test assertions for cfg.Admin.LiveLogs* require a
clean environment; extend the existing clearAllConfigEnvVars helper to also
unset the four new env vars (DASHBOARD_LIVE_LOGS_ENABLED,
DASHBOARD_LIVE_LOGS_BUFFER_SIZE, DASHBOARD_LIVE_LOGS_REPLAY_LIMIT,
DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS) so the Admin defaults used by the config
tests (cfg.Admin.EndpointsEnabled, UIEnabled, LiveLogsEnabled,
LiveLogsBufferSize, LiveLogsReplayLimit, LiveLogsHeartbeatSeconds) are
deterministic; update clearAllConfigEnvVars to call os.Unsetenv for those four
keys (or remove them before the test) so the assertions in config_test.go always
run against default values.

In `@internal/admin/handler_audit.go`:
- Around line 176-206: Add Swagger/OpenAPI annotations for the AuditLogDetail
handler to match the pattern used by AuditLog and AuditConversation: annotate
the AuditLogDetail function (AuditLogDetail) with operation summary,
description, tags (e.g., "admin"), parameters (query param "log_id" required),
success response schema (returning an audit log entry / 200), and error
responses (400/404/500). Place these comments immediately above the
AuditLogDetail function declaration so the API generator picks them up and
ensure the parameter name and response type match the types used in
auditlog.LogEntry and the existing audit endpoints.

In `@README.md`:
- Line 277: Update the README config table to include the three missing
dashboard live-log controls: DASHBOARD_LIVE_LOGS_BUFFER_SIZE,
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS; for
each add the default value, a one-line description (what it controls and a short
hint when to change it), and keep the existing DASHBOARD_LIVE_LOGS_ENABLED row
style so operators can tune buffering, replay bounds and heartbeat behavior from
the main docs entry point.

---

Outside diff comments:
In `@internal/auditlog/logger.go`:
- Around line 205-215: The WriteBatch error path currently logs the failure but
does not notify live clients, leaving prior audit.completed events pending;
after the slog.Error call in the WriteBatch error branch (where
l.store.WriteBatch is invoked), iterate the batch and call l.PublishLiveEvent
for each entry with a terminal failure event (e.g., LiveEventAuditFailed) so
live clients receive a terminal state; use the same entry objects from batch and
the existing l.PublishLiveEvent method to mark them failed (mirroring the
success loop that uses LiveEventAuditFlushed) so pending entries are cleared.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7a1a2521-0b87-4ac6-9c72-194aa67d680e

📥 Commits

Reviewing files that changed from the base of the PR and between ddd80ae and 6bf7428.

📒 Files selected for processing (29)
  • CLAUDE.md
  • README.md
  • config/admin.go
  • config/config.go
  • config/config_test.go
  • internal/admin/dashboard/static/css/dashboard.css
  • internal/admin/dashboard/static/js/dashboard.js
  • internal/admin/dashboard/static/js/modules/audit-list.js
  • internal/admin/dashboard/static/js/modules/live-logs.js
  • internal/admin/dashboard/static/js/modules/live-logs.test.cjs
  • internal/admin/dashboard/static/js/modules/workflows.js
  • internal/admin/dashboard/static/js/modules/workflows.test.cjs
  • internal/admin/dashboard/templates/layout.html
  • internal/admin/handler.go
  • internal/admin/handler_audit.go
  • internal/admin/handler_live.go
  • internal/admin/handler_test.go
  • internal/admin/routes.go
  • internal/admin/routes_test.go
  • internal/app/app.go
  • internal/app/app_test.go
  • internal/auditlog/auditlog.go
  • internal/auditlog/constants.go
  • internal/auditlog/logger.go
  • internal/auditlog/middleware.go
  • internal/live/broker.go
  • internal/live/broker_test.go
  • internal/usage/logger.go
  • internal/usage/usage.go

Comment thread CLAUDE.md
- **Models:** `MODELS_ENABLED_BY_DEFAULT` (true), `MODEL_OVERRIDES_ENABLED` (true), `KEEP_ONLY_ALIASES_AT_MODELS_ENDPOINT` (false), `CONFIGURED_PROVIDER_MODELS_MODE` (`fallback` or `allowlist`, default `fallback`; `allowlist` skips upstream `/models` for providers with configured lists); persisted overrides restrict/allow selectors with `user_paths`. When alias-only models listing is enabled, `GET /v1/models` returns only model aliases, not full concrete model specs, to operators.
- **Audit logging:** `LOGGING_ENABLED` (false), `LOGGING_LOG_BODIES` (false), `LOGGING_LOG_HEADERS` (false), `LOGGING_RETENTION_DAYS` (30)
- **Usage tracking:** `USAGE_ENABLED` (true), `ENFORCE_RETURNING_USAGE_DATA` (true), `USAGE_RETENTION_DAYS` (90)
- **Dashboard live logs:** `DASHBOARD_LIVE_LOGS_ENABLED` (true), `DASHBOARD_LIVE_LOGS_BUFFER_SIZE` (10000), `DASHBOARD_LIVE_LOGS_REPLAY_LIMIT` (1000), `DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS` (15)

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add tuning guidance for live-log knobs (not just defaults).

Line 121 lists defaults, but it doesn’t tell operators when to adjust BUFFER_SIZE, REPLAY_LIMIT, or HEARTBEAT_SECONDS. Add a brief “increase/decrease when…” note so this section is actionable.

As per coding guidelines: "**/*.md: Documentation should be concise, practical, and user-focused. Show defaults, explain when to change them, and include minimal examples when useful."


Comment thread config/config_test.go
Comment on lines +200 to +217
if !cfg.Admin.EndpointsEnabled {
	t.Error("expected Admin.EndpointsEnabled=true")
}
if !cfg.Admin.UIEnabled {
	t.Error("expected Admin.UIEnabled=true")
}
if !cfg.Admin.LiveLogsEnabled {
	t.Error("expected Admin.LiveLogsEnabled=true")
}
if cfg.Admin.LiveLogsBufferSize != 10000 {
	t.Errorf("expected Admin.LiveLogsBufferSize=10000, got %d", cfg.Admin.LiveLogsBufferSize)
}
if cfg.Admin.LiveLogsReplayLimit != 1000 {
	t.Errorf("expected Admin.LiveLogsReplayLimit=1000, got %d", cfg.Admin.LiveLogsReplayLimit)
}
if cfg.Admin.LiveLogsHeartbeatSeconds != 15 {
	t.Errorf("expected Admin.LiveLogsHeartbeatSeconds=15, got %d", cfg.Admin.LiveLogsHeartbeatSeconds)
}

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add env cleanup for new live-log variables to keep these default assertions deterministic.

These new assertions rely on clean process env, but clearAllConfigEnvVars doesn’t clear DASHBOARD_LIVE_LOGS_ENABLED, DASHBOARD_LIVE_LOGS_BUFFER_SIZE, DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS. If any test sets them, defaults checks here can become order-dependent.

Suggested fix
func clearAllConfigEnvVars(t *testing.T) {
    t.Helper()
    for _, key := range []string{
@@
        "HTTP_TIMEOUT", "HTTP_RESPONSE_HEADER_TIMEOUT",
        "WORKFLOW_REFRESH_INTERVAL",
+       "DASHBOARD_LIVE_LOGS_ENABLED",
+       "DASHBOARD_LIVE_LOGS_BUFFER_SIZE",
+       "DASHBOARD_LIVE_LOGS_REPLAY_LIMIT",
+       "DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS",
    } {

Comment on lines +176 to +206
// AuditLogDetail handles GET /admin/audit/detail.
func (h *Handler) AuditLogDetail(c *echo.Context) error {
	logID := strings.TrimSpace(c.QueryParam("log_id"))
	if logID == "" {
		return handleError(c, core.NewInvalidRequestError("log_id is required", nil))
	}
	if h.auditReader == nil {
		return handleError(c, featureUnavailableError("audit log detail is unavailable"))
	}

	entry, err := h.auditReader.GetLogByID(c.Request().Context(), logID)
	if err != nil {
		return handleError(c, err)
	}
	if entry == nil {
		return handleError(c, core.NewNotFoundError("audit log not found: "+logID))
	}

	response, err := h.auditLogResponse(c.Request().Context(), &auditlog.LogListResult{
		Entries: []auditlog.LogEntry{*entry},
		Total:   1,
		Limit:   1,
	})
	if err != nil {
		return handleError(c, err)
	}
	if len(response.Entries) == 0 {
		return handleError(c, core.NewNotFoundError("audit log not found: "+logID))
	}
	return c.JSON(http.StatusOK, response.Entries[0])
}

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider adding API documentation for discoverability.

The new AuditLogDetail endpoint lacks Swagger/OpenAPI annotations that are present on related endpoints like AuditLog (line 21) and AuditConversation (line 208). Adding standard annotations would improve API documentation consistency and help dashboard developers discover this endpoint.

📝 Suggested documentation pattern
+// AuditLogDetail handles GET /admin/audit/detail.
+//
+// @Summary      Get single audit log entry by ID
+// @Tags         admin
+// @Produce      json
+// @Security     BearerAuth
+// @Param        log_id  query  string  true  "Audit log entry ID"
+// @Success      200  {object}  auditLogEntryResponse
+// @Failure      400  {object}  core.GatewayError
+// @Failure      401  {object}  core.GatewayError
+// @Failure      404  {object}  core.GatewayError
+// @Router       /admin/audit/detail [get]
 func (h *Handler) AuditLogDetail(c *echo.Context) error {

Comment thread README.md
| `STORAGE_TYPE` | `sqlite` | Storage backend (`sqlite`, `postgresql`, `mongodb`) |
| `METRICS_ENABLED` | `false` | Enable Prometheus metrics (experimental) |
| `LOGGING_ENABLED` | `false` | Enable audit logging |
| `DASHBOARD_LIVE_LOGS_ENABLED` | `true` | Stream realtime dashboard log previews with bounded replay |
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

README config table is missing the other live-log controls.

Line 277 adds only DASHBOARD_LIVE_LOGS_ENABLED, but operators also need DASHBOARD_LIVE_LOGS_BUFFER_SIZE, DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS in this table to tune behavior from the main docs entry point.

As per coding guidelines: "**/*.md: Documentation should be concise, practical, and user-focused. Show defaults, explain when to change them, and include minimal examples when useful."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 277, Update the README config table to include the three
missing dashboard live-log controls: DASHBOARD_LIVE_LOGS_BUFFER_SIZE,
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS; for
each add the default value, a one-line description (what it controls and a short
hint when to change it), and keep the existing DASHBOARD_LIVE_LOGS_ENABLED row
style so operators can tune buffering, replay bounds and heartbeat behavior from
the main docs entry point.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/admin/handler_live.go`:
- Around line 17-87: Add unit tests for LiveLogs exercising cursor parsing, type
filtering, replay/reset behavior, heartbeat pings, and live event forwarding by
creating tests that call Handler.LiveLogs with controlled inputs; specifically,
add *_test.go cases that (1) pass an invalid cursor string to ensure liveCursor
error handling, (2) set query "types" to verify liveTypeFilter excludes/accepts
events, (3) supply a mocked/standalone liveBroker Subscribe result where
sub.Reset is true and sub.Replay contains events to assert writeLiveEvent sends
a reset and replayed events, and (4) provide a Subscribe with an Events channel
and Heartbeat interval to assert periodic heartbeat events and forwarding of
live events. Implement a test liveBroker stub (implementing Enabled, Subscribe,
LatestSeq, Heartbeat) and use an echo request/response recorder to capture SSE
output for assertions, ensuring to close the Events channel to end the handler
loop.
- Around line 101-125: The current liveTypeFilter(raw string) +
liveLogTypeFilter.matches(eventType string) treats an input like "types=foo" (no
valid tokens) the same as no filter and returns true for all events; fix by
distinguishing "no filter provided" from "filter provided but no valid tokens":
change liveLogTypeFilter to carry a boolean (e.g., provided) or equivalent
sentinel set by liveTypeFilter when raw is non-empty, set provided=true only if
raw was given (even if no tokens were valid), and then update matches to return
true when provided==false (no filter specified), but return false when
provided==true and the internal map is empty (filter provided but contained no
valid tokens); update liveTypeFilter(raw string) to trim raw and set the
provided flag accordingly and populate the map only with valid tokens.

In `@internal/live/broker.go`:
- Around line 245-248: When the replay buffer exceeds b.bufferSize, avoid
allocating a new slice; instead shift entries left in-place with copy(b.events,
b.events[drop:]), zero out the now-unused tail slots to avoid retaining
references (e.g. for i := b.bufferSize; i < len(b.events); i++ { b.events[i] =
Event{} }), and then reslice with b.events = b.events[:b.bufferSize]; replace
the current append([]Event(nil), ...) reallocation with this in-place trim using
the identifiers b.events, b.bufferSize, and drop.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 98702627-7b01-4603-81ae-effb459b69e8

📥 Commits

Reviewing files that changed from the base of the PR and between 7b499b9 and 54e97b8.

📒 Files selected for processing (5)
  • internal/admin/handler_live.go
  • internal/app/app.go
  • internal/app/app_test.go
  • internal/live/broker.go
  • internal/live/broker_test.go

Comment on lines +17 to +87
// LiveLogs handles GET /admin/live/logs.
func (h *Handler) LiveLogs(c *echo.Context) error {
	if h.liveBroker == nil || !h.liveBroker.Enabled() {
		return handleError(c, featureUnavailableError("live logs are unavailable"))
	}

	cursor, err := liveCursor(c.QueryParam("cursor"))
	if err != nil {
		return handleError(c, err)
	}
	filter := liveTypeFilter(c.QueryParam("types"))
	sub := h.liveBroker.Subscribe(cursor)
	if sub == nil {
		return handleError(c, featureUnavailableError("live logs are unavailable"))
	}
	defer sub.Close()

	res := c.Response()
	// SSE responses are intentionally long-lived; keep disconnect detection via writes.
	_ = http.NewResponseController(res).SetWriteDeadline(time.Time{})
	res.Header().Set(echo.HeaderContentType, "text/event-stream")
	res.Header().Set(echo.HeaderCacheControl, "no-cache, no-transform")
	res.Header().Set(echo.HeaderConnection, "keep-alive")
	res.Header().Set("X-Accel-Buffering", "no")
	res.WriteHeader(http.StatusOK)

	if sub.Reset {
		if err := writeLiveEvent(res, live.Event{
			Seq:  h.liveBroker.LatestSeq(),
			Type: live.EventReset,
		}); err != nil {
			return err
		}
	}
	for _, event := range sub.Replay {
		if !filter.matches(event.Type) {
			continue
		}
		if err := writeLiveEvent(res, event); err != nil {
			return err
		}
	}

	ticker := time.NewTicker(h.liveBroker.Heartbeat())
	defer ticker.Stop()

	ctx := c.Request().Context()
	for {
		select {
		case <-ctx.Done():
			return nil
		case event, ok := <-sub.Events:
			if !ok {
				return nil
			}
			if !filter.matches(event.Type) {
				continue
			}
			if err := writeLiveEvent(res, event); err != nil {
				return err
			}
		case <-ticker.C:
			if err := writeLiveEvent(res, live.Event{
				Seq:  h.liveBroker.LatestSeq(),
				Type: live.EventHeartbeat,
			}); err != nil {
				return err
			}
		}
	}
}
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Add targeted tests for cursor/type parsing and SSE replay/reset flow.

This endpoint introduces multiple control-path branches (invalid cursor, type filtering, reset replay, heartbeat/live forwarding), but file-level coverage is currently missing in this PR context.

As per coding guidelines: **/*_test.go: Add or update tests for behavior changes.


Comment on lines +101 to +125
type liveLogTypeFilter map[string]struct{}

func liveTypeFilter(raw string) liveLogTypeFilter {
	filter := liveLogTypeFilter{}
	for _, item := range strings.Split(raw, ",") {
		item = strings.ToLower(strings.TrimSpace(item))
		switch item {
		case "audit", "usage":
			filter[item] = struct{}{}
		}
	}
	return filter
}

func (f liveLogTypeFilter) matches(eventType string) bool {
	if len(f) == 0 {
		return true
	}
	prefix, _, ok := strings.Cut(eventType, ".")
	if !ok {
		prefix = eventType
	}
	_, matched := f[prefix]
	return matched
}
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

A types filter containing only unknown values currently matches all events.

With types=foo, the parsed filter is empty, so matches returns true for everything. That silently broadens output instead of narrowing it.

💡 Proposed fix (distinguish “no filter provided” from “no valid filter tokens”)
-type liveLogTypeFilter map[string]struct{}
+type liveLogTypeFilter struct {
+	allowAll bool
+	allowed  map[string]struct{}
+}

 func liveTypeFilter(raw string) liveLogTypeFilter {
-	filter := liveLogTypeFilter{}
+	if strings.TrimSpace(raw) == "" {
+		return liveLogTypeFilter{allowAll: true}
+	}
+	filter := liveLogTypeFilter{allowed: map[string]struct{}{}}
 	for _, item := range strings.Split(raw, ",") {
 		item = strings.ToLower(strings.TrimSpace(item))
 		switch item {
 		case "audit", "usage":
-			filter[item] = struct{}{}
+			filter.allowed[item] = struct{}{}
 		}
 	}
 	return filter
 }

 func (f liveLogTypeFilter) matches(eventType string) bool {
-	if len(f) == 0 {
+	if f.allowAll {
 		return true
 	}
 	prefix, _, ok := strings.Cut(eventType, ".")
 	if !ok {
 		prefix = eventType
 	}
-	_, matched := f[prefix]
+	_, matched := f.allowed[strings.ToLower(prefix)]
 	return matched
 }
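A standalone check of the proposed semantics — this sketch copies the allowAll/allowed shape from the diff above rather than the repository code, so treat it as illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// liveLogTypeFilter mirrors the proposed fix: allowAll distinguishes
// "no filter provided" from "filter provided but no valid tokens".
type liveLogTypeFilter struct {
	allowAll bool
	allowed  map[string]struct{}
}

func liveTypeFilter(raw string) liveLogTypeFilter {
	if strings.TrimSpace(raw) == "" {
		return liveLogTypeFilter{allowAll: true}
	}
	filter := liveLogTypeFilter{allowed: map[string]struct{}{}}
	for _, item := range strings.Split(raw, ",") {
		item = strings.ToLower(strings.TrimSpace(item))
		switch item {
		case "audit", "usage":
			filter.allowed[item] = struct{}{}
		}
	}
	return filter
}

func (f liveLogTypeFilter) matches(eventType string) bool {
	if f.allowAll {
		return true
	}
	prefix, _, ok := strings.Cut(eventType, ".")
	if !ok {
		prefix = eventType
	}
	_, matched := f.allowed[strings.ToLower(prefix)]
	return matched
}

func main() {
	fmt.Println(liveTypeFilter("").matches("audit.created"))    // true: no filter given
	fmt.Println(liveTypeFilter("audit").matches("usage.flush")) // false: filtered out
	fmt.Println(liveTypeFilter("foo").matches("audit.created")) // false: invalid-only filter
}
```

The third case is the behavior change: an invalid-only filter now matches nothing instead of everything.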

Comment thread internal/live/broker.go
Comment on lines +245 to +248
	if len(b.events) > b.bufferSize {
		drop := len(b.events) - b.bufferSize
		b.events = append([]Event(nil), b.events[drop:]...)
	}
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid per-event slice reallocation when replay buffer is full.

This path reallocates and copies the full replay slice on every publish after saturation, which will add GC pressure on a hot logging path.

💡 Proposed fix (in-place trim without reallocation)
 	b.events = append(b.events, event)
 	if len(b.events) > b.bufferSize {
 		drop := len(b.events) - b.bufferSize
-		b.events = append([]Event(nil), b.events[drop:]...)
+		copy(b.events, b.events[drop:])
+		b.events = b.events[:len(b.events)-drop]
 	}
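To make the trade-off concrete, here is a runnable sketch of the in-place trim. Event is a stand-in struct, and the zeroing loop only matters once Event carries pointers or slices:

```go
package main

import "fmt"

// Event stands in for live.Event; only the Seq field matters here.
type Event struct{ Seq uint64 }

// trimInPlace drops the oldest entries without reallocating: shift the
// survivors left, zero the vacated tail so the slice does not pin old
// events for the GC, then reslice to the bounded size.
func trimInPlace(events []Event, bufferSize int) []Event {
	if len(events) <= bufferSize {
		return events
	}
	drop := len(events) - bufferSize
	copy(events, events[drop:])
	for i := bufferSize; i < len(events); i++ {
		events[i] = Event{} // release references held by the tail
	}
	return events[:bufferSize]
}

func main() {
	buf := make([]Event, 0, 8)
	for seq := uint64(1); seq <= 6; seq++ {
		buf = append(buf, Event{Seq: seq})
		buf = trimInPlace(buf, 4)
	}
	// Oldest entries were dropped in place; capacity is unchanged.
	fmt.Println(buf[0].Seq, buf[len(buf)-1].Seq, cap(buf)) // 3 6 8
}
```

Because the reslice starts at index 0, the backing array's capacity is reused across publishes, which is exactly what the per-event append([]Event(nil), ...) reallocation gives up.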
