feat(dashboard): stream live audit and usage logs #334
SantiagoDePolonia wants to merge 3 commits into
Conversation
📝 Walkthrough

Adds end-to-end realtime dashboard live logs: new Admin config flags, an in-process Broker with bounded replay, audit/usage live-event emission, SSE endpoints (`/admin/live/logs`, `/admin/audit/detail`), App wiring to publish events, a browser module to consume/merge live events, tests, and documentation updates.

Changes: Dashboard Live Logs Streaming Feature
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
⚠️ Warning: Review ran into problems

🔥 Problems: Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a…
```js
this.fetchAll();
const initialFetch = this.fetchAll();
if (initialFetch && typeof initialFetch.finally === "function") {
this.stopLiveLogs();
}
const refresh = this.fetchAll();
if (refresh && typeof refresh.finally === "function") {
```
Greptile Summary

This PR adds live dashboard log streaming for audit and usage events. The main changes are:

Confidence Score: 3/5. This should be fixed before merging. Focus on:

Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Request as API request
    participant Audit as audit/usage logger
    participant Broker as live broker
    participant SSE as /admin/live/logs
    participant UI as dashboard
    Request->>Audit: lifecycle and usage entries
    Audit->>Broker: compact live events
    UI->>SSE: "subscribe with types=audit,usage"
    SSE->>Broker: subscribe with cursor
    Broker-->>SSE: replay and live events
    SSE-->>UI: SSE frames
    UI->>UI: merge live rows and workflow chart state
```
Reviews (1): Last reviewed commit: "feat(dashboard): stream live audit and u..."
```go
case <-ctx.Done():
	return nil
case event, ok := <-sub.Events:
	if !ok {
		return nil
	}
	if !filter.matches(event.Type) {
		continue
	}
```
The dashboard opens this endpoint with `types=audit,usage`, but the heartbeat event type carries no `audit.` or `usage.` prefix. This branch applies the type filter before writing every live event, so idle dashboard streams never receive the configured heartbeat. During quiet periods, proxies or browsers can close the SSE connection, and the dashboard falls into reconnect churn instead of keeping a single live stream open.
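One possible shape for the fix, as a sketch: exempt control events from the type filter so heartbeats (and resets) always reach the client. The `live.EventHeartbeat` and `live.EventReset` names follow the constants used elsewhere in this PR; the surrounding package wiring is assumed.

```go
// matches reports whether eventType passes the subscriber's type filter.
// Control frames (heartbeat/reset) keep the SSE connection alive and
// resync cursors, so they bypass the audit/usage filter entirely.
func (f liveLogTypeFilter) matches(eventType string) bool {
	if eventType == live.EventHeartbeat || eventType == live.EventReset {
		return true
	}
	if len(f) == 0 {
		return true
	}
	prefix, _, ok := strings.Cut(eventType, ".")
	if !ok {
		prefix = eventType
	}
	_, matched := f[prefix]
	return matched
}
```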
```go
func usagePreviewFromEntry(entry *usage.UsageEntry) usage.UsageLogEntry {
	return usage.UsageLogEntry{
		ID:                     entry.ID,
		RequestID:              entry.RequestID,
		ProviderID:             entry.ProviderID,
		Timestamp:              entry.Timestamp.UTC(),
		Model:                  entry.Model,
		Provider:               entry.Provider,
		ProviderName:           entry.ProviderName,
		Endpoint:               entry.Endpoint,
		UserPath:               entry.UserPath,
		CacheType:              entry.CacheType,
		InputTokens:            entry.InputTokens,
		OutputTokens:           entry.OutputTokens,
		TotalTokens:            entry.TotalTokens,
		InputCost:              entry.InputCost,
		OutputCost:             entry.OutputCost,
		TotalCost:              entry.TotalCost,
		CostSource:             entry.CostSource,
		CostsCalculationCaveat: entry.CostsCalculationCaveat,
	}
}
```
Live usage previews copy the top-level token and cost fields but omit `RawData`. The persisted usage summary uses `raw_data` to split cached prompt reads from cache writes, but the live dashboard has to synthesize `cached_input_tokens: 0` from this payload. A cached OpenAI or Anthropic request will therefore show all prompt tokens as uncached, and cache savings will be wrong until a REST reload replaces the live row.
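A minimal sketch of one fix, assuming `usage.UsageLogEntry` carries (or gains) the same `RawData` field the persisted summary reads; the exact field name is an assumption:

```go
func usagePreviewFromEntry(entry *usage.UsageEntry) usage.UsageLogEntry {
	preview := usage.UsageLogEntry{
		ID:        entry.ID,
		RequestID: entry.RequestID,
		// ... remaining fields copied exactly as in the snippet above ...
	}
	// Forward the provider's raw usage payload so the live row can split
	// cached prompt reads from cache writes the same way the persisted
	// summary does, instead of synthesizing cached_input_tokens: 0.
	preview.RawData = entry.RawData // assumed field name
	return preview
}
```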
```go
	AuthKeyID    string `json:"auth_key_id,omitempty"`
	AuthMethod   string `json:"auth_method,omitempty"`
	ClientIP     string `json:"client_ip,omitempty"`
	Method       string `json:"method,omitempty"`
	Path         string `json:"path,omitempty"`
	UserPath     string `json:"user_path,omitempty"`
	Stream       bool   `json:"stream,omitempty"`
	ErrorType    string `json:"error_type,omitempty"`
	ErrorMessage string `json:"error_message,omitempty"`
	LiveState    string `json:"_live_state,omitempty"`
	LivePending  bool   `json:"_live_pending,omitempty"`
}

func auditPreviewFromEntry(eventType string, entry *auditlog.LogEntry) auditPreview {
	preview := auditPreview{
		ID:                entry.ID,
		RequestID:         entry.RequestID,
		Timestamp:         entry.Timestamp.UTC(),
		RequestedModel:    entry.RequestedModel,
		ResolvedModel:     entry.ResolvedModel,
		Provider:          entry.Provider,
		ProviderName:      entry.ProviderName,
		AliasUsed:         entry.AliasUsed,
		WorkflowVersionID: entry.WorkflowVersionID,
		CacheType:         entry.CacheType,
		AuthKeyID:         entry.AuthKeyID,
		AuthMethod:        entry.AuthMethod,
		ClientIP:          entry.ClientIP,
		Method:            entry.Method,
		Path:              entry.Path,
		UserPath:          entry.UserPath,
		Stream:            entry.Stream,
		ErrorType:         entry.ErrorType,
		LiveState:         eventType,
		LivePending:       eventType != EventAuditFlushed,
	}
	if entry.DurationNs > 0 {
		duration := entry.DurationNs
		preview.DurationNs = &duration
	}
	if entry.StatusCode > 0 {
		status := entry.StatusCode
		preview.StatusCode = &status
	}
	if entry.Data != nil {
		preview.ErrorMessage = entry.Data.ErrorMessage
	}
	return preview
}
```
The live audit preview omits the compact workflow data stored under `entry.Data`, including `workflow_features` and `failover`. The workflow chart reads those fields from `entry.data`, and live rows only fetch full detail when expanded. A collapsed live row for a workflow request can therefore hide or misstate lanes such as usage, cache, budget, and fallback until the row is expanded or the page reloads.
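If the preview type is extended, a sketch could look like the following; every field name here is an assumption and would need to match what the workflow chart actually reads from `entry.data`:

```go
// Assumed additions to auditPreview (names hypothetical):
//	WorkflowFeatures json.RawMessage `json:"workflow_features,omitempty"`
//	Failover         json.RawMessage `json:"failover,omitempty"`

// In auditPreviewFromEntry, forward the compact workflow payload so
// collapsed live rows render the same lanes as persisted rows:
if entry.Data != nil {
	preview.ErrorMessage = entry.Data.ErrorMessage
	preview.WorkflowFeatures = entry.Data.WorkflowFeatures // assumed field
	preview.Failover = entry.Data.Failover                 // assumed field
}
```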
Codecov Report: ❌ Patch coverage is …
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/auditlog/logger.go (1)
205-215: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Emit a terminal live event when batch persistence fails.

At line 205, a `WriteBatch` failure drops the whole batch, but only the success path emits `LiveEventAuditFlushed`. That leaves previously published `audit.completed` entries stuck as pending in live clients forever.

Suggested fix

```diff
 if err := l.store.WriteBatch(ctx, batch); err != nil {
 	slog.Error("failed to write audit log batch",
 		"error", err,
 		"count", len(batch),
 	)
+	for _, entry := range batch {
+		l.PublishLiveEvent(LiveEventAuditRemoved, entry)
+	}
 	return
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/auditlog/logger.go` around lines 205 - 215, The WriteBatch error path currently logs the failure but does not notify live clients, leaving prior audit.completed events pending; after the slog.Error call in the WriteBatch error branch (where l.store.WriteBatch is invoked), iterate the batch and call l.PublishLiveEvent for each entry with a terminal failure event (e.g., LiveEventAuditFailed) so live clients receive a terminal state; use the same entry objects from batch and the existing l.PublishLiveEvent method to mark them failed (mirroring the success loop that uses LiveEventAuditFlushed) so pending entries are cleared.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CLAUDE.md`:
- Line 121: The documentation lists only defaults for DASHBOARD_LIVE_LOGS_* but
lacks actionable tuning guidance; update the CLAUDE.md entry for
DASHBOARD_LIVE_LOGS_ENABLED / DASHBOARD_LIVE_LOGS_BUFFER_SIZE /
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT / DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS to
include one-line guidance for each: when to increase (e.g., high traffic, large
spikes, long client reconnect windows), when to decrease (e.g., memory/latency
constraints), and minimal example thresholds (e.g., buffer size increase for
>1000 msgs/sec, replay limit for long reconnects, heartbeat lower for frequent
client liveness) so operators know how to tune rather than only seeing defaults.
In `@config/config_test.go`:
- Around line 200-217: The test assertions for cfg.Admin.LiveLogs* require a
clean environment; extend the existing clearAllConfigEnvVars helper to also
unset the four new env vars (DASHBOARD_LIVE_LOGS_ENABLED,
DASHBOARD_LIVE_LOGS_BUFFER_SIZE, DASHBOARD_LIVE_LOGS_REPLAY_LIMIT,
DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS) so the Admin defaults used by the config
tests (cfg.Admin.EndpointsEnabled, UIEnabled, LiveLogsEnabled,
LiveLogsBufferSize, LiveLogsReplayLimit, LiveLogsHeartbeatSeconds) are
deterministic; update clearAllConfigEnvVars to call os.Unsetenv for those four
keys (or remove them before the test) so the assertions in config_test.go always
run against default values.
In `@internal/admin/handler_audit.go`:
- Around line 176-206: Add Swagger/OpenAPI annotations for the AuditLogDetail
handler to match the pattern used by AuditLog and AuditConversation: annotate
the AuditLogDetail function (AuditLogDetail) with operation summary,
description, tags (e.g., "admin"), parameters (query param "log_id" required),
success response schema (returning an audit log entry / 200), and error
responses (400/404/500). Place these comments immediately above the
AuditLogDetail function declaration so the API generator picks them up and
ensure the parameter name and response type match the types used in
auditlog.LogEntry and the existing audit endpoints.
In `@README.md`:
- Line 277: Update the README config table to include the three missing
dashboard live-log controls: DASHBOARD_LIVE_LOGS_BUFFER_SIZE,
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS; for
each add the default value, a one-line description (what it controls and a short
hint when to change it), and keep the existing DASHBOARD_LIVE_LOGS_ENABLED row
style so operators can tune buffering, replay bounds and heartbeat behavior from
the main docs entry point.
---
Outside diff comments:
In `@internal/auditlog/logger.go`:
- Around line 205-215: The WriteBatch error path currently logs the failure but
does not notify live clients, leaving prior audit.completed events pending;
after the slog.Error call in the WriteBatch error branch (where
l.store.WriteBatch is invoked), iterate the batch and call l.PublishLiveEvent
for each entry with a terminal failure event (e.g., LiveEventAuditFailed) so
live clients receive a terminal state; use the same entry objects from batch and
the existing l.PublishLiveEvent method to mark them failed (mirroring the
success loop that uses LiveEventAuditFlushed) so pending entries are cleared.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7a1a2521-0b87-4ac6-9c72-194aa67d680e
📒 Files selected for processing (29)
- CLAUDE.md
- README.md
- config/admin.go
- config/config.go
- config/config_test.go
- internal/admin/dashboard/static/css/dashboard.css
- internal/admin/dashboard/static/js/dashboard.js
- internal/admin/dashboard/static/js/modules/audit-list.js
- internal/admin/dashboard/static/js/modules/live-logs.js
- internal/admin/dashboard/static/js/modules/live-logs.test.cjs
- internal/admin/dashboard/static/js/modules/workflows.js
- internal/admin/dashboard/static/js/modules/workflows.test.cjs
- internal/admin/dashboard/templates/layout.html
- internal/admin/handler.go
- internal/admin/handler_audit.go
- internal/admin/handler_live.go
- internal/admin/handler_test.go
- internal/admin/routes.go
- internal/admin/routes_test.go
- internal/app/app.go
- internal/app/app_test.go
- internal/auditlog/auditlog.go
- internal/auditlog/constants.go
- internal/auditlog/logger.go
- internal/auditlog/middleware.go
- internal/live/broker.go
- internal/live/broker_test.go
- internal/usage/logger.go
- internal/usage/usage.go
- **Models:** `MODELS_ENABLED_BY_DEFAULT` (true), `MODEL_OVERRIDES_ENABLED` (true), `KEEP_ONLY_ALIASES_AT_MODELS_ENDPOINT` (false), `CONFIGURED_PROVIDER_MODELS_MODE` (`fallback` or `allowlist`, default `fallback`; `allowlist` skips upstream `/models` for providers with configured lists); persisted overrides restrict/allow selectors with `user_paths`. When alias-only models listing is enabled, `GET /v1/models` returns only model aliases, not full concrete model specs, to operators.
- **Audit logging:** `LOGGING_ENABLED` (false), `LOGGING_LOG_BODIES` (false), `LOGGING_LOG_HEADERS` (false), `LOGGING_RETENTION_DAYS` (30)
- **Usage tracking:** `USAGE_ENABLED` (true), `ENFORCE_RETURNING_USAGE_DATA` (true), `USAGE_RETENTION_DAYS` (90)
- **Dashboard live logs:** `DASHBOARD_LIVE_LOGS_ENABLED` (true), `DASHBOARD_LIVE_LOGS_BUFFER_SIZE` (10000), `DASHBOARD_LIVE_LOGS_REPLAY_LIMIT` (1000), `DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS` (15)
Add tuning guidance for live-log knobs (not just defaults).
Line 121 lists defaults, but it doesn’t tell operators when to adjust BUFFER_SIZE, REPLAY_LIMIT, or HEARTBEAT_SECONDS. Add a brief “increase/decrease when…” note so this section is actionable.
As per coding guidelines: "**/*.md: Documentation should be concise, practical, and user-focused. Show defaults, explain when to change them, and include minimal examples when useful."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@CLAUDE.md` at line 121, The documentation lists only defaults for
DASHBOARD_LIVE_LOGS_* but lacks actionable tuning guidance; update the CLAUDE.md
entry for DASHBOARD_LIVE_LOGS_ENABLED / DASHBOARD_LIVE_LOGS_BUFFER_SIZE /
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT / DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS to
include one-line guidance for each: when to increase (e.g., high traffic, large
spikes, long client reconnect windows), when to decrease (e.g., memory/latency
constraints), and minimal example thresholds (e.g., buffer size increase for
>1000 msgs/sec, replay limit for long reconnects, heartbeat lower for frequent
client liveness) so operators know how to tune rather than only seeing defaults.
```go
if !cfg.Admin.EndpointsEnabled {
	t.Error("expected Admin.EndpointsEnabled=true")
}
if !cfg.Admin.UIEnabled {
	t.Error("expected Admin.UIEnabled=true")
}
if !cfg.Admin.LiveLogsEnabled {
	t.Error("expected Admin.LiveLogsEnabled=true")
}
if cfg.Admin.LiveLogsBufferSize != 10000 {
	t.Errorf("expected Admin.LiveLogsBufferSize=10000, got %d", cfg.Admin.LiveLogsBufferSize)
}
if cfg.Admin.LiveLogsReplayLimit != 1000 {
	t.Errorf("expected Admin.LiveLogsReplayLimit=1000, got %d", cfg.Admin.LiveLogsReplayLimit)
}
if cfg.Admin.LiveLogsHeartbeatSeconds != 15 {
	t.Errorf("expected Admin.LiveLogsHeartbeatSeconds=15, got %d", cfg.Admin.LiveLogsHeartbeatSeconds)
}
```
Add env cleanup for new live-log variables to keep these default assertions deterministic.
These new assertions rely on clean process env, but clearAllConfigEnvVars doesn’t clear DASHBOARD_LIVE_LOGS_ENABLED, DASHBOARD_LIVE_LOGS_BUFFER_SIZE, DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS. If any test sets them, defaults checks here can become order-dependent.
Suggested fix

```diff
 func clearAllConfigEnvVars(t *testing.T) {
 	t.Helper()
 	for _, key := range []string{
@@
 		"HTTP_TIMEOUT", "HTTP_RESPONSE_HEADER_TIMEOUT",
 		"WORKFLOW_REFRESH_INTERVAL",
+		"DASHBOARD_LIVE_LOGS_ENABLED",
+		"DASHBOARD_LIVE_LOGS_BUFFER_SIZE",
+		"DASHBOARD_LIVE_LOGS_REPLAY_LIMIT",
+		"DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS",
 	} {
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@config/config_test.go` around lines 200 - 217, The test assertions for
cfg.Admin.LiveLogs* require a clean environment; extend the existing
clearAllConfigEnvVars helper to also unset the four new env vars
(DASHBOARD_LIVE_LOGS_ENABLED, DASHBOARD_LIVE_LOGS_BUFFER_SIZE,
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS) so the
Admin defaults used by the config tests (cfg.Admin.EndpointsEnabled, UIEnabled,
LiveLogsEnabled, LiveLogsBufferSize, LiveLogsReplayLimit,
LiveLogsHeartbeatSeconds) are deterministic; update clearAllConfigEnvVars to
call os.Unsetenv for those four keys (or remove them before the test) so the
assertions in config_test.go always run against default values.
```go
// AuditLogDetail handles GET /admin/audit/detail.
func (h *Handler) AuditLogDetail(c *echo.Context) error {
	logID := strings.TrimSpace(c.QueryParam("log_id"))
	if logID == "" {
		return handleError(c, core.NewInvalidRequestError("log_id is required", nil))
	}
	if h.auditReader == nil {
		return handleError(c, featureUnavailableError("audit log detail is unavailable"))
	}

	entry, err := h.auditReader.GetLogByID(c.Request().Context(), logID)
	if err != nil {
		return handleError(c, err)
	}
	if entry == nil {
		return handleError(c, core.NewNotFoundError("audit log not found: "+logID))
	}

	response, err := h.auditLogResponse(c.Request().Context(), &auditlog.LogListResult{
		Entries: []auditlog.LogEntry{*entry},
		Total:   1,
		Limit:   1,
	})
	if err != nil {
		return handleError(c, err)
	}
	if len(response.Entries) == 0 {
		return handleError(c, core.NewNotFoundError("audit log not found: "+logID))
	}
	return c.JSON(http.StatusOK, response.Entries[0])
}
```
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Consider adding API documentation for discoverability.
The new AuditLogDetail endpoint lacks Swagger/OpenAPI annotations that are present on related endpoints like AuditLog (line 21) and AuditConversation (line 208). Adding standard annotations would improve API documentation consistency and help dashboard developers discover this endpoint.
📝 Suggested documentation pattern

```diff
+// AuditLogDetail handles GET /admin/audit/detail.
+//
+// @Summary Get single audit log entry by ID
+// @Tags admin
+// @Produce json
+// @Security BearerAuth
+// @Param log_id query string true "Audit log entry ID"
+// @Success 200 {object} auditLogEntryResponse
+// @Failure 400 {object} core.GatewayError
+// @Failure 401 {object} core.GatewayError
+// @Failure 404 {object} core.GatewayError
+// @Router /admin/audit/detail [get]
 func (h *Handler) AuditLogDetail(c *echo.Context) error {
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
// AuditLogDetail handles GET /admin/audit/detail.
//
// @Summary Get single audit log entry by ID
// @Tags admin
// @Produce json
// @Security BearerAuth
// @Param log_id query string true "Audit log entry ID"
// @Success 200 {object} auditLogEntryResponse
// @Failure 400 {object} core.GatewayError
// @Failure 401 {object} core.GatewayError
// @Failure 404 {object} core.GatewayError
// @Router /admin/audit/detail [get]
func (h *Handler) AuditLogDetail(c *echo.Context) error {
	logID := strings.TrimSpace(c.QueryParam("log_id"))
	if logID == "" {
		return handleError(c, core.NewInvalidRequestError("log_id is required", nil))
	}
	if h.auditReader == nil {
		return handleError(c, featureUnavailableError("audit log detail is unavailable"))
	}
	entry, err := h.auditReader.GetLogByID(c.Request().Context(), logID)
	if err != nil {
		return handleError(c, err)
	}
	if entry == nil {
		return handleError(c, core.NewNotFoundError("audit log not found: "+logID))
	}
	response, err := h.auditLogResponse(c.Request().Context(), &auditlog.LogListResult{
		Entries: []auditlog.LogEntry{*entry},
		Total:   1,
		Limit:   1,
	})
	if err != nil {
		return handleError(c, err)
	}
	if len(response.Entries) == 0 {
		return handleError(c, core.NewNotFoundError("audit log not found: "+logID))
	}
	return c.JSON(http.StatusOK, response.Entries[0])
}
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/admin/handler_audit.go` around lines 176 - 206, Add Swagger/OpenAPI
annotations for the AuditLogDetail handler to match the pattern used by AuditLog
and AuditConversation: annotate the AuditLogDetail function (AuditLogDetail)
with operation summary, description, tags (e.g., "admin"), parameters (query
param "log_id" required), success response schema (returning an audit log entry
/ 200), and error responses (400/404/500). Place these comments immediately
above the AuditLogDetail function declaration so the API generator picks them up
and ensure the parameter name and response type match the types used in
auditlog.LogEntry and the existing audit endpoints.
| `STORAGE_TYPE` | `sqlite` | Storage backend (`sqlite`, `postgresql`, `mongodb`) |
| `METRICS_ENABLED` | `false` | Enable Prometheus metrics (experimental) |
| `LOGGING_ENABLED` | `false` | Enable audit logging |
| `DASHBOARD_LIVE_LOGS_ENABLED` | `true` | Stream realtime dashboard log previews with bounded replay |
README config table is missing the other live-log controls.
Line 277 adds only DASHBOARD_LIVE_LOGS_ENABLED, but operators also need DASHBOARD_LIVE_LOGS_BUFFER_SIZE, DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS in this table to tune behavior from the main docs entry point.
As per coding guidelines: "**/*.md: Documentation should be concise, practical, and user-focused. Show defaults, explain when to change them, and include minimal examples when useful."
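For illustration, the missing rows might look like this (defaults taken from the CLAUDE.md entry in this PR; the descriptions are suggested wording, not source text):

| `DASHBOARD_LIVE_LOGS_BUFFER_SIZE` | `10000` | In-memory live event buffer; raise for high-traffic gateways, lower under memory pressure |
| `DASHBOARD_LIVE_LOGS_REPLAY_LIMIT` | `1000` | Max events replayed to a reconnecting dashboard; raise for longer reconnect windows |
| `DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS` | `15` | SSE keep-alive interval; lower it if proxies drop idle connections sooner |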
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@README.md` at line 277, Update the README config table to include the three
missing dashboard live-log controls: DASHBOARD_LIVE_LOGS_BUFFER_SIZE,
DASHBOARD_LIVE_LOGS_REPLAY_LIMIT, and DASHBOARD_LIVE_LOGS_HEARTBEAT_SECONDS; for
each add the default value, a one-line description (what it controls and a short
hint when to change it), and keep the existing DASHBOARD_LIVE_LOGS_ENABLED row
style so operators can tune buffering, replay bounds and heartbeat behavior from
the main docs entry point.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/admin/handler_live.go`:
- Around line 17-87: Add unit tests for LiveLogs exercising cursor parsing, type
filtering, replay/reset behavior, heartbeat pings, and live event forwarding by
creating tests that call Handler.LiveLogs with controlled inputs; specifically,
add *_test.go cases that (1) pass an invalid cursor string to ensure liveCursor
error handling, (2) set query "types" to verify liveTypeFilter excludes/accepts
events, (3) supply a mocked/standalone liveBroker Subscribe result where
sub.Reset is true and sub.Replay contains events to assert writeLiveEvent sends
a reset and replayed events, and (4) provide a Subscribe with an Events channel
and Heartbeat interval to assert periodic heartbeat events and forwarding of
live events. Implement a test liveBroker stub (implementing Enabled, Subscribe,
LatestSeq, Heartbeat) and use an echo request/response recorder to capture SSE
output for assertions, ensuring to close the Events channel to end the handler
loop.
- Around line 101-125: The current liveTypeFilter(raw string) +
liveLogTypeFilter.matches(eventType string) treats an input like "types=foo" (no
valid tokens) the same as no filter and returns true for all events; fix by
distinguishing "no filter provided" from "filter provided but no valid tokens":
change liveLogTypeFilter to carry a boolean (e.g., provided) or equivalent
sentinel set by liveTypeFilter when raw is non-empty, set provided=true only if
raw was given (even if no tokens were valid), and then update matches to return
true when provided==false (no filter specified), but return false when
provided==true and the internal map is empty (filter provided but contained no
valid tokens); update liveTypeFilter(raw string) to trim raw and set the
provided flag accordingly and populate the map only with valid tokens.
In `@internal/live/broker.go`:
- Around line 245-248: When the replay buffer exceeds b.bufferSize, avoid
allocating a new slice; instead shift entries left in-place with copy(b.events,
b.events[drop:]), zero out the now-unused tail slots to avoid retaining
references (e.g. for i := b.bufferSize; i < len(b.events); i++ { b.events[i] =
Event{} }), and then reslice with b.events = b.events[:b.bufferSize]; replace
the current append([]Event(nil), ...) reallocation with this in-place trim using
the identifiers b.events, b.bufferSize, and drop.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 98702627-7b01-4603-81ae-effb459b69e8
📒 Files selected for processing (5)
internal/admin/handler_live.go, internal/app/app.go, internal/app/app_test.go, internal/live/broker.go, internal/live/broker_test.go
```go
// LiveLogs handles GET /admin/live/logs.
func (h *Handler) LiveLogs(c *echo.Context) error {
	if h.liveBroker == nil || !h.liveBroker.Enabled() {
		return handleError(c, featureUnavailableError("live logs are unavailable"))
	}

	cursor, err := liveCursor(c.QueryParam("cursor"))
	if err != nil {
		return handleError(c, err)
	}
	filter := liveTypeFilter(c.QueryParam("types"))
	sub := h.liveBroker.Subscribe(cursor)
	if sub == nil {
		return handleError(c, featureUnavailableError("live logs are unavailable"))
	}
	defer sub.Close()

	res := c.Response()
	// SSE responses are intentionally long-lived; keep disconnect detection via writes.
	_ = http.NewResponseController(res).SetWriteDeadline(time.Time{})
	res.Header().Set(echo.HeaderContentType, "text/event-stream")
	res.Header().Set(echo.HeaderCacheControl, "no-cache, no-transform")
	res.Header().Set(echo.HeaderConnection, "keep-alive")
	res.Header().Set("X-Accel-Buffering", "no")
	res.WriteHeader(http.StatusOK)

	if sub.Reset {
		if err := writeLiveEvent(res, live.Event{
			Seq:  h.liveBroker.LatestSeq(),
			Type: live.EventReset,
		}); err != nil {
			return err
		}
	}
	for _, event := range sub.Replay {
		if !filter.matches(event.Type) {
			continue
		}
		if err := writeLiveEvent(res, event); err != nil {
			return err
		}
	}

	ticker := time.NewTicker(h.liveBroker.Heartbeat())
	defer ticker.Stop()

	ctx := c.Request().Context()
	for {
		select {
		case <-ctx.Done():
			return nil
		case event, ok := <-sub.Events:
			if !ok {
				return nil
			}
			if !filter.matches(event.Type) {
				continue
			}
			if err := writeLiveEvent(res, event); err != nil {
				return err
			}
		case <-ticker.C:
			if err := writeLiveEvent(res, live.Event{
				Seq:  h.liveBroker.LatestSeq(),
				Type: live.EventHeartbeat,
			}); err != nil {
				return err
			}
		}
	}
}
```
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Add targeted tests for cursor/type parsing and SSE replay/reset flow.
This endpoint introduces multiple control-path branches (invalid cursor, type filtering, reset replay, heartbeat/live forwarding), but file-level coverage is currently missing in this PR context.
As per coding guidelines: **/*_test.go: Add or update tests for behavior changes.
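As a starting point, a table-driven test for the type-filter branch alone might look like this (a sketch that assumes it lives in the same `admin` package as the unexported helpers shown below; the event-type strings are illustrative):

```go
package admin

import "testing"

func TestLiveTypeFilterMatches(t *testing.T) {
	cases := []struct {
		name      string
		raw       string
		eventType string
		want      bool
	}{
		{"no filter matches everything", "", "audit.completed", true},
		{"audit filter matches audit events", "audit", "audit.completed", true},
		{"audit filter rejects usage events", "audit", "usage.recorded", false},
		{"both types accepted", "audit,usage", "usage.recorded", true},
		{"whitespace and case are normalized", " Audit , USAGE ", "audit.started", true},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			filter := liveTypeFilter(tc.raw)
			if got := filter.matches(tc.eventType); got != tc.want {
				t.Errorf("liveTypeFilter(%q).matches(%q) = %v, want %v",
					tc.raw, tc.eventType, got, tc.want)
			}
		})
	}
}
```

The full SSE-flow cases the prompt describes (reset replay, heartbeat pings, live forwarding) would additionally need the broker stub it mentions.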
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/admin/handler_live.go` around lines 17 - 87, Add unit tests for
LiveLogs exercising cursor parsing, type filtering, replay/reset behavior,
heartbeat pings, and live event forwarding by creating tests that call
Handler.LiveLogs with controlled inputs; specifically, add *_test.go cases that
(1) pass an invalid cursor string to ensure liveCursor error handling, (2) set
query "types" to verify liveTypeFilter excludes/accepts events, (3) supply a
mocked/standalone liveBroker Subscribe result where sub.Reset is true and
sub.Replay contains events to assert writeLiveEvent sends a reset and replayed
events, and (4) provide a Subscribe with an Events channel and Heartbeat
interval to assert periodic heartbeat events and forwarding of live events.
Implement a test liveBroker stub (implementing Enabled, Subscribe, LatestSeq,
Heartbeat) and use an echo request/response recorder to capture SSE output for
assertions, ensuring to close the Events channel to end the handler loop.
```go
type liveLogTypeFilter map[string]struct{}

func liveTypeFilter(raw string) liveLogTypeFilter {
	filter := liveLogTypeFilter{}
	for _, item := range strings.Split(raw, ",") {
		item = strings.ToLower(strings.TrimSpace(item))
		switch item {
		case "audit", "usage":
			filter[item] = struct{}{}
		}
	}
	return filter
}

func (f liveLogTypeFilter) matches(eventType string) bool {
	if len(f) == 0 {
		return true
	}
	prefix, _, ok := strings.Cut(eventType, ".")
	if !ok {
		prefix = eventType
	}
	_, matched := f[prefix]
	return matched
}
```
A `types` value containing only unknown tokens currently matches all events.

With `types=foo`, the parsed filter is empty and `matches` returns true for everything. That silently broadens output instead of narrowing it.
💡 Proposed fix (distinguish "no filter provided" from "no valid filter tokens")

```diff
-type liveLogTypeFilter map[string]struct{}
+type liveLogTypeFilter struct {
+	allowAll bool
+	allowed  map[string]struct{}
+}

 func liveTypeFilter(raw string) liveLogTypeFilter {
-	filter := liveLogTypeFilter{}
+	if strings.TrimSpace(raw) == "" {
+		return liveLogTypeFilter{allowAll: true}
+	}
+	filter := liveLogTypeFilter{allowed: map[string]struct{}{}}
 	for _, item := range strings.Split(raw, ",") {
 		item = strings.ToLower(strings.TrimSpace(item))
 		switch item {
 		case "audit", "usage":
-			filter[item] = struct{}{}
+			filter.allowed[item] = struct{}{}
 		}
 	}
 	return filter
 }

 func (f liveLogTypeFilter) matches(eventType string) bool {
-	if len(f) == 0 {
+	if f.allowAll {
 		return true
 	}
 	prefix, _, ok := strings.Cut(eventType, ".")
 	if !ok {
 		prefix = eventType
 	}
-	_, matched := f[prefix]
+	_, matched := f.allowed[strings.ToLower(prefix)]
 	return matched
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
type liveLogTypeFilter struct {
	allowAll bool
	allowed  map[string]struct{}
}

func liveTypeFilter(raw string) liveLogTypeFilter {
	if strings.TrimSpace(raw) == "" {
		return liveLogTypeFilter{allowAll: true}
	}
	filter := liveLogTypeFilter{allowed: map[string]struct{}{}}
	for _, item := range strings.Split(raw, ",") {
		item = strings.ToLower(strings.TrimSpace(item))
		switch item {
		case "audit", "usage":
			filter.allowed[item] = struct{}{}
		}
	}
	return filter
}

func (f liveLogTypeFilter) matches(eventType string) bool {
	if f.allowAll {
		return true
	}
	prefix, _, ok := strings.Cut(eventType, ".")
	if !ok {
		prefix = eventType
	}
	_, matched := f.allowed[strings.ToLower(prefix)]
	return matched
}
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/admin/handler_live.go` around lines 101 - 125, The current
liveTypeFilter(raw string) + liveLogTypeFilter.matches(eventType string) treats
an input like "types=foo" (no valid tokens) the same as no filter and returns
true for all events; fix by distinguishing "no filter provided" from "filter
provided but no valid tokens": change liveLogTypeFilter to carry a boolean
(e.g., provided) or equivalent sentinel set by liveTypeFilter when raw is
non-empty, set provided=true only if raw was given (even if no tokens were
valid), and then update matches to return true when provided==false (no filter
specified), but return false when provided==true and the internal map is empty
(filter provided but contained no valid tokens); update liveTypeFilter(raw
string) to trim raw and set the provided flag accordingly and populate the map
only with valid tokens.
```go
if len(b.events) > b.bufferSize {
	drop := len(b.events) - b.bufferSize
	b.events = append([]Event(nil), b.events[drop:]...)
}
```
Avoid per-event slice reallocation when replay buffer is full.
This path reallocates and copies the full replay slice on every publish after saturation, which will add GC pressure on a hot logging path.
💡 Proposed fix (in-place trim without reallocation)

```diff
 b.events = append(b.events, event)
 if len(b.events) > b.bufferSize {
 	drop := len(b.events) - b.bufferSize
-	b.events = append([]Event(nil), b.events[drop:]...)
+	copy(b.events, b.events[drop:])
+	b.events = b.events[:len(b.events)-drop]
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/live/broker.go` around lines 245 - 248, When the replay buffer
exceeds b.bufferSize, avoid allocating a new slice; instead shift entries left
in-place with copy(b.events, b.events[drop:]), zero out the now-unused tail
slots to avoid retaining references (e.g. for i := b.bufferSize; i <
len(b.events); i++ { b.events[i] = Event{} }), and then reslice with b.events =
b.events[:b.bufferSize]; replace the current append([]Event(nil), ...)
reallocation with this in-place trim using the identifiers b.events,
b.bufferSize, and drop.
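Combining the diff above with the tail-zeroing described in the prompt, the publish path could read as follows (a sketch of the method body only; `b.events`, `b.bufferSize`, and `Event` come from the broker shown in this PR):

```go
b.events = append(b.events, event)
if len(b.events) > b.bufferSize {
	drop := len(b.events) - b.bufferSize
	// Shift the surviving events left without reallocating the buffer.
	copy(b.events, b.events[drop:])
	// Zero the vacated tail so dropped events don't pin memory through
	// the shared backing array.
	for i := b.bufferSize; i < len(b.events); i++ {
		b.events[i] = Event{}
	}
	b.events = b.events[:b.bufferSize]
}
```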