Feature Request: Real-time dashboard view combining in-memory buffer with SQLite #290

@Neiko2002

Description

Problem

We evaluated dozens of LLM gateways and proxies (Phoenix/Arize, LiteLLM, GoModel, and many others) looking for a self-hosted solution that provides real-time visibility of LLM requests and streamed completions in a Web UI.

The core requirement is simple: when a client sends a request to the gateway and the LLM streams tokens back via SSE, the dashboard should display the prompt and streamed tokens as they happen — not with a 5-second delay.

What we found

Most tools have significant delays (30–50s polling) or no body visibility at all. GoModel came closest — it captures request/response bodies, parses SSE events, and reconstructs full response content. However, the dashboard only reads from SQLite, not from the in-memory buffer.

GoModel's StreamLogObserver correctly captures SSE events and builds the full response body in memory. It then flushes to SQLite every 5 seconds (FlushInterval: 5s) or every 1000 entries (BufferSize: 1000). The dashboard's GET /admin/api/v1/audit/log endpoint queries only SQLite — it does not also read the in-memory buffer.
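
For reference, the write-back mechanism described above amounts to something like the following sketch. Every name in it (Entry, Buffer, persist) is illustrative rather than GoModel's actual code; it only shows the shape of the behavior: entries accumulate in memory and reach SQLite on a timer or once the buffer fills.

```go
package audit

import (
	"sync"
	"time"
)

// Entry is an illustrative stand-in for one captured request/response pair.
type Entry struct {
	Timestamp    time.Time
	RequestBody  string
	ResponseBody string // reconstructed from parsed SSE events
}

// Buffer is a write-back cache: reads never touch it today, only Flush does.
type Buffer struct {
	mu         sync.Mutex
	entries    []Entry
	bufferSize int
	persist    func([]Entry) // batch insert into SQLite
}

// NewBuffer starts a background flusher, e.g. flushInterval=5s, bufferSize=1000.
func NewBuffer(flushInterval time.Duration, bufferSize int, persist func([]Entry)) *Buffer {
	b := &Buffer{bufferSize: bufferSize, persist: persist}
	go func() {
		for range time.Tick(flushInterval) {
			b.Flush()
		}
	}()
	return b
}

// Add buffers an entry and flushes early once the buffer is full.
func (b *Buffer) Add(e Entry) {
	b.mu.Lock()
	b.entries = append(b.entries, e)
	full := len(b.entries) >= b.bufferSize
	b.mu.Unlock()
	if full {
		b.Flush()
	}
}

// Flush hands the buffered batch to the SQLite writer and resets the buffer.
func (b *Buffer) Flush() {
	b.mu.Lock()
	batch := b.entries
	b.entries = nil
	b.mu.Unlock()
	if len(batch) > 0 {
		b.persist(batch)
	}
}
```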

This means:

  • No live updates in the dashboard between flush cycles
  • Minimum 5-second delay before any new request appears
  • No auto-refresh or polling mechanism to check for new entries
  • The in-memory buffer exists solely as a write-back cache, not as a live data source

Proposed Solution

Combine in-memory and SQLite data in the dashboard API response:

  1. Keep the in-memory buffer as the primary read source for the dashboard API, with SQLite as the durable backup.
  2. Or: Return both in-memory entries (newest, not yet flushed) and SQLite entries (persisted) in the GET /admin/api/v1/audit/log response, merging them by timestamp (see the first sketch below).
  3. Or: Add a separate WebSocket/SSE endpoint that pushes new entries to the dashboard as they arrive in memory, without waiting for the SQLite flush (see the second sketch below).
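
The merge variant (option 2) could be small. Building on the buffer sketch above, and assuming a hypothetical queryPersisted helper for the SQLite side (plus "database/sql" and "sort" imports), the handler's core might look like this:

```go
// Snapshot returns a copy of the entries not yet flushed to SQLite
// (an addition to the Buffer sketched earlier).
func (b *Buffer) Snapshot() []Entry {
	b.mu.Lock()
	defer b.mu.Unlock()
	return append([]Entry(nil), b.entries...)
}

// mergedAuditLog is what GET /admin/api/v1/audit/log could return:
// persisted SQLite rows plus unflushed in-memory entries, sorted by
// timestamp. queryPersisted is a hypothetical query helper, not an
// existing GoModel function.
func mergedAuditLog(buf *Buffer, db *sql.DB) ([]Entry, error) {
	persisted, err := queryPersisted(db) // SELECT ... FROM the audit table
	if err != nil {
		return nil, err
	}
	merged := append(persisted, buf.Snapshot()...)
	sort.Slice(merged, func(i, j int) bool {
		return merged[i].Timestamp.After(merged[j].Timestamp) // newest first
	})
	return merged, nil
}
```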

This would give users near-instant visibility into LLM requests: the prompt appears immediately, and streamed tokens update in real time as they are captured by StreamLogObserver.
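
The push variant (option 3) would be a standard server-sent events handler on top of the same buffer. Here, subscribe/unsubscribe are hypothetical fan-out hooks on the buffer's Add path, not an existing GoModel API; assumed imports are "encoding/json", "fmt", and "net/http":

```go
// liveAuditHandler streams each new in-memory entry to the dashboard
// the moment it is captured, with no dependency on the SQLite flush.
func liveAuditHandler(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")

	ch := subscribe() // hypothetical: a chan Entry fed from Buffer.Add
	defer unsubscribe(ch)

	for {
		select {
		case <-r.Context().Done(): // dashboard tab closed
			return
		case e := <-ch:
			data, err := json.Marshal(e)
			if err != nil {
				continue
			}
			fmt.Fprintf(w, "data: %s\n\n", data)
			flusher.Flush() // deliver now, without waiting for the SQLite flush
		}
	}
}
```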

Why this matters

For debugging LLM integrations, monitoring token usage, and understanding what prompts/completions are flowing through the gateway, real-time visibility is essential. A 5-second delay is acceptable for analytics and cost tracking, but it makes the dashboard useless for live debugging.

We tested many tools and found that none of them provide true real-time dashboard visibility of LLM request/response bodies. GoModel has the architecture to do it (the in-memory buffer already exists and captures everything), but it doesn't expose it.

Environment

  • GoModel version: latest (ENTERPILOT/GoModel)
  • Use case: Debugging LLM streaming, monitoring real-time token usage, understanding prompt/response content
  • Current workaround: None — must wait 5+ seconds for each request to appear
