feat: streaming api logs in the admin web ui #1959
Conversation
It looks cool, but I personally don't feel there should be product code doing any of this. I'd probably extend the UI to fetch logs from wherever they are stored (e.g. Loki), though obviously that depends on the log storage.
Maybe under a config flag? I can see it being useful for debugging at a minimum.
Yeah, we could flag it? I was just having problems with Loki and
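The config-flag idea raised above could look roughly like the following. This is only a sketch: `WebUiConfig`, `enable_log_stream`, and `log_stream_status` are hypothetical names that do not come from the PR, and the choice of answering 404 when the feature is off is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebUiConfig:
    # Hypothetical flag name. Off by default, so log streaming is opt-in.
    enable_log_stream: bool = False


def log_stream_status(config: WebUiConfig) -> int:
    """Return the HTTP status a (hypothetical) log-stream route would answer with."""
    # Answering 404 when disabled hides the route entirely; this is an assumed policy.
    return 200 if config.enable_log_stream else 404
```

With the default config the route behaves as if it does not exist; a deployment that wants the live view sets the flag explicitly.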
I was having random certificate and credential issues the other day when I was trying to look at API logs. It was annoying, and I was like this shouldn't be so difficult. Seemed to make sense to just add a `Logs` section to the web UI where I can stream API logs.

Plan is to actually have support for:
- API logs via SSE
- Scout logs (either via SSE or WebSockets using the `ScoutStream` thing we have).
- DPU Agent logs (either via SSE or WebSockets, also using a similar approach to `ScoutStream`).

Tested via my `local-dev` environment. Screen capture attached showing it in action!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
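For readers unfamiliar with SSE, the wire format for one log line is small enough to sketch. The helper below is illustrative only and is not the code in this PR; `format_sse_event` and its parameters are made-up names. It follows the standard event-stream framing: optional `id:` and `event:` fields, one `data:` field per payload line, and a blank line to end the event.

```python
from typing import Optional


def format_sse_event(
    message: str,
    event: Optional[str] = None,
    event_id: Optional[str] = None,
) -> str:
    """Frame one log message as a Server-Sent Events event (illustrative helper)."""
    fields = []
    if event_id is not None:
        fields.append(f"id: {event_id}")
    if event is not None:
        fields.append(f"event: {event}")
    # A payload containing newlines must be sent as several data: fields,
    # otherwise the embedded newline would end the field early.
    # An empty message still needs one (empty) data field.
    for part in message.splitlines() or [""]:
        fields.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(fields) + "\n\n"
```

A browser `EventSource` rejoins the `data:` fields with newlines, so a multi-line log entry arrives on the client as a single message.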
4077f8c to 65c7800
I'd say the reverse: we shouldn't presume that Loki or any other log/metrics platform exists. We don't want to force the deployment of a particular metrics platform just so this log view will work. I hear people saying "what about Vault and Postgres", but we're moving off Vault, and you can run different databases without losing anything; with a metrics platform there's generally only one, and people are very passionate about theirs.
This is slick! The individual managed host and explored endpoint pages have a log button in the top right corner that just links to Loki today (filtered on that object), but maybe it could point to this new view instead, with the filter pre-populated? I'm happy to take that on as a follow-up PR later :)
@ajf @krish-nvidia One major difference is that this tooling won't show old logs, which makes it not very useful for debugging anything that happened in the past. There's some chance that a follow-up feature request will ask for those logs too, and then there's probably a decision to be made about whether a log storage service (like Loki) should always be bundled directly as part of NICo. But if people think this tool adds enough value on its own, I'm fine adding it. At least in its current state it's not too much code.
The decision would obviously be that we wouldn't ship Loki, since it doesn't really make sense to ship a log analysis platform with a host lifecycle manager. Shipping in OTel format to some collector seems like the right choice? Then someone can just do whatever they want with the logs. NICo isn't an appliance.
OTEL unfortunately doesn't help with this use case, since it doesn't provide any way to query logs. It's just an interface for streaming telemetry as it's created. If the UI needs a built-in way to query past logs, then a more concrete interface to a log storage service is required.
Your call Chet, but if I were looking at logs I'd expect them to be historical and current, not just current + future. It was useful to you, so it may be useful to others.
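The "current + future only" limitation discussed in this thread can be made concrete with a small sketch. It is not taken from the PR: `LogBroadcaster` and `backlog_size` are hypothetical, and the bounded in-memory backlog is only one possible middle ground, not something the PR implements and not a replacement for a log store. With `backlog_size=0` a subscriber sees only lines published after it connects; a small backlog replays the most recent lines first.

```python
from collections import deque
from typing import Callable, Deque, List


class LogBroadcaster:
    """Fan out log lines to subscribers, optionally replaying a bounded backlog."""

    def __init__(self, backlog_size: int = 0) -> None:
        # deque(maxlen=0) discards every append, which gives live-only behaviour.
        self._backlog: Deque[str] = deque(maxlen=backlog_size)
        self._subscribers: List[Callable[[str], None]] = []

    def publish(self, line: str) -> None:
        self._backlog.append(line)
        for deliver in self._subscribers:
            deliver(line)

    def subscribe(self, deliver: Callable[[str], None]) -> None:
        # Replay whatever recent history is retained, then deliver live lines.
        for line in self._backlog:
            deliver(line)
        self._subscribers.append(deliver)
```

Even with a backlog, anything older than the last `backlog_size` lines, or from before a process restart, is gone, so querying genuinely historical logs would still need a storage backend.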
Description
I was having a bunch of issues the other day when I was trying to look at `carbide-api` logs. It was annoying, and I was like this shouldn't be so difficult just to check `carbide-api` logs. Seemed to make sense to just add a `Logs` section to the web UI where I can stream API logs, so this does that.

My plan is to actually have support for:
- `carbide-api` logs (which is what this does).
- `scout` logs (via WebSockets using the `ScoutStream` thing we have).
- `nico-dpu-agent` logs (via WebSockets, also using a similar approach to `ScoutStream`).

Tested via my `local-dev` environment. Screen capture attached showing it in action!

api_web_ui_log_stream.mov

Signed-off-by: Chet Nichols III <chetn@nvidia.com>