feat(health): Add NVOS Streaming telemetry #1975

Open: mkoci wants to merge 46 commits into NVIDIA:main from mkoci:feature-nvos-health

@mkoci mkoci commented May 28, 2026

Description

Adds a gNMI collector ([collectors.nvue.gnmi], disabled by default) that subscribes to NVUE over gNMI: ON_CHANGE for /system-events, and SAMPLE for the following paths:

/components/component
/interfaces/interface
/platform-general/leak-sensors
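
As a rough illustration of the subscription set described above, the sketch below pairs each path with its mode. The type and function names (`SubscriptionMode`, `Subscription`, `subscription_plan`) are hypothetical and are not taken from this PR; only the paths and the ON_CHANGE/SAMPLE split come from the description.

```rust
/// Subscription modes named in the description (hypothetical type, not the PR's).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SubscriptionMode {
    OnChange,
    Sample,
}

/// One gNMI path together with the mode it is subscribed with.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Subscription {
    path: &'static str,
    mode: SubscriptionMode,
}

/// Builds the list of subscriptions: /system-events is ON_CHANGE,
/// the remaining three paths are SAMPLE.
fn subscription_plan() -> Vec<Subscription> {
    let mut plan = vec![Subscription {
        path: "/system-events",
        mode: SubscriptionMode::OnChange,
    }];
    for path in [
        "/components/component",
        "/interfaces/interface",
        "/platform-general/leak-sensors",
    ] {
        plan.push(Subscription {
            path,
            mode: SubscriptionMode::Sample,
        });
    }
    plan
}
```

A real collector would translate each entry into a gNMI `Subscription` message inside a single `SubscribeRequest`; that step depends on the vendored protos and is omitted here.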

It uses long-lived gRPC streams and reconnects with exponential back-off plus jitter.
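
A minimal sketch of exponential back-off with jitter, using only the standard library. It is not the PR's implementation: the function names, the "full jitter" variant (scaling the capped delay by a random fraction), and the caller-supplied `jitter` value are all assumptions made for illustration.

```rust
use std::time::Duration;

/// Upper bound on the delay before reconnect attempt `attempt` (0-based):
/// base * 2^attempt, capped at `max`.
fn backoff_ceiling(attempt: u32, base: Duration, max: Duration) -> Duration {
    // checked_shl returns None for shifts of 32 bits or more; treat that as "huge".
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    // saturating_mul avoids overflow panics; min applies the cap.
    base.saturating_mul(factor).min(max)
}

/// "Full jitter": scale the ceiling by a fraction in [0, 1].
/// `jitter` is expected to come from a random source chosen by the caller.
fn backoff_delay(attempt: u32, base: Duration, max: Duration, jitter: f64) -> Duration {
    // Fall back to the full ceiling if the caller passes NaN.
    let fraction = if jitter.is_nan() { 1.0 } else { jitter.clamp(0.0, 1.0) };
    backoff_ceiling(attempt, base, max).mul_f64(fraction)
}
```

In a reconnect loop, the attempt counter would typically reset to zero once a stream has been re-established, so a later disconnect starts again from the base delay.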

Builds on #711 (SSE streaming + OtlpSink for logs). The protos are vendored for reproducible offline builds, as in #711.

The collector currently supports PrometheusSink and OtlpSink.

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

Future

It would be useful to extend this to emit HealthAlerts, especially for ON_CHANGE updates from /system-events, which carry extra information that is not available via the Switch BMC.

mkoci and others added 30 commits May 14, 2026 14:20
Signed-off-by: mkoci <mkoci@nvidia.com>
…ming

Signed-off-by: mkoci <26286151+mkoci@users.noreply.github.com>
…ls on switch hosts

… monitoring

@mkoci mkoci requested review from a team and Coco-Ben as code owners May 28, 2026 02:10

copy-pr-bot Bot commented May 28, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

# Conflicts:
#	crates/health/src/api_client.rs
#	crates/health/src/discovery/spawn.rs
@mkoci mkoci force-pushed the feature-nvos-health branch from 35586c5 to ab24084 Compare May 28, 2026 02:26
@mkoci mkoci changed the title from "Feature nvos health" to "feat(health): Add NVOS Streaming telemetry" May 28, 2026
@mkoci mkoci force-pushed the feature-nvos-health branch 2 times, most recently from a4430a9 to a1f996c Compare May 28, 2026 02:52
@mkoci mkoci requested review from tmcroberts97 and yoks May 28, 2026 02:52