test: validate Hermes routed provider OpenInference spans #235

Merged
rapids-bot[bot] merged 3 commits into NVIDIA:main from mnajafian-nv:test/hermes-routed-openinference-consistency on Jun 6, 2026

mnajafian-nv (Contributor) commented Jun 5, 2026

Overview

This PR validates that Hermes routed provider flows emit the expected OpenInference spans by adding explicit session-path coverage for the routed Anthropic /v1/messages, OpenAI /v1/responses, and OpenAI /v1/chat/completions surfaces.

  • I confirm this contribution is my own work, or I have the right to submit it under this project's license.
  • I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

  • Adds a Hermes session-path OpenInference regression that drives the real routed provider flow through SessionManager::start_llm and SessionManager::end_llm.
  • Validates that the routed Anthropic and OpenAI payloads already covered in Hermes ATOF/ATIF produce the expected OpenInference LLM spans, including text output, tool-call rendering, usage, cached-token, and cost attributes.
  • Refactors the existing Hermes routed ATIF regression to reuse a shared routed-provider session helper, reducing duplication while keeping the assertions in the same reviewer-visible coverage layer.
  • Keeps the scope test-only and does not change runtime behavior.
  • Closes the remaining explicit Hermes routed OpenInference proof gap without widening into unrelated parity cleanup.
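
As a rough illustration of the shared-helper shape described above, the sketch below drives three routed surfaces through paired start/end calls and records the resulting order. Everything in it is hypothetical: `RoutedSurface`, `Event`, `RecordingSession`, and `drive_routed_session` are stand-ins invented for this example and are not the NeMo Relay `SessionManager` API. The real helper is async and sends provider payloads, both of which this synchronous toy leaves out.

```rust
/// Hypothetical enum naming the three routed surfaces listed in the overview.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RoutedSurface {
    AnthropicMessages,
    OpenAiResponses,
    OpenAiChatCompletions,
}

impl RoutedSurface {
    /// Fixed order in which the sketch exercises the surfaces.
    const ALL: [RoutedSurface; 3] = [
        RoutedSurface::AnthropicMessages,
        RoutedSurface::OpenAiResponses,
        RoutedSurface::OpenAiChatCompletions,
    ];

    /// Route path for each surface, as named in the PR overview.
    fn path(self) -> &'static str {
        match self {
            RoutedSurface::AnthropicMessages => "/v1/messages",
            RoutedSurface::OpenAiResponses => "/v1/responses",
            RoutedSurface::OpenAiChatCompletions => "/v1/chat/completions",
        }
    }
}

/// Hypothetical lifecycle events recorded by the toy session.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    LlmStart(RoutedSurface),
    LlmEnd(RoutedSurface),
}

/// Toy recorder standing in for a session manager (assumption, not the real type).
#[derive(Debug, Default)]
struct RecordingSession {
    events: Vec<Event>,
}

impl RecordingSession {
    fn start_llm(&mut self, surface: RoutedSurface) {
        self.events.push(Event::LlmStart(surface));
    }

    fn end_llm(&mut self, surface: RoutedSurface) {
        self.events.push(Event::LlmEnd(surface));
    }
}

/// Shared driver: each regression calls this once, then asserts on its own output format.
fn drive_routed_session(session: &mut RecordingSession) {
    for surface in RoutedSurface::ALL.iter().copied() {
        session.start_llm(surface);
        session.end_llm(surface);
    }
}

fn main() {
    let mut session = RecordingSession::default();
    drive_routed_session(&mut session);
    for event in &session.events {
        match event {
            Event::LlmStart(s) => println!("start {}", s.path()),
            Event::LlmEnd(s) => println!("end   {}", s.path()),
        }
    }
}
```

Keeping the driver separate from the assertions is what would let the ATIF and OpenInference regressions share one session flow while each checks a different export.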

Validated with:

  • cargo test -p nemo-relay-cli hermes_routed_provider_payloads_write_exact_atif_trajectory -- --nocapture
  • cargo test -p nemo-relay-cli hermes_routed_provider_payloads_emit_openinference_text_usage_and_cost -- --nocapture
  • cargo test -p nemo-relay-cli hermes -- --nocapture
  • cargo test -p nemo-relay openinference -- --nocapture
  • cargo fmt --all
  • uv run pre-commit run --all-files

Where should the reviewer start?

Start in crates/cli/tests/coverage/session_tests.rs with hermes_routed_provider_payloads_emit_openinference_text_usage_and_cost, then review the shared routed-provider session helper used by both the new OpenInference regression and the existing routed ATIF regression.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Relates to Hermes observability consistency work.

Summary by CodeRabbit

  • Tests

    • Added new OpenInference-focused test utilities and an async session driver to exercise Hermes-routed provider sessions.
    • Expanded coverage to verify end-to-end routed LLM workflows and assert precise OpenInference attributes (input/output text, token counts, model names, and costs).
  • Chores

    • Added developer tracing dependencies to enable observability-based test instrumentation.

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
mnajafian-nv requested review from a team as code owners June 5, 2026 22:46
github-actions Bot added the size:L (PR is large), Test (Test related), and lang:rust (PR changes/introduces Rust code) labels Jun 5, 2026

coderabbitai Bot commented Jun 5, 2026


Walkthrough

Adds workspace/dev OpenTelemetry deps and test tracing utilities: stable routed-test session marker, in-memory exporter/subscriber helpers, a reusable Hermes routed session driver (three routed LLM calls), and a test asserting emitted OpenInference span attributes.

Changes

OpenInference Testing Infrastructure

| Layer / File(s) | Summary |
| --- | --- |
| OpenTelemetry workspace & dev-dependencies (Cargo.toml, crates/cli/Cargo.toml, crates/core/Cargo.toml) | Adds workspace entries for opentelemetry and opentelemetry_sdk (v0.31), switches core to workspace = true, and adds them as dev-dependencies in crates/cli with trace and testing features. |
| Test helpers and setup (crates/cli/tests/coverage/session_tests.rs) | Defines HERMES_ROUTED_TEST_SESSION_KEY and helpers to create an in-memory OpenInference exporter/subscriber, convert OpenTelemetry KeyValue attributes to HashMap, and build routed gateway metadata including the test marker. |
| Hermes routed session driver (crates/cli/tests/coverage/session_tests.rs) | Adds drive_hermes_routed_provider_session async helper that executes a full Hermes routed session lifecycle with three routed LLM sequences (Anthropic messages, OpenAI responses, OpenAI chat-completions) and session finalization. |
| OpenInference span assertions (crates/cli/tests/coverage/session_tests.rs) | Refactors hermes_routed_provider_payloads_write_exact_atif_trajectory to use the session driver and adds hermes_routed_provider_payloads_emit_openinference_text_usage_and_cost, which registers the in-memory subscriber, runs the session, flushes spans, and asserts emitted OpenInference span attributes (exact I/O text, model names, token counts, costs). |
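
To make the "convert attributes to a HashMap, then assert" step in the summary above concrete, here is a minimal std-only sketch. It is not the code from session_tests.rs: `AttrValue`, `attributes_to_map`, `int_attr`, and `sample_llm_span` are hypothetical names, `AttrValue` stands in for the OpenTelemetry value type so the example compiles without external crates, and the sample values are made up. The attribute keys follow my understanding of the OpenInference semantic conventions; the exact keys the real test asserts are not shown in this thread.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an OpenTelemetry attribute value.
#[derive(Debug, Clone, PartialEq)]
enum AttrValue {
    Str(String),
    Int(i64),
}

/// Flatten (key, value) pairs into a map so assertions can look up attributes by name.
fn attributes_to_map(attrs: &[(&str, AttrValue)]) -> HashMap<String, AttrValue> {
    attrs
        .iter()
        .map(|(key, value)| ((*key).to_string(), value.clone()))
        .collect()
}

/// Read an integer attribute, returning None if it is missing or not an integer.
fn int_attr(attrs: &HashMap<String, AttrValue>, key: &str) -> Option<i64> {
    match attrs.get(key) {
        Some(AttrValue::Int(n)) => Some(*n),
        _ => None,
    }
}

/// Made-up attributes for one LLM span; the values are illustrative only.
fn sample_llm_span() -> Vec<(&'static str, AttrValue)> {
    vec![
        ("openinference.span.kind", AttrValue::Str("LLM".to_string())),
        ("llm.model_name", AttrValue::Str("example-model".to_string())),
        ("output.value", AttrValue::Str("hello".to_string())),
        ("llm.token_count.prompt", AttrValue::Int(12)),
        ("llm.token_count.completion", AttrValue::Int(3)),
        ("llm.token_count.total", AttrValue::Int(15)),
    ]
}

fn main() {
    let attrs = attributes_to_map(&sample_llm_span());

    // Exact-match assertions on text and model attributes.
    assert_eq!(
        attrs.get("llm.model_name"),
        Some(&AttrValue::Str("example-model".to_string()))
    );
    assert_eq!(
        attrs.get("output.value"),
        Some(&AttrValue::Str("hello".to_string()))
    );

    // Usage check: the total should equal prompt plus completion tokens.
    let prompt = int_attr(&attrs, "llm.token_count.prompt").unwrap();
    let completion = int_attr(&attrs, "llm.token_count.completion").unwrap();
    let total = int_attr(&attrs, "llm.token_count.total").unwrap();
    assert_eq!(prompt + completion, total);
}
```

Looking attributes up by key rather than by position keeps the assertions stable if the exporter emits attributes in a different order.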

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • NVIDIA/NeMo-Relay#222: Modifies the same Hermes routed-provider test coverage and routed session lifecycle in session_tests.rs.
  • NVIDIA/NeMo-Relay#219: Adds Hermes ATIF fidelity and USD cost regression tests overlapping session_tests.rs routed-provider assertions.
  • NVIDIA/NeMo-Relay#215: Related Hermes observability coverage and OpenInference span attribute assertions.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title follows Conventional Commits format with 'test' type, concise imperative summary, 57 characters, and no trailing period. |
| Description check | ✅ Passed | Description includes all required sections: Overview with confirmation checkboxes, Details with technical context, reviewer guidance, and related issues reference. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



github-actions Bot commented Jun 5, 2026

mnajafian-nv self-assigned this Jun 5, 2026
mnajafian-nv added this to the 0.4 milestone Jun 5, 2026
willkill07 (Member) left a comment

Everything looks good except for the potential package version drift. I included a comment on how to resolve.

Comment thread crates/cli/Cargo.toml
Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Cargo.toml`:
- Around line 29-30: The Cargo.toml currently pins opentelemetry and
opentelemetry_sdk to 0.31; either add a comment/PR description justifying why
0.31 is required for compatibility/security, or update both dependencies to the
aligned 0.32.x releases (e.g., opentelemetry = "0.32.0" and opentelemetry_sdk =
"0.32.1") ensuring versions match the registry, then run cargo update and cargo
build/test to verify no breakages; reference the dependency names opentelemetry
and opentelemetry_sdk when making the change and in the commit/PR message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 7288b804-5b83-4a79-b880-958e49aaaa80

📥 Commits

Reviewing files that changed from the base of the PR and between d497e44 and 77e4ba2.

📒 Files selected for processing (3)
  • Cargo.toml
  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Check / Run
  • GitHub Check: Preview docs
🧰 Additional context used
📓 Path-based instructions (12)
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
**/*.{py,txt,toml,cfg,yaml,yml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Python package names and top-level module imports during coordinated rename operations

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
**/Cargo.toml

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update WebAssembly crate names and generated package names during coordinated rename operations

Confirm or infer the target release version from upstream/main:Cargo.toml. Derive the release branch as release/<major>.<minor>.

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
{docs/**,README.md,**/Cargo.toml,**/package.json,**/*.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Ensure renamed public surfaces are reflected consistently in manifests and docs for large or public-facing changes

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
**/*.{md,mdx,py,sh,yaml,yml,toml,json}

📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)

Keep package names, repo references, and build commands current

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
**/*.toml

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Include SPDX license header in TOML configuration files using hash comment syntax

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

  1. Scope stacks decide where work belongs and which scope-local behavior is visible.
  2. Middleware registries decide what guardrails and intercepts run around managed calls.
  3. Plugins install reusable runtime behavior from configuration.
  4. Events record runtime behavior in ATOF form.
  5. Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.

crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

  • crates/cli/Cargo.toml
  • crates/core/Cargo.toml
  • Cargo.toml
{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/Cargo.toml
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

  • crates/core/Cargo.toml
Cargo.toml

📄 CodeRabbit inference engine (.agents/skills/update-project-version/SKILL.md)

Cargo.toml: Maintain Cargo.toml [workspace.package].version as the source of truth for Rust workspace and Python build versioning
Keep Cargo.toml [workspace.dependencies] self-references aligned when the workspace version changes

Files:

  • Cargo.toml
🔇 Additional comments (2)
crates/cli/Cargo.toml (1)

53-54: LGTM!

crates/core/Cargo.toml (1)

89-90: LGTM!

Also applies to: 103-103

Comment thread Cargo.toml
mnajafian-nv (Contributor, Author)

/merge

@rapids-bot rapids-bot Bot merged commit 324f745 into NVIDIA:main Jun 6, 2026
166 of 170 checks passed
@mnajafian-nv mnajafian-nv deleted the test/hermes-routed-openinference-consistency branch June 6, 2026 00:33
Labels

lang:rust PR changes/introduces Rust code size:L PR is large Test Test related

3 participants