feat: add LLM cost layering and pricing lookup by AjayThorve · Pull Request #236 · NVIDIA/NeMo-Relay

AjayThorve · 2026-06-05T22:48:11Z

Overview

Adds a durable LLM cost layer for annotated provider responses. Relay now preserves upstream-reported cost, can estimate cost from configured pricing sources when token usage and model pricing are available, and leaves cost unknown instead of fabricating $0 when pricing or required token data is missing.

This PR does not make Relay the owner of a canonical bundled pricing catalog. Relay owns the cost schema, pricing-source/resolver behavior, precedence rules, CLI/plugin validation, and observability propagation; pricing data is supplied by configured sources today, with DB/service-backed sources possible later by adapting them into validated catalog snapshots.

I confirm this contribution is my own work, or I have the right to submit it under this project's license.
I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

Extends normalized response usage with CostEstimate, CostSource, and Usage.cost while keeping provider response annotations limited to response fields such as provider response id, model, message, tool usage, and token usage.
Uses the managed LLM call name as provider/route identity for provider-aware pricing, so routes such as azure/openai can resolve differently from generic openai without adding a model_provider field to AnnotatedLlmResponse.
Adds a source-backed pricing resolver with file, inline, and custom PricingSource support. The resolver handles provider-qualified/routed model names and uses first-match source precedence.
Keeps provider/framework-reported cost authoritative. Model pricing estimates are only computed when Usage.cost is absent and a configured pricing source has enough token-pricing data.
Represents normalized cost amounts as currency-neutral fields (total, input, output, cache_read, cache_write) plus currency; generic pricing APIs stay currency-neutral, top-level provider/framework usage.cost_usd input remains supported, and exporters emit either their schema-required cost names or Relay cost total plus currency.
Treats unknown models, stale/auditable pricing, non-token pricing units, and missing token fields explicitly. Unknown cost remains absent instead of being rounded or defaulted to zero.
Propagates normalized or estimated cost into ATIF step/final metrics, OpenInference attributes, and OpenTelemetry attributes without changing existing token behavior.
Adds nemo-relay pricing validate, init, add-source, and resolve so users can validate catalog files, wire project/user config, and inspect which pricing source matched a model.
Extends nemo-relay doctor to validate configured pricing sources, so missing/unreadable files or invalid pricing catalogs surface before gateway launch.
Updates Rust, Python, Node, WASM, Go, and FFI coverage plus end-user docs for provider response codecs, CLI pricing setup, plugin configuration layering, direct application instrumentation, and embedded plugin initialization across harnesses/custom agents/framework integrations.
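
The precedence described in the list above (provider-reported cost first, then an estimate from the first matching pricing source, otherwise unknown) can be sketched as follows. This is a minimal illustration, not the code in this PR: the struct fields, the per-million-token rate unit, the `Source` alias, and the `lookup` / `attach_cost` helpers are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Where a cost figure came from (hypothetical mirror of the PR's `CostSource`).
#[derive(Debug, Clone, PartialEq)]
pub enum CostSource {
    ProviderReported,
    Estimated,
}

/// Simplified cost record: only a total and a currency.
#[derive(Debug, Clone, PartialEq)]
pub struct CostEstimate {
    pub total: f64,
    pub currency: String,
    pub source: CostSource,
}

/// Simplified usage record; the real `Usage` carries more fields.
#[derive(Debug, Clone, Default)]
pub struct Usage {
    pub input_tokens: Option<u64>,
    pub output_tokens: Option<u64>,
    pub cost: Option<CostEstimate>,
}

/// Rates per one million tokens (an assumed unit for this sketch).
#[derive(Debug, Clone)]
pub struct ModelPricing {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
    pub currency: String,
}

/// One pricing source, modeled here as a map from model name to rates.
pub type Source = HashMap<String, ModelPricing>;

/// First-match precedence: the earliest source that knows the model wins.
pub fn lookup<'a>(sources: &'a [Source], model: &str) -> Option<&'a ModelPricing> {
    sources.iter().find_map(|s| s.get(model))
}

/// Fill `usage.cost` only when it is absent and all required inputs are known.
pub fn attach_cost(usage: &mut Usage, sources: &[Source], model: &str) {
    if usage.cost.is_some() {
        return; // provider-reported cost stays authoritative
    }
    let (Some(input), Some(output)) = (usage.input_tokens, usage.output_tokens) else {
        return; // missing token data: leave cost unknown
    };
    let Some(pricing) = lookup(sources, model) else {
        return; // unknown model: leave cost unknown instead of reporting 0
    };
    let total = (input as f64 * pricing.input_per_mtok
        + output as f64 * pricing.output_per_mtok)
        / 1_000_000.0;
    usage.cost = Some(CostEstimate {
        total,
        currency: pricing.currency.clone(),
        source: CostSource::Estimated,
    });
}
```

Keeping the "unknown" case as `None` rather than `0.0` lets downstream exporters distinguish a free call from a call whose price could not be determined.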

Where should the reviewer start?

Start with crates/core/src/codec/response.rs for the normalized response cost shape and crates/core/src/codec/pricing.rs for provider-aware lookup. Then review crates/core/src/api/llm.rs / crates/core/src/stream.rs for how LLM call names feed pricing, crates/cli/src/pricing.rs and crates/cli/src/doctor.rs for CLI setup/validation behavior, and docs/integrate-into-frameworks/provider-response-codecs.mdx / docs/instrument-applications/instrument-llm-call.mdx for the public behavior contract across CLI and non-CLI consumers.

Related Issues:

Resolves RELAY-184
Resolves RELAY-227
Resolves RELAY-228
Resolves RELAY-229
Resolves RELAY-230

Validation

cargo test -p nemo-relay provider_reported_cost -- --nocapture
cargo test -p nemo-relay test_attach_estimated_cost_uses_event_provider -- --nocapture
cargo test -p nemo-relay pricing -- --nocapture
cargo test -p nemo-relay atif -- --nocapture
cargo test -p nemo-relay openinference -- --nocapture
cargo test -p nemo-relay otel -- --nocapture
cargo test -p nemo-relay --test pipeline_integration -- --nocapture
cargo test -p nemo-relay-cli cli_pricing -- --nocapture
cargo test -p nemo-relay-cli collect_observability_ -- --nocapture
cargo test -p nemo-relay-cli doctor::tests -- --nocapture
cargo run -p nemo-relay-cli -- doctor codex --json | jq '.observability'
cargo run -p nemo-relay-cli -- pricing --help
cargo fmt --check && git diff --check
cargo clippy --workspace --all-targets -- -D warnings
RUST_TEST_THREADS=1 just test-rust
just test-go
PATH=/Users/athorve/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH just test-node
PATH=/Users/athorve/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH just test-wasm
env -u CONDA_PREFIX just test-python
PATH=/Users/athorve/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH just docs

Notes:

just docs passed with the existing Fern redirect warning because this local environment is not authenticated with Fern.
just test-python required using the repo .venv without the outer CONDA_PREFIX.

Summary by CodeRabbit

New Features
- Built-in pricing plugin and resolver to annotate LLM responses with cost estimates; CLI pricing commands (validate, init, add-source, resolve) with scope flags and preserved source-order merging.
- LLM end events and exporters (OTEL/ATIF/OpenInference) now emit normalized cost metadata when available.
Tests
- Extensive new and updated tests for CLI flows, codecs, resolver behavior, observability exports, and merge ordering.
Documentation
- New and expanded guides for pricing setup, CLI usage, catalog schema, and cost-estimation behavior.

coderabbitai · 2026-06-05T22:48:20Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

This PR implements a versioned pricing catalog and resolver, populates Usage.cost from provider-reported or estimated model-token pricing across codecs/streams/LLM API, exposes CLI/plugin commands to manage pricing sources, and surfaces cost to ATIF/OpenInference/OTEL exporters with extensive tests/docs updates.

Changes

Pricing System & Cost Estimation

| Layer / File(s) | Summary |
| --- | --- |
| Pricing core types & resolver `crates/core/src/codec/pricing.rs` | Adds `PricingCatalog`, `ModelPricing`, `PricingConfig`/sources, `PricingResolver`, token-rate math, prompt-cache rules, public estimate APIs, and active-resolver management. |
| Response cost contract `crates/core/src/codec/response.rs` | Adds `Usage.cost: Option<CostEstimate>`, `CostEstimate`/`CostSource`, provider-reported conversion helpers, and reexports pricing APIs. |
| Codec provider-cost mapping & enrichment `crates/core/src/codec/{anthropic.rs,openai_chat.rs,openai_responses.rs}`, `crates/core/src/api/llm.rs`, `crates/core/src/stream.rs` | Codecs deserialize provider cost inputs, prefer provider-reported `CostEstimate`, and fall back to model-based estimation; non-streaming/streaming paths call `attach_estimated_cost_for_provider`. |
| Plugin registration & activation `crates/core/src/plugins/pricing.rs`, `crates/core/src/plugin.rs`, `crates/core/src/plugins/mod.rs` | Registers built-in `pricing` plugin, validates `PricingConfig`, constructs `PricingResolver`, sets/reset active resolver during register/teardown. |
| CLI surface & config I/O `crates/cli/src/config.rs`, `crates/cli/src/main.rs`, `crates/cli/src/plugins.rs`, `crates/cli/src/plugins/config_io.rs` | Adds `pricing` subcommand, subcommands (validate/init/add-source/resolve), scope flags, and exposes `config_io` helpers (`pub(crate)`) for CLI edits and TOML merge special-case for `pricing` sources. |
| CLI pricing workflows & doctor checks `crates/cli/src/pricing.rs`, `crates/cli/src/doctor.rs` | Implements validate/init/add-source/resolve flows, file/inline source handling, catalog parsing, and `nemo-relay doctor` pricing source validations with per-source Pass/Fail/Info checks. |
| Observability cost attribution `crates/core/src/observability/{atif.rs,openinference.rs,otel.rs}` | Extracts token/cache metrics provider/model-aware, prefers explicit USD or `usage.cost`, otherwise estimates via resolver; OTEL/OpenInference emit `nemo_relay.llm.cost.*` attributes. |
| Codec module export `crates/core/src/codec/mod.rs` | Exports new `pricing` submodule. |
| Tests: unit, integration, polyglot `crates//tests/`, `crates/core/tests/*`, `go/`, `node/`, `python/`, `wasm/`, `ffi/` | Adds/updates tests for pricing resolver, codec mapping, CLI flows, doctor checks, ATIF/OTEL/OpenInference behaviors, and updates Usage literals to include `cost: None` or provider-reported `CostEstimate`. |
| Docs `docs/*` | Documents pricing catalog schema and CLI flows, plugin merge behavior, response-codec guidance for reporting `usage.cost`, and instrumentation checklist. |

Sequence Diagram

sequenceDiagram
  participant Client
  participant LLMHandler
  participant ResponseCodec
  participant PricingResolver
  participant PluginConfig
  participant Exporter

  Client->>LLMHandler: Execute LLM call (with response_codec)
  LLMHandler->>ResponseCodec: Decode provider response (model, usage, provider cost)
  ResponseCodec->>PricingResolver: pricing_for(provider?, model) / estimate_cost(model, usage)
  PricingResolver-->>ResponseCodec: CostEstimate / None
  ResponseCodec-->>LLMHandler: Annotated response (usage.cost set if absent)
  PluginConfig->>PricingResolver: set_active_pricing_resolver / reset
  LLMHandler->>Exporter: Emit end event with annotated response (usage.cost)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

NVIDIA/NeMo-Relay#206: Overlaps with ATIF/OpenInference cost-attribution changes.
NVIDIA/NeMo-Relay#219: Related Hermes ATIF session cost/usage test adjustments.

github-actions · 2026-06-05T22:52:01Z

Fern docs preview: https://nvidia-preview-pull-request-236.docs.buildwithfern.com/nemo/relay

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

coderabbitai

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/cli/src/pricing.rs`:
- Around line 83-91: The current error hides the case where no pricing sources
are configured by treating it as a model-match failure; change the logic after
calling pricing_catalog_sources_from_current_config() to first check whether the
returned sources collection is empty and, if so, return a distinct
CliError::Config (e.g., "no pricing sources configured") instead of calling
resolve_pricing; otherwise call resolve_pricing(&sources,
command.provider.as_deref(), &command.model) and keep the existing
CliError::Config message for the “no model match” case so callers can
distinguish missing sources from an unmatched model.
- Around line 52-60: Persist the pricing file source as an absolute/canonical
path instead of storing command.path verbatim: before constructing
PricingSourceConfig::File (where the code currently sets path:
command.path.clone()), canonicalize the path (e.g.
std::fs::canonicalize(&command.path) and convert to a String) and use that
canonical string for the File variant; propagate or return any IO errors from
canonicalize so callers (e.g. the flow around read_pricing_catalog,
target_pricing_scope, ensure_pricing_component, pricing_config_from_component)
handle failures consistently.
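
The first finding above asks for two distinguishable failures: no sources configured versus no source matching the model. A minimal sketch of that shape follows; the `ResolveError` enum, the `Source` struct, and the `resolve` function are hypothetical stand-ins and do not reproduce the CLI's `CliError` or `resolve_pricing` signatures.

```rust
/// Hypothetical error type that separates the two failure modes.
#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// The configuration contains no pricing sources at all.
    NoSourcesConfigured,
    /// Sources exist, but none of them prices the requested model.
    NoModelMatch { model: String },
}

/// Hypothetical stand-in for one configured pricing source.
#[derive(Debug)]
pub struct Source {
    pub name: String,
    pub models: Vec<String>,
}

/// Return the first source that prices `model`, or a specific error.
pub fn resolve<'a>(sources: &'a [Source], model: &str) -> Result<&'a Source, ResolveError> {
    // Check for the empty configuration first so it is never reported
    // as a model-match failure.
    if sources.is_empty() {
        return Err(ResolveError::NoSourcesConfigured);
    }
    sources
        .iter()
        .find(|s| s.models.iter().any(|m| m == model))
        .ok_or_else(|| ResolveError::NoModelMatch {
            model: model.to_string(),
        })
}
```

With separate variants, a caller can tell the user to add a pricing source in one case and to check the model name in the other.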

In `@crates/core/src/codec/pricing.rs`:
- Around line 196-223: Inline and source-returned catalogs are currently
accepted without schema/alias/rate validation; update
PricingResolver::from_config and PricingResolver::from_sources to validate those
catalogs the same way file-based catalogs are validated by round-tripping them
through the existing parser/validator: serialize the inline/source
PricingCatalog to JSON (e.g., serde_json::to_string) and then call
PricingCatalog::from_json_str(&raw) (or otherwise invoke the same validation
function used for file input) and push the resulting validated PricingCatalog
into catalogs, propagating any errors as PricingCatalogError.
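
The finding above suggests a serialize-and-reparse round trip so that inline and source-returned catalogs get the same checks as files. The sketch below swaps in a different technique with the same goal: a single validating constructor that every origin must pass through. The types, the error variants, and the specific rules (finite non-negative rates, no duplicate model names) are assumptions for illustration, not the PR's actual schema checks.

```rust
use std::collections::HashSet;

/// Hypothetical validation errors for this sketch.
#[derive(Debug, PartialEq)]
pub enum PricingCatalogError {
    InvalidRate { model: String },
    DuplicateModel { model: String },
}

/// Hypothetical per-model rate entry.
#[derive(Debug, Clone)]
pub struct ModelRate {
    pub model: String,
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

/// Catalog whose fields are private, so it can only be built via `validated`.
#[derive(Debug)]
pub struct PricingCatalog {
    models: Vec<ModelRate>,
}

impl PricingCatalog {
    /// The only constructor: file, inline, and custom-source catalogs all
    /// go through the same checks.
    pub fn validated(models: Vec<ModelRate>) -> Result<Self, PricingCatalogError> {
        {
            let mut seen: HashSet<&str> = HashSet::new();
            for entry in &models {
                let rates_ok = [entry.input_per_mtok, entry.output_per_mtok]
                    .iter()
                    .all(|rate| rate.is_finite() && *rate >= 0.0);
                if !rates_ok {
                    return Err(PricingCatalogError::InvalidRate {
                        model: entry.model.clone(),
                    });
                }
                if !seen.insert(entry.model.as_str()) {
                    return Err(PricingCatalogError::DuplicateModel {
                        model: entry.model.clone(),
                    });
                }
            }
        }
        Ok(Self { models })
    }

    /// Number of models in the validated catalog.
    pub fn model_count(&self) -> usize {
        self.models.len()
    }
}
```

Either approach works as long as no code path can produce a catalog that skipped validation; the round trip reuses the existing parser, while a private-field constructor enforces the rule through the type.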

In `@crates/core/src/codec/response.rs`:
- Around line 209-228: The code can label a USD provider total
(provider_total_cost / cost_usd) with a non-USD currency because the current
currency selection only checks has_currency_native_amount and cost.currency;
update the currency selection so that if provider_total_cost.is_some() (i.e.
cost_usd is present and used as total) the currency is set to the USD provider
currency (or the canonical USD/default for provider totals) instead of reading
cost.currency; otherwise keep the existing has_currency_native_amount ->
cost.currency.unwrap_or_else(default_cost_currency) / default_cost_currency()
behavior. Ensure this change touches the CostEstimate construction (variables:
provider_total_cost, total, has_currency_native_amount, cost.currency,
default_cost_currency) so total and currency source remain consistent.
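
The currency rule requested above can be sketched in isolation: when the USD total is what ends up in `total`, the label must be USD regardless of any `currency` field on the payload. The `ProviderCost` struct and `total_and_currency` function are hypothetical, and the default currency is assumed to be USD here (the real code calls `default_cost_currency()`).

```rust
/// Hypothetical, simplified provider cost payload.
#[derive(Debug, Clone)]
pub struct ProviderCost {
    /// Top-level total already expressed in USD (the `cost_usd` input).
    pub cost_usd: Option<f64>,
    /// Total expressed in the provider's own currency.
    pub native_total: Option<f64>,
    /// Currency label that applies to `native_total` only.
    pub currency: Option<String>,
}

/// Pick a total and the currency that actually describes it.
pub fn total_and_currency(cost: &ProviderCost) -> Option<(f64, String)> {
    if let Some(usd) = cost.cost_usd {
        // A USD total is always labeled USD, even if `currency` says otherwise.
        return Some((usd, "USD".to_string()));
    }
    let total = cost.native_total?;
    // Assumed default for this sketch when the provider omits a currency.
    let currency = cost.currency.clone().unwrap_or_else(|| "USD".to_string());
    Some((total, currency))
}
```

Deriving the total and its currency in the same branch keeps the two values from drifting apart.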

In `@crates/core/src/observability/atif.rs`:
- Around line 689-692: The code sets explicit_cost from usage.cost.total without
checking currency; update the logic around explicit_cost to only accept
cost.total when cost.currency is present and equals "USD" (use
cost.get("currency").and_then(Json::as_str) and compare to "USD") — if currency
is not "USD" (or explicitly present and different) return None (or skip setting
explicit_cost) instead of treating raw total as USD; keep the existing fallback
of usage.get("cost_usd").and_then(Json::as_f64) as the primary source and only
use cost.get("total").and_then(Json::as_f64) when currency == "USD".
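
A sketch of the guard described above, using plain structs instead of `serde_json` values so it stays self-contained. `RawUsage`, `UsageCost`, and `explicit_usd_cost` are hypothetical names; the real code reads these fields from JSON.

```rust
/// Hypothetical view of the nested `usage.cost` object.
#[derive(Debug, Clone)]
pub struct UsageCost {
    pub total: Option<f64>,
    pub currency: Option<String>,
}

/// Hypothetical view of the raw usage payload.
#[derive(Debug, Clone)]
pub struct RawUsage {
    pub cost_usd: Option<f64>,
    pub cost: Option<UsageCost>,
}

/// Return an explicit cost only when it is known to be in USD.
pub fn explicit_usd_cost(usage: &RawUsage) -> Option<f64> {
    // `cost_usd` stays the primary source.
    usage.cost_usd.or_else(|| {
        let cost = usage.cost.as_ref()?;
        match cost.currency.as_deref() {
            // Accept `cost.total` only with an explicit USD label.
            Some("USD") => cost.total,
            // A missing or different currency is never treated as USD.
            _ => None,
        }
    })
}
```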

In `@crates/core/src/observability/openinference.rs`:
- Around line 1115-1119: The fallback cost-estimation branch currently only
calls event.model_name(), which misses cases where the provider response
includes a top-level model in output; update the or_else closure to try
extracting the model from the raw response (e.g. call the same helper used in
atif.rs, response_model_name(output) or equivalent) before giving up: use
fallback_usage and response_model_name(output) (or fall back to
event.model_name()) and then call estimate_cost_for_provider(Some(event.name()),
model_name, usage). Ensure you still call cost.total_for_currency("USD") on the
resulting estimate.

In `@crates/core/src/observability/otel.rs`:
- Around line 682-695: cost_from_llm_event currently only uses
annotated_response() and misses the raw-output fallback used in
openinference.rs, causing inconsistent exporter behavior; update
cost_from_llm_event (or extract a shared helper used by cost_from_llm_event and
the openinference exporter) to first attempt response.usage/response.model from
annotated_response(), then fall back to reading provider-native output
usage/model (the same manual-usage/model fallback implemented in
crates/core/src/observability/openinference.rs) before calling
estimate_cost_for_provider, and keep using cost_total_and_currency to convert
CostEstimate to (f64,String); ensure the helper is referenced by name from both
exporters so they resolve cost identically.
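
The two exporter findings above both come down to using one shared resolution order for the model name. A minimal sketch of such a helper follows. The `LlmEvent` struct is a hypothetical flattening of what an exporter can see, and the order shown (annotated response, then raw provider output, then the event's own model name) is an assumption pieced together from the two comments, not confirmed behavior.

```rust
/// Hypothetical, flattened view of an LLM end event.
#[derive(Debug, Clone, Default)]
pub struct LlmEvent {
    /// Model from the annotated (normalized) response, if a codec produced one.
    pub annotated_model: Option<String>,
    /// Top-level `model` found in the raw provider output.
    pub raw_output_model: Option<String>,
    /// Model name recorded on the event itself.
    pub event_model: Option<String>,
}

/// Single resolution order that every exporter would call.
pub fn resolve_model_name(event: &LlmEvent) -> Option<&str> {
    event
        .annotated_model
        .as_deref()
        .or(event.raw_output_model.as_deref())
        .or(event.event_model.as_deref())
}
```

If both exporters call the same helper, a response that is priced in OpenInference cannot silently go unpriced in OTEL.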

In `@crates/core/tests/unit/atif_tests.rs`:
- Around line 755-814: The test sets global pricing via
set_active_pricing_resolver but only calls reset_active_pricing_resolver on the
happy path; make cleanup panic-safe by creating a local RAII guard that calls
reset_active_pricing_resolver() in its Drop and instantiating it immediately
after set_active_pricing_resolver(...) in
test_exporter_derives_llm_cost_from_model_pricing (or wrap the setup/reset with
std::panic::catch_unwind). Implement a small guard type (e.g.,
ResetPricingResolverGuard) that calls reset_active_pricing_resolver() in Drop
and use it right after calling set_active_pricing_resolver to guarantee cleanup
even if assertions fail; reference pricing_test_mutex(),
set_active_pricing_resolver, and reset_active_pricing_resolver in the change.
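
A minimal sketch of the Drop-based guard suggested above. The global resolver is replaced by an `AtomicBool` stand-in, so the `set_active_pricing_resolver` and `reset_active_pricing_resolver` functions here only share names with the real ones; `resolver_installed` and `install_with_guard` are hypothetical helpers added for the example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Stand-in for the process-global pricing resolver.
static RESOLVER_INSTALLED: AtomicBool = AtomicBool::new(false);

/// Stand-in: the real function takes a resolver argument.
pub fn set_active_pricing_resolver() {
    RESOLVER_INSTALLED.store(true, Ordering::SeqCst);
}

/// Stand-in for clearing the global resolver.
pub fn reset_active_pricing_resolver() {
    RESOLVER_INSTALLED.store(false, Ordering::SeqCst);
}

/// Hypothetical helper so the example can observe the global state.
pub fn resolver_installed() -> bool {
    RESOLVER_INSTALLED.load(Ordering::SeqCst)
}

/// Resets the global resolver when it goes out of scope, including
/// during unwinding after a failed assertion.
#[must_use = "the resolver is reset as soon as the guard is dropped"]
pub struct ResetPricingResolverGuard;

impl Drop for ResetPricingResolverGuard {
    fn drop(&mut self) {
        reset_active_pricing_resolver();
    }
}

/// Install the resolver and hand back the guard that will undo it.
pub fn install_with_guard() -> ResetPricingResolverGuard {
    set_active_pricing_resolver();
    ResetPricingResolverGuard
}
```

In a test, binding the result as `let _guard = install_with_guard();` keeps the guard alive until the end of the function; binding it to a bare `_` would drop it immediately.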

In `@crates/core/tests/unit/codec/response_tests.rs`:
- Around line 479-502: The test mutates global pricing state via
pricing_test_mutex(), set_active_pricing_resolver(...) and relies on
reset_active_pricing_resolver() at the end; ensure
reset_active_pricing_resolver() always runs even on panics by introducing a
Drop-based guard or a scopeguard/defer in this test (and the other block at
709-758) that calls reset_active_pricing_resolver() in its Drop implementation,
or by wrapping the test body in std::panic::catch_unwind and calling
reset_active_pricing_resolver() in a finally-like path; locate usages around
attach_estimated_cost_for_provider(...) / set_active_pricing_resolver(...) and
replace the current end-of-test reset with a guaranteed cleanup guard.

In `@crates/core/tests/unit/observability/openinference_tests.rs`:
- Around line 2178-2222: The test manipulates the process-global pricing
resolver without holding the pricing_test_mutex for teardown and the
reset_active_pricing_resolver() call is not panic-safe; wrap the
install_test_pricing()/test body and the reset_active_pricing_resolver() call
under the pricing_test_mutex() lock and make teardown RAII-safe (e.g., create a
small guard that calls reset_active_pricing_resolver() in Drop or use
scopeguard::defer) so reset_active_pricing_resolver() always runs while holding
the same mutex even if the test panics; update this test (and the similar block
at the other range) to acquire pricing_test_mutex().lock().unwrap() before
install_test_pricing("priced-model") and ensure the reset is performed inside
the guarded scope or via the Drop guard.
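
The finding above adds a second requirement to the guard: the reset has to happen while the test mutex is still held. One way to get that ordering, sketched here with stand-in globals, is to store the `MutexGuard` as a field of the guard struct. Rust runs a value's `Drop::drop` before dropping its fields, so the reset executes before the lock is released. `PricingTestScope` and `resolver_installed` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Mutex, MutexGuard};

// Stand-ins for `pricing_test_mutex()` and the global resolver.
static PRICING_TEST_MUTEX: Mutex<()> = Mutex::new(());
static RESOLVER_INSTALLED: AtomicBool = AtomicBool::new(false);

/// Hypothetical helper so the example can observe the global state.
pub fn resolver_installed() -> bool {
    RESOLVER_INSTALLED.load(Ordering::SeqCst)
}

/// Holds the test lock for its whole lifetime and resets the
/// resolver on drop.
pub struct PricingTestScope {
    _lock: MutexGuard<'static, ()>,
}

impl PricingTestScope {
    pub fn install() -> Self {
        // A test that panicked while holding the lock poisons the mutex;
        // recover the guard so later tests are not failed by that alone.
        let lock = PRICING_TEST_MUTEX
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        RESOLVER_INSTALLED.store(true, Ordering::SeqCst);
        PricingTestScope { _lock: lock }
    }
}

impl Drop for PricingTestScope {
    fn drop(&mut self) {
        // Runs before `_lock` is released, because fields are dropped
        // only after this method returns.
        RESOLVER_INSTALLED.store(false, Ordering::SeqCst);
    }
}
```

Recovering from poisoning is a design choice: it keeps one failing test from cascading into unrelated failures, which is acceptable here because the guard restores the global state on every exit path.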

In `@crates/core/tests/unit/observability/otel_tests.rs`:
- Around line 837-875: The pricing resolver reset is only called on the success
path so a panic in the assertions leaves the global resolver installed; wrap the
test's install/reset semantics in a panic-safe guard (e.g., create a small RAII
guard or use std::panic::catch_unwind) so reset_active_pricing_resolver() always
runs when the scope exits; specifically, ensure the code that calls
install_test_pricing("priced-model") returns a guard whose Drop calls
reset_active_pricing_resolver(), or wrap the assertion block with catch_unwind
and call reset_active_pricing_resolver() in a finally-like path, referencing the
existing install_test_pricing and reset_active_pricing_resolver symbols.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 26305c7d-d14f-4099-aee2-7c2d61a3a64b

📥 Commits

Reviewing files that changed from the base of the PR and between fc9eb81 and dc36906.

📒 Files selected for processing (43)

crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
crates/cli/src/config.rs
crates/cli/src/doctor.rs
crates/cli/src/main.rs
crates/cli/src/plugins.rs
crates/cli/src/plugins/config_io.rs
crates/cli/src/pricing.rs
crates/cli/tests/cli_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/src/api/llm.rs
crates/core/src/codec/anthropic.rs
crates/core/src/codec/mod.rs
crates/core/src/codec/openai_chat.rs
crates/core/src/codec/openai_responses.rs
crates/core/src/codec/pricing.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/src/plugins/mod.rs
crates/core/src/plugins/pricing.rs
crates/core/src/stream.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/ffi/tests/unit/types_tests.rs
crates/node/tests/typed_tests.mjs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/wasm/tests-js/typed_tests.mjs
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx
docs/nemo-relay-cli/about.mdx
docs/nemo-relay-cli/basic-usage.mdx
go/nemo_relay/llm_test.go

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Check / Run
GitHub Check: Preview docs

🧰 Additional context used

📓 Path-based instructions (40)

**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

These commands also apply when Rust files change as part of FFI, Go, Node, or WebAssembly work.

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

crates/core/src/codec/mod.rs
crates/core/src/api/llm.rs
crates/core/src/plugins/mod.rs
crates/cli/tests/cli_tests.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/cli/src/main.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
crates/ffi/tests/unit/types_tests.rs
crates/core/src/codec/openai_chat.rs
crates/cli/tests/coverage/session_tests.rs
crates/cli/src/plugins/config_io.rs
crates/core/src/codec/anthropic.rs
crates/cli/src/plugins.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/cli/src/doctor.rs
crates/cli/src/config.rs
crates/cli/src/pricing.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

(same 34 files as listed under `**/*.rs` above)

**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

(same 34 files as listed under `**/*.rs` above)

{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

crates/core/src/codec/mod.rs
crates/core/src/api/llm.rs
crates/core/src/plugins/mod.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
crates/core/src/codec/openai_chat.rs
crates/core/src/codec/anthropic.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

(same 34 files as listed under `**/*.rs` above)

crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

crates/core/src/codec/mod.rs
crates/core/src/api/llm.rs
crates/core/src/plugins/mod.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/core/src/codec/openai_chat.rs
crates/core/src/codec/anthropic.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

(same 22 files as listed under `{crates/core,crates/adaptive}/**/*` above)

**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

crates/core/src/codec/mod.rs
crates/core/src/api/llm.rs
crates/core/src/plugins/mod.rs
crates/cli/tests/cli_tests.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/cli/src/main.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
go/nemo_relay/llm_test.go
crates/ffi/tests/unit/types_tests.rs
crates/core/src/codec/openai_chat.rs
crates/cli/tests/coverage/session_tests.rs
crates/cli/src/plugins/config_io.rs
crates/core/src/codec/anthropic.rs
crates/cli/src/plugins.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/cli/src/doctor.rs
crates/cli/src/config.rs
crates/cli/src/pricing.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

crates/core/src/codec/mod.rs
crates/core/src/api/llm.rs
crates/core/src/plugins/mod.rs
crates/cli/tests/cli_tests.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/cli/src/main.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
go/nemo_relay/llm_test.go
crates/ffi/tests/unit/types_tests.rs
crates/core/src/codec/openai_chat.rs
crates/cli/tests/coverage/session_tests.rs
crates/cli/src/plugins/config_io.rs
crates/core/src/codec/anthropic.rs
crates/cli/src/plugins.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/cli/src/doctor.rs
crates/cli/src/config.rs
crates/cli/src/pricing.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

crates/core/src/codec/mod.rs
crates/core/src/api/llm.rs
crates/core/src/plugins/mod.rs
crates/cli/tests/cli_tests.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/cli/src/main.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
crates/ffi/tests/unit/types_tests.rs
crates/core/src/codec/openai_chat.rs
crates/cli/tests/coverage/session_tests.rs
crates/cli/src/plugins/config_io.rs
crates/core/src/codec/anthropic.rs
crates/cli/src/plugins.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/cli/src/doctor.rs
crates/cli/src/config.rs
crates/cli/src/pricing.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

crates/core/src/codec/mod.rs
crates/core/src/api/llm.rs
crates/core/src/plugins/mod.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
crates/core/src/codec/openai_chat.rs
crates/core/src/codec/anthropic.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

Scope stacks decide where work belongs and which scope-local behavior is visible.

Middleware registries decide what guardrails and intercepts run around managed calls.

Plugins install reusable runtime behavior from configuration.

Events record runtime behavior in ATOF form.

Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.
crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

crates/core/src/codec/mod.rs
docs/nemo-relay-cli/about.mdx
crates/core/src/api/llm.rs
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
crates/core/src/plugins/mod.rs
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
crates/cli/tests/cli_tests.rs
crates/core/src/stream.rs
crates/core/src/codec/openai_responses.rs
crates/wasm/tests-js/typed_tests.mjs
crates/cli/src/main.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/node/tests/typed_tests.mjs
crates/adaptive/tests/unit/drain_tests.rs
go/nemo_relay/llm_test.go
crates/ffi/tests/unit/types_tests.rs
crates/core/src/codec/openai_chat.rs
crates/cli/tests/coverage/session_tests.rs
crates/cli/src/plugins/config_io.rs
crates/core/src/codec/anthropic.rs
crates/cli/src/plugins.rs
crates/core/src/observability/otel.rs
crates/core/src/plugin.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/cli/src/doctor.rs
crates/cli/src/config.rs
docs/integrate-into-frameworks/provider-response-codecs.mdx
crates/cli/src/pricing.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/plugins/pricing.rs
crates/core/src/codec/response.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/codec/pricing.rs
crates/core/tests/unit/codec/response_tests.rs

{docs/**,README.md,CONTRIBUTING.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

{docs/**,README.md,CONTRIBUTING.md}: For docs-only changes, run targeted checks only if commands, package names, or examples changed. Use just docs for docs-site builds and just docs-linkcheck when links changed
Run docs site build with just docs

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

{docs/**,README.md,CONTRIBUTING.md,**/*.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Run docs link validation with just docs-linkcheck when links change

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

{docs/**,README.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Verify README and docs entry points still match current package names and paths for large or public-facing changes

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

{docs/**,examples/**,README.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Verify examples still run with documented commands for large or public-facing changes

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

{docs/**,README.md,**/Cargo.toml,**/package.json,**/*.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Ensure renamed public surfaces are reflected consistently in manifests and docs for large or public-facing changes

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

**/*.{md,mdx,py,sh,yaml,yml,toml,json}

📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)

Keep package names, repo references, and build commands current

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

**/*.mdx

📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)

In MDX files, top-of-file comments must use JSX comment delimiters: {/* to open and */} to close. Do not use HTML comments for MDX SPDX headers.

MDX top-of-file SPDX comments must use {/* ... */} delimiters instead of HTML comment delimiters (Must-Fix)

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

**/*.{html,md,mdx}

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Include SPDX license header in HTML and Markdown files using HTML comment syntax

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Update embedded documentation snippets, patch docs, and binding-support notes if examples or supported bindings changed

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

docs/**

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Run just docs or ./scripts/build-docs.sh html to regenerate ignored Fern API reference pages before validation for documentation site changes

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}

⚙️ CodeRabbit configuration file

{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}: Review documentation for technical accuracy against the current API, command correctness, and consistency across language bindings.
Flag stale examples, missing SPDX headers where required, and instructions that no longer match CI or pre-commit behavior.

Files:

docs/nemo-relay-cli/about.mdx
docs/build-plugins/about.mdx
docs/build-plugins/plugin-configuration-files.mdx
docs/nemo-relay-cli/basic-usage.mdx
docs/instrument-applications/instrument-llm-call.mdx
docs/integrate-into-frameworks/provider-response-codecs.mdx

crates/core/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Implement behavior first in Rust core API modules: crates/core/src/api/ and related core modules such as crates/core/src/api/runtime/, crates/core/src/codec/, or crates/core/src/json.rs

Files:

crates/core/src/api/llm.rs

crates/core/src/api/{tool,llm}.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

Wire the new middleware chain into the execute path in crates/core/src/api/tool.rs or crates/core/src/api/llm.rs at the appropriate pipeline stage

Files:

crates/core/src/api/llm.rs

{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

crates/cli/tests/cli_tests.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/adaptive/tests/unit/drain_tests.rs
go/nemo_relay/llm_test.go
crates/ffi/tests/unit/types_tests.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/codec/response_tests.rs

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

crates/cli/tests/cli_tests.rs
crates/core/tests/unit/codec/openai_chat_tests.rs
crates/node/tests/typed_tests.mjs
crates/adaptive/tests/unit/drain_tests.rs
go/nemo_relay/llm_test.go
crates/ffi/tests/unit/types_tests.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/cli/tests/coverage/doctor_tests.rs
crates/cli/tests/coverage/config_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/tests/integration/pipeline_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/codec/response_tests.rs

{crates/adaptive/**,python/nemo_relay/adaptive.py,python/nemo_relay/plugin.py,go/nemo_relay/adaptive/**,go/nemo_relay/!(adaptive)/**,**/node/**,**/wasm/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Keep adaptive surface in sync across crates/adaptive, shared plugin behavior in core and bindings, Python adaptive/plugin wrappers in python/nemo_relay/adaptive.py and python/nemo_relay/plugin.py, Go adaptive helpers under go/nemo_relay/adaptive plus shared plugin helpers in go/nemo_relay, and Node/WebAssembly adaptive helpers and plugin wrappers

Files:

crates/wasm/tests-js/typed_tests.mjs
crates/node/tests/typed_tests.mjs
crates/adaptive/tests/unit/drain_tests.rs
crates/adaptive/tests/unit/acg/telemetry_tests.rs

{crates/adaptive/**,python/nemo_relay/plugin.py,go/nemo_relay/**,**/node/**,**/wasm/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

{crates/adaptive/**,python/nemo_relay/plugin.py,go/nemo_relay/**,**/node/**,**/wasm/**}: Maintain consistent plugin lifecycle across all language bindings (Python, Go, Node/WebAssembly, and Rust)
Keep plugin context surfaces aligned across all language implementations

Files:

crates/wasm/tests-js/typed_tests.mjs
crates/node/tests/typed_tests.mjs
crates/adaptive/tests/unit/drain_tests.rs
go/nemo_relay/llm_test.go
crates/adaptive/tests/unit/acg/telemetry_tests.rs

crates/{python,ffi,node,wasm}/**/*

⚙️ CodeRabbit configuration file

crates/{python,ffi,node,wasm}/**/*: Treat binding changes as public API changes. Check for parity with the other language bindings, FFI ownership/lifetime safety,
callback error propagation, stable type conversion, and consistent async/stream semantics.
Flag changes that update one binding without corresponding tests or documentation for the same surface elsewhere.

Files:

crates/wasm/tests-js/typed_tests.mjs
crates/node/tests/typed_tests.mjs
crates/ffi/tests/unit/types_tests.rs
crates/python/tests/coverage/py_types_coverage_tests.rs

go/nemo_relay/**/*.go

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Update Go wrapper in go/nemo_relay/nemo_relay.go with doc comment and shorthand package if the capability belongs there

go/nemo_relay/**/*.go: Format changed Go packages with cd go/nemo_relay && go fmt ./...
Run Go tests with just test-go to build and test the NeMo Relay Go binding
Use just build-go when you want an explicit build-only pass or need the artifact for other work
Use just ci=true test-go when you need the CI-style coverage and JUnit path
On macOS, set DYLD_LIBRARY_PATH to the ../../target/release directory before running the raw go test command directly

Files:

go/nemo_relay/llm_test.go

go/**/*.go

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use PascalCase naming convention for Go identifiers (e.g., nemo_relay.ToolCall)

Run Go formatting with cd go/nemo_relay && go fmt ./...

Files:

go/nemo_relay/llm_test.go

{go/nemo_relay/go.mod,go/**/*.go}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure Go module path in go/nemo_relay/go.mod matches import statements in Go source files

Files:

go/nemo_relay/llm_test.go

**/*.go

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Go module paths and package paths during coordinated rename operations

**/*.go: Use gofmt for Go code formatting
Run go vet ./... for Go static analysis
Use Go PascalCase naming convention for Go identifiers
Include SPDX license header in all Go source files using double-slash comment syntax
Validate Go code with uv run pre-commit run --all-files to enforce gofmt formatting and go vet static analysis

Files:

go/nemo_relay/llm_test.go

go/nemo_relay/**/*

⚙️ CodeRabbit configuration file

go/nemo_relay/**/*: Review Go binding changes for cgo memory ownership, race safety, callback cleanup, idiomatic exported APIs, and parity with Rust/FFI behavior.
Any API change should include focused Go tests and consider race-test behavior.

Files:

go/nemo_relay/llm_test.go

crates/ffi/**

📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)

Rebuild the FFI crate in release mode so the shared library and header stay in sync when making changes to crates/ffi

Files:

crates/ffi/tests/unit/types_tests.rs

crates/ffi/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/ffi, also use test-ffi-surface for validation

Files:

crates/ffi/tests/unit/types_tests.rs

**/*config*.{rs,ts,py,go,js,json,yaml,yml}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Ensure dynamic config shape still matches the documented canonical model

Files:

crates/cli/src/plugins/config_io.rs
crates/cli/tests/coverage/config_tests.rs
crates/cli/src/config.rs

crates/core/src/observability/{atif,otel,openinference}.rs

📄 CodeRabbit inference engine (.agents/skills/maintain-observability/SKILL.md)

When changing event fields in ATIF, OpenTelemetry, or OpenInference observability surfaces, keep the core event model in crates/core/src/observability/atif.rs, crates/core/src/observability/otel.rs, and crates/core/src/observability/openinference.rs in sync

Files:

crates/core/src/observability/otel.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs

crates/python/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-python-binding/SKILL.md)

If the native Rust bridge changed, add the Rust crate tests for nemo-relay-python

Files:

crates/python/tests/coverage/py_types_coverage_tests.rs

🔇 Additional comments (40)

crates/core/src/codec/mod.rs (1)

17-17: LGTM!

crates/core/src/plugins/mod.rs (1)

7-7: LGTM!

crates/core/src/plugins/pricing.rs (1)

21-34: LGTM!

Also applies to: 47-71, 73-95

crates/core/src/plugin.rs (1)

767-768: LGTM!

crates/node/tests/typed_tests.mjs (1)

513-529: LGTM!

Also applies to: 561-567

crates/python/tests/coverage/py_types_coverage_tests.rs (1)

22-23: LGTM!

Also applies to: 620-620, 1202-1214, 1238-1241, 1475-1475

crates/wasm/tests-js/typed_tests.mjs (1)

94-161: LGTM!

crates/core/src/codec/response.rs (1)

13-21: LGTM!

Also applies to: 80-187, 241-243

crates/core/src/codec/anthropic.rs (1)

30-32: LGTM!

Also applies to: 69-71, 356-379

crates/core/src/codec/openai_chat.rs (1)

17-19: LGTM!

Also applies to: 76-78, 176-193

crates/core/src/codec/openai_responses.rs (1)

29-31: LGTM!

Also applies to: 69-71, 478-498

crates/core/src/api/llm.rs (1)

26-27: LGTM!

Also applies to: 588-592

crates/core/src/stream.rs (1)

39-40: LGTM!

Also applies to: 147-151

crates/ffi/tests/unit/types_tests.rs (1)

572-592: LGTM!

Also applies to: 617-622

go/nemo_relay/llm_test.go (1)

195-219: LGTM!

Also applies to: 273-273

crates/core/tests/unit/codec/openai_chat_tests.rs (1)

10-10: LGTM!

Also applies to: 118-148

crates/core/tests/unit/observability/openinference_tests.rs (1)

16-24: LGTM!

Also applies to: 125-150, 2141-2141, 2224-2273, 2860-2860

crates/core/tests/unit/observability/otel_tests.rs (1)

16-70: LGTM!

Also applies to: 821-835, 876-919

crates/adaptive/tests/unit/acg/telemetry_tests.rs (1)

171-171: LGTM!

Also applies to: 200-200, 226-226, 249-249, 277-277, 311-311, 349-349, 418-418, 486-486, 541-541, 597-597, 650-650, 691-691

crates/adaptive/tests/unit/drain_tests.rs (1)

873-873: LGTM!

crates/core/tests/integration/pipeline_tests.rs (2)

32-35: LGTM!

Also applies to: 66-91, 923-934, 955-955, 972-972, 1001-1001, 1149-1149, 1170-1170, 1204-1204

992-998: Refine float-warning: exact assert_eq! is appropriate here. usage.cost.total is produced by round_cost_amount, which rounds to a fixed 1e-12 scale ((cost * 1_000_000_000_000.0).round() / ...), and the same value is already asserted exactly in crates/core/tests/unit/codec/response_tests.rs (assert_eq!(cost.total, Some(0.000_435))).
> Likely an incorrect or invalid review comment.
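
The reasoning in the 992-998 note can be sketched as follows. This is a hypothetical reconstruction of round_cost_amount, assuming the elided divisor is the same 1e12 scale as the multiplier; the actual helper in the codec module may differ in detail.

```rust
/// Hypothetical reconstruction of the rounding helper named in the note above.
/// Assumption: the elided divisor equals the 1e12 multiplier.
fn round_cost_amount(cost: f64) -> f64 {
    const SCALE: f64 = 1_000_000_000_000.0;
    // Snap to the nearest multiple of 1e-12, then scale back down.
    // The rounded numerator is an exactly representable integer, so the
    // division yields the nearest f64 to the intended decimal value.
    (cost * SCALE).round() / SCALE
}

/// Shows why an exact comparison is safe after rounding: the raw sum carries
/// floating-point noise, but the rounded value matches the literal bit for bit.
fn rounded_sum_matches_literal() -> bool {
    let raw = 0.1_f64 + 0.2_f64; // differs from 0.3 in the last bits
    raw != 0.3 && round_cost_amount(raw) == 0.3
}
```

Under that assumption, values that differ only by accumulation noise far below 1e-12 collapse to the same f64, which is why assert_eq! on the rounded total is reasonable in these tests.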

crates/core/tests/unit/atif_tests.rs (1)

14-17: LGTM!

Also applies to: 685-753, 854-871, 4072-4079

crates/core/tests/unit/codec/response_tests.rs (1)

7-174: LGTM!

Also applies to: 180-477, 505-671, 760-935

crates/cli/src/config.rs (1)

82-83: LGTM!

Also applies to: 167-253, 990-1029

crates/cli/src/main.rs (1)

18-18: LGTM!

Also applies to: 28-28, 85-93

docs/build-plugins/about.mdx (1)

21-27: LGTM!

Also applies to: 35-37

docs/build-plugins/plugin-configuration-files.mdx (1)

191-196: LGTM!

Also applies to: 253-254

docs/instrument-applications/instrument-llm-call.mdx (1)

35-42: LGTM!

Also applies to: 53-56, 226-229, 249-250

docs/integrate-into-frameworks/provider-response-codecs.mdx (1)

43-44: LGTM!

Also applies to: 49-316, 456-457, 475-476, 581-582

docs/nemo-relay-cli/about.mdx (1)

27-27: LGTM!

docs/nemo-relay-cli/basic-usage.mdx (1)

150-228: LGTM!

crates/cli/src/plugins.rs (1)

23-23: LGTM!

crates/cli/src/plugins/config_io.rs (1)

20-20: LGTM!

Also applies to: 26-26, 45-45, 60-60, 81-81, 118-118

crates/cli/src/pricing.rs (1)

26-49: LGTM!

Also applies to: 124-203, 223-270

crates/cli/src/doctor.rs (1)

18-18: LGTM!

Also applies to: 34-35, 640-640, 722-803

crates/cli/tests/cli_tests.rs (1)

16-18: LGTM!

Also applies to: 104-257, 722-740

crates/cli/tests/coverage/config_tests.rs (1)

406-468: LGTM!

crates/cli/tests/coverage/doctor_tests.rs (1)

780-863: LGTM!

crates/cli/tests/coverage/session_tests.rs (1)

1946-1951: LGTM!

Also applies to: 1953-2025

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/cli/tests/cli_tests.rs (1)

16-18: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider escaping control characters for TOML compliance.

The function correctly escapes backslashes and quotes, but TOML basic strings also require escaping control characters (U+0000-U+001F, U+007F). While file paths shouldn't contain these in practice, the function name toml_basic_string suggests general-purpose TOML escaping.

Consider either adding control character escaping for completeness, or renaming to toml_escape_path to clarify the limited scope.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/cli/tests/cli_tests.rs` around lines 16 - 18, toml_basic_string
currently only escapes backslashes and quotes; update the function
toml_basic_string to also escape TOML control characters (U+0000–U+001F and
U+007F) by mapping common ones to their short escapes (e.g., \n, \t, \r, \b, \f)
and encoding other controls as \u00XX (or \uXXXX) so generated basic strings are
TOML-compliant; adjust relevant tests in crates/cli/tests/cli_tests.rs to expect
these escapes.
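
A minimal sketch of the escaping described in this comment, for illustration only. The name escape_toml_basic_string is hypothetical, and the sketch returns just the escaped body without surrounding quotes; whether the real toml_basic_string test helper adds the quotes itself is not shown in this review.

```rust
/// Hypothetical sketch: escape `value` for use inside a TOML basic string.
/// Returns the escaped body only; the caller is assumed to add the quotes.
fn escape_toml_basic_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            // Characters the helper is said to handle already.
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            // Control characters with short escapes.
            '\u{0008}' => out.push_str("\\b"),
            '\t' => out.push_str("\\t"),
            '\n' => out.push_str("\\n"),
            '\u{000C}' => out.push_str("\\f"),
            '\r' => out.push_str("\\r"),
            // Remaining controls (U+0000-U+001F, U+007F) as \uXXXX.
            c if (c as u32) < 0x20 || (c as u32) == 0x7F => {
                out.push_str(&format!("\\u{:04X}", c as u32));
            }
            // Everything else passes through unchanged.
            c => out.push(c),
        }
    }
    out
}
```

For ordinary file paths only the backslash and quote arms ever fire, which matches the comment's "low value" rating.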

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: d1914f99-c45b-4154-849a-314f80c9cf41

📥 Commits

Reviewing files that changed from the base of the PR and between dc36906 and a4b8ae8.

📒 Files selected for processing (1)

crates/cli/tests/cli_tests.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Check / Run
GitHub Check: Preview docs

🧰 Additional context used

📓 Path-based instructions (10)

**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

crates/cli/tests/cli_tests.rs

{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

crates/cli/tests/cli_tests.rs

**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

crates/cli/tests/cli_tests.rs

**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

crates/cli/tests/cli_tests.rs

**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

crates/cli/tests/cli_tests.rs

**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

crates/cli/tests/cli_tests.rs

**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

crates/cli/tests/cli_tests.rs

crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

crates/cli/tests/cli_tests.rs

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

crates/cli/tests/cli_tests.rs

**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

Scope stacks decide where work belongs and which scope-local behavior is visible.

Middleware registries decide what guardrails and intercepts run around managed calls.

Plugins install reusable runtime behavior from configuration.

Events record runtime behavior in ATOF form.

Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.
crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

crates/cli/tests/cli_tests.rs

🔇 Additional comments (4)

crates/cli/tests/cli_tests.rs (4)

59-62: LGTM!

108-154: LGTM!

156-200: LGTM!

202-259: LGTM!

Also applies to: 725-743

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/core/src/observability/atif.rs (1)

689-715: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't estimate cost_usd when the payload already carries non-USD cost.

For usage.cost = { "total": ..., "currency": "EUR" }, lines 689-700 clear explicit_cost and then fall through to estimate_cost_for_provider(...). The exporter then emits a synthetic USD amount even though the provider already reported an authoritative cost in another currency, so metrics.cost_usd/total_cost_usd can be wrong. Treat a non-USD usage.cost as “cost present but not exportable to cost_usd” and skip the estimator in that case.

Suggested fix

     let explicit_cost = usage.get("cost_usd").and_then(Json::as_f64).or_else(|| {
         let cost = usage.get("cost")?.as_object()?;
         if cost
             .get("currency")
             .and_then(Json::as_str)
             .is_some_and(|currency| currency != "USD")
         {
             return None;
         }
         cost.get("total").and_then(Json::as_f64)
     });
-    let cost = explicit_cost.or_else(|| {
+    let has_non_usd_cost = usage
+        .get("cost")
+        .and_then(Json::as_object)
+        .is_some_and(|cost| {
+            cost.get("currency")
+                .and_then(Json::as_str)
+                .is_some_and(|currency| currency != "USD")
+        });
+    let cost = if has_non_usd_cost {
+        explicit_cost
+    } else {
+        explicit_cost.or_else(|| {
         let model_name = model_name.or_else(|| response_model_name(output))?;
         estimate_cost_for_provider(
             provider,
             model_name,
             &Usage {
@@
         )
         .and_then(|cost| cost.total_for_currency("USD"))
-    });
+        })
+    };
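The decision in the suggested fix can also be read in isolation from the exporter. The following is a minimal sketch, not the code in atif.rs: ReportedCost and resolve_cost_usd are hypothetical names, the cost object is modelled as a plain struct instead of serde_json::Value, and the estimator is passed in as a closure. A missing currency is treated as USD, matching the check in the diff above.

```rust
/// Hypothetical stand-in for the `usage.cost` object; not the real Relay type.
#[derive(Debug, Clone, PartialEq)]
struct ReportedCost {
    total: Option<f64>,
    currency: Option<String>,
}

/// Resolve the value to export as `cost_usd`.
///
/// `cost_usd` mirrors the top-level `usage.cost_usd` field, `cost` mirrors
/// `usage.cost`, and `estimate` stands in for the pricing-based estimator.
fn resolve_cost_usd(
    cost_usd: Option<f64>,
    cost: Option<&ReportedCost>,
    estimate: impl FnOnce() -> Option<f64>,
) -> Option<f64> {
    // A reported cost in a currency other than USD is authoritative,
    // but it cannot be exported under a USD-named field.
    let has_non_usd_cost = cost
        .and_then(|c| c.currency.as_deref())
        .is_some_and(|currency| currency != "USD");

    // Explicit USD cost: the top-level field first, then `cost.total`
    // unless the cost object is in another currency.
    let explicit = cost_usd.or_else(|| {
        if has_non_usd_cost {
            None
        } else {
            cost.and_then(|c| c.total)
        }
    });

    if has_non_usd_cost {
        // Never estimate over a provider-reported non-USD cost.
        explicit
    } else {
        explicit.or_else(estimate)
    }
}
```

With this shape, a EUR cost yields no cost_usd even when the estimator could produce a value, while a payload with no cost at all still reaches the estimator.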

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/observability/atif.rs` around lines 689 - 715, The code
currently clears explicit_cost when usage.cost.currency != "USD" and then falls
through to calling estimate_cost_for_provider, producing a synthetic USD value;
change this so non-USD reported costs prevent estimation. Detect the non-USD
case when reading usage.get("cost") (e.g., set a has_non_usd_cost boolean when
cost.get("currency") exists and != "USD"), keep explicit_cost as before for USD,
and then when computing cost (the variable using explicit_cost.or_else(...)),
skip calling estimate_cost_for_provider and return None if has_non_usd_cost is
true; otherwise call estimate_cost_for_provider as now. Ensure you reference
explicit_cost, usage.get("cost"), has_non_usd_cost (new flag),
estimate_cost_for_provider and the cost variable when making the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/core/src/observability/otel.rs`:
- Around line 683-702: The function cost_from_llm_event currently uses a `?` on
`model_name` which returns early and prevents the manual-output fallback from
running; change the flow so you don't return early: in cost_from_llm_event,
replace `let model_name = response.model.as_deref().or_else(||
event.model_name())?;` with an Option-valued binding (no `?`) and then branch—if
`Some(model_name)` call
estimate_cost_for_provider(...).and_then(cost_total_and_currency) and return
that result, but if `None` fall through to the existing manual-output fallback
that uses usage_from_manual_llm_output and model_name_from_manual_llm_output so
the raw-output estimation path still runs when annotated model is absent.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: fb56ab89-d618-4bfb-936c-ad4ab3813e45

📥 Commits

Reviewing files that changed from the base of the PR and between a4b8ae8 and 1b30b60.

📒 Files selected for processing (11)

crates/cli/src/pricing.rs
crates/cli/tests/cli_tests.rs
crates/core/src/codec/pricing.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/otel.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/tests/unit/observability/otel_tests.rs

📜 Review details

🧰 Additional context used

📓 Path-based instructions (15)

**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

crates/core/src/observability/{atif,otel,openinference}.rs

📄 CodeRabbit inference engine (.agents/skills/maintain-observability/SKILL.md)

When changing event fields in ATIF, OpenTelemetry, or OpenInference observability surfaces, keep the core event model in crates/core/src/observability/atif.rs, crates/core/src/observability/otel.rs, and crates/core/src/observability/openinference.rs in sync

Files:

crates/core/src/observability/openinference.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs

**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

**

⚙️ CodeRabbit configuration file

**:

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/codec/response.rs
crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/cli/src/pricing.rs
crates/core/tests/unit/codec/response_tests.rs
crates/core/src/codec/pricing.rs

{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/tests/unit/codec/response_tests.rs

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

crates/core/tests/unit/atif_tests.rs
crates/cli/tests/cli_tests.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/tests/unit/codec/response_tests.rs

🔇 Additional comments (13)

crates/core/src/codec/pricing.rs (1)

196-223: Please attach the required crates/core validation for this change.

This touches crates/core, so please confirm the broader validation ran before merge: validate-change, just test-rust, cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, uv run pre-commit run --all-files, and the full Rust/Python/Go/Node/WebAssembly matrix.

As per coding guidelines, changes to crates/core must "also use validate-change for broader validation", "run the full matrix across Rust, Python, Go, Node.js, and WebAssembly", and Rust changes must run just test-rust, cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, and uv run pre-commit run --all-files.

Source: Coding guidelines

crates/core/tests/unit/atif_tests.rs (1)

21-27: LGTM!

Also applies to: 750-750, 762-774, 804-804

crates/core/tests/unit/codec/response_tests.rs (1)

19-33: LGTM!

Also applies to: 501-501, 690-707, 743-767, 805-805

crates/core/tests/unit/observability/openinference_tests.rs (2)

2186-2267: Please confirm the full core-change validation matrix ran.

These tests harden the Rust side, but this PR changes crates/core pricing semantics shared across bindings. Please include the final results for validate-change, just test-rust, just test-python, just test-go, just test-node, just test-wasm, cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, and uv run pre-commit run --all-files.

As per coding guidelines, "If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly", "crates/core/**/*.rs: If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation", and "Before review, prefer uv run pre-commit run --all-files when the change crosses languages or tooling."

Source: Coding guidelines

36-42: LGTM!

Also applies to: 2186-2230, 2232-2267, 2320-2357

crates/core/tests/unit/observability/otel_tests.rs (1)

32-38: LGTM!

Also applies to: 845-882, 928-958

crates/core/src/codec/response.rs (2)

224-226: LGTM!

1-3: Cannot verify Rust validation status — sandbox environment lacks required tools.

The validation commands (cargo fmt, cargo clippy, just test-rust, uv run pre-commit) cannot be executed in the current sandbox environment. The SPDX header on lines 1–3 is correct per guidelines, and crates/core/src/codec/response.rs contains valid Rust code. However, confirmation that cargo fmt, cargo clippy --workspace --all-targets -- -D warnings, just test-rust, and uv run pre-commit run --all-files all pass requires access to a fully configured development environment with Rust toolchain, just, uv, and related dependencies installed. Obtain validation output from the CI build or local development environment before merge.

crates/core/src/observability/openinference.rs (1)

1117-1119: LGTM!

Also applies to: 1125-1127

crates/core/src/observability/otel.rs (2)

28-30: LGTM!

709-877: LGTM!

crates/cli/src/pricing.rs (1)

52-58: LGTM!

Also applies to: 64-64, 88-92

crates/cli/tests/cli_tests.rs (1)

17-34: LGTM!

Also applies to: 36-41, 204-215, 228-228, 290-310

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

crates/core/src/observability/atif.rs (1)

689-726: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't estimate over a present usage.cost object.

This path only treats usage.cost.total as explicit cost. If the provider supplies authoritative component fields without total (for example input/output/cache_*), the exporter falls through to estimate_cost_for_provider(...) and can emit a different USD value instead of leaving ATIF cost unset. For ATIF, derive cost_usd from explicit USD components when possible; otherwise skip estimation whenever usage.cost exists.

This matches the PR cost contract: provider/framework-reported cost stays authoritative, and pricing estimation should run only when Usage.cost is absent.
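One way to read that contract is sketched below. This is an illustration under stated assumptions, not the ATIF exporter: ReportedCost, usd_total, and resolve_cost_usd are hypothetical names, a missing currency is treated as USD, and summing whichever component fields are present is an assumed interpretation of "derive cost_usd from explicit USD components".

```rust
/// Hypothetical mirror of the currency-neutral cost fields named in the PR.
#[derive(Debug, Clone, Default)]
struct ReportedCost {
    total: Option<f64>,
    input: Option<f64>,
    output: Option<f64>,
    cache_read: Option<f64>,
    cache_write: Option<f64>,
    currency: Option<String>,
}

/// Derive a USD total from a reported cost object, if one can be derived.
fn usd_total(cost: &ReportedCost) -> Option<f64> {
    // Assumption: a missing currency is treated as USD.
    if cost.currency.as_deref().is_some_and(|c| c != "USD") {
        return None;
    }
    // An explicit total always wins over the components.
    if let Some(total) = cost.total {
        return Some(total);
    }
    let parts = [cost.input, cost.output, cost.cache_read, cost.cache_write];
    if parts.iter().all(Option::is_none) {
        return None;
    }
    // Assumption: sum whichever component fields the provider reported.
    Some(parts.iter().flatten().sum())
}

/// Estimate only when no cost object was reported at all.
fn resolve_cost_usd(
    cost: Option<&ReportedCost>,
    estimate: impl FnOnce() -> Option<f64>,
) -> Option<f64> {
    match cost {
        // Reported cost is authoritative, even if no USD total can be derived.
        Some(cost) => usd_total(cost),
        None => estimate(),
    }
}
```

The match makes the precedence explicit: once a cost object exists, the estimator is never consulted, so a component-only or non-USD cost leaves the exported value unset instead of replacing it with an estimate.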
🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/observability/atif.rs` around lines 689 - 726, The current
logic falls back to estimate_cost_for_provider even when a usage.cost object is
present but lacks a total field; change it so that when usage.get("cost") exists
you never call estimate_cost_for_provider: instead, if cost.currency == "USD"
try to derive cost_usd from explicit USD component fields (e.g., input, output,
cache_read, cache_write, or usage.get("total") if present) and set explicit_cost
accordingly, otherwise leave cost_usd None; only call estimate_cost_for_provider
in the branch where usage.get("cost") is completely absent. Update the logic
around explicit_cost, has_non_usd_cost, and the block that calls
estimate_cost_for_provider to reflect this contract.

crates/core/tests/unit/observability/otel_tests.rs (1)

845-999: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a provider-qualified OTEL pricing case.

All priced fixtures here install a catalog entry under provider: "test", but the events only carry model / model_name and never a matching provider or route identity. That means this block can still pass via provider-agnostic fallback while the new provider-aware lookup path regresses. Add one span-end case that resolves through the provider-qualified branch as well. As per coding guidelines, {crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: "Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant."
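The gap is easier to see with a toy resolver. The sketch below is hypothetical and much simpler than the resolver in this PR: Entry, Resolution, and resolve are illustrative names, and the fallback rule (match on model alone when no provider-qualified entry matches) is an assumption about how the provider-agnostic path behaves.

```rust
/// Hypothetical catalog entry; the real catalog schema is richer.
struct Entry {
    provider: &'static str,
    model: &'static str,
    price_per_token: f64,
}

/// Records which lookup branch produced the price.
#[derive(Debug, PartialEq)]
enum Resolution {
    ProviderQualified(f64),
    ProviderAgnostic(f64),
}

fn resolve(catalog: &[Entry], provider: Option<&str>, model: &str) -> Option<Resolution> {
    // Provider-aware branch: only reachable when the event carries an identity.
    if let Some(provider) = provider {
        if let Some(entry) = catalog
            .iter()
            .find(|e| e.provider == provider && e.model == model)
        {
            return Some(Resolution::ProviderQualified(entry.price_per_token));
        }
    }
    // Assumed fallback: match on the model name alone.
    catalog
        .iter()
        .find(|e| e.model == model)
        .map(|e| Resolution::ProviderAgnostic(e.price_per_token))
}
```

With a single entry under provider "test", both resolve(&catalog, None, "priced-model") and resolve(&catalog, Some("test"), "priced-model") return the same price through different branches. An assertion on the exported cost alone cannot tell them apart, which is why a fixture that carries the provider identity is needed to cover the qualified path.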
🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/tests/unit/observability/otel_tests.rs` around lines 845 - 999,
Add a test case that exercises the provider-qualified pricing lookup by creating
a span-end event that includes an explicit provider/route identity (e.g., set
provider or route fields on the event payload) so the installed test catalog
entry installed by install_test_pricing("priced-model") under provider "test" is
matched via the provider-aware branch; specifically, use
make_scope_event_with_profile (or make_scope_event_with_profile's payload) to
include the provider identity alongside model/model_name, mirror the usage/token
counts used in the other priced cases, then assert the same
nemo_relay.llm.cost.total and nemo_relay.llm.cost.currency values to verify
provider-qualified resolution, ensuring the new case sits alongside the existing
blocks that use install_test_pricing and ResetPricingResolverGuard.

Source: Coding guidelines

♻️ Duplicate comments (1)

crates/core/src/observability/otel.rs (1)

687-703: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve fallback ordering when the annotated branch can't emit a total.

The direct return here still short-circuits the manual fallback below when the annotated estimate comes back None. That means partial normalized usage can suppress a successful raw-output estimate. Also, when usage.cost is present but only has component fields, this branch currently treats it as absent and estimates anyway; that breaks the provider-cost precedence the PR is introducing.

💡 Minimal fix

     if let Some(response) = event.annotated_response()
         && let Some(usage) = response.usage.as_ref()
     {
-        if let Some(cost) = usage.cost.as_ref().and_then(cost_total_and_currency) {
-            return Some(cost);
+        if let Some(cost) = usage.cost.as_ref() {
+            return cost_total_and_currency(cost);
         }
-        if let Some(model_name) = response.model.as_deref().or_else(|| event.model_name()) {
-            return estimate_cost_for_provider(Some(event.name()), model_name, usage)
-                .and_then(|cost| cost_total_and_currency(&cost));
+        if let Some(model_name) = response.model.as_deref().or_else(|| event.model_name())
+            && let Some(cost) = estimate_cost_for_provider(Some(event.name()), model_name, usage)
+                .and_then(|cost| cost_total_and_currency(&cost))
+        {
+            return Some(cost);
         }
     }

This keeps provider/framework cost authoritative and only falls through to the raw-output estimation path when the annotated branch has no explicit cost and cannot produce an estimate.
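The ordering described above can be sketched in isolation as follows. This is an illustrative model only: `CostEstimate` and `Usage` are trimmed stand-ins for the real types, and the `estimate` and `raw_fallback` closures are hypothetical placeholders for `estimate_cost_for_provider` and the manual raw-output path. The sketch avoids let-chains so it compiles on older Rust editions.

```rust
/// Trimmed stand-in for the normalized cost type; only the fields used here.
#[derive(Clone, Debug, PartialEq)]
struct CostEstimate {
    total: Option<f64>,
    currency: String,
}

/// Trimmed stand-in for normalized usage on an annotated response.
struct Usage {
    cost: Option<CostEstimate>,
}

fn cost_total_and_currency(cost: &CostEstimate) -> Option<(f64, String)> {
    Some((cost.total?, cost.currency.clone()))
}

/// `estimate` stands in for the pricing-based estimate (hypothetical closure).
/// `raw_fallback` stands in for the manual raw-output path (hypothetical closure).
fn resolve_cost(
    annotated: Option<&Usage>,
    estimate: impl Fn(&Usage) -> Option<CostEstimate>,
    raw_fallback: impl Fn() -> Option<(f64, String)>,
) -> Option<(f64, String)> {
    if let Some(usage) = annotated {
        // An explicit cost is authoritative, even when it carries no total.
        if let Some(cost) = usage.cost.as_ref() {
            return cost_total_and_currency(cost);
        }
        // Short-circuit only when the estimate actually yields a total.
        if let Some(cost) = estimate(usage).and_then(|c| cost_total_and_currency(&c)) {
            return Some(cost);
        }
    }
    // No explicit cost and no usable estimate: try the raw-output path.
    raw_fallback()
}
```

Under this model, an annotated usage with no cost and no available estimate still reaches `raw_fallback`, while a present cost without a total returns `None` instead of being replaced by an estimate.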

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/observability/otel.rs` around lines 687 - 703, The
annotated-response branch currently returns early and short-circuits the
manual/raw-output fallback. Change it so the branch returns only in two cases:
when annotated usage.cost exists, which is authoritative and stops further
estimation even if cost_total_and_currency() is None, or when usage.cost is
absent and estimate_cost_for_provider(...) yields a total. Otherwise, fall
through to the existing usage_from_manual_llm_output(...) /
model_name_from_manual_llm_output(...) path. Reference:
event.annotated_response(), response.usage / usage.cost,
cost_total_and_currency, estimate_cost_for_provider,
usage_from_manual_llm_output, model_name_from_manual_llm_output.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/core/src/observability/atif.rs`:
- Around line 689-726: The current logic falls back to
estimate_cost_for_provider even when a usage.cost object is present but lacks a
total field; change it so that when usage.get("cost") exists you never call
estimate_cost_for_provider: instead, if cost.currency == "USD" try to derive
cost_usd from explicit USD component fields (e.g., input, output, cache_read,
cache_write, or usage.get("total") if present) and set explicit_cost
accordingly, otherwise leave cost_usd None; only call estimate_cost_for_provider
in the branch where usage.get("cost") is completely absent. Update the logic
around explicit_cost, has_non_usd_cost, and the block that calls
estimate_cost_for_provider to reflect this contract.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 022204fb-736e-412f-999b-09a5b2de6caa

📥 Commits

Reviewing files that changed from the base of the PR and between 1b30b60 and 992d3a3.

📒 Files selected for processing (4)

crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/observability/otel_tests.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Check / Run
GitHub Check: Preview docs

🧰 Additional context used

📓 Path-based instructions (15)

**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

crates/core/src/observability/{atif,otel,openinference}.rs

📄 CodeRabbit inference engine (.agents/skills/maintain-observability/SKILL.md)

When changing event fields in ATIF, OpenTelemetry, or OpenInference observability surfaces, keep the core event model in crates/core/src/observability/atif.rs, crates/core/src/observability/otel.rs, and crates/core/src/observability/openinference.rs in sync

Files:

crates/core/src/observability/atif.rs
crates/core/src/observability/otel.rs

**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

Scope stacks decide where work belongs and which scope-local behavior is visible.

Middleware registries decide what guardrails and intercepts run around managed calls.

Plugins install reusable runtime behavior from configuration.

Events record runtime behavior in ATOF form.

Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.
crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs
crates/core/src/observability/otel.rs

{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

crates/core/tests/unit/observability/otel_tests.rs
crates/core/tests/unit/atif_tests.rs

🔇 Additional comments (1)

crates/core/tests/unit/atif_tests.rs (1)

29-54: Please confirm the required Rust/core validation matrix ran.

These changes are under crates/core/tests, so the repo guidance requires more than the local unit assertions here: cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, just test-rust, uv run pre-commit run --all-files, validate-change, and the full language matrix for crates/core changes. As per coding guidelines, **/*.rs: "Any Rust change must run just test-rust", "cargo fmt --all", and "cargo clippy --workspace --all-targets -- -D warnings", and crates/{core,adaptive}/**/*: "Changes to crates/core or crates/adaptive must run the full language matrix".

Also applies to: 805-866

Source: Coding guidelines

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/core/src/observability/otel.rs (1)

687-695: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't short-circuit when Usage.cost lacks total.

Lines 690-691 return immediately for any annotated usage.cost. If that cost only carries input / output / cache_* components, cost_total_and_currency returns None and the function never reaches the remaining logic. This regresses valid normalized cost payloads to “no cost”.

Proposed fix

 fn cost_total_and_currency(cost: &CostEstimate) -> Option<(f64, String)> {
-    Some((cost.total?, cost.currency.clone()))
+    let total = cost.total.or_else(|| {
+        let (has_component, total) = [cost.input, cost.output, cost.cache_read, cost.cache_write]
+            .into_iter()
+            .flatten()
+            .fold((false, 0.0), |(_, total), value| (true, total + value));
+        has_component.then_some(total)
+    })?;
+    Some((total, cost.currency.clone()))
 }
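A self-contained version of the same idea, for readers who want to try it outside the crate. The `CostEstimate` struct below is a stand-in that assumes the field names listed in the PR description (`total`, `input`, `output`, `cache_read`, `cache_write`, `currency`); the real type may differ. It uses `.iter().flatten().copied()` instead of by-value array iteration so it behaves the same on every Rust edition.

```rust
/// Stand-in for the normalized cost type, assuming the fields named in the PR.
#[derive(Clone, Debug, Default)]
struct CostEstimate {
    total: Option<f64>,
    input: Option<f64>,
    output: Option<f64>,
    cache_read: Option<f64>,
    cache_write: Option<f64>,
    currency: String,
}

/// Prefer the explicit total; otherwise sum whichever components are present.
/// Returns None only when neither a total nor any component exists.
fn cost_total_and_currency(cost: &CostEstimate) -> Option<(f64, String)> {
    let total = cost.total.or_else(|| {
        let components: Vec<f64> =
            [cost.input, cost.output, cost.cache_read, cost.cache_write]
                .iter()
                .flatten()
                .copied()
                .collect();
        if components.is_empty() {
            None
        } else {
            Some(components.iter().sum::<f64>())
        }
    })?;
    Some((total, cost.currency.clone()))
}
```

An explicit `total` always wins over the component sum, so a provider-reported total is never recomputed from its parts.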

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/observability/otel.rs` around lines 687 - 695, The current
logic returns as soon as response.usage.cost is Some, passing it through
cost_total_and_currency even when that helper returns None, which prevents
fallback estimation; change the flow so that you only return if
cost_total_and_currency(&cost) yields Some(value), otherwise continue to the
subsequent model-based estimation path (use the existing symbols:
event.annotated_response(), response.usage, usage.cost, cost_total_and_currency,
event.model_name(), estimate_cost_for_provider(Some(event.name()), model_name,
usage)) so that a Usage.cost without a total doesn't short-circuit estimation.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/core/src/observability/atif.rs`:
- Around line 752-763: In cost_usd_from_cost_object, the currency equality check
only matches the exact string "USD", so lowercase or mixed-case currencies are
rejected; update the check that currently uses matches!(currency, Some("USD"))
to perform a case-insensitive comparison (e.g., use eq_ignore_ascii_case on the
&str or a matches! pattern with a guard like Some(c) if
c.eq_ignore_ascii_case("USD")) so payloads like "usd" are treated as USD while
keeping the rest of the is_usd_cost logic intact.

In `@crates/core/src/observability/openinference.rs`:
- Around line 1106-1115: The current block in the function handling
event.annotated_response()/response.usage returns early when usage.cost exists
but cost.total_for_currency("USD") is None, dropping costs that are only present
as component fields; change the logic in the response.usage branch (the
usage.cost handling around cost.total_for_currency("USD") and
estimate_cost_for_provider) to: first attempt cost.total_for_currency("USD"),
and if that yields None, derive a total by summing the cost component fields
(input, output, cache_* etc.) converted to USD (or using existing helper if
available), and only if no component-derived total can be produced fall back to
estimate_cost_for_provider(Some(event.name()), model_name, usage). Ensure you
reference and update the code paths around annotated_response(),
response.model.as_deref().or_else(|| event.model_name()),
cost.total_for_currency("USD"), and estimate_cost_for_provider to implement this
fallback behavior.
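For the case-insensitive currency check requested for cost_usd_from_cost_object above, a minimal sketch could look like the following. The helper name `is_usd` is hypothetical and chosen for this example; only `matches!` with a guard and `str::eq_ignore_ascii_case` are taken from the review text and the standard library.

```rust
/// Hypothetical helper: true when the currency code is USD in any ASCII casing.
fn is_usd(currency: Option<&str>) -> bool {
    matches!(currency, Some(c) if c.eq_ignore_ascii_case("USD"))
}
```

A missing currency still counts as non-USD here; whether an absent currency should default to USD is a separate decision that this sketch does not make.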



ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 5657c503-3c6c-4604-86f7-30eae6406c61

📥 Commits

Reviewing files that changed from the base of the PR and between 992d3a3 and e64f48c.

📒 Files selected for processing (6)

crates/core/src/observability/atif.rs
crates/core/src/observability/openinference.rs
crates/core/src/observability/otel.rs
crates/core/tests/unit/atif_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/core/tests/unit/observability/otel_tests.rs

📜 Review details

🔇 Additional comments (5)

crates/core/tests/unit/atif_tests.rs (1)

805-885: LGTM!

crates/core/tests/unit/observability/openinference_tests.rs (1)

2319-2370: LGTM!

crates/core/tests/unit/observability/otel_tests.rs (3)

80-118: LGTM!

924-962: LGTM!

1008-1052: LGTM!

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

github-actions Bot added the size:XL PR is extra large label Jun 5, 2026

github-actions Bot added lang:go PR changes/introduces Go code lang:js PR changes/introduces Javascript/Typescript code lang:rust PR changes/introduces Rust code labels Jun 5, 2026

copy-pr-bot Bot temporarily deployed to fern June 5, 2026 22:48 Inactive

AjayThorve changed the title ~~Add LLM cost layering and pricing lookup~~ feat: add LLM cost layering and pricing lookup Jun 5, 2026

github-actions Bot added the Feature a new feature label Jun 5, 2026

copy-pr-bot Bot had a problem deploying to fern June 5, 2026 23:03 Error

AjayThorve force-pushed the ajay/relay-184-cost-layering-for-annotatedllmresponse branch from 0da1210 to cfef280 Compare June 5, 2026 23:06

copy-pr-bot Bot temporarily deployed to fern June 5, 2026 23:07 Inactive

copy-pr-bot Bot had a problem deploying to fern June 5, 2026 23:36 Failure

willkill07 assigned AjayThorve Jun 5, 2026

willkill07 added this to the 0.4 milestone Jun 5, 2026

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 00:03 Failure

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 00:13 Error

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 00:18 Error

AjayThorve force-pushed the ajay/relay-184-cost-layering-for-annotatedllmresponse branch from 81ce718 to 4ee9713 Compare June 6, 2026 00:19

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 00:20 Failure

copy-pr-bot Bot temporarily deployed to fern June 6, 2026 01:17 Inactive

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 02:13 Error

AjayThorve force-pushed the ajay/relay-184-cost-layering-for-annotatedllmresponse branch from 5433c44 to 3e7d330 Compare June 6, 2026 02:14

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 02:14 Error

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 02:18 Error

copy-pr-bot Bot temporarily deployed to fern June 6, 2026 02:21 Inactive

AjayThorve added 4 commits June 5, 2026 19:24

feat: add LLM cost layering and pricing lookup

e783bbe

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

fix: validate pricing sources in doctor

54957e3

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

docs: clarify pricing setup and validation

b280ddb

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

docs: cover pricing for embedded integrations

03bf000

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

test: isolate pricing CLI resolve config

dc36906

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

AjayThorve force-pushed the ajay/relay-184-cost-layering-for-annotatedllmresponse branch from dcfbb69 to dc36906 Compare June 6, 2026 02:27

AjayThorve marked this pull request as ready for review June 6, 2026 02:27

AjayThorve requested a review from a team as a code owner June 6, 2026 02:27

copy-pr-bot Bot temporarily deployed to fern June 6, 2026 02:27 Inactive

test: isolate doctor CLI config lookup

a4b8ae8

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

coderabbitai Bot reviewed Jun 6, 2026

copy-pr-bot Bot temporarily deployed to fern June 6, 2026 02:39 Inactive

coderabbitai Bot reviewed Jun 6, 2026

fix: address pricing review feedback

055e798

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

copy-pr-bot Bot had a problem deploying to fern June 6, 2026 03:40 Error

test: harden TOML string escaping

1b30b60

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

copy-pr-bot Bot temporarily deployed to fern June 6, 2026 03:43 Inactive

coderabbitai Bot reviewed Jun 6, 2026

Comment thread crates/core/src/observability/otel.rs

fix: preserve cost fallback semantics

992d3a3

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

copy-pr-bot Bot temporarily deployed to fern June 6, 2026 04:12 Inactive

coderabbitai Bot reviewed Jun 6, 2026

fix: preserve reported cost precedence

e64f48c

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

copy-pr-bot Bot temporarily deployed to fern June 6, 2026 04:51 Inactive

coderabbitai Bot reviewed Jun 6, 2026

Comment thread crates/core/src/observability/atif.rs

Comment thread crates/core/src/observability/openinference.rs

fix: handle component-only reported costs

84c7f32

Signed-off-by: Ajay Thorve <athorve@nvidia.com>

github-actions Bot added size:XXL PR is very large and removed size:XL PR is extra large labels Jun 6, 2026

copy-pr-bot Bot deployed to fern June 6, 2026 05:12 Active
