Skip to content

test: validate OpenClaw hook-only fallback exports#220

Merged
rapids-bot[bot] merged 1 commit into
NVIDIA:mainfrom
mnajafian-nv:test/openclaw-hook-only-fallback-export-validation
Jun 4, 2026
Merged

test: validate OpenClaw hook-only fallback exports#220
rapids-bot[bot] merged 1 commit into
NVIDIA:mainfrom
mnajafian-nv:test/openclaw-hook-only-fallback-export-validation

Conversation

@mnajafian-nv
Copy link
Copy Markdown
Contributor

@mnajafian-nv mnajafian-nv commented Jun 4, 2026

Overview

This PR adds exporter-visible validation for the remaining OpenClaw hook-only fallback paths across ATIF, ATOF, and OpenInference so stripped content, partial usage, and explicit cost-only usage remain honest in exported observability outputs.

  • I confirm this contribution is my own work, or I have the right to submit it under this project's license.
  • I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

  • Adds ATIF validation that stripped OpenClaw replay inputs and outputs fall back honestly while preserving partial usage and explicit cost-only metrics.
  • Adds ATOF validation that the same OpenClaw hook-only fallback lifecycle payloads are preserved as raw JSONL without inventing higher-fidelity structure.
  • Adds OpenInference validation that stripped OpenClaw payloads fall back to sanitized JSON attributes, partial usage stays partial, and explicit cost-only usage is preserved without inventing flattened message attributes.
  • Keeps the change test-only and focused on exporter-visible proof of existing OpenClaw hook-backed fallback behavior.
  • Avoids runtime, adapter, provider-event, or exporter implementation changes in this PR.

Validated with:

  • cargo test -p nemo-relay atif
  • cargo test -p nemo-relay atof
  • cargo test -p nemo-relay openinference
  • cargo fmt --all --check
  • uv run pre-commit run --files crates/core/tests/unit/atif_tests.rs crates/core/tests/unit/observability/atof_tests.rs crates/core/tests/unit/observability/openinference_tests.rs

Where should the reviewer start?

Start in crates/core/tests/unit/observability/openinference_tests.rs for the sanitized fallback-shape and partial-usage assertions, then review crates/core/tests/unit/atif_tests.rs for the ATIF stripped-content and metric-preservation proof, and crates/core/tests/unit/observability/atof_tests.rs for the raw JSONL preservation check.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Relates to OpenClaw observability consistency work.

Summary by CodeRabbit

  • Tests
    • Added comprehensive unit tests for OpenClaw hook fallback handling, validating preservation of stripped content, explicit metrics aggregation, JSONL payload serialization, and usage data across multiple exporters and event processors.

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
@mnajafian-nv mnajafian-nv added this to the 0.4 milestone Jun 4, 2026
@mnajafian-nv mnajafian-nv self-assigned this Jun 4, 2026
@mnajafian-nv mnajafian-nv requested a review from a team as a code owner June 4, 2026 17:15
@mnajafian-nv mnajafian-nv added Improvement improvement to existing functionality Test Test related labels Jun 4, 2026
@github-actions github-actions Bot added size:M PR is medium lang:rust PR changes/introduces Rust code and removed Improvement improvement to existing functionality labels Jun 4, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: aa06726a-89a4-46c6-95ff-1e641afc0ee4

📥 Commits

Reviewing files that changed from the base of the PR and between 280c88f and 440c9f9.

📒 Files selected for processing (3)
  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Check / Run
🧰 Additional context used
📓 Path-based instructions (13)
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • crates/core/tests/unit/atif_tests.rs
  • crates/core/tests/unit/observability/atof_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
🔇 Additional comments (3)
crates/core/tests/unit/atif_tests.rs (1)

1045-1202: Confirm required Rust/core validation matrix execution for this change.

Line 1045 modifies crates/core/**/*.rs; please confirm the mandated checks were run (just test-rust, cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, plus the crates/core broader validation/full language matrix requirement).

As per coding guidelines "If any Rust code changed, always run just test-rust ... run cargo fmt --all ... run cargo clippy --workspace --all-targets -- -D warnings" and "If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly."

crates/core/tests/unit/observability/atof_tests.rs (1)

748-845: LGTM!

crates/core/tests/unit/observability/openinference_tests.rs (1)

740-880: LGTM!


Walkthrough

Adds unit tests across three exporter/processor implementations to validate OpenClaw hook-only fallback behavior: ATIF validates trajectory structure and metrics aggregation; ATOF validates JSON-L payload field preservation; OpenInference validates span attribute presence/absence for stripped and partial content scenarios.

Changes

Hook-only Fallback Validation Tests

Layer / File(s) Summary
ATIF exporter trajectory validation
crates/core/tests/unit/atif_tests.rs
Test verifies ATIF v1.7 trajectory with four steps (two user/agent pairs), asserts omission of stripped prompt/system/assistant content, and validates explicit per-step cost and aggregated metrics (total_cost_usd, total_prompt_tokens).
ATOF exporter JSON-L payload preservation
crates/core/tests/unit/observability/atof_tests.rs
Test emits four LLM scope events, validates JSONL output preserves parent_uuid, and checks selective field presence: absent prompt/messages in stripped start; preserved cost in end; prompt_tokens only in partial end.
OpenInference processor span attribute validation
crates/core/tests/unit/observability/openinference_tests.rs
Test emits two span lifecycles (stripped and partial), asserts two exported spans, validates cost.total and MIME type preservation in stripped span while omitting sensitive tokens and message/token-count flattened attributes, and confirms prompt_tokens presence without completion/total/cost in partial span.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • NVIDIA/NeMo-Relay#207: Main PR's unit tests validate the OpenClaw replay and sanitization logic added in #207 across openinference, ATIF, and ATOF exporters.
  • NVIDIA/NeMo-Relay#206: Main PR adds tests that assert explicit usage cost preservation (cost_usd, cost.total, and aggregated metrics) directly validating the cost-surfacing logic in #206.
  • NVIDIA/NeMo-Relay#209: Main PR adds Rust unit tests for OpenClaw hook-only fallback sanitization aligned with #209's OpenClaw hook-backed provenance metadata tightening and "no overclaimed fields" assertions.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title follows Conventional Commits format with 'test' type, clear imperative summary, under 72 characters, and accurately describes the test-only changes across ATIF/ATOF/OpenInference.
Description check ✅ Passed Description includes all required template sections (Overview, Details, Where should the reviewer start, Related Issues) with comprehensive information about the test additions and validation scope.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@willkill07
Copy link
Copy Markdown
Member

/merge

@rapids-bot rapids-bot Bot merged commit 08345c6 into NVIDIA:main Jun 4, 2026
30 checks passed
@mnajafian-nv mnajafian-nv deleted the test/openclaw-hook-only-fallback-export-validation branch June 4, 2026 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lang:rust PR changes/introduces Rust code size:M PR is medium Test Test related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants