test: add focused unit tests for orchestrator budget guardrails #72

Merged
Flamki merged 2 commits into Flamki:master from Fury03:test/issue-19-budget-guardrail-tests
Jun 8, 2026

Conversation

@Fury03 Fury03 commented Jun 1, 2026

Contributor

Summary

Describe what changed and why

Adds focused unit tests for the orchestrator's budget enforcement (Key Design Decision #2: totalSpent + cost > budget), which was core logic with no dedicated test coverage.

To make the guardrail testable in isolation, the budget/payment/summary logic is extracted from orchestrate() into a new dependency-free src/agents/budget.js (same pattern as the existing settlement-header.js). orchestrator.js now imports those helpers, so the tests guard the real code path rather than a re-implementation. This is a behavior-preserving refactor — the orchestration flow is unchanged; each extracted helper is a 1:1 mapping of the original inline expression.

tests/orchestrator.budget.test.js adds 33 tests covering:

  • totalSpent + cost > budget skip behavior, including the exact-boundary case (a step whose cost exactly consumes the remaining budget still runs — pins > against an accidental >=)
  • mixed payment outcomes and the x402 → XLM fallback accounting (paymentBucket, tallyPaymentOutcomes, paymentProtocolSummary)
  • final summary totals and skipped-step reporting (isBudgetExhausted, used/skipped counts, skip reason string)
  • edge cases: zero budget, tiny budgets below the cheapest agent, and malformed-price parse failures (parseFloat → NaN)
  • an end-to-end simulateRun that proves spend never exceeds budget and the right later steps are skipped
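
As a rough illustration of the boundary case and the end-to-end check described above, here is a minimal sketch. The `exceedsBudget` body matches the snippet quoted later in the review thread; the `simulateRun` shown here is a hypothetical stand-in, not the suite's actual helper. The step shape (`name`, `cost`) and the choice to keep evaluating later steps after a skip are assumptions.

```javascript
// Guard as quoted in the review of src/agents/budget.js.
function exceedsBudget(totalSpent, cost, budget) {
  return totalSpent + cost > budget;
}

// Hypothetical simulation of the orchestrator's budget loop.
// Assumption: a skipped step does not end the run; later steps are still checked.
function simulateRun(steps, budget) {
  let totalSpent = 0;
  const ran = [];
  const skipped = [];
  for (const step of steps) {
    if (exceedsBudget(totalSpent, step.cost, budget)) {
      skipped.push(step.name);
      continue;
    }
    totalSpent += step.cost;
    ran.push(step.name);
  }
  return { totalSpent, ran, skipped };
}

// Costs are chosen to be exactly representable in binary floating point.
const steps = [
  { name: 'research', cost: 0.25 },
  { name: 'summarize', cost: 0.25 },
  { name: 'translate', cost: 0.5 },
];

// Exact boundary: the second step consumes the remaining 0.25 and still runs.
const exact = simulateRun(steps, 0.5);

// Zero budget: every step with a positive cost is skipped.
const none = simulateRun(steps, 0);

console.log(exact, none);
```

With a `>=` comparison instead of `>`, the `summarize` step in the first run would be skipped, which is the regression the boundary test is meant to catch.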

CI: adds .github/workflows/test.yml (runs on push/PR) and wires the suite into npm test, plus a standalone npm run test:budget, so regressions fail fast.

Two existing behaviors are pinned by tests rather than changed (so any future fix is intentional): a malformed price makes the guard fall through instead of skipping, and an "agent not found" entry counts toward agentsUsed. Flagging the first for reviewer attention — silently overspending on a bad price could matter for real funds.
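
To make the flagged fall-through concrete, the sketch below shows why a malformed price slips past the guard. The guard body is the one quoted in the review; the price string and the way the cost is later added to the running total are assumptions for illustration only.

```javascript
// Guard as quoted in the review of src/agents/budget.js.
function exceedsBudget(totalSpent, cost, budget) {
  return totalSpent + cost > budget;
}

// A price with no leading number parses to NaN.
const cost = parseFloat('free');

// Any comparison involving NaN is false, so the guard does not fire.
const guardFired = exceedsBudget(0.9, cost, 1.0);

// Assumption: the caller adds the cost to its running total after the step runs.
const poisonedTotal = 0.9 + cost;

// Once the total is NaN, even a clearly unaffordable step passes the guard.
const laterGuardFired = exceedsBudget(poisonedTotal, 5.0, 1.0);

console.log({ cost, guardFired, poisonedTotal, laterGuardFired });
```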

Closes #19.

Validation

  • Ran relevant local command(s)
  • Verified no secrets were added
  • Updated docs if behavior changed

Checklist

  • No .env or private keys committed
  • No sensitive data in screenshots/logs
  • README.md and examples still accurate

Summary by CodeRabbit

  • New Features

    • Deterministic budget guardrails: orchestrator now enforces budget limits, reports skipped steps, and summarizes payment protocols.
  • Tests

    • Added comprehensive unit and integration tests for budget logic and payment outcomes.
    • CI updated to run unit tests plus a dedicated budget test step.
  • Chores

    • Test scripts reorganized to allow independent execution of budget tests.

vercel Bot commented Jun 1, 2026


@Fury03 is attempting to deploy a commit to the flamki's projects Team on Vercel.

A member of the Team first needs to authorize it.

drips-wave Bot commented Jun 1, 2026


@Fury03 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

coderabbitai Bot commented Jun 1, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c1a86d26-bd92-4709-89ad-ff7a872e8076

📥 Commits

Reviewing files that changed from the base of the PR and between cc2139b and a519488.

📒 Files selected for processing (4)
  • package.json
  • src/agents/budget.js
  • src/agents/orchestrator.js
  • tests/orchestrator.budget.test.js
🚧 Files skipped from review as they are similar to previous changes (4)
  • package.json
  • src/agents/orchestrator.js
  • tests/orchestrator.budget.test.js
  • src/agents/budget.js

📝 Walkthrough

This PR extracts budget enforcement logic into a reusable deterministic module, adds comprehensive unit and integration tests to validate guardrails and edge cases, refactors orchestrator to delegate budget operations to the new module, and configures CI to run all tests including budget validation.

Changes

Budget Guardrails Extraction and Testing

| Layer / File(s) | Summary |
|---|---|
| Budget module core helpers: src/agents/budget.js (lines 1–70) | Agent cost parsing via parseFloat, remaining budget calculation, USDC amount formatting to 4 decimals, and budget-exceeds predicate using strict > semantics; deterministic construction of skip results and budget limit event payloads without timestamps. |
| Budget module outcome and summary helpers: src/agents/budget.js (lines 89–159) | Payment outcome classification into x402, stellar-xlm, or unpaid; tallying and protocol summarization from payment lists; exhaustion flag and used/skipped step counting from result objects. |
| Comprehensive budget test suite: tests/orchestrator.budget.test.js | In-file test harness with pass/fail tracking; unit tests for cost parsing edge cases, budget boundary conditions (NaN handling, zero budgets, accumulated overspend), amount formatting precision, skip result and event payload generation, payment classification and aggregation, rollup helper semantics; integration-style simulation that mirrors orchestrator's budget loop with multiple scenarios (generous, low, zero, and mixed payment budgets) and asserts final totals, skip counts, protocol summaries, and exhaustion flags. |
| Orchestrator refactoring to use budget module: src/agents/orchestrator.js | Imports budget helpers; replaces inline totalSpent + cost > budget checks with exceedsBudget; generates skip results and budget limit events via buildSkipResult and buildBudgetLimitEvent; classifies payment outcomes via paymentBucket for accurate payment counting; derives agentsUsed and agentsSkipped via countUsed(results) and countSkipped(results) instead of filtering; removes local paymentProtocolSummary in favor of imported version. |
| CI workflow and npm script configuration: .github/workflows/test.yml, package.json | GitHub Actions workflow runs on all pushes and pull requests with Node.js 20, npm cache, dependency install via npm ci, unit tests via npm test, and budget guardrail tests via npm run test:budget; npm scripts updated to add test:parser and test:budget aliases alongside the unified test command. |
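
The payment classification and 4-decimal formatting summarized above could look roughly like the sketch below. The real signatures of `paymentBucket` and `tallyPaymentOutcomes` are not shown on this page, so every name here (`sketchPaymentBucket`, `sketchTally`, `sketchFormatUsdc`) and the payment record shape (`paid`, `protocol`) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical record shape: { paid: boolean, protocol: 'x402' | 'stellar-xlm' }.
function sketchPaymentBucket(payment) {
  if (!payment || payment.paid !== true) return 'unpaid';
  if (payment.protocol === 'x402') return 'x402';
  if (payment.protocol === 'stellar-xlm') return 'stellar-xlm';
  return 'unpaid'; // unknown protocols are treated as unpaid in this sketch
}

// Count how many payments fall into each of the three buckets.
function sketchTally(payments) {
  const tally = { x402: 0, 'stellar-xlm': 0, unpaid: 0 };
  for (const payment of payments) {
    tally[sketchPaymentBucket(payment)] += 1;
  }
  return tally;
}

// Format an amount with exactly four decimal places.
function sketchFormatUsdc(amount) {
  return Number(amount).toFixed(4);
}

const tally = sketchTally([
  { paid: true, protocol: 'x402' },
  { paid: true, protocol: 'stellar-xlm' }, // e.g. after an x402 -> XLM fallback
  { paid: false, protocol: 'x402' },
  null,
]);

console.log(tally, sketchFormatUsdc(0.05));
```

Keeping these helpers free of timestamps and I/O is what lets a test suite assert on their output deterministically.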

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I’m a rabbit with a tiny scroll and pen,
I count the coins and guard the budget when—
Steps would overspend, I gently say “no,”
Tests keep me tidy, numbers all aglow.
🐇💸

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 70.59% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'test: add focused unit tests for orchestrator budget guardrails' clearly summarizes the main change: adding unit tests for budget enforcement logic. |
| Description check | ✅ Passed | The PR description covers the summary, implementation details, test coverage, and validation steps. It addresses all major aspects of the change and references issue #19. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #19 requirements: adds focused unit tests for budget guardrails, covers totalSpent + cost > budget skip behavior including boundary cases, validates payment outcomes and summaries, exercises edge cases, and adds CI workflow. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to issue #19: a new budget.js module with helpers, orchestrator.js refactored to use those helpers, a comprehensive test suite, and CI workflow additions. No unrelated changes detected. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test.yml:
- Around line 13-20: The workflow uses floating tags for actions and leaves
checkout credentials persisted; update the uses references to fixed commit SHAs
instead of actions/checkout@v4 and actions/setup-node@v4, and add
persist-credentials: false to the actions/checkout step to avoid keeping the
GITHUB_TOKEN in the workspace; keep the existing setup-node input (node-version:
20 and cache: npm) but reference the setup-node action by its pinned commit SHA
as well.

In `@src/agents/budget.js`:
- Around line 52-54: The exceedsBudget function currently allows non-finite
inputs (NaN, Infinity) to bypass budget checks; update exceedsBudget(totalSpent,
cost, budget) to first validate Number.isFinite for totalSpent, cost, and budget
and immediately return true (treat as over-budget) if any are not finite,
otherwise compute totalSpent + cost > budget as before; this prevents NaN
propagation and guardrail bypass while keeping the existing comparison logic in
place.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 61ac6a1c-56d4-4583-8e50-da71133cdcfa

📥 Commits

Reviewing files that changed from the base of the PR and between b9a790d and cc2139b.

📒 Files selected for processing (5)
  • .github/workflows/test.yml
  • package.json
  • src/agents/budget.js
  • src/agents/orchestrator.js
  • tests/orchestrator.budget.test.js

Comment on lines +13 to +20
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden GitHub Actions usage by pinning SHAs and disabling credential persistence.

Line 14 and Line 17 use tag refs instead of commit SHAs, and Line 14 doesn’t disable persisted credentials. This weakens CI supply-chain and token hygiene.

Suggested fix
       - name: Checkout
-        uses: actions/checkout@v4
+        uses: actions/checkout@<FULL_LENGTH_COMMIT_SHA>
+        with:
+          persist-credentials: false

       - name: Set up Node.js
-        uses: actions/setup-node@v4
+        uses: actions/setup-node@<FULL_LENGTH_COMMIT_SHA>
         with:
           node-version: 20
           cache: npm
🧰 Tools
🪛 zizmor (1.25.2)

[warning] 13-14: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 14-14: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 17-17: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test.yml around lines 13 - 20, The workflow uses floating
tags for actions and leaves checkout credentials persisted; update the uses
references to fixed commit SHAs instead of actions/checkout@v4 and
actions/setup-node@v4, and add persist-credentials: false to the
actions/checkout step to avoid keeping the GITHUB_TOKEN in the workspace; keep
the existing setup-node input (node-version: 20 and cache: npm) but reference
the setup-node action by its pinned commit SHA as well.

Comment thread src/agents/budget.js
Comment on lines +52 to +54
export function exceedsBudget(totalSpent, cost, budget) {
  return totalSpent + cost > budget;
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat non-finite values as over-budget to prevent guardrail bypass.

If cost (or totalSpent) is NaN, this predicate always returns false, which disables budget enforcement and can poison later accounting with NaN.

Proposed fix
 export function exceedsBudget(totalSpent, cost, budget) {
-  return totalSpent + cost > budget;
+  if (!Number.isFinite(totalSpent) || !Number.isFinite(cost) || !Number.isFinite(budget)) {
+    return true;
+  }
+  return totalSpent + cost > budget;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/agents/budget.js` around lines 52 - 54, The exceedsBudget function
currently allows non-finite inputs (NaN, Infinity) to bypass budget checks;
update exceedsBudget(totalSpent, cost, budget) to first validate Number.isFinite
for totalSpent, cost, and budget and immediately return true (treat as
over-budget) if any are not finite, otherwise compute totalSpent + cost > budget
as before; this prevents NaN propagation and guardrail bypass while keeping the
existing comparison logic in place.
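
For reference, the reviewer's proposed guard can be exercised as in the sketch below. The function body follows the proposed diff above; this page does not say whether the change was adopted before merge, so treat it as an illustration of the suggestion rather than the merged code. The sample values are arbitrary.

```javascript
// Guard following the reviewer's proposed fix: non-finite inputs count as over budget.
function exceedsBudget(totalSpent, cost, budget) {
  if (!Number.isFinite(totalSpent) || !Number.isFinite(cost) || !Number.isFinite(budget)) {
    return true;
  }
  return totalSpent + cost > budget;
}

// A malformed price now trips the guard instead of falling through.
const nanCost = exceedsBudget(0.5, parseFloat('free'), 1.0);

// The exact-boundary behavior is unchanged: spending the last 0.5 is still allowed.
const boundary = exceedsBudget(0.5, 0.5, 1.0);

// Side effect worth noting: an "unlimited" budget expressed as Infinity blocks every step.
const infiniteBudget = exceedsBudget(0, 0.25, Infinity);

console.log({ nanCost, boundary, infiniteBudget });
```

The trade-off is that fail-closed handling turns a bad price into a skipped step, which is safer for real funds but changes the behavior the PR's tests currently pin.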

Flamki commented Jun 2, 2026

Owner

@Fury03 can you resolve this branch's conflicts?

…get-guardrail-tests

# Conflicts:
#	package.json
#	src/agents/orchestrator.js

Fury03 commented Jun 8, 2026

Contributor Author

@Flamki conflicts resolved.

@Flamki Flamki merged commit bf7b36d into Flamki:master Jun 8, 2026
1 check failed

Development

Successfully merging this pull request may close these issues.

Add unit tests for orchestrator budget enforcement rules

2 participants