test: add focused unit tests for orchestrator budget guardrails#72
Conversation
|
@Fury03 is attempting to deploy a commit to the flamki's projects Team on Vercel. A member of the Team first needs to authorize it. |
|
@Fury03 Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits. You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀 |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 WalkthroughWalkthroughThis PR extracts budget enforcement logic into a reusable deterministic module, adds comprehensive unit and integration tests to validate guardrails and edge cases, refactors orchestrator to delegate budget operations to the new module, and configures CI to run all tests including budget validation. ChangesBudget Guardrails Extraction and Testing
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/test.yml:
- Around line 13-20: The workflow uses floating tags for actions and leaves
checkout credentials persisted; update the uses references to fixed commit SHAs
instead of actions/checkout@v4 and actions/setup-node@v4, and add
persist-credentials: false to the actions/checkout step to avoid keeping the
GITHUB_TOKEN in the workspace; keep the existing setup-node input (node-version:
20 and cache: npm) but reference the setup-node action by its pinned commit SHA
as well.
In `@src/agents/budget.js`:
- Around line 52-54: The exceedsBudget function currently allows non-finite
inputs (NaN, Infinity) to bypass budget checks; update exceedsBudget(totalSpent,
cost, budget) to first validate Number.isFinite for totalSpent, cost, and budget
and immediately return true (treat as over-budget) if any are not finite,
otherwise compute totalSpent + cost > budget as before; this prevents NaN
propagation and guardrail bypass while keeping the existing comparison logic in
place.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 61ac6a1c-56d4-4583-8e50-da71133cdcfa
📒 Files selected for processing (5)
.github/workflows/test.ymlpackage.jsonsrc/agents/budget.jssrc/agents/orchestrator.jstests/orchestrator.budget.test.js
| - name: Checkout | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Set up Node.js | ||
| uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: 20 | ||
| cache: npm |
There was a problem hiding this comment.
Harden GitHub Actions usage by pinning SHAs and disabling credential persistence.
Line 14 and Line 17 use tag refs instead of commit SHAs, and Line 14 doesn’t disable persisted credentials. This weakens CI supply-chain and token hygiene.
Suggested fix
- name: Checkout
- uses: actions/checkout@v4
+ uses: actions/checkout@<FULL_LENGTH_COMMIT_SHA>
+ with:
+ persist-credentials: false
- name: Set up Node.js
- uses: actions/setup-node@v4
+ uses: actions/setup-node@<FULL_LENGTH_COMMIT_SHA>
with:
node-version: 20
cache: npm🧰 Tools
🪛 zizmor (1.25.2)
[warning] 13-14: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 14-14: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 17-17: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/test.yml around lines 13 - 20, The workflow uses floating
tags for actions and leaves checkout credentials persisted; update the uses
references to fixed commit SHAs instead of actions/checkout@v4 and
actions/setup-node@v4, and add persist-credentials: false to the
actions/checkout step to avoid keeping the GITHUB_TOKEN in the workspace; keep
the existing setup-node input (node-version: 20 and cache: npm) but reference
the setup-node action by its pinned commit SHA as well.
| export function exceedsBudget(totalSpent, cost, budget) { | ||
| return totalSpent + cost > budget; | ||
| } |
There was a problem hiding this comment.
Treat non-finite values as over-budget to prevent guardrail bypass.
If cost (or totalSpent) is NaN, this predicate always returns false, which disables budget enforcement and can poison later accounting with NaN.
Proposed fix
export function exceedsBudget(totalSpent, cost, budget) {
- return totalSpent + cost > budget;
+ if (!Number.isFinite(totalSpent) || !Number.isFinite(cost) || !Number.isFinite(budget)) {
+ return true;
+ }
+ return totalSpent + cost > budget;
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/agents/budget.js` around lines 52 - 54, The exceedsBudget function
currently allows non-finite inputs (NaN, Infinity) to bypass budget checks;
update exceedsBudget(totalSpent, cost, budget) to first validate Number.isFinite
for totalSpent, cost, and budget and immediately return true (treat as
over-budget) if any are not finite, otherwise compute totalSpent + cost > budget
as before; this prevents NaN propagation and guardrail bypass while keeping the
existing comparison logic in place.
|
@Fury03 can you resolve this branch conflicts |
…get-guardrail-tests # Conflicts: # package.json # src/agents/orchestrator.js
|
@Flamki conflicts resolved. |
Summary
Describe what changed and why
Adds focused unit tests for the orchestrator's budget enforcement (Key Design Decision #2:
totalSpent + cost > budget), which was core logic with no dedicated test coverage.To make the guardrail testable in isolation, the budget/payment/summary logic is extracted from
orchestrate()into a new dependency-freesrc/agents/budget.js(same pattern as the existingsettlement-header.js).orchestrator.jsnow imports those helpers, so the tests guard the real code path rather than a re-implementation. This is a behavior-preserving refactor — the orchestration flow is unchanged; each extracted helper is a 1:1 mapping of the original inline expression.tests/orchestrator.budget.test.jsadds 33 tests covering:totalSpent + cost > budgetskip behavior, including the exact-boundary case (a step whose cost exactly consumes the remaining budget still runs — pins>against an accidental>=)paymentBucket,tallyPaymentOutcomes,paymentProtocolSummary)isBudgetExhausted, used/skipped counts, skip reason string)parseFloat→NaN)simulateRunthat proves spend never exceeds budget and the right later steps are skippedCI: adds
.github/workflows/test.yml(runs on push/PR) and wires the suite intonpm test, plus a standalonenpm run test:budget, so regressions fail fast.Two existing behaviors are pinned by tests rather than changed (so any future fix is intentional): a malformed price makes the guard fall through instead of skipping, and an "agent not found" entry counts toward
agentsUsed. Flagging the first for reviewer attention — silently overspending on a bad price could matter for real funds.Closes #19.
Validation
Checklist
.envor private keys committedREADME.mdand examples still accurateSummary by CodeRabbit
New Features
Tests
Chores