feat: concurrency per label with head-of-line skip (#1304) by isamu · Pull Request #1314 · receptron/graphai

isamu · 2026-05-02T23:20:09Z

Summary

GraphData.concurrency is extended to accept either a number (legacy) or { global: number, labels?: Record<string, number> }.
ComputedNodeData gains an optional label: string. When set on a node and the label has a configured limit, that node counts against the per-label cap as well as the global cap.
TaskManager walks the priority-sorted queue and dispatches the first task whose label still has capacity ("head-of-line skip"), so a saturated label cannot stall unrelated work.
100% backward compatible: concurrency: number and unlabeled nodes behave exactly as before.

Items to Confirm / Review

HoL skip vs strict priority order — when the highest-priority queued task is blocked by its label, lower-priority tasks of other labels are dispatched first. This is documented in the issue and intended (otherwise label limits would degenerate into global stalls), but it does mean priority is now "priority within a feasible-label set." Starvation of a hot label is possible by design; aging is split out to TaskManager: queue で待機中のタスクの優先度を時間経過で自動昇格 (aging) させたい #1311.
onComplete drains aggressively — when a label slot opens, onComplete keeps dispatching while the global limit allows. This is what makes mixed labeled/free-label workloads behave like the user expects, but please sanity-check the loop condition in task_manager.ts:108-114.
Nested graphs — prepareForNesting()/restoreAfterNesting() still touch only the global limit. Per-label limits are not bumped when nested graphs run. If a nested graph's tasks share a label with the parent, they share the cap. Flag if a different policy is wanted.
Validator messages — the new error messages (Concurrency.global must be ..., Concurrency.labels.<key> must be ...) follow the existing Concurrency must be ... style. Check the tone matches your preference.

User Prompt

現在のGraphAIでは concurrency は全体の同時実行数だけだが、特定LLM/外部APIごとに並列上限を設定したい（レート制限対策）。agent単位 / agent + model単位 / または node に label を付けて label単位、で制御したい。

例:
concurrency:
  global: 20
  labels:
    openai: 3
    vertex: 5
設計判断:

HoL skip 方式 (B案) — キュー先頭から走査して、global枠 AND label枠が空いている最初のタスクを dispatch

priority は既存の数値仕様をそのまま維持（数値が大きいほど優先）

aging / weight / RPM(token bucket) はそれぞれ別チケットに切り出した:

TaskManager: queue で待機中のタスクの優先度を時間経過で自動昇格 (aging) させたい #1311 aging

TaskManager: タスクに weight を持たせて concurrency を「重みの合計」で制御したい #1312 weight

TaskManager: label 単位の RPM/TPM レート制限（token bucket）を検討 #1313 token bucket / RPM・TPM

事前に変更箇所のユニットテストが足りているかチェックし、不足していれば現状動作を locked-in するテストを足してから実装すること。

Implementation Approach

Order of work (TDD-ish)

Coverage gap audit — TaskManager had zero direct unit tests. Validator only had 4 negative tests for concurrency.
Characterization tests first — added 9 tests locking current TaskManager behavior (priority order, FIFO, prepareForNesting, isRunning, reset, getStatus, etc.) and 3 positive tests for the validator. Confirmed all 212 baseline tests pass on unchanged code.
Implement — type.ts, task_manager.ts, node.ts, validator.
Add label-feature tests — 10 new TaskManager tests + 7 new validator tests covering object-form concurrency.
Verify — yarn build, yarn test (all 19 workspaces, fail 0), yarn format.

Files changed

File	Change
`packages/graphai/src/type.ts`	Add `ComputedNodeData.label`, `ConcurrencyConfig`, union `concurrency`
`packages/graphai/src/task_manager.ts`	Constructor accepts union; HoL skip; per-label running counts
`packages/graphai/src/node.ts`	Read `data.label` into `ComputedNode.label`
`packages/graphai/src/validators/common.ts`	`label` added to allowed computed-node attribute keys
`packages/graphai/src/validators/graph_data_validator.ts`	Validate object-form concurrency
`packages/graphai/tests/units/test_task_manager.ts`	New — 19 tests
`packages/graphai/tests/validators/test_validator_graph_data.ts`	+10 tests
`plans/feat-concurrency-per-label.md`	New — design doc

Algorithm sketch (TaskManager.dequeueTaskIfPossible)

if global_running >= global_limit: return
for task in queue (priority desc):
    label = task.node.label
    if label is None or label not in labelLimits:
        dispatch(task); return
    if running_by_label[label] < labelLimits[label]:
        dispatch(task); return
return  # nothing eligible right now; wait for next onComplete

onComplete then loops dequeueTaskIfPossible while progress is being made, so a single completion can dispatch multiple newly-eligible queued tasks (bounded by the global limit).

Validation

yarn build — succeeds across all workspaces
yarn test — 19 workspaces, 0 failures (graphai package: 212 → 229 tests after this PR)
yarn format — clean
yarn pack --dry-run not applicable (no package.json change)
yarn eslint — preexisting errors only in untracked scratch files (tests/ensemble.ts, tests/runner.ts, tests/units/test_aaa.ts, tests/units/test_graph_aaa.ts); no errors introduced by this PR

Test plan

CI passes
Manual: write a graph with concurrency: { global: 10, labels: { openai: 2 } }, send 6 nodes labeled openai and confirm only 2 run concurrently while other labels run freely
Manual: confirm legacy concurrency: 5 graphs continue to work without changes

Out of scope (followups)

TaskManager: queue で待機中のタスクの優先度を時間経過で自動昇格 (aging) させたい #1311 — aging / starvation prevention
TaskManager: タスクに weight を持たせて concurrency を「重みの合計」で制御したい #1312 — weighted concurrency (heavy LLM calls consume multiple slots)
TaskManager: label 単位の RPM/TPM レート制限（token bucket）を検討 #1313 — RPM/TPM rate limiting (token bucket)

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Optional node labels for grouping tasks and applying per-label concurrency limits.
- Scheduler now respects per-label limits and skips blocked queue heads so eligible tasks can run.
Bug Fixes
- Nesting/preparation now correctly preserves and restores label-aware concurrency state across early exits.
Tests
- Expanded tests covering label validation, label-aware scheduling, nesting, and concurrency behaviors.
Documentation
- Added design doc describing label-based concurrency controls.

Why: a single global concurrency limit cannot model per-API rate concerns (e.g. limiting OpenAI calls to 3 in flight while keeping unrelated nodes running freely). Adds optional per-label limits that compose with the existing global limit. - type.ts: add ComputedNodeData.label and ConcurrencyConfig union for GraphData.concurrency. Number form remains valid. - task_manager.ts: constructor accepts number | ConcurrencyConfig. dequeueTaskIfPossible walks the priority-sorted queue and dispatches the first task whose label still has capacity (HoL skip), so a saturated label cannot block other work. onComplete drains as many newly-eligible tasks as the global limit allows. - node.ts / validators: read and validate the new fields. Tests: - New tests/units/test_task_manager.ts (19 tests) - locks current TaskManager behavior (priority order, FIFO, nesting, reset) and covers the new label semantics (HoL skip, mixed labeled/unlabeled, slot release). - tests/validators/test_validator_graph_data.ts gains 10 tests for the object-form concurrency validator. Backward compatibility: concurrency: number and unlabeled nodes behave identically to before. prepareForNesting still bumps the global limit only. Follow-ups split out as separate issues: - #1311 aging (queued-task priority bump) - #1312 weight (weighted concurrency) - #1313 token-bucket RPM/TPM (with rationale for why concurrency-only is insufficient for TPM and burst control) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-02T23:20:20Z

Warning

Rate limit exceeded

@isamu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 35 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c3d784c6-be0b-42be-8661-4b9a0407714b

📥 Commits

Reviewing files that changed from the base of the PR and between 5d7314c and 0e8e395.

📒 Files selected for processing (7)

packages/graphai/src/graphai.ts
packages/graphai/src/node.ts
packages/graphai/src/task_manager.ts
packages/graphai/src/utils/utils.ts
packages/graphai/src/validators/graph_data_validator.ts
packages/graphai/tests/units/test_task_manager.ts
plans/feat-concurrency-per-label.md

📝 Walkthrough

Walkthrough

Adds optional per-node label and a ConcurrencyConfig union for GraphData.concurrency; TaskManager enforces global and optional per-label concurrency with head-of-line skipping; validators validate the new config and label; nested-execution nesting uses label-aware prepare/restore; comprehensive unit and validator tests plus design doc added.

Changes

Per-Label Concurrency Control

Layer / File(s)	Summary
Type & Schema Extensions `packages/graphai/src/type.ts`	Adds `label?: string` to `ComputedNodeData`, introduces exported `ConcurrencyConfig = { global: number; labels?: Record<string, number> }`, and makes `GraphData.concurrency?: number
Node wiring `packages/graphai/src/node.ts`	`ComputedNode` gains `public readonly label?: string` (set only when `data.label` is a string). `execute()` uses `prepareForNesting(this.label)` / `restoreAfterNesting(this.label)` with a try/finally guarded `nestingPrepared` flag.
Validator Schema Updates `packages/graphai/src/validators/common.ts`, `packages/graphai/src/validators/computed_node_validator.ts`, `packages/graphai/src/validators/graph_data_validator.ts`	`computedNodeAttributeKeys` now includes `"label"`. `computedNodeValidator` enforces `label` is a string when present. `graph_data_validator` adds `isPlainObject`, `validateConcurrencyValue`, and `validateConcurrencyConfig` to validate `number
Core TaskManager Label-Aware Logic `packages/graphai/src/task_manager.ts`	Constructor now accepts `number
Tests & Documentation `packages/graphai/tests/units/test_task_manager.ts`, `packages/graphai/tests/validators/test_validator_graph_data.ts`, `packages/graphai/tests/validators/test_validator_computed_node.ts`, `plans/feat-concurrency-per-label.md`	Extensive unit tests exercise global concurrency, FIFO/priority semantics, per-label limits, independent label scheduling, HoL skipping, unlabeled/unconfigured-label behavior, nested-label bumping behavior, `prepareForNesting`/`restoreAfterNesting`, `reset`, and verbose status. Validator tests cover numeric and object-form concurrency (valid and many malformed cases). Added design doc describing behavior and backward compatibility.

Sequence Diagram

sequenceDiagram
    participant Client as Task Producer
    participant Manager as TaskManager
    participant Queue as Priority Queue
    participant Running as Running Tracker
    participant Worker as Task Executor

    Client->>Manager: addTask(task { label? }, priority)
    Manager->>Manager: normalizeConcurrencyConfig / canRun(task)
    alt eligible (global && label OK)
        Manager->>Running: increment global & label counters
        Manager->>Worker: dispatch task callback
    else ineligible
        Manager->>Queue: enqueue task (priority-sorted)
    end

    Worker->>Manager: onComplete(task)
    Manager->>Running: decrement global & label counters
    Manager->>Queue: scan queue (HoL skip) for first eligible task
    loop dispatch while eligible tasks exist
        Manager->>Running: increment counters
        Manager->>Worker: dispatch next eligible task
        Manager->>Queue: remove dispatched entry
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I hop along queues with a label so neat,
Global gates open, but some labels I must seat.
If one line is full, I glance and skip ahead,
Let others sprint past while the tight ones tread.
A tidy garden of tasks — concurrency well-fed!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and specifically describes the main change: adding per-label concurrency control with head-of-line skip dispatching strategy.
Linked Issues check	✅ Passed	The implementation fully addresses issue `#1304` requirements: per-label concurrency limits alongside global limits [`#1304`], head-of-line skip to prevent label saturation from stalling unrelated work [`#1304`], label support on nodes [`#1304`], and backward compatibility with numeric concurrency [`#1304`].
Out of Scope Changes check	✅ Passed	All changes are directly scoped to implementing per-label concurrency control and associated validation. The plan document and tests are supporting artifacts for the feature.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/concurrency-per-label

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Review rate limit: 0/1 reviews remaining, refill in 6 minutes and 35 seconds.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/graphai/src/validators/common.ts`:
- Line 16: computedNodeValidator currently accepts a `label` key without
checking its type, allowing non-string values (e.g., numbers/booleans) to pass
and bypass TaskManager's per-label limits; update the validator (the
computedNodeValidator in common.ts) to validate that node.label is a string
before accepting the key—reject or throw for non-string label values so only
string labels are permitted and TaskManager label-based caps work correctly.

In `@packages/graphai/src/validators/graph_data_validator.ts`:
- Around line 18-32: The concurrency object validation must reject unknown keys:
after verifying concurrency is an object and that "global" exists, ensure the
object only contains the allowed keys "global" and optionally "labels" by
inspecting Object.keys(concurrency) and throwing a ValidationError if any other
key is present; keep using validateConcurrencyValue(concurrency.global,
"Concurrency.global") and the existing labels branch (typeof concurrency.labels
checks and validateConcurrencyValue for each label) but add the extra-key check
before validating values so typos like "label" fail fast.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ba501093-d1c6-4063-9234-68a8b93bcbc3

📥 Commits

Reviewing files that changed from the base of the PR and between cb15212 and 591f365.

📒 Files selected for processing (8)

packages/graphai/src/node.ts
packages/graphai/src/task_manager.ts
packages/graphai/src/type.ts
packages/graphai/src/validators/common.ts
packages/graphai/src/validators/graph_data_validator.ts
packages/graphai/tests/units/test_task_manager.ts
packages/graphai/tests/validators/test_validator_graph_data.ts
plans/feat-concurrency-per-label.md

…label

Findings from Codex (#1314 review): 1. prepareForNesting/restoreAfterNesting were label-unaware, allowing a deadlock when a parent node holding the only slot for label X invokes a nested graph whose children also use label X. Fix: both methods now accept an optional label argument; node.ts passes its own label so the parent's label-bucket gets a temporary +1 slot for the nested children, mirroring the existing global-slot escape hatch. 2. ComputedNode.label was copied through without runtime validation, so a malformed (non-string) label from YAML/JSON would silently bypass the per-label limit (Map<string, ...>.get(<non-string>) -> undefined, treated as untracked). Fix: defensive coercion in ComputedNode + a schema check in computed_node_validator. Also addresses user feedback to lean on existing utility predicates: - task_manager.normalizeConcurrencyConfig now uses isObject for the labels guard. - graph_data_validator now uses isObject + isNull instead of inlined null/typeof/Array.isArray combinations. Tests added: - TaskManager: nested-graph deadlock prevention, untouched behavior when label is undefined or unconfigured. - computed_node_validator: rejects non-string label, accepts string label. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

isamu · 2026-05-02T23:38:23Z

Codex cross-review iteration 1 — addressed

Codex flagged two issues (network-isolated, so verdict was returned to local CLI rather than posted here). Both addressed in 96ba009:

1. Nested-graph deadlock with per-label limits

Codex finding: prepareForNesting() only bumps the global slot. If a parent task with label openai (limit 1) invokes a nested graph that also wants label openai, the children queue forever — parent waits for children, children wait for the label slot the parent owns.

Fix: prepareForNesting(label?) and restoreAfterNesting(label?) now also bump the parent's label limit when the label has a configured cap. node.ts passes this.label to both. Behavior when label is undefined or unconfigured is unchanged.

2. Runtime label validation

Codex finding: data.label was copied straight to ComputedNode.label. A non-string label from YAML/JSON would bypass the label bucket because Map<string, ...>.get(<non-string>) returns undefined → treated as untracked.

Fix:

Defensive coercion in ComputedNode constructor: keep label only if it is actually a string at runtime.
Schema check added to computed_node_validator: Computed node label must be a string.

Also (per user feedback)

Refactored task_manager.normalizeConcurrencyConfig and graph_data_validator to use the project's isObject / isNull utilities instead of inlined null/typeof/Array.isArray checks.

Tests added (graphai package now: 229 → 235)

test_task_manager.ts: 3 new tests for nested-graph deadlock prevention and the no-op cases (undefined / unconfigured label).
test_validator_computed_node.ts: 2 new tests for label type validation.

All 19 workspace test suites pass (yarn test fail count: 0). yarn build clean.

Findings from Codex iteration-2: 1. Exception safety of nested-graph slot bookkeeping. prepareForNesting was paired with a happy-path-only restoreAfterNesting, so a throw inside the agent body leaked both the global concurrency bump and (after iteration-1) the label bump. This was a pre-existing bug for the global slot but became more impactful with the label addition. Fix: wrap the nested execution block in try/finally so restoreAfterNesting always runs. 2. validateConcurrencyConfig was accepting concurrency.labels: null after the iteration-1 isNull refactor relaxed the check. The declared type is Record<string, number>; null is malformed. Fix: compare against undefined directly so only the absent sentinel skips validation. 3. Iteration-1 tests did not actually exercise the label decrement path -- the only post-restore assertion was the global concurrency value. Strengthened test_task_manager.ts so the nested-graph test schedules two openai tasks after restoreAfterNesting and confirms one queues (which would not happen if the label limit were left bumped). Plus, per user feedback to cover negative patterns more thoroughly, broadened the validator suites: - test_validator_computed_node.ts: label rejected for null, array, object, boolean (in addition to number); valid for omitted and string. - test_validator_graph_data.ts: top-level concurrency rejected for null, boolean, array. concurrency.global rejected for string, null, negative, float. concurrency.labels.<key> rejected for string, null, negative, boolean, object. Empty-key edge case still validates. graphai package tests: 235 -> 254. All 19 workspace test suites pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

isamu · 2026-05-03T00:08:11Z

Codex cross-review iteration 2 — addressed (`f804289`)

1. Exception safety of nested-graph slot bookkeeping

Codex finding: prepareForNesting() was paired with a happy-path-only restoreAfterNesting(), so a throw inside agentFilterHandler leaked both the global slot and the new label bump. Pre-existing for the global slot; iteration-1 made it worse by adding label leakage.

Fix: node.ts now wraps the nested-execution block in try/finally so restoreAfterNesting(this.label) always runs, even on throw.

2. `concurrency.labels: null` snuck through after iteration-1 refactor

Codex finding: I had switched the labels presence check to !isNull(...), which made null pass when it shouldn't (the declared type is Record<string, number>).

Fix: replaced with concurrency.labels !== undefined. Only undefined skips validation; null and other malformed shapes are rejected.

3. Iteration-1 nesting test did not actually verify the label decrement

Codex finding: my nesting test only checked concurrency after restoreAfterNesting. A broken label decrement could have passed silently.

Fix: rebuilt the test with global=2, labels: { openai: 1 }. After the full nested cycle, two new openai tasks are scheduled — only one runs and one queues, which is impossible if the label decrement was missed (a leaked bump would let both run since the global limit is 2).

Plus, per user direction to cover negative patterns more thoroughly

Extended validator suites:

test_validator_computed_node.ts: label rejected for null, array, object, boolean (in addition to number); valid for omitted and string.
test_validator_graph_data.ts: top-level concurrency rejected for null/boolean/array; concurrency.global rejected for string/null/negative/float; concurrency.labels.<key> rejected for string/null/negative/boolean/object; empty-key edge case still validates.

Test counts

graphai package: 235 → 254 tests (all pass). 19 workspace suites: fail count 0.

Findings from Codex iteration-3: 1. The iteration-2 try/finally block did not protect the forNestedGraph construction. resultOf() can throw when nestedGraph is a DataSource, leaving prepareForNesting's bump unrestored. Fix: move both prepareForNesting and the forNestedGraph build inside the try block, and gate restoreAfterNesting on a `nestingPrepared` flag that is set only after prepareForNesting actually returned. This way: - throw inside the build -> finally runs, restore is called. - throw inside prepareForNesting itself -> flag stays false, restore is NOT called (no bump to undo). 2. validateConcurrencyConfig accepted any non-array object for labels, so Map/Date/class instances would pass typeof === "object" and silently disable enforcement (Object.entries yields nothing). It also ignored unexpected extra keys on the concurrency object, so a typo like `labls` would be silently dropped. Fix: introduce isPlainObject (proto === Object.prototype || null) and require it for both the concurrency object and labels. Reject any keys other than "global"/"labels". Tightened validateConcurrencyValue to also require typeof === "number" before the Number.isInteger check. Tests: - Reject extra keys on concurrency: { global, foo }. - Reject Map / Date as concurrency value (call validateGraphData directly to bypass anonymization's JSON round-trip, which would otherwise mask the class identity). - Reject Map / class instance as labels value. graphai package tests: 254 -> 261. All 19 workspace suites pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

isamu · 2026-05-03T00:26:35Z

Codex cross-review iteration 3 — addressed (`5d7314c`)

1. try/finally still left a gap before agent invocation

Codex finding: prepareForNesting ran first, but context.forNestedGraph = { graphData: this.graph.resultOf(...) ... } could throw before the try block, leaking the bump.

Fix: pulled both prepareForNesting and the forNestedGraph construction into the try block. Added a nestingPrepared flag set only after prepareForNesting returned, so the finally restores only when there is a bump to undo (correct on three paths: build throws → restore; agent throws → restore; prepareForNesting itself throws → no restore).

2. Validator accepted non-plain-object shapes

Codex finding: Map, Date, class instances, etc. would pass isObject(...) but yield nothing from Object.entries(), silently disabling label enforcement. Also, extra/typo keys on the concurrency object were ignored.

Fix:

Added isPlainObject (proto === Object.prototype || null) and require it for both the concurrency object and labels.
Reject any keys on the concurrency object other than global / labels (catches typos like labls).
Tightened validateConcurrencyValue to require typeof === "number" before Number.isInteger.

Tests

Reject extra keys on concurrency: { global: 5, foo: 1 } → Concurrency object does not allow foo.
Reject Map / Date as concurrency value, and Map / class instance as labels value. These tests bypass anonymization (which JSON-roundtrips and would mask class identity) by calling validateGraphData directly.

Test counts

graphai package: 254 → 261 tests (all pass). 19 workspace suites: fail count 0.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/graphai/src/task_manager.ts`:
- Around line 127-138: prepareForNesting currently increments the global
labelLimits (mutating labelLimits) which temporarily widens the shared cap and
lets unrelated tasks with the same label consume that slot; instead, stop
mutating labelLimits and implement a per-nesting reservation tracked on the
TaskManager instance (e.g., a new Map like nestingReservations keyed by the
nesting path or a per-run id) that increments on prepareForNesting(label) and
decrements on restoreAfterNesting(label), and change throttling/availability
checks to consult labelLimits minus any global usage plus the current run's
nestingReservations (so only the current nested path sees the reserved
capacity); update prepareForNesting and restoreAfterNesting to operate on this
per-nesting reservation map and add a regression test that runs a labeled parent
that nests without using the label while a queued sibling with the same label is
waiting and assert the sibling does not start during the parent’s nesting
window.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0f890848-88fa-470f-a785-d9decc27ce65

📥 Commits

Reviewing files that changed from the base of the PR and between 591f365 and 5d7314c.

📒 Files selected for processing (7)

packages/graphai/src/node.ts
packages/graphai/src/task_manager.ts
packages/graphai/src/validators/computed_node_validator.ts
packages/graphai/src/validators/graph_data_validator.ts
packages/graphai/tests/units/test_task_manager.ts
packages/graphai/tests/validators/test_validator_computed_node.ts
packages/graphai/tests/validators/test_validator_graph_data.ts

isamu · 2026-05-03T00:50:18Z

Codex cross-review iteration 4 — LGTM

Codex returned LGTM with one nit: isPlainObject compares against the current realm's Object.prototype, so a plain object created in a different JS realm would be rejected.

Decision: defer. GraphAI's graph data is sourced from JSON.parse or in-process programmatic construction; cross-realm objects are not a realistic path for this codebase. The constraint matches lodash's isPlainObject and similar libraries. If a real cross-realm use case appears, it is a one-line swap to Object.prototype.toString.call(x) === "[object Object]".

Final status

Codex iterations 1-4: 5 substantive findings (all MUST-FIX), all addressed.
graphai package tests: 212 (pre-PR baseline) → 261 after this PR.
19 workspace test suites: fail count 0 at every iteration.
yarn build clean across the monorepo.

Ready for human review and CI green-light.

CodeRabbit raised that prior iterations widened the shared per-label cap during nesting (labelLimits++), which meant unrelated tasks sharing that label could opportunistically take the bumped slot -- defeating the throttle even when the nested graph never used the label. Fix: introduce a per-parent-graphId bypass map. nestingBypassByLabel: Map<label, Map<parentGraphId, count>> prepareForNesting(label, parentGraphId) records a bypass for that parent without touching labelLimits. canRun() consults the bypass ONLY for tasks whose graphId differs from the parent's. So: - parent on graphId X with label openai (limit 1) is running - sibling on graphId X with label openai is queued - parent invokes nested graph -> bypass recorded for parent X - canRun(sibling): same graphId as parent -> bypass NOT applied -> sibling stays queued (correct: per-label throttle preserved) - canRun(child on nested graphId Y): different from parent -> +1 cap -> child runs (correct: deadlock avoided) Also: prepareForNesting now drains the queue (its global slot bump or label bypass can immediately enable a queued task). The Map<Map<>> shape is necessary: a single graph can host two labeled nodes simultaneously invoking nested graphs, so refcounting per parentGraphId is required (a Set would deduplicate). Kept as the minimum data shape that handles refcount correctly. Documented residual edge case in test file: two completely independent top-level graphs sharing a TaskManager would still let an "unrelated" queued node consume the bypass, because graphId comparison alone cannot distinguish "my actual nested child" from "an unrelated independent graph sharing this manager." That scenario is not part of GraphAI's documented hierarchical usage and would require threading parent-child graphId binding through the GraphAI constructor (out of scope). Tests: - New: child on nested graphId runs, sibling on parent's graphId blocks. - Updated existing nesting tests to use the (label, parentGraphId) signature and to model real nested-vs-sibling graphId topology. graphai package tests: 261 -> 264. All 19 workspace suites pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

isamu · 2026-05-03T01:20:39Z

CodeRabbit finding addressed (`77ec2cf`)

CodeRabbit pointed out a subtle but real bug not caught by Codex: prior iterations widened the shared per-label cap during nesting (labelLimits.set(label, +1)). That defeats the per-label throttle for any unrelated task sharing the label, because the widened cap is global to the TaskManager — even if the nested graph never uses that label, an arbitrary sibling could sneak through during the nesting window.

Fix

prepareForNesting no longer mutates labelLimits. Instead it records a per-parent-graphId bypass:

nestingBypassByLabel: Map<label, Map<parentGraphId, count>>

canRun consults the bypass only for tasks whose graphId differs from the parent's. Concretely:

task	graphId vs parent's	running < limit?	bypass applies?	result
sibling	same	full	no (same graphId)	queued ✅ throttle preserved
nested child	different	full	yes	runs ✅ deadlock avoided

The Map-of-Maps shape is the minimum needed for correctness: a single outer graph can host two labeled nodes invoking nested graphs simultaneously, so refcount per parentGraphId is required (a Set would deduplicate the second prepareForNesting from the same graph and lead to a premature decrement on the first restore).

Other adjustments

prepareForNesting now drains the queue, since freeing global capacity or activating a label bypass can immediately unblock queued tasks (previously this was implicit because the next addTask would trigger dispatch).
Existing nested-graph test rewritten to model real topology (parent on outer graphId, child on nested graphId) and to assert the sibling-blocks-during-nesting property.

Documented residual edge case

Two completely independent top-level graphs sharing a single TaskManager would still let an "unrelated" queued node consume the bypass, because graphId comparison alone cannot distinguish "my actual nested child" from "an independent graph that happens to share this manager." That is not part of GraphAI's documented hierarchical usage; closing it would require threading parent→child graphId binding through the GraphAI constructor, which is out of scope here. Documented in test_task_manager.ts.

Test counts

graphai package: 261 → 264 (all pass). 19 workspace suites: fail count 0.

snakajima · 2026-05-03T13:52:49Z

Cross-review evaluation of CodeRabbit + Codex findings

I reviewed every reviewer comment on this PR and evaluated whether each finding is valid and whether the fix is correct. Summary first, details below.

Source	Findings	All addressed?	Quality of fixes
CodeRabbit inline	3	yes	Strong; the `nestingBypassByLabel` design is better than what was suggested
Codex iter 1	2	yes	Deadlock fix + label runtime check
Codex iter 2	3	yes	Exception-safety catch was the most valuable single finding
Codex iter 3	2	yes	Tightened both try/finally guard and validator plain-object check
Codex iter 4	0 (LGTM, 1 nit)	deferred	Reasonable

All actionable findings are valid and have been correctly addressed. No reviewer comment was dismissed without a defensible reason.

CodeRabbit comment 1 — `label` runtime type validation

Valid and important. Fix is correct.

Without it, label: 123 from YAML/JSON would pass schema validation, then silently bypass per-label limits because Map<string, number>.get(123) is undefined (treated as untracked). Author went further than the suggestion — also added defensive runtime coercion in ComputedNode constructor (node.ts:125), so even if validation is bypassed via direct programmatic construction, label still defaults to undefined for non-strings. Six negative tests lock the contract.

CodeRabbit comment 2 — Reject unknown keys in `concurrency` config

Valid and important. Fix is correct.

Without it, a typo like concurrency: { global: 5, labls: { openai: 2 } } would validate fine and run with no per-label enforcement. Fix uses a concurrencyConfigKeys whitelist tied to keyof ConcurrencyConfig — type-safe and minimal.

CodeRabbit comment 3 — Don't widen shared `labelLimits` during nesting (the big one)

Valid and substantive. Fix exceeds what was suggested.

This was the most consequential finding. Earlier iterations did labelLimits.set(label, current+1) during prepareForNesting, which globally widened the cap — an unrelated openai sibling could run during the parent's nesting window even if the nested graph never used openai, silently breaking the rate-limit promise.

CodeRabbit suggested a "per-nesting reservation map keyed by nesting path or per-run id." The author chose a cleaner shape: Map<label, Map<parentGraphId, refcount>>, consulted in canRun only for tasks with graphId !== parentGraphId. The reasoning is solid:

Same-graphId siblings → bypass not applied → cap preserved
Different-graphId children (the actual nested ones) → +1 capacity → deadlock avoided
Refcount is needed (not a Set) because a single graph can host two labeled nodes nesting simultaneously

The new test TaskManager nesting allows child on different graphId, blocks sibling on same graphId directly exercises the regression scenario, plus tears down and re-checks bookkeeping. Above the bar.

The author honestly documented one residual edge case (two completely independent top-level GraphAI instances sharing a TaskManager). This is not the documented hierarchical pattern, and the workaround is documented in the test file. Reasonable to defer.

Codex iteration 1 — Nested-graph deadlock + label runtime check

Both valid and correctly addressed. The deadlock fix was later superseded by the per-parent-graphId bypass (CodeRabbit-driven), which is the more correct shape.

Codex iteration 2 — Exception safety + null labels + weak test

All three valid. The exception safety catch was the strongest single finding in the entire review cycle: an agent throw between prepareForNesting and restoreAfterNesting leaked the bump. This was a preexisting bug for the global slot, but the label addition made the leak more impactful. Try/finally fix is correct.

The concurrency.labels: null finding was a legitimate regression from the iteration-1 isNull refactor.

The test-strength critique was fair — iteration-1's nested test only checked concurrency after restore, not the label decrement. The strengthened test properly exercises both.

Codex iteration 3 — try/finally gap + non-plain-object validator

Both valid.

try/finally gap: subtle. prepareForNesting was called before the try block, but context.forNestedGraph = { graphData: this.graph.resultOf(...) ... } could throw between prepare and the try. The nestingPrepared flag pattern is the right resolution — restore only when there's a bump to undo. Verified at node.ts:356-407.
Non-plain-object accept: Map, Date, class instances all pass typeof === "object" but yield no entries from Object.entries, silently disabling enforcement. The isPlainObject (proto check) fix is right. The tests bypass anonymization (which would JSON-roundtrip these to plain objects) by calling validateGraphData directly — a thoughtful test design choice.

Codex iteration 4 — LGTM with cross-realm `isPlainObject` nit

Author's defer is correct. Object.getPrototypeOf(x) === Object.prototype doesn't cross JS realms (iframes, vm contexts, worker boundaries). Realistic vector for GraphAI? Effectively zero — graph data comes from JSON.parse (always current realm) or in-process construction. Lodash, type-fest, etc. have the same constraint. One-line swap available if a real case ever appears.

Minor items not raised by any reviewer

this.concurrency / reset() interaction: prepareForNesting does this.concurrency++, restoreAfterNesting does this.concurrency--. If reset() is called between them, nestingBypassByLabel is cleared but this.concurrency retains its bumped value, and the eventual restoreAfterNesting decrements below the original. Pre-existing latent bug, not introduced here. Worth a future ticket if anyone hits it.
Naming inconsistency with the plan doc: plans/feat-concurrency-per-label.md:55 says globalLimit: number but the field is named concurrency. Cosmetic.
Direct TaskManager construction bypasses the validator: new TaskManager({ global: 5, labels: ["openai", 2] as any }) would let isObject through normalizeConcurrencyConfig. Only relevant for non-public construction paths; could harden with the same isPlainObject check inside task_manager.ts.

None of these are blockers.

Bottom line

Reviewer engagement was high-signal, author's responses are thorough, iteration trajectory is healthy. Ready for human merge approval from my standpoint.

🤖 Generated with Claude Code

Followups from snakajima's audit on #1314: 1. plan doc had a stale name `globalLimit` that did not match the actual field `concurrency` in TaskManager. Fixed. 2. TaskManager constructor previously silently coerced a malformed ConcurrencyConfig (e.g. labels as Array, Map, or class instance) into something benign, so direct callers bypassing validateGraphData could end up with broken label enforcement without any error. Fix: - Promoted `isPlainObject` to packages/graphai/src/utils/utils.ts (single source of truth, used by both task_manager and the graph_data_validator). - Hardened `normalizeConcurrencyConfig` to fail fast: rejects non-plain-object configs, non-integer / non-positive values, non-plain-object labels, and bad label values. - Reordered GraphAI constructor to run validateGraphData / validateAgent BEFORE constructing TaskManager, so user-facing ValidationError keeps wrapping malformed graph data instead of TaskManager's internal Error. New tests in test_task_manager.ts cover: zero/negative/non-integer concurrency, null/array/Date as config, Map/class-instance/Array as labels, and non-integer label values. Out of scope for this PR (filed as #1315): - reset() does not restore the original concurrency value, so prepare/restore left dangling across reset() can drift the global cap. Pre-existing bug, separate issue. graphai package tests: 264 -> 270. All 19 workspace suites pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

isamu · 2026-05-03T23:08:00Z

Followups from @snakajima's audit (`0e8e395`)

1. Plan doc naming inconsistency — fixed

plans/feat-concurrency-per-label.md had globalLimit: number in the implementation sketch, but the actual TaskManager field is concurrency. Updated the plan to match the code.

2. Direct TaskManager construction not guarded — fixed

The previous normalizeConcurrencyConfig used isObject for the labels guard, which lets Array, Map, Date, class instances, etc. through. A consumer doing new TaskManager({ global: 5, labels: ["openai", 2] as any }) would silently end up with Object.entries-derived garbage in the limits map.

Promoted isPlainObject to packages/graphai/src/utils/utils.ts so it's the single source of truth shared by task_manager.ts and graph_data_validator.ts.
Hardened normalizeConcurrencyConfig to fail fast on malformed shapes (non-plain-object config, non-positive-integer values, non-plain-object labels).
Reordered GraphAI constructor to run validateGraphData / validateAgent before constructing the TaskManager, so user-facing ValidationError still wraps malformed graph data; TaskManager's own guard is a backstop for direct callers that bypass validation.

7 new tests in test_task_manager.ts covering: zero/negative/non-integer concurrency, null/array/Date as the whole config, Map/class-instance/Array as labels, and non-integer label values.

3. Out of scope — filed as separate issue

#1315 — reset() does not restore the original concurrency value, so a reset() between prepareForNesting and restoreAfterNesting can drift the global cap. Pre-existing bug; not blocking this PR.

Test counts

graphai package: 264 → 270 (all pass). 19 workspace suites: fail count 0.

CI will rerun automatically.

coderabbitai Bot reviewed May 2, 2026

View reviewed changes

Comment thread packages/graphai/src/validators/common.ts

Comment thread packages/graphai/src/validators/graph_data_validator.ts Outdated

isamu and others added 2 commits May 3, 2026 08:35

Merge remote-tracking branch 'origin/main' into feat/concurrency-per-…

da217b0

…label

coderabbitai Bot reviewed May 3, 2026

View reviewed changes

Comment thread packages/graphai/src/task_manager.ts Outdated

isamu mentioned this pull request May 3, 2026

TaskManager: reset() during nesting can corrupt the concurrency counter #1315

Open

isamu merged commit 51248b9 into main May 3, 2026
12 checks passed

Conversation

isamu commented May 2, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Items to Confirm / Review

User Prompt

Implementation Approach

Order of work (TDD-ish)

Files changed

Algorithm sketch (TaskManager.dequeueTaskIfPossible)

Validation

Test plan

Out of scope (followups)

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

isamu commented May 2, 2026

Codex cross-review iteration 1 — addressed

1. Nested-graph deadlock with per-label limits

2. Runtime label validation

Also (per user feedback)

Tests added (graphai package now: 229 → 235)

Uh oh!

isamu commented May 3, 2026

Codex cross-review iteration 2 — addressed (f804289)

1. Exception safety of nested-graph slot bookkeeping

2. concurrency.labels: null snuck through after iteration-1 refactor

3. Iteration-1 nesting test did not actually verify the label decrement

Plus, per user direction to cover negative patterns more thoroughly

Test counts

Uh oh!

isamu commented May 3, 2026

Codex cross-review iteration 3 — addressed (5d7314c)

1. try/finally still left a gap before agent invocation

2. Validator accepted non-plain-object shapes

Tests

Test counts

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

isamu commented May 3, 2026

Codex cross-review iteration 4 — LGTM

Final status

Uh oh!

isamu commented May 3, 2026

CodeRabbit finding addressed (77ec2cf)

Fix

Other adjustments

Documented residual edge case

Test counts

Uh oh!

snakajima commented May 3, 2026

Cross-review evaluation of CodeRabbit + Codex findings

CodeRabbit comment 1 — label runtime type validation

CodeRabbit comment 2 — Reject unknown keys in concurrency config

CodeRabbit comment 3 — Don't widen shared labelLimits during nesting (the big one)

Codex iteration 1 — Nested-graph deadlock + label runtime check

Codex iteration 2 — Exception safety + null labels + weak test

Codex iteration 3 — try/finally gap + non-plain-object validator

Codex iteration 4 — LGTM with cross-realm isPlainObject nit

Minor items not raised by any reviewer

Bottom line

Uh oh!

isamu commented May 3, 2026

Followups from @snakajima's audit (0e8e395)

isamu commented May 2, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 2, 2026 •

edited

Loading

Codex cross-review iteration 2 — addressed (`f804289`)

2. `concurrency.labels: null` snuck through after iteration-1 refactor

Codex cross-review iteration 3 — addressed (`5d7314c`)

CodeRabbit finding addressed (`77ec2cf`)

CodeRabbit comment 1 — `label` runtime type validation

CodeRabbit comment 2 — Reject unknown keys in `concurrency` config

CodeRabbit comment 3 — Don't widen shared `labelLimits` during nesting (the big one)

Codex iteration 4 — LGTM with cross-realm `isPlainObject` nit

Followups from @snakajima's audit (`0e8e395`)