scheduler orchestration optimization 1 #381

Merged
mayinghan merged 6 commits into main from priority-scheduler-2
Dec 19, 2025
Conversation

@mayinghan (Collaborator) commented Dec 17, 2025

Refactors the PriorityRolloutScheduler from batch-based to streaming scheduling. Each completed run now immediately schedules the next run, rather than waiting for an entire mini-batch to complete. This improves overall throughput by maximizing concurrent execution.

Key Changes

Streaming Scheduling Architecture

  • New SampleState dataclass: Tracks per-sample state across multiple runs, including active/completed run counts and async locks for safe concurrent updates
  • Refactored RolloutTask: Now represents a single run instead of a batch of runs
  • Immediate next-run scheduling: When a run completes, the next run is immediately scheduled with high priority (0), ensuring samples finish as quickly as possible
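The per-sample bookkeeping described above could look roughly like the following. This is a minimal sketch, not the PR's actual code; field and method names beyond those mentioned in the description (active/completed run counts, an async lock) are assumptions.

```python
# Hypothetical sketch of per-sample state for streaming scheduling.
# Names other than those in the PR description are illustrative.
import asyncio
from dataclasses import dataclass, field

@dataclass
class SampleState:
    row_index: int
    num_runs: int                 # total runs requested for this sample
    active_runs: int = 0          # runs currently in flight
    completed_runs: int = 0       # runs that have finished
    lock: asyncio.Lock = field(default_factory=asyncio.Lock)

    async def mark_completed(self) -> bool:
        """Record one finished run; return True if another run should be scheduled."""
        async with self.lock:     # guard against concurrent updates
            self.active_runs -= 1
            self.completed_runs += 1
            return self.completed_runs + self.active_runs < self.num_runs
```

The lock makes the decrement/increment/compare sequence atomic, so two runs finishing at the same time cannot both see a stale count and over- or under-schedule the next run.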

Concurrency Simplification

  • Removed redundant rollout_sem: The rollout processor (e.g., SingleTurnRolloutProcessor) already handles concurrency via config.semaphore. The scheduler-level semaphore was causing double-limiting and has been removed.
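The single-point concurrency model can be illustrated with a small sketch. The function and variable names here are placeholders, not the project's API; the point is that only one semaphore (the processor's) gates concurrent rollouts.

```python
# Sketch: the processor's semaphore is the only concurrency limit,
# so there is no scheduler-level semaphore to double-limit throughput.
import asyncio

async def process_rollout(task_id: int, semaphore: asyncio.Semaphore) -> int:
    async with semaphore:          # single point of concurrency control
        await asyncio.sleep(0)     # stand-in for the actual completion call
        return task_id

async def run_all(n_tasks: int, limit: int) -> list[int]:
    semaphore = asyncio.Semaphore(limit)   # e.g. config.semaphore
    return await asyncio.gather(
        *(process_rollout(i, semaphore) for i in range(n_tasks))
    )
```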

Priority Queue Behavior

  • Initial runs are scheduled with low priority (1, row_index, run_idx)
  • Subsequent runs (after first completion) are scheduled with high priority (0, row_index, run_idx) to finish in-progress samples ASAP
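Because Python tuples compare element-by-element, a standard `asyncio.PriorityQueue` yields all status-0 (follow-up) tasks before any status-1 (initial) task. A minimal demonstration, using the tuple shape from the bullets above:

```python
# Sketch: (status, row_index, run_index) tuples in a priority queue.
# Status 0 (next runs for in-progress samples) dequeues before status 1.
import asyncio

async def drain_order() -> list[tuple[int, int, int]]:
    q: asyncio.PriorityQueue = asyncio.PriorityQueue()
    await q.put((1, 0, 0))   # initial run, low priority
    await q.put((1, 1, 0))   # initial run, low priority
    await q.put((0, 0, 1))   # follow-up run, high priority
    return [await q.get() for _ in range(3)]
```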

Benefits

Higher throughput: New runs start immediately when a slot opens, no waiting for batch completion
Better resource utilization: Maintains in_group_minibatch_size concurrent runs per sample at all times
Cleaner concurrency model: Single point of concurrency control in the rollout processor


Note

Refactors the rollout scheduler to stream per-run tasks with high/low priorities, adds speculative history injection, and simplifies concurrency to rely on the rollout processor’s semaphore.

  • Scheduler (PriorityRolloutScheduler):
    • Introduces SampleState and redefines RolloutTask to a single run with priority (status, row_index, run_index).
    • Implements streaming scheduling: each completed run immediately enqueues the next with high priority; maintains per-sample concurrency via in_group_minibatch_size (defaults depend on ENABLE_SPECULATION).
    • Adds speculative prediction by injecting history_snapshot into completion_params.extra_body.prediction per run.
    • Simplifies concurrency: relies on rollout_processor's semaphore; worker count set to max_concurrent_rollouts.
    • Tracks active rollouts for progress display; preserves pointwise/groupwise eval flows with buffered group dispatch.
  • Tests:
    • Update to new RolloutTask/SampleState API; adjust expectations for worker scaling, priority ordering, concurrency limits, and groupwise evaluation.
  • Misc:
    • Minor whitespace adjustment in default_single_turn_rollout_process.py.

Written by Cursor Bugbot for commit fb8debe.

@mayinghan requested a review from morgendave December 17, 2025 23:49
@mayinghan changed the title from "fix scheduler concurrency and orchestration" to "scheduler orchestration optimization 1" Dec 17, 2025
@mayinghan marked this pull request as ready for review December 18, 2025 20:14
```python
extra_body["prediction"] = {"type": "content", "content": " ".join(task.history)[:max_tokens]}
cp["extra_body"] = extra_body
row_copy.input_metadata.completion_params = cp
cp["extra_body"]["prediction"] = " ".join(task.history_snapshot)[:max_tokens]
```
Bug: Shallow copy allows concurrent mutation of shared config

The comment says "Deep copy completion_params to avoid mutating shared config" but dict() only performs a shallow copy. If the base config already contains extra_body (which is common in practice), it will be shared by reference. When multiple speculation tasks from different samples run concurrently, they will all modify the same shared extra_body dict, causing race conditions where one task's prediction value gets overwritten by another. The extra_body dict needs to be explicitly copied or a true deep copy is needed.
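The hazard and one possible fix can be shown in a few lines. `shared` stands in for the shared `completion_params`; the values are illustrative.

```python
# Demonstrates why dict() (a shallow copy) is not enough when the base
# config already contains a nested extra_body dict.
shared = {"model": "some-model", "extra_body": {"existing_key": 1}}

# Shallow copy: the nested extra_body is still shared by reference,
# so mutating it would race across concurrent tasks.
cp_shallow = dict(shared)
assert cp_shallow["extra_body"] is shared["extra_body"]

# Fix: explicitly copy the nested dict before mutating it.
cp_safe = dict(shared)
cp_safe["extra_body"] = dict(cp_safe.get("extra_body") or {})
cp_safe["extra_body"]["prediction"] = {"type": "content", "content": "..."}
assert "prediction" not in shared["extra_body"]
```

Alternatively, `copy.deepcopy(shared)` avoids the issue wholesale, at the cost of copying everything in the params dict.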


Tracks state for a single dataset sample across multiple runs.
Enables streaming scheduling where each completed run immediately triggers the next.
"""
row: EvaluationRow
Collaborator:
What's this row for? Do we update it? And is the lock there so that one state can't be consumed by two async tasks?

Collaborator (Author):
This is just a reference that keeps track of the history and rollout status for one sample. The row is deep copied during actual usage.

Collaborator (Author):
This SampleState only maintains the state for each sample.

@mayinghan requested a review from morgendave December 19, 2025 02:11
```python
tc = time.perf_counter()
# print(f"run_id {row.execution_metadata.run_id} request_params: {json.dumps(request_params)}")
response = await acompletion(**request_params)
print(f"run_id {row.execution_metadata.run_id} time taken: {time.perf_counter() - tc} speculation_enabled: {request_params.get('extra_body', {}).get('prediction', None) is not None}")
```
Bug: Debug print statement left in production code

A debug print statement was left in the production code path. This prints timing information and speculation status for every non-streaming LLM completion call, which will pollute logs and stdout in production. The timing variable tc and the uncommented print at line 103 appear to be debugging/profiling code that wasn't removed before committing.
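If the timing signal is worth keeping, one common alternative is routing it through the `logging` module at DEBUG level so it can be silenced in production. This is a generic sketch, not the project's code; the logger name and helper are assumptions.

```python
# Sketch: keep profiling output but gate it behind the logging config
# instead of printing to stdout on every completion call.
import logging
import time

logger = logging.getLogger("rollout")  # illustrative logger name

def timed_call(fn, *args, **kwargs):
    """Run fn and log (not print) its wall-clock duration at DEBUG level."""
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    logger.debug("call took %.3fs", time.perf_counter() - t0)
    return result
```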


@morgendave (Collaborator) left a comment:
Generally looks fine; unblocking. We still need to verify the safety of the lock and the queue.

@mayinghan merged commit 57a46a6 into main Dec 19, 2025
17 checks passed
@mayinghan deleted the priority-scheduler-2 branch December 19, 2025 22:56