feat(scheduler): foundation — class, config, policy (PR 2 M1) by CatherineSue · Pull Request #1541 · lightseekorg/smg

CatherineSue · 2026-05-25T01:45:23Z

Description

Problem

The model gateway has no notion of priority on inbound traffic. Today's concurrency_limit_middleware admits requests in FIFO order against a single global slot pool — interactive chat completions queue behind whatever bulk batch happened to arrive first.

The full design lives in .claude/docs/scheduler/02-priority-scheduler-design.md. Splitting the implementation into milestones lets each chunk land reviewably without flipping the master --priority-scheduler-enabled flag to true until everything is wired up.

Solution

This PR is Milestone 1 of 6 for PR 2 (priority scheduler). It introduces the vocabulary used by every later milestone: the Class enum, per-class config (YAML + runtime), and a tenant policy resolver. There are no call sites yet: these types are reachable from crate::middleware::scheduler::* but no production code path constructs them. Behavior on main is unchanged.

Subsequent milestones (each its own PR):

M2 — scheduler core (slot accounting, queues, dispatcher, capacity-watch)
M3 — preemption (TTFT CAS + body wrapper)
M4 — CLI flags, AppContext wiring, priority_admission_middleware, cancel-token plumbing across handlers
M5 — observability (metrics + tracing)
M6 — integration tests + flip the master flag from "experimental" to "ready"

The implementation plan lives at .claude/docs/scheduler/02-priority-scheduler-plan.md (also in this PR's diff — it's a new file).

Changes

Five commits, one per task in M1:

feat(scheduler): add Class enum + priority header parser — Class { Bulk=0, Default=1, Interactive=2, System=3 } with Ord derived (tenant clamp = min(header, max)) and repr(u8) (later packed into AtomicU64). parse_header is case-insensitive, whitespace-tolerant, lenient on unknown values. PRIORITY_HEADER = "x-smg-priority".
feat(scheduler): add ClassConfig + ClassRuntimeConfig with defaults — ClassConfig is the YAML-shaped on-disk form (u64 secs, f32 multiplier); ClassRuntimeConfig is the runtime form (Duration). default_for(Class) returns the built-in per-class defaults from the design's §3 table.
feat(scheduler): PrioritySchedulerYaml serde schema — PrioritySchedulerYaml { classes, tenant_policies }, both #[serde(default)]. Class derives Serialize/Deserialize with #[serde(rename_all = "lowercase")]. Empty YAML round-trips to default.
feat(scheduler): SchedulerSettings::from_cli_and_yaml builder — Merges built-in defaults + optional YAML + CLI flags into a single read-only SchedulerSettings. Per-field validation (non-negative multiplier, positive timeouts). Capacity-vs-reserved check is deferred to PriorityScheduler::new (M2).
feat(scheduler): tenant policy resolver trait + static impl — trait TenantPolicyResolver (lets a future async store-backed impl swap in without touching the admission path) + StaticTenantPolicyResolver (HashMap with default fallback).
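
The first commit's types can be sketched roughly as follows. This is a minimal illustration assembled from the description above, not the PR's actual source: the derive list and the `effective_class` helper are assumptions added to show why deriving `Ord` makes the tenant clamp a plain `min`.

```rust
/// Header name quoted in the PR description.
pub const PRIORITY_HEADER: &str = "x-smg-priority";

/// Priority classes, lowest to highest. With `Ord` derived, the declaration
/// order (and the explicit discriminants) define the comparison order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[repr(u8)]
pub enum Class {
    Bulk = 0,
    Default = 1,
    Interactive = 2,
    System = 3,
}

impl Class {
    /// Case-insensitive, whitespace-tolerant parse; anything unrecognized
    /// (including the literal "default") maps to `Class::Default`.
    pub fn parse_header(value: &str) -> Class {
        match value.trim().to_ascii_lowercase().as_str() {
            "system" => Class::System,
            "interactive" => Class::Interactive,
            "bulk" => Class::Bulk,
            _ => Class::Default,
        }
    }
}

/// Hypothetical helper (not named in the PR): clamp the requested class to
/// the tenant's ceiling, i.e. min(header, max).
pub fn effective_class(requested: Class, tenant_max: Class) -> Class {
    requested.min(tenant_max)
}
```

A request that asks for `system` from a tenant capped at `interactive` is admitted as `interactive`, while a `bulk` request from the same tenant stays `bulk`.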

Plus: .claude/docs/scheduler/02-priority-scheduler-plan.md (the implementation plan for PR 2).

Test Plan

Each commit follows TDD discipline: failing tests first, verify they fail with the expected error, then implement, verify pass, commit. 29 new unit tests cover:

Class::parse_header over all four valid values, case-insensitivity, whitespace tolerance, unknown→Default, and the Ord-derived clamp semantics.
ClassConfig::default_for matches the design §3 defaults table exactly for all four classes.
ClassRuntimeConfig::from_class_config converts seconds to Duration and preserves can_preempt.
PrioritySchedulerYaml serde round-trips: empty document, partial class override, tenant policy map, unknown-class serde error, lowercase serialization.
SchedulerSettings::from_cli_and_yaml: defaults-only, partial YAML override merges with defaults, tenant policies propagate, validation rejects negative multiplier / zero queue timeout / zero starvation threshold.
StaticTenantPolicyResolver: empty resolver returns default, known tenant overrides default, trait-object dispatch, admission clamp via min(header, policy.max_class).

$ cargo test --package smg --lib middleware::scheduler
... 29 passed; 0 failed; 0 ignored; 0 measured; 902 filtered out

$ cargo clippy --package smg --all-targets --all-features -- -D warnings
... Finished `dev` profile [unoptimized + debuginfo] target(s)

Checklist

cargo +nightly fmt passes (enforced by pre-commit hook)
cargo clippy --all-targets --all-features -- -D warnings passes
(Optional) Documentation updated — design + implementation plan in .claude/docs/scheduler/
(Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Summary by CodeRabbit

New Features
- Added priority-aware admission scheduler middleware enabling request prioritization
- Introduced four priority classes for request classification
- Added HTTP header support for specifying request priority
- Enabled configurable per-class settings and tenant-level policies
- Implemented tenant policy resolver for dynamic policy application

4 classes (Bulk < Default < Interactive < System) with #[repr(u8)] and Ord derived so tenant-clamp is min(header_class, max_class). parse_header is case-insensitive, whitespace-tolerant, and degrades unknown values to Default. PRIORITY_HEADER constant "x-smg-priority".
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>

ClassConfig is the YAML-shaped on-disk form (u64 secs, f32 multiplier); ClassRuntimeConfig is the runtime form (Duration). Built-in per-class defaults match the table in 02-priority-scheduler-design.md §3.
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>
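
The split between the on-disk and runtime forms might look something like the sketch below. Only `queue_size_per_slot`, `can_preempt`, and `from_class_config` appear in this PR's text; the two timeout field names are hypothetical, chosen to match the "queue timeout" and "starvation threshold" mentioned in the test plan.

```rust
use std::time::Duration;

/// YAML-shaped form: plain integers and floats that serde can read directly.
/// `queue_timeout_secs` and `starvation_threshold_secs` are assumed names.
#[derive(Debug, Clone, PartialEq)]
pub struct ClassConfig {
    pub queue_size_per_slot: f32,
    pub queue_timeout_secs: u64,
    pub starvation_threshold_secs: u64,
    pub can_preempt: bool,
}

/// Runtime form: the same settings, with seconds converted to `Duration`
/// once at startup rather than on every request.
#[derive(Debug, Clone, PartialEq)]
pub struct ClassRuntimeConfig {
    pub queue_size_per_slot: f32,
    pub queue_timeout: Duration,
    pub starvation_threshold: Duration,
    pub can_preempt: bool,
}

impl ClassRuntimeConfig {
    /// Convert the on-disk form; the multiplier and `can_preempt` carry over
    /// unchanged.
    pub fn from_class_config(cfg: &ClassConfig) -> Self {
        Self {
            queue_size_per_slot: cfg.queue_size_per_slot,
            queue_timeout: Duration::from_secs(cfg.queue_timeout_secs),
            starvation_threshold: Duration::from_secs(cfg.starvation_threshold_secs),
            can_preempt: cfg.can_preempt,
        }
    }
}
```

Keeping the YAML form free of `Duration` avoids custom serde adapters, at the cost of one explicit conversion step.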

Optional YAML file loaded via --priority-scheduler-config <path>. Classes and tenant_policies are both optional; absent keys fall back to built-in defaults in a later commit. Class enum serializes as lowercase strings.
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>

Merges built-in defaults + optional YAML + CLI flags into one runtime config. Validates per-field invariants (non-negative multiplier, positive timeouts); capacity-vs-reserved validation deferred to PriorityScheduler.
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>

Trait-shaped so a future async store-backed impl can land without touching the admission path. v1 impl is a HashMap lookup with a default fallback. TenantPolicy is the future growth point (weight, slot_quota, rps_cap).
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>
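
A rough sketch of the resolver shape described in this commit message follows. The method name `resolve`, the `String` tenant key, and the `effective_class` and `shared` helpers are assumptions for illustration; the PR itself mentions a `TenantKey` type whose definition is not shown here.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Class {
    Bulk,
    Default,
    Interactive,
    System,
}

/// Per-tenant policy; `max_class` is the only field named in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TenantPolicy {
    pub max_class: Class,
}

/// Lookup abstraction, so a store-backed resolver could replace the static one.
pub trait TenantPolicyResolver: Send + Sync {
    fn resolve(&self, tenant: &str) -> TenantPolicy;
}

/// Static implementation: a map of overrides plus a default for everyone else.
pub struct StaticTenantPolicyResolver {
    default: TenantPolicy,
    per_tenant: HashMap<String, TenantPolicy>,
}

impl StaticTenantPolicyResolver {
    pub fn new(default: TenantPolicy, per_tenant: HashMap<String, TenantPolicy>) -> Self {
        Self { default, per_tenant }
    }
}

impl TenantPolicyResolver for StaticTenantPolicyResolver {
    fn resolve(&self, tenant: &str) -> TenantPolicy {
        // Unknown tenants fall back to the default policy.
        self.per_tenant.get(tenant).copied().unwrap_or(self.default)
    }
}

/// Hypothetical admission-side helper: clamp the header class to the policy.
pub fn effective_class(resolver: &dyn TenantPolicyResolver, tenant: &str, requested: Class) -> Class {
    requested.min(resolver.resolve(tenant).max_class)
}

/// Hypothetical convenience for sharing one resolver across handlers.
pub fn shared(resolver: StaticTenantPolicyResolver) -> Arc<dyn TenantPolicyResolver> {
    Arc::new(resolver)
}
```

Because callers only see `&dyn TenantPolicyResolver`, swapping the static map for another implementation would not change the clamp logic.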

coderabbitai · 2026-05-25T01:45:33Z

Walkthrough

This PR introduces a new priority-aware admission scheduler middleware for model_gateway. It adds a complete scheduler infrastructure module with priority class definitions, configuration management with YAML and CLI integration, per-tenant policy enforcement, and modular re-exports. All new code includes comprehensive unit test coverage.

Changes

Priority-aware admission scheduler middleware

| Layer / File(s) | Summary |
|---|---|
| Priority class type and header parsing `model_gateway/src/middleware/scheduler/class.rs` | Defines `Class` enum (Bulk=0, Default=1, Interactive=2, System=3) with `repr(u8)` ordering, `parse_header()` for case-insensitive whitespace-tolerant HTTP header parsing defaulting to `Default`, and `as_str()` for lowercase label output. Includes tests for parsing, case/whitespace handling, ordering, and constant values. |
| Configuration structures and validation `model_gateway/src/middleware/scheduler/config.rs` | Introduces `ClassConfig` with built-in defaults per class, `ClassRuntimeConfig` converting seconds to `Duration`, `TenantPolicyConfig` for per-tenant YAML entries, `PrioritySchedulerYaml` as the top-level YAML shape, `SettingsValidationError` for invariant violations, and `SchedulerSettings` that merges CLI flags with optional YAML overrides while rejecting negative multipliers and zero timeout thresholds. Includes extensive tests for defaults, duration conversion, serde behavior, and settings validation. |
| Tenant policy resolution `model_gateway/src/middleware/scheduler/policy.rs` | Defines `TenantPolicy` with `max_class` field, `TenantPolicyResolver` trait for tenant-to-policy lookup, `StaticTenantPolicyResolver` backed by `HashMap<TenantKey, TenantPolicy>` built from `SchedulerSettings`, and `Arc<dyn TenantPolicyResolver>` delegation. Tests validate default fallback, per-tenant overrides, Arc dispatch, and effective-class clamping logic. |
| Scheduler module organization and public API `model_gateway/src/middleware/scheduler/mod.rs` | Declares `class`, `config`, and `policy` submodules and re-exports `Class`, `PRIORITY_HEADER`, all configuration types, and resolver types to form the scheduler middleware's public interface. |
| Middleware registration `model_gateway/src/middleware/mod.rs` | Adds `pub mod scheduler;` to expose the new scheduler submodule as part of the middleware stack. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

lightseekorg/smg#1151: Rewires the middleware module structure, which this PR builds upon by adding the scheduler submodule to the central middleware exports.

Suggested labels

model-gateway

Suggested reviewers

claude

Poem

🐰 A scheduler born with priorities clear,
Four classes racing—Bulk to System tier,
YAML configs and policies aligned,
Tenant clamping keeps requests well-designed,
Middleware magic, admission refined! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically summarizes the PR's main contribution: adding foundational scheduler types (class, config, policy) as part of Milestone 1. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a9943ba220

chatgpt-codex-connector · 2026-05-25T01:47:14Z

+
+        for class in Class::ALL {
+            let cfg = &classes[class as usize];
+            if cfg.queue_size_per_slot < 0.0 {


Validate queue_size_per_slot is finite

from_cli_and_yaml only rejects queue_size_per_slot < 0.0, so YAML values like .nan (and potentially inf) pass validation even though they are not meaningful multipliers. This can silently produce invalid effective queue limits when the scheduler applies the documented formula ceil(queue_size_per_slot * capacity), leading to unpredictable admission behavior once the feature is wired in. Please reject non-finite values (e.g., !is_finite()) in the same validation block.
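
The gap is easy to reproduce in isolation. The sketch below uses a hypothetical `MultiplierError` type and free functions rather than the PR's `SettingsValidationError`, purely to show how a `< 0.0` check treats NaN and infinities and what an `is_finite` guard changes.

```rust
/// Hypothetical error type for this sketch; the PR uses
/// `SettingsValidationError::NegativeMultiplier { class }` instead.
#[derive(Debug, PartialEq)]
pub enum MultiplierError {
    Negative,
    NonFinite,
}

/// Mirrors the check in the reviewed diff: true means "rejected".
/// Every ordered comparison with NaN is false, so NaN is not rejected,
/// and +inf is not less than zero either.
pub fn negative_only_check(value: f32) -> bool {
    value < 0.0
}

/// Stricter variant: reject NaN and both infinities before the sign check.
pub fn validate_multiplier(value: f32) -> Result<f32, MultiplierError> {
    if !value.is_finite() {
        return Err(MultiplierError::NonFinite);
    }
    if value < 0.0 {
        return Err(MultiplierError::Negative);
    }
    Ok(value)
}
```

Note that `-inf` is already caught by the original check because it compares below zero; only NaN and `+inf` slip through.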

gemini-code-assist

Code Review

This pull request introduces a priority-aware admission scheduler, including the definition of priority classes, configuration management for per-class and per-tenant policies, and a static policy resolver. The review feedback identifies a performance optimization opportunity in the header parsing logic to avoid unnecessary heap allocations and suggests enhancing configuration validation to handle non-finite floating-point values for queue size multipliers.

gemini-code-assist · 2026-05-25T01:47:23Z

+    pub fn parse_header(value: &str) -> Class {
+        match value.trim().to_ascii_lowercase().as_str() {
+            "system" => Self::System,
+            "interactive" => Self::Interactive,
+            "bulk" => Self::Bulk,
+            _ => Self::Default,
+        }
+    }


The parse_header function currently allocates a new String via to_ascii_lowercase() on every call. Since this is intended for the request admission path, it is more efficient to avoid heap allocations by using eq_ignore_ascii_case on the trimmed string slice.

pub fn parse_header(value: &str) -> Class {
    let trimmed = value.trim();
    if trimmed.eq_ignore_ascii_case("system") {
        Self::System
    } else if trimmed.eq_ignore_ascii_case("interactive") {
        Self::Interactive
    } else if trimmed.eq_ignore_ascii_case("bulk") {
        Self::Bulk
    } else {
        Self::Default
    }
}

References

Avoid heap allocations (like Vec or String) within performance-critical comparison logic. Use allocation-free helpers to reduce memory pressure.

gemini-code-assist · 2026-05-25T01:47:23Z

+
+        for class in Class::ALL {
+            let cfg = &classes[class as usize];
+            if cfg.queue_size_per_slot < 0.0 {


The validation for queue_size_per_slot only checks if the value is negative. It should also check for non-finite values like NaN or Infinity to ensure that downstream capacity calculations (e.g., using ceil) remain deterministic and safe.

Suggested change

- if cfg.queue_size_per_slot < 0.0 {
+ if cfg.queue_size_per_slot < 0.0 || !cfg.queue_size_per_slot.is_finite() {

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@model_gateway/src/middleware/scheduler/config.rs`:
- Around line 208-210: Update the validation in the scheduler settings so
non-finite queue_size_per_slot values are rejected: in the code that currently
checks cfg.queue_size_per_slot < 0.0 (the branch that returns
SettingsValidationError::NegativeMultiplier), additionally test
cfg.queue_size_per_slot.is_finite() and reject when false (or explicitly check
for NaN/inf) so NaN/+inf do not pass; update the error path to return the same
SettingsValidationError (or add a new variant if you prefer) and ensure this
logic is inside the same validation routine invoked by
SchedulerSettings::from_cli_and_yaml. Also add unit tests that load YAML with
queue_size_per_slot: .nan and queue_size_per_slot: .inf and assert
SchedulerSettings::from_cli_and_yaml returns an Err for those inputs.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9a761abb-7598-44ac-b21d-85e13b3132d5

📥 Commits

Reviewing files that changed from the base of the PR and between a38b5a2 and a9943ba.

📒 Files selected for processing (5)

model_gateway/src/middleware/mod.rs
model_gateway/src/middleware/scheduler/class.rs
model_gateway/src/middleware/scheduler/config.rs
model_gateway/src/middleware/scheduler/mod.rs
model_gateway/src/middleware/scheduler/policy.rs

coderabbitai · 2026-05-25T01:50:13Z

+            if cfg.queue_size_per_slot < 0.0 {
+                return Err(SettingsValidationError::NegativeMultiplier { class });
+            }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Locate and print the relevant validation block
rg -n "queue_size_per_slot" model_gateway/src/middleware/scheduler/config.rs
sed -n '160,260p' model_gateway/src/middleware/scheduler/config.rs
# Find where SettingsValidationError is defined/used
rg -n "SettingsValidationError" model_gateway/src/middleware/scheduler/config.rs
rg -n "enum SettingsValidationError|SettingsValidationError::" model_gateway/src/middleware/scheduler -S
# Search for tests covering YAML parsing / validation errors
fd -t f -e rs model_gateway | rg -n "queue_size_per_slot|NegativeMultiplier|SettingsValidationError" -S
fd -t f -e rs model_gateway | rg -n "yaml|serde_yaml|Number|NaN|inf|infinity|non-finite|is_finite" -S
# If there are YAML fixtures, inspect them
fd -t f model_gateway -g "*.yaml" -g "*.yml" | head -n 50
fd -t f model_gateway -g "*.yaml" -g "*.yml" | xargs -I{} sh -c 'echo "---- {}"; sed -n "1,200p" "{}" | rg -n "queue_size_per_slot|nan|inf|infinity" || true'

Repository: lightseekorg/smg

Length of output: 5836

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Show the relevant validation block and error enum
sed -n '120,240p' model_gateway/src/middleware/scheduler/config.rs
# Show existing tests around queue_size_per_slot validation
sed -n '240,360p' model_gateway/src/middleware/scheduler/config.rs
# Show tests that assert SettingsValidationError variants
sed -n '420,540p' model_gateway/src/middleware/scheduler/config.rs
# Locate YAML structs / deserialization types
rg -n "struct PrioritySchedulerYaml|PrioritySchedulerYaml|serde_yaml|serde.*yaml" model_gateway/src/middleware/scheduler/config.rs model_gateway/src/middleware/scheduler -S
# If PrioritySchedulerYaml is defined in another file, show it
fd -t f -e rs model_gateway/src/middleware/scheduler | xargs -I{} sh -c 'rg -n "struct PrioritySchedulerYaml|PrioritySchedulerYaml" "{}" && true'
# Print the file(s) where PrioritySchedulerYaml is defined
fd -t f -e rs model_gateway/src/middleware/scheduler | xargs -I{} sh -c 'rg -n "struct PrioritySchedulerYaml|PrioritySchedulerYaml" "{}" | head -n 5 || true'

Repository: lightseekorg/smg

Length of output: 15628

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Search for non-finite handling in this module and repo
rg -n "is_finite|is_nan|NaN|nan|INFINITY|inf|infinity|non-finite" model_gateway/src -S || true
rg -n "queue_size_per_slot.*(is_finite|nan|inf)" model_gateway/src -S || true
# Check whether other float validations exist
rg -n "must be.*finite|finite" model_gateway/src -S || true
# Look for tests that parse YAML containing NaN/inf
rg -n "queue_size_per_slot:.*(nan|inf|infinity)" model_gateway/src -S || true
rg -n "from_str\\(.*nan|from_str\\(.*inf|\\.nan|\\.inf|NaN|Infinity" model_gateway/src -S || true
# If there are other YAML float validation tests, inspect them
fd -t f -e rs model_gateway/src | xargs -I{} rg -n "serde_yaml::from_str\\(" {} | head -n 50 || true

Repository: lightseekorg/smg

Length of output: 50765

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Find all usages of queue_size_per_slot (text search, but scoped)
rg -n "queue_size_per_slot" model_gateway/src/middleware/scheduler -S
# Inspect the code that computes queue limits (likely uses queue_size_per_slot * capacity)
rg -n "queue_size_per_slot.*capacity|capacity.*queue_size_per_slot|ceil\\(queue_size_per_slot|queue_size_per_slot.*ceil" model_gateway/src/middleware/scheduler -S
# Search broader for conversions of f32 -> integer / as-casts involving queue sizes
rg -n "queue_size.*per_slot|per_slot.*queue_size|queue_limit|queue.*limit|ceil\\(" model_gateway/src/middleware/scheduler -S
# Look for serde_yaml float tests (NaN/inf) across repo
rg -n "NaN|nan|Infinity|infinity|\\binf\\b|non-finite|is_finite|is_nan|is_infinite" model_gateway/src -S || true

Repository: lightseekorg/smg

Length of output: 50372

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Find computations that use queue_size_per_slot
rg -n "queue_size_per_slot.*(capacity|cap|ceil|floor|max|min)|ceil\\(.*queue_size_per_slot|queue_size_per_slot.*\\*" model_gateway/src/middleware/scheduler -S
# Inspect nearby code in scheduler files around those usages
rg -n "queue_size_per_slot" model_gateway/src/middleware/scheduler -S | cut -d: -f1 | sort -u
# Show the first 250 lines around each usage site of queue_size_per_slot (to avoid huge output)
rg -n "queue_size_per_slot" model_gateway/src/middleware/scheduler -S | head -n 50
# Use sed to show ranges in config.rs and any other file identified
sed -n '1,220p' model_gateway/src/middleware/scheduler/config.rs

Repository: lightseekorg/smg

Length of output: 10513

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# 1) Inspect scheduler module files to find where queue_size_per_slot affects queue limits
ls -1 model_gateway/src/middleware/scheduler
# Search for queue-limit computations using queue_size_per_slot
rg -n "queue_size_per_slot" model_gateway/src/middleware/scheduler -S
# Search specifically for ceil/casts based on the config comment
rg -n "ceil\\(|floor\\(|max\\(|min\\(|as u32|as usize|as i32" model_gateway/src/middleware/scheduler -S
# Narrow to files likely doing the computation: scheduler.rs / mod.rs / any file with "PriorityScheduler"
fd -t f model_gateway/src/middleware/scheduler | xargs -I{} sh -c 'rg -n "PriorityScheduler" "{}" || true'
# Find the constructor `PriorityScheduler::new` mentioned in comments
fd -t f model_gateway/src/middleware/scheduler | xargs -I{} sh -c 'rg -n "PriorityScheduler::new|fn new\\(" "{}" || true'
# Print any matches around those likely computations
rg -n "queue_size_per_slot|effective limit|effective limit =|max\\(queue_size|queue_size_per_slot \\* capacity" model_gateway/src/middleware/scheduler -S -n --context 3
# 2) Check serde_yaml parsing of non-finite floats (for grounding tests)
# Search in Cargo.lock / docs? limited; we'll just find any yaml parsing patterns in tests.
rg -n "serde_yaml::from_str\\(|from_str\\(y.*yaml|NaN|nan|Infinity|inf|is_finite" model_gateway/src -S || true

Repository: lightseekorg/smg

Length of output: 3501

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Where is the field used anywhere in the repo?
rg -n "queue_size_per_slot" . -S
# Check for effective queue limit computation: look for ceil/floor and casts around queue_size/per_slot
rg -n "ceil\\(|floor\\(|max\\(|min\\(|effective.*queue|queue.*limit|queue.*depth|queue.*size" model_gateway/src -S | rg -n "queue" -S || true
# Narrow to queue_size computations (any file mentioning ceil with queue_size)
rg -n "ceil\\(.*queue_size|queue_size.*ceil\\(|queue_size_per_slot.*ceil|ceil\\(.*capacity|capacity.*queue_size_per_slot" model_gateway/src -S || true
# Inspect scheduler module files for queue depth computations by searching for u32 casts
rg -n "as u32|as i32|as usize|queue.*as" model_gateway/src/middleware/scheduler -S || true

Repository: lightseekorg/smg

Length of output: 8700

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Inspect scheduler module files
sed -n '1,220p' model_gateway/src/middleware/scheduler/mod.rs
sed -n '1,260p' model_gateway/src/middleware/scheduler/class.rs
sed -n '1,260p' model_gateway/src/middleware/scheduler/policy.rs
# Search for effective queue/limit calculations mentioning capacity or queue_size_per_slot math
rg -n "capacity|queue limit|effective|ceil\\(|queue_size_per_slot|queue_size" model_gateway/src/middleware/scheduler -S

Repository: lightseekorg/smg

Length of output: 13891

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Find the actual scheduler implementation
rg -n "struct PriorityScheduler|PriorityScheduler::new|fn new\\(|PriorityScheduler" model_gateway/src -S
# Look for the effective queue limit formula mentioned in config.rs
rg -n "queue_size_per_slot.*\\*|\\*.*queue_size_per_slot|ceil\\(|floor\\(|max\\(|effective.*queue|queue.*size" model_gateway/src/middleware -S
# Search for uses of ClassConfig fields outside config.rs
rg -n "queue_size_per_slot|queue_size:" model_gateway/src/middleware -S
# Also narrow to any file that mentions WorkerCapacity / admission scheduler computation
rg -n "WorkerCapacity|watch channel|AtomicU16|capacity" model_gateway/src/middleware -S

Repository: lightseekorg/smg

Length of output: 20000

🌐 Web query:

Does serde_yaml parse YAML scalars like ".nan", "nan", ".inf", "inf", and "Infinity" into f32/f64 (and what tokens are supported)?

💡 Result:

Yes. serde_yaml can parse YAML scalars representing IEEE-754 special floats into f32/f64.

Supported tokens (case-sensitive) shown by serde_yaml's number parsing:
1) ".nan" -> NaN (e.g., v.as_f64().unwrap().is_nan()) [1]
2) ".inf" -> +Infinity (e.g., v.as_f64() == Some(f64::INFINITY)) [1]
3) "-.inf" -> -Infinity (e.g., v.as_f64() == Some(f64::NEG_INFINITY)) [1]

So, among the tokens you listed:
- ".nan": supported [1]
- "nan": not evidenced as supported in the sources I found (serde_yaml's docs/examples explicitly use ".nan") [1]
- ".inf": supported [1]
- "inf": not evidenced as supported (examples use ".inf") [1]
- "Infinity": not evidenced as supported [1]

Notes on f32/f64:
- serde_yaml's Number type stores floats as an f64 internally ("May be infinite or NaN"); the parsing examples validate values via as_f64 [1]. (I didn't find an explicit token-by-token f32 conversion table, but the float-special handling is part of serde_yaml's Number parsing.) [1]

Citations:

1: https://shadow.github.io/docs/rust/src/serde_yaml/number.rs.html

Reject non-finite queue_size_per_slot during settings validation (NaN/+inf currently slip through)

In model_gateway/src/middleware/scheduler/config.rs (around lines 208-210), the check cfg.queue_size_per_slot < 0.0 rejects only negative values/-inf, but it does not reject NaN (comparisons are false) or +inf—so those can pass validation and break the later ceil(queue_size_per_slot * capacity) behavior.

Suggested fix

- if cfg.queue_size_per_slot < 0.0 {
+ if !cfg.queue_size_per_slot.is_finite() || cfg.queue_size_per_slot < 0.0 {
      return Err(SettingsValidationError::NegativeMultiplier { class });
  }

Add unit tests that parse YAML with queue_size_per_slot: .nan and queue_size_per_slot: .inf and assert SchedulerSettings::from_cli_and_yaml rejects them.
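
To make the downstream effect concrete, here is a small sketch of the documented formula ceil(queue_size_per_slot * capacity). The function name, the `usize` result type, and the use of an `as` cast are assumptions, since the scheduler core that applies the formula is not part of this PR; the cast behavior itself is standard Rust (float-to-integer `as` casts saturate, and NaN becomes 0).

```rust
/// Hypothetical effective queue limit for one class:
/// ceil(queue_size_per_slot * capacity), converted to an integer.
pub fn effective_queue_limit(queue_size_per_slot: f32, capacity: usize) -> usize {
    // `as usize` on a float saturates: NaN maps to 0, +inf to usize::MAX,
    // and negative values to 0.
    (queue_size_per_slot * capacity as f32).ceil() as usize
}
```

Under these assumptions a NaN multiplier silently yields a queue limit of 0 and a `+inf` multiplier yields `usize::MAX`, neither of which raises an error at the point of use. That is why rejecting such values during settings validation is the safer place to catch them.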


CatherineSue added 5 commits May 24, 2026 18:36

CatherineSue requested a review from slin1237 as a code owner May 25, 2026 01:45

github-actions Bot added the model-gateway Model gateway crate changes label May 25, 2026

chatgpt-codex-connector Bot reviewed May 25, 2026

gemini-code-assist Bot reviewed May 25, 2026

coderabbitai Bot requested changes May 25, 2026

CatherineSue wants to merge 5 commits into lightseekorg:main from CatherineSue:feat/priority-scheduler
