Single canonical header topic (only source of truth)
All runtime/scheduler state must live inside accounts.json under each account.
Summary
Implement smart account-aware rotation that selects the best-ready account (not blind round-robin), with per-account runtime intelligence embedded directly in accounts.json.
Evidence from current code
- Rotation is primarily round-robin
  new_idx = (old_idx + 1) % count
  src/account_rotator.rs:397-400
- Persisted account model currently lacks readiness metadata
  src/account_rotator.rs:34-44
- Standby visibility is limited in status API
  src/api/mod.rs:519-521, src/api/mod.rs:547-551
- Retry/reset signals exist but client delay is clamped
  src/api/retry_delay.rs:129-143
Required design direction
1) Embed runtime state in accounts.json only
Add optional runtime object per account in accounts.json:
{
  "accounts": [
    {
      "email": "user@example.com",
      "refresh_token": "1//...",
      "project_id": "...",
      "runtime": {
        "last_active_at": "...",
        "banned": false,
        "restricted": false,
        "next_ready_at": "...",
        "models": {
          "gemini-3-pro": {
            "remaining_fraction": 0.37,
            "reset_time": "...",
            "cooldown_until": "...",
            "last_retry_delay_secs_raw": 18840,
            "last_retry_delay_secs_client": 300,
            "consecutive_429": 1,
            "last_success_at": "...",
            "confidence": 0.82,
            "updated_at": "..."
          }
        }
      }
    }
  ],
  "active": "user@example.com"
}
2) Signal ingestion
- Quota polling (src/quota.rs) updates remaining/reset/banned/restricted and recomputes next_ready_at.
- Rate limiter (src/api/rate_limiter.rs) updates cooldown/retry/consecutive/success outcomes.
- Rotation orchestrator (src/rotation.rs) persists transition metadata (from, to, at).
3) Delay semantics
Store both:
- last_retry_delay_secs_raw (unclamped) for scheduler intelligence, including long reset windows (e.g. 5h / weekly)
- last_retry_delay_secs_client (clamped) for API-facing behavior
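
A minimal sketch of recording both values from one upstream hint. The 300-second ceiling is an assumption taken from the example values in the schema above (raw 18840, client 300); the actual clamp is in src/api/retry_delay.rs, and the names RetryDelay and record_retry_delay are hypothetical.

```rust
use std::time::Duration;

/// Assumed client-facing ceiling (hypothetical; the real limit lives in
/// src/api/retry_delay.rs).
const MAX_CLIENT_DELAY_SECS: u64 = 300;

/// Both views of a single upstream retry hint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RetryDelay {
    /// Unclamped value, kept for scheduling decisions.
    pub raw_secs: u64,
    /// Clamped value, returned to API clients.
    pub client_secs: u64,
}

/// Keeps the raw delay intact and derives the clamped client delay from it.
pub fn record_retry_delay(upstream: Duration) -> RetryDelay {
    let raw_secs = upstream.as_secs();
    let client_secs = raw_secs.min(MAX_CLIENT_DELAY_SECS);
    RetryDelay { raw_secs, client_secs }
}
```

Deriving the client value from the raw one means a 5h window stays visible to the scheduler while API clients still see a short retry hint.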
4) Smart selection scorer
For requested model:
- Build candidates
- Exclude hard-ineligible (banned, restricted, active hard block)
- Compute model_ready_at = max(cooldown_until, reset_time_if_known)
- Score:
  score =
    + ready_now_bonus
    + remaining_fraction_weight
    - wait_time_penalty
    - consecutive_429_penalty
    - staleness_penalty
    + recent_success_bonus
- Pick highest score
- Apply hysteresis to avoid churn
Fallback: if runtime metadata missing/corrupt/stale -> deterministic round-robin.
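
The scorer steps above could be sketched roughly as follows. Every weight, the hysteresis margin, the hard_blocked field, and the use of Unix seconds for timestamps are illustrative assumptions, not values from the codebase. A None result is where the caller would fall back to round-robin or park.

```rust
// All weights below are placeholder assumptions for illustration.
const READY_NOW_BONUS: f64 = 100.0;
const REMAINING_WEIGHT: f64 = 50.0;
const WAIT_PENALTY_PER_MIN: f64 = 1.0;
const CONSECUTIVE_429_PENALTY: f64 = 10.0;
const STALENESS_PENALTY_PER_HOUR: f64 = 5.0;
const RECENT_SUCCESS_BONUS: f64 = 5.0;
const RECENT_SUCCESS_WINDOW_SECS: u64 = 600;
const HYSTERESIS_MARGIN: f64 = 10.0;

/// Per-model view of one account; timestamps are Unix seconds.
#[derive(Debug, Clone, Default)]
pub struct Candidate {
    pub email: String,
    pub banned: bool,
    pub restricted: bool,
    pub hard_blocked: bool, // hypothetical stand-in for "active hard block"
    pub remaining_fraction: f64,
    pub cooldown_until: Option<u64>,
    pub reset_time: Option<u64>,
    pub consecutive_429: u32,
    pub last_success_at: Option<u64>,
    pub updated_at: u64,
}

/// model_ready_at = max(cooldown_until, reset_time_if_known)
pub fn model_ready_at(c: &Candidate) -> u64 {
    c.cooldown_until.unwrap_or(0).max(c.reset_time.unwrap_or(0))
}

/// Returns None for hard-ineligible accounts, otherwise the score.
pub fn score(c: &Candidate, now: u64) -> Option<f64> {
    if c.banned || c.restricted || c.hard_blocked {
        return None;
    }
    let wait_secs = model_ready_at(c).saturating_sub(now);
    let mut s = 0.0;
    if wait_secs == 0 {
        s += READY_NOW_BONUS;
    }
    s += REMAINING_WEIGHT * c.remaining_fraction.clamp(0.0, 1.0);
    s -= WAIT_PENALTY_PER_MIN * (wait_secs as f64 / 60.0);
    s -= CONSECUTIVE_429_PENALTY * f64::from(c.consecutive_429);
    s -= STALENESS_PENALTY_PER_HOUR * (now.saturating_sub(c.updated_at) as f64 / 3600.0);
    if let Some(t) = c.last_success_at {
        if now.saturating_sub(t) <= RECENT_SUCCESS_WINDOW_SECS {
            s += RECENT_SUCCESS_BONUS;
        }
    }
    Some(s)
}

/// Picks the highest score, but keeps the current account unless the best
/// candidate beats it by at least HYSTERESIS_MARGIN.
pub fn pick<'a>(cands: &'a [Candidate], current: Option<&str>, now: u64) -> Option<&'a Candidate> {
    let mut best: Option<(&'a Candidate, f64)> = None;
    let mut cur: Option<(&'a Candidate, f64)> = None;
    for c in cands {
        let Some(s) = score(c, now) else { continue };
        if Some(c.email.as_str()) == current {
            cur = Some((c, s));
        }
        if best.map_or(true, |(_, b)| s > b) {
            best = Some((c, s));
        }
    }
    match (best, cur) {
        (Some((_, bs)), Some((c, cs))) if bs - cs < HYSTERESIS_MARGIN => Some(c),
        (Some((b, _)), _) => Some(b),
        (None, _) => None,
    }
}
```

With these placeholder weights, a ready account at 0.9 remaining displaces an active account at 0.2, while one at 0.85 does not, which is the churn the hysteresis step is meant to prevent.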
5) Parked recovery
When all accounts are temporarily ineligible:
- compute earliest_wake_at = min(next_ready_at across accounts)
- wake and re-evaluate at that time (+ jitter)
- auto-unpark when any account becomes eligible
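
As a rough sketch of the wake computation, assuming next_ready_at is held as optional Unix seconds per account and that the caller supplies the jitter (both function names are hypothetical):

```rust
/// Earliest next_ready_at across accounts; None when no account reports one.
pub fn earliest_wake_at(next_ready_at: &[Option<u64>]) -> Option<u64> {
    next_ready_at.iter().flatten().copied().min()
}

/// Seconds to sleep before re-evaluating: time until wake_at plus jitter.
/// A wake_at already in the past yields just the jitter.
pub fn wake_delay_secs(now: u64, wake_at: u64, jitter_secs: u64) -> u64 {
    wake_at.saturating_sub(now).saturating_add(jitter_secs)
}
```

Accounts with no known next_ready_at are skipped rather than treated as ready, so a parked rotator with no timing data at all gets None and needs some other wake policy.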
Persistence/write contract for accounts.json
Because runtime is embedded in the same file:
- Atomic write (tmp + rename)
- File locking/serialized writes
- 0600 permissions
- Debounced flush (e.g. 5-10s or significant deltas)
- Startup schema validation/migration
Operational note:
- Durable smart rotation requires writable accounts.json.
- If accounts.json is mounted read-only, system must fall back to memory-only runtime state safely (non-durable) without crash.
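
A Unix-only sketch of the write path, covering the tmp + rename step, 0600 permissions, and the memory-only fallback. File locking, debouncing, and directory fsync are left out, and the names write_atomic, flush, and Persistence are hypothetical.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::{Path, PathBuf};

/// Writes contents to "<path>.tmp" with mode 0600, then renames it over path.
pub fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    // Drop any stale temp file so create_new applies the 0600 mode freshly.
    let _ = fs::remove_file(&tmp);
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true)
        .mode(0o600)
        .open(&tmp)?;
    file.write_all(contents)?;
    file.sync_all()?;

    // rename replaces the destination atomically on the same filesystem.
    fs::rename(&tmp, path)
}

/// Where runtime state ended up after a flush attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum Persistence {
    Durable,
    MemoryOnly,
}

/// Never fails: a write error (e.g. read-only mount) degrades to memory-only.
pub fn flush(path: &Path, contents: &[u8]) -> Persistence {
    match write_atomic(path, contents) {
        Ok(()) => Persistence::Durable,
        Err(_) => Persistence::MemoryOnly,
    }
}
```

Because the temp file sits next to accounts.json, a read-only mount fails at the open step and the caller simply keeps runtime state in memory.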
API observability additions
Extend /v1/accounts/status with:
- next_ready_at
- selection_score
- selection_reason
- last_known_model_state_summary
- confidence
Rollout plan
Phase 1
- add runtime schema in accounts.json
- populate from quota/rate-limiter/rotation
- expose status fields
Phase 2
- enable scorer behind feature flag
Phase 3
- make scorer default after burn-in
Acceptance criteria
- Mixed cooldown accounts -> best-ready account selected.
- State persists across restart when accounts.json is writable.
- Long windows (e.g. 5h / weekly) influence scheduling decisions.
- Missing/corrupt runtime metadata falls back safely to round-robin.
- Status API clearly explains selection rationale.
Implementation slices
- accounts.json schema extension (runtime per account)
- atomic + locked persistence path
- signal ingestion (quota + limiter + rotation)
- candidate scorer + fallback + hysteresis
- parked wake scheduler
- status API + documentation updates