
PR3: Explicit global scan-budget priority scheduler (issue #2, phase 5) #9

Open
jdkajewski wants to merge 3 commits into jdkajewski/ts-rewrite-plan from dk-global-scan-scheduler


jdkajewski commented Jun 20, 2026


What & why

Issue #2 phase 5: make the shared ~2 req/s scan budget an explicit priority queue keyed by value × staleness, instead of FIFO.

PR1 gave each market a per-market refresh interval. But when many markets come due in one sweep, refreshDue fetched every due market in scheduler (insertion) order with no cap — reading dead markets ahead of lane-critical ones and monopolising the same ~2 req/s budget that latency-sensitive trade/nav requests need. This PR spends a bounded per-sweep budget highest value×staleness first.

Design (scan-scoped, lever-gated, safe default OFF)

  • market/scanBudget.ts (PURE, unit-tested)
    • scanPriority(relValue, overrun) = max(0, relValue) × max(0, overrun), where relValue = value vs fleet mean (same scale as the PR1 scheduler) and overrun = (now − lastScanAt) / interval (1.0 at due, rising the longer it waits). Never-scanned markets get a cold-start overrun so they are classified promptly (unknown ≠ dead).
    • scanBudgetPerSweep = floor(reqPerSec × sweepSeconds × fraction), optional hard cap, floored at 1. The fraction reserves headroom so scans cannot starve trades.
    • allocateScanBudget(candidates, budget) → {granted, deferred, byTier}: sort by priority desc (tie → longest-overdue first), grant the top `budget` candidates, defer the rest.
    • Starvation-free: a deferred market's overrun keeps climbing every sweep, so its priority rises until it is granted.
  • markets.ts wiring: when SCAN_BUDGET_ON, refreshDue ranks the due set and fetches only the granted waypoints; exposes scanBudgetStatus(). OFF ⇒ byte-for-byte legacy fetch-all-due.
  • Metric: additive/conditional scanBudget block in the scan status snapshot (perSweep, due, granted, deferred, byTier) — the prioritisation + budget spend are now provable next to credits-per-request.
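
The ranking and grant/defer split described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of market/scanBudget.ts: the `ScanCandidate` shape, the `tier` labels, and counting `byTier` over granted reads only are all hypothetical choices made for the example.

```typescript
// Hypothetical candidate shape; field names are assumptions, not the PR's types.
interface ScanCandidate {
  waypoint: string;
  relValue: number; // value relative to the fleet mean
  overrun: number;  // (now - lastScanAt) / interval; 1.0 when just due
  tier: string;     // assumed labels such as "HOT" or "DEAD"
}

interface Allocation {
  granted: ScanCandidate[];
  deferred: ScanCandidate[];
  byTier: Record<string, number>; // assumed: granted count per tier
}

// Priority = value x staleness, with negative inputs clamped to zero.
function scanPriority(relValue: number, overrun: number): number {
  return Math.max(0, relValue) * Math.max(0, overrun);
}

function allocateScanBudget(candidates: ScanCandidate[], budget: number): Allocation {
  // Copy before sorting so the caller's array is left untouched (pure function).
  const ranked = [...candidates].sort((a, b) => {
    const diff = scanPriority(b.relValue, b.overrun) - scanPriority(a.relValue, a.overrun);
    // Tie on priority: the longest-overdue candidate goes first.
    return diff !== 0 ? diff : b.overrun - a.overrun;
  });
  const n = Math.max(0, Math.floor(budget));
  const granted = ranked.slice(0, n);
  const deferred = ranked.slice(n);
  const byTier: Record<string, number> = {};
  for (const c of granted) byTier[c.tier] = (byTier[c.tier] ?? 0) + 1;
  return { granted, deferred, byTier };
}
```

With three due markets and a budget of 2, a market at relValue 1 and overrun 2 ties on priority with one at relValue 2 and overrun 1; the tie-break grants the more overdue one first, and the third market is deferred to a later sweep.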

Scope boundary (deliberate)

Prioritises/caps only discretionary market-scan reads. Latency-sensitive action requests (navigate/dock/sell/buy/contract) keep hitting the client token bucket directly — a fully general global queue would risk starving navigation / deadlocking trade loops. The budget is derived from a SCAN_BUDGET_REQ_PER_SEC lever (mirrors the client's bucket), so this PR does not modify clients/spacetraders.ts.

Levers (all num()/boolOff, default OFF)

| Lever | Default | Meaning |
|---|---|---|
| SCAN_BUDGET_ON | off | gate the priority scheduler; off ⇒ legacy fetch-all-due |
| SCAN_BUDGET_REQ_PER_SEC | 2 | account request ceiling the budget is computed against |
| SCAN_BUDGET_REQ_FRACTION | 0.6 | fraction of sweep capacity scans may spend (rest reserved for trades) |
| SCAN_BUDGET_MAX_PER_SWEEP | 0 | absolute hard cap per sweep; 0 ⇒ fraction-derived only |
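
As a worked example of how these levers combine into a per-sweep budget, here is a minimal sketch. The `ScanBudgetLevers` type is hypothetical, `sweepSeconds` is an assumed input (the description does not state the sweep length), and applying the hard cap before the floor-at-1 is a guess at the ordering.

```typescript
// Hypothetical container for the levers listed above.
interface ScanBudgetLevers {
  reqPerSec: number;   // SCAN_BUDGET_REQ_PER_SEC
  fraction: number;    // SCAN_BUDGET_REQ_FRACTION
  maxPerSweep: number; // SCAN_BUDGET_MAX_PER_SWEEP; 0 means no hard cap
}

// floor(reqPerSec x sweepSeconds x fraction), optionally hard-capped, never below 1.
function scanBudgetPerSweep(levers: ScanBudgetLevers, sweepSeconds: number): number {
  const derived = Math.floor(levers.reqPerSec * sweepSeconds * levers.fraction);
  const capped = levers.maxPerSweep > 0 ? Math.min(derived, levers.maxPerSweep) : derived;
  return Math.max(1, capped);
}

// Table defaults with an assumed 10-second sweep: 2 x 10 x 0.6 = 12 reads.
const defaults: ScanBudgetLevers = { reqPerSec: 2, fraction: 0.6, maxPerSweep: 0 };
const perSweep = scanBudgetPerSweep(defaults, 10);
```

Under the same assumed sweep length, the remaining 40% of capacity (8 requests) is what stays available to trade and navigation traffic.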

Calibration

Live legacy .mjs history is unreachable from this env → principled defaults + the metric (as agreed for PR1/PR2). The credits-per-request + new scanBudget block let us measure the effect once the TS bot runs live.

Tests / verification

  • 12 scanBudget unit tests (priority ordering, budget derivation + headroom, hard cap, cold-start promotion, starvation-avoidance across sweeps, tier histogram, edge cases).
  • markets.test.ts wiring tests: null without the lever; due-burst capped to budget, rest deferred, status reported.
  • 279 bot tests green, @st/shared build + @st/bot tsc --noEmit clean, eslint clean on all changed files. (One pre-existing main.ts:63 lint error exists on the base and is unrelated to this PR.)
  • status-shape parity preserved (new field is conditional-spread).

Out of scope

Phase 6 (traders-as-scanners) → later PR. Does not touch issue #8 (putMarkets Record-vs-array 400).

Branched off latest jdkajewski/ts-rewrite-plan (PR1 #5 + PR2 #6 already merged) — diff is PR3-only, no stacking.


Update — correctness steers applied (commit 6e95e10)

Two reviewer steers, both landed:

1. Presence/coverage-gate the candidates. A GET on an uncovered market is ship-presence-gated → returns no live prices → spending scan budget there is a wasted request. allocateScanBudget now only ever ranks markets that are scannable right now:

  • The always-on fleet poll (fleetTableManager) now also snapshots every ship's present + inbound waypoint into state.coverageWps when SCAN_BUDGET_ON is set, reusing the existing getAllShips call with no extra request. (PR2's exact "present or inbound" coverage definition.)
  • markets.ts allocateDue filters the DUE burst to that covered set before allocation, so budget never leaks onto a price-blind read. The pure allocateScanBudget stays presence-agnostic (gating is the caller's job, documented in its JSDoc).
  • New uncovered field in the scanBudget metric makes the gate visible: due = granted + deferred + uncovered.
  • Cold-start safe: the gate is skipped while coverage is empty (before the first poll) → ungated/legacy behaviour until data exists. Cold re-check of uncovered markets stays PR2's separate presence-gated single read, not part of this budget.
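
The coverage gate in steer 1 amounts to a filter applied before allocation. A minimal sketch, assuming waypoints are plain strings and coverage is held in a `Set`; `gateDueByCoverage` and `GateResult` are hypothetical names, not the PR's `allocateDue`:

```typescript
interface GateResult {
  scannable: string[]; // due waypoints eligible for budget allocation
  uncovered: number;   // due waypoints skipped: no ship present or inbound
}

// coverageWps: waypoints with a ship present or inbound, as snapshotted by the fleet poll.
function gateDueByCoverage(due: string[], coverageWps: ReadonlySet<string>): GateResult {
  // Cold start: no coverage data yet, so skip the gate and keep legacy behaviour.
  if (coverageWps.size === 0) return { scannable: [...due], uncovered: 0 };
  const scannable = due.filter((wp) => coverageWps.has(wp));
  return { scannable, uncovered: due.length - scannable.length };
}
```

Because only `scannable` is handed to the allocator, granted plus deferred always equals `scannable.length`, which gives the accounting identity due = granted + deferred + uncovered.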

2. Average vs instantaneous headroom. SCAN_BUDGET_REQ_FRACTION < 1 reserves average headroom, but a tight burst of granted scans could still momentarily monopolise the FIFO token bucket and delay an action arriving mid-burst. Tradeoff for v1: SCAN_BUDGET_MAX_PER_SWEEP is the burst cap — set it to keep granted-per-sweep modest so the bucket refills between scans. I did not inject inter-read pacing sleeps, because the sweep is awaited by getMarkets callers and pacing there would stall the worker/contract loops; a true intra-sweep pacer + action-contention signal (needs token-bucket instrumentation) is deferred. Burst-cap alone is the v1 mechanism, as agreed.
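
To get a feel for why the burst cap matters, the delay an action can see behind a burst of scans can be estimated with a deliberately simple model. This is an assumption for illustration only, not a measurement and not the client's actual bucket logic: it supposes the bucket is empty, has no burst capacity, and serves requests strictly FIFO at `reqPerSec`. Both function names are hypothetical.

```typescript
// Rough model: an action queued behind `queuedAhead` scans waits about
// queuedAhead / reqPerSec seconds for those scans to drain.
function approxActionDelaySeconds(queuedAhead: number, reqPerSec: number): number {
  if (reqPerSec <= 0) throw new RangeError("reqPerSec must be positive");
  return Math.max(0, queuedAhead) / reqPerSec;
}

// Largest per-sweep burst that keeps the modelled delay within a target,
// never below 1 so at least one scan can still be granted.
function burstCapForDelay(maxDelaySeconds: number, reqPerSec: number): number {
  if (reqPerSec <= 0) throw new RangeError("reqPerSec must be positive");
  return Math.max(1, Math.floor(maxDelaySeconds * reqPerSec));
}
```

Under this model, 12 granted scans at 2 req/s could hold an action back by about 6 seconds, while a SCAN_BUDGET_MAX_PER_SWEEP of 4 would keep the modelled delay to about 2 seconds.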

Verification: +2 presence-gating unit tests (covered-subset → only covered granted + uncovered counted; empty set → ungated). 281 bot tests green, @st/shared build + @st/bot tsc --noEmit clean, eslint clean on changed files (pre-existing main.ts cast error unrelated, now line 65 after a +2-line insertion).


Update — calibrated from live production metrics (commit 1ecbf5d)

Folded in real calibration data parsed from 13.45h of the live UPRISING (.mjs) agent (phase PORTAL_OPEN, 226 ships, 29.15M cr) — gathered with zero API calls (log + status parse, production budget untouched). This replaces the "principled defaults" caveat with empirical backing.

Default changes / validation:

  • SCAN_BUDGET_REQ_FRACTION 0.6 → 0.4. The observed request mix was scans 40.1% / actions 55.7% / price 4.3%. The lever is the scan share, so capping scans at their empirical ~40% keeps actions' observed ~60%, directly matching Steer #2's action-protection goal. At the observed steady-state scan rate (5.74/min) the cap never binds; it only bounds a synchronized due-burst. (0.6 would reserve only 40% for actions, below the observed 56% action load; hence 0.4.)
  • LANE_VALUE_ALPHA / LANE_VALUE_HALFLIFE_MS / LANE_TOPK / COVERAGE_COLD_MULT: values validated, not changed — the real distribution (net/lane median 21,980, p90 60,480, 15.5% negative; ~12 dominant sinks/goods; ~0.8 lanes/min fleet-wide) supports the existing picks. Annotated each with its empirical basis in config.ts.
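
The arithmetic behind the 0.6 → 0.4 change can be checked with a few lines. The shares below are the observed figures quoted above; the helper names are hypothetical, and expressing the cap per minute is a simplification (the real budget is applied per sweep).

```typescript
// Observed request mix quoted in the PR (shares of all requests).
const observed = { scans: 0.401, actions: 0.557, price: 0.043 };

// Share of capacity left for non-scan traffic at a given scan fraction.
function nonScanHeadroom(scanFraction: number): number {
  return 1 - scanFraction;
}

// True when the reserved headroom covers the observed action share.
function coversActionLoad(scanFraction: number, actionShare: number): boolean {
  return nonScanHeadroom(scanFraction) >= actionShare;
}

// Sustained scan ceiling per minute, for comparison with the observed scan rate.
function scanCapPerMinute(reqPerSec: number, scanFraction: number): number {
  return reqPerSec * 60 * scanFraction;
}

const oldDefaultOk = coversActionLoad(0.6, observed.actions); // false: 40% headroom < 55.7%
const newDefaultOk = coversActionLoad(0.4, observed.actions); // true: 60% headroom >= 55.7%
```

At 2 req/s and a 0.4 fraction the sustained ceiling is roughly 48 scans per minute, far above the observed 5.74/min, which is consistent with the cap only binding during a synchronized due-burst.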

Replay calibration test (market/__tests__/replay.calibration.test.ts) — proves the win on real numbers, not toys. Feeds the real realized-net distribution (the topSinksByNet fixture) through the pure cores (laneRegistry → value.scoreMarkets → scanScheduler.intervalFor → scanBudget.allocateScanBudget) and asserts:

  • refresh cadence is monotonic in realized value (no inversions across the real sinks; dead markets strictly slower);
  • ≥10:1 hot:cold concentration and CV > 0.3, vs the observed near-uniform 24.65 refreshes/market (hottest real sink read >5× the flat baseline): the issue #2 thesis, demonstrated;
  • a constrained 8-read budget is spent value-first (HOT-skewed, zero dead reads granted);
  • a dead tail sized to the real 15.5% negative-lane share parks at the SCAN_MAX ceiling.
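
The concentration and monotonicity assertions listed above rest on a few simple statistics. A sketch of how they might be computed, with hypothetical helper names; whether the replay test uses population or sample standard deviation is not stated, so the population form is assumed here.

```typescript
// Coefficient of variation: population standard deviation divided by the mean.
function coefficientOfVariation(xs: number[]): number {
  if (xs.length === 0) throw new RangeError("need at least one value");
  const mean = xs.reduce((s, x) => s + x, 0) / xs.length;
  if (mean === 0) return 0;
  const variance = xs.reduce((s, x) => s + (x - mean) * (x - mean), 0) / xs.length;
  return Math.sqrt(variance) / mean;
}

// Ratio of the most-read to the least-read market (Infinity if any market got zero reads).
function hotColdRatio(refreshCounts: number[]): number {
  const hot = Math.max(...refreshCounts);
  const cold = Math.min(...refreshCounts);
  return cold === 0 ? Infinity : hot / cold;
}

// True when no lower-valued market is refreshed faster than a higher-valued one.
function cadenceIsMonotonic(markets: { value: number; intervalMs: number }[]): boolean {
  const sorted = [...markets].sort((a, b) => b.value - a.value);
  return sorted.every((m, i) => i === 0 || m.intervalMs >= sorted[i - 1].intervalMs);
}
```

For a perfectly uniform schedule such as [5, 5, 5] the CV is 0 and the ratio is 1; an illustrative (not real) skewed set like [100, 10, 10, 10] gives a 10:1 ratio and a CV well above 0.3.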

286 bot tests green (+5 replay), @st/shared build + @st/bot tsc clean, eslint clean.

Open calibration nuance for the reviewer: I read "leave ~0.6 for actions / default 0.6 looks well-chosen" as referring to the action reservation; since this lever is the scan fraction, I set it to 0.4 so actions keep their observed ~60%. Flag if you'd rather keep the literal 0.6 (it's an opt-in lever, default OFF, so either is safe).

…phase 5)

PR1 gave each market a per-market refresh interval; when many markets came
due in one sweep, refreshDue fetched every due market FIFO, reading dead
markets ahead of lane-critical ones and monopolising the shared ~2 req/s
budget that trade/nav also need. This makes the allocation explicit.

- market/scanBudget.ts (PURE): scanPriority = relValue x overrun (staleness),
  scanBudgetPerSweep = floor(reqPerSec x sweepSeconds x fraction) capped, and
  allocateScanBudget → {granted, deferred, byTier}. Deferral is not starvation:
  a deferred market's overrun keeps rising until it wins a later sweep.
- markets.ts: when SCAN_BUDGET_ON, refreshDue ranks the due set by value x
  staleness and grants only the per-sweep budget (rest ride the next sweep);
  exposes scanBudgetStatus() for the metric. OFF ⇒ byte-for-byte legacy
  fetch-all-due.
- main.ts: additive/conditional `scanBudget` block in the scan status snapshot
  (perSweep, due, granted, deferred, byTier) so the prioritisation is provable.
- config.ts: SCAN_BUDGET_ON (boolOff), SCAN_BUDGET_REQ_PER_SEC (2),
  SCAN_BUDGET_REQ_FRACTION (0.6), SCAN_BUDGET_MAX_PER_SWEEP (0). Default OFF.

Scoped to discretionary market-scan reads only; latency-sensitive action
requests keep hitting the client token bucket directly (no nav/trade
starvation). Calibration data unreachable from env → principled defaults + the
metric. 12 scanBudget unit tests + markets wiring tests; 279 bot tests green,
tsc + eslint clean (scoped files).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jdkajewski and others added 2 commits June 20, 2026 10:12
Apply the two orchestrator correctness steers to the global scan-budget
scheduler (issue #2, phase 5):

1. Presence/coverage-gate the scan candidates. A `GET /market` only returns
   live prices where a ship is present, so spending budget on an uncovered
   market is a wasted request. The fleet poll (`fleetTableManager`, already
   always-on) now also snapshots every ship's present + inbound waypoint into
   `state.coverageWps` when `SCAN_BUDGET_ON` — no extra `getAllShips` call.
   `allocateDue` filters the DUE burst to that covered set before allocation
   and records the skipped count as a new `uncovered` scan-budget metric.
   The gate is skipped when coverage is empty (cold start, before the first
   poll) so behaviour matches today until coverage data exists. Cold re-check
   of uncovered markets stays PR2's separate presence-gated single read.

2. Headroom: `SCAN_BUDGET_MAX_PER_SWEEP` already caps the per-sweep burst to
   keep grants modest; documented the burst-vs-pacing tradeoff rather than
   injecting sleeps into the awaited sweep (would stall getMarkets callers).

Pure `allocateScanBudget` stays presence-agnostic (gating is the caller's
job, documented). +2 unit tests (covered-subset → only covered granted +
uncovered counted; empty set → ungated). 281 bot tests green, tsc + eslint
clean (pre-existing main.ts cast error unrelated).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fold in real calibration data parsed from 13.45h of the live production .mjs
agent (UPRISING, phase PORTAL_OPEN, 226 ships) — zero API calls spent
(log + status parse). Replaces "principled defaults" with empirical backing.

Defaults:
- SCAN_BUDGET_REQ_FRACTION 0.6 -> 0.4. Observed request mix was scans 40.1% /
  actions 55.7% / price 4.3%, so capping scans at their empirical ~40% share
  keeps actions' observed ~60% — matching Steer #2's action-protection intent
  ("leave ~0.6 for actions"). At the observed steady-state scan rate
  (5.74/min) the cap never binds; it only bounds a synchronized due-burst.
  (Note: the lever is the SCAN fraction; 0.6 would reserve only 40% for
  actions, below the observed 56% action load — hence 0.4.)
- LANE_VALUE_* / LANE_TOPK / COVERAGE_COLD_MULT: values VALIDATED against the
  real distribution (net/lane median 21,980 p90 60,480, 15.5% negative; ~12
  dominant sinks/goods; ~0.8 lanes/min) and annotated with the empirical basis.
  No value change — the data supports the existing picks.

Replay calibration test (market/__tests__/replay.calibration.test.ts):
feeds the REAL realized-net distribution (topSinksByNet fixture) through the
pure cores (laneRegistry -> value.scoreMarkets -> scanScheduler.intervalFor ->
scanBudget.allocateScanBudget) and proves the win on real numbers:
- refresh cadence is MONOTONIC in realized value (no inversions across the
  real sinks; dead markets strictly slower);
- concentration is >=10:1 hot:cold and CV > 0.3, vs the observed near-uniform
  24.65 refreshes/market (hot read >5x the flat baseline);
- a constrained 8-read budget is spent value-first (HOT-skewed, zero dead
  reads granted);
- a dead tail sized to the real 15.5% negative-lane share parks at SCAN_MAX.

286 bot tests green (+5 replay), tsc + eslint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>