Performance research and runtime-core scaffolding for long-lived AI interfaces: append-heavy, viewport-centric surfaces such as chat sessions, agent traces, coding assistants, logs, and review workspaces.
This is not a production UI framework. It is a measured portfolio/research project that studies why long-running AI surfaces can become interaction-sensitive, then derives a worker-resident runtime direction from controlled evidence.
- What it is: a research-backed OSS / portfolio engineering asset for long-lived AI surfaces.
- Problem studied: append-heavy, viewport-centric, tail-mutating AI workloads can stress document/tree-oriented DOM or VDOM ownership models.
- Technical direction: worker-resident ownership/offload, transaction scheduling, and bounded projection commit.
- Evidence: controlled P1 benchmarks plus synthetic P5 scheduling-delay proxy evidence support reducing and localizing main-thread blocking under these workloads.
- Boundary: not a production UI, not browser-level INP or Event Timing evidence, not real product superiority, and not P4/WebGPU or P7 productization.
Streaming UI Runtime is a research-backed OSS / portfolio engineering asset for long-lived AI surfaces. It studies the workload-architecture mismatch between append-heavy, viewport-centric, tail-mutating AI workloads and document/tree-oriented DOM or VDOM UI stacks.
The runtime direction is worker-resident logical ownership, transaction scheduling, and bounded viewport projection. The main thread still matters, but the intended role is bounded projection commit rather than ownership of all session-scale logical work.
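To make "bounded viewport projection" concrete, here is a minimal sketch assuming a session is an append-only array of items and the view shows a fixed-size window. The names `SessionItem`, `ViewportProjection`, `projectViewport`, and `projectTail` are hypothetical illustrations, not the repository's actual API.

```typescript
// Hypothetical types for illustration only; not taken from runtime/.
interface SessionItem {
  id: number;
  text: string;
}

interface ViewportProjection {
  firstIndex: number;             // index of the first projected item in the logical session
  items: readonly SessionItem[];  // at most maxItems entries
  totalCount: number;             // logical session size, e.g. for scrollbar math
}

// Project at most `maxItems` items starting near `firstVisible`.
// The result size is bounded by the window, not by the session length.
function projectViewport(
  session: readonly SessionItem[],
  firstVisible: number,
  maxItems: number,
): ViewportProjection {
  const lastStart = Math.max(0, session.length - maxItems);
  const start = Math.max(0, Math.min(firstVisible, lastStart));
  const end = Math.min(session.length, start + maxItems);
  return {
    firstIndex: start,
    items: session.slice(start, end),
    totalCount: session.length,
  };
}

// Tail-following view: the window is pinned to the newest items.
function projectTail(
  session: readonly SessionItem[],
  maxItems: number,
): ViewportProjection {
  return projectViewport(session, session.length - maxItems, maxItems);
}

const demoSession: SessionItem[] = Array.from({ length: 10000 }, (_, i) => ({
  id: i,
  text: `message ${i}`,
}));
const demoTail = projectTail(demoSession, 20);
console.log(demoTail.firstIndex, demoTail.items.length, demoTail.totalCount);
// prints: 9980 20 10000
```

In this sketch the logical session can keep growing while the projected payload stays at 20 items, which is the property the main thread would rely on when committing a projection.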
- Not a production AI chat UI.
- Not a productized Agent Trace Viewer.
- Not a commercial product.
- Not a WebGPU-first renderer.
- Not browser-level INP or Event Timing evidence.
- Not real product superiority evidence.
- Not production readiness.
- Not P4/WebGPU authorization or P7 productization.
Most web UI stacks are optimized around document-oriented DOM or VDOM updates. Long-lived AI surfaces behave differently:
- sessions grow over time;
- new content streams into the tail;
- users still need low-latency input, click, and scroll paths;
- background state/fanout work can collide with urgent interactions;
- visible output is usually a bounded viewport projection, not the whole logical session.
The thesis is a workload-architecture mismatch: long-lived AI surfaces increasingly look more like terminal/editor/log surfaces than ordinary document pages.
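The collision between background fanout work and urgent interactions can be illustrated with a two-lane queue. This is a minimal sketch under the assumption that work is expressed as small synchronous transactions; `TwoLaneScheduler`, `Transaction`, and `drain` are hypothetical names, not the scheduler shipped in this repository.

```typescript
// Hypothetical two-lane scheduler for illustration only.
type Priority = "urgent" | "background";

interface Transaction {
  id: string;
  priority: Priority;
  run: () => void;
}

class TwoLaneScheduler {
  private urgent: Transaction[] = [];
  private background: Transaction[] = [];

  enqueue(tx: Transaction): void {
    (tx.priority === "urgent" ? this.urgent : this.background).push(tx);
  }

  // Run at most `maxTransactions`, always taking urgent work first.
  // Returns the ids in execution order.
  drain(maxTransactions: number): string[] {
    const executed: string[] = [];
    while (executed.length < maxTransactions) {
      const next = this.urgent.shift() ?? this.background.shift();
      if (next === undefined) break;
      next.run();
      executed.push(next.id);
    }
    return executed;
  }

  get pending(): number {
    return this.urgent.length + this.background.length;
  }
}

const scheduler = new TwoLaneScheduler();
const noop = (): void => {};
scheduler.enqueue({ id: "bg-1", priority: "background", run: noop });
scheduler.enqueue({ id: "bg-2", priority: "background", run: noop });
scheduler.enqueue({ id: "click", priority: "urgent", run: noop });
scheduler.enqueue({ id: "bg-3", priority: "background", run: noop });

console.log(scheduler.drain(2)); // prints: [ 'click', 'bg-1' ]
console.log(scheduler.pending);  // prints: 2
```

With a single FIFO queue the click would wait behind two background transactions; here it runs first, and the bounded `drain` call leaves the remaining background work for a later slice.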
| Stage | Question | Evidence-backed result | Boundary |
|---|---|---|---|
| P0 product-trace motivation | What mechanism appears in long sessions? | 600ms+ interaction windows in product traces, dominated by click -> microtask-heavy coordination | Motivating trace evidence, not source replay |
| F0-D controlled reproduction | Can a controlled workload reproduce long main-thread tasks? | f0_run_task_max_ms mean about 68.633ms; max 70.117ms; one 50ms+ long task per run | Controlled derived-fanout workload |
| F1 worker offload | Can equivalent work leave the main thread? | main-thread max task mean about 2.679ms; long task count 0 | Worker-offload solution lever, not full runtime |
| F2 worker scheduling | Does worker scheduling improve controlled urgent projection timing? | controlled urgent projection timing mean about 22.867ms -> 3.333ms | Urgent projection timing improvement, not throughput or user-perceived latency proof |
| P2 pure core | What runtime scaffold follows from the evidence? | protocol, validation, scheduler, state-store, projection policy, adapters, and harnesses frozen with 406/406 runtime tests passing | Pure core only, not real Worker/Main integration |
| P1 streaming Markdown stability demo | What does tail-mutating Markdown churn look like in a browser? | local demo compares naive full reparse against stable completed-block plus mutable-tail rendering across four simulated cases | Demonstration only; not a production Markdown library, not browser-level INP, and not a provider integration |
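The "stable completed-block plus mutable-tail" idea from the P1 demo row can be sketched as a splitter that treats blank-line-separated blocks as completed and keeps only the last, still-growing block as the tail. This is an illustrative approximation, assuming an append-only stream and ignoring most Markdown edge cases; `splitStableBlocks` is a hypothetical name and not the demo's actual code.

```typescript
// Hypothetical splitter for illustration only.
interface SplitResult {
  completed: string[]; // blocks treated as stable; a renderer could cache these
  tail: string;        // last block, still mutating while tokens stream in
}

// Built with repeat() so this listing does not contain a literal fence line.
const FENCE = "`".repeat(3);

function splitStableBlocks(markdown: string): SplitResult {
  const completed: string[] = [];
  let current: string[] = [];
  let inFence = false;

  for (const line of markdown.split("\n")) {
    // Toggle fence state so blank lines inside code blocks do not split them.
    if (line.trimStart().startsWith(FENCE)) inFence = !inFence;

    if (line.trim() === "" && !inFence) {
      if (current.length > 0) {
        completed.push(current.join("\n"));
        current = [];
      }
    } else {
      current.push(line);
    }
  }
  return { completed, tail: current.join("\n") };
}

const streamed = ["# Title", "", "First paragraph.", "", "Second par"].join("\n");
const split = splitStableBlocks(streamed);
console.log(split.completed); // prints: [ '# Title', 'First paragraph.' ]
console.log(split.tail);      // prints: Second par
```

A naive renderer would reparse all three blocks on every token. Under the append-only assumption, only `tail` needs reparsing, and a block moves into `completed` once a blank line follows it.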
See docs/portfolio/evidence-map.md for claim-to-evidence mapping. Use docs/portfolio/document-status-map.md to distinguish current public-reader documents from historical drafts and process notes.
Controlled P1 and synthetic P5 evidence support worker-resident ownership/offload as a way to reduce and localize main-thread blocking under long-lived AI-surface workloads. The bounded main-thread projection commit is the blocking window that remains.
This is a conservative mechanism claim. It is not browser-level INP, not Event Timing, not real product superiority, and not production readiness.
- Recruiters: read this README and the portfolio overview for the project story and resume-safe scope.
- Engineers: read the evidence map and related systems positioning for architecture and comparison context.
- Reviewers: read the short paper draft, P5 appendix, P5 final packet, and P5 adversarial audit.
Read in this order:
- Portfolio overview
- Document status map
- Evidence map
- Related systems positioning
- Short paper draft
- P5 scheduling evidence appendix
- P5 final reviewer evidence packet
- P5 adversarial audit
P5 is a synthetic scheduling-mechanism evidence chain for long-lived AI surfaces. It studies main-thread blocking under send-start, commit-window, dynamic active-context, multistream, and product-trace-shaped synthetic workloads.
The strongest current result is the P5-X product-trace-shaped synthetic scheduling-delay proxy: B2x 176.1ms vs R0x 0.1ms under equal trace/logical invariants. Interpret this as a blocked-vs-near-unblocked internal scheduling category, not as browser-level INP and not as a precise user-perceived speedup.
R0 does not eliminate work. It moves logical send/update/multistream/product-trace-shaped work into a Worker and localizes the remaining main-thread blocking to the bounded projection commit. P4 remains unauthorized.
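As a rough illustration of what "bounded projection commit" means for the main thread, the sketch below applies at most a fixed number of patches per commit and hands the rest back for a later commit. `ProjectionPatch` and `commitBounded` are hypothetical names, and a count-based bound is an assumption; the actual commit gates in this repository may bound work differently.

```typescript
// Hypothetical bounded commit for illustration only.
interface ProjectionPatch {
  index: number; // viewport slot to update
  text: string;  // new content for that slot
}

// Apply at most `maxPatches` patches to `view` and return the deferred remainder.
// The work done per call is bounded regardless of how many patches are pending.
function commitBounded(
  view: Map<number, string>,
  pending: readonly ProjectionPatch[],
  maxPatches: number,
): ProjectionPatch[] {
  const limit = Math.max(0, maxPatches);
  for (const patch of pending.slice(0, limit)) {
    view.set(patch.index, patch.text);
  }
  return pending.slice(limit);
}

const view = new Map<number, string>();
const patches: ProjectionPatch[] = [0, 1, 2, 3, 4].map((i) => ({
  index: i,
  text: `row ${i}`,
}));

const deferred = commitBounded(view, patches, 2);
console.log(view.size, deferred.length); // prints: 2 3
```

The deferred patches would be carried into the next commit window, so a burst of logical updates does not translate into one unbounded main-thread task.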
See P5 final reviewer evidence packet, P5 adversarial audit, and P5 scheduling evidence summary.
- TypeScript pure-core runtime modules under runtime/
- operation and transaction validation
- scheduler and backpressure policies
- projection policy and bounded commit gates
- immutable state store and op log
- message serialization and checksums
- pure worker/main adapter contracts
- in-memory roundtrip and session scenario harnesses
- runtime guard checks
- controlled benchmark targets and analysis scripts
- isolated browser streaming Markdown stability demo
- paper-style documentation and figure drafts
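The op-log and serialization items above can be pictured with a small sketch: an append-only log that returns a new frozen array on each append, plus an envelope that carries a checksum over the serialized payload. Everything here is an assumption for illustration. `AppendOp`, `Envelope`, `encodeOp`, `decodeOp`, and `appendOp` are hypothetical names, and the FNV-1a-style hash is a stand-in, not necessarily the checksum the runtime modules use.

```typescript
// Hypothetical op log and envelope for illustration only.
interface AppendOp {
  seq: number;  // position in the log; must equal the current log length
  text: string;
}

interface Envelope {
  payload: string;  // JSON-serialized op
  checksum: number; // 32-bit hash of the payload
}

// FNV-1a-style 32-bit hash over UTF-16 code units (stand-in checksum).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

function encodeOp(op: AppendOp): Envelope {
  const payload = JSON.stringify(op);
  return { payload, checksum: fnv1a(payload) };
}

// Reject envelopes whose payload no longer matches the checksum.
function decodeOp(envelope: Envelope): AppendOp {
  if (fnv1a(envelope.payload) !== envelope.checksum) {
    throw new Error("checksum mismatch");
  }
  return JSON.parse(envelope.payload) as AppendOp;
}

// Immutable append: validates the sequence number and returns a new frozen log.
function appendOp(log: readonly AppendOp[], op: AppendOp): readonly AppendOp[] {
  if (op.seq !== log.length) {
    throw new Error(`expected seq ${log.length}, got ${op.seq}`);
  }
  return Object.freeze([...log, op]);
}

const log0: readonly AppendOp[] = Object.freeze([]);
const log1 = appendOp(log0, decodeOp(encodeOp({ seq: 0, text: "hello" })));
console.log(log0.length, log1.length); // prints: 0 1
```

Because each append returns a new array, an earlier snapshot such as `log0` stays valid for readers while newer ops arrive, and a corrupted or reordered message is rejected before it reaches the log.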
Not implemented in this repository:
- production Worker runtime
- production main-thread runtime
- DOM/React integration
- projection engine
- Canvas, OffscreenCanvas, or WebGPU backend
- product integration
- accessibility/focus/caret production model
- broad workload matrix
- multi-urgent stress testing
- Not browser-level INP.
- Not Event Timing.
- Not production runtime evidence.
- Not real product superiority.
- Not a complete Canvas, OffscreenCanvas, or WebGPU backend.
- Not P4 authorization.
- Not P7 productization.
Install dependencies:
npm install
Run validation:
npm run typecheck
npm run test:runtime
npm run check:runtime-guards
npm run check:p2-tooling
Serve the controlled P0 target:
node scripts/p0/serve_controlled_target.mjs --host 127.0.0.1 --port 4317 --default-level L1
Open the default controlled surface:
http://127.0.0.1:4317/controlled_append_surface.html?level=L1
Serve the P1 local browser demos:
node scripts/p1/serve_p1_streaming_baselines.mjs --host 127.0.0.1 --port 4319
Open the streaming Markdown stability demo:
http://127.0.0.1:4319/p1_streaming_markdown_stability_demo.html
Inspect capture CLI usage:
bash scripts/p0/run_capture.sh --help
Print benchmark matrices:
bash scripts/p0/print_p0d_matrix.sh
bash scripts/p0/print_p0e_matrix.sh
bash scripts/p1/print_p1a_b0_b1_matrix.sh
bench/ controlled targets, scenarios, and public benchmark summaries
docs/p0/ P0 motivation, measurement notes, and trace-derived analysis
docs/p1/ controlled baseline/offload/scheduler result notes
docs/p2/ pure-core runtime abstraction and freeze docs
docs/paper/ paper draft, review packet, and figure drafts
docs/portfolio/ public portfolio packaging and release-safety notes
runtime/ TypeScript pure-core runtime scaffold
scripts/ capture, serving, analysis, and guard scripts
tests/runtime/ runtime contract and policy tests
This repository should only publish sanitized benchmark summaries and reproducible synthetic/controlled workloads. Private traces, raw captures, credentials, local logs, and private result folders must stay out of the public repo.
See docs/portfolio/privacy-and-data.md.
Public-facing documentation has been added, and raw product-derived trace CSVs are excluded from tracked public evidence. Final publication should use the sanitized summaries under docs/portfolio/results/ and should not publish local trace-derived CSVs without a separate privacy review.
- Portfolio overview
- Document status map
- Evidence map
- One-pager
- Interview pitch
- Walkthrough script
- Application and outreach pack
- Architecture diagrams
- Benchmark suite mini spec
- Streaming Markdown stability demo post-audit
- Results summary
- Privacy and data boundary
This public repo is a sanitized portfolio snapshot. Raw trace-derived CSVs and trace-specific research notes are excluded from public history.