Commit af30cb9

feat: Update execution plans with external capability verification and refine embedding provider integration
1 parent 8b1180b commit af30cb9

2 files changed

Lines changed: 42 additions & 13 deletions

CORTEX-DESIGN-PLAN-TODO.md

Lines changed: 19 additions & 9 deletions
@@ -33,10 +33,15 @@ Completed since the prior snapshot:
 6. Deterministic dummy SHA-256 embedder exists for pre-model hotpath testing (`embeddings/DeterministicDummyEmbeddingBackend.ts`).
 7. Benchmark harness exists for dummy embedder throughput baselining (`npm run benchmark:dummy`).
 8. Baseline adaptive provider selection exists with capability filtering + benchmark-based winner choice (`embeddings/ProviderResolver.ts`, `embeddings/EmbeddingRunner.ts`).
+9. External capability verification completed for real-provider planning:
+- Transformers.js is ONNX Runtime-backed and directly exposes `webnn`, `webgpu`, and `wasm` device paths.
+- Transformers.js does not currently expose `webgl` as a direct device; `webgl` should remain an explicit ORT adapter path.

 Next focus:
 1. Wire resolved model profiles into runtime ingest/query entry points.
-2. Add real embedding providers (ONNX/Transformers/WebNN/WebGPU/WebGL/WASM) to resolver candidate sets.
+2. Add real embedding providers to resolver candidate sets, split by runtime family:
+- Transformers.js provider (`webnn/webgpu/wasm`)
+- Explicit ORT WebGL provider (`webgl`)
 3. Add browser/electron runtime test lanes to match merge-gate policy.

 ## 1. Design
@@ -150,18 +155,23 @@ Performance budget targets for v1:

 Graceful degradation:
 1. `webgpu` preferred.
-2. `webgl` fallback.
-3. `webnn` optional path for matmul-friendly ops.
+2. `webnn` optional path for matmul-friendly ops.
+3. `webgl` fallback via explicit ORT adapter path.
 4. `wasm` guaranteed baseline.

+Implementation note (verified 2026-03-11):
+1. Transformers.js path currently maps to `webnn/webgpu/wasm` (no direct `webgl` device key).
+2. Keep `webgl` in architecture through the explicit ORT adapter backend.
+
 ### 1.9 Current gap analysis from repo snapshot
 Observed blockers in current PoC files:
-1. Embedding runtime modules (provider resolver + runner) are still missing.
-2. Ingest/query orchestrators are not yet wired to resolved `ModelProfile` values.
-3. Browser/Electron runtime test lanes are not yet implemented in scripts/CI.
-4. Shader and backend files compile but are not yet integrated into a full vertical runtime path.
+1. Embedding runtime modules exist (`ProviderResolver` + `EmbeddingRunner`), but only baseline/dummy-provider flow is wired.
+2. Real provider adapters are not yet wired (Transformers.js for `webnn/webgpu/wasm`; explicit ORT adapter for `webgl`).
+3. Ingest/query orchestrators are not yet wired to resolved `ModelProfile` values.
+4. Browser/Electron runtime test lanes are not yet implemented in scripts/CI.
+5. Shader and backend files compile but are not yet integrated into a full vertical runtime path.

-These are Phase 0 blockers and should be fixed before feature work.
+These are the remaining vertical-slice blockers before broader feature expansion.

 ## 2. Implementation Plan

@@ -318,7 +328,7 @@ Priority legend:
 6. Add corruption recovery tooling for vector store and metadata store.
 7. Add schema migration tests across multiple versions.
 8. Add large-corpus stress tests for memory and latency.
-9. Add adaptive runtime policy based on backend capability.
+9. Extend adaptive runtime policy with real providers and runtime telemetry persistence.
 10. Add resource governance controls for Daydreamer CPU budget.
 11. Improve ranking quality with optional rerank stage.
 12. Add developer docs with architecture diagrams and troubleshooting.
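
The "capability filtering + benchmark-based winner choice" described in this file could look roughly like the sketch below. It is only an illustration under stated assumptions: the type names, the `CANDIDATES` table, and both function names are hypothetical, and the actual shapes in `embeddings/ProviderResolver.ts` are not visible in this diff.

```typescript
// Hypothetical types for illustration only; not taken from the repository.
type Device = "webnn" | "webgpu" | "webgl" | "wasm";
type RuntimeFamily = "transformers" | "ort-webgl";

interface ProviderCandidate {
  family: RuntimeFamily;
  device: Device;
}

// Candidate set split by runtime family, mirroring the "Next focus" item:
// Transformers.js covers webnn/webgpu/wasm, the explicit ORT adapter covers webgl.
const CANDIDATES: ProviderCandidate[] = [
  { family: "transformers", device: "webnn" },
  { family: "transformers", device: "webgpu" },
  { family: "transformers", device: "wasm" },
  { family: "ort-webgl", device: "webgl" },
];

// Step 1: capability filtering. Drop candidates whose device is unavailable.
function filterByCapability(
  candidates: ProviderCandidate[],
  supported: ReadonlySet<Device>,
): ProviderCandidate[] {
  return candidates.filter((c) => supported.has(c.device));
}

// Step 2: benchmark-based winner choice. The highest measured throughput wins;
// candidates without a measurement are skipped.
function pickWinner(
  candidates: ProviderCandidate[],
  throughput: ReadonlyMap<Device, number>,
): ProviderCandidate | undefined {
  let best: ProviderCandidate | undefined;
  let bestScore = -Infinity;
  for (const c of candidates) {
    const score = throughput.get(c.device);
    if (score !== undefined && score > bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}
```

Filtering before benchmarking keeps unsupported devices out of the comparison entirely, so a stale measurement for an unavailable backend cannot win.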

PROJECT-EXECUTION-PLAN.md

Lines changed: 23 additions & 4 deletions
@@ -38,10 +38,27 @@ Completed in this pass:
 - selection now supports capability filtering + benchmark-based winner choice

 Open items carried to next pass:
-1. Wire resolved `ModelProfile` into first concrete ingest/query orchestrator path (once those runtime modules are added).
+1. Wire resolved `ModelProfile` into first concrete ingest/query orchestrator path.
 2. Add real embedding providers (ONNX/Transformers/WebNN/WebGPU/WebGL/WASM) as candidates for the resolver.
 3. Add browser/electron runtime scripts and CI lanes for non-Node merge gating.

+### External Capability Verification (2026-03-11)
+
+Verified during this pass (hard handoff facts):
+1. Transformers.js is ONNX Runtime-backed (docs + upstream source).
+2. Transformers.js device mapping exposes `webnn`, `webnn-gpu`, `webnn-cpu`, `webnn-npu`, `webgpu`, and `wasm` for web environments.
+3. Transformers.js does not currently expose `webgl` as a direct `device` type; WebGL should remain an explicit ORT adapter path.
+4. Node-side ONNX providers (platform-dependent) include `cuda`, `dml`, and `coreml` in upstream mapping.
+
+Source anchors to re-check quickly in a new session:
+1. `huggingface/transformers.js` -> `packages/transformers/src/backends/onnx.js`:
+- `DEVICE_TO_EXECUTION_PROVIDER_MAPPING`
+- `deviceToExecutionProviders(...)`
+2. `huggingface/transformers.js` -> `packages/transformers/src/utils/devices.js`:
+- `DEVICE_TYPES`
+3. Transformers.js docs (`v3.8.1` and `main`):
+- index + WebGPU guide + ONNX backend API pages
+
 ## Next Session Highest Priority (P0)

 Connect adaptive embedding selection to runtime orchestration and add real provider candidates.
@@ -61,7 +78,9 @@ Definition of done for this pass:

 1. Strict TDD: Red -> Green -> Refactor for every slice.
 2. Runtime realism: browser and Electron lanes are required merge gates.
-3. Provider fallback policy: `webnn -> webgpu -> webgl -> wasm`.
+3. Provider fallback policy:
+- Transformers.js path: `webnn -> webgpu -> wasm`
+- Explicit ORT path: `webnn -> webgpu -> webgl -> wasm`
 4. Numeric ownership: model-derived values from profile; policy values from declared policy objects.

 ## Execution Sequence
@@ -76,8 +95,8 @@ Definition of done for this pass:
 5. Implement embedding runner with fallback chain and telemetry:
 - `embeddings/EmbeddingRunner.ts` ✅ baseline done (2026-03-11)
 - `embeddings/ProviderResolver.ts` ✅ baseline done (2026-03-11)
-- `embeddings/TransformersEmbeddingBackend.ts`
-- `embeddings/OrtWebglEmbeddingBackend.ts`
+- `embeddings/TransformersEmbeddingBackend.ts` (targeting `webnn/webgpu/wasm`)
+- `embeddings/OrtWebglEmbeddingBackend.ts` (explicit `webgl` path)
 - `embeddings/OnnxEmbeddingRunner.ts`
 6. Build Hippocampus ingest using profile-derived chunking/dimensions.
 7. Build Cortex retrieval using profile-derived routing/truncation policies.
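
The two provider fallback chains from the definition of done can be expressed as ordered lists that a runner walks until one device is usable. The sketch below assumes a caller-supplied `isUsable` probe. `FALLBACK_CHAINS`, `resolveDevice`, and the `skipped` telemetry field are hypothetical names, not the API of `embeddings/EmbeddingRunner.ts`.

```typescript
type Device = "webnn" | "webgpu" | "webgl" | "wasm";
type ProviderPath = "transformers" | "ort";

// Ordered chains copied from the fallback policy in this plan.
const FALLBACK_CHAINS: Record<ProviderPath, readonly Device[]> = {
  transformers: ["webnn", "webgpu", "wasm"],
  ort: ["webnn", "webgpu", "webgl", "wasm"],
};

interface FallbackResult {
  device: Device;    // first usable device in the chain
  skipped: Device[]; // devices rejected before it, kept as simple telemetry
}

// Walk the chain in order and return the first device the probe accepts.
// `isUsable` is an assumed hook (e.g. feature detection or a warm-up run).
function resolveDevice(
  path: ProviderPath,
  isUsable: (device: Device) => boolean,
): FallbackResult {
  const skipped: Device[] = [];
  for (const device of FALLBACK_CHAINS[path]) {
    if (isUsable(device)) {
      return { device, skipped };
    }
    skipped.push(device);
  }
  // The plan treats `wasm` as the guaranteed baseline, so this is only
  // reached if the probe rejects even that.
  throw new Error(`no usable device on ${path} path: ${skipped.join(" -> ")}`);
}
```

Recording the skipped devices alongside the winner gives the telemetry side of item 5 something concrete to persist without changing the fallback order.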
