Conversation
…t suites Agent-Logs-Url: https://github.com/DaScient-Intelligence/Plan-Examiner/sessions/12d85c28-ffe2-4719-9c89-2471b6fcd098 Co-authored-by: DaScient <25983786+DaScient@users.noreply.github.com>
…, consent), update README/SECURITY/TEST_PLANS Agent-Logs-Url: https://github.com/DaScient-Intelligence/Plan-Examiner/sessions/12d85c28-ffe2-4719-9c89-2471b6fcd098 Co-authored-by: DaScient <25983786+DaScient@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds an opt-in AI-vision augmentation to the existing deterministic PDF extraction pipeline, plus hardening of the BYO-key LLM bridge (storage, transport, validation, redaction, consent) and associated UI/docs/tests.
Changes:
- Introduces optional PDF page rasterization + vision fact extraction, merges vision facts additively, and surfaces regex↔vision conflicts as REVIEW findings.
- Reworks
PE.LLMinto a single outbound-request chokepoint with schema-versioned config, session-only storage option, base-URL validation, capability detection, multimodalsend(), retries/timeouts, and header isolation. - Adds extensive
node --testcoverage and updates documentation (README/SECURITY/TEST_PLANS) for vision mode and BYO-key hardening.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/pipeline.vision.test.js | Verifies pipeline vision gating, merge behavior, conflict→REVIEW finding, and graceful degradation. |
| tests/log.redaction.test.js | Ensures logger redaction prevents credential leakage in buffer and formatted output. |
| tests/llm-bridge.test.js | Covers LLM capability detection, URL validation, error normalization, header isolation, and vision JSON parsing/clamping. |
| tests/extractors.vision.test.js | Unit-tests vision/regex merge logic, provenance, conflicts, and malformed/low-confidence handling. |
| index.html | Expands AI Settings UI and adds a one-time vision consent modal. |
| assets/js/utils/log.js | Adds defense-in-depth redaction for sensitive keys/values in log messages and data payloads. |
| assets/js/app.js | Updates AI settings UX (test connection, session-only, vision toggle/pages, consent flow) and nav badge. |
| assets/js/agent/pipeline.js | Adds vision sub-step after regex extraction, merges facts, records provenance, and emits REVIEW findings on conflicts. |
| assets/js/agent/llm-bridge.js | Major rewrite: storage options, base URL validation, capability detection, unified fetch w/ retries+timeouts, multimodal support, ping/endpoints, vision JSON parsing and schema clamping. |
| assets/js/agent/extractors.js | Adds PDF rasterization helper and deterministic merge routine for vision facts. |
| TEST_PLANS.md | Adds manual vision-mode verification walkthrough and references new test suites. |
| SECURITY.md | Expands threat model and mitigations for BYO-key + vision mode. |
| README.md | Documents text-only vs vision mode, capability/budgets, outbound endpoints, and consent revocation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| catch (e) { return { ok: false, reason: 'Not a valid URL.' }; } | ||
|
|
||
| if (u.search || u.hash) return { ok: false, reason: 'Base URL must not contain query string or fragment.' }; | ||
| if (/(?:^|\/)\.\.(?:\/|$)/.test(urlStr)) return { ok: false, reason: 'Base URL must not contain "..".' }; |
There was a problem hiding this comment.
validateBaseUrl() claims to reject paths containing "..", but the check runs against the raw urlStr and only matches literal ... Percent-encoded traversal segments (e.g. %2e%2e) or other normalized forms can bypass this and allow the base URL to resolve outside the intended path when you later append endpoints. Consider validating against u.pathname after decoding/normalizing path segments (and rejecting any segment that equals ..).
| if (/(?:^|\/)\.\.(?:\/|$)/.test(urlStr)) return { ok: false, reason: 'Base URL must not contain "..".' }; | |
| var pathSegments = u.pathname.split('/'); | |
| for (var i = 0; i < pathSegments.length; i++) { | |
| var decodedSegment; | |
| try { decodedSegment = decodeURIComponent(pathSegments[i]); } | |
| catch (e2) { return { ok: false, reason: 'Not a valid URL.' }; } | |
| if (decodedSegment === '..') return { ok: false, reason: 'Base URL must not contain "..".' }; | |
| } |
| function _normalizeMessages(messages) { | ||
| // Accept either plain string content or already-multimodal arrays. | ||
| return messages.map(function (m) { | ||
| return { role: m.role, content: m.content }; | ||
| }); | ||
| } | ||
|
|
There was a problem hiding this comment.
_normalizeMessages() is defined but never used. Keeping unused helpers increases maintenance surface (and can confuse future refactors around message normalization). Either remove it or wire it into send() so there’s a single canonical message-shape normalization step.
| function _normalizeMessages(messages) { | |
| // Accept either plain string content or already-multimodal arrays. | |
| return messages.map(function (m) { | |
| return { role: m.role, content: m.content }; | |
| }); | |
| } |
| }); | ||
|
|
||
| if (totalBytes + bytes > byteBudget) { | ||
| _log('warn', 'vision byte budget reached; truncating at page ' + i, { totalBytes: totalBytes, budget: byteBudget }); |
There was a problem hiding this comment.
The truncation warning logs page " + i, but i is zero-based in this loop. This will report the wrong page number (e.g. "page 0") when the byte budget is reached. Log i + 1 (and/or the attempted page index) so the message matches the actual page where truncation occurred.
| _log('warn', 'vision byte budget reached; truncating at page ' + i, { totalBytes: totalBytes, budget: byteBudget }); | |
| _log('warn', 'vision byte budget reached; truncating at page ' + (i + 1), { totalBytes: totalBytes, budget: byteBudget }); |
| var imgs = []; | ||
| var flat = messages.map(function (m) { | ||
| if (typeof m.content === 'string') return m; | ||
| var text = ''; |
There was a problem hiding this comment.
In _ollamaImagesFromMessages, imgs is declared outside the per-message mapping and then appended to for every message. That means images can accumulate across multiple chat turns and get attached to later messages that didn’t include image parts (and potentially to assistant messages), which can produce an invalid/overshared payload for Ollama multimodal requests. Build an images array per message (or attach images only to the message that actually contains them) instead of reusing a shared array.
| var imgs = []; | |
| var flat = messages.map(function (m) { | |
| if (typeof m.content === 'string') return m; | |
| var text = ''; | |
| var flat = messages.map(function (m) { | |
| if (typeof m.content === 'string') return m; | |
| var text = ''; | |
| var imgs = []; |
Summary
Augments the deterministic regex/rule extractor with an optional, opt-in AI-vision pass that rasterizes PDF pages through the user's BYO-key LLM, and audits the BYO-key surface (storage, transport, redaction, validation, consent). The regex/rule path remains the source of truth at every stage — vision is strictly additive and degrades gracefully on failure.
Vision-augmented PDF extraction (additive, gated)
Extractors.rasterizePdfPages()— OffscreenCanvas/Canvas, ~1600 px long-edge cap, JPEG q 0.85, page-count + ~18 MB byte budgets, abort-aware. Logs only counts/dimensions/sha256 — never image bytes.LLM.extractFactsFromImages()— strict-JSON system prompt, tolerant parser (code fences, trailing prose, trailing commas), schema-clamped fact map withconfidence+evidenceper field.Extractors.mergeVisionFacts()— agree (5% tolerance) →regex+vision; disagree → keep regex + recordconflict; miss+hit →vision; miss+miss → unchanged. Provenance carriesrunId,model,page,evidence.extractstep runs vision after regex when gated byLLM.isConfigured() && isVisionCapable() && consent && useVision && PDF. Per-page progress events,AbortSignal-aware. On failure: warn, setvisionStatus: 'failed', continue with regex-only. Each conflict becomes a REVIEW finding.BYO-key hardening
PE.LLMrewrite: schema-versioned config;localStorageorsessionStorage(Session-only);VISION_MODELScapability table →visionCapability()returnsvision/text/unknown.referrerPolicy: 'no-referrer',cache: 'no-store',credentials: 'omit'. Per-request timeouts (60 s text / 120 s vision). Bounded retry+jitter on 429/5xx; no retry on 401/403. Errors normalized to{ status, code, message, retryable }.Authorizationto Anthropic/Azure/Ollama, nox-api-keyto OpenAI, noapi-keyoutside Azure (regression-tested).https://required for cloud providers; loopback-only for Ollama (opt-in for remote); query strings, fragments, and..segments rejected.send()with per-provider image adapters (OpenAI/Azure:image_url; Anthropic:base64blocks; Ollama: separateimages: [base64,…]).ping()powers the Test connection button.Logger redactor (defense in depth)
PE.Logstrips any field whose name matchesapiKey/api-key/Authorization/x-api-key/Bearer/token/secret/password/cookie/session, plus inline****** /Basic …strings — applied to both buffer and downloaded.log`.UI
Docs
README.md— text-only vs vision modes, capability table, caps/budgets, sent-vs-not-sent table, revoke path.SECURITY.md— threat model with per-row mitigations (XSS/localStorage, log leakage, header leakage, path traversal, hung tabs, image-byte leakage).TEST_PLANS.md— vision walkthrough on a public sample PDF including failure-path and revoke verification.Type of Change
Rule Pack Changes (if applicable)
N/A — no rule pack changes.
Testing
python3 -c "import json; json.load(open('assets/data/rules/your-pack.json'))")Automated coverage extended from 27 → 79 tests, all passing under
node --test. New suites:tests/log.redaction.test.js— buffer never contains common credential keys/values.tests/llm-bridge.test.js— capability detection, URL validation, error normalization, vision-JSON parsing, schema clamping, header isolation,send()capability gating, key-shape warnings, consent persistence.tests/extractors.vision.test.js— merge agree/disagree/miss/miss, provenance, confidence threshold, malformed-vision safety.tests/pipeline.vision.test.js— gating (no key / no consent / text-only / toggle off), successful merge, conflict→REVIEW, graceful degradation on vision failure.Manual browser verification per the new walkthrough in
TEST_PLANS.mdis recommended before merge.Checklist
Screenshots (if applicable)
N/A — UI text changes only; existing dark/glass styling preserved. The new AI Settings panel adds vision toggle, capability badge, Test-connection button, Session-only toggle, and outbound-endpoints preview within the existing modal frame.