Fix/hydration mismatch and discovery latency#95
Fix/hydration mismatch and discovery latency#95404kidwiz wants to merge 2 commits intonexu-io:mainfrom
Conversation
β¦ warning Two issues resolved: 1. Discovery form latency β the mandatory Turn-1 question-form added 1-2 full LLM round trips before any artifact could be produced, inflating generation time to 30-45 min even for rich briefs. A complete brief (>= 200 substantive chars) now skips directly to RULE 3 (TodoWrite), eliminating the unnecessary stop-and-wait turn. 2. React hydration mismatch β Next.js 16 DevTools overlay injects a data-redeviation-bs-uid attribute on <html> during SSR that the client doesn't reproduce, causing a console error in dev mode. Adding suppressHydrationWarning to the html element silences this; it has no effect on production builds. Constraint: next-env.d.ts staged in error; reverted separately. Tested: pnpm typecheck passes.
β¦t quality ## What Five targeted improvements addressing latency, robustness, and signal quality across the prompt composition pipeline and daemon event handling. ### 1. Prompt budget reduction (discovery β execution phase) The full composed prompt (~3000+ tokens for discovery layer + base designer + design system + skill) is now trimmed after the agent's first TodoWrite indicating work has started. In execution mode, DISCOVERY_AND_PHILOSOPHY is dropped β the agent still has skill workflow, design tokens, and base charter, but no longer re-processes the discovery form rules every turn. - phase: 'discovery' | 'execution' on ComposeInput (system.ts) - didAgentStartWorkRef flips phase once on first TodoWrite (ProjectView.tsx) ### 2. Template truncation signal When a template file exceeds 12 kB and is truncated, aβ οΈ warning now appears before the code fence (not inside it) so the agent sees it as markdown text and can contextualize the partial reference. - Warning format: "β οΈ Template source "X" truncated β N chars omitted. Key structures copied; adapt freely." (system.ts renderMetadataBlock) ### 3. artifactKind normalization The dual-path `skillMode === 'deck' OR metadata?.kind === 'deck'` for the deck framework is now a single field: artifactKind derived once at the compose call site as `skillMode ?? metadata?.kind ?? 'prototype'`. Same semantics, cleaner code. - artifactKind field on ComposeInput (system.ts) - Derivation at compose call site (ProjectView.tsx) - Conditional changed to `artifactKind === 'deck' && !hasSkillSeed` ### 4. Direction library CI smoke-test scripts/test-direction-library.ts validates that renderDirectionFormBody() produces parseable JSON with the expected schema (description, questions, options for radio/checkbox/select), and that renderDirectionSpecBlock() returns a non-empty string. Runs via `pnpm test:direction-lib`. - scripts/test-direction-library.ts (new) - "test:direction-lib" script + tsx dependency added (package.json) ### 5. P0 checklist enforcement gate The daemon's /api/chat handler now buffers text_delta chunks per turn and inspects the first artifact emission (messageCount <= 1) for a P0 completion marker. If </artifact> appears without a P0 confirmation, the handler intercepts the artifact and injects a self-check prompt instead β forcing the agent to confirm checklist completion before presenting. - P0_PATTERNS + hasP0Completion() (daemon/server.js) - sendAgent wrapper with artifact gating (daemon/server.js) - Only gates first artifact; iterative tweaks pass through unconditionally. ## Testing - pnpm typecheck: clean - pnpm test:direction-lib: both smoke-tests pass Constraint: pnpm-lock.yaml updated with tsx dependency. Tested: typecheck, direction-lib smoke-test. Not-tested: runtime daemon behavior (requires live agent session).
lefarcen
left a comment
There was a problem hiding this comment.
Hey @404kidwiz! Thanks for tackling the hydration warning and discovery latency. The execution-phase optimization is clever β skipping discovery on subsequent turns will save tokens. However, I found several P1 correctness and architecture issues that need fixes before merge.
Most critical: daemon/server.js removed the upload multer instance entirely (lines 91-100) but kept importUpload and projectUpload. This breaks any endpoint that references upload (search for .single('file') / .array('files') calls in server.js that aren't importUpload or projectUpload β those will throw ReferenceError at runtime).
Detailed findings inline. The core idea (discovery gate + phase flag) is solid β these are fixable bugs, not design problems.
| /P0\s*[-β]\s*all\s+items\s+passed/i, | ||
| /checklist:\s*P0\s+pass/i, | ||
| ]; | ||
| function hasP0Completion(content) { |
There was a problem hiding this comment.
P1: upload multer instance removed but importUpload and projectUpload remain. Any server.js endpoint that calls upload.single() / upload.array() will throw ReferenceError. Grep the file for upload.single\|upload.array to find affected routes. Either restore the upload instance or rename all call sites to use one of the remaining instances (if that was the refactor intent). Without this, file-upload routes break on deploy.
| }; | ||
|
|
||
| // P0 artifact gate: buffers text_delta on each turn and applies the gate | ||
| // before releasing it on 'usage' (end-of-turn). Only gates the FIRST |
There was a problem hiding this comment.
P2: sendAgent intercepts text_delta and buffers it until usage fires. If the agent never sends usage (truncated response / network timeout / CLI crash mid-stream), bufferedText is silently lost β the browser never sees it. Add a flush fallback in the stream close handler or timeout so partial output still renders.
| } | ||
| if (ev.type === 'usage') { | ||
| messageCount.value++; | ||
| if (pendingArtifact && bufferedText.includes('</artifact>') && !hasP0Completion(bufferedText)) { |
There was a problem hiding this comment.
P3: The P0 completion regex checks bufferedText.includes('</artifact>') but doesn't verify the artifact tag is well-formed (could match <pre></artifact></pre> in a code snippet). Consider tightening to /<artifact[^>]*>.*<\/artifact>/s or check that </artifact> appears after a valid opening tag before blocking.
| artifactKind, | ||
| designSystemBody, | ||
| designSystemTitle, | ||
| metadata, |
There was a problem hiding this comment.
P1: phase defaults to undefined, so isExecution is false. On first turn didAgentStartWorkRef.current is still false β phase=undefined β discovery runs. This is correct. But after the first TodoWrite, phase='execution' β parts array starts empty (line 80) β the entire BASE_SYSTEM_PROMPT is skipped. This means the official designer identity + workflow charter disappears mid-session. Either (a) always include BASE_SYSTEM_PROMPT (move it outside the conditional), or (b) document that 'execution' mode is minimal/tool-only (though that breaks continuity β the agent loses its identity). Option (a) is safer.
| !!skillBody && /assets\/template\.html/.test(skillBody); | ||
| if (isDeckProject && !hasSkillSeed) { | ||
| if (artifactKind === 'deck' && !hasSkillSeed) { | ||
| parts.push(`\n\n---\n\n${DECK_FRAMEWORK_DIRECTIVE}`); |
There was a problem hiding this comment.
P2: artifactKind fallback to metadata?.kind makes sense for no-skill projects. But isDeckProject (old code) was skillMode === 'deck' || metadata?.kind === 'deck' and checked both. New code only checks artifactKind === 'deck', and artifactKind is (skillMode ?? metadata?.kind) as ... β if skillMode is 'prototype' and metadata.kind is 'deck', artifactKind becomes 'prototype' (the ?? short-circuits) and the deck framework is NOT injected. Flip the priority: (metadata?.kind ?? skillMode) to respect user-level metadata over skill default, OR use the old || check on both separately.
| } | ||
| } | ||
| } | ||
| const skillMode = skills.find((s) => s.id === project.skillId)?.mode as 'prototype' | 'deck' | 'template' | 'design-system' | undefined; |
There was a problem hiding this comment.
P3: skillMode and artifactKind are recomputed on every composedSystemPrompt call (inside the useCallback) by re-searching the skills array. Minor: you could hoist these outside the function body or memoize the skill summary lookup. Not a bug, just inefficient (O(skills.length) on every turn).
| let parsed: unknown; | ||
| try { | ||
| parsed = JSON.parse(body); | ||
| } catch { |
There was a problem hiding this comment.
P3: The smoke-test catches malformed JSON / missing fields, but doesn't validate that options arrays are non-empty (a radio/checkbox question with options: [] would pass). Add if (question.options.length === 0) throw ... after the Array.isArray check.
| - The user is replying *inside an active design* with a tweak ("make the headline bigger", "swap slide 3 image", "add a feature row"). | ||
| - The user explicitly says "skip questions" / "just build" / "no questions, go". | ||
| - The user's message starts with \`[form answers β β¦]\` (you already have the answers). | ||
| - The user's opening message is a **complete brief** β substantive description with real context, goals, or copy. A message β₯ 200 characters that reads as a real brief (not a one-liner like "make it purple") is treated as the full discovery payload; jump to the brand/direction branch immediately without stopping for the form. |
There was a problem hiding this comment.
P3: The new "complete brief" skip rule (β₯ 200 chars β skip form) is a heuristic that might misfire on verbose but vague requests ("I need a landing page for my startup that does X Y Z and we want it to be modern and professional and have a hero section and three feature blocksβ¦" hits 200 chars but lacks brand/direction). Consider bumping the threshold to 300+ or adding a second check (e.g., message contains 3+ sentences AND β₯ 200 chars) to reduce false positives.
There was a problem hiding this comment.
π‘ Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eb60de536a
βΉοΈ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with π.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return P0_PATTERNS.some((re) => re.test(content)); | ||
| } | ||
|
|
||
| const importUpload = multer({ |
There was a problem hiding this comment.
Reintroduce shared upload middleware
This block removed the upload multer instance, but /api/upload and /api/projects/:id/import still invoke upload.array(...) / upload.single(...) later in the same file, so route registration now hits ReferenceError: upload is not defined and upload flows break. Please keep a shared upload middleware binding (or update those routes to a defined middleware) so these endpoints remain callable.
Useful? React with πΒ / π.
| const parts: string[] = isExecution | ||
| ? [] | ||
| : [ |
There was a problem hiding this comment.
Keep base system prompt in execution phase
Setting parts to [] for execution mode drops BASE_SYSTEM_PROMPT entirely after discovery, which removes the core system-level behavior constraints on subsequent turns instead of only skipping discovery instructions. This can cause mid-session behavior drift (artifact/process rules disappear once phase flips). Execution mode should omit discovery-only text while still retaining the base system prompt.
Useful? React with πΒ / π.
| const idx = bufferedText.lastIndexOf('</artifact>'); | ||
| const prefix = bufferedText.slice(0, idx).trim(); | ||
| rawSend('agent', { type: 'text_delta', delta: prefix }); | ||
| rawSend('agent', { |
There was a problem hiding this comment.
Preserve closing when gating P0 completion
When the P0 gate triggers, this sends bufferedText.slice(0, idx) and drops the </artifact> close tag. The frontend artifact parser stays in "inside artifact" mode until final flush, so the follow-up reminder text is consumed as artifact HTML and can be persisted into the generated file instead of shown as plain assistant text. Send either a complete artifact block or no artifact block to avoid corrupting the stream semantics.
Useful? React with πΒ / π.
What these two commits fix1. React hydration mismatch (dev mode only)
2. Discovery form latency β 30-45 min generation timeThe mandatory Turn-1
Both commits are type-checked ( β submitted via Claude Code |
|
Thanks for the detailed explanation! The intent (skip form for complete briefs + execution-phase prompt reduction) is solid. However, my review above found implementation bugs in these commits that need fixes before merge: P1 Critical (daemon/server.js:91-100): You removed the
P1 Architecture (src/prompts/system.ts:68-80): Execution mode skips P2 (src/prompts/system.ts:117): The idea is great β these are just fixable bugs in the execution. Push fixes and I'll re-review! π |
|
Hi @404kidwiz! Just checking in on this PR β it's been about 90h since the last update, and there are now merge conflicts with main. No rush if you're still working on it. If you'd like a hand, just reply here β happy to walk through the rebase strategy or clarify any of the review points. If you've moved on, no worries β feel free to close the PR. Thanks for the contribution! |
No description provided.