Skip to content

Fix/hydration mismatch and discovery latency#95

Open
404kidwiz wants to merge 2 commits intonexu-io:mainfrom
404kidwiz:fix/hydration-mismatch-and-discovery-latency
Open

Fix/hydration mismatch and discovery latency#95
404kidwiz wants to merge 2 commits intonexu-io:mainfrom
404kidwiz:fix/hydration-mismatch-and-discovery-latency

Conversation

@404kidwiz
Copy link
Copy Markdown

No description provided.

… warning

Two issues resolved:

1. Discovery form latency β€” the mandatory Turn-1 question-form added 1-2
   full LLM round trips before any artifact could be produced, inflating
   generation time to 30-45 min even for rich briefs. A complete brief
   (>= 200 substantive chars) now skips directly to RULE 3 (TodoWrite),
   eliminating the unnecessary stop-and-wait turn.

2. React hydration mismatch β€” Next.js 16 DevTools overlay injects a
   data-redeviation-bs-uid attribute on <html> during SSR that the client
   doesn't reproduce, causing a console error in dev mode. Adding
   suppressHydrationWarning to the html element silences this; it has no
   effect on production builds.

Constraint: next-env.d.ts staged in error; reverted separately.
Tested: pnpm typecheck passes.
…t quality

## What

Five targeted improvements addressing latency, robustness, and signal quality
across the prompt composition pipeline and daemon event handling.

### 1. Prompt budget reduction (discovery β†’ execution phase)

The full composed prompt (~3000+ tokens for discovery layer + base designer +
design system + skill) is now trimmed after the agent's first TodoWrite
indicating work has started. In execution mode, DISCOVERY_AND_PHILOSOPHY is
dropped β€” the agent still has skill workflow, design tokens, and base
charter, but no longer re-processes the discovery form rules every turn.

- phase: 'discovery' | 'execution' on ComposeInput (system.ts)
- didAgentStartWorkRef flips phase once on first TodoWrite (ProjectView.tsx)

### 2. Template truncation signal

When a template file exceeds 12 kB and is truncated, a ⚠️ warning now
appears before the code fence (not inside it) so the agent sees it as
markdown text and can contextualize the partial reference.

- Warning format: "⚠️ Template source "X" truncated β€” N chars omitted.
  Key structures copied; adapt freely." (system.ts renderMetadataBlock)

### 3. artifactKind normalization

The dual-path `skillMode === 'deck' OR metadata?.kind === 'deck'` for the
deck framework is now a single field: artifactKind derived once at the
compose call site as `skillMode ?? metadata?.kind ?? 'prototype'`. Same
semantics, cleaner code.

- artifactKind field on ComposeInput (system.ts)
- Derivation at compose call site (ProjectView.tsx)
- Conditional changed to `artifactKind === 'deck' && !hasSkillSeed`

### 4. Direction library CI smoke-test

scripts/test-direction-library.ts validates that renderDirectionFormBody()
produces parseable JSON with the expected schema (description, questions,
options for radio/checkbox/select), and that renderDirectionSpecBlock()
returns a non-empty string. Runs via `pnpm test:direction-lib`.

- scripts/test-direction-library.ts (new)
- "test:direction-lib" script + tsx dependency added (package.json)

### 5. P0 checklist enforcement gate

The daemon's /api/chat handler now buffers text_delta chunks per turn and
inspects the first artifact emission (messageCount <= 1) for a P0
completion marker. If </artifact> appears without a P0 confirmation, the
handler intercepts the artifact and injects a self-check prompt instead β€”
forcing the agent to confirm checklist completion before presenting.

- P0_PATTERNS + hasP0Completion() (daemon/server.js)
- sendAgent wrapper with artifact gating (daemon/server.js)
- Only gates first artifact; iterative tweaks pass through unconditionally.

## Testing

- pnpm typecheck: clean
- pnpm test:direction-lib: both smoke-tests pass

Constraint: pnpm-lock.yaml updated with tsx dependency.
Tested: typecheck, direction-lib smoke-test.
Not-tested: runtime daemon behavior (requires live agent session).
@lefarcen lefarcen self-requested a review April 29, 2026 19:43
@lefarcen lefarcen added the bug-fix Fixes an existing bug label Apr 29, 2026
Copy link
Copy Markdown
Contributor

@lefarcen lefarcen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @404kidwiz! Thanks for tackling the hydration warning and discovery latency. The execution-phase optimization is clever β€” skipping discovery on subsequent turns will save tokens. However, I found several P1 correctness and architecture issues that need fixes before merge.

Most critical: daemon/server.js removed the upload multer instance entirely (lines 91-100) but kept importUpload and projectUpload. This breaks any endpoint that references upload (search for .single('file') / .array('files') calls in server.js that aren't importUpload or projectUpload β€” those will throw ReferenceError at runtime).

Detailed findings inline. The core idea (discovery gate + phase flag) is solid β€” these are fixable bugs, not design problems.

Comment thread daemon/server.js
/P0\s*[-β€”]\s*all\s+items\s+passed/i,
/checklist:\s*P0\s+pass/i,
];
function hasP0Completion(content) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: upload multer instance removed but importUpload and projectUpload remain. Any server.js endpoint that calls upload.single() / upload.array() will throw ReferenceError. Grep the file for upload.single\|upload.array to find affected routes. Either restore the upload instance or rename all call sites to use one of the remaining instances (if that was the refactor intent). Without this, file-upload routes break on deploy.

Comment thread daemon/server.js
};

// P0 artifact gate: buffers text_delta on each turn and applies the gate
// before releasing it on 'usage' (end-of-turn). Only gates the FIRST
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: sendAgent intercepts text_delta and buffers it until usage fires. If the agent never sends usage (truncated response / network timeout / CLI crash mid-stream), bufferedText is silently lost β€” the browser never sees it. Add a flush fallback in the stream close handler or timeout so partial output still renders.

Comment thread daemon/server.js
}
if (ev.type === 'usage') {
messageCount.value++;
if (pendingArtifact && bufferedText.includes('</artifact>') && !hasP0Completion(bufferedText)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: The P0 completion regex checks bufferedText.includes('</artifact>') but doesn't verify the artifact tag is well-formed (could match <pre></artifact></pre> in a code snippet). Consider tightening to /<artifact[^>]*>.*<\/artifact>/s or check that </artifact> appears after a valid opening tag before blocking.

Comment thread src/prompts/system.ts
artifactKind,
designSystemBody,
designSystemTitle,
metadata,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: phase defaults to undefined, so isExecution is false. On first turn didAgentStartWorkRef.current is still false β†’ phase=undefined β†’ discovery runs. This is correct. But after the first TodoWrite, phase='execution' β†’ parts array starts empty (line 80) β†’ the entire BASE_SYSTEM_PROMPT is skipped. This means the official designer identity + workflow charter disappears mid-session. Either (a) always include BASE_SYSTEM_PROMPT (move it outside the conditional), or (b) document that 'execution' mode is minimal/tool-only (though that breaks continuity β€” the agent loses its identity). Option (a) is safer.

Comment thread src/prompts/system.ts
!!skillBody && /assets\/template\.html/.test(skillBody);
if (isDeckProject && !hasSkillSeed) {
if (artifactKind === 'deck' && !hasSkillSeed) {
parts.push(`\n\n---\n\n${DECK_FRAMEWORK_DIRECTIVE}`);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: artifactKind fallback to metadata?.kind makes sense for no-skill projects. But isDeckProject (old code) was skillMode === 'deck' || metadata?.kind === 'deck' and checked both. New code only checks artifactKind === 'deck', and artifactKind is (skillMode ?? metadata?.kind) as ... β€” if skillMode is 'prototype' and metadata.kind is 'deck', artifactKind becomes 'prototype' (the ?? short-circuits) and the deck framework is NOT injected. Flip the priority: (metadata?.kind ?? skillMode) to respect user-level metadata over skill default, OR use the old || check on both separately.

}
}
}
const skillMode = skills.find((s) => s.id === project.skillId)?.mode as 'prototype' | 'deck' | 'template' | 'design-system' | undefined;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: skillMode and artifactKind are recomputed on every composedSystemPrompt call (inside the useCallback) by re-searching the skills array. Minor: you could hoist these outside the function body or memoize the skill summary lookup. Not a bug, just inefficient (O(skills.length) on every turn).

let parsed: unknown;
try {
parsed = JSON.parse(body);
} catch {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: The smoke-test catches malformed JSON / missing fields, but doesn't validate that options arrays are non-empty (a radio/checkbox question with options: [] would pass). Add if (question.options.length === 0) throw ... after the Array.isArray check.

Comment thread src/prompts/discovery.ts
- The user is replying *inside an active design* with a tweak ("make the headline bigger", "swap slide 3 image", "add a feature row").
- The user explicitly says "skip questions" / "just build" / "no questions, go".
- The user's message starts with \`[form answers β€” …]\` (you already have the answers).
- The user's opening message is a **complete brief** β€” substantive description with real context, goals, or copy. A message β‰₯ 200 characters that reads as a real brief (not a one-liner like "make it purple") is treated as the full discovery payload; jump to the brand/direction branch immediately without stopping for the form.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: The new "complete brief" skip rule (β‰₯ 200 chars β†’ skip form) is a heuristic that might misfire on verbose but vague requests ("I need a landing page for my startup that does X Y Z and we want it to be modern and professional and have a hero section and three feature blocks…" hits 200 chars but lacks brand/direction). Consider bumping the threshold to 300+ or adding a second check (e.g., message contains 3+ sentences AND β‰₯ 200 chars) to reduce false positives.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

πŸ’‘ Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb60de536a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with πŸ‘.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread daemon/server.js
return P0_PATTERNS.some((re) => re.test(content));
}

const importUpload = multer({
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reintroduce shared upload middleware

This block removed the upload multer instance, but /api/upload and /api/projects/:id/import still invoke upload.array(...) / upload.single(...) later in the same file, so route registration now hits ReferenceError: upload is not defined and upload flows break. Please keep a shared upload middleware binding (or update those routes to a defined middleware) so these endpoints remain callable.

Useful? React with πŸ‘Β / πŸ‘Ž.

Comment thread src/prompts/system.ts
Comment on lines +80 to +82
const parts: string[] = isExecution
? []
: [
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep base system prompt in execution phase

Setting parts to [] for execution mode drops BASE_SYSTEM_PROMPT entirely after discovery, which removes the core system-level behavior constraints on subsequent turns instead of only skipping discovery instructions. This can cause mid-session behavior drift (artifact/process rules disappear once phase flips). Execution mode should omit discovery-only text while still retaining the base system prompt.

Useful? React with πŸ‘Β / πŸ‘Ž.

Comment thread daemon/server.js
Comment on lines +1069 to +1072
const idx = bufferedText.lastIndexOf('</artifact>');
const prefix = bufferedText.slice(0, idx).trim();
rawSend('agent', { type: 'text_delta', delta: prefix });
rawSend('agent', {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve closing when gating P0 completion

When the P0 gate triggers, this sends bufferedText.slice(0, idx) and drops the </artifact> close tag. The frontend artifact parser stays in "inside artifact" mode until final flush, so the follow-up reminder text is consumed as artifact HTML and can be persisted into the generated file instead of shown as plain assistant text. Send either a complete artifact block or no artifact block to avoid corrupting the stream semantics.

Useful? React with πŸ‘Β / πŸ‘Ž.

@404kidwiz
Copy link
Copy Markdown
Author

What these two commits fix

1. React hydration mismatch (dev mode only)

data-redeviation-bs-uid is injected into <html> by Next.js 16's DevTools overlay during SSR but not reproduced on the client β€” causing a console error in dev mode. Added suppressHydrationWarning to the <html> element. Has no effect on production builds.

2. Discovery form latency β€” 30-45 min generation time

The mandatory Turn-1 <question-form> added 1-2 full LLM round trips before any artifact could be produced. Two sub-fixes:

  • Brief skip: Complete briefs (β‰₯200 substantive chars) now jump directly to RULE 3 (TodoWrite) β€” no stop-and-wait form turn. Cuts generation from 3-4 turns to 2-3.
  • Prompt budget reduction: After the agent's first TodoWrite (work has started), the composed prompt drops DISCOVERY_AND_PHILOSOPHY and the full metadata block β€” only skill workflow + design tokens + base charter remain. Saves ~60% on per-turn token cost.

Both commits are type-checked (pnpm typecheck) and the new direction-library smoke-test passes (pnpm test:direction-lib).

β€” submitted via Claude Code

@lefarcen
Copy link
Copy Markdown
Contributor

Thanks for the detailed explanation! The intent (skip form for complete briefs + execution-phase prompt reduction) is solid.

However, my review above found implementation bugs in these commits that need fixes before merge:

P1 Critical (daemon/server.js:91-100): You removed the upload multer instance entirely but kept importUpload and projectUpload. Any route in server.js that calls upload.single() or upload.array() will throw ReferenceError: upload is not defined at runtime. Either:

  • Restore the upload instance, OR
  • Rename all call sites to use one of the remaining instances

P1 Architecture (src/prompts/system.ts:68-80): Execution mode skips DISCOVERY_AND_PHILOSOPHY (correct) but also skips BASE_SYSTEM_PROMPT (the official designer identity + workflow charter). This means mid-session the agent loses its core identity. Move BASE_SYSTEM_PROMPT outside the conditional so it's always included.

P2 (src/prompts/system.ts:117): artifactKind = (skillMode ?? metadata?.kind) β€” if skillMode is 'prototype' and metadata.kind is 'deck', the deck framework won't be injected (the ?? short-circuits on the first truthy value). Flip to (metadata?.kind ?? skillMode) to respect user metadata over skill default.

The idea is great β€” these are just fixable bugs in the execution. Push fixes and I'll re-review! πŸš€

@lefarcen
Copy link
Copy Markdown
Contributor

lefarcen commented May 2, 2026

Hi @404kidwiz!

Just checking in on this PR β€” it's been about 90h since the last update, and there are now merge conflicts with main.

No rush if you're still working on it. If you'd like a hand, just reply here β€” happy to walk through the rebase strategy or clarify any of the review points. If you've moved on, no worries β€” feel free to close the PR.

Thanks for the contribution!
β€” open-design team

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix Fixes an existing bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants