Epic: Production Hardening: Wiii Runtime Reliability #170

@meiiie

Status

Active planning epic for the work that follows the NVIDIA provider foundation (PR #169).

Context

Wiii needs to move from demo-capable to production-reliable quickly. PR #169 establishes the NVIDIA provider foundation and local smoke gate, but reliability still needs explicit runtime health, fallback, streaming finalization, pipeline simplification, and memory/identity contracts.

The goal is not just to make requests succeed. The goal is that Wiii feels continuous, responsive, and alive: short chat should stay fast, long/tool/RAG work should stream honestly, memory should be visible in behavior, and provider failures should degrade gracefully instead of freezing the frontend.

Principles

  • Keep PRs narrow and mergeable. Do not combine provider runtime, architecture docs, and memory refactors in one PR.
  • Do not delete LangGraph/history/compat code in a sweep. Mark, isolate, and remove by phase with rollback notes.
  • Prefer health-based runtime decisions over static provider assumptions.
  • Preserve Vietnamese-first user-facing behavior.
  • Every high-risk change needs tests, rollback notes, and PR reviewer focus.

PR 1: Provider Runtime Reliability

Scope:

  • Model-level health probe: if NVIDIA Flash times out, mark it degraded and stop selecting it until recovery conditions pass.
  • Per-provider/per-model timeout profiles for chat, streaming, structured/router, and probe paths.
  • Same-provider fallback: Flash -> Pro or Pro -> Flash according to current health and configured policy.
  • SSE timeout/finalization guard so the frontend never remains stuck in “đang suy nghĩ” ("thinking") after a backend or provider failure.
  • Ensure the structured/router path cannot slow ordinary short chat.
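
The health, timeout-profile, and fallback items above could be sketched roughly as follows. This is a minimal illustration, not the Wiii implementation: the model ids (`flash`, `pro`), the `FALLBACK_ORDER` and `TIMEOUT_PROFILES` tables, every numeric budget, and the cooldown-based recovery rule are all assumptions chosen for the example. A real recovery condition might require a successful probe rather than elapsed time.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical policy table: preferred model -> ordered same-provider candidates.
FALLBACK_ORDER: Dict[str, List[str]] = {
    "flash": ["flash", "pro"],
    "pro": ["pro", "flash"],
}

# Hypothetical timeout budgets in seconds, keyed by (model, path).
TIMEOUT_PROFILES: Dict[Tuple[str, str], float] = {
    ("flash", "chat"): 8.0,
    ("flash", "stream"): 20.0,
    ("flash", "router"): 3.0,
    ("flash", "probe"): 2.0,
    ("pro", "chat"): 15.0,
    ("pro", "stream"): 45.0,
}
DEFAULT_TIMEOUT_S = 10.0


def timeout_for(model: str, path: str) -> float:
    """Look up the budget for a model/path pair, falling back to a default."""
    return TIMEOUT_PROFILES.get((model, path), DEFAULT_TIMEOUT_S)


@dataclass
class ModelHealth:
    """Tracks which models are degraded and when they may be retried."""
    cooldown_s: float = 30.0
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    degraded_until: Dict[str, float] = field(default_factory=dict)

    def mark_timeout(self, model: str) -> None:
        # A timeout degrades the model for one cooldown window.
        self.degraded_until[model] = self.clock() + self.cooldown_s

    def mark_success(self, model: str) -> None:
        # A successful call or probe clears the degraded state immediately.
        self.degraded_until.pop(model, None)

    def is_healthy(self, model: str) -> bool:
        until = self.degraded_until.get(model)
        return until is None or self.clock() >= until


def select_model(preferred: str, health: ModelHealth) -> Optional[str]:
    """Return the first healthy candidate in fallback order, or None."""
    for candidate in FALLBACK_ORDER[preferred]:
        if health.is_healthy(candidate):
            return candidate
    return None
```

Injecting the clock keeps the degraded and recovery transitions deterministic in unit tests, which is what the acceptance list for this PR asks for.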

Acceptance:

  • Unit tests cover model degraded state, timeout profile selection, same-provider fallback order, and recovery behavior.
  • Streaming tests cover timeout/error finalization and done/terminal event behavior.
  • The local NVIDIA smoke gate records which model was selected and why.
  • No secrets or .env* changes are committed.
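
One way to make the streaming finalization behavior testable is a wrapper that always emits a terminal event, whether the provider finishes, stalls, or raises. The sketch below assumes an async iterator of text chunks and invents the event names (`message`, `error`, `done`) and the `guarded_stream` helper. The actual Wiii SSE contract may differ.

```python
import asyncio
import json
from typing import AsyncIterator


def sse(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


async def guarded_stream(
    chunks: AsyncIterator[str], idle_timeout_s: float
) -> AsyncIterator[str]:
    """Relay chunks, convert stalls and errors into an error event,
    and always finish with a terminal 'done' event."""
    iterator = chunks.__aiter__()
    try:
        while True:
            try:
                # Bound the wait for each chunk, not the whole response.
                chunk = await asyncio.wait_for(iterator.__anext__(), idle_timeout_s)
            except StopAsyncIteration:
                break
            yield sse("message", {"text": chunk})
    except asyncio.TimeoutError:
        yield sse("error", {"reason": "provider_timeout"})
    except Exception as exc:  # provider or backend failure
        yield sse("error", {"reason": type(exc).__name__})
    # Reached on success, timeout, and error alike, so the client can
    # always leave its "thinking" state.
    yield sse("done", {})
```

The terminal event is yielded after the `try` block rather than inside a `finally`, because yielding from `finally` in an async generator fails if the consumer closes the stream early.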

PR 2: Pipeline Simplification Plan

Scope:

  • Document current request lifecycle: request -> auth/org -> memory -> router -> agent -> tool/RAG -> stream.
  • Mark active runtime paths vs historical compatibility paths.
  • Identify remaining LangGraph/history/compat references and classify them as active, compatibility, test-only, doc-only, or deletion candidate.
  • Propose phased LangGraph removal without breaking runtime or rollback.
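
The classification step could start from a simple path-based heuristic like the one below. This is only a sketch: the directory names (`docs`, `tests`, `compat`, `legacy`), the file suffixes, and the `imported_by_runtime` flag are assumptions, and each result would still need human review before a reference is assigned to a removal phase.

```python
from pathlib import PurePosixPath

DOC_SUFFIXES = {".md", ".rst", ".txt"}


def classify_reference(path: str, imported_by_runtime: bool) -> str:
    """Assign one of the five categories to a file that mentions LangGraph.

    `imported_by_runtime` is assumed to come from a separate import-graph
    check; this function does not compute it.
    """
    p = PurePosixPath(path)
    if p.suffix in DOC_SUFFIXES or "docs" in p.parts:
        return "doc-only"
    if "tests" in p.parts or p.name.startswith("test_"):
        return "test-only"
    if "compat" in p.parts or "legacy" in p.parts:
        return "compatibility"
    return "active" if imported_by_runtime else "deletion candidate"
```

For example, `classify_reference("app/compat/graph_adapter.py", True)` returns `"compatibility"`; the path is made up for illustration.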

Acceptance:

  • Architecture doc includes lifecycle diagram, ownership, risk, and rollback plan.
  • Cleanup list links each LangGraph/history reference to a proposed phase.
  • No large runtime deletion in this planning PR unless separately proven safe.

PR 3: Memory/Wiii Identity Reliability

Scope:

  • Define the memory contract that prevents failures like “Wiii không nhớ mình” ("Wiii doesn't remember me").
  • Separate memory namespaces: persona, human, relationship, goals, craft, and world.
  • Make living memory behavior visible in chat responses without dumping raw memory.
  • Add tests for memory retrieval/injection, relationship continuity, and fallback behavior when memory is unavailable.
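
The namespace separation and the unavailable-memory fallback might be expressed along these lines. The six namespace names come from the scope list above; the `fetch` callable, the `build_memory_context` helper, and the shape of the returned dict are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict, Iterable, List


class Namespace(str, Enum):
    PERSONA = "persona"
    HUMAN = "human"
    RELATIONSHIP = "relationship"
    GOALS = "goals"
    CRAFT = "craft"
    WORLD = "world"


def build_memory_context(
    fetch: Callable[[Namespace], List[str]],
    namespaces: Iterable[Namespace] = tuple(Namespace),
) -> Dict[str, object]:
    """Collect facts per namespace and record which namespaces failed.

    A failing namespace does not abort the request; it is listed under
    'unavailable' so the response layer can express uncertainty instead
    of pretending nothing was ever remembered.
    """
    facts: Dict[str, List[str]] = {}
    unavailable: List[str] = []
    for ns in namespaces:
        try:
            facts[ns.value] = fetch(ns)
        except Exception:
            unavailable.append(ns.value)
    return {"facts": facts, "unavailable": unavailable}
```

Keeping the failure list separate from the facts lets tests assert on degraded behavior directly, for instance that a relationship-store outage still leaves persona facts injected.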

Acceptance:

  • Memory contract doc and tests prove Wiii can recall stable identity/relationship facts safely.
  • Wiii remains clearly AI, not human-impersonating, while feeling continuous and companion-like.
  • Memory writes are selective and interpretable, not raw-turn dumps.
  • UI-visible behavior explains memory uncertainty gracefully instead of claiming total amnesia.

Risks

  • Provider reliability touches high-risk runtime routing and streaming contracts.
  • Memory reliability touches privacy, persistence, and user trust.
  • LangGraph cleanup can remove hidden compatibility paths if done too aggressively.

Initial Verification Targets

Backend:

cd maritime-ai-service
set PYTHONIOENCODING=utf-8 && pytest tests/unit/ -p no:capture --tb=short -q
ruff check app/ --select=E9,F63,F7

Desktop/streaming when frontend paths change:

cd wiii-desktop
npx vitest run
npx tsc --noEmit
npm run build:embed

Repository hygiene:

git diff --check
git status --short

Related Work
