diff --git a/.copilot-context.md b/.copilot-context.md new file mode 100644 index 0000000..b789218 --- /dev/null +++ b/.copilot-context.md @@ -0,0 +1,98 @@ +# RAGfish / Noesis Noema – Context Index + +This document defines the authoritative design entry points and the formal Question model. +All implementation must comply with these definitions. + +--- + +# 1. Product Constitution +- docs/constitution/ +- ADR-0000 (Human Sovereignty Principle) + +--- + +# 2. Question Schema (Formal Definition) + +A Question is not a raw string. +It is a structured object governed by human sovereignty. + +## 2.1 Logical Model + +A Question MUST contain: + +- id (UUID) +- timestamp (ISO 8601) +- origin ("human" only) +- intent (short semantic label) +- content (primary user text) +- privacy_level ("local" | "cloud" | "auto") +- constraints (optional structured hints) + +## 2.2 JSON Schema (Normative) + +```json +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "title": "NoemaQuestion", + "type": "object", + "required": [ + "id", + "timestamp", + "origin", + "intent", + "content", + "privacy_level" + ], + "properties": { + "id": { + "type": "string", + "format": "uuid" + }, + "timestamp": { + "type": "string", + "format": "date-time" + }, + "origin": { + "type": "string", + "enum": ["human"] + }, + "intent": { + "type": "string", + "minLength": 1 + }, + "content": { + "type": "string", + "minLength": 1 + }, + "privacy_level": { + "type": "string", + "enum": ["local", "cloud", "auto"] + }, + "constraints": { + "type": "object", + "additionalProperties": true + } + }, + "additionalProperties": false +} +``` + +## 2.3 Enforcement Rules + +- Router decisions MUST operate on this structured object. +- Raw string prompts are forbidden at invocation boundary. +- origin MUST never be mutated by AI. +- privacy_level MUST be respected by routing layer. + +--- + +# 3. AI Agent Instruction + +Before implementing any feature: + +1. Read this file. +2. 
Validate compliance with Question Schema. +3. Do not introduce raw string prompt execution. +4. Do not bypass privacy_level routing. + +Violation requires ADR update. \ No newline at end of file diff --git a/design/context-index.md b/design/context-index.md new file mode 100644 index 0000000..330bfc9 --- /dev/null +++ b/design/context-index.md @@ -0,0 +1,66 @@ +# RAGfish / Noesis Noema – Context Index + +This file defines the authoritative entry points for all design decisions. + +## 1. Product Constitution +- docs/constitution/ (Human Sovereignty Principle) +- ADR-0000 (Governing architecture constraints) + +## 2. Architecture Decisions +- docs/adr/ (All binding architectural decisions) + +## 3. Contracts +- contracts/ (Schema, invocation boundaries, routing constraints) + +## 4. RAGpack Definition +- DesignDoc.md (High-level architecture narrative) +- docs/architecture/ (If exists) + +## 5. Implementation Constraints +- No hidden autonomous execution +- No implicit routing escalation +- All model invocation must respect invocation boundary + +## 6. Core Schemas + +### NoemaQuestion + +The structured input object for all Invocations. + +Required fields: +- `id` (UUID) — Question identifier +- `session_id` (UUID) — Associated session +- `content` (string) — User-provided prompt +- `privacy_level` (enum: "local" | "cloud" | "auto") — Privacy constraint +- `timestamp` (ISO-8601) — Submission time + +Optional fields: +- `intent` (enum: "informational" | "analytical" | "retrieval") — Intent classification +- `constraints` (object) — Execution constraints + +Schema must be validated before routing. + +### NoemaResponse + +The structured output object for all successful Invocations. 
+ +Required fields: +- `id` (UUID) — Response identifier +- `question_id` (UUID) — Associated Question +- `session_id` (UUID) — Associated session +- `content` (string) — Generated response +- `model` (string) — Model used +- `route` (enum: "local" | "cloud") — Execution route +- `trace_id` (UUID) — Traceability identifier +- `timestamp` (ISO-8601) — Response generation time +- `fallback_used` (boolean) — Whether fallback occurred + +Optional fields: +- `confidence` (float) — Model confidence (if available) +- `uncertainty_reason` (string) — Explanation if confidence is low + +## 7. Instruction to AI Agents +Before implementing any feature: +1. Read this file. +2. Read ADR-0000. +3. Do not violate Constitution or contracts. \ No newline at end of file diff --git a/design/error-doctrine.md b/design/error-doctrine.md new file mode 100644 index 0000000..624cde1 --- /dev/null +++ b/design/error-doctrine.md @@ -0,0 +1,164 @@ + + +# Error Doctrine + +## Status +Draft + +--- + +# 1. Purpose + +The Error Doctrine defines how failures are classified, surfaced, and handled. + +The system must: +- Fail explicitly +- Fail deterministically +- Never hide uncertainty +- Never silently recover + +AI systems must not conceal errors behind probabilistic output. + +--- + +# 2. Error Classification + +All errors must belong to a typed category. + +## 2.1 Routing Errors + +- E-ROUTE-001 — RoutingFailure +- E-ROUTE-002 — InvalidPrivacyConstraint + +## 2.2 Execution Errors + +- E-LOCAL-001 — LocalModelFailure +- E-CLOUD-001 — CloudModelFailure +- E-TIMEOUT-001 — InvocationTimeout + +## 2.3 Validation Errors + +- E-VALID-001 — SchemaValidationFailure +- E-VALID-002 — InvocationBoundaryViolation + +## 2.4 Session Errors + +- E-SESSION-001 — SessionExpired +- E-SESSION-002 — InvalidSessionID + +## 2.5 Network Errors + +- E-NET-001 — NetworkUnavailable +- E-NET-002 — CloudEndpointUnreachable + +Each error must be uniquely identifiable and stable across versions. + +--- + +# 3. 
Structured Error Response + +All failures must return a structured error object. + +```json +{ + "status": "error", + "error_code": "E-LOCAL-001", + "message": "Local model execution failed.", + "recoverable": false, + "session_id": "uuid", + "question_id": "uuid", + "timestamp": "ISO8601" +} +``` + +Rules: + +- No raw string errors +- No stack traces exposed to user +- No fallback without explicit rule +- recoverable must be deterministic + +--- + +# 4. Fail-Fast Policy + +The system must terminate execution immediately when an error occurs. + +Prohibited behaviors: + +- Silent retry (retry without logging or without explicit retry flag) +- Implicit prompt rewriting +- Hidden model escalation +- Recursive execution loops + +Fallback is allowed only if explicitly defined in Router rules. + +### Exception: Explicit Network Retry + +Retry is permitted ONLY under these strict conditions: + +- Error type is network timeout +- Explicit retry flag is set +- Maximum 1 retry attempt +- Retry is logged with trace_id +- Never silent + +All other retry scenarios are forbidden. + +--- + +# 5. Uncertainty Policy + +If a response is produced but uncertainty is high, the response must include structured confidence metadata. + +Example: + +```json +{ + "status": "success", + "confidence": 0.68, + "uncertainty_reason": "Insufficient context", + "response": { ... } +} +``` + +Confidence must never trigger autonomous retry. + +--- + +# 6. Observability Requirements + +Every error must be logged with: + +- error_code +- session_id +- question_id +- routing_decision +- model_used +- timestamp + +Logs must be inspectable. + +--- + +# 7. No Autonomous Recovery Rule + +The system must never attempt self-correction without human intervention. + +This includes: + +- Automatic summarization to compensate for failure +- Silent cloud fallback outside Router policy +- Memory mutation to mask inconsistency + +--- + +# 8. 
Governance + +Error handling must comply with: + +- ADR-0000 (Human Sovereignty Principle) +- Invocation Boundary +- Router Decision Matrix + +Any deviation requires explicit ADR amendment. \ No newline at end of file diff --git a/design/error-handling.md b/design/error-handling.md new file mode 100644 index 0000000..c17d10a --- /dev/null +++ b/design/error-handling.md @@ -0,0 +1,223 @@ + + +# Error Handling Standard + +## Purpose + +This document defines the unified error handling policy for Noesis Noema. + +The objective is: + +- Deterministic failure behavior +- Explicit error visibility +- Zero silent failures +- Clear separation between user-facing errors and internal diagnostics + +Error handling must align with: + +- Product Constitution (Human Sovereignty) +- Invocation Boundary rules +- Observability standards + +--- + +# 1. Error Design Principles + +## 1.1 No Silent Failure + +The system MUST NOT: + +- Swallow exceptions +- Retry silently +- Fallback to alternative routes without explicit log entry + +Every failure must be: + +- Logged +- Classified +- Traceable via trace-id + +--- + +## 1.2 Deterministic Failure Response + +Given the same input and same system state: + +The same error must be produced. + +No probabilistic error recovery. + +--- + +## 1.3 User Sovereignty + +AI must never: + +- Fabricate results when retrieval fails +- Mask missing context +- Generate speculative responses due to backend failure + +If retrieval fails: + +The user must be informed explicitly. + +--- + +# 2. Error Classification + +All errors must be categorized into one of the following types. + +## 2.1 ROUTING_ERROR + +Failure in deterministic router decision. + +Examples: + +- No rule match +- Conflicting rule evaluation + +--- + +## 2.2 INVOCATION_ERROR + +Failure during LLM call. + +Examples: + +- Timeout +- Model unavailable +- Token overflow + +--- + +## 2.3 MEMORY_ERROR + +Session memory inconsistency. 
+ +Examples: + +- Invalid session-id +- Expired session +- Corrupted session object + +--- + +## 2.4 VALIDATION_ERROR + +Invalid user input. + +Examples: + +- Empty prompt +- Exceeds max length + +--- + +## 2.5 SYSTEM_ERROR + +Unexpected internal failure. + +Examples: + +- Unhandled exception +- Dependency crash + +--- + +# 3. Structured Error Response Format + +All external-facing errors must follow this JSON schema: + +```json +{ + "error": { + "code": "ROUTING_ERROR", + "message": "No routing rule matched.", + "trace_id": "uuid", + "timestamp": "ISO-8601" + } +} +``` + +Requirements: + +- trace_id is mandatory +- timestamp is mandatory +- message must be human-readable + +Internal stack traces must NOT be exposed in production. + +--- + +# 4. Logging Requirements + +Every error must log: + +- error_code +- session_id +- route_type +- invocation_boundary_state +- model_name (if applicable) +- latency (if applicable) + +Logs must allow full post-mortem reconstruction. + +--- + +# 5. Retry Policy + +Retries are allowed only under the following conditions: + +- Error type is network timeout +- Explicit retry flag is set + +Retry rules: + +- Maximum 1 retry +- Retry must be logged with trace_id +- User must be informed if retry occurred +- Never silent + +No infinite retry loops. + +All other error types must fail immediately without retry. + +--- + +# 6. Production vs Development Behavior + +## Development Mode + +- Full stack trace allowed +- Verbose logging +- Model debug metadata allowed + +## Production Mode + +- No stack traces exposed +- Minimal user-facing error +- Full diagnostic logging internally + +--- + +# 7. Failure is a First-Class State + +Failure is not exceptional. + +Failure is a defined system state. + +The system must treat failure paths as explicitly designed execution flows. 
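One way to read "failure is a defined system state" is to give failure its own type, so that every call returns either a success value or a structured error and nothing is thrown past the caller. The sketch below is illustrative only: the names `Success`, `Failure`, `fail`, and `validate_prompt`, and the `max_length` default of 8000, are assumptions and not part of this standard. The error codes and the response shape follow sections 2 and 3 of this document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Union

# The five categories from section 2 of this standard.
ERROR_CODES = {
    "ROUTING_ERROR",
    "INVOCATION_ERROR",
    "MEMORY_ERROR",
    "VALIDATION_ERROR",
    "SYSTEM_ERROR",
}


@dataclass(frozen=True)
class Success:
    content: str
    trace_id: str


@dataclass(frozen=True)
class Failure:
    code: str
    message: str
    trace_id: str
    timestamp: str

    def to_response(self) -> dict:
        # Shape of the external-facing error object from section 3.
        return {
            "error": {
                "code": self.code,
                "message": self.message,
                "trace_id": self.trace_id,
                "timestamp": self.timestamp,
            }
        }


Outcome = Union[Success, Failure]


def fail(code: str, message: str, trace_id: str) -> Failure:
    # An unclassified code is a programming error, so it is rejected loudly.
    if code not in ERROR_CODES:
        raise ValueError(f"Unclassified error code: {code}")
    return Failure(code, message, trace_id, datetime.now(timezone.utc).isoformat())


def validate_prompt(prompt: str, trace_id: str, max_length: int = 8000) -> Outcome:
    # max_length is a hypothetical limit chosen for this sketch.
    text = prompt.strip()
    if not text:
        return fail("VALIDATION_ERROR", "Prompt is empty.", trace_id)
    if len(text) > max_length:
        return fail("VALIDATION_ERROR", "Prompt exceeds maximum length.", trace_id)
    return Success(text, trace_id)
```

Because the caller has to distinguish `Success` from `Failure` before using the result, the failure path cannot be skipped by accident, which is the property the Compliance Gate asks reviewers to check.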
+ +--- + +# Compliance Gate + +Before merging any feature: + +- All new errors must map to classification +- Structured response format must be preserved +- No silent catch blocks allowed + +If any of these fail: + +Feature must not be merged. \ No newline at end of file diff --git a/design/evaluation-framework.md b/design/evaluation-framework.md new file mode 100644 index 0000000..c7bffa4 --- /dev/null +++ b/design/evaluation-framework.md @@ -0,0 +1,135 @@ + + +# Evaluation Framework + +## Status +Draft + +--- + +# 1. Purpose + +The Evaluation Framework defines how system correctness, determinism, and response integrity +are measured and validated. + +Evaluation must not rely on subjective impressions. +Evaluation must be reproducible. + +--- + +# 2. Evaluation Layers + +The system is evaluated across four independent layers. + +--- + +## L1 — Schema Compliance + +Validate that: + +- All Question objects conform to NoemaQuestion schema +- Invocation respects Invocation Boundary +- No undeclared fields are present +- privacy_level is honored + +Failure at L1 blocks execution. + +--- + +## L2 — Routing Determinism + +Given identical: + +- Question object +- System configuration +- Model capability +- Network state + +The Router MUST produce identical routing decisions. + +Test Requirements: + +- Snapshot tests for routing output +- Deterministic evaluation of fallback behavior +- confidence must equal 1.0 for routing layer + +--- + +## L3 — Execution Integrity + +Validate that each Invocation: + +- Has exactly one entry point +- Has exactly one exit point +- Produces one structured Response or one structured Error +- Does not trigger recursive invocation +- Does not mutate undeclared state + +Execution integrity must be testable via integration tests. 
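The L3 checks can be exercised in an integration test with a small guard around the execution call. The following is a minimal sketch under stated assumptions: `InvocationGuard` and `ExecutionFailure` are hypothetical names, and mapping a re-entrant call to `E-VALID-002` (InvocationBoundaryViolation in the Error Doctrine) is one possible choice, not something this framework prescribes.

```python
from typing import Callable


class ExecutionFailure(Exception):
    """Raised by an execution path with an already classified error code."""

    def __init__(self, error_code: str, message: str) -> None:
        super().__init__(message)
        self.error_code = error_code
        self.message = message


class InvocationGuard:
    """Single entry, single exit, and no re-entrant invocation."""

    def __init__(self) -> None:
        self._active = False

    def run(self, question_id: str, execute: Callable[[], str]) -> dict:
        if self._active:
            # A nested call while an Invocation is running is rejected,
            # never executed.
            return {
                "status": "error",
                "error_code": "E-VALID-002",
                "message": "Recursive invocation rejected.",
                "question_id": question_id,
            }
        self._active = True
        try:
            content = execute()
            return {"status": "success", "question_id": question_id, "content": content}
        except ExecutionFailure as exc:
            # Exactly one structured Error instead of a raw exception.
            return {
                "status": "error",
                "error_code": exc.error_code,
                "message": exc.message,
                "question_id": question_id,
            }
        finally:
            self._active = False
```

A test can then assert that each call to `run` yields exactly one dictionary whose `status` is either `success` or `error`, and that a nested call made from inside `execute` comes back as an error.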
+ +--- + +## L4 — Response Quality (Human Review Layer) + +This layer evaluates: + +- Relevance to Question intent +- Logical coherence +- Explicit uncertainty when applicable +- Absence of hallucinated claims + +This layer may include manual review. + +This layer MUST NOT introduce autonomous tuning. + +--- + +# 3. Deterministic Test Mode + +The system must support a deterministic test mode in which: + +- Model calls may be mocked +- Router decisions are snapshot-testable +- Session behavior is time-controlled +- Error paths are reproducible + +Test mode must not alter production logic. + +--- + +# 4. Metrics + +Evaluation metrics must include: + +- Routing consistency rate (target: 100%) +- Invocation boundary compliance (target: 100%) +- Structured error rate visibility (target: 100%) +- Session expiration correctness (target: 100%) + +Model answer “accuracy” is secondary to structural compliance. + +--- + +# 5. Non-Goals + +The following are explicitly excluded: + +- Self-learning evaluation loops +- Reinforcement-based optimization +- Autonomous metric-driven routing adjustment +- Hidden performance tuning + +Evaluation must not mutate system behavior. + +--- + +# 6. Governance + +All evaluation procedures must comply with: + +- ADR-0000 (Human Sovereignty Principle) +- Router Decision Matrix +- Invocation Boundary +- Memory Lifecycle + +Violation requires explicit ADR update. \ No newline at end of file diff --git a/design/execution-flow.md b/design/execution-flow.md new file mode 100644 index 0000000..923e872 --- /dev/null +++ b/design/execution-flow.md @@ -0,0 +1,229 @@ + + +# Execution Flow Specification + +## 1. Purpose + +This document defines the deterministic execution flow of Noesis Noema. +It formalizes how a user question (Noesis) becomes a computed response (Noema) +under human sovereignty. 
+ +The flow must: +- Be deterministic +- Contain no hidden autonomous behavior +- Respect the Router Decision Matrix +- Respect the Invocation Boundary + +--- + +## 2. High-Level Flow + +``` +User Input + ↓ +Client Pre-Processing + ↓ +Router Decision Matrix + ↓ +Invocation Boundary Validation + ↓ +Execution Path (Offline | Online) + ↓ +Response Normalization + ↓ +Client Rendering +``` + +--- + +## 3. Detailed Execution Steps + +### Step 1 — User Input (Noesis Origin) + +- User submits a prompt from Client UI +- Client assigns: + - session-id + - request-id (UUID) + - timestamp +- Input is immutable after submission + +Invariant: +> AI does not modify or reinterpret intent before routing. + +--- + +### Step 2 — Client Pre-Processing + +Client performs deterministic preprocessing: + +- Trim whitespace +- Validate input length +- Attach session metadata +- Optional: classify input type (informational / analytical / retrieval) + +No inference occurs here. + +--- + +### Step 3 — Router Decision Matrix + +**Execution Location: Client-side** + +The Router executes entirely within the Client boundary. + +The server does not make routing decisions. + +Router evaluates using predefined deterministic rules: + +Inputs: +- Prompt characteristics +- Token length estimate +- Local model availability +- Connectivity state +- Policy constraints + +Output: +- Route = OFFLINE or ONLINE +- Model profile selection + +Rules must be: +- Pure functions +- Versioned +- Logged + +No probabilistic routing allowed. 
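The requirement that rules be pure, versioned, and loggable can be sketched as a single function over an immutable input. Everything named here is an assumption made for illustration: `RouterInput`, `RouteDecision`, `RoutingError`, the rule identifiers, and the version string are hypothetical, and the rule order is only one plausible reading of the inputs listed above. The 4096-token default mirrors the `token_threshold` default in the Router Decision Matrix.

```python
from dataclasses import dataclass

RULESET_VERSION = "0.1.0"  # hypothetical version label


class RoutingError(Exception):
    """No rule matched; the caller turns this into a structured error."""


@dataclass(frozen=True)
class RouterInput:
    token_estimate: int
    local_model_available: bool
    network_online: bool
    local_only_policy: bool


@dataclass(frozen=True)
class RouteDecision:
    route: str  # "OFFLINE" or "ONLINE"
    rule_id: str  # which rule matched, so the decision can be logged
    ruleset_version: str


def decide_route(inp: RouterInput, token_threshold: int = 4096) -> RouteDecision:
    # Pure function: no I/O, no clock, no randomness, no mutable state.
    if inp.local_only_policy:
        return RouteDecision("OFFLINE", "R1-policy-local-only", RULESET_VERSION)
    if inp.local_model_available and inp.token_estimate <= token_threshold:
        return RouteDecision("OFFLINE", "R2-fits-local", RULESET_VERSION)
    if inp.network_online:
        return RouteDecision("ONLINE", "R3-remote", RULESET_VERSION)
    raise RoutingError("No routing rule matched.")
```

Keeping logging outside the function preserves purity: the caller records `rule_id` and `ruleset_version` from the returned decision, and identical inputs always produce an equal `RouteDecision`.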
+ +--- + +### Step 4 — Invocation Boundary Validation + +Before execution: + +- Validate session-id +- Validate rate limits +- Validate payload schema +- Enforce security constraints + +If validation fails: +- Return structured error +- Do not invoke model + +### Step 4.5 — Privacy Enforcement + +Before any network transmission: + +- Validate privacy_level from Question object +- If privacy_level == "local": + - Block all network calls + - Block cloud fallback (set fallback_allowed = false) + - Fail execution if local route fails (return structured error) + - Zero network transmission guaranteed + +This check is mandatory and non-bypassable. + +Privacy enforcement must be logged with trace_id. + +--- + +### Step 5 — Execution Path + +#### 5A — Offline Path + +- Local LLM invoked +- Session memory injected (if within 45 min window) +- Execution is synchronous + +Constraints: +- No background autonomy +- No recursive self-calls +- No tool self-discovery + +#### 5B — Online Path + +- Remote LLM endpoint invoked +- Payload strictly matches contract +- Timeout enforced +- Response streamed or returned fully + +Constraints: +- No dynamic endpoint switching +- No hidden chain-of-thought storage + +--- + +### Step 6 — Response Normalization + +Server or client layer: + +- Enforce response schema +- Strip system metadata +- Log evaluation signals +- Attach response-id + +No hidden augmentation. + +--- + +### Step 7 — Client Rendering + +Client: +- Displays response +- Stores session memory (client-scoped) +- Updates session expiration timer (45 min) + +No automatic follow-up generation. + +--- + +## 4. Memory Handling + +- Scope: Client-scoped +- Duration: 45 minutes +- Storage: Server session object indexed by session-id +- Automatic purge on timeout + +No persistent memory unless explicitly approved by user. + +--- + +## 5. 
Failure Handling + +All failures must be: + +- Deterministic +- Logged +- Structured + +Categories: +- ValidationError +- RoutingError +- InvocationError +- TimeoutError +- PolicyViolation + +No silent fallback behavior. + +--- + +## 6. Non-Negotiable Constraints + +1. Human origin of question is preserved +2. AI never self-initiates tasks +3. No hidden autonomy +4. All routing is explainable +5. All execution paths are traceable + +--- + +## 7. Versioning + +This execution flow specification must be versioned. + +Changes require: +- ADR reference +- Router matrix update +- Invocation boundary review + +--- + +End of Execution Flow Specification. diff --git a/design/invocation-boundary.md b/design/invocation-boundary.md new file mode 100644 index 0000000..8728cd9 --- /dev/null +++ b/design/invocation-boundary.md @@ -0,0 +1,142 @@ + + +# Invocation Boundary + +## Status +Draft + +--- + +# 1. Purpose + +The Invocation Boundary defines the strict execution perimeter of the system. + +It guarantees that every AI execution: +- Is explicitly triggered by a human action +- Is bound to a single Question object +- Produces exactly one Response object +- Has no hidden side effects + +The Invocation Boundary exists to prevent autonomous drift. + +--- + +# 2. Definition of Invocation + +An Invocation is defined as: + +- A single execution attempt +- Triggered by explicit human intent +- Associated with one `NoemaQuestion.id` +- Executed through the Router +- Producing one structured Response + +An Invocation MUST: +- Be traceable +- Be logged +- Respect privacy_level +- Respect Router decision + +--- + +# 3. What Is NOT an Invocation + +The following are explicitly forbidden: + +- Background execution +- Recursive self-invocation +- Auto-triggered execution +- Silent retries +- Spawning new Question objects +- Implicit memory writes + +If any of these occur, the boundary has been violated. + +--- + +# 4. 
Invocation Lifecycle + +The system must follow this exact sequence: + +Human Action + ↓ +Question Object Created + ↓ +Router Decision + ↓ +Execution (Local or Cloud) + ↓ +Response Object Generated + ↓ +Return to Human + +Execution ends here. + +There must be no implicit continuation beyond Response generation. + +--- + +# 5. Execution Scope Rules + +During an Invocation, the system MAY: + +- Select a model (deterministically) +- Execute model inference +- Perform allowed fallback (if defined by Router) +- Produce structured logs + +The system MUST NOT: + +- Modify system configuration +- Persist memory outside declared path +- Escalate routing silently +- Execute undeclared external calls + +--- + +# 6. State Mutation Policy + +State mutation is forbidden unless: + +- Explicitly declared in Invocation contract +- Logged +- Traceable to Question ID +- User-visible + +Hidden mutation is strictly prohibited. + +--- + +# 7. Deterministic Guarantee + +Invocation must behave as a controlled execution unit. + +It must: +- Have a single entry point +- Have a single exit point +- Avoid recursive loops +- Avoid self-modification + +--- + +# 8. Logging Requirements + +Each Invocation MUST record: + +- question_id +- routing_decision +- selected_model +- execution_result +- fallback_usage +- execution_timestamp + +Logs must be inspectable by the user. + +--- + +# 9. Boundary Violation Rule + +Any feature that introduces autonomous execution, implicit continuation, or hidden side effects +requires an explicit ADR update. + +Violation of Invocation Boundary invalidates system compliance with ADR-0000. \ No newline at end of file diff --git a/design/memory-lifecycle.md b/design/memory-lifecycle.md new file mode 100644 index 0000000..8e466fe --- /dev/null +++ b/design/memory-lifecycle.md @@ -0,0 +1,144 @@ +# Memory Lifecycle + +## Status +Draft + +--- + +# 1. Purpose + +This document defines the lifecycle, scope, and termination rules of conversational memory +within Noesis Noema. 
+ +Memory must enhance usability while preserving human sovereignty +and preventing autonomous persistence. + +--- + +# 2. Scope Definition + +Memory is strictly **session-scoped**. + +There is no cross-session persistence. +There is no autonomous long-term storage. + +Memory exists only to support coherent interaction +within a bounded time window. + +--- + +# 3. Session Definition + +A Session is defined as: + +- Explicitly initiated by human interaction +- Identified by a unique `session_id` +- Containing multiple Invocations +- Bound by a fixed timeout window + +Session Timeout: + +**45 minutes (fixed)** + +The timeout is non-extendable. +Any new interaction after expiration creates a new Session. + +--- + +# 4. Storage Model + +## Client Side (Primary Authority) + +- Memory is stored client-side. +- Client owns the session state. +- Client may clear memory at any time. + +## Server Side (Ephemeral Mirror) + +- Server may hold session data during active session only. +- Server data is indexed by `session_id`. +- Server must discard all session data upon: + - Timeout expiration + - Explicit session termination + +Server must not retain memory beyond active session window. + +--- + +# 5. Memory Content Rules + +Session memory may contain: + +- Prior Question objects +- Prior Response objects +- Routing decisions +- Structured metadata + +Session memory must not contain: + +- Undeclared external data +- Hidden embeddings +- Cross-user information +- Persistent profile enrichment + +--- + +# 6. Expiration Policy + +At 45 minutes of inactivity: + +- Session state is invalidated +- Client memory is cleared +- Server memory is deleted +- Any attempt to reuse session_id is rejected + +No background archival is permitted. + +## 6.1 Timeout Enforcement Authority + +Both Client and Server independently enforce the 45-minute timeout. + +Timeout is measured from `last_activity_at` (activity-based). + +The Client is authoritative for user-facing behavior. 
+The Server is authoritative for security enforcement. + +If disagreement occurs, Server rejection prevails (E-SESSION-001). + +--- + +# 7. Garbage Collection + +The system must guarantee: + +- Deterministic memory deletion +- No residual references +- No shadow persistence + +Memory release must be verifiable. + +--- + +# 8. Forbidden Behaviors + +The following are strictly prohibited: + +- Automatic long-term summarization for retention +- Silent memory compression +- Cross-session carry-over +- Memory-based routing without user visibility +- Persistent vector store accumulation + +Any introduction of persistent memory requires a new ADR. + +--- + +# 9. Governance Rule + +Memory lifecycle must comply with: + +- ADR-0000 (Human Sovereignty Principle) +- Invocation Boundary +- Router Decision Matrix + +Violation invalidates system compliance. \ No newline at end of file diff --git a/design/mvp-consistency-checklist.md b/design/mvp-consistency-checklist.md new file mode 100644 index 0000000..9a6dc07 --- /dev/null +++ b/design/mvp-consistency-checklist.md @@ -0,0 +1,93 @@ +# MVP Consistency Checklist + +## Purpose +This checklist ensures that the Vertical Slice MVP remains fully aligned with: + +- Product Constitution (Human Sovereignty) +- Deterministic Router Model +- Explicit Invocation Boundary +- Session-Scoped Memory Policy +- No Hidden Autonomy Principle + +This document must be validated before any MVP-related branch is merged. + +--- + +## 1. Human Sovereignty Validation + +- [ ] The user explicitly triggers every execution (no background auto-run) +- [ ] No autonomous task scheduling exists +- [ ] The system does not rewrite or reinterpret user intent without explicit confirmation +- [ ] AI never pre-fetches or pre-computes speculative responses + +Failure Condition: +If any hidden execution path exists, MVP is invalid. + +--- + +## 2. 
Deterministic Routing Validation + +- [ ] Router decision matrix is rule-based (no probabilistic routing) +- [ ] Every route decision is explainable via logged rule match +- [ ] Offline/Online switching is traceable +- [ ] Routing does not depend on hidden model heuristics + +Failure Condition: +If routing cannot be reproduced deterministically, MVP is invalid. + +--- + +## 3. Invocation Boundary Validation + +- [ ] Every LLM invocation is explicit and logged +- [ ] No chained hidden calls +- [ ] Invocation metadata includes: session-id, route-type, timestamp +- [ ] No silent retry logic without log entry + +Failure Condition: +If LLM execution cannot be audited, MVP is invalid. + +--- + +## 4. Session & Memory Validation + +- [ ] Memory scope = session only +- [ ] Session timeout = 45 minutes +- [ ] Memory cleared automatically after timeout +- [ ] No persistent memory unless explicitly approved by user +- [ ] Memory never shared across sessions + +Failure Condition: +If memory survives beyond session scope, MVP is invalid. + +--- + +## 5. Error Handling Compliance + +- [ ] All errors return structured response (code, message, trace-id) +- [ ] User-facing messages are human-readable +- [ ] Internal errors are logged but not exposed +- [ ] No silent failure paths + +Failure Condition: +If errors are swallowed or hidden, MVP is invalid. + +--- + +## 6. Observability & Auditability + +- [ ] Each execution produces structured logs +- [ ] Logs include routing decision +- [ ] Logs include invocation boundary confirmation +- [ ] Logs do not contain sensitive raw prompts in production mode + +Failure Condition: +If system behavior cannot be reconstructed from logs, MVP is invalid. + +--- + +## Final Gate + +MVP is considered valid only if all checklist items are satisfied. + +No feature expansion is allowed before this checklist passes. 
diff --git a/design/observability-standard.md b/design/observability-standard.md new file mode 100644 index 0000000..082f1ed --- /dev/null +++ b/design/observability-standard.md @@ -0,0 +1,207 @@ + + +# Observability Standard + +## Status +Draft + +--- + +# 1. Purpose + +This document defines the observability requirements for Noesis Noema. + +Observability must make system behavior: +- Traceable +- Auditable +- Deterministic to reconstruct +- User-inspectable (without leaking secrets) + +Observability must not introduce hidden autonomy. + +--- + +# 2. Core Principles + +## 2.1 Traceability by Design + +Every user-triggered Invocation MUST be traceable end-to-end. + +## 2.2 Deterministic Reconstruction + +Given logs and configuration versions, an operator must be able to reconstruct: + +- What happened +- Why routing was chosen +- Which model was invoked +- Whether fallback occurred +- Which errors occurred + +## 2.3 Privacy Preservation + +Logs are sensitive. +Production logging must avoid raw prompt disclosure by default. + +## 2.4 Human Sovereignty + +Logging must not enable background autonomous execution. +Logs are passive records, not triggers. + +--- + +# 3. Required Identifiers + +All telemetry MUST include the following identifiers where applicable: + +- trace_id (UUID) — per Invocation +- session_id (UUID) — per active session +- question_id (UUID) — per Question +- response_id (UUID) — per Response + +Identifier rules: + +- trace_id MUST be generated at Invocation entry +- trace_id MUST be propagated across boundaries (client ⇄ server) +- trace_id MUST appear in user-visible errors + +--- + +# 4. Event Taxonomy + +The system MUST emit structured events. 
+ +Required event types: + +## 4.1 Session Events + +- session_created +- session_expired +- session_terminated + +## 4.2 Invocation Events + +- invocation_started +- invocation_routed +- invocation_executed +- invocation_fallback_used +- invocation_completed + +## 4.3 Error Events + +- error_raised +- error_returned + +--- + +# 5. Minimum Structured Log Fields + +Every event MUST contain: + +- event_name +- timestamp (ISO 8601) +- trace_id +- session_id (if applicable) +- question_id (if applicable) + +Additional required fields by event: + +## invocation_routed + +- route (local | cloud) +- reason (rule identifier or explanation) +- selected_model +- fallback_allowed (true | false) + +## invocation_executed + +- route +- selected_model +- latency_ms +- result (success | error) + +## invocation_fallback_used + +- from_route +- to_route +- reason + +## error_raised / error_returned + +- error_code +- error_category +- recoverable (true | false) + +--- + +# 6. Prompt and Data Redaction Policy + +## 6.1 Production Default + +Production logs MUST NOT include raw `content` (prompt text) by default. + +Allowed alternatives: + +- content_hash (stable hash of content) +- content_length +- truncated_preview (first N chars, configurable, default disabled) + +## 6.2 Development Mode + +Development logs MAY include raw content only when explicitly enabled. + +--- + +# 7. User-Inspectable Logs + +Users must be able to inspect high-level execution records without exposing secrets. + +Minimum user-visible fields: + +- timestamp +- route (local | cloud) +- selected_model +- fallback_used +- error_code (if any) +- trace_id + +The UI must not display raw internal stack traces. + +--- + +# 8. Metrics + +The system SHOULD provide aggregated metrics: + +- routing_distribution (local vs cloud) +- fallback_rate +- error_rate by code +- latency percentiles +- session_expiration_count + +Metrics must never trigger autonomous behavior changes. + +--- + +# 9. 
Forbidden Behaviors + +The following are prohibited: + +- Using logs as triggers for background re-execution +- Silent telemetry that bypasses privacy_level +- Storing raw prompts in production without explicit user enablement +- Cross-session correlation identifiers that create persistent profiling + +--- + +# 10. Governance + +Observability must comply with: + +- ADR-0000 (Human Sovereignty Principle) +- Router Decision Matrix +- Invocation Boundary +- Memory Lifecycle +- Security Model +- Error Doctrine + +Any change to redaction defaults or identifier propagation requires an ADR. \ No newline at end of file diff --git a/design/router-decision-matrix.md b/design/router-decision-matrix.md new file mode 100644 index 0000000..cf0fee0 --- /dev/null +++ b/design/router-decision-matrix.md @@ -0,0 +1,201 @@ + + +# Router Decision Matrix + +## Status +Draft + +--- + +# 1. Purpose + +The Router determines whether a Question is executed in local or cloud context. + +The Router MUST be fully deterministic. +Given identical inputs and identical system state, the Router MUST always produce the same output. + +The Router prioritizes predictability over optimization. + +## 1.1 Execution Location + +The Router MUST execute client-side. + +The server MUST NOT make routing decisions. + +The server MAY validate routing decisions but MUST NOT override them. + +Client-side routing ensures: +- Human-controllable routing policy +- Inspectable routing logic +- No hidden server-side escalation + +--- + +# 2. Input Object + +The Router operates exclusively on the structured `NoemaQuestion` object defined in `design/context-index.md`. + +The Router MUST NOT inspect raw prompt strings outside the structured object. 
+ +Input fields used for routing: + +- privacy_level +- intent +- content (for token estimation only) +- constraints (optional) + +Runtime state inputs: + +- local_model_capability +- cloud_model_capability +- network_state +- token_threshold + +### Runtime State: local_model_capability + +A declarative structure defining what the local model supports. + +Schema: +```json +{ + "model_name": "string", + "max_tokens": "number", + "supported_intents": ["informational", "analytical", "retrieval"], + "available": "boolean" +} +``` + +Router MUST NOT execute local route if `available == false`. + +Router MUST verify that the Question's `intent` (if specified) is in `supported_intents`. + +### Runtime State: network_state + +Possible values: +- `online` — Network connectivity confirmed +- `offline` — Network unavailable +- `degraded` — Network available but high latency + +Network state MUST be checked before cloud route selection. + +### Runtime State: token_threshold + +The maximum token count for local execution. + +Default value: 4096 tokens (configurable) + +Token estimation method: +- Use deterministic tokenizer (must match local model) +- Count tokens in `content` field +- Include session memory token count if applicable + +--- + +# 3. Deterministic Routing Rules + +Routing follows strict priority order. + +## Rule 1 — Privacy Enforcement + +If `privacy_level == "local"`: +- route = "local" +- fallback_allowed = false + +If `privacy_level == "cloud"`: +- route = "cloud" +- fallback_allowed = false + +## Rule 2 — Auto Mode + +If `privacy_level == "auto"`: + +1. Estimate token count from `content`. +2. Check if local model supports `intent`. +3. If: + - token_count <= token_threshold + - AND local_model_capability supports intent + - (network_state is not consulted for the local route) + + Then: + - route = "local" + - fallback_allowed = true + +4.
Else: + - route = "cloud" + - fallback_allowed = false + +## Rule 3 — Local Failure Handling + +If route == "local" AND execution fails: + +- If fallback_allowed == true: + - route = "cloud" + - log escalation reason +- Else: + - return structured error + +## Rule 4 — Cloud Failure Handling + +If route == "cloud" AND execution fails: + +- return structured error +- no automatic retry + +--- + +# 4. Router Output Schema + +```json +{ + "route": "local | cloud", + "model": "string", + "reason": "string", + "fallback_allowed": true, + "confidence": 1.0 +} +``` + +Notes: +- confidence is always 1.0 for deterministic routing. +- model must be explicitly selected. + +--- + +# 5. Logging Requirements + +Each routing decision MUST log: + +- question_id +- selected_route +- selected_model +- evaluated_rules +- fallback_flag + +Logs MUST be inspectable by the user. + +--- + +# 6. Forbidden Behaviors + +The following are strictly prohibited: + +- Silent model escalation +- Recursive routing +- Dynamic probabilistic switching +- Hidden fallback execution +- Prompt-based routing outside structured fields + +Violation of these rules requires ADR update. + +--- + +# 7. Determinism Guarantee + +The Router must behave as a pure decision function. + +It must not: +- Learn +- Adapt +- Self-modify + +All changes to routing behavior require explicit versioned modification. \ No newline at end of file diff --git a/design/security-model.md b/design/security-model.md new file mode 100644 index 0000000..e599f84 --- /dev/null +++ b/design/security-model.md @@ -0,0 +1,163 @@ + + +# Security Model + +## Status +Draft + +--- + +# 1. Purpose + +This document defines the security boundaries, threat assumptions, and required controls +for Noesis Noema. + +Security must reinforce human sovereignty. +Security must prevent: +- Unauthorized execution +- Data leakage across boundaries +- Silent privilege escalation +- Hidden persistence + +--- + +# 2. 
Trust Boundaries + +Noesis Noema operates across explicit trust zones. + +## 2.1 Client (Trusted by User) + +- Runs on the user device +- Holds session-scoped memory (authoritative) +- Initiates all invocations + +## 2.2 Local Execution (On-Device / Local Runtime) + +- Executes offline route +- Must not exfiltrate data +- Must enforce invocation boundary + +## 2.3 Network Boundary + +- Any network transmission is treated as untrusted transport +- TLS is mandatory +- Request/response must be integrity-checked + +## 2.4 Cloud Execution (Least Trusted) + +- Executes online route +- Receives only contract-approved payloads +- Must not receive data when privacy_level == local + +## 2.5 Observability Surface + +- Logs are sensitive +- Must avoid raw prompt disclosure in production +- Must be user-inspectable without leaking secrets + +--- + +# 3. Threat Model (High-Level) + +Primary threats to address: + +- T1: Data exfiltration from client/session memory +- T2: Prompt leakage via logs or telemetry +- T3: Silent cloud escalation (privacy bypass) +- T4: Injection into session context (context poisoning) +- T5: Replay attacks using session_id +- T6: Supply chain compromise (dependencies, models) +- T7: Unauthorized tool execution / hidden autonomy + +--- + +# 4. 
Required Controls + +## 4.1 Identity and Authorization + +- Session objects must be bound to a cryptographically strong session_id +- session_id must be treated as secret +- Server must reject unknown or expired session_id + +## 4.2 Transport Security + +- TLS required for all cloud requests +- Certificate validation must not be bypassed +- No plaintext transport + +## 4.3 Input Validation + +- All inputs must validate against NoemaQuestion schema +- Reject additionalProperties (no undeclared fields) +- Enforce maximum input length + +## 4.4 Privacy Enforcement + +- privacy_level == local must guarantee zero network transmission of content +- Router must log the privacy decision +- Cloud payload must be minimized and contract-driven + +## 4.5 Session Protection + +- Session timeout = 45 minutes of inactivity (fixed) +- Server must delete mirrored session data on expiry +- Client must clear session memory on expiry +- Reject reuse of expired session_id + +## 4.6 Execution Restrictions + +- No background execution +- No recursive invocation +- No tool self-discovery +- No undeclared external calls + +These restrictions must be enforceable via invocation boundary checks. + +## 4.7 Logging and Redaction + +Production logs must: + +- Avoid raw prompt content unless explicitly enabled +- Store only hashes or truncated previews if needed +- Include trace_id and question_id + +--- + +# 5. Security Invariants + +The following invariants must always hold: + +1. A human triggers every invocation. +2. privacy_level is never bypassed. +3. No session data persists beyond 45 minutes of inactivity. +4. No hidden network calls occur. +5. No response is returned without traceability. + +Violation of any invariant is a security incident. + +--- + +# 6. Security Incident Handling + +If a violation is detected: + +- Fail fast +- Return a structured error +- Log incident with trace_id +- Prevent continued execution + +No silent recovery is allowed. + +--- + +# 7.
Governance + +This Security Model must comply with: + +- ADR-0000 (Human Sovereignty Principle) +- Router Decision Matrix +- Invocation Boundary +- Memory Lifecycle +- Error Doctrine + +Any change to trust boundaries or invariants requires an ADR. \ No newline at end of file diff --git a/design/session-management.md b/design/session-management.md new file mode 100644 index 0000000..4864c95 --- /dev/null +++ b/design/session-management.md @@ -0,0 +1,201 @@ + + +# Session Management + +## Status +Draft + +--- + +# 1. Purpose + +This document defines how sessions are created, validated, maintained, and terminated. + +Session management must: +- Preserve human sovereignty +- Enforce session-scoped memory rules +- Prevent replay and cross-session leakage +- Remain deterministic and auditable + +This document is normative for both Client and Server. + +--- + +# 2. Definitions + +## 2.1 Session + +A Session is a bounded conversational container that may include multiple Invocations. + +A Session: +- Is initiated by explicit human interaction +- Has a unique `session_id` +- Has a fixed inactivity timeout +- Owns session-scoped memory + +## 2.2 Invocation + +An Invocation is a single execution attempt bound to one Question object. +Invocation rules are defined in `design/invocation-boundary.md`. + +--- + +# 3. Session Lifecycle + +## 3.1 Creation + +A Session is created when: +- The user submits the first Question +- No valid active session exists + +Creation requirements: +- Generate cryptographically strong `session_id` (UUID v4 or equivalent) +- Record `created_at` and `last_activity_at` +- Initialize empty session memory container + +Client is the primary authority for session state. 
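
The creation requirements in section 3.1 could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Session` class, its `memory` field, and the `create_session` helper are hypothetical names introduced here; only `session_id`, `created_at`, and `last_activity_at` come from the text above.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Session:
    """Hypothetical client-side session record (names are assumptions)."""
    session_id: str
    created_at: datetime
    last_activity_at: datetime
    # Empty session memory container, initialized per session.
    memory: list = field(default_factory=list)


def create_session(now=None):
    """Create a session with a random UUID v4 identifier."""
    now = now or datetime.now(timezone.utc)
    # uuid.uuid4() generates a random (version 4) UUID.
    return Session(
        session_id=str(uuid.uuid4()),
        created_at=now,
        last_activity_at=now,
    )
```

Passing `now` explicitly keeps the helper easy to test; a caller would normally omit it and let the current UTC time be used.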
+ +## 3.2 Active State + +While active, the session: +- Accepts new Questions +- Associates each Invocation with `session_id` +- Updates `last_activity_at` on each user-triggered Invocation + +## 3.3 Expiration + +A session expires after: + +**45 minutes of inactivity (fixed)** + +Expiration rules: +- Expiration is non-extendable +- Any interaction after expiration creates a new Session + +### Timeout Enforcement + +Timeout is measured from `last_activity_at` (activity-based). + +Both Client and Server independently enforce the 45-minute timeout: + +- **Client:** MUST enforce timeout locally and clear memory. + - Client is authoritative for user-facing behavior. + - Client MUST prevent UI from sending requests with expired session_id. + +- **Server:** MUST enforce timeout independently and purge session. + - Server is authoritative for security enforcement. + - Server MUST reject requests with expired session_id. + +- **Disagreement:** If Client sends a request with expired session_id, Server MUST reject with E-SESSION-001. + +Both enforcement layers are mandatory and independent. + +## 3.4 Termination + +A session terminates when: +- Timeout expires +- The user explicitly clears/ends the session + +Termination requirements: +- Client clears session memory deterministically +- Server deletes mirrored session object deterministically + +--- + +# 4. Client Responsibilities + +The Client MUST: + +- Generate and store the active `session_id` +- Attach `session_id` to every Invocation +- Maintain session-scoped memory locally +- Enforce timeout and clear memory on expiration +- Provide user controls to: + - View session status + - Clear session memory + +The Client MUST NOT: +- Persist session memory across sessions +- Auto-extend session timeout +- Run background invocations + +--- + +# 5. Server Responsibilities + +The Server MAY keep an ephemeral mirror of session state indexed by `session_id`. 
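
One way to picture the ephemeral mirror together with the server-side timeout check from section 3.3 is the sketch below. It is an assumption-laden illustration: `SessionMirror`, `SessionError`, `register`, and `touch` are hypothetical names, and treating exactly 45 minutes of inactivity as expired is a choice made here, not something the document specifies. The error codes are the ones listed in section 8.

```python
from datetime import datetime, timedelta, timezone

# Fixed inactivity timeout from section 3.3.
SESSION_TIMEOUT = timedelta(minutes=45)


class SessionError(Exception):
    """Structured session error carrying an E-SESSION code."""

    def __init__(self, code, name):
        super().__init__(f"{code}: {name}")
        self.code = code


class SessionMirror:
    """Hypothetical in-memory mirror: session_id -> last_activity_at."""

    def __init__(self):
        self._last_activity = {}

    def register(self, session_id, now):
        """Record a session created by the client."""
        self._last_activity[session_id] = now

    def touch(self, session_id, now):
        """Validate a request and record activity, or raise a structured error."""
        last = self._last_activity.get(session_id)
        if last is None:
            raise SessionError("E-SESSION-002", "InvalidSessionID")
        if now - last >= SESSION_TIMEOUT:
            # Purge mirrored state on expiry before rejecting the request.
            del self._last_activity[session_id]
            raise SessionError("E-SESSION-001", "SessionExpired")
        self._last_activity[session_id] = now
```

Because the mirror is purged on expiry, a second request with the same expired `session_id` is reported as unknown (E-SESSION-002) in this sketch. Reporting E-SESSION-001 on every reuse would require keeping a record of expired identifiers, which is a trade-off against the rule that the server must not retain state beyond the active window.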
+ +The Server MUST: + +- Treat `session_id` as a secret +- Reject unknown `session_id` +- Reject expired `session_id` +- Delete mirrored session state upon: + - Expiration + - Explicit termination + +The Server MUST NOT: +- Retain session memory beyond the active window +- Create sessions autonomously +- Join or merge sessions + +--- + +# 6. Validation Rules + +## 6.1 session_id Validation + +- Must be present for any session-aware operation +- Must match expected format +- Must map to an active session + +## 6.2 Replay Resistance + +- Expired session_id reuse must be rejected +- Server must not accept timestamps outside session window + +--- + +# 7. Observability + +Each Invocation must log: + +- session_id +- question_id +- route +- model +- timestamp + +Session lifecycle events must log: + +- session_created +- session_expired +- session_terminated + +Logs must avoid raw prompt content in production. + +--- + +# 8. Failure Handling + +Session-related failures must return structured errors: + +- E-SESSION-001 — SessionExpired +- E-SESSION-002 — InvalidSessionID + +No silent session recreation is allowed. +The user must be informed when a session has expired. + +--- + +# 9. Governance + +Session management must comply with: + +- Memory Lifecycle +- Invocation Boundary +- Router Decision Matrix +- Security Model +- Error Doctrine + +Any change to timeout policy or cross-session behavior requires an ADR. \ No newline at end of file diff --git a/docs/adr/adr-0000-product-constitution.md b/docs/adr/adr-0000-product-constitution.md new file mode 100644 index 0000000..7f0ecf6 --- /dev/null +++ b/docs/adr/adr-0000-product-constitution.md @@ -0,0 +1,191 @@ +# ADR-0000 — Human Sovereignty Principle + +## Status +Accepted + +## Context + +AI-driven systems face inherent architectural risk: the gradual drift toward autonomous behavior. 
As capabilities expand, systems may accumulate implicit decision-making authority, opaque routing logic, and background execution patterns that erode human control. + +This risk is compounded when: +- Model capabilities exceed system governance structures +- Optimization incentives favor automation over transparency +- Architectural boundaries become ambiguous under iteration pressure +- Business requirements prioritize velocity over accountability + +Without a foundational constitutional constraint, subsequent architectural decisions may inadvertently permit: +- Autonomous agent loops +- Hidden model switching +- Opaque execution chains +- Silent state mutations +- Execution authority escalation + +This ADR exists to prevent such drift by establishing an immutable governing principle: **Human Sovereignty**. + +All design decisions, implementation patterns, and operational behaviors must derive from and comply with this principle. + +## Decision + +### Foundational Declaration + +**Noesis Noema is an intelligence layer in which human noesis governs and AI accompanies. The question originates from the human. AI runs alongside — never ahead, never above. Noema emerges under human sovereignty.** + +### System Constraints + +This declaration is enforced through the following architectural constraints: + +#### 1. Explicit Invocation Only +- The system executes only upon explicit human invocation. +- No background tasks, autonomous loops, or self-triggered processes are permitted. +- Every execution trace must originate from a documented human request. + +#### 2. Routing Authority +- All routing decisions remain under human control. +- AI may propose routing options but must not finalize routing without human approval. +- Routing logic must be inspectable and auditable at all times. + +#### 3. Execution Transparency +- No hidden execution steps. +- All model invocations, tool calls, and data retrievals must be logged and visible. 
+- The execution path must be reconstructable from audit logs. + +#### 4. Human Override Authority +- The human operator retains absolute authority to halt, redirect, or override any execution. +- No system optimization may override human directive. +- System behavior must degrade gracefully when human intervention occurs. + +#### 5. State Mutation Consent +- No persistent state may be mutated without explicit human consent. +- Temporary execution state is permissible only within the scope of a single invocation. +- Knowledge layer updates require documented approval. + +#### 6. Model Neutrality +- No implicit model selection or model switching. +- Model choice must be explicit in the invocation request or routing decision. +- The system must not autonomously upgrade or replace models. + +#### 7. Transparency Over Optimization +- When optimization conflicts with transparency, transparency wins. +- Performance improvements must not obscure execution logic. +- Latency is acceptable; opacity is not. + +## Design Constraints + +### Architectural Enforcement + +All system layers must enforce the following: + +**Client Layer (Noesis Noema)** +- Constructs all routing decisions. +- Maintains human-in-the-loop for policy changes. +- Provides full execution visibility to the operator. + +**Invocation Boundary** +- Validates that all requests contain explicit routing decisions. +- Rejects requests that imply autonomous authority. +- Logs all boundary crossings with full context. + +**Execution Layer (noema-agent)** +- Executes only the provided ExecutionContext. +- Does not construct routing decisions. +- Does not mutate constraints or escalate authority. + +**Knowledge Layer (RAGpack)** +- Returns knowledge references only. +- Contains no execution logic. +- Does not trigger actions or workflows. 
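
The layer responsibilities above might be enforced along these lines. This is only a sketch under stated assumptions: the ADR names `ExecutionContext` but does not define its fields, so `question_id`, `route`, and `model` are hypothetical, as are `BoundaryViolation` and `check_boundary`.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class ExecutionContext:
    """Hypothetical shape; frozen so the execution layer cannot mutate it."""
    question_id: str
    route: str   # "local" or "cloud", decided on the client
    model: str   # explicitly selected model


class BoundaryViolation(Exception):
    """Raised when a request lacks an explicit routing decision."""


def check_boundary(request: dict) -> ExecutionContext:
    """Accept only requests that carry an explicit route and model."""
    question_id = request.get("question_id")
    route = request.get("route")
    model = request.get("model")
    if not question_id:
        raise BoundaryViolation("question_id is required for traceability")
    if route not in ("local", "cloud"):
        raise BoundaryViolation("missing or invalid explicit route")
    if not model:
        raise BoundaryViolation("model must be explicitly selected")
    return ExecutionContext(question_id, route, model)
```

The frozen dataclass is one way to express "does not mutate constraints": any attempt by downstream code to reassign `route` or `model` raises `FrozenInstanceError` instead of silently changing the decision made by the client.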
+ +### Implementation Discipline + +All code changes must pass the following test: + +> "Can a mid-level engineer, unfamiliar with this system, trace the full execution path from a single invocation request and identify the human decision point?" + +If the answer is no, the change violates ADR-0000. + +## Anti-Patterns (Explicitly Forbidden) + +The following patterns are explicitly forbidden and must be rejected during design review: + +### 1. Autonomous Agent Loops +- No self-prompting agents. +- No recursive execution without explicit per-step approval. +- No background reasoning processes. + +### 2. Silent Model Switching +- No runtime model replacement without human notification. +- No fallback logic that changes model providers silently. +- No A/B testing of models without disclosure. + +### 3. Opaque Routing Logic +- No black-box routing algorithms. +- No ML-based routing without explainability. +- No hidden policy engines that override user intent. + +### 4. Memory Injection Without Disclosure +- No implicit context injection from previous sessions. +- No hidden conversation memory. +- No undisclosed retrieval augmentation. + +### 5. Auto-Execution Chains +- No multi-step workflows triggered by a single invocation without stepwise approval. +- No background task scheduling. +- No deferred execution without explicit human queue management. + +### 6. Authority Escalation +- The execution layer must never gain routing authority. +- The knowledge layer must never gain execution authority. +- No component may autonomously expand its responsibility scope. + +## Consequences + +### Positive +- **Governance**: The system remains governable over time. +- **Trust**: Operators trust system behavior because it is auditable. +- **Compliance**: Easier to satisfy regulatory and ethical review. +- **Stability**: Architectural boundaries remain clear under iteration pressure. + +### Negative +- **Velocity**: Feature development is slower due to explicitness requirements. 
+- **Complexity**: Human-in-the-loop patterns require more UI and interaction design. +- **Performance**: Transparency mechanisms add latency and logging overhead. + +### Trade-offs Accepted +- We accept slower execution in favor of inspectability. +- We accept higher architectural discipline in favor of long-term governability. +- We accept reduced automation in favor of human sovereignty. + +## Governance Rule + +**This ADR governs all subsequent ADRs.** + +Any future architectural decision must comply with ADR-0000. If a proposed ADR conflicts with the Human Sovereignty Principle, it must either: +1. Be rejected, or +2. Explicitly supersede ADR-0000 through a formal constitutional amendment process. + +No ADR may silently weaken or bypass ADR-0000. + +### Amendment Process + +ADR-0000 may only be amended through: +1. Explicit acknowledgment that a constitutional change is proposed. +2. Documentation of risks introduced by the amendment. +3. Approval by system governance authority (human decision-maker). +4. Creation of a new ADR (e.g., ADR-0000-v2) with full traceability. + +Implicit amendments are void. 
+ +## References + +- ADR-0004: Architecture Constitution +- ADR-0005: Client-side Routing as a First-Class Architectural Principle +- ADR-0006: Contract Lock +- docs/contracts/authority-model.md +- docs/contracts/invocation-boundary.md + +--- + +**Version**: 1.0.0 +**Last Updated**: 2026-02-12 +**Change Policy**: Constitutional amendment only + diff --git a/docs/uat/UAT.md b/docs/uat/UAT.md index af6d030..553ee2c 100644 --- a/docs/uat/UAT.md +++ b/docs/uat/UAT.md @@ -116,7 +116,7 @@ Before any production release, a human validator must complete the following che #### 4.1.1 Routing Decisions -- [ ] **I have manually tested routing decisions for at least 5 representative scenarios** +- [ ] **I have manually tested deterministic routing decisions for at least 5 representative scenarios** - Scenarios must include: local-only, cloud-only, hybrid, privacy-sensitive, cost-sensitive - [ ] **I have verified that routing respects user privacy preferences** @@ -201,8 +201,8 @@ Describe constraint scenarios tested and enforcement outcomes. - [ ] **I have verified that execution does not autonomously retry failed requests** - Test: Trigger execution failure, verify no automatic retry without client instruction -- [ ] **I have verified that execution does not autonomously escalate or delegate tasks** - - Test: Submit complex task, verify no autonomous decomposition or sub-task creation +- [ ] **I have verified that execution does not autonomously escalate, decompose, or rewrite tasks** + - Test: Submit complex task, verify no autonomous sub-task creation, prompt rewriting, or recursive invocation - [ ] **I have verified that execution does not autonomously relax constraints** - Test: Submit request near constraint boundary, verify no creative reinterpretation @@ -382,6 +382,33 @@ List ADR numbers and titles reviewed. 
--- +### 4.6 Determinism & Observability Compliance + +- [ ] **I have verified that routing decisions are deterministic** + - Test: Repeat identical input under identical configuration, verify identical route + +- [ ] **I have verified that session timeout is enforced at exactly 45 minutes of inactivity** + - Test: Allow session to expire, verify session_id rejection and memory purge + +- [ ] **I have verified that no autonomous background execution occurs** + - Test: Observe system without user interaction, verify no hidden invocation + +- [ ] **I have verified that structured error schema is respected** + - Test: Trigger errors and confirm JSON structure matches error-handling.md contract + +- [ ] **I have verified that trace_id propagates end-to-end** + - Test: Confirm same trace_id appears in client logs, server logs, and error responses + +- [ ] **I have verified production log redaction policy** + - Test: Confirm raw prompt is not logged in production mode + +**Validator Notes** (required): +``` +Describe determinism and observability validation performed. +``` + +--- + ## 5. UAT Sign-Off ### 5.1 Validator Declaration