diff --git a/docs/uat/X5_Evidence_Attachment_UAT_Runbook.md b/docs/uat/X5_Evidence_Attachment_UAT_Runbook.md index 5b5bd0e..6712612 100644 --- a/docs/uat/X5_Evidence_Attachment_UAT_Runbook.md +++ b/docs/uat/X5_Evidence_Attachment_UAT_Runbook.md @@ -1,9 +1,9 @@ # X-5 Manual UAT Runbook -**Release:** X-4 — Evidence Attachment -**Target Repository:** rag-fish/RAGFish -**Status:** Operational -**Version:** 1.0 +**Release:** X-4 — Evidence Attachment +**Target Repository:** rag-fish/RAGFish +**Status:** Operational +**Version:** 2.1 --- @@ -13,32 +13,19 @@ This runbook defines the concrete execution procedure required to validate: **X-5 — Manual UAT** -This document ensures the following Definition of Done (DoD): +Definition of Done (DoD): -- UAT.md executed -- Results documented -- Human approval recorded - -This runbook operationalizes the validation requirements defined in: - -- docs/uat/UAT.md -- design/invocation-boundary.md -- design/error-doctrine.md -- design/execution-flow.md -- docs/adr/adr-0000-product-constitution.md +- UAT.md executed +- Results documented +- Human approval recorded --- ## 2. Preconditions -Before executing UAT: - -- X-4 (Evidence Attachment) is merged into `main` +- X-4 merged into `main` - All automated tests pass - No pending schema changes -- Local environment matches latest `main` - -Update repository: ```bash git checkout main @@ -49,7 +36,7 @@ git pull origin main ## 3. Environment Setup -Start the noema-agent service: +Start noema-agent: ```bash uvicorn app.main:app --reload --port 8000 @@ -58,182 +45,248 @@ uvicorn app.main:app --reload --port 8000 Health check: ```bash -curl http://localhost:8000/health +curl -s http://127.0.0.1:8000/health | jq ``` -Expected result: +Expected: -- HTTP 200 OK +- HTTP 200 +- Service status OK --- -## 4. Functional Validation — Evidence +# 4. Functional Validation — Invocation -### 4.1 Echo With Evidence +## 4.1 Normal Invocation ```bash -curl -X POST http://localhost:8000/execute \ +curl -s -X POST http://127.0.0.1:8000/invoke \ -H "Content-Type: application/json" \ -d '{ - "task": "echo", - "input": "hello", - "evidence": [ - { - "source_id": "doc-1", - "source_type": "note", - "location": "§1", - "snippet": "hello world", - "score": 0.98 - } - ] - }' -``` - -### Expected: - -- `evidence` field present -- `source_id` preserved -- `snippet` readable UTF-8 text -- No mutation of evidence content -- `trace_id` present -- `execution_time_ms` present + "session_id": "uat-001", + "task_type": "echo", + "payload": { + "text": "hello uat" + } + }' | jq +``` + +Verify: + +- HTTP 200 +- `request_id` present +- `executed_at` present +- `result` present --- -### 4.2 Echo Without Evidence +## 4.2 Evidence Attachment Verification ```bash -curl -X POST http://localhost:8000/execute \ +curl -s -X POST http://127.0.0.1:8000/invoke \ -H "Content-Type: application/json" \ - -d '{"task": "echo", "input": "hello"}' + -d '{ + "session_id": "uat-002", + "task_type": "echo", + "payload": { + "text": "evidence test" + } + }' | jq ``` -Expected: +Verify: -- `evidence: []` -- No server error +- `evidence` field exists (or explicitly `[]` if no evidence) +- Evidence entries are human-readable +- No corruption of content --- -## 5. Error Handling Validation +# 5. Error Handling Validation -### 5.1 Unsupported Task +## 5.1 Unsupported Task ```bash -curl -X POST http://localhost:8000/execute \ +curl -s -X POST http://127.0.0.1:8000/invoke \ -H "Content-Type: application/json" \ - -d '{"task": "unknown"}' + -d '{ + "session_id": "uat-003", + "task_type": "non_existing_task", + "payload": {} + }' | jq ``` -Expected: +Verify: - Structured error JSON - `error.code` present -- `trace_id` present +- `request_id` present - No stacktrace leakage --- -### 5.2 Malformed Evidence +## 5.2 Malformed Payload -Remove required field (e.g., `source_id`) from evidence payload. +```bash +curl -s -X POST http://127.0.0.1:8000/invoke \ + -H "Content-Type: application/json" \ + -d '{ + "session_id": "uat-004", + "task_type": "echo" + }' | jq +``` -Expected: +Verify: - Validation error returned - No crash -- Proper error schema --- -## 6. Determinism Validation +# 6. Determinism Validation -Repeat identical request twice. +```bash +curl -s -X POST http://127.0.0.1:8000/invoke \ + -H "Content-Type: application/json" \ + -d '{ + "session_id": "uat-005", + "task_type": "echo", + "payload": { + "text": "determinism" + } + }' > r1.json +``` -Verify: +```bash +curl -s -X POST http://127.0.0.1:8000/invoke \ + -H "Content-Type: application/json" \ + -d '{ + "session_id": "uat-006", + "task_type": "echo", + "payload": { + "text": "determinism" + } + }' > r2.json +``` -- Logical output identical -- Different `trace_id` -- No randomness -- No hidden retries +Compare: + +```bash +jq 'del(.request_id, .executed_at)' r1.json > c1.json +jq 'del(.request_id, .executed_at)' r2.json > c2.json +diff c1.json c2.json +``` + +Expected: + +- No diff --- -## 7. Autonomous Behavior Prevention +# 7. Observability Validation -Confirm: +Check logs for lifecycle events: -- No automatic subtask creation -- No retries -- No hidden routing logic -- No goal reinterpretation +```bash +grep invocation_started -R . +grep invocation_executed -R . +grep invocation_completed -R . +``` + +Verify: + +- Lifecycle events recorded +- No raw prompt leakage --- -## 8. Observability Validation +# 8. Session Isolation -Confirm: +```bash +curl -s -X POST http://127.0.0.1:8000/invoke \ + -H "Content-Type: application/json" \ + -d '{ + "session_id": "session-A", + "task_type": "echo", + "payload": { + "text": "A" + } + }' | jq +``` -- `trace_id` appears in: - - API response - - Server logs -- `execution_time_ms` logged -- No raw prompt leakage -- Evidence not excessively logged (only metadata if applicable) +```bash +curl -s -X POST http://127.0.0.1:8000/invoke \ + -H "Content-Type: application/json" \ + -d '{ + "session_id": "session-B", + "task_type": "echo", + "payload": { + "text": "B" + } + }' | jq +``` + +Verify: + +- No state crossover --- -## 9. Constitutional Compliance +# 9. Architectural Guardrails + +Confirm absence of forbidden patterns: + +```bash +grep -R "router" . +grep -R "background" . +grep -R "retry" . +``` -Confirm: +Verify: -- Execution layer does not make routing decisions -- Evidence treated as passive data -- No session persistence introduced -- No background tasks +- No autonomous routing +- No hidden background execution +- No automatic retries --- -## 10. Result Documentation (Append to UAT.md) +# 10. Result Documentation + +Append results to: + +`docs/uat/UAT.md` -After execution, append the following to `docs/uat/UAT.md`: +Use: ``` ## X-5 Manual UAT Execution Log -Release: X-4 Evidence Attachment -Date: YYYY-MM-DD -Validator: [Name] -Role: Product Owner +Release: X-4 Evidence Attachment +Date: YYYY-MM-DD +Validator: [Name] -### Execution Results - -- Echo with evidence: PASS / FAIL -- Echo without evidence: PASS / FAIL -- Unsupported task: PASS / FAIL -- Malformed evidence: PASS / FAIL +- Invocation success: PASS / FAIL +- Evidence attachment: PASS / FAIL +- Error handling: PASS / FAIL - Determinism: PASS / FAIL -- Constitutional compliance: PASS / FAIL - Observability: PASS / FAIL +- Session isolation: PASS / FAIL -Blockers Found: Yes / No -Recommendation: APPROVE / BLOCK - -Signature: [Name] -Timestamp: [ISO 8601] +Recommendation: APPROVE / BLOCK +Signature: +Timestamp: ``` --- -## 11. Completion Criteria +# 11. Completion Criteria -X-5 is complete when: +X-5 complete when: -- All above tests executed -- Results recorded in `UAT.md` -- Human approval explicitly signed -- No constitutional violations observed +- All scenarios executed +- Results documented +- Human approval recorded --- -**End of Runbook** +**End of Runbook** \ No newline at end of file