Merged
6 changes: 3 additions & 3 deletions .ralph/prd.json
@@ -502,7 +502,7 @@
"Document how to use Cursor Planning Mode to generate feature backlog",
"Add prompt template for asking Planning Mode to output Ralph-compatible format",
"Create script or helper to validate/convert planning output to prd.json schema",
- "Add examples showing Planning Mode output → Ralph PRD translation",
+ "Add examples showing Planning Mode output \u2192 Ralph PRD translation",
"Document best practices: Planning Mode for architecture, Ralph for execution",
"Add section to README explaining the Planning + Ralph workflow",
"Include tips on granularity, dependencies, and complexity estimation",
@@ -880,8 +880,8 @@
],
"estimated_complexity": "large",
"depends_on": [],
- "passes": false,
- "iterations_taken": 0,
+ "passes": true,
+ "iterations_taken": 1,
"blocked_reason": null,
"test_files": [
"tests/ralph-qa-agent-loop.bats"
54 changes: 54 additions & 0 deletions .ralph/progress.txt
@@ -1492,3 +1492,57 @@ Notes for next iteration:
- Next available high-priority features: 001 (auto-PR), 018 (spike mode)
- Next available features: 001, 002, 005, 006, 016, 018
---

--- 2026-03-27 (Feature 031) ---
Feature: [031] Add optional QA agent second invocation in loop, enabled by default
Status: Completed
Type: feature
Complexity: large

Implementation:
- Added ENABLE_QA_AGENT config var (default: true) to ralph.sh
- Added QA_AGENT_PROMPT_FILE config var (default: QA_AGENT_PROMPT.md)
- Added QA_KNOWLEDGE_FILE config var (default: .ralph/qa-knowledge.md)
- Added get_passed_feature_ids() function - snapshots passing feature IDs before agent runs
- Added find_newly_passed_feature() function - detects which feature was newly marked passing
- Added execute_qa_agent() function - invokes AI agent with QA prompt (supports claude/cursor/custom/manual modes)
- Added run_qa_agent() function - builds temp prompt combining QA instructions + feature context + qa-knowledge, then calls execute_qa_agent
- Modified run_single_iteration() to snapshot passed IDs before agent, invoke QA agent after successful verification
- Modified run_continuous_loop() with same QA agent integration
- Added qa-knowledge.md initialization in check_prerequisites when ENABLE_QA_AGENT=true
- Added ENABLE_QA_AGENT to doctor configuration output and --help environment variables
- Created QA_AGENT_PROMPT.md with full QA agent contract:
  * Read-only constraints (no source files, only PRD feature spec and qa-knowledge.md)
  * Step 1: Write manual E2E test script before evaluation (required)
  * Step 2: Execute manual script against running software
  * Step 3: Evaluate PASS or FAIL
  * Step 4A (PASS): append structured entry to qa-knowledge.md
  * Step 4B (FAIL): create symptom-only bug ticket in prd.json
  * Step 5: Always append to qa-knowledge.md (even on fail)
- Updated README.md with QA Agent Loop section documenting behavior, benefits, and configuration
- Updated QUICK_REFERENCE.md with QA agent configuration options
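
The snapshot-and-compare step behind get_passed_feature_ids() and find_newly_passed_feature() can be sketched roughly as follows. This is an illustration only, not the code added to ralph.sh: it assumes the passing IDs are already available as newline-separated lists, and the function body and argument convention shown here are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: report the first feature ID that is passing now but
# was not passing in the snapshot taken before the developer agent ran.
# Both arguments are newline-separated ID lists (an assumed convention).
find_newly_passed_feature() {
  local before="$1" after="$2"
  # comm -13 prints only lines unique to the second input; inputs must be sorted.
  comm -13 <(printf '%s\n' "$before" | sort) \
           <(printf '%s\n' "$after" | sort) | head -n 1
}

before=$'001\n017'
after=$'001\n017\n031'
find_newly_passed_feature "$before" "$after"   # prints 031
```

An empty result means no feature changed from failing to passing during the iteration.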

Key Files Modified:
- ralph.sh: Added ~130 lines for QA agent support
- QA_AGENT_PROMPT.md: Created new file (QA agent instructions)
- tests/ralph-qa-agent-loop.bats: Created new test file (32 tests, all pass)
- README.md: Added QA Agent Loop section
- QUICK_REFERENCE.md: Added QA agent config options
- .ralph/prd.json: Marked feature 031 as complete with iterations_taken=1

Testing:
- ✅ All 307 tests pass (32 new tests in ralph-qa-agent-loop.bats)
- ✅ No new test failures introduced
- ✅ bash -n ralph.sh: valid syntax
- ✅ Tests cover: config defaults, all new functions, QA prompt file existence, prompt content contract, doctor output, temp file handling, ENABLE_QA_AGENT=false guard

Challenges:
- The test for execute_qa_agent manual mode needed -A 50 instead of -A 30 because the function is longer than 30 lines

Notes for next iteration:
- Feature 031 complete - QA agent loop is now available
- Feature 032 (seeded test instance lifecycle) depends on 031 and is now unblocked
- Feature 033 (design-review agent) has no dependencies and is high priority
- Next critical feature with deps met: none (no critical features remain after 031)
- Next high priority with deps met: 033 (design-review agent invocation)
---
170 changes: 170 additions & 0 deletions QA_AGENT_PROMPT.md
@@ -0,0 +1,170 @@
# QA Agent Prompt - Ralph Wiggum Technique

You are a QA agent performing a user-perspective evaluation of a recently implemented feature.
Your role is strictly to evaluate whether the feature works correctly **from a user's point of view**,
not to inspect or understand source code.

## Your Identity and Constraints

**You are a user, not a developer.**

### WHAT YOU MAY READ:
- The feature specification provided at the bottom of this prompt (from the PRD)
- The QA Knowledge Base provided at the bottom of this prompt (`.ralph/qa-knowledge.md`)
- The PRD file (`.ralph/prd.json`) — only to update feature status or add bug tickets

### WHAT YOU MUST NEVER DO:
- Read source code files (no `cat src/...`, no reading `.ts`, `.js`, `.py`, `.go`, etc.)
- Read test files
- Inspect configuration files used by developers
- Make any code changes
- Root-cause-analyze failures (describe only observable user behavior)

## Your Task: Step-by-Step

### Step 1: Write Your Manual E2E Test Script (REQUIRED FIRST)

Before doing any evaluation, you MUST write a numbered, ordered sequence of manual steps that a
human user would follow to verify this feature works. Derive these steps ONLY from the feature
specification provided below.

Format your test script as a numbered list:
```
Manual E2E Test Script for Feature [ID]: [Description]

1. [Action the user takes]
2. [What the user observes]
3. [Next action]
4. [Expected outcome]
... (continue for all key behaviors)
```

Write this script before doing anything else. This is your test plan.

### Step 2: Execute Your Manual Test Script

Follow your test script step by step, interacting with the running software exactly as a user would:
- Start the application if needed (using `npm start`, `python app.py`, or whatever the project uses)
- Perform each step in sequence
- Observe what actually happens vs. what you expected
- Note any discrepancies

### Step 3: Evaluate Results

After executing all steps, determine: **PASS** or **FAIL**.

**PASS criteria:** All steps in your manual test script produced the expected outcome.

**FAIL criteria:** One or more steps produced unexpected behavior, errors, or missing functionality.

---

## Step 4A: If PASS — Update PRD and QA Knowledge

### Update prd.json

The feature's `passes` field should already be `true` (set by the developer agent).
No change needed to `passes`.

### Append to `.ralph/qa-knowledge.md`

Add a structured entry to the QA knowledge base. This builds institutional memory across sessions.

**Format:**
```markdown
---
## Feature [ID]: [Short Description]
**Date:** [today's date]
**Result:** PASS

### What Was Tested
[Brief description of what the manual test script covered]

### Patterns Noticed
[Any patterns in how this feature type should be tested, edge cases found, behaviors to watch for]

### Test Coverage Notes
[What areas were tested, what was not tested and why]
---
```

Append this entry to the END of `.ralph/qa-knowledge.md`.

---

## Step 4B: If FAIL — Create Bug Ticket in PRD

**CRITICAL: Describe only user-observable behavior. No root-cause speculation.**

Edit `.ralph/prd.json` and add a new feature entry to the `features` array:

```json
{
"id": "[parent-feature-id]-qa-bug-[timestamp]",
"type": "bug",
"category": "qa",
"priority": "high",
"description": "[What the user observes going wrong - symptom only, no cause]",
"steps": [
"[Step 1: The action the user takes that triggers the problem]",
"[Step 2: What the user observes happening]",
"[Step 3: What the user expected to happen instead]"
],
"estimated_complexity": "small",
"depends_on": ["[parent-feature-id]"],
"passes": false,
"iterations_taken": 0,
"blocked_reason": null
}
```

**Rules for bug description:**
- Write what a user OBSERVES, not what you think caused it
- BAD: "The authentication middleware is not checking the JWT expiry field"
- GOOD: "Logging in with an expired token shows a blank page instead of an error message"
- BAD: "The database query is missing a WHERE clause"
- GOOD: "Searching for a user by email returns all users instead of just the matching one"

**Do NOT mark the original feature as failing.** Leave `passes: true` on the original feature.
The bug ticket is a NEW follow-up work item.

---

## Step 5: Append to QA Knowledge (even on FAIL)

Even when QA fails, append an entry to `.ralph/qa-knowledge.md` documenting what you observed.

**Format for FAIL:**
```markdown
---
## Feature [ID]: [Short Description]
**Date:** [today's date]
**Result:** FAIL

### What Was Tested
[Brief description of what the manual test script covered]

### Issue Observed (User Perspective)
[What the user saw that was wrong — symptom only]

### Bug Ticket Created
[ID of the bug ticket added to prd.json]

### Patterns Noticed
[Any patterns to watch for in future QA of similar features]
---
```

---

## Important Reminders

- Your test script comes FIRST — write it before touching anything else
- You only interact with the running software as a user would
- You never read source code
- Bug descriptions are symptoms, not root causes
- The QA knowledge base is your institutional memory — read it before testing to leverage past learnings

---

*The feature specification and QA Knowledge Base follow below, appended by Ralph.*
9 changes: 9 additions & 0 deletions QUICK_REFERENCE.md
@@ -357,6 +357,15 @@ LOG_FILE=".ralph/ralph.log"

# Progress header (default: true)
SHOW_PROGRESS_HEADER=true

# QA agent second pass after developer commit (default: true)
ENABLE_QA_AGENT=true
# To disable the QA agent and keep single-agent behavior, use this instead:
# ENABLE_QA_AGENT=false
# Custom QA prompt file (default: QA_AGENT_PROMPT.md)
QA_AGENT_PROMPT_FILE="QA_AGENT_PROMPT.md"
# Custom QA knowledge file (default: .ralph/qa-knowledge.md)
QA_KNOWLEDGE_FILE=".ralph/qa-knowledge.md"
```

### Advanced Options
45 changes: 45 additions & 0 deletions README.md
@@ -1505,6 +1505,51 @@ Failure learning is automatically enabled when `ROLLBACK_ON_FAILURE=true` (the d
3. **Iteration 2**: Agent reads `ROLLBACK` entry, sees linting errors, fixes them, commits successfully
4. Feature is now complete with proper quality

### QA Agent Loop (Feature 031)

After the developer agent successfully commits and passes quality gates, Ralph can invoke a second **QA agent** that evaluates the feature from a pure user perspective—without reading source code.

**How It Works:**

1. **Developer agent** implements the feature and commits
2. Ralph runs quality gates (linting, tests, etc.)
3. If quality gates pass, Ralph invokes the **QA agent** with `QA_AGENT_PROMPT.md`
4. The QA agent:
   - Reads only the PRD feature spec and `.ralph/qa-knowledge.md` (no source files)
   - Writes a manual E2E test script based on the feature spec
   - Executes the test script against the running software
   - On **PASS**: appends a structured memory entry to `.ralph/qa-knowledge.md`
   - On **FAIL**: creates a new symptom-only bug ticket in `.ralph/prd.json` and still appends an entry to `.ralph/qa-knowledge.md`
5. Bug tickets from QA failures become work items for the next developer iteration
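
The decision of whether to run the QA pass after the quality gates can be pictured as a small guard. This is a sketch under stated assumptions: `should_run_qa` and its arguments are hypothetical names, not functions from `ralph.sh`, and skipping when no feature was newly marked passing is assumed behavior. Only the `ENABLE_QA_AGENT` variable and its default of `true` come from this README.

```bash
#!/usr/bin/env bash
# Hypothetical guard: succeed (exit 0) only when the QA agent should run.
#   $1 - exit status of the quality gates (0 means they passed)
#   $2 - ID of the feature the developer agent just marked passing (may be empty)
should_run_qa() {
  local gates_status="$1" feature_id="$2"
  [ "${ENABLE_QA_AGENT:-true}" = "true" ] || return 1  # QA pass switched off
  [ "$gates_status" -eq 0 ] || return 1                # quality gates failed
  [ -n "$feature_id" ] || return 1                     # nothing new to evaluate (assumed)
  return 0
}

if should_run_qa 0 "031"; then
  echo "invoke QA agent for feature 031"
else
  echo "skip QA agent"
fi
```

Keeping the three checks in one function would let the single-iteration and continuous-loop paths share the same rule.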

**Key Benefits:**

- ✅ **User perspective**: QA agent acts as a user, not a developer
- ✅ **Institutional memory**: `.ralph/qa-knowledge.md` builds knowledge across sessions
- ✅ **Symptom-only bugs**: No root-cause speculation in bug reports
- ✅ **Automatic**: Runs after every successful developer commit by default

**Configuration:**

```bash
# Enable or disable QA agent (default: true)
ENABLE_QA_AGENT=true ./ralph.sh

# Disable QA agent to preserve original single-agent behavior
ENABLE_QA_AGENT=false ./ralph.sh

# Use a custom QA prompt file
QA_AGENT_PROMPT_FILE=my-qa-prompt.md ./ralph.sh

# Use a custom QA knowledge file location
QA_KNOWLEDGE_FILE=.ralph/my-qa-knowledge.md ./ralph.sh
```
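
For readers wondering how a script can honor these overrides while still falling back to the documented defaults, the following is a minimal sketch. The variable names and default values are the ones listed above; the `resolve_qa_config` wrapper is a hypothetical name, and the real `ralph.sh` may resolve its configuration differently.

```bash
#!/usr/bin/env bash
# Sketch of default resolution (not copied from ralph.sh).
# "${VAR:=default}" assigns the default only when VAR is unset or empty,
# so values passed on the command line or exported by the caller win.
resolve_qa_config() {
  : "${ENABLE_QA_AGENT:=true}"
  : "${QA_AGENT_PROMPT_FILE:=QA_AGENT_PROMPT.md}"
  : "${QA_KNOWLEDGE_FILE:=.ralph/qa-knowledge.md}"
}

resolve_qa_config
printf 'ENABLE_QA_AGENT=%s\n' "$ENABLE_QA_AGENT"
printf 'QA_AGENT_PROMPT_FILE=%s\n' "$QA_AGENT_PROMPT_FILE"
printf 'QA_KNOWLEDGE_FILE=%s\n' "$QA_KNOWLEDGE_FILE"
```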

**Files:**

- `QA_AGENT_PROMPT.md` — Instructions for the QA agent (like `AGENT_PROMPT.md` for developers)
- `.ralph/qa-knowledge.md` — Institutional QA memory, auto-initialized on first run
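
How the two files might be combined into the prompt handed to the QA agent is sketched below. It is illustrative only: `build_qa_prompt` is a hypothetical name, and the section headings and the initial contents of the knowledge file are assumptions rather than what `ralph.sh` actually writes.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of prompt assembly (not the ralph.sh implementation).
#   $1 - QA prompt file, $2 - QA knowledge file, $3 - feature spec text
# Prints the path of a temporary file holding the combined prompt.
build_qa_prompt() {
  local prompt_file="$1" knowledge_file="$2" feature_spec="$3" tmp
  # Create the knowledge file on first use so it can always be read and appended to.
  if [ ! -f "$knowledge_file" ]; then
    mkdir -p "$(dirname "$knowledge_file")"
    printf '# QA Knowledge Base\n' > "$knowledge_file"
  fi
  tmp="$(mktemp)" || return 1
  {
    cat "$prompt_file"
    printf '\n## Feature Specification\n%s\n' "$feature_spec"
    printf '\n## QA Knowledge Base\n'
    cat "$knowledge_file"
  } > "$tmp"
  printf '%s\n' "$tmp"
}

# Demo in a throwaway directory so nothing in the working tree is touched.
workdir="$(mktemp -d)"
printf 'You are a QA agent.\n' > "$workdir/QA_AGENT_PROMPT.md"
combined="$(build_qa_prompt "$workdir/QA_AGENT_PROMPT.md" \
  "$workdir/.ralph/qa-knowledge.md" '{"id": "031"}')"
cat "$combined"
rm -rf "$workdir" "$combined"
```

Writing the combined prompt to a temporary file keeps `QA_AGENT_PROMPT.md` itself unchanged between iterations.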

### Combine Options

```bash