whiteboardmonk · Dec 3, 2025 · Oct 30, 2025
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,32 @@
+# Repository Guidelines
+
+## Project Structure & Modules
+- `src/agcluster/container/` – FastAPI backend: `api/` endpoints, `core/` orchestration (session, container, translation), `models/` Pydantic schemas, and `ui/` Next.js dashboard.
+- `tests/` – Pytest suites split into `unit/` and `integration/`; markers also cover `e2e` (Docker-backed).
+- `e2e/agcluster.spec.ts` – Root Playwright checks against a running stack.
+- `docker/`, `docker-compose.yml`, `configs/`, `.env.example` – Deployment assets; copy `.env.example` to `.env` before local runs.
+- `docs/`, `examples/` – Reference material and sample agent configurations.
+
+## Build, Test, and Development Commands
+- Backend setup: `pip install -e ".[dev]"` (Python 3.11+). Start stack: `docker compose up -d`; rebuild images with `docker compose build`.
+- API-only run (dev): `uvicorn agcluster.container.api.main:app --reload` after setting `ANTHROPIC_API_KEY` and other env vars.
+- Backend tests: `pytest tests/` (all), `pytest tests/unit`, `pytest tests/integration`, or `pytest --cov=agcluster.container tests/`.
+- UI workspace (`src/agcluster/container/ui`): `npm install`, then `npm run dev` for Next.js, `npm run build`/`npm start` for production, `npm test` for Vitest, `npm run test:e2e` for Playwright UI flows.
+- Root Playwright smoke test (requires a running stack): `npm test` from the repo root runs `playwright test`.
+
+## Coding Style & Naming Conventions
+- Python: Black and Ruff with 100-char lines; optional mypy (`mypy src/agcluster`). Prefer typed function signatures in new code.
+- Tests: filenames `test_*.py`; classes `Test*`; functions `test_*`. Use pytest markers `@pytest.mark.unit|integration|e2e` and keep fixtures in `tests/conftest.py`.
+- TypeScript/React: ESLint + TypeScript; follow existing component patterns in `src/agcluster/container/ui`. Use PascalCase for components, camelCase for hooks/utilities.
+- Tailwind is locked to v3; avoid `@apply` in new styles—prefer utility classes or scoped CSS modules.
+
+## Testing Guidelines
+- Add unit tests for new services/models; integration tests for API surface changes; Playwright for end-to-end UI/agent flows.
+- Aim to keep coverage gaps small (HTML report in `htmlcov/` from `pytest --cov`). Include edge cases around Docker/session lifecycle and file handling.
+- For UI changes, pair Vitest component coverage with Playwright scenarios that validate auth, upload/download, and agent launch flows.
+- Record any required Docker/ENV prerequisites in the test description to keep CI reproducible.
+
+## Commit & Pull Request Guidelines
+- Use conventional commits when possible (`feat:`, `fix:`, `chore:`, `docs:`); keep subjects imperative and under ~72 chars.
+- PRs should include: concise summary, linked issue/Linear ticket, test plan with commands run, and screenshots/GIFs for UI changes.
+- Keep commits scoped; avoid mixing backend and UI refactors unless tightly coupled. Update docs/config examples when behavior changes.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,22 @@ All notable changes to the AgCluster Container project will be documented in thi
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.3.2] - 2025-12-03
+
+### Added
+- GitHub Code Review preset now exposes MCP permissions in the UI builder (including permission modes, tool selection, and MCP server envs).
+
+### Changed
+- Respect `permission_mode` end-to-end (including container env) so MCP-enabled agents honor preset settings such as `bypassPermissions`.
+- Clarified MCP credential expectations for GitHub by standardizing on `GITHUB_PERSONAL_ACCESS_TOKEN` in configs and docs.
+- Docker/Fly providers and API client updated to carry full permission modes and MCP settings.
+
+### Fixed
+- Prevented stale configs in containers by mounting repository presets in Docker Compose.
+- Resolved MCP auth/approval loops for GitHub by propagating permissions correctly and passing PATs unchanged.
+
+---
+
 ## [0.3.0] - 2025-01-28
 
 ### Added

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -165,7 +165,7 @@ docker compose down
 
 ### Testing
 
-**Test Suite**: 133 tests, 100% passing, 83% coverage
+**Test Suite**: Run `pytest tests/` (unit, integration, e2e markers); target 80%+ coverage locally
 
 ```bash
 # Run all tests
@@ -351,11 +351,12 @@ SESSION_IDLE_TIMEOUT=1800         # 30 minutes idle timeout
 ### Agent Configuration Files
 
 Agent presets are stored in `configs/presets/` as YAML files. See "Agent Configuration System" section above for structure and details. The system includes:
-- ✅ 4 preset configurations (code-assistant, research-agent, data-analysis, fullstack-team)
+- ✅ 5 preset configurations (code-assistant, research-agent, data-analysis, github-code-review, fullstack-team)
 - ✅ Custom inline config support via `/api/agents/launch`
 - ✅ Full validation via Pydantic models
 - ✅ Multi-agent orchestration (sub-agents)
-- ✅ Per-agent tool specialization and resource limits
+- ✅ Per-agent tool specialization and resource limits (with MCP servers where configured)
+- ✅ Launch-time MCP credentials (`mcp_env`) must match keys declared under each server's `env`; reserved container env vars are blocked from override
 
 Documentation: `configs/README.md`
 
@@ -484,18 +485,18 @@ curl -X POST "http://localhost:8000/api/files/conv-abc123.../upload?overwrite=fa
 - ✅ FastAPI endpoints (`/`, `/health`, `/api/agents/*`, `/api/configs`, `/api/files`)
 - ✅ Claude-native chat API (`/api/agents/{session_id}/chat`)
 - ✅ Agent configuration endpoints (`/api/configs`, `/api/agents/launch`, `/api/agents/sessions`)
-- ✅ 4 preset agent configurations (code-assistant, research-agent, data-analysis, fullstack-team)
-- ✅ Custom inline configuration support
+- ✅ 5 preset agent configurations (adds `github-code-review` with MCP)
+- ✅ Custom inline config support
 - ✅ Multi-agent orchestration (fullstack-team with 3 sub-agents)
 - ✅ Config-based session management with persistent containers
 - ✅ Background cleanup task (30-minute idle timeout)
 - ✅ Claude SDK integration with configurable tools (Bash, Read, Write, Grep, Task, WebFetch, NotebookEdit, TodoWrite)
 - ✅ TodoWrite tool for all presets (task tracking)
 - ✅ NotebookEdit tool for data-analysis (Jupyter support)
+- ✅ MCP server support with launch-time credentials and auto-allowed MCP tools
 - ✅ Docker container isolation per conversation/session
 - ✅ Per-agent resource limits (CPU, memory, storage)
 - ✅ Namespace package structure for modularity
-- ✅ Comprehensive test suite (218 tests, 212 passing, 66% coverage)
 - ✅ Web UI with Next.js 15 + React + TypeScript
 - ✅ File operations API with security (browse, preview, download, **upload**)
 - ✅ File upload with multi-provider support (Docker + Fly.io)
@@ -514,7 +515,7 @@ curl -X POST "http://localhost:8000/api/files/conv-abc123.../upload?overwrite=fa
 **Tested and Verified**:
 - ✅ Multi-turn conversations with context preservation
 - ✅ Config-based agent launching and session management
-- ✅ All 4 preset configurations load and validate successfully
+- ✅ All presets load and validate successfully (including MCP-enabled configs)
 - ✅ Inline custom configuration support
 - ✅ Tool specialization per agent type
 - ✅ Resource limits enforcement
@@ -541,4 +542,5 @@ curl -X POST "http://localhost:8000/api/files/conv-abc123.../upload?overwrite=fa
 - Agent-to-agent communication enhancements
 - Conversation export and history persistence
 - Never add this to the commit message: 🤖 Generated with Claude Code                                                                                                     Co-Authored-By: Claude <noreply@anthropic.com>
-- Always run ruff check src/ tests/
+- Always run ruff check src/ tests/
+- Always run black --check src/ tests/
diff --git a/README.md b/README.md
@@ -61,12 +61,14 @@
 
 ### Agent Configuration System
 
-- **Preset Configurations** - 4 ready-to-use templates:
+- **Preset Configurations** - 5 ready-to-use templates:
   - `code-assistant` - Full-stack development
   - `research-agent` - Web research and analysis
   - `data-analysis` - Statistical analysis with Jupyter
+  - `github-code-review` - GitHub PR reviews with MCP integration
   - `fullstack-team` - Multi-agent orchestration with sub-agents
 - **Custom Configurations** - Define agents with specific tools and limits
+- **MCP Server Support** - Integrate external tools via Model Context Protocol (GitHub, filesystem, Postgres, etc.)
 - **Tool Specialization** - Configure which tools each agent can access
 - **Resource Management** - Per-agent CPU, memory, storage limits
 
@@ -207,7 +209,32 @@ Statistical analysis and data visualization
 - **Resources**: 2 CPUs, 6GB RAM, 15GB storage
 - **Use Cases**: Exploratory data analysis, statistical testing, Jupyter workflows
 
-#### 4. Full-Stack Team (`fullstack-team`)
+#### 4. GitHub Code Review (`github-code-review`)
+
+GitHub PR review agent with MCP integration
+
+- **Tools**: Read, Write, Grep, TodoWrite (+ auto-added MCP tools)
+- **MCP Servers**: GitHub MCP server for PR/issue operations
+- **Permissions**: `permission_mode: bypassPermissions` to avoid repeated approval prompts for MCP calls
+- **Resources**: 1 CPU, 2GB RAM, 5GB storage
+- **Use Cases**: Automated PR reviews, security scanning, code quality checks
+
+**Launch with GitHub token:**
+```bash
+curl -X POST http://localhost:8000/api/agents/launch \
+  -H "Content-Type: application/json" \
+  -d '{
+    "api_key": "sk-ant-...",
+    "config_id": "github-code-review",
+    "mcp_env": {
+      "github": {
+        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."
+      }
+    }
+  }'
+```
+
+#### 5. Full-Stack Team (`fullstack-team`)
 
 Multi-agent orchestrator with specialized sub-agents
 
@@ -390,6 +417,21 @@ Launch a new agent from configuration.
 }
 ```
 
+**With MCP credentials:**
+```json
+{
+  "api_key": "sk-ant-...",
+  "config_id": "github-code-review",
+  "mcp_env": {
+    "github": {
+      "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."
+    }
+  }
+}
+```
+
+**MCP credential rules and permissions:** keys in `mcp_env` must match those declared under each server’s `env` in the config (e.g., `GITHUB_PERSONAL_ACCESS_TOKEN`). Core container env vars (such as `ANTHROPIC_API_KEY`, `AGENT_CONFIG_JSON`, `AGENT_ID`) cannot be overridden. When `mcp_servers` are present, agents auto-enable `ListMcpResources`, `ReadMcpResource`, and `mcp__{server}__*` tool permissions, and use Claude SDK permission mode `acceptEdits` by default. If you require stricter gating, set `permission_mode: plan` or `default` in the config.
+
 **Response:**
 ```json
 {
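Note on the `mcp_env` rules added in the README hunk above: the two checks it describes (keys must match the server's declared `env`, and core container variables cannot be overridden) can be sketched as below. This is a minimal illustration, not the project's actual validator. The function name `validate_mcp_env` and the error strings are hypothetical, and the reserved set contains only the three variables the README names as examples; the real list may be longer.

```python
from typing import Dict, List

# Assumption: only the three variables the README cites as examples.
RESERVED_ENV_VARS = frozenset({"ANTHROPIC_API_KEY", "AGENT_CONFIG_JSON", "AGENT_ID"})


def validate_mcp_env(
    mcp_servers: Dict[str, dict],
    mcp_env: Dict[str, Dict[str, str]],
) -> List[str]:
    """Return a list of problems found in mcp_env; an empty list means it is valid."""
    errors: List[str] = []
    for server, overrides in mcp_env.items():
        if server not in mcp_servers:
            errors.append(f"unknown MCP server: {server}")
            continue
        # Keys the config declares under this server's `env` block.
        declared = set(mcp_servers[server].get("env", {}))
        for key in overrides:
            if key in RESERVED_ENV_VARS:
                errors.append(f"{server}.{key}: reserved container variable")
            elif key not in declared:
                errors.append(f"{server}.{key}: not declared under the server's env")
    return errors


if __name__ == "__main__":
    servers = {
        "github": {
            "command": "npx",
            "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_PERSONAL_ACCESS_TOKEN}"},
        }
    }
    print(validate_mcp_env(servers, {"github": {"GITHUB_TOKEN": "ghp_..."}}))
```

Returning a list of messages rather than raising on the first failure lets a launch endpoint report every bad key in one response.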

diff --git a/configs/README.md b/configs/README.md
@@ -47,7 +47,69 @@ This directory contains pre-configured agent templates that can be used to launc
 
 ---
 
-### 3. Full-Stack Team (`fullstack-team.yaml`)
+### 3. Data Analysis Agent (`data-analysis.yaml`)
+
+**Purpose:** Statistical analysis and data visualization
+
+**Features:**
+- Pandas, numpy, scipy, matplotlib support
+- NotebookEdit for Jupyter-style workflows
+- Bash for data processing scripts
+- Optimized for exploratory data analysis
+
+**Best For:**
+- Statistical testing
+- Data cleaning and transformation
+- Visualization and plotting
+- ML model evaluation
+- Interactive data debugging
+
+**Resource Allocation:** 2 CPUs, 6GB RAM
+
+---
+
+### 4. GitHub Code Review (`github-code-review.yaml`)
+
+**Purpose:** Automated GitHub pull request reviews
+
+**Features:**
+- **MCP Integration**: Uses GitHub MCP server for API access
+- Systematic code review with focus on:
+  - Code quality and maintainability
+  - Security vulnerabilities
+  - Performance optimizations
+  - Testing coverage
+  - Architecture patterns
+- Auto-enabled MCP tools (ListMcpResources, ReadMcpResource)
+- Runtime credential injection via `mcp_env`
+
+**Best For:**
+- Automated PR reviews
+- Security audits
+- Code quality checks
+- Architecture analysis
+- Best practices enforcement
+
+**Resource Allocation:** 1 CPU, 2GB RAM
+
+**Usage with Credentials:**
+```bash
+curl -X POST http://localhost:8000/api/agents/launch \
+  -H "Content-Type: application/json" \
+  -d '{
+    "api_key": "sk-ant-...",
+    "config_id": "github-code-review",
+    "mcp_env": {
+      "github": {
+        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."
+      }
+    }
+  }'
+```
+
+---
+
+### 5. Full-Stack Team (`fullstack-team.yaml`)
 
 **Purpose:** Multi-agent orchestration for complex projects
 
@@ -152,15 +214,52 @@ agents:
 
 ### MCP Server Integration
 
+AgCluster supports the Model Context Protocol (MCP) for integrating external tools and services. MCP servers are defined in configuration files and credentials are provided at launch time.
+
+**Configuration Structure:**
 ```yaml
 mcp_servers:
-  github:
-    command: npx
-    args: ["-y", "@modelcontextprotocol/server-github"]
+  github:                                          # Server name
+    command: npx                                   # Executable
+    args: ["-y", "@modelcontextprotocol/server-github"]  # Arguments
     env:
-      GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_TOKEN}
+      GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_PERSONAL_ACCESS_TOKEN}  # Placeholder
 ```
 
+**Runtime Credentials:**
+
+Provide actual credentials via the `mcp_env` parameter when launching agents:
+
+```bash
+curl -X POST http://localhost:8000/api/agents/launch \
+  -H "Content-Type: application/json" \
+  -d '{
+    "api_key": "sk-ant-...",
+    "config_id": "github-code-review",
+    "mcp_env": {
+      "github": {
+        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_token_here"
+      }
+    }
+  }'
+```
+
+**Key Features:**
+- **Auto-Allow MCP Tools**: When `mcp_servers` are configured, `ListMcpResources`, `ReadMcpResource`, and the `mcp__{server}__*` tool permissions are enabled automatically, so they do not need to be listed in `allowed_tools`
+- **Environment Variable Merging**: Runtime `mcp_env` overrides config-defined placeholders
+- **Multi-Provider Support**: MCP works with Docker, Fly.io, and other providers
+- **Tool Discovery**: Agents can list and read available MCP resources
+- **Runtime Credential Validation**: `mcp_env` keys must match the `env` keys defined per server; core container env vars cannot be overridden
+
+**Preset MCP Example:**
+- `github-code-review.yaml` — GitHub PR reviews using the GitHub MCP server with launch-time personal access token
+
+**Available MCP Servers:**
+- `@modelcontextprotocol/server-github` - GitHub API integration
+- `@modelcontextprotocol/server-filesystem` - File system access
+- `@modelcontextprotocol/server-postgres` - PostgreSQL integration
+- Custom MCP servers (see [MCP docs](https://modelcontextprotocol.io))
+
 ---
 
 ## Resource Limits
@@ -202,13 +301,25 @@ permission_mode: acceptEdits
 ```yaml
 id: github-bot
 name: GitHub Bot
-allowed_tools: ["mcp__github__create_issue", "mcp__github__list_prs"]
+# Note: ListMcpResources and ReadMcpResource are auto-added
+# No need to explicitly list mcp__* tools
+allowed_tools: ["Read", "Write"]
 mcp_servers:
   github:
     command: npx
     args: ["-y", "@modelcontextprotocol/server-github"]
     env:
-      GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_TOKEN}
+      GITHUB_PERSONAL_ACCESS_TOKEN: ${GITHUB_PERSONAL_ACCESS_TOKEN}  # Placeholder
+```
+
+Launch with actual credentials:
+```bash
+curl -X POST http://localhost:8000/api/agents/launch -H "Content-Type: application/json" \
+  -d '{
+    "api_key": "sk-ant-...",
+    "config_id": "github-bot",
+    "mcp_env": {"github": {"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."}}
+  }'
 ```
 
 ### Multi-Agent Team
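Note on the MCP Server Integration hunks in `configs/README.md` above: the "auto-allow" and "environment variable merging" behaviours can be pictured with the sketch below. It is an illustration under assumptions, not the project's code. The helper names `effective_allowed_tools` and `merge_mcp_env` are hypothetical, and the `mcp__{server}__*` pattern is taken from the README text in this same PR.

```python
from typing import Dict, List

# Tool names the docs say are added automatically when MCP servers are present.
AUTO_MCP_TOOLS = ["ListMcpResources", "ReadMcpResource"]


def effective_allowed_tools(allowed_tools: List[str], mcp_servers: Dict[str, dict]) -> List[str]:
    """Append the auto-allowed MCP tools, keeping order and skipping duplicates."""
    result = list(allowed_tools)
    if not mcp_servers:
        return result
    extra = AUTO_MCP_TOOLS + [f"mcp__{name}__*" for name in mcp_servers]
    for tool in extra:
        if tool not in result:
            result.append(tool)
    return result


def merge_mcp_env(
    mcp_servers: Dict[str, dict],
    mcp_env: Dict[str, Dict[str, str]],
) -> Dict[str, dict]:
    """Return a copy of mcp_servers where launch-time values replace config placeholders."""
    merged: Dict[str, dict] = {}
    for name, server in mcp_servers.items():
        env = dict(server.get("env", {}))   # copy so the preset is not mutated
        env.update(mcp_env.get(name, {}))   # runtime values win over placeholders
        merged[name] = {**server, "env": env}
    return merged


if __name__ == "__main__":
    servers = {
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_PERSONAL_ACCESS_TOKEN}"},
        }
    }
    print(effective_allowed_tools(["Read", "Write"], servers))
    print(merge_mcp_env(servers, {"github": {"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."}}))
```

Building a merged copy instead of editing the loaded preset in place keeps one session's token from leaking into the next launch of the same config.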