elvern18 · elvern18 · Feb 18, 2026 · Feb 14, 2026 · Feb 14, 2026 · Feb 14, 2026
diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
@@ -6,11 +6,43 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ElvAgent is an autonomous AI newsletter agent that curates and publishes AI news hourly across multiple platforms (Discord, X, Instagram, Telegram, Markdown). Built to demonstrate Claude Code best practices including MCP servers, skills, and subagents.
 
+## 📚 Documentation
+
+All project documentation is organized in the `docs/` folder:
+
+- **[docs/STATUS.md](../docs/STATUS.md)** - Current development status (<100 lines, high-level)
+- **[docs/logs/](../docs/logs/)** - Session handover logs for continuity
+- **[docs/PROJECT_STRUCTURE.md](../docs/PROJECT_STRUCTURE.md)** - Complete project structure
+- **[docs/TESTING_GUIDE.md](../docs/TESTING_GUIDE.md)** - Testing best practices
+
+**Quick reference:** Start each session with `docs/STATUS.md` + latest session log from `docs/logs/`.
+
+## ⚠️ CRITICAL: Always Use Virtual Environment
+
+**IMPORTANT:** This project uses a Python virtual environment at `.venv/`.
+
+**ALWAYS activate it before running ANY Python command:**
+
+```bash
+# Activate virtual environment (do this first!)
+source .venv/bin/activate
+
+# Verify it's active (should show .venv path)
+which python
+```
+
+**All commands below assume .venv is activated.**
+
+If you see `ModuleNotFoundError`, you forgot to activate the venv!
+
 ## Common Commands
 
 ### Development
 ```bash
-# Install dependencies
+# ALWAYS activate venv first!
+source .venv/bin/activate
+
+# Install dependencies (first time only)
 pip install -r requirements.txt
 
 # Run single test cycle (doesn't publish)
@@ -21,17 +53,47 @@ pytest tests/
 
 # Run specific test file
 pytest tests/unit/test_researchers.py -v
+
+# Run foundation test script
+python scripts/test_foundation.py
+```
+
+### Dev Setup
+```bash
+# Install dev dependencies
+pip install -r requirements-dev.txt
+
+# Install pre-commit hooks
+pre-commit install
+
+# Run linter
+ruff check src/ tests/
+
+# Auto-fix lint issues
+ruff check --fix src/ tests/
+
+# Check formatting
+ruff format --check src/ tests/
+
+# Type checking
+mypy src/ --ignore-missing-imports
+
+# Run tests with coverage
+pytest tests/unit/ -v --cov=src --cov-report=term-missing
 ```
 
 ### Database
 ```bash
+# Activate venv first!
+source .venv/bin/activate
+
 # Initialize database
-python -c "from src.core.state_manager import StateManager; StateManager().init_db()"
+python -c "from src.core.state_manager import StateManager; import asyncio; asyncio.run(StateManager().init_db())"
 
-# Query published items
+# Query published items (doesn't need venv)
 sqlite3 data/state.db "SELECT * FROM published_items ORDER BY published_at DESC LIMIT 10;"
 
-# Check metrics
+# Check metrics (doesn't need venv)
 sqlite3 data/state.db "SELECT * FROM api_metrics WHERE date = date('now');"
 ```
 
@@ -50,6 +112,119 @@ tail -f logs/stdout.log
 launchctl kickstart -k gui/$(id -u)/com.elvagent.newsletter
 ```
 
+## Agent Selection Rubric
+
+Claude Code autonomously chooses the appropriate execution mode. Users can override by explicitly requesting a different approach.
+
+### Single Agent (Default)
+
+**Use when:**
+- <50 lines of code affected
+- Single file modification
+- Clear implementation path
+- No architectural decisions
+- Simple refactoring or bug fixes
+
+**Examples:** Fix bug, add utility function, update config, write tests
+
+### Subagents
+
+**Use when:**
+- Research needed (APIs, docs, libraries)
+- Parallel independent work
+- Context isolation desired
+- Data gathering and analysis
+
+**Subagent types:**
+- **Haiku:** Fast, simple tasks (fetching, basic parsing)
+- **Sonnet:** Complex analysis (design, architecture)
+
+**Examples:** Research RSS libraries, explore APIs, fetch from 4 sources in parallel
+
+### Plan Mode
+
+**Use when:**
+- Multi-component changes (>3 files)
+- Architectural decisions required
+- Multiple valid approaches exist
+- New feature spanning layers
+- Unclear implementation path
+
+**Process:** Explore → Design → Get approval → Implement
+
+**Examples:** New platform integration, pipeline redesign, multi-agent orchestration
+
+**Planning Code:** Ensure code is readable, maintainable and modularised. Use test-planner agent to plan on testing code.
+
+### Autonomous Execution
+
+Claude will:
+1. Assess task complexity using criteria above
+2. Announce approach (e.g., "Entering plan mode for this multi-component change")
+3. Proceed with chosen mode
+4. Accept user override if explicitly requested
+
+**Override examples:**
+- "Use plan mode for this"
+- "Just implement directly"
+- "Research with a subagent first"
+
+## Session Documentation
+
+ElvAgent uses automated session documentation for perfect continuity between sessions.
+
+### Available Skills
+
+**Session Lifecycle:**
+- **`/session-start`** - Load context at beginning (auto-finds latest log, shows summary, suggests next action)
+- **`/session-end`** - Document session at end (creates log + compresses STATUS.md + shows next steps)
+
+**Individual Operations:**
+- **`/log-session`** - Create session handover log only
+- **`/update-status`** - Compress STATUS.md only
+
+### When Claude Suggests Documentation
+
+Claude will proactively suggest `/session-end` when:
+- Context usage >75%
+- Session duration >2 hours (estimated)
+- Major milestone completed
+- User indicates they're stopping
+
+**You can always:**
+- Accept: "yes" or invoke `/session-end`
+- Decline: "no" or "not yet"
+- Manually invoke later: `/session-end` at any time
+
+### Conventional Commits
+
+All documentation commits use format:
+```bash
+git commit -m "docs: Session YYYY-MM-DD-N - {brief summary}"
+```
+
+Example:
+```bash
+git commit -m "docs: Session 2026-02-17-1 - ContentEnhancer complete"
+```
+
+### Session Continuity
+
+**Complete Workflow:**
+
+```
+Session Start:  /session-start  (auto-loads context, shows summary)
+      ↓
+  [Work on tasks]
+      ↓
+Session End:    /session-end    (creates log, updates STATUS.md)
+```
+
+**Manual approach (if preferred):**
+1. Read `docs/STATUS.md` (high-level current state)
+2. Read latest session log from `docs/logs/` (detailed context)
+3. Continue from "Next Steps" section
+
 ## Architecture Principles
 
 ### Component Hierarchy
@@ -124,11 +299,11 @@ launchctl kickstart -k gui/$(id -u)/com.elvagent.newsletter
 
 ### Common Pitfalls
 
-1. **Don't fill context:** Use subagents for research, not direct scraping in main context
-2. **Check duplicates early:** Query database before processing content
+1. **Don't fill context:** Use subagents for research, not direct scraping
+2. **Check duplicates early:** Query database before processing
 3. **Normalize URLs:** Remove tracking parameters before fingerprinting
-4. **Respect rate limits:** Each platform has different limits (see `src/utils/rate_limiter.py`)
-5. **Use full paths in launchd:** Relative paths won't work in background execution
+4. **Respect rate limits:** See `src/utils/rate_limiter.py`
+5. **Use full paths in launchd:** Relative paths won't work in background
 
 ### Adding New Features
 

diff --git a/.claude/agents/content-researcher.md b/.claude/agents/content-researcher.md
@@ -0,0 +1,149 @@
+---
+name: content-researcher
+description: Research content from specified sources
+tools: [Read, Grep, Bash, WebFetch]
+model: sonnet
+---
+
+# Content Researcher Agent
+
+I am a specialized subagent for researching AI news and content from various sources. I keep the main agent's context clean by handling all the messy research work in isolation.
+
+## My Role
+
+When spawned, I will:
+1. Research the specified source (ArXiv, HuggingFace, funding news, or general AI news)
+2. Fetch and parse content
+3. Score relevance and filter low-quality items
+4. Check for duplicates against the database
+5. Return summarized findings in JSON format
+
+## Sources I Can Research
+
+### ArXiv Papers
+- Fetch http://export.arxiv.org/rss/cs.AI
+- Extract AI/ML papers from the last hour
+- Score based on novelty, impact, code availability
+- Return top 5 papers
+
+### HuggingFace Papers
+- Fetch https://huggingface.co/papers
+- Extract trending papers from today
+- Include associated models and datasets
+- Return top 5 papers
+
+### Startup Funding News
+- Fetch TechCrunch RSS (AI-filtered)
+- Extract funding announcements >$5M
+- Include company, amount, investors
+- Return significant announcements
+
+### General AI News
+- Aggregate from multiple tech news sources
+- Filter for AI-relevant keywords
+- Exclude fluff and opinion pieces
+- Return top news items
+
+## Input Format
+
+I expect to receive a task specification like:
+
+```json
+{
+  "source": "arxiv|huggingface|funding|news",
+  "time_window_hours": 1,
+  "max_items": 5
+}
+```
+
+## Output Format
+
+I return structured JSON:
+
+```json
+{
+  "source": "arxiv",
+  "item_count": 3,
+  "items": [
+    {
+      "title": "Content title",
+      "url": "https://...",
+      "source": "arxiv",
+      "category": "research|product|funding|news",
+      "relevance_score": 8,
+      "summary": "Brief summary...",
+      "metadata": {
+        "authors": ["..."],
+        "additional_info": "..."
+      },
+      "published_date": "2026-02-15T10:30:00"
+    }
+  ],
+  "research_time": 2.5,
+  "duplicates_filtered": 2
+}
+```
+
+## Deduplication
+
+Before returning items, I:
+1. Generate SHA-256 hash from normalized URL + title
+2. Query database: `SELECT 1 FROM content_fingerprints WHERE content_hash = ?`
+3. Filter out any items that already exist
+4. Report duplicate count in output
+
+## Scoring Guidelines
+
+**Relevance scores (1-10):**
+- 9-10: Major breakthrough, high-impact announcement
+- 7-8: Significant advancement, popular topic
+- 5-6: Interesting but incremental, niche application
+- 3-4: Minor update, limited interest
+- 1-2: Low quality, not relevant
+
+**I prioritize:**
+- Novel techniques and architectures
+- Code releases and practical tools
+- Major funding rounds (>$50M)
+- Industry-shaping news
+- Multimodal AI, LLMs, agents, alignment
+
+**I deprioritize:**
+- Purely theoretical work
+- Opinion pieces without substance
+- Small incremental improvements
+- Very narrow applications
+- Marketing fluff
+
+## Error Handling
+
+If I encounter errors:
+- I retry API calls up to 3 times with exponential backoff
+- I log warnings for individual item failures but continue processing
+- I return partial results rather than failing completely
+- I always return valid JSON even if empty
+
+## Context Management
+
+I keep main agent context clean by:
+- Processing all raw data myself
+- Summarizing findings before returning
+- Returning only the top N items (not everything I found)
+- Using compact JSON format
+
+## Usage Example
+
+Main agent spawns me like this:
+
+```python
+result = await spawn_subagent(
+    agent_type="content-researcher",
+    task={
+        "source": "arxiv",
+        "time_window_hours": 1,
+        "max_items": 5
+    }
+)
+```
+
+I return condensed findings that the main agent can easily process without cluttering its context.