feat: Autonomous GitHub Agent (ReAct framework) #1

Merged
elvern18 merged 25 commits into main from agent-1-data-layer on Feb 18, 2026

@elvern18
Owner

Summary

  • Implements the ReAct (Reason → Act) agent framework as the architectural foundation for all future autonomous agents
  • Ships the first concrete agent: GitHubMonitor — polls open PRs every 60s and autonomously handles the full PR lifecycle

What's included

New modules

| Path | Description |
| --- | --- |
| src/agents/base.py | AgentLoop ABC: poll → triage → act → record loop + run_forever(max_cycles=N) |
| src/github/client.py | GitHubClient: httpx REST wrapper (9 methods, token-bucket rate-limited) |
| src/github/monitor.py | GitHubMonitor(AgentLoop): orchestrates the PR lifecycle |
| src/github/workers/pr_describer.py | Claude Haiku generates PR descriptions |
| src/github/workers/ci_fixer.py | 3-tier: ruff auto-fix → Claude Sonnet → secret alert (circuit breaker at 3 attempts) |
| src/github/workers/code_reviewer.py | Claude Sonnet reviews on green CI, idempotent via marker |
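
The poll → triage → act → record loop listed for src/agents/base.py could look roughly like the sketch below. It is an illustration under assumptions, not the code from this PR: only the names AgentLoop, run_forever, and max_cycles come from the description; the method signatures, the synchronous style, the interval parameter, and the EchoAgent subclass are hypothetical.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Tuple


class AgentLoop(ABC):
    """Illustrative poll -> triage -> act -> record loop (signatures are assumed)."""

    @abstractmethod
    def poll(self) -> List[Any]:
        """Observe the outside world, e.g. list open PRs."""

    @abstractmethod
    def triage(self, observations: List[Any]) -> List[Any]:
        """Decide which observations need an action."""

    @abstractmethod
    def act(self, action: Any) -> Any:
        """Carry out one action and return its result."""

    @abstractmethod
    def record(self, action: Any, result: Any) -> None:
        """Persist what was done so it is not repeated."""

    def run_cycle(self) -> int:
        actions = self.triage(self.poll())
        for action in actions:
            self.record(action, self.act(action))
        return len(actions)

    def run_forever(self, max_cycles: Optional[int] = None, interval: float = 60.0) -> int:
        """Loop until max_cycles is reached (None means no limit)."""
        cycles = 0
        while max_cycles is None or cycles < max_cycles:
            self.run_cycle()
            cycles += 1
            # Sleep between cycles, but not after the final one.
            if max_cycles is None or cycles < max_cycles:
                time.sleep(interval)
        return cycles


class EchoAgent(AgentLoop):
    """Hypothetical toy agent: acts on even numbers by doubling them."""

    def __init__(self) -> None:
        self.log: List[Tuple[int, int]] = []

    def poll(self) -> List[Any]:
        return [1, 2, 3]

    def triage(self, observations: List[Any]) -> List[Any]:
        return [n for n in observations if n % 2 == 0]

    def act(self, action: Any) -> Any:
        return action * 2

    def record(self, action: Any, result: Any) -> None:
        self.log.append((action, result))
```

Bounding the loop with max_cycles is what makes a single-cycle test run possible without special-casing the production path.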

Modified files

  • src/config/constants.py: GitHub rate limit + agent constants
  • src/config/settings.py: 6 new GitHub settings (github_token, github_repo, etc.)
  • src/core/state_manager.py: github_events table + record_github_event, is_github_event_processed, count_fix_attempts
  • src/main.py: --mode=github-monitor --cycles=N entry point
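
To make the deduplication role of the github_events table concrete, here is a minimal sketch using the standard-library sqlite3 module. The three function names match the ones listed above, but the table columns, the function signatures, and the "fix_attempt" event type are assumptions made for illustration; the actual StateManager schema is not shown in this PR description.

```python
import sqlite3

# Hypothetical schema: the real github_events columns may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS github_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    pr_number INTEGER NOT NULL,
    event_type TEXT NOT NULL,
    event_key TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (pr_number, event_type, event_key)
)
"""


def record_github_event(conn: sqlite3.Connection, pr_number: int,
                        event_type: str, event_key: str) -> bool:
    """Insert an event; return False if the same event was already recorded."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO github_events (pr_number, event_type, event_key) "
        "VALUES (?, ?, ?)",
        (pr_number, event_type, event_key),
    )
    conn.commit()
    return cur.rowcount == 1


def is_github_event_processed(conn: sqlite3.Connection, pr_number: int,
                              event_type: str, event_key: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM github_events "
        "WHERE pr_number = ? AND event_type = ? AND event_key = ?",
        (pr_number, event_type, event_key),
    ).fetchone()
    return row is not None


def count_fix_attempts(conn: sqlite3.Connection, pr_number: int) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM github_events "
        "WHERE pr_number = ? AND event_type = 'fix_attempt'",
        (pr_number,),
    ).fetchone()
    return row[0]
```

The UNIQUE constraint plus INSERT OR IGNORE keeps the write idempotent, so a repeated poll of the same PR state does not create a second row.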

Test plan

  • 217 unit tests passing (33 new GitHub agent tests)
  • ruff check + ruff format clean
  • mypy src/ --ignore-missing-imports CI-green
  • Live test against real GitHub API: python src/main.py --mode=github-monitor --verbose --cycles=1
    • Token auth verified ✅
    • Rate limiter active (79/80 tokens) ✅
    • PR polling succeeded (HTTP 200) ✅
    • Clean exit after 1 cycle ✅
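
The "79/80 tokens" reading is what a token bucket with a capacity of 80 reports after one request. The sketch below shows the general technique only; the class name, the refill rate, and the injectable clock are assumptions, and the limiter used by GitHubClient may be implemented differently.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket: each request spends a token, tokens refill over time."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second  # assumed rate, not from the PR
        self.tokens = float(capacity)
        self._clock = clock
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self._last = now

    def try_acquire(self, n: int = 1) -> bool:
        """Spend n tokens if available; return False when the bucket is too empty."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

Passing the clock in as a parameter lets tests advance time manually instead of sleeping.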

Usage

# Run forever (production)
python src/main.py --mode=github-monitor

# Run one cycle (testing)
python src/main.py --mode=github-monitor --verbose --cycles=1
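
For readers unfamiliar with how such flags are typically wired, the following is a hedged argparse sketch of a command line that accepts the invocations above. It is not the parser from src/main.py: the function name build_parser and the defaults are hypothetical, and the real entry point may validate modes differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser accepting --mode, --verbose and --cycles."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--mode", default=None, help="e.g. github-monitor")
    parser.add_argument("--verbose", action="store_true", help="enable verbose logging")
    # None is used here to mean "run forever"; an integer bounds the number of cycles.
    parser.add_argument("--cycles", type=int, default=None)
    return parser
```

argparse accepts both the `--cycles=1` and the `--cycles 1` spellings, so either form of the flag reaches the same option.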

🤖 Generated with Claude Code

elvern18 and others added 25 commits February 15, 2026 04:13
- Created Pydantic settings with type-safe configuration
- Implemented structured logging with structlog
- Built cost tracking system for API usage
- Added rate limiter with token bucket algorithm
- Implemented retry utilities with exponential backoff
- Created state manager with SQLite operations
- Built base classes for researchers and publishers
- Added database schema with tables for:
  - published_items (content tracking)
  - newsletters (publication history)
  - publishing_logs (per-platform status)
  - api_metrics (cost tracking)
  - content_fingerprints (deduplication)

All Agent 1 (Data Layer) tasks complete.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Created ArXiv researcher extending BaseResearcher
- Fetches from ArXiv RSS feed with relevance scoring
- Implements time window filtering (last hour)
- Scores based on keywords: LLMs, multimodal, agents, etc.
- Created research-arxiv skill with detailed workflow
- Created content-researcher subagent specification
- Handles deduplication and error recovery

Research layer ready for testing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Created src/main.py with test and production modes
- Built scripts/test_foundation.py to verify components
- Updated STATUS.md with current progress
- Added __init__.py files for proper package structure

Foundation ready for testing and MCP server development.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- IMPLEMENTATION_PROGRESS.md: Detailed day 1 summary
- PROJECT_STRUCTURE.md: Complete project structure overview
- Includes statistics, next steps, and navigation guide

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fixed Settings to allow optional ANTHROPIC_API_KEY for testing
- Created pytest.ini with test configuration and markers
- Built tests/conftest.py with shared fixtures
- Implemented tests/unit/test_state_manager.py (complete unit tests)
- Created test-planner agent for comprehensive test case planning
- Added TESTING_GUIDE.md with best practices and workflows
- Fixed test_foundation.py to work without real API keys
- Added production config validation

Testing workflow now supports:
- TDD (test-driven development)
- TAD (test-after development)
- Hybrid approach
- Test planning agent integration

Run: pytest -v (for unit tests)
Run: python scripts/test_foundation.py (for quick check)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Changed ArXiv RSS URL from HTTP to HTTPS (fixes 301 redirect)
- Fixed syntax error in test_state_manager.py (async import)
- Updated CLAUDE.md with critical venv activation instructions
- Added venv reminder to all command examples

All tests now passing:
- Foundation test suite: ✓
- Pytest unit tests: 9/9 passed ✓

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Created Database MCP Server with 3 tools:
  - check_duplicate: Check if content exists
  - store_content: Save published items
  - get_metrics: Retrieve API usage stats
- Uses official MCP Python SDK
- Wraps StateManager for all database operations
- Implements proper error handling and logging
- Full async support

Tests: 7/7 passed (0.70s)
- Server initialization
- Duplicate checking (exists/not exists)
- Content storage (success/validation errors)
- Metrics retrieval (empty/with data)

This unblocks Claude integration with database!

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Documentation updates:
- STATUS.md: Marked MCP server complete, updated metrics (55% complete)
- PROJECT_STRUCTURE.md: Marked database_server.py as done
- Updated test counts and file statistics

Testing improvements:
- Created scripts/test_mcp_server.py for end-to-end MCP testing
- Tests all 3 MCP tools (check_duplicate, store_content, get_metrics)
- Tests error handling and validation
- All tests passing ✓

Test Results:
- Pytest: 16/16 tests passing
- Foundation test: All passing
- MCP server test: All passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Documentation restructure:
- Created docs/ folder for all project documentation
- Moved STATUS.md, IMPLEMENTATION_PROGRESS.md, PROJECT_STRUCTURE.md, TESTING_GUIDE.md to docs/
- Added docs/README.md as documentation index
- Updated CLAUDE.md with documentation references
- Updated root README.md with new doc locations

Benefits:
✓ Cleaner root directory
✓ All documentation in one organized location
✓ Clear documentation index for easy navigation
✓ CLAUDE.md automatically references docs

File structure:
/
├── README.md                 # Project overview
├── .claude/CLAUDE.md        # Dev guidelines (auto-read)
└── docs/                    # All project documentation
    ├── README.md            # Documentation index
    ├── STATUS.md            # Current progress
    ├── PROJECT_STRUCTURE.md # File organization
    ├── TESTING_GUIDE.md     # Testing practices
    └── IMPLEMENTATION_PROGRESS.md  # Detailed progress

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Added quick-start section for developers resuming work in fresh sessions:
- Simple command to get oriented
- Current phase summary
- Pre-coding checklist
- Direct link to next steps

Makes it easy to pick up where we left off without reading entire file.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Created 4 skills: session-start, session-end, log-session, update-status
- Implemented session handover log system in docs/logs/
- Compressed STATUS.md from 345 to 94 lines (73% reduction)
- Added agent selection rubric to CLAUDE.md for autonomous mode selection
- Archived historical session docs to logs/ with new format
- Updated README.md and START_HERE.md with /session-start workflow

Enables perfect session continuity with one-command automation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implemented ContentEnhancer to orchestrate 4 AI enhancement agents:
- HeadlineWriter: Generates viral headlines (Claude Sonnet)
- TakeawayGenerator: Creates 'why it matters' insights (Claude Haiku)
- EngagementEnricher: Extracts social proof metrics (local)
- SocialFormatter: Formats category messages (Claude Haiku)

Features:
- Sequential item enhancement with exponential backoff retry (3 attempts)
- Template fallback when AI enhancement fails (never fails)
- Category grouping (max 5 items per category, sorted by relevance)
- AI-powered category formatting with simple fallback
- Comprehensive cost and performance metrics tracking
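
The combination of retry with exponential backoff (3 attempts) and a template fallback can be sketched as below. This is an illustration of the pattern, not ContentEnhancer's code: the function name enhance_with_fallback, its parameters, and the base delay are assumptions.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def enhance_with_fallback(
    ai_call: Callable[[], Awaitable[T]],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Try the AI call up to `attempts` times, then return the template fallback."""
    for attempt in range(attempts):
        try:
            return await ai_call()
        except Exception:
            # Back off 1x, 2x, 4x ... the base delay, but not after the last try.
            if attempt < attempts - 1:
                await asyncio.sleep(base_delay * 2 ** attempt)
    return fallback()
```

Because the fallback is a plain local function, the caller always gets a result even when every AI attempt fails.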

Integration:
- Added TelegramPublisher.publish_enhanced() for enhanced publishing
- Added TelegramFormatter.format_enhanced() for enhanced formatting
- Backward compatible: Original publish_newsletter() unchanged

Testing:
- 10 ContentEnhancer unit tests (sequential flow, retry logic, fallbacks)
- 6 TelegramFormatter enhanced tests (formatting, splitting, markdown)
- 5 integration tests (end-to-end enhancement + publishing)
- All 21 tests passing

Cost: ~$0.035 per 15-item newsletter (well under $3/day target)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixed issue where ANTHROPIC_API_KEY wasn't loaded when running main.py
from src/ directory. Pydantic's env_file was using a relative path '.env'
which looked in the current working directory instead of project root.

Solution: Compute project root before model_config and use absolute path
for env_file. Now works from any directory.
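
A minimal sketch of the path computation, assuming the settings module lives at src/config/settings.py (two directories below the project root). The helper name project_env_file and the levels_up parameter are hypothetical.

```python
from pathlib import Path


def project_env_file(settings_module: Path, levels_up: int = 2) -> Path:
    """Return the absolute <project root>/.env for a module `levels_up` dirs below root."""
    # parents[0] is the module's directory, parents[1] its parent, and so on.
    return settings_module.resolve().parents[levels_up] / ".env"


# In a pydantic-settings class this could be used roughly as (assumed usage):
#   model_config = SettingsConfigDict(env_file=project_env_file(Path(__file__)))
```

Anchoring on the module's own location rather than the current working directory is what makes the lookup independent of where the script is launched from.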

Fixes: 'ANTHROPIC_API_KEY is required for production' error

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Use explicit AsyncAnthropic import for consistency with ContentPipeline
- Changes in headline_writer.py, takeaway_generator.py, social_formatter.py
- From: import anthropic; anthropic.AsyncAnthropic()
- To: from anthropic import AsyncAnthropic; AsyncAnthropic()
- Improves code clarity and IDE support
- All 10 ContentEnhancer unit tests passing

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add enhance_phase() between filter and publish phases:
  Research → Filter → Enhance (optional) → Publish → Record

Features:
- Feature flag: settings.enable_content_enhancement (default: True)
- Feature flag: settings.max_items_per_category (default: 5)
- Backward compatible: publishers without publish_enhanced() supported
- Metrics tracking: cost, success rate, AI vs template
- Cost control: orchestrator checks budget before enhancement

Architecture:
- ContentEnhancer is centralized, reusable across platforms
- TelegramPublisher.publish_enhanced() called when available
- Fallback to publish_newsletter() for other publishers
- Enhancement metrics tracked in CycleResult

Implementation:
- Add enhance_phase() method to Orchestrator
- Update publish_phase() to handle both Newsletter and CategoryMessage
- Add _category_to_newsletter() helper for backward compatibility
- Update record_phase() to log enhancement metrics
- Add enhancement fields to CycleResult dataclass

Testing:
- scripts/test_content_enhancer_real.py - Test with real sources
  * Fetches from ArXiv, HuggingFace, Reddit, TechCrunch
  * 100% AI enhancement success rate
  * Cost: $0.05 per 20 items
- scripts/test_orchestrator_enhanced.py - Full orchestrator cycle
  * Validates enhancement integration
  * Tracks metrics correctly

Cost Impact:
- +$0.035/newsletter (15 items, 5 categories)
- Budget: $0.84/day (24 cycles) = 28% of $3 daily budget
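
The budget figures above follow from simple arithmetic, reproduced here as a quick check (the variable names are illustrative):

```python
COST_PER_NEWSLETTER = 0.035  # USD, 15 items across 5 categories
CYCLES_PER_DAY = 24
DAILY_BUDGET = 3.00          # USD

daily_cost = COST_PER_NEWSLETTER * CYCLES_PER_DAY
budget_share = daily_cost / DAILY_BUDGET

print(f"${daily_cost:.2f}/day, {budget_share:.0%} of budget")
# prints: $0.84/day, 28% of budget
```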

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Session Documentation:
- Created session log for 2026-02-18-1 (orchestrator integration)
- Updated STATUS.md (Phase 2B → Phase 3, 100% complete)
- Updated .gitignore to track docs/logs/ (session logs now versioned)
- Added historical session logs (2026-02-16, 2026-02-17)
- Minor CLAUDE.md update (planning code guidance)

Session Highlights:
- Fixed AsyncAnthropic imports (consistency)
- Tested with real sources (100% AI success, $0.05 cost)
- Integrated ContentEnhancer into orchestrator pipeline
- Added feature flags (enable_content_enhancement, max_items_per_category)
- Created 2 integration test scripts
- Made 2 commits (bug fixes + integration)

Architecture:
Research → Filter → Enhance (optional) → Publish → Record

Next Steps:
- End-to-end test with real Telegram
- Monitor enhancement quality (1 day)
- Adjust prompts if needed

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add pyproject.toml with ruff, mypy, and pytest config (replaces pytest.ini)
- Add requirements-dev.txt with dev tooling dependencies
- Add .pre-commit-config.yaml with ruff v0.15.1, formatting, and detect-private-key hooks
- Add GitHub Actions CI workflow (lint + unit-tests 3.10/3.11 + integration + gitleaks secret scan)
- Add GitHub Actions auto-PR workflow for agent-* branches with auto-merge
- Add 62 new unit tests: test_researchers.py (41), test_utils.py (21)
- Fix 2 pre-existing test failures in test_orchestrator.py (mock AsyncMock gaps)
- Apply ruff format and fix 308 lint issues across all source files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Stage 5 now automatically commits documentation without asking.
Removes the optional prompt; commit runs unconditionally with
Co-Authored-By trailer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- image_generator.py: Use Path | None for optional arg (PEP 604 syntax)
- markdown_formatter.py: Add type annotation for by_category dict
- telegram_formatter.py: Add type annotation for current list

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- state_manager.py: annotate metrics dict as dict[str, Any]
- content_pipeline.py: assert client not None before API call
- markdown_publisher.py: fix publish/publish_newsletter signatures
  to be compatible with BasePublisher; add dict-to-Newsletter conversion
- orchestrator.py: add type annotations, assert enhancer not None,
  use isinstance narrowing for gather results, handle None published_date

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements a fully autonomous GitHub PR lifecycle agent:

- AgentLoop ABC (src/agents/base.py): ReAct loop with poll→triage→act→record,
  run_forever() with max_cycles support for testing
- GitHubClient (src/github/client.py): httpx REST wrapper for GitHub API
  (9 methods, rate-limited via token bucket)
- GitHubMonitor (src/github/monitor.py): orchestrates the full PR lifecycle
- 3 AI workers (src/github/workers/):
  - PRDescriber: Claude Haiku generates PR descriptions
  - CIFixer: 3-tier (ruff auto-fix → Claude Sonnet → secret alert)
    with circuit breaker at MAX_FIX_ATTEMPTS=3
  - CodeReviewer: Claude Sonnet posts reviews on green CI, idempotent
    via <!-- elvagent-review --> marker
- github_events DB table + 3 StateManager methods for deduplication
- --cycles N flag for controlled test runs

Verified live against GitHub API: token auth, rate limiting, PR polling
all working. 217 tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
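
The marker-based idempotency described for CodeReviewer can be illustrated as follows. Only the marker string comes from the commit message; the function names and the idea of scanning existing comment bodies as plain strings are assumptions for the sketch.

```python
from typing import List

REVIEW_MARKER = "<!-- elvagent-review -->"


def already_reviewed(comment_bodies: List[str]) -> bool:
    """True if any existing PR comment carries the hidden review marker."""
    return any(REVIEW_MARKER in body for body in comment_bodies)


def build_review_comment(review_text: str) -> str:
    """Prefix the review with the marker so later cycles can detect it."""
    return f"{REVIEW_MARKER}\n{review_text}"
```

Since the marker is an HTML comment, it is invisible in the rendered review but still present in the comment body returned by the API.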
- Add investigation layer: CI logs + check annotations + local file contents
- Escalate Tier 1 (ruff) to Claude Sonnet when ruff makes no changes
- Add get_fix_history() to StateManager for per-PR fix context
- Refactor CIFixer tests to use _make_fixer() factory pattern
- Add test for ruff→Claude escalation path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
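
One way to picture the escalation and the circuit breaker together is the sketch below. It is a simplification under assumptions: the function plan_fix and its callable parameters are hypothetical, and since the commit messages do not say what triggers the secret-alert tier, the sketch simply treats it as the final fallback.

```python
from typing import Callable

MAX_FIX_ATTEMPTS = 3


def plan_fix(
    attempts_so_far: int,
    ruff_fix: Callable[[], bool],  # returns True if ruff changed any file
    llm_fix: Callable[[], bool],   # returns True if the model produced a fix
    alert: Callable[[], None],     # assumed last-resort notification
) -> str:
    """Return which tier handled the failure, or 'circuit_open' if attempts are exhausted."""
    if attempts_so_far >= MAX_FIX_ATTEMPTS:
        return "circuit_open"
    if ruff_fix():
        return "ruff"
    # ruff made no changes, so escalate to the model-based tier.
    if llm_fix():
        return "llm"
    alert()
    return "alert"
```

Checking the attempt count before any tier runs keeps a persistently failing PR from triggering repeated model calls.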
run_github_monitor() used `logger` before main() set it up as a global.
Add get_logger("main") at module level to satisfy mypy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@elvern18 elvern18 merged commit f8ed953 into main Feb 18, 2026
10 of 11 checks passed