Releases: NeoTheCapt/RedteamAgent
v0.1.1 — 24 fixes since v0.1.0
RedTeam Agent v0.1.1
Maintenance release — 24 commits collected since v0.1.0 (4 days of autonomous auditor cycles + 2 deep meta-audits). Same install path, just bumped to the new tag.
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/v0.1.1/install.sh) docker
cd ~/redteam-docker && ./run.sh
# inside the TUI:
/autoengage http://your-target:8080
🛠 What's fixed
Backend reconcile (orchestrator/backend/app/services/runs.py + launcher.py)
- Continuous-observation hold preservation correctly distinguishes genuinely-completed holds from detached holds with stale queue_incomplete / runtime_disappeared log entries; the latter now route to _maybe_auto_resume_run instead of being silently flipped to completed. reason_text="continuous observation hold detached" restored.
- Bounded-blocker classification: completed runs whose stop_reason_text documents an exhaustive bounded queue closure are kept terminal as completed_with_blockers instead of being demoted to failed / incomplete_terminal_state by the reconciler.
- Metasploit MCP env defaults: workspace/.env now ships with non-blank MSF_* defaults so the in-container helpers connect on first launch; explicit project overrides still win.
- /engage --auto direct launch replaces the prior /autoengage indirection: fewer command layers, simpler runtime detection.
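The two reconcile rules above can be sketched as a small decision function. This is a minimal illustration, not the code in runs.py: the RunSnapshot type, the reconcile function, and the substring test for "bounded queue closure" are all assumptions made for the example; only the status names and log markers come from the notes.

```python
from dataclasses import dataclass, field
from typing import List

# Log markers named in the release notes.
STALE_MARKERS = ("queue_incomplete", "runtime_disappeared")
# Assumed phrase; how the real reconciler recognizes a bounded closure is not documented here.
BOUNDED_CLOSURE_HINT = "bounded queue closure"


@dataclass
class RunSnapshot:
    """Hypothetical, simplified view of a run (not the real backend model)."""
    status: str
    hold_detached: bool
    log_entries: List[str] = field(default_factory=list)
    stop_reason_text: str = ""


def reconcile(run: RunSnapshot) -> str:
    """Return the outcome a reconciler following the rules above might pick."""
    has_stale_marker = any(
        marker in entry for entry in run.log_entries for marker in STALE_MARKERS
    )
    # Detached hold with stale entries: resume rather than flip to completed.
    if run.hold_detached and has_stale_marker:
        return "auto_resume"  # stands in for routing to _maybe_auto_resume_run
    # Completed run whose stop reason documents a bounded closure stays terminal.
    if run.status == "completed" and BOUNDED_CLOSURE_HINT in run.stop_reason_text.lower():
        return "completed_with_blockers"
    return run.status
```

A genuinely completed hold (not detached, no stale markers) falls through and keeps its existing status.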
Agent operator-core (agent/operator-core.md + CLAUDE.md + AGENTS.md)
- Auth-respawn dispatch now uses real task @recon-specialist / task @source-analyzer calls instead of bookkeeping-only text that could trigger permission prompts in autonomous mode.
- CTF recall closure gate hardened: explicit named challenge lists for FTP / Web3 / NFT / Database Schema / Upload Type / Password Strength branches, and pre-report verification that all peak-solved challenges have current solved-state evidence.
- New "incomplete recall blocker requeue" rule: if a closure batch proves the technical primitive but the named peak-solved challenge stays unsolved, the case must be requeued with the next concrete challenge-triggering action rather than retired clean.
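The requeue rule is a prompt-level guardrail, but its logic can be written out as a small decision function. The sketch below is only an illustration: ClosureResult, closure_decision, and the "keep_open" fallback are hypothetical names, not part of operator-core.md.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClosureResult:
    """Hypothetical record of one closure batch for a named challenge."""
    challenge: str
    primitive_proven: bool            # the batch demonstrated the technical primitive
    challenge_solved: bool            # current solved-state evidence exists
    next_action: Optional[str] = None  # next concrete challenge-triggering action


def closure_decision(result: ClosureResult) -> str:
    """Decide whether a case may be retired or must go back on the queue."""
    if result.challenge_solved:
        return "retire"
    if result.primitive_proven:
        # Primitive proven but challenge unsolved: never retire clean.
        if not result.next_action:
            raise ValueError("requeue requires a concrete next action")
        return f"requeue: {result.next_action}"
    return "keep_open"  # assumed fallback; the notes do not cover this case
```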
New regression contracts
agent/scripts/check_operator_respawn_contract.py, check_operator_prompt_contract.py, check_sensitive_data_skill_contract.py, check_exploit_developer_prompt_contract.py — focused contract scripts that fail loudly if any of the above guarantees regress.
Plus a few audit-found stale-test catches: tests/agent/test_exploit_developer_prompt_guardrails.py, tests/orchestrator/backend/test_launcher.py (loopback rewriting + workspace-env baseline), tests/orchestrator/backend/test_runs.py (continuous-hold ignores stale stop reason) all updated to track post-rewording behavior.
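For readers unfamiliar with the pattern, a contract script of this kind typically scans a prompt file for phrases that must be present and exits non-zero when any are missing. The following is a generic sketch under that assumption; REQUIRED_PHRASES, missing_phrases, and main are illustrative names and do not reproduce the actual check_*.py scripts.

```python
import sys
from pathlib import Path
from typing import Iterable, List

# Hypothetical phrases; the real scripts define their own guarantees.
REQUIRED_PHRASES = ["task @recon-specialist", "task @source-analyzer"]


def missing_phrases(text: str, required: Iterable[str]) -> List[str]:
    """Return every required phrase that does not appear in the text."""
    return [phrase for phrase in required if phrase not in text]


def main(path: str) -> int:
    """Check one prompt file; return a process exit code (0 = contract holds)."""
    text = Path(path).read_text(encoding="utf-8")
    missing = missing_phrases(text, REQUIRED_PHRASES)
    for phrase in missing:
        print(f"CONTRACT VIOLATION: missing {phrase!r} in {path}", file=sys.stderr)
    return 1 if missing else 0
```

Wiring main into CI as sys.exit(main(path)) makes a reworded prompt that drops a guarantee fail the build instead of passing silently.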
Auth helper hardening (local-hermes-agent/scripts/lib/orchestrator_auth.sh)
The _refresh_orchestrator_token cooldown is now bypassed when the current ORCH_TOKEN is empty. Previously the cooldown could trap the 401-retry path with an empty bearer header, which emitted the same orch_api FND every cycle until the cooldown TTL expired.
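The helper itself is a shell script, but the bypass logic is easy to state in a few lines of Python. This is a sketch of the rule only: should_refresh and the 60-second COOLDOWN_SECONDS value are assumptions, not values taken from orchestrator_auth.sh.

```python
COOLDOWN_SECONDS = 60.0  # hypothetical TTL; the real value is not stated in the notes


def should_refresh(token: str, last_refresh: float, now: float,
                   cooldown: float = COOLDOWN_SECONDS) -> bool:
    """Decide whether a token refresh may run right now."""
    # An empty token can never authenticate, so the cooldown must not block it.
    if not token:
        return True
    # Otherwise honor the cooldown to avoid hammering the auth endpoint.
    return (now - last_refresh) >= cooldown
```

With an empty token the function returns True even one second after the last refresh, which is the case the fix addresses.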
📦 Compatibility
- Same docker image (redteam-allinone:latest), no rebuild required if you've already pulled v0.1.0
- Same install command structure: only the tag in the URL changed (v0.1.0 → v0.1.1)
- Same per-project config: Auth / Env / Crawler / Parallel / Agents tabs unchanged
- Same CLI commands: /engage, /autoengage, /resume, /status, /queue, /report, etc.
📥 Upgrade
If you installed v0.1.0:
# Force-reinstall from v0.1.1 tag
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/v0.1.1/install.sh) docker
# or, if you want to pin a different ref:
REDTEAM_REF=v0.1.1 bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/v0.1.1/install.sh) docker
Engagement workspaces from v0.1.0 keep working: this release only changes orchestrator-side reconcile behavior and operator-prompt guardrails; nothing in the engagement file format changed.
🙏 Acknowledgements
Built on top of OpenCode, Claude Code, Codex, Katana, mitmproxy, Metasploit Framework, and the broader Kali Linux toolchain.
For authorized security testing only. All targets must be CTF / lab / explicitly-authorized environments.
v0.1.0 — Autonomous AI-Powered Red Team Simulation
RedTeam Agent v0.1.0
Drop-in autonomous red team operator for any AI CLI. Aim it at a CTF / lab target, walk away, and come back to a finished engagement: structured findings, full evidence chain, and a generated report. Ships as a single Docker image plus an optional web orchestrator, so there's no host install pollution and no manual phase babysitting.
🚀 Quick Start — Docker (recommended)
The whole agent runtime, OpenCode, and the entire Kali toolchain live in one image. You don't install anything on the host except Docker.
bash <(curl -fsSL https://raw.githubusercontent.com/NeoTheCapt/RedteamAgent/v0.1.0/install.sh) docker
cd ~/redteam-docker
./run.sh
Inside the TUI:
/autoengage http://your-ctf-target:8080
That's it. Recon → Collect → Test → Exploit + OSINT → Report runs unattended for hours, dispatching subagents in parallel, and lands a finalized report.md plus structured findings/intel/auth/cases under workspace/engagements/<timestamp-target>/.
run.sh persists across restarts:
- workspace/: all engagement artifacts
- opencode-home/: auth tokens (so you don't re-login)
- opencode-config/: model selection (so you don't re-pick a provider)
- opencode-state/: TUI state
- --ephemeral-opencode if you want zero state outside the container
- --rebuild to force a clean image rebuild
🖥️ Orchestrator GUI (optional, but highly recommended)
A local web app that turns the agent into a multi-project, multi-run platform: live event stream, dispatch timelines, finding browser, per-project model/auth/crawler/parallelism config, and one-click run launch.
git clone --branch v0.1.0 --depth 1 https://github.com/NeoTheCapt/RedteamAgent.git
cd RedteamAgent
./orchestrator/run.sh
# open http://127.0.0.1:18000
orchestrator/run.sh bootstraps the backend virtualenv, installs frontend dependencies on first run, builds the frontend, and starts the FastAPI service. To stop:
./orchestrator/stop.sh
What you get:
- Project sidebar with per-project Model / Auth / Env / Crawler / Parallel / Agents tabs (every field inherited by every run launched under that project).
- Live runs with phase tracker, queue stats, live concurrent-agent banner, and historical concurrent-windows timeline.
- Document browser for the engagement workspace (findings.md, report.md, intel.md, log.md, surfaces.jsonl, downloaded artifacts).
- Auto-recovery — the backend auto-resumes runs after supervisor / backend restarts, synthesizes a partial report from artifacts when a cycle is interrupted, and enforces completion health checks. Suitable for long-running unattended runs.
✨ What's in the box
- 8 specialized subagents — operator, recon-specialist, source-analyzer, vulnerability-analyst, exploit-developer, fuzzer, osint-analyst, report-writer — coordinated through a stage-based streaming pipeline (no rigid phase gates).
- 31 attack-methodology skills + 79 reference files — OWASP Top 10:2025, API Security 2023, AD/Kerberos attacks, offensive tactics. Skills load on demand; references survive every model context window.
- Containerized toolchain: nmap, sqlmap, ffuf, gobuster, nikto, nuclei, subfinder, katana, mitmproxy, Metasploit Framework + MCP server, all baked into one image.
- Streaming case pipeline: SQLite-backed case queue with 4 producers (mitmproxy, Katana, recon, source). Cases flow through ingested → vuln_confirmed → exploited stages with cross-kind agent concurrency. Routing is type-aware (api/form → vuln-analyst; javascript/page → source-analyzer).
- Unattended /autoengage: zero prompts, end-to-end. Auto-resume on stalls, queue-stall recovery, surface-coverage enforcement, finding deduplication, and partial-report fallback if a cycle is interrupted.
- Multi-CLI install: same agent runtime, install for Claude Code, OpenCode, Codex, or all-in-one Docker. OpenCode and Codex prompts are generated from a single source via render-operator-prompts.sh.
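To make the streaming case pipeline concrete, here is a minimal sketch of a SQLite-backed queue with stage progression and type-aware routing. The table schema and the helper names (open_queue, ingest, route, advance) are assumptions for illustration; only the stage names and the routing pairs come from the description above, and the real cases.db layout may differ.

```python
import sqlite3

# Stage names and routing pairs as described in the release notes.
STAGES = ("ingested", "vuln_confirmed", "exploited")
ROUTES = {
    "api": "vuln-analyst",
    "form": "vuln-analyst",
    "javascript": "source-analyzer",
    "page": "source-analyzer",
}


def open_queue() -> sqlite3.Connection:
    """Create an in-memory queue with a hypothetical, simplified schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE cases (id INTEGER PRIMARY KEY, url TEXT, kind TEXT, stage TEXT)"
    )
    return db


def ingest(db: sqlite3.Connection, url: str, kind: str) -> int:
    """A producer adds a case; every case starts in the first stage."""
    cur = db.execute(
        "INSERT INTO cases (url, kind, stage) VALUES (?, ?, ?)", (url, kind, STAGES[0])
    )
    return cur.lastrowid


def route(kind: str) -> str:
    """Pick the subagent for a case based on its type."""
    return ROUTES[kind]


def advance(db: sqlite3.Connection, case_id: int) -> str:
    """Move a case to the next stage, staying put once it reaches the last one."""
    (stage,) = db.execute("SELECT stage FROM cases WHERE id = ?", (case_id,)).fetchone()
    next_stage = STAGES[min(STAGES.index(stage) + 1, len(STAGES) - 1)]
    db.execute("UPDATE cases SET stage = ? WHERE id = ?", (next_stage, case_id))
    return next_stage
```

Keeping the stage on each row, rather than in separate per-phase queues, is what lets cases of different kinds progress concurrently without a global phase gate.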
🏗️ Architecture in one frame
producers ──→ cases.db (stages) ──→ dispatcher.sh ──→ subagents (parallel)
  katana                                  │               │
  mitmproxy                               │               └─→ findings.md
  recon-specialist                        │               └─→ surfaces.jsonl
  source-analyzer                         │               └─→ intel.md
                                          └─→ orchestrator events API
                                                      ↓
                                                Web UI (live)
5-phase pipeline:
Phase 1: RECON ─── recon-specialist + source-analyzer (parallel)
Phase 2: COLLECT ─ Import endpoints → SQLite queue, start Katana crawler
Phase 3: TEST ──── Stage-based case pipeline; serialized fetch+dispatch per turn
Phase 4: EXPLOIT ─ osint-analyst + exploit-developer (parallel)
Phase 5: REPORT ── report-writer with coverage stats + intel summary
📦 Install modes
| Mode | Command | Output |
|---|---|---|
| Docker (all-in-one) | install.sh docker | ~/redteam-docker/ with run.sh, persistent workspace/, OpenCode XDG dirs |
| OpenCode | install.sh opencode | ~/redteam-agent/ with .opencode/, skills/, references/, scripts/, docker/ |
| Claude Code | install.sh claude | ~/redteam-agent/ with generated .claude/agents/ + .claude/commands/ + CLAUDE.md |
| Codex | install.sh codex | ~/redteam-agent/ with generated .codex/agents/ + AGENTS.md |
Override the bootstrap ref with REDTEAM_REF=<tag-or-branch> when you need a different snapshot.
⚙️ Per-project configuration
Six tabs on every project (Edit-project modal or sidebar inline): Model, Auth, Env, Crawler, Parallel, Agents. Every field is inherited by every run launched under that project, so once a project is set up, launching /autoengage takes one click.
🚀 Get involved
- Issues / questions: GitHub Issues
- Customize a skill: drop a SKILL.md into agent/skills/<name>/, register in agent/.opencode/opencode.json, re-run install.sh.
- Wire your own CLI: agent runtime is at agent/; operator prompt at agent/operator-core.md, subagent prompts at agent/.opencode/prompts/agents/.
🙏 Acknowledgements
Built on top of OpenCode, Claude Code, Codex, Katana, mitmproxy, Metasploit Framework, and the broader Kali Linux toolchain. Reference files draw from OWASP, PortSwigger, MITRE ATT&CK, and the offensive-security community.
For authorized security testing only. All targets must be CTF / lab / explicitly-authorized environments — this is a hard guard the operator agent enforces.

