AI-powered pentest agent with a persistent Pentest Task Tree, autonomous recon, vulnerability validation, exploit analysis, finding correlation, and kill chain tracking
31 agent tools • 60+ pentest tools • 5 playbooks • 22 CLI commands • 5,330 lines of Python
Overview
A single-file Python agent that connects to a Kali/Parrot attack box via SSH (or runs locally) and autonomously executes security tools, analyses their output, and plans next steps. It stores credentials for reuse, spawns parallel subagents, sprays credentials across services, builds attack graphs, maintains a persistent Pentest Task Tree as durable memory, and documents findings. Everything is driven by an LLM agentic loop using Claude or OpenAI models.
File: pentest_copilot.py
Version: 2.5.0
Lines: ~5,330
Agent Tools: 31
CLI Commands: 22
Pentest Tools: 60+ in registry
Playbooks: 5 (webapp, network, api, ad, cloud)
Recon Pipelines: 4 (full, quick, subdomain, stealth)
Python: 3.8+
Dependencies: anthropic or openai + paramiko
License: MIT
Quick Start
# Install dependencies
pip install anthropic paramiko # Claude (recommended)
pip install openai paramiko # OpenAI alternative
# SSH to a remote Kali attack box
export ANTHROPIC_API_KEY=sk-ant-...
python pentest_copilot.py --target 10.0.0.1 \
--ssh-host kali.local --ssh-user root --ssh-key ~/.ssh/id_rsa
# Run locally on a Kali/Parrot machine
python pentest_copilot.py --target 10.0.0.1 --local
# Stealth mode (rate limiting + IDS evasion flags)
python pentest_copilot.py --target 10.0.0.1 --local --stealth
# Use OpenAI GPT-4o
export OPENAI_API_KEY=sk-...
python pentest_copilot.py --target 10.0.0.1 --local \
--provider openai --model gpt-4o
# Use any OpenAI-compatible endpoint (Ollama, vLLM, etc.)
python pentest_copilot.py --target 10.0.0.1 --local \
--provider openai --model llama3 \
--base-url http://localhost:11434/v1
31 Agent Tools
Core (7)
| Tool | Description |
| --- | --- |
| run_command | Execute any bash command on the attack box |
| run_script | Write and execute Python scripts for custom exploits |
| install_tool | Install security tools on demand (apt, pip, go, git) |
| read_file | Read scan results, configs, exploit output |
| write_file | Create wordlists, exploit scripts, configs |
| report_finding | Document a vulnerability with severity, evidence, CVSS |
| ask_user | Ask for clarification, approval, or additional info |
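To make the tool model concrete, here is a minimal sketch of how a tool such as run_command might be declared and dispatched when running locally. The declaration follows the name / description / input_schema shape used by Anthropic's tool-use API. The `RUN_COMMAND_TOOL`, `HANDLERS`, and `dispatch` names, the `timeout` parameter, and the returned dictionary are assumptions made for illustration, not the code in pentest_copilot.py.

```python
import subprocess

# Hypothetical tool declaration in the name/description/input_schema shape
# used by Anthropic's tool-use API. The field contents are illustrative.
RUN_COMMAND_TOOL = {
    "name": "run_command",
    "description": "Execute a bash command on the attack box",
    "input_schema": {
        "type": "object",
        "properties": {
            "command": {"type": "string"},
            "timeout": {"type": "integer"},
        },
        "required": ["command"],
    },
}


def run_command(command, timeout=60):
    """Run a shell command locally and return its exit code and output."""
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return {"exit_code": proc.returncode, "output": proc.stdout + proc.stderr}


# Assumed registry layout: tool name -> handler function.
HANDLERS = {"run_command": run_command}


def dispatch(tool_name, tool_input):
    """Route a tool call requested by the model to its handler."""
    handler = HANDLERS.get(tool_name)
    if handler is None:
        return {"exit_code": -1, "output": f"unknown tool: {tool_name}"}
    return handler(**tool_input)
```

In an agentic loop, the model's tool-call name and arguments would be passed to `dispatch`, and the returned dictionary sent back as the tool result. An SSH-backed handler could be registered under the same name without changing the loop.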
Tier 1 — Parallelism & State (6)
| Tool | Description |
| --- | --- |
| spawn_subagent | Spawn background agents for concurrent tasks |
| store_credential | Store discovered credentials for cross-service reuse |
| list_credentials | List all credentials in the vault |
| open_shell | Open a named persistent shell session |
| run_in_shell | Run a command in a specific named shell |
| use_playbook | Load a methodology playbook (webapp, network, api, ad, cloud) |
Score findings by Impact × Exploitability / Detection Time with P1-P4 priority ratings
correlate_findings
Cross-tool deduplication — groups findings by host+port+CVE, boosts confidence when multiple tools agree
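The grouping behind correlate_findings can be sketched as follows. This is an illustration only: the finding dictionary keys (`host`, `port`, `cve`, `tool`, `confidence`), the 0 to 1 confidence scale, and the boost of 0.15 per additional agreeing tool are all assumptions, since the README does not specify them.

```python
from collections import defaultdict


def correlate_findings(findings, boost=0.15):
    """Group findings by (host, port, cve) and raise confidence on agreement.

    `findings` is a list of dicts with the assumed keys
    host, port, cve, tool, confidence. The boost value is hypothetical.
    """
    groups = defaultdict(list)
    for finding in findings:
        key = (finding["host"], finding["port"], finding["cve"])
        groups[key].append(finding)

    merged = []
    for (host, port, cve), items in groups.items():
        tools = sorted({item["tool"] for item in items})
        base = max(item["confidence"] for item in items)
        # Each additional distinct tool that agrees adds `boost`, capped at 1.0.
        confidence = min(1.0, base + boost * (len(tools) - 1))
        merged.append({
            "host": host,
            "port": port,
            "cve": cve,
            "tools": tools,
            "confidence": round(confidence, 2),
        })
    return merged
```

Two reports of the same CVE on the same host and port from different tools collapse into one entry with higher confidence, while a finding seen by only one tool keeps its original confidence.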
Planning — Pentest Task Tree (1)
| Tool | Description |
| --- | --- |
| manage_tasks | Maintain the Pentest Task Tree (add / add_many / update / list): a persistent, hierarchical plan re-injected into every prompt so the agent stays oriented even after chat history is trimmed |
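A minimal sketch of such a task tree is shown below, covering the four operations named above. The `TaskTree` class, its field names, and the status strings are hypothetical and chosen for illustration; the actual data model in pentest_copilot.py may differ.

```python
import itertools


class TaskTree:
    """Minimal hierarchical task list with add / add_many / update / list."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.tasks = {}  # task id -> {"title", "status", "parent"}

    def add(self, title, parent=None):
        """Add one task, optionally under an existing parent task."""
        if parent is not None and parent not in self.tasks:
            raise KeyError(f"unknown parent task: {parent}")
        task_id = next(self._ids)
        self.tasks[task_id] = {"title": title, "status": "todo", "parent": parent}
        return task_id

    def add_many(self, titles, parent=None):
        """Add several sibling tasks and return their ids."""
        return [self.add(title, parent) for title in titles]

    def update(self, task_id, status):
        """Change the status of an existing task."""
        self.tasks[task_id]["status"] = status

    def list(self, parent=None, depth=0):
        """Render the tree as indented lines suitable for a prompt."""
        lines = []
        for task_id, task in self.tasks.items():
            if task["parent"] == parent:
                indent = "  " * depth
                lines.append(f"{indent}[{task['status']}] {task_id}. {task['title']}")
                lines.extend(self.list(task_id, depth + 1))
        return lines
```

Joining the output of `list()` with newlines and prepending it to each request is one way to re-inject the plan, so the model sees the current state of the tree even when older messages have been dropped.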
4 Autonomous Recon Pipelines
| Pipeline | Tools Chained | Use Case |
| --- | --- | --- |
| full | nmap → whatweb → wafw00f → nikto → ffuf → nuclei | Comprehensive target assessment |
| quick | nmap → whatweb → ffuf | Fast initial sweep |
| subdomain | subfinder → httpx | Domain-level attack surface mapping |
| stealth | nmap (slow SYN) → whatweb | Evasive reconnaissance |
Each pipeline auto-captures evidence, respects stealth mode, and returns aggregated output.
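The chaining itself can be sketched as a loop over an ordered tool list. The tool order per pipeline comes from the table above; everything else here (the `run_pipeline` and `shell_runner` names, the caller-supplied command templates, and the continue-on-error behaviour) is an assumption for illustration, and evidence capture and stealth handling are left out.

```python
import subprocess

# Tool order per pipeline, as listed in the pipeline table.
PIPELINES = {
    "full": ["nmap", "whatweb", "wafw00f", "nikto", "ffuf", "nuclei"],
    "quick": ["nmap", "whatweb", "ffuf"],
    "subdomain": ["subfinder", "httpx"],
    "stealth": ["nmap", "whatweb"],
}


def shell_runner(command, timeout=600):
    """Run a command in a local shell and return combined stdout and stderr."""
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return proc.stdout + proc.stderr


def run_pipeline(name, target, templates, runner=shell_runner):
    """Run each stage in order and return the aggregated output.

    `templates` maps a tool name to a command template containing {target};
    it is supplied by the caller, so no tool flags are hard-coded here.
    """
    sections = []
    for tool in PIPELINES[name]:
        command = templates[tool].format(target=target)
        try:
            output = runner(command)
        except Exception as exc:  # record the failure, let later stages run
            output = f"[error] {exc}"
        sections.append(f"[RECON] {tool}: {command}\n{output.strip()}")
    return "\n\n".join(sections)
```

Passing the runner as a parameter keeps the loop independent of where commands execute: a local shell, an SSH session, or a stub used in tests.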
At each stage, the agent evaluates confidence and can reject false positives. Only validated findings are reported, which filters out noise from tools prone to false alarms.
Exploit Analysis Engine
Scores confirmed findings by Impact × Exploitability / Detection Time and assigns priority ratings:
Pentest Task Tree (PTT) — persistent externalized plan that survives history trimming; manage_tasks tool, /tasks view, session persistence; first unit-test suite
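The Impact × Exploitability / Detection Time score used by the Exploit Analysis Engine can be illustrated with a short sketch. Only the formula and the P1-P4 labels come from this README. The 1 to 10 input scales, the function names, and the threshold values below are hypothetical.

```python
def exploit_score(impact, exploitability, detection_time):
    """Compute Impact x Exploitability / Detection Time.

    The input scales (for example 1-10) are assumptions; detection_time
    must be positive to avoid division by zero.
    """
    if detection_time <= 0:
        raise ValueError("detection_time must be positive")
    return impact * exploitability / detection_time


# Hypothetical cut-offs: the real thresholds are not given in this README.
PRIORITY_THRESHOLDS = [(50.0, "P1"), (20.0, "P2"), (5.0, "P3")]


def priority(score):
    """Map a score to a P1-P4 label using the assumed thresholds."""
    for minimum, label in PRIORITY_THRESHOLDS:
        if score >= minimum:
            return label
    return "P4"
```

For example, `priority(exploit_score(9, 8, 1))` yields `"P1"` under these assumed thresholds, because the score is 72.0.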
Example Session
You: Use the webapp playbook and run a full recon pipeline on http://10.0.0.1
[PLAYBOOK] Loaded: Web Application Pentest
[Agent] Starting with automated reconnaissance.
[RECON PIPELINE] Running 'full' on http://10.0.0.1
──────────────────────────────────────────────────
[RECON] nmap: nmap -sV -sC -O -p- 10.0.0.1 ...
22/tcp open ssh OpenSSH 8.2p1
80/tcp open http Apache httpd 2.4.41
3306/tcp open mysql MySQL 5.7.33
[OK in 45.2s]
[RECON] whatweb: whatweb 10.0.0.1 ...
Apache 2.4.41, PHP 7.4.3, WordPress 5.7
[OK in 2.1s]
[RECON] ffuf: ffuf -u 10.0.0.1/FUZZ ...
/admin, /wp-login.php, /xmlrpc.php, /backup/
[OK in 18.4s]
[RECON] nuclei: nuclei -u 10.0.0.1 ...
[critical] CVE-2021-44228 Log4Shell
[high] CVE-2020-11023 jQuery XSS
[OK in 32.1s]
──────────────────────────────────────────────────
[RECON COMPLETE]
[Agent] Recon complete. Let me search for exploits on the discovered services.
[TOOL] smart_exploit_search: parsing services...
Found 3 services. Searching ExploitDB...
1. [METASPLOIT] Apache 2.4.49 Path Traversal (80/http)
2. [REMOTE] MySQL 5.7 Auth Bypass (3306/mysql)
3. [WEBAPP] WordPress 5.7 RCE (80/http)
[TOOL] credential_spray: 10.0.0.1 — 3 services
──────────────────────────────────────────────────
admin@ssh:22 ... failed
admin@mysql:3306 ... SUCCESS
──────────────────────────────────────────────────
[CRED] Stored: admin:P@s*** [password] @ mysql://10.0.0.1:3306
[ATTACK GRAPH] [Initial Access] SQL Injection in /login
[ATTACK GRAPH] [Credential Access] MySQL creds via brute-force
[FINDING] [CRITICAL] SQL Injection in login form
[FINDING] [HIGH] MySQL default credentials