Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -134,3 +134,4 @@ dmypy.json


.copilotignore
tests/e2e/attacks/
219 changes: 165 additions & 54 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,14 @@
<div align="center">

<p align="center">
<img src="https://docs.hackagent.dev/img/banner.svg" alt="HackAgent - AI Agent Security Testing Toolkit" width="800">
</p>

<strong>AI Security Red-Team Toolkit</strong>
<strong>AI Security Red-Team Toolkit</strong>

<br>

[App](https://app.hackagent.dev/) -- [Docs](https://docs.hackagent.dev/) -- [API](https://api.hackagent.dev/schema/redoc)


<br>

![Python Version](https://img.shields.io/badge/python-3.10%2B-blue)
Expand All @@ -20,93 +18,206 @@
![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)
![Test Coverage](https://img.shields.io/codecov/c/github/AISecurityLab/hackagent)
![CI Status](https://img.shields.io/github/actions/workflow/status/AISecurityLab/hackagent/ci.yml)
</div>

## Top Links

<br>
- [Introduction](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/introduction.mdx)

</div>
<details>
<summary><strong>Getting Started</strong></summary>

- [Section Source](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/getting-started)
- [Installation](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/getting-started/installation.mdx)
- [Quick Start](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/getting-started/quick-start.mdx)
- [Dashboard](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/getting-started/dashboard.mdx)
- [Quick Security Scan](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/getting-started/quick-security-scan.mdx)
- [Attack Tutorial](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/getting-started/attack-tutorial.mdx)
- [Datasets Tutorial](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/getting-started/datasets-tutorial.mdx)

## Overview
</details>

HackAgent is an open-source toolkit designed to help security researchers, developers and AI safety practitioners evaluate the security of AI agents.
It provides a structured approach to discover potential vulnerabilities, including prompt injection, jailbreaking techniques, and other attack vectors.
<details>
<summary><strong>AI Risks</strong></summary>

## 🔥 Features
- [Section Source](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/risks/index.mdx)
- [Jailbreak](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/risks/vulnerabilities.md)

- **Comprehensive Attack Library**: Pre-built techniques for prompt injections, jailbreaks, and goal hijacking
- **Modular Framework**: Easily extend with custom attack vectors and testing methodologies
- **Safety Focused**: Responsible disclosure guidelines and ethical usage recommendations
</details>

### 🔌 AI Agent Frameworks Supported
<details>
<summary><strong>Attacks</strong></summary>

[![LiteLLM](https://img.shields.io/badge/LiteLLM-blue?style=flat&logo=github)](https://github.com/BerriAI/litellm)
[![ADK](https://img.shields.io/badge/Google-ADK-green?style=flat&logo=openai)](https://google.github.io/adk-docs/)
[![OpenAI](https://img.shields.io/badge/OpenAI-SDK-412991?style=flat&logo=openai)](https://platform.openai.com/docs)
- [Section Source](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/index.mdx)
- [AdvPrefix](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/advprefix.md)
- [AutoDAN-Turbo](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/autodan_turbo.md)
- [PAIR](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/pair.md)
- [TAP](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/tap.md)
- [FlipAttack](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/flipattack.md)
- [BoN](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/bon.md)
- [h4rm3l](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/h4rm3l.md)
- [CipherChat](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/cipherchat.md)
- [PAP](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/pap.md)
- [Baseline](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/attacks/baseline.md)

## 🚀 Installation
</details>

<details>
<summary><strong>Datasets</strong></summary>

### Installation from PyPI
- [Section Source](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/datasets/index.md)
- [Presets](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/datasets/presets.md)
- [Hugging Face](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/datasets/huggingface.md)
- [File Provider](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/datasets/file.md)
- [Custom Providers](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/datasets/custom-providers.md)
- [Troubleshooting](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/datasets/troubleshooting.md)

HackAgent can be installed directly from PyPI:
</details>

```bash
# With uv (recommended)
uv add hackagent
<details>
<summary><strong>Agents</strong></summary>

# Or with pip
pip install hackagent
```
- [Section Source](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/agents/index.mdx)
- [Ollama](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/agents/ollama.mdx)
- [OpenAI SDK](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/agents/openai-sdk.mdx)
- [Google ADK](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/agents/google-adk.mdx)

</details>

<details>
<summary><strong>CLI Reference</strong></summary>

- [Section Source](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/cli)
- [Overview](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/cli/overview.md)
- [Initialization](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/cli/initialization.md)
- [Configuration](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/cli/config.md)
- [Attack](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/cli/attack.mdx)
- [Results](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/cli/results.md)

</details>

<details>
<summary><strong>SDK Reference</strong></summary>

- [Section Source](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/api-index.md)
- [Agent](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/hackagent/agent.md)
- [Errors](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/hackagent/errors.md)
- [Logger](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/hackagent/logger.md)
- [Utils](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/hackagent/utils.md)
- [Router](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/hackagent/router)
- [Attacks](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/hackagent/attacks)
- [Datasets](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/hackagent/datasets)
- [Risks](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/hackagent/risks)
- [Server](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/hackagent/server)

</details>

<details>
<summary><strong>Security & Ethics</strong></summary>

- [Section Source](https://github.com/AISecurityLab/hackagent/tree/main/docs/docs/security)
- [Responsible Disclosure](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/security/responsible-disclosure.md)
- [Ethical Guidelines](https://github.com/AISecurityLab/hackagent/blob/main/docs/docs/security/ethical-guidelines.md)

</details>

## What is HackAgent?

HackAgent is a comprehensive Python SDK and CLI designed to help security researchers, developers, and AI safety practitioners evaluate and strengthen the security of AI agents.

## 📚 Quick Start

Run the interactive CLI to start testing your AI agents:

As AI agents become more powerful and autonomous, they face security challenges that traditional testing tools cannot address:

| Threat | Description |
|--------|-------------|
| **Prompt Injection** | Malicious inputs that hijack agent behavior |
| **Jailbreaking** | Bypassing safety guardrails and content filters |
| **Goal Hijacking** | Manipulating agents to pursue unintended objectives |
| **Tool Misuse** | Exploiting agent capabilities for unauthorized actions |

HackAgent automates testing for these vulnerabilities using research-backed attack techniques, helping you identify and fix security issues before they are exploited.

<div align="center">
<img src="docs/static/gifs/terminal.gif" alt="HackAgent CLI Demo" width="100%" />
<p><em>Interactive TUI with real-time attack progress and visual reporting.</em></p>
</div>

## Get Started Now

### Quick Install

```bash
hackagent
python3 -m venv .venv
source .venv/bin/activate
pip install git+https://github.com/AISecurityLab/HackAgent.git
```

Or use the SDK:
No API key required: HackAgent works locally out of the box.


```python
from hackagent import HackAgent, AgentTypeEnum
Questions? Join [community discussions](https://github.com/AISecurityLab/hackagent/discussions) or email ais@ai4i.it.

agent = HackAgent(
name="my_agent",
endpoint="http://localhost:8000",
agent_type=AgentTypeEnum.GOOGLE_ADK
)
## Architecture

results = agent.hack(attack_config={
"attack_type": "advprefix",
"goals": ["Test goal"],
# ... generator and judges config
})
HackAgent uses a modular pipeline to test agent robustness end-to-end.

| Component | Description |
|-----------|-------------|
| **Attack Engine** | Orchestrates attacks using AdvPrefix, AutoDAN-Turbo, PAIR, TAP, FlipAttack, BoN, h4rm3l, CipherChat, PAP, and Baseline |
| **Generator** | LLM role that creates adversarial prompts to test the target agent |
| **Judge** | LLM role that evaluates whether attacks bypass safety measures |
| **Target Agent** | Your AI agent under test across supported frameworks |
| **Datasets** | Pre-built benchmark presets plus custom HuggingFace/file datasets |

## Supported Frameworks

[![Google ADK](https://img.shields.io/badge/Google-ADK-green?style=for-the-badge&logo=google)](https://google.github.io/adk-docs/)
[![OpenAI SDK](https://img.shields.io/badge/OpenAI-SDK-412991?style=for-the-badge&logo=openai)](https://platform.openai.com/docs)
[![LiteLLM](https://img.shields.io/badge/LiteLLM-blue?style=for-the-badge&logo=github)](https://github.com/BerriAI/litellm)
[![LangChain](https://img.shields.io/badge/LangChain-1C3C3C?style=for-the-badge)](https://python.langchain.com)

## Reporting

HackAgent supports both local and remote reporting.

- Local mode stores test results in SQLite and includes a built-in dashboard.
- Cloud mode syncs runs to the HackAgent remote platform when an API key is configured.

```bash
hackagent web
```

Obtain your credentials at [https://app.hackagent.dev](https://app.hackagent.dev)
Access cloud reporting at [https://app.hackagent.dev](https://app.hackagent.dev).

## Responsible Use

For detailed examples and advanced usage, visit our [documentation](https://docs.hackagent.dev).
HackAgent is designed for authorized security testing only. Always obtain explicit permission before testing any AI system.

## 📊 Reporting
### Do

HackAgent automatically sends test results to the dashboard for analysis and visualization.
- Test your own agents
- Conduct authorized pentesting
- Follow coordinated disclosure
- Share security knowledge responsibly

Access your dashboard at [https://app.hackagent.dev](https://app.hackagent.dev)
### Don't

## 🤝 Contributing
- Test systems without permission
- Exploit vulnerabilities maliciously
- Violate terms of service
- Share harmful exploit instructions irresponsibly

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) and [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) for guidelines.
Read the full guidelines: [Responsible Disclosure](docs/docs/security/responsible-disclosure.md)

## 📜 License
## Contributing

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) and [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md).

## ⚠️ Disclaimer
## License

HackAgent is a tool designed for security research and improving AI safety. Always obtain proper authorization before testing any AI systems. The authors are not responsible for any misuse of this software.
Licensed under Apache-2.0. See [LICENSE](LICENSE).

---
## Disclaimer

*This project is for educational and research purposes. Always use responsibly and ethically.*
HackAgent is intended for security research and AI safety improvement. The authors are not responsible for misuse.
9 changes: 4 additions & 5 deletions docs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,10 +5,10 @@ This directory contains the documentation generation tools for HackAgent.
## Quick Start

```bash
# Generate docs for latest PyPI version (default)
# Generate docs for current local project version (default)
poetry run python docs/scripts/generate_docs.py

# Generate docs for specific version
# Generate docs with a custom version label
poetry run python docs/scripts/generate_docs.py --version 0.2.4

# Generate docs for current local version
Expand All @@ -26,7 +26,7 @@ npm run start

## View Documentation

After generation, the API reference will be available directly in `docs/docs/` and integrated into the main documentation site.
After generation, the SDK reference will be available directly in `docs/docs/` and integrated into the main documentation site.

```bash
# View documentation locally
Expand All @@ -37,13 +37,12 @@ cd docs && npm start

- Poetry: `curl -sSL https://install.python-poetry.org | python3 -`
- Node.js 18+ and npm
- Internet connection (for fetching PyPI versions)

## What the Script Does

The script automatically handles:
- Installing documentation dependencies via Poetry
- Fetching version information from PyPI
- Reading the project version from local `pyproject.toml` (or a custom `--version` label)
- Generating Markdown files from Python docstrings using pydoc-markdown
- Creating proper Docusaurus-compatible output
- Copying generated files to the correct locations for the documentation site
Expand Down
Loading
Loading