Rust-powered auditable SQL injection detection for Python 🐍. Built for AI agents ✨.
AI agents now write and execute SQL directly. Text-to-SQL pipelines, MCP database connectors, LangChain SQL toolkits, and autonomous data analysts all generate queries on the fly. That creates a new attack surface: an LLM that can be prompted, jailbroken, or simply confused into generating '; DROP TABLE users; -- and then running it.
sqlhund is a runtime guardrail for that. It detects SQL injection patterns and classifies them by CWE and CAPEC. The detection engine is written in Rust with Python bindings via PyO3, so it is fast enough to sit in an agent's hot path. Every match is classified along two axes: technique (HOW the injection works) and impact (WHAT it can do). You get structured, auditable threat intelligence instead of a bare boolean.
>>> import sqlhund
>>> sqlhund.is_query_malicious("SELECT * FROM users WHERE id = 1")
False
>>> sqlhund.is_query_malicious("' OR 1=1 --")
True
Note
The primary goal is to prevent AI agents from modifying the data in your database or its structure.
- Installation
- Quick Start
- Usage Examples
- Features
- How sqlhund Compares
- Security Classification
- Supported Databases
- Benchmarks
- Building from Source
- Testing
- Contributing
pip install sqlhund # pip
poetry add sqlhund # poetry
uv add sqlhund # uv
Requires Python 3.10+. No runtime dependencies.
sqlhund exposes two functions. That's the entire API.
import sqlhund
# Boolean check
sqlhund.is_query_malicious("SELECT * FROM users; DROP TABLE users")
# True
# Detailed analysis with CWE/CAPEC classification
sqlhund.analyze_query("SELECT * FROM users; DROP TABLE users")
# {
#     'is_malicious': True,
#     'matches': {
#         'general': [{
#             'technique': ['CWE-89'],           # HOW: SQL Injection
#             'impact': ['CWE-285', 'CWE-471'],  # WHAT: Auth bypass + data tampering
#             'capec': [66]                      # CAPEC-66: SQL Injection
#         }]
#     }
# }
# Safe queries pass through cleanly
sqlhund.analyze_query("SELECT id FROM users WHERE id = 1")
# {'is_malicious': False, 'matches': {}}
See Detailed Threat Analysis below for the file-operation and multi-database detection cases.
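Because `analyze_query` returns a plain dictionary, its matches can be flattened into one record per pattern, which is convenient for audit logs. The sketch below works only on the result shape shown above and does not call sqlhund itself; `flatten_matches` and the record field names are hypothetical helpers, not part of sqlhund's API.

```python
import json


def flatten_matches(query: str, result: dict) -> list[dict]:
    """Turn an analyze_query-style result into one audit record per match.

    Hypothetical helper. Assumes the documented shape:
    {'is_malicious': bool, 'matches': {db_name: [{technique, impact, capec}]}}
    """
    records = []
    for db_name, patterns in result.get("matches", {}).items():
        for pattern in patterns:
            records.append({
                "query": query,
                "database": db_name,
                "technique": pattern["technique"],
                "impact": pattern["impact"],
                "capec": pattern["capec"],
            })
    return records


# Result copied from the example above, so the sketch runs without sqlhund installed.
sample_query = "SELECT * FROM users; DROP TABLE users"
sample_result = {
    "is_malicious": True,
    "matches": {
        "general": [{
            "technique": ["CWE-89"],
            "impact": ["CWE-285", "CWE-471"],
            "capec": [66],
        }]
    },
}

audit_records = flatten_matches(sample_query, sample_result)
for record in audit_records:
    print(json.dumps(record))  # one JSON line per detected pattern
```

A benign result has an empty `matches` dictionary, so the helper returns an empty list and nothing is logged.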
from langchain.tools import tool
import sqlhund
@tool
def execute_sql(query: str) -> str:
    """Execute a SQL query against the database. Rejects malicious queries."""
    if sqlhund.is_query_malicious(query):
        return "Query rejected: potential SQL injection detected."
    # `database` is a placeholder for your own database client
    return database.execute(query)
sqlhund sits between the LLM's output and your database. A query flagged as malicious never reaches the database, no matter how the prompt was crafted.
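The same guard pattern can be tried end to end with the standard-library `sqlite3` module standing in for the `database` placeholder. This is a minimal sketch, not sqlhund's own integration: `guarded_execute` and `demo_detector` are hypothetical names, and the detector is passed in as a callable so the example runs without the package installed.

```python
import sqlite3
from typing import Callable


def guarded_execute(
    conn: sqlite3.Connection,
    query: str,
    is_malicious: Callable[[str], bool],
) -> list[tuple]:
    """Run `query` only if the detector does not flag it."""
    if is_malicious(query):
        raise PermissionError("Query rejected: potential SQL injection detected.")
    return conn.execute(query).fetchall()


def demo_detector(query: str) -> bool:
    """Toy stand-in for sqlhund.is_query_malicious: flags stacked statements only."""
    return ";" in query.rstrip().rstrip(";")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('ada')")

# A plain read passes the guard and reaches the database.
rows = guarded_execute(conn, "SELECT id, name FROM users", demo_detector)
print(rows)

# A stacked DROP is rejected before the connection ever sees it.
try:
    guarded_execute(conn, "SELECT * FROM users; DROP TABLE users", demo_detector)
    rejected = False
except PermissionError:
    rejected = True
print(rejected)
```

In an agent, the callable to pass would be `sqlhund.is_query_malicious`; the toy detector exists only to keep the sketch self-contained and is far weaker than a real check.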
import sqlhund
def execute_ai_query(query: str):
    """Execute AI-generated SQL with injection protection."""
    if sqlhund.is_query_malicious(query):
        raise ValueError("Potential SQL injection detected")
    # Safe to execute
    return database.execute(query)
result = sqlhund.analyze_query("SELECT * FROM users WHERE id = 1 OR 1=1")
if result['is_malicious']:
    for db_name, patterns in result['matches'].items():
        print(f"Database: {db_name}")
        for pattern in patterns:
            print(f" Technique: {pattern['technique']}")  # CWE-89
            print(f" Impact: {pattern['impact']}")  # CWE-285
            print(f" CAPEC: {pattern['capec']}")  # 66
File-operation attacks are detected too, scoped to the database they target:
sqlhund.analyze_query("SELECT load_extension('evil')")
# {
#     'is_malicious': True,
#     'matches': {
#         'sqlite': [{
#             'technique': ['CWE-89', 'CWE-610', 'CWE-114'],
#             'impact': ['CWE-200', 'CWE-285'],
#             'capec': [470]
#         }]
#     }
# }
def sanitize_search_query(user_input: str) -> str:
    """Validate search input before building SQL."""
    test_query = f"SELECT * FROM products WHERE name LIKE '%{user_input}%'"
    if sqlhund.is_query_malicious(test_query):
        raise ValueError("Invalid search term")
    return user_input
- Fast: core detection engine written in Rust, compiled to a native Python extension
- Accurate: 100% precision and recall on the RbSQLi benchmark of 10M+ labeled queries (zero false positives, zero false negatives)
- Multi-database: detects injection patterns targeting SQLite, PostgreSQL, and DuckDB
- Zero dependencies: ships as a self-contained native wheel
- AI-agent ready: built as a guardrail for LLM-generated SQL
- Security classification: maps detected patterns to CWE and CAPEC taxonomies for threat intelligence
| Tool | What it does | Where it fits |
|---|---|---|
| sqlhund | Pattern detection plus CWE/CAPEC classification, Python-native | Runtime guardrail for AI-generated or agent-relayed SQL |
| libinjection | C library, pattern-based SQLi/XSS detection | Closest classic analog. No native Python bindings, no CWE/CAPEC mapping |
| sqlmap | Active penetration-testing scanner | Offensive testing, not a runtime guard |
| Parameterized queries | Prevents injection at query-construction time | The right long-term fix, but it doesn't help when an LLM generates SQL whose structure isn't known ahead of time |
Note
sqlhund doesn't replace parameterized queries. It covers the case parameterization can't: SQL whose structure is generated dynamically by a model.
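Where the query structure is fixed and only values vary, parameterization remains the right tool. The sketch below uses the standard-library `sqlite3` module and a hypothetical `users` table to contrast a bound parameter with string concatenation for the same hostile input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])

hostile_input = "' OR 1=1 --"

# Parameterized: the driver binds the value, so the quote and comment
# marker are compared as literal text and match no row.
bound_rows = conn.execute(
    "SELECT id, name FROM users WHERE name = ?", (hostile_input,)
).fetchall()

# String-built: the same input rewrites the WHERE clause and matches every row.
concatenated_rows = conn.execute(
    f"SELECT id, name FROM users WHERE name = '{hostile_input}'"
).fetchall()

print(len(bound_rows))         # 0
print(len(concatenated_rows))  # 2
```

Binding only protects values. When a model writes the whole statement there is no fixed template to bind into, which is the gap a runtime check is meant to cover.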
sqlhund classifies detected patterns using industry-standard security frameworks.
Every detected pattern is analyzed across two independent axes:
- Technique (HOW): CWE identifiers describing the injection mechanism
  - CWE-89: SQL Injection
  - CWE-610: External Resource Reference (file operations)
  - CWE-94/95: Code/Eval Injection
  - CWE-77/78: Command/OS Command Injection
  - CWE-114: Process Control (loading untrusted libraries)
  - CWE-116/184: Encoding evasion and filter bypass
- Impact (WHAT): CWE identifiers describing the attack consequences
  - CWE-200: Information Disclosure
  - CWE-285: Authorization Bypass
  - CWE-269: Privilege Escalation
  - CWE-471: Data Tampering
  - CWE-400: Resource Exhaustion (DoS)
  - CWE-208: Timing Side-Channel (blind injection)
  - CWE-497: System Information Exposure
Matches are also mapped to CAPEC attack pattern IDs:
- CAPEC-66: SQL Injection
- CAPEC-7: Blind SQL Injection
- CAPEC-54: Query System for Information
- CAPEC-470: Expanding Control over the OS from the Database
- CAPEC-664: Server-Side Request Forgery
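The identifiers in a match are terse, so a small lookup table can make alerts readable. This is a sketch with labels copied from the lists above (a subset of the CWE entries); `CWE_LABELS`, `CAPEC_LABELS`, and `describe_match` are hypothetical names, not something sqlhund ships.

```python
# Labels taken from this README's own lists; extend as needed.
CWE_LABELS = {
    "CWE-89": "SQL Injection",
    "CWE-610": "External Resource Reference",
    "CWE-114": "Process Control",
    "CWE-200": "Information Disclosure",
    "CWE-285": "Authorization Bypass",
    "CWE-269": "Privilege Escalation",
    "CWE-471": "Data Tampering",
    "CWE-400": "Resource Exhaustion (DoS)",
    "CWE-208": "Timing Side-Channel",
    "CWE-497": "System Information Exposure",
}
CAPEC_LABELS = {
    66: "SQL Injection",
    7: "Blind SQL Injection",
    54: "Query System for Information",
    470: "Expanding Control over the OS from the Database",
    664: "Server-Side Request Forgery",
}


def describe_match(match: dict) -> str:
    """Render one match dict as a single human-readable line."""
    def label_cwes(ids: list[str]) -> str:
        return ", ".join(f"{i} ({CWE_LABELS.get(i, 'unlisted')})" for i in ids)

    capec = ", ".join(
        f"CAPEC-{n} ({CAPEC_LABELS.get(n, 'unlisted')})" for n in match["capec"]
    )
    return (
        f"technique: {label_cwes(match['technique'])}; "
        f"impact: {label_cwes(match['impact'])}; "
        f"attack pattern: {capec}"
    )


# Match copied from the load_extension example earlier in this README.
sqlite_match = {
    "technique": ["CWE-89", "CWE-610", "CWE-114"],
    "impact": ["CWE-200", "CWE-285"],
    "capec": [470],
}
summary = describe_match(sqlite_match)
print(summary)
```

Identifiers missing from the tables are rendered as `unlisted` rather than raising, so the helper keeps working if a result contains IDs not covered here.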
sqlhund detects patterns from OWASP Top 10 A03:2021 - Injection, covering:
- SQL Injection (CWE-89)
- Command Injection (CWE-77, CWE-78)
- Code Injection (CWE-94, CWE-95)
- File/Resource Injection (CWE-610)
Resources:
- OWASP SQL Injection Prevention Cheat Sheet
- OWASP Query Parameterization Cheat Sheet
- OWASP Injection Prevention Cheat Sheet
- CWE-89: SQL Injection
- CAPEC-66: SQL Injection
sqlhund detects database-specific injection patterns for:
| Database | Detection Patterns |
|---|---|
| General | UNION, comments, tautologies, subqueries, time delays |
| SQLite | load_extension, ATTACH, PRAGMA, virtual tables |
| PostgreSQL | pg_read_file, COPY, DO blocks, dblink, extensions |
| DuckDB | read_csv, ATTACH, httpfs, CREATE SECRET, macros |
Evaluated against the RbSQLi dataset: 10,304,026 labeled SQL queries (2,813,146 malicious, 7,490,880 benign).
| | Predicted Malicious | Predicted Benign |
|---|---|---|
| Actual Malicious | 2,813,146 | 0 |
| Actual Benign | 0 | 7,490,880 |
Precision: 100% · Recall: 100% · Accuracy: 100%
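The headline metrics follow directly from the confusion matrix. The short calculation below recomputes them from the four cell counts reported above, using the standard definitions of precision, recall, and accuracy.

```python
# Cell counts from the confusion matrix above.
true_positives = 2_813_146   # malicious, predicted malicious
false_negatives = 0          # malicious, predicted benign
false_positives = 0          # benign, predicted malicious
true_negatives = 7_490_880   # benign, predicted benign

total = true_positives + false_negatives + false_positives + true_negatives

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
accuracy = (true_positives + true_negatives) / total

print(f"queries:   {total:,}")
print(f"precision: {precision:.2%}")
print(f"recall:    {recall:.2%}")
print(f"accuracy:  {accuracy:.2%}")
```

The total of 10,304,026 matches the dataset size quoted for RbSQLi, which confirms the matrix accounts for every labeled query.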
Requires Rust, Maturin, and uv.
git clone https://github.com/KanishkNavale/sqlhund
cd sqlhund
make dev # set up development environment
make build # compile debug build
make release # compile optimized release build
Run unit tests (Rust + Python):
make unittest
Run evaluation against the full RbSQLi dataset (download the dataset and place it at tests/data/wild.csv):
make wildtest
Contributions are welcome. See the open issues or submit a pull request.
This project is licensed under the MIT License.