sqlhund

Rust-powered auditable SQL injection detection for Python 🐍. Built for AI agents ✨.

AI agents now write and execute SQL directly. Text-to-SQL pipelines, MCP database connectors, LangChain SQL toolkits, and autonomous data analysts all generate queries on the fly. That creates a new attack surface: an LLM that can be prompted, jailbroken, or simply confused into generating '; DROP TABLE users; -- and then running it.

sqlhund is a runtime guardrail for that. It detects SQL injection patterns and classifies them by CWE and CAPEC. The detection engine is written in Rust with Python bindings via PyO3, so it can sit in an agent's hot path without adding noticeable latency. Every match is classified along two axes: technique (HOW the injection works) and impact (WHAT it can do). You get structured, auditable threat intelligence instead of a bare boolean.

>>> import sqlhund
>>> sqlhund.is_query_malicious("SELECT * FROM users WHERE id = 1")
False
>>> sqlhund.is_query_malicious("' OR 1=1 --")
True

Note

The primary goal is to prevent AI agents from manipulating the data in your database or altering its structure.

Installation

pip install sqlhund  # pip
poetry add sqlhund   # poetry
uv add sqlhund       # uv

Requires Python 3.10+. No runtime dependencies.

Quick Start

sqlhund exposes two functions. That's the entire API.

import sqlhund

# Boolean check
sqlhund.is_query_malicious("SELECT * FROM users; DROP TABLE users")
# True

# Detailed analysis with CWE/CAPEC classification
sqlhund.analyze_query("SELECT * FROM users; DROP TABLE users")
# {
#     'is_malicious': True,
#     'matches': {
#         'general': [{
#             'technique': ['CWE-89'],           # HOW: SQL Injection
#             'impact': ['CWE-285', 'CWE-471'],  # WHAT: Auth bypass + data tampering
#             'capec': [66]                      # CAPEC-66: SQL Injection
#         }]
#     }
# }

# Safe queries pass through cleanly
sqlhund.analyze_query("SELECT id FROM users WHERE id = 1")
# {'is_malicious': False, 'matches': {}}

See Detailed Threat Analysis below for the file-operation and multi-database detection cases.
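
The nested `matches` structure in the Quick Start output can be flattened for display or logging. The following is a minimal sketch that works only on the result shape documented in this section. `flatten_matches` is a hypothetical helper, not part of sqlhund, and the sample dict is copied from the output shown earlier so the snippet runs without the library installed.

```python
from typing import Any


def flatten_matches(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn the nested `matches` mapping into one flat row per matched pattern.

    Hypothetical helper (not part of sqlhund); assumes the result shape
    shown in the Quick Start output.
    """
    rows = []
    for db_name, patterns in result.get("matches", {}).items():
        for pattern in patterns:
            rows.append(
                {
                    "database": db_name,
                    "technique": list(pattern["technique"]),
                    "impact": list(pattern["impact"]),
                    "capec": list(pattern["capec"]),
                }
            )
    return rows


# Sample result copied from the Quick Start output.
sample = {
    "is_malicious": True,
    "matches": {
        "general": [
            {
                "technique": ["CWE-89"],
                "impact": ["CWE-285", "CWE-471"],
                "capec": [66],
            }
        ]
    },
}

for row in flatten_matches(sample):
    print(row["database"], row["technique"], row["impact"], row["capec"])
```

In real use the argument would be the return value of `sqlhund.analyze_query(...)`. A safe query yields an empty list because its `matches` mapping is empty.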

Usage Examples

Guarding an LLM Agent's SQL Tool Calls

from langchain.tools import tool
import sqlhund

@tool
def execute_sql(query: str) -> str:
    """Execute a SQL query against the database. Rejects malicious queries."""
    if sqlhund.is_query_malicious(query):
        return "Query rejected: potential SQL injection detected."
    # `database` is a placeholder for your own connection or client object.
    return database.execute(query)

sqlhund sits between the LLM's output and your database. A query that sqlhund flags never reaches the database, no matter how the prompt was crafted.

Validating AI-Generated SQL

import sqlhund

def execute_ai_query(query: str):
    """Execute AI-generated SQL with injection protection."""
    if sqlhund.is_query_malicious(query):
        raise ValueError("Potential SQL injection detected")

    # Not flagged: pass the query through to the database
    return database.execute(query)
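
The guard pattern in the two examples above is easier to unit-test when the detector is passed in as an argument. The following is a minimal sketch, assuming a SQLite connection stands in for the database. `guarded_execute`, `RejectedQuery`, and `fake_detector` are hypothetical names for illustration; in real use the detector argument would be `sqlhund.is_query_malicious`.

```python
import sqlite3
from typing import Callable


class RejectedQuery(ValueError):
    """Raised when the detector flags a query (hypothetical name)."""


def guarded_execute(
    conn: sqlite3.Connection,
    query: str,
    is_malicious: Callable[[str], bool],
) -> list[tuple]:
    """Run `query` only if the detector does not flag it."""
    if is_malicious(query):
        raise RejectedQuery("Potential SQL injection detected")
    return conn.execute(query).fetchall()


def fake_detector(query: str) -> bool:
    # Stand-in so the sketch runs without sqlhund installed.
    # In real use, pass sqlhund.is_query_malicious instead.
    return "drop table" in query.lower()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('ada')")

print(guarded_execute(conn, "SELECT id, name FROM users", fake_detector))

try:
    guarded_execute(conn, "SELECT * FROM users; DROP TABLE users", fake_detector)
except RejectedQuery as exc:
    print(f"rejected: {exc}")
```

Raising a dedicated exception type lets the caller distinguish a rejected query from an ordinary database error.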

Detailed Threat Analysis

import sqlhund

result = sqlhund.analyze_query("SELECT * FROM users WHERE id = 1 OR 1=1")

if result['is_malicious']:
    for db_name, patterns in result['matches'].items():
        print(f"Database: {db_name}")
        for pattern in patterns:
            print(f"  Technique: {pattern['technique']}")  # CWE-89
            print(f"  Impact: {pattern['impact']}")        # CWE-285
            print(f"  CAPEC: {pattern['capec']}")          # 66

File-operation attacks are detected too, scoped to the database they target:

sqlhund.analyze_query("SELECT load_extension('evil')")
# {
#     'is_malicious': True,
#     'matches': {
#         'sqlite': [{
#             'technique': ['CWE-89', 'CWE-610', 'CWE-114'],
#             'impact': ['CWE-200', 'CWE-285'],
#             'capec': [470]
#         }]
#     }
# }
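
Because each match carries CWE and CAPEC identifiers, a flagged query can be written to an audit log as a structured record. The following is a minimal sketch, assuming the result shape shown above. `audit_record` and its field names are hypothetical choices, not a log format defined by sqlhund.

```python
import hashlib
import json
from typing import Any


def audit_record(query: str, result: dict[str, Any]) -> str:
    """Serialize one analysis result as a JSON log line.

    Hypothetical helper; field names are illustrative. The query is stored
    as a SHA-256 digest so the log does not hold raw payloads.
    """
    techniques, impacts, capecs = set(), set(), set()
    for patterns in result["matches"].values():
        for pattern in patterns:
            techniques.update(pattern["technique"])
            impacts.update(pattern["impact"])
            capecs.update(pattern["capec"])
    record = {
        "query_sha256": hashlib.sha256(query.encode("utf-8")).hexdigest(),
        "is_malicious": result["is_malicious"],
        "databases": sorted(result["matches"]),
        "technique": sorted(techniques),
        "impact": sorted(impacts),
        "capec": sorted(capecs),
    }
    return json.dumps(record, sort_keys=True)


# Sample result copied from the load_extension example above.
sample = {
    "is_malicious": True,
    "matches": {
        "sqlite": [
            {
                "technique": ["CWE-89", "CWE-610", "CWE-114"],
                "impact": ["CWE-200", "CWE-285"],
                "capec": [470],
            }
        ]
    },
}

print(audit_record("SELECT load_extension('evil')", sample))
```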

Pre-screening User Input

import sqlhund

def sanitize_search_query(user_input: str) -> str:
    """Validate search input before building SQL."""
    test_query = f"SELECT * FROM products WHERE name LIKE '%{user_input}%'"

    if sqlhund.is_query_malicious(test_query):
        raise ValueError("Invalid search term")

    return user_input
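
A screened search term is still untrusted input, so binding it as a parameter when the final query runs keeps it out of the SQL text entirely. The following is a minimal sketch using the standard library's `sqlite3` module. The `products` table contents and the `search_products` function are hypothetical, and the sqlhund pre-screening step from the example above is assumed to have run first.

```python
import sqlite3


def search_products(conn: sqlite3.Connection, term: str) -> list[tuple]:
    """Search by name with the term bound as a parameter, not interpolated."""
    # The wildcard characters are added to the bound value, so the SQL text
    # itself never contains user input.
    pattern = f"%{term}%"
    cursor = conn.execute(
        "SELECT id, name FROM products WHERE name LIKE ? ORDER BY id",
        (pattern,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("red lamp",), ("blue lamp",), ("desk",)],
)

print(search_products(conn, "lamp"))
# A quote in the term is treated as data, not as SQL syntax.
print(search_products(conn, "' OR 1=1 --"))
```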

Features

Fast: core detection engine written in Rust, compiled to a native Python extension
Accurate: 100% precision and recall on the RbSQLi benchmark of roughly 10.3 million labeled queries (zero false positives, zero false negatives)
Multi-database: detects injection patterns targeting SQLite, PostgreSQL, and DuckDB
Zero dependencies: ships as a self-contained native wheel
AI-agent ready: built as a guardrail for LLM-generated SQL
Security classification: maps detected patterns to CWE and CAPEC taxonomies for threat intelligence

How sqlhund Compares

Tool	What it does	Where it fits
sqlhund	Pattern detection plus CWE/CAPEC classification, Python-native	Runtime guardrail for AI-generated or agent-relayed SQL
libinjection	C library, pattern-based SQLi/XSS detection	Closest classic analog. No native Python bindings, no CWE/CAPEC mapping
sqlmap	Active penetration-testing scanner	Offensive testing, not a runtime guard
Parameterized queries	Prevents injection at query-construction time	The right long-term fix, but it doesn't help when an LLM generates SQL whose structure isn't known ahead of time

Note

sqlhund doesn't replace parameterized queries. It covers the case parameterization can't: SQL whose structure is generated dynamically by a model.

Security Classification

sqlhund classifies detected patterns using industry-standard security frameworks.

Dual-Axis CWE Analysis

Every detected pattern is analyzed across two independent axes:

Technique (HOW): CWE identifiers describing the injection mechanism
- CWE-89: SQL Injection
- CWE-610: External Resource Reference (file operations)
- CWE-94/95: Code/Eval Injection
- CWE-77/78: Command/OS Command Injection
- CWE-114: Process Control (loading untrusted libraries)
- CWE-116/184: Encoding evasion and filter bypass

Impact (WHAT): CWE identifiers describing the attack consequences
- CWE-200: Information Disclosure
- CWE-285: Authorization Bypass
- CWE-269: Privilege Escalation
- CWE-471: Data Tampering
- CWE-400: Resource Exhaustion (DoS)
- CWE-208: Timing Side-Channel (blind injection)
- CWE-497: System Information Exposure
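
The identifiers listed above can be turned into readable labels when reporting a match. In the following minimal sketch, the lookup tables are transcribed from the two lists in this section, with combined entries such as CWE-94/95 keyed individually under their shared label. `describe_match` is a hypothetical helper, not part of sqlhund.

```python
# Labels transcribed from the technique and impact lists above.
TECHNIQUE_LABELS = {
    "CWE-89": "SQL Injection",
    "CWE-610": "External Resource Reference (file operations)",
    "CWE-94": "Code/Eval Injection",
    "CWE-95": "Code/Eval Injection",
    "CWE-77": "Command/OS Command Injection",
    "CWE-78": "Command/OS Command Injection",
    "CWE-114": "Process Control (loading untrusted libraries)",
    "CWE-116": "Encoding evasion and filter bypass",
    "CWE-184": "Encoding evasion and filter bypass",
}

IMPACT_LABELS = {
    "CWE-200": "Information Disclosure",
    "CWE-285": "Authorization Bypass",
    "CWE-269": "Privilege Escalation",
    "CWE-471": "Data Tampering",
    "CWE-400": "Resource Exhaustion (DoS)",
    "CWE-208": "Timing Side-Channel (blind injection)",
    "CWE-497": "System Information Exposure",
}


def describe_match(pattern: dict) -> dict:
    """Attach a label to each CWE ID in one match (hypothetical helper).

    IDs missing from the tables are kept and labeled 'unlisted'.
    """
    return {
        "technique": [
            f"{cwe}: {TECHNIQUE_LABELS.get(cwe, 'unlisted')}"
            for cwe in pattern["technique"]
        ],
        "impact": [
            f"{cwe}: {IMPACT_LABELS.get(cwe, 'unlisted')}"
            for cwe in pattern["impact"]
        ],
    }


# One match entry, copied from the Quick Start output.
match = {"technique": ["CWE-89"], "impact": ["CWE-285", "CWE-471"], "capec": [66]}
print(describe_match(match))
```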

CAPEC Attack Patterns

Matches are also mapped to CAPEC attack pattern IDs:

CAPEC-66: SQL Injection
CAPEC-7: Blind SQL Injection
CAPEC-54: Query System for Information
CAPEC-470: Expanding Control over the OS from the Database
CAPEC-664: Server-Side Request Forgery

OWASP Alignment

sqlhund detects patterns from OWASP Top 10 A03:2021 - Injection, covering:

SQL Injection (CWE-89)
Command Injection (CWE-77, CWE-78)
Code Injection (CWE-94, CWE-95)
File/Resource Injection (CWE-610)


Supported Databases

sqlhund detects database-specific injection patterns for:

Database	Detection Patterns
General	UNION, comments, tautologies, subqueries, time delays
SQLite	load_extension, ATTACH, PRAGMA, virtual tables
PostgreSQL	pg_read_file, COPY, DO blocks, dblink, extensions
DuckDB	read_csv, ATTACH, httpfs, CREATE SECRET, macros

Benchmarks

Evaluated against the RbSQLi dataset: 10,304,026 labeled SQL queries (2,813,146 malicious, 7,490,880 benign).

	Predicted Malicious	Predicted Benign
Actual Malicious	2,813,146	0
Actual Benign	0	7,490,880

Precision: 100% · Recall: 100% · Accuracy: 100%
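
The reported figures follow directly from the confusion matrix above. The following short sketch shows the arithmetic using the standard definitions of precision, recall, and accuracy. `confusion_metrics` is a hypothetical helper written for this illustration, not part of the project's test suite.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict[str, float]:
    """Compute precision, recall, and accuracy from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),  # flagged queries that were truly malicious
        "recall": tp / (tp + fn),     # malicious queries that were flagged
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Counts taken from the confusion matrix above.
metrics = confusion_metrics(tp=2_813_146, fn=0, fp=0, tn=7_490_880)
total = 2_813_146 + 0 + 0 + 7_490_880

print(total)
print(metrics)
```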

Building from Source

Requires Rust, Maturin, and uv.

git clone https://github.com/KanishkNavale/sqlhund
cd sqlhund
make dev         # set up development environment
make build       # compile debug build
make release     # compile optimized release build

Testing

Run unit tests (Rust + Python):

make unittest

To run the evaluation against the full RbSQLi dataset, download the dataset, place it at tests/data/wild.csv, and run:

make wildtest

Contributing

Contributions are welcome. See the open issues or submit a pull request.

License

This project is licensed under the MIT License.
