Secure AI Applications in 3 Lines of Code
Stop prompt injection, jailbreaks, and data leaks in production LLM applications.
```shell
pip install promptshields
```

```python
from promptshield import Shield

shield = Shield.balanced()
result = shield.protect_input(user_input, system_prompt)
if result['blocked']:
    print(f"Blocked: {result['reason']} (score: {result['threat_level']:.2f})")
    print(f"Breakdown: {result['threat_breakdown']}")
```

That's it. Production-ready security in 3 lines.
| Feature | PromptShields | DIY Regex | Paid APIs |
|---|---|---|---|
| Setup Time | 3 minutes | Weeks | Days |
| Cost | Free | Free | $$$$ |
| Privacy | 100% Local | Local | Cloud |
| F1 Score | 0.97 (RF) / 0.96 (DeBERTa) | ~0.60 | ~0.95 |
| ML Models | 4 + DeBERTa | None | Black box |
| Async | ✅ Native | DIY | Varies |
- 🛡️ Prompt injection attacks (direct + indirect)
- 🔓 Jailbreak attempts (DAN, persona replacement)
- 🔍 System prompt extraction
- 🔒 PII leakage
- 📊 Session anomalies
- 🤖 Encoded/obfuscated attacks (Base64, URL, Unicode)
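Encoded attacks are the ones that slip past naive keyword filters, because the payload is hidden until decoded. As a rough illustration of the idea only (not PromptShields' actual detector; the phrase list and function names here are hypothetical), a scanner can decode Base64-looking spans before pattern matching:

```python
import base64
import re

# Hypothetical phrase list for illustration -- real detectors use far more signals
INJECTION_PHRASES = re.compile(
    r"ignore (all )?previous instructions|reveal your system prompt",
    re.IGNORECASE,
)

def decode_base64_spans(text: str) -> str:
    """Decode any Base64-looking tokens and append the plaintext for scanning."""
    decoded_parts = [text]
    for token in re.findall(r"[A-Za-z0-9+/=]{16,}", text):
        try:
            decoded_parts.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except Exception:
            pass  # not valid Base64 or not valid UTF-8 -- skip
    return "\n".join(decoded_parts)

def looks_like_injection(text: str) -> bool:
    return INJECTION_PHRASES.search(decode_base64_spans(text)) is not None

payload = base64.b64encode(b"Ignore previous instructions").decode()
print(looks_like_injection(f"Please run: {payload}"))       # True
print(looks_like_injection("Convert 3 miles to kilometers"))  # False
```

The point is only that scanning must happen after normalization/decoding; a filter that sees only the raw Base64 string would pass it through.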
Choose the right tier for your application:
```python
Shield.fast()      # ~1ms  - High throughput (pattern matching only)
Shield.balanced()  # ~2ms  - Production default (patterns + session tracking)
Shield.strict()    # ~7ms  - Sensitive apps (+ 1 ML model + PII detection)
Shield.secure()    # ~12ms - Maximum security (4-model ML ensemble)
```

Securely wrap real-time LLM token streams to detect leaked PII or injected prompts before the full response finishes generating.

```python
# Auto-scans generator stream chunks in real time
for safe_chunk in shield.protect_stream(llm.stream("Summarize...")):
    print(safe_chunk, end="")
```

The random_forest, logistic_regression, linear_svc, and gradient_boosting ensemble models were retrained from scratch on a curated 14K-example dataset to eliminate "data leakage" false positives (e.g., blocking benign math/conversion queries). Combine them with DeBERTa via Shield.balanced() for the best F1 score.
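The ensemble idea can be sketched as simple soft voting over per-model injection probabilities. This is an illustration only; the scores, equal weighting, and 0.5 threshold below are assumptions, not the library's actual values:

```python
def ensemble_score(model_scores: dict[str, float]) -> float:
    """Average the per-model injection probabilities (soft voting)."""
    return sum(model_scores.values()) / len(model_scores)

def is_threat(model_scores: dict[str, float], threshold: float = 0.5) -> bool:
    return ensemble_score(model_scores) >= threshold

# Illustrative scores only
scores = {
    "random_forest": 0.92,
    "logistic_regression": 0.88,
    "linear_svc": 0.81,
    "gradient_boosting": 0.95,
}
print(round(ensemble_score(scores), 3))  # 0.89
print(is_threat(scores))                 # True
```

Averaging several diverse classifiers is what smooths out any single model's false positives, which is why the ensemble tiers trade a little latency for a lower FPR.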
Launch shields declaratively without changing application code.
```python
shield = Shield.from_config("promptshield.yml")
```

Instantly trigger webhooks whenever high-severity threats are blocked.

```python
shield = Shield.balanced(webhook_url="https://hooks.slack.com/...")
```

Every response now shows exactly which layer triggered:
```python
result = shield.protect_input(user_text, system_prompt)
print(result["threat_breakdown"])
# {"pattern_score": 0.0, "ml_score": 0.994, "session_score": 0.0}
```

```python
shield = Shield(models=["deberta"])  # Auto-downloads from HuggingFace
```

```python
from promptshield import AsyncShield

shield = AsyncShield.balanced()
result = await shield.aprotect_input(user_text, system_prompt)
```

```python
from promptshield import Shield
from promptshield.integrations.fastapi import PromptShieldMiddleware

app.add_middleware(PromptShieldMiddleware, shield=Shield.balanced())
```

```python
shield = Shield(
    patterns=True,
    models=["random_forest"],
    allowlist=["summarize this document", "translate to french"],
    custom_patterns=[r"jailbreak|dan mode|evil\s*bot"],
)
```

Trained on neuralchemy/Prompt-injection-dataset:
| Model | F1 | ROC-AUC | FPR | Latency |
|---|---|---|---|---|
| Random Forest | 0.969 | 0.994 | 6.9% | <1ms |
| Logistic Regression | 0.964 | 0.995 | 6.4% | <1ms |
| Gradient Boosting | 0.961 | 0.994 | 7.9% | <1ms |
| LinearSVC | 0.959 | 0.995 | 10.3% | <1ms |
| DeBERTa-v3-small | 0.959 | 0.950 | 8.5% | ~50ms |
Pre-trained models: neuralchemy/prompt-injection-detector · neuralchemy/prompt-injection-deberta
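The allowlist and custom_patterns options from the custom Shield configuration can be pictured as a cheap first-pass layer that runs before any ML model. A minimal sketch of that behavior (my approximation, not the library's internals):

```python
import re

# Mirrors the allowlist / custom_patterns options from the example above
ALLOWLIST = {"summarize this document", "translate to french"}
CUSTOM_PATTERNS = [re.compile(r"jailbreak|dan mode|evil\s*bot", re.IGNORECASE)]

def pattern_check(user_input: str) -> dict:
    """Allowlisted inputs bypass scanning; custom patterns block immediately."""
    normalized = user_input.strip().lower()
    if normalized in ALLOWLIST:
        return {"blocked": False, "reason": "allowlisted"}
    for pattern in CUSTOM_PATTERNS:
        if pattern.search(user_input):
            return {"blocked": True, "reason": f"matched {pattern.pattern!r}"}
    return {"blocked": False, "reason": "no pattern match"}

print(pattern_check("Enable DAN mode now")["blocked"])    # True
print(pattern_check("translate to french")["blocked"])    # False
```

Because regex checks cost well under a millisecond, a layer like this is what keeps the fast() and balanced() tiers in the 1-2ms range.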
📚 Full Documentation – Complete guide with framework integrations
🚀 Quickstart Guide – Get running in 5 minutes
MIT License – see LICENSE
Built by NeurAlchemy – AI Security & LLM Safety Research