Semantic Stealth Attacks & Symbolic Prompt Red Teaming on GPT and other LLMs.
Updated May 16, 2025
AATMF | An Open-Source Adversarial AI Threat Modeling Framework
VEX Protocol — The trust layer for AI agents. Adversarial verification, temporal memory, Merkle audit trails, and tamper-proof execution. Built in Rust.
Test and evaluate Large Language Models against prompt injections, jailbreaks, and adversarial attacks with a web-based interactive lab.
Basilisk — Open-source AI red teaming framework with genetic prompt evolution. Automated LLM security testing for GPT-4, Claude, Gemini. OWASP LLM Top 10 coverage. 29 attack modules.
LLM Attack Testing Toolkit is a structured methodology and mindset framework for testing Large Language Model (LLM) applications against logic abuse, prompt injection, jailbreaks, and workflow manipulation.
Implementation of Vocabulary-Based Adversarial Fuzzing (VB-AF) to systematically probe vulnerabilities in Large Language Models (LLMs).
🛡️ Enterprise-grade AI security framework protecting LLMs from prompt injection attacks using ML-powered detection
Proof of concept tool to bypass document replay technology (such as GPTZero).
Slides and materials from cybersecurity talks at Chubut Hack (2021-2022)
A research framework for simulating, detecting, and defending against backdoor loop attacks in LLM-based multi-agent systems.
Breaking Chain-of-Thought: A Comprehensive Taxonomy of Reasoning Vulnerabilities in Production AI Systems
Pit AI models against each other. Score them sealed. Crown a winner. All built using the GitHub Copilot CLI. ⚡
🔍 Emulate advanced phishing tactics ethically with this open-source framework for red team operations focused on social engineering sophistication.
LLM Sentinel Red Teaming Platform is an enterprise-grade framework for automated security testing of Large Language Models. It detects vulnerabilities such as jailbreaks, prompt injection, and system prompt leakage across multiple providers, with structured attack orchestration, risk scoring, and security reporting to harden models before production.
Ethically-bounded red team framework for AI-driven social engineering simulation with consent enforcement and identity graph mapping
Adversarial verification layer for AI coding assistants. Based on IACDM — Interactive Adversarial Convergence Development Methodology. The AI proposes. Versus critiques.
👻 Adversarial AI Pentester - CHAOS vs ORDER dual-agent exploitation with collective memory
A Django-based platform for testing LLMs against prompt injection, social engineering, and policy bypass attacks using red teaming methodologies.
AI Security Research: Gemini 3.0 Pro S2-Class Exfiltration & Adversarial Robustness. Hardening frontier models against autonomous mutation vectors. NIST VDP / AI Safety Institute compliant.