Semantic Stealth Attacks & Symbolic Prompt Red Teaming on GPT and other LLMs.
Updated May 16, 2025
AATMF | An Open-Source Adversarial AI Threat Modeling Framework
VEX Protocol — The trust layer for AI agents. Adversarial verification, temporal memory, Merkle audit trails, and tamper-proof execution. Built in Rust.
Test and evaluate Large Language Models against prompt injections, jailbreaks, and adversarial attacks with a web-based interactive lab.
Basilisk — Open-source AI red teaming framework with genetic prompt evolution. Automated LLM security testing for GPT-4, Claude, Gemini. OWASP LLM Top 10 coverage. 29 attack modules.
LLM Attack Testing Toolkit is a structured methodology and mindset framework for testing Large Language Model (LLM) applications against logic abuse, prompt injection, jailbreaks, and workflow manipulation.
Implementation of Vocabulary-Based Adversarial Fuzzing (VB-AF) to systematically probe vulnerabilities in Large Language Models (LLMs).
🛡️ Enterprise-grade AI security framework protecting LLMs from prompt injection attacks using ML-powered detection
Proof of concept tool to bypass document replay technology (such as GPTZero).
Slides and materials from cybersecurity talks at Chubut Hack (2021-2022)
A research framework for simulating, detecting, and defending against backdoor loop attacks in LLM-based multi-agent systems.
Breaking Chain-of-Thought: A Comprehensive Taxonomy of Reasoning Vulnerabilities in Production AI Systems
Pit AI models against each other. Score them sealed. Crown a winner. All built using the GitHub Copilot CLI. ⚡
🔍 Emulate advanced phishing tactics ethically with this open-source framework for red team operations focused on social engineering sophistication.
LLM Sentinel Red Teaming Platform is an enterprise-grade framework for automated security testing of Large Language Models. It detects vulnerabilities such as jailbreaks, prompt injection, and system prompt leakage across multiple providers, with structured attack orchestration, risk scoring, and security reporting to harden models before production.
Ethically-bounded red team framework for AI-driven social engineering simulation with consent enforcement and identity graph mapping
Adversarial verification layer for AI coding assistants. Based on IACDM — Interactive Adversarial Convergence Development Methodology. The AI proposes. Versus critiques.
👻 Adversarial AI Pentester - CHAOS vs ORDER dual-agent exploitation with collective memory
A Django-based platform for testing LLMs against prompt injection, social engineering, and policy bypass attacks using red teaming methodologies.
AI Security Research: Gemini 3.0 Pro S2-Class Exfiltration & Adversarial Robustness. Hardening frontier models against autonomous mutation vectors. NIST VDP / AI Safety Institute compliant.