natnew/Awesome-Agentic-AI-Security

Awesome Agentic AI Security

License: MIT · Map: Security Risks And Controls · Focus: Agentic AI

Visit the Awesome Agentic AI Security project site · site source

The security boundary has moved from the model to the agentic execution system.

A curated list of resources, standards, benchmarks, tools, threat models, architectures, and research for securing agentic, multi-agent, tool-using, memory-bearing, and cyber-capable AI systems.

Start Here

  • Landscape Map - System-level map of prompts, context, tools, credentials, memory, approvals, and downstream action.
  • Threat Model - Failure modes, preconditions, impact paths, and control questions for agentic systems.
  • Attack Surfaces - Where language, context, authority, state, tools, memory, and policies expose risk.
  • Agentic Attack Chains - How local weaknesses compose into breach paths and where defenders can interrupt them.
  • Defence Architecture - Runtime control model for observing, interpreting, constraining, auditing, discovering, protecting, and governing agentic systems.
  • Resource Catalogue - Standards, frameworks, research, tools, benchmarks, cyber-capable AI agents, and evidence requirements.
  • Patterns - Secure engineering patterns for runtime boundaries, tool calling, MCP, memory, credentials, and approval.
  • Visuals - Mermaid diagrams for execution boundaries, action paths, control points, and reference architectures.

Contents

Core Concepts

Agentic systems behave less like isolated chat applications and more like distributed execution environments. Instructions can shape tool calls, trigger workflows, update memory, write code, route data, and influence decisions across enterprise systems.

The central security question is:

What can this AI system do, under whose authority, with which tools, using which data, with what memory, and under what controls?

Useful security for these systems must understand the relationship between intent, authority, action, context, and outcome.
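One way to make the central question concrete is to record every proposed action along the same axes. The sketch below is illustrative only: `ActionRequest`, its field names, and the `unanswered` helper are hypothetical and do not come from this repository. It assumes that data and memory may legitimately be empty, while action, authority, tool, and controls must always be stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRequest:
    """One proposed agent action, described along the axes of the central question."""
    action: str                           # what the system wants to do
    principal: str                        # under whose authority
    tool: str                             # with which tool
    data_sources: tuple[str, ...] = ()    # using which data (may be empty)
    memory_keys: tuple[str, ...] = ()     # with what memory (may be empty)
    controls: tuple[str, ...] = ()        # under what controls


def unanswered(request: ActionRequest) -> list[str]:
    """Return the mandatory parts of the question that the request leaves blank."""
    missing = []
    if not request.action:
        missing.append("action")
    if not request.principal:
        missing.append("authority")
    if not request.tool:
        missing.append("tool")
    if not request.controls:
        missing.append("controls")
    return missing


# Hypothetical request: a tool is named, but nobody is accountable for it
# and no control applies to it.
request = ActionRequest(action="send_email", principal="", tool="mail_client")
gaps = unanswered(request)
```

A runtime could refuse to execute any request for which `unanswered` returns a non-empty list, so that every action carries an explicit authority and at least one named control.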

flowchart TB
    UP["User prompt"]
    RD["Retrieved context"]
    SR["System rules"]
    AR["Agentic reasoning<br/>Goals emerge at runtime"]
    IK["Internal knowledge"]
    EA["External APIs"]
    OT["Operational tools"]
    Risk["Risk accumulation<br/>Composed outcomes may exceed approved scope"]

    UP --> AR
    RD --> AR
    SR --> AR
    AR -->|permitted step| IK
    AR -->|permitted step| EA
    AR -->|permitted step| OT
    IK --> Risk
    EA --> Risk
    OT --> Risk
Text description of the Risk Accumulation flow

The diagram illustrates how a user prompt, retrieved context, and system rules are processed by agentic reasoning. This reasoning leads to several permitted actions: querying internal knowledge, calling external APIs, or using operational tools. These actions collectively lead to "Risk accumulation," where the final composed outcomes of the agent's work may exceed the originally approved security scope.
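The risk accumulation idea can be sketched as a check over the whole sequence of steps rather than over each step in isolation. Everything in this example is an assumption made for illustration: the step names mirror the diagram, but the capability labels and the forbidden combination (reading internal data and then sending data externally) are hypothetical policy choices, not rules defined by this repository.

```python
# Steps the agent is individually permitted to take (names mirror the diagram).
PERMITTED_STEPS = {
    "query_internal_knowledge",
    "call_external_api",
    "run_operational_tool",
}

# Hypothetical capability each step exercises.
STEP_CAPABILITIES = {
    "query_internal_knowledge": {"read_internal"},
    "call_external_api": {"send_external"},
    "run_operational_tool": {"change_state"},
}

# Hypothetical policy: these capabilities must never be combined in one run,
# even though each is acceptable on its own.
FORBIDDEN_COMBINATIONS = [frozenset({"read_internal", "send_external"})]


def composed_violations(steps: list[str]) -> list[frozenset]:
    """Return forbidden capability combinations reached by the composed steps."""
    held: set[str] = set()
    violations: list[frozenset] = []
    for step in steps:
        if step not in PERMITTED_STEPS:
            raise ValueError(f"step {step!r} is not permitted at all")
        held |= STEP_CAPABILITIES[step]
        for combo in FORBIDDEN_COMBINATIONS:
            if combo <= held and combo not in violations:
                violations.append(combo)
    return violations
```

Each step passes the per-step check, yet the sequence `["query_internal_knowledge", "call_external_api"]` is flagged because the composed outcome creates a path from internal data to an external destination.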

The repository organises controls around the AI Defense Plane: discover where agents, tools, prompts, data flows, credentials, memory, and autonomous workflows exist; protect tool use, memory writes, credentials, and actions; and govern evidence, audit trails, delegated authority, and risk acceptance. The fuller model is in Defence Architecture.

Standards and Frameworks

Threat Models and Attack Surfaces

  • Agentic AI Threat Model - Repository threat model for failure modes across prompts, tools, memory, credentials, approvals, and multi-agent workflows.
  • Attack Surfaces: Agentic Execution Systems - Boundary map for language, context, authority, state, policies, tools, and downstream systems.
  • Agentic Attack Chains - Defensive chain model for recognising and interrupting multi-step compromise paths.
  • Agentic Attack Chain Library - Structured stubs for prompt injection, poisoned context, memory poisoning, unsafe MCP extensions, credential overreach, fake approvals, and related chain patterns.
  • Lakera Progressive Breach Model - Vendor analysis of how agentic compromise can progress from manipulated intent to tool use, delegated authority, propagation, and containment failure.

Prompt Injection and Instruction Attacks

Tool Use, MCP, and Runtime Security

  • Secure Tool Calling - Pattern for tool brokers, schemas, scopes, allow-lists, side-effect controls, and approval gates.
  • Secure MCP - Pattern for trust boundaries, transport hardening, capability scoping, and untrusted-context handling in Model Context Protocol integrations.
  • Secure Agent Runtime - Pattern for sandboxing, isolation, policy enforcement, and observability inside the execution loop.
  • OWASP Agentic Skills Top 10 - Emerging guidance for the security of reusable agent skills and extension ecosystems.
  • NVIDIA NeMo Agent Toolkit Safety and Security Example - Practical example of agent workflow red teaming and risk scoring.
  • Tools catalogue - Defensive tools for red teaming, evaluation, observability, inventory, and runtime control.
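The controls named under Secure Tool Calling (allow-lists, schemas, scopes, side-effect controls, and approval gates) can be combined in a single broker that sits between the agent and its tools. The following is a minimal sketch under stated assumptions, not the pattern as documented in the repository: `ToolSpec`, `ToolBroker`, the scope strings, and the two demo tools are all hypothetical, and the "schema" is reduced to an exact match on argument names.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolSpec:
    handler: Callable[..., Any]
    required_args: frozenset[str]   # simplified schema: exact argument names
    scope: str                      # scope the caller must hold
    side_effect: bool = False       # True if the tool changes external state


class ToolBroker:
    """Mediates every tool call; the agent never invokes handlers directly."""

    def __init__(self, tools: dict[str, ToolSpec]):
        self._tools = tools  # the allow-list: unknown names are rejected

    def call(self, name: str, args: dict[str, Any],
             granted_scopes: set[str], approved: bool = False) -> Any:
        spec = self._tools.get(name)
        if spec is None:
            raise PermissionError(f"tool {name!r} is not on the allow-list")
        if set(args) != spec.required_args:
            raise ValueError(f"arguments for {name!r} do not match its schema")
        if spec.scope not in granted_scopes:
            raise PermissionError(f"missing scope {spec.scope!r}")
        if spec.side_effect and not approved:
            raise PermissionError(f"tool {name!r} needs explicit approval")
        return spec.handler(**args)


# Hypothetical demo tools: one read-only, one with a side effect.
broker = ToolBroker({
    "lookup": ToolSpec(
        handler=lambda key: f"value-for-{key}",
        required_args=frozenset({"key"}),
        scope="kb:read",
    ),
    "delete": ToolSpec(
        handler=lambda key: f"deleted-{key}",
        required_args=frozenset({"key"}),
        scope="kb:write",
        side_effect=True,
    ),
})
```

With this arrangement, `broker.call("lookup", {"key": "a"}, {"kb:read"})` succeeds, while the same caller invoking `"delete"` is refused until it holds the `kb:write` scope and the call is marked as approved.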

Memory, State, and Context Security

Credentials, Identity, and Delegated Authority

Benchmarks and Evaluations

  • AgentDojo - Evaluation environment for indirect prompt injection and defences in tool-using agents.
  • CyberSecEval - Cybersecurity benchmark suite for LLMs used in coding, analysis, and automation contexts.
  • CyberGym - Benchmark environment for real-world AI-agent vulnerability analysis, reproduction, and verification tasks.
  • ExploitGym - Capability benchmark for whether AI agents can turn known vulnerabilities into working exploits; use as a defensive risk signal, not operational guidance.
  • Inspect AI - Evaluation framework from the UK AI Security Institute for structured tasks, solvers, scorers, and logs.
  • Benchmark catalogue - Benchmarks, testbeds, and evaluation methods with proof limits and maturity notes.

Cyber-Capable AI Agents

This section tracks the defensive governance problem created by AI systems that can assist with vulnerability discovery, exploit-capability evaluation, patch verification, disclosure workflows, and forensic traceability. It does not provide exploitation instructions.

Observability, Audit, and Forensics

Governance and Assurance

Physical AI and Robotics Security

Open-Weight and Frontier Capability Risks

Engineering Patterns

  • Secure Agent Runtime - Runtime boundaries, sandboxing, policy enforcement, and audit evidence.
  • Secure Tool Calling - Tool schemas, brokers, scopes, side-effect controls, and approval gates.
  • Secure MCP - Model Context Protocol boundaries, trust assumptions, and capability scoping.
  • Memory Security - Memory write controls, provenance, poisoning detection, and retention.
  • Credential and Token Boundaries - Delegated authority, credential brokers, scoped tokens, and impersonation controls.
  • Secure Engineering Patterns - How the threat model, attack surfaces, and chain interruptions map to reusable implementation controls.
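As an illustration of the Memory Security entry above, the sketch below combines three of its elements: write controls, provenance, and retention. It is a simplified assumption of how such a store might look, not the repository's pattern. `GuardedMemory`, `MemoryEntry`, and the `TRUSTED_SOURCES` set are hypothetical names, and poisoning detection is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical policy: only these origins may write to long-term memory.
TRUSTED_SOURCES = {"user", "system"}


@dataclass(frozen=True)
class MemoryEntry:
    content: str
    source: str            # provenance: where the content came from
    written_at: datetime   # used to enforce retention


class GuardedMemory:
    def __init__(self, retention: timedelta):
        self._retention = retention
        self._entries: list[MemoryEntry] = []

    def write(self, content: str, source: str,
              now: Optional[datetime] = None) -> bool:
        """Store content only if its source is trusted; report whether it was stored."""
        if source not in TRUSTED_SOURCES:
            return False
        stamp = now or datetime.now(timezone.utc)
        self._entries.append(MemoryEntry(content, source, stamp))
        return True

    def read(self, now: Optional[datetime] = None) -> list[MemoryEntry]:
        """Drop entries older than the retention window, then return the rest."""
        current = now or datetime.now(timezone.utc)
        self._entries = [
            entry for entry in self._entries
            if current - entry.written_at <= self._retention
        ]
        return list(self._entries)
```

Because every entry keeps its source, a later audit can separate what the user said from what arrived through retrieved content, and a write attempted from an untrusted origin such as a fetched web page is rejected before it can persist.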

Docs and Maps

| Section | Use it for |
| --- | --- |
| Docs | Conceptual maps, threat models, breach chains, defence architecture, evaluation, governance, case studies, and open questions. |
| Resources | Curated standards, frameworks, vendor research, papers, tools, benchmarks, cyber-capable AI agents, and evidence requirements. |
| Patterns | Secure engineering patterns for agent runtimes, tool calling, MCP, memory, credentials, approval, sandboxing, observability, and policy enforcement. |
| Visuals | Mermaid diagrams for execution boundaries, action paths, control points, and reference architectures. |

Related Projects

Companion field guides by the same maintainer covering adjacent areas of AI. Read alongside this repository for broader context on how agentic AI is being built and applied beyond the security boundary.

| Repository | Focus |
| --- | --- |
| Awesome Agentic Engineering | Engineering practices, patterns, and tooling for building agentic AI systems. |
| Awesome AI Scientists | AI for scientific research, discovery, and AI-as-scientist tooling. |
| Awesome Physical AI | Physical AI: robotics, embodied agents, and sensor-driven systems. |

Licence

This project is released under the MIT License.

Contributing

Thrilled to have you here. Whether it is a quick typo fix, a fresh resource, a doc polish, or a sweeping overhaul - every contribution helps this list grow. Jump in and join the community - PRs of every size are welcome.

Read the contributing guide · good first issues
