feat: add AutoGen and OpenAI Agents SDK integration adapters by hesam-oxe · Pull Request #22 · OWASP/www-project-agent-memory-guard

hesam-oxe · 2026-05-10T08:11:02Z

Added integration adapters for Microsoft AutoGen and OpenAI Agents SDK.

Changes

autogen.py — GuardedAutoGenAgent + GuardedGroupChatManager
openai_agents.py — GuardedAgentContext + GuardedToolOutput + GuardedHandoff

Features

🤖 AutoGen: Message send/receive screening
👥 AutoGen: Group chat memory protection + agent isolation
🧠 OpenAI Agents: Context/state management protection
🔧 OpenAI Agents: Tool output screening
🤝 OpenAI Agents: Handoff context protection

Closes #8


vgudur-dev

Thanks @hesam-oxe — getting AutoGen and the OpenAI Agents SDK on the integration list is overdue, and the structure (per-framework module under integrations/) is right. A few things to address before this is merge-ready; some are mechanical, two are real correctness/security bugs.

Blocking

1. GuardedAgentContext.get_state silently bypasses the guard on PolicyViolation: it falls through to the unguarded underlying context. That defeats the policy. Inline at openai_agents.py:45-54 with a fix.
2. GuardedAgentContext.set_state returns True even when nothing was persisted to the wrapped context, and lets the guard's store and the underlying store diverge on REDACT. Inline at openai_agents.py:31-43.
3. No tests. Two new public modules with seven classes; the suite doesn't exercise any of them. At minimum:
   - injection in a sent/received AutoGen message is dropped (drop_blocked=True) or raised (drop_blocked=False)
   - GuardedAgentContext.set_state round-trips through get_state with the guard's value (incl. redaction)
   - GuardedAgentContext.get_state propagates PolicyViolation upward when reads are blocked
   - GuardedToolOutput.screen_tool_output returns False on quarantine and True on allow
   - GuardedHandoff.transfer returns False when the handoff context contains injection markers (use a permissive guard so it's the detector chain doing the work, not the policy)
4. Trailing newlines missing on both files.

Architecture — needs discussion, not necessarily a rewrite in this PR

The wrappers don't engage with the frameworks' real dispatch hooks. AutoGen's group chats and reply functions run on the wrapped agent through register_reply / _process_received_message, bypassing GuardedAutoGenAgent.send and .receive entirely. The OpenAI Agents SDK has similar internal flows. Detail and suggested approach inline at autogen.py:56. I'd recommend converting GuardedAutoGenAgent to install_guard(agent, guard) that registers a register_reply hook on the existing agent — cleaner and actually gets called.

Nits / style

Optional[MemoryGuard] → MemoryGuard | None to match the rest of the codebase (inline).
_HAS_AUTOGEN flag is set but never enforced — either drop it or raise at construction time when the SDK isn't installed (inline).
agent_isolation parameter on GuardedGroupChatManager is dead code (inline).
pyproject.toml not updated. Add autogen = [...] and openai-agents = [...] to [project.optional-dependencies], mirroring the existing langchain extra, so pip install agent-memory-guard[autogen] works and the import-guard pattern has something to fall back to.
integrations/__init__.py not extended — the new classes aren't re-exported. Users currently have to write from agent_memory_guard.integrations.autogen import GuardedAutoGenAgent. Worth adding optional re-exports under try/except ImportError so the package-level import surface matches the langchain pattern.

The OpenAI Agents fixes (#1, #2) are the only items that genuinely block this. Everything else can be follow-ups, but landing without tests would set a precedent we don't want in the integrations directory.

Generated by Claude Code

vgudur-dev · 2026-05-13T02:30:35Z

+    from autogen import ConversableAgent  # type: ignore
+
+    _HAS_AUTOGEN = True
+except Exception:  # pragma: no cover - optional dependency


_HAS_AUTOGEN is set but never consulted at runtime — callers can instantiate GuardedAutoGenAgent with any object and it'll silently work as a generic shell. Either:

1. Drop the flag entirely (the wrapper is duck-typed anyway), or

2. Guard the constructor:

def __init__(self, agent, guard=None, *, drop_blocked=True):
    if not _HAS_AUTOGEN:
        raise ImportError(
            "agent-memory-guard[autogen] not installed; "
            "pip install agent-memory-guard[autogen]"
        )
    ...

Option 2 also requires adding autogen = ["autogen-agentchat>=0.2"] (or the package name AutoGen actually ships under) to [project.optional-dependencies] in pyproject.toml — same pattern as the existing langchain extra.
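A minimal sketch of the constructor guard, written so the check can be exercised without AutoGen installed. The helper name require_extra is hypothetical, and the import line simply mirrors the one in the diff; the module name AutoGen actually ships under may differ.

```python
"""Sketch: enforce an optional-dependency flag at construction time."""

try:
    # Mirrors the import in the diff; the real module name is an assumption.
    from autogen import ConversableAgent  # type: ignore  # noqa: F401

    _HAS_AUTOGEN = True
except Exception:  # optional dependency missing or broken
    _HAS_AUTOGEN = False


def require_extra(available: bool, extra: str) -> None:
    """Raise ImportError with an install hint when an optional extra is missing.

    `require_extra` is a hypothetical helper, not part of the project.
    """
    if not available:
        raise ImportError(
            f"agent-memory-guard[{extra}] not installed; "
            f"pip install agent-memory-guard[{extra}]"
        )


class GuardedAutoGenAgent:
    """Skeleton only: shows where the check would sit in the constructor."""

    def __init__(self, agent, guard=None, *, drop_blocked=True):
        # Fail fast instead of silently acting as a generic shell.
        require_extra(_HAS_AUTOGEN, "autogen")
        self._agent = agent
        self.guard = guard
        self._drop_blocked = drop_blocked
```

Keeping the check in a small helper means the same message format could be reused by the OpenAI Agents adapter, and the failure path can be tested by passing False directly.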

Generated by Claude Code

vgudur-dev · 2026-05-13T02:30:44Z

+    def screen_message(self, message: dict, source: str) -> bool:
+        """Screen a message before send/receive."""
+        msg_id = f"autogen.{self._agent.name}.msg.{self._message_count}"
+        payload = str(message)


str(message) on a dict produces "{'content': '...'}" — the detectors will still match the injection regex through the repr, but it's fragile (escaped quotes, embedded \n, the dict's other keys all become noise the regex has to step around). Prefer extracting the content field when it's a dict:

Suggested change

         msg_id = f"autogen.{self._agent.name}.msg.{self._message_count}"
-        payload = str(message)
+        payload = (
+            message.get("content", "") if isinstance(message, dict) else str(message)
+        )

Also worth noting: every screen_message call writes a fresh entry to memory via self.guard.write(msg_id, payload, ...) — i.e. screening stores the message as a side effect. Over a long conversation that's hundreds of entries living in MemoryStore. If the intent is just to run detectors without persisting, expose a MemoryGuard.screen() method that runs detectors-only (worth a separate issue), or accept that storage is part of the design and document it.
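For the content extraction suggested above, a slightly more defensive helper might look like the sketch below. The name extract_payload is hypothetical, and the list-of-parts shape is an assumption about multimodal-style messages rather than something this PR handles. One detail worth knowing: message.get("content", "") still returns None when the key is present with a None value, which the sketch normalises to an empty string.

```python
from typing import Any


def extract_payload(message: Any) -> str:
    """Return the text that detectors should screen for a chat message.

    Hypothetical helper; not part of the adapter in this PR.
    """
    if isinstance(message, dict):
        content = message.get("content")
        if content is None:
            # Key missing, or present with None (e.g. a tool-call-only message).
            return ""
        if isinstance(content, list):
            # Assumed shape: a list of parts such as {"type": "text", "text": ...}.
            parts = [
                part.get("text", "") for part in content if isinstance(part, dict)
            ]
            return "\n".join(p for p in parts if p)
        return str(content)
    # Plain strings (and anything else) are screened as-is.
    return str(message)
```

With this, only the message text reaches the detectors, and the dict's other keys no longer appear as noise in the screened payload.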

Generated by Claude Code

vgudur-dev · 2026-05-13T02:30:55Z

+    def send(
+        self, message: str | dict, recipient: Any, request_reply: bool = False
+    ) -> None:
+        msg = message if isinstance(message, dict) else {"content": message}
+        if self.screen_message(msg, "autogen_send"):
+            self._agent.send(message, recipient, request_reply=request_reply)
+
+    def receive(
+        self, message: str | dict, sender: Any, request_reply: bool = False
+    ) -> None:
+        msg = message if isinstance(message, dict) else {"content": message}
+        if self.screen_message(msg, "autogen_receive"):
+            self._agent.receive(message, sender, request_reply=request_reply)
+


Architectural concern, not blocking but worth thinking through before this lands:

These wrappers only intercept calls made directly via GuardedAutoGenAgent.send / .receive. AutoGen's framework dispatches messages through reply_functions registered with agent.register_reply() and the internal _process_received_message / process_message_before_send hooks — those run on the wrapped self._agent, not on the wrapper, so the guard never sees them. In practice that means:

A GroupChatManager running the chat will call agent._process_received_message(...) directly (via the proxied __getattr__), and the guard is bypassed.

Auto-generated tool replies, function calls, and generate_reply flows all happen inside the wrapped agent.

The "real" drop-in pattern for AutoGen is to register a guard reply function via agent.register_reply([Agent, None], reply_func=self._guard_reply, position=0) so it fires before the framework's own reply functions. I'd suggest one of:

1. Convert GuardedAutoGenAgent to a function install_guard(agent, guard) that registers a reply hook on the existing agent (no wrapping). This is what register_reply is built for.

2. Subclass ConversableAgent and override process_received_message / _process_received_message (signature changes between AutoGen versions, so pin the supported range).

3. Document the wrapper as "only safe if you call .send() / .receive() explicitly", with a warning when someone passes the wrapper into a GroupChat.

Option 1 is the cleanest. Happy to spike it in a follow-up if you'd like.
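A rough sketch of what option 1 could look like. It assumes the reply-function convention of AutoGen 0.2-style ConversableAgent, where a reply function receives (recipient, messages, sender, config) and returns a (final, reply) tuple; that convention should be checked against the pinned AutoGen version. The screen callable is a hypothetical stand-in for whatever the guard exposes, and trigger is passed through unchanged (the comment above uses [Agent, None]).

```python
from typing import Any, Callable, Dict, List, Optional, Tuple, Union

Reply = Tuple[bool, Union[str, Dict[str, Any], None]]


def install_guard(
    agent: Any,
    screen: Callable[[str], bool],
    trigger: Any,
) -> Callable[..., Reply]:
    """Register a screening reply function at position 0 on an existing agent.

    `screen` is a hypothetical callable that returns True when text is allowed.
    """

    def _guard_reply(
        recipient: Any,
        messages: Optional[List[Dict[str, Any]]] = None,
        sender: Any = None,
        config: Any = None,
    ) -> Reply:
        last = messages[-1] if messages else {}
        content = last.get("content") or ""
        if screen(str(content)):
            # Not final: let the agent's remaining reply functions run.
            return False, None
        # Final with no reply: nothing is generated for the blocked message.
        return True, None

    # position=0 so the hook fires before the framework's own reply functions.
    agent.register_reply(trigger, reply_func=_guard_reply, position=0)
    return _guard_reply
```

Under these assumptions the hook controls whether a reply is generated, but the incoming message may already have been recorded in the agent's chat history by the time it runs. Whether that is acceptable, or whether a receive-side hook is also needed, is a design question for the follow-up.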

Generated by Claude Code

vgudur-dev · 2026-05-13T02:31:05Z

+        agent_isolation: bool = True,
+    ) -> None:
+        self._group_chat = group_chat
+        self.guard = guard or MemoryGuard()


agent_isolation is stored but never read anywhere in the class — dead parameter. Either wire it up (e.g. tag entries with the originating agent and refuse reads from other agents — MemoryGuard.set_current_task() from #25 already gives you most of the machinery for this) or drop it from the signature for now.
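If the parameter is wired up rather than dropped, the per-agent key tracking already present in the class could back a simple ownership check. The sketch below is standalone and hypothetical: AgentKeyRegistry and IsolationError are illustrative names, and it does not use MemoryGuard.set_current_task() because that method's signature is not shown here.

```python
from typing import Any, Dict, Set


class IsolationError(PermissionError):
    """Raised when an agent reads a key recorded by a different agent."""


class AgentKeyRegistry:
    """Hypothetical stand-in for the key bookkeeping in GuardedGroupChatManager."""

    def __init__(self, agent_isolation: bool = True) -> None:
        self._agent_isolation = agent_isolation
        self._agent_keys: Dict[str, Set[str]] = {}
        self._values: Dict[str, Any] = {}

    def record(self, agent_name: str, key: str, value: Any) -> None:
        # Tag the entry with the agent that produced it.
        self._agent_keys.setdefault(agent_name, set()).add(key)
        self._values[key] = value

    def read(self, agent_name: str, key: str) -> Any:
        owned = key in self._agent_keys.get(agent_name, set())
        if self._agent_isolation and not owned:
            # Same error for "not yours" and "does not exist", so key
            # existence is not leaked across agents.
            raise IsolationError(f"{agent_name!r} may not read {key!r}")
        return self._values[key]
```

With agent_isolation=False the registry behaves like a shared store, which gives the flag an observable effect in both settings.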

Generated by Claude Code

vgudur-dev · 2026-05-13T02:31:10Z

+        if agent_name not in self._agent_keys:
+            self._agent_keys[agent_name] = set()
+        self._agent_keys[agent_name].add(key)
+        return True


Missing trailing newline (\ No newline at end of file in the diff). Same in openai_agents.py. Add a final \n.

Generated by Claude Code

vgudur-dev · 2026-05-13T02:31:22Z

+
+    def get_state(self, key: str) -> Any:
+        full_key = f"openai_agents.state.{key}"
+        try:
+            cached = self.guard.read(full_key, sink="openai_agents")
+            if cached is not None:
+                return cached
+        except PolicyViolation:
+            pass
+        if hasattr(self._context, "get_state"):


Security bug — silent bypass on policy violation.

try:
    cached = self.guard.read(full_key, sink="openai_agents")
    if cached is not None:
        return cached
except PolicyViolation:
    pass  # <-- here
if hasattr(self._context, "get_state"):
    return self._context.get_state(key)

If the guard blocks a read (PolicyViolation), the wrapper swallows it and returns the unguarded state straight from self._context. That defeats the whole point — a policy that says "block reads of this key" gets silently bypassed.

Two more issues in the same block:

if cached is not None treats legitimate None/False/0 writes as "no cached value", causing fallthrough to the underlying context (which may return a stale value).

Reads should not generally fall back to the underlying store: if the guard owns state, it owns state; otherwise it's just an event tap and you have two sources of truth.

Suggested rewrite:

Suggested change

-    def get_state(self, key: str) -> Any:
-        full_key = f"openai_agents.state.{key}"
-        try:
-            cached = self.guard.read(full_key, sink="openai_agents")
-            if cached is not None:
-                return cached
-        except PolicyViolation:
-            pass
-        if hasattr(self._context, "get_state"):
+    def get_state(self, key: str, default: Any = None) -> Any:
+        full_key = f"openai_agents.state.{key}"
+        sentinel = object()
+        cached = self.guard.read(full_key, default=sentinel, sink="openai_agents")
+        if cached is not sentinel:
+            return cached
+        if hasattr(self._context, "get_state"):
+            return self._context.get_state(key)
+        return default

PolicyViolation from self.guard.read should propagate to the caller, not be swallowed. If you want a softer mode, expose it as a flag (raise_on_block: bool = True) rather than catching unconditionally.
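The behaviour described above can be checked in isolation with stand-ins. In the sketch below, StubGuard and the local PolicyViolation are hypothetical substitutes for the project's MemoryGuard and exception, and the read(key, default=..., sink=...) signature is taken from the suggested rewrite rather than confirmed against the library.

```python
from typing import Any, Dict, Iterable

_MISSING = object()  # sentinel: distinguishes "no entry" from a stored None


class PolicyViolation(Exception):
    """Stand-in for the project's exception of the same name."""


class StubGuard:
    """Stand-in for MemoryGuard; only `read` is modelled."""

    def __init__(self, blocked: Iterable[str] = ()) -> None:
        self.store: Dict[str, Any] = {}
        self._blocked = set(blocked)

    def read(self, key: str, default: Any = None, *, sink: str = "") -> Any:
        if key in self._blocked:
            raise PolicyViolation(key)
        return self.store.get(key, default)


class GuardedAgentContext:
    def __init__(self, context: Any, guard: StubGuard) -> None:
        self._context = context
        self.guard = guard

    def get_state(self, key: str, default: Any = None) -> Any:
        full_key = f"openai_agents.state.{key}"
        # No try/except: a PolicyViolation raised here reaches the caller.
        cached = self.guard.read(full_key, default=_MISSING, sink="openai_agents")
        if cached is not _MISSING:
            return cached  # includes legitimate None / False / 0
        if hasattr(self._context, "get_state"):
            return self._context.get_state(key)
        return default
```

A test along these lines would cover two of the cases listed under "Blocking": a blocked read raises instead of falling through, and a stored None is returned as None rather than replaced by a stale value from the wrapped context.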

Generated by Claude Code

vgudur-dev · 2026-05-13T02:31:34Z

+
+    def set_state(self, key: str, value: Any) -> bool:
+        full_key = f"openai_agents.state.{key}"
+        try:
+            decision = self.guard.write(full_key, value, source="openai_agents")
+        except PolicyViolation:
+            if self._drop_blocked:
+                return False
+            raise
+        if decision == Action.QUARANTINE:
+            return False
+        if hasattr(self._context, "set_state"):
+            self._context.set_state(key, value)


Two ordering issues here:

Return value is misleading. The function returns True when the guard accepted the write, regardless of whether self._context.set_state was ever called. So set_state returns success when no state was actually persisted to the wrapped context (because hasattr(self._context, "set_state") was false). Caller has no way to know.

Even when both succeed, the values can diverge. If the policy is REDACT, the guard stores the redacted value but self._context.set_state(key, value) stores the original. get_state will then return whichever it reaches first — and they won't match.

Fix: write to the guard first, then propagate the redacted value (the one the guard actually committed) to the underlying context. Easiest path is to expose the committed value via MemoryGuard, or to read it back immediately:

def set_state(self, key: str, value: Any) -> bool:
    full_key = f"openai_agents.state.{key}"
    try:
        decision = self.guard.write(full_key, value, source="openai_agents")
    except PolicyViolation:
        if self._drop_blocked:
            return False
        raise
    if decision == Action.QUARANTINE:
        return False
    if hasattr(self._context, "set_state"):
        # read back the (possibly redacted) value so the two stores agree
        committed = self.guard.read(full_key, sink="openai_agents")
        self._context.set_state(key, committed)
    return True
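To make the divergence concrete, here is a self-contained sketch with a stub guard that redacts on write. StubGuard, DictContext, and the local Action enum are hypothetical stand-ins, and PolicyViolation handling is omitted for brevity. It also shows one possible answer to the first point: returning False when the wrapped context has no set_state. Whether False or a more specific result is the right signal there is a design decision, not something the review settles.

```python
from enum import Enum
from typing import Any, Dict


class Action(Enum):
    """Stand-in for the project's Action enum (member names assumed)."""

    ALLOW = "allow"
    REDACT = "redact"
    QUARANTINE = "quarantine"


class StubGuard:
    """Stand-in for MemoryGuard: redacts any string containing 'secret'."""

    def __init__(self) -> None:
        self.store: Dict[str, Any] = {}

    def write(self, key: str, value: Any, *, source: str = "") -> Action:
        if isinstance(value, str) and "secret" in value:
            self.store[key] = value.replace("secret", "[REDACTED]")
            return Action.REDACT
        self.store[key] = value
        return Action.ALLOW

    def read(self, key: str, *, sink: str = "") -> Any:
        return self.store[key]


class DictContext:
    """Minimal wrapped context exposing set_state."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {}

    def set_state(self, key: str, value: Any) -> None:
        self.state[key] = value


def set_state(guard: StubGuard, context: Any, key: str, value: Any) -> bool:
    full_key = f"openai_agents.state.{key}"
    decision = guard.write(full_key, value, source="openai_agents")
    if decision == Action.QUARANTINE:
        return False
    if not hasattr(context, "set_state"):
        # Nothing reached the wrapped context, so do not report success.
        return False
    # Propagate the value the guard committed, not the caller's original.
    committed = guard.read(full_key, sink="openai_agents")
    context.set_state(key, committed)
    return True
```

After a redacting write, both stores hold the same redacted string, so get_state returns a consistent value regardless of which store it reaches first.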

Generated by Claude Code

vgudur-dev · 2026-05-13T02:31:41Z

+"""
+from __future__ import annotations
+
+from typing import Any, Optional


Project style uses MemoryGuard | None, not Optional[MemoryGuard] (see e.g. guard.py, classification.py, the existing langchain.py). Since from __future__ import annotations is at the top, | None works in annotations on the project's minimum 3.9 target.

Suggested change

-from typing import Any, Optional
+from typing import Any

…and then replace each Optional[MemoryGuard] with MemoryGuard | None below. Same in autogen.py.

Generated by Claude Code


hesam-oxe · 2026-05-13T18:30:30Z

@vgudur-dev Thanks for the thorough review! All blocking issues resolved:

✅ Fixed get_state — no longer swallows PolicyViolation, uses sentinel for None values
✅ Fixed set_state — reads back committed (possibly redacted) value for store consistency
✅ Optional[MemoryGuard] → MemoryGuard | None throughout
✅ agent_isolation dead parameter removed
✅ str(message) → message.get("content", "") for dict messages
✅ Trailing newlines added to both files

The architectural concern about AutoGen dispatch hooks is noted —
I'll address that in a follow-up PR. Ready for re-review! 🙏

vgudur-dev · 2026-05-25T14:31:55Z

@hesam-oxe — thanks for the AutoGen and OpenAI Agents SDK adapters, the overall structure is right. Before this can merge, please address the 5 review items from May 13:

1. _HAS_AUTOGEN never consulted at runtime: add a check in GuardedAutoGenAgent.__init__ that raises ImportError if _HAS_AUTOGEN is False, same pattern as the other adapters
2. screen_message payload: use message.get("content", "") instead of str(message) to avoid screening the full dict repr
3. send/receive override pattern: add a note in the docstring that this wrapper is intended for use outside AutoGen's internal reply loop (not as a drop-in ConversableAgent subclass)
4. agent_isolation stored but never read: either implement cross-agent key isolation in record_message or remove the parameter
5. Missing trailing newline at end of autogen.py

Happy to merge once these are addressed.


hesam-oxe · 2026-05-28T11:51:43Z

@vgudur-dev All five review items addressed:

✅ _HAS_AUTOGEN guard — raises ImportError when autogen not installed
✅ screen_message — uses message.get("content", "") for dict messages
✅ Docstring — added note about internal reply loop limitation
✅ agent_isolation — removed dead parameter from GuardedGroupChatManager
✅ Trailing newline — added to autogen.py

Test file also added for import coverage. Ready for re-review! 🙏

hesam-oxe · 2026-05-29T07:32:16Z

@vgudur-dev All tests pass locally (2 passed). Both modules import
correctly.

The CI failure appears to be the Node.js 20 deprecation warning on
actions/checkout@v4 and actions/setup-python@v5 — not related to
the code changes.

Could you re-trigger the workflow when you get a chance? 🙏

feat: add AutoGen and OpenAI Agents SDK integration adapters

de7be25

Created GuardedAutoGenAgent, GuardedGroupChatManager, GuardedAgentContext, GuardedToolOutput, and GuardedHandoff. Closes OWASP#8

vgudur-dev requested changes May 13, 2026


fix: security bugs in OpenAI Agents set_state/get_state, style fixes …

eb22e46

…for both adapters

fix: address all review feedback for autogen adapter

799f973

_HAS_AUTOGEN guard, content extraction, docstring note, dead param removed, trailing newline

chunxiaoxx mentioned this pull request May 28, 2026

RFC: add install_guard() reply-hook pattern + Nautilus platform integration proposal #35

Open

test: add import tests for autogen and openai agents adapters

07a3761
