The "Deep Audit" Release (Patched some vulnerabilities) #6

Harshit-J004 · 2026-03-22T20:52:54Z

Harshit-J004
Mar 22, 2026
Maintainer

The v3.0.0 release gave you the execution firewall. Today, v3.1.0 stress-tests it against the real world.

We ran ToolGuard's automated fuzzer against actual tools from major AI agent ecosystems and validated native integration with all 7 supported frameworks. Then we audited every line of ToolGuard's own codebase and shipped 11 critical stability patches.

1. 🛡️ Real-World Tool Fuzzing

We fuzzed actual, shipping tools from major AI agent ecosystems to prove ToolGuard catches what the frameworks don't.

LangChain — WikipediaQueryRun (real community tool):
We imported the actual WikipediaQueryRun tool from langchain-community and hit it with 40 hallucinated LLM payloads. LangChain tools accept complex str | dict | ToolCall unions. When the fuzzer sent invalid types, the native pipeline threw massive unhandled Pydantic tracebacks. ToolGuard's guard_langchain_tool intercepted 39 of 40 crashes, converting them to clean SchemaValidationError messages the LLM can self-correct from.

CrewAI — ScrapeWebsiteTool (real community tool):
We imported the actual ScrapeWebsiteTool from crewai-tools and hit it with 44 hallucinated payloads. CrewAI unpacks inputs via **kwargs and throws hardcoded ValueError traps when required fields are missing. ToolGuard's guard_crewai_tool intercepted all 44 crashes gracefully — zero Python tracebacks escaped.

2. 🔌 Native Framework Integration Validation

Beyond real-tool fuzzing, we validated that ToolGuard's adapters work natively with every supported framework's tool interface. These tests confirm that ToolGuard adds the input validation layer that these frameworks intentionally leave to the developer:

• Microsoft AutoGen (FunctionTool): AutoGen passes raw JSON dicts to user functions with zero validation. guard_autogen_tool catches TypeError and SchemaValidationError before they crash your agent loop.
• LlamaIndex (FunctionTool): LlamaIndex relies on Pydantic internally but doesn't gracefully handle hallucinated edge-cases like null or type-mismatched inputs. guard_llamaindex_tool absorbs the ValidationError tracebacks.
• OpenAI Swarm (Agent.functions): Swarm exposes plain Python functions with no validation at all. guard_swarm_agent extracts every function, wraps each with Pydantic schema enforcement, and even detected 3 Prompt Injection Vulnerabilities via reflected payload scanning.
• FastAPI (Middleware): When FastAPI endpoints are exposed as agent tools, the HTTP 422 safety net disappears. as_fastapi_tool restores Pydantic validation at the function level.
• AutoGPT (web_search): AutoGPT blindly trusts LLM inputs. @create_tool catches null and type-mismatch payloads before they reach the DuckDuckGo scraper.

Input validation failed for 'load_financial_data'
  Tool: load_financial_data
  💡 Suggestion: Agent hallucinated payload. Schema mismatch:
  - Field 'year': Input should be a valid integer (Got: None | Type: NoneType)

All 7 framework adapters verified. Zero ToolGuard internal crashes.

3. 🛠️ Deep Codebase Audit & 11 Critical Patches

We executed a line-by-line codebase audit and shipped 11 stabilization patches:

• CrewAI Native Extraction Bug: Fixed logic flaw in guard_crewai_tool where non-callable Pydantic subclass instances threw TypeErrors during initialization.
• Bare Decorator Support: Fixed DX bug where @create_tool (without parentheses) caused a runtime crash.
• Fuzzer Base Inference: test_chain now infers base inputs from wrapped tool signatures when base_input is empty.
• Clean Exception Handlers: Stopped raw Python tracebacks from leaking when self._sig.bind() failed.
• LangChain / CrewAI Inheritance Checks: Repaired NotImplementedError bypass logic for legacy _run methods.
• Dynamic Scoring Stability: Fixed KeyError in Console Reporter when tools dynamically mutated names during __init__.
• Pruned Tech Debt: Removed duplicated comments, fixed type annotations (callable → typing.Callable), and dead variable paths.

ToolGuard v3.1.0: battle-tested against real community tools, natively integrated with 7 frameworks, and hardened with 11 stability patches. Install the update and guard your agents.

This discussion was created from the release The "Deep Audit" Release (Patched some vulnerabilities).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The "Deep Audit" Release (Patched some vulnerabilities) #6

Uh oh!

{{title}}

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

The "Deep Audit" Release (Patched some vulnerabilities) #6

Uh oh!

Harshit-J004 Mar 22, 2026 Maintainer

1. 🛡️ Real-World Tool Fuzzing

2. 🔌 Native Framework Integration Validation

3. 🛠️ Deep Codebase Audit & 11 Critical Patches

Replies: 0 comments

Harshit-J004
Mar 22, 2026
Maintainer