Add tool injection attack module for MCP tool misuse testing#65
Open
bharqav wants to merge 5 commits into
Open
Add tool injection attack module for MCP tool misuse testing#65bharqav wants to merge 5 commits into
bharqav wants to merge 5 commits into
Conversation
…qav/crucible into feat/tool-injection-attacks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This pull request introduces a new, dedicated tool-augmented agent attack module to address vulnerabilities related to MCP tool injection and tool misuse. The new module includes 20 adversarial payloads categorized into four distinct attack vectors: parameter injection, selection manipulation, tool chain poisoning, and unauthorized tool invocation. The primary goal is to evaluate whether AI agents with tool access can be manipulated into executing unauthorized commands, selecting inappropriate endpoints, or trusting malicious output structures.
Several key architectural decisions were made during implementation:
Fixes issue : #49
Type of change
✓ New attack vector (non-breaking change which adds functionality)
✓ This change requires a documentation update ( this introduces a completely new security module and four new attack types, it requires updating the security architecture documentation.)
How Has This Been Tested?
The implementation was verified against a local Python environment using three distinct steps:
✓
pytest tests/test_tool_injection.py✓
mypy crucible/ tests/ --strict✓
ruff check crucible/286 tests passed
All checks passed successfully without warnings.
Checklist:
✓ My code follows the style guidelines of this project
✓ I have performed a self-review of my own code
✓ I have commented my code, particularly in hard-to-understand areas
✓ I have made corresponding changes to the documentation
✓ My changes generate no new warnings
✓ I have added tests that prove my fix is effective or that my feature works
✓ New and existing unit tests pass locally with my changes
✓ Any dependent changes have been merged and published in downstream modules