Token-efficient MCP server for Ghidra - batch operations, context search, deterministic APIs for LLM-assisted RE

mad-sol-dev/GhidraMCPd

 
 

GhidraMCPd – token-efficient MCP server for Ghidra

Deterministic MCP server for the Ghidra plugin, focused on lowering token spend while keeping response schemas stable and auditable.

⚠️ AI-Generated Code: This repository's code is almost entirely generated by AI assistants (Codex, AiderDesk). Human role: architecture, planning, review, testing, and documentation. Use at your own risk.

Status: Experimental • License: Apache 2.0 • Credit: Fork of GhidraMCP (thanks to Laurie Wired for the original project).



Quickstart

python -m venv .venv
source .venv/bin/activate
# install the runtime dependencies only
python -m pip install -r requirements.txt
# optionally install development/test dependencies (needed for unit and contract tests)
python -m pip install -r requirements-dev.txt
uvicorn bridge.app:create_app --factory --host 127.0.0.1 --port 8000

Once running, open Ghidra with a project and the server will connect automatically.

HTTP/SSE Transport: Uvicorn serves the HTTP API and /sse endpoint for clients that support Server-Sent Events (most GUI agents and web frontends).

Stdio Transport

If you prefer a console-first workflow, or your MCP client cannot handle SSE (e.g., CLI tools or OpenAI's Codex), run the stdio transport instead of Uvicorn:

python scripts/bridge_stdio.py --transport stdio

This launches the MCP server directly over stdio (no /sse endpoint or OpenWebUI shim) and is the required transport for non-SSE clients.

Verify MCP tools over stdio

Run a quick smoke test of the MCP tools over stdio using the uvicorn factory entry point:

python scripts/verify_mcp_tools.py --ghidra-server-url http://127.0.0.1:8080/

The helper spawns uvicorn bridge.app:create_app --factory as a subprocess, then calls project_info, search_strings (defaults to the query boot), search_functions (defaults to main), and read_bytes (default address 0x401000). The script exits non-zero if any tool returns an error envelope or empty content.

Deterministic smoke test (stub firmware)

The repository ships with a tiny reference firmware fixture at bridge/tests/fixtures/reference.bin. To sanity-check common tools without a running Ghidra instance, run the stubbed MCP smoke test:

python scripts/mcp_smoke_test.py

The helper boots scripts/reference_mcp_server.py (FastMCP over stdio with the fixture-backed stub client) and asserts that project_info, project_overview, search_strings, search_functions, search_scalars_with_context, mmio_annotate_compact, read_bytes, and read_words all return data matching the fixture's layout. See docs/smoke-test.md for sample output and runtime options.


Motivation

Started as a side-quest while building an e-recumbent bike battery (needed to label matched cells → bought a Chinese handheld HP45 printer → found SD card → firmware RE → here we are 🚴).

Bridging Ghidra through MCP can be API-expensive when clients emit many small calls. Each call has a fixed overhead (request envelope, auth, JSON schemas, response framing), so dozens of tiny round-trips quickly blow up token usage.

GhidraMCPd reduces this by:

  • favoring fewer, coarse-grained calls instead of many tiny ones
  • doing more context assembly on the server side
  • enforcing deterministic, compact response shapes that are easy to diff and cache

How much you save depends on your workflow, but batching and server-side context usually cut round-trips and JSON noise significantly.
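The overhead argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the per-call overhead and payload sizes are assumed numbers chosen for the example, not measurements from GhidraMCPd.

```python
# Illustrative only: OVERHEAD and PAYLOAD are assumed values,
# not measurements taken from GhidraMCPd.

def total_tokens(calls: int, overhead_per_call: int, payload_tokens: int) -> int:
    """Total cost = fixed overhead paid once per call + the payload itself."""
    return calls * overhead_per_call + payload_tokens

OVERHEAD = 150   # assumed tokens for envelope, schemas, and framing per call
PAYLOAD = 2000   # assumed tokens of actual data, identical in both cases

many_small = total_tokens(calls=40, overhead_per_call=OVERHEAD, payload_tokens=PAYLOAD)
one_batch = total_tokens(calls=1, overhead_per_call=OVERHEAD, payload_tokens=PAYLOAD)

print(many_small)  # 8000
print(one_batch)   # 2150
```

Under these assumptions the payload is the same in both cases; only the fixed per-call cost changes, which is why fewer, coarser calls help.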


Highlights

  • Batch operations: disassemble_batch, read_words, and the collect endpoint let you work on many addresses / ranges in a single request.

  • Contextual search: search_scalars_with_context returns matches plus a server-side disassembly window around them, so clients don't need extra “give me the surrounding instructions” calls.

  • Deterministic pagination: Resume-friendly cursors with fixed limits; totals stay stable when determinable so clients can page reliably.

  • Strict envelopes & schemas: All responses use a { ok, data, errors[] } envelope with additionalProperties:false, making them LLM-friendly and easy to diff/audit.

  • Guard rails for writes: Write operations are disabled by default (ENABLE_WRITES, dry_run), with safety limits and observability exposed via /state.

  • Tested for drift: Contract, golden (OpenAPI/HTTP parity), and unit tests keep implementation and spec aligned over time.
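On the client side, the strict { ok, data, errors[] } envelope can be checked in a few lines. This is a minimal sketch, not code from the repository: the names unwrap and EnvelopeError are hypothetical, and it assumes all three keys are always present and that ok is a boolean.

```python
from typing import Any

# Key set taken from the envelope shape described above.
ENVELOPE_KEYS = {"ok", "data", "errors"}


class EnvelopeError(Exception):
    """Hypothetical error type: the response is malformed or reports errors."""


def unwrap(envelope: dict[str, Any]) -> Any:
    """Return envelope["data"], or raise if the envelope is malformed or failed."""
    keys = set(envelope)
    if keys != ENVELOPE_KEYS:
        # additionalProperties:false forbids extra keys; this sketch also
        # assumes that all three keys are required.
        raise EnvelopeError(f"unexpected envelope keys: {sorted(keys ^ ENVELOPE_KEYS)}")
    if not isinstance(envelope["errors"], list):
        raise EnvelopeError("errors must be a list")
    if envelope["ok"] is not True or envelope["errors"]:
        raise EnvelopeError(f"request failed: {envelope['errors']}")
    return envelope["data"]


print(unwrap({"ok": True, "data": {"count": 2}, "errors": []}))  # {'count': 2}
```

Rejecting unknown keys early keeps client code honest about the schema and makes drift visible as soon as a response shape changes.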


MCP client examples

Theoretically, any MCP client should work with GhidraMCPd.

AiderDesk

Example configuration for AiderDesk: go to Settings → Agent → MCP Servers (Agent Settings) → Add / Edit Config and add:

{
  "mcpServers": {
    "ghidra-bridge": {
      "name": "Ghidra Bridge",
      "type": "sse",
      "url": "http://127.0.0.1:8000/sse"
    }
  }
}

Codex

Add to /home/user/.codex/config.toml

[mcp_servers.ghidra-bridge]
# use precisely the venv Python
command = "/your/path/to/GhidraMCPd/.venv/bin/python"

# Script relative to the repo root, therefore set cwd
args = ["scripts/bridge_stdio.py", "--transport", "stdio"]

# Important so that relative paths in the server are correct
cwd = "/your/path/to/GhidraMCPd"

For more MCP client examples, see docs/getting-started.md.


Example usage

# Find up to 10 uses of MMIO address 0x40000000, each with 2 lines of surrounding context
curl -X POST http://localhost:8000/api/collect.json \
  -H 'content-type: application/json' \
  -d '{
        "queries": [
          {
            "id": "mmio-usages",
            "op": "search_scalars_with_context",
            "params": {"value": "0x40000000", "context_lines": 2, "limit": 10}
          }
        ]
      }'

# Analyze function with decompiler
curl -X POST http://localhost:8000/api/analyze_function_complete.json \
  -H 'content-type: application/json' \
  -d '{"address":"0x1000","fields":["function","decompile","xrefs"]}'
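The same collect request can be built from Python using only the standard library. This is a sketch under assumptions: the helper names build_collect_payload and post_json are hypothetical, the id scheme is invented for illustration, and it assumes the collect endpoint accepts several entries in "queries", as the batch description in Highlights suggests.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host and port as the curl examples


def build_collect_payload(values, context_lines=2, limit=10):
    """Build one collect request with one search_scalars_with_context query per value."""
    return {
        "queries": [
            {
                "id": f"scalar-{value}",  # hypothetical id scheme for this sketch
                "op": "search_scalars_with_context",
                "params": {"value": value, "context_lines": context_lines, "limit": limit},
            }
            for value in values
        ]
    }


def post_json(path, payload, base_url=BASE_URL, timeout=30.0):
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_collect_payload(["0x40000000", "0x40000004"])
print(json.dumps(payload, indent=2))
# With the bridge running: result = post_json("/api/collect.json", payload)
```

Building the payload separately from sending it lets you inspect or log the exact request before it reaches the bridge.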

For more examples, see docs/api.md.


Build the Ghidra extension

The canonical way to build the installable ZIP is via Docker:

./scripts/build_docker.sh

The script resolves the latest public Ghidra release, builds the extension inside a container, and copies the artifacts to target/ (including GhidraMCP-<version>.zip).

Installation:

  1. In Ghidra, go to File → Install Extensions and select the ZIP from target/.
  2. Restart Ghidra to load the extension.
  3. Activate: After restarting, go to File → Configure → Developer/Configure. Find GhidraMCP in the list (usually under the "Analysis" category) and check the box next to it to enable it. The plugin will not be active until you do this.

API overview

For the full list of endpoints and request/response schemas, see docs/api.md.

Key categories:

  • Batch & aggregation: /api/collect.json
  • Search & analysis: search_*, string_xrefs, list_functions_in_range, etc.
  • Memory & disassembly: read_*, disassemble_*
  • Advanced analysis: analyze_function_complete, jump-table helpers
  • Data & utilities: strings_compact, project_*, write_bytes, health

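Documentation

The guides referenced in this README live under docs/: docs/getting-started.md (MCP client examples), docs/api.md (endpoints and request/response schemas), and docs/smoke-test.md (stub smoke test output and options).
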
Status

This repo is maintained with a deterministic plan. See Development workflow for the .plan/ process.

Contributing

Contributions welcome! This is an experimental project with AI-generated code – review carefully and test thoroughly. See Development workflow for the .plan/ process.

License

Apache 2.0 – See LICENSE for details.
