📄 PDF Reader MCP (Vision & OCR Edition)

Give your AI eyes. The ultimate PDF tool for Claude Desktop. Reads text, sees diagrams, and digitizes scans using Mistral AI.

⚡ Why this one?

Most PDF tools just dump raw text. This one is different:

It sees Diagrams: Uses Computer Vision (Mistral) to understand charts, timing diagrams, and technical drawings.
It reads Scans: Built-in OCR detects scanned pages and converts them to text automatically.
It's Fast & Cheap: Pages are processed in parallel (5-10x faster), and results are cached locally so you don't pay for the same API call twice.
Smart Fallback: Without an API key, it falls back to standard text extraction or passes images to Claude directly.
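
To make the fallback idea concrete, here is a minimal sketch of how such routing could be decided per page. It is not the project's actual logic: the `Strategy` type, the `PageInfo` fields, the `chooseStrategy` function, and the 20-character scan threshold are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical names throughout: none of these types or functions come from the project.
type Strategy = "text" | "ocr" | "vision" | "image-passthrough";

interface PageInfo {
  textLength: number;   // characters of embedded text found on the page
  imageCount: number;   // embedded images found on the page
  wantsVisual: boolean; // the request is about a diagram or chart
}

// Assumed heuristic: a page with almost no text but at least one image is a scan.
const SCAN_TEXT_THRESHOLD = 20;

function looksScanned(page: PageInfo): boolean {
  return page.textLength < SCAN_TEXT_THRESHOLD && page.imageCount > 0;
}

function chooseStrategy(page: PageInfo, hasApiKey: boolean): Strategy {
  const needsImages =
    page.imageCount > 0 && (page.wantsVisual || looksScanned(page));
  if (!needsImages) return "text";            // plain extraction is enough
  if (!hasApiKey) return "image-passthrough"; // hand the image to Claude instead
  return page.wantsVisual ? "vision" : "ocr";
}

// A scanned page with an API key available.
console.log(chooseStrategy({ textLength: 0, imageCount: 1, wantsVisual: false }, true));
```

The point of the sketch is the ordering: the API key is only consulted once the page is known to need image handling, so text-only pages never depend on it.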

✅ Battle-tested on 897-page chip datasheets for reverse engineering. See TESTING_NOTES.md for real-world test results.

🚀 Quick Start (2 Minutes)

Since this is a power tool, we build it locally to give you full control.

1. Install Prerequisites

You need Git, Node.js (v22+), and Bun (a fast Node alternative).

Don't have Bun?

# Mac/Linux/WSL
curl -fsSL https://bun.sh/install | bash

# Windows (PowerShell)
powershell -c "irm bun.sh/install.ps1 | iex"

2. Download & Build

Open your terminal/PowerShell and run:

# Clone the repo
git clone https://github.com/mad-sol-dev/pdf-reader-mcp.git
cd pdf-reader-mcp

# Install & Build
bun install
bun run build

# ⚠️ COPY THE PATH BELOW - You need it for the config!
echo "Your absolute path is:"
pwd
# (In Windows cmd.exe, run 'cd' with no arguments instead; 'pwd' also works in PowerShell)

3. Configure Claude Desktop

Open your config file:

Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Add this (replace /YOUR/PATH/... with the path from Step 2):

{
  "mcpServers": {
    "pdf-reader": {
      "command": "node",
      "args": ["/YOUR/PATH/TO/pdf-reader-mcp/dist/index.js"],
      "env": {
        "MISTRAL_API_KEY": "your_mistral_key_here",
        "PDF_ALLOWED_PATHS": "/Users/Me/Documents:/Users/Me/Downloads"
      }
    }
  }
}

Note on API Keys: MISTRAL_API_KEY is optional but highly recommended for OCR and Diagram analysis. Without it, the tool falls back to basic text extraction.
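
If the server does not show up after editing the config, a quick sanity check of the entry can help. The sketch below is only an illustration of the rules stated in this README (absolute path, required allowlist, optional key); the `ServerEntry` interface and `checkServerEntry` function are hypothetical and not part of the project.

```typescript
// Hypothetical helper: checks one "mcpServers" entry against the rules in this README.
interface ServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function checkServerEntry(entry: ServerEntry): string[] {
  const problems: string[] = [];
  const script = entry.args[0] ?? "";

  // Accept POSIX absolute paths and Windows drive-letter paths.
  const isAbsolute = script.startsWith("/") || /^[A-Za-z]:[\\/]/.test(script);
  if (!isAbsolute) {
    problems.push("args[0] should be an absolute path to dist/index.js");
  }
  if (script.includes("/YOUR/PATH")) {
    problems.push("args[0] still contains the /YOUR/PATH placeholder");
  }
  if (!entry.env?.PDF_ALLOWED_PATHS) {
    problems.push("PDF_ALLOWED_PATHS is not set");
  }
  // The key is optional, but the sample placeholder is never a valid key.
  if (entry.env?.MISTRAL_API_KEY === "your_mistral_key_here") {
    problems.push("MISTRAL_API_KEY holds the placeholder; set a real key or remove it");
  }
  return problems;
}

// The unedited sample from this README trips two of the checks.
const sample: ServerEntry = {
  command: "node",
  args: ["/YOUR/PATH/TO/pdf-reader-mcp/dist/index.js"],
  env: {
    MISTRAL_API_KEY: "your_mistral_key_here",
    PDF_ALLOWED_PATHS: "/Users/Me/Documents:/Users/Me/Downloads",
  },
};
console.log(checkServerEntry(sample));
```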

🗣️ How to use it (Prompts)

Once installed, restart Claude Desktop. You don't need to use technical commands. Just talk:

📄 Standard Reading

"Read the file specification.pdf and summarize the introduction."

📊 Analyzing Diagrams (The Superpower)

"Look at the timing diagram on page 5 of datasheet.pdf. Explain the signal sequence." (The tool will auto-select pdf_vision for this)

🧾 Reading Scans (OCR)

"This invoice_scan.pdf is an image. Extract the total amount and date."

⚙️ Configuration & Security

To keep your files safe, this tool uses a security allowlist.

| Variable | Function | Example |
| --- | --- | --- |
| `PDF_ALLOWED_PATHS` | Required. Colon-separated list of folders the AI can access. | `/Users/Me/Docs:/tmp` |
| `MISTRAL_API_KEY` | Enables Vision & OCR. Get one at console.mistral.ai. | `xYz123...` |
| `PDF_BASE_DIR` | (Optional) Base folder for relative paths. | `/Users/Me/Projects` |
Troubleshooting: "Resolved path is outside allowed directories"

This error means the PDF you are trying to read is not inside any folder listed in PDF_ALLOWED_PATHS. Add its folder to your config and restart Claude Desktop.
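
For readers curious how an allowlist like this typically behaves, here is a minimal sketch of the check. It is an assumption about the general technique, not the server's actual code: `parseAllowedPaths`, `isInside`, and `resolveAllowed` are hypothetical names, the sketch assumes POSIX-style paths (a colon separator would clash with Windows drive letters), and it does not follow symlinks.

```typescript
import * as path from "node:path";

// Split a colon-separated list such as "/Users/Me/Docs:/tmp" into absolute folders.
function parseAllowedPaths(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(":")
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((p) => path.resolve(p));
}

// True when target is dir itself or lies somewhere beneath it.
function isInside(dir: string, target: string): boolean {
  const rel = path.relative(dir, target);
  const escapes =
    rel === ".." || rel.startsWith(`..${path.sep}`) || path.isAbsolute(rel);
  return !escapes;
}

// Resolve a requested path against a base folder, then enforce the allowlist.
function resolveAllowed(requested: string, allowed: string[], baseDir: string): string {
  const resolved = path.resolve(baseDir, requested);
  if (!allowed.some((dir) => isInside(dir, resolved))) {
    throw new Error("Resolved path is outside allowed directories");
  }
  return resolved;
}

const allowed = parseAllowedPaths("/Users/Me/Docs:/tmp");
console.log(resolveAllowed("spec.pdf", allowed, "/Users/Me/Docs"));
```

Comparing with `path.relative` rather than a plain string prefix matters: a prefix test would wrongly accept `/Users/Me/Docs-old/a.pdf` as being inside `/Users/Me/Docs`.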

🛠️ For Developers: The Toolkit

Under the hood, this server exposes these tools to Claude:

pdf_read: Fast, parallel text extraction. Supports [IMAGE] markers.
pdf_vision: Uses Mistral Vision for complex visual elements.
pdf_ocr: High-fidelity OCR for scanned docs and tables.
pdf_search: Regex-enabled search across documents.
pdf_extract_image: Pulls raw images for manual inspection.
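
As an illustration of what a regex-enabled search over extracted text involves, the sketch below scans an array of per-page strings and reports each hit with its page number. It is not the implementation of pdf_search: the `SearchHit` shape, the `searchPages` function, and the default case-insensitive flag are assumptions made for the example.

```typescript
// Hypothetical result shape; the real tool's output format may differ.
interface SearchHit {
  page: number;  // 1-based page number
  index: number; // character offset of the match within the page text
  match: string; // the matched text
}

function searchPages(pages: string[], pattern: string, flags = "i"): SearchHit[] {
  // The global flag is required so exec() advances through every match.
  const re = new RegExp(pattern, flags.includes("g") ? flags : `${flags}g`);
  const hits: SearchHit[] = [];
  pages.forEach((text, i) => {
    re.lastIndex = 0; // start each page from the beginning
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      hits.push({ page: i + 1, index: m.index, match: m[0] });
      if (m[0].length === 0) re.lastIndex++; // avoid looping forever on empty matches
    }
  });
  return hits;
}

const pages = ["VDD = 3.3V typical", "Reset: nRST low for 10us", "VDD max 3.6V"];
console.log(searchPages(pages, "\\d+\\.\\d+V"));
```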

Caching: Results are cached in {filename}_ocr.json next to your PDF.

First run: Takes a few seconds (API call).
Second run: Instant (Local cache).
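
The first-run/second-run behavior above is a read-through cache. A minimal sketch of that pattern follows, assuming that `{filename}` means the PDF's base name without the `.pdf` extension (the README does not say either way). The `cachePathFor` and `cachedOcr` functions are hypothetical, and `runOcr` stands in for whatever performs the actual API call.

```typescript
import * as path from "node:path";
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Assumption: "datasheet.pdf" maps to "datasheet_ocr.json" in the same folder.
function cachePathFor(pdfPath: string): string {
  const { dir, name } = path.parse(pdfPath);
  return path.join(dir, `${name}_ocr.json`);
}

// Return the cached result if present; otherwise run the OCR call and store it.
async function cachedOcr<T>(pdfPath: string, runOcr: () => Promise<T>): Promise<T> {
  const cachePath = cachePathFor(pdfPath);
  if (existsSync(cachePath)) {
    return JSON.parse(readFileSync(cachePath, "utf8")) as T; // second run: no API call
  }
  const result = await runOcr();                             // first run: pay once
  writeFileSync(cachePath, JSON.stringify(result, null, 2));
  return result;
}

console.log(cachePathFor("/Users/Me/Documents/datasheet.pdf"));
```

Under this scheme, deleting the JSON file next to the PDF would force a fresh run. A sketch like this also ignores stale caches: if the PDF changes, the cached result would still be returned.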

📚 Documentation

TESTING_NOTES.md - Real-world testing with 897-page technical PDFs
CHANGELOG.md - Version history and features
CLAUDE.md - Development guidelines (for contributors)

🙏 Credits

This project is built on the excellent foundation from SylphxAI/pdf-reader-mcp – thank you for the solid architecture!

What they built:

Fast parallel processing (5-10x speedup)
Smart content ordering and error handling
Flexible path resolution
Rock-solid test coverage

What we added:

Vision API for technical diagrams (Mistral)
Enhanced OCR with full response structure
Smart content routing (Vision vs OCR)
Real-world validation and testing

Contributors:

Sylphx Team - Original architecture and core PDF processing
mad-sol-dev & Claude Sonnet 4.5 - Vision/OCR integration, testing, docs

Built with ❤️ using the Model Context Protocol

Report Bug • Request Feature
