Give your AI eyes. The ultimate PDF tool for Claude Desktop. Reads text, sees diagrams, and digitizes scans using Mistral AI.
Most PDF tools just dump raw text. This one is different:
- It sees Diagrams: Uses Computer Vision (Mistral) to understand charts, timing diagrams, and technical drawings.
- It reads Scans: Built-in OCR detects scanned pages and converts them to text automatically.
- It's Fast & Cheap: Processes pages in parallel (5-10x faster) and keeps a local cache so you don't pay for the same API call twice.
- Smart Fallback: If you don't have an API key, it intelligently degrades to standard text extraction or passes images to Claude.
Battle-tested on 897-page chip datasheets for reverse engineering. See the real-world test results in TESTING_NOTES.md.
Since this is a power tool, we build it locally to give you full control.
You need Git, Node.js (v22+), and Bun (a fast Node alternative).
Don't have Bun?
# Mac/Linux/WSL
curl -fsSL https://bun.sh/install | bash
# Windows (PowerShell)
powershell -c "irm bun.sh/install.ps1 | iex"
Open your terminal/PowerShell and run:
# Clone the repo
git clone https://github.com/mad-sol-dev/pdf-reader-mcp.git
cd pdf-reader-mcp
# Install & Build
bun install
bun run build
# ⚠️ COPY THE PATH BELOW - You need it for the config!
echo "Your absolute path is:"
pwd
# (On Windows, use 'cd' to see the path)
Open your config file:
- Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
Add this (replace /YOUR/PATH/TO/pdf-reader-mcp with the absolute path you copied after the build):
{
  "mcpServers": {
    "pdf-reader": {
      "command": "node",
      "args": ["/YOUR/PATH/TO/pdf-reader-mcp/dist/index.js"],
      "env": {
        "MISTRAL_API_KEY": "your_mistral_key_here",
        "PDF_ALLOWED_PATHS": "/Users/Me/Documents:/Users/Me/Downloads"
      }
    }
  }
}
Note on API Keys:
MISTRAL_API_KEY is optional but highly recommended for OCR and diagram analysis. Without it, the tool falls back to basic text extraction.
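A typo in the config file is the most common reason a server does not show up after a restart, so it can help to sanity-check the file first. The snippet below is a minimal sketch, not part of this project: check_pdf_reader_config is a hypothetical helper that only inspects the keys shown in the example config above.

```python
import json
import ntpath
import posixpath


def check_pdf_reader_config(config_text, server_name="pdf-reader"):
    """Return a list of human-readable problems found in the config text.

    Hypothetical helper: it checks only the keys used in the example config.
    """
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"config is not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top level of the config must be a JSON object"]

    server = config.get("mcpServers", {}).get(server_name)
    if server is None:
        return [f"no '{server_name}' entry under mcpServers"]

    problems = []
    args = server.get("args", [])
    # Accept both POSIX-style and Windows-style absolute paths.
    if not args or not (posixpath.isabs(args[0]) or ntpath.isabs(args[0])):
        problems.append("args[0] should be an absolute path to dist/index.js")

    env = server.get("env", {})
    if not env.get("PDF_ALLOWED_PATHS"):
        problems.append("PDF_ALLOWED_PATHS is missing or empty")
    if not env.get("MISTRAL_API_KEY"):
        problems.append("MISTRAL_API_KEY not set: expect text-only fallback")
    return problems
```

To use it, read your claude_desktop_config.json into a string and pass it in. An empty list means the entry looks structurally fine; it does not prove that the API key is valid or that the build succeeded.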
Once installed, restart Claude Desktop. You don't need to use technical commands. Just talk:
Standard Reading
"Read the file specification.pdf and summarize the introduction."
Analyzing Diagrams (The Superpower)
"Look at the timing diagram on page 5 of datasheet.pdf. Explain the signal sequence." (The tool will auto-select pdf_vision for this.)
Reading Scans (OCR)
"This invoice_scan.pdf is an image. Extract the total amount and date."
To keep your files safe, this tool uses a security allowlist.
| Variable | Function | Example |
|---|---|---|
| PDF_ALLOWED_PATHS | Required. Colon-separated list of folders the AI can access. | /Users/Me/Docs:/tmp |
| MISTRAL_API_KEY | Enables Vision & OCR. Get one at console.mistral.ai. | xYz123... |
| PDF_BASE_DIR | (Optional) Base folder for relative paths. | /Users/Me/Projects |
Troubleshooting: "Resolved path is outside allowed directories"
If you see this error, it means you are trying to read a PDF that isn't in one of the folders listed in PDF_ALLOWED_PATHS. Add the folder to your config and restart Claude.
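To make the allowlist behavior concrete, here is a minimal sketch of how such a check can work. It is an illustration under assumptions, not the server's actual code: is_path_allowed is a hypothetical name, the folder list is split on ":" as described in the table (which suits POSIX-style paths), and symlinks and ".." segments are resolved before comparing.

```python
import os


def is_path_allowed(pdf_path, allowed_paths, base_dir=None):
    """Return True if pdf_path resolves inside one of the allowed folders.

    Hypothetical sketch: allowed_paths is a colon-separated string such as
    the value of PDF_ALLOWED_PATHS; base_dir plays the role of PDF_BASE_DIR.
    """
    # Relative paths are interpreted against the base folder, if one is given.
    if base_dir is not None and not os.path.isabs(pdf_path):
        pdf_path = os.path.join(base_dir, pdf_path)

    # Resolve symlinks and ".." so a path cannot escape via traversal.
    resolved = os.path.realpath(pdf_path)

    for folder in allowed_paths.split(":"):
        if not folder:
            continue
        root = os.path.realpath(folder)
        # Compare whole path components, so /Docs-old does not match /Docs.
        if os.path.commonpath([resolved, root]) == root:
            return True
    return False
```

With PDF_ALLOWED_PATHS set to /Users/Me/Documents, a request for /Users/Me/Documents/../Secrets/x.pdf resolves outside the folder and is rejected, which is the situation the error message above describes.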
Under the hood, this server exposes these tools to Claude:
- pdf_read: Fast, parallel text extraction. Supports [IMAGE] markers.
- pdf_vision: Uses Mistral Vision for complex visual elements.
- pdf_ocr: High-fidelity OCR for scanned docs and tables.
- pdf_search: Regex-enabled search across documents.
- pdf_extract_image: Pulls raw images for manual inspection.
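As a rough illustration of what regex-enabled search over extracted text involves, the sketch below scans a list of page strings and reports each match with its page number. It is not the pdf_search implementation: search_pages, its parameters, and the case-insensitive default are assumptions made for this example.

```python
import re


def search_pages(pages, pattern, context=20):
    """Return a list of (page_number, matched_text, snippet) tuples.

    Hypothetical sketch: pages is a list of already-extracted page texts,
    with page numbers starting at 1.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for number, text in enumerate(pages, start=1):
        for match in regex.finditer(text):
            # Keep a little surrounding text so each hit is readable on its own.
            start = max(match.start() - context, 0)
            end = min(match.end() + context, len(text))
            hits.append((number, match.group(0), text[start:end]))
    return hits
```

A pattern such as `\d\.\d V` would, for example, pick out voltage values in a datasheet page along with the page they appear on.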
Caching:
Results are cached in {filename}_ocr.json next to your PDF.
- First run: Takes a few seconds (API call).
- Second run: Instant (Local cache).
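The caching pattern described here is a read-through cache: look for the JSON file first, and call the API only when it is missing. The following is a minimal sketch under stated assumptions, not the project's code. In particular, it assumes that {filename} means the PDF name without its extension, and run_ocr is a hypothetical stand-in for whatever function performs the API call.

```python
import json
from pathlib import Path


def cache_path_for(pdf_path):
    """Return the cache file path next to the PDF.

    Assumption: {filename} is the PDF name without its .pdf extension.
    """
    pdf = Path(pdf_path)
    return pdf.with_name(f"{pdf.stem}_ocr.json")


def ocr_with_cache(pdf_path, run_ocr):
    """Return cached OCR results if present, otherwise call run_ocr once."""
    cache = cache_path_for(pdf_path)
    if cache.exists():
        # Second run: served from disk, no API call.
        return json.loads(cache.read_text(encoding="utf-8"))

    # First run: pay for the call, then store the result beside the PDF.
    result = run_ocr(pdf_path)
    cache.write_text(json.dumps(result), encoding="utf-8")
    return result
```

One consequence of this design is that deleting the _ocr.json file is enough to force a fresh API call, for example after replacing the PDF with a newer revision.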
- TESTING_NOTES.md - Real-world testing with 897-page technical PDFs
- CHANGELOG.md - Version history and features
- CLAUDE.md - Development guidelines (for contributors)
This project is built on the excellent foundation from SylphxAI/pdf-reader-mcp. Thank you for the solid architecture!
What they built:
- Fast parallel processing (5-10x speedup)
- Smart content ordering and error handling
- Flexible path resolution
- Rock-solid test coverage
What we added:
- Vision API for technical diagrams (Mistral)
- Enhanced OCR with full response structure
- Smart content routing (Vision vs OCR)
- Real-world validation and testing
Contributors:
- Sylphx Team - Original architecture and core PDF processing
- mad-sol-dev & Claude Sonnet 4.5 - Vision/OCR integration, testing, docs
Built with ❤️ using the Model Context Protocol