A self-hosted OCR API powered by DeepSeek-OCR-2. Runs on GPU via Docker, processes PDFs and images, and returns clean markdown content in JSON responses. Supports multiple languages with high accuracy.

| Specification | Value |
|---|---|
| Model | DeepSeek-OCR-2 |
| Architecture | DeepSeek-VL-v2 based |
| Model size | 6.4GB |
| GPU VRAM | ~8GB |
| Input formats | PDF, PNG, JPG, JPEG, BMP, TIFF, WEBP |

- Docker with NVIDIA Container Toolkit
- NVIDIA GPU with ~8GB VRAM

Using Docker Hub image:

    services:
      deepseek-ocr2:
        image: edgaras0x4e/deepseek-ocr-2-api:latest
        container_name: deepseek-ocr2
        ports:
          - "9713:7860"
        volumes:
          - ocr-data:/data
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: 1
                  capabilities: [gpu]
        restart: unless-stopped

    volumes:
      ocr-data:

Save this as docker-compose.yml and start the service:

    docker compose up -d

Or build from source:

    docker compose up --build -d

The API will be available at http://localhost:9713. On first startup the model (~6.4GB) is downloaded and loaded into GPU memory (~8GB VRAM). The API accepts requests immediately, but jobs start processing only once the model is ready.

Upload a PDF or image for processing:

    # PDF
    curl -X POST http://localhost:9713/ocr -F "file=@document.pdf"

    # Image
    curl -X POST http://localhost:9713/ocr -F "file=@scan.jpg"

Response:

    {
      "job_id": "994e7b398bb44d8ab5eade4d2ef57a15",
      "filename": "document.pdf",
      "status": "queued"
    }

Get job status and progress:

    curl http://localhost:9713/ocr/{job_id}

Response:

    {
      "job_id": "994e7b398bb44d8ab5eade4d2ef57a15",
      "filename": "document.pdf",
      "status": "processing",
      "total_pages": 185,
      "processed_pages": 42,
      "error": null
    }

Get markdown for a specific page:

    curl http://localhost:9713/ocr/{job_id}/pages/1

Response:

    {
      "job_id": "994e7b398bb44d8ab5eade4d2ef57a15",
      "page_num": 1,
      "markdown": "## Chapter 1\n\nLorem ipsum dolor sit amet, consectetur adipiscing elit..."
    }

Get all completed pages:

    curl http://localhost:9713/ocr/{job_id}/result

Response:

    {
      "job_id": "994e7b398bb44d8ab5eade4d2ef57a15",
      "filename": "document.pdf",
      "status": "completed",
      "total_pages": 185,
      "processed_pages": 185,
      "pages": [
        {"page_num": 1, "markdown": "## Chapter 1\n\nLorem ipsum dolor sit amet..."},
        {"page_num": 2, "markdown": "..."}
      ]
    }

List all jobs:

    curl http://localhost:9713/jobs

Response:

    {
      "jobs": [
        {
          "job_id": "994e7b398bb44d8ab5eade4d2ef57a15",
          "filename": "document.pdf",
          "status": "completed",
          "total_pages": 185,
          "processed_pages": 185
        }
      ]
    }

Cancel a queued or running job:

    curl -X POST http://localhost:9713/ocr/{job_id}/cancel

Response:

    {
      "job_id": "994e7b398bb44d8ab5eade4d2ef57a15",
      "status": "cancelling"
    }

Delete a job and its data:

    curl -X DELETE http://localhost:9713/ocr/{job_id}

Response:

    {
      "status": "deleted"
    }

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/ocr` | Upload a PDF or image for processing |
| `GET` | `/ocr/{job_id}` | Get job status and progress |
| `GET` | `/ocr/{job_id}/pages/{page_num}` | Get markdown for a specific page |
| `GET` | `/ocr/{job_id}/result` | Get all completed pages |
| `POST` | `/ocr/{job_id}/cancel` | Cancel a queued or running job |
| `DELETE` | `/ocr/{job_id}` | Delete a job and its data |
| `GET` | `/jobs` | List all jobs |
| `GET` | `/health` | Check API and model status |
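
The status endpoint can also be polled from a script instead of by hand. The following is a minimal sketch using only the Python standard library. The helper names (`get_json`, `progress_percent`, `wait_for_job`) are made up for illustration, and treating every status other than `queued` and `processing` as "done waiting" is an assumption, since only the statuses shown in the examples above are documented here. Uploading is left to the `curl` command.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "http://localhost:9713"  # host port from the compose example
ACTIVE_STATES = ("queued", "processing")  # assumption: anything else ends the wait


def get_json(path: str, api_key: Optional[str] = None) -> dict:
    """GET a JSON document from the API, adding X-API-Key when a key is given."""
    headers = {"X-API-Key": api_key} if api_key else {}
    request = urllib.request.Request(BASE_URL + path, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def progress_percent(status: dict) -> float:
    """Convert processed_pages / total_pages to a percentage (0.0 if unknown)."""
    total = status.get("total_pages") or 0
    done = status.get("processed_pages") or 0
    return round(100.0 * done / total, 1) if total else 0.0


def wait_for_job(
    job_id: str,
    fetch: Callable[[str], dict] = get_json,
    interval: float = 2.0,
    timeout: float = 3600.0,
) -> dict:
    """Poll GET /ocr/{job_id} until the job is no longer queued or processing."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(f"/ocr/{job_id}")
        if status["status"] not in ACTIVE_STATES:
            return status  # the caller decides how to handle the final status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status['status']}")
        time.sleep(interval)
```

With a job ID returned by the upload call, `wait_for_job(job_id)` blocks until the job settles, after which `get_json(f"/ocr/{job_id}/result")` fetches the pages. The `fetch` parameter exists so the loop can be exercised without a running server.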

Environment variables in docker-compose.yml:

| Variable | Default | Description |
|---|---|---|
| `API_KEY` | (empty) | Optional API key. When set, all requests must include an X-API-Key header |
| `OCR_DPI` | `300` | DPI for PDF page rendering |
| `DB_PATH` | `/data/ocr.db` | SQLite database path |
| `UPLOAD_DIR` | `/data/uploads` | Upload storage path |
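
To make the defaults concrete, here is a sketch of how a service could read these four variables and fall back to the documented values. It is not the project's actual code: the `Settings` class and `load_settings` function are hypothetical names, and treating an empty `API_KEY` as "authentication disabled" is an assumption based on the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]  # None means no X-API-Key header is required
    ocr_dpi: int
    db_path: str
    upload_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        api_key=env.get("API_KEY") or None,  # empty string counts as unset
        ocr_dpi=int(env.get("OCR_DPI", "300")),
        db_path=env.get("DB_PATH", "/data/ocr.db"),
        upload_dir=env.get("UPLOAD_DIR", "/data/uploads"),
    )
```

Passing a plain dictionary, for example `load_settings({"OCR_DPI": "150"})`, shows the effect of an override without touching the real environment.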

Uncomment the environment section in docker-compose.yml:

    environment:
      - API_KEY=your-secret-key

Then restart:

    docker compose down && docker compose up -d

All requests must then include the header:

    curl -H "X-API-Key: your-secret-key" http://localhost:9713/jobs

The docker-compose.yml for building from source, with the environment section commented out:

    services:
      deepseek-ocr2:
        build: .
        container_name: deepseek-ocr2
        ports:
          - "9713:7860"
        # environment:
        #   - API_KEY=your-secret-key
        volumes:
          - ocr-data:/data
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: 1
                  capabilities: [gpu]
        restart: unless-stopped

    volumes:
      ocr-data:

- A PDF or image is uploaded and saved to disk
- A background worker picks up queued jobs in order
- PDFs: Each page is converted to an image using PyMuPDF
- Images: Used directly as a single-page job
- DeepSeek-OCR-2 extracts text and converts it to markdown with grounding
- Special markup tags and bounding box data are stripped from the output
- Results are stored in SQLite and available per-page as they complete
- Jobs interrupted by a restart are automatically re-queued
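
The queueing behaviour in the steps above (jobs picked up in order, interrupted jobs re-queued after a restart) can be sketched with SQLite. This is an illustration only: the table layout and the function names `open_db`, `enqueue`, `requeue_interrupted`, and `claim_next` are hypothetical and are not taken from the project's schema.

```python
import sqlite3
from typing import Optional, Tuple


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the hypothetical jobs table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " filename TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'queued')"
    )
    return conn


def enqueue(conn: sqlite3.Connection, filename: str) -> int:
    """Add a new job in the queued state and return its row id."""
    cur = conn.execute("INSERT INTO jobs (filename) VALUES (?)", (filename,))
    conn.commit()
    return cur.lastrowid


def requeue_interrupted(conn: sqlite3.Connection) -> int:
    """On startup, move jobs left in 'processing' back to 'queued'."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'queued' WHERE status = 'processing'"
    )
    conn.commit()
    return cur.rowcount  # number of jobs that were re-queued


def claim_next(conn: sqlite3.Connection) -> Optional[Tuple[int, str]]:
    """Mark the oldest queued job as processing and return (id, filename)."""
    row = conn.execute(
        "SELECT id, filename FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None  # nothing waiting
    conn.execute("UPDATE jobs SET status = 'processing' WHERE id = ?", (row[0],))
    conn.commit()
    return row
```

Because job state lives in the database rather than in memory, a worker that calls `requeue_interrupted` once at startup resumes where it left off. A single sequential worker is assumed, so `claim_next` does not guard against concurrent claims.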

PDFs: Multi-page processing, each page processed independently

Images: Single-page processing

- PNG
- JPG/JPEG
- BMP
- TIFF
- WEBP

The /data volume stores the SQLite database and uploaded files. This is a named Docker volume (ocr-data) that persists across container restarts and rebuilds.
- Model loading takes ~2-5 minutes on first startup
- Processing speed depends on GPU and image complexity
- PDF pages are rendered at 300 DPI by default (configurable)
- Jobs are processed sequentially in order of submission
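
As a rough guide to what the DPI setting means for image size: PDF page dimensions are measured in points (1/72 inch), so the pixel size of a rendered page scales linearly with DPI. The helper below is a small worked calculation with a made-up name (`rendered_size`); the renderer's own rounding may differ by a pixel.

```python
from typing import Tuple

POINTS_PER_INCH = 72  # PDF page units per inch


def rendered_size(width_pt: float, height_pt: float, dpi: int = 300) -> Tuple[int, int]:
    """Approximate pixel dimensions of a PDF page rendered at the given DPI."""
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return width_px, height_px


# US Letter is 612 x 792 points (8.5 x 11 inches)
letter_300 = rendered_size(612, 792)        # 2550 x 3300 pixels
letter_150 = rendered_size(612, 792, 150)   # 1275 x 1650 pixels
```

Halving the DPI quarters the number of pixels per page, which is the trade-off to weigh when lowering `OCR_DPI`.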
MIT