deepseek-ocr-intelligence

An intelligent OCR system powered by DeepSeek models that extracts, understands, and structures text from images and documents. The system combines Optical Character Recognition (OCR) with Large Language Models (LLMs) to deliver clean, structured, and actionable data.

🎯 Project concept

👉 Intelligent pipeline: Image → OCR → Text → DeepSeek → Structured Output
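
The flow above can be read as two stages chained together. The following is a minimal sketch under stated assumptions, not the project's actual code: `run_pipeline`, `fake_ocr`, and `fake_llm` are hypothetical names, and the two stages are passed in as plain callables so the example runs without Tesseract or a model download.

```python
import json
from typing import Callable


def run_pipeline(
    image_path: str,
    ocr: Callable[[str], str],
    llm: Callable[[str], str],
) -> dict:
    """Image -> OCR -> text -> LLM -> structured output (hypothetical helper)."""
    raw_text = ocr(image_path)  # stage 1: image to raw text
    prompt = f"Clean and structure this OCR text into JSON:\n{raw_text}"
    answer = llm(prompt)  # stage 2: raw text to a JSON string
    return {"raw_text": raw_text, "structured": json.loads(answer)}


# Stub stages stand in for pytesseract and DeepSeek so the sketch runs anywhere.
def fake_ocr(path: str) -> str:
    return "Totl: 12O.OO USD"


def fake_llm(prompt: str) -> str:
    return '{"total": "120.00", "currency": "USD"}'


result = run_pipeline("receipt.png", fake_ocr, fake_llm)
print(result["structured"]["total"])
```

Keeping the stages injectable makes it possible to test the glue code on its own and to swap the OCR engine or the model later.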

🧠 Models used

Hugging Face (platform)
DeepSeek AI
Model: deepseek-ai/deepseek-coder-1.3b-instruct (free)
Role:
- Clean OCR text
- Correct recognition errors
- Structure data into JSON format

🧱 Architecture

- backend/: FastAPI, Tesseract OCR, Hugging Face integration
- frontend/: Vanilla JS, API service, upload UI

⚙️ Tech stack

- Python (FastAPI)
- Tesseract OCR (pytesseract)
- Hugging Face Transformers
- DeepSeek Coder 1.3B
- Frontend (JS/HTML/CSS)

🔥 Features

- OCR extraction: Image → Raw text
- Intelligent cleaning (DeepSeek): Automatic correction (e.g. "Totl: 12O.OO USD" → "Total: 120.00 USD")
- JSON structuring: Automatic extraction of key fields
- Document types: Invoices, receipts, scanned documents
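
Part of the correction example above (the letter O read in place of the digit 0) can also be handled by a deterministic pre-pass before the text reaches the model. The sketch below is an illustrative stand-in, not how DeepSeek performs the cleaning: `fix_digit_lookalikes` is a hypothetical helper that only touches tokens that already look numeric, so a misspelled word such as "Totl" is still left for the LLM.

```python
import re

# Characters that OCR commonly confuses with digits, mapped to the digit.
_LOOKALIKES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})
# A token is "numeric-like" if it holds only digits, lookalikes, and separators.
_NUMERIC_LIKE = re.compile(r"[0-9OolI.,]+")


def fix_digit_lookalikes(text: str) -> str:
    """Replace digit lookalikes inside tokens that contain at least one real digit."""

    def repair(match: re.Match) -> str:
        token = match.group(0)
        if _NUMERIC_LIKE.fullmatch(token) and any(ch.isdigit() for ch in token):
            return token.translate(_LOOKALIKES)
        return token  # words and mixed tokens are left untouched

    # Visit each whitespace-delimited token; spacing between tokens is preserved.
    return re.sub(r"\S+", repair, text)


print(fix_digit_lookalikes("Totl: 12O.OO USD"))
```

Requiring at least one real digit keeps ordinary words made of lookalike letters from being rewritten.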

🚀 Technical pipeline

📌 1. OCR (pytesseract)

    import pytesseract
    from PIL import Image

    # pytesseract is a wrapper: the Tesseract binary must be installed and on PATH
    text = pytesseract.image_to_string(Image.open("receipt.png"))
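
Raw OCR output tends to contain blank lines and ragged spacing, and may end with a form-feed character depending on the Tesseract version. A light normalization pass before prompting keeps the prompt short. This is a sketch under that assumption; `normalize_ocr_text` is a hypothetical helper, not part of pytesseract.

```python
import re


def normalize_ocr_text(raw: str) -> str:
    """Drop form feeds and empty lines, and collapse runs of spaces or tabs."""
    lines = []
    for line in raw.replace("\x0c", "").splitlines():
        collapsed = re.sub(r"[ \t]+", " ", line).strip()
        if collapsed:  # skip lines that were empty or whitespace only
            lines.append(collapsed)
    return "\n".join(lines)


sample = "  Total:   120.00 USD \n\n\nThank you\n\x0c"
print(normalize_ocr_text(sample))
```

Line breaks are kept on purpose, since the layout of a receipt or invoice often carries meaning the model can use.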

📌 2. DeepSeek via Hugging Face

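    from transformers import AutoTokenizer, AutoModelForCausalLM

    model_name = "deepseek-ai/deepseek-coder-1.3b-instruct"

    # Downloads the weights from the Hugging Face Hub on first use
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = f"Clean and structure this OCR text into JSON:\n{text}"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=200)

    # generate() returns the prompt followed by the completion, so decode
    # only the newly generated tokens and drop special tokens such as EOS
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    result = tokenizer.decode(new_tokens, skip_special_tokens=True)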
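
The decoded model output is free text, so it may wrap the JSON in an explanation or a code fence. One way to recover the object is to scan for the first position where a valid JSON object starts. This is a minimal sketch, assuming the model emits at least one well-formed object; `extract_first_json` is a hypothetical helper.

```python
import json
from typing import Optional


def extract_first_json(text: str) -> Optional[dict]:
    """Return the first well-formed JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value at `start` and ignores what follows
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid from this brace, try the next one
        if isinstance(obj, dict):
            return obj
    return None


reply = 'Here is the JSON:\n```json\n{"total": "120.00", "currency": "USD"}\n```'
print(extract_first_json(reply))
```

Returning None instead of raising lets the API layer decide whether to retry the generation or fall back to the raw text.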

🖥️ Frontend (Usage)

The frontend lives in the /frontend folder. It lets you upload an image and view the raw OCR extraction alongside the JSON result in real time.

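- Start the backend (see "Installation & setup" below).
- Open frontend/index.html (via Live Server, or by running py -m http.server 5500 from the frontend folder; py is the Windows launcher, so use python -m http.server 5500 on other systems).
- Configure the API base URL if needed (default: http://localhost:8000).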

🛠️ Installation & setup

Backend

Install dependencies:

    cd backend
    pip install -r requirements.txt

Start the API:

    uvicorn api.main:app --reload

Swagger UI: http://localhost:8000/docs

Frontend

    cd frontend
    py -m http.server 5500

Open http://localhost:5500

🚀 AI Engineer / Data Engineer project — OCR + LLM (DeepSeek + Hugging Face)
