Olaverse-Labs/OCR-API

Image to Text OCR API

A FastAPI-based OCR service that extracts text and layout structure from images. It supports optional image preprocessing, OCR language selection, per-word confidence scores, and automatic deskewing.

Features

  • Extracts text from uploaded images or image URLs
  • Returns detected languages with confidence
  • Returns bounding boxes for blocks, paragraphs, lines, and words
  • Includes confidence scores for each word
  • Supports image preprocessing: grayscale, thresholding, denoising
  • Allows language selection for OCR
  • Can automatically deskew (rotate) images

Installation

  1. Clone the repository and navigate to the project directory.
  2. Install dependencies:
    pip install -r requirements.txt
  3. Make sure Tesseract OCR is installed and in your system PATH.

Running the API

uvicorn main:app --reload

Or, if uvicorn is not in your PATH:

python -m uvicorn main:app --reload

API Usage

Visit http://localhost:8000/docs for interactive documentation.

Endpoint: /extract-text/ (POST)

Accepts either an image file upload or an image URL, plus optional processing parameters.

Form Fields

  • file: (optional) Image file to upload
  • image_url: (optional) URL to an image
  • preprocess: (optional) Comma-separated preprocessing steps: grayscale, threshold, denoise
  • ocr_lang: (optional) Tesseract language(s), e.g. eng, fra, eng+fra (default: eng)
  • deskew: (optional) true or false (default: false); whether to deskew the image before OCR
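The form fields above are plain strings, so a client has to join preprocessing steps with commas and languages with `+` before sending them. A minimal sketch of that step follows; `build_form_fields` and `ALLOWED_STEPS` are hypothetical names for illustration and are not part of this repository.

```python
# Hypothetical client-side helper; not part of the OCR-API repository.
ALLOWED_STEPS = ("grayscale", "threshold", "denoise")  # steps listed in the README


def build_form_fields(preprocess=(), ocr_lang=("eng",), deskew=False, image_url=None):
    """Return the text form fields for /extract-text/ as a dict of strings."""
    unknown = [step for step in preprocess if step not in ALLOWED_STEPS]
    if unknown:
        raise ValueError(f"unsupported preprocessing step(s): {unknown}")

    fields = {
        # Multiple Tesseract languages are joined with '+', e.g. "eng+fra".
        "ocr_lang": "+".join(ocr_lang),
        # The README documents the literal strings "true" and "false".
        "deskew": "true" if deskew else "false",
    }
    if preprocess:
        fields["preprocess"] = ",".join(preprocess)
    if image_url is not None:
        fields["image_url"] = image_url
    return fields


fields = build_form_fields(
    preprocess=["grayscale", "threshold"], ocr_lang=["eng", "fra"], deskew=True
)
print(fields)
# {'ocr_lang': 'eng+fra', 'deskew': 'true', 'preprocess': 'grayscale,threshold'}
```

Validating the step names on the client is a design choice made here; how the server treats an unknown step is not documented in this README.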

Example Request (using curl)

curl -X POST "http://localhost:8000/extract-text/" \
  -F "file=@your_image.png" \
  -F "preprocess=grayscale,threshold" \
  -F "ocr_lang=eng+fra" \
  -F "deskew=true"
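The same request can be made from Python. The sketch below uses only the standard library and builds the multipart/form-data body by hand; `encode_multipart` and `extract_text` are hypothetical helper names, and the URL assumes the default uvicorn address used elsewhere in this README.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

API_URL = "http://localhost:8000/extract-text/"  # assumed local address


def encode_multipart(fields, file_field=None):
    """Encode text fields and an optional (name, filename, bytes) file part."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    if file_field is not None:
        name, filename, data = file_field
        # Guess the MIME type from the extension, as curl does for -F uploads.
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        parts.append(header.encode() + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def extract_text(image_path, fields, url=API_URL):
    """POST an image file plus form fields and return the decoded JSON response."""
    with open(image_path, "rb") as handle:
        file_field = ("file", os.path.basename(image_path), handle.read())
    body, content_type = encode_multipart(fields, file_field)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Offline check of the encoder; no request is sent here.
body, content_type = encode_multipart({"ocr_lang": "eng+fra"}, ("file", "demo.png", b"\x89PNG"))
print(content_type.split(";")[0])  # multipart/form-data
```

With the server running, `extract_text("your_image.png", {"preprocess": "grayscale,threshold", "ocr_lang": "eng+fra", "deskew": "true"})` should send the same fields as the curl command. A third-party HTTP client would shorten this considerably; the standard library is used here only to keep the sketch dependency-free.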

Example Response

{
  "status": true,
  "text": "...extracted text...",
  "boxCoordinates": [0.1, 0.2, 0.3, 0.4],
  "blocks": [
    {
      "boxCoordinates": [ ... ],
      "paragraphs": [
        {
          "boxCoordinates": [ ... ],
          "lines": [
            {
              "boxCoordinates": [ ... ],
              "text": "...",
              "words": [
                { "text": "...", "boxCoordinates": [ ... ], "confidence": 95.0 }
              ]
            }
          ]
        }
      ]
    }
  ],
  "detectedLanguages": [
    { "languageCode": "en", "confidence": 0.98 }
  ],
  "executionTimeMS": 1234
}
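Because words are nested under blocks, paragraphs, and lines, a client usually needs to flatten the structure before working with per-word confidence. The sketch below walks a response shaped like the example above; the helper names, the sample values, and the threshold of 60 are illustrative assumptions. It also assumes word confidence is on a 0 to 100 scale, as the example value 95.0 suggests.

```python
def iter_words(response):
    """Yield every word dict from the nested blocks/paragraphs/lines structure."""
    for block in response.get("blocks", []):
        for paragraph in block.get("paragraphs", []):
            for line in paragraph.get("lines", []):
                yield from line.get("words", [])


def low_confidence_words(response, threshold=60.0):
    """Return the text of words whose confidence is below the threshold."""
    return [word["text"] for word in iter_words(response) if word["confidence"] < threshold]


# Made-up sample in the documented shape (boxCoordinates omitted for brevity).
sample = {
    "status": True,
    "text": "Hello w0rld",
    "blocks": [
        {
            "paragraphs": [
                {
                    "lines": [
                        {
                            "text": "Hello w0rld",
                            "words": [
                                {"text": "Hello", "confidence": 96.0},
                                {"text": "w0rld", "confidence": 41.5},
                            ],
                        }
                    ]
                }
            ]
        }
    ],
}

print(low_confidence_words(sample))  # ['w0rld']
```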

Docker

A Dockerfile is provided for containerized deployment. Build and run with:

docker build -t image-to-text-api .
docker run -p 8000:8000 image-to-text-api

Notes

  • Tesseract must be installed and available on your system PATH for OCR to work.
  • Preprocessing steps can be combined (for example grayscale,threshold), which may improve OCR accuracy on noisy or low-contrast images.
  • The API supports both file uploads and image URLs.

License

MIT
