Text Remover

Text Remover is a Python application that uses Optical Character Recognition (OCR) to detect and remove text from images. It provides a graphical user interface (GUI) for easy usage.

Features

Load an image (JPEG, PNG, and WebP formats are supported)
Set a confidence threshold for OCR text detection
Remove detected text from the image
Save the processed image

Dependencies

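The following Python packages are required to run the application. Note that pytesseract is a wrapper around the Tesseract OCR engine, so Tesseract itself must also be installed on the system: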

opencv-python
pytesseract
PyQt5
numpy

Installation

Clone the repository:

git clone https://github.com/KadirKess/Python-Text-Remover.git

Navigate to the project directory:
```
cd Python-Text-Remover
```
Install the requirements:
```
pip install -r requirements.txt
```

Usage

Run the application:
```
python main.py
```
or
```
python3 main.py
```
Click on "Select Image" to choose an image.
Adjust the "Confidence Threshold" if needed.
Click on "Remove Text" to process the image.
Click on "Save Image" to save the processed image.
Click on "Close" to exit the application.

How it works

The Text Remover application uses a combination of OpenCV and Pytesseract to detect and remove text from images. Here's a detailed breakdown of the process:

Image Loading: The script loads the image using OpenCV's imread function. This function reads the image from the specified file path and returns it in an array format that can be processed by the application.
OCR Processing: The loaded image is then processed using Pytesseract, an Optical Character Recognition (OCR) tool for Python. Pytesseract uses the Tesseract engine to detect and recognize text within the image. The image_to_data function is used, which returns the recognized text along with its bounding box coordinates and a confidence score.
Text Region Identification: The script then iterates over the detected text regions. If the confidence score of a text region is below the specified threshold, it is skipped. This is done to avoid removing regions that are not text. Otherwise, the bounding box coordinates of the text region are used to create a mask.
Mask Creation: A mask is a single-channel image of the same size as the input, in which the pixels inside the text region are non-zero and all other pixels are 0. OpenCV's inpaint function treats the non-zero pixels as the area to fill, so the mask isolates the text region for the next step.
Inpainting: The masked region (i.e., the text region) is then inpainted using OpenCV's inpaint function. Inpainting fills the selected region of an image with information extrapolated from the surrounding pixels. Here it fills the text region with colors and patterns similar to its neighborhood, effectively removing the text.
Iterative Process: This process is repeated for all detected text regions in the image, resulting in an image where the text has been removed.
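
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in src/ocr.py: the function names `select_text_boxes` and `remove_text`, the default threshold of 60, the inpainting radius of 3, and the choice of `cv2.INPAINT_TELEA` are all assumptions made for the example. It also collects every accepted box into one mask and inpaints once, which may differ from how the application iterates over regions.

```python
def select_text_boxes(data, threshold):
    """Return (x, y, w, h) boxes whose OCR confidence meets the threshold.

    `data` is assumed to have the shape of the dictionary returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # Confidence may arrive as int, float, or string; -1 marks non-word rows.
        conf = float(data["conf"][i])
        if conf < threshold or not text.strip():
            continue  # skip low-confidence or empty detections
        boxes.append(
            (data["left"][i], data["top"][i], data["width"][i], data["height"][i])
        )
    return boxes


def remove_text(image_path, threshold=60.0):
    """Load an image, mask confident text regions, and inpaint them."""
    # Imported here so the filtering helper above works without these packages.
    import cv2
    import numpy as np
    import pytesseract
    from pytesseract import Output

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # OpenCV loads BGR; convert to RGB before handing the array to Tesseract.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    data = pytesseract.image_to_data(rgb, output_type=Output.DICT)

    # Single-channel mask: non-zero pixels mark the area to inpaint.
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    for x, y, w, h in select_text_boxes(data, threshold):
        mask[y:y + h, x:x + w] = 255

    return cv2.inpaint(image, mask, 3, cv2.INPAINT_TELEA)
```

With this sketch, something like `cv2.imwrite("output.png", remove_text("input.png", threshold=70))` would write the processed image, where both file names are placeholders. Raising the threshold removes fewer regions, since only detections Tesseract is more certain about are masked.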

Project Structure

.
├── main.py
├── requirements.txt
└── src
    ├── gui.py
    └── ocr.py

main.py: The entry point of the application.
src/gui.py: The module that contains the GUI of the application.
src/ocr.py: The module that contains the OCR functionality.
requirements.txt: The list of Python packages required to run the application.
