A lightweight, privacy-first application that automatically detects and censors sensitive information in images.
It uses computer vision and NLP techniques to blur or pixelate:
- Human faces
- License plates
- Personally Identifiable Information (PII) such as names, dates, card/account numbers
All processing happens locally on-device — no images are ever uploaded externally. Original uncensored files are stored securely and protected with a password system, ensuring that only authenticated users can view them.
- Automatic face & license plate detection (OpenCV Haar cascades)
- PII detection with OCR (Tesseract) + NLP (spaCy) + regex patterns
- Optional Presidio Analyzer integration for enhanced PII recognition
- Choice of Mosaic pixelation or Gaussian blur, with adjustable strength
- Password-protected viewer for originals
- Batch processing of all images in the images/ folder
- Local-first privacy – no cloud uploads
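
The regex layer of the PII detection listed above can be sketched as follows. This is a minimal illustration, not the project's actual patterns: the pattern set, the names `PII_PATTERNS` and `find_pii`, and the sample string are all assumptions. In the app, the input text would come from the OCR step, and each match would be mapped back to an image region for censoring.

```python
import re

# Hypothetical patterns for illustration; the project's real patterns may differ.
PII_PATTERNS = {
    # 13 to 16 digits, optionally separated by single spaces or hyphens
    "card_number": re.compile(r"\b(?:\d[ -]?){12,15}\d\b"),
    # Dates such as 31/12/2024 or 1-2-24
    "date": re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def find_pii(text):
    """Return (label, matched_text, start, end) tuples sorted by position."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


sample = "Card 4111 1111 1111 1111 expires 01/02/2027, contact jo@example.com"
for hit in find_pii(sample):
    print(hit)
```

Regex alone tends to miss free-form PII such as names, which is presumably why the feature list pairs it with spaCy NER and optional Presidio support.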
- Language: Python 3.10+
- UI Framework: Streamlit
- Libraries & Tools:
  - OpenCV – face & license plate detection
  - Tesseract OCR – text extraction
  - spaCy – Named Entity Recognition (NER)
  - Presidio Analyzer – advanced PII detection (optional)
  - NumPy – array manipulation
  - Hashlib & JSON – password management
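
A password store built on hashlib and JSON, as listed above, might look like the sketch below. It is an assumption about the general approach, not the project's code: the function names, the JSON layout, and the choice of salted PBKDF2-HMAC-SHA256 are all illustrative.

```python
import hashlib
import hmac
import json
import os
from pathlib import Path

ITERATIONS = 200_000  # assumed work factor for this sketch


def set_password(password, store):
    """Hash the password with a random salt and save the result as JSON."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    record = {"salt": salt.hex(), "hash": digest.hex(), "iterations": ITERATIONS}
    Path(store).write_text(json.dumps(record))


def verify_password(password, store):
    """Recompute the hash with the stored salt and compare in constant time."""
    record = json.loads(Path(store).read_text())
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(record["salt"]),
        record["iterations"],
    )
    return hmac.compare_digest(digest.hex(), record["hash"])
```

Storing only a salted hash means the JSON file never contains the password itself. Note that this kind of check gates access in the UI; it does not by itself encrypt the original images on disk.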
git clone https://github.com/JordanTwz/tiktok-techjam.git
cd tiktok-techjam
pip install -r requirements.txt
Windows: https://github.com/UB-Mannheim/tesseract/wiki
Linux (Ubuntu/Debian):
sudo apt-get install tesseract-ocr
macOS (Homebrew):
brew install tesseract
Note: Update the path in backend.py and face_plate_censor_app.py if Tesseract is installed elsewhere.
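
One way to avoid editing a hard-coded Tesseract path is to look the executable up at start-up. The sketch below is an assumption about how that could be done, not what backend.py currently does: `find_tesseract` and the candidate locations are hypothetical, and the final commented lines assume the project calls Tesseract through the pytesseract wrapper.

```python
import shutil
from pathlib import Path

# Hypothetical fallback locations; adjust to match your installation.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "/usr/bin/tesseract",
    "/opt/homebrew/bin/tesseract",
]


def find_tesseract(candidates=CANDIDATES):
    """Return the Tesseract executable path, or None if it cannot be found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None


tesseract_path = find_tesseract()
print(tesseract_path or "Tesseract not found; install it or set the path manually")

# If the project uses pytesseract (an assumption), point it at the result:
# import pytesseract
# pytesseract.pytesseract.tesseract_cmd = tesseract_path
```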
Password-protected app:
streamlit run frontend.py
Lightweight demo (no password system):
streamlit run face_plate_censor_app.py
- Set password on first use
- Place images in the images/ folder
- Censored outputs are saved in the censored/ folder
- Authenticate with your password to view original images