Skip to content

fix: EML file decoding crash due to UnicodeDecodeError#231

Merged
Userunknown84 merged 2 commits into
Userunknown84:mainfrom
onkar0127:fix-eml-decoding
Jun 23, 2026
Merged

fix: EML file decoding crash due to UnicodeDecodeError#231
Userunknown84 merged 2 commits into
Userunknown84:mainfrom
onkar0127:fix-eml-decoding

Conversation

@onkar0127

@onkar0127 onkar0127 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

fixes #149

Description

This Pull Request resolves the crash that occurred when uploading non-UTF-8 encoded .eml files (e.g. accented characters in Latin-1 / ISO-8859-1) to the Sender Verifier tool.

Changes

  1. Resilient Fallback Decoding (backend/api.py):
    • Updated the /analyze-email-header endpoint to attempt strict UTF-8 decoding first.
    • If a UnicodeDecodeError is caught, it falls back to decoding via latin-1 with errors="replace". This ensures legacy or mixed-encoding files are read successfully without raising decode exceptions.
  2. Added Unit Tests (backend/tests/test_email_header_analyzer.py):
    • Added a new unit test case test_api_endpoint_multipart_eml_non_utf8 that simulates uploading a Latin-1 encoded .eml file containing accented characters (ñ) and verifies it resolves successfully.

Verification Results

  • Pytest: Executed pytest backend/tests and all 59 tests passed with zero failures.

@vercel

vercel Bot commented Jun 23, 2026

Copy link
Copy Markdown

@onkar0127 is attempting to deploy a commit to the Aditya Sharma's projects Team on Vercel.

A member of the Team first needs to authorize it.

@Userunknown84 Userunknown84 merged commit a82cbdb into Userunknown84:main Jun 23, 2026
5 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: UnicodeDecodeError crashes backend when parsing EML files containing non-UTF-8 characters

2 participants