This repository contains the implementation of a real-time phishing detection system that analyzes emails and URLs for potential phishing threats. The system uses machine learning models, integrates with the Gmail API, and provides a user-friendly interface.
The system consists of the following major components:
- Technology: Flask-based backend
- Functions:
  - Handles requests and integrates with machine learning models.
  - Communicates with the frontend to deliver real-time phishing detection results.
- LightGBM for email phishing classification.
- Autoencoder for anomaly detection in email features.
- XGBoost for URL phishing classification.
- Autoencoder for detecting anomalous URL features.
- Technology: Angular-based interface.
- Features:
  - Manual email and URL phishing checks.
  - Real-time email analysis using the Gmail API.
  - Visual indicators and detailed insights on phishing threats.
- Technology: Gmail API
- Functions:
  - Fetches emails in real time.
  - Extracts key features and sends them for phishing analysis.
  - Displays results with phishing probability and anomaly scores.
- Email feature extraction: sender, receiver, subject, body.
- NLP techniques: tokenization, stopword removal, stemming/lemmatization.
- Text vectorization: TF-IDF or BERT-based embeddings.
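
The email text steps above (tokenization, stopword removal, TF-IDF) can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: the `STOPWORDS` set, the `tokenize` and `tfidf` helpers, and the sample emails are all made up for the example, and stemming/lemmatization is left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "to", "of", "and", "is", "your", "in", "for", "on"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per document using plain TF-IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample emails, for illustration only.
emails = [
    "Verify your account now to avoid suspension",
    "Meeting notes for the project are attached",
    "Your account statement is attached",
]
vectors = tfidf(emails)
```

Terms that occur in only one email (such as "verify") receive a higher weight than terms shared across emails (such as "account"), which is the property the classifier relies on.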
- URL feature extraction: domain, URL length, lexical analysis.
- Handling missing values and normalizing features.
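
As a rough sketch of the URL preprocessing described above, the snippet below derives a few lexical features with `urllib.parse` and shows one simple way to normalize a numeric feature. The specific feature names, the `extract_url_features` and `min_max_normalize` helpers, and the sample URL are assumptions for illustration; the project's actual feature set may differ.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Compute a few simple lexical features from a URL string."""
    parsed = urlparse(url)
    # A missing hostname is replaced by an empty string so later steps never see None.
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "domain": host,
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # True when a raw IPv4 address is used in place of a domain name.
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
    }


def min_max_normalize(value: float, low: float, high: float) -> float:
    """Scale value into [0, 1], clamping anything outside [low, high]."""
    if high <= low:
        return 0.0
    return min(max((value - low) / (high - low), 0.0), 1.0)


# Made-up sample URL, for illustration only.
features = extract_url_features("http://192.168.0.1/login?user=admin")
```

The bounds passed to `min_max_normalize` would normally come from the training data rather than being hard-coded.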
- LightGBM: Trained on labeled email data for classification.
- Autoencoder: Learns normal email patterns and detects anomalies.
- XGBoost: Trained on labeled URL data for phishing probability prediction.
- Autoencoder: Detects anomalous URL patterns.
For both email and URL analysis:
- Combine classification model score and autoencoder anomaly score.
- Final decision is based on a weighted combination of both scores.
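
One way to picture the weighted combination is the small function below. The weight of 0.7, the decision threshold of 0.5, and the assumption that the anomaly score is already scaled to [0, 1] are illustrative choices, not values taken from this project; `combine_scores` is a hypothetical name.

```python
def combine_scores(classifier_prob: float, anomaly_score: float,
                   weight: float = 0.7, threshold: float = 0.5) -> dict:
    """Blend a classifier probability with an anomaly score.

    Both inputs are assumed to lie in [0, 1]. `weight` is the share given
    to the classifier; the remainder goes to the anomaly score.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    final = weight * classifier_prob + (1.0 - weight) * anomaly_score
    return {"final_score": final, "is_phishing": final >= threshold}


# Example values only: a confident classifier and a moderate anomaly score.
result = combine_scores(classifier_prob=0.9, anomaly_score=0.4)
```

In practice the weight and threshold would be tuned on validation data, separately for the email and URL pipelines.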
- Flask-based REST API
- Endpoints:
  - Email Analysis: Accepts email payload and returns phishing predictions.
  - URL Analysis: Accepts a URL and returns a phishing probability score.
- Handles preprocessing, model inference, and result aggregation.
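
The request handling for the URL endpoint might look roughly like the sketch below. It is written as a plain function so it runs without Flask installed; a Flask view could pass it the result of `request.get_json(silent=True)` and wrap the returned body with `jsonify`. The function name, the `url` field, the response keys, and the stand-in scorer are all assumptions, since the README does not document the request or response schema.

```python
from typing import Callable, Optional


def handle_url_analysis(payload: Optional[dict],
                        score_url: Callable[[str], float]) -> tuple[dict, int]:
    """Validate a URL-analysis request and return (response body, HTTP status)."""
    if not isinstance(payload, dict):
        return {"error": "request body must be a JSON object"}, 400
    url = payload.get("url")
    if not isinstance(url, str) or not url.strip():
        return {"error": "field 'url' is required"}, 400
    url = url.strip()
    # Preprocessing and model inference are delegated to the injected scorer.
    probability = score_url(url)
    return {"url": url, "phishing_probability": probability}, 200


def dummy_scorer(url: str) -> float:
    """Stand-in for the trained model, used only in this example."""
    return 0.9 if "login" in url else 0.1


body, status = handle_url_analysis({"url": "http://example.test/login"}, dummy_scorer)
```

Passing the scorer in as an argument keeps validation testable without loading any model.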
- Angular-based UI
- Features:
  - Email input form for manual checks.
  - URL input form for phishing detection.
  - Gmail integration for real-time email analysis.
  - Displays phishing probability, anomaly scores, and alerts.
- Uses the Gmail API for email fetching and analysis.
- Implements OAuth 2.0 for authentication.
- Processes emails and extracts features for phishing detection.
- Sends results back to the frontend with detailed insights.
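
To illustrate the feature extraction step, the sketch below pulls sender, receiver, subject, and body out of a dictionary shaped like a Gmail API message resource (headers under `payload.headers`, base64url-encoded text under `body.data`, nested `parts` for multipart mail). The helper names and the hand-built sample message are hypothetical, and the OAuth 2.0 flow and the actual fetch call are omitted.

```python
import base64


def get_header(headers: list[dict], name: str) -> str:
    """Return the value of the first header matching name (case-insensitive)."""
    for header in headers:
        if header.get("name", "").lower() == name.lower():
            return header.get("value", "")
    return ""


def decode_body(payload: dict) -> str:
    """Return the first text/plain body in the payload, decoded from base64url."""
    data = payload.get("body", {}).get("data")
    if data and payload.get("mimeType", "").startswith("text/plain"):
        # Restore any '=' padding that was stripped before decoding.
        padded = data + "=" * (-len(data) % 4)
        return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
    # Multipart messages keep their content in nested parts.
    for part in payload.get("parts", []):
        text = decode_body(part)
        if text:
            return text
    return ""


def extract_email_features(message: dict) -> dict:
    """Collect the fields used for phishing analysis from one message."""
    payload = message.get("payload", {})
    headers = payload.get("headers", [])
    return {
        "sender": get_header(headers, "From"),
        "receiver": get_header(headers, "To"),
        "subject": get_header(headers, "Subject"),
        "body": decode_body(payload),
    }


# Hand-built sample shaped like a Gmail message resource (not real data).
encoded = base64.urlsafe_b64encode(b"Please verify your account.").decode()
sample = {
    "id": "example-id",
    "payload": {
        "mimeType": "multipart/alternative",
        "headers": [
            {"name": "From", "value": "alerts@example.test"},
            {"name": "To", "value": "user@example.test"},
            {"name": "Subject", "value": "Action required"},
        ],
        "parts": [{"mimeType": "text/plain", "body": {"data": encoded}}],
    },
}
features = extract_email_features(sample)
```

The resulting dictionary matches the sender, receiver, subject, and body fields listed for email feature extraction.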
git clone https://github.com/arcc-hitt/URL_Email_Phishing_Detection.git
cd URL_Email_Phishing_Detection
pip install -r requirements.txt
python app.py
cd frontend
npm install
ng serve
- Enable Gmail API in Google Cloud Console.
- Configure OAuth 2.0 credentials.
- Store authentication keys securely.
- Manual Analysis: Enter an email or URL in the frontend for phishing detection.
- Gmail Integration: Authenticate with Gmail and analyze emails in real time.
- View Results: Check phishing probability scores and anomaly warnings.
Contributions are welcome! Feel free to submit issues or pull requests.
For any queries or support, please open an issue in the repository.