Skip to content

Singhvishal003/Email-Spam-Detector-Model

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 

Repository files navigation

Email Spam Detection Model

Overview

This project demonstrates how to build an email spam detection system using Natural Language Processing (NLP) and machine learning techniques. The goal is to automatically identify and filter out unwanted spam emails, improving user experience and security.

Project Structure

  1. Data: A dataset of emails labeled as spam or ham (non-spam).
  2. Preprocessing: Steps to clean and prepare the email data for modeling.
  3. Modeling: Machine learning algorithms used to classify emails.
  4. Evaluation: Assessing the performance of the models.
  5. Pipeline: A complete pipeline from data preprocessing to prediction.
  6. Testing: Examples of how the model performs on new, unseen emails.

Data

The dataset contains two main columns:

  • Category: Indicates whether the email is spam or ham.
  • Message: The actual content of the email.

Data Preprocessing

Effective preprocessing is essential for building a reliable spam detection model. The preprocessing steps include:

  • Lowercasing
  • Removing punctuation
  • Removing non-alphabetic characters
  • Tokenization
  • Removing stop words
  • Lemmatization

Modeling

The processed data is used to train a machine learning model. Techniques like TF-IDF (Term Frequency-Inverse Document Frequency) are used for vectorization. The model is trained on the processed and vectorized data, with oversampling applied to balance the classes.

Evaluation

The model's performance is evaluated using metrics such as accuracy, precision, recall, and F1-score.

Deployment

The trained model can be deployed as a web application using frameworks like Streamlit, allowing users to input email messages and receive predictions on whether they are spam or not.

About

Email Spam Detector model to detect spam messages.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors