Fake News Detection using Bidirectional LSTM with Explainable AI

A deep learning–based Fake News Detection system built using a Bidirectional LSTM (BiLSTM) model, trained on the WELFake dataset, and enhanced with Explainable AI techniques (SHAP and LIME) to provide transparent and interpretable predictions.


Project Overview

The rapid spread of fake news poses a significant threat to public trust and decision-making.
This project classifies news articles as Real or Fake using Natural Language Processing (NLP) and deep learning, while keeping the model's predictions interpretable through Explainable AI methods.


Objectives

  • Develop an accurate fake news classification model
  • Perform comprehensive data cleaning and exploratory data analysis
  • Train a Bidirectional LSTM-based deep learning model
  • Evaluate performance using standard classification metrics
  • Interpret model predictions using SHAP and LIME
  • Generate and store visual insights and trained models

Model Architecture

  • Embedding Layer
  • Bidirectional LSTM (100 units)
  • Dropout Layer (rate = 0.3)
  • Dense Output Layer with Sigmoid Activation

Loss Function: Binary Crossentropy
Optimizer: Adam
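
A minimal Keras sketch of this architecture follows. Only the BiLSTM width (100 units), dropout rate (0.3), loss, and optimizer are stated above; the vocabulary size and embedding dimension here are illustrative assumptions, not values taken from this repository:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense

VOCAB_SIZE = 10000  # assumed tokenizer vocabulary size
EMBED_DIM = 128     # assumed embedding dimension

model = Sequential([
    Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),
    Bidirectional(LSTM(100)),        # 100 LSTM units per direction
    Dropout(0.3),                    # dropout rate = 0.3
    Dense(1, activation="sigmoid"),  # binary Real/Fake probability
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
```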


Dataset

  • Name: WELFake Dataset
  • Total Records: ~72,000
  • Classes:
    • 0 – Real News
    • 1 – Fake News

After preprocessing and cleaning, 70,795 valid samples were retained for training and evaluation.


Technologies Used

  • Python
  • TensorFlow / Keras
  • NLTK
  • Scikit-learn
  • Pandas, NumPy
  • Matplotlib, Seaborn
  • SHAP
  • LIME

Workflow

  1. Data loading and cleaning
  2. Exploratory Data Analysis (EDA)
  3. Text preprocessing
    • Lowercasing
    • Punctuation removal
    • Stopword removal
    • Stemming
  4. Tokenization and padding (steps 3–4 are sketched in code after this list)
  5. Model training
  6. Model evaluation
  7. Explainability using SHAP and LIME
  8. Visualization and output storage
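
A hedged sketch of steps 3–4, assuming an NLTK Porter stemmer and the Keras tokenizer; the toy inputs, vocabulary size of 10,000, and sequence length of 300 are illustrative, not taken from this repository:

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

nltk.download("stopwords", quiet=True)
STOPWORDS = set(stopwords.words("english"))
stemmer = PorterStemmer()

def clean_text(text: str) -> str:
    """Lowercase, strip punctuation/digits, drop stopwords, stem."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return " ".join(stemmer.stem(w) for w in text.split() if w not in STOPWORDS)

# Toy inputs standing in for the WELFake title/text columns
raw_texts = ["BREAKING: Scientists SHOCKED by this one weird trick!",
             "The committee approved the budget proposal on Tuesday."]
cleaned = [clean_text(t) for t in raw_texts]

# Tokenization and padding (vocabulary size and sequence length are assumed)
tokenizer = Tokenizer(num_words=10000, oov_token="<OOV>")
tokenizer.fit_on_texts(cleaned)
X = pad_sequences(tokenizer.texts_to_sequences(cleaned), maxlen=300)
```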

Model Performance

  • Accuracy: 88.67%
  • Precision, recall, and F1-score: approximately 0.89 for both classes

The confusion matrix shows a well-balanced distribution of true positives, true negatives, false positives, and false negatives.


Explainable AI

SHAP (SHapley Additive exPlanations)

  • Quantifies the contribution of word positions to predictions
  • Identifies globally important features
  • Red bars indicate contribution toward Fake predictions
  • Green bars indicate contribution toward Real predictions
  • Larger absolute SHAP values represent stronger influence
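
A sketch of how SHAP's KernelExplainer might be wired up here, reusing `model` and the padded matrix `X` from the sketches above; the background and sample sizes are assumptions, kept deliberately small for the cost reason noted later:

```python
import numpy as np
import shap

# KernelExplainer is model-agnostic: it only needs a prediction function.
background = X[np.random.choice(len(X), min(50, len(X)), replace=False)]
explainer = shap.KernelExplainer(lambda a: model.predict(a), background)

# Explain a small batch; nsamples is kept low because KernelExplainer
# is computationally expensive (see Notes below).
shap_values = explainer.shap_values(X[:5], nsamples=100)
shap.summary_plot(shap_values, X[:5], show=False)
```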

LIME (Local Interpretable Model-Agnostic Explanations)

  • Explains individual predictions locally
  • Highlights influential words in news text
  • Applied to both correctly classified and misclassified samples
  • Helps analyze model behavior and errors
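
A sketch of a LIME explanation for one article, reusing `clean_text`, `tokenizer`, `model`, and `raw_texts` from the earlier sketches. LIME perturbs the raw text, so it needs a wrapper returning per-class probabilities (the `predict_proba` helper below is an assumed name, not from this repository):

```python
import numpy as np
from lime.lime_text import LimeTextExplainer

def predict_proba(texts):
    """Map raw strings to [P(Real), P(Fake)]; LIME needs per-class scores."""
    seqs = pad_sequences(tokenizer.texts_to_sequences(
        [clean_text(t) for t in texts]), maxlen=300)
    p_fake = model.predict(seqs).ravel()          # sigmoid output
    return np.column_stack([1.0 - p_fake, p_fake])

explainer = LimeTextExplainer(class_names=["Real", "Fake"])
explanation = explainer.explain_instance(raw_texts[0], predict_proba,
                                         num_features=10)
print(explanation.as_list())   # (word, weight) pairs for this one prediction
```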

Generated Visualizations

| File Name | Description |
| --- | --- |
| eda_plots.png | Exploratory Data Analysis |
| model_evaluation.png | Accuracy, loss curves, and confusion matrix |
| shap_feature_importance.png | Global SHAP feature importance |
| shap_sample_explanation.png | SHAP explanation for a single sample |
| shap_multiple_samples.png | SHAP explanations for multiple samples |

Saved Model

The trained Bidirectional LSTM model is saved as:

fake_news_bilstm_model.h5
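
Reloading it for inference is then a single Keras call, for example:

```python
from tensorflow.keras.models import load_model

model = load_model("fake_news_bilstm_model.h5")
# `X` is a padded sequence batch as produced in the preprocessing sketch
print(model.predict(X[:1]))   # predicted probability of the Fake class
```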

Notes

  • Training time may vary depending on hardware configuration
  • SHAP KernelExplainer is computationally expensive; sample sizes are limited for efficiency
  • The .h5 model format is considered legacy; future versions may adopt the .keras format
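
Migrating the checkpoint to the native format would be a one-time re-save, roughly:

```python
from tensorflow.keras.models import load_model

# Re-save the legacy HDF5 checkpoint in the native Keras format.
load_model("fake_news_bilstm_model.h5").save("fake_news_bilstm_model.keras")
```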

Future Enhancements

  • Integrate transformer-based models such as BERT or RoBERTa
  • Develop a web-based interface for real-time fake news detection
  • Enhance explainability using attention-based visualizations
  • Optimize performance for longer and more complex articles

License

This project is intended for academic and educational purposes only.
