SRINIVASBN/Malware-Detection

Malware-Detection-using-Machine-Learning

Python TensorFlow

A deep-learning–based system that converts a file's binary into a grayscale image and classifies it as benign or malicious using a MobileNet-based Convolutional Neural Network (CNN).
The project supports on-demand scanning, drag-and-drop execution, and real-time directory monitoring, enabling fast, safe, and highly accurate malware detection without executing suspicious files.


Project Summary

| Section | Details |
| --- | --- |
| Model Type | MobileNet-based CNN |
| Input | File binary converted to 224×224 grayscale image |
| Output | Benign / Malware classification |
| Techniques Used | Transfer Learning, Static Analysis, Image Transformation |
| Scanning Modes | On-demand, Drag-and-drop, Real-time watcher |
| Deployment | Python CLI + Watchdog file monitor |
| Target OS | Windows (for notifications & watcher) |
| Accuracy | 99.30% |
| Recall | 99.63% |
| Precision | 99.60% |

Abstract

Malware Detection using Machine Learning introduces an image-based deep learning method that converts executable files into grayscale images and classifies them using a MobileNet CNN model.
The system integrates a real-time monitoring tool that automatically scans new files and alerts users when malware is detected.
Experimental results indicate that this static, image-based approach is an efficient alternative to traditional signature-based antivirus systems. This matters most for obfuscated or zero-day malware, for which no known signature exists.
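The conversion step described above can be sketched as follows. This is a minimal illustration, not the contents of new_convert.py: the function names, the zero padding to a square, and the bilinear resize are all assumptions made for the example. Only the 224×224 grayscale target comes from this README.

```python
import math

from PIL import Image


def bytes_to_grayscale(data: bytes, size: int = 224) -> Image.Image:
    """Map raw bytes to a size x size 8-bit grayscale image (hypothetical helper)."""
    if not data:
        raise ValueError("cannot convert an empty file")
    # Smallest square that holds every byte; each byte becomes one pixel (0-255).
    side = math.isqrt(len(data) - 1) + 1
    # Assumption: pad the tail with zero bytes so the buffer fills the square.
    padded = data + bytes(side * side - len(data))
    image = Image.frombytes("L", (side, side), padded)
    # Assumption: bilinear resampling down (or up) to the model input size.
    return image.resize((size, size), Image.BILINEAR)


def file_to_grayscale(path: str, size: int = 224) -> Image.Image:
    """Read a file without executing it and convert its bytes to an image."""
    with open(path, "rb") as handle:
        return bytes_to_grayscale(handle.read(), size)
```

Because the file is only read as bytes, nothing in this step runs the suspicious file, which is what makes the approach static.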


Results

The MobileNet-based malware classifier achieved the following performance on the test dataset:

| Metric | Score |
| --- | --- |
| Accuracy | 99.30% |
| Precision | 99.60% |
| Recall | 99.63% |
| F1-Score | 99.61% |
| ROC-AUC | 0.9993 |

Confusion Matrix Summary:

  • 261 benign samples correctly classified
  • 2708 malware samples correctly classified
  • 10 malware samples misclassified as benign
  • 11 benign samples misclassified as malware

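These counts describe a test set of 2,990 samples (272 benign, 2,718 malware). They indicate that converting executables into grayscale images enables a lightweight, highly accurate detector suitable for real-time scanning. Because the test set is dominated by malware, the benign class deserves a separate look: 261 of 272 benign files (about 96%) were classified correctly.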
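As a worked check, the reported metrics can be recomputed from the four confusion-matrix counts listed above, treating malware as the positive class. The helper name `binary_metrics` is hypothetical. Specificity is not reported in this README and is included here only because it follows directly from the same counts.

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),  # share of benign files left alone
    }


# Counts from the confusion matrix summary (malware = positive class).
scores = binary_metrics(tp=2708, tn=261, fp=11, fn=10)
for name, value in scores.items():
    print(f"{name}: {value:.2%}")
# accuracy: 99.30%
# precision: 99.60%
# recall: 99.63%
# f1: 99.61%
# specificity: 95.96%
```

The first four values match the table above, so the confusion matrix and the reported scores are mutually consistent.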


Features

  • Converts any file type into a 224×224 grayscale image (new_convert.py)
  • MobileNet-based CNN classifier (model_training.ipynb)
  • Real-time directory monitoring and automatic scanning (scanner2.py)
  • On-demand scanning via command line
  • Drag-and-drop scanning using scan_files.bat
  • Complete ML pipeline including:
    • Class balancing
    • Model checkpointing
    • Confusion matrix and ROC curve
    • Accuracy and loss tracking

Quick Start

1. Clone the repository

git clone https://github.com/SRINIVASBN/Malware-Detection.git
cd Malware-Detection

2. Install dependencies

pip install -r requirements.txt

Or manually:

pip install tensorflow numpy pillow watchdog win10toast winotify scikit-learn matplotlib

Usage

On-demand scanning

python scanner2.py C:\path\to\file1.exe C:\path\to\file2.dll
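The commands in this README suggest that scanner2.py scans the given paths when arguments are present and starts the real-time watcher when there are none. A minimal sketch of that dispatch is shown here, assuming `argparse` is used. The function names `parse_args` and `select_mode` are hypothetical and may not match the actual script.

```python
import argparse


def parse_args(argv=None):
    """Collect zero or more file paths from the command line."""
    parser = argparse.ArgumentParser(
        description="Scan files for malware (illustrative sketch)."
    )
    parser.add_argument(
        "paths",
        nargs="*",
        help="files to scan; leave empty to start real-time monitoring",
    )
    return parser.parse_args(argv)


def select_mode(paths):
    """Paths given means on-demand scanning, no paths means watcher mode."""
    return "on-demand" if paths else "watch"


# Example: the same argument list that the command above would produce.
args = parse_args([r"C:\path\to\file1.exe", r"C:\path\to\file2.dll"])
print(select_mode(args.paths), len(args.paths))
```

Drag-and-drop fits the same path, since Windows passes dropped files to a batch file as command-line arguments.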

Drag and drop (Windows)

Drag files onto:

scan_files.bat

Real-time monitoring

Configure:

WATCH_DIRECTORY = r"C:\Users\YourName\Downloads"
MODEL_PATH = r"model_files1\malware_model_final.keras"

Start monitoring:

python scanner2.py

The tool automatically scans every new file added to the folder.
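A watcher of this kind can be sketched with the Watchdog library. This is an illustration under assumptions, not the code in scanner2.py: `wait_until_stable`, `start_watcher`, and the `scan` callback are hypothetical names, and waiting for the file size to settle is one possible way to avoid scanning a half-written download.

```python
import os
import time


def wait_until_stable(path, interval=0.5, attempts=20):
    """Return True once the file size is unchanged between two polls."""
    previous = -1
    for _ in range(attempts):
        try:
            current = os.path.getsize(path)
        except OSError:
            return False  # file vanished or cannot be read
        if current == previous:
            return True
        previous = current
        time.sleep(interval)
    return False


def start_watcher(directory, scan):
    """Call scan(path) for each new file in directory (requires watchdog)."""
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class NewFileHandler(FileSystemEventHandler):
        def on_created(self, event):
            # Ignore new folders; scan files once they are fully written.
            if not event.is_directory and wait_until_stable(event.src_path):
                scan(event.src_path)

    observer = Observer()
    observer.schedule(NewFileHandler(), directory, recursive=False)
    observer.start()
    return observer  # caller ends it with observer.stop(); observer.join()
```

A caller would pass WATCH_DIRECTORY and a function that converts and classifies one file, then keep the main thread alive until interrupted.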


Training the Model

Step 1: Dataset Preparation

images/
  ├── Benign/
  └── Malware/
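Before training, it can help to count the images in each class folder and derive weights for class balancing. The sketch below uses the common "balanced" heuristic, weight = total / (number of classes × class count). Whether the notebook balances classes this way is not stated in this README, and both function names are hypothetical.

```python
from pathlib import Path


def count_images(root_dir):
    """Count files in each class subfolder, in alphabetical folder order."""
    root = Path(root_dir)
    return {
        folder.name: sum(1 for item in folder.iterdir() if item.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


def balanced_class_weights(counts):
    """Map class index to total / (n_classes * count); rarer classes weigh more."""
    total = sum(counts.values())
    return {
        index: total / (len(counts) * count)
        for index, count in enumerate(counts.values())
    }


# Illustration only: these counts are made up, not the project's dataset.
example = {"Benign": 100, "Malware": 900}
print(balanced_class_weights(example))
```

With alphabetical ordering, Benign maps to index 0 and Malware to index 1. An empty class folder would cause a division by zero, so check the counts first.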

Step 2: Configure Notebook

root_dir = r"path\to\images"
save_model_dir = r"path\to\models"
local_checkpoint_dir = r"path\to\checkpoints"

Step 3: Run Notebook Cells

  • Data preprocessing
  • Model training
  • Fine-tuning
  • Evaluation (confusion matrix, ROC curve, accuracy plots)
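The training and fine-tuning steps follow the usual Keras transfer-learning pattern: train a new head on a frozen MobileNet, then unfreeze the top of the backbone at a lower learning rate. The sketch below is an assumption about how that could look, not the notebook's code. The layer choices, the learning rates, the number of unfrozen layers, and the replication of the grayscale channel to three channels are all illustrative.

```python
import tensorflow as tf


def build_model(weights="imagenet"):
    """Frozen MobileNet backbone with a single sigmoid output (sketch)."""
    base = tf.keras.applications.MobileNet(
        input_shape=(224, 224, 3), include_top=False, weights=weights
    )
    base.trainable = False  # stage 1: train only the new head
    inputs = tf.keras.Input(shape=(224, 224, 1))
    # Assumption: repeat the grayscale channel so the RGB backbone accepts it.
    x = tf.keras.layers.Concatenate()([inputs, inputs, inputs])
    # MobileNet expects inputs in [-1, 1]; pixel values are assumed 0-255.
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(x)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model, base


def start_fine_tuning(model, base, trainable_layers=20):
    """Stage 2: unfreeze the last backbone layers and lower the learning rate."""
    base.trainable = True
    for layer in base.layers[:-trainable_layers]:
        layer.trainable = False
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-5),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
```

Recompiling after changing `trainable` is required for the change to take effect. Passing `weights=None` to `build_model` skips the ImageNet download but gives up the benefit of transfer learning.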

Step 4: Deploy Final Model

Copy:

malware_model_final.keras

into the directory that MODEL_PATH in scanner2.py points to (model_files1 in the configuration shown in this README).


Project Structure

Malware-Detection/
├── model_training.ipynb
├── scanner2.py
├── new_convert.py
├── scan_files.bat
├── model_files1/
├── requirements.txt
└── README.md

Configuration

Edit in scanner2.py:

MODEL_PATH = r"model_files1\malware_model_final.keras"
WATCH_DIRECTORY = r"C:\Users\YourName\Downloads"
CONFIDENCE_THRESHOLD = 0.5
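How CONFIDENCE_THRESHOLD turns a model score into a verdict can be sketched as below. This assumes the model emits a single probability that the file is malware, with higher values meaning more suspicious. The README does not state the output format, and the function name `verdict` is hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.5


def verdict(malware_probability, threshold=CONFIDENCE_THRESHOLD):
    """Label a score as Malware when it reaches the threshold, else Benign."""
    if not 0.0 <= malware_probability <= 1.0:
        raise ValueError("expected a probability between 0 and 1")
    return "Malware" if malware_probability >= threshold else "Benign"


print(verdict(0.97))  # Malware
print(verdict(0.12))  # Benign
```

Raising the threshold reduces false alarms on benign files at the cost of missing more malware, and lowering it does the opposite, so the value is worth tuning against a validation set.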

Logs and Outputs

| File | Description |
| --- | --- |
| training_log.csv | Training metrics |
| malware_model_best.keras | Best checkpoint |
| malware_model_final.keras | Final deployed model |
| scanner.log | Real-time scan logs |

Troubleshooting

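| Issue | Solution |
| --- | --- |
| Out-of-memory during training | Reduce the batch size |
| Weight download failure | Set weights=None (MobileNet then trains from scratch without pretrained ImageNet weights, so expect slower convergence) |
| Model not found | Check that MODEL_PATH points to an existing .keras file |
| Real-time watcher inactive | Verify that WATCH_DIRECTORY exists and is spelled correctly |
| Git push errors | Configure your Git username/email |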

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit enhancements
  4. Submit a pull request

References

  • Microsoft Malware Classification Challenge (BIG 2015)
  • MobileNet Architecture (TensorFlow/Keras)
  • Libraries: TensorFlow, NumPy, PIL, Watchdog, Scikit-Learn, Matplotlib
