Facial Emotion Detection with DenseNet

This repository contains the project work for the module Machine Learning for Physicists 2024. The project focuses on implementing a DenseNet architecture for facial emotion detection and comparing its performance with a traditional Convolutional Neural Network (CNN) approach.

Table of Contents

  • Installation
  • Usage
  • Files
  • Report Summary
  • Acknowledgments
  • References

Installation

To set up the environment and run the project, follow these steps:

  1. Clone the repository:

    git clone https://github.com/RnLe/MachineLearning.git
    cd MachineLearning
  2. Create and activate the conda environment:

    conda env create -f environment.yml
    conda activate tf_gpu2
  3. Download the dataset from Google Drive and extract the .zip archive into the emotions_facial folder.

Usage

To run the project, execute the main.ipynb notebook, which uses the optimal hyperparameters stored in the optuna.db file.
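
If you want to inspect the tuned values outside the notebook, they can be read directly from the Optuna storage. A minimal sketch, assuming a single study in optuna.db (the study name below is a guess, not taken from the repository):

    import optuna

    # Study name is hypothetical; list the actual studies with
    #   optuna.get_all_study_summaries(storage="sqlite:///optuna.db")
    study = optuna.load_study(study_name="densenet_fer",
                              storage="sqlite:///optuna.db")
    print(study.best_params)  # best hyperparameters found during the search
    print(study.best_value)   # corresponding validation score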

Files

  • main.ipynb: Main notebook, using optimal hyperparameters from optuna.db.
  • densenet.py: Implementation of the DenseNet class.
  • alternativeMethod_CNN.ipynb: Notebook for the alternative method.
  • hyperparam_optimization.py: Hyperparameter optimization using Optuna.

Note: The optimal hyperparameters are already provided in the project files.


Report Summary

Abstract & Introduction

Facial expressions serve as strong indicators of human emotion. Consequently, Facial Emotion Recognition (FER) has become increasingly significant for human-machine interaction, with applications ranging from automated mood detection in customer service to security and public safety [1].

This study explores the use of DenseNet architectures to address the challenges of FER using the FER2013 dataset [2]. While traditional Convolutional Neural Networks (CNNs) are powerful, they often struggle with feature retention in deeper layers. This project aims to train a DenseNet model to improve feature reuse and mitigate overfitting, comparing its performance against a simpler CNN alternative to quantify the benefits of dense connectivity [3].

Dataset

The dataset consists of 48x48 pixel grayscale images categorized into 7 classes. The distribution of the images across these classes is as follows:

    Angry    Disgust  Fear     Happy    Neutral  Sad      Surprise  Total
    7532     815      7705     13739    9490     9242     5717      54240

Dataset Samples
Figure 1: Sample images from the FER2013 dataset representing different emotional expressions.
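
For reference, such a class-per-folder dataset can be loaded directly with Keras. This is a minimal sketch, assuming the archive extracts into one subfolder per emotion class (check the extracted layout first; main.ipynb may load the data differently):

    import tensorflow as tf

    # Assumes emotions_facial/ contains one subfolder per class
    # (angry/, disgust/, ...); labels are inferred from folder names.
    dataset = tf.keras.utils.image_dataset_from_directory(
        "emotions_facial",
        color_mode="grayscale",  # FER2013 images are single-channel
        image_size=(48, 48),     # native resolution of the dataset
        batch_size=64,
    )
    print(dataset.class_names)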

Architecture

The core of this project is the Densely Connected Convolutional Network (DenseNet). Unlike traditional CNNs that connect layers sequentially, DenseNet introduces direct connections between all layers within a dense block. This architecture ensures that each layer receives feature maps from all preceding layers, which promotes feature reuse, improves gradient flow, and mitigates the vanishing gradient problem [4].

DenseNet Architecture
Figure 2: Visualization of a deep DenseNet with three dense blocks and transition layers.
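
To make the connectivity pattern concrete, here is a minimal Keras sketch of a single dense block. It is illustrative only; the project's actual implementation lives in densenet.py and may differ in details:

    import tensorflow as tf
    from tensorflow.keras import layers

    def dense_block(x, num_layers, growth_rate):
        """Each layer consumes the concatenation of all preceding feature maps."""
        for _ in range(num_layers):
            y = layers.BatchNormalization()(x)
            y = layers.ReLU()(y)
            y = layers.Conv2D(growth_rate, kernel_size=3, padding="same")(y)
            x = layers.Concatenate()([x, y])  # dense connection: reuse all earlier features
        return x

    # Toy usage: 48x48 grayscale input, one block of 4 layers, growth rate 12.
    inputs = tf.keras.Input(shape=(48, 48, 1))
    outputs = dense_block(inputs, num_layers=4, growth_rate=12)
    print(outputs.shape)  # channels grow by growth_rate per layer: 1 + 4*12 = 49

Because the channel count grows with every layer, transition layers (a 1x1 convolution followed by pooling) are placed between dense blocks to keep feature-map size and width manageable.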

Hyperparameter Optimization

To maximize model performance, we employed the Optuna framework with the Tree-structured Parzen Estimator (TPE) sampler [5]. The optimization ran for 40 trials, evaluating hyperparameters including the number of dense blocks, the growth rate, dropout rates, and L2 regularization factors to balance model complexity and generalization.
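
The overall shape of such a study looks roughly like the following sketch. The exact search space lives in hyperparam_optimization.py; the parameter names and ranges here are illustrative guesses, and the objective body is a placeholder:

    import optuna

    def objective(trial):
        # Illustrative search space mirroring the hyperparameters named above;
        # the actual names and ranges are defined in hyperparam_optimization.py.
        num_blocks = trial.suggest_int("num_blocks", 2, 4)
        growth_rate = trial.suggest_int("growth_rate", 8, 32)
        dropout = trial.suggest_float("dropout", 0.0, 0.5)
        l2_factor = trial.suggest_float("l2_factor", 1e-6, 1e-2, log=True)
        # In the real objective: build the DenseNet with these values,
        # train it, and return the validation accuracy.
        return 0.0  # placeholder score so the sketch runs end to end

    study = optuna.create_study(
        direction="maximize",
        sampler=optuna.samplers.TPESampler(),  # the TPE method cited above [5]
        storage="sqlite:///optuna.db",         # persists trials for main.ipynb
        study_name="densenet_fer",             # hypothetical name
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=40)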


Results

The DenseNet architecture demonstrated superior performance compared to the traditional CNN baseline.

  • DenseNet Accuracy: 57%
  • CNN Accuracy: 34%

The confusion matrices reveal that while the DenseNet generalizes reasonably well, the traditional CNN struggles significantly with class imbalance, failing completely to predict the "disgust" class and biasing heavily towards overrepresented classes such as "angry" [3].

DenseNet Confusion Matrix
Figure 3: Normalized confusion matrix for the optimal DenseNet model.
CNN Confusion Matrix
Figure 4: Normalized confusion matrix for the alternative CNN model.
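
Normalized matrices like these can be reproduced from model predictions with scikit-learn; a minimal sketch with placeholder labels (in practice, y_true comes from the test split and y_pred from model.predict()):

    import numpy as np
    from sklearn.metrics import confusion_matrix

    CLASSES = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

    # Placeholder labels for illustration only.
    y_true = np.array([0, 1, 2, 3, 4, 5, 6, 0, 3])
    y_pred = np.array([0, 0, 2, 3, 4, 4, 6, 0, 3])

    # normalize="true" divides each row by its class total, so rare classes
    # like "disgust" are judged by per-class recall rather than raw counts.
    cm = confusion_matrix(y_true, y_pred, normalize="true")
    print(np.round(cm, 2))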

Conclusion

This study demonstrates that DenseNet architectures significantly outperform traditional CNNs in facial emotion recognition by leveraging dense connectivity for improved feature propagation. While accuracy improved substantially, future work should address the dataset's quality limitations, for example by integrating temporal data and advanced augmentation techniques.


Acknowledgments

This project was completed as part of the Machine Learning for Physicists 2024 module at TU Dortmund.

For any questions or issues, please feel free to contact the authors.

References

[1] A.-L. Cîrneanu, D. Popescu, and D. Iordache. "New Trends in Emotion Recognition Using Image Analysis by Neural Networks, A Systematic Review". In: Sensors 23.16 (2023).

[2] Kaggle. "FER-2013 - Learn facial expressions from an image". Available: Kaggle Dataset.

[3] R. M. Lehner and L. Hagemann. "Machine Learning for Physicists: Final Report". TU Dortmund, 2024.

[4] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. "Densely Connected Convolutional Networks". In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017).

[5] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama. "Optuna: A Next-generation Hyperparameter Optimization Framework". arXiv:1907.10902 (2019).


Enjoy exploring the project and feel free to contribute or provide feedback!
