A multi-task learning approach to detecting toxicity, empathy, and cognitive maturity in social media comments from YouTube, Twitter, and Reddit using DistilBERT.
- Can a compact transformer model (DistilBERT) predict toxicity, empathy, and cognitive maturity simultaneously?
- Does multi-task learning outperform a TF-IDF + Logistic Regression baseline?
- How do the model's predictions compare with GPT-3.5?
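The TF-IDF + Logistic Regression baseline referenced above can be sketched as a scikit-learn pipeline. The comments and labels below are invented stand-ins for illustration, not the project's annotated data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Illustrative stand-in comments; the real dataset has 450 annotated comments.
comments = [
    "I really appreciate your perspective, thanks for sharing.",
    "This is the dumbest take I have ever read.",
    "Interesting point, though I see it differently.",
    "Nobody asked for your opinion, go away.",
]
labels = ["non-toxic", "toxic", "neutral", "toxic"]  # 3-class toxicity

# TF-IDF features feeding a logistic-regression classifier.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(comments, labels)
preds = baseline.predict(comments)
```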
- Total: 450 manually annotated comments
- Sources: YouTube (150), Reddit (150), Twitter (150)
- Labels:
- Toxicity: 3-class (non-toxic, neutral, toxic)
- Empathy: 1-5 Likert scale
- Maturity: 1-5 Likert scale
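One way to picture the annotation schema is a table with one row per comment. The rows below are hypothetical examples, not actual annotations from the dataset:

```python
import pandas as pd

# Hypothetical rows illustrating the label schema (not real annotations).
df = pd.DataFrame(
    {
        "platform": ["YouTube", "Twitter", "Reddit"],
        "comment": [
            "Great video, this really helped me understand the topic.",
            "You clearly have no idea what you're talking about.",
            "I disagree, but I see where you're coming from.",
        ],
        "toxicity": ["non-toxic", "toxic", "neutral"],  # 3-class label
        "empathy": [4, 1, 3],    # 1-5 Likert scale
        "maturity": [5, 2, 4],   # 1-5 Likert scale
    }
)
```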
- DistilBERT outperformed TF-IDF baseline in toxicity classification
- Non-toxic comments had higher empathy and maturity scores
- 50% agreement with GPT-3.5 on toxicity labels
- Platform differences: YouTube comments skewed non-toxic, Twitter skewed toxic, Reddit was roughly balanced
- Model: Fine-tuned DistilBERT with multi-task heads
- Architecture: Shared encoder + 3 task-specific outputs
- Loss Function: Weighted CrossEntropy (toxicity) + MSE (empathy, maturity)
- Training: AdamW optimizer, 13 epochs, batch size 16
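The shared-encoder architecture and combined loss described above can be sketched in PyTorch. The real model uses DistilBERT as the encoder; here a tiny stand-in encoder is substituted so the sketch runs without downloading weights, and the class weights are illustrative:

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Shared encoder + three task-specific heads.

    `encoder` is any module mapping token ids to a (batch, hidden) vector;
    in the project this is DistilBERT, but a tiny stand-in is used below.
    """

    def __init__(self, encoder, hidden_size):
        super().__init__()
        self.encoder = encoder
        self.toxicity_head = nn.Linear(hidden_size, 3)  # 3-class logits
        self.empathy_head = nn.Linear(hidden_size, 1)   # 1-5 regression
        self.maturity_head = nn.Linear(hidden_size, 1)  # 1-5 regression

    def forward(self, input_ids):
        h = self.encoder(input_ids)
        return (self.toxicity_head(h),
                self.empathy_head(h).squeeze(-1),
                self.maturity_head(h).squeeze(-1))

# Stand-in encoder: embedding + mean pooling (DistilBERT in the real model).
class TinyEncoder(nn.Module):
    def __init__(self, vocab=100, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
    def forward(self, ids):
        return self.emb(ids).mean(dim=1)

model = MultiTaskModel(TinyEncoder(), hidden_size=32)
ids = torch.randint(0, 100, (4, 12))           # batch of 4 token sequences
tox_logits, emp_pred, mat_pred = model(ids)

# Combined loss: weighted cross-entropy for toxicity + MSE for the two scales.
tox_y = torch.tensor([0, 2, 1, 2])
emp_y = torch.tensor([4.0, 1.0, 3.0, 2.0])
mat_y = torch.tensor([5.0, 2.0, 4.0, 2.0])
class_weights = torch.tensor([1.0, 1.5, 2.0])  # illustrative, not the paper's
loss = (nn.CrossEntropyLoss(weight=class_weights)(tox_logits, tox_y)
        + nn.MSELoss()(emp_pred, emp_y)
        + nn.MSELoss()(mat_pred, mat_y))
```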
| Metric | Value |
|---|---|
| Toxicity Accuracy | 68.8% |
| Toxicity F1 (macro) | 0.56 |
| Toxic Class AUC | 0.68 |
| Agreement with GPT-3.5 | 50% |
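Metrics like those in the table can be computed with scikit-learn. The label and probability arrays below are made-up stand-ins for the real test split:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Invented predictions standing in for the real test split.
y_true = np.array([0, 0, 1, 2, 2, 1, 0, 2])  # 0=non-toxic, 1=neutral, 2=toxic
y_pred = np.array([0, 1, 1, 2, 0, 1, 0, 2])
# Predicted probability of the toxic class, used for the one-vs-rest AUC.
p_toxic = np.array([0.1, 0.4, 0.3, 0.9, 0.2, 0.5, 0.1, 0.8])

accuracy = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro")
toxic_auc = roc_auc_score((y_true == 2).astype(int), p_toxic)
```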
- `truecolortest.ipynb` - Full implementation and evaluation
- `Research_Project.pdf` - Full academic report
- `results/` - Confusion matrices, ROC curves, and visualizations
- Python 3.10
- PyTorch
- Hugging Face Transformers (DistilBERT)
- Scikit-learn
- Pandas, NumPy, Matplotlib
- Install dependencies
- Run `truecolortest.ipynb`
- View results in the `results/` folder
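A minimal setup along these lines, assuming the listed tech stack (exact package pins are not specified in the project):

```shell
# Assumed setup commands; adjust versions to your environment.
pip install torch transformers scikit-learn pandas numpy matplotlib
jupyter notebook truecolortest.ipynb
```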
- Small dataset (450 samples) limits generalization
- Class imbalance across platforms
- No multilingual support
- Manual annotation introduces subjectivity
- Expand dataset with balanced platform representation
- Multilingual and cross-cultural analysis
- Human-in-the-loop refinement
- Real-time deployment for moderation systems
Phone Pyae Aung - MSc Data Science, University of Exeter
See `Research_Project.pdf` for full references