🧠 MAP – Student Misconception Detection (DeBERTa-v3)

Transformer-Based NLP Classification | Kaggle Competition Project

This project tackles the MAP – Charting Student Math Misunderstandings Kaggle competition. The goal is to classify mathematical misconceptions from open-ended student explanations using a transformer-based deep learning model.


🎯 Problem Overview

Students explain their reasoning in free-text format.
The task is to predict a combined label:

Category:Misconception

based on:

  • Question text
  • Multiple-choice answer
  • Student explanation

This is a multi-class NLP classification problem with strong class imbalance.


๐Ÿ— Model Architecture

Input Text
(Question + MC Answer + Student Explanation)

→ Tokenization (DeBERTa-v3)
→ Transformer Encoder
→ Classification Head
→ Softmax
→ Category:Misconception Prediction
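The first stage of this pipeline, combining the three input fields into a single string for the tokenizer, could look like the minimal sketch below. The `[SEP]` joiner and the field order are illustrative assumptions, not the notebook's exact preprocessing:

```python
# Sketch of how the three input fields might be combined before tokenization.
# The "[SEP]" joiner and field order are assumptions for illustration.
def build_input_text(question: str, mc_answer: str, explanation: str) -> str:
    """Concatenate question, chosen answer, and free-text explanation."""
    return " [SEP] ".join([question, mc_answer, explanation])

text = build_input_text(
    "What is 1/2 + 1/4?",
    "2/6",
    "I added the tops and the bottoms.",
)
# The combined string is then passed to the DeBERTa-v3 tokenizer.
```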


⚙️ Implementation Highlights

  • ✅ DeBERTa-v3 Backbone
  • ✅ Stratified K-Fold Cross Validation (3 Folds)
  • ✅ Mixed Precision Training (AMP)
  • ✅ Cosine Learning Rate Scheduler
  • ✅ AdamW Optimizer
  • ✅ Layer Freezing for Faster Training
  • ✅ Fold Ensembling for Final Submission
  • ✅ Macro F1 Evaluation
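The stratified 3-fold split from the list above can be sketched with scikit-learn. The labels here are synthetic stand-ins for the real Category:Misconception targets:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in labels: three classes, four samples each.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])

# 3-fold stratified CV, as described above; each validation fold keeps
# roughly the same class proportions as the full label set.
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
folds = list(skf.split(np.zeros(len(labels)), labels))

for fold, (train_idx, val_idx) in enumerate(folds):
    print(f"fold {fold}: val class counts = {np.bincount(labels[val_idx])}")
```

With imbalanced real labels, stratification is what keeps rare misconception classes represented in every validation fold.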

📊 Evaluation

Primary Metric: Macro F1 Score

  • Stratified splitting ensures label balance
  • Validation performance tracked per fold
  • Best fold model saved and used for ensembling
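Macro F1 averages the per-class F1 scores with equal weight, so minority misconception classes count as much as the dominant ones. A toy illustration:

```python
from sklearn.metrics import f1_score

# Toy predictions over three classes; class 1 has only two true samples.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 0, 2, 2]

# Per-class F1: class 0 = 0.889, class 1 = 0.667, class 2 = 1.0
macro = f1_score(y_true, y_pred, average="macro")
print(round(macro, 3))  # 0.852
```

A plain accuracy of 7/8 would hide the weak performance on class 1; macro F1 exposes it.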

🛠 Tech Stack

  • PyTorch
  • Hugging Face Transformers
  • Scikit-learn
  • Pandas / NumPy
  • Google Colab (T4 GPU)

📸 Notebook

Colab Notebook: https://colab.research.google.com/drive/1p7GqShMU9kcon3isXY7xAhMCfBCfrcqu?usp=sharing


🧠 Engineering Learnings

  • Handling extreme class imbalance in NLP classification
  • Implementing stratified K-Fold validation
  • Managing transformer training stability
  • Debugging mixed precision & gradient instability
  • Designing clean Kaggle submission pipelines

🔮 Future Improvements

  • Use DeBERTa-v3-Large
  • Apply label smoothing
  • Try class-balanced loss
  • Apply pseudo-labeling
  • Add model distillation for faster inference
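Of the ideas above, label smoothing is the simplest to sketch: the one-hot target is mixed with a uniform distribution over all classes, which discourages over-confident predictions on noisy labels. The `eps = 0.1` value below is an illustrative choice, not a tuned setting:

```python
import numpy as np

def smooth_labels(one_hot: np.ndarray, eps: float = 0.1) -> np.ndarray:
    """Mix a one-hot target with a uniform distribution over all classes."""
    n_classes = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / n_classes

target = np.array([0.0, 1.0, 0.0, 0.0])
print(smooth_labels(target))  # [0.025 0.925 0.025 0.025]
```

The smoothed target still sums to 1, so it drops into a standard cross-entropy loss unchanged.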
