🧠 MiniBERT: A Small-Scale Implementation and Fine-tuning of BERT

MiniBERT is a compact, educational replication of the original BERT model (Devlin et al., 2018), designed for hands-on understanding and experimentation in compute-constrained environments. The project pre-trains a simplified BERT architecture on ~30M tokens and fine-tunes it on two NLP tasks: SST-2 for sentiment classification and SWAG for commonsense inference.

📖 Overview

"Pre-train on unlabeled data, fine-tune on everything else."

MiniBERT recreates the BERT pipeline — from scratch — with:

4 Transformer layers
Hidden size of 256
4 attention heads
Max sequence length: 128
Pretraining on BookCorpus + Wikipedia (trimmed)
Fine-tuning on SST-2 & SWAG

The small footprint makes it a practical reference for anyone learning about Transformers, the BERT architecture, or resource-efficient deep learning.

🏗️ Model Architecture

Component	MiniBERT	BERT-Base
Layers	4	12
Hidden Size	256	768
Attention Heads	4	12
Seq Length	128	512
Vocabulary Size	30,522	30,522

Built in PyTorch, with inspiration from the Hugging Face transformers library.
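To get a feel for what the table implies in model size, here is a rough parameter-count sketch for a BERT-style encoder. It is not taken from the notebook: the `BertConfig` class and `count_parameters` function are hypothetical names, and the feed-forward width of 4 × hidden size is an assumption (it matches BERT-Base but is not stated for MiniBERT). Pre-training heads are not counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BertConfig:
    layers: int
    hidden: int
    heads: int
    max_len: int
    vocab: int = 30_522
    type_vocab: int = 2   # segment A / segment B embeddings
    ffn_mult: int = 4     # ASSUMPTION: feed-forward width = 4 * hidden


def count_parameters(cfg: BertConfig) -> int:
    """Count encoder weights: embeddings, Transformer layers, and [CLS] pooler."""
    h, ffn = cfg.hidden, cfg.hidden * cfg.ffn_mult
    layer_norm = 2 * h                                   # scale + bias
    embeddings = (cfg.vocab + cfg.max_len + cfg.type_vocab) * h + layer_norm
    attention = 4 * (h * h + h) + layer_norm             # Q, K, V, output projections
    feed_forward = (h * ffn + ffn) + (ffn * h + h) + layer_norm
    pooler = h * h + h                                   # dense layer over [CLS]
    return embeddings + cfg.layers * (attention + feed_forward) + pooler


MINIBERT = BertConfig(layers=4, hidden=256, heads=4, max_len=128)
BERT_BASE = BertConfig(layers=12, hidden=768, heads=12, max_len=512)

for name, cfg in [("MiniBERT", MINIBERT), ("BERT-Base", BERT_BASE)]:
    assert cfg.hidden % cfg.heads == 0   # hidden size must split evenly across heads
    print(f"{name}: {count_parameters(cfg) / 1e6:.1f}M parameters, "
          f"{cfg.hidden // cfg.heads} dims per attention head")
```

Under these assumptions MiniBERT comes out at roughly a tenth of BERT-Base, with most of its weights sitting in the shared 30,522-entry token embedding rather than in the Transformer layers. Both configurations use 64 dimensions per attention head.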

🧪 Pre-training Tasks

1. Masked Language Modeling (MLM)

15% of tokens masked (80% [MASK], 10% random, 10% unchanged)
Loss calculated only on masked tokens
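The masking rule above can be sketched in plain Python as follows. This is an illustration rather than the notebook's code: `mask_tokens` is a hypothetical helper, the special-token ids are assumed to be those of the `bert-base-uncased` vocabulary, and each position is selected independently with probability 0.15 (the original BERT recipe instead picks a fixed 15% of positions per sequence). Unselected positions get the label `-100`, the default `ignore_index` of PyTorch's cross-entropy loss, which is one common way to compute the loss only on masked tokens.

```python
import random

# ASSUMPTION: special-token ids of the bert-base-uncased vocabulary.
PAD_ID, CLS_ID, SEP_ID, MASK_ID = 0, 101, 102, 103
VOCAB_SIZE = 30_522
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_tokens(token_ids, rng, mask_prob=0.15):
    """Return (inputs, labels) for one sequence of token ids."""
    special = {PAD_ID, CLS_ID, SEP_ID}
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if tok in special or rng.random() >= mask_prob:
            continue                      # not selected: no loss at this position
        labels[i] = tok                   # target is the original token
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_ID           # 80%: replace with [MASK]
        elif roll < 0.9:
            # 10%: replace with a random token (any id; kept simple here)
            inputs[i] = rng.randrange(VOCAB_SIZE)
        # remaining 10%: leave the token unchanged
    return inputs, labels


rng = random.Random(0)
sequence = [CLS_ID] + [rng.randrange(1000, VOCAB_SIZE) for _ in range(12)] + [SEP_ID]
inputs, labels = mask_tokens(sequence, rng)
print(inputs)
print(labels)
```

Keeping 10% of the selected tokens unchanged means the model cannot assume that every unmasked input token is correct, which reduces the mismatch between pre-training (where `[MASK]` appears) and fine-tuning (where it does not).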

2. Next Sentence Prediction (NSP)

50% IsNext (same document), 50% NotNext (random pairing)
Input format: [CLS] Sentence A [SEP] Sentence B [SEP]
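A minimal sketch of how such pairs might be built is shown below. The function names are hypothetical, and two details are assumptions borrowed from the original BERT recipe rather than confirmed by this project: IsNext pairs use the sentence that directly follows Sentence A, and NotNext pairs draw Sentence B from a different document. Truncation to the 128-token limit is left out for brevity.

```python
import random


def build_pair(tokens_a, tokens_b):
    """Lay out [CLS] A [SEP] B [SEP] with segment ids 0 (A side) and 1 (B side)."""
    tokens = ["[CLS]", *tokens_a, "[SEP]", *tokens_b, "[SEP]"]
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


def sample_nsp_example(documents, rng):
    """documents: list of documents, each a list of tokenized sentences.

    Needs at least two non-empty documents, one of them with two or more sentences.
    """
    doc_idx = rng.choice([i for i, d in enumerate(documents) if len(d) >= 2])
    doc = documents[doc_idx]
    pos = rng.randrange(len(doc) - 1)
    sent_a = doc[pos]
    if rng.random() < 0.5:
        sent_b, is_next = doc[pos + 1], True              # IsNext
    else:
        other = rng.choice([i for i in range(len(documents)) if i != doc_idx])
        sent_b, is_next = rng.choice(documents[other]), False   # NotNext
    tokens, segment_ids = build_pair(sent_a, sent_b)
    return tokens, segment_ids, is_next


docs = [
    [["the", "cat", "sat"], ["it", "purred"], ["then", "it", "slept"]],
    [["stocks", "fell"], ["bonds", "rose"]],
]
print(sample_nsp_example(docs, random.Random(0)))
```

The segment ids are what let the model tell the two sentences apart: they index the two-row segment embedding that is added to the token and position embeddings.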

📊 Fine-tuning & Results

✅ SST-2: Sentiment Classification

Binary classification
87.2% validation accuracy
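For readers new to BERT fine-tuning, the sketch below shows the usual shape of a sentence-classification head: a single linear layer over the final `[CLS]` vector, followed by softmax and cross-entropy. It is written in plain Python with toy numbers so it runs anywhere; the helper names are hypothetical, and it is an assumption that the notebook follows this standard setup. In the real model the `[CLS]` vector would have 256 entries and the weight matrix would be 2 × 256.

```python
import math


def softmax(logits):
    peak = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(cls_vector, weight, bias):
    """Linear head: logits[k] = weight[k] . cls_vector + bias[k]."""
    logits = [sum(w * x for w, x in zip(row, cls_vector)) + b
              for row, b in zip(weight, bias)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


def cross_entropy(probs, label):
    """Negative log-likelihood of the correct class."""
    return -math.log(probs[label])


# Toy example: a 2-dimensional "[CLS]" vector and two classes
# (in GLUE's SST-2, label 0 is negative and label 1 is positive).
cls_vector = [1.0, -2.0]
weight = [[0.5, 0.0], [0.0, 0.5]]
bias = [0.0, 0.0]
prediction, probs = classify(cls_vector, weight, bias)
print(prediction, probs, cross_entropy(probs, label=0))
```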

❌ SWAG: Commonsense Inference

4-choice multiple choice
37.3% validation accuracy, above the 25% chance level for four choices but still weak (limited by compute/resources)
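Multiple-choice tasks like this are commonly handled by encoding each candidate ending as its own sequence, scoring all four with the same model, and taking the highest score. The sketch below shows only that input layout and the selection step; the helper names, the example sentence, and the scores are made up for illustration, and it is an assumption that the notebook follows this layout.

```python
def build_choice_inputs(context, endings):
    """One [CLS] context [SEP] ending [SEP] sequence per candidate ending."""
    return [["[CLS]", *context, "[SEP]", *ending, "[SEP]"] for ending in endings]


def predict(scores):
    """Index of the highest-scoring candidate (argmax over the choices)."""
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


context = ["she", "opens", "the", "umbrella"]
endings = [
    ["and", "walks", "into", "the", "rain"],
    ["and", "eats", "the", "piano"],
    ["then", "the", "car", "sings"],
    ["while", "the", "moon", "runs"],
]
inputs = build_choice_inputs(context, endings)   # 4 sequences for 1 example

# Hypothetical scores: in a real model, one scalar per sequence from the [CLS] vector.
scores = [2.1, -0.3, -1.2, 0.4]
print(predict(scores))
```

During training, the four scores are typically passed through a softmax and trained with cross-entropy against the index of the correct ending, so each example costs four forward passes through the encoder.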

🛠️ Tech Stack

Python
PyTorch
Hugging Face transformers
Google Colab (T4 GPU)
Datasets: BookCorpus, Wikipedia, SST-2, SWAG

🚀 How to Run

Clone the repo and install dependencies:

pip install -r requirements.txt

Open the notebook:

jupyter notebook DLbertcode.ipynb

Run each section:
- Preprocessing + Dataset loading
- Model implementation
- Pre-training (MLM + NSP)
- Fine-tuning on SST-2 & SWAG

📁 Project Structure

File	Description
`DLbertcode.ipynb`	Main notebook with full implementation
`DLbert.pdf`	Project report (background, results)
`requirements.txt`	Dependencies list
`models/`	(Optional) Saved checkpoints
`data/`	(Optional) Preprocessed datasets

📌 Limitations

Small model capacity
Limited compute (1 GPU, 3 epochs)
Restricted token count (~30M vs. BERT's 3.3B+)
Only 2 downstream tasks evaluated

📚 References

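Devlin et al., 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Vaswani et al., 2017. Attention Is All You Need
Zellers et al., 2018. SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
Socher et al., 2013. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (source of SST-2)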

🙏 Acknowledgments

Thanks to the contributors and the Department of CSE (AIML) for their support and guidance.
