alihamza701/News-Topic-Classifier-Using-BERT

News Headline Classifier: BERT (AG News)

Fine-tuned bert-base-uncased on the AG News dataset for classifying news headlines into four categories: World, Sports, Business, Sci/Tech.

This repo contains a single script that trains the model, evaluates it, saves the checkpoint, and launches a Gradio demo for live inference.


🚀 Features

  • Loads AG News via datasets
  • Tokenizes with AutoTokenizer (bert-base-uncased)
  • Fine-tunes AutoModelForSequenceClassification
  • Evaluates with accuracy and weighted F1
  • Simple Gradio web UI for live predictions
  • Model and tokenizer saved to ./bert_agnews
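The two evaluation metrics listed above can be illustrated with a minimal pure-Python sketch. This is not the code from main.py, which presumably relies on evaluate or scikit-learn; the function names `accuracy` and `weighted_f1` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)  # number of true examples per class
    total = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = count - tp
        f1 = 2 * tp / (2 * tp + fp + fn)  # denominator >= count >= 1
        total += f1 * count
    return total / len(y_true)

# Toy data using AG News label ids: 0=World, 1=Sports, 2=Business, 3=Sci/Tech
y_true = [0, 1, 2, 3, 1, 2]
y_pred = [0, 1, 2, 2, 1, 3]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

Weighting by support means frequent classes influence the score more than rare ones. AG News is balanced across its four classes, so weighted and macro F1 should be close on the full test set.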

📦 Requirements

Install dependencies:

pip install --upgrade transformers datasets evaluate scikit-learn gradio torch

▶️ Quickstart

Clone the repo:

git clone <repo-url>
cd <repo-folder>

Run the script:

python main.py

Gradio demo:

Enter a news headline in the input box.

Output → class probabilities, e.g.:

{"World": 0.01, "Sports": 0.92, "Business": 0.03, "Sci/Tech": 0.04}
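As a rough illustration of where such a dictionary comes from, the sketch below applies a softmax to four raw scores and pairs the results with the class names. The helper name `logits_to_probs` and the logit values are made up for this example; the actual demo function in main.py may be structured differently.

```python
import math

# Class names in AG News label-id order
LABELS = ["World", "Sports", "Business", "Sci/Tech"]

def logits_to_probs(logits):
    """Softmax over raw scores, returned as a {label: probability} dict."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}

example_logits = [-1.2, 3.4, 0.1, 0.3]  # hypothetical model outputs
probs = logits_to_probs(example_logits)
top_label = max(probs, key=probs.get)  # class with the highest probability
print(top_label, {k: round(v, 2) for k, v in probs.items()})
```

The probabilities always sum to 1, so the largest value can be read directly as the model's confidence in its top class.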

📂 Outputs

Model & tokenizer: ./bert_agnews

Trainer artifacts: ./results

🛠 Example Inference

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned checkpoint saved by the training script
tokenizer = AutoTokenizer.from_pretrained("./bert_agnews")
model = AutoModelForSequenceClassification.from_pretrained("./bert_agnews")
model.eval()

text = "SpaceX launches new satellite to orbit"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Run a forward pass without tracking gradients
with torch.no_grad():
    out = model(**inputs)

# Convert logits to probabilities and map them to class names
probs = torch.softmax(out.logits, dim=-1).squeeze().tolist()
labels = ["World", "Sports", "Business", "Sci/Tech"]
print({labels[i]: probs[i] for i in range(len(labels))})
