alihamza701/Auto-Tagging-Support-System-Using-LLm

🏷️ Auto-Tagging Support Tickets Using LLM

Automatically classify customer support tickets into the top 3 most probable categories using three NLP approaches: zero-shot, few-shot, and fine-tuned GPT-2 — compared side-by-side with precision, recall, and F1-score.


📌 Overview

Support teams deal with thousands of incoming tickets that need routing to the right department. Manually tagging these is slow and error-prone. This project builds an automated tagging pipeline using LLMs and compares prompting against fine-tuning. In the demonstration run, the prompt-based methods, which need no fine-tuning, reached the highest F1.

Dataset: Customer Support on Twitter (via KaggleHub)

Categories:

| Tag | Tag | Tag |
|---|---|---|
| 🧾 Billing and Payments | 🔧 Technical Support | 📡 Service Issues/Outages |
| 👤 Account Management | ❓ General Inquiry | 📦 Product Information |
| 💬 Feedback and Suggestions | | |

✨ Features

  • Text Cleaning Pipeline — removes URLs, mentions, hashtags, special characters
  • Zero-Shot Prompting — classify tickets with no labeled data
  • Few-Shot Prompting: inject one example per category into the prompt (kept small to avoid token overflow)
  • GPT-2 Fine-Tuning — lightweight domain adaptation using transformers Trainer
  • Multi-Label Evaluation — weighted precision, recall, F1 for all three methods
  • Side-by-Side Comparison — bar chart comparing all three approaches
  • CSV Output — final top-3 tags per ticket saved to file
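
The cleaning step in the feature list can be sketched with the standard `re` module. This is a minimal illustration, not the notebook's actual code: the function name `clean_text` and the exact regular expressions are assumptions chosen to match the description (URLs, mentions, hashtags, special characters).

```python
import re

def clean_text(text: str) -> str:
    """Strip URLs, @mentions, #hashtags and special characters from a tweet.

    Hypothetical helper: the patterns below are assumptions, not the notebook's own.
    """
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)                # mentions and hashtags
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)         # remaining special characters
    return re.sub(r"\s+", " ", text).strip()            # collapse runs of whitespace

print(clean_text("@AppleSupport my iPhone will not charge!! #help https://t.co/abc123"))
# prints: my iPhone will not charge
```

Tweets that consist only of a mention or a link come back as an empty string, which is why empty rows need to be filtered out before prompting.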

🗂️ Workflow

Raw Tweets → Text Cleaning → Prompt Construction
                                    ↓
                    ┌───────────────┼───────────────┐
                Zero-Shot      Few-Shot        Fine-Tuned
                    └───────────────┼───────────────┘
                                    ↓
                        Evaluate (P / R / F1)
                                    ↓
                     Best Method → Top-3 Tags Output
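
The prompt-construction stage of the workflow could look roughly like the sketch below. The instruction wording, the `build_prompt` name, and the example ticket are assumptions for illustration; only the category names and the `Tags:` marker come from this README.

```python
CATEGORIES = [
    "Billing and Payments", "Technical Support", "Service Issues/Outages",
    "Account Management", "General Inquiry", "Product Information",
    "Feedback and Suggestions",
]

def build_prompt(ticket, examples=None):
    """Return a zero-shot prompt, or a few-shot prompt when examples are given.

    `examples` is a list of (ticket_text, [tags]) pairs (hypothetical format).
    """
    lines = [
        "Classify the support ticket into the 3 most likely categories.",
        "Categories: " + ", ".join(CATEGORIES),
        "",
    ]
    # Few-shot: prepend worked examples in the same Ticket/Tags layout.
    for ex_text, ex_tags in (examples or []):
        lines += [f"Ticket: {ex_text}", "Tags: " + ", ".join(ex_tags), ""]
    # The prompt ends on "Tags:" so the model continues with the tag list.
    lines += [f"Ticket: {ticket}", "Tags:"]
    return "\n".join(lines)

zero_shot = build_prompt("I was charged twice this month")
few_shot = build_prompt(
    "I was charged twice this month",
    examples=[("My internet has been down all day",
               ["Service Issues/Outages", "Technical Support", "General Inquiry"])],
)
```

Using one function for both modes keeps the two prompts identical apart from the examples, so any score difference can be attributed to the examples themselves.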

📦 Requirements

pip install kagglehub pandas scikit-learn transformers torch matplotlib seaborn jupyter

Tested with Python 3.10+, PyTorch 2.x, Transformers 4.x. GPU recommended for fine-tuning but not required.


🚀 Quickstart

1. Clone the repo:

git clone https://github.com/alihamza701/Auto-Tagging-Support-Tickets-Using-LLM.git
cd Auto-Tagging-Support-Tickets-Using-LLM

2. Set up Kaggle credentials (needed for dataset download):

# Place your kaggle.json in ~/.kaggle/
# Or set env variables:
export KAGGLE_USERNAME=your_username
export KAGGLE_KEY=your_api_key

3. Launch the notebook:

jupyter notebook Auto_Tagging_Support_Tickets_Using_LLM.ipynb

4. Run all cells. The dataset downloads automatically. Fine-tuning takes ~5–10 min on GPU, ~30 min on CPU.


📊 Results

| Method | Precision | Recall | F1-Score |
|---|---|---|---|
| Zero-Shot | ~0.27 | ~0.41 | ~0.32 |
| Few-Shot | ~0.28 | ~0.43 | ~0.34 |
| Fine-Tuned (GPT-2) | ~0.30 | ~0.17 | ~0.21 |

Note: These scores are based on dummy/random ground-truth labels for pipeline demonstration. Replace manual_tags with real human annotations for meaningful evaluation.

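Few-shot scores highest on recall and F1 in this run, while fine-tuned GPT-2 has slightly higher precision but much lower recall. Few-shot guides the LLM with a handful of in-prompt examples and needs no model training. Because the labels are placeholders, treat this ranking as provisional.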
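
For reference, here is a dependency-free sketch of support-weighted precision, recall, and F1 over multi-label tag sets. It assumes that "weighted" here means scikit-learn-style weighting (each category's score weighted by how often it appears in the ground truth); the function name `weighted_prf` and the toy data are hypothetical.

```python
def weighted_prf(y_true, y_pred, labels):
    """Support-weighted precision, recall and F1 for multi-label predictions.

    y_true and y_pred are parallel lists of tag sets, one set per ticket.
    """
    p_sum = r_sum = f_sum = 0.0
    total_support = 0
    for lab in labels:
        tp = sum(lab in t and lab in p for t, p in zip(y_true, y_pred))
        n_pred = sum(lab in p for p in y_pred)    # times the label was predicted
        support = sum(lab in t for t in y_true)   # times the label is truly present
        prec = tp / n_pred if n_pred else 0.0
        rec = tp / support if support else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        p_sum += prec * support
        r_sum += rec * support
        f_sum += f1 * support
        total_support += support
    if total_support == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    return {"precision": p_sum / total_support,
            "recall": r_sum / total_support,
            "f1": f_sum / total_support}

labels = ["Billing and Payments", "Technical Support", "General Inquiry"]
y_true = [{"Billing and Payments", "Technical Support"}, {"Technical Support"}, {"General Inquiry"}]
y_pred = [{"Billing and Payments", "General Inquiry"}, {"Technical Support"},
          {"Technical Support", "General Inquiry"}]
metrics = weighted_prf(y_true, y_pred, labels)
print(metrics)  # precision 0.625, recall 0.75, f1 about 0.667
```

Returning a dict with labeled keys means the three methods' scores can be stored side by side without variable-name collisions.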


📁 Project Structure

Auto-Tagging-Support-Tickets-Using-LLM/
│
├── Auto_Tagging_Support_Tickets_Using_LLM.ipynb   # Main notebook
├── fine_tuned_model/                              # Saved GPT-2 weights (after training)
├── tagged_tickets_output.csv                     # Final tagging results
├── method_comparison.png                         # Performance bar chart
└── README.md
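
A minimal sketch of how the final top-3 tags might be written to CSV with the standard library. The column names (`text`, `tag_1`, `tag_2`, `tag_3`) and the helper `write_tags_csv` are assumptions; the notebook's actual output schema may differ.

```python
import csv
import io

def write_tags_csv(rows, fh):
    """Write (ticket_text, [tags]) pairs as CSV rows with exactly three tag columns."""
    writer = csv.writer(fh)
    writer.writerow(["text", "tag_1", "tag_2", "tag_3"])  # hypothetical column names
    for text, tags in rows:
        padded = (list(tags) + ["", "", ""])[:3]  # pad or truncate to three tags
        writer.writerow([text, *padded])

# Demonstrated on an in-memory buffer; for a real file use
# open("tagged_tickets_output.csv", "w", newline="", encoding="utf-8").
buf = io.StringIO()
write_tags_csv(
    [("I was charged twice", ["Billing and Payments", "Account Management"])],
    buf,
)
print(buf.getvalue())
```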

🐛 Bugs Fixed (vs. Original)

| Issue | Fix Applied |
|---|---|
| fillna(inplace=True) deprecated warning | Replaced with assignment df['col'] = df['col'].fillna('') |
| manual_tags created multiple times across cells | Moved to single creation cell (Section 5) |
| predict_tags parsed input text instead of generated output | Fixed to split on Tags: correctly |
| fp16=True used on CPU (causes error) | Now conditional: fp16 = torch.cuda.is_available() |
| Empty tickets passed to LLM prompts | Added filter to drop empty cleaned_text rows |
| Dummy predictions copy-pasted 4 times | Extracted into reusable make_dummy_predictions() |
| Few-shot prompt included ALL examples per category | Reduced to 1 per category to avoid token overflow |
| Metrics returned as bare variables (name collision risk) | Wrapped in dict with labeled keys |
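
The parsing fix matters because a causal model such as GPT-2 returns the prompt followed by its continuation, so naive parsing can pick up tags from the prompt itself. The sketch below shows one way to read tags only from the generated part. It is not the notebook's `predict_tags`: the name `parse_tags` and its signature are hypothetical.

```python
CATEGORIES = [
    "Billing and Payments", "Technical Support", "Service Issues/Outages",
    "Account Management", "General Inquiry", "Product Information",
    "Feedback and Suggestions",
]

def parse_tags(generated, prompt, categories, k=3):
    """Extract up to k known categories from the model's continuation only."""
    if generated.startswith(prompt):
        # Drop the echoed prompt so example tags are not mistaken for predictions.
        continuation = generated[len(prompt):]
    else:
        # Fallback: keep whatever follows the first "Tags:" marker.
        continuation = generated.split("Tags:", 1)[-1]
    stripped = continuation.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    by_lower = {c.lower(): c for c in categories}  # case-insensitive lookup
    tags = []
    for candidate in first_line.split(","):
        name = by_lower.get(candidate.strip().lower())
        if name and name not in tags:  # ignore unknown labels and duplicates
            tags.append(name)
    return tags[:k]

prompt = "Ticket: I was charged twice\nTags:"
generated = prompt + " Billing and Payments, account management, Nonsense\nTicket: another"
print(parse_tags(generated, prompt, CATEGORIES))
# prints: ['Billing and Payments', 'Account Management']
```

Only the first line of the continuation is read, so any extra tickets the model goes on to invent are ignored.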

🔮 Next Steps

  • Replace dummy labels with real human annotations via Label Studio
  • Swap GPT-2 for a stronger model (facebook/opt-1.3b, mistralai/Mistral-7B-Instruct)
  • Use the OpenAI API or Anthropic API for zero/few-shot — dramatically better results
  • Add a Gradio web demo for interactive ticket classification
  • Try sentence-transformers + cosine similarity as a fast zero-shot baseline

👤 Author

Ali Hamza


📄 License

This project is open source and available under the MIT License.
