NLP Ticket Triage & Sentiment

Auto-labels help-desk tickets with a topic and a sentiment using a multi-task transformer head (DistilBERT). The model is served via FastAPI and evaluated in a Streamlit dashboard, with PII redaction built in.

Results (mock inference on seed data)

Metric              Value
Topic accuracy      ~83% (keyword mock)
Sentiment accuracy  ~75% (keyword mock)
Latency             <5 ms (mock) / ~14 ms (DistilBERT)
PII redacted        email, phone, ticket IDs

To replace mock inference with the fine-tuned DistilBERT head, run python -m src.models.train (see Training) and restart the API.
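The accuracy figures in the table above are the fraction of tickets whose predicted label matches the gold label. A minimal sketch of that computation, assuming gold labels and predictions are available as parallel lists; the function name `accuracy` and the sample labels are illustrative and not taken from the repository:

```python
from typing import Sequence


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of positions where the prediction equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Illustrative labels only; real values would come from data/seed/seed.csv.
gold_topics = ["billing", "login", "bug", "shipping"]
pred_topics = ["billing", "login", "other", "shipping"]
topic_acc = accuracy(gold_topics, pred_topics)
print(f"topic accuracy: {topic_acc:.0%}")
```

With only 36 seed tickets, a single misclassified ticket moves accuracy by roughly 2.8 percentage points, so the reported figures are coarse.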

API Screenshot

Eval Dashboard Screenshot

Quickstart

python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Run API
uvicorn app.main:app --host 0.0.0.0 --port 8000
# → http://localhost:8000/docs

# Run eval dashboard
streamlit run src/eval/dashboard.py

# Run tests
pytest tests/ -v

Docker

docker build -t triage:latest .
docker run -p 8000:8000 triage:latest

API

POST /predict

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Refund failed twice; card charged."}'

Response

{
  "topic":     {"label": "billing", "score": 0.82},
  "sentiment": {"label": "neg",     "score": 0.90},
  "probs": {
    "topic":     {"billing": 0.82, "bug": 0.04, ...},
    "sentiment": {"neg": 0.90, "neu": 0.05, "pos": 0.05}
  },
  "latency_ms": 3,
  "model_version": "mock-keyword-0.1.0"
}
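The same request can be issued from Python. A minimal sketch using only the standard library, assuming the API is running locally on port 8000 as in the Quickstart; the helper names `build_predict_request` and `predict` are illustrative and not part of the repository:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local server from the Quickstart


def build_predict_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /predict request without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires the server to be running):
#   result = predict("Refund failed twice; card charged.")
#   print(result["topic"]["label"], result["sentiment"]["label"])
```

Keeping request construction separate from sending makes the payload easy to check in a unit test without a live server.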

GET /health

{"status": "ok", "model_version": "mock-keyword-0.1.0"}

Topics

login · billing · bug · feature · shipping · other

Sentiment

neg · neu · pos
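To illustrate how a keyword mock could map text onto the topic labels above, here is a rough sketch. The keyword lists and the name `mock_topic` are hypothetical; the repository's actual mock in src/infer/service.py may score tickets differently.

```python
from typing import Dict, Tuple

TOPICS = ("login", "billing", "bug", "feature", "shipping", "other")

# Hypothetical keyword lists for illustration; the real mock may use others.
TOPIC_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "login": ("login", "password", "sign in"),
    "billing": ("refund", "charged", "invoice", "card"),
    "bug": ("error", "crash", "broken"),
    "feature": ("feature", "please add"),
    "shipping": ("shipping", "delivery", "tracking"),
}


def mock_topic(text: str) -> Tuple[str, Dict[str, float]]:
    """Score topics by keyword hits and normalise the counts to probabilities."""
    lowered = text.lower()
    hits = {topic: 0 for topic in TOPICS}
    for topic, keywords in TOPIC_KEYWORDS.items():
        # Plain substring matching: crude, but enough for a mock.
        hits[topic] = sum(keyword in lowered for keyword in keywords)
    total = sum(hits.values())
    if total == 0:
        hits["other"] = 1  # no keyword matched, so fall back to "other"
        total = 1
    probs = {topic: count / total for topic, count in hits.items()}
    label = max(probs, key=probs.get)
    return label, probs


label, probs = mock_topic("Refund failed twice; card charged.")
print(label, round(probs[label], 2))
```

Substring matching will also fire inside longer words (for example "card" in "discard"), which is one reason a mock like this should only serve as a placeholder for the fine-tuned model.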

Project Structure

app/
└── main.py             — FastAPI app factory
src/
├── infer/
│   ├── preprocess.py   — PII redaction (email, phone, ticket IDs)
│   └── service.py      — inference router (mock + real model path)
├── models/
│   ├── multitask_head.py — DistilBERT multi-task PyTorch module
│   └── train.py          — fine-tuning script (DistilBERT on CSV)
└── eval/
    └── dashboard.py    — Streamlit eval dashboard
tests/
├── test_service.py     — 24 API tests
└── test_preprocess.py  — 7 PII redaction tests
data/
└── seed/seed.csv       — 36 labelled seed tickets for training + eval

Training

# Train on seed data (36 examples — add more for real accuracy)
python -m src.models.train \
  --train_csv data/seed/seed.csv \
  --epochs 3 \
  --batch_size 8 \
  --lr 2e-5
# Saves to artifacts/model/ — API picks it up automatically on restart
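For orientation, the flags in the command above could be parsed along these lines. This is a sketch, not the contents of train.py: the defaults simply mirror the values shown in the command, and `parse_args` and `total_steps` are illustrative names.

```python
import argparse
import math
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the training flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Fine-tune the multi-task head on a CSV.")
    parser.add_argument("--train_csv", required=True, help="path to the labelled CSV")
    parser.add_argument("--epochs", type=int, default=3)       # assumed default
    parser.add_argument("--batch_size", type=int, default=8)   # assumed default
    parser.add_argument("--lr", type=float, default=2e-5)      # assumed default
    return parser.parse_args(argv)


def total_steps(n_examples: int, batch_size: int, epochs: int) -> int:
    """Number of batches over the whole run, keeping the final partial batch."""
    return math.ceil(n_examples / batch_size) * epochs


args = parse_args(["--train_csv", "data/seed/seed.csv", "--epochs", "3",
                   "--batch_size", "8", "--lr", "2e-5"])
print(total_steps(36, args.batch_size, args.epochs))
```

With the 36 seed examples and a batch size of 8, this works out to 5 batches per epoch and 15 batches over 3 epochs, assuming one optimizer step per batch and no dropped partial batch. That is very little training signal, which is why the seed set should be extended.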

PII Redaction

preprocess.py replaces emails, phone numbers, and ticket IDs with placeholder tokens before any logging or inference. redact returns the redacted text and the number of replacements:

from src.infer.preprocess import redact
text, n = redact("Contact user@example.com or call 555-123-4567 about TKT-001.")
# → text == "Contact <email> or call <phone> about <ticket>.", n == 3
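A regex-based redactor with this behaviour might look like the sketch below. The patterns are assumptions (US-style 10-digit phone numbers, ticket IDs of the form TKT- followed by digits), and `redact_sketch` is an illustrative name; the expressions in preprocess.py may be stricter or broader.

```python
import re
from typing import Tuple

# Assumed patterns, applied in order; each match is replaced by its placeholder.
_PATTERNS = (
    ("<email>", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("<ticket>", re.compile(r"\bTKT-\d+\b")),
    ("<phone>", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
)


def redact_sketch(text: str) -> Tuple[str, int]:
    """Return the redacted text and the total number of replacements made."""
    total = 0
    for placeholder, pattern in _PATTERNS:
        text, count = pattern.subn(placeholder, text)
        total += count
    return text, total


redacted, n_redacted = redact_sketch(
    "Contact user@example.com or call 555-123-4567 about TKT-001."
)
print(redacted, n_redacted)
```

Returning the replacement count alongside the text lets callers log how much PII was removed without logging the PII itself.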
