Support Ticket Analyzer

A Jupyter notebook pipeline that takes a CSV of support tickets, classifies each one by category and severity using Claude, and produces a prioritized PM report — with a Fix Now / Next Sprint / Backlog breakdown and leadership-ready insights.

The Problem

PMs inheriting a support queue face the same problem every sprint: hundreds of tickets, no consistent categorization, and stakeholders asking "what are the top issues?" The answer requires reading through noise to spot patterns — a process that doesn't scale and introduces bias toward whichever tickets were filed most recently or loudest.

This tool automates the triage: classify every ticket by category and AI-assessed severity, surface recurring themes, and output a structured backlog recommendation in under 5 minutes.

The Solution

CSV of support tickets → Classify (category + severity + topic) → Theme synthesis → PM report

Input: Any CSV with subject and body columns. Optional: id, date, priority.

Output:

Per-ticket classification: category, AI-assessed severity, topic label, one-liner summary
Top 5 recurring themes with severity skew
Prioritized backlog: Fix Now / Next Sprint / Backlog
PM insights for leadership
Charts: category breakdown, severity distribution, ticket volume over time
Exported CSV with all classifications for further analysis
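The "top 5 recurring themes with severity skew" output can be approximated with a plain frequency count once each ticket has a topic and severity. The sketch below is an illustration only: the notebook may well ask Claude to synthesize themes instead, and the function name `top_themes` and the `topic`/`severity` dictionary keys are assumptions, not names taken from the repo.

```python
from collections import Counter, defaultdict


def top_themes(classified, limit=5):
    """Return the most frequent topics with their dominant severity.

    `classified` is a list of dicts with hypothetical keys "topic" and "severity".
    """
    volume = Counter(ticket["topic"] for ticket in classified)

    # Count severities separately for each topic so the skew can be reported.
    severities = defaultdict(Counter)
    for ticket in classified:
        severities[ticket["topic"]][ticket["severity"]] += 1

    themes = []
    for topic, count in volume.most_common(limit):
        dominant, _ = severities[topic].most_common(1)[0]
        themes.append({"topic": topic, "tickets": count, "skew": f"mostly {dominant}"})
    return themes
```

A count like this only works if topic labels are already normalized across tickets, which is one reason short, consistent labels matter upstream.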

Quickstart

1. Clone the repo

git clone https://github.com/elmarto87/support-ticket-analyzer.git
cd support-ticket-analyzer

2. Install dependencies

pip install -r requirements.txt

3. Set up your API key

cp .env.example .env
# Edit .env and add your Anthropic API key

4. Run with sample data

jupyter notebook main.ipynb

sample_data/sample_tickets.csv is included so you can run the full pipeline immediately, without supplying your own data.

CSV Format

Column	Required	Description
`subject`	✅	Ticket subject line
`body`	✅	Full ticket description
`id`	optional	Ticket ID for reference
`date`	optional	Date filed — enables volume trend chart
`priority`	optional	Agent-assigned priority (not used for AI severity)
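Before running the pipeline on your own export, it can help to confirm the file matches the format above. The following is a minimal sketch using only the standard library; `read_tickets` is a hypothetical helper, not a function from `analyzer.py`, and the repo may load the CSV differently.

```python
import csv
import io

REQUIRED_COLUMNS = {"subject", "body"}
OPTIONAL_COLUMNS = {"id", "date", "priority"}


def read_tickets(handle):
    """Parse ticket rows from an open CSV file, checking the required columns."""
    reader = csv.DictReader(handle)
    header = set(reader.fieldnames or [])

    missing = REQUIRED_COLUMNS - header
    if missing:
        raise ValueError(f"CSV is missing required columns: {sorted(missing)}")

    tickets = []
    for row in reader:
        # Keep required columns plus whichever optional ones are present.
        ticket = {name: (row.get(name) or "").strip() for name in REQUIRED_COLUMNS}
        for name in OPTIONAL_COLUMNS & header:
            ticket[name] = (row.get(name) or "").strip()
        tickets.append(ticket)
    return tickets


# Usage with an in-memory file; pass open("your_file.csv", newline="") for real data.
sample = io.StringIO("id,subject,body\n1,SSO login fails,Users cannot sign in via SSO\n")
tickets = read_tickets(sample)
```

Columns outside the five listed above are ignored in this sketch.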

Example Output

Run on 50 sample SaaS support tickets

Top themes Claude identified:

Authentication & Account Access Failures — mostly high severity — SSO breakdowns, 2FA SMS failures, and password reset issues are collectively locking users out; treat as systemic, not isolated
Data Integrity & Export Reliability — mostly high severity — CSV import corruption, missing custom fields in exports, and CRM sync delays are blocking reporting workflows
Security & Compliance Gaps — mostly high severity — Non-functional API key revocation and missing audit logs represent active compliance risk

PM backlog (Claude-generated):

🔴 Fix Now

SSO login failing for ~20% of users with no workaround — P0 blocker
API key revoke button non-functional — active security vulnerability
CSV import silently corrupting blank fields — data integrity risk

🟡 Next Sprint

Consolidate password reset + 2FA SMS delivery failures into one infra audit
Profile and fix reports page query performance regression
Investigate post-upgrade entitlement provisioning pipeline

🟢 Backlog

Folder and tag organization for content
Dashboard widget layout persistence
Search result ordering improvements

Tradeoffs and Decisions

1. Claude classification vs. embedding-based clustering

Embedding clustering groups tickets by semantic similarity but produces unlabeled clusters — you still have to read each cluster to understand what it represents. Claude classification returns human-readable labels (category, topic, one-liner) directly, making the output immediately actionable without manual interpretation. The tradeoff: Claude occasionally disagrees with a human classifier on edge cases, while clustering is fully unsupervised. For a PM use case where interpretability matters more than precision, classification wins.

2. AI-assessed severity vs. agent-assigned priority

Agent-assigned priority fields are unreliable — enterprise customers tend to mark everything high, and support agents triage by loudness rather than impact. The analyzer ignores the input priority field and has Claude re-assess severity based on the ticket text (user impact, number of users affected, availability vs. UX issue). This produces a more consistent signal for prioritization.
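One way to make sure the agent-assigned priority cannot influence the result is to leave it out of the prompt entirely. The sketch below shows that idea; `build_severity_prompt`, the prompt wording, and the high/medium/low scale are assumptions for illustration, not the prompt the analyzer actually sends.

```python
def build_severity_prompt(indexed_tickets):
    """Build a severity prompt from (index, ticket) pairs using ticket text only.

    The input "priority" field is deliberately never read, so it cannot
    leak into the model's assessment.
    """
    lines = [
        "Assess severity (high, medium, low) for each ticket from its text only.",
        "Consider user impact, number of users affected, and whether the issue "
        "is an availability problem or a UX problem.",
        "",
    ]
    for index, ticket in indexed_tickets:
        lines.append(f"[{index}] Subject: {ticket['subject']}")
        lines.append(f"Body: {ticket['body']}")
    return "\n".join(lines)
```

The returned string would then be passed as the user message in whatever API call the pipeline makes.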

3. Batch size of 20 tickets per API call

Larger batches (50+) reduce API calls but increase the chance of the model losing track of index alignment across a long context. Smaller batches (5–10) are more accurate but multiply cost. 20 tickets per call balances cost, speed, and classification consistency — the model has enough context to normalize topic labels across similar tickets without losing index accuracy.
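The batching and index-alignment concern can be sketched as follows. This assumes each ticket is tagged with its position in the full list before being sent, and that the model echoes that position back in an `index` field; both helper names and the response shape are hypothetical.

```python
def make_batches(tickets, batch_size=20):
    """Yield lists of (global_index, ticket) pairs, batch_size tickets at a time."""
    for start in range(0, len(tickets), batch_size):
        chunk = tickets[start:start + batch_size]
        yield [(start + offset, ticket) for offset, ticket in enumerate(chunk)]


def is_aligned(batch, results):
    """Check that the model returned exactly one result per ticket, in order.

    `results` is assumed to be a list of dicts that each carry an "index" key.
    """
    expected = [index for index, _ in batch]
    returned = [item["index"] for item in results]
    return expected == returned
```

A batch that fails the alignment check could be retried or split in half, rather than silently attaching classifications to the wrong tickets.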

What I Learned

Agent-assigned priority fields are almost useless for PM prioritization — Claude's re-assessment based on the ticket description is more consistent and impact-aligned than whatever the support team entered
Authentication issues cluster together in the data but feel like separate bugs in the queue — seeing them as a theme (rather than individual tickets) is what surfaces the systemic root cause pattern
Asking Claude to generate a "one-liner" per ticket, rather than summarizing the full body, produces a better input for the theme synthesis step — shorter, more normalized text makes the cross-ticket pattern recognition more accurate

Requirements

Python 3.9+
Anthropic API key (available from the Anthropic Console)
See requirements.txt for package versions
