A Jupyter notebook pipeline that takes a CSV of support tickets, classifies each one by category and severity using Claude, and produces a prioritized PM report — with a Fix Now / Next Sprint / Backlog breakdown and leadership-ready insights.
PMs inheriting a support queue face the same problem every sprint: hundreds of tickets, no consistent categorization, and stakeholders asking "what are the top issues?" The answer requires reading through noise to spot patterns — a process that doesn't scale and introduces bias toward whichever tickets were filed most recently or loudest.
This tool automates the triage: classify every ticket by category and AI-assessed severity, surface recurring themes, and output a structured backlog recommendation in under 5 minutes.
CSV of support tickets → Classify (category + severity + topic) → Theme synthesis → PM report
Input: Any CSV with subject and body columns. Optional: id, date, priority.
Output:
- Per-ticket classification: category, AI-assessed severity, topic label, one-liner summary
- Top 5 recurring themes with severity skew
- Prioritized backlog: Fix Now / Next Sprint / Backlog
- PM insights for leadership
- Charts: category breakdown, severity distribution, ticket volume over time
- Exported CSV with all classifications for further analysis
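The "severity skew" attached to each theme is simple arithmetic once every ticket has a topic and a severity. In the pipeline described here Claude performs the theme synthesis, so the snippet below is only a deterministic stand-in that shows how the skew could be computed. The function name `top_themes` and the `topic` / `severity` keys are assumptions for illustration, not names taken from the notebook.

```python
from collections import Counter, defaultdict


def top_themes(classified, limit=5):
    """Return the most frequent topics with their dominant severity."""
    by_topic = defaultdict(Counter)
    for ticket in classified:
        by_topic[ticket["topic"]][ticket["severity"]] += 1

    # Rank topics by total ticket count, largest first.
    ranked = sorted(by_topic.items(), key=lambda kv: sum(kv[1].values()), reverse=True)

    themes = []
    for topic, severities in ranked[:limit]:
        total = sum(severities.values())
        dominant, count = severities.most_common(1)[0]
        themes.append({
            "topic": topic,
            "tickets": total,
            "skew": f"mostly {dominant}",
            "share": count / total,  # fraction of tickets at the dominant severity
        })
    return themes


# Hypothetical classified tickets, shaped like the per-ticket output above.
classified = [
    {"topic": "auth", "severity": "high"},
    {"topic": "auth", "severity": "high"},
    {"topic": "auth", "severity": "medium"},
    {"topic": "export", "severity": "high"},
    {"topic": "ui", "severity": "low"},
]
themes = top_themes(classified)
```
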
git clone https://github.com/elmarto87/support-ticket-analyzer.git
cd support-ticket-analyzer
pip install -r requirements.txt
cp .env.example .env
# Edit .env and add your Anthropic API key
jupyter notebook main.ipynb

sample_data/sample_tickets.csv is included so you can run the full pipeline immediately, with no data of your own required.
| Column | Required | Description |
|---|---|---|
| `subject` | ✅ | Ticket subject line |
| `body` | ✅ | Full ticket description |
| `id` | optional | Ticket ID for reference |
| `date` | optional | Date filed; enables the volume trend chart |
| `priority` | optional | Agent-assigned priority (not used for AI severity) |
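
The column contract in the table can be checked before any API call is made. Below is a minimal sketch using only the standard library; `load_tickets` and the fallback of using the row position as an id are assumptions for illustration, and the notebook may load its data differently.

```python
import csv
import io

REQUIRED_COLUMNS = {"subject", "body"}
OPTIONAL_COLUMNS = {"id", "date", "priority"}


def load_tickets(csv_file):
    """Read tickets from an open CSV file and check the required columns exist."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - header
    if missing:
        raise ValueError(f"CSV is missing required columns: {sorted(missing)}")

    tickets = []
    for row_number, row in enumerate(reader):
        ticket = {
            "subject": (row["subject"] or "").strip(),
            "body": (row["body"] or "").strip(),
        }
        # Optional columns are copied only when the CSV provides them.
        for column in OPTIONAL_COLUMNS & header:
            ticket[column] = row[column]
        # Assumption: fall back to the row position when there is no id column.
        ticket.setdefault("id", str(row_number))
        tickets.append(ticket)
    return tickets


# A tiny in-memory CSV stands in for a real export.
sample = io.StringIO("subject,body,priority\nSSO down,Cannot log in via SSO,high\n")
tickets = load_tickets(sample)
```

Failing early on a missing `subject` or `body` column avoids spending API calls on a file the classifier cannot use.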
Run on 50 sample SaaS support tickets
Top themes Claude identified:
- Authentication & Account Access Failures (mostly high severity): SSO breakdowns, 2FA SMS failures, and password reset issues are collectively locking users out; treat as systemic, not isolated
- Data Integrity & Export Reliability (mostly high severity): CSV import corruption, missing custom fields in exports, and CRM sync delays are blocking reporting workflows
- Security & Compliance Gaps (mostly high severity): non-functional API key revocation and missing audit logs represent active compliance risk
PM backlog (Claude-generated):
🔴 Fix Now
- SSO login failing for ~20% of users with no workaround — P0 blocker
- API key revoke button non-functional — active security vulnerability
- CSV import silently corrupting blank fields — data integrity risk
🟡 Next Sprint
- Consolidate password reset + 2FA SMS delivery failures into one infra audit
- Profile and fix reports page query performance regression
- Investigate post-upgrade entitlement provisioning pipeline
🟢 Backlog
- Folder and tag organization for content
- Dashboard widget layout persistence
- Search result ordering improvements
1. Claude classification vs. embedding-based clustering
Embedding clustering groups tickets by semantic similarity but produces unlabeled clusters — you still have to read each cluster to understand what it represents. Claude classification returns human-readable labels (category, topic, one-liner) directly, making the output immediately actionable without manual interpretation. The tradeoff: Claude occasionally disagrees with a human classifier on edge cases, while clustering is fully unsupervised. For a PM use case where interpretability matters more than precision, classification wins.
2. AI-assessed severity vs. agent-assigned priority
Agent-assigned priority fields are unreliable — enterprise customers tend to mark everything high, and support agents triage by loudness rather than impact. The analyzer ignores the input priority field and has Claude re-assess severity based on the ticket text (user impact, number of users affected, availability vs. UX issue). This produces a more consistent signal for prioritization.
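
One way to keep the agent-assigned priority from leaking into the assessment is to build the prompt from the ticket text alone. The sketch below is an assumption about how such a prompt could look, not the notebook's actual prompt; `build_severity_prompt` and the high / medium / low scale are hypothetical.

```python
# Criteria taken from the description above.
SEVERITY_CRITERIA = (
    "user impact",
    "number of users affected",
    "availability issue vs. UX issue",
)


def build_severity_prompt(ticket):
    """Build a severity prompt from ticket text only, ignoring agent priority."""
    criteria = "\n".join(f"- {c}" for c in SEVERITY_CRITERIA)
    # Only subject and body are read; ticket["priority"] is never referenced.
    return (
        "Assess the severity of this support ticket as high, medium, or low.\n"
        f"Base the assessment on:\n{criteria}\n\n"
        f"Subject: {ticket['subject']}\n"
        f"Body: {ticket['body']}\n"
    )


# Hypothetical ticket whose agent priority should not reach the model.
ticket = {"subject": "SSO down", "body": "Nobody can log in", "priority": "URGENT-P1"}
prompt = build_severity_prompt(ticket)
```
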
3. Batch size of 20 tickets per API call
Larger batches (50+) reduce API calls but increase the chance of the model losing track of index alignment across a long context. Smaller batches (5–10) are more accurate but multiply cost. 20 tickets per call balances cost, speed, and classification consistency — the model has enough context to normalize topic labels across similar tickets without losing index accuracy.
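
The batching and the index-alignment concern can be sketched as follows. This assumes, hypothetically, that the model is asked to return a JSON list in which each object carries the `index` of the ticket it classifies; `make_batches` and `check_alignment` are illustrative names, not functions from the notebook.

```python
import json

BATCH_SIZE = 20  # the batch size the text settles on


def make_batches(tickets, batch_size=BATCH_SIZE):
    """Split tickets into consecutive batches, keeping each ticket's global index."""
    indexed = list(enumerate(tickets))
    return [indexed[i:i + batch_size] for i in range(0, len(indexed), batch_size)]


def check_alignment(batch, response_text):
    """Parse a model response and confirm it covers exactly the batch's indices."""
    results = json.loads(response_text)
    expected = [index for index, _ in batch]
    returned = sorted(item["index"] for item in results)
    if returned != expected:
        raise ValueError(f"index mismatch: expected {expected}, got {returned}")
    return {item["index"]: item for item in results}


# 50 placeholder tickets split into batches of 20, 20, and 10.
batches = make_batches([f"ticket {n}" for n in range(50)])
batch_sizes = [len(batch) for batch in batches]

# A hand-written response stands in for a real API reply.
small_batch = make_batches(["a", "b", "c"], batch_size=2)[1]
aligned = check_alignment(small_batch, '[{"index": 2, "category": "bug"}]')
```

A batch that fails the check can be retried or split in half, so that a dropped or duplicated index does not silently attach a classification to the wrong ticket.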
- Agent-assigned priority fields are almost useless for PM prioritization — Claude's re-assessment based on the ticket description is more consistent and impact-aligned than whatever the support team entered
- Authentication issues cluster together in the data but feel like separate bugs in the queue — seeing them as a theme (rather than individual tickets) is what surfaces the systemic root cause pattern
- Asking Claude to generate a "one-liner" per ticket, rather than summarizing the full body, produces a better input for the theme synthesis step — shorter, more normalized text makes the cross-ticket pattern recognition more accurate
- Python 3.9+
- Anthropic API key — get one here
- See `requirements.txt` for package versions


