Can you teach an AI to read a credit memo?
Two phases. One loan. From a single PDF to a full loan file. Your AI teammate. Whatever tools you want. Go.
We give you one real credit memo — a $514,500 multifamily loan in Philadelphia. It has form fields, financial tables, analyst commentary, infographics, and property photos. Every content type that makes PDF extraction hard.
Your job: extract structured data from it, verify the math, and show where every number came from. Use any language, any framework, any AI. The only thing that matters is whether your output passes the scorecard.
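Concretely, "show where every number came from" means each extracted value carries its provenance, and "verify the math" means recomputing stated totals rather than trusting them. A minimal sketch of both ideas — the field names here are illustrative, not the required schema (see `starter-kit/sample-output.json` for the real shape):

```python
# Illustrative provenance record -- the authoritative field names live in
# starter-kit/sample-output.json and demo-requirements.md.
loan_amount = {
    "value": 514_500,
    "source_page": 1,                        # where in the PDF it was found
    "source_text": "Loan Amount: $514,500",  # snippet supporting the value
}

def verify_sum(parts: list[float], stated_total: float, tol: float = 0.01) -> bool:
    """'Verify the math': recompute a stated total instead of trusting the PDF."""
    return abs(sum(parts) - stated_total) <= tol
```

For example, `verify_sum([1_200.0, 1_350.0], 2_550.0)` passes, while a total that doesn't add up fails the check.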
| Tier | What it means |
|---|---|
| Bronze (3 of 9 checks) | "I can extract numbers and the math adds up" |
| Silver (6 of 9 checks) | "I can extract tables and cross-reference across pages" |
| Gold (all 9 checks) | "Every check passes. Full scorecard satisfied." |
Expand from one PDF to a full loan file: Personal Financial Statement, Rent Roll, Appraisal Summary, and scanned documents like tax returns. The challenge shifts from single-document extraction to cross-document verification and auto-classification.
New dimensions: net worth reconciliation, LTV validation, income consistency, guarantor consistency, document auto-classification, and an origination vs. servicing timeline.
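Most of these checks are pure arithmetic once the numbers are extracted from the right documents. A sketch of LTV validation — the policy threshold, tolerance, and appraised value below are assumptions for illustration, not part of the official spec:

```python
def check_ltv(loan_amount: float, appraised_value: float,
              max_ltv: float = 0.80, tolerance: float = 0.005) -> dict:
    """Cross-document check: loan amount (from the credit memo) against
    appraised value (from the appraisal summary)."""
    ltv = loan_amount / appraised_value
    return {
        "computed_ltv": round(ltv, 4),
        "within_policy": ltv <= max_ltv + tolerance,
    }

# e.g. the Phase 1 memo's $514,500 loan against a hypothetical appraisal:
check_ltv(514_500, 735_000)  # computed_ltv 0.7, within policy
```

Net worth reconciliation and income consistency follow the same pattern: extract from two documents, compute, compare within a tolerance.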
| Tier | What it means |
|---|---|
| Pipeline (auto-classify 3+ doc types) | "My system identifies document types without being told" |
| Crosscheck (cross-document verification passes) | "Numbers reconcile across documents" |
| Platinum (handle scans, timeline, full loan file) | "Full loan file processed — scans, timeline, everything." |
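Auto-classification for the Pipeline tier can start as simple keyword heuristics over extracted text. A naive sketch (the labels and keywords are assumptions; a real system would use layout features or an LLM):

```python
def classify(text: str) -> str:
    """Naive keyword-based document classifier -- enough to show the
    Pipeline idea, not production quality."""
    rules = [
        ("rent roll", "Rent Roll"),
        ("personal financial statement", "Personal Financial Statement"),
        ("appraisal", "Appraisal Summary"),
        ("form 1040", "Tax Return"),
    ]
    lowered = text.lower()
    for keyword, label in rules:
        if keyword in lowered:
            return label
    return "Unknown"
```

For example, `classify("Monthly Rent Roll as of 1/1/2025")` returns `"Rent Roll"`; anything unmatched falls through to `"Unknown"`.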
Join the Discord to find a team, get help, and share your best extraction failures.
```bash
git clone https://github.com/jymiller/creditmemo.git
cd creditmemo/starter-kit
bash setup-check.sh                     # verify your environment
python3 validate.py sample-output.json  # see what Gold looks like
python3 quickstart.py                   # extract your first field
```

Then build your own solution and run `python3 validate.py your-output.json` to check your score anytime.
| File | What it is |
|---|---|
| `HACKATHON.md` | Event guide — spirit, teams, schedule, tiers, awards |
| `demo-requirements.md` | Phase 1 scorecard, output schema, and full spec |
| `Sample-Enhanced-Memo.pdf` | Phase 1 input — the 22-page credit memo |
| File | What it does |
|---|---|
| `starter-kit/sample-output.json` | Reference output showing the target shape — look at this first |
| `starter-kit/validate.py` | Runs the 9 checks against your output, prints your tier |
| `starter-kit/quickstart.py` | Extracts your first field from the PDF in 5 minutes |
| `starter-kit/schemas.py` | Importable Pydantic schema (Python teams) |
| `starter-kit/setup-check.sh` | Verifies your environment is ready |
| `starter-kit/.env.example` | Environment variable template |
If you want to understand why a bank would care about this problem:
| Document | Read time |
|---|---|
| One-Page Brief | 3 min |
| Strategic Brief | 12 min |
| Full Spec | 15 min |
Prefer video? Walkthrough on Loom
- Read `HACKATHON.md` for the spirit, schedule, and team formation
- Clone the repo and run `setup-check.sh`
- Look at `sample-output.json` to see the shape of what you're building
- Build your solution — any language, any framework, any approach
- Run `validate.py` early and often — it's a tool for building, not a final exam
- Demo at 5pm Sunday PT — 5 minutes, show what you built and what you learned
Bronze is a win. Gold is a flex. Phase 2 is the deep end. All tiers get celebrated. The best hackathon outcome is not the code — it's the team that built it and the things they learned.
```
.
├── index.html                             Landing page (GitHub Pages)
├── HACKATHON.md                           Event guide
├── demo-requirements.md / .html           Scorecard and full spec
├── one-page-brief.md / .html              Executive brief
├── credit-memo-extraction-explainer.html  Strategic brief
├── Sample-Enhanced-Memo.pdf               The input document
├── hackathon-design-notes.md              Meta-analysis of hackathon design
├── starter-kit/                           Validator, quickstart, schemas, sample output
├── feedback/                              Submission reviews
└── .github/workflows/pages.yml            GitHub Pages deployment
```