Local Gmail triage daemon that automatically files, labels, and archives email using a three-layer decision pipeline. Inspired by SaneBox but runs entirely under your control.
Gmail API (OAuth2)
           |
           v
Fetch unread emails
           |
           v
+---------------------+
| Layer 1: Cache      |  Sender/domain DB lookup
| (SQLite)            |  -> hit + high confidence -> apply immediately
+----------+----------+
           | miss
           v
+---------------------+
| Layer 2: Heuristics |  Header analysis + scoring
|                     |  -> high confidence -> apply + cache
+----------+----------+
           | ambiguous
           v
+---------------------+
| Layer 3: LLM        |  Headers + body snippet -> structured JSON
| Classifier          |  -> apply + cache for next time
+---------------------+
The cache learns from every decision. Repeat senders skip heuristics and LLM entirely after the first classification.
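The cascade in the diagram can be sketched roughly as follows. This is a minimal illustration, not mailfiler's actual code: the `Decision` type, the `classify` and `score_headers` functions, the in-memory dict standing in for the SQLite cache, and the `HIGH_CONFIDENCE` value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical threshold; the real value would come from config.toml.
HIGH_CONFIDENCE = 0.85


@dataclass
class Decision:
    label: str         # e.g. "newsletter" or "inbox"
    confidence: float  # 0.0 to 1.0
    source: str        # which layer produced the decision


def score_headers(headers: dict[str, str]) -> Optional[Decision]:
    """Toy heuristic layer: a List-Unsubscribe header suggests a newsletter."""
    if "List-Unsubscribe" in headers:
        return Decision("newsletter", 0.9, "heuristics")
    return None  # ambiguous, defer to the LLM layer


def classify(
    sender: str,
    headers: dict[str, str],
    cache: dict[str, Decision],
    llm: Callable[[dict[str, str]], Decision],
) -> Decision:
    # Layer 1: a confident cache hit is applied immediately.
    cached = cache.get(sender)
    if cached is not None and cached.confidence >= HIGH_CONFIDENCE:
        return Decision(cached.label, cached.confidence, "cache")

    # Layer 2: heuristics, accepted only when confident.
    decision = score_headers(headers)
    if decision is None or decision.confidence < HIGH_CONFIDENCE:
        # Layer 3: fall back to the LLM classifier.
        decision = llm(headers)

    # Layers 2 and 3 both write back, so the next email from this sender
    # can be answered by the cache alone.
    cache[sender] = decision
    return decision
```

Passing the LLM in as a callable keeps the sketch testable without network access; any function that takes headers and returns a `Decision` will do.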
- Three-layer pipeline -- cache, heuristics, LLM -- minimizes API calls
- LLM flexibility -- Anthropic Claude API or local models via LM Studio
- Observe mode -- dry-run to see what mailfiler would do before enabling full auto
- Sender management -- pin, trust, block, or reset individual senders
- Audit log -- every decision stored in SQLite with full provenance
- Feedback loop -- user corrections update the cache for future accuracy
- Configurable labels -- define your own categories with descriptions, or use the 10 built-in defaults
- Implicit learning -- move an email in Gmail and mailfiler learns your preference for that sender
- Docker support -- multi-stage Dockerfile included
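To make the audit-log feature concrete, here is a small sketch of how decisions with provenance might be stored in SQLite. The table name, column names, and helper functions are hypothetical and chosen for illustration; mailfiler's real schema may differ.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema for illustration; mailfiler's real tables may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_log (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    message_id  TEXT NOT NULL,
    sender      TEXT NOT NULL,
    label       TEXT NOT NULL,
    confidence  REAL NOT NULL,
    layer       TEXT NOT NULL,   -- 'cache', 'heuristics', or 'llm'
    decided_at  TEXT NOT NULL    -- ISO 8601 timestamp in UTC
)
"""


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_decision(
    conn: sqlite3.Connection,
    message_id: str,
    sender: str,
    label: str,
    confidence: float,
    layer: str,
) -> None:
    """Append one decision together with the layer that produced it."""
    conn.execute(
        "INSERT INTO audit_log "
        "(message_id, sender, label, confidence, layer, decided_at) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (message_id, sender, label, confidence, layer,
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def recent_decisions(conn: sqlite3.Connection, limit: int = 10) -> list[tuple]:
    """Return the newest decisions first."""
    return conn.execute(
        "SELECT message_id, sender, label, layer FROM audit_log "
        "ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Recording the layer alongside each decision is what lets a later report separate cache hits from LLM calls.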
- Python 3.11+
- uv (recommended) or pip
- A Google Cloud project with the Gmail API enabled
- OAuth2 credentials (credentials.json) for your Gmail account
git clone https://github.com/JoeCotellese/inboxzero.git
cd inboxzero
uv sync
cp config.toml.example config.toml

Edit config.toml with your settings:
- Set user_email to your Gmail address
- Choose an LLM provider (anthropic, lmstudio, or leave blank for a stub that keeps everything in inbox)
- Adjust confidence thresholds if desired
Place your Google OAuth2 credentials at ~/.mailfiler/credentials.json. On first run, mailfiler will open a browser for OAuth consent and save the token.
# Observe mode (dry run, no changes to Gmail)
mailfiler run
# Check what happened
mailfiler audit
# View pipeline stats
mailfiler stats

Once you're comfortable with the decisions, set run_mode = "full_auto" in config.toml.
Labels are defined in config.toml under [[labels.categories]]. Each category becomes a Gmail label like mailfiler/newsletter. The inbox and archived categories are required.
[labels]
prefix = "mailfiler"
[[labels.categories]]
name = "inbox"
description = "Important emails that need attention"
[[labels.categories]]
name = "newsletter"
description = "Subscription content, digests, editorial emails"
[[labels.categories]]
name = "finance"
description = "Bank statements, invoices, tax documents"
[[labels.categories]]
name = "archived"
description = "General archive for low-priority items"

Descriptions are passed to the LLM to guide classification. If you omit [[labels.categories]] entirely, mailfiler uses 10 built-in defaults (inbox, newsletter, marketing, github, jira, automated, receipts, calendar, security, archived).
mailfiler watches for corrections you make in Gmail. On each run, it compares the current Gmail state of previously processed emails to its recorded decisions:
| You do this in Gmail | mailfiler learns |
|---|---|
| Move an archived email back to inbox | Sender → keep_inbox |
| Archive an email that was kept in inbox | Sender → archive |
| Change the mailfiler/* label | Sender → the new label |
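The comparison in the table can be sketched as below. This is a simplified model, not the real implementation: it assumes each email's state can be reduced to a single category name, and the names `detect_correction`, `SenderOverrides`, and `PIN_THRESHOLD` are hypothetical. The threshold of 3 mirrors the "3+" overrides mentioned in this section.

```python
from collections import Counter
from typing import Optional

PIN_THRESHOLD = 3  # assumption: pin once a sender has 3 or more overrides


def detect_correction(recorded: str, current: str) -> Optional[str]:
    """Compare the recorded decision with the email's current Gmail state.

    Both arguments are category names such as "inbox", "archived", or
    "finance". Returns the learned preference, or None if nothing changed.
    """
    if recorded == current:
        return None
    if current == "inbox":
        return "keep_inbox"   # the user moved the email back to the inbox
    if current == "archived":
        return "archive"      # the user archived an email that was kept
    return current            # the user changed the mailfiler/* label


class SenderOverrides:
    """Tracks per-sender corrections and pins after enough of them."""

    def __init__(self) -> None:
        self.counts: Counter[str] = Counter()
        self.preference: dict[str, str] = {}
        self.pinned: set[str] = set()

    def observe(self, sender: str, recorded: str, current: str) -> None:
        learned = detect_correction(recorded, current)
        if learned is None:
            return
        self.preference[sender] = learned
        self.counts[sender] += 1
        if self.counts[sender] >= PIN_THRESHOLD:
            self.pinned.add(sender)
```

Each processed email that still matches its recorded decision is ignored, so only genuine corrections count toward the pin.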
After enough overrides (3+), mailfiler pins the sender so it stops second-guessing you. Use --no-learn to skip the learning phase:
mailfiler run --no-learn

View learned corrections:

mailfiler audit --learned

| Command | Description |
|---|---|
| mailfiler run | Run one processing pass in the foreground |
| mailfiler run --no-learn | Run without the implicit learning phase |
| mailfiler status | Show daemon status |
| mailfiler audit | Show recent processed emails with decisions |
| mailfiler audit --learned | Show only learned corrections |
| mailfiler stats | Show cache hit rate, LLM usage, override stats |
| mailfiler pin <email> | Always keep sender in inbox |
| mailfiler unpin <email> | Remove inbox pin |
| mailfiler trust <email> | Keep in inbox with max confidence |
| mailfiler block <email> | Always archive sender |
| mailfiler reset-sender <email> | Delete sender profile, re-evaluate from scratch |
Anthropic Claude (cloud):
[llm]
provider = "anthropic"
model = "claude-haiku-4-5"

Requires ANTHROPIC_API_KEY environment variable.
LM Studio (local):
[llm]
provider = "lmstudio"
model = "qwen3-30b-a3b-2507"
base_url = "http://localhost:1234/v1"

No API key needed. Good for privacy-first setups or 32GB+ Macs.
uv sync --group dev
uv run pytest
uv run ruff check .
uv run pyright