From 93396e6eaf3b065d89a77e1c9c367264b06d8626 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 01:57:04 +0000 Subject: [PATCH 1/5] docs: Add ActivityWatch integration PRD MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add comprehensive PRD for semantic attention guardrail system: - Ambient compass indicator for real-time alignment feedback - End-of-phase reflection summaries - Local-first AI classification using Ollama - Zero-config ActivityWatch bundling - Privacy-first design with local LLM processing šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_PROJECT.md | 699 ++++++++++++++++++++++++++++ 1 file changed, 699 insertions(+) create mode 100644 docs/prds/ACTIVITY_WATCH_PROJECT.md diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md new file mode 100644 index 0000000..80e3806 --- /dev/null +++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md @@ -0,0 +1,699 @@ +# ActivityWatch Integration - Semantic Attention Guardrails + +**Status**: Draft +**Target**: MVP Extension (Phase 1) +**Philosophy**: Reduce the distance from intent to action through ambient awareness + +--- + +## Vision + +**The Problem**: We allocate consciousness to "Product Spec" but spend 90 minutes on Twitter. The gap between intention and action is invisible until reflection time - by then, the day is spent. + +**The Solution**: Integrate ActivityWatch as a bundled extension to provide **semantic attention guardrails** - AI-powered ambient feedback that gently closes the gap between stated intention and observed activity. + +**Core Principle**: +> "Technology as a mirror for consciousness, not a taskmaster." + +This is not time tracking. This is **attention alignment detection** using AI to understand the semantic relationship between what you committed to doing and what you're actually doing. 
+ +--- + +## What This Is + +A **passive ambient awareness system** that: +1. Observes computer activity via ActivityWatch +2. Classifies alignment with current moment using local LLM +3. Provides **peripheral feedback** (ambient compass indicator) +4. Offers **reflective summaries** (end-of-phase ritual) + +**Not**: Performance tracking, productivity metrics, nagging notifications, or guilt-inducing dashboards. + +**Is**: A gentle, intelligent mirror that helps you notice drift before hours pass. + +--- + +## User Experience + +### The Ambient Indicator (Passive, Real-time) + +A small compass indicator in the corner of Zenborg: + +``` +Current: "Product Spec" ā˜• Morning + +[Compass widget - collapsed state] +🧭 ↑ (aligned) + +[Compass widget - drift detected] +🧭 ↙ (drifting) +``` + +**Behavior**: +- Updates every 5-10 minutes +- Lives in peripheral vision (top-right corner, can collapse/hide) +- No modal takeovers, no sounds, no badges +- Clicking shows brief summary: "Currently aligned with product work theme" +- Can be dismissed entirely (respects user agency) + +**States**: +- **Aligned** (↑): Activity matches moment's semantic theme +- **Neutral** (↔): Ambiguous (email, Slack, quick searches) +- **Drifting** (↙): Clear misalignment detected +- **Untracked** (ā—‹): No digital activity (reading, meetings, thinking) + +### End-of-Phase Reflection (Passive, Retrospective) + +When a phase completes (e.g., Morning → Afternoon transition): + +``` +ā˜• Morning Complete + +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Product Spec [Craft] │ +│ āœ“ Aligned (2h 15m observed) │ +│ → Linear, Notion, Figma mockups │ +ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ +│ Email Triage [Admin] │ +│ āœ— Drift detected (45m allocated, 12m observed) │ +│ → Spent 1h 20m 
on Twitter/HN instead │ +ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ +│ Deep Reading [Strategy] │ +│ ? Untracked (no digital footprint) │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + +Press any key to continue to Afternoon... +``` + +**Design**: +- Non-blocking (can dismiss immediately) +- Shows up once per phase transition +- No judgement language ("drift detected" not "you failed") +- Acknowledges untracked time as valid (reading, thinking, meetings) + +--- + +## Technical Architecture + +### High-Level Flow + +``` +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Zenborg Core (Phase 1) │ +│ - Moments with Area associations │ +│ - Areas define semantic themes │ +│ - Current moment awareness │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ ActivityWatch Extension Bundle │ +│ - aw-watcher-window (desktop apps) │ +│ - aw-watcher-web (browser tabs/URLs) │ +│ - aw-watcher-afk (idle detection) │ +│ - Local SQLite database │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Activity Collector Service │ +│ - Polls AW database every 5-10 min │ +│ - Aggregates recent events (last 15 min) │ +│ - 
Filters: apps, window titles, URLs, duration │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Semantic Classifier (Local LLM) │ +│ - Ollama/llama.cpp (3B-7B param model) │ +│ - Input: current moment + observed activity │ +│ - Output: alignment classification + confidence │ +│ - Understands work themes semantically │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Ambient Feedback Layer │ +│ - Compass indicator (real-time UI) │ +│ - Phase reflection summary (transition screen) │ +│ - Alignment history (stored in IndexedDB) │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ +``` + +### Data Model Extensions + +**Area** (extended from Zenborg core): +```typescript +interface Area { + // ... existing fields ... + themeKeywords?: string[] // ["linear", "notion", "spec", "roadmap"] + themeDescription?: string // "Product work: writing specs, prioritizing..." 
+} +``` + +**Default Area Themes** (for user "Thopiax"): +```typescript +const DEFAULT_THEMES = { + "Product": { + keywords: ["linear", "notion", "spec", "roadmap", "jira", "prd"], + description: "Writing specs, scopes, prioritizing features" + }, + "Data": { + keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas"], + description: "Exploring data, writing models, running batches, experiments" + }, + "UX": { + keywords: ["figma", "framer", "prototype", "design", "css", "component"], + description: "Prototyping, fine-tuning interfaces" + }, + "Strategy": { + keywords: ["docs", "notes", "research", "reading", "writing"], + description: "Slow, deliberate thinking and planning" + } +} +``` + +**AlignmentEvent** (new entity): +```typescript +interface AlignmentEvent { + id: string // UUID + momentId: string // FK to Moment + timestamp: string // ISO timestamp + classification: AlignmentType // "aligned" | "neutral" | "drifting" + confidence: number // 0.0-1.0 + observedActivities: ActivitySummary[] + themeDetected: string | null // "product", "data", etc. + createdAt: string +} + +interface ActivitySummary { + app: string + windowTitle: string + url?: string + duration: number // seconds +} + +type AlignmentType = "aligned" | "neutral" | "drifting" | "untracked" +``` + +### LLM Classification Service + +**Local Model Options** (ranked by preference): +1. **Ollama** with Llama 3.2 3B (fastest, good balance) +2. **llama.cpp** with Phi-3 Mini (smallest, edge devices) +3. **Fallback**: Claude API (privacy implications, requires API key) + +**Classification Prompt Template**: +```typescript +const CLASSIFICATION_PROMPT = `You are an attention alignment classifier for a mindful productivity system. 
+ +CURRENT INTENTION: +- Moment: "${moment.name}" +- Area: ${moment.area.name} +- Theme: ${moment.area.themeDescription} +- Phase: ${phase} (${phaseEmoji}) + +OBSERVED ACTIVITY (last 15 min): +${activitySummary} + +TASK: Classify alignment as: +- ALIGNED: Activity clearly matches the stated intention and theme +- NEUTRAL: Ambiguous or transitional (email, Slack, quick searches, switching contexts) +- DRIFTING: Clear misalignment with stated intention +- UNTRACKED: No significant digital activity detected + +GUIDELINES: +- Consider semantic meaning, not just keywords + (e.g., "Slack #product-team" is aligned with product work) +- Short diversions (<2 min) are NEUTRAL, not drifting +- Respect nuance: research on Twitter for a product spec is aligned +- If no clear activity, classify as UNTRACKED (not a failure) + +OUTPUT (JSON only, no explanation): +{ + "classification": "aligned" | "neutral" | "drifting" | "untracked", + "confidence": 0.0-1.0, + "themeDetected": "product" | "data" | "ux" | "strategy" | null, + "briefReason": "Short explanation (max 10 words)" +}`; +``` + +**Response Parsing**: +```typescript +interface ClassificationResult { + classification: AlignmentType + confidence: number + themeDetected: string | null + briefReason: string +} + +// Store in IndexedDB as AlignmentEvent +``` + +--- + +## Implementation Phases + +### Phase 1a: ActivityWatch Bundling (Week 1) +**Goal**: Ship Zenborg with AW pre-configured, zero user setup + +**Tasks**: +1. Bundle AW binaries for macOS/Linux/Windows +2. Auto-start AW server on Zenborg launch (background process) +3. Install default watchers (window, web, afk) +4. Health check: verify AW is running, show status in settings +5. 
Graceful fallback: if AW fails, hide extension UI (no crash) + +**Acceptance**: +- User installs Zenborg → AW runs automatically +- No manual AW installation required +- Settings page shows "ActivityWatch: Running āœ“" + +--- + +### Phase 1b: Activity Collection (Week 1) +**Goal**: Poll AW database and aggregate recent activity + +**Tasks**: +1. AW SQLite database reader (or REST API client) +2. Service: poll every 5-10 min for last 15 min of events +3. Aggregate by app/window/URL with durations +4. Filter noise (< 10 sec interactions, system processes) +5. Store raw events temporarily (in-memory, not persisted) + +**Acceptance**: +- Console logs show aggregated activity every 5 min +- Events correctly grouped by app/window +- Idle time excluded from aggregation + +--- + +### Phase 1c: Local LLM Integration (Week 2) +**Goal**: Classify alignment using Ollama locally + +**Tasks**: +1. Detect Ollama installation (or prompt user to install) +2. Auto-pull lightweight model (Llama 3.2 3B) +3. Build classification prompt from current moment + activity +4. Call Ollama API (http://localhost:11434) +5. Parse JSON response → AlignmentEvent +6. 
Store classifications in IndexedDB (not raw activity) + +**Acceptance**: +- Classification runs locally, no external API calls +- Response time < 2 seconds +- Confidence scores calibrated (>0.7 for aligned/drifting) +- Errors gracefully handled (show "untracked" if LLM fails) + +--- + +### Phase 1d: Ambient Compass Indicator (Week 2) +**Goal**: Show real-time alignment in peripheral vision + +**UI Component**: +```tsx + +``` + +**States**: +- **Aligned**: 🧭 ↑ (green tint) +- **Neutral**: 🧭 ↔ (gray) +- **Drifting**: 🧭 ↙ (amber, not red - no guilt) +- **Untracked**: 🧭 ā—‹ (faded) + +**Interactions**: +- Click → expand brief reason ("Aligned with product work theme") +- Double-click → hide for 1 hour (respects user agency) +- Settings toggle: disable entirely + +**Design**: +- Monochrome base (stone-200 border) +- Subtle color accent (area color, low opacity) +- Small: 48px Ɨ 48px collapsed, 200px Ɨ 80px expanded +- No animations (calm tech) + +**Acceptance**: +- Updates within 10 seconds of classification +- No performance impact (< 1% CPU) +- Can be dismissed/hidden +- Accessible (ARIA labels, keyboard nav) + +--- + +### Phase 1e: End-of-Phase Reflection (Week 3) +**Goal**: Show alignment summary at phase transitions + +**Trigger**: When current phase ends (based on PhaseConfig.endHour) + +**UI**: +- Overlay (not modal - can click through) +- Shows all moments from completed phase +- For each moment: + - Alignment status (āœ“ aligned, āœ— drifting, ? 
untracked) + - Observed duration (aggregated from AW events) + - Top 3 apps/activities +- Press any key or click to dismiss + +**Data**: +- Query AlignmentEvents for completed phase +- Aggregate classifications by moment +- Calculate time spent per classification type +- Do NOT show percentages or scores (no gamification) + +**Acceptance**: +- Appears automatically at phase transition +- Non-blocking (can dismiss immediately) +- Shows accurate time aggregations +- Works offline (uses cached data) + +--- + +### Phase 1f: Settings & Privacy (Week 3) +**Goal**: User control over data collection and feedback + +**Settings Panel** (`:settings` command): +``` +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ ActivityWatch Integration │ +ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ +│ ā˜‘ Enable attention guardrails │ +│ ā˜‘ Show ambient compass indicator │ +│ ā˜‘ Show end-of-phase reflections │ +│ │ +│ Classification interval: [5 min] [10 min] [15] │ +│ LLM Backend: [Ollama (local)] [Claude API] │ +│ │ +│ Privacy: │ +│ ā˜‘ Process data locally only │ +│ ☐ Allow cloud LLM fallback (requires API key) │ +│ │ +│ Data Retention: │ +│ Keep alignment history: [7 days] [30] [Forever] │ +│ [Clear all ActivityWatch data] │ +│ │ +│ Status: │ +│ ActivityWatch: Running āœ“ │ +│ Ollama: Connected āœ“ (Llama 3.2 3B) │ +│ Last classification: 2 minutes ago │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ +``` + +**Privacy Guarantees**: +- Raw AW events never leave the machine (unless user opts into cloud LLM) +- Only classification results stored (not window titles/URLs) +- User can clear all data anytime +- AW can be disabled entirely (extension becomes dormant) + 
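The data-retention setting above reduces to a single pruning pass over stored `AlignmentEvent`s, since classification results are the only thing persisted. A minimal sketch, assuming the helper name and an in-memory filter for clarity (the real version would delete via an IndexedDB cursor):

```typescript
// Sketch: drop alignment events older than the configured retention window.
// Only classification results are stored, so this is the entire deletion surface.
interface AlignmentEventLike {
  id: string
  timestamp: string // ISO timestamp
}

function pruneAlignmentEvents<T extends AlignmentEventLike>(
  events: T[],
  retentionDays: number,
  now: Date = new Date()
): T[] {
  // Infinity models the "Forever" option from the settings panel.
  if (!Number.isFinite(retentionDays)) return events
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000
  return events.filter(e => Date.parse(e.timestamp) >= cutoff)
}
```

The same predicate can drive the "Clear all ActivityWatch data" button with `retentionDays = 0`; filtering in memory here just keeps the sketch self-contained.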
+**Acceptance**: +- All toggles functional +- Data deletion works (verified in IndexedDB) +- Ollama connection status accurate +- Works without internet (local-only mode) + +--- + +## User Flows + +### Flow 1: First-Time Setup (Zero Config) +``` +1. User installs Zenborg +2. ActivityWatch auto-starts in background +3. Ollama detected (or prompt: "Install Ollama for local AI? [Yes] [Skip]") +4. If Ollama installed → auto-pull Llama 3.2 3B (progress indicator) +5. Settings show: "ActivityWatch: Running āœ“, Ollama: Ready āœ“" +6. Compass indicator appears (faded, no moment allocated yet) +``` + +**Fallback**: If Ollama not installed, extension stays dormant (no crash, no nag). + +--- + +### Flow 2: Morning Routine with Ambient Feedback +``` +1. User allocates "Product Spec" to Today Morning (:t1) +2. Morning starts (6am), phase active +3. User opens Linear, starts writing spec +4. After 5 min → AW collects events, LLM classifies +5. Compass shows: 🧭 ↑ (aligned) +6. User switches to Twitter for 20 min +7. After 10 min → LLM reclassifies +8. Compass shifts: 🧭 ↙ (drifting) +9. User notices (peripheral vision), self-corrects +10. Back to Linear → compass returns to 🧭 ↑ +``` + +**Key**: No interruption, no modal. Just ambient awareness. + +--- + +### Flow 3: End-of-Phase Reflection +``` +1. Morning phase ends (12pm → Afternoon) +2. Zenborg shows reflection overlay: + + ā˜• Morning Complete + + āœ“ Product Spec (2h 15m aligned) + āœ— Email Triage (1h 20m drifting - Twitter/HN) + ? Deep Reading (untracked) + +3. User reads, presses Esc +4. Continues to Afternoon +``` + +**Non-Goals**: No lecture, no metrics, no "productivity score". Just a mirror. + +--- + +### Flow 4: Disable Extension (User Agency) +``` +1. User types :settings +2. Unchecks "Enable attention guardrails" +3. ActivityWatch stops collecting data +4. Compass indicator disappears +5. Zenborg continues working normally (core features unaffected) +``` + +**Key**: Extension is opt-out, not forced. 
+ +--- + +## Technical Constraints + +### Performance +- **Classification latency**: < 2 seconds (local LLM) +- **UI update latency**: < 500ms (compass indicator) +- **CPU overhead**: < 5% average (AW watchers + LLM) +- **Memory**: < 200MB (AW + Ollama model loaded) +- **Battery impact**: Negligible (10-min polling, not continuous) + +### Privacy +- **Default**: All data processed locally (AW SQLite + Ollama) +- **No telemetry**: Classification results stay on device +- **Optional cloud**: User must explicitly enable + provide API key +- **Data retention**: Default 7 days, user-configurable +- **GDPR compliance**: Full data export/deletion support + +### Compatibility +- **Platforms**: macOS, Linux, Windows (AW supports all three) +- **Browsers**: Chrome, Firefox, Safari (aw-watcher-web) +- **Editors**: VS Code, Cursor, Vim/Neovim (window title detection) +- **Ollama**: Requires 4GB RAM minimum (for 3B model) + +--- + +## Success Metrics + +**Qualitative** (user interviews): +- "Did the compass help you notice drift before it became hours?" +- "Did end-of-phase reflection feel useful or guilt-inducing?" +- "Was setup truly zero-config, or did you struggle?" +- "Do you trust that data stays local?" 
+ +**Quantitative** (optional telemetry, opt-in): +- % of moments with aligned classification (target: >60%) +- Average time-to-notice drift (compass shown → user action) +- Reflection screen dismissal rate (too annoying if >90%) +- Extension disable rate (failure if >20% disable within 1 week) + +**Technical Health**: +- AW uptime (target: >99%) +- LLM classification success rate (target: >95%) +- UI responsiveness (compass updates <500ms) +- Zero data loss on Zenborg restart + +--- + +## Non-Goals (MVP) + +**Explicitly excluded from Phase 1**: +- āŒ Cloud sync of ActivityWatch data +- āŒ Mobile app integration (AW is desktop-only) +- āŒ Productivity metrics / dashboards / charts +- āŒ Gamification (streaks, scores, achievements) +- āŒ Social features (compare with others) +- āŒ AI suggestions ("you should work on X next") +- āŒ Calendar integration (infer intentions from events) +- āŒ Pomodoro timers or time-boxing +- āŒ Automatic moment creation based on observed activity +- āŒ Notifications/reminders/alerts (calm tech only) +- āŒ Browser extension (watch via aw-watcher-web is sufficient) + +**Future Phases** (not MVP): +- Phase 2: Trend analysis (weekly patterns, not daily metrics) +- Phase 3: Custom theme taxonomy (beyond Area keywords) +- Phase 4: Multi-device correlation (phone + desktop) +- Phase 5: Shared themes for teams (opt-in collaboration) + +--- + +## Open Questions + +**Technical**: +1. Should we bundle Ollama or just detect/prompt for install? + - **Recommendation**: Detect + prompt (Ollama is 500MB+, too large to bundle) + +2. Polling interval: 5 min, 10 min, or user-configurable? + - **Recommendation**: Default 10 min, configurable down to 5 min + +3. How to handle rapid context switching (10+ app switches in 5 min)? + - **Recommendation**: Classify as NEUTRAL (transitional state) + +4. Should we show compass when no moment allocated? + - **Recommendation**: Show as UNTRACKED (ā—‹), remind user to allocate + +**UX**: +1. 
Should compass show confidence score, or just direction? + - **Recommendation**: Hide confidence (too metric-y), just show state + +2. End-of-phase reflection: auto-dismiss after 30 sec, or wait for user? + - **Recommendation**: Wait for user (respect attention), but allow click-through + +3. What if user has multiple monitors? Where to show compass? + - **Recommendation**: Let user drag/position, persist preference + +**Privacy**: +1. Should we offer data export (JSON dump of AlignmentEvents)? + - **Recommendation**: Yes, via `:export-data` command + +2. How to handle sensitive window titles (e.g., "Therapy Notes - Google Docs")? + - **Recommendation**: Hash or redact in stored data, only use for real-time classification + +--- + +## Philosophy Alignment Check + +**Does this maintain Zenborg's core principles?** + +āœ… **Orchestration, not elimination**: Accepts drift, helps you notice and reallocate +āœ… **Consciousness as currency**: Mirrors where attention actually goes vs. where you said it would +āœ… **Presence over outcomes**: No "productivity score", just alignment awareness +āœ… **Vim-inspired efficiency**: Minimal UI, peripheral vision, no interruptions +āœ… **Calm technology**: Ambient indicators, not notifications; reflection, not real-time guilt +āœ… **Local-first**: IndexedDB + local LLM, cloud is opt-in only +āœ… **Privacy-first**: Raw activity never persisted, only classifications + +**Potential Tensions**: +āš ļø **"No time tracking"** → We're tracking, but not exposing raw time (only alignment) +āš ļø **"No metrics"** → Classifications are a form of metric, but qualitative (aligned/drifting) +āš ļø **"Mindful tech is boring"** → AI classification could feel "smart" vs. 
boring + +**Resolution**: +- Frame as **awareness tool**, not performance tracker +- Never show percentages, scores, or comparisons +- Make compass dismissible/disableable (user agency) +- Keep UI monochrome and calm (no red alerts, no urgency) + +--- + +## Next Steps + +**Immediate**: +1. āœ… PRD approval (this document) +2. Create technical spike: bundle AW binaries for Next.js app +3. Test Ollama integration (API calls, model selection) +4. Design compass component (Figma mockup) +5. Set up Vitest tests for classification service + +**Week 1 Deliverables**: +- AW auto-start on Zenborg launch +- Activity collection service (polling AW database) +- Console logging of aggregated events + +**Week 2 Deliverables**: +- Ollama integration (local LLM classification) +- Compass indicator UI component +- Real-time classification display + +**Week 3 Deliverables**: +- End-of-phase reflection screen +- Settings panel (privacy controls) +- E2E test: full flow from moment allocation → drift detection → reflection + +--- + +## Appendix: User's Default Themes + +**For "Thopiax" (MVP hardcoded)**: + +```typescript +export const THOPIAX_THEMES = { + "Product Work": { + keywords: ["linear", "notion", "jira", "asana", "roadmap", "spec", "prd", "priorit"], + description: "Writing specs, scopes, prioritizing features, planning roadmaps", + exampleActivities: [ + "Linear - Product Roadmap Q2", + "Notion - PRD: New Onboarding Flow", + "Slack - #product-team" + ] + }, + "Data Work": { + keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas", "numpy", "colab"], + description: "Exploring data, writing models, running batches, tweaking experiments", + exampleActivities: [ + "Jupyter Notebook - user_retention_analysis.ipynb", + "pgAdmin - Query: weekly_active_users", + "Terminal - python run_experiment.py" + ] + }, + "UX Work": { + keywords: ["figma", "framer", "sketch", "prototype", "design", "component", "css", "tailwind"], + description: "Prototyping interfaces, fine-tuning 
designs, iterating on components", + exampleActivities: [ + "Figma - Zenborg Compass Redesign", + "VS Code - MomentCard.tsx", + "Chrome - Tailwind CSS Docs" + ] + }, + "Strategy Work": { + keywords: ["docs", "notion", "notes", "obsidian", "research", "reading", "writing", "plan"], + description: "Slow, deliberate thinking, strategic planning, deep reading", + exampleActivities: [ + "Google Docs - Q3 Strategy Draft", + "Notion - Weekly Reflection", + "Safari - Reading: Shape Up (Basecamp)" + ] + } +} +``` + +**Usage in Classification**: +- When moment.area matches theme name, use corresponding keywords/description +- LLM considers semantic overlap (e.g., "Slack #product-team" → Product Work) +- Themes evolve with user (future: custom theme editor) + +--- + +**Document Version**: 1.0 +**Last Updated**: 2025-10-25 +**Author**: Thopiax (with Claude) +**Status**: Ready for implementation + +--- + +*"Reduce the distance from intent to action. Technology as a mirror, not a master."* From b268a56a03022927db51b09b7f324e4dbd04d5f5 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 02:18:10 +0000 Subject: [PATCH 2/5] docs: Remove end-of-phase reflection, add critical path test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Changes to ActivityWatch PRD: - Remove immediate end-of-phase reflection feature (too granular) - Keep only ambient compass indicator for real-time awareness - Update architecture, flows, and success metrics accordingly - Shift to longer-term reflection in future phases Add critical path validation document: - 2-3 day MVP test protocol to validate core hypothesis - CLI tool to test semantic classification + ambient feedback - Clear go/no-go criteria before full implementation - Tests riskiest assumptions first (accuracy, speed, usefulness) šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md | 398 ++++++++++++++++++++++ 
docs/prds/ACTIVITY_WATCH_PROJECT.md | 106 +----- 2 files changed, 411 insertions(+), 93 deletions(-) create mode 100644 docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md diff --git a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md new file mode 100644 index 0000000..3d75b59 --- /dev/null +++ b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md @@ -0,0 +1,398 @@ +# ActivityWatch Integration - Critical Path Validation + +**Purpose**: Validate the core hypothesis before building the full system +**Timeline**: 2-3 days +**Goal**: Answer the question: "Does semantic AI classification + ambient feedback actually help reduce attention drift?" + +--- + +## The Core Hypothesis + +> **"An ambient compass indicator showing real-time semantic alignment between stated intention and observed activity will help users notice and correct attention drift faster than passive reflection alone."** + +### What We're Testing + +1. **Can a local LLM accurately classify alignment** between a stated work intention and observed computer activity? +2. **Is the classification fast enough** for real-time feedback (< 2 seconds)? +3. **Does ambient feedback feel helpful** or intrusive/distracting? +4. **Do users actually self-correct** when they notice drift, or ignore it? + +### What We're NOT Testing (Yet) + +- Full Zenborg integration +- Multi-phase day planning +- Zero-config setup +- Settings/privacy controls +- Historical tracking + +--- + +## Minimal Viable Test (MVT) + +### What to Build + +**A standalone CLI tool** that: +1. Polls ActivityWatch for last 15 minutes of activity +2. Prompts user for current intention (e.g., "Product Spec") +3. Classifies alignment using Ollama (local LLM) +4. Prints result to terminal in real-time + +**No UI. No persistence. 
Just the core loop.** + +### Technical Stack + +- **Language**: TypeScript/Node.js (or Python for speed) +- **ActivityWatch Client**: REST API calls to `http://localhost:5600` +- **LLM**: Ollama with Llama 3.2 3B +- **Output**: Terminal only (colored text for states) + +### Implementation (4-6 hours) + +```typescript +// pseudocode +while (true) { + // 1. Get current intention from user + const intention = await promptUser("What are you working on?") + const theme = await promptUser("Theme? (product/data/ux/strategy)") + + // 2. Poll ActivityWatch every 5 minutes + await sleep(5 * 60 * 1000) + + // 3. Fetch recent activity (last 15 min) + const activity = await fetchActivityWatch({ + start: now - 15min, + end: now + }) + + // 4. Aggregate events + const summary = aggregateActivity(activity) + // { "Chrome - Linear": 480s, "Chrome - Twitter": 120s, ... } + + // 5. Classify with Ollama + const result = await classifyAlignment({ + intention, + theme, + activity: summary + }) + + // 6. Print to terminal + printCompass(result.classification) // 🧭 ↑ or 🧭 ↙ + console.log(`Confidence: ${result.confidence}`) + console.log(`Reason: ${result.briefReason}`) +} +``` + +--- + +## Test Protocol + +### Setup (Day 0) + +1. Install ActivityWatch (manual setup is fine for MVT) +2. Install Ollama + pull Llama 3.2 3B +3. Build CLI tool (4-6 hours) +4. Verify: Run tool, confirm it fetches AW data and calls Ollama + +### Day 1: Personal Dogfooding + +**Morning Session (3 hours)**: +- Set intention: "Product Spec" (Product theme) +- Work normally for 3 hours +- Observe compass updates every 5 min +- Note: When did you notice drift? Did you self-correct? + +**Questions to Answer**: +- Was classification accurate? (subjective) +- Did you notice the compass updates? +- Did seeing "drifting" cause you to refocus? +- Was 5-min polling too slow/too fast? 
+
+**Afternoon Session (3 hours)**:
+- Set intention: "Data Analysis" (Data theme)
+- Intentionally drift to Twitter/email after 30 min
+- Observe: How long until compass shows "drifting"?
+- Self-correct: Does returning to Jupyter change compass back to "aligned"?
+
+**Questions to Answer**:
+- How quickly did the LLM detect drift?
+- Was the feedback helpful or annoying?
+- Did you feel guilt, or just awareness?
+
+### Day 2: Shared Testing
+
+**Recruit 1-2 colleagues**:
+- Give them the CLI tool
+- Ask them to set intentions for their work (product/data/ux/strategy)
+- Run for 4-6 hours
+- Debrief: Interview about experience
+
+**Interview Questions**:
+1. "On a scale of 1-10, how accurate was the classification?"
+2. "Did you notice drift earlier than you normally would?"
+3. "Did the compass feel like a gentle mirror or an annoying nag?"
+4. "Would you use this daily if it were built into Zenborg?"
+5. "What would make this more useful?"
+
+---
+
+## Success Criteria
+
+### Must Pass (Go/No-Go)
+
+āœ… **Classification accuracy > 70%** (subjective, user agreement with LLM)
+āœ… **Response time < 3 seconds** (Ollama call completes quickly)
+āœ… **Users self-correct at least once** when shown "drifting"
+āœ… **No one says "this is annoying/distracting"** (neutral or positive feedback only)
+
+### Nice to Have
+
+⭐ Classification accuracy > 85%
+⭐ Users proactively check compass (not just passive glances)
+⭐ Users request "show me when I've been aligned for 2+ hours" (positive reinforcement)
+
+### Failure Modes (Stop/Rethink)
+
+āŒ **Classification < 60% accurate** → LLM not good enough, try different model/prompt
+āŒ **Response time > 5 seconds** → Too slow for real-time, need smaller model
+āŒ **Users ignore compass entirely** → Ambient feedback ineffective, try different UI
+āŒ **Users feel guilt/shame** → Messaging is wrong, need gentler framing
+
+---
+
+## Example Test Session (User POV)
+
+```bash
+$ npm run test-compass
+
+🧭 Attention Compass - ActivityWatch Integration Test
+
+What are you working on? (3 words max)
+> Product Spec
+
+Theme? (product/data/ux/strategy)
+> product
+
+āœ“ Monitoring ActivityWatch every 5 minutes...
+  Press Ctrl+C to stop or change intention
+
+[5 minutes pass]
+
+─────────────────────────────────────────
+🧭 ↑ ALIGNED (confidence: 0.82)
+Reason: "Linear, Notion - matches product work"
+
+Recent activity:
+- Linear - Product Roadmap (4m 20s)
+- Chrome - Notion PRD (3m 10s)
+- Slack - #product-team (1m 30s)
+─────────────────────────────────────────
+
+[10 minutes pass]
+
+─────────────────────────────────────────
+🧭 ↙ DRIFTING (confidence: 0.91)
+Reason: "Twitter browsing - misaligned with product work"
+
+Recent activity:
+- Chrome - Twitter (8m 40s)
+- Chrome - Hacker News (4m 20s)
+- Linear - Product Roadmap (2m 00s)
+─────────────────────────────────────────
+
+[User sees "drifting", closes Twitter, returns to Linear]
+
+[15 minutes pass]
+
+─────────────────────────────────────────
+🧭 ↑ ALIGNED (confidence: 0.88)
+Reason: "Back to Linear - aligned with product work"
+
+Recent activity:
+- Linear - Product Roadmap (12m 30s)
+- Chrome - Notion PRD (2m 30s)
+─────────────────────────────────────────
+```
+
+---
+
+## Decision Points
+
+### After Day 1 (Personal Test)
+
+**If positive** → Proceed to Day 2 (shared testing)
+**If mixed** → Iterate on prompt/polling interval, test again
+**If negative** → Stop, rethink approach (maybe ambient feedback doesn't work)
+
+### After Day 2 (Shared Test)
+
+**If 2/2 users positive** → Greenlight full Zenborg integration (PRD implementation)
+**If 1/2 users positive** → Iterate on UX, test with 2 more users
+**If 0/2 users positive** → Stop, fundamental issue with approach
+
+---
+
+## What We Learn
+
+### On Classification Quality
+
+- **Is semantic understanding working?** (e.g., "Slack #product-team" correctly classified as aligned)
+- **Are edge cases handled?** (e.g., research on Twitter for product spec)
+- **Is the LLM too strict or too lenient?**
+
+### On User Behavior
+
+- **Do users notice drift earlier?** (vs. discovering at end of day)
+- **Do they self-correct when shown "drifting"?**
+- **Do they feel empowered or guilty?**
+
+### On Technical Feasibility
+
+- **Is 5-min polling the right interval?** (or 10 min? 15 min?)
+- **Is Llama 3.2 3B fast enough?** (or do we need smaller model?)
+- **Does ActivityWatch data quality hold up?** (window titles, URLs accurate?)
+
+---
+
+## Pivot Options (If Hypothesis Fails)
+
+### If Classification Is Inaccurate
+
+**Option A**: Use simpler keyword matching (no LLM)
+- Pro: Faster, more predictable
+- Con: Misses semantic nuance
+
+**Option B**: Fine-tune LLM on personal work patterns
+- Pro: Higher accuracy over time
+- Con: Requires training data, more complex
+
+**Option C**: Let user correct classifications (feedback loop)
+- Pro: Improves over time, user feels in control
+- Con: Adds friction
+
+### If Ambient Feedback Is Ineffective
+
+**Option A**: Only show compass on request (`:align` command)
+- Pro: Less intrusive
+- Con: Defeats real-time awareness goal
+
+**Option B**: Remove real-time feedback, only weekly summaries
+- Pro: Aligns with "less granular" philosophy
+- Con: Too late to notice drift in the moment
+
+**Option C**: Add gentle sound/haptic (for users who want it)
+- Pro: Harder to ignore
+- Con: Violates "calm tech" principle
+
+### If Users Feel Guilt/Shame
+
+**Option A**: Reframe language (drop "drifting", use "exploring")
+- Pro: Gentler tone
+- Con: May feel less truthful
+
+**Option B**: Add positive reinforcement ("You've been aligned for 2 hours!")
+- Pro: Balances negative with positive
+- Con: Risks gamification
+
+**Option C**: Make compass optional/hideable at all times
+- Pro: Respects user agency
+- Con: Users may just hide it when uncomfortable
+
+---
+
+## Timeline
+
+**Day 0 (Setup)**: 4-6 hours
+- Build CLI tool
+- Test AW + Ollama integration
+- Verify basic flow works
+
+**Day 1 (Personal Test)**: 6-8 hours of work with compass running
+- Morning: aligned work
+- Afternoon: intentional drift test
+- Evening: notes & reflection
+
+**Day 2 (Shared Test)**: 4-6 hours
+- Recruit 1-2 colleagues
+- Run sessions
+- Debrief interviews (30 min each)
+
+**Day 3 (Decision)**: 2 hours
+- Synthesize findings
+- Make go/no-go decision
+- Document learnings
+
+**Total**: 2-3 days end-to-end
+
+---
+
+## Deliverables
+
+1. **CLI tool** (open-source, can share with testers)
+2. **Test notes** (markdown doc with observations)
+3. **Interview summaries** (anonymized quotes/themes)
+4. **Go/No-Go decision doc** (based on success criteria)
+5. **Learnings** (what worked, what didn't, what to change)
+
+---
+
+## Next Steps After Validation
+
+### If "Go" (Hypothesis Validated)
+
+1. Proceed with full PRD implementation
+2. Integrate into Zenborg (Phases 1a-1e)
+3. Design compass UI component (not just CLI)
+4. Add settings/privacy controls
+5. Ship as opt-in beta to users
+
+### If "No-Go" (Hypothesis Failed)
+
+1. Document failure mode(s)
+2. Explore pivot options (see above)
+3. Consider alternative approaches:
+   - Manual check-ins (`:align` command on demand)
+   - Weekly reflection only (no real-time)
+   - Simple keyword matching (no AI)
+4. Re-test with pivoted approach
+
+---
+
+## Philosophy Check
+
+**Does this test maintain Zenborg principles?**
+
+āœ… **Calm technology**: CLI output is passive, not intrusive
+āœ… **Local-first**: All processing local (AW + Ollama)
+āœ… **Privacy-first**: No data sent to cloud
+āœ… **User agency**: Can stop test anytime (Ctrl+C)
+āœ… **No metrics**: Shows alignment state, not scores/percentages
+
+**Does it test the right thing?**
+
+āœ… **Core value prop**: Does semantic awareness reduce drift?
+āœ… **Technical feasibility**: Is LLM fast/accurate enough?
+āœ… **User experience**: Does ambient feedback feel helpful?
+āœ… **Minimal viable**: No over-engineering, just essentials
+
+---
+
+## Key Questions to Answer
+
+1. 
**Does it work?** (technically: AW → LLM → classification) +2. **Is it fast?** (< 2-3 seconds end-to-end) +3. **Is it accurate?** (> 70% user agreement with classification) +4. **Is it useful?** (users self-correct when shown drift) +5. **Is it calm?** (no guilt, no distraction) + +**If all 5 are "yes" → Build the full thing.** +**If any are "no" → Pivot or stop.** + +--- + +**Status**: Ready to build +**Owner**: Thopiax +**Timeline**: Start ASAP, decide by end of Week 1 + +--- + +*"Test the riskiest assumption first. If semantic awareness works, build it. If not, save weeks of implementation."* diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md index 80e3806..2f1c1db 100644 --- a/docs/prds/ACTIVITY_WATCH_PROJECT.md +++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md @@ -25,11 +25,10 @@ A **passive ambient awareness system** that: 1. Observes computer activity via ActivityWatch 2. Classifies alignment with current moment using local LLM 3. Provides **peripheral feedback** (ambient compass indicator) -4. Offers **reflective summaries** (end-of-phase ritual) -**Not**: Performance tracking, productivity metrics, nagging notifications, or guilt-inducing dashboards. +**Not**: Performance tracking, productivity metrics, nagging notifications, guilt-inducing dashboards, or granular time summaries. -**Is**: A gentle, intelligent mirror that helps you notice drift before hours pass. +**Is**: A gentle, intelligent mirror that helps you notice drift in the moment, not hours later. 
--- @@ -62,35 +61,6 @@ Current: "Product Spec" ā˜• Morning - **Drifting** (↙): Clear misalignment detected - **Untracked** (ā—‹): No digital activity (reading, meetings, thinking) -### End-of-Phase Reflection (Passive, Retrospective) - -When a phase completes (e.g., Morning → Afternoon transition): - -``` -ā˜• Morning Complete - -ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” -│ Product Spec [Craft] │ -│ āœ“ Aligned (2h 15m observed) │ -│ → Linear, Notion, Figma mockups │ -ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ -│ Email Triage [Admin] │ -│ āœ— Drift detected (45m allocated, 12m observed) │ -│ → Spent 1h 20m on Twitter/HN instead │ -ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ -│ Deep Reading [Strategy] │ -│ ? Untracked (no digital footprint) │ -ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ - -Press any key to continue to Afternoon... -``` - -**Design**: -- Non-blocking (can dismiss immediately) -- Shows up once per phase transition -- No judgement language ("drift detected" not "you failed") -- Acknowledges untracked time as valid (reading, thinking, meetings) - --- ## Technical Architecture @@ -135,7 +105,6 @@ Press any key to continue to Afternoon... 
ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ Ambient Feedback Layer │ │ - Compass indicator (real-time UI) │ -│ - Phase reflection summary (transition screen) │ │ - Alignment history (stored in IndexedDB) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ``` @@ -345,35 +314,7 @@ interface ClassificationResult { --- -### Phase 1e: End-of-Phase Reflection (Week 3) -**Goal**: Show alignment summary at phase transitions - -**Trigger**: When current phase ends (based on PhaseConfig.endHour) - -**UI**: -- Overlay (not modal - can click through) -- Shows all moments from completed phase -- For each moment: - - Alignment status (āœ“ aligned, āœ— drifting, ? untracked) - - Observed duration (aggregated from AW events) - - Top 3 apps/activities -- Press any key or click to dismiss - -**Data**: -- Query AlignmentEvents for completed phase -- Aggregate classifications by moment -- Calculate time spent per classification type -- Do NOT show percentages or scores (no gamification) - -**Acceptance**: -- Appears automatically at phase transition -- Non-blocking (can dismiss immediately) -- Shows accurate time aggregations -- Works offline (uses cached data) - ---- - -### Phase 1f: Settings & Privacy (Week 3) +### Phase 1e: Settings & Privacy (Week 2-3) **Goal**: User control over data collection and feedback **Settings Panel** (`:settings` command): @@ -383,7 +324,6 @@ interface ClassificationResult { ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ │ ā˜‘ Enable attention guardrails │ │ ā˜‘ Show ambient compass indicator │ -│ ā˜‘ Show end-of-phase reflections │ │ │ │ Classification interval: [5 min] [10 min] [15] │ │ LLM Backend: [Ollama (local)] [Claude API] 
│ @@ -451,26 +391,7 @@ interface ClassificationResult { --- -### Flow 3: End-of-Phase Reflection -``` -1. Morning phase ends (12pm → Afternoon) -2. Zenborg shows reflection overlay: - - ā˜• Morning Complete - - āœ“ Product Spec (2h 15m aligned) - āœ— Email Triage (1h 20m drifting - Twitter/HN) - ? Deep Reading (untracked) - -3. User reads, presses Esc -4. Continues to Afternoon -``` - -**Non-Goals**: No lecture, no metrics, no "productivity score". Just a mirror. - ---- - -### Flow 4: Disable Extension (User Agency) +### Flow 3: Disable Extension (User Agency) ``` 1. User types :settings 2. Unchecks "Enable attention guardrails" @@ -511,14 +432,13 @@ interface ClassificationResult { **Qualitative** (user interviews): - "Did the compass help you notice drift before it became hours?" -- "Did end-of-phase reflection feel useful or guilt-inducing?" - "Was setup truly zero-config, or did you struggle?" - "Do you trust that data stays local?" +- "Does the ambient feedback feel helpful or distracting?" **Quantitative** (optional telemetry, opt-in): - % of moments with aligned classification (target: >60%) - Average time-to-notice drift (compass shown → user action) -- Reflection screen dismissal rate (too annoying if >90%) - Extension disable rate (failure if >20% disable within 1 week) **Technical Health**: @@ -545,7 +465,7 @@ interface ClassificationResult { - āŒ Browser extension (watch via aw-watcher-web is sufficient) **Future Phases** (not MVP): -- Phase 2: Trend analysis (weekly patterns, not daily metrics) +- Phase 2: Longer-term reflection patterns (weekly/monthly, not immediate) - Phase 3: Custom theme taxonomy (beyond Area keywords) - Phase 4: Multi-device correlation (phone + desktop) - Phase 5: Shared themes for teams (opt-in collaboration) @@ -571,12 +491,12 @@ interface ClassificationResult { 1. Should compass show confidence score, or just direction? - **Recommendation**: Hide confidence (too metric-y), just show state -2. 
End-of-phase reflection: auto-dismiss after 30 sec, or wait for user? - - **Recommendation**: Wait for user (respect attention), but allow click-through - -3. What if user has multiple monitors? Where to show compass? +2. What if user has multiple monitors? Where to show compass? - **Recommendation**: Let user drag/position, persist preference +3. Should alignment history be queryable/viewable? + - **Recommendation**: Future phase - keep MVP focused on real-time awareness only + **Privacy**: 1. Should we offer data export (JSON dump of AlignmentEvents)? - **Recommendation**: Yes, via `:export-data` command @@ -630,10 +550,10 @@ interface ClassificationResult { - Compass indicator UI component - Real-time classification display -**Week 3 Deliverables**: -- End-of-phase reflection screen +**Week 2-3 Deliverables**: - Settings panel (privacy controls) -- E2E test: full flow from moment allocation → drift detection → reflection +- Data retention & deletion +- E2E test: full flow from moment allocation → drift detection → self-correction --- From 686aa113841c0228bb93a9793c56dbd8e487aa36 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 02:25:46 +0000 Subject: [PATCH 3/5] docs: Switch from Ollama to Transformer.js for classification MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Major architectural change to PRD and critical path: **Why Transformer.js**: - Zero external dependencies (no Ollama install) - Runs in-browser or Node.js (WASM + WebGPU) - Auto-downloads models on first use (~400MB BART) - Faster inference for classification (< 1 second) - Reusable for journal note semantic annotation - Better integration with Next.js/TypeScript stack **Classification approaches**: 1. Zero-shot classification (BART/DeBERTa) for accuracy 2. Semantic similarity (sentence transformers) for speed 3. Both use same Transformer.js API **Performance improvements**: - < 1 second classification (vs. 
< 2-3 seconds with Ollama) - < 150MB memory footprint (vs. 200MB+) - No background server required **Bonus feature**: Same models enable journal note semantic search, auto-tagging, and moment similarity matching Updated both PRD and critical path test protocol to reflect new approach. Simpler setup (2-4 hours vs. 4-6 hours). šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md | 157 +++++++++++++---- docs/prds/ACTIVITY_WATCH_PROJECT.md | 204 ++++++++++++++-------- 2 files changed, 253 insertions(+), 108 deletions(-) diff --git a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md index 3d75b59..3cd4243 100644 --- a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md +++ b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md @@ -41,15 +41,22 @@ ### Technical Stack -- **Language**: TypeScript/Node.js (or Python for speed) +- **Language**: TypeScript/Node.js - **ActivityWatch Client**: REST API calls to `http://localhost:5600` -- **LLM**: Ollama with Llama 3.2 3B +- **Classifier**: Transformer.js with BART or DeBERTa (zero-shot classification) - **Output**: Terminal only (colored text for states) ### Implementation (4-6 hours) ```typescript -// pseudocode +import { pipeline } from '@xenova/transformers' + +// Load classifier once (auto-downloads model on first run) +const classifier = await pipeline( + 'zero-shot-classification', + 'facebook/bart-large-mnli' +) + while (true) { // 1. Get current intention from user const intention = await promptUser("What are you working on?") @@ -68,17 +75,30 @@ while (true) { const summary = aggregateActivity(activity) // { "Chrome - Linear": 480s, "Chrome - Twitter": 120s, ... } - // 5. Classify with Ollama - const result = await classifyAlignment({ - intention, - theme, - activity: summary - }) - - // 6. 
Print to terminal - printCompass(result.classification) // 🧭 ↑ or 🧭 ↙ - console.log(`Confidence: ${result.confidence}`) - console.log(`Reason: ${result.briefReason}`) + // 5. Build activity description + const activityText = Object.entries(summary) + .map(([key, duration]) => `${key} (${duration}s)`) + .join(', ') + + const description = ` + Working on: ${intention} (${theme} work) + Recent activity: ${activityText} + ` + + // 6. Classify with Transformer.js + const result = await classifier(description, [ + 'aligned with stated intention', + 'drifting from stated intention', + 'neutral or transitional activity', + 'no significant activity' + ]) + + const classification = result.labels[0] + const confidence = result.scores[0] + + // 7. Print to terminal + printCompass(classification) // 🧭 ↑ or 🧭 ↙ + console.log(`Confidence: ${(confidence * 100).toFixed(0)}%`) } ``` @@ -89,9 +109,9 @@ while (true) { ### Setup (Day 0) 1. Install ActivityWatch (manual setup is fine for MVT) -2. Install Ollama + pull Llama 3.2 3B -3. Build CLI tool (4-6 hours) -4. Verify: Run tool, confirm it fetches AW data and calls Ollama +2. `npm install @xenova/transformers` (auto-downloads BART on first run) +3. Build CLI tool (2-4 hours - simpler than Ollama approach) +4. Verify: Run tool, confirm it fetches AW data and classifies with Transformer.js ### Day 1: Personal Dogfooding @@ -114,7 +134,7 @@ while (true) { - Self-correct: Does returning to Jupyter change compass back to "aligned"? **Questions to Answer**: -- How quickly did LLM detect drift? +- How quickly did the classifier detect drift? - Was the feedback helpful or annoying? - Did you feel guilt, or just awareness? 
@@ -139,8 +159,8 @@ while (true) { ### Must Pass (Go/No-Go) -āœ… **Classification accuracy > 70%** (subjective, user agreement with LLM) -āœ… **Response time < 3 seconds** (Ollama call completes quickly) +āœ… **Classification accuracy > 70%** (subjective, user agreement with classifier) +āœ… **Response time < 1 second** (Transformer.js inference completes quickly) āœ… **Users self-correct at least once** when shown "drifting" āœ… **No one says "this is annoying/distracting"** (neutral or positive feedback only) @@ -152,8 +172,8 @@ while (true) { ### Failure Modes (Stop/Rethink) -āŒ **Classification < 60% accurate** → LLM not good enough, try different model/prompt -āŒ **Response time > 5 seconds** → Too slow for real-time, need smaller model +āŒ **Classification < 60% accurate** → Zero-shot not working, try semantic similarity instead +āŒ **Response time > 2 seconds** → Too slow for real-time, switch to smaller/faster model āŒ **Users ignore compass entirely** → Ambient feedback ineffective, try different UI āŒ **Users feel guilt/shame** → Messaging is wrong, need gentler framing @@ -165,6 +185,8 @@ while (true) { $ npm run test-compass 🧭 Attention Compass - ActivityWatch Integration Test +Loading classifier... (first run downloads BART model ~400MB) +āœ“ Classifier ready (facebook/bart-large-mnli) What are you working on? (3 words max) > Product Spec @@ -178,8 +200,7 @@ Theme? 
(product/data/ux/strategy) [5 minutes pass] ───────────────────────────────────────── -🧭 ↑ ALIGNED (confidence: 0.82) -Reason: "Linear, Notion - matches product work" +🧭 ↑ ALIGNED (confidence: 82%) Recent activity: - Linear - Product Roadmap (4m 20s) @@ -190,8 +211,7 @@ Recent activity: [10 minutes pass] ───────────────────────────────────────── -🧭 ↙ DRIFTING (confidence: 0.91) -Reason: "Twitter browsing - misaligned with product work" +🧭 ↙ DRIFTING (confidence: 91%) Recent activity: - Chrome - Twitter (8m 40s) @@ -204,8 +224,7 @@ Recent activity: [15 minutes pass] ───────────────────────────────────────── -🧭 ↑ ALIGNED (confidence: 0.88) -Reason: "Back to Linear - aligned with product work" +🧭 ↑ ALIGNED (confidence: 88%) Recent activity: - Linear - Product Roadmap (12m 30s) @@ -237,7 +256,8 @@ Recent activity: - **Is semantic understanding working?** (e.g., "Slack #product-team" correctly classified as aligned) - **Are edge cases handled?** (e.g., research on Twitter for product spec) -- **Is the LLM too strict or too lenient?** +- **Is zero-shot classification too strict or too lenient?** +- **Does BART work well, or should we try DeBERTa/semantic similarity?** ### On User Behavior @@ -248,7 +268,7 @@ Recent activity: ### On Technical Feasibility - **Is 5-min polling the right interval?** (or 10 min? 15 min?) -- **Is Llama 3.2 3B fast enough?** (or do we need smaller model?) +- **Is Transformer.js fast enough for real-time feedback?** (< 1 second?) - **Does ActivityWatch data quality hold up?** (window titles, URLs accurate?) 
--- @@ -257,17 +277,21 @@ Recent activity: ### If Classification Is Inaccurate -**Option A**: Use simpler keyword matching (no LLM) -- Pro: Faster, more predictable -- Con: Misses semantic nuance +**Option A**: Switch from zero-shot to semantic similarity +- Pro: Faster, simpler, often more accurate for narrow domains +- Con: Requires tuning similarity thresholds -**Option B**: Fine-tune LLM on personal work patterns -- Pro: Higher accuracy over time -- Con: Requires training data, more complex +**Option B**: Use keyword matching (no ML at all) +- Pro: Fastest, most predictable +- Con: Misses semantic nuance entirely **Option C**: Let user correct classifications (feedback loop) - Pro: Improves over time, user feels in control -- Con: Adds friction +- Con: Adds friction, doesn't improve model + +**Option D**: Try different zero-shot model (DeBERTa instead of BART) +- Pro: May have better accuracy for intent classification +- Con: Still relatively slow compared to similarity ### If Ambient Feedback Is Ineffective @@ -389,6 +413,65 @@ Recent activity: --- +## Bonus: Journal Note Semantic Annotation + +Since we're already loading Transformer.js models for ActivityWatch classification, **the same models can power semantic journal features**: + +### Use Cases + +**1. Semantic Search** +```typescript +// Find journal entries related to current moment +const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2') + +const momentEmbedding = await embedder("Product Spec: prioritizing roadmap") +const noteEmbeddings = await embedder(journalNotes.map(n => n.content)) + +const similarities = cosineSimilarity(momentEmbedding, noteEmbeddings) +const relatedNotes = journalNotes + .map((note, i) => ({ note, score: similarities[i] })) + .filter(({ score }) => score > 0.6) + .sort((a, b) => b.score - a.score) +``` + +**2. 
Auto-Tagging Notes**
+```typescript
+// Tag note with theme (product/data/ux/strategy)
+const classifier = await pipeline('zero-shot-classification', 'facebook/bart-large-mnli')
+
+const result = await classifier(journalNote.content, [
+  'product work and prioritization',
+  'data analysis and experiments',
+  'UX design and prototyping',
+  'strategic thinking and planning'
+])
+
+journalNote.theme = result.labels[0]
+journalNote.confidence = result.scores[0]
+```
+
+**3. Find Similar Past Moments**
+```typescript
+// When creating moment "Product Spec", show related past moments
+const currentEmbedding = await embedder("Product Spec")
+const pastEmbeddings = await embedder(pastMoments.map(m => m.name))
+
+// Same cosineSimilarity helper as in the semantic search example
+const similarities = cosineSimilarity(currentEmbedding, pastEmbeddings)
+
+const similar = pastMoments
+  .map((m, i) => ({ moment: m, score: similarities[i] }))
+  .filter(({ score }) => score > 0.8)
+```
+
+### Integration Points
+
+- **Moment creation**: Suggest related journal notes
+- **Journal writing**: Auto-tag with themes from current moment
+- **Reflection**: "Show me notes from when I worked on similar moments"
+- **Search**: Semantic search across all notes and moments
+
+**Advantage**: One model download, multiple features. Zero-config semantic intelligence across the whole app.
+
+---
+
 **Status**: Ready to build
 **Owner**: Thopiax
 **Timeline**: Start ASAP, decide by end of Week 1
diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md
index 2f1c1db..2219bc8 100644
--- a/docs/prds/ACTIVITY_WATCH_PROJECT.md
+++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md
@@ -165,55 +165,114 @@ interface ActivitySummary {
 type AlignmentType = "aligned" | "neutral" | "drifting" | "untracked"
 ```
 
-### LLM Classification Service
+### Semantic Classification Service (Transformer.js)
 
-**Local Model Options** (ranked by preference):
-1. **Ollama** with Llama 3.2 3B (fastest, good balance)
-2. **llama.cpp** with Phi-3 Mini (smallest, edge devices)
-3. 
**Fallback**: Claude API (privacy implications, requires API key) +**Model Choice**: **Transformer.js** with zero-shot classification + +**Why Transformer.js**: +- āœ… Zero external dependencies (no Ollama/llama.cpp install) +- āœ… Runs in browser or Node.js (WASM + WebGPU) +- āœ… Auto-downloads models on first use (cached locally) +- āœ… Fast inference for classification tasks (< 500ms) +- āœ… Reusable for journal note semantic annotation +- āœ… Works offline immediately after first model download + +**Model Options** (ranked by preference): +1. **`facebook/bart-large-mnli`** - Zero-shot classification (best accuracy) +2. **`MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`** - Faster, still accurate +3. **Sentence transformers** + cosine similarity (ultra-fast, good enough) + +**Classification Approach**: + +```typescript +import { pipeline } from '@xenova/transformers' + +// Load zero-shot classifier (once, cached) +const classifier = await pipeline( + 'zero-shot-classification', + 'facebook/bart-large-mnli' +) + +// Define candidate labels based on moment's theme +const labels = { + aligned: [ + moment.area.themeDescription, // "Writing specs, prioritizing features" + ...moment.area.keywords, // ["linear", "notion", "spec"] + ], + drifting: [ + "social media browsing", + "news reading", + "entertainment", + "unrelated work" + ], + neutral: [ + "email communication", + "team chat", + "quick searches", + "context switching" + ] +} + +// Build activity description from AW events +const activityDescription = ` +User is working on: "${moment.name}" (${moment.area.name} - ${moment.area.themeDescription}) + +Recent activity (last 15 min): +${activity.map(a => `- ${a.app}: ${a.windowTitle} (${a.duration}s)`).join('\n')} +` + +// Classify alignment +const result = await classifier(activityDescription, [ + 'aligned with stated intention', + 'drifting from stated intention', + 'neutral or transitional activity', + 'no significant digital activity' +]) + +// Map to 
AlignmentType +const classification = mapToAlignment(result.labels[0], result.scores[0]) +// { classification: "aligned", confidence: 0.89, themeDetected: "product" } +``` + +**Alternative: Semantic Similarity** (faster, simpler): -**Classification Prompt Template**: ```typescript -const CLASSIFICATION_PROMPT = `You are an attention alignment classifier for a mindful productivity system. - -CURRENT INTENTION: -- Moment: "${moment.name}" -- Area: ${moment.area.name} -- Theme: ${moment.area.themeDescription} -- Phase: ${phase} (${phaseEmoji}) - -OBSERVED ACTIVITY (last 15 min): -${activitySummary} - -TASK: Classify alignment as: -- ALIGNED: Activity clearly matches the stated intention and theme -- NEUTRAL: Ambiguous or transitional (email, Slack, quick searches, switching contexts) -- DRIFTING: Clear misalignment with stated intention -- UNTRACKED: No significant digital activity detected - -GUIDELINES: -- Consider semantic meaning, not just keywords - (e.g., "Slack #product-team" is aligned with product work) -- Short diversions (<2 min) are NEUTRAL, not drifting -- Respect nuance: research on Twitter for a product spec is aligned -- If no clear activity, classify as UNTRACKED (not a failure) - -OUTPUT (JSON only, no explanation): -{ - "classification": "aligned" | "neutral" | "drifting" | "untracked", - "confidence": 0.0-1.0, - "themeDetected": "product" | "data" | "ux" | "strategy" | null, - "briefReason": "Short explanation (max 10 words)" -}`; +import { pipeline } from '@xenova/transformers' + +// Load sentence transformer (faster than zero-shot) +const embedder = await pipeline( + 'feature-extraction', + 'Xenova/all-MiniLM-L6-v2' +) + +// Embed intention +const intentionEmbedding = await embedder( + `${moment.name}: ${moment.area.themeDescription}` +) + +// Embed observed activity +const activityEmbedding = await embedder( + activity.map(a => `${a.app} ${a.windowTitle}`).join('. 
') +) + +// Compute cosine similarity +const similarity = cosineSimilarity(intentionEmbedding, activityEmbedding) + +// Classify based on threshold +const classification = + similarity > 0.7 ? 'aligned' : + similarity > 0.4 ? 'neutral' : + similarity > 0.2 ? 'drifting' : + 'untracked' ``` -**Response Parsing**: +**Response Format**: ```typescript interface ClassificationResult { - classification: AlignmentType - confidence: number - themeDetected: string | null - briefReason: string + classification: AlignmentType // "aligned" | "neutral" | "drifting" | "untracked" + confidence: number // 0.0-1.0 (from model scores) + themeDetected: string | null // "product" | "data" | "ux" | "strategy" + method: 'zero-shot' | 'similarity' // which approach was used } // Store in IndexedDB as AlignmentEvent @@ -257,22 +316,23 @@ interface ClassificationResult { --- -### Phase 1c: Local LLM Integration (Week 2) -**Goal**: Classify alignment using Ollama locally +### Phase 1c: Semantic Classification (Week 2) +**Goal**: Classify alignment using Transformer.js **Tasks**: -1. Detect Ollama installation (or prompt user to install) -2. Auto-pull lightweight model (Llama 3.2 3B) -3. Build classification prompt from current moment + activity -4. Call Ollama API (http://localhost:11434) -5. Parse JSON response → AlignmentEvent +1. Install `@xenova/transformers` (npm package) +2. Load zero-shot classification model (BART or DeBERTa) +3. Build activity description from AW events +4. Classify alignment with candidate labels +5. Map scores to AlignmentType + confidence 6. 
Store classifications in IndexedDB (not raw activity) **Acceptance**: -- Classification runs locally, no external API calls -- Response time < 2 seconds +- Classification runs in-browser/Node.js, no external dependencies +- First-run downloads model (100-500MB), then cached +- Response time < 1 second (after model loaded) - Confidence scores calibrated (>0.7 for aligned/drifting) -- Errors gracefully handled (show "untracked" if LLM fails) +- Errors gracefully handled (show "untracked" if classification fails) --- @@ -326,11 +386,10 @@ interface ClassificationResult { │ ā˜‘ Show ambient compass indicator │ │ │ │ Classification interval: [5 min] [10 min] [15] │ -│ LLM Backend: [Ollama (local)] [Claude API] │ +│ Model: [BART (accurate)] [DeBERTa (fast)] │ │ │ │ Privacy: │ -│ ā˜‘ Process data locally only │ -│ ☐ Allow cloud LLM fallback (requires API key) │ +│ ā˜‘ Process data locally only (in-browser) │ │ │ │ Data Retention: │ │ Keep alignment history: [7 days] [30] [Forever] │ @@ -338,7 +397,7 @@ interface ClassificationResult { │ │ │ Status: │ │ ActivityWatch: Running āœ“ │ -│ Ollama: Connected āœ“ (Llama 3.2 3B) │ +│ Transformer.js: Loaded āœ“ (BART-large-mnli) │ │ Last classification: 2 minutes ago │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ``` @@ -363,13 +422,13 @@ interface ClassificationResult { ``` 1. User installs Zenborg 2. ActivityWatch auto-starts in background -3. Ollama detected (or prompt: "Install Ollama for local AI? [Yes] [Skip]") -4. If Ollama installed → auto-pull Llama 3.2 3B (progress indicator) -5. Settings show: "ActivityWatch: Running āœ“, Ollama: Ready āœ“" +3. First classification triggers model download (progress: "Loading classifier...") +4. BART model downloads (400MB, one-time, cached) +5. Settings show: "ActivityWatch: Running āœ“, Transformer.js: Loaded āœ“" 6. 
Compass indicator appears (faded, no moment allocated yet) ``` -**Fallback**: If Ollama not installed, extension stays dormant (no crash, no nag). +**Fallback**: If model download fails (offline, no space), extension stays dormant until next launch. --- @@ -407,24 +466,24 @@ interface ClassificationResult { ## Technical Constraints ### Performance -- **Classification latency**: < 2 seconds (local LLM) +- **Classification latency**: < 1 second (Transformer.js, after model loaded) - **UI update latency**: < 500ms (compass indicator) -- **CPU overhead**: < 5% average (AW watchers + LLM) -- **Memory**: < 200MB (AW + Ollama model loaded) +- **CPU overhead**: < 3% average (AW watchers + inference) +- **Memory**: < 150MB (AW + Transformer.js model in-memory) - **Battery impact**: Negligible (10-min polling, not continuous) ### Privacy -- **Default**: All data processed locally (AW SQLite + Ollama) +- **Default**: All data processed locally (AW SQLite + Transformer.js in-browser) - **No telemetry**: Classification results stay on device -- **Optional cloud**: User must explicitly enable + provide API key +- **No cloud required**: Models downloaded once, cached locally - **Data retention**: Default 7 days, user-configurable - **GDPR compliance**: Full data export/deletion support ### Compatibility - **Platforms**: macOS, Linux, Windows (AW supports all three) -- **Browsers**: Chrome, Firefox, Safari (aw-watcher-web) +- **Browsers**: Chrome (recommended), Firefox, Safari (aw-watcher-web) - **Editors**: VS Code, Cursor, Vim/Neovim (window title detection) -- **Ollama**: Requires 4GB RAM minimum (for 3B model) +- **Transformer.js**: Requires 2GB RAM minimum, WebGPU recommended for speed --- @@ -475,8 +534,8 @@ interface ClassificationResult { ## Open Questions **Technical**: -1. Should we bundle Ollama or just detect/prompt for install? - - **Recommendation**: Detect + prompt (Ollama is 500MB+, too large to bundle) +1. 
Which Transformer.js model: BART (accurate) or DeBERTa (faster)? + - **Recommendation**: Start with BART, add DeBERTa as fast mode option 2. Polling interval: 5 min, 10 min, or user-configurable? - **Recommendation**: Default 10 min, configurable down to 5 min @@ -487,6 +546,9 @@ interface ClassificationResult { 4. Should we show compass when no moment allocated? - **Recommendation**: Show as UNTRACKED (ā—‹), remind user to allocate +5. Use zero-shot classification or semantic similarity? + - **Recommendation**: Zero-shot for better accuracy, similarity as fallback/fast mode + **UX**: 1. Should compass show confidence score, or just direction? - **Recommendation**: Hide confidence (too metric-y), just show state @@ -536,7 +598,7 @@ interface ClassificationResult { **Immediate**: 1. āœ… PRD approval (this document) 2. Create technical spike: bundle AW binaries for Next.js app -3. Test Ollama integration (API calls, model selection) +3. Test Transformer.js integration (model loading, inference speed) 4. Design compass component (Figma mockup) 5. 
Set up Vitest tests for classification service @@ -546,7 +608,7 @@ interface ClassificationResult { - Console logging of aggregated events **Week 2 Deliverables**: -- Ollama integration (local LLM classification) +- Transformer.js integration (zero-shot classification) - Compass indicator UI component - Real-time classification display From 42ae49e9e772899eac98845c3b68dd2088f189d1 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 02:30:53 +0000 Subject: [PATCH 4/5] docs: Add tiny ActivityWatch integration (manual labeling MVP) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Alternative to full AI classification - test core integration first: **Approach**: - Use ActivityWatch's built-in category/labeling system - User manually configures regex rules in AW UI - Zenborg fetches labeled events via REST API (localhost:5600) - Simple alignment: does activity category match moment area? **Benefits**: - 2-4 hours to build vs. weeks for AI version - Zero ML complexity, uses existing AW features - Tests core hypothesis: does activity tracking help? - Transparent rules (regex), user-editable **Implementation**: 1. ActivityWatch client (TypeScript REST API wrapper) 2. Sync Zenborg areas → AW categories 3. Fetch & display alignment status (🧭 ↑/↙) 4. Simple UI component (fixed position indicator) 5. Standalone test script **Limitations**: - Requires manual category setup (regex rules) - No semantic understanding (can't infer intent) - Only works if user maintains category rules **Path forward**: 1. Build tiny version (validate AW integration works) 2. If successful → Add Transformer.js semantic layer 3. Hybrid: User rules + AI classification for unlabeled This tests the riskiest assumption (AW integration) before investing in AI infrastructure. 
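
The "simple alignment" rule described above can be sketched as a pure function (illustrative only — `LabeledEvent` and `compassState` are hypothetical names for this sketch, not part of the codebase):

```typescript
// Sketch of the rule: does the labeled activity time match the moment's area?

interface LabeledEvent {
  app: string
  duration: number      // seconds
  category?: string[]   // ActivityWatch $category labels
}

type CompassState = 'aligned' | 'drifting' | 'untracked'

// Aligned when more than half of the labeled time matches the area name;
// untracked when no events carry labels at all.
function compassState(events: LabeledEvent[], areaName: string): CompassState {
  const labeled = events.filter(e => e.category && e.category.length > 0)
  if (labeled.length === 0) return 'untracked'

  const total = labeled.reduce((sum, e) => sum + e.duration, 0)
  const matching = labeled
    .filter(e => (e.category ?? []).includes(areaName))
    .reduce((sum, e) => sum + e.duration, 0)

  return matching > total / 2 ? 'aligned' : 'drifting'
}
```

Keeping `untracked` distinct from `drifting` avoids punishing users who simply haven't configured category rules yet.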
šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_TINY_VERSION.md | 596 +++++++++++++++++++++++ 1 file changed, 596 insertions(+) create mode 100644 docs/prds/ACTIVITY_WATCH_TINY_VERSION.md diff --git a/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md new file mode 100644 index 0000000..fa450e7 --- /dev/null +++ b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md @@ -0,0 +1,596 @@ +# ActivityWatch Tiny Integration - Manual Labeling MVP + +**Purpose**: Test core ActivityWatch integration with manual area labeling before building AI classification +**Timeline**: 2-4 hours to build + test +**Goal**: Prove ActivityWatch data collection works, validate API integration + +--- + +## The Simplest Thing That Could Work + +Instead of AI classification, **manually label activities** using ActivityWatch's built-in category system: + +1. User downloads & runs ActivityWatch themselves +2. Zenborg syncs Areas → ActivityWatch categories/labels +3. User manually labels activities in ActivityWatch UI (or via script) +4. Zenborg fetches labeled data to show alignment + +**No AI. No classification. 
Just basic CRUD operations on localhost.** + +--- + +## How ActivityWatch Labeling Works + +ActivityWatch has a built-in **event classification system**: + +```json +{ + "id": 123, + "timestamp": "2025-10-25T10:30:00Z", + "duration": 300, + "data": { + "app": "Google Chrome", + "title": "Linear - Product Roadmap", + "url": "https://linear.app/...", + "$category": ["Work", "Product"] // ← User-defined labels + } +} +``` + +**Categories can be**: +- Set manually (user clicks in AW UI) +- Set via regex rules (AW's category watcher) +- Set via API calls (our script) + +--- + +## Tiny Script Architecture + +``` +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Zenborg Areas │ +│ - Product Work │ +│ - Data Work │ +│ - UX Work │ +│ - Strategy Work │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + │ POST /api/aw/sync-categories + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ ActivityWatch REST API │ +│ http://localhost:5600 │ +│ │ +│ /api/0/buckets/ │ +│ /api/0/events/ │ +│ /api/0/query/ │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + │ GET events with $category + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Zenborg UI - Alignment View │ +│ "You spent 2h on Product Work" │ +│ "Last 15min: Linear (Product)" │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ +``` + +--- + +## Implementation + +### 1. 
ActivityWatch Client (TypeScript)
+
+```typescript
+// src/infrastructure/activitywatch/aw-client.ts
+
+const AW_BASE_URL = 'http://localhost:5600'
+
+interface AWEvent {
+  id: number
+  timestamp: string
+  duration: number
+  data: {
+    app: string
+    title: string
+    url?: string
+    $category?: string[] // ActivityWatch categories
+  }
+}
+
+interface AWBucket {
+  id: string
+  name: string
+  type: string
+  hostname: string
+}
+
+export class ActivityWatchClient {
+
+  // Check if AW is running
+  async isRunning(): Promise<boolean> {
+    try {
+      const response = await fetch(`${AW_BASE_URL}/api/0/info`)
+      return response.ok
+    } catch {
+      return false
+    }
+  }
+
+  // Get all buckets (watchers)
+  async getBuckets(): Promise<AWBucket[]> {
+    const response = await fetch(`${AW_BASE_URL}/api/0/buckets/`)
+    if (!response.ok) throw new Error('Failed to fetch buckets')
+    return response.json()
+  }
+
+  // Get events from a bucket (last N minutes)
+  async getEvents(
+    bucketId: string,
+    startTime: Date,
+    endTime: Date
+  ): Promise<AWEvent[]> {
+    const params = new URLSearchParams({
+      start: startTime.toISOString(),
+      end: endTime.toISOString(),
+      limit: '100'
+    })
+
+    const response = await fetch(
+      `${AW_BASE_URL}/api/0/buckets/${bucketId}/events?${params}`
+    )
+
+    if (!response.ok) throw new Error('Failed to fetch events')
+    return response.json()
+  }
+
+  // Get aggregated activity for last N minutes
+  async getRecentActivity(minutes: number = 15): Promise<AWEvent[]> {
+    const buckets = await this.getBuckets()
+
+    // Find window watcher bucket (aw-watcher-window_*)
+    const windowBucket = buckets.find(b =>
+      b.id.startsWith('aw-watcher-window')
+    )
+
+    if (!windowBucket) throw new Error('No window watcher found')
+
+    const endTime = new Date()
+    const startTime = new Date(endTime.getTime() - minutes * 60 * 1000)
+
+    return this.getEvents(windowBucket.id, startTime, endTime)
+  }
+
+  // Use AW's query API for advanced aggregation
+  async queryActivity(query: string): Promise<any> {
+    const response = await 
fetch(`${AW_BASE_URL}/api/0/query/`, {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({
+        timeperiods: [`${new Date().toISOString()}/PT1H`], // last hour
+        query: query
+      })
+    })
+
+    if (!response.ok) throw new Error('Query failed')
+    return response.json()
+  }
+
+  // Get time spent per category (last N hours)
+  async getTimeByCategory(hours: number = 1): Promise<Record<string, number>> {
+    // AW query language to aggregate by category
+    const query = `
+      events = query_bucket(find_bucket("aw-watcher-window"));
+      events = filter_keyvals(events, "$category", []);
+      events = categorize(events, [[["Work"], {"regex": "Linear|Notion|Figma"}]]);
+      duration_by_category = sum_durations_by_key(events, "$category");
+      RETURN = duration_by_category;
+    `
+
+    const result = await this.queryActivity(query)
+    return result[0] // First timeperiod result
+  }
+}
+```
+
+### 2. Sync Zenborg Areas → AW Categories
+
+```typescript
+// src/application/use-cases/sync-areas-to-aw.ts
+
+import { ActivityWatchClient } from '@/infrastructure/activitywatch/aw-client'
+import { Area } from '@/domain/entities/area'
+
+export async function syncAreasToAW(areas: Area[]): Promise<void> {
+  const awClient = new ActivityWatchClient()
+
+  // Check if AW is running
+  const isRunning = await awClient.isRunning()
+  if (!isRunning) {
+    console.warn('ActivityWatch not running, skipping sync')
+    return
+  }
+
+  // For tiny version: just log categories
+  // User will manually set them in AW UI or via regex rules
+  console.log('Zenborg Areas → ActivityWatch Categories:')
+  areas.forEach(area => {
+    console.log(`  - ${area.name}: ${area.themeKeywords?.join(', ')}`)
+  })
+
+  // Future: Auto-create categorization rules via AW API
+  // (AW doesn't have a public API for this yet, needs manual config)
+}
+```
+
+### 3. 
Fetch & Display Alignment
+
+```typescript
+// src/application/use-cases/get-alignment-status.ts
+
+import { ActivityWatchClient } from '@/infrastructure/activitywatch/aw-client'
+import { Moment } from '@/domain/entities/moment'
+
+export interface AlignmentStatus {
+  moment: Moment
+  lastActivity: {
+    app: string
+    title: string
+    duration: number
+    category?: string[]
+  }[]
+  aligned: boolean   // true if category matches moment.area.name
+  totalTime: number  // seconds in last 15 min
+}
+
+export async function getAlignmentStatus(
+  currentMoment: Moment | null
+): Promise<AlignmentStatus | null> {
+  if (!currentMoment) return null
+
+  const awClient = new ActivityWatchClient()
+
+  // Get last 15 minutes of activity
+  const events = await awClient.getRecentActivity(15)
+
+  // Group by app/title
+  const activitySummary = events.reduce((acc, event) => {
+    const key = `${event.data.app} - ${event.data.title}`
+    if (!acc[key]) {
+      acc[key] = {
+        app: event.data.app,
+        title: event.data.title,
+        duration: 0,
+        category: event.data.$category
+      }
+    }
+    acc[key].duration += event.duration
+    return acc
+  }, {} as Record<string, { app: string; title: string; duration: number; category?: string[] }>)
+
+  const lastActivity = Object.values(activitySummary)
+    .sort((a, b) => b.duration - a.duration)
+
+  // Check if aligned: does any activity's category match moment's area?
+  const aligned = lastActivity.some(activity =>
+    activity.category?.includes(currentMoment.area.name)
+  )
+
+  const totalTime = lastActivity.reduce((sum, a) => sum + a.duration, 0)
+
+  return {
+    moment: currentMoment,
+    lastActivity,
+    aligned,
+    totalTime
+  }
+}
+```
+
+### 4. 
Simple UI Component
+
+```tsx
+// src/components/ActivityWatchStatus.tsx
+
+'use client'
+
+import { useEffect, useState } from 'react'
+import { getAlignmentStatus, AlignmentStatus } from '@/application/use-cases/get-alignment-status'
+import { useMomentStore } from '@/infrastructure/state/moment-store'
+
+export function ActivityWatchStatus() {
+  const [status, setStatus] = useState<AlignmentStatus | null>(null)
+  const currentMoment = useMomentStore(state => state.getCurrentMoment())
+
+  useEffect(() => {
+    // Poll every 5 minutes
+    const interval = setInterval(async () => {
+      if (currentMoment) {
+        const newStatus = await getAlignmentStatus(currentMoment)
+        setStatus(newStatus)
+      }
+    }, 5 * 60 * 1000)
+
+    // Initial fetch
+    if (currentMoment) {
+      getAlignmentStatus(currentMoment).then(setStatus)
+    }
+
+    return () => clearInterval(interval)
+  }, [currentMoment])
+
+  if (!status) return null
+
+  // Fixed-position indicator in the corner of the viewport
+  return (
+    <div style={{ position: 'fixed', bottom: 16, right: 16 }}>
+      <div>
+        Current: {status.moment.name} ({status.moment.area.name})
+      </div>
+
+      <div>
+        Status: {status.aligned ? (
+          <span>🧭 ↑ Aligned</span>
+        ) : (
+          <span>🧭 ↙ Drifting</span>
+        )}
+      </div>
+
+      <div>
+        Last 15 min:
+        {status.lastActivity.slice(0, 3).map((activity, i) => (
+          <div key={i}>
+            • {activity.app} ({Math.floor(activity.duration / 60)}m)
+            {activity.category && (
+              <span> [{activity.category.join(', ')}]</span>
+            )}
+          </div>
+        ))}
+      </div>
+    </div>
+ ) +} +``` + +--- + +## User Setup (Manual) + +### 1. Install ActivityWatch + +```bash +# macOS +brew install --cask activitywatch + +# Linux +wget https://github.com/ActivityWatch/activitywatch/releases/latest/download/activitywatch-linux-x86_64.zip +unzip activitywatch-linux-x86_64.zip +./activitywatch/aw-qt + +# Windows +# Download from https://activitywatch.net/downloads/ +``` + +### 2. Configure Categories (Manual) + +Open ActivityWatch UI (http://localhost:5600): + +**Settings → Categories → Add Rules**: + +``` +Product Work: + - regex: "Linear|Notion|Jira|Asana|PRD" + - regex: "#product" + +Data Work: + - regex: "Jupyter|Python|SQL|Postgres|dbt" + - regex: "\.ipynb|\.py|\.sql" + +UX Work: + - regex: "Figma|Framer|Sketch|Design" + - regex: "\.tsx|\.css|Tailwind" + +Strategy Work: + - regex: "Docs|Notes|Obsidian|Research" + - regex: "Strategy|Planning|Reflection" +``` + +### 3. Test Zenborg Integration + +```bash +# In Zenborg project +npm install + +# Add AW client component to layout +# (see implementation above) + +# Start Zenborg +npm run dev + +# Open browser, allocate a moment +# Wait 5 minutes, see status update +``` + +--- + +## Testing Script (Standalone) + +For quick testing without full Zenborg integration: + +```typescript +// scripts/test-aw-integration.ts + +import { ActivityWatchClient } from '../src/infrastructure/activitywatch/aw-client' + +async function main() { + const client = new ActivityWatchClient() + + console.log('🧭 Testing ActivityWatch Integration\n') + + // 1. Check if running + const isRunning = await client.isRunning() + console.log(`āœ“ ActivityWatch running: ${isRunning}`) + + if (!isRunning) { + console.log('āŒ Please start ActivityWatch first') + process.exit(1) + } + + // 2. Get buckets + const buckets = await client.getBuckets() + console.log(`āœ“ Found ${buckets.length} buckets:`) + buckets.forEach(b => console.log(` - ${b.id} (${b.type})`)) + + // 3. 
Get last 15 min activity
+  console.log('\nšŸ“Š Last 15 minutes of activity:')
+  const events = await client.getRecentActivity(15)
+
+  const summary = events.reduce((acc, event) => {
+    const key = event.data.app
+    if (!acc[key]) acc[key] = 0
+    acc[key] += event.duration
+    return acc
+  }, {} as Record<string, number>)
+
+  Object.entries(summary)
+    .sort(([, a], [, b]) => b - a)
+    .forEach(([app, duration]) => {
+      const minutes = Math.floor(duration / 60)
+      console.log(`  - ${app}: ${minutes}m ${Math.floor(duration % 60)}s`)
+    })
+
+  // 4. Check for categorized events
+  console.log('\nšŸ·ļø Categorized events:')
+  const categorized = events.filter(e => e.data.$category && e.data.$category.length > 0)
+
+  if (categorized.length === 0) {
+    console.log('  āš ļø No categorized events found')
+    console.log('  Set up categories in AW UI: http://localhost:5600')
+  } else {
+    categorized.forEach(e => {
+      console.log(`  - ${e.data.app}: [${e.data.$category?.join(', ')}]`)
+    })
+  }
+}
+
+main().catch(console.error)
+```
+
+Run it:
+
+```bash
+npx tsx scripts/test-aw-integration.ts
+```
+
+---
+
+## Alignment Logic (No AI)
+
+**Simple rule**: Activity is "aligned" if:
+- Activity's `$category` matches current moment's `area.name`
+
+**Example**:
+
+```typescript
+// User is working on moment "Product Spec" (area: "Product Work")
+// Last 15 min activity:
+
+const recentActivity = [
+  { app: "Linear", category: ["Product Work"], duration: 600 },
+  { app: "Slack", category: ["Communication"], duration: 180 },
+  { app: "Chrome - Twitter", category: null, duration: 120 }
+]
+
+// Alignment calculation:
+const productTime = 600  // Linear
+const otherTime = 300    // Slack + Twitter
+
+const aligned = productTime > otherTime  // true
+```
+
+**Compass state**:
+- `🧭 ↑ Aligned` if > 50% of time in matching category
+- `🧭 ↙ Drifting` if < 50% of time in matching category
+- `🧭 ā—‹ Untracked` if no categorized events
+
+---
+
+## Advantages of Tiny Version
+
+āœ… **Zero AI complexity**: No models, no training, no classification
+āœ… 
**Uses existing AW features**: Categories already built-in
+āœ… **Fast to build**: 2-4 hours total (vs. weeks for AI version)
+āœ… **Tests core integration**: Validates AW API works, data flows correctly
+āœ… **User can manually tune**: Regex rules are transparent and editable
+
+---
+
+## Limitations (To Address Later)
+
+āŒ **Manual category setup**: User must configure regex rules in AW
+āŒ **No semantic understanding**: "Slack #product-team" won't auto-match "Product Work"
+āŒ **Requires user discipline**: If categories not set, shows no alignment
+āŒ **No learning**: Rules are static, don't improve over time
+
+**Solution**: Once this works, add Transformer.js on top for semantic classification
+
+---
+
+## Next Steps
+
+1. **Build client** (`aw-client.ts`) - 1 hour
+2. **Test with script** (`test-aw-integration.ts`) - 30 min
+3. **Add UI component** (`ActivityWatchStatus.tsx`) - 1 hour
+4. **Manual testing** (configure categories, use Zenborg) - 1 hour
+5. **Decide**: Does basic integration work? → Add AI layer
+
+**Total**: 2-4 hours to validate core hypothesis
+
+---
+
+## Future: AI Layer on Top
+
+Once manual labeling works:
+
+```typescript
+// Hybrid approach: Use categories as hints, AI for unlabeled
+
+async function classifyActivity(
+  activity: AWEvent,
+  moment: Moment
+): Promise<'aligned' | 'drifting'> {
+
+  // 1. If already categorized by user, trust it
+  if (activity.data.$category?.includes(moment.area.name)) {
+    return 'aligned'
+  }
+
+  // 2. If no category, ask Transformer.js
+  const classifier = await pipeline('zero-shot-classification', 'facebook/bart-large-mnli')
+  const result = await classifier(
+    `${activity.data.app}: ${activity.data.title}`,
+    [`${moment.area.name}`, 'unrelated work', 'distraction']
+  )
+
+  return result.labels[0] === moment.area.name ? 
'aligned' : 'drifting' +} +``` + +**Best of both worlds**: +- User rules = fast, transparent, trustworthy +- AI classification = fills gaps, handles edge cases + +--- + +**Status**: Ready to implement +**Owner**: Thopiax +**Timeline**: 2-4 hours + +--- + +*"Start with the simplest thing that could work. If manual labeling proves the integration, add AI later."* From 280b727562cbbdc5e7bde1d0128d3c1a9f12403e Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 12:23:22 +0000 Subject: [PATCH 5/5] docs: Fix concept - match moments (not areas) in classification MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Corrected fundamental misunderstanding across all three documents: **Before (incorrect)**: - Predefined "area themes" (Product/Data/UX/Strategy) - Hardcoded keywords per area - Classification matched activity → area theme - Required maintaining theme taxonomy **After (correct per CLAUDE.md)**: - Areas are life domains (Wellness, Craft, Social, Joyful, Introspective) - Moments are specific intentions ("Product Spec", "Data Analysis") - Classification matches activity → current moment name - Moment names are self-descriptive, no keywords needed **Why this is better**: 1. More specific matching (moment-level vs area-level) 2. No hardcoded themes to maintain 3. Moment names already provide semantic context 4. 
Aligns with Zenborg's core domain model **Changes across all docs**: - PRD: Use moment.name as semantic anchor in classification - Critical path: Remove "theme" prompt, just ask for moment name - Tiny version: Match AW categories to moment names (not areas) - Appendix: Replace hardcoded themes with moment examples **Classification approach now**: ```typescript // Zero-shot classifier(activityDescription, [ `working on: ${moment.name}`, // e.g., "working on: Product Spec" 'distracted or browsing', 'transitional activity', 'no activity' ]) // Semantic similarity embed(moment.name) // Just the moment name - it's self-descriptive! ``` šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md | 47 +++--- docs/prds/ACTIVITY_WATCH_PROJECT.md | 168 ++++++++-------------- docs/prds/ACTIVITY_WATCH_TINY_VERSION.md | 96 ++++++++----- 3 files changed, 137 insertions(+), 174 deletions(-) diff --git a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md index 3cd4243..db71daf 100644 --- a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md +++ b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md @@ -58,9 +58,9 @@ const classifier = await pipeline( ) while (true) { - // 1. Get current intention from user - const intention = await promptUser("What are you working on?") - const theme = await promptUser("Theme? (product/data/ux/strategy)") + // 1. Get current intention from user (just the moment name) + const momentName = await promptUser("What are you working on?") + // e.g., "Product Spec", "Data Analysis", "Morning Run" // 2. Poll ActivityWatch every 5 minutes await sleep(5 * 60 * 1000) @@ -81,14 +81,14 @@ while (true) { .join(', ') const description = ` - Working on: ${intention} (${theme} work) + Current intention: ${momentName} Recent activity: ${activityText} ` - // 6. Classify with Transformer.js + // 6. 
Classify with Transformer.js (moment name is semantic anchor) const result = await classifier(description, [ - 'aligned with stated intention', - 'drifting from stated intention', + `working on: ${momentName}`, // e.g., "working on: Product Spec" + 'distracted or browsing unrelated content', 'neutral or transitional activity', 'no significant activity' ]) @@ -116,7 +116,7 @@ while (true) { ### Day 1: Personal Dogfooding **Morning Session (3 hours)**: -- Set intention: "Product Spec" (Product theme) +- Set intention: "Product Spec" (just the moment name) - Work normally for 3 hours - Observe compass updates every 5 min - Note: When did you notice drift? Did you self-correct? @@ -128,7 +128,7 @@ while (true) { - Was 5-min polling too slow/too fast? **Afternoon Session (3 hours)**: -- Set intention: "Data Analysis" (Data theme) +- Set intention: "Data Analysis" - Intentionally drift to Twitter/email after 30 min - Observe: How long until compass shows "drifting"? - Self-correct: Does returning to Jupyter change compass back to "aligned"? @@ -188,12 +188,9 @@ $ npm run test-compass Loading classifier... (first run downloads BART model ~400MB) āœ“ Classifier ready (facebook/bart-large-mnli) -What are you working on? (3 words max) +What are you working on? (moment name, 1-3 words) > Product Spec -Theme? (product/data/ux/strategy) -> product - āœ“ Monitoring ActivityWatch every 5 minutes... Press Ctrl+C to stop or change intention @@ -434,19 +431,17 @@ const relatedNotes = journalNotes .sort((a, b) => b.score - a.score) ``` -**2. Auto-Tagging Notes** +**2. 
Auto-Linking Notes to Moments** ```typescript -// Tag note with theme (product/data/ux/strategy) +// Find which moment this journal note relates to const classifier = await pipeline('zero-shot-classification', 'facebook/bart-large-mnli') -const result = await classifier(journalNote.content, [ - 'product work and prioritization', - 'data analysis and experiments', - 'UX design and prototyping', - 'strategic thinking and planning' -]) +// Get all active moments +const moments = ["Product Spec", "Data Analysis", "Morning Run", "Deep Reading"] + +const result = await classifier(journalNote.content, moments) -journalNote.theme = result.labels[0] +journalNote.relatedMoment = result.labels[0] // Most likely moment journalNote.confidence = result.scores[0] ``` @@ -463,12 +458,12 @@ const similar = pastMoments ### Integration Points -- **Moment creation**: Suggest related journal notes -- **Journal writing**: Auto-tag with themes from current moment +- **Moment creation**: Suggest related journal notes based on moment name +- **Journal writing**: Auto-link notes to the current moment you're working on - **Reflection**: "Show me notes from when I worked on similar moments" -- **Search**: Semantic search across all notes and moments +- **Search**: Semantic search across all notes and moments by name/content -**Advantage**: One model download, multiple features. Zero-config semantic intelligence across the whole app. +**Advantage**: One model download, multiple features. Zero-config semantic intelligence across the whole app. No need to manually tag or categorize anything. --- diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md index 2219bc8..b96a2d2 100644 --- a/docs/prds/ACTIVITY_WATCH_PROJECT.md +++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md @@ -111,36 +111,11 @@ Current: "Product Spec" ā˜• Morning ### Data Model Extensions -**Area** (extended from Zenborg core): -```typescript -interface Area { - // ... existing fields ... 
- themeKeywords?: string[] // ["linear", "notion", "spec", "roadmap"] - themeDescription?: string // "Product work: writing specs, prioritizing..." -} -``` - -**Default Area Themes** (for user "Thopiax"): -```typescript -const DEFAULT_THEMES = { - "Product": { - keywords: ["linear", "notion", "spec", "roadmap", "jira", "prd"], - description: "Writing specs, scopes, prioritizing features" - }, - "Data": { - keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas"], - description: "Exploring data, writing models, running batches, experiments" - }, - "UX": { - keywords: ["figma", "framer", "prototype", "design", "css", "component"], - description: "Prototyping, fine-tuning interfaces" - }, - "Strategy": { - keywords: ["docs", "notes", "research", "reading", "writing"], - description: "Slow, deliberate thinking and planning" - } -} -``` +**Note on Areas vs Moments**: +- **Areas** are life domains (Wellness, Craft, Social, Joyful, Introspective) per CLAUDE.md +- **Moments** are specific intentions like "Product Spec", "Data Analysis", "Morning Run" +- Classification matches activity → **current moment**, not area +- Moment names provide semantic context (e.g., "Product Spec" implies Linear/Notion/specs) **AlignmentEvent** (new entity): ```typescript @@ -148,10 +123,9 @@ interface AlignmentEvent { id: string // UUID momentId: string // FK to Moment timestamp: string // ISO timestamp - classification: AlignmentType // "aligned" | "neutral" | "drifting" + classification: AlignmentType // "aligned" | "neutral" | "drifting" | "untracked" confidence: number // 0.0-1.0 observedActivities: ActivitySummary[] - themeDetected: string | null // "product", "data", etc. 
createdAt: string } @@ -193,45 +167,26 @@ const classifier = await pipeline( 'facebook/bart-large-mnli' ) -// Define candidate labels based on moment's theme -const labels = { - aligned: [ - moment.area.themeDescription, // "Writing specs, prioritizing features" - ...moment.area.keywords, // ["linear", "notion", "spec"] - ], - drifting: [ - "social media browsing", - "news reading", - "entertainment", - "unrelated work" - ], - neutral: [ - "email communication", - "team chat", - "quick searches", - "context switching" - ] -} - // Build activity description from AW events const activityDescription = ` -User is working on: "${moment.name}" (${moment.area.name} - ${moment.area.themeDescription}) +Current intention: "${moment.name}" (${moment.area.name}) +Context: User committed to working on this during ${phase}. Recent activity (last 15 min): ${activity.map(a => `- ${a.app}: ${a.windowTitle} (${a.duration}s)`).join('\n')} ` -// Classify alignment +// Classify alignment using moment name as semantic anchor const result = await classifier(activityDescription, [ - 'aligned with stated intention', - 'drifting from stated intention', - 'neutral or transitional activity', - 'no significant digital activity' + `working on: ${moment.name}`, // e.g., "working on: Product Spec" + 'distracted or browsing unrelated content', + 'transitional activity like email or chat', + 'no significant activity observed' ]) // Map to AlignmentType const classification = mapToAlignment(result.labels[0], result.scores[0]) -// { classification: "aligned", confidence: 0.89, themeDetected: "product" } +// { classification: "aligned", confidence: 0.89 } ``` **Alternative: Semantic Similarity** (faster, simpler): @@ -245,10 +200,9 @@ const embedder = await pipeline( 'Xenova/all-MiniLM-L6-v2' ) -// Embed intention -const intentionEmbedding = await embedder( - `${moment.name}: ${moment.area.themeDescription}` -) +// Embed intention (just the moment name - it's self-descriptive) +const 
intentionEmbedding = await embedder(moment.name) +// e.g., "Product Spec" or "Morning Run" // Embed observed activity const activityEmbedding = await embedder( @@ -271,11 +225,10 @@ const classification = interface ClassificationResult { classification: AlignmentType // "aligned" | "neutral" | "drifting" | "untracked" confidence: number // 0.0-1.0 (from model scores) - themeDetected: string | null // "product" | "data" | "ux" | "strategy" method: 'zero-shot' | 'similarity' // which approach was used } -// Store in IndexedDB as AlignmentEvent +// Store in IndexedDB as AlignmentEvent (linked to moment via momentId) ``` --- @@ -619,55 +572,50 @@ interface ClassificationResult { --- -## Appendix: User's Default Themes +## Appendix: Example Moment-to-Activity Mappings -**For "Thopiax" (MVP hardcoded)**: +**How Semantic Classification Works**: -```typescript -export const THOPIAX_THEMES = { - "Product Work": { - keywords: ["linear", "notion", "jira", "asana", "roadmap", "spec", "prd", "priorit"], - description: "Writing specs, scopes, prioritizing features, planning roadmaps", - exampleActivities: [ - "Linear - Product Roadmap Q2", - "Notion - PRD: New Onboarding Flow", - "Slack - #product-team" - ] - }, - "Data Work": { - keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas", "numpy", "colab"], - description: "Exploring data, writing models, running batches, tweaking experiments", - exampleActivities: [ - "Jupyter Notebook - user_retention_analysis.ipynb", - "pgAdmin - Query: weekly_active_users", - "Terminal - python run_experiment.py" - ] - }, - "UX Work": { - keywords: ["figma", "framer", "sketch", "prototype", "design", "component", "css", "tailwind"], - description: "Prototyping interfaces, fine-tuning designs, iterating on components", - exampleActivities: [ - "Figma - Zenborg Compass Redesign", - "VS Code - MomentCard.tsx", - "Chrome - Tailwind CSS Docs" - ] - }, - "Strategy Work": { - keywords: ["docs", "notion", "notes", "obsidian", "research", 
"reading", "writing", "plan"], - description: "Slow, deliberate thinking, strategic planning, deep reading", - exampleActivities: [ - "Google Docs - Q3 Strategy Draft", - "Notion - Weekly Reflection", - "Safari - Reading: Shape Up (Basecamp)" - ] - } -} +Moment names are self-descriptive. The classifier matches observed activity against the moment name semantically: + +**Example 1: "Product Spec" (Area: Craft)** +``` +Moment: "Product Spec" +Observed: Linear, Notion, Slack #product-team +Classification: āœ“ Aligned (semantic match with spec/planning work) + +Moment: "Product Spec" +Observed: Twitter, Hacker News +Classification: āœ— Drifting (no semantic connection) +``` + +**Example 2: "Data Analysis" (Area: Craft)** +``` +Moment: "Data Analysis" +Observed: Jupyter Notebook, pgAdmin, Python +Classification: āœ“ Aligned (semantic match with data/analysis work) + +Moment: "Data Analysis" +Observed: Figma, Design System Docs +Classification: āœ— Drifting (different domain - design vs. data) +``` + +**Example 3: "Morning Run" (Area: Wellness)** +``` +Moment: "Morning Run" +Observed: No digital activity +Classification: ? Untracked (expected for physical activity) + +Moment: "Morning Run" +Observed: Strava, Spotify +Classification: āœ“ Aligned (related apps for running) ``` -**Usage in Classification**: -- When moment.area matches theme name, use corresponding keywords/description -- LLM considers semantic overlap (e.g., "Slack #product-team" → Product Work) -- Themes evolve with user (future: custom theme editor) +**Key Insight**: No hardcoded keywords needed. 
The model understands semantic relationships: +- "Product Spec" → Linear, Notion, planning tools +- "Data Analysis" → Jupyter, SQL, Python +- "UX Prototype" → Figma, design tools +- "Morning Run" → fitness apps or no digital activity --- diff --git a/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md index fa450e7..16b5857 100644 --- a/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md +++ b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md @@ -11,9 +11,9 @@ Instead of AI classification, **manually label activities** using ActivityWatch's built-in category system: 1. User downloads & runs ActivityWatch themselves -2. Zenborg syncs Areas → ActivityWatch categories/labels -3. User manually labels activities in ActivityWatch UI (or via script) -4. Zenborg fetches labeled data to show alignment +2. User manually labels activities in ActivityWatch UI with moment-like names +3. Zenborg fetches labeled data and matches against current moment +4. Shows alignment: does activity label match current moment? **No AI. No classification. 
Just basic CRUD operations on localhost.** @@ -48,14 +48,11 @@ ActivityWatch has a built-in **event classification system**: ``` ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” -│ Zenborg Areas │ -│ - Product Work │ -│ - Data Work │ -│ - UX Work │ -│ - Strategy Work │ +│ Zenborg Current Moment │ +│ "Product Spec" (Area: Craft) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ - │ POST /api/aw/sync-categories + │ Fetch events with categories ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ ActivityWatch REST API │ @@ -66,12 +63,13 @@ ActivityWatch has a built-in **event classification system**: │ /api/0/query/ │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ - │ GET events with $category + │ Events with $category labels ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ Zenborg UI - Alignment View │ -│ "You spent 2h on Product Work" │ -│ "Last 15min: Linear (Product)" │ +│ Current: "Product Spec" │ +│ Last 15min: Linear [Product Work] │ +│ 🧭 ↑ Aligned (category matches intent) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ``` @@ -193,31 +191,36 @@ export class ActivityWatchClient { } ``` -### 2. Sync Zenborg Areas → AW Categories +### 2. 
Suggest Category Setup (Read-Only)

```typescript
-// src/application/use-cases/sync-areas-to-aw.ts
+// src/application/use-cases/suggest-aw-categories.ts
import { ActivityWatchClient } from '@/infrastructure/activitywatch/aw-client'
-import { Area } from '@/domain/entities/area'
+import { Moment } from '@/domain/entities/moment'

-export async function syncAreasToAW(areas: Area[]): Promise<void> {
+export async function suggestAWCategories(moments: Moment[]): Promise<void> {
  const awClient = new ActivityWatchClient()

  // Check if AW is running
  const isRunning = await awClient.isRunning()
  if (!isRunning) {
-    console.warn('ActivityWatch not running, skipping sync')
+    console.warn('ActivityWatch not running')
    return
  }

-  // For tiny version: just log categories
-  // User will manually set them in AW UI or via regex rules
-  console.log('Zenborg Areas → ActivityWatch Categories:')
-  areas.forEach(area => {
-    console.log(`  - ${area.name}: ${area.themeKeywords?.join(', ')}`)
+  // For tiny version: just suggest category names based on common moments
+  console.log('šŸ’” Suggested ActivityWatch categories (set up in AW UI):')
+
+  const uniqueMomentNames = [...new Set(moments.map(m => m.name))]
+
+  uniqueMomentNames.forEach(name => {
+    console.log(`  - "${name}"`)
  })

+  console.log('\nšŸ‘‰ Configure these in ActivityWatch UI: http://localhost:5600')
+  console.log('   Settings → Categories → Add Rules')
+
  // Future: Auto-create categorization rules via AW API
  // (AW doesn't have a public API for this yet, needs manual config)
}
@@ -271,9 +274,13 @@ export async function getAlignmentStatus(
  const lastActivity = Object.values(activitySummary)
    .sort((a, b) => b.duration - a.duration)

-  // Check if aligned: does any activity's category match moment's area?
+  // Check if aligned: does any activity's category match moment name? 
+  // Supports exact match or substring match (e.g., category "Product" matches moment "Product Spec")
  const aligned = lastActivity.some(activity =>
-    activity.category?.includes(currentMoment.area.name)
+    activity.category?.some(cat =>
+      cat.toLowerCase().includes(currentMoment.name.toLowerCase()) ||
+      currentMoment.name.toLowerCase().includes(cat.toLowerCase())
+    )
  )

  const totalTime = lastActivity.reduce((sum, a) => sum + a.duration, 0)
@@ -382,24 +389,31 @@
Open ActivityWatch UI (http://localhost:5600):

**Settings → Categories → Add Rules**:

+Configure categories based on your common **moment names** (not areas):
+
```
-Product Work:
-  - regex: "Linear|Notion|Jira|Asana|PRD"
+Product Spec:
+  - regex: "Linear|Notion|Jira|PRD|Spec|Roadmap"
  - regex: "#product"

-Data Work:
-  - regex: "Jupyter|Python|SQL|Postgres|dbt"
+Data Analysis:
+  - regex: "Jupyter|Python|SQL|Postgres|Pandas"
  - regex: "\.ipynb|\.py|\.sql"

-UX Work:
+UX Prototype:
  - regex: "Figma|Framer|Sketch|Design"
-  - regex: "\.tsx|\.css|Tailwind"
+  - regex: "\.tsx|\.css|component"
+
+Deep Reading:
+  - regex: "Docs|PDF|Reader|Articles"
+  - regex: "Reading|Research"

-Strategy Work:
-  - regex: "Docs|Notes|Obsidian|Research"
-  - regex: "Strategy|Planning|Reflection"
+Email:
+  - regex: "Gmail|Outlook|Mail"
```

+**Key**: Category names should match your typical moment names ("Product Spec", "Data Analysis"), not areas ("Craft", "Wellness").
+
### 3. 
Test Zenborg Integration

```bash
npx tsx scripts/test-aw-integration.ts
```

@@ -492,25 +506,26 @@

## Alignment Logic (No AI)

**Simple rule**: Activity is "aligned" if:
-- Activity's `$category` matches current moment's `area.name`
+- Activity's `$category` matches (or relates to) current moment name

**Example**:

```typescript
-// User is working on moment "Product Spec" (area: "Product Work")
+// User is working on moment "Product Spec" (area: Craft)
+// ActivityWatch categories configured to label Linear/Notion as "Product Spec"

// Last 15 min activity:
[
-  { app: "Linear", category: ["Product Work"], duration: 600 },
+  { app: "Linear", category: ["Product Spec"], duration: 600 },
  { app: "Slack", category: ["Communication"], duration: 180 },
  { app: "Chrome - Twitter", category: null, duration: 120 }
]

// Alignment calculation:
-const productTime = 600  // Linear
+const alignedTime = 600  // Linear (category matches moment name)
 const otherTime = 300    // Slack + Twitter

-aligned = productTime > otherTime  // true
+aligned = alignedTime > otherTime  // true
```

**Compass state**:
@@ -518,6 +533,11 @@ aligned = productTime > otherTime  // true
- `🧭 ↙ Drifting` if < 50% of time in matching category
- `🧭 ā—‹ Untracked` if no categorized events

+**Matching logic**:
+- Exact match: moment = "Product Spec", category = "Product Spec" → āœ“
+- Fuzzy match: moment = "Product Spec", category = "Product" → āœ“ (substring match, either direction)
+- No match: moment = "Product Spec", category = "Email" → āœ—
+
---

## Advantages of Tiny Version