From 93396e6eaf3b065d89a77e1c9c367264b06d8626 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 01:57:04 +0000 Subject: [PATCH 1/5] docs: Add ActivityWatch integration PRD MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add comprehensive PRD for semantic attention guardrail system: - Ambient compass indicator for real-time alignment feedback - End-of-phase reflection summaries - Local-first AI classification using Ollama - Zero-config ActivityWatch bundling - Privacy-first design with local LLM processing šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_PROJECT.md | 699 ++++++++++++++++++++++++++++ 1 file changed, 699 insertions(+) create mode 100644 docs/prds/ACTIVITY_WATCH_PROJECT.md diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md new file mode 100644 index 0000000..80e3806 --- /dev/null +++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md @@ -0,0 +1,699 @@ +# ActivityWatch Integration - Semantic Attention Guardrails + +**Status**: Draft +**Target**: MVP Extension (Phase 1) +**Philosophy**: Reduce the distance from intent to action through ambient awareness + +--- + +## Vision + +**The Problem**: We allocate consciousness to "Product Spec" but spend 90 minutes on Twitter. The gap between intention and action is invisible until reflection time - by then, the day is spent. + +**The Solution**: Integrate ActivityWatch as a bundled extension to provide **semantic attention guardrails** - AI-powered ambient feedback that gently closes the gap between stated intention and observed activity. + +**Core Principle**: +> "Technology as a mirror for consciousness, not a taskmaster." + +This is not time tracking. This is **attention alignment detection** using AI to understand the semantic relationship between what you committed to doing and what you're actually doing. 
+ +--- + +## What This Is + +A **passive ambient awareness system** that: +1. Observes computer activity via ActivityWatch +2. Classifies alignment with current moment using local LLM +3. Provides **peripheral feedback** (ambient compass indicator) +4. Offers **reflective summaries** (end-of-phase ritual) + +**Not**: Performance tracking, productivity metrics, nagging notifications, or guilt-inducing dashboards. + +**Is**: A gentle, intelligent mirror that helps you notice drift before hours pass. + +--- + +## User Experience + +### The Ambient Indicator (Passive, Real-time) + +A small compass indicator in the corner of Zenborg: + +``` +Current: "Product Spec" ā˜• Morning + +[Compass widget - collapsed state] +🧭 ↑ (aligned) + +[Compass widget - drift detected] +🧭 ↙ (drifting) +``` + +**Behavior**: +- Updates every 5-10 minutes +- Lives in peripheral vision (top-right corner, can collapse/hide) +- No modal takeovers, no sounds, no badges +- Clicking shows brief summary: "Currently aligned with product work theme" +- Can be dismissed entirely (respects user agency) + +**States**: +- **Aligned** (↑): Activity matches moment's semantic theme +- **Neutral** (↔): Ambiguous (email, Slack, quick searches) +- **Drifting** (↙): Clear misalignment detected +- **Untracked** (ā—‹): No digital activity (reading, meetings, thinking) + +### End-of-Phase Reflection (Passive, Retrospective) + +When a phase completes (e.g., Morning → Afternoon transition): + +``` +ā˜• Morning Complete + +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Product Spec [Craft] │ +│ āœ“ Aligned (2h 15m observed) │ +│ → Linear, Notion, Figma mockups │ +ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ +│ Email Triage [Admin] │ +│ āœ— Drift detected (45m allocated, 12m observed) │ +│ → Spent 1h 20m 
on Twitter/HN instead │ +ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ +│ Deep Reading [Strategy] │ +│ ? Untracked (no digital footprint) │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + +Press any key to continue to Afternoon... +``` + +**Design**: +- Non-blocking (can dismiss immediately) +- Shows up once per phase transition +- No judgement language ("drift detected" not "you failed") +- Acknowledges untracked time as valid (reading, thinking, meetings) + +--- + +## Technical Architecture + +### High-Level Flow + +``` +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Zenborg Core (Phase 1) │ +│ - Moments with Area associations │ +│ - Areas define semantic themes │ +│ - Current moment awareness │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ ActivityWatch Extension Bundle │ +│ - aw-watcher-window (desktop apps) │ +│ - aw-watcher-web (browser tabs/URLs) │ +│ - aw-watcher-afk (idle detection) │ +│ - Local SQLite database │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Activity Collector Service │ +│ - Polls AW database every 5-10 min │ +│ - Aggregates recent events (last 15 min) │ +│ - 
Filters: apps, window titles, URLs, duration │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Semantic Classifier (Local LLM) │ +│ - Ollama/llama.cpp (3B-7B param model) │ +│ - Input: current moment + observed activity │ +│ - Output: alignment classification + confidence │ +│ - Understands work themes semantically │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Ambient Feedback Layer │ +│ - Compass indicator (real-time UI) │ +│ - Phase reflection summary (transition screen) │ +│ - Alignment history (stored in IndexedDB) │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ +``` + +### Data Model Extensions + +**Area** (extended from Zenborg core): +```typescript +interface Area { + // ... existing fields ... + themeKeywords?: string[] // ["linear", "notion", "spec", "roadmap"] + themeDescription?: string // "Product work: writing specs, prioritizing..." 
+} +``` + +**Default Area Themes** (for user "Thopiax"): +```typescript +const DEFAULT_THEMES = { + "Product": { + keywords: ["linear", "notion", "spec", "roadmap", "jira", "prd"], + description: "Writing specs, scopes, prioritizing features" + }, + "Data": { + keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas"], + description: "Exploring data, writing models, running batches, experiments" + }, + "UX": { + keywords: ["figma", "framer", "prototype", "design", "css", "component"], + description: "Prototyping, fine-tuning interfaces" + }, + "Strategy": { + keywords: ["docs", "notes", "research", "reading", "writing"], + description: "Slow, deliberate thinking and planning" + } +} +``` + +**AlignmentEvent** (new entity): +```typescript +interface AlignmentEvent { + id: string // UUID + momentId: string // FK to Moment + timestamp: string // ISO timestamp + classification: AlignmentType // "aligned" | "neutral" | "drifting" + confidence: number // 0.0-1.0 + observedActivities: ActivitySummary[] + themeDetected: string | null // "product", "data", etc. + createdAt: string +} + +interface ActivitySummary { + app: string + windowTitle: string + url?: string + duration: number // seconds +} + +type AlignmentType = "aligned" | "neutral" | "drifting" | "untracked" +``` + +### LLM Classification Service + +**Local Model Options** (ranked by preference): +1. **Ollama** with Llama 3.2 3B (fastest, good balance) +2. **llama.cpp** with Phi-3 Mini (smallest, edge devices) +3. **Fallback**: Claude API (privacy implications, requires API key) + +**Classification Prompt Template**: +```typescript +const CLASSIFICATION_PROMPT = `You are an attention alignment classifier for a mindful productivity system. 
+ +CURRENT INTENTION: +- Moment: "${moment.name}" +- Area: ${moment.area.name} +- Theme: ${moment.area.themeDescription} +- Phase: ${phase} (${phaseEmoji}) + +OBSERVED ACTIVITY (last 15 min): +${activitySummary} + +TASK: Classify alignment as: +- ALIGNED: Activity clearly matches the stated intention and theme +- NEUTRAL: Ambiguous or transitional (email, Slack, quick searches, switching contexts) +- DRIFTING: Clear misalignment with stated intention +- UNTRACKED: No significant digital activity detected + +GUIDELINES: +- Consider semantic meaning, not just keywords + (e.g., "Slack #product-team" is aligned with product work) +- Short diversions (<2 min) are NEUTRAL, not drifting +- Respect nuance: research on Twitter for a product spec is aligned +- If no clear activity, classify as UNTRACKED (not a failure) + +OUTPUT (JSON only, no explanation): +{ + "classification": "aligned" | "neutral" | "drifting" | "untracked", + "confidence": 0.0-1.0, + "themeDetected": "product" | "data" | "ux" | "strategy" | null, + "briefReason": "Short explanation (max 10 words)" +}`; +``` + +**Response Parsing**: +```typescript +interface ClassificationResult { + classification: AlignmentType + confidence: number + themeDetected: string | null + briefReason: string +} + +// Store in IndexedDB as AlignmentEvent +``` + +--- + +## Implementation Phases + +### Phase 1a: ActivityWatch Bundling (Week 1) +**Goal**: Ship Zenborg with AW pre-configured, zero user setup + +**Tasks**: +1. Bundle AW binaries for macOS/Linux/Windows +2. Auto-start AW server on Zenborg launch (background process) +3. Install default watchers (window, web, afk) +4. Health check: verify AW is running, show status in settings +5. 
Graceful fallback: if AW fails, hide extension UI (no crash) + +**Acceptance**: +- User installs Zenborg → AW runs automatically +- No manual AW installation required +- Settings page shows "ActivityWatch: Running āœ“" + +--- + +### Phase 1b: Activity Collection (Week 1) +**Goal**: Poll AW database and aggregate recent activity + +**Tasks**: +1. AW SQLite database reader (or REST API client) +2. Service: poll every 5-10 min for last 15 min of events +3. Aggregate by app/window/URL with durations +4. Filter noise (< 10 sec interactions, system processes) +5. Store raw events temporarily (in-memory, not persisted) + +**Acceptance**: +- Console logs show aggregated activity every 5 min +- Events correctly grouped by app/window +- Idle time excluded from aggregation + +--- + +### Phase 1c: Local LLM Integration (Week 2) +**Goal**: Classify alignment using Ollama locally + +**Tasks**: +1. Detect Ollama installation (or prompt user to install) +2. Auto-pull lightweight model (Llama 3.2 3B) +3. Build classification prompt from current moment + activity +4. Call Ollama API (http://localhost:11434) +5. Parse JSON response → AlignmentEvent +6. 
Store classifications in IndexedDB (not raw activity) + +**Acceptance**: +- Classification runs locally, no external API calls +- Response time < 2 seconds +- Confidence scores calibrated (>0.7 for aligned/drifting) +- Errors gracefully handled (show "untracked" if LLM fails) + +--- + +### Phase 1d: Ambient Compass Indicator (Week 2) +**Goal**: Show real-time alignment in peripheral vision + +**UI Component**: +```tsx + +``` + +**States**: +- **Aligned**: 🧭 ↑ (green tint) +- **Neutral**: 🧭 ↔ (gray) +- **Drifting**: 🧭 ↙ (amber, not red - no guilt) +- **Untracked**: 🧭 ā—‹ (faded) + +**Interactions**: +- Click → expand brief reason ("Aligned with product work theme") +- Double-click → hide for 1 hour (respects user agency) +- Settings toggle: disable entirely + +**Design**: +- Monochrome base (stone-200 border) +- Subtle color accent (area color, low opacity) +- Small: 48px Ɨ 48px collapsed, 200px Ɨ 80px expanded +- No animations (calm tech) + +**Acceptance**: +- Updates within 10 seconds of classification +- No performance impact (< 1% CPU) +- Can be dismissed/hidden +- Accessible (ARIA labels, keyboard nav) + +--- + +### Phase 1e: End-of-Phase Reflection (Week 3) +**Goal**: Show alignment summary at phase transitions + +**Trigger**: When current phase ends (based on PhaseConfig.endHour) + +**UI**: +- Overlay (not modal - can click through) +- Shows all moments from completed phase +- For each moment: + - Alignment status (āœ“ aligned, āœ— drifting, ? 
untracked) + - Observed duration (aggregated from AW events) + - Top 3 apps/activities +- Press any key or click to dismiss + +**Data**: +- Query AlignmentEvents for completed phase +- Aggregate classifications by moment +- Calculate time spent per classification type +- Do NOT show percentages or scores (no gamification) + +**Acceptance**: +- Appears automatically at phase transition +- Non-blocking (can dismiss immediately) +- Shows accurate time aggregations +- Works offline (uses cached data) + +--- + +### Phase 1f: Settings & Privacy (Week 3) +**Goal**: User control over data collection and feedback + +**Settings Panel** (`:settings` command): +``` +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ ActivityWatch Integration │ +ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ +│ ā˜‘ Enable attention guardrails │ +│ ā˜‘ Show ambient compass indicator │ +│ ā˜‘ Show end-of-phase reflections │ +│ │ +│ Classification interval: [5 min] [10 min] [15] │ +│ LLM Backend: [Ollama (local)] [Claude API] │ +│ │ +│ Privacy: │ +│ ā˜‘ Process data locally only │ +│ ☐ Allow cloud LLM fallback (requires API key) │ +│ │ +│ Data Retention: │ +│ Keep alignment history: [7 days] [30] [Forever] │ +│ [Clear all ActivityWatch data] │ +│ │ +│ Status: │ +│ ActivityWatch: Running āœ“ │ +│ Ollama: Connected āœ“ (Llama 3.2 3B) │ +│ Last classification: 2 minutes ago │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ +``` + +**Privacy Guarantees**: +- Raw AW events never leave the machine (unless user opts into cloud LLM) +- Only classification results stored (not window titles/URLs) +- User can clear all data anytime +- AW can be disabled entirely (extension becomes dormant) + 
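The data-retention setting above reduces to a single pruning pass over stored `AlignmentEvent`s, since classification results are the only thing persisted. A minimal sketch, assuming the helper name and an in-memory filter for clarity (the real version would delete via an IndexedDB cursor):

```typescript
// Sketch: drop alignment events older than the configured retention window.
// Only classification results are stored, so this is the entire deletion surface.
interface AlignmentEventLike {
  id: string
  timestamp: string // ISO timestamp
}

function pruneAlignmentEvents<T extends AlignmentEventLike>(
  events: T[],
  retentionDays: number,
  now: Date = new Date()
): T[] {
  // Infinity models the "Forever" option from the settings panel.
  if (!Number.isFinite(retentionDays)) return events
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000
  return events.filter(e => Date.parse(e.timestamp) >= cutoff)
}
```

The same predicate can drive the "Clear all ActivityWatch data" button with `retentionDays = 0`; filtering in memory here just keeps the sketch self-contained.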
+**Acceptance**: +- All toggles functional +- Data deletion works (verified in IndexedDB) +- Ollama connection status accurate +- Works without internet (local-only mode) + +--- + +## User Flows + +### Flow 1: First-Time Setup (Zero Config) +``` +1. User installs Zenborg +2. ActivityWatch auto-starts in background +3. Ollama detected (or prompt: "Install Ollama for local AI? [Yes] [Skip]") +4. If Ollama installed → auto-pull Llama 3.2 3B (progress indicator) +5. Settings show: "ActivityWatch: Running āœ“, Ollama: Ready āœ“" +6. Compass indicator appears (faded, no moment allocated yet) +``` + +**Fallback**: If Ollama not installed, extension stays dormant (no crash, no nag). + +--- + +### Flow 2: Morning Routine with Ambient Feedback +``` +1. User allocates "Product Spec" to Today Morning (:t1) +2. Morning starts (6am), phase active +3. User opens Linear, starts writing spec +4. After 5 min → AW collects events, LLM classifies +5. Compass shows: 🧭 ↑ (aligned) +6. User switches to Twitter for 20 min +7. After 10 min → LLM reclassifies +8. Compass shifts: 🧭 ↙ (drifting) +9. User notices (peripheral vision), self-corrects +10. Back to Linear → compass returns to 🧭 ↑ +``` + +**Key**: No interruption, no modal. Just ambient awareness. + +--- + +### Flow 3: End-of-Phase Reflection +``` +1. Morning phase ends (12pm → Afternoon) +2. Zenborg shows reflection overlay: + + ā˜• Morning Complete + + āœ“ Product Spec (2h 15m aligned) + āœ— Email Triage (1h 20m drifting - Twitter/HN) + ? Deep Reading (untracked) + +3. User reads, presses Esc +4. Continues to Afternoon +``` + +**Non-Goals**: No lecture, no metrics, no "productivity score". Just a mirror. + +--- + +### Flow 4: Disable Extension (User Agency) +``` +1. User types :settings +2. Unchecks "Enable attention guardrails" +3. ActivityWatch stops collecting data +4. Compass indicator disappears +5. Zenborg continues working normally (core features unaffected) +``` + +**Key**: Extension is opt-out, not forced. 
+ +--- + +## Technical Constraints + +### Performance +- **Classification latency**: < 2 seconds (local LLM) +- **UI update latency**: < 500ms (compass indicator) +- **CPU overhead**: < 5% average (AW watchers + LLM) +- **Memory**: < 200MB (AW + Ollama model loaded) +- **Battery impact**: Negligible (10-min polling, not continuous) + +### Privacy +- **Default**: All data processed locally (AW SQLite + Ollama) +- **No telemetry**: Classification results stay on device +- **Optional cloud**: User must explicitly enable + provide API key +- **Data retention**: Default 7 days, user-configurable +- **GDPR compliance**: Full data export/deletion support + +### Compatibility +- **Platforms**: macOS, Linux, Windows (AW supports all three) +- **Browsers**: Chrome, Firefox, Safari (aw-watcher-web) +- **Editors**: VS Code, Cursor, Vim/Neovim (window title detection) +- **Ollama**: Requires 4GB RAM minimum (for 3B model) + +--- + +## Success Metrics + +**Qualitative** (user interviews): +- "Did the compass help you notice drift before it became hours?" +- "Did end-of-phase reflection feel useful or guilt-inducing?" +- "Was setup truly zero-config, or did you struggle?" +- "Do you trust that data stays local?" 
+ +**Quantitative** (optional telemetry, opt-in): +- % of moments with aligned classification (target: >60%) +- Average time-to-notice drift (compass shown → user action) +- Reflection screen dismissal rate (too annoying if >90%) +- Extension disable rate (failure if >20% disable within 1 week) + +**Technical Health**: +- AW uptime (target: >99%) +- LLM classification success rate (target: >95%) +- UI responsiveness (compass updates <500ms) +- Zero data loss on Zenborg restart + +--- + +## Non-Goals (MVP) + +**Explicitly excluded from Phase 1**: +- āŒ Cloud sync of ActivityWatch data +- āŒ Mobile app integration (AW is desktop-only) +- āŒ Productivity metrics / dashboards / charts +- āŒ Gamification (streaks, scores, achievements) +- āŒ Social features (compare with others) +- āŒ AI suggestions ("you should work on X next") +- āŒ Calendar integration (infer intentions from events) +- āŒ Pomodoro timers or time-boxing +- āŒ Automatic moment creation based on observed activity +- āŒ Notifications/reminders/alerts (calm tech only) +- āŒ Browser extension (watch via aw-watcher-web is sufficient) + +**Future Phases** (not MVP): +- Phase 2: Trend analysis (weekly patterns, not daily metrics) +- Phase 3: Custom theme taxonomy (beyond Area keywords) +- Phase 4: Multi-device correlation (phone + desktop) +- Phase 5: Shared themes for teams (opt-in collaboration) + +--- + +## Open Questions + +**Technical**: +1. Should we bundle Ollama or just detect/prompt for install? + - **Recommendation**: Detect + prompt (Ollama is 500MB+, too large to bundle) + +2. Polling interval: 5 min, 10 min, or user-configurable? + - **Recommendation**: Default 10 min, configurable down to 5 min + +3. How to handle rapid context switching (10+ app switches in 5 min)? + - **Recommendation**: Classify as NEUTRAL (transitional state) + +4. Should we show compass when no moment allocated? + - **Recommendation**: Show as UNTRACKED (ā—‹), remind user to allocate + +**UX**: +1. 
Should compass show confidence score, or just direction? + - **Recommendation**: Hide confidence (too metric-y), just show state + +2. End-of-phase reflection: auto-dismiss after 30 sec, or wait for user? + - **Recommendation**: Wait for user (respect attention), but allow click-through + +3. What if user has multiple monitors? Where to show compass? + - **Recommendation**: Let user drag/position, persist preference + +**Privacy**: +1. Should we offer data export (JSON dump of AlignmentEvents)? + - **Recommendation**: Yes, via `:export-data` command + +2. How to handle sensitive window titles (e.g., "Therapy Notes - Google Docs")? + - **Recommendation**: Hash or redact in stored data, only use for real-time classification + +--- + +## Philosophy Alignment Check + +**Does this maintain Zenborg's core principles?** + +āœ… **Orchestration, not elimination**: Accepts drift, helps you notice and reallocate +āœ… **Consciousness as currency**: Mirrors where attention actually goes vs. where you said it would +āœ… **Presence over outcomes**: No "productivity score", just alignment awareness +āœ… **Vim-inspired efficiency**: Minimal UI, peripheral vision, no interruptions +āœ… **Calm technology**: Ambient indicators, not notifications; reflection, not real-time guilt +āœ… **Local-first**: IndexedDB + local LLM, cloud is opt-in only +āœ… **Privacy-first**: Raw activity never persisted, only classifications + +**Potential Tensions**: +āš ļø **"No time tracking"** → We're tracking, but not exposing raw time (only alignment) +āš ļø **"No metrics"** → Classifications are a form of metric, but qualitative (aligned/drifting) +āš ļø **"Mindful tech is boring"** → AI classification could feel "smart" vs. 
boring + +**Resolution**: +- Frame as **awareness tool**, not performance tracker +- Never show percentages, scores, or comparisons +- Make compass dismissible/disableable (user agency) +- Keep UI monochrome and calm (no red alerts, no urgency) + +--- + +## Next Steps + +**Immediate**: +1. āœ… PRD approval (this document) +2. Create technical spike: bundle AW binaries for Next.js app +3. Test Ollama integration (API calls, model selection) +4. Design compass component (Figma mockup) +5. Set up Vitest tests for classification service + +**Week 1 Deliverables**: +- AW auto-start on Zenborg launch +- Activity collection service (polling AW database) +- Console logging of aggregated events + +**Week 2 Deliverables**: +- Ollama integration (local LLM classification) +- Compass indicator UI component +- Real-time classification display + +**Week 3 Deliverables**: +- End-of-phase reflection screen +- Settings panel (privacy controls) +- E2E test: full flow from moment allocation → drift detection → reflection + +--- + +## Appendix: User's Default Themes + +**For "Thopiax" (MVP hardcoded)**: + +```typescript +export const THOPIAX_THEMES = { + "Product Work": { + keywords: ["linear", "notion", "jira", "asana", "roadmap", "spec", "prd", "priorit"], + description: "Writing specs, scopes, prioritizing features, planning roadmaps", + exampleActivities: [ + "Linear - Product Roadmap Q2", + "Notion - PRD: New Onboarding Flow", + "Slack - #product-team" + ] + }, + "Data Work": { + keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas", "numpy", "colab"], + description: "Exploring data, writing models, running batches, tweaking experiments", + exampleActivities: [ + "Jupyter Notebook - user_retention_analysis.ipynb", + "pgAdmin - Query: weekly_active_users", + "Terminal - python run_experiment.py" + ] + }, + "UX Work": { + keywords: ["figma", "framer", "sketch", "prototype", "design", "component", "css", "tailwind"], + description: "Prototyping interfaces, fine-tuning 
designs, iterating on components", + exampleActivities: [ + "Figma - Zenborg Compass Redesign", + "VS Code - MomentCard.tsx", + "Chrome - Tailwind CSS Docs" + ] + }, + "Strategy Work": { + keywords: ["docs", "notion", "notes", "obsidian", "research", "reading", "writing", "plan"], + description: "Slow, deliberate thinking, strategic planning, deep reading", + exampleActivities: [ + "Google Docs - Q3 Strategy Draft", + "Notion - Weekly Reflection", + "Safari - Reading: Shape Up (Basecamp)" + ] + } +} +``` + +**Usage in Classification**: +- When moment.area matches theme name, use corresponding keywords/description +- LLM considers semantic overlap (e.g., "Slack #product-team" → Product Work) +- Themes evolve with user (future: custom theme editor) + +--- + +**Document Version**: 1.0 +**Last Updated**: 2025-10-25 +**Author**: Thopiax (with Claude) +**Status**: Ready for implementation + +--- + +*"Reduce the distance from intent to action. Technology as a mirror, not a master."* From b268a56a03022927db51b09b7f324e4dbd04d5f5 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 02:18:10 +0000 Subject: [PATCH 2/5] docs: Remove end-of-phase reflection, add critical path test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Changes to ActivityWatch PRD: - Remove immediate end-of-phase reflection feature (too granular) - Keep only ambient compass indicator for real-time awareness - Update architecture, flows, and success metrics accordingly - Shift to longer-term reflection in future phases Add critical path validation document: - 2-3 day MVP test protocol to validate core hypothesis - CLI tool to test semantic classification + ambient feedback - Clear go/no-go criteria before full implementation - Tests riskiest assumptions first (accuracy, speed, usefulness) šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md | 398 ++++++++++++++++++++++ 
docs/prds/ACTIVITY_WATCH_PROJECT.md | 106 +----- 2 files changed, 411 insertions(+), 93 deletions(-) create mode 100644 docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md diff --git a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md new file mode 100644 index 0000000..3d75b59 --- /dev/null +++ b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md @@ -0,0 +1,398 @@ +# ActivityWatch Integration - Critical Path Validation + +**Purpose**: Validate the core hypothesis before building the full system +**Timeline**: 2-3 days +**Goal**: Answer the question: "Does semantic AI classification + ambient feedback actually help reduce attention drift?" + +--- + +## The Core Hypothesis + +> **"An ambient compass indicator showing real-time semantic alignment between stated intention and observed activity will help users notice and correct attention drift faster than passive reflection alone."** + +### What We're Testing + +1. **Can a local LLM accurately classify alignment** between a stated work intention and observed computer activity? +2. **Is the classification fast enough** for real-time feedback (< 2 seconds)? +3. **Does ambient feedback feel helpful** or intrusive/distracting? +4. **Do users actually self-correct** when they notice drift, or ignore it? + +### What We're NOT Testing (Yet) + +- Full Zenborg integration +- Multi-phase day planning +- Zero-config setup +- Settings/privacy controls +- Historical tracking + +--- + +## Minimal Viable Test (MVT) + +### What to Build + +**A standalone CLI tool** that: +1. Polls ActivityWatch for last 15 minutes of activity +2. Prompts user for current intention (e.g., "Product Spec") +3. Classifies alignment using Ollama (local LLM) +4. Prints result to terminal in real-time + +**No UI. No persistence. 
Just the core loop.** + +### Technical Stack + +- **Language**: TypeScript/Node.js (or Python for speed) +- **ActivityWatch Client**: REST API calls to `http://localhost:5600` +- **LLM**: Ollama with Llama 3.2 3B +- **Output**: Terminal only (colored text for states) + +### Implementation (4-6 hours) + +```typescript +// pseudocode +while (true) { + // 1. Get current intention from user + const intention = await promptUser("What are you working on?") + const theme = await promptUser("Theme? (product/data/ux/strategy)") + + // 2. Poll ActivityWatch every 5 minutes + await sleep(5 * 60 * 1000) + + // 3. Fetch recent activity (last 15 min) + const activity = await fetchActivityWatch({ + start: now - 15min, + end: now + }) + + // 4. Aggregate events + const summary = aggregateActivity(activity) + // { "Chrome - Linear": 480s, "Chrome - Twitter": 120s, ... } + + // 5. Classify with Ollama + const result = await classifyAlignment({ + intention, + theme, + activity: summary + }) + + // 6. Print to terminal + printCompass(result.classification) // 🧭 ↑ or 🧭 ↙ + console.log(`Confidence: ${result.confidence}`) + console.log(`Reason: ${result.briefReason}`) +} +``` + +--- + +## Test Protocol + +### Setup (Day 0) + +1. Install ActivityWatch (manual setup is fine for MVT) +2. Install Ollama + pull Llama 3.2 3B +3. Build CLI tool (4-6 hours) +4. Verify: Run tool, confirm it fetches AW data and calls Ollama + +### Day 1: Personal Dogfooding + +**Morning Session (3 hours)**: +- Set intention: "Product Spec" (Product theme) +- Work normally for 3 hours +- Observe compass updates every 5 min +- Note: When did you notice drift? Did you self-correct? + +**Questions to Answer**: +- Was classification accurate? (subjective) +- Did you notice the compass updates? +- Did seeing "drifting" cause you to refocus? +- Was 5-min polling too slow/too fast? 
+
+**Afternoon Session (3 hours)**:
+- Set intention: "Data Analysis" (Data theme)
+- Intentionally drift to Twitter/email after 30 min
+- Observe: How long until compass shows "drifting"?
+- Self-correct: Does returning to Jupyter change compass back to "aligned"?
+
+**Questions to Answer**:
+- How quickly did the LLM detect drift?
+- Was the feedback helpful or annoying?
+- Did you feel guilt, or just awareness?
+
+### Day 2: Shared Testing
+
+**Recruit 1-2 colleagues**:
+- Give them the CLI tool
+- Ask them to set intentions for their work (product/data/ux/strategy)
+- Run for 4-6 hours
+- Debrief: Interview about experience
+
+**Interview Questions**:
+1. "On a scale of 1-10, how accurate was the classification?"
+2. "Did you notice drift earlier than you normally would?"
+3. "Did the compass feel like a gentle mirror or an annoying nag?"
+4. "Would you use this daily if it were built into Zenborg?"
+5. "What would make this more useful?"
+
+---
+
+## Success Criteria
+
+### Must Pass (Go/No-Go)
+
+āœ… **Classification accuracy > 70%** (subjective, user agreement with LLM)
+āœ… **Response time < 3 seconds** (Ollama call completes quickly)
+āœ… **Users self-correct at least once** when shown "drifting"
+āœ… **No one says "this is annoying/distracting"** (neutral or positive feedback only)
+
+### Nice to Have
+
+⭐ Classification accuracy > 85%
+⭐ Users proactively check compass (not just passive glances)
+⭐ Users request "show me when I've been aligned for 2+ hours" (positive reinforcement)
+
+### Failure Modes (Stop/Rethink)
+
+āŒ **Classification < 60% accurate** → LLM not good enough, try different model/prompt
+āŒ **Response time > 5 seconds** → Too slow for real-time, need smaller model
+āŒ **Users ignore compass entirely** → Ambient feedback ineffective, try different UI
+āŒ **Users feel guilt/shame** → Messaging is wrong, need gentler framing
+
+---
+
+## Example Test Session (User POV)
+
+```bash
+$ npm run test-compass
+
+🧭 Attention Compass - ActivityWatch Integration Test
+
+What are you working on? (3 words max)
+> Product Spec
+
+Theme? (product/data/ux/strategy)
+> product
+
+āœ“ Monitoring ActivityWatch every 5 minutes...
+  Press Ctrl+C to stop or change intention
+
+[5 minutes pass]
+
+─────────────────────────────────────────
+🧭 ↑ ALIGNED (confidence: 0.82)
+Reason: "Linear, Notion - matches product work"
+
+Recent activity:
+- Linear - Product Roadmap (4m 20s)
+- Chrome - Notion PRD (3m 10s)
+- Slack - #product-team (1m 30s)
+─────────────────────────────────────────
+
+[10 minutes pass]
+
+─────────────────────────────────────────
+🧭 ↙ DRIFTING (confidence: 0.91)
+Reason: "Twitter browsing - misaligned with product work"
+
+Recent activity:
+- Chrome - Twitter (8m 40s)
+- Chrome - Hacker News (4m 20s)
+- Linear - Product Roadmap (2m 00s)
+─────────────────────────────────────────
+
+[User sees "drifting", closes Twitter, returns to Linear]
+
+[15 minutes pass]
+
+─────────────────────────────────────────
+🧭 ↑ ALIGNED (confidence: 0.88)
+Reason: "Back to Linear - aligned with product work"
+
+Recent activity:
+- Linear - Product Roadmap (12m 30s)
+- Chrome - Notion PRD (2m 30s)
+─────────────────────────────────────────
+```
+
+---
+
+## Decision Points
+
+### After Day 1 (Personal Test)
+
+**If positive** → Proceed to Day 2 (shared testing)
+**If mixed** → Iterate on prompt/polling interval, test again
+**If negative** → Stop, rethink approach (maybe ambient feedback doesn't work)
+
+### After Day 2 (Shared Test)
+
+**If 2/2 users positive** → Greenlight full Zenborg integration (PRD implementation)
+**If 1/2 users positive** → Iterate on UX, test with 2 more users
+**If 0/2 users positive** → Stop, fundamental issue with approach
+
+---
+
+## What We Learn
+
+### On Classification Quality
+
+- **Is semantic understanding working?** (e.g., "Slack #product-team" correctly classified as aligned)
+- **Are edge cases handled?** (e.g., research on Twitter for product spec)
+- **Is the LLM too strict or too lenient?**
+
+### On User Behavior
+
+- **Do users notice drift earlier?** (vs. discovering at end of day)
+- **Do they self-correct when shown "drifting"?**
+- **Do they feel empowered or guilty?**
+
+### On Technical Feasibility
+
+- **Is 5-min polling the right interval?** (or 10 min? 15 min?)
+- **Is Llama 3.2 3B fast enough?** (or do we need smaller model?)
+- **Does ActivityWatch data quality hold up?** (window titles, URLs accurate?)
+
+---
+
+## Pivot Options (If Hypothesis Fails)
+
+### If Classification Is Inaccurate
+
+**Option A**: Use simpler keyword matching (no LLM)
+- Pro: Faster, more predictable
+- Con: Misses semantic nuance
+
+**Option B**: Fine-tune LLM on personal work patterns
+- Pro: Higher accuracy over time
+- Con: Requires training data, more complex
+
+**Option C**: Let user correct classifications (feedback loop)
+- Pro: Improves over time, user feels in control
+- Con: Adds friction
+
+### If Ambient Feedback Is Ineffective
+
+**Option A**: Only show compass on request (`:align` command)
+- Pro: Less intrusive
+- Con: Defeats real-time awareness goal
+
+**Option B**: Remove real-time feedback, only weekly summaries
+- Pro: Aligns with "less granular" philosophy
+- Con: Too late to notice drift in the moment
+
+**Option C**: Add gentle sound/haptic (for users who want it)
+- Pro: Harder to ignore
+- Con: Violates "calm tech" principle
+
+### If Users Feel Guilt/Shame
+
+**Option A**: Reframe language (drop "drifting", use "exploring")
+- Pro: Gentler tone
+- Con: May feel less truthful
+
+**Option B**: Add positive reinforcement ("You've been aligned for 2 hours!")
+- Pro: Balances negative with positive
+- Con: Risks gamification
+
+**Option C**: Make compass optional/hideable at all times
+- Pro: Respects user agency
+- Con: Users may just hide it when uncomfortable
+
+---
+
+## Timeline
+
+**Day 0 (Setup)**: 4-6 hours
+- Build CLI tool
+- Test AW + Ollama integration
+- Verify basic flow works
+
+**Day 1 (Personal Test)**: 6-8 hours of work with compass running
+- Morning: aligned work
+- Afternoon: intentional drift test
+- Evening: notes & reflection
+
+**Day 2 (Shared Test)**: 4-6 hours
+- Recruit 1-2 colleagues
+- Run sessions
+- Debrief interviews (30 min each)
+
+**Day 3 (Decision)**: 2 hours
+- Synthesize findings
+- Make go/no-go decision
+- Document learnings
+
+**Total**: 2-3 days end-to-end
+
+---
+
+## Deliverables
+
+1. **CLI tool** (open-source, can share with testers)
+2. **Test notes** (markdown doc with observations)
+3. **Interview summaries** (anonymized quotes/themes)
+4. **Go/No-Go decision doc** (based on success criteria)
+5. **Learnings** (what worked, what didn't, what to change)
+
+---
+
+## Next Steps After Validation
+
+### If "Go" (Hypothesis Validated)
+
+1. Proceed with full PRD implementation
+2. Integrate into Zenborg (Phases 1a-1e)
+3. Design compass UI component (not just CLI)
+4. Add settings/privacy controls
+5. Ship as opt-in beta to users
+
+### If "No-Go" (Hypothesis Failed)
+
+1. Document failure mode(s)
+2. Explore pivot options (see above)
+3. Consider alternative approaches:
+   - Manual check-ins (`:align` command on demand)
+   - Weekly reflection only (no real-time)
+   - Simple keyword matching (no AI)
+4. Re-test with pivoted approach
+
+---
+
+## Philosophy Check
+
+**Does this test maintain Zenborg principles?**
+
+āœ… **Calm technology**: CLI output is passive, not intrusive
+āœ… **Local-first**: All processing local (AW + Ollama)
+āœ… **Privacy-first**: No data sent to cloud
+āœ… **User agency**: Can stop test anytime (Ctrl+C)
+āœ… **No metrics**: Shows alignment state, not scores/percentages
+
+**Does it test the right thing?**
+
+āœ… **Core value prop**: Does semantic awareness reduce drift?
+āœ… **Technical feasibility**: Is LLM fast/accurate enough?
+āœ… **User experience**: Does ambient feedback feel helpful?
+āœ… **Minimal viable**: No over-engineering, just essentials
+
+---
+
+## Key Questions to Answer
+
+1. 
**Does it work?** (technically: AW → LLM → classification) +2. **Is it fast?** (< 2-3 seconds end-to-end) +3. **Is it accurate?** (> 70% user agreement with classification) +4. **Is it useful?** (users self-correct when shown drift) +5. **Is it calm?** (no guilt, no distraction) + +**If all 5 are "yes" → Build the full thing.** +**If any are "no" → Pivot or stop.** + +--- + +**Status**: Ready to build +**Owner**: Thopiax +**Timeline**: Start ASAP, decide by end of Week 1 + +--- + +*"Test the riskiest assumption first. If semantic awareness works, build it. If not, save weeks of implementation."* diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md index 80e3806..2f1c1db 100644 --- a/docs/prds/ACTIVITY_WATCH_PROJECT.md +++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md @@ -25,11 +25,10 @@ A **passive ambient awareness system** that: 1. Observes computer activity via ActivityWatch 2. Classifies alignment with current moment using local LLM 3. Provides **peripheral feedback** (ambient compass indicator) -4. Offers **reflective summaries** (end-of-phase ritual) -**Not**: Performance tracking, productivity metrics, nagging notifications, or guilt-inducing dashboards. +**Not**: Performance tracking, productivity metrics, nagging notifications, guilt-inducing dashboards, or granular time summaries. -**Is**: A gentle, intelligent mirror that helps you notice drift before hours pass. +**Is**: A gentle, intelligent mirror that helps you notice drift in the moment, not hours later. 
--- @@ -62,35 +61,6 @@ Current: "Product Spec" ā˜• Morning - **Drifting** (↙): Clear misalignment detected - **Untracked** (ā—‹): No digital activity (reading, meetings, thinking) -### End-of-Phase Reflection (Passive, Retrospective) - -When a phase completes (e.g., Morning → Afternoon transition): - -``` -ā˜• Morning Complete - -ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” -│ Product Spec [Craft] │ -│ āœ“ Aligned (2h 15m observed) │ -│ → Linear, Notion, Figma mockups │ -ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ -│ Email Triage [Admin] │ -│ āœ— Drift detected (45m allocated, 12m observed) │ -│ → Spent 1h 20m on Twitter/HN instead │ -ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ -│ Deep Reading [Strategy] │ -│ ? Untracked (no digital footprint) │ -ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ - -Press any key to continue to Afternoon... -``` - -**Design**: -- Non-blocking (can dismiss immediately) -- Shows up once per phase transition -- No judgement language ("drift detected" not "you failed") -- Acknowledges untracked time as valid (reading, thinking, meetings) - --- ## Technical Architecture @@ -135,7 +105,6 @@ Press any key to continue to Afternoon... 
ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ Ambient Feedback Layer │ │ - Compass indicator (real-time UI) │ -│ - Phase reflection summary (transition screen) │ │ - Alignment history (stored in IndexedDB) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ``` @@ -345,35 +314,7 @@ interface ClassificationResult { --- -### Phase 1e: End-of-Phase Reflection (Week 3) -**Goal**: Show alignment summary at phase transitions - -**Trigger**: When current phase ends (based on PhaseConfig.endHour) - -**UI**: -- Overlay (not modal - can click through) -- Shows all moments from completed phase -- For each moment: - - Alignment status (āœ“ aligned, āœ— drifting, ? untracked) - - Observed duration (aggregated from AW events) - - Top 3 apps/activities -- Press any key or click to dismiss - -**Data**: -- Query AlignmentEvents for completed phase -- Aggregate classifications by moment -- Calculate time spent per classification type -- Do NOT show percentages or scores (no gamification) - -**Acceptance**: -- Appears automatically at phase transition -- Non-blocking (can dismiss immediately) -- Shows accurate time aggregations -- Works offline (uses cached data) - ---- - -### Phase 1f: Settings & Privacy (Week 3) +### Phase 1e: Settings & Privacy (Week 2-3) **Goal**: User control over data collection and feedback **Settings Panel** (`:settings` command): @@ -383,7 +324,6 @@ interface ClassificationResult { ā”œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¤ │ ā˜‘ Enable attention guardrails │ │ ā˜‘ Show ambient compass indicator │ -│ ā˜‘ Show end-of-phase reflections │ │ │ │ Classification interval: [5 min] [10 min] [15] │ │ LLM Backend: [Ollama (local)] [Claude API] 
│ @@ -451,26 +391,7 @@ interface ClassificationResult { --- -### Flow 3: End-of-Phase Reflection -``` -1. Morning phase ends (12pm → Afternoon) -2. Zenborg shows reflection overlay: - - ā˜• Morning Complete - - āœ“ Product Spec (2h 15m aligned) - āœ— Email Triage (1h 20m drifting - Twitter/HN) - ? Deep Reading (untracked) - -3. User reads, presses Esc -4. Continues to Afternoon -``` - -**Non-Goals**: No lecture, no metrics, no "productivity score". Just a mirror. - ---- - -### Flow 4: Disable Extension (User Agency) +### Flow 3: Disable Extension (User Agency) ``` 1. User types :settings 2. Unchecks "Enable attention guardrails" @@ -511,14 +432,13 @@ interface ClassificationResult { **Qualitative** (user interviews): - "Did the compass help you notice drift before it became hours?" -- "Did end-of-phase reflection feel useful or guilt-inducing?" - "Was setup truly zero-config, or did you struggle?" - "Do you trust that data stays local?" +- "Does the ambient feedback feel helpful or distracting?" **Quantitative** (optional telemetry, opt-in): - % of moments with aligned classification (target: >60%) - Average time-to-notice drift (compass shown → user action) -- Reflection screen dismissal rate (too annoying if >90%) - Extension disable rate (failure if >20% disable within 1 week) **Technical Health**: @@ -545,7 +465,7 @@ interface ClassificationResult { - āŒ Browser extension (watch via aw-watcher-web is sufficient) **Future Phases** (not MVP): -- Phase 2: Trend analysis (weekly patterns, not daily metrics) +- Phase 2: Longer-term reflection patterns (weekly/monthly, not immediate) - Phase 3: Custom theme taxonomy (beyond Area keywords) - Phase 4: Multi-device correlation (phone + desktop) - Phase 5: Shared themes for teams (opt-in collaboration) @@ -571,12 +491,12 @@ interface ClassificationResult { 1. Should compass show confidence score, or just direction? - **Recommendation**: Hide confidence (too metric-y), just show state -2. 
End-of-phase reflection: auto-dismiss after 30 sec, or wait for user? - - **Recommendation**: Wait for user (respect attention), but allow click-through - -3. What if user has multiple monitors? Where to show compass? +2. What if user has multiple monitors? Where to show compass? - **Recommendation**: Let user drag/position, persist preference +3. Should alignment history be queryable/viewable? + - **Recommendation**: Future phase - keep MVP focused on real-time awareness only + **Privacy**: 1. Should we offer data export (JSON dump of AlignmentEvents)? - **Recommendation**: Yes, via `:export-data` command @@ -630,10 +550,10 @@ interface ClassificationResult { - Compass indicator UI component - Real-time classification display -**Week 3 Deliverables**: -- End-of-phase reflection screen +**Week 2-3 Deliverables**: - Settings panel (privacy controls) -- E2E test: full flow from moment allocation → drift detection → reflection +- Data retention & deletion +- E2E test: full flow from moment allocation → drift detection → self-correction --- From 686aa113841c0228bb93a9793c56dbd8e487aa36 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 02:25:46 +0000 Subject: [PATCH 3/5] docs: Switch from Ollama to Transformer.js for classification MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Major architectural change to PRD and critical path: **Why Transformer.js**: - Zero external dependencies (no Ollama install) - Runs in-browser or Node.js (WASM + WebGPU) - Auto-downloads models on first use (~400MB BART) - Faster inference for classification (< 1 second) - Reusable for journal note semantic annotation - Better integration with Next.js/TypeScript stack **Classification approaches**: 1. Zero-shot classification (BART/DeBERTa) for accuracy 2. Semantic similarity (sentence transformers) for speed 3. Both use same Transformer.js API **Performance improvements**: - < 1 second classification (vs. 
< 2-3 seconds with Ollama) - < 150MB memory footprint (vs. 200MB+) - No background server required **Bonus feature**: Same models enable journal note semantic search, auto-tagging, and moment similarity matching Updated both PRD and critical path test protocol to reflect new approach. Simpler setup (2-4 hours vs. 4-6 hours). šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md | 157 +++++++++++++---- docs/prds/ACTIVITY_WATCH_PROJECT.md | 204 ++++++++++++++-------- 2 files changed, 253 insertions(+), 108 deletions(-) diff --git a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md index 3d75b59..3cd4243 100644 --- a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md +++ b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md @@ -41,15 +41,22 @@ ### Technical Stack -- **Language**: TypeScript/Node.js (or Python for speed) +- **Language**: TypeScript/Node.js - **ActivityWatch Client**: REST API calls to `http://localhost:5600` -- **LLM**: Ollama with Llama 3.2 3B +- **Classifier**: Transformer.js with BART or DeBERTa (zero-shot classification) - **Output**: Terminal only (colored text for states) ### Implementation (4-6 hours) ```typescript -// pseudocode +import { pipeline } from '@xenova/transformers' + +// Load classifier once (auto-downloads model on first run) +const classifier = await pipeline( + 'zero-shot-classification', + 'facebook/bart-large-mnli' +) + while (true) { // 1. Get current intention from user const intention = await promptUser("What are you working on?") @@ -68,17 +75,30 @@ while (true) { const summary = aggregateActivity(activity) // { "Chrome - Linear": 480s, "Chrome - Twitter": 120s, ... } - // 5. Classify with Ollama - const result = await classifyAlignment({ - intention, - theme, - activity: summary - }) - - // 6. 
Print to terminal - printCompass(result.classification) // 🧭 ↑ or 🧭 ↙ - console.log(`Confidence: ${result.confidence}`) - console.log(`Reason: ${result.briefReason}`) + // 5. Build activity description + const activityText = Object.entries(summary) + .map(([key, duration]) => `${key} (${duration}s)`) + .join(', ') + + const description = ` + Working on: ${intention} (${theme} work) + Recent activity: ${activityText} + ` + + // 6. Classify with Transformer.js + const result = await classifier(description, [ + 'aligned with stated intention', + 'drifting from stated intention', + 'neutral or transitional activity', + 'no significant activity' + ]) + + const classification = result.labels[0] + const confidence = result.scores[0] + + // 7. Print to terminal + printCompass(classification) // 🧭 ↑ or 🧭 ↙ + console.log(`Confidence: ${(confidence * 100).toFixed(0)}%`) } ``` @@ -89,9 +109,9 @@ while (true) { ### Setup (Day 0) 1. Install ActivityWatch (manual setup is fine for MVT) -2. Install Ollama + pull Llama 3.2 3B -3. Build CLI tool (4-6 hours) -4. Verify: Run tool, confirm it fetches AW data and calls Ollama +2. `npm install @xenova/transformers` (auto-downloads BART on first run) +3. Build CLI tool (2-4 hours - simpler than Ollama approach) +4. Verify: Run tool, confirm it fetches AW data and classifies with Transformer.js ### Day 1: Personal Dogfooding @@ -114,7 +134,7 @@ while (true) { - Self-correct: Does returning to Jupyter change compass back to "aligned"? **Questions to Answer**: -- How quickly did LLM detect drift? +- How quickly did the classifier detect drift? - Was the feedback helpful or annoying? - Did you feel guilt, or just awareness? 
@@ -139,8 +159,8 @@ while (true) { ### Must Pass (Go/No-Go) -āœ… **Classification accuracy > 70%** (subjective, user agreement with LLM) -āœ… **Response time < 3 seconds** (Ollama call completes quickly) +āœ… **Classification accuracy > 70%** (subjective, user agreement with classifier) +āœ… **Response time < 1 second** (Transformer.js inference completes quickly) āœ… **Users self-correct at least once** when shown "drifting" āœ… **No one says "this is annoying/distracting"** (neutral or positive feedback only) @@ -152,8 +172,8 @@ while (true) { ### Failure Modes (Stop/Rethink) -āŒ **Classification < 60% accurate** → LLM not good enough, try different model/prompt -āŒ **Response time > 5 seconds** → Too slow for real-time, need smaller model +āŒ **Classification < 60% accurate** → Zero-shot not working, try semantic similarity instead +āŒ **Response time > 2 seconds** → Too slow for real-time, switch to smaller/faster model āŒ **Users ignore compass entirely** → Ambient feedback ineffective, try different UI āŒ **Users feel guilt/shame** → Messaging is wrong, need gentler framing @@ -165,6 +185,8 @@ while (true) { $ npm run test-compass 🧭 Attention Compass - ActivityWatch Integration Test +Loading classifier... (first run downloads BART model ~400MB) +āœ“ Classifier ready (facebook/bart-large-mnli) What are you working on? (3 words max) > Product Spec @@ -178,8 +200,7 @@ Theme? 
(product/data/ux/strategy) [5 minutes pass] ───────────────────────────────────────── -🧭 ↑ ALIGNED (confidence: 0.82) -Reason: "Linear, Notion - matches product work" +🧭 ↑ ALIGNED (confidence: 82%) Recent activity: - Linear - Product Roadmap (4m 20s) @@ -190,8 +211,7 @@ Recent activity: [10 minutes pass] ───────────────────────────────────────── -🧭 ↙ DRIFTING (confidence: 0.91) -Reason: "Twitter browsing - misaligned with product work" +🧭 ↙ DRIFTING (confidence: 91%) Recent activity: - Chrome - Twitter (8m 40s) @@ -204,8 +224,7 @@ Recent activity: [15 minutes pass] ───────────────────────────────────────── -🧭 ↑ ALIGNED (confidence: 0.88) -Reason: "Back to Linear - aligned with product work" +🧭 ↑ ALIGNED (confidence: 88%) Recent activity: - Linear - Product Roadmap (12m 30s) @@ -237,7 +256,8 @@ Recent activity: - **Is semantic understanding working?** (e.g., "Slack #product-team" correctly classified as aligned) - **Are edge cases handled?** (e.g., research on Twitter for product spec) -- **Is the LLM too strict or too lenient?** +- **Is zero-shot classification too strict or too lenient?** +- **Does BART work well, or should we try DeBERTa/semantic similarity?** ### On User Behavior @@ -248,7 +268,7 @@ Recent activity: ### On Technical Feasibility - **Is 5-min polling the right interval?** (or 10 min? 15 min?) -- **Is Llama 3.2 3B fast enough?** (or do we need smaller model?) +- **Is Transformer.js fast enough for real-time feedback?** (< 1 second?) - **Does ActivityWatch data quality hold up?** (window titles, URLs accurate?) 
--- @@ -257,17 +277,21 @@ Recent activity: ### If Classification Is Inaccurate -**Option A**: Use simpler keyword matching (no LLM) -- Pro: Faster, more predictable -- Con: Misses semantic nuance +**Option A**: Switch from zero-shot to semantic similarity +- Pro: Faster, simpler, often more accurate for narrow domains +- Con: Requires tuning similarity thresholds -**Option B**: Fine-tune LLM on personal work patterns -- Pro: Higher accuracy over time -- Con: Requires training data, more complex +**Option B**: Use keyword matching (no ML at all) +- Pro: Fastest, most predictable +- Con: Misses semantic nuance entirely **Option C**: Let user correct classifications (feedback loop) - Pro: Improves over time, user feels in control -- Con: Adds friction +- Con: Adds friction, doesn't improve model + +**Option D**: Try different zero-shot model (DeBERTa instead of BART) +- Pro: May have better accuracy for intent classification +- Con: Still relatively slow compared to similarity ### If Ambient Feedback Is Ineffective @@ -389,6 +413,65 @@ Recent activity: --- +## Bonus: Journal Note Semantic Annotation + +Since we're already loading Transformer.js models for ActivityWatch classification, **the same models can power semantic journal features**: + +### Use Cases + +**1. Semantic Search** +```typescript +// Find journal entries related to current moment +const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2') + +const momentEmbedding = await embedder("Product Spec: prioritizing roadmap") +const noteEmbeddings = await embedder(journalNotes.map(n => n.content)) + +const similarities = cosineSimilarity(momentEmbedding, noteEmbeddings) +const relatedNotes = journalNotes + .map((note, i) => ({ note, score: similarities[i] })) + .filter(({ score }) => score > 0.6) + .sort((a, b) => b.score - a.score) +``` + +**2. 
Auto-Tagging Notes**
+```typescript
+// Tag note with theme (product/data/ux/strategy)
+const classifier = await pipeline('zero-shot-classification', 'facebook/bart-large-mnli')
+
+const result = await classifier(journalNote.content, [
+  'product work and prioritization',
+  'data analysis and experiments',
+  'UX design and prototyping',
+  'strategic thinking and planning'
+])
+
+journalNote.theme = result.labels[0]
+journalNote.confidence = result.scores[0]
+```
+
+**3. Find Similar Past Moments**
+```typescript
+// When creating moment "Product Spec", show related past moments
+const currentEmbedding = await embedder("Product Spec")
+const pastEmbeddings = await embedder(pastMoments.map(m => m.name))
+
+// Same cosineSimilarity helper as in the semantic search example
+const similarities = cosineSimilarity(currentEmbedding, pastEmbeddings)
+
+const similar = pastMoments
+  .map((m, i) => ({ moment: m, score: similarities[i] }))
+  .filter(({ score }) => score > 0.8)
+```
+
+### Integration Points
+
+- **Moment creation**: Suggest related journal notes
+- **Journal writing**: Auto-tag with themes from current moment
+- **Reflection**: "Show me notes from when I worked on similar moments"
+- **Search**: Semantic search across all notes and moments
+
+**Advantage**: One model download, multiple features. Zero-config semantic intelligence across the whole app.
+
+---
+
 **Status**: Ready to build
 **Owner**: Thopiax
 **Timeline**: Start ASAP, decide by end of Week 1
diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md
index 2f1c1db..2219bc8 100644
--- a/docs/prds/ACTIVITY_WATCH_PROJECT.md
+++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md
@@ -165,55 +165,114 @@ interface ActivitySummary {
 type AlignmentType = "aligned" | "neutral" | "drifting" | "untracked"
 ```
 
-### LLM Classification Service
+### Semantic Classification Service (Transformer.js)
 
-**Local Model Options** (ranked by preference):
-1. **Ollama** with Llama 3.2 3B (fastest, good balance)
-2. **llama.cpp** with Phi-3 Mini (smallest, edge devices)
-3. 
**Fallback**: Claude API (privacy implications, requires API key) +**Model Choice**: **Transformer.js** with zero-shot classification + +**Why Transformer.js**: +- āœ… Zero external dependencies (no Ollama/llama.cpp install) +- āœ… Runs in browser or Node.js (WASM + WebGPU) +- āœ… Auto-downloads models on first use (cached locally) +- āœ… Fast inference for classification tasks (< 500ms) +- āœ… Reusable for journal note semantic annotation +- āœ… Works offline immediately after first model download + +**Model Options** (ranked by preference): +1. **`facebook/bart-large-mnli`** - Zero-shot classification (best accuracy) +2. **`MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli`** - Faster, still accurate +3. **Sentence transformers** + cosine similarity (ultra-fast, good enough) + +**Classification Approach**: + +```typescript +import { pipeline } from '@xenova/transformers' + +// Load zero-shot classifier (once, cached) +const classifier = await pipeline( + 'zero-shot-classification', + 'facebook/bart-large-mnli' +) + +// Define candidate labels based on moment's theme +const labels = { + aligned: [ + moment.area.themeDescription, // "Writing specs, prioritizing features" + ...moment.area.keywords, // ["linear", "notion", "spec"] + ], + drifting: [ + "social media browsing", + "news reading", + "entertainment", + "unrelated work" + ], + neutral: [ + "email communication", + "team chat", + "quick searches", + "context switching" + ] +} + +// Build activity description from AW events +const activityDescription = ` +User is working on: "${moment.name}" (${moment.area.name} - ${moment.area.themeDescription}) + +Recent activity (last 15 min): +${activity.map(a => `- ${a.app}: ${a.windowTitle} (${a.duration}s)`).join('\n')} +` + +// Classify alignment +const result = await classifier(activityDescription, [ + 'aligned with stated intention', + 'drifting from stated intention', + 'neutral or transitional activity', + 'no significant digital activity' +]) + +// Map to 
AlignmentType +const classification = mapToAlignment(result.labels[0], result.scores[0]) +// { classification: "aligned", confidence: 0.89, themeDetected: "product" } +``` + +**Alternative: Semantic Similarity** (faster, simpler): -**Classification Prompt Template**: ```typescript -const CLASSIFICATION_PROMPT = `You are an attention alignment classifier for a mindful productivity system. - -CURRENT INTENTION: -- Moment: "${moment.name}" -- Area: ${moment.area.name} -- Theme: ${moment.area.themeDescription} -- Phase: ${phase} (${phaseEmoji}) - -OBSERVED ACTIVITY (last 15 min): -${activitySummary} - -TASK: Classify alignment as: -- ALIGNED: Activity clearly matches the stated intention and theme -- NEUTRAL: Ambiguous or transitional (email, Slack, quick searches, switching contexts) -- DRIFTING: Clear misalignment with stated intention -- UNTRACKED: No significant digital activity detected - -GUIDELINES: -- Consider semantic meaning, not just keywords - (e.g., "Slack #product-team" is aligned with product work) -- Short diversions (<2 min) are NEUTRAL, not drifting -- Respect nuance: research on Twitter for a product spec is aligned -- If no clear activity, classify as UNTRACKED (not a failure) - -OUTPUT (JSON only, no explanation): -{ - "classification": "aligned" | "neutral" | "drifting" | "untracked", - "confidence": 0.0-1.0, - "themeDetected": "product" | "data" | "ux" | "strategy" | null, - "briefReason": "Short explanation (max 10 words)" -}`; +import { pipeline } from '@xenova/transformers' + +// Load sentence transformer (faster than zero-shot) +const embedder = await pipeline( + 'feature-extraction', + 'Xenova/all-MiniLM-L6-v2' +) + +// Embed intention +const intentionEmbedding = await embedder( + `${moment.name}: ${moment.area.themeDescription}` +) + +// Embed observed activity +const activityEmbedding = await embedder( + activity.map(a => `${a.app} ${a.windowTitle}`).join('. 
') +) + +// Compute cosine similarity +const similarity = cosineSimilarity(intentionEmbedding, activityEmbedding) + +// Classify based on threshold +const classification = + similarity > 0.7 ? 'aligned' : + similarity > 0.4 ? 'neutral' : + similarity > 0.2 ? 'drifting' : + 'untracked' ``` -**Response Parsing**: +**Response Format**: ```typescript interface ClassificationResult { - classification: AlignmentType - confidence: number - themeDetected: string | null - briefReason: string + classification: AlignmentType // "aligned" | "neutral" | "drifting" | "untracked" + confidence: number // 0.0-1.0 (from model scores) + themeDetected: string | null // "product" | "data" | "ux" | "strategy" + method: 'zero-shot' | 'similarity' // which approach was used } // Store in IndexedDB as AlignmentEvent @@ -257,22 +316,23 @@ interface ClassificationResult { --- -### Phase 1c: Local LLM Integration (Week 2) -**Goal**: Classify alignment using Ollama locally +### Phase 1c: Semantic Classification (Week 2) +**Goal**: Classify alignment using Transformer.js **Tasks**: -1. Detect Ollama installation (or prompt user to install) -2. Auto-pull lightweight model (Llama 3.2 3B) -3. Build classification prompt from current moment + activity -4. Call Ollama API (http://localhost:11434) -5. Parse JSON response → AlignmentEvent +1. Install `@xenova/transformers` (npm package) +2. Load zero-shot classification model (BART or DeBERTa) +3. Build activity description from AW events +4. Classify alignment with candidate labels +5. Map scores to AlignmentType + confidence 6. 
Store classifications in IndexedDB (not raw activity) **Acceptance**: -- Classification runs locally, no external API calls -- Response time < 2 seconds +- Classification runs in-browser/Node.js, no external dependencies +- First-run downloads model (100-500MB), then cached +- Response time < 1 second (after model loaded) - Confidence scores calibrated (>0.7 for aligned/drifting) -- Errors gracefully handled (show "untracked" if LLM fails) +- Errors gracefully handled (show "untracked" if classification fails) --- @@ -326,11 +386,10 @@ interface ClassificationResult { │ ā˜‘ Show ambient compass indicator │ │ │ │ Classification interval: [5 min] [10 min] [15] │ -│ LLM Backend: [Ollama (local)] [Claude API] │ +│ Model: [BART (accurate)] [DeBERTa (fast)] │ │ │ │ Privacy: │ -│ ā˜‘ Process data locally only │ -│ ☐ Allow cloud LLM fallback (requires API key) │ +│ ā˜‘ Process data locally only (in-browser) │ │ │ │ Data Retention: │ │ Keep alignment history: [7 days] [30] [Forever] │ @@ -338,7 +397,7 @@ interface ClassificationResult { │ │ │ Status: │ │ ActivityWatch: Running āœ“ │ -│ Ollama: Connected āœ“ (Llama 3.2 3B) │ +│ Transformer.js: Loaded āœ“ (BART-large-mnli) │ │ Last classification: 2 minutes ago │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ``` @@ -363,13 +422,13 @@ interface ClassificationResult { ``` 1. User installs Zenborg 2. ActivityWatch auto-starts in background -3. Ollama detected (or prompt: "Install Ollama for local AI? [Yes] [Skip]") -4. If Ollama installed → auto-pull Llama 3.2 3B (progress indicator) -5. Settings show: "ActivityWatch: Running āœ“, Ollama: Ready āœ“" +3. First classification triggers model download (progress: "Loading classifier...") +4. BART model downloads (400MB, one-time, cached) +5. Settings show: "ActivityWatch: Running āœ“, Transformer.js: Loaded āœ“" 6. 
Compass indicator appears (faded, no moment allocated yet) ``` -**Fallback**: If Ollama not installed, extension stays dormant (no crash, no nag). +**Fallback**: If model download fails (offline, no space), extension stays dormant until next launch. --- @@ -407,24 +466,24 @@ interface ClassificationResult { ## Technical Constraints ### Performance -- **Classification latency**: < 2 seconds (local LLM) +- **Classification latency**: < 1 second (Transformer.js, after model loaded) - **UI update latency**: < 500ms (compass indicator) -- **CPU overhead**: < 5% average (AW watchers + LLM) -- **Memory**: < 200MB (AW + Ollama model loaded) +- **CPU overhead**: < 3% average (AW watchers + inference) +- **Memory**: < 150MB (AW + Transformer.js model in-memory) - **Battery impact**: Negligible (10-min polling, not continuous) ### Privacy -- **Default**: All data processed locally (AW SQLite + Ollama) +- **Default**: All data processed locally (AW SQLite + Transformer.js in-browser) - **No telemetry**: Classification results stay on device -- **Optional cloud**: User must explicitly enable + provide API key +- **No cloud required**: Models downloaded once, cached locally - **Data retention**: Default 7 days, user-configurable - **GDPR compliance**: Full data export/deletion support ### Compatibility - **Platforms**: macOS, Linux, Windows (AW supports all three) -- **Browsers**: Chrome, Firefox, Safari (aw-watcher-web) +- **Browsers**: Chrome (recommended), Firefox, Safari (aw-watcher-web) - **Editors**: VS Code, Cursor, Vim/Neovim (window title detection) -- **Ollama**: Requires 4GB RAM minimum (for 3B model) +- **Transformer.js**: Requires 2GB RAM minimum, WebGPU recommended for speed --- @@ -475,8 +534,8 @@ interface ClassificationResult { ## Open Questions **Technical**: -1. Should we bundle Ollama or just detect/prompt for install? - - **Recommendation**: Detect + prompt (Ollama is 500MB+, too large to bundle) +1. 
Which Transformer.js model: BART (accurate) or DeBERTa (faster)? + - **Recommendation**: Start with BART, add DeBERTa as fast mode option 2. Polling interval: 5 min, 10 min, or user-configurable? - **Recommendation**: Default 10 min, configurable down to 5 min @@ -487,6 +546,9 @@ interface ClassificationResult { 4. Should we show compass when no moment allocated? - **Recommendation**: Show as UNTRACKED (ā—‹), remind user to allocate +5. Use zero-shot classification or semantic similarity? + - **Recommendation**: Zero-shot for better accuracy, similarity as fallback/fast mode + **UX**: 1. Should compass show confidence score, or just direction? - **Recommendation**: Hide confidence (too metric-y), just show state @@ -536,7 +598,7 @@ interface ClassificationResult { **Immediate**: 1. āœ… PRD approval (this document) 2. Create technical spike: bundle AW binaries for Next.js app -3. Test Ollama integration (API calls, model selection) +3. Test Transformer.js integration (model loading, inference speed) 4. Design compass component (Figma mockup) 5. 
Set up Vitest tests for classification service @@ -546,7 +608,7 @@ interface ClassificationResult { - Console logging of aggregated events **Week 2 Deliverables**: -- Ollama integration (local LLM classification) +- Transformer.js integration (zero-shot classification) - Compass indicator UI component - Real-time classification display From 42ae49e9e772899eac98845c3b68dd2088f189d1 Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 02:30:53 +0000 Subject: [PATCH 4/5] docs: Add tiny ActivityWatch integration (manual labeling MVP) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Alternative to full AI classification - test core integration first: **Approach**: - Use ActivityWatch's built-in category/labeling system - User manually configures regex rules in AW UI - Zenborg fetches labeled events via REST API (localhost:5600) - Simple alignment: does activity category match moment area? **Benefits**: - 2-4 hours to build vs. weeks for AI version - Zero ML complexity, uses existing AW features - Tests core hypothesis: does activity tracking help? - Transparent rules (regex), user-editable **Implementation**: 1. ActivityWatch client (TypeScript REST API wrapper) 2. Sync Zenborg areas → AW categories 3. Fetch & display alignment status (🧭 ↑/↙) 4. Simple UI component (fixed position indicator) 5. Standalone test script **Limitations**: - Requires manual category setup (regex rules) - No semantic understanding (can't infer intent) - Only works if user maintains category rules **Path forward**: 1. Build tiny version (validate AW integration works) 2. If successful → Add Transformer.js semantic layer 3. Hybrid: User rules + AI classification for unlabeled This tests the riskiest assumption (AW integration) before investing in AI infrastructure. 
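
The "simple alignment" rule described above can be sketched as a pure function (illustrative only — `LabeledEvent` and `compassState` are hypothetical names for this sketch, not part of the codebase):

```typescript
// Sketch of the rule: does the labeled activity time match the moment's area?

interface LabeledEvent {
  app: string
  duration: number      // seconds
  category?: string[]   // ActivityWatch $category labels
}

type CompassState = 'aligned' | 'drifting' | 'untracked'

// Aligned when more than half of the labeled time matches the area name;
// untracked when no events carry labels at all.
function compassState(events: LabeledEvent[], areaName: string): CompassState {
  const labeled = events.filter(e => e.category && e.category.length > 0)
  if (labeled.length === 0) return 'untracked'

  const total = labeled.reduce((sum, e) => sum + e.duration, 0)
  const matching = labeled
    .filter(e => (e.category ?? []).includes(areaName))
    .reduce((sum, e) => sum + e.duration, 0)

  return matching > total / 2 ? 'aligned' : 'drifting'
}
```

Keeping `untracked` distinct from `drifting` avoids punishing users who simply haven't configured category rules yet.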
šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_TINY_VERSION.md | 596 +++++++++++++++++++++++ 1 file changed, 596 insertions(+) create mode 100644 docs/prds/ACTIVITY_WATCH_TINY_VERSION.md diff --git a/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md new file mode 100644 index 0000000..fa450e7 --- /dev/null +++ b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md @@ -0,0 +1,596 @@ +# ActivityWatch Tiny Integration - Manual Labeling MVP + +**Purpose**: Test core ActivityWatch integration with manual area labeling before building AI classification +**Timeline**: 2-4 hours to build + test +**Goal**: Prove ActivityWatch data collection works, validate API integration + +--- + +## The Simplest Thing That Could Work + +Instead of AI classification, **manually label activities** using ActivityWatch's built-in category system: + +1. User downloads & runs ActivityWatch themselves +2. Zenborg syncs Areas → ActivityWatch categories/labels +3. User manually labels activities in ActivityWatch UI (or via script) +4. Zenborg fetches labeled data to show alignment + +**No AI. No classification. 
Just basic CRUD operations on localhost.** + +--- + +## How ActivityWatch Labeling Works + +ActivityWatch has a built-in **event classification system**: + +```json +{ + "id": 123, + "timestamp": "2025-10-25T10:30:00Z", + "duration": 300, + "data": { + "app": "Google Chrome", + "title": "Linear - Product Roadmap", + "url": "https://linear.app/...", + "$category": ["Work", "Product"] // ← User-defined labels + } +} +``` + +**Categories can be**: +- Set manually (user clicks in AW UI) +- Set via regex rules (AW's category watcher) +- Set via API calls (our script) + +--- + +## Tiny Script Architecture + +``` +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Zenborg Areas │ +│ - Product Work │ +│ - Data Work │ +│ - UX Work │ +│ - Strategy Work │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + │ POST /api/aw/sync-categories + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ ActivityWatch REST API │ +│ http://localhost:5600 │ +│ │ +│ /api/0/buckets/ │ +│ /api/0/events/ │ +│ /api/0/query/ │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ + │ + │ GET events with $category + ā–¼ +ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” +│ Zenborg UI - Alignment View │ +│ "You spent 2h on Product Work" │ +│ "Last 15min: Linear (Product)" │ +ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ +``` + +--- + +## Implementation + +### 1. 
ActivityWatch Client (TypeScript)
+
+```typescript
+// src/infrastructure/activitywatch/aw-client.ts
+
+const AW_BASE_URL = 'http://localhost:5600'
+
+interface AWEvent {
+  id: number
+  timestamp: string
+  duration: number
+  data: {
+    app: string
+    title: string
+    url?: string
+    $category?: string[] // ActivityWatch categories
+  }
+}
+
+interface AWBucket {
+  id: string
+  name: string
+  type: string
+  hostname: string
+}
+
+export class ActivityWatchClient {
+
+  // Check if AW is running
+  async isRunning(): Promise<boolean> {
+    try {
+      const response = await fetch(`${AW_BASE_URL}/api/0/info`)
+      return response.ok
+    } catch {
+      return false
+    }
+  }
+
+  // Get all buckets (watchers)
+  async getBuckets(): Promise<AWBucket[]> {
+    const response = await fetch(`${AW_BASE_URL}/api/0/buckets/`)
+    if (!response.ok) throw new Error('Failed to fetch buckets')
+    return response.json()
+  }
+
+  // Get events from a bucket (last N minutes)
+  async getEvents(
+    bucketId: string,
+    startTime: Date,
+    endTime: Date
+  ): Promise<AWEvent[]> {
+    const params = new URLSearchParams({
+      start: startTime.toISOString(),
+      end: endTime.toISOString(),
+      limit: '100'
+    })
+
+    const response = await fetch(
+      `${AW_BASE_URL}/api/0/buckets/${bucketId}/events?${params}`
+    )
+
+    if (!response.ok) throw new Error('Failed to fetch events')
+    return response.json()
+  }
+
+  // Get aggregated activity for last N minutes
+  async getRecentActivity(minutes: number = 15): Promise<AWEvent[]> {
+    const buckets = await this.getBuckets()
+
+    // Find window watcher bucket (aw-watcher-window_*)
+    const windowBucket = buckets.find(b =>
+      b.id.startsWith('aw-watcher-window')
+    )
+
+    if (!windowBucket) throw new Error('No window watcher found')
+
+    const endTime = new Date()
+    const startTime = new Date(endTime.getTime() - minutes * 60 * 1000)
+
+    return this.getEvents(windowBucket.id, startTime, endTime)
+  }
+
+  // Use AW's query API for advanced aggregation
+  async queryActivity(query: string): Promise<any> {
+    const response = await 
fetch(`${AW_BASE_URL}/api/0/query/`, {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({
+        timeperiods: [`${new Date().toISOString()}/PT1H`], // last hour
+        query: query
+      })
+    })
+
+    if (!response.ok) throw new Error('Query failed')
+    return response.json()
+  }
+
+  // Get time spent per category (last N hours)
+  async getTimeByCategory(hours: number = 1): Promise<Record<string, number>> {
+    // AW query language to aggregate by category
+    const query = `
+      events = query_bucket(find_bucket("aw-watcher-window"));
+      events = filter_keyvals(events, "$category", []);
+      events = categorize(events, [[["Work"], {"regex": "Linear|Notion|Figma"}]]);
+      duration_by_category = sum_durations_by_key(events, "$category");
+      RETURN = duration_by_category;
+    `
+
+    const result = await this.queryActivity(query)
+    return result[0] // First timeperiod result
+  }
+}
+```
+
+### 2. Sync Zenborg Areas → AW Categories
+
+```typescript
+// src/application/use-cases/sync-areas-to-aw.ts
+
+import { ActivityWatchClient } from '@/infrastructure/activitywatch/aw-client'
+import { Area } from '@/domain/entities/area'
+
+export async function syncAreasToAW(areas: Area[]): Promise<void> {
+  const awClient = new ActivityWatchClient()
+
+  // Check if AW is running
+  const isRunning = await awClient.isRunning()
+  if (!isRunning) {
+    console.warn('ActivityWatch not running, skipping sync')
+    return
+  }
+
+  // For tiny version: just log categories
+  // User will manually set them in AW UI or via regex rules
+  console.log('Zenborg Areas → ActivityWatch Categories:')
+  areas.forEach(area => {
+    console.log(`  - ${area.name}: ${area.themeKeywords?.join(', ')}`)
+  })
+
+  // Future: Auto-create categorization rules via AW API
+  // (AW doesn't have a public API for this yet, needs manual config)
+}
+```
+
+### 3. 
Fetch & Display Alignment
+
+```typescript
+// src/application/use-cases/get-alignment-status.ts
+
+import { ActivityWatchClient } from '@/infrastructure/activitywatch/aw-client'
+import { Moment } from '@/domain/entities/moment'
+
+export interface AlignmentStatus {
+  moment: Moment
+  lastActivity: {
+    app: string
+    title: string
+    duration: number
+    category?: string[]
+  }[]
+  aligned: boolean   // true if category matches moment.area.name
+  totalTime: number  // seconds in last 15 min
+}
+
+export async function getAlignmentStatus(
+  currentMoment: Moment | null
+): Promise<AlignmentStatus | null> {
+  if (!currentMoment) return null
+
+  const awClient = new ActivityWatchClient()
+
+  // Get last 15 minutes of activity
+  const events = await awClient.getRecentActivity(15)
+
+  // Group by app/title
+  const activitySummary = events.reduce((acc, event) => {
+    const key = `${event.data.app} - ${event.data.title}`
+    if (!acc[key]) {
+      acc[key] = {
+        app: event.data.app,
+        title: event.data.title,
+        duration: 0,
+        category: event.data.$category
+      }
+    }
+    acc[key].duration += event.duration
+    return acc
+  }, {} as Record<string, { app: string; title: string; duration: number; category?: string[] }>)
+
+  const lastActivity = Object.values(activitySummary)
+    .sort((a, b) => b.duration - a.duration)
+
+  // Check if aligned: does any activity's category match moment's area?
+  const aligned = lastActivity.some(activity =>
+    activity.category?.includes(currentMoment.area.name)
+  )
+
+  const totalTime = lastActivity.reduce((sum, a) => sum + a.duration, 0)
+
+  return {
+    moment: currentMoment,
+    lastActivity,
+    aligned,
+    totalTime
+  }
+}
+```
+
+### 4. 
Simple UI Component
+
+```tsx
+// src/components/ActivityWatchStatus.tsx
+
+'use client'
+
+import { useEffect, useState } from 'react'
+import { getAlignmentStatus, AlignmentStatus } from '@/application/use-cases/get-alignment-status'
+import { useMomentStore } from '@/infrastructure/state/moment-store'
+
+export function ActivityWatchStatus() {
+  const [status, setStatus] = useState<AlignmentStatus | null>(null)
+  const currentMoment = useMomentStore(state => state.getCurrentMoment())
+
+  useEffect(() => {
+    // Poll every 5 minutes
+    const interval = setInterval(async () => {
+      if (currentMoment) {
+        const newStatus = await getAlignmentStatus(currentMoment)
+        setStatus(newStatus)
+      }
+    }, 5 * 60 * 1000)
+
+    // Initial fetch
+    if (currentMoment) {
+      getAlignmentStatus(currentMoment).then(setStatus)
+    }
+
+    return () => clearInterval(interval)
+  }, [currentMoment])
+
+  if (!status) return null
+
+  // Fixed-position indicator in the corner of the viewport
+  return (
+    <div style={{ position: 'fixed', bottom: 16, right: 16 }}>
+      <div>
+        Current: {status.moment.name} ({status.moment.area.name})
+      </div>
+
+      <div>
+        Status: {status.aligned ? (
+          <span>🧭 ↑ Aligned</span>
+        ) : (
+          <span>🧭 ↙ Drifting</span>
+        )}
+      </div>
+
+      <div>
+        Last 15 min:
+        {status.lastActivity.slice(0, 3).map((activity, i) => (
+          <div key={i}>
+            • {activity.app} ({Math.floor(activity.duration / 60)}m)
+            {activity.category && (
+              <span> [{activity.category.join(', ')}]</span>
+            )}
+          </div>
+        ))}
+      </div>
+    </div>
+ ) +} +``` + +--- + +## User Setup (Manual) + +### 1. Install ActivityWatch + +```bash +# macOS +brew install --cask activitywatch + +# Linux +wget https://github.com/ActivityWatch/activitywatch/releases/latest/download/activitywatch-linux-x86_64.zip +unzip activitywatch-linux-x86_64.zip +./activitywatch/aw-qt + +# Windows +# Download from https://activitywatch.net/downloads/ +``` + +### 2. Configure Categories (Manual) + +Open ActivityWatch UI (http://localhost:5600): + +**Settings → Categories → Add Rules**: + +``` +Product Work: + - regex: "Linear|Notion|Jira|Asana|PRD" + - regex: "#product" + +Data Work: + - regex: "Jupyter|Python|SQL|Postgres|dbt" + - regex: "\.ipynb|\.py|\.sql" + +UX Work: + - regex: "Figma|Framer|Sketch|Design" + - regex: "\.tsx|\.css|Tailwind" + +Strategy Work: + - regex: "Docs|Notes|Obsidian|Research" + - regex: "Strategy|Planning|Reflection" +``` + +### 3. Test Zenborg Integration + +```bash +# In Zenborg project +npm install + +# Add AW client component to layout +# (see implementation above) + +# Start Zenborg +npm run dev + +# Open browser, allocate a moment +# Wait 5 minutes, see status update +``` + +--- + +## Testing Script (Standalone) + +For quick testing without full Zenborg integration: + +```typescript +// scripts/test-aw-integration.ts + +import { ActivityWatchClient } from '../src/infrastructure/activitywatch/aw-client' + +async function main() { + const client = new ActivityWatchClient() + + console.log('🧭 Testing ActivityWatch Integration\n') + + // 1. Check if running + const isRunning = await client.isRunning() + console.log(`āœ“ ActivityWatch running: ${isRunning}`) + + if (!isRunning) { + console.log('āŒ Please start ActivityWatch first') + process.exit(1) + } + + // 2. Get buckets + const buckets = await client.getBuckets() + console.log(`āœ“ Found ${buckets.length} buckets:`) + buckets.forEach(b => console.log(` - ${b.id} (${b.type})`)) + + // 3. 
Get last 15 min activity
+  console.log('\nšŸ“Š Last 15 minutes of activity:')
+  const events = await client.getRecentActivity(15)
+
+  const summary = events.reduce((acc, event) => {
+    const key = event.data.app
+    if (!acc[key]) acc[key] = 0
+    acc[key] += event.duration
+    return acc
+  }, {} as Record<string, number>)
+
+  Object.entries(summary)
+    .sort(([, a], [, b]) => b - a)
+    .forEach(([app, duration]) => {
+      const minutes = Math.floor(duration / 60)
+      console.log(`  - ${app}: ${minutes}m ${Math.floor(duration % 60)}s`)
+    })
+
+  // 4. Check for categorized events
+  console.log('\nšŸ·ļø Categorized events:')
+  const categorized = events.filter(e => e.data.$category && e.data.$category.length > 0)
+
+  if (categorized.length === 0) {
+    console.log('  āš ļø No categorized events found')
+    console.log('  Set up categories in AW UI: http://localhost:5600')
+  } else {
+    categorized.forEach(e => {
+      console.log(`  - ${e.data.app}: [${e.data.$category?.join(', ')}]`)
+    })
+  }
+}
+
+main().catch(console.error)
+```
+
+Run it:
+
+```bash
+npx tsx scripts/test-aw-integration.ts
+```
+
+---
+
+## Alignment Logic (No AI)
+
+**Simple rule**: Activity is "aligned" if:
+- Activity's `$category` matches current moment's `area.name`
+
+**Example**:
+
+```typescript
+// User is working on moment "Product Spec" (area: "Product Work")
+// Last 15 min activity:
+
+const recentActivity = [
+  { app: "Linear", category: ["Product Work"], duration: 600 },
+  { app: "Slack", category: ["Communication"], duration: 180 },
+  { app: "Chrome - Twitter", category: null, duration: 120 }
+]
+
+// Alignment calculation:
+const productTime = 600  // Linear
+const otherTime = 300    // Slack + Twitter
+
+const aligned = productTime > otherTime  // true
+```
+
+**Compass state**:
+- `🧭 ↑ Aligned` if > 50% of time in matching category
+- `🧭 ↙ Drifting` if < 50% of time in matching category
+- `🧭 ā—‹ Untracked` if no categorized events
+
+---
+
+## Advantages of Tiny Version
+
+āœ… **Zero AI complexity**: No models, no training, no classification
+āœ… 
**Uses existing AW features**: Categories already built-in
+āœ… **Fast to build**: 2-4 hours total (vs. weeks for AI version)
+āœ… **Tests core integration**: Validates AW API works, data flows correctly
+āœ… **User can manually tune**: Regex rules are transparent and editable
+
+---
+
+## Limitations (To Address Later)
+
+āŒ **Manual category setup**: User must configure regex rules in AW
+āŒ **No semantic understanding**: "Slack #product-team" won't auto-match "Product Work"
+āŒ **Requires user discipline**: If categories not set, shows no alignment
+āŒ **No learning**: Rules are static, don't improve over time
+
+**Solution**: Once this works, add Transformer.js on top for semantic classification
+
+---
+
+## Next Steps
+
+1. **Build client** (`aw-client.ts`) - 1 hour
+2. **Test with script** (`test-aw-integration.ts`) - 30 min
+3. **Add UI component** (`ActivityWatchStatus.tsx`) - 1 hour
+4. **Manual testing** (configure categories, use Zenborg) - 1 hour
+5. **Decide**: Does basic integration work? → Add AI layer
+
+**Total**: 2-4 hours to validate core hypothesis
+
+---
+
+## Future: AI Layer on Top
+
+Once manual labeling works:
+
+```typescript
+// Hybrid approach: Use categories as hints, AI for unlabeled
+
+async function classifyActivity(
+  activity: AWEvent,
+  moment: Moment
+): Promise<'aligned' | 'drifting'> {
+
+  // 1. If already categorized by user, trust it
+  if (activity.data.$category?.includes(moment.area.name)) {
+    return 'aligned'
+  }
+
+  // 2. If no category, ask Transformer.js
+  const classifier = await pipeline('zero-shot-classification', 'facebook/bart-large-mnli')
+  const result = await classifier(
+    `${activity.data.app}: ${activity.data.title}`,
+    [`${moment.area.name}`, 'unrelated work', 'distraction']
+  )
+
+  return result.labels[0] === moment.area.name ? 
'aligned' : 'drifting' +} +``` + +**Best of both worlds**: +- User rules = fast, transparent, trustworthy +- AI classification = fills gaps, handles edge cases + +--- + +**Status**: Ready to implement +**Owner**: Thopiax +**Timeline**: 2-4 hours + +--- + +*"Start with the simplest thing that could work. If manual labeling proves the integration, add AI later."* From 280b727562cbbdc5e7bde1d0128d3c1a9f12403e Mon Sep 17 00:00:00 2001 From: Claude Date: Sat, 25 Oct 2025 12:23:22 +0000 Subject: [PATCH 5/5] docs: Fix concept - match moments (not areas) in classification MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Corrected fundamental misunderstanding across all three documents: **Before (incorrect)**: - Predefined "area themes" (Product/Data/UX/Strategy) - Hardcoded keywords per area - Classification matched activity → area theme - Required maintaining theme taxonomy **After (correct per CLAUDE.md)**: - Areas are life domains (Wellness, Craft, Social, Joyful, Introspective) - Moments are specific intentions ("Product Spec", "Data Analysis") - Classification matches activity → current moment name - Moment names are self-descriptive, no keywords needed **Why this is better**: 1. More specific matching (moment-level vs area-level) 2. No hardcoded themes to maintain 3. Moment names already provide semantic context 4. 
Aligns with Zenborg's core domain model **Changes across all docs**: - PRD: Use moment.name as semantic anchor in classification - Critical path: Remove "theme" prompt, just ask for moment name - Tiny version: Match AW categories to moment names (not areas) - Appendix: Replace hardcoded themes with moment examples **Classification approach now**: ```typescript // Zero-shot classifier(activityDescription, [ `working on: ${moment.name}`, // e.g., "working on: Product Spec" 'distracted or browsing', 'transitional activity', 'no activity' ]) // Semantic similarity embed(moment.name) // Just the moment name - it's self-descriptive! ``` šŸ¤– Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude --- docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md | 47 +++--- docs/prds/ACTIVITY_WATCH_PROJECT.md | 168 ++++++++-------------- docs/prds/ACTIVITY_WATCH_TINY_VERSION.md | 96 ++++++++----- 3 files changed, 137 insertions(+), 174 deletions(-) diff --git a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md index 3cd4243..db71daf 100644 --- a/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md +++ b/docs/prds/ACTIVITY_WATCH_CRITICAL_PATH.md @@ -58,9 +58,9 @@ const classifier = await pipeline( ) while (true) { - // 1. Get current intention from user - const intention = await promptUser("What are you working on?") - const theme = await promptUser("Theme? (product/data/ux/strategy)") + // 1. Get current intention from user (just the moment name) + const momentName = await promptUser("What are you working on?") + // e.g., "Product Spec", "Data Analysis", "Morning Run" // 2. Poll ActivityWatch every 5 minutes await sleep(5 * 60 * 1000) @@ -81,14 +81,14 @@ while (true) { .join(', ') const description = ` - Working on: ${intention} (${theme} work) + Current intention: ${momentName} Recent activity: ${activityText} ` - // 6. Classify with Transformer.js + // 6. 
Classify with Transformer.js (moment name is semantic anchor) const result = await classifier(description, [ - 'aligned with stated intention', - 'drifting from stated intention', + `working on: ${momentName}`, // e.g., "working on: Product Spec" + 'distracted or browsing unrelated content', 'neutral or transitional activity', 'no significant activity' ]) @@ -116,7 +116,7 @@ while (true) { ### Day 1: Personal Dogfooding **Morning Session (3 hours)**: -- Set intention: "Product Spec" (Product theme) +- Set intention: "Product Spec" (just the moment name) - Work normally for 3 hours - Observe compass updates every 5 min - Note: When did you notice drift? Did you self-correct? @@ -128,7 +128,7 @@ while (true) { - Was 5-min polling too slow/too fast? **Afternoon Session (3 hours)**: -- Set intention: "Data Analysis" (Data theme) +- Set intention: "Data Analysis" - Intentionally drift to Twitter/email after 30 min - Observe: How long until compass shows "drifting"? - Self-correct: Does returning to Jupyter change compass back to "aligned"? @@ -188,12 +188,9 @@ $ npm run test-compass Loading classifier... (first run downloads BART model ~400MB) āœ“ Classifier ready (facebook/bart-large-mnli) -What are you working on? (3 words max) +What are you working on? (moment name, 1-3 words) > Product Spec -Theme? (product/data/ux/strategy) -> product - āœ“ Monitoring ActivityWatch every 5 minutes... Press Ctrl+C to stop or change intention @@ -434,19 +431,17 @@ const relatedNotes = journalNotes .sort((a, b) => b.score - a.score) ``` -**2. Auto-Tagging Notes** +**2. 
Auto-Linking Notes to Moments** ```typescript -// Tag note with theme (product/data/ux/strategy) +// Find which moment this journal note relates to const classifier = await pipeline('zero-shot-classification', 'facebook/bart-large-mnli') -const result = await classifier(journalNote.content, [ - 'product work and prioritization', - 'data analysis and experiments', - 'UX design and prototyping', - 'strategic thinking and planning' -]) +// Get all active moments +const moments = ["Product Spec", "Data Analysis", "Morning Run", "Deep Reading"] + +const result = await classifier(journalNote.content, moments) -journalNote.theme = result.labels[0] +journalNote.relatedMoment = result.labels[0] // Most likely moment journalNote.confidence = result.scores[0] ``` @@ -463,12 +458,12 @@ const similar = pastMoments ### Integration Points -- **Moment creation**: Suggest related journal notes -- **Journal writing**: Auto-tag with themes from current moment +- **Moment creation**: Suggest related journal notes based on moment name +- **Journal writing**: Auto-link notes to the current moment you're working on - **Reflection**: "Show me notes from when I worked on similar moments" -- **Search**: Semantic search across all notes and moments +- **Search**: Semantic search across all notes and moments by name/content -**Advantage**: One model download, multiple features. Zero-config semantic intelligence across the whole app. +**Advantage**: One model download, multiple features. Zero-config semantic intelligence across the whole app. No need to manually tag or categorize anything. --- diff --git a/docs/prds/ACTIVITY_WATCH_PROJECT.md b/docs/prds/ACTIVITY_WATCH_PROJECT.md index 2219bc8..b96a2d2 100644 --- a/docs/prds/ACTIVITY_WATCH_PROJECT.md +++ b/docs/prds/ACTIVITY_WATCH_PROJECT.md @@ -111,36 +111,11 @@ Current: "Product Spec" ā˜• Morning ### Data Model Extensions -**Area** (extended from Zenborg core): -```typescript -interface Area { - // ... existing fields ... 
- themeKeywords?: string[] // ["linear", "notion", "spec", "roadmap"] - themeDescription?: string // "Product work: writing specs, prioritizing..." -} -``` - -**Default Area Themes** (for user "Thopiax"): -```typescript -const DEFAULT_THEMES = { - "Product": { - keywords: ["linear", "notion", "spec", "roadmap", "jira", "prd"], - description: "Writing specs, scopes, prioritizing features" - }, - "Data": { - keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas"], - description: "Exploring data, writing models, running batches, experiments" - }, - "UX": { - keywords: ["figma", "framer", "prototype", "design", "css", "component"], - description: "Prototyping, fine-tuning interfaces" - }, - "Strategy": { - keywords: ["docs", "notes", "research", "reading", "writing"], - description: "Slow, deliberate thinking and planning" - } -} -``` +**Note on Areas vs Moments**: +- **Areas** are life domains (Wellness, Craft, Social, Joyful, Introspective) per CLAUDE.md +- **Moments** are specific intentions like "Product Spec", "Data Analysis", "Morning Run" +- Classification matches activity → **current moment**, not area +- Moment names provide semantic context (e.g., "Product Spec" implies Linear/Notion/specs) **AlignmentEvent** (new entity): ```typescript @@ -148,10 +123,9 @@ interface AlignmentEvent { id: string // UUID momentId: string // FK to Moment timestamp: string // ISO timestamp - classification: AlignmentType // "aligned" | "neutral" | "drifting" + classification: AlignmentType // "aligned" | "neutral" | "drifting" | "untracked" confidence: number // 0.0-1.0 observedActivities: ActivitySummary[] - themeDetected: string | null // "product", "data", etc. 
createdAt: string } @@ -193,45 +167,26 @@ const classifier = await pipeline( 'facebook/bart-large-mnli' ) -// Define candidate labels based on moment's theme -const labels = { - aligned: [ - moment.area.themeDescription, // "Writing specs, prioritizing features" - ...moment.area.keywords, // ["linear", "notion", "spec"] - ], - drifting: [ - "social media browsing", - "news reading", - "entertainment", - "unrelated work" - ], - neutral: [ - "email communication", - "team chat", - "quick searches", - "context switching" - ] -} - // Build activity description from AW events const activityDescription = ` -User is working on: "${moment.name}" (${moment.area.name} - ${moment.area.themeDescription}) +Current intention: "${moment.name}" (${moment.area.name}) +Context: User committed to working on this during ${phase}. Recent activity (last 15 min): ${activity.map(a => `- ${a.app}: ${a.windowTitle} (${a.duration}s)`).join('\n')} ` -// Classify alignment +// Classify alignment using moment name as semantic anchor const result = await classifier(activityDescription, [ - 'aligned with stated intention', - 'drifting from stated intention', - 'neutral or transitional activity', - 'no significant digital activity' + `working on: ${moment.name}`, // e.g., "working on: Product Spec" + 'distracted or browsing unrelated content', + 'transitional activity like email or chat', + 'no significant activity observed' ]) // Map to AlignmentType const classification = mapToAlignment(result.labels[0], result.scores[0]) -// { classification: "aligned", confidence: 0.89, themeDetected: "product" } +// { classification: "aligned", confidence: 0.89 } ``` **Alternative: Semantic Similarity** (faster, simpler): @@ -245,10 +200,9 @@ const embedder = await pipeline( 'Xenova/all-MiniLM-L6-v2' ) -// Embed intention -const intentionEmbedding = await embedder( - `${moment.name}: ${moment.area.themeDescription}` -) +// Embed intention (just the moment name - it's self-descriptive) +const 
intentionEmbedding = await embedder(moment.name) +// e.g., "Product Spec" or "Morning Run" // Embed observed activity const activityEmbedding = await embedder( @@ -271,11 +225,10 @@ const classification = interface ClassificationResult { classification: AlignmentType // "aligned" | "neutral" | "drifting" | "untracked" confidence: number // 0.0-1.0 (from model scores) - themeDetected: string | null // "product" | "data" | "ux" | "strategy" method: 'zero-shot' | 'similarity' // which approach was used } -// Store in IndexedDB as AlignmentEvent +// Store in IndexedDB as AlignmentEvent (linked to moment via momentId) ``` --- @@ -619,55 +572,50 @@ interface ClassificationResult { --- -## Appendix: User's Default Themes +## Appendix: Example Moment-to-Activity Mappings -**For "Thopiax" (MVP hardcoded)**: +**How Semantic Classification Works**: -```typescript -export const THOPIAX_THEMES = { - "Product Work": { - keywords: ["linear", "notion", "jira", "asana", "roadmap", "spec", "prd", "priorit"], - description: "Writing specs, scopes, prioritizing features, planning roadmaps", - exampleActivities: [ - "Linear - Product Roadmap Q2", - "Notion - PRD: New Onboarding Flow", - "Slack - #product-team" - ] - }, - "Data Work": { - keywords: ["jupyter", "python", "sql", "postgres", "dbt", "pandas", "numpy", "colab"], - description: "Exploring data, writing models, running batches, tweaking experiments", - exampleActivities: [ - "Jupyter Notebook - user_retention_analysis.ipynb", - "pgAdmin - Query: weekly_active_users", - "Terminal - python run_experiment.py" - ] - }, - "UX Work": { - keywords: ["figma", "framer", "sketch", "prototype", "design", "component", "css", "tailwind"], - description: "Prototyping interfaces, fine-tuning designs, iterating on components", - exampleActivities: [ - "Figma - Zenborg Compass Redesign", - "VS Code - MomentCard.tsx", - "Chrome - Tailwind CSS Docs" - ] - }, - "Strategy Work": { - keywords: ["docs", "notion", "notes", "obsidian", "research", 
"reading", "writing", "plan"], - description: "Slow, deliberate thinking, strategic planning, deep reading", - exampleActivities: [ - "Google Docs - Q3 Strategy Draft", - "Notion - Weekly Reflection", - "Safari - Reading: Shape Up (Basecamp)" - ] - } -} +Moment names are self-descriptive. The classifier matches observed activity against the moment name semantically: + +**Example 1: "Product Spec" (Area: Craft)** +``` +Moment: "Product Spec" +Observed: Linear, Notion, Slack #product-team +Classification: āœ“ Aligned (semantic match with spec/planning work) + +Moment: "Product Spec" +Observed: Twitter, Hacker News +Classification: āœ— Drifting (no semantic connection) +``` + +**Example 2: "Data Analysis" (Area: Craft)** +``` +Moment: "Data Analysis" +Observed: Jupyter Notebook, pgAdmin, Python +Classification: āœ“ Aligned (semantic match with data/analysis work) + +Moment: "Data Analysis" +Observed: Figma, Design System Docs +Classification: āœ— Drifting (different domain - design vs. data) +``` + +**Example 3: "Morning Run" (Area: Wellness)** +``` +Moment: "Morning Run" +Observed: No digital activity +Classification: ? Untracked (expected for physical activity) + +Moment: "Morning Run" +Observed: Strava, Spotify +Classification: āœ“ Aligned (related apps for running) ``` -**Usage in Classification**: -- When moment.area matches theme name, use corresponding keywords/description -- LLM considers semantic overlap (e.g., "Slack #product-team" → Product Work) -- Themes evolve with user (future: custom theme editor) +**Key Insight**: No hardcoded keywords needed. 
The model understands semantic relationships: +- "Product Spec" → Linear, Notion, planning tools +- "Data Analysis" → Jupyter, SQL, Python +- "UX Prototype" → Figma, design tools +- "Morning Run" → fitness apps or no digital activity --- diff --git a/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md index fa450e7..16b5857 100644 --- a/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md +++ b/docs/prds/ACTIVITY_WATCH_TINY_VERSION.md @@ -11,9 +11,9 @@ Instead of AI classification, **manually label activities** using ActivityWatch's built-in category system: 1. User downloads & runs ActivityWatch themselves -2. Zenborg syncs Areas → ActivityWatch categories/labels -3. User manually labels activities in ActivityWatch UI (or via script) -4. Zenborg fetches labeled data to show alignment +2. User manually labels activities in ActivityWatch UI with moment-like names +3. Zenborg fetches labeled data and matches against current moment +4. Shows alignment: does activity label match current moment? **No AI. No classification. 
Just basic CRUD operations on localhost.** @@ -48,14 +48,11 @@ ActivityWatch has a built-in **event classification system**: ``` ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” -│ Zenborg Areas │ -│ - Product Work │ -│ - Data Work │ -│ - UX Work │ -│ - Strategy Work │ +│ Zenborg Current Moment │ +│ "Product Spec" (Area: Craft) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ - │ POST /api/aw/sync-categories + │ Fetch events with categories ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ ActivityWatch REST API │ @@ -66,12 +63,13 @@ ActivityWatch has a built-in **event classification system**: │ /api/0/query/ │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ - │ GET events with $category + │ Events with $category labels ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ Zenborg UI - Alignment View │ -│ "You spent 2h on Product Work" │ -│ "Last 15min: Linear (Product)" │ +│ Current: "Product Spec" │ +│ Last 15min: Linear [Product Work] │ +│ 🧭 ↑ Aligned (category matches intent) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ ``` @@ -193,31 +191,36 @@ export class ActivityWatchClient { } ``` -### 2. Sync Zenborg Areas → AW Categories +### 2. 
Suggest Category Setup (Read-Only)

```typescript
-// src/application/use-cases/sync-areas-to-aw.ts
+// src/application/use-cases/suggest-aw-categories.ts
import { ActivityWatchClient } from '@/infrastructure/activitywatch/aw-client'
-import { Area } from '@/domain/entities/area'
+import { Moment } from '@/domain/entities/moment'

-export async function syncAreasToAW(areas: Area[]): Promise<void> {
+export async function suggestAWCategories(moments: Moment[]): Promise<void> {
  const awClient = new ActivityWatchClient()

  // Check if AW is running
  const isRunning = await awClient.isRunning()
  if (!isRunning) {
-    console.warn('ActivityWatch not running, skipping sync')
+    console.warn('ActivityWatch not running')
    return
  }

-  // For tiny version: just log categories
-  // User will manually set them in AW UI or via regex rules
-  console.log('Zenborg Areas → ActivityWatch Categories:')
-  areas.forEach(area => {
-    console.log(`  - ${area.name}: ${area.themeKeywords?.join(', ')}`)
+  // For tiny version: just suggest category names based on common moments
+  console.log('šŸ’” Suggested ActivityWatch categories (set up in AW UI):')
+
+  const uniqueMomentNames = [...new Set(moments.map(m => m.name))]
+
+  uniqueMomentNames.forEach(name => {
+    console.log(`  - "${name}"`)
  })

+  console.log('\nšŸ‘‰ Configure these in ActivityWatch UI: http://localhost:5600')
+  console.log('   Settings → Categories → Add Rules')
+
  // Future: Auto-create categorization rules via AW API
  // (AW doesn't have a public API for this yet, needs manual config)
}
@@ -271,9 +274,13 @@ export async function getAlignmentStatus(
  const lastActivity = Object.values(activitySummary)
    .sort((a, b) => b.duration - a.duration)

-  // Check if aligned: does any activity's category match moment's area?
+  // Check if aligned: does any activity's category match moment name? 
+  // Supports exact match or substring match (e.g., category "Product" matches moment "Product Spec")
  const aligned = lastActivity.some(activity =>
-    activity.category?.includes(currentMoment.area.name)
+    activity.category?.some(cat =>
+      cat.toLowerCase().includes(currentMoment.name.toLowerCase()) ||
+      currentMoment.name.toLowerCase().includes(cat.toLowerCase())
+    )
  )

  const totalTime = lastActivity.reduce((sum, a) => sum + a.duration, 0)
@@ -382,24 +389,31 @@
Open ActivityWatch UI (http://localhost:5600):

**Settings → Categories → Add Rules**:

+Configure categories based on your common **moment names** (not areas):
+
```
-Product Work:
-  - regex: "Linear|Notion|Jira|Asana|PRD"
+Product Spec:
+  - regex: "Linear|Notion|Jira|PRD|Spec|Roadmap"
  - regex: "#product"

-Data Work:
-  - regex: "Jupyter|Python|SQL|Postgres|dbt"
+Data Analysis:
+  - regex: "Jupyter|Python|SQL|Postgres|Pandas"
  - regex: "\.ipynb|\.py|\.sql"

-UX Work:
+UX Prototype:
  - regex: "Figma|Framer|Sketch|Design"
-  - regex: "\.tsx|\.css|Tailwind"
+  - regex: "\.tsx|\.css|component"
+
+Deep Reading:
+  - regex: "Docs|PDF|Reader|Articles"
+  - regex: "Reading|Research"

-Strategy Work:
-  - regex: "Docs|Notes|Obsidian|Research"
-  - regex: "Strategy|Planning|Reflection"
+Email:
+  - regex: "Gmail|Outlook|Mail"
```

+**Key**: Category names should match your typical moment names ("Product Spec", "Data Analysis"), not areas ("Craft", "Wellness").
+
### 3. 
Test Zenborg Integration

```bash
npx tsx scripts/test-aw-integration.ts
```

@@ -492,25 +506,26 @@

## Alignment Logic (No AI)

**Simple rule**: Activity is "aligned" if:
-- Activity's `$category` matches current moment's `area.name`
+- Activity's `$category` matches (or relates to) current moment name

**Example**:

```typescript
-// User is working on moment "Product Spec" (area: "Product Work")
+// User is working on moment "Product Spec" (area: Craft)
+// ActivityWatch categories configured to label Linear/Notion as "Product Spec"

// Last 15 min activity:
[
-  { app: "Linear", category: ["Product Work"], duration: 600 },
+  { app: "Linear", category: ["Product Spec"], duration: 600 },
  { app: "Slack", category: ["Communication"], duration: 180 },
  { app: "Chrome - Twitter", category: null, duration: 120 }
]

// Alignment calculation:
-const productTime = 600  // Linear
+const alignedTime = 600  // Linear (category matches moment name)
 const otherTime = 300    // Slack + Twitter

-aligned = productTime > otherTime  // true
+aligned = alignedTime > otherTime  // true
```

**Compass state**:
@@ -518,6 +533,11 @@ aligned = productTime > otherTime  // true
- `🧭 ↙ Drifting` if < 50% of time in matching category
- `🧭 ā—‹ Untracked` if no categorized events

+**Matching logic**:
+- Exact match: moment = "Product Spec", category = "Product Spec" → āœ“
+- Fuzzy match: moment = "Product Spec", category = "Product" → āœ“ (substring match, either direction)
+- No match: moment = "Product Spec", category = "Email" → āœ—
+
---

## Advantages of Tiny Version