Skip to content

Latest commit

Β 

History

History
628 lines (466 loc) Β· 21.6 KB

File metadata and controls

628 lines (466 loc) Β· 21.6 KB

ClarityLens πŸ”

On-device AI assistant for Chrome leveraging Built-in AI APIs for summarization, rewriting, translation, proofreading, and image understanding.

✨ Features

  • πŸ“ Simplify & Summarize: TLDR summaries for paragraphs with adjustable length and reading level
  • πŸ–ΌοΈ Explain Images: AI-generated alt text and chart/diagram explanations using multimodal Prompt API
  • ✏️ Proofread Forms: Real-time grammar and spelling suggestions with categorized, plain-language fixes
  • 🌐 Translate Pages: In-page translation with language detection, hover-to-view original, and per-site preferences
  • β™Ώ WCAG 2.2 Compliant: Strong focus rings, reduced motion support, screen reader announcements, and accessible controls
  • πŸ“Š Tech Overlay: Real-time API status, latency metrics, and on-device indicators

πŸš€ Quick Start

Prerequisites

  1. Chrome Canary or Dev (version 127+)
  2. Enable Built-in AI flags in chrome://flags:
    • #optimization-guide-on-device-model
    • #prompt-api-for-gemini-nano
    • #summarization-api-for-gemini-nano
    • #rewriter-api-for-gemini-nano
    • #translation-api
  3. Restart Chrome
  4. Visit chrome://components and update "Optimization Guide On Device Model"

Installation

# Clone repository
git clone https://github.com/stealthwhizz/ClarityLens.git
cd ClarityLens

# Install dependencies
npm install

# Build extension
npm run build

# Package for distribution
npm run zip

Load Extension

  1. Open chrome://extensions
  2. Enable "Developer mode"
  3. Click "Load unpacked"
  4. Select the dist folder

πŸ› οΈ Development

# Watch mode for development
npm run watch

# Clean build artifacts
npm run clean

# Build for production
npm run build

πŸ“‹ Project Structure

ClarityLens/
β”œβ”€β”€ manifest.json              # Chrome extension manifest (MV3)
β”œβ”€β”€ package.json               # Dependencies and scripts
β”œβ”€β”€ tsconfig.json              # TypeScript configuration
β”œβ”€β”€ vite.config.ts             # Vite bundler configuration
β”œβ”€β”€ public/
β”‚   └── icons/                 # Extension icons (16, 48, 128px)
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ background/
β”‚   β”‚   └── service-worker.ts  # Background script for context menus, messaging
β”‚   β”œβ”€β”€ content/
β”‚   β”‚   β”œβ”€β”€ content.ts         # Main content script bootstrap
β”‚   β”‚   β”œβ”€β”€ capabilities.ts    # API detection and fallbacks
β”‚   β”‚   β”œβ”€β”€ simplify.ts        # Summarizer + Rewriter integration
β”‚   β”‚   β”œβ”€β”€ images.ts          # Prompt API multimodal for images
β”‚   β”‚   β”œβ”€β”€ forms.ts           # Proofreader integration
β”‚   β”‚   β”œβ”€β”€ translate.ts       # Translator + Language Detector
β”‚   β”‚   └── wcag.ts            # Accessibility utilities
β”‚   β”œβ”€β”€ lib/
β”‚   β”‚   β”œβ”€β”€ ai/                # AI API wrappers
β”‚   β”‚   β”‚   β”œβ”€β”€ summarizer.ts
β”‚   β”‚   β”‚   β”œβ”€β”€ rewriter.ts
β”‚   β”‚   β”‚   β”œβ”€β”€ prompt.ts
β”‚   β”‚   β”‚   β”œβ”€β”€ proofreader.ts
β”‚   β”‚   β”‚   └── translator.ts
β”‚   β”‚   β”œβ”€β”€ dom.ts             # Safe DOM manipulation
β”‚   β”‚   β”œβ”€β”€ storage.ts         # Chrome storage wrapper
β”‚   β”‚   └── telemetry.ts       # API status overlay
β”‚   └── ui/
β”‚       β”œβ”€β”€ popup.html/ts/css  # Extension popup
β”‚       └── options.html/ts    # Settings page
β”œβ”€β”€ scripts/
β”‚   └── zip.mjs                # Packaging script
└── dist/                      # Build output (load as unpacked extension)

🎨 Design System

CSS Custom Properties

/* Colors */
--cl-primary: #2563eb;
--cl-success: #10b981;
--cl-error: #ef4444;
--cl-warning: #f59e0b;

/* Spacing */
--cl-space-xs: 4px;
--cl-space-sm: 8px;
--cl-space-md: 16px;
--cl-space-lg: 24px;

/* Typography */
--cl-font-size-base: 14px;
--cl-line-height-normal: 1.5;

/* Transitions (respects prefers-reduced-motion) */
--cl-transition-fast: 150ms;
--cl-transition-base: 200ms;

Accessibility Features

  • Keyboard navigation: All controls reachable via Tab, activated with Enter/Space
  • Focus indicators: 3px outline with 2px offset for clear visibility
  • Screen reader support: ARIA live regions, labels, and announcements
  • Reduced motion: Animations disabled when prefers-reduced-motion: reduce
  • Target size: Minimum 36Γ—36px for touch targets (WCAG 2.2 Level AA)
  • Color contrast: 4.5:1 minimum for text, 3:1 for UI components

πŸ“– API Usage

Summarizer

import { summarizerAPI } from './lib/ai/summarizer.js';

await summarizerAPI.initialize();
const result = await summarizerAPI.summarize(text, {
  type: 'tl;dr',
  length: 'medium',
});

Rewriter

import { rewriterAPI } from './lib/ai/rewriter.js';

await rewriterAPI.initialize();
const result = await rewriterAPI.rewrite(text, {
  tone: 'more-casual',
  length: 'shorter',
});

Prompt API (Multimodal)

import { promptAPI } from './lib/ai/prompt.js';

await promptAPI.initialize();
const result = await promptAPI.describeImage(imageElement);
const chartExplanation = await promptAPI.explainChart(imageElement);

Proofreader

import { proofreaderAPI } from './lib/ai/proofreader.js';

await proofreaderAPI.initialize();
const result = await proofreaderAPI.proofread(text);
// result.issues: Array<{ type, start, end, suggestions, message }>

Translator

import { translatorAPI } from './lib/ai/translator.js';

await translatorAPI.initialize();
const langResult = await translatorAPI.detectLanguage(text);
const translation = await translatorAPI.translate(text, {
  sourceLanguage: 'en',
  targetLanguage: 'es',
});

πŸ”‘ Permissions Rationale

  • activeTab: Access page content for AI processing (only on user action)
  • scripting: Inject content scripts for on-page features
  • storage: Save user preferences and per-site settings
  • contextMenus: "Explain image" right-click menu

No host permissions or network requests β€” all processing is on-device.

βœ… Acceptance Tests

1. Simplify (Summarizer + Rewriter)

Test: On a dense article, click "Add TLDRs to Page"

  • Per-paragraph TLDR panels appear without layout shift
  • Summaries are concise and relevant
  • "Undo" button restores original text
  • Reading level slider updates summaries instantly
  • Keyboard navigation reaches all controls

2. Explain Images (Prompt API Multimodal)

Test: On a page with charts, right-click an image β†’ "Explain this image"

  • AI-generated description appears as alt attribute
  • Charts get <figcaption> with explanation
  • Screen reader announces new description
  • Explanation panel shows detailed analysis

3. Proofread Forms (Proofreader)

Test: Type text with errors into a textarea

  • Suggestions appear below input with categorized issues (spelling, grammar, punctuation, style)
  • Clicking suggestion replaces text
  • Screen reader announces number of suggestions
  • Plain-language explanations (e.g., "This word may be misspelled")

4. Translate (Translator + Language Detector)

Test: On an English page, select "Hindi" or "Kannada" and click "Translate Page"

  • Content translates in place without layout shift
  • Hovering over translated text shows original
  • Language preference persists on page refresh
  • "Restore Original" button reverts changes

5. Accessibility (WCAG 2.2)

Test: Navigate extension with keyboard only

  • All controls reachable via Tab
  • Focus indicators visible (3px blue outline)
  • Enter/Space activates buttons
  • Screen reader announces state changes
  • Reduced motion honored (check prefers-reduced-motion)
  • Target sizes β‰₯ 36Γ—36px for all interactive elements

6. Tech Overlay

Test: Enable "Show Tech Overlay" in popup

  • Overlay displays API names (Summarizer, Rewriter, Prompt API, etc.)
  • Shows "On-device" or "Unavailable" status
  • Displays latency in milliseconds after each API call
  • Updates in real-time during operations

πŸ› Troubleshooting

APIs Show as "Unavailable"

  1. Verify Chrome version: chrome://version (must be 127+)
  2. Check flags: chrome://flags (all 5 flags enabled?)
  3. Update model: chrome://components β†’ "Optimization Guide On Device Model" β†’ "Check for update"
  4. Restart Chrome completely (quit, not just close window)

"Failed to create session" Errors

  • After-download status: Model is downloading in background. Wait 1-2 minutes and retry.
  • Network issues: Check internet connection for initial model download.

Content Script Not Running

  • Reload extension: chrome://extensions β†’ Click reload icon
  • Check console: Right-click extension icon β†’ "Inspect popup" β†’ Console tab
  • Verify permissions: Manifest must include activeTab and scripting

πŸ“š Resources

πŸ“„ License

MIT License - see LICENSE file

🀝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open Pull Request

πŸ™ Acknowledgments

Built with Chrome's Built-in AI APIs (Early Preview). All AI processing runs on-device using Gemini Nano models.


Made with ❀️ by the ClarityLens Team

For issues, feature requests, or questions, please open an issue

Understand any page, instantly.

ClarityLens is an on-device Chrome extension that simplifies dense paragraphs, explains images with multimodal AI, fixes form inputs with plain-language guidance, and translates pages in placeβ€”powered by Chrome's built-in AI APIs (Summarizer, Rewriter, Prompt, Proofreader, Translator) and Gemini Nano.

Demo Video License: MIT Chrome Extension


🎯 Problem Statement

Many users struggle to read dense web content, understand unlabeled images, fix form errors, and browse in their preferred language. Existing tools rarely address text, images, forms, and translation together while preserving privacy and working offline. ClarityLens solves this by unifying five Chrome built-in AI APIs in one seamless, on-device workflow.


✨ Features

1. Smart Text Simplification

  • APIs Used: Summarizer API, Rewriter API
  • Generate TLDRs for long paragraphs
  • Adjust reading level and tone (concise/neutral/detailed)
  • Preserve original layout with quick undo
  • Reduces cognitive load for ADHD, dyslexia, and ESL users

2. Multimodal Image Understanding

  • API Used: Prompt API (multimodal)
  • Generate descriptive alt-text for unlabeled images
  • "Explain this chart/diagram" for complex visuals
  • Inject ARIA-friendly labels for screen readers
  • Improves perceivability for visually impaired users

3. Intelligent Form Proofreading

  • API Used: Proofreader API
  • Detect grammar and spelling errors in form inputs
  • Provide categorized, plain-language explanations
  • Accessible error messages for assistive technologies
  • Increases successful form submissions

4. Seamless In-Page Translation

  • APIs Used: Translator API, Language Detector API
  • Detect page language automatically
  • One-click translation to preferred language (English, Hindi, Kannada, etc.)
  • Hover to view original text
  • Per-site language preference persistence

5. WCAG 2.2 Accessibility

  • Clear keyboard focus indicators
  • Predictable interactions on focus/hover
  • Reduced motion support
  • Adequate target sizes for motor-impaired users

πŸ—οΈ Architecture

ClarityLens/
β”œβ”€β”€ manifest.json              # Manifest V3 configuration
β”œβ”€β”€ package.json               # Build dependencies
β”œβ”€β”€ LICENSE                    # MIT License
β”œβ”€β”€ README.md                  # This file
β”œβ”€β”€ public/
β”‚   └── icons/                 # Extension icons (16, 48, 128px)
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ background/
β”‚   β”‚   └── service-worker.ts  # Background service worker
β”‚   β”œβ”€β”€ content/
β”‚   β”‚   β”œβ”€β”€ content.ts         # Main content script bootstrap
β”‚   β”‚   β”œβ”€β”€ simplify.ts        # Summarizer/Rewriter integration
β”‚   β”‚   β”œβ”€β”€ images.ts          # Prompt API multimodal
β”‚   β”‚   β”œβ”€β”€ forms.ts           # Proofreader integration
β”‚   β”‚   β”œβ”€β”€ translate.ts       # Translator + Language Detector
β”‚   β”‚   β”œβ”€β”€ wcag.ts            # WCAG 2.2 UX enhancements
β”‚   β”‚   └── capabilities.ts    # API detection & fallbacks
β”‚   β”œβ”€β”€ lib/
β”‚   β”‚   β”œβ”€β”€ ai/                # AI API wrappers
β”‚   β”‚   β”œβ”€β”€ dom.ts             # DOM utilities
β”‚   β”‚   β”œβ”€β”€ storage.ts         # Chrome storage API
β”‚   β”‚   └── telemetry.ts       # Performance overlay
β”‚   └── ui/
β”‚       β”œβ”€β”€ popup.html         # Extension popup
β”‚       β”œβ”€β”€ popup.ts           # Popup logic
β”‚       β”œβ”€β”€ options.html       # Options page
β”‚       β”œβ”€β”€ options.ts         # Options logic
β”‚       └── styles.css         # Global styles
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ build.mjs              # TypeScript build script
β”‚   └── zip.mjs                # Package for submission
└── dist/                      # Build output

πŸš€ Installation & Setup

Prerequisites

  • Chrome 127+ (Dev/Canary channel recommended for Early Preview APIs)
  • Node.js 18+ and npm/yarn (for building from source)
  • Enable Chrome AI flags (see below)

Chrome Flags Setup

ClarityLens requires Chrome's built-in AI APIs. Enable these flags in chrome://flags:

  1. Prompt API for Gemini Nano: #prompt-api-for-gemini-nano β†’ Enabled
  2. Summarization API: #summarization-api β†’ Enabled
  3. Writer API: #writer-api β†’ Enabled
  4. Rewriter API: #rewriter-api β†’ Enabled
  5. Translation API: #translation-api β†’ Enabled
  6. Proofreader API: #proofreader-api β†’ Enabled (if available)
  7. Prompt API for Gemini Nano Multimodal: #prompt-api-for-gemini-nano-multimodal β†’ Enabled

Restart Chrome after enabling flags.

Build from Source

# Clone the repository
git clone https://github.com/YOUR_USERNAME/claritylens.git
cd claritylens

# Install dependencies
npm install

# Build the extension
npm run build

# For development with hot reload
npm run watch

Load Unpacked Extension

  1. Open Chrome and navigate to chrome://extensions/
  2. Enable Developer mode (toggle in top-right)
  3. Click Load unpacked
  4. Select the dist/ folder from the project directory
  5. ClarityLens icon should appear in your extensions toolbar

πŸ§ͺ Testing

Test Page 1: Text Simplification

  1. Open any dense article (e.g., Wikipedia, academic paper, news site)
  2. Click the ClarityLens icon β†’ Simplify Text
  3. Observe paragraph-level TLDRs and adjustable reading level
  4. Toggle Undo to restore original text
  5. Expected: Simplified text appears inline without layout shift

Test Page 2: Image Explanation

  1. Navigate to a page with charts/diagrams (e.g., data visualization, infographic)
  2. Right-click an image β†’ Explain Image (context menu)
  3. Expected: Alt-text generated and "Explain this chart" summary displayed
  4. Test with screen reader (NVDA/JAWS) to verify ARIA labels

Test Page 3: Form Proofreading

  1. Open any web form (e.g., contact form, comment box)
  2. Type text with intentional grammar/spelling errors
  3. Focus on input β†’ ClarityLens auto-detects errors
  4. Expected: Plain-language error explanations with suggestions

Test Page 4: In-Page Translation

  1. Open a page in English
  2. Click ClarityLens icon β†’ Translate β†’ Select Hindi/Kannada
  3. Expected: Page content translates in-place within 1-2 seconds
  4. Hover over translated text to view original
  5. Refresh page β†’ language preference persists

Test Page 5: Keyboard Accessibility

  1. Navigate using Tab key only (no mouse)
  2. Trigger each feature via keyboard shortcuts (see Options)
  3. Expected: All features accessible, clear focus indicators visible

πŸ› οΈ Capability Checks & Fallbacks

ClarityLens gracefully handles API availability:

  • Summarizer/Rewriter: Check ai.summarizer and ai.rewriter availability
  • Prompt API (multimodal): Check ai.languageModel with image support
  • Proofreader: Check ai.proofreader (Early Preview API)
  • Translator: Check translation.canTranslate()

If an API is unavailable, the extension:

  1. Displays a feature unavailable tooltip in the popup
  2. Shows a link to the Chrome flags setup guide
  3. Continues functioning with available APIs

πŸ”’ Privacy & Performance

On-Device Processing

  • All AI processing runs locally using Gemini Nano
  • No data sent to external servers
  • Works offline for core features (summarization, rewriting)
  • Translation may require network for language model downloads

Performance Benchmarks

Feature Average Response Time On-Device
Text Simplification ~200ms βœ…
Image Alt-Text ~400ms βœ…
Form Proofreading ~150ms βœ…
In-Page Translation ~300ms βœ…*

*Initial language model download may take 1-2 minutes


πŸ“Š Chrome Built-in AI APIs Used

API Purpose Status
Summarizer API Generate TLDRs for paragraphs Stable
Rewriter API Adjust reading level and tone Stable
Prompt API (multimodal) Image understanding & alt-text Early Preview
Proofreader API Grammar/spelling detection Early Preview
Translator API In-page language translation Stable
Language Detector API Auto-detect page language Stable

πŸŽ₯ Demo Video

Watch the 3-minute demo: YouTube Link

Demo Script:

  1. Simplify a dense Wikipedia article with TLDR
  2. Generate alt-text for an unlabeled chart
  3. Proofread a form with errors and fix them
  4. Translate page to Hindi and back
  5. Navigate all features with keyboard only

πŸ† Google Chrome Built-in AI Challenge 2025

This project was built for the Google Chrome Built-in AI Challenge 2025.

Judging Criteria Alignment

Criterion How ClarityLens Scores
Functionality Works across articles, forms, and multilingual pages; scales to global audiences
Purpose Meaningfully improves reading, image understanding, form completion, and multilingual browsing
Content Clean UI with consistent design tokens, subtle animations, and minimal chrome
User Experience 3-step onboarding, keyboard shortcuts, WCAG 2.2 compliance, accessible to all users
Technological Execution Showcases 6 built-in AI APIs in one cohesive workflow with explicit on-device indicators

πŸ› Troubleshooting

"API not available" error

  • Solution: Ensure Chrome flags are enabled (see Installation)
  • Restart Chrome after enabling flags
  • Use Chrome Dev/Canary channel for Early Preview APIs

Multimodal Prompt not working

  • Solution: Enable #prompt-api-for-gemini-nano-multimodal flag
  • Ensure image size < 5MB and supported formats (JPG, PNG, WebP)

Translation slow on first use

  • Solution: Wait 1-2 minutes for initial language model download
  • Check network connection for model fetch

Extension not loading

  • Solution: Verify dist/ folder contains manifest.json
  • Check browser console for error messages
  • Reload extension in chrome://extensions/

🀝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“„ License

This project is licensed under the MIT License - see LICENSE file for details.


πŸ™ Acknowledgments

  • Chrome Built-in AI team for APIs and documentation
  • WCAG 2.2 guidelines for accessibility standards
  • Google Chrome Built-in AI Challenge 2025 organizers
  • Devpost community for feedback and support

πŸ“§ Contact

Developer: [Your Name]
Email: your.email@example.com
GitHub: @your-username
Project Link: https://github.com/your-username/claritylens


Built with ❀️ using Chrome's Built-in AI APIs and Gemini Nano