182 changes: 182 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,182 @@
# OpenThreads Trace - Developer Guide for Claude Code

## Project Overview

OpenThreads Trace is a cross-browser extension that scans consumer product pages for compliance-surface completeness signals. It detects missing disclosure fields, flags risky marketing claims, and exports structured compliance data in Threadmark-compatible JSON format.

**Target Users**: DTC merchants, compliance consultants
**Platform**: Cross-platform browser extension (Chrome MV3, future Firefox/Safari)
**Architecture**: TypeScript + Vite, data-driven rule engine, client-side only

## Core Architecture

### Technology Stack
- **Build**: Vite 6 + TypeScript 5.7
- **Testing**: Vitest + jsdom (59 passing tests, 50% coverage floor)
- **Linting**: ESLint 9 (flat config) + Prettier
- **CI/CD**: GitHub Actions with CodeQL security scanning
- **Extension**: Chrome Manifest V3 (popup + content script + background service worker)

### Project Structure
```
src/
├── background/           # Background service worker
├── content/              # Content scripts (snapshot capture)
│   ├── snapshot.ts       # DOM extraction: metadata, text, SKU hints
│   └── index.ts          # Message handler for SCAN requests
├── popup/                # Extension popup UI
│   ├── popup.html        # 360x480px UI with category selector + scan button
│   ├── popup.ts          # DOM event wiring, chrome.tabs messaging
│   └── popup-ui.ts       # Pure testable UI functions
├── rules/                # Data-driven compliance rule engine
│   ├── engine.ts         # detectField(), detectClaims(), runRules()
│   ├── field-groups.ts   # 12 compliance field definitions (4 groups)
│   └── claim-keywords.ts # Risk claim keywords (eco, sustainable, etc.)
└── types/                # TypeScript types
    ├── scan.ts           # ScanResult, FieldResult, ClaimFlag, PageSnapshot
    └── index.ts          # Exports and ProductCategory enum
```

## Coding Conventions

### TypeScript Patterns
- **Pure functions first**: Separate testable logic from DOM/chrome API calls
  - Example: `popup-ui.ts` has pure functions, `popup.ts` wires DOM
- **Strong typing**: All interfaces in `types/`, no `any`
- **Data-driven rules**: JSON-like structures in `field-groups.ts` and `claim-keywords.ts`
- **Confidence scoring**: Return `0.0-1.0` confidence with detection results
  - Meta tags: 0.9 confidence
  - Text patterns: 0.7 confidence
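
The confidence convention can be illustrated with a minimal sketch of a field detector. The type shapes, the `source` property, and the contents of both maps are assumptions made for this example; the real definitions in `types/scan.ts` and `engine.ts` may look different.

```typescript
// Hypothetical shapes for illustration; the real types live in src/types/scan.ts.
interface PageSnapshot {
  meta: Record<string, string>;
  text: string;
}

interface FieldResult {
  key: string;
  found: boolean;
  confidence: number; // 0.0-1.0
  source: 'meta' | 'text' | 'none';
}

// Assumed contents; the actual maps in engine.ts may differ.
const META_KEY_MAP: Record<string, string[]> = {
  brand: ['product:brand'],
};
const FIELD_SEARCH_PATTERNS: Record<string, RegExp> = {
  brand: /\bbrand\s*:/i,
};

function detectField(key: string, snapshot: PageSnapshot): FieldResult {
  // Meta tags are checked first because they are the stronger signal (0.9).
  const metaKeys = META_KEY_MAP[key] ?? [];
  if (metaKeys.some((k) => (snapshot.meta[k] ?? '').trim() !== '')) {
    return { key, found: true, confidence: 0.9, source: 'meta' };
  }
  // Fall back to a text pattern match, which is a weaker signal (0.7).
  const pattern = FIELD_SEARCH_PATTERNS[key];
  if (pattern !== undefined && pattern.test(snapshot.text)) {
    return { key, found: true, confidence: 0.7, source: 'text' };
  }
  return { key, found: false, confidence: 0, source: 'none' };
}
```

Returning the signal source alongside the score is one way to keep results explainable; whether the actual `FieldResult` carries such a property is not confirmed here.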

### File Organization
- **Test files**: Co-located `*.test.ts` next to source
- **One concern per file**: `snapshot.ts` only handles DOM capture, `engine.ts` only handles rule evaluation
- **Exports**: Use named exports, centralize in `index.ts` where appropriate

### Naming Conventions
- **Functions**: `camelCase`, verb-first (`extractMetaTags`, `detectClaims`)
- **Types**: `PascalCase` (`PageSnapshot`, `FieldResult`)
- **Constants**: `SCREAMING_SNAKE_CASE` for data maps (`FIELD_SEARCH_PATTERNS`, `META_KEY_MAP`)
- **Files**: `kebab-case.ts` (e.g., `popup-ui.ts`, `field-groups.ts`)

## Feature Development Workflow

### Current Status (per PRD.json)
- ✅ F005: Engineering baseline (hooks, CI, CodeQL)
- ✅ F010: Extension shell UI
- ✅ F020: DOM snapshot capture
- ✅ F030: Rule engine v1
- 🔨 F040: Risk score model (NEXT)
- 📋 F050: Evidence clipper
- 📋 F060: Threadmark JSON export

### Adding New Features
1. **Read PRD.json first**: Check acceptance criteria for the feature ID
2. **Write tests first**: Add `*.test.ts` with expected behavior
3. **Keep pure functions testable**: Separate DOM/chrome APIs from logic
4. **Update progress.txt**: Document what was implemented and test count
5. **Run full checks**: `npm run typecheck && npm test && npm run build`

### Adding New Compliance Fields
1. Add field definition to `field-groups.ts` (specify group, key, required flag)
2. Add detection pattern to `FIELD_SEARCH_PATTERNS` in `engine.ts`
3. Add meta tag mapping to `META_KEY_MAP` if applicable
4. Write unit tests in `rules/engine.test.ts`
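
As a rough illustration of steps 1 to 3, the sketch below registers a hypothetical `age_grade` field. The `FieldDefinition` shape, the `FIELD_DEFINITIONS` name, the regex, and the meta key are all assumptions for this example and are not taken from the actual `field-groups.ts` or `engine.ts`.

```typescript
// Hypothetical shapes; the real definitions live in field-groups.ts and engine.ts.
type FieldGroup = 'identity' | 'composition' | 'safety' | 'claims';

interface FieldDefinition {
  key: string;
  label: string;
  group: FieldGroup;
  required: boolean;
}

// Step 1: field definition (group, key, required flag).
const FIELD_DEFINITIONS: FieldDefinition[] = [
  { key: 'age_grade', label: 'Age grade', group: 'safety', required: false },
];

// Step 2: text detection pattern, keyed by the same field key.
const FIELD_SEARCH_PATTERNS: Record<string, RegExp> = {
  age_grade: /\b(?:ages?\s+\d+\s*(?:\+|and up)|recommended age|age grade)/i,
};

// Step 3: meta tag mapping (the key name here is an assumed example).
const META_KEY_MAP: Record<string, string[]> = {
  age_grade: ['product:age_group'],
};

// Helper for step 4: a unit test can assert that every field has a pattern.
function findFieldsWithoutPattern(): string[] {
  return FIELD_DEFINITIONS
    .filter((field) => !(field.key in FIELD_SEARCH_PATTERNS))
    .map((field) => field.key);
}
```

A consistency check like `findFieldsWithoutPattern()` is optional, but it catches the common mistake of adding a definition without a matching pattern.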

### Adding New Claim Keywords
1. Add category to `claim-keywords.ts` (use lowercase for case-insensitive matching)
2. The engine picks up new keywords automatically via `detectClaims()`, which also extracts the surrounding context
3. Write tests in `rules/claim-keywords.test.ts`
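
To show why lowercase keywords matter, here is a minimal sketch of case-insensitive claim matching with context extraction. The `ClaimFlag` shape, the `CLAIM_KEYWORDS` name, the keyword list, and the 40-character context radius are assumptions for illustration, not the engine's actual values.

```typescript
// Hypothetical shape; the real ClaimFlag lives in src/types/scan.ts.
interface ClaimFlag {
  category: string;
  keyword: string;
  context: string; // surrounding text from the page
}

// Assumed sample data; keywords are stored lowercase.
const CLAIM_KEYWORDS: Record<string, string[]> = {
  environmental: ['eco-friendly', 'sustainable', 'biodegradable'],
};

const CONTEXT_RADIUS = 40; // characters kept on each side (assumed value)

function detectClaims(text: string): ClaimFlag[] {
  // Lowercase the page text once so lowercase keywords match any casing.
  // Simplification: assumes lowercasing does not change string length.
  const haystack = text.toLowerCase();
  const flags: ClaimFlag[] = [];
  for (const [category, keywords] of Object.entries(CLAIM_KEYWORDS)) {
    for (const keyword of keywords) {
      let index = haystack.indexOf(keyword);
      while (index !== -1) {
        const start = Math.max(0, index - CONTEXT_RADIUS);
        const end = Math.min(text.length, index + keyword.length + CONTEXT_RADIUS);
        // Slice the original text so the context keeps its original casing.
        flags.push({ category, keyword, context: text.slice(start, end).trim() });
        index = haystack.indexOf(keyword, index + keyword.length);
      }
    }
  }
  return flags;
}
```

Because matching runs against a lowercased copy, a keyword stored with capital letters would never match, which is the reason for the lowercase rule in step 1.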

## Testing Philosophy

### Unit Test Requirements
- **Coverage**: Maintain ≥50% threshold (configured in vitest.config.ts)
- **Isolation**: Mock chrome APIs, use jsdom for DOM tests
- **Fast**: Pre-push hook runs tests in <60s
- **Descriptive**: Use `describe()` blocks per function/module

### Test Structure
```typescript
import { describe, it, expect } from 'vitest';

describe('moduleName', () => {
  describe('functionName', () => {
    it('should handle expected case', () => {
      // Arrange
      const input = {...};
      // Act
      const result = functionName(input);
      // Assert
      expect(result).toEqual({...});
    });
  });
});
```

## Git & CI Workflow

### Pre-commit Hook (auto-installed)
- Runs format + lint + typecheck in <15s
- Located in `.husky/pre-commit`

### Pre-push Hook
- Runs full unit test suite in <60s

### CI Checks (GitHub Actions)
- **Required**: lint, typecheck, test, build, package artifact
- **Security**: CodeQL scan on PR + weekly scheduled
- **Coverage**: Uploaded to coverage service (configured threshold)

### Commit Messages
- Use Conventional Commits: `feat:`, `fix:`, `chore:`, `test:`, `refactor:`
- Example: `feat: add risk score calculation with weighted field penalties`
- Always include co-author: `Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>`

## Security & Privacy Constraints

### Hard Requirements (per PRD)
- **No automatic data exfiltration**: User-initiated scans only
- **Sanitized DOM parsing**: Strip scripts/styles in `extractTextContent()`
- **No authentication**: Do not prompt for credentials or access private portals
- **Disclaimer required**: Extension provides signals, not legal advice

### Chrome API Usage
- **activeTab**: Only access current tab on user click
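- **scripting**: Content scripts are declared in the manifest (`content_scripts`); the `scripting` permission itself is only needed for programmatic injection via `chrome.scripting.executeScript()`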
- **No network**: All processing is local, no external API calls in v1

## Known Patterns & Anti-Patterns

### ✅ DO
- Extract pure functions for testability (`popup-ui.ts` pattern)
- Use confidence scores with detection results
- Return structured data with context (e.g., `ClaimFlag` includes surrounding text)
- Co-locate tests with source files
- Document acceptance criteria in `progress.txt`

### ❌ DON'T
- Mix DOM manipulation with business logic
- Use the `any` type (define interfaces in `types/` instead)
- Make network calls or external API requests
- Promise legal compliance guarantees in UI text
- Bypass hooks with `--no-verify`
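
The split between pure logic and DOM wiring could look roughly like the sketch below. The function names, the input shape, and the message wording are hypothetical and are not taken from `popup-ui.ts` or `popup.ts`.

```typescript
// Hypothetical input shape for this example.
interface ScanSummaryInput {
  missingRequired: number;
  missingOptional: number;
  flaggedClaims: number;
}

// popup-ui.ts style: pure, with no DOM or chrome access, so it is trivial to unit test.
function buildSummaryText(input: ScanSummaryInput): string {
  const { missingRequired, missingOptional, flaggedClaims } = input;
  if (missingRequired === 0 && missingOptional === 0 && flaggedClaims === 0) {
    return 'No missing fields or flagged claims detected.';
  }
  return `${missingRequired} required and ${missingOptional} optional fields missing, ${flaggedClaims} claims flagged.`;
}

// popup.ts style: a thin wiring layer that only writes to the element.
// It accepts a minimal structural type, so a real element or a test stub both work.
function renderSummary(target: { textContent: string | null }, input: ScanSummaryInput): void {
  target.textContent = buildSummaryText(input);
}
```

Keeping the wording in the pure function also makes it easy to test that UI text reports signals only and never promises compliance.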

## Next Steps (F040: Risk Score Model)

When implementing the risk score:
1. Create `scoring.ts` with `calculateRiskScore(scanResult: ScanResult): RiskScoreBreakdown`
2. Weight by field importance (required fields > optional)
3. Penalize risky claims without evidence
4. Return explainable breakdown (which fields/claims contribute)
5. Add unit tests for edge cases (all fields present, all missing, mixed)
6. Update `ScanResult` type to include `riskScore` and `riskBreakdown`
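
One possible shape for this, as a sketch only: the penalty weights, the `required` and `hasEvidence` properties, and the `RiskScoreBreakdown` layout below are assumptions for illustration and are not specified by the PRD excerpt shown here.

```typescript
// Hypothetical, simplified shapes; the real ScanResult lives in src/types/scan.ts.
interface FieldResult {
  key: string;
  required: boolean;
  found: boolean;
}
interface ClaimFlag {
  keyword: string;
  hasEvidence: boolean; // assumed flag, e.g. a certification found on the page
}
interface ScanResult {
  fields: FieldResult[];
  claims: ClaimFlag[];
}

interface RiskContribution {
  source: 'field' | 'claim';
  id: string;
  penalty: number;
}
interface RiskScoreBreakdown {
  score: number; // 0 (no risk signals) to 100 (capped)
  contributions: RiskContribution[];
}

// Assumed weights: required fields count more than optional ones.
const REQUIRED_FIELD_PENALTY = 15;
const OPTIONAL_FIELD_PENALTY = 5;
const UNSUPPORTED_CLAIM_PENALTY = 10;

function calculateRiskScore(scanResult: ScanResult): RiskScoreBreakdown {
  const contributions: RiskContribution[] = [];
  for (const field of scanResult.fields) {
    if (!field.found) {
      const penalty = field.required ? REQUIRED_FIELD_PENALTY : OPTIONAL_FIELD_PENALTY;
      contributions.push({ source: 'field', id: field.key, penalty });
    }
  }
  for (const claim of scanResult.claims) {
    if (!claim.hasEvidence) {
      contributions.push({ source: 'claim', id: claim.keyword, penalty: UNSUPPORTED_CLAIM_PENALTY });
    }
  }
  // Sum the penalties and cap at 100 so the score stays on a fixed scale.
  const total = contributions.reduce((sum, c) => sum + c.penalty, 0);
  return { score: Math.min(100, total), contributions };
}
```

Returning the individual contributions alongside the total keeps the score explainable (step 4), and the three edge cases in step 5 map directly onto inputs with no missing fields, all fields missing, and a mix.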

## Questions & Support

- **PRD Reference**: `/PRD.json` (source of truth for features)
- **Progress Tracking**: `/progress.txt` (current implementation status)
- **CI Configuration**: `.github/workflows/` (build, test, release)
- **Rule Definitions**: `src/rules/field-groups.ts` and `claim-keywords.ts`

This codebase is structured for long-running agent workflows, following Anthropic's recommendations for structured, testable, data-driven extension development.
3 changes: 2 additions & 1 deletion PRD.json
@@ -165,7 +165,8 @@
 "id": "F030",
 "phase": 1,
 "name": "Rule Engine v1",
-"description": "Data-driven JSON rules"
+"description": "Data-driven JSON rules",
+"status": "passes"
 },
 {
 "id": "F040",
190 changes: 190 additions & 0 deletions README.md
@@ -0,0 +1,190 @@
# OpenThreads Trace

**Compliance Exposure Scanner Browser Extension**

A cross-browser extension that scans consumer product pages (Shopify, WooCommerce, Amazon, Etsy, DTC sites) for compliance-surface completeness signals. Flags missing disclosure fields, detects risky marketing claims, and exports structured compliance data.

[![CI](https://github.com/openthreads/surface/actions/workflows/ci.yml/badge.svg)](https://github.com/openthreads/surface/actions/workflows/ci.yml)
[![CodeQL](https://github.com/openthreads/surface/actions/workflows/codeql.yml/badge.svg)](https://github.com/openthreads/surface/actions/workflows/codeql.yml)

## Features

- **One-click page scan** – Extract product metadata, text content, and SKU hints
- **Rule-based compliance detection** – 12 disclosure fields across 4 categories (Identity, Composition, Safety, Claims)
- **Claim risk flagging** – Detects eco/sustainability/health claims requiring evidence
- **Product category selector** – Textiles, Children's Products, Cosmetics, Electronics, General
- **Explainable results** – Confidence scores and detection context
- **Privacy-first** – All processing is local, no data leaves your browser
- **Threadmark export** *(coming soon)* – Structured JSON bundle for compliance workflows

## Installation

### For Development

```bash
# Clone the repository
git clone https://github.com/openthreads/surface.git
cd surface

# Install dependencies
npm install

# Build the extension
npm run build

# Load in Chrome
# 1. Open chrome://extensions/
# 2. Enable "Developer mode"
# 3. Click "Load unpacked"
# 4. Select the `dist/` directory
```

### For Users

*(Coming soon: Chrome Web Store, Firefox Add-ons, Edge Add-ons)*

## Usage

1. **Navigate** to any product page (e.g., Shopify store, Amazon listing)
2. **Click** the OpenThreads Trace extension icon
3. **Select** your product category (Textiles, Children's Products, etc.)
4. **Click "Scan Page"**
5. **Review** missing fields and flagged claims
6. **Export** results *(coming soon)*

## Development

### Prerequisites

- Node.js 18+ (20 LTS recommended)
- npm 9+

### Scripts

```bash
# Development
npm run build # Build extension for production
npm run typecheck # TypeScript type checking
npm run lint # ESLint check
npm run lint:fix # Auto-fix linting issues
npm run format # Format with Prettier
npm run format:check # Check formatting

# Testing
npm test # Run unit tests
npm run test:watch # Watch mode
npm run test:coverage # Generate coverage report
```

### Pre-commit & Pre-push Hooks

Hooks are auto-installed on `npm install` via Husky:

- **Pre-commit**: Format, lint, typecheck (<15s)
- **Pre-push**: Full test suite (<60s)

To skip hooks temporarily (not recommended):
```bash
git commit --no-verify
```

### Project Structure

```
src/
├── background/ # Background service worker
├── content/ # Content scripts (DOM snapshot capture)
├── popup/ # Extension popup UI
├── rules/ # Compliance rule engine (field groups, claim keywords)
└── types/ # TypeScript type definitions
```

See [CLAUDE.md](./CLAUDE.md) for detailed architecture and coding conventions.

## Feature Roadmap

Current status (v1.0.0):

- ✅ **F005**: Engineering baseline (CI, hooks, CodeQL security scanning)
- ✅ **F010**: Extension shell UI
- ✅ **F020**: DOM snapshot capture
- ✅ **F030**: Rule engine v1
- 🔨 **F040**: Risk score model *(in progress)*
- 📋 **F050**: Evidence clipper
- 📋 **F060**: Threadmark JSON export

See [PRD.json](./PRD.json) for full product requirements.

## Compliance Fields Detected

### Identity & Contacts
- Product name ✅ (required)
- Brand ✅ (required)
- Manufacturer name/address
- Contact email or URL

### Composition & Origin
- Materials (fiber content, ingredients)
- Country of origin

### Safety & Use
- Warnings
- Instructions
- Care instructions

### Claims & Evidence
- Marketing claims (eco, sustainable, biodegradable, etc.)
- Certifications (GOTS, OEKO-TEX, etc.)

## Security & Privacy

- **No data collection**: All processing happens locally in your browser
- **No network calls**: Extension does not send data to external servers
- **User-initiated only**: Scans require explicit user action
- **Sanitized parsing**: Scripts and styles are stripped from analyzed content
- **CodeQL scanning**: Continuous security analysis via GitHub Actions

## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b feat/your-feature`)
3. Follow existing code conventions (see [CLAUDE.md](./CLAUDE.md))
4. Write tests for new features
5. Ensure all checks pass (`npm run typecheck && npm test && npm run build`)
6. Submit a pull request

### Commit Convention

Use [Conventional Commits](https://www.conventionalcommits.org/):
- `feat:` New features
- `fix:` Bug fixes
- `refactor:` Code refactoring
- `test:` Test additions/changes
- `chore:` Tooling, dependencies
- `docs:` Documentation only

## License

Apache License 2.0 - See [LICENSE](./LICENSE) for details.

## Disclaimer

**This extension provides heuristic completeness signals only.**
It does not constitute legal advice or guarantee regulatory compliance.
Users remain responsible for all compliance decisions and verification.

## Acknowledgments

Built with:
- [TypeScript](https://www.typescriptlang.org/)
- [Vite](https://vitejs.dev/)
- [Vitest](https://vitest.dev/)
- [Chrome Extensions API](https://developer.chrome.com/docs/extensions/)

Developed by [OpenThreads.dev](https://openthreads.dev) to accelerate structured compliance workflows.

---

**Questions or feedback?** Open an issue or visit [OpenThreads Documentation](https://docs.openthreads.dev).
9 changes: 9 additions & 0 deletions progress.txt
@@ -26,3 +26,12 @@ F020 - DOM Snapshot Capture [PASSES]
- Content script handles SCAN message and returns snapshot to popup
- jsdom added for DOM-based testing
- 19 new unit tests for snapshot module (39 total passing)

F030 - Rule Engine v1 [PASSES]
- engine.ts: runRules() orchestrates field evaluation and claim detection against a PageSnapshot
- detectField: checks meta tags (high confidence 0.9) then text patterns (0.7) for each compliance field
- evaluateFields: runs all 12 field definitions from field-groups.ts against snapshot, returns FieldResult[]
- detectClaims: case-insensitive keyword matching against claim-keywords.ts with surrounding context extraction
- FIELD_SEARCH_PATTERNS: regex map for all 12 fields (product_name, brand, materials, warnings, etc.)
- META_KEY_MAP: Open Graph / structured data key lookups for meta-based detection
- 20 new unit tests for engine module (59 total passing)