MacFall7/taxonomy-tap


Analytics Starter

Production-grade analytics instrumentation for greenfield projects. No silent failures. No dropped events. Rock-solid from day one.

TypeScript License: MIT


Why this exists

Most teams discover analytics problems weeks after launch: missing events, schema drift, PII leaks, flaky vendors. This starter treats analytics as an SRE problem, not a tracking snippet.


What is this?

Analytics Starter is an opinionated, production-ready monorepo that demonstrates how to build reliable, validated, and observable analytics into a new application from the ground up.

Built for teams who want:

  • ✅ Zero silent failures: invalid events are quarantined, not dropped
  • ✅ Type-safe tracking: event taxonomy validated at compile-time and runtime
  • ✅ PII protection: automatic detection and scrubbing of sensitive data
  • ✅ Real-time quality metrics: dashboards showing event health and validation failures
  • ✅ Resilience built-in: offline queues, retries, circuit breakers, and deduplication
  • ✅ Developer-friendly: strongly typed SDK with autocomplete and typo detection

This is not a toy demo; it's a scaffold a senior engineer or CTO would use to start a new product.


Architecture Overview

┌──────────────────────────────────────────────────────────────────┐
│                         CLIENT (Browser)                         │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │  Analytics Client SDK (@analytics-client)                   │ │
│  │  • Type-safe event tracking                                 │ │
│  │  • Schema validation (client-side)                          │ │
│  │  • Offline queue + retry logic                              │ │
│  │  • PII detection                                            │ │
│  │  • Event deduplication (idempotency)                        │ │
│  └─────────────────────┬───────────────────────────────────────┘ │
└────────────────────────┼─────────────────────────────────────────┘
                         │ HTTPS POST
                         ▼
┌──────────────────────────────────────────────────────────────────┐
│                      BACKEND API (Fastify)                       │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │  Validation Pipeline                                        │ │
│  │  1. Event name ✓                                            │ │
│  │  2. Schema validation ✓                                     │ │
│  │  3. Business rules ✓                                        │ │
│  │  4. PII detection ✓                                         │ │
│  │  5. Deduplication ✓                                         │ │
│  └─────────────────────┬───────────────────────────────────────┘ │
│                        │                                         │
│              ┌─────────┴──────────┐                              │
│              ▼                    ▼                              │
│     ┌─────────────────┐  ┌─────────────────┐                     │
│     │   Valid Event   │  │  Invalid Event  │                     │
│     └────────┬────────┘  └────────┬────────┘                     │
│              │                    │                              │
└──────────────┼────────────────────┼──────────────────────────────┘
               │                    │
               ▼                    ▼
┌──────────────────────┐  ┌──────────────────────┐
│  events_validated    │  │ events_quarantined   │
│  (Postgres)          │  │  (Postgres)          │
│  • event_name        │  │  • raw_payload       │
│  • payload (JSONB)   │  │  • failure_reasons   │
│  • schema_version    │  │  • client_info       │
│  • adapter_status    │  │  • timestamp         │
│  • timestamps        │  │                      │
└──────────────────────┘  └──────────────────────┘
          │
          ▼
┌──────────────────────┐
│  Analytics Adapters  │
│  (Segment, Amplitude,│
│   Mixpanel, etc.)    │
│  + Circuit Breaker   │
└──────────────────────┘
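
The routing step in the diagram (valid events to events_validated, invalid events to events_quarantined) can be sketched roughly as follows. This is an illustration under assumptions, not the code in apps/api: the names `IncomingEvent`, `ValidationCheck`, `routeEvent`, and the two sample checks are hypothetical.

```typescript
// Hypothetical types: the real pipeline lives in apps/api and may differ.
type IncomingEvent = { event_name: string; payload: Record<string, unknown> };

// A check returns null when the event passes, or a failure reason when it does not.
type ValidationCheck = (event: IncomingEvent) => string | null;

type Routed =
  | { table: "events_validated"; event: IncomingEvent }
  | { table: "events_quarantined"; raw_payload: IncomingEvent; failure_reasons: string[] };

// Run every check and collect all failures, so a quarantined event records
// every reason it was rejected rather than only the first one.
function routeEvent(event: IncomingEvent, checks: ValidationCheck[]): Routed {
  const failures: string[] = [];
  for (const check of checks) {
    const reason = check(event);
    if (reason !== null) failures.push(reason);
  }
  return failures.length === 0
    ? { table: "events_validated", event }
    : { table: "events_quarantined", raw_payload: event, failure_reasons: failures };
}

// Two illustrative checks; the event names are borrowed from the examples in this README.
const knownEvents = new Set(["product_viewed", "order_completed"]);

const eventNameCheck: ValidationCheck = (e) =>
  knownEvents.has(e.event_name) ? null : "unknown_event_name";

const priceCheck: ValidationCheck = (e) =>
  e.event_name !== "product_viewed" || typeof e.payload["price"] === "number"
    ? null
    : "schema:price_must_be_number";
```

Collecting every failure instead of stopping at the first one is a design choice made for this sketch; it keeps the quarantine record useful for debugging at the cost of running all checks on bad input.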

Quick Start

# 1. Clone and install
git clone https://github.com/MacFall7/taxonomy-tap.git
cd taxonomy-tap
pnpm install

# 2. Configure environment
cp apps/api/.env.example apps/api/.env
cp apps/web/.env.example apps/web/.env
# Edit apps/api/.env with your Postgres connection string

# 3. Set up database
pnpm db:migrate

# 4. Start all services
pnpm dev

That's it! The following will be running:

  • Web app on http://localhost:5173: E-commerce demo + analytics dashboard
  • API server on http://localhost:3000: Event ingestion and validation

Prerequisites: Node.js >= 18, pnpm >= 8, Postgres database

Try it out

  1. Browse the demo store at http://localhost:5173

    • Click through products
    • Add items to cart
    • Complete a checkout
  2. View analytics dashboard at http://localhost:5173/analytics

    • See validated events in real-time
    • Inspect quarantined events (try submitting invalid data)
    • Monitor adapter health and circuit breaker status
    • View PII detection alerts

Project Structure

analytics-starter/
├── apps/
│   ├── web/              # React e-commerce demo + analytics dashboard
│   └── api/              # Fastify backend for event ingestion
├── packages/
│   └── analytics-client/ # Reusable analytics SDK (publishable to npm)
├── infra/
│   └── db/               # Postgres migrations and schema
├── docs/
│   ├── OVERVIEW.md       # Detailed project overview
│   ├── ARCHITECTURE.md   # System design and data flow
│   ├── TAXONOMY.md       # Event taxonomy and schema definitions
│   └── HARDENING_NOTES.md # Reliability principles and tradeoffs
├── package.json          # Monorepo root with workspace config
└── pnpm-workspace.yaml   # pnpm workspace configuration

Core Features

1. Type-Safe Event Tracking

The analytics client provides compile-time and runtime type safety:

import { trackEvent } from '@taxonomy-tap/analytics-client';

// ✅ Valid event with autocomplete
trackEvent('product_viewed', {
  product_id: 'abc-123',
  product_name: 'Organic Coffee',
  price: 12.99,
  currency: 'USD'
});

// ❌ TypeScript error: unknown event
trackEvent('prodcut_viewed', { ... });  // Typo caught at compile time!

// ❌ Runtime error: missing required field
trackEvent('product_viewed', {
  product_id: 'abc-123'
  // Missing required fields
});
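
One way such compile-time and runtime checks can coexist is a typed event map plus a runtime list of required fields. The sketch below is an assumption about the general technique, not the SDK's actual internals: `EventMap`, `requiredFields`, and `trackTyped` are hypothetical names, and the real taxonomy is described in docs/TAXONOMY.md.

```typescript
// Hypothetical event map covering two events from the examples in this README.
interface EventMap {
  product_viewed: { product_id: string; product_name: string; price: number; currency: string };
  checkout_started: { cart_total: number; item_count: number };
}

// Runtime mirror of the required fields, for payloads that bypass the compiler
// (plain JavaScript callers, `any` casts, data loaded from JSON).
const requiredFields: Record<keyof EventMap, string[]> = {
  product_viewed: ["product_id", "product_name", "price", "currency"],
  checkout_started: ["cart_total", "item_count"],
};

// The generic parameter ties the payload type to the event name. That link is
// what enables autocomplete and turns a misspelled name into a compile error.
// Returns the list of missing fields; an empty array means the payload passed.
function trackTyped<K extends keyof EventMap>(name: K, payload: EventMap[K]): string[] {
  const record = payload as Record<string, unknown>;
  return requiredFields[name].filter(
    (field) => record[field] === undefined || record[field] === null
  );
}

// trackTyped("prodcut_viewed", { ... })  would not compile: not a key of EventMap.
```

With this shape, a typed caller that omits a field is stopped by the compiler, while an untyped caller is caught by the runtime list.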

2. Quarantine System

Invalid events are never silently dropped. They're quarantined with full context:

  • Raw payload
  • Validation failure reasons (categorized)
  • Client metadata (browser, IP, timestamp)
  • Stack traces (in dev mode)

View quarantined events in the dashboard to fix tracking issues proactively.
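
A quarantine record carrying the context listed above might look like the following. This is a sketch under assumptions: the field names follow the events_quarantined box in the architecture diagram, while `FailureCategory`, `ClientInfo`, and `buildQuarantineRecord` are hypothetical and the category names are guesses based on the validation pipeline steps.

```typescript
// Assumed categories, one per validation pipeline step in the diagram.
type FailureCategory = "event_name" | "schema" | "business_rule" | "pii" | "duplicate";

interface FailureReason {
  category: FailureCategory;
  message: string;
}

interface ClientInfo {
  userAgent: string;
  ip: string;
}

interface QuarantineRecord {
  raw_payload: string; // stored verbatim, even if it is not valid JSON
  failure_reasons: FailureReason[];
  client_info: ClientInfo;
  timestamp: string; // ISO 8601
  stack?: string; // only populated in dev mode
}

function buildQuarantineRecord(
  rawBody: string,
  reasons: FailureReason[],
  client: ClientInfo,
  opts: { devMode: boolean; now?: Date }
): QuarantineRecord {
  const record: QuarantineRecord = {
    raw_payload: rawBody,
    failure_reasons: reasons,
    client_info: client,
    timestamp: (opts.now ?? new Date()).toISOString(),
  };
  // Capture a stack trace only in dev mode, as described in the list above.
  if (opts.devMode) record.stack = new Error("quarantined").stack;
  return record;
}
```

Keeping the raw body as a string rather than parsed JSON means that malformed requests can be quarantined too.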

3. PII Protection

Automatic detection and scrubbing of:

  • Email addresses
  • Credit card numbers
  • Phone numbers
  • Custom patterns (configurable)

Configurable actions: scrub, hash, or reject events containing PII.
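
The three actions can be illustrated with a small string scrubber. Everything here is a simplified assumption rather than the project's detector: the regular expressions are deliberately loose and will produce false positives, `applyPiiPolicy` and `piiPatterns` are hypothetical names, and the FNV-1a hash is a non-cryptographic placeholder (a real deployment would more likely use a keyed cryptographic hash).

```typescript
type PiiAction = "scrub" | "hash" | "reject";

// Deliberately simple patterns for illustration only. Production detectors
// need stricter rules (for example a Luhn check for card numbers).
const piiPatterns: { name: string; pattern: RegExp }[] = [
  { name: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { name: "credit_card", pattern: /\b(?:\d[ -]?){13,16}\b/g },
  { name: "phone", pattern: /(?:\+?\d{1,3}[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}/g },
];

// 32-bit FNV-1a. Placeholder only: it is NOT suitable for protecting real PII.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts to stay within 32 bits.
    hash = (hash + ((hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24))) >>> 0;
  }
  return hash.toString(16);
}

interface PiiResult {
  rejected: boolean;
  value: string;
  detected: string[];
}

function applyPiiPolicy(value: string, action: PiiAction): PiiResult {
  const detected: string[] = [];
  let output = value;
  for (const { name, pattern } of piiPatterns) {
    output = output.replace(pattern, (match) => {
      if (detected.indexOf(name) === -1) detected.push(name);
      return action === "hash" ? `[${name}:${fnv1a(match)}]` : `[REDACTED:${name}]`;
    });
  }
  // "reject" refuses the whole value as soon as any pattern matched.
  if (action === "reject" && detected.length > 0) {
    return { rejected: true, value: "", detected };
  }
  return { rejected: false, value: output, detected };
}
```

The card pattern runs before the phone pattern so that a phone match cannot consume part of a longer card number.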

4. Resilience Layer

  • Offline queue: Events tracked while offline are queued and sent when reconnected
  • Exponential backoff: Failed requests retry with increasing delays
  • Deduplication: Events are hashed to prevent double-tracking
  • Circuit breakers: Unhealthy adapters are temporarily disabled to prevent cascading failures
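
Deduplication by content fingerprint can be sketched as below. This is an assumption about one reasonable approach, not the SDK's implementation: `stableStringify` and `Deduplicator` are hypothetical, the fingerprint here is the serialized event itself rather than a digest, and old entries are never pruned, which a long-running client would need to handle.

```typescript
// Serialize with sorted keys so { a: 1, b: 2 } and { b: 2, a: 1 } produce the
// same fingerprint regardless of property order.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map((v) => stableStringify(v)).join(",") + "]";
  const obj = value as Record<string, unknown>;
  const keys = Object.keys(obj).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + stableStringify(obj[k])).join(",") + "}";
}

class Deduplicator {
  private seen = new Map<string, number>(); // fingerprint -> last seen time (ms)
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Returns true if an identical event was already seen within the window.
  isDuplicate(name: string, payload: unknown, nowMs: number): boolean {
    const key = name + "|" + stableStringify(payload);
    const last = this.seen.get(key);
    this.seen.set(key, nowMs);
    return last !== undefined && nowMs - last < this.windowMs;
  }
}
```

Passing the clock in as `nowMs` keeps the class deterministic and easy to test; a caller would normally pass `Date.now()`.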

5. Quality Metrics Dashboard

Real-time visibility into:

  • Event success vs. failure rates
  • Quarantine trends by reason
  • PII detection alerts
  • Adapter health status
  • Latency percentiles (p50, p95, p99)
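
For the latency percentiles, a minimal sketch using the nearest-rank definition is shown below. The dashboard may compute percentiles differently (for example in SQL or with interpolation); `percentile` is a hypothetical helper used only to make the metric concrete.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent of
// all samples are less than or equal to it. Other definitions interpolate.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Example: ingestion latencies in milliseconds.
const latenciesMs = [120, 80, 200, 95, 110];
const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const p99 = percentile(latenciesMs, 99);
```

With only five samples, p95 and p99 both resolve to the slowest request, which is a reminder that tail percentiles need a reasonably large window to be meaningful.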

Using the Analytics Client

Installation (for external projects)

npm install @taxonomy-tap/analytics-client

Setup

import { initAnalytics } from '@taxonomy-tap/analytics-client';

const analytics = initAnalytics({
  apiEndpoint: 'https://your-api.com/ingest',
  enableOfflineQueue: true,
  enablePiiDetection: true,
  debug: process.env.NODE_ENV === 'development'
});

Track Events

import { trackEvent } from '@taxonomy-tap/analytics-client';

// Product viewed
trackEvent('product_viewed', {
  product_id: 'prod_123',
  product_name: 'Wireless Headphones',
  category: 'Electronics',
  price: 99.99,
  currency: 'USD'
});

// Add to cart
trackEvent('item_added_to_cart', {
  product_id: 'prod_123',
  quantity: 1,
  cart_total: 99.99
});

// Checkout started
trackEvent('checkout_started', {
  cart_total: 99.99,
  item_count: 1
});

// Order completed
trackEvent('order_completed', {
  order_id: 'ord_456',
  revenue: 99.99,
  currency: 'USD',
  item_count: 1
});

Documentation

  • Overview: High-level concepts and goals
  • Architecture: System design, data flow, and component interaction
  • Taxonomy: Event catalog and schema definitions
  • Hardening Notes: Reliability principles and design tradeoffs

Production Hardening

This project is built with production reliability in mind. Key hardening features:

Client-Side Resilience

  • 🔄 Offline queue: Events tracked offline are queued in localStorage and synced on reconnect
  • 🔁 Exponential backoff: Failed requests retry with increasing delays (1s → 2s → 4s → 8s)
  • 🔒 Idempotency: Event IDs prevent duplicate tracking across retries
  • 🚦 Circuit breaker: Auto-disable unhealthy endpoints to prevent cascading failures
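
The offline queue and backoff schedule can be sketched together. The code below is illustrative and makes several assumptions: `KeyValueStore`, `backoffDelayMs`, and `OfflineQueue` are hypothetical names, the storage key is invented, and capping the delay at 8s (rather than giving up after four attempts) is a guess, since the list above only gives the 1s → 2s → 4s → 8s sequence.

```typescript
// Minimal subset of the Web Storage API, so the queue can be backed by
// window.localStorage in a browser or by an in-memory object in tests.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Delay before retry number `attempt` (0-based): 1s, 2s, 4s, 8s, then capped.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 8000): number {
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}

class OfflineQueue<T> {
  private store: KeyValueStore;
  private key: string;

  constructor(store: KeyValueStore, key = "analytics_queue") {
    this.store = store;
    this.key = key;
  }

  private read(): T[] {
    const raw = this.store.getItem(this.key);
    return raw ? (JSON.parse(raw) as T[]) : [];
  }

  enqueue(event: T): void {
    this.store.setItem(this.key, JSON.stringify(this.read().concat([event])));
  }

  size(): number {
    return this.read().length;
  }

  // Send queued events in order; stop at the first failure and keep the rest.
  async flush(send: (event: T) => Promise<boolean>): Promise<number> {
    const pending = this.read();
    let sent = 0;
    while (sent < pending.length && (await send(pending[sent]))) sent++;
    this.store.setItem(this.key, JSON.stringify(pending.slice(sent)));
    return sent;
  }
}
```

In a browser the queue could be constructed with `window.localStorage`, which already satisfies the `KeyValueStore` shape; a caller would wait `backoffDelayMs(attempt)` between failed `flush` calls.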

Server-Side Validation

  • ✅ Multi-layer validation: Schema → PII → Business rules → Deduplication
  • 🛡️ PII detection: Automatic scanning for emails, credit cards, phone numbers
  • 🔍 Quarantine system: Invalid events stored with full failure context, never silently dropped
  • 📊 Quality metrics: Real-time dashboards for validation success/failure rates

Adapter Delivery

  • 🔌 Circuit breakers: Per-adapter fault isolation (default: 5 failures, 60s timeout)
  • 🎯 Fire-and-forget: Adapter delivery is async and non-blocking
  • 📈 Health tracking: Monitor adapter status and recovery in real-time
  • 🔧 Configurable: Adjust thresholds via environment variables
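
A per-adapter circuit breaker with the stated defaults (5 failures, 60s timeout) could look roughly like this. It is a generic sketch of the pattern, not the repository's class: `CircuitBreaker`, its method names, and the half-open behaviour are assumptions.

```typescript
type BreakerState = "closed" | "open" | "half_open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private failureThreshold: number;
  private resetTimeoutMs: number;

  constructor(failureThreshold = 5, resetTimeoutMs = 60_000) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
  }

  // Whether a delivery attempt may proceed at time `nowMs`.
  canAttempt(nowMs: number): boolean {
    if (this.state === "open" && nowMs - this.openedAt >= this.resetTimeoutMs) {
      // After the timeout, let trial requests through until one succeeds or fails.
      this.state = "half_open";
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(nowMs: number): void {
    this.failures++;
    // A failed trial request reopens the breaker immediately.
    if (this.state === "half_open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = nowMs;
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

One breaker instance per adapter keeps a failing vendor from affecting delivery to the others, which is the fault isolation the list above describes.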

Database Layer

  • 🏊 Connection pooling: Efficient Postgres connection management
  • 🗄️ JSONB storage: Flexible event payload storage without schema migrations
  • 📝 Audit trail: Full event history with timestamps and metadata
  • 🔐 Prepared statements: Protection against SQL injection
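
To show why parameterized statements protect against injection, the sketch below builds a query in the `{ text, values }` shape that node-postgres accepts for `query()`. The column names come from the architecture diagram and may not match the real migration; `buildInsertValidated`, the numeric type of `schema_version`, and the use of node-postgres itself are assumptions.

```typescript
// SQL text with numbered placeholders plus a separate array of values.
// Values travel separately from the SQL, so they are never parsed as SQL.
interface QueryConfig {
  text: string;
  values: unknown[];
}

// Hypothetical insert for the events_validated table.
function buildInsertValidated(
  eventName: string,
  payload: Record<string, unknown>,
  schemaVersion: number
): QueryConfig {
  return {
    text:
      "INSERT INTO events_validated (event_name, payload, schema_version) " +
      "VALUES ($1, $2::jsonb, $3)",
    values: [eventName, JSON.stringify(payload), schemaVersion],
  };
}
```

A hostile event name such as `x'); DROP TABLE events_validated; --` ends up as an ordinary string in `values`, and the SQL text stays constant for every event.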

For detailed design decisions and tradeoffs, see HARDENING_NOTES.md.


Development

# Run all apps in dev mode
pnpm dev

# Run only the web app
pnpm dev:web

# Run only the API server
pnpm dev:api

# Build all packages and apps
pnpm build

# Build only the analytics client
pnpm build:client

# Run tests
pnpm test

# Lint all code
pnpm lint

# Type-check all code
pnpm typecheck

Deployment

Frontend (Web App)

Deploy to Vercel or Netlify:

cd apps/web
pnpm build
# Follow Vercel/Netlify deployment instructions

Backend (API)

Deploy to Railway, Render, or Fly.io:

cd apps/api
pnpm build
# Follow platform-specific deployment instructions

Database

Use a managed Postgres provider, then run migrations against your hosted database:

DATABASE_URL=postgresql://user:pass@host/db pnpm db:migrate

License

MIT © 2025


Questions or feedback? Open an issue or start a discussion!
