Varenik-vkusny/HackNU

Repository files navigation

Claude AI Analytics Dashboard

A full-stack analytics platform that scrapes, classifies, and visualizes social and growth data about Claude AI.

📖 API documentation: API_DOCS.md


Quick Start

# 1. Install Python dependencies
pip install -r requirements.txt

# 2. Configure environment
cp .env.example .env
# Edit .env — add GEMINI_API_KEY and YOUTUBE_API_KEY

# 3. Start backend + frontend (Windows)
run_app.bat

# Or manually:
uvicorn app:app --reload --port 8000   # backend
cd frontend && npm install && npm run dev  # frontend → http://localhost:5173

Swagger UI → http://localhost:8000/docs


Architecture

HackNU/
├── app.py              # FastAPI backend (entry point)
├── models.py           # Pydantic schemas — source of truth for API contract
├── pipeline.py         # Scrape → Merge → Classify pipeline
│
├── scrapers/
│   ├── reddit_client.py     # Reddit via MCP (requires Node.js)
│   ├── hn_client.py         # Hacker News via Algolia free API
│   ├── bluesky_client.py    # Bluesky AT Protocol API
│   ├── youtube_scraper.py   # YouTube Data API v3
│   └── producthunt_client.py # ProductHunt GraphQL API
│
├── analysis/
│   ├── growth_metrics.py    # NPM / PyPI / GitHub / Wikipedia / Trends time-series
│   ├── insights.py          # Viral analysis, sentiment, competitor positioning, seeding detection
│   └── official_sources.py  # Anthropic blog / release timeline via Jina reader
│
├── frontend/            # React + Vite + Chart.js dashboard
│   └── src/
│       ├── components/  # Charts, FeedTable, RangeModal, GrowthTrends
│       └── services/    # API client
│
└── data/               # Generated — gitignored
    ├── dataset.json         # Unified post dataset (4800+ posts, all platforms)
    ├── growth_data.json     # NPM, PyPI, GitHub, Wikipedia, Trends time-series
    ├── insights.json        # Pre-computed analytics output
    └── ...                  # Raw per-source CSVs
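Once a pipeline run has produced `data/dataset.json`, the unified posts can be loaded and aggregated directly. A minimal sketch, assuming each post carries a `platform` field (the exact schema lives in `models.py`, so treat the field name as an assumption):

```python
import json
from collections import Counter

def platform_counts(posts):
    """Count posts per platform in the unified dataset."""
    return Counter(p.get("platform", "unknown") for p in posts)

# Inline records mirroring the assumed shape:
sample = [
    {"platform": "reddit", "sentiment": "positive"},
    {"platform": "hn", "sentiment": "neutral"},
    {"platform": "reddit", "sentiment": "negative"},
]
print(platform_counts(sample))  # Counter({'reddit': 2, 'hn': 1})

# With the real file (after a pipeline run):
# with open("data/dataset.json") as f:
#     posts = json.load(f)
```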

API Endpoints

Method  Path                     Description
GET     /api/posts               All posts, paginated; filterable by platform/sentiment/date
GET     /api/posts/stats         Aggregated stats by platform, sentiment, content type
GET     /api/growth-metrics      NPM / PyPI / GitHub / Wikipedia / Trends data
GET     /api/insights            Viral analysis, competitor positioning, seeding detection
GET     /api/correlation         Weekly unified signal table (npm + wiki + social + trends)
GET     /api/signals             Chronological timeline of all growth signals
GET     /api/features            Top Claude features mentioned, with weekly breakdown
POST    /api/pipeline/run        Trigger background scrape + classify
GET     /api/pipeline/status     Pipeline running state + post count
GET     /api/pipeline/progress   Live progress (stage, source, posts collected)
GET     /health                  Health check
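For example, a filtered `/api/posts` request can be built like this. The query parameter names (`platform`, `sentiment`, `page`) are assumptions based on the table above; the Swagger UI at `/docs` is the authoritative reference:

```python
from urllib.parse import urlencode

BASE = "http://localhost:8000"

def posts_url(platform=None, sentiment=None, page=1):
    """Build a filtered /api/posts URL (param names assumed; check /docs)."""
    params = {"page": page}
    if platform:
        params["platform"] = platform
    if sentiment:
        params["sentiment"] = sentiment
    return f"{BASE}/api/posts?{urlencode(params)}"

print(posts_url(platform="reddit", sentiment="positive"))
# http://localhost:8000/api/posts?page=1&platform=reddit&sentiment=positive
```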

Data Sources

Platform                Method                               Auth required
Reddit                  MCP client (Node.js subprocess)      No
Hacker News             Algolia Search API (free)            No
Bluesky                 AT Protocol API                      BSKY_HANDLE + BSKY_APP_PASSWORD in .env
YouTube                 YouTube Data API v3                  YOUTUBE_API_KEY in .env
ProductHunt             GraphQL API                          PRODUCTHUNT_API_KEY + PRODUCTHUNT_API_SECRET in .env
NPM / PyPI              Public registry APIs                 No
GitHub                  REST API                             No (rate-limited without a token)
Wikipedia               Wikimedia Metrics API                No
Google Trends           pytrends                             No
App Store / Play Store  iTunes lookup + google-play-scraper  No
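The Hacker News source is the easiest to try standalone, since the Algolia Search API is free and unauthenticated. A sketch of the request URL and a normalizer mapping an Algolia hit into a unified-post shape (the target field names on the right are assumptions; the actual shape used by `hn_client.py` is defined in `models.py`):

```python
from urllib.parse import urlencode

def hn_search_url(query, tags="story"):
    """Algolia HN Search API endpoint (free, no auth)."""
    return "https://hn.algolia.com/api/v1/search?" + urlencode(
        {"query": query, "tags": tags}
    )

def normalize_hit(hit):
    """Map an Algolia hit onto an assumed unified-post shape."""
    return {
        "platform": "hn",
        "title": hit["title"],
        "url": hit.get("url"),
        "score": hit["points"],
        "created_at": hit["created_at"],
    }

print(hn_search_url("claude"))
# https://hn.algolia.com/api/v1/search?query=claude&tags=story
```

Fetching the hits is then one call, e.g. `requests.get(hn_search_url("claude")).json()["hits"]`.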

Running the Pipeline Manually

# Full run: scrape all platforms → merge → classify with Gemini
python pipeline.py --scrape --merge --classify

# Individual steps
python pipeline.py --scrape      # Scrape Reddit + HN + Bluesky + PH
python pipeline.py --youtube     # YouTube only
python pipeline.py --merge       # Merge all sources into dataset.json
python pipeline.py --classify    # Classify unclassified rows with Gemini
python pipeline.py --historical  # Reddit top/year + top/all (slower, more data)
python pipeline.py --since 2026-01-01  # Only posts after this date

# Regenerate analytics (after new posts collected)
python analysis/growth_metrics.py
python analysis/insights.py
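The `--merge` step combines the per-source files into `dataset.json`. A plausible core of that step is cross-source deduplication; this sketch dedupes by URL (falling back to an `id` field), which is a guess at the actual logic in `pipeline.py`:

```python
def merge_posts(*sources):
    """Merge per-source post lists, deduplicating by URL then id (assumed keys)."""
    seen, merged = set(), []
    for posts in sources:
        for p in posts:
            key = p.get("url") or p.get("id")
            if key in seen:
                continue
            seen.add(key)
            merged.append(p)
    return merged

reddit = [{"url": "a"}, {"url": "b"}]
hn = [{"url": "b"}, {"url": "c"}]
print(len(merge_posts(reddit, hn)))  # 3
```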

Or trigger via HTTP (runs in background, returns immediately):

curl -X POST http://localhost:8000/api/pipeline/run
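Because the trigger returns immediately, a caller typically polls `/api/pipeline/status` until the run finishes. A minimal stdlib sketch; the `running` field name is an assumption taken from the endpoint table above:

```python
import json
import time
import urllib.request

def fetch_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def pipeline_done(status):
    """True once the status payload reports the pipeline stopped ('running' assumed)."""
    return not status.get("running")

def wait_for_pipeline(base="http://localhost:8000", poll=5):
    """Poll /api/pipeline/status until the run completes, then return the final status."""
    while True:
        status = fetch_json(f"{base}/api/pipeline/status")
        if pipeline_done(status):
            return status
        time.sleep(poll)
```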

Environment Variables

Copy .env.example to .env and fill in:

Variable                 Required         Description
GEMINI_API_KEY           Yes              Google Gemini API — post classification + query generation
YOUTUBE_API_KEY          For YouTube      YouTube Data API v3
BSKY_HANDLE              For Bluesky      Bluesky account handle (e.g. you.bsky.social)
BSKY_APP_PASSWORD        For Bluesky      Bluesky app password
PRODUCTHUNT_API_KEY      For ProductHunt  ProductHunt developer API key
PRODUCTHUNT_API_SECRET   For ProductHunt  ProductHunt developer API secret
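Since most keys are only needed for specific scrapers, a startup check can report exactly what is missing for the sources you enable. A sketch built from the table above (the grouping is illustrative, not code from the repo):

```python
import os

# Which variables each scraper needs, per the table above.
REQUIRED = {
    "core": ["GEMINI_API_KEY"],
    "youtube": ["YOUTUBE_API_KEY"],
    "bluesky": ["BSKY_HANDLE", "BSKY_APP_PASSWORD"],
    "producthunt": ["PRODUCTHUNT_API_KEY", "PRODUCTHUNT_API_SECRET"],
}

def missing_vars(enabled, env=os.environ):
    """Return the env vars still unset for the enabled scraper groups."""
    needed = [v for group in enabled for v in REQUIRED[group]]
    return [v for v in needed if not env.get(v)]

print(missing_vars(["core", "youtube"], env={"GEMINI_API_KEY": "x"}))
# ['YOUTUBE_API_KEY']
```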
