Understand who's building. Before they disappear.
An open-source GitHub contributor health dashboard that surfaces burnout signals, knowledge concentration risks, and contribution patterns across any public or private repository, in real time, powered by the GitHub REST API and Google Gemini AI.
- Introduction & Problem Statement
- Key Features
- System Architecture
- Health Scoring Algorithm (Under the Hood)
- Burnout Prediction Engine
- Bus Factor Analysis
- API Route Reference
- Project Structure
- Local Development Setup
- Configuration & Environment Variables
- Deployment Guide
- Contributing
Engineering teams bleed contributors silently. A developer's commit velocity drops, their streak breaks, they start pushing at 2 AM — and by the time anyone notices, they've already mentally checked out. Existing GitHub analytics tools report history. GitStat predicts what comes next.
Open-source maintainers have no early warning system. They discover burnout after the fact, in the form of a quiet fork or an unmaintained dependency. For engineering leads inside organisations, the same blind spot costs months of onboarding when a load-bearing contributor leaves.
GitStat connects directly to the GitHub API, computes a composite health score for every active contributor, runs linear regression burnout forecasts, and surfaces knowledge concentration risks — all without requiring any code changes or integrations in the target repository. Analyse any public repo unauthenticated, or unlock private repos via GitHub OAuth.
Every contributor receives a composite Health Score (0–100) calculated from five weighted signals: commit velocity, activity streak consistency, PR follow-through rate, response latency on pull requests, and an inverse off-hours ratio that acts as a burnout signal. Contributors are classified into four health bands: At Risk · Stressed · Healthy · Thriving.
GitStat runs linear regression on the last 6 weeks of commit activity for every active contributor. If the slope is negative and the contributor is still active, GitStat estimates when their commit activity is projected to reach zero and flags them before it happens.
- Overview Page — "Run Burnout Prediction" button with a live risk-count badge
- Per-contributor sparklines, weeks-to-fade estimate, and raw slope data
- Time Machine slider on the Contributors page replays metrics backwards up to 10 weeks
Automatically identifies contributors who were active in the previous 8 weeks but have had zero commits in the last 4 weeks — the first signal of silent churn, caught before the contributor has officially disengaged.
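The ghost-contributor rule above can be sketched in a few lines. This is a minimal illustration, not GitStat's actual code: the function name `findGhostContributors` and the `{ author, date }` commit shape are hypothetical, and it assumes "the previous 8 weeks" means the 8 weeks immediately before the most recent 4-week window.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical input shape: [{ author: "alice", date: "2024-05-01T10:00:00Z" }, ...]
function findGhostContributors(commits, now = new Date()) {
  const recentCutoff = now.getTime() - 4 * WEEK_MS; // start of the last 4 weeks
  const priorCutoff = now.getTime() - 12 * WEEK_MS; // start of the 8 weeks before that (assumption)

  const activeBefore = new Set();
  const activeRecently = new Set();

  for (const { author, date } of commits) {
    const t = new Date(date).getTime();
    if (t >= recentCutoff) activeRecently.add(author);
    else if (t >= priorCutoff) activeBefore.add(author);
    // Commits older than 12 weeks are ignored.
  }

  // Ghosts: seen in the earlier window, absent from the recent one.
  return [...activeBefore].filter((a) => !activeRecently.has(a)).sort();
}
```

A contributor with commits in both windows is not a ghost, and neither is one whose only commits are older than 12 weeks.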
Analyses the last 10 commits per file across up to 40 files to compute unique author counts per file. Files with a single author are flagged as high risk. Aggregated into a single Bus Factor Score shown on the Overview KPI bar.
Sends the repository's README to Gemini AI and returns a structured 0–10 score across six onboarding dimensions: Setup Guide · Contribution Guidelines · Code of Conduct · License · Contact · Purpose. Includes actionable, AI-generated improvement suggestions per dimension.
Recursively maps the file tree (up to 500 files) and renders it in one of three adaptive modes based on repository size: collapsible file tree for small repos, categorised directory cards for medium repos, and a cluster map for large codebases. Includes automatic tech-stack detection across languages, frameworks, and tooling.
A force-directed graph (via react-force-graph-2d) that visualises contributor–repository connections across all repos analysed in the current session. Identifies load-bearing contributors — developers who span multiple projects and represent a single point of failure.
Side-by-side comparison of any two repositories across PR success rate, contributor count, velocity, total commits, and newbie-friendliness score. Search is debounced live against the GitHub API.
Each analysed repo gets a dynamic SVG badge hosted at /badge/:owner/:repo that reflects the live average contributor health score, embeddable in any README with a single line of Markdown.
GitStat is structured as a two-tier system: a Node.js/Express API server that acts as an authenticated proxy to the GitHub REST API and Gemini AI, and a React/Vite frontend that drives all analysis, scoring, and visualisation.
```mermaid
flowchart TB
    User(["👤 Browser Client"])
    subgraph Frontend ["⚛️ React Client (Vite)"]
        Pages["📄 Page Components\n10 route-level views"]
        Charts["📊 Chart.js\nreact-force-graph-2d"]
        Hooks["🪝 Custom Hooks & Contexts\n(Auth · Theme · Architecture)"]
        ApiCache["⚡ apiCache.js\nIn-memory request cache"]
    end
    subgraph Backend ["⚡ Node.js / Express Server"]
        AuthMW["🛡️ requireAuth.js\nSession Guard Middleware"]
        AuthRoute["🔐 auth.js\nOAuth Flow Handler"]
        RepoRoute["📡 repo.js\nData Aggregation Endpoints"]
        AIRoute["🤖 ai.js\nGemini Proxy"]
        GHService["⚙️ githubService.js\nFetch + Timeout Wrapper"]
        Analysis["🧮 analysis.js\nHealth Scoring · Burnout · Bus Factor"]
    end
    GitHub["🐙 GitHub REST API v3"]
    GitHubOAuth["🔑 GitHub OAuth 2.0"]
    Gemini["✨ Gemini 2.5 Flash"]
    BadgeEP["🏷️ /badge/:owner/:repo\nLive SVG Endpoint"]
    User -->|"Page navigation"| Pages
    User -->|"Initiates login"| AuthRoute
    AuthRoute <-->|"OAuth handshake"| GitHubOAuth
    GitHubOAuth -->|"Session cookie"| User
    Pages -->|"Deduplicated API calls"| ApiCache
    ApiCache -->|"Upstream fetch"| RepoRoute
    ApiCache -->|"AI requests"| AIRoute
    AuthMW --> RepoRoute
    AuthMW --> AIRoute
    RepoRoute --> GHService
    GHService <-->|"Commits · PRs · Files\nDeployments · Members"| GitHub
    GHService --> Analysis
    Analysis -->|"Scores · Forecasts\nGhost list · Bus Factor"| RepoRoute
    AIRoute <-->|"README text → Scorecard JSON"| Gemini
    RepoRoute -->|"JSON responses"| Pages
    AIRoute -->|"AI Scorecard"| Pages
    Pages --> Charts
    Pages --> Hooks
    User -->|"Badge embed request"| BadgeEP
    BadgeEP --> GHService
    classDef fe fill:#0c2340,stroke:#61dafb,stroke-width:2px,color:#e2e8f0
    classDef be fill:#0d2818,stroke:#3fb950,stroke-width:2px,color:#e2e8f0
    classDef ext fill:#1a1a2e,stroke:#8b949e,stroke-width:1.5px,color:#c9d1d9
    class Pages,Charts,Hooks,ApiCache fe
    class AuthMW,AuthRoute,RepoRoute,AIRoute,GHService,Analysis be
    class GitHub,GitHubOAuth,Gemini,BadgeEP ext
```
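The diagram shows page components routing calls through an in-memory cache that deduplicates requests. The sketch below illustrates one common way to do that: cache the in-flight promise per key so concurrent callers share a single upstream fetch. The names `createRequestCache` and `ttlMs`, and the 60-second default, are assumptions for illustration and are not taken from `apiCache.js`.

```javascript
// Wraps a fetcher so repeated calls with the same key share one promise.
// `fetcher` is any function (key) => Promise; ttlMs is an assumed expiry.
function createRequestCache(fetcher, ttlMs = 60000) {
  const entries = new Map(); // key -> { promise, expiresAt }

  return function cachedFetch(key, now = Date.now()) {
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now) return hit.promise; // reuse cached or in-flight request

    const promise = Promise.resolve(fetcher(key)).catch((err) => {
      // Do not keep failed requests in the cache, so a retry can succeed.
      const current = entries.get(key);
      if (current && current.promise === promise) entries.delete(key);
      throw err;
    });

    entries.set(key, { promise, expiresAt: now + ttlMs });
    return promise;
  };
}
```

Storing the promise rather than the resolved value means two components that request the same route during the same render both wait on one network call.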
The contributor Health Score is a weighted linear combination of five independent signal scores, each normalised to the range [0, 100]:
| Signal | Weight | Source |
|---|---|---|
| Velocity Score | 28% | Commit growth: recent 4 weeks vs prior 4 weeks |
| Streak Score | 22% | Current active streak vs all-time peak streak |
| PR Score | 18% | PRs merged ÷ PRs opened (follow-through rate) |
| Response Latency Score | 18% | Inverse of average time-to-engage on open PRs |
| Off-Hours Score | 14% | Inverse late-night / weekend commit ratio |
Velocity Score measures whether a contributor's output is accelerating or decelerating by comparing commits in the most recent 4 weeks with the prior 4 weeks. A +1 guard in the denominator prevents division by zero for new contributors.
Streak Score rewards sustained, consistent activity by comparing the current active streak with the all-time peak streak.
PR Score measures delivery completion rate: PRs merged divided by PRs opened.
Off-Hours Score acts as an inverse burnout signal. A contributor committing heavily at night or on weekends scores lower.
| Band | Score Range | Interpretation |
|---|---|---|
| 🔴 At Risk | 0 – 40 | Declining velocity, broken streaks, or sustained off-hours activity |
| 🟠 Stressed | 41 – 65 | Stable but showing one or more early warning signals |
| 🟢 Healthy | 66 – 85 | Consistent output, good PR completion, normal working hours |
| 🟣 Thriving | 86 – 100 | Accelerating velocity, strong streaks, high engagement |
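To make the weighting and banding concrete, here is a minimal sketch. The weights and band boundaries come from the tables above; the function names and the per-signal normalisations (`streakScore`, `prScore`, `offHoursScore`) are assumptions about how each ratio might be mapped to [0, 100], not GitStat's confirmed formulas. Velocity and response latency are taken as already-normalised inputs because their exact scaling is not stated here.

```javascript
// Weights in percent, taken from the signal table.
const WEIGHTS = { velocity: 28, streak: 22, pr: 18, responseLatency: 18, offHours: 14 };

const clamp = (v, lo = 0, hi = 100) => Math.min(hi, Math.max(lo, v));

// Assumed normalisations: simple ratios scaled to 0-100.
const streakScore = (current, peak) => (peak > 0 ? clamp((100 * current) / peak) : 0);
const prScore = (merged, opened) => (opened > 0 ? clamp((100 * merged) / opened) : 0);
const offHoursScore = (offHours, total) =>
  total > 0 ? clamp((100 * (total - offHours)) / total) : 100;

// Weighted linear combination of five signals, each expected in [0, 100].
function healthScore(signals) {
  let total = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    total += weight * clamp(signals[name] ?? 0);
  }
  return Math.round(total / 100);
}

// Band boundaries from the health band table.
function healthBand(score) {
  if (score <= 40) return "At Risk";
  if (score <= 65) return "Stressed";
  if (score <= 85) return "Healthy";
  return "Thriving";
}
```

For example, signals of 50 (velocity), 80 (streak), 90 (PR), 70 (response latency) and 60 (off-hours) combine to 68.8, which rounds to 69 and falls in the Healthy band.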
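The burnout forecaster runs ordinary least-squares linear regression over a rolling 6-week commit window for every active contributor. Let $x_i$ be the week index and $y_i$ the number of commits in week $x_i$, with means $\bar{x}$ and $\bar{y}$ over the window.
The regression fits:
$$\hat{y} = m x + b$$
where the slope $m$ and intercept $b$ are the least-squares estimates:
$$m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad b = \bar{y} - m\,\bar{x}$$
A contributor is flagged at risk when both conditions hold:
- Slope $m < 0$ (negative trend: output is declining)
- The contributor is still active (at least one commit in the most recent 2 weeks)
The projected weeks-to-silence is computed by finding the x-intercept of the regression line:
$$x_0 = -\frac{b}{m}$$
This value is surfaced as a countdown on each contributor card and in the burnout modal on the Overview page. Contributors with $m \geq 0$ are not flagged and receive no countdown.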
The Time Machine slider recomputes the regression using a shifted historical window, allowing leads to replay when a contributor began to decline.
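The forecasting steps can be sketched as follows. This is an illustrative implementation under stated assumptions: the names `linearRegression` and `forecastBurnout` are hypothetical, weeks are indexed from 0, and weeks-to-silence is taken as the x-intercept minus the index of the latest week in the window. The `weeksBack` parameter is one possible way to model the Time Machine's shifted window.

```javascript
// Ordinary least-squares fit of ys against x = 0, 1, ..., n - 1.
function linearRegression(ys) {
  const n = ys.length;
  const meanX = (n - 1) / 2;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (i - meanX) * (ys[i] - meanY);
    sxx += (i - meanX) ** 2;
  }
  const slope = sxx === 0 ? 0 : sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// weeklyCommits: commit counts per week, oldest first.
// weeksBack shifts the 6-week window into the past (assumed Time Machine behaviour).
function forecastBurnout(weeklyCommits, weeksBack = 0) {
  const end = weeklyCommits.length - weeksBack;
  const window = weeklyCommits.slice(0, end).slice(-6);
  const { slope, intercept } = linearRegression(window);

  // Still active: at least one commit in the two most recent weeks of the window.
  const stillActive = window.slice(-2).some((c) => c > 0);
  const atRisk = slope < 0 && stillActive;

  // x-intercept of the fitted line, measured from the last week in the window.
  const weeksToSilence = atRisk
    ? Math.max(0, -intercept / slope - (window.length - 1))
    : null;

  return { slope, intercept, atRisk, weeksToSilence };
}
```

With weekly counts of 12, 10, 8, 6, 4, 2 the fitted slope is -2 and the intercept is 12, so the line reaches zero at week index 6, one week after the latest data point.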
The Bus Factor Heatmap evaluates knowledge concentration risk at file granularity. For each file $f$, GitStat counts the unique authors $A(f)$ across its last 10 commits.
Files are assigned a risk tier:
| Tier | Condition | Colour |
|---|---|---|
| 🔴 High Risk | $A(f) = 1$ (single author) | Red |
| 🟡 Medium Risk | | Yellow |
| 🟢 Low Risk | | Green |
The aggregated Bus Factor Score combines the per-file results into a single value.
A score of 100 means no file in the repository is a single-author dependency. This score is shown on the Overview page KPI bar alongside other repo-level health signals.
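A minimal sketch of the per-file author count and an aggregate score follows. The input shape and the name `busFactor` are hypothetical. The score formula used here, 100 × (1 − single-author files ÷ analysed files), is an assumption chosen only because it is consistent with the statement that 100 means no single-author files; the project's actual formula may differ.

```javascript
// fileAuthors: { "src/app.js": ["alice", "bob", "alice"], ... }
// Each array lists commit authors for that file, most recent first (assumed shape).
function busFactor(fileAuthors, maxFiles = 40, maxCommits = 10) {
  const perFile = Object.entries(fileAuthors)
    .slice(0, maxFiles) // analyse up to 40 files
    .map(([path, authors]) => ({
      path,
      uniqueAuthors: new Set(authors.slice(0, maxCommits)).size, // last 10 commits
    }));

  // Files touched by exactly one author are the high-risk tier.
  const highRisk = perFile.filter((f) => f.uniqueAuthors === 1).map((f) => f.path);

  // Assumed aggregate: share of files that are NOT single-author, scaled to 0-100.
  const score =
    perFile.length === 0 ? 100 : Math.round(100 * (1 - highRisk.length / perFile.length));

  return { perFile, highRisk, score };
}
```

With four files of which one has a single author, this sketch returns a score of 75 and lists that one file as high risk.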
Authentication routes:
| Method | Route | Description | Response |
|---|---|---|---|
| GET | `/auth/github` | Redirects to GitHub OAuth authorisation page | 302 Redirect |
| GET | `/auth/callback` | Handles OAuth callback, creates session cookie | 302 → /dashboard |
| GET | `/auth/me` | Returns the currently authenticated user object | UserSchema |
| GET | `/auth/logout` | Destroys the server-side session | 200 OK |

Repository routes:
| Method | Route | Description | Response |
|---|---|---|---|
| GET | `/api/repo/search` | Searches public repos via GitHub API (`?q=`) | List[RepoSummary] |
| GET | `/api/user/repos` | Lists the authenticated user's own repos | List[RepoSummary] |
| GET | `/api/repo/:owner/:repo/overview` | KPIs, health heatmap data, and ghost contributor list | OverviewSchema |
| GET | `/api/repo/:owner/:repo/contributors` | Per-contributor health scores and burnout forecasts | List[ContributorSchema] |
| GET | `/api/repo/:owner/:repo/commits` | Raw commit history with author and timestamp | List[CommitSchema] |
| GET | `/api/repo/:owner/:repo/pulls` | Pull request table with merge time and author | List[PullSchema] |
| GET | `/api/repo/:owner/:repo/deployments` | Deployment timeline with status, environment, and ref | List[DeploymentSchema] |
| GET | `/api/repo/:owner/:repo/files` | Recursive file tree for Architecture Visualizer | FileTreeSchema |
| GET | `/api/repo/:owner/:repo/compare` | Side-by-side comparison against a second repo | CompareSchema |
| GET | `/badge/:owner/:repo` | Returns a live dynamic SVG health badge | image/svg+xml |

AI routes:
| Method | Route | Description | Response |
|---|---|---|---|
| POST | `/api/ai/readme` | Sends README content to Gemini, returns structured scorecard | ScorecardSchema |
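As a usage illustration, the helpers below build URLs for the routes listed above and show one way a client might call them. The helper names (`repoEndpoint`, `badgeMarkdown`, `fetchContributors`) and the badge alt text are hypothetical; only the route paths come from the tables. The `credentials: "include"` option is included on the assumption that the session cookie must accompany cross-origin requests.

```javascript
const REPO_RESOURCES = [
  "overview", "contributors", "commits", "pulls", "deployments", "files", "compare",
];

// Builds /api/repo/:owner/:repo/:resource against a given backend origin.
function repoEndpoint(baseUrl, owner, repo, resource) {
  if (!REPO_RESOURCES.includes(resource)) {
    throw new Error(`Unknown repo resource: ${resource}`);
  }
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/api/repo/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/${resource}`;
}

// Produces the single line of Markdown that embeds the live SVG badge.
function badgeMarkdown(baseUrl, owner, repo) {
  const base = baseUrl.replace(/\/+$/, "");
  const url = `${base}/badge/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
  return `![GitStat health](${url})`;
}

// Example call; the response is expected to be a JSON list of contributors.
async function fetchContributors(baseUrl, owner, repo) {
  const res = await fetch(repoEndpoint(baseUrl, owner, repo, "contributors"), {
    credentials: "include", // send the session cookie (assumption)
  });
  if (!res.ok) throw new Error(`GitStat API returned ${res.status}`);
  return res.json();
}
```

In local development the base URL would be the Express origin, for example `http://localhost:5001`.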
```
GitStat/
├── client/
│   └── src/
│       ├── pages/                      # 10 route-level page components
│       │   ├── Landing.jsx             # Cinematic hero + GitHub OAuth entry
│       │   ├── Dashboard.jsx           # Repo search with match scoring
│       │   ├── YourRepos.jsx           # Authenticated user's repo list
│       │   ├── Overview.jsx            # KPIs, heatmap, burnout modal, AI score
│       │   ├── Contributors.jsx        # Cards, Time Machine, ghost detector
│       │   ├── DeepAnalysis.jsx        # Architecture visualizer + tech stack
│       │   ├── PullRequests.jsx        # Filterable PR table
│       │   ├── Deployments.jsx         # Deployment timeline
│       │   ├── Compare.jsx             # Side-by-side repo metrics
│       │   └── NetworkGraph.jsx        # Cross-repo force graph
│       ├── components/                 # Shared UI components
│       │   ├── RepoLayout.jsx          # Persistent repo nav shell
│       │   ├── ContributorCard.jsx     # Health score card with sparkline
│       │   ├── ChartDrawer.jsx         # Slide-out chart panel
│       │   └── Architecture/           # Tech stack detection, setup guide
│       ├── context/
│       │   ├── AuthContext.jsx         # GitHub session state
│       │   └── ThemeContext.jsx        # Dark / Light mode toggle
│       ├── hooks/
│       │   └── useRepoArchitecture.js
│       └── utils/
│           ├── metrics.js              # Health score + burnout regression
│           ├── matchScore.js           # Repo relevance matching
│           └── apiCache.js             # In-memory request deduplication
└── server/
    ├── routes/
    │   ├── auth.js                     # GitHub OAuth + session management
    │   ├── repo.js                     # All repository data endpoints
    │   └── ai.js                       # Gemini AI proxy
    ├── middleware/
    │   └── requireAuth.js              # Session guard for protected routes
    └── services/
        └── githubService.js            # GitHub API fetch with timeout + retry
```
- Node.js: v18.0.0 or higher (check: `node -v`)
- A GitHub OAuth App (free), created in GitHub Developer Settings
- A Gemini API key (free), available from Google AI Studio
```bash
git clone https://github.com/malavya1411/GitStat.git
cd GitStat
```
```bash
# Root dependencies (concurrent dev runner)
npm install
# Client dependencies
cd client && npm install && cd ..
# Server dependencies
cd server && npm install && cd ..
```
Create `server/.env` from the provided template:
```bash
cp server/.env.example server/.env
```
Then populate the values (see Configuration & Environment Variables for full reference).
- Go to github.com/settings/developers → New OAuth App
- Set Homepage URL to `http://localhost:5173`
- Set Authorization callback URL to `http://localhost:5001/auth/callback`
- Copy the Client ID and a generated Client Secret into `server/.env`
```bash
# From the root directory: starts both client and server concurrently
npm run dev
```
| Service | URL |
|---|---|
| Frontend (Vite) | http://localhost:5173 |
| Backend (Express) | http://localhost:5001 |
Server variables (`server/.env`):
| Key | Type | Required | Description |
|---|---|---|---|
| `GITHUB_CLIENT_ID` | String | ✅ | OAuth App Client ID from GitHub Developer Settings |
| `GITHUB_CLIENT_SECRET` | String | ✅ | OAuth App Client Secret from GitHub Developer Settings |
| `GEMINI_API_KEY` | String | ✅ | Google AI Studio API key for Gemini 2.5 Flash |
| `SESSION_SECRET` | String | ✅ | A long random string used to sign session cookies |
| `PORT` | Integer | — | Express server port. Defaults to `5001` |
| `FRONTEND_URL` | String | — | Frontend origin for CORS. Defaults to `http://localhost:5173` |
| `NODE_ENV` | String | — | Set to `production` for production deployments |

Client variables:
| Key | Type | Default | Description |
|---|---|---|---|
| `VITE_API_BASE_URL` | String | `http://localhost:5001` | Express backend origin for all API calls |
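A server could validate these variables at startup so that a missing secret fails fast rather than at the first OAuth or Gemini call. The sketch below is one way to do that; `loadConfig` and the shape of the returned object are hypothetical, while the required keys and defaults mirror the tables above.

```javascript
const REQUIRED_KEYS = [
  "GITHUB_CLIENT_ID",
  "GITHUB_CLIENT_SECRET",
  "GEMINI_API_KEY",
  "SESSION_SECRET",
];

// Reads configuration from an env-like object (defaults to process.env).
function loadConfig(env = process.env) {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  const port = Number.parseInt(env.PORT ?? "5001", 10); // documented default
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got "${env.PORT}"`);
  }

  return {
    githubClientId: env.GITHUB_CLIENT_ID,
    githubClientSecret: env.GITHUB_CLIENT_SECRET,
    geminiApiKey: env.GEMINI_API_KEY,
    sessionSecret: env.SESSION_SECRET,
    port,
    frontendUrl: env.FRONTEND_URL ?? "http://localhost:5173", // documented default
    isProduction: env.NODE_ENV === "production",
  };
}
```

Passing the environment as a parameter keeps the function easy to exercise with a plain object in tests.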
To deploy the backend on Render:
- Create a new Web Service pointing to your GitHub repository.
- Set the Root Directory to `server`.
- Set Build Command to `npm install` and Start Command to `node index.js` (or `npm start`).
- Add the environment variables (`GITHUB_CLIENT_ID`, `GITHUB_CLIENT_SECRET`, `GEMINI_API_KEY`, `SESSION_SECRET`, `FRONTEND_URL`) in the Render dashboard.
- Update your GitHub OAuth App's Authorization callback URL to `https://your-render-url.onrender.com/auth/callback`.

To deploy the frontend on Vercel:
- Import the repository in the Vercel dashboard.
- Set the Root Directory to `client`.
- Set the Framework Preset to `Vite`.
- Add `VITE_API_BASE_URL` pointing to your live Render backend URL.
- Deploy. Vercel will handle SPA routing automatically via `vercel.json` rewrites.
Contributions are welcome. A few notes before opening a PR:
- Health score or burnout logic changes: review the weighting table in `server/utils/analysis.js` and the formula documentation above before modifying. Open an issue first to discuss changes to the scoring model; these affect every contributor card, the overview KPI bar, and the embeddable badge.
- New data endpoints: add the route in `server/routes/repo.js`, wrap it with `requireAuth` where appropriate, and add a corresponding entry to the API Route Reference table in this README.
- Frontend-only features: keep data transformation logic in `client/src/utils/metrics.js`, not inside page components.
Bug reports and feature requests are welcome via GitHub Issues.
MIT — GitStat is fully open source. You are free to use, modify, and distribute it.
