A two-part tool for finding URLs that are losing organic search visibility over time:
- `analysis/`: a Node + TypeScript script that consumes a Google Search Console CSV export and produces a single JSON file (`decay.json`) describing the decay pattern for every qualifying URL.
- `app/`: a Vue 3 + Vite static viewer that loads that JSON and renders an interactive table plus a dual-axis chart for browsing results.
The output is fully static; no backend, no database. Build it once and host it anywhere (Netlify, S3, GitHub Pages), or run it locally in dev mode.
A typical mid-to-large content site has thousands of URLs that have lost organic traffic since launch. Some lost it because Google rolled out an algorithm update; some because a competitor outranked them; some because the topic itself drifted out of relevance; many because the content went stale and search intent moved on. The hardest pattern to catch (and the most actionable) is the slow bleed: a URL that has been quietly losing 5–10% of its impressions every month for a year, hidden inside a flat-top GSC chart that averages thousands of pages together.
Search Console shows you the symptoms (impressions are down) but not the diagnosis: which specific URLs are decaying, in what shape, and how steadily. Sorting the pages report by recent traffic surfaces what's big now, not what's getting smaller fastest. Date filters give you a snapshot, not a trajectory.
This tool fits in that gap. It takes the daily per-URL CSV that Google already gives you and answers a single question per page: is this URL trending downward in a way that's steady enough to act on? Steady decline → refresh, redirect, or sunset candidate. Spiky or random → leave it alone.
- A ranked table of every URL the script analyzed, with slope, decline %, R², and lifetime totals for both impressions and clicks
- A dual-axis line chart with the actual monthly series and the fitted regression line for the URL you click
- A filter that shows only URLs whose slope is negative, so you don't have to mentally subtract growers from decliners
- Sortable columns so you can shift between "biggest absolute losses" (slope), "biggest relative losses" (decline %), and "steadiest, most predictable losses" (R²) without leaving the page
- A search box that substring-matches against URL paths so you can audit one section at a time
- A self-contained static deliverable (no server, no database) that you can host anywhere or open over any HTTP server
- Why this exists
- What you get
- How decay is detected
- Prerequisites
- Project layout
- Architecture and data flow
- Quick start (5 minutes)
- Workflow
- CSV format
- Analysis tuning (CLI flags)
- Reading the app
- Worked example: a single URL through the system
- `decay.json` schema
- Deploying
- Troubleshooting
- Limitations and caveats
- What this tool is not
- Design principles
- Tech stack rationale
- Visual design system
- Customizing
- Glossary
- FAQ
- License
For each URL the analysis script:
- Aggregates daily impressions and clicks into monthly buckets.
- Drops partial months at the start and end of the dataset, so every data point represents a complete calendar month. (E.g., a CSV starting `2024-12-23` discards December 2024 entirely; the first analyzed month is January 2025.)
- Anchors each URL's series at its first full month with traffic. Any "missing" months between that anchor and the most recent full month are treated as zero impressions/clicks (the URL existed in Google's index but earned nothing that month).
- Filters out URLs that don't meet the minimum-volume and minimum-history thresholds (defaults: 1,000 lifetime impressions, 100 lifetime clicks, 6 months of data).
- Fits a linear regression on monthly impressions and clicks separately. Each regression yields a slope (avg change per month), intercept (predicted starting value), and R² (how well the line fits, with values closer to 1 meaning a steadier trend).
- Computes a decline percentage from the modeled start to the modeled end of the URL's window, capped at 100%. A URL can't decline more than 100%, and the cap prevents URLs that fell to zero from reporting math-only values like 144%.
A URL with a steeply negative slope and a high R² is a "steady decliner": the kind of post that has been bleeding traffic month after month rather than briefly spiking and crashing. That's the core signal this tool is designed to surface.
The methodology is intentionally unaware of seasonality and short-term spikes; the high-R² requirement filters them out when you sort for steady declines.
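The fit itself is small enough to sketch from scratch. The following is a minimal stand-in for the `simple-statistics` calls the script actually makes, not the script's own code; `fitLine` and the `Fit` shape are illustrative names:

```ts
interface Fit { slope: number; intercept: number; r2: number }

// Least-squares fit over a monthly series, with the month index (0..n-1) as x.
// Assumes at least 2 points (the tool's --min-months default is 6).
function fitLine(values: number[]): Fit {
  const n = values.length;
  const xs = values.map((_, i) => i);
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = values.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, sst = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (values[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
    sst += (values[i] - meanY) ** 2;
  }
  const slope = sxy / sxx;                       // avg change per month
  const intercept = meanY - slope * meanX;       // predicted value at month 0
  // R² = 1 − SSE/SST: the share of month-to-month variance the line explains.
  let sse = 0;
  for (let i = 0; i < n; i++) {
    const predicted = intercept + slope * xs[i];
    sse += (values[i] - predicted) ** 2;
  }
  const r2 = sst === 0 ? 1 : 1 - sse / sst;      // flat series: line fits perfectly
  return { slope, intercept, r2 };
}
```

A perfectly linear declining series yields a negative slope, an intercept equal to the first value, and R² = 1; noisy real-world series pull R² toward 0.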
Several modeling choices were considered:
| Model | Catches steady decay | Catches seasonality | Single-number summary | Easy to explain |
|---|---|---|---|---|
| Linear regression | yes | no (sees as noise) | yes (slope) | yes |
| ARIMA | yes | yes | no (multiple parameters) | no |
| Exponential smoothing | yes | partial | medium | medium |
| Period-over-period (3mo vs prior 3mo) | partial | no | yes | yes (but misses gradual bleeds) |
| Mann-Kendall trend test | yes (yes/no only) | partial | no magnitude | medium |
| Bayesian structural time series | yes | yes | no | no |
Linear regression won because it's the cleanest match for the question "is this URL on a downward trajectory of meaningful magnitude?" The slope answers magnitude in interpretable units (impressions per month). The R² answers steadiness. Both are single numbers that can sort a table. ARIMA produces too many parameters to summarize; period-over-period misses gradual bleeds; Mann-Kendall doesn't quantify how much.
The trade-off accepted: the model doesn't account for seasonality. A URL that legitimately spikes every December will look noisy (low R²) rather than healthy. This is intentional. Surfacing predictable seasonal pages is a different question, and adding seasonal decomposition would be a different tool. See What this tool is not.
R² is the goodness-of-fit of the linear regression: the proportion of variance in the URL's monthly traffic that the line explains. It runs from 0 (line explains nothing) to 1 (line passes through every point).
| R² | What it usually means |
|---|---|
| > 0.7 | Strongly linear trend. The slope is a confident estimate. Best refresh candidates if the slope is negative. |
| 0.4 – 0.7 | Clear trend with some month-to-month noise. Slope is meaningful. |
| 0.2 – 0.4 | Weak trend, dominated by noise or non-linear shape (e.g., spike + decay). Slope is directionally suggestive but uncertain. |
| < 0.2 | No discernible linear trend. Could be flat, spiky, or seasonal. Don't infer decay from these. |
Pairing slope with R² is what separates "steady decline" from "one bad month."
Decline % is computed from the regression line, not from raw start and end values:

```
declinePct = (predictedStart − max(0, predictedEnd)) / predictedStart × 100
```

The `max(0, predictedEnd)` term is a cap that prevents pathological values when the line projects below zero. Worked example:

A URL that had 1,865 impressions in January and ~0 every month for the next 14 months produces a regression with intercept ≈ 430 and slope ≈ −47/month. The line projects from 430 down to roughly −228. Without the cap:

```
declinePct = (430 − (−228)) / 430 × 100 ≈ 153%
```

That's mathematically what the linear model says, but it's nonsense; a URL can't decline more than 100%. Capping the projected end at zero gives:

```
declinePct = (430 − 0) / 430 × 100 = 100%
```
That reading is honest: the model is saying the URL has effectively gone to zero. All URLs that have flatlined to zero hit the 100% ceiling, and R² then tiebreaks to surface the ones whose decline was steady rather than sudden.
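The cap is one line of arithmetic. A sketch (the function name `declinePct` is illustrative, not necessarily the script's):

```ts
// Decline % from the fitted line, with the projected end clamped at zero so a
// URL that flatlined reads as 100% instead of an impossible >100%.
function declinePct(predictedStart: number, predictedEnd: number): number | null {
  if (predictedStart <= 0) return null;    // no meaningful baseline to decline from
  const end = Math.max(0, predictedEnd);   // the zero-cap
  return ((predictedStart - end) / predictedStart) * 100;
}
```

With the worked-example numbers, `declinePct(430, -228)` yields 100 and `declinePct(8000, 1000)` yields 87.5.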
GSC exports rarely start or end on the first or last day of a calendar month. A dataset starting 2024-12-23 only has 9 days of December, so including it would compare December's tiny partial total against full months and skew the regression. The script detects this at both ends of the dataset and trims partial calendar months before computing anything. The check runs per dataset, so a CSV that happens to start exactly on the 1st of a month doesn't lose a month unnecessarily.
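The trim can be sketched in a few lines, assuming ISO `YYYY-MM-DD` date strings (the function name `fullMonthRange` is illustrative):

```ts
// Trim partial calendar months at both ends of the export's date range.
// A month is kept only if the dataset covers its first AND last day.
function fullMonthRange(firstDate: string, lastDate: string): { start: string; end: string } | null {
  const [fy, fm, fd] = firstDate.split("-").map(Number);
  const [ly, lm, ld] = lastDate.split("-").map(Number);
  // Advance past the first month unless the export starts on its 1st.
  let sy = fy, sm = fm;
  if (fd !== 1) { sm++; if (sm > 12) { sm = 1; sy++; } }
  // Back up past the last month unless the export ends on its final day.
  const daysInMonth = new Date(ly, lm, 0).getDate(); // JS: day 0 of the next month
  let ey = ly, em = lm;
  if (ld !== daysInMonth) { em--; if (em < 1) { em = 12; ey--; } }
  const pad = (n: number) => String(n).padStart(2, "0");
  const start = `${sy}-${pad(sm)}`, end = `${ey}-${pad(em)}`;
  return start <= end ? { start, end } : null;  // null = no complete month at all
}
```

For the example range, `fullMonthRange("2024-12-23", "2026-04-20")` trims both partial ends to 2025-01 through 2026-03, matching the analysis log shown later in this README.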
Each URL's regression starts at the URL's first full month with any traffic, not at the dataset's first month. This matters for newer URLs: a page launched in June has 9 months of data in a 15-month dataset; treating the first 6 months as "zero impressions" for that URL would manufacture a fake "growth from zero" signal. By starting each URL's series at its first traffic month, the regression measures the URL's actual lifecycle, not its position relative to the export window.
Once anchored, gaps after the anchor are zero-filled. A URL that had impressions in months 1–4, then nothing for months 5–7, then impressions in months 8–15 is treated as 4 active months → 3 months of true zero → 8 active months. GSC reports a URL's absence the same way as zero impressions, so this gives the regression an honest picture of decline-with-gaps versus continuous-with-traffic.
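Anchoring plus zero-fill can be sketched as below (names illustrative; the real script tracks impressions and clicks together rather than a single number per month):

```ts
// Build one URL's continuous monthly series: start at its first full month
// with any traffic, then zero-fill gap months through the end of the range.
function anchoredSeries(byMonth: Map<string, number>, fullMonths: string[]): number[] {
  const anchor = fullMonths.findIndex(m => (byMonth.get(m) ?? 0) > 0);
  if (anchor === -1) return [];   // URL never earned traffic in the window
  return fullMonths.slice(anchor).map(m => byMonth.get(m) ?? 0);
}
```

For the months 1–4 / gap / months 8–15 example above, months 5–7 come back as literal zeros rather than missing points, so the regression sees the gap.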
- Node.js 20+ (uses native `parseArgs`, ESM, top-level await)
- npm (ships with Node)
- A Google Search Console CSV export with daily granularity
That's it. No database, no Python, no cloud accounts.
```
content-decay/
├── data/                  # Drop GSC CSV exports here (gitignored)
│   └── gsc-export.csv
├── analysis/
│   ├── analyze.ts         # The analysis script
│   ├── package.json
│   └── tsconfig.json
├── app/
│   ├── public/
│   │   └── decay.json     # Output of analyze.ts (gitignored, regenerated)
│   ├── src/
│   │   ├── components/
│   │   │   ├── DecayTable.vue
│   │   │   └── DecayChart.vue
│   │   ├── App.vue
│   │   ├── format.ts
│   │   ├── style.css
│   │   ├── types.ts
│   │   ├── main.ts
│   │   └── env.d.ts
│   ├── index.html
│   ├── package.json
│   ├── tsconfig.json
│   └── vite.config.ts
├── .gitignore
└── README.md
```
The two halves (analysis/ and app/) have independent package.json files and don't share code at runtime. The only contract between them is the decay.json file format (see schema).
```
┌────────────────────┐
│  GSC CSV export    │  daily, per-URL (one row per URL per day)
│  ~376k rows        │
└──────────┬─────────┘
           │  csv-parse (streaming)
           ▼
┌────────────────────┐
│  analyze.ts        │  • Aggregate to monthly buckets per URL
│  (Node + TS)       │  • Detect partial months at boundaries, trim
│                    │  • Anchor each URL at its first traffic month
│                    │  • Filter by volume + history thresholds
│                    │  • Linear regression × 2 (impressions, clicks)
│                    │  • Compute decline % with zero-cap
│                    │  • Sort by impressions decline DESC
└──────────┬─────────┘
           │  writes JSON
           ▼
┌────────────────────┐
│  decay.json        │  • Metadata (date range, filters, counts)
│  ~750KB            │  • Per-URL: slopes, R², monthly series, lifetime
└──────────┬─────────┘
           │  fetched at runtime by the Vue app
           ▼
┌────────────────────┐
│  Browser           │  • Loads JSON
│  (Vue 3 + Vite)    │  • Filters & sorts in-memory
│                    │  • Renders virtualized table
│                    │  • Renders dual-axis Chart.js for selection
└────────────────────┘
```
The two halves are decoupled: the analysis script doesn't know about Vue, and the app doesn't know about the CSV. The contract between them is the decay.json shape, formalized as a TypeScript interface in app/src/types.ts.
Performance characteristics
- Aggregation is O(rows). A 376k-row CSV (~37 MB) aggregates and writes JSON in well under 5 seconds on a modern laptop.
- Regression is O(URLs × months). With 6,000 URLs over 15 months, that's 90,000 monthly points across two regressions each → ~180,000 simple-statistics operations. Sub-second.
- App load fetches a single JSON (typically 0.5–2 MB depending on URL count) and parses it once on mount. Sort and filter run on that in-memory array; both are O(URLs) per interaction.
- Table rendering uses virtual scrolling (`vue-virtual-scroller`), so DOM size is constant regardless of URL count. Browsers stay smooth past 10,000 rows.
Assumes you already have your GSC CSV.
```
# 1. Install analysis dependencies
cd analysis
npm install

# 2. Drop your CSV in /data, then run the analysis
#    (use the default file name, or pass --input)
cp ~/Downloads/your-export.csv ../data/
npm run analyze -- --input ../data/your-export.csv

# 3. Install app dependencies and start the dev server
cd ../app
npm install
npm run dev
# → open http://localhost:5173
```

You're now browsing decay results. Re-run step 2 anytime and refresh the browser; no rebuild needed in dev mode.
Place your CSV anywhere. The data/ folder is the conventional location and is gitignored so you won't accidentally commit client data. Multiple CSVs can coexist; each analysis run targets one file.
From analysis/:
```
npm run analyze
```

That uses the defaults (input `../data/gsc-export.csv`, output `../app/public/decay.json`, standard thresholds).
To target a different file or tune thresholds:
```
npm run analyze -- \
  --input ../data/your-export.csv \
  --output ../app/public/decay.json \
  --min-impressions 1000 \
  --min-clicks 100 \
  --min-months 6
```

The script logs its progress and ends with a summary like:

```
[analyze] parsed 376,221 rows across 6,429 URLs
[analyze] date range: 2024-12-23 → 2026-04-20
[analyze] full months: 2025-01 → 2026-03 (15 months)
[analyze] wrote 740 URLs (0.73 MB) to ../app/public/decay.json
[analyze] qualified: impressions=740 clicks=235
```
If only a small number of URLs qualify, that's expected; most pages on a typical site don't meet the volume/history thresholds.
Local development (live reload):

```
cd app
npm run dev
# → http://localhost:5173
```

The app fetches `decay.json` at runtime, so re-running the analysis and refreshing the browser is enough. No rebuild required.

Production build (static, hostable):

```
cd app
npm run build
# → app/dist/ (deployable static folder)

npm run preview   # optional: serve the production build locally
# → http://localhost:4173
```

The `dist/` folder is fully static. See Deploying.
The script expects a Google Search Console CSV export with these columns (header row required, exact names):

```
Date,URL,Impressions,Clicks,CTR,Avg Rank
```

Example:

```
Date,URL,Impressions,Clicks,CTR,Avg Rank
2025-01-01,https://example.com/blog/post-a,1234,56,4.54%,12.3
2025-01-01,https://example.com/blog/post-b,789,5,0.63%,33.2
2025-01-02,https://example.com/blog/post-a,1187,52,4.38%,12.5
```
Notes:
- Date must be `YYYY-MM-DD`.
- URL is matched literally; `/page/` and `/page` are different URLs.
- CTR and Avg Rank are read but not used by the analysis (they are not currently surfaced anywhere).
- One row per (Date, URL) pair. The script aggregates rows by URL within each calendar month.
- Rows missing any of (Date, URL, Impressions, Clicks) or with non-numeric impressions/clicks are silently skipped.
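The skip rule can be sketched as a row validator. This is illustrative (`toRow` and the `Row` shape are not the script's actual names); the real script applies equivalent checks while streaming records through `csv-parse`:

```ts
interface Row { date: string; url: string; impressions: number; clicks: number }

// Validate one parsed CSV record keyed by header name; return null for rows
// the analysis should silently skip (missing fields or non-numeric counts).
function toRow(rec: Record<string, string | undefined>): Row | null {
  const date = rec["Date"], url = rec["URL"];
  const imp = rec["Impressions"], clk = rec["Clicks"];
  if (!date || !url || imp == null || imp === "" || clk == null || clk === "") return null;
  const impressions = Number(imp);
  const clicks = Number(clk);
  if (!Number.isFinite(impressions) || !Number.isFinite(clicks)) return null;
  return { date, url, impressions, clicks };
}
```

Note the empty-string guard: `Number("")` is 0 in JavaScript, so without it a blank Impressions cell would silently count as zero traffic instead of being skipped.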
Where to get this from GSC:
- Search Console → Performance → set the date range (this tool benefits from at least 9–12 months of data)
- Export → CSV
- Use the daily per-URL (page) table; that's the export format the script expects
| Flag | Default | Purpose |
|---|---|---|
| `--input` | `../data/gsc-export.csv` | Path to source CSV |
| `--output` | `../app/public/decay.json` | Where to write results |
| `--min-impressions` | `1000` | Min lifetime impressions for impressions-side analysis |
| `--min-clicks` | `100` | Min lifetime clicks for clicks-side analysis |
| `--min-months` | `6` | Min full months of data per URL |
- Smaller site / fewer URLs: lower `--min-impressions` to 500 or 250 to keep more URLs in the analysis. The trade-off is more noise in the rankings.
- Larger site, focusing only on top performers: raise `--min-impressions` to 5000 or 10000 to filter to high-traffic URLs only.
- Short history (< 9 months of data): lower `--min-months` to 3 or 4, but expect noisier R² values; fewer data points means individual months sway the regression more.
- Surface URLs with traffic but no clicks: lower `--min-clicks` (the click-side analysis is independent of the impression-side analysis, so a URL can qualify for one and not the other).
The analysis is fully deterministic: same CSV plus same flags equals identical decay.json.
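For reference, this flag surface maps directly onto Node's built-in `util.parseArgs`. A hedged sketch of how the parsing could look (the `cliOptions` wrapper is illustrative, not the script's actual code):

```ts
import { parseArgs } from "node:util";

// Parse analysis flags with Node's native parseArgs (stable in Node 20).
// All values arrive as strings; numeric thresholds are converted afterwards.
function cliOptions(args: string[]) {
  const { values } = parseArgs({
    args,
    options: {
      input:             { type: "string", default: "../data/gsc-export.csv" },
      output:            { type: "string", default: "../app/public/decay.json" },
      "min-impressions": { type: "string", default: "1000" },
      "min-clicks":      { type: "string", default: "100" },
      "min-months":      { type: "string", default: "6" },
    },
  });
  return {
    input: values.input as string,
    output: values.output as string,
    minImpressions: Number(values["min-impressions"]),
    minClicks: Number(values["min-clicks"]),
    minMonths: Number(values["min-months"]),
  };
}
```

With no flags, `cliOptions([])` returns the defaults from the table above; unknown flags throw, which surfaces typos immediately.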
| Column | What it shows |
|---|---|
| URL | The page path (domain stripped for readability; full URL in tooltip and chart link) |
| Imp Slope | Avg impressions change per month (negative = losing impressions) |
| Imp Decline | % drop from modeled start to end, capped at 100% |
| Imp R² | How well a straight line fits the impressions series. 1.0 = perfect fit. < 0.3 = noisy/spiky. |
| Clk Slope | Same as Imp Slope, for clicks |
| Clk Decline | Same as Imp Decline, for clicks |
| Clk R² | Same as Imp R², for clicks |
| Lifetime imp | Sum of impressions across the URL's analyzed window |
| Lifetime clk | Sum of clicks across the URL's analyzed window |
| Months | Number of monthly data points used in the regression |
Color cues:
- 🔴 Red = a declining value (negative slope or positive decline %)
- 🟢 Green = a growing value
- ⚫ Gray "—" = the URL didn't qualify for that metric (e.g., not enough lifetime clicks for the click-side analysis)
Sortable columns: click any column header. Click again to reverse direction. Active sort column is highlighted gold.
Search: typed text is substring-matched against the full URL.
Declining only: when on, hides URLs whose slope on the active metric is positive or zero. On by default.
Impressions / Clicks toggle: switches the default sort and the basis of the declining-only filter. Both metrics' columns remain visible.
Selecting a row: click any row to load it in the chart above.
A dual-axis line chart of the selected URL's monthly series:
- Gold solid line: actual monthly impressions (left axis)
- Gold dashed line: fitted regression line for impressions
- Charcoal solid line: actual monthly clicks (right axis)
- Charcoal dashed line: fitted regression line for clicks
Hovering shows tooltips with exact values for every line at that month.
The three stat cards above the chart summarize the regression for the selected URL:
- Imp Decline / Click Decline: capped decline %, slope, and R²
- Lifetime: total impressions, total clicks, months of data
The URL above the chart is a clickable link that opens the page in a new tab.
"Find the worst overall decliners"
- Default state: declining only ON, sort by Imp Decline DESC.
- Top of table = URLs that have lost the most impressions.
"Find slow, steady bleeders (best for content refreshes)"
- Declining only ON.
- Click "Imp R²" once to sort DESC (a fresh column starts in DESC).
- Top of table = URLs whose decline is most linear → most predictable, best refresh candidates.
"Find URLs losing clicks even when impressions are stable"
- Switch to the Clicks metric.
- Sort by Clk Decline DESC.
- Compare Clk Slope vs Imp Slope per row. A negative click slope alongside a positive impression slope points to a ranking-position issue rather than a visibility issue.
"Audit a specific section"
- Type the path prefix into search (e.g., `/blog/`).
- Sort however you want.
Take https://example.com/blog/content-strategy-2024/, a URL that decayed cleanly from 8,000 impressions/month down to 1,000 over 15 months.
1. CSV. The file contains roughly `15 full months × ~30 days/month ≈ 450` daily rows for this URL.
2. Aggregation. Those 450 rows collapse into 15 monthly bucket totals: `[8000, 7500, 7000, 6500, 6000, 5500, 5000, 4500, 4000, 3500, 3000, 2500, 2000, 1500, 1000]`.
3. Anchor. First month with any traffic is month 1 of the dataset window. No clipping needed.
4. Filter check. Lifetime impressions = 67,500 (≥ 1,000 ✓), months = 15 (≥ 6 ✓). Qualifies for impressions analysis. Lifetime clicks would be checked separately against the click threshold.
5. Regression. Fit a line to `(month index, impressions)` for indices 0–14. The math returns:
   - slope = −500/mo (losing 500 impressions every month)
   - intercept = 8,000 (predicted month-1 value, matching the actual)
   - R² = 1.00 (the data is perfectly linear by construction; in real GSC data R² would typically be 0.7–0.95 for a steady decay)
6. Decline %. Predicted end at month 14 = `8000 + (−500 × 14) = 1000`. The line stays positive, so no cap kicks in: `(8000 − 1000) / 8000 × 100 = 87.5%`.
7. JSON. This URL ends up in `decay.json` with `slope: −500`, `declinePct: 87.5`, `r2: 1.0`.
8. App. The URL renders in the table with red `−500/mo`, red `87.5%`, and `1.00` for R². Clicking the row loads the chart, which renders the actual 15-point series (gold solid line) with a downward dashed regression line that tracks the points exactly.
9. Triage. High R² + high decline % + meaningful lifetime impressions = strong candidate for content refresh or sunset. The dashed line tracking the data closely tells you this is a steady decline, not a one-time hit.
Contrast this with a URL whose series is [5677, 7892, 3942, 1780, 850, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]. The direction is the same, but the regression line projects below zero (so the cap fires and decline reads 100%), and R² lands around 0.5–0.6 because the shape is "front-loaded then dead" rather than a clean linear descent. Both URLs surface in the table; the R² column tells you which one is the steady bleeder versus which one fell off a cliff.
```ts
{
  metadata: {
    runAt: string;            // ISO timestamp of the analysis run
    inputFile: string;        // basename of the source CSV (e.g. "gsc-export.csv")
    dateRange: { start: string; end: string }; // YYYY-MM-DD
    fullMonths: string[];     // e.g. ["2025-01", "2025-02", ...]
    filters: {
      minImpressions: number;
      minClicks: number;
      minMonths: number;
    };
    counts: {
      totalUrlsInCsv: number;
      qualifiedForImpressions: number;
      qualifiedForClicks: number;
      includedInOutput: number;
    };
  };
  urls: Array<{
    url: string;
    monthsOfData: number;
    lifetimeImpressions: number;
    lifetimeClicks: number;
    monthlySeries: Array<{
      month: string;          // YYYY-MM
      impressions: number;
      clicks: number;
    }>;
    impressions: RegressionResult | null; // null = didn't qualify
    clicks: RegressionResult | null;
  }>;
}

interface RegressionResult {
  slope: number;        // change per month
  intercept: number;    // predicted value at first analyzed month
  r2: number;           // 0..1
  startValue: number;   // = intercept
  endValue: number;     // intercept + slope * (n - 1), uncapped
  declinePct: number | null; // capped at 100, null if intercept ≤ 0
}
```

URLs are sorted in the JSON by impressions decline % descending (worst decay first). The app re-sorts at runtime based on user selection.
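A defensive load-time check against this contract might look like the sketch below. This is illustrative only; the app itself types the fetched JSON with the interface in `app/src/types.ts` rather than validating it:

```ts
// Minimal structural sanity check on a fetched decay.json payload before the
// viewer trusts it. Field names follow the schema above.
function looksLikeDecayJson(data: unknown): boolean {
  if (typeof data !== "object" || data === null) return false;
  const d = data as { metadata?: unknown; urls?: unknown };
  return (
    typeof d.metadata === "object" && d.metadata !== null &&
    Array.isArray(d.urls) &&
    d.urls.every(u => typeof u === "object" && u !== null && typeof (u as { url?: unknown }).url === "string")
  );
}
```

A check like this turns a stale or truncated `decay.json` into an explicit error instead of a blank table.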
The app/dist/ folder is fully static. Two pieces ship: the bundled HTML/JS/CSS and decay.json (which is read at runtime).
| Target | How |
|---|---|
| Netlify | drag-and-drop dist/ into a new site; or netlify deploy --dir=dist --prod |
| GitHub Pages | push dist/ to a gh-pages branch; or use a workflow |
| S3 / CloudFront | aws s3 sync dist/ s3://your-bucket/ |
| Local file server | cd dist && python3 -m http.server 8080 |
Note: opening dist/index.html directly via file:// will not work; modern browsers block ES module loading and fetch() from the local file protocol. Any static HTTP server suffices, including one-line tools like python3 -m http.server or npx serve.
Error: No rows parsed from CSV
The CSV has no valid rows. Check:
- The header row is exactly `Date,URL,Impressions,Clicks,CTR,Avg Rank` (case-sensitive).
- The file uses comma delimiters, not tabs or semicolons.
- The file isn't empty or only headers.
Error: Not enough full months in dataset to analyze.
The CSV's date range falls within a single calendar month after the partial-month trim. Re-export a longer date range (the tool needs at least one complete calendar month).
The app loads but the table is empty.
Either (a) decay.json is empty (check the analysis log; did 0 URLs qualify?) or (b) the "Declining only" toggle is hiding everything. Toggle it off to verify.
Lots of URLs but they all have low R² and small slopes.
Your dataset may be too short (e.g., 6 months). The regression has fewer data points to work with, so noise dominates. Either wait for more months of data or raise --min-impressions to filter to high-volume URLs where the signal is cleaner.
decay.json: 404 in the dev server.
The analysis hasn't been run yet, or the output path doesn't match app/public/decay.json. Re-run npm run analyze from analysis/.
The dev server compiles but the chart is blank.
A URL has been selected but its monthlySeries is empty or its regression fields are null. Click a different row.
Build fails with a TypeScript error.
The vue-tsc step runs in npm run build. Common cause is a stale type cache; try rm -rf app/node_modules/.vite and rebuild.
- Linear-only model. URLs with non-linear decay shapes (S-curves, step changes from algorithm updates, J-curves from launches) are summarized by a straight line. R² will reflect the poor fit, but the slope is still computed. Don't trust slopes from URLs with R² < 0.3.
- No seasonality awareness. A page that legitimately spikes every Q4 will register low R² and may be excluded when sorting for "steady decline" even if the page is healthy. This is by design, but worth keeping in mind when triaging seasonal content.
- No SERP feature attribution. If a URL lost impressions because Google added an AI Overview, a featured snippet, or a knowledge panel above it, this tool sees the impression drop but cannot tell you why. Pair findings with manual SERP inspection.
- Window-bound. A URL that decayed before your CSV's date range looks flat to this tool. The analysis can only see what's in the export, so pull a longer date range from GSC for older sites.
- Aggregates across countries / devices / search types. The CSV format the script consumes does not segment by country or device, so the regression is on the URL's total traffic. For multilingual sites that's usually fine; for sites where one country drove everything, the slope conflates "country X's impressions" with "the URL itself."
- No causation. The tool reports correlation with time, not causes of decay. Use it as a triage layer, not a verdict.
- Click-side analysis is sparser. With the default 100-click filter, many URLs qualify for impressions but not clicks. That's expected for low-CTR pages and isn't a bug.
- GSC reporting quirks. Daily GSC data is subject to privacy filtering (queries below a threshold aren't reported), late-arriving data (counts may shift days after the fact), and query batching, all of which add noise the regression cannot distinguish from real signal. Treat slopes as directionally meaningful, not statistically precise.
- Not a content audit tool. It surfaces decay candidates; it doesn't recommend specific edits, link rebuilds, or topic clusters.
- Not a forecasting tool. The slope tells you what was happening, not what will happen next month. Don't extrapolate beyond the analyzed window.
- Not a ranking analyzer. It doesn't pull SERPs, doesn't track positions, doesn't tell you who outranked you.
- Not a replacement for GSC's UI. Use it after GSC tells you "site impressions are down" to find which URLs to focus on, not before.
- Not a real-time monitor. It runs on demand against a CSV. There is no scheduled job, no alerts.
- Not BigQuery-aware. Some sites pipe GSC into BigQuery for richer slicing. Adapting this script to read a BQ extract is straightforward but isn't built in.
The tool follows six principles, in roughly decreasing order of importance:
1. Static-first. Everything runs in the browser against a JSON file. No backend means no auth, no hosting bills, no security surface, no operations toil. The build artifact is portable enough to email if needed.
2. Two halves, one contract. The analysis script and the app communicate only through `decay.json`. Either side can be rewritten without touching the other: port the script to Python, swap the Vue app for plain HTML, etc. A formal TypeScript type defines the contract.
3. Deterministic. Same CSV + same flags = identical JSON. No random sampling, no dependency on external APIs at analysis time, no time-of-day effects. Re-runs are reproducible across machines and across months.
4. One number for ranking, three for understanding. The table sorts on a single number (decline % by default), but every row exposes slope, R², and lifetime so the user can see why a URL ranked where it did. Black-box composite scores hide judgment; this exposes the inputs.
5. Honest math over flattering math. The 100% decline cap is honest about a URL going to zero, and the R² column is honest about how well the line actually fits. Neither inflates findings to make the tool look more useful than it is.
6. No novelty for its own sake. Linear regression is over 200 years old. There are sexier methods (changepoint detection, Bayesian structural time series) that would surface marginally more nuance. They would also be harder to explain to a stakeholder when a URL surfaces unexpectedly. Boring tools age well.
| Layer | Choice | Why |
|---|---|---|
| Analysis | Node + TypeScript | Same language as the app, so one toolchain to install. Native streaming CSV with csv-parse, well-typed regression with simple-statistics. Pandas would be more idiomatic in Python but introduces a second runtime. |
| CSV parsing | csv-parse | Streaming, handles quoting/escaping correctly, well-maintained. Battery-included alternatives like papaparse are heavier and aimed at the browser. |
| Regression | simple-statistics | Tiny, dependency-free, returns slope/intercept directly. No need for a full stats framework. |
| App framework | Vue 3 + Vite | Simple reactivity model for a single-state app, fast HMR, small bundle. React + Next.js would be overkill for one page; Svelte was equally viable. |
| Charts | Chart.js + vue-chartjs | Dual-axis support, good defaults, ~50 KB gzipped. ECharts is more powerful but heavier; Recharts is React-only; D3 would require building everything from primitives. |
| Virtual scrolling | vue-virtual-scroller | Handles thousands of rows at 60fps without pagination. Native CSS content-visibility would be simpler but less reliable across browsers. |
| Type safety | strict TypeScript on both sides | The decay.json shape is shared via type definitions, so renaming a field at the script side surfaces compiler errors in the app. |
| Styling | Plain CSS with custom properties | No Tailwind, no UI kit. The app is small enough that a few hundred lines of hand-written CSS is more readable than a class soup. CSS variables make brand re-skinning a one-file change. |
| Type checking on build | vue-tsc | The npm run build script runs vue-tsc --noEmit before Vite, so type errors block the build. |
The app's visual language is editorial: light mode, generous whitespace, hairline dividers, and a single warm gold accent. The intent is for the dashboard to feel like a printed report rather than a SaaS console.
| Token | Value | Used for |
|---|---|---|
| `--bg` | `#ffffff` | Page background |
| `--surface` | `#fafbfc` | Subtle off-white panels (controls bar) |
| `--surface-2` | `#f5f6f8` | Hover row background |
| `--ink` | `#121212` | Headings, primary text, active button background |
| `--ink-soft` | `#495057` | Secondary text |
| `--ink-muted` | `#6c757d` | Metadata, table column headers |
| `--gold` | `#b58c54` | Brand accent: focus rings, active sort, eyebrow rules, chart impressions line, selected-row tint |
| `--decline` | `#b91c1c` | Negative slope / decline % cells |
| `--growth` | `#15803d` | Positive slope / growth cells |
| `--border` | `#ededed` | Hairline dividers |
| `--chart-impressions` | `#b58c54` | Impressions line on the chart |
| `--chart-clicks` | `#1f2937` | Clicks line on the chart |
Typography: Manrope from Google Fonts at weights 400, 500, 600, 700, 800. Manrope is a humanist geometric sans with sturdy lowercase letterforms, readable at small sizes and clean at display sizes. Tabular numerics (font-variant-numeric: tabular-nums) are enabled across the table for vertical digit alignment.
Hierarchy:
- All-caps with `0.18em` letter-spacing for category labels (eyebrows, axis titles)
- 22px / 700 weight for the H1 with a `−0.02em` letter-spacing tightening
- 13px / 400 weight for body (Manrope's normal weight); interactive elements (search, toggle, buttons, cells) bump to 500
- 10px / 600 weight for table column headers (uppercase, `0.14em` letter-spacing)
Density: the fixed elements above the table body (header ~80 px, controls bar ~64 px, chart 300 px, table head ~68 px) total roughly 515 px, leaving ~385 px for the table body on a 900 px viewport. At 38 px per row that's about 10 visible rows before scrolling. Raising the chart height eats into table density; reducing below ~220 px starts crowding the chart legend and axis labels.
All colors are CSS variables defined at the top of app/src/style.css:
```css
:root {
  --bg: #ffffff;
  --ink: #121212;
  --gold: #b58c54; /* primary accent */
  --decline: #b91c1c;
  --growth: #15803d;
  --chart-impressions: #b58c54;
  --chart-clicks: #1f2937;
  /* ... */
}
```

Change those values and both the table and chart pick up the new palette. The chart reads its colors from these variables at runtime via `getComputedStyle()`, so `style.css` is the single source of truth.
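A sketch of how runtime token resolution might look. The app uses `getComputedStyle()`, but the helper below is illustrative (not copied from the repo); the lookup is passed in as a function so the logic also runs outside a browser.

```typescript
// Hypothetical helper: resolve a set of CSS custom properties to their values.
// The `read` callback abstracts getComputedStyle so the function is testable.
function resolveTokens(
  read: (name: string) => string,
  names: string[]
): Record<string, string> {
  const out: Record<string, string> = {};
  // getPropertyValue often returns padded strings, so trim before use
  for (const name of names) out[name] = read(name).trim();
  return out;
}

// Assumed browser usage:
//   const css = getComputedStyle(document.documentElement);
//   const colors = resolveTokens(
//     (n) => css.getPropertyValue(n),
//     ["--chart-impressions", "--chart-clicks"]
//   );
```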
- Add a key to `SortKey` in `app/src/types.ts`.
- Add a case to `getSortVal` in `App.vue` that returns the value for the new column.
- Add a header entry to the `headers` array in `DecayTable.vue` and a matching `<div class="cell">` in the row template.
- Update the `cols` grid template (in `DecayTable.vue`) to include a width for the new column, and update the spans on the group header row in `DecayTable.vue` if the column belongs to one of the existing groups.
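The first two steps can be sketched as follows. The real `SortKey` union and `getSortVal` live in `app/src/types.ts` and `App.vue`; the shapes here (the `Row` interface, the existing keys, and a hypothetical new `"ctr"` key) are assumptions for illustration only.

```typescript
// Assumed shapes, not the repo's actual types.
type SortKey = "url" | "slopeImpressions" | "declinePctImpressions" | "ctr";

interface Row {
  url: string;
  clicks: number;
  impressions: number;
  slopeImpressions: number;
  declinePctImpressions: number;
}

// Mirrors the getSortVal pattern: return one comparable value per sort key.
function getSortVal(row: Row, key: SortKey): number | string {
  switch (key) {
    case "ctr": // the new column: derived, so computed on the fly
      return row.impressions > 0 ? row.clicks / row.impressions : 0;
    case "slopeImpressions":
      return row.slopeImpressions;
    case "declinePctImpressions":
      return row.declinePctImpressions;
    default:
      return row.url;
  }
}
```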
The 100% display cap lives in two places:
- Analysis: `regress()` in `analysis/analyze.ts`, which caps `declinePct` so it can't exceed 100% even for URLs whose model projects below zero.
- Display: `formatPct()` in `app/src/format.ts`, which guards against any value > 100 coming from old JSONs.
Removing the cap (not recommended) means URLs that decayed to zero will show declines like 144%, which is mathematically what the regression line predicts but is misleading on the surface.
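A minimal sketch of the capping logic described above; the names mirror the prose, not necessarily the exact code in `analysis/analyze.ts` or `app/src/format.ts`.

```typescript
// Decline % with the 100% cap: equivalent to clamping the modeled end at zero.
function declinePct(modeledStart: number, modeledEnd: number): number {
  if (modeledStart <= 0) return 0; // no meaningful decline from a non-positive start
  // raw can exceed 100 when the regression line projects below zero
  const raw = ((modeledStart - modeledEnd) / modeledStart) * 100;
  return Math.min(raw, 100);
}
```

With the cap removed, a URL whose fitted line runs from 100 down to −44 would display 144% instead of 100%.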
In `app/src/style.css`:

```css
.chart-wrap {
  height: 300px;
}
```

Increasing this reduces the number of table rows visible without scrolling. Decreasing it (below ~220px) starts crowding the legend and axis labels.
- Anchor month. The first month a URL had any impressions or clicks within the dataset's full-month range. The URL's regression series starts here.
- Capped decline. The decline % displayed in the table, with a ceiling of 100% to prevent regression-projection artifacts.
- Decline %. `(modeled start − max(0, modeled end)) / modeled start × 100`, computed from the regression's intercept and final fitted point.
- Full month. A calendar month for which the dataset contains every day. Months at the dataset's boundaries that are partial are excluded from analysis.
- Lifetime. Sum of impressions or clicks across the URL's analyzed window (anchor month through last full month). Not the URL's all-time total.
- Linear regression. Least-squares fit of a straight line `y = m·x + b` through the URL's monthly traffic points, where x is the month index from anchor.
- Qualified URL. A URL that meets the volume and history thresholds for at least one of impressions or clicks. Only qualified URLs appear in `decay.json`.
- R² (coefficient of determination). Proportion of monthly variance explained by the fitted line. 1.0 = perfect fit, 0.0 = no linear trend.
- Regression line. The line of best fit produced by the linear regression. Visualized in the chart as a dashed overlay.
- Slope. Rate of change of the regression line, in units of impressions (or clicks) per month. Negative slope = decay.
- Steady decline. Negative slope with high R² (typically > 0.5). The opposite of a sudden cliff.
- Sudden drop / cliff. Negative slope with low R², driven by a single steep change rather than a gradual trend.
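To make the slope, intercept, and R² definitions concrete, here is a minimal least-squares fit in the spirit of the analysis script. This is an illustrative sketch, not the actual `regress()` from `analysis/analyze.ts`.

```typescript
// y holds one value per full month; x is the month index from the anchor month.
function fitLine(y: number[]): { slope: number; intercept: number; r2: number } {
  const n = y.length;
  if (n < 2) throw new Error("need at least two monthly points");
  const xs = y.map((_, i) => i);
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, sst = 0;
  for (let i = 0; i < n; i++) {
    sxy += (xs[i] - meanX) * (y[i] - meanY);
    sxx += (xs[i] - meanX) ** 2;
    sst += (y[i] - meanY) ** 2; // total variance around the mean
  }
  const slope = sxy / sxx; // impressions (or clicks) per month
  const intercept = meanY - slope * meanX;
  let sse = 0; // residual sum of squares around the fitted line
  for (let i = 0; i < n; i++) sse += (y[i] - (intercept + slope * xs[i])) ** 2;
  const r2 = sst === 0 ? 0 : 1 - sse / sst; // a flat series has no variance to explain
  return { slope, intercept, r2 };
}
```

For a perfectly steady decline like `[10, 8, 6, 4]`, this yields slope −2, intercept 10, and R² of 1.0.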
Can I run this on Bing Webmaster Tools data?
Yes, if you can produce a CSV with the same column names (Date, URL, Impressions, Clicks, CTR, Avg Rank). The script doesn't care which search engine generated it.
Can I use a date column other than Date (e.g. date, Day)?
Not without editing the script. Column names are part of the contract; rename your CSV columns or change the four reads in analyze.ts.
Why are some URLs missing from the table?
They didn't meet the volume or history thresholds. Default filters are ≥1,000 lifetime impressions, ≥6 months of data. Lower the thresholds with CLI flags to include more URLs, or check the analysis log for the qualified count.
A URL has been steadily declining for 18 months but only shows 6 months of data. Why?
The CSV's date range determines the analysis window. The tool can't see beyond what you exported. Pull a longer range from GSC (12+ months recommended).
The decline % is 100% but the URL still gets ~50 impressions/month. Is the math wrong?
The 100% reflects the regression line's projection (it predicts approximately zero), not the actual current value. Click the URL in the table to see the chart; the regression line is the dashed one, the actual data is the solid one.
Should I trust a URL with R² of 0.15 and a steep negative slope?
No. Low R² means the line doesn't represent the data well. The slope is computed but it's not telling you anything about a "trend." Sort by R² descending to find URLs whose declines are actually trend-shaped.
Can I compare two CSVs (e.g., before/after a content refresh)?
Not as a side-by-side view in the app. Run the analysis on each separately and compare the JSONs manually. Multi-dataset comparison would be a different tool.
Does this work for sites with millions of URLs?
The analysis script handles ~400k rows in seconds. The app renders the table virtually, so URL count itself isn't the bottleneck. The constraint is the JSON: with hundreds of thousands of qualifying URLs the file gets large (10 MB+) and in-browser sort/filter starts to feel sluggish. For sites that big, raise --min-impressions to 5,000+ to focus on URLs that matter.
Why not show 95% confidence intervals or p-values?
Adding statistical inference would suggest precision the data doesn't really support, given GSC's known reporting biases (privacy filtering, late-arriving data, query batching). The slope and R² are diagnostic, not inferential.
Can I run this in CI?
Yes. The script is a pure CLI command. A scheduled workflow that pulls a fresh CSV and runs npm run analyze will produce a new decay.json deterministically. Combine with npm run build to publish a fresh static site.
Is GSC data accurate enough for this kind of analysis?
GSC data has known caveats: privacy filtering, late updates, query batching. Month over month the noise is roughly stable, which is what trend detection actually needs. Structural reporting changes (such as GSC's 2023 reprocessing) are a different problem; if you suspect one inside your date range, trim it.
MIT. See LICENSE.