fix: replace manual anomalies with a Hampel filter #1997
matthewp wants to merge 3 commits into npmx-dev:main
Conversation
📝 Walkthrough
This PR replaces blocklist-based anomaly correction with a Hampel-filter implementation (adding `applyHampelCorrection`) and removes the `DOWNLOAD_ANOMALIES` dataset and related blocklist functions. TrendsChart and WeeklyDownloadStats now call the Hampel correction. Per-package anomaly UI, detailed anomaly tooltips, and related state were removed or simplified. i18n keys for known anomaly ranges were deleted across locales and the schema. Tests were updated to exercise the Hampel-based correction.
Actionable comments posted: 3
📒 Files selected for processing (5)
- app/components/Package/TrendsChart.vue
- app/components/Package/WeeklyDownloadStats.vue
- app/utils/download-anomalies.data.ts
- app/utils/download-anomalies.ts
- test/unit/app/utils/download-anomalies.spec.ts
💤 Files with no reviewable changes (1)
- app/utils/download-anomalies.data.ts
```ts
// halfWindow controls how many neighbors on each side to consider.
// A window of 3 means we look at 7 points total (3 left + current + 3 right).
const halfWindow = opts?.halfWindow ?? DEFAULT_HALF_WINDOW
```
Do not score boundary samples with truncated windows.
Line 33 says a halfWindow of 3 uses 3 neighbours on each side, but Lines 50-52 clamp the first and last samples to shorter windows after Line 42 only validates the overall series length. That makes edge points easy false positives: 100,100,100,100,100,100,200 gets its last point flattened back to 100 because there is no right-hand context and windowMad falls to 0. Skip indices that cannot form a full symmetric window, or handle boundaries explicitly.
Suggested fix

```diff
-  for (let i = 0; i < values.length; i++) {
-    // Build a sliding window around the current point, clamped to array bounds.
-    const start = Math.max(0, i - halfWindow)
-    const end = Math.min(values.length - 1, i + halfWindow)
+  for (let i = halfWindow; i < values.length - halfWindow; i++) {
+    // Only evaluate points that have a full symmetric window.
+    const start = i - halfWindow
+    const end = i + halfWindow
     const window = values.slice(start, end + 1)
```

Also applies to: 41-42, 48-52
```ts
// MAD of 0 means most values in the window are identical.
// If this point differs from the median at all, it's an outlier.
if (windowMad === 0) {
  if (deviation > 0) {
    result[i]!.value = Math.round(windowMedian)
    result[i]!.hasAnomaly = true
  }
  continue
```
The zero-MAD branch will erase real low-volume traffic.
When windowMad === 0, Line 68 treats any deviation from the local median as an anomaly. For sparse packages, a legitimate series like 0,0,0,1,0,0,0 is rewritten to all zeros on Line 69, which drops the only real activity in that period. Please gate this path behind an absolute/relative floor, or leave zero-MAD windows untouched.
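One shape such a guard could take (a sketch only; `correctZeroMad` and its `minDeviation` floor are hypothetical names, not part of this PR):

```typescript
// Sketch: gate the zero-MAD branch behind an absolute deviation floor.
// `minDeviation` is a hypothetical option, not part of the PR's code.
function correctZeroMad(
  value: number,
  windowMedian: number,
  minDeviation = 10,
): { value: number; hasAnomaly: boolean } {
  const deviation = Math.abs(value - windowMedian)
  // In a flat window (MAD === 0), only flag points whose deviation
  // exceeds an absolute floor, so sparse series like 0,0,0,1,0,0,0
  // keep their single week of real activity.
  if (deviation > minDeviation) {
    return { value: Math.round(windowMedian), hasAnomaly: true }
  }
  return { value, hasAnomaly: false }
}

console.log(correctZeroMad(1, 0))      // small blip in a flat window: kept
console.log(correctZeroMad(5000, 100)) // large spike over a flat window: flattened
```

A relative floor (e.g. a multiple of the window median) would work the same way; the point is that "any deviation at all" is too aggressive when the MAD collapses to zero.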
```diff
 describe('applyHampelCorrection', () => {
   it('flags and corrects a spike in the middle of steady data', () => {
     const data: WeeklyDataPoint[] = [
       makeWeeklyPoint('2022-11-07', 100),
       makeWeeklyPoint('2022-11-14', 100),
       makeWeeklyPoint('2022-11-21', 100),
       makeWeeklyPoint('2022-11-28', 1000), // spike
       makeWeeklyPoint('2022-12-05', 100),
       makeWeeklyPoint('2022-12-12', 100),
       makeWeeklyPoint('2022-12-19', 100),
     ]

     const result = applyHampelCorrection(data) as WeeklyDataPoint[]

     // The spike should be corrected
     expect(result[3]!.hasAnomaly).toBe(true)
     expect(result[3]!.value).toBe(100) // replaced with median

     // Non-spike points should be unchanged
     expect(result[0]!.value).toBe(100)
     expect(result[0]!.hasAnomaly).toBeUndefined()
     expect(result[1]!.value).toBe(100)
     expect(result[6]!.value).toBe(100)
   })

   it('does not flag gradual growth as anomalies', () => {
     const data: WeeklyDataPoint[] = [
       makeWeeklyPoint('2022-11-07', 100),
       makeWeeklyPoint('2022-11-14', 110),
       makeWeeklyPoint('2022-11-21', 120),
       makeWeeklyPoint('2022-11-28', 130),
       makeWeeklyPoint('2022-12-05', 140),
       makeWeeklyPoint('2022-12-12', 150),
       makeWeeklyPoint('2022-12-19', 160),
     ]

     const result = applyHampelCorrection(data) as WeeklyDataPoint[]

     for (const point of result) {
       expect(point.hasAnomaly).toBeUndefined()
     }
   })

   it('returns data unchanged when too few points for the window', () => {
     const data: WeeklyDataPoint[] = [
       makeWeeklyPoint('2022-11-07', 100),
       makeWeeklyPoint('2022-11-14', 1000),
       makeWeeklyPoint('2022-11-21', 100),
     ]

     const result = applyHampelCorrection(data) as WeeklyDataPoint[]
     expect(result[1]!.value).toBe(1000) // not enough data to detect
   })

   it('does not mutate the original data', () => {
     const data: WeeklyDataPoint[] = [
-      { value: 100, weekKey: '2022-11-07_2022-11-13', weekStart: '2022-11-07', weekEnd: '2022-11-13', timestampStart: 0, timestampEnd: 0 },
-      { value: 999, weekKey: '2022-11-14_2022-11-20', weekStart: '2022-11-14', weekEnd: '2022-11-20', timestampStart: 0, timestampEnd: 0 },
-      { value: 999, weekKey: '2022-11-21_2022-11-27', weekStart: '2022-11-21', weekEnd: '2022-11-27', timestampStart: 0, timestampEnd: 0 },
-      { value: 999, weekKey: '2022-11-28_2022-12-04', weekStart: '2022-11-28', weekEnd: '2022-12-04', timestampStart: 0, timestampEnd: 0 },
-      { value: 200, weekKey: '2022-12-05_2022-12-11', weekStart: '2022-12-05', weekEnd: '2022-12-11', timestampStart: 0, timestampEnd: 0 },
+      makeWeeklyPoint('2022-11-07', 100),
+      makeWeeklyPoint('2022-11-14', 100),
+      makeWeeklyPoint('2022-11-21', 100),
+      makeWeeklyPoint('2022-11-28', 1000),
+      makeWeeklyPoint('2022-12-05', 100),
+      makeWeeklyPoint('2022-12-12', 100),
+      makeWeeklyPoint('2022-12-19', 100),
     ]

-    expect(
-      applyBlocklistCorrection({
-        data,
-        packageName: 'svelte',
-        granularity: 'weekly',
-      }),
-    ).toEqual([
-      data[0],
-      { ...data[1], value: 125, hasAnomaly: true },
-      { ...data[2], value: 150, hasAnomaly: true },
-      { ...data[3], value: 175, hasAnomaly: true },
-      data[4],
-    ])
+    applyHampelCorrection(data)
+    expect(data[3]!.value).toBe(1000) // original unchanged
   })
```
Please add regression tests for the boundary and sparse-series cases.
The suite covers a centred large spike and gradual growth, but not the two easiest false-positive cases in this implementation: the first/last halfWindow indices, and flat low-volume series such as 0,0,0,1,0,0,0. Please pin both behaviours down here so the filter does not silently flatten real data at the edges or on sparse packages.
As per coding guidelines "**/*.{test,spec}.{ts,tsx}: Write unit tests for core functionality using vitest".
🧹 Nitpick comments (1)
app/components/Package/TrendsChart.vue (1)
1831-1838: Consider using `v-model` for simpler checkbox binding.

The explicit `:checked` + `@change` pattern is functionally correct, but `v-model` provides equivalent behaviour with less boilerplate.

♻️ Optional simplification

```diff
 <input
-  :checked="settings.chartFilter.anomaliesFixed"
-  @change="
-    settings.chartFilter.anomaliesFixed = ($event.target as HTMLInputElement).checked
-  "
+  v-model="settings.chartFilter.anomaliesFixed"
   type="checkbox"
   class="accent-[var(--accent-color,var(--fg-subtle))]"
 />
```
📒 Files selected for processing (17)
- app/components/Package/TrendsChart.vue
- i18n/locales/bg-BG.json
- i18n/locales/cs-CZ.json
- i18n/locales/de-DE.json
- i18n/locales/en.json
- i18n/locales/es.json
- i18n/locales/fr-FR.json
- i18n/locales/hu-HU.json
- i18n/locales/id-ID.json
- i18n/locales/ja-JP.json
- i18n/locales/pl-PL.json
- i18n/locales/ru-RU.json
- i18n/locales/tr-TR.json
- i18n/locales/uk-UA.json
- i18n/locales/zh-CN.json
- i18n/locales/zh-TW.json
- i18n/schema.json
💤 Files with no reviewable changes (16)
- i18n/locales/en.json
- i18n/locales/bg-BG.json
- i18n/locales/es.json
- i18n/locales/de-DE.json
- i18n/locales/zh-TW.json
- i18n/locales/fr-FR.json
- i18n/locales/tr-TR.json
- i18n/locales/cs-CZ.json
- i18n/locales/hu-HU.json
- i18n/schema.json
- i18n/locales/zh-CN.json
- i18n/locales/uk-UA.json
- i18n/locales/ru-RU.json
- i18n/locales/ja-JP.json
- i18n/locales/pl-PL.json
- i18n/locales/id-ID.json
this looks very promising! tagging @jycouet who may have some thoughts on this 🙏
Great to have a second look into this. You probably checked my initial PR, #1636: I started with a Hampel implementation 👍

If I remember correctly, the sweet spot was 2.5 or 3 for vite. But this setting would hide the "great start" of a lib (e.g. 0 0 0 0 0 0 20000).

Another note: when the start or end of the chart falls inside an anomaly period, no algorithm can fix it. A good example: starting the chart in the middle of the vite spike.

Maybe we should add some tests around all this? (to keep the intent)

Another note 2: I would love to have more than just vite there today! 😅
Thanks for this! A few thoughts after pulling the branch:

Reproduction URL that matters:

"Great start" test cases: Packages like

PR suggestion

Could we expose the Hampel filter as its own (separate from manual correction), with tweakable halfWindow / threshold sliders? That way we can compare manual vs Hampel side by side and find the right balance before committing to one approach?

// Happy coding
I don't follow, as its own what?
Yes I can do that.

🔗 Linked issue
Previous issue: #1707
🧭 Context
The current implementation of anomaly removal is biased because the anomaly list is curated manually.
This PR replaces it with an implementation that uses a Hampel filter to remove deviations automatically.
📚 Description
The important part here is `applyHampelCorrection`. It goes over each data point and builds a sliding window of neighbouring points, by default 3 on each side.
It takes the median of those points, since a spike can't pull the median off centre.
It then measures how spread out the neighbours are by computing the MAD (median absolute deviation; see here).
Finally, each point gets a score, and if the score is above a threshold, the point is replaced with the median of its window and marked as an anomaly.
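Those steps can be sketched on raw numbers (a minimal illustration of the approach described above, not the PR's actual code, which operates on `WeeklyDataPoint` objects; the `halfWindow` and `threshold` names mirror the description):

```typescript
// Median of a numeric array (copy-and-sort so the input is not mutated).
function median(xs: number[]): number {
  const s = [...xs].sort((a, b) => a - b)
  const mid = Math.floor(s.length / 2)
  return s.length % 2 ? s[mid]! : (s[mid - 1]! + s[mid]!) / 2
}

// Minimal Hampel filter: slide a window over the series, score each point
// by its deviation from the window median in MAD units, and replace
// outliers with that median.
function hampel(values: number[], halfWindow = 3, threshold = 3): number[] {
  const K = 1.4826 // scales MAD to a standard-deviation estimate for normal data
  return values.map((v, i) => {
    // Window clamped to the array bounds, as in the PR.
    const start = Math.max(0, i - halfWindow)
    const end = Math.min(values.length - 1, i + halfWindow)
    const window = values.slice(start, end + 1)
    const m = median(window)
    const mad = median(window.map(x => Math.abs(x - m)))
    if (mad === 0) return v // flat window: left untouched in this sketch
    return Math.abs(v - m) / (K * mad) > threshold ? m : v
  })
}

console.log(hampel([100, 102, 98, 1000, 101, 99, 100]))
// → [100, 102, 98, 100, 101, 99, 100]  (the spike becomes the window median)
```

Note the slight jitter in the example input: in a perfectly flat window the MAD collapses to 0, which is exactly the case the PR handles with its separate zero-MAD branch.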
Here are two articles about how Hampel filters work:
Here's a screenshot of Vite which still shows its spike removed after this change.