feat: add own measure implementation#120

Open: m-muecke wants to merge 29 commits into main from measures

@m-muecke (Member) commented Apr 11, 2026

  • Replace fpc hard dependency with native implementations of clustering measures
  • Add new measures: clust.avg_between, clust.avg_within, clust.davies_bouldin, clust.dunn2, clust.entropy,
    clust.pearsongamma, clust.wb_ratio
  • Return NaN for measures undefined at k < 2, following the mlr3measures convention
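
To illustrate the last point, here is a minimal Python sketch of a Davies-Bouldin index that returns NaN when fewer than two clusters are present. It is not the code from this PR: the function name `davies_bouldin`, the use of mean Euclidean distance to the centroid as the scatter term, and the handling of coincident centroids are all assumptions made for the example.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index (lower is better); NaN if fewer than 2 clusters.

    Hypothetical helper for illustration only, not the package's implementation.
    """
    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    # The index compares pairs of clusters, so it is undefined for k < 2.
    if len(clusters) < 2:
        return float("nan")

    # Centroid and mean distance to centroid (scatter) for each cluster.
    centroids = {
        lab: [sum(coord) / len(pts) for coord in zip(*pts)]
        for lab, pts in clusters.items()
    }
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in pts) / len(pts)
        for lab, pts in clusters.items()
    }

    def ratio(i, j):
        # Assumption: coincident centroids give an infinite ratio.
        d = math.dist(centroids[i], centroids[j])
        return math.inf if d == 0 else (scatter[i] + scatter[j]) / d

    # Average, over clusters, of the worst similarity ratio to any other cluster.
    worst = [max(ratio(i, j) for j in clusters if j != i) for i in clusters]
    return sum(worst) / len(worst)


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(davies_bouldin(points, [0, 0, 1, 1]))  # two well-separated clusters
print(davies_bouldin(points, [0, 0, 0, 0]))  # single cluster: nan
```

Returning NaN rather than raising an error lets a benchmark or tuning loop continue when a learner happens to produce a single cluster.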


Copilot AI left a comment

Pull request overview

This PR replaces the prior reliance on fpc::cluster.stats() with native implementations of internal clustering quality measures, adds several additional internal measures, and updates documentation/tests accordingly to reduce hard dependencies.

Changes:

  • Implement native cluster quality statistics (WSS, CH, Dunn, etc.) and wire them into a new MeasureClustInternal measure implementation.
  • Add new internal measures (dunn2, wb_ratio, entropy, pearsongamma, davies_bouldin, avg_between, avg_within) and register them in mlr3.
  • Update docs, pkgdown reference index, NEWS, and add tests comparing results against fpc when available.
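
As a rough illustration of what two of the statistics named above compute, the following Python sketch derives the within-cluster sum of squares (WSS) and the Calinski-Harabasz (CH) index from the textbook definitions. The function name `wss_and_ch` and the edge-case handling are assumptions for this example; the PR's own code in R/cluster_stats.R may differ.

```python
import math


def _centroid(pts):
    # Coordinate-wise mean of a list of points.
    return [sum(coord) / len(pts) for coord in zip(*pts)]


def wss_and_ch(points, labels):
    """Return (WSS, CH) for a hard clustering.

    Hypothetical helper for illustration only.
    CH = (BSS / (k - 1)) / (WSS / (n - k)); NaN when k < 2 or k >= n.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    n, k = len(points), len(clusters)
    overall = _centroid(points)
    centroids = {lab: _centroid(pts) for lab, pts in clusters.items()}

    # Within-cluster sum of squared distances to each cluster centroid.
    wss = sum(
        math.dist(p, centroids[lab]) ** 2
        for lab, pts in clusters.items()
        for p in pts
    )
    # Between-cluster sum of squares, weighted by cluster size.
    bss = sum(
        len(pts) * math.dist(centroids[lab], overall) ** 2
        for lab, pts in clusters.items()
    )

    if k < 2 or k >= n:
        ch = float("nan")  # degrees of freedom vanish, CH is undefined
    elif wss == 0:
        ch = math.inf  # assumption: perfectly tight clusters score infinity
    else:
        ch = (bss / (k - 1)) / (wss / (n - k))
    return wss, ch


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(wss_and_ch(points, [0, 0, 1, 1]))
```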

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| tests/testthat/test_cluster_stats.R | Adds tests comparing new native implementations to fpc::cluster.stats() plus a known-value test for Davies–Bouldin. |
| R/zzz.R | Switches measure registration from the old FPC-based measure to MeasureClustInternal and registers new measures. |
| R/measures.R | Refactors measure metadata to store a scoring function + input type instead of an fpc criterion string. |
| R/MeasureClustInternal.R | Introduces MeasureClustInternal, updates silhouette scoring, and adds roxygen docs for all internal measures. |
| R/cluster_stats.R | New native implementations of internal clustering statistics used by measures and tests. |
| R/bibentries.R | Adds bibliography entries used in the expanded measure documentation. |
| pkgdown/_pkgdown.yml | Adds a structured reference index for learners/measures/tasks/general topics. |
| NEWS.md | Documents new measures and the removal of fpc as a hard dependency. |
| NAMESPACE | Removes importFrom(fpc, cluster.stats). |
| man/mlr_measures_clust.wss.Rd | Updates WSS documentation and removes fpc dependency mention. |
| man/mlr_measures_clust.wb_ratio.Rd | Adds documentation for the new clust.wb_ratio measure. |
| man/mlr_measures_clust.silhouette.Rd | Expands silhouette documentation and adds references/seealso updates. |
| man/mlr_measures_clust.pearsongamma.Rd | Adds documentation for the new clust.pearsongamma measure. |
| man/mlr_measures_clust.entropy.Rd | Adds documentation for the new clust.entropy measure. |
| man/mlr_measures_clust.dunn2.Rd | Adds documentation for the new clust.dunn2 measure. |
| man/mlr_measures_clust.dunn.Rd | Updates Dunn documentation and removes fpc dependency mention. |
| man/mlr_measures_clust.davies_bouldin.Rd | Adds documentation for the new clust.davies_bouldin measure. |
| man/mlr_measures_clust.ch.Rd | Updates CH documentation and removes fpc dependency mention. |
| man/mlr_measures_clust.avg_within.Rd | Adds documentation for the new clust.avg_within measure. |
| man/mlr_measures_clust.avg_between.Rd | Adds documentation for the new clust.avg_between measure. |
| man-roxygen/measure_internal.R | Removes the old boilerplate description referencing fpc::cluster.stats(). |
| DESCRIPTION | Moves fpc from Imports to Suggests and updates Collate to include cluster_stats.R. |



Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 6 comments.



Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 6 comments.


@m-muecke requested a review from be-marc on April 12, 2026, 14:13

2 participants