Added custom eval metric feature by ozguraslank · Pull Request #84 · ozguraslank/flexml

ozguraslank · 2026-01-03T14:46:50Z

Resolved #83

Copilot

Pull request overview

This PR adds a custom evaluation metric feature to the flexml library, allowing users to define and use their own scoring functions for model evaluation and tuning. The implementation introduces a new CustomScore class to wrap custom metric functions and integrates this functionality across the supervised learning pipeline.

Key changes:

- New CustomScore class to validate and wrap custom metric functions, with support for probability vs. label predictions and maximize/minimize optimization directions
- Extended start_experiment, tune_model, and related methods to accept callable custom metrics alongside standard string-based metrics
- Comprehensive test suite covering regression, binary classification, multiclass classification, and tuning scenarios
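
For readers who have not opened the diff, the wrapping idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CustomScore implementation from this PR: the class name `MetricWrapper`, its fields (`greater_is_better`, `needs_proba`), and the `signed` helper are all hypothetical.

```python
import inspect
from dataclasses import dataclass
from typing import Callable

_POSITIONAL = (
    inspect.Parameter.POSITIONAL_ONLY,
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
)


@dataclass(frozen=True)
class MetricWrapper:
    """Hypothetical stand-in for a custom-metric wrapper (not flexml's CustomScore)."""

    func: Callable[..., float]
    name: str
    greater_is_better: bool = True   # assumed flag: maximize vs. minimize
    needs_proba: bool = False        # assumed flag: probabilities vs. labels

    def __post_init__(self) -> None:
        # Validate up front so a bad metric fails before any model is trained.
        if not callable(self.func):
            raise TypeError("func must be callable")
        if not isinstance(self.name, str) or not self.name.strip():
            raise ValueError("name must be a non-empty string")
        positional = [
            p for p in inspect.signature(self.func).parameters.values()
            if p.kind in _POSITIONAL
        ]
        if len(positional) < 2:
            raise ValueError("func must accept (y_true, y_pred)")

    def __call__(self, y_true, y_pred) -> float:
        # Raw metric value, exactly as the user's function reports it.
        return float(self.func(y_true, y_pred))

    def signed(self, y_true, y_pred) -> float:
        # Orient the value so that larger is always better, which is the
        # convention tuning code usually expects from a scorer.
        value = self(y_true, y_pred)
        return value if self.greater_is_better else -value


def mean_abs_error(y_true, y_pred):
    """Example user-defined metric: lower is better."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


mae = MetricWrapper(mean_abs_error, name="MAE", greater_is_better=False)
```

Negating minimized metrics in one place means search code can always maximize, at the cost of having to flip the sign back when reporting results.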

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 20 comments.

Show a summary per file

File	Description
tests/test_custom_metrics.py	Comprehensive test suite with 9 test cases covering custom metrics for regression, classification (binary/multiclass), tuning (GridSearch, RandomizedSearch, Optuna), and error handling
flexml/structures/custom_score.py	New class to wrap and validate custom scoring functions with parameter validation, sklearn scorer integration, and support for probability/label-based metrics
flexml/structures/supervised_base.py	Updates to handle custom metrics in experiment workflow including parameter handling, model selection logic for minimize vs maximize, and leaderboard sorting
flexml/helpers/validators.py	Type hint updates to support CustomScore objects in addition to string-based metrics
flexml/helpers/supervised_helpers.py	Enhanced evaluate_model_perf to compute custom metrics alongside standard metrics with proper handling of probabilities vs labels
flexml/_model_tuner.py	Support for custom metrics in all tuning methods (grid search, randomized search, Optuna) with proper scorer integration
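
The two behaviours the summary attributes to supervised_helpers.py and supervised_base.py (choosing probabilities vs. labels as metric input, and sorting the leaderboard by optimization direction) might look like the sketch below. It is only an illustration: `Metric`, `evaluate`, and `rank_models` are hypothetical names, and the data is made up for the example rather than taken from the PR.

```python
from typing import Callable, NamedTuple, Sequence


class Metric(NamedTuple):
    """Hypothetical metric description; field names are illustrative only."""

    func: Callable[[Sequence[int], Sequence[float]], float]
    greater_is_better: bool
    needs_proba: bool


def evaluate(metric, y_true, labels, probabilities):
    """Feed the metric probabilities or hard labels, depending on its flag."""
    y_pred = probabilities if metric.needs_proba else labels
    return metric.func(y_true, y_pred)


def rank_models(scores, greater_is_better):
    """Return (model, score) pairs ordered best-first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=greater_is_better)


def brier(y_true, y_prob):
    """Brier score for binary targets: mean squared error of probabilities."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


brier_metric = Metric(brier, greater_is_better=False, needs_proba=True)

# Made-up predictions: (hard labels, positive-class probabilities) per model.
y_true = [1, 0, 1, 1]
predictions = {
    "model_a": ([1, 0, 1, 0], [0.9, 0.2, 0.8, 0.4]),
    "model_b": ([1, 1, 1, 1], [0.6, 0.6, 0.7, 0.9]),
}

scores = {
    name: evaluate(brier_metric, y_true, labels, probs)
    for name, (labels, probs) in predictions.items()
}
leaderboard = rank_models(scores, brier_metric.greater_is_better)
```

Because the Brier score is minimized, the leaderboard is sorted ascending; a maximized metric such as accuracy would pass `greater_is_better=True` and sort descending.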

Added custom eval metric feature

738a445

ozguraslank requested a review from Copilot January 3, 2026 14:46

ozguraslank self-assigned this Jan 3, 2026

ozguraslank added the enhancement New feature or request label Jan 3, 2026

ozguraslank added this to the 1.1.1 milestone Jan 3, 2026

Copilot started reviewing on behalf of ozguraslank January 3, 2026 14:47

Copilot AI reviewed Jan 3, 2026

ozguraslank added 7 commits January 3, 2026 17:56

Added name validation to custom score

6ca26e3

Misspelling fixes

c8a0639

Error fix when show_model_stats() or get_best_models() are called before start_experiment()

44c37f9

Improved if logic at evaluate_model_perf(), custom_score calculation

89f5e44

Removed unused import

6de04e1

Improved custom score class

55bd7bc

Passed custom score name as an argument in tests

2abe8a4

ozguraslank merged commit af8a4b7 into main Jan 3, 2026
4 checks passed

ozguraslank merged 8 commits into main from custom_eval_score

2 participants
