Add reusable model evaluation utilities with comprehensive tests by kamalahasiniburra · Pull Request #210 · ML4SCI/DeepLense

kamalahasiniburra · 2026-03-30T15:52:02Z

Summary

This PR adds a reusable evaluation utilities module (evaluation_utils.py) and a comprehensive test suite (test_evaluation_utils.py) for the DeepLense project.

What's included

evaluation_utils.py

- compute_classification_metrics(): Computes accuracy, per-class precision/recall/F1, and macro-averaged metrics for gravitational lens classification
- compute_confusion_matrix(): Builds confusion matrix without requiring sklearn dependency
- compute_regression_metrics(): Computes MSE, RMSE, MAE, and R-squared for lens parameter estimation tasks
- compute_roc_auc_ovr(): One-vs-rest ROC AUC computation for multi-class classification
- format_metrics_table(): Formats metrics dictionary as a readable table string
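For readers who want a feel for what sklearn-free classification metrics involve, here is a minimal pure-Python sketch. It is not the code in this PR: the function names `confusion_matrix_sketch` and `classification_metrics_sketch`, their signatures, the integer-label assumption, and the returned dictionary layout are all hypothetical, chosen only for illustration.

```python
def confusion_matrix_sketch(y_true, y_pred, num_classes):
    """Hypothetical helper: rows are true labels, columns are predicted labels.

    Assumes labels are integers in the range [0, num_classes).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def classification_metrics_sketch(y_true, y_pred, num_classes):
    """Hypothetical helper: accuracy, per-class and macro precision/recall/F1."""
    cm = confusion_matrix_sketch(y_true, y_pred, num_classes)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(num_classes))

    per_class = []
    for k in range(num_classes):
        tp = cm[k][k]
        predicted_k = sum(cm[r][k] for r in range(num_classes))  # column sum
        actual_k = sum(cm[k])                                    # row sum
        # Assumption: undefined ratios (zero denominator) are reported as 0.0.
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})

    # Macro average: unweighted mean of the per-class values.
    macro = {
        name: sum(c[name] for c in per_class) / num_classes
        for name in ("precision", "recall", "f1")
    }
    return {
        "accuracy": correct / total if total else 0.0,
        "per_class": per_class,
        "macro": macro,
    }


if __name__ == "__main__":
    metrics = classification_metrics_sketch([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
    print(metrics["accuracy"], metrics["macro"])
```

Reporting 0.0 for a class that is never predicted is one common convention; whether the module in this PR does the same is not stated in the description.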

test_evaluation_utils.py

- 8 comprehensive tests covering all functions including edge cases
- Tests for perfect predictions, partial misclassifications, confusion matrix correctness, regression accuracy, and input validation
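The kinds of checks listed above could look roughly like the sketch below, shown here for the regression metrics. This is an illustration only, not the test file from this PR: `regression_metrics_sketch`, its return keys, and the choice to raise `ValueError` on mismatched or empty inputs are assumptions, and the tests use plain `assert` so they run without any test framework.

```python
import math


def regression_metrics_sketch(y_true, y_pred):
    """Hypothetical helper: MSE, RMSE, MAE, and R-squared for two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        # Assumption: R-squared is reported as NaN when the targets are constant.
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
    }


def test_perfect_predictions():
    m = regression_metrics_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
    assert m["mse"] == 0.0
    assert m["r2"] == 1.0


def test_known_errors():
    # Predicting the mean everywhere: errors are -1, 0, 1, so R-squared is 0.
    m = regression_metrics_sketch([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
    assert math.isclose(m["mse"], 2 / 3)
    assert math.isclose(m["mae"], 2 / 3)
    assert math.isclose(m["r2"], 0.0, abs_tol=1e-12)


def test_length_mismatch_raises():
    try:
        regression_metrics_sketch([1.0], [1.0, 2.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for mismatched lengths")


if __name__ == "__main__":
    test_perfect_predictions()
    test_known_errors()
    test_length_mismatch_raises()
    print("all sketch tests passed")
```

Function names starting with `test_` would also be collected by pytest, if that is the runner the project uses.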

Why this is useful

Currently, each DeepLense sub-project implements its own evaluation logic. This shared module provides standardized metrics computation that can be imported across different sub-projects, ensuring consistent evaluation and reducing code duplication.

All 8 tests pass locally.

- evaluation_utils.py: Classification metrics, confusion matrix, regression metrics, ROC AUC computation, and formatting helpers for DeepLense models
- test_evaluation_utils.py: Comprehensive test suite (8 tests) covering all evaluation functions including edge cases

kamalahasiniburra wants to merge 1 commit into ML4SCI:main from kamalahasiniburra:add-model-evaluation-utils
