Add reusable model evaluation utilities with comprehensive tests #210

Open
kamalahasiniburra wants to merge 1 commit into ML4SCI:main from kamalahasiniburra:add-model-evaluation-utils

@kamalahasiniburra

Summary

This PR adds a reusable evaluation utilities module (evaluation_utils.py) and a comprehensive test suite (test_evaluation_utils.py) for the DeepLense project.

What's included

evaluation_utils.py

  • compute_classification_metrics(): Computes accuracy, per-class precision/recall/F1, and macro-averaged metrics for gravitational lens classification
  • compute_confusion_matrix(): Builds the confusion matrix without requiring an sklearn dependency
  • compute_regression_metrics(): Computes MSE, RMSE, MAE, and R-squared for lens parameter estimation tasks
  • compute_roc_auc_ovr(): Computes one-vs-rest ROC AUC for multi-class classification
  • format_metrics_table(): Formats a metrics dictionary as a readable table string
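
For reviewers who want a feel for the classification side, here is a minimal NumPy-only sketch of how a confusion matrix and the derived per-class and macro metrics can be computed. It is not the code in this PR: the function names (`confusion_matrix_sketch`, `classification_metrics_sketch`), the `num_classes` argument, the returned dictionary keys, and the choice to score a class as 0.0 when its denominator is zero are all assumptions made for illustration.

```python
import numpy as np


def confusion_matrix_sketch(y_true, y_pred, num_classes):
    """Rows are true labels, columns are predicted labels (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    if y_true.size == 0:
        raise ValueError("inputs must not be empty")
    labels = np.concatenate([y_true, y_pred])
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError("labels must lie in [0, num_classes)")
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly when the same (row, col) pair repeats.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def classification_metrics_sketch(y_true, y_pred, num_classes):
    """Accuracy, per-class precision/recall/F1, and macro F1 from the matrix."""
    cm = confusion_matrix_sketch(y_true, y_pred, num_classes)
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0).astype(float)  # how often each class was predicted
    actual = cm.sum(axis=1).astype(float)     # how often each class truly occurs
    # Assumption: a class with a zero denominator gets a score of 0.0.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return {
        "accuracy": float(tp.sum() / cm.sum()),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "macro_f1": float(f1.mean()),
    }


metrics = classification_metrics_sketch([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(metrics["accuracy"], metrics["macro_f1"])
```

Deriving every metric from the one matrix keeps the per-class and macro numbers consistent with each other, which is the main reason to centralize this logic.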

test_evaluation_utils.py

  • 8 comprehensive tests covering all functions including edge cases
  • Tests for perfect predictions, partial misclassifications, confusion matrix correctness, regression accuracy, and input validation
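
To illustrate the kind of checks listed above, here is a small self-contained sketch of a regression metrics function with three pytest-style tests (perfect predictions, a hand-computed case, and input validation). It is not the test suite in this PR: `regression_metrics_sketch`, its dictionary keys, and the decision to return NaN for R-squared when the targets are constant are assumptions made for illustration.

```python
import math

import numpy as np


def regression_metrics_sketch(y_true, y_pred):
    """MSE, RMSE, MAE, and R-squared (hypothetical stand-in for the PR's function)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    if y_true.size == 0:
        raise ValueError("inputs must not be empty")
    residuals = y_true - y_pred
    mse = float(np.mean(residuals ** 2))
    mae = float(np.mean(np.abs(residuals)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    # Assumption: R-squared is undefined for constant targets, so return NaN.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


def test_perfect_predictions():
    m = regression_metrics_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
    assert m["mse"] == 0.0
    assert m["r2"] == 1.0


def test_known_errors():
    # Residuals are -1, 0, 1, so MSE = MAE = 2/3; predicting the mean gives R^2 = 0.
    m = regression_metrics_sketch([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
    assert math.isclose(m["mse"], 2 / 3)
    assert math.isclose(m["mae"], 2 / 3)
    assert math.isclose(m["r2"], 0.0, abs_tol=1e-12)


def test_shape_mismatch_raises():
    try:
        regression_metrics_sketch([1.0, 2.0], [1.0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for mismatched shapes")
```

The tests use plain `assert` statements and the `test_` naming convention, so pytest would collect them without further setup.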

Why this is useful

Currently, each DeepLense sub-project implements its own evaluation logic. This shared module provides standardized metrics computation that can be imported across different sub-projects, ensuring consistent evaluation and reducing code duplication.

All 8 tests pass locally.

- evaluation_utils.py: Classification metrics, confusion matrix, regression
  metrics, ROC AUC computation, and formatting helpers for DeepLense models
- test_evaluation_utils.py: Comprehensive test suite (8 tests) covering all
  evaluation functions including edge cases
