fix: use list comprehension instead of set in PerClassScorer.get_metric by Chessing234 · Pull Request #583 · allenai/scispacy

Chessing234 · 2026-05-06T10:19:15Z

Bug

PerClassScorer.get_metric computes overall precision/recall/F1 by summing the true-positive, false-positive, and false-negative counts across all entity types (excluding "untyped"). The three sum() calls use set comprehensions ({v for k, v in ...}), which silently deduplicate equal values before summation.

Root cause

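{3, 3} evaluates to {3}, so if two entity types have the same count, one is dropped. For example, if labels A and B each have 3 true positives, sum({3, 3}) returns 3 instead of 6. The summed counts are therefore too low whenever two or more entity types share a nonzero count value, which skews the overall metrics: undercounted true positives lower precision and recall, while undercounted false positives inflate precision and undercounted false negatives inflate recall.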
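The deduplication can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical `true_positives` dict that stands in for `self._true_positives`; the label names and counts are made up for illustration.

```python
# Hypothetical per-label counts; "A" and "B" happen to share the same value.
true_positives = {"A": 3, "B": 3, "untyped": 6}

# Set comprehension: the two 3s collapse into a single element.
as_set = {v for k, v in true_positives.items() if k != "untyped"}

# List comprehension: one element per label, duplicates preserved.
as_list = [v for k, v in true_positives.items() if k != "untyped"]

buggy_total = sum(as_set)
correct_total = sum(as_list)
print(buggy_total, correct_total)  # 3 6
```

No error or warning is raised in either case, which is why the undercount goes unnoticed.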

Fix

Change the three set comprehensions to list comprehensions. Lists preserve all values, so sum([3, 3]) correctly returns 6.

# Before
sum_true_positives = sum(
    {v for k, v in self._true_positives.items() if k != "untyped"}
)

# After
sum_true_positives = sum(
    [v for k, v in self._true_positives.items() if k != "untyped"]
)

The fix is identical for sum_false_positives and sum_false_negatives.
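To see how the collapsed sums propagate into the reported numbers, here is a rough sketch of a micro-averaged computation. The function `overall_metrics`, its `dedupe` flag, and the sample counts are all hypothetical; the sketch uses a plain zero-division guard and does not attempt to reproduce the library's own metric helper.

```python
from typing import Dict, Tuple


def overall_metrics(
    tp: Dict[str, int],
    fp: Dict[str, int],
    fn: Dict[str, int],
    dedupe: bool = False,
) -> Tuple[float, float, float]:
    """Precision, recall and F1 over all labels except "untyped".

    dedupe=True reproduces the set-based summation for comparison only.
    """
    container = set if dedupe else list

    def total(counts: Dict[str, int]) -> int:
        # Sum per-label counts, skipping the "untyped" bucket.
        return sum(container(v for k, v in counts.items() if k != "untyped"))

    tp_sum, fp_sum, fn_sum = total(tp), total(fp), total(fn)
    precision = tp_sum / (tp_sum + fp_sum) if tp_sum + fp_sum else 0.0
    recall = tp_sum / (tp_sum + fn_sum) if tp_sum + fn_sum else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical counts: only the false-positive values collide (2 and 2).
tp = {"A": 4, "B": 2}
fp = {"A": 2, "B": 2}
fn = {"A": 1, "B": 3}

correct = overall_metrics(tp, fp, fn)             # sums: TP=6, FP=4, FN=4
buggy = overall_metrics(tp, fp, fn, dedupe=True)  # sums: TP=6, FP=2, FN=4
print(correct, buggy)
```

With these counts the list-based sums give a precision of 0.6, while the set-based sums give 0.75, because the false-positive total drops from 4 to 2. Which metric is affected, and in which direction, depends on which of the three counters contains colliding values.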

Set comprehensions passed to sum() silently deduplicate equal values. If two entity types have the same true-positive (or FP/FN) count, the set collapses them to one element before summation, producing incorrect overall precision/recall/F1.

Co-Authored-By: Chessing234 <takshkothari09@gmail.com>

Chessing234 wants to merge 1 commit into allenai:main from Chessing234:fix/per-class-scorer-set-to-list
