Feat/allow int candidates#370

Open
graceg571 wants to merge 10 commits into
mggg:3.4.1from
graceg571:feat/allow-int-candidates

@graceg571 graceg571 commented Jun 1, 2026

Summary

Addresses #359

This PR allows both integer and string candidates as input to rank and score ballots. A ranking or score can now include both "1" and 1; they are treated as separate candidates, and a warning is raised to tell the user so.

RankProfile and ScoreProfile store an internal df of the ballots using candidate integer IDs instead of candidate names. This gives a uniform internal representation regardless of candidate type, which will be helpful for our eventual move to Rust. When the df attribute is accessed, the internal df is translated back to candidate names and cached for future access.

Why

Scottish election data labels candidates with integers and stores a mapping from integer IDs to candidate names. Likewise, it is common in the math literature to represent candidates with integers. This change aligns better with both, and lets users input candidates to ballots without casting everything to strings.

We could have cast all integers to strings or enforced no mixing of candidate types, but decided against it: election result candidate names would be misaligned with the ballots the user created, and with the internal df there is no need to constrain candidate types across an entire ballot or preference profile.
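
To illustrate the misalignment mentioned above, here is a small sketch comparing the two designs on a toy first-choice tally. Both function names are hypothetical and exist only for this comparison; neither is part of VoteKit.

```python
from collections import Counter


def tally_with_cast(first_choices):
    # Hypothetical rejected design: cast every candidate to str.
    return Counter(str(c) for c in first_choices)


def tally_as_given(first_choices):
    # Design described in this PR: keep each candidate's original type.
    return Counter(first_choices)


votes = [1, 1, 2, "Chris"]
cast = tally_with_cast(votes)
kept = tally_as_given(votes)

print(1 in cast)   # False: the user voted for 1, but the result only knows "1"
print(1 in kept)   # True
```

With casting, a user who built ballots with the integer 1 would have to look up "1" in the results, which is the mismatch this PR avoids.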

Changes

  • types.py: added Candidate: TypeAlias = str | int and a CandidateFloatDictLike alias. All type annotations across the codebase are updated to use Candidate.
  • ballot.py: RankBallot and ScoreBallot now accept Candidate in rankings and scores. A warning is raised when string and integer candidates collide (e.g. "1" and 1); the two are treated as separate candidates.
  • pref_profile.py: RankProfile and ScoreProfile maintain an internal _df with the candidate integer IDs as the values or column names respectively. The public df property translates back to candidate names and is cached. _candidates and _candidates_cast store the candidate integer IDs so that the internal _df can be searched. candidate_id_map maps candidate names to their integer IDs, and id_candidate_map maps candidate integer IDs to their names. Together these maps allow translation both ways.
  • ballot generators, elections, utils, graphs, plots, etc. have been updated to accept Candidate types as input and propagate Candidate types to internal functions

Testing

  • test_ScoreBallot.py/test_RankBallot.py: new tests for int and mixed str/int candidates at the ballot level, including collision warning
  • test_rank_pp.py/test_score_pp.py: new tests for int and mixed str/int candidate profiles
  • test_rank_pp_df.py/test_score_pp_df.py: new tests verifying internal _df uses integer IDs and the public df translates back to candidate names of mixed types
  • test_mixed_candidates.py: fuzz tests running Borda, STV, and Plurality elections and all utils.py functions over profiles with mixed str/int candidates at varying ballot counts (10, 1000, 10000)
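
For readers unfamiliar with this style of test, the fuzz approach can be sketched roughly as follows. This is a toy version under stated assumptions: random_mixed_ballots and plurality_winners are hypothetical helpers written for this illustration, and the real tests exercise VoteKit's own election classes instead.

```python
import random
from collections import Counter


def random_mixed_ballots(candidates, n_ballots, seed=0):
    """Generate full random rankings over a mixed str/int candidate pool."""
    rng = random.Random(seed)
    return [tuple(rng.sample(candidates, k=len(candidates))) for _ in range(n_ballots)]


def plurality_winners(ballots):
    """Toy plurality: the candidates with the most first-place votes."""
    firsts = Counter(b[0] for b in ballots)
    top = max(firsts.values())
    return {c for c, n in firsts.items() if n == top}


pool = ["A", 1, "1", 2]
for n in (10, 1000, 10000):
    ballots = random_mixed_ballots(pool, n)
    winners = plurality_winners(ballots)
    # Winners must come back with their original types, never recast.
    assert winners <= set(pool)
    assert all(set(b) == set(pool) for b in ballots)
```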

@graceg571 graceg571 marked this pull request as draft June 1, 2026 14:56
@graceg571 graceg571 self-assigned this Jun 1, 2026
@graceg571 graceg571 force-pushed the feat/allow-int-candidates branch from c7d7185 to ac74d1d Compare June 12, 2026 15:47
@graceg571 graceg571 changed the base branch from main to 3.4.1 June 12, 2026 15:49
@graceg571 graceg571 requested a review from peterrrock2 June 22, 2026 12:46
@graceg571 graceg571 marked this pull request as ready for review June 22, 2026 14:27

@peterrrock2 peterrrock2 left a comment

Collaborator

  • Consistency note: pick one wording, either "Candidate can be a str or int" or "Candidates can be strings, integers, or a mix of both"
  • Fix roundtrip on CSV with mixed types
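
The CSV roundtrip problem can be reproduced with the standard library alone, since CSV cells carry no type information. The sketch below shows the loss and one possible workaround; the "i:" / "s:" tagging scheme and the helper names are assumptions made for illustration, not the fix adopted in this PR.

```python
import csv
import io


def write_rows(rows):
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def read_rows(text):
    return [tuple(row) for row in csv.reader(io.StringIO(text))]


original = [(1, "1", "A")]

# Naive roundtrip: the integer 1 comes back as the string "1".
naive = read_rows(write_rows(original))


def encode(cand):
    # Hypothetical scheme: prefix each cell with its type.
    return f"i:{cand}" if isinstance(cand, int) else f"s:{cand}"


def decode(cell):
    tag, _, value = cell.partition(":")
    return int(value) if tag == "i" else value


tagged = read_rows(write_rows([tuple(encode(c) for c in row) for row in original]))
restored = [tuple(decode(c) for c in row) for row in tagged]
```

Here naive no longer distinguishes the two candidates, while restored matches original.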


Args:
- scores (dict[str, float]): Mutable score dict, modified in place.
+ scores (dict[str | int, float]): Mutable score dict, modified in place.

Candidates?

Test this with a mixture of candidate types

Comment on lines +196 to +205
cand_subset: list[Candidate]
| tuple[Candidate]
| set[Candidate]
| list[str]
| set[str]
| list[int]
| set[int],
pairwise_dict: dict[tuple[Candidate, Candidate], tuple[float, float]]
| dict[tuple[str, str], tuple[float, float]]
| dict[tuple[int, int], tuple[float, float]],

Sequence, maybe?

Comment on lines 33 to 46
def comention_above(i: Candidate, j: Candidate, ballot: RankBallot) -> bool:
"""
Takes candidates i,j and returns True if i >= j in the ranking.
Requires that the ballot has a ranking.


Args:
- i (str): Candidate name.
- j (str): Candidate name.
+ i (Candidate): Candidate name.
+ Candidates can be strings, integers, or mix of both.
+ j (Candidate): Candidate name.
+ Candidates can be strings, integers, or mix of both.
ballot (RankBallot): RankBallot.

Returns:

While we are here, maybe rename these?


Returns:
- dict[str, dict[str, float]]: Data dictionary for ``multi_bar_plot``.
+ dict[str, dict[str | int, float]]: Data dictionary for ``multi_bar_plot``.

Candidate?

Comment thread src/votekit/utils.py Outdated

Returns:
- dict[str, float]:
+ dict[str | int, float]:

Candidate?

Comment on lines +388 to +393
def df(self) -> pd.DataFrame:
"""
Compute the dataframe as a cached property.
The dataframe is internally stored with candidate ids.
The dataframe will be translated to original candidate names.
"""

Docstrings are a part of the Public API. Keep internal details in # NOTE comments.
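
One way the split could look, as a sketch only: the docstring states what a caller gets, and the storage detail moves into a # NOTE comment. DocstringSketch and its table property are hypothetical and use a list of tuples in place of a DataFrame.

```python
from functools import cached_property


class DocstringSketch:
    """Hypothetical class used only to show the docstring/comment split."""

    def __init__(self, id_rows, id_candidate_map):
        self._id_rows = id_rows
        self._id_candidate_map = id_candidate_map

    @cached_property
    def table(self):
        """Ballots of the profile, expressed with candidate names."""
        # NOTE: rows are stored internally as candidate IDs and translated
        # back to names on first access; the translated result is cached.
        return [tuple(self._id_candidate_map[i] for i in row) for row in self._id_rows]


sketch = DocstringSketch([(0, 1)], {0: "A", 1: 1})
print(sketch.table)  # [('A', 1)]
```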

Comment on lines +1111 to +1113
Compute the dataframe as a cached property.
The dataframe is internally stored with candidate ids.
The dataframe will be translated to original candidate names.

Internal vs external

Why do we need that extra set? Investigate, please.

Check collisions. Also add checks for instantiating from df.

…imultaneous veto and ranked pairs to mixed cand testing
Development

Successfully merging this pull request may close these issues.

Request to allow candidates to be integers

2 participants