fix: disambiguate two distinct "Jake Anderson" lifters in US data #516

Open
jakemanderson wants to merge 1 commit into euanwm:development from jakemanderson:fix/disambiguate-jake-anderson

@jakemanderson

Contributor

Summary

The "Jake Anderson" record in US event data conflates two distinct lifters:

  • Minnesota high-school-era lifter — 2013-12 → 2015-03, 7 meets, body weight 96–108 kg, totals 208–255
  • +109 kg lifter — 2022-04 → 2026-03, 12 meets, body weight 122–137 kg, totals 255–320

7-year gap, ~25 kg body-weight delta, no overlap. Almost certainly different people.
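The gap argument can be checked mechanically. The following is a minimal sketch, not code from this repository: `split_by_gap` and the 5-year threshold are hypothetical, and the sample dates use placeholder days because the summary only gives month-level ranges.

```python
from datetime import date


def split_by_gap(meet_dates, max_gap_years=5.0):
    """Group meet dates into runs separated by gaps longer than max_gap_years."""
    groups = []
    for d in sorted(meet_dates):
        # Compare against the most recent date in the current run.
        if groups and (d - groups[-1][-1]).days <= max_gap_years * 365.25:
            groups[-1].append(d)
        else:
            groups.append([d])
    return groups


# Placeholder days; only the year and month come from the summary above.
dates = [date(2013, 12, 1), date(2015, 3, 1), date(2022, 4, 1), date(2026, 3, 1)]
groups = split_by_gap(dates)
print(len(groups))
```

A gap check alone cannot prove two lifters are different people (someone may simply return after a long break), which is why the body-weight delta is cited alongside it.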

This PR renames the older lifter's 7 historical entries to Jake Anderson #1 (OpenPowerlifting convention). The newer lifter's records are unchanged.
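For illustration only, the rename amounts to something like the sketch below. The column names `Name` and `Date`, the cutoff date, and the helper `rename_older_entries` are assumptions made for this example; the actual CSV layout in the repository may differ.

```python
import csv
import io


def rename_older_entries(rows, name, cutoff, new_name):
    """Return copies of rows, renaming `name` to `new_name` where Date < cutoff.

    Dates are assumed to be ISO strings (YYYY-MM-DD), so plain string
    comparison orders them chronologically.
    """
    out = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row["Name"] == name and row["Date"] < cutoff:
            row["Name"] = new_name
        out.append(row)
    return out


# Made-up sample rows, not real meet data.
sample = io.StringIO(
    "Name,Date\n"
    "Jake Anderson,2014-06-01\n"
    "Jake Anderson,2023-05-01\n"
    "Someone Else,2014-06-01\n"
)
rows = list(csv.DictReader(sample))
renamed = rename_older_entries(rows, "Jake Anderson", "2016-01-01", "Jake Anderson #1")
```

Any cutoff inside the gap between the two careers separates the same rows, so the exact date chosen does not matter.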

Notes

  • Disclosure: I am the newer lifter.
  • This is a one-time fix for historical CSVs. New meets scraped for either name will land as plain "Jake Anderson" and re-conflate. A general aliasing layer would be a separate, larger change — happy to file an issue if useful.
  • make check_db passes (run via python3 scripts/check_db.py).

Test plan

  • python3 scripts/check_db.py reports TEST PASSED
  • All 7 older entries now read Jake Anderson #1
  • All 12 newer entries unchanged

🤖 Generated with Claude Code

The lifter 'Jake Anderson' currently merges two distinct people:
a Minnesota high-school-era lifter (2013-12 to 2015-03, 7 meets,
body weight 96-108 kg) and a separate +109 kg lifter (2022-04
onward, 12 meets, body weight 122-137 kg).

This renames the older lifter's 7 historical entries to
'Jake Anderson #1', following the OpenPowerlifting convention.
The newer lifter's records are unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@euanwm

euanwm commented Apr 27, 2026

Owner

Currently in the process of developing a new feature to disambiguate lifters by estimated year of birth (YOB), which I'll likely use here. The #1 lifter has an estimated YOB of around 1998, which is a few years younger than #2.

I'll leave this open for reference and as a fallback.
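As a rough sketch of the general idea (not the feature under development, whose design is not described in this thread), same-name results could be grouped by estimated YOB with a small tolerance. The function name, the tolerance, and the sample values are all hypothetical; only the "around 1998" figure comes from the comment above.

```python
def group_by_estimated_yob(yob_estimates, tolerance=1):
    """Cluster YOB estimates: a value joins the current cluster when it is
    within `tolerance` years of that cluster's latest value."""
    clusters = []
    for yob in sorted(yob_estimates):
        if clusters and yob - clusters[-1][-1] <= tolerance:
            clusters[-1].append(yob)
        else:
            clusters.append([yob])
    return clusters


# Illustrative estimates: one lifter near 1998, another a few years older.
estimates = [1998, 1997, 1994, 1998, 1994]
clusters = group_by_estimated_yob(estimates)
print(clusters)
```

More than one cluster for a single name suggests the record mixes several people.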

@jakemanderson

Contributor Author

Excellent! Yes, I started writing something up since I do lots of name matching and disambiguation for my research, but I didn't want to overstep, and I figured you already had something in progress after I saw how many lifters have unrealistic body weight variance when I plotted the time series. YOB will probably get 99%+.
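A rough illustration of the body weight check mentioned above, assuming results are already grouped per name. The 25 kg threshold and the helper names are made up for this sketch; the "Jake Anderson" values are the range endpoints quoted in the PR summary, and "Example Lifter" is invented.

```python
def bodyweight_spread(bodyweights_kg):
    """Difference between the heaviest and lightest recorded body weight."""
    return max(bodyweights_kg) - min(bodyweights_kg)


def flag_suspect_names(records, max_spread_kg=25.0):
    """Return names whose body weight spread exceeds max_spread_kg, sorted."""
    return sorted(
        name
        for name, weights in records.items()
        if weights and bodyweight_spread(weights) > max_spread_kg
    )


records = {
    "Jake Anderson": [96, 108, 122, 137],  # endpoints of the two ranges in the summary
    "Example Lifter": [80, 82, 85],        # invented control case
}
suspects = flag_suspect_names(records)
print(suspects)
```

A large spread is only a hint, since lifters do change weight classes over a career, so a flag like this is best used to shortlist names for a closer look.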

@euanwm

euanwm commented Apr 28, 2026

Owner

> Excellent! Yes, I started writing something up since I do lots of name matching and disambiguation for my research, but I didn't want to overstep, and I figured you already had something in progress after I saw how many lifters have unrealistic body weight variance when I plotted the time series. YOB will probably get 99%+.

OPL do a lot of manual data correction which is something I simply don't have time to do at scale. I'm slowly going through ways of collating data at the right time without manual correction.

Happy to hear any recommendations you have as well. The data layer is staying open source; only the main UI is closed source, so there isn't an additional attack vector on the infosec side.
