Fix scraping for GitHub's new Primer-React profile shell + fix of hunter.io search issues #68

Open
soxoj wants to merge 2 commits into mxrch:master from soxoj:master
Contributor

@soxoj soxoj commented May 21, 2026

No description provided.

soxoj added 2 commits May 17, 2026 18:46
GitHub now serves a new profile shell to authenticated viewers that
broke three scraping paths in gitfive:

- repos.py: the navigation tab counter moved from
  `<span class="Counter" title="N">` to `<span data-component="counter">`
  with the number nested in text. `gitfive user <name>` raised
  `ValueError: Could not find the repositories counter on the page.`
  Extract the count via a helper that handles both layouts.

- utils.py: `get_commits_count` matched the commits link via
  `…/commits/<branch>$`, but the new shell renders it with a trailing
  slash. The Metamon post-push polling loop spun forever as a result.
  Allow an optional trailing slash and harden number extraction.

- commits.py: the `/commits/<branch>` page no longer uses
  `li.js-commits-list-item` / `a.js-navigation-open` / `img.avatar-user`.
  Parse the embedded `<script type="application/json">` payload instead
  — `payload.commitGroups[].commits[]` gives the commit oid, the
  co-authored email, and the matched GitHub account (login + avatar)
  directly. Pull `currentCommit.oid` from the payload in place of the
  removed `permalink` anchor; pagination via `?after=<head>+<offset>`
  still works.
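
The three fixes above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code in this PR: the helper names (`extract_repo_count`, `commits_link_pattern`, `parse_commits_page`) are hypothetical, and the sketch assumes the embedded JSON wraps its data in a top-level `payload` key with `currentCommit` sitting next to `commitGroups`.

```python
import json
import re

# Hypothetical helpers for illustration only; gitfive's real code may differ.

# Old layout: <span class="Counter" title="N">
OLD_COUNTER = re.compile(
    r'<span[^>]*class="[^"]*\bCounter\b[^"]*"[^>]*title="([\d,]+)"'
)
# New layout: <span data-component="counter"> with the number nested in text
NEW_COUNTER = re.compile(
    r'<span[^>]*data-component="counter"[^>]*>(.*?)</span>', re.S
)
JSON_SCRIPT = re.compile(
    r'<script type="application/json"[^>]*>(.*?)</script>', re.S
)


def extract_repo_count(html: str) -> int:
    """Return the repositories count from either counter layout."""
    match = OLD_COUNTER.search(html)
    if match:
        return int(match.group(1).replace(",", ""))
    match = NEW_COUNTER.search(html)
    if match:
        # Drop any nested tags, then take the first run of digits.
        text = re.sub(r"<[^>]+>", "", match.group(1))
        digits = re.search(r"\d[\d,]*", text)
        if digits:
            return int(digits.group(0).replace(",", ""))
    raise ValueError("Could not find the repositories counter on the page.")


def commits_link_pattern(branch: str) -> re.Pattern:
    """Match .../commits/<branch> with or without a trailing slash."""
    return re.compile(rf"/commits/{re.escape(branch)}/?$")


def parse_commits_page(html: str):
    """Return (head oid, list of commit oids) from the embedded JSON."""
    match = JSON_SCRIPT.search(html)
    if match is None:
        raise ValueError("No embedded JSON payload found.")
    # Assumption: the data lives under a top-level "payload" key.
    payload = json.loads(match.group(1))["payload"]
    oids = [
        commit["oid"]
        for group in payload.get("commitGroups", [])
        for commit in group.get("commits", [])
    ]
    return payload["currentCommit"]["oid"], oids
```

Trying the old selector first keeps logged-out or cached pages working, and the optional trailing slash is what would let a polling loop like the Metamon one terminate under the new shell. Only `oid` is read here because the exact field names for the co-authored email and the matched account are not spelled out in the commit message.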
Contributor Author

soxoj commented May 21, 2026

I will resolve the conflicts later.
