fix: BexleyCouncil - replace Selenium with requests (WasteWorks HTML) #1937
InertiaUK wants to merge 2 commits into robbrad:master
Conversation
📝 Walkthrough
Replaced Selenium-based web scraping with direct `requests` calls against the static WasteWorks results page.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@uk_bin_collection/uk_bin_collection/councils/BexleyCouncil.py`:
- Around lines 27-35: The retry loop fetches the page up to 3 times but then parses whatever response was last received, which can hide transient failures. After the loop (the block that checks for "waste-service-name" in response.text), detect whether the marker was ever found and, if not, raise a clear exception (e.g., RuntimeError or ValueError) that includes the URL and the last response's status code and a text snippet, so callers of the BexleyCouncil parser know the fetch failed instead of silently receiving an empty bins list. Make this change where response and soup are set, so downstream code cannot continue on an unexpected page format.
- Around lines 19-21: The code silently allows kwargs.get("uprn") to be None, which yields a request to "/waste/None". Update the logic in BexleyCouncil where user_uprn is obtained and page is built so the UPRN is validated explicitly: either use kwargs["uprn"] or check that user_uprn is truthy and raise a clear exception (e.g., ValueError with a message like "Missing required 'uprn'") before constructing page = f"https://waste.bexley.gov.uk/waste/{user_uprn}", so callers fail fast on bad input.
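A minimal sketch of that validation; the helper name `build_bexley_url` is hypothetical, while the URL pattern comes from the comment above:

```python
def build_bexley_url(uprn) -> str:
    """Validate the UPRN before building the request URL, failing fast on bad input."""
    if not uprn:
        raise ValueError("Missing required 'uprn' for BexleyCouncil")
    return f"https://waste.bexley.gov.uk/waste/{uprn}"
```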
- Around lines 41-82: The parser currently swallows many failures. Make each unexpected or malformed case raise a clear exception so callers can detect markup changes: raise a ValueError when h3 is missing in a grid (include the grid context), when summary_list is missing, when no "Next collection" dt is found for a service (name the service via service_name_elem or bin_type), or when dd is missing; and when date parsing fails (in remove_ordinal_indicator_from_date_string or datetime.strptime), re-raise a ValueError whose message contains bin_type and the raw next_collection string instead of printing. Update the loop in BexleyCouncil.py that handles grids, h3, summary_list, rows, dt, dd, next_collection, and the date parsing to raise these exceptions with contextual text.
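The date-parsing part of this could look like the following sketch; the `parse_collection_date` helper and the `"%A %d %B %Y"` format string are assumptions, since the actual WasteWorks date format isn't shown here:

```python
import re
from datetime import datetime


def parse_collection_date(bin_type: str, raw: str) -> datetime:
    """Strip ordinal suffixes ('3rd', '21st') and parse; re-raise with context."""
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)", "", raw)
    try:
        # Assumed format, e.g. "Monday 3 February 2025".
        return datetime.strptime(cleaned, "%A %d %B %Y")
    except ValueError as exc:
        # Include the bin type and the raw string so markup/format changes
        # surface as a clear error instead of a silent skip.
        raise ValueError(
            f"Could not parse next collection date for {bin_type!r}: {raw!r}"
        ) from exc
```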
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 509146d0-72a5-4b8f-b008-196ee8d156c8
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/BexleyCouncil.py
Fair points — pushed fixes for all three:
Bexley migrated to WasteWorks, and the new results page is plain static HTML — no JavaScript needed to render the bin schedule.
Dropped the Selenium flow entirely and switched to a requests + BeautifulSoup scrape against the WasteWorks results URL. Much faster and no webdriver dependency for this council.
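A rough sketch of what that flow might look like; the CSS selector and the `fetch_schedule_html`/`parse_services` helpers are assumptions based on the review's mention of the `waste-service-name` marker, not the actual PR code:

```python
import requests
from bs4 import BeautifulSoup


def fetch_schedule_html(uprn: str) -> str:
    # A plain GET is enough now that the results page is static HTML.
    response = requests.get(f"https://waste.bexley.gov.uk/waste/{uprn}", timeout=30)
    response.raise_for_status()
    return response.text


def parse_services(html: str) -> dict:
    """Map each service heading to the text of its following summary list."""
    soup = BeautifulSoup(html, "html.parser")
    services = {}
    for heading in soup.select("h3.waste-service-name"):
        rows = heading.find_next("dl")
        services[heading.get_text(strip=True)] = (
            rows.get_text(" ", strip=True) if rows else ""
        )
    return services
```

Because the parsing step takes plain HTML, it can be exercised against saved fixtures without any webdriver or network access.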
Tested with a real UPRN in Bexley.