fix: LondonBoroughOfRichmondUponThames - rewrite for My Richmond page migration#1948
fix: LondonBoroughOfRichmondUponThames - rewrite for My Richmond page migration#1948InertiaUK wants to merge 1 commit intorobbrad:masterfrom
Conversation
📝 WalkthroughWalkthroughUpdated the Richmond upon Thames council class to adapt HTML parsing for new waste section markup format, refined PID parameter handling to remove external passthrough while deriving from URL query or PAON data, and simplified User-Agent header construction. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py (1)
96-103:⚠️ Potential issue | 🟠 MajorRemove unnecessary broad exception handler in
_pid_from_url.The
try-except Exceptionblock at line 102 is unnecessary—urlparse()andparse_qs()are designed to handle edge cases gracefully and do not raise exceptions. Removing this handler ensures that real errors are not silently hidden and makes the code's intent clearer.Suggested fix
def _pid_from_url(self, url): if not url: return None - try: - q = parse_qs(urlparse(url).query) - return q.get("pid", [None])[0] - except Exception: - return None + q = parse_qs(urlparse(url).query) + return q.get("pid", [None])[0]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py` around lines 96 - 103, Remove the unnecessary broad try/except in the _pid_from_url method: keep the initial guard (if not url: return None), call urlparse(url).query and parse_qs(...) to extract "pid" (q.get("pid", [None])[0]) and return it directly without catching Exception; this avoids swallowing real errors from _pid_from_url while preserving the same None behavior for empty/absent pid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py`:
- Line 13: The module docstring for LondonBoroughOfRichmondUponThames contains
an EN DASH (–) which triggers Ruff RUF002; open the module (symbol:
LondonBoroughOfRichmondUponThames or the top-level docstring) and replace the EN
DASH with a normal hyphen (-) in the string "Richmond upon Thames – parse My
Richmond property page." so it becomes "Richmond upon Thames - parse My Richmond
property page.".
---
Outside diff comments:
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py`:
- Around line 96-103: Remove the unnecessary broad try/except in the
_pid_from_url method: keep the initial guard (if not url: return None), call
urlparse(url).query and parse_qs(...) to extract "pid" (q.get("pid", [None])[0])
and return it directly without catching Exception; this avoids swallowing real
errors from _pid_from_url while preserving the same None behavior for
empty/absent pid.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f8545302-afec-459e-9764-ec6e39267b57
📒 Files selected for processing (1)
uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py
| Richmond upon Thames – parse the static My Property page. | ||
| No Selenium. No BeautifulSoup. Just requests + regex tailored to the current markup. | ||
| """ | ||
| """Richmond upon Thames – parse My Richmond property page.""" |
There was a problem hiding this comment.
Replace EN DASH in docstring to satisfy Ruff RUF002.
Line 13 uses –; switch to - to clear the lint warning.
💡 Suggested patch
- """Richmond upon Thames – parse My Richmond property page."""
+ """Richmond upon Thames - parse My Richmond property page."""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| """Richmond upon Thames – parse My Richmond property page.""" | |
| """Richmond upon Thames - parse My Richmond property page.""" |
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 13-13: Docstring contains ambiguous – (EN DASH). Did you mean - (HYPHEN-MINUS)?
(RUF002)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughOfRichmondUponThames.py`
at line 13, The module docstring for LondonBoroughOfRichmondUponThames contains
an EN DASH (–) which triggers Ruff RUF002; open the module (symbol:
LondonBoroughOfRichmondUponThames or the top-level docstring) and replace the EN
DASH with a normal hyphen (-) in the string "Richmond upon Thames – parse My
Richmond property page." so it becomes "Richmond upon Thames - parse My Richmond
property page.".
Richmond moved their waste collection lookup from the old
/services/waste_and_recycling/collection_days/page to the My Richmond property portal at/my_richmond?pid=UPRN.The old scraper looked for
<a id="my_waste">anchors between section markers. The new page uses<div class="my-item my-waste">with<h4>bin type headings and<ul><li>collection dates.Rewritten
_extract_waste_blockto match the newdiv.my-wastestructure with a fallback to the old anchor format. Simplified the overall scraper — removed unused imports and consolidated helpers.The PID (property ID) is a standard UPRN passed via the house number field. Tested with a real Richmond address.
Summary by CodeRabbit
Bug Fixes
Refactor