fix: add realistic user_agent to all Selenium scrapers missing one #1947

Open
InertiaUK wants to merge 1 commit into robbrad:master from InertiaUK:fix/selenium-user-agent-blanket

Conversation

@InertiaUK InertiaUK commented Apr 11, 2026

Several Selenium-based scrapers were running with Chrome's default headless user agent, which some council websites detect and block (returning 403s, Cloudflare challenges, or empty responses).
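
For context, headless Chrome has historically advertised itself with a `HeadlessChrome` token in its default user agent, which makes this kind of blocking simple for a site to implement. The sketch below shows what such a check might look like. The function name `looks_like_headless_chrome` is hypothetical, the sample strings are illustrative only, and real bot-detection systems rely on many more signals than this.

```python
def looks_like_headless_chrome(user_agent: str) -> bool:
    """Return True if the UA string carries the HeadlessChrome product token."""
    return "HeadlessChrome/" in user_agent


# Illustrative strings only: the first mimics a default headless UA,
# the second is the desktop UA this PR passes to create_webdriver.
default_headless_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) HeadlessChrome/138.0.0.0 Safari/537.36"
)
desktop_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
)
```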

This adds a realistic desktop Chrome user agent string to the create_webdriver call in every Selenium scraper that was missing one. The create_webdriver function already supports the user_agent parameter — most scrapers were just passing None.

63 files changed, all with the same mechanical edit: define a user_agent variable and pass it instead of None. Scrapers that already set a user agent (15 councils) and those with separate open PRs are excluded.
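
The edit in each file has roughly the following shape. This is a sketch rather than a copy of any one scraper: `create_webdriver` here is a stand-in stub that only records its arguments (the real helper starts a browser), its parameter names are assumed from the call sites, and `parse_data_before` / `parse_data_after` are hypothetical names used to show the two states side by side.

```python
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
)


def create_webdriver(web_driver, headless, user_agent, session_name):
    """Stand-in stub (assumption): records its arguments instead of starting a browser."""
    return {
        "web_driver": web_driver,
        "headless": headless,
        "user_agent": user_agent,
        "session_name": session_name,
    }


def parse_data_before(web_driver=None, headless=True):
    # Before: None is passed, so no explicit user agent is set.
    return create_webdriver(web_driver, headless, None, __name__)


def parse_data_after(web_driver=None, headless=True):
    # After: define the user agent locally and pass it through.
    user_agent = USER_AGENT
    return create_webdriver(web_driver, headless, user_agent, __name__)
```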

No functional changes beyond the user agent string.

Summary by CodeRabbit

  • Refactor
    • Updated browser identification across 63 council integrations to use a consistent user-agent string during web requests.
    • Note: Two council modules may require additional review for syntax issues.

coderabbitai bot commented Apr 11, 2026

Walkthrough

This PR updates 63 council scraper files in the UK bin collection module by adding explicit browser user-agent strings to webdriver initialization, replacing the None argument in create_webdriver() calls.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Council Scrapers with User-Agent Updates: uk_bin_collection/uk_bin_collection/councils/ArgyllandButeCouncil.py, AshfordBoroughCouncil.py, BaberghDistrictCouncil.py, BarkingDagenham.py, BlackburnCouncil.py, BlaenauGwentCountyBoroughCouncil.py, BrightonandHoveCityCouncil.py, BroadlandDistrictCouncil.py, BroxtoweBoroughCouncil.py, CalderdaleCouncil.py, CeredigionCountyCouncil.py, ChichesterDistrictCouncil.py, ColchesterCityCouncil.py, CotswoldDistrictCouncil.py, CroydonCouncil.py, DacorumBoroughCouncil.py, EastLindseyDistrictCouncil.py, EastRenfrewshireCouncil.py, EastRidingCouncil.py, EastSuffolkCouncil.py, EpsomandEwellBoroughCouncil.py, ForestOfDeanDistrictCouncil.py, GloucesterCityCouncil.py, GreatYarmouthBoroughCouncil.py, GuildfordCouncil.py, HertsmereBoroughCouncil.py, HighPeakCouncil.py, Hillingdon.py, HyndburnBoroughCouncil.py, KingstonUponThamesCouncil.py, KnowsleyMBCouncil.py, LondonBoroughRedbridge.py, MaidstoneBoroughCouncil.py, MidAndEastAntrimBoroughCouncil.py, MidSuffolkDistrictCouncil.py, MidUlsterDistrictCouncil.py, NorthDevonCountyCouncil.py, NorthEastDerbyshireDistrictCouncil.py, NorthWestLeicestershire.py, PortsmouthCityCouncil.py, PowysCouncil.py, PrestonCityCouncil.py, RugbyBoroughCouncil.py, RushcliffeBoroughCouncil.py, SevenoaksDistrictCouncil.py, StHelensBC.py, StocktonOnTeesCouncil.py, SunderlandCityCouncil.py, TeignbridgeCouncil.py, TendringDistrictCouncil.py, ThreeRiversDistrictCouncil.py, TorbayCouncil.py, WalthamForest.py, WestBerkshireCouncil.py, WestLothianCouncil.py, WestOxfordshireDistrictCouncil.py, WinchesterCityCouncil.py, WirralCouncil.py, WokinghamBoroughCouncil.py, WrexhamCountyBoroughCouncil.py | Define explicit Chrome-based user-agent string and pass it to create_webdriver() in parse_data() methods instead of None. All parsing logic and control flow remain unchanged. |
| Council Scrapers with Potential Indentation Issues: uk_bin_collection/uk_bin_collection/councils/StaffordshireMoorlandsDistrictCouncil.py, ThanetDistrictCouncil.py | Added user-agent string with an indentation misalignment on the following create_webdriver() line, which causes a syntax error when the module is parsed. The indentation needs to be corrected. |
| Council Scrapers with Line Ending Changes: uk_bin_collection/uk_bin_collection/councils/NorthumberlandCouncil.py | Added user-agent string with additional line-ending/formatting modifications (+173/-172 lines), alongside the standard user-agent update. |
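
A diff that touches nearly every line of a file (+173/-172 here) is often a sign of a line-ending conversion rather than real edits, though that is an assumption about this particular file. A minimal sketch for checking it, using a hypothetical helper `count_line_endings`:

```python
from pathlib import Path


def count_line_endings(path: Path) -> dict:
    """Count CRLF and bare LF line endings in a file, reading raw bytes."""
    data = path.read_bytes()
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to get bare LFs only.
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}
```

Running this against the base and head revisions of the file and comparing the counts would show whether the line endings were converted.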

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

The PR applies one homogeneous pattern across 63 files, which keeps per-file review simple. The file count still calls for spot-checking, and the two flagged indentation issues in StaffordshireMoorlandsDistrictCouncil.py and ThanetDistrictCouncil.py need careful attention because they introduce syntax errors that stop those modules from loading.
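
The spot-check can be partly automated. The sketch below parses every `.py` file in a directory with `ast.parse` and reports the ones that fail, which would surface indentation errors like the two flagged here without launching a browser. The helper name `find_syntax_errors` is hypothetical.

```python
import ast
from pathlib import Path


def find_syntax_errors(directory: Path) -> dict:
    """Map each unparsable .py file name to a 'line: message' string."""
    failures = {}
    for path in sorted(directory.glob("*.py")):
        try:
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
            failures[path.name] = f"{exc.lineno}: {exc.msg}"
    return failures
```

Pointing it at `uk_bin_collection/uk_bin_collection/councils` from the repository root and printing the returned mapping would list any file that cannot be parsed.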

Possibly related PRs

  • March 2026 Release #1883: Updates council scraper parse_data methods to pass explicit user-agent into create_webdriver() calls, replacing None arguments with hardcoded Chrome user-agent strings.
  • fix: Council Fix Pack - November 2025 #1679: Modifies multiple council scraper implementations to supply explicit browser user-agent strings to create_webdriver() instead of None, same pattern as this PR.
  • Dec release #1754: Applies user-agent parameter updates to multiple council parser files by modifying create_webdriver() calls in CouncilClass.parse_data methods.

Poem

🐰 Sixty-five councils, now with a face,
Chrome whispers its name through cyberspace.
No more anonymous Selenium creep—
Each scraper now wears its user-agent deep!
🌐✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 7.84%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: adding realistic user agents to Selenium scrapers that were missing them. It directly reflects the primary objective across the 63 modified council scraper files. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (8)
uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py (1)

36-37: Centralize the shared Selenium user-agent constant.

This works, but the same literal is now duplicated across many scrapers. Moving it to a shared constant/helper (e.g., in common) will reduce drift and make future updates one-touch.

♻️ Suggested direction
-            user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
-            driver = create_webdriver(web_driver, headless, user_agent, __name__)
+            driver = create_webdriver(
+                web_driver, headless, DEFAULT_SELENIUM_USER_AGENT, __name__
+            )
# in common module (example)
DEFAULT_SELENIUM_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py` around lines 36
- 37, Extract the duplicated user-agent literal into a shared constant (e.g.,
DEFAULT_SELENIUM_USER_AGENT) in the common module and replace the local
assignment in PowysCouncil.py (the user_agent variable used when calling
create_webdriver) with an import of that constant; update the
create_webdriver(web_driver, headless, user_agent, __name__) call to pass the
imported DEFAULT_SELENIUM_USER_AGENT and add the new constant export in common
so other scrapers can reuse it.
uk_bin_collection/uk_bin_collection/councils/ForestOfDeanDistrictCouncil.py (1)

39-40: Centralize the Selenium user-agent string in shared code.

This change is correct, but duplicating the same hardcoded UA across many scrapers will be brittle to maintain. Please move it to a shared constant/helper (for example in common) and reference that value here.

♻️ Example refactor shape
# uk_bin_collection/uk_bin_collection/common.py
+DEFAULT_SELENIUM_USER_AGENT = (
+    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
+    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+)

# council scraper
- user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
- driver = create_webdriver(web_driver, headless, user_agent, __name__)
+ driver = create_webdriver(web_driver, headless, DEFAULT_SELENIUM_USER_AGENT, __name__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/ForestOfDeanDistrictCouncil.py`
around lines 39 - 40, Replace the hardcoded user_agent string in
ForestOfDeanDistrictCouncil.py by importing a shared constant from the common
module (e.g., DEFAULT_USER_AGENT) and pass that into create_webdriver instead of
the inline user_agent variable; update any references to the local user_agent
variable, add the constant (DEFAULT_USER_AGENT = "Mozilla/5.0 ...") to the
common helper/module, and adjust the import statement in this file to use the
shared constant so all scrapers reference the single source of truth.
uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py (1)

42-43: Centralize the shared Selenium user-agent string.

This works, but the same literal is now duplicated across many councils. Please move it to a single constant (e.g., in uk_bin_collection/uk_bin_collection/common.py) to avoid version drift and bulk edits later.

♻️ Proposed refactor
# uk_bin_collection/uk_bin_collection/common.py
+DEFAULT_SELENIUM_USER_AGENT = (
+    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
+    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+)

# uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py
-            user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+            user_agent = DEFAULT_SELENIUM_USER_AGENT
             driver = create_webdriver(web_driver, headless, user_agent, __name__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py` around
lines 42 - 43, The hard-coded Selenium user-agent literal used when calling
create_webdriver (e.g., the user_agent variable in CalderdaleCouncil.py) is
duplicated across multiple councils; centralize it by adding a single constant
USER_AGENT (or SELENIUM_USER_AGENT) in uk_bin_collection.common and import that
constant in CalderdaleCouncil.py, replacing the local user_agent assignment and
passing USER_AGENT into create_webdriver(web_driver, headless, USER_AGENT,
__name__); apply the same replacement to other council modules that define the
same literal so all callers of create_webdriver use the shared constant.
uk_bin_collection/uk_bin_collection/councils/ChichesterDistrictCouncil.py (1)

27-28: Centralize the shared Selenium user-agent string to avoid drift across councils.

Line 27 duplicates a long literal that is now repeated in many modules. Consider using a single constant (e.g., in common.py) and reusing it here.

♻️ Proposed refactor
-            user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
-            driver = create_webdriver(web_driver, headless, user_agent, __name__)
+            driver = create_webdriver(
+                web_driver, headless, DEFAULT_SELENIUM_USER_AGENT, __name__
+            )
+# uk_bin_collection/uk_bin_collection/common.py
+DEFAULT_SELENIUM_USER_AGENT = (
+    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
+    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/ChichesterDistrictCouncil.py`
around lines 27 - 28, The duplicated Selenium user-agent literal assigned to
user_agent before calling create_webdriver should be replaced with a shared
constant: define a USER_AGENT (or SELENIUM_USER_AGENT) in the module that
centralizes common values (e.g., common.py) and import it here; then remove the
local literal and pass the imported USER_AGENT into create_webdriver in the
ChichesterDistrictCouncil code (referencing the user_agent variable and
create_webdriver function to locate the change).
uk_bin_collection/uk_bin_collection/councils/BroxtoweBoroughCouncil.py (1)

34-35: Consider centralizing this shared user-agent value.

This literal is now duplicated across many scrapers, which makes future updates error-prone. Prefer a single constant/helper in common code and reuse it here.

Refactor sketch
# uk_bin_collection/uk_bin_collection/common.py
+DEFAULT_SELENIUM_USER_AGENT = (
+    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
+    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+)

# uk_bin_collection/uk_bin_collection/councils/BroxtoweBoroughCouncil.py
-            user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+            user_agent = DEFAULT_SELENIUM_USER_AGENT
             driver = create_webdriver(web_driver, headless, user_agent, __name__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/BroxtoweBoroughCouncil.py`
around lines 34 - 35, The duplicated user-agent string used before calling
create_webdriver in BroxtoweBoroughCouncil.py should be centralized: remove the
local literal assignment to user_agent and instead import or reference a shared
constant (e.g., DEFAULT_USER_AGENT) from a common module (e.g., a utils or
settings module) and pass that into create_webdriver(web_driver, headless,
DEFAULT_USER_AGENT, __name__); update any other scrapers to use the same
constant to avoid duplication.
uk_bin_collection/uk_bin_collection/councils/AshfordBoroughCouncil.py (1)

36-37: Centralize the user-agent constant to reduce maintenance churn.

This literal is repeated across many council modules in the PR. Please consider moving it to a shared constant (e.g., in uk_bin_collection/uk_bin_collection/common.py) and reusing it here, so future UA updates happen in one place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/AshfordBoroughCouncil.py` around
lines 36 - 37, Replace the hard-coded user_agent string in
AshfordBoroughCouncil.py by importing a shared constant from the common module:
add a constant like DEFAULT_USER_AGENT (or USER_AGENT) in
uk_bin_collection.common and then use that constant when calling
create_webdriver (replace the local user_agent variable used before the
create_webdriver(web_driver, headless, user_agent, __name__) call). Ensure the
constant name is descriptive and update the import in AshfordBoroughCouncil.py
so the create_webdriver call uses the shared value.
uk_bin_collection/uk_bin_collection/councils/BlaenauGwentCountyBoroughCouncil.py (1)

26-27: Consider moving the UA to one shared constant/default.

This works, but copying the same literal across the blanket change will make the next UA rotation another multi-file edit. If this is now the common Selenium default, set it once in create_webdriver() or a shared constant and only override councils that genuinely need a different value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@uk_bin_collection/uk_bin_collection/councils/BlaenauGwentCountyBoroughCouncil.py`
around lines 26 - 27, The hard-coded user agent string near the call to
create_webdriver(web_driver, headless, user_agent, __name__) should be
centralized: remove the literal from BlaenauGwentCountyBoroughCouncil.py and
instead provide a default/shared constant or default parameter inside the
create_webdriver function (or a shared module constant like DEFAULT_USER_AGENT)
so callers only pass a custom UA when needed; update create_webdriver to use
that default and change this call to rely on the default (or reference the
shared constant) so future UA rotations require a single-file change.
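
One way the default-parameter variant could look is sketched below. The signature is an assumption modeled on the four-argument call sites, `resolve_user_agent` is a hypothetical helper, and the body is a stub that returns the resolved user agent rather than building a real driver.

```python
from typing import Optional

DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
)


def resolve_user_agent(user_agent: Optional[str] = None) -> str:
    """Fall back to the shared default when the caller passes None or ''."""
    return user_agent or DEFAULT_USER_AGENT


def create_webdriver(web_driver=None, headless=True, user_agent=None, session_name=None):
    """Stub (assumption): returns the resolved settings instead of a browser."""
    effective_ua = resolve_user_agent(user_agent)
    # A real implementation would hand effective_ua to the browser options here.
    return {"headless": headless, "user_agent": effective_ua}
```

With this shape, existing calls that pass None would pick up the default without per-file edits. That also changes behavior for every caller that currently passes None, so it is a wider change than the one in this PR.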
uk_bin_collection/uk_bin_collection/councils/NorthumberlandCouncil.py (1)

1-3: Minor: Redundant import shadows the datetime module.

The file imports datetime as a module on line 1, then imports the datetime class from it on line 3, which shadows the module. While this is pre-existing code and works correctly, it could cause confusion.

♻️ Suggested cleanup
-import datetime
 import time
 from datetime import datetime
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/NorthumberlandCouncil.py` around
lines 1 - 3, The file currently imports the datetime module and also imports the
datetime class from it (imports at top: "import datetime" and "from datetime
import datetime"), which causes the module name to be shadowed and is confusing;
remove the redundant "from datetime import datetime" (or alternatively remove
"import datetime" and qualify usages) and update any references to use either
datetime.datetime or datetime (depending on which import you keep) so all usages
(e.g., in functions/classes that call datetime.now or datetime.strptime) remain
correct and unambiguous.
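
The shadowing can be reproduced in isolation. In the sketch below, the second import rebinds the name `datetime` to the class, so module-level attributes such as `timedelta` are no longer reachable through it. The variable names are illustrative.

```python
import datetime
from datetime import datetime  # rebinds the name `datetime` to the class

# Class methods still work, which is why the existing code runs fine.
parsed = datetime.strptime("2026-04-11", "%Y-%m-%d")

# Module attributes are no longer reachable: the class has no `timedelta`.
module_access_works = hasattr(datetime, "timedelta")
```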
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c69db34a-ba48-46d1-91fb-7540d71a6717

📥 Commits

Reviewing files that changed from the base of the PR and between 60bd3cc and 2b31cd8.

📒 Files selected for processing (63)
  • uk_bin_collection/uk_bin_collection/councils/ArgyllandButeCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/AshfordBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/BaberghDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/BarkingDagenham.py
  • uk_bin_collection/uk_bin_collection/councils/BlackburnCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/BlaenauGwentCountyBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/BrightonandHoveCityCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/BroadlandDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/BroxtoweBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/CalderdaleCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/CeredigionCountyCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/ChichesterDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/ColchesterCityCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/CotswoldDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/CroydonCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/DacorumBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/EastLindseyDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/EastRenfrewshireCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/EastRidingCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/EastSuffolkCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/EpsomandEwellBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/ForestOfDeanDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/GloucesterCityCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/GreatYarmouthBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/GuildfordCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/HertsmereBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/HighPeakCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/Hillingdon.py
  • uk_bin_collection/uk_bin_collection/councils/HyndburnBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/KingstonUponThamesCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/KnowsleyMBCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py
  • uk_bin_collection/uk_bin_collection/councils/MaidstoneBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/MidAndEastAntrimBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/MidSuffolkDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/MidUlsterDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/NorthDevonCountyCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/NorthEastDerbyshireDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/NorthWestLeicestershire.py
  • uk_bin_collection/uk_bin_collection/councils/NorthumberlandCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/PortsmouthCityCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/PrestonCityCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/RugbyBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/RushcliffeBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/SevenoaksDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/StHelensBC.py
  • uk_bin_collection/uk_bin_collection/councils/StaffordshireMoorlandsDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/StocktonOnTeesCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/SunderlandCityCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/TeignbridgeCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/TendringDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/ThanetDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/ThreeRiversDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/TorbayCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/WalthamForest.py
  • uk_bin_collection/uk_bin_collection/councils/WestBerkshireCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/WestLothianCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/WestOxfordshireDistrictCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/WinchesterCityCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/WirralCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/WokinghamBoroughCouncil.py
  • uk_bin_collection/uk_bin_collection/councils/WrexhamCountyBoroughCouncil.py

Comment on lines +29 to +30
        user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
            driver = create_webdriver(web_driver, headless, user_agent, __name__)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

python - <<'PY'
from pathlib import Path
import ast
import sys

path = Path("uk_bin_collection/uk_bin_collection/councils/StaffordshireMoorlandsDistrictCouncil.py")
source = path.read_text(encoding="utf-8")

try:
    ast.parse(source, filename=str(path))
    print(f"{path}: syntax OK")
except SyntaxError as exc:
    print(f"{path}:{exc.lineno}:{exc.offset}: {exc.msg}")
    sys.exit(1)
PY

Repository: robbrad/UKBinCollectionData

Length of output: 181


🏁 Script executed:

cat -n uk_bin_collection/uk_bin_collection/councils/StaffordshireMoorlandsDistrictCouncil.py | sed -n '25,35p'

Repository: robbrad/UKBinCollectionData

Length of output: 666


Fix the extra indentation on line 30.

Line 30 is indented one level too deep, causing a syntax error that prevents the file from being parsed. The driver assignment must align with the user_agent assignment above it.

🐛 Minimal fix
         # Create Selenium webdriver
         user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
-            driver = create_webdriver(web_driver, headless, user_agent, __name__)
+        driver = create_webdriver(web_driver, headless, user_agent, __name__)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
-            driver = create_webdriver(web_driver, headless, user_agent, __name__)
+        user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+        driver = create_webdriver(web_driver, headless, user_agent, __name__)
🧰 Tools
🪛 Ruff (0.15.9)

[warning] 30-30: Unexpected indentation

(invalid-syntax)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@uk_bin_collection/uk_bin_collection/councils/StaffordshireMoorlandsDistrictCouncil.py`
around lines 29 - 30, There's an extra indentation before the driver assignment
causing a syntax/parsing error; align the driver = create_webdriver(...)
statement with the user_agent = ... line so both have the same indentation
level. Locate the driver assignment that calls create_webdriver(web_driver,
headless, user_agent, __name__) and remove the leading tab/spaces so it lines up
with the user_agent variable.
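
The failure mode can be reproduced without the repository. The sketch below compiles a two-line snippet shaped like lines 29-30 (the user-agent value is a shortened placeholder) and captures the resulting error. Nothing is executed, so the undefined names in the snippet do not matter.

```python
snippet = (
    'user_agent = "Mozilla/5.0 (placeholder)"\n'
    '    driver = create_webdriver(web_driver, headless, user_agent, __name__)\n'
)

try:
    # compile() only parses the source; it does not run it.
    compile(snippet, "<snippet>", "exec")
    error = None
except IndentationError as exc:
    error = exc
```

Because the error is raised at parse time, the whole module fails to import, not just the affected line.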

Comment on lines +31 to +32
        user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
            driver = create_webdriver(web_driver, headless, user_agent, __name__)

⚠️ Potential issue | 🔴 Critical

Critical: Indentation error causes syntax failure.

Line 32 has incorrect indentation relative to line 31. The driver = create_webdriver(...) statement is indented further than the user_agent = ... assignment, which will cause a Python syntax error preventing this scraper from running.

🐛 Proposed fix
         # Create the Selenium WebDriver
         user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
-            driver = create_webdriver(web_driver, headless, user_agent, __name__)
+        driver = create_webdriver(web_driver, headless, user_agent, __name__)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
-            driver = create_webdriver(web_driver, headless, user_agent, __name__)
+        # Create the Selenium WebDriver
+        user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"
+        driver = create_webdriver(web_driver, headless, user_agent, __name__)
🧰 Tools
🪛 Ruff (0.15.9)

[warning] 32-32: Unexpected indentation

(invalid-syntax)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@uk_bin_collection/uk_bin_collection/councils/ThanetDistrictCouncil.py` around
lines 31 - 32, The indentation of the driver creation is incorrect: align the
`driver = create_webdriver(web_driver, headless, user_agent, __name__)`
statement to the same indentation level as the `user_agent = "..."` assignment
so both `user_agent` and `driver` are in the same scope; locate the `user_agent`
and `driver` lines in ThanetDistrictCouncil.py and un-indent or re-indent the
`driver` line accordingly so `create_webdriver(web_driver, headless, user_agent,
__name__)` executes without a syntax error.
