The BC Child Care Dashboard is an interactive Quarto dashboard for finding childcare vacancies across BC. It pulls data daily from the BC government's BC Child Care dataset and is published automatically via GitHub Actions.
The dashboard has interactive filters, a map view, and a facility listing to help navigate vacancies by location, age group, language, and certification.
In addition to the daily snapshot, this repo tracks per-facility vacancy history in data/vacancy_history.csv. This enables questions like:
- Has this facility ever had a vacancy for under-36-month-olds?
- When was the last time this facility had an open spot?
- When did this facility first appear in the dataset?
The history file is updated daily by a separate GitHub Action (update_history.yml) that runs 30 minutes before the dashboard publishes. It tracks one row per facility with these fields:
| Column | Description |
|---|---|
| `FAC_PARTY_ID` | Unique facility identifier |
| `is_active` | FALSE if facility no longer appears in BC data |
| `date_first_seen` | Date facility first appeared after tracking began (NA for facilities present at launch) |
| `ever_vacancy_under36` | Has ever had a vacancy for children under 36 months |
| `last_vacancy_under36` | Most recent date vacancy was open for under-36-month-olds |
| `ever_vacancy_30mos_5yrs` | Has ever had a vacancy for 30 months to 5 years |
| `last_vacancy_30mos_5yrs` | Most recent date vacancy was open for 30 months to 5 years |
| `ever_vacancy_licpre` | Has ever had a vacancy for licensed preschool |
| `last_vacancy_licpre` | Most recent date vacancy was open for licensed preschool |
| `ever_vacancy_gr1_age12` | Has ever had a vacancy for grade 1 to age 12 |
| `last_vacancy_gr1_age12` | Most recent date vacancy was open for grade 1 to age 12 |
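
The three questions listed above can be answered with a few lines of base R. The following is a minimal sketch, not code from this repo: the sample rows, the facility IDs, and the helper `last_open_spot()` are hypothetical, only two of the four age groups are included for brevity, and dates are assumed to be stored as `YYYY-MM-DD`. Against the real file, the `read.csv(text = ...)` call would become `read.csv("data/vacancy_history.csv")`.

```r
# Hypothetical sample rows using the documented column names.
csv_text <- "FAC_PARTY_ID,is_active,date_first_seen,ever_vacancy_under36,last_vacancy_under36,ever_vacancy_30mos_5yrs,last_vacancy_30mos_5yrs
1001,TRUE,NA,TRUE,2024-05-01,TRUE,2024-06-10
1002,TRUE,2024-03-15,FALSE,NA,FALSE,NA
1003,FALSE,NA,TRUE,2023-11-20,FALSE,NA"

history <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Q1: has facility 1001 ever had a vacancy for under-36-month-olds?
ever_under36 <- history$ever_vacancy_under36[history$FAC_PARTY_ID == 1001]

# Q2: when did a facility last have an open spot in any age group?
# Takes the latest non-missing date across all last_vacancy_* columns.
last_cols <- grep("^last_vacancy_", names(history), value = TRUE)

last_open_spot <- function(history, fac_id) {
  row <- history[history$FAC_PARTY_ID == fac_id, last_cols, drop = FALSE]
  dates <- as.Date(unlist(row, use.names = FALSE))
  dates <- dates[!is.na(dates)]
  if (length(dates) == 0) return(as.Date(NA))  # no vacancy ever recorded
  max(dates)
}

# Q3: when did facility 1002 first appear in the dataset?
# NA here would mean the facility was already present when tracking began.
first_seen <- as.Date(history$date_first_seen[history$FAC_PARTY_ID == 1002])

print(ever_under36)
print(last_open_spot(history, 1001))
print(first_seen)
```

Scanning the `last_vacancy_*` columns by name pattern means the helper picks up all four age groups in the real file without listing them individually.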
data/facility_urls.csv stores website URLs for each facility, seeded from the BC dataset's WEBSITE field and supplemented by DuckDuckGo search for facilities without one. It tracks one row per facility:
| Column | Description |
|---|---|
| `FAC_PARTY_ID` | Unique facility identifier |
| `url` | Website URL (NA if not found) |
| `url_source` | "bc_dataset" if from BC data, "duckduckgo" if found via search |
| `last_searched` | Date DuckDuckGo was last queried for this facility |
The file is updated monthly by find_urls.yml. Facilities with no URL are re-searched after 150 days in case a website has since appeared.
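
The re-search rule can be expressed as a filter over `url` and `last_searched`. This is a sketch of one plausible reading, not the logic in `find_urls.R`: the function `needs_search()` and the sample rows are hypothetical, it assumes that facilities with no URL and no `last_searched` date are also searched, and it treats "after 150 days" as "at least 150 days ago".

```r
# Hypothetical sample rows using the documented column names.
csv_text <- "FAC_PARTY_ID,url,url_source,last_searched
1001,https://example.org,bc_dataset,NA
1002,NA,NA,2024-01-01
1003,NA,NA,2024-06-01
1004,NA,NA,NA"

urls <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# TRUE for facilities that should be (re-)searched on this run.
needs_search <- function(urls, today = Sys.Date(), retry_days = 150) {
  last <- as.Date(urls$last_searched)
  no_url <- is.na(urls$url)
  never_searched <- no_url & is.na(last)              # assumption: search these too
  days_since <- as.numeric(today - last)              # NA where never searched
  stale_miss <- no_url & !is.na(last) & days_since >= retry_days
  never_searched | stale_miss
}

due <- needs_search(urls, today = as.Date("2024-07-01"))
print(urls$FAC_PARTY_ID[due])
```

With `today` fixed at 2024-07-01, facility 1002 was last searched 182 days earlier and is due again, 1003 was searched 30 days earlier and is skipped, and 1004 has never been searched.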
Running manually:

    Rscript find_urls.R

Tuning parameters (set as environment variables):

| Variable | Default | Description |
|---|---|---|
| `DDG_THROTTLE_SECS` | 3 | Seconds between sequential DuckDuckGo requests; lower = faster but more likely to trigger rate limiting |
| `DDG_BATCH_SIZE` | 100 | Facilities per write checkpoint; lower = more frequent saves |
| `DDG_RETRY_DAYS` | 150 | Days before re-searching a facility that previously returned no URL |
| `DDG_MAX_RUNTIME_SECS` | 18000 | Wall-clock budget (5h). Script checkpoints and exits cleanly when reached, so the monthly cron stays under the GHA 6h job limit |
| `DDG_MAX_CONSEC_BLOCKS` | 100 | Abort the run after this many consecutive blocked responses (DDG is rate-limiting our IP); at 3s/req this allows ~5 min of transient blocking before giving up; the next run will resume |
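
For readers unfamiliar with environment-variable configuration in R, the sketch below shows one common way to read these variables with the documented defaults. The helper `env_num()` and the `config` list are hypothetical names; how `find_urls.R` actually parses its settings is not shown here.

```r
# Read a numeric environment variable, falling back to a default when the
# variable is unset, empty, or not a number.
env_num <- function(name, default) {
  raw <- Sys.getenv(name, unset = NA)
  value <- suppressWarnings(as.numeric(raw))
  if (is.na(value)) default else value
}

config <- list(
  throttle_secs     = env_num("DDG_THROTTLE_SECS", 3),
  batch_size        = env_num("DDG_BATCH_SIZE", 100),
  retry_days        = env_num("DDG_RETRY_DAYS", 150),
  max_runtime_secs  = env_num("DDG_MAX_RUNTIME_SECS", 18000),
  max_consec_blocks = env_num("DDG_MAX_CONSEC_BLOCKS", 100)
)

# Derived figures quoted in the table: with the defaults these come out to
# 100 * 3 / 60 = 5 minutes of tolerated blocking and 18000 / 3600 = 5 hours.
block_tolerance_mins <- config$max_consec_blocks * config$throttle_secs / 60
runtime_budget_hours <- config$max_runtime_secs / 3600
```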
Blocked responses (DDG anti-bot pages, non-200 status, transport errors) do not update last_searched, so affected facilities are retried on the next run rather than locked out for DDG_RETRY_DAYS days.

For example, to wait 5 seconds between requests for a single run:

    DDG_THROTTLE_SECS=5 Rscript find_urls.R
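
The interaction between the runtime budget, the consecutive-block limit, and `last_searched` can be illustrated with a simplified loop. This is a sketch under stated assumptions, not the code in `find_urls.R`: `run_searches()` and `fake_search()` are hypothetical, the search function is assumed to return a list with a `blocked` flag, and checkpoint writes are omitted.

```r
# Returns the last_searched date for each id: today's date if the search got a
# real (unblocked) response, NA if it was blocked or never attempted.
run_searches <- function(ids, search_fn,
                         max_consec_blocks = 100,
                         max_runtime_secs = 18000,
                         throttle_secs = 0,
                         today = Sys.Date()) {
  started <- Sys.time()
  last_searched <- rep(as.Date(NA), length(ids))
  consec_blocks <- 0

  for (i in seq_along(ids)) {
    # Stop cleanly once the wall-clock budget is spent.
    elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
    if (elapsed >= max_runtime_secs) break

    result <- search_fn(ids[i])
    if (isTRUE(result$blocked)) {
      # Blocked: leave last_searched untouched so the next run retries it.
      consec_blocks <- consec_blocks + 1
      if (consec_blocks >= max_consec_blocks) break
    } else {
      consec_blocks <- 0
      last_searched[i] <- today
    }
    Sys.sleep(throttle_secs)
  }
  last_searched
}

# Stand-in for a real search: pretends ids 2 and 3 hit an anti-bot page.
fake_search <- function(id) list(blocked = id %in% c(2, 3), url = NA)

out <- run_searches(1:4, fake_search, max_consec_blocks = 2,
                    today = as.Date("2024-07-01"))
print(out)
```

In this run, id 1 is stamped with the search date, ids 2 and 3 are blocked and stay `NA`, and the loop aborts after the second consecutive block, so id 4 is never attempted and also stays `NA`. All three remain eligible on the next run.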