SûrEtBon is a backend infrastructure for restaurant safety assessment, combining government health inspection data with public ratings to provide a comprehensive evaluation of food establishments. Built on Supabase and following medallion architecture patterns, it delivers scalable, secure, and maintainable data processing capabilities.
This backend project provides one-time initialization scripts to set up the infrastructure and load initial data. Ongoing data updates are handled by a separate project (data-pipeline) using Apache Airflow for scheduled ETL operations.
- This project (`backend`): Initial setup, migrations, and infrastructure maintenance
- `data-pipeline` project: Scheduled data updates, ETL pipelines, and data refresh operations
SûrEtBon implements a modern data platform architecture with separated initialization and update processes:
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                            External Data Sources                            │
│  ┌──────────────┐  ┌───────────────┐  ┌───────────────┐  ┌─────────────┐    │
│  │ OpenStreetMap│  │ Alim'confiance│  │ Google Places │  │ Tripadvisor │    │
│  └──────┬───────┘  └───────┬───────┘  └───────┬───────┘  └──────┬──────┘    │
└─────────┼──────────────────┼──────────────────┼─────────────────┼───────────┘
          ▼                  ▼                  ▼                 ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                        Data Ingestion (Two Projects)                        │
│      ┌────────────────────────┐        ┌──────────────────────────┐         │
│      │ Backend (This Project) │        │ data-pipeline (Airflow)  │         │
│      │ • Initial data load    │        │ • Scheduled updates      │         │
│      │ • One-time setup       │        │ • ETL pipelines          │         │
│      │ • Infrastructure       │        │ • Data refresh           │         │
│      └───────────┬────────────┘        └────────────┬─────────────┘         │
└──────────────────┼──────────────────────────────────┼───────────────────────┘
                   ▼                                  ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                              Supabase Platform                              │
│      ┌───────────────────────────────────────────────────────┐              │
│      │                  Storage (Data Lake)                  │              │
│      │                 Parquet File Archive                  │              │
│      └──────────────────────────┬────────────────────────────┘              │
│                                 ▼                                           │
│      ┌───────────────────────────────────────────────────────┐              │
│      │                  PostgreSQL Database                  │              │
│      │   ┌────────────┐    ┌────────────┐    ┌────────────┐  │              │
│      │   │   Bronze   │───▶│   Silver   │───▶│    Gold    │  │              │
│      │   │   Schema   │    │   Schema   │    │   Schema   │  │              │
│      │   └────────────┘    └────────────┘    └────────────┘  │              │
│      └───────────────────────────────────────────────────────┘              │
└─────────────────────────────────────────────────────────────────────────────┘
```
The system implements a three-layer medallion architecture for progressive data refinement:
- Bronze Layer: Raw data ingestion with minimal transformation
- Silver Layer: Cleansed, validated, and standardized data
- Gold Layer: Business-ready aggregations and metrics
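As a toy illustration of this refinement (hypothetical field names, not the project's actual schema), a raw bronze record is cleansed in silver and aggregated into a gold metric:

```python
# Hypothetical record flowing through the medallion layers.
bronze = {"name": " Chez  Marie ", "rating": "4,5"}  # raw, as ingested

# Silver: cleanse and standardize (trim whitespace, parse French decimal comma).
silver = {
    "name": " ".join(bronze["name"].split()),
    "rating": float(bronze["rating"].replace(",", ".")),
}

# Gold: business-ready metric (score on a 0-100 scale).
gold = {"name": silver["name"], "score": round(silver["rating"] / 5 * 100)}
print(gold)  # {'name': 'Chez Marie', 'score': 90}
```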
The platform includes comprehensive PostGIS support for location-based analysis:
- PostGIS Core: 2D/3D geometry and geography types for restaurant locations
- Raster Support: Heatmap generation and density analysis
- Advanced 3D: Complex geometric calculations for urban environments
- Topology Management: Administrative boundaries and inspection zones
- Spatial Indexing: High-performance proximity searches and geographic queries
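To make the proximity-search use case concrete, here is a plain-Python sketch of the great-circle (haversine) distance that a PostGIS radius query computes far more efficiently with spatial indexes (the coordinates below are illustrative):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    R = 6371000  # mean Earth radius in meters
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * R * asin(sqrt(a))

# Illustrative filter: restaurants within 500 m of a point in Paris.
restaurants = [("A", 48.8570, 2.3530), ("B", 48.8700, 2.3700)]
nearby = [name for name, lat, lon in restaurants
          if haversine_m(48.8566, 2.3522, lat, lon) <= 500]
# nearby == ["A"]
```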
The Supabase CLI manages local development and deployment:
📖 Installation Guide: https://supabase.com/docs/guides/cli/getting-started
We recommend uv for fast, reliable Python dependency management:
📖 Installation Guide: https://docs.astral.sh/uv/getting-started/installation/
Authenticate with your Supabase account:
```bash
supabase login
```
📖 Authentication Guide: https://supabase.com/docs/reference/cli/supabase-login
Create a new project on the Supabase platform:
```bash
supabase projects create SurEtBon \
  --db-password <strong-password> \
  --org-id <your-org-id> \
  --region eu-west-3
```
Parameters:
- `--db-password`: Strong password (16+ chars, mixed case, numbers, symbols)
- `--org-id`: Organization ID (run `supabase orgs list` to find it)
- `--region`: Geographic region (`eu-west-3` for Paris)
Important: Save the displayed REFERENCE ID - you'll need it throughout setup.
Initialize the local Supabase stack:
```bash
supabase start
```
Connect your local environment to the cloud project:
```bash
supabase link --project-ref <reference-id>
```
For self-hosted Supabase instances, the `supabase link` command is not available. Skip Step 3 and proceed directly to Step 4.
The initialization script auto-detects self-hosted deployments and uses `--db-url` instead of `--linked` for database migrations. Ensure `SUPABASE_DB_URI` is correctly configured in your `.env` file.
1. Copy the template and secure its file permissions:
   ```bash
   cp .env.sample .env
   chmod 600 .env  # Secure file permissions
   ```
2. Set `SUPABASE_URL`:
   ```
   SUPABASE_URL=https://<reference-id>.supabase.co
   ```
3. Get API keys:
   ```bash
   supabase projects api-keys --project-ref <reference-id>
   ```
   Copy the `service_role` key to `SUPABASE_SERVICE_ROLE_KEY`.
4. Configure the database URI:
   - Navigate to `https://supabase.com/dashboard/project/<reference-id>?showConnect=true`
   - Select the "Session pooler" connection type
   - Copy the URI and remove the `postgresql://` prefix
   - Set it in `.env` as `SUPABASE_DB_URI`
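After these steps, your `.env` should contain values along these lines (placeholders shown, not real credentials):

```
SUPABASE_URL=https://<reference-id>.supabase.co
SUPABASE_SERVICE_ROLE_KEY=<service-role-key>
SUPABASE_DB_URI=<user>:<password>@<pooler-host>:5432/postgres
```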
Run the comprehensive initialization script:
```bash
./bin/initialize_backend.sh
```
This performs:
- Storage bucket creation with security policies
- Database setup via migrations (all schemas and tables)
- Initial data import (OSM + Alim'confiance) for bootstrap
Expected runtime: 5-10 minutes
Note: This is a one-time initialization. Subsequent data updates will be handled by the data-pipeline project.
For users who have already initialized the backend and need to apply new migrations:
Push pending migrations to your linked project:
```bash
supabase db push --linked
```
This applies all migrations in `supabase/migrations/` that haven't been executed yet.
Push pending migrations using the database URL:
```bash
supabase db push --db-url "postgres://$SUPABASE_DB_URI"
```
Ensure `SUPABASE_DB_URI` is configured in your `.env` file.
Check which migrations have been applied:
```bash
# Hosted Supabase
supabase migration list --linked

# Self-hosted Supabase
supabase migration list --db-url "postgres://$SUPABASE_DB_URI"
```
```
backend/
├── bin/                                   # Executable scripts
│   ├── initialize_backend.sh              # Main orchestrator (Bash)
│   ├── setup_bucket.py                    # Storage configuration (Python)
│   ├── download_osm_data.py               # OSM initial data loader (Python)
│   └── download_alimconfiance_data.py     # Government initial data loader (Python)
│
├── supabase/                              # Supabase configuration
│   ├── config.toml                        # Service configuration
│   ├── migrations/                        # Database migrations
│   │   ├── 20251008212033_create_medallion_architecture.sql               # Schemas: bronze, silver, gold
│   │   ├── 20251008230529_create_restaurant_ratings_enrichment_tables.sql # API response tables
│   │   └── 20251025000000_enable_database_extensions.sql                  # PostGIS and fuzzystrmatch extensions
│   └── functions/                         # Edge Functions (future)
│
├── logs/                                  # Execution logs (gitignored)
│   └── backend_initialization_*.log       # Timestamped logs
│
├── .env.sample                            # Environment template
├── .env                                   # Local configuration (gitignored)
├── .gitignore                             # Git exclusions
├── pyproject.toml                         # Python dependencies
└── README.md                              # This documentation
```
Bucket: data_lake
- Access: Private (service role only)
- File Types: Parquet, CSV
- Size Limit: 50MB per file
- Organization: Date-partitioned folders
| Extension | Schema | Description | Use Case |
|---|---|---|---|
| `postgis` | `extensions` | Core geospatial types and functions | Restaurant proximity searches, coordinate transforms |
| `postgis_raster` | `extensions` | Raster data support | Heatmaps, density analysis, coverage visualization |
| `postgis_sfcgal` | `extensions` | Advanced 3D geometry operations | Complex spatial calculations, building-level analysis |
| `postgis_topology` | `topology` | Spatial topology management | Administrative boundaries, inspection zone management |
| `fuzzystrmatch` | `extensions` | Fuzzy string matching functions | Restaurant name matching, typo tolerance |
| `geohash_adjacent` | `extensions` | Compute adjacent GeoHash cell in a direction | Spatial neighbor computation, boundary handling |
| `geohash_neighbors` | `extensions` | Get 9-cell array (center + 8 neighbors) | Spatial batch processing, boundary coverage |
Spatial Reference Systems: WGS84 (4326), Web Mercator (3857), Lambert-93 (2154)
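The two GeoHash helpers follow the classic GeoHash adjacency algorithm; its lookup-table core can be sketched in Python (a sketch of the technique, not necessarily the migration's exact SQL logic, and the neighbor ordering shown is just one possible convention):

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

# Lookup tables from the classic GeoHash adjacency algorithm,
# indexed by direction and by hash-length parity: (even, odd).
NEIGHBOR = {
    "n": ("p0r21436x8zb9dcf5h7kjnmqesgutwvy", "bc01fg45238967deuvhjyznpkmstqrwx"),
    "s": ("14365h7k9dcfesgujnmqp0r2twvyx8zb", "238967debc01fg45kmstqrwxuvhjyznp"),
    "e": ("bc01fg45238967deuvhjyznpkmstqrwx", "p0r21436x8zb9dcf5h7kjnmqesgutwvy"),
    "w": ("238967debc01fg45kmstqrwxuvhjyznp", "14365h7k9dcfesgujnmqp0r2twvyx8zb"),
}
BORDER = {
    "n": ("prxz", "bcfguvyz"),
    "s": ("028b", "0145hjnp"),
    "e": ("bcfguvyz", "prxz"),
    "w": ("0145hjnp", "028b"),
}

def geohash_adjacent(gh: str, direction: str) -> str:
    """Return the neighboring GeoHash cell in direction 'n', 's', 'e', or 'w'."""
    gh = gh.lower()
    last, parent = gh[-1], gh[:-1]
    parity = len(gh) % 2  # 0 = even-length hash, 1 = odd-length
    if last in BORDER[direction][parity] and parent:
        parent = geohash_adjacent(parent, direction)  # carry across the parent cell edge
    return parent + BASE32[NEIGHBOR[direction][parity].index(last)]

def geohash_neighbors(gh: str) -> list[str]:
    """Center cell plus its 8 neighbors (one possible ordering)."""
    n, s = geohash_adjacent(gh, "n"), geohash_adjacent(gh, "s")
    return [gh,
            n, geohash_adjacent(n, "e"), geohash_adjacent(gh, "e"),
            geohash_adjacent(s, "e"), s, geohash_adjacent(s, "w"),
            geohash_adjacent(gh, "w"), geohash_adjacent(n, "w")]
```

The recursion in `geohash_adjacent` is what handles the boundary cases: when the last character sits on a cell border, the parent prefix itself must first be shifted in the same direction.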
| Table | Description | Initial Data | Row Count |
|---|---|---|---|
| `osm_france_food_service` | OpenStreetMap restaurants | Latest snapshot at initialization | ~165K |
| `export_alimconfiance` | Health inspections | Latest available (daily refresh) | ~80K |
| `google_places` | Google Maps restaurant ratings | Populated by data-pipeline via Airflow | Variable* |
| `tripadvisor_location_details` | Tripadvisor restaurant ratings | Populated by data-pipeline via Airflow | Variable* |
*API-based tables: Row counts vary based on monthly query budget. Airflow prioritizes new restaurants first, then oldest updates. Historical data is preserved.
Populated by the data-pipeline project using DBT transformations. Contains validated and standardized datasets where OpenStreetMap restaurant data is linked with official health inspection results, enabling accurate matching for rating enrichment.
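The name-matching idea behind this linkage (the database side uses `fuzzystrmatch`) can be sketched with Python's standard library; the threshold and names below are illustrative, not the pipeline's actual logic:

```python
from difflib import SequenceMatcher

def normalized(name: str) -> str:
    # Lowercase and collapse whitespace before comparing.
    return " ".join(name.lower().split())

def name_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalized(a), normalized(b)).ratio()

# Illustrative match between an OSM name and an inspection-record name.
score = name_similarity("Chez Marie", "CHEZ  MARIE")
is_match = score >= 0.85  # hypothetical threshold
```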
Populated by the data-pipeline project using DBT transformations. Contains the final aggregated dataset with all restaurant data enriched with Google Maps and Tripadvisor ratings, providing comprehensive metrics and KPIs for food establishment assessment.
- Provider: OpenDataSoft
- Coverage: Metropolitan France + overseas
- Initial Load: Latest available snapshot at initialization (dated to most recent Monday)
- Format: Parquet
- License: ODbL
- Note: Regular updates handled by data-pipeline project. The initialization script calculates the most recent Monday date for consistency with OpenDataSoft's weekly refresh cycle.
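The "most recent Monday" computation can be expressed with the standard library (a sketch, not necessarily the initialization script's exact code):

```python
from datetime import date, timedelta

def most_recent_monday(today: date) -> date:
    # weekday() is 0 for Monday; subtract the offset into the current week.
    return today - timedelta(days=today.weekday())

# 2025-12-09 is a Tuesday, so the snapshot date resolves to Monday 2025-12-08.
snapshot = most_recent_monday(date(2025, 12, 9))
```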
- Provider: Ministry of Agriculture (via OpenDataSoft DGAL)
- Coverage: All French food establishments
- Initial Load: Latest available snapshot at initialization
- Format: Parquet
- License: Open License 2.0
- Note: Data refreshed daily. Regular updates handled by data-pipeline project
Version: 0.0.3
Last Updated: 2025-12-09
Maintainers: Jonathan