FoehnCast tells you which Swiss kiteboarding spot is worth the drive today. It pulls weather forecasts, builds features, trains a quality model, and ranks spots through an API — all following the Feature → Training → Inference (FTI) pattern from the HSLU MLOps course.
Full docs: https://javihslu.github.io/foehncast/
Reviewers: the scoring checklist maps every grading criterion to where it is implemented, with direct links to the code, docs, and live services.
flowchart TD
Forecasts[Weather APIs] -->|Feature pipeline| Curated[Curated features]
Curated -->|Training pipeline| Registry[MLflow model registry]
Curated --> Feast[Feast online store]
Inputs[Live forecast + drive time] --> API[FastAPI]
Feast --> API
Registry --> API
API --> Rank[Ranked spots]
Three pipelines, one goal:
| Pipeline | What it does |
|---|---|
| Feature | Fetches forecasts, engineers wind features, validates data, stores parquet |
| Training | Labels quality, trains a model, evaluates it, registers in MLflow |
| Inference | Serves predictions via FastAPI + ranks spots for one rider profile |
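To make the split in the table concrete, here is a toy sketch of the three stages chained together. It is not FoehnCast's code: the forecast values, the `gust_factor` feature, and the `ThresholdModel` class are all invented for illustration, and the real pipelines store parquet, register models in MLflow, and serve through FastAPI instead of passing Python objects around.

```python
from dataclasses import dataclass
from statistics import mean

# Invented sample forecasts; real data would come from the weather APIs.
RAW_FORECASTS = [
    {"spot_id": "silvaplana", "wind_kn": 18.0, "gust_kn": 24.0},
    {"spot_id": "urnersee", "wind_kn": 9.0, "gust_kn": 16.0},
]


def feature_pipeline(raw):
    """Feature stage: validate rows and derive a (hypothetical) gust factor."""
    rows = []
    for row in raw:
        if row["wind_kn"] <= 0:
            continue  # minimal validation: drop rows without usable wind
        rows.append({**row, "gust_factor": row["gust_kn"] / row["wind_kn"]})
    return rows


@dataclass
class ThresholdModel:
    """Stand-in for a trained quality model."""
    min_wind_kn: float

    def predict(self, row):
        # Below the threshold the spot scores 0; above it, steadier wind
        # (gust factor closer to 1) scores higher.
        if row["wind_kn"] < self.min_wind_kn:
            return 0.0
        return round(1.0 / row["gust_factor"], 3)


def training_pipeline(features):
    """Training stage: 'fit' the threshold at the mean wind speed."""
    return ThresholdModel(min_wind_kn=mean(r["wind_kn"] for r in features))


def inference_pipeline(model, features):
    """Inference stage: score every spot and rank best first."""
    scored = [(r["spot_id"], model.predict(r)) for r in features]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


features = feature_pipeline(RAW_FORECASTS)
model = training_pipeline(features)
ranking = inference_pipeline(model, features)
```

Each stage only consumes the output of the previous one, which is what lets the three pipelines run and be scheduled independently.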
Open the live demo: ranked Swiss kiteboarding spots, served from Cloud Run.
git clone https://github.com/javihslu/foehncast.git
cd foehncast
./scripts/bootstrap-local.sh

The script starts the local stack (Airflow, MLflow, MinIO, Prometheus, and the app) and runs a smoke test. No cloud credentials are needed.
After bootstrap, you get:
| Service | URL |
|---|---|
| App (FastAPI) | http://127.0.0.1:8000 |
| Airflow | http://127.0.0.1:8080 |
| MLflow | http://127.0.0.1:5001 |
| Prometheus | http://127.0.0.1:9090 |
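If you want to confirm from Python that the services in the table are up, a small probe along these lines may help. It is a sketch using only the standard library, not part of the repository: the `SERVICES` mapping simply copies the URLs above, and `is_reachable` and `check_services` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Local endpoints copied from the table above.
SERVICES = {
    "app": "http://127.0.0.1:8000",
    "airflow": "http://127.0.0.1:8080",
    "mlflow": "http://127.0.0.1:5001",
    "prometheus": "http://127.0.0.1:9090",
}

# These are loopback addresses, so skip any proxy configured in the environment.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at `url`, even with an error status."""
    try:
        with _OPENER.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, just not with a 2xx status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or similar


def check_services(services=SERVICES, timeout: float = 2.0) -> dict:
    """Probe every service and return a name -> reachable mapping."""
    return {name: is_reachable(url, timeout) for name, url in services.items()}
```

Calling `check_services()` after bootstrap returns a dictionary with one boolean per service; any `False` entry points at a container that did not start.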
Try it:
curl -X POST http://127.0.0.1:8000/rank \
-H 'content-type: application/json' \
-d '{"spot_ids":["silvaplana","urnersee"]}'

DVC lets you rerun the offline pipelines deterministically:
dvc repro # ingest → curate → train
dvc metrics show # check training results

Both DVC stages call the same Python modules as the Airflow DAGs, so the offline and orchestrated runs share one code path.
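The shared code path can be pictured roughly as follows. This is a sketch under assumptions, not the project's actual modules: the stage names come from the `dvc repro` comment above, while `run_stage`, `cli`, and `airflow_task` are hypothetical names, and the stage bodies are placeholders.

```python
import argparse
from typing import Callable, Dict, List, Optional


# Placeholder stage bodies; the real ones live in the application package.
def ingest() -> str:
    return "ingested"


def curate() -> str:
    return "curated"


def train() -> str:
    return "trained"


STAGES: Dict[str, Callable[[], str]] = {
    "ingest": ingest,
    "curate": curate,
    "train": train,
}


def run_stage(name: str) -> str:
    """The single code path: every caller below ends up here."""
    return STAGES[name]()


def cli(argv: Optional[List[str]] = None) -> str:
    """Command-line entry point, the kind of thing a DVC stage command could invoke."""
    parser = argparse.ArgumentParser(description="Run one pipeline stage")
    parser.add_argument("stage", choices=sorted(STAGES))
    return run_stage(parser.parse_args(argv).stage)


def airflow_task(stage: str) -> Callable[[], str]:
    """Build a zero-argument callable that an orchestrator task could wrap."""
    return lambda: run_stage(stage)
```

With this shape, a DVC stage and an Airflow task differ only in how they reach `run_stage`, so a fix to a stage applies to both the offline and the orchestrated run.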
make test # run the unit tests
make lint # ruff
make coverage # coverage report

The cloud version runs on GCP Cloud Run. Contributors do not need cloud access; Docker is enough to run everything locally.
flowchart LR
classDef public fill:#c8e6c9,stroke:#2e7d32,color:#0f2530
classDef managed fill:#e3f2fd,stroke:#1565c0,color:#0f2530
classDef svc fill:#fff3e0,stroke:#ef6c00,color:#0f2530
UI[Streamlit UI]:::public
APP[FastAPI App]:::public
MLF[MLflow]:::svc
WF[Cloud Workflows]:::svc
BQ[(BigQuery)]:::managed
GCS[(Cloud Storage)]:::managed
SQL[(Cloud SQL)]:::managed
APP --> BQ & GCS
MLF --> SQL & GCS
WF -->|schedules pipelines| APP
Deployment is handled by Terraform (infrastructure) and Cloud Build triggers (images). See the Cloud Architecture docs for details.
src/foehncast/ # All application code (features, training, inference, monitoring)
dags/ # Airflow DAG definitions
ui/ # Streamlit dashboard
containers/ # Dockerfiles (6 services)
scripts/ # Bootstrap and helper scripts
terraform/ # GCP infrastructure-as-code
tests/ # pytest unit tests
docs/ # MkDocs source → GitHub Pages
config.yaml # All tuneable parameters (spots, model, APIs)
dvc.yaml # Reproducible pipeline stages
- Scoring checklist — grading criteria mapped to evidence
- Full documentation
- Getting started
- Architecture
- Cloud architecture
- Terraform operator detail: terraform/README.md
- Container detail: containers/README.md
