OpenBeta Parquet Exporter

Export climbing route data from OpenBeta to Apache Parquet format.

Quick Start - Download Data

Latest export: see the Releases page.

Download openbeta-climbs.parquet and use it with any Parquet-compatible tool:

# Python with DuckDB
import duckdb
df = duckdb.execute("SELECT * FROM 'openbeta-climbs.parquet' LIMIT 10").fetchdf()
print(df)

# R with arrow
library(arrow)
df <- read_parquet('openbeta-climbs.parquet')

-- DuckDB
SELECT * FROM 'openbeta-climbs.parquet'
WHERE country = 'USA' AND state_province = 'California'
LIMIT 10;

Converting to JSON/GeoJSON

python parquet2json.py climbs.json      # JSON
python parquet2json.py climbs.geojson   # GeoJSON (auto-detected from extension)
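
For readers curious what such a conversion involves, here is a minimal sketch. It is not the repository's parquet2json.py. It assumes the rows have already been loaded as Python dictionaries and that coordinates live in columns named latitude and longitude (an assumption; check schema.sql for the real names). The helper names detect_format, rows_to_geojson, and write_rows are hypothetical.

```python
import json
from pathlib import Path


def detect_format(output_path):
    """Pick the output format from the file extension (hypothetical helper)."""
    return "geojson" if Path(output_path).suffix.lower() == ".geojson" else "json"


def rows_to_geojson(rows, lat_key="latitude", lng_key="longitude"):
    """Wrap row dicts in a GeoJSON FeatureCollection of Point features."""
    features = []
    for row in rows:
        lat, lng = row.get(lat_key), row.get(lng_key)
        if lat is None or lng is None:
            continue  # skip rows without coordinates
        properties = {k: v for k, v in row.items() if k not in (lat_key, lng_key)}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


def write_rows(rows, output_path):
    """Write rows as plain JSON or GeoJSON, depending on the extension."""
    if detect_format(output_path) == "geojson":
        payload = rows_to_geojson(rows)
    else:
        payload = rows
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
```

The column names are keyword arguments, so a custom schema that renames the coordinate columns can still be converted by passing different keys.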

Data Format

Each row represents one climbing route. See schema.sql for column definitions.

Customizing the Export

Want different fields or filters? You can customize and run your own export!

Prerequisites

  • Python 3.9+
  • pip

Installation

git clone https://github.com/OpenBeta/parquet-exporter.git
cd parquet-exporter
pip install -r requirements.txt

Option 1: Edit Configuration

Edit config.yaml to change:

  • Geographic regions to export
  • Output filename
  • Compression type (snappy, gzip, zstd)

export:
  regions:
    - USA        # Change to your preferred regions
    - Canada

  output:
    filename: "my-custom-export.parquet"
    compression: "zstd"  # or snappy, gzip
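
Before running an export, it can help to sanity-check an edited configuration. The sketch below assumes the YAML has already been parsed into a dictionary (for example with PyYAML's yaml.safe_load) and checks it against the three compression types listed above. The function name validate_export_config and the rule that the filename ends in .parquet are assumptions for illustration, not behavior of export.py.

```python
ALLOWED_COMPRESSION = {"snappy", "gzip", "zstd"}


def validate_export_config(config):
    """Return a list of problems found in a parsed config dictionary."""
    problems = []
    export = config.get("export", {})

    regions = export.get("regions") or []
    if not regions:
        problems.append("export.regions must list at least one region")

    output = export.get("output", {})
    filename = output.get("filename", "")
    # Assumption: exports are expected to carry a .parquet extension.
    if not filename.endswith(".parquet"):
        problems.append("export.output.filename should end with .parquet")

    compression = output.get("compression")
    if compression not in ALLOWED_COMPRESSION:
        problems.append(f"unsupported compression: {compression!r}")

    return problems


# Same structure as the YAML example, written as a Python dict.
example = {
    "export": {
        "regions": ["USA", "Canada"],
        "output": {"filename": "my-custom-export.parquet", "compression": "zstd"},
    }
}
print(validate_export_config(example))  # prints []
```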

Option 2: Custom SQL Schema

Edit schema.sql to reshape the data:

-- Example: Filter to sport routes only
SELECT
    uuid AS climb_id,
    name AS climb_name,
    grades.yds AS grade,
    metadata.lat AS latitude,
    metadata.lng AS longitude

FROM climbs
WHERE type.sport = true

Run Your Custom Export

python export.py

Output will be saved to the filename specified in config.yaml.

Example Schemas

The examples/ directory contains ready-to-use schema variations:

Minimal (smallest file)

cp examples/schema-minimal.sql schema.sql
python export.py

Includes only the climb name, grade, and coordinates, which keeps the file as small as possible.

Extended (all metadata)

cp examples/schema-extended.sql schema.sql
python export.py

All available fields including descriptions, multiple grade systems, and full location hierarchy.

USA Sport Routes Only

cp examples/schema-usa-sport-only.sql schema.sql
python export.py

Filtered to just sport climbing routes in the United States.

Data Updates

This export runs weekly (Sundays at midnight UTC). Each release is tagged with its export date, for example v2024-11-19.

To get the latest:

# Download latest release programmatically
curl -s https://api.github.com/repos/OpenBeta/parquet-exporter/releases/latest \
  | grep 'browser_download_url.*\.parquet"' \
  | cut -d : -f 2,3 \
  | tr -d \" \
  | wget -qi -
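
The same download can be sketched in Python using only the standard library. This assumes the release publishes its Parquet files as release assets, which the GitHub releases API returns in an assets list with name and browser_download_url fields. The function names parquet_asset_urls and download_latest are hypothetical.

```python
import json
import os
import urllib.request

LATEST_RELEASE_URL = (
    "https://api.github.com/repos/OpenBeta/parquet-exporter/releases/latest"
)


def parquet_asset_urls(release):
    """Return download URLs of release assets whose names end in .parquet."""
    return [
        asset["browser_download_url"]
        for asset in release.get("assets", [])
        if asset.get("name", "").endswith(".parquet")
    ]


def download_latest(dest_dir="."):
    """Fetch the latest release metadata and download its Parquet assets."""
    with urllib.request.urlopen(LATEST_RELEASE_URL) as response:
        release = json.load(response)
    saved = []
    for url in parquet_asset_urls(release):
        # Use the last path segment of the URL as the local filename.
        target = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
        urllib.request.urlretrieve(url, target)
        saved.append(target)
    return saved
```

Filtering on the asset name rather than the URL matters here: the repository name parquet-exporter appears in every download URL, so a plain substring match on "parquet" would select every asset. Calling download_latest() saves the files into the current directory and returns their paths.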

Data Source

All data comes from the OpenBeta GraphQL API.

OpenBeta is a free, crowd-sourced climbing route database. Learn more at openbeta.io.

License

You can use this data for any purpose, including commercial applications, without restriction.

Contributing

PRs welcome! Especially for:

  • Additional example schemas
  • Performance improvements
  • Better error handling
  • Documentation improvements

Built with ❤️ by the OpenBeta community
