A cross-platform terminal uploader for data.welllabs.org.
This repository contains a single Python script that helps WELL Labs Data Platform users upload datasets and files from Windows, macOS, or Linux. It follows the same upload flow as the web application, while adding a terminal-friendly workflow for metadata, API-key setup, and pre-upload sensitivity checks.
- Prompts for WELL_API_KEY if it is not already set.
- Optionally saves the API key for future terminal sessions.
- Provides a local Tkinter UI for users who prefer forms over terminal prompts.
- Prompts for WELL Labs login credentials locally.
- Creates a logged-in upload session.
- Creates new datasets.
- Adds files to existing datasets.
- Replaces datasets by uploading a new dataset first, then asking whether to delete the old one.
- Deletes datasets only after explicit confirmation.
- Supports reviewed metadata JSON files so metadata can be prepared before upload.
- Runs a pre-upload scan for sensitive or traceability fields.
- Offers an interactive sanitization step to drop or anonymise selected columns or JSON keys.
- Checks the upload size limit and can ZIP large files before upload.
The script does not contain credentials, API keys, or organization-specific data.
- Python 3.10 or newer.
- A WELL Labs Data Platform account.
- A WELL Labs Data Platform API key.
- Permission to upload to data.welllabs.org.
No external Python packages are required. The script uses only the Python standard library.
Download or clone this repository, then run the script directly.
git clone https://github.com/WhatsThisClint/welllabs-data-platform-uploader.git
cd welllabs-data-platform-uploader
On Windows, use PowerShell. On macOS or Linux, use Terminal.
Use the launcher when you want to choose UI or terminal at the beginning.
Windows:
python .\welllabs_uploader.py
macOS or Linux:
python3 welllabs_uploader.py
It asks:
1. Local UI
2. Terminal
You can also skip the question:
python3 welllabs_uploader.py --ui
python3 welllabs_uploader.py --terminal --mode new-dataset --file "/path/to/your-file.csv"
Open the local uploader UI:
Windows:
python .\welllabs_uploader_ui.py
macOS or Linux:
python3 welllabs_uploader_ui.py
The UI runs on your computer. It lets you choose a file, fill metadata in forms, scan for sensitive fields, drop or anonymise supported columns or JSON keys, save or load metadata JSON, ZIP files that exceed the platform upload limit, and upload using your own API key and WELL Labs login.
The UI does not save passwords. If you choose to save the API key, it uses the same WELL_API_KEY behavior as the terminal uploader.
Create a new dataset and upload one file.
Windows:
python .\welllabs_data_platform_uploader.py --mode new-dataset --file "C:\path\to\your-file.csv"
macOS or Linux:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/your-file.csv"
If the API key is not available, the script asks for it. If you choose to save it, future runs will use WELL_API_KEY.
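The API-key lookup described above could look roughly like the sketch below. It is an illustration only, not the script's actual code: `get_api_key` and its `environ` and `prompt` parameters are hypothetical names, and only the `WELL_API_KEY` variable comes from this README.

```python
import os
from getpass import getpass


def get_api_key(environ=os.environ, prompt=getpass):
    """Return the key from WELL_API_KEY, prompting only if it is unset.

    Hypothetical helper; the real uploader may be structured differently.
    """
    key = environ.get("WELL_API_KEY", "").strip()
    if key:
        return key
    # getpass hides the typed key so it is not echoed to the terminal.
    key = prompt("WELL Labs API key: ").strip()
    if not key:
        raise SystemExit("An API key is required to upload.")
    return key
```

Passing `environ` and `prompt` as parameters keeps the helper easy to exercise without touching the real environment or terminal.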
Add a file to an existing dataset:
python3 welllabs_data_platform_uploader.py --mode add-file --dataset-id DATASET_UUID --file "/path/to/your-file.csv"
Prepare metadata in advance and pass it to the uploader:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/your-file.csv" --metadata "./metadata.example.json"
When --metadata is used, the script uses the JSON values and prompts only for missing required values, API-key setup, and login.
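To illustrate the "prompt only for missing required values" behaviour, here is a small sketch. Which fields the platform treats as required is not stated in this README, so `REQUIRED_FIELDS` is an assumption, and `load_metadata` and `missing_required` are hypothetical helper names.

```python
import json

# Assumption: the actual set of required fields is not documented here.
REQUIRED_FIELDS = ("title", "description", "file_title")


def load_metadata(path):
    """Read a reviewed metadata JSON file into a dict."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError("metadata JSON must be an object")
    return data


def missing_required(metadata, required=REQUIRED_FIELDS):
    """Return the required field names that are absent or blank."""
    return [name for name in required if not str(metadata.get(name) or "").strip()]
```

Only the names returned by `missing_required` would then need an interactive prompt.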
Replace mode uploads a new dataset first. Only after the new upload succeeds does the script ask whether to delete the old dataset.
python3 welllabs_data_platform_uploader.py --mode replace-dataset --dataset-id OLD_DATASET_UUID --file "/path/to/new-file.csv" --metadata "./metadata.example.json"
Delete mode shows the dataset details and requires typing DELETE before it deletes anything.
python3 welllabs_data_platform_uploader.py --mode delete-dataset --dataset-id DATASET_UUID
Before uploading, the script scans supported file types for sensitive or traceability fields.
It checks for likely:
- beneficiary or household fields
- farmer, respondent, or person fields
- phone, contact, email, address, Aadhaar, or identity fields
- owner fields
- crop fields that may contribute to household traceability
- latitude, longitude, GPS, or coordinate fields
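A name-based check over these categories might look like the sketch below. The keyword list is illustrative and drawn from the categories above; the script's real patterns are not shown in this README, and `flag_columns` and `flag_csv_header` are hypothetical names.

```python
import csv
import re

# Illustrative keywords only; not the uploader's actual pattern list.
SENSITIVE_NAME = re.compile(
    r"beneficiary|household|farmer|respondent|person|phone|contact|email|"
    r"address|aadhaar|identity|owner|crop|latitude|longitude|gps|coord",
    re.IGNORECASE,
)


def flag_columns(names):
    """Return the column names that contain a sensitive-looking keyword."""
    return [name for name in names if SENSITIVE_NAME.search(name)]


def flag_csv_header(path, delimiter=","):
    """Check only the header row of a CSV (or TSV, with delimiter='\t')."""
    with open(path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle, delimiter=delimiter), [])
    return flag_columns(header)
```

Substring matching like this errs on the side of false positives, which suits a check whose findings are reviewed by a person before upload.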
Supported scans:
- GeoPackage: .gpkg
- CSV: .csv
- TSV: .tsv
- Excel workbook: .xlsx
- JSON and GeoJSON: .json, .geojson
For GeoPackage files, the scan checks all non-system tables and layers. It checks column names across the whole package and samples text values for email, phone, and identity-like patterns.
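Because a GeoPackage is an SQLite database, its column names can be listed with the standard `sqlite3` module. The sketch below assumes the tables of interest are the ones registered in `gpkg_contents`; the uploader itself may enumerate non-system tables differently, and `gpkg_columns` is a hypothetical name.

```python
import sqlite3


def gpkg_columns(path):
    """Map each table registered in gpkg_contents to its column names."""
    conn = sqlite3.connect(path)
    try:
        tables = [row[0] for row in conn.execute("SELECT table_name FROM gpkg_contents")]
        columns = {}
        for table in tables:
            # Identifiers cannot be bound as parameters, so quote them.
            quoted = '"' + table.replace('"', '""') + '"'
            info = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
            columns[table] = [row[1] for row in info]
        return columns
    finally:
        conn.close()
```

The resulting column names can then be passed through whatever name-based check is in use, one table at a time.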
If possible sensitive fields are found, the script prints the findings and asks whether to continue.
It also offers to create a sanitized copy before upload. The original file is not modified. In the sanitizer, you can:
- drop selected columns or JSON keys
- anonymise selected columns or JSON keys with one-way stable tokens
- write the prepared copy to a separate folder with --prepared-output-dir
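One common way to produce one-way stable tokens is a keyed hash. The sketch below uses HMAC-SHA256, which is an assumption: this README does not say which scheme the uploader uses. `stable_token`, `anonymise_rows`, and the `secret` parameter are hypothetical names.

```python
import hashlib
import hmac


def stable_token(value, secret, length=16):
    """Return a one-way token that is identical for identical inputs."""
    digest = hmac.new(secret, str(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def anonymise_rows(rows, columns, secret):
    """Yield copies of dict rows with the chosen columns tokenised."""
    for row in rows:
        yield {
            # Empty cells are left empty rather than tokenised.
            key: stable_token(val, secret) if key in columns and val not in ("", None) else val
            for key, val in row.items()
        }
```

Because equal inputs map to equal tokens, rows can still be grouped or joined on the anonymised column. The secret must be kept private, since anyone holding it could test guesses against the tokens.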
Manual sanitization currently supports:
- GeoPackage: .gpkg
- CSV: .csv
- TSV: .tsv
- JSON and GeoJSON: .json, .geojson
Run the sanitizer even when the automatic scan does not flag anything:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/file.gpkg" --metadata "./metadata.example.json" --sanitize-before-upload
Write sanitized or compressed copies to a chosen folder:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/file.gpkg" --prepared-output-dir "./prepared_uploads"
Stop immediately on possible sensitive fields:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/file.gpkg" --metadata "./metadata.example.json" --fail-on-sensitive
Skip the scan only after separate review:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/file.gpkg" --metadata "./metadata.example.json" --skip-sensitivity-check
The current platform object upload limit is 5 GiB. Before upload, the script checks the final file size. If the file is larger than the limit, it asks whether to create a ZIP archive and upload that ZIP instead.
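The size check and ZIP step can be sketched with the standard library as follows. The 5 GiB figure comes from this README; `exceeds_limit`, `zip_file`, and the output naming are hypothetical and may not match the uploader.

```python
import os
import zipfile

UPLOAD_LIMIT_BYTES = 5 * 1024**3  # 5 GiB, the limit stated above


def exceeds_limit(path, limit=UPLOAD_LIMIT_BYTES):
    """Return True if the file is larger than the upload limit."""
    return os.path.getsize(path) > limit


def zip_file(path, output_dir):
    """Write <name>.zip into output_dir and return the archive path."""
    os.makedirs(output_dir, exist_ok=True)
    target = os.path.join(output_dir, os.path.basename(path) + ".zip")
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname stores only the file name, not the full local path.
        archive.write(path, arcname=os.path.basename(path))
    return target
```

Compression does not guarantee the archive fits, so it is worth calling `exceeds_limit` again on the ZIP before uploading.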
Automatically compress large files without prompting:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/large-file.gpkg" --metadata "./metadata.example.json" --compress-large-files
Fail instead of compressing:
python3 welllabs_data_platform_uploader.py --mode new-dataset --file "/path/to/large-file.gpkg" --no-compress-large-files
If a file is compressed, the uploader sets the uploaded file format metadata to zip and adds a note that the ZIP should be extracted before use.
Dataset metadata:
- title
- description
- tags
- private
File metadata:
- file_title
- file_description
- file_format
- provenance
- source
- cite_as
- permissions
- groundtruthed
- duration
- value
- temporal_from
- temporal_to
- geography
The data platform rejects some special characters in text fields. The uploader removes:
< > " ' & ; ( )
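Removing these characters from a text field can be done with `str.translate`. This is a minimal sketch; `clean_text` is a hypothetical name, and whether the uploader also collapses the leftover whitespace is not stated here.

```python
# The eight characters listed above.
REJECTED_CHARS = "<>\"'&;()"
_STRIP_TABLE = str.maketrans("", "", REJECTED_CHARS)


def clean_text(value):
    """Return value with the rejected characters removed."""
    return value.translate(_STRIP_TABLE)
```

For example, `clean_text('Rain & "Wells" (2023);')` drops the ampersand, quotes, parentheses, and semicolon while leaving the surrounding spaces in place.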
- Do not commit API keys, passwords, cookies, transcripts, or uploaded data.
- Do not paste API keys into chat, screenshots, public issues, or README files.
- Rotate your API key if it is accidentally exposed.
- The uploader does not save passwords.
- The sensitivity scan is a safety layer, not a final release decision.
- GeoPackage and GeoJSON geometries are retained unless you separately generalize or remove them. Dropping latitude and longitude attribute columns does not spatially anonymize the geometry itself.
Install Python 3 from:
https://www.python.org/downloads/
On Windows, make sure python.exe is added to PATH.
Check that:
- Your email and password work on https://data.welllabs.org.
- Your account is verified.
- Your account has upload permission.
This usually means the login session was not created or the account does not have upload permission.
The platform rejected the uploaded object because it exceeded the 5 GiB limit. Re-run the upload and choose compression when prompted, or pass --compress-large-files.
Review the findings before uploading. The uploader can help create a sanitized copy for supported formats, but the file may still need to be kept private, aggregated, or withheld.