A CNN-based detector for African forest elephant rumble vocalizations, built for the Elephant Listening Project at Cornell. Audio clips are converted to spectrograms and classified as rumble / non-rumble. Training runs locally or on the SDSC Expanse ACCESS GPU supercomputer.
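The spectrogram front end can be sketched in a few lines of NumPy. This is an illustration only; the window length, hop size, and sample rate below are placeholder values, not the repo's actual preprocessing parameters.

```python
import numpy as np

def spectrogram(wave: np.ndarray, n_fft: int = 256, hop: int = 128) -> np.ndarray:
    """Magnitude spectrogram via a Hann-windowed short-time FFT."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack([wave[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequency bins
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, time_frames)

# One second of a 40 Hz tone at a 1 kHz sample rate (placeholder values)
sr = 1000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 40 * t))
```

Elephant rumbles concentrate their energy at very low frequencies, so the informative content sits in the lowest rows of such a spectrogram.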
Python baseline: This repo targets Python 3.10, the version proven on Expanse for rumble model training. Dependencies are managed exclusively through `pyproject.toml` — there is no `requirements.txt`.
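For reference, the `[full]` extra used in the setup below might be declared along these lines. This is a hypothetical sketch; the actual `pyproject.toml` in the repo is authoritative for package names and version pins.

```toml
# Hypothetical sketch — see the repo's real pyproject.toml for the
# authoritative dependency list.
[project]
name = "elp-rumble"
requires-python = ">=3.10,<3.11"

[project.optional-dependencies]
full = ["tensorflow==2.15.0", "tensorboard", "jupyter"]
```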
- macOS (primary development platform; untested on Windows)
- Homebrew
- pyenv for managing Python versions
- Access to the Cornell ELP data
```
brew install pyenv
pyenv install 3.10.13
```

```
cd /path/to/your/ElephantListeningProject
git clone <repo https url>
cd ELP-Rumble-Detector
~/.pyenv/versions/3.10.13/bin/python -m venv .venv
source .venv/bin/activate
pip install -e .[full]  # includes TensorFlow, TensorBoard, and Jupyter
```

```
cp .env.example .env
```

Edit `.env` and set `CORNELL_DATA_ROOT` to point to your local copy of the ELP Cornell Data folder. For example:

```
CORNELL_DATA_ROOT="/Users/username/ELP_Cornell_Data"
```
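However the repo itself loads `.env` (its own config code is authoritative), the format is simple enough to parse with the standard library; a minimal sketch:

```python
import tempfile
from pathlib import Path

def load_env(path: Path) -> dict[str, str]:
    """Parse simple KEY="value" lines from a .env file (stdlib-only sketch)."""
    env = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip('"')
    return env

# Demonstrate on a throwaway file so nothing in the repo is touched
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text('CORNELL_DATA_ROOT="/Users/username/ELP_Cornell_Data"\n')
    cfg = load_env(env_file)
```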
Step 1 creates shared, version-controlled artifacts. Do NOT rerun it unless the team agrees to change the dataset. In normal use, only run steps 2 and 3.
```
create_data_plan.py  ──►  clips_plan.csv + splits/model{1,2,3}.csv  (committed)
cut_wav_clips.py     ──►  data/wav_clips/{pos,neg}/...              (local)
create_tfrecords.py  ──►  data/tfrecords/...                        (local)
```
1. Data plan (committed — do not rerun casually)

   ```
   python -m elp_rumble.data_creation.create_data_plan
   ```

2. Cut clips (safe to run)

   ```
   python -m elp_rumble.data_creation.cut_wav_clips
   ```

3. TFRecords (safe to run)

   ```
   python -m elp_rumble.data_creation.create_tfrecords
   ```
One run produces output for all three models (model1, model2, model3) under `data/tfrecords/tfrecords_audio/{model}/` and `data/tfrecords/tfrecords_spectrogram/{model}/`.
- Sources: Rumble PNNN and Dzanga folders only (`pnnn1`, `pnnn2`, `dzanga`).
- Neg:pos ratio: 3:1 in every split, enforced by per-split trimming.
- Split assignment: WAV-level grouping (80/10/10 train/val/test) prevents recording-condition leakage.
- Model hierarchy: `model1 ⊂ model2 ⊂ model3` — feasibility → scaled → full dataset.
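One common way to implement WAV-level split assignment is a deterministic hash of the recording name; a hypothetical sketch, not the actual `create_data_plan` logic:

```python
import hashlib

def assign_split(wav_name: str) -> str:
    """Deterministically assign a whole recording to train/val/test (80/10/10).

    Hashing the WAV name (rather than each clip) keeps every clip cut from
    the same recording in a single split, so recording conditions cannot
    leak between train and test.
    """
    bucket = int(hashlib.sha256(wav_name.encode()).hexdigest(), 16) % 10
    if bucket < 8:
        return "train"
    return "val" if bucket == 8 else "test"

splits = [assign_split(f"rec_{i:04d}.wav") for i in range(1000)]
```

Because the assignment depends only on the name, rerunning the plan reproduces the same splits.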
- Expanse project storage access (e.g. `/expanse/lustre/projects/cso100/`)
- Processed data TFRecords uploaded to the project tree
Clone the repo under the shared project root so it co-resides with other Elephant Listening Project material:
```
/expanse/lustre/projects/cso100/<your_username>/ElephantListeningProject/
└── ELP-Rumble-Detector/
```

Use Globus Connect Personal with your SDSC Expanse credentials to transfer files between your local machine and the remote server. Tutorial. Note: In the Globus file manager tab, search for the collection SDSC HPC - Expanse Lustre, then either append the path to point it at your project storage or navigate there via the UI.
To ensure consistency between local and remote environments, use the same relative data folder structure on both systems.
```
ELP-Rumble-Detector/
├── data/
│   ├── tfrecords/
│   │   ├── tfrecords_audio/
│   │   │   ├── model1/
│   │   │   ├── model2/
│   │   │   └── model3/
│   │   └── tfrecords_spectrogram/
│   │       ├── model1/
│   │       ├── model2/
│   │       └── model3/
├── slurm_scripts/
├── src/
└── ...
```
Training runs inside a Singularity container. Build it once on a Linux machine with Apptainer installed:
```
apptainer pull tensorflow-2.15.0-gpu.sif \
    docker://tensorflow/tensorflow:2.15.0-gpu
```

Upload it to Expanse so the file exists at `$PROJECT_ROOT/tensorflow-2.15.0-gpu.sif`, i.e. one level above `ELP-Rumble-Detector/`:

```
rsync -avP tensorflow-2.15.0-gpu.sif \
    <your_username>@login.expanse.sdsc.edu:/expanse/lustre/projects/cso100/<your_username>/ElephantListeningProject/
```
Step 0 — create the SLURM log directory (run once after cloning):
```
mkdir -p slurm_logs
```

SLURM writes job output to `slurm_logs/` and will fail immediately if the directory is missing. This repo ships a `.gitkeep` placeholder so it exists after cloning.
Step 1 — install the package into the container’s Python environment (run once, or after dependency changes):
```
bash slurm_scripts/setup-pythonuserbase.sh
```

This installs the repo as an editable package into a shared user base (`$PROJECT_ROOT/.pythonuserbase`) used by the container. It records a hash of `pyproject.toml` so the training scripts can detect when a reinstall is needed.
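The reinstall check can work along these lines. A sketch only: the stamp filename and location here are hypothetical, not what `setup-pythonuserbase.sh` actually uses.

```python
import hashlib
import tempfile
from pathlib import Path

def needs_reinstall(pyproject: Path, stamp: Path) -> bool:
    """True when pyproject.toml changed since the recorded install."""
    current = hashlib.sha256(pyproject.read_bytes()).hexdigest()
    return not stamp.exists() or stamp.read_text().strip() != current

def record_install(pyproject: Path, stamp: Path) -> None:
    """Write the current hash after a successful editable install."""
    stamp.write_text(hashlib.sha256(pyproject.read_bytes()).hexdigest())

# Demonstrate the lifecycle on throwaway files
with tempfile.TemporaryDirectory() as tmp:
    pyproject = Path(tmp) / "pyproject.toml"
    stamp = Path(tmp) / ".pyproject.sha256"   # hypothetical stamp file
    pyproject.write_text('[project]\nname = "elp-rumble"\n')
    first = needs_reinstall(pyproject, stamp)   # nothing recorded yet
    record_install(pyproject, stamp)
    second = needs_reinstall(pyproject, stamp)  # hash matches
    pyproject.write_text('[project]\nname = "elp-rumble"\nversion = "0.2"\n')
    third = needs_reinstall(pyproject, stamp)   # file changed since install
```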
Step 2 — submit a training job:
Both run-train-gpu-shared.sh and run-train-gpu-debug.sh live in slurm_scripts/. They will fail fast if setup hasn’t been run or if no GPUs are visible inside the container.
Invoke with MODELTYPE (cnn or rnn) and MODEL (model1, model2, or model3). An optional third argument overrides the epoch count.
Examples:
```
# Shared partition — CNN on model3 (default epochs)
sbatch slurm_scripts/run-train-gpu-shared.sh cnn model3

# Debug partition — quick sanity check with 2 epochs
sbatch slurm_scripts/run-train-gpu-debug.sh cnn model1 2

# Shared partition — RNN trainer
sbatch slurm_scripts/run-train-gpu-shared.sh rnn model2
```

Each script prints a usage message and exits if MODELTYPE or MODEL is missing or invalid.
```
squeue -u $USER -l                                      # queue status
sacct -j <job_id> --format=JobID,State,Elapsed,MaxRSS   # job details
cat slurm_logs/<job_name>.o<job_id>.<node>              # output logs
scancel <job_id>                                        # cancel
```

Select a model split via the MODEL environment variable (model1, model2, model3). Results are saved under `runs/{cnn,rnn}/<run_name>/` depending on the trainer.
```
MODEL=model3 python -m elp_rumble.training.train_cnn
```

Override epoch count (useful for quick smoke-tests):

```
MODEL=model1 EPOCHS=2 python -m elp_rumble.training.train_cnn
```

```
MODEL=model3 python -m elp_rumble.training.train_rnn
```

Each completed run produces a directory `runs/{cnn,rnn}/{MODEL}_bs{BS}_lr{LR}_e{EPOCHS}_{TIMESTAMP}/` containing:
| File | Description |
|---|---|
| `params.json` | All hyperparameters, TFRecord paths, class weights |
| `history.csv` | Per-epoch loss, accuracy, precision, recall, AUC (train + val) |
| `best_model.keras` | Best checkpoint (monitored by val AUC) |
| `final_model.keras` | Final trained model (with best-validation weights restored if early stopping is used) |
| `test_metrics.json` | Test-set accuracy, precision, recall, AUC, confusion matrix |
| `test_predictions.csv` | Per-clip: `clip_wav_relpath`, `y_true`, `y_pred`, `y_score` |
| `logs/` | TensorBoard event files |
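Because `test_predictions.csv` carries per-clip labels and predictions, downstream metrics can be recomputed without TensorFlow; for example, a stdlib tally of the confusion matrix:

```python
import csv
import io

def confusion_from_predictions(csv_text: str) -> dict[str, int]:
    """Tally TP/FP/FN/TN from the y_true / y_pred columns."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        yt, yp = int(row["y_true"]), int(row["y_pred"])
        key = ("tp" if yt else "fp") if yp else ("fn" if yt else "tn")
        counts[key] += 1
    return counts

# Tiny made-up sample in the documented column layout
sample = """clip_wav_relpath,y_true,y_pred,y_score
pos/a.wav,1,1,0.97
neg/b.wav,0,0,0.03
pos/c.wav,1,0,0.41
"""
print(confusion_from_predictions(sample))  # {'tp': 1, 'fp': 0, 'fn': 1, 'tn': 1}
```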
Generate publication-quality figures from a completed run:
```
python -m elp_rumble.evaluate_cnn --run_dir runs/cnn/<run_name>
```

By default, figures are saved to `<run_dir>/figures/` as both PDF and PNG at 300 DPI:

- `training_curves.{pdf,png}` - loss + AUC vs. epoch
- `confusion_matrix.{pdf,png}` - counts and percentages
- `roc_curve.{pdf,png}` - ROC with AUC annotated
- `pr_curve.{pdf,png}` - precision-recall with AP annotated
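The AUC annotated on the ROC curve equals the probability that a randomly chosen positive clip outscores a randomly chosen negative one (ties counted half). A stdlib sketch of that equivalence, independent of whatever metrics library `evaluate_cnn` uses internally:

```python
def roc_auc(y_true: list[int], y_score: list[float]) -> float:
    """AUC via its rank interpretation: P(random positive outscores
    a random negative), with ties counted as half a win."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(roc_auc([1, 1, 0, 0], [0.9, 0.6, 0.4, 0.1]))  # 1.0 — perfect separation
```

The quadratic pairwise loop is fine for illustration; production code sorts scores once instead.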
Optional output directory override:
```
python -m elp_rumble.evaluate_cnn --run_dir runs/cnn/<run_name> --output_dir tmp/figures
```

Optional run-local notebook copy (default: false):

```
python -m elp_rumble.evaluate_cnn --run_dir runs/cnn/<run_name> --include_notebook true
```

When notebook inclusion is enabled, a run-local `cnn_results.ipynb` is copied from the template and retargeted to the generated figures. Execute notebook cells manually after generation.
Display-only notebook template:
runs/cnn/cnn_results_template.ipynb
The Legacy/ directory contains earlier training and utility scripts from the 2024–2025 CNN-vs-RNN research phase. These are deprecated and will be removed in a future cleanup. Use the elp_rumble package entrypoints above instead.
- RavenPro / RavenLite — view and annotate audio waveforms and spectrograms
- SDSC Expanse User Guide
- SDSC Basic Skills — Linux, interactive computing, Jupyter on Expanse
- SDSC On-Demand Learning — webinars and educational archive