Shared audio preprocessing utilities for the Elephant Listening Project (ELP).
This package provides a single, consistent audio processing pipeline used across:
- Elephant rumble detection (CNN + RNN)
- Gunshot detection (CNN)
- Real-time deployment (Jetson Nano)
The goal is to remove duplicated preprocessing code, ensure consistency between training and deployment, and support efficient real-time inference.
This repository is intended to be used as a shared submodule by training and deployment repositories, not as a standalone application.
Design principles:
- Single source of truth for audio preprocessing
- Framework-agnostic by default (NumPy + SciPy)
- Readable and explicit (config-driven, no hidden behavior)
- Backward-compatible with existing trained models
What this package provides:
- Sample-rate standardization (4 kHz target)
- Fixed-length clipping (pad / trim)
- Time-domain filtering (e.g. lowpass for rumbles)
- Spectrogram computation (STFT + log scaling)
- Frequency slicing / masking
- Optional normalization
- Incremental STFT + rolling cache for real-time inference
Out of scope:
- WAV file I/O (kept repo-specific to preserve model parity)
- Model training or evaluation
- Dataset-level normalization or TFRecord sharding
Gunshot detection (CNN):
- 4 s clips @ 4 kHz
- Log-magnitude spectrogram
- No time-domain filtering by default
Elephant rumble detection (CNN):
- 5 s clips @ 4 kHz
- Log-magnitude spectrogram
- Lowpass filtered at 200 Hz
- Frequencies above 200 Hz discarded
Elephant rumble detection (RNN):
- 5 s clips @ 4 kHz
- Raw waveform input
- Lowpass filtered at 200 Hz
- No spectrogram computation
RNN models are used only for elephant rumble detection.
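The spectrogram and waveform configurations listed above can be sketched with plain NumPy and SciPy. This is a minimal illustration, not the package's implementation: the function names, the STFT window (`n_fft=256`), hop (`hop=64`), filter order, and log epsilon are all assumptions chosen for the example. The real values come from the package's configs and must match what the trained models expect.

```python
from math import gcd

import numpy as np
from scipy import signal

TARGET_SR = 4000  # Hz, the target sample rate stated above


def resample_to_target(audio, sr, target_sr=TARGET_SR):
    """Resample a 1-D waveform to target_sr with a polyphase filter."""
    if sr == target_sr:
        return np.asarray(audio, dtype=np.float32)
    g = gcd(int(sr), int(target_sr))
    out = signal.resample_poly(audio, int(target_sr) // g, int(sr) // g)
    return out.astype(np.float32)


def fix_length(audio, n_samples):
    """Trim, or zero-pad at the end, to exactly n_samples."""
    if len(audio) >= n_samples:
        return audio[:n_samples]
    return np.pad(audio, (0, n_samples - len(audio)))


def lowpass(audio, sr, cutoff_hz=200.0, order=4):
    """Zero-phase Butterworth lowpass (the order is an assumed value)."""
    sos = signal.butter(order, cutoff_hz, btype="low", fs=sr, output="sos")
    return signal.sosfiltfilt(sos, audio)


def log_spectrogram(audio, sr, n_fft=256, hop=64, max_freq_hz=None, eps=1e-6):
    """Log-magnitude STFT with shape (T, F, 1).

    If max_freq_hz is given, bins above that frequency are discarded.
    """
    freqs, _, stft = signal.stft(
        audio, fs=sr, nperseg=n_fft, noverlap=n_fft - hop,
        boundary=None, padded=False,
    )
    spec = np.log(np.abs(stft) + eps).T  # (T, F)
    if max_freq_hz is not None:
        spec = spec[:, freqs <= max_freq_hz]
    return spec[..., np.newaxis].astype(np.float32)


def rumble_cnn_input(audio, sr):
    """5 s clip, 200 Hz lowpass, log spectrogram limited to <= 200 Hz."""
    x = fix_length(resample_to_target(audio, sr), 5 * TARGET_SR)
    return log_spectrogram(lowpass(x, TARGET_SR), TARGET_SR, max_freq_hz=200.0)


def rumble_rnn_input(audio, sr):
    """5 s clip, 200 Hz lowpass, raw waveform (no spectrogram)."""
    x = fix_length(resample_to_target(audio, sr), 5 * TARGET_SR)
    return lowpass(x, TARGET_SR).astype(np.float32)


def gunshot_cnn_input(audio, sr):
    """4 s clip, no time-domain filtering, full-band log spectrogram."""
    x = fix_length(resample_to_target(audio, sr), 4 * TARGET_SR)
    return log_spectrogram(x, TARGET_SR)
```

Keeping each step as a small pure function makes it straightforward to call the same code from both training and deployment, which is the consistency this package is aiming for.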
For deployment (e.g. Jetson Nano), this package supports incremental STFT:
- Only newly available STFT frames are computed
- Frames are stored in a rolling spectrogram cache
- Overlapping inference windows reuse cached frames
This avoids recomputing spectrograms for overlapping windows and significantly reduces CPU usage in real-time inference.
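The idea behind the incremental STFT can be illustrated with a small rolling cache. The class below is a hypothetical sketch written for this README, not the package's `StreamingSpecPipeline`; the class name, the Hann window, and the default `n_fft`, `hop`, and `clip_frames` values are assumptions.

```python
from collections import deque

import numpy as np


class RollingSpecCache:
    """Hypothetical incremental STFT with a fixed-size frame cache."""

    def __init__(self, n_fft=256, hop=64, clip_frames=309, eps=1e-6):
        self.n_fft = n_fft
        self.hop = hop
        self.eps = eps
        self.clip_frames = clip_frames
        self.window = np.hanning(n_fft)
        # Samples that have not yet been fully consumed by a frame.
        self.pending = np.zeros(0, dtype=np.float32)
        # Rolling cache: the oldest frame is dropped once the cache is full.
        self.frames = deque(maxlen=clip_frames)

    def append_audio(self, chunk):
        """Add samples and compute only the frames that became available."""
        chunk = np.asarray(chunk, dtype=np.float32)
        self.pending = np.concatenate([self.pending, chunk])
        n_new = 0
        while len(self.pending) >= self.n_fft:
            frame = self.pending[: self.n_fft] * self.window
            magnitude = np.abs(np.fft.rfft(frame))
            self.frames.append(np.log(magnitude + self.eps))
            # Advance by one hop; the remaining overlap is reused next time.
            self.pending = self.pending[self.hop:]
            n_new += 1
        return n_new

    def ready(self):
        """True once enough frames are cached for one full clip."""
        return len(self.frames) == self.clip_frames

    def latest_clip(self):
        """Return the cached frames as a (T, F, 1) array."""
        return np.stack(list(self.frames))[..., np.newaxis]
```

Each call to `append_audio` costs one FFT per new frame, regardless of how much the inference windows overlap, because earlier frames are read back from the cache instead of being recomputed.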
Minimal install:
pip install -e .
With SciPy DSP support:
pip install -e ".[scipy]"
With TensorFlow (legacy parity / training):
pip install -e ".[scipy,tf]"
For development:
pip install -e ".[scipy,dev]"
Example 1:
from audio_processing.pipelines import RumblePipeline
pipe = RumblePipeline()
spec = pipe.extract_from_audio(audio, sr)  # (T, F, 1)
Example 2:
from audio_processing.streaming import StreamingSpecPipeline
from audio_processing.configs import rumble_default_config
stream = StreamingSpecPipeline(rumble_default_config())
stream.append_audio(chunk, sr)
if stream.ready_for_clip():
    spec = stream.get_latest_clip_spec()