Audio Processing

Shared audio preprocessing utilities for the Elephant Listening Project (ELP).

This package provides a single, consistent audio processing pipeline used across:

  • Elephant rumble detection (CNN + RNN)
  • Gunshot detection (CNN)
  • Real-time deployment (Jetson Nano)

The goal is to remove duplicated preprocessing code, ensure consistency between training and deployment, and support efficient real-time inference.

This repository is intended to be used as a shared submodule by training and deployment repositories, not as a standalone application.

Design Principles

  • Single source of truth for audio preprocessing
  • Framework-agnostic by default (NumPy + SciPy)
  • Readable and explicit (config-driven, no hidden behavior)
  • Backward-compatible with existing trained models

What This Package Handles

  • Sample-rate standardization (4 kHz target)
  • Fixed-length clipping (pad / trim)
  • Time-domain filtering (e.g. lowpass for rumbles)
  • Spectrogram computation (STFT + log scaling)
  • Frequency slicing / masking
  • Optional normalization
  • Incremental STFT + rolling cache for real-time inference
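
The steps above can be sketched end to end as a single function. This is an illustrative sketch using NumPy + SciPy, not the package's actual API; `preprocess_clip` and its parameters are hypothetical names chosen to mirror the list above.

```python
import numpy as np
from scipy import signal

def preprocess_clip(audio, sr, target_sr=4000, clip_s=5.0,
                    lowpass_hz=200.0, n_fft=256, hop=128):
    # Sample-rate standardization to the 4 kHz target
    if sr != target_sr:
        audio = signal.resample_poly(audio, target_sr, sr)
    # Fixed-length clipping: pad with zeros or trim
    n = int(clip_s * target_sr)
    if len(audio) < n:
        audio = np.pad(audio, (0, n - len(audio)))
    else:
        audio = audio[:n]
    # Time-domain lowpass filtering (e.g. 200 Hz for rumbles)
    sos = signal.butter(4, lowpass_hz, btype="low", fs=target_sr, output="sos")
    audio = signal.sosfiltfilt(sos, audio)
    # Spectrogram computation: STFT + log scaling
    f, t, Z = signal.stft(audio, fs=target_sr, nperseg=n_fft,
                          noverlap=n_fft - hop)
    spec = np.log1p(np.abs(Z))
    # Frequency slicing: keep only bins at or below the cutoff
    return spec[f <= lowpass_hz, :]
```

Keeping every stage in one config-driven function is what makes training/deployment parity checkable: the same code path runs in both places.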

What It Does Not Handle

  • WAV file I/O (kept repo-specific to preserve model parity)
  • Model training or evaluation
  • Dataset-level normalization or TFRecord sharding

Pipelines

Gunshot CNN

  • 4 s clips @ 4 kHz
  • Log-magnitude spectrogram
  • No time-domain filtering by default

Rumble CNN

  • 5 s clips @ 4 kHz
  • Log-magnitude spectrogram
  • Lowpass filtered at 200 Hz
  • Frequencies above 200 Hz discarded

Rumble RNN (Waveform)

  • 5 s clips @ 4 kHz
  • Raw waveform input
  • Lowpass filtered at 200 Hz
  • No spectrogram computation

RNN models are used only for elephant rumble detection.
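
The three pipelines differ only in a few parameters, so they can be expressed as variations of one config. The dataclass and field names below are illustrative assumptions, not the package's real `configs` module:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PipelineConfig:
    sample_rate: int = 4000              # Hz
    clip_seconds: float = 5.0
    lowpass_hz: Optional[float] = None   # None = no time-domain filtering
    max_freq_hz: Optional[float] = None  # None = keep all frequency bins
    spectrogram: bool = True             # False = raw-waveform (RNN) input

GUNSHOT_CNN = PipelineConfig(clip_seconds=4.0)
RUMBLE_CNN = PipelineConfig(lowpass_hz=200.0, max_freq_hz=200.0)
RUMBLE_RNN = PipelineConfig(lowpass_hz=200.0, spectrogram=False)
```

Encoding the differences as data rather than separate code paths is what keeps the preprocessing "config-driven, no hidden behavior".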


Streaming / Real-Time Processing

For deployment (e.g. Jetson Nano), this package supports incremental STFT:

  • Only newly available STFT frames are computed
  • Frames are stored in a rolling spectrogram cache
  • Overlapping inference windows reuse cached frames

This avoids recomputing spectrograms for overlapping windows and significantly reduces CPU usage in real-time inference.
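
A minimal sketch of this idea, assuming a hop-based frame cache (the class and method names are hypothetical, not the actual `StreamingSpecPipeline` API): incoming audio is buffered, each complete hop yields exactly one new STFT frame, and frames live in a bounded rolling cache that overlapping windows read from.

```python
from collections import deque
import numpy as np

class RollingSTFT:
    def __init__(self, n_fft=256, hop=128, max_frames=160):
        self.n_fft, self.hop = n_fft, hop
        self.window = np.hanning(n_fft)
        self.buffer = np.zeros(0)
        self.frames = deque(maxlen=max_frames)  # rolling spectrogram cache

    def append_audio(self, chunk):
        self.buffer = np.concatenate([self.buffer, chunk])
        # Compute only the frames that the new samples complete
        while len(self.buffer) >= self.n_fft:
            frame = self.buffer[:self.n_fft] * self.window
            self.frames.append(np.log1p(np.abs(np.fft.rfft(frame))))
            self.buffer = self.buffer[self.hop:]  # advance by one hop

    def latest(self, n_frames):
        # Overlapping inference windows reuse cached frames
        if len(self.frames) < n_frames:
            return None
        return np.stack(list(self.frames)[-n_frames:])
```

With a 50 % overlap between consecutive inference windows, roughly half of each window's frames come straight from the cache instead of being recomputed, which is where the CPU savings on the Jetson Nano come from.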


Installation

  • Minimal install: pip install -e .
  • With SciPy DSP support: pip install -e ".[scipy]"
  • With TensorFlow (legacy parity / training): pip install -e ".[scipy,tf]"
  • For development: pip install -e ".[scipy,dev]"


Usage

Example 1: batch processing a clip

from audio_processing.pipelines import RumblePipeline

pipe = RumblePipeline()
spec = pipe.extract_from_audio(audio, sr)  # (T, F, 1)

Example 2: streaming inference

from audio_processing.streaming import StreamingSpecPipeline
from audio_processing.configs import rumble_default_config

stream = StreamingSpecPipeline(rumble_default_config())
stream.append_audio(chunk, sr)

if stream.ready_for_clip():
    spec = stream.get_latest_clip_spec()
