Transferring Mathematical Structures from Physics to Ecology for Parameter-Efficient Neural Networks
This repository contains the official implementation of our paper on discovering domain-specific activation functions using Genetic Programming and transferring them across scientific domains.
Modern neural networks rely on generic activation functions (ReLU, GELU, SiLU) that ignore the mathematical structure inherent in scientific data. We propose Neuro-Symbolic Activation Discovery, a framework that uses Genetic Programming to extract interpretable mathematical formulas from data and inject them as custom activation functions.
Key Findings:
- **Geometric Transfer:** Activation functions discovered on particle-physics data successfully generalize to ecological classification
- **Efficiency:** 18-21% higher parameter efficiency with 5-6× fewer parameters
- **Interpretability:** Human-readable symbolic formulas as activation functions
| Dataset | Best Model | Accuracy | Params | Efficiency Gain |
|---|---|---|---|---|
| HIGGS | Light ReLU | 71.0% | 4,161 | +21.2% vs Heavy |
| Forest Cover | Hybrid (Transfer) | 82.4% | 5,825 | +18.2% vs Heavy |
| Spambase | Hybrid (Specialist) | 92.0% | 6,017 | +18.0% vs Heavy |
The Transfer Phenomenon: a formula discovered on HIGGS, `mul(cos(x), x)`, transfers to Forest Cover and outperforms ReLU, GELU, and SiLU.
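Wrapped as a PyTorch module, the transferred formula is a drop-in replacement for a standard activation. A minimal sketch (the class name and layer sizes here are illustrative, not the repo's `AutoSymbolicLayer`):

```python
import torch
import torch.nn as nn

class PhysicsActivation(nn.Module):
    """Elementwise x * cos(x): the formula discovered on HIGGS."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.cos(x)

# Used in place of nn.ReLU() in a small MLP
# (54 inputs / 7 classes match Forest Cover; hidden size is illustrative):
mlp = nn.Sequential(
    nn.Linear(54, 64),
    PhysicsActivation(),
    nn.Linear(64, 7),
)
```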
```
NeuroSymbolic_Activation/
├── data/                      # Downloaded datasets (HIGGS.csv, etc.)
├── src/
│   ├── data_loader.py         # Dataset fetching and preprocessing
│   ├── models.py              # AutoSymbolicLayer, Heavy/Light models
│   ├── discovery.py           # Genetic Programming logic (gplearn)
│   ├── train.py               # Training loop and evaluation metrics
│   └── utils.py               # Seeds, device, plotting helpers
├── results/                   # Generated plots (activation_*.png) and CSV results
├── main.py                    # Entry point: orchestrates the full pipeline
├── benchmark_standalone.py    # Single-file script containing all logic
├── requirements.txt           # Python dependencies
└── README.md
```
- Python 3.8+
- PyTorch 1.12+
- CUDA (optional, for GPU acceleration)
```bash
# Clone the repository
git clone https://github.com/ana55e/NeuroSymbolic_Activation.git
cd NeuroSymbolic_Activation

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

For users who want to run the benchmark immediately without managing the `src/` folder structure, we provide a standalone script. This single file (`benchmark_standalone.py`) contains all necessary logic: data loading, GP discovery, training, and evaluation.
1. Save the standalone code provided in the repo as `benchmark_standalone.py`.
2. Run it directly:

```bash
python benchmark_standalone.py
```

Output: this will download data, train models, and save plots/CSVs to the current directory.
For researchers who wish to modify individual components (e.g., change the architecture in `models.py` or the GP function set in `discovery.py`), use the modular entry point.
To reproduce the full benchmark (Table 2 in the paper):
```bash
python main.py
```

This script will:

- Download datasets automatically.
- Discover activation formulas using Genetic Programming.
- Train Heavy and Light models across 3 random seeds.
- Save results to `results/final_efficiency_results.csv`.
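Once the run finishes, the CSV can be summarized per dataset and model with pandas. The column names below are assumptions about the CSV schema; check the header your run actually produces:

```python
import pandas as pd

# Synthetic rows mimicking an assumed schema (dataset, model, seed, accuracy, params);
# for a real run, replace with:
#   df = pd.read_csv("results/final_efficiency_results.csv")
rows = [
    {"dataset": "HIGGS", "model": "Light ReLU", "seed": 42, "accuracy": 0.710, "params": 4161},
    {"dataset": "HIGGS", "model": "Light ReLU", "seed": 43, "accuracy": 0.708, "params": 4161},
]
df = pd.DataFrame(rows)

# Mean and spread of accuracy across seeds, per dataset/model pair
summary = df.groupby(["dataset", "model"])["accuracy"].agg(["mean", "std"])
print(summary)
```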
Check the `results/` folder for:

- `activation_HIGGS.png`: visualization of the discovered physics formula.
- `activation_FOREST_COVER.png`: visualization of the ecology formula.
- `activation_SPAMBASE.png`: visualization of the spam formula.
- `final_efficiency_results.csv`: raw numbers for all experiments.
All experiments use fixed random seeds (42, 43, 44) for reproducibility. Ensure you are using Python 3.8+ so package versions match.
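A typical way to pin every RNG before each repetition looks like the sketch below (the repo's actual helper lives in `src/utils.py` and may differ):

```python
import random
import numpy as np
import torch

def set_seed(seed: int) -> None:
    """Fix Python, NumPy, and PyTorch RNGs for one benchmark repetition."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines

for seed in (42, 43, 44):
    set_seed(seed)
    # ... run one full train/evaluate repetition here ...
```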
If you use this code or find our research helpful, please cite:
```bibtex
@article{hajbi2026neurosymbolic,
  title={Neuro-Symbolic Activation Discovery: Transferring Mathematical Structures from Physics to Ecology for Parameter-Efficient Neural Networks},
  author={Hajbi, Anas},
  journal={arXiv preprint arXiv:2601.10740},
  year={2026}
}
```

For questions or issues, please open a GitHub issue or contact anas.hajbi@um6p.ma.