yeyun11/entropy-experiment

This README is generated by AI without any manual check (this line is by the author)

Entropy Fitting Project

A PyTorch-based project for training neural networks to predict entropy values from data. It includes both synthetic data generation and real-world image processing using the CIFAR-100 dataset.

Features

  • Entropy Calculation: Custom histogram-based entropy computation
  • Synthetic Data Generation: Generate datasets with controlled entropy characteristics
  • Neural Network Models: MLP and Transformer architectures for entropy prediction
  • CIFAR-100 Integration: Process and analyze real image data entropy
  • Training Pipelines: Complete training and evaluation workflows
  • Visualization: Comprehensive plotting and result analysis

Project Structure

entropy-fitting/
├── entcal.py              # Core entropy calculation functions
├── entdata.py            # Synthetic data generation and dataset classes
├── train-synth.py        # Training on synthetic data
├── train-cifar.py        # Training on CIFAR-100 images
├── c100-ent-dist.py      # CIFAR-100 entropy distribution analysis
├── compare-entcal.py     # Entropy calculation comparison
├── best_entropy_model.pth # Trained model checkpoint
└── cifar-100/           # CIFAR-100 dataset directory

Installation

  1. Clone the repository:
git clone <repository-url>
cd entropy-fitting
  2. Install dependencies:
pip install torch torchvision numpy scipy matplotlib tqdm scikit-image
  3. Download the CIFAR-100 dataset (it is also downloaded automatically when running the training scripts):
python -c "from torchvision import datasets; datasets.CIFAR100(root='./cifar-100', train=True, download=True)"

Usage

Synthetic Data Training

Train on generated synthetic data with controlled entropy:

python train-synth.py

Parameters:

  • num_samples: Number of training samples (default: 10000)
  • dim: Dimension of each sample (default: 128)
  • num_bins: Number of bins for entropy calculation (default: 16)
  • model_type: 'mlp' or 'transformer' (default: 'mlp')

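The pieces above can be tied together in a minimal training loop. This is a sketch using synthetic Gaussian data and a small MLP, not the actual code of train-synth.py; the defaults below just mirror the documented parameters:

```python
import torch
import torch.nn as nn

# Hypothetical defaults mirroring the script's documented parameters
num_samples, dim, num_bins = 1000, 128, 16

# Synthetic data: rows of Gaussian noise, labeled by their histogram entropy
x = torch.randn(num_samples, dim)
labels = []
for row in x:
    hist = torch.histc(row, bins=num_bins, min=-3.0, max=3.0)
    p = hist / hist.sum()
    p = p[p > 0]                          # drop empty bins (0 * log 0 := 0)
    labels.append(-(p * p.log()).sum())
y = torch.stack(labels)

# A small MLP regressor, trained with MSE on the entropy labels
model = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(5):
    pred = model(x).squeeze(1)
    loss = nn.functional.mse_loss(pred, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"final MSE: {loss.item():.4f}")
```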
CIFAR-100 Training

Train on CIFAR-100 images or their noise components:

python train-cifar.py

Parameters:

  • crop_size: Image crop size (default: 24)
  • num_bins: Number of bins for entropy calculation (default: 16)
  • use_noise: Train on noise components instead of images (default: False)
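To make the crop_size and num_bins parameters concrete, here is a standalone sketch of computing the histogram entropy of a central image crop. The binning and normalization here are assumptions for illustration; train-cifar.py may handle them differently:

```python
import numpy as np

def crop_entropy(img, crop_size=24, num_bins=16):
    """Shannon entropy (nats) of the intensity histogram of a central crop."""
    h, w = img.shape[:2]
    top, left = (h - crop_size) // 2, (w - crop_size) // 2
    crop = img[top:top + crop_size, left:left + crop_size]
    hist, _ = np.histogram(crop, bins=num_bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins (0 * log 0 := 0)
    return float(-(p * np.log(p)).sum())

img = np.random.rand(32, 32)          # stand-in for a CIFAR-100 image in [0, 1]
print(f"crop entropy: {crop_entropy(img):.4f}")
```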

Entropy Distribution Analysis

Analyze entropy distribution of CIFAR-100 dataset:

python c100-ent-dist.py

Entropy Calculation

Test the entropy calculation implementation:

python compare-entcal.py

Core Components

Entropy Calculation (entcal.py)

  • batch_histc(): Batch histogram calculation with adaptive binning
  • calculate_entropy(): Compute entropy from data using histogram method
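A minimal sketch of the histogram-based approach these functions implement. This is not the repository's exact code; entcal.py's batch_histc() and calculate_entropy() may differ in binning and edge handling:

```python
import torch

def histogram_entropy(x, min_value=-3.0, max_value=3.0, num_bins=16):
    """Shannon entropy (nats) of each row's histogram, for a (B, D) batch."""
    x = x.clamp(min_value, max_value)
    edges = torch.linspace(min_value, max_value, num_bins + 1)
    # assign each value a bin index in [0, num_bins), then count per row
    idx = torch.bucketize(x, edges[1:-1])
    counts = torch.zeros(x.shape[0], num_bins)
    counts.scatter_add_(1, idx, torch.ones_like(x))
    p = counts / counts.sum(dim=1, keepdim=True)
    logp = torch.where(p > 0, p.log(), torch.zeros_like(p))
    return -(p * logp).sum(dim=1)

batch = torch.randn(4, 128)
print(histogram_entropy(batch))        # one entropy value per row
```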

Data Generation (entdata.py)

  • BaseEntropyDataset: Base class for entropy datasets
  • EntropyDataset: Finite dataset with pre-generated samples
  • OnTheFlyEntropyDataset: Infinite dataset with on-the-fly generation
  • DataLoader utilities for easy integration
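The on-the-fly idea can be sketched with a torch IterableDataset. This is a hypothetical stand-in for OnTheFlyEntropyDataset, assuming samples are Gaussian noise with a random scale, labeled by their histogram entropy:

```python
import torch
from torch.utils.data import IterableDataset, DataLoader

class OnTheFlyGaussianEntropy(IterableDataset):
    """Yields (sample, entropy) pairs generated on demand — an infinite stream."""
    def __init__(self, dim=128, num_bins=16):
        self.dim, self.num_bins = dim, num_bins

    def __iter__(self):
        while True:
            # Gaussian noise with a random scale gives varied entropy labels
            x = torch.randn(self.dim) * torch.empty(1).uniform_(0.2, 2.0)
            hist = torch.histc(x, bins=self.num_bins, min=-3.0, max=3.0)
            p = hist / hist.sum()
            p = p[p > 0]
            yield x, -(p * p.log()).sum()

loader = DataLoader(OnTheFlyGaussianEntropy(), batch_size=32)
xb, yb = next(iter(loader))
print(xb.shape, yb.shape)  # torch.Size([32, 128]) torch.Size([32])
```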

Model Architectures

MLP Model: Multi-layer perceptron for synthetic data

def get_mlp(input_dim=128, hidden_dims=[256, 256, 256])

CNN Model: Convolutional network for image data

def get_cnn(hidden_channels=[16, 16, "A2", 32, 32, "A2", 64, 64, "A2", 128, 128, "AA"])
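The string tokens in hidden_channels presumably encode pooling stages. The following sketch is one plausible reading, assuming integers are Conv3x3+ReLU blocks, "A2" is 2x2 average pooling, and "AA" is global (adaptive) average pooling; the repository's actual get_cnn() may differ:

```python
import torch
import torch.nn as nn

def build_cnn(hidden_channels, in_channels=3):
    """Assumed spec: int -> Conv3x3+ReLU, "A2" -> AvgPool2d(2),
    "AA" -> AdaptiveAvgPool2d(1). Ends with a scalar regression head."""
    layers, c = [], in_channels
    for spec in hidden_channels:
        if spec == "A2":
            layers.append(nn.AvgPool2d(2))
        elif spec == "AA":
            layers.append(nn.AdaptiveAvgPool2d(1))
        else:
            layers += [nn.Conv2d(c, spec, 3, padding=1), nn.ReLU()]
            c = spec
    layers += [nn.Flatten(), nn.Linear(c, 1)]
    return nn.Sequential(*layers)

net = build_cnn([16, 16, "A2", 32, 32, "A2", 64, 64, "A2", 128, 128, "AA"])
out = net(torch.zeros(2, 3, 24, 24))   # batch of two 24x24 RGB crops
print(out.shape)  # torch.Size([2, 1])
```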

Training Results

The project includes comprehensive evaluation metrics:

  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
  • Training and test loss curves
  • Prediction vs true value scatter plots
  • Residual distribution analysis
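The scalar metrics above are standard and can be computed in a few lines; a minimal sketch (the repository's evaluation scripts also produce the plots listed above):

```python
import torch

def regression_metrics(pred, target):
    """MAE, MSE, and RMSE for a batch of scalar predictions."""
    err = pred - target
    mae = err.abs().mean()
    mse = (err ** 2).mean()
    return {"mae": mae.item(), "mse": mse.item(), "rmse": mse.sqrt().item()}

pred = torch.tensor([2.0, 3.0, 4.0])
true = torch.tensor([2.5, 3.0, 3.0])
print(regression_metrics(pred, true))
```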

Examples

Generate and visualize synthetic data:

from entdata import EntropyDataset
import matplotlib.pyplot as plt

dataset = EntropyDataset(num_samples=10000, dim=128, num_bins=16)
plt.plot(dataset.data[0])
plt.title(f"Entropy: {dataset.labels[0].item():.4f}")
plt.show()

Calculate entropy of custom data:

from entcal import calculate_entropy
import torch

data = torch.randn(1, 100)  # One sample of 100 values (shape: batch x dim)
entropy = calculate_entropy(data, min_value=-3, max_value=3, num_bins=16)
print(f"Entropy: {entropy.item():.4f}")

Configuration

Key parameters can be adjusted in the script files:

  • Data Generation: dim, num_bins, num_samples
  • Training: batch_size, learning_rate, num_epochs
  • Model: Hidden layer sizes, number of heads/layers
  • Evaluation: Test ratio, random seed

Dependencies

  • Python 3.7+
  • PyTorch 1.8+
  • torchvision
  • NumPy
  • SciPy
  • Matplotlib
  • scikit-image
  • tqdm

License

This project is for research and educational purposes.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

Citation

If you use this code in your research, please cite:

@software{entropy_fitting,
  title = {Entropy Fitting Project},
  author = {Your Name},
  year = {2025},
  url = {https://github.com/your-username/entropy-fitting}
}

Support

For questions and support, please open an issue on GitHub or contact the maintainers.

About

A toy example for fitting the entropy of noise.
