Add spectrum codec #10
Ease model weight loading.
Add tests for SpectrumCodec
… into add-spectrum-tokenizer
Pull Request Overview
This PR implements a new autoencoder-based codec for processing DESI spectra in the Aion framework. Key changes include introducing a Spectrum modality with dedicated fields, implementing the SpectrumCodec with encoding and decoding logic built on ConvNeXt-based modules and quantizers, and adding supporting test data and dependency updates.
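As a rough illustration of the quantizer step mentioned above, here is a minimal NumPy sketch of lookup-free quantization (LFQ) in its generic form: each latent dimension is snapped to -1 or +1 by its sign, and the bit pattern is read as an integer token. The function names `lfq_quantize` and `lfq_dequantize` are hypothetical, and this is not taken from the PR's quantizer code, which may differ (for example in bit ordering or in how gradients are handled during training).

```python
import numpy as np


def lfq_quantize(z):
    """Quantize latents of shape (..., D) to codes in {-1, +1} and integer tokens.

    Hypothetical helper: bit d of the token is 1 when dimension d is positive.
    """
    z = np.asarray(z, dtype=np.float64)
    bits = (z > 0).astype(np.int64)        # 1 where the latent is positive, else 0
    codes = 2.0 * bits - 1.0               # quantized latent in {-1, +1}
    weights = 2 ** np.arange(z.shape[-1])  # dimension d contributes 2**d
    tokens = (bits * weights).sum(axis=-1)
    return codes, tokens


def lfq_dequantize(tokens, dim):
    """Recover the {-1, +1} codes from integer tokens (inverse of the mapping above)."""
    tokens = np.asarray(tokens, dtype=np.int64)
    bits = (tokens[..., None] >> np.arange(dim)) & 1
    return 2.0 * bits - 1.0


codes, tokens = lfq_quantize([[0.3, -1.2, 0.7]])
print(codes, tokens)
```

With `D` latent dimensions the implicit codebook has `2**D` entries, which is why no explicit codebook lookup is needed.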
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/tokenizers/test_spectrum_tokenizer.py | Added tests for the new spectrum codec using a Hugging Face–pretrained model. |
| tests/test_data/SPECTRUM_input_batch.pt | Added sample input data for spectrum modality. |
| tests/test_data/SPECTRUM_encoded_batch.pt | Added sample encoded output data for spectrum codec verification. |
| tests/test_data/SPECTRUM_decoded_batch.pt | Added sample decoded output data for spectrum codec verification. |
| pyproject.toml | Updated dependencies to include vector_quantize_pytorch. |
| aion/modalities.py | Introduced the Spectrum modality with fields for flux, ivar, mask, and wavelength. |
| aion/codecs/tokenizers/spectrum.py | Implemented a Spectrum codec class with autoencoder logic and quantization integration. |
| aion/codecs/quantizers/__init__.py | Added new LFQ and scalar quantizers for handling the latent space in the codec. |
| aion/codecs/modules/utils.py | Added custom LayerNorm and GRN utility modules. |
| aion/codecs/modules/spectrum.py | Provided interpolation functions and a latent spectral grid for converting between grids. |
| aion/codecs/modules/convnext.py | Added 1D ConvNeXt-based encoder and decoder modules for processing spectral data. |
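To make the "converting between grids" step in `aion/codecs/modules/spectrum.py` more concrete, the sketch below resamples one observed spectrum onto a fixed wavelength grid with linear interpolation. It is an assumption-laden illustration, not the PR's code: the helper name `to_latent_grid`, the convention that `mask` is `True` for bad pixels, and the choice to fill uncovered grid points with zero are all hypothetical.

```python
import numpy as np


def to_latent_grid(wavelength, flux, mask, latent_wavelength):
    """Linearly interpolate a spectrum onto `latent_wavelength`.

    Assumptions (not from the PR): `wavelength` is increasing, `mask` is True
    for bad pixels, and at least one pixel is unmasked.
    Returns the resampled flux and a boolean array marking grid points that
    lie inside the observed (unmasked) wavelength range.
    """
    wavelength = np.asarray(wavelength, dtype=np.float64)
    flux = np.asarray(flux, dtype=np.float64)
    latent_wavelength = np.asarray(latent_wavelength, dtype=np.float64)
    good = ~np.asarray(mask, dtype=bool)

    w_good, f_good = wavelength[good], flux[good]
    # Points outside the observed range are filled with 0.0 and flagged below.
    resampled = np.interp(latent_wavelength, w_good, f_good, left=0.0, right=0.0)
    covered = (latent_wavelength >= w_good.min()) & (latent_wavelength <= w_good.max())
    return resampled, covered


resampled, covered = to_latent_grid(
    wavelength=[4000.0, 4002.0, 4004.0, 4006.0],
    flux=[1.0, 3.0, 5.0, 7.0],
    mask=[False, False, True, False],
    latent_wavelength=[3999.0, 4001.0, 4004.0, 4005.0, 4007.0],
)
print(resampled, covered)
```

Returning the coverage flags alongside the flux lets downstream code distinguish "no data here" from a genuine zero flux.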
This PR implements the codec for DESI spectra (the checkpoint is stored at `/mnt/ceph/users/polymathic/MMOMA/outputs/mmoma_codec_sdss+desi/6kzi0iz9/checkpoints/last.pt`). I checked that, given the same random input, it reproduces the same encoded data as the original codec.
Reflecting on how this PR relates to the other one, we may want to reorganize the codecs a bit. For instance, removing the pytorch-lightning dependency turns the codecs into plain Python classes, whereas we would like them to be `torch.nn.Module` subclasses and ultimately offer HF support.
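One possible shape for that reorganization is sketched below: a small base class deriving from `torch.nn.Module` with `encode` and `decode` methods, plus a toy subclass. The names `Codec` and `ToyScalarCodec` and the uniform scalar quantization are hypothetical and only meant to show what subclassing `torch.nn.Module` buys (a `state_dict`, device handling); they are not the classes in this PR.

```python
import torch
from torch import nn


class Codec(nn.Module):
    """Hypothetical base class: a codec maps data to tokens and back."""

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError

    def decode(self, tokens: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Round trip through the token space.
        return self.decode(self.encode(x))


class ToyScalarCodec(Codec):
    """Toy example: uniform scalar quantization of values in [lo, hi]."""

    def __init__(self, levels: int = 16, lo: float = -1.0, hi: float = 1.0):
        super().__init__()
        self.levels = levels
        # Buffers are stored in state_dict and follow .to(device).
        self.register_buffer("lo", torch.tensor(lo))
        self.register_buffer("hi", torch.tensor(hi))

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        unit = ((x - self.lo) / (self.hi - self.lo)).clamp(0.0, 1.0)
        return torch.round(unit * (self.levels - 1)).long()

    def decode(self, tokens: torch.Tensor) -> torch.Tensor:
        unit = tokens.to(self.lo.dtype) / (self.levels - 1)
        return self.lo + unit * (self.hi - self.lo)


codec = ToyScalarCodec(levels=5)
x = torch.tensor([-1.0, -0.5, 0.0, 0.5, 1.0])
print(codec.encode(x), codec(x), list(codec.state_dict().keys()))
```

Because the codec is an ordinary module, weights can be saved and restored with `state_dict()` and `load_state_dict()`. Hub support could then be layered on top, for instance with a mixin such as `PyTorchModelHubMixin` from `huggingface_hub`, which is not shown here.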