From df20ef4d03586a4c71f86082363daa2123e380f4 Mon Sep 17 00:00:00 2001
From: DhruvrajSinhZala24
Date: Tue, 24 Mar 2026 12:30:41 +0530
Subject: [PATCH] docs: add local dataset setup guide to README (#180)

Adds download steps, expected directory layout, and quickstart commands.
---
 README.md | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index f0f50e6..463a4a8 100644
--- a/README.md
+++ b/README.md
@@ -35,6 +35,41 @@ All datasets are constructed using Lenstronomy, by Michael W. Toomey, as present
 |Model 3 dataset|Sheared Isothermal Elliptical lens | Sérsic light profile | HST's observation characteristics, Axion DM and CDM substructure appended to base halo to create 3 sub-structure classes
 |Model 4 dataset|Two Isothermal Elliptical lenses | Three-channel **real galaxy** images | Euclid's observation characteristics, Axion DM and CDM substructure appended to base halo to create 3 sub-structure classes
+### 2.1 Getting the simulated datasets locally
+
+The vision-transformer training and evaluation scripts expect the simulated data from [DeepLenseSim](https://github.com/mwt5345/DeepLenseSim/tree/main/) to live in this repository under `./data/<Model_X>/{train,test}/{axion,cdm,no_sub}` as `.npy` files. You can mirror that structure by cloning the public DeepLenseSim repository:
+
+```bash
+# from the repo root
+git clone https://github.com/mwt5345/DeepLenseSim.git ../DeepLenseSim
+mkdir -p data
+cp -r ../DeepLenseSim/Model_I ../DeepLenseSim/Model_II ../DeepLenseSim/Model_III data/
+```
+
+Expected layout (replace `Model_I` with `Model_II` or `Model_III`):
+
+```
+data/
+  Model_I/
+    train/{axion,cdm,no_sub}/*.npy
+    test/{axion,cdm,no_sub}/*.npy
+```
+
+If you just want a smoke test without downloading the full datasets, miniature samples live under `DeepLense_Classification_Transformers_Archil_Srivastava/application_tests/`.
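After copying, the layout can be sanity-checked with a short script. A minimal sketch, assuming the split and class names shown in the expected layout above; `count_npy` is a hypothetical helper, not part of the repository:

```python
from pathlib import Path

# Split and class names taken from the expected layout above.
SPLITS = ("train", "test")
CLASSES = ("axion", "cdm", "no_sub")

def count_npy(root):
    """Count .npy files per split/class under one Model_* dataset root."""
    root = Path(root)
    return {
        f"{split}/{cls}": len(list((root / split / cls).glob("*.npy")))
        for split in SPLITS
        for cls in CLASSES
    }

if __name__ == "__main__":
    # Point this at data/Model_I (or Model_II / Model_III) after copying.
    for key, n in count_npy("data/Model_I").items():
        print(f"{key}: {n} files")
```

A count of zero for any split/class pair usually means the copy step above missed that directory.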
+
+Run training (a Weights & Biases login is required):
+
+```bash
+cd DeepLense_Classification_Transformers_Archil_Srivastava
+python3 train.py --dataset Model_I --model_name coatnet_nano_rw_224 --project ml4sci_deeplense_final
+```
+
+Evaluate a saved run (uses the `best_model.pt` artifact from the train run):
+
+```bash
+python3 eval.py --runid <run_id> --project ml4sci_deeplense_final
+```
+
 
 ## 3. Projects
 
 ![Project compositions](/Images_for_README/DeepLense%20project%20composition.jpeg)
@@ -115,4 +150,4 @@ Finally, DeepLense help combat the problem of noisy and low-resolution of real lensing images
 **Pranath Reddy** performs a comparative study of the super-resolution of strong lensing images in their [GSoC 2023 project](https://summerofcode.withgoogle.com/archive/2023/projects/Rh8kJLr4), using Residual Models with Content Loss and Conditional Diffusion Models, on the Model 1 dataset.
 
 #### 3.3.3 Physics-Informed Unsupervised Super-Resolution of Strong Lensing Images
-**Anirudh Shankar** explores the unsupervised super-resolution of strong lensing images through a Physics-Informed approach in his [GSoC 2024 project](https://summerofcode.withgoogle.com/programs/2024/projects/AvlaMMJJ), built to handle sparse datasets. They use custom datasets using different lens models and light profiles.
\ No newline at end of file
+**Anirudh Shankar** explores the unsupervised super-resolution of strong lensing images through a Physics-Informed approach in his [GSoC 2024 project](https://summerofcode.withgoogle.com/programs/2024/projects/AvlaMMJJ), built to handle sparse datasets. They use custom datasets using different lens models and light profiles.
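The flags in the quickstart commands suggest the shape of the training CLI. A hedged sketch of an equivalent argument parser, using only the flags that appear in the commands (`--dataset`, `--model_name`, `--project`); any defaults or extra options in the real `train.py` are unknown and not represented here:

```python
import argparse

def build_train_parser():
    """Parser mirroring the train.py flags used in the quickstart commands (sketch)."""
    p = argparse.ArgumentParser(description="DeepLense transformer training (sketch)")
    p.add_argument("--dataset", required=True,
                   choices=["Model_I", "Model_II", "Model_III"],
                   help="which simulated dataset under ./data to use")
    p.add_argument("--model_name", required=True,
                   help="timm model identifier, e.g. coatnet_nano_rw_224")
    p.add_argument("--project", required=True,
                   help="Weights & Biases project name")
    return p

if __name__ == "__main__":
    # Parse the same arguments as the quickstart training command.
    args = build_train_parser().parse_args(
        ["--dataset", "Model_I",
         "--model_name", "coatnet_nano_rw_224",
         "--project", "ml4sci_deeplense_final"])
    print(args.dataset, args.model_name, args.project)
```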