updates for the IIA course #3
Merged
Pull request overview
This PR reorganizes the training-models/ Dermamnist scripts to share common training/evaluation/visualization logic, and adds repo-level tooling (uv + Makefile) intended to make the IIA course workflow easier to follow.
Changes:
- Introduce `training-models/shared/` utilities for data loading, training/checkpointing, evaluation, and visualization.
- Refactor Dermamnist v1–v7 scripts to use the shared utilities and a consistent `argparse` CLI (`train|test|visualize`).
- Add uv/Makefile-based setup instructions and update documentation paths/layout.
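A minimal sketch of what a consistent `train|test|visualize` CLI could look like with `argparse`. The PR does not show its parser, so the positional `mode` argument, the `build_parser` and `run` names, and the placeholder handlers are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single positional mode (assumed layout, not the PR's)."""
    parser = argparse.ArgumentParser(description="Dermamnist script (illustrative)")
    parser.add_argument(
        "mode",
        choices=["train", "test", "visualize"],
        help="which stage of the workflow to run",
    )
    return parser


def run(argv: list[str]) -> str:
    """Parse argv and dispatch to a placeholder handler, returning its message."""
    args = build_parser().parse_args(argv)
    # Hypothetical handlers; real scripts would call shared training/eval helpers here.
    handlers = {
        "train": lambda: "training",
        "test": lambda: "testing",
        "visualize": lambda: "visualizing",
    }
    return handlers[args.mode]()


print(run(["train"]))  # prints "training"
```

Restricting the argument with `choices` means an unsupported mode is rejected by `argparse` itself with a usage message, so each versioned script fails the same way on a typo.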
Reviewed changes
Copilot reviewed 17 out of 19 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| training-models/utils.py | Leaves a pointer note indicating utilities moved into shared/. |
| training-models/shared/\_\_init\_\_.py | Defines the shared package for common training-model utilities. |
| training-models/shared/data.py | Adds shared Dermamnist dataset loading helper. |
| training-models/shared/model.py | Adds shared device/output-path helpers. |
| training-models/shared/utils.py | Adds shared training loop, checkpointing, evaluation, visualization, and test helpers. |
| training-models/README.md | Expands setup/run instructions and updates results/artifact layout references. |
| training-models/dermamnist_v1_initial.py | Refactors script to shared utilities + CLI. |
| training-models/dermamnist_v2_momentum0p9.py | Refactors script to shared utilities + CLI. |
| training-models/dermamnist_v3_lr0p005_val_patience.py | Refactors script to shared utilities + CLI. |
| training-models/dermamnist_v4_adam_TB.py | Refactors script to shared utilities + CLI. |
| training-models/dermamnist_v5_deeper_network.py | Refactors script to shared utilities + CLI. |
| training-models/dermamnist_v6_even_deeper_network.py | Refactors script to shared utilities + CLI. |
| training-models/dermamnist_v7_with_augm.py | Refactors script to shared utilities + CLI (including augmentation). |
| training-models/_config.yml | Updates book path to the notebook under training-models/. |
| README.md | Adjusts a couple of links to use repo-relative paths. |
| pyproject.toml | Adds project metadata + dependency list for uv installation. |
| Makefile | Adds uv-based setup/install, clean targets, and versioned run targets. |
| .gitignore | Ignores generated artifacts (results, checkpoints, ONNX, venv, etc.). |
Comments suppressed due to low confidence (1)
training-models/README.md:60
- Same table formatting issue here: the rows start with `||`, which adds an empty column and misaligns the table in standard markdown renderers. Please switch to a single leading `|` and verify the table renders correctly on GitHub.
## Version performance summary
| Version | Key change | Test accuracy | Delta |
|---|---|---|---|
| v1 | Baseline 4-layer CNN, SGD lr=5e-6, momentum=0.5 | ~0.65 | — |
| v2 | Momentum 0.5 → 0.9 | ~0.65 | ≈ 0.00 |
| v3 | Learning rate 5e-6 → 0.005, validation patience | ~0.75 | +0.10 |
| v4 | SGD → Adam, TensorBoard logging | **0.762** | +0.01 |
| v5 | Deeper 6-layer CNN (adds 128-ch block) | 0.755 | −0.007 |
| v6 | Even deeper 8-layer CNN (adds 256-ch block) | < v5 | −0.01 |
| v7 | Data augmentation (flip + crop) on 8-layer CNN | **0.770** | +0.015 |
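The "validation patience" change listed for v3 can be sketched as a simple early-stopping rule: stop once validation accuracy has not improved for a fixed number of consecutive epochs. This is a minimal illustration rather than the PR's implementation; the function name `stop_with_patience`, the default `patience=3`, and the accuracy values are made up for the example.

```python
def stop_with_patience(val_accuracies, patience=3):
    """Return (stop_epoch, best_epoch, best_acc) for a sequence of per-epoch accuracies.

    Training stops at the first epoch where accuracy has failed to improve
    for `patience` consecutive epochs; epochs are 0-indexed.
    """
    best_acc = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            # New best: remember it and reset the patience counter.
            best_acc, best_epoch = acc, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch, best_acc
    # Ran out of epochs before patience was exhausted.
    return len(val_accuracies) - 1, best_epoch, best_acc


# Illustrative accuracies only (not results from the PR).
history = [0.60, 0.65, 0.70, 0.69, 0.70, 0.68, 0.71]
print(stop_with_patience(history, patience=3))  # prints (5, 2, 0.7)
```

With `patience=3` the run stops at epoch 5 and never sees the later improvement at epoch 6, which illustrates the trade-off: a small patience saves compute but can stop before a late gain.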
labels = labels.squeeze().long()

outputs = model(images)
_, predicted = torch.max(outputs.data, 1)
if epoch_label != "?":
epoch_label += 1 # stored as 0-indexed
best_acc = saved.get("best_accuracy", "?")
print(f"Loaded checkpoint: epoch {epoch_label}, best accuracy so far {best_acc:.4f}")
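One thing worth noting in the snippet above: if `"best_accuracy"` is missing, `best_acc` falls back to the string `"?"`, and applying the `:.4f` format spec to a string raises `ValueError`. A hedged sketch of one way to guard against that follows. The helper name `describe_checkpoint` and the `"epoch"` key are assumptions, since only the `"best_accuracy"` key is visible in the diff.

```python
def describe_checkpoint(saved: dict) -> str:
    """Build the 'Loaded checkpoint' message without assuming either key is present."""
    epoch = saved.get("epoch")  # assumed key name; stored 0-indexed per the diff comment
    best_acc = saved.get("best_accuracy")
    # Only do arithmetic or numeric formatting when the value is actually numeric.
    epoch_label = str(epoch + 1) if isinstance(epoch, int) else "?"
    acc_label = f"{best_acc:.4f}" if isinstance(best_acc, (int, float)) else "?"
    return f"Loaded checkpoint: epoch {epoch_label}, best accuracy so far {acc_label}"


print(describe_checkpoint({"epoch": 4, "best_accuracy": 0.7621}))
# prints "Loaded checkpoint: epoch 5, best accuracy so far 0.7621"
print(describe_checkpoint({}))
# prints "Loaded checkpoint: epoch ?, best accuracy so far ?"
```

Formatting each field separately keeps the placeholder `"?"` usable for older checkpoints that lack one or both keys.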
if flag == "train":
data_next = DataClass(split="val", transform=base_transform, download=True)
elif flag == "test":
data_next = DataClass(split="test", transform=base_transform, download=True)
Comment on lines +36 to +46
Each `results/vN/` directory contains:

| File | Description |
|---|---|
| `best_model.pt` | Saved model weights at peak validation accuracy |
| `train_loss.png` / `val_acc.png` | Training curves (v1–v3) |
| TensorBoard event files | Training + validation loss and accuracy (v4–v7) |
| `model_summary.txt` | Layer-by-layer parameter count via **torchinfo** |
| `model_graph.png` | Computation graph image via **torchview** |
| `model.onnx` | ONNX export for interactive inspection with **netron** |
Comment on lines 97 to 108
We invite you to make a copy of the notebook, and then make changes to it (you could simply copy changes from the scripts we point to) as we make progress in the versions below.

![DermaMNIST v1 results](results/v1/test_results.png)
![DermaMNIST v1 results](../results/v1/test_results.png)

Note from the image above that only the `melanocytic nevi` category is being learnt by the model, and since it has the largest representation in both the training and validation/test set, the weighted average accuracy is quite high even though all the other categories have 0 accuracy.

![DermaMNIST v1 loss](results/v1/train_loss.png)
![DermaMNIST v1 loss](../results/v1/train_loss.png)

Training loss for initial version: seems to reduce with increasing iterations, but then flattens out. When it flattens out, it is really not very useful to train for more iterations, as the accuracy also flattens out. We will see in the third version, how this wasteful training could be avoided using 'validation patience'.

![DermaMNIST v5 results](results/v5/test_results.png)
![DermaMNIST v5 results](../results/v5/test_results.png)

![DermaMNIST v7 results](results/v7/test_results.png)
![DermaMNIST v7 results](../results/v7/test_results.png)

For this version, we note that the test accuracy is now 0.770, higher than all the benchmarks listed on the [MedMNIST webpage](https://medmnist.com)! Note also that the `dermatofibroma` category is no longer 0 in its metrics, and nearly all categories have precisions greater than all previous versions.
[project]
name = "bender"
version = "0.1.0"
requires-python = ">=3.13"
Comment on lines +180 to +185
val_batch = next(iter(loader_val))
val_inputs = val_batch[0]
val_labels = val_batch[1].squeeze().long()
optimizer.zero_grad()
val_outputs = model(val_inputs)
val_loss = loss_function(val_outputs, val_labels)
torch.onnx.export(model, dummy_input, onnx_path, opset_version=11)
print(f"ONNX model saved. Run: netron {onnx_path}")
except ModuleNotFoundError as e:
print(f"ONNX export skipped ({e}). Install the missing package with: pip install onnxscript")
Updates specifically for the training-models folder with infrastructure to make the IIA course easier to follow