The repo has two experiment stacks:

| Directory | What it runs |
|---|---|
| `gsm/` | LoRA distillation on GSM8K/TinyGSM with Gemma student/expert models. |
| `modadd/` | Modular-addition experiments with a small transformer trained from scratch. |

Install once from the repo root:

```bash
uv sync --locked
source .venv/bin/activate
```

This creates `.venv/` from `uv.lock` and installs the dependencies for both
experiment stacks, including PyTorch with CUDA 12.8, vLLM, Transformers, PEFT,
Hydra, and W&B.

Then choose a stack:

```bash
cd gsm      # real-model GSM8K/TinyGSM experiments
# or
cd modadd   # modular-addition experiments
```

See `gsm/README.md` and `modadd/README.md` for the full commands.

GSM commands below are run from inside `gsm/`.

| Method | GSM command | Modadd command |
|---|---|---|
| LogLossBC | `bash scripts/train.sh configs/offline_bc.yaml` | `python -m nanogpt.run experiment=modadd_noisy_bc` |
| NAIL-F | `bash scripts/train.sh configs/nail_f.yaml` | `python -m nanogpt.run experiment=modadd_nail` |
| NAIL-R | `bash scripts/train.sh configs/nail_r.yaml` | `python -m nanogpt.run experiment=modadd_nail_reverse_mc_fixed` |
| NAIL-Mixed | `bash scripts/train.sh configs/nail_mixed.yaml` | `python -m nanogpt.run experiment=modadd_nail task.loss=mixed task.kl_beta=<beta>` |
| OPD-F | `bash scripts/train.sh configs/opd_f.yaml` | `python -m nanogpt.run experiment=modadd_opd_forward` |
| OPD-R | `bash scripts/train.sh configs/opd_r.yaml` | `python -m nanogpt.run experiment=modadd_opd` |

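
To run every GSM method back to back, the six GSM commands can be wrapped in a loop. This is a minimal sketch, not a script shipped with the repo: `run_all_gsm` and the `DRY_RUN` variable are hypothetical names introduced here, and it assumes `scripts/train.sh` takes the config path as its only argument, as in the GSM commands listed in this section. By default it only prints the commands.

```bash
# Hypothetical helper (not part of the repo): iterate over the six GSM configs.
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 launches them in order.
DRY_RUN="${DRY_RUN:-1}"

run_all_gsm() {
  for cfg in offline_bc nail_f nail_r nail_mixed opd_f opd_r; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "bash scripts/train.sh configs/${cfg}.yaml"
    else
      # Stop at the first failing run instead of continuing silently.
      bash scripts/train.sh "configs/${cfg}.yaml" || return 1
    fi
  done
}

run_all_gsm
```

Run it from inside `gsm/` with `DRY_RUN=0` once the printed commands look right.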
The base causal transformer in `modadd/model.py` and the `nanogpt` package name
are derived from Andrej Karpathy's nanoGPT, which is MIT licensed.