Conversation
I am keen to merge this alongside an idempotent script to patch PyTorch nightly (like this: https://github.com/EleutherAI/unlearn/blob/main/unlearn/magic/magic_wmdp_setup.sh) and a README and/or CLAUDE.md skill about how to set it up. Thoughts?
This seems reasonable. The PyTorch patch will hopefully be merged next week, but I'm not sure how much longer it will take for the next PyTorch release to come out with the patch in it.
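Since the follow-up (described in the commit below) replaces the on-disk patch script with a runtime monkey-patch, here is a minimal sketch of how such a patch can be made idempotent, so applying it twice is a no-op. The module and function names are illustrative, not the actual bergson/magic_patch.py API:

```python
# Hypothetical sketch of an idempotent runtime monkey-patch.
# Tagging the wrapper lets repeated calls detect an already-patched
# function and skip re-wrapping it.
import types


def apply_patch(mod, name, wrapper_factory):
    """Wrap mod.<name> once; subsequent calls are no-ops."""
    original = getattr(mod, name)
    if getattr(original, "_magic_patched", False):
        return original  # already patched; leave it alone
    patched = wrapper_factory(original)
    patched._magic_patched = True
    setattr(mod, name, patched)
    return patched


# Usage on a dummy module: double application does not double-wrap.
mod = types.SimpleNamespace(f=lambda x: x + 1)

def factory(orig):
    def wrapped(x):
        return orig(x) * 2
    return wrapped

apply_patch(mod, "f", factory)
apply_patch(mod, "f", factory)  # no-op: f is wrapped exactly once
print(mod.f(3))                 # (3 + 1) * 2 = 8, not 16
```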
I'm going to merge this to unblock Will's work. I wanted to note that there's an alternative, second launch_distributed implementation in here that's only used by MAGIC, plus an accompanying image experiment, and I wasn't sure what to do with either, so I let them be. It would be cool to properly support images, though, so please do add that!
…ight support
- Add bergson/magic_patch.py: runtime monkey-patch for twice-differentiable DTensor redistribution (pytorch/pytorch#160509), replacing the old magic_wmdp_setup.sh that modified torch source files on disk
- Add per_token mode to DataStream for [n_examples, max_length] weight tensors
- Support 2D [B, T] per-token weights in weighted_causal_lm_ce
- Fix backward weight_grads accumulation when autograd returns None
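The per-token weighting in the commit above can be sketched as follows. This is an illustrative reimplementation, not the actual weighted_causal_lm_ce: it assumes logits of shape [B, T, V], labels of shape [B, T], and weights that are either per-example [B] or per-token [B, T]:

```python
import torch
import torch.nn.functional as F


def weighted_ce_sketch(logits, labels, weights):
    """Weighted causal-LM cross-entropy (illustrative sketch).

    logits: [B, T, V]; labels: [B, T]; weights: [B] or [B, T].
    Shifts logits/labels for next-token prediction, applies the
    weights to the per-token losses, then averages.
    """
    # Shift: position t predicts token t + 1.
    logits = logits[:, :-1, :]
    labels = labels[:, 1:]
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        reduction="none",
    ).view(labels.shape)  # [B, T-1]
    if weights.dim() == 1:
        # Per-example [B]: broadcast one weight across all tokens.
        weights = weights[:, None]
    else:
        # Per-token [B, T]: drop the first position to match the
        # shifted labels, so each weight lines up with its token.
        weights = weights[:, 1:]
    return (per_token * weights).mean()
```

With all-ones weights of either shape, this reduces to the plain next-token cross-entropy mean, which is a useful sanity check for the 2D path.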