Commit ecfd72f
[cuda] int4: stabilize two-layer decode test via CUDA-seeded init
_make_int4_linear built the throwaway nn.Linear on CPU, so reset_parameters()
drew from the CPU RNG between the two layer constructions and shifted the stream
that seeds the quantized weights. That pushed test_two_layer_mlp's genuine INT4
error from 0.1405 to 0.1556, crossing the 0.15 bound. Build the module with
device=cuda so init draws from the CUDA RNG, leaving the CPU stream (and the
measured error) deterministic. Test-only; dequant math is unchanged.

1 parent 4519036 commit ecfd72f
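The RNG-stream shift described in the commit message can be illustrated with a minimal standard-library sketch. This is an analogy only, not the repository's code: `make_layer`, `two_layers`, `init_rng`, and `data_rng` are hypothetical names, with `data_rng` standing in for the CPU stream that seeds the quantized weights and `init_rng` for whichever stream the throwaway module's init draws from.

```python
import random

def make_layer(init_rng, data_rng, n=4):
    # Throwaway init: values are drawn and discarded, like a module built
    # only so that its weights can be overwritten afterwards.
    _ = [init_rng.random() for _ in range(n)]
    # The values the test actually measures come from data_rng.
    return [data_rng.random() for _ in range(n)]

def two_layers(shared):
    data_rng = random.Random(0)
    # shared=True: init draws from the same stream as the data (the bug).
    # shared=False: init draws from its own stream (the fix, by analogy).
    init_rng = data_rng if shared else random.Random(1)
    return make_layer(init_rng, data_rng), make_layer(init_rng, data_rng)

# What the data stream yields when nothing else consumes it.
reference = random.Random(0)
expected = (
    [reference.random() for _ in range(4)],
    [reference.random() for _ in range(4)],
)

shared = two_layers(shared=True)
separate = two_layers(shared=False)

print(separate == expected)  # init on its own stream leaves the data untouched
print(shared == expected)    # init on the shared stream shifts the data
```

With a separate init stream, both layers see exactly the values the data stream would produce on its own; with a shared stream, each throwaway init consumes draws and shifts everything that follows, which is the same kind of shift the commit attributes to the change in measured error.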
1 file changed
Lines changed: 4 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 59 | 59 | |
| 60 | 60 | |
| 61 | 61 | |
| 62 | | - |
| | 62 | + |
| | 63 | + |
| | 64 | + |
| | 65 | + |
| 63 | 66 | |
| 64 | 67 | |
| 65 | 68 | |