Hello, and thank you for your outstanding work. I am having trouble reproducing the training portion of the code and am not getting the expected training results. With your original code, all losses come out as nan, as shown below.

I tried modifying the loss function slightly. The losses are no longer nan, but it seems that backpropagation is not taking place.

All parameters are the default training parameters, except that batch_size was changed from 36 to 24.