Hi Jiwei! Why you try a little bit more bigger learning rate in your training phase? Or You try some bigger lr like `lr = 3e-4` or `lr = 1e-6`, can you suggest some useful experiential value?
Hi Jiwei!
Why you try a little bit more bigger learning rate in your training phase?
Or
You try some bigger lr like
lr = 3e-4orlr = 1e-6, can you suggest some useful experiential value?