I successfully trained the T5 small and base models, but I can't train the large model because the Colab GPU runs out of memory. I reduced the batch size to 1 but still get the out-of-memory error. Has anyone trained T5 Large or T5 3B? How much GPU memory do these models need for training? Is there any workaround?
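For a rough sense of scale, here is a back-of-the-envelope sketch, not a measurement. It assumes full fine-tuning in fp32 with an Adam-style optimizer (weights, gradients, and two moment buffers, so about 16 bytes per parameter) and uses the approximate published parameter counts for each checkpoint. Activations and framework overhead are left out, so real usage will be higher. The names `APPROX_PARAMS` and `estimate_training_gib` are made up for illustration.

```python
# Back-of-the-envelope estimate only: activations and framework overhead are NOT included.
# Parameter counts are approximate published figures for the T5 checkpoints.
APPROX_PARAMS = {
    "t5-small": 60e6,
    "t5-base": 220e6,
    "t5-large": 770e6,
    "t5-3b": 3e9,
}

# Assumption: fp32 weights (4 bytes) + fp32 gradients (4 bytes)
# + two fp32 Adam moment buffers (8 bytes) = 16 bytes per parameter.
BYTES_PER_PARAM_ADAM_FP32 = 4 + 4 + 8


def estimate_training_gib(num_params, bytes_per_param=BYTES_PER_PARAM_ADAM_FP32):
    """Return the estimated memory (GiB) for weights, gradients, and optimizer state."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    for name, n in APPROX_PARAMS.items():
        print(f"{name}: ~{estimate_training_gib(n):.1f} GiB before activations")
```

Under these assumptions the large model needs on the order of 11 GiB before a single activation is stored, and none of that depends on batch size, which would be consistent with batch size 1 still running out of memory.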