Have any GPTNeoForCausalLM training example in pytorch with hardware acceleration?
Pwang001 opened this issue · 1 comments
Pwang001 commented
I am upgrading a GPT-2 PyTorch project to GPTNeoForCausalLM. However, I hit an "out of memory" error when training on a GPU, and training is very slow on a TPU in Colab.
Is there an example of training a GPTNeoForCausalLM model with hardware acceleration?
StellaAthena commented
The README has a Colab notebook that may be helpful. Also, decreasing the batch size often prevents OOM errors.
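A minimal training sketch along those lines (small batch size plus gradient accumulation to keep peak GPU memory low). Note the tiny randomly initialized `GPTNeoConfig` is a placeholder so the snippet runs without downloading weights; for real training you would load `GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")` and feed tokenized text instead of random ids:

```python
import torch
from transformers import GPTNeoConfig, GPTNeoForCausalLM

# Tiny placeholder config (assumption: real use would call from_pretrained).
config = GPTNeoConfig(
    vocab_size=1000,
    hidden_size=64,
    num_layers=2,
    num_heads=2,
    max_position_embeddings=128,
    attention_types=[[["global", "local"], 1]],  # must expand to num_layers entries
)
model = GPTNeoForCausalLM(config)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Small per-step batch + gradient accumulation: same effective batch size,
# much lower peak memory than one big batch.
batch_size, seq_len, accum_steps = 2, 64, 4
optimizer.zero_grad()
for step in range(accum_steps):
    # Placeholder batch; replace with real tokenized input_ids.
    input_ids = torch.randint(
        0, config.vocab_size, (batch_size, seq_len), device=device
    )
    # Passing labels=input_ids makes the model compute the causal-LM loss
    # (the shift between inputs and targets is handled internally).
    outputs = model(input_ids=input_ids, labels=input_ids)
    (outputs.loss / accum_steps).backward()
optimizer.step()
```

If memory is still tight, mixed precision (`torch.autocast` with a `torch.cuda.amp.GradScaler`) and `model.gradient_checkpointing_enable()` can reduce it further, at some speed cost for the latter.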