EleutherAI/gpt-neo

Is there any GPTNeoForCausalLM training example in PyTorch with hardware acceleration?

Pwang001 opened this issue · 1 comment

I am upgrading a GPT-2 PyTorch project to GPTNeoForCausalLM. However, I encounter an "out of memory" error when training on GPU, and very slow training on TPU in Colab.
Is there any example of training a GPTNeoForCausalLM model with hardware acceleration?

The README has a Colab notebook that may be helpful. Also, decreasing the batch size often prevents OOM errors.
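A minimal sketch of that batch-size fix, combined with gradient accumulation so the effective batch stays the same: it runs small micro-batches and steps the optimizer every few backward passes. A tiny `GPTNeoConfig` is used here so the sketch runs without downloading weights; for real training you would instead load `GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")`, and all hyperparameters below are illustrative, not tuned.

```python
import torch
from transformers import GPTNeoConfig, GPTNeoForCausalLM

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tiny illustrative config (not the real GPT-Neo sizes) so this runs anywhere.
config = GPTNeoConfig(
    vocab_size=256,
    hidden_size=64,
    num_layers=2,
    num_heads=4,
    attention_types=[[["global", "local"], 1]],  # must cover num_layers
    max_position_embeddings=128,
)
model = GPTNeoForCausalLM(config).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Toy data; passing labels=input_ids makes the model return the causal-LM loss.
input_ids = torch.randint(0, config.vocab_size, (2, 32), device=device)

# Small micro-batch per forward pass to avoid OOM; accumulate gradients
# over several passes before one optimizer step.
accum_steps = 4
model.train()
optimizer.zero_grad()
for step in range(accum_steps):
    loss = model(input_ids=input_ids, labels=input_ids).loss / accum_steps
    loss.backward()
optimizer.step()
optimizer.zero_grad()
```

On GPU, wrapping the forward pass in `torch.cuda.amp.autocast()` (mixed precision) is another common way to cut memory use roughly in half.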