Process gets killed when training
50417 commented
I am training the smallest GPT-2 model (117M parameters).
Loading dataset...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 109.93it/s]
dataset has 42736 tokens
However, the process gets killed right after the output shown above. Any help is appreciated.
50417 commented
Upon investigating, the cause turned out to be an out-of-memory error:
Resource exhausted: OOM when allocating tensor with shape[1024,50257] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
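For a rough sense of scale, the size of the tensor named in that error can be estimated from its shape. This is only a back-of-the-envelope sketch: it assumes that "type float" means 32-bit floats (4 bytes per element), and `tensor_bytes` is a hypothetical helper written for illustration, not part of TensorFlow or the training script.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for one dense tensor of the given shape.

    bytes_per_element=4 is an assumption: "type float" is taken to mean float32.
    """
    return prod(shape) * bytes_per_element


# Shape taken from the OOM message: [1024, 50257]
size = tensor_bytes((1024, 50257))
print(f"{size} bytes = {size / 2**20:.1f} MiB")  # 205852672 bytes = 196.3 MiB
```

So a single tensor of this shape needs roughly 196 MiB under that assumption. The allocator reports the failure for one such allocation, but training typically holds many tensors at once, so the total memory demand is likely much higher than this one figure.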