nshepperd/gpt-2

Encoding on GPU

Zer0-dev115 opened this issue · 2 comments

I tried to encode a file on the GPU, but it still runs on the CPU. I can't encode the file: the process gets killed before it even starts.
python encode.py corpus_final.txt corpus_final.npz --model_name 345M
Reading files
0%| | 0/1 [00:00<?, ?it/s]Killed

The file has 1.9M sentences.

Split the file into multiple smaller files, say 10K lines each, and put them in a folder named 'Splitted_Data_Folder'.
Then run a command like this:

python encode.py Splitted_Data_Folder/ corpus_final.npz --model_name 345M
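
If you don't have a tool handy for the splitting step, here is a rough sketch of how it could be done in Python. The function name split_file and the part_00000.txt naming are made up for illustration and are not part of this repo. It assumes the corpus is UTF-8 text, and it reads only one chunk of lines at a time, so the whole file never has to fit in memory.

```python
import os
from itertools import islice


def split_file(src_path, out_dir, lines_per_file=10_000):
    """Split src_path into numbered files of at most lines_per_file lines each.

    Hypothetical helper, not part of encode.py. Returns the list of paths written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with open(src_path, "r", encoding="utf-8") as src:
        index = 0
        while True:
            # Pull the next batch of lines without loading the whole file.
            chunk = list(islice(src, lines_per_file))
            if not chunk:
                break
            out_path = os.path.join(out_dir, f"part_{index:05d}.txt")
            with open(out_path, "w", encoding="utf-8") as dst:
                dst.writelines(chunk)
            written.append(out_path)
            index += 1
    return written
```

Calling split_file("corpus_final.txt", "Splitted_Data_Folder") should produce the folder used in the command above. The chunk size of 10K lines is just the figure suggested here, so adjust it if memory is still a problem.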

By the way, how did you try to run it on the GPU? encode.py doesn't use the GPU for encoding. Did you take any extra steps to make it run on the GPU?