About the-pile dataset
hwyFighting opened this issue · 1 comments
hwyFighting commented
Hi!
Is there another way to download the-pile dataset for training on GPU?
Thanks for the answer!
terrykong commented
The current recommendation for downloading the Pile is here: https://github.com/NVIDIA/JAX-Toolbox/tree/main/rosetta/rosetta/projects/t5x#downloading-the-pile . Note that the dataset is around 1TB, so make sure you have enough free disk space before you start the download.
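To check the disk space up front, something like the following minimal sketch may help. It is not part of the linked instructions. It assumes the "around 1TB" figure means roughly 10**12 bytes, and the names `APPROX_PILE_BYTES`, `has_enough_space`, and the target directory are hypothetical placeholders for illustration.

```python
import shutil

# Assumption: "around 1TB" is taken as 10**12 bytes; the real size may differ.
APPROX_PILE_BYTES = 10**12


def has_enough_space(path: str, required_bytes: int = APPROX_PILE_BYTES) -> bool:
    """Return True if the filesystem containing `path` has at least `required_bytes` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


if __name__ == "__main__":
    target_dir = "."  # hypothetical: replace with the directory you plan to download into
    free_gb = shutil.disk_usage(target_dir).free / 10**9
    if has_enough_space(target_dir):
        print(f"{free_gb:.0f} GB free: likely enough for the download.")
    else:
        print(f"{free_gb:.0f} GB free: probably not enough for ~1TB.")
```

Any preprocessing or extraction step may need additional space on top of the raw download, so treat the threshold as a lower bound.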