Question regarding Shuffling
LeoXinhaoLee opened this issue · 1 comment
LeoXinhaoLee commented
Hi, thank you very much for releasing this great dataset. I am wondering whether the original PILE dataset (with 30 chunks) has already been shuffled, or whether we still need to globally shuffle PILE before using it for pretraining. Thank you.
yuzc19 commented
Hi @LeoXinhaoLee, I am also curious about this. Did you reach any conclusions?