leoozy opened this issue 3 years ago · 1 comment
Hello, I processed Wikipedia and BookCorpus using your scripts. The total size of the processed Wikipedia dataset is around 106 GB (~2650 hdf5 files). Could you please confirm whether this is correct?
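For anyone wanting to double-check the numbers quoted above, here is a minimal sketch that counts the hdf5 shards and sums their sizes; the output directory path is a placeholder assumption, so point it at wherever the preprocessing scripts wrote your files:

```python
from pathlib import Path

# Hypothetical location of the processed shards -- adjust to your setup.
output_dir = Path("data/wikipedia/hdf5")

files = sorted(output_dir.glob("*.hdf5"))
total_bytes = sum(f.stat().st_size for f in files)

# Expect something near 2650 files / ~106 GB per the report above.
print(f"{len(files)} hdf5 files, {total_bytes / 1e9:.1f} GB total")
```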
Sounds about right.