[Question] OOM Is there a way not to load the whole dataset in the dataloader?
gaceladri opened this issue · 1 comments
gaceladri commented
Hello,
I have a very large parquet file that the Loader is trying to load onto a 24 GB GPU. Is there any way to avoid loading the whole dataset into the dataloader?
gaceladri commented
Solved it by following the best-practice guidance in the NVTabular documentation.
I re-wrote the data with `train.to_parquet("../../data/processed/merlin_train", engine="pyarrow", row_group_size=10000)` (the key change is the `row_group_size` argument) and now I can load the dataset.
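
For reference, here is a minimal sketch (not the exact code from this issue) of the same idea done in a streaming fashion with pyarrow, so the rewrite itself does not require holding the full table in memory. The input/output paths and the 10,000-row chunk size are just placeholders mirroring the numbers above:

```python
import pyarrow as pa
import pyarrow.parquet as pq

src = "../../data/processed/train.parquet"         # hypothetical input path
dst = "../../data/processed/merlin_train.parquet"  # hypothetical output path

pf = pq.ParquetFile(src)
with pq.ParquetWriter(dst, pf.schema_arrow) as writer:
    # Stream the source file in 10k-row chunks; each chunk is written
    # as its own (small) row group in the output file.
    for batch in pf.iter_batches(batch_size=10_000):
        writer.write_table(pa.Table.from_batches([batch]))
```

With smaller row groups, the dataloader can read the file partition by partition instead of pulling the entire dataset onto the GPU at once.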
Thanks