Can we get the length of entire dataset from `chunked_dataset_iterator`?
prabhakar267 opened this issue · 0 comments
prabhakar267 commented
To integrate fairseq and infinibatch, I need to get the size of the entire dataset. I use `chunked_dataset_iterator` to read the text dataset. As a workaround, right now I'm reading the length from a custom config file with hardcoded values.
Is there a way to get the length of the entire dataset while keeping the current iterator's functionality?
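
For illustration, the hardcoded-config workaround could be replaced by counting the items once and caching the result. This is only a rough sketch: it assumes each chunk is a gzip-compressed text file with one example per line, and `count_dataset_items`, `chunk_dir`, and `cache_path` are hypothetical names, not part of infinibatch or fairseq.

```python
import glob
import gzip
import json
import os


def count_dataset_items(chunk_dir, cache_path=None):
    """Hypothetical helper: count lines across all *.gz chunk files in chunk_dir."""
    # Reuse a previously computed count if a cache file already exists.
    if cache_path and os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as f:
            return json.load(f)["num_items"]

    total = 0
    for path in sorted(glob.glob(os.path.join(chunk_dir, "*.gz"))):
        # Assumption: one example per line in each gzip-compressed chunk.
        with gzip.open(path, "rt", encoding="utf-8") as f:
            total += sum(1 for _ in f)

    # Store the count so later runs skip the full scan.
    if cache_path:
        with open(cache_path, "w", encoding="utf-8") as f:
            json.dump({"num_items": total}, f)
    return total
```

The count is computed separately from the iterator, so the iterator itself stays unchanged. The cache has to be deleted whenever the chunk files change, otherwise the stale value is returned.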