microsoft/infinibatch

Can we get the length of the entire dataset from `chunked_dataset_iterator`?

prabhakar267 opened this issue · 0 comments

To integrate fairseq with infinibatch, I need to get the size of the entire dataset. I use `chunked_dataset_iterator` to read the text dataset. As a workaround, I'm currently reading the length from a custom config file with hardcoded values.
Is there a way to get the length of the entire dataset while keeping the current iterator's functionality unchanged?
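
For reference, here is a minimal sketch of the kind of thing I have in mind as an alternative to the hardcoded config: count the items once with a separate pass over the chunk files, then carry that number alongside the iterator. It does not call infinibatch at all. `read_chunk`, `count_items`, and `SizedIterator` are hypothetical names I made up for illustration, and the gzipped one-item-per-line chunk format is an assumption about my own data, not something infinibatch requires.

```python
import gzip
from typing import Callable, Iterable, Iterator


def read_chunk(path: str) -> Iterator[str]:
    """Hypothetical chunk reader: yields one item per line of a gzipped text file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


def count_items(
    chunk_refs: Iterable[str],
    read_chunk_fn: Callable[[str], Iterator[str]],
) -> int:
    """Make one pass over every chunk and return the total number of items."""
    return sum(1 for ref in chunk_refs for _ in read_chunk_fn(ref))


class SizedIterator:
    """Hypothetical wrapper: delegates iteration and reports a precomputed length."""

    def __init__(self, iterator: Iterator, length: int):
        self._iterator = iterator
        self._length = length

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iterator)

    def __len__(self) -> int:
        # The length is whatever was counted up front; it is not
        # recomputed or validated against the wrapped iterator.
        return self._length
```

The idea would be to call `count_items(chunk_refs, read_chunk)` once (and possibly cache the result), then wrap whatever iterator is in use in `SizedIterator` so that `len()` works on the fairseq side. The downside is an extra full read of the data, which is why built-in support would be preferable.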