Some questions
lironT74 opened this issue · 1 comment
lironT74 commented
Hello @vahidk and thank you for this awesome repository. I would like to ask:
- To my understanding, TFRecordDataset returns an infinite iterator over the TFRecord. How can I change this behavior? I need a finite iterator for the evaluation step. If I have this wrong and it is already finite, please correct me.
- I noticed that iterating over a large TFRecord with TFRecordDataset is a lot slower than iterating over a small one. This is not a surprise, of course, but would it be better to split the large TFRecord into smaller ones and use MultiTFRecordDataset instead? If so, how can I split a TFRecord?
Thank you again.
vahidk commented
- TFRecordDataset is finite. MultiTFRecordDataset is infinite by default, but you can pass infinite=False to its constructor to make it finite.
- Just read the file with the reader and write parts of it out to separate files. Please use Stack Overflow for this kind of question rather than opening an issue.
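For the "read and write parts of it" suggestion, here is a minimal sketch of one way to do the split. It is not this repository's API: it uses only the Python standard library and relies on the TFRecord on-disk framing (an 8-byte little-endian length, a 4-byte length CRC, the payload, and a 4-byte payload CRC), copying each record's bytes unchanged. The names iter_raw_records and split_tfrecord, the output pattern, and the shard size are all hypothetical, and the sketch assumes an uncompressed file.

```python
import struct


def iter_raw_records(path):
    """Yield every record's raw bytes (length, CRCs and payload) unchanged."""
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte little-endian length + 4-byte length CRC
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            body = f.read(length + 4)  # payload + 4-byte payload CRC
            if len(body) < length + 4:
                raise ValueError("truncated record body")
            yield header + body


def split_tfrecord(src_path, out_pattern, records_per_shard):
    """Copy records from src_path into shards of at most records_per_shard.

    out_pattern is a format string such as "part-{:05d}.tfrecord"
    (a hypothetical naming scheme). Returns the shard paths in order.
    CRCs are copied verbatim, not verified.
    """
    if records_per_shard <= 0:
        raise ValueError("records_per_shard must be positive")
    shard_paths = []
    out = None
    try:
        for i, record in enumerate(iter_raw_records(src_path)):
            if i % records_per_shard == 0:
                # Start a new shard every records_per_shard records.
                if out is not None:
                    out.close()
                path = out_pattern.format(len(shard_paths))
                shard_paths.append(path)
                out = open(path, "wb")
            out.write(record)
    finally:
        if out is not None:
            out.close()
    return shard_paths
```

Calling split_tfrecord("big.tfrecord", "part-{:05d}.tfrecord", 10000) would write part-00000.tfrecord, part-00001.tfrecord, and so on, which can then be read as separate files. Because records are copied byte for byte, the features inside them are untouched; any index files built for the original file would need to be regenerated for each shard.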