sholtodouglas/learning_from_play

Fix ProtoBuf crash when running `from_tensor_slices` on large datasets

0xtristan opened this issue · 1 comments

Protobufs have a hard 2GB serialization limit. With our current dataset (after sufficient augmentation) the tensor passed to `from_tensor_slices` reaches about 2.6GB, so the call crashes.
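A minimal sketch of the failure mode, assuming a pre-augmented array that has grown past 2GB (the array shape below is illustrative, not our actual data):

```python
import numpy as np
import tensorflow as tf

# Hypothetical pre-augmented dataset, fully materialised in memory (~2.8GB).
# from_tensor_slices embeds the array as a constant tensor, which must be
# serialised into a single protobuf and therefore hits the 2GB limit.
obs = np.zeros((700_000, 1000), dtype=np.float32)

# Raises: "Cannot create a tensor proto whose content is larger than 2GB."
ds = tf.data.Dataset.from_tensor_slices(obs)
```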

Either figure out a way to load the dataset over several calls, reduce the dataset size (e.g. by doing augmentation on the fly), or fall back to GCS (undesirable).

Fixed by reducing the dataset footprint: data augmentation (mainly sub-trajectory sliding-window augmentation) is now done on the fly with `tf.data.Dataset.window()`. We can now perform maximum window augmentation (with a shift of 1) without any memory errors, because the in-memory dataset size stays constant and the window augmentation happens lazily. Training time with these changes is no worse than before, mainly due to multi-threading of the data augmentation steps plus prefetching.
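A rough sketch of the on-the-fly windowing pipeline, under assumed shapes and hyperparameters (`WINDOW_SIZE`, `BATCH_SIZE`, and the observation/action arrays are placeholders, not the repo's actual values):

```python
import numpy as np
import tensorflow as tf

WINDOW_SIZE = 40   # hypothetical sub-trajectory length
SHIFT = 1          # max augmentation: a window starting at every timestep
BATCH_SIZE = 64    # hypothetical

# Only the raw, un-augmented trajectories are materialised, staying well under 2GB.
obs = np.zeros((100_000, 64), dtype=np.float32)
acts = np.zeros((100_000, 8), dtype=np.float32)

ds = tf.data.Dataset.from_tensor_slices((obs, acts))

# Sliding-window augmentation done lazily: .window() yields nested datasets,
# so each window is flattened back into dense (WINDOW_SIZE, ...) tensors.
ds = ds.window(WINDOW_SIZE, shift=SHIFT, drop_remainder=True)
ds = ds.flat_map(
    lambda o, a: tf.data.Dataset.zip((o.batch(WINDOW_SIZE), a.batch(WINDOW_SIZE)))
)

# Shuffling, batching, and prefetching keep throughput comparable to the
# old pre-materialised pipeline.
ds = ds.shuffle(10_000)
ds = ds.batch(BATCH_SIZE, drop_remainder=True)
ds = ds.prefetch(tf.data.AUTOTUNE)
```

Any per-window augmentation beyond the windowing itself can be applied with `ds.map(..., num_parallel_calls=tf.data.AUTOTUNE)` so it runs multi-threaded, which is where the "no worse than before" training time comes from.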