google/seqio

Caching tasks runs out of memory due to Apache Beam

mayurnewase opened this issue · 2 comments

I'm trying to cache tasks from the magenta/MT3 repository. With only 200 examples, caching takes around 30 GB of memory at the very end of processing.
Without caching, training works just fine, even with a train dataset of 1000 examples.

I am dumb, I was using DirectRunner.

don't be so hard on yourself! :)
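
For anyone landing here with the same symptom: the fix reported above was to stop using Beam's DirectRunner, which executes the pipeline on the local machine. As a rough sketch of what switching runners might look like, the helper below builds a list of Beam-style pipeline flags. The helper name `beam_pipeline_args` and the project, region, and bucket values are hypothetical placeholders, not anything from this thread; the thread also does not say which runner was used instead, so `DataflowRunner` is only an example.

```python
from typing import List


def beam_pipeline_args(runner: str, **options: object) -> List[str]:
    """Build Beam-style command-line flags such as '--runner=DataflowRunner'.

    Hypothetical helper for illustration; it only formats strings.
    """
    args = [f"--runner={runner}"]
    # Sort the remaining options so the resulting flag order is deterministic.
    for name, value in sorted(options.items()):
        args.append(f"--{name}={value}")
    return args


# Placeholder values: replace with your own project, region, and bucket.
args = beam_pipeline_args(
    "DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)
print(args)

# With Apache Beam installed, a list like this can be handed to
# apache_beam.options.pipeline_options.PipelineOptions(args).
```

How these flags reach the caching job depends on how you launch it, so check the options your caching entry point accepts before relying on this.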