Cannot finish tasks in Colab because runtime crashes due to low RAM.
ritog opened this issue · 4 comments
In the setting-up section of this course, it says:

> Google Colab for hands-on exercises. The free version is enough.
But in the section where the feature extractor is applied to the music database, the Colab runtime crashes, reporting that it ran out of RAM.
What could be a possible workaround?
@sanchit-gandhi Can you please take a look?
Thanks for flagging @ritog! There are a few 'tricks' we can employ to try and get this working with lower RAM (I'm fairly confident it's just a case of tweaking the `.map` hyper-parameters to get this to work on a free Google Colab).
Could you try reducing two parameters please?

- `batch_size`: defaults to 1000; let's try setting this to 100, and if that doesn't work, reduce it by a factor of 2 again to 50
- `writer_batch_size`: defaults to 1000; let's try setting this to 500, and if that doesn't work, reduce it by a factor of 2 to 250
Using a combination of the above two should be optimal here, so I would try `batch_size=100, writer_batch_size=500`, and if that doesn't work, `batch_size=50, writer_batch_size=500`:
```python
gtzan_encoded = gtzan.map(
    preprocess_function,
    remove_columns=["audio", "file"],
    batched=True,
    num_proc=1,
    batch_size=100,
    writer_batch_size=500,
)
```
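For intuition on why this helps (a toy pure-Python sketch, not the `datasets` library internals): `batch_size` bounds how many examples are materialised in memory at once before being handed to `preprocess_function`, so halving it roughly halves the peak working set of each mapping step:

```python
def batched(iterable, batch_size):
    """Yield successive batches of at most batch_size items."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch

# With 1000 examples, batch_size=100 keeps at most 100 in memory at a time,
# across 10 mapping steps; batch_size=50 halves that peak, at the cost of
# twice as many steps.
peak_100 = max(len(b) for b in batched(range(1000), 100))
peak_50 = max(len(b) for b in batched(range(1000), 50))
```

`writer_batch_size` plays the analogous role on the output side: it controls how many processed examples are buffered before being flushed to the on-disk Arrow cache.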
Hey @ritog - wondering if you had any luck here? Would be interested in hearing whether you found a configuration that worked for the `.map` method. If so, I can update the Unit to use your configs. Otherwise, we'll have to find a different workaround!
@sanchit-gandhi
It works fine with batch_size=100, no need to change writer_batch_size. You may also update the output pointed out in #95.
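For reference, the working call would then look like this (a sketch assuming the course's `gtzan` dataset and `preprocess_function` are already defined; `writer_batch_size` is left at its default):

```python
gtzan_encoded = gtzan.map(
    preprocess_function,
    remove_columns=["audio", "file"],
    batched=True,
    num_proc=1,
    batch_size=100,  # reduced from the default 1000 to fit free-Colab RAM
)
```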