lukas/ml-class

resolve OOM errors in transfer-learning or warn that it needs a GPU

charlesfrye opened this issue · 1 comment

On my reasonably equipped home CPU machine and on the (CPU) hub, the transfer-learning example causes OOM errors -- sometimes even before reaching the model.fit call.

They're pretty scary-looking if you're not expecting them, and, depending on the exact system parameters, they can sometimes be triggered inside the wandb.init call, making it look like our fault.

Potential solutions:

  • cut the dataset size in half
  • shrink the images to the minimum acceptable size for ResNet50 (80% of current size)
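To see why either mitigation helps, it's worth doing the memory arithmetic. A minimal sketch (the image count and 224×224 resolution here are hypothetical placeholders, not the example's actual numbers):

```python
import numpy as np

def dataset_nbytes(n_images, height, width, channels=3, dtype=np.float32):
    """Rough in-memory footprint of a dense image array, in bytes."""
    return n_images * height * width * channels * np.dtype(dtype).itemsize

# Hypothetical numbers: 25,000 RGB images at 224x224, held as float32.
full = dataset_nbytes(25_000, 224, 224)    # ~15 GB
half = dataset_nbytes(12_500, 224, 224)    # halving the dataset halves this
shrunk = dataset_nbytes(25_000, 179, 179)  # ~80% per side -> ~64% of the memory
```

Note that shrinking each side to 80% cuts the footprint to 0.8² ≈ 64% of the original, a slightly bigger win than it might first appear.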

It also doesn't help that two copies of the data are held in memory at once: one normalized and one not.

Moving the normalization into the data loading, or into the model itself (as is standard Keras practice these days, partly to keep GPUs from being starved), would eliminate the duplicate copy.
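As a sketch of the data-loading option: normalize each batch on the fly in a generator, so the full float32 array never has to coexist with the raw uint8 data. (The batches function and the 255-scaling are illustrative assumptions, not the example's actual code; the in-model alternative would be a Rescaling layer as the first layer of the Keras model.)

```python
import numpy as np

def batches(images, labels, batch_size=32):
    """Yield normalized (float32) batches lazily from a uint8 array.

    Only one batch is ever cast to float at a time, so peak memory stays
    close to the size of the raw uint8 data.
    """
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        # Cast and scale just this slice; the uint8 original is untouched.
        yield images[start:stop].astype("float32") / 255.0, labels[start:stop]

# Hypothetical usage: model.fit(batches(x_train, y_train), ...)
```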