minimaxir/gpt-2-cloud-run

Reduce memory consumption to prevent errors due to container OOM

minimaxir opened this issue · 2 comments

Containers seem to go OOM after ~10 generations, despite garbage collection. Loading the model takes up ~1.5 GB, so hitting the memory ceiling is not surprising, but there should be a way to control the leaks.

The current implementation (reloading the model after every 8 generations) appears to avoid OOMs; a sketch of the pattern is below.
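A minimal sketch of that reload pattern, assuming the server uses gpt_2_simple's session API (`start_tf_sess`, `load_gpt2`, `generate`, `reset_session`); the counter name and reload threshold here are illustrative, not the repo's actual variables:

```python
import gc
import gpt_2_simple as gpt2

RELOAD_EVERY = 8  # rebuild the TF graph/session after this many generations

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess)
generate_count = 0

def generate_text(**kwargs):
    """Generate text, resetting the TF session periodically to cap memory growth."""
    global sess, generate_count
    text = gpt2.generate(sess, return_as_list=True, **kwargs)[0]
    generate_count += 1
    if generate_count >= RELOAD_EVERY:
        # Tear down the graph/session and reload the model to release leaked memory.
        sess = gpt2.reset_session(sess)
        gpt2.load_gpt2(sess)
        generate_count = 0
        gc.collect()
    return text
```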

This will reduce memory consumption by a lot: tensor2tensor.utils.adafactor.AdafactorOptimizer
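Adafactor mainly saves memory during fine-tuning, since it keeps factored second-moment estimates rather than Adam's full per-parameter moment tensors. A minimal sketch of swapping it in, assuming TF 1.x with tensor2tensor installed; the toy loss stands in for the actual GPT-2 training loss:

```python
import tensorflow as tf
from tensor2tensor.utils.adafactor import AdafactorOptimizer

# Toy variable and loss standing in for the GPT-2 fine-tuning graph.
w = tf.get_variable("w", shape=[1024, 1024])
loss = tf.reduce_sum(tf.square(w))

# Adafactor stores far less optimizer state than Adam, lowering peak memory.
optimizer = AdafactorOptimizer()
train_op = optimizer.minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)  # one training step; loop over this when fine-tuning
```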