Continuing training from the last snapshot

Question

Closed this issue 3 years ago · 4 comments

Is there a way to continue the training from the latest saved pickle?

Answer 1 · 2022-01-12T21:19:42.000Z

It's possible by adding the 'resume' argument to the main.yml file!

Answer 2 · 2022-01-14T09:23:18.000Z

We built on https://github.com/NVlabs/stylegan2-ada-pytorch, which does not have a proper way to continue training after an interruption because it does not save the optimizer state in the checkpoint. You would need to change this to get proper training continuation.
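For illustration, here is a minimal sketch of what saving the optimizer state alongside the weights could look like in plain PyTorch. It is not code from this repository or from stylegan2-ada-pytorch: the helper names `save_checkpoint` and `load_checkpoint`, the dictionary keys, and the toy `Linear` model are all assumptions made for the example.

```python
import io
import torch

def save_checkpoint(model, optimizer, buffer):
    # Hypothetical helper: store the weights AND the optimizer state together.
    torch.save(
        {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
        buffer,
    )

def load_checkpoint(model, optimizer, buffer):
    # Hypothetical helper: restore both parts so training can truly continue.
    buffer.seek(0)
    checkpoint = torch.load(buffer)
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)  # toy stand-in for a real network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Take one step so that Adam actually holds moment estimates.
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()

buffer = io.BytesIO()  # in-memory file; a real run would write to disk
save_checkpoint(model, optimizer, buffer)

# Simulate a restart: fresh objects, then restore from the checkpoint.
resumed_model = torch.nn.Linear(4, 1)
resumed_optimizer = torch.optim.Adam(resumed_model.parameters(), lr=1e-3)
load_checkpoint(resumed_model, resumed_optimizer, buffer)
```

In a GAN setup there would be one optimizer per network (generator and discriminator), and both states would need to be stored the same way.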

Answer 3 · 2022-01-14T18:32:21.000Z

What about resuming from a pickle?

Answer 4 · 2022-01-14T21:48:20.000Z

This will work fine for fine-tuning, but it won't be equivalent to continuing after an interruption, because the Adam state (first and second moments) is not saved in the pickle when a checkpoint is written.
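To make the difference concrete, here is a small pure-Python sketch of the standard Adam update on a toy one-dimensional problem. The objective, the learning rate, and all function names are assumptions chosen for the example; none of it comes from the repository. It compares an uninterrupted run with two kinds of resume: one that restores the moments and the step counter, and one that restores only the parameter, which is what resuming from a weights-only pickle amounts to.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # One standard Adam update with bias correction.
    t += 1
    m = b1 * m + (1 - b1) * grad          # first moment estimate
    v = b2 * v + (1 - b2) * grad * grad   # second moment estimate
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v, t

def gradient(theta):
    # Gradient of the toy objective (theta - 3)^2.
    return 2.0 * (theta - 3.0)

def train(theta, m, v, t, steps):
    for _ in range(steps):
        theta, m, v, t = adam_step(theta, gradient(theta), m, v, t)
    return theta, m, v, t

# Reference: 20 steps with no interruption.
uninterrupted = train(0.0, 0.0, 0.0, 0, 20)[0]

# Interrupted after 10 steps.
theta, m, v, t = train(0.0, 0.0, 0.0, 0, 10)

# Resume with the full optimizer state restored.
full_resume = train(theta, m, v, t, 10)[0]

# Resume with only the parameter restored; moments and step count reset.
weights_only_resume = train(theta, 0.0, 0.0, 0, 10)[0]

print("uninterrupted:      ", uninterrupted)
print("full resume:        ", full_resume)
print("weights-only resume:", weights_only_resume)
```

The full resume reproduces the uninterrupted run exactly, while the weights-only resume follows a different trajectory because the moment estimates and the bias correction start over from zero. For fine-tuning this is usually acceptable; for an exact continuation it is not.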