universome/alis

Continuing training from the last snapshot

Is there a way to continue the training from the latest saved pickle?

It's possible by adding the 'resume' argument to the main.yml file!

We built upon https://github.com/NVlabs/stylegan2-ada-pytorch, which in fact has no proper way to continue training after an interruption, because it does not save the optimizer state in the checkpoint. You would need to change this to get a proper training continuation.
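For anyone who wants to try this, here is a rough sketch of what saving the optimizer state alongside the weights could look like in plain PyTorch. This is not code from this repo or from stylegan2-ada-pytorch: `save_checkpoint`, `load_checkpoint`, `train_step`, the toy `Linear` model and the checkpoint keys are all hypothetical names chosen for illustration. Only `state_dict()`, `load_state_dict()`, `torch.save` and `torch.load` are standard PyTorch calls.

```python
import os
import tempfile

import torch


def save_checkpoint(path, model, optimizer, step):
    # Hypothetical helper: store weights AND optimizer state in one file.
    torch.save(
        {
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),  # includes Adam moments
            "step": step,
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    # Hypothetical helper: restore both, return the step to resume from.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["step"]


def train_step(model, optimizer, x):
    # Toy objective, only here so the optimizer has gradients to work with.
    optimizer.zero_grad()
    loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()


torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
x = torch.randn(8, 4)

for _ in range(3):
    train_step(model, optimizer, x)

path = os.path.join(tempfile.mkdtemp(), "ckpt.pt")
save_checkpoint(path, model, optimizer, step=3)

# Simulate a restart: fresh objects, then restore everything from disk.
resumed_model = torch.nn.Linear(4, 1)
resumed_optimizer = torch.optim.Adam(resumed_model.parameters(), lr=1e-2)
resumed_step = load_checkpoint(path, resumed_model, resumed_optimizer)

# One more step on both copies should give matching weights.
train_step(model, optimizer, x)
train_step(resumed_model, resumed_optimizer, x)
```

In a GAN setup with separate generator and discriminator optimizers, each optimizer's `state_dict()` would presumably need to be stored the same way.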

What about resuming from a pickle?

This will work fine for fine-tuning, but it won't be equivalent to continuing after an interruption, because the Adam state (first and second moment estimates) is not saved in the pickle when a checkpoint is written.
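To make the difference concrete, here is a small pure-Python sketch of the textbook Adam update on a one-parameter toy problem. It is only an illustration, not the training code: `adam_step`, `grad`, `run` and the quadratic objective are made-up names and choices. It compares an uninterrupted run with (a) a resume that restores the moments and step count and (b) a resume that only restores the parameter, which is what loading from the pickle amounts to.

```python
import math


def adam_step(theta, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Standard Adam update with bias correction for a scalar parameter.
    m = b1 * m + (1 - b1) * g          # first moment
    v = b2 * v + (1 - b2) * g * g      # second moment
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def grad(theta):
    # Gradient of the toy objective f(theta) = (theta - 3)^2.
    return 2.0 * (theta - 3.0)


def run(theta, m, v, t, n_steps):
    # Run n_steps of Adam starting from the given parameter and state.
    for _ in range(n_steps):
        t += 1
        theta, m, v = adam_step(theta, grad(theta), m, v, t)
    return theta, m, v, t


# Uninterrupted training: 10 steps.
continuous, _, _, _ = run(0.0, 0.0, 0.0, 0, 10)

# Interrupted after 5 steps.
theta5, m5, v5, t5 = run(0.0, 0.0, 0.0, 0, 5)

# (a) Resume with parameter, moments and step count restored.
resumed_full, _, _, _ = run(theta5, m5, v5, t5, 5)

# (b) Resume with the parameter only; Adam state starts from zero.
resumed_weights_only, _, _, _ = run(theta5, 0.0, 0.0, 0, 5)

print("continuous:          ", continuous)
print("full resume:         ", resumed_full)
print("weights-only resume: ", resumed_weights_only)
```

The full resume reproduces the uninterrupted trajectory, while the weights-only resume drifts away from it because the moment estimates and bias correction restart from scratch.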