nagadomi/waifu2x

[Question] Incremental training

Closed this issue · 3 comments

I was wondering whether it is currently possible to interrupt a training run and pick up where it left off. I noticed from the example scripts that there’s a --resume switch, which apparently allows extending existing models, e.g. adding denoising on top of superscaling. I was wondering whether this also allows “pausing” the training phase and picking it up later. I did try using --resume like this, but it started counting the epochs from 0 again, and moreover the old .t7 model files from previous epochs were being overwritten. Would using --resume 5 times, with 10 epochs each, produce a model equivalent to a single training run over 50 epochs?

Thank you kindly for making waifu2x!

It is not fully supported. Training starts from the trained model specified by the -resume option, but the learning rate is reset. So you need to specify both -resume and -learning_rate. The learning rate is displayed in the console output for each epoch.

# 2
learning rate: 0.00024853029126955
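
For reference, a resumed run might look something like the sketch below. Only -resume and -learning_rate are the flags discussed here; the model filename, model directory, and other options are illustrative placeholders, so check train.lua's option list for the exact names and defaults in your branch.

# resume from the previously saved model, restoring the last reported learning rate
th train.lua -method scale \
  -model_dir models/my_model \
  -resume models/my_model/scale2.0x_model.t7 \
  -learning_rate 0.00024853029126955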

In addition, if you use train.lua, I recommend the dev branch; it is faster than the master branch.

Thanks for the pointer!