[Question] Incremental training
Closed this issue · 3 comments
I was wondering whether it is currently possible to interrupt a training run and pick up where it left off. I noticed from the example scripts that there is a `--resume` switch, which apparently allows extending existing models, e.g. adding denoising on top of upscaling. Does this also allow "pausing" the training phase and resuming later? I did try using `--resume` like this, but the epoch counter started from 0 again, and moreover the old `.t7` model files from previous epochs were being overwritten. Would running with `--resume` 5 times, for 10 epochs each, produce a model equivalent to a single training run over 50 epochs?
Thank you kindly for making waifu2x!
It is not fully supported. Training starts from the trained model specified by the `-resume` option, but the learning rate is reset. So you need to specify both `-resume` and `-learning_rate`. The learning rate is displayed in the console output for each epoch:
```
# 2
learning rate: 0.00024853029126955
```
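Putting the two options together, a resumed run might look like the following. This is only a sketch: the model path is illustrative, not a default, and the learning-rate value is the one printed in the console output for the last completed epoch.

```shell
# Sketch of resuming a paused training run (model path is illustrative).
# -resume loads the snapshot to continue from; the learning rate must be
# restored by hand from the console output of the last completed epoch,
# because -resume alone resets it.
th train.lua \
  -resume models/my_model.t7 \
  -learning_rate 0.00024853029126955
# ...plus whatever other options (data paths, method, etc.) the
# original run used.
```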
In addition, if you use `train.lua`, I recommend the dev branch; it is faster than the master branch.
Thanks for the pointer!