YapengTian/TDAN-VSR-CVPR-2020

training dataset and loss function

engelsjo opened this issue · 11 comments

Is your training dataset the Vimeo-90K dataset? Also, what loss function did you use to train the model you have provided? I have used your code to retrain on the Vimeo dataset with many different loss functions, but I am not able to get images as sharp as those from the model you have open-sourced.

Based on my experiments, I believe that the pretrained model was trained on the REDS dataset, rather than the Vimeo dataset reported in the paper.

Sorry for the late response. I was working on my PhD thesis proposal.

The model was trained on Vimeo-90K, and MSE loss was used (see the CVPR version of the paper). One trick I used, and mentioned in other issues, is that fine-tuning helps improve performance a lot: first train the model with a learning rate of, say, 1e-4, then fine-tune with 5e-5 to improve further. In addition, I used 8 GPUs for fast training to obtain the released pre-trained model.
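Roughly, the two-stage schedule looks like this. A minimal PyTorch sketch with a placeholder model, data, and epoch counts (not the repo's actual training script):

```python
import torch
import torch.nn as nn

# Placeholder model and data; the real TDAN network and Vimeo-90K loader differ.
model = nn.Conv2d(3, 3, 3, padding=1)
criterion = nn.MSELoss()  # L2 loss, per the CVPR version
loader = [(torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32)) for _ in range(8)]

def run_stage(model, lr, num_epochs):
    # Each stage restarts the optimizer with its own learning rate.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(num_epochs):
        for lr_in, hr_target in loader:
            loss = criterion(model(lr_in), hr_target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

run_stage(model, lr=1e-4, num_epochs=2)  # stage 1: train from scratch
run_stage(model, lr=5e-5, num_epochs=1)  # stage 2: fine-tune with a lower LR
```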

You mentioned the REDS dataset, but in my experience, and also according to the EDVR work, a model trained on REDS obtains lower performance on Vid4. I am not sure why you think the model was trained on REDS.

Thanks for the response. I missed your switch to the L2 loss in the CVPR version.

Still, when I train from scratch on Vimeo images that are downsampled with bicubic interpolation, the resulting test images are much blurrier than those I obtain from your pretrained model. However, when I train on REDS, I get sharper-looking test images, similar to what the pretrained model produces. How many epochs did you train on the Vimeo dataset for?
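For reference, this is roughly how I generate the LR inputs. A minimal sketch with a dummy frame (the repo's actual preprocessing may differ):

```python
from PIL import Image
import numpy as np

# Dummy HR frame; in practice this would be a Vimeo-90K frame loaded from disk.
hr = Image.fromarray(np.random.randint(0, 256, (256, 448, 3), dtype=np.uint8))

# 4x bicubic downsampling, the usual LR-generation protocol for VSR.
lr = hr.resize((hr.width // 4, hr.height // 4), resample=Image.BICUBIC)
print(lr.size)  # (112, 64)
```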

Sorry, I did not see any discussion of fine-tuning in the other issues. Do you mean training the model on Vimeo and then fine-tuning on a new dataset?

Thanks again for the response!

Interesting! I tried REDS a year ago but found that it hurt test performance on Vid4, so I never used it again. The model was first trained for 600 epochs, then fine-tuned on the same Vimeo-90K dataset with a lower learning rate (I remember fine-tuning at least twice, and more GPUs helped a lot; you might notice that the performance in the arXiv version is much lower than in the CVPR version, and the main improvement came from the 8-GPU training).
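To make the repeated fine-tuning concrete, a minimal sketch of resuming from a saved checkpoint with a lowered learning rate (the model and file name here are hypothetical):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 3, padding=1)  # stand-in for the TDAN network

# Stage 1 ends by saving weights; here we just save the fresh model.
torch.save(model.state_dict(), "tdan_stage1.pth")  # hypothetical file name

# Fine-tune: reload the stage-1 weights and restart the optimizer at a
# lower learning rate, still on the same Vimeo-90K data.
model.load_state_dict(torch.load("tdan_stage1.pth"))
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
```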

But my experience with REDS may well be wrong, since I did not explore it much. If you find it works better than Vimeo-90K, please ignore my comments.

Thanks for the follow-up. I will try this fine-tuning strategy again on Vimeo.

Regarding your second point, it is not clear to me why 8-GPU training would help performance.

We can use a larger batch size when we have access to more GPUs.
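For example, with `torch.nn.DataParallel` each batch is split across all visible GPUs, so 8 cards let you grow the global batch while each card only holds its slice. A minimal sketch with a placeholder model (not necessarily the exact script in this repo):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 3, padding=1)  # stand-in for the TDAN network

# DataParallel splits each batch across all visible GPUs, so with 8 GPUs a
# global batch of 64 costs each card only 8 samples of memory.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()

batch = torch.randn(64, 3, 32, 32)  # global batch size scales with GPU count
if torch.cuda.is_available():
    batch = batch.cuda()
out = model(batch)
print(out.shape)  # torch.Size([64, 3, 32, 32])
```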

Makes sense. So your final batch size was larger than 64?

I have forgotten the exact number, but it should have been larger than 64 (when training, just set the batch size to make full use of your GPU memory). I have lost access to the 8-GPU server.

I forgot to update the batch size in the CVPR version. This is my fault.

Thanks Yapeng! I will give these training tips a try.

No problem!