amjltc295/Free-Form-Video-Inpainting

Pretraining/Finetuning stage of FFVI and Cannot reproduce quantitative evaluation

MaureenZOU opened this issue · 8 comments

Hi Ya-Liang, after going through the issues, I saw you mentioned in #20 that there are pretraining and finetuning stages for your model. Also, when I train your model from scratch, I found that the GAN loss is set to zero. Could you please explain the detailed training schedule for the pretraining and finetuning stages, and the loss types/weights? Thanks in advance : )

Question 2: Cannot reproduce quantitative evaluation:

Paper Result:
[Screenshot: Screen Shot 2020-07-14 at 1 51 28 PM]

My Result:
[Screenshot: Screen Shot 2020-07-14 at 2 17 34 PM]

Evaluation Setting:
I use your default settings to run inference on all the test images, and then run the evaluation with the following command:
python evaluate.py -rgd ../../../data/FVI/Test/JPEGImages/ -rmd ../../../data/FVI/Test/object_masks/ -rrd ../../../data/results/FVI_Test/epoch_0/test_object_removal/
All other settings are left at their defaults.

I compared the MSE scores on the object-like masks, and they are very different: 0.0024 vs. 0.01044. Meanwhile, the FID scores also differ considerably.
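For reference, here is a minimal sketch of how an MSE restricted to the masked (hole) region can be computed; the function name and the pixel-wise averaging are my assumptions for illustration, not necessarily what `evaluate.py` does internally:

```python
import numpy as np

def masked_mse(pred, target, mask):
    """MSE computed only over masked pixels.

    pred, target: float arrays in [0, 1], shape (H, W, C)
    mask: boolean array, shape (H, W); True marks the hole region
    """
    diff = (pred - target) ** 2
    # Boolean-index the first two axes: keeps all channels
    # of each masked pixel, then averages over them.
    return diff[mask].mean()

# Toy example: all-zero prediction vs. all-one target,
# with a 2x2 hole in a 4x4 frame.
pred = np.zeros((4, 4, 3))
target = np.ones((4, 4, 3))
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
print(masked_mse(pred, target, mask))  # 1.0
```

A mismatch between evaluating over the whole frame and evaluating only inside the mask would by itself produce MSE gaps of the size reported here.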

The first question is solved; I found the answer in the supplementary materials. Thanks!

This table reports scores averaged over different mask-to-frame ratios (please refer to the supplementary materials). I'm not sure which ratio you're using, but judging from the score it should be around 60%.
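To make the averaging concrete, a tiny sketch of how per-ratio scores collapse into one table entry; the ratio buckets and score values below are made up for illustration, not the paper's numbers:

```python
# Hypothetical per-ratio MSE scores (mask-to-frame ratio -> score).
# The table entry is the plain average over the ratio buckets,
# so a single-bucket score (e.g. the ~60% one) can sit far from it.
scores_by_ratio = {0.1: 0.0005, 0.3: 0.0015, 0.6: 0.0104}
avg = sum(scores_by_ratio.values()) / len(scores_by_ratio)
print(round(avg, 6))
```

This is why evaluating on only one mask ratio cannot match the averaged table directly.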

Thanks a lot for the information! I'm really sorry to have one more question : (
After training your model for around 300 epochs with the default settings, the results fluctuate a lot. Could you please verify whether they make sense? See the attached file: each row represents an epoch, in chronological order.
https://drive.google.com/file/d/1sOelWjXReCvLyaiKWDScsPa2xdEWQJA7/view?usp=sharing

Did you include the discriminator loss? This is normal if you only use the perceptual loss. The pretraining stage is done once the loss has converged; then you need to add the discriminator. If the results are still like this after fine-tuning, you will probably need to tune some hyperparameters.
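The two-stage schedule described above can be sketched as a loss-weight switch; the stage names and weight values here are placeholders I chose for illustration, not the repo's actual configuration:

```python
def loss_weights(stage, w_perceptual=1.0, w_gan=0.01):
    """Return (perceptual_weight, gan_weight) for a training stage.

    Stage 1 "pretrain": perceptual loss only, GAN weight held at 0
    until the loss converges (matching the zeroed GAN loss seen
    when training from scratch).
    Stage 2 "finetune": the discriminator loss is switched on.
    """
    if stage == "pretrain":
        return w_perceptual, 0.0
    if stage == "finetune":
        return w_perceptual, w_gan
    raise ValueError(f"unknown stage: {stage}")

print(loss_weights("pretrain"))   # (1.0, 0.0)
print(loss_weights("finetune"))   # (1.0, 0.01)
```

In a training loop, the total loss would then be `w_p * perceptual + w_g * gan`, so the pretraining stage reduces to perceptual-only optimization.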

Thanks a lot! I was only using the perceptual loss. I will fine-tune with the discriminator loss and see what happens. Thanks!

Thanks again for your code! I have reproduced all the results in your paper using the provided model, and have also reproduced the paper numbers by training the model myself. It is a nice repo with clean code : )

No problem, thank you for verifying it as well :)