SeungjunNah/DeepDeblur-PyTorch

What are the key factors that boost performance?

maradona20200204 opened this issue · 3 comments

Dear Dr. Nah:

Thank you for answering my last question! I have one final question. The PSNR improves from 29.23 dB (29.08 dB for 3 scales, according to the CVPR 2017 paper) to 30.40 dB for the single-precision model. After studying this repo and your original repo (DeepDeblur_release) for several days, I have become curious about the key factors that boost performance. I have the following conjectures:

(1) The RGB range changed from [0, 1] to [0, 255]. But why does this help?

(2) Removal of the GAN loss. According to the NTIRE 2020 UniA paper, the GAN loss may not help training, and removing the discriminator may make training easier.

(3) Learning rate schedule. Both repos train for 1000 epochs with the Adam optimizer. However, the original repo divides the learning rate by 10 at the 150th epoch (code/main.lua), while this repo uses 5 warmup epochs and halves the learning rate (default gamma=0.5) at the milestones (the 500th, 750th and 900th epochs). The initial learning rate is 1e-4 in this repo (src/option.py, line 92) and 5e-5 in the original repo (code/opts.lua, line 53). (A rough sketch of the two schedules appears after this list.)

(4) Augmentation. The augmentation methods in this repo differ slightly from the original ones. First, this repo changes saturation with probability 1, while the original repo changes saturation with probability 1/10 (code/donkey.lua, line 59). Second, this repo only rotates by 90 degrees, while the original repo may rotate by 90, 180 or 270 degrees counterclockwise. Third, this repo uses both "flip_v" and "flip_h", which flip vertically and horizontally respectively, while the original repo only uses "flip_h". Both repos add AWGN with sigma=2. (A toy version of these augmentations is also sketched after this list.)

(5) The change from Torch to PyTorch. This seems like an odd cause by itself. Maybe the default initialization method changed (I am not sure)?
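
For reference, here is a minimal sketch of how I understand the two schedules, written with torch.optim.lr_scheduler.MultiStepLR. The warmup logic is my own reconstruction of the intent, not code copied from either repo, and the Conv2d layer only stands in for the real model:

```python
import torch
import torch.nn as nn

net = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in for the deblurring model

# This repo (as I read src/option.py): Adam, lr = 1e-4, 5 warmup epochs,
# then the learning rate is halved at epochs 500, 750 and 900 (gamma = 0.5).
base_lr, warmup_epochs = 1e-4, 5
optimizer = torch.optim.Adam(net.parameters(), lr=base_lr)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[500, 750, 900], gamma=0.5)

for epoch in range(1, 1000 + 1):
    if epoch <= warmup_epochs:
        # linear warmup -- my guess at the intent, not the repo's exact code
        for group in optimizer.param_groups:
            group['lr'] = base_lr * epoch / warmup_epochs
    # ... one epoch of training goes here ...
    scheduler.step()

# The original Torch repo (code/main.lua, code/opts.lua) instead starts at 5e-5
# and divides the learning rate by 10 once, at the 150th epoch.
```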
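
And here is a toy version of the augmentation differences from (4). Only the probability-1 vs. 1/10 saturation change, the 90-degree-only rotation, the both-flips vs. horizontal-only flip, and the sigma=2 AWGN come from the two repos; the flip probabilities, the saturation range and the way chroma is scaled are made up for illustration:

```python
import random
import numpy as np

def augment_pair(blur, sharp, sigma=2.0):
    """Toy version of the augmentations in (4) -- not copied from either repo.

    blur, sharp: H x W x 3 float arrays in the [0, 255] range.
    """
    # horizontal and vertical flips (this repo; the Torch repo only flips horizontally).
    # The 0.5 probabilities are my guess, not taken from the code.
    if random.random() < 0.5:
        blur, sharp = blur[:, ::-1], sharp[:, ::-1]
    if random.random() < 0.5:
        blur, sharp = blur[::-1, :], sharp[::-1, :]

    # rotation by 90 degrees only (the Torch repo picks among 90/180/270 degrees)
    if random.random() < 0.5:
        blur, sharp = np.rot90(blur), np.rot90(sharp)

    # saturation jitter: probability 1 in this repo vs. 1/10 in the Torch repo;
    # scaling the chroma around the per-pixel gray value is just one way to do it
    amount = random.uniform(0.5, 1.5)
    gray_blur = blur.mean(axis=2, keepdims=True)
    gray_sharp = sharp.mean(axis=2, keepdims=True)
    blur = gray_blur + amount * (blur - gray_blur)
    sharp = gray_sharp + amount * (sharp - gray_sharp)

    # additive white Gaussian noise with sigma = 2 on the blurry input (both repos)
    blur = blur + np.random.normal(0.0, sigma, size=blur.shape)
    return np.clip(blur, 0, 255), np.clip(sharp, 0, 255)
```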

CNN is a "black box". Therefore, I believe nobody can answer this question except you.
I am a layman to deblurring. Would you please solve my puzzle? Thanks!

Here are my answers.

(1) With a different input range, the initialized weights start in a different position relative to the inputs, and the average gradient scale may also differ. We have no clear explanation, but we observed the effect when experimenting with EDSR (Torch, PyTorch).

(2) Removing the adversarial loss can bring a higher PSNR, but I can't say it brings an improvement overall. Adversarial and perceptual losses play a different role, and the evaluation criterion should not be PSNR alone.

(3) Usually, longer training helps. I found that the previous implementation had not reached the convergence point. However, this repository does not guarantee convergence either; I would say there is still room for improvement with even longer training. If you read other deblurring papers, they typically train for 2000 ~ 4000 epochs.

(4) The augmentation details are mostly the same, except for the saturation augmentation and the sharpness-preservation probability. The rotation/flipping boolean variables differ, but they are designed to perform the same set of transformations jointly. I don't think this causes much difference.

(5) I did not test exactly the same implementation. Torch used Xavier initialization while PyTorch uses He initialization by default (see the sketch at the end of this list). Other implementation differences may also affect the optimization.

(6) Mixed-precision training could also bring differences in performance (a standard mixed-precision loop is sketched at the end of this list).

(7) Different batch sizes lead to different convergence behavior. In my experience, larger batches are slower in the very early part of training but become faster afterward.
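
Regarding (5), roughly what I mean, illustrated on a single Conv2d layer with the standard torch.nn.init helpers (the actual model code is not shown here):

```python
import torch.nn as nn

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
# PyTorch initializes Conv2d weights with a He-style (Kaiming) uniform scheme by
# default, so the layer above is already He-initialized.

# Re-initializing explicitly with Xavier/Glorot, closer to the old Torch behavior:
nn.init.xavier_normal_(conv.weight)
nn.init.zeros_(conv.bias)

# He initialization written out explicitly, for comparison:
nn.init.kaiming_normal_(conv.weight, nonlinearity='relu')
nn.init.zeros_(conv.bias)
```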
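
And regarding (6), the generic torch.cuda.amp pattern below shows what changes between single and mixed precision. It is a sketch with random tensors and a toy model, assuming a CUDA device, not the amp handling of this repository:

```python
import torch
import torch.nn.functional as F

device = 'cuda'
net = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1).to(device)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # dynamic loss scaling for fp16 gradients

for _ in range(10):  # stand-in for iterating over a real DataLoader
    blur = torch.rand(4, 3, 64, 64, device=device) * 255   # inputs in the [0, 255] range
    sharp = torch.rand(4, 3, 64, 64, device=device) * 255
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():      # forward pass and loss in mixed precision
        loss = F.l1_loss(net(blur), sharp)
    scaler.scale(loss).backward()        # scale the loss so fp16 gradients do not underflow
    scaler.step(optimizer)               # unscale the gradients and apply the optimizer step
    scaler.update()                      # adjust the scale factor for the next iteration
```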

Thanks for your detailed answers. I have no further questions; sorry for bothering you!

No problem at all.