isl-org/PhotorealismEnhancement

Is the discriminator training properly?

TiPEX360 opened this issue · 4 comments

Hi,
Below is a snippet of the training log from training on a similar dataset I put together. It features a real dataset from Houzz and an artificial dataset of GTAV buildings. Could you explain why I get a loss update at each iteration for the generator but only occasional updates for the discriminator? I can't tell if this is intentional or if I made a mistake along the way.

I've also noticed that the .mat files saved in /out store ['i_fake'] and ['i_real'] correctly, but ['i_rec_fake'] is filled with NaN. Testing the network produces entirely black images.

Sorry, this is probably too wide to format correctly...
2022-04-19 22:53:09,737 346880 rdf0 ds0 rdf1 ds1 rdf2 ds2 rdf3 ds3 rdf4 ds4 rdf5 ds5 rdf6 ds6 rdf7 ds7 rdf8 ds8 rdf9 ds9 rdr0 rdr1 rdr2 rdr3 rdr4 rdr5 rdr6 rdr7 rdr8 rdr9 reg gs0 gs1 gs2 gs3 gs4 gs5 gs6 gs7 gs8 gs9 vgg 2022-04-19 22:53:09,737 346880 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:10,071 346881 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.02 0.99 0.99 1.04 0.99 0.99 1.03 0.98 0.97 1.05 0.72 2022-04-19 22:53:11,022 346882 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:11,034 346883 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.01 1.02 1.00 1.06 1.03 1.01 1.08 0.98 1.00 1.04 0.64 2022-04-19 22:53:11,983 346884 0.00 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:12,328 346885 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.04 1.03 0.99 1.04 1.02 1.02 1.07 0.97 0.97 1.03 0.24 2022-04-19 22:53:13,278 346886 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:13,625 346887 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 
---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.03 1.02 0.99 1.05 1.03 1.01 1.07 0.97 0.98 1.03 0.50 2022-04-19 22:53:14,576 346888 ---- ---- ---- ---- ---- ---- ---- ---- 0.00 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 ---- ---- ---- ---- ---- 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:14,588 346889 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 1.01 0.99 1.06 1.04 1.02 1.07 0.98 1.00 1.04 0.70 2022-04-19 22:53:15,540 346890 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 0.00 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 ---- ---- ---- ---- 0.00 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:15,551 346891 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.01 1.02 1.00 1.06 1.05 0.93 1.09 0.97 1.00 1.04 0.79 2022-04-19 22:53:16,497 346892 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:16,509 346893 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 0.99 1.02 1.00 1.06 1.05 0.92 1.09 0.98 1.01 1.04 0.80 2022-04-19 22:53:17,461 346894 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:17,472 346895 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 
---- ---- ---- ---- ---- 1.00 1.02 1.00 1.06 1.04 0.93 1.09 0.98 1.01 1.04 0.73 2022-04-19 22:53:18,424 346896 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:18,436 346897 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.02 1.02 1.00 1.05 1.03 0.94 1.08 0.98 1.00 1.03 0.58 2022-04-19 22:53:19,387 346898 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 2022-04-19 22:53:19,399 346899 ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 1.00 1.02 1.00 1.06 1.05 0.93 1.09 0.98 1.01 1.04 0.73

So I have since narrowed down the issue: the problem began after I tried to load a previously saved network state using name_load in the config for a 'train' config. If I start from scratch I don't get the issue. Any ideas why loading a savegame for training might cause problems?

The discriminator networks are only updated from time to time depending on their performance in classifying real and fake images. If one of them gets too strong, we skip backpropagation for a couple of iterations so the generator can catch up. This is the adaptive backpropagation in the paper. I have no idea about the issue with loading savegames as we have successfully saved and loaded savegames for continuing training. It likely depends on the specific config you are using there.
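To make the idea of skipping discriminator updates more concrete, here is a minimal sketch of a performance-based gate. It is not the code from this repository: the class name `UpdateGate`, the moving average, the `target` accuracy and the linear skip probability are all assumptions chosen for illustration, and the actual rule used in the paper may differ.

```python
import random


class UpdateGate:
    """Hypothetical helper: decides per iteration whether a discriminator is updated."""

    def __init__(self, target=0.6, momentum=0.9, seed=None):
        # target: running accuracy above which updates start being skipped
        # (assumed value, must be below 1.0 to avoid division by zero).
        self.target = target
        # momentum: smoothing factor of the running accuracy (assumed value).
        self.momentum = momentum
        self.running_acc = 0.5  # start at chance level
        self._rng = random.Random(seed)

    def observe(self, accuracy):
        """Fold the latest real-vs-fake classification accuracy into the running average."""
        self.running_acc = (
            self.momentum * self.running_acc + (1.0 - self.momentum) * accuracy
        )

    def skip_probability(self):
        """0 at or below the target, rising linearly to 1 for a perfect discriminator."""
        excess = self.running_acc - self.target
        return min(1.0, max(0.0, excess / (1.0 - self.target)))

    def should_update(self):
        """Randomly skip the update, more often the stronger the discriminator is."""
        return self._rng.random() >= self.skip_probability()
```

In a training loop one would call `observe(...)` after each discriminator forward pass and run the backward pass and optimizer step only when `should_update()` returns True. With a gate like this, a log row without discriminator values simply means the update was skipped for that iteration.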

Hi,
Could you please explain what the .mat files store? In particular, what do ['i_fake'], ['i_real'] and ['i_rec_fake'] store and represent?

@lucamarini22 The .mat file essentially stores the model inputs/outputs for that iteration. 'i_fake' is the input image, 'i_rec_fake' is the generated output image.
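To check a saved file for the NaN problem mentioned earlier in this thread, something like the following could be used. This is only a sketch: it assumes the files can be read with `scipy.io.loadmat`, and the helper names `nan_report` and `load_mat` are made up for this example.

```python
import numpy as np


def nan_report(arrays):
    """Return {key: fraction of NaN entries} for every floating-point array."""
    report = {}
    for key, value in arrays.items():
        if key.startswith("__"):
            # loadmat adds metadata entries such as __header__; skip them.
            continue
        data = np.asarray(value)
        if not np.issubdtype(data.dtype, np.floating):
            continue  # NaN only exists for floating-point data
        report[key] = float(np.isnan(data).mean()) if data.size else 0.0
    return report


def load_mat(path):
    """Load a .mat file into a dict of NumPy arrays (requires SciPy)."""
    from scipy.io import loadmat

    return loadmat(path)
```

Calling `nan_report(load_mat(path))` on one of the files in the output directory (with `path` pointing at it) would then show a value of 1.0 for any key that is entirely NaN, such as 'i_rec_fake' in the report above, and 0.0 for clean arrays.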