royorel/Lifespan_Age_Transformation_Synthesis

training model collapse

wtliao opened this issue · 3 comments

Hi, thanks for sharing your nice work and the dataset. I am playing around with your code to learn more about your idea. However, after training for one epoch, the synthesized images of all classes are blank.

After about 1000 iterations:
[screenshot: synthesized outputs]

After 1 epoch:
[screenshot: synthesized outputs]

After 5 epochs:
[screenshot: synthesized outputs]

Training loss over time:
[screenshot: loss plot]

I have not changed anything in your code. Do you have any idea what could cause this? Thanks a lot.

Hi @wtliao,

This is known and happens from time to time because of the relatively high initial learning rate needed to train the equalized-learning-rate StyleGAN convolution blocks. Sometimes the issue resolves itself at later stages of training (it should be gone by epoch 150). Alternatively, you can try lowering the learning rate. That will stabilize training, but the aging effect might not match the paper.
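For reference, a minimal sketch of what lowering the learning rate looks like in a generic PyTorch setup. The option name, default value, and optimizer betas here are illustrative assumptions, not the repo's actual training flags; check the repo's option files for the real parameter.

```python
import torch
import torch.nn as nn

# Stand-in module for the generator (placeholder, not the repo's network).
model = nn.Conv2d(3, 64, kernel_size=3, padding=1)

default_lr = 2e-3               # assumed high initial LR (placeholder value)
stabilized_lr = default_lr / 4  # lower LR trades aging strength for stability

# StyleGAN-style training commonly uses Adam with low beta1;
# the exact betas in this repo may differ.
optimizer = torch.optim.Adam(model.parameters(), lr=stabilized_lr, betas=(0.0, 0.99))
```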

Hi @royorel, thanks a lot for your swift reply :). I will let it run until epoch 200 and have a look. I have one more question about the batch_size setting: I noticed that the default is bs=6 on 4 GPUs. I am a little confused about how 6 samples are assigned to 4 GPUs at each iteration.

@wtliao each sample is actually a pair of images, so overall it's 12 images over 4 GPUs (3 per GPU).
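To illustrate the arithmetic with a hedged sketch (the tensor shapes are assumed for illustration, not taken from the repo's pipeline): `torch.nn.DataParallel` scatters a batch along dimension 0 across the available GPUs, so 6 image pairs flattened to 12 images give each of the 4 GPUs 3 images.

```python
import torch

# Assumed shapes for illustration only.
batch_size, channels, height, width = 6, 3, 256, 256

# Each of the 6 samples is a *pair* of images -> shape (6, 2, C, H, W).
pairs = torch.randn(batch_size, 2, channels, height, width)

# Flatten the pairs into individual images: (12, C, H, W).
images = pairs.view(-1, channels, height, width)

# DataParallel splits along dim 0, so 4 GPUs receive 3 images each.
chunks = torch.chunk(images, chunks=4, dim=0)
print([c.shape[0] for c in chunks])  # [3, 3, 3, 3]
```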