albertpumarola/GANimation

distributed training not good?

harryxu-yscz opened this issue · 4 comments

I got reasonable results training with one GPU, but not with two GPUs (with batch_size=50). Specifically, the attention mask would not change, and the generator ended up just returning the original image. I tried scaling lr_G, lr_D, and lambda_D_cond, but with no luck.

Any suggestion?

Problem solved by properly scaling the parameters.

@harryxu-yscz Hi, I ran into the same problem. Would you please share your solution? Thank you very much!

zxu7 commented

@joyyang1997
Try tuning these parameters for the multi-GPU setup: lambda_mask_smooth and lambda_D_cond.

For example, I used these values:

--batch_size 50 --gpu_ids 0,1 --lambda_D_cond 8000 \
--lambda_mask_smooth 5e-6
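
For reference, a full command along these lines might look like the sketch below. The --data_dir path and the --name value are placeholders, and those two flags are assumed to follow the repo's train.py options; the tuned loss weights are the ones above:

python train.py \
  --data_dir /path/to/prepared_dataset --name multi_gpu_experiment \
  --batch_size 50 --gpu_ids 0,1 \
  --lambda_D_cond 8000 --lambda_mask_smooth 5e-6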


Thank you very much, I'll try it.