Poor results with distributed (multi-GPU) training?
harryxu-yscz opened this issue · 4 comments
harryxu-yscz commented
I got reasonable results training on one GPU, but not on two GPUs (with batch_size=50). Specifically, the attention mask would not change and the generator ended up returning the original image. I tried scaling lr_G, lr_D, and lambda_D_cond, but no luck. Any suggestions?
harryxu-yscz commented
Problem solved by properly scaling the parameters.
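The thread does not spell out which scaling was used. For reference, here is a minimal sketch of one common heuristic, the linear scaling rule, where the learning rates grow in proportion to the effective batch size. The scale_lr helper and the base values below are illustrative assumptions, not taken from this repo or this thread:

```python
# Hypothetical sketch of the linear scaling rule for learning rates when
# the effective batch size grows (e.g. going from one GPU to two).
# The base values below are assumed placeholders, not the repo's defaults.

def scale_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale a learning rate in proportion to the effective batch size."""
    return base_lr * new_batch_size / base_batch_size

if __name__ == "__main__":
    base_batch, new_batch = 25, 50                  # assumed 1-GPU vs 2-GPU effective batch
    lr_G = scale_lr(1e-4, base_batch, new_batch)    # 1e-4 -> 2e-4
    lr_D = scale_lr(1e-4, base_batch, new_batch)    # 1e-4 -> 2e-4
    print(f"lr_G={lr_G}, lr_D={lr_D}, batch_size={new_batch}")
```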
joyyang1997 commented
@harryxu-yscz Hi, I ran into the same problem. Would you please share your experience with me? Thank you very much!
zxu7 commented
@joyyang1997
Try tuning these params for the multi-GPU run: lambda_mask_smooth and lambda_D_cond. For example, I used these params:
--batch_size 50 --gpu_ids 0,1 --lambda_D_cond 8000 \
--lambda_mask_smooth 5e-6
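For context on why these two weights matter (an assumption based on the option names, not something confirmed in this thread): in GANimation-style models, lambda_mask_smooth typically weights a total-variation smoothness penalty on the attention mask, and lambda_D_cond weights the conditional (expression) term of the discriminator loss. A minimal sketch of such a smoothness term under that assumption; the function name and tensor shapes are illustrative:

```python
# Hypothetical sketch: total-variation smoothness penalty on the attention
# mask, weighted by lambda_mask_smooth. Names and shapes are assumptions.
import torch

def mask_smoothness_loss(mask: torch.Tensor, lambda_mask_smooth: float) -> torch.Tensor:
    """TV penalty for an attention mask of shape (N, 1, H, W)."""
    dh = torch.mean(torch.abs(mask[:, :, 1:, :] - mask[:, :, :-1, :]))  # vertical differences
    dw = torch.mean(torch.abs(mask[:, :, :, 1:] - mask[:, :, :, :-1]))  # horizontal differences
    return lambda_mask_smooth * (dh + dw)

if __name__ == "__main__":
    mask = torch.rand(50, 1, 128, 128)              # fake batch of attention masks
    print(mask_smoothness_loss(mask, 5e-6).item())  # small weight -> small penalty
```

The intuition, again a guess rather than something stated in the thread: if the mask smoothness penalty is too strong relative to the other loss terms, the mask can saturate and the generator falls back to reproducing the input image, which matches the symptom in the first comment; hence the very small 5e-6 weight.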
joyyang1997 commented
Thank you very much, I'll try it.