rosinality/swapping-autoencoder-pytorch

Need for two encoders and decoders during training

krips89 opened this issue · 8 comments

What is the need for two encoders (encoder and e_ema) during training?
And the same goes for the decoder.

e_ema is a running average of the encoder's weights, and likewise for the decoder. Using a running average of the model's weights gives better results.
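For reference, here is a minimal sketch of how such an EMA copy is typically maintained in StyleGAN2-style PyTorch training loops. The function name `accumulate` and the decay value are assumptions modeled on that common pattern, not necessarily this repository's exact code:

```python
import torch

def accumulate(model_ema, model, decay=0.999):
    """Update model_ema's parameters toward model's parameters with an
    exponential moving average: ema = decay * ema + (1 - decay) * current."""
    params_ema = dict(model_ema.named_parameters())
    params = dict(model.named_parameters())
    with torch.no_grad():
        for name, param in params.items():
            params_ema[name].mul_(decay).add_(param, alpha=1 - decay)
```

This would be called once per training step, after the optimizer update; the EMA copy (e_ema / g_ema) is then the one used at evaluation time.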

Any literature to back this up?

I don't know which paper first applied EMA to GAN training, but you can refer to papers like this one: https://arxiv.org/abs/1806.04498

Thank you for the answer. Does the official stylegan2 implementation do that?
I'm closing this issue.

Yes, stylegan2 also uses it.

Hi, thanks for your implementation.
Is there any criterion to define accum (the hyper-parameter for EMA)?
Code here:
accum = 0.5 ** (32 / (10 * 1000))
@rosinality

@nobodypengium I took it from the official stylegan2 implementation. It seems the authors defined it in relation to the number of seen images (which is the stylegan2 authors' preferred way to define hyperparameters). I think 0.999 ~ 0.9999 (biggan) works well. (Though stylegan2 uses about 0.998.)
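To make the "number of seen images" convention concrete, the arithmetic behind that line appears to be a per-step decay with a half-life of 10k images at batch size 32 (the interpretation as a half-life is an inference from the formula, not stated in the repo):

```python
batch_size = 32             # images consumed per optimizer step
half_life_imgs = 10 * 1000  # EMA half-life measured in images (10 kimg)

# Per-step decay chosen so the EMA's memory of old weights halves
# every half_life_imgs images, regardless of batch size:
# accum ** (half_life_imgs / batch_size) == 0.5
accum = 0.5 ** (batch_size / half_life_imgs)
print(accum)  # ~0.99778, i.e. roughly the 0.998 mentioned above
```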

Thanks for your answer!