google-research/electra

Can you share models trained with all weights tied?

YovaKem opened this issue · 0 comments

In the paper you say, "On the other hand, tying all encoder weights caused little improvement while incurring the significant disadvantage of requiring the generator and discriminator to be the same size."
Is it possible to share the generator and discriminator you trained to obtain this result?