google-research/electra

Sampling step?

anshulsamar opened this issue · 2 comments

Thanks for your documentation and transparency here.

Quick question: in sample_from_softmax(logits, disallow=None), you return:

tf.one_hot(tf.argmax(tf.nn.softmax(logits + gumbel_noise), -1, output_type=tf.int32), logits.shape[-1])

I'm wondering why tf.nn.softmax is needed here if the result is passed through tf.argmax anyway. Softmax is strictly increasing in each logit, so it should not change which index is largest. Perhaps this is a holdover from another experiment?
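To illustrate the point, here is a minimal NumPy sketch (not the ELECTRA code) that compares the argmax with and without the softmax. The helper softmax, the array shapes, the seed, and the way the Gumbel noise is drawn (-log(-log(U)) with U uniform) are all assumptions made for this example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis (illustrative helper).
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical logits: 4 positions, vocabulary of 10 tokens.
logits = rng.normal(size=(4, 10))

# Assumed Gumbel(0, 1) noise: -log(-log(U)), U ~ Uniform(1e-9, 1).
uniform = rng.uniform(low=1e-9, high=1.0, size=logits.shape)
gumbel_noise = -np.log(-np.log(uniform))

noisy = logits + gumbel_noise

# Softmax preserves the ordering of its inputs, so both argmaxes
# should pick the same index at every position.
with_softmax = np.argmax(softmax(noisy), axis=-1)
without_softmax = np.argmax(noisy, axis=-1)

same = bool(np.array_equal(with_softmax, without_softmax))
print(same)
```

If the two agree, the softmax only adds computation in this expression. In floating point, near-ties between the top two values could in principle be resolved differently, but that is the only case I can think of where the results would diverge.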

Thanks!

Acejoy commented

Hey, did you find the answer to this?

Hi, I didn't, sorry