google-deepmind/neural-processes

Latent Encoder and Decoder: log_sigma transformation

FabricioArendTorres opened this issue · 1 comment

Hi,

I was wondering about the reasoning behind the specific parameterization of the standard deviation in the latent encoder, i.e. why is it bounded?
I usually just use a softplus in such settings.
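
For concreteness, by a plain softplus I mean something like this (my sketch, not code from this repo):

    # Unbounded softplus: sigma > 0, but no floor away from zero and no cap
    sigma = tf.nn.softplus(log_sigma)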

Also, the parameterizations differ between the encoder and decoder; I'm not sure if that is intentional.
I'd assume both should use tf.sigmoid?

(Latent) Encoder: Bounds SD between 0.1 and 1

    # Compute sigma
    sigma = 0.1 + 0.9 * tf.sigmoid(log_sigma)

Decoder: Bounds SD to be higher than 0.1, but unbounded above?

    # Bound the variance
    sigma = 0.1 + 0.9 * tf.nn.softplus(log_sigma)
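
For comparison, a quick check of the two ranges in eager mode (my own sketch, just to show the asymmetry):

    import tensorflow as tf

    log_sigma = tf.constant([-10.0, 0.0, 10.0])

    # Encoder: sigmoid saturates, so sigma stays within (0.1, 1.0)
    enc_sigma = 0.1 + 0.9 * tf.sigmoid(log_sigma)      # ~[0.10, 0.55, 1.00]

    # Decoder: softplus is unbounded above, so sigma is only floored near 0.1
    dec_sigma = 0.1 + 0.9 * tf.nn.softplus(log_sigma)  # ~[0.10, 0.72, 9.10]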

And thank you for the well-documented repository :).

Hi, the difference in the parameterisation of the sigma is not intentional - in this particular 1D regression case, either choice should be fine. In general, the range of the sigma should be problem-dependent, but it usually helps to lower-bound it by some small value > 0 for stability in optimisation. Hope that helps!
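
For example, lower-bounding a softplus directly would look something like this (a sketch of the general pattern, not code from the repo):

    # Hypothetical lower-bounded softplus: sigma in (sigma_min, inf)
    sigma_min = 0.1
    sigma = sigma_min + tf.nn.softplus(log_sigma)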