about the variance estimation
orangeniux opened this issue · 1 comment
For the Variance layer, should the second activation function be ReLU? Since tanh can output negative values, the resulting variance could be negative. Or is this step not about the sampling variance? Please advise, thanks a lot!
You are possibly referring to the Variance layer in the gluformer/variance.py file, which indeed returns a value in [-10, 10], because the last activation function is tanh and the output is then manually scaled by multiplying by 10.
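For illustration, such a head might look like the following minimal PyTorch sketch. This is not the actual gluformer/variance.py code; the class name `LogVarianceHead`, the `d_model` parameter, and the single linear projection are hypothetical stand-ins:

```python
import torch
import torch.nn as nn

class LogVarianceHead(nn.Module):
    """Illustrative sketch: emits a log-variance bounded to [-10, 10]."""

    def __init__(self, d_model: int):
        super().__init__()
        # Hypothetical projection from the model dimension to a scalar.
        self.proj = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # tanh bounds the raw output to [-1, 1]; scaling by 10 yields a
        # log-variance in [-10, 10], so exp(log_var) can never be negative.
        return 10.0 * torch.tanh(self.proj(x))
```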
This is not the variance but rather the log-variance; see the error computation in model_train.py (line 143) and the associated process_batch function in utils/train.py. Hence the variance is indeed restricted to the range [exp(-10), exp(10)], which avoids computational instability during training.
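In other words, the loss exponentiates this output, so the negative values in [-10, 0) are harmless: they simply correspond to variances below 1. A hedged sketch of a Gaussian negative log-likelihood in this log-variance parameterization (the function name `gaussian_nll` is hypothetical, and this is not the actual code at model_train.py line 143):

```python
import torch

def gaussian_nll(pred: torch.Tensor,
                 log_var: torch.Tensor,
                 target: torch.Tensor) -> torch.Tensor:
    # Negative log-likelihood of a Gaussian with mean `pred` and
    # log-variance `log_var` (up to an additive constant). Working in
    # log-variance guarantees exp(log_var) > 0 by construction, and
    # bounding log_var to [-10, 10] keeps exp() numerically stable.
    return 0.5 * (log_var + (target - pred) ** 2 / torch.exp(log_var)).mean()
```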