The purpose of the learntop argument
XuezheMax opened this issue · 2 comments
XuezheMax commented
Hi,
I want to ask about the purpose of the learntop argument.
I found that this argument is only used in the following code:
def prior(name, y_onehot, hps):
    with tf.variable_scope(name):
        n_z = hps.top_shape[-1]
        h = tf.zeros([tf.shape(y_onehot)[0]] + hps.top_shape[:2] + [2 * n_z])
        if hps.learntop:
            h = Z.conv2d_zeros('p', h, 2 * n_z)
        if hps.ycond:
            h += tf.reshape(Z.linear_zeros("y_emb", y_onehot, 2 * n_z),
                            [-1, 1, 1, 2 * n_z])
        pz = Z.gaussian_diag(h[:, :, :, :n_z], h[:, :, :, n_z:])
What is the purpose of feeding a zero tensor h
into a convolution layer?
Thanks.
naturomics commented
Hope my explanation helps:
- The latent space is constrained to this Gaussian distribution p(z; mean, scale), so to calculate logp(z) we need z, mean, and scale all to be known. We get z = encoder(x), but we don't know its mean and scale. The solution here is to treat mean and scale as learnable (mean, scale = h in the code): if learntop is true, mean and scale are trained as part of the model's parameters; otherwise mean, scale = 0, since h is initialized to zero (when ycond=false). You can see they always set learntop=true.
- Why feed a zero h into conv2d: the last layer of the encoder z = f(x) is a 1x1 conv layer (z has shape NHWC), and because a convolution shares its weights across spatial positions, p(z) also shares its mean and scale across the spatial dimensions, i.e. mean and scale should each have shape [1, 1, 1, C]. You could implement that directly as follows (a fuller, runnable version is sketched below):
h = tf.get_variable('h', [1, 1, 1, 2*n_z])
mean = h[:, :, :, :n_z]
logscale = h[:, :, :, n_z:]
replacing the two ops h = tf.zeros(...) and h = conv2d_zeros(...). But then you would need to tile h along the N, H, W dimensions so it has the same shape as z for the subsequent calculations. The repo just plays a trick so the tile op isn't needed: since the input to conv2d_zeros is all zeros, its output is effectively just the learnable bias term, already broadcast to the full [N, H, W, 2*C] shape.
Of course, leaving h with shape [1, 1, 1, 2*n_z] and never calling tile should also work, because TF will do the broadcasting for you.
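For reference, here is a minimal runnable sketch of that alternative, assuming TF1-style graph code like the repo uses; the function name learnable_prior_logp and the written-out log-density (in place of the repo's Z.gaussian_diag helper) are only for illustration:

import numpy as np
import tensorflow as tf

def learnable_prior_logp(name, z, n_z):
    # z: latent tensor of shape [N, H, W, n_z] from the encoder
    with tf.variable_scope(name):
        # A single [1, 1, 1, 2*n_z] learnable parameter holds mean and log-scale,
        # initialized to zero so training starts from a standard Gaussian.
        h = tf.get_variable('h', [1, 1, 1, 2 * n_z],
                            initializer=tf.zeros_initializer())
        mean = h[:, :, :, :n_z]
        logsd = h[:, :, :, n_z:]
        # Diagonal-Gaussian log-density; broadcasting stretches mean/logsd
        # to the [N, H, W, n_z] shape of z, so no explicit tf.tile is needed.
        logp = -0.5 * (np.log(2. * np.pi) + 2. * logsd
                       + tf.square(z - mean) / tf.exp(2. * logsd))
        return tf.reduce_sum(logp, axis=[1, 2, 3])  # log p(z) per example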
XuezheMax commented
Thanks a lot for your reply. It is pretty clear!