The purpose of the learntop argument
XuezheMax opened this issue · 2 comments
XuezheMax commented
Hi,
I want to ask about the purpose of the learntop argument.
I found that this argument is only used in the following code:
def prior(name, y_onehot, hps):
    with tf.variable_scope(name):
        n_z = hps.top_shape[-1]
        h = tf.zeros([tf.shape(y_onehot)[0]] + hps.top_shape[:2] + [2 * n_z])
        if hps.learntop:
            h = Z.conv2d_zeros('p', h, 2 * n_z)
        if hps.ycond:
            h += tf.reshape(Z.linear_zeros("y_emb", y_onehot, 2 * n_z),
                            [-1, 1, 1, 2 * n_z])
        pz = Z.gaussian_diag(h[:, :, :, :n_z], h[:, :, :, n_z:])
What is the purpose of feeding a zero tensor h
into a convolution layer?
Thanks.
naturomics commented
Hope my explanation helps:
- The latent space is constrained to this Gaussian distribution p(z; mean, scale), so to calculate logp(z) we need z, mean, and scale all to be known. We get z = encoder(x), but we don't know its mean and scale. The solution here is to treat mean and scale as learnable (mean, scale = h in the code): if learntop is true, mean and scale are trained as part of the model's parameters; otherwise mean, scale = 0, since h is initialized to zero (when ycond=false). You can see they always set learntop=true.
- Why feed a zero h into conv2d: the last layer of the encoder z = f(x) is a 1x1 conv layer (z has shape NHWC), and because a convolution shares its weights across spatial positions, p(z) also shares its mean and scale across the spatial dimensions, i.e. mean and scale should each have shape [1, 1, 1, C]. You could implement that directly as follows (a fuller, runnable version is sketched below):
h = tf.get_variable('h', [1, 1, 1, 2*n_z])
mean = h[:, :, :, :n_z]
logscale = h[:, :, :, n_z:]
replacing the two ops h = tf.zeros(...) and h = conv2d_zeros(...). But then you would need to tile h along the N, H, W dimensions so it has the same shape as z for the subsequent calculations. The repo just plays a trick so the tile op isn't needed: since the input to conv2d_zeros is all zeros, its output is effectively just the learnable bias term, already broadcast to the full [N, H, W, 2*C] shape.
Of course, leaving h with shape [1, 1, 1, 2*n_z] and never calling tile should also work, because TF will do the broadcasting for you.
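For reference, here is a minimal runnable sketch of that alternative, assuming TF1-style graph code like the repo uses; the function name learnable_prior_logp and the written-out log-density (in place of the repo's Z.gaussian_diag helper) are only for illustration:

import numpy as np
import tensorflow as tf

def learnable_prior_logp(name, z, n_z):
    # z: latent tensor of shape [N, H, W, n_z] from the encoder
    with tf.variable_scope(name):
        # A single [1, 1, 1, 2*n_z] learnable parameter holds mean and log-scale,
        # initialized to zero so training starts from a standard Gaussian.
        h = tf.get_variable('h', [1, 1, 1, 2 * n_z],
                            initializer=tf.zeros_initializer())
        mean = h[:, :, :, :n_z]
        logsd = h[:, :, :, n_z:]
        # Diagonal-Gaussian log-density; broadcasting stretches mean/logsd
        # to the [N, H, W, n_z] shape of z, so no explicit tf.tile is needed.
        logp = -0.5 * (np.log(2. * np.pi) + 2. * logsd
                       + tf.square(z - mean) / tf.exp(2. * logsd))
        return tf.reduce_sum(logp, axis=[1, 2, 3])  # log p(z) per example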
XuezheMax commented
Thanks a lot for your reply. It is pretty clear!