run-youngjoo/SC-FEGAN

Gated transposed convolution seems odd

Opened this issue · 2 comments

What the code (in ops.py) currently does is:

  1. apply the same transposed convolution (TC) to the input x twice,
    resulting in x1, x2 with x1 == x2. In the code, x1 corresponds to deconv and x2 to g.
  2. to both outputs of the TC, a different learned bias is added: x1 += b1, x2 += b2
  3. leaky relu is applied: x1 = lrelu(x1, 0.2)
  4. x2 is overwritten: g = tf.nn.sigmoid(deconv)
  5. x1 = x1 * x2, which is equivalent to
    lrelu(TC(x) + b1, 0.2) * sigmoid(lrelu(TC(x) + b1, 0.2)) (see the sketch below)

Two things seem kind of strange and raised the following questions:

  1. Why do we apply the same convolution to create both the mask and the features? Shouldn't we use two different convolutions, as we do with gated convolution? And if the convolution is shared, why the two different biases?
  2. Shouldn't we apply the sigmoid to g instead of to deconv? (A sketch of what I would expect is below these questions.)
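
For comparison, this is a rough sketch (mine, not from the repo) of what I would have expected gate_deconv to look like: two separately learned filters, leaky relu on the feature branch, and the sigmoid applied to the gate branch. The function name and w_g are only illustrative:

import tensorflow as tf  # TF 1.x style, matching the original ops.py

def gate_deconv_expected(input_, output_shape, k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,
       name="deconv", training=True):
    with tf.variable_scope(name):
        # separate filters for the feature branch and the gate branch
        w = tf.get_variable('w', [k_h, k_w, output_shape[-1], input_.get_shape()[-1]],
                  initializer=tf.random_normal_initializer(stddev=stddev))
        w_g = tf.get_variable('w_g', [k_h, k_w, output_shape[-1], input_.get_shape()[-1]],
                  initializer=tf.random_normal_initializer(stddev=stddev))

        # feature branch: transposed conv + bias + leaky relu
        deconv = tf.nn.conv2d_transpose(input_, w, output_shape=output_shape,
                    strides=[1, d_h, d_w, 1])
        b1 = tf.get_variable('biases1', [output_shape[-1]], initializer=tf.constant_initializer(0.0))
        deconv = tf.nn.leaky_relu(tf.nn.bias_add(deconv, b1))

        # gate branch: its own transposed conv + bias, squashed with sigmoid
        g = tf.nn.conv2d_transpose(input_, w_g, output_shape=output_shape,
                    strides=[1, d_h, d_w, 1])
        b2 = tf.get_variable('biases2', [output_shape[-1]], initializer=tf.constant_initializer(0.0))
        g = tf.nn.sigmoid(tf.nn.bias_add(g, b2))

        return tf.multiply(g, deconv), g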

Thanks in advance!

For convenience I posted the relevant code below:

def gate_deconv(input_, output_shape, k_h=5, k_w=5, d_h=2, d_w=2, stddev=0.02,
       name="deconv", training=True):
    with tf.variable_scope(name):
        # filter : [height, width, output_channels, in_channels]
        w = tf.get_variable('w', [k_h, k_w, output_shape[-1], input_.get_shape()[-1]],
                  initializer=tf.random_normal_initializer(stddev=stddev))

        deconv = tf.nn.conv2d_transpose(input_, w, output_shape=output_shape,
                    strides=[1, d_h, d_w, 1])

        biases = tf.get_variable('biases1', [output_shape[-1]], initializer=tf.constant_initializer(0.0))
        deconv = tf.reshape(tf.nn.bias_add(deconv, biases), deconv.get_shape())
        deconv = tf.nn.leaky_relu(deconv)

        # note: the gate branch reuses the same filter `w` as the feature branch
        g = tf.nn.conv2d_transpose(input_, w, output_shape=output_shape,
                    strides=[1, d_h, d_w, 1])
        b = tf.get_variable('biases2', [output_shape[-1]], initializer=tf.constant_initializer(0.0))
        g = tf.reshape(tf.nn.bias_add(g, b), deconv.get_shape())
        g = tf.nn.sigmoid(deconv)  # note: sigmoid is applied to deconv, not to the biased g

        deconv = tf.multiply(g,deconv)

        return deconv, g

Issue #37 also mentions this, and yes, you're right. You can check my implementation here. It's not complete yet, but I've mostly re-implemented it in Keras.

Awesome, I'll check it out!