In `transform_roc`, why do we need `xmb[:, :, :, 1]`?
FrankWork opened this issue · 3 comments
FrankWork commented
def transform_roc(X1, X2, X3):
    n_batch = len(X1)
    xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)
    mmb = np.zeros((n_batch, 2, n_ctx), dtype=np.float32)
    start = encoder['_start_']
    delimiter = encoder['_delimiter_']
    for i, (x1, x2, x3) in enumerate(zip(X1, X2, X3)):
        x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
        x13 = [start] + x1[:max_len] + [delimiter] + x3[:max_len] + [clf_token]
        l12 = len(x12)
        l13 = len(x13)
        xmb[i, 0, :l12, 0] = x12
        xmb[i, 1, :l13, 0] = x13
        mmb[i, 0, :l12] = 1
        mmb[i, 1, :l13] = 1
    xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
    return xmb, mmb
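For concreteness, here is a minimal self-contained sketch of the layout this produces (the tiny sizes and toy token ids below are made up for illustration, not the real ROCStories setup): the last-dimension slot 0 holds the token ids, and slot 1 is filled with the values n_vocab + n_special through n_vocab + n_special + n_ctx - 1.

import numpy as np

# Toy sizes and ids, chosen only for illustration.
n_vocab, n_special, n_ctx, max_len = 10, 3, 8, 3
start, delimiter, clf_token = 10, 11, 12   # the 3 special tokens

xmb = np.zeros((1, 2, n_ctx, 2), dtype=np.int32)
x12 = [start] + [1, 2][:max_len] + [delimiter] + [3, 4][:max_len] + [clf_token]
xmb[0, 0, :len(x12), 0] = x12                            # slot 0: token ids
xmb[:, :, :, 1] = np.arange(n_vocab + n_special,
                            n_vocab + n_special + n_ctx)  # slot 1: values 13..20

print(xmb[0, 0])
# [[10 13]
#  [ 1 14]
#  [ 2 15]
#  [11 16]
#  [ 3 17]
#  [ 4 18]
#  [12 19]
#  [ 0 20]]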
rodgzilla commented
This part of `xmb` is used for the learned positional encoding.
The network associates an embedding vector with each position of the input, and that vector is added to the corresponding word embedding during the forward pass of the network.
def forward(self, x):
    x = x.view(-1, x.size(-2), x.size(-1))
    e = self.embed(x)
    h = e.sum(dim=2)
    for block in self.h:
        h = block(h)
    return h
The line `h = e.sum(dim=2)` performs this addition.
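Concretely, token ids and position ids index one shared embedding table: its first n_vocab + n_special rows are word/special-token vectors and its last n_ctx rows are position vectors, which is why `xmb[:, :, :, 1]` is offset by n_vocab + n_special. A minimal sketch of this lookup-and-sum (toy sizes; the nn.Embedding construction here is illustrative, not the exact model code):

import torch
import torch.nn as nn

n_vocab, n_special, n_ctx, n_embd = 10, 3, 8, 4   # toy sizes for illustration
embed = nn.Embedding(n_vocab + n_special + n_ctx, n_embd)

# x has shape (batch, n_ctx, 2): [..., 0] = token id, [..., 1] = position id
x = torch.zeros(1, n_ctx, 2, dtype=torch.long)
x[..., 0] = torch.randint(0, n_vocab + n_special, (1, n_ctx))               # token ids
x[..., 1] = torch.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)  # position ids

e = embed(x)       # shape (1, n_ctx, 2, n_embd)
h = e.sum(dim=2)   # shape (1, n_ctx, n_embd): word embedding + position embedding
print(h.shape)     # torch.Size([1, 8, 4])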
FrankWork commented
@rodgzilla Thank you! You saved my day!
guotong1988 commented
Mark