In `transform_roc`, why do we need `xmb[:, :, :, 1]`?
FrankWork opened this issue · 3 comments
FrankWork commented
def transform_roc(X1, X2, X3):
    n_batch = len(X1)
    xmb = np.zeros((n_batch, 2, n_ctx, 2), dtype=np.int32)
    mmb = np.zeros((n_batch, 2, n_ctx), dtype=np.float32)
    start = encoder['_start_']
    delimiter = encoder['_delimiter_']
    for i, (x1, x2, x3) in enumerate(zip(X1, X2, X3)):
        x12 = [start] + x1[:max_len] + [delimiter] + x2[:max_len] + [clf_token]
        x13 = [start] + x1[:max_len] + [delimiter] + x3[:max_len] + [clf_token]
        l12 = len(x12)
        l13 = len(x13)
        xmb[i, 0, :l12, 0] = x12
        xmb[i, 1, :l13, 0] = x13
        mmb[i, 0, :l12] = 1
        mmb[i, 1, :l13] = 1
    xmb[:, :, :, 1] = np.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)
    return xmb, mmb
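For concreteness, here is a minimal self-contained sketch of the layout this produces (the tiny sizes and toy token ids below are made up for illustration, not the real ROCStories setup): the last-dimension slot 0 holds the token ids, and slot 1 is filled with the values n_vocab + n_special through n_vocab + n_special + n_ctx - 1.

import numpy as np

# Toy sizes and ids, chosen only for illustration.
n_vocab, n_special, n_ctx, max_len = 10, 3, 8, 3
start, delimiter, clf_token = 10, 11, 12   # the 3 special tokens

xmb = np.zeros((1, 2, n_ctx, 2), dtype=np.int32)
x12 = [start] + [1, 2][:max_len] + [delimiter] + [3, 4][:max_len] + [clf_token]
xmb[0, 0, :len(x12), 0] = x12                            # slot 0: token ids
xmb[:, :, :, 1] = np.arange(n_vocab + n_special,
                            n_vocab + n_special + n_ctx)  # slot 1: values 13..20

print(xmb[0, 0])
# [[10 13]
#  [ 1 14]
#  [ 2 15]
#  [11 16]
#  [ 3 17]
#  [ 4 18]
#  [12 19]
#  [ 0 20]]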
rodgzilla commented
This part of `xmb` is used for the learned positional encoding.
The network associates an embedding vector with each position of the input, and that vector is added to the corresponding word embedding during the forward pass of the network.
def forward(self, x):
    x = x.view(-1, x.size(-2), x.size(-1))
    e = self.embed(x)
    h = e.sum(dim=2)
    for block in self.h:
        h = block(h)
    return h
The line `h = e.sum(dim=2)` performs this addition.
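Concretely, token ids and position ids index one shared embedding table: its first n_vocab + n_special rows are word/special-token vectors and its last n_ctx rows are position vectors, which is why `xmb[:, :, :, 1]` is offset by n_vocab + n_special. A minimal sketch of this lookup-and-sum (toy sizes; the nn.Embedding construction here is illustrative, not the exact model code):

import torch
import torch.nn as nn

n_vocab, n_special, n_ctx, n_embd = 10, 3, 8, 4   # toy sizes for illustration
embed = nn.Embedding(n_vocab + n_special + n_ctx, n_embd)

# x has shape (batch, n_ctx, 2): [..., 0] = token id, [..., 1] = position id
x = torch.zeros(1, n_ctx, 2, dtype=torch.long)
x[..., 0] = torch.randint(0, n_vocab + n_special, (1, n_ctx))               # token ids
x[..., 1] = torch.arange(n_vocab + n_special, n_vocab + n_special + n_ctx)  # position ids

e = embed(x)       # shape (1, n_ctx, 2, n_embd)
h = e.sum(dim=2)   # shape (1, n_ctx, n_embd): word embedding + position embedding
print(h.shape)     # torch.Size([1, 8, 4])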
FrankWork commented
@rodgzilla Thank you! You saved my day!
guotong1988 commented
Mark