Modify the transformer tutorial based on performer

Question

HelloWorldLTY opened this issue 2 years ago · 0 comments

Hi, I am trying to use Performer in place of the standard transformer model in this tutorial:

https://pytorch.org/tutorials/search.html?q=pre-training&check_keywords=yes&area=default

However, I get the following error:

ext, mask, context_mask, **kwargs)
426 if exists(context_mask):
427 global_mask = context_mask[:, None, :, None]
--> 428 v.masked_fill_(~global_mask, 0.)
430 if exists(pos_emb) and not cross_attend:
431 q, k = apply_rotary_pos_emb(q, k, pos_emb)

RuntimeError: The expanded size of the tensor (20) must match the existing size (35) at non-singleton dimension 2. Target sizes: [35, 4, 20, 64]. Tensor sizes: [35, 1, 35, 1]

I do not know why this happens. Is this model different from the standard transformer? Thanks a lot.
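
For reference, the shapes in the error message can be checked without running the model. The following is a minimal sketch, assuming that `v` is laid out as (batch, heads, sequence, head dim) and that `context_mask` was a 2-D tensor. `global_mask_shape` and `can_expand` are hypothetical helpers written for this post that only mimic the broadcasting rule; they are not library functions.

```python
def global_mask_shape(context_mask_shape):
    """Shape of context_mask[:, None, :, None] for a 2-D context_mask."""
    rows, cols = context_mask_shape
    return (rows, 1, cols, 1)


def can_expand(mask_shape, target_shape):
    """True if mask_shape can be broadcast to target_shape for an in-place op."""
    if len(mask_shape) > len(target_shape):
        return False
    # Compare trailing dimensions; missing leading mask dims count as 1.
    for m, t in zip(reversed(mask_shape), reversed(target_shape)):
        if m != 1 and m != t:
            return False
    return True


# Target shape taken from the error message.
v_shape = (35, 4, 20, 64)

# Assumption: a square (35, 35) mask was passed as context_mask, which gives
# the [35, 1, 35, 1] tensor reported in the error.
square = global_mask_shape((35, 35))

# Assumption: if v is (batch, heads, seq, dim_head), a per-token mask would
# have shape (batch, seq) = (35, 20).
per_token = global_mask_shape((35, 20))

print(square, can_expand(square, v_shape))        # (35, 1, 35, 1) False
print(per_token, can_expand(per_token, v_shape))  # (35, 1, 20, 1) True
```

If this reading is right, the layer is treating 35 as the batch size and 20 as the sequence length. That would be consistent with sequence-first inputs and a square attention mask being passed to a model that expects batch-first inputs and a per-token mask, but I have not confirmed this against the library.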