lucidrains/performer-pytorch

[Feature] Adding fixed positional embeddings as an option

gulnazaki opened this issue · 3 comments

I believe that, although learnable positional embeddings are the trend nowadays, it would help to offer fixed embeddings (sinusoidal, as in the original implementation) as an option for relatively small-dataset scenarios, where it is hard to learn a meaningful embedding. At the very least, it would be interesting to compare both methods.

I see you included fixed embeddings in the reformer implementation, but don't you think it would be more efficient to compute them once at initialization? (like here)
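
For illustration, a minimal sketch of what computing the sinusoidal table once at init and registering it as a buffer could look like (following the PyTorch transformer tutorial's approach; the class name and `max_seq_len` argument are just illustrative, not this library's API):

```python
import math
import torch
from torch import nn

class FixedPositionalEmbedding(nn.Module):
    """Sinusoidal positional embedding, precomputed once at init (non-learnable).
    Assumes an even embedding dimension."""
    def __init__(self, dim, max_seq_len):
        super().__init__()
        position = torch.arange(max_seq_len).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, dim, 2) * (-math.log(10000.0) / dim))
        pe = torch.zeros(max_seq_len, dim)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        # buffer: moves with .to(device) / state_dict, but is not a trainable parameter
        self.register_buffer('pe', pe)

    def forward(self, x):
        # x: (batch, seq_len, dim) -> add the precomputed table, sliced to seq_len
        return x + self.pe[:x.size(1)]
```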

Btw, I read a cool paper that compares fixed positional embeddings with the ones learned by BERT, GPT-2 and RoBERTa.

If you prefer, I could open a PR for this, adding the implementation from the above PyTorch tutorial, but it is no big deal.

@gulnazaki yea sure, I would welcome a PR on that :D I'll check out the paper you recommended tonight, thank you! Another good one I read recently is https://arxiv.org/abs/2006.15595

Seems pretty interesting, I'll check it out, thanks.

Ok, I'll give it a look later. Do you think axial would also be a good embedding option to include?

@gulnazaki yea, axial is great! :)
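
For anyone landing here later, a rough sketch of the axial idea: factor each position over two axes and sum two smaller learned tables, so the parameter count grows with the axial dimensions rather than the full sequence length. The class name and `axial_shape` argument are hypothetical, not performer-pytorch's actual interface:

```python
import torch
from torch import nn

class AxialPositionalEmbeddingSketch(nn.Module):
    """Axial positional embedding sketch: two learned tables (one per axis)
    are broadcast-summed and flattened to cover axial_shape[0] * axial_shape[1] positions."""
    def __init__(self, dim, axial_shape=(64, 64)):
        super().__init__()
        self.axial_shape = axial_shape
        self.rows = nn.Parameter(torch.randn(axial_shape[0], 1, dim) * 0.02)
        self.cols = nn.Parameter(torch.randn(1, axial_shape[1], dim) * 0.02)

    def forward(self, x):
        # (rows, 1, dim) + (1, cols, dim) -> (rows, cols, dim), then flatten to (rows*cols, dim)
        pos = (self.rows + self.cols).reshape(-1, x.size(-1))
        # x: (batch, seq_len, dim); slice the table to the current sequence length
        return x + pos[:x.size(1)]
```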