lucidrains/x-transformers

[new paper] TokenFormer


This novel paper could be a nice addition to the features of the repo:

https://arxiv.org/abs/2410.23168

The authors replace the linear projections with attention over learnable parameter tokens and claim large gains in training efficiency.
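For anyone skimming, here is a rough sketch of the idea as I understand it, not the paper's code and not anything in this repo. The input vector acts as the query, and two learnable matrices act as the keys and values, so the output is an attention-weighted mix of value rows instead of a fixed matrix product. The names `param_attention`, `key_params` and `value_params` are made up for illustration, and the plain softmax is a simplifying assumption (the paper may use a different normalization).

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def param_attention(x, key_params, value_params):
    """Stand-in for a linear projection (hypothetical helper).

    x:            input token, list of d_in floats (the query)
    key_params:   n learnable "parameter tokens", each of size d_in
    value_params: n learnable "parameter tokens", each of size d_out
    Returns a list of d_out floats.
    """
    # Score the input against every key parameter token (dot product).
    scores = [sum(xi * ki for xi, ki in zip(x, k)) for k in key_params]
    # Assumption: plain softmax; the paper may normalize differently.
    weights = softmax(scores)
    d_out = len(value_params[0])
    # Output is the attention-weighted sum of the value parameter tokens.
    return [sum(w * v[j] for w, v in zip(weights, value_params))
            for j in range(d_out)]


random.seed(0)
d_in, d_out, n_params = 4, 3, 8
key_params = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(n_params)]
value_params = [[random.gauss(0, 1) for _ in range(d_out)] for _ in range(n_params)]

x = [0.5, -1.0, 0.25, 2.0]
y = param_attention(x, key_params, value_params)  # list of d_out floats
```

In this sketch the number of parameter tokens (`n_params`) is independent of `d_in` and `d_out`, so rows can be appended to `key_params` and `value_params` without changing the layer's input or output shape.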

@guillaumeguy hey Guillaume, this was brought up already

unfortunately, not a believer yet

Always good to have a certain degree of skepticism! Thanks for looking into it.