[new paper] TokenFormer
Closed this issue · 2 comments
guillaumeguy commented
This novel paper could be a nice addition to the features of the repo:
https://arxiv.org/abs/2410.23168
The authors replaced the linear projections with another attention layer and claim large gains in training speed.
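For anyone skimming, here is a rough sketch of the idea as I read it: each input token attends over a set of learnable "parameter tokens" instead of going through a fixed weight matrix. This is only an illustration, not the authors' code. The names `param_attention`, `key_params`, and `value_params` are made up for this example, and plain softmax is used as a stand-in for whatever normalization the paper actually applies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def param_attention(tokens, key_params, value_params):
    """Attend from each input token to a set of parameter tokens.

    tokens:       n vectors of size d_in (the layer input)
    key_params:   m vectors of size d_in (hypothetical learnable keys)
    value_params: m vectors of size d_out (hypothetical learnable values)
    Returns n vectors of size d_out, playing the role of a projection.
    """
    d_out = len(value_params[0])
    outputs = []
    for token in tokens:
        # Similarity between this input token and every parameter key.
        scores = [sum(t * k for t, k in zip(token, key)) for key in key_params]
        # Assumption: plain softmax, used here only to keep the sketch simple.
        weights = softmax(scores)
        # Weighted mix of the parameter values replaces `token @ W`.
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, value_params))
            for j in range(d_out)
        ])
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
key_params = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
value_params = [[1.0, 2.0, 3.0]] * 4
out = param_attention(tokens, key_params, value_params)
```

In this sketch the output size is set by the width of `value_params`, while the number of parameter tokens can change without changing the input or output shapes.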
lucidrains commented
@guillaumeguy hey Guillaume, this was brought up already
unfortunately, not a believer yet
guillaumeguy commented
Always good to have a certain degree of skepticism! Thanks for looking into it.