fkodom/dilated-attention-pytorch
(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" (https://arxiv.org/abs/2307.02486)
Python · MIT License
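
For orientation, here is a minimal, self-contained sketch of the dilated attention mechanism described in the LongNet paper, written directly against plain PyTorch rather than this repo's API. The function name `dilated_attention` and its arguments are illustrative assumptions, not the library's interface. LongNet mixes the outputs of several (segment length, dilation rate) pairs; the sketch covers a single pair.

```python
import torch
import torch.nn.functional as F


def dilated_attention(q, k, v, segment_length, dilation_rate):
    """Single-pair dilated attention (hypothetical sketch, not this repo's API).

    q, k, v: (batch, seq_len, num_heads, head_dim). Assumes seq_len is
    divisible by segment_length, and segment_length by dilation_rate.
    """
    b, n, h, d = q.shape
    w, r = segment_length, dilation_rate
    # Split the sequence into non-overlapping segments of length w.
    q = q.view(b, n // w, w, h, d)
    k = k.view(b, n // w, w, h, d)
    v = v.view(b, n // w, w, h, d)
    # Sparsify each segment by keeping every r-th position.
    q, k, v = q[:, :, ::r], k[:, :, ::r], v[:, :, ::r]
    # Fold (batch, segments) together and put heads before the token axis,
    # then run standard scaled dot-product attention within each segment.
    q = q.permute(0, 1, 3, 2, 4).reshape(-1, h, w // r, d)
    k = k.permute(0, 1, 3, 2, 4).reshape(-1, h, w // r, d)
    v = v.permute(0, 1, 3, 2, 4).reshape(-1, h, w // r, d)
    out = F.scaled_dot_product_attention(q, k, v)
    # Scatter the attended tokens back to their original positions;
    # positions that were dropped by the dilation stay zero.
    out = out.reshape(b, n // w, h, w // r, d).permute(0, 1, 3, 2, 4)
    full = torch.zeros(b, n // w, w, h, d, dtype=out.dtype, device=out.device)
    full[:, :, ::r] = out
    return full.reshape(b, n, h, d)


if __name__ == "__main__":
    b, n, h, d = 2, 8192, 8, 64
    q, k, v = (torch.randn(b, n, h, d) for _ in range(3))
    out = dilated_attention(q, k, v, segment_length=2048, dilation_rate=4)
    print(out.shape)  # torch.Size([2, 8192, 8, 64])
```

In the paper, the sparsified positions are additionally offset per attention head, and the outputs from multiple (segment length, dilation rate) pairs are averaged with weights derived from their softmax denominators; both refinements are omitted here for brevity.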