tahmid0007/VisionTransformer
A complete, easy-to-follow implementation of Google's Vision Transformer, proposed in "An Image is Worth 16x16 Words". This PyTorch implementation is commented for better understanding.
Python
Stargazers
- 1pha (Cognitive System Lab, Artificial Intelligence Department, Korea University)
- Abhranta
- Alchemistyui
- amnesiack (Munich)
- Christine620
- darkbuck
- decoherencer (IIT Delhi)
- dipesh-commits (LSU)
- Doubiiu (The Chinese University of Hong Kong)
- gjk287
- HeimingX (UofAdelaide)
- jenhuluck (Maryland)
- jlqzzz
- Jongchan (Lunit Inc.)
- kizombaciao
- korejan
- LJ-Zhang
- marcospiau (Brazil)
- MaxiaoyuHehe
- menorki
- MicPie (OpenBioML.org)
- MJVNOR
- mritunjaymusale (@KJSCE-SVU)
- mymuli (Nanyang Technological University | Sun Yat-sen University | Beijing University of Posts and Telecommunications)
- randy8642
- robeson1010
- seominseok0429 (South Korea, Daejeon)
- SuperBruceJia (Boston University)
- tahmedge
- tahmid0007 (Arkeus)
- tasfia
- TheDudeFromCI (Hobbyist)
- xingyi-li (Huazhong University of Science and Technology)
- YoungGod (Zhejiang University / CAS)
- yuyaozhao
- zeyh