emla2805/vision-transformer
TensorFlow implementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
Python
Issues
- 1
download error
#4 opened by mouni13r - 0
- 2
Training on Custom video
#2 opened by Jalilnkh - 0
Could you provide the code for visualizing the attention map or attention distance?
#1 opened by rainylt