emla2805/vision-transformer

TensorFlow implementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Python

Issues

download error
#4 opened 3 years ago by mouni13r
1
Do you have the original pretrain model weights in TF format?
#3 opened 3 years ago by knaffe
0
Training on Custom video
#2 opened 4 years ago by Jalilnkh
2
Could you provide the code for visualizing the attention map or attention distance?
#1 opened 4 years ago by rainylt
0