/ViT-tf

A TensorFlow implementation of "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" (Vision Transformer)

Primary Language: Python
