lucidrains/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Python · MIT license
Issues
LayerNorm for ViT
#244 opened, 5 comments
Neighbourhood Attention Implementation
#243 opened, 1 comment
MAE `decoder_tokens` computation
#241 opened, 2 comments
How to retrain ViT
#240 opened, 2 comments
Question about attention's qkv matrix
#237 opened, 4 comments
Simplify `to_patch_embedding` using Conv2d
#236 opened, 1 comment
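The Conv2d idea in #236 rests on a simple equivalence: a Conv2d whose kernel size equals its stride computes the same thing as slicing the image into non-overlapping patches and applying one shared linear projection to each flattened patch. A pure-Python toy sketch of that equivalence (single input and output channel; all names and sizes here are illustrative, not vit-pytorch's actual API):

```python
def conv_patch_embed(image, patch, weight):
    """image: H x H grid (list of lists); weight: flat list of length patch*patch.
    Returns one scalar embedding per non-overlapping patch, in row-major order."""
    out = []
    for r in range(0, len(image), patch):
        for c in range(0, len(image), patch):
            # Flatten the patch, then apply the shared linear projection --
            # exactly what a stride == kernel_size convolution does per window.
            flat = [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            out.append(sum(v * w for v, w in zip(flat, weight)))
    return out

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
# A weight vector that just picks each patch's top-left pixel:
print(conv_patch_embed(image, 2, [1, 0, 0, 0]))  # [1, 3, 9, 11]
```

In PyTorch terms this corresponds to `nn.Conv2d(channels, dim, kernel_size=patch_size, stride=patch_size)` followed by flattening the spatial grid into a token sequence, versus the library's rearrange-then-`nn.Linear` formulation; both are the same linear map on patches.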
Loading weights of custom ViT models
#234 opened, 1 comment
Visualize the attention weights
#233 opened, 1 comment
Distillation RuntimeError
#232 opened, 0 comments
Attention maps for PiT
#230 opened, 1 comment
Question about example notebook
#229 opened, 1 comment
EfficientFormer Request!
#228 opened, 0 comments
How to calculate Params and FLOPs for ViT?
#225 opened, 2 comments
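For the parameter-counting question in #225, most of a ViT's parameters sit in its Transformer blocks, and a block's count follows from its layer shapes. A back-of-envelope sketch, assuming biased q/k/v and output projections and a 2-layer MLP (vit-pytorch's actual layers may differ, e.g. its qkv projection has no bias):

```python
def transformer_block_params(dim, mlp_dim):
    # q, k, v and output projections: four dim x dim matrices plus biases
    attn = 4 * dim * dim + 4 * dim
    # two Linear layers of the MLP: dim -> mlp_dim -> dim, with biases
    mlp = dim * mlp_dim + mlp_dim + mlp_dim * dim + dim
    # two LayerNorms, each with a weight and bias vector of length dim
    norms = 2 * 2 * dim
    return attn + mlp + norms

# ViT-Base-sized block: dim=768, mlp_dim=3072 -> roughly 7.1M params per block
print(transformer_block_params(768, 3072))
```

For exact numbers, `sum(p.numel() for p in model.parameters())` on the instantiated model is the reliable route; FLOPs additionally depend on sequence length (attention is quadratic in the number of patches), so a profiler such as `fvcore` or `ptflops` is the usual tool there.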
MaxViT's MbConv doesn't match the article
#223 opened, 2 comments
Add another MLP head in the Vision Transformer
#222 opened, 8 comments
Attention maps for rectangular input
#219 opened, 1 comment
Issue: ViT + loss function
#214 opened, 2 comments
Training on my own datasets? train.py, pre.py?
#213 opened, 2 comments
Did you miss dropout?
#209 opened, 2 comments
CCT and non-square images
#208 opened, 0 comments
A new idea
#207 opened, 2 comments
How to train on a custom dataset
#204 opened, 0 comments
How to use Multi-Head Attention in ViT
#201 opened, 0 comments
MAE using a pretrained ViT
#196 opened, 3 comments
How to get the feature map of the ViT encoder
#191 opened, 1 comment
Does vit-pytorch have MViT?
#190 opened, 2 comments
ViT MAE decoder positional embeddings
#189 opened, 0 comments
ViT MAE reconstruction size mismatch
#187 opened, 2 comments
Where is train.py? Thanks
#185 opened, 1 comment
About different patch embedding in ViT
#184 opened