Issues
ViT-B Training for DeiT
#233 opened by ziqipang - 0
Will you release the IN1k accuracy of the tiny model trained with the official DeiT III framework?
#241 opened by chenziwenhaoshuai - 2
DeiT depth 24 (CaiT - TABLE 1)
#218 opened by GoJunHyeong - 0
Gradient accumulation code
#240 opened by King4819 - 0
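The gradient-accumulation code the issue asks about is not shown here; as a framework-agnostic sketch of the idea, gradients from several micro-batches are averaged before a single parameter update, emulating a batch `accum_steps` times larger (`train_step` and its arguments are illustrative, not from the repo):

```python
def train_step(param, micro_grads, lr, accum_steps):
    """One accumulated update: average `accum_steps` micro-batch gradients,
    then apply a single SGD step, matching the lr semantics of the big batch."""
    grad = 0.0
    for g in micro_grads[:accum_steps]:
        grad += g / accum_steps  # average, so lr does not need rescaling
    return param - lr * grad

print(train_step(1.0, [0.5, 1.5], 0.1, 2))  # prints 0.9
```

In a real PyTorch loop the same pattern is `loss / accum_steps; loss.backward()` on each micro-batch, with `optimizer.step()` and `optimizer.zero_grad()` only every `accum_steps` iterations.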
Question about different seeds per gpu with DDP
#239 opened by HIT-LiuChen - 0
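For context on the per-GPU seed question: the repo's `main.py` offsets the base seed by the process rank (`seed = args.seed + utils.get_rank()`) so each DDP worker draws different random augmentations. A torch-free sketch of that pattern (`seed_for_rank` is an illustrative name):

```python
import random

def seed_for_rank(base_seed, rank):
    # Per-process seed: offset the shared base seed by the DDP rank so
    # workers do not all sample identical augmentations.
    return base_seed + rank

# Two "ranks" with the same base seed yield different random streams:
rng0 = random.Random(seed_for_rank(0, rank=0))
rng1 = random.Random(seed_for_rank(0, rank=1))
print(rng0.random() != rng1.random())  # prints True
```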
Inclusion of Transformers Need Registers
#237 opened by mileseverett - 2
Slow Training
#234 opened by mueller-mp - 0
random.seed(seed) on line 205 is commented out
#236 opened by Phuoc-Hoan-Le - 0
Checkpoints of IN21K pretrained deit III
#232 opened by Byakuya-zi - 0
TracerWarning
#230 opened by maingoc1605 - 2
batch_size flag
#220 opened by tsengalb99 - 2
How to launch training of CaiT models?
#226 opened by elias-ramzi - 0
Code for cosub
#224 opened by ppalantir - 2
The ablation experiment of DeiT
#215 opened by Berry-Wu - 5
ImageNet21K data preparation for pre-training
#219 opened by mxjecho - 1
Meaning of the model name (ResMLP)
#207 opened by YHYeooooong - 1
Can I use timm==0.4.12 instead of timm==0.3.2 ?
#206 opened by irhallac - 2
unexpected keyword argument 'pretrained_cfg'
#212 opened by entron - 1
Are the hyperparameters for DeiT-T and for DeiT-S any different than DeiT-B?
#201 opened by Phuoc-Hoan-Le - 2
ImageNet21k pretrained model without finetuning on 1k
#181 opened by bhheo - 1
How long is it supposed to take to train on ImageNet21k for 90 epochs with 8 V100 GPUs?
#198 opened by Phuoc-Hoan-Le - 1
number of classes
#197 opened by Ye-Na-Kim - 1
How to implement cosub training using DeiT-III
#217 opened by xiaoguang-1 - 0
How to implement cosub training using DeiT-III
#216 opened by xiaoguang-1 - 0
Single machine multi-GPU training
#213 opened by AlexNmSED - 0
Multi-node support
#208 opened by Phuoc-Hoan-Le - 0
Multinode Slurm Training
#204 opened by yazdanimehdi - 0
What batch size number other than 1024 have been tried when training a DeiT model?
#205 opened by Phuoc-Hoan-Le - 3
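Relevant to the batch-size question: the repo linearly scales the base learning rate by the global batch size over a reference batch of 512 (`linear_scaled_lr = args.lr * args.batch_size * utils.get_world_size() / 512.0` in `main.py`), so other batch sizes are usable if the lr is rescaled accordingly. A minimal sketch of that rule (function name is illustrative):

```python
def linear_scaled_lr(base_lr, batch_per_gpu, world_size, base_batch=512):
    # Linear lr scaling: grow the base lr in proportion to the
    # global batch size relative to the reference batch of 512.
    return base_lr * batch_per_gpu * world_size / base_batch

# 8 GPUs x 128 per GPU = global batch 1024, i.e. double the reference:
print(linear_scaled_lr(5e-4, 128, 8))  # prints 0.001
```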
Is EMA used in DeiT-III?
#203 opened by mzr1996 - 1
Question about Throughput
#189 opened by techmonsterwang - 1
cifar100 pretrain model?
#191 opened by Wang-Y-S - 1
What is the difference between class attention in the paper CaiT and traditional multi-headed self-attention?
#196 opened by hutingz - 0
What is the ImageNet-1K top-1 accuracy when training from 0 to 400 epochs (Fig. 5 of the DeiT III paper)?
#199 opened by sanyalsunny111 - 2
Config file of ViT-B/16
#195 opened by shashankvkt - 2
Is it possible to see how the validation accuracy changes over the number of epochs for DeiT?
#193 opened by Phuoc-Hoan-Le - 0
Reproduce PatchConvnet
#185 opened by billpsomas - 1
LAMB and amp
#182 opened by sgunasekar - 2
Question about training DeiT-small distilled
#186 opened by mingqiJ - 2
Are uniform drop-path rates beneficial?
#184 opened by bhheo - 1
DeiT-tiny .pth size is not 5M; it is 22M
#183 opened by witding - 4
Confusion about fine-tuning
#180 opened by WangWenhao0716