/ViT_seminar

For the image processing transformers seminar


ViT_seminar

Training three different models from scratch for the image processing transformers seminar.

directory tree

├── data
│   ├── CIFAR
│   ├── CLS-LOC
│   └── MNIST
├── resnet_pytorch
│   └── resnet.py
├── efficientnet_pytorch
│   ├── model.py
│   └── utils.py
├── vit_pytorch
│   └── vit.py
├── log
│   ├── EfficientNet
│   │   ├── CIFARDataset
│   │   ├── ImageNetDataset
│   │   └── MNISTDataset
│   ├── ResNet
│   │   ├── CIFARDataset
│   │   ├── ImageNetDataset
│   │   └── MNISTDataset
│   └── ViT
│       ├── CIFARDataset
│       ├── ImageNetDataset
│       └── MNISTDataset
├── download.py
├── test.py
├── dataset.py
├── train_resnet.py
├── train_efcntnet.py
└── train_vit.py

data

  • MNIST (10 classes)
  • CIFAR10 (10 classes)
  • ImageNet ILSVRC2012 (20 distinctive classes, randomly selected)

The datasets are not included in the Git repository due to size limits.
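
The README does not say how the 20 ImageNet classes were drawn. As a rough sketch of one reproducible way to pick such a subset, the snippet below samples class names with a fixed seed. The function name `pick_class_subset`, the placeholder class names, and the reuse of seed 123 here are all assumptions for illustration, not code from this repo.

```python
import random


def pick_class_subset(all_classes, k=20, seed=123):
    """Return k class names drawn without replacement, reproducibly."""
    # A local RNG keeps the global random state untouched.
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input order.
    return sorted(rng.sample(sorted(all_classes), k))


# Hypothetical stand-ins for the 1000 ILSVRC2012 class folder names.
all_classes = [f"class_{i:04d}" for i in range(1000)]

subset = pick_class_subset(all_classes)
# Map each kept class name to a contiguous label in [0, 19].
class_to_idx = {name: idx for idx, name in enumerate(subset)}
```

Remapping to contiguous labels matters because a 20-way classifier head expects targets in the range 0 to 19, not the original 1000-class indices.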


models

  • ResNet50
  • EfficientNetB3
  • ViT


train

Choose one training script and one dataset name from the braces:

python {train_resnet.py/train_efcntnet.py/train_vit.py} --seed 123 --dataset {MNISTDataset/CIFARDataset/ImageNetDataset} --resize 224 --val_ratio 0.2 --epochs 20 --batch_size 16
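
To illustrate how the flags in the command above might be consumed, here is a minimal sketch using only the standard library. The flag names and example values come from the command. The defaults, the helper names `build_parser` and `split_indices`, and the shuffle-then-slice split are assumptions and may differ from what the training scripts actually do.

```python
import argparse
import random


def build_parser():
    # Flag names mirror the command line above; defaults are assumed.
    parser = argparse.ArgumentParser(description="Hypothetical training CLI sketch")
    parser.add_argument("--seed", type=int, default=123)
    parser.add_argument(
        "--dataset",
        choices=["MNISTDataset", "CIFARDataset", "ImageNetDataset"],
        default="MNISTDataset",
    )
    parser.add_argument("--resize", type=int, default=224)
    parser.add_argument("--val_ratio", type=float, default=0.2)
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--batch_size", type=int, default=16)
    return parser


def split_indices(n_samples, val_ratio, seed):
    """Shuffle sample indices with a fixed seed and carve off a validation part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(n_samples * val_ratio)
    # The first n_val shuffled indices form the validation set, the rest train.
    return indices[n_val:], indices[:n_val]


# Parse an explicit argument list so the sketch runs outside a shell.
args = build_parser().parse_args(["--dataset", "CIFARDataset", "--val_ratio", "0.2"])

# CIFAR-10 ships 50,000 training images.
train_idx, val_idx = split_indices(50000, args.val_ratio, args.seed)
```

Fixing the seed for the split means every model sees the same validation images, which keeps a comparison across the three architectures fair.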

log

tensorboard --logdir=./log/{modelname}/{datasetname}
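
The {modelname} and {datasetname} placeholders correspond to the folders under log in the directory tree. A small sketch of building that path and the matching command is shown below. The helper name `log_dir` is hypothetical and not part of this repo.

```python
from pathlib import Path


def log_dir(model_name, dataset_name, root="log"):
    """Build the per-model, per-dataset log folder, e.g. log/ViT/CIFARDataset."""
    return Path(root) / model_name / dataset_name


path = log_dir("ViT", "CIFARDataset")
# Print the TensorBoard command for this model and dataset pair.
print(f"tensorboard --logdir=./{path.as_posix()}")
```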