Vision Models on CIFAR-10

Computer vision models implemented in PyTorch, trained and tested on the CIFAR-10 dataset. To be updated...

TODO List

  • Auto-Encoder
  • Variational Auto-Encoder
  • ResNet
  • Vision Transformer (ViT)
  • MAE (TODO)
  • Swin Transformer (TODO)
  • Diffusion Model (TODO)

Results

| Model | Test Accuracy | Checkpoints | Training Log |
|---|---|---|---|
| ResNet-18 | 95.% | Checkpoints | Log |
| vit_base_patch16_224 | 98.87% | Checkpoints | Log |

ResNet-18

  • Cutout data augmentation
  • Learning rate scheduler
  • First 7x7 conv replaced with a 3x3 conv
  • MaxPool removed
  • Kaiming normal initialization

Vision Transformer (ViT)

Uses the pre-trained checkpoint (vit_base_patch16_224) from timm.

References

  1. Deep Residual Learning for Image Recognition
  2. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  3. GitHub: kentaroy47/vision-transformers-cifar10
  4. pytorch-image-models