Implementation of ResNet in PyTorch.
Currently trainer supports only Cifar10 dataset.
- PyTorch v0.4.0
- torchvision
- numpy
- scipy
- tqdm
Models | Dataset | Top-1 without identity | Top-1 with identity |
---|---|---|---|
ResNet-10 | CIFAR-10 | 94.2% | 94.4% |
ResNet-18 | CIFAR-10 | 94.8% | 95.0% |
ResNet-34 | CIFAR-10 | 93.5% | 95.5% |
ResNet-50 | CIFAR-10 | 90.3% | 95.0% |
ResNet-10 | CIFAR-100 | 75.5% | 75.3% |
ResNet-18 | CIFAR-100 | 76.7% | 79.2% |
ResNet-34 | CIFAR-100 | 71.4% | 79.4% |
- In most ResNet implementations that I encountered on the web, there were additional convolutional layers used to represent the identity mappings. Even some included batch normalization layers. I believe this contradicts with the idea that deeper neural nets struggle representing identity mappings. Therefore here, I use average pooling for downsampling and add extra feature maps which contain all-zeros to increase the number of feature planes.
- When I work on the validation set, I noticed that identity mappings in residual blocks stabilizes the training in the long run. Yet, all trainning & validation & test accuracies tend to converge for ResNet-10 and ResNet-18. Performance gap becomes noticable when depth increases, i.e., ~2% on ResNet-34.
- Default settings start with a learning rate of 0.1 and the learning rate is multiplied by 0.1 after every 100 epochs.
- For computational restrictions, I trained ResNet50s with batches of size 64.
- You can benefit from
tensorboard
if you installtensorboard_logger
via pip (skip this if you already use Tensorflow) and set the tensorboard argument to True. - To run on CPU or GPU supporting cuda set, device argument to 'cpu' or 'cuda' respectively.
- Don't forget to arange folders for
exp_dir
andlog_dir
.
To examine the representations learned by a ResNet on the Cifar-10:
- I extracted the features of the test set from the ResNet-34, which yield 95.5% test set accuracy.
- For each feature:
- I sorted all the features based on their magnitudes.
- I took the least and the most relevant 10 images, and formed the below (big!) image. The left and the right halves contain the least and the most relevant ones, respectively.
For the most rows, the most relevant ones seem to be related conceptually. But still, I believe such a visual inspection cannot be well-grounded.
- Add Cifar100 support
- Share results for ResNet-34 and ResNet-50