MemCNN
a PyTorch Framework for Developing Memory Efficient Deep Invertible Networks
Reference: Sil C. van de Leemput, Jonas Teuwen, Rashindra Manniesing. MemCNN: a Framework for Developing Memory Efficient Deep Invertible Networks. International Conference on Learning Representations (ICLR) 2018 Workshop Track. (https://iclr.cc/)
Licensing
This repository is released under the MIT license, which grants everyone the right to use, copy, distribute, and/or modify this work. If you do, please cite our work.
Installation
Using NVIDIA docker
Requirements
- NVIDIA graphics card with the proper NVIDIA drivers installed on your system
- nvidia-docker installed on your system
The following bash commands will clone this repository and do a one-time build of the docker image with the right environment installed:
git clone https://github.com/silvandeleemput/memcnn.git
docker build ./memcnn/docker --tag=memcnn-docker
After the one-time build on your machine, the Docker container can be started with:
docker run --shm-size=4g --runtime=nvidia -it memcnn-docker
This opens a bash shell that is preconfigured to run the experiments described in the Run PyTorch Experiments section.
The datasets and experimental results will be stored inside the created Docker container under /home/user/data and /home/user/experiments, respectively.
Using a custom environment
Requirements
- PyTorch 0.3 (CUDA support recommended)
- torchvision 0.1.9
- TensorboardX 0.9
Clone the repository and navigate to the right folder to execute the experiments:
git clone https://github.com/silvandeleemput/memcnn.git
cd ./memcnn/memcnn
Note that the location of the cloned repository has to be added to your Python path.
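One way to do this from within a script is to extend `sys.path` before importing the package. The sketch below assumes the repository was cloned into `./memcnn` relative to the current working directory; `repo_root` is an illustrative name, and setting the `PYTHONPATH` environment variable is an equally valid alternative.

```python
import os
import sys

# Assumption: the repository was cloned into ./memcnn; adjust to your location.
repo_root = os.path.abspath("./memcnn")

# Prepend the clone location so that `import memcnn` can resolve the package.
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)
```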
Example usage: ReversibleBlock
# some required imports
import torch
import torch.nn as nn
from torch.autograd import Variable
import numpy as np
import memcnn.models.revop
# define a new class of operation(s) PyTorch style
class ExampleOperation(nn.Module):
    def __init__(self, channels):
        super(ExampleOperation, self).__init__()
        self.seq = nn.Sequential(
            nn.Conv2d(in_channels=channels, out_channels=channels,
                      kernel_size=(3, 3), padding=1),
            nn.BatchNorm2d(num_features=channels),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        return self.seq(x)
# generate some random input data (b, c, y, x)
data = np.random.random((2, 10, 8, 8)).astype(np.float32)
X = Variable(torch.from_numpy(data))
# application of the operation(s) the normal way
Y = ExampleOperation(channels=10)(X)
# application of the operation(s) using the reversible block
F, G = ExampleOperation(channels=10 // 2), ExampleOperation(channels=10 // 2)
Y = memcnn.models.revop.ReversibleBlock(F, G)(X)
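In the example above, F and G each receive half of the channels (10 // 2). A plausible reading, assuming the block uses the additive coupling of the reversible residual network, is that the input is split into two halves that are updated in turn, which lets the input be recomputed from the output instead of being stored. The framework-free sketch below illustrates that idea on plain Python lists; `f`, `g`, `forward`, and `inverse` are hypothetical stand-ins and not part of the MemCNN API.

```python
import math

def f(v):
    # Stand-in for the F sub-network: any deterministic function works.
    return [math.tanh(a) for a in v]

def g(v):
    # Stand-in for the G sub-network.
    return [0.5 * a * a for a in v]

def forward(x):
    # Split the input into two halves and update them in turn:
    #   y1 = x1 + F(x2),  y2 = x2 + G(y1)
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1 + y2

def inverse(y):
    # Undo the updates in reverse order:
    #   x2 = y2 - G(y1),  x1 = y1 - F(x2)
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1 + x2

x = [0.3, -1.2, 0.8, 2.0]
x_rec = inverse(forward(x))

# The input is recovered from the output up to floating-point rounding.
recovered = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(x, x_rec))
```

Neither `f` nor `g` needs to be invertible for this to work, which is why arbitrary sub-networks such as ExampleOperation can be used for F and G under this assumption.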
Run PyTorch Experiments
./train.py [MODEL] [DATASET] --fresh
Available values for DATASET are cifar10 and cifar100.
Available values for MODEL are resnet32, resnet110, resnet164, revnet38, revnet110, and revnet164.
Datasets that are not available locally are downloaded automatically.
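To run every model on every dataset, the command lines can be generated programmatically. The sketch below only uses the script name, argument order, and `--fresh` flag shown above; the helper `build_command` and the constants `MODELS` and `DATASETS` are hypothetical names introduced for illustration.

```python
import itertools

# Values listed in this README.
MODELS = ["resnet32", "resnet110", "resnet164",
          "revnet38", "revnet110", "revnet164"]
DATASETS = ["cifar10", "cifar100"]

def build_command(model, dataset, fresh=True):
    """Return the argument list for one training run (hypothetical helper)."""
    if model not in MODELS:
        raise ValueError("unknown model: %s" % model)
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %s" % dataset)
    command = ["./train.py", model, dataset]
    if fresh:
        command.append("--fresh")
    return command

# One command per (model, dataset) combination.
commands = [build_command(m, d) for m, d in itertools.product(MODELS, DATASETS)]
```

Each entry in `commands` can then be passed to `subprocess.run` from the folder that contains `train.py`.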
Results
TensorFlow results were obtained by running the code from the GitHub repository of the reversible residual network.
Model | TensorFlow Cifar-10 acc. | TensorFlow Cifar-10 time | TensorFlow Cifar-100 acc. | TensorFlow Cifar-100 time | PyTorch Cifar-10 acc. | PyTorch Cifar-10 time | PyTorch Cifar-100 acc. | PyTorch Cifar-100 time |
---|---|---|---|---|---|---|---|---|
resnet-32 | 92.74 | 2:04 | 69.10 | 1:58 | 92.86 | 1:51 | 69.81 | 1:51 |
resnet-110 | 93.99 | 4:11 | 73.30 | 6:44 | 93.55 | 2:51 | 72.40 | 2:39 |
resnet-164 | 94.57 | 11:05 | 76.79 | 10:59 | 94.80 | 4:59 | 76.47 | 3:45 |
revnet-38 | 93.14 | 2:17 | 71.17 | 2:20 | 92.54 | 1:10 | 69.33 | 1:40 |
revnet-110 | 94.02 | 6:59 | 74.00 | 7:03 | 93.25 | 3:43 | 72.24 | 3:44 |
revnet-164 | 94.56 | 13:09 | 76.39 | 13:12 | 93.40 | 7:19 | 74.63 | 7:21 |
Future Releases
- Support for other reversible networks
- Better support for non-volume-preserving mappings
Citation
If you use our code, please cite:
@inproceedings{
leemput2018memcnn,
title={MemCNN: a Framework for Developing Memory Efficient Deep Invertible Networks},
author={Sil C. van de Leemput and Jonas Teuwen and Rashindra Manniesing},
booktitle={ICLR 2018 Workshop Track},
year={2018},
url={https://openreview.net/forum?id=r1KzqK1wz},
}