This is a toolbox project for PyTorch, aiming to help you write PyTorch code that is easier, more readable, and more concise.
You can also regard it as an auxiliary library for PyTorch: it collects the tools you are most likely to use.
An easy way to install it is with pip:
pip install torchtoolbox
If you want to install the nightly version (recommended for now):
pip install -U git+https://github.com/deeplearningforfun/torch-toolbox.git@master
The toolbox has two main parts:
- Additional tools that make PyTorch easier to use.
- Some recent research work that is not yet part of the PyTorch core.
import torch
from torchtoolbox.tools import summary
from torchvision.models.mobilenet import mobilenet_v2
model = mobilenet_v2()
summary(model, torch.rand((1, 3, 224, 224)))
Here is a shortened output:
Layer (type) Output Shape Params FLOPs(M+A) #
================================================================================
Conv2d-1 [1, 64, 112, 112] 9408 235225088
BatchNorm2d-2 [1, 64, 112, 112] 256 1605632
ReLU-3 [1, 64, 112, 112] 0 0
MaxPool2d-4 [1, 64, 56, 56] 0 0
... ... ... ...
Linear-158 [1, 1000] 1281000 2560000
MobileNetV2-159 [1, 1000] 0 0
================================================================================
Total parameters: 3,538,984 3.5M
Trainable parameters: 3,504,872
Non-trainable parameters: 34,112
Total flops(M) : 305,252,872 305.3M
Total flops(M+A): 610,505,744 610.5M
--------------------------------------------------------------------------------
Parameters size (MB): 13.50
When training a model we usually need to calculate metrics such as accuracy (top-1 acc) and loss. The toolbox currently supports the metrics below:
- Accuracy: top-1 accuracy.
- TopKAccuracy: top-K accuracy.
- NumericalCost: a numerical metric collector that supports mean, max, and min calculation modes.
from torchtoolbox import metric
# define first
top1_acc = metric.Accuracy(name='Top1 Accuracy')
top5_acc = metric.TopKAccuracy(top=5, name='Top5 Accuracy')
loss_record = metric.NumericalCost(name='Loss')
# reset before using
top1_acc.reset()
top5_acc.reset()
loss_record.reset()
...
model.eval()
for data, labels in val_data:
    data = data.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    outputs = model(data)
    losses = Loss(outputs, labels)
    # update/record
    top1_acc.step(outputs, labels)
    top5_acc.step(outputs, labels)
    loss_record.step(losses)

test_msg = 'Test Epoch {}: {}:{:.5}, {}:{:.5}, {}:{:.5}\n'.format(
    epoch, top1_acc.name, top1_acc.get(), top5_acc.name, top5_acc.get(),
    loss_record.name, loss_record.get())
print(test_msg)
You may then get output like this:
Test Epoch 101: Top1 Accuracy:0.7332, Top5 Accuracy:0.91514, Loss:1.0605
The toolbox currently supports XavierInitializer and KaimingInitializer.
from torchtoolbox.nn.init import KaimingInitializer
model = XXX
KaimingInitializer(model)
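XavierInitializer is presumably used the same way; a minimal sketch, assuming it follows the same calling convention as KaimingInitializer above:
from torchtoolbox.nn.init import XavierInitializer
model = XXX
# assumed to mirror the KaimingInitializer usage shown above
XavierInitializer(model)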
Makes PyTorch nn.Sequential able to handle layers with multiple inputs/outputs.
from torch import nn
from torchtoolbox.nn import AdaptiveSequential
import torch
class n_to_n(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 3, 1, 1, bias=False)
        self.conv2 = nn.Conv2d(3, 3, 1, 1, bias=False)

    def forward(self, x1, x2):
        y1 = self.conv1(x1)
        y2 = self.conv2(x2)
        return y1, y2

class n_to_one(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 3, 1, 1, bias=False)
        self.conv2 = nn.Conv2d(3, 3, 1, 1, bias=False)

    def forward(self, x1, x2):
        y1 = self.conv1(x1)
        y2 = self.conv2(x2)
        return y1 + y2

class one_to_n(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 3, 1, 1, bias=False)
        self.conv2 = nn.Conv2d(3, 3, 1, 1, bias=False)

    def forward(self, x):
        y1 = self.conv1(x)
        y2 = self.conv2(x)
        return y1, y2
seq = AdaptiveSequential(one_to_n(), n_to_n(), n_to_one()).cuda()
td = torch.rand(1, 3, 32, 32).cuda()
out = seq(td)
print(out.size())
# output
# torch.Size([1, 3, 32, 32])
from torchtoolbox.nn import LabelSmoothingLoss
# The number of classes in your task must be specified.
classes = 10
# Loss
Loss = LabelSmoothingLoss(classes, smoothing=0.1)
...
for i, (data, labels) in enumerate(train_data):
    data = data.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)

    optimizer.zero_grad()
    outputs = model(data)
    # just use it as usual
    loss = Loss(outputs, labels)
    loss.backward()
    optimizer.step()
Cosine lr scheduler with warm-up epochs. It helps improve accuracy for classification models.
from torchtoolbox.optimizer import CosineWarmupLr
optimizer = optim.SGD(...)
# define scheduler
# `batches_pre_epoch` is the number of batches (optimizer updates) within one epoch.
# `warmup_epochs` is how many epochs the lr takes to increase to `base_lr`.
# You can find more details in the source file.
lr_scheduler = CosineWarmupLr(optimizer, batches_pre_epoch, epochs,
                              base_lr=lr, warmup_epochs=warmup_epochs)
...
for i, (data, labels) in enumerate(train_data):
    ...
    optimizer.step()
    # remember to step/update the scheduler here
    lr_scheduler.step()
    ...
from torchtoolbox.nn import SwitchNorm2d, SwitchNorm3d
Use it just like BatchNorm2d/3d. For more details, please refer to the original paper, Differentiable Learning-to-Normalize via Switchable Normalization (official code is open source).
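A minimal usage sketch (not from the original README), assuming SwitchNorm2d takes the channel count like nn.BatchNorm2d:
import torch
from torch import nn
from torchtoolbox.nn import SwitchNorm2d

# drop-in replacement where nn.BatchNorm2d(16) would normally go
block = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1, bias=False),
    SwitchNorm2d(16),
    nn.ReLU(inplace=True),
)
out = block(torch.rand(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 16, 32, 32])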
from torchtoolbox.nn import Swish
Use it just like ReLU. For more details, please refer to the original paper, Searching for Activation Functions.
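A minimal usage sketch (not from the original README), assuming Swish needs no constructor arguments:
import torch
from torch import nn
from torchtoolbox.nn import Swish

model = nn.Sequential(
    nn.Linear(128, 64),
    Swish(),  # used where nn.ReLU() would normally go
    nn.Linear(64, 10),
)
out = model(torch.rand(4, 128))
print(out.shape)  # torch.Size([4, 10])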
A wrapper optimizer that often works better than Adam alone. Lookahead Optimizer: k steps forward, 1 step back
from torchtoolbox.optimizer import Lookahead
from torch import optim
optimizer = optim.Adam(...)
optimizer = Lookahead(optimizer)
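After wrapping, the optimizer is used exactly as before; a minimal sketch of a training step, assuming the wrapper forwards the usual zero_grad/step interface:
for data, labels in train_data:
    optimizer.zero_grad()
    loss = Loss(model(data), labels)
    loss.backward()
    # the inner Adam updates the fast weights; Lookahead syncs the slow
    # weights every k inner steps, as described in the paper
    optimizer.step()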
Mixup method for training a classification model. mixup: Beyond Empirical Risk Minimization
from torchtoolbox.tools import mixup_data, mixup_criterion
# alpha is the Beta distribution parameter; 0.2 is recommended.
alpha = 0.2
for i, (data, labels) in enumerate(train_data):
    data = data.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
    data, labels_a, labels_b, lam = mixup_data(data, labels, alpha)

    optimizer.zero_grad()
    outputs = model(data)
    loss = mixup_criterion(Loss, outputs, labels_a, labels_b, lam)
    loss.backward()
    optimizer.step()
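For reference, mixup forms each training example and label as a convex combination of two samples, with the mixing coefficient drawn from a Beta distribution parameterized by alpha (as described in the paper):

\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j, \qquad \lambda \sim \mathrm{Beta}(\alpha, \alpha)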
An image transform method. Improved Regularization of Convolutional Neural Networks with Cutout
from torchvision import transforms
from torchtoolbox.transform import Cutout
_train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    Cutout(),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4),
    transforms.ToTensor(),
    normalize,
])
If you train a model with a large batch size, e.g. 64k, you may need this. Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
from torchtoolbox.tools import split_weights
from torch import optim
model = XXX
parameters = split_weights(model)
optimizer = optim.SGD(parameters, ...)
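For reference, the no-bias-decay trick from the paper can be sketched as below. This is an illustrative re-implementation, not the torchtoolbox code, and the helper name no_decay_groups is hypothetical:
from torch import optim

def no_decay_groups(model, weight_decay=1e-4):
    # illustrative sketch: apply weight decay to conv/linear weights only,
    # and disable it for biases and 1-D (normalization/affine) parameters
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if name.endswith('.bias') or param.ndim == 1:
            no_decay.append(param)
        else:
            decay.append(param)
    return [{'params': decay, 'weight_decay': weight_decay},
            {'params': no_decay, 'weight_decay': 0.0}]

# usage sketch
# optimizer = optim.SGD(no_decay_groups(model), lr=0.1, momentum=0.9)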
Pull requests and issues are welcome!