This script computes the theoretical number of multiply-add operations (MACs) in convolutional neural networks. It can also compute the number of parameters and print the per-layer computational cost of a given network.
Supported layers:
- Conv1d/2d/3d (including grouping)
- ConvTranspose1d/2d/3d (including grouping)
- BatchNorm1d/2d/3d, GroupNorm, InstanceNorm1d/2d/3d
- Activations (ReLU, PReLU, ELU, ReLU6, LeakyReLU, GELU)
- Linear
- Upsample
- Poolings (AvgPool1d/2d/3d, MaxPool1d/2d/3d and adaptive ones)
Experimental support:
- RNN, LSTM, GRU (NLH layout is assumed; see the sketch after this list)
- RNNCell, LSTMCell, GRUCell
- MultiheadAttention
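A hedged sketch of the NLH assumption: the example below counts an `nn.LSTM` built with `batch_first=True`, so the dummy input that `ptflops` constructs from the resolution tuple (a leading batch dimension of 1 is added) matches the expected (batch, length, features) layout. The sequence length and feature sizes are arbitrary choices for this illustration.

```python
import torch.nn as nn
from ptflops import get_model_complexity_info

# batch_first=True so the generated input of shape (1, seq_len, input_size)
# matches the assumed NLH layout.
lstm = nn.LSTM(input_size=128, hidden_size=256, num_layers=2, batch_first=True)

# input_res = (seq_len, input_size)
macs, params = get_model_complexity_info(lstm, (32, 128), as_strings=True,
                                         print_per_layer_stat=False)
print(macs, params)
```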
Requirements: PyTorch >= 1.1, torchvision >= 0.3
Thanks to @warmspringwinds for the initial version of the script.
- This script doesn't take into account `torch.nn.functional.*` operations. For instance, if a semantic segmentation model uses `torch.nn.functional.interpolate` to upscale features, these operations won't contribute to the overall amount of flops. To avoid that, one can use `torch.nn.Upsample` instead of `torch.nn.functional.interpolate`.
- `ptflops` launches a given model on a random tensor and estimates the amount of computations during inference. Complicated models can have several inputs, some of which could be optional. To construct a non-trivial input, one can use the `input_constructor` argument of `get_model_complexity_info`. `input_constructor` is a function that takes the input spatial resolution as a tuple and returns a dict with the named input arguments of the model. This dict is then passed to the model as keyword arguments (see the sketch after this list).
- The `verbose` parameter allows getting information about modules that don't contribute to the final numbers.
- The `ignore_modules` option forces `ptflops` to ignore the listed modules. This can be useful for research purposes. For instance, one can drop all convolutions from the counting process by specifying `ignore_modules=[torch.nn.Conv2d]`.
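A minimal sketch of the options above, assuming a standard torchvision classifier whose `forward` takes a single tensor argument named `x`; the constructor name `bare_input_constructor` is just illustrative:

```python
import torch
import torchvision.models as models
from ptflops import get_model_complexity_info

def bare_input_constructor(input_res):
    # Takes the input spatial resolution as a tuple and returns a dict of
    # named model inputs; ptflops passes this dict as keyword arguments.
    return {'x': torch.randn(1, *input_res)}

net = models.resnet18()
macs, params = get_model_complexity_info(
    net, (3, 224, 224),
    input_constructor=bare_input_constructor,  # custom (possibly multi-input) construction
    ignore_modules=[torch.nn.Conv2d],          # drop all convolutions from the totals
    verbose=True,                              # also report modules that contribute nothing
    as_strings=True,
    print_per_layer_stat=False)
print(macs, params)
```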
From PyPI:

```bash
pip install ptflops
```

From this repository:

```bash
pip install --upgrade git+https://github.com/sovrasov/flops-counter.pytorch.git
```
```python
import torchvision.models as models
import torch
from ptflops import get_model_complexity_info

with torch.cuda.device(0):
    net = models.densenet161()
    macs, params = get_model_complexity_info(net, (3, 224, 224), as_strings=True,
                                             print_per_layer_stat=True, verbose=True)
    print('{:<30}  {:<8}'.format('Computational complexity: ', macs))
    print('{:<30}  {:<8}'.format('Number of parameters: ', params))
```
If `ptflops` was useful for your paper or tech report, please cite me:

```
@online{ptflops,
  author = {Vladislav Sovrasov},
  title = {ptflops: a flops counting tool for neural networks in pytorch framework},
  year = {2018-2023},
  url = {https://github.com/sovrasov/flops-counter.pytorch},
}
```
Model | Input Resolution | Params(M) | MACs(G) | Top-1 error | Top-5 error |
---|---|---|---|---|---|
alexnet | 224x224 | 61.1 | 0.72 | 43.45 | 20.91 |
vgg11 | 224x224 | 132.86 | 7.63 | 30.98 | 11.37 |
vgg13 | 224x224 | 133.05 | 11.34 | 30.07 | 10.75 |
vgg16 | 224x224 | 138.36 | 15.5 | 28.41 | 9.62 |
vgg19 | 224x224 | 143.67 | 19.67 | 27.62 | 9.12 |
vgg11_bn | 224x224 | 132.87 | 7.64 | 29.62 | 10.19 |
vgg13_bn | 224x224 | 133.05 | 11.36 | 28.45 | 9.63 |
vgg16_bn | 224x224 | 138.37 | 15.53 | 26.63 | 8.50 |
vgg19_bn | 224x224 | 143.68 | 19.7 | 25.76 | 8.15 |
resnet18 | 224x224 | 11.69 | 1.82 | 30.24 | 10.92 |
resnet34 | 224x224 | 21.8 | 3.68 | 26.70 | 8.58 |
resnet50 | 224x224 | 25.56 | 4.12 | 23.85 | 7.13 |
resnet101 | 224x224 | 44.55 | 7.85 | 22.63 | 6.44 |
resnet152 | 224x224 | 60.19 | 11.58 | 21.69 | 5.94 |
squeezenet1_0 | 224x224 | 1.25 | 0.83 | 41.90 | 19.58 |
squeezenet1_1 | 224x224 | 1.24 | 0.36 | 41.81 | 19.38 |
densenet121 | 224x224 | 7.98 | 2.88 | 25.35 | 7.83 |
densenet169 | 224x224 | 14.15 | 3.42 | 24.00 | 7.00 |
densenet201 | 224x224 | 20.01 | 4.37 | 22.80 | 6.43 |
densenet161 | 224x224 | 28.68 | 7.82 | 22.35 | 6.20 |
inception_v3 | 224x224 | 27.16 | 2.85 | 22.55 | 6.44 |
- Top-1 error - ImageNet single-crop top-1 error (224x224)
- Top-5 error - ImageNet single-crop top-5 error (224x224)
Model | Input Resolution | Params(M) | MACs(G) | Acc@1 | Acc@5 |
---|---|---|---|---|---|
alexnet | 224x224 | 61.1 | 0.72 | 56.432 | 79.194 |
bninception | 224x224 | 11.3 | 2.05 | 73.524 | 91.562 |
cafferesnet101 | 224x224 | 44.55 | 7.62 | 76.2 | 92.766 |
densenet121 | 224x224 | 7.98 | 2.88 | 74.646 | 92.136 |
densenet161 | 224x224 | 28.68 | 7.82 | 77.56 | 93.798 |
densenet169 | 224x224 | 14.15 | 3.42 | 76.026 | 92.992 |
densenet201 | 224x224 | 20.01 | 4.37 | 77.152 | 93.548 |
dpn107 | 224x224 | 86.92 | 18.42 | 79.746 | 94.684 |
dpn131 | 224x224 | 79.25 | 16.13 | 79.432 | 94.574 |
dpn68 | 224x224 | 12.61 | 2.36 | 75.868 | 92.774 |
dpn68b | 224x224 | 12.61 | 2.36 | 77.034 | 93.59 |
dpn92 | 224x224 | 37.67 | 6.56 | 79.4 | 94.62 |
dpn98 | 224x224 | 61.57 | 11.76 | 79.224 | 94.488 |
fbresnet152 | 224x224 | 60.27 | 11.6 | 77.386 | 93.594 |
inceptionresnetv2 | 299x299 | 55.84 | 13.22 | 80.17 | 95.234 |
inceptionv3 | 299x299 | 27.16 | 5.73 | 77.294 | 93.454 |
inceptionv4 | 299x299 | 42.68 | 12.31 | 80.062 | 94.926 |
nasnetalarge | 331x331 | 88.75 | 24.04 | 82.566 | 96.086 |
nasnetamobile | 224x224 | 5.29 | 0.59 | 74.08 | 91.74 |
pnasnet5large | 331x331 | 86.06 | 25.21 | 82.736 | 95.992 |
polynet | 331x331 | 95.37 | 34.9 | 81.002 | 95.624 |
resnet101 | 224x224 | 44.55 | 7.85 | 77.438 | 93.672 |
resnet152 | 224x224 | 60.19 | 11.58 | 78.428 | 94.11 |
resnet18 | 224x224 | 11.69 | 1.82 | 70.142 | 89.274 |
resnet34 | 224x224 | 21.8 | 3.68 | 73.554 | 91.456 |
resnet50 | 224x224 | 25.56 | 4.12 | 76.002 | 92.98 |
resnext101_32x4d | 224x224 | 44.18 | 8.03 | 78.188 | 93.886 |
resnext101_64x4d | 224x224 | 83.46 | 15.55 | 78.956 | 94.252 |
se_resnet101 | 224x224 | 49.33 | 7.63 | 78.396 | 94.258 |
se_resnet152 | 224x224 | 66.82 | 11.37 | 78.658 | 94.374 |
se_resnet50 | 224x224 | 28.09 | 3.9 | 77.636 | 93.752 |
se_resnext101_32x4d | 224x224 | 48.96 | 8.05 | 80.236 | 95.028 |
se_resnext50_32x4d | 224x224 | 27.56 | 4.28 | 79.076 | 94.434 |
senet154 | 224x224 | 115.09 | 20.82 | 81.304 | 95.498 |
squeezenet1_0 | 224x224 | 1.25 | 0.83 | 58.108 | 80.428 |
squeezenet1_1 | 224x224 | 1.24 | 0.36 | 58.25 | 80.8 |
vgg11 | 224x224 | 132.86 | 7.63 | 68.97 | 88.746 |
vgg11_bn | 224x224 | 132.87 | 7.64 | 70.452 | 89.818 |
vgg13 | 224x224 | 133.05 | 11.34 | 69.662 | 89.264 |
vgg13_bn | 224x224 | 133.05 | 11.36 | 71.508 | 90.494 |
vgg16 | 224x224 | 138.36 | 15.5 | 71.636 | 90.354 |
vgg16_bn | 224x224 | 138.37 | 15.53 | 73.518 | 91.608 |
vgg19 | 224x224 | 143.67 | 19.67 | 72.08 | 90.822 |
vgg19_bn | 224x224 | 143.68 | 19.7 | 74.266 | 92.066 |
xception | 299x299 | 22.86 | 8.42 | 78.888 | 94.292 |
- Acc@1 - ImageNet single-crop top-1 accuracy on validation images of the same resolution as used during training.
- Acc@5 - ImageNet single-crop top-5 accuracy on validation images of the same resolution as used during training.