Primary language: Python. License: Apache License 2.0 (Apache-2.0).

ZenNAS: A Zero-Shot NAS for High-Performance Deep Image Recognition

ZenNAS is a lightning-fast Neural Architecture Search (NAS) algorithm for automatically designing deep neural networks with high prediction accuracy and high inference speed on GPUs and mobile devices.

Our paper is available on arXiv.

How Fast It Is

Searching for 12 hours on a single GPU, ZenNAS can design networks whose ImageNet top-1 accuracy is comparable to EfficientNet-B5 (~83%) while running 11 times faster on NVIDIA T4 and 2.6 times faster on Google Pixel2.

Figure: Inference speed on T4 and Pixel2.

Examples

To evaluate the pre-trained model on ImageNet using GPU 0:

python val.py --fp16 --gpu 0 --arch ${zennet_model_name}

where ${zennet_model_name} should be replaced by a valid ZenNet model name. The complete list of model names can be found in the 'Pre-trained Models' section.

To evaluate the pre-trained model on CIFAR10 or CIFAR100 using GPU 0:

python val_cifar.py --dataset cifar10 --gpu 0 --arch ${zennet_model_name}
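When evaluating several models in a row, it can help to generate the two commands above from the model name. The following is a minimal sketch, not part of the repository: `build_eval_command` is a hypothetical helper, and it assumes that `val_cifar.py` accepts `--dataset cifar100` in the same way it accepts `--dataset cifar10`.

```python
def build_eval_command(model_name, gpu=0):
    """Return the evaluation command for one ZenNet model as an argument list.

    Hypothetical helper: the script is chosen from the model-name prefix used
    in the 'Pre-trained Models' section.
    """
    if model_name.startswith("zennet_imagenet1k_"):
        return ["python", "val.py", "--fp16",
                "--gpu", str(gpu), "--arch", model_name]
    for dataset in ("cifar100", "cifar10"):
        if model_name.startswith("zennet_" + dataset + "_"):
            # Assumption: '--dataset cifar100' is a valid value for val_cifar.py.
            return ["python", "val_cifar.py", "--dataset", dataset,
                    "--gpu", str(gpu), "--arch", model_name]
    raise ValueError("not a known ZenNet model name: " + model_name)


if __name__ == "__main__":
    for name in ("zennet_imagenet1k_flops400M_SE_res192",
                 "zennet_cifar10_model_size1M_res32"):
        # Print the command; pass the list to subprocess.run to execute it.
        print(" ".join(build_eval_command(name)))
```

Returning a list rather than a string keeps the command safe to hand to `subprocess.run` without shell quoting.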

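To create a ZenNet in your Python code:

import torch
import ZenNet  # assumes the repository root is on your Python path

gpu = 0
arch = 'zennet_imagenet1k_flops400M_SE_res192'  # any name from 'Pre-trained Models'
model = ZenNet.get_ZenNet(arch, pretrained=True)
torch.cuda.set_device(gpu)
torch.backends.cudnn.benchmark = True  # let cuDNN auto-tune its kernels
model = model.cuda(gpu)
model = model.half()  # float16 inference
model.eval()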
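Each model expects inputs at the resolution it was trained for, and every name in the 'Pre-trained Models' section ends with `_res<N>`, where N matches the resolution column. A small sketch of reading that value from the name is shown below; `resolution_from_name` and `input_shape` are hypothetical helpers, not functions provided by the repository.

```python
import re


def resolution_from_name(model_name):
    """Extract the input resolution from a model name ending in '_res<N>'."""
    match = re.search(r"_res(\d+)$", model_name)
    if match is None:
        raise ValueError("model name has no '_res<N>' suffix: " + model_name)
    return int(match.group(1))


def input_shape(model_name, batch_size=1):
    """Return the (N, C, H, W) shape of an RGB input batch for the model."""
    res = resolution_from_name(model_name)
    return (batch_size, 3, res, res)


if __name__ == "__main__":
    print(input_shape("zennet_imagenet1k_flops600M_SE_res256", batch_size=64))
```

The resulting tuple can be passed to a tensor constructor such as `torch.randn` to build a dummy batch for a smoke test.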

System Requirements and Default Paths

  • PyTorch >= 1.5, Python >= 3.7
  • By default, the ImageNet dataset is expected under ~/data/imagenet; CIFAR10 and CIFAR100 are expected under ~/data/pytorch_cifar10 and ~/data/pytorch_cifar100, respectively
  • Pre-trained parameters are cached under ~/.cache/pytorch/checkpoints/zennet_pretrained
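Before running the evaluation scripts, it may be worth checking that these requirements and directories are in place. The sketch below uses only the standard library; `DEFAULT_PATHS`, `check_default_paths`, and `torch_version_ok` are hypothetical names introduced for illustration, and the paths are copied from the list above.

```python
import os
import sys

# Default locations listed above ('~' is expanded at check time).
DEFAULT_PATHS = {
    "imagenet": "~/data/imagenet",
    "cifar10": "~/data/pytorch_cifar10",
    "cifar100": "~/data/pytorch_cifar100",
    "pretrained": "~/.cache/pytorch/checkpoints/zennet_pretrained",
}


def check_default_paths(paths=DEFAULT_PATHS):
    """Map each entry to (expanded_path, directory_exists)."""
    report = {}
    for name, path in paths.items():
        expanded = os.path.expanduser(path)
        report[name] = (expanded, os.path.isdir(expanded))
    return report


def torch_version_ok(version_string, minimum=(1, 5)):
    """Compare the 'major.minor' prefix of a version string with the minimum."""
    major, minor = version_string.split(".")[:2]
    return (int(major), int(minor)) >= minimum


if __name__ == "__main__":
    print("Python >= 3.7:", sys.version_info >= (3, 7))
    for name, (path, exists) in check_default_paths().items():
        print(name, path, "found" if exists else "missing")
```

To check the installed PyTorch, call `torch_version_ok(torch.__version__)` after `import torch`.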

Pre-trained Models

We provide pre-trained models on ImageNet and CIFAR10/CIFAR100.

ImageNet Models

| Model | Resolution | # Params | FLOPs | Top-1 Acc | V100 (ms) | T4 (ms) | Pixel2 (ms) |
|---|---|---|---|---|---|---|---|
| zennet_imagenet1k_flops400M_SE_res192 | 192 | 12.4M | 453M | 78.3% | 0.29 | 0.29 | 109.7 |
| zennet_imagenet1k_flops600M_SE_res256 | 256 | 12.6M | 685M | 80.0% | 0.49 | 0.46 | 166.9 |
| zennet_imagenet1k_flops900M_SE_res224 | 224 | 19.4M | 934M | 80.8% | 0.55 | 0.55 | 215.7 |
| zennet_imagenet1k_latency01ms_res192 | 192 | 34.5M | 1.8B | 76.0% | 0.1 | 0.08 | 246.9 |
| zennet_imagenet1k_latency02ms_res192 | 192 | 44.6M | 3.0B | 80.1% | 0.2 | 0.16 | 332.5 |
| zennet_imagenet1k_latency03ms_res224 | 224 | 86.5M | 5.1B | 81.5% | 0.3 | 0.26 | 475.6 |
| zennet_imagenet1k_latency05ms_res224 | 224 | 148.3M | 8.2B | 82.1% | 0.5 | 0.41 | 808.3 |
| zennet_imagenet1k_latency08ms_res224 | 224 | 105.1M | 9.4B | 83.0% | 0.8 | 0.57 | 896.0 |
| EfficientNet-B3 | 300 | 12.0M | 1.8B | 81.1% | 1.12 | 1.86 | 569.3 |
| EfficientNet-B5 | 456 | 30.0M | 9.9B | 83.3% | 4.5 | 7.0 | 2580.2 |

  • 'V100' is the inference latency on NVIDIA V100 in milliseconds, benchmarked at batch size 64, float16.
  • 'T4' is the inference latency on NVIDIA T4 in milliseconds, benchmarked at batch size 64, TensorRT INT8.
  • 'Pixel2' is the inference latency on Google Pixel2 in milliseconds, benchmarked on a single image.
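The latency columns can be turned into speedup ratios directly. The sketch below copies two rows from the ImageNet table and divides the baseline latency by the ZenNet latency for each device; `LATENCY_MS` and `speedup` are hypothetical names used only for this illustration.

```python
# Latencies in milliseconds, copied from the ImageNet table above.
LATENCY_MS = {
    "zennet_imagenet1k_latency08ms_res224":
        {"V100": 0.8, "T4": 0.57, "Pixel2": 896.0},
    "EfficientNet-B5":
        {"V100": 4.5, "T4": 7.0, "Pixel2": 2580.2},
}


def speedup(baseline, candidate, device, table=LATENCY_MS):
    """Return how many times faster `candidate` is than `baseline` on `device`."""
    return table[baseline][device] / table[candidate][device]


if __name__ == "__main__":
    for device in ("V100", "T4", "Pixel2"):
        ratio = speedup("EfficientNet-B5",
                        "zennet_imagenet1k_latency08ms_res224", device)
        print("%s: %.1fx faster" % (device, ratio))
```

Ratios computed this way are in the same range as the headline figures quoted under 'How Fast It Is'; small differences are plausible if those figures come from a separate measurement.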

CIFAR10/CIFAR100 Models

| Model | Resolution | # Params | FLOPs | Top-1 Acc |
|---|---|---|---|---|
| zennet_cifar10_model_size05M_res32 | 32 | 0.5M | 140M | 96.2% |
| zennet_cifar10_model_size1M_res32 | 32 | 1.0M | 162M | 96.2% |
| zennet_cifar10_model_size2M_res32 | 32 | 2.0M | 487M | 97.5% |
| zennet_cifar100_model_size05M_res32 | 32 | 0.5M | 140M | 79.9% |
| zennet_cifar100_model_size1M_res32 | 32 | 1.0M | 162M | 80.1% |
| zennet_cifar100_model_size2M_res32 | 32 | 2.0M | 487M | 84.4% |

Major Contributor

Copyright

Copyright (C) 2010-2021 Alibaba Group Holding Limited.