ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

@inproceedings{
  cai2018proxylessnas,
  title={Proxyless{NAS}: Direct Neural Architecture Search on Target Task and Hardware},
  author={Han Cai and Ligeng Zhu and Song Han},
  booktitle={International Conference on Learning Representations},
  year={2019},
  url={https://openreview.net/forum?id=HylVB3AqYm},
}

Without any proxy, directly search neural network architectures on your target task and hardware!

Website, arXiv

Requirements

PyTorch 0.3.1 or TensorFlow 1.5
Python 3.6+

Updates

Dec-21-2018: TensorFlow pretrained models are released.
Dec-01-2018: PyTorch pretrained models are released.

Performance

Mobile settings

Model	Top-1	Top-5	Latency	FLOPs
MobilenetV1	70.6	89.5	113ms	575M
MobilenetV2	72.0	91.0	75ms	300M
MNasNet(our impl)	74.0	91.8	79ms	317M
ProxylessNAS (mobile)	74.6	92.2	78ms	320M
ProxylessNAS (mobile_14)	76.7	93.3	147ms	581M

GPU settings

Model	Top-1	Top-5	Latency
MobilenetV2	72.0	91.0	6.1ms
ShufflenetV2(1.5)	72.6	-	7.3ms
ResNet-34	73.3	91.4	8.0ms
MNasNet(our impl)	74.0	91.8	6.1ms
ProxylessNAS (GPU)	75.1	92.5	5.1ms

Mobile: 2.6% higher top-1 accuracy than MobileNetV2 at the same speed.

GPU: 3.1% higher top-1 accuracy than MobileNetV2 while being 20% faster.

ProxylessNAS consistently outperforms MobileNetV2 under various latency settings.
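
The headline numbers can be reproduced from the tables above. The following is a minimal sketch of that arithmetic, with the table values copied in by hand; the dictionary layout and function names are illustrative assumptions, not part of the released code. Note that "20% faster" corresponds to the latency ratio 6.1 / 5.1 (about 19.6%), while the reduction in latency itself is about 16.4%.

```python
# (top-1 accuracy in %, latency in ms), copied by hand from the tables above.
mobile = {
    "MobilenetV2": (72.0, 75.0),
    "ProxylessNAS (mobile)": (74.6, 78.0),
}
gpu = {
    "MobilenetV2": (72.0, 6.1),
    "ProxylessNAS (GPU)": (75.1, 5.1),
}


def top1_gain(table, ours, baseline="MobilenetV2"):
    """Absolute top-1 accuracy difference in percentage points."""
    return round(table[ours][0] - table[baseline][0], 1)


def speedup_percent(table, ours, baseline="MobilenetV2"):
    """How much faster `ours` runs, as a ratio of latencies (baseline / ours - 1)."""
    return round((table[baseline][1] / table[ours][1] - 1.0) * 100.0, 1)


def latency_reduction_percent(table, ours, baseline="MobilenetV2"):
    """How much latency `ours` saves relative to the baseline latency."""
    return round((1.0 - table[ours][1] / table[baseline][1]) * 100.0, 1)


if __name__ == "__main__":
    print("mobile top-1 gain:", top1_gain(mobile, "ProxylessNAS (mobile)"))
    print("GPU top-1 gain:", top1_gain(gpu, "ProxylessNAS (GPU)"))
    print("GPU speedup (%):", speedup_percent(gpu, "ProxylessNAS (GPU)"))
    print("GPU latency reduction (%):", latency_reduction_percent(gpu, "ProxylessNAS (GPU)"))
```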

Specialization

A single model has traditionally been deployed to all platforms, but this is suboptimal. To fully exploit hardware efficiency, the architecture should be specialized for each platform.
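
In practice, specialization means choosing the architecture that was searched for the hardware you deploy on. A minimal sketch of that choice, assuming the constructor names listed in the usage section (proxyless_cpu, proxyless_gpu, proxyless_mobile); the platform keys and the helper pick_model_name are hypothetical and not part of the package:

```python
# Hypothetical mapping from a deployment target to the name of the
# specialized model constructor. Only the constructor names come from
# this README; the keys and this helper are illustrative assumptions.
SPECIALIZED_MODELS = {
    "cpu": "proxyless_cpu",
    "gpu": "proxyless_gpu",
    "mobile": "proxyless_mobile",
}


def pick_model_name(platform: str) -> str:
    """Return the constructor name of the model specialized for `platform`."""
    key = platform.strip().lower()
    if key not in SPECIALIZED_MODELS:
        known = ", ".join(sorted(SPECIALIZED_MODELS))
        raise ValueError(f"unknown platform {platform!r}; expected one of: {known}")
    return SPECIALIZED_MODELS[key]


# With the package installed, the name could be resolved to a model, e.g.:
#   import proxyless_nas
#   net = getattr(proxyless_nas, pick_model_name("gpu"))(pretrained=True)
```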

Please refer to our paper for more results.

How to use / evaluate

Use

# PyTorch
from proxyless_nas import proxyless_cpu, proxyless_gpu, proxyless_mobile, proxyless_mobile_14
net = proxyless_cpu(pretrained=True)  # Yes, we provide pre-trained models!

# TensorFlow
from proxyless_nas_tensorflow import proxyless_cpu, proxyless_gpu, proxyless_mobile, proxyless_mobile_14
tf_net = proxyless_cpu(pretrained=True)
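
Evaluate

The Top-1 and Top-5 columns in the performance tables are top-k accuracies. As a rough, framework-independent sketch of how such a metric is computed from model outputs (this is not the repository's evaluation script; topk_accuracy and the toy scores below are illustrative assumptions):

```python
def topk_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one list per sample.
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return 100.0 * hits / len(labels)


# Toy outputs for three samples over three classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]

if __name__ == "__main__":
    print("top-1:", topk_accuracy(scores, labels, k=1))
    print("top-2:", topk_accuracy(scores, labels, k=2))
```

For a real evaluation, the scores would be the logits that a pretrained network produces on the ImageNet validation set, and k would be 1 and 5.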