/NAS-Projects

11 neural architecture search (NAS) algorithms implemented in PyTorch.

Primary LanguagePythonMIT LicenseMIT

Nueral Architecture Search (NAS)

This project contains the following neural architecture search algorithms, implemented in PyTorch. More NAS resources can be found in Awesome-NAS.

  • Network Pruning via Transformable Architecture Search, NeurIPS 2019
  • One-Shot Neural Architecture Search via Self-Evaluated Template Network, ICCV 2019
  • Searching for A Robust Neural Architecture in Four GPU Hours, CVPR 2019
  • several typical classification models, e.g., ResNet and DenseNet (see BASELINE.md)

Requirements and Preparation

Please install PyTorch>=1.1.0, Python>=3.6, and opencv.

The CIFAR and ImageNet should be downloaded and extracted into $TORCH_HOME. Some methods use knowledge distillation (KD), which require pre-trained models. Please download these models from Google Driver (or train by yourself) and save into .latent-data.

usefull tools

  1. Compute the number of parameters and FLOPs of a model:
from utils import get_model_infos
flop, param  = get_model_infos(net, (1,3,32,32))
  1. Different NAS-searched architectures are defined here.

In this paper, we proposed a differentiable searching strategy for transformable architectures, i.e., searching for the depth and width of a deep neural network. You could see the highlight of our Transformable Architecture Search (TAS) at our project page.

Usage

Use bash ./scripts/prepare.sh to prepare data splits for CIFAR-10, CIFARR-100, and ILSVRC2012. If you do not have ILSVRC2012 data, pleasee comment L12 in ./scripts/prepare.sh.

Search the depth configuration of ResNet:

CUDA_VISIBLE_DEVICES=0,1 bash ./scripts-search/search-depth-gumbel.sh cifar10 ResNet110 CIFARX 0.57 -1

Search the width configuration of ResNet:

CUDA_VISIBLE_DEVICES=0,1 bash ./scripts-search/search-width-gumbel.sh cifar10 ResNet110 CIFARX 0.57 -1

Search for both depth and width configuration of ResNet:

CUDA_VISIBLE_DEVICES=0,1 bash ./scripts-search/search-cifar.sh cifar10 ResNet56  CIFARX 0.47 -1

args: cifar10 indicates the dataset name, ResNet56 indicates the basemodel name, CIFARX indicates the searching hyper-parameters, 0.47/0.57 indicates the expected FLOP ratio, -1 indicates the random seed.

Highlight: we equip one-shot NAS with an architecture sampler and train network weights using uniformly sampling.

Usage

Please use the following scripts to train the searched SETN-searched CNN on CIFAR-10, CIFAR-100, and ImageNet.

CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar10  SETN 96 -1
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar100 SETN 96 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k SETN  256 -1

The searching codes of SETN on a small search space:

CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/SETN.sh cifar10 -1

We proposed a Gradient-based searching algorithm using Differentiable Architecture Sampling (GDAS). GDAS is baseed on DARTS and improves it with Gumbel-softmax sampling. Experiments on CIFAR-10, CIFAR-100, ImageNet, PTB, and WT2 are reported.

The old version is located at others/GDAS and a paddlepaddle implementation is locate at others/paddlepaddle.

Usage

Please use the following scripts to train the searched GDAS-searched CNN on CIFAR-10, CIFAR-100, and ImageNet.

CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar10  GDAS_V1 96 -1
CUDA_VISIBLE_DEVICES=0 bash ./scripts/nas-infer-train.sh cifar100 GDAS_V1 96 -1
CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/nas-infer-train.sh imagenet-1k GDAS_V1 256 -1

The GDAS searching codes on a small search space:

CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/GDAS.sh cifar10 -1

The baseline searching codes are DARTS:

CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/DARTS-V1.sh cifar10 -1
CUDA_VISIBLE_DEVICES=0 bash ./scripts-search/algos/DARTS-V2.sh cifar10 -1

Citation

If you find that this project helps your research, please consider citing some of the following papers:

@inproceedings{dong2019tas,
  title     = {Network Pruning via Transformable Architecture Search},
  author    = {Dong, Xuanyi and Yang, Yi},
  booktitle = {Neural Information Processing Systems (NeurIPS)},
  year      = {2019}
}
@inproceedings{dong2019one,
  title     = {One-Shot Neural Architecture Search via Self-Evaluated Template Network},
  author    = {Dong, Xuanyi and Yang, Yi},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  year      = {2019}
}
@inproceedings{dong2019search,
  title     = {Searching for A Robust Neural Architecture in Four GPU Hours},
  author    = {Dong, Xuanyi and Yang, Yi},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages     = {1761--1770},
  year      = {2019}
}