PC-DARTS is a memory-efficient differentiable architecture method based on DARTS. It mainly focuses on reducing the large memory cost of the super-net in one-shot NAS method, which means that it can also be combined with other one-shot NAS method e.g. ENAS. Different from previous methods that sampling operations, PC-DARTS samples channels of the constructed super-net. For a detailed description of technical details and experimental results, please refer to our paper:
Partial Channel Connections for Memory-Efficient Differentiable Architecture Search
Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian and Hongkai Xiong.
This code is based on the implementation of DARTS.
Method | Params(M) | Error(%) | Search-Cost |
---|---|---|---|
AmoebaNet-B | 2.8 | 2.55 | 3150 |
DARTSV1 | 3.3 | 3.00 | 0.4 |
DARTSV2 | 3.3 | 2.76 | 1.0 |
SNAS | 2.8 | 2.85 | 1.5 |
PC-DARTS | 3.6 | 2.57 | 0.1 |
Only 0.1 GPU-days are used for a search on CIFAR-10!
Method | FLOPs | Top-1 Error(%) | Top-5 Error(%) | Search-Cost |
---|---|---|---|---|
NASNet-A | 564 | 26.0 | 8.4 | 1800 |
AmoebaNet-B | 570 | 24.3 | 7.6 | 3150 |
PNAS | 588 | 25.8 | 8.1 | 225 |
DARTSV2 | 574 | 26.7 | 8.7 | 1.0 |
SNAS | 522 | 27.3 | 9.3 | 1.5 |
PC-DARTS | 597 | 24.2 | 7.3 | 3.8 |
Search a good arcitecture on ImageNet by using the search space of DARTS(First Time!).
To run our code, you only need one Nvidia 1080ti(11G memory).
python train_search.py \\
python train.py \\
--auxiliary \\
--cutout \\
python train_imagenet.py \\
--tmp_data_dir /path/to/your/data \\
--save log_path \\
--auxiliary \\
--note note_of_this_run
Coming soon!.
-
For the codes in the main branch,
python2 with pytorch(3.0.1)
is recommended (running onNvidia 1080ti
). We also provided codes in theV100_python1.0
if you want to implement PC-DARTS onTesla V100
withpython3+
andpytorch1.0+
. -
You can even run the codes on a GPU with memory only 4G. PC-DARTS only costs less than 4G memory, if we use the same hyper-parameter settings as DARTS(batch-size=64).
-
You can search on ImageNet by
model_search_imagenet.py
! The training file for search on ImageNet will be uploaded after it is cleaned or you can generate it according to the train_search file on CIFAR10 and the evluate file on ImageNet. Hyperparameters are reported in our paper! The search cost 11.5 hours on 8 V100 GPUs(16G each). If you have V100(32G) you can further increase the batch-size. -
We random sample 10% and 2.5% from each class of training dataset of ImageNet. There are still 1000 classes! Replace
input_search, target_search = next(iter(valid_queue))
with following codes would be much faster:
try:
input_search, target_search = next(valid_queue_iter)
except:
valid_queue_iter = iter(valid_queue)
input_search, target_search = next(valid_queue_iter)
-
The main codes of PC-DARTS are in the file
model_search.py
. As descriped in the paper, we use an efficient way to implement the channel sampling. First, a fixed sub-set of the input is selected to be fed into the candidate operations, then the concated output is swaped. Two efficient swap operations are provided: channel-shuffle and channel-shift. For the edge normalization, we define edge parameters(beta in our codes) along with the alpha parameters in the original darts codes. -
As PC-DARTS is an ultra memory-efficient NAS methods. It has potentials to be implemnted on other tasks such as detection and segmentation.
Progressive Differentiable Architecture Search
Differentiable Architecture Search
If you use our code in your research, please cite our paper accordingly.
@article{xu2019pcdarts,
title={Partial Channel Connections for Memory-Efficient Differentiable Architecture Search},
author={Xu, Yuhui and Xie, Lingxi and Zhang, Xiaopeng and Chen, Xin and Qi, Guo-Jun and Tian, Qi and Xiong, Hongkai},
journal={arXiv preprint arXiv:1907.05737},
year={2019}
}