BossNAS

This repository contains PyTorch code and pretrained models of our paper: BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search.

Illustration of the Siamese supernets training with ensemble bootstrapping.

Illustration of the fabric-like Hybrid CNN-transformer Search Space with flexible down-sampling positions.
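
Conceptually, the scheme pairs an online supernet with a slowly updated (EMA) target supernet and supervises a sampled online path with the averaged output of several target paths. Below is a minimal, self-contained sketch of this ensemble-bootstrapping idea for a single toy block (not the released implementation; the block definition, dimensions, loss, and EMA rate are all illustrative):

    import copy
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Toy searchable block: each candidate operation is one path choice.
    class SearchBlock(nn.Module):
        def __init__(self, dim=32, n_ops=3):
            super().__init__()
            self.ops = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_ops))

        def forward(self, x, op_idx):
            return F.relu(self.ops[op_idx](x))

    online = SearchBlock()
    target = copy.deepcopy(online)      # Siamese twin, updated only by EMA
    for p in target.parameters():
        p.requires_grad_(False)
    opt = torch.optim.SGD(online.parameters(), lr=0.1)

    for step in range(100):
        x = torch.randn(16, 32)
        path = int(torch.randint(len(online.ops), ()))  # sample one online path
        pred = online(x, path)
        with torch.no_grad():
            # Ensemble bootstrapping: the regression target is the average
            # output of all candidate paths of the EMA network.
            tgt = torch.stack([target(x, i) for i in range(len(target.ops))]).mean(0)
        # BYOL-style negative cosine similarity between normalized embeddings.
        loss = -(F.normalize(pred, dim=-1) * F.normalize(tgt, dim=-1)).sum(-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():           # EMA update of the target supernet
            for pt, po in zip(target.parameters(), online.parameters()):
                pt.mul_(0.99).add_(po, alpha=0.01)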

Our Results and Trained Models

  • Here is a summary of our searched models:

    Model              MAdds  Step time  Top-1 (%)  Top-5 (%)  Url
    BossNet-T0 w/o SE  3.4B   101 ms     80.5       95.0       checkpoint
    BossNet-T0         3.4B   115 ms     80.8       95.2       checkpoint
    BossNet-T0^        5.7B   147 ms     81.6       95.6       same as above
    BossNet-T1         7.9B   156 ms     81.9       95.6       checkpoint
    BossNet-T1^        10.5B  165 ms     82.2       95.7       same as above
  • Here is a summary of the architecture rating accuracy of our method:

    Search space   Dataset    Kendall tau  Spearman rho  Pearson R
    MBConv         ImageNet   0.65         0.78          0.85
    NATS-Bench Ss  CIFAR-10   0.53         0.73          0.72
    NATS-Bench Ss  CIFAR-100  0.59         0.76          0.79

Usage

1. Requirements

  • Linux

  • Python 3.5+

  • CUDA 9.0 or higher

  • NCCL 2

  • GCC 4.9 or higher

  • Install PyTorch 1.7.0+ and torchvision 0.8.1+, for example:

    conda install -c pytorch pytorch torchvision
  • Install Apex, for example:

    git clone https://github.com/NVIDIA/apex.git
    cd apex
    pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
  • Install pytorch-image-models 0.3.2, for example:

    pip install timm==0.3.2
  • Install OpenSelfSup. Since the original OpenSelfSup cannot be installed as a site package, please install our forked and modified version, for example:

    git clone https://github.com/changlin31/OpenSelfSup.git
    cd OpenSelfSup
    pip install -v --no-cache-dir .
  • Prepare ImageNet and its meta files under /YOURDATAROOT/imagenet/ (see the folder structure below).

  • Download the NATS-Bench split version of the CIFAR datasets from Google Drive and put them under /YOURDATAROOT/cifar/.

  • Prepare BossNAS repository:

    git clone https://github.com/changlin31/BossNAS.git
    cd BossNAS
    • Create a soft link to your data root:
      ln -s /YOURDATAROOT data
    • Overall structure of the folder:
      BossNAS
      ├── ranking_mbconv
      ├── ranking_nats
      ├── retraining_hytra
      ├── searching
      └── data
          ├── imagenet
          │   ├── meta
          │   ├── train
          │   │   ├── n01440764
          │   │   ├── n01443537
          │   │   └── ...
          │   └── val
          │       ├── n01440764
          │       ├── n01443537
          │       └── ...
          └── cifar
              ├── cifar-10-batches-py
              └── cifar-100-python
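
  • Optional sanity check: with the packages installed and the data linked as above, a short script can confirm that everything is importable and in place (a convenience sketch; adjust DATA_ROOT if your soft link differs):

    import os
    import torch, torchvision, timm, apex, openselfsup

    print("torch", torch.__version__, "| torchvision", torchvision.__version__,
          "| timm", timm.__version__)
    print("CUDA available:", torch.cuda.is_available())

    DATA_ROOT = "data"   # the soft link created above
    for sub in ("imagenet/meta", "imagenet/train", "imagenet/val",
                "cifar/cifar-10-batches-py", "cifar/cifar-100-python"):
        path = os.path.join(DATA_ROOT, sub)
        print(("ok      " if os.path.isdir(path) else "MISSING ") + path)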

2. Retrain or Evaluate our BossNet-T models

  • First, move to the retraining code directory to perform retraining or evaluation.

    cd retraining_hytra

    Our retraining code for BossNet-T is based on the DeiT repository.

  • Evaluate our BossNet-T models with the following command:

    • Please download our checkpoint files from the result table above, and change --resume and --input-size accordingly. You can change the --nproc_per_node option to match your number of GPUs. (A quick way to inspect a downloaded checkpoint is sketched after these commands.)

      python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model bossnet_T0 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8 --eval --resume PATH/TO/BossNet-T0-80_8.pth
      python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model bossnet_T1 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8 --eval --resume PATH/TO/BossNet-T1-81_9.pth
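    • If evaluation fails to start, it can help to first check that the downloaded checkpoint deserializes. A minimal sketch (the internal key layout is not documented here; this only lists what the file contains):

      import torch

      ckpt = torch.load("PATH/TO/BossNet-T0-80_8.pth", map_location="cpu")
      print(type(ckpt))
      if isinstance(ckpt, dict):
          print(list(ckpt.keys()))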
  • Retrain our BossNet-T models with the following command:

    • You can change --nproc_per_node to match your number of GPUs. Note that the learning rate is automatically scaled according to the number of GPUs and the batch size; the scaling rule is sketched after these commands. We recommend training with a batch size of 128 on 8 GPUs (takes about 2 days).

      python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model bossnet_T0 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8
      python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model bossnet_T1 --input-size 224 --batch-size 128 --data-path ../data/imagenet --num_workers 8
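    • For reference, the scaling rule follows DeiT's convention (an assumption carried over from the upstream repository; the base LR of 5e-4 and the divisor of 512 are DeiT defaults and may differ here):

      def scaled_lr(base_lr, per_gpu_batch, n_gpus, base_batch=512.0):
          # DeiT-style linear scaling: LR grows with the total batch size.
          return base_lr * per_gpu_batch * n_gpus / base_batch

      print(scaled_lr(5e-4, 128, 8))   # 0.001 for the recommended 128 x 8 setup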

Architecture of our BossNet-T0

3. Evaluate architecture rating accuracy of BossNAS

  • Get the ranking correlations of BossNAS on the MBConv search space with the following commands:

    cd ranking_mbconv
    python get_model_score_mbconv.py

  • Get the ranking correlations of BossNAS on NATS-Bench Ss with the following commands (a sketch of how these coefficients are computed follows):

    cd ranking_nats
    python get_model_score_nats.py
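
  • For reference, the three coefficients reported above can be computed from paired lists of predicted architecture scores and ground-truth accuracies with scipy.stats. A minimal sketch with made-up numbers:

    from scipy import stats

    # Hypothetical data: supernet scores vs. ground-truth top-1 accuracies.
    scores = [0.12, 0.34, 0.31, 0.58, 0.77, 0.70]
    accs = [70.1, 72.4, 72.0, 74.9, 76.3, 75.8]

    tau, _ = stats.kendalltau(scores, accs)
    rho, _ = stats.spearmanr(scores, accs)
    r, _ = stats.pearsonr(scores, accs)
    print(f"Kendall tau={tau:.2f}  Spearman rho={rho:.2f}  Pearson R={r:.2f}")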

4. Search Architecture with BossNAS

First, go to the searching code directory. (A note on reading the config file names follows the commands below.)

cd searching
  • Search in NATS-Bench Ss Search Space on CIFAR datasets (4 GPUs, 3 hrs)

    • CIFAR10:
      bash dist_train.sh configs/nats_c10_bs256_accumulate4_gpus4.py 4
    • CIFAR100:
      bash dist_train.sh configs/nats_c100_bs256_accumulate4_gpus4.py 4
  • Search in MBConv Search Space on ImageNet (8 GPUs, 1.5 days)

    bash dist_train.sh configs/mbconv_bs64_accumulate8_ep6_multi_aug_gpus8.py 8
  • Search in HyTra Search Space on ImageNet (8 GPUs, 4 days, memory requirement: 24 GB)

    bash dist_train.sh configs/hytra_bs64_accumulate8_ep6_multi_aug_gpus8.py 8
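
A note on the config file names: they appear to encode the data pipeline settings (an assumption read off the names themselves, not documented behavior). For example, mbconv_bs64_accumulate8_ep6_multi_aug_gpus8.py suggests a per-GPU batch size of 64 with 8 gradient-accumulation steps on 8 GPUs:

    # Assumption: bs64 / accumulate8 / gpus8 in the config name give the
    # per-GPU batch size, gradient-accumulation steps, and GPU count.
    per_gpu_bs, accum_steps, n_gpus = 64, 8, 8
    print("effective batch size:", per_gpu_bs * accum_steps * n_gpus)  # 4096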

Citation

If you use our code for your paper, please cite:

@article{li2021bossnas,
  author = {Li, Changlin and
            Tang, Tao and
            Wang, Guangrun and
            Peng, Jiefeng and
            Wang, Bing and
            Liang, Xiaodan and
            Chang, Xiaojun},
  title = {BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search},
  journal = {arXiv preprint arXiv:2103.12424},
  year = 2021,
}