HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers (CVPR21 Oral)
Environment
Requires Python 3, CUDA >= 10.1, and torch >= 1.4. All dependencies can be installed as follows:
pip3 install torch==1.4.0 torchvision==0.5.0 opencv-python tqdm tensorboard lmdb pyyaml packaging Pillow==6.2.2 matplotlib yacs pyarrow==0.17.1
pip3 install cityscapesscripts # for Cityscapes segmentation
pip3 install mmcv-full==latest+torch1.4.0+cu101 -f https://openmmlab.oss-accelerate.aliyuncs.com/mmcv/dist/index.html # for Segmentation data loader
pip3 install pycocotools shapely==1.6.4 Cython pandas pyyaml json_tricks scikit-image # for COCO keypoint estimation
or simply: pip3 install -r requirements.txt
Setup
Optionally configure NCCL before running:
export NCCL_IB_DISABLE=1
export NCCL_IB_HCA=mlx5_0
export NCCL_IB_GID_INDEX=3
export NCCL_SOCKET_IFNAME=eth0
export HOROVOD_MPI_THREADS_DISABLE=1
export OMP_NUM_THREADS=56
export KMP_AFFINITY=granularity=fine,compact,1,0
Set the following environment variables:
$MASTER_ADDR: IP address of node 0 (not required if you have only one node/machine)
$MASTER_PORT: Port used for initializing the distributed environment
$NODE_RANK: Index of the current node
$N_NODES: Total number of nodes
$NPROC_PER_NODE: Number of GPUs per node (NOTE: must exactly match the number of GPUs visible via `CUDA_VISIBLE_DEVICES`)
Example 1 (one machine with 8 GPUs):
Node 1:
>>> python -m torch.distributed.launch --nproc_per_node=8 \
    --nnodes=1 --node_rank=0 --master_port=1234 train.py
Example 2 (two machines, each with 8 GPUs):
Node 1 (IP: 192.168.1.1, with a free port 1234):
>>> python -m torch.distributed.launch --nproc_per_node=8 \
    --nnodes=2 --node_rank=0 --master_addr="192.168.1.1" \
    --master_port=1234 train.py
Node 2:
>>> python -m torch.distributed.launch --nproc_per_node=8 \
    --nnodes=2 --node_rank=1 --master_addr="192.168.1.1" \
    --master_port=1234 train.py
Datasets
- ImageNet
  - Prepare the ImageNet data following the PyTorch example.
  - Optional: Generate the lmdb dataset with utils/lmdb_dataset.py. If you skip this step, please overwrite `dataset: imagenet1k_lmdb` in the yaml config with `dataset: imagenet1k` (see the yaml sketch at the end of this Datasets section).
  - The directory structure of `$DATA_ROOT` should look like this:
    ${DATA_ROOT}
    ├── imagenet
    └── imagenet_lmdb
  - Link the data:
    ln -s YOUR_LMDB_DIR data/imagenet_lmdb
- Cityscapes
  - Download the data from Cityscapes.
  - Unzip the archives:
    unzip gtFine_trainvaltest.zip leftImg8bit_trainvaltest.zip
  - Link the data:
    ln -s YOUR_DATA_DIR data/cityscapes
  - Preprocess the data:
    python3 tools/convert_cityscapes.py data/cityscapes --nproc 8
- ADE20K
  - Download the data from ADE20K.
  - Unzip the archive:
    unzip ADEChallengeData2016.zip
  - Link the data:
    ln -s YOUR_DATA_DIR data/ade20k
- COCO keypoints
  - Download data from COCO.
  - Build the tools:
    git clone https://github.com/cocodataset/cocoapi.git
    cd cocoapi/PythonAPI
    python3 setup.py build_ext --inplace
    python3 setup.py build_ext install
    make  # for nms
  - Unzip and link the data:
    ln -s YOUR_DATA_DIR data/coco
  - We also provide the person detection results on COCO val2017 and test-dev2017 to reproduce our multi-person pose estimation results. Please download them from OneDrive or GoogleDrive.
  - Download and extract them under data/coco/person_detection_results, and make them look like this:
    ${POSE_ROOT}
    |-- data
    `-- |-- coco
        `-- |-- annotations
            |   |-- person_keypoints_train2017.json
            |   `-- person_keypoints_val2017.json
            |-- person_detection_results
            |   |-- COCO_val2017_detections_AP_H_56_person.json
            |   |-- COCO_test-dev2017_detections_AP_H_609_person.json
            `-- images
                |-- train2017
                |   |-- 000000000009.jpg
                |   |-- 000000000025.jpg
                |   |-- 000000000030.jpg
                |   |-- ...
                `-- val2017
                    |-- 000000000139.jpg
                    |-- 000000000285.jpg
                    |-- 000000000632.jpg
                    |-- ...
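As noted in the ImageNet item above, skipping the lmdb generation step means switching the dataset entry in the task yaml from the lmdb variant to the raw-image variant. A minimal sketch of that one-key change (only the `dataset` key comes from this README; all surrounding keys are omitted):

    # Default, expects data/imagenet_lmdb generated by utils/lmdb_dataset.py:
    dataset: imagenet1k_lmdb

    # If lmdb was not generated, use the raw-image dataset instead:
    dataset: imagenet1k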
Running (train & evaluation)
- Search for NAS models.
  python3 -m torch.distributed.launch --nproc_per_node=${NPROC_PER_NODE} --nnodes=${N_NODES} \
      --node_rank=${NODE_RANK} --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} \
      --use_env train.py app:configs/YOUR_TASK.yml
Supported tasks:
- cls_imagenet
- seg_cityscapes
- seg_ade20k
- keypoint_coco
The super network is constructed using `model_kwparams` in YOUR_TASK.yml.
To enable the searching of Transformers, set `prune_params.use_transformer=True` in YOUR_TASK.yml; the token number of each Transformer will be printed during training (see the yaml sketch after the retraining step below).
The searched architecture can be found in `best_model.json` in the output dir.
- Retrain the searched models.
  - For retraining the searched classification model, please use `best_model.json` to overwrite `checkpoint.json` in the root dir of this project.
  - Modify models/hrnet.py to set `checkpoint_kwparams = json.load(open('checkpoint.json'))` and `class InvertedResidual(InvertedResidualChannelsFused)`.
  - Retrain the model.
    python3 -m torch.distributed.launch --nproc_per_node=${NPROC_PER_NODE} --nnodes=${N_NODES} \
        --node_rank=${NODE_RANK} --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} \
        --use_env train.py app:configs/cls_retrain.yml
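For reference, below is a hedged sketch of the search-related fragment of YOUR_TASK.yml mentioned above. Only `model_kwparams` and `prune_params.use_transformer` are named in this README; the exact nesting and any other keys are assumptions, so please follow the actual files under configs/.

    # Hypothetical fragment of YOUR_TASK.yml; only the two names below come from this README.
    model_kwparams:            # defines the super network; see configs/ for the real values
      # ...
    prune_params:
      use_transformer: True    # enables searching of lightweight Transformers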
Miscellaneous
- Plot keypoint detection results.
  python3 tools/plot_coco.py --prediction output/results/keypoints_val2017_results_0.json --save-path output/vis/
- About YAML config
  - The codebase is a general ImageNet training framework using yaml configs, with several extensions under the apps dir, based on PyTorch.
  - YAML config with additional features (an illustrative sketch follows this list):
    - `${ENV}` in yaml config.
    - `_include` for hierarchy config.
    - `_default` key for overwriting.
    - `xxx.yyy.zzz` for partial overwriting.
    - `--{{opt}} {{new_val}}` for command line overwriting.
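  A hypothetical illustration of how these features might look in practice; the file name, keys, and values below are made up for the example, and the exact semantics should be checked against the configs under apps/ and configs/.

    # Hypothetical yaml; only the feature names (_include, _default, ${ENV}, dotted keys) come from this README.
    _include: configs/cls_imagenet.yml   # hierarchy config: pull in a base config
    _default:                            # defaults that later keys may overwrite
      log_interval: 100
    data_root: ${DATA_ROOT}              # ${ENV} expansion from the environment
    optimizer.lr: 0.05                   # xxx.yyy.zzz style partial overwriting

  On the command line, the same kind of overwriting would follow the `--{{opt}} {{new_val}}` pattern, e.g. `--optimizer.lr 0.05`, again assuming that dotted key exists in the task config.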
- For any questions regarding HR-NAS, feel free to contact the author (mingyuding@hku.hk).
- If you find our work useful in your research, please consider citing our paper:
@inproceedings{ding2021hrnas,
  title={HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers},
  author={Ding, Mingyu and Lian, Xiaochen and Yang, Linjie and Wang, Peng and Jin, Xiaojie and Lu, Zhiwu and Luo, Ping},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}