ISyNet Description
Model Architecture
Dataset
Environment Requirements
Script Description
- Script and Sample Code
Model Description
- Performance
  - Evaluation Performance
Description of Random Situation

ISyNet Description

ISyNets are a set of architectures designed to be both fast on the 310 hardware and accurate. We show the advantage of the designed architectures for NPU devices on ImageNet, as well as their generalization ability on downstream classification and detection tasks.

Paper: Alexey Letunovskiy, Vladimir Korviakov, Vladimir Polovnikov, Anastasiia Kargapoltseva, Ivan Mazurenko, Yepan Xiong. ISyNet: Convolutional Neural Networks design for AI accelerator.

Model architecture

The overall network architecture of ISyNet is described in our paper.

Dataset

Dataset used: ImageNet 2012

Dataset size:
- Train: 1.2 million images in 1,000 classes
- Test: 50,000 validation images in 1,000 classes
Data format: RGB images.
- Note: Data will be processed in src/dataset.py

Environment Requirements

Hardware (GPU)
Framework
- MindSpore
For more information, please check the resources below:
- MindSpore Python API

Script Description

Script and Sample Code

├── README.md                           # descriptions about ISyNet
├── script
│   ├── run_eval_gpu.sh                 # gpu evaluation script
│   ├── run_infer_310_om.sh             # evaluation script for the 310
│   ├── run_standalone_train_gpu.sh     # training script on single GPU
│   └── run_distributed_train.sh        # distributed training script on multiple GPUs
├── ISyNet
│   ├── model.py            # architecture cell constructor
│   ├── backbone.py         # CNN backbone constructor
│   ├── head.py             # classification head constructor
│   ├── layers.py           # definition of model's layers
│   └── json_parser_backbone.py     # parser of architecture's json files
├── src
│   ├── CrossEntropySmooth.py         # cross entropy with label smooth
│   ├── KLLoss.py               # KL loss for 
│   ├── autoaugment.py                # auto augmentation
│   ├── dataset.py                    # dataset
│   ├── ema_callback.py               # callback for EMA(exponential moving average)
│   ├── eval_callback.py              # eval callback
│   ├── lr_generator.py               # learning rate scheduling
│   ├── metric.py                     # metric
│   ├── momentum.py                   # SGD momentum
│   └── model_utils                   # utils
├── utils
│   ├── count_acc.py                  # count accuracy of the model executed on the 310
│   ├── export.py                     # export mindir script
│   └── preprocess_310.py             # preprocess ImageNet dataset and convert it from JPEG to bin files for 310 inference
├── config                            # yml configs
├── eval.py                 # evaluation script
└── train.py                # training script
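The tree above lists src/CrossEntropySmooth.py as cross entropy with label smoothing. As a rough illustration of that idea, here is a minimal pure-Python sketch. It is not the repository's implementation: the function names, the smoothing value of 0.1, and the choice to spread the smoothed mass evenly over the non-target classes are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross entropy against a label-smoothed target distribution.

    Assumption: the target class gets weight (1 - smoothing) and the
    remaining mass is split evenly across the other classes.
    """
    num_classes = len(logits)
    probs = softmax(logits)
    on_value = 1.0 - smoothing
    off_value = smoothing / (num_classes - 1)
    loss = 0.0
    for k, p in enumerate(probs):
        weight = on_value if k == target else off_value
        loss -= weight * math.log(p)
    return loss


# A confident, correct prediction is penalized more once smoothing is on.
plain = smoothed_cross_entropy([10.0, 0.0, 0.0], target=0, smoothing=0.0)
smooth = smoothed_cross_entropy([10.0, 0.0, 0.0], target=0, smoothing=0.1)
```

With smoothing set to 0 the function reduces to ordinary softmax cross entropy, which makes it easy to compare the two behaviours.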

Training process

Launch

# training on single GPU
  bash run_standalone_train_gpu.sh  [DATA_PATH] [CONFIG_PATH]
# training on multiple GPUs
  bash run_distributed_train_gpu.sh [DATA_PATH] [CONFIG_PATH]

Checkpoints are saved in the ./train/output folder (single GPU), the ./train_parallel/output/ folder (multiple GPUs), or the ./train_parallel0-7/output/ folders (multiple NPUs).

Evaluation Process

Launch

# evaluation example

bash run_eval_gpu.sh [DATA_PATH] [CHECKPOINT_PATH] [CONFIG_PATH]

Checkpoints are produced during the training process.

Inference Process

Export MindIR

Export MindIR on local

python utils/export.py --jsonFile [JSON_FILE] --file_name [FILE_NAME] --file_format [FILE_FORMAT] --checkpoint_file_path [CHECKPOINT_PATH]

The checkpoint_file_path parameter is required. FILE_FORMAT must be one of ["AIR", "MINDIR"].
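The constraints on the export arguments can be expressed directly in an argument parser. The sketch below is hypothetical and is not taken from utils/export.py: only the four flag names come from the command above, while the defaults, the help strings, and the assumption that --jsonFile is required are invented for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags of the export command."""
    parser = argparse.ArgumentParser(description="Export argument sketch")
    # Assumption: the architecture JSON is mandatory.
    parser.add_argument("--jsonFile", required=True,
                        help="path to the model JSON description")
    # Assumption: this default output name is made up for the example.
    parser.add_argument("--file_name", default="isynet",
                        help="output file name without extension")
    # Only the two documented formats are accepted.
    parser.add_argument("--file_format", choices=["AIR", "MINDIR"],
                        default="MINDIR", help="export format")
    # Documented as required.
    parser.add_argument("--checkpoint_file_path", required=True,
                        help="path to the trained checkpoint")
    return parser


# Parse an explicit argument list so the example does not read sys.argv.
args = build_parser().parse_args([
    "--jsonFile", "model.json",
    "--file_format", "AIR",
    "--checkpoint_file_path", "model.ckpt",
])
```

Using `choices` makes argparse reject any other format with a usage error before export work starts.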

Infer on 310

Overall procedure of running on the 310 consists of the following steps:

1. Conversion of the ImageNet validation set to bin files
2. Conversion of the CKPT MindSpore model to AIR format
3. Conversion of the AIR model to OM format
4. Building the inference executable program
5. Running the OM model and dumping the inference results
6. Computing the validation accuracy

We only provide an example of inference using the OM model. Currently, batch_size can only be set to 1 for accuracy measurement.

Step 1 should be done only once with the following command:

# ImageNet files conversion

python3.7 preprocess_310.py --data_path [IMAGENET_ORIGINAL_VAL_PATH] --output_path [IMAGENET_PREPROCESSED_VAL_PATH]

IMAGENET_ORIGINAL_VAL_PATH is an input path to the original ImageNet validation folder.
IMAGENET_PREPROCESSED_VAL_PATH is an output path where the converted ImageNet files will be saved.

Steps 2 to 6 are fully automated with the following script:

# 310 inference
cd scripts
export _PATH=/usr/local// # set another path to the  toolkit if needed
bash run_infer_310_om.sh [MODEL_JSON_FILE] [MODEL_CKPT_FILE] [IMAGENET_PREPROCESSED_VAL_PATH] [BATCH_SIZE] [MODE]

MODEL_JSON_FILE is a path to model JSON description file.
MODEL_CKPT_FILE is a path to pretrained model CKPT file.
IMAGENET_PREPROCESSED_VAL_PATH is a path to the converted ImageNet files prepared by preprocess_310.py script.
BATCH_SIZE is the batch size. Computing the validation accuracy is supported only for batch size 1. Inference and profiling are supported for any batch size, but the input files must be concatenated correspondingly.
MODE is the inference mode, either "inference" or "profile":
- "inference" means simply running the model, saving the outputs, and measuring the average latency.
- "profile" means profiling the model and saving a detailed analysis of each operation in the model graph.
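For batch sizes larger than 1, the per-image bin files have to be concatenated into per-batch files. The following is a minimal sketch of one way to do that, not a script from this repository. The function name, the output file naming, and the decision to drop an incomplete final batch are assumptions.

```python
import tempfile
from pathlib import Path


def concat_bins(input_paths, output_dir, batch_size):
    """Concatenate consecutive per-image bin files into per-batch bin files.

    Assumption: an incomplete final group is dropped rather than padded.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for start in range(0, len(input_paths), batch_size):
        group = input_paths[start:start + batch_size]
        if len(group) < batch_size:
            break
        out_path = output_dir / f"batch_{start // batch_size:06d}.bin"
        with open(out_path, "wb") as out:
            for path in group:
                out.write(Path(path).read_bytes())
        written.append(out_path)
    return written


# Demo with five dummy 4-byte "images" and a batch size of 2.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    inputs = []
    for i in range(5):
        p = tmp / f"img_{i}.bin"
        p.write_bytes(bytes([i]) * 4)
        inputs.append(p)
    outputs = concat_bins(inputs, tmp / "batched", batch_size=2)
    batch_sizes = [p.stat().st_size for p in outputs]
    first_batch = outputs[0].read_bytes()
```

The caller is responsible for passing the input paths in a stable order, since the order determines which images end up in the same batch.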

result

After validation, the inference result is saved in the current path. The accuracy is reported in the acc.log file.
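The accuracy computation itself is simple. As an illustration only (this is not the code of utils/count_acc.py, and the function name and input layout are assumptions), top-1 accuracy can be computed from per-image class scores and ground-truth labels like this:

```python
def top1_accuracy(outputs, labels):
    """Return top-1 accuracy in percent.

    outputs: list of per-image score lists (one score per class)
    labels:  list of ground-truth class indices
    """
    if len(outputs) != len(labels):
        raise ValueError("outputs and labels must have the same length")
    correct = 0
    for scores, label in zip(outputs, labels):
        # Index of the highest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy data: three of the four predictions match their labels.
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
accuracy = top1_accuracy(scores, labels=[1, 0, 2, 0])
```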

Model Description

Performance

Evaluation Performance

Model	ImageNet Top-1 (MindSpore), %	Latency, ms*	Params, x10^6	MACs, x10^9	Checkpoint
ISyNet-N0	75.03	0.43	9.59	1.13	Link
ResNet-18+	74.3	0.63	11.69	2.28
ISyNet-N1	76.41	0.72	7.42	2.85	Link
ISyNet-N1-S1	76.78	0.74	7.82	2.88	Link
ISyNet-N1-S2	77.45	0.83	8.86	3.34	Link
ISyNet-N1-S3	78.25	0.97	10.81	4.12	Link
ResNet-34+	77.95	1.05	21.8	4.63
ISyNet-N2	79.07	1.10	19.43	4.93	Link
ISyNet-N3	80.43	1.55	20.47	7.32	Link
ResNet-50+	80.18	1.64	25.56	5.19

* Latency is measured on the 310 NPU accelerator with batch size 16 in fp16 precision.

ISyNet-N3 on ImageNet

Parameter	Value
Model Version	ISyNet-N3
Resource	910, 8 NPU; CPU 2.60GHz, 96 cores; Memory 1500G; OS Euler2.5
Uploaded Date	03/01/2022 (month/day/year)
MindSpore Version	1.5.0
Dataset	ImageNet
Training Parameters	epoch=550, steps_per_epoch=1251, batch_size=128; Deep Mutual Learning, RandAugmentation, Last BatchNorm; lr=0.001, warmup=40 epochs, cosine scheduler
Optimizer	AdamW
Loss Function	Softmax Cross Entropy and KL Divergence
Outputs	probability
Speed	6 min / epoch
Total time	55 hours
Parameters (M)	20.47
Config	config/IsyNet-N1-S3_imagenet2012_config_MA_v7.yaml

See more details in the Paper.
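To make the learning-rate settings in the table concrete, here is a minimal sketch of a warmup plus cosine schedule using the listed values (lr=0.001, 40 warmup epochs, 550 epochs, 1251 steps per epoch). It is not the code in src/lr_generator.py. The linear shape of the warmup and the decay towards zero are assumptions, as is the function name.

```python
import math


def warmup_cosine_lr(step, base_lr=0.001, warmup_epochs=40,
                     total_epochs=550, steps_per_epoch=1251):
    """Learning rate at a given global step (0-based).

    Assumptions: linear warmup up to base_lr, then cosine decay to zero.
    """
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = total_epochs * steps_per_epoch
    if step < warmup_steps:
        # Linear ramp that reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, in [0, 1).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


total_steps = 550 * 1251              # 688050 optimizer steps in total
lr_start = warmup_cosine_lr(0)
lr_peak = warmup_cosine_lr(40 * 1251)
lr_end = warmup_cosine_lr(total_steps - 1)
```

The table values are also consistent with each other: 550 epochs at 6 minutes per epoch is 3300 minutes, which matches the listed total time of 55 hours.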

Description of Random Situation

We set the random seed inside dataset.py. We also set a random seed in train.py.
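The effect of fixing a seed can be illustrated without any framework. The sketch below uses Python's standard random module as a stand-in. In a MindSpore script the global seed is typically fixed with mindspore.set_seed, but the seed value and the helper name used here are assumptions and are not taken from dataset.py or train.py.

```python
import random


def shuffled_indices(num_samples, seed):
    """Return a shuffled index list that depends only on the seed."""
    rng = random.Random(seed)  # private generator, no global state touched
    indices = list(range(num_samples))
    rng.shuffle(indices)
    return indices


# Same seed: the sample order is identical across runs.
first = shuffled_indices(10, seed=1)
second = shuffled_indices(10, seed=1)
```

Using a private `random.Random` instance keeps the example independent of any other code that draws from the global generator.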
