3D_Attention

Inhibition-aware (-regularized) 3D attention for robust visual recognition


0. Environment Preparation

  • Minimum versions of the key frameworks:
python>=3.6
pytorch>=1.6.0
mxnet>=1.6.0
  • For the full list of dependencies, see 'requirements.txt'.
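
A quick way to confirm the minimum versions above is a small check script. The sketch below is not part of this repository: the helper names (parse_version, meets_minimum, check_environment) are hypothetical, and it assumes the frameworks are importable as torch and mxnet. Versions are compared numerically, since a plain string comparison would rank "1.10.0" below "1.6.0".

```python
import importlib
import re
import sys

# Minimum versions copied from the list above.
MINIMUMS = {"torch": "1.6.0", "mxnet": "1.6.0"}
MIN_PYTHON = (3, 6)


def parse_version(text):
    """Turn '1.6.0+cu101' or '1.7.0rc1' into a numeric tuple such as (1, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions ('1.6') so they compare equal to '1.6.0'.
    return parts + (0,) * max(0, 3 - len(parts))


def meets_minimum(installed, required):
    """Return True if `installed` is at least `required` (numeric parts only)."""
    return parse_version(installed) >= parse_version(required)


def check_environment():
    """Print one line per requirement; return True only if all are satisfied."""
    ok = sys.version_info[:2] >= MIN_PYTHON
    print(f"python {sys.version_info[0]}.{sys.version_info[1]}: {'ok' if ok else 'too old'}")
    for name, minimum in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            print(f"{name}: not installed (need >= {minimum})")
            ok = False
            continue
        version = getattr(module, "__version__", "0")
        good = meets_minimum(version, minimum)
        print(f"{name} {version}: {'ok' if good else 'need >= ' + minimum}")
        ok = ok and good
    return ok
```

Calling check_environment() from a Python prompt inside the target environment prints one status line per requirement.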

1. Dataset Preparation

1.1 CIFAR-100

  • Download the CIFAR-100 dataset.
  • Convert it to mxnet.recordio.MXIndexedRecordIO format:
cd datasets
python cvt_cifar_100.py
  • (Optional) For faster I/O, copy your datasets folder (containing 'train.rec' and 'train.idx') into a tmpfs mount so that it is served from memory:
sudo mkdir /tmp/train_tmp
sudo mount -t tmpfs -o size=10G tmpfs /tmp/train_tmp
cp -r {Your_Datasets_Folder} /tmp/train_tmp  
# you may get /tmp/train_tmp/cifar-100/train.rec...

Once the data is served from memory, you can set num_workers=0 in your training code.
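
For orientation, the conversion step in this section can be sketched as follows. This is an illustration under stated assumptions, not the contents of cvt_cifar_100.py: it assumes the "python version" CIFAR-100 archive (a pickled dict with data and fine_labels entries), PNG-encoded records, and output files named train.rec and train.idx. All function names are hypothetical, and mxnet.recordio.pack_img requires OpenCV to be installed.

```python
import os
import pickle

import numpy as np


def cifar_rows_to_images(rows):
    """Reshape flat CIFAR rows (N, 3072: R, G, B planes) into (N, 32, 32, 3) uint8 RGB."""
    rows = np.asarray(rows, dtype=np.uint8)
    return rows.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)


def load_cifar100_split(pickle_path):
    """Load one split file ('train' or 'test') of the pickled CIFAR-100 archive."""
    with open(pickle_path, "rb") as handle:
        batch = pickle.load(handle, encoding="bytes")
    return cifar_rows_to_images(batch[b"data"]), list(batch[b"fine_labels"])


def write_recordio(images, labels, out_dir, prefix="train"):
    """Write images and labels to <prefix>.rec and <prefix>.idx inside out_dir."""
    import mxnet as mx  # imported here so the helpers above work without MXNet

    os.makedirs(out_dir, exist_ok=True)
    record = mx.recordio.MXIndexedRecordIO(
        os.path.join(out_dir, prefix + ".idx"),
        os.path.join(out_dir, prefix + ".rec"),
        "w",
    )
    for index, (image, label) in enumerate(zip(images, labels)):
        header = mx.recordio.IRHeader(0, float(label), index, 0)
        # pack_img encodes through OpenCV, which expects BGR channel order.
        bgr = np.ascontiguousarray(image[:, :, ::-1])
        # For PNG, `quality` is the compression level (0-9), not a JPEG quality.
        record.write_idx(index, mx.recordio.pack_img(header, bgr, quality=3, img_fmt=".png"))
    record.close()
```

Whether the training code expects PNG or JPEG payloads, and integer or float labels, depends on the repository's dataset loader, so treat those choices as placeholders.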

1.2 ImageNet-1k

  • TODO

2. Train

2.1 Configure

  • Edit config.py as needed.
  • For fp16 (mixed-precision training), refer to https://pytorch.org/docs/stable/amp.html.
  • Set rec to your datasets folder, which must contain 'xxx.rec' and 'xxx.idx'.
  • Some settings are unused and can be ignored.
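
Because rec has to point at a folder holding matching '.rec' and '.idx' files, a short pre-flight check can catch a wrong path before training starts. The helpers below are a hypothetical sketch (neither find_record_pairs nor check_rec_folder exists in this repository) and only inspect file names, not file contents.

```python
import os


def find_record_pairs(folder):
    """Return sorted base names that have both '<name>.rec' and '<name>.idx' in folder."""
    names = os.listdir(folder)
    rec = {os.path.splitext(name)[0] for name in names if name.endswith(".rec")}
    idx = {os.path.splitext(name)[0] for name in names if name.endswith(".idx")}
    return sorted(rec & idx)


def check_rec_folder(folder):
    """Return the usable record pairs, or raise if the folder cannot serve as `rec`."""
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"rec folder does not exist: {folder}")
    pairs = find_record_pairs(folder)
    if not pairs:
        raise FileNotFoundError(f"no matching .rec/.idx pair found in {folder}")
    return pairs
```

For example, check_rec_folder("/tmp/train_tmp/cifar-100") would return ['train'] if that folder contains train.rec and train.idx; the path here is only illustrative.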

2.2 Train for CIFAR-100

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py --network iresnet18
  • For other --network options, see backbones/__init__.py.
  • Evaluation runs automatically every 10 epochs during training, so you do not need to run a separate evaluation script.
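
The launch command above starts four processes on a single node. As a rough sketch of what a script started this way usually has to handle (this is not the repository's train.py): torch.distributed.launch passes a --local_rank argument to each process and exports RANK and WORLD_SIZE in the environment. The function names below are hypothetical, and should_evaluate assumes zero-based epoch counting with evaluation after every tenth completed epoch, which may differ from the actual schedule.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments a launcher-started training script would receive."""
    parser = argparse.ArgumentParser(description="Illustrative training entry point")
    parser.add_argument("--network", default="iresnet18", help="backbone name")
    parser.add_argument("--local_rank", type=int, default=0,
                        help="GPU index on this node, passed by torch.distributed.launch")
    return parser.parse_args(argv)


def distributed_context(args, environ=os.environ):
    """Collect the process's position within the distributed job."""
    return {
        "local_rank": args.local_rank,
        "rank": int(environ.get("RANK", 0)),              # global process index
        "world_size": int(environ.get("WORLD_SIZE", 1)),  # total number of processes
    }


def should_evaluate(epoch, interval=10):
    """Return True after every `interval`-th completed epoch (epoch is zero-based)."""
    return (epoch + 1) % interval == 0
```

With the command shown above, each of the four processes would see a world_size of 4 and a distinct local_rank between 0 and 3.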

2.3 Train for ImageNet-1k

  • TODO

3. Evaluate

  • TODO