RGB-T Multi-Modal Crowd Counting Based on Transformer

The paper has been published by BMVC 2022.

The details are in https://bmvc2022.mpi-inf.mpg.de/0427.pdf

Code

链接：https://pan.baidu.com/s/1sYlFQXqGiY8ykOpehH_kkQ 提取码：jrux

Pretraining Parameter

链接：https://pan.baidu.com/s/1RmQEKXC_-nzdQo76tazspg 提取码：neqq

Results

RGB-T CC dataset

Methods	GAME(0)↓	GAME(1)↓	GAME(2)↓	GAME(3)↓	RMSE↓
CMCRL	15.61	19.95	24.69	32.89	28.18
TAFNet	12.38	16.98	21.86	30.19	22.45
MAT	12.35	16.29	20.81	29.09	22.53
DEFNet	11.90	16.08	20.19	27.27	21.09
Ours	10.90	14.81	19.02	26.14	18.79

Note:

preprocess_RGBT.py convert .json files into .npy files.

predataset_RGBT_CC.py is used to change the size of input images.

If you have any question, please email liuzywen@ahu.edu.cn

Please cite:

@inproceedings{liu2022RGBT, 
  title={{RGB-T Multi-Modal Crowd Counting Based on Transformer}},  
  author={Zhengyi Liu and Wei Wu and Yacheng Tan and Guanghui Zhang},  
  booktitle={Procedings of British Machine Vision Conference},  
  pages={1--14},  
  year={2022} 
}

Install dependencies

torch >= 1.0 torchvision opencv numpy scipy, all the dependencies can be easily installed by pip or conda

This code was tested with python 3.8.

install MultiScaleDeformableAttention:

cd ./models/ops     
sh ./make.sh

Preprocessing

File predataset_RGBT_CC.py is used to process RGBT-CC datasets to obtain pictures of different sizes.

python predataset_RGBT_CC.py

Training

Edit this file for training our model.

training hyperparameters as:

--data-dir = ""  # train datasets path.
--save-dir = ""  # save model path.
--pretrained_model = ""  # pre-trained parameters
--max-epoch = 500
--val-epoch = 1
--val_start = 30
--batch-size = 16

bash train.sh

Testing

Edit this file for testing models. testing hyperparameters as :

--data-dir = ""  # test datasets path
--save-dir = ""  # saved model path
--model = ""  # best_model_XXXXX.pth

bash test.sh

liuzywen/RGBTCC