ReST 🛌 (ICCV2023)

ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking

Cheng-Che Cheng¹ Min-Xuan Qiu¹ Chen-Kuo Chiang² Shang-Hong Lai¹

¹National Tsing Hua University, Taiwan ²National Chung Cheng University, Taiwan

News

2023.8 Code release
2023.7 Our paper has been accepted to ICCV 2023!

Introduction

ReST is a novel reconfigurable graph model that first associates all detected objects across cameras spatially, then reconfigures the spatial graph into a temporal graph for temporal association. This two-stage association approach lets the model extract robust spatially and temporally aware features and addresses the problem of fragmented tracklets. The model is also designed for online tracking, which makes it suitable for real-world applications. Experimental results show that the proposed graph model extracts more discriminative features for object tracking and achieves state-of-the-art performance on several public datasets.
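As a rough illustration of the two-stage idea only, the sketch below first merges same-frame detections from different cameras, then links the merged objects to tracks from the previous frame. It is not the ReST model: fixed distance thresholds on ground-plane positions stand in for the learned graph features, and every name here (`spatial_association`, `temporal_association`, `radius`) is hypothetical.

```python
from itertools import combinations
from math import dist


def spatial_association(detections, radius=0.5):
    """Fuse same-frame detections seen by different cameras.

    detections: list of (camera_id, (x, y)) ground-plane positions.
    Returns one averaged position per group of linked detections.
    """
    parent = list(range(len(detections)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(detections)), 2):
        (cam_i, pos_i), (cam_j, pos_j) = detections[i], detections[j]
        # Only link detections coming from different cameras.
        if cam_i != cam_j and dist(pos_i, pos_j) <= radius:
            parent[find(i)] = find(j)

    groups = {}
    for i, (_, pos) in enumerate(detections):
        groups.setdefault(find(i), []).append(pos)
    return [tuple(sum(c) / len(ps) for c in zip(*ps)) for ps in groups.values()]


def temporal_association(tracks, objects, next_id, radius=1.0):
    """Greedily link fused objects to the previous frame's tracks.

    tracks: {track_id: (x, y)} from the previous frame.
    Unmatched objects start new tracks; unmatched tracks are dropped
    in this toy version.
    """
    free = dict(tracks)
    updated = {}
    for pos in objects:
        best = min(free, key=lambda t: dist(free[t], pos), default=None)
        if best is not None and dist(free[best], pos) <= radius:
            updated[best] = pos
            del free[best]
        else:
            updated[next_id] = pos
            next_id += 1
    return updated, next_id
```

Running `spatial_association` on each frame and feeding its output to `temporal_association` frame by frame mirrors the online, spatial-then-temporal ordering described above.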

Requirements

Installation

Clone the project and create a virtual environment:

git clone https://github.com/chengche6230/ReST.git
conda create --name ReST python=3.8
conda activate ReST

Install the following packages (follow each project's own instructions):

torchreid
DGL (also check PyTorch/CUDA compatibility table below)
warmup_scheduler
py-motmetrics

Reference commands:

# torchreid
git clone https://github.com/KaiyangZhou/deep-person-reid.git
cd deep-person-reid/
pip install -r requirements.txt
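# cudatoolkit=9.0 comes from the torchreid instructions; choose the version that matches your driver and the DGL build below (cu117)
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch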
python setup.py develop

# other packages (in /ReST)
conda install -c dglteam/label/cu117 dgl
pip install git+https://github.com/ildoonet/pytorch-gradual-warmup-lr.git
pip install motmetrics

Install the remaining requirements (in /ReST):
```
pip install -r requirements.txt
```
Download pre-trained ReID model
- OSNet

Datasets

Place datasets in ./datasets/ as:

./datasets/
├── CAMPUS/
│   ├── Garden1/
│   │   └── view-{}.txt
│   ├── Garden2/
│   │   └── view-HC{}.txt
│   ├── Parkinglot/
│   │   └── view-GL{}.txt
│   └── metainfo.json
├── PETS09/
│   ├── S2L1/
│   │   └── View_00{}.txt
│   └── metainfo.json
├── Wildtrack/
│   ├── sequence1/
│   │   └── src/
│   │       ├── annotations_positions/
│   │       └── Image_subsets/
│   └── metainfo.json
└── {DATASET_NAME}/ # for customized dataset
    ├── {SEQUENCE_NAME}/
    │   └── {ANNOTATION_FILE}.txt
    └── metainfo.json

Prepare a metainfo.json file for each dataset (e.g. frames, file pattern, homography).
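Before preprocessing, it can help to confirm that every dataset folder actually contains a parseable metainfo.json. The helper below is a minimal sketch that relies only on the directory layout shown above; it does not validate the file's keys, since their schema is not described here, and the function name `check_metainfo` is hypothetical.

```python
import json
from pathlib import Path


def check_metainfo(root="./datasets"):
    """Return {dataset_name: problem} for folders with a missing or invalid metainfo.json."""
    problems = {}
    for dataset_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        meta = dataset_dir / "metainfo.json"
        if not meta.is_file():
            problems[dataset_dir.name] = "metainfo.json is missing"
            continue
        try:
            # Only checks that the file is well-formed JSON, not its contents.
            json.loads(meta.read_text(encoding="utf-8"))
        except json.JSONDecodeError as err:
            problems[dataset_dir.name] = f"invalid JSON: {err}"
    return problems
```

An empty result means each dataset folder under the root has a metainfo.json that loads as JSON.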

Run for each dataset:

python ./src/datasets/preprocess.py --dataset {DATASET_NAME}

Check that nothing is missing from ./datasets/{DATASET_NAME}/{SEQUENCE_NAME}/output:

/output/
├── gt_MOT/ # for motmetrics
│   └── c{CAM}.txt
├── gt_train.json
├── gt_eval.json
├── gt_test.json
└── {DETECTOR}_test.json # if you want to use other detector, e.g. yolox_test.json
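The check can also be scripted. The sketch below looks for the files listed in the tree above and reports whichever are absent; the function name `missing_outputs` and its `detector` parameter are hypothetical, and the check covers file presence only, not file contents.

```python
from pathlib import Path

# Ground-truth files listed in the output tree above.
EXPECTED_FILES = ("gt_train.json", "gt_eval.json", "gt_test.json")


def missing_outputs(output_dir, detector=None):
    """List expected preprocessing outputs that are absent from output_dir."""
    out = Path(output_dir)
    missing = [name for name in EXPECTED_FILES if not (out / name).is_file()]

    # gt_MOT/ should hold at least one per-camera file named c{CAM}.txt.
    gt_mot = out / "gt_MOT"
    if not gt_mot.is_dir() or not any(gt_mot.glob("c*.txt")):
        missing.append("gt_MOT/c{CAM}.txt")

    # Optional detector output, e.g. detector="yolox" -> yolox_test.json.
    if detector is not None and not (out / f"{detector}_test.json").is_file():
        missing.append(f"{detector}_test.json")
    return missing
```

For example, `missing_outputs("./datasets/Wildtrack/sequence1/output", detector="yolox")` returns an empty list when everything in the tree is present.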

Prepare all image frames as {FRAME}_{CAM}.jpg in /output/frames.
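To spot frames that lack an image for some camera, the file names can be grouped by frame. This sketch assumes that FRAME and CAM are separated by the last underscore in the file name and treats both as plain strings; the function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def cameras_per_frame(frames_dir):
    """Map each frame id to the set of camera ids found as {FRAME}_{CAM}.jpg."""
    seen = defaultdict(set)
    for image in Path(frames_dir).glob("*.jpg"):
        frame, sep, cam = image.stem.rpartition("_")
        if sep:  # skip names without an underscore
            seen[frame].add(cam)
    return dict(seen)


def incomplete_frames(frames_dir):
    """Return {frame: [missing camera ids]} for frames not covered by every camera."""
    seen = cameras_per_frame(frames_dir)
    # The full camera set is taken to be every camera id seen in any frame.
    all_cams = set().union(*seen.values())
    return {
        frame: sorted(all_cams - cams)
        for frame, cams in seen.items()
        if cams != all_cams
    }
```

A frame that is missing for every camera will not show up in the result, so compare the frame ids against the expected frame range separately if that matters.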

Model Zoo

Download the trained weights if needed, then set TEST.CKPT_FILE_SG and TEST.CKPT_FILE_TG in ./configs/{DATASET_NAME}.yml to their paths.

Dataset	Spatial Graph	Temporal Graph
Wildtrack	sequence1	sequence1
CAMPUS	Garden1 Garden2 Parkinglot	Garden1 Garden2 Parkinglot
PETS-09	S2L1	S2L1

Training

To train the model, run:

python main.py --config_file ./configs/{DATASET_NAME}.yml

In {DATASET_NAME}.yml:

Set MODEL.MODE to 'train'.
Set SOLVER.TYPE to select which graph to train.
Make sure all settings suit your device, e.g. DEVICE_ID, BATCH_SIZE.

You can also override config attributes by appending them to the command, e.g.:

python main.py --config_file ./configs/Wildtrack.yml MODEL.DEVICE_ID "('1')" SOLVER.TYPE "SG"
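The trailing KEY VALUE pairs resemble yacs-style overrides, although whether main.py uses yacs is an assumption here. The sketch below shows, in plain Python, how such pairs could be merged into a nested config; `apply_overrides` and the sample config values are hypothetical and do not reproduce the repository's code.

```python
import ast
import copy


def apply_overrides(cfg, opts):
    """Return a copy of cfg with dotted KEY VALUE pairs from opts applied."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come as KEY VALUE pairs")
    merged = copy.deepcopy(cfg)
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = merged
        for name in parents:
            node = node[name]  # raises KeyError for an unknown section
        if leaf not in node:
            raise KeyError(key)
        try:
            # "('1')" is a parenthesized string literal, so it becomes '1'.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Bare words such as SG are kept as plain strings.
            value = raw
        node[leaf] = value
    return merged


# Hypothetical defaults, standing in for a loaded .yml file.
defaults = {"MODEL": {"DEVICE_ID": "0", "MODE": "train"}, "SOLVER": {"TYPE": "TG"}}
cfg = apply_overrides(defaults, ["MODEL.DEVICE_ID", "('1')", "SOLVER.TYPE", "SG"])
```

The shell removes the outer double quotes, so the program receives `('1')` and `SG` as the two values.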

Testing

python main.py --config_file ./configs/{DATASET_NAME}.yml

In {DATASET_NAME}.yml:

Set MODEL.MODE to 'test'.
Select the input detections you want and set MODEL.DETECTION accordingly.
- You first need to prepare {DETECTOR}_test.json in ./datasets/{DATASET_NAME}/{SEQUENCE_NAME}/output/ yourself.
Make sure all settings in TEST are configured.

DEMO

Wildtrack

Acknowledgement

Thanks to the re-implementation of GNN-CCA (arXiv) for the codebase.

Citation

If you find this code useful for your research, please cite our paper:

@InProceedings{Cheng_2023_ICCV,
    author    = {Cheng, Cheng-Che and Qiu, Min-Xuan and Chiang, Chen-Kuo and Lai, Shang-Hong},
    title     = {ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {10051-10060}
}