deep-person-reid

Primary language: Python. License: MIT.

PyTorch implementation of deep person re-identification models.

We support

  • multi-GPU training.
  • both image-based and video-based reid.
  • unified interface for different reid models.
  • easy dataset preparation.
  • end-to-end training and evaluation.
  • standard dataset splits used by most papers.
  • fast Cython-based evaluation.

Get Started

  1. cd to the folder where you want to download this repo.
  2. Run git clone https://github.com/KaiyangZhou/deep-person-reid.
  3. Install dependencies with pip install -r requirements.txt.
  4. (Optional) To accelerate evaluation (10x faster), build the Cython-based evaluation code (developed by luzai): cd to eval_lib, then run make or python setup.py build_ext -i. Afterwards, run python test_cython_eval.py to check that the package was installed successfully.

Datasets

Image reid datasets:

  • Market1501 [7]
  • CUHK03 [13]
  • DukeMTMC-reID [16, 17]
  • MSMT17 [22]
  • VIPeR [28]
  • GRID [29]
  • CUHK01 [30]
  • PRID450S [31]

Video reid datasets:

  • MARS [8]
  • iLIDS-VID [11]
  • PRID2011 [12]
  • DukeMTMC-VideoReID [16, 23]

Instructions regarding how to prepare these datasets can be found here.

Models

  • models/resnet.py: ResNet50 [1], ResNet101 [1], ResNet50M [2].
  • models/resnext.py: ResNeXt101 [26].
  • models/seresnet.py: SEResNet50 [25], SEResNet101 [25], SEResNeXt50 [25], SEResNeXt101 [25].
  • models/densenet.py: DenseNet121 [3].
  • models/mudeep.py: MuDeep [10].
  • models/hacnn.py: HACNN [15].
  • models/squeezenet.py: SqueezeNet [18].
  • models/mobilenetv2.py: MobileNetV2 [19].
  • models/shufflenet.py: ShuffleNet [20].
  • models/xception.py: Xception [21].
  • models/inceptionv4.py: InceptionV4 [24].
  • models/inceptionresnetv2.py: InceptionResNetV2 [24].

See models/__init__.py for details regarding what keys to use to call these models.
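
To illustrate what calling a model by key can look like, here is a minimal sketch of a name-to-constructor registry. The stub classes, the keys, and the init_model helper are placeholders invented for this example and may not match what models/__init__.py actually defines, so check that file for the real keys.

```python
# Hypothetical sketch of a key -> model constructor registry.
# The stub classes and the keys below are assumptions for illustration only.


class ResNet50Stub:
    """Stand-in for a real network; it only records its configuration."""

    def __init__(self, num_classes):
        self.num_classes = num_classes


class DenseNet121Stub:
    """Second stand-in, to show that several keys can be registered."""

    def __init__(self, num_classes):
        self.num_classes = num_classes


_FACTORY = {
    "resnet50": ResNet50Stub,
    "densenet121": DenseNet121Stub,
}


def get_names():
    """Return all registered model keys in alphabetical order."""
    return sorted(_FACTORY)


def init_model(name, *args, **kwargs):
    """Build the model registered under `name`, or fail with a clear error."""
    if name not in _FACTORY:
        raise KeyError(f"Unknown model: {name!r}. Available: {get_names()}")
    return _FACTORY[name](*args, **kwargs)


# Market1501 has 751 training identities, hence 751 output classes.
model = init_model("resnet50", num_classes=751)
```

With a registry like this, the value passed to -a on the command line can be used directly as the lookup key.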

Benchmarks can be found here.

Train

Training code is implemented in

  • train_imgreid_xent.py: train an image model with cross entropy loss.
  • train_imgreid_xent_htri.py: train an image model with a combination of cross entropy loss and hard triplet loss.
  • train_vidreid_xent.py: train a video model with cross entropy loss.
  • train_vidreid_xent_htri.py: train a video model with a combination of cross entropy loss and hard triplet loss.
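
To make the combined objective concrete, here is a minimal pure-Python sketch of cross entropy plus a batch-hard triplet loss in the spirit of [4]. It is not the code used by the training scripts: the margin of 0.3, the equal weighting of the two terms, and all function names are assumptions chosen for illustration.

```python
import math


def cross_entropy(logits, label):
    """Cross entropy for one sample: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_triplet_loss(features, labels, margin=0.3):
    """Batch-hard triplet loss.

    For each anchor, take its farthest positive (same identity) and its
    closest negative (different identity) within the batch.
    The margin value is an assumption.
    """
    losses = []
    for i, (f_a, y_a) in enumerate(zip(features, labels)):
        pos = [euclidean(f_a, f)
               for j, (f, y) in enumerate(zip(features, labels))
               if y == y_a and j != i]
        neg = [euclidean(f_a, f)
               for f, y in zip(features, labels) if y != y_a]
        if not pos or not neg:
            continue  # this anchor has no valid triplet in the batch
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


def xent_htri_loss(logits_batch, features, labels, margin=0.3):
    """Mean cross entropy plus batch-hard triplet loss (equal weights assumed)."""
    xent = sum(cross_entropy(z, y)
               for z, y in zip(logits_batch, labels)) / len(labels)
    return xent + hard_triplet_loss(features, labels, margin)


# Toy batch: two identities with two samples each.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
labels = [0, 0, 1, 1]
logits_batch = [[0.0, 0.0] for _ in labels]  # uninformative classifier
total = xent_htri_loss(logits_batch, features, labels)
```

The triplet term only produces a signal when each batch contains several samples per identity, so this kind of loss is usually paired with an identity-balanced batch sampler.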

For example, to train an image reid model using ResNet50 and cross entropy loss, run

python train_imgreid_xent.py -d market1501 -a resnet50 --optim adam --lr 0.0003 --max-epoch 60 --stepsize 20 40 --train-batch 32 --test-batch 100 --save-dir log/resnet50-xent-market1501 --gpu-devices 0
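
As a rough guide to how --lr, --stepsize and --max-epoch interact, the sketch below computes the learning rate per epoch under a multi-step decay, which is the behaviour of PyTorch's torch.optim.lr_scheduler.MultiStepLR. The decay factor gamma=0.1, the 0-indexed epoch convention, and the helper name are assumptions; run the script with -h to confirm what it really uses.

```python
def lr_at_epoch(epoch, base_lr=0.0003, stepsize=(20, 40), gamma=0.1):
    """Learning rate during `epoch` (0-indexed) under multi-step decay.

    The rate is multiplied by `gamma` once for every milestone in
    `stepsize` that has already been reached. gamma=0.1 is an assumption.
    """
    num_decays = sum(1 for milestone in stepsize if epoch >= milestone)
    return base_lr * gamma ** num_decays


# Print the schedule at the start, at each milestone, and at the last epoch
# of a 60-epoch run.
for epoch in (0, 19, 20, 39, 40, 59):
    print(f"epoch {epoch:2d}: lr = {lr_at_epoch(epoch):.1e}")
```

Under these assumptions the run spends 20 epochs at the base rate and 20 epochs at each of the two decayed rates.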

To use multiple GPUs, pass a comma-separated list of device ids, e.g. --gpu-devices 0,1,2,3.
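
One common way to honour such a device list is to export it as CUDA_VISIBLE_DEVICES before CUDA is initialised and then wrap the model in torch.nn.DataParallel when more than one device is visible. The sketch below shows only the parsing step, uses the standard library alone, and assumes rather than confirms that the training scripts work this way; select_gpus is a hypothetical helper.

```python
import os


def select_gpus(gpu_devices):
    """Expose only the listed GPUs to this process and return their ids.

    `gpu_devices` is a comma-separated string such as "0" or "0,1,2,3".
    """
    ids = [int(tok) for tok in gpu_devices.split(",") if tok.strip()]
    if not ids:
        raise ValueError("expected at least one GPU id, e.g. '0' or '0,1,2,3'")
    # Must be set before CUDA is initialised for it to take effect.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in ids)
    return ids


ids = select_gpus("0,1,2,3")
# With more than one visible device, the model would typically be wrapped
# in torch.nn.DataParallel so that each batch is split across the GPUs.
use_data_parallel = len(ids) > 1
```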

Run any of the training scripts with -h (e.g. python train_imgreid_xent.py -h) for more details regarding arguments.

Test

Suppose you have downloaded a ResNet50 model trained with cross entropy loss (xent) on market1501 and saved it at saved-models/resnet50_xent_market1501.pth.tar (create the directory beforehand with mkdir saved-models/). Then run the following command to test it:

python train_imgreid_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50-xent-market1501 --test-batch 100 --gpu-devices 0

Likewise, to test a video reid model, save a pretrained model under saved-models/, e.g. saved-models/resnet50_xent_mars.pth.tar, then run

python train_vidreid_xent.py -d mars -a resnet50 --evaluate --resume saved-models/resnet50_xent_mars.pth.tar --save-dir log/resnet50-xent-mars --test-batch 2 --gpu-devices 0

Note that in video reid, --test-batch is the number of tracklets per batch, not the number of images. If you set this argument to 2 and sample 15 images per tracklet, each batch contains 2*15=30 images. Adjust this argument according to your GPU memory.
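
The arithmetic above can be turned into a small helper for picking --test-batch. This is only a sketch: seq_len stands for the number of images sampled per tracklet, and image_budget is a hypothetical figure for how many images fit on your GPU at once, which you would have to determine empirically.

```python
def images_per_batch(test_batch, seq_len):
    """Number of images in one video-reid test batch."""
    return test_batch * seq_len


def max_test_batch(image_budget, seq_len):
    """Largest tracklet count whose images stay within `image_budget`.

    Always returns at least 1, since a batch needs one tracklet.
    """
    return max(1, image_budget // seq_len)


print(images_per_batch(test_batch=2, seq_len=15))    # 30, as in the text
print(max_test_batch(image_budget=100, seq_len=15))  # 6 tracklets = 90 images
```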

Citation

Please link this project in your paper.

References

[1] He et al. Deep Residual Learning for Image Recognition. CVPR 2016.
[2] Yu et al. The Devil is in the Middle: Exploiting Mid-level Representations for Cross-Domain Instance Matching. arXiv:1711.08106.
[3] Huang et al. Densely Connected Convolutional Networks. CVPR 2017.
[4] Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.
[5] Szegedy et al. Rethinking the Inception Architecture for Computer Vision. CVPR 2016.
[6] Kingma and Ba. Adam: A Method for Stochastic Optimization. ICLR 2015.
[7] Zheng et al. Scalable Person Re-identification: A Benchmark. ICCV 2015.
[8] Zheng et al. MARS: A Video Benchmark for Large-Scale Person Re-identification. ECCV 2016.
[9] Wen et al. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016.
[10] Qian et al. Multi-scale Deep Learning Architectures for Person Re-identification. ICCV 2017.
[11] Wang et al. Person Re-Identification by Video Ranking. ECCV 2014.
[12] Hirzer et al. Person Re-Identification by Descriptive and Discriminative Classification. SCIA 2011.
[13] Li et al. DeepReID: Deep Filter Pairing Neural Network for Person Re-identification. CVPR 2014.
[14] Zhong et al. Re-ranking Person Re-identification with k-reciprocal Encoding. CVPR 2017.
[15] Li et al. Harmonious Attention Network for Person Re-identification. CVPR 2018.
[16] Ristani et al. Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking. ECCVW 2016.
[17] Zheng et al. Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro. ICCV 2017.
[18] Iandola et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv:1602.07360.
[19] Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks. CVPR 2018.
[20] Zhang et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. CVPR 2018.
[21] Chollet. Xception: Deep Learning with Depthwise Separable Convolutions. CVPR 2017.
[22] Wei et al. Person Transfer GAN to Bridge Domain Gap for Person Re-Identification. CVPR 2018.
[23] Wu et al. Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning. CVPR 2018.
[24] Szegedy et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. ICLRW 2016.
[25] Hu et al. Squeeze-and-Excitation Networks. CVPR 2018.
[26] Xie et al. Aggregated Residual Transformations for Deep Neural Networks. CVPR 2017.
[27] Chen et al. Dual Path Networks. NIPS 2017.
[28] Gray et al. Evaluating appearance models for recognition, reacquisition, and tracking. PETS 2007.
[29] Loy et al. Multi-camera activity correlation analysis. CVPR 2009.
[30] Li et al. Human Reidentification with Transferred Metric Learning. ACCV 2012.
[31] Roth et al. Mahalanobis Distance Learning for Person Re-Identification. PR 2014.