SnP: Large-Scale Training Data Search for Object Re-Identification (CVPR 2023)

Search and Pruning (SnP) Framework for Training Set Search

This repository contains the code for our CVPR 2023 paper 'Large-scale Training Data Search for Object Re-identification'.

Related material: Paper, Video, Zhihu

As shown in the figure above, we present a search and pruning (SnP) solution to the training data search problem in object re-ID. The source data pool is an order of magnitude larger than existing re-ID training sets, both in the number of images and in the number of identities. When the target is AlicePerson, SnP selects from the source pool a training set that is 80% smaller than the pool while achieving similar or even higher re-ID accuracy. The searched training set is also superior to existing individual training sets such as Market-1501, Duke, and MSMT.

Requirements

  • Sklearn
  • Scipy 1.2.1
  • PyTorch 1.7.0 + torchvision 0.8.1
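
The pinned versions above can be compared against the local environment before running the search scripts. The snippet below is only a minimal sketch: the helper name `check_environment` and the `PINNED` table are hypothetical and not part of this repository, and the PyPI distribution names (`scikit-learn`, `scipy`, `torch`, `torchvision`) are assumed to correspond to the packages listed.

```python
from importlib import metadata

# Assumed PyPI distribution names for the requirements listed above.
# None means "any installed version" (no version is pinned for Sklearn).
PINNED = {
    "scikit-learn": None,
    "scipy": "1.2.1",
    "torch": "1.7.0",
    "torchvision": "0.8.1",
}


def check_environment(pinned=PINNED):
    """Return {package: (found_version, wanted_version, ok)} for each entry."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Drop local build tags such as "+cu110" before comparing.
        ok = found is not None and (wanted is None or found.split("+")[0] == wanted)
        report[name] = (found, wanted, ok)
    return report
```

Printing the items of `check_environment()` lists which packages are missing or differ from the versions given here.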

Re-ID Datasets Preparation

fig1

Please prepare the following datasets for person re-ID: DukeMTMC-reID, Market-1501, MSMT17, CUHK03, RAiD, PersonX, UnrealPerson, RandPerson, PKU-Reid, VIPeR, AlicePerson (target data in VisDA20).

You may need to sign up to get access to some of these datasets. Please store these datasets in a file structure like this:

~
└───reid_data
    └───duke_reid
    │   │ bounding_box_train
    │   │ ...
    │
    └───market
    │   │ bounding_box_train
    │   │ ...
    │
    └───MSMT
    │   │ MSMT_bounding_box_train
    │   │ ...
    │
    └───cuhk03_release
    │   │ cuhk-03.mat
    │   │ ...
    │
    └───alice-person
    │   │ bounding_box_train
    │   │ ...
    │
    └───RAiD_Dataset-master
    │   │ bounding_box_train
    │   │ ...
    │
    └───unreal
    │   │ UnrealPerson-data
    │   │ ...
    │
    └───randperson_subset
    │   │ randperson_subset
    │   │ ...
    │
    └───PKU-Reid
    │   │ PKUv1a_128x48
    │   │ ...
    │
    └───i-LIDS-VID
    │   │ images
    │   │ ...
    │
    └───VIPeR
    │   │ images
    │   │ ...
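
Before launching a search, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper `missing_entries` and the `PERSON_LAYOUT` table are hypothetical, and the table simply copies the first entry shown under each dataset folder in the tree above.

```python
from pathlib import Path

# Dataset folder -> first entry expected inside it, copied from the tree above.
PERSON_LAYOUT = {
    "duke_reid": "bounding_box_train",
    "market": "bounding_box_train",
    "MSMT": "MSMT_bounding_box_train",
    "cuhk03_release": "cuhk-03.mat",
    "alice-person": "bounding_box_train",
    "RAiD_Dataset-master": "bounding_box_train",
    "unreal": "UnrealPerson-data",
    "randperson_subset": "randperson_subset",
    "PKU-Reid": "PKUv1a_128x48",
    "i-LIDS-VID": "images",
    "VIPeR": "images",
}


def missing_entries(root, layout=PERSON_LAYOUT):
    """Return the 'folder/entry' strings from the layout that do not exist under root."""
    root = Path(root).expanduser()
    return [
        f"{folder}/{entry}"
        for folder, entry in layout.items()
        if not (root / folder / entry).exists()
    ]
```

For example, `missing_entries("~/reid_data")` returns an empty list when every listed entry is present, and otherwise names the entries that still need to be downloaded or moved.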

Please prepare the following datasets for vehicle re-ID: VeRi, CityFlow-reID, VehicleID, VeRi-wild, VehicleX, Stanford Cars, PKU-vd1 and PKU-vd2. AliceVehicle will be made publicly available by our team shortly.

Please store these datasets in a file structure like this:

~
└───reid_data
    └───VeRi
    │   │ bounding_box_train
    │   │ ...
    │
    └───AIC19-reid
    │   │ bounding_box_train
    │   │ ...
    │
    └───VehicleID_V1.0
    │   │ image
    │   │ ...
    │
    └───vehicleX_random_attributes
    │   │ ...
    │
    └───veri-wild
    │   │ VeRI-Wild
    │   │ ...
    │
    └───stanford_cars
    │   │ cars_train
    │   │ ...
    │
    └───compcars
    │   │ CompCars
    │   │ ...
    │
    └───PKU-VD
    │   │ VD1
    │   │ VD2
    │   │ ...

Running example

The SnP framework is shown in the animation above. To run this process with Market as the target, we can search for a training set with 2860 IDs using the command below:

python trainingset_search_person.py --target 'market' \
--result_dir 'results/sample_data_market/' --n_num_id 2860 \
--ID_sampling_method SnP --img_sampling_method 'FPS' --img_sampling_ratio 0.5 \
--output_data '/data/reid_data/market/SnP_2860IDs_0.5Imgs_0610'  

When VeRi is used as target, the command is:

python trainingset_search_vehicle.py --target 'veri' \
--result_dir './results/sample_data_veri/' --n_num_id 3118 \
--ID_sampling_method SnP --img_sampling_method 'FPS' --img_sampling_ratio 0.5 \
--output_data '/data/data/VeRi/SnP_3118IDs_0.5Imgs_0610'
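
Both commands pass `--img_sampling_method 'FPS'` with `--img_sampling_ratio 0.5`. Assuming FPS here stands for farthest point sampling (this README does not spell it out), the idea of keeping half of the images while preserving their spread in feature space can be sketched as below. This is an illustrative NumPy version, not the repository's implementation; the function name `farthest_point_sampling` and its parameters are hypothetical.

```python
import numpy as np


def farthest_point_sampling(features, ratio=0.5, seed=0):
    """Greedily pick int(n * ratio) rows (at least 1) that are far apart.

    features: array of shape (n, d), one feature vector per image.
    Returns a list of selected row indices.
    """
    features = np.asarray(features, dtype=np.float64)
    n = features.shape[0]
    if n == 0:
        return []
    k = max(1, int(n * ratio))
    rng = np.random.default_rng(seed)
    # Start from a random point.
    selected = [int(rng.integers(n))]
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)
    while len(selected) < k:
        # The next pick is the point farthest from the current selection.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected
```

With a ratio of 0.5, an identity with 20 images would keep 10 of them in this sketch, chosen to cover the feature space rather than drawn uniformly at random.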

Citation

If you find this code useful, please kindly cite:

@article{yao2023large,
  title={Large-scale Training Data Search for Object Re-identification},
  author={Yao, Yue and Lei, Huan and Gedeon, Tom and Zheng, Liang},
  journal={arXiv preprint arXiv:2303.16186},
  year={2023}
}

If you have any questions, feel free to contact yue.yao@anu.edu.au.