person_reid_batchall

A triplet network takes three images as input: an anchor image, a positive image (one with the same label as the anchor), and a negative image (one with a different label from the anchor). The objective is to learn embeddings in which the positive image lies closer to the anchor than the negative image does.

The Batch Hard variant of the triplet loss is mathematically expressed as:
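In the notation of the paper cited below, with a batch X of P identities and K images per identity, the loss can be written as follows. Here f_θ is the embedding network, D is a distance in embedding space, m is the margin, and x_a^i is the a-th image of identity i. This is a transcription for reference, so check it against the paper before relying on it.

```latex
\mathcal{L}_{\mathrm{BH}}(\theta; X) =
  \sum_{i=1}^{P} \sum_{a=1}^{K}
  \Big[\, m
    + \max_{p=1,\dots,K} D\big(f_\theta(x_a^i),\, f_\theta(x_p^i)\big)
    - \min_{\substack{j=1,\dots,P \\ n=1,\dots,K \\ j \neq i}}
        D\big(f_\theta(x_a^i),\, f_\theta(x_n^j)\big)
  \Big]_+
```

The soft-margin variant replaces the hinge [m + z]_+ with the softplus ln(1 + exp(z)).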

Source: Alexander Hermans, Lucas Beyer, Bastian Leibe, “In Defense of the Triplet Loss for Person Re-Identification”
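The hardest-positive and hardest-negative selection can be illustrated with a small dependency-free sketch. It assumes Euclidean distance and a batch given as plain Python lists. The function names, the default margin, and the toy batch are illustrative and are not taken from this repository's training code.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def batch_hard_loss(embeddings, labels, margin=0.2):
    """Batch Hard triplet loss for one batch, summed over all anchors.

    margin: a float selects the hinge (hard margin) form;
            None selects the softplus (soft margin) form.
    """
    total = 0.0
    for emb_a, lab_a in zip(embeddings, labels):
        # Distances to same-label samples (this includes the anchor itself, at 0).
        pos = [euclidean(emb_a, e) for e, l in zip(embeddings, labels) if l == lab_a]
        # Distances to samples with a different label.
        neg = [euclidean(emb_a, e) for e, l in zip(embeddings, labels) if l != lab_a]
        if not neg:
            continue  # no negative available for this anchor in the batch
        gap = max(pos) - min(neg)  # hardest positive minus hardest negative
        if margin is None:
            # Numerically stable softplus: ln(1 + exp(gap)).
            total += max(gap, 0.0) + math.log1p(math.exp(-abs(gap)))
        else:
            total += max(0.0, margin + gap)
    return total


if __name__ == "__main__":
    # Toy batch: two identities, two 2-D embeddings each (made-up values).
    toy_embeddings = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [3.0, 1.0]]
    toy_labels = [0, 0, 1, 1]
    print("hard margin:", batch_hard_loss(toy_embeddings, toy_labels, margin=1.5))
    print("soft margin:", batch_hard_loss(toy_embeddings, toy_labels, margin=None))
```

In actual training the same selection would normally be done on a pairwise distance matrix with tensor operations, so that gradients flow back to the network. The loops here only make the per-anchor logic explicit.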

Dataset

The model is trained on the Market-1501 dataset. A description of the dataset can be found on this link, and the dataset itself can be downloaded from the Kaggle website.

Network Architecture and Data Augmentation

The network architecture used is:
Pretrained ResNet-50 > Linear 1024 > BatchNorm > ReLU > Linear 128
The dataset is augmented on the fly during training using random horizontal flips.
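A rough PyTorch sketch of this architecture is given here. It assumes that the ResNet-50 classification layer is removed so the head receives the 2048-dimensional pooled features, and that the flip uses torchvision's default probability of 0.5. Neither detail is stated above. The class and function names are hypothetical and do not come from this repository.

```python
import torch
from torch import nn


class EmbeddingNet(nn.Module):
    """Backbone followed by Linear 1024 > BatchNorm > ReLU > Linear 128."""

    def __init__(self, backbone, backbone_dim=2048, hidden_dim=1024, embed_dim=128):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Sequential(
            nn.Linear(backbone_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, images):
        features = self.backbone(images)
        # Flatten in case the backbone returns (N, C, 1, 1) feature maps.
        return self.head(torch.flatten(features, 1))


def build_resnet50_embedding_net():
    """Assumed wiring: ImageNet-pretrained ResNet-50 with its classifier removed."""
    from torchvision import models  # lazy import; weights download on first use

    resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    resnet.fc = nn.Identity()  # expose the 2048-d pooled features
    return EmbeddingNet(resnet)


def build_train_transform():
    """Training-time augmentation: random horizontal flip, then tensor conversion."""
    from torchvision import transforms

    return transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),  # p=0.5 is an assumption
        transforms.ToTensor(),
    ])
```

Passing the backbone in as an argument keeps the head easy to exercise with a small stand-in module, without downloading the pretrained weights.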

Tensorboard Visualization

The training logs obtained are as follows:
[Figure: training log, Batch Hard with Hard Margin]
[Figure: training log, Batch Hard with Soft Margin]

The training logs can also be visualized in TensorBoard as follows:

1. Go to the source directory.
2. Run the command for the variant of interest.
   For Batch Hard with Hard Margin:
   $ tensorboard --logdir logs_market1501_batchhard
   For Batch Hard with Soft Margin:
   $ tensorboard --logdir logs_market1501_batchhard_softplus
3. Open a browser and go to:
   http://localhost:6006/

Performance Evaluation

The performance evaluation code is taken from this repository.
The results are summarized in the tables below:

Batch Hard with Hard Margin:

mAP	top-1	top-2	top-5	top-10
41.12%	63.03%	72.89%	82.96%	89.46%

Batch Hard with Soft Margin (softplus):

mAP	top-1	top-2	top-5	top-10
40.89%	59.62%	70.84%	81.71%	87.35%
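To make the column headings concrete, the sketch below shows one common way to compute mAP and top-k accuracy from ranked retrieval results. It is a simplified illustration, not the referenced evaluation code. Protocol details, such as how gallery images from the query's own camera are filtered, are not covered and may be handled differently there. The function names are hypothetical.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True when the gallery item at rank i + 1
    (sorted by increasing embedding distance) shares the query identity.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this correct match
    return precision_sum / hits if hits else 0.0


def top_k_hit(ranked_matches, k):
    """True when at least one correct match appears in the first k ranks."""
    return any(ranked_matches[:k])


def evaluate(all_ranked_matches, ks=(1, 2, 5, 10)):
    """Mean AP and top-k accuracy over a list of per-query rankings."""
    n = len(all_ranked_matches)
    mean_ap = sum(average_precision(r) for r in all_ranked_matches) / n
    top_k = {k: sum(top_k_hit(r, k) for r in all_ranked_matches) / n for k in ks}
    return mean_ap, top_k
```

Top-k rewards a query as soon as any correct match appears in the first k results, while mAP also takes into account how highly every correct match is ranked. This is why the mAP figures sit well below the top-1 figures.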

Weights