A triplet network takes three images as input: an anchor image, a positive image (one with the same label as the anchor), and a negative image (one with a different label from the anchor). The objective is to learn embeddings in which each anchor lies closer to its positive images than to its negative images.
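As a minimal illustration of this objective, the sketch below computes the usual margin-based triplet loss for a single triplet of embeddings. The function name `triplet_loss`, the Euclidean distance, and the margin of 0.2 are illustrative assumptions and are not taken from this repository.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss for one triplet: max(0, margin + d(a, p) - d(a, n))."""
    d_ap = math.dist(anchor, positive)  # anchor-to-positive distance
    d_an = math.dist(anchor, negative)  # anchor-to-negative distance
    return max(0.0, margin + d_ap - d_an)

# The positive is 1 unit from the anchor and the negative is 5 units away,
# so the margin is already satisfied and the loss is zero.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))  # 0.0
```

The loss becomes positive only when the negative is not at least `margin` farther from the anchor than the positive.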
The Batch Hard variant of the triplet loss is mathematically expressed as:
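For reference, the Batch Hard loss as it is commonly written (following Hermans et al., "In Defense of the Triplet Loss for Person Re-Identification") is given below. It is assumed, not confirmed, that this repository implements the same form. A batch contains $P$ identities with $K$ images each, $D$ is a distance between embeddings, $m$ is the margin, and $[\cdot]_+ = \max(0, \cdot)$:

$$
\mathcal{L}_{\mathrm{BH}}(\theta; X) = \sum_{i=1}^{P} \sum_{a=1}^{K} \Big[ m + \max_{p=1,\dots,K} D\big(f_\theta(x_a^i), f_\theta(x_p^i)\big) - \min_{\substack{j=1,\dots,P \\ n=1,\dots,K \\ j \neq i}} D\big(f_\theta(x_a^i), f_\theta(x_n^j)\big) \Big]_+
$$

The soft-margin variant replaces the hinge $[m + x]_+$ with the softplus function $\ln(1 + e^{x})$.

A framework-free sketch of both variants follows. The function name, the default margin, and the choice to average over anchors rather than sum are assumptions made for illustration.

```python
import math

def batch_hard_triplet_loss(embeddings, labels, margin=0.2, soft=False):
    """Mean Batch Hard triplet loss.

    embeddings: list of equal-length float sequences, one per image.
    labels: list of identity labels, one per image.
    """
    losses = []
    for a, (emb_a, lab_a) in enumerate(zip(embeddings, labels)):
        # Hardest positive: the farthest other image with the same label.
        pos = [math.dist(emb_a, e)
               for i, (e, l) in enumerate(zip(embeddings, labels))
               if l == lab_a and i != a]
        # Hardest negative: the closest image with a different label.
        neg = [math.dist(emb_a, e)
               for e, l in zip(embeddings, labels) if l != lab_a]
        if not pos or not neg:
            continue  # this anchor cannot form a valid triplet
        x = max(pos) - min(neg)
        if soft:
            # Soft margin: softplus ln(1 + exp(x)), computed stably.
            losses.append(max(x, 0.0) + math.log1p(math.exp(-abs(x))))
        else:
            # Hard margin: hinge on margin + x.
            losses.append(max(0.0, margin + x))
    return sum(losses) / len(losses) if losses else 0.0

# Two identities with two images each, placed on a line for easy checking.
embeddings = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [6.0, 0.0]]
labels = [0, 0, 1, 1]
print(batch_hard_triplet_loss(embeddings, labels, margin=2.0))  # 0.25
print(batch_hard_triplet_loss(embeddings, labels, soft=True))
```

In the example only the third image violates the margin (its hardest positive is 2 away and its hardest negative is 3 away), so it alone contributes to the hard-margin loss.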
The model is trained on the Market-1501 dataset. A description of the dataset can be found on this link, and the dataset can be downloaded from Kaggle.
The network architecture used is:
Pretrained ResNet-50 > Linear 1024 > BatchNorm > ReLU > Linear 128
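The layers after the backbone can be sketched as follows. This assumes PyTorch, which the text above does not confirm, and assumes the head receives the 2048-dimensional globally pooled ResNet-50 features. The class name `EmbeddingHead` and its parameter names are hypothetical, and the pretrained backbone is replaced by random features so the snippet runs without downloading weights.

```python
import torch
import torch.nn as nn

class EmbeddingHead(nn.Module):
    """Linear 1024 > BatchNorm > ReLU > Linear 128 on pooled backbone features."""

    def __init__(self, in_features=2048, hidden=1024, embedding_dim=128):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, embedding_dim),
        )

    def forward(self, features):
        return self.layers(features)

head = EmbeddingHead()
head.eval()  # use BatchNorm running statistics for this shape check
features = torch.randn(4, 2048)  # stand-in for pooled ResNet-50 features
with torch.no_grad():
    embeddings = head(features)
print(embeddings.shape)  # torch.Size([4, 128])
```

Each input image therefore maps to a 128-dimensional embedding, which is the vector the triplet loss operates on.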
The dataset is augmented on the fly during training with random horizontal flips.
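A random horizontal flip mirrors each image left to right with some probability every time it is loaded. The sketch below shows the idea on a nested-list image. The function name, the probability of 0.5, and the list representation are assumptions for illustration; a real training pipeline would normally use the flip transform provided by its framework.

```python
import random

def random_horizontal_flip(image, p=0.5, rng=random):
    """Flip an image left-to-right with probability p.

    image: list of rows, each row a list of pixel values.
    """
    if rng.random() < p:
        return [row[::-1] for row in image]  # reverse every row
    return image

image = [[1, 2, 3],
         [4, 5, 6]]
print(random_horizontal_flip(image, p=1.0))  # [[3, 2, 1], [6, 5, 4]]
print(random_horizontal_flip(image, p=0.0))  # [[1, 2, 3], [4, 5, 6]]
```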
The training logs obtained are as follows:
Batch Hard with Hard Margin:
Batch Hard with Soft Margin:
The training logs can also be visualized with TensorBoard:
- Go to the source directory.
- Run the command for the variant of interest:
For Batch Hard with Hard Margin:
$ tensorboard --logdir logs_market1501_batchhard
For Batch Hard with Soft Margin:
$ tensorboard --logdir logs_market1501_batchhard_softplus
- Open the following address in a browser:
http://localhost:6006/
The performance evaluation code is taken from this repository.
The results are summarized in the tables below:
Batch Hard with Hard Margin:
| mAP | top-1 | top-2 | top-5 | top-10 |
|---|---|---|---|---|
| 41.12% | 63.03% | 72.89% | 82.96% | 89.46% |
Batch Hard with Soft Margin (softplus):
| mAP | top-1 | top-2 | top-5 | top-10 |
|---|---|---|---|---|
| 40.89% | 59.62% | 70.84% | 81.71% | 87.35% |
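For readers unfamiliar with these metrics, the sketch below shows how average precision and a top-k hit are computed for one query from its ranked gallery list. mAP and top-k accuracy are then the means of these values over all queries. This is not the evaluation code used for the numbers above. The function names are hypothetical, and the sketch omits the Market-1501 protocol detail of excluding gallery images taken by the same camera as the query.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True if the i-th ranked gallery image
    shares the query's identity.
    """
    hits, precisions = 0, []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)  # precision at this rank
    return sum(precisions) / len(precisions) if precisions else 0.0

def top_k_hit(ranked_matches, k):
    """1.0 if a correct match appears in the first k results, else 0.0."""
    return 1.0 if any(ranked_matches[:k]) else 0.0

# Gallery sorted by distance to the query; matches sit at ranks 2 and 4.
ranked = [False, True, False, True]
print(average_precision(ranked))  # 0.5
print(top_k_hit(ranked, 1))       # 0.0
print(top_k_hit(ranked, 2))       # 1.0
```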
The trained weights produced by this repository can be downloaded from the following links:
Batch Hard with Hard Margin:
https://drive.google.com/file/d/1KF43SrAsX8Hn5xQBtr9TnFddFbfBO5XN/view?usp=sharing
Batch Hard with Soft Margin (softplus):
https://drive.google.com/file/d/1NaM4Id31FUdiKgoFoiYr4C42zhk_ngaF/view?usp=sharing