Siamese Networks and Friends


Fig. 1: Latent space learned by a Siamese network. Each dot is an MNIST test sample;
dots share the same color if their samples come from the same class.

Introduction

Given a set of data, such as images, we can find a latent space that these samples lie on. Although this latent space typically has far fewer dimensions than the original space, it preserves semantic relationships between neighbours. Mapping data from the original space to the latent space therefore lets us represent samples with significantly fewer dimensions, which in turn reduces the computational resources required.

In this project, we use several neural networks to learn such latent spaces. By varying the architecture and loss function, we can see how data is represented in the resulting latent spaces. Some of these spaces map similar data, i.e. samples from the same class, close together. With an appropriately learned latent space, the latent features can be reused for downstream tasks such as classification or reverse image search.
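
As an illustration of one such downstream task, the sketch below performs reverse image search as nearest-neighbour retrieval in the latent space. Here `embedding_net` is a hypothetical stand-in for any trained encoder from this project, and the Euclidean distance and the value of k are arbitrary choices.

import torch

def embed(embedding_net, images):
    """Map a batch of images (N, 1, 28, 28) to latent vectors (N, D)."""
    embedding_net.eval()
    with torch.no_grad():
        return embedding_net(images)

def nearest_neighbours(query_vec, gallery_vecs, k=5):
    """Return indices of the k gallery vectors closest to the query (Euclidean)."""
    dists = torch.cdist(query_vec.unsqueeze(0), gallery_vecs).squeeze(0)
    return torch.topk(dists, k, largest=False).indices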

Architectures


Fig. 2: Architectures experimented with in this project.

EmbeddingNet is based on @adambielski's siamese-triplet, while SiameseNet follows Koch (2015).
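
A rough sketch of these two building blocks is shown below; the layer sizes are illustrative assumptions rather than the exact configuration used in this project. EmbeddingNet maps an image to a low-dimensional vector, and SiameseNet encodes both inputs of a pair with a single weight-shared EmbeddingNet.

import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    """Maps a 1x28x28 image to a 2-D latent vector (sizes are illustrative)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.Sequential(
            nn.Linear(64 * 4 * 4, 256), nn.ReLU(),
            nn.Linear(256, 2),
        )

    def forward(self, x):
        x = self.conv(x)
        return self.fc(x.view(x.size(0), -1))

class SiameseNet(nn.Module):
    """Encodes both inputs of a pair with the same (weight-shared) EmbeddingNet."""
    def __init__(self, embedding_net):
        super().__init__()
        self.embedding_net = embedding_net

    def forward(self, x1, x2):
        return self.embedding_net(x1), self.embedding_net(x2)

The same EmbeddingNet can be trained directly with a classification loss, or wrapped in SiameseNet (or a triplet variant) for the pair- and triplet-based losses described below.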

Loss Functions

  • Cross Entropy Loss (CE)

    CE is used to train EmbeddingNet. The learned latent space captures the class-discriminative information needed to classify samples.

  • Binary Cross Entropy Loss (BCE)

    Using BCE, the latent representation is learned in such a way that the Siamese network can classify whether two given samples are similar.

  • Contrastive Loss (CL)

    CL learns representations that bring similar samples close together in the latent space while pushing dissimilar samples apart.

  • Triplet Loss (TL)

    TL is similar to CL, but it additionally requires that the distance between an anchor sample and its positive pair be smaller than the distance between the anchor and its negative pair by at least a margin. Both CL and TL are sketched in code after this list.
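
The sketch below shows the standard formulations of CL (Hadsell et al., 2006) and TL (Schroff et al., 2015) in PyTorch. The margin value, the label convention (1 for similar pairs, 0 for dissimilar pairs), and the mean reduction are assumptions and may differ from the implementation in this repository.

import torch.nn.functional as F

def contrastive_loss(z1, z2, label, margin=1.0):
    """Hadsell et al. (2006): pull similar pairs (label=1) together and
    push dissimilar pairs (label=0) at least `margin` apart."""
    d = F.pairwise_distance(z1, z2)
    return (label * d.pow(2) + (1 - label) * F.relu(margin - d).pow(2)).mean()

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Schroff et al. (2015): the anchor-positive distance should be smaller
    than the anchor-negative distance by at least `margin`."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()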

Command

Usage: train.py [-h] [--lr 0.0001] [--epochs 10] [--batch-size 32]
                [--output ./tmp] [--log-interval 50] [--animation False]
                network dataset

Positional arguments:
  network [embedding-classification|siamese-constrastive|siamese-binary-cross-entropy|tripet-loss-net]
  dataset [MNIST|FashionMNIST]

Optional arguments:
  -h, --help         show this help message and exit
  --lr 0.0001        learning rate
  --epochs 10        no. epochs
  --batch-size 32    batch size
  --output ./tmp     output directory
  --log-interval 50  logging interval
  --animation False  produce latent space every epoch for animation
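
For example, to train the contrastive Siamese network on MNIST with the documented defaults written out explicitly:

python train.py siamese-constrastive MNIST --lr 0.0001 --epochs 10 --batch-size 32 --output ./tmp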

Results

Network                               MNIST       FashionMNIST
EmbeddingNet (CrossEntropyLoss)       emb_mnist   emb_fmnist
SiameseNet (ContrastiveLoss)          sc_mnist    sc_fmnist
SiameseNet (BinaryCrossEntropyLoss)   scb_mnist   scb_fmnist
SiameseNet (TripletLoss)              tp_mnist    tp_fmnist
VAE (TBD)

Acknowledgements

  • @adambielski's siamese-triplet: the experiments here are largely based on his work.

References

  1. Koch, G. R. (2015). Siamese Neural Networks for One-Shot Image Recognition.
  2. Hadsell, R., Chopra, S., & LeCun, Y. (2006). Dimensionality Reduction by Learning an Invariant Mapping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1735-1742).
  3. Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A Unified Embedding for Face Recognition and Clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 815-823).