adambielski/siamese-triplet

Performance of triplet loss on a huge number of classes?

BerenLuthien opened this issue · 4 comments

Hey,
Is there any plan to include a more complicated dataset, one with 10K~100K classes?
Intuitively, MNIST is easy because it has only 10 classes to cluster.

Thanks for sharing the great work BTW :)

It can be trained on other datasets; this has been done successfully, e.g. for faces (see the FaceNet reference in the readme). The main change is a bigger architecture.
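
For concreteness, here is a minimal sketch of what "a bigger architecture" could look like: an embedding network built on a torchvision ResNet-18 backbone. The class name and embedding size are illustrative, not taken from this repo.

```python
import torch.nn as nn
from torchvision import models

class ResNetEmbeddingNet(nn.Module):
    """Hypothetical larger embedding network: a ResNet-18 backbone whose
    classification head is replaced by a linear projection down to the
    embedding dimension."""
    def __init__(self, embedding_dim=128):
        super().__init__()
        self.backbone = models.resnet18(weights=None)  # train from scratch
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, embedding_dim)

    def forward(self, x):
        # x: (batch, 3, H, W) images -> (batch, embedding_dim) embeddings
        return self.backbone(x)
```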

Thanks for the response. I know FaceNet is a famous example.

I meant: do you have any plan to include a larger dataset with more classes in this project? FaceNet was trained on a dataset that is too huge, though.
I am looking for a dataset with ~10K classes and ~10 samples per class, i.e. about 100K training examples -- basically something trainable on one GPU.

And I enjoy your PyTorch code more than Google's TF 1.x FaceNet code :) Besides, Google did not, and cannot, open-source that dataset.
Thanks
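
(For what it's worth, the usual way to make triplet training tractable with ~10K classes is P×K batch sampling: each batch draws P random classes and K samples per class, so online triplet mining always finds valid positives and negatives. Below is a hedged sketch; `PKBatchSampler` is a hypothetical name, not something provided by this repo.)

```python
import numpy as np
from torch.utils.data import DataLoader

class PKBatchSampler:
    """Illustrative P-classes x K-samples batch sampler (the standard
    recipe from the FaceNet / triplet-loss literature). Yields lists of
    dataset indices; pass it to DataLoader via batch_sampler=..."""
    def __init__(self, labels, p_classes=32, k_samples=4):
        self.labels = np.asarray(labels)
        self.classes = np.unique(self.labels)
        self.class_to_indices = {c: np.where(self.labels == c)[0]
                                 for c in self.classes}
        self.p, self.k = p_classes, k_samples
        self.n_batches = len(self.labels) // (self.p * self.k)

    def __iter__(self):
        for _ in range(self.n_batches):
            batch = []
            for c in np.random.choice(self.classes, self.p, replace=False):
                idx = self.class_to_indices[c]
                # resample with replacement if a class has fewer than k examples
                batch.extend(int(i) for i in np.random.choice(
                    idx, self.k, replace=len(idx) < self.k))
            yield batch

    def __len__(self):
        return self.n_batches

# usage sketch: loader = DataLoader(dataset, batch_sampler=PKBatchSampler(labels))
```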

I have no plans to include more datasets; the examples were only meant to show how the approach can be used and the intuition behind it.

QQ: are these embeddings normalized, i.e. ||embedding|| = 1? Thanks
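
(For reference, unit-norm embeddings can be checked or enforced explicitly with `torch.nn.functional.normalize`; whether this repo's networks normalize by default is not confirmed here. A minimal sketch:)

```python
import torch
import torch.nn.functional as F

embeddings = torch.randn(8, 128)   # stand-in for model(x)
print(embeddings.norm(dim=1))      # generally not all ones

# FaceNet-style unit-norm embeddings: L2-normalize along the feature dim
normalized = F.normalize(embeddings, p=2, dim=1)
print(normalized.norm(dim=1))      # ~1.0 up to float error
```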