adambielski/siamese-triplet

Performance of triplet loss on a huge number of classes?

BerenLuthien opened this issue · 4 comments

Hey,
Is there any plan to include a more complicated dataset, one with 10K~100K classes?
Intuitively, MNIST is easy because it has only 10 classes to cluster.

Thanks for sharing the great work BTW :)

It can be trained on other datasets; this has been done successfully, e.g. for faces (see the FaceNet reference in the readme). The main change is a bigger architecture.
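
For concreteness, here is a minimal sketch of what "a bigger architecture" could look like: an embedding network built on a torchvision ResNet-18 backbone. The class name and embedding size are illustrative, not taken from this repo.

```python
import torch.nn as nn
from torchvision import models

class ResNetEmbeddingNet(nn.Module):
    """Hypothetical larger embedding network: a ResNet-18 backbone whose
    classification head is replaced by a linear projection down to the
    embedding dimension."""
    def __init__(self, embedding_dim=128):
        super().__init__()
        self.backbone = models.resnet18(weights=None)  # train from scratch
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, embedding_dim)

    def forward(self, x):
        # x: (batch, 3, H, W) images -> (batch, embedding_dim) embeddings
        return self.backbone(x)
```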

Thanks for the response. I know FaceNet is a famous example.

I meant: do you have any plan to include a larger dataset with more classes in this project? FaceNet was trained on a dataset that is too huge, though.
I am looking for a dataset with ~10K classes and ~10 samples per class, i.e. about 100K training examples -- basically something trainable on one GPU.

And I enjoy your PyTorch code more than Google's TF 1.x FaceNet code :) Besides, Google did not, and cannot, open-source that dataset.
Thanks
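
(For what it's worth, the usual way to make triplet training tractable with ~10K classes is P×K batch sampling: each batch draws P random classes and K samples per class, so online triplet mining always finds valid positives and negatives. Below is a hedged sketch; `PKBatchSampler` is a hypothetical name, not something provided by this repo.)

```python
import numpy as np
from torch.utils.data import DataLoader

class PKBatchSampler:
    """Illustrative P-classes x K-samples batch sampler (the standard
    recipe from the FaceNet / triplet-loss literature). Yields lists of
    dataset indices; pass it to DataLoader via batch_sampler=..."""
    def __init__(self, labels, p_classes=32, k_samples=4):
        self.labels = np.asarray(labels)
        self.classes = np.unique(self.labels)
        self.class_to_indices = {c: np.where(self.labels == c)[0]
                                 for c in self.classes}
        self.p, self.k = p_classes, k_samples
        self.n_batches = len(self.labels) // (self.p * self.k)

    def __iter__(self):
        for _ in range(self.n_batches):
            batch = []
            for c in np.random.choice(self.classes, self.p, replace=False):
                idx = self.class_to_indices[c]
                # resample with replacement if a class has fewer than k examples
                batch.extend(int(i) for i in np.random.choice(
                    idx, self.k, replace=len(idx) < self.k))
            yield batch

    def __len__(self):
        return self.n_batches

# usage sketch: loader = DataLoader(dataset, batch_sampler=PKBatchSampler(labels))
```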

I have no plans to include more datasets; the examples were only meant to show how the approach can be used and the intuition behind it.

QQ: are these embeddings normalized, i.e. ||embedding|| = 1? Thanks
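
(For reference, unit-norm embeddings can be checked or enforced explicitly with `torch.nn.functional.normalize`; whether this repo's networks normalize by default is not confirmed here. A minimal sketch:)

```python
import torch
import torch.nn.functional as F

embeddings = torch.randn(8, 128)   # stand-in for model(x)
print(embeddings.norm(dim=1))      # generally not all ones

# FaceNet-style unit-norm embeddings: L2-normalize along the feature dim
normalized = F.normalize(embeddings, p=2, dim=1)
print(normalized.norm(dim=1))      # ~1.0 up to float error
```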