Pre-training on a small dataset to learn image similarity

Question

Opened this issue 3 years ago · 1 comment

Hi,

Firstly, just wanted to say this paper + approach is super cool! :)

After studying the SwAV paper, I was curious about whether SwAV could learn useful representations when only using a small image dataset.

For example, suppose we want to build a model for image similarity search over a small dataset of ~100 images (i.e., the model doesn't need to generalize to images outside this dataset). Can we expect SwAV to work well if we pre-train it using only that small dataset?

Or, would the recommended approach be to still pre-train SwAV on a very large dataset, and then use that model to perform image similarity search within the small dataset of interest?
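To make the second option concrete, here is a minimal sketch of the search step, assuming each image has already been turned into an embedding vector by some pre-trained encoder. The names `l2_normalize` and `cosine_top_k` and the toy 3-dimensional vectors are made up for illustration; none of this comes from the SwAV codebase.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_top_k(query, gallery, k=3):
    """Return the k gallery indices most similar to the query, as (index, score) pairs."""
    q = l2_normalize(query)
    scored = []
    for idx, emb in enumerate(gallery):
        g = l2_normalize(emb)
        # The dot product of unit vectors is the cosine similarity.
        score = sum(a * b for a, b in zip(q, g))
        scored.append((score, idx))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Hypothetical embeddings standing in for encoder outputs on 4 images.
gallery = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Use image 0 as the query; it should match itself first, then image 1.
results = cosine_top_k(gallery[0], gallery, k=2)
```

With only ~100 images, a brute-force comparison like this should be fast enough, so the open question is really which encoder produces the embeddings, not how to search them.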

Since dataset size appears to play an important role for novel approaches like SwAV, I was curious about this and wanted to ask to learn more!

Thanks!

Answer 1 · 2022-04-04T07:27:14.000Z

Hi @thecooltechguy,

Did you run experiments to answer this question, or did you find information on this some other way?