yl-1993/learn-to-cluster

clustering with a strong backbone

yyang2xin opened this issue · 1 comment

Hi Yang Lei, may I ask another question?
The clustering problem becomes easier with a stronger backbone (the face feature extractor); with a perfect backbone, the similarity score between positive pairs would always be 1.0, and 0 for negative pairs.
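
For concreteness, here is a toy check of what I mean (just a sketch with made-up one-hot "perfect" features, not the real extractor):

```python
import numpy as np

# Toy check: with "perfect" features (each identity mapped to its own one-hot
# direction), cosine similarity is 1.0 for positive pairs and 0.0 for negative
# pairs, so any threshold in (0, 1) recovers the clusters.
feats = np.array([
    [1.0, 0.0],  # identity A, sample 1
    [1.0, 0.0],  # identity A, sample 2
    [0.0, 1.0],  # identity B, sample 1
])
feats /= np.linalg.norm(feats, axis=1, keepdims=True)  # L2-normalize
sims = feats @ feats.T

print(sims[0, 1])  # positive pair -> 1.0
print(sims[0, 2])  # negative pair -> 0.0
```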

So I reckoned the gap between the simple CW method and GCN-V would be small even before training, and indeed I found the results were almost identical. To be clear, I haven't tested it on a large dataset yet.

But that raises the question: what backbone and loss function did you use during training? Will the advantage of this method go away if you use a stronger backbone?

thanks!

Hi @yyang2xin , thanks for the question.
(1) For the ideal scenario, if the features are perfectly separable, then several algorithms can guarantee a perfect solution, so both methods work well, as you suggested. We can also consider another extreme situation where each data sample belongs to a different cluster: if we apply a sufficiently high threshold, every method partitions each sample into its own cluster, and we obtain the same perfect result.
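
A minimal sketch of that extreme case, assuming a simple threshold-plus-connected-components clustering step (not the exact CW or GCN-V pipelines): once the threshold exceeds every pairwise similarity, any method that prunes edges this way degenerates to singleton clusters and matches the ground truth.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import connected_components

def threshold_cluster(feats, th):
    """Link samples whose cosine similarity exceeds `th`,
    then take connected components as clusters."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sims = feats @ feats.T
    np.fill_diagonal(sims, 0.0)        # ignore self-similarity
    adj = sp.csr_matrix(sims > th)     # keep only highly confident edges
    _, labels = connected_components(adj, directed=False)
    return labels

# Extreme case: every sample is its own ground-truth cluster.
feats = np.random.randn(5, 128)           # near-orthogonal random features
print(threshold_cluster(feats, th=0.99))  # high threshold -> 5 singleton clusters
```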
(2) If the feature extractor is not perfect, the performance gap depends on the data. Stronger features are likely to yield both a better baseline and a better learning-based clustering model. As the feature extractor grows stronger, the gap may first widen and then shrink.
(3) As stated in the paper, we use a standard ResNet-50 and the softmax loss in our experiments. CDP provides an analysis with stronger backbones, showing that a stronger backbone brings further performance gains.
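
For reference, a minimal sketch of that setup, assuming a torchvision ResNet-50 trained as an identity classifier with cross-entropy (softmax) loss, whose pooled backbone feature is then used for clustering; the exact training recipe in the paper may differ:

```python
import torch
import torch.nn as nn
import torchvision.models as models

class FaceFeatureExtractor(nn.Module):
    """ResNet-50 backbone with a softmax classification head (a sketch,
    not the authors' exact training code). After training, call
    forward(..., return_feat=True) to get the feature used for clustering."""
    def __init__(self, num_identities, feat_dim=2048):
        super().__init__()
        backbone = models.resnet50(weights=None)
        backbone.fc = nn.Identity()          # keep the 2048-d pooled feature
        self.backbone = backbone
        self.classifier = nn.Linear(feat_dim, num_identities)

    def forward(self, x, return_feat=False):
        feat = self.backbone(x)
        if return_feat:
            return nn.functional.normalize(feat, dim=1)  # L2-normalized feature
        return self.classifier(feat)         # logits for cross-entropy (softmax) loss

model = FaceFeatureExtractor(num_identities=1000)
logits = model(torch.randn(2, 3, 224, 224))
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1]))
```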