salesforce/CoMatch

How to improve performance


The idea in your paper is amazing; great truths are all simple. I have the following questions:
1. Does a stronger data augmentation method, such as RandAugment, improve performance?
2. To improve performance, is it necessary to expand the size of distribution alignment?
3. How should the memory bank size be adjusted when more labeled data is used on ImageNet, such as 20% of ImageNet?
4. If 20% of the ImageNet data is labeled, what do you recommend for the other hyperparameters?
5. From your point of view, what are the main challenges in matching fully supervised performance with 20% of the ImageNet data labeled?

Thanks for your questions! Here are my answers.

  1. RandAugment is important for performance, as originally observed in the FixMatch paper.
  2. What does "size" refer to? If you mean the number of mini-batches (32 for CIFAR), it won't have a large impact if we reduce or increase it.
  3. The memory bank size can be fixed as it is.
  4. I would suggest first trying the same hyperparameters as for 10% labels, and then further reducing the thresholds "thr" and "contrast-th".
  5. With more labeled data, the task would be easier in my opinion.
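
For readers unfamiliar with the two thresholds named in answer 4, here is a minimal pure-Python sketch of the roles they are usually described as playing: "thr" decides which unlabeled samples are confident enough to contribute a pseudo-label, and "contrast-th" decides which pairs of samples are linked in the pseudo-label graph. The function names, the toy probabilities, and the threshold values are illustrative assumptions, not code taken from the repository.

```python
def confidence_mask(probs, thr):
    """Return 1.0 for samples whose top class probability reaches thr, else 0.0."""
    return [1.0 if max(p) >= thr else 0.0 for p in probs]


def pseudo_label_graph(probs, contrast_th):
    """Build a row-normalized similarity graph from class probabilities.

    The edge weight is the dot product of two probability vectors. Edges
    below contrast_th are dropped, and every sample is linked to itself.
    """
    n = len(probs)
    graph = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                w = 1.0
            else:
                w = sum(a * b for a, b in zip(probs[i], probs[j]))
                if w < contrast_th:
                    w = 0.0
            row.append(w)
        total = sum(row)
        graph.append([w / total for w in row])
    return graph


# Toy predictions for three unlabeled samples over three classes (made up).
probs = [
    [0.96, 0.02, 0.02],
    [0.94, 0.03, 0.03],
    [0.40, 0.35, 0.25],
]

strict = confidence_mask(probs, thr=0.95)    # only the first sample passes
relaxed = confidence_mask(probs, thr=0.90)   # the second sample now passes too
graph = pseudo_label_graph(probs, contrast_th=0.8)
```

In this toy case, lowering the confidence threshold from 0.95 to 0.90 admits one more sample, which is the effect that reducing "thr" is meant to have; lowering "contrast-th" would similarly keep more edges in the graph.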

Thanks for your answers! Here are my questions.
1. Have you tried RandAugment or other strong data augmentation methods to improve performance?
2. By "size" I mean the distribution alignment length, 128:

if len(self.hist_prob)>128:

If the size were larger (such as 265), would that help improve performance?
3. I will keep it fixed as it is.
4. I tried the same hyperparameters as for 10% labels when using 20% labeled data, but the gain was only 0.6-0.7 points. Apart from reducing the thresholds "thr" and "contrast-th", do you have any other recommendations?
5. Have you tried introducing additional losses or other components? Do you have any advice on that?

  1. I have only tried RandAugment.
  2. I think 128 should be enough.
  3. You can also try decreasing --lam-c from 2 to 1.
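
The history length discussed in point 2 can be pictured with a small sketch. It assumes that distribution alignment stores the mean prediction of the most recent mini-batches, averages them, divides each new prediction by that average, and renormalizes. The class `DistributionAligner` and its `max_len` parameter are hypothetical names used for illustration; only the length check against 128 comes from the thread.

```python
from collections import deque


class DistributionAligner:
    """Hypothetical helper: align predictions against a running class average."""

    def __init__(self, max_len=128):
        # deque(maxlen=...) drops the oldest entry automatically, playing the
        # role of the `len(self.hist_prob) > 128` check quoted in the thread.
        self.hist_prob = deque(maxlen=max_len)

    def __call__(self, probs):
        num_classes = len(probs[0])
        # Mean predicted probability per class for this mini-batch.
        batch_mean = [
            sum(p[c] for p in probs) / len(probs) for c in range(num_classes)
        ]
        self.hist_prob.append(batch_mean)
        # Average of the stored per-batch means.
        prob_avg = [
            sum(h[c] for h in self.hist_prob) / len(self.hist_prob)
            for c in range(num_classes)
        ]
        aligned = []
        for p in probs:
            # Down-weight classes that were predicted often, then renormalize.
            scaled = [p[c] / prob_avg[c] for c in range(num_classes)]
            total = sum(scaled)
            aligned.append([s / total for s in scaled])
        return aligned


aligner = DistributionAligner(max_len=2)
batch = [[0.8, 0.2], [0.6, 0.4]]  # made-up predictions, two samples, two classes
for _ in range(5):
    aligned = aligner(batch)
```

A longer history only makes the running average change more slowly, which fits the reply that increasing or decreasing the length should not have a large impact.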