yuantn/MI-AOD

There are some differences between the paper and the code

wxy1234567 opened this issue · 6 comments

In Figure 2 of the paper, the instance uncertainty is first maximized and then minimized, but the code seems to do the opposite, and the layers frozen in the two stages of the code are also swapped.

Yes, we have noticed this. However, some of our earlier experiments showed that if the order of the max step and the min step is reversed (including the frozen layers), the performance changes very little. We are reproducing this now and will provide the corresponding log file soon, so that you can compare the log files for the two cases.
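
For readers who want to see what the two orders look like, below is a minimal, self-contained sketch of one max/min cycle in the order drawn in Fig. 2 (max step first). It is not the actual MMDetection-based code in this repository: the module names (`backbone`, `f1`, `f2`), the L1 discrepancy, and the omission of the supervised detection losses on the labeled set are all simplifying assumptions.

```python
import torch
import torch.nn as nn

def set_requires_grad(module: nn.Module, flag: bool) -> None:
    """Freeze (flag=False) or unfreeze (flag=True) a sub-network."""
    for p in module.parameters():
        p.requires_grad = flag

def discrepancy(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    """Instance uncertainty as the prediction discrepancy between the
    two adversarial classifiers (a simple L1 distance here)."""
    return (p1 - p2).abs().mean()

def max_min_cycle(backbone, f1, f2, unlabeled_loader, opt_cls, opt_backbone):
    """One max step followed by one min step, i.e. the order in Fig. 2."""
    # Max step: freeze the backbone, update the two classifiers to
    # maximize the prediction discrepancy on unlabeled images.
    set_requires_grad(backbone, False)
    set_requires_grad(f1, True)
    set_requires_grad(f2, True)
    for x in unlabeled_loader:
        feats = backbone(x)
        loss_max = -discrepancy(f1(feats).softmax(-1), f2(feats).softmax(-1))
        opt_cls.zero_grad()
        loss_max.backward()
        opt_cls.step()

    # Min step: freeze the classifiers, update the backbone to minimize
    # the same discrepancy, aligning the labeled/unlabeled instance
    # distributions.
    set_requires_grad(backbone, True)
    set_requires_grad(f1, False)
    set_requires_grad(f2, False)
    for x in unlabeled_loader:
        feats = backbone(x)
        loss_min = discrepancy(f1(feats).softmax(-1), f2(feats).softmax(-1))
        opt_backbone.zero_grad()
        loss_min.backward()
        opt_backbone.step()
```

Swapping the two blocks (together with the `set_requires_grad` calls) gives the reversed order discussed above.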

Here are the results under the new condition, compared with those in the last row of Tab. 1 in the paper:

| Proportion (%) of labeled images | 5.0 | 7.5 | 10.0 | 12.5 | 15.0 | 17.5 | 20.0 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| mAP (%) under the new condition | 48.41 | 56.16 | 65.50 | 68.65 | 69.53 | 70.88 | 71.92 |
| mAP (%) in the paper | 47.18 | 58.03 | 63.98 | 66.58 | 69.57 | 70.96 | 72.03 |
| Difference in mAP (%) | 1.23 | -1.87 | 1.52 | 2.07 | -0.04 | -0.08 | -0.11 |

Here is the output log: Google Drive | Baidu Drive (Extraction Code: 3nza)
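
If it helps, here is a small helper for comparing the mAP values reported in two such text logs; the `mAP: 0.xxxx` line format and the file names are assumptions and may need to be adapted to the actual log files.

```python
import re
from pathlib import Path

# Assumed format of the evaluation lines in the text log, e.g. "mAP: 0.4841".
MAP_PATTERN = re.compile(r"mAP:\s*([0-9]*\.?[0-9]+)")

def extract_map_values(log_path: str) -> list[float]:
    """Return every mAP value reported in a training log, in order."""
    return [float(m) for m in MAP_PATTERN.findall(Path(log_path).read_text())]

if __name__ == "__main__":
    # Hypothetical file names for the two runs being compared.
    paper_order = extract_map_values("log_paper_order.txt")
    code_order = extract_map_values("log_code_order.txt")
    for cycle, (a, b) in enumerate(zip(paper_order, code_order)):
        print(f"cycle {cycle}: paper order {a:.4f}, code order {b:.4f}, diff {a - b:+.4f}")
```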

Thank you! I have another small question. In theory, the results with the initial labeled set in Figure 5 of the paper should be similar across methods. Is the difference because your algorithm is trained for more epochs?

No. As described in the subsection "Active Learning Settings" in Section 4.1, MI-AOD and the other methods share the same training settings, the same initialization, and the same random seed.

The reason for the performance improvement is explained in the last paragraph of Section 3.2:

In each active learning cycle, the max-min prediction discrepancy procedures repeat several times so that the instance uncertainty is learned and *the instance distributions* of the labeled and unlabeled set *are progressively aligned*. This actually defines *an unsupervised learning procedure*, which leverages the information (i.e., *prediction discrepancy*) of the unlabeled set to improve the detection model.

So the reason can be summarized as:
Intentional use of unlabeled data
-> Better aligned instance distributions of the labeled and unlabeled set
-> Effective information (prediction discrepancy) of the unlabeled set
-> Naturally formed unsupervised learning procedure
-> Performance improvement
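
To make the chain above concrete, here is a small sketch of how the prediction discrepancy can serve both as the instance uncertainty and as an image-level score for picking unlabeled images to label; the top-k mean aggregation is only a placeholder assumption, whereas the paper re-weights instances with multiple instance learning.

```python
import torch

def instance_uncertainty(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    """Per-instance prediction discrepancy between the two classifiers.
    p1, p2: (num_instances, num_classes) class probabilities."""
    return (p1 - p2).abs().mean(dim=-1)            # shape: (num_instances,)

def image_uncertainty(p1: torch.Tensor, p2: torch.Tensor, top_k: int = 100) -> float:
    """Aggregate instance uncertainty into a single image-level score.
    A plain top-k mean is used here only for illustration."""
    u = instance_uncertainty(p1, p2)
    k = min(top_k, u.numel())
    return u.topk(k).values.mean().item()

def select_images(scores: list[float], budget: int) -> list[int]:
    """Indices of the `budget` most uncertain unlabeled images to label next."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:budget]
```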

The direct effect of each module is shown in Tab. 1: using 5.0% of the data, IUL improves the mAP from 28.31 to 30.09, and IUR further improves it to 47.18.

Thank you for your reply, it resolved my confusion!

😄