JialianW/TraDeS

Question regarding the "class-agnostic center heatmap" P^(t−τ)_agn

NosremeC opened this issue · 6 comments

Hi there, thank you for sharing your great work! I have a question here, please do correct me if I got it wrong:

We multiply the previous frame's feature by a "class-agnostic center heatmap" P_agn (computed on that previous frame), then pass the result to the DCN along with the offset O^D to get the "center-attentive feature", which is used to enhance the current feature. Your paper says P_agn is obtained from the heatmap output of CenterNet, but in the base_model section of your training code I found that P_agn, aka 'pre_hm_i', is always generated directly from the ground truth.

The relevant places in the code (screenshots omitted):

- base_model.py
- trainer.py
- generic_dataset.py, `__getitem__`
- `_get_pre_dets`
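For reference, a minimal sketch of the center-attentive weighting described above, using NumPy as a stand-in for the actual PyTorch tensors (the function name is illustrative, not from the TraDeS code):

```python
import numpy as np

def center_attentive_feature(prev_feat, p_agn):
    """Weight the previous-frame feature map by the class-agnostic
    center heatmap P_agn (illustrative sketch, not the repo's code).

    prev_feat: (C, H, W) feature map from frame t - tau
    p_agn:     (1, H, W) class-agnostic center heatmap (peaks at centers)
    """
    # Broadcast-multiply: responses near object centers are emphasized,
    # background responses (heatmap ~ 0) are suppressed.
    return prev_feat * p_agn

# Toy usage: an 8-channel 4x4 feature map with a single heatmap peak at (1, 2).
feat = np.random.randn(8, 4, 4)
hm = np.zeros((1, 4, 4))
hm[0, 1, 2] = 1.0
out = center_attentive_feature(feat, hm)
# out has the same shape as feat; positions with zero heatmap are zeroed out.
```

The weighted feature is what then goes into the deformable convolution together with the offset, per the description above.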

I also read your test code; the Detector seems fine since it uses the previous heatmap prediction. Have I misunderstood anything here? Many thanks!

Thanks for your interest in our work!

We use the ground truth heatmap in training, while in testing we use the heatmap predicted by the model.
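A hedged sketch of how a GT-rendered class-agnostic heatmap can be produced: CenterNet-style Gaussian peaks, max-merged into one channel. Function names and radius handling here are illustrative, not the repo's exact `_get_pre_dets`:

```python
import numpy as np

def gaussian2d(shape, sigma):
    # Standard unnormalized 2D Gaussian patch, peak value 1.0 at the center.
    m, n = [(s - 1) / 2 for s in shape]
    y, x = np.ogrid[-m:m + 1, -n:n + 1]
    return np.exp(-(x * x + y * y) / (2 * sigma * sigma))

def render_gt_heatmap(centers, h, w, radius=2):
    """Render a class-agnostic GT heatmap: one Gaussian peak per object
    center, max-merged into a single channel (CenterNet style)."""
    hm = np.zeros((h, w), dtype=np.float32)
    g = gaussian2d((2 * radius + 1, 2 * radius + 1), sigma=radius / 3)
    for cx, cy in centers:
        # Paste the Gaussian patch, clipped at the map borders.
        x0, x1 = max(0, cx - radius), min(w, cx + radius + 1)
        y0, y1 = max(0, cy - radius), min(h, cy + radius + 1)
        gx0, gy0 = x0 - (cx - radius), y0 - (cy - radius)
        patch = g[gy0:gy0 + (y1 - y0), gx0:gx0 + (x1 - x0)]
        hm[y0:y1, x0:x1] = np.maximum(hm[y0:y1, x0:x1], patch)
    return hm

# Two GT object centers on a 16x16 map; each gets a peak of 1.0.
hm = render_gt_heatmap([(5, 5), (12, 3)], h=16, w=16)
```

At test time this rendering step is simply replaced by the heatmap the model predicted on the previous frame.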

So you mean you used the GT heatmap to make the feature center-attentive during training? Wouldn't that affect test performance? During testing there is no ground truth, so we have to rely on the previous CenterNet output. Shouldn't the model also be trained the same way, using P_agn from the previous prediction, so that as the loss goes down the CenterNet performance improves as well?

Good question. I cited the explanation from the CenterTrack paper for your reference:

[screenshot: excerpt from the CenterTrack paper explaining its prior-heatmap strategy]
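For intuition: CenterTrack bridges the train/test gap by augmenting the GT heatmap to simulate test-time prediction errors (jittered center locations, random false positives, random false negatives). A hedged sketch with illustrative rates, not the paper's exact values:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_gt_centers(centers, h, w,
                       fp_rate=0.1, fn_rate=0.1, jitter=0.05):
    """Simulate test-time detector errors on GT centers, in the spirit of
    CenterTrack's prior-heatmap augmentation (rates are illustrative)."""
    out = []
    for cx, cy in centers:
        if rng.random() < fn_rate:
            # False negative: drop the object entirely.
            continue
        # Localization noise: jitter the center by a fraction of map size.
        cx = int(np.clip(cx + rng.normal(0, jitter * w), 0, w - 1))
        cy = int(np.clip(cy + rng.normal(0, jitter * h), 0, h - 1))
        out.append((cx, cy))
        if rng.random() < fp_rate:
            # False positive: add a spurious duplicate peak nearby.
            out.append((min(w - 1, cx + 1), cy))
    return out

# The augmented centers would then be rendered into the prior heatmap.
aug = augment_gt_centers([(5, 5), (12, 3)], h=16, w=16)
```

This way the model learns to treat the prior heatmap as a noisy hint rather than a perfect oracle, which is closer to what it sees at test time.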

That is very interesting. Now I see what actually happens during training, but I still don't fully understand why we don't just train with the real heatmap output instead of manually injecting errors. I also read through the CenterTrack ablation study, and they never explain why they abandoned the real heatmap. Maybe it creates too much noise in the early stage of training, when the detection head is still weak? When the detection head outputs a bad heatmap, applying it degrades the quality of the original feature, which in turn makes detection on the current feature even worse once they interact, so training could get stuck in the early stage?

Thank you for sharing the reference though!

What you said could be a reason. I'm not sure whether using the real heatmap or the manipulated GT would make a difference in performance. The previous heatmap does not play that important a role in the method, so the manipulated GT should be good enough.

Totally makes sense! I'll move on from here. Thank you for your patience!