shangbuhuan13/SO-Pose

question about backbone in experiment configs for LM dataset


Hi,

Thanks for your great work! I have a question about the backbone used for the LM dataset. Your paper says "As backbone we leverage ResNet34 [6] for all experiments on the LM dataset", but the config files in this repo seem to differ:

    BACKBONE=dict(
        FREEZE=False,
        PRETRAINED="mmcls://resnet50_v1d",
        INIT_CFG=dict(
            _delete_=True,
            type="mm/ResNetV1d",
            depth=50,
            in_channels=3,
            out_indices=(3,),
        ),
    ),

and the output feature dimension is 2048, not 512 as in GDR-Net.
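The 2048 vs. 512 difference follows from the standard ResNet design: ResNet34 is built from basic blocks (channel expansion 1) and ResNet50 from bottleneck blocks (channel expansion 4), both on a 512-wide final stage. A minimal sketch of that arithmetic, where `last_stage_channels` and `EXPANSION` are hypothetical helper names and not part of this repo:

```python
# Channel expansion of the residual block used at each standard ResNet depth:
# basic blocks (18, 34) keep the stage width, bottlenecks (50+) multiply it by 4.
EXPANSION = {18: 1, 34: 1, 50: 4, 101: 4, 152: 4}


def last_stage_channels(depth, base_channels=64, num_stages=4):
    """Return the channel count of the final-stage feature map.

    Assumes the standard layout: the stage width doubles at every stage,
    starting from `base_channels`.
    """
    stage_width = base_channels * 2 ** (num_stages - 1)  # 64 -> 128 -> 256 -> 512
    return stage_width * EXPANSION[depth]


print(last_stage_channels(34))  # 512
print(last_stage_channels(50))  # 2048
```

When swapping the backbone, any layer that consumes this feature map needs its input channel count changed to match.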

Could you help to clarify this? Thanks!

Thanks for your question.
I also noticed that this differs from the paper.
I have forgotten some of the experiment details.
As far as I remember, I used ResNet34 for LM. Since this dataset is relatively easy (no occlusion or blur), a deeper backbone such as ResNet50 gives results similar to ResNet34.
The provided config files are not the initial config files. Sorry, I didn't check the config file for LM.
For the experiments, you can directly change the config file according to GDR-Net.

Got it, thanks!

I am curious whether anyone has run the configs in this repo on the LM dataset and been able to reproduce the results. I got ad_2=44.49 with the ResNet50 backbone, but only 39.71 with ResNet34. The performance gap may be due to my own generation of the ground truth for P.

I don't know. I have forgotten the details for the LM dataset.
But can you reproduce the results on LMO? I think the provided config file for it is right.

I guess that, since I ran the ablation study on LM, the config might have been changed a little.
But I remember that the performance of ResNet34 and ResNet50 should be similar.
By the way, P and Q should lie on the same line.
The public renderer may cause problems in this step.
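One way to sanity-check generated ground truth against this constraint is a collinearity test. The comment does not say which line is meant; the sketch below assumes P and Q are 3D points in the camera frame and that the line is the camera ray through the origin. The function names are hypothetical and not taken from this repo.

```python
import math


def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))


def collinear_with_origin(p, q, tol=1e-6):
    """True if p and q lie on one line through the origin (assumed camera center).

    Uses |p x q| / (|p| |q|), the sine of the angle between the two rays.
    """
    denom = norm(p) * norm(q)
    if denom == 0.0:
        return True  # a point at the origin is trivially on every such line
    return norm(cross(p, q)) / denom <= tol


p = (0.1, -0.05, 0.8)
q_on_ray = tuple(1.25 * c for c in p)  # same ray, farther from the camera
q_off_ray = (0.1, -0.05, 0.9)          # slightly off the ray

print(collinear_with_origin(p, q_on_ray))
print(collinear_with_origin(p, q_off_ray))
```

Running such a check over a sample of pixels can show whether a third-party renderer introduces offsets between P and Q; the tolerance would need tuning to the units and precision of the rendered data.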

Thanks very much for your reply! Yes, I used the data generation code from Pix2Pose with some modifications, which is a renderer-based method. I will check the code in this repo as well. I'd like to try larger datasets such as LMO and YCBV once I have enough disk space.