garrickbrazil/M3D-RPN

How is the performance when using ResNet (18 or 50)?

kl456123 opened this issue · 11 comments

I just want to know which backbone is good enough to train for a mono 3d prediction task.

We have not tried ResNet in a long time. If I recall correctly, the performance is slightly lower than with DenseNet. I plan to add a few more backbone models to the repository soon.

Otherwise, feel free to extend the models yourself using any backbone offered by torchvision, which primarily involves changing the final pooling layer and optionally dilating the network. SqueezeNet, ShuffleNet, and MnasNet would all be interesting in addition to ResNet, but they may require a few more minor tweaks.
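For instance, here is a minimal, untested sketch of the general pattern (my own sketch, not code from this repo; squeezenet1_1 and its 512 output channels are assumptions specific to that model):

import torch.nn as nn
from torchvision import models

# SqueezeNet exposes its convolutional trunk as .features and already runs
# at an overall stride of 16, so no dilation is needed in this case
base = models.squeezenet1_1(pretrained=True)
backbone = base.features

# a small 3x3 head on top, mirroring how prop_feats is built in this repo;
# squeezenet1_1's final fire module outputs 512 channels
prop_feats = nn.Sequential(
    nn.Conv2d(512, 512, 3, padding=1),
    nn.ReLU(inplace=True),
)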

I just want to know which backbone is good enough to train for a mono 3d prediction task.

I tried ResNet18, 34, 50, and 101; the best one is 101, but the results are still lower than with DenseNet. Also, you cannot use the whole network, because the network stride must stay at 16, so the feature size cannot exceed 1024 channels (i.e., you stop at layer3).
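(For concreteness, a quick check against torchvision's ResNet101, assuming the stock Bottleneck layout: layer3 gives 1024 channels at stride 16, while layer4 gives 2048 channels but only at stride 32.)

from torchvision import models

net = models.resnet101(pretrained=False)
print(net.layer3[-1].conv3.out_channels)  # 1024 channels at stride 16
print(net.layer4[-1].conv3.out_channels)  # 2048 channels, but at stride 32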

I do not currently have time to test the models, but feel free to adjust the stride of the ResNet backbones such that the final stride is still 16. Then you can use all the layers and feed the proper number of input channels into "prop_feats". I typically change the last few ResNet blocks for this. For example:

Put this in the init:

# stem layers from the torchvision base model
self.conv1 = base.conv1
self.bn1 = base.bn1
self.relu = base.relu
self.maxpool = base.maxpool

# the four residual stages
self.layer1 = base.layer1
self.layer2 = base.layer2
self.layer3 = base.layer3
self.layer4 = base.layer4

# undo the stride-2 downsampling at the start of layer4 so the overall
# network stride stays 16 (caveat: in Bottleneck blocks, ResNet50 and up,
# the spatial stride sits on conv2 rather than conv1)
self.layer4[0].downsample[0].stride = (1, 1)
self.layer4[0].conv1.stride = (1, 1)

self.prop_feats = nn.Sequential(
    # conv3 exists only in Bottleneck blocks; BasicBlock backbones
    # (ResNet18/34) end in conv2, so use conv2.out_channels there
    nn.Conv2d(self.layer4[-1].conv3.out_channels, 512, 3, padding=1),
    nn.ReLU(inplace=True)
)

Put this in the forward:

# standard ResNet forward pass through the stem and all four stages;
# with the stride changes above, the output is at 1/16 input resolution
x = self.conv1(x)
x = self.bn1(x)
x = self.relu(x)
x = self.maxpool(x)
x = self.layer1(x)
x = self.layer2(x)
x = self.layer3(x)
x = self.layer4(x)

Put this in the build:

from torchvision import models

# 'train' doubles as the flag for loading ImageNet-pretrained weights
resnet18 = models.resnet18(pretrained=train)
rpn_net = RPN(phase, resnet18, conf)

I have not tested this, but something similar should work for at least ResNet18 and ResNet50.
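As a quick, untested sanity check (my sketch, assuming torchvision's Bottleneck layout where the stride-2 convolution is conv2), you can push a dummy tensor through the modified backbone and confirm the output is at 1/16 resolution:

import torch
from torchvision import models

net = models.resnet50(pretrained=False)

# undo the stride-2 downsampling in the first block of layer4
net.layer4[0].downsample[0].stride = (1, 1)
net.layer4[0].conv2.stride = (1, 1)  # Bottleneck strides on conv2, not conv1

x = torch.zeros(1, 3, 224, 224)
x = net.maxpool(net.relu(net.bn1(net.conv1(x))))
x = net.layer4(net.layer3(net.layer2(net.layer1(x))))
print(x.shape)  # expect torch.Size([1, 2048, 14, 14]), i.e., 224 / 16 = 14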

I've attached examples for both ResNet18 and ResNet50 here. However, I have NOT tested them :)

Thank you so much, Garrick Brazil, for commenting and for making the code.
I am still getting the same error that I faced when I try to use layer4 of the ResNet50 model:

RuntimeError: The size of tensor a (55) must match the size of tensor b (110) at non-singleton dimension 3
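(For reference: a clean factor-of-two mismatch like 55 vs 110 usually means layer4 is still downsampling. In torchvision's Bottleneck block, used by ResNet50 and up, the stride-2 convolution is conv2, so resetting conv1's stride alone does not remove the downsampling. You can verify this quickly:)

from torchvision import models

net = models.resnet50(pretrained=False)
print(net.layer4[0].conv1.stride)  # (1, 1) -- already stride 1 in a Bottleneck
print(net.layer4[0].conv2.stride)  # (2, 2) -- this stride must also be reset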

Well, after training with layer3 using the code you provided, the following are the results:
test_iter 100000 2d car --> easy: 0.0004, mod: 0.0008, hard: 0.0013
test_iter 100000 gr car --> easy: 0.0002, mod: 0.0001, hard: 0.0001
test_iter 100000 3d car --> easy: 0.0002, mod: 0.0001, hard: 0.0001
test_iter 100000 2d pedestrian --> easy: 0.0070, mod: 0.0070, hard: 0.0070
test_iter 100000 gr pedestrian --> easy: 0.0003, mod: 0.0003, hard: 0.0003
test_iter 100000 3d pedestrian --> easy: 0.0003, mod: 0.0003, hard: 0.0003
test_iter 100000 2d cyclist --> easy: 0.0000, mod: 0.0000, hard: 0.0000
test_iter 100000 gr cyclist --> easy: 0.0000, mod: 0.0000, hard: 0.0000
test_iter 100000 3d cyclist --> easy: 0.0000, mod: 0.0000, hard: 0.0000

Now I changed the code and used it in the dilate file, and using that layer gives reasonable results.

Attached are fixed model files (I briefly checked that they begin training on my machine).

resnet_models_fixed.tar.gz

Regarding the performance you posted, that is very alarming. If the above models converge to a low training loss but have extremely poor validation performance, then it may be necessary to adjust the batch norm momentum: either slow it down or halt it entirely.

You can accomplish this with either of the functions below. I do NOT recommend doing this unless you observe a major generalization gap between training loss and validation. Freezing these layers is not ideal unless the batches are too unstable.

import torch

def freeze_bn(network):
    # put every BatchNorm2d layer into eval mode so its running statistics
    # stop updating entirely; note that network.train() will undo this
    for name, module in network.named_modules():
        if isinstance(module, torch.nn.BatchNorm2d):
            module.eval()

def slow_bn(network, val=0.01):
    # keep the BatchNorm2d layers training, but lower their momentum so the
    # running statistics update more slowly
    for name, module in network.named_modules():
        if isinstance(module, torch.nn.BatchNorm2d):
            module.momentum = val

Then you must add the function call (e.g., "slow_bn(rpn_net)") in train_rpn_3d.py at line 104, before training begins, AND again around line 192, after validation finishes each cycle.
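Roughly, the placement looks like this (a hypothetical outline only; the actual loop structure and names in train_rpn_3d.py will differ, and test_interval is a placeholder):

slow_bn(rpn_net)  # ~line 104, once before the training loop starts

for iteration in range(max_iter):

    # ... forward pass, compute losses, backward, optimizer step ...

    if (iteration + 1) % test_interval == 0:
        rpn_net.eval()
        # ... run validation ...
        rpn_net.train()   # puts BatchNorm layers back into training mode,
        slow_bn(rpn_net)  # ~line 192, so re-apply the adjustment here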

When I use the ResNet101 model with depth-aware convolution, I get the following results, which are still reasonable:
test_iter 50000 2d car --> easy: 0.8680, mod: 0.8214, hard: 0.6647
test_iter 50000 gr car --> easy: 0.1937, mod: 0.1634, hard: 0.1385
test_iter 50000 3d car --> easy: 0.1265, mod: 0.1138, hard: 0.1003
test_iter 50000 2d pedestrian --> easy: 0.6363, mod: 0.5651, hard: 0.4853
test_iter 50000 gr pedestrian --> easy: 0.0533, mod: 0.0521, hard: 0.0460
test_iter 50000 3d pedestrian --> easy: 0.0467, mod: 0.0461, hard: 0.0430
test_iter 50000 2d cyclist --> easy: 0.6447, mod: 0.4094, hard: 0.4014
test_iter 50000 gr cyclist --> easy: 0.0422, mod: 0.0258, hard: 0.0261
test_iter 50000 3d cyclist --> easy: 0.0265, mod: 0.0215, hard: 0.0132
Let me try with your code and with changing the batch normalization.

I am very thankful to you for helping. You did a great job.

I've attached examples for both ResNet18 and ResNet50 here. However, I have NOT tested them :)

Thanks for your great work! I have some problems with changing densenet121, and I hope you can give me some advice. I changed the backbone from densenet121.features to ResNet50 up through layer4, changed self.base[-1].num_features in prop_feats to 2048, and changed the lr from 0.004 to 0.0005 and the batch size from 2 to 8 (otherwise the training can't continue). But I got results like this:
[INFO]: 2021-12-10 22:11:39,653 test_iter 50000 2d car --> easy: 0.0007, mod: 0.0009, hard: 0.0012
[INFO]: 2021-12-10 22:11:39,654 test_iter 50000 gr car --> easy: 0.0002, mod: 0.0005, hard: 0.0005
[INFO]: 2021-12-10 22:11:39,655 test_iter 50000 3d car --> easy: 0.0001, mod: 0.0005, hard: 0.0005
[INFO]: 2021-12-10 22:11:39,656 test_iter 50000 2d pedestrian --> easy: 0.0152, mod: 0.0455, hard: 0.0455
[INFO]: 2021-12-10 22:11:39,656 test_iter 50000 gr pedestrian --> easy: 0.0012, mod: 0.0012, hard: 0.0012
[INFO]: 2021-12-10 22:11:39,657 test_iter 50000 3d pedestrian --> easy: 0.0012, mod: 0.0012, hard: 0.0012
[INFO]: 2021-12-10 22:11:39,658 test_iter 50000 2d cyclist --> easy: 0.0000, mod: 0.0000, hard: 0.0000
[INFO]: 2021-12-10 22:11:39,658 test_iter 50000 gr cyclist --> easy: 0.0000, mod: 0.0000, hard: 0.0000
[INFO]: 2021-12-10 22:11:39,659 test_iter 50000 3d cyclist --> easy: 0.0000, mod: 0.0000, hard: 0.0000
Could you please give me some help?

I've attached examples for both ResNet18 and ResNet50 here. However, I have NOT tested them :)

I used the code you provided, but when I run train_rpn_3d.py, it shows this:
iter: 250, acc (bg: 0.99, fg: 0.01, iou: nan), loss (bbox_3d: nan, cls: nan, iou: nan), misc (ry: nan, z: nan), dt: 0.34, eta: 4.7h
iter: 500, acc (bg: 1.00, fg: 0.00, iou: nan), loss (bbox_3d: nan, cls: nan, iou: nan), misc (ry: nan, z: nan), dt: 0.31, eta: 4.3h
iter: 750, acc (bg: 1.00, fg: 0.00, iou: nan), loss (bbox_3d: nan, cls: nan, iou: nan), misc (ry: nan, z: nan), dt: 0.30, eta: 4.1h
iter: 1000, acc (bg: 1.00, fg: 0.00, iou: nan), loss (bbox_3d: nan, cls: nan, iou: nan), misc (ry: nan, z: nan), dt: 0.29, eta: 4.0h
iter: 1250, acc (bg: 1.00, fg: 0.00, iou: nan), loss (bbox_3d: nan, cls: nan, iou: nan), misc (ry: nan, z: nan), dt: 0.28, eta: 3.8h
iter: 1500, acc (bg: 1.00, fg: 0.00, iou: nan), loss (bbox_3d: nan, cls: nan, iou: nan), misc (ry: nan, z: nan), dt: 0.30, eta: 4.0h
iter: 1750, acc (bg: 1.00, fg: 0.00, iou: nan), loss (bbox_3d: nan, cls: nan, iou: nan), misc (ry: nan, z: nan), dt: 0.29, eta: 3.9h
Could you please give me some help?