Yu-Wu/One-Example-Person-ReID

How to replace the resnet50 with other networks?

xiaonvxia opened this issue · 9 comments

Hi, I want to replace resnet50 with another network, but every time I swap in the source code for another network, the program uses a lot of memory after it prints "create dataloader for Training with batch_size 16", and then it stops running without showing anything. I don't know why. What should I do if I want to change the network?

Yu-Wu commented

I think there might be some error when changing the backbone.

Please carefully read https://github.com/Yu-Wu/One-Example-Person-ReID/blob/master/my_reid/models/end2end.py

There are some minor changes to the backbones here.

You can print some intermediate results at Line 100.

Thank you very much for your reply. I am a beginner with little experience in deep learning. After replacing the network, I changed line 88 in end2end.py, "self.CNN = resnet50(dropout=dropout, fixed_layer=fixed_layer)". The code does not show any errors, but it terminates partway through the run. What else should I change in end2end.py?

Yu-Wu commented

I am not sure about this issue.
Can you please try a small dataset, `market1501`, with a small batch size, e.g., 4?

I just used market1501 with the batch size set to 1, and replaced the resnet50 code in your project with the resnet50 source code, but it also doesn't work. I don't understand why the resnet50 code in your project does not produce the same results as the resnet50 source code. With the resnet50 source code the program terminates, but with your resnet50 code it runs successfully.

I tried printing resnet_feature at Line 100 and the result shows "tensor([], device='cuda:0', size=(4, 0), grad_fn=)". Does this mean the network produces no output, and that something is wrong in the resnet code I changed?
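An empty feature tensor like the one above can often be narrowed down by printing the output shape after each stage of the backbone. The following is only a minimal sketch: `trace_shapes` and `toy_backbone` are hypothetical names introduced for illustration and are not part of the repository, and the helper assumes a model whose forward pass simply applies its top-level children in order.

```python
import torch
import torch.nn as nn


def trace_shapes(model, x):
    """Apply the model's top-level children one by one and record output shapes.

    Assumption: the model's forward pass is a plain chain of its children.
    Models with extra logic in forward() (reshapes, skips) need forward hooks instead.
    """
    shapes = []
    with torch.no_grad():
        for name, module in model.named_children():
            x = module(x)
            shapes.append((name, tuple(x.shape)))
    return shapes


# Stand-in backbone for illustration only (not the repository's network).
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

for name, shape in trace_shapes(toy_backbone, torch.randn(4, 3, 64, 32)):
    # A zero in the shape marks the stage where the features become empty.
    marker = "  <-- empty" if 0 in shape else ""
    print(name, shape, marker)
```

If some stage reports a dimension of 0, the layers or the slicing applied just before that point are the first place to look.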

Yu-Wu commented

I think there must be something wrong with your code. You can check the difference between your resnet50 code and my code line by line.

I cannot reproduce your results from so little information.

I'm sorry to bother you again. Your network code has far fewer parameters and classes than the official source code, which defines many classes such as BasicBlock. Why is that? Also, in the line "self.base = torchvision.models.resnet50(pretrained=pretrained)", I don't understand what self.base does.

Yu-Wu commented

@xiaonvxia

I just used the code from https://github.com/Cysu/open-reid/blob/master/reid/models/resnet.py

Please see the original repo for more details.

Ok, thank you very much for your answer.