stefanopini/simple-HRNet

Loading pretrained weights (imagenet)

dabeschte opened this issue · 2 comments

First of all: thanks a lot for this awesome work.

I managed to run training on Windows (by simply removing all "cpu_nms" and "gpu_nms" references in nms.py).
However, when I tried to load the pretrained weights, I ran into an error, because the weights pretrained on ImageNet have different layers at the end. This can easily be solved by setting "strict=False" in Train.py.
I don't think loading these weights could ever work with the default setting of "strict=True".

Do you know if there could be some side effects from setting this option to False?
I just started training, so I can't tell yet if loading pretrained weights is actually working and accelerating/improving the training.

Hi, thank you for the appreciation and for reporting the issue!

According to the PyTorch documentation, there shouldn't be any issue with using strict=False in the load_state_dict function.
However, in this case it is important to check the values returned by the function.

When strict=False, PyTorch silently loads only the matching weights (i.e. weights with the same name in both the model and the checkpoint) and reports the names that did not match in the returned lists: missing_keys (in the model but not in the checkpoint) and unexpected_keys (in the checkpoint but not in the model).
If you accidentally load the wrong checkpoint, e.g. a checkpoint of a different model, you won't notice it unless you check the returned values.
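
To make the returned values concrete, here is a minimal sketch. The two toy classes and their layer names (`backbone`, `classifier`, `final_layer`) are hypothetical stand-ins I made up for illustration, not the actual simple-HRNet or ImageNet model definitions:

```python
import torch
import torch.nn as nn

# Toy stand-ins (hypothetical, NOT the simple-HRNet classes): both share a
# "backbone" layer but have a differently named, task-specific last layer.
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 4)
        self.classifier = nn.Linear(4, 10)   # classification-style head

class PoseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 4)
        self.final_layer = nn.Linear(4, 17)  # heatmap-style head

checkpoint = Classifier().state_dict()
model = PoseNet()

# strict=False loads the keys that match by name and reports the rest
# instead of raising an error for them.
result = model.load_state_dict(checkpoint, strict=False)
missing_keys = list(result.missing_keys)        # in the model, not in the checkpoint
unexpected_keys = list(result.unexpected_keys)  # in the checkpoint, not in the model

# The shared layer was actually copied from the checkpoint.
backbone_loaded = torch.equal(model.backbone.weight, checkpoint["backbone.weight"])

print("missing:", missing_keys)
print("unexpected:", unexpected_keys)
print("backbone loaded:", backbone_loaded)
```

Here the backbone weights are loaded, while the two head layers show up in `missing_keys` and `unexpected_keys` respectively. With strict=True the same call would raise a RuntimeError listing those keys.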

For example, in the case of the ImageNet checkpoint, there should be differences only in the last layer(s) of the network, because the checkpoint has a classification layer where the pose model has the layer(s) that predict the heatmaps.
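
A check along these lines could look like the sketch below. The helper `load_pretrained`, the `make_net` toy builder, and the prefixes `final_layer.` and `classifier.` are all assumptions for illustration; the real layer names depend on the model and checkpoint, and this is not necessarily how the project ends up implementing it:

```python
import torch.nn as nn

def load_pretrained(model, state_dict, allowed_prefixes):
    """Load with strict=False, then fail loudly if any key that did not
    match falls outside the prefixes we expect to differ (hypothetical helper)."""
    result = model.load_state_dict(state_dict, strict=False)
    mismatched = list(result.missing_keys) + list(result.unexpected_keys)
    offending = [k for k in mismatched if not k.startswith(tuple(allowed_prefixes))]
    if offending:
        raise RuntimeError(f"Checkpoint does not match the model: {offending}")
    return mismatched

def make_net(head_name, out_features):
    # Toy network: a shared "backbone" plus a head with a configurable name.
    net = nn.Module()
    net.add_module("backbone", nn.Linear(8, 4))
    net.add_module(head_name, nn.Linear(4, out_features))
    return net

# Expected case: only the last layers differ, so the load is accepted.
pose = make_net("final_layer", 17)
imagenet_like = make_net("classifier", 10).state_dict()
skipped = load_pretrained(pose, imagenet_like, ("final_layer.", "classifier."))
print("skipped keys:", skipped)

# Wrong checkpoint: the backbone keys are missing too, so the check fails.
wrong_checkpoint = nn.Linear(3, 3).state_dict()
try:
    load_pretrained(make_net("final_layer", 17), wrong_checkpoint,
                    ("final_layer.", "classifier."))
    raised = False
except RuntimeError:
    raised = True
print("wrong checkpoint rejected:", raised)
```

Note that the check runs after load_state_dict, so any matching weights have already been copied into the model by the time the error is raised. Also, strict=False only relaxes name mismatches: a key with the same name but a different tensor shape still raises an error.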

In the next few days, I'll try to integrate this check into the code and update the project. Thank you!

Hi,
I have now updated the training code to fix the loading of the weights pre-trained on ImageNet.
Let me know if you encounter any other issues!