gau-nernst/vision-toolbox

Comparing backbone to original YOLOv5 repo


Hey,
I've been trying to load the pre-trained weights from this repo into the original YOLOv5 (changing state dict key names, etc.).
However, the weight shapes do not line up, for example on YOLOv5s:
(left - your implementation with names and sizes, right - original YOLOv5)
[image: weight names and shapes, side by side]

The first problem occurs in block 0: a mismatch in the weight shapes, as shown in the image above.
I've checked the s, m, and l variants, and all of them have the same issue.

Could you please check if there's a bug in your implementation?
Thanks!

Hello,

When I implemented YOLOv5, I checked the total number of parameters in my implementation against the official models, and they were the same. So I was quite confident that the implementation is correct.

I think there are two possible scenarios:

  1. YOLOv5 changed its architecture some time after I implemented mine.
  2. You matched the wrong weight names from my implementation to the official one. The naming and order of the weights can be very different. I recommend using Netron to inspect the model architecture to make sure you map the correct weights. You can also dig into the source code.
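
A quick way to test a candidate name mapping before loading anything is to compare shapes pair by pair. The sketch below is only an illustration: `find_shape_mismatches` is a hypothetical helper (not part of either repo), and the shapes in the example are made up. With PyTorch, a name-to-shape dict can be built with `{k: tuple(v.shape) for k, v in model.state_dict().items()}`.

```python
def find_shape_mismatches(mapping, src_shapes, dst_shapes):
    """Check a candidate mapping (source name -> destination name).

    src_shapes / dst_shapes map parameter names to shape tuples.
    Returns a list of (src_name, dst_name, reason) for every bad pair.
    """
    problems = []
    for src, dst in mapping.items():
        if src not in src_shapes:
            problems.append((src, dst, "missing in source"))
        elif dst not in dst_shapes:
            problems.append((src, dst, "missing in destination"))
        elif tuple(src_shapes[src]) != tuple(dst_shapes[dst]):
            reason = f"{tuple(src_shapes[src])} vs {tuple(dst_shapes[dst])}"
            problems.append((src, dst, reason))
    return problems


# Made-up shapes, for illustration only.
src_shapes = {"conv1.weight": (32, 64, 1, 1), "out_conv.weight": (64, 64, 1, 1)}
dst_shapes = {"cv2.weight": (32, 64, 1, 1), "cv3.weight": (64, 64, 1, 1)}

for problem in find_shape_mismatches({"conv1.weight": "cv3.weight"}, src_shapes, dst_shapes):
    print(problem)
```

Note that matching shapes do not prove a mapping is right (two layers can share a shape), so this only catches the obvious errors.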

I will check on the two possible reasons above when I have some time.

Cheers!

Hello,

It seems that you have matched the wrong weight names.

The C3 layer in the official repo (link) is the same as the CSPDarknetStage layer in my implementation (link). cv3 corresponds to out_conv, not conv1.

The correct mapping (my implementation -> official repo) would be

  • conv -> separate conv layer
  • conv1 -> cv2
  • conv2 -> cv1
  • blocks -> m
  • out_conv -> cv3
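
As a rough illustration, the mapping above could be applied to the keys of one stage like this. It is a minimal sketch, not the conversion script: `rename_stage_keys` and all three prefixes are hypothetical, and it assumes the parameter names inside each sub-module need no further renaming, which may not hold in practice.

```python
# Sub-module renames within one stage, taken from the mapping above
# (this repo's name -> official C3 name).
C3_RENAMES = {
    "conv1": "cv2",
    "conv2": "cv1",
    "blocks": "m",
    "out_conv": "cv3",
}


def rename_stage_keys(state_dict, stage_prefix, conv_prefix, c3_prefix):
    """Rename the keys of one stage; all prefixes are hypothetical.

    stage_prefix: prefix of the stage in this repo's state dict
    conv_prefix:  prefix of the separate conv layer in the official model
    c3_prefix:    prefix of the C3 layer in the official model
    """
    renamed = {}
    for key, value in state_dict.items():
        if not key.startswith(stage_prefix + "."):
            renamed[key] = value  # not part of this stage; keep as-is
            continue
        rest = key[len(stage_prefix) + 1:]
        head, _, tail = rest.partition(".")
        if head == "conv":
            # The stage's own conv becomes a separate layer.
            new_key = f"{conv_prefix}.{tail}"
        elif head in C3_RENAMES:
            new_key = f"{c3_prefix}.{C3_RENAMES[head]}.{tail}"
        else:
            raise KeyError(f"unexpected sub-module in stage: {key}")
        renamed[new_key] = value
    return renamed
```

For example, with the made-up prefixes `"stage1"`, `"model.1"`, and `"model.2"`, the key `stage1.out_conv.weight` would become `model.2.cv3.weight`.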

Maybe I can write a script to convert the weights since some people have requested it before. I will update it here.

Hello @AlonZolfi,

I have added the script to convert YOLOv5 backbone weights.

python scripts/convert_yolov5_weights.py {weights_from_this_repo.pth} {save_path.pth}

Here are the notes I added to my README

The weights will be renamed to be compatible with Ultralytics' repo. Note that the converted .pth file only contains the renamed state dict, up to model.8 (the backbone part, without the SPPF layer). You will need to modify their train script to treat the loaded file as a state dict, instead of a dictionary with a key model containing the model object.
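
One way to handle both file layouts in a loading routine is sketched below. This is an untested assumption about how such a change could look, not code from either repo; `extract_state_dict` is a hypothetical helper.

```python
def extract_state_dict(loaded):
    """Return a plain state dict from either checkpoint layout.

    Layout A: a dictionary with key "model" holding the model object.
    Layout B: the state dict itself (as in the converted .pth file).
    """
    if (
        isinstance(loaded, dict)
        and "model" in loaded
        and hasattr(loaded["model"], "state_dict")
    ):
        return dict(loaded["model"].state_dict())
    return dict(loaded)


# Possible usage with PyTorch (not run here):
#   loaded = torch.load(path, map_location="cpu")
#   state = extract_state_dict(loaded)
#   model.load_state_dict(state, strict=False)
# strict=False is assumed to be needed because the converted file
# only covers the backbone, not the whole detector.
```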

I haven't tested training a full YOLOv5 object detector with the converted weights, so this function is not guaranteed to work correctly.

I'm not familiar with the YOLOv5 codebase. You can help me test if the converted weights work correctly with the official YOLOv5 repo.

Cheers!