Detection and segmentation models

Question

pianogGG opened this issue a year ago · 8 comments

Hi, I saw in the catalog that you plan to train models for detection and segmentation. I was wondering:

Do you plan to modify the network structure for the detection task?
When do you plan to release the related models?
I want to replace the Swin-T camera backbone in BEVFusion with FasterViT, using the existing ImageNet-1K pre-trained models. Do you think this would be a good solution?

Answer 1 · 2023-06-21T06:16:32.000Z

Hi @pianogGG, yes, we will release the detection code with a slightly modified architecture. That said, the any-resolution FasterViT model can already be used for this purpose at this stage.

There is currently no ETA for the release of these models, but hopefully it will happen soon.

And it is certainly a great idea to replace Swin-T in BEVFusion with its FasterViT counterpart, given FasterViT's clear advantage in both accuracy and throughput. There is some domain gap, but many papers have shown that ImageNet pre-trained models are indeed useful.

Answer 2 · 2023-06-21T06:45:36.000Z

Hi @ahatamiz, regarding "yes we will release the detection code with a slightly modified architecture": did you mean the FasterViT backbone, the neck, or the head?

Answer 3 · 2023-06-21T06:48:36.000Z

The FasterViT backbone I mentioned refers to the part circled in the first box.

Answer 4 · 2023-06-21T07:01:13.000Z

Thanks for the question @pianogGG. Precisely, the neck: we remove the final classification head and only extract intermediate feature maps.

Answer 5 · 2023-06-21T07:12:13.000Z

Hi @ahatamiz, thanks a lot. So regarding "yes we will release the detection code with a slightly modified architecture", which part did you change?

Answer 6 · 2023-06-21T13:51:51.000Z

Hi @pianogGG, the detection/segmentation code is now almost the same as the FasterViT_any_res model, which supports non-square images (as often seen in COCO, ADE20K, etc.). We only have to remove the linear head (as mentioned above) and extract the output of each stage.
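
For readers who want to see what "remove the linear head and extract the output of each stage" can look like in practice, here is a minimal PyTorch sketch. It is an illustration only and not the actual FasterViT code: `ToyBackbone` and `FeatureExtractor` are hypothetical names, the stages are plain convolutions standing in for the real blocks, and the channel widths and the 4/8/16/32 strides are assumptions chosen for the example.

```python
import torch
import torch.nn as nn


class ToyBackbone(nn.Module):
    """Hypothetical 4-stage classifier used as a stand-in (NOT the real FasterViT)."""

    def __init__(self, dims=(16, 32, 64, 128), num_classes=1000):
        super().__init__()
        # Assumed layout: the stem downsamples by 4, each later stage by 2.
        self.stem = nn.Conv2d(3, dims[0], kernel_size=4, stride=4)
        self.stages = nn.ModuleList()
        for i, dim in enumerate(dims):
            layers = []
            if i > 0:
                layers.append(nn.Conv2d(dims[i - 1], dim, kernel_size=2, stride=2))
            layers.append(nn.Conv2d(dim, dim, kernel_size=3, padding=1))
            layers.append(nn.GELU())
            self.stages.append(nn.Sequential(*layers))
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(dims[-1], num_classes)  # linear classification head

    def forward(self, x):
        x = self.stem(x)
        for stage in self.stages:
            x = stage(x)
        return self.head(self.pool(x).flatten(1))


class FeatureExtractor(nn.Module):
    """Reuses the stem and stages, drops the head, and returns every stage output."""

    def __init__(self, backbone):
        super().__init__()
        self.stem = backbone.stem
        self.stages = backbone.stages  # the pooling layer and head are not copied

    def forward(self, x):
        features = []
        x = self.stem(x)
        for stage in self.stages:
            x = stage(x)
            features.append(x)  # one feature map per stage
        return features


backbone = ToyBackbone()
extractor = FeatureExtractor(backbone).eval()
with torch.no_grad():
    feats = extractor(torch.randn(1, 3, 256, 384))  # non-square input
shapes = [tuple(f.shape) for f in feats]
```

The list of per-stage feature maps is the kind of output a detection or segmentation neck (for example an FPN-style module) typically consumes, which is why the classification head is not needed.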

Answer 7 · 2023-06-22T05:48:27.000Z

Will close this issue for now.

Answer 8 · 2023-10-14T20:17:39.000Z

Thanks for your inquiry @pianogGG! Please see our newly released object detection repository and the FasterViT backbone!

We will add pretrained checkpoints very soon!