grimoire/mmdetection-to-tensorrt

Is the output of the model the result before NMS?

Eleanor456 opened this issue · 10 comments

Thank you for your great work, but I have some problems when I convert the mmdetection model to the torch model or the TensorRT model.
Is the output of the model the result before NMS? I obtained a 100-dimensional result when I print `inference_detector(trt_model, image_path, cfg_path, args.device)`, and it is different from the post-processed result of the mmdetection model.
I also tested the torch model and the TensorRT model with `return_wrap_model=True`. The outputs of the two models are the same, but they are both 100-dimensional tensors, which is far more detections than my ground truth.

Hi,
The output has been through the NMS layer. Invalid boxes will be filled with 0 and cls_id will be filled with -1.
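For anyone hitting the same confusion: since the output has a fixed number of slots (100 here), the real detections can be filtered out by the -1 class-id padding. A small numpy sketch; the names and shapes are my assumptions for illustration, not this repo's exact output layout:

```python
import numpy as np

# Hypothetical fixed-size output: 100 slots per image; invalid slots are
# padded with cls_id = -1 and zero boxes, as described above.
num_slots = 100
cls_ids = np.full(num_slots, -1, dtype=np.int64)
boxes = np.zeros((num_slots, 4), dtype=np.float32)
scores = np.zeros(num_slots, dtype=np.float32)

# Pretend the first three slots hold real detections.
cls_ids[:3] = [0, 2, 5]
scores[:3] = [0.9, 0.8, 0.7]
boxes[:3] = [[10, 10, 50, 50], [20, 30, 80, 90], [5, 15, 25, 45]]

# Keep only the valid slots.
valid = cls_ids >= 0
print(boxes[valid].shape)  # only the 3 real detections remain
```

So a 100-dimensional output with mostly zero boxes is expected; compare against the ground truth only after masking out the -1 entries.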

But all the values in my result are valid. Is the post-processing method the same as mmdet's?

Hi @grimoire !
Possibly a stupid question, but I'd like to be sure whether NMS is applied in the converted .engine (I've converted a GFL model to .engine and am using it in C++), or whether I should add that post-processing step (NMS) right after the tensorrt->detect call.
If it's applied, then my next question: I saw mmcv has a flag which turns on cross-class NMS. If I use it, will NMS within the TensorRT .engine also be cross-class?

Thanx)

@vedrusss NMS has been included in the converted model, and it is cross-class by default. Read the PyTorch implementation and the TensorRT converter for details.
Actually, the NMS TensorRT implementation here is modified from Nvidia's official plugin batchedNMSPlugin.

Hi @grimoire, from your PyTorch implementation I can see NMS is applied to each class separately (within the `for cls_idx in range(scores.shape[2])` loop) and then the NMS-ed results of each class are stacked into the final result. Am I right? If so, is there a way to do it for all labels together (cross-label)?

@vedrusss Yes, it works just like you understand.
And I think there is no way to do cross-label NMS for now. You might need to add another single-class NMS after the inference.
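For completeness, a class-agnostic pass after inference can be done with a standalone NMS. A minimal sketch in plain numpy (my own illustration, not tied to this repo's API), operating on `[x1, y1, x2, y2]` boxes regardless of their labels:

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    """Class-agnostic (cross-label) NMS over [x1, y1, x2, y2] boxes."""
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        # IoU of the top box against the remaining candidates.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                 (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thr]
    return np.array(keep)

# Two heavily overlapping boxes (possibly different classes) and one far away:
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
keep = nms(boxes, scores, iou_thr=0.5)
print(keep)  # [0 2] -- the overlapping lower-score box is suppressed
```

Since labels are ignored here, overlapping boxes are suppressed even when they belong to different classes, which is exactly the cross-label behavior discussed above.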

@grimoire, I've reviewed Nvidia's BatchedNMSPlugin and it looks like the parameter shareLocation can be used to force cross-label NMS ("If set to true, the boxes input are shared across all classes. If set to false, the boxes input should account for per-class box data.").
So it looks like I could obtain a TRT .engine with a cross-label NMS layer without re-implementing your BatchedNMS module (of course, in that case the original PyTorch mmdetection model would still do per-class NMS).
I guess I must investigate the scores and boxes passed to the BatchedNMS forward, i.e. which dimension contains what, because according to convert_batchednms the shareLocation flag is derived from the boxes: `shareLocation = (bboxes.shape[2] == 1)`.

@vedrusss I think the shareLocation flag shares the boxes between different classes, but each class is still processed separately. Read the CUDA kernel here.
The number of blocks is `const int GS = num_classes;`, which means NMS for different classes is processed in different CUDA blocks. shareLocation is only used to control the offset of the boxes.
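A toy illustration of that reading (my own numpy sketch, not the plugin's actual CUDA code): one block handles one class, and shareLocation only changes which box column a class reads, so the per-class suppression loops never interact.

```python
import numpy as np

num_boxes, num_classes = 5, 3
# shareLocation = True: a single shared box column for all classes.
boxes_shared = np.arange(num_boxes * 1 * 4, dtype=float).reshape(num_boxes, 1, 4)
# shareLocation = False: one box column per class.
boxes_per_cls = np.arange(num_boxes * num_classes * 4, dtype=float).reshape(
    num_boxes, num_classes, 4)

def class_boxes(boxes, cls_idx):
    # Mirrors shareLocation = (bboxes.shape[2] == 1) from convert_batchednms
    # (no batch dim here, so the class axis is axis 1): a single box column
    # means every class reads offset 0.
    share_location = boxes.shape[1] == 1
    return boxes[:, 0 if share_location else cls_idx]

for cls_idx in range(num_classes):  # one CUDA block per class in the kernel
    b = class_boxes(boxes_shared, cls_idx)
    # ...per-class suppression would run here on (b, scores[:, cls_idx])...
```

So even with shareLocation=True, every class sees the same boxes but still runs its own independent NMS, which is why the flag alone does not give cross-label behavior.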

twmht commented

@grimoire

Is the input format of the batchedNMSPlugin the same as in the TensorRT repo (https://github.com/NVIDIA/TensorRT/tree/main/plugin/batchedNMSPlugin)?

@twmht Yes, actually it is modified from the official one.