grimoire/mmdetection-to-tensorrt

Is the output of the model the result before NMS?

Eleanor456 opened this issue · 10 comments

Thank you for your great work, but I have some problems when I convert the mmdetection model to the torch model or the TensorRT model.
Is the output of the model the result before NMS? I obtained a 100-dimensional result when I print `inference_detector(trt_model, image_path, cfg_path, args.device)`, and it is different from the post-processed result of the mmdetection model.
I also tested the torch model and the TensorRT model with `return_wrap_model=True`. The outputs of the two models are the same, but they are both 100-dimensional tensors, which is far more detections than my ground truth.

Hi,
The output has been through the NMS layer. Invalid boxes will be filled with 0 and cls_id will be filled with -1.
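For anyone hitting the same confusion: since the output has a fixed number of slots (100 here), the real detections can be filtered out by the -1 class-id padding. A small numpy sketch; the names and shapes are my assumptions for illustration, not this repo's exact output layout:

```python
import numpy as np

# Hypothetical fixed-size output: 100 slots per image; invalid slots are
# padded with cls_id = -1 and zero boxes, as described above.
num_slots = 100
cls_ids = np.full(num_slots, -1, dtype=np.int64)
boxes = np.zeros((num_slots, 4), dtype=np.float32)
scores = np.zeros(num_slots, dtype=np.float32)

# Pretend the first three slots hold real detections.
cls_ids[:3] = [0, 2, 5]
scores[:3] = [0.9, 0.8, 0.7]
boxes[:3] = [[10, 10, 50, 50], [20, 30, 80, 90], [5, 15, 25, 45]]

# Keep only the valid slots.
valid = cls_ids >= 0
print(boxes[valid].shape)  # only the 3 real detections remain
```

So a 100-dimensional output with mostly zero boxes is expected; compare against the ground truth only after masking out the -1 entries.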

But all the values in my result are valid. Is the post-processing method the same as mmdet's?

Hi @grimoire !
Possibly a stupid question, but I'd like to be sure whether NMS is applied in the converted .engine (I've converted a GFL model to .engine and am using it in C++), or whether I should add that post-processing step (NMS) right after the tensorrt->detect call.
If it's applied, then my next question: I saw mmcv has a flag which turns on cross-class NMS. If I use it, will NMS within the TensorRT .engine also be cross-class?

Thanx)

@vedrusss NMS has been included in the converted model, and it is cross-class by default. Read the PyTorch implementation and the TensorRT converter for details.
Actually, the NMS TensorRT implementation here is modified from Nvidia's official plugin batchedNMSPlugin.

Hi @grimoire, from your PyTorch implementation I can see NMS is applied to each class separately (within the `for cls_idx in range(scores.shape[2])` loop) and then the NMS-ed results of each class are stacked into the final result. Am I right? If so, is there a way to do it for all labels together (cross-label)?

@vedrusss Yes, it works just like you understand.
And I think there is no way to do cross-label NMS for now. You might need to add another single-class NMS after the inference.
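For completeness, a class-agnostic pass after inference can be done with a standalone NMS. A minimal sketch in plain numpy (my own illustration, not tied to this repo's API), operating on `[x1, y1, x2, y2]` boxes regardless of their labels:

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    """Class-agnostic (cross-label) NMS over [x1, y1, x2, y2] boxes."""
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        # IoU of the top box against the remaining candidates.
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                 (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thr]
    return np.array(keep)

# Two heavily overlapping boxes (possibly different classes) and one far away:
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
keep = nms(boxes, scores, iou_thr=0.5)
print(keep)  # [0 2] -- the overlapping lower-score box is suppressed
```

Since labels are ignored here, overlapping boxes are suppressed even when they belong to different classes, which is exactly the cross-label behavior discussed above.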

@grimoire, I've reviewed Nvidia's BatchedNMSPlugin and it looks like the parameter shareLocation can be used to force cross-label NMS ("If set to true, the boxes input are shared across all classes. If set to false, the boxes input should account for per-class box data.").
So it looks like I could obtain a TRT .engine with a cross-label NMS layer without re-implementing your BatchedNMS module (of course, in that case the original PyTorch mmdetection model would still do per-class NMS).
I guess I must investigate the scores and boxes passed to the BatchedNMS forward, i.e. which dimension contains what, because according to convert_batchednms the shareLocation flag is derived from the boxes: `shareLocation = (bboxes.shape[2] == 1)`.

@vedrusss I think the shareLocation flag shares the boxes between different classes, but each class is still processed separately. Read the CUDA kernel here.
The number of blocks is `const int GS = num_classes;`, which means NMS for different classes is processed in different CUDA blocks. shareLocation is only used to control the offset of the boxes.
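A toy illustration of that reading (my own numpy sketch, not the plugin's actual CUDA code): one block handles one class, and shareLocation only changes which box column a class reads, so the per-class suppression loops never interact.

```python
import numpy as np

num_boxes, num_classes = 5, 3
# shareLocation = True: a single shared box column for all classes.
boxes_shared = np.arange(num_boxes * 1 * 4, dtype=float).reshape(num_boxes, 1, 4)
# shareLocation = False: one box column per class.
boxes_per_cls = np.arange(num_boxes * num_classes * 4, dtype=float).reshape(
    num_boxes, num_classes, 4)

def class_boxes(boxes, cls_idx):
    # Mirrors shareLocation = (bboxes.shape[2] == 1) from convert_batchednms
    # (no batch dim here, so the class axis is axis 1): a single box column
    # means every class reads offset 0.
    share_location = boxes.shape[1] == 1
    return boxes[:, 0 if share_location else cls_idx]

for cls_idx in range(num_classes):  # one CUDA block per class in the kernel
    b = class_boxes(boxes_shared, cls_idx)
    # ...per-class suppression would run here on (b, scores[:, cls_idx])...
```

So even with shareLocation=True, every class sees the same boxes but still runs its own independent NMS, which is why the flag alone does not give cross-label behavior.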

twmht commented

@grimoire

Is the input format of the batchedNMSPlugin the same as in the TensorRT repo (https://github.com/NVIDIA/TensorRT/tree/main/plugin/batchedNMSPlugin)?

@twmht Yes, actually it is modified from the official one.