thb1314/tensorrt-onnx-fasterrcnn-fpn-roialign

app_fasterrcnn

Closed this issue · 53 comments

ONNX parsing error for rpn_backbone_resnet50:
(nbSpatialDims == kernelWeights.shape.nbDims - 2) && "The number of spatial dimensions and the kernel shape doesn't match up for the Conv operator."

@rasbery1
Which TensorRT version are you using? Some users here have hit this error when the TRT version is greater than 8.0.0.0 but less than 8.0.3.4.
TRT 7.2 works.
Versions 8.0.3.4 and above also work, for your reference.
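
For reference, a quick way to check which TensorRT version a Python environment actually ships, using the standard tensorrt Python binding:

    # Print the TensorRT version of the current Python environment.
    import tensorrt as trt
    print(trt.__version__)  # e.g. 8.0.3.4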

So TensorRT should be 7.2, or 8.0.3.4 and above, right?

Yes, and you can check the ONNX node name replacement operation in Step 7 (see the README).

@thb1314 Step 7 is related to exporting the header part, and the node names are changed there. The problem is related to parsing rpn_backbone_resnet50.onnx. The error occurs while importing the following node. Is this still related to the version of TensorRT?
input: "rpn.head.conv.weight"
input: "rpn.head.conv.bias"
input: "feature_0"
output: "1388"
name: "Conv_243"
op_type: "Conv"

@rasbery1 Could you give more information about the error? Furthermore, you can check the output of Step 1 to Step 8 one by one.

@thb1314 The complete error is:
While parsing node number 240 [Conv -> "1388"]:
[error][trt_builder.cpp:30]:NVInfer: tensorrt_code/src/tensorRT/onnx_parser/ModelImporter.cpp:737: --- Begin node ---
[error][trt_builder.cpp:30]:NVInfer: tensorrt_code/src/tensorRT/onnx_parser/ModelImporter.cpp:738: input: "rpn.head.conv.weight"
input: "rpn.head.conv.bias"
input: "feature_0"
output: "1388"
name: "Conv_243"
op_type: "Conv"
attribute {
name: "dilations"
ints: 1
ints: 1
type: INTS
}
attribute {
name: "group"
i: 1
type: INT
}
attribute {
name: "kernel_shape"
ints: 3
ints: 3
type: INTS
}
attribute {
name: "pads"
ints: 1
ints: 1
ints: 1
ints: 1
type: INTS
}
attribute {
name: "strides"
ints: 1
ints: 1
type: INTS
}

[error][trt_builder.cpp:30]:NVInfer: tensorrt_code/src/tensorRT/onnx_parser/ModelImporter.cpp:739: --- End node ---
[error][trt_builder.cpp:30]:NVInfer: tensorrt_code/src/tensorRT/onnx_parser/ModelImporter.cpp:741: ERROR: tensorrt_code/src/tensorRT/onnx_parser/builtin_op_importers.cpp:624 In function importConv:
[8] Assertion failed: (nbSpatialDims == kernelWeights.shape.nbDims - 2) && "The number of spatial dimensions and the kernel shape doesn't match up for the Conv operator."

@rasbery1 OK, could you share your generated ONNX file via Google Drive or Baidu Disk? I will check the ONNX file in Netron.

@rasbery1 You can also check the network structure in Netron and make sure Step 4 and Step 6 are correct.

[image: Netron screenshot of the Conv node with swapped inputs]

@rasbery1 As this picture shows, your ONNX file is wrong: the "X" input of the Conv node cannot be an initializer, and the "B" input (the bias of the conv operation) cannot be "feature_0"; the two inputs should switch positions.
Maybe it is a bug in onnxsim. You can try pip install onnx-simplifier==0.3.6 and execute Step 1 to Step 8 one by one again.
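
To verify this without opening Netron, one can print the input order of the offending node; here is a minimal sketch (the node name Conv_243 and the file name come from the error log above, and the expected ONNX Conv input order is X, W, B):

    import onnx

    model = onnx.load("rpn_backbone_resnet50.onnx")
    for node in model.graph.node:
        if node.op_type == "Conv" and node.name == "Conv_243":
            # A correct node lists the activation first:
            # ['feature_0', 'rpn.head.conv.weight', 'rpn.head.conv.bias']
            print(node.input)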

OK, thanks. Just wondering why the graph in ONNX looks very complicated and what the ONNX simplifier does.

onnx-simplifier is used to merge redundant ops in an ONNX graph. For more information, see https://github.com/daquexian/onnx-simplifier.
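
For reference, a minimal sketch of how the simplifier is typically invoked with the 0.3.x API (the input name and shape here are assumptions taken from the build log later in this thread):

    import onnx
    import onnxsim

    model = onnx.load("rpn_backbone_resnet50.onnx")
    # check_n controls how many random-input runs verify the simplified graph
    model_simp, ok = onnxsim.simplify(
        model,
        check_n=1,
        input_shapes={'input': [1, 3, 608, 800]},
        dynamic_input_shape=False,
    )
    assert ok, "simplified ONNX failed the output check"
    onnx.save(model_simp, "rpn_backbone_resnet50.onnx")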

@thb1314 I'm building the executable "pro" again, and I get this error: /usr/bin/ld: cannot find -lcuda
collect2: error: ld returned 1 exit status
Any idea?

@rasbery1 Edit the CUDA path in the CMake config file.

@thb1314 I edited the CMake file but I'm getting this error:
/usr/bin/ld: cannot find -lcublas

Make sure cuDNN and the cuDNN patch (if one exists) are installed, and that the cuDNN path in the CMake config file is correct.
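
As a quick diagnostic, a minimal sketch for checking whether the CUDA libraries the linker complains about are visible on the default search paths at all (illustrative only):

    import ctypes.util

    # Prints the resolved library name if found, or None if the
    # linker would likely fail to find it too.
    for lib in ("cuda", "cublas", "cudnn"):
        print(lib, "->", ctypes.util.find_library(lib))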

Just wondering if the level in the "rpn_boxes" ONNX output node represents the level of the feature map corresponding to each bounding box before NMS.

And does the x06reduceRpnOnnx script remove the ROI pooling layer from the RPN part?

The level in the "rpn_boxes" ONNX output node represents the FPN level of the current bounding box. If level = i (i = 0, 1, 2, 3), the current bounding box was computed from the i-th level anchor boxes in the FPN.

The x06reduceRpnOnnx script removes the first output of the FPN, which is used only for training.

@rasbery1 Excuse me, may I know where you are from?

If the level of the rpn_boxes is already passed, why is the level assignment for ROI Align done in CUDA code? I see that it is done in fasterrcnn_decode.cu; I just want to know the reason.

    // FPN level: floor(4 + log2(sqrt(area) / 224)) - 2, clamped to [0, 3]
    float area = width * height;
    int fpn_lvl = floorf(4 + log2(sqrt(area) / 224) + 1e-6) - 2;

    fpn_lvl = fpn_lvl > 3 ? 3 : fpn_lvl;
    fpn_lvl = fpn_lvl < 0 ? 0 : fpn_lvl;
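
As a worked example of that formula, a minimal Python sketch (not from the repo) that mirrors the CUDA expression above:

    import math

    def fpn_level(width, height):
        # floor(4 + log2(sqrt(area) / 224)) - 2, clamped to [0, 3]
        area = width * height
        lvl = math.floor(4 + math.log2(math.sqrt(area) / 224) + 1e-6) - 2
        return min(max(lvl, 0), 3)

    print(fpn_level(112, 112))  # 1 -> P3
    print(fpn_level(224, 224))  # 2 -> P4
    print(fpn_level(640, 640))  # 3 -> clamped to the top level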

@thb1314 Regarding this issue, I had to change check_n from 0 to 1 in the following command to get the correct node:
model_simp, check = onnxsim.simplify(model, check_n=1, input_shapes={'input': [1, 3, input_height, input_width]},
dynamic_input_shape=False)

Now rpn_backbone_resnet50 is imported but I'm getting another error:
Compile FP32 Onnx Model 'rpn_backbone_resnet50.onnx'.
[info][trt_builder.cpp:557]:Input shape is -1 x 3 x 608 x 800
[info][trt_builder.cpp:558]:Set max batch size = 1
[info][trt_builder.cpp:559]:Set max workspace size = 1024.00 MB
[info][trt_builder.cpp:562]:Network has 1 inputs:
[info][trt_builder.cpp:568]: 0.[input] shape is -1 x 3 x 608 x 800
[info][trt_builder.cpp:574]:Network has 6 outputs:
[info][trt_builder.cpp:579]: 0.[rpn_boxes] shape is 1 x 4390 x 6
[info][trt_builder.cpp:579]: 1.[feature_0] shape is -1 x 256 x 152 x 200
[info][trt_builder.cpp:579]: 2.[feature_1] shape is -1 x 256 x 76 x 100
[info][trt_builder.cpp:579]: 3.[feature_2] shape is -1 x 256 x 38 x 50
[info][trt_builder.cpp:579]: 4.[feature_3] shape is -1 x 256 x 19 x 25
[info][trt_builder.cpp:579]: 5.[feature_pool] shape is -1 x 256 x 10 x 13
[info][trt_builder.cpp:583]:Network has 572 layers:
[info][trt_builder.cpp:650]:Building engine...
[warn][trt_builder.cpp:33]:NVInfer: Detected invalid timing cache, setup a local cache instead
[warn][trt_builder.cpp:33]:NVInfer: GPU error during getBestTactic: Gather_329 : invalid configuration argument
[trt_builder.cpp:30]:NVInfer: 10: [optimizer.cpp::computeCosts::1853] Error Code 10: Internal Error (Could not find any implementation for node Gather_329.)
[trt_builder.cpp:654]:engine is nullptr
[warn][trt_builder.cpp:33]:NVInfer: The logger passed into createInferRuntime differs from one already assigned, 0x55bd42639d00, logger not updated.

[error][fasterrcnn.cpp:164]:Engine rpn_backbone_resnet50.FP32.trtmodel load failed
[error][app_fasterrcnn.cpp:44]:Engine is nullptr

This is the link to the onnx file. Could you please help me with this issue?
https://drive.google.com/file/d/14INj1ut3di6kuEKtbAN5iKYuKGGdQFwW/view?usp=sharing

@rasbery1 Excuse me, may I know where you are from?
Armenia

I would appreciate it if you could help me with the mentioned issue.

@rasbery1 fpn_lvl is used to select the level of the RPN's output features, which are passed to the header. level in the code indicates the level of the anchor predefined in the FPN; it can be viewed as a "class" of the bbox and is used for NMS in the RPN.

feature_pool should be removed in the next step. Execute Step 1 to Step 8 one by one again. Furthermore, don't forget to delete the '*.trtmodel' files in the workspace dir if they exist.

@thb1314 Thank you very much for your help. I ran the Python scripts from Step 1 to Step 8, but I noticed this problem: the rpn_backbone_resnet50 generated in Step 3 looks good, i.e. X in Conv node 243 is feature_0 and W and B are shown as weights and bias, but after executing Step 6, which removes feature_pool, I get the wrong rpn_backbone_resnet50.onnx file again, as before (X is shown as weights and B becomes feature_0 in the Conv node). I'm not sure what the problem is. Maybe it is related to the graph.cleanup() call, but I'm not sure. Could you help me with it?

@rasbery1 I tested the Step 6 script in my Python 3.7 conda env from scratch and didn't replicate the bug. Maybe you can create a clean env and try again.
See below for another solution.
Use the following code to replace the Step 6 script:

import onnx


def cutOnnx():
    onnx_save_path = "rpn_backbone_resnet50.onnx"
    onnx_model = onnx.load(onnx_save_path)

    # collect the "feature_pool" entry among the graph outputs
    removed_list = list()
    for item in onnx_model.graph.output:
        if item.name == "feature_pool":
            removed_list.append(item)

    # remove feature_pool from the graph outputs and save the model
    for item in removed_list:
        onnx_model.graph.output.remove(item)
    print(onnx_model.graph.output)

    onnx.save(onnx_model, onnx_save_path)


if __name__ == '__main__':
    cutOnnx()

@thb1314 Thank you so much. I got the correct graph, but when creating the TRT engine for the RPN part I'm getting the following error. This does not happen when generating the engine for new_header.onnx:
[warn][trt_builder.cpp:33]:NVInfer: Detected invalid timing cache, setup a local cache instead
[warn][trt_builder.cpp:33]:NVInfer: GPU error during getBestTactic: Gather_329 : invalid configuration argument
[error][trt_builder.cpp:30]:NVInfer: 10: [optimizer.cpp::computeCosts::1853] Error Code 10: Internal Error (Could not find any implementation for node Gather_329.)
[error][trt_builder.cpp:654]:engine is nullptr

@rasbery1 Which ONNX file?
A Gather op should not exist in the ONNX graph.
Execute Steps 5 to 8 and replace the node names in Step 7 as illustrated there.
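
A minimal sketch for checking whether any Gather nodes remain in a graph (illustrative; the file name follows the rest of this thread):

    import onnx

    model = onnx.load("rpn_backbone_resnet50.onnx")
    gathers = [n.name for n in model.graph.node if n.op_type == "Gather"]
    print(gathers if gathers else "no Gather nodes found")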

This is related to rpn_backbone_resnet50.onnx, which comes from x06reduceRpnOnnx.py. I'm using the script that you sent, and it still has Gather nodes. Could you please help me remove them?

def cutOnnx():
    onnx_save_path = "rpn_backbone_resnet50.onnx"
    onnx_model = onnx.load(onnx_save_path)
    removed_list = list()
    for item in onnx_model.graph.output:
        if item.name == "feature_pool":
            removed_list.append(item)
    for item in removed_list:
        onnx_model.graph.output.remove(item)
    # print(onnx_model.graph.output)

    # remove feature pool
    onnx.save(onnx_model, onnx_save_path)

if __name__ == '__main__':
    cutOnnx()

@rasbery1 I am sorry, I drew the wrong conclusion. The Gather op parsing error may result from the onnxparser version. You need to verify your TensorRT version and onnxparser according to https://github.com/thb1314/tensorrt-onnx-fasterrcnn-fpn-roialign/tree/master/tensorrt_code#setup-and-configuration

@rasbery1 I have tested the onnx file rpn_backbone_resnet50.onnx in my TRT 8.0.3.4 env.

[image: onnxparser]
@thb1314 My TRT version is also 8.0.3.4. For the onnx parser, I replaced src/tensorRT/onnx_parser with onnx_parser_for_8.x/onnx_parser according to these instructions https://github.com/thb1314/tensorrt-onnx-fasterrcnn-fpn-roialign/tree/master/tensorrt_code#setup-and-configuration, but I'm still getting the error for the Gather node in the RPN ONNX file. Could you please guide me on how to fix it?

@rasbery1 Could you share your ONNX file?

@thb1314 Just gave you the access

@thb1314 Could you check it now?

@rasbery1 Ok, I have got it.

@thb1314 Just wondering if you got a chance to look at it.

@rasbery1
I tested the ONNX file you provided in my TensorRT environment and got the correct result.
Maybe you can check your protobuf version; it is 3.11.4 in my environment.
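
A quick way to print the installed protobuf version, for comparison:

    import google.protobuf

    print(google.protobuf.__version__)  # 3.11.4 in the working environment above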

Thanks, I'll change the version of protobuf to see how it goes.

@thb1314 It was fixed, thanks. The engines are generated, but now I'm getting these errors from fasterrcnn_decode.cu during inference. I would appreciate it if you could guide me to fix this too:

[error][preprocess_kernel.cu:385]:launch failed: no kernel image is available for execution on the device
[error][preprocess_kernel.cu:385]:launch failed: no kernel image is available for execution on the device
[error][fasterrcnn_decode.cu:203]:launch failed: no kernel image is available for execution on the device
[error][fasterrcnn_decode.cu:207]:launch failed: no kernel image is available for execution on the device
[error][trt_tensor.cpp:224]:Offset location[0] >= bytes_[0], out of range
[error][trt_tensor.cpp:224]:Offset location[0] >= bytes_[0], out of range
[error][trt_builder.cpp:30]:NVInfer: 3: [executionContext.cpp::setBindingDimensions::970] Error Code 3: Internal Error (Parameter check failed at: runtime/api/executionContext.cpp::setBindingDimensions::970, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [0,4] for bindings[0] exceed min ~ max range at index 0, maximum dimension in profile is 5120, minimum dimension in profile is 1, but supplied dimension is 0.
)
[error][trt_builder.cpp:30]:NVInfer: 3: [executionContext.cpp::setBindingDimensions::970] Error Code 3: Internal Error (Parameter check failed at: runtime/api/executionContext.cpp::setBindingDimensions::970, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [0,256,7,7] for bindings[1] exceed min ~ max range at index 0, maximum dimension in profile is 5120, minimum dimension in profile is 1, but supplied dimension is 0.
)
[error][trt_builder.cpp:30]:NVInfer: 3: [executionContext.cpp::enqueueInternal::322] Error Code 3: Internal Error (Parameter check failed at: runtime/api/executionContext.cpp::enqueueInternal::322, condition: bindings[x] != nullptr
)
[fatal][trt_infer.cpp:340]:execute fail, code 209[cudaErrorNoKernelImageForDevice], message no kernel image is available for execution on the device
[error][fasterrcnn_decode.cu:111]:launch failed: no kernel image is available for execution on the device
[error][fasterrcnn_decode.cu:115]:launch failed: no kernel image is available for execution on the device
[error][fasterrcnn_decode.cu:203]:launch failed: no kernel image is available for execution on the device
[error][fasterrcnn_decode.cu:207]:launch failed: no kernel image is available for execution on the device
[error][trt_tensor.cpp:224]:Offset location[0] >= bytes_[0], out of range
[error][trt_tensor.cpp:224]:Offset location[0] >= bytes_[0], out of range
[error][trt_builder.cpp:30]:NVInfer: 3: [executionContext.cpp::setBindingDimensions::970] Error Code 3: Internal Error (Parameter check failed at: runtime/api/executionContext.cpp::setBindingDimensions::970, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [0,4] for bindings[0] exceed min ~ max range at index 0, maximum dimension in profile is 5120, minimum dimension in profile is 1, but supplied dimension is 0.
)
[error][trt_builder.cpp:30]:NVInfer: 3: [executionContext.cpp::setBindingDimensions::970] Error Code 3: Internal Error (Parameter check failed at: runtime/api/executionContext.cpp::setBindingDimensions::970, condition: profileMinDims.d[i] <= dimensions.d[i]. Supplied binding dimension [0,256,7,7] for bindings[1] exceed min ~ max range at index 0, maximum dimension in profile is 5120, minimum dimension in profile is 1, but supplied dimension is 0.
)
[error][trt_builder.cpp:30]:NVInfer: 3: [executionContext.cpp::enqueueInternal::322] Error Code 3: Internal Error (Parameter check failed at: runtime/api/executionContext.cpp::enqueueInternal::322, condition: bindings[x] != nullptr
)

@rasbery1 "no kernel image is available for execution on the device": it seems that your TRT version doesn't match your CUDA version.
And was the last bug caused by the protobuf version?

@thb1314 I used trtpy get-env and ran the code in the downloaded env, which is trt8cuda112cudnn8; I'm not sure how to figure out the exact TensorRT 8 version in the mentioned env.

@rasbery1 The trtpy get-env command fetches the latest trtpy environment; maybe its version doesn't match the current onnxparser.
Following the instructions at https://github.com/thb1314/tensorrt-onnx-fasterrcnn-fpn-roialign/tree/master/tensorrt_code#setup-and-configuration is the best choice.

I will add support for the latest trtpy in the future.

Just wondering if level_index is related to the FPN level each box belongs to?
const int level_index = int(offset_bottom_rois[roi_cols - 2]);

in tensorrt_code/src/application/app_fasterrcnn/fasterrcnn.cpp, lines 287-299:

for(int i = 0; i < count; ++i) {
    float* pbox  = parray + 1 + i * RPN_NUM_BOX_ELEMENT;
    int keepflag = pbox[6];
    if(keepflag == 1) {
        // left, top, right, bottom, score, level, keepflag, fpn_level, batch_index
        roi_align_inputs_cpu_ptr[roi_align_inputs_index++] = pbox[0];
        roi_align_inputs_cpu_ptr[roi_align_inputs_index++] = pbox[1];
        roi_align_inputs_cpu_ptr[roi_align_inputs_index++] = pbox[2];
        roi_align_inputs_cpu_ptr[roi_align_inputs_index++] = pbox[3];
        roi_align_inputs_cpu_ptr[roi_align_inputs_index++] = pbox[7];
        roi_align_inputs_cpu_ptr[roi_align_inputs_index++] = pbox[8];
    }
}

roi_align_inputs has 6 columns corresponding to "left, top, right, bottom, fpn_level, batch_index".
level_index is the fpn_level calculated from the area of the bbox.

I see that the shape of the proposals during inference is proposals: shape {5120 x 4}. I was thinking that initially the number of boxes is 4390, and after NMS and ROI alignment it becomes 1000, which is the input of the header part. Why is the shape not 1000 x 4 instead?

@rasbery1 The output of the RPN network is dynamic due to NMS. "1000" is an artificial maximum number of bboxes, which can be modified in the code.

@rasbery1 No more questions?

@thb1314 Is the inference time around 38 ms? How does it compare to when we don't use TensorRT?

@rasbery1 As long as it works, you can compare the speed with the PyTorch Python API or TorchScript.
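
For instance, a rough timing sketch for the torchvision baseline (illustrative assumptions: the model variant, warm-up count, and 608x800 input are not taken from this repo, and numbers vary by GPU):

    import time
    import torch
    import torchvision

    # torchvision's reference Faster R-CNN (ResNet50 + FPN) as the PyTorch baseline
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    model.eval().cuda()
    images = [torch.rand(3, 608, 800, device="cuda")]

    with torch.no_grad():
        for _ in range(5):            # warm-up
            model(images)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(20):
            model(images)
        torch.cuda.synchronize()
        print((time.time() - start) / 20 * 1000, "ms per image")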