fabio-sim/LightGlue-ONNX

Convert 2.0 End-to-end parallel ONNX file to TensorRT engine

Closed this issue · 4 comments

Hi! Thanks for this helpful repository and the new release.

I'm facing a problem running the TensorRT engine built from the newly released superpoint_lightglue_pipeline.trt.onnx.

Following trt_infer.py, build_engine ran successfully after I modified the optimization profile for static inputs. Weirdly, in the run_engine function, context = engine.create_execution_context() returns None, like this:

[07/23/2024-14:32:36] [TRT] [I] Loaded engine size: 61 MiB
[07/23/2024-14:32:36] [TRT] [V] Deserialization required 55328 microseconds.
[07/23/2024-14:32:36] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +59, now: CPU 0, GPU 59 (MiB)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
[07/23/2024-14:32:36] [TRT] [V] Total per-runner device persistent memory is 98816
[07/23/2024-14:32:36] [TRT] [V] Total per-runner host persistent memory is 84192
[07/23/2024-14:32:36] [TRT] [V] Allocated activation device memory of size 1105330176
Traceback (most recent call last):
  File "/root/LightGlue/ONNX-2.0/trt_infer.py", line 141, in <module>
    outputs = run_engine(output_path)
  File "/root/LightGlue/ONNX-2.0/trt_infer.py", line 124, in run_engine
    context.set_input_shape(name, tuple(shape))
AttributeError: 'NoneType' object has no attribute 'set_input_shape'
[07/23/2024-14:32:36] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)

Could this be caused by the "Unexpected exception vector" errors, or is it something else? Is there a solution? The TensorRT version I'm using is 8.6.1.
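For reference, the failing path reduces to something like the sketch below (minimal, using only the standard TRT 8.6 Python API; the engine path is illustrative):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

def load_engine(engine_path: str):
    # Deserialize the prebuilt engine from disk.
    with open(engine_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

engine = load_engine("superpoint_lightglue_pipeline.engine")  # illustrative path
context = engine.create_execution_context()
# TensorRT returns None here (instead of raising) when context creation fails
# internally, which matches the vector::_M_range_check errors above and the
# later AttributeError on context.set_input_shape.
if context is None:
    raise RuntimeError("engine.create_execution_context() returned None")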

Thanks

Hi @EvW1998, thank you for your interest in LightGlue-ONNX.

For TensorRT, I've only tested TRT 10.2 via the ONNX Runtime TensorRT Execution Provider. I haven't had a chance to test it with the pure TRT API so far.

Actually, it could just be that my trt_infer.py is broken. Sorry, I'm not too familiar with the pure TRT API.
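For what it's worth, the TRT 10.2 path goes through ONNX Runtime rather than the raw TRT API, roughly like this (a minimal sketch; the provider options shown are common ones, not specific to this repo):

import onnxruntime as ort

providers = [
    # The TRT EP builds and caches the engine internally, so there is no
    # manual context or output-buffer handling on our side.
    ("TensorrtExecutionProvider", {"trt_fp16_enable": True, "trt_engine_cache_enable": True}),
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]
session = ort.InferenceSession("superpoint_lightglue_pipeline.trt.onnx", providers=providers)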

Thanks for your reply @fabio-sim.

In trt_infer.py, you mention import tensorrt as trt # >= 8.6.1, and according to the v1.0 release, do you mean that only superpoint_lightglue.trt.onnx was tested with the pure TRT API on TRT 8.6.1?

Thanks

Yes, that's correct. The main problem, though, was that it was difficult to allocate the output tensors:

# TODO: Still haven't figured out dynamic output shapes yet:
if binding == "matches0":
    shape = (512, 2)
elif binding == "mscores0":
    shape = (512,)

I'll have a look at it again when I get some time.
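One possible direction: since TRT 8.5, the Python API exposes trt.IOutputAllocator, which lets the runtime report data-dependent output shapes after execution instead of requiring hardcoded sizes like the (512, 2) above. A rough, untested sketch assuming pycuda:

import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

class DynamicOutputAllocator(trt.IOutputAllocator):
    # TensorRT calls back into this object with the actual output sizes/shapes.
    def __init__(self):
        super().__init__()
        self.buffers = {}
        self.shapes = {}

    def reallocate_output(self, tensor_name, memory, size, alignment):
        # Allocate a device buffer once the real byte size is known.
        buf = cuda.mem_alloc(max(size, 1))  # guard against zero-size requests
        self.buffers[tensor_name] = buf
        return int(buf)

    def notify_shape(self, tensor_name, shape):
        # Record the final shape, e.g. (num_matches, 2) for "matches0".
        self.shapes[tensor_name] = tuple(shape)

# Usage: register before enqueueing, read shapes/buffers afterwards.
# allocator = DynamicOutputAllocator()
# context.set_output_allocator("matches0", allocator)
# context.set_output_allocator("mscores0", allocator)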

Closed by #91