fabio-sim/LightGlue-ONNX

Does LightGlue run in TensorRT mode, and how can the engine be built using the C++ interface?

weihaoysgs opened this issue · 6 comments

Thank you for your great work!

I have run the command

python infer.py \
  --img_paths assets/DSC_0410.JPG assets/DSC_0411.JPG \
  --lightglue_path weights/superpoint_lightglue_trt.onnx \
  --extractor_type superpoint \
  --extractor_path weights/superpoint.onnx \
  --trt \
  --viz

Then I get the correct result; everything is OK.

But I have two questions:

  1. Is SuperPoint run with TensorRT inference or plain ONNX Runtime?
  2. Is the engine file saved in the cache only the LightGlue engine file?

I ask because I want to use the TensorRT C++ interface to run SuperPoint and LightGlue inference, but when I read the LightGlue ONNX file and build() it, I get the following error:

[12/15/2023-15:51:30] [I] [TRT] No importer registered for op: MultiHeadAttention. Attempting to import as plugin.
[12/15/2023-15:51:30] [I] [TRT] Searching for plugin: MultiHeadAttention, plugin_version: 1, plugin_namespace: 
[12/15/2023-15:51:30] [E] [TRT] ModelImporter.cpp:726: While parsing node number 274 [MultiHeadAttention -> "/transformers.0/self_attn/Reshape_5_output_0"]:
[12/15/2023-15:51:30] [E] [TRT] ModelImporter.cpp:727: --- Begin node ---
[12/15/2023-15:51:30] [E] [TRT] ModelImporter.cpp:728: input: "Transpose_0_out"
output: "/transformers.0/self_attn/Reshape_5_output_0"
name: "MultiHeadAttention_0"
op_type: "MultiHeadAttention"
attribute {
  name: "num_heads"
  i: 4
  type: INT
}
domain: "com.microsoft"

[12/15/2023-15:51:30] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[12/15/2023-15:51:30] [E] [TRT] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:5410 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

The ONNX file had been converted with the following command:

python tools/symbolic_shape_infer.py \
  --input weights/superpoint_lightglue.onnx \
  --output weights/superpoint_lightglue.onnx \
  --auto_merge

Hi @weihaoysgs, thank you for your interest in LightGlue-ONNX.

To answer your questions,

  1. SuperPoint is run without ORT's TensorRT Execution Provider.
  2. The cached engine files are specific to ORT and can't be used in the same fashion as, for example, an engine file generated purely with TensorRT; see the sketch below.
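
To illustrate point 2, here is a rough sketch of how the TensorRT Execution Provider and its engine cache are typically enabled through ORT's C++ API (the option values and cache directory are only examples, not what infer.py configures):

#include <onnxruntime_cxx_api.h>

int main()
{
  // Sketch: run the LightGlue ONNX model through ORT's TensorRT Execution
  // Provider with engine caching. The cached engine files are tied to ORT's
  // graph partitioning, so they cannot be loaded by a plain nvinfer1 runtime.
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "lightglue");
  Ort::SessionOptions session_options;

  OrtTensorRTProviderOptions trt_options{};          // legacy provider-options struct
  trt_options.device_id = 0;
  trt_options.trt_max_workspace_size = 2147483648;   // 2 GiB
  trt_options.trt_fp16_enable = 1;
  trt_options.trt_engine_cache_enable = 1;           // reuse engines across runs
  trt_options.trt_engine_cache_path = ".trt_cache";  // example cache directory
  session_options.AppendExecutionProvider_TensorRT(trt_options);

  Ort::Session session(env, "weights/superpoint_lightglue.onnx", session_options);
  // ... bind keypoint/descriptor inputs and call session.Run() as usual ...
  return 0;
}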

That error (no MultiHeadAttention op) occurs because you are trying to convert a fused version, which is only supported by ORT. Could you try this model I just uploaded: superpoint_lightglue.trt.onnx?

It was exported and shape-inferred using:

python -m onnxruntime.tools.symbolic_shape_infer --auto_merge --input weights/superpoint_lightglue.onnx --output weights/superpoint_lightglue.trt.onnx

@fabio-sim Hi, I'm excited to get your prompt reply!

I have tried building a TensorRT engine from the .onnx model using the C++ interface, but I get a similar error:

[12/15/2023-23:00:51] [I] [TRT] [MemUsageChange] Init CUDA: CPU +14, GPU +0, now: CPU 26, GPU 2374 (MiB)
[12/15/2023-23:00:52] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +546, GPU +118, now: CPU 627, GPU 2482 (MiB)
[12/15/2023-23:00:52] [I] [TRT] ----------------------------------------------------------------
[12/15/2023-23:00:52] [I] [TRT] Input filename:   /home/weihao/workspace/lightglue_ws/tensorrt_tutorial/weight/superpoint_lightglue.trt.onnx
[12/15/2023-23:00:52] [I] [TRT] ONNX IR version:  0.0.8
[12/15/2023-23:00:52] [I] [TRT] Opset version:    17
[12/15/2023-23:00:52] [I] [TRT] Producer name:    pytorch
[12/15/2023-23:00:52] [I] [TRT] Producer version: 2.1.0
[12/15/2023-23:00:52] [I] [TRT] Domain:           
[12/15/2023-23:00:52] [I] [TRT] Model version:    0
[12/15/2023-23:00:52] [I] [TRT] Doc string:       
[12/15/2023-23:00:52] [I] [TRT] ----------------------------------------------------------------
[12/15/2023-23:00:52] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/15/2023-23:00:52] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[12/15/2023-23:00:52] [I] [TRT] No importer registered for op: LayerNormalization. Attempting to import as plugin.
[12/15/2023-23:00:52] [I] [TRT] Searching for plugin: LayerNormalization, plugin_version: 1, plugin_namespace: 
[12/15/2023-23:00:52] [E] [TRT] ModelImporter.cpp:726: While parsing node number 177 [LayerNormalization -> "/transformers.0/self_attn/ffn/ffn.1/LayerNormalization_output_0"]:
[12/15/2023-23:00:52] [E] [TRT] ModelImporter.cpp:727: --- Begin node ---
[12/15/2023-23:00:52] [E] [TRT] ModelImporter.cpp:728: input: "/transformers.0/self_attn/ffn/ffn.0/Add_output_0"
input: "transformers.0.self_attn.ffn.1.weight"
input: "transformers.0.self_attn.ffn.1.bias"
output: "/transformers.0/self_attn/ffn/ffn.1/LayerNormalization_output_0"
name: "/transformers.0/self_attn/ffn/ffn.1/LayerNormalization"
op_type: "LayerNormalization"
attribute {
  name: "axis"
  i: -1
  type: INT
}
attribute {
  name: "epsilon"
  f: 1e-05
  type: FLOAT
}

[12/15/2023-23:00:52] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[12/15/2023-23:00:52] [E] [TRT] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:5410 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

This time the error is about LayerNormalization, different from the previously mentioned MultiHeadAttention.

What version of TensorRT are you using? I'm able to build an engine using v8.6.1 (the ONNX parser gained native LayerNormalization support in 8.6).

TensorRT-8.5.1.7.Linux.x86_64-gnu.cuda-11.8.cudnn8.6

@fabio-sim Hi, thank you very much for your reply, and I'm sorry for taking up your time.
Can you build the engine using C++? I wonder if you could share some sample code. This is my build function:

auto builder =  trtCommon::trtUniquePtr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(sample::gLogger.getTRTLogger()));
if (!builder)
{
  return false;
}

const auto explicitBatch = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
auto network = trtCommon::trtUniquePtr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(explicitBatch));
if (!network)
{
  return false;
}

auto config = trtCommon::trtUniquePtr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
if (!config)
{
  return false;
}
auto profile = builder->createOptimizationProfile();
if (!profile)
  return false;

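// Dynamic-shape profile: inputs 0 and 1 are keypoints (1, N, 2); inputs 2 and 3 are descriptors (1, N, 256)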
profile->setDimensions(superpoint_config_.input_tensor_names_[0].c_str(), nvinfer1::OptProfileSelector::kMIN, nvinfer1::Dims3(1, 1, 2));
profile->setDimensions(superpoint_config_.input_tensor_names_[0].c_str(), nvinfer1::OptProfileSelector::kOPT, nvinfer1::Dims3(1, 500, 2));
profile->setDimensions(superpoint_config_.input_tensor_names_[0].c_str(), nvinfer1::OptProfileSelector::kMAX, nvinfer1::Dims3(1, 3000, 2));

profile->setDimensions(superpoint_config_.input_tensor_names_[1].c_str(), nvinfer1::OptProfileSelector::kMIN, nvinfer1::Dims3(1, 1, 2));
profile->setDimensions(superpoint_config_.input_tensor_names_[1].c_str(), nvinfer1::OptProfileSelector::kOPT, nvinfer1::Dims3(1, 500, 2));
profile->setDimensions(superpoint_config_.input_tensor_names_[1].c_str(), nvinfer1::OptProfileSelector::kMAX, nvinfer1::Dims3(1, 3000, 2));

profile->setDimensions(superpoint_config_.input_tensor_names_[2].c_str(), nvinfer1::OptProfileSelector::kMIN, nvinfer1::Dims3(1, 1, 256));
profile->setDimensions(superpoint_config_.input_tensor_names_[2].c_str(), nvinfer1::OptProfileSelector::kOPT, nvinfer1::Dims3(1, 500, 256));
profile->setDimensions(superpoint_config_.input_tensor_names_[2].c_str(), nvinfer1::OptProfileSelector::kMAX, nvinfer1::Dims3(1, 3000, 256));

profile->setDimensions(superpoint_config_.input_tensor_names_[3].c_str(), nvinfer1::OptProfileSelector::kMIN, nvinfer1::Dims3(1, 1, 256));
profile->setDimensions(superpoint_config_.input_tensor_names_[3].c_str(), nvinfer1::OptProfileSelector::kOPT, nvinfer1::Dims3(1, 500, 256));
profile->setDimensions(superpoint_config_.input_tensor_names_[3].c_str(), nvinfer1::OptProfileSelector::kMAX, nvinfer1::Dims3(1, 3000, 256));

config->addOptimizationProfile(profile);

auto parser = trtCommon::trtUniquePtr<nvonnxparser::IParser>(
    nvonnxparser::createParser(*network, sample::gLogger.getTRTLogger()));
if (!parser)
{
  return false;
}

auto constructed = constructNetwork(builder, network, config, parser);
if (!constructed)
{
  return false;
}

// CUDA stream used for profiling by the builder.
auto profileStream = samplesCommon::makeCudaStream();
if (!profileStream)
{
  return false;
}
config->setProfileStream(*profileStream);

trtCommon::trtUniquePtr<nvinfer1::IHostMemory> plan{builder->buildSerializedNetwork(*network, *config)};
if (!plan)
{
  return false;
}

trtCommon::trtUniquePtr<nvinfer1::IRuntime> runtime{nvinfer1::createInferRuntime(sample::gLogger.getTRTLogger())};
if (!runtime)
{
  return false;
}

pengine_ = std::shared_ptr<nvinfer1::ICudaEngine>(runtime->deserializeCudaEngine(plan->data(), plan->size()),
                                                  samplesCommon::InferDeleter());
if (!pengine_)
{
  return false;
}

The constructNetwork() function is:

bool constructNetwork(trtCommon::trtUniquePtr<nvinfer1::IBuilder> &builder,
                      trtCommon::trtUniquePtr<nvinfer1::INetworkDefinition> &network,
                      trtCommon::trtUniquePtr<nvinfer1::IBuilderConfig> &config,
                      trtCommon::trtUniquePtr<nvonnxparser::IParser> &parser)
{
  auto parsed = parser->parseFromFile(superpoint_config_.onnx_file_path_.c_str(),
                                      static_cast<int>(trtLogger::gLogger.getReportableSeverity()));
  if (!parsed)
    return false;
  config->setMaxWorkspaceSize(500_MiB);
  if (superpoint_config_.fp16_)
  {
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
  }
  if (superpoint_config_.int8_)
  {
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    samplesCommon::setAllDynamicRanges(network.get(), 127.0f, 127.0f);
  }
  trtCommon::enableDLA(builder.get(), config.get(), superpoint_config_.dla_core_);
  return true;
}
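
For completeness, a minimal sketch of saving the serialized plan to disk and reloading it later, plus setting the dynamic input shapes before running (the file path and binding indices are placeholders, and it reuses samplesCommon::InferDeleter from the snippet above):

#include <NvInfer.h>
#include <fstream>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

// Sketch: persist the plan produced by buildSerializedNetwork() so the engine
// does not have to be rebuilt on every run.
void saveEngine(const nvinfer1::IHostMemory &plan, const std::string &path)
{
  std::ofstream file(path, std::ios::binary);
  file.write(static_cast<const char *>(plan.data()), plan.size());
}

// Sketch: read the plan back and deserialize it into an engine.
std::shared_ptr<nvinfer1::ICudaEngine> loadEngine(nvinfer1::IRuntime &runtime, const std::string &path)
{
  std::ifstream file(path, std::ios::binary);
  std::vector<char> blob((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
  return std::shared_ptr<nvinfer1::ICudaEngine>(runtime.deserializeCudaEngine(blob.data(), blob.size()),
                                                samplesCommon::InferDeleter());
}

// At run time the optimization profile only bounds the shapes; the actual
// keypoint count still has to be set on the execution context per inference:
//   auto context = pengine_->createExecutionContext();
//   context->setBindingDimensions(0, nvinfer1::Dims3(1, num_keypoints0, 2));
//   ...set the remaining input bindings likewise...
//   context->enqueueV2(buffers, stream, nullptr);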

@fabio-sim Hi, thank you for your suggestions. I have reinstalled TensorRT as version 8.6.1.6, and the build now succeeds. I will close this issue, and if I have a new question I will open another one.

Good luck to you!