grimoire/mmdetection-to-tensorrt

LD_PRELOAD libamirstan_plugin.so core dump in triton server

Closed this issue · 9 comments

Describe the bug
LD_PRELOAD of libamirstan_plugin.so core dumps in the Triton server.
Triton server, TensorRT, and other software versions: 20.03.
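For reference, this is roughly how the plugin is preloaded for the server (paths are placeholders for my setup); it core dumps on startup:

LD_PRELOAD=/path/to/libamirstan_plugin.so trtserver --model-repository=/models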

The problem is similar to triton-inference-server/server#2227.
I tried loading the shared library with trtexec directly, as mentioned by CoderHam, and it core dumped too.

I searched the TensorRT issues and found no hits.

So is this a build issue?
Thanks!

@tianq01
Check the position: make sure the LD_PRELOAD option comes before the docker image name, not after tritonserver, e.g.:
--env LD_PRELOAD=/models/libamirstan_plugin.so nvcr.io/nvidia/tritonserver:20.03.1-py3 tritonserver
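In other words (a minimal sketch, unrelated flags omitted): anything after the image name is passed as arguments to tritonserver, so an --env placed there has no effect.

# correct: --env is a docker option, so it goes before the image name
docker run --env LD_PRELOAD=/models/libamirstan_plugin.so -v /data/model_repository:/models nvcr.io/nvidia/tritonserver:20.03.1-py3 tritonserver --model-repository=/models

# wrong: here --env is handed to tritonserver instead of docker
docker run -v /data/model_repository:/models nvcr.io/nvidia/tritonserver:20.03.1-py3 tritonserver --env LD_PRELOAD=/models/libamirstan_plugin.so --model-repository=/models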

@Chen-cyw thanks for the quick reply.
I tried the command below (following https://github.com/triton-inference-server/server/blob/v1.12.0/docs/run.rst):
nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
  --env LD_PRELOAD=/models/libamirstan_plugin.so -p 8800:8000 -p 8801:8001 -p 8802:8002 \
  -v /data/model_repository:/models nvcr.io/nvidia/tritonserver:20.03-py3 \
  trtserver --model-repository=/models --strict-model-config=false --log-verbose=1
The container fails to start without any explicit error message; it looks like it core dumps.
Is there anything wrong with the command?

Hi @grimoire ,
we did run the converted model successfully via the Python code below, as described in the README:

trt_model = init_detector(save_path)
num_detections, trt_bbox, trt_score, trt_cls = inference_detector(trt_model, image_path, cfg_path, "cuda:0")

But trtexec --loadEngine segfaults.
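For context, I am not sure whether LD_PRELOAD is even the right way to expose the plugin to trtexec, or whether its --plugins option should be used instead; both commands below are guesses on my side:

LD_PRELOAD=/path/to/libamirstan_plugin.so trtexec --loadEngine=/path/to/model.plan
trtexec --loadEngine=/path/to/model.plan --plugins=/path/to/libamirstan_plugin.so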
Are we missing something? What additional effort is needed to support trtexec?
Could you give us a clue?
Thanks.

@tianq01 Can you run the docker image without starting trtserver? Before running with docker, try it in your local environment as in https://github.com/grimoire/mmdetection-to-tensorrt/issues/33
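Something like this gets you an interactive shell in the same image so you can see the actual crash output (a rough sketch, volume path taken from your command above):

nvidia-docker run --rm -it -v /data/model_repository:/models nvcr.io/nvidia/tritonserver:20.03-py3 bash
# then, inside the container:
LD_PRELOAD=/models/libamirstan_plugin.so trtserver --model-repository=/models --log-verbose=1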

Hi @Chen-cyw
the URL cannot be opened. Did you run the converted model successfully using trtexec or tritonserver?

@tianq01 Copy and paste it into Chrome instead of clicking it.
I have successfully converted the mmdetection bbox model to a TRT model and run it with both trtexec and tritonserver (20.03).

Missed this post.
@Chen-cyw thanks a lot! Will try.

The problem got resolved after upgrading to the latest code.
Closed.

@tianq01 Hi, can you provide some details about how to deploy an mmdet TRT engine with triton-inference-server?
I've converted the TRT engine file from the mmdet model with the docker container CLI:

mmdet2trt --fp16 cascade_rcnn_s101_fpn_syncbn-backbone+head_mstrain-range_1x_coco_fp16.py epoch_5.pth output.trt

So I got the output.trt model file. I then created the following directory layout to place my model:

models
    ├── big_model
    │   └── 1
    │       └── model.plan  # (renamed from output.trt)
    └── libamirstan_plugin.so
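I did not write a config.pbtxt; if autofill cannot infer the platform, presumably a minimal one is needed, something like this (a sketch, not verified):

cat > models/big_model/config.pbtxt <<'EOF'
name: "big_model"
platform: "tensorrt_plan"
EOF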

and then tried to deploy it with tritonserver:

docker run --rm --gpus device=3 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
  --env LD_PRELOAD=/models/libamirstan_plugin.so -p 8800:8000 -p 8801:8001 -p 8802:8002 \
  -v $(pwd):/models nvcr.io/nvidia/tritonserver:20.08-py3 \
  tritonserver --model-repository=/models --strict-model-config=false --log-verbose=1

However, it cannot load the model:

=============================
== Triton Inference Server ==
=============================

NVIDIA Release 20.08 (build 15533555)

Copyright (c) 2018-2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

I0224 07:22:55.402466 1 metrics.cc:184] found 1 GPUs supporting NVML metrics
I0224 07:22:55.408652 1 metrics.cc:193]   GPU 0: GeForce RTX 2080 Ti
I0224 07:22:55.409009 1 server.cc:119] Initializing Triton Inference Server
I0224 07:22:55.850319 1 pinned_memory_manager.cc:195] Pinned memory pool is created at '0x7f13f6000000' with size 268435456
I0224 07:22:55.852497 1 netdef_backend_factory.cc:46] Create NetDefBackendFactory
I0224 07:22:55.852517 1 plan_backend_factory.cc:48] Create PlanBackendFactory
I0224 07:22:55.852523 1 plan_backend_factory.cc:55] Registering TensorRT Plugins
I0224 07:22:55.852566 1 logging.cc:52] Registered plugin creator - ::BatchTilePlugin_TRT version 1
I0224 07:22:55.852579 1 logging.cc:52] Registered plugin creator - ::BatchedNMS_TRT version 1
I0224 07:22:55.852600 1 logging.cc:52] Registered plugin creator - ::CoordConvAC version 1
I0224 07:22:55.852638 1 logging.cc:52] Registered plugin creator - ::CropAndResize version 1
I0224 07:22:55.852645 1 logging.cc:52] Registered plugin creator - ::DetectionLayer_TRT version 1
I0224 07:22:55.852653 1 logging.cc:52] Registered plugin creator - ::FlattenConcat_TRT version 1
I0224 07:22:55.852660 1 logging.cc:52] Registered plugin creator - ::GenerateDetection_TRT version 1
I0224 07:22:55.852671 1 logging.cc:52] Registered plugin creator - ::GridAnchor_TRT version 1
I0224 07:22:55.852683 1 logging.cc:52] Registered plugin creator - ::GridAnchorRect_TRT version 1
I0224 07:22:55.852709 1 logging.cc:52] Registered plugin creator - ::InstanceNormalization_TRT version 1
I0224 07:22:55.852716 1 logging.cc:52] Registered plugin creator - ::LReLU_TRT version 1
I0224 07:22:55.852723 1 logging.cc:52] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
I0224 07:22:55.852734 1 logging.cc:52] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
I0224 07:22:55.852742 1 logging.cc:52] Registered plugin creator - ::NMS_TRT version 1
I0224 07:22:55.852749 1 logging.cc:52] Registered plugin creator - ::Normalize_TRT version 1
I0224 07:22:55.852757 1 logging.cc:52] Registered plugin creator - ::PriorBox_TRT version 1
I0224 07:22:55.852765 1 logging.cc:52] Registered plugin creator - ::ProposalLayer_TRT version 1
I0224 07:22:55.852772 1 logging.cc:52] Registered plugin creator - ::Proposal version 1
I0224 07:22:55.852779 1 logging.cc:52] Registered plugin creator - ::PyramidROIAlign_TRT version 1
I0224 07:22:55.852785 1 logging.cc:52] Registered plugin creator - ::Region_TRT version 1
I0224 07:22:55.852793 1 logging.cc:52] Registered plugin creator - ::Reorg_TRT version 1
I0224 07:22:55.852802 1 logging.cc:52] Registered plugin creator - ::ResizeNearest_TRT version 1
I0224 07:22:55.852810 1 logging.cc:52] Registered plugin creator - ::RPROI_TRT version 1
I0224 07:22:55.852816 1 logging.cc:52] Registered plugin creator - ::SpecialSlice_TRT version 1
I0224 07:22:55.852828 1 onnx_backend_factory.cc:53] Create OnnxBackendFactory
I0224 07:22:55.860046 1 libtorch_backend_factory.cc:53] Create LibTorchBackendFactory
I0224 07:22:55.860167 1 custom_backend_factory.cc:46] Create CustomBackendFactory
I0224 07:22:55.860172 1 backend_factory.h:44] Create TritonBackendFactory
I0224 07:22:55.860203 1 ensemble_backend_factory.cc:47] Create EnsembleBackendFactory
I0224 07:22:55.860364 1 autofill.cc:142] TensorFlow SavedModel autofill: Internal: unable to autofill for 'big_model', unable to find savedmodel directory named 'model.savedmodel'
I0224 07:22:55.860396 1 autofill.cc:155] TensorFlow GraphDef autofill: Internal: unable to autofill for 'big_model', unable to find graphdef file named 'model.graphdef'
I0224 07:22:55.860420 1 autofill.cc:168] PyTorch autofill: Internal: unable to autofill for 'big_model', unable to find PyTorch file named 'model.pt'
I0224 07:22:55.860450 1 autofill.cc:180] Caffe2 NetDef autofill: Internal: unable to autofill for 'big_model', unable to find netdef files: 'model.netdef' and 'init_model.netdef'
I0224 07:22:56.123378 1 autofill.cc:376] failed to load /models/big_model/1/model.plan: Internal: onnx runtime error 1: /workspace/onnxruntime/onnxruntime/core/session/inference_session.cc:279 onnxruntime::InferenceSession::InferenceSession(const onnxruntime::SessionOptions&, const onnxruntime::Environment&, const void*, int) result was false. Could not parse model successfully while constructing the inference session

I0224 07:22:56.123459 1 autofill.cc:212] ONNX autofill: Internal: unable to autofill for 'big_model', unable to find onnx file
WARNING: Since openmp is enabled in this build, this API cannot be used to configure intra op num threads. Please use the openmp environment variables to control the number of threads.
E0224 07:23:19.480148 1 logging.cc:43] coreReadArchive.cpp (38) - Serialization Error in verifyHeader: 0 (Version tag does not match)
E0224 07:23:19.516318 1 logging.cc:43] INVALID_STATE: std::exception
E0224 07:23:19.516344 1 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
I0224 07:23:19.548534 1 autofill.cc:225] TensorRT autofill: Internal: unable to autofill for 'big_model', unable to find a compatible plan file.
W0224 07:23:19.548552 1 autofill.cc:265] Proceeding with simple config for now
I0224 07:23:19.548576 1 model_config_utils.cc:629] autofilled config: name: "big_model"

E0224 07:23:19.558529 1 model_repository_manager.cc:1633] unexpected platform type  for big_model
error: creating server: Internal - failed to load all models
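The "Version tag does not match" / "Deserialize the cuda engine failed" lines make me suspect the TensorRT version inside the mmdet2trt container differs from the one in the 20.08 Triton image, but I have not verified that yet. Something like this should show both versions (the conversion image name is a placeholder, and the package query may differ per image):

docker run --rm <mmdet2trt-image> python3 -c "import tensorrt; print(tensorrt.__version__)"
docker run --rm nvcr.io/nvidia/tritonserver:20.08-py3 bash -c "dpkg -l | grep nvinfer"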

Can you share your procedure please? Thank you.