grimoire/mmdetection-to-tensorrt

Converted engine fails on DeepStream: Detect-postprocessor failed to init resource because dlsym failed to get func NvDsInferParseMmdet pointer

xarauzo opened this issue · 5 comments

Describe the bug
Hi, I have converted an MMDet model to TensorRT using "mmdet2trt". I am doing this on my NVIDIA Jetson Xavier NX running JetPack 4.4. When I then run my DeepStream application, I get the following error:

0:00:04.883005662  2175     0x22601720 ERROR                nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary-gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initResource() <nvdsinfer_context_impl.cpp:683> [UID = 1]: Detect-postprocessor failed to init resource because dlsym failed to get func NvDsInferParseMmdet pointer
ERROR: Infer Context failed to initialize post-processing resource, nvinfer error:NVDSINFER_CUSTOM_LIB_FAILED
ERROR: Infer Context prepare postprocessing resource failed., nvinfer error:NVDSINFER_CUSTOM_LIB_FAILED
0:00:04.897069813  2175     0x22601720 WARN                 nvinfer gstnvinfer.cpp:809:gst_nvinfer_start:<primary-gie> error: Failed to create NvDsInferContext instance
0:00:04.897199925  2175     0x22601720 WARN                 nvinfer gstnvinfer.cpp:809:gst_nvinfer_start:<primary-gie> error: Config file path: /drone/config_files/detector.txt, NvDsInfer Error: NVDSINFER_CUSTOM_LIB_FAILED
2022-07-08 10:37:21.493 | ERROR    | pydeepstream.pipeline:bus_message_handler:39 - gst-resource-error-quark: Failed to create NvDsInferContext instance (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(809): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-gie:
Config file path: /drone/config_files/detector.txt, NvDsInfer Error: NVDSINFER_CUSTOM_LIB_FAILED
2022-07-08 10:37:21.494 | INFO     | pydeepstream.pipeline:run_loop:62 - Exiting app...
2022-07-08 10:37:21.502 | ERROR    | __main__:<module>:100 - An error has been caught in function '<module>', process 'MainProcess' (2175), thread 'MainThread' (547503841296):
Traceback (most recent call last):

> File "/drone/pipeline/airelectronics_pipeline.py", line 100, in <module>
    run_pipeline(args.config)
    |            |    -> '/drone/config_files/pipeline_xarauzo.yaml'
    |            -> Namespace(config='/drone/config_files/pipeline_xarauzo.yaml')
    -> <function run_pipeline at 0x7f6a4a4378>

  File "/drone/pipeline/airelectronics_pipeline.py", line 85, in run_pipeline
    pipeline.run_loop()
    |        -> <function Pipeline.run_loop at 0x7f6a510d08>
    -> <pydeepstream.pipeline.Pipeline object at 0x7f6a4b8e10>

  File "/usr/local/lib/python3.6/dist-packages/pydeepstream-0.0.1-py3.6.egg/pydeepstream/pipeline.py", line 66, in run_loop

RuntimeError: gst-resource-error-quark: Failed to create NvDsInferContext instance (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(809): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-gie:
Config file path: /drone/config_files/detector.txt, NvDsInfer Error: NVDSINFER_CUSTOM_LIB_FAILED
Traceback (most recent call last):
  File "/drone/pipeline/airelectronics_pipeline.py", line 100, in <module>
    run_pipeline(args.config)
  File "/usr/local/lib/python3.6/dist-packages/loguru/_logger.py", line 1220, in catch_wrapper
    return function(*args, **kwargs)
  File "/drone/pipeline/airelectronics_pipeline.py", line 85, in run_pipeline
    pipeline.run_loop()
  File "/usr/local/lib/python3.6/dist-packages/pydeepstream-0.0.1-py3.6.egg/pydeepstream/pipeline.py", line 66, in run_loop
RuntimeError: gst-resource-error-quark: Failed to create NvDsInferContext instance (1): /dvs/git/dirty/git-master_linux/deepstream/sdk/src/gst-plugins/gst-nvinfer/gstnvinfer.cpp(809): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-gie:
Config file path: /drone/config_files/detector.txt, NvDsInfer Error: NVDSINFER_CUSTOM_LIB_FAILED

I run both the "mmdet2trt" and DeepStream applications in Docker containers. For "mmdet2trt", I modified the Dockerfile from this repository to make it work on the Jetson Xavier NX with JetPack 4.4. These are the MMDet, MMCV, PyTorch and torchvision versions used in the Dockerfile:

MMCV==1.3.9
MMDet==2.14.0
Pytorch==1.9.0
Torchvision==0.10.0

I don't know if anything is missing, but it worked and I was able to create the engine file. The Dockerfile is as follows:

FROM nvcr.io/nvidia/l4t-base:r32.5.0

### update apt and install libs
RUN apt-get update &&\
    apt-get install -y vim cmake libsm6 libxext6 libxrender-dev libgl1-mesa-glx git python3-packaging

### torch install 
RUN wget https://nvidia.box.com/shared/static/h1z9sw4bb1ybi0rm3tu8qdj8hs05ljbm.whl -O torch-1.9.0-cp36-cp36m-linux_aarch64.whl &&\
    apt-get install -y python3-pip libopenblas-base libopenmpi-dev &&\
    pip3 install Cython &&\
    pip3 install numpy torch-1.9.0-cp36-cp36m-linux_aarch64.whl
### python
RUN pip3 install --upgrade pip

### install mmcv

RUN DEBIAN_FRONTEND=noninteractive apt-get install -y python3-opencv

### scikit image
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update -y \
  && apt-get install -y --no-install-recommends apt-utils \
  && apt-get install -y \
    python3-dev libpython3-dev python-pil python3-tk python-imaging-tk \
    build-essential wget locales liblapack-dev

RUN sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen && \
    dpkg-reconfigure --frontend=noninteractive locales && \
    update-locale LANG=en_US.UTF-8
ENV LANG en_US.UTF-8

RUN wget -q -O /tmp/get-pip.py --no-check-certificate https://bootstrap.pypa.io/pip/3.6/get-pip.py \
  && python3 /tmp/get-pip.py \
  && pip3 install -U pip
RUN pip3 install -U testresources setuptools

RUN pip3 install -U numpy
#####

# H5py:
RUN apt-get install -y pkg-config libhdf5-100 libhdf5-dev &&\
    pip3 install versioned-hdf5

# MMCV:
RUN apt-get install -y libssl-dev
RUN pip3 install --upgrade pip
RUN mkdir /root/space && cd /root/space &&\
    git clone --branch v1.3.9 https://github.com/open-mmlab/mmcv.git /root/space/mmcv &&\
    cd /root/space/mmcv &&\
    MMCV_WITH_OPS=1 pip3 install -e .

# MMDetection
### Git mmdetection:
RUN git clone --branch v2.14.0 https://github.com/open-mmlab/mmdetection.git /root/space/mmdetection

### Install mmdetection:
RUN cd /root/space/mmdetection &&\
    pip3 install -r requirements/build.txt &&\
    pip3 install -r requirements/optional.txt &&\
    pip3 install -r requirements/runtime.txt &&\
    python3 setup.py develop

RUN cd /root/space &&\
    wget https://github.com/Kitware/CMake/releases/download/v3.19.1/cmake-3.19.1.tar.gz &&\
    tar -xf cmake-3.19.1.tar.gz &&\
    cd cmake-3.19.1 &&\
    apt-get install -y libssl-dev &&\
    ./configure &&\
    make &&\
    make install

### git amirstan plugin
RUN git clone --depth=1 https://github.com/grimoire/amirstan_plugin.git /root/space/amirstan_plugin &&\ 
    cd /root/space/amirstan_plugin &&\ 
    git submodule update --init --progress --depth=1

### install amirstan plugin
RUN cd /root/space/amirstan_plugin &&\ 
    mkdir build &&\
    cd build &&\
    cmake .. &&\
    make -j10 &&\
    echo "export AMIRSTAN_LIBRARY_PATH=/root/space/amirstan_plugin/build/lib" >> /root/.bashrc

### git torch2trt_dynamic
RUN git clone --depth=1 https://github.com/grimoire/torch2trt_dynamic.git /root/space/torch2trt_dynamic

### install torch2trt_dynamic
RUN cd /root/space/torch2trt_dynamic &&\
    python3 setup.py develop

### git mmdetection-to-tensorrt
RUN git clone --depth=1 https://github.com/grimoire/mmdetection-to-tensorrt.git /root/space/mmdetection-to-tensorrt

### install mmdetection-to-tensorrt
RUN cd /root/space/mmdetection-to-tensorrt &&\
    python3 setup.py develop

## setuptools for python3
RUN apt-get install -y python3-setuptools

### install torchvision
RUN  apt-get install -y libjpeg-dev zlib1g-dev libpython3-dev libavcodec-dev libavformat-dev libswscale-dev &&\
     git clone --branch v0.10.0 https://github.com/pytorch/vision torchvision &&\
     cd torchvision &&\
     export BUILD_VERSION=0.10.0 &&\  
     python3 setup.py install

WORKDIR /root/space

To run the conversion I use the Python API, which generates "model.engine" with no errors.
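For reference, the conversion script is essentially the following (API per the mmdet2trt README; the config and checkpoint paths are placeholders, and this needs a CUDA-capable device with TensorRT installed):

```python
# Sketch of the conversion step, following the mmdet2trt README.
# cfg_path / weight_path are placeholders for the MMDetection config
# and checkpoint actually used.
from mmdet2trt import mmdet2trt

cfg_path = "my_model_config.py"          # placeholder
weight_path = "my_model_checkpoint.pth"  # placeholder

trt_model = mmdet2trt(cfg_path, weight_path, fp16_mode=True)

# Serialize the raw engine so DeepStream can load it via model-engine-file.
with open("model.engine", "wb") as f:
    f.write(trt_model.state_dict()["engine"])
```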

As I said, we are running this on JetPack 4.4 with DeepStream 5.0. In my DeepStream inference config "detector.txt", I added the "parse-bbox-func-name" and "custom-lib-path" properties:

[property]
net-scale-factor=0.017352074
offsets=123.675;116.28;103.53
infer-dims=3;720;1280
model-engine-file=/drone/models/test.engine
gie-unique-id=1
network-type=0
network-mode=2
num-detected-classes=1
interval=8
batch-size=4
parse-func=0
parse-bbox-func-name=NvDsInferParseMmdet
output-blob-names=num_detections;boxes;scores;classes
custom-lib-path=/amirstan_plugin/build/lib/libamirstan_plugin.so
cluster-mode=4

[class-attrs-all]
pre-cluster-threshold=0.5
group-threshold=0

I would appreciate it if anyone could help me with this issue.

I think the custom-lib-path in your config might be wrong, since the log reports NVDSINFER_CUSTOM_LIB_FAILED.
/amirstan_plugin/build/lib/libamirstan_plugin.so looks like an absolute path from the filesystem root. Did you place the library there?
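You can also check whether the library actually exports the parse function, since a dlsym failure usually means the symbol is simply not in the .so (path taken from your detector.txt; adjust if yours differs):

```shell
# List the dynamic symbols of the plugin and look for the parser.
# If this prints nothing, dlsym() will fail exactly as in your log.
nm -D /amirstan_plugin/build/lib/libamirstan_plugin.so | grep NvDsInferParseMmdet
```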

Yes, the library is placed there, I double checked it. That's why I don't really understand what might be happening.

Did you build amirstan_plugin with deepstream support?
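As far as I can tell, the NvDsInferParseMmdet parser is only compiled in when DeepStream support is enabled at configure time. Roughly (WITH_DEEPSTREAM is the option documented in the amirstan_plugin README; the DeepStream_DIR value below is the stock DeepStream 5.0 install path and is an assumption, so point it at your actual SDK install):

```shell
# Reconfigure and rebuild the plugin with DeepStream support enabled.
cd /root/space/amirstan_plugin/build
cmake -DWITH_DEEPSTREAM=true \
      -DDeepStream_DIR=/opt/nvidia/deepstream/deepstream-5.0 ..
make -j10
```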

You can see how I built the plugin in the Dockerfile I pasted:

### git amirstan plugin
RUN git clone --depth=1 https://github.com/grimoire/amirstan_plugin.git /root/space/amirstan_plugin &&\ 
    cd /root/space/amirstan_plugin &&\ 
    git submodule update --init --progress --depth=1

### install amirstan plugin
RUN cd /root/space/amirstan_plugin &&\ 
    mkdir build &&\
    cd build &&\
    cmake .. &&\
    make -j10 &&\
    echo "export AMIRSTAN_LIBRARY_PATH=/root/space/amirstan_plugin/build/lib" >> /root/.bashrc

I don't know if there is anything else I have to do.

PS: In case it is relevant: I have an old (2020) "model.engine", converted back then with an older version of mmdet2trt, that works properly with the matching version of the plugin. However, I have now (2022) created a new MMDet model and converted it using the versions in the Dockerfile above, and that is where I get the error. (I also tried using an older commit of the plugin with the new MMDet model, which should not work, and of course it did not.)

You were completely right: I was missing the DeepStream support option when compiling the plugin on the DeepStream side. It works now. I am still having trouble with the model itself, as it does not detect anything properly, but that may be unrelated to the plugin or the conversion, so I will close this issue for now. Thanks for the help.