Parses ONNX models for execution with TensorRT.
See also the TensorRT documentation.
The TensorRT backend for ONNX can be used in Python as follows:
import onnx
import onnx_tensorrt.backend as backend
import numpy as np
model = onnx.load("/path/to/model.onnx")
engine = backend.prepare(model, device='CUDA:1')
input_data = np.random.random(size=(32, 3, 224, 224)).astype(np.float32)
output_data = engine.run(input_data)[0]
print(output_data)
print(output_data.shape)
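Because onnx_tensorrt.backend follows the standard ONNX backend interface, a model can also be prepared and executed in a single call. The following is a minimal sketch assuming run_model is exposed by the backend module, as it is in other ONNX backends:
import onnx
import onnx_tensorrt.backend as backend
import numpy as np
model = onnx.load("/path/to/model.onnx")
input_data = np.random.random(size=(32, 3, 224, 224)).astype(np.float32)
# Prepare the engine and run inference in one step (generic ONNX backend convenience).
output_data = backend.run_model(model, [input_data], device='CUDA:1')[0]
print(output_data.shape)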
ONNX models can be converted to serialized TensorRT engines using the onnx2trt
executable:
onnx2trt my_model.onnx -o my_engine.trt
ONNX models can also be converted to human-readable text:
onnx2trt my_model.onnx -t my_model.onnx.txt
See more usage information by running:
onnx2trt -h
The model parser library, libnvonnxparser.so, has a C++ API declared in this header:
NvOnnxParser.h
TensorRT engines built using this parser must use the plugin factory provided in libnvonnxparser_runtime.so, which has a C++ API declared in this header:
NvOnnxParserRuntime.h
Clone the code from GitHub.
git clone --recursive https://github.com/onnx/onnx-tensorrt.git
Suppose your TensorRT library is located at /opt/tensorrt. Build the onnx2trt executable and the libnvonnxparser* libraries using CMake:
mkdir build
cd build
cmake .. -DTENSORRT_ROOT=/opt/tensorrt
make -j8
sudo make install
Build the Python wrappers and modules by running:
python setup.py build
sudo python setup.py install
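As a quick sanity check of the installation, you can import the module and query device support. This is a minimal sketch; supports_device comes from the generic ONNX backend API and is assumed to be exposed by this backend:
import onnx_tensorrt.backend as backend
# Expected to print True on a machine with a usable CUDA device.
print(backend.supports_device('CUDA'))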
Build the onnx_tensorrt Docker image by running:
cp /path/to/TensorRT-3.0.*.tar.gz .
docker build -t onnx_tensorrt .
After installation (or inside the Docker container), ONNX backend tests can be run as follows:
Real model tests only:
python onnx_backend_test.py OnnxBackendRealModelTest
All tests:
python onnx_backend_test.py
You can use the -v flag to make the output more verbose.
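The onnx_backend_test.py script is built on the standard ONNX backend test harness. The sketch below shows how such a script wires the TensorRT backend into that harness; it is illustrative only, and the repository's own test script may differ in detail:
import unittest
import onnx.backend.test
import onnx_tensorrt.backend as backend
# Generate the standard ONNX backend test cases against the TensorRT backend.
backend_test = onnx.backend.test.BackendTest(backend, __name__)
# Expose the generated cases so unittest can discover and run them.
globals().update(backend_test.test_cases)
if __name__ == '__main__':
    unittest.main()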
Pre-trained Caffe2 models in ONNX format can be found at https://github.com/onnx/models