Error Code 1: Cask (isConsistent)

Question

Error Code 1: Cask (isConsistent)

MHGL opened this issue 3 years ago · 3 comments

MHGL commented 3 years ago

question

I get this error while convert module to tensorrt

module has 5 down sample
upsample at last down sample
torch.cat

To Reproduce

Steps to reproduce the behavior:

code example

import torch
import torch.nn.functional as F

class MyModule(torch.nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        for i in range(1, 6):
            setattr(self, f"down{i}", torch.nn.Conv2d(3, 3, 3, 2, padding=1))

    def forward(self, x):
        x1 = self.down1(x)
        x2 = self.down2(x1)
        x3 = self.down3(x2)
        x4 = self.down4(x3)
        x5 = self.down5(x4)
        return torch.cat([x4, F.interpolate(x5, scale_factor=2)], 1)

torch_model = MyModule()

# torch.onnx.export
torch.onnx.export(torch_model,
    torch.randn(1, 3, 224, 224),
    "./tmp.onnx",
    input_names=["inputs"],
    output_names=["outputs"],
    dynamic_axes={"inputs": {0: "batch", 2: "height", 3: "width"}, "outputs": {0: "batch", 1: "class", 2: "height", 3: "width"}},
    opset_version=11,
    export_params=True)

import os
onnx_file = os.path.join(os.getcwd(), "tmp.onnx")

# onnx -> tensorrt
# !!!
# you should build tensorrt first
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with trt.Builder(TRT_LOGGER) as builder, builder.create_network(1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
    with open(onnx_file, 'rb') as model:
        parser.parse(model.read())

    config = builder.create_builder_config()

    profile = builder.create_optimization_profile()
    profile.set_shape("inputs", (1, 3, 1, 1), (1, 3, 224, 224), (1, 3, 2000, 2000))
    config.add_optimization_profile(profile)

    engine = builder.build_engine(network, config)
    with open("tmp.trt", "wb") as f:
        f.write(engine.serialize())

stack traces

sometimes i failed and get this

mini_code.py:54: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config)
[TensorRT] WARNING: Convolution + generic activation fusion is disable due to incompatible driver or nvrtc
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] ERROR: 1: [convolutionBuilder.cpp::createConvolution::184] Error Code 1: Cask (isConsistent)
Traceback (most recent call last):
  File "mini_code.py", line 56, in <module>
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'

sometimes i succeed and get this

mini_code.py:54: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config)
[TensorRT] WARNING: Convolution + generic activation fusion is disable due to incompatible driver or nvrtc
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1
[TensorRT] WARNING: Detected invalid timing cache, setup a local cache instead
[TensorRT] WARNING: Max value of this profile is not valid
[TensorRT] WARNING: Min value of this profile is not valid
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.4.2 but loaded cuBLAS/cuBLAS LT 11.2.1

Expected behavior

Environment

TensorRT Version: 8.0.0.3
PyTorch Version: 1.9.0
OS (e.g., MacOS, Linux): Ubuntu20.04 LTS
How you install python (anaconda, virtualenv, system): miniconda
python version (e.g. 3.7): 3.8.5
any other relevant information:
- gpu: GeForce GTX 1650
- driver: Driver Version: 460.80
- CUDA: CUDA Version: 11.2

Answer 1 · 2021-07-20T02:46:53.000Z

我在转化yolov5时遇到了同样的错误，我在Tensorrt那边看到了这个issue，我再max更改为2016工作正常，我觉的max input可能跟模型的stride有关

Answer 2 · 2021-07-21T02:28:36.000Z

单纯的5次下采样并不会引发这个错误，还需要在最后的feature上上采样+cat操作。
这个问题很奇怪，有时可以正常运行，会出现[TensorRT] WARNING: Max value of this profile is not valid，就只是warning；有的时候就是直接报错了。暂时没有能力深究原因，等下官方吧。

Answer 3 · 2022-12-28T02:50:50.000Z

I encountered the same problem in 8.5GA version, I set min HW dims to a number greater than 1, the problem seems to be solved. However, there is fine when I using 8.2 version. It seems that TensorRT add some check for invalid dynamic dims .