tensorflow/tensorrt

Error 'Tensor must be at least a vector, but saw shape: []' when optimizing a base64 string-input servable with TensorRT converter.build()

257kb commented

Description

I am trying to optimize a simple TensorFlow 2.x classification model with TensorRT, using a string input because I want to serve it with TensorFlow Serving and base64-encoded images. When execution reaches converter.build(), I get this error:

inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Tensor must be at least a vector, but saw shape: []
    [[node StatefulPartitionedCall/StatefulPartitionedCall/map/TensorArrayUnstack/TensorListFromTensor (defined at serving/serving.py:152) ]]
    [[StatefulPartitionedCall/StatefulPartitionedCall/map/while/body/_253/map/while/StatefulPartitionedCall/decode_image/cond_jpeg/else/_314/decode_image/cond_jpeg/cond_png/else/_333/decode_image/cond_jpeg/cond_png/is_gif/_98]]
(1) Invalid argument: Tensor must be at least a vector, but saw shape: []
    [[node StatefulPartitionedCall/StatefulPartitionedCall/map/TensorArrayUnstack/TensorListFromTensor (defined at serving/serving.py:152) ]]
0 successful operations. 0 derived errors ignored. [Op:__inference_pruned_70570]
Function call stack:
pruned -> pruned
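As far as I can tell, the shape-[] part matches what tf.map_fn raises when its input is a rank-0 tensor, since map_fn unstacks its input along axis 0. A minimal reduction of my guess at the mechanism (not verified against the converter internals):

import tensorflow as tf

# My guess: a scalar (rank-0) string tensor cannot be unstacked by tf.map_fn,
# which raises the same TensorListFromTensor error as above.
scalar = tf.constant("some-bytes")  # shape: []
tf.map_fn(tf.identity, scalar)      # InvalidArgumentError: Tensor must be at least a vector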

I tried providing the string as a tensor with different ranks, but nothing worked. The same script works fine and optimizes well with a float32 input (1000+ FPS). If I skip converter.build(), there is no error and the model serves fine, but it does not appear to be optimized: it still runs at about 200 FPS.
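For comparison, the float32 input function that does work looks roughly like this (the batch and image dimensions here are illustrative, not the exact values from my script):

import numpy as np

def my_float_input_fn():
    # Works: a rank-4 float32 batch for a [None, height, width, 3] signature.
    yield [np.zeros(shape=(1, 224, 224, 3), dtype=np.float32)]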

The same happens when I try to optimize it with saved_model_cli and the tensorrt option:

saved_model_cli convert \
    --dir my_model \
    --output_dir tensorRT_FP16 \
    --tag_set serve \
    tensorrt --precision_mode FP16

In the logs I get:

2020-11-24 08:13:42.621033: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at trt_engine_resource_ops.cc:196 : Not found: Container TF-TRT does not exist. (Could not find resource: TF-TRT/TRTEngineOp_0_0)
INFO:tensorflow:Could not find TRTEngineOp_0_0 in TF-TRT cache. This can happen if build() is not called, which means TensorRT engines will be built and cached at runtime.
INFO:tensorflow:Assets written to: /tensorRT_FP16/assets

As the logs show, converter.build() never runs, and the model is not optimized: it still runs at around 200 FPS.
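For reference, this is roughly how I measure the FPS numbers above (the model path is a placeholder, and I feed raw JPEG bytes because TensorFlow Serving strips the base64 wrapper before the graph sees the input):

import time

import cv2
import numpy as np
import tensorflow as tf

# One encoded JPEG, built the same way as in my_input_image_fn below.
_, buf = cv2.imencode('.jpg', np.zeros((224, 224, 3), dtype=np.uint8))
batch = tf.constant([buf.tobytes()])  # shape [1] string tensor of raw JPEG bytes

loaded = tf.saved_model.load("tensorRT_FP16")  # placeholder path
infer = loaded.signatures["serving_default"]
for _ in range(10):  # warm-up, so runtime engine building is excluded from timing
    infer(input_0=batch)
start = time.time()
for _ in range(1000):
    infer(input_0=batch)
print("FPS:", 1000 / (time.time() - start))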

Environment

TensorRT Version: 7.2.1
GPU Type: V100 16GB
Nvidia Driver Version: 440.100
CUDA Version: 11.1
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable): 2.3.0
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): I tried both: tensorflow/tensorflow:2.3.0-gpu and nvcr.io/nvidia/tensorflow:20.11-tf2-py3

dpkg -l | grep -i tensor
ii graphsurgeon-tf          7.2.1-1+cuda11.1 amd64 GraphSurgeon for TensorRT package
ii libnvinfer-bin           7.2.1-1+cuda11.1 amd64 TensorRT binaries
ii libnvinfer-dev           7.2.1-1+cuda11.1 amd64 TensorRT development libraries and headers
ii libnvinfer-plugin-dev    7.2.1-1+cuda11.1 amd64 TensorRT plugin libraries and headers
ii libnvinfer-plugin7       7.2.1-1+cuda11.1 amd64 TensorRT plugin library
ii libnvinfer7              7.2.1-1+cuda11.1 amd64 TensorRT runtime libraries
ii libnvonnxparsers-dev     7.2.1-1+cuda11.1 amd64 TensorRT ONNX libraries
ii libnvonnxparsers7        7.2.1-1+cuda11.1 amd64 TensorRT ONNX libraries
ii libnvparsers-dev         7.2.1-1+cuda11.1 amd64 TensorRT parsers libraries
ii libnvparsers7            7.2.1-1+cuda11.1 amd64 TensorRT parsers libraries
ii python3-libnvinfer       7.2.1-1+cuda11.1 amd64 Python 3 bindings for TensorRT
ii python3-libnvinfer-dev   7.2.1-1+cuda11.1 amd64 Python 3 development package for TensorRT
ii uff-converter-tf         7.2.1-1+cuda11.1 amd64 UFF converter for TensorRT package

Relevant Files

import base64
import shutil
import tempfile

import cv2
import numpy as np
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Placeholder values; the real config comes from elsewhere in my script.
height, width = 224, 224
num_classes = 2
train_backbone = False
trt_precision_mode = 'FP16'
trt_max_cached_engines = 1
save_dir = 'tensorRT_FP16'

class ServableModelWrapper(tf.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    @tf.function
    def decode_image(self, image):
        """Decode the input image and convert it into a tf.int8 tensor."""
        image = tf.io.decode_image(image, channels=3, expand_animations=False)
        return image

    @tf.function
    def prepare_input(self, images):
        images = tf.map_fn(self.decode_image, images, fn_output_signature=tf.uint8)
        images = tf.cast(images, tf.float32)
        return images

    @tf.function
    def __call__(self, x):
        x = self.prepare_input(x)
        return self.model(x)

def create_servable_model(weights_path):
    def build_model(num_classes, weights, input_shape):
        base_model = tf.keras.applications.ResNet50V2(weights=weights, include_top=False, input_shape=input_shape)
        inputs = tf.keras.Input(shape=input_shape)
        x = tf.keras.applications.resnet_v2.preprocess_input(inputs)
        x = base_model(x, training=train_backbone)
        x = tf.keras.layers.GlobalAveragePooling2D()(x)
        x = tf.keras.layers.Dense(256, activation='relu')(x)
        predictions = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
        model = tf.keras.models.Model(inputs=inputs, outputs=predictions)
        return model

    def my_input_image_fn():
        # I tried many different things here: hardcoded strings and tensors of
        # various ranks (larger, smaller, a hardcoded valid base64 string...).
        inp1 = np.zeros(shape=(height, width, 3)).astype(np.uint8)
        _, img_byte_arr = cv2.imencode('.jpg', inp1)
        b64_image = base64.b64encode(img_byte_arr).decode("utf-8")
        yield [b64_image]

    # weights=None because the full weights are loaded just below.
    model = build_model(num_classes=num_classes, weights=None, input_shape=(height, width, 3))
    model.load_weights(weights_path, by_name=True)
    model = ServableModelWrapper(model)
    input_tensor_spec = tf.TensorSpec([None], tf.string, name='input_0')  # I also tried tf.TensorSpec.from_tensor()
    model_call_fn = model.__call__.get_concrete_function(input_tensor_spec)
    tmp_save_dir = tempfile.mkdtemp()
    tf.saved_model.save(model, tmp_save_dir, signatures=model_call_fn)

    params = tf.experimental.tensorrt.ConversionParams(
        precision_mode=trt_precision_mode,
        maximum_cached_engines=trt_max_cached_engines
    )
    converter = trt.TrtGraphConverterV2(input_saved_model_dir=tmp_save_dir, conversion_params=params)
    converter.convert()
    converter.build(input_fn=my_input_image_fn)

    converter.save(output_saved_model_dir=save_dir)
    shutil.rmtree(tmp_save_dir)
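
The driver code reduces to a hypothetical call like this (the weights path is a placeholder):

if __name__ == "__main__":
    create_servable_model("my_model_weights.h5")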