tensorflow/tensorrt

Jupyter Notebook kernel dies automatically

WeiFoo opened this issue · 8 comments

I was trying to run the following notebook on Ubuntu 18.04 with a T4 GPU on EC2:

https://github.com/tensorflow/tensorrt/blob/master/tftrt/examples/image-classification/TFv2-TF-TRT-inference-from-Keras-saved-model.ipynb

I can run most cells, but at the TF-TRT FP32 model section the kernel dies automatically.

I even restarted the runtime and ran just the following code; the kernel still dies:

# The notebook imports the converter as:
from tensorflow.python.compiler.tensorrt import trt_convert as trt

conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP32,
    max_workspace_size_bytes=8000000000)

converter = trt.TrtGraphConverterV2(input_saved_model_dir='resnet50_saved_model',
                                    conversion_params=conversion_params)
converter.convert()
converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP32')
print('Done Converting to TF-TRT FP32')

Does anyone have an idea? Thanks!

I am also experiencing the same issue. Any pointers @pooyadavoodi?

I have seen an issue related to running INT8 calibration in the same process that previously ran an FP32/FP16 conversion. But if you run each conversion once per process, I expect it to work.
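A minimal sketch of that workaround, assuming the directory names from this thread: each precision mode is converted in its own spawned process, so nothing (e.g. INT8 calibration state) carries over between conversions. convert_one is a hypothetical helper, not part of the TF-TRT API.

import multiprocessing as mp

def convert_one(precision_mode, output_dir):
    # Import inside the worker so TensorFlow/CUDA state stays per-process.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=precision_mode,
        max_workspace_size_bytes=8000000000)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir='resnet50_saved_model',
        conversion_params=params)
    converter.convert()
    converter.save(output_saved_model_dir=output_dir)

if __name__ == '__main__':
    # One precision mode per child process; 'spawn' gives each a clean start.
    for precision, out_dir in [('FP32', 'resnet50_saved_model_TFTRT_FP32'),
                               ('FP16', 'resnet50_saved_model_TFTRT_FP16')]:
        p = mp.get_context('spawn').Process(target=convert_one,
                                            args=(precision, out_dir))
        p.start()
        p.join()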

I just tried TF-TRT FP32 and it worked.
I got the following perf on a P100:

Step 0: 10.8ms
Step 50: 10.8ms
Step 100: 10.8ms
Step 150: 10.8ms
Step 200: 10.8ms
Step 250: 10.8ms
Step 300: 10.8ms
Step 350: 10.8ms
Step 400: 10.8ms
Step 450: 10.8ms
Step 500: 10.8ms
Step 550: 10.8ms
Step 600: 10.8ms
Step 650: 10.8ms
Step 700: 10.8ms
Step 750: 10.8ms
Step 800: 10.8ms
Step 850: 10.8ms
Step 900: 10.8ms
Step 950: 10.8ms
Throughput: 742 images/s
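(For reference, numbers like these come from a benchmarking loop along the lines of the notebook's; a rough sketch, where the batch size of 8 and the 224x224x3 input shape are assumptions:)

import time
import numpy as np
import tensorflow as tf

infer = tf.saved_model.load(
    'resnet50_saved_model_TFTRT_FP32').signatures['serving_default']
batch = tf.constant(np.random.rand(8, 224, 224, 3).astype(np.float32))

for _ in range(50):
    infer(batch)  # warm-up: TensorRT engine builds happen here, untimed

start = time.time()
n_steps = 1000
for step in range(n_steps):
    t0 = time.time()
    infer(batch)
    if step % 50 == 0:
        print(f'Step {step}: {(time.time() - t0) * 1000:.1f}ms')
print(f'Throughput: {n_steps * batch.shape[0] / (time.time() - start):.0f} images/s')

The warm-up loop matters: the first calls trigger the TensorRT engine builds, which would otherwise dominate the timings.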

Perhaps some Colab nodes aren't stable?

Yeah, it might be the case. Worth propagating to the Colab team, I guess.

My kernel is also dying at this step. I'm running Jupyter inside the Docker container with FP32 and FP16 on a GTX 1650.

I had the same symptoms. In my case, it was caused by not adding <your TensorRT path>/lib to LD_LIBRARY_PATH before running Jupyter Lab. Adding the path and restarting Jupyter Lab solved it.
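For reference, that amounts to running something like export LD_LIBRARY_PATH=<your TensorRT path>/lib:$LD_LIBRARY_PATH in the same shell (or Dockerfile) before launching jupyter lab, so that TensorFlow can dlopen libnvinfer and the other TensorRT libraries.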

azayz commented

Hey, my kernel dies not during conversion and optimization of the model but during inference. It converts the model smoothly with both FP16 and FP32, but during inference (predicting one image) the kernel dies and automatically restarts. Any help?
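A minimal repro along those lines, in case it helps isolate the crash: load the converted model using this thread's directory name and push one dummy image through it (the 224x224x3 shape assumes the notebook's ResNet-50).

import numpy as np
import tensorflow as tf

# Load the TF-TRT converted SavedModel and grab the default signature.
loaded = tf.saved_model.load('resnet50_saved_model_TFTRT_FP32')
infer = loaded.signatures['serving_default']

# One dummy image; if the kernel dies here, it dies while building/running
# the TensorRT engine rather than during conversion.
image = tf.constant(np.random.rand(1, 224, 224, 3).astype(np.float32))
outputs = infer(image)
print({name: tensor.shape for name, tensor in outputs.items()})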

I actually put together a tutorial a few days back that shows how to use TensorRT in an end-to-end manner for accelerating inference: https://sayak.dev/tf.keras/tensorrt/tensorflow/2020/07/01/accelerated-inference-trt.html. Hope this will be helpful.