RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Question

kayleeliyx opened this issue 4 years ago · 5 comments

I installed all the requirements listed on the website, but I hit this error while running the demo.

Error:

Fine-tuning directory: 'results/ayush/R_hierarchical2_mc/B0.1_R1.0_PL1-0_LR0.0004_BS4_Oadam'
Found cache checkpoints/mc.pth
Using 1 GPUs.
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    dp.process(params)
  File "/home/ubuntu/Documents/consistent_depth/process.py", line 117, in process
    return self.pipeline(params)
  File "/home/ubuntu/Documents/consistent_depth/process.py", line 60, in pipeline
    ft.save_depth(initial_depth_dir)
  File "/home/ubuntu/Documents/consistent_depth/depth_fine_tuning.py", line 190, in save_depth
    depth = self.model.forward(stacked_images, metadata)
  File "/home/ubuntu/Documents/consistent_depth/monodepth/depth_model.py", line 23, in forward
    depth = self.estimate_depth(images)
  File "/home/ubuntu/Documents/consistent_depth/monodepth/mannequin_challenge_model.py", line 60, in estimate_depth
    self.model.prediction_d, _ = self.model.netG.forward(images)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/Documents/consistent_depth/monodepth/mannequin_challenge/models/hourglass.py", line 176, in forward
    pred_feature = self.seq(input_)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "/home/ubuntu/anaconda3/envs/depth/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Command:

python main.py --video_file data/videos/ayush.mp4 --path results/ayush --camera_params "1671.770118, 540, 960" --camera_model "SIMPLE_PINHOLE" --make_video

Here is my setup:

Cuda compilation tools, release 10.1, V10.1.243

print(torch.__version__) 1.4.0+cu100

Thanks for helping!
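
The two version lines in the setup above describe different things: the first is the system CUDA toolkit (10.1), while the `+cu100` suffix says the PyTorch wheel was built against CUDA 10.0. PyTorch wheels bundle their own CUDA runtime, so the system toolkit mostly matters when compiling extensions. A minimal sketch for collecting the relevant details in one place is below; the function name `collect_gpu_env` is hypothetical, and `torch.cuda.get_arch_list` is looked up defensively because older releases such as 1.4.0 may not provide it.

```python
import importlib


def collect_gpu_env():
    """Return a dict describing the PyTorch / CUDA / cuDNN build and the visible GPU.

    collect_gpu_env is a hypothetical helper name, not part of consistent_depth.
    """
    info = {
        "torch": None,
        "built_for_cuda": None,
        "cudnn": None,
        "gpu": None,
        "capability": None,
        "arch_list": None,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return info  # PyTorch is not installed in this environment

    info["torch"] = torch.__version__
    info["built_for_cuda"] = torch.version.cuda        # CUDA version the wheel was built with
    info["cudnn"] = torch.backends.cudnn.version()     # None if cuDNN is unavailable

    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
        info["capability"] = torch.cuda.get_device_capability(0)  # e.g. (8, 6)
        # Architectures compiled into the wheel; only present in newer releases.
        get_arch_list = getattr(torch.cuda, "get_arch_list", None)
        if get_arch_list is not None:
            info["arch_list"] = get_arch_list()
    return info


if __name__ == "__main__":
    for key, value in collect_gpu_env().items():
        print(f"{key}: {value}")
```

If the reported capability is newer than anything the wheel was built for, cuDNN failures like the one above are a plausible outcome, although the thread does not confirm that this was the cause here.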

Answer 1 · 2021-08-03T15:51:29.000Z

Exactly the same error on my side. I tried all kinds of cuDNN installations and versions, but none of them worked.

Answer 2 · 2021-09-23T10:55:08.000Z

Hi, I solved this by installing the latest PyTorch version available at the time (pip3 install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html). I then had to patch flownet2 (https://github.com/NVIDIA/flownet2-pytorch/pull/254/files) to make it compatible.

Answer 3 · 2021-11-17T23:02:42.000Z

Hi!

@dschoerk can you elaborate a bit more on how you managed to solve this? I was having the same problem as @karliell. I followed your advice and installed the newer PyTorch (pip3 install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html), then separately cloned the repository branch you referred to (https://github.com/christian-rauch/flownet2-pytorch/tree/cxx14) in place of flownet2, and I got it to work. I was just wondering whether there is an easier solution than the one I used, which is a bit messy because of the separate clone. In any case, thanks for the solution!

Answer 4 · 2022-01-21T21:39:27.000Z

After patching flownet2, you need to install it again so that the patched code is actually rebuilt and used.
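
One way to check that the reinstall took effect is to confirm that the compiled flownet2 extensions can be located from the active environment. The sketch below is an assumption-laden example: the module names `correlation_cuda`, `resample2d_cuda`, and `channelnorm_cuda` are what the flownet2-pytorch build is believed to install, and `missing_extensions` is a hypothetical helper. Adjust the names if your build differs.

```python
import importlib.util

# Assumed names of the CUDA extensions built by flownet2-pytorch.
FLOWNET2_EXTENSIONS = ["correlation_cuda", "resample2d_cuda", "channelnorm_cuda"]


def missing_extensions(names=FLOWNET2_EXTENSIONS):
    """Return the module names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_extensions()
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All flownet2 extensions were found.")
```

This only checks that the modules are installed; it does not verify that they were compiled against the PyTorch version currently in use.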

Answer 5 · 2022-12-01T09:18:36.000Z

Worked for me too.
I have Ubuntu 20 and a GeForce 3050 Ti:
pip3 install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
Then I cloned the cxx14 branch from https://github.com/christian-rauch/flownet2-pytorch/tree/cxx14
and installed it according to the README.
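
A possible explanation for why the newer wheel helps on a card like the 3050 Ti is that Ampere GPUs (compute capability 8.6) can only be targeted by CUDA 11.1 or later, so a `+cu100` wheel cannot contain kernels for them. The sketch below encodes that rule for a few architectures; the helper names and the small lookup table are illustrative assumptions, and a new enough CUDA tag is a necessary condition, not a guarantee that a given wheel ships kernels for the GPU.

```python
import re

# Minimum CUDA toolkit release able to target each compute capability.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (RTX 30 series)
}


def wheel_cuda_version(torch_version):
    """Parse a version such as '1.9.1+cu111' into (11, 1); None if there is no +cu tag."""
    match = re.search(r"\+cu(\d+)$", torch_version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version, everything before it is the major version.
    return int(digits[:-1]), int(digits[-1])


def wheel_can_target(torch_version, capability):
    """Return True/False if the wheel's CUDA version is new enough, or None if unknown."""
    cuda = wheel_cuda_version(torch_version)
    required = MIN_CUDA_FOR_CAPABILITY.get(capability)
    if cuda is None or required is None:
        return None
    return cuda >= required


if __name__ == "__main__":
    for version in ("1.4.0+cu100", "1.9.1+cu111"):
        print(version, "can target (8, 6):", wheel_can_target(version, (8, 6)))
```

The original poster did not state which GPU they were using, so this reasoning is confirmed only for the 3050 Ti setup described above.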