vt-vl-lab/3d-photo-inpainting

RuntimeError: CUDA error: no kernel image is available for execution on the device

ipritom opened this issue · 2 comments

I'm getting a CUDA error while trying to generate a video with this repo on Ubuntu 20.04 running in WSL2.

The error message is given below:

(3DP) $ python main.py --config argument.yml

running on device 0
  0%|                                                                                             | 0/1 [00:00<?, ?it/s]Current Source ==>  1
Running depth extraction at 1675843829.7992733
BoostingMonocularDepth/inputs/*.jpg
device: cuda
Namespace(Final=True, R0=False, R20=False, colorize_results=False, data_dir='inputs/', depthNet=0, max_res=inf, net_receptive_field_size=None, output_dir='outputs', output_resolution=1, pix2pixsize=1024, savepatchs=0, savewholeest=0)
----------------- Options ---------------
                    Final: True                                 [default: False]
                       R0: False
                      R20: False
             aspect_ratio: 1.0
               batch_size: 1
          checkpoints_dir: ./pix2pix/checkpoints
         colorize_results: False
                crop_size: 672
                 data_dir: inputs/                              [default: None]
                 dataroot: None
             dataset_mode: depthmerge
                 depthNet: 0                                    [default: None]
                direction: AtoB
          display_winsize: 256
                    epoch: latest
                     eval: False
            generatevideo: None
                  gpu_ids: 0
                init_gain: 0.02
                init_type: normal
                 input_nc: 2
                  isTrain: False                                [default: None]
                load_iter: 0                                    [default: 0]
                load_size: 672
         max_dataset_size: 10000
                  max_res: inf
                    model: pix2pix4depth
               n_layers_D: 3
                     name: void
                      ndf: 64
                     netD: basic
                     netG: unet_1024
 net_receptive_field_size: None
                      ngf: 64
               no_dropout: False
                  no_flip: False
                     norm: none
                 num_test: 50
              num_threads: 4
               output_dir: outputs                              [default: None]
                output_nc: 1
        output_resolution: None
                    phase: test
              pix2pixsize: None
               preprocess: resize_and_crop
                savecrops: None
             savewholeest: None
           serial_batches: False
                   suffix:
                  verbose: False
----------------- End -------------------
initialize network with normal
loading the model from ./pix2pix/checkpoints/mergemodel/latest_net_G.pth
Loading weights:  midas/model.pt
Using cache found in /home/ipritom/.cache/torch/hub/facebookresearch_WSL-Images_main
start processing
processing image 0 : 1
         wholeImage being processed in : 2496
Traceback (most recent call last):
  File "run.py", line 580, in <module>
    run(dataset_, option_)
  File "run.py", line 126, in run
    option.pix2pixsize, option.depthNet)
  File "run.py", line 389, in doubleestimate
    estimate1 = singleestimate(img, size1, net_type)
  File "run.py", line 418, in singleestimate
    return estimatemidas(img, msize)
  File "run.py", line 475, in estimatemidas
    prediction = midasmodel.forward(sample)
  File "/home/3d-photo-inpainting/BoostingMonocularDepth/midas/models/midas_net.py", line 59, in forward
    layer_1 = self.pretrained.layer1(x)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/torch/nn/modules/activation.py", line 94, in forward
    return F.relu(input, inplace=self.inplace)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/torch/nn/functional.py", line 912, in relu
    result = torch.relu_(input)
RuntimeError: CUDA error: no kernel image is available for execution on the device
  0%|                                                                                             | 0/1 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 54, in <module>
    run_boostmonodepth(sample['ref_img_fi'], config['src_folder'], config['depth_folder'])
  File "/home/3d-photo-inpainting/boostmonodepth_utils.py", line 41, in run_boostmonodepth
    depth = imageio.imread(os.path.join(BOOST_BASE, BOOST_OUTPUTS, tgt_name))
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/imageio/__init__.py", line 97, in imread
    return imread_v2(uri, format=format, **kwargs)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/imageio/v2.py", line 200, in imread
    with imopen(uri, "ri", **imopen_args) as file:
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/imageio/core/imopen.py", line 118, in imopen
    request = Request(uri, io_mode, format_hint=format_hint, extension=extension)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/imageio/core/request.py", line 248, in __init__
    self._parse_uri(uri)
  File "/home/ipritom/miniconda3/envs/3DP/lib/python3.7/site-packages/imageio/core/request.py", line 407, in _parse_uri
    raise FileNotFoundError("No such file: '%s'" % fn)
FileNotFoundError: No such file: '/home/3d-photo-inpainting/BoostingMonocularDepth/outputs/1.png'

Please let me know how I can solve this issue.

NVIDIA Driver Version: 516.94
CUDA Version: 11.7

This problem is solved.
I hadn't noticed that I had installed the wrong version of CUDA while following the instructions in the README.md file.

I then created a new environment and installed PyTorch built for my CUDA version. The installation command can be found here:
For me, it was:

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
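
This error generally means that the installed PyTorch binary contains no kernels compiled for the GPU's compute capability. A minimal sketch for checking this before and after reinstalling follows. It assumes a PyTorch version recent enough to provide `torch.cuda.get_arch_list()`; the helper name `has_kernel_for` is hypothetical, and its matching rule (same major version, build minor not above the device minor) is a simplification that ignores any PTX fallback.

```python
import re


def has_kernel_for(capability, arch_list):
    """Return True if any sm_XY entry in arch_list can run on the device.

    capability: (major, minor) tuple, e.g. from torch.cuda.get_device_capability().
    arch_list:  strings such as "sm_80", e.g. from torch.cuda.get_arch_list().
    Simplified rule (an assumption of this sketch): a binary built for sm_XY
    runs on a device with the same major version X and a minor version >= Y.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.match(r"sm_(\d+)", arch)
        if match is None:
            continue  # skip non-binary entries such as "compute_80"
        sm = int(match.group(1))
        if sm // 10 == major and sm % 10 <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is None or not torch.cuda.is_available():
        print("PyTorch with a usable CUDA device was not found.")
    else:
        # Older PyTorch releases may not have get_arch_list at all.
        get_arch_list = getattr(torch.cuda, "get_arch_list", None)
        if get_arch_list is None:
            print("This PyTorch version cannot list its compiled architectures.")
        else:
            capability = torch.cuda.get_device_capability(0)
            archs = get_arch_list()
            print("PyTorch CUDA build:", torch.version.cuda)
            print("Device capability:", capability)
            print("Compiled architectures:", archs)
            print("Kernel available:", has_kernel_for(capability, archs))
```

If the last line reports `False`, the PyTorch build in the active environment probably does not match the GPU, and reinstalling as above is the likely fix.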

After that, you may have to change the config-loading line in main.py to the following, as discussed in #191:

config = yaml.safe_load(open(args.config, 'r'))

After that, you may face another problem like the following:

Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.

It was solved by upgrading the numpy module:

pip install -U numpy

At this point, everything should work fine. However, some people have mentioned a PyQt-related error (#16).
The error message may look like the following:

WARNING: could not connect to display
WARNING: Could not load the Qt platform plugin "xcb" in "" even though it was found.
WARNING: This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl, xcb.

Aborted

It was solved by installing mesa-utils-extra, libegl1-mesa-dev, libgles2-mesa-dev, and xvfb, and then starting a virtual display.
Terminal commands:

$ sudo apt update 
$ sudo apt install mesa-utils-extra libegl1-mesa-dev libgles2-mesa-dev xvfb
$ Xvfb :0 -screen 0 1024x768x24 -ac +extension GLX +render -noreset &
$ export DISPLAY=:0
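
The virtual display step above can also be done from Python, for example in a small launcher script. This is a rough sketch and not part of the repository: `needs_virtual_display`, `xvfb_command`, and `start_virtual_display` are hypothetical helper names, and the Xvfb arguments simply mirror the shell command above.

```python
import os
import shutil
import subprocess


def needs_virtual_display(env):
    """Return True if no X display is configured in the given environment."""
    return not env.get("DISPLAY")


def xvfb_command(display=":0", screen="1024x768x24"):
    """Build the same Xvfb argument list as the shell command above."""
    return ["Xvfb", display, "-screen", "0", screen,
            "-ac", "+extension", "GLX", "+render", "-noreset"]


def start_virtual_display(env, display=":0"):
    """Start Xvfb in the background if needed and set DISPLAY in env.

    Returns the Popen handle, or None if a display is already configured.
    This sketch does not wait for the server to become ready.
    """
    if not needs_virtual_display(env):
        return None
    if shutil.which("Xvfb") is None:
        raise RuntimeError("Xvfb not found; install the xvfb package first.")
    server = subprocess.Popen(xvfb_command(display))
    env["DISPLAY"] = display
    return server
```

Calling `start_virtual_display(os.environ)` before the rendering code runs would have roughly the same effect as the last two shell lines.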

Thanks. I faced the same situation on Ubuntu 22.04 with CUDA 11.7. In the end, instead of the PyQt error, I got a different error at the video rendering stage:

libGL error: MESA-LOADER: failed to open radeonsi: /usr/lib/dri/radeonsi_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: radeonsi
libGL error: MESA-LOADER: failed to open radeonsi: /usr/lib/dri/radeonsi_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: radeonsi
libGL error: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: swrast
WARNING: Error drawing visual <Mesh at 0x7f71339e0110>

Solved by preloading the system libstdc++ with LD_PRELOAD:
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 python main.py --config argument.yml

as mentioned in:
conda-forge/ctng-compilers-feedstock#95 (comment)
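
The LD_PRELOAD workaround above can also be applied from a Python launcher, so the variable does not have to be typed on every run. A minimal sketch, assuming the same library path as in the command above; `with_preload` and `run_main` are hypothetical names, not part of the repository.

```python
import os
import subprocess
import sys

# Path taken from the command above; it may differ on other distributions.
SYSTEM_LIBSTDCXX = "/usr/lib/x86_64-linux-gnu/libstdc++.so.6"


def with_preload(env, library):
    """Return a copy of env with library placed first in LD_PRELOAD."""
    new_env = dict(env)
    existing = new_env.get("LD_PRELOAD", "")
    # LD_PRELOAD accepts a colon-separated list; keep any existing entries.
    new_env["LD_PRELOAD"] = f"{library}:{existing}" if existing else library
    return new_env


def run_main(config="argument.yml"):
    """Run main.py with the system libstdc++ preloaded; return its exit code."""
    env = with_preload(os.environ, SYSTEM_LIBSTDCXX)
    result = subprocess.run([sys.executable, "main.py", "--config", config], env=env)
    return result.returncode
```

The variable is set only for the child process, so the parent shell and other programs are unaffected.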