cupy-related install issue on Windows

Question

Closed this issue a year ago · 17 comments

Reported via an email exchange with Christophe Leterrier.

He ran into an error setting up mcSIM. Other CUDA-based packages work on his machine using the system CUDA. I thought it might be the CuPy version, but now I am not so sure.

One idea that comes to mind is a potential python=3.11 issue. Have we tested with that version?

Environment creation and python call:

conda create -n mcsim_env
conda activate mcsim_env
cd C:\Users\chris\christo\Processing
conda install pip
git clone https://github.com/QI2lab/mcSIM.git
cd mcSIM
pip install .[gpu]
cd examples
python reconstruction_sim_gpu_repeatedly.py

Error:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM>python ./examples/reconstruction_sim_gpu_repeatedly.py
Traceback (most recent call last):
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\__init__.py", line 17, in <module>
    from cupy import _core  # NOQA
    ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\_core\__init__.py", line 3, in <module>
    from cupy._core import core  # NOQA
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "cupy\_core\core.pyx", line 1, in init cupy._core.core
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\cuda\__init__.py", line 8, in <module>
    from cupy.cuda import compiler  # NOQA
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\cuda\compiler.py", line 13, in <module>
    from cupy.cuda import device
  File "cupy\cuda\device.pyx", line 1, in init cupy.cuda.device
ImportError: DLL load failed while importing runtime: The specified module could not be found.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\chris\christo\Processing\mcSIM\examples\reconstruction_sim_gpu_repeatedly.py", line 7, in <module>
    import cupy as cp
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\__init__.py", line 19, in <module>
    raise ImportError(f'''
ImportError:
================================================================
Failed to import CuPy.

If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.

On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
On Windows, try setting CUDA_PATH environment variable.

Check the Installation Guide for details:
  https://docs.cupy.dev/en/latest/install.html

CUDA Path: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6
DLL dependencies:
  KERNEL32.dll -> C:\WINDOWS\System32\KERNEL32.DLL
  MSVCP140.dll -> C:\Users\chris\.conda\envs\mcsim_env\MSVCP140.dll
  VCRUNTIME140.dll -> C:\Users\chris\.conda\envs\mcsim_env\VCRUNTIME140.dll
  api-ms-win-crt-convert-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-environment-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-filesystem-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-heap-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-runtime-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-stdio-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  cuTENSOR.dll -> not found
  cublas64_11.dll -> not found
  cudart64_110.dll -> not found
  cudnn64_8.dll -> not found
  cufft64_10.dll -> not found
  curand64_10.dll -> not found
  cusolver64_11.dll -> not found
  cusparse64_11.dll -> not found
  nvcuda.dll -> C:\WINDOWS\SYSTEM32\nvcuda.dll
  nvrtc64_112_0.dll -> not found
  python311.dll -> C:\Users\chris\.conda\envs\mcsim_env\python311.dll

Original error:
  ImportError: DLL load failed while importing runtime: The specified module could not be found.
=============
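
The "not found" entries above can be cross-checked by hand. Below is a minimal sketch, assuming the CUDA DLLs are expected under %CUDA_PATH%\bin or somewhere on PATH. It only tests whether the files exist, which is not the same as the Windows loader's actual resolution, and the helper names find_missing_dlls and default_search_dirs are made up for illustration.

```python
import os
from pathlib import Path

# DLL names copied from the "DLL dependencies" list in the error above.
CUDA_DLLS = [
    "cublas64_11.dll",
    "cudart64_110.dll",
    "cufft64_10.dll",
    "curand64_10.dll",
    "cusolver64_11.dll",
    "cusparse64_11.dll",
    "nvrtc64_112_0.dll",
]


def find_missing_dlls(dll_names, search_dirs):
    """Return the DLL names that are not present in any of search_dirs."""
    missing = []
    for name in dll_names:
        if not any((Path(d) / name).is_file() for d in search_dirs):
            missing.append(name)
    return missing


def default_search_dirs():
    """Assumed search locations: %CUDA_PATH%\\bin plus every PATH entry."""
    dirs = []
    cuda_path = os.environ.get("CUDA_PATH")
    if cuda_path:
        dirs.append(os.path.join(cuda_path, "bin"))
    dirs.extend(p for p in os.environ.get("PATH", "").split(os.pathsep) if p)
    return dirs


if __name__ == "__main__":
    for name in find_missing_dlls(CUDA_DLLS, default_search_dirs()):
        print(f"not found: {name}")
```

If every name is reported as missing even though a CUDA toolkit is installed, the installed toolkit version may simply not ship DLLs with these exact file names.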

Conda package list

# packages in environment at C:\Users\chris\.conda\envs\mcsim_env:
#
# Name                    Version                   Build  Channel
asciitree                 0.3.3                    pypi_0    pypi
bzip2                     1.0.8                h8ffe710_4    conda-forge
ca-certificates           2023.5.7             h56e8100_0    conda-forge
click                     8.1.3                    pypi_0    pypi
cloudpickle               2.2.1                    pypi_0    pypi
colorama                  0.4.6                    pypi_0    pypi
contourpy                 1.1.0                    pypi_0    pypi
cucim                     23.2.0                   pypi_0    pypi
cupy-cuda11x              12.1.0                   pypi_0    pypi
cycler                    0.11.0                   pypi_0    pypi
dask                      2023.6.0                 pypi_0    pypi
dask-image                2023.3.0                 pypi_0    pypi
entrypoints               0.4                      pypi_0    pypi
fasteners                 0.18                     pypi_0    pypi
fastrlock                 0.8.1                    pypi_0    pypi
fonttools                 4.40.0                   pypi_0    pypi
fsspec                    2023.6.0                 pypi_0    pypi
h5py                      3.9.0                    pypi_0    pypi
imageio                   2.31.1                   pypi_0    pypi
importlib-metadata        6.7.0                    pypi_0    pypi
joblib                    1.2.0                    pypi_0    pypi
kiwisolver                1.4.4                    pypi_0    pypi
lazy-loader               0.2                      pypi_0    pypi
libexpat                  2.5.0                h63175ca_1    conda-forge
libffi                    3.4.2                h8ffe710_5    conda-forge
libsqlite                 3.42.0               hcfcfb64_0    conda-forge
libzlib                   1.2.13               hcfcfb64_5    conda-forge
llvmlite                  0.40.1rc1                pypi_0    pypi
localize-psf              0.2.0                    pypi_0    pypi
locket                    1.0.0                    pypi_0    pypi
matplotlib                3.7.1                    pypi_0    pypi
mcsim                     1.4.0                    pypi_0    pypi
networkx                  3.1                      pypi_0    pypi
numba                     0.57.0                   pypi_0    pypi
numcodecs                 0.11.0                   pypi_0    pypi
numpy                     1.24.3                   pypi_0    pypi
openssl                   3.1.1                hcfcfb64_1    conda-forge
packaging                 23.1                     pypi_0    pypi
pandas                    2.0.2                    pypi_0    pypi
partd                     1.4.0                    pypi_0    pypi
pillow                    9.5.0                    pypi_0    pypi
pims                      0.6.1                    pypi_0    pypi
pip                       23.1.2             pyhd8ed1ab_0    conda-forge
psutil                    5.9.5                    pypi_0    pypi
pyparsing                 3.1.0                    pypi_0    pypi
python                    3.11.4          h2628c8c_0_cpython    conda-forge
python-dateutil           2.8.2                    pypi_0    pypi
pytz                      2023.3                   pypi_0    pypi
pywavelets                1.4.1                    pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
scikit-image              0.21.0                   pypi_0    pypi
scipy                     1.10.1                   pypi_0    pypi
setuptools                67.7.2             pyhd8ed1ab_0    conda-forge
six                       1.16.0                   pypi_0    pypi
slicerator                1.1.0                    pypi_0    pypi
tifffile                  2023.4.12                pypi_0    pypi
tk                        8.6.12               h8ffe710_0    conda-forge
toolz                     0.12.0                   pypi_0    pypi
tzdata                    2023.3                   pypi_0    pypi
ucrt                      10.0.22621.0         h57928b3_0    conda-forge
vc                        14.3                hb25d44b_16    conda-forge
vc14_runtime              14.34.31931         h5081d32_16    conda-forge
vs2015_runtime            14.34.31931         hed1258a_16    conda-forge
wheel                     0.40.0             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h8d14728_0    conda-forge
zarr                      2.15.0                   pypi_0    pypi
zipp                      3.15.0                   pypi_0    pypi

Answer 1 · 2023-06-20T16:25:33.000Z

I tried creating an environment with Python 3.9 before installing your package, but that also resulted in cupy-cuda11x version 12.1.0 being installed.

I then tried creating a Python 3.10 environment and running pip install cupy-cuda11x before installing your package (as discussed), but that also resulted in cupy-cuda11x 12.1.0:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM>pip install cupy-cuda11x
Collecting cupy-cuda11x
Downloading cupy_cuda11x-12.1.0-cp310-cp310-win_amd64.whl (69.9 MB)
Collecting numpy<1.27,>=1.20 (from cupy-cuda11x)
Downloading numpy-1.25.0-cp310-cp310-win_amd64.whl (15.0 MB)
Collecting fastrlock>=0.5 (from cupy-cuda11x)
Downloading fastrlock-0.8.1-cp310-cp310-win_amd64.whl (28 kB)
Installing collected packages: fastrlock, numpy, cupy-cuda11x
Successfully installed cupy-cuda11x-12.1.0 fastrlock-0.8.1 numpy-1.25.0

Both strategies result in the same CuPy error (reproduced in the first post) when launching reconstruction_sim_gpu_repeatedly.py.
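
To confirm which CuPy distribution actually ended up in an environment, a small check along these lines may help. It is a sketch using only the standard library; the list of distribution names is an assumption about what might be installed, and installed_versions is a hypothetical helper.

```python
from importlib import metadata


def installed_versions(dist_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in dist_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "cupy" is the source/conda name; the others are prebuilt wheel names.
    print(installed_versions(["cupy", "cupy-cuda11x", "cupy-cuda12x"]))
```

More than one non-None entry would suggest conflicting CuPy packages in the same environment, which is worth ruling out before looking further.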

Answer 2 · 2023-06-20T17:08:57.000Z

I could get through launching reconstruction_sim_gpu_repeatedly.py using this install sequence:

conda create -n mcsim_env python=3.10
conda activate mcsim_env
conda install -c conda-forge cupy cudnn cutensor
cd C:\Users\chris\christo\Processing
git clone https://github.com/QI2lab/mcSIM.git
cd mcSIM
pip install .
pip install "git+https://github.com/rapidsai/cucim.git@v22.12.00#egg=cucim&subdirectory=python/cucim"
cd examples
python reconstruction_sim_gpu_repeatedly.py

Basically, I used conda to get the CUDA toolkit and the previously missing CUDA bits, installed the mcSIM package without GPU support, then manually installed the Windows cuCIM bit.

Answer 3 · 2023-06-20T17:10:33.000Z

Next, for another episode (or another issue thread :)
running initial reconstruction with full parameter estimation
Traceback (most recent call last):
  File "C:\Users\chris\christo\Processing\mcSIM\examples\reconstruction_sim_gpu_repeatedly.py", line 40, in <module>
    imgset = sim.SimImageSet.initialize({"pixel_size": dxy,
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 244, in initialize
    self.preprocess_data(physical_params["pixel_size"],
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 483, in preprocess_data
    self.nangles, self.nphases, self.ny, self.nx = imgs.shape[-4:]
ValueError: not enough values to unpack (expected 4, got 3)

Answer 4 · 2023-06-20T17:11:14.000Z

@cleterrier I can reproduce your error using Python 3.11 on Windows 10 Enterprise if I pip install CuPy. This looks like a CuPy problem to me, but I don't think version 12.1.0 of cupy-cuda11x is necessarily a problem.

Try installing CuPy using conda. I typically run

conda install -c conda-forge cupy cudatoolkit=11.8

but you should replace 11.8 with your CUDA toolkit version

Please also pull the latest version of mcSIM, as I have updated the CuPy instructions
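
After reinstalling, a quick smoke test can show whether CuPy loads and can reach the GPU before running any mcSIM script. The following is a sketch only; check_cupy is a hypothetical helper and the returned dictionary layout is an arbitrary choice.

```python
def check_cupy():
    """Try to import CuPy and query the CUDA runtime; return a status dict."""
    status = {"ok": False, "cupy_version": None, "runtime_version": None, "error": None}
    try:
        import cupy as cp

        status["cupy_version"] = cp.__version__
        # runtimeGetVersion() returns an integer such as 11080 for CUDA 11.8
        status["runtime_version"] = cp.cuda.runtime.runtimeGetVersion()
        # Run a tiny computation on the device to confirm kernels can execute
        status["ok"] = int(cp.arange(4).sum()) == 6
    except Exception as exc:  # ImportError, CUDA runtime errors, ...
        status["error"] = f"{type(exc).__name__}: {exc}"
    return status


if __name__ == "__main__":
    print(check_cupy())
```

If "ok" is False, the "error" field should contain the same kind of message as the ImportError in the first post.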

Answer 5 · 2023-06-20T17:18:18.000Z

@cleterrier the variable imgs, which is passed to the constructor, should be 4D with shape nangles x nphases x ny x nx. My guess is that you are loading a 3D SIM image of size (nangles * nphases) x ny x nx instead.
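
As an illustration of the shape requirement, here is a minimal NumPy sketch that turns a 3D stack into the 4D layout. It assumes the frames are stored angle-major (all phases of angle 0 first, then angle 1, and so on); to_angle_phase_stack is a hypothetical helper, not part of mcSIM.

```python
import numpy as np


def to_angle_phase_stack(imgs, nangles, nphases):
    """Reshape (nangles * nphases, ny, nx) into (nangles, nphases, ny, nx).

    Assumes frames are stored angle-major: all phases of angle 0 first.
    """
    if imgs.ndim != 3:
        raise ValueError(f"expected a 3D stack, got shape {imgs.shape}")
    nframes, ny, nx = imgs.shape
    if nframes != nangles * nphases:
        raise ValueError(
            f"{nframes} frames cannot be split into {nangles} angles x {nphases} phases"
        )
    return imgs.reshape(nangles, nphases, ny, nx)


if __name__ == "__main__":
    stack = np.zeros((9, 512, 512))
    print(to_angle_phase_stack(stack, nangles=3, nphases=3).shape)
```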

Sorry, this example was not ready for prime time. I see that I did not include the data file I was using here in the Zenodo repository. I will update both the code and the Zenodo repository so you can actually reproduce what I'm doing

Do the example scripts reconstruct_sim_simulated.py or reconstruct_sim_experiment.py run for you? To run the second you will have to download the image file from https://doi.org/10.5281/zenodo.7851111

Answer 6 · 2023-06-20T17:44:02.000Z

I see - my image is a timelapse 2D-SIM "flat" stack (tif from ImageJ) with 61 time points x 3 phases x 3 angles x X x Y.

So I guess I need to use something like in reconstruct_sim_experiment.py when reading the tif file:
reshape([ncolors, nangles, nphases, ny, nx])
but what do I put instead of 'ncolors'?
In reconstruction_sim_gpu_repeatedly.py I can't find the name of the variable that defines the different images upstream of phases/angles/x/y in the sequence
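
For a flat time-lapse stack, the leading axis would be the number of time points rather than the number of colors. The sketch below also covers the case where the file stores phases before angles, which is an assumption about the acquisition order that needs to be checked against the actual data; reshape_timelapse and its phase_major flag are hypothetical.

```python
import numpy as np


def reshape_timelapse(flat, ntimes, nangles, nphases, phase_major=False):
    """Reshape a flat (ntimes * nangles * nphases, ny, nx) stack to 5D.

    The output axes are always (ntimes, nangles, nphases, ny, nx).
    phase_major=True means that, within each time point, the file stores all
    angles of phase 0 first, so the two middle axes are swapped after reshaping.
    """
    ny, nx = flat.shape[-2:]
    if phase_major:
        return flat.reshape(ntimes, nphases, nangles, ny, nx).swapaxes(1, 2)
    return flat.reshape(ntimes, nangles, nphases, ny, nx)


if __name__ == "__main__":
    flat = np.zeros((61 * 3 * 3, 64, 64))
    print(reshape_timelapse(flat, ntimes=61, nangles=3, nphases=3).shape)
```

With 3 angles and 3 phases both orderings give the same shape, so a wrong choice will not raise an error; it would only show up as poor parameter estimates.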

Answer 7 · 2023-06-20T17:57:35.000Z

@cleterrier I don't have an explicit example for reconstructing time-lapse data, but it sounds like I should add one. Since the different channels are handled separately anyways, reconstructing single-color time-lapse data is simpler than the recipe in reconstruction_sim_experiment.py.

You would want to do something like

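import tifffile
import mcsim.analysis.sim_reconstruction as sim

# fname, pixel_size, na, emission_wavelength, otf, frqs_guess, phases_guess,
# use_gpu and save_dir are assumed to be defined earlier, as in the example scripts
imgs = tifffile.imread(fname)
ny, nx = imgs.shape[-2:]
# 61 time points, 3 angles, 3 phases
imgs = imgs.reshape([61, 3, 3, ny, nx])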

imgset = sim.SimImageSet.initialize({"pixel_size": pixel_size, "na": na, "wavelength": emission_wavelength},
                                        imgs,
                                        otf=otf,
                                        wiener_parameter=0.1,
                                        frq_estimation_mode="band-correlation",
                                        frq_guess=frqs_guess,
                                        phase_estimation_mode="wicker-iterative",
                                        phases_guess=phases_guess,
                                        combine_bands_mode="fairSIM",
                                        fmax_exclude_band0=0.4,
                                        normalize_histograms=False,
                                        background=100,
                                        gain=2,
                                        min_p2nr=0.5,
                                        use_gpu=use_gpu)
# ###########################################
# run reconstruction
# ###########################################
imgset.reconstruct(compute_widefield=True,
                   compute_os=True,
                   compute_deconvolved=True,
                   compute_mcnr=True)

# ###########################################
# print parameters
# ###########################################
imgset.print_parameters()

# ###########################################
# save reconstruction results
# ###########################################
imgset.save_imgs(save_dir,
                     format="tiff",  #format="zarr",
                     save_raw_data=False,
                     save_patterns=False)

# ###########################################
# save diagnostic plots
# ###########################################
imgset.plot_figs(save_dir,
                 diagnostics_only=True,
                 figsize=(20, 10),
                 imgs_dpi=300)

Answer 8 · 2023-06-20T18:46:32.000Z

Good news is that I managed to format my data using reshape in reconstruction_sim_gpu_repeatedly.py

Bad news is that my cupy install still seems a bit wonky... Processing stopped with:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM\examples>python reconstruction_sim_gpu_repeatedly.py
running initial reconstruction with full parameter estimation
initialization took 7.19s
starting parameter estimation...
estimating 3 frequencies using mode band-correlation took 5.99s
Traceback (most recent call last):
  File "C:\Users\chris\christo\Processing\mcSIM\examples\reconstruction_sim_gpu_repeatedly.py", line 60, in <module>
    imgset.reconstruct()
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 1118, in reconstruct
    self.estimate_parameters(slices=slices)
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 845, in estimate_parameters
    peak_val = tools.get_peak_value(imgs_ft[ii],
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\analysis_tools.py", line 194, in get_peak_value
    roi = rois.get_centered_rois([iy, ix], [3 * peak_pixel_size, 3 * peak_pixel_size])[0]
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\localize_psf\rois.py", line 139, in get_centered_rois
    centers = np.atleast_2d(centers)
  File "<__array_function__ internals>", line 200, in atleast_2d
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\numpy\core\shape_base.py", line 121, in atleast_2d
    ary = asanyarray(ary)
  File "cupy\_core\core.pyx", line 1480, in cupy._core.core._ndarray_base.__array__
TypeError: Implicit conversion to a NumPy array is not allowed. Please use .get() to construct a NumPy array explicitly.
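
This TypeError is what CuPy raises whenever a GPU array is handed to a NumPy function that tries to convert it silently. A generic way to avoid it is to copy to the host explicitly first. The sketch below is one such pattern, not the fix that was applied in mcSIM; to_numpy is a hypothetical helper.

```python
import numpy as np

try:
    import cupy as cp
except Exception:  # CuPy missing or failing to load: only NumPy inputs are possible
    cp = None


def to_numpy(arr):
    """Return a NumPy array, copying from the GPU explicitly when needed."""
    if cp is not None and isinstance(arr, cp.ndarray):
        # Explicit device-to-host copy, equivalent to arr.get()
        return cp.asnumpy(arr)
    return np.asarray(arr)


if __name__ == "__main__":
    centers = [3, 4]  # could equally be a CuPy array
    print(np.atleast_2d(to_numpy(centers)))
```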

Answer 9 · 2023-06-20T18:47:18.000Z

@cleterrier can you pull the latest version? I just fixed this

Answer 10 · 2023-06-20T18:47:45.000Z

@cleterrier I think you are going to run out of memory using the GPU with 61 time points anyways, so you might want to set the GPU flag to false

Answer 11 · 2023-06-20T18:57:10.000Z

@cleterrier I added a simple time-lapse reconstruction script to examples here. The sample tiff file is available here

To expand on what I was saying before: right now the GPU support is fairly stupid. If you put in a whole z-stack at once, as I do in this example file, the code will try to reconstruct the whole z-stack at once. The workaround is to reconstruct the images in a loop, using a strategy similar to the one in reconstruction_sim_gpu_repeatedly.py.
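
The loop strategy can be sketched generically as follows. Here process_one is a hypothetical stand-in for whatever performs a single reconstruction (it is not an mcSIM function), and freeing the CuPy memory pool between iterations is an assumption about what helps keep GPU memory bounded.

```python
import numpy as np

try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
except Exception:  # CuPy missing or no GPU: run on the CPU only
    cp = None


def process_per_timepoint(stack, process_one):
    """Apply process_one to each leading-axis slice and stack the results.

    stack       : array of shape (ntimes, nangles, nphases, ny, nx)
    process_one : hypothetical callable standing in for one reconstruction;
                  it receives one (nangles, nphases, ny, nx) block and must
                  return a NumPy array.
    """
    results = []
    for block in stack:
        results.append(process_one(block))
        if cp is not None:
            # Release cached, unused blocks before the next time point
            cp.get_default_memory_pool().free_all_blocks()
    return np.stack(results)


if __name__ == "__main__":
    stack = np.random.rand(4, 3, 3, 32, 32)
    # Placeholder "reconstruction": average over angles and phases
    print(process_per_timepoint(stack, lambda b: b.mean(axis=(0, 1))).shape)
```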

Answer 12 · 2023-06-20T19:08:28.000Z

I see... Here's what I get with reconstruction_sim_gpu_repeatedly.py and my 61-frame 1024x1024 stack:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM\examples>python reconstruction_sim_gpu_repeatedly.py
running initial reconstruction with full parameter estimation
initialization took 6.71s
starting parameter estimation...
estimating 3 frequencies using mode band-correlation took 5.99s
estimated peak-to-noise ratio in 6.21s
estimated 9 phases using mode wicker-iterative in 5.48s
Angle 0 phase guesses have more than the maximum allowed phase error=10.00deg. Defaulting to guess values
fit phase diffs=0.00deg, 116.36deg, 223.81deg,
Angle 2 phase guesses have more than the maximum allowed phase error=10.00deg. Defaulting to guess values
fit phase diffs=0.00deg, 130.93deg, 243.22deg,
estimated global phases and modulation depths in 7.58s
replaced modulation depth for angle 0 because estimated value was less than allowed minimum, 0.039 < 0.500
replaced modulation depth for angle 1 because estimated value was less than allowed minimum, 0.021 < 0.500
replaced modulation depth for angle 2 because estimated value was less than allowed minimum, 0.027 < 0.500
parameter estimation took 58.34s
combining bands took 0.57s
reconstruction took 1.94s
estimating parameters took = 67.00s, used GPU memory = 0.679GB, memory pool = 1.443GB
used GPU memory = 0.679GB, memory pool = 4.337GB
iteration 1, process time=36.51s, used GPU memory = 0.000GB, memory pool = 4.941GB
iteration 2, process time=37.50s, used GPU memory = 0.000GB, memory pool = 7.810GB
iteration 3, process time=37.48s, used GPU memory = 0.000GB, memory pool = 3.733GB

A comment in the code says each iteration should take less than 2 seconds, and the GPU memory being reported as zero makes me think there's a problem somewhere. But my GPU is definitely working according to GPU-Z.
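
For reference, CuPy's default memory pool distinguishes between bytes currently held by live arrays (used_bytes) and bytes the pool has acquired from the device, including freed blocks kept for reuse (total_bytes). A reading of zero used memory with a non-zero pool between iterations is therefore not, on its own, a sign that the GPU is idle. The sketch below prints both numbers; whether the script's log lines are computed exactly this way, and with 1e9 bytes per GB, is an assumption.

```python
def format_gb(nbytes):
    """Format a byte count in the same style as the log lines above."""
    return f"{nbytes / 1e9:.3f}GB"


def gpu_memory_report():
    """Summarize CuPy's default memory pool, or return None if CuPy is unusable."""
    try:
        import cupy as cp

        pool = cp.get_default_memory_pool()
        used, total = pool.used_bytes(), pool.total_bytes()
    except Exception:
        return None
    return f"used GPU memory = {format_gb(used)}, memory pool = {format_gb(total)}"


if __name__ == "__main__":
    print(gpu_memory_report())
```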

Answer 13 · 2023-06-20T19:10:09.000Z

Also, it looks like the parameter estimation fails. I've had the same problem with fairSIM but not with HiFi-SIM. Maybe I can send you the image file so that you can see if you manage to get something out of it?

Answer 14 · 2023-06-20T19:24:35.000Z

@cleterrier I'm happy to give it a go. My email is ptbrown1729 AT gmail

Answer 15 · 2023-06-20T19:25:21.000Z

I'm happy to take a look at it if you send a link. Is the data TIRF-SIM?

Edit: Peter's napari plugin is helpful for setting up the initial parameters.

Answer 16 · 2023-06-20T19:32:28.000Z

Check your email!

Answer 17 · 2023-06-20T19:36:17.000Z

For TIRF SIM it's probably necessary to do a provisional unmixing of the SIM bands before parameter estimation, as no peaks will be visible in the raw FT. The plugin doesn't (yet!) support that
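
To make the idea of a provisional unmixing concrete, here is a textbook-style sketch for one pattern angle. It assumes each Fourier-space raw image is a linear mix of three bands, D_k = S_0 + (m/2) e^{i phi_k} S_+ + (m/2) e^{-i phi_k} S_-, with guessed phases phi_k and modulation depth m. It is not the plugin's or mcSIM's implementation, and unmix_bands is a hypothetical name.

```python
import numpy as np


def unmix_bands(imgs_ft, phases, mod_depth=1.0):
    """Separate three SIM bands from nphases Fourier-space images of one angle.

    imgs_ft   : complex array, shape (nphases, ny, nx)
    phases    : assumed pattern phases in radians, length nphases
    mod_depth : assumed modulation depth m
    Returns an array of shape (3, ny, nx) ordered (0, +1, -1).
    """
    phases = np.asarray(phases, dtype=float)
    # Mixing matrix, one row per phase: [1, m/2 e^{i phi}, m/2 e^{-i phi}]
    mixing = np.stack(
        [
            np.ones_like(phases, dtype=complex),
            0.5 * mod_depth * np.exp(1j * phases),
            0.5 * mod_depth * np.exp(-1j * phases),
        ],
        axis=1,
    )
    unmixing = np.linalg.pinv(mixing)  # shape (3, nphases)
    # Contract the phase axis of the unmixing matrix with the image stack
    return np.tensordot(unmixing, imgs_ft, axes=(1, 0))


if __name__ == "__main__":
    nominal = np.array([0, 2 * np.pi / 3, 4 * np.pi / 3])
    imgs_ft = np.fft.fft2(np.random.rand(3, 64, 64), axes=(-2, -1))
    print(unmix_bands(imgs_ft, nominal).shape)
```

With nominal, equally spaced phases the separation is only approximate, but it may be enough to make the shifted bands visible for an initial frequency guess.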