QI2lab/mcSIM

cupy-related install issue on Windows

Closed this issue · 17 comments

Via email exchange with Christophe Leterrier:

He ran into an error setting up mcSIM. Other CUDA-based packages work on his machine using the system CUDA install. I thought it might be the CuPy version, but now I am not so sure.

One idea that comes to mind is a potential python=3.11 issue. Have we tested with that version?

Environment creation and python call:

conda create -n mcsim_env
conda activate mcsim_env
cd C:\Users\chris\christo\Processing
conda install pip
git clone https://github.com/QI2lab/mcSIM.git
cd mcSIM
pip install .[gpu]
cd examples
python reconstruction_sim_gpu_repeatedly.py

Error:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM>python ./examples/reconstruction_sim_gpu_repeatedly.py
Traceback (most recent call last):
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\__init__.py", line 17, in <module>
    from cupy import _core  # NOQA
    ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\_core\__init__.py", line 3, in <module>
    from cupy._core import core  # NOQA
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "cupy\_core\core.pyx", line 1, in init cupy._core.core
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\cuda\__init__.py", line 8, in <module>
    from cupy.cuda import compiler  # NOQA
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\cuda\compiler.py", line 13, in <module>
    from cupy.cuda import device
  File "cupy\cuda\device.pyx", line 1, in init cupy.cuda.device
ImportError: DLL load failed while importing runtime: Le module spécifié est introuvable.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\chris\christo\Processing\mcSIM\examples\reconstruction_sim_gpu_repeatedly.py", line 7, in <module>
    import cupy as cp
  File "C:\Users\chris\.conda\envs\mcsim_env\Lib\site-packages\cupy\__init__.py", line 19, in <module>
    raise ImportError(f'''
ImportError:
================================================================
Failed to import CuPy.

If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.

On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
On Windows, try setting CUDA_PATH environment variable.

Check the Installation Guide for details:
  https://docs.cupy.dev/en/latest/install.html

CUDA Path: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6
DLL dependencies:
  KERNEL32.dll -> C:\WINDOWS\System32\KERNEL32.DLL
  MSVCP140.dll -> C:\Users\chris\.conda\envs\mcsim_env\MSVCP140.dll
  VCRUNTIME140.dll -> C:\Users\chris\.conda\envs\mcsim_env\VCRUNTIME140.dll
  api-ms-win-crt-convert-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-environment-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-filesystem-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-heap-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-runtime-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  api-ms-win-crt-stdio-l1-1-0.dll -> C:\WINDOWS\System32\ucrtbase.dll
  cuTENSOR.dll -> not found
  cublas64_11.dll -> not found
  cudart64_110.dll -> not found
  cudnn64_8.dll -> not found
  cufft64_10.dll -> not found
  curand64_10.dll -> not found
  cusolver64_11.dll -> not found
  cusparse64_11.dll -> not found
  nvcuda.dll -> C:\WINDOWS\SYSTEM32\nvcuda.dll
  nvrtc64_112_0.dll -> not found
  python311.dll -> C:\Users\chris\.conda\envs\mcsim_env\python311.dll

Original error:
  ImportError: DLL load failed while importing runtime: Le module spécifié est introuvable.
=============
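
The "not found" entries in the report above can be checked by hand. The following is a minimal sketch, not part of mcSIM or CuPy: it only tests whether files with those names exist in CUDA_PATH/bin or on PATH, and it does not reproduce the search order Windows actually uses when loading a DLL. The helper names `find_missing_dlls` and `default_search_dirs` are hypothetical.

```python
import os
from pathlib import Path

# DLL names copied from the "not found" lines of the error report
# (cuDNN and cuTENSOR are left out because they are separate downloads).
REQUIRED_DLLS = [
    "cublas64_11.dll",
    "cudart64_110.dll",
    "cufft64_10.dll",
    "curand64_10.dll",
    "cusolver64_11.dll",
    "cusparse64_11.dll",
    "nvrtc64_112_0.dll",
]


def find_missing_dlls(dll_names, search_dirs):
    """Return the names in dll_names that are not present in any of search_dirs."""
    found = set()
    for directory in search_dirs:
        directory = Path(directory)
        if not directory.is_dir():
            continue
        try:
            present = {p.name.lower() for p in directory.iterdir()}
        except OSError:
            continue  # unreadable directory: skip it
        found.update(name for name in dll_names if name.lower() in present)
    return [name for name in dll_names if name not in found]


def default_search_dirs():
    """Hypothetical helper: CUDA_PATH/bin followed by every entry on PATH."""
    dirs = []
    cuda_path = os.environ.get("CUDA_PATH")
    if cuda_path:
        dirs.append(Path(cuda_path) / "bin")
    dirs.extend(Path(p) for p in os.environ.get("PATH", "").split(os.pathsep) if p)
    return dirs


if __name__ == "__main__":
    missing = find_missing_dlls(REQUIRED_DLLS, default_search_dirs())
    print("missing:", missing if missing else "none")
```

If the files turn out to be present, the problem is more likely a version mismatch or the loader's search path than a missing toolkit.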

Conda package list

# packages in environment at C:\Users\chris\.conda\envs\mcsim_env:
#
# Name                    Version                   Build  Channel
asciitree                 0.3.3                    pypi_0    pypi
bzip2                     1.0.8                h8ffe710_4    conda-forge
ca-certificates           2023.5.7             h56e8100_0    conda-forge
click                     8.1.3                    pypi_0    pypi
cloudpickle               2.2.1                    pypi_0    pypi
colorama                  0.4.6                    pypi_0    pypi
contourpy                 1.1.0                    pypi_0    pypi
cucim                     23.2.0                   pypi_0    pypi
cupy-cuda11x              12.1.0                   pypi_0    pypi
cycler                    0.11.0                   pypi_0    pypi
dask                      2023.6.0                 pypi_0    pypi
dask-image                2023.3.0                 pypi_0    pypi
entrypoints               0.4                      pypi_0    pypi
fasteners                 0.18                     pypi_0    pypi
fastrlock                 0.8.1                    pypi_0    pypi
fonttools                 4.40.0                   pypi_0    pypi
fsspec                    2023.6.0                 pypi_0    pypi
h5py                      3.9.0                    pypi_0    pypi
imageio                   2.31.1                   pypi_0    pypi
importlib-metadata        6.7.0                    pypi_0    pypi
joblib                    1.2.0                    pypi_0    pypi
kiwisolver                1.4.4                    pypi_0    pypi
lazy-loader               0.2                      pypi_0    pypi
libexpat                  2.5.0                h63175ca_1    conda-forge
libffi                    3.4.2                h8ffe710_5    conda-forge
libsqlite                 3.42.0               hcfcfb64_0    conda-forge
libzlib                   1.2.13               hcfcfb64_5    conda-forge
llvmlite                  0.40.1rc1                pypi_0    pypi
localize-psf              0.2.0                    pypi_0    pypi
locket                    1.0.0                    pypi_0    pypi
matplotlib                3.7.1                    pypi_0    pypi
mcsim                     1.4.0                    pypi_0    pypi
networkx                  3.1                      pypi_0    pypi
numba                     0.57.0                   pypi_0    pypi
numcodecs                 0.11.0                   pypi_0    pypi
numpy                     1.24.3                   pypi_0    pypi
openssl                   3.1.1                hcfcfb64_1    conda-forge
packaging                 23.1                     pypi_0    pypi
pandas                    2.0.2                    pypi_0    pypi
partd                     1.4.0                    pypi_0    pypi
pillow                    9.5.0                    pypi_0    pypi
pims                      0.6.1                    pypi_0    pypi
pip                       23.1.2             pyhd8ed1ab_0    conda-forge
psutil                    5.9.5                    pypi_0    pypi
pyparsing                 3.1.0                    pypi_0    pypi
python                    3.11.4          h2628c8c_0_cpython    conda-forge
python-dateutil           2.8.2                    pypi_0    pypi
pytz                      2023.3                   pypi_0    pypi
pywavelets                1.4.1                    pypi_0    pypi
pyyaml                    6.0                      pypi_0    pypi
scikit-image              0.21.0                   pypi_0    pypi
scipy                     1.10.1                   pypi_0    pypi
setuptools                67.7.2             pyhd8ed1ab_0    conda-forge
six                       1.16.0                   pypi_0    pypi
slicerator                1.1.0                    pypi_0    pypi
tifffile                  2023.4.12                pypi_0    pypi
tk                        8.6.12               h8ffe710_0    conda-forge
toolz                     0.12.0                   pypi_0    pypi
tzdata                    2023.3                   pypi_0    pypi
ucrt                      10.0.22621.0         h57928b3_0    conda-forge
vc                        14.3                hb25d44b_16    conda-forge
vc14_runtime              14.34.31931         h5081d32_16    conda-forge
vs2015_runtime            14.34.31931         hed1258a_16    conda-forge
wheel                     0.40.0             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h8d14728_0    conda-forge
zarr                      2.15.0                   pypi_0    pypi
zipp                      3.15.0                   pypi_0    pypi

I tried to create an environment specifying Python 3.9 before installing your package, but that also resulted in getting version 12.1.0 of cupy_cuda11x.

I then tried to create a Python 3.10 environment and pip install cupy_cuda11x before installing your package (as discussed), but that also resulted in getting version 12.1.0 of cupy_cuda11x:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM>pip install cupy-cuda11x
Collecting cupy-cuda11x
Downloading cupy_cuda11x-12.1.0-cp310-cp310-win_amd64.whl (69.9 MB)
Collecting numpy<1.27,>=1.20 (from cupy-cuda11x)
Downloading numpy-1.25.0-cp310-cp310-win_amd64.whl (15.0 MB)
Collecting fastrlock>=0.5 (from cupy-cuda11x)
Downloading fastrlock-0.8.1-cp310-cp310-win_amd64.whl (28 kB)
Installing collected packages: fastrlock, numpy, cupy-cuda11x
Successfully installed cupy-cuda11x-12.1.0 fastrlock-0.8.1 numpy-1.25.0

Both strategies result in the same error with cupy (reproduced in first post) when launching reconstruction_sim_gpu_repeatedly.py
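
When switching between pip wheels (cupy-cuda11x) and the conda package (cupy), it can help to confirm that only one CuPy distribution is installed in the environment, since the CuPy installation guide warns that multiple CuPy packages conflict. A small sketch using only the standard library; the function name is hypothetical:

```python
from importlib import metadata


def installed_cupy_distributions():
    """Return {distribution name: version} for installed names starting with 'cupy'."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name and name.lower().startswith("cupy"):
            result[name] = dist.version
    return result


if __name__ == "__main__":
    found = installed_cupy_distributions()
    print(found if found else "no CuPy distribution installed")
    if len(found) > 1:
        print("more than one CuPy distribution found; consider uninstalling all but one")
```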

I could get through launching reconstruction_sim_gpu_repeatedly.py using this install sequence:

conda create -n mcsim_env python=3.10
conda activate mcsim_env
conda install -c conda-forge cupy cudnn cutensor
cd C:\Users\chris\christo\Processing
git clone https://github.com/QI2lab/mcSIM.git
cd mcSIM
pip install .
pip install "git+https://github.com/rapidsai/cucim.git@v22.12.00#egg=cucim&subdirectory=python/cucim"
cd examples
python reconstruction_sim_gpu_repeatedly.py

Basically, I used conda to get the CUDA toolkit and the previously missing CUDA bits, installed the mcSIM package without GPU support, then manually installed the Windows cucim bit.

Next for another episode (or Issues thread :)
running initial reconstruction with full parameter estimation
Traceback (most recent call last):
  File "C:\Users\chris\christo\Processing\mcSIM\examples\reconstruction_sim_gpu_repeatedly.py", line 40, in <module>
    imgset = sim.SimImageSet.initialize({"pixel_size": dxy,
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 244, in initialize
    self.preprocess_data(physical_params["pixel_size"],
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 483, in preprocess_data
    self.nangles, self.nphases, self.ny, self.nx = imgs.shape[-4:]
ValueError: not enough values to unpack (expected 4, got 3)

@cleterrier I can reproduce your error using Python 3.11 on Windows 10 Enterprise if I try to pip install CuPy. This looks like a CuPy problem to me, but I don't think that version 12.1.0 of cupy_cuda11x is necessarily the problem.

Try installing CuPy using conda. I typically run

conda install -c conda-forge cupy cudatoolkit=11.8

but you should replace 11.8 with your CUDA toolkit version.
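
To find the version to substitute in the conda install line, one option is to read it from the CUDA_PATH environment variable, which in the error report ends in a version directory (...\CUDA\v11.6). This is a sketch that assumes CUDA_PATH follows that `vMAJOR.MINOR` naming; the function name is hypothetical, and running `nvcc --version` is an alternative.

```python
import os
import re


def cuda_version_from_path(cuda_path):
    """Extract 'major.minor' from a path ending in a 'vMAJOR.MINOR' directory, else None."""
    if not cuda_path:
        return None
    match = re.search(r"v(\d+)\.(\d+)[\\/]*$", cuda_path)
    return f"{match.group(1)}.{match.group(2)}" if match else None


if __name__ == "__main__":
    print(cuda_version_from_path(os.environ.get("CUDA_PATH")))
```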

Please also pull the latest version of mcSIM, as I have updated the CuPy instructions
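
After reinstalling, a quick way to confirm that CuPy loads and sees a GPU before running any mcSIM script is sketched below. It uses `cupy.cuda.runtime.runtimeGetVersion()` and `cupy.cuda.runtime.getDeviceCount()`; the wrapper `check_cupy` is a hypothetical helper and returns a dictionary rather than raising, so it also runs on a broken install.

```python
def check_cupy():
    """Try to import CuPy and report basic facts without raising on a broken install."""
    try:
        import cupy as cp
    except ImportError as exc:
        # Keep only the first line of CuPy's long import-failure banner.
        lines = str(exc).strip().splitlines()
        return {"importable": False, "error": lines[0] if lines else ""}

    info = {"importable": True, "cupy_version": cp.__version__}
    try:
        info["cuda_runtime_version"] = cp.cuda.runtime.runtimeGetVersion()
        info["device_count"] = cp.cuda.runtime.getDeviceCount()
    except Exception as exc:  # e.g. no usable driver or GPU
        info["error"] = str(exc)
    return info


if __name__ == "__main__":
    for key, value in check_cupy().items():
        print(f"{key}: {value}")
```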

@cleterrier the variable imgs passed to the constructor should be 4D, with shape nangles x nphases x ny x nx. My guess is that you are loading a 3D SIM image of size (nangles * nphases) x ny x nx instead.
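
As an illustration of the shape requirement, a flat 3D stack can be split into separate angle and phase axes with a NumPy reshape. This sketch assumes the frames are stored angle by angle with the phases varying fastest; the function name is hypothetical.

```python
import numpy as np


def to_angle_phase_stack(imgs, nangles, nphases):
    """Reshape a (nangles * nphases, ny, nx) stack to (nangles, nphases, ny, nx)."""
    imgs = np.asarray(imgs)
    if imgs.ndim != 3:
        raise ValueError(f"expected a 3D stack, got {imgs.ndim} dimensions")
    nframes, ny, nx = imgs.shape
    if nframes != nangles * nphases:
        raise ValueError(f"{nframes} frames cannot be split into {nangles} angles x {nphases} phases")
    # Assumption: frames are ordered angle by angle, phases varying fastest.
    return imgs.reshape(nangles, nphases, ny, nx)
```

With 3 angles and 3 phases, a (9, ny, nx) stack becomes (3, 3, ny, nx), which provides the four trailing axes that `imgs.shape[-4:]` unpacks in the traceback.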

Sorry, this example was not ready for prime time. I see that I did not include the data file I was using here in the Zenodo repository. I will update both the code and the Zenodo repository so you can actually reproduce what I'm doing

Do the example scripts reconstruct_sim_simulated.py or reconstruct_sim_experiment.py run for you? To run the second you will have to download the image file from https://doi.org/10.5281/zenodo.7851111

I see - my image is a timelapse 2D-SIM "flat" stack (tif from ImageJ) with 61 time points x 3 phases x 3 angles x X x Y.

So I guess I need to use something like in reconstruct_sim_experiment.py when reading the tif file:
reshape([ncolors, nangles, nphases, ny, nx])
but what do I put instead of 'ncolors'?
In reconstruction_sim_gpu_repeatedly.py I can't find the name of the variable that defines the different images upstream of phases/angles/x/y in the sequence

@cleterrier I don't have an explicit example for reconstructing time-lapse data, but it sounds like I should add one. Since the different channels are handled separately anyway, reconstructing single-color time-lapse data is simpler than the recipe in reconstruct_sim_experiment.py.

You would want to do something like

import tifffile
import mcsim.analysis.sim_reconstruction as sim

# fname, ny, nx, pixel_size, na, emission_wavelength, otf, frqs_guess,
# phases_guess, use_gpu, and save_dir must be defined beforehand,
# as in the example scripts.

# reshape the flat stack to (ntimes, nangles, nphases, ny, nx)
imgs = tifffile.imread(fname).reshape([61, 3, 3, ny, nx])

imgset = sim.SimImageSet.initialize({"pixel_size": pixel_size, "na": na, "wavelength": emission_wavelength},
                                    imgs,
                                    otf=otf,
                                    wiener_parameter=0.1,
                                    frq_estimation_mode="band-correlation",
                                    frq_guess=frqs_guess,
                                    phase_estimation_mode="wicker-iterative",
                                    phases_guess=phases_guess,
                                    combine_bands_mode="fairSIM",
                                    fmax_exclude_band0=0.4,
                                    normalize_histograms=False,
                                    background=100,
                                    gain=2,
                                    min_p2nr=0.5,
                                    use_gpu=use_gpu)

# ###########################################
# run reconstruction
# ###########################################
imgset.reconstruct(compute_widefield=True,
                   compute_os=True,
                   compute_deconvolved=True,
                   compute_mcnr=True)

# ###########################################
# print parameters
# ###########################################
imgset.print_parameters()

# ###########################################
# save reconstruction results
# ###########################################
imgset.save_imgs(save_dir,
                 format="tiff",  # or format="zarr"
                 save_raw_data=False,
                 save_patterns=False)

# ###########################################
# save diagnostic plots
# ###########################################
imgset.plot_figs(save_dir,
                 diagnostics_only=True,
                 figsize=(20, 10),
                 imgs_dpi=300)
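
The reshape in the snippet above hard-codes 61 time points and assumes the frames are ordered (time, angle, phase). A slightly more general sketch infers the number of time points from the frame count and can also handle a stack stored as (time, phase, angle), which is how the data was described earlier in the thread. The function name and the `angle_major` flag are hypothetical; the true frame order depends on the acquisition software and has to be checked against the data.

```python
import numpy as np


def reshape_timelapse(stack, nangles, nphases, angle_major=True):
    """Reshape a flat (nframes, ny, nx) stack to (ntimes, nangles, nphases, ny, nx)."""
    stack = np.asarray(stack)
    nframes, ny, nx = stack.shape
    ntimes, remainder = divmod(nframes, nangles * nphases)
    if remainder:
        raise ValueError(f"{nframes} frames is not a multiple of {nangles * nphases}")
    if angle_major:
        # Frames ordered (time, angle, phase): a plain reshape is enough.
        return stack.reshape(ntimes, nangles, nphases, ny, nx)
    # Frames ordered (time, phase, angle): reshape, then swap the two axes.
    return stack.reshape(ntimes, nphases, nangles, ny, nx).swapaxes(1, 2)
```

For a 549-frame stack with 3 angles and 3 phases this yields 61 time points, matching the hard-coded value.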

Good news is that I managed to format my data using reshape in reconstruction_sim_gpu_repeatedly.py

Bad news is that my cupy install still seems a bit wonky... Processing stopped with:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM\examples>python reconstruction_sim_gpu_repeatedly.py
running initial reconstruction with full parameter estimation
initialization took 7.19s
starting parameter estimation...
estimating 3 frequencies using mode band-correlation took 5.99s
Traceback (most recent call last):
  File "C:\Users\chris\christo\Processing\mcSIM\examples\reconstruction_sim_gpu_repeatedly.py", line 60, in <module>
    imgset.reconstruct()
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 1118, in reconstruct
    self.estimate_parameters(slices=slices)
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\sim_reconstruction.py", line 845, in estimate_parameters
    peak_val = tools.get_peak_value(imgs_ft[ii],
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\mcsim\analysis\analysis_tools.py", line 194, in get_peak_value
    roi = rois.get_centered_rois([iy, ix], [3 * peak_pixel_size, 3 * peak_pixel_size])[0]
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\localize_psf\rois.py", line 139, in get_centered_rois
    centers = np.atleast_2d(centers)
  File "<__array_function__ internals>", line 200, in atleast_2d
  File "C:\Users\chris\.conda\envs\mcsim_env\lib\site-packages\numpy\core\shape_base.py", line 121, in atleast_2d
    ary = asanyarray(ary)
  File "cupy\_core\core.pyx", line 1480, in cupy._core.core._ndarray_base.__array__
TypeError: Implicit conversion to a NumPy array is not allowed. Please use .get() to construct a NumPy array explicitly.

@cleterrier can you pull the latest version? I just fixed this
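
For reference, this error appears whenever a CuPy array is handed to a NumPy function, and the usual remedy is an explicit device-to-host copy. The sketch below illustrates that pattern with `cupy.asnumpy`; it is not the actual change made in mcSIM, and `to_numpy` is a hypothetical helper.

```python
import numpy as np

try:
    import cupy as cp
except ImportError:  # CuPy missing or broken: fall back to NumPy only
    cp = None


def to_numpy(arr):
    """Return a NumPy array, copying from the GPU explicitly if arr is a CuPy array."""
    if cp is not None and isinstance(arr, cp.ndarray):
        return cp.asnumpy(arr)  # explicit copy, equivalent to arr.get()
    return np.asarray(arr)
```

Calling `np.atleast_2d(to_numpy(centers))` instead of `np.atleast_2d(centers)` would avoid the implicit conversion that the traceback complains about.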

@cleterrier I think you are going to run out of memory using the GPU with 61 time points anyways, so you might want to set the GPU flag to false

@cleterrier I added a simple time-lapse reconstruction script to examples here. The sample tiff file is available here

To expand on what I was saying before: right now the GPU support is fairly stupid. If you put in a whole z-stack at once, as I do in this example file, the code will try to reconstruct the whole z-stack at once. The workaround is to reconstruct the images in a loop, using a strategy similar to reconstruction_sim_gpu_repeatedly.py.
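
The loop strategy can be sketched generically as follows. This is not the code in reconstruction_sim_gpu_repeatedly.py: `reconstruct_one` is a hypothetical callable standing in for whatever builds and runs a reconstruction on a few time points (or z-planes) and returns a NumPy array, and the chunk size is an assumption to be tuned to the available GPU memory.

```python
import numpy as np


def _free_gpu_memory():
    """Release cached blocks from CuPy's default memory pool, if CuPy is usable."""
    try:
        import cupy as cp
        cp.get_default_memory_pool().free_all_blocks()
    except Exception:
        pass  # no CuPy or no GPU: nothing to free


def reconstruct_in_chunks(imgs, reconstruct_one, chunk_size=1):
    """Apply reconstruct_one to consecutive slices along axis 0 and join the results.

    reconstruct_one must accept an array of shape (n, ...) with n <= chunk_size
    and return a NumPy array whose first axis also has length n.
    """
    results = []
    for start in range(0, imgs.shape[0], chunk_size):
        chunk = imgs[start:start + chunk_size]
        results.append(reconstruct_one(chunk))
        _free_gpu_memory()  # keep the pool from growing between chunks
    return np.concatenate(results, axis=0)
```

Processing one time point per iteration keeps the peak GPU memory close to that of a single 3 x 3 image set, at the cost of more host-device transfers.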

I see... Here's what I get with reconstruction_sim_gpu_repeatedly.py and my 61-frame 1024x1024 stack:

(mcsim_env) C:\Users\chris\christo\Processing\mcSIM\examples>python reconstruction_sim_gpu_repeatedly.py
running initial reconstruction with full parameter estimation
initialization took 6.71s
starting parameter estimation...
estimating 3 frequencies using mode band-correlation took 5.99s
estimated peak-to-noise ratio in 6.21s
estimated 9 phases using mode wicker-iterative in 5.48s
Angle 0 phase guesses have more than the maximum allowed phase error=10.00deg. Defaulting to guess values
fit phase diffs=0.00deg, 116.36deg, 223.81deg,
Angle 2 phase guesses have more than the maximum allowed phase error=10.00deg. Defaulting to guess values
fit phase diffs=0.00deg, 130.93deg, 243.22deg,
estimated global phases and modulation depths in 7.58s
replaced modulation depth for angle 0 because estimated value was less than allowed minimum, 0.039 < 0.500
replaced modulation depth for angle 1 because estimated value was less than allowed minimum, 0.021 < 0.500
replaced modulation depth for angle 2 because estimated value was less than allowed minimum, 0.027 < 0.500
parameter estimation took 58.34s
combining bands took 0.57s
reconstruction took 1.94s
estimating parameters took = 67.00s, used GPU memory = 0.679GB, memory pool = 1.443GB
used GPU memory = 0.679GB, memory pool = 4.337GB
iteration 1, process time=36.51s, used GPU memory = 0.000GB, memory pool = 4.941GB
iteration 2, process time=37.50s, used GPU memory = 0.000GB, memory pool = 7.810GB
iteration 3, process time=37.48s, used GPU memory = 0.000GB, memory pool = 3.733GB

A comment in the code says each iteration should take less than 2 seconds, and the GPU memory being reported as zero makes me think there's a problem somewhere. But my GPU is definitely working according to GPU-Z.
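
One possible reading of the zero "used" figure, assuming the script reports CuPy's default memory pool: `used_bytes()` counts only memory held by live arrays, while `total_bytes()` also includes blocks the pool keeps cached after arrays are freed. A "used" value of zero next to a multi-gigabyte pool would then mean the arrays were released, not that the GPU was idle. A sketch for inspecting both numbers; the function name is hypothetical:

```python
def gpu_memory_report():
    """Return used and pooled GPU memory in GB, or None if CuPy is unusable."""
    try:
        import cupy as cp
        pool = cp.get_default_memory_pool()
        return {
            "used_gb": pool.used_bytes() / 1e9,   # memory held by live arrays
            "pool_gb": pool.total_bytes() / 1e9,  # live arrays plus cached blocks
        }
    except Exception:
        return None


if __name__ == "__main__":
    print(gpu_memory_report())
```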

Also, it looks like the parameter estimation fails. I've had the same problem with fairSIM but not with HiFi-SIM. Maybe I can send you the image file so you can see if you manage to get something out of it?

@cleterrier I'm happy to give it a go. My email is ptbrown1729 AT gmail

I'm happy to take a look at it if you send a link. Is the data TIRF-SIM?

Edit: Peter's napari plugin is helpful for setting up the initial parameters.

Check your email!

For TIRF SIM it's probably necessary to do a provisional unmixing of the SIM bands before parameter estimation, as no peaks will be visible in the raw FT. The plugin doesn't (yet!) support that
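
As an illustration of what a provisional unmixing involves, the sketch below separates the three SIM bands for one angle by inverting the phase-mixing matrix. It is not the implementation in mcSIM or in the napari plugin, and it rests on loud assumptions: the pattern phases are taken as known (for example the nominal 0, 2π/3, 4π/3), the modulation depth is set to 1, and the OTF is ignored. The function names are hypothetical.

```python
import numpy as np


def mixing_matrix(phases, mod_depth=1.0):
    """Matrix M with D_n = M[n, 0] S_0 + M[n, 1] S_plus + M[n, 2] S_minus."""
    phases = np.asarray(phases, dtype=float)
    return np.stack([np.ones(phases.shape, dtype=complex),
                     0.5 * mod_depth * np.exp(1j * phases),
                     0.5 * mod_depth * np.exp(-1j * phases)], axis=1)


def unmix_bands(imgs_ft, phases, mod_depth=1.0):
    """Separate the bands from the Fourier transforms of the phase-shifted images.

    imgs_ft has shape (nphases, ny, nx); the result has shape (3, ny, nx),
    ordered as (S_0, S_plus, S_minus).
    """
    unmix = np.linalg.pinv(mixing_matrix(phases, mod_depth))
    return np.tensordot(unmix, imgs_ft, axes=(1, 0))
```

After unmixing, the shifted bands can be cross-correlated with the zeroth band to locate the pattern frequency even when no peak is visible in the raw Fourier transform.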