MouseLand/cellpose

[BUG] Cellpose 3 using GPU on macOS M1


I had the same bug using Cellpose 2 to detect cells after training a model (2D), so I decided to install Cellpose 3 on my macOS M1 using Anaconda. I followed the protocol posted on GitHub:
https://github.com/MouseLand/cellpose?tab=readme-ov-file#option-1-installation-instructions-with-conda
All models fail to detect cells when the GPU option is enabled. When I use Cellpose + TrackMate, the GPU option does not detect any cells either.
Thanks for your help!
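
For reference, this is roughly how I call the trained model from Python with the GPU option (the image path, model path, diameter, and channels below are placeholders, not my exact settings):

from cellpose import io, models

# load a 2D image and the custom-trained model, requesting the GPU (MPS on Apple silicon)
img = io.imread("example_image.tif")  # placeholder path
model = models.CellposeModel(gpu=True,
                             pretrained_model="path/to/trained_model")  # placeholder path

# run segmentation; CellposeModel.eval returns masks, flows, and styles
masks, flows, styles = model.eval(img, diameter=30, channels=[0, 0])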

Sorry, there was a typo in the readme. Can you please use python=3.10 and run pip install torch --upgrade? We've heard of issues with previous versions of torch + MPS (Mac).
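
As a quick sanity check after upgrading (assuming torch >= 1.12, where the MPS backend was introduced), you can confirm that PyTorch sees the Apple GPU with:

import torch

# True if this build of PyTorch was compiled with MPS support
print(torch.backends.mps.is_built())
# True if MPS is actually usable on this machine (macOS version + Apple silicon)
print(torch.backends.mps.is_available())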

Thanks a lot! I've tried what you suggested, but I got this error:

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Hmm, which module is having the issue?

I found these lines:
File "/opt/anaconda3/envs/cellpose_Debug/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 20, in
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/opt/anaconda3/envs/cellpose_Debug/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
/opt/anaconda3/envs/cellpose_Debug/lib/python3.10/site-packages/cellpose/io.py:18: PythonQtWarning: Selected binding 'pyqt5' could not be found; falling back to 'pyqt6'
from qtpy import QtGui, QtCore, Qt, QtWidgets
2024-10-30 14:09:26,990 [INFO] WRITING LOG OUTPUT TO /.cellpose/run.log
2024-10-30 14:09:26,990 [INFO]
cellpose version: 3.1.0
platform: darwin
python version: 3.10.15
torch version: 2.2.2

I have tried downgrading NumPy to 1.23.5, but the GPU still cannot be used, and without the GPU the trained model does not detect cells.

I am having the same issue: I cannot get any segmentation objects with the GPU on a MacBook Pro M1. When I try running the commands, I get the following error, which suggests that the backend is failing.

NotImplementedError Traceback (most recent call last)
Cell In[1], line 78
75 images = [io.imread(img_file)]
77 # Run segmentation on the image using the custom model with the specified diameter
---> 78 result = model.eval(
79 images,
80 diameter=diameter, # Set diameter based on extracted stage
81 channels=channels,
82 flow_threshold=0.4,
83 cellprob_threshold=-1.0,
84 resample=True,
85 niter=200
86 )
88 # Handle cases where only 3 values are returned
89 if len(result) == 4:

File ~/mambaforge/lib/python3.10/site-packages/cellpose/models.py:453, in CellposeModel.eval(self, x, batch_size, resample, channels, channel_axis, z_axis, normalize, invert, rescale, diameter, flow_threshold, cellprob_threshold, do_3D, anisotropy, dP_smooth, stitch_threshold, min_size, max_size_fraction, niter, augment, tile_overlap, bsize, interp, compute_masks, progress)
451 for i in iterator:
452 tic = time.time()
--> 453 maski, flowi, stylei = self.eval(
454 x[i], batch_size=batch_size,
455 channels=channels[i] if channels is not None and
456 ((len(channels) == len(x) and
457 (isinstance(channels[i], list) or
...
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:161 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:493 [backend fallback]
PreDispatch: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:165 [backend fallback]
PythonDispatcher: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:157 [backend fallback]


I tried installing the OratHelm clone, and it lets me run the samples with the GPU on a MacBook Pro M1 with no problem, but I get the warning:

"/Users/user/mambaforge/lib/python3.10/site-packages/cellpose/dynamics.py:189](https://file+.vscode-resource.vscode-cdn.net/Users/solivanriv/mambaforge/lib/python3.10/site-packages/cellpose/dynamics.py:189): RuntimeWarning: invalid value encountered in divide
mu /= (1e-60 + (mu**2).sum(axis=0)**0.5)"


Unfortunately, the segmentation results from the most recent version are a bit more accurate than those from the OratHelm fork, so the fork is not a full substitute. I have a lot of images, so GPU support with the most recent version would be ideal, but I cannot seem to pinpoint a solution.

I am also encountering a problem with version 3.1.0: since the update, it is impossible to use the GPU either to calibrate the diameter or to segment. I have the following error:

Traceback (most recent call last):
  File "/Users/orat/opt/anaconda3/lib/python3.9/site-packages/cellpose/gui/gui.py", line 1001, in calibrate_size
    diams, _ = self.model.sz.eval(self.stack[self.currentZ].copy(),
  File "/Users/orat/opt/anaconda3/lib/python3.9/site-packages/cellpose/models.py", line 776, in eval
    masks = self.cp.eval(
  File "/Users/orat/opt/anaconda3/lib/python3.9/site-packages/cellpose/models.py", line 534, in eval
    masks = self._compute_masks(x.shape, dP, cellprob, flow_threshold=flow_threshold,
  File "/Users/orat/opt/anaconda3/lib/python3.9/site-packages/cellpose/models.py", line 620, in _compute_masks
    outputs = dynamics.resize_and_compute_masks(
  File "/Users/orat/opt/anaconda3/lib/python3.9/site-packages/cellpose/dynamics.py", line 839, in resize_and_compute_masks
    mask = compute_masks(dP, cellprob, niter=niter,
  File "/Users/orat/opt/anaconda3/lib/python3.9/site-packages/cellpose/dynamics.py", line 906, in compute_masks
    mask = get_masks_torch(p_final, inds, dP.shape[1:], 
  File "/Users/orat/opt/anaconda3/lib/python3.9/site-packages/cellpose/dynamics.py", line 757, in get_masks_torch
    coo = torch.sparse_coo_tensor(pt, torch.ones(pt.shape[1], device=pt.device, dtype=torch.int)

NET ERROR: Could not run 'aten::_sparse_coo_tensor_with_dims_and_tensors' with arguments from the 'SparseMPS' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_sparse_coo_tensor_with_dims_and_tensors' is only available for these backends: [MPS, Meta, SparseCPU, SparseMeta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastXPU, AutocastMPS, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

MPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:75 [backend fallback]
Meta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterMeta.cpp:26996 [kernel]
SparseCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseCPU.cpp:1406 [kernel]
SparseMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterSparseMeta.cpp:290 [kernel]
BackendSelect: registered at /Users/runner/work/pytorch/pytorch/pytorch/build/aten/src/ATen/RegisterBackendSelect.cpp:792 [kernel]
Python: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:153 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:497 [backend fallback]
Functionalize: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/FunctionalizeFallbackKernel.cpp:349 [backend fallback]
Named: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/NegateFallback.cpp:18 [backend fallback]
ZeroTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:96 [backend fallback]
AutogradOther: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradCPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradCUDA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradHIP: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradXLA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradMPS: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradIPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradXPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradHPU: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradVE: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradLazy: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradMTIA: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradPrivateUse1: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradPrivateUse2: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradPrivateUse3: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradMeta: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
AutogradNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19981 [autograd kernel]
Tracer: registered at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:17715 [kernel]
AutocastCPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:321 [backend fallback]
AutocastXPU: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:463 [backend fallback]
AutocastMPS: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:209 [backend fallback]
AutocastCUDA: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/autocast_mode.cpp:165 [backend fallback]
FuncTorchBatched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:731 [backend fallback]
BatchedNestedTensor: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:758 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:27 [backend fallback]
Batched: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:207 [backend fallback]
PythonTLSSnapshot: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:161 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:493 [backend fallback]
PreDispatch: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:165 [backend fallback]
PythonDispatcher: registered at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:157 [backend fallback]

On the same configuration (M3 Pro, Python 3.9.13, torch 2.5.1), after reinstalling Cellpose 3.0.11, it works correctly and I have no warnings on my side. For the moment, if it helps, this version can be installed with pip install git+https://github.com/mouseland/cellpose.git@v3.0.11
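
After reinstalling, a quick way to confirm which version is active and whether a GPU is detected (assuming cellpose.core.use_gpu is present in 3.0.11) is:

import importlib.metadata
from cellpose import core

# installed package version; should report 3.0.11 after the reinstall
print(importlib.metadata.version("cellpose"))
# True if cellpose detects a usable GPU; depending on the version this check may
# only cover CUDA, so it can print False even when MPS works in the GUI
print(core.use_gpu())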

Darn, I wish Apple would help PyTorch implement all these functions for MPS! I upgraded the mask computation to use sparse matrices, and they must not be supported. I will default to the old mask computation for MPS until they implement it.
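
In the meantime, a possible stopgap (just a sketch, not the actual patch) is to build the sparse COO tensor on the CPU whenever the pixel coordinates live on an MPS device, since aten::_sparse_coo_tensor_with_dims_and_tensors has no SparseMPS kernel:

import torch

def sparse_coo_cpu_fallback(pt, shape):
    # hypothetical helper: 'pt' holds the N pixel coordinates (one row per spatial
    # dimension), as in get_masks_torch; move them to CPU, where the sparse COO
    # constructor is implemented, and build the tensor there; downstream code can
    # stay on CPU or copy dense results back to the MPS device afterwards
    pt_cpu = pt.cpu()
    vals = torch.ones(pt_cpu.shape[1], dtype=torch.int)
    return torch.sparse_coo_tensor(pt_cpu, vals, shape)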

@OratHelm thanks for the suggestion! Going back to 3.0.11 worked for me! I am seeing a small warning coming from dynamics.py stating:

"RuntimeWarning: invalid value encountered in divide
mu /= (1e-60 + (mu**2).sum(axis=0)**0.5)"

I guess this comes from a division by zero. It does not seem to alter my results, though; the segmentation images look fine.
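
For what it's worth, a tiny reproduction of how that warning can appear (assuming mu is float32 and exactly zero in places, so the 1e-60 guard underflows to 0 and the division becomes 0/0) looks like this:

import numpy as np

# flow field that is exactly zero somewhere, stored as float32 like cellpose's flows
mu = np.zeros((2, 4), dtype=np.float32)
# 1e-60 is below the smallest positive float32, so the guard underflows to 0 here
denom = 1e-60 + (mu**2).sum(axis=0)**0.5
mu /= denom  # 0/0 -> nan, which emits "RuntimeWarning: invalid value encountered in divide"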

@carsen-stringer, thanks! Hopefully, they will implement this soon!