Import fails on CPU
yyu22 opened this issue · 0 comments
Describe the bug
The GPU version of NeMo Curator fails during import when running on CPU-only nodes.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/cupy/__init__.py", line 17, in <module>
from cupy import _core # NOQA
File "/usr/local/lib/python3.10/dist-packages/cupy/_core/__init__.py", line 3, in <module>
from cupy._core import core # NOQA
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/NeMo-Curator/nemo_curator/__init__.py", line 29, in <module>
from .modules import *
File "/opt/NeMo-Curator/nemo_curator/modules/__init__.py", line 24, in <module>
from .add_id import AddId
File "/opt/NeMo-Curator/nemo_curator/modules/add_id.py", line 21, in <module>
from nemo_curator.datasets import DocumentDataset
File "/opt/NeMo-Curator/nemo_curator/datasets/__init__.py", line 15, in <module>
from .doc_dataset import DocumentDataset
File "/opt/NeMo-Curator/nemo_curator/datasets/doc_dataset.py", line 19, in <module>
from nemo_curator.utils.distributed_utils import read_data, write_to_disk
File "/opt/NeMo-Curator/nemo_curator/utils/distributed_utils.py", line 32, in <module>
cudf = gpu_only_import("cudf")
File "/opt/NeMo-Curator/nemo_curator/utils/import_utils.py", line 347, in gpu_only_import
return safe_import(
File "/opt/NeMo-Curator/nemo_curator/utils/import_utils.py", line 261, in safe_import
return importlib.import_module(module)
File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/usr/local/lib/python3.10/dist-packages/cudf/__init__.py", line 12, in <module>
import cupy
File "/usr/local/lib/python3.10/dist-packages/cupy/__init__.py", line 19, in <module>
raise ImportError(f'''
ImportError:
================================================================
Failed to import CuPy.
If you installed CuPy via wheels (cupy-cudaXXX or cupy-rocm-X-X), make sure that the package matches with the version of CUDA or ROCm installed.
On Linux, you may need to set LD_LIBRARY_PATH environment variable depending on how you installed CUDA/ROCm.
On Windows, try setting CUDA_PATH environment variable.
Check the Installation Guide for details:
https://docs.cupy.dev/en/latest/install.html
Original error:
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
================================================================
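The original error in the traceback is that the dynamic linker cannot find libcuda.so.1, which is expected on a node without the NVIDIA driver. A quick way to confirm this independently of CuPy or cuDF is to try loading the library directly. The sketch below uses only the standard library; `cuda_driver_available` is a hypothetical helper name and is not part of NeMo Curator.

```python
import ctypes


def cuda_driver_available(lib_name: str = "libcuda.so.1") -> bool:
    """Return True if the CUDA driver library can be loaded by the dynamic linker."""
    try:
        # CDLL raises OSError when the shared object cannot be found or opened.
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("CUDA driver loadable:", cuda_driver_available())
```

On a CPU-only node this should report that the driver is not loadable, matching the ImportError shown above.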
Steps/Code to reproduce bug
- Install the GPU version of NeMo Curator or use the NeMo Framework container
- Run import nemo_curator on a CPU-only node/machine
Expected behavior
The GPU version should still work on CPU-only nodes for steps that do not require a GPU (e.g., add id).
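One way to get the expected behavior is to defer the failure from import time to first use of the GPU library. The following is a minimal sketch of that idea, not the actual `safe_import` or `gpu_only_import` code in NeMo Curator: `optional_import` and `MissingModule` are hypothetical names, and catching only ImportError is an assumption based on the traceback above (a GPU library could fail in other ways on a CPU-only machine).

```python
import importlib


class MissingModule:
    """Placeholder returned when a module cannot be imported.

    The import error is re-raised only when an attribute is actually used,
    so CPU-only code paths never trigger it.
    """

    def __init__(self, name, error):
        self._name = name
        self._error = error

    def __getattr__(self, attr):
        # Called only for attributes not found normally, i.e. anything
        # the caller expected the real module to provide.
        raise ImportError(
            f"{self._name} is unavailable (needed for '{attr}')"
        ) from self._error


def optional_import(name):
    """Import a module by name, or return a MissingModule placeholder."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        return MissingModule(name, exc)


# Importing succeeds even without a GPU; the error is raised later,
# and only if a GPU-only code path touches cudf.
cudf = optional_import("cudf")
```

With this pattern, a CPU-only step such as adding IDs would import cleanly, while a step that calls into `cudf` on a machine without the driver would still fail with a clear ImportError at the point of use.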