Kaggle/docker-python

cuDF/CuPy broken in current image: CUDA toolkit newer than CUDA driver (RAPIDS)


๐Ÿ› Bug

To Reproduce

import cudf as cf  # RAPIDS GPU DataFrame library (pandas-like)
import cupy as cp  # RAPIDS GPU array library (NumPy-like)

Get the log:

/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py:110: UserWarning: Using CUDA toolkit version (12, 3) with CUDA driver version (12, 2) requires minor version compatibility, which is not yet supported for CUDA driver versions 12.0 and above. It is likely that many cuDF operations will not work in this state. Please install CUDA toolkit version (12, 2) to continue using cuDF.
  warnings.warn(
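For reference, the condition behind that warning can be sketched as a tiny check (the helper name is mine, not a cuDF API; version tuples are hard-coded from the log above):

```python
# Hypothetical helper (not a cuDF API): reproduces the situation the warning
# describes. cuDF warns when the CUDA toolkit is newer than the driver within
# the same major version, because 12.x drivers do not yet support
# minor-version compatibility.
def needs_minor_version_compat(toolkit: tuple, driver: tuple) -> bool:
    return toolkit[0] == driver[0] and toolkit[1] > driver[1]

print(needs_minor_version_compat((12, 3), (12, 2)))  # True: the warning fires
print(needs_minor_version_compat((12, 2), (12, 2)))  # False: versions match
```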

Expected behavior

🧐 No warning log; the imports should succeed cleanly. 🧐

Additional context

The current CUDA drivers in the Docker ENV are:

| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2    
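Those numbers can be pulled out of the `nvidia-smi` header programmatically; a minimal sketch parsing the line above (the regex is my own assumption about the header layout, not an NVIDIA tool):

```python
import re

# Header line copied from the nvidia-smi output above.
header = "| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2"

# Extract the driver and CUDA versions (pattern assumes the usual header layout).
m = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", header)
driver, cuda = m.group(1), m.group(2)
print(driver, cuda)  # 535.129.03 12.2
```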

Possible solutions:

  1. Update the RAPIDS toolkit to match the drivers (doesn't work; it ends in a cuDF internal conflict):
/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py:17: UserWarning: CUDA Toolkit is newer than CUDA driver. Numba features will not work in this configuration. 
  warnings.warn(
/opt/conda/lib/python3.10/site-packages/cupy/_environment.py:487: UserWarning: 
--------------------------------------------------------------------------------

  CuPy may not function correctly because multiple CuPy packages are installed
  in your environment:

    cupy, cupy-cuda12x

  Follow these steps to resolve this issue:

    1. For all packages listed above, run the following command to remove all
       existing CuPy installations:

         $ pip uninstall <package_name>

      If you previously installed CuPy via conda, also run the following:

         $ conda uninstall cupy

    2. Install the appropriate CuPy package.
       Refer to the Installation Guide for detailed instructions.

         https://docs.cupy.dev/en/stable/install.html
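The duplicate-install state that warning complains about can be detected without importing CuPy at all; a sketch using the standard library's `importlib.metadata` (the function name and the injected package list are mine, for illustration):

```python
from importlib import metadata

def installed_cupy_packages(names=None):
    """Return installed distribution names that look like CuPy wheels.
    `names` can be injected for testing; by default, scan the environment."""
    if names is None:
        names = [d.metadata["Name"] for d in metadata.distributions()]
    return sorted(n for n in names if n and n.lower().startswith("cupy"))

# Simulating the environment from the warning above: more than one entry
# means a conflicting CuPy install.
print(installed_cupy_packages(["cupy", "cupy-cuda12x", "numpy"]))
# ['cupy', 'cupy-cuda12x']
```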

I know; after running [pip uninstall cupy -y], the import then breaks with:


---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[4], line 1
----> 1 import cudf as cf #Use Rapids framework dataframe for GPU (PANDAS)
      2 import cupy as cp #Use Rapids framework arrays for GPU (NUMPY)

File /opt/conda/lib/python3.10/site-packages/cudf/__init__.py:19
     16 from rmm.allocators.cupy import rmm_cupy_allocator
     17 from rmm.allocators.numba import RMMNumbaManager
---> 19 from cudf import api, core, datasets, testing
     20 from cudf._version import __git_commit__, __version__
     21 from cudf.api.extensions import (
     22     register_dataframe_accessor,
     23     register_index_accessor,
     24     register_series_accessor,
     25 )

File /opt/conda/lib/python3.10/site-packages/cudf/datasets.py:7
      4 import pandas as pd
      6 import cudf
----> 7 from cudf._lib.transform import bools_to_mask
      8 from cudf.core.column_accessor import ColumnAccessor
     10 __all__ = ["timeseries", "randomdata"]

File /opt/conda/lib/python3.10/site-packages/cudf/_lib/__init__.py:4
      1 # Copyright (c) 2020-2023, NVIDIA CORPORATION.
      2 import numpy as np
----> 4 from . import (
      5     avro,
      6     binaryop,
      7     concat,
      8     copying,
      9     csv,
     10     datetime,
     11     expressions,
     12     filling,
     13     groupby,
     14     hash,
     15     interop,
     16     join,
     17     json,
     18     labeling,
     19     merge,
     20     null_mask,
     21     nvtext,
     22     orc,
     23     parquet,
     24     partitioning,
     25     pylibcudf,
     26     quantiles,
     27     reduce,
     28     replace,
     29     reshape,
     30     rolling,
     31     round,
     32     search,
     33     sort,
     34     stream_compaction,
     35     string_casting,
     36     strings,
     37     strings_udf,
     38     text,
     39     timezone,
     40     transpose,
     41     unary,
     42 )
     44 MAX_COLUMN_SIZE = np.iinfo(np.int32).max
     45 MAX_COLUMN_SIZE_STR = "INT32_MAX"

ImportError: /opt/conda/lib/python3.10/site-packages/cudf/_lib/avro.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN4cudf2io19avro_reader_options7builderENS0_11source_infoE

Updating the RAPIDS toolkit:

As the docs say:

https://docs.rapids.ai/install#system-req

pip install \
    --extra-index-url=https://pypi.nvidia.com \
    cudf-cu12==23.12.* dask-cudf-cu12==23.12.* cuml-cu12==23.12.* \
    cugraph-cu12==23.12.* cuspatial-cu12==23.12.* cuproj-cu12==23.12.* \
    cuxfilter-cu12==23.12.* cucim-cu12==23.12.* pylibraft-cu12==23.12.* \
    raft-dask-cu12==23.12.*

pip then reports these dependency conflicts:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf 23.8.0 requires cubinlinker, which is not installed.
cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
cudf 23.8.0 requires ptxcompiler, which is not installed.
cuml 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
dask-cudf 23.8.0 requires cupy-cuda11x>=12.0.0, which is not installed.
apache-beam 2.46.0 requires dill<0.3.2,>=0.3.1.1, but you have dill 0.3.7 which is incompatible.
apache-beam 2.46.0 requires protobuf<4,>3.12.2, but you have protobuf 4.25.2 which is incompatible.
apache-beam 2.46.0 requires pyarrow<10.0.0,>=3.0.0, but you have pyarrow 14.0.2 which is incompatible.
beatrix-jupyterlab 2023.128.151533 requires jupyterlab~=3.6.0, but you have jupyterlab 4.0.11 which is incompatible.
cudf 23.8.0 requires cuda-python<12.0a0,>=11.7.1, but you have cuda-python 12.3.0 which is incompatible.
cudf 23.8.0 requires pyarrow==11.*, but you have pyarrow 14.0.2 which is incompatible.
cuml 23.8.0 requires dask==2023.7.1, but you have dask 2023.11.0 which is incompatible.
cuml 23.8.0 requires dask-cuda==23.8.*, but you have dask-cuda 23.12.0 which is incompatible.
cuml 23.8.0 requires distributed==2023.7.1, but you have distributed 2023.11.0 which is incompatible.
cuml 23.8.0 requires treelite==3.2.0, but you have treelite 3.9.1 which is incompatible.
cuml 23.8.0 requires treelite-runtime==3.2.0, but you have treelite-runtime 3.9.1 which is incompatible.
dask-cudf 23.8.0 requires dask==2023.7.1, but you have dask 2023.11.0 which is incompatible.
dask-cudf 23.8.0 requires distributed==2023.7.1, but you have distributed 2023.11.0 which is incompatible.
google-cloud-aiplatform 0.6.0a1 requires google-api-core[grpc]<2.0.0dev,>=1.22.2, but you have google-api-core 2.11.1 which is incompatible.
google-cloud-automl 1.0.1 requires google-api-core[grpc]<2.0.0dev,>=1.14.0, but you have google-api-core 2.11.1 which is incompatible.
google-cloud-bigquery 2.34.4 requires packaging<22.0dev,>=14.3, but you have packaging 23.2 which is incompatible.
google-cloud-bigquery 2.34.4 requires protobuf<4.0.0dev,>=3.12.0, but you have protobuf 4.25.2 which is incompatible.
google-cloud-bigtable 1.7.3 requires protobuf<4.0.0dev, but you have protobuf 4.25.2 which is incompatible.
google-cloud-pubsub 2.19.0 requires grpcio<2.0dev,>=1.51.3, but you have grpcio 1.51.1 which is incompatible.
google-cloud-vision 2.8.0 requires protobuf<4.0.0dev,>=3.19.0, but you have protobuf 4.25.2 which is incompatible.
jupyterlab 4.0.11 requires jupyter-lsp>=2.0.0, but you have jupyter-lsp 1.5.1 which is incompatible.
jupyterlab-lsp 5.0.2 requires jupyter-lsp>=2.0.0, but you have jupyter-lsp 1.5.1 which is incompatible.
kfp 2.5.0 requires google-cloud-storage<3,>=2.2.1, but you have google-cloud-storage 1.44.0 which is incompatible.
kfp 2.5.0 requires protobuf<4,>=3.13.0, but you have protobuf 4.25.2 which is incompatible.
kfp-pipeline-spec 0.2.2 requires protobuf<4,>=3.13.0, but you have protobuf 4.25.2 which is incompatible.
libpysal 4.9.2 requires shapely>=2.0.1, but you have shapely 1.8.5.post1 which is incompatible.
momepy 0.7.0 requires shapely>=2, but you have shapely 1.8.5.post1 which is incompatible.
osmnx 1.8.1 requires shapely>=2.0, but you have shapely 1.8.5.post1 which is incompatible.
pyldavis 3.4.1 requires pandas>=2.0.0, but you have pandas 1.5.3 which is incompatible.
raft-dask 23.8.0 requires dask==2023.7.1, but you have dask 2023.11.0 which is incompatible.
raft-dask 23.8.0 requires dask-cuda==23.8.*, but you have dask-cuda 23.12.0 which is incompatible.
raft-dask 23.8.0 requires distributed==2023.7.1, but you have distributed 2023.11.0 which is incompatible.
rmm 23.8.0 requires cuda-python<12.0a0,>=11.7.1, but you have cuda-python 12.3.0 which is incompatible.
spopt 0.6.0 requires shapely>=2.0.1, but you have shapely 1.8.5.post1 which is incompatible.
tensorboard 2.15.1 requires protobuf<4.24,>=3.19.6, but you have protobuf 4.25.2 which is incompatible.
tensorflow-metadata 0.14.0 requires protobuf<4,>=3.7, but you have protobuf 4.25.2 which is incompatible.
tensorflow-transform 0.14.0 requires protobuf<4,>=3.7, but you have protobuf 4.25.2 which is incompatible.
  2. Update the CUDA drivers to 12.3 (currently 12.2).
  3. Downgrade the CUDA drivers to 11.8? (better compatibility with the other RAPIDS tools)
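One way to see why the resolver log above is broken: it mixes two RAPIDS release lines (23.8 and 23.12), and RAPIDS packages are only expected to work within a single line. A sketch of that consistency check (the helper name is mine):

```python
def same_release_line(versions):
    """True if all version strings share one YY.MM release line."""
    lines = {tuple(v.split(".")[:2]) for v in versions}
    return len(lines) <= 1

# The mixed state from the pip errors above:
print(same_release_line(["23.8.0", "23.12.0"]))   # False: incompatible mix
print(same_release_line(["23.12.0", "23.12.1"]))  # True: same 23.12 line
```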

In previous Docker environments [Pin to original environment (2023-12-13)] with CUDA 11.8, the dask framework with RAPIDS worked. I believe the upgrade you made in January is the cause. The current version is [Pin to original environment (2024-01-29)] with CUDA 12.2.

Filed https://b.corp.google.com/issues/328057594

Have you found any actual issues besides the warning when using cudf/cupy?