[QST] No module named 'merlin.dtypes' while importing nvtabular
murali-munna opened this issue · 4 comments
When I run import nvtabular as nvt, I get the following error:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[5], line 8
5 import numpy as np
6 import pandas as pd
----> 8 import nvtabular as nvt
9 # from nvtabular.ops import *
10 from merlin.schema.tags import Tags
File /usr/local/lib/python3.8/dist-packages/nvtabular/__init__.py:24
22 from merlin.dag import ColumnSelector
23 from merlin.schema import ColumnSchema, Schema
---> 24 from nvtabular import workflow # noqa
25 from nvtabular import _version
27 # suppress some warnings with cudf warning about column ordering with dlpack
28 # and numba warning about deprecated environment variables
File /usr/local/lib/python3.8/dist-packages/nvtabular/workflow/__init__.py:18
1 #
2 # Copyright (c) 2021, NVIDIA CORPORATION.
3 #
(...)
16
17 # flake8: noqa
---> 18 from nvtabular.workflow.node import WorkflowNode
19 from nvtabular.workflow.workflow import Workflow
File /usr/local/lib/python3.8/dist-packages/nvtabular/workflow/node.py:17
1 #
2 # Copyright (c) 2021, NVIDIA CORPORATION.
3 #
(...)
14 # limitations under the License.
15 #
16 from merlin.dag import Node
---> 17 from nvtabular.ops import LambdaOp, Operator
20 class WorkflowNode(Node):
21 """WorkflowNode represents a Node in a NVTabular workflow graph"""
File /usr/local/lib/python3.8/dist-packages/nvtabular/ops/__init__.py:38
36 from nvtabular.ops.fill import FillMedian, FillMissing
37 from nvtabular.ops.filter import Filter
---> 38 from nvtabular.ops.groupby import Groupby
39 from nvtabular.ops.hash_bucket import HashBucket
40 from nvtabular.ops.hashed_cross import HashedCross
File /usr/local/lib/python3.8/dist-packages/nvtabular/ops/groupby.py:21
18 from dask.dataframe.utils import meta_nonempty
20 from merlin.core.dispatch import DataFrameType, annotate
---> 21 from merlin.dtypes.shape import DefaultShapes
22 from merlin.schema import Schema
23 from nvtabular.ops.operator import ColumnSelector, Operator
ModuleNotFoundError: No module named 'merlin.dtypes'
Here is my installed package list:
merlin 0.0.1
merlin-core 0.5.0
merlin-dataloader 0.0.3
merlin-models 23.2.0
merlin-systems 23.2.0
nvtabular 23.2.0
transformers4rec 23.2.0
Please guide. Thanks.
@murali-munna Are you installing the Merlin packages via pip? If so, are you installing them on a machine with a GPU and the CUDA driver installed? Can you please give more detail about your environment?
Most likely you have library version mismatches. If you are installing via pip, you can search our PyPI packages and install the latest stable versions:
https://pypi.org/project/merlin-core/
https://pypi.org/project/nvtabular/
https://pypi.org/project/merlin-dataloader/
https://pypi.org/project/merlin-models/
https://pypi.org/project/merlin-systems/
Please check this support matrix for the library versions: https://nvidia-merlin.github.io/Merlin/main/support_matrix/support_matrix_merlin_tensorflow.html
I'm also able to reproduce this issue when installing via conda/mamba; it looks like the dependencies are somehow not being set correctly for nvtabular.
When running: conda install -c nvidia -c rapidsai -c numba -c conda-forge nvtabular
Conda will choose to install:
nvtabular 23.02.00
merlin-core 0.5.0
merlin-dataloader 0.3.0
Calling install with merlin-core=23.02.01 and merlin-dataloader=23.02.01 appears to resolve the problem.
We intend to fix the merlin-core dependency version specifiers, which currently accept too wide a range of versions, in the next release of NVTabular.
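For anyone hitting the same error, a quick way to spot this kind of mismatch is to list the installed Merlin package versions and compare their release prefixes. The sketch below is only a heuristic: it assumes, based on the versions reported in this thread, that packages from the same Merlin release share the same year.month prefix (for example 23.02). The helper names (installed_versions, release_prefix, mismatched) are hypothetical and not part of any Merlin API; only the standard-library importlib.metadata module is used.

```python
from importlib import metadata

# Packages mentioned in this thread; adjust the list for your environment.
PACKAGES = [
    "nvtabular",
    "merlin-core",
    "merlin-dataloader",
    "merlin-models",
    "merlin-systems",
]


def installed_versions(packages):
    """Return {package: version string}, with None for packages not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def release_prefix(version):
    """Return (major, minor) as ints, or None if the version cannot be parsed."""
    parts = version.split(".")
    try:
        return int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return None


def mismatched(versions, reference="nvtabular"):
    """Return installed packages whose (major, minor) differs from the reference.

    Assumption (heuristic, not an official rule): packages from the same
    Merlin release share the same year.month prefix.
    """
    ref_version = versions.get(reference)
    if ref_version is None:
        return []
    target = release_prefix(ref_version)
    return [
        name
        for name, version in versions.items()
        if name != reference
        and version is not None
        and release_prefix(version) != target
    ]


if __name__ == "__main__":
    found = installed_versions(PACKAGES)
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
    for name in mismatched(found):
        print(f"possible mismatch with nvtabular: {name} {found[name]}")
```

With the versions reported above (nvtabular 23.02.00 alongside merlin-core 0.5.0), this check would flag merlin-core, which matches the pinning fix described in this comment. Treat a flagged package as a hint to consult the support matrix rather than as proof of incompatibility.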
Closing this issue due to low activity.