[BUG] Merlin raises AttributeError: 'ColumnSelector' object has no attribute 'all' in an example notebook
mtnt-2022 opened this issue · 0 comments
Bug description
/usr/local/lib/python3.8/dist-packages/nvtabular/workflow/workflow.py:427: UserWarning: Loading workflow generated with nvtabular version 0+unknown - but we are running nvtabular 23.02.00. This might cause issues
warnings.warn(
/usr/local/lib/python3.8/dist-packages/nvtabular/workflow/workflow.py:427: UserWarning: Loading workflow generated with cudf version 22.02.00a+309.gdad51a548e - but we are running cudf 22.08.00a+304.g6ca81bbc78.dirty. This might cause issues
warnings.warn(
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[17], line 1
----> 1 export.export_ensemble(
2 model_name=MODEL_NAME,
3 workflow_path=local_workflow_path,
4 saved_model_path=local_saved_model_path,
5 output_path=local_ensemble_path,
6 categorical_columns=categorical_columns,
7 continuous_columns=continuous_columns,
8 label_columns=label_columns,
9 num_slots=NUM_SLOTS,
10 max_nnz=MAX_NNZ,
11 num_outputs=NUM_OUTPUTS,
12 embedding_vector_size=EMBEDDING_VECTOR_SIZE,
13 max_batch_size=MAX_BATCH_SIZE,
14 model_repository_path=model_repository_path
15 )
File /home/jupyter/mluser/git/aml-merlin-on-vertex-ai/example/../src/serving/export.py:107, in export_ensemble(model_name, workflow_path, saved_model_path, output_path, categorical_columns, continuous_columns, label_columns, num_slots, max_nnz, num_outputs, embedding_vector_size, max_batch_size, model_repository_path)
104 hugectr_params['embedding_vector_size'] = embedding_vector_size
105 hugectr_params['n_outputs'] = num_outputs
--> 107 export_hugectr_ensemble(
108 workflow=workflow,
109 hugectr_model_path=saved_model_path,
110 hugectr_params=hugectr_params,
111 name=model_name,
112 output_path=output_path,
113 label_columns=label_columns,
114 cats=categorical_columns,
115 conts=continuous_columns,
116 max_batch_size=max_batch_size,
117 )
119 hugectr_backend_config = create_hugectr_backend_config(
120 model_path=os.path.join(output_path, model_name, '1'),
121 max_batch_size=max_batch_size,
122 deployed_device_list=[0],
123 model_repository_path=model_repository_path)
125 with open(os.path.join(output_path, HUGECTR_CONFIG_FILENAME), 'w') as f:
File /usr/local/lib/python3.8/dist-packages/nvtabular/inference/triton/ensemble.py:245, in export_hugectr_ensemble(workflow, hugectr_model_path, hugectr_params, name, output_path, version, max_batch_size, nvtabular_backend, cats, conts, label_columns)
242 if not cats and not conts:
243 raise ValueError("Either cats or conts has to have a value.")
--> 245 workflow = workflow.remove_inputs(labels)
247 # generate the nvtabular triton model
248 preprocessing_path = os.path.join(output_path, name + "_nvt")
File /usr/local/lib/python3.8/dist-packages/nvtabular/workflow/workflow.py:160, in Workflow.remove_inputs(self, input_cols)
140 def remove_inputs(self, input_cols) -> "Workflow":
141 """Removes input columns from the workflow.
142
143 This is useful for the case of inference where you might need to remove label columns
(...)
158 merlin.dag.Graph.remove_inputs
159 """
--> 160 self.graph.remove_inputs(input_cols)
161 return self
File /usr/local/lib/python3.8/dist-packages/merlin/dag/graph.py:173, in Graph.remove_inputs(self, to_remove)
171 node, columns_to_remove = nodes_to_process.popleft()
172 if node.input_schema and len(node.input_schema):
--> 173 output_columns_to_remove = node.remove_inputs(columns_to_remove)
175 for child in node.children:
176 nodes_to_process.append(
177 (child, list(set(to_remove + output_columns_to_remove)))
178 )
File /usr/local/lib/python3.8/dist-packages/merlin/dag/node.py:425, in Node.remove_inputs(self, input_cols)
411 def remove_inputs(self, input_cols: List[str]) -> List[str]:
412 """
413 Remove input columns and all output columns that depend on them.
414
(...)
423 The output columns that were removed
424 """
--> 425 removed_outputs = _derived_output_cols(input_cols, self.column_mapping)
427 self.input_schema = self.input_schema.without(input_cols)
428 self.output_schema = self.output_schema.without(removed_outputs)
File /usr/local/lib/python3.8/dist-packages/merlin/dag/node.py:484, in Node.column_mapping(self)
482 @property
483 def column_mapping(self):
--> 484 selector = self.selector or ColumnSelector(self.input_schema.column_names)
485 return self.op.column_mapping(selector)
File /usr/local/lib/python3.8/dist-packages/merlin/dag/selector.py:151, in ColumnSelector.__bool__(self)
150 def __bool__(self):
--> 151 return bool(self.all or self._names or self.subgroups or self.tags)
AttributeError: 'ColumnSelector' object has no attribute 'all'
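The version-mismatch warnings at the top suggest what is going on: the workflow was saved with an older nvtabular/merlin-core, whose pickled ColumnSelector objects never stored the `all` attribute that the installed ColumnSelector.__bool__ now reads. A minimal, Merlin-free sketch of that failure mode (the `Selector` class here is a stand-in, not Merlin code):

```python
import pickle

# Old class definition: stores only _names, no `all` attribute.
class Selector:
    def __init__(self, names):
        self._names = names

# Round-trip through pickle; unpickling restores __dict__ directly
# and never calls __init__.
old = pickle.loads(pickle.dumps(Selector(["a", "b"])))

# Simulate upgrading the library in place: the newer __bool__ reads
# an `all` attribute the old pickle never wrote.
def _new_bool(self):
    return bool(self.all or self._names)

Selector.__bool__ = _new_bool

try:
    bool(old)
except AttributeError as exc:
    print(exc)  # 'Selector' object has no attribute 'all'
```

Because pickle restores instance state without re-running `__init__`, any attribute added to the class after the object was saved is simply missing on load, which matches the traceback above.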
Steps/Code to reproduce bug
- Run the cell with
  export.export_ensemble(
      model_name=MODEL_NAME,
      workflow_path=local_workflow_path,
      saved_model_path=local_saved_model_path,
      output_path=local_ensemble_path,
      categorical_columns=categorical_columns,
      continuous_columns=continuous_columns,
      label_columns=label_columns,
      num_slots=NUM_SLOTS,
      max_nnz=MAX_NNZ,
      num_outputs=NUM_OUTPUTS,
      embedding_vector_size=EMBEDDING_VECTOR_SIZE,
      max_batch_size=MAX_BATCH_SIZE,
      model_repository_path=model_repository_path
  )
- The AttributeError above is raised
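The clean fix is presumably to regenerate and re-save the workflow with the running nvtabular version. If that is not an option, one possible stopgap (an untested assumption, not an official Merlin API) is to backfill the attributes that the installed ColumnSelector.__bool__ reads (`all`, `subgroups`, `tags`, per the traceback) onto each deserialized selector before exporting; the defaults below are guesses at falsy values:

```python
# Hypothetical helper: backfill attributes that the installed
# merlin-core ColumnSelector expects but an older pickled selector
# never stored. Attribute names come from the __bool__ shown in the
# traceback; the default values are assumptions.
def backfill_selector(selector):
    for attr, default in (("all", False), ("subgroups", []), ("tags", [])):
        if not hasattr(selector, attr):
            setattr(selector, attr, default)
    return selector
```

Applying it would mean walking the loaded workflow's graph and passing each node's selector (when set) through this helper before calling export_hugectr_ensemble; whether that is sufficient depends on merlin-core internals.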
Expected behavior
export.export_ensemble completes and writes the ensemble artifacts without raising an AttributeError.
Environment details
- Merlin version: 1.9.1
- Platform: Ubuntu 20.04.5 LTS
- Python version: Python 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0] on linux
- PyTorch version (GPU?): 2.0.0 (with GPU support)
Additional context
Merlin image: nvcr.io/nvidia/merlin/merlin-pytorch:23.02
Distributor ID: Ubuntu
Description: Ubuntu 20.04.5 LTS
Release: 20.04
Codename: focal
merlin 1.9.1
merlin-core 23.2.1
merlin-dataloader 0.0.3
merlin-models 23.2.0
merlin-systems 23.2.0
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
nvidia-pyindex 1.0.9
nvtabular 23.2.0
torch 2.0.0