[BUG] Throw warning if reserved column is used
bschifferer opened this issue · 0 comments
Describe the bug
It seems we are not allowed to name a column "labels" when using
the Categorify op.
Steps/Code to reproduce bug
import cudf
import nvtabular as nvt
from nvtabular.ops import *
df = cudf.DataFrame({'labels': [10,11,12]})
feat = ['labels'] >> nvt.ops.Categorify()
workflow = nvt.Workflow(feat)
dataset = nvt.Dataset(df, cpu=False)
workflow.fit(dataset)
workflow.transform(dataset).compute()
The output is the original input, [10, 11, 12], rather than the categorified values.
Expected behavior
The output should be the categorified column.
If that is not possible, we should at least throw a warning (or even an error)
stating that "labels" cannot be used as a column name in Categorify.
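To make the request concrete, here is a minimal sketch of the kind of check I have in mind. This is not existing NVTabular code: the helper name warn_on_reserved_columns and the RESERVED_COLUMN_NAMES set are hypothetical, and I am assuming "labels" is the only reserved name, based on what I observed above.

```python
import warnings

# Assumption: "labels" is the only reserved name, as observed in this report.
# The real set (if any) would have to come from the Categorify implementation.
RESERVED_COLUMN_NAMES = frozenset({"labels"})


def warn_on_reserved_columns(column_names, reserved=RESERVED_COLUMN_NAMES):
    """Hypothetical helper: warn for each column whose name is reserved.

    Returns the list of clashing names so callers can also raise if they prefer.
    """
    clashes = [name for name in column_names if name in reserved]
    for name in clashes:
        warnings.warn(
            f"Column name {name!r} is reserved by Categorify and may be "
            "returned unchanged; consider renaming the column.",
            UserWarning,
            stacklevel=2,
        )
    return clashes
```

Something like this could run on the selected column names before fitting, so that the user gets a message instead of a silently untransformed column. Raising an error instead of warning would only require checking the returned list.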
Environment details (please complete the following information):
I tested this in the pytorch:22.12 container. Reading the NVT code, it seems that "labels"
is a special column name:
https://github.com/NVIDIA-Merlin/NVTabular/blob/main/nvtabular/ops/categorify.py#L1645
Additional context