NVIDIA-Merlin/NVTabular

[BUG] Dtype discrepancy with pandas and groupby on CPU

oliverholworthy opened this issue · 1 comment

Describe the bug

Steps/Code to reproduce bug

TypeError: Dtype discrepancy detected for column age_days-list: operator Groupby reported dtype `DType(name='float32', element_type=<ElementType.Float: 'float'>, element_size=32, element_unit=None, signed=True, shape=Shape(dims=None))` but returned dtype `DType(name='float64', element_type=<ElementType.Float: 'float'>, element_size=64, element_unit=None, signed=True, shape=Shape(dims=None))`.
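For readers unfamiliar with this error, here is a minimal sketch of the kind of declared-versus-returned dtype comparison the message describes. It uses plain pandas and NumPy only. The helper name `find_dtype_discrepancies` and the sample data are hypothetical; this is neither the reproduction from this issue nor the actual check in NVTabular or merlin-core.

```python
import numpy as np
import pandas as pd


def find_dtype_discrepancies(df, expected):
    """Return {column: (declared_dtype, actual_dtype)} for mismatched columns.

    Hypothetical helper for illustration; not part of NVTabular.
    """
    mismatches = {}
    for column, want in expected.items():
        got = df[column].dtype
        if got != np.dtype(want):
            mismatches[column] = (np.dtype(want), got)
    return mismatches


# Illustrative data only; not the data from this issue.
df = pd.DataFrame(
    {
        "user_id": np.array([1, 1, 2], dtype="int64"),
        "age_days": np.array([0.5, 1.5, 2.5], dtype="float32"),
    }
)

# Dtypes an operator might declare for its output columns.
declared = {"user_id": "int64", "age_days": "float32"}

# Simulate a step that returns float64 where float32 was declared.
returned = df.assign(age_days=df["age_days"].astype("float64"))

print(find_dtype_discrepancies(df, declared))
print(find_dtype_discrepancies(returned, declared))
```

Running a comparison like this on the output of the CPU path and of the GPU path could help narrow down which step widens `float32` to `float64`.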

Expected behavior

No exception is raised, and the output matches the equivalent result from running on GPU with cudf.

Environment details:

  • Environment location: Docker
  • Method of NVTabular install: from source

Additional context

A similar issue was reported recently in #1767. However, that particular example now works following a change in core: NVIDIA-Merlin/core#226

angmc commented

@oliverholworthy I ran this on the 23.04 pytorch container without a GPU and it completed without error. Does this error only appear when NVTabular is installed from source, or was it also fixed by the changes in core?