scikit-learn-contrib/category_encoders

Pandas FutureWarning: The default dtype for empty Series will be 'object'

ftrojan opened this issue · 4 comments

Expected Behavior

No warning when using the ordinal encoder with an empty dataframe.

Actual Behavior

category_encoders/ordinal.py:329: FutureWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.

The same warning is raised at lines 294 and 331 of ordinal.py.
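For context, the warning comes from pandas itself and can be observed without the encoder. The snippet below is a minimal sketch, not the code in ordinal.py: the helper name `empty_series_warnings` is made up for illustration. It builds an empty Series with and without an explicit dtype and records any FutureWarning. On pandas 1.x the implicit call should emit the warning quoted above; newer pandas versions may no longer warn.

```python
import warnings

import pandas as pd


def empty_series_warnings(**kwargs):
    """Build an empty Series and return it with any FutureWarning messages raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is suppressed
        series = pd.Series(**kwargs)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, FutureWarning)
    ]
    return series, messages


# No dtype given: pandas 1.x warns that the default will change to 'object'.
implicit, implicit_msgs = empty_series_warnings()

# Explicit dtype: nothing is left for pandas to guess, so no warning is expected.
explicit, explicit_msgs = empty_series_warnings(dtype="float64")

print("implicit:", implicit.dtype, implicit_msgs)
print("explicit:", explicit.dtype, explicit_msgs)
```

Passing `dtype` explicitly is what the warning text itself recommends, so this is probably the least invasive way to silence it.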

Steps to Reproduce the Problem

  1. Use the ordinal encoder with an empty dataframe.

Specifications

  • Version: 2.4.1
  • Platform: MacOS
  • Subsystem: Python 3.8.13, pandas==1.4.2, numpy==1.22.3

Hi @ftrojan

could you please specify your pandas and numpy versions? Coincidentally, I've been looking into our dependency versions over the last couple of days and noticed that there are some incompatibilities between newer numpy versions (>1.20) and older pandas versions (<1.0.5). I've created a separate issue for this and explained how I'll move this library to higher versions: #359

I just checked, and the warning is unrelated to that issue. I think we can safely ignore it, though. In the ordinal encoder the data for every Series is specified, so we should never initialize an empty Series except when the input data is empty, and in that case we do not care about data types.

My pandas is 1.4.2 and my numpy is 1.22.3.

I plan to provide a pull request, as the fix seems to be relatively straightforward. I expect to have it ready within the next two weeks, as my free time allows.
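As a rough idea of what such a fix could look like, here is a sketch under assumptions: `build_ordinal_mapping` is a hypothetical helper, not a function from category_encoders, and the choice of `int64` is only illustrative. The point is that a mapping Series built with an explicit dtype behaves the same for empty and non-empty input.

```python
import pandas as pd


def build_ordinal_mapping(categories):
    """Hypothetical helper: map each category to an ordinal 1..n.

    The dtype is always passed explicitly, so an empty list of
    categories does not leave pandas to pick a default dtype.
    """
    categories = list(categories)
    return pd.Series(
        data=range(1, len(categories) + 1),
        index=categories,
        dtype="int64",
    )


non_empty = build_ordinal_mapping(["a", "b", "c"])
empty = build_ordinal_mapping([])

print(non_empty)
print(empty.dtype, len(empty))
```

Whether the real mappings should use an integer or float dtype depends on how the encoder represents missing and unknown values, so the dtype above would need to be checked against the existing behavior.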