scikit-learn-contrib/category_encoders

Pandas copy-on-write doesn't work properly

s-banach opened this issue · 2 comments

Here is the basic error message you get when running (probably) any of the encoders
after setting pd.options.mode.copy_on_write = True:

ChainedAssignmentError: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
When using the Copy-on-Write mode, such inplace method never works to update the original DataFrame or Series, because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' instead, to perform the operation inplace on the original object.

A quick way to find these automatically would be to run ruff with the PD rules enabled, in particular PD002, which flags uses of inplace=True and can auto-fix many of them.

Copy-on-Write is probably going to be the default in pandas 3.0, so this should be viewed as a legitimate issue.

Fixing this is typically as simple as replacing df[col].method(inplace=True) with df[col] = df[col].method().
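
As a rough illustration of that replacement, here is a minimal sketch using fillna as the stand-in for "method". The DataFrame and the column name "city" are made up for the example and are not taken from the category_encoders code. Both fixes behave the same whether or not Copy-on-Write is enabled.

```python
import pandas as pd

# Hypothetical data; the column name "city" is only for illustration.
df = pd.DataFrame({"city": ["Oslo", None, "Lima"], "n": [1, 2, 3]})

# Problematic form (left commented out): under Copy-on-Write, df["city"] is a
# temporary object, so the inplace call never reaches df and pandas reports
# the ChainedAssignmentError quoted above.
# df["city"].fillna("unknown", inplace=True)

# Option 1: reassign the result of the non-inplace call.
fixed = df.copy()
fixed["city"] = fixed["city"].fillna("unknown")

# Option 2: call the method on the DataFrame with a per-column mapping,
# which is the form the error message itself suggests.
also_fixed = df.fillna({"city": "unknown"})

print(fixed["city"].tolist())
print(also_fixed["city"].tolist())
```

Option 1 is the mechanical rewrite described above. Option 2 avoids indexing the column at all, which may read better when several columns need the same treatment.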

Thanks for reporting this. I can fix it once I have some time, or you can create a PR if you want to.