Use a function call to monkey patch and not on import?
thomasjpfan opened this issue · 5 comments
For scikit-learn, I prefer to not automatically monkeypatch when using this library. I plan to use a custom function to get the DataFrame API:
def get_dataframe_standard(df):
    if hasattr(df, "__dataframe_standard__"):
        return df.__dataframe_standard__()
    elif _is_pandas_df(df):
        from .pandas_standard import dataframe_standard as pandas_dataframe_standard
        return pandas_dataframe_standard(df)
    ...
Is there a way to adjust this library to monkey patch after a function call?
from pandas_standard import patch_dataframe
patch_dataframe("pandas")
A similar pattern is used in Intel's scikit-learn implementation.
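To make the request concrete, here is a minimal sketch of opt-in patching. Everything in it is hypothetical: `FakeDataFrame` stands in for a real dataframe class so the snippet runs without pandas, `StandardWrapper` stands in for the compliant wrapper, and this `patch_dataframe` takes a class rather than the `"pandas"` string proposed above.

```python
class StandardWrapper:
    """Hypothetical stand-in for a standard-compliant wrapper object."""

    def __init__(self, df):
        self.df = df


class FakeDataFrame:
    """Stand-in for a dataframe class so the sketch runs without pandas."""


def patch_dataframe(cls):
    """Attach __dataframe_standard__ to cls, but only when explicitly called."""

    def __dataframe_standard__(self):
        return StandardWrapper(self)

    cls.__dataframe_standard__ = __dataframe_standard__


df = FakeDataFrame()
# Defining or importing the helper has no side effect on the class.
patched_before = hasattr(df, "__dataframe_standard__")
# The attribute only appears once the user opts in.
patch_dataframe(FakeDataFrame)
patched_after = hasattr(df, "__dataframe_standard__")
```

With this shape, a library can import the wrapper without changing what `hasattr` reports in the user's environment; only an explicit call does.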
Hey @thomasjpfan
You could do
from pandas_standard import PandasDataFrame
df_compliant = PandasDataFrame(df)
Eventually the aim is to get this upstreamed to pandas/polars, but for now, that should work to test it out and see what the gaps are
The issue is that importing pandas_standard itself will monkeypatch, even if I only want to use PandasDataFrame:
from pandas_standard import PandasDataFrame
import pandas as pd
x = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
assert hasattr(x, "__dataframe_standard__")
A library using pandas_standard will end up monkeypatching a user's environment.
Specifically, the pandas_standard import in the following function will trigger the monkeypatch:
def get_dataframe_standard(df):
    if hasattr(df, "__dataframe_standard__"):
        return df.__dataframe_standard__()
    elif _is_pandas_df(df):
        from pandas_standard import PandasDataFrame
        return PandasDataFrame(df)
    elif _is_polars_df(df):
        ...
ah makes sense
I'll just remove the monkeypatching then
have updated - does this work for you?
Yea, that's okay. The only nit is that convert_to_standard_compliant_dataframe should raise an error if it does not recognize the input:
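The snippet originally attached to this comment is not reproduced here. As a rough illustration of the requested behaviour only, and not the library's actual code, a converter could fall through to an explicit error. The name `to_standard_or_raise` and the choice of `TypeError` are assumptions.

```python
def to_standard_or_raise(df):
    """Return a standard-compliant object, or raise if df is not recognized."""
    if hasattr(df, "__dataframe_standard__"):
        return df.__dataframe_standard__()
    # No known branch matched: fail loudly instead of returning None.
    raise TypeError(
        f"Expected an object with __dataframe_standard__, got {type(df).__name__}"
    )
```

Raising here means a caller that passes a list or an array finds out immediately, rather than hitting an unrelated attribute error later.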
I likely won't be able to use standard.py as is, because scikit-learn optionally depends on the dataframe libraries. But I can work around it by having my own custom function.