get_feature_names_out is incompatible with sklearn estimators and, consequently, with eli5

Question

DZIMDZEM opened this issue a year ago · 3 comments

Expected Behavior

In BaseEncoder, get_feature_names_out() should accept an optional input_features argument in addition to self, as sklearn's own estimators do.

def get_feature_names_out(self, input_features=None):
    """
    ...
    """
    return _check_feature_names_in(self, input_features)

Actual Behavior

BaseEncoder's get_feature_names_out() accepts only one argument, self. This makes it incompatible with eli5 and with other modules that work with feature names when the encoder is combined with sklearn modules.

def get_feature_names_out(self) -> List[str]:
    """..."""
    if not isinstance(self.feature_names_out_, list):
        raise NotFittedError("Estimator has to be fitted to return feature names.")
    else:
        return self.feature_names_out_
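A minimal sketch of why the signature matters, using a hypothetical stand-in class (OldStyleEncoder and collect_names are illustrative names, not part of any library). It assumes the caller passes input_features positionally, as sklearn-style tooling may do:

```python
class OldStyleEncoder:
    """Hypothetical stand-in that mimics the self-only signature shown above."""

    def __init__(self, names):
        self.feature_names_out_ = list(names)

    def get_feature_names_out(self):
        return self.feature_names_out_


def collect_names(transformer, input_features):
    """Call get_feature_names_out the way a pipeline-style caller might."""
    try:
        return transformer.get_feature_names_out(input_features)
    except TypeError as exc:
        # The extra positional argument is rejected by the self-only signature.
        return f"TypeError: {exc}"


result = collect_names(OldStyleEncoder(["a", "b"]), ["a", "b"])
print(result)
```

The call fails with a TypeError before any feature names are returned, which is the kind of incompatibility described here.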

Steps to Reproduce the Problem

Add an input_features keyword argument to get_feature_names_out.
Copy or reuse the _check_feature_names_in helper from sklearn.utils.validation so that get_feature_names_out behaves like the implementations sklearn provides in sklearn.base (for example, OneToOneFeatureMixin).
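The two steps above could look roughly like the following sketch. It is not the category_encoders implementation: SketchEncoder is a hypothetical class, its fit() takes column names directly for brevity, and the validation only imitates the kind of check sklearn's helper performs (rejecting names that differ from those seen during fit).

```python
from typing import List, Optional, Sequence


class SketchEncoder:
    """Hypothetical encoder used only to illustrate the proposed signature."""

    def fit(self, columns: Sequence[str]) -> "SketchEncoder":
        # Assumption: one output column per input column.
        self.feature_names_in_ = list(columns)
        self.feature_names_out_ = list(columns)
        return self

    def get_feature_names_out(
        self, input_features: Optional[Sequence[str]] = None
    ) -> List[str]:
        if not hasattr(self, "feature_names_out_"):
            raise RuntimeError("Estimator has to be fitted to return feature names.")
        # If names are supplied, require that they match the columns seen in fit.
        if input_features is not None and list(input_features) != self.feature_names_in_:
            raise ValueError("input_features does not match the columns seen during fit.")
        return self.feature_names_out_


enc = SketchEncoder().fit(["city", "color"])
names_default = enc.get_feature_names_out()
names_explicit = enc.get_feature_names_out(["city", "color"])
```

Both calls return the same list, so callers that pass input_features and callers that do not are served by one method.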

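As a temporary workaround, you can override the method in a subclass. Example for TargetEncoder:

from category_encoders import TargetEncoder

class TargetEncoderFixed(TargetEncoder):
    def get_feature_names_out(self, *args, **kwargs):
        # Accept and ignore any extra arguments, e.g. input_features.
        return self.feature_names_out_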

Specifications

Version: 2.6.0
Platform: Windows
Subsystem: pipeline workflow

Answer 1 · 2023-05-14T12:54:36.000Z

I think this was fixed by #398.
Could you please confirm? On the current master branch, get_feature_names_out already supports the input_features argument. I haven't built a release with this bugfix yet, though, so if you install from PyPI you will still experience the problem.
I can build a release this week if it solves your problem.

Answer 2 · 2023-05-14T21:37:27.000Z

@PaulWestenthanner, yes, it resolves the problem. Thank you for the fast response!

Answer 3 · 2023-05-15T21:31:21.000Z

Version 2.6.1 is now published on PyPI.
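To check whether an environment already has that release, something like the sketch below may help. The distribution name "category_encoders" is an assumption (the thread does not spell it out), and parse_release and has_input_features_fix are hypothetical helpers that only compare the leading numeric version components.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return up to three leading numeric components, e.g. '2.6.1' -> (2, 6, 1)."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def has_input_features_fix(version: str) -> bool:
    """True if the version is at least 2.6.1, the release named in this thread."""
    return parse_release(version) >= (2, 6, 1)


try:
    # Assumed distribution name; adjust if the package is installed under another name.
    installed = metadata.version("category_encoders")
except metadata.PackageNotFoundError:
    installed = None

if installed is not None:
    print(installed, has_input_features_fix(installed))
```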