catboost/catboost

How to recursively remove features by best loss?

Opened this issue · 2 comments

I followed the example for recursive feature selection and found that its goal is to reduce the feature count to num_features_to_select.
Instead of setting num_features_to_select, I want to reduce the number of features based on the best loss.

As shown in my feature-selection plot:
[image: loss vs. number of removed features]

Here we can see that removing between 70 and 270 features yields the best loss, so 270 is the maximum number of features to remove.
But I couldn't find an API for this purpose. It would be good for summary = model.select_features(...) to report the best-loss feature set.

ek-ak commented

Hello! Sorry for the long wait for an answer to your question.
It looks like a useful feature, thanks for the request!

You can obtain the best-loss feature set from the summary, i.e.:

```python
summary = model.select_features(...)
# Loss value at each elimination step (index 0 = no features removed)
values = summary['loss_graph']['loss_values']
# Step with the lowest loss
best_iteration = values.index(min(values))
# Features eliminated up to that step
eliminated_features = summary['eliminated_features'][:best_iteration]
```
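For illustration, the indexing above can be wrapped in a small helper. The summary dict here is a hypothetical mock with the same shape as the real select_features summary (a 'loss_graph' with 'loss_values', plus 'eliminated_features' in elimination order); the feature names and loss values are made up:

```python
def best_loss_features(summary):
    """Return the features eliminated up to the lowest-loss step."""
    values = summary['loss_graph']['loss_values']
    best_iteration = values.index(min(values))
    return summary['eliminated_features'][:best_iteration]

# Mock summary: loss after removing 0, 1, 2, 3 features;
# the minimum is at step 2, so the first two eliminations are kept.
mock_summary = {
    'loss_graph': {'loss_values': [0.50, 0.45, 0.40, 0.42]},
    'eliminated_features': ['f3', 'f7', 'f1'],
}
print(best_loss_features(mock_summary))  # → ['f3', 'f7']
```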
eromoe commented

Thanks. Yeah, it's very useful, and it might need some parameters to balance the best loss against the maximum number of removed features.
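One way such a balance could be sketched (the function name, tolerance parameter, and loss values are hypothetical, not a CatBoost API): accept any elimination step whose loss is within a tolerance of the best observed loss, and among those pick the step that removes the most features:

```python
def max_removal_within_tolerance(loss_values, tol=0.01):
    """Index of the step removing the most features while keeping
    the loss within `tol` of the best observed loss."""
    best = min(loss_values)
    return max(i for i, v in enumerate(loss_values) if v <= best + tol)

# Hypothetical loss curve: the minimum is at step 2, but step 3 is
# within tolerance and removes one more feature.
losses = [0.50, 0.45, 0.40, 0.405, 0.48]
print(max_removal_within_tolerance(losses, tol=0.01))  # → 3
```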