Categorical Feature LGBM supported?
Sathrovarr opened this issue · 2 comments
I'm running the LightGBM classifier lightgbm.LGBMClassifier, including setting categorical features as model.fit(X, y, categorical_feature=ccols).
When I attempt to export the resulting model using nyoka's lgb_to_pmml, I get the following error:
python3.9/site-packages/nyoka/lgbm/lgb_to_pmml.py", line 338, in create_left_node
operator=SIMPLE_PREDICATE_OPERATOR.LESS_OR_EQUAL, value="{:.16f}".format(obj['threshold'])))
I've double-checked that the categoric_values passed around between the library functions are set up correctly. However, I could not find anywhere that they are actually taken into account. It appears that, regardless, the library tries to create a <= and > node around the value, which it formats with {:.16f}. The categoricals we provide to the model are cast to int on our side, so this would generally work, except that the LGBM model in question apparently produces 1395||1401||1427||1496||1504||1510||1521 as the threshold value, at which point nyoka's float formatting fails.
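For reference, here is a minimal sketch of the failure outside of nyoka, using only the standard library. The helper names format_numeric_threshold and parse_categorical_threshold are my own and are not part of nyoka or LightGBM; the first only mimics the formatting call shown in the traceback.

```python
# Threshold string copied from the model described above; the category
# codes are joined with "||".
threshold = "1395||1401||1427||1496||1504||1510||1521"


def format_numeric_threshold(value):
    """Mimic the formatting step from the traceback; only valid for numbers."""
    return "{:.16f}".format(value)


def parse_categorical_threshold(value):
    """Split an 'a||b||c' threshold into a list of integer category codes."""
    return [int(code) for code in value.split("||")]


try:
    format_numeric_threshold(threshold)
    failed = False
except ValueError as exc:
    # The 'f' format code is not defined for str objects.
    failed = True
    print("formatting failed:", exc)

categories = parse_categorical_threshold(threshold)
print(categories)
```

So the string has to be split into category codes before anything numeric can be done with it.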
As far as I can tell, this is an expected threshold value for LGBMClassifier, which I would expect to be represented as a SimpleSetPredicate in the PMML. While I did find implementations of these primitives in nyoka's PMML44.py and PMML44Super.py, I could not find any way they could be called from lgb_to_pmml.
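To illustrate what I would have expected, here is a rough sketch of such a predicate built with the standard library's xml.etree.ElementTree rather than nyoka's generated classes. The function categorical_split_to_predicate and the field name store_id are hypothetical. I am also assuming that rows whose category is in the set follow the left child, so the right child would need the complementary isNotIn operator.

```python
import xml.etree.ElementTree as ET


def categorical_split_to_predicate(field_name, threshold, operator="isIn"):
    """Build a PMML SimpleSetPredicate element from an 'a||b||c' threshold."""
    codes = threshold.split("||")
    predicate = ET.Element(
        "SimpleSetPredicate", field=field_name, booleanOperator=operator
    )
    # PMML stores the set members as a whitespace-separated Array.
    array = ET.SubElement(predicate, "Array", type="int", n=str(len(codes)))
    array.text = " ".join(codes)
    return predicate


# "store_id" is a made-up field name used only for illustration.
element = categorical_split_to_predicate("store_id", "1395||1401||1427")
print(ET.tostring(element, encoding="unicode"))
```

This is only meant to show the target shape of the node, not how nyoka would or should implement it.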
None of the examples given for lgbm seem to include categorical features either (https://github.com/SoftwareAG/nyoka/tree/master/examples/lgbm).
So I'm at a loss as to what I may be missing at this point, or whether categorical columns are simply not supported.
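In case it helps anyone check whether their own model is affected, below is a small sketch that walks a tree in the nested-dict form returned by the booster's dump_model() and collects the categorical splits. I am assuming categorical splits are marked with a decision_type of "==". The tree dict here is a hand-written stand-in with illustrative values, not output from my actual model, and find_categorical_splits is my own helper.

```python
def find_categorical_splits(node, found=None):
    """Recursively collect (split_feature, threshold) for '==' splits."""
    if found is None:
        found = []
    if "split_feature" not in node:  # leaf node, nothing to inspect
        return found
    if node.get("decision_type") == "==":
        found.append((node["split_feature"], node["threshold"]))
    find_categorical_splits(node["left_child"], found)
    find_categorical_splits(node["right_child"], found)
    return found


# Hand-written stand-in for one entry of dump_model()["tree_info"].
tree = {
    "tree_structure": {
        "split_feature": 2,
        "decision_type": "==",
        "threshold": "1395||1401",
        "left_child": {"leaf_value": 0.1},
        "right_child": {
            "split_feature": 0,
            "decision_type": "<=",
            "threshold": 3.5,
            "left_child": {"leaf_value": -0.2},
            "right_child": {"leaf_value": 0.3},
        },
    }
}

splits = find_categorical_splits(tree["tree_structure"])
print(splits)
```

If this returns a non-empty list for any tree, the export will presumably hit the formatting error above.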
I'm using nyoka '5.0.1', and lightgbm '3.2.1'.
Hi @Sathrovarr, support for categorical features has not been added to Nyoka yet. We will try to add it, along with other items in the pipeline, in the near future. Thanks!
Hello @Sathrovarr,
We do have plans on our roadmap to implement this as part of Nyoka. Closing the ticket for now.
Thanks