RuleSet model does not implement regression method
Closed this issue · 7 comments
The RuleSetModelEvaluator only implements evaluateClassification, but not evaluateRegression. This means that scoring a RuleSetModel with functionName="regression", which is definitely allowed by the PMML standard, leads to a runtime error.
.. a RuleSetModel with functionName="regression", which is definitely allowed by the PMML standard ..
Care to provide some references?
More importantly, what software is producing regression-type RuleSetModel files? Do you have an example?
If you check https://dmg.org/pmml/v4-3/RuleSet.html#xsdElement_RuleSetModel, it states
<xs:attribute name="functionName" type="MINING-FUNCTION" use="required"/>
and it is not clear to me why MINING-FUNCTION here cannot be regression. If only classification were allowed, why would this attribute exist? Also, the very top of the page states "Ruleset models can be thought of as flattened decision tree models." Since decision trees can be regression models, why can't Ruleset models?
This is not from some standard converter, but a custom model that uses Ruleset for post-processing of a regression model.
If only classification were allowed, why would this attribute exist?
The <Model>@functionName attribute is common to all model elements, whether they support only one mining function type or multiple. For example, the Scorecard model is a regression-type model; however, it too has a functionName attribute that, at least theoretically, can be set to alternative mining function types.
Also, the very top of the page states "Ruleset models can be thought of as flattened decision tree models."
Similarly, Scorecard models can be thought of as flattened decision tree models.
This is not from some standard converter, but a custom model that uses Ruleset for post-processing of a regression model.
That sounds like an interesting setup. You would normally post-process (aka calibrate) predictions using transformer-like elements (e.g. NormContinuous for isotonic regression), not model-like elements.
How else does your regression-type RuleSetModel element differ from a "classical" classification-type RuleSetModel element?
The JPMML-Evaluator library allows you to subclass the o.j.e.rule_set.RuleSetModelEvaluator class and provide the desired #evaluateRegression(ValueFactory, EvaluationContext) method implementation. If there is a demonstrable value/use case for it, then perhaps it will be possible to extract some common rule evaluation logic into a helper method.
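For illustration, a rough sketch of such a subclass. The subclass name is hypothetical, and the constructor arguments and the exact evaluateRegression signature (return type, field-name type) differ between JPMML-Evaluator versions, so treat them as assumptions to be checked against the version in use:

```java
import java.util.Map;

import org.dmg.pmml.PMML;
import org.dmg.pmml.rule_set.RuleSetModel;
import org.jpmml.evaluator.EvaluationContext;
import org.jpmml.evaluator.ValueFactory;
import org.jpmml.evaluator.rule_set.RuleSetModelEvaluator;

// Hypothetical subclass; the constructor and the evaluateRegression signature
// below are assumptions and may need adjusting to the JPMML-Evaluator version
// in use.
public class RegressionRuleSetModelEvaluator extends RuleSetModelEvaluator {

	public RegressionRuleSetModelEvaluator(PMML pmml, RuleSetModel ruleSetModel){
		super(pmml, ruleSetModel);
	}

	@Override
	protected <V extends Number> Map<String, ?> evaluateRegression(ValueFactory<V> valueFactory, EvaluationContext context){
		// Evaluate the rule set (for example, walk the SimpleRule elements,
		// pick the firing rule according to the rule selection method) and
		// return its score as a number under the target field name.
		throw new UnsupportedOperationException("Rule evaluation logic goes here");
	}
}
```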
@jrauch-pros Can you contact me privately via e-mail (as listed in my GitHub profile)?
I noticed that as a workaround I can just use functionName="classification" and still treat the output like I would a regression result. Looks odd, but seems to work OK. So I'm closing this.
I noticed that as a workaround I can just use functionName="classification" and still treat the output like I would a regression result.
Main differences from the JPMML-Evaluator API perspective:

- functionName="classification". The target field is an instance of org.jpmml.evaluator.Classification (the exact type is model type dependent), and the enclosed value is typically a java.lang.String. For example, if your model wants to return a PI value, then you'd be getting the string "3.14159" (your application will have to parse it into a number; see the sketch below).
- functionName="regression". The target field is an instance of org.jpmml.evaluator.Regression, and the enclosed value is some java.lang.Number.
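To make the workaround concrete, here is a minimal client-side sketch. It assumes the JPMML-Evaluator LoadingModelEvaluatorBuilder API with String-based field names (older versions use a FieldName wrapper instead), and the file name and input field are placeholders:

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

import org.jpmml.evaluator.Evaluator;
import org.jpmml.evaluator.EvaluatorUtil;
import org.jpmml.evaluator.LoadingModelEvaluatorBuilder;

public class RuleSetWorkaroundDemo {

	public static void main(String... args) throws Exception {
		// Load the classification-type RuleSetModel (file name is a placeholder)
		Evaluator evaluator = new LoadingModelEvaluatorBuilder()
			.load(new File("ruleset.pmml"))
			.build();

		Map<String, Object> arguments = new LinkedHashMap<>();
		// arguments.put("x1", 1.0); // fill in the model's input fields here

		Map<String, ?> results = evaluator.evaluate(arguments);

		// The target value is a Classification-type object; EvaluatorUtil.decode
		// unwraps it to the winning category, which under this workaround is a
		// numeric string such as "3.14159".
		String targetName = evaluator.getTargetFields().get(0).getName();
		Object decoded = EvaluatorUtil.decode(results.get(targetName));

		// Parse the "category" back into a number
		double score = Double.parseDouble(String.valueOf(decoded));
		System.out.println(score);
	}
}
```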
If you don't want to re-implement the #evaluateRegression(ValueFactory, EvaluationContext) method from scratch, then at least you could re-package the target field value from o.j.e.Classification to o.j.e.Regression there.
I'm not looking to change the library because I don't control the system that does the actual scoring. So I'll make it work with what is implemented now. Thanks!