jpmml/jpmml-evaluator

Issue with assigning result to Integer PMML 4.1

Closed · 1 comment

Hello.
When running the DMG example:
http://dmg.org/pmml/pmml_examples/KNIME_PMML_4.1_Examples/single_audit_mlp.xml
I get an issue: "Expected integer value got double"
This issue occurs since the resulting value is a number with a decimal point and this can be resolved by updating the PMML and changing the result value from integer to double. However other PMML engines (for example Zementis) can handle such scenarios and round the result so it could be assigned to an Integer.
Since we are running the PMML from the official DMG website we would expect that corrections to the PMML would not be needed.
Maybe this is something that could be enhanced in the future.

Thank you,
Alex Eidukaitis

When running the DMG example:
http://dmg.org/pmml/pmml_examples/KNIME_PMML_4.1_Examples/single_audit_mlp.xml
I get an issue: "Expected integer value got double"

This error is valid and intended: if the model performs its calculations using floating-point math but maps the prediction to an integer target field in the end, then the floating-point value must be explicitly rounded to an integer value using the Targets/Target@castInteger attribute:
http://dmg.org/pmml/v4-3/Targets.html

This issue occurs since the resulting value
is a number with a decimal point and this can
be resolved by updating the PMML and changing
the result value from integer to double.

Correct.

KNIME should be generating a PMML document where the data type of the target field is double (not integer), because all calculations are performed inside the model using double values.

It's a known KNIME bug. I'm afraid that it's still around in the latest KNIME versions. Perhaps you should raise an additional issue on the KNIME forums?

However other PMML engines (for example Zementis)
can handle such scenarios and round the result so it could
be assigned to an Integer.

Suppose we have a model for diagnosing COVID-19.

The model performs a computation and obtains a double value 0.2, which then must be mapped to a target field that has integer data type.

Is this patient diseased or not? Rounding 0.2 down (Target@castInteger=floor) would mean that the patient is not diseased. Rounding 0.2 up (Target@castInteger=ceiling) would mean that the patient is diseased.

Performing a round operation is a business decision. A PMML engine cannot make business decisions at random; it must carry out the intended computation, no more, no less, and raise an error when the intended computation is incorrect.
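To make the point concrete, here is a minimal Java sketch of the behavior described above. It is an illustration only, not the JPMML-Evaluator API: the class `CastIntegerSketch` and the method `toInteger` are hypothetical names, and the enum simply mirrors the three values that PMML allows for Target@castInteger (round, ceiling, floor). Tie-breaking for `round` follows Java's `Math.round`, which is an assumption of this sketch.

```java
// Hypothetical illustration; none of these names come from JPMML-Evaluator.
final class CastIntegerSketch {

    /** Mirrors the three values PMML allows for Target@castInteger. */
    enum CastInteger { ROUND, CEILING, FLOOR }

    /**
     * Converts a floating-point prediction to an integer target value.
     * A null policy stands for "the PMML document declared no castInteger".
     */
    static int toInteger(double value, CastInteger policy) {
        if (policy != null) {
            switch (policy) {
                case ROUND:   return (int) Math.round(value);
                case CEILING: return (int) Math.ceil(value);
                case FLOOR:   return (int) Math.floor(value);
            }
        }
        // No policy declared: accept only values that are already whole numbers.
        if (value == Math.rint(value) && !Double.isInfinite(value)) {
            return (int) value;
        }
        // Otherwise refuse to guess, because the choice is a business decision.
        throw new IllegalArgumentException("Expected integer value, got double: " + value);
    }

    public static void main(String[] args) {
        System.out.println(toInteger(0.2, CastInteger.FLOOR));   // 0: not diseased
        System.out.println(toInteger(0.2, CastInteger.CEILING)); // 1: diseased
        System.out.println(toInteger(0.2, CastInteger.ROUND));   // 0
        try {
            toInteger(0.2, null); // no rounding policy declared
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The same input 0.2 yields 0 or 1 depending on the declared policy, so an engine that picks a policy silently is choosing the diagnosis on the model author's behalf.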

In the current case, Zementis is making a business decision for you. You have no idea what its decisioning algorithm is, so you're effectively getting a random/black box business decision here.

Since we are running the PMML from the official
DMG website we would expect that corrections to
the PMML would not be needed.

The "Examples" section of the DMG.org website is full of invalid/incorrect PMML documents.