jpmml/jpmml-evaluator

Model verification fails for XGBoost models

ddiddi opened this issue · 1 comment

Following the steps in benchmarking.md, I'm trying to run a transpiled model called "test_model.jar":
java -jar pmml-evaluator-example-executable-1.6.4.jar --model test_model.jar --input input.csv --output --loop 100

I'm getting the following ModelEvaluator exception:
https://github.com/jpmml/jpmml-evaluator/blob/master/pmml-evaluator/src/main/java/org/jpmml/evaluator/ModelEvaluator.java#L262

Exception in thread "main" org.jpmml.evaluator.EvaluationException: Values 0.034122027 and 0.03412203 do not match
	at org.jpmml.evaluator.ModelEvaluator.verify(ModelEvaluator.java:262)
	at org.jpmml.evaluator.ModelEvaluator.verify(ModelEvaluator.java:218)
	at org.jpmml.evaluator.ModelEvaluator.verify(ModelEvaluator.java:60)
	at org.jpmml.evaluator.example.EvaluationExample.execute(EvaluationExample.java:351)
	at org.jpmml.evaluator.example.Example.execute(Example.java:95)
	at org.jpmml.evaluator.example.EvaluationExample.main(EvaluationExample.java:260)

How can I resolve this double precision issue?

The two values seem to be float32 values. They are close enough to be considered a "match" by a human.

JPMML-Evaluator computes the match/mismatch status following the instructions that are included in the PMML document. Check the values of your VerificationField@precision and VerificationField@zeroThreshold attributes:
https://dmg.org/pmml/v4-4-1/ModelVerification.html#xsdElement_VerificationField

This algorithm is very similar to that of the numpy.allclose utility function:
https://numpy.org/doc/stable/reference/generated/numpy.allclose.html
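
Roughly, the acceptance criterion boils down to the following (a simplified sketch based on my reading of the ModelVerification spec linked above, not the exact JPMML-Evaluator code):

	/**
	 * Simplified sketch of the PMML model verification acceptance criterion.
	 * Attribute semantics per the ModelVerification spec; not the exact
	 * JPMML-Evaluator implementation.
	 */
	static boolean acceptable(double expected, double actual, double precision, double zeroThreshold){
		if(Math.abs(expected) <= zeroThreshold){
			// The expected value is "practically zero" - compare against the absolute zeroThreshold
			return Math.abs(actual) <= zeroThreshold;
		}
		// Otherwise a relative tolerance applies, analogous to numpy.allclose's rtol
		double delta = precision * Math.abs(expected);
		return (actual >= (expected - delta)) && (actual <= (expected + delta));
	}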

If you agree that 0.034122027 and 0.03412203 are the same value, then please update the prediction acceptance criteria in your PMML document.

When working with XGBoost models, a relative error of 1E-7 is okay. You must tune the absolute error according to the magnitude of the predictions; something in the 1E-6 .. 1E-7 range should do.
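
For the pair from your stack trace, precision = 1E-7 would already be enough. In terms of the sketch above, acceptable(0.034122027, 0.03412203, 1E-7, 1E-6) returns true, because the absolute difference (about 3E-9) is within 1E-7 * 0.034122027 (about 3.4E-9).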