XGBoost conversion issue
Kawalierus opened this issue · 10 comments
I tried to recreate the XGBoost example from the README (https://github.com/jpmml/r2pmml#package-xgboost). Unfortunately, I receive the following error message:
SEVERE: Failed to convert RDS to PMML
java.lang.ClassCastException: org.jpmml.rexp.RStringVector cannot be cast to org.jpmml.rexp.RIntegerVector
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:283)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
Exception in thread "main" java.lang.ClassCastException: org.jpmml.rexp.RStringVector cannot be cast to org.jpmml.rexp.RIntegerVector
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:283)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
I noticed that in a similar issue (https://www.gitmemory.com/issue/jpmml/r2pmml/64/718139304) you suggested that the label should be converted to a factor. However, passing the label as a factor to the xgboost function causes this error: [17:10:50] amalgamation/../src/objective/multiclass_obj.cu:120: SoftmaxMultiClassObj: label must be in [0, num_class).
I initially tried to convert a Poisson count model and got an identical error (the prediction there is a frequency, so I do not see where strings would need to be converted to integers).
R version: 4.0.3
XGBoost version: 1.3.2.1
r2pmml version: 0.25.1
I would greatly appreciate support regarding this issue.
java.lang.ClassCastException: org.jpmml.rexp.RStringVector cannot be cast to org.jpmml.rexp.RIntegerVector
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:283)
This is line 283 of the XGBoostConverter source code:
https://github.com/jpmml/jpmml-r/blob/1.4.2/src/main/java/org/jpmml/rexp/XGBoostConverter.java#L283
As you can see, the converter expects the name attribute of the Feature Map (aka FMap) object to be an R factor. You've been giving it an R string instead.
To resolve this conversion error, simply change the type of the name attribute to R factor:
iris.fmap = as.fmap(iris.matrix)
# THIS!
iris.fmap$name = as.factor(iris.fmap$name)
These two issues are close relatives, because they both reflect improper typing of FMap attributes (name here, type there).
All FMap attributes must be R factors.
Leaving this issue open for now - the r2pmml::as.fmap utility function should always force-cast all FMap attributes to R factors before returning the result to the end user.
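Before calling r2pmml, it can help to check which FMap columns are still plain strings. The sketch below is only an illustration: it uses a hand-built data frame as a stand-in for the result of as.fmap (the column names id, name and type come from this thread; the values, including the type code "q", are assumptions), so it runs without r2pmml installed.

```r
# Stand-in for the result of as.fmap(iris.matrix).
# The real object may differ; the values here are assumed for illustration.
fmap <- data.frame(
  id = 0:3,
  name = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  type = rep("q", 4),
  stringsAsFactors = FALSE
)

# Report the class of every column; "character" columns are the ones
# the converter would see as R strings rather than R factors.
col_classes <- vapply(fmap, function(col) class(col)[1], character(1))
print(col_classes)

string_columns <- names(col_classes)[col_classes == "character"]
cat("Columns still typed as strings:", paste(string_columns, collapse = ", "), "\n")
```

Running the same vapply call on a real FMap object should show at a glance which attributes still need attention.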
Thank you very much for such a quick response.
Unfortunately, your advice has not resolved the issue in my case. Please note the code below:
library("xgboost")
library("r2pmml")
data(iris)
iris_X = iris[, 1:4]
iris_y = as.integer(iris[, 5]) - 1
iris.matrix = model.matrix(~ . - 1, data = iris_X)
iris.DMatrix = xgb.DMatrix(iris.matrix, label = iris_y)
iris.fmap = as.fmap(iris.matrix)
iris.fmap$name = as.factor(iris.fmap$name)
iris.xgb = xgboost(data = iris.DMatrix, missing = NULL, objective = "multi:softmax", num_class = 3, nrounds = 13)
r2pmml(iris.xgb, "iris_xgb.pmml", fmap = iris.fmap, response_name = "Species", response_levels = c("setosa", "versicolor", "virginica"), missing = NULL, ntreelimit = 7, compact = TRUE)
still results in the error:
SEVERE: Failed to convert RDS to PMML
java.lang.ClassCastException: org.jpmml.rexp.RStringVector cannot be cast to org.jpmml.rexp.RIntegerVector
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:284)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
Exception in thread "main" java.lang.ClassCastException: org.jpmml.rexp.RStringVector cannot be cast to org.jpmml.rexp.RIntegerVector
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:284)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
Exception in thread "main" java.lang.ClassCastException: org.jpmml.rexp.RStringVector cannot be cast to org.jpmml.rexp.RIntegerVector
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:284)
See - you fixed the type of the name attribute, and now the exception is happening one line later (284 instead of 283).
Please see my earlier resolution - "All FMap attributes must be R factors":
iris.fmap = as.fmap(iris.matrix)
iris.fmap$id = as.factor(iris.fmap$id)
iris.fmap$name = as.factor(iris.fmap$name)
iris.fmap$type = as.factor(iris.fmap$type)
My XGBoost example works fine with R 3.X.
Looks like R 4.X has botched matrix column types (were factors before, are strings now).
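One plausible explanation, not confirmed anywhere in this thread, is the change in R 4.0.0 where data.frame() stopped converting strings to factors by default (stringsAsFactors went from TRUE to FALSE). Whether as.fmap actually builds its result this way is an assumption; the sketch below only demonstrates the base R behaviour.

```r
feature_names <- c("Sepal.Length", "Sepal.Width")

# Default behaviour depends on the R version:
# factor on R 3.x, character on R >= 4.0.0.
default_df <- data.frame(name = feature_names)

# Requesting factors explicitly gives the same result on both versions.
explicit_df <- data.frame(name = feature_names, stringsAsFactors = TRUE)

cat("R version:", as.character(getRversion()), "\n")
cat("default class:", class(default_df$name), "\n")
cat("explicit class:", class(explicit_df$name), "\n")
```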
Thank you very much again. Apologies for not noticing this at first glance. We are making steady progress as I have reached another type of error by applying all your corrections.
Code:
library("xgboost")
library("r2pmml")
data(iris)
iris_X = iris[, 1:4]
iris_y = as.integer(iris[, 5]) - 1
iris.matrix = model.matrix(~ . - 1, data = iris_X)
iris.DMatrix = xgb.DMatrix(iris.matrix, label = iris_y)
iris.fmap = as.fmap(iris.matrix)
iris.fmap$id = as.factor(iris.fmap$id)
iris.fmap$name = as.factor(iris.fmap$name)
iris.fmap$type = as.factor(iris.fmap$type)
iris.xgb = xgboost(data = iris.DMatrix, missing = NULL, objective = "multi:softmax", num_class = 3, nrounds = 13)
r2pmml(iris.xgb, "iris_xgb.pmml", fmap = iris.fmap, response_name = "Species", response_levels = c("setosa", "versicolor", "virginica"), missing = NULL, ntreelimit = 7, compact = TRUE)
Error:
SEVERE: Failed to convert RDS to PMML
java.lang.IllegalArgumentException
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:295)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
Exception in thread "main" java.lang.IllegalArgumentException
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:295)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:265)
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:230)
at org.jpmml.rexp.XGBoostConverter.ensureFeatureMap(XGBoostConverter.java:209)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:66)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
java.lang.IllegalArgumentException
at org.jpmml.rexp.XGBoostConverter.loadFeatureMap(XGBoostConverter.java:295)
See for yourself:
https://github.com/jpmml/jpmml-r/blob/1.4.2/src/main/java/org/jpmml/rexp/XGBoostConverter.java#L295
Looks like it's not allowed to convert the id attribute to an R factor; leave it as an R integer.
iris.fmap = as.fmap(iris.matrix)
iris.fmap$name = as.factor(iris.fmap$name)
iris.fmap$type = as.factor(iris.fmap$type)
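The rule that emerges here (name and type as factors, id left as an integer) could be wrapped in a small helper. This is a minimal sketch, not part of r2pmml: fix_fmap_types is a hypothetical name, and the data frame is a hand-built stand-in for the result of as.fmap with assumed values.

```r
# Hypothetical helper (not an r2pmml function): cast name and type to
# factors and refuse an id column that was already turned into a factor.
fix_fmap_types <- function(fmap) {
  stopifnot(is.data.frame(fmap), all(c("id", "name", "type") %in% names(fmap)))
  if (is.factor(fmap$id)) {
    stop("id must stay an integer column; rebuild the fmap instead of casting it")
  }
  fmap$name <- as.factor(fmap$name)
  fmap$type <- as.factor(fmap$type)
  fmap
}

# Stand-in for as.fmap(iris.matrix); values are assumed for illustration.
raw_fmap <- data.frame(
  id = 0:3,
  name = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  type = rep("q", 4),
  stringsAsFactors = FALSE
)

fixed_fmap <- fix_fmap_types(raw_fmap)
str(fixed_fmap)
```

With the real packages loaded, the equivalent call would be fix_fmap_types(as.fmap(iris.matrix)), and the result would be passed to r2pmml through the fmap argument as in the code above.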
And we have reached another one:
SEVERE: Failed to convert RDS to PMML
java.lang.IllegalArgumentException: 1730313018.1919250021
at org.jpmml.xgboost.Learner.load(Learner.java:82)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:93)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:57)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:45)
at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:309)
at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:242)
at org.jpmml.rexp.XGBoostConverter.ensureLearner(XGBoostConverter.java:218)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:80)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
Exception in thread "main" java.lang.IllegalArgumentException: 1730313018.1919250021
at org.jpmml.xgboost.Learner.load(Learner.java:82)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:93)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:57)
at org.jpmml.xgboost.XGBoostUtil.loadLearner(XGBoostUtil.java:45)
at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:309)
at org.jpmml.rexp.XGBoostConverter.loadLearner(XGBoostConverter.java:242)
at org.jpmml.rexp.XGBoostConverter.ensureLearner(XGBoostConverter.java:218)
at org.jpmml.rexp.XGBoostConverter.encodeSchema(XGBoostConverter.java:80)
at org.jpmml.rexp.ModelConverter.encodePMML(ModelConverter.java:70)
at org.jpmml.rexp.Converter.encodePMML(Converter.java:39)
at org.jpmml.rexp.Main.run(Main.java:149)
at org.jpmml.rexp.Main.main(Main.java:97)
java.lang.IllegalArgumentException: 1730313018.1919250021
Duplicate of jpmml/jpmml-xgboost#54
TLDR: XGBoost 1.3.X switched model saving data format from binary to JSON.
Two solutions:
- Instruct XGBoost 1.3.X to save the model in the binary data format.
- Downgrade XGBoost to 1.2.X.
@Kawalierus I've released R2PMML version 0.25.2 to GitHub, which is able to convert XGBoost 1.3(.3) models now.