StackOverflowError

Question

Closed this issue 9 years ago · 15 comments

I am trying to convert a random forest model from pkl to PMML, and I get a stack overflow error. I can convert the regression version of the same model without any problem. Attached are the pkl files for the regression and random forest models, and the mapper.

Model 1.zip

Exception in thread "main" java.lang.StackOverflowError
at java.lang.StrictMath.floorOrCeil(StrictMath.java:355)
at java.lang.StrictMath.floor(StrictMath.java:340)
at java.lang.Math.floor(Math.java:424)
at sun.misc.FloatingDecimal.dtoa(FloatingDecimal.java:629)
at sun.misc.FloatingDecimal.<init>(FloatingDecimal.java:468)
at java.lang.Double.toString(Double.java:196)
at org.jpmml.converter.PMMLUtil.formatValue(PMMLUtil.java:387)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:82)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:97)

Answer 1 · 2016-03-07T17:09:50.000Z

Very nice - I can reproduce the StackOverflowError using your example files. Will investigate and fix it in the upcoming JPMML-SkLearn version that will be released either later today or tomorrow.

I suspect that Scikit-Learn has changed something about the encoding of random forest models. I've tested with Scikit-Learn versions 0.16.0 through 0.17.1. What's your Scikit-Learn version?

import sklearn
print(sklearn.__version__)

Answer 2 · 2016-03-07T17:16:35.000Z

Thank you very much. The version is 0.17.

Answer 3 · 2016-03-07T18:00:32.000Z

This looks like a legitimate StackOverflowError, because the first member tree model in your random forest model is over 2000 levels deep. That's highly unusual.

How was your sklearn.ensemble.RandomForestRegressor instance parametrized? You should set the value of max_depth parameter to some sensible value such as 100.
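To check how deep a member tree really is, the depth can be computed from the child arrays. The following is only a minimal sketch: it assumes the parallel-array layout that Scikit-Learn exposes as tree_.children_left and tree_.children_right, with -1 marking a leaf, and it uses made-up lists rather than a real model. The function name tree_depth is hypothetical. On a fitted estimator the tree_.max_depth attribute should report the same number directly. The sketch assumes well-formed arrays and would not terminate if the arrays contained a cycle.

```python
# Sketch: depth of a decision tree stored as parallel child arrays.
# Assumption: a child index of -1 marks a leaf, as in scikit-learn's tree_ arrays.
LEAF = -1

def tree_depth(children_left, children_right, root=0):
    """Return the number of edges on the longest root-to-leaf path.

    An explicit stack is used instead of recursion, so a very deep tree
    cannot exhaust the interpreter's call stack.
    """
    max_depth = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        max_depth = max(max_depth, depth)
        for child in (children_left[node], children_right[node]):
            if child != LEAF:
                stack.append((child, depth + 1))
    return max_depth

# Made-up example: node 0 splits into 1 and 2, node 2 splits into 3 and 4.
left = [1, -1, 3, -1, -1]
right = [2, -1, 4, -1, -1]
print(tree_depth(left, right))
```

A depth in the thousands for a small training set would suggest that the arrays are corrupt rather than that the model is genuinely that deep.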

Answer 4 · 2016-03-07T20:37:05.000Z

There's a related issue, where a StackOverflowError happens when converting a random forest model that has been trained using the Iris dataset. It should be impossible to train a 2000-level deep tree model using a dataset that contains only 150 training instances.

jpmml/sklearn2pmml#4

Answer 5 · 2016-03-08T09:16:29.000Z

Thank you very much for your prompt response. I have set max_depth to 100 and I am still getting the error. My Java version is 1.7.0_79.

Answer 6 · 2016-03-08T09:25:32.000Z

I have also tested it with Oracle Java 1.8.0_40.

Answer 7 · 2016-03-08T09:28:45.000Z

The error however has changed to:

Exception in thread "main" java.lang.StackOverflowError
at sun.misc.FDBigInteger.leftShift(FDBigInteger.java:511)
at sun.misc.FDBigInteger.valueOfMulPow52(FDBigInteger.java:324)
at sun.misc.FloatingDecimal$BinaryToASCIIBuffer.dtoa(FloatingDecimal.java:714)
at sun.misc.FloatingDecimal$BinaryToASCIIBuffer.access$100(FloatingDecimal.java:259)
at sun.misc.FloatingDecimal.getBinaryToASCIIConverter(FloatingDecimal.java:1785)
at sun.misc.FloatingDecimal.getBinaryToASCIIConverter(FloatingDecimal.java:1738)
at sun.misc.FloatingDecimal.toJavaFormatString(FloatingDecimal.java:70)
at java.lang.Double.toString(Double.java:204)
at org.jpmml.converter.ValueUtil.formatValue(ValueUtil.java:118)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:81)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:96)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:96)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:96)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:96)
at sklearn.tree.TreeModelUtil.encodeNode(TreeModelUtil.java:96)

which is the same as
https://github.com/jpmml/sklearn2pmml/issues/4

Which Java version should I use?

Answer 8 · 2016-03-08T09:50:41.000Z

You probably can't solve the issue simply by using a different Java version.

The problem is more fundamental, and appears to be an unpickling error (which is manifested on some Java versions, and not on others) or something like that. As a result, we have a situation where the unpickled Scikit-Learn data contains (invalid-) cross-references, which make the TreeModelUtil#encodeNode jump back and forth between two nodes, until the JVM dies with a StackOverflowError.
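The kind of invalid cross-reference described above can be illustrated with a small check. This is a sketch under assumptions, not the converter's actual logic: it works on plain lists that mimic the children_left and children_right arrays of a Scikit-Learn tree (with -1 for a leaf), and the function name find_revisited_node is hypothetical. In a proper tree every node is reached exactly once from the root, so reaching a node twice means the arrays do not describe a tree.

```python
# Sketch: detect child arrays that do not form a proper tree.
# Assumption: -1 marks a leaf, as in scikit-learn's tree_ arrays.
LEAF = -1

def find_revisited_node(children_left, children_right, root=0):
    """Return the first node index reached twice, or None for a proper tree."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            # A second visit means a cycle or a node shared by two parents.
            return node
        seen.add(node)
        for child in (children_left[node], children_right[node]):
            if child != LEAF:
                stack.append(child)
    return None

# Well-formed: node 0 has the leaves 1 and 2 as children.
print(find_revisited_node([1, -1, -1], [2, -1, -1]))

# Corrupt: the left child of node 1 points back to node 0.
print(find_revisited_node([1, 0, -1], [2, -1, -1]))
```

A recursive encoder that follows such a back-reference never reaches a leaf, which would match the repeated encodeNode frames in the stack trace.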

Answer 9 · 2016-03-08T10:40:40.000Z

How were the example pickle files in the Model1.zip file generated? I am unable to unpickle them for closer inspection using either sklearn.externals.joblib or pickle modules:

>>> from sklearn.externals import joblib
>>> forest = joblib.load("pp_model_1_forest.pkl")

Traceback (most recent call last):
  File "load_joblib.py", line 3, in <module>
    forest = joblib.load("pp_model_1_forest.pkl")
  File "/usr/lib/python3.4/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 459, in load
    obj = unpickler.load()
  File "/usr/lib64/python3.4/pickle.py", line 1038, in load
    dispatch[key[0]](self)
  File "/usr/lib64/python3.4/pickle.py", line 1384, in load_reduce
    value = func(*args)
  File "sklearn/tree/_tree.pyx", line 579, in sklearn.tree._tree.Tree.__cinit__ (sklearn/tree/_tree.c:6774)
ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'int'

and

>>> import pickle
>>> forest = pickle.load(open("pp_model_1_forest.pkl", "rb"))

Traceback (most recent call last):
  File "load_pickle.py", line 3, in <module>
    forest = pickle.load(open("pp_model_1_forest.pkl", "rb"))
_pickle.UnpicklingError: invalid load key, 'Z'.
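The 'Z' load key hints that the file does not start like a plain pickle. A quick look at the first bytes can tell the two apart. The sketch below rests on two assumptions: that older joblib releases prefix zlib-compressed dumps with the bytes ZF (which would explain the 'Z'), and that a plain pickle written with protocol 2 or later starts with the byte 0x80. The function name describe_header and the demo file names are hypothetical.

```python
import os
import pickle
import tempfile

def describe_header(path):
    """Classify a dump file by its first bytes (heuristic only)."""
    with open(path, "rb") as f:
        head = f.read(2)
    if head == b"ZF":
        # Assumption: legacy joblib prefix for zlib-compressed files.
        return "joblib-zfile"
    if head[:1] == b"\x80":
        # PROTO opcode that opens pickles written with protocol 2 or later.
        return "pickle"
    return "unknown"

demo_dir = tempfile.mkdtemp()

# A plain pickle written with protocol 2.
plain_path = os.path.join(demo_dir, "plain.pkl")
with open(plain_path, "wb") as f:
    pickle.dump({"a": 1}, f, protocol=2)

# A fake file that only imitates the assumed joblib prefix.
fake_path = os.path.join(demo_dir, "fake_joblib.pkl")
with open(fake_path, "wb") as f:
    f.write(b"ZF0x10")

print(describe_header(plain_path))
print(describe_header(fake_path))
```

If the file is reported as a joblib dump, pickle.load is expected to fail on it regardless of the platform, so only the joblib.load error is informative here.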

Answer 10 · 2016-03-08T11:00:42.000Z

test.zip
I get the same error when loading the pickle, even for the provided Iris example (see test.zip). I have also included the compiled JAR file. So maybe the problem is in the joblib dump of the random forest, not in the converter?

def store_pkl(obj, name):
    joblib.dump(obj, "pkl/" + name, compress = 9)

Answer 11 · 2016-03-08T11:09:35.000Z

The JPMML-SkLearn library should be able to consume the following dumps:

1. sklearn.externals.joblib
2. joblib
3. pickle

Option 1 is recommended by the Scikit-Learn documentation (e.g. see http://scikit-learn.org/stable/modules/model_persistence.html). However, this module may be outdated and/or out of sync with other modules.

You could try dumping the RF object manually using options 2 and 3, and use the JPMML-SkLearn command-line application to do the conversion.
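A manual dump with the standard pickle module could look roughly like this. It is a sketch, not a prescribed procedure: the helper name dump_plain_pickle, the file name, and the choice of protocol 2 (picked here because both Python 2 and Python 3 can write it) are assumptions, and a plain dict stands in for the fitted random forest.

```python
import os
import pickle
import tempfile

def dump_plain_pickle(obj, path, protocol=2):
    """Write obj with the standard pickle module, without joblib compression."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=protocol)

# Stand-in for a fitted model; any picklable object works for this demonstration.
model = {"kind": "placeholder", "n_estimators": 10}

path = os.path.join(tempfile.mkdtemp(), "model.pkl")
dump_plain_pickle(model, path)

# Load the file back to confirm that the dump is readable.
with open(path, "rb") as f:
    restored = pickle.load(f)
print(restored == model)
```

Loading the file back in the same Python session is a cheap sanity check before handing it to the command-line application.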

Answer 12 · 2016-03-08T13:24:31.000Z

I have tested all methods for dumping the .pkl files. I still get a StackOverflowError, even with the Iris data. The log file is in the attached file.
test.zip

I use Python 2.7 32bit (Anaconda).

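This is the code for the attached Model 1.zip:

from sklearn.externals import joblib
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

def store_pkl(obj, name):
    joblib.dump(obj, "pkl/" + name, compress = 9)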

pp_model_regression = LinearRegression()
pp_model_regression.fit(pp_X, pp_y)

pp_model_forest = RandomForestRegressor(max_depth = 100, min_samples_leaf = 5)
pp_model_forest.fit(pp_X, pp_y)

store_pkl(pp_mapper, "pp_mapper_1.pkl")
store_pkl(pp_model_regression, "pp_model_1_regression.pkl")
store_pkl(pp_model_forest, "pp_model_1_forest.pkl")

You should be able to load them with joblib. Can you please try again? I tried different Java versions as well, so I am really confused.

Answer 13 · 2016-03-08T13:51:42.000Z

"I use Python 2.7 32bit (Anaconda)"

This could be a 32-bit vs. 64-bit compatibility issue.

I'm running a 64-bit OS, and the JPMML-SkLearn project has been tested against 64-bit versions of Python2(.7) and Python3(.4).

My unpickling error message (ValueError: Buffer dtype mismatch, expected 'SIZE_t' but got 'int') fits perfectly into this picture, as for me SIZE_t is long, not int.
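Whether a given Python interpreter is a 32-bit or a 64-bit build can be checked with the standard library alone. This is a small sketch: the function name pointer_bits is hypothetical, and the link to SIZE_t rests on the assumption that Scikit-Learn defines it as a pointer-sized integer.

```python
import struct
import sys

def pointer_bits():
    """Return the pointer width of the running interpreter, in bits."""
    # "P" is the struct format code for a native pointer (void *).
    return struct.calcsize("P") * 8

# sys.maxsize exceeds 2**32 only on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print(pointer_bits(), is_64bit)
```

Running this in both the environment that produced the pickle and the one that consumes it shows whether the two sides agree on the integer width.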

Answer 14 · 2016-03-08T14:42:18.000Z

Fixed! Thank you very much for all your help. The problem was an incompatibility between 32-bit Python and 64-bit Java.

Answer 15 · 2016-03-08T15:01:21.000Z

Closing this issue in favour of the following one: jpmml/jpmml-sklearn#6