Question: How to use the most specialized data structure and data type
Dear All,
In Section IV.A of your paper "Crack Random Forest for Arbitrary Large Datasets" you write that "The binning operation is needed both for improving the computational performance [9], [14] and for storing the dataset in-memory in an efficient manner using the most specialized data structure and data type."
This is exactly the trick I need to exploit. Could you please provide an example?
Thank you in advance for your help.
Best,
Federica
Dear Federica,
We have updated the repository to address the questions you raised.
To use the most specialised data structure, build the predictor as follows:
val rfRunner = ReForeStTrainerBuilder.apply(new TypeInfoDouble(), new TypeInfoByte(), property).build(sc)
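As a rough illustration of why this saves memory, here is a minimal sketch of binning. It assumes that the second type argument (TypeInfoByte) selects the type used to store the binned values; that reading is an assumption, not something confirmed by the library's documentation. The sketch is not the ReForeSt implementation: it uses simple equal-width bins for brevity, the library may choose bin boundaries differently, and the names BinningSketch and binColumn are hypothetical.

```scala
// Illustrative sketch only: NOT the ReForeSt implementation.
// All names here (BinningSketch, binColumn) are hypothetical.
object BinningSketch {

  /** Map each Double in a feature column to an equal-width bin index stored as a Byte. */
  def binColumn(values: Array[Double], numBins: Int): Array[Byte] = {
    require(numBins >= 1 && numBins <= 128, "bin indices 0..numBins-1 must fit in a signed Byte")
    require(values.nonEmpty, "column must not be empty")
    val min = values.min
    val max = values.max
    val width = (max - min) / numBins
    values.map { v =>
      // A constant column has zero width, so every value goes to bin 0.
      val idx = if (width == 0.0) 0 else ((v - min) / width).toInt
      // The maximum value would land in bin numBins, so clamp it to the last bin.
      math.min(idx, numBins - 1).toByte
    }
  }

  def main(args: Array[String]): Unit = {
    val column = Array(0.0, 2.5, 5.0, 7.5, 10.0)
    val binned = binColumn(column, numBins = 4)
    println(binned.mkString(", ")) // prints "0, 1, 2, 3, 3"
  }
}
```

Each element of an Array[Double] takes 8 bytes, while each element of an Array[Byte] takes 1 byte, so the binned column is roughly eight times smaller, at the cost of limiting the number of distinct bins to what a Byte can index.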
I hope it helps!
--Luca
Dear Luca,
thank you very much for your prompt reply.
The solution is exactly what I was searching for.
Thank you again for your great library and work!
Federica