JuliaAI/DecisionTree.jl

Some minor differences in random forest implementations

tecosaur opened this issue · 0 comments

I've been comparing some random forest implementations recently (https://github.com/tecosaur/TreeComparison). One result of that is #159, but I also have some other information that may be of interest.

For starters, here's the colour coding I use:
image

Error rates mostly converge across the implementations I tested, though ranger sometimes does slightly better:
image

image

Precision-recall and ROC curves generally look near-identical, as they should.
image

I've also noticed some larger differences in the depth and size of the trees each implementation builds. Across a number of datasets, DecisionTree.jl and randomForest produce narrower, deeper trees than ranger and sklearn.

image

image

image

image
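
For anyone wanting to reproduce the depth/size comparison, here is a minimal sketch of how the two quantities can be measured on a single tree. It is not taken from the TreeComparison repository, and the function name `tree_shape` and the two toy trees are made up for illustration. It assumes each tree is stored as a pair of child-index arrays with `-1` marking "no child", which is the layout scikit-learn exposes as `estimator.tree_.children_left` and `estimator.tree_.children_right`; other libraries would need their own adapter.

```python
from typing import Sequence, Tuple

LEAF = -1  # sentinel for "no child" (assumed; matches scikit-learn's layout)


def tree_shape(
    children_left: Sequence[int], children_right: Sequence[int]
) -> Tuple[int, int, int]:
    """Return (depth, node_count, leaf_count) for a binary tree rooted at node 0."""
    depth = 0
    node_count = 0
    leaf_count = 0
    stack = [(0, 0)]  # (node index, depth of that node); the root has depth 0
    while stack:
        node, d = stack.pop()
        node_count += 1
        depth = max(depth, d)
        left, right = children_left[node], children_right[node]
        if left == LEAF and right == LEAF:
            leaf_count += 1
            continue
        if left != LEAF:
            stack.append((left, d + 1))
        if right != LEAF:
            stack.append((right, d + 1))
    return depth, node_count, leaf_count


# Two hypothetical 7-node trees with the same size but different shapes.
balanced_left = [1, 3, 5, -1, -1, -1, -1]
balanced_right = [2, 4, 6, -1, -1, -1, -1]
chain_left = [1, -1, 3, -1, 5, -1, -1]
chain_right = [2, -1, 4, -1, 6, -1, -1]

print(tree_shape(balanced_left, balanced_right))
print(tree_shape(chain_left, chain_right))
```

Both toy trees have seven nodes and four leaves, but the second is one level deeper. That is the kind of "narrower/deeper" difference described above, so it can help to report depth alongside node or leaf counts rather than either one alone.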