Yu-Group/iterative-Random-Forest

A potential small bug

chenfork opened this issue · 2 comments

Hey, when I was doing some tests earlier today, I found this potential bug in get_tree_data().

node_features_idx = all_features_idx[np.array(node_features_raw_idx)]

node_features_raw_idx is just tree_.feature, that much is clear. For leaf nodes, however, the feature value is TREE_UNDEFINED, which is -2.
So if you index all_features_idx with -2, NumPy treats it as a negative index and returns the second-to-last feature id for the leaf node instead of -2.
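
To make the behaviour concrete, here is a minimal sketch with a made-up tree. The 5-feature setup and the node layout are assumptions for illustration only, and TREE_UNDEFINED is defined locally rather than imported from scikit-learn:

```python
import numpy as np

# Sentinel that scikit-learn stores in tree_.feature for leaf nodes.
TREE_UNDEFINED = -2

# Hypothetical tree over 5 features: two split nodes followed by three leaves.
all_features_idx = np.arange(5)
node_features_raw_idx = np.array(
    [3, 0, TREE_UNDEFINED, TREE_UNDEFINED, TREE_UNDEFINED]
)

# Same indexing pattern as the line quoted above.
node_features_idx = all_features_idx[node_features_raw_idx]

# -2 is read as "second to last", so each leaf is reported as feature 3.
print(node_features_idx)  # [3 0 3 3 3]
```

After this lookup, the leaves can no longer be told apart from real splits on feature 3 by looking at node_features_idx alone.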

Luckily, I see that you remove the leaf nodes in the code that follows, so I suppose this does not cause a bug in practice.
Still, I wonder whether it is necessary to use all_features_idx rather than node_features_raw_idx directly.
Thanks!

I am not really sure about this; this part of the code is not used very often. Given the definition of all_features_idx, the extra lookup does not seem to be necessary.
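
A rough sketch of why the lookup would be redundant, assuming all_features_idx is simply the range 0..n_features-1 (that definition is not shown in this thread, so treat it as an assumption; the feature count and node layout are also made up):

```python
import numpy as np

TREE_UNDEFINED = -2  # leaf sentinel used in tree_.feature

n_features = 5                            # hypothetical feature count
all_features_idx = np.arange(n_features)  # assumed definition, not confirmed here
node_features_raw_idx = np.array([3, 0, TREE_UNDEFINED, 1, TREE_UNDEFINED])

# Select the split (non-leaf) nodes before doing any lookup.
is_split = node_features_raw_idx != TREE_UNDEFINED
split_features = node_features_raw_idx[is_split]

# Under the assumption above, the lookup is the identity on split nodes,
# so the raw indices can be used directly.
assert np.array_equal(all_features_idx[split_features], split_features)
```

Masking the leaves before any indexing also avoids the -2 wrap-around, regardless of how all_features_idx is defined.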

That makes sense, thanks!