Clustering features

In the line

Line 418 in fa76cb5

tmp = np.concatenate( (self.X, self.fX.reshape([-1, 1]) ), axis = 1 )

the clustering in nodes is done on both x and f(x), whereas the paper says that clustering is performed on f(x). Could you please clarify: is f(x) treated as just another feature alongside x for clustering?

Thank you. Yes, f(x) is included; otherwise we would split by x alone. Here we want to split x based on f(x), so the joint features [x, f(x)] are needed. I will clarify that in the final revision.
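For readers following along, here is a minimal sketch of what clustering on the joint features [x, f(x)] could look like. It is not the repository's code: the function name `two_means_labels`, the deterministic initialisation, and the plain NumPy Lloyd iteration are all assumptions, standing in for whichever K-means implementation the project actually uses.

```python
import numpy as np

def two_means_labels(X, fX, n_iter=50):
    """Assign each sample to one of two clusters using [x, f(x)] as features."""
    # Same construction as the quoted line: append f(x) as an extra column.
    feats = np.concatenate((X, fX.reshape([-1, 1])), axis=1).astype(float)
    # Assumed initialisation: start the two centers at the samples with the
    # smallest and the largest f(x), so the result is deterministic.
    centers = feats[[int(np.argmin(fX)), int(np.argmax(fX))]].copy()
    labels = np.zeros(len(feats), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance of every sample to both centers: (n, 2).
        dist = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of the samples assigned to it.
        for k in range(2):
            if np.any(labels == k):
                centers[k] = feats[labels == k].mean(axis=0)
    return labels
```

Because f(x) is one of the columns, samples with similar x but very different f(x) can end up in different clusters, which would not happen when clustering on x alone.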

However, there is a possible improvement here. Instead of using K-means to label samples into two groups, we can learn a regressor R on [x, f(x)]. The split then becomes: samples with R(x) > f(x) go to the good partition, and samples with R(x) < f(x) go to the bad partition.
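A rough sketch of the regressor-based split described above, under stated assumptions: R is taken to be an ordinary least-squares linear model (the thread does not say which regressor is intended), the name `regressor_split` is hypothetical, and ties R(x) = f(x), which the rule above leaves open, are sent to the bad partition here.

```python
import numpy as np

def regressor_split(X, fX):
    """Split samples by comparing a fitted regressor R(x) with observed f(x)."""
    # Assumed regressor: linear least squares, R(x) = w . x + b.
    A = np.concatenate((X, np.ones((len(X), 1))), axis=1)
    coef, *_ = np.linalg.lstsq(A, fX, rcond=None)
    pred = A @ coef
    # Rule as stated in the thread: R(x) > f(x) goes to the good partition.
    # Ties (R(x) == f(x)) fall into the bad partition; that is an assumption.
    good = pred > fX
    return good, pred
```

Calling `good, pred = regressor_split(X, fX)` returns a boolean mask over the samples, so `X[good]` and `X[~good]` give the two partitions.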

Thanks!