IBM/differential-privacy-library

Implementation of Decision Tree

haaksb opened this issue · 1 comments

Hi

I have a question about the implementation of private decision trees. You have a predict_proba function that should return the probability of each class. However, in the paper you cite for constructing differentially private decision trees, "Differentially Private Random Decision Forests using Smooth Sensitivity", the only query mentioned is for the majority class label. This is seen, for instance, in Algorithm 1 of the paper and in the abstract. Is the predict_proba method differentially private?

Sorry for just posting a question in the Issues area, but I could not find anywhere else to ask about it.

In the Random Forest Classifier, each decision tree in the forest is trained with DP. The output of each tree, and of any of its methods, therefore satisfies DP, and so does the output of the forest (since each tree is itself DP). The predict_proba method only aggregates the outputs of the individual trees, so it is DP by the post-processing guarantee.
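To make the post-processing argument concrete, here is a minimal sketch in plain Python. It is not the library's code and not the mechanism from the cited paper: it assumes, purely for illustration, that each leaf releases its class counts through the Laplace mechanism with sensitivity 1, and every function name in it is hypothetical. The point is only that once the noisy counts are released, both class probabilities and the majority label are computed from them without touching the training data again. How the privacy budget is divided across trees is outside the scope of the sketch.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with rate 1/scale
    # is Laplace-distributed with location 0 and the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_noisy_counts(true_counts, epsilon, rng):
    """The only step that reads private data (hypothetical helper).

    Assumes adding or removing one record changes exactly one class
    count by 1, so Laplace noise of scale 1/epsilon gives epsilon-DP.
    """
    return [c + laplace_noise(1.0 / epsilon, rng) for c in true_counts]


def counts_to_proba(noisy_counts):
    """Post-processing: clip negative counts and normalise."""
    clipped = [max(c, 0.0) for c in noisy_counts]
    total = sum(clipped)
    if total == 0:
        # No usable signal in this leaf: fall back to a uniform guess.
        return [1.0 / len(clipped)] * len(clipped)
    return [c / total for c in clipped]


def forest_predict_proba(per_tree_noisy_counts):
    """Post-processing: average the per-tree class probabilities."""
    probas = [counts_to_proba(c) for c in per_tree_noisy_counts]
    n_trees = len(probas)
    n_classes = len(probas[0])
    return [sum(p[k] for p in probas) / n_trees for k in range(n_classes)]


def forest_predict(per_tree_noisy_counts):
    """Post-processing: majority label is the argmax of the averaged probabilities."""
    proba = forest_predict_proba(per_tree_noisy_counts)
    return max(range(len(proba)), key=proba.__getitem__)


rng = random.Random(0)
# Hypothetical class counts in the leaf that one query reaches in each of 3 trees.
leaf_counts = [[12, 3], [8, 5], [10, 1]]
released = [release_noisy_counts(c, epsilon=1.0, rng=rng) for c in leaf_counts]

proba = forest_predict_proba(released)  # uses only the released values
label = forest_predict(released)        # uses only the released values
```

Only `release_noisy_counts` takes the true counts as input. Everything after it is a function of values that are already private, so it cannot weaken the guarantee, whether it returns a label or a probability vector.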