BiasedRandomForest: A Jupyter Notebook repository from zhangxiaowbl

An implementation of the Biased Random Forest (BRAF) algorithm, from Bader-El-Den 2019 (IEEE).

To run the pipeline on the PIMA dataset, run python run_pipeline.py from the command line.

Credit Tamar Melman, 2020

This repository contains three files:

ml_utils.py: utility functions that calculate metrics of interest for evaluating ML algorithms
randomforest.py: script defining the DecisionTree, RandomForest, and BiasedRandomForest implementations
run_pipeline.py: script that runs the entire analysis pipeline, training the model and outputting metrics
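
As a rough illustration of the kind of metric that matters for imbalanced data, here is a minimal sketch that computes precision, recall, and F1 for the positive class. The function name binary_metrics and its signature are hypothetical and are not taken from ml_utils.py; the actual utilities in this repository may compute different metrics or use a different interface.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, and F1 for the positive class.

    Hypothetical helper for illustration only; not the ml_utils.py API.
    """
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels, made up for this example.
    print(binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
```

Accuracy alone can look good on imbalanced data even when the minority class is rarely detected, which is why per-class precision and recall are usually reported alongside it.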

BiasedRandomForest is for demonstration purposes and is not recommended for ML applications. For imbalanced data, I would recommend one of the following approaches:

Use a weighted Random Forest
Modify the algorithm to pick a balanced subset of the majority class
Use SMOTE upsampling with a standard RandomForest
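
The second option in the list above can be sketched roughly as follows: before each tree is trained, draw a bootstrap sample in which every class contributes as many rows as the smallest class. This is only one way to interpret "a balanced subset", and the function name balanced_bootstrap is hypothetical; it is not part of randomforest.py.

```python
import random


def balanced_bootstrap(X, y, rng=None):
    """Draw a class-balanced bootstrap sample of (X, y).

    Hypothetical helper for illustration only. Each class contributes
    n_min rows, sampled with replacement, where n_min is the size of
    the smallest class.
    """
    rng = rng or random.Random()

    # Group row indices by class label.
    by_class = {}
    for i, label in enumerate(y):
        by_class.setdefault(label, []).append(i)

    n_min = min(len(indices) for indices in by_class.values())

    # Sample the same number of rows from every class.
    sample = []
    for indices in by_class.values():
        sample.extend(rng.choices(indices, k=n_min))
    rng.shuffle(sample)

    return [X[i] for i in sample], [y[i] for i in sample]


if __name__ == "__main__":
    # Toy data, made up for this example: 8 majority rows, 2 minority rows.
    X = [[i] for i in range(10)]
    y = [0] * 8 + [1] * 2
    Xb, yb = balanced_bootstrap(X, y, rng=random.Random(0))
    print(yb.count(0), yb.count(1))
```

Calling this once per tree, instead of bootstrapping the full training set, means each tree sees the classes in equal proportion while the forest as a whole still covers most of the majority class.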
