
Implementing the Boxdrawing classification for imbalanced data


Boxdrawing in R

An implementation of the Boxdrawing algorithms as an R package, written for my Master's thesis.

This package contains the two box drawing algorithms for classification (Exactbox and Fastbox) and several helper functions. Credit for the original MATLAB code goes to Cynthia Rudin and Siong Thye Goh and their 2014 paper.

Exactbox requires Gurobi and its R API.
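
Because Exactbox depends on the Gurobi R interface, it can be useful to check for it before calling the solver. The following is a minimal sketch, assuming the interface is installed as an R package named gurobi; the helper has_gurobi is hypothetical and not part of this package.

```r
# Hypothetical helper (not part of boxdrawing): returns TRUE if the
# 'gurobi' R package can be loaded, FALSE otherwise.
has_gurobi <- function() {
  requireNamespace("gurobi", quietly = TRUE)
}

if (!has_gurobi()) {
  message("Gurobi R interface not found; only Fastbox can be used.")
}
```

Fastbox does not need Gurobi, so it remains usable when this check fails.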

Installation

Install the newest version right from GitHub.

library("devtools")
install_github("hendrikpfaff/boxdrawing")

Usage

Exactbox

The Exactbox algorithm builds a mixed-integer programming (MIP) model and tries to solve it with Gurobi.

Fastbox

The Fastbox algorithm uses a heuristic approach: it first characterizes the positive class and then discriminates it from the negative class to place its boxes.
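
To give a feel for the characterization step, here is a rough sketch. It assumes that characterizing means clustering the positive examples and enclosing each cluster in an axis-aligned bounding box. The toy data, the choice of k-means, and all variable names are assumptions made for illustration. This is not the package's implementation, and the discrimination step (adjusting each boundary against nearby negative examples) is not shown.

```r
set.seed(1)

# Toy positive-class data: two well-separated groups in 2 dimensions.
pos <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

# Cluster the positive examples (k is chosen by hand here).
k  <- 2
cl <- kmeans(pos, centers = k, nstart = 10)$cluster

# One bounding box per cluster: rows are boxes, columns are dimensions.
lower <- t(sapply(seq_len(k), function(j) apply(pos[cl == j, , drop = FALSE], 2, min)))
upper <- t(sapply(seq_len(k), function(j) apply(pos[cl == j, , drop = FALSE], 2, max)))

print(lower)
print(upper)
```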

Both algorithms return a list of 13 different elements:

  • execTime - Execution time of the function in seconds.
  • colNum - Number of features in the data set.
  • trainingTP - Number of true positive classifications in the training data for every tradeoff parameter.
  • trainingFP - Number of false positive classifications in the training data for every tradeoff parameter.
  • trainingTN - Number of true negative classifications in the training data for every tradeoff parameter.
  • trainingFN - Number of false negative classifications in the training data for every tradeoff parameter.
  • testingTP - Number of true positive classifications in the testing data for every tradeoff parameter.
  • testingFP - Number of false positive classifications in the testing data for every tradeoff parameter.
  • testingTN - Number of true negative classifications in the testing data for every tradeoff parameter.
  • testingFN - Number of false negative classifications in the testing data for every tradeoff parameter.
  • tradeoff - All tradeoff parameters that were used.
  • lowerIdeal - The lower boundary of every dimension per box.
  • upperIdeal - The upper boundary of every dimension per box.
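
The confusion counts above are enough to derive the usual metrics for imbalanced data. The sketch below assumes each count field is a numeric vector with one entry per tradeoff parameter. It uses a hand-made mock list rather than real package output, and the helper testing_metrics is hypothetical.

```r
# Mock result with the count fields described above; the numbers are
# made up for illustration and are not output of the package.
res <- list(
  tradeoff  = c(0.1, 1, 10),
  testingTP = c(8, 12, 15),
  testingFP = c(1, 4, 10),
  testingTN = c(99, 96, 90),
  testingFN = c(12, 8, 5)
)

# Hypothetical helper: precision, recall and F1 on the testing data,
# one row per tradeoff parameter.
testing_metrics <- function(res) {
  precision <- res$testingTP / (res$testingTP + res$testingFP)
  recall    <- res$testingTP / (res$testingTP + res$testingFN)
  f1        <- 2 * precision * recall / (precision + recall)
  data.frame(tradeoff = res$tradeoff, precision, recall, f1)
}

metrics <- testing_metrics(res)
print(metrics)
```

The same calculation applies to the training counts by swapping the testing fields for their training counterparts.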

Example

library(boxdrawing)

# Execute classifiers.
ebox <- exactboxes(positivetraining, negativetraining, positivetesting, negativetesting, 1, 1, 0.01, varType = "C")
fbox <- fastboxes(positivetraining, negativetraining, positivetesting, negativetesting, 1, 1, 1)

# Compare found box boundaries.
printBoundaries(ebox)
printBoundaries(fbox)

# Apply the boundaries to the data.
applyExact <- applyBoundaries(ebox, data)
applyFast <- applyBoundaries(fbox, data)
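
Conceptually, applying box boundaries means checking whether each observation lies inside at least one box. The sketch below illustrates that idea on its own and does not reproduce applyBoundaries. It assumes the lower and upper boundaries are matrices with one row per box and one column per feature, which is a guess about the layout of lowerIdeal and upperIdeal. The function in_any_box is hypothetical.

```r
# Hypothetical helper: TRUE for every row of x that falls inside at
# least one box. lower/upper: one row per box, one column per feature.
in_any_box <- function(x, lower, upper) {
  x <- as.matrix(x)
  hits <- rep(FALSE, nrow(x))
  for (b in seq_len(nrow(lower))) {
    above <- sweep(x, 2, lower[b, ], ">=")  # x >= lower bound, per feature
    below <- sweep(x, 2, upper[b, ], "<=")  # x <= upper bound, per feature
    hits <- hits | (rowSums(above & below) == ncol(x))
  }
  hits
}

# Two toy boxes in 2 dimensions and four points to classify.
lower <- rbind(c(0, 0), c(5, 5))
upper <- rbind(c(1, 1), c(6, 6))
pts   <- rbind(c(0.5, 0.5), c(3, 3), c(5.5, 5.2), c(0.5, 5.5))

predicted <- in_any_box(pts, lower, upper)
print(predicted)
```

Points inside a box would be labeled positive and all others negative.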