
Recursive conditional inference tree


citree

Project Status: Active - The project has reached a stable, usable state and is being actively developed.

The main function of this package is runCtree(), a wrapper around partykit::ctree() with the following additional features:

  • partykit::ctree() only produces the best separation at each node, i.e. a single tree. With recursive = T in runCtree(), all trees meeting the p-value cutoff are produced, so they can be examined to see which one makes the most sense according to domain knowledge. Each round of recursion removes the first splitting variable from the input data.frame and runs runCtree() again; the recursion stops when no splitting variable is found.
  • The info and stats of each node of each tree are collected and summarized in an Excel file, which also contains URLs to each tree.
  • Before partykit::ctree() is run, low-information columns and rows are removed to reduce computation and the multiple-testing adjustment of association p-values.
  • Cases that crash partykit::ctree() are handled, e.g. Inf and -Inf are converted to NA to avoid the error "'breaks' are not unique".
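
The clean-up and recursion described above can be sketched roughly as follows. This is a minimal illustration and not the code inside runCtree(): the helper names clean_input(), root_split_var() and recursive_splits() are hypothetical, "low-information" is assumed here to mean a column with fewer than two distinct non-NA values, and only the root split of each tree is tracked.

```r
# Hypothetical helpers for illustration; none of these names come from citree.

# Convert Inf/-Inf to NA in numeric columns, then drop columns with fewer
# than two distinct non-NA values (an assumed notion of "low-information").
clean_input <- function(df) {
  for (nm in names(df)) {
    x <- df[[nm]]
    if (is.numeric(x)) {
      x[is.infinite(x)] <- NA
      df[[nm]] <- x
    }
  }
  keep <- vapply(df, function(x) length(unique(x[!is.na(x)])) >= 2, logical(1))
  df[, keep, drop = FALSE]
}

# Fit a conditional inference tree and return the name of the variable used
# at the root split, or NULL when the tree has no split. Requires partykit.
root_split_var <- function(df, response, alpha = 0.05) {
  tree <- partykit::ctree(
    stats::as.formula(paste(response, "~ .")),
    data = df,
    control = partykit::ctree_control(alpha = alpha)
  )
  root <- partykit::node_party(tree)
  if (partykit::is.terminal(root)) return(NULL)
  names(tree$data)[partykit::varid_split(partykit::split_node(root))]
}

# Fit, record the first splitting variable, remove it, and fit again until
# no split is found. `find_split` can be swapped out, e.g. for testing.
recursive_splits <- function(df, response, find_split = root_split_var) {
  df <- clean_input(df)
  found <- character(0)
  while (ncol(df) > 1) {
    v <- find_split(df, response)
    if (is.null(v) || !(v %in% names(df))) break
    found <- c(found, v)
    df[[v]] <- NULL
  }
  found
}
```

With partykit installed, recursive_splits(mtcars, "mpg") would return the sequence of first splitting variables, one per round of recursion.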

Note:

  • ctree uses coin::independence_test() to test the association between two variables of any data type. See here for the theory behind the test, here for an explanation of the algorithm, and here for a nice tutorial.
  • See here for discussions on the pros and cons of ctree in comparison to other trees, e.g. rpart.
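
To get a feel for the association test mentioned above, it can be called directly. The snippet below is only a small illustration on mtcars, assuming the coin package is installed; it does not reproduce how ctree calls the test internally, and the variable choices (mpg, wt, cyl) are arbitrary.

```r
library(coin)

data("mtcars")
cars <- mtcars
cars$cyl_f <- factor(cars$cyl)  # treat the number of cylinders as categorical

# Numeric response vs. numeric covariate
it_num <- independence_test(mpg ~ wt, data = cars)
p_num <- pvalue(it_num)

# Numeric response vs. categorical covariate, same function
it_fac <- independence_test(mpg ~ cyl_f, data = cars)
p_fac <- pvalue(it_fac)

print(p_num)
print(p_fac)
```

The same function handles both variable types, which is what lets ctree compare candidate splitting variables of different kinds on a common p-value scale.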

See manual and examples

Installation

Since this is just a toy, I have no plan to submit it to CRAN, so please install it directly from GitHub:

devtools::install_github("blueskypie/citree")

Example

library(citree)
data('mtcars')
# Run on the built-in mtcars data; the output is written to the 'tmp' directory
re <- runCtree(mtcars, 'mtcars', oDir = 'tmp')

Check the tmp directory for the output.