Hybrid Isolation Forest
The Hybrid Isolation Forest (HIF) is an extension of the [Isolation Forest (IF) algorithm](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html). IF and HIF are designed to detect anomalies and outliers in a distribution of data points. As such, they are alternatives to the one-class Support Vector Machine.
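For readers new to IF, here is a minimal sketch of the anomaly score defined in the original Isolation Forest paper (Liu, Ting and Zhou), which HIF builds on. It is not taken from this package: the function names `average_path_length` and `anomaly_score` are hypothetical, and the harmonic number is approximated as ln(i) + Euler's constant, as in that paper.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def average_path_length(n):
    """Expected path length c(n) of an unsuccessful search in a binary
    search tree built on n points; used to normalise isolation depths."""
    if n > 2:
        harmonic = math.log(n - 1) + EULER_GAMMA  # approximates H(n - 1)
        return 2.0 * harmonic - 2.0 * (n - 1) / n
    if n == 2:
        return 1.0
    return 0.0


def anomaly_score(mean_path_length, n):
    """IF score s = 2 ** (-E[h(x)] / c(n)), for sub-sample size n >= 2.
    Values near 1 suggest an anomaly; values well below 0.5 suggest a
    normal point."""
    return 2.0 ** (-mean_path_length / average_path_length(n))


# Example with a sub-sample size of 64: a point isolated after 2 splits
# on average scores higher than one that needs 10 splits.
shallow = anomaly_score(2.0, 64)
deep = anomaly_score(10.0, 64)
```

A point that is isolated after only a few random splits gets a short mean path length and therefore a score closer to 1.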
HIF integrates two extensions, designed to:
- overcome a drawback of the Isolation Forest (IF) algorithm that limits its use for anomaly detection
- provide it with some supervised learning capability from a few samples
The HIF is described (among other places) in this draft paper http://arxiv.org/abs/1705.03800. Please cite this draft paper if you use this code.
This is a simple package implementation of the HIF (inspired by this simple Python implementation of the Isolation Forest algorithm).
(It supports python3, and possibly python2.)
$ sudo python3 setup.py install
No extra requirements are needed.
$ python3 -i testHIFDonuts.py
createDonutData(contamin=.005)
computeHIF(ntrees=512, sample_size=64)
Outputs the best (HIF1) and <alpha1, alpha2> (HIF2) values
plotGlobalAucBis(contamin=True)
testOneClassSVM(NU=.1, GAMMA=.1)
testTwoClassSVM(C=.1, GAMMA=.1)
plotDetailedResults(alpha0=.5, alpha1=.5, alpha2=.5)