How to get mean value from 0/1 RFClassifier
Mad5ci opened this issue · 2 comments
Mad5ci commented
We've built a random forest classifier that is trained on, and outputs, a 0 or 1 label for one of two classes.
We want to be able to get a double value representing the mean of the individual trees' outputs.
So, for example, if 2/3 of the trees in the forest vote 1 then we would expect to get a value near 0.666.
There doesn't appear to be a way to drill down to that level of detail, but maybe it can be done with some trickery with the decision function.
How should we go about getting at the data from the trees after a prediction?
Ulfgard commented
If you use the current version on master, i.e. the last release, you should be able to do:
auto class_probabilities = random_forest.decisionFunction()(inputs);
This will be a vector/matrix with N elements per row (where N is the number of classes), one row per input. The output should be normalized between 0 and 1. I think for binary classification, the second value should be the one you are after (the proportion of votes for label 1).
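To make the expected numbers concrete, here is a minimal sketch of the arithmetic only. It does not use Shark: `voteFractions` is a hypothetical helper written for illustration, and it assumes each tree casts one hard vote. Shark's decision function may combine the trees differently (for example by averaging per-tree class distributions), so treat this as a picture of the "2/3 of the trees vote 1" case rather than the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Shark): converts one hard vote per tree
// into the fraction of trees that voted for each class.
std::vector<double> voteFractions(const std::vector<unsigned int>& treeVotes,
                                  std::size_t numClasses) {
    std::vector<double> fractions(numClasses, 0.0);
    if (treeVotes.empty()) {
        return fractions;  // no trees: leave every fraction at zero
    }
    // Count the votes for each class label.
    for (unsigned int label : treeVotes) {
        assert(label < numClasses);
        fractions[label] += 1.0;
    }
    // Normalize the counts so the entries sum to 1.
    for (double& f : fractions) {
        f /= static_cast<double>(treeVotes.size());
    }
    return fractions;
}
```

With three trees voting 0, 1, 1 and two classes, `voteFractions({0, 1, 1}, 2)` gives roughly 0.333 at index 0 and 0.667 at index 1. Index 1 plays the same role as the second value mentioned above.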
I hope this helps.
Mad5ci commented
Thanks! That worked... Here is a snippet of the code that's giving the desired result.
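// Code to make the prediction
double thePrediction;
unsigned int modelOutput;
// decisionFunction() gives the per-class outputs instead of the 0/1 label
auto predictionData = model.decisionFunction()(theInputs);
// First input (element 0), second entry ([1]): the value for class 1
thePrediction = predictionData.element(0)[1];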