Difficulties using classifier
anaritam opened this issue · 4 comments
Hi,
I'm trying to use a classifier that I've created, to see how accurate it can be.
I have 157 independent observations of 2 features, from 2 classes, which I used to create a Maximum a Posteriori classifier:
features is a 157x2 matrix
labels = [zeros(79,1); ones(78,1)];
dataSet = prtDataSetClass(features,labels);
classifier = prtClassMap;
classifier = classifier.train(dataSet);
I have 1 single observation to test the classifier
features_test is a 1x2 matrix
labels_test = 0;
dataSet_test = prtDataSetClass(features_test,labels_test);
After I run the classifier
classified = run(classifier, dataSet_test)
I try to use this command
prtScoreRoc(classified,dataSet_test);
but it gives me this error
Error using prtScoreRoc (line 111)
ROC requires input labels to have 2 unique classes; unique(y(:)) = 0
How can I solve this? Thanks
Hi,
Unfortunately, it doesn't make sense to compute a ROC curve (http://en.wikipedia.org/wiki/Receiver_operating_characteristic) on a data set that only has one sample, so prtScoreRoc errors here.
If you have a classifier that includes a "decider", you can look at the output class to see if the guess was correct:
classifier = prtClassMap + prtDecisionBinaryMinPe;
Then check to see if
classified.X == classified.Y
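Putting those steps together for your single test observation (a sketch using the variable names from your post; with the decider attached, classified.X holds a hard 0/1 decision rather than a continuous likelihood):
classifier = prtClassMap + prtDecisionBinaryMinPe; % MAP classifier plus a decision stage
classifier = classifier.train(dataSet);            % train on the 157 labeled observations
classified = run(classifier, dataSet_test);        % classify the held-out observation
isCorrect = (classified.X == classified.Y);        % 1 if the decision matches the true label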
Let me know if that helps!
-Pete
What I'm trying to do is to quantify the accuracy of my classifier.
In fact I have 158 independent observations. What I'm doing is excluding 1 observation from the data set, building the classifier, and then using that excluded observation to test the classifier, repeating this for each of the 158 observations.
What I wanted to know is which field of the "classified" variable holds the class assigned to the test data set, so I can compare it to the true class of the observation.
Should I use
classifier = prtClassMap + prtDecisionBinaryMinPe;
classifier = classifier.train(dataSet);
classified = run(classifier, dataSet_test);
class_decision = classified.getObservations;
??
Yes - precisely. For an algorithm that includes a decider (+ prtDecisionBinaryMinPe), look at:
class_decision = classified.getObservations;
But if you want to evaluate the entire algorithm on each single observation by leaving one out at a time (Leave-One-Out: http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29#Leave-one-out_cross-validation) then you can use:
classifier = prtClassMap;
classifiedLeaveOneOut = classifier.kfolds(dataSet);
prtScoreRoc(classifiedLeaveOneOut);
This will iteratively train the classifier on each 157-element subset, then evaluate it on the held-out observation.
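If you want a single scalar summary of that ROC curve, the area under the curve is a common choice (assuming your PRT version includes prtScoreAuc):
auc = prtScoreAuc(classifiedLeaveOneOut); % 0.5 = chance performance, 1 = perfect separation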
Note that to get meaningful ROC curves, you do not want to include a decider. I.e., use:
classifier = prtClassMap; %this outputs continuous values in classified.X
and not:
classifier = prtClassMap + prtDecisionBinaryMinPe; %this outputs binary 1/0 values in classified.X
However, for decision-level evaluation, you can use a decider:
ds = prtDataGenUnimodal;
class = prtClassMap + prtDecisionBinaryMinPe; % Include a decider, so we get binary outputs
classified = class.kfolds(ds,10); %10-folds is fast and fair
prtScoreConfusionMatrix(classified); %evaluate the binary outputs with a confusion matrix
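To turn those decision-level outputs into a single accuracy number, you can also compare the decisions to the truth directly (a sketch; getObservations, getTargets, and nObservations are the standard prtDataSet accessors):
nCorrect = sum(classified.getObservations == classified.getTargets); % count matching decisions
pc = nCorrect / classified.nObservations;                            % fraction of correct decisions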
Let me know if this helps!
-Pete
Thank you very much for your help!