Does anybody have expertise with the MATLAB SVM classifier?

Does anybody know if it's possible to extract more information from svmclassify than just the predicted group? I'm hoping I'll be able to determine the distance from the class boundary. It could serve as a measure of confidence in the classification, or as a fuzzy degree of membership. It seems like that should be possible; I'm wondering if MATLAB has already implemented that functionality. If not, can anybody suggest a way to implement this as an accessory function to MATLAB's SVM classification?

 Accepted Answer

12 Comments

That's really helpful, I just want to make sure I understand. Below is my code for predicting classes on a test dataset:
Group = svmclassify(SVMStruct,Sample); %Sample is the test features
sv = SVMStruct.SupportVectors;
alphaHat = SVMStruct.Alpha;
bias = SVMStruct.Bias;
kfun = SVMStruct.KernelFunction;
kfunargs = SVMStruct.KernelFunctionArgs;
f = kfun(sv,Sample,kfunargs{:})'*alphaHat(:) + bias;
I ran it on a test set of 25, and saved Group in the first column and your f value in the second:
1 61.9034496803315
1 55.6097471320111
1 29.9168641220881
1 37.4841493478117
1 20.3962105716209
-1 79.6733478962262
1 45.3679636234752
1 49.0841383006749
1 72.6740709982137
1 42.9570874407634
1 56.3203630497531
-1 91.3203476908088
1 56.5131087391274
1 49.4198820528178
-1 97.4609560433329
-1 85.8169118173245
-1 94.1952199328395
-1 99.9702926453379
-1 97.1881289724361
-1 104.362322625414
-1 86.4060043918572
1 58.0004228062444
-1 88.3996251852743
-1 87.6174077335166
-1 87.6789516196668
I can see that all f values are positive. My understanding is that the greater the f value, the further it is from the boundary? I.e., an f value of 0.1 would mean a weak prediction?
You are not showing how you trained the SVM. By default, training data are standardized (centered and scaled). The means and scale factors are saved in SVMStruct.ScaleData. In that case, you need to standardize the test data before applying the code above:
Sample = bsxfun(@plus,Sample,SVMStruct.ScaleData.shift);
Sample = bsxfun(@rdivide,Sample,SVMStruct.ScaleData.scaleFactor);
If SVM training converged and SVM predictions are computed correctly, there should be a mix of positive and negative values for f.
You're right, the data were scaled. When I added your two new lines I got a mix of positive and negative values.
Does my reasoning still make sense? If I take the absolute value of each f value, can I treat it as a measure of distance from the boundary? So a low |f| indicates a low-confidence prediction and a high |f| indicates a high-confidence prediction?
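As a side note, f is the raw decision value, not a geometric distance. For a linear kernel you could, if needed, convert it into one. A minimal sketch, assuming SVMStruct came from svmtrain with a linear kernel and x is a test row already standardized with ScaleData (the variable x is illustrative):

```matlab
% Linear kernel only: recover the weight vector and turn the decision
% value f into a signed geometric distance to the separating hyperplane.
w = SVMStruct.SupportVectors' * SVMStruct.Alpha;  % weight vector in scaled space
b = SVMStruct.Bias;
f = x * w + b;          % same score as the kernel-based formula above
dist = f / norm(w);     % signed distance; abs(dist) is the margin from the boundary
```

For nonlinear kernels there is no such simple conversion, but |f| still orders predictions by confidence.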
I'm getting some strange results; I'm hoping you can help me interpret them. From our discussion above, I expected the distance to be negative when the class is -1 and positive when the class is +1. However, this isn't what I observed. Below is the code I used to extract the distance and the class/distance pairs. The first column is "Group" and the second is "Confidence". Thanks for your help!
Group = svmclassify(SVMStruct,Sample);
Sample = bsxfun(@plus,Sample,SVMStruct.ScaleData.shift);
Sample = bsxfun(@rdivide,Sample,SVMStruct.ScaleData.scaleFactor);
sv = SVMStruct.SupportVectors;
alphaHat = SVMStruct.Alpha;
bias = SVMStruct.Bias;
kfun = SVMStruct.KernelFunction;
kfunargs = SVMStruct.KernelFunctionArgs;
f = kfun(sv,Sample,kfunargs{:})'*alphaHat(:) + bias;
Confidence = f;
1 -3506820447
1 716546018
1 -1046146344
1 -958945814.4
1 71630440.45
-1 184198734.1
-1 -841606073
1 -1422261542
-1 72318657.79
1 -1857223170
-1 694021135.9
-1 487229120
The score f must be negative when the predicted class is -1 and positive when the predicted class is +1. Your result does not make sense. Values of f this large would look suspicious even if the sign were right. I suspect you are showing not what you really did, but a cleaned-up version of what you did. Something important was lost in the cleanup.
I didn't clean anything up; this is my complete function (I just removed the header to make it easier to paste in). I'm a bit confused about lines 2 and 3 that you suggested. I thought I shouldn't be overwriting the result of line 2 in line 3, so I changed that, but it didn't fix the problem of f values not agreeing with the predicted class.
SampleScaleShift = bsxfun(@plus,Sample,SVMStruct.ScaleData.shift);
Sample = bsxfun(@rdivide,SampleScaleShift,SVMStruct.ScaleData.scaleFactor);
I ran it on a different dataset and got much smaller values of f, but the sign on those f values still doesn't make sense to me. I'm sure the predicted class and the f value match up.
1 -94.2522619242825
-1 159.828900542196
1 63.7326650768443
-------------- complete function -----------------------------------
Group = svmclassify(SVMStruct,Sample);
Sample = bsxfun(@plus,Sample,SVMStruct.ScaleData.shift);
Sample = bsxfun(@rdivide,Sample,SVMStruct.ScaleData.scaleFactor);
sv = SVMStruct.SupportVectors;
alphaHat = SVMStruct.Alpha;
bias = SVMStruct.Bias;
kfun = SVMStruct.KernelFunction;
kfunargs = SVMStruct.KernelFunctionArgs;
f = kfun(sv,Sample,kfunargs{:})'*alphaHat(:) + bias;
Confidence = f;
You are right. There are two issues, and neither of them is yours.
First, I incorrectly suggested that you should divide by the scale factor. Instead, you should multiply:
bsxfun(@times,SampleScaleShift,SVMStruct.ScaleData.scaleFactor);
The documentation for svmtrain is accurate in that respect:
scaleFactor — Row vector of values. Each value is 1 divided by the standard deviation of an observation in Training, the training data.
I am just not used to this convention.
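In other words, my reading of the documentation is that the default autoscale step amounts to the following sketch (assuming shift is stored as the negative of the column means):

```matlab
% Sketch of what svmtrain's default 'autoscale' does, per the doc quote
% above; Training is the training matrix (rows = observations).
shift = -mean(Training);            % stored in SVMStruct.ScaleData.shift
scaleFactor = 1 ./ std(Training);   % stored in SVMStruct.ScaleData.scaleFactor
Scaled = bsxfun(@times, bsxfun(@plus, Training, shift), scaleFactor);
% Each column of Scaled now has zero mean and unit standard deviation.
```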
Second, it appears that if you pass class labels as double values -1 and +1, this SVM implementation treats the -1 label as the positive class. Which means you need to flip the sign of f to get the correct prediction. This is an unorthodox convention, I might say.
This seems to be working! I've copied the function and some sample outputs below. Do the magnitudes of the f values seem more reasonable now?
---- complete function -----
Group = svmclassify(SVMStruct,Sample);
SampleScaleShift = bsxfun(@plus,Sample,SVMStruct.ScaleData.shift);
Sample = bsxfun(@times,SampleScaleShift,SVMStruct.ScaleData.scaleFactor);
sv = SVMStruct.SupportVectors;
alphaHat = SVMStruct.Alpha;
bias = SVMStruct.Bias;
kfun = SVMStruct.KernelFunction;
kfunargs = SVMStruct.KernelFunctionArgs;
f = kfun(sv,Sample,kfunargs{:})'*alphaHat(:) + bias;
Confidence = -f;
-------------------------------
1 0.760583390612709
1 1.25173263758715
1 3.04339806528354
1 2.63327843250154
1 3.59341993594926
-1 -0.472036559922782
1 1.98468704863745
1 1.66813020278730
1 0.0750073463900852
1 2.36576742177346
1 0.754440605916582
-1 -1.65394319069710
I don't understand, how do you get a scalar value for Confidence? My Sample is a 1x3000 or so vector, and Confidence gives me a vector with 3000 or so values. If this is normal, how do I obtain a scalar confidence score?
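For what it's worth, both svmclassify and the formula above treat each row of Sample as one observation, so f has one entry per row. A sketch of the shape convention (sizes are illustrative):

```matlab
% Rows of Sample are observations, columns are features.
% A 1-by-3000 Sample is ONE observation with 3000 features, so f should
% be a scalar; a 3000-by-d Sample is 3000 observations, so f is
% 3000-by-1, one confidence score per observation. If you expected a
% single score but got ~3000, the vector is probably oriented the wrong
% way; if each entry is a feature of one sample, force a single row:
Sample = Sample(:)';   % one row = one observation
```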


More Answers (1)

How can this approach be used for one-versus-all SVM?
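This hasn't been answered here, but a hedged sketch based on the score formula from the accepted answer: train one binary svmtrain model per class (class k vs. all others), score a test sample with each, and pick the class with the largest score. The variable names (X, Y, sample) are illustrative, and this assumes default autoscaling is on:

```matlab
% One-versus-all: Y holds training labels, X the training data.
classes = unique(Y);
K = numel(classes);
models = cell(K,1);
for k = 1:K
    yk = 2*(Y == classes(k)) - 1;   % +1 for class k, -1 for the rest
    models{k} = svmtrain(X, yk);
end

% Score one test sample with every model and take the argmax.
scores = zeros(K,1);
for k = 1:K
    S = models{k};
    x = bsxfun(@plus, sample, S.ScaleData.shift);
    x = bsxfun(@times, x, S.ScaleData.scaleFactor);
    f = S.KernelFunction(S.SupportVectors, x, S.KernelFunctionArgs{:})' ...
        * S.Alpha(:) + S.Bias;
    scores(k) = -f;                 % sign flip, per the discussion above
end
[~, idx] = max(scores);
predicted = classes(idx);
```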


Asked: Tim on 22 Feb 2013
Answered: on 9 Feb 2015
