How to define custom classification loss function

I am currently trying to run k-fold cross-validation on a decision tree with a custom classification loss function, as described here.
However, I don't understand how the C and S matrices, which are passed to the loss function, are helpful.
1. The linked page says "C is an n-by-K logical matrix with rows indicating which class the corresponding observation belongs." So this is not predicted, but simply a repetition of the input labels?
2. The S matrix: "S is an n-by-K numeric matrix of classification scores". Why can I not simply use the predicted classes instead of the scores?
To be more specific: I create a classification decision tree. Next, I use crossval to get a partitioned model. Then I calculate the validation accuracy using kfoldLoss. Now, instead of using the built-in 'classiferror' loss, I would like to use my own classification loss function, e.g. the Matthews correlation coefficient.
% create set of cross-validated classification model(s) from a classification model
partitionedModel = crossval(trainedClassifier.ClassificationTree, 'KFold', 10);
% Loss, by default the fraction of misclassified data, is a scalar and averaged over all folds
validationAccuracy = 1 - kfoldLoss(partitionedModel, 'LossFun', 'classiferror');
Any help is greatly appreciated.
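[EDIT] For reference, kfoldLoss also accepts a function handle for 'LossFun'. Below is a minimal sketch of that form, assuming a binary problem; the name mccLoss is made up. The point is that C marks the true class of each row, and the per-row argmax of S recovers the predicted class, so together they are enough to build a confusion matrix:

```matlab
% Sketch of a custom loss for kfoldLoss (binary classification assumed).
% kfoldLoss calls it as lossvalue = lossfun(C, S, W, Cost), where
%   C    n-by-K logical, C(i,j) is true if observation i belongs to class j
%   S    n-by-K matrix of classification scores
%   W    n-by-1 vector of observation weights
%   Cost K-by-K misclassification cost matrix
function lossvalue = mccLoss(C, S, W, Cost)
[~, trueClass] = max(C, [], 2);   % column position of the "true" flag
[~, predClass] = max(S, [], 2);   % predicted class = column with highest score
cm = confusionmat(trueClass, predClass);  % rows = true class, columns = predicted
TP = cm(1,1); FN = cm(1,2); FP = cm(2,1); TN = cm(2,2);
denom = sqrt( (TP+FP)*(TP+FN)*(TN+FP)*(TN+FN) );
if denom == 0
    MCC = 0;  % convention: MCC = 0 when the denominator vanishes
else
    MCC = (TP*TN - FP*FN) / denom;
end
lossvalue = 1 - MCC;  % return a loss: 0 is best, 2 is worst
end
```

With that, something like mcc = 1 - kfoldLoss(partitionedModel, 'LossFun', @mccLoss); should recover the Matthews correlation coefficient averaged over folds.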

Accepted Answer

Leon Kellner
Leon Kellner on 3 Jul 2018
In case anybody else is looking for a solution: I used the crossval function to wrap the training of the decision tree. This way, implementing other loss functions is straightforward.
function [trainedClassifier, qualityMeasures] = trainDTwCrossVal(data, predictorNames, MaxNumSplits)
% cross-validation
numberOfFolds = 5;
% creates a random partition for a stratified k-fold cross-validation
cp = cvpartition(data.typeBehavior, 'k', numberOfFolds);
% loss estimate using cross-validation; crossval calls trainDT2 once per fold
vals = crossval(@trainDT2, data, 'partition', cp);
% nested function to train one DT with trainingData and test with testingData
function testval = trainDT2(trainingData, testingData)
testval holds quality measures of the prediction, derived from the confusion matrix C, which is computed inside the nested function from the true and predicted labels of the held-out fold.
% C is the 2-by-2 confusion matrix of the fold's test predictions:
% C = [TP FP
%      FN TN]
TP = C(1,1); FP = C(1,2); FN = C(2,1); TN = C(2,2);
% Matthews correlation coefficient, worst value = -1, best value = 1
if ( (TP+FP)*(TP+FN)*(TN+FP)*(TN+FN) ) == 0
    MCC = 0; % set MCC to zero if the denominator is zero
else
    MCC = (TP*TN - FP*FN) / ...
        sqrt( (TP+FP)*(TP+FN)*(TN+FP)*(TN+FN) );
end
accuracy = (TP+TN)/(TP+TN+FP+FN); % accuracy, worst value = 0, best value = 1
F1score = 2*TP/(2*TP+FP+FN);      % F1 score, worst value = 0, best value = 1
testval = [accuracy F1score MCC];
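For context, the body of the nested function can be sketched as follows. This is a sketch under assumptions carried over from above (typeBehavior as the response column, a binary problem, MaxNumSplits visible to the nested function); fitctree, predict and confusionmat are the standard calls:

```matlab
% Sketch of the fold function: train one tree, test on the held-out fold.
function testval = trainDT2(trainingData, testingData)
tree = fitctree(trainingData, 'typeBehavior', ...   % response column assumed
    'MaxNumSplits', MaxNumSplits);                  % shared with the enclosing function
predicted = predict(tree, testingData);             % labels for the held-out fold
C = confusionmat(testingData.typeBehavior, predicted);
% note: confusionmat puts true classes in rows, predicted in columns,
% so C(1,2) counts FN and C(2,1) counts FP for a binary problem
TP = C(1,1); FN = C(1,2); FP = C(2,1); TN = C(2,2);
accuracy = (TP+TN)/sum(C(:));
F1score = 2*TP/(2*TP+FP+FN);
denom = sqrt((TP+FP)*(TP+FN)*(TN+FP)*(TN+FN));
if denom == 0, MCC = 0; else, MCC = (TP*TN - FP*FN)/denom; end
testval = [accuracy F1score MCC];
end
```

Swapping FP and FN does not change accuracy, F1 or MCC here, since all three formulas are symmetric in those two counts.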
  1 Comment
Elena Casiraghi
Elena Casiraghi on 21 Aug 2019
Dear all, I had the same problem; however, it seems I found the solution.
I have a classification problem with labels 1, ..., 5.
Since the label is a score related to a grade, I would like to compute the loss as the distance between the true label and the predicted label.
So, if:
x(1), ..., x(N) are the N points in my dataset,
y(i) is the TRUE label of x(i), and the predicted label is yhat(i),
w(i) is the weight for point x(i),
Cost(j, k) is the cost of assigning a point in class j to class k,
then I would like to measure the loss as:
loss = sum over i = 1..N of w(i) * Cost(y(i), yhat(i))
The score matrix S to be used when computing the loss contains negative values. What is the meaning of that score?
From the explanation in the MATLAB help it seems that the lower (more negative) the S value, the more "distant" the point is from that class.
If I have 5 labels and, for x(i), the third entry of row i of S is the largest, then x(i) has predicted label = 3. You could somehow normalize the scores to transform them into a sort of probability of the point x belonging to each class.
I used kfoldPredict to understand what's happening and it should be right.
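To illustrate that last point with made-up numbers (the scores below are hypothetical, not from an actual model; softmax is just one common choice of normalization):

```matlab
% Hypothetical scores for one observation with K = 5 classes
s = [-1.2 -0.4 0.9 -2.1 -0.7];
[~, predictedLabel] = max(s);  % column 3 has the highest score -> label 3
p = exp(s) / sum(exp(s));      % softmax: probability-like weights summing to 1
```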


