Classification loss for cross-validated kernel classification model

`loss = kfoldLoss(CVMdl)` returns the classification loss obtained by the cross-validated, binary kernel model (`ClassificationPartitionedKernel`) `CVMdl`. For every fold, `kfoldLoss` computes the classification loss for validation-fold observations using a model trained on training-fold observations.

By default, `kfoldLoss` returns the classification error.

`loss = kfoldLoss(CVMdl,Name,Value)` returns the classification loss with additional options specified by one or more name-value pair arguments. For example, specify the classification loss function, the folds to use, or the aggregation level.

Load the `ionosphere` data set. This data set has 34 predictors and 351 binary responses for radar returns, which are labeled either bad (`'b'`) or good (`'g'`).

`load ionosphere`

Cross-validate a binary kernel classification model using the data.

`CVMdl = fitckernel(X,Y,'Crossval','on')`

    CVMdl =
      ClassificationPartitionedKernel
        CrossValidatedModel: 'Kernel'
               ResponseName: 'Y'
            NumObservations: 351
                      KFold: 10
                  Partition: [1x1 cvpartition]
                 ClassNames: {'b'  'g'}
             ScoreTransform: 'none'

`CVMdl` is a `ClassificationPartitionedKernel` model. By default, the software implements 10-fold cross-validation. To specify a different number of folds, use the `'KFold'` name-value pair argument instead of `'Crossval'`.
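As a minimal sketch of that alternative, the following requests 5-fold cross-validation with `'KFold'` in place of `'Crossval','on'`. The fold count of 5, the seed, and the variable names `CVMdl5`, `numFolds`, and `loss5` are illustrative choices, not part of the original example. Because `fitckernel` relies on a random feature expansion and a random partition, the loss varies from run to run unless the random seed is fixed.

```matlab
% Sketch: request 5 folds explicitly instead of the default 10.
% The value 5 and the names CVMdl5, numFolds, loss5 are illustrative assumptions.
load ionosphere                       % provides predictors X and labels Y
rng(1)                                % fix the seed so the run is repeatable
CVMdl5 = fitckernel(X,Y,'KFold',5);   % cross-validated kernel model with 5 folds
numFolds = CVMdl5.KFold               % number of folds stored in the model
loss5 = kfoldLoss(CVMdl5)             % average classification error over the 5 folds
```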

Estimate the cross-validated classification loss. By default, the software computes the classification error.

loss = kfoldLoss(CVMdl)

loss = 0.0940

Alternatively, you can obtain the per-fold classification errors by specifying the name-value pair `'Mode','individual'` in `kfoldLoss`.
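A short sketch of the per-fold variant follows. It assumes the same `ionosphere` data and default 10-fold partition as the example; the seed and the variable names `avgLoss` and `foldLoss` are illustrative. No expected values are shown because the result depends on the random partition and feature expansion.

```matlab
% Sketch: compare the averaged loss with the per-fold losses.
% Assumes the ionosphere data and the default 10-fold partition.
load ionosphere
rng(1)                                           % illustrative seed for repeatability
CVMdl = fitckernel(X,Y,'Crossval','on');
avgLoss  = kfoldLoss(CVMdl);                     % scalar (default 'Mode','average')
foldLoss = kfoldLoss(CVMdl,'Mode','individual')  % one classification error per fold
```

Inspecting `foldLoss` can reveal whether a single fold dominates the averaged error.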

Load the `ionosphere` data set. This data set has 34 predictors and 351 binary responses for radar returns, which are labeled either bad (`'b'`) or good (`'g'`).

`load ionosphere`

Cross-validate a binary kernel classification model using the data.

`CVMdl = fitckernel(X,Y,'Crossval','on')`

    CVMdl =
      ClassificationPartitionedKernel
        CrossValidatedModel: 'Kernel'
               ResponseName: 'Y'
            NumObservations: 351
                      KFold: 10
                  Partition: [1x1 cvpartition]
                 ClassNames: {'b'  'g'}
             ScoreTransform: 'none'

`CVMdl` is a `ClassificationPartitionedKernel` model. By default, the software implements 10-fold cross-validation. To specify a different number of folds, use the `'KFold'` name-value pair argument instead of `'Crossval'`.

Create an anonymous function that measures linear loss, that is,

$$L=\frac{\sum _{j}-{w}_{j}{y}_{j}{f}_{j}}{\sum _{j}{w}_{j}}.$$

$${w}_{j}$$ is the weight for observation *j*, $${y}_{j}$$ is the response *j* (–1 for the negative class and 1 otherwise), and $${f}_{j}$$ is the raw classification score of observation *j*.

linearloss = @(C,S,W,Cost)sum(-W.*sum(S.*C,2))/sum(W);

Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the `'LossFun'` name-value pair argument.

Estimate the cross-validated classification loss using the linear loss function.

`loss = kfoldLoss(CVMdl,'LossFun',linearloss)`

loss = -0.7792
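To see what the `linearloss` handle computes, it can help to evaluate it directly on a small hand-made input, without any model. In this sketch the matrices `C`, `S`, `W`, and `Cost` are made-up values for three observations and two classes; they are not taken from the `ionosphere` data.

```matlab
% Same handle as in the example above.
linearloss = @(C,S,W,Cost)sum(-W.*sum(S.*C,2))/sum(W);

% Hypothetical inputs: 3 observations, 2 classes (values chosen for illustration).
C = logical([1 0; 0 1; 0 1]);     % true class of each observation (one-hot rows)
S = [0.5 -0.5; -1 1; 0.2 -0.2];   % classification scores, one column per class
W = [1; 1; 1]/3;                  % equal observation weights that sum to 1
Cost = ones(2) - eye(2);          % unused by linearloss, but part of the signature

% sum(S.*C,2) picks the score of the true class: [0.5; 1; -0.2].
val = linearloss(C,S,W,Cost)      % equals -(0.5 + 1 - 0.2)/3
```

The result is negative because the true-class scores are positive on average; a more negative linear loss corresponds to larger true-class scores.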

`CVMdl` — Cross-validated, binary kernel classification model
`ClassificationPartitionedKernel` model object

Cross-validated, binary kernel classification model, specified as a `ClassificationPartitionedKernel` model object. You can create a `ClassificationPartitionedKernel` model by using `fitckernel` and specifying any one of the cross-validation name-value pair arguments.

To obtain estimates, `kfoldLoss` applies the same data used to cross-validate the kernel classification model (`X` and `Y`).

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

**Example:** `kfoldLoss(CVMdl,'Folds',[1 3 5])` specifies using only the first, third, and fifth folds to calculate the classification loss.

`Folds` — Fold indices for prediction
`1:CVMdl.KFold` (default) | numeric vector of positive integers

Fold indices for prediction, specified as the comma-separated pair consisting of `'Folds'` and a numeric vector of positive integers. The elements of `Folds` must be within the range from `1` to `CVMdl.KFold`.

The software uses only the folds specified in `Folds` for prediction.

**Example:** `'Folds',[1 4 10]`

**Data Types:** `single` | `double`
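The following sketch shows the `'Folds'` argument in context. It assumes the `ionosphere` data and a default 10-fold model; the seed and the variable names `lossAll` and `lossSubset` are illustrative. No expected values are shown because the result depends on the random partition.

```matlab
% Sketch: restrict the loss computation to a subset of folds.
load ionosphere
rng(1)                                          % illustrative seed
CVMdl = fitckernel(X,Y,'Crossval','on');        % default 10 folds
lossAll    = kfoldLoss(CVMdl)                   % uses folds 1 through CVMdl.KFold
lossSubset = kfoldLoss(CVMdl,'Folds',[1 3 5])   % uses folds 1, 3, and 5 only
```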

`LossFun` — Loss function
`'classiferror'` (default) | `'binodeviance'` | `'exponential'` | `'hinge'` | `'logit'` | `'mincost'` | `'quadratic'` | function handle

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss function name or a function handle.

This table lists the available loss functions. Specify one using its corresponding value.

Value | Description |
---|---|
`'binodeviance'` | Binomial deviance |
`'classiferror'` | Misclassified rate in decimal |
`'exponential'` | Exponential loss |
`'hinge'` | Hinge loss |
`'logit'` | Logistic loss |
`'mincost'` | Minimal expected misclassification cost (for classification scores that are posterior probabilities) |
`'quadratic'` | Quadratic loss |

`'mincost'` is appropriate for classification scores that are posterior probabilities. For kernel classification models, logistic regression learners return posterior probabilities as classification scores by default, but SVM learners do not (see `kfoldPredict`).

Specify your own function by using function handle notation.

Assume that `n` is the number of observations in `X`, and `K` is the number of distinct classes (`numel(CVMdl.ClassNames)`, where `CVMdl` is the input model). Your function must have this signature:

`lossvalue = lossfun(C,S,W,Cost)`

- The output argument `lossvalue` is a scalar.
- You specify the function name (`lossfun`).
- `C` is an `n`-by-`K` logical matrix with rows indicating the class to which the corresponding observation belongs. The column order corresponds to the class order in `CVMdl.ClassNames`. Construct `C` by setting `C(p,q) = 1`, if observation `p` is in class `q`, for each row. Set all other elements of row `p` to `0`.
- `S` is an `n`-by-`K` numeric matrix of classification scores. The column order corresponds to the class order in `CVMdl.ClassNames`. `S` is a matrix of classification scores, similar to the output of `kfoldPredict`.
- `W` is an `n`-by-1 numeric vector of observation weights. If you pass `W`, the software normalizes the weights to sum to `1`.
- `Cost` is a `K`-by-`K` numeric matrix of misclassification costs. For example, `Cost = ones(K) - eye(K)` specifies a cost of `0` for correct classification, and `1` for misclassification.

**Example:** `'LossFun',@lossfun`

**Data Types:** `char` | `string` | `function_handle`
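As an illustration of the `(C,S,W,Cost)` signature, the sketch below defines a hypothetical custom loss, `costloss`, that averages the misclassification cost of the highest-scoring class. The handle name, the input values, and the asymmetric cost matrix are all made up for this example, and the one-line implementation assumes that no row of `S` contains tied scores.

```matlab
% Hypothetical custom loss: weighted average cost of predicting the
% highest-scoring class. Assumes no ties within a row of S.
costloss = @(C,S,W,Cost) ...
    sum(W .* sum((double(C)*Cost) .* (S == max(S,[],2)), 2)) / sum(W);

% Made-up inputs: 3 observations, 2 classes.
C = logical([1 0; 0 1; 0 1]);     % true classes: 1, 2, 2
S = [0.5 -0.5; -1 1; 0.2 -0.2];   % highest-scoring classes: 1, 2, 1
W = [0.2; 0.3; 0.5];              % observation weights
Cost = [0 1; 2 0];                % Cost(i,j): cost of predicting j when truth is i

% double(C)*Cost selects the cost row of each true class, and the
% comparison with the row maximum marks the predicted class.
val = costloss(C,S,W,Cost)        % only observation 3 is wrong: 0.5*Cost(2,1)
```

A handle of this form can then be passed as `kfoldLoss(CVMdl,'LossFun',costloss)`.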

`Mode` — Aggregation level for output
`'average'` (default) | `'individual'`

Aggregation level for the output, specified as the comma-separated pair consisting of `'Mode'` and `'average'` or `'individual'`.

This table describes the values.

Value | Description |
---|---|
`'average'` | The output is a scalar average over all folds. |
`'individual'` | The output is a vector of length k containing one value per fold, where k is the number of folds. |

**Example:** `'Mode','individual'`

`loss` — Classification loss
numeric scalar | numeric column vector

Classification loss, returned as a numeric scalar or numeric column vector.

If `Mode` is `'average'`, then `loss` is the average classification loss over all folds. Otherwise, `loss` is a *k*-by-1 numeric column vector containing the classification loss for each fold, where *k* is the number of folds.

*Classification loss* functions measure the predictive
inaccuracy of classification models. When you compare the same type of loss among many
models, a lower loss indicates a better predictive model.

Suppose the following:

- *L* is the weighted average classification loss.
- *n* is the sample size.
- $${y}_{j}$$ is the observed class label. The software codes it as –1 or 1, indicating the negative or positive class (or the first or second class in the `ClassNames` property), respectively.
- $$f\left({X}_{j}\right)$$ is the positive-class classification score for observation (row) *j* of the predictor data *X*.
- $${m}_{j}={y}_{j}f\left({X}_{j}\right)$$ is the classification score for classifying observation *j* into the class corresponding to $${y}_{j}$$. Positive values of $${m}_{j}$$ indicate correct classification and do not contribute much to the average loss. Negative values of $${m}_{j}$$ indicate incorrect classification and contribute significantly to the average loss.
- The weight for observation *j* is $${w}_{j}$$. The software normalizes the observation weights so that they sum to the corresponding prior class probability. The software also normalizes the prior probabilities so that they sum to 1. Therefore, $$\sum _{j=1}^{n}{w}_{j}=1.$$
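The definitions above can be traced numerically. The sketch below uses made-up labels, scores, and weights for three observations, forms the margins $${m}_{j}$$, and evaluates the margin-based loss formulas tabulated in this section. Two simplifications are assumed: the weights are only rescaled to sum to 1 (the per-class prior normalization is skipped), and a misclassification is taken to be a negative margin, which matches the maximal-score rule when the positive-class score is nonzero. The struct name `losses` is illustrative.

```matlab
% Made-up data: labels coded as -1/+1 and positive-class scores f(X_j).
y = [1; -1; 1];
f = [2; 0.5; 0.5];
w = [2; 1; 1];
w = w / sum(w);                  % simplified normalization: sum(w) == 1

m = y .* f;                      % margins m_j = y_j*f(X_j): [2; -0.5; 0.5]

% Weighted losses, one field per built-in margin-based loss.
losses.binodeviance = sum(w .* log(1 + exp(-2*m)));
losses.exponential  = sum(w .* exp(-m));
losses.classiferror = sum(w .* (m < 0));   % assumes prediction is sign(f)
losses.hinge        = sum(w .* max(0, 1 - m));
losses.logit        = sum(w .* log(1 + exp(-m)));
losses.quadratic    = sum(w .* (1 - m).^2);
disp(losses)
```

Only the second observation has a negative margin, so it is the sole contributor to the classification error, while the hinge and quadratic losses also penalize the third observation for its small positive margin.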

This table describes the supported loss functions that you can specify by using the `'LossFun'` name-value argument.

Loss Function | Value of `LossFun` | Equation |
---|---|---|
Binomial deviance | `'binodeviance'` | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{log}\left\{1+\mathrm{exp}\left[-2{m}_{j}\right]\right\}}.$$ |
Exponential loss | `'exponential'` | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{exp}\left(-{m}_{j}\right)}.$$ |
Misclassified rate in decimal | `'classiferror'` | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}I\left\{{\widehat{y}}_{j}\ne {y}_{j}\right\}}.$$ $${\widehat{y}}_{j}$$ is the class label corresponding to the class with the maximal score. |
Hinge loss | `'hinge'` | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{max}\left\{0,1-{m}_{j}\right\}}.$$ |
Logit loss | `'logit'` | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}\mathrm{log}\left(1+\mathrm{exp}\left(-{m}_{j}\right)\right)}.$$ |
Minimal expected misclassification cost | `'mincost'` | The software computes the weighted minimal expected classification cost using this procedure for observations *j* = 1,...,*n*. (1) Estimate the expected misclassification cost of classifying the observation $${X}_{j}$$ into the class *k*: $${\gamma}_{jk}={\left(f{\left({X}_{j}\right)}^{\prime}C\right)}_{k}.$$ $$f\left({X}_{j}\right)$$ is the column vector of class posterior probabilities for binary and multiclass classification for the observation $${X}_{j}$$. *C* is the cost matrix stored in the `Cost` property of the model. (2) For observation *j*, predict the class label corresponding to the minimal expected misclassification cost: $${\widehat{y}}_{j}=\underset{k=1,\mathrm{...},K}{\text{argmin}}{\gamma}_{jk}.$$ (3) Using *C*, identify the cost incurred ($${c}_{j}$$) for making the prediction. The weighted average of the minimal expected misclassification cost loss is $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}{c}_{j}}.$$ If you use the default cost matrix (whose element value is 0 for correct classification and 1 for incorrect classification), then the `'mincost'` loss is equivalent to the `'classiferror'` loss. |
Quadratic loss | `'quadratic'` | $$L={\displaystyle \sum _{j=1}^{n}{w}_{j}{\left(1-{m}_{j}\right)}^{2}}.$$ |

This figure compares the loss functions (except `'mincost'`) over the score *m* for one observation. Some functions are normalized to pass through the point (0,1).
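Because `'mincost'` is defined by a procedure instead of a margin formula, a small worked sketch may help. The code below follows the three steps from the table for made-up posterior probabilities, using the default 0-1 cost matrix. All values and the variable names `P`, `truth`, `expCost`, `yhat`, `c`, and `Lmincost` are illustrative assumptions; this is not the toolbox's internal implementation.

```matlab
% Made-up inputs: 3 observations, 2 classes.
P = [0.9 0.1; 0.4 0.6; 0.3 0.7];  % rows are class posterior probabilities f(X_j)'
Cost = ones(2) - eye(2);          % default cost: 0 if correct, 1 if incorrect
truth = [1; 1; 2];                % true class index of each observation
w = [0.5; 0.25; 0.25];            % observation weights, already summing to 1

% Step 1: expected cost of assigning observation j to class k.
expCost = P * Cost;

% Step 2: predict the class with the minimal expected cost.
[~, yhat] = min(expCost, [], 2);

% Step 3: look up the cost actually incurred by each prediction.
c = Cost(sub2ind(size(Cost), truth, yhat));

Lmincost = sum(w .* c)            % weighted average of the incurred costs
Lerror   = sum(w .* (yhat ~= truth))  % weighted misclassification rate
```

With the default cost matrix the two quantities coincide, since each incurred cost is 1 exactly when the prediction is wrong.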
