
Classification loss for Gaussian kernel classification model

`L = loss(Mdl,Tbl,ResponseVarName)` returns the classification loss for the model `Mdl` using the predictor data in `Tbl` and the true class labels in `Tbl.ResponseVarName`.

`L = loss(___,Name,Value)` specifies options using one or more name-value pair arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify a classification loss function and observation weights. Then, `loss` returns the weighted classification loss using the specified loss function.

Load the `ionosphere` data set. This data set has 34 predictors and 351 binary responses for radar returns, either bad (`'b'`) or good (`'g'`).

`load ionosphere`

Partition the data set into training and test sets. Specify a 15% holdout sample for the test set.

rng('default') % For reproducibility

Partition = cvpartition(Y,'Holdout',0.15);

trainingInds = training(Partition); % Indices for the training set

testInds = test(Partition); % Indices for the test set

Train a binary kernel classification model using the training set.

Mdl = fitckernel(X(trainingInds,:),Y(trainingInds));

Estimate the training-set classification error and the test-set classification error.

ceTrain = loss(Mdl,X(trainingInds,:),Y(trainingInds))

ceTrain = 0.0067

ceTest = loss(Mdl,X(testInds,:),Y(testInds))

ceTest = 0.1140

Load the `ionosphere` data set. This data set has 34 predictors and 351 binary responses for radar returns, either bad (`'b'`) or good (`'g'`).

`load ionosphere`

Partition the data set into training and test sets. Specify a 15% holdout sample for the test set.

rng('default') % For reproducibility

Partition = cvpartition(Y,'Holdout',0.15);

trainingInds = training(Partition); % Indices for the training set

testInds = test(Partition); % Indices for the test set

Train a binary kernel classification model using the training set.

Mdl = fitckernel(X(trainingInds,:),Y(trainingInds));

Create an anonymous function that measures linear loss, that is,

$$L=\frac{\sum _{j}-{w}_{j}{y}_{j}{f}_{j}}{\sum _{j}{w}_{j}}.$$

$${w}_{j}$$ is the weight for observation *j*, $${y}_{j}$$ is response *j* (-1 for the negative class, and 1 otherwise), and $${f}_{j}$$ is the raw classification score of observation *j*.

linearloss = @(C,S,W,Cost)sum(-W.*sum(S.*C,2))/sum(W);

Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the `'LossFun'` name-value pair argument.
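As a quick check of the linear loss formula, the following sketch evaluates `linearloss` on a small hand-made input. The values of `C`, `S`, and `W` are made up for illustration and do not come from the `ionosphere` model. The sketch also assumes that the two score columns are negatives of each other, so that the true-class score equals $${y}_{j}{f}_{j}$$.

```matlab
% Hypothetical inputs for three observations and two classes.
% Column 1 is the negative class, column 2 is the positive class.
C = logical([1 0; 0 1; 0 1]);           % true class membership
S = [0.8 -0.8; -0.5 0.5; 0.3 -0.3];     % assumed scores; row j is [-f_j, f_j]
W = [0.2; 0.3; 0.5];                    % observation weights
Cost = ones(2) - eye(2);                % not used by linearloss, but part of the signature

linearloss = @(C,S,W,Cost) sum(-W.*sum(S.*C,2))/sum(W);

% sum(S.*C,2) selects the score of the true class for each observation.
trueClassScore = sum(S.*C,2);           % [0.8; 0.5; -0.3]
L = linearloss(C,S,W,Cost);             % -(0.16 + 0.15 - 0.15)/1 = -0.16
```

A more negative value means larger true-class scores on average, which is why the trained model in the example reports negative linear losses.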

Estimate the training-set classification loss and the test-set classification loss using the linear loss function.

`ceTrain = loss(Mdl,X(trainingInds,:),Y(trainingInds),'LossFun',linearloss)`

ceTrain = -1.0851

`ceTest = loss(Mdl,X(testInds,:),Y(testInds),'LossFun',linearloss)`

ceTest = -0.7821

`Mdl` - Binary kernel classification model

`ClassificationKernel` model object

Binary kernel classification model, specified as a `ClassificationKernel` model object. You can create a `ClassificationKernel` model object using `fitckernel`.

`Y` - Class labels

categorical array | character array | string array | logical vector | numeric vector | cell array of character vectors

Class labels, specified as a categorical, character, or string array; logical or numeric vector; or cell array of character vectors.

- The data type of `Y` must be the same as the data type of `Mdl.ClassNames`. (The software treats string arrays as cell arrays of character vectors.)
- The distinct classes in `Y` must be a subset of `Mdl.ClassNames`.
- If `Y` is a character array, then each element must correspond to one row of the array.
- The length of `Y` must be equal to the number of observations in `X` or `Tbl`.

**Data Types:** `categorical` | `char` | `string` | `logical` | `single` | `double` | `cell`
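The requirements on `Y` can be checked before calling `loss`. The following is a minimal sketch with made-up labels; `classNames` and `numObs` are hypothetical stand-ins for `Mdl.ClassNames` and the number of rows in `X` or `Tbl`, and the type check does not cover the string-array exception noted above.

```matlab
% Hypothetical class names and labels, stored as cell arrays of character vectors.
classNames = {'b'; 'g'};          % stands in for Mdl.ClassNames
Y = {'g'; 'b'; 'g'; 'g'};         % true labels for four observations
numObs = 4;                       % stands in for size(X,1) or height(Tbl)

sameType   = strcmp(class(Y), class(classNames));    % data types must match
isSubset   = all(ismember(unique(Y), classNames));   % classes must be a subset
sameLength = numel(Y) == numObs;                     % one label per observation

labelsOK = sameType && isSubset && sameLength;
```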

`Tbl` - Sample data

table

Sample data used to train the model, specified as a table. Each row of `Tbl` corresponds to one observation, and each column corresponds to one predictor variable. Optionally, `Tbl` can contain additional columns for the response variable and observation weights. `Tbl` must contain all the predictors used to train `Mdl`. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If `Tbl` contains the response variable used to train `Mdl`, then you do not need to specify `ResponseVarName` or `Y`.

If you train `Mdl` using sample data contained in a table, then the input data for `loss` must also be in a table.

`ResponseVarName` - Response variable name

name of variable in `Tbl`

Response variable name, specified as the name of a variable in `Tbl`. If `Tbl` contains the response variable used to train `Mdl`, then you do not need to specify `ResponseVarName`.

If you specify `ResponseVarName`, then you must specify it as a character vector or string scalar. For example, if the response variable is stored as `Tbl.Y`, then specify `ResponseVarName` as `'Y'`. Otherwise, the software treats all columns of `Tbl`, including `Tbl.Y`, as predictors.

The response variable must be a categorical, character, or string array; a logical or numeric vector; or a cell array of character vectors. If the response variable is a character array, then each element must correspond to one row of the array.

**Data Types:** `char` | `string`
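The table-based syntax can be sketched as follows. This example assumes that the `ionosphere` data set and `fitckernel` are available and that `fitckernel` accepts a table together with a response variable name. The variable name `'Y'` is chosen here for illustration only.

```matlab
load ionosphere                 % provides the predictor matrix X and the labels Y
rng('default')                  % for reproducibility

Tbl = array2table(X);           % one single-column table variable per predictor
Tbl.Y = Y;                      % append the response as the variable 'Y'

Mdl = fitckernel(Tbl,'Y');      % train on the table, naming the response variable
L = loss(Mdl,Tbl,'Y');          % classification error on the training table
```

`array2table` is used because multicolumn table variables are not allowed, so each predictor needs its own column.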

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

```
L = loss(Mdl,X,Y,'LossFun','quadratic','Weights',weights)
```

returns the weighted classification loss using the quadratic loss function.

`LossFun` - Loss function

`'classiferror'` (default) | `'binodeviance'` | `'exponential'` | `'hinge'` | `'logit'` | `'mincost'` | `'quadratic'` | function handle

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss function name or a function handle.

This table lists the available loss functions. Specify one using its corresponding value.

Value | Description |
---|---|
`'binodeviance'` | Binomial deviance |
`'classiferror'` | Misclassified rate in decimal |
`'exponential'` | Exponential loss |
`'hinge'` | Hinge loss |
`'logit'` | Logistic loss |
`'mincost'` | Minimal expected misclassification cost (for classification scores that are posterior probabilities) |
`'quadratic'` | Quadratic loss |

`'mincost'` is appropriate for classification scores that are posterior probabilities. For kernel classification models, logistic regression learners return posterior probabilities as classification scores by default, but SVM learners do not (see `predict`).

To specify a custom loss function, use function handle notation. The function must have this form:

`lossvalue = lossfun(C,S,W,Cost)`

- The output argument `lossvalue` is a scalar.
- You specify the function name (`lossfun`).
- `C` is an `n`-by-`K` logical matrix with rows indicating the class to which the corresponding observation belongs. `n` is the number of observations in `Tbl` or `X`, and `K` is the number of distinct classes (`numel(Mdl.ClassNames)`). The column order corresponds to the class order in `Mdl.ClassNames`. Create `C` by setting `C(p,q) = 1`, if observation `p` is in class `q`, for each row. Set all other elements of row `p` to `0`.
- `S` is an `n`-by-`K` numeric matrix of classification scores, similar to the output of `predict`. The column order corresponds to the class order in `Mdl.ClassNames`.
- `W` is an `n`-by-1 numeric vector of observation weights.
- `Cost` is a `K`-by-`K` numeric matrix of misclassification costs. For example, `Cost = ones(K) - eye(K)` specifies a cost of `0` for correct classification and `1` for misclassification.

**Example:** `'LossFun',@lossfun`

**Data Types:** `char` | `string` | `function_handle`
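The inputs of a custom loss function can be mocked up by hand to test the function before passing it to `loss`. The following sketch builds the indicator matrix `C` from made-up labels and evaluates a hypothetical custom loss, a weighted misclassification rate. All data values and the names `classNames`, `classIdx`, and `lossfun` are illustrative assumptions.

```matlab
% Hypothetical data: four observations, classes ordered as in classNames.
classNames = {'b'; 'g'};                  % stands in for Mdl.ClassNames
Y = {'g'; 'b'; 'g'; 'b'};                 % true labels
n = numel(Y);
K = numel(classNames);

% Build the n-by-K logical indicator matrix C: C(p,q) is true
% when observation p belongs to class q.
[~, classIdx] = ismember(Y, classNames);  % column index of each observation's class
C = false(n, K);
C(sub2ind([n K], (1:n)', classIdx)) = true;

S = [-1.2 1.2; 0.4 -0.4; -0.1 0.1; -0.7 0.7];   % made-up classification scores
W = ones(n,1)/n;                                 % equal weights that sum to 1
Cost = ones(K) - eye(K);                         % 0 for correct, 1 for incorrect

% Custom loss in the required form. An observation counts as misclassified
% when its true-class score is below the largest score in its row.
lossfun = @(C,S,W,Cost) sum(W .* (sum(S.*C,2) < max(S,[],2))) / sum(W);

lossvalue = lossfun(C,S,W,Cost);   % only the fourth observation is misclassified
```

With a trained model, the same handle would be supplied as `'LossFun',lossfun`.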

`Weights` - Observation weights

`ones(size(X,1),1)` (default) | numeric vector | name of variable in `Tbl`

Observation weights, specified as the comma-separated pair consisting of `'Weights'` and a numeric vector or the name of a variable in `Tbl`.

- If `Weights` is a numeric vector, then the size of `Weights` must be equal to the number of rows in `X` or `Tbl`.
- If `Weights` is the name of a variable in `Tbl`, you must specify `Weights` as a character vector or string scalar. For example, if the weights are stored as `Tbl.W`, then specify `Weights` as `'W'`. Otherwise, the software treats all columns of `Tbl`, including `Tbl.W`, as predictors.

If you supply weights, `loss` computes the weighted classification loss and normalizes the weights to sum up to the value of the prior probability in the respective class.

**Data Types:** `double` | `single` | `char` | `string`
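The weight normalization described above can be illustrated with a small calculation. This is a sketch of the stated rule, not the software's internal code. The weights, class indices, and prior probabilities are made up.

```matlab
% Hypothetical inputs: raw weights, class index per observation, and priors.
w        = [1; 2; 1; 4];     % raw observation weights
classIdx = [1; 1; 2; 2];     % class of each observation (1 or 2)
prior    = [0.4; 0.6];       % stands in for the model's prior class probabilities

% Rescale the weights within each class so that they sum to that class's prior.
classTotals = accumarray(classIdx, w);                   % raw weight per class: [3; 5]
wNorm = w .* prior(classIdx) ./ classTotals(classIdx);

sumClass1 = sum(wNorm(classIdx == 1));   % 0.4
sumClass2 = sum(wNorm(classIdx == 2));   % 0.6
```

Because the priors sum to 1, the normalized weights also sum to 1.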

`L` - Classification loss

numeric scalar

Classification loss, returned as a numeric scalar. The interpretation of `L` depends on `Weights` and `LossFun`.

*Classification loss* functions measure the predictive
inaccuracy of classification models. When you compare the same type of loss among many
models, a lower loss indicates a better predictive model.

Suppose the following:

- *L* is the weighted average classification loss.
- *n* is the sample size.
- $${y}_{j}$$ is the observed class label. The software codes it as -1 or 1, indicating the negative or positive class (or the first or second class in the `ClassNames` property), respectively.
- $$f\left({X}_{j}\right)$$ is the positive-class classification score for observation (row) *j* of the predictor data *X*.
- $${m}_{j}={y}_{j}f\left({X}_{j}\right)$$ is the classification score for classifying observation *j* into the class corresponding to $${y}_{j}$$. Positive values of $${m}_{j}$$ indicate correct classification and do not contribute much to the average loss. Negative values of $${m}_{j}$$ indicate incorrect classification and contribute significantly to the average loss.
- The weight for observation *j* is $${w}_{j}$$. The software normalizes the observation weights so that they sum to the corresponding prior class probability. The software also normalizes the prior probabilities so that they sum to 1. Therefore,

  $$\sum _{j=1}^{n}{w}_{j}=1.$$
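The label coding and the score $${m}_{j}$$ defined above can be computed directly. The following sketch uses made-up labels and positive-class scores. The names `classNames`, `Y`, and `f` are illustrative assumptions.

```matlab
% Hypothetical labels and positive-class scores for four observations.
classNames = {'b'; 'g'};              % first class is negative, second is positive
Y = {'g'; 'b'; 'g'; 'b'};             % observed labels
f = [1.2; -0.4; -0.3; 0.7];           % positive-class scores f(X_j)

% Code the labels as -1 (first class) or 1 (second class).
y = 2*strcmp(Y, classNames{2}) - 1;   % [1; -1; 1; -1]

% m_j = y_j * f(X_j): positive for correct, negative for incorrect classification.
m = y .* f;                           % [1.2; 0.4; -0.3; -0.7]
isCorrect = m > 0;
```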

This table describes the supported loss functions that you can specify by using the `'LossFun'` name-value argument.

Loss Function | Value of `LossFun` | Equation |
---|---|---|
Binomial deviance | `'binodeviance'` | $$L=\sum _{j=1}^{n}{w}_{j}\mathrm{log}\left\{1+\mathrm{exp}\left[-2{m}_{j}\right]\right\}.$$ |
Exponential loss | `'exponential'` | $$L=\sum _{j=1}^{n}{w}_{j}\mathrm{exp}\left(-{m}_{j}\right).$$ |
Misclassified rate in decimal | `'classiferror'` | $$L=\sum _{j=1}^{n}{w}_{j}I\left\{{\widehat{y}}_{j}\ne {y}_{j}\right\}.$$ $${\widehat{y}}_{j}$$ is the class label corresponding to the class with the maximal score. |
Hinge loss | `'hinge'` | $$L=\sum _{j=1}^{n}{w}_{j}\mathrm{max}\left\{0,1-{m}_{j}\right\}.$$ |
Logit loss | `'logit'` | $$L=\sum _{j=1}^{n}{w}_{j}\mathrm{log}\left(1+\mathrm{exp}\left(-{m}_{j}\right)\right).$$ |
Minimal expected misclassification cost | `'mincost'` | $$L=\sum _{j=1}^{n}{w}_{j}{c}_{j},$$ with $${c}_{j}$$ computed by the procedure that follows this table. |
Quadratic loss | `'quadratic'` | $$L=\sum _{j=1}^{n}{w}_{j}{\left(1-{m}_{j}\right)}^{2}.$$ |

For `'mincost'`, the software computes the weighted minimal expected classification cost using this procedure for observations *j* = 1,...,*n*.

1. Estimate the expected misclassification cost of classifying the observation $${X}_{j}$$ into the class *k*:

   $${\gamma}_{jk}={\left(f{\left({X}_{j}\right)}^{\prime}C\right)}_{k}.$$

   $$f\left({X}_{j}\right)$$ is the column vector of class posterior probabilities for binary and multiclass classification for the observation $${X}_{j}$$. *C* is the cost matrix stored in the `Cost` property of the model.
2. For observation *j*, predict the class label corresponding to the minimal expected misclassification cost:

   $${\widehat{y}}_{j}=\underset{k=1,\mathrm{...},K}{\text{argmin}}{\gamma}_{jk}.$$
3. Using *C*, identify the cost incurred ($${c}_{j}$$) for making the prediction.

The weighted average of the minimal expected misclassification cost loss is

$$L=\sum _{j=1}^{n}{w}_{j}{c}_{j}.$$

If you use the default cost matrix (whose element value is 0 for correct classification and 1 for incorrect classification), then the `'mincost'` loss is equivalent to the `'classiferror'` loss.
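The `'mincost'` procedure described above can be traced on a small example. This is a sketch under stated assumptions, not the software's implementation: the posterior probabilities in `P`, the true class indices, and the weights are made up, and `CostMat(i,k)` is taken to be the cost of predicting class `k` when the true class is `i`.

```matlab
% Hypothetical posteriors for four observations and two classes (rows sum to 1).
P = [0.9 0.1; 0.3 0.7; 0.6 0.4; 0.2 0.8];
trueIdx = [1; 2; 2; 2];              % true class index of each observation
w = [0.25; 0.25; 0.25; 0.25];        % weights already normalized to sum to 1
CostMat = [0 1; 1 0];                % default cost: 0 if correct, 1 if incorrect

% Step 1: expected cost of classifying observation j into class k.
gammaJK = P * CostMat;               % row j is f(X_j)' * C

% Step 2: predict the class with the minimal expected cost.
[~, predIdx] = min(gammaJK, [], 2);  % [1; 2; 1; 2]

% Step 3: cost actually incurred, given the true class and the prediction.
c = CostMat(sub2ind(size(CostMat), trueIdx, predIdx));   % [0; 0; 1; 0]

% Weighted average of the incurred costs.
lossMinCost = sum(w .* c);

% With the default cost matrix, the result matches the misclassification rate.
lossClassifError = sum(w .* (predIdx ~= trueIdx));
```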

This figure compares the loss functions (except `'mincost'`) over the score *m* for one observation. Some functions are normalized to pass through the point (0,1).
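The per-observation losses can also be written as functions of *m* and evaluated numerically. The sketch below uses the equations from the table. The division by `log(2)` is an assumption about how a curve could be rescaled to pass through (0,1), not a documented step, and the margins and weights are made up. The `classiferror` handle assumes that a prediction is wrong exactly when *m* is negative.

```matlab
% Per-observation losses as functions of the score m.
binodeviance = @(m) log(1 + exp(-2*m));
exponential  = @(m) exp(-m);
classiferror = @(m) double(m < 0);     % assumes a wrong prediction exactly when m < 0
hinge        = @(m) max(0, 1 - m);
logit        = @(m) log(1 + exp(-m));
quadratic    = @(m) (1 - m).^2;

% At m = 0, exponential, hinge, and quadratic equal 1,
% while binodeviance and logit equal log(2).
atZero = [binodeviance(0), exponential(0), hinge(0), logit(0), quadratic(0)];

% Assumed rescaling so that binodeviance and logit pass through (0,1).
binodevianceScaled = @(m) binodeviance(m) / log(2);
logitScaled        = @(m) logit(m) / log(2);

% Weighted average losses for hypothetical scores and normalized weights.
m = [1.2; 0.4; -0.3; -0.7];
w = [0.25; 0.25; 0.25; 0.25];
lossHinge     = sum(w .* hinge(m));
lossQuadratic = sum(w .* quadratic(m));
lossClassif   = sum(w .* classiferror(m));
```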

Calculate with arrays that have more rows than fit in memory.

Usage notes and limitations:

- `loss` does not support tall `table` data.

For more information, see Tall Arrays.
