# loss

Classification error

## Syntax

```
L = loss(ens,tbl,ResponseVarName)
L = loss(ens,tbl,Y)
L = loss(ens,X,Y)
L = loss(___,Name,Value)
```

## Description

`L = loss(ens,tbl,ResponseVarName)` returns the classification error for ensemble `ens` computed using the table of predictors `tbl` and the true class labels `tbl.ResponseVarName`.

`L = loss(ens,tbl,Y)` returns the classification error for ensemble `ens` computed using the table of predictors `tbl` and the true class labels `Y`.

`L = loss(ens,X,Y)` returns the classification error for ensemble `ens` computed using the matrix of predictors `X` and the true class labels `Y`.

`L = loss(___,Name,Value)` computes classification error with additional options specified by one or more `Name,Value` pair arguments, using any of the previous syntaxes.

When computing the loss, `loss` normalizes the class probabilities in `ResponseVarName` or `Y` to the class probabilities used for training, stored in the `Prior` property of `ens`.
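For example, assuming `ens` was trained on a table `tbl` whose response variable is named `Species` (the variable names here are illustrative, not part of this page), the first two syntaxes are interchangeable:

```
% Sketch: ens trained on table tbl with response variable 'Species'
% (names assumed for illustration).
L1 = loss(ens,tbl,'Species');    % response referenced by name
L2 = loss(ens,tbl,tbl.Species);  % response passed as a label array
```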

## Input Arguments

`ens`

Classification ensemble created with `fitcensemble`, or a compact classification ensemble created with `compact`.

`tbl`

Sample data, specified as a table. Each row of `tbl` corresponds to one observation, and each column corresponds to one predictor variable. `tbl` must contain all of the predictors used to train the model. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed. If you trained `ens` using sample data contained in a table, then the input data for this method must also be in a table.

`ResponseVarName`

Response variable name, specified as the name of a variable in `tbl`. You must specify `ResponseVarName` as a character vector or string scalar. For example, if the response variable `Y` is stored as `tbl.Y`, then specify it as `'Y'`. Otherwise, the software treats all columns of `tbl`, including `Y`, as predictors when training the model.

`X`

Matrix of data to classify. Each row of `X` represents one observation, and each column represents one predictor. `X` must have the same number of columns as the data used to train `ens`, and the same number of rows as the number of elements in `Y`. If you trained `ens` using sample data contained in a matrix, then the input data for this method must also be in a matrix.

`Y`

Class labels of the observations in `tbl` or `X`. `Y` must be of the same type as the class labels used to train `ens`, and its number of elements must equal the number of rows of `tbl` or `X`.

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

`'learners'`

Indices of weak learners in the ensemble, ranging from `1` to `ens.NumTrained`. `loss` uses only these learners for calculating loss.

Default: `1:NumTrained`
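For example, a minimal sketch that evaluates the loss using only the first ten trained learners (assuming `ens` has at least ten):

```
% Sketch: use only weak learners 1 through 10 when computing the loss.
L10 = loss(ens,X,Y,'learners',1:10);
```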

`'LossFun'`

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss function name or a function handle.

• The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar.

| Value | Description |
| --- | --- |
| `'binodeviance'` | Binomial deviance |
| `'classiferror'` | Classification error |
| `'exponential'` | Exponential |
| `'hinge'` | Hinge |
| `'logit'` | Logistic |
| `'mincost'` | Minimal expected misclassification cost (for classification scores that are posterior probabilities) |
| `'quadratic'` | Quadratic |

`'mincost'` is appropriate for classification scores that are posterior probabilities.

• Bagged and subspace ensembles return posterior probabilities by default (`ens.Method` is `'Bag'` or `'Subspace'`).

• If the ensemble method is `'AdaBoostM1'`, `'AdaBoostM2'`, `'GentleBoost'`, or `'LogitBoost'`, then, to use posterior probabilities as classification scores, you must specify the double-logit score transform by entering the following (a fuller sketch appears after this parameter's description):

`ens.ScoreTransform = 'doublelogit';`

• For all other ensemble methods, the software does not support posterior probabilities as classification scores.

• Specify your own function using function handle notation.

Suppose that `n` is the number of observations in `X` and `K` is the number of distinct classes (`numel(ens.ClassNames)`, where `ens` is the input model). Your function must have this signature:

`lossvalue = lossfun(C,S,W,Cost)`

where:

• The output argument `lossvalue` is a scalar.

• You choose the function name (`lossfun`).

• `C` is an `n`-by-`K` logical matrix whose rows indicate the class to which the corresponding observation belongs. The column order corresponds to the class order in `ens.ClassNames`.

Construct `C` by setting `C(p,q) = 1` if observation `p` is in class `q`, for each row. Set all other elements of row `p` to `0`.

• `S` is an `n`-by-`K` numeric matrix of classification scores, similar to the output of `predict`. The column order corresponds to the class order in `ens.ClassNames`.

• `W` is an `n`-by-1 numeric vector of observation weights. If you pass `W`, the software normalizes them to sum to `1`.

• `Cost` is a `K`-by-`K` numeric matrix of misclassification costs. For example, `Cost = ones(K) - eye(K)` specifies a cost of `0` for correct classification and `1` for misclassification.

Specify your function using `'LossFun',@lossfun`.
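As an illustration, here is a minimal sketch of such a function; the name `myLoss` is hypothetical, and it computes the weighted misclassification rate while ignoring `Cost`:

```
function lossvalue = myLoss(C,S,W,Cost)
% Weighted misclassification rate (Cost accepted but unused).
[~,yhat]  = max(S,[],2);   % predicted class = column with highest score
[~,ytrue] = max(C,[],2);   % true class from the logical indicator matrix
lossvalue = sum(W.*(yhat ~= ytrue));   % W is already normalized to sum to 1
end
```

You would then call `loss(ens,X,Y,'LossFun',@myLoss)`.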

For more details on loss functions, see Classification Loss.

Default: `'classiferror'`
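Combining the score-transform note above with `'mincost'`, a sketch for a boosted ensemble might look like this (variable names assumed):

```
% Sketch: convert boosted-ensemble scores to posterior probabilities,
% then compute the minimal expected misclassification cost.
ens.ScoreTransform = 'doublelogit';
L = loss(ens,X,Y,'LossFun','mincost');
```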

`'mode'`

Meaning of the output `L`:

• `'ensemble'`: `L` is a scalar value, the loss for the entire ensemble.

• `'individual'`: `L` is a vector with one element per trained learner.

• `'cumulative'`: `L` is a vector in which element `J` is obtained by using learners `1:J` from the input list of learners.

Default: `'ensemble'`
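For example, a sketch that uses `'mode','cumulative'` to see how the loss evolves as learners are added:

```
% Sketch: loss as a function of the number of weak learners used.
Lcum = loss(ens,X,Y,'mode','cumulative');
plot(Lcum)
xlabel('Number of weak learners')
ylabel('Classification error')
```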

`'UseObsForLearner'`

A logical matrix of size `N`-by-`T`, where:

• `N` is the number of rows (observations) of `X`.

• `T` is the number of trained weak learners in `ens`.

When `UseObsForLearner(i,j)` is `true`, learner `j` is used in predicting the class of row `i` of `X`.

Default: `true(N,T)`
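For example, a sketch that lets each learner predict only a random half of the observations (variable names assumed):

```
% Sketch: a random N-by-T logical mask of observation/learner pairs.
N = size(X,1);
T = ens.NumTrained;
useObs = rand(N,T) > 0.5;
L = loss(ens,X,Y,'UseObsForLearner',useObs);
```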

`'weights'`

Vector of observation weights, with nonnegative entries. The length of `weights` must equal the number of rows in `X`. When you specify weights, `loss` normalizes the weights so that observation weights in each class sum to the prior probability of that class.

Default: `ones(size(X,1),1)`
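For example, a sketch that doubles the weight of the first 50 observations (assuming `X` has at least 50 rows):

```
% Sketch: upweight the first 50 observations; loss renormalizes the
% weights within each class to match the class prior probabilities.
w = ones(size(X,1),1);
w(1:50) = 2;
L = loss(ens,X,Y,'weights',w);
```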

## Output Arguments

`L`

Classification loss, by default the fraction of misclassified data. `L` can be a scalar or a vector, and its meaning depends on the name-value pair settings, notably `'mode'`.

## Examples


Load Fisher's iris data set.

`load fisheriris`

Train a classification ensemble of 100 decision trees using AdaBoostM2. Specify tree stumps as the weak learners.

```
t = templateTree('MaxNumSplits',1);
ens = fitcensemble(meas,species,'Method','AdaBoostM2','Learners',t);
```

Estimate the classification error of the model using the training observations.

`L = loss(ens,meas,species)`
```
L = 0.0333
```

Alternatively, if `ens` is not compact, then you can estimate the training-sample classification error by passing `ens` to `resubLoss`.
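As a sketch, assuming `ens` is the full (non-compact) ensemble trained above:

```
% Sketch: resubstitution loss equals the loss on the training data.
Lresub = resubLoss(ens)
```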