# kfoldLoss

Regression loss for cross-validated kernel regression model

## Syntax

```
L = kfoldLoss(CVMdl)
L = kfoldLoss(CVMdl,Name,Value)
```

## Description


`L = kfoldLoss(CVMdl)` returns the regression loss obtained by the cross-validated kernel regression model `CVMdl`. For every fold, `kfoldLoss` computes the regression loss for observations in the validation fold, using a model trained on observations in the training fold.

`L = kfoldLoss(CVMdl,Name,Value)` returns the mean squared error (MSE) with additional options specified by one or more name-value pair arguments. For example, you can specify the regression-loss function or which folds to use for the loss calculation.

## Examples


Simulate sample data:

```
rng(0,'twister'); % For reproducibility
n = 1000;
x = linspace(-10,10,n)';
y = 1 + x*2e-2 + sin(x)./x + 0.2*randn(n,1);
```

Cross-validate a kernel regression model.

`CVMdl = fitrkernel(x,y,'Kfold',5);`

`fitrkernel` implements 5-fold cross-validation. `CVMdl` is a `RegressionPartitionedKernel` model. It contains the property `Trained`, which is a 5-by-1 cell array holding 5 `RegressionKernel` models that the software trained using the training sets.

Compute the epsilon-insensitive loss for each fold for observations that `fitrkernel` did not use in training the folds.

```
L = kfoldLoss(CVMdl,'LossFun','epsiloninsensitive','Mode','individual')
```
```
L = 5×1

    0.2812
    0.3223
    0.3073
    0.3117
    0.2576
```
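To see where these per-fold values come from, the following sketch recomputes the epsilon-insensitive loss for fold 1 by hand. It assumes `CVMdl`, `x`, and `y` from the example above, and reads epsilon from the trained fold model's `Epsilon` property; it is an unweighted sketch, not the exact weighted computation `kfoldLoss` performs.

```
% Manual epsilon-insensitive loss for fold 1 (sketch)
idx  = test(CVMdl.Partition,1);           % validation observations of fold 1
yhat = predict(CVMdl.Trained{1},x(idx));  % model trained without fold 1
eps1 = CVMdl.Trained{1}.Epsilon;          % epsilon of the trained kernel model
L1   = mean(max(0,abs(y(idx) - yhat) - eps1));
```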

## Input Arguments


### `CVMdl` — Cross-validated kernel regression model

Cross-validated kernel regression model, specified as a `RegressionPartitionedKernel` model object. You can create a `RegressionPartitionedKernel` model by using `fitrkernel` and specifying any one of the cross-validation name-value pair arguments, for example, `CrossVal`.

### Name-Value Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'LossFun','epsiloninsensitive','Mode','individual'` specifies `kfoldLoss` to return the epsilon-insensitive loss for each fold.

### `Folds` — Fold indices

Fold indices to use for response prediction, specified as the comma-separated pair consisting of `'Folds'` and a numeric vector of positive integers. The elements of `Folds` must range from `1` through `CVMdl.KFold`.

Example: `'Folds',[1 4 10]`

Data Types: `single` | `double`
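For instance, a short sketch (assuming `CVMdl` is the 5-fold model from the example above) that evaluates the loss on a subset of folds:

```
% Compute the MSE using only the models and validation sets of folds 1 and 3
Lsub = kfoldLoss(CVMdl,'Folds',[1 3]);
```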

### `LossFun` — Loss function

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss-function name or a function handle.

- The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. In the table, $f(x) = x\beta + b$, where:

  - $\beta$ is a vector of $p$ coefficients.
  - $x$ is an observation from $p$ predictor variables.
  - $b$ is the scalar bias.

  | Value | Description |
  | --- | --- |
  | `'epsiloninsensitive'` | Epsilon-insensitive loss: $\ell\left[y,f(x)\right] = \max\left[0, \lvert y - f(x)\rvert - \varepsilon\right]$ |
  | `'mse'` | MSE: $\ell\left[y,f(x)\right] = \left[y - f(x)\right]^2$ |

  `'epsiloninsensitive'` is appropriate for SVM learners only.

- Specify your own function using function handle notation.

  Assume that `n` is the number of observations in `X`. Your function must have this signature:

  ```
  lossvalue = lossfun(Y,Yhat,W)
  ```

  where:

  - The output argument `lossvalue` is a scalar.
  - You specify the function name (`lossfun`).
  - `Y` is an `n`-dimensional vector of observed responses.
  - `Yhat` is an `n`-dimensional vector of predicted responses, similar to the output of `predict`.
  - `W` is an `n`-by-1 numeric vector of observation weights.

Data Types: `char` | `string` | `function_handle`
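As an illustration, here is a sketch of a custom loss passed as a function handle. The weighted mean absolute error shown is a hypothetical example, not a built-in loss, and `CVMdl` is assumed to come from the example above.

```
% Hypothetical custom loss: weighted mean absolute error
maeLoss = @(Y,Yhat,W) sum(W.*abs(Y - Yhat))/sum(W);
L = kfoldLoss(CVMdl,'LossFun',maeLoss);
```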

### `Mode` — Loss aggregation level

Loss aggregation level, specified as the comma-separated pair consisting of `'Mode'` and `'average'` or `'individual'`.

| Value | Description |
| --- | --- |
| `'average'` | Returns losses averaged over all folds |
| `'individual'` | Returns losses for each fold |

Example: `'Mode','individual'`
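To see how the two modes relate, a quick sketch (assuming `CVMdl` is the 5-fold model from the example above, which has equal-sized folds so the average loss is close to the unweighted mean of the per-fold losses):

```
Lavg = kfoldLoss(CVMdl);                     % scalar; 'Mode','average' is the default
Lind = kfoldLoss(CVMdl,'Mode','individual'); % 5-by-1 vector, one MSE per fold
mean(Lind)                                   % close to Lavg for equal-sized folds
```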

## Output Arguments


### `L` — Cross-validated regression losses

Cross-validated regression losses, returned as a numeric scalar or vector. The interpretation of `L` depends on `LossFun` and `Mode`.

- If `Mode` is `'average'`, then `L` is a scalar containing the loss averaged over all folds.

- If `Mode` is `'individual'`, then `L` is a k-by-1 vector, where k is the number of folds, and `L(j)` is the average regression loss over fold `j`.

To estimate `L`, `kfoldLoss` uses the data that created `CVMdl`.