# predict

Predict responses for Gaussian kernel regression model

## Description

`YFit = predict(Mdl,X)` returns the predicted responses for the predictor data in `X` using the Gaussian kernel regression model `Mdl`.

`YFit = predict(Mdl,X,PredictionForMissingValue=prediction)` uses the `prediction` value as the predicted response for observations with missing values in the predictor data `X`. By default, `predict` uses the median of the observed response values in the training data. *(since R2023b)*

## Examples

### Predict Test Set Responses

Predict the test set responses using a Gaussian kernel regression model for the `carbig` data set.

Load the `carbig` data set.

`load carbig`

Specify the predictor variables (`X`) and the response variable (`Y`).

```
X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;
```

Delete rows of `X` and `Y` where either array has `NaN` values. Removing rows with `NaN` values before passing data to `fitrkernel` can speed up training and reduce memory usage.

```
R = rmmissing([X Y]);
X = R(:,1:4);
Y = R(:,end);
```

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

```
rng(10) % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp); % Test set indices
```

Train the regression kernel model. Standardize the training data.

```
Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
Mdl = fitrkernel(Xtrain,Ytrain,'Standardize',true)
```

```
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617
```

`Mdl` is a `RegressionKernel` model.

Predict responses for the test set.

```
Xtest = X(idxTest,:);
Ytest = Y(idxTest);
YFit = predict(Mdl,Xtest);
```

Create a table containing the first 10 observed response values and predicted response values.

```
table(Ytest(1:10),YFit(1:10),'VariableNames', ...
    {'ObservedValue','PredictedValue'})
```

```
ans=10×2 table
    ObservedValue    PredictedValue
    _____________    ______________
         18              17.616
         14              25.799
         24              24.141
         25              25.018
         14              13.637
         14              14.557
         18              18.584
         27              26.096
         21              25.031
         13              13.324
```

Estimate the test set regression loss using the mean squared error loss function.

```
L = loss(Mdl,Xtest,Ytest)
```

```
L = 9.2664
```
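To make the relationship between `predict` and `loss` concrete, the following minimal sketch recomputes the mean squared error directly from the predicted responses. It repeats the data preparation from this example so that it runs on its own, and it assumes the default `loss` settings (mean squared error with equal observation weights). The name `mseManual` is introduced here for illustration only.

```matlab
% Prepare the carbig data as in the example above.
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);

% Hold out 10% of the observations and train on the rest.
rng(10) % For reproducibility
cvp = cvpartition(length(Y),'Holdout',0.1);
Mdl = fitrkernel(X(training(cvp),:),Y(training(cvp)),'Standardize',true);

% Predict the held-out responses.
Xtest = X(test(cvp),:);
Ytest = Y(test(cvp));
YFit = predict(Mdl,Xtest);

% Mean squared error computed by hand (illustrative variable name).
mseManual = mean((Ytest - YFit).^2);

% Mean squared error reported by loss with its default settings.
L = loss(Mdl,Xtest,Ytest);

fprintf('Manual MSE: %.4f, loss: %.4f\n',mseManual,L)
```

Under these assumptions, the two printed values should agree up to floating-point rounding.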

## Input Arguments

`Mdl` - Kernel regression model

`RegressionKernel` model object

Kernel regression model, specified as a `RegressionKernel` model object. You can create a `RegressionKernel` model object using `fitrkernel`.

`X` - Predictor data used to generate responses

numeric matrix | table

Predictor data used to generate responses, specified as a numeric matrix or table.

Each row of `X` corresponds to one observation, and each column corresponds to one variable.

- For a numeric matrix:
  - The variables in the columns of `X` must have the same order as the predictor variables that trained `Mdl`.
  - If you trained `Mdl` using a table (for example, `Tbl`) and `Tbl` contains all numeric predictor variables, then `X` can be a numeric matrix. To treat numeric predictors in `Tbl` as categorical during training, identify categorical predictors using the `CategoricalPredictors` name-value pair argument of `fitrkernel`. If `Tbl` contains heterogeneous predictor variables (for example, numeric and categorical data types) and `X` is a numeric matrix, then `predict` throws an error.
- For a table:
  - `predict` does not support multicolumn variables or cell arrays other than cell arrays of character vectors.
  - If you trained `Mdl` using a table (for example, `Tbl`), then all predictor variables in `X` must have the same variable names and data types as those that trained `Mdl` (stored in `Mdl.PredictorNames`). However, the column order of `X` does not need to correspond to the column order of `Tbl`. Also, `Tbl` and `X` can contain additional variables (response variables, observation weights, and so on), but `predict` ignores them.
  - If you trained `Mdl` using a numeric matrix, then the predictor names in `Mdl.PredictorNames` and corresponding predictor variable names in `X` must be the same. To specify predictor names during training, see the `PredictorNames` name-value pair argument of `fitrkernel`. All predictor variables in `X` must be numeric vectors. `X` can contain additional variables (response variables, observation weights, and so on), but `predict` ignores them.

**Data Types:** `double` | `single` | `table`
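The table and matrix rules above can be illustrated with a small synthetic data set. This is a minimal sketch, not part of the original example: the variable names `x1`, `x2`, and `y`, the data, and the output names `yA`, `yB`, and `yC` are all made up for illustration.

```matlab
rng(1) % For reproducibility

% Synthetic training table with two numeric predictors and a response.
Tbl = table(randn(100,1),randn(100,1),'VariableNames',{'x1','x2'});
Tbl.y = 2*Tbl.x1 - Tbl.x2 + 0.1*randn(100,1);

% Train using the table; 'y' names the response variable.
Mdl = fitrkernel(Tbl,'y');

% Predict from a table in the training column order.
yA = predict(Mdl,Tbl(1:5,:));

% Predict from a table with reordered columns. Predictors are matched
% by name, and the extra response variable y is ignored.
TblNew = Tbl(1:5,{'y','x2','x1'});
yB = predict(Mdl,TblNew);

% Because all predictors are numeric, a numeric matrix also works,
% but its columns must follow the training predictor order (x1, x2).
yC = predict(Mdl,Tbl{1:5,{'x1','x2'}});

disp(table(yA,yB,yC))
```

If the rules hold as described, the three columns contain the same predictions. Passing the matrix columns in a different order would not raise an error, but it would silently produce different predictions, which is why the table form is the safer choice when column order might change.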

`prediction` - Predicted response value to use for observations with missing predictor values

`"median"` (default) | `"mean"` | numeric scalar

*Since R2023b*

Predicted response value to use for observations with missing predictor values, specified as `"median"`, `"mean"`, or a numeric scalar.

Value | Description |
---|---|
`"median"` | `predict` uses the median of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
`"mean"` | `predict` uses the mean of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
Numeric scalar | `predict` uses this value as the predicted response value for observations with missing predictor values. |

**Example:** `"mean"`

**Example:** `NaN`

**Data Types:** `single` | `double` | `char` | `string`
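The following minimal sketch shows how the three kinds of values might be used. It assumes R2023b or later, and the synthetic data and the names `Xnew`, `yMedian`, `yMean`, and `yNaN` are illustrative only.

```matlab
rng(0) % For reproducibility

% Synthetic training data with no missing values.
X = randn(200,3);
Y = X(:,1) - 2*X(:,2) + 0.1*randn(200,1);
Mdl = fitrkernel(X,Y);

% New data: the first observation has a missing predictor value.
Xnew = [0.5 NaN 1.0;
        0.1 0.2 0.3];

% Default behavior: training-set median for the incomplete observation.
yMedian = predict(Mdl,Xnew);

% Use the training-set mean instead.
yMean = predict(Mdl,Xnew,PredictionForMissingValue="mean");

% Use NaN, which flags the incomplete observation in the output.
yNaN = predict(Mdl,Xnew,PredictionForMissingValue=NaN);

disp([yMedian yMean yNaN])
```

Only the first row should differ across the three columns, because the second observation is complete and is predicted by the model in every call. Specifying `NaN` is a convenient way to make incomplete observations visible downstream rather than replacing them with a plausible-looking value.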

## Output Arguments

`YFit` - Predicted responses, returned as a numeric vector with one element for each row of `X`.

## Extended Capabilities

### Tall Arrays

Calculate with arrays that have more rows than fit in memory.

Usage notes and limitations:

`predict` does not support tall `table` data.

For more information, see Tall Arrays.
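As a minimal sketch of the tall-array workflow, the following code predicts responses for a tall numeric matrix. The data is synthetic and small enough to fit in memory, so the tall array here only stands in for a genuinely large data set. The call to `mapreducer(0)` is an optional choice made for this sketch so that the computation runs in the local MATLAB session.

```matlab
mapreducer(0); % Run tall computations in the local MATLAB session

rng(0) % For reproducibility

% Synthetic in-memory training data.
X = randn(500,3);
Y = X(:,1) + 0.5*X(:,2).^2 + 0.1*randn(500,1);
Mdl = fitrkernel(X,Y);

% Convert the predictors to a tall numeric matrix.
% Tall tables are not supported by predict.
tX = tall(X);

% Prediction on a tall array is deferred until the result is gathered.
tYFit = predict(Mdl,tX);

% Evaluate and bring the predictions into memory.
YFit = gather(tYFit);
```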

### C/C++ Code Generation

Generate C and C++ code using MATLAB® Coder™. (since R2023a)

Usage notes and limitations:

- Use `saveLearnerForCoder`, `loadLearnerForCoder`, and `codegen` (MATLAB Coder) to generate code for the `predict` function. Save a trained model by using `saveLearnerForCoder`. Define an entry-point function that loads the saved model by using `loadLearnerForCoder` and calls the `predict` function. Then use `codegen` to generate code for the entry-point function.
- To generate single-precision C/C++ code for `predict`, specify the name-value argument `"DataType","single"` when you call the `loadLearnerForCoder` function.
- If the code generator uses the Open Multiprocessing (OpenMP) library, the generated code of `predict` splits the predictor data `X` into multiple chunks and predicts responses for the chunks in parallel. The generated code uses `parfor` (MATLAB Coder) to create loops that run in parallel on supported shared-memory multicore platforms. If your compiler does not support the OpenMP application interface, or if you disable the OpenMP library, the generated code does not split the predictor data and, therefore, processes one observation at a time. To find supported compilers, see Supported Compilers. To disable the OpenMP library, set the `EnableOpenMP` property of the configuration object to `false`. For details, see `coder.CodeConfig` (MATLAB Coder).

This table contains notes about the arguments of `predict`. Arguments not included in this table are fully supported.

Argument | Notes and Limitations |
---|---|
`Mdl` | For the usage notes and limitations of the model object, see Code Generation of the `RegressionKernel` object. |
`X` | For general code generation, `X` must be a single-precision or double-precision matrix or a table containing numeric variables, categorical variables, or both. The number of rows, or observations, in `X` can be a variable size, but the number of columns in `X` must be fixed. If you want to specify `X` as a table, then your model must be trained using a table, and your entry-point function for prediction must do the following: accept data as arrays; create a table from the data input arguments and specify the variable names in the table; pass the table to `predict`. For an example of this table workflow, see Generate Code to Classify Data in Table. For more information on using tables in code generation, see Code Generation for Tables (MATLAB Coder) and Table Limitations for Code Generation (MATLAB Coder). |
Name-value arguments | Names in name-value arguments must be compile-time constants. If the value of `PredictionForMissingValue` is nonnumeric, then it must be a compile-time constant. |

For more information, see Introduction to Code Generation.
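The save, load, and predict steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive workflow: the file name `KernelMdl`, the entry-point name `predictKernel`, and the synthetic four-predictor data are hypothetical. The entry point is written as a local function so that the script runs on its own in MATLAB. For actual code generation, it would need to be saved in its own file, `predictKernel.m`.

```matlab
rng(0) % For reproducibility

% Train a model on synthetic data with four numeric predictors.
X = randn(200,4);
Y = X*[1; -2; 0.5; 0] + 0.1*randn(200,1);
Mdl = fitrkernel(X,Y);

% Save the trained model in a form that loadLearnerForCoder can read.
% This writes KernelMdl.mat to the current folder.
saveLearnerForCoder(Mdl,'KernelMdl');

% Check the entry-point function in MATLAB before generating code.
YFit = predictKernel(X(1:5,:));
disp(YFit)

% After moving predictKernel into its own file, code could be generated
% with a variable number of rows and a fixed number of columns (4):
%   codegen predictKernel -args {coder.typeof(0,[Inf 4],[1 0])}

function YFit = predictKernel(X) %#codegen
% Hypothetical entry-point function: load the saved model and predict.
Mdl = loadLearnerForCoder('KernelMdl');
YFit = predict(Mdl,X);
end
```

The `coder.typeof(0,[Inf 4],[1 0])` specification reflects the requirement that the number of rows of `X` can vary while the number of columns stays fixed.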

## Version History

**Introduced in R2018a**

### R2023b: Specify predicted response value to use for observations with missing predictor values

Starting in R2023b, when you predict or compute the loss, some regression models allow you to specify the predicted response value for observations with missing predictor values. Specify the `PredictionForMissingValue` name-value argument to use a numeric scalar, the training set median, or the training set mean as the predicted value. When computing the loss, you can also specify to omit observations with missing predictor values.

This table lists the object functions that support the `PredictionForMissingValue` name-value argument. By default, the functions use the training set median as the predicted response value for observations with missing predictor values.

Model Type | Model Objects | Object Functions |
---|---|---|
Gaussian process regression (GPR) model | `RegressionGP`, `CompactRegressionGP` | `loss`, `predict`, `resubLoss`, `resubPredict` |
Gaussian process regression (GPR) model | `RegressionPartitionedGP` | `kfoldLoss`, `kfoldPredict` |
Gaussian kernel regression model | `RegressionKernel` | `loss`, `predict` |
Gaussian kernel regression model | `RegressionPartitionedKernel` | `kfoldLoss`, `kfoldPredict` |
Linear regression model | `RegressionLinear` | `loss`, `predict` |
Linear regression model | `RegressionPartitionedLinear` | `kfoldLoss`, `kfoldPredict` |
Neural network regression model | `RegressionNeuralNetwork`, `CompactRegressionNeuralNetwork` | `loss`, `predict`, `resubLoss`, `resubPredict` |
Neural network regression model | `RegressionPartitionedNeuralNetwork` | `kfoldLoss`, `kfoldPredict` |
Support vector machine (SVM) regression model | `RegressionSVM`, `CompactRegressionSVM` | `loss`, `predict`, `resubLoss`, `resubPredict` |
Support vector machine (SVM) regression model | `RegressionPartitionedSVM` | `kfoldLoss`, `kfoldPredict` |

In previous releases, the regression model `loss` and `predict` functions listed above used `NaN` predicted response values for observations with missing predictor values. The software omitted observations with missing predictor values from the resubstitution ("resub") and cross-validation ("kfold") computations for prediction and loss.

### R2023a: Generate C/C++ code for prediction

You can generate C/C++ code for the `predict` function.

## See Also

`fitrkernel` | `loss` | `RegressionKernel` | `resume`
