predict
Predict responses for Gaussian kernel regression model
Description
YFit = predict(Mdl,X,PredictionForMissingValue=prediction) uses the prediction value as the predicted response for observations with missing values in the predictor data X. By default, predict uses the median of the observed response values in the training data. (since R2023b)
Examples
Predict Test Set Responses
Predict the test set responses using a Gaussian kernel regression model for the carbig data set.
Load the carbig data set.
load carbig
Specify the predictor variables (X) and the response variable (Y).
X = [Weight,Cylinders,Horsepower,Model_Year]; Y = MPG;
Delete rows of X and Y where either array has NaN values. Removing rows with NaN values before passing data to fitrkernel can speed up training and reduce memory usage.
R = rmmissing([X Y]); X = R(:,1:4); Y = R(:,end);
Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.
rng(10) % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp); % Test set indices
Train the regression kernel model. Standardize the training data.
Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
Mdl = fitrkernel(Xtrain,Ytrain,'Standardize',true)
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617
Mdl is a RegressionKernel model.
Predict responses for the test set.
Xtest = X(idxTest,:); Ytest = Y(idxTest); YFit = predict(Mdl,Xtest);
Create a table containing the first 10 observed response values and predicted response values.
table(Ytest(1:10),YFit(1:10),'VariableNames', ...
    {'ObservedValue','PredictedValue'})
ans=10×2 table
ObservedValue PredictedValue
_____________ ______________
18 17.616
14 25.799
24 24.141
25 25.018
14 13.637
14 14.557
18 18.584
27 26.096
21 25.031
13 13.324
Estimate the test set regression loss using the mean squared error loss function.
L = loss(Mdl,Xtest,Ytest)
L = 9.2664
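As a sanity check on the loss value, the mean squared error can also be computed by hand from the predictions. The following is a minimal sketch that repeats the setup from this example so that it runs on its own; the variable names mseManual and relDiff are illustrative only. It assumes the default observation weights (all equal) and the default 'mse' loss function.

```matlab
% Rebuild the data and model from the example above.
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);
rng(10) % For reproducibility
cvp = cvpartition(length(Y),'Holdout',0.1);
Mdl = fitrkernel(X(training(cvp),:),Y(training(cvp)),'Standardize',true);
Xtest = X(test(cvp),:);
Ytest = Y(test(cvp));

% Compare a hand-computed MSE with the value returned by loss.
YFit = predict(Mdl,Xtest);
mseManual = mean((Ytest - YFit).^2); % Unweighted mean squared error
L = loss(Mdl,Xtest,Ytest);           % Default loss function is 'mse'
relDiff = abs(mseManual - L)/L       % Expected to be close to zero
```

With equal weights, the two values should agree up to floating-point rounding.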
Input Arguments
Mdl
— Kernel regression model
RegressionKernel model object
Kernel regression model, specified as a RegressionKernel model object. You can create a RegressionKernel model object using fitrkernel.
X
— Predictor data used to generate responses
numeric matrix | table
Predictor data used to generate responses, specified as a numeric matrix or table.
Each row of X corresponds to one observation, and each column corresponds to one variable.
For a numeric matrix:
- The variables in the columns of X must have the same order as the predictor variables that trained Mdl.
- If you trained Mdl using a table (for example, Tbl) and Tbl contains all numeric predictor variables, then X can be a numeric matrix. To treat numeric predictors in Tbl as categorical during training, identify categorical predictors using the CategoricalPredictors name-value pair argument of fitrkernel. If Tbl contains heterogeneous predictor variables (for example, numeric and categorical data types) and X is a numeric matrix, then predict throws an error.
For a table:
- predict does not support multicolumn variables or cell arrays other than cell arrays of character vectors.
- If you trained Mdl using a table (for example, Tbl), then all predictor variables in X must have the same variable names and data types as those that trained Mdl (stored in Mdl.PredictorNames). However, the column order of X does not need to correspond to the column order of Tbl. Also, Tbl and X can contain additional variables (response variables, observation weights, and so on), but predict ignores them.
- If you trained Mdl using a numeric matrix, then the predictor names in Mdl.PredictorNames and corresponding predictor variable names in X must be the same. To specify predictor names during training, see the PredictorNames name-value pair argument of fitrkernel. All predictor variables in X must be numeric vectors. X can contain additional variables (response variables, observation weights, and so on), but predict ignores them.
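The table rules above can be illustrated with a short sketch. It trains a model on a table built from the carbig data and then predicts from a second table whose columns are reordered and that still contains the response variable. The names Tbl, Tnew, YFit1, and YFit2 are illustrative, and the choice of predictors is an assumption made for this sketch.

```matlab
load carbig
Tbl = rmmissing(table(Weight,Cylinders,Horsepower,Model_Year,MPG));
rng(10) % For reproducibility
Mdl = fitrkernel(Tbl,'MPG');

% Reorder the columns. predict matches predictors by variable name
% and ignores the extra MPG response variable.
Tnew = Tbl(1:5,{'MPG','Model_Year','Horsepower','Cylinders','Weight'});
YFit1 = predict(Mdl,Tbl(1:5,:));
YFit2 = predict(Mdl,Tnew);
sameFit = isequal(YFit1,YFit2) % Expected to be true
```

Because the match is by name, renaming a predictor variable in the new table would cause predict to fail, whereas reordering columns does not affect the result.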
Data Types: double | single | table
prediction
— Predicted response value to use for observations with missing predictor values
"median" (default) | "mean" | numeric scalar
Since R2023b
Predicted response value to use for observations with missing predictor values, specified as "median", "mean", or a numeric scalar.
Value | Description |
---|---|
"median" | predict uses the median of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
"mean" | predict uses the mean of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
Numeric scalar | predict uses this value as the predicted response value for observations with missing predictor values. |
Example: "mean"
Example: NaN
Data Types: single | double | char | string
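The following sketch shows how the three kinds of values behave on an observation with a missing predictor value. It assumes R2023b or later and uses the carbig data; the variable names Xnew, yMedian, yMean, and yNaN are illustrative only.

```matlab
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);
rng(10) % For reproducibility
Mdl = fitrkernel(X,Y);

Xnew = X(1:3,:);
Xnew(2,3) = NaN; % Introduce a missing predictor value in row 2

yMedian = predict(Mdl,Xnew); % Default: training response median for row 2
yMean = predict(Mdl,Xnew,PredictionForMissingValue="mean");
yNaN = predict(Mdl,Xnew,PredictionForMissingValue=NaN);

table(yMedian,yMean,yNaN)
```

Only the second row should differ across the three columns, because rows 1 and 3 contain no missing predictor values.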
Output Arguments
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
The predict function supports tall arrays with the following usage notes and limitations:
- predict does not support tall table data.
For more information, see Tall Arrays.
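A minimal sketch of prediction on a tall numeric matrix follows. Here the tall array is created from in-memory data purely for illustration; in practice it would typically be backed by a datastore. The call to mapreducer(0) is an assumption that you want to run in the local MATLAB session rather than on a parallel pool.

```matlab
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);
rng(10) % For reproducibility
Mdl = fitrkernel(X,Y);

mapreducer(0);            % Evaluate tall arrays in the local session
tX = tall(X);             % Tall numeric matrix (tall tables are not supported)
tYFit = predict(Mdl,tX);  % Deferred computation; result is a tall array
YFit = gather(tYFit);     % Bring the predictions into memory
```

The predictions are not computed until gather is called.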
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™. (since R2023a)
Usage notes and limitations:
- Use saveLearnerForCoder, loadLearnerForCoder, and codegen (MATLAB Coder) to generate code for the predict function. Save a trained model by using saveLearnerForCoder. Define an entry-point function that loads the saved model by using loadLearnerForCoder and calls the predict function. Then use codegen to generate code for the entry-point function.
- To generate single-precision C/C++ code for predict, specify the name-value argument "DataType","single" when you call the loadLearnerForCoder function.
- If the code generator uses the Open Multiprocessing (OpenMP) library, the generated code of predict splits the predictor data X into multiple chunks and predicts responses for the chunks in parallel. The generated code uses parfor (MATLAB Coder) to create loops that run in parallel on supported shared-memory multicore platforms. If your compiler does not support the OpenMP application interface, or if you disable the OpenMP library, the generated code does not split the predictor data and, therefore, processes one observation at a time. To find supported compilers, see Supported Compilers. To disable the OpenMP library, set the EnableOpenMP property of the configuration object to false. For details, see coder.CodeConfig (MATLAB Coder).
- This table contains notes about the arguments of predict. Arguments not included in this table are fully supported.
Argument | Notes and Limitations |
---|---|
Mdl | For the usage notes and limitations of the model object, see Code Generation of the RegressionKernel object. |
X | For general code generation, X must be a single-precision or double-precision matrix or a table containing numeric variables, categorical variables, or both. The number of rows, or observations, in X can be a variable size, but the number of columns in X must be fixed. If you want to specify X as a table, then your model must be trained using a table, and your entry-point function for prediction must do the following: (1) accept data as arrays; (2) create a table from the data input arguments and specify the variable names in the table; (3) pass the table to predict. For an example of this table workflow, see Generate Code to Classify Data in Table. For more information on using tables in code generation, see Code Generation for Tables (MATLAB Coder) and Table Limitations for Code Generation (MATLAB Coder). |
Name-value arguments | Names in name-value arguments must be compile-time constants. If the value of PredictionForMissingValue is nonnumeric, then it must be a compile-time constant. |
For more information, see Introduction to Code Generation.
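The save, load, and generate steps described above can be sketched as follows. This is an illustration under stated assumptions, not a complete workflow: it requires MATLAB Coder and a supported compiler, and the file name KernelRegMdl and the entry-point function name predictKernelReg are hypothetical. The entry-point function, shown in comments, must be saved in its own file before you run codegen.

```matlab
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);
rng(10) % For reproducibility
Mdl = fitrkernel(X,Y);

% Save the trained model for code generation (hypothetical file name).
saveLearnerForCoder(Mdl,'KernelRegMdl');

% Contents of predictKernelReg.m (hypothetical entry-point function):
%
%   function YFit = predictKernelReg(X) %#codegen
%       Mdl = loadLearnerForCoder('KernelRegMdl');
%       YFit = predict(Mdl,X);
%   end

% Generate a MEX function. The number of rows of X is variable size,
% and the number of columns is fixed at 4.
codegen predictKernelReg -args {coder.typeof(0,[Inf,4],[1,0])}

% Compare the MEX output with the MATLAB output.
YFit = predict(Mdl,X);
YFitMex = predictKernelReg_mex(X);
maxDiff = max(abs(YFit - YFitMex)) % Expected to be small
```

The coder.typeof call reflects the limitation in the table above: observations can vary in number, but the number of predictor columns is fixed at compile time.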
Version History
Introduced in R2018a
R2023b: Specify predicted response value to use for observations with missing predictor values
Starting in R2023b, when you predict or compute the loss, some regression models allow you to specify the predicted response value for observations with missing predictor values. Specify the PredictionForMissingValue name-value argument to use a numeric scalar, the training set median, or the training set mean as the predicted value. When computing the loss, you can also specify to omit observations with missing predictor values.
This table lists the object functions that support the PredictionForMissingValue name-value argument. By default, the functions use the training set median as the predicted response value for observations with missing predictor values.
Model Type | Model Objects | Object Functions |
---|---|---|
Gaussian process regression (GPR) model | RegressionGP, CompactRegressionGP | loss, predict, resubLoss, resubPredict |
Gaussian process regression (GPR) model | RegressionPartitionedGP | kfoldLoss, kfoldPredict |
Gaussian kernel regression model | RegressionKernel | loss, predict |
Gaussian kernel regression model | RegressionPartitionedKernel | kfoldLoss, kfoldPredict |
Linear regression model | RegressionLinear | loss, predict |
Linear regression model | RegressionPartitionedLinear | kfoldLoss, kfoldPredict |
Neural network regression model | RegressionNeuralNetwork, CompactRegressionNeuralNetwork | loss, predict, resubLoss, resubPredict |
Neural network regression model | RegressionPartitionedNeuralNetwork | kfoldLoss, kfoldPredict |
Support vector machine (SVM) regression model | RegressionSVM, CompactRegressionSVM | loss, predict, resubLoss, resubPredict |
Support vector machine (SVM) regression model | RegressionPartitionedSVM | kfoldLoss, kfoldPredict |
In previous releases, the regression model loss and predict functions listed above used NaN predicted response values for observations with missing predictor values. The software omitted observations with missing predictor values from the resubstitution ("resub") and cross-validation ("kfold") computations for prediction and loss.
R2023a: Generate C/C++ code for prediction
You can generate C/C++ code for the predict function.
See Also
fitrkernel | loss | RegressionKernel | resume