fitrsvm — Fit a support vector machine regression model

fitrsvm trains or cross-validates a support vector machine (SVM) regression model on a low- through moderate-dimensional predictor data set. fitrsvm supports mapping the predictor data using kernel functions, and supports SMO, ISDA, or L1 soft-margin minimization via quadratic programming for objective-function minimization.

To train a linear SVM regression model on a high-dimensional data set, that is, a data set that includes many predictor variables, use fitrlinear instead.

To train an SVM model for binary classification, see fitcsvm for low- through moderate-dimensional predictor data sets, or fitclinear for high-dimensional data sets.
Mdl = fitrsvm(Tbl,ResponseVarName) returns a full, trained support vector machine (SVM) regression model Mdl trained using the predictor values in the table Tbl and the response values in Tbl.ResponseVarName.

Mdl = fitrsvm(___,Name,Value) returns an SVM regression model with additional options specified by one or more name-value pair arguments, using any of the previous syntaxes. For example, you can specify the kernel function or train a cross-validated model.
Train a support vector machine (SVM) regression model using sample data stored in matrices.
Load the carsmall data set.

load carsmall
rng 'default' % For reproducibility

Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).

X = [Horsepower,Weight];
Y = MPG;
Train a default SVM regression model.
Mdl = fitrsvm(X,Y)
Mdl = 

  RegressionSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
                    Alpha: [75x1 double]
                     Bias: 43.2943
         KernelParameters: [1x1 struct]
          NumObservations: 93
           BoxConstraints: [93x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [93x1 logical]
                   Solver: 'SMO'
Mdl is a trained RegressionSVM model.
Check the model for convergence.
Mdl.ConvergenceInfo.Converged
ans = logical
   0

A value of 0 indicates that the model did not converge.
Retrain the model using standardized data.
MdlStd = fitrsvm(X,Y,'Standardize',true)
MdlStd = 

  RegressionSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
                    Alpha: [77x1 double]
                     Bias: 22.9131
         KernelParameters: [1x1 struct]
                       Mu: [109.3441 2.9625e+03]
                    Sigma: [45.3545 805.9668]
          NumObservations: 93
           BoxConstraints: [93x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [93x1 logical]
                   Solver: 'SMO'
Check the model for convergence.
MdlStd.ConvergenceInfo.Converged
ans = logical
   1

A value of 1 indicates that the model did converge.
Compute the resubstitution (in-sample) mean-squared error for the new model.
lStd = resubLoss(MdlStd)
lStd = 17.0256
Train a support vector machine regression model using the abalone data from the UCI Machine Learning Repository.
Download the data and save it in your current folder with the name 'abalone.csv'.

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data';
websave('abalone.csv',url);
Read the data into a table. Specify the variable names.
varnames = {'Sex'; 'Length'; 'Diameter'; 'Height'; 'Whole_weight';...
    'Shucked_weight'; 'Viscera_weight'; 'Shell_weight'; 'Rings'};
Tbl = readtable('abalone.csv','Filetype','text','ReadVariableNames',false);
Tbl.Properties.VariableNames = varnames;
The sample data contains 4177 observations. All the predictor variables are continuous except for Sex, which is a categorical variable with possible values 'M' (for males), 'F' (for females), and 'I' (for infants). The goal is to predict the number of rings (stored in Rings) on the abalone and determine its age using physical measurements.
Train an SVM regression model, using a Gaussian kernel function with an automatic kernel scale. Standardize the data.
rng default % For reproducibility
Mdl = fitrsvm(Tbl,'Rings','KernelFunction','gaussian','KernelScale','auto',...
    'Standardize',true)

Mdl = 

  RegressionSVM
           PredictorNames: {1×8 cell}
             ResponseName: 'Rings'
    CategoricalPredictors: 1
        ResponseTransform: 'none'
                    Alpha: [3635×1 double]
                     Bias: 10.8144
         KernelParameters: [1×1 struct]
                       Mu: [1×10 double]
                    Sigma: [1×10 double]
          NumObservations: 4177
           BoxConstraints: [4177×1 double]
          ConvergenceInfo: [1×1 struct]
          IsSupportVector: [4177×1 logical]
                   Solver: 'SMO'
The Command Window shows that Mdl is a trained RegressionSVM model and displays a property list.

Display the properties of Mdl using dot notation. For example, check to confirm whether the model converged and how many iterations it completed.

conv = Mdl.ConvergenceInfo.Converged
iter = Mdl.NumIterations

conv = logical
   1

iter = 2759
The returned results indicate that the model converged after 2759 iterations.
Load the carsmall data set.

load carsmall
rng 'default' % For reproducibility
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).

X = [Horsepower Weight];
Y = MPG;
Cross-validate two SVM regression models using 5-fold cross-validation. For both models, specify to standardize the predictors. Train one model using the default linear kernel and the other using the Gaussian kernel.
MdlLin = fitrsvm(X,Y,'Standardize',true,'KFold',5)
MdlLin = 

  classreg.learning.partition.RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
         PredictorNames: {'x1'  'x2'}
           ResponseName: 'Y'
        NumObservations: 94
                  KFold: 5
              Partition: [1x1 cvpartition]
      ResponseTransform: 'none'
MdlGau = fitrsvm(X,Y,'Standardize',true,'KFold',5,'KernelFunction','gaussian')
MdlGau = 

  classreg.learning.partition.RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
         PredictorNames: {'x1'  'x2'}
           ResponseName: 'Y'
        NumObservations: 94
                  KFold: 5
              Partition: [1x1 cvpartition]
      ResponseTransform: 'none'
MdlLin.Trained
ans=5×1 cell
{1x1 classreg.learning.regr.CompactRegressionSVM}
{1x1 classreg.learning.regr.CompactRegressionSVM}
{1x1 classreg.learning.regr.CompactRegressionSVM}
{1x1 classreg.learning.regr.CompactRegressionSVM}
{1x1 classreg.learning.regr.CompactRegressionSVM}
MdlLin and MdlGau are RegressionPartitionedSVM cross-validated models. The Trained property of each model is a 5-by-1 cell array of CompactRegressionSVM models. The models in the cell array store the results of training on 4 folds of observations while leaving one fold out.
Compare the generalization error of the models. In this case, the generalization error is the out-of-sample mean-squared error.
mseLin = kfoldLoss(MdlLin)
mseLin = 17.4417
mseGau = kfoldLoss(MdlGau)
mseGau = 16.7397
The SVM regression model using the Gaussian kernel performs better than the one using the linear kernel.
Create a model suitable for making predictions by passing the entire data set to fitrsvm, and specify all name-value pair arguments that yielded the better-performing model. However, do not specify any cross-validation options.
MdlGau = fitrsvm(X,Y,'Standardize',true,'KernelFunction','gaussian');
To predict the MPG of a set of cars, pass MdlGau and a table containing the horsepower and weight measurements of the cars to predict.
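Because this model was trained on a numeric matrix, a matrix whose columns follow the same order as X also works. A minimal sketch of that prediction step (the horsepower and weight values here are hypothetical):

XNew = [150 3000; 90 2200];    % hypothetical [Horsepower Weight] rows
MPGpred = predict(MdlGau,XNew) % predicted MPG for each row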
This example shows how to optimize hyperparameters automatically using fitrsvm. The example uses the carsmall data.

Load the carsmall data set.
load carsmall
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).

X = [Horsepower Weight];
Y = MPG;
Find hyperparameters that minimize five-fold cross-validation loss by using automatic hyperparameter optimization. For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.

rng default
Mdl = fitrsvm(X,Y,'OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName',...
    'expected-improvement-plus'))
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective | BestSoFar  | BestSoFar | BoxConstraint| KernelScale |   Epsilon    |
|      | result | log(1+loss) | runtime   | (observed) | (estim.)  |              |             |              |
|====================================================================================================================|
|    1 | Best   |      6.1124 |    10.727 |     6.1124 |    6.1124 |      0.35664 |    0.043031 |      0.30396 |
|    2 | Best   |      2.9114 |  0.081241 |     2.9114 |     3.088 |        70.67 |      710.65 |       1.6369 |
|    3 | Accept |      4.1884 |  0.062996 |     2.9114 |     3.078 |       14.367 |   0.0059144 |       442.64 |
|    4 | Accept |       4.159 |  0.059691 |     2.9114 |    3.0457 |    0.0030879 |      715.31 |       2.6045 |
|    5 | Best   |      2.9044 |   0.21473 |     2.9044 |    2.9042 |       906.95 |      761.46 |       1.3274 |
|    6 | Best   |      2.8666 |     0.494 |     2.8666 |    2.8668 |        997.3 |      317.41 |       3.7696 |
|    7 | Accept |      4.1881 |  0.046231 |     2.8666 |    2.8669 |       759.56 |      987.74 |       15.074 |
|    8 | Accept |      2.8992 |    2.5175 |     2.8666 |    2.8669 |       819.07 |      152.11 |       1.5192 |
|    9 | Accept |      2.8916 |   0.15154 |     2.8666 |    2.8672 |       921.52 |      627.48 |       2.3029 |
|   10 | Accept |      2.9001 |   0.28924 |     2.8666 |    2.8676 |       382.91 |      343.04 |       1.5448 |
|   11 | Accept |      3.6573 |    9.8445 |     2.8666 |    2.8784 |        945.1 |       8.885 |       3.9207 |
|   12 | Accept |      2.9381 |   0.13287 |     2.8666 |     2.871 |       935.49 |      979.29 |       0.1384 |
|   13 | Accept |      2.9341 |  0.048236 |     2.8666 |    2.8719 |        1.992 |      999.49 |      0.21557 |
|   14 | Accept |      2.9227 |  0.061377 |     2.8666 |    2.8742 |        2.351 |      977.85 |     0.026124 |
|   15 | Accept |      2.9483 |   0.13459 |     2.8666 |    2.8751 |       826.92 |      713.57 |    0.0096305 |
|   16 | Accept |      2.9502 |    1.1896 |     2.8666 |    2.8813 |       345.64 |       129.6 |     0.027832 |
|   17 | Accept |      2.9329 |   0.10496 |     2.8666 |    2.8799 |       836.96 |      970.73 |     0.034398 |
|   18 | Accept |      2.9177 |  0.068845 |     2.8666 |    2.8771 |      0.10167 |      129.91 |    0.0092675 |
|   19 | Accept |        2.95 |    2.5322 |     2.8666 |    2.8749 |       199.85 |       68.93 |    0.0092982 |
|   20 | Accept |      4.1964 |  0.070247 |     2.8666 |    2.8685 |    0.0012054 |      940.94 |    0.0097673 |
|====================================================================================================================|
| Iter | Eval   | Objective:  | Objective | BestSoFar  | BestSoFar | BoxConstraint| KernelScale |   Epsilon    |
|      | result | log(1+loss) | runtime   | (observed) | (estim.)  |              |             |              |
|====================================================================================================================|
|   21 | Accept |       2.905 |  0.079709 |     2.8666 |    2.8675 |       5.9475 |      199.82 |     0.013585 |
|   22 | Accept |      2.9329 |  0.096708 |     2.8666 |    2.8747 |      0.33221 |      21.509 |    0.0094248 |
|   23 | Accept |      2.9017 |  0.049333 |     2.8666 |    2.8689 |       13.341 |      554.39 |     0.069216 |
|   24 | Accept |      2.9067 |  0.049191 |     2.8666 |    2.8694 |      0.21467 |      73.415 |     0.028231 |
|   25 | Accept |      2.9046 |  0.056755 |     2.8666 |    2.8731 |      0.68546 |      61.287 |    0.0099165 |
|   26 | Accept |      2.9138 |   0.04743 |     2.8666 |    2.8676 |    0.0012185 |      8.8743 |    0.0093263 |
|   27 | Accept |      2.9193 |  0.048818 |     2.8666 |    2.8731 |    0.0099434 |      30.484 |    0.0093546 |
|   28 | Accept |      8.5384 |    10.252 |     2.8666 |    2.8683 |       992.36 |      1.4043 |    0.0093129 |
|   29 | Accept |      3.2254 |  0.046193 |     2.8666 |    2.8682 |    0.0010092 |      16.917 |       7.3665 |
|   30 | Accept |      4.1884 |  0.046135 |     2.8666 |    2.8683 |       983.95 |      42.654 |       287.19 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 70.0414 seconds.
Total objective function evaluation time: 39.6037

Best observed feasible point:
    BoxConstraint    KernelScale    Epsilon
    _____________    ___________    _______
        997.3           317.41       3.7696

Observed objective function value = 2.8666
Estimated objective function value = 2.8683
Function evaluation time = 0.494

Best estimated feasible point (according to models):
    BoxConstraint    KernelScale    Epsilon
    _____________    ___________    _______
        997.3           317.41       3.7696

Estimated objective function value = 2.8683
Estimated function evaluation time = 0.44767

Mdl = 

  RegressionSVM
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                                Alpha: [35×1 double]
                                 Bias: 48.8155
                     KernelParameters: [1×1 struct]
                      NumObservations: 93
    HyperparameterOptimizationResults: [1×1 BayesianOptimization]
                       BoxConstraints: [93×1 double]
                      ConvergenceInfo: [1×1 struct]
                      IsSupportVector: [93×1 logical]
                               Solver: 'SMO'
The optimization searched over BoxConstraint, KernelScale, and Epsilon. The output is the regression model with the minimum estimated cross-validation loss.
Tbl — Predictor data

Sample data used to train the model, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain one additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If Tbl contains the response variable, and you want to use all remaining variables in Tbl as predictors, then specify the response variable using ResponseVarName.

If Tbl contains the response variable, and you want to use only a subset of the remaining variables in Tbl as predictors, then specify a formula using formula.

If Tbl does not contain the response variable, then specify a response variable using Y. The length of the response variable and the number of rows of Tbl must be equal.

If a row of Tbl or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.

To specify the names of the predictors in the order of their appearance in Tbl, use the PredictorNames name-value pair argument.

Data Types: table
ResponseVarName — Response variable name

Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector.

You must specify ResponseVarName as a character vector or string scalar. For example, if Tbl stores the response variable Y as Tbl.Y, then specify it as 'Y'. Otherwise, the software treats all columns of Tbl, including Y, as predictors when training the model.

Data Types: char | string
formula — Explanatory model of response variable and subset of predictor variables

Explanatory model of the response variable and a subset of the predictor variables, specified as a character vector or string scalar in the form 'Y~X1+X2+X3'. In this form, Y represents the response variable, and X1, X2, and X3 represent the predictor variables.

To specify a subset of variables in Tbl as predictors for training the model, use a formula. If you specify a formula, then the software does not use any variables in Tbl that do not appear in formula.

The variable names in the formula must be both variable names in Tbl (Tbl.Properties.VariableNames) and valid MATLAB® identifiers. You can verify the variable names in Tbl by using the isvarname function. The following code returns logical 1 (true) for each variable that has a valid variable name.

cellfun(@isvarname,Tbl.Properties.VariableNames)

If the variable names in Tbl are not valid, then convert them by using the matlab.lang.makeValidName function.

Tbl.Properties.VariableNames = matlab.lang.makeValidName(Tbl.Properties.VariableNames);

Data Types: char | string
Y — Response data

Response data, specified as an n-by-1 numeric vector. The length of Y and the number of rows of Tbl or X must be equal.

If a row of Tbl or X, or an element of Y, contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.

To specify the response variable name, use the ResponseName name-value pair argument.

Data Types: single | double
X — Predictor data

Predictor data to which the SVM regression model is fit, specified as an n-by-p numeric matrix. n is the number of observations and p is the number of predictor variables.

The length of Y and the number of rows of X must be equal.

If a row of X or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments.

To specify the names of the predictors in the order of their appearance in X, use the PredictorNames name-value pair argument.

Data Types: single | double
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'KernelFunction','gaussian','Standardize',true,'CrossVal','on' trains a 10-fold cross-validated SVM regression model using a Gaussian kernel and standardized training data.

Note: You cannot use any cross-validation name-value pair argument along with the 'OptimizeHyperparameters' name-value pair argument. You can modify the cross-validation for 'OptimizeHyperparameters' only by using the 'HyperparameterOptimizationOptions' name-value pair argument.
'BoxConstraint' — Box constraint

Box constraint for the alpha coefficients, specified as the comma-separated pair consisting of 'BoxConstraint' and a positive scalar value.

The absolute value of the Alpha coefficients cannot exceed the value of BoxConstraint.

The default BoxConstraint value for the 'gaussian' or 'rbf' kernel function is iqr(Y)/1.349, where iqr(Y) is the interquartile range of the response variable Y. For all other kernels, the default BoxConstraint value is 1.

Example: 'BoxConstraint',10

Data Types: single | double
'KernelFunction' — Kernel function
'linear' (default) | 'gaussian' | 'rbf' | 'polynomial' | function name

Kernel function used to compute the Gram matrix, specified as the comma-separated pair consisting of 'KernelFunction' and a value in this table.

Value | Description | Formula
---|---|---
'gaussian' or 'rbf' | Gaussian or Radial Basis Function (RBF) kernel | G(xj,xk) = exp(−‖xj − xk‖²)
'linear' | Linear kernel | G(xj,xk) = xj′xk
'polynomial' | Polynomial kernel. Use 'PolynomialOrder',p to specify a polynomial kernel of order p. | G(xj,xk) = (1 + xj′xk)^p
You can set your own kernel function, for example, kernel, by setting 'KernelFunction','kernel'. The value kernel must have the following form:

function G = kernel(U,V)

where:

U is an m-by-p matrix.

V is an n-by-p matrix.

G is an m-by-n Gram matrix of the rows of U and V.

And kernel.m must be on the MATLAB path.

It is good practice to avoid using generic names for kernel functions. For example, call a sigmoid kernel function 'mysigmoid' rather than 'sigmoid'.
Example: 'KernelFunction','gaussian'
Data Types: char | string
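For instance, a minimal sketch of a custom sigmoid kernel saved as mysigmoid.m; the kernel follows the required form above, but the slope and intercept values are hypothetical:

function G = mysigmoid(U,V)
% Sigmoid kernel: G(U,V) = tanh(gamma*U*V' + c)
% gamma and c are hypothetical example values
gamma = 1;
c = -1;
G = tanh(gamma*(U*V') + c); % U is m-by-p, V is n-by-p, so G is m-by-n
end

You can then train with Mdl = fitrsvm(X,Y,'KernelFunction','mysigmoid','Standardize',true);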
'KernelScale' — Kernel scale parameter
1 (default) | 'auto' | positive scalar

Kernel scale parameter, specified as the comma-separated pair consisting of 'KernelScale' and 'auto' or a positive scalar. The software divides all elements of the predictor matrix X by the value of KernelScale. Then, the software applies the appropriate kernel norm to compute the Gram matrix.

If you specify 'auto', then the software selects an appropriate scale factor using a heuristic procedure. This heuristic procedure uses subsampling, so estimates can vary from one call to another. Therefore, to reproduce results, set a random number seed using rng before training.

If you specify KernelScale and your own kernel function, for example, 'KernelFunction','kernel', then the software throws an error. You must apply scaling within kernel.
Example: 'KernelScale','auto'
Data Types: double | single | char | string
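A minimal sketch of reproducible automatic kernel scaling, assuming a predictor matrix X and response Y are in the workspace:

rng(1) % seed the subsampling heuristic for reproducibility
Mdl = fitrsvm(X,Y,'KernelFunction','gaussian','KernelScale','auto');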
'PolynomialOrder' — Polynomial kernel function order
3 (default) | positive integer

Polynomial kernel function order, specified as the comma-separated pair consisting of 'PolynomialOrder' and a positive integer.

If you set 'PolynomialOrder' and KernelFunction is not 'polynomial', then the software throws an error.

Example: 'PolynomialOrder',2

Data Types: double | single
'KernelOffset' — Kernel offset parameter

Kernel offset parameter, specified as the comma-separated pair consisting of 'KernelOffset' and a nonnegative scalar.

The software adds KernelOffset to each element of the Gram matrix.

The defaults are:

0 if the solver is SMO (that is, you set 'Solver','SMO')

0.1 if the solver is ISDA (that is, you set 'Solver','ISDA')

Example: 'KernelOffset',0

Data Types: double | single
'Epsilon' — Half the width of epsilon-insensitive band
iqr(Y)/13.49 (default) | nonnegative scalar value

Half the width of the epsilon-insensitive band, specified as the comma-separated pair consisting of 'Epsilon' and a nonnegative scalar value.

The default Epsilon value is iqr(Y)/13.49, which is an estimate of a tenth of the standard deviation using the interquartile range of the response variable Y. If iqr(Y) is equal to zero, then the default Epsilon value is 0.1.

Example: 'Epsilon',0.3

Data Types: single | double
'Standardize' — Flag to standardize predictor data
false (default) | true

Flag to standardize the predictor data, specified as the comma-separated pair consisting of 'Standardize' and true (1) or false (0).

If you set 'Standardize',true:

The software centers and scales each column of the predictor data (X) by the weighted column mean and standard deviation, respectively (for details on weighted standardizing, see Algorithms). MATLAB does not standardize the data contained in the dummy variable columns generated for categorical predictors.

The software trains the model using the standardized predictor matrix, but stores the unstandardized data in the model property X.

Example: 'Standardize',true

Data Types: logical
'Solver' — Optimization routine
'ISDA' | 'L1QP' | 'SMO'

Optimization routine, specified as the comma-separated pair consisting of 'Solver' and a value in this table.

Value | Description
---|---
'ISDA' | Iterative Single Data Algorithm (see [3])
'L1QP' | Uses quadprog to implement L1 soft-margin minimization by quadratic programming. This option requires an Optimization Toolbox™ license. For more details, see Quadratic Programming Definition (Optimization Toolbox).
'SMO' | Sequential Minimal Optimization (see [2])

The defaults are:

'ISDA' if you set 'OutlierFraction' to a positive value

'SMO' otherwise

Example: 'Solver','ISDA'
'Alpha' — Initial estimates of alpha coefficients

Initial estimates of alpha coefficients, specified as the comma-separated pair consisting of 'Alpha' and a numeric vector. The length of Alpha must be equal to the number of rows of X.

Each element of Alpha corresponds to an observation in X.

Alpha cannot contain any NaNs.

If you specify Alpha and any one of the cross-validation name-value pair arguments ('CrossVal', 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'), then the software returns an error.

If Y contains any missing values, then remove all rows of Y, X, and Alpha that correspond to the missing values. That is, enter:

idx = ~isnan(Y);
Y = Y(idx);
X = X(idx,:);
alpha = alpha(idx);

Then, pass Y, X, and alpha as the response, predictors, and initial alpha estimates, respectively.

The default is zeros(size(Y,1),1).

Example: 'Alpha',0.1*ones(size(X,1),1)

Data Types: single | double
'CacheSize' — Cache size
1000 (default) | 'maximal' | positive scalar

Cache size, specified as the comma-separated pair consisting of 'CacheSize' and 'maximal' or a positive scalar.

If CacheSize is 'maximal', then the software reserves enough memory to hold the entire n-by-n Gram matrix.

If CacheSize is a positive scalar, then the software reserves CacheSize megabytes of memory for training the model.

Example: 'CacheSize','maximal'

Data Types: double | single | char | string
'ClipAlphas' — Flag to clip alpha coefficients
true (default) | false

Flag to clip alpha coefficients, specified as the comma-separated pair consisting of 'ClipAlphas' and either true or false.

Suppose that the alpha coefficient for observation j is αj and the box constraint of observation j is Cj, j = 1,...,n, where n is the training sample size.

Value | Description
---|---
true | At each iteration, if αj is near 0 or near Cj, then MATLAB sets αj to 0 or to Cj, respectively.
false | MATLAB does not change the alpha coefficients during optimization.

MATLAB stores the final values of α in the Alpha property of the trained SVM model object.

ClipAlphas can affect SMO and ISDA convergence.

Example: 'ClipAlphas',false

Data Types: logical
'NumPrint' — Number of iterations between optimization diagnostic message output
1000 (default) | nonnegative integer

Number of iterations between optimization diagnostic message output, specified as the comma-separated pair consisting of 'NumPrint' and a nonnegative integer.

If you specify 'Verbose',1 and 'NumPrint',numprint, then the software displays all optimization diagnostic messages from SMO and ISDA every numprint iterations in the Command Window.
Example: 'NumPrint',500
Data Types: double | single
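A minimal sketch pairing 'NumPrint' with 'Verbose' (assumes X and Y are in the workspace):

% Display SMO/ISDA diagnostics every 500 iterations
Mdl = fitrsvm(X,Y,'Verbose',1,'NumPrint',500);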
'OutlierFraction' — Expected proportion of outliers in training data

Expected proportion of outliers in the training data, specified as the comma-separated pair consisting of 'OutlierFraction' and a numeric scalar in the interval [0,1). fitrsvm removes observations with large gradients, ensuring that fitrsvm removes the fraction of observations specified by OutlierFraction by the time convergence is reached. This name-value pair is only valid when 'Solver' is 'ISDA'.
Example: 'OutlierFraction',0.1
Data Types: single | double
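A minimal sketch pairing 'OutlierFraction' with the required ISDA solver (assumes X and Y are in the workspace; the fraction 0.05 is a hypothetical choice):

Mdl = fitrsvm(X,Y,'Solver','ISDA','OutlierFraction',0.05);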
'RemoveDuplicates' — Flag to replace duplicate observations with single observations
false (default) | true

Flag to replace duplicate observations with single observations in the training data, specified as the comma-separated pair consisting of 'RemoveDuplicates' and true or false.

If RemoveDuplicates is true, then fitrsvm replaces duplicate observations in the training data with a single observation of the same value. The weight of the single observation is equal to the sum of the weights of the corresponding removed duplicates (see Weights).

If your data set contains many duplicate observations, then specifying 'RemoveDuplicates',true can decrease convergence time considerably.

Data Types: logical
'Verbose' — Verbosity level
0 (default) | 1 | 2

Verbosity level, specified as the comma-separated pair consisting of 'Verbose' and 0, 1, or 2. The value of Verbose controls the amount of optimization information that the software displays in the Command Window and saves as a structure to Mdl.ConvergenceInfo.History.

This table summarizes the available verbosity level options.

Value | Description
---|---
0 | The software does not display or save convergence information.
1 | The software displays diagnostic messages and saves convergence criteria every numprint iterations, where numprint is the value of the name-value pair argument 'NumPrint'.
2 | The software displays diagnostic messages and saves convergence criteria at every iteration.

Example: 'Verbose',1

Data Types: double | single
'CategoricalPredictors' — Categorical predictors list
vector of positive integers | logical vector | character matrix | string array | cell array of character vectors | 'all'

Categorical predictors list, specified as the comma-separated pair consisting of 'CategoricalPredictors' and one of the values in this table.

Value | Description
---|---
Vector of positive integers | Each entry in the vector is an index value corresponding to the column of the predictor data (X or Tbl) that contains a categorical variable.
Logical vector | A true entry means that the corresponding column of predictor data (X or Tbl) is a categorical variable.
Character matrix | Each row of the matrix is the name of a predictor variable. The names must match the entries in PredictorNames. Pad the names with extra blanks so each row of the character matrix has the same length.
String array or cell array of character vectors | Each element in the array is the name of a predictor variable. The names must match the entries in PredictorNames.
'all' | All predictors are categorical.

By default, if the predictor data is in a table (Tbl), fitrsvm assumes that a variable is categorical if it is a logical vector, categorical vector, character array, string array, or cell array of character vectors. If the predictor data is a matrix (X), fitrsvm assumes that all predictors are continuous. To identify any other predictors as categorical predictors, specify them by using the 'CategoricalPredictors' name-value pair argument.

For the identified categorical predictors, fitrsvm creates dummy variables using two different schemes, depending on whether a categorical variable is unordered or ordered. For details, see Automatic Creation of Dummy Variables.

Example: 'CategoricalPredictors','all'

Data Types: single | double | logical | char | string | cell
'PredictorNames' — Predictor variable names

Predictor variable names, specified as the comma-separated pair consisting of 'PredictorNames' and a string array of unique names or cell array of unique character vectors. The functionality of 'PredictorNames' depends on the way you supply the training data.

If you supply X and Y, then you can use 'PredictorNames' to give the predictor variables in X names.

The order of the names in PredictorNames must correspond to the column order of X. That is, PredictorNames{1} is the name of X(:,1), PredictorNames{2} is the name of X(:,2), and so on. Also, size(X,2) and numel(PredictorNames) must be equal.

By default, PredictorNames is {'x1','x2',...}.

If you supply Tbl, then you can use 'PredictorNames' to choose which predictor variables to use in training. That is, fitrsvm uses only the predictor variables in PredictorNames and the response variable during training.

PredictorNames must be a subset of Tbl.Properties.VariableNames and cannot include the name of the response variable.

By default, PredictorNames contains the names of all predictor variables.

It is good practice to specify the predictors for training using either 'PredictorNames' or formula, but not both.

Example: 'PredictorNames',{'SepalLength','SepalWidth','PetalLength','PetalWidth'}

Data Types: string | cell
'ResponseName' — Response variable name
'Y' (default) | character vector | string scalar

Response variable name, specified as the comma-separated pair consisting of 'ResponseName' and a character vector or string scalar.

If you supply Y, then you can use 'ResponseName' to specify a name for the response variable.

If you supply ResponseVarName or formula, then you cannot use 'ResponseName'.

Example: 'ResponseName','response'

Data Types: char | string
'ResponseTransform' — Response transformation
'none' (default) | function handle

Response transformation, specified as the comma-separated pair consisting of 'ResponseTransform' and either 'none' or a function handle. The default is 'none', which means @(y)y, or no transformation. For a MATLAB function or a function you define, use its function handle. The function handle must accept a vector (the original response values) and return a vector of the same size (the transformed response values).

Example: Suppose you create a function handle that applies an exponential transformation to an input vector by using myfunction = @(y)exp(y). Then, you can specify the response transformation as 'ResponseTransform',myfunction.
Data Types: char | string | function_handle
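The example above as a runnable sketch (assumes X and Y are in the workspace):

myfunction = @(y)exp(y); % exponential response transformation
Mdl = fitrsvm(X,Y,'ResponseTransform',myfunction);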
'Weights' — Observation weights
ones(size(X,1),1) (default) | vector of numeric values

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a vector of numeric values. The size of Weights must equal the number of rows in X. fitrsvm normalizes the values of Weights to sum to 1.

Data Types: single | double
'CrossVal' — Cross-validation flag
'off' (default) | 'on'

Cross-validation flag, specified as the comma-separated pair consisting of 'CrossVal' and either 'on' or 'off'.

If you specify 'on', then the software implements 10-fold cross-validation.

To override this cross-validation setting, use one of these name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout. To create a cross-validated model, you can use only one cross-validation name-value pair argument at a time.

Alternatively, you can cross-validate the model later using the crossval method.
Example: 'CrossVal','on'
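A minimal sketch of the crossval alternative (assumes X and Y are in the workspace):

Mdl = fitrsvm(X,Y);      % train on all of the data
CVMdl = crossval(Mdl);   % 10-fold cross-validated model
mse = kfoldLoss(CVMdl)   % out-of-fold mean-squared error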
'CVPartition' — Cross-validation partition
[] (default) | cvpartition partition object

Cross-validation partition, specified as the comma-separated pair consisting of 'CVPartition' and a cvpartition partition object created by cvpartition. The partition object specifies the type of cross-validation and the indexing for the training and validation sets.

To create a cross-validated model, you can use only one of these four name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,'KFold',5). Then, you can specify the cross-validated model by using 'CVPartition',cvp.
'Holdout' — Fraction of data for holdout validation

Fraction of the data used for holdout validation, specified as the comma-separated pair consisting of 'Holdout' and a scalar value in the range (0,1). If you specify 'Holdout',p, then the software completes these steps:

1. Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.

2. Store the compact, trained model in the Trained property of the cross-validated model.

To create a cross-validated model, you can use only one of these four name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Holdout',0.1

Data Types: double | single
'KFold' — Number of folds
10 (default) | positive integer value greater than 1

Number of folds to use in a cross-validated model, specified as the comma-separated pair consisting of 'KFold' and a positive integer value greater than 1. If you specify 'KFold',k, then the software completes these steps:

1. Randomly partition the data into k sets.

2. For each set, reserve the set as validation data, and train the model using the other k – 1 sets.

3. Store the k compact, trained models in the cells of a k-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use only one of these four name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: 'KFold',5

Data Types: single | double
'Leaveout' — Leave-one-out cross-validation flag
'off' (default) | 'on'

Leave-one-out cross-validation flag, specified as the comma-separated pair consisting of 'Leaveout' and 'on' or 'off'. If you specify 'Leaveout','on', then, for each of the n observations (where n is the number of observations excluding missing observations, specified in the NumObservations property of the model), the software completes these steps:

1. Reserve the observation as validation data, and train the model using the other n – 1 observations.

2. Store the n compact, trained models in the cells of an n-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use only one of these four name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Leaveout','on'
'DeltaGradientTolerance' — Tolerance for gradient difference

Tolerance for the gradient difference between upper and lower violators obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'DeltaGradientTolerance' and a nonnegative scalar.

Example: 'DeltaGradientTolerance',1e-4

Data Types: single | double
'GapTolerance' — Feasibility gap tolerance
1e-3 (default) | nonnegative scalar

Feasibility gap tolerance obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'GapTolerance' and a nonnegative scalar.

If GapTolerance is 0, then fitrsvm does not use this parameter to check convergence.

Example: 'GapTolerance',1e-4

Data Types: single | double
'IterationLimit' — Maximum number of numerical optimization iterations
1e6 (default) | positive integer

Maximum number of numerical optimization iterations, specified as the comma-separated pair consisting of 'IterationLimit' and a positive integer.

The software returns a trained model regardless of whether the optimization routine successfully converges. Mdl.ConvergenceInfo contains convergence information.

Example: 'IterationLimit',1e8

Data Types: double | single
'KKTTolerance' — Tolerance for KKT violation

Tolerance for Karush-Kuhn-Tucker (KKT) violation, specified as the comma-separated pair consisting of 'KKTTolerance' and a nonnegative scalar value.

This name-value pair applies only if 'Solver' is 'SMO' or 'ISDA'.

If KKTTolerance is 0, then fitrsvm does not use this parameter to check convergence.

Example: 'KKTTolerance',1e-4

Data Types: single | double
'ShrinkagePeriod' — Number of iterations between reductions of active set
0 (default) | nonnegative integer

Number of iterations between reductions of the active set, specified as the comma-separated pair consisting of 'ShrinkagePeriod' and a nonnegative integer.

If you set 'ShrinkagePeriod',0, then the software does not shrink the active set.

Example: 'ShrinkagePeriod',1000

Data Types: double | single
'OptimizeHyperparameters' — Parameters to optimize
'none' (default) | 'auto' | 'all' | string array or cell array of eligible parameter names | vector of optimizableVariable objects

Parameters to optimize, specified as the comma-separated pair consisting of 'OptimizeHyperparameters' and one of the following:

'none' — Do not optimize.

'auto' — Use {'BoxConstraint','KernelScale','Epsilon'}.

'all' — Optimize all eligible parameters.

String array or cell array of eligible parameter names.

Vector of optimizableVariable objects, typically the output of hyperparameters.

The optimization attempts to minimize the cross-validation loss (error) for fitrsvm by varying the parameters. To control the cross-validation type and other aspects of the optimization, use the HyperparameterOptimizationOptions name-value pair argument. 'OptimizeHyperparameters' values override any values you set using other name-value pair arguments. For example, setting 'OptimizeHyperparameters' to 'auto' causes the 'auto' values to apply.

The eligible parameters for fitrsvm are:

BoxConstraint — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].

KernelScale — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].

Epsilon — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e2]*iqr(Y)/1.349.

KernelFunction — fitrsvm searches among 'gaussian', 'linear', and 'polynomial'.

PolynomialOrder — fitrsvm searches among integers in the range [2,4].

Standardize — fitrsvm searches among 'true' and 'false'.

Set nondefault parameters by passing a vector of optimizableVariable objects that have nondefault values. For example:

load carsmall
params = hyperparameters('fitrsvm',[Horsepower,Weight],MPG);
params(1).Range = [1e-4,1e6];

Pass params as the value of OptimizeHyperparameters, as in the sketch below.
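A sketch completing that call, continuing from the params code above:

Mdl = fitrsvm([Horsepower,Weight],MPG,'OptimizeHyperparameters',params);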
By default, the iterative display appears at the command line, and plots appear according to the number of hyperparameters in the optimization. For the optimization and plots, the objective function is log(1 + cross-validation loss) for regression and the misclassification rate for classification. To control the iterative display, set the Verbose field of the 'HyperparameterOptimizationOptions' name-value pair argument. To control the plots, set the ShowPlots field of the 'HyperparameterOptimizationOptions' name-value pair argument.

For an example, see Optimize SVM Regression.

Example: 'OptimizeHyperparameters','auto'
'HyperparameterOptimizationOptions' — Options for optimization

Options for optimization, specified as the comma-separated pair consisting of 'HyperparameterOptimizationOptions' and a structure. This argument modifies the effect of the OptimizeHyperparameters name-value pair argument. All fields in the structure are optional.

Field Name | Values | Default
---|---|---
Optimizer | 'bayesopt' (Bayesian optimization), 'gridsearch' (grid search), or 'randomsearch' (random search). | 'bayesopt'
AcquisitionFunctionName | Name of the acquisition function, such as 'expected-improvement-plus'. Acquisition functions whose names include per-second do not yield reproducible results, because the optimization depends on the runtime of the objective function. | 'expected-improvement-per-second-plus'
MaxObjectiveEvaluations | Maximum number of objective function evaluations. | 30 for 'bayesopt' or 'randomsearch', and the entire grid for 'gridsearch'
MaxTime | Time limit, specified as a positive real. The time limit is in seconds, as measured by tic and toc. | Inf
NumGridDivisions | For 'gridsearch', the number of values in each dimension. The value can be a vector of positive integers giving the number of values for each dimension, or a scalar that applies to all dimensions. This field is ignored for categorical variables. | 10
ShowPlots | Logical value indicating whether to show plots. If true, this field plots the best objective function value against the iteration number. If there are one or two optimization parameters, and if Optimizer is 'bayesopt', then ShowPlots also plots a model of the objective function against the parameters. | true
SaveIntermediateResults | Logical value indicating whether to save results when Optimizer is 'bayesopt'. If true, this field overwrites a workspace variable named 'BayesoptResults' at each iteration. The variable is a BayesianOptimization object. | false
Verbose | Display to the command line: 0 (no display), 1, or 2 (most display). | 1
UseParallel | Logical value indicating whether to run Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. For details, see Parallel Bayesian Optimization. | false
Repartition | Logical value indicating whether to repartition the cross-validation at every iteration. If false, the optimizer uses a single partition for the optimization. Repartitioning usually gives more robust results because it takes partitioning noise into account. | false

Use no more than one of the following three field names.

CVPartition | A cvpartition object, as created by cvpartition. | 'Kfold',5 if you do not specify any cross-validation field
Holdout | A scalar in the range (0,1) representing the holdout fraction. |
Kfold | An integer greater than 1. |

Example: 'HyperparameterOptimizationOptions',struct('MaxObjectiveEvaluations',60)

Data Types: struct
Mdl — Trained SVM regression model
RegressionSVM model | RegressionPartitionedSVM cross-validated model

Trained SVM regression model, returned as a RegressionSVM model or a RegressionPartitionedSVM cross-validated model.

If you set any of the name-value pair arguments KFold, Holdout, Leaveout, CrossVal, or CVPartition, then Mdl is a RegressionPartitionedSVM cross-validated model. Otherwise, Mdl is a RegressionSVM model.
fitrsvm supports low- through moderate-dimensional data sets. For a high-dimensional data set, use fitrlinear instead.

Unless your data set is large, always try to standardize the predictors (see Standardize). Standardization makes predictors insensitive to the scales on which they are measured.

It is good practice to cross-validate using the KFold name-value pair argument. The cross-validation results determine how well the SVM model generalizes.

Sparsity in support vectors is a desirable property of an SVM model. To decrease the number of support vectors, set the BoxConstraint name-value pair argument to a large value. This action also increases the training time.

For optimal training time, set CacheSize as high as the memory limit on your computer allows.

If you expect many fewer support vectors than observations in the training set, then you can significantly speed up convergence by shrinking the active set using the name-value pair argument 'ShrinkagePeriod'. It is good practice to use 'ShrinkagePeriod',1000, as in the sketch after these tips.
Duplicate observations that are far from the regression line do not affect convergence. However, just a few duplicate observations that occur near the regression line can slow down convergence considerably. To speed up convergence, specify 'RemoveDuplicates',true if:

Your data set contains many duplicate observations.

You suspect that a few duplicate observations can fall near the regression line.

However, to maintain the original data set during training, fitrsvm must temporarily store separate data sets: the original and one without the duplicate observations. Therefore, if you specify true for data sets containing few duplicates, then fitrsvm consumes close to double the memory of the original data.
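A minimal sketch combining several of these tips (assumes X and Y are in the workspace):

Mdl = fitrsvm(X,Y,'Standardize',true,...
    'CacheSize','maximal','ShrinkagePeriod',1000);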
After training a model, you can generate C/C++ code that predicts responses for new data. Generating C/C++ code requires MATLAB Coder™. For details, see Introduction to Code Generation.
For the mathematical formulation of linear and nonlinear SVM regression problems and the solver algorithms, see Understanding Support Vector Machine Regression.
NaN, <undefined>, empty character vector (''), empty string (""), and <missing> values indicate missing data values. fitrsvm removes entire rows of data corresponding to a missing response. When normalizing weights, fitrsvm ignores any weight corresponding to an observation with at least one missing predictor. Consequently, observation box constraints might not equal BoxConstraint.

fitrsvm removes observations that have zero weight.
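A small sketch illustrating the removal of rows with missing responses (the data here are hypothetical):

Xs = [1 2; 3 4; 5 6];
Ys = [10; NaN; 30];
MdlS = fitrsvm(Xs,Ys);
MdlS.NumObservations % returns 2; the row with the NaN response is removed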
If you set 'Standardize',true and 'Weights', then fitrsvm standardizes the predictors using their corresponding weighted means and weighted standard deviations. That is, fitrsvm standardizes predictor j (x_j) using

x_j* = (x_j − μ_j*) / σ_j*

where x_jk is observation k (row) of predictor j (column), μ_j* is the weighted mean of predictor j,

μ_j* = (1 / Σ_k w_k) Σ_k w_k x_jk,

and σ_j* is the corresponding weighted standard deviation.
If your predictor data contains categorical variables, then the software generally uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable.
The PredictorNames property stores one element for each of the original predictor variable names. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then PredictorNames is a 1-by-3 cell array of character vectors containing the original names of the predictor variables.

The ExpandedPredictorNames property stores one element for each of the predictor variables, including the dummy variables. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then ExpandedPredictorNames is a 1-by-5 cell array of character vectors containing the names of the predictor variables and the new dummy variables.

Similarly, the Beta property stores one beta coefficient for each predictor, including the dummy variables.

The SupportVectors property stores the predictor values for the support vectors, including the dummy variables. For example, assume that there are m support vectors and three predictors, one of which is a categorical variable with three levels. Then SupportVectors is an m-by-5 matrix.

The X property stores the training data as originally input. It does not include the dummy variables. When the input is a table, X contains only the columns used as predictors.

For predictors specified in a table, if any of the variables contain ordered (ordinal) categories, the software uses ordinal encoding for these variables. For a variable having k ordered levels, the software creates k – 1 dummy variables. The jth dummy variable is −1 for levels up to j, and +1 for levels j + 1 through k. The names of the dummy variables stored in the ExpandedPredictorNames property indicate the first level with the value +1. The software stores k – 1 additional predictor names for the dummy variables, including the names of levels 2, 3, ..., k.
All solvers implement L1 soft-margin minimization.
Let p be the proportion of outliers that you expect in the training data. If you set 'OutlierFraction',p, then the software implements robust learning. In other words, the software attempts to remove 100p% of the observations when the optimization algorithm converges. The removed observations correspond to gradients that are large in magnitude.
[1] Clark, D., Z. Schreter, and A. Adams. "A Quantitative Comparison of Dystal and Backpropagation." Submitted to the Australian Conference on Neural Networks, 1996.

[2] Fan, R.-E., P.-H. Chen, and C.-J. Lin. "Working set selection using second order information for training support vector machines." Journal of Machine Learning Research, Vol. 6, 2005, pp. 1889–1918.

[3] Kecman, V., T.-M. Huang, and M. Vogt. "Iterative Single Data Algorithm for Training Kernel Machines from Huge Data Sets: Theory and Performance." In Support Vector Machines: Theory and Applications. Edited by Lipo Wang, 255–274. Berlin: Springer-Verlag, 2005.

[4] Lichman, M. UCI Machine Learning Repository, [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

[5] Nash, W.J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait. Sea Fisheries Division, Technical Report No. 48, 1994.

[6] Waugh, S. Extending and Benchmarking Cascade-Correlation. Ph.D. thesis, Computer Science Department, University of Tasmania, 1995.
To run in parallel, set the 'UseParallel' option to true. To perform parallel hyperparameter optimization, use the 'HyperparameterOptimizationOptions',struct('UseParallel',true) name-value pair argument in the call to this function.
For more information on parallel hyperparameter optimization, see Parallel Bayesian Optimization.
For more general information about parallel computing, see Run MATLAB Functions with Automatic Parallel Support (Parallel Computing Toolbox).
See Also: CompactRegressionSVM | RegressionPartitionedSVM | RegressionSVM | predict