fitrsvm
Fit a support vector machine regression model
Syntax
Description
fitrsvm trains or cross-validates a support vector machine (SVM) regression model on a low- through moderate-dimensional predictor data set. fitrsvm supports mapping the predictor data using kernel functions, and supports SMO, ISDA, or L1 soft-margin minimization via quadratic programming for objective-function minimization.
To train a linear SVM regression model on a high-dimensional data set, that is, a data set that includes many predictor variables, use fitrlinear instead.
To train an SVM model for binary classification, see fitcsvm for low- through moderate-dimensional predictor data sets, or fitclinear for high-dimensional data sets.
Mdl = fitrsvm(Tbl,ResponseVarName) returns an SVM regression model Mdl trained using the predictor values in the table Tbl and the response values in Tbl.ResponseVarName.
Mdl = fitrsvm(___,Name,Value) specifies additional options using one or more name-value arguments.
[Mdl,AggregateOptimizationResults] = fitrsvm(___) also returns AggregateOptimizationResults, which contains hyperparameter optimization results when you specify the OptimizeHyperparameters and HyperparameterOptimizationOptions name-value arguments. You must also specify the ConstraintType and ConstraintBounds options of HyperparameterOptimizationOptions. You can use this syntax to optimize on compact model size instead of cross-validation loss, and to perform a set of multiple optimization problems that have the same options but different constraint bounds.
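As a rough sketch of the table-based syntax and the name-value syntax, the following trains a model from a table and then retrains it with additional options. It assumes the carsmall sample data set that ships with Statistics and Machine Learning Toolbox. The choice of predictors and of a Gaussian kernel is illustrative only, not a recommendation from this page.

```matlab
% Assumption: carsmall sample data (variables Horsepower, Weight, MPG).
load carsmall
Tbl = table(Horsepower, Weight, MPG);

% Table syntax: the response is named by a variable in the table.
Mdl = fitrsvm(Tbl, 'MPG');

% Same inputs plus name-value arguments (illustrative choices).
MdlGauss = fitrsvm(Tbl, 'MPG', ...
    'KernelFunction', 'gaussian', ...
    'Standardize', true);

% Predict responses for the first five rows of the training table.
yhat = predict(MdlGauss, Tbl(1:5, :));
```

Both calls return a trained model object. Passing a cross-validation option such as 'KFold' would return a cross-validated model instead.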
Examples
Input Arguments
Name-Value Arguments
Output Arguments
Limitations
fitrsvm supports low- through moderate-dimensional data sets. For high-dimensional data sets, use fitrlinear instead.
Tips
- Unless your data set is large, always try to standardize the predictors (see Standardize). Standardization makes predictors insensitive to the scales on which they are measured.
- It is good practice to cross-validate using the KFold name-value pair argument. The cross-validation results determine how well the SVM model generalizes.
- Sparsity in support vectors is a desirable property of an SVM model. To decrease the number of support vectors, set the BoxConstraint name-value pair argument to a large value. This action also increases the training time.
- For optimal training time, set CacheSize as high as the memory limit on your computer allows.
- If you expect many fewer support vectors than observations in the training set, then you can significantly speed up convergence by shrinking the active set using the name-value pair argument 'ShrinkagePeriod'. It is good practice to use 'ShrinkagePeriod',1000.
- Duplicate observations that are far from the regression line do not affect convergence. However, just a few duplicate observations that occur near the regression line can slow down convergence considerably. To speed up convergence, specify 'RemoveDuplicates',true if:
  - Your data set contains many duplicate observations.
  - You suspect that a few duplicate observations can fall near the regression line.

  However, to maintain the original data set during training, fitrsvm must temporarily store separate data sets: the original and one without the duplicate observations. Therefore, if you specify true for data sets containing few duplicates, then fitrsvm consumes close to double the memory of the original data.
- After training a model, you can generate C/C++ code that predicts responses for new data. Generating C/C++ code requires MATLAB Coder™. For details, see Introduction to Code Generation. 
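The tips above can be combined in a short sketch such as the following. It assumes the carsmall sample data set. The fold count and the specific option values are illustrative assumptions, and the best settings depend on your data and available memory.

```matlab
% Assumption: carsmall sample data; two numeric predictors, one response.
load carsmall
X = [Horsepower Weight];
Y = MPG;

% Standardize the predictors and estimate generalization error
% with 5-fold cross-validation (fold count chosen for illustration).
CVMdl = fitrsvm(X, Y, 'Standardize', true, 'KFold', 5);
cvMSE = kfoldLoss(CVMdl);

% Train a full model with the training-time tips applied:
% a large cache, active-set shrinking, and duplicate removal.
Mdl = fitrsvm(X, Y, ...
    'Standardize', true, ...
    'CacheSize', 'maximal', ...
    'ShrinkagePeriod', 1000, ...
    'RemoveDuplicates', true);

% Count the support vectors to check sparsity.
numSV = sum(Mdl.IsSupportVector);
```

Comparing numSV across several BoxConstraint values is one way to see the sparsity and training-time trade-off described above.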
Algorithms
- For the mathematical formulation of linear and nonlinear SVM regression problems and the solver algorithms, see Understanding Support Vector Machine Regression. 
- NaN, <undefined>, empty character vector (''), empty string (""), and <missing> values indicate missing data values. fitrsvm removes entire rows of data corresponding to a missing response. When normalizing weights, fitrsvm ignores any weight corresponding to an observation with at least one missing predictor. Consequently, observation box constraints might not equal BoxConstraint.
- fitrsvm removes observations that have zero weight.
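- If you set 'Standardize',true and 'Weights', then fitrsvm standardizes the predictors using their corresponding weighted means and weighted standard deviations. That is, fitrsvm standardizes predictor j ($x_j$) using

  $$x_j^* = \frac{x_j - \mu_j^*}{\sigma_j^*}, \qquad \mu_j^* = \frac{1}{\sum_k w_k} \sum_k w_k x_{jk},$$

  where $x_{jk}$ is observation k (row) of predictor j (column), and

  $$(\sigma_j^*)^2 = \frac{v_1}{v_1^2 - v_2} \sum_k w_k \left(x_{jk} - \mu_j^*\right)^2, \qquad v_1 = \sum_k w_k, \qquad v_2 = \sum_k w_k^2.$$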
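To make the weighted standardization concrete, here is a minimal sketch that computes a weighted mean and a weighted standard deviation for each predictor by hand. The data and weights are hypothetical. The variance normalization v1/(v1^2 - v2), with v1 the sum of the weights and v2 the sum of the squared weights, is an assumption about the estimator and might not match the software exactly.

```matlab
% Hypothetical data: 4 observations (rows), 2 predictors (columns).
X = [1 10; 2 20; 3 30; 4 40];
w = [1; 1; 2; 2];              % hypothetical observation weights

v1 = sum(w);                   % sum of weights
v2 = sum(w.^2);                % sum of squared weights

% Weighted mean of each predictor (1-by-2 row vector).
mu = sum(w .* X, 1) / v1;

% Weighted standard deviation of each predictor
% (assumed normalization, see the note above).
sigma = sqrt(v1 / (v1^2 - v2) * sum(w .* (X - mu).^2, 1));

% Standardized predictors.
Z = (X - mu) ./ sigma;
```

If the assumed normalization matches the software, mu and sigma should agree with the Mu and Sigma properties of a model trained on the same data with 'Standardize',true and the same 'Weights'.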
 
- If your predictor data contains categorical variables, then the software generally uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable.
  - The PredictorNames property stores one element for each of the original predictor variable names. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then PredictorNames is a 1-by-3 cell array of character vectors containing the original names of the predictor variables.
  - The ExpandedPredictorNames property stores one element for each of the predictor variables, including the dummy variables. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then ExpandedPredictorNames is a 1-by-5 cell array of character vectors containing the names of the predictor variables and the new dummy variables.
  - Similarly, the Beta property stores one beta coefficient for each predictor, including the dummy variables.
  - The SupportVectors property stores the predictor values for the support vectors, including the dummy variables. For example, assume that there are m support vectors and three predictors, one of which is a categorical variable with three levels. Then SupportVectors is an m-by-5 matrix.
  - The X property stores the training data as originally input. It does not include the dummy variables. When the input is a table, X contains only the columns used as predictors.
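The three-predictor example in the text can be reproduced with a small synthetic table. This is only a sketch: the variable names (x1, x2, g, y), the level names, and the data-generating formula are all hypothetical. The sizes noted in the comments are the ones the description above predicts, not captured output.

```matlab
% Hypothetical data: two numeric predictors and one categorical
% predictor with three levels.
rng(0)                                   % for reproducibility
n  = 60;
x1 = randn(n, 1);
x2 = randn(n, 1);
g  = categorical(randi(3, n, 1), 1:3, {'low', 'mid', 'high'});
y  = 2*x1 - x2 + double(g) + 0.1*randn(n, 1);
Tbl = table(x1, x2, g, y);

Mdl = fitrsvm(Tbl, 'y');                 % default linear kernel

disp(Mdl.PredictorNames)                 % original names (1-by-3 per the text)
disp(Mdl.ExpandedPredictorNames)         % with dummy variables (1-by-5 per the text)
disp(size(Mdl.SupportVectors))           % m-by-5 per the text
disp(numel(Mdl.Beta))                    % one coefficient per expanded predictor
disp(size(Mdl.X))                        % training data as input, no dummy variables
```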
 
- For predictors specified in a table, if any of the variables contain ordered (ordinal) categories, the software uses ordinal encoding for these variables.
  - For a variable having k ordered levels, the software creates k – 1 dummy variables. The jth dummy variable is -1 for levels up to j, and +1 for levels j + 1 through k.
  - The names of the dummy variables stored in the ExpandedPredictorNames property indicate the first level with the value +1. The software stores k – 1 additional predictor names for the dummy variables, including the names of levels 2, 3, ..., k.
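The ordinal encoding rule can be written out directly. The following sketch builds the k – 1 dummy variables for one observation at each of k = 4 ordered levels. It illustrates the rule as stated above and is not the software's internal implementation. The variable names are hypothetical.

```matlab
k = 4;                         % number of ordered levels (illustrative)
levels = (1:k)';               % one example observation per level

% Column j is -1 for levels up to j and +1 for levels j+1 through k.
% Implicit expansion compares the k-by-1 levels with the 1-by-(k-1) indices.
D = 2 * (levels > (1:k-1)) - 1;

disp(D)
% Row i is the encoding of level i:
%   -1  -1  -1
%    1  -1  -1
%    1   1  -1
%    1   1   1
```

Each column switches from -1 to +1 at a different level, which is why the dummy variable names refer to levels 2 through k.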
 
- All solvers implement L1 soft-margin minimization. 
- Let p be the proportion of outliers that you expect in the training data. If you set 'OutlierFraction',p, then the software implements robust learning. In other words, the software attempts to remove 100p% of the observations when the optimization algorithm converges. The removed observations correspond to gradients that are large in magnitude.
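As a hedged illustration of robust learning, the sketch below trains one model with 'OutlierFraction' and one without, then compares their resubstitution losses. It assumes the carsmall sample data set. The value p = 0.05 is a hypothetical guess at the outlier proportion, not a recommended setting.

```matlab
% Assumption: carsmall sample data; two numeric predictors, one response.
load carsmall
X = [Horsepower Weight];
Y = MPG;

p = 0.05;                      % hypothetical expected proportion of outliers

% Robust learning: the software attempts to remove 100p% of the observations.
MdlRobust = fitrsvm(X, Y, 'Standardize', true, 'OutlierFraction', p);

% Baseline without outlier removal.
MdlPlain = fitrsvm(X, Y, 'Standardize', true);

% Compare the resubstitution mean squared errors of the two models.
lossRobust = resubLoss(MdlRobust);
lossPlain  = resubLoss(MdlPlain);
```

Resubstitution loss is only a quick check. Cross-validated loss gives a better picture of whether the robust setting helps on a given data set.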
