RegressionChainEnsemble

Multiresponse regression model

Since R2024b

Description

RegressionChainEnsemble is a trained multiresponse regression model that uses regression chains. Use the predict object function to predict responses for new data, and use the loss object function to compute the regression loss.

For more information, see Regression Chains.

Creation

Create a RegressionChainEnsemble object by using the fitrchains function.
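
The following is a minimal sketch of the creation step, not a definitive recipe. It assumes the carbig sample data set is available and that the default learner and chain settings are acceptable. The variable names X, Y, and keep are illustrative.

```matlab
% Minimal sketch: create a RegressionChainEnsemble from numeric matrices.
load carbig                                % sample data (assumed available)
X = [Displacement Horsepower Weight];      % predictor matrix
Y = [Acceleration MPG];                    % two response variables
keep = ~any(isnan([X Y]),2);               % keep complete observations only

Mdl = fitrchains(X(keep,:),Y(keep,:));     % default learners and chains
class(Mdl)                                 % display the class of the trained object
```

You can also pass a table and the names of the response variables, as in the example at the end of this page.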

Properties

Chain Ensemble Properties

`ChainOrders` — Order of response variables in regression chains
positive integer matrix

This property is read-only.

Order of the response variables in the regression chains, specified as a positive integer matrix. Row i indicates the order of the response variables in regression chain i.

Data Types: double

`Learners` — Compact regression models trained as part of regression chains
cell array of regression model objects

This property is read-only.

Compact regression models trained as part of the regression chains, specified as a cell array of regression model objects. Each row of Learners corresponds to one regression chain.

This table lists the possible compact regression models.

Regression Model Type	Model Object
Bagged or boosted ensemble of trees	`CompactRegressionEnsemble`
General additive model (GAM)	`CompactRegressionGAM`
Gaussian process regression (GPR)	`CompactRegressionGP`
Kernel model	`RegressionKernel`
Linear model	`RegressionLinear`
Support vector machine (SVM)	`CompactRegressionSVM`
Decision tree	`CompactRegressionTree`

Data Types: cell

`NumChains` — Number of regression chains
positive integer scalar

This property is read-only.

Number of regression chains in the chain ensemble, specified as a positive integer scalar. NumChains indicates the number of rows in ChainOrders and Learners.

Data Types: double
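
As an illustration of how the three chain ensemble properties relate, the following sketch trains a model on the carbig sample data (an assumption) and reads the properties with dot notation. The variable names firstChainOrder and firstChainNames are hypothetical.

```matlab
% Sketch: relate ChainOrders, Learners, and NumChains.
load carbig
X = [Displacement Horsepower Weight];
Y = [Acceleration MPG];
keep = ~any(isnan([X Y]),2);               % keep complete observations only
Mdl = fitrchains(X(keep,:),Y(keep,:));

% Row i of ChainOrders and row i of Learners both describe chain i,
% so both have NumChains rows.
numChains = Mdl.NumChains
sizeOrders = size(Mdl.ChainOrders)
sizeLearners = size(Mdl.Learners)

% Translate the response indices of the first chain into response names.
firstChainOrder = Mdl.ChainOrders(1,:);
firstChainNames = Mdl.ResponseName(firstChainOrder)
```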

Data Properties

`CategoricalPredictors` — Categorical predictor indices
positive integer vector | `[]`

This property is read-only.

Categorical predictor indices, specified as a positive integer vector. Each index value in CategoricalPredictors indicates that the corresponding predictor listed in PredictorNames is categorical. If none of the predictors are categorical, then this property is empty ([]).

Data Types: double
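
For instance, you can use the indices to look up the names of the categorical predictors. This sketch assumes the carbig sample data set and a table in which Origin is the only categorical variable. The names catIdx and catNames are illustrative.

```matlab
% Sketch: map CategoricalPredictors indices to predictor names.
load carbig
cars = table(Displacement,Horsepower,Origin,Weight,Acceleration,MPG);
cars.Origin = categorical(cellstr(cars.Origin));   % make Origin categorical
cars = rmmissing(cars);                            % drop incomplete rows

Mdl = fitrchains(cars,["Acceleration","MPG"]);

catIdx = Mdl.CategoricalPredictors;        % indices into PredictorNames
catNames = Mdl.PredictorNames(catIdx)      % names of the categorical predictors
```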

`NumObservations` — Number of observations
positive integer scalar

This property is read-only.

Number of observations in the data stored in X and Y, specified as a positive integer scalar.

Data Types: double

`NumPredictors` — Number of predictor variables
positive integer scalar

This property is read-only.

Number of predictor variables, specified as a positive integer scalar. NumPredictors does not include response variables that are used as predictors by some models in Learners.

To see all the predictors used by a specific compact regression model in Learners, use the properties of the compact regression model. For an example, see Specify Multiresponse Regression Model Properties.

Data Types: double
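
The following sketch compares NumPredictors with the number of predictors seen by one compact model. It assumes the carbig sample data set and SVM learners. The names nEnsemble and nSecond are hypothetical. Because the second model in a chain also receives the first response variable as a predictor, nSecond is expected to be larger than nEnsemble.

```matlab
% Sketch: compare the ensemble-level predictor count with a learner's count.
load carbig
X = [Displacement Horsepower Weight];
Y = [Acceleration MPG];
keep = ~any(isnan([X Y]),2);               % keep complete observations only
Mdl = fitrchains(X(keep,:),Y(keep,:), ...
    Learner=templateSVM(Standardize=true));

nEnsemble = Mdl.NumPredictors              % original predictors only

% Second model in the first chain: original predictors plus the
% response variable modeled earlier in the chain.
nSecond = numel(Mdl.Learners{1,2}.PredictorNames)
```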

`NumResponses` — Number of response variables
positive integer scalar

This property is read-only.

Number of response variables, specified as a positive integer scalar.

Data Types: double

`PredictorNames` — Predictor variable names
cell array of character vectors

This property is read-only.

Predictor variable names, specified as a cell array of character vectors. The order of the elements in PredictorNames corresponds to the order of the predictors in the data used to train the model.

Data Types: cell

`ResponseName` — Response variable names
string array

This property is read-only.

Response variable names, specified as a string array. The order of the elements in ResponseName corresponds to the order of the response variables in the data used to train the model.

Data Types: string

`X` — Predictor data
numeric matrix | table

This property is read-only.

Predictor data used to train the model, specified as a numeric matrix or a table. Each row of X corresponds to an observation, and each column corresponds to a predictor variable (PredictorNames).

Data Types: single | double | table

`Y` — Response data
numeric matrix | numeric table

This property is read-only.

Response data used to train the model, specified as a numeric matrix or a numeric table. Each row of Y corresponds to an observation, and each column corresponds to a response variable (ResponseName).

Data Types: single | double | table

`W` — Observation weights
numeric vector

This property is read-only.

Observation weights used to train the model, specified as a numeric vector. Each row of W corresponds to an observation.

Data Types: single | double
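
As a quick check of how the stored data properties line up, this sketch compares the sizes of X, Y, and W with NumObservations. It assumes the carbig sample data set and complete observations. The variable n is illustrative.

```matlab
% Sketch: the stored training data has one row per observation.
load carbig
X = [Displacement Horsepower Weight];
Y = [Acceleration MPG];
keep = ~any(isnan([X Y]),2);               % keep complete observations only
Mdl = fitrchains(X(keep,:),Y(keep,:));

n = Mdl.NumObservations
sizeX = size(Mdl.X)                        % n rows, one column per predictor
sizeY = size(Mdl.Y)                        % n rows, one column per response
numWeights = numel(Mdl.W)                  % one weight per observation
```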

Object Functions

`compact`	Reduce size of multiresponse regression model
`loss`	Loss for multiresponse regression model
`predict`	Predict responses using multiresponse regression model
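
The three object functions can be sketched together as follows. The example assumes the carbig sample data set and uses a simple odd/even split of the rows as a stand-in for a proper holdout partition. All variable names are illustrative.

```matlab
% Sketch: predict, loss, and compact on a simple holdout split.
load carbig
X = [Displacement Horsepower Weight];
Y = [Acceleration MPG];
keep = ~any(isnan([X Y]),2);               % keep complete observations only
X = X(keep,:);
Y = Y(keep,:);

trainIdx = 1:2:size(X,1);                  % odd rows for training
testIdx = 2:2:size(X,1);                   % even rows for testing

Mdl = fitrchains(X(trainIdx,:),Y(trainIdx,:));

YFit = predict(Mdl,X(testIdx,:));          % predicted responses for new data
L = loss(Mdl,X(testIdx,:),Y(testIdx,:))    % regression loss on the test rows

CMdl = compact(Mdl);                       % smaller model without training data
YFitCompact = predict(CMdl,X(testIdx,:));  % compact model still predicts
```

The compact model does not store the training data, so it is the natural choice when you only need predictions.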

Examples

Specify Multiresponse Regression Model Properties

Train a multiresponse regression model using regression chains. Specify the type of regression models to use in the regression chains, and train the models using predicted values for the response variables that serve as predictors.

Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Displacement, Horsepower, and so on, as well as the response variables Acceleration and MPG. Display the first eight rows of the table.

load carbig
cars = table(Displacement,Horsepower,Model_Year, ...
    Origin,Weight,Acceleration,MPG);
head(cars)

    Displacement    Horsepower    Model_Year    Origin     Weight    Acceleration    MPG
    ____________    __________    __________    _______    ______    ____________    ___

        307            130            70        USA         3504           12        18 
        350            165            70        USA         3693         11.5        15 
        318            150            70        USA         3436           11        18 
        304            150            70        USA         3433           12        16 
        302            140            70        USA         3449         10.5        17 
        429            198            70        USA         4341           10        15 
        454            220            70        USA         4354            9        14 
        440            215            70        USA         4312          8.5        14

Categorize the cars based on whether they were made in the USA.

cars.Origin = categorical(cellstr(cars.Origin));
cars.Origin = mergecats(cars.Origin,["France","Japan",...
    "Germany","Sweden","Italy","England"],"NotUSA");

Remove observations with missing values.

cars = rmmissing(cars);

Train a multiresponse regression model by passing the cars data to the fitrchains function. Use regression chains composed of regression support vector machine (SVM) models with standardized numeric predictors. When training the SVM models, use the predicted values for the response variables that are treated as predictors.

Mdl = fitrchains(cars,["Acceleration","MPG"], ...
    Learner=templateSVM(Standardize=true), ...
    ChainPredictedResponse=true);

Mdl is a trained RegressionChainEnsemble model object. You can use dot notation to access the properties of Mdl.

Display the order of the response variables in the regression chains in Mdl, and display the trained regression SVM models in the regression chains.

Mdl.ChainOrders

Mdl.Learners

ans=2×2 cell array
    {1x1 classreg.learning.regr.CompactRegressionSVM}    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}    {1x1 classreg.learning.regr.CompactRegressionSVM}

In the first regression chain, the first SVM model uses Acceleration as the response variable. The second SVM model uses MPG as the response variable and the predicted values for Acceleration as a predictor variable. The first SVM model provides the predicted Acceleration values used by the second SVM model.

Recall that the SVM models use standardized numeric predictors. Find the means (Mu) and standard deviations (Sigma) used by the second model in the first regression chain.

Chain1Model2 = Mdl.Learners{1,2};

Mdl.PredictorNames

ans = 1x5 cell
    {'Displacement'}    {'Horsepower'}    {'Model_Year'}    {'Origin'}    {'Weight'}

Chain1Model2.ExpandedPredictorNames

ans = 1x7 cell
    {'x1'}    {'x2'}    {'x3'}    {'x4 == 1'}    {'x4 == 2'}    {'x5'}    {'x6'}

Chain1Model2.Mu

ans = 1×7
10³ ×

    0.1944    0.1045    0.0760         0         0    2.9776    0.0153

Chain1Model2.Sigma

ans = 1×7

  104.6440   38.4912    3.6837    1.0000    1.0000  849.4026    2.2190

The SVM model uses five numeric predictors: Displacement (x1), Horsepower (x2), Model_Year (x3), Weight (x5), and the predicted values for Acceleration (x6). The software uses the corresponding Mu and Sigma values to standardize the predictor data before predicting with the predict object function.

The categorical predictor Origin is split into two variables (x4 == 1 and x4 == 2) after categorical expansion. The corresponding Mu and Sigma values indicate that the two variables are unchanged after standardization.
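
The standardization arithmetic can be sketched by hand with the displayed Mu and Sigma values. The observation x is hypothetical: its dummy coding for Origin and its predicted Acceleration value are made up for illustration, and the Mu and Sigma values are rounded to the precision shown in the output.

```matlab
% Values copied from the displayed output (rounded to the shown precision).
Mu = 1e3 * [0.1944 0.1045 0.0760 0 0 2.9776 0.0153];
Sigma = [104.6440 38.4912 3.6837 1 1 849.4026 2.2190];

% Hypothetical expanded observation: Displacement, Horsepower, Model_Year,
% two dummy variables for Origin, Weight, and a predicted Acceleration.
x = [307 130 70 1 0 3504 12];

z = (x - Mu) ./ Sigma;                     % standardized predictor values

% Mu = 0 and Sigma = 1 leave the two dummy variables unchanged.
dummyValues = z(4:5)
```

Only the numeric predictors are centered and scaled, which is why the entries for x4 == 1 and x4 == 2 pass through as is.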

Version History

Introduced in R2024b
