RegressionChainEnsemble

Multiresponse regression model

Since R2024b

    Description

    RegressionChainEnsemble is a trained multiresponse regression model that uses regression chains. Use the predict object function to predict responses for new data, and use the loss object function to compute the regression loss.

    For more information, see Regression Chains.

    Creation

    Create a RegressionChainEnsemble object by using the fitrchains function.

    Properties

    Chain Ensemble Properties

    ChainOrders

    This property is read-only.

    Order of the response variables in the regression chains, specified as a positive integer matrix. Row i indicates the order of the response variables in regression chain i.

    Data Types: double

    Learners

    This property is read-only.

    Compact regression models trained as part of the regression chains, specified as a cell array of regression model objects. Each row of Learners corresponds to one regression chain.

    This table lists the possible compact regression models.

    Regression Model Type                    Model Object
    Bagged or boosted ensemble of trees      CompactRegressionEnsemble
    Generalized additive model (GAM)         CompactRegressionGAM
    Gaussian process regression (GPR)        CompactRegressionGP
    Kernel model                             RegressionKernel
    Linear model                             RegressionLinear
    Support vector machine (SVM)             CompactRegressionSVM
    Decision tree                            CompactRegressionTree

    Data Types: cell

    NumChains

    This property is read-only.

    Number of regression chains in the chain ensemble, specified as a positive integer scalar. NumChains indicates the number of rows in ChainOrders and Learners.

    Data Types: double

    Data Properties

    CategoricalPredictors

    This property is read-only.

    Categorical predictor indices, specified as a positive integer vector. Each index value in CategoricalPredictors indicates that the corresponding predictor listed in PredictorNames is categorical. If none of the predictors are categorical, then this property is empty ([]).

    Data Types: double

    NumObservations

    This property is read-only.

    Number of observations in the data stored in X and Y, specified as a positive integer scalar.

    Data Types: double

    NumPredictors

    This property is read-only.

    Number of predictor variables, specified as a positive integer scalar. NumPredictors does not include response variables that are used as predictors by some models in Learners.

    To see all the predictors used by a specific compact regression model in Learners, use the properties of the compact regression model. For an example, see Specify Multiresponse Regression Model Properties.

    Data Types: double

    NumResponses

    This property is read-only.

    Number of response variables, specified as a positive integer scalar.

    Data Types: double

    PredictorNames

    This property is read-only.

    Predictor variable names, specified as a cell array of character vectors. The order of the elements in PredictorNames corresponds to the order of the predictors in the data used to train the model.

    Data Types: cell

    ResponseName

    This property is read-only.

    Response variable names, specified as a string array. The order of the elements in ResponseName corresponds to the order of the response variables in the data used to train the model.

    Data Types: string

    X

    This property is read-only.

    Predictor data used to train the model, specified as a numeric matrix or a table. Each row of X corresponds to an observation, and each column corresponds to a predictor variable (PredictorNames).

    Data Types: single | double | table

    Y

    This property is read-only.

    Response data used to train the model, specified as a numeric matrix or a table. Each row of Y corresponds to an observation, and each column corresponds to a response variable (ResponseName).

    Data Types: single | double | table

    W

    This property is read-only.

    Observation weights used to train the model, specified as a numeric vector. Each row of W corresponds to an observation.

    Data Types: single | double

    Object Functions

    compact    Reduce size of multiresponse regression model
    loss       Loss for multiresponse regression model
    predict    Predict responses using multiresponse regression model
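
    The object functions can be called on a trained model as in the following minimal sketch. It assumes the carbig sample data set is available, uses default fitrchains options, and assumes the table-based call forms predict(Mdl,Tbl) and loss(Mdl,Tbl). The variable names cars, Yfit, L, and CMdl are illustrative only.

```matlab
% Build a table of numeric predictors and two response variables.
load carbig
cars = table(Displacement,Horsepower,Weight,Acceleration,MPG);
cars = rmmissing(cars);   % remove observations with missing values

% Train a chain ensemble with default settings (illustrative, not tuned).
Mdl = fitrchains(cars,["Acceleration","MPG"]);

% Predict both response variables for the observations in the table.
Yfit = predict(Mdl,cars);

% Compute the regression loss. This call assumes that the table
% contains the response variables named in Mdl.ResponseName.
L = loss(Mdl,cars);

% Create a reduced-size version of the trained model.
CMdl = compact(Mdl);
```

    The sketch evaluates the model on its own training data only to keep the code short. A loss computed this way is typically optimistic compared to a loss computed on held-out data.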

    Examples

    Train a multiresponse regression model using regression chains. Specify the type of regression models to use in the regression chains, and train the models with predicted values for response variables used as predictors.

    Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s. Create a table containing the predictor variables Displacement, Horsepower, and so on, as well as the response variables Acceleration and MPG. Display the first eight rows of the table.

    load carbig
    cars = table(Displacement,Horsepower,Model_Year, ...
        Origin,Weight,Acceleration,MPG);
    head(cars)
        Displacement    Horsepower    Model_Year    Origin     Weight    Acceleration    MPG
        ____________    __________    __________    _______    ______    ____________    ___
    
            307            130            70        USA         3504           12        18 
            350            165            70        USA         3693         11.5        15 
            318            150            70        USA         3436           11        18 
            304            150            70        USA         3433           12        16 
            302            140            70        USA         3449         10.5        17 
            429            198            70        USA         4341           10        15 
            454            220            70        USA         4354            9        14 
            440            215            70        USA         4312          8.5        14 
    

    Categorize the cars based on whether they were made in the USA.

    cars.Origin = categorical(cellstr(cars.Origin));
    cars.Origin = mergecats(cars.Origin,["France","Japan",...
        "Germany","Sweden","Italy","England"],"NotUSA");

    Remove observations with missing values.

    cars = rmmissing(cars);

    Train a multiresponse regression model by passing the cars data to the fitrchains function. Use regression chains composed of regression support vector machine (SVM) models with standardized numeric predictors. When training the SVM models, use the predicted values for the response variables that are treated as predictors.

    Mdl = fitrchains(cars,["Acceleration","MPG"], ...
        Learner=templateSVM(Standardize=true), ...
        ChainPredictedResponse=true);

    Mdl is a trained RegressionChainEnsemble model object. You can use dot notation to access the properties of Mdl.

    Display the order of the response variables in the regression chains in Mdl, and display the trained regression SVM models in the regression chains.

    Mdl.ChainOrders
    ans = 2×2
    
         1     2
         2     1
    
    
    Mdl.Learners
    ans = 2×2 cell array
        {1x1 classreg.learning.regr.CompactRegressionSVM}    {1x1 classreg.learning.regr.CompactRegressionSVM}
        {1x1 classreg.learning.regr.CompactRegressionSVM}    {1x1 classreg.learning.regr.CompactRegressionSVM}
    
    

    In the first regression chain, the first SVM model uses Acceleration as the response variable. The second SVM model uses MPG as the response variable and the predicted values for Acceleration as a predictor variable. The first SVM model provides the predicted Acceleration values used by the second SVM model.
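
    To read a row of ChainOrders as response names, index the response names with that row. The following sketch is self-contained: it hard-codes the ChainOrders values and response names displayed above instead of reading them from Mdl. The variable names chainOrders and responseNames and the printed format are illustrative.

```matlab
% Values copied from the displayed Mdl.ChainOrders and the response names
% passed to fitrchains in this example.
chainOrders = [1 2; 2 1];
responseNames = ["Acceleration","MPG"];

% Row i of chainOrders lists response indices in the order that chain i
% models them.
for i = 1:size(chainOrders,1)
    orderedNames = responseNames(chainOrders(i,:));
    fprintf("Chain %d: %s\n",i,join(orderedNames," -> "));
end
```

    With the trained model in the workspace, the same indexing would be Mdl.ResponseName(Mdl.ChainOrders(i,:)). The second row, [2 1], shows that the second chain models MPG first and Acceleration second.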

    Recall that the SVM models use standardized numeric predictors. Find the means (Mu) and standard deviations (Sigma) used by the second model in the first regression chain.

    Chain1Model2 = Mdl.Learners{1,2};
    
    Mdl.PredictorNames
    ans = 1x5 cell
        {'Displacement'}    {'Horsepower'}    {'Model_Year'}    {'Origin'}    {'Weight'}
    
    
    Chain1Model2.ExpandedPredictorNames
    ans = 1x7 cell
        {'x1'}    {'x2'}    {'x3'}    {'x4 == 1'}    {'x4 == 2'}    {'x5'}    {'x6'}
    
    
    Chain1Model2.Mu
    ans = 1×7
       1.0e+03 *
    
        0.1944    0.1045    0.0760         0         0    2.9776    0.0153
    
    
    Chain1Model2.Sigma
    ans = 1×7
    
      104.6440   38.4912    3.6837    1.0000    1.0000  849.4026    2.2190
    
    

    The SVM model uses five numeric predictors: Displacement (x1), Horsepower (x2), Model_Year (x3), Weight (x5), and the predicted values for Acceleration (x6). The software uses the corresponding Mu and Sigma values to standardize the predictor data before predicting with the predict object function.

    The categorical predictor Origin is split into two variables (x4 == 1 and x4 == 2) after categorical expansion. The corresponding Mu and Sigma values indicate that the two variables are unchanged after standardization.
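
    The standardization arithmetic can be checked by hand. The following sketch copies the displayed Mu and Sigma values and applies the usual transformation z = (x - Mu)./Sigma to one expanded predictor row. The row is an assumption made for illustration: the numeric entries come from the first car in the table, the two indicator values for Origin are hypothetical, and the actual Acceleration value stands in for the predicted value that the model would use.

```matlab
% Mu and Sigma as displayed for Chain1Model2 (rounded display values).
mu    = 1e3*[0.1944 0.1045 0.0760 0 0 2.9776 0.0153];
sigma = [104.6440 38.4912 3.6837 1.0000 1.0000 849.4026 2.2190];

% Illustrative expanded predictor row:
% x1 Displacement, x2 Horsepower, x3 Model_Year,
% x4 == 1 and x4 == 2 (hypothetical indicator values for Origin),
% x5 Weight, x6 Acceleration (stand-in for the predicted value).
x = [307 130 70 1 0 3504 12];

% Standardize each column with its own mean and standard deviation.
z = (x - mu)./sigma;

% A mean of 0 and a standard deviation of 1 leave the indicator
% columns unchanged.
unchanged = isequal(z(4:5),x(4:5));
```

    Because (x - 0)./1 equals x, the value of unchanged is true, which matches the statement that the expanded categorical variables are not altered by standardization.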

    Version History

    Introduced in R2024b