setmodel
Set model predictors and coefficients
Description
sc = setmodel(sc,ModelPredictors,ModelCoefficients) sets the predictors and coefficients of a linear logistic regression model fitted outside the creditscorecard object and returns an updated creditscorecard object. The predictors and coefficients are used to compute scorecard points. Use setmodel instead of fitmodel, which fits a linear logistic regression model, when you need more flexibility than fitmodel offers. For example, when a model fitted with fitmodel needs to be modified, you can use setmodel to set the modified model back into the creditscorecard object. For more information, see Workflows for Using setmodel.
Note
When using setmodel, the following assumptions apply:
The model coefficients correspond to a linear logistic regression model (where only linear terms are included in the model and there are no interactions or any other higher-order terms).
The model was previously fitted using Weight of Evidence (WOE) data with the response mapped so that ‘Good’ is 1 and ‘Bad’ is 0.
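Under these assumptions, the model that setmodel expects can be written as below. This is a sketch in notation introduced here for illustration only: b_0 is the intercept, b_j is the coefficient of predictor j, WOE_j(x_j) is the WOE value of the bin that the value x_j falls into, N is the number of predictors, and p is the probability of ‘Good’. None of these symbols appear elsewhere on this page.

```latex
\[
\operatorname{logit}(p) \;=\; \ln\frac{p}{1-p}
\;=\; b_0 + \sum_{j=1}^{N} b_j \,\mathrm{WOE}_j(x_j),
\qquad p = P(\text{Good})
\]
```

The right-hand side contains one linear term per predictor and no products of predictors, which is what "no interactions or any other higher-order terms" means in practice.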
Examples
This example shows how to use setmodel to make modifications to a logistic regression model initially fitted using the fitmodel function, and then set the new logistic regression model predictors and coefficients back into the creditscorecard object.
Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011).
load CreditCardData
sc = creditscorecard(data,'IDVar','CustID')
sc =
creditscorecard with properties:
GoodLabel: 0
ResponseVar: 'status'
WeightsVar: ''
VarNames: {'CustID' 'CustAge' 'TmAtAddress' 'ResStatus' 'EmpStatus' 'CustIncome' 'TmWBank' 'OtherCC' 'AMBalance' 'UtilRate' 'status'}
NumericPredictors: {'CustAge' 'TmAtAddress' 'CustIncome' 'TmWBank' 'AMBalance' 'UtilRate'}
CategoricalPredictors: {'ResStatus' 'EmpStatus' 'OtherCC'}
BinMissingData: 0
IDVar: 'CustID'
PredictorVars: {'CustAge' 'TmAtAddress' 'ResStatus' 'EmpStatus' 'CustIncome' 'TmWBank' 'OtherCC' 'AMBalance' 'UtilRate'}
Data: [1200×11 table]
Perform automatic binning.
sc = autobinning(sc);
The standard workflow is to use the fitmodel function to fit a logistic regression model using a stepwise method. However, fitmodel only supports limited options regarding the stepwise procedure. You can use the optional mdl output argument from fitmodel to get a copy of the fitted GeneralizedLinearModel object, to later modify.
[sc,mdl] = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078
7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769
Generalized linear regression model:
logit(status) ~ 1 + CustAge + ResStatus + EmpStatus + CustIncome + TmWBank + OtherCC + AMBalance
Distribution = Binomial
Estimated Coefficients:
Estimate SE tStat pValue
________ ________ ______ __________
(Intercept) 0.70239 0.064001 10.975 5.0538e-28
CustAge 0.60833 0.24932 2.44 0.014687
ResStatus 1.377 0.65272 2.1097 0.034888
EmpStatus 0.88565 0.293 3.0227 0.0025055
CustIncome 0.70164 0.21844 3.2121 0.0013179
TmWBank 1.1074 0.23271 4.7589 1.9464e-06
OtherCC 1.0883 0.52912 2.0569 0.039696
AMBalance 1.045 0.32214 3.2439 0.0011792
1200 observations, 1192 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16
Suppose you want to include, or "force," the predictor 'UtilRate' in the logistic regression model, even though the stepwise method did not include it in the fitted model. You can add 'UtilRate' to the logistic regression model using the GeneralizedLinearModel object mdl directly.
mdl = mdl.addTerms('UtilRate')
mdl =
Generalized linear regression model:
logit(status) ~ 1 + CustAge + ResStatus + EmpStatus + CustIncome + TmWBank + OtherCC + AMBalance + UtilRate
Distribution = Binomial
Estimated Coefficients:
Estimate SE tStat pValue
________ ________ ________ __________
(Intercept) 0.70239 0.064001 10.975 5.0538e-28
CustAge 0.60843 0.24936 2.44 0.014687
ResStatus 1.3773 0.6529 2.1096 0.034896
EmpStatus 0.88556 0.29303 3.0221 0.0025103
CustIncome 0.70146 0.2186 3.2089 0.0013324
TmWBank 1.1071 0.23307 4.7503 2.0316e-06
OtherCC 1.0882 0.52918 2.0563 0.03975
AMBalance 1.0413 0.36557 2.8483 0.004395
UtilRate 0.013157 0.60864 0.021618 0.98275
1200 observations, 1191 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 89.7, p-value = 5.26e-16
Use setmodel to update the model predictors and model coefficients in the creditscorecard object. The ModelPredictors input argument does not explicitly include a string for the intercept. However, the ModelCoefficients input argument does have the intercept information as its first element.
ModelPredictors = mdl.PredictorNames
ModelPredictors = 8×1 cell
{'CustAge' }
{'ResStatus' }
{'EmpStatus' }
{'CustIncome'}
{'TmWBank' }
{'OtherCC' }
{'AMBalance' }
{'UtilRate' }
ModelCoefficients = mdl.Coefficients.Estimate
ModelCoefficients = 9×1
0.7024
0.6084
1.3773
0.8856
0.7015
1.1071
1.0882
1.0413
0.0132
sc = setmodel(sc,ModelPredictors,ModelCoefficients);
Verify that 'UtilRate' is part of the scorecard predictors by displaying the scorecard points.
pi = displaypoints(sc)
pi=41×3 table
Predictors Bin Points
______________ ________________ _________
{'CustAge' } {'[-Inf,33)' } -0.17152
{'CustAge' } {'[33,37)' } -0.15295
{'CustAge' } {'[37,40)' } -0.072892
{'CustAge' } {'[40,46)' } 0.033856
{'CustAge' } {'[46,48)' } 0.20193
{'CustAge' } {'[48,58)' } 0.21787
{'CustAge' } {'[58,Inf]' } 0.46652
{'CustAge' } {'<missing>' } NaN
{'ResStatus' } {'Tenant' } -0.043826
{'ResStatus' } {'Home Owner' } 0.11442
{'ResStatus' } {'Other' } 0.36394
{'ResStatus' } {'<missing>' } NaN
{'EmpStatus' } {'Unknown' } -0.088843
{'EmpStatus' } {'Employed' } 0.30193
{'EmpStatus' } {'<missing>' } NaN
{'CustIncome'} {'[-Inf,29000)'} -0.46956
⋮
This example shows how to fit a logistic regression model directly, without using the fitmodel function, and then use setmodel to set the new model predictors and coefficients back into the creditscorecard object. This approach gives more flexibility regarding options to control the stepwise procedure. This example fits a logistic regression model with a nondefault value for the 'PEnter' parameter, the criterion to admit a new predictor in the logistic regression model during the stepwise procedure.
Create a creditscorecard object using the CreditCardData.mat file to load the data (using a dataset from Refaat 2011). Use the 'IDVar' argument to indicate that 'CustID' contains ID information and should not be included as a predictor variable.
load CreditCardData
sc = creditscorecard(data,'IDVar','CustID')
sc =
creditscorecard with properties:
GoodLabel: 0
ResponseVar: 'status'
WeightsVar: ''
VarNames: {'CustID' 'CustAge' 'TmAtAddress' 'ResStatus' 'EmpStatus' 'CustIncome' 'TmWBank' 'OtherCC' 'AMBalance' 'UtilRate' 'status'}
NumericPredictors: {'CustAge' 'TmAtAddress' 'CustIncome' 'TmWBank' 'AMBalance' 'UtilRate'}
CategoricalPredictors: {'ResStatus' 'EmpStatus' 'OtherCC'}
BinMissingData: 0
IDVar: 'CustID'
PredictorVars: {'CustAge' 'TmAtAddress' 'ResStatus' 'EmpStatus' 'CustIncome' 'TmWBank' 'OtherCC' 'AMBalance' 'UtilRate'}
Data: [1200×11 table]
Perform automatic binning.
sc = autobinning(sc);
The logistic regression model must be fitted with Weight of Evidence (WOE) data. The WOE transformation is a special case of binning: the data is first binned, and then the binned information is mapped to the corresponding WOE values. This transformation is done using the bindata function. bindata has an argument that prepares the data for the model fitting step. Setting the bindata name-value pair argument 'OutputType' to 'WOEModelInput' has the following effects:
All predictors are converted to WOE values.
The output contains only predictors and response (no 'IDVar' or any unused variables).
Predictors with infinite or undefined (NaN) WOE values are discarded.
The response values are mapped so that "Good" is 1 and "Bad" is 0 (this implies that higher unscaled scores correspond to better, less risky customers).
bd = bindata(sc,'OutputType','WOEModelInput');
For example, the first ten rows in the original data for the variables 'CustAge', 'ResStatus', 'CustIncome', and 'status' (response variable) look like this:
data(1:10,{'CustAge' 'ResStatus' 'CustIncome' 'status'})
ans=10×4 table
CustAge ResStatus CustIncome status
_______ __________ __________ ______
53 Tenant 50000 0
61 Home Owner 52000 0
47 Tenant 37000 0
50 Home Owner 53000 0
68 Home Owner 53000 0
65 Home Owner 48000 0
34 Home Owner 32000 1
50 Other 51000 0
50 Tenant 52000 1
49 Home Owner 53000 1
Here is how the same ten rows look after calling bindata with the name-value pair argument 'OutputType' set to 'WOEModelInput':
bd(1:10,{'CustAge' 'ResStatus' 'CustIncome' 'status'})
ans=10×4 table
CustAge ResStatus CustIncome status
________ _________ __________ ______
0.21378 -0.095564 0.47972 1
0.62245 0.019329 0.47972 1
0.18758 -0.095564 -0.026696 1
0.21378 0.019329 0.47972 1
0.62245 0.019329 0.47972 1
0.62245 0.019329 0.47972 1
-0.39568 0.019329 -0.29217 0
0.21378 0.20049 0.47972 1
0.21378 -0.095564 0.47972 0
0.21378 0.019329 0.47972 0
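The WOE values in the table above are derived from the counts of good and bad observations in each bin. The sketch below uses a common WOE definition, the log of the ratio between a bin's share of all goods and its share of all bads. That definition is not stated on this page, so treat it as an assumption about what bindata computes. The counts are hypothetical and are not taken from CreditCardData.

```matlab
% Hypothetical bin counts for one predictor (not taken from CreditCardData).
good = [50 100 150];   % number of "Good" observations in each bin
bad  = [50  50  50];   % number of "Bad" observations in each bin

% Share of all goods and of all bads that falls into each bin.
distGood = good / sum(good);
distBad  = bad  / sum(bad);

% Assumed WOE definition: log of the ratio of the two shares.
% Positive values mean the bin holds relatively more goods than bads.
woe = log(distGood ./ distBad);
disp(woe)
```

With these counts the first bin gets a negative WOE, the second bin gets zero (its share of goods equals its share of bads), and the third bin gets a positive WOE.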
Fit a linear logistic regression model using a stepwise method with the Statistics and Machine Learning Toolbox™ function stepwiseglm, but use nondefault values for the 'PEnter' and 'PRemove' optional arguments. The predictors 'ResStatus' and 'OtherCC' would normally be included in the model using the default options for the stepwise procedure. With 'PEnter' set to 0.025 they are not admitted, because their p-values for entry (about 0.042 and 0.041 in the fitmodel output of the previous example) exceed this threshold.
mdl = stepwiseglm(bd,'constant','Distribution','binomial',...
    'Upper','linear','PEnter',0.025,'PRemove',0.05)
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08
2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06
3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601
4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257
5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306
mdl =
Generalized linear regression model:
logit(status) ~ 1 + CustAge + EmpStatus + CustIncome + TmWBank + AMBalance
Distribution = Binomial
Estimated Coefficients:
Estimate SE tStat pValue
________ ________ ______ __________
(Intercept) 0.70263 0.063759 11.02 3.0544e-28
CustAge 0.57265 0.2482 2.3072 0.021043
EmpStatus 0.88356 0.29193 3.0266 0.002473
CustIncome 0.70399 0.21781 3.2321 0.001229
TmWBank 1.1 0.23185 4.7443 2.0924e-06
AMBalance 1.0313 0.32007 3.2221 0.0012724
1200 observations, 1194 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 81.4, p-value = 4.18e-16
Use setmodel to update the model predictors and model coefficients in the creditscorecard object. The ModelPredictors input argument does not explicitly include a string for the intercept. However, the ModelCoefficients input argument does have the intercept information as its first element.
ModelPredictors = mdl.PredictorNames
ModelPredictors = 5×1 cell
{'CustAge' }
{'EmpStatus' }
{'CustIncome'}
{'TmWBank' }
{'AMBalance' }
ModelCoefficients = mdl.Coefficients.Estimate
ModelCoefficients = 6×1
0.7026
0.5726
0.8836
0.7040
1.1000
1.0313
sc = setmodel(sc,ModelPredictors,ModelCoefficients);
Verify that the desired model predictors are part of the scorecard predictors by displaying the scorecard points.
pi = displaypoints(sc)
pi=30×3 table
Predictors Bin Points
______________ _________________ _________
{'CustAge' } {'[-Inf,33)' } -0.10354
{'CustAge' } {'[33,37)' } -0.086059
{'CustAge' } {'[37,40)' } -0.010713
{'CustAge' } {'[40,46)' } 0.089757
{'CustAge' } {'[46,48)' } 0.24794
{'CustAge' } {'[48,58)' } 0.26294
{'CustAge' } {'[58,Inf]' } 0.49697
{'CustAge' } {'<missing>' } NaN
{'EmpStatus' } {'Unknown' } -0.035716
{'EmpStatus' } {'Employed' } 0.35417
{'EmpStatus' } {'<missing>' } NaN
{'CustIncome'} {'[-Inf,29000)' } -0.41884
{'CustIncome'} {'[29000,33000)'} -0.065161
{'CustIncome'} {'[33000,35000)'} 0.092353
{'CustIncome'} {'[35000,40000)'} 0.12173
{'CustIncome'} {'[40000,42000)'} 0.13259
⋮
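The displayed points appear consistent with an unscaled rule of the form intercept/p + coefficient × WOE, where p is the number of model predictors. That rule is not stated on this page and is an assumption here. The sketch below only checks it numerically against values shown in this example: the fitted estimates for (Intercept) and CustAge, and the CustAge WOE values that appear in the bindata output for ages 47, 53, and 61. The variable names are illustrative.

```matlab
% Estimates copied from the stepwiseglm output in this example.
b0   = 0.70263;   % (Intercept)
bAge = 0.57265;   % CustAge
p    = 5;         % number of predictors in the fitted model

% CustAge WOE values shown in the bindata output for the bins
% [46,48), [48,58), and [58,Inf].
woeAge = [0.18758 0.21378 0.62245];

% Assumed unscaled points rule: the intercept is spread evenly across
% the p predictors, and the coefficient scales the bin's WOE value.
pointsAge = b0/p + bAge*woeAge;
disp(pointsAge)
```

The results agree with the displayed CustAge points for those three bins (0.24794, 0.26294, and 0.49697) to within the rounding of the printed inputs.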
Input Arguments
sc
Credit scorecard model, specified as a creditscorecard object. Use creditscorecard to create a creditscorecard object.
ModelPredictors
Predictor names included in the fitted model, specified as a cell array of character vectors as {'PredictorName1','PredictorName2',...}. The predictor names must match predictor variable names in the creditscorecard object.
Note
Do not include a character vector for the constant term in ModelPredictors. setmodel internally handles the '(Intercept)' term based on the number of model coefficients (see ModelCoefficients).
Data Types: cell
ModelCoefficients
Model coefficients corresponding to the model predictors, specified as a numeric array of model coefficients, [coeff1,coeff2,...]. If N is the number of predictor names provided in ModelPredictors, the size of ModelCoefficients can be N or N+1. If ModelCoefficients has N+1 elements, then the first coefficient is used as the '(Intercept)' of the fitted model. Otherwise, the '(Intercept)' is set to 0.
Data Types: double
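The N versus N+1 rule for ModelCoefficients can be sketched in plain MATLAB as follows. This only illustrates the documented rule and is not the internal implementation of setmodel. The names intercept and slopes are hypothetical, and the values are copied from the second example on this page.

```matlab
% Values copied from the second example on this page.
ModelPredictors   = {'CustAge';'EmpStatus';'CustIncome';'TmWBank';'AMBalance'};
ModelCoefficients = [0.7026; 0.5726; 0.8836; 0.7040; 1.1000; 1.0313];

N = numel(ModelPredictors);          % number of predictor names
M = numel(ModelCoefficients);        % number of coefficients supplied

if M == N + 1
    % First element is the intercept, the rest map to the predictors.
    intercept = ModelCoefficients(1);
    slopes    = ModelCoefficients(2:end);
elseif M == N
    % No intercept supplied, so it is taken to be 0.
    intercept = 0;
    slopes    = ModelCoefficients(:);
else
    error('Expected %d or %d coefficients, got %d.', N, N + 1, M);
end
```

A check like this before calling setmodel can catch a predictor list and coefficient vector that have drifted out of sync, for example after a term was added to the model but not to the list of names.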
Output Arguments
sc
Credit scorecard model, returned as an updated creditscorecard object. The creditscorecard object contains information about the model predictors and coefficients of the fitted model. For more information on using the creditscorecard object, see creditscorecard.
More About
Workflows for Using setmodel
When using setmodel, there are two possible workflows to set the final model predictors and model coefficients into a creditscorecard object.
The first workflow is:
Use fitmodel to get the optional output argument mdl. This is a GeneralizedLinearModel object, and you can add and remove terms, or modify the parameters of the stepwise procedure. Only linear terms can be in the model (no interactions or any other higher-order terms).
Once the GeneralizedLinearModel object is satisfactory, set the final model predictors and model coefficients into the creditscorecard object using the setmodel input arguments for ModelPredictors and ModelCoefficients.
An alternate workflow is:
Obtain the Weight of Evidence (WOE) data using bindata. Use the 'WOEModelInput' option for the 'OutputType' name-value pair argument in bindata to ensure that:
The predictors data is transformed to WOE.
Only predictors whose bins have finite WOE values are included.
The response variable is placed in the last column.
The response variable is mapped (“Good” is 1 and “Bad” is 0).
Use the data from the previous step to fit a linear logistic regression model (only linear terms in the model, no interactions or any other higher-order terms). See, for example, stepwiseglm.
Once the GeneralizedLinearModel object is satisfactory, set the final model predictors and model coefficients into the creditscorecard object using the setmodel input arguments for ModelPredictors and ModelCoefficients.
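The alternate workflow does not have to use a stepwise method. As a minimal sketch, the code below fits a fixed set of predictors with fitglm (listed under See Also) and passes the result to setmodel. The choice of the three predictors is arbitrary and for illustration only. The formula lists them in the same order as the columns of the WOE table so that mdl.PredictorNames and the coefficient estimates are expected to line up.

```matlab
% Build the scorecard and bin the data, as in the examples on this page.
load CreditCardData
sc = creditscorecard(data,'IDVar','CustID');
sc = autobinning(sc);

% WOE-transformed predictors with the response mapped to Good = 1, Bad = 0.
bd = bindata(sc,'OutputType','WOEModelInput');

% Fit a logistic regression with linear terms only, on a hand-picked
% (illustrative) subset of predictors.
mdl = fitglm(bd,'status ~ CustAge + CustIncome + TmWBank', ...
    'Distribution','binomial');

% Intercept first, then one coefficient per predictor.
ModelPredictors   = mdl.PredictorNames;
ModelCoefficients = mdl.Coefficients.Estimate;
sc = setmodel(sc,ModelPredictors,ModelCoefficients);
```

Before relying on the result, it is worth comparing mdl.CoefficientNames against ModelPredictors to confirm that each coefficient is paired with the intended predictor.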
References
[1] Anderson, R. The Credit Scoring Toolkit. Oxford University Press, 2007.
[2] Refaat, M. Credit Risk Scorecards: Development and Implementation Using SAS. lulu.com, 2011.
Version History
Introduced in R2014b
See Also
creditscorecard | autobinning | bininfo | predictorinfo | modifypredictor | plotbins | modifybins | bindata | displaypoints | formatpoints | score | stepwiseglm | fitglm | fitmodel | probdefault | validatemodel | GeneralizedLinearModel