Linear mixed effects model standardization with Z score not giving consistent results

Hi, I am running a mixed-effects model with interaction terms, and when I run the model on the raw data, I essentially get what I would expect:
Linear mixed-effects model fit by ML
Model information:
Number of observations 44
Fixed effects coefficients 5
Random effects coefficients 22
Covariance parameters 2
Formula:
outcome ~ 1 + intervals + exposure*sessions + (1 | subject)
Model fit statistics:
AIC BIC LogLikelihood Deviance
264 276.49 -125 250
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower Upper
'(Intercept)' 26.899 2.6919 9.9926 39 2.6128e-12 21.454 32.344
'exposure' -1.6766 1.0159 -1.6504 39 0.10689 -3.7313 0.3782
'intervals' -0.19934 0.033912 -5.8783 39 7.6441e-07 -0.26794 -0.13075
'sessions' -0.73017 0.44687 -1.634 39 0.11031 -1.6341 0.17371
'exposure:sessions' 0.4013 0.14279 2.8105 39 0.0076969 0.11249 0.69012
Random effects covariance parameters (95% CIs):
Group: subject (22 Levels)
Name1 Name2 Type Estimate Lower Upper
'(Intercept)' '(Intercept)' 'std' 3.1621 1.9751 5.0624
Group: Error
Name Estimate Lower Upper
'Res Std' 3.1433 2.3339 4.2334
But when I standardize the predictors using the MATLAB zscore function, I get something totally different, whether or not I also standardize the dependent variable:
Linear mixed-effects model fit by ML
Model information:
Number of observations 44
Fixed effects coefficients 5
Random effects coefficients 22
Covariance parameters 2
Formula:
drinking ~ 1 + intervals + exposure*sessions + (1 | subject)
Model fit statistics:
AIC BIC LogLikelihood Deviance
127.45 139.94 -56.727 113.45
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower Upper
'(Intercept)' 0.0078404 0.17882 0.043845 39 0.96525 -0.35386 0.36954
'exposure' -0.020592 0.14613 -0.14091 39 0.88867 -0.31618 0.27499
'intervals' -0.1539 0.21666 -0.71034 39 0.48172 -0.59213 0.28433
'sessions' 0.3547 0.21217 1.6717 39 0.10258 -0.07446 0.78385
'exposure:sessions' -0.099309 0.24164 -0.41098 39 0.68334 -0.58807 0.38945
Random effects covariance parameters (95% CIs):
Group: subject (22 Levels)
Name1 Name2 Type Estimate Lower Upper
'(Intercept)' '(Intercept)' 'std' 0.6939 0.43148 1.1159
Group: Error
Name Estimate Lower Upper
'Res Std' 0.65418 0.4809 0.8899
When I look at plots of the individual predictors before and after the z transformation, everything looks similar; only the values are (expectedly) different. I would not expect such a drastic difference between the LME models. Grateful for any assistance! Thank you!
  3 Comments
bsriv on 28 Sep 2022
Hi, I've attached CSVs with both the unstandardized and standardized ("z") data.
The code I used was simply
lme = fitlme(tbl_roc,'drinking ~ alc_cue+cbt+intervals+alc_cue*cbt+(1|subject)')
the cyclist on 28 Sep 2022 (edited)
If anyone else happens to investigate, be aware that the attached files are MAT, not CSV.


Accepted Answer

the cyclist on 28 Sep 2022 (edited)
I'm not sure why you expect the model to be unchanged (or what specific aspects of the model you expect to be unchanged).
Even with a simple linear model, it is obvious to me that the fitted parameters will be different under a zscore transform of the predictors. For example,
rng default
x1 = [-2; -1; 0; 1; 2] + 10;
x2 = zscore(x1); % new predictor is zscore of original predictor
y = 2*x1 + randn(5,1);
tbl = table(x1,x2,y);
mdl1 = fitlm(tbl,"y ~ x1") % fit original predictor
mdl1 =
Linear regression model:
    y ~ 1 + x1

Estimated Coefficients:
                   Estimate      SE       tStat      pValue
                   ________    _______    _______    ________
    (Intercept)     1.6682      5.552     0.30047     0.78343
    x1               1.859     0.54973     3.3817    0.043036

Number of observations: 5, Error degrees of freedom: 3
Root Mean Squared Error: 1.74
R-squared: 0.792, Adjusted R-Squared: 0.723
F-statistic vs. constant model: 11.4, p-value = 0.043
mdl2 = fitlm(tbl,"y ~ x2") % fit new predictor
mdl2 =
Linear regression model:
    y ~ 1 + x2

Estimated Coefficients:
                   Estimate      SE       tStat      pValue
                   ________    _______    ______    __________
    (Intercept)     20.259     0.77744    26.058    0.00012398
    x2              2.9394      0.8692    3.3817      0.043036

Number of observations: 5, Error degrees of freedom: 3
Root Mean Squared Error: 1.74
R-squared: 0.792, Adjusted R-Squared: 0.723
F-statistic vs. constant model: 11.4, p-value = 0.043
In this simpler case, it is pretty easy to understand why certain things (e.g. the beta for x1 vs. x2) are different. It may or may not be obvious why the R-squared is unchanged.
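To make the relationship explicit, here is a minimal check using the mdl1 and mdl2 objects fitted above:
% mdl1 and mdl2 describe the same fitted line in different units: the slope
% for the standardized predictor is the raw slope scaled by std(x1), and the
% new intercept is the raw fit evaluated at mean(x1).
b = mdl1.Coefficients.Estimate;      % [intercept; slope] of the raw fit
slope_z     = b(2)*std(x1)           % ~2.9394, matches the x2 coefficient in mdl2
intercept_z = b(1) + b(2)*mean(x1)   % ~20.259, matches the intercept in mdl2
The slope's t-statistic and p-value, and the R-squared, are unchanged because z-scoring a single predictor is just an affine (shift-and-scale) change of its units.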
In your more complicated model, especially because of the interactions, it is trickier to say what should be expected to remain constant across the two models.
What did you expect to remain constant after this transform? Presumably not the beta coefficients.
  4 Comments
the cyclist on 28 Sep 2022
load("tbl_roc")
% Make a copy of the table with the predictors and the response z-scored
tbl_roc_z = tbl_roc;
tbl_roc_z.alc_cue   = zscore(tbl_roc.alc_cue);
tbl_roc_z.intervals = zscore(tbl_roc.intervals);
tbl_roc_z.cbt       = zscore(tbl_roc.cbt);
tbl_roc_z.drinking  = zscore(tbl_roc.drinking);
% Fit the same model to the raw data first
lme = fitlme(tbl_roc,'drinking ~ alc_cue+cbt+intervals+alc_cue*cbt+(1|subject)')
lme =
Linear mixed-effects model fit by ML

Model information:
    Number of observations            44
    Fixed effects coefficients         5
    Random effects coefficients       22
    Covariance parameters              2

Formula:
    drinking ~ 1 + intervals + alc_cue*cbt + (1 | subject)

Model fit statistics:
    AIC    BIC       LogLikelihood    Deviance
    264    276.49    -125             250

Fixed effects coefficients (95% CIs):
    Name                   Estimate    SE          tStat      DF    pValue        Lower       Upper
    {'(Intercept)'}          26.899      2.6919     9.9926    39    2.6128e-12      21.454      32.344
    {'alc_cue'    }         -1.6766      1.0159    -1.6504    39       0.10689     -3.7313      0.3782
    {'intervals'  }        -0.19934    0.033912    -5.8783    39    7.6441e-07    -0.26794    -0.13075
    {'cbt'        }        -0.73017     0.44687     -1.634    39       0.11031     -1.6341     0.17371
    {'alc_cue:cbt'}           0.4013     0.14279     2.8105    39     0.0076969     0.11249     0.69012

Random effects covariance parameters (95% CIs):
    Group: subject (22 Levels)
        Name1                  Name2                  Type       Estimate    Lower     Upper
        {'(Intercept)'}        {'(Intercept)'}        {'std'}    3.1621      1.9751    5.0624
    Group: Error
        Name               Estimate    Lower     Upper
        {'Res Std'}        3.1433      2.3339    4.2334
% ... and then to the standardized data
lme_z = fitlme(tbl_roc_z,'drinking ~ alc_cue+cbt+intervals+alc_cue*cbt+(1|subject)')
lme_z =
Linear mixed-effects model fit by ML

Model information:
    Number of observations            44
    Fixed effects coefficients         5
    Random effects coefficients       22
    Covariance parameters              2

Formula:
    drinking ~ 1 + intervals + alc_cue*cbt + (1 | subject)

Model fit statistics:
    AIC       BIC      LogLikelihood    Deviance
    55.721    68.21    -20.86           41.721

Fixed effects coefficients (95% CIs):
    Name                   Estimate    SE          tStat      DF    pValue        Lower        Upper
    {'(Intercept)'}        0.077311    0.082032    0.94245    39       0.35176    -0.088614     0.24324
    {'alc_cue'    }        0.027453    0.074245    0.36976    39       0.71356     -0.12272     0.17763
    {'intervals'  }        -0.92989     0.15819    -5.8783    39    7.6441e-07      -1.2499    -0.60992
    {'cbt'        }        0.045009      0.1567    0.28723    39       0.77546     -0.27194     0.36196
    {'alc_cue:cbt'}         0.17342    0.061704     2.8105    39     0.0076969     0.048611     0.29823

Random effects covariance parameters (95% CIs):
    Group: subject (22 Levels)
        Name1                  Name2                  Type       Estimate    Lower      Upper
        {'(Intercept)'}        {'(Intercept)'}        {'std'}    0.29655     0.18523    0.47478
    Group: Error
        Name               Estimate    Lower      Upper
        {'Res Std'}        0.29479     0.21889    0.39703
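As a rough check of what does carry over, a sketch using the lme and lme_z objects fitted above:
% Put the fixed-effects t-statistics of the two fits side by side
[lme.Coefficients.tStat, lme_z.Coefficients.tStat]
% 'intervals' and 'alc_cue:cbt' keep the same tStat and pValue after
% standardization, while the main effects of alc_cue and cbt do not:
% z-scoring recenters the predictors, and with an interaction in the model
% each main effect is evaluated where the other interacting predictor equals
% zero, which has moved from 0 to its mean.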


More Answers (0)

Release: R2019a