Where is the coefficient for the reference condition when using 'fitlm' to perform ANOVA with intercept omitted?

I want to compare the means for a variable among five different groups. To do so, I use fitlm to perform an ANOVA. The following code provides an example that closely resembles my own data:
% simulate data
x = repmat((1:5)',[10,1]);
y = (x-4).^2 + randn(size(x));
grp = categorical(x);
% fit model
fitlm(grp,y)
This yields an intercept, which represents the mean for grp 1, and coefficients for the other groups that represent the difference of the mean for those groups relative to the intercept.
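As a quick check of that reading, here is a minimal sketch that compares the fitted coefficients with the raw group means. It assumes the same simulated data as above; the fixed seed and the variable names (`mdl`, `groupMeans`, `fromModel`, `maxDiff`) are only illustrative.

```matlab
% Sketch: check the reference-level coding against the raw group means.
rng(0);                                % fixed seed, only for reproducibility
x   = repmat((1:5)', [10, 1]);
y   = (x - 4).^2 + randn(size(x));
grp = categorical(x);

mdl = fitlm(grp, y);                   % same call as in the question
b   = mdl.Coefficients.Estimate;       % [intercept; offsets for groups 2..5]

groupMeans = splitapply(@mean, y, x);  % x holds the group indices 1..5
fromModel  = [b(1); b(1) + b(2:end)];  % intercept, then intercept + offset

% Largest discrepancy between the two ways of getting the group means.
maxDiff = max(abs(groupMeans - fromModel));
disp(maxDiff)
```

If the interpretation is right, `maxDiff` should be at the level of floating-point rounding.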
I would also like to know how to calculate the standard error (SE) of each group mean from the SE of the intercept and the SEs of the coefficients. Since I was unsure how to do that, I tried omitting the intercept, expecting one coefficient per group:
fitlm(grp,y,'Intercept',false)
Here there is indeed no intercept, but also no coefficient for group 1. Where did it go?
As a workaround, I tried dummy coding the group variable:
grp_dummy = dummyvar(grp);
fitlm(grp_dummy,y,'Intercept',false)
This gives me the expected result, but every SE matches the SE of the intercept in the first model. How, then, do the SEs of the other coefficients in the first model relate to this latter model?
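One way to see how the two sets of SEs relate is through standard least-squares theory: both fits share the same pooled mean squared error (MSE), the SE of a group mean is sqrt(MSE/n_k), and the SE of a difference from group 1 is sqrt(MSE*(1/n_1 + 1/n_k)). With equal group sizes the second is sqrt(2) times the first. The sketch below checks this numerically on the simulated data; the seed and the variable names are assumptions made for illustration.

```matlab
% Sketch: relate the SEs of the reference-coded fit to the dummy-coded fit.
rng(0);                                % fixed seed, only for reproducibility
x   = repmat((1:5)', [10, 1]);
y   = (x - 4).^2 + randn(size(x));
grp = categorical(x);

mdlRef = fitlm(grp, y);                               % intercept + 4 offsets
mdlDum = fitlm(dummyvar(grp), y, 'Intercept', false); % one column per group

n   = splitapply(@numel, y, x);        % observations per group
mse = mdlRef.MSE;                      % pooled within-group variance

seMean = sqrt(mse ./ n);                          % SE of each group mean
seDiff = sqrt(mse .* (1/n(1) + 1 ./ n(2:end)));   % SE of (group k - group 1)

% Compare the hand-computed SEs with the ones reported by fitlm.
errMean = max(abs(mdlDum.Coefficients.SE - seMean));
errDiff = max(abs(mdlRef.Coefficients.SE(2:end) - seDiff));
disp([errMean, errDiff])
```

Both discrepancies should be at rounding level, which is why the dummy-coded SEs all equal the intercept SE of the first model when the groups are the same size.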

Accepted Answer

Jeff Miller on 27 Jun 2020
Why not just use this?
for iGrp=1:5
stderr = std(y(x==iGrp)) / sqrt( sum(x==iGrp) )
end
  3 Comments
Jeff Miller on 27 Jun 2020
If the intercept is omitted, the linear model being fit constrains the true mean of group 1 to be zero. So, any deviation of the group 1 observed mean from zero contributes to error. Look at the effect of adding this to the end of your code:
grp1 = grp==categorical(1);
y(grp1) = y(grp1) + 1000;
fitlm(grp,y,'Intercept',false)
You get the same coefficient estimates as before, but the SEs of those estimates go way up. That's because there is a lot of error in those group 1 scores, relative to the predicted values of zero. Now try
y(grp1) = y(grp1) - mean(y(grp1));
fitlm(grp,y,'Intercept',false)
The SEs are now much smaller--even smaller than they were originally--because 0 is now a very accurate estimate of the mean of these revised group 1 scores.
So, I think the answer is that any deviation of the group 1 scores from 0 goes into error.
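The size of that extra error can be sketched directly. Assuming `fitlm` behaves as described in this thread (no intercept and no coefficient for group 1), fitting group 1 at zero rather than at its own mean should add n_1 times the squared group 1 mean to the error sum of squares, while freeing one error degree of freedom. The seed and variable names below are illustrative assumptions.

```matlab
% Sketch: how much error the missing group 1 coefficient adds.
rng(0);                                % fixed seed, only for reproducibility
x   = repmat((1:5)', [10, 1]);
y   = (x - 4).^2 + randn(size(x));
grp = categorical(x);

mdlNoInt = fitlm(grp, y, 'Intercept', false);  % group 1 is predicted as 0
mdlFull  = fitlm(grp, y);                      % intercept absorbs group 1

isGrp1 = (x == 1);
n1     = sum(isGrp1);

% Predicting 0 instead of the group 1 mean adds n1 * mean^2 to the SSE.
extraSSE = n1 * mean(y(isGrp1))^2;
gap      = abs(mdlNoInt.SSE - (mdlFull.SSE + extraSSE));

% One fewer coefficient is estimated, so the error df goes up by one.
dfGain = mdlNoInt.DFE - mdlFull.DFE;
disp([gap, dfGain])
```

A larger SSE spread over almost the same degrees of freedom gives a larger MSE, and every coefficient SE scales with sqrt(MSE), which matches the behaviour described above.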

Release

R2019b
