ANOVA for linear mixed-effects models produces misleading main effects
Show older comments
Hello,
I'm using fitlme for modeling my data (followed by anova). I always assumed that the stats for the main effect for a specific fixed effect reflect the significance of the difference between means, but now I'm not so sure.
The script belwo generates simulated data for model with two fixed effects - one continous and one categorical, with two levels. The data was generated to produce a significant interaction, but no main effects (i.e., the means are identical for both groups). However, the model yields a significant main effect of group, simply because the intersects are different (although the means are the same). Shifting the data on the x-axis produced markedly different results.
I'm hoping to still find a way to use these functions to measure both main and interaction effects, but I'm not sure what I'm doing wrong and how to fix it.
Any guidance will be much appreciated.
x=0:100;
y_group1=50+randn(1,101)*10;
y_group2=y_group1+x-50;
figure;plot(x,y_group1,x,y_group2);

mean(y_group1)==mean(y_group1)
% ans =
%
% logical
%
% 1
tab=table([x';x'],[y_group1';y_group2'],[ones(101,1);2*ones(101,1)],'variablenames',{'x','y','group'});
tab.group=categorical(tab.group);
model=fitlme(tab,'y~x*group');
anova(model)
% ANOVA marginal tests: DFMethod = 'Residual'
%
% Term FStat DF1 DF2 pValue
% {'(Intercept)'} 755.21 1 198 1.7114e-69
% {'x' } 0.16396 1 198 0.68597
% {'group' } 388.26 1 198 1.4799e-48
% {'x:group' } 520.27 1 198 2.6278e-57
% Strong effect of group, even though means are equal
tab.x=tab.x-50;
figure;plot(x-50,y_group1,x-50,y_group2);

model=fitlme(tab,'y~x*group');
anova(model)
% ANOVA marginal tests: DFMethod = 'Residual'
%
% Term FStat DF1 DF2 pValue
% {'(Intercept)'} 3052.7 1 198 2.8264e-122
% {'x' } 0.16396 1 198 0.68597
% {'group' } 1.3039e-28 1 198 1
% {'x:group' } 520.27 1 198 2.6278e-57
% Essentially the same data, but the intercept is eliminated. The group effect is gone
Answers (3)
the cyclist
on 13 Feb 2024
Edited: the cyclist
on 13 Feb 2024
I think it is easier to interpret the estimated model coefficients if you write out all terms for both groups, in the following format:
x = 0:100;
noise = 10*randn(1,101); % using same "noise" for both groups
y_group1 = 50 + 0*x + noise;
y_group2 = 0 + 1*x + noise;
What I see in your generative model is two effects:
- intercept
- group * slope
If you consider group 1 to be the reference group, then the effects are measured with respect to it. Therefore,
- The intercept of the reference group is 50
- The slope (w.r.t x) of the reference group is 0
- The intercept effect (of group 2 relative to group 1) is -50
- The group*slope effect (of group 2 relative to group 1) is 1
I think you kinda fooled yourself by plotting these data over a range where they have the same mean. But that's not actually salient for the model. You could have plotted these data over the range 1000:2000, where they would have radically different means. But the model would be the same.
The effects you specified are well estimated by fitlme:
tab=table([x';x'],[y_group1';y_group2'],[ones(101,1);2*ones(101,1)],'variablenames',{'x','y','group'});
tab.group=categorical(tab.group);
model_lme = fitlme(tab,'y~x*group')
I hope that helps.
Jeff Miller
on 13 Feb 2024
1 vote
"I always assumed that the stats for the main effect for a specific fixed effect reflect the significance of the difference between means..."
Intuitive as it seems, that's actually not true when models include continuous as well as categorical effects. For such models, the specific fixed effect reflects the significance of the (predicted) group difference when the continuous variable is zero. And, as you can see from your first graph, the groups clearly are different when x is 0, even though they aren't different on average across all of the x values.
2 Comments
Eitan Schechtman
on 14 Feb 2024
The model has to fit the hypothesis. Leaving out the intercept term:
x = 0:100;
noise = 10*randn(1,101);
y_group1 = 50 + 0*x + noise;
y_group2 = 0 + 1*x + noise;
tab=table([x';x'],[y_group1';y_group2'],[ones(101,1);2*ones(101,1)],'variablenames',{'x','y','group'});
tab.group=categorical(tab.group);
% No-intercept model, estimating coefficients for group's impact on (1)
% overall level and (2) multiplier on x
model = fitlme(tab,'y ~ -1 + group + x:group')
anova(model)
The result is that the group has no (significant) effect on level, but has one on the interaction term of group:level.
Jeff Miller
on 14 Feb 2024
One common approach in this situation is to rescore x to shift the intercept to a value that you do care about--usually the mean of x. For example,
x2 = x - 50;
model2=fitlme(tab,'y~x2*group');
In model2, the categorical group effect tests whether the groups differ at x2=0 <=> at x=50. Given that the lines are not parallel in your original plot, it is kind of meaningless to ask whether the groups differ overall ignoring x, because the group difference in y clearly depends on which x you are asking about.
The situation is analogous to avoiding the interpretation of a main effect when there is a cross-over interaction in a purely categorical anova.
Didn't quite follow your example with tests A and B and the intervention, sorry.
1 Comment
Eitan Schechtman
on 14 Feb 2024
Categories
Find more on Repeated Measures and MANOVA in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!