
How to prepare my data for ANOVA?

Hi there,
I'm currently analyzing data from a randomized, double-blind, cross-over trial. Participants received a drug and a placebo, separated by a 7-day wash-out period. I suspect that there is an interaction effect between the drug and the timing of the drug administration. How do I have to prepare my data for a repeated measures ANOVA?
So far I've already extracted the data of interest (the mean value) and arranged it in the following way:
Drug_session1
Drug_session2
Placebo_session1
Placebo_session2
Each of these is a 15x1 double, since the 30 participants were split into two groups of 15.
In order to execute the ANOVA I have to concatenate the 4 single 15x1 doubles, but in what way?
Thanks a lot!
  5 Comments
Scott MacKenzie on 4 Aug 2022
@Steve Schulz, a repeated measures ANOVA seems appropriate, since each participant received both the real drug and the placebo. So, your experiment is a 2 x 2 mixed design with 30 participants. There were two independent variables. One was drug which was within-subjects (with levels real and placebo) and the other was group which was between-subjects (with levels A and B). The group levels represent the different order of administering the drug.
Fifteen participants were in each group, which is why group is a between-subjects factor. However, all 30 participants received both the real drug and the placebo, which is why drug is a within-subjects factor.
There was (at least) one dependent variable: cognitive score.
This should be fairly easy to set up using MATLAB's ranova function. Any chance you can post the data, so @Jeff Miller or I can code up a solution for you?
Steve Schulz on 4 Aug 2022
@Scott MacKenzie, thanks for your reply! In the following I've provided the data (cognitive score) for each group.
score_drug_ses1
___________________
Drug|Placebo
61 147
67 106
69 139
32 90
56 157
50 144
71 111
146 187
148 187
141 185
123 155
105 135
115 183
88 147
45 112
score_drug_ses2
___________________
Drug|Placebo
58 89
56 98
86 114
86 91
85 93
35 26
113 91
156 166
126 110
190 179
100 124
123 165
149 106
118 142
165 155


Accepted Answer

Scott MacKenzie on 4 Aug 2022
Edited: Scott MacKenzie on 4 Aug 2022
@Steve Schulz, thanks for posting the data. You have organized the data in a slightly peculiar way. You've got session 1 in rows 1 to 15 and session 2 in rows 16 to 30. But, to organize the data for the ANOVA, you want one row per participant, with the repeated measurements across the columns. So, the first step in my answer below is to rearrange the data. See the code and comments below.
As seen in the ANOVA table, the effect of Dose Type on Cognitive Score was statistically significant, F(1,28) = 20.575, p = .0001. The effect of Group on Cognitive Score was also statistically significant, F(1,28) = 4.368, p = .0458. This just barely meets the conventional .05 threshold for significance, so the Group effect was modest, albeit statistically significant. The Dose Type x Group interaction effect, however, was not statistically significant, F(1,28) = 0.250, ns.
% the data, as per posted comment (30x2)
M = [ 61 147
67 106
69 139
32 90
56 157
50 144
71 111
146 187
148 187
141 185
123 155
105 135
115 183
88 147
45 112
58 89
56 98
86 114
86 91
85 93
35 26
113 91
156 166
126 110
190 179
100 124
123 165
149 106
118 142
165 155];
% rearrange data to have one row per participant:
% Group A in rows 1-15, Group B in rows 16-30
% Drug results in column 1, placebo results in column 2
M = [M(1:15,1) M(16:30,2); M(16:30,1) M(1:15,2)];
% put the data into a table
T = array2table(M, 'VariableNames', {'Drug', 'Placebo'});
% add group code
T.Group = [repmat('A', 15, 1); repmat('B', 15, 1)];
% display the table
T
T = 30×3 table
    Drug    Placebo    Group
    ____    _______    _____
     61        89        A
     67        98        A
     69       114        A
     32        91        A
     56        93        A
     50        26        A
     71        91        A
    146       166        A
    148       110        A
    141       179        A
    123       124        A
    105       165        A
    115       106        A
     88       142        A
     45       155        A
     58       147        B
% set up the repeated measures model
withinDesign = table([1 2]', 'VariableNames', { 'DoseType' });
withinDesign.DoseType = categorical(withinDesign.DoseType);
rm = fitrm(T, 'Drug-Placebo ~ Group', 'WithinDesign', withinDesign);
% do the anova (suppress output for the moment)
AT = ranova(rm, 'WithinModel', 'DoseType');
% output a conventional ANOVA table
disp(anovaTable(AT, 'CognitiveScore'));
ANOVA table for CognitiveScore
==============================================================================
Effect                  df          SS             MS         F         p
------------------------------------------------------------------------------
Group                    1     9753.75000     9753.75000      4.368     0.0458
Participant             28    62527.60000     2233.12857
DoseType                 1    15714.01667    15714.01667     20.575     0.0001
Group:DoseType           1      190.81667      190.81667      0.250     0.6211
Participant(DoseType)   28    21384.66667      763.73810
==============================================================================
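As a quick sanity check on the table above, each F ratio is just the effect's mean square divided by the matching error mean square: Participant for the between-subjects effect, Participant(DoseType) for the within-subjects effects. The sketch below redoes that arithmetic from the printed values. The variable names are made up for illustration, and fcdf assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Mean squares copied from the ANOVA table above
MS_group       = 9753.75000;    % Group (between-subjects effect)
MS_participant = 2233.12857;    % Participant (between-subjects error)
MS_dose        = 15714.01667;   % DoseType (within-subjects effect)
MS_inter       = 190.81667;     % Group:DoseType (within-subjects effect)
MS_error       = 763.73810;     % Participant(DoseType) (within-subjects error)
df_effect = 1;
df_error  = 28;

% F = MS(effect) / MS(matching error term)
F_group = MS_group / MS_participant;
F_dose  = MS_dose  / MS_error;
F_inter = MS_inter / MS_error;

% Upper-tail probabilities of the F distribution give the p-values
p = fcdf([F_group F_dose F_inter], df_effect, df_error, 'upper');

fprintf('F: Group = %.3f, DoseType = %.3f, Group:DoseType = %.3f\n', ...
    F_group, F_dose, F_inter);
fprintf('p: Group = %.4f, DoseType = %.4f, Group:DoseType = %.4f\n', p);
```

The three F values should round to the 4.368, 20.575 and 0.250 shown in the table.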
% -------------------------------------------------------------------------
% Function to create a conventional ANOVA table from the overly-complicated
% and confusing ANOVA table created by the ranova function.
function [s] = anovaTable(AT, dvName)
    c = table2cell(AT);
    % remove erroneous entries in F and p columns
    for i=1:size(c,1)
        if c{i,4} == 1
            c(i,4) = {''};
        end
        if c{i,5} == .5
            c(i,5) = {''};
        end
    end
    % use conventional labels in Effect column
    effect = AT.Properties.RowNames;
    for i=1:length(effect)
        tmp = effect{i};
        tmp = erase(tmp, '(Intercept):');
        tmp = strrep(tmp, 'Error', 'Participant');
        effect(i) = {tmp};
    end
    % determine the required width of the table
    fieldWidth1 = max(cellfun('length', effect)); % width of Effect column
    fieldWidth2 = 57; % width for df, SS, MS, F, and p columns
    barDouble = repmat('=', 1, fieldWidth1 + fieldWidth2);
    barSingle = repmat('-', 1, fieldWidth1 + fieldWidth2);
    % re-organize the data
    c = c(2:end,[2 1 3 4 5]);
    c = [num2cell(repmat(fieldWidth1, size(c,1), 1)), effect(2:end), c]';
    % create the ANOVA table
    s = sprintf('ANOVA table for %s\n', dvName);
    s = [s sprintf('%s\n', barDouble)];
    s = [s sprintf('%-*s %4s %11s %14s %9s %9s\n', fieldWidth1, 'Effect', 'df', 'SS', 'MS', 'F', 'p')];
    s = [s sprintf('%s\n', barSingle)];
    s = [s sprintf('%-*s %4d %14.5f %14.5f %10.3f %10.4f\n', c{:})];
    s = [s sprintf('%s\n', barDouble)];
end
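For anyone starting from the four 15x1 vectors named in the question rather than the 30x2 matrix, here is a minimal sketch of the same arrangement. It assumes the vectors correspond to the columns of the two posted blocks (session 1, then session 2) and that row i in a drug vector and its partner placebo vector is the same participant. The pairing follows the rearrangement used in the answer's code; T2 is just an illustrative name.

```matlab
% Assumed mapping of the question's four vectors onto the posted data
Drug_session1    = [61 67 69 32 56 50 71 146 148 141 123 105 115 88 45]';
Placebo_session1 = [147 106 139 90 157 144 111 187 187 185 155 135 183 147 112]';
Drug_session2    = [58 56 86 86 85 35 113 156 126 190 100 123 149 118 165]';
Placebo_session2 = [89 98 114 91 93 26 91 166 110 179 124 165 106 142 155]';

% One row per participant, repeated measurements across the columns.
% Group A pairs Drug_session1 with Placebo_session2 and Group B pairs
% Drug_session2 with Placebo_session1, which is the same pairing as
% M = [M(1:15,1) M(16:30,2); M(16:30,1) M(1:15,2)] in the answer.
Drug    = [Drug_session1; Drug_session2];
Placebo = [Placebo_session2; Placebo_session1];
Group   = categorical([repmat({'A'}, 15, 1); repmat({'B'}, 15, 1)]);

T2 = table(Drug, Placebo, Group);
disp(T2([1 16], :));   % first participant of each group
```

T2 can then be passed to fitrm in place of T, with the same 'Drug-Placebo ~ Group' model.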
  6 Comments
Steve Schulz on 5 Aug 2022
Hey Scott, I would like to learn from this problem (this is my first research project), so allow me a few more questions:
1) Since the interaction effect between dose and group is significant, I would have expected the group main effect to be necessarily significant as well. I mean the only fixed factor that differs between the two groups is the order of treatment. Is this a valid conclusion?
2) Is the added value of the repeated measures ANOVA over the t-test only in quantifying the interaction effect? Could I also have tested the main effects with one sample (drug effect) and two-sample t-tests (group effect)?
3) 'Drug-Placebo ~ Group'--> Does that mean the Group-Effect is the result of the difference between the two treatment conditions?
Scott MacKenzie on 6 Aug 2022
Edited: Scott MacKenzie on 6 Aug 2022
@Steve Schulz, in your first question, you astutely notice an important issue. Since the Group main effect is not significant, how can there be a significant interaction effect of Group with another independent variable? In this case, with p < .0001 for the Dose Type main effect, it is likely that the interaction effect is entirely due to the strong main effect of Dose Type; that is, there is likely no practical significance or real implication in the observed significance of the Dose Type x Group interaction -- or something like that.
A t-test can only be used to compare two conditions; that is, two levels of a single independent variable. The ANOVA procedure expands on this capability in two ways. First, it can be used with an independent variable having > 2 levels. Second, it can be used with > 1 independent variable. In the latter case, you are testing for both the main effects of each independent variable and the interaction effects between the independent variables. That's the case in your study. You could have used t-tests for both main effects (since each factor has only two levels) and ignored the interaction effect. However, this is not a good approach. As more t-tests are used, the likelihood of getting an erroneously significant outcome increases. Using an ANOVA avoids this (since all effects are tested for in a single procedure).
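To make the link between the t-tests and this 2 x 2 mixed ANOVA concrete, here is a rough sketch using the data as arranged in the answer. With two groups and two repeated measurements, the Group F equals the squared t from a two-sample t-test on each participant's mean score, and the interaction F equals the squared t from a two-sample t-test on the Drug minus Placebo differences. The variable names are made up for illustration; ttest and ttest2 assume the Statistics and Machine Learning Toolbox.

```matlab
% Scores arranged as in the answer: Group A = rows 1-15, Group B = rows 16-30
drugA    = [61 67 69 32 56 50 71 146 148 141 123 105 115 88 45]';
placeboA = [89 98 114 91 93 26 91 166 110 179 124 165 106 142 155]';
drugB    = [58 56 86 86 85 35 113 156 126 190 100 123 149 118 165]';
placeboB = [147 106 139 90 157 144 111 187 187 185 155 135 183 147 112]';

% Group main effect: two-sample t-test on each participant's mean score
[~, pGroup, ~, stGroup] = ttest2((drugA + placeboA)/2, (drugB + placeboB)/2);

% Dose Type x Group interaction: two-sample t-test on the differences
[~, pInter, ~, stInter] = ttest2(drugA - placeboA, drugB - placeboB);

% Dose Type effect ignoring Group: paired t-test. It has 29 df, so its
% t^2 is not identical to the F(1,28) reported by the ANOVA.
[~, pDose, ~, stDose] = ttest([drugA; drugB], [placeboA; placeboB]);

fprintf('Group:       t(%d)^2 = %.3f, p = %.4f\n', stGroup.df, stGroup.tstat^2, pGroup);
fprintf('Interaction: t(%d)^2 = %.3f, p = %.4f\n', stInter.df, stInter.tstat^2, pInter);
fprintf('Dose Type:   t(%d)^2 = %.3f, p = %.4f\n', stDose.df, stDose.tstat^2, pDose);
```

The first two squared t values should reproduce the Group and Group:DoseType F values in the ANOVA table (4.368 and 0.250).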
I'm not sure I understand your third question. The Group effect in your study was not significant. That simply means that there is no evidence of a difference between Group A and Group B. Put another way, there is no evidence of a difference in cognitive score between the drug-then-placebo group and the placebo-then-drug group.
BTW, I often emphasise with students that the results of statistical procedures like the ANOVA are not the results per se. There is often a misconception about this and it is sometimes apparent in research papers that overly emphasise the results of statistical tests. There is so much effort invested in figuring out how to do statistical tests and what the outcomes mean, that one might be tempted to think that the results of statistical tests are the results. But, that's not the case. Really, what is needed -- first and foremost -- is to inspect the data collected. The first step is to compute the means in the measurements across all the conditions tested and then look at them and then think about what the observations and measurements suggest. Then, to get a good visual sense of the data, plot the results in line charts, bar charts, or whatever. You'll no doubt see some differences in the means and you'll be thinking... Hmmm, the score on such-and-such seems to be quite a bit higher (or lower) under this condition compared to that condition. I wonder if that difference is real or is it just an artefact of the variability in the measures? That's the sort of thinking we all do. And that's when the statistical tests come into play. At the end of the day, the results of the statistical tests only play a supporting role: They allow you to add strength and confidence to concluding statements, which is why the ANOVA results are often presented in parentheses.
Hope this helps. Good luck.
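The "look at the means first" step described above could be sketched roughly like this for the data in this thread. It uses the arrangement from the accepted answer. The variable names and the choice of a simple line chart are only for illustration.

```matlab
% Scores arranged as in the answer: Group A = rows 1-15, Group B = rows 16-30
drugA    = [61 67 69 32 56 50 71 146 148 141 123 105 115 88 45]';
placeboA = [89 98 114 91 93 26 91 166 110 179 124 165 106 142 155]';
drugB    = [58 56 86 86 85 35 113 156 126 190 100 123 149 118 165]';
placeboB = [147 106 139 90 157 144 111 187 187 185 155 135 183 147 112]';

% Cell means: rows = Group (A, B), columns = Dose Type (Drug, Placebo)
cellMeans = [mean(drugA) mean(placeboA); mean(drugB) mean(placeboB)];
disp(array2table(cellMeans, 'VariableNames', {'Drug', 'Placebo'}, ...
    'RowNames', {'GroupA', 'GroupB'}));

% Line chart of the four cell means, one line per group
figure;
plot([1 2], cellMeans(1,:), '-o', [1 2], cellMeans(2,:), '-s');
set(gca, 'XTick', [1 2], 'XTickLabel', {'Drug', 'Placebo'});
xlim([0.5 2.5]);
xlabel('Dose Type');
ylabel('Mean cognitive score');
legend('Group A', 'Group B', 'Location', 'northwest');
```

Roughly parallel lines in a chart like this are what a non-significant interaction looks like, while the vertical gap between the lines reflects the Group effect.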
