Computation and interpretation of terms in table generated by ranova
Hi, 
I want to apply ranova from the Statistics Toolbox to a dataset very similar to the one in an example in the function's documentation, which uses the fisheriris dataset. I am using the documented example to understand what ranova does, but I do not fully understand it.
The dataset consists of an array "meas" with 4 columns, one per condition, and 150 rows. Each row in meas corresponds to one subject. The subjects belong to 3 different species, with 50 subjects per species, as specified in the cell array of character vectors "species".
The following is the table output by ranova in the documented example:
                                SumSq     DF      MeanSq       F         pValue        pValueGG       pValueHF       pValueLB  
                                ______    ___    ________    ______    ___________    ___________    ___________    ___________
    (Intercept):Measurements    1656.3      3      552.09    6873.3              0    9.4491e-279    2.9213e-283    2.5871e-125
    species:Measurements        282.47      6      47.078     586.1    1.4271e-206    4.9313e-156    1.5406e-158     9.0151e-71
    Error(Measurements)         35.423    441    0.080324    
 I was able to compute the sum of squared deviations (SSD) term (Intercept):Measurements in the first row above, as follows:
    Columns = sum((mean(meas)-mean(meas(:))).^2)*size(meas,1) 	% sum of squared deviations of column means from global mean
However, I cannot find a way to compute the SSD terms in the 2nd or 3rd rows. The closest I've managed is to compute
    Rows = sum((mean(meas')-mean(meas(:))).^2)*size(meas,2) 	% sum of squared deviations of row means from global mean
    Total = (mean(meas(:).^2)-(mean(meas(:))).^2)*numel(meas)	% sum of squared deviations of all values from global mean
    SS_error_leftover = Total - Rows - Columns
where it turns out that 
 SS_error_leftover - Error(Measurements) = 317.88 - 35.42 ≈ 282.47 = species:Measurements
Here Error(Measurements) and species:Measurements are the SSD terms from the table which I'd like to compute directly in order to understand what the terms represent.  
The best I have come up with is
    reps = @(selection,data) data(strcmp(species,selection),:);   % rows of data belonging to one species
    SSD_spxmeas = @(x) sum((mean(x) - mean(meas(:))).^2)*size(x,1);   % SSD of one species' column means from the global mean
    SSD_spxmeas_total = SSD_spxmeas(reps('versicolor',meas)) + SSD_spxmeas(reps('setosa',meas)) + SSD_spxmeas(reps('virginica',meas))
Here reps picks out subjects belonging to one species and SSD_spxmeas_total is the SSD between the means for each {species,measure} combination and the global mean.
The documentation is not very helpful. There is a related post with an answer on MATLAB Central which provides an interpretation, but I'd like to get a better understanding by seeing how the terms are computed. I have checked the documentation for repanova (Octave), but that function cannot handle this scenario involving a between-subject variable (grouping by species).
Assistance would be greatly appreciated. Thanks!
Answers (1)
Shivansh on 22 Mar 2024
        Hi Chris!
Calculating the sums of squared deviations (SSD) for the species:Measurements interaction and for Error(Measurements) in a repeated measures ANOVA takes more than comparing means to the global mean. The second part of your script compares each {species, measurement} mean to the global mean, so it does not isolate the interaction: that quantity also contains the main effects of species and of measurements. The interaction is not the deviation from the global mean but the part of each {species, measurement} mean that remains after the species mean and the measurement mean have been accounted for.
Error(Measurements) represents the within-subject variability. It captures the variability in each subject's measurements that is not explained by the measurement effect or by the species-measurement interaction.
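As a rough sketch of that decomposition (written from the standard split-plot formulas, not taken from the ranova source), the three SumSq entries can be reproduced by hand for this balanced dataset. The variable names below (cellMean, grpMean, SS_inter, and so on) are my own, and the code assumes complete data with equal group sizes, as in fisheriris, plus R2016b or later for implicit expansion.

```matlab
load fisheriris                         % meas (150x4), species (150x1 cellstr)

[n, k] = size(meas);
[~, ~, g] = unique(species);            % g(i) = species index of subject i
G = max(g);

grandMean = mean(meas(:));
colMean   = mean(meas, 1);              % 1 x k, mean per measurement
rowMean   = mean(meas, 2);              % n x 1, mean per subject

cellMean = zeros(G, k);                 % mean per {species, measurement}
nPerGrp  = zeros(G, 1);                 % subjects per species
for s = 1:G
    cellMean(s, :) = mean(meas(g == s, :), 1);
    nPerGrp(s)     = sum(g == s);
end
grpMean = mean(cellMean, 2);            % G x 1, mean per species over all measurements

% (Intercept):Measurements: column means vs. global mean
SS_meas = n * sum((colMean - grandMean).^2);

% species:Measurements: cell means with both main effects removed
inter    = cellMean - grpMean - colMean + grandMean;
SS_inter = sum(nPerGrp .* sum(inter.^2, 2));

% Error(Measurements): each value with subject mean and cell mean removed
resid  = meas - rowMean - cellMean(g, :) + grpMean(g);
SS_err = sum(resid(:).^2);

% The quantity computed in the question (cell means vs. global mean)
% mixes three effects: species, measurements, and their interaction.
SS_species = k * sum(nPerGrp .* (grpMean - grandMean).^2);
SS_cells   = sum(nPerGrp .* sum((cellMean - grandMean).^2, 2));

fprintf('%.2f  %.2f  %.3f\n', SS_meas, SS_inter, SS_err);   % 1656.26  282.47  35.423
```

In this balanced design SS_cells equals SS_species + SS_meas + SS_inter, which is why comparing cell means to the global mean overshoots the interaction term. For unbalanced groups the hand formulas above may no longer agree with ranova.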
The computations performed by the ranova function in MATLAB also account for the correlations among repeated measures on the same subjects: besides the uncorrected pValue, the table reports p-values under the Greenhouse-Geisser (pValueGG), Huynh-Feldt (pValueHF), and lower-bound (pValueLB) corrections for departures from sphericity.
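To illustrate how the remaining columns relate to SumSq and DF, here is a small sketch that starts from the values printed in the table in the question. It assumes each effect is tested against Error(Measurements) and that the lower-bound correction scales both degrees of freedom by 1/(k-1), where k is the number of repeated measures. The variable names are hypothetical, and fcdf requires the Statistics and Machine Learning Toolbox.

```matlab
% Values copied from the ranova table in the question
SS = [1656.3; 282.47; 35.423];    % (Intercept):Meas, species:Meas, Error(Meas)
DF = [3; 6; 441];

MS = SS ./ DF;                    % MeanSq column
F  = MS(1:2) ./ MS(3);            % F column: each effect against Error(Measurements)

% Uncorrected p-values (sphericity assumed), upper tail of the F distribution
p = fcdf(F, DF(1:2), DF(3), 'upper');

% Lower-bound correction: shrink both degrees of freedom by epsilon = 1/(k-1)
k     = 4;                        % number of repeated measures
epsLB = 1 / (k - 1);
pLB   = fcdf(F, epsLB * DF(1:2), epsLB * DF(3), 'upper');

fprintf('F = %.1f, %.1f\n', F(1), F(2));
fprintf('pLB(species:Measurements) = %.3g\n', pLB(2));   % about 9e-71, cf. pValueLB
```

The Greenhouse-Geisser and Huynh-Feldt columns follow the same pattern, but with an epsilon estimated from the data rather than fixed; calling epsilon on the fitted repeated measures model should list the values used.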
You can also have a look at the following MATLAB Answer: https://www.mathworks.com/support/search.html/answers/416818-which-type-of-sums-of-squares-does-ranova-use.html?fq%5B%5D=asset_type_name:answer&fq%5B%5D=category:stats/repeated-measures&page=1
 I hope it helps!
