multcompare

Multiple comparison of means for analysis of variance (ANOVA)

Since R2022b

    Description

    m = multcompare(aov) returns a table of results m from a multiple comparison of means for a one-way anova object.

    m = multcompare(aov,factors) performs the multiple comparison of means over the combinations of values for the factors listed in factors. This syntax is valid for a one-, two-, or N-way ANOVA.

    m = multcompare(___,Name=Value) specifies additional options using one or more name-value arguments. For example, you can specify the confidence level and the type of critical value used to determine if the means are significantly different.

    Examples

    Load popcorn yield data.

    load popcorn.mat

    The columns of the 6-by-3 matrix popcorn contain popcorn yield observations in cups for the brands Gourmet, National, and Generic.

    Convert popcorn to a vector.

    popcorn = popcorn(:);

    Create a string array of values for the factor Brand using the function repmat.

    brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];

    Perform a one-way ANOVA to test the null hypothesis that the mean yields are the same across the three brands.

    aov = anova(brand,popcorn,FactorNames="Brand")
    aov = 
    1-way anova, constrained (Type III) sums of squares.
    
    Y ~ 1 + Brand
    
                 SumOfSquares    DF    MeanSquares     F        pValue  
                 ____________    __    ___________    ____    __________
    
        Brand       15.75         2        7.875      18.9    7.9603e-05
        Error        6.25        15      0.41667                        
        Total          22        17                                     
    
    
      Properties, Methods
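
    The entries in the ANOVA table are related by simple arithmetic: each mean square is the sum of squares divided by its degrees of freedom, and F is the ratio of the Brand mean square to the Error mean square. The following is a short check with the numbers copied from the display above. The variable names are illustrative and are not part of the original example.

```matlab
% Values copied from the ANOVA table display above.
ssBrand = 15.75;   dfBrand = 2;
ssError = 6.25;    dfError = 15;

msBrand = ssBrand/dfBrand;   % mean square for Brand
msError = ssError/dfError;   % mean square for Error
F = msBrand/msError;         % F-statistic for the Brand term
```

    The results agree with the MeanSquares and F columns of the table up to display rounding.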
    
    
    

    The small p-value indicates that the null hypothesis can be rejected at the 99% confidence level. Therefore, the difference in mean popcorn yield is statistically significant for at least one brand. Perform Dunnett's test to determine if the mean yields of Gourmet and National differ significantly from the mean yield of Generic.

    m = multcompare(aov,CriticalValueType="dunnett",ControlGroup=3)
    m=2×6 table
          Group1       Group2      MeanDifference    MeanDifferenceLower    MeanDifferenceUpper     pValue  
        __________    _________    ______________    ___________________    ___________________    _________
    
        "Gourmet"     "Generic"         2.25                 1.341                 3.159           4.402e-05
        "National"    "Generic"         0.75              -0.15904                 1.659             0.11012
    
    

    Each row of m contains a p-value for the null hypothesis that the means of the groups in columns Group1 and Group2 are not significantly different. The p-value in the first row is small enough to reject the null hypothesis that the mean popcorn yield of Gourmet is not significantly different from that of Generic. The p-value in the second row is too large to reject the null hypothesis that the mean popcorn yield of National is not significantly different from that of Generic. The value for MeanDifference is positive in the first row; therefore, the mean popcorn yield of Gourmet is significantly higher than that of Generic.
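
    To list only the comparisons that are significant, you can filter m by its pValue variable. The following is a minimal sketch that repeats the setup of this example. The names y, alpha, and significant are illustrative, and the 0.05 threshold is an assumption chosen to match the default value of Alpha.

```matlab
% Recreate the one-way ANOVA from this example.
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");

% Dunnett's test with Generic (the third factor value) as the control group.
m = multcompare(aov,CriticalValueType="dunnett",ControlGroup=3);

% Keep only the rows whose p-value is below the chosen threshold.
alpha = 0.05;                         % assumed threshold
significant = m(m.pValue < alpha,:);
```

    With the results shown above, only the Gourmet and Generic comparison remains in significant.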

    Load the patients data.

    load patients.mat

    Create a table containing variables with factor values for the smoking status and physical location of patients, and the response data for systolic blood pressure.

    tbl = table(Smoker,Location,Systolic)
    tbl=100×3 table
        Smoker              Location               Systolic
        ______    _____________________________    ________
    
        true      {'County General Hospital'  }      124   
        false     {'VA Hospital'              }      109   
        false     {'St. Mary's Medical Center'}      125   
        false     {'VA Hospital'              }      117   
        false     {'County General Hospital'  }      122   
        false     {'St. Mary's Medical Center'}      121   
        true      {'VA Hospital'              }      130   
        false     {'VA Hospital'              }      115   
        false     {'St. Mary's Medical Center'}      115   
        false     {'County General Hospital'  }      118   
        false     {'County General Hospital'  }      114   
        false     {'St. Mary's Medical Center'}      115   
        false     {'VA Hospital'              }      127   
        true      {'VA Hospital'              }      130   
        false     {'St. Mary's Medical Center'}      114   
        true      {'VA Hospital'              }      130   
          ⋮
    
    

    Perform a two-way ANOVA to test the null hypotheses that systolic blood pressure is not significantly different between smokers and non-smokers, and that it is not significantly different between locations.

    aov = anova(tbl,"Systolic")
    aov = 
    2-way anova, constrained (Type III) sums of squares.
    
    Systolic ~ 1 + Smoker + Location
    
                    SumOfSquares    DF    MeanSquares      F         pValue  
                    ____________    __    ___________    ______    __________
    
        Smoker         2154.4        1      2154.4       94.462    5.9678e-16
        Location       46.064        2      23.032       1.0099       0.36811
        Error          2189.5       96      22.807                           
        Total          4461.2       99                                       
    
    
      Properties, Methods
    
    
    

    The p-values indicate that enough evidence exists to conclude that smoking status has a significant effect on blood pressure. However, not enough evidence exists to conclude that physical location has a significant effect.

    Investigate the mean differences between the response data from each group.

    m = multcompare(aov,["Smoker","Location"])
    m=15×6 table
                        Group1                                     Group2                     MeanDifference    MeanDifferenceLower    MeanDifferenceUpper      pValue  
        _______________________________________    _______________________________________    ______________    ___________________    ___________________    __________
    
        Smoker              Location               Smoker              Location                                                                                         
        ______    _____________________________    ______    _____________________________                                                                              
                                                                                                                                                                        
        false     {'County General Hospital'  }    true      {'County General Hospital'  }        -9.935              -12.908                -6.9623          7.6385e-15
        false     {'County General Hospital'  }    false     {'VA Hospital'              }         1.516              -1.6761                  4.708             0.73817
        false     {'County General Hospital'  }    true      {'VA Hospital'              }        -8.419              -12.899                -3.9394          5.3456e-06
        false     {'County General Hospital'  }    false     {'St. Mary's Medical Center'}        0.3721              -3.2806                 4.0248             0.99968
        false     {'County General Hospital'  }    true      {'St. Mary's Medical Center'}       -9.5629              -14.637                -4.4886          5.0113e-06
        true      {'County General Hospital'  }    false     {'VA Hospital'              }        11.451               7.2101                 15.692          8.3835e-11
        true      {'County General Hospital'  }    true      {'VA Hospital'              }         1.516              -1.6761                  4.708             0.73817
        true      {'County General Hospital'  }    false     {'St. Mary's Medical Center'}        10.307               5.9931                 14.621          6.5271e-09
        true      {'County General Hospital'  }    true      {'St. Mary's Medical Center'}        0.3721              -3.2806                 4.0248             0.99968
        false     {'VA Hospital'              }    true      {'VA Hospital'              }        -9.935              -12.908                -6.9623          7.6385e-15
        false     {'VA Hospital'              }    false     {'St. Mary's Medical Center'}       -1.1439              -4.8086                 2.5209             0.94367
        false     {'VA Hospital'              }    true      {'St. Mary's Medical Center'}       -11.079              -16.058                -6.0994          6.0817e-08
        true      {'VA Hospital'              }    false     {'St. Mary's Medical Center'}        8.7911               4.3482                 13.234          1.5297e-06
        true      {'VA Hospital'              }    true      {'St. Mary's Medical Center'}       -1.1439              -4.8086                 2.5209             0.94367
        false     {'St. Mary's Medical Center'}    true      {'St. Mary's Medical Center'}        -9.935              -12.908                -6.9623          7.6385e-15
    
    

    Each p-value corresponds to the null hypothesis that the means of groups in the same row are not significantly different. The table includes six p-values greater than 0.05, corresponding to the six pairs of groups with the same smoking status value. Therefore, systolic blood pressure is not significantly different between groups with the same smoking status value.
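
    The row counts in this example follow from counting pairs of groups. Two smoking status values and three locations give six groups, and every pair of groups gets one row. The sketch below reproduces the counts. The variable names are illustrative.

```matlab
nSmoker   = 2;    % false and true
nLocation = 3;    % three hospitals in the patients data

nGroups = nSmoker*nLocation;       % combinations of factor values
nPairs  = nchoosek(nGroups,2);     % one row of m per pair of groups

% Pairs that share a smoking status differ only in location.
nSameSmoker = nSmoker*nchoosek(nLocation,2);
```

    Here nPairs is 15, matching the height of m, and nSameSmoker is 6, matching the number of p-values greater than 0.05.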

    Input Arguments

    Analysis of variance results, specified as an anova object. The properties of aov contain the factors and response data used by multcompare to compute the difference in means.

    Factors used to group the response data, specified as a string vector or cell array of character vectors. The multcompare function groups the response data by the combinations of values for the factors in factors. The factors argument must be one or more of the names in aov.FactorNames.

    Example: ["g1","g2"]

    Data Types: string | cell
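
    As an illustration of the factors argument, the sketch below fits the two-way ANOVA on the patients data and then compares means over a single factor. It assumes that patients.mat is on the path, as in the examples. The name mLoc is illustrative.

```matlab
load patients.mat
tbl = table(Smoker,Location,Systolic);
aov = anova(tbl,"Systolic");

% Group the response by Location only, ignoring Smoker in the grouping.
mLoc = multcompare(aov,"Location");
```

    Because Location has three values, mLoc contains nchoosek(3,2) = 3 rows, one for each pair of locations.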

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: Alpha=0.01,CriticalValueType="dunnett",Approximate=true sets the significance level of the confidence intervals to 0.01 and uses an approximation of Dunnett's critical value to calculate the p-values.

    Significance level for the estimates, specified as a scalar value in the range (0,1). The confidence level of the confidence intervals is 100(1 - α)%. The default value for Alpha is 0.05, which returns 95% confidence intervals for the estimates.

    Example: Alpha=0.01

    Data Types: single | double
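
    The effect of Alpha can be seen by comparing interval widths at two significance levels. The sketch below reuses the popcorn data from the examples. The names y, m95, m99, width95, and width99 are illustrative.

```matlab
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");

m95 = multcompare(aov);              % default Alpha=0.05, 95% intervals
m99 = multcompare(aov,Alpha=0.01);   % 99% intervals

% Interval width for each pairwise comparison.
width95 = m95.MeanDifferenceUpper - m95.MeanDifferenceLower;
width99 = m99.MeanDifferenceUpper - m99.MeanDifferenceLower;
```

    Lowering Alpha widens every interval, while the estimated mean differences stay the same.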

    Critical value type used by the multcompare function to calculate p-values, specified as one of the options in the following table. Each option specifies the statistical test that multcompare uses to calculate the critical value.

    Option                      Statistical Test
    "tukey-kramer" (default)    Tukey-Kramer test
    "hsd"                       Honestly Significant Difference test. Same as "tukey-kramer".
    "dunn-sidak"                Dunn-Sidak correction
    "bonferroni"                Bonferroni correction
    "scheffe"                   Scheffé test
    "dunnett"                   Dunnett's test. Can be used only when aov is a one-way anova object or when a single factor is specified in factors. Specify the control group with the ControlGroup name-value argument.
    "lsd"                       Least Significant Difference test. Uses the critical value for a plain t-test. This option does not protect against the multiple comparisons problem unless it follows a preliminary overall test such as an F-test.

    Example: CriticalValueType="dunn-sidak"

    Data Types: char | string
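
    As a rough illustration of how two of these corrections account for the number of comparisons, the sketch below computes the textbook Bonferroni and Dunn-Sidak per-comparison significance levels for three pairwise comparisons. These are the standard formulas, shown under the assumption of a familywise level of 0.05. The sketch does not describe the internal implementation of multcompare, and the variable names are illustrative.

```matlab
alpha = 0.05;          % assumed familywise significance level
k = nchoosek(3,2);     % number of pairwise comparisons among 3 groups

alphaBonferroni = alpha/k;               % Bonferroni per-comparison level
alphaSidak = 1 - (1 - alpha)^(1/k);      % Dunn-Sidak per-comparison level
```

    Both levels are smaller than alpha, and the Dunn-Sidak level is slightly larger than the Bonferroni level, so the Bonferroni correction is the more conservative of the two.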

    Indicator to compute the Dunnett critical value approximately, specified as a numeric or logical 1 (true) or 0 (false). You can compute the Dunnett critical value approximately for speed. The default for Approximate is true for an N-way ANOVA with N greater than two, and false otherwise. This argument is valid only when CriticalValueType is "dunnett".

    Example: Approximate=true

    Data Types: logical
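
    One way to judge whether the approximation matters for a given data set is to run Dunnett's test both ways and compare the p-values. The sketch below does this for the popcorn data from the examples. The names y, mExact, mApprox, and maxDiff are illustrative, and the size of the difference is not stated here because it depends on the data.

```matlab
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");

% Dunnett's test with the exact and the approximate critical value.
mExact  = multcompare(aov,CriticalValueType="dunnett",Approximate=false);
mApprox = multcompare(aov,CriticalValueType="dunnett",Approximate=true);

% Largest absolute difference between the two sets of p-values.
maxDiff = max(abs(mExact.pValue - mApprox.pValue));
```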

    Index of the control group factor value for Dunnett's test, specified as a positive integer. Factor values are indexed by the order in which they appear in aov.ExpandedFactorNames. This argument is valid only when CriticalValueType is "dunnett".

    Example: ControlGroup=3

    Data Types: single | double
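
    The sketch below changes the control group for the popcorn data from the examples. It assumes the same indexing as in the Dunnett example, where index 3 selects Generic, so index 1 is expected to select Gourmet. The names y and m1 are illustrative.

```matlab
load popcorn.mat
y = popcorn(:);
brand = [repmat("Gourmet",6,1); repmat("National",6,1); repmat("Generic",6,1)];
aov = anova(brand,y,FactorNames="Brand");

% Use the first factor value as the control group (assumed to be Gourmet).
m1 = multcompare(aov,CriticalValueType="dunnett",ControlGroup=1);
```

    Each row of m1 compares one of the remaining brands against the control group, so m1 has two rows.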

    Output Arguments

    Multiple comparison procedure results, returned as a table. The table m has the following variables:

    • Group1 — Values of the factors in the first comparison group

    • Group2 — Values of the factors in the second comparison group

    • MeanDifference — Difference in mean response between the observations in Group1 and the observations in Group2

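    • MeanDifferenceLower — Lower confidence bound on the mean difference. The confidence level is 95% by default and is set by Alpha.

    • MeanDifferenceUpper — Upper confidence bound on the mean difference. The confidence level is 95% by default and is set by Alpha.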

    • pValue — p-value indicating whether or not the mean of Group1 is significantly different from the mean of Group2

    If two or more factors are provided in factors, the columns Group1 and Group2 contain tables of values for the factors of the groups being compared.
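
    When Group1 and Group2 are tables, you can reach an individual factor with two levels of dot indexing. The sketch below selects the comparisons between groups that share a smoking status, using the patients data from the examples. The names sameSmoker and mSame are illustrative.

```matlab
load patients.mat
tbl = table(Smoker,Location,Systolic);
aov = anova(tbl,"Systolic");
m = multcompare(aov,["Smoker","Location"]);

% m.Group1 and m.Group2 are tables with variables Smoker and Location.
sameSmoker = m.Group1.Smoker == m.Group2.Smoker;

% Keep the comparisons that differ only in Location.
mSame = m(sameSmoker,:);
```

    For the two-way example, mSame contains the six rows whose p-values are greater than 0.05.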

    Version History

    Introduced in R2022b