# manova1

One-way multivariate analysis of variance (MANOVA)

## Description

d = manova1(X,group) performs a one-way multivariate analysis of variance (MANOVA) and returns an estimate d for the dimension of the space containing the group means. To perform the MANOVA, manova1 uses the factor in group and the data in X.

d = manova1(X,group,alpha) also specifies the significance level for the MANOVA.

[d,p] = manova1(___) also returns the p-value p corresponding to d, using any of the input argument combinations in the previous syntaxes.

[d,p,stats] = manova1(___) also returns a structure stats containing additional MANOVA statistics.

## Examples

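Load the carbig data set, which contains measurements of cars, including the variables MPG, Acceleration, Weight, Displacement, and Origin.

load carbig

Calculate the dimension of the space containing the group mean vectors and the corresponding p-values, using Origin as the grouping factor.

[d,p] = manova1([MPG Acceleration Weight Displacement],Origin)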
d =
3

p = 4×1

0.0000
0.0000
0.0075
0.1934

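The first three p-values, which correspond to dimensions 0, 1, and 2, are below the default significance level of 0.05. Enough evidence exists to reject the null hypotheses that the group mean vectors are the same, lie on a line, or lie in a plane. The fourth p-value, 0.1934, is above 0.05, so not enough evidence exists to reject the null hypothesis that the mean vectors lie in the same 3D space. Therefore, d is 3.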

Load the fisheriris data set.

load fisheriris

The column vector species contains the species of each iris flower: setosa, versicolor, or virginica. The matrix meas contains four measurements for each flower: the length and width of the sepals and petals, in centimeters.

Perform a one-way MANOVA to test the null hypothesis that the vector of means for the four measurements is the same across the three flower species. Specify the significance level. Calculate the dimension of the space containing the vectors for the three flower species, the corresponding p-values, and additional statistics for the MANOVA.

[d,p,stats] = manova1(meas,species,0.01)
d =
2

p = 2×1

1.0e-07 *

0.0000
0.5786

stats = struct with fields:
W: [4x4 double]
B: [4x4 double]
T: [4x4 double]
dfW: 147
dfB: 2
dfT: 149
lambda: [2x1 double]
chisq: [2x1 double]
chisqdf: [2x1 double]
eigenval: [4x1 double]
eigenvec: [4x4 double]
canon: [150x4 double]
mdist: [150x1 double]
gmdist: [3x3 double]
gnames: {3x1 cell}

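Both p-values are below the specified significance level of 0.01 (the second is about 5.8e-08), so the test rejects the null hypotheses that the three mean vectors are the same (dimension 0) and that they lie on a line (dimension 1). The output therefore shows that the vectors of means for the three species are contained in a two-dimensional space, which is the largest dimension possible for three groups. The stats structure contains additional statistics for the MANOVA.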

Inspect the canonical response data for the MANOVA.

C = stats.canon
C = 150×4

-8.0618    0.3004    0.0287    0.2769
-7.1287   -0.7867    0.8907   -0.0714
-7.4898   -0.2654    0.1792   -0.5257
-6.8132   -0.6706   -0.3940   -0.7182
-8.1323    0.5145   -0.4776    0.0508
-7.7019    1.4617   -0.4069    0.4651
-7.2126    0.3558   -0.4843   -0.9609
-7.6053   -0.0116   -0.2433    0.0825
-6.5606   -1.0152   -0.0342   -1.1131
-7.3431   -0.9473   -0.0903    0.1119
⋮

Each column of C corresponds to a canonical variable, and each row contains a transformed data point corresponding to the same row in the input data (here, meas). For more information about canonical variables, see Canonical Variables.

Create a scatter plot using the first and second canonical variables.

gscatter(C(:,1),C(:,2),species)

The scatter plot shows two main clusters of data, with the measurements for setosa in one cluster and the measurements for versicolor and virginica in the other. This result also shows that the vectors of means for the three species are contained in a two-dimensional space.

## Input Arguments

X: Data, specified as a numeric matrix with n rows, where n is the number of observations. The columns of X correspond to the elements of the multivariate means.

Data Types: single | double

group: Factor values, specified as a categorical, numeric, or string vector, or a cell array of character vectors. group must contain n elements, where n is the number of rows in X. Each element of group represents the factor value of the data in the corresponding row of X.

Example: [1,2,1,3,1,...,3,1]

Example: ["white","red","white",...,"black","red"]

Data Types: single | double | string | cell | categorical
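As a quick illustration of the accepted group types, the following sketch runs manova1 on the fisheriris data with the same grouping expressed three ways. The variable names dCell, dCat, and dNum are illustrative, and grp2idx is used here only as one convenient way to produce numeric group codes.

```matlab
% Load Fisher's iris data: meas (150x4 numeric), species (150x1 cell array)
load fisheriris

% Same factor, three representations
dCell = manova1(meas,species);               % cell array of character vectors
dCat  = manova1(meas,categorical(species));  % categorical vector
dNum  = manova1(meas,grp2idx(species));      % numeric group codes

% The grouping is identical in each call, so the estimates should agree
sameResult = isequal(dCell,dCat,dNum);
```

Only the group membership matters to the analysis, so the choice of representation is mostly a matter of how the factor is already stored.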

alpha: Significance level for the MANOVA, specified as a scalar between 0 and 1. The default is 0.05. For more information, see Algorithms.

Example: 0.01

Data Types: single | double

## Output Arguments

d: Estimate of the dimension of the space containing the group mean vectors, returned as a nonnegative integer. d is at most the smaller of the number of columns in X and one less than the number of groups. For more information, see Algorithms.

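p: p-values for the MANOVA, returned as a column vector of values between 0 and 1. p contains one p-value for each dimension that manova1 tests when calculating d (dimension 0, 1, and so on), so the length of p equals the largest possible value of d and can exceed the returned d. For more information, see Algorithms.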

Data Types: single | double

stats: Additional MANOVA results, returned as a structure with the following fields.

| Field | Contents |
| --- | --- |
| W | Within-groups sum of squares and cross-products matrix |
| B | Between-groups sum of squares and cross-products matrix |
| T | Total sum of squares and cross-products matrix |
| dfW | Degrees of freedom for W |
| dfB | Degrees of freedom for B |
| dfT | Degrees of freedom for T |
| lambda | Vector of values of the Wilks' lambda test statistic for testing whether the means have dimension 0, 1, and so on |
| chisq | Transformation of lambda to an approximate chi-square distribution |
| chisqdf | Degrees of freedom for chisq |
| eigenval | Eigenvalues of $W^{-1}B$ |
| eigenvec | Eigenvectors of $W^{-1}B$. These are the coefficients for the canonical variables C, scaled so that the within-groups variance of the canonical variables is 1 |
| canon | Canonical variables, equal to XC*eigenvec, where XC is X with the columns centered by subtracting their means (see Canonical Variables) |
| mdist | Vector of Mahalanobis distances from each point to the mean of its group |
| gmdist | Matrix of Mahalanobis distances between each pair of group means |

Data Types: struct
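The sums-of-squares fields are related in the usual way: the total matrix is the sum of the within-groups and between-groups matrices, and the degrees of freedom add up in the same manner. The following sketch checks these relationships numerically on the fisheriris data. The variable names sscpGap, Tcheck, and dfOK are illustrative only.

```matlab
load fisheriris
[~,~,stats] = manova1(meas,species);

% Total SSCP should equal within-groups plus between-groups SSCP
sscpGap = norm(stats.T - (stats.W + stats.B),'fro');

% Total SSCP computed directly from the column-centered data
XC = meas - mean(meas);             % implicit expansion, R2016b or later
Tcheck = XC'*XC;
totalGap = norm(stats.T - Tcheck,'fro');

% Degrees of freedom add up the same way
dfOK = (stats.dfT == stats.dfW + stats.dfB);
```

Both gaps should be zero up to floating-point rounding.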

### Canonical Variables

The canonical variables canon are linear combinations of the original variables that maximize the separation between groups. canon(:,1) is the linear combination of the X columns that has the maximum separation between groups. Among all possible linear combinations, canon(:,1) has the most significant F-statistic in a one-way analysis of variance (ANOVA). canon(:,2) has the maximum separation subject to it being orthogonal to canon(:,1), and so on.
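To make the definition concrete, the following sketch rebuilds the canonical variables from the eigenvec field and the centered data, and then checks the scaling of the eigenvectors. It uses the fisheriris data. The names canonCheck, maxGap, and withinVar are illustrative, and the pooled within-groups variance is taken here as W divided by dfW.

```matlab
load fisheriris
[~,~,stats] = manova1(meas,species);

% Center each column of the data, then project onto the eigenvectors
XC = meas - mean(meas);             % implicit expansion, R2016b or later
canonCheck = XC*stats.eigenvec;

% Largest absolute difference from the canon field
maxGap = max(abs(canonCheck(:) - stats.canon(:)));

% Pooled within-groups variance of each canonical variable
withinVar = diag(stats.eigenvec'*stats.W*stats.eigenvec)/stats.dfW;
```

If the relationships hold, maxGap is zero up to rounding and every element of withinVar is close to 1.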

## Algorithms

manova1 determines d by calculating a test statistic for each possible value of d. The formula for the test statistic is

$-\left(n-1-\frac{l+r}{2}\right)\log\left(\lambda \right),$

where n is the number of observations, l is the number of factor levels, r is the number of response variables, and $\lambda$ is Wilks' lambda. For more information about Wilks' lambda, see Multivariate Analysis of Variance for Repeated Measures.
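The following sketch recomputes the test statistic and p-values from the fields of stats for the fisheriris data. Because $\lambda$ lies between 0 and 1, its logarithm is negative, so the sketch negates the product to obtain a nonnegative chi-square value. Two ingredients are assumptions that do not come from the text above: the standard relationship between Wilks' lambda and the eigenvalues of $W^{-1}B$, and the Bartlett degrees of freedom $(r-k)(l-1-k)$ for testing dimension k. The sketch compares both against the stats fields rather than asserting them.

```matlab
load fisheriris
[~,p,stats] = manova1(meas,species);

n = size(meas,1);              % number of observations
r = size(meas,2);              % number of response variables
l = numel(stats.gnames);       % number of factor levels
k = (0:numel(stats.lambda)-1)';% tested dimensions 0, 1, ...

% Assumed: lambda for dimension k uses the eigenvalues after the first k
lambdaCheck = zeros(size(k));
for i = 1:numel(k)
    lambdaCheck(i) = prod(1./(1 + stats.eigenval(k(i)+1:end)));
end

% Test statistic, assumed degrees of freedom, upper-tail p-values
chisqCheck = -(n - 1 - (l + r)/2).*log(stats.lambda);
dfCheck    = (r - k).*(l - 1 - k);
pCheck     = chi2cdf(chisqCheck,dfCheck,'upper');
```

Comparing lambdaCheck, chisqCheck, dfCheck, and pCheck with stats.lambda, stats.chisq, stats.chisqdf, and p shows whether these assumptions match the function's output.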

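The largest possible value of d is the minimum of the number of response variables and one less than the number of factor levels. manova1 tests the dimensions 0, 1, and so on in increasing order, and d is the smallest dimension for which the p-value is greater than the significance level specified by alpha, that is, the first dimension that the test does not reject. If the p-value is less than alpha for every tested dimension, d is the largest possible value.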
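One way to read this rule is sketched below for the fisheriris data. The sketch assumes that manova1 tests the dimensions in increasing order and reports the first one that is not rejected. The names dmax, notRejected, and dCheck are illustrative.

```matlab
load fisheriris
alpha = 0.01;
[d,p] = manova1(meas,species,alpha);

r = size(meas,2);                   % number of response variables
l = numel(unique(species));         % number of factor levels
dmax = min(r,l - 1);                % largest possible dimension

% p(k+1) is the p-value for the hypothesis that the dimension is k
dims = (0:numel(p)-1)';
notRejected = dims(p > alpha);

if isempty(notRejected)
    dCheck = dmax;                  % every tested dimension was rejected
else
    dCheck = notRejected(1);        % first dimension that is not rejected
end
```

For these data, both tested dimensions are rejected at the 0.01 level, so dCheck takes the largest possible value and agrees with d.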

## Alternative Functionality

Instead of using manova1, you can create a manova object using the manova function, and then use the barttest object function to calculate the dimension of the space containing the group means. The advantages of using the manova function include:

• Support for two-way and N-way MANOVA

• Table support for factor and response data

• Additional properties of the manova object, including those for the fitted MANOVA model coefficients, degrees of freedom for the error, and response covariance matrix

## References

[1] Krzanowski, Wojtek J. Principles of Multivariate Analysis: A User's Perspective. New York: Oxford University Press, 1988.

[2] Morrison, Donald F. Multivariate Statistical Methods. 2nd ed. New York: McGraw-Hill, 1976.

## Version History

Introduced before R2006a