## Hypothesis Test Assumptions

Different hypothesis tests make different assumptions about the distribution of the random variable being sampled in the data. These assumptions must be considered when choosing a test and when interpreting the results.

For example, the *z*-test (`ztest`) and the *t*-test (`ttest`) both assume that the data are independently sampled from a normal distribution. Statistics and Machine Learning Toolbox™ functions are available for testing this assumption, such as `chi2gof`, `jbtest`, `lillietest`, and `normplot`.
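To make the idea of checking normality concrete, here is a minimal sketch of the Jarque-Bera statistic, which measures how far the sample skewness and kurtosis are from the values expected under a normal distribution. It is written in Python with only the standard library and is not the toolbox implementation of `jbtest`, which may compute critical values and *p*-values differently. The function name `jarque_bera`, the sample sizes, and the distribution parameters are illustrative assumptions, and the *p*-value uses the large-sample chi-square approximation with 2 degrees of freedom.

```python
import math
import random


def jarque_bera(data):
    """Return (JB statistic, asymptotic p-value) for a sample.

    Hypothetical helper for illustration; not the toolbox's jbtest.
    """
    n = len(data)
    mean = sum(data) / n
    # Central moments, using the 1/n convention.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    # A normal distribution has skewness 0 and kurtosis 3.
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # The chi-square survival function with 2 degrees of freedom is exp(-x/2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


rng = random.Random(0)  # fixed seed so the run is repeatable
normal_sample = [rng.gauss(10.0, 2.0) for _ in range(500)]
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]

print("normal sample:", jarque_bera(normal_sample))
print("skewed sample:", jarque_bera(skewed_sample))
```

A small *p*-value, as for the exponential sample here, is evidence against the normality assumption. The asymptotic approximation can be poor for small samples, so treat this only as an illustration of the idea.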

Both the *z*-test and the *t*-test
are relatively robust with respect to departures from this assumption,
so long as the sample size *n* is large enough. Both
tests compute a sample mean $$\overline{x}$$,
which, by the Central Limit Theorem, has an approximately normal sampling
distribution with mean equal to the population mean *μ* for large *n*,
regardless of the shape of the population distribution being sampled, provided that distribution has finite variance.
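The Central Limit Theorem argument can be checked with a small simulation. The sketch below, in Python with only the standard library, draws repeated samples from a strongly skewed exponential population (mean 1, standard deviation 1) and looks at the distribution of the sample means. The function name `simulate_sample_means`, the choice of population, and the values of `n` and `reps` are assumptions made for illustration.

```python
import math
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Return `reps` sample means, each from n exponential(1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


n, reps = 50, 4000
means = simulate_sample_means(n, reps)

# Theory: the sample mean has mean mu = 1 and standard error sigma / sqrt(n).
standard_error = 1.0 / math.sqrt(n)
within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in means) / reps

print("mean of sample means:", statistics.fmean(means))
print("std of sample means: ", statistics.stdev(means))
print("theoretical std error:", standard_error)
print("fraction within 1.96 standard errors:", within)
```

If the normal approximation holds, roughly 95% of the sample means should fall within 1.96 standard errors of the population mean, even though the population itself is far from normal. Repeating the run with a much smaller `n` shows how the approximation degrades.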

The difference between the *z*-test and the *t*-test
lies in what each assumes about the standard deviation *σ* of
the underlying normal distribution. A *z*-test assumes
that *σ* is known; a *t*-test
does not. As a result, a *t*-test must compute an
estimate *s* of the standard deviation from the sample.

Test statistics for the *z*-test and the *t*-test
are, respectively,

$$\begin{array}{l}z=\frac{\overline{x}-\mu}{\sigma /\sqrt{n}}\\ t=\frac{\overline{x}-\mu}{s/\sqrt{n}}\end{array}$$
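As a worked illustration of the two formulas, the following sketch computes both statistics for a small made-up data set. It is plain Python using the standard library rather than a call to `ztest` or `ttest`. The function names, the data, the hypothesized mean `mu0`, and the "known" `sigma` are all assumptions chosen so the arithmetic is easy to follow.

```python
import math
import statistics
from statistics import NormalDist


def z_statistic(data, mu0, sigma):
    """z = (xbar - mu0) / (sigma / sqrt(n)), with sigma assumed known."""
    n = len(data)
    return (statistics.fmean(data) - mu0) / (sigma / math.sqrt(n))


def t_statistic(data, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)), with s estimated from the sample."""
    n = len(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    return (statistics.fmean(data) - mu0) / (s / math.sqrt(n))


def z_two_sided_p(z):
    """Two-sided p-value for a z-statistic under N(0,1)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, sample mean 5
mu0 = 4.0                        # hypothesized population mean
sigma = 2.0                      # standard deviation assumed known for the z-test

z = z_statistic(data, mu0, sigma)
t = t_statistic(data, mu0)

print(round(z, 4))                 # 1.4142
print(round(t, 4))                 # 1.3229
print(round(z_two_sided_p(z), 4))  # 0.1573
```

The numerators are identical; only the denominator differs. Here *s* is about 2.138, slightly larger than the assumed *σ* of 2, so the *t*-statistic is smaller than the *z*-statistic. A *p*-value for the *t*-statistic requires the Student's *t* distribution, which the Python standard library does not provide, so it is omitted here.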

Under the null hypothesis that the population is normally distributed
with mean *μ*, the *z*-statistic
has a standard normal distribution, *N*(0,1). Under
the same null hypothesis, the *t*-statistic has Student's *t* distribution
with *n* – 1 degrees of freedom. For small
sample sizes, Student's *t* distribution is flatter
and wider than *N*(0,1), compensating for the decreased
confidence in the estimate *s*. As sample size increases,
however, Student's *t* distribution approaches the
standard normal distribution, and the two tests become essentially
equivalent.
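The heavier tails of Student's *t* distribution at small sample sizes can be seen by simulation. The sketch below, in standard-library Python, generates normal samples for which the null hypothesis is true, computes the *t*-statistic for each, and counts how often its absolute value exceeds 1.96, the two-sided 5% cutoff for *N*(0,1). The function name `t_exceedance_rate`, the sample sizes, and the number of repetitions are illustrative assumptions.

```python
import math
import random


def t_exceedance_rate(n, reps, cutoff=1.96, seed=0):
    """Fraction of simulated t-statistics with |t| > cutoff under the null."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        # Sample standard deviation with n - 1 in the denominator.
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        t = xbar / (s / math.sqrt(n))  # true mean is 0 under the null
        if abs(t) > cutoff:
            exceed += 1
    return exceed / reps


print("n = 5:  ", t_exceedance_rate(5, 4000))
print("n = 100:", t_exceedance_rate(100, 3000))
```

With *n* = 5 the rate should come out well above 0.05, which is why treating a small-sample *t*-statistic as if it were *N*(0,1) rejects too often. With *n* = 100 the rate should be close to 0.05, consistent with the two distributions being nearly the same at that size.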

Knowing the distribution of the test statistic under the null
hypothesis allows for accurate calculation of *p*-values.
Interpreting *p*-values in the context of the test
assumptions allows for critical analysis of test results.

The assumptions underlying each Statistics and Machine Learning Toolbox hypothesis test are given in the reference page for the function that implements it.