
This section explains how the Statistics and Machine Learning Toolbox™ functions `quantile` and `prctile` compute quantiles and percentiles.

The `prctile` function calculates percentiles in the same way that `quantile` calculates quantiles. The following steps for computing quantiles therefore also apply to percentiles because, for the same data sample, the quantile at the value Q is the same as the percentile at the value P = 100*Q.

`quantile` initially assigns the sorted values in `X` to the (0.5/*n*), (1.5/*n*), ..., ([*n* – 0.5]/*n*) quantiles. For example:

For a data vector of six elements such as {6, 3, 2, 10, 8, 1}, the sorted elements {1, 2, 3, 6, 8, 10} respectively correspond to the (0.5/6), (1.5/6), (2.5/6), (3.5/6), (4.5/6), and (5.5/6) quantiles.

For a data vector of five elements such as {2, 10, 5, 9, 13}, the sorted elements {2, 5, 9, 10, 13} respectively correspond to the 0.1, 0.3, 0.5, 0.7, and 0.9 quantiles.
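The assignment step can be sketched in a few lines of Python. This is an illustration of the description above, not the toolbox implementation; the helper names `plotting_positions` and `assign_quantiles` are hypothetical.

```python
def plotting_positions(n):
    """Return the probabilities (i - 0.5)/n for i = 1, ..., n."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def assign_quantiles(data):
    """Pair each sorted value with the quantile it is assigned to."""
    xs = sorted(data)
    return list(zip(plotting_positions(len(xs)), xs))


# Five-element example from the text.
pairs = assign_quantiles([2, 10, 5, 9, 13])
for prob, value in pairs:
    print(f"{prob:.1f} quantile -> {value}")
```

For the five-element vector, the sketch pairs the sorted values 2, 5, 9, 10, and 13 with the probabilities 0.1, 0.3, 0.5, 0.7, and 0.9, matching the example above.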

The following figure illustrates this approach for the data vector *X* = {2, 10, 5, 9, 13}. The first sorted observation corresponds to the cumulative probability 1/5 = 0.2, the second corresponds to the cumulative probability 2/5 = 0.4, and so on. The step function in this figure shows these cumulative probabilities. `quantile` instead places the observations at midpoints, such that the first corresponds to 0.5/5 = 0.1, the second corresponds to 1.5/5 = 0.3, and so on, and then connects these midpoints. The red lines in the following figure connect the midpoints.

**Assigning Observations to Quantiles**

By switching the axes, as in the next figure, you can see the values of the variable *X* that correspond to the `p` quantiles.

**Quantiles of *X***

`quantile` finds any quantiles between the data values using linear interpolation.

*Linear interpolation* uses linear polynomials to approximate a function $f(x)$ and construct new data points within the range of a known set of data points. Algebraically, given the data points $(x_1, y_1)$ and $(x_2, y_2)$, where $y_1 = f(x_1)$ and $y_2 = f(x_2)$, linear interpolation finds $y = f(x)$ for a given $x$ between $x_1$ and $x_2$ as follows:

$$y=f(x)={y}_{1}+\frac{\left(x-{x}_{1}\right)}{\left({x}_{2}-{x}_{1}\right)}\left({y}_{2}-{y}_{1}\right).$$
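The interpolation formula translates directly into code. The following Python sketch uses a hypothetical helper named `lerp` and, as an assumed example, interpolates between the points (0.3, 5) and (0.5, 9) taken from the five-element data vector above.

```python
def lerp(x, x1, y1, x2, y2):
    """Linearly interpolate y at x between the points (x1, y1) and (x2, y2)."""
    return y1 + (x - x1) / (x2 - x1) * (y2 - y1)


# Halfway between the 0.3 quantile (value 5) and the 0.5 quantile (value 9).
y = lerp(0.4, 0.3, 5, 0.5, 9)
print(round(y, 6))  # approximately 7
```

Because 0.4 lies halfway between 0.3 and 0.5, the result lies halfway between 5 and 9, up to floating-point rounding.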

Similarly, if the 1.5/*n* quantile is $y_{1.5/n}$ and the 2.5/*n* quantile is $y_{2.5/n}$, then linear interpolation finds the 2.3/*n* quantile $y_{2.3/n}$ as

$${y}_{\frac{2.3}{n}}={y}_{\frac{1.5}{n}}+\frac{\left(\frac{2.3}{n}-\frac{1.5}{n}\right)}{\left(\frac{2.5}{n}-\frac{1.5}{n}\right)}\left({y}_{\frac{2.5}{n}}-{y}_{\frac{1.5}{n}}\right).$$
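As a worked instance of this formula, take the five-element vector {2, 10, 5, 9, 13} used earlier, so that *n* = 5, the 1.5/5 = 0.3 quantile is 5, and the 2.5/5 = 0.5 quantile is 9. The short Python sketch below evaluates the 2.3/5 = 0.46 quantile; the variable names are chosen for illustration only.

```python
n = 5
y_lo = 5  # the 1.5/n = 0.3 quantile of {2, 5, 9, 10, 13}
y_hi = 9  # the 2.5/n = 0.5 quantile

# Fraction of the way from 1.5/n to 2.5/n; the factor 1/n cancels, leaving 0.8.
weight = (2.3 / n - 1.5 / n) / (2.5 / n - 1.5 / n)

y = y_lo + weight * (y_hi - y_lo)
print(round(y, 6))  # approximately 8.2
```

The interpolation weight does not depend on *n*, since the 1/*n* factors cancel in the ratio.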

`quantile` assigns the first and last values of the sorted *X* (the minimum and maximum) to the quantiles for probabilities less than (0.5/*n*) and greater than ([*n* – 0.5]/*n*), respectively.
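Putting the steps together, a minimal Python sketch of the whole procedure might look like the following. It assumes a single probability `p` in [0, 1] and a non-empty vector of numbers. The function names `quantile_sketch` and `percentile_sketch` are hypothetical, and the code mirrors the description in this section rather than the actual toolbox source.

```python
def quantile_sketch(data, p):
    """Quantile of data at probability p, following the steps described above."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")

    # The sorted value with 0-based index i sits at probability (i + 0.5)/n,
    # so probability p corresponds to the fractional index n*p - 0.5.
    pos = n * p - 0.5

    # Probabilities below 0.5/n or above (n - 0.5)/n map to the extremes.
    if pos <= 0:
        return xs[0]
    if pos >= n - 1:
        return xs[-1]

    # Linear interpolation between the two neighboring sorted values.
    lo = int(pos)
    frac = pos - lo
    return xs[lo] + frac * (xs[lo + 1] - xs[lo])


def percentile_sketch(data, P):
    """Percentile at P, using the relation P = 100*Q."""
    return quantile_sketch(data, P / 100)


X = [2, 10, 5, 9, 13]
print(quantile_sketch(X, 0.5))    # median of the five values
print(quantile_sketch(X, 0.05))   # below 0.5/n, so the minimum
print(percentile_sketch(X, 95))   # above (n - 0.5)/n, so the maximum
```

Expressing the percentile as a thin wrapper around the quantile reflects the relation P = 100*Q stated at the start of this section.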
