rmasummary

Calculate gene expression values from Affymetrix microarray probe-level data using the Robust Multi-array Average (RMA) procedure

Syntax

ExpressionMatrix = rmasummary(ProbeIndices, Data)
ExpressionMatrix = rmasummary(ProbeIndices, Data, 'Output', OutputValue)

Arguments

ProbeIndices

Column vector of probe indices. The convention for probe indices is, for each probe set, to label each probe 0 to N – 1, where N is the number of probes in the probe set.

Tip

Use the ProbeIndices field in the structure returned by celintensityread as the ProbeIndices input.
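As a small illustration of this indexing convention, the sketch below uses a made-up ProbeIndices vector (not taken from any real chip) and recovers the probe set boundaries from it. The names setStarts, setId, probesPerSet, and numProbeSets are hypothetical and chosen only for this example.

```matlab
% Hypothetical probe indices for three probe sets with 3, 4, and 2 probes.
ProbeIndices = [0 1 2 0 1 2 3 0 1]';

% A probe set starts wherever the index resets to 0.
setStarts = find(ProbeIndices == 0);

% Label every probe (row) with the number of the probe set it belongs to.
setId = cumsum(ProbeIndices == 0);

% Count the probes in each probe set.
probesPerSet = accumarray(setId, 1);

% Number of probe sets described by the vector.
numProbeSets = numel(setStarts);
```

Because each row of ExpressionMatrix corresponds to one probe set, a vector like this one would correspond to a three-row result.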

Data

Matrix of natural-scale intensity values where each row corresponds to a perfect match (PM) probe and each column corresponds to an Affymetrix® CEL file. (Each CEL file is generated from a separate chip. All chips should be of the same type.)

Tip

Using a single-precision matrix for Data decreases memory usage.
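The memory effect can be checked with base MATLAB alone. The sketch below uses a made-up random matrix in place of real probe intensities. The variable names are hypothetical.

```matlab
% Made-up intensity matrix: 1000 probes (rows) by 8 chips (columns).
dataDouble = 100 + 900 * rand(1000, 8);

% Convert to single precision before passing the matrix on as Data.
dataSingle = single(dataDouble);

% Compare the storage used by the two variables.
infoDouble = whos('dataDouble');
infoSingle = whos('dataSingle');
fprintf('double: %d bytes, single: %d bytes\n', ...
    infoDouble.bytes, infoSingle.bytes);
```

Single precision stores 4 bytes per element instead of 8, at the cost of roughly 7 significant decimal digits instead of about 16.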

Tip

You can use the matrix from the PMIntensities field in the structure returned by celintensityread as the Data input. However, first ensure the matrix has been background adjusted, using the rmabackadj or gcrmabackadj function, and normalized, using the quantilenorm function.

OutputValue

Specifies the scale of the returned gene expression values. OutputValue can be:

  • 'log'

  • 'log2'

  • 'log10'

  • 'linear'

  • @functionname

In the last instance, the data is transformed as defined by the function functionname. Default is 'log2'.
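The sketch below shows how the named scales relate to one another. It uses made-up values and base MATLAB functions only, and it assumes that 'log' denotes the natural logarithm, as the MATLAB log function does. It does not show how rmasummary applies a function handle, because this page does not specify that.

```matlab
% Made-up log2-scale expression values (the default output scale).
exprLog2 = [7.5 8.0; 10.25 9.75];

% Convert back to the natural (linear) intensity scale.
exprLinear = 2 .^ exprLog2;

% Derive the other logarithmic scales from the linear values.
exprLog10 = log10(exprLinear);
exprLn    = log(exprLinear);    % natural logarithm (assumed meaning of 'log')
```

With the toolbox, the scale is chosen at call time instead, for example rmasummary(ProbeIndices, Data, 'Output', 'linear').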

Description

ExpressionMatrix = rmasummary(ProbeIndices, Data) calculates gene (probe set) expression values from the natural-scale probe intensities in the matrix Data, using the column vector of probe indices, ProbeIndices. Each row in Data corresponds to a perfect match (PM) probe, and each column corresponds to an Affymetrix CEL file. (Each CEL file is generated from a separate chip. All chips should be of the same type.) ProbeIndices designates probes within each probe set by labeling each probe 0 to N – 1, where N is the number of probes in the probe set. Each row in the returned ExpressionMatrix corresponds to a gene (probe set), and each column corresponds to an Affymetrix CEL file, which represents a single chip.

For a given probe set n, with J probe pairs, let $Y_{ijn}$ denote the background-adjusted, base 2 log transformed, and quantile-normalized PM probe intensity value of chip i and probe j. $Y_{ijn}$ follows a linear additive model:

$$Y_{ijn} = U_{in} + A_{jn} + E_{ijn}, \quad i = 1, \dots, I; \; j = 1, \dots, J; \; n = 1, \dots, N$$

where:

$U_{in}$ = Gene expression of the probe set n on chip i

$A_{jn}$ = Probe affinity effect for the jth probe in the probe set

$E_{ijn}$ = Residual for the jth probe on the ith chip

The RMA method assumes $A_{1n} + A_{2n} + \dots + A_{Jn} = 0$ for all probe sets. A robust procedure, median polish, estimates $U_{in}$ as the log scale measure of expression.
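To make the median polish step concrete, the following sketch applies a textbook form of the procedure to a single made-up probe set. It is not taken from the rmasummary source. The intensity values, the iteration cap, the tolerance, and all variable names are assumptions for illustration. This form of median polish centers the probe effects so that their median, rather than their sum, is zero. The code relies on implicit expansion, which requires R2016b or later.

```matlab
% Made-up intensities for one probe set: 4 probes (rows) by 3 chips (columns),
% assumed to be already background adjusted and quantile normalized.
Y = log2([120 150 110; 480 610 450; 240 300 230; 950 1190 900]);

resid    = Y;                      % residuals E, updated in place
overall  = 0;                      % common level shared by all cells
probeEff = zeros(size(Y, 1), 1);   % probe affinity effects A
chipEff  = zeros(1, size(Y, 2));   % chip effects

maxIter = 20;      % assumed iteration cap
tol     = 1e-6;    % assumed convergence tolerance

for iter = 1:maxIter
    % Sweep the row (probe) medians out of the residuals.
    rowMed   = median(resid, 2);
    resid    = resid - rowMed;
    probeEff = probeEff + rowMed;
    shift    = median(chipEff);
    chipEff  = chipEff - shift;
    overall  = overall + shift;

    % Sweep the column (chip) medians out of the residuals.
    colMed   = median(resid, 1);
    resid    = resid - colMed;
    chipEff  = chipEff + colMed;
    shift    = median(probeEff);
    probeEff = probeEff - shift;
    overall  = overall + shift;

    % Stop once a full sweep no longer changes the fit noticeably.
    if max(abs(rowMed)) < tol && max(abs(colMed)) < tol
        break
    end
end

% Expression estimate for each chip on the log2 scale.
U = overall + chipEff;
```

At every iteration the decomposition overall + probeEff + chipEff + resid reproduces Y exactly, so the loop only redistributes values between the terms. Medians are used instead of means so that a few outlying probes have limited influence on U.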

Note

There is no column in ExpressionMatrix that contains probe set or gene information.
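Because the matrix holds only numbers, probe set identifiers have to be kept in a separate variable whose order matches the rows. The sketch below shows one way to do this, with a made-up matrix and hypothetical IDs. In practice the IDs would come from the chip's library information, for example the structure returned by celintensityread, which is an assumption to verify against that function's documentation.

```matlab
% Made-up expression matrix: 3 probe sets (rows) by 2 chips (columns).
ExpressionMatrix = [7.1 7.4; 9.8 9.6; 5.2 5.9];

% Hypothetical probe set IDs, one per row, kept in the same order as the rows.
probeSetIDs = {'set_A'; 'set_B'; 'set_C'};

% Look up the expression values of one probe set by its ID.
rowIdx = find(strcmp(probeSetIDs, 'set_B'));
valuesForSetB = ExpressionMatrix(rowIdx, :);
```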

ExpressionMatrix = rmasummary(..., 'PropertyName', PropertyValue, ...) calls rmasummary with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

ExpressionMatrix = rmasummary(ProbeIndices, Data, 'Output', OutputValue) specifies the scale of the returned gene expression values. OutputValue can be:

  • 'log'

  • 'log2'

  • 'log10'

  • 'linear'

  • @functionname

In the last instance, the data is transformed as defined by the function functionname. Default is 'log2'.

Examples

  1. Load a MAT-file, included with the Bioinformatics Toolbox™ software, which contains Affymetrix data variables, including pmMatrix, a matrix of PM probe intensity values from multiple CEL files.

    load prostatecancerrawdata

  2. Perform background adjustment on the PM probe intensity values in the matrix, pmMatrix, using the rmabackadj function, thereby creating a new matrix, BackgroundAdjustedMatrix.

    BackgroundAdjustedMatrix = rmabackadj(pmMatrix);

  3. Normalize the data in BackgroundAdjustedMatrix, using the quantilenorm function.

    NormMatrix = quantilenorm(BackgroundAdjustedMatrix);

  4. Calculate gene expression values from the probe intensities in NormMatrix, creating a new matrix, ExpressionMatrix. (Use the probeIndices column vector provided to supply information on the probe indices.)

    ExpressionMatrix = rmasummary(probeIndices, NormMatrix);

The prostatecancerrawdata.mat file used in the previous example contains data from Best et al., 2005.

References

[1] Irizarry, R.A., Hobbs, B., Collin, F., Beazer-Barclay, Y.D., Antonellis, K.J., Scherf, U., and Speed, T.P. (2003). Exploration, Normalization, and Summaries of High Density Oligonucleotide Array Probe Level Data. Biostatistics 4, 249–264.

[2] Mosteller, F., and Tukey, J. (1977). Data Analysis and Regression (Reading, Massachusetts: Addison-Wesley Publishing Company), pp. 165–202.

[3] Best, C.J.M., Gillespie, J.W., Yi, Y., Chandramouli, G.V.R., Perlmutter, M.A., Gathright, Y., Erickson, H.S., Georgevich, L., Tangrea, M.A., Duray, P.H., Gonzalez, S., Velasco, A., Linehan, W.M., Matusik, R.J., Price, D.K., Figg, W.D., Emmert-Buck, M.R., and Chuaqui, R.F. (2005). Molecular alterations in primary prostate cancer after androgen ablation therapy. Clinical Cancer Research 11, 6823–6834.

Version History

Introduced in R2006a