Microarray Data Analysis Tools

The MATLAB® environment is widely used for microarray data analysis, including reading, filtering, normalizing, and visualizing microarray data. However, the standard normalization and visualization tools that scientists use can be difficult to implement from scratch. The toolbox provides these standard tools as ready-made functions:

Microarray data — Read Affymetrix® GeneChip® files (affyread), ImaGene® results files (imageneread), SPOT files (sptread), and Agilent® microarray scanner files (agferead). Read GenePix® GPR files (gprread) and GAL files (galread). Plot Affymetrix probe set data (probesetplot). Get Gene Expression Omnibus (GEO) data from the Web (getgeodata) and read GEO data from files (geosoftread).

A utility function (magetfield) extracts a data field from the structure returned by one of the microarray reader functions (gprread, agferead, sptread, imageneread).

Microarray normalization and filtering — The toolbox provides a number of methods for normalizing microarray data within a single array, such as lowess normalization (malowess) and mean normalization (manorm), as well as quantile normalization across multiple arrays (quantilenorm). You can use filtering functions to clean raw data before analysis (geneentropyfilter, genelowvalfilter, generangefilter, genevarfilter), and you can calculate the range and variance of expression profiles (exprprofrange, exprprofvar).
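To make the idea of normalizing across multiple arrays concrete, here is a minimal sketch of quantile normalization written in plain Python. It illustrates the general technique only and is not the toolbox's quantilenorm implementation. The function name quantile_normalize, the list-of-lists layout, and the sample values are assumptions made for this example, and ties are broken by position rather than averaged.

```python
from statistics import mean

def quantile_normalize(columns):
    """Quantile-normalize several arrays (hypothetical helper).

    columns: list of equal-length lists, one list of probe values per array.
    Returns new lists in which every array shares the same distribution.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(col[k] for col in sorted_cols) for k in range(n)]

    normalized = []
    for col in columns:
        # Probe indices in ascending order of value (ties broken by position).
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = rank_means[rank]  # replace each value by its rank's mean
        normalized.append(out)
    return normalized

# Made-up intensities for three arrays with four probes each.
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
result = quantile_normalize(arrays)
```

After normalization, every array contains the same set of values and only the assignment of values to probes differs, which is what makes intensities comparable across arrays.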

Microarray visualization — The toolbox contains routines for visualizing microarray data. These routines include spatial plots of microarray data (maimage, redgreencmap), box plots (maboxplot), log-log plots (maloglog), and intensity-ratio plots (mairplot). You can also view clustered expression profiles (clustergram, redgreencmap) and create 2-D scatter plots of principal components from the microarray data (mapcaplot).
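As a rough illustration of the quantities behind an intensity-ratio plot, the following Python sketch computes a log ratio (M) and a mean log intensity (A) for each spot of a two-channel array. It assumes the common M-versus-A convention with base-2 logarithms and does not claim to reproduce the exact scaling that mairplot uses. The function name ma_values and the sample intensities are hypothetical.

```python
import math

def ma_values(red, green):
    """Return (a, m) lists for paired channel intensities (hypothetical helper).

    m[i] = log2(red[i] / green[i])            log ratio of the two channels
    a[i] = 0.5 * log2(red[i] * green[i])      mean log intensity
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of spots")
    a, m = [], []
    for r, g in zip(red, green):
        if r <= 0 or g <= 0:
            # Logarithms are undefined here; filter such spots beforehand.
            raise ValueError("intensities must be positive")
        log_r, log_g = math.log2(r), math.log2(g)
        m.append(log_r - log_g)
        a.append(0.5 * (log_r + log_g))
    return a, m

# Made-up intensities for three spots.
red = [8.0, 4.0, 2.0]
green = [2.0, 4.0, 8.0]
a, m = ma_values(red, green)
```

Plotting m against a gives the scatter that such plots display. Spots with m near zero have similar intensity in both channels, while large positive or negative values indicate a difference between the channels.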

Microarray utility functions — Use the following functions to work with Affymetrix GeneChip data sets. Get library information for a probe (probelibraryinfo), gene information from a probe set (probesetlookup), and probe set values from CEL and CDF information (probesetvalues). Plot probe set values (probesetplot).

The toolbox accesses statistical routines to perform cluster analysis and to visualize the results. You can also view your data through statistical visualizations such as dendrograms and classification and regression trees.