mspalign

Align mass spectra from multiple peak lists from an LC/MS or GC/MS data set

    Description

    [CMZ,AlignedPeaks] = mspalign(Peaklist) aligns mass spectra from multiple peak lists (centroided data). Peaklist is a cell array of peak lists, where each element corresponds to a spectrum or retention time. The function first estimates CMZ, a vector of common mass/charge (m/z) values, by considering the peaks in all spectra in Peaklist. It then aligns the peaks in each spectrum to the values in CMZ, creating AlignedPeaks, a cell array of aligned peak lists.

    [CMZ,AlignedPeaks] = mspalign(Peaklist,Name,Value) uses additional options specified by one or more Name,Value pair arguments.

    Examples

    1. Load a MAT-file, included with the Bioinformatics Toolbox™ software, which contains liquid chromatography/mass spectrometry (LC/MS) data variables, including ms_peaks and ret_time. ms_peaks is a cell array of peak lists, where each element is a two-column matrix of m/z values and ion intensity values, and each element corresponds to a spectrum or retention time. ret_time is a column vector of retention times associated with the LC/MS data set.

      load lcmsdata

    2. Resample the unaligned data, display it in a heat map, and then overlay a dot plot.

      [MZ,Y] = msppresample(ms_peaks,5000);
      msheatmap(MZ,ret_time,log(Y))

      msdotplot(ms_peaks,ret_time)

    3. Click the Zoom In button, and then click the dot plot two or three times to zoom in and see how the dots representing peaks overlay the heat map image.

    4. Align the peak lists from the mass spectra using the default estimation and correction methods.

      [CMZ, aligned_peaks] = mspalign(ms_peaks);

    5. Resample the aligned data, display it in a heat map, and then overlay a dot plot.

      [MZ2,Y2] = msppresample(aligned_peaks,5000);
      msheatmap(MZ2,ret_time,log(Y2))

      msdotplot(aligned_peaks,ret_time)

    6. Link the axes of the two heat maps, and then zoom in to compare the unaligned and aligned LC/MS data sets in detail.

      linkaxes(findobj(0,'Tag','MSHeatMap'))
      axis([480 532 375 485])
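
    As a rough numerical complement to the visual comparison, the following sketch counts the distinct m/z values across all peak lists before and after alignment. It assumes that lcmsdata provides ms_peaks as in the steps above. The names mzBefore, mzAfter, nBefore, and nAfter are introduced here for illustration only.

```matlab
load lcmsdata                                   % assumed to provide ms_peaks and ret_time
[CMZ, aligned_peaks] = mspalign(ms_peaks);

% Collect the m/z column (first column) of every peak list.
mzBefore = cellfun(@(p) p(:,1), ms_peaks,      'UniformOutput', false);
mzAfter  = cellfun(@(p) p(:,1), aligned_peaks, 'UniformOutput', false);

% Count the distinct m/z values across all spectra.
nBefore = numel(unique(vertcat(mzBefore{:})));
nAfter  = numel(unique(vertcat(mzAfter{:})));

fprintf('Distinct m/z values before alignment: %d\n', nBefore);
fprintf('Distinct m/z values after alignment:  %d\n', nAfter);
fprintf('Number of common m/z values in CMZ:   %d\n', numel(CMZ));
```

    A large drop in the number of distinct m/z values would suggest that peaks from different spectra were mapped onto shared CMZ values.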

    Input Arguments

    Peaklist

    Cell array of peak lists from a liquid chromatography/mass spectrometry (LC/MS) or gas chromatography/mass spectrometry (GC/MS) data set. Each element in the cell array is a two-column matrix with m/z values in the first column and ion intensity values in the second column. Each element corresponds to a spectrum or retention time.

    Note

    You can use the mzxml2peaks function or the mspeaks function to create the Peaklist cell array.
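
    One possible workflow for building Peaklist from an mzXML file is sketched below. The file name results.mzxml is hypothetical, and the sketch assumes the file holds centroided data. The guard around the calls only keeps the snippet from erroring when the file is absent.

```matlab
% 'results.mzxml' is a hypothetical file name; substitute your own mzXML file.
mzxmlFile = 'results.mzxml';

if exist(mzxmlFile, 'file')
    % Read the mzXML file into a MATLAB structure.
    mzXMLStruct = mzxmlread(mzxmlFile);

    % Extract one peak list per scan, plus the retention time of each scan.
    [Peaklist, Times] = mzxml2peaks(mzXMLStruct);

    % Align the extracted peak lists to a common m/z vector.
    [CMZ, AlignedPeaks] = mspalign(Peaklist);
else
    disp('mzXML file not found; skipping alignment.')
end
```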

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: 'CorrectionMethod','shortest-path' specifies to align peaks using the shortest path algorithm.

    Quantile

    Determine which peaks are selected by the estimation method to create CMZ, the vector of common m/z values.

    Example: 'Quantile',0.5
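
    To see how this option affects the estimate on a given data set, one approach is to run the alignment with two settings and compare the length of the resulting CMZ vectors. The sketch below assumes the lcmsdata sample file provides ms_peaks. The output names CMZ_a, CMZ_b, aligned_a, and aligned_b are illustrative.

```matlab
load lcmsdata   % assumed to provide ms_peaks, as in the Examples section

% Estimate CMZ with two different quantile settings, suppressing the assessment plot.
[CMZ_a, aligned_a] = mspalign(ms_peaks, 'Quantile', 0.5,  'ShowEstimation', false);
[CMZ_b, aligned_b] = mspalign(ms_peaks, 'Quantile', 0.95, 'ShowEstimation', false);

% Compare how many common m/z values each setting yields.
fprintf('Quantile 0.50: %d common m/z values\n', numel(CMZ_a));
fprintf('Quantile 0.95: %d common m/z values\n', numel(CMZ_b));
```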

    EstimationMethod

    Specify the method used to estimate CMZ, the vector of common mass/charge (m/z) values.

    Option        Description
    histogram     Peak locations are clustered using a kernel density estimation approach. The peak ion intensity is used as a weighting factor. The centers of the clusters form the CMZ vector.
    regression    Takes a sample of the distances between observed significant peaks and regresses the inter-peak distance to create the CMZ vector with similar inter-element distances.

    Example: 'EstimationMethod','histogram'
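
    The idea behind the histogram option can be illustrated with a small, self-contained sketch in base MATLAB. This is not the toolbox implementation. It assumes a Gaussian kernel, a fixed bandwidth h, a regular m/z grid, and synthetic peak data, all chosen here for illustration. It also relies on implicit expansion, which requires R2016b or later.

```matlab
% Synthetic pooled peaks from several spectra (hypothetical values).
mz        = [499.97; 500.02; 500.00; 750.04; 750.01; 749.98];   % m/z values
intensity = [12; 10; 11; 18; 20; 22];                           % ion intensities

mzGrid = (499:0.01:751).';   % evaluation grid (assumed resolution)
h      = 0.05;               % kernel bandwidth (assumed)

% Intensity-weighted Gaussian kernel density evaluated on the grid.
density = exp(-0.5 * ((mzGrid - mz.') / h).^2) * intensity;

% Local maxima of the density serve as cluster centers.
inner = density(2:end-1);
isMax = [false; inner > density(1:end-2) & inner >= density(3:end); false];

% Keep only maxima that rise clearly above the background.
cmzSketch = mzGrid(isMax & density > 0.01 * max(density));
disp(cmzSketch)
```

    With these inputs, each group of nearby peaks collapses to a single center, and more intense peaks pull the center toward their own m/z value.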

    CorrectionMethod

    Specify the method used to align each peak list to the CMZ vector.

    Option              Description
    nearest-neighbor    For each common peak in the CMZ vector, its counterpart in each peak list is the peak that is closest to the common peak's m/z value.
    shortest-path       For each common peak in the CMZ vector, its counterpart in each peak list is selected using the shortest path algorithm.

    Example: 'CorrectionMethod','nearest-neighbor'
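
    The nearest-neighbor rule can be sketched in a few lines of base MATLAB. The snippet below is an illustration under stated assumptions, not the toolbox implementation: the values of cmz and peaks are synthetic, and the code relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical inputs for illustration only.
cmz   = [500; 750; 1000];                        % common m/z values
peaks = [499.97 12; 750.04 18; 1000.03 14];      % one peak list: [m/z, intensity]

% Distance from every observed peak (rows) to every common m/z value (columns);
% the row index of the minimum in each column is the nearest observed peak.
[~, idx] = min(abs(peaks(:,1) - cmz.'), [], 1);

% Replace the m/z value of each matched peak with the common value, keeping its intensity.
aligned = [cmz, peaks(idx, 2)];
disp(aligned)
```

    Here each observed peak lies within 0.04 of exactly one common value, so the three peaks map to 500, 750, and 1000 with their original intensities.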

    ShowEstimation

    Control the display of an assessment plot relative to the estimation method and the estimated vector of common mass/charge (m/z) values. The default is false when return values are specified and true when return values are not specified.

    Example: 'ShowEstimation', true

    Output Arguments

    CMZ

    Vector of common mass/charge (m/z) values estimated by the mspalign function.

    AlignedPeaks

    Cell array of peak lists, with the same form as Peaklist, but with corrected m/z values in the first column of each matrix.

    References

    [1] Jeffries, N. (2005) Algorithms for alignment of mass spectrometry proteomic data. Bioinformatics 21:14, 3066–3073.

    [2] Purvine, S., Kolker, N., and Kolker, E. (2004) Spectral Quality Assessment for High-Throughput Tandem Mass Spectrometry Proteomics. OMICS: A Journal of Integrative Biology 8:3, 255–265.

    Version History

    Introduced in R2007a