A Markov-chain-based regression model for the analysis of high-resolution enzymatically 18O-labeled mass spectra
- Speaker(s)
- Dirk Valkenborg
- Affiliation
- Vision on technology - vito.be
- Date
- 20 February 2013, 10:15
- Room
- Room 5820
- Seminar
- Seminar "Computational Biology and Bioinformatics" (Biologia obliczeniowa i bioinformatyka)
To reduce the variability in the data and facilitate statistical analysis, stable-isotope coding is often used, so that peptides from distinct groups can be pooled and analyzed simultaneously on the mass spectrometer. In other words, protein information from different conditions appears together in a single mass spectrum and is affected by the same machine variability, which makes a direct comparison possible.
A powerful and relatively new technique for stable-isotope coding is enzymatic labeling with the heavy oxygen isotope 18O (mass number 18, ten neutrons) instead of 16O (mass number 16, eight neutrons), the isotope most commonly observed in nature. In this setting, oxygen atoms at the carboxyl terminus of peptides are replaced with oxygen atoms from heavy-oxygen water during an enzymatic reaction with trypsin. Ideally, the series of peptide peaks (i.e. the isotopic distribution) of a labeled peptide shifts 4 daltons (Da) to the right in the mass spectrum due to the incorporation of two heavy-oxygen isotopes. However, peptide-specific oxygen incorporation rates and isotopic impurities in the heavy-oxygen water yield various isotope combinations at the carboxyl terminus. As a consequence, the series of peptide peaks is not shifted by exactly 4 Da but, depending on which oxygen isotopes are introduced at the carboxyl terminus, by 0, 1, 2, 3 or 4 Da. Multiple peaks of the labeled peptide are therefore superimposed on the peaks of the unlabeled peptide. This additional overlap complicates the estimation of the relative abundance and can bias the estimate. To arrive at a correct estimate of the relative abundance, the overlap needs to be taken into account in the analysis.
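The overlap described above can be illustrated with a small numerical sketch. The Python code below superimposes an unlabeled isotopic distribution with copies of itself shifted by 0 to 4 Da. The function name `superimpose`, the peak heights, and the shift fractions are all hypothetical values chosen for illustration; they are not taken from the talk or from any dataset.

```python
def superimpose(iso, shift_weights, ratio):
    """Overlay an unlabeled peptide pattern with its labeled counterpart.

    iso           -- peak heights of the unlabeled isotopic distribution,
                     one entry per 1 Da step (hypothetical values below)
    shift_weights -- fractions of labeled peptide shifted by 0, 1, 2, 3, 4 Da
    ratio         -- abundance of labeled relative to unlabeled peptide
    """
    n_peaks = len(iso) + len(shift_weights) - 1
    observed = [0.0] * n_peaks
    for i, height in enumerate(iso):
        observed[i] += height  # contribution of the unlabeled peptide
        for shift, weight in enumerate(shift_weights):
            # labeled peptide: same pattern, moved right by `shift` Da
            observed[i + shift] += ratio * weight * height
    return observed


# Hypothetical isotopic distribution of a small peptide.
iso = [0.55, 0.30, 0.11, 0.04]

# Ideal labeling: every labeled peptide carries two 18O atoms (+4 Da).
ideal_shifts = [0.0, 0.0, 0.0, 0.0, 1.0]
# Incomplete labeling with impurities: assumed fractions, for illustration.
mixed_shifts = [0.02, 0.01, 0.15, 0.02, 0.80]

ideal = superimpose(iso, ideal_shifts, ratio=1.0)
mixed = superimpose(iso, mixed_shifts, ratio=1.0)

for mass_offset, (a, b) in enumerate(zip(ideal, mixed)):
    print(f"+{mass_offset} Da  ideal={a:.4f}  mixed={b:.4f}")
```

In the ideal case the two patterns sit side by side and the abundance ratio can be read from the peak heights. With the mixed shifts, labeled peptide leaks into the peaks of the unlabeled one, so a naive ratio of the two peak groups would be biased.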
Therefore, we propose a new method, similar in spirit to that developed by Eckel-Passow et al. (2006), which estimates the isotopic distribution, peptide-specific incorporation rate, and the abundance ratio directly from the observed mixture of peptide peaks. The method is implemented in MATLAB as a non-linear regression model. The performance of the method is illustrated using real-life datasets.
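To give a feel for what estimating these quantities from the observed peaks can look like, here is a deliberately simplified Python sketch. It is not the MATLAB model from the talk. It assumes that the isotopic distribution is already known, that each of the two carboxyl-terminal oxygens is independently replaced by pure 18O with probability `p` (a binomial stand-in for the Markov-chain model, giving shifts of 0, 2 and 4 Da only), and it uses a plain grid search over `p` with unconstrained least squares for the two amplitudes. All names and numbers are hypothetical.

```python
def labeled_pattern(iso, p):
    """Labeled-peptide pattern under the assumed binomial model."""
    # Fractions shifted by 0, 1, 2, 3, 4 Da when each of two oxygens is
    # exchanged for 18O with probability p (no 17O impurity assumed).
    weights = [(1 - p) ** 2, 0.0, 2 * p * (1 - p), 0.0, p ** 2]
    pattern = [0.0] * (len(iso) + 4)
    for i, height in enumerate(iso):
        for shift, weight in enumerate(weights):
            pattern[i + shift] += weight * height
    return pattern


def fit(observed, iso, steps=1000):
    """Grid search over p; least squares for the two amplitudes.

    Returns (p, unlabeled_amplitude, labeled_amplitude).
    """
    u = list(iso) + [0.0] * 4  # unlabeled pattern, padded to full length
    uu = sum(x * x for x in u)
    uo = sum(x * o for x, o in zip(u, observed))
    best = None
    for k in range(steps + 1):
        p = k / steps
        l = labeled_pattern(iso, p)
        ul = sum(x * y for x, y in zip(u, l))
        ll = sum(y * y for y in l)
        lo = sum(y * o for y, o in zip(l, observed))
        det = uu * ll - ul * ul
        if det < 1e-12:
            continue  # patterns coincide (p near 0): amplitudes not identifiable
        a = (uo * ll - lo * ul) / det  # amplitude of the unlabeled peptide
        b = (lo * uu - uo * ul) / det  # amplitude of the labeled peptide
        sse = sum((o - a * x - b * y) ** 2 for o, x, y in zip(observed, u, l))
        if best is None or sse < best[0]:
            best = (sse, p, a, b)
    return best[1], best[2], best[3]


# Simulated, noise-free spectrum with known ground truth (hypothetical values).
iso = [0.55, 0.30, 0.11, 0.04]
truth = labeled_pattern(iso, 0.9)
observed = [3.0 * x + 6.0 * y for x, y in zip(iso + [0.0] * 4, truth)]

p_hat, a_hat, b_hat = fit(observed, iso)
print("incorporation probability:", p_hat)
print("abundance ratio labeled/unlabeled:", b_hat / a_hat)
```

A joint non-linear regression, as described in the abstract, would additionally estimate the isotopic distribution itself and would typically replace the grid search with an iterative optimizer; the sketch only shows how the overlap can be accounted for when recovering the ratio.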