Posted on 2017-07-18, 00:00
Authored by Albert Y. Xue, Lindsey C. Szymczak, Milan Mrksich, Neda Bagheri
Emerging peptide array technologies can profile molecular
activities within cell lysates. However, the structural diversity
of peptides leads to inherent differences in peptide signal-to-noise
ratios (S/N). These complex effects can yield unrepresentative signal
intensities and bias subsequent analyses. Within mass spectrometry-based peptide technologies,
the relation between a peptide’s amino acid sequence and S/N remains largely nonquantitative. To
address this challenge, we present a method to quantify and analyze
mass spectrometry S/N of two peptide
arrays, and we use this analysis to characterize data quality and to
design future arrays for SAMDI mass spectrometry. Our study demonstrates
that S/N varies significantly across peptides within an array, and that
this variation is attributable to single amino acid differences. We apply
supervised machine learning to predict peptide S/N from amino acid sequence, and identify
specific physical properties of the amino acids that govern variation
of this metric. We find low peptide–S/N concordance between arrays, demonstrating that different
arrays require individual characterization and that global peptide–S/N relationships are difficult to identify.
However, this study illustrates how, with proper peptide sampling,
machine learning can accurately predict the S/N of a peptide in an array, allowing for the efficient design
of arrays through selection of high-S/N peptides.
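
The idea of predicting S/N from sequence and then selecting high-S/N peptides can be sketched as below. This is a minimal illustration, not the model used in the study: it swaps in a naive additive baseline in which each (position, residue) pair contributes a fixed offset to the array-wide mean S/N. The function names, the four-residue peptides, and the S/N values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict
from statistics import mean


def fit_additive_model(peptides, sn_values):
    """Estimate a per-(position, residue) offset from the global mean S/N.

    Assumption: all peptides come from one array, so the fitted offsets
    are array-specific and should not be reused across arrays.
    """
    global_mean = mean(sn_values)
    grouped = defaultdict(list)
    for seq, sn in zip(peptides, sn_values):
        for pos, residue in enumerate(seq):
            grouped[(pos, residue)].append(sn - global_mean)
    offsets = {key: mean(vals) for key, vals in grouped.items()}
    return global_mean, offsets


def predict_sn(seq, global_mean, offsets):
    """Predict S/N as the global mean plus the offsets of each residue."""
    # (position, residue) pairs never seen in training contribute nothing.
    return global_mean + sum(
        offsets.get((pos, residue), 0.0) for pos, residue in enumerate(seq)
    )


def rank_by_predicted_sn(candidates, global_mean, offsets):
    """Order candidate peptides from highest to lowest predicted S/N."""
    return sorted(
        candidates,
        key=lambda seq: predict_sn(seq, global_mean, offsets),
        reverse=True,
    )


# Hypothetical toy data: sequences vary at positions 1 and 3 only.
train_peptides = ["GAKR", "GRKR", "GAKA", "GRKA"]
train_sn = [10.0, 20.0, 4.0, 14.0]

global_mean, offsets = fit_additive_model(train_peptides, train_sn)
ranked = rank_by_predicted_sn(["GAKA", "GRKR"], global_mean, offsets)
```

Because the offsets are learned from a single array's measurements, a sketch like this would need to be refit for each array, which is consistent with the low peptide–S/N concordance between arrays reported above.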