Representing and Comparing Site-Specific Glycan Abundance Distributions of Glycoproteins

journal contribution
posted on 30.07.2021, 10:36 by Concepcion A. Remoroza, Meghan C. Burke, Yi Liu, Yuri A. Mirokhin, Dmitrii V. Tchekhovskoi, Xiaoyu Yang, Stephen E. Stein
A method for representing and comparing distributions of N-linked glycans located at specific sites on proteins is presented. The representation takes the form of a simple mass spectrum for a given peptide sequence, with each peak corresponding to a different glycopeptide. The position of each peak is the glycan mass (in place of m/z), and the peak's abundance is the relative abundance of that glycopeptide in the electrospray MS1 spectrum. This provides a facile means of representing all identifiable glycopeptides arising from a single protein “sequon” on a specific sequence, so that these distributions can be compared and searched as is routinely done for mass spectra. Likewise, these reference glycopeptide abundance distribution spectra (GADS) can be stored in searchable libraries. A set of such libraries created from available data is provided along with an adapted version of the widely used NIST-MS library-search software. Since GADS contain only MS1 abundances and identifications, they are equally suitable for expressing collision-induced fragmentation and electron-transfer dissociation determinations of glycopeptide identity. Comparisons of GADS for N-glycosylated sites on several proteins, especially the SARS-CoV-2 spike protein, demonstrate the potential reproducibility of GADS and their utility for comparing site-specific distributions.
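The idea of treating a site-specific glycan distribution as a spectrum can be illustrated with a minimal Python sketch. It is not the authors' implementation and not the scoring used by the NIST library-search software: the dictionary layout, the function names `bin_peaks` and `cosine_score`, the 0.1 Da binning, and the choice of cosine similarity are all assumptions made for illustration, and the peak values are invented rather than measured.

```python
import math

def bin_peaks(peaks, decimals=1):
    """Group peaks by rounded glycan mass, summing abundances that share a bin.

    `peaks` maps glycan mass (Da) to relative MS1 abundance. The binning
    width is an assumption for this sketch, not a value from the paper.
    """
    binned = {}
    for mass, abundance in peaks.items():
        key = round(mass, decimals)
        binned[key] = binned.get(key, 0.0) + abundance
    return binned

def cosine_score(gads_a, gads_b, decimals=1):
    """Cosine similarity between two GADS-like peak lists (0.0 to 1.0).

    Cosine similarity is a generic spectrum-comparison measure used here
    as a stand-in; the actual library-search score may differ.
    """
    a = bin_peaks(gads_a, decimals)
    b = bin_peaks(gads_b, decimals)
    # Only masses present in both distributions contribute to the dot product.
    dot = sum(a[m] * b[m] for m in a.keys() & b.keys())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical distributions for one sequon in two runs (illustrative values only).
site_run1 = {1216.42: 100.0, 1378.48: 45.0, 1540.53: 10.0}
site_run2 = {1216.42: 80.0, 1378.48: 40.0, 1540.53: 5.0}

print(f"similarity: {cosine_score(site_run1, site_run2):.3f}")
```

Because the score depends only on the relative peak heights, two runs that differ by an overall intensity factor compare as identical, which fits a representation built from relative abundances.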