New Glycoproteomics Software, GlycoPep Evaluator,
Generates Decoy Glycopeptides de Novo and Enables Accurate False Discovery
Rate Analysis for Small Data Sets
posted on 2015-12-17, 04:30 authored by Zhikai Zhu, Xiaomeng Su, Eden P. Go, Heather Desaire
Glycoproteins
are biologically significant large molecules that
participate in numerous cellular activities. In order to obtain site-specific
protein glycosylation information, intact glycopeptides, with the
glycan attached to the peptide sequence, are characterized by tandem
mass spectrometry (MS/MS) methods such as collision-induced dissociation
(CID) and electron transfer dissociation (ETD). Although several automated
tools are emerging, the field has no consensus on the best way to
determine the reliability of these tools or to report
the false discovery rate (FDR). A common approach to calculating
FDRs for glycopeptide analysis, adopted from the target-decoy strategy
in proteomics, employs a decoy database that is created based on the
target protein sequence database. Nonetheless, this approach is not
optimal for measuring the confidence of N-linked glycopeptide
matches, because glycopeptide data sets are considerably smaller
than peptide data sets, and the requirement of a consensus sequence
for N-glycosylation further limits the number of
possible decoy glycopeptides tested in a database search. To address
the need to accurately determine FDRs for automated glycopeptide assignments,
we developed GlycoPep Evaluator (GPE), a tool that helps to measure
FDRs in identifying glycopeptides without using a decoy database.
GPE generates decoy glycopeptides de novo for every target glycopeptide,
in a 1:20 target-to-decoy ratio. The decoys and the target glycopeptides
are scored against the ETD data; FDRs for small data sets can then be
calculated accurately from the number of decoy matches and the
target-to-decoy ratio. GPE is freely available
for download and can work with any search engine that interprets ETD
data of N-linked glycopeptides. The software is provided
at https://desairegroup.ku.edu/research.
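
The FDR calculation described above can be illustrated with a minimal sketch. It is not GPE's implementation: the function name `estimate_fdr`, its parameters, and the exact formula are assumptions made for illustration. The sketch assumes that an incorrect match is equally likely to land on any candidate, so with 20 decoys per target the expected number of false target matches is the number of decoy matches divided by 20.

```python
def estimate_fdr(target_hits: int, decoy_hits: int, decoys_per_target: int = 20) -> float:
    """Estimate the FDR among target matches from decoy matches.

    Hypothetical helper, not part of GPE. Assumes random (false) matches are
    spread evenly over all candidates, so each decoy match implies
    1 / decoys_per_target expected false target matches.
    """
    if decoys_per_target <= 0:
        raise ValueError("decoys_per_target must be positive")
    if target_hits < 0 or decoy_hits < 0:
        raise ValueError("hit counts must be non-negative")
    if target_hits == 0:
        # No accepted target matches, so there is nothing to be false.
        return 0.0

    # Scale decoy matches down by the 1:20 target-to-decoy ratio.
    expected_false_targets = decoy_hits / decoys_per_target
    # Cap at 1.0 because a rate cannot exceed 100%.
    return min(1.0, expected_false_targets / target_hits)


if __name__ == "__main__":
    # Illustrative numbers only: 50 target matches and 20 decoy matches.
    print(f"Estimated FDR: {estimate_fdr(50, 20):.2%}")
```

Because each target is accompanied by 20 decoys, a small data set still yields enough decoy candidates for the estimate to be meaningful, which is the motivation the abstract gives for generating decoys de novo rather than relying on a decoy database.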