Posted on 2023-04-26, 16:06. Authored by Oscar M. Camacho, Kerry A. Ramsbottom, Andrew Collins, Andrew R. Jones
Phosphorylation is a post-translational modification of great interest
to researchers due to its relevance in many biological processes.
LC-MS/MS techniques have enabled high-throughput data acquisition,
with studies claiming identification and localization of thousands
of phosphosites. The identification and localization of phosphosites
emerge from different analytical pipelines and scoring algorithms,
with uncertainty embedded throughout the pipeline. For many pipelines
and algorithms, arbitrary thresholding is used, but little is known
about the actual global false localization rate in these studies.
Recently, decoy amino acids have been proposed as a way to estimate
global false localization rates of phosphosites among the peptide–spectrum
matches reported. Here, we describe a simple pipeline aiming to maximize
the information extracted from these studies by objectively collapsing
from peptide–spectrum match to the peptidoform-site level,
as well as combining findings from multiple studies while keeping
track of false localization rates. We show that the approach is more
effective than current processes that use a simpler mechanism for
handling phosphosite identification redundancy within and across studies.
In our case study using eight rice phosphoproteomics data sets, 6368
unique sites were confidently identified using our decoy approach
compared to 4687 using traditional thresholding in which false localization
rates are unknown.
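
To illustrate the general idea of a decoy-based false localization rate (FLR), the following is a minimal sketch and not the authors' pipeline. It assumes each site-level match carries a localization score and a flag saying whether the modified residue is a decoy amino acid. The names `SiteMatch`, `estimate_flr`, and the `target_to_decoy_ratio` correction factor are hypothetical; the exact formula used in the study is not given in this abstract.

```python
from dataclasses import dataclass


@dataclass
class SiteMatch:
    site: str        # hypothetical identifier, e.g. "PROT1:S15"
    score: float     # localization score; higher means more confident
    is_decoy: bool   # True if the modified residue is the decoy amino acid


def estimate_flr(matches, target_to_decoy_ratio=1.0):
    """Return (match, estimated FLR) pairs, ranked by descending score.

    Assumption: each decoy hit stands for `target_to_decoy_ratio` false
    localizations among the target hits ranked at or above it.
    """
    ranked = sorted(matches, key=lambda m: m.score, reverse=True)
    flr = []
    decoys = 0
    targets = 0
    for m in ranked:
        if m.is_decoy:
            decoys += 1
        else:
            targets += 1
        # Running estimate: scaled decoy count over target count so far.
        estimate = decoys * target_to_decoy_ratio / max(targets, 1)
        flr.append(min(estimate, 1.0))
    # Make the estimate non-decreasing down the ranking (q-value style).
    for i in range(len(flr) - 2, -1, -1):
        flr[i] = min(flr[i], flr[i + 1])
    return list(zip(ranked, flr))


example = [
    SiteMatch("P1:S10", 0.99, False),
    SiteMatch("P1:T42", 0.95, False),
    SiteMatch("P2:A7", 0.90, True),
    SiteMatch("P3:Y3", 0.80, False),
]
results = estimate_flr(example)
```

Under these assumptions, a global FLR threshold (for example 1% or 5%) can be applied to the ranked list in place of a fixed score cutoff, so the expected proportion of false localizations among the accepted sites is estimated rather than unknown.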