Posted on 2024-12-07, 14:42. Authored by Elloise Coyle, Mickaël Leclercq, Clarisse Gotti, Florence Roux-Dalvai, Arnaud Droit
In targeted proteomics utilizing Selected Reaction Monitoring (SRM),
the precise detection of specific peptides within complex mixtures
remains a significant challenge, particularly due to noise and interference
in chromatograms. Existing methodologies, such as isotopic labeling
and scoring algorithms, offer partial solutions but are constrained
by high run times and elevated false discovery rates. To address these
limitations, we have developed ProPickML, a machine learning-based
tool designed to accurately identify peptide peaks across diverse
data sets, independent of the assumed presence of the peptide. This
model was trained on a manually labeled data set and subsequently
validated to assess its predictive accuracy. The results demonstrate
that the model reliably identifies peptide peaks in the presence of
noise, achieving a Matthews correlation coefficient (MCC) of 0.81
on an independent test data set, surpassing mProphet’s MCC
of 0.71. Implemented in R as ProPickML, this tool offers a competitive,
cost-effective alternative to existing techniques, significantly reducing
reliance on isotopic labeling and enhancing the accuracy of peptide
identification in SRM workflows.
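
For readers unfamiliar with the metric, the Matthews correlation coefficient quoted above is computed from the four confusion-matrix counts (true/false positives and negatives). The following is a minimal sketch in Python, not ProPickML's own R code; the function name `mcc_from_counts` and the example counts are hypothetical and are not results from the study.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when a row or column of the confusion
    # matrix is empty, which makes the denominator zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, for illustration only (not data from the study).
example_mcc = mcc_from_counts(tp=90, tn=85, fp=15, fn=10)
print(f"MCC = {example_mcc:.2f}")
```

An MCC of 1 indicates perfect agreement between predicted and true peak labels, 0 indicates performance no better than chance, and -1 indicates complete disagreement. Because it uses all four counts, the metric remains informative when true peaks and noise are unevenly represented.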