Posted on 2023-10-31, 18:15; authored by Binjun Yan, Mengtian Shi, Siyu Cai, Yuan Su, Renhui Chen, Chiyuan Huang, David Da Yong Chen
Proteomics reveals the molecular basis of biology and disease, and
liquid chromatography–tandem mass spectrometry (LC–MS/MS)
is a widely used platform for bottom-up proteomics. Data-independent
acquisition (DIA) improves the run-to-run reproducibility of LC–MS/MS
in proteomics research. However, existing DIA data processing
tools sometimes report peptide and protein quantities that deviate
substantially from the true values. Peak-picking errors and incorrect
ion selection are the two main causes of these deviations. We present
a cross-run ion selection and peak-picking (CRISP) tool that exploits
the run-to-run consistency of DIA: it examines the DIA data from the
whole set of runs simultaneously to filter out interfering signals,
instead of looking at only a single run at a time. Eight datasets
acquired by mass spectrometers from different
vendors with different types of mass analyzers were used to benchmark
CRISP-DIA against other currently available DIA tools. In the
benchmark datasets, for analytes with large content variation among
samples, CRISP-DIA generally yielded a 20 to 50% relative decrease
in error rates compared to other DIA tools, at both the peptide precursor
level and the protein level. CRISP-DIA also detected differentially expressed
proteins more efficiently, in some cases with 3.3 to 90.3% increases in the
number of true positives and 12.3 to 35.3% decreases in the false positive
rate. In the real biological datasets, CRISP-DIA produced
more consistent quantification results. These results demonstrate
the advantages of combining DIA data from multiple runs for quantitative
proteomics, which can significantly improve quantification
accuracy.
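
The cross-run idea described above can be illustrated with a minimal sketch. This is not the CRISP-DIA implementation; it only assumes that fragment ions free of interference keep the same relative intensity profile across runs, so an ion whose profile disagrees with the consensus of its sibling ions is likely contaminated in at least one run. The function names, the ion labels, the example peak areas, and the 0.9 correlation threshold are all hypothetical.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_consistent_ions(intensities, min_corr=0.9):
    """Keep fragment ions whose cross-run profile tracks the consensus profile.

    intensities: dict mapping a fragment-ion label to a list of peak areas,
    one value per run (same run order for every ion).
    min_corr: hypothetical cutoff, not a value taken from the paper.
    """
    names = list(intensities)
    n_runs = len(intensities[names[0]])

    # Scale each ion by its own mean so bright and faint ions are comparable.
    scaled = {}
    for name in names:
        values = intensities[name]
        mean = sum(values) / len(values)
        scaled[name] = [v / mean for v in values] if mean > 0 else [0.0] * n_runs

    # Consensus profile: per-run median across ions, robust to a minority of outliers.
    consensus = [median(scaled[name][r] for name in names) for r in range(n_runs)]

    return [name for name in names if pearson(scaled[name], consensus) >= min_corr]


# Made-up peak areas for one precursor measured in four runs.
# Ion "b3" carries an interfering signal in the first run only.
example = {
    "y4": [100, 200, 400, 800],
    "y5": [52, 98, 205, 395],
    "y6": [30, 61, 119, 242],
    "b3": [500, 40, 80, 160],
}
kept = select_consistent_ions(example)
print(kept)
```

Looking at the first run alone, "b3" would simply appear to be the most intense fragment. Only the comparison across all four runs shows that its profile departs from the other three ions, which is the kind of information a single-run analysis cannot use.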