Posted on 2017-04-24, 00:00. Authored by Saer Samanipour, Malcolm J. Reid, Kevin V. Thomas
Liquid
chromatography coupled to high resolution mass spectrometry
(LC-HR-MS) has been one of the main analytical tools for the analysis
of small polar organic pollutants in the environment. LC-HR-MS typically
produces a large amount of data for a single chromatogram. The analyst
is therefore required to perform prioritization prior to nontarget
structural elucidation. In the present study, we have combined the
F-ratio statistical variable selection and the apex detection algorithms
in order to perform prioritization in data sets produced via LC-HR-MS.
The approach was validated through the use of semisynthetic data,
which combined real environmental data with the artificially
added signals of 31 alkanes in that sample. We evaluated the performance
of this method as a function of four false detection probabilities,
namely: 0.01, 0.02, 0.05, and 0.1%. We generated 100 different semisynthetic
data sets for each F-ratio and evaluated each data set using this
method. This experimental design created a population of 30 000
true positives and 32 000 true negatives for each F-ratio,
which was considered sufficiently large to validate
this method for analysis of LC-HR-MS data. The effects of both the
F-ratio and signal-to-noise ratio (S/N) on the performance of the suggested approach were evaluated through
normalized statistical tests. We also compared this method to the
pixel-by-pixel as well as peak list approaches. More than 92% of features
present in the final feature list via the F-ratio method were also
present in the conventional peak list generated by MZmine. However,
this method was the only approach successful in the classification
of samples, and thus prioritization, when compared to the other evaluated
approaches. The application potential and limitations of the suggested
method are discussed.
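
As a rough illustration of the F-ratio variable selection step named above, the sketch below computes a one-way ANOVA F-ratio for each variable across sample classes and keeps the variables that exceed a critical value. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `f_ratio` and `select_variables`, the variable labels, and the `f_critical` parameter are hypothetical, and in practice the critical value would be derived from the F distribution at the chosen false detection probability rather than supplied by hand.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F-ratio for a single variable.

    groups: list of lists, one inner list of replicate intensities per class.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-class mean square: spread of the class means around the grand mean.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    # Within-class mean square: spread of replicates around their own class mean.
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n_total - k)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def select_variables(data, f_critical):
    """Return labels of variables whose F-ratio exceeds f_critical.

    data: dict mapping a variable label (hypothetical, e.g. an m/z and
    retention-time pair) to its per-class replicate lists.
    f_critical: assumed threshold, supplied by the caller.
    """
    return [name for name, groups in data.items() if f_ratio(groups) > f_critical]


# Hypothetical example: two classes, three replicates each.
example = {
    "mz301_rt5.2": [[1, 2, 3], [7, 8, 9]],   # differs strongly between classes
    "mz150_rt2.1": [[1, 2, 3], [2, 1, 3]],   # no difference between classes
}
selected = select_variables(example, f_critical=10.0)
```

Only variables whose between-class variance is large relative to their within-class variance pass the threshold, which is the sense in which the F-ratio reduces a large data set to a shorter list for prioritization.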