Posted on 2017-02-21, 00:00. Authored by Leslie Myint, Andre Kleensang, Liang Zhao, Thomas Hartung, Kasper D. Hansen
As mass spectrometry-based metabolomics
becomes more widely used in biomedical research, it is important to
revisit existing data analysis paradigms. Existing data preprocessing
efforts have largely focused on methods that first extract
features separately from each sample and then attempt
to group features across samples to facilitate comparisons. We show
that this preprocessing approach leads to unnecessary variability
in peak quantifications that adversely impacts downstream analysis.
We present a new method, bakedpi, for the preprocessing of both centroid
and profile mode metabolomics data that relies on an intensity-weighted
bivariate kernel density estimation on a pooling of all samples to
detect peaks. The method reduces this quantification
variability and increases power in downstream differential analysis.
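
To make the pooled density idea concrete, the following is a minimal sketch of an intensity-weighted bivariate kernel density estimate over data pooled from several samples. It is not the bakedpi implementation: the function name `weighted_kde_2d`, the Gaussian product kernel, the bandwidths, the grid, and the synthetic data are all assumptions chosen for illustration only.

```python
import numpy as np


def weighted_kde_2d(mz, rt, intensity, mz_grid, rt_grid, bw_mz, bw_rt):
    """Intensity-weighted bivariate Gaussian KDE on a rectangular grid.

    Hypothetical helper for illustration; a product of two Gaussian
    kernels (one per axis) is assumed, with fixed bandwidths.
    """
    # Each data point contributes in proportion to its intensity.
    w = intensity / intensity.sum()
    # Per-axis kernel values, shape (n_grid_points, n_data_points).
    k_mz = np.exp(-0.5 * ((mz_grid[:, None] - mz[None, :]) / bw_mz) ** 2)
    k_rt = np.exp(-0.5 * ((rt_grid[:, None] - rt[None, :]) / bw_rt) ** 2)
    # Weighted sum of product kernels, shape (len(mz_grid), len(rt_grid)).
    return (k_mz * w[None, :]) @ k_rt.T / (2.0 * np.pi * bw_mz * bw_rt)


# Synthetic data (an assumption): three samples share one peak near
# m/z 200.0 and retention time 50, with a small retention time shift
# per sample, plus low-intensity background points.
rng = np.random.default_rng(0)
samples = []
for rt_shift in (-1.0, 0.0, 1.0):
    n_sig, n_bg = 200, 100
    mz = np.concatenate([rng.normal(200.0, 0.005, n_sig),
                         rng.uniform(199.9, 200.1, n_bg)])
    rt = np.concatenate([rng.normal(50.0 + rt_shift, 2.0, n_sig),
                         rng.uniform(0.0, 100.0, n_bg)])
    inten = np.concatenate([np.full(n_sig, 1000.0),
                            rng.uniform(1.0, 10.0, n_bg)])
    samples.append((mz, rt, inten))

# Pool all samples before estimating the density, rather than
# extracting features from each sample separately.
pooled_mz = np.concatenate([s[0] for s in samples])
pooled_rt = np.concatenate([s[1] for s in samples])
pooled_int = np.concatenate([s[2] for s in samples])

mz_grid = np.linspace(199.9, 200.1, 81)
rt_grid = np.linspace(0.0, 100.0, 101)
density = weighted_kde_2d(pooled_mz, pooled_rt, pooled_int,
                          mz_grid, rt_grid, bw_mz=0.005, bw_rt=2.0)

# Take the grid cell with the highest density as the detected peak.
i, j = np.unravel_index(np.argmax(density), density.shape)
peak_mz, peak_rt = mz_grid[i], rt_grid[j]
```

Because the density is estimated once on the pooled data, a single peak region is defined for all samples, so no separate step is needed to match features across samples. A real pipeline would also need background correction, retention time alignment, and a rule for delimiting peak boundaries, none of which this sketch attempts.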