American Chemical Society

LibGen: Generating High Quality Spectral Libraries of Natural Products for EAD‑, UVPD‑, and HCD-High Resolution Mass Spectrometers

journal contribution
posted on 2023-11-08, 19:39 authored by Fanzhou Kong, Uri Keshet, Tong Shen, Elys Rodriguez, Oliver Fiehn
Compound annotation using spectral-matching algorithms is vital for tandem mass spectrometry (MS/MS)-based metabolomics research but is hindered by the lack of high-quality reference MS/MS library spectra. Finding and removing errors from libraries, including noise ions, is mostly done manually, a process that is both error-prone and time-consuming. To address these challenges, we have developed an automated library curation pipeline, LibGen, to universally build novel spectral libraries. This pipeline corrects mass errors, denoises spectra by subformula assignments, and performs quality control of the reference spectra by calculating explained intensity and spectral entropy. We employed LibGen to generate three high-quality libraries with chemical standards of 2241 natural products. To this end, we used an IQ-X orbital ion trap mass spectrometer to generate 1947 classic high-energy collision dissociation (HCD) spectra as well as 1093 ultraviolet-photodissociation (UVPD) mass spectra. The third library was generated by an electron-activated collision dissociation (EAD) 7600 ZenoTOF mass spectrometer, yielding 3244 MS/MS spectra. The natural compounds covered 140 chemical classes, from prenol lipids to benzopyrans, with >97% of the compounds showing <0.2 Tanimoto similarity, demonstrating very high structural variance. Spectral entropy calculations showed much higher information content for both UVPD and EAD mass spectra than for classic HCD spectra. We validated the denoising algorithm by acquiring MS/MS spectra of chemical standards at high concentration and at 13-fold dilution. At low concentrations, a higher proportion of spectra showed apparent fragment ions that could not be explained by subformula losses of the parent molecule. When more than 10% of the total intensity of MS/MS fragments was regarded as noise ions, spectra were considered low quality and were not included in the libraries.
As the overall process is fully automated, LibGen can be utilized by all researchers who create or curate mass spectral libraries. The libraries we created here are publicly available at MassBank.us.
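
The quality-control step described above can be illustrated with a minimal Python sketch. It is not LibGen's actual code or API: the function names, the peak representation, and the boolean `explained` flag (standing in for the result of subformula assignment, which is not implemented here) are all assumptions made for illustration. Spectral entropy is taken to be the Shannon entropy of the intensities after normalizing them to sum to 1, and the 10% noise threshold is the one stated in the abstract.

```python
import math


def spectral_entropy(intensities):
    """Shannon entropy (natural log) of intensities normalized to sum to 1."""
    total = sum(intensities)
    if total <= 0:
        return 0.0
    probs = [i / total for i in intensities if i > 0]
    return -sum(p * math.log(p) for p in probs)


def explained_intensity(peaks):
    """Fraction of total intensity carried by explained peaks.

    peaks: list of (mz, intensity, explained) tuples, where `explained`
    is a hypothetical flag marking peaks matched to a subformula of the
    precursor (the matching itself is assumed to happen elsewhere).
    """
    total = sum(intensity for _, intensity, _ in peaks)
    if total <= 0:
        return 0.0
    matched = sum(intensity for _, intensity, ok in peaks if ok)
    return matched / total


def denoise(peaks):
    """Keep only peaks flagged as explained."""
    return [peak for peak in peaks if peak[2]]


def passes_quality_control(peaks, max_noise_fraction=0.10):
    """Reject a spectrum when more than 10% of its intensity is noise."""
    noise_fraction = 1.0 - explained_intensity(peaks)
    return noise_fraction <= max_noise_fraction


# Toy spectra with made-up values, for illustration only.
clean = [(85.03, 60.0, True), (127.04, 35.0, True), (99.99, 5.0, False)]
noisy = [(85.03, 50.0, True), (127.04, 30.0, True), (99.99, 20.0, False)]
```

In this sketch, `clean` carries 5% unexplained intensity and is kept, while `noisy` carries 20% and is rejected. Computing `spectral_entropy` on the intensities of `denoise(clean)` gives the entropy of the curated spectrum.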
