Posted on 2021-10-18, 23:44. Authored by Hyun Woo Kim, Mingxun Wang, Christopher A. Leber, Louis-Félix Nothias, Raphael Reher, Kyo Bin Kang, Justin J. J. van der Hooft, Pieter C. Dorrestein, William H. Gerwick, Garrison W. Cottrell
Computational approaches such as genome and metabolome mining are
becoming essential to natural products (NPs) research. Consequently,
a need exists for an automated structure-type classification system
to handle the massive amounts of data appearing for NP structures.
An ideal semantic ontology for the classification of NPs should go
beyond the simple presence/absence of chemical substructures and
also include the taxonomy of the producing organism, the nature of
the biosynthetic pathway, and/or the biological properties of the NPs. Thus,
a holistic and automatic NP classification framework could be of considerable
value for comprehensively navigating the relatedness of NPs, especially
when analyzing large numbers of them. Here, we introduce NPClassifier,
a deep-learning tool for the automated structural classification of
NPs from their counted Morgan fingerprints. NPClassifier is expected
to accelerate and enhance NP discovery by linking NP structures to
their underlying properties.
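
To illustrate what "counted" means for a fingerprint, the following is a minimal sketch, not NPClassifier's actual featurization. It assumes the substructures of a molecule have already been enumerated as string labels (the labels, the function names `fold_counts` and `to_binary`, and the 16-position vector length are all hypothetical), and it folds them into a fixed-length vector with a generic CRC32 hash rather than the Morgan algorithm itself. A counted fingerprint records how many times each substructure occurs, whereas a binary fingerprint records only presence or absence.

```python
import zlib
from collections import Counter


def fold_counts(substructure_ids, n_bits=16):
    """Fold substructure labels into a fixed-length count vector.

    Each label is hashed to a position; the position stores how many
    times the label occurred (colliding labels add up).
    """
    vec = [0] * n_bits
    for label, count in Counter(substructure_ids).items():
        position = zlib.crc32(label.encode("utf-8")) % n_bits
        vec[position] += count
    return vec


def to_binary(count_vec):
    """Collapse a count vector to presence/absence bits."""
    return [1 if c > 0 else 0 for c in count_vec]


# Hypothetical substructure labels for one molecule (illustration only).
example_ids = ["C-arom", "C-arom", "C-arom", "O-hydroxyl", "C-carbonyl"]

counts = fold_counts(example_ids)
bits = to_binary(counts)

print("counted:", counts)
print("binary: ", bits)
```

In this toy, the counted vector keeps the fact that the aromatic carbon label appears three times, while the binary vector only shows that it is present. In practice a cheminformatics toolkit would generate the atom environments directly from the structure.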