Posted on 2016-02-19, 03:04, authored by Martin Vogt, Jürgen Bajorath
In similarity searching, compound potency is usually not taken into account.
Given a set of active reference compounds, similarity to database
molecules is calculated using different metrics without considering
compound potency as a search parameter. Herein, we introduce a feature
selection method for fingerprint similarity searching to maximize
compound recall and preferentially detect potent compounds. On the
basis of training examples, fingerprint features are selected that
identify potent compounds and produce high recall. Using the reduced
fingerprint representations, potent hits are preferentially detected,
even if reference compounds have only moderate or low potency. Small
sets of simple chemical features are found to yield high search performance.
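
The general idea of searching with a reduced fingerprint can be illustrated with a minimal sketch. The abstract does not specify the selection criterion or the similarity metric, so everything below is an assumption made for illustration: fingerprints are modeled as sets of "on" feature indices, features are scored by how much more often they occur in potent training compounds than in other training compounds, and database compounds are ranked by their maximum Tanimoto similarity to any reference. The names `select_features` and `search` are hypothetical and do not come from the authors' method.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# A fingerprint is modeled as the set of feature indices that are "on".
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient of two feature sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_features(
    potent: Sequence[Fingerprint],
    others: Sequence[Fingerprint],
    k: int,
) -> Fingerprint:
    """Keep the k features most enriched in potent training compounds.

    Assumed scoring (not taken from the source): frequency among potent
    compounds minus frequency among the other training compounds.
    Both training sets are assumed to be non-empty.
    """
    features = set().union(*potent, *others)

    def score(f: int) -> float:
        freq_potent = sum(f in fp for fp in potent) / len(potent)
        freq_others = sum(f in fp for fp in others) / len(others)
        return freq_potent - freq_others

    # Ties are broken by feature index so the result is deterministic.
    ranked = sorted(features, key=lambda f: (-score(f), f))
    return frozenset(ranked[:k])


def search(
    references: Sequence[Fingerprint],
    database: Dict[str, Fingerprint],
    selected: Fingerprint,
) -> List[Tuple[str, float]]:
    """Rank database compounds on the reduced fingerprint.

    Each compound is scored by its maximum Tanimoto similarity to any
    reference, computed only over the selected features.
    """
    reduced_refs = [ref & selected for ref in references]
    scored = []
    for name, fp in database.items():
        reduced = fp & selected
        best = max(tanimoto(reduced, ref) for ref in reduced_refs)
        scored.append((name, best))
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy data: features 1 and 2 are enriched in the potent training compounds.
potent_train = [frozenset({1, 2, 3}), frozenset({1, 2, 4})]
other_train = [frozenset({3, 4, 5}), frozenset({4, 5, 6})]
selected = select_features(potent_train, other_train, k=2)

# A single reference that shares only one selected feature.
references = [frozenset({1, 3, 5})]
database = {"a": frozenset({1, 2, 7}), "b": frozenset({3, 5, 6})}
ranking = search(references, database, selected)
```

In this toy example, compound `b` shares more features with the reference on the full fingerprint, yet compound `a` ranks first on the reduced one because only the potency-enriched features contribute to the score. A real application would replace the integer sets with fingerprints from a cheminformatics toolkit and validate the selection on held-out compounds.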