Posted on 2009-01-22, 00:00; authored by Alex M. Clark, Paul Labute
A method is presented for the detection and analysis of multiple common scaffolds in small collections of pharmaceutically relevant molecules that share a set of common structural motifs. The input consists of the molecules themselves, possibly some of the scaffolds, and possibly information about the relation between the substitution points of these scaffolds. Three new algorithms are presented: multiple scaffold detection, common scaffold alignment, and scaffold substructure assignment. These steps cover the cases in which none, some, or all of the information about the common scaffolds and their substitution patterns is available. Each of these problems must be solved optimally in order to produce useful structure-activity correlations. The output consists of a collection of scaffolds, a common numbering system, and a unique mapping of each molecule to a single scaffold substructure. This information can then be used to produce data for structure-activity analysis of medicinal chemistry project databases.
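The shape of the output described above (a collection of scaffolds, a common numbering system, and one scaffold per molecule) can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not the authors' implementation: the names `Scaffold`, `Assignment`, and `build_sar_table`, the `R1`/`R2` labels, and all molecule identifiers and fragment strings are hypothetical, and no detection, alignment, or substructure matching is performed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scaffold:
    name: str            # hypothetical scaffold label
    r_positions: tuple   # substitution points in the common numbering


@dataclass(frozen=True)
class Assignment:
    molecule_id: str     # hypothetical molecule identifier
    scaffold: str        # name of the single scaffold assigned to the molecule
    substituents: dict   # common R label -> substituent (illustrative strings)


def build_sar_table(scaffolds, assignments):
    """Lay out one row per molecule with one column per common R label.

    Enforces the uniqueness property from the abstract: every molecule
    is mapped to exactly one known scaffold.
    """
    known = {s.name for s in scaffolds}
    columns = sorted({r for s in scaffolds for r in s.r_positions})
    seen = set()
    rows = []
    for a in assignments:
        if a.molecule_id in seen:
            raise ValueError(f"{a.molecule_id} is assigned more than once")
        if a.scaffold not in known:
            raise ValueError(f"unknown scaffold {a.scaffold!r}")
        seen.add(a.molecule_id)
        row = {"molecule": a.molecule_id, "scaffold": a.scaffold}
        # Positions a molecule does not use are left as None.
        row.update({r: a.substituents.get(r) for r in columns})
        rows.append(row)
    return columns, rows


# Made-up example data, for illustration only.
scaffolds = [Scaffold("A", ("R1", "R2")), Scaffold("B", ("R1",))]
assignments = [
    Assignment("mol-1", "A", {"R1": "CH3", "R2": "Cl"}),
    Assignment("mol-2", "B", {"R1": "OCH3"}),
]
columns, rows = build_sar_table(scaffolds, assignments)
```

Because both scaffolds share the same R labels, substituents at corresponding positions fall into the same column, which is the property a structure-activity table relies on.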