Descriptors, Physical Properties, and Drug-Likeness
Journal contribution posted on 03.07.2002, 00:00 by Matthias Brüstle, Bernd Beck, Torsten Schindler, William King, Timothy Mitchell, Timothy Clark
We have investigated techniques for distinguishing between drugs and nondrugs using a set of molecular descriptors derived from semiempirical molecular orbital (AM1) calculations. The “drug” data set of 2105 compounds was derived from the World Drug Index (WDI) using a procedure designed to select real drugs. The “nondrug” data set was the Maybridge database. We first investigated the dimensionality of physical property space based on a set of 26 descriptors that we have used successfully to build absorption, distribution, metabolism, and excretion-related quantitative structure−property relationship models. We discuss the general nature of the descriptors for physical property space and the ability of these descriptors to distinguish between drugs and nondrugs. The third most significant principal component of this set of descriptors serves as a useful numerical index of drug-likeness, but none of the other components is able to distinguish between drugs and nondrugs. We therefore extended our set of descriptors to a total of 66 and used recursive partitioning to identify the descriptors that can distinguish between drugs and nondrugs. This procedure pointed to two of the descriptors that play an important role in the principal component found above and one more from the set of 40 extra descriptors. These three descriptors were then used to train a Kohonen artificial neural net on the entire Maybridge data set. Projecting the drug database onto the resulting map gave a clear distinction not only between drugs and nondrugs but also, for instance, between hormones and other drugs. Projection of 42 131 compounds from the WDI onto the Kohonen map also revealed pronounced clustering in the regions of the map assigned as druglike.
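The last step of the abstract (training a Kohonen map on one data set described by three descriptors, then projecting a second data set onto it) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the map size, the training schedule, or the identity of the three descriptors, so the 8 × 8 grid, the learning-rate and neighbourhood parameters, and the random synthetic data below are all assumptions chosen purely for illustration. The function names `train_som`, `best_matching_unit`, and `project` are likewise hypothetical.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) index of the node whose weight vector is closest to x."""
    d = np.sum((weights - x) ** 2, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)


def train_som(data, rows=8, cols=8, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular Kohonen map (all hyperparameters are assumed values)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    # Grid coordinates of every node, used by the Gaussian neighbourhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighbourhood radius
            bmu = np.array(best_matching_unit(weights, x))
            dist2 = np.sum((grid - bmu) ** 2, axis=-1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            # Pull the winning node and its neighbours towards the sample.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def project(weights, data):
    """Map each sample to the grid position of its best matching unit."""
    return np.array([best_matching_unit(weights, x) for x in data])


# Synthetic stand-ins (NOT real descriptor values): three columns per compound.
rng = np.random.default_rng(1)
background = rng.normal(size=(500, 3))           # plays the role of the map-training set
query = rng.normal(loc=1.5, size=(100, 3))       # plays the role of the projected set

# Scale both sets with the statistics of the training set only.
mean, std = background.mean(axis=0), background.std(axis=0)
weights = train_som((background - mean) / std)
hits = project(weights, (query - mean) / std)

# Count how many projected samples land on each node of the map.
occupancy = np.zeros(weights.shape[:2], dtype=int)
np.add.at(occupancy, (hits[:, 0], hits[:, 1]), 1)
print(occupancy)
```

In this sketch, nodes with high counts in `occupancy` are the regions of the map where the projected set concentrates, which is the kind of clustering the abstract reports for the drug compounds.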