Posted on 2000-10-12, 00:00. Authored by Gregory A. Bakken, Peter C. Jurs.
Linear discriminant analysis is used to generate models that classify multidrug-resistance reversal (MDRR)
agents based on activity. Models are generated and evaluated using multidrug-resistance
reversal activity values for 609 compounds measured using adriamycin-resistant P388 murine
leukemia cells. Structure-based descriptors numerically encode the molecular features that are
used in model formation. Two types of models are generated: one type to classify compounds
as inactive, moderately active, and active (three-class problem) and one type to classify
compounds as inactive or active without considering the moderately active class (two-class
problem). Two activity distributions are considered that differ in the separation between the
inactive and active classes. When the separation between inactive and active classes is
small, a model based on nine topological descriptors is developed that produces a classification
rate of 83.1% correct for an external prediction set. Larger separation between active and
inactive classes raises the prediction set classification rate to 92.0% correct using a model with
six topological descriptors. Models are further validated through Monte Carlo experiments in
which models are generated after class labels have been scrambled. The classification rates
achieved demonstrate that the models developed could serve as a screening mechanism to
identify potentially useful MDRR agents from large libraries of compounds.
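
The label-scrambling validation described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the data are synthetic two-descriptor points rather than the 609 compounds, the classifier is a plain two-class linear discriminant with equal priors, and the helper names (`fit_lda`, `scrambled_label_rates`, `make_data`) are hypothetical. In each scrambled trial the class labels are shuffled across all compounds, a model is refit on the training portion, and it is scored against the shuffled labels of the held-out portion; the study's exact scrambling and scoring procedure may differ.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of descriptor vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def pooled_covariance(X, y):
    """Pooled within-class covariance matrix shared by both classes."""
    d = len(X[0])
    S = [[0.0] * d for _ in range(d)]
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        mu = mean_vector(rows)
        for x in rows:
            diff = [a - b for a, b in zip(x, mu)]
            for i in range(d):
                for j in range(d):
                    S[i][j] += diff[i] * diff[j]
    dof = len(X) - len(set(y))
    return [[v / dof for v in row] for row in S]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][k] * w[k] for k in range(i + 1, n))) / M[i][i]
    return w

def fit_lda(X, y):
    """Two-class LDA with equal priors; returns (weights, threshold)."""
    mu0 = mean_vector([x for x, t in zip(X, y) if t == 0])
    mu1 = mean_vector([x for x, t in zip(X, y) if t == 1])
    w = solve(pooled_covariance(X, y), [a - b for a, b in zip(mu1, mu0)])
    threshold = sum(wi * (a + b) / 2 for wi, a, b in zip(w, mu0, mu1))
    return w, threshold

def accuracy(model, X, y):
    """Fraction of compounds whose predicted class matches the given label."""
    w, threshold = model
    preds = [int(sum(wi * xi for wi, xi in zip(w, x)) > threshold) for x in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def scrambled_label_rates(X_train, y_train, X_test, y_test, n_trials, rng):
    """Refit on shuffled labels and score against the shuffled held-out labels."""
    labels = y_train + y_test  # new list; the caller's labels stay untouched
    rates = []
    for _ in range(n_trials):
        rng.shuffle(labels)
        y_tr, y_te = labels[:len(y_train)], labels[len(y_train):]
        rates.append(accuracy(fit_lda(X_train, y_tr), X_test, y_te))
    return rates

def make_data(n_per_class, rng):
    """Synthetic stand-in data: two Gaussian classes over two descriptors."""
    X, y = [], []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(100, rng)
X_test, y_test = make_data(50, rng)
real_rate = accuracy(fit_lda(X_train, y_train), X_test, y_test)
scrambled = scrambled_label_rates(X_train, y_train, X_test, y_test, 50, rng)
print("true labels:", real_rate)
print("scrambled labels (mean):", sum(scrambled) / len(scrambled))
```

If the model built on the true labels classifies the held-out compounds far better than any of the scrambled-label models, which should hover near the 50% chance rate for two balanced classes, the real classification rate is unlikely to be a chance correlation.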