Random Forest Prediction of Mutagenicity from Empirical Physicochemical Descriptors

journal contribution
posted on 22.01.2007, 00:00 by Qing-You Zhang, João Aires-de-Sousa
Fast-to-calculate empirical physicochemical descriptors were investigated for their ability to predict mutagenicity (positive or negative Ames test) from the molecular structure. Fast methods are highly desirable for screening large libraries of compounds. Global molecular descriptors and MOLMAP descriptors of bond properties were used to train random forests. Error percentages as low as 15% and 16% were achieved for an external test set of 472 compounds and for the training set of 4083 structures, respectively. High sensitivity and specificity were observed. Random forests were able to associate meaningful probabilities with the predictions and to explain the predictions in terms of similarities between query structures and compounds in the training set.
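
The abstract reports error percentage, sensitivity, and specificity for a binary Ames outcome. As a minimal sketch of how these three figures relate to one another, the Python snippet below computes them from true and predicted labels. The function name `ames_metrics`, the 1 = mutagenic / 0 = non-mutagenic encoding, and the toy labels are assumptions made for illustration; they are not taken from the paper or its data.

```python
from typing import Dict, Sequence


def ames_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Error percentage, sensitivity and specificity for binary labels.

    Assumed encoding (hypothetical): 1 = Ames positive, 0 = Ames negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one compound is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # mutagens predicted positive
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-mutagens predicted negative
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-mutagens predicted positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # mutagens predicted negative

    return {
        # share of all compounds that were misclassified, in percent
        "error_pct": 100.0 * (fp + fn) / len(pairs),
        # fraction of true mutagens that were recognised
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # fraction of true non-mutagens that were recognised
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels, invented for this example only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = ames_metrics(y_true, y_pred)
print(metrics)
```

In this toy case two of ten compounds are misclassified, so the error percentage is 20%, with one missed mutagen and one false alarm. Reporting sensitivity and specificity alongside the overall error guards against a model that scores a low error simply by favouring the majority class.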