American Chemical Society
ci900464s_si_001.pdf (235.38 kB)

Drug- and Lead-likeness, Target Class, and Molecular Diversity Analysis of 7.9 Million Commercially Available Organic Compounds Provided by 29 Suppliers

journal contribution
posted on 2010-04-26, 00:00 authored by Alexander Chuprina, Oleg Lukin, Robert Demoiseaux, Alexander Buzko, Alexander Shivanyuk
A database of 7.9 million compounds commercially available from 29 suppliers in 2008–2009 was assembled and analyzed. Of these, 5.2 million structures were identified as unique and were assessed for physical and biological properties and for molecular diversity. The rules of Lipinski and Veber were applied to the molecular weight, the calculated water/n-octanol partition coefficient (Clog P), the calculated aqueous solubility (log S), the numbers of hydrogen-bond donors and acceptors, and the calculated Caco-2 membrane permeability to identify drug-like compounds, while toxicity/reactivity filters were used to remove structures with biologically undesirable functional groups. This filtering left 2.0 million structures (39% of the unique set) suitable for high-throughput screening of biological activity. Modified filters applied to identify lead-like structures revealed that 16% of the unique compounds could be potential leads. An assessment of biological activities, an analysis of diversity, and the sizes of the exclusive sets of compounds are also presented.
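As an illustration of the kind of drug-likeness filtering described in the abstract, the following is a minimal sketch in Python. It assumes the commonly cited thresholds for Lipinski's rule of five (molecular weight ≤ 500, log P ≤ 5, at most 5 hydrogen-bond donors, at most 10 acceptors) and Veber's rule (at most 10 rotatable bonds, polar surface area ≤ 140 Å²). The exact cutoffs, the log S and Caco-2 criteria, and the toxicity/reactivity filters used by the authors are not given here, so they are omitted. The `Descriptors` class, the function names, and the example values are hypothetical and only serve the illustration; descriptors are assumed to be precomputed by some cheminformatics tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float      # g/mol
    clogp: float           # calculated octanol/water partition coefficient
    h_donors: int          # hydrogen-bond donors
    h_acceptors: int       # hydrogen-bond acceptors
    rotatable_bonds: int
    polar_surface_area: float  # in square angstroms


def lipinski_violations(d: Descriptors) -> int:
    """Count violations of the commonly cited rule-of-five thresholds."""
    checks = [
        d.mol_weight > 500,
        d.clogp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def passes_veber(d: Descriptors) -> bool:
    """Veber criteria: limited flexibility and polar surface area."""
    return d.rotatable_bonds <= 10 and d.polar_surface_area <= 140


def is_drug_like(d: Descriptors, max_violations: int = 0) -> bool:
    """Combine both rule sets.

    Assumption: no Lipinski violations are tolerated by default; the
    original rule of five is often applied with one violation allowed.
    """
    return lipinski_violations(d) <= max_violations and passes_veber(d)


# Illustrative, made-up descriptor values (not taken from the paper).
small = Descriptors(310.4, 2.1, 1, 4, 5, 75.0)
large = Descriptors(720.9, 6.3, 6, 12, 14, 190.0)

library = [small, large]
drug_like = [d for d in library if is_drug_like(d)]
print(f"{len(drug_like)} of {len(library)} compounds pass the filters")
```

Keeping the violation count separate from the pass/fail decision makes it straightforward to relax or tighten the filter, for example when switching from drug-like to lead-like criteria with different thresholds.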