posted on 2016-09-12, 14:50 authored by Florian Lauck, Matthias Rarey
The search for new marketable drugs constantly requires new
ideas. Particularly with regard to challenging targets and previously
patented chemical space, designing novel molecules is crucial. This
demands efficient and innovative computational tools to generate libraries
of promising molecules. Here we present an efficient method to generate
such libraries by systematically enumerating all molecules in a specific
chemical space. This space is defined by a fragment space and a set
of user-defined physicochemical properties (e.g., molecular weight,
tPSA, number of H-bond donors and acceptors, or predicted logP). In
order to enumerate a very large number of molecules, our algorithm
uses file-based data structures instead of memory-based ones, thus
overcoming the limitations of computer main memory. The resulting
chemical library can be used as a starting point for computational
lead-finding technologies, such as similarity searching, pharmacophore
mapping, docking, or virtual screening. We applied the algorithm in
different scenarios, thus creating numerous target-specific libraries.
Furthermore, we generated a fragment space from all approved drugs
in DrugBank and enumerated it with lead-like constraints, thus generating
0.5 billion molecules in the molecular weight range 250–350.
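
The idea of enumerating under property constraints while keeping intermediate results on disk can be illustrated with a minimal sketch. It is not the authors' algorithm: real fragment spaces connect fragments according to link compatibility rules, whereas this toy version simply combines fragments as multisets. The fragment table, the function name `enumerate_to_file`, and the tab-separated record layout are all assumptions made for illustration. Each enumeration level is written to a file and streamed back line by line, so memory use stays independent of library size.

```python
import os

# Hypothetical toy fragments: (name, molecular weight, H-bond donors).
TOY_FRAGMENTS = [("A", 100.0, 0), ("B", 120.0, 1), ("C", 150.0, 2)]


def enumerate_to_file(out_path, workdir, min_mw, max_mw, max_donors,
                      fragments=TOY_FRAGMENTS):
    """Enumerate fragment multisets level by level, keeping each level on disk.

    Returns the number of combinations written to out_path.
    """
    level, count = 0, 0
    cur_path = os.path.join(workdir, "level_0.tsv")
    with open(cur_path, "w") as seed:
        # Record layout: last fragment index, weight, donors, name string.
        seed.write("-1\t0.0\t0\t\n")

    with open(out_path, "w") as out:
        # Stop once a level produced no partial combinations to extend.
        while os.path.getsize(cur_path) > 0:
            level += 1
            next_path = os.path.join(workdir, f"level_{level}.tsv")
            with open(cur_path) as cur, open(next_path, "w") as nxt:
                for line in cur:  # streamed: one record in memory at a time
                    last, mw, donors, names = line.rstrip("\n").split("\t")
                    # Only append indices >= last so each multiset appears once.
                    for i in range(max(int(last), 0), len(fragments)):
                        name, frag_mw, frag_donors = fragments[i]
                        new_mw = float(mw) + frag_mw
                        new_donors = int(donors) + frag_donors
                        # Both properties only grow, so violating an upper
                        # bound rules out every extension as well.
                        if new_mw > max_mw or new_donors > max_donors:
                            continue
                        new_names = f"{names}-{name}" if names else name
                        nxt.write(f"{i}\t{new_mw}\t{new_donors}\t{new_names}\n")
                        if new_mw >= min_mw:
                            out.write(f"{new_names}\t{new_mw}\n")
                            count += 1
            os.remove(cur_path)  # the previous level is no longer needed
            cur_path = next_path
    os.remove(cur_path)  # final, empty level file
    return count
```

Calling `enumerate_to_file("library.tsv", some_dir, 250.0, 350.0, 2)` with the toy fragments above writes five combinations to `library.tsv`. The upper bounds prune partial combinations early, while the lower weight bound is checked only when deciding whether to emit a result, because a combination that is still too light may become valid after further extension.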