FSees: Customized Enumeration of Chemical Subspaces with Limited Main Memory Consumption

Lauck, Florian; Rarey, Matthias

doi:10.1021/acs.jcim.6b00117.s001

ci6b00117_si_001.pdf (281.8 kB)

journal contribution

Posted on 2016-09-12, 14:50; authored by Florian Lauck and Matthias Rarey

The search for new marketable drugs constantly requires new ideas. Designing novel molecules is crucial, particularly with regard to challenging targets and previously patented chemical space. This demands efficient and innovative computational tools to generate libraries of promising molecules. Here we present an efficient method to generate such libraries by systematically enumerating all molecules in a specific chemical space. This space is defined by a fragment space and a set of user-defined physicochemical properties (e.g., molecular weight, tPSA, number of H-bond donors and acceptors, or predicted logP). To enumerate a very large number of molecules, our algorithm uses file-based data structures instead of memory-based ones, thus overcoming the limitations of computer main memory. The resulting chemical library can be used as a starting point for computational lead-finding technologies such as similarity searching, pharmacophore mapping, docking, or virtual screening. We applied the algorithm in different scenarios, creating numerous target-specific libraries. Furthermore, we generated a fragment space from all approved drugs in DrugBank and enumerated it with lead-like constraints, yielding 0.5 billion molecules in the molecular weight range 250–350.
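
The idea of keeping the enumeration frontier in files rather than in main memory can be illustrated with a small sketch. This is not the FSees implementation: the fragment names, their weights, the tab-separated file layout, and the function `enumerate_to_disk` are all hypothetical, and the sketch ignores the connection rules of a real fragment space as well as every property other than a summed weight. It only shows how a level-by-level enumeration can stream each level from one file into the next, so that memory use stays independent of the number of molecules generated.

```python
import os
import tempfile

# Hypothetical toy fragments (name, weight contribution); not real chemistry.
FRAGMENTS = [("A", 100.0), ("B", 150.0), ("C", 200.0)]


def enumerate_to_disk(fragments, min_weight, max_weight, out_path, work_dir):
    """Write every fragment combination inside the weight window to out_path.

    Each level of the enumeration lives in its own file and is read line by
    line, so only one partial molecule is held in memory at a time.
    Fragment weights must be positive, otherwise the loop never terminates.
    Returns the number of combinations written.
    """
    count = 0
    depth = 0
    level_path = os.path.join(work_dir, "level_0.txt")

    # Seed level: single fragments, stored as "last_index<TAB>weight<TAB>names".
    with open(level_path, "w") as level:
        for i, (name, w) in enumerate(fragments):
            if w <= max_weight:
                level.write(f"{i}\t{w}\t{name}\n")

    with open(out_path, "w") as out:
        while os.path.getsize(level_path) > 0:
            next_path = os.path.join(work_dir, f"level_{depth + 1}.txt")
            with open(level_path) as level, open(next_path, "w") as nxt:
                for line in level:
                    last, weight, names = line.rstrip("\n").split("\t")
                    last, weight = int(last), float(weight)
                    # Report the combination if it lies inside the window.
                    if weight >= min_weight:
                        out.write(f"{names}\t{weight}\n")
                        count += 1
                    # Extend only with fragments at index >= last, so each
                    # multiset is produced once and no in-memory set of
                    # already-seen combinations is needed.
                    for j in range(last, len(fragments)):
                        name, w = fragments[j]
                        if weight + w <= max_weight:
                            nxt.write(f"{j}\t{weight + w}\t{names}-{name}\n")
            os.remove(level_path)  # the finished level is no longer needed
            level_path = next_path
            depth += 1
    os.remove(level_path)  # remove the final, empty level file
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        library = os.path.join(tmp, "library.txt")
        n = enumerate_to_disk(FRAGMENTS, 250.0, 350.0, library, tmp)
        print(n, "combinations written")
```

Pruning on the upper weight bound before a candidate is written keeps the level files from growing with combinations that can never satisfy the constraint. A realistic version would replace the summed weight with computed molecular descriptors and would need a proper canonical molecule representation to avoid duplicates.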