Posted on 2022-02-04, 20:06. Authored by Brajesh K. Rai, Vishnu Sresht, Qingyi Yang, Ray Unwalla, Meihua Tu, Alan M. Mathiowetz, Gregory A. Bakken
Fast and accurate assessment of small-molecule dihedral energetics
is crucial for molecular design and optimization in medicinal chemistry.
Yet accurate prediction of torsion energy profiles remains challenging
because current molecular mechanics (MM) methods are limited by insufficient
coverage of drug-like chemical space, and accurate quantum mechanical
(QM) methods are too expensive. To address this limitation, we introduce
TorsionNet, a deep neural network (DNN) model specifically developed
to predict small-molecule torsion energy profiles with QM-level accuracy.
We applied active learning to identify nearly 50k fragments (with
elements H, C, N, O, F, S, and Cl) that maximized the coverage of
our corporate compound library and leveraged massively parallel cloud
computing resources for density functional theory (DFT) torsion scans
of these fragments, generating a training data set of 1.2 million
DFT energies. After training TorsionNet on this data set, we obtain
a model that can rapidly predict the torsion energy profile of typical
drug-like fragments with DFT-level accuracy. Importantly, our method
also provides an uncertainty estimate for the predicted profiles without
any additional calculations. In this report, we show that TorsionNet
can accurately identify the preferred dihedral geometries observed
in crystal structures. Our TorsionNet-based analysis of a diverse
set of protein–ligand complexes with measured binding affinity
shows a strong association between high ligand strain and low potency.
We also present practical applications of TorsionNet that demonstrate
how consideration of DNN-based strain energy leads to substantial
improvement in existing lead discovery and design workflows. TorsionNet500,
a benchmark data set comprising 500 chemically diverse fragments with
DFT torsion profiles (12k MM- and DFT-optimized geometries and energies),
has been created and is made publicly available.
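
As an illustration of how a torsion energy profile can be turned into a per-torsion strain estimate, the following Python sketch takes a profile sampled on a grid of dihedral angles and returns the energy at an observed angle relative to the profile minimum. It is not the TorsionNet implementation: the function name `torsion_strain`, the use of periodic linear interpolation, the 30-degree grid, and the synthetic cosine profile are all assumptions made for this example.

```python
import numpy as np


def torsion_strain(angles_deg, energies, observed_deg):
    """Estimate strain as the profile energy at the observed dihedral
    minus the profile minimum.

    Hypothetical helper for illustration only. The profile is treated as
    periodic over 360 degrees and linearly interpolated between grid points.
    Units follow whatever units `energies` is given in.
    """
    angles = np.asarray(angles_deg, dtype=float)
    rel = np.asarray(energies, dtype=float)
    rel = rel - rel.min()  # energies relative to the sampled minimum
    # `period=360.0` makes np.interp wrap around at +/-180 degrees.
    return float(np.interp(observed_deg, angles, rel, period=360.0))


# Synthetic threefold profile on a 30-degree grid (made-up numbers,
# standing in for a predicted or DFT torsion scan).
grid = np.arange(-180.0, 180.0, 30.0)
profile = 2.0 * (1.0 + np.cos(np.deg2rad(3.0 * grid)))

strain_at_minimum = torsion_strain(grid, profile, 60.0)
strain_at_barrier = torsion_strain(grid, profile, 0.0)
strain_near_wrap = torsion_strain(grid, profile, 175.0)
```

In this sketch an observed dihedral that sits at a profile minimum has zero strain, while one at a barrier top carries the full barrier height. Summing such per-torsion values over a ligand would be one simple way to approximate total torsional strain, though the aggregation used in the paper is not specified in this abstract.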