Posted on 2024-08-08, 19:36. Authored by Xu Li, Haoliang Zhong, Haoyu Yang, Lin Li, Qingji Wang
The nucleophilic index (NNu) is a key parameter in the screening
of amine catalysts. Although amines are numerous and structurally diverse,
only a limited number exhibit an NNu value exceeding 4.0 eV, which makes
them potential nucleophiles in chemical reactions. To address this
issue, we proposed a computational method to quickly identify amines
with high NNu values by using machine learning (ML) and high-throughput density
functional theory (DFT) calculations. Our approach began with training
ML models, exploring molecular fingerprint methods, and
developing quantitative structure–activity
relationship (QSAR) models for well-known amines based on NNu values derived
from DFT calculations. Using Shapley Additive Explanations (SHAP)
plots, we determined the five critical substructures that
significantly impact the NNu values of amines. This insight was
used to generate 4920 novel hypothetical amines intended to have
high NNu values. The QSAR models were employed to predict the NNu values of 259 well-known
and 4920 hypothetical amines, resulting in the identification of five
novel hypothetical amines with exceptional NNu values (>4.55 eV). The enhanced NNu values of these
novel amines were validated by DFT calculations. One novel hypothetical
amine, H1, exhibits an unprecedentedly high NNu value of 5.36 eV, surpassing the
maximum value (5.35 eV) observed in well-known amines. Our research
strategy accelerates the discovery of highly nucleophilic
amines by combining ML predictions with DFT calculations.
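
The screening step described above keeps only amines whose predicted NNu exceeds 4.55 eV and passes them on to DFT validation. It might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `shortlist_for_dft`, the dictionary layout, and the example values are all hypothetical placeholders. Only the 4.55 eV cutoff is taken from the text.

```python
from typing import Dict, List, Tuple

# Cutoff quoted in the abstract for "exceptional" NNu values (in eV).
HIGH_NNU_THRESHOLD_EV = 4.55


def shortlist_for_dft(
    predicted_nnu: Dict[str, float],
    threshold_ev: float = HIGH_NNU_THRESHOLD_EV,
) -> List[Tuple[str, float]]:
    """Return (amine_id, predicted NNu) pairs strictly above the threshold,
    sorted from highest to lowest predicted value."""
    hits = [(name, value) for name, value in predicted_nnu.items()
            if value > threshold_ev]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Placeholder predictions invented for illustration only;
# these are NOT results from the paper.
example_predictions = {
    "amine_a": 3.90,
    "amine_b": 4.70,
    "amine_c": 4.55,
    "amine_d": 5.10,
}

candidates = shortlist_for_dft(example_predictions)
print(candidates)
```

In a workflow of this kind, the predictions would come from the trained QSAR models, and the shortlisted candidates would then be recomputed with DFT to confirm their NNu values.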