Posted on 2024-06-18, 14:05. Authored by Zheng Cheng, Hangrui Bi, Siyuan Liu, Junmin Chen, Alston J. Misquitta, Kuang Yu
Accurately describing long-range interactions is a significant
challenge in molecular dynamics (MD) simulations of proteins. A high-quality
long-range potential is also an important component of range-separated
machine learning force fields. This study introduces a comprehensive
asymptotic parameter database encompassing atomic multipole moments,
polarizabilities, and dispersion coefficients. Leveraging active learning,
our database comprehensively represents protein fragments with up
to 8 heavy atoms, capturing their conformational diversity with only
78,000 data points. Additionally, an E(3)-equivariant neural network (E3NN) is
employed to predict the asymptotic parameters directly from the local
geometry. The E3NN models demonstrate exceptional accuracy and transferability
across all asymptotic parameters, achieving an R² of 0.999 on both the protein fragment and the 20 amino acid dipeptide
test sets. The long-range electrostatic and dispersion energies can
be obtained using the E3NN-predicted parameters, with errors of
0.07 and 0.02 kcal/mol, respectively, relative to symmetry-adapted
perturbation theory (SAPT). These results show that our force fields
can accurately describe long-range interactions in proteins,
paving the way for next-generation protein force fields.
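
As an illustration of how per-atom asymptotic parameters translate into long-range energies, the following is a minimal sketch, not the authors' implementation. It assumes a heavily truncated model: only atomic charges (the monopole term of the multipole expansion) and isotropic C6 dispersion coefficients, with no damping, no higher multipoles, and no polarization. The function name `long_range_energies`, the tuple layout, and the geometric-mean combination rule for C6 are all assumptions made for the example.

```python
import math

# Coulomb conversion constant in kcal*Angstrom/(mol*e^2), so that charges in
# units of e and distances in Angstrom give energies in kcal/mol.
COULOMB_KCAL = 332.0637


def long_range_energies(mol_a, mol_b):
    """Return (electrostatic, dispersion) energies in kcal/mol.

    Each molecule is a list of (position, charge, c6) tuples, where
    position is an (x, y, z) tuple in Angstrom, charge is in e, and
    c6 is a per-atom coefficient in kcal*Angstrom^6/mol.
    Hypothetical layout chosen for this sketch only.
    """
    e_elec = 0.0
    e_disp = 0.0
    for pos_i, q_i, c6_i in mol_a:
        for pos_j, q_j, c6_j in mol_b:
            r = math.dist(pos_i, pos_j)
            # Monopole-monopole Coulomb term (higher multipoles omitted).
            e_elec += COULOMB_KCAL * q_i * q_j / r
            # Assumed geometric-mean combination rule for the pair C6.
            c6_ij = math.sqrt(c6_i * c6_j)
            # Undamped leading-order dispersion term, -C6 / r^6.
            e_disp -= c6_ij / r**6
    return e_elec, e_disp


if __name__ == "__main__":
    # Two single-site "molecules" 2 Angstrom apart, with made-up parameters.
    site_a = [((0.0, 0.0, 0.0), 1.0, 64.0)]
    site_b = [((2.0, 0.0, 0.0), -1.0, 64.0)]
    print(long_range_energies(site_a, site_b))
```

In this sketch the parameters are fixed inputs. In the workflow described above they would instead be predicted from the local geometry by the trained model, and the full expansion would include dipoles, quadrupoles, polarizabilities, and higher dispersion coefficients.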