# Multifidelity Statistical Machine Learning for Molecular Crystal Structure Prediction

journal contribution

posted on 2020-09-17, 22:44 authored by Olga Egorova, Roohollah Hafizi, David C. Woods, Graeme M. Day

The prediction of
crystal structures from first principles requires
highly accurate energies for large numbers of putative crystal structures.
High accuracy of solid state density functional theory (DFT) calculations
is often required, but hundreds or more structures can be present
in the low energy region of interest, so that the associated computational
costs are prohibitive. Here, we apply statistical machine learning
to predict expensive hybrid functional DFT (PBE0) calculations using
a multifidelity approach to re-evaluate the energies of crystal structures
predicted with an inexpensive force field. The method uses an autoregressive
Gaussian process, making use of less expensive GGA DFT (PBE) calculations
to bridge the gap between the force field and PBE0 energies. The method
is benchmarked on the crystal structure landscapes of three small,
hydrogen-bonded organic molecules and shown to produce accurate predictions
of energies and crystal structure ranking using small numbers of the
most expensive calculations; the PBE0 energies can be predicted with
errors of less than 1 kJ mol⁻¹ with between 4.2 and 6.8% of the cost of the full calculations. As the model that we have developed is probabilistic, we discuss how the uncertainties in predicted energies impact the assessment of the energetic ranking of crystal structures.
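
The autoregressive idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes only two fidelity levels rather than the three named in the abstract (force field, PBE, PBE0), a squared-exponential kernel with fixed hyperparameters, a plug-in least-squares estimate of the scaling factor `rho`, a one-dimensional toy descriptor, and independent Gaussian predictions when comparing two structures. All function names and the toy data are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between row-wise descriptor arrays."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length**2)

def gp_fit_predict(x_train, y_train, x_test, length=1.0, var=1.0, noise=1e-6):
    """Zero-mean GP regression with fixed (assumed) hyperparameters.
    Returns the predictive mean and variance at x_test."""
    K = rbf(x_train, x_train, length, var) + noise * np.eye(len(x_train))
    Ks = rbf(x_test, x_train, length, var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks.T)
    variance = var - (v**2).sum(0)
    return Ks @ alpha, np.maximum(variance, 0.0)

def two_level_predict(x_low, y_low, x_high, y_high, x_test, **kw):
    """Two-level autoregressive model: high(x) = rho * low(x) + delta(x)."""
    # Cheap-level GP evaluated where expensive data exist.
    low_at_high, _ = gp_fit_predict(x_low, y_low, x_high, **kw)
    # Plug-in least-squares scale (an assumption of this sketch).
    rho = (low_at_high @ y_high) / (low_at_high @ low_at_high)
    # A second GP models the discrepancy between the two levels.
    resid = y_high - rho * low_at_high
    m_low, v_low = gp_fit_predict(x_low, y_low, x_test, **kw)
    m_delta, v_delta = gp_fit_predict(x_high, resid, x_test, **kw)
    return rho * m_low + m_delta, rho**2 * v_low + v_delta

def prob_lower(mean_a, var_a, mean_b, var_b):
    """P(E_a < E_b), treating the two predictions as independent Gaussians."""
    z = (mean_b - mean_a) / sqrt(var_a + var_b)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Toy data (hypothetical): many cheap evaluations, few expensive ones.
x_low = np.linspace(0.0, 1.0, 13)[:, None]
y_low = np.sin(2 * np.pi * x_low[:, 0])
x_high = x_low[::3]
y_high = 1.5 * np.sin(2 * np.pi * x_high[:, 0]) + 0.2 * (x_high[:, 0] - 0.5)

x_test = np.array([[0.1], [0.4], [0.7]])
mean_pred, var_pred = two_level_predict(
    x_low, y_low, x_high, y_high, x_test, length=0.2
)
# Probability that the first test structure lies below the second.
p_first_lower = prob_lower(mean_pred[0], var_pred[0], mean_pred[1], var_pred[1])
print(mean_pred, var_pred, p_first_lower)
```

The last function shows one way predictive uncertainty could enter a ranking assessment: instead of comparing two predicted energies directly, it reports the probability that one structure is lower in energy than the other.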