oc0c00026_si_001.pdf (2.12 MB)
Accurate Multiobjective Design in a Space of Millions of Transition Metal Complexes with Neural-Network-Driven Efficient Global Optimization
journal contribution
posted on 2020-03-11, 13:57 authored by Jon Paul Janet, Sahasrajit Ramesh, Chenru Duan, Heather J. Kulik
The accelerated discovery of materials for real world applications
requires the achievement of multiple design objectives. The multidimensional
nature of the search necessitates exploration of multimillion compound
libraries over which even density functional theory (DFT) screening
is intractable. Machine learning (e.g., artificial neural network,
ANN, or Gaussian process, GP) models for this task are limited by
training data availability and predictive uncertainty quantification
(UQ). We overcome such limitations by using efficient global optimization
(EGO) with the multidimensional expected improvement (EI) criterion.
EGO balances exploitation of a trained model with acquisition of new
DFT data at the Pareto front, the region of chemical space that contains
the optimal trade-off between multiple design criteria. We demonstrate
this approach for the simultaneous optimization of redox potential
and solubility in candidate M(II)/M(III) redox couples for redox flow
batteries from a space of 2.8 M transition metal complexes designed
for stability in practical redox flow battery (RFB) applications.
We show that a multitask ANN with latent-distance-based UQ surpasses
the generalization performance of a GP in this space. With this approach,
ANN prediction and EI scoring of the full space are achieved in minutes.
Starting from ca. 100 representative points, EGO improves both properties
by over 3 standard deviations in only five generations. Analysis of
lookahead errors confirms rapid ANN model improvement during the EGO
process, achieving suitable accuracy for predictive design in the
space of transition metal complexes. The ANN-driven EI approach achieves
at least 500-fold acceleration over random search, identifying a Pareto-optimal
design in around 5 weeks instead of 50 years.
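The abstract describes the Pareto front as the region of chemical space with the optimal trade-off between design criteria. As a minimal sketch of that idea (not the authors' implementation), the snippet below marks non-dominated candidates for two objectives, assuming both are to be maximized. The function name `pareto_mask` and the score values are hypothetical and purely illustrative; they are not data from the paper.

```python
import numpy as np


def pareto_mask(scores: np.ndarray) -> np.ndarray:
    """Return a boolean mask of non-dominated rows.

    Assumption: every column of `scores` is an objective to maximize.
    """
    n = scores.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # A point dominates row i if it is at least as good in every
        # objective and strictly better in at least one.
        at_least_as_good = np.all(scores >= scores[i], axis=1)
        strictly_better = np.any(scores > scores[i], axis=1)
        if np.any(at_least_as_good & strictly_better):
            mask[i] = False
    return mask


# Hypothetical (objective 1, objective 2) scores, e.g. scaled redox
# potential and solubility; illustrative values only.
candidates = np.array([
    [1.0, 0.2],
    [0.8, 0.9],
    [0.5, 0.5],
    [0.3, 1.0],
    [0.8, 0.4],
])
front = pareto_mask(candidates)
print(candidates[front])
```

This brute-force check costs O(n²) comparisons, which is fine for a handful of evaluated points; scoring a multimillion-compound space as in the abstract would call for a more scalable approach.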
Categories
- Biophysics
- Biochemistry
- Space Science
- Neuroscience
- Biotechnology
- Chemical Sciences not elsewhere classified
- Astronomical and Space Sciences not elsewhere classified
- Biological Sciences not elsewhere classified
- Information Systems not elsewhere classified
- Mathematical Sciences not elsewhere classified
- Plant Biology
Keywords
- ANN model improvement
- EGO process
- Accurate Multiobjective Design
- world applications
- Pareto front
- GP
- latent-distance-based UQ
- design criteria
- multitask ANN
- training data availability
- 5 weeks
- chemical space
- 100 representative points
- Transition Metal Complexes
- ANN prediction
- Gaussian process
- compound libraries
- DFT data
- redox flow battery
- 2.8 M transition metal complexes
- Neural-Network-Driven Efficient Global Optimization
- optimization
- design objectives
- uncertainty quantification
- EGO balances exploitation
- 50 years
- lookahead errors
- redox flow batteries
- ANN-driven EI approach
- generalization performance
- RFB
- transition metal complexes
- Pareto-optimal design