Metric Learning for High-Throughput Combinatorial Data Sets
journal contribution
posted on 2019-10-31, 13:03 authored by Kiran Vaddi, Olga Wodo
Materials design and discovery through the high-throughput exploration
of materials space has been recognized as a new paradigm in materials
science. However, typical high-throughput exploration methods deliver
high-dimensional and very diverse data sets that pose the challenge
of extracting the key features and patterns that could guide the discovery
process. Unraveling patterns is a nontrivial task as quite often the
underlying physical phenomena are uncertain and latent variables governing
the performance are mainly unknown. In this paper, we discuss challenges
related to designing a data analytics tool for clustering high-throughput
measurements performed on the compositional library of materials.
The critical aspects of our methodology are (i) learning the similarity
measures, as opposed to using fixed similarity measures (e.g., Euclidean
distance, dynamic time warping), while (ii) imposing the similarity
in the composition space. Our methodology is based on the multitask
learning approach that is formulated to account for the composition
neighborhoods that are specific to the compositional libraries. We
demonstrate the advantages of our methodology for the library of cyclic
voltammetry curves generated for model multimetal catalysts, as well
as X-ray diffraction patterns from experimental studies. We also compare
our approach with the current state-of-the-art methods used in similar
problems. This work has important implications for designing high-throughput
exploration of materials, including catalysts for electrochemical systems such
as fuel cells and metal-air batteries.
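
The contrast between a fixed similarity measure and a learned one can be illustrated with a minimal sketch. This is not the multitask formulation described above: it assumes a simple stand-in in which a Mahalanobis matrix M is taken as the regularized inverse covariance of differences between pairs labeled as similar. The function names, the toy data, and the regularization value are all hypothetical.

```python
import numpy as np

def euclidean(x, y):
    """Fixed similarity measure: plain Euclidean distance."""
    return float(np.sqrt(np.sum((x - y) ** 2)))

def mahalanobis(x, y, M):
    """Parameterized distance sqrt((x - y)^T M (x - y)) for a positive-definite M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

def fit_metric_from_similar_pairs(X, pairs, reg=1e-3):
    """Hypothetical stand-in for metric learning (not the paper's method):
    M is the regularized inverse covariance of differences between similar pairs,
    so directions along which similar samples vary are down-weighted."""
    diffs = np.array([X[i] - X[j] for i, j in pairs])
    C = diffs.T @ diffs / len(pairs)
    C += reg * np.eye(X.shape[1])  # keep C invertible
    return np.linalg.inv(C)

# Toy data (assumed): feature 0 is informative, feature 1 is large-scale nuisance.
X = np.array([[0.0, 0.0], [0.1, 5.0], [1.0, 0.2], [1.1, 5.3]])
similar_pairs = [(0, 1), (2, 3)]  # assumed side information about similarity
M = fit_metric_from_similar_pairs(X, similar_pairs)

print("Euclidean d(0,1), d(0,2):", euclidean(X[0], X[1]), euclidean(X[0], X[2]))
print("Learned   d(0,1), d(0,2):", mahalanobis(X[0], X[1], M), mahalanobis(X[0], X[2], M))
```

In this toy setting the Euclidean distance places sample 0 closer to sample 2, while the fitted metric places it closer to sample 1, its labeled similar partner. With M set to the identity matrix, the parameterized distance reduces to the Euclidean one.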