posted on 2024-02-22, 09:04 authored by Fernando Garcia-Escobar, Toshiaki Taniike, Keisuke Takahashi
Proposing relevant catalyst descriptors that can relate the information
on a catalyst’s composition to its actual performance is an
ongoing area of research in catalyst informatics, as it is a necessary step to
improve our understanding of the target reactions. Herein, a small
descriptor-engineered data set containing 3289 descriptor variables
and the performance of 200 catalysts for the oxidative coupling of
methane (OCM) is analyzed, and a descriptor search algorithm based
on the workflow of the Basin-hopping optimization methodology is proposed
to select the descriptors that better fit a predictive model. The
algorithm, which can be considered a wrapper method, successively
generates random modifications to the descriptor
subset used in a regression model and adopts them depending on their
effect on the model’s score. The algorithm was tested on linear and
Support Vector Regression models, yielding average
cross-validation r² scores of 0.8268 and
0.6875, respectively.
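
The perturb-then-accept loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `descriptor_search`, its parameters, the single-swap modification, and the Metropolis-type acceptance rule are all assumptions made for illustration, and the local minimization step of standard basin-hopping is omitted. The `score_fn` argument stands in for whatever model score is used (for example, a mean cross-validation r² of a regression model fitted on the chosen descriptors); here it is replaced by a hypothetical toy function so that the example runs on its own.

```python
import math
import random


def descriptor_search(n_descriptors, score_fn, subset_size=5, n_steps=500,
                      temperature=0.01, seed=0):
    """Illustrative basin-hopping-style wrapper search over descriptor subsets.

    score_fn takes a set of descriptor indices and returns a score to maximize.
    All names and defaults here are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    current = set(rng.sample(range(n_descriptors), subset_size))
    current_score = score_fn(current)
    best, best_score = set(current), current_score

    for _ in range(n_steps):
        # Random modification: swap one selected descriptor for an unselected one.
        candidate = set(current)
        candidate.remove(rng.choice(sorted(candidate)))
        unselected = [i for i in range(n_descriptors) if i not in current]
        candidate.add(rng.choice(unselected))
        candidate_score = score_fn(candidate)

        # Assumed Metropolis-type acceptance: always keep improvements, and
        # occasionally keep a worse subset so the search can leave a local optimum.
        delta = candidate_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score

        # Track the best subset seen so far.
        if current_score > best_score:
            best, best_score = set(current), current_score

    return sorted(best), best_score


def toy_score(subset):
    """Hypothetical stand-in for a cross-validated model score."""
    informative = {3, 17, 42}
    return len(informative & subset) / 3 - 0.01 * len(subset)


subset, score = descriptor_search(50, toy_score, subset_size=5, n_steps=500)
print(subset, round(score, 4))
```

In practice `score_fn` would wrap the model evaluation itself, which is what makes the approach a wrapper method: every candidate subset is judged by refitting and scoring the regression model rather than by a model-independent filter criterion.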