es1c04326_si_002.txt (8.98 MB)
“pySiRC”: Machine Learning Combined with Molecular Fingerprints to Predict the Reaction Rate Constant of the Radical-Based Oxidation Processes of Aqueous Organic Contaminants
dataset
posted on 2021-09-02, 20:04 authored by Flávio Olimpio Sanches-Neto, Jefferson Richard Dias-Silva, Luiz Henrique Keng Queiroz Junior, Valter Henrique Carvalho-Silva
We developed the pySiRC platform, a web application built on machine learning and molecular fingerprint algorithms for the automatic calculation of the reaction rate constants of the oxidative processes of organic pollutants by •OH and SO4•– radicals in the aqueous phase. The model development followed the OECD principles: internal and external validation, applicability domain, and mechanistic interpretation. Three machine learning algorithms combined with molecular fingerprints were evaluated. All the models showed high goodness-of-fit for the training set, with R² > 0.931 for the •OH radical and R² > 0.916 for the SO4•– radical, and good predictive capacity for the test set, with R²ext = Q²ext values in the range of 0.639–0.823 for the •OH radical and 0.767–0.824 for the SO4•– radical. The model was interpreted using the SHAP (SHapley Additive exPlanations) method: the results showed that its predictions rest on a reasonable understanding of how electron-withdrawing and electron-donating groups affect the reactivity of the •OH and SO4•– radicals. We hope that our models and web interface can stimulate and expand the application and interpretation of kinetic research on contaminants in water treatment units based on advanced oxidative technologies.
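The goodness-of-fit and external-validation statistics quoted above are coefficients of determination. The following is a minimal sketch of how such a statistic can be computed, not the authors' code. It assumes the common form 1 - SS_res / SS_tot, and it assumes that the external statistic is centered on the test-set mean, which would be consistent with the reported equality of the external R² and Q² values. The function name `r_squared`, the `reference_mean` parameter, and the sample log10(k) values are hypothetical.

```python
def r_squared(observed, predicted, reference_mean=None):
    """Return 1 - SS_res / SS_tot for paired observed/predicted values.

    reference_mean is the mean used in SS_tot. By default it is the mean
    of `observed` (assumption: this matches the external statistic in the
    description). Passing a training-set mean gives a different external
    variant.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    if reference_mean is None:
        reference_mean = sum(observed) / len(observed)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares around the chosen reference mean.
    ss_tot = sum((o - reference_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance around the reference mean")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical log10(k) values for a small test set, for illustration only.
    observed_log_k = [9.0, 9.5, 10.0]
    predicted_log_k = [9.1, 9.4, 10.2]
    print(r_squared(observed_log_k, predicted_log_k))
```

A value of 1 means the predictions reproduce the observations exactly, and a value of 0 means they do no better than always predicting the reference mean.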
Keywords
shapley additive explanations, reaction rate constant, good predictive capacity, donating groups interfere, aqueous phase, molecular fingerprint algorithm, advanced oxidative technologies, model development followed, based oxidation processes, web application structured, aqueous organic contaminants, model developed made, oxidative processes, web interface, prediction based, organic pollutants, molecular fingerprints, training set, test set, pySiRC platform, results showed, reasonable understanding, oecd principles, machine learning, kinetic research, interpreted using, high goodness, external validation, automatic calculation, applicability domain