Posted on 2024-02-28, 13:36. Authored by Nicole M. North, Abigail A. A. Enders,
Jessica B. Clark, Kezia A. Duah, Heather C. Allen
Solvated organics in the ocean are present in relatively small
concentrations but contribute substantially to ocean chemical diversity
and complexity. Existing in the ocean as dissolved organic carbon
(DOC) and enriched within the sea surface microlayer (SSML), these
compounds have large impacts on atmospheric chemistry through their
contributions to cloud nucleation, ice formation, and other climatological
processes. The ability to quantify the concentrations of organics
in ocean samples is critical to understanding these marine processes.
The work presented herein details an investigation to develop a machine
learning (ML) methodology utilizing infrared spectroscopy data to
accurately estimate saccharide concentrations in complex solutions.
We evaluated multivariate linear regression (MLR), K-nearest neighbors
(KNN), decision trees (DT), gradient-boosted regressors (GBR), multilayer
perceptrons (MLP), and support vector regressors (SVR) toward this
goal. SVR models are shown to give the most accurate and generalizable predictions of saccharide
concentrations. Our work presents an application combining fast spectroscopic
techniques with ML to analyze organic composition in proxy ocean samples.
As a result, we target a generalized method for analyzing marine field
samples more efficiently without sacrificing accuracy or precision.
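
To make the general workflow concrete, the following is a minimal, dependency-free sketch of regressing a concentration from spectra with one of the model families listed above (KNN). It is not the authors' pipeline: the spectra are synthetic (a single Gaussian band scaled by concentration plus noise), the function names and parameters such as make_synthetic_spectra and k=3 are hypothetical, and KNN is used here only because it is short to write out, not because it is the best-performing model in the study.

```python
import math
import random


def make_synthetic_spectra(n_samples=60, n_points=50, seed=0):
    """Build toy 'spectra': one Gaussian band whose height scales with
    concentration, plus Gaussian noise. Purely illustrative data."""
    rng = random.Random(seed)
    center = n_points / 2
    band = [math.exp(-0.5 * ((i - center) / 5.0) ** 2) for i in range(n_points)]
    spectra, concentrations = [], []
    for _ in range(n_samples):
        c = rng.uniform(0.0, 10.0)  # assumed concentration range, arbitrary units
        spectra.append([c * b + rng.gauss(0.0, 0.02) for b in band])
        concentrations.append(c)
    return spectra, concentrations


def knn_predict(train_X, train_y, x, k=3):
    """Predict a concentration as the mean of the k training samples whose
    spectra are closest to x in Euclidean distance."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Simple hold-out split: first 45 samples for training, last 15 for testing.
spectra, conc = make_synthetic_spectra()
train_X, test_X = spectra[:45], spectra[45:]
train_y, test_y = conc[:45], conc[45:]

knn_pred = [knn_predict(train_X, train_y, x) for x in test_X]

# Baseline: always predict the mean training concentration.
baseline_pred = [sum(train_y) / len(train_y)] * len(test_y)

knn_rmse = rmse(test_y, knn_pred)
baseline_rmse = rmse(test_y, baseline_pred)
print(f"KNN RMSE: {knn_rmse:.3f}  baseline RMSE: {baseline_rmse:.3f}")
```

Comparing against a mean-only baseline is one simple way to check that a model is extracting signal from the spectra. In practice, a comparison across several model families would more likely use an established library and cross-validation rather than a single hold-out split.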