American Chemical Society

Just-in-Time Learning-Integrated Partial Least-Squares Strategy for Accurately Predicting 71 Chemical Constituents in Chinese Tobacco by Near-Infrared Spectroscopy

journal contribution
posted on 2022-10-20, 16:05 authored by Youyan Liang, Le Zhao, Junwei Guo, Hongbo Wang, Shaofeng Liu, Luoping Wang, Li Chen, Mantang Chen, Nuohan Zhang, Huimin Liu, Cong Nie
Near-infrared spectroscopy has been widely used to characterize the chemical composition of tobacco because it is fast, economical, and nondestructive. However, few predictive models perform well when applied to large spectral libraries of tobacco and its various chemical indicators. In this study, the just-in-time learning-integrated partial least-squares (JIT-PLS) modeling strategy was applied for the first time to quantitatively analyze 71 chemical components in Chinese tobacco. Approximately 18000 tobacco samples from China were analyzed to find appropriately similar measurements and to select suitable, flexible subsets of similar samples from the calibration set for each test sample. In total, 879 representative aged tobacco leaf samples and 816 cigarette samples were used as external instances to evaluate the practical predictive ability of the proposed method. The most suitable similar subsets for each test sample could be selected by limiting the Euclidean distance and number of similar subsets to 0–3.0 × 10⁻⁹ and 10–300, respectively. The majority of the JIT-PLS models performed significantly better than traditional PLS models. Specifically, using JIT-PLS instead of traditional PLS models increased the R² values from 0.347–0.984 to 0.763–0.996 for the prediction of 67 components in aged tobacco leaf samples, and from 0.179–0.981 to 0.506–0.989 for the prediction of 71 components in cigarette samples. Good prediction ability was demonstrated for routine chemical components, polyphenolic compounds, organic acids, and other compounds, with the mean ratios of prediction to deviation (RPDmean) being 7.74, 4.39, 4.05, and 5.48, respectively. The proposed methodology could simultaneously determine 67 major components in large and complicated tobacco spectral libraries with high precision and accuracy, which will assist tobacco and cigarette quality control at both the collection and processing stages.
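The local-modeling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the NIPALS-based PLS1 fit, the default number of latent variables, and the way the distance limit and subset-size limits are combined are all assumptions made for illustration. In particular, the sketch reads the 10–300 range as bounds on the number of calibration samples in each local subset, and it leaves any spectral preprocessing (which determines the scale of the distance limit) to the caller. The RPD helper uses the common definition, the standard deviation of the reference values divided by the root-mean-square error of prediction.

```python
import numpy as np


def select_local_subset(X_cal, x_query, max_dist, min_n=10, max_n=300):
    """Indices of calibration spectra most similar to x_query (assumed rule).

    Samples within max_dist (Euclidean) are preferred; the subset size is
    then clamped to the range [min_n, max_n].
    """
    d = np.linalg.norm(X_cal - x_query, axis=1)
    order = np.argsort(d)
    n_within = int(np.sum(d <= max_dist))
    n_local = min(max(n_within, min_n), max_n, len(d))
    return order[:n_local]


def pls1_fit(X, y, n_components):
    """Single-response PLS via NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        norm = np.linalg.norm(w)
        if norm < np.finfo(float).eps:  # no covariance left to model
            break
        w = w / norm
        t = Xr @ w                      # scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        q_a = yr @ t / tt               # y loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - q_a * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def jit_pls_predict(X_cal, y_cal, x_query, max_dist,
                    min_n=10, max_n=300, n_components=5):
    """Build a local PLS model for one query spectrum and predict its value."""
    idx = select_local_subset(X_cal, x_query, max_dist, min_n, max_n)
    a = max(1, min(n_components, len(idx) - 1))  # keep the model identifiable
    x_mean, y_mean, coef = pls1_fit(X_cal[idx], y_cal[idx], a)
    return float((x_query - x_mean) @ coef + y_mean)


def rpd(y_true, y_pred):
    """RPD: sample SD of reference values divided by RMSEP."""
    rmsep = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return float(np.std(y_true, ddof=1) / rmsep)
```

In this sketch a new model is fitted for every test spectrum, so each of the 71 constituents would call `jit_pls_predict` once per sample with its own reference vector `y_cal`. A traditional (global) PLS model corresponds to skipping the subset selection and fitting `pls1_fit` on the whole calibration set.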
