Just-in-Time Learning-Integrated Partial Least-Squares Strategy for Accurately Predicting 71 Chemical Constituents in Chinese Tobacco by Near-Infrared Spectroscopy
posted on 2022-10-20, 16:05 authored by Youyan Liang, Le Zhao, Junwei Guo, Hongbo Wang, Shaofeng Liu, Luoping Wang, Li Chen, Mantang Chen, Nuohan Zhang, Huimin Liu, Cong Nie
Near-infrared spectroscopy has been widely used to characterize
the chemical composition of tobacco because it is fast, economical,
and nondestructive. However, few predictive models perform ideally
when applied to large spectral libraries of tobacco and its various
chemical indicators. In this study, the just-in-time learning-integrated
partial least-squares (JIT-PLS) modeling strategy was applied for
the first time to quantitatively analyze 71 chemical components in
Chinese tobacco. Approximately 18000 tobacco samples from China were
analyzed so that, for each test sample, a suitable and flexible subset
of similar spectra could be selected from the calibration set.
In total, 879 representative aged tobacco leaf samples and 816 cigarette
samples were used as external instances to evaluate the practical
predictive ability of the proposed method. The most suitable similar
subsets for each test sample could be selected by limiting the Euclidean
distance and number of similar subsets to 0–3.0 × 10⁻⁹ and 10–300, respectively. The majority of
the JIT-PLS models performed significantly better than traditional
PLS models. Specifically, using JIT-PLS instead of traditional PLS
models increased the R² values from 0.347–0.984
to 0.763–0.996, and from 0.179–0.981 to 0.506–0.989
for the prediction of 67 and 71 components in aged tobacco leaf and
cigarette samples, respectively. Good prediction ability was demonstrated
for routine chemical components, polyphenolic compounds, organic acids,
and other compounds, with the mean ratios of prediction to deviation
(RPDmean) being 7.74, 4.39, 4.05, and 5.48, respectively.
The proposed methodology could simultaneously determine 67 major components
in large and complicated tobacco spectral libraries with high precision
and accuracy, which will assist tobacco and cigarette quality control
at both the collection and processing stages.
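
The just-in-time step described above, picking a similar subset of calibration spectra for each test sample under a Euclidean distance cap and a subset size range, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the fallback to the nearest `n_min` spectra when too few fall within the cap, and the RPD formula (standard deviation of the reference values divided by the root-mean-square error of prediction, a common convention) are all assumptions. The local PLS fit on the selected subset is left out.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_similar_subset(query, library, max_dist, n_min, n_max):
    """Return indices of the calibration spectra most similar to `query`.

    Hypothetical rule (an assumption, not taken from the paper):
    keep spectra within `max_dist`, fall back to the nearest `n_min`
    if too few qualify, and never return more than `n_max`.
    """
    # Compute each distance once, then rank nearest first.
    dists = [(euclidean(query, spectrum), i) for i, spectrum in enumerate(library)]
    dists.sort()
    within = [i for d, i in dists if d <= max_dist]
    if len(within) < n_min:
        within = [i for _, i in dists[:n_min]]
    return within[:n_max]


def rpd(y_true, y_pred):
    """RPD, assumed here to be SD of reference values over RMSEP."""
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((y - mean) ** 2 for y in y_true) / (n - 1))
    rmsep = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return sd / rmsep
```

In a full pipeline, a local PLS model would be fitted on the spectra returned by `select_similar_subset` and used to predict only that one test sample; the ranges reported above (distance 0–3.0 × 10⁻⁹, subset size 10–300) would correspond to `max_dist`, `n_min`, and `n_max` in this sketch.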