Machine learning is one of the most innovative tools that have entered
the materials science toolkit in recent years. This work employs a
machine learning strategy to develop a yield prediction model for
producing cellulose nanocrystals (CNCs). It analyzes the critical
factors affecting CNC yield, with the aim of optimizing reaction
conditions and reducing the number of experiments required. First, a
data set of CNCs is established,
including cellulose sources and reaction conditions. The Weighted
Average Ensemble (WAE) approach is then used to combine five
tree-based base models on the data set, and the WAE is found to
outperform all of the base models. The impact of critical features
on yield prediction is analyzed with partial dependence plots and
individual conditional expectation plots. CNCs are mainly produced
through batch experiments, which are time-consuming. In this context,
the WAE model is a promising tool for rapidly predicting the yield,
and this study provides a route to improving the high-yield extraction
of CNCs.
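
To make the weighted-averaging idea concrete, the following is a minimal sketch of how predictions from several base models might be combined into a single yield estimate. It is not the procedure used in this work: the weighting rule (weights proportional to the inverse of each model's validation mean squared error), the function names, and the numbers are all assumptions chosen for illustration, and three stand-in prediction lists take the place of the five tree-based models.

```python
from typing import List, Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def inverse_error_weights(errors: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Assumed rule: weight each model by 1/error, normalized to sum to 1."""
    inverse = [1.0 / (e + eps) for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_average(
    preds_per_model: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-model predictions into one prediction per sample."""
    n_samples = len(preds_per_model[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, preds_per_model))
        for i in range(n_samples)
    ]


# Illustrative values only: hypothetical validation yields (%) and the
# predictions of three hypothetical base models on the same samples.
y_val = [40.0, 55.0, 62.0, 70.0]
base_preds = [
    [42.0, 53.0, 65.0, 68.0],
    [38.0, 58.0, 60.0, 75.0],
    [45.0, 50.0, 66.0, 66.0],
]

errors = [mse(y_val, p) for p in base_preds]           # per-model validation error
weights = inverse_error_weights(errors)                # better models weigh more
ensemble_pred = weighted_average(base_preds, weights)  # combined yield estimate
ensemble_error = mse(y_val, ensemble_pred)
```

Because the weights are non-negative and sum to one, each ensemble prediction lies between the smallest and largest base-model prediction for that sample. Whether the ensemble actually beats every base model depends on the data and on how the weights are chosen.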