Fusion Method Combining Ground-Level Observations with Chemical Transport Model Predictions Using an Ensemble Deep Learning Framework: Application in China to Estimate Spatiotemporally-Resolved PM2.5 Exposure Fields in 2014–2017

journal contribution
posted on 06.06.2019, 00:00 by Baolei Lyu, Yongtao Hu, Wenxian Zhang, Yunsong Du, Bin Luo, Xiaoling Sun, Zhe Sun, Zhu Deng, Xiaojiang Wang, Jun Liu, Xuesong Wang, Armistead G. Russell
Atmospheric chemical transport models (CTMs) have been widely used to simulate spatiotemporally resolved PM2.5 concentrations. However, CTM results are usually prone to bias and errors. In this study, we improved the accuracy of PM2.5 predictions by developing an ensemble deep learning framework to fuse model simulations with ground-level observations. The framework encompasses four machine-learning models (a general linear model, a fully connected neural network, a random forest, and a gradient boosting machine) and combines them using a stacking approach. The framework is applied to PM2.5 concentrations simulated by the Community Multiscale Air Quality (CMAQ) model for China from 2014 to 2017; the simulated fields have complete spatial coverage over the entirety of China at a 12-km resolution, with no sampling biases. The fused PM2.5 concentration fields were evaluated against an independent network of observations: the R² value increased from 0.39 to 0.64, and the RMSE decreased from 33.7 μg/m³ to 24.8 μg/m³. According to the fused data, the percentage of the Chinese population living in areas that meet the Level II National Ambient Air Quality Standard of 35 μg/m³ for PM2.5 increased from 46.5% in 2014 to 61.7% in 2017. The method can be readily adapted to use near-real-time observations for operational analyses and forecasting of pollutant concentrations, and it can be extended to provide source apportionment forecasts as well.
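
The stacking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses purely synthetic stand-in data, a hypothetical temperature covariate, and a linear meta-learner chosen only for illustration. The hyperparameters and variable names are likewise assumptions. It combines the four named model types in a stacked ensemble and then compares R² and RMSE of the raw "CTM" values and the fused values on held-out data.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the study): "observed" PM2.5 and a
# deliberately biased, noisy "CTM" simulation of it.
n = 600
true_pm = rng.gamma(shape=4.0, scale=12.0, size=n)   # observations, ug/m3
temp = rng.normal(15.0, 10.0, size=n)                # hypothetical covariate
ctm_pm = 0.6 * true_pm + 0.4 * temp + 30.0 + rng.normal(0.0, 6.0, size=n)

X = np.column_stack([ctm_pm, temp])
y = true_pm
train, test = slice(0, 450), slice(450, n)           # simple hold-out split

# The four base learners named in the abstract (settings are assumptions).
base_learners = [
    ("linear", LinearRegression()),
    ("mlp", make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000,
                     random_state=0),
    )),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gbm", GradientBoostingRegressor(random_state=0)),
]

# Stacking: a meta-learner is fit on cross-validated base-learner predictions.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=LinearRegression(),  # assumed meta-learner
    cv=5,
)
stack.fit(X[train], y[train])
fused = stack.predict(X[test])


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return float(np.sqrt(mean_squared_error(obs, pred)))


raw_rmse, fused_rmse = rmse(y[test], ctm_pm[test]), rmse(y[test], fused)
raw_r2, fused_r2 = r2_score(y[test], ctm_pm[test]), r2_score(y[test], fused)
print(f"raw   CTM: R2={raw_r2:.2f}  RMSE={raw_rmse:.1f}")
print(f"fused    : R2={fused_r2:.2f}  RMSE={fused_rmse:.1f}")
```

The numbers printed by this sketch reflect only the synthetic data and have no bearing on the values reported above. In a real application the held-out set would be an independent monitoring network, as in the evaluation the abstract describes.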