WIT Press


ENSEMBLE DEEP LEARNING FOR CLASSIFICATION OF POLLUTION PEAKS

Price

Free (open access)

Volume

259

Pages

12

Page Range

25 - 36

Published

2022

Paper DOI

10.2495/AWP220031

Author(s)

PHUONG N. CHAU, RASA ZALAKEVICIUTE, YVES RYBARCZYK

Abstract

Concentration peaks of atmospheric pollutants are the most challenging and important phenomena in air quality forecasting. Because these elevated pollution levels do not seem to follow any specific pattern, current models still struggle to accurately predict these events, which are harmful to human health. The present study tackles this issue by testing several supervised learning methods to discriminate between peak and non-peak concentrations of five contaminants: NO2, CO, SO2, PM2.5, and O3. The classification performance of ensemble decision tree models (gradient boosting machine, GBM) is compared with that of ensemble deep learning (EDL) models. The results reveal that the EDL models outperform the GBM models. An analysis of variable importance using SHapley Additive exPlanations (SHAP) shows that both temporal and meteorological features have an impact on the proposed models. In particular, time of day and wind speed are the most important features for explaining the performance of the EDL models.
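
The peak versus non-peak framing can be illustrated with a minimal sketch. The abstract gives neither the peak definition nor the way the ensemble members are combined, so both choices here are assumptions made only for illustration: a peak is taken to be any value above the 90th percentile of the series, and the ensemble decision is a soft vote (the mean of the members' predicted peak probabilities). The names `label_peaks` and `soft_vote` are hypothetical.

```python
from statistics import quantiles


def label_peaks(concentrations, q=0.9):
    """Label each value 1 (peak) or 0 (non-peak).

    Assumption: a peak is a value strictly above the q-th quantile of the
    series. The paper's actual peak definition is not given in the abstract.
    """
    # quantiles(n=100) returns 99 cut points; entry k-1 is the k-th percentile.
    cuts = quantiles(concentrations, n=100, method="inclusive")
    threshold = cuts[round(q * 100) - 1]
    labels = [1 if c > threshold else 0 for c in concentrations]
    return labels, threshold


def soft_vote(member_probs, cutoff=0.5):
    """Combine ensemble members by averaging their peak probabilities.

    member_probs holds one list of per-sample probabilities per member.
    Assumption: soft voting stands in for whatever combination rule the
    paper's ensemble models use.
    """
    n_members = len(member_probs)
    averaged = [sum(sample) / n_members for sample in zip(*member_probs)]
    return [1 if p >= cutoff else 0 for p in averaged]


# Synthetic hourly concentrations (made-up values, not data from the paper).
series = list(range(1, 101))
labels, threshold = label_peaks(series)

# Peak probabilities from three hypothetical ensemble members for three samples.
votes = soft_vote([[0.9, 0.2, 0.6], [0.7, 0.4, 0.3], [0.8, 0.1, 0.5]])
```

With a percentile-based definition the peak class is a small minority by construction, so plain accuracy says little about how well such a classifier detects peaks.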

Keywords

machine learning, deep learning, air pollution forecasting, data-driven modelling