• Nine classifiers were used to forecast exceedance of the European O3 threshold.
• Bagging and stacking ensembles, along with their cost of learning, were evaluated.
• Two classifiers showed the best forecasting performance across three different metrics.
• Performance obtained with bagging and stacking was lower than expected.
• The approaches used are not influenced by training data set size.
Abstract
Classification models to forecast exceedance of the ozone (O3) threshold established by European legislation are rare in the literature, as is a focus on background O3, which reaches higher concentrations on city outskirts. This study evaluated the performance of nine classifiers in forecasting exceedance of this threshold by background O3. The models used five large hourly background O3 data sets (2006–2015) and included temporal features describing the dynamics of O3 formation. Bagging and stacking ensembles of these classifiers, and their cost of learning, were also evaluated. The C5.0 and nnet classifiers achieved the best forecasting performance, even under imbalanced learning. Bagging ensembles outperformed stacking approaches, although with little accuracy improvement over the individual classifiers. The cost-of-learning analysis showed that reduced fractions of the original data sets yielded similar performance. The use of these models to forecast background O3 threshold exceedances is encouraged because of the performance obtained and their easy reproducibility.
Keywords
Background ozone
Forecasting
Classification
Imbalanced learning
Ensembles
Cost of learning
Peer review under responsibility of Turkish National Committee for Air Pollution Research and Control.