
Comparison of Four Machine Learning Methods for Predicting PM10 Concentrations in Helsinki, Finland

Water, Air and Soil Pollution: Focus

Abstract

Machine learning methods can offer a practical alternative to deterministic and statistical methods for predicting air pollution concentrations. However, for a given data set, it is often not clear beforehand which machine learning method will yield the best prediction performance. This study compares the variable selection and prediction performance of four machine learning methods of different complexity: logistic regression, decision tree, multivariate adaptive regression splines and neural network. The methods are applied to the task of predicting the exceedance of the European PM10 daily average objective of 50 μg m⁻³ for a station in Helsinki, Finland. Our study shows that some predictors were selected by all models but that the different models also picked different variables. The performance of three of the four methods investigated was very similar; however, the performance of the decision tree method was significantly inferior. Performance was sensitive to the learning sample size and time period used.
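As a rough illustration of the prediction task described above, the sketch below labels days as exceedances of the 50 μg m⁻³ daily average objective and scores two sets of yes/no forecasts against those labels. It is a minimal sketch under stated assumptions, not the study's procedure: the readings and the two forecast lists are made up, the function names are hypothetical, and the hit rate and false alarm rate are common forecast verification scores chosen here for illustration (the abstract does not say which performance measures were used).

```python
from statistics import mean

# Daily average objective for PM10 named in the abstract, in ug m^-3.
PM10_DAILY_LIMIT = 50.0


def exceeds_limit(readings):
    """Return True if the mean of one day's PM10 readings is above the limit."""
    return mean(readings) > PM10_DAILY_LIMIT


def hit_and_false_alarm_rates(observed, predicted):
    """Score boolean forecasts against boolean observations.

    Hit rate: fraction of observed exceedance days that were forecast.
    False alarm rate: fraction of non-exceedance days forecast as exceedances.
    (Assumed scores for illustration, not taken from the study.)
    """
    pairs = list(zip(observed, predicted))
    hits = sum(1 for o, p in pairs if o and p)
    misses = sum(1 for o, p in pairs if o and not p)
    false_alarms = sum(1 for o, p in pairs if not o and p)
    correct_negatives = sum(1 for o, p in pairs if not o and not p)
    hit_rate = hits / (hits + misses) if hits + misses else 0.0
    non_events = false_alarms + correct_negatives
    false_alarm_rate = false_alarms / non_events if non_events else 0.0
    return hit_rate, false_alarm_rate


# Made-up readings for four days (illustrative only, not data from the study).
days = [
    [30.0, 45.0, 40.0],
    [60.0, 70.0, 55.0],
    [20.0, 25.0, 30.0],
    [52.0, 49.0, 58.0],
]
observed = [exceeds_limit(day) for day in days]

# Hypothetical yes/no forecasts from two classifiers for the same four days.
model_a = [False, True, False, False]
model_b = [True, True, False, True]

for name, forecast in (("model_a", model_a), ("model_b", model_b)):
    hit_rate, false_alarm_rate = hit_and_false_alarm_rates(observed, forecast)
    print(f"{name}: hit rate {hit_rate:.2f}, false alarm rate {false_alarm_rate:.2f}")
```

Reporting the two scores side by side shows the trade-off a comparison of classifiers has to weigh: in this toy data the second forecast catches every exceedance but raises more false alarms.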




About this article

Cite this article

Zickus, M., Greig, A.J. & Niranjan, M. Comparison of Four Machine Learning Methods for Predicting PM10 Concentrations in Helsinki, Finland. Water, Air, & Soil Pollution: Focus 2, 717–729 (2002). https://doi.org/10.1023/A:1021321820639


  • DOI: https://doi.org/10.1023/A:1021321820639
