Using Preprocessing Techniques in Air Quality forecasting with Artificial Neural Networks

  • Conference paper

Part of the book series: Environmental Science and Engineering (ENVENG)

Abstract

Data quality is one of the fundamental issues influencing the performance of any data investigation algorithm. Poor data quality always leads to poor-quality results. In the investigation chain, the data selection phase is followed by the preprocessing phase, which increases data quality but is also the most time-consuming part of the overall chain. The preprocessing phase includes the handling of missing data, the handling of outliers, data de-trending and data smoothing. The methods used in this phase are usually not sufficiently reported in the literature on environmental data analysis and knowledge extraction. This paper investigates the performance of several methods in all phases of the preprocessing chain for environmental data, emphasizing the use of ICT (Information & Communication Technology) methods to implement these preprocessing tasks and using air quality as the example environmental domain.
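The four preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which methods the paper evaluates, so every choice here is an assumption made for illustration only: linear interpolation for missing values, median/MAD clipping for outliers, removal of a least-squares linear trend, and a centered moving average for smoothing. The function names and the default parameters (`k=3.0`, `window=3`) are hypothetical and do not come from the paper.

```python
from statistics import mean, median


def fill_missing(series):
    """Linearly interpolate None gaps; edge gaps take the nearest known value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    out = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = series[left] + weight * (series[right] - series[left])
    return out


def clip_outliers(series, k=3.0):
    """Clip values lying more than k scaled MADs from the median (assumed rule)."""
    med = median(series)
    mad = median([abs(v - med) for v in series])
    limit = k * 1.4826 * mad  # 1.4826 scales the MAD to a normal std. deviation
    if limit == 0:
        return list(series)
    return [min(max(v, med - limit), med + limit) for v in series]


def detrend(series):
    """Subtract the least-squares straight line fitted against the sample index."""
    n = len(series)
    if n < 2:
        return [0.0] * n
    t_mean = (n - 1) / 2
    y_mean = mean(series)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def moving_average(series, window=3):
    """Centered moving average; the window shrinks at the series edges."""
    half = window // 2
    n = len(series)
    return [mean(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def preprocess(series):
    """Chain the four steps in the order the abstract lists them."""
    return moving_average(detrend(clip_outliers(fill_missing(series))))
```

Calling `preprocess` on a list of hourly concentrations, with `None` marking missing readings, returns a cleaned series of the same length. The step order follows the abstract; whether a different order or different methods work better for a given pollutant is exactly the kind of question the paper sets out to examine.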


Author information

Corresponding author

Correspondence to George Papadourakis.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kyriakidis, I., Karatzas, K.D., Papadourakis, G. (2009). Using Preprocessing Techniques in Air Quality forecasting with Artificial Neural Networks. In: Athanasiadis, I.N., Rizzoli, A.E., Mitkas, P.A., Gómez, J.M. (eds) Information Technologies in Environmental Engineering. Environmental Science and Engineering. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88351-7_27
