Atmospheric Environment

Volume 44, Issue 4, February 2010, Pages 476-482

Uncertainty analysis of developed ANN and ANFIS models in prediction of carbon monoxide daily concentration

https://doi.org/10.1016/j.atmosenv.2009.11.005

Abstract

This study aims to predict the daily carbon monoxide (CO) concentration in the atmosphere of Tehran by means of developed artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) models. Forward selection (FS) and Gamma test (GT) methods are used to select input variables and to develop hybrid models with ANN and ANFIS. From 12 input candidates, 7 and 9 variables are selected using FS and GT, respectively. Comparing the developed hybrid models with ANN and ANFIS models fed with all input variables shows that both FS and GT reduce not only the output error but also the computational cost, because fewer inputs are used. The FS–ANN and FS–ANFIS models are selected as the best models on the basis of R2, mean absolute error and the developed discrepancy ratio statistic. These two models are also shown to be superior in predicting pollution episodes. Finally, uncertainty analysis based on Monte-Carlo simulation is carried out for the FS–ANN and FS–ANFIS models; it shows that the FS–ANN model has less uncertainty, i.e. it is the best model and forecasts the trends in daily CO concentration levels satisfactorily.

Introduction

In recent years, artificial intelligence (AI) based methods have been proposed as alternatives to traditional statistical ones in many scientific disciplines. The literature demonstrates that AI models such as ANN and neuro-fuzzy techniques have been used successfully for air pollution modeling (Nunnari et al., 2004, Perez-Roaa et al., 2006) and forecasting (Perez et al., 2000, Gautama et al., 2008). Moseholm et al. (1996) used an ANN model to investigate the relationships between traffic and carbon monoxide (CO) concentrations measured near an intersection that was sheltered from the wind by multi-story buildings. They compared ANN and multiple linear regression (MLR) models and reported ANN as the superior model. Viotti et al. (2002) used an ANN model with a hidden layer to predict short-term and medium-term air pollutant concentrations (CO, ozone and benzene) in an urban area of Perugia city. Martin et al. (2008) used ANN and k-nearest neighbors classifiers as predictive tools for forecasting future peaks of CO. Noori et al. (2008) compared ANN and PCA-MLR models in forecasting the daily CO concentration in the atmosphere of Tehran and reported ANN as the superior model. Tanaka et al. (1995) used a neuro-fuzzy technique to model and control CO concentration in a large city of Japan; their prediction results show that the fuzzy model performs much better than the linear model. Yildirim and Bayramoglu (2006) proposed an adaptive neuro-fuzzy inference system (ANFIS) to estimate the impact of meteorological factors on SO2 and total suspended particulate matter pollution levels over an urban area in Turkey. Carnevale et al. (2009) presented the application of neural network and neuro-fuzzy models to estimate nonlinear source–receptor relationships between precursor emissions and pollutant concentrations (ozone and PM10) in Northern Italy. The results show that the selected source–receptor models offer a large advantage in computational cost while still accurately reproducing the simulations of the 3D modeling system.

Input selection is a crucial step in ANN and ANFIS implementation, because these techniques are not engineered to eliminate superfluous inputs. When the number of input variables is high, irrelevant, redundant, and noisy variables might be included in the data set and, at the same time, meaningful variables could be hidden (Seasholtz and Kowalski, 1993, Noori et al., 2009a). Therefore, reducing the number of input variables is recommended. Different methods exist for this purpose, such as forward selection (FS) (Chen et al., 1989, Wang et al., 2006) and Gamma test (GT) techniques (Corcoran et al., 2003, Moghaddamnia et al., 2009). Another important subject, which has rarely been addressed for ANN and ANFIS in comparison with other statistical models, is uncertainty analysis of the results. Predictions are never certain; therefore, uncertainty analysis can support the practical application of results. The literature shows that only a few methods have been proposed for determining uncertainty in ANN and ANFIS. Among them are the bootstrap and sandwich estimator (Tibshirani, 1994), maximum likelihood and Bayesian inference (Dybowski, 1997) and the Monte-Carlo method proposed by Marce et al. (2004). In this research, Monte-Carlo simulation, which is based on placing the models in a Monte-Carlo random sampling process, is selected because it offers both better performance and greater novelty. Aqil et al. (2007) applied this uncertainty analysis method to evaluate the outputs of ANFIS in predicting weekly stream flow in a river and reported that it is appropriate for the ANFIS model. Noori et al. (2009b) used the Monte-Carlo method for uncertainty analysis of solid waste generation forecasting by means of wavelet transform-ANFIS and wavelet transform-ANN.
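As a rough illustration of what placing a model in a Monte-Carlo random sampling process can look like, the sketch below repeatedly refits a model on random subsets of the data and summarizes the spread of its predictions. It is not the procedure of this paper: the stand-in least-squares line (instead of ANN or ANFIS), the function names, the number of runs, the 70% subset fraction and the 2.5th to 97.5th percentile band are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Callable[[float], float]:
    """Stand-in model (assumption): a least-squares line, returned as a predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda q: intercept + slope * q


def percentile(sorted_vals: List[float], p: float) -> float:
    """Linear-interpolation percentile of pre-sorted values, with p in [0, 100]."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def monte_carlo_band(
    x: Sequence[float],
    y: Sequence[float],
    query: float,
    n_runs: int = 500,   # hypothetical number of Monte-Carlo repetitions
    frac: float = 0.7,   # hypothetical fraction of data used in each refit
    seed: int = 0,
) -> Tuple[float, float]:
    """Refit on random subsets and return a 95% band for the prediction at `query`."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    k = max(2, int(frac * len(x)))
    predictions = []
    for _ in range(n_runs):
        sample = rng.sample(indices, k)  # sampling without replacement
        model = fit_line([x[i] for i in sample], [y[i] for i in sample])
        predictions.append(model(query))
    predictions.sort()
    return percentile(predictions, 2.5), percentile(predictions, 97.5)


if __name__ == "__main__":
    xs = [float(i) for i in range(20)]
    ys = [3.0 * xi + 1.0 + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(xs)]
    print(monte_carlo_band(xs, ys, query=10.0))
```

A narrower band at a given query point indicates that the prediction depends less on which samples were used for fitting, which is one simple way to compare the uncertainty of two models.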

In this study, two input selection techniques (FS and GT) are applied to build hybrid models with ANN and ANFIS (FS–ANN, FS–ANFIS, GT–ANN, and GT–ANFIS), which are then compared with ANN and ANFIS models fed with all input data. Finally, uncertainty analysis is carried out for the two best models and the superior model is reported.

Section snippets

Case study and data

Tehran is the capital and largest city of Iran. It is located between 35° 34′–35° 50′N and 51° 02′–51° 36′E, covers an area of about 570 km2, and is surrounded by mountains to the north, west and east. It has a current population of about 8,000,000 (Bayat, 2005). There are 11 air quality measurement stations in Tehran. The results of previous studies on air pollution in Tehran demonstrate that 90% by weight of total air pollutants are generated by traffic and only 10% by other sources (

Forward selection

In this study, the FS method is used as a linear input selection technique to select the best subset of the 12 input candidates. In other words, a linear model is developed using the best-correlated subset of inputs. First, the correlation between each input variable and the desired output is evaluated. Second, the variable with the highest correlation, i.e. Temp with R2 = 0.26, is selected as the first and most important input. Then, the remaining candidates are added to the model one by one
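The steps described in this snippet can be sketched as a greedy loop over a linear model. Because the snippet is truncated, the acceptance rule used here (keep adding the candidate that raises R2 the most, and stop once the gain falls below a threshold) is an assumption, as are the function names, the `min_gain` value and the toy data; this is a minimal sketch, not the authors' implementation.

```python
from typing import Dict, List, Sequence, Tuple


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(columns: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """R2 of a least-squares fit of y on the given columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = _solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_selection(
    candidates: Dict[str, Sequence[float]],
    y: Sequence[float],
    min_gain: float = 0.01,  # hypothetical stopping threshold on the R2 gain
) -> Tuple[List[str], float]:
    """Greedily add the candidate that raises R2 most; stop when the gain is small."""
    selected: List[str] = []
    remaining = list(candidates)
    best_r2 = 0.0
    while remaining:
        scores = {
            name: r_squared([candidates[n] for n in selected + [name]], y)
            for name in remaining
        }
        best_name = max(scores, key=scores.get)
        if scores[best_name] - best_r2 < min_gain:
            break
        selected.append(best_name)
        remaining.remove(best_name)
        best_r2 = scores[best_name]
    return selected, best_r2


if __name__ == "__main__":
    # Toy data (not from the paper): y depends on "a" and "c" only.
    data = {
        "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
        "b": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
        "c": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    }
    target = [2.0 * a + 0.5 * c for a, c in zip(data["a"], data["c"])]
    print(forward_selection(data, target))
```

With a single input, the R2 of the linear fit equals the squared correlation between that input and the output, so the first pass of the loop reproduces the correlation ranking described above. The sketch assumes the candidate columns are not exactly collinear.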

Conclusion

Considering the importance of the daily CO concentration in the atmosphere of Tehran, this research aims to develop suitable prediction models using ANN and ANFIS. Since input selection is a significant step in modeling, FS and GT methods are used and six models are developed. The goodness of each model is evaluated using the R2, d, and MAE statistics as well as DDR. Finally, uncertainty analysis of FS–ANN and FS–ANFIS, as the superior models, is carried out. The following conclusions could be drawn

References (53)

  • S.M.S. Nagendra et al.

    Artificial neural network approach for modeling nitrogen dioxide dispersion from vehicular exhaust emissions

    Ecological Modelling

    (2006)
  • R. Noori et al.

    Results uncertainty of solid waste generation forecasting by hybrid of wavelet transform-ANFIS and wavelet transform-neural network

    Expert Systems with Applications

    (2009)
  • G. Nunnari et al.

    Modelling SO2 concentration at a point with statistical approaches

    Environmental Modelling & Software

    (2004)
  • P. Perez et al.

    Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile

    Atmospheric Environment

    (2000)
  • M.B. Seasholtz et al.

    The parsimony principle applied to multivariate calibration

    Analytica Chimica Acta

    (1993)
  • P. Viotti et al.

    Atmospheric urban pollution: applications of an artificial neural network (ANN) to the city of Perugia

    Ecological Modelling

    (2002)
  • X.X. Wang et al.

    Sparse support vector regression based on orthogonal forward selection for the generalised kernel model

    Neurocomputing

    (2006)
  • Y. Yildirim et al.

    Adaptive neuro-fuzzy based modelling for prediction of air pollution daily levels in city of Zonguldak

    Chemosphere

    (2006)
  • S. Agalbjörn et al.

    A note on the gamma test

    Neural Computing & Applications

    (1997)
  • Bayat, R., 2005. Source Apportionment of Tehran's Air Pollution. M.Sc thesis. Department of Civil and Environmental...
  • S. Chen et al.

    Orthogonal least squares methods and their application to nonlinear system identification

    International Journal of Control

    (1989)
  • S. Chen et al.

    Sparse modeling using orthogonal forward regression with PRESS statistic and regularization

    IEEE Transactions on Systems, Man, and Cybernetics – Part B

    (2004)
  • S.L. Chiu

    Fuzzy model identification based on cluster estimation

    Journal of Intelligent Information Systems

    (1994)
  • G. Cybenko

    Approximation by superposition of a sigmoidal function

    Mathematics of Control, Signals, and Systems

    (1989)
  • Z.Q. Deng et al.

    Longitudinal dispersion coefficient in straight rivers

    Journal of Hydraulic Engineering (ASCE)

    (2001)
  • Durrant, P.J., 2001. winGamma: a Non-linear Data Analysis and Modeling Tool with Applications to Flood Prediction. PhD...