Uncertainty analysis of developed ANN and ANFIS models in prediction of carbon monoxide daily concentration
Introduction
In recent years, artificial intelligence (AI) based methods have been proposed as alternatives to traditional statistical ones in many scientific disciplines. The literature demonstrates that AI models such as artificial neural networks (ANN) and neuro-fuzzy techniques have been used successfully for air pollution modeling (Nunnari et al., 2004, Perez-Roaa et al., 2006) and forecasting (Perez et al., 2000, Gautama et al., 2008). Moseholm et al. (1996) used an ANN model to investigate the relationship between traffic and carbon monoxide (CO) concentrations measured near an intersection that was sheltered from the wind by multi-story buildings. They compared ANN and multiple linear regression (MLR) models and reported ANN as the superior model. Viotti et al. (2002) used an ANN model with a hidden layer to predict short-term and medium-term air pollutant concentrations (CO, ozone and benzene) in an urban area of the city of Perugia. Martin et al. (2008) used ANN and k-nearest neighbors classifiers as predictive tools to predict future peaks of CO. Noori et al. (2008) compared ANN and PCA-MLR models for forecasting the daily CO concentration in the atmosphere of Tehran and reported ANN as the superior model. Tanaka et al. (1995) used a neuro-fuzzy technique to model and control CO concentration in a large city in Japan; their prediction results showed that the fuzzy model performed much better than the linear model. Yildirim and Bayramoglu (2006) proposed an adaptive neuro-fuzzy inference system (ANFIS) to estimate the impact of meteorological factors on SO2 and total suspended particulate matter pollution levels over an urban area in Turkey. Carnevale et al. (2009) applied neural network and neuro-fuzzy models to estimate nonlinear source–receptor relationships between precursor emissions and pollutant concentrations (ozone and PM10) in Northern Italy. Their results show that the selected source–receptor models accurately reproduce the simulations of the 3D modeling system while offering a large advantage in computational cost.
Input selection is a crucial step in ANN and ANFIS implementation. These techniques are not engineered to eliminate superfluous inputs. When the number of input variables is high, irrelevant, redundant, and noisy variables might be included in the data set, while meaningful variables could be hidden (Seasholtz and Kowalski, 1993, Noori et al., 2009a). Therefore, reducing the number of input variables is recommended. Different methods exist for this purpose, such as forward selection (FS) (Chen et al., 1989, Wang et al., 2006) and Gamma test (GT) techniques (Corcoran et al., 2003, Moghaddamnia et al., 2009). Another important subject, which has been addressed far less often for ANN and ANFIS than for other statistical models, is the uncertainty analysis of results. Because predictions are never certain, uncertainty analysis makes the results more useful in application. The literature shows that only a few methods have been proposed for determining uncertainty in ANN and ANFIS, among them the bootstrap and the sandwich estimator (Tibshirani, 1994), maximum likelihood and Bayesian inference (Dybowski, 1997), and the Monte Carlo method proposed by Marce et al. (2004). In this research, Monte Carlo simulation, which embeds the models in a random sampling process, is selected because it offers both better performance and greater novelty. Aqil et al. (2007) applied this uncertainty analysis method to evaluate the outputs of an ANFIS model predicting weekly stream flow in a river and reported that it is appropriate for ANFIS. Noori et al. (2009b) used the Monte Carlo method for uncertainty analysis of solid waste generation forecasts produced by wavelet transform-ANFIS and wavelet transform-ANN models.
In this study, two input selection techniques (FS and GT) are applied to build hybrid models with ANN and ANFIS (FS–ANN, FS–ANFIS, GT–ANN, and GT–ANFIS), which are then compared with ANN and ANFIS models fed with all input data. Finally, uncertainty analysis is carried out for the two best models and the superior model is reported.
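The idea of placing a model inside a Monte Carlo random sampling process can be illustrated with a minimal sketch. It is not the authors' implementation: the least-squares model is only a stand-in for a trained ANN or ANFIS, and the function names, the 70% resampling fraction, the number of runs, the 95% band, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept (stand-in for training an ANN/ANFIS)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predictions of the stand-in model."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def monte_carlo_band(X_train, y_train, X_test, n_runs=500, frac=0.7, seed=0):
    """Refit the model on random training subsets and return the
    2.5th and 97.5th percentiles of the predictions at each test point."""
    rng = np.random.default_rng(seed)
    n = len(y_train)
    k = int(frac * n)  # size of each random training subset (assumed 70%)
    preds = np.empty((n_runs, len(X_test)))
    for r in range(n_runs):
        idx = rng.choice(n, size=k, replace=False)  # sample without replacement
        coef = fit_linear(X_train[idx], y_train[idx])
        preds[r] = predict_linear(coef, X_test)
    lower, upper = np.percentile(preds, [2.5, 97.5], axis=0)
    return lower, upper

# Synthetic stand-in data (hypothetical; not the Tehran CO data set).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=300)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

lower, upper = monte_carlo_band(X_train, y_train, X_test)
coverage = float(np.mean((y_test >= lower) & (y_test <= upper)))
mean_width = float(np.mean(upper - lower))
print(f"share of observations inside the 95% band: {coverage:.2f}")
print(f"mean band width: {mean_width:.3f}")
```

The band obtained this way reflects only how much the predictions change when the training sample changes. It does not include observation noise, so the share of observations falling inside it can be well below 95%. A narrower band with a higher share of bracketed observations would point to the less uncertain of two competing models.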
Case study and data
Tehran, the capital and largest city of Iran, is located between 35° 34′–35° 50′N and 51° 02′–51° 36′E and covers an area of about 570 km². It is surrounded by mountains to the north, west and east and has a current population of about 8,000,000 (Bayat, 2005). There are 11 air quality measurement stations in Tehran. Previous studies of air pollution in Tehran show that 90% by weight of total air pollutants are generated by traffic and only 10% by other sources (
Forward selection
In this study, the FS method is used as a linear input selection technique to select the best subset of the 12 input candidates. In other words, a linear model is developed using the best-correlated subset of inputs. First, the correlation between each input variable and the desired output is evaluated. Second, the variable with the highest correlation, i.e. Temp with R² = 0.26, is selected as the first and most important input. Then, the remaining candidates are added to the model one by one
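The forward selection steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the snippet breaks off before the stopping criterion is stated, so the `min_gain` threshold used here is an assumption, as are the function names and the synthetic data.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def forward_selection(X, y, min_gain=0.01):
    """Greedy forward selection for a linear model.

    The first pick is the input with the highest squared correlation with y;
    each later step adds the candidate that raises R^2 the most. Selection
    stops when the gain in R^2 falls below min_gain (an assumed rule).
    Returns the chosen column indices and the R^2 after each step.
    """
    remaining = list(range(X.shape[1]))
    selected, history = [], []
    best_r2 = 0.0
    while remaining:
        # Try each remaining candidate together with the already selected inputs.
        scores = [(r_squared(X[:, selected + [j]], y), j) for j in remaining]
        r2, j = max(scores)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        history.append(r2)
        best_r2 = r2
    return selected, history

# Synthetic stand-in data (hypothetical): only columns 2 and 0 drive the output.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 2] + 1.0 * X[:, 0] + 0.1 * rng.normal(size=200)

selected, history = forward_selection(X, y)
print("selected columns:", selected)
print("R^2 after each step:", [round(v, 3) for v in history])
```

For a single input, the R² of the linear fit equals the squared correlation with the output, so the first iteration reproduces the "highest correlation" step of the description.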
Conclusion
Considering the importance of the daily CO concentration in the atmosphere of Tehran, this research aims to develop suitable prediction models using ANN and ANFIS. Since input selection is a significant step in modeling, the FS and GT methods are used and six models are developed. The goodness of each model is evaluated using the R², d, and MAE statistics as well as DDR. Finally, uncertainty analysis of FS–ANN and FS–ANFIS, the superior models, is carried out. The following conclusions could be drawn
References (53)
- et al. Modeling hydrology and water quality in the pre-alpine/alpine Thur watershed using SWAT. Journal of Hydrology (2007)
- et al. Analysis and prediction of flow from local source in a river basin using a neuro-fuzzy modeling tool. Journal of Environmental Management (2007)
- et al. Neuro-fuzzy and neural network systems for air quality control. Atmospheric Environment (2009)
- et al. Predicting the geo-temporal variation of crime and disorder. International Journal of Forecasting (2003)
- et al. Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. Journal of Hydrology (2000)
- et al. Subset selection in multiple linear regression: a new mathematical programming approach. Computers & Industrial Engineering (2005)
- et al. Building a robust linear model with forward selection and stepwise procedures. Computational Statistics & Data Analysis (2007)
- et al. Prediction of CO maximum ground level concentrations in the Bay of Algeciras, Spain using artificial neural networks. Chemosphere (2008)
- et al. Evaporation estimation using artificial neural networks and adaptive neuro-fuzzy inference system techniques. Advances in Water Resources (2009)
- et al. Forecasting carbon monoxide concentrations near a sheltered intersection using video traffic surveillance and neural networks. Transport Research (1996)
- Artificial neural network approach for modeling nitrogen dioxide dispersion from vehicular exhaust emissions. Ecological Modelling
- Results uncertainty of solid waste generation forecasting by hybrid of wavelet transform-ANFIS and wavelet transform-neural network. Expert Systems with Applications
- Modelling SO2 concentration at a point with statistical approaches. Environmental Modelling & Software
- Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile. Atmospheric Environment
- The parsimony principle applied to multivariate calibration. Analytica Chimica Acta
- Atmospheric urban pollution: applications of an artificial neural network (ANN) to the city of Perugia. Ecological Modelling
- Sparse support vector regression based on orthogonal forward selection for the generalised kernel model. Neurocomputing
- Adaptive neuro-fuzzy based modelling for prediction of air pollution daily levels in city of Zonguldak. Chemosphere
- A note on the gamma test. Neural Computing Applied
- Orthogonal least squares methods and their application to nonlinear system identification. International Journal of Control
- Sparse modeling using orthogonal forward regression with PRESS statistic and regularization. IEEE Transactions on Systems, Man, and Cybernetics – Part B
- Fuzzy model identification based on cluster estimation. Journal of Intelligent Information Systems
- Approximation by superposition of a sigmoidal function. Mathematics of Control, Signals, and Systems
- Longitudinal dispersion coefficient in straight rivers. Journal of Hydraulic Engineering (ASCE)