Sodium Adsorption Ratio (SAR) Prediction for the Chalghazi River, Iran, Using an Artificial Neural Network (ANN)

Given the significance of the Sodium Adsorption Ratio (SAR) for plant growth, its prediction is essential for managing irrigation water quality. This study examined SAR prediction for the Chelghazy River in Kurdistan, northwest Iran, using an Artificial Neural Network (ANN). A Multilayer Perceptron (MLP) was applied to average monthly data collected by the water authority of Kurdistan province for the period 1998-2009. The inputs to the MLP network were pH, discharge, sulfate, sodium, calcium, chloride, magnesium and bicarbonate, and the output was the predicted SAR. The model uses the input parameters to predict the SAR for the same month. The correlation coefficient between actual and predicted SAR was 0.976, which indicates that the accuracy of the model is acceptable. The sensitivity analysis indicated that the SAR prediction was affected only by pH and calcium. Overall, the MLP may be applicable to predicting the SAR, which is a necessary parameter for agriculture.

this objective, it is necessary to have proper monitoring of water quality. Having enough data without proper interpretation cannot adequately support water quality management. There are several methods for analyzing water quality data, such as statistical models, deterministic models and artificial neural network models (Asadollahfardi et al., 2011, 2012). Dogan et al. (2007) used the ANN to forecast BOD concentration using data from eleven stations on the Melen River in 2001-2002, including COD, temperature, DO, chlorophyll-a, ammonia, nitrite and nitrate. They concluded that the ANN model provides a reasonable estimate of BOD. Singh et al. (2009) applied the ANN to predict dissolved oxygen and BOD using ten years of data from eight stations in India. The model input consisted of eleven monthly water quality parameters, and the predictions agreed well with the actual data. Huang and Foo (2002) applied the ANN to assess the variation of water salinity in the Apalachicola River in Florida. They used part of the existing data for testing and the remainder for validating the ANN model. Maier and Dandy (2002) reported 41 successful case studies in which MLP neural networks were used to predict changes in water variables.
Coppola et al. (2003) described the successful application of the technology to three types of groundwater management and prediction problems. In the first instance, an ANN was trained on simulation data from a numerical model based on physical data at the appropriate locations, under various pumping and climate conditions. The ANN achieved high forecasting precision, and its state equations were substituted into a multi-objective optimization formulation. In the second and third instances, the ANNs were developed from real climate and hydrology data for different hydrological conditions. For the second problem, an ANN was developed using information collected over five years and eight months under various climate and pumping conditions to predict the water heights in a limestone and ice layer under the multi-layer soil.
Misaghi et al. (2003) studied water quality in the Zayandeh Rud River by applying a General Regression Neural Network (GRNN) to ten years of BOD and DO data. Musavi-Jahromi and Golabi (2008) applied the ANN to monthly water quality data for the Karoon River, Iran, using CO, HCO3-, SO4, Cl, Na, Ca, Mg and K as inputs and TDS, EC and SAR as outputs. Their results were satisfactory. Kanani et al. (2008) studied water quality in the Achechay River at a selected water quality monitoring station, applying MLP and Input Delay Neural Network (IDNN) models. They predicted TDS using discharge as the input to the models. Olyaie et al. (2010) employed the MLP model with water quality parameters including BOD and DO for the Morad Beik River in Hamadan. Their outcome was satisfactory. Asadollahfardi et al. (2010) applied the MLP model to total phosphorus and total nitrogen data in a study of the Anzaliy Wetland (Iran) and obtained an acceptable prediction of eutrophication in the wetland. Asadollahfardi et al. (2012) applied the MLP and the Elman dynamic ANN to two stations on the Talkheh Rud River and predicted the TDS of the river one month in advance, with acceptable results.
In general, when assessing the dispersion risk of irrigation water, it is important to consider the ratio of sodium to the other exchangeable cations on soil colloids. High sodium concentrations in water reduce the permeability of the soil and cause infiltration problems. When sodium is present in the soil in exchangeable form, it replaces the calcium and magnesium adsorbed on the soil clays and causes dispersion of soil particles. The SAR is a suitable criterion for judging irrigation water suitability. If calcium and magnesium are the predominant cations adsorbed on the soil exchange complex, the soil tends to be easily cultivated and has a permeable, granular structure (Asadollahfardi et al., 2010).
Zhang and Stanley (1997) used the ANN modeling technique to establish a model for forecasting raw-water color in a large river. They also discussed the potential applications of ANNs in the water treatment industry. Their results indicate that ANN modeling shows much promise for water quality modeling and process control in water treatment. Keiner and Yan (1998) estimated sea-surface chlorophyll and sediments from Thematic Mapper imagery using neural network models. They found that a neural network with two hidden nodes, using the three visible Landsat Thematic Mapper bands as inputs, modeled the transfer function with much higher accuracy than multiple regression analysis. The RMS errors for the neural network were <10%, while the errors for regression analysis were >25%. Zhang et al. (2002) studied water quality in the Gulf of Finland using an empirical neural network that combined optical and microwave data. The results showed that the neural network estimated the major characteristics of surface waters much more accurately than regression analysis.
The study area is part of the Sirwan Basin and has a total area of about 105,000 hectares, which constitutes 3.8% of the area of Kurdistan province. The basin is bounded on the north by the Sefed River catchment, on the east by the Ghaveh River, and on the west by the Seravan River. The catchment is located between 46°46'40" and 47°20'00" East longitude and 35°24'44" and 35°43'23" North latitude (Figure 1).
The elevation of the area varies from 1500 to 2900 meters. All of the surface runoff of the basin is collected by the Chelghazy and Khalefeh Tar Khan main rivers (Mohammady and Fathi, 2004). Figure 1 shows the study area. The weather conditions of the basin are similar to those of a subtropical dry region. Its winters are cold, its summers are relatively warm and dry, and its rainy season lasts from October to May. The average annual minimum and maximum temperatures between 1990 and 2005 were 6.3 and 21.8 °C, respectively. The average annual minimum and maximum humidity between 1990 and 2005 were 26.7% and 21.8%, respectively. The annual total rainfall was 395 mm (Islamic Republic of Iran Meteorological Organization, 1983). The basin is part of Kurdistan Province, which is geologically quite active. The land straddles the subduction zone between the colliding Eurasian and African tectonic plates. Locally, the breakaway Arabian microplate is being subducted under the Iranian and Anatolian microplates at the rate of a few inches a year, and as a result the Zagros mountains and Kurdistan, the point of this collision, are being compressed and pushed upward several inches a year (Hooshmand Zadeh, 1995).
A high concentration of sodium in irrigation water may negatively affect soil structure and decrease the hydraulic conductivity of fine-textured soil. The degree to which sodium is adsorbed by a soil is a function of the ratio of sodium to divalent cations (Ca and Mg) and is usually expressed by the sodium adsorption ratio (SAR) (Bouwer and Idelovitch, 1987). The SAR is a general water quality index that indicates the proportion of sodium in the water relative to divalent cations such as Ca and Mg. The SAR is obtained from Eq. (1) and the adjusted SAR from Eq. (2):
$$\mathrm{SAR} = \frac{\mathrm{Na}^{+}}{\sqrt{\left(\mathrm{Ca}^{2+} + \mathrm{Mg}^{2+}\right)/2}} \quad (1)$$
$$\mathrm{SAR_{adj}} = \mathrm{SAR}\left[1 + \left(8.4 - \mathrm{pH_c}\right)\right], \qquad \mathrm{pH_c} = (pK_2 - pK_c) + p(\mathrm{Ca}+\mathrm{Mg}) + p(\mathrm{Alk}) \quad (2)$$
where pK2 = negative logarithm of the second dissociation constant for carbonic acid; pKc = solubility constant for calcite; and p = negative logarithm of ion concentration (meq/L). The values of (pK2 - pKc), p(Ca+Mg) and p(Alk), which depend on Ca2+ + Mg2+ + Na+, Ca2+ + Mg2+ and CO3 2- + HCO3-, respectively, can be found in Bouwer (1974).
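As a minimal illustration of Eq. (1), the sketch below computes the SAR from ion concentrations in meq/L, assuming the standard definition SAR = Na / sqrt((Ca + Mg) / 2). The function name and the sample concentrations are hypothetical and are not taken from the river data used in this study.

```python
import math

def sodium_adsorption_ratio(na_meq, ca_meq, mg_meq):
    """Return the SAR for Na, Ca and Mg concentrations given in meq/L."""
    divalent = ca_meq + mg_meq
    if divalent <= 0:
        raise ValueError("Ca + Mg must be positive")
    return na_meq / math.sqrt(divalent / 2.0)

# Hypothetical sample: Na = 2.0, Ca = 3.0, Mg = 1.0 meq/L
print(sodium_adsorption_ratio(2.0, 3.0, 1.0))
```

The adjusted SAR is not computed here because it needs the tabulated p-values referred to in Bouwer (1974).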
The adjusted SAR (SARadj), which takes into account the effects of rainfall, is typically computed. Sodium also has adverse effects on crops, such as leaf burn in almond, avocado and stone fruits (Bouwer and Idelovitch, 1987). Ayers and Tangi (1981) suggested that if the SARadj of irrigation water is below 3, there are no sodium problems; if it is between 3 and 9, there are increasing problems; and if it is above 9, there are severe problems. The first objective of this study was to develop an ANN-based prediction model for monthly SAR. The model was developed using discharge, sodium, calcium, magnesium, chloride, sulfate, bicarbonate, pH and SAR data for a station in the Chalgazi River Basin in Kurdistan Province, which were obtained from the Water Authority of Kurdistan. The second objective was to use sensitivity analysis to determine which of these parameters were significant in predicting the SAR.
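The irrigation thresholds attributed above to Ayers and Tangi (1981) can be written as a small classification rule. This is only a sketch: the function name and class labels are hypothetical, and treating the boundary values 3 and 9 as part of the middle class is an assumption, since the text does not say how the boundaries are handled.

```python
def sodium_hazard(sar_adj):
    """Classify irrigation water by adjusted SAR using the 3 and 9 thresholds."""
    if sar_adj < 3:
        return "no problem"
    if sar_adj <= 9:          # assumption: boundaries 3 and 9 fall in this class
        return "increasing problem"
    return "severe problem"

for value in (1.5, 6.0, 12.0):
    print(value, sodium_hazard(value))
```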

Methodology
The ANN is a data processing system based on an idea similar to the processing of the human brain: data are handled by a network of many simple units that work in parallel to solve a problem. With the help of programming, a data structure is designed whose components behave like natural neurons. An artificial neuron consists of three components: a weight (W), a bias (b) and a transfer function (f). These three components are unique to each neuron. In Figure 2, "p" and "n" denote the input and output, while "a" denotes the net input. The junctions 1 and 2 in Figure 2 show the schematic of the artificial neuron. The operation of the artificial neuron can be expressed as follows:

$$a = Wp + b \quad (3)$$
$$n = f(a) = f(Wp + b) \quad (4)$$
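The three components of a single artificial neuron (weight, bias and transfer function) can be sketched in a few lines. Descriptive variable names are used instead of the figure's symbols, and the choice of tanh as the default transfer function and the sample numbers are assumptions for illustration only.

```python
import math

def neuron_output(inputs, weights, bias, transfer=math.tanh):
    """Weighted sum of the inputs plus the bias, passed through a transfer function."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(net_input)

# Hypothetical neuron with two inputs
print(neuron_output([0.5, 1.0], [0.8, -0.3], 0.1))
# The same neuron with a linear (identity) transfer function
print(neuron_output([0.5, 1.0], [0.8, -0.3], 0.1, transfer=lambda v: v))
```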
The MLP is a static neural network which has three layers including an input layer, hidden layer and output layer.
Figure 2 shows the schematic of the ANN. Figure 3 shows schematically the tangent-sigmoid and linear transfer functions. The number of neurons in the hidden layer of each model may be obtained by trial and error. The function of the network can be modeled by Equations 5 and 6 (Menhaj, 1998), where R = number of input vector components; S1 and S2 = numbers of neurons in the hidden and output layers, respectively; P = input vector; W1 and W2 = weight matrices of the hidden and output layers, respectively; b1 and b2 = bias vectors of the hidden and output layers, respectively; and G and F = neuron transfer functions of the hidden and output layers, respectively (Menhaj and Safepoor, 1998).
To assess water quality, we applied the MLP model to average monthly data. According to the universal approximation theorem, an MLP with a sigmoid hidden layer and a linear output layer can approximate any complicated function, provided that the number of hidden neurons is chosen appropriately (Cybenko, 1989; Hornik, 1991, 1993; Leshno et al., 1993). This theorem reduces the number of hidden layers to a minimum and so decreases the complexity of the network. Accordingly, all models applied in this research are MLPs with one hidden layer, a tangent-sigmoid transfer function and a linear output layer. Figure 4 illustrates the MLP neural network schematic.
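The architecture described above (one tangent-sigmoid hidden layer followed by a linear output layer) can be sketched as a forward pass. This is a minimal illustration rather than the implementation used in the study: the function name is hypothetical and the weights are arbitrary placeholders, not trained values.

```python
import math

def mlp_forward(p, W1, b1, W2, b2):
    """One-hidden-layer MLP: tanh hidden layer followed by a linear output layer."""
    # Hidden layer: one tanh neuron per row of W1
    hidden = [math.tanh(sum(w * x for w, x in zip(row, p)) + b)
              for row, b in zip(W1, b1)]
    # Output layer: linear combination of the hidden activations
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

# Hypothetical network: R = 2 inputs, S1 = 3 hidden neurons, S2 = 1 output
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.6, 0.9]]
b2 = [0.05]
print(mlp_forward([0.5, 1.0], W1, b1, W2, b2))
```

In the study's setting the input vector would have eight components and the single output would be the SAR.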
To quantify the prediction error and evaluate the performance of the models, we used R² and the Root Mean Squared Error (RMSE), as shown in Equations 7 and 8 (Kennedy and Neville, 1964).
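$$R^2 = \left[\frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2 \sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}}\right]^2 \quad (7)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - Y_i\right)^2} \quad (8)$$
where $X_i$, $Y_i$, $\bar{X}$ and $\bar{Y}$ = the measured data, the predicted data, the mean of the measured data and the mean of the predicted data, respectively, and n = the number of observations (Preis et al., 2008; Singh, 2009). The statistical summary of the data is shown in Table 1.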
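The two performance measures can be computed as sketched below. The sketch assumes that R² is the squared Pearson correlation between measured and predicted values (consistent with the means of both series appearing in the definitions) and that RMSE is the usual root mean squared error; the function names and sample values are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(measured, predicted)) / n)

def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    x_bar = sum(measured) / n
    y_bar = sum(predicted) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(measured, predicted))
    sxx = sum((x - x_bar) ** 2 for x in measured)
    syy = sum((y - y_bar) ** 2 for y in predicted)
    return sxy ** 2 / (sxx * syy)

# Hypothetical measured and predicted SAR values
measured = [0.40, 0.55, 0.62, 0.48]
predicted = [0.42, 0.53, 0.60, 0.50]
print(rmse(measured, predicted), r_squared(measured, predicted))
```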

Learning Rate
The back-propagation training algorithm, which is based on steepest descent, includes a parameter called the learning rate. The objective of the algorithm is to minimize the sum of squared errors of the outputs. The learning rate, denoted by α, determines the speed of convergence of the algorithm. The performance of the steepest descent algorithm is enhanced if the learning rate is allowed to change during training. Maier and Dandy (2000) reviewed 43 papers dealing with the use of neural network models for the prediction and forecasting of water resources variables in terms of the modeling process adopted, and found that the vast majority of these networks were trained using the back-propagation algorithm.
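The idea of letting the learning rate change during steepest-descent training can be sketched on a toy problem. The example fits a straight line by minimizing the sum of squared errors; the adaptation rule (increase the rate by 5% after a step that lowers the error, halve it and reject the step otherwise) is an assumption chosen for illustration, since the text does not state which rule was used, and all names and data are hypothetical.

```python
def train_adaptive(xs, ys, alpha=0.01, epochs=300, up=1.05, down=0.5):
    """Fit y = w*x + b by steepest descent on the sum of squared errors,
    letting the learning rate alpha change during training."""
    w, b = 0.0, 0.0

    def sse(w_, b_):
        return sum((w_ * x + b_ - y) ** 2 for x, y in zip(xs, ys))

    error = sse(w, b)
    for _ in range(epochs):
        # Gradient of the sum of squared errors with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys))
        new_w, new_b = w - alpha * grad_w, b - alpha * grad_b
        new_error = sse(new_w, new_b)
        if new_error < error:
            w, b, error = new_w, new_b, new_error
            alpha *= up       # step accepted: increase the learning rate
        else:
            alpha *= down     # step rejected: decrease the learning rate
    return w, b, error

# Toy data generated from y = 2x + 1
print(train_adaptive([0, 1, 2, 3], [1, 3, 5, 7]))
```

Rejecting steps that raise the error keeps the training error from increasing, at the cost of one extra error evaluation per epoch.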

RESULTS AND DISCUSSION
The MLP model was applied to twelve years of monthly data (1998-2009) for the Chalgazi River, which were collected by the Water Authority of Kurdistan province, Iran. As mentioned previously, according to the universal approximation theorem, the number of hidden layers can be reduced to one. The efficiency of the network depends on using an appropriate number of neurons in the hidden layer. Table 1 gives a statistical summary of the data used in this study. The input layer of the model consisted of eight parameters (sodium, magnesium, calcium, sulfate, chloride, bicarbonate, pH and discharge), which were applied simultaneously, and the output was the SAR. Of the data, 70% was used for training, 5% for validation and 25% for testing the model.
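The 70/5/25 partition of the data can be sketched as follows. Whether the study split the records randomly or chronologically is not stated, so the random shuffle, the fixed seed and the sample count of 144 (twelve years of monthly records with no missing months) are all assumptions made for illustration.

```python
import random

def split_indices(n_samples, train=0.70, validation=0.05, seed=0):
    """Shuffle sample indices and split them into training, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train)
    n_val = round(n_samples * validation)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])      # remainder is the test set

train_idx, val_idx, test_idx = split_indices(144)
print(len(train_idx), len(val_idx), len(test_idx))
```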
Table 2 shows the results of using different numbers of neurons in the hidden layer, with sodium, magnesium, calcium, sulfate, chloride, bicarbonate, pH and discharge used simultaneously as inputs and the SAR as the output of the network. As shown in Table 2, the minimum RMSE occurred with three neurons in the hidden layer. Figure 6 shows the actual and the forecasted SAR. The figure shows good agreement between the actual and the predicted data, with a correlation coefficient of R² = 0.9757. Hence, the developed model may be applied to SAR prediction in water quality management. Comparing the results of this study with the work of Musavi-Jahromi and Golabi (2008) on the Karoon River, Iran, there are similarities and agreement between the two sets of results, which confirms the suitability of the ANN model for predicting the SAR in river water. The differences between the two works are: (1) we justified the use of a single hidden layer and clearly stated the type of model applied, and (2) we used sensitivity analysis to evaluate which of the input parameters were significant for forecasting the SAR. They worked with several water quality monitoring stations, whereas the Chalghazi River does not have as many monitoring stations.
1. Data for this study belonged to the period 1998-2009.
2. The input parameters to the MLP were rate of flow, pH, Ca, Na, SO4, bicarbonate, chloride and magnesium.
3. The correlation coefficient between predicted and actual SAR was 0.9757, and the minimum RMSE values in training, testing and total were 0.0113, 0.0242 and 0.0175, respectively, with three neurons in the hidden layer.
4. Sensitivity analysis was carried out in this work, which was not done in the work of Katoozian et al. (2011).
Figure 5 compares the actual and predicted SAR; as shown, the two are close together.

Sensitivity analysis
To assess the effect of each input parameter on the SAR prediction, we increased or decreased one input parameter by 10% while keeping the others unchanged; the role of the variation of each parameter in the SAR prediction was then identified. We drew the normal plot between the observed and the predicted SAR after changing each input parameter. Figures 6 and 7 show the normal plots between the observed and the predicted SAR when each input parameter was changed by 10%. The SAR prediction was affected by only two of the parameters, pH and Ca; the effects of the other parameters were not considerable. As indicated in Figure 6, the correlation coefficient when pH is altered is 0.0239. This means that pH has a significant effect on the SAR prediction, whereas discharge, sodium, magnesium, sulfate, chloride and bicarbonate did not affect the prediction noticeably. This may indicate that the role of discharge in the SAR prediction is not vital. Figure 7 shows the normal plot for the variation of calcium; as shown in the figure, R² = 0.92, which means that calcium affects the SAR prediction, although less than the pH variation does. The sensitivity analysis results for each input parameter are summarized in Table 3.
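The one-at-a-time procedure described above can be sketched as follows: each input column is scaled by 10% in turn, the model is re-run, and the agreement between observed and re-predicted values is summarized by R² (taken here as the squared Pearson correlation). The `toy_model` function is a hypothetical stand-in for a trained network, and all names and numbers are assumptions for illustration.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(observed)
    x_bar = sum(observed) / n
    y_bar = sum(predicted) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(observed, predicted))
    sxx = sum((x - x_bar) ** 2 for x in observed)
    syy = sum((y - y_bar) ** 2 for y in predicted)
    return sxy ** 2 / (sxx * syy)

def sensitivity_r2(model, inputs, observed, factor=1.10):
    """Scale one input column at a time by `factor` and return, per column,
    the R^2 between the observed values and the re-computed predictions."""
    scores = []
    for j in range(len(inputs[0])):
        perturbed = [row[:j] + [row[j] * factor] + row[j + 1:] for row in inputs]
        predicted = [model(row) for row in perturbed]
        scores.append(r_squared(observed, predicted))
    return scores

# Toy stand-in for a trained network (hypothetical, for illustration only)
def toy_model(row):
    return row[0] + row[1] ** 2

inputs = [[1.0, 1.0], [2.0, 3.0], [3.0, 2.0]]
observed = [toy_model(row) for row in inputs]
print(sensitivity_r2(toy_model, inputs, observed))          # each input +10%
print(sensitivity_r2(toy_model, inputs, observed, 0.90))    # each input -10%
```

Because a correlation-based score is unchanged by a uniform rescaling of the predictions, an error measure such as RMSE may be worth tracking alongside it.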

Fig. 7: Normal plot of the observed versus predicted SARadj under a change in Ca2+