Development and Evaluation of the Performance of ANN-Based Statistical Downscaling Models for Daily and Monthly Precipitation

Statistical downscaling techniques establish a quantitative relationship between large-scale atmospheric variables (predictors) and local-scale meteorological variables (predictands) such as precipitation. This study uses large-scale atmospheric variables derived from the National Centre for Environmental Prediction and National Centre for Atmospheric Research (NCEP/NCAR) reanalysis data set as predictors and precipitation at stations in Izmir city as the predictand. The purpose of the study is to develop statistical downscaling models for daily and monthly precipitation over Izmir city using Artificial Neural Network (ANN) methods and to compare the performance of those models. The results reveal that the performance of the daily model improves as the daily results are aggregated. Although the daily model gives only fair results (e.g., R² ranging from 0.362 to 0.331), the aggregated model gives very good results at monthly


INTRODUCTION
Global climate change can lead to hydrological changes that affect almost all aspects of human life, e.g., regional water availability, agricultural productivity, and flood control. It is therefore necessary to understand how a change in global climate could affect regional water availability (Kusangaya et al. 2014).
General circulation models (GCMs) are among the most important tools for studying climate change. GCMs are numerical models that describe atmospheric processes through mathematical equations. Developed over several decades, they are able to simulate features of the global climate, represent various earth systems including the atmosphere, oceans, land surface, and sea ice, and offer considerable potential for studying climate change (Liu et al. 2016).
The performance of GCMs is poor at the smaller spatial and temporal scales relevant to regional climate because their grids are too coarse to resolve many important sub-grid-scale processes; GCM outputs are therefore often unreliable at regional scales (Xu 1999).
One possible solution to this problem is to use downscaling techniques to bring the output of GCMs or reanalysis datasets to a higher resolution in space and time. The basic idea of downscaling is to transfer large-scale changes in atmospheric variables to local weather (Hanssen-Bauer et al. 2005).
Downscaling methodologies are classified into two main groups: dynamic downscaling and statistical downscaling (Fistikoglu and Okkan 2011). Dynamic downscaling, which uses regional climate models (RCMs) to produce higher-resolution outputs, is complex and computationally intensive. It is therefore not the best option for studies that require a quick response. Statistical downscaling requires less computational effort than dynamic downscaling and is statistically sound. It comprises a varied group of approaches that are often fairly simple to implement but require an adequate amount of good-quality observed data (Trzaska and Schnarr 2014). Statistical downscaling can be divided into three main categories: weather classification, weather generators, and transfer functions.
Weather classification methods group weather patterns according to their synoptic similarity and then establish quantitative relationships to assign predictands (Bardossy et al. 2005; Fistikoglu and Okkan 2011). Although these methods are appropriate for downscaling non-normal distributions, they require a large amount of observed data (more than 30 years) to evaluate all probable weather conditions. These approaches are also more computationally intensive than linear approaches (Trzaska and Schnarr 2014).
Weather generator methods replicate the statistical features of local weather conditional on the large-scale variables (Fistikoglu and Okkan 2011; Kilsby et al. 2007).
These methods are widely used in temporal downscaling.
Transfer function methods build relationships between large-scale atmospheric variables and local surface variables, and they are considered the most common methods in statistical downscaling. Their applications range from linear and non-linear regression to artificial neural networks (ANNs), principal component analysis (PCA), canonical correlation analysis, and redundancy analysis (Benestad et al. 2007; Wilby et al. 2004; Schoof et al. 2007; Fistikoglu and Okkan 2011).
Studies comparing different statistical downscaling methods are now relatively common (Singh and Kumar 2020; Liu et al. 2016). Their results show that different methods perform differently in a given area, and that a given method performs differently in different study areas.
Many studies have implemented SDM using monthly precipitation as the predictand (Okkan and Kirdemir 2016), whereas relatively few have focused on daily precipitation (Singh and Kumar 2020). This paper considers both daily and monthly precipitation.
In this study, a statistical downscaling method based on ANN is applied to predict precipitation over the city of Izmir. The purpose of the study is to develop two different models. The first model downscales daily precipitation; from its output, cumulative rainfall is calculated for periods of increasing length, starting from two days and aggregating day by day until the monthly precipitation is reached. The second model downscales monthly rainfall directly using monthly data. At the end of the paper, the results of the monthly model are compared with the monthly cumulative results of the daily model.

STUDY REGION AND DATASETS
The study region is Izmir, the third largest city in Turkey, located on the Aegean coast (see Fig. 1). Izmir city and its environs show typical Mediterranean climate characteristics. Precipitation in the study area is around 610 mm per year and falls mostly in winter: up to 80% of the total annual precipitation falls between November and March.

Predictands
Many stations are available within the region; six were selected for this study. The available data and the locations of the stations are listed in Table 1. Daily and monthly precipitation data from the meteorological stations in Izmir city were used as the predictand. The neighbouring stations around Izmir station were used to test the spatial consistency of the daily and monthly rainfall estimates produced by the downscaling models. The required data were obtained mainly from the Turkish State Meteorological Service. The stations Adnan Menderes, Selcuk, Seferihisar, Cesme, and Odemis were selected because they are geographically close to each other and share similar meteorological and atmospheric conditions. In addition, these stations show a strong positive correlation (coefficient of determination, R²) with Izmir station, the main station used in this paper. As Table 2 shows, at the daily level the R² values between the main station (Izmir) and the A.M. Havalimani and Seferihisar stations are 0.72 and 0.65, respectively, while the Odemis, Cesme, and Selcuk stations recorded relatively low R² values compared with those two stations. At the monthly level, a strong correlation was observed between all stations, with R² ranging between 0.75 and 0.91 (see Table 2).
... and partial correlation (Anandhi et al. 2009; Wilby and Wigley 2000). According to previous studies (Chen et al. 2010; Wilby et al. 1999), the predictors can be selected using stepwise regression and correlation analysis.
In this study, the main criterion for choosing the relevant predictors is that they should be physically and conceptually sensible for the predictand (Wilby et al. 1999).
In addition, each predictor should have an acceptable correlation with the predictand, which is daily precipitation in this study (Fistikoglu and Okkan 2011). Therefore, 12 large-scale atmospheric variables were selected as predictors, taking the precipitation generation mechanism into consideration. The selected NCEP/NCAR reanalysis variables are listed in Table 3. The data were downloaded from the NCEP/NCAR reanalysis project web site http:www.esrl.noaa.gov/psd/data/grided/data.necp.reanalysis.html.

Fig. 2: Flow chart illustrating the proposed methodology.
Statistical downscaling models can be defined as Y = F(X), where Y is the predictand, which represents the daily or monthly precipitation, and X is the set of predictors.
The predictand is the target data, i.e., the small-scale variable, which in this study is precipitation from the meteorological stations in Izmir city (see Table 1). The predictors are the inputs, i.e., the large-scale atmospheric variables from the NCEP/NCAR reanalysis data set (see Table 3).

Statistical Downscaling using ANN
ANN was used as a well-known machine learning technique to estimate the observed daily rainfall from the large-scale atmospheric reanalysis variables (Wilby et al. 1998).
ANNs are computing systems whose central theme is borrowed from the analogy of biological neural networks. They are able to learn and generalize from examples to produce meaningful solutions to problems (Jiang and Cotton 2004). Here the ANN is used as a practical black-box tool for developing a non-linear regression between the large-scale atmospheric dataset (predictors) and observed daily rainfall (predictand) (Harpham and Wilby 2005; Khan et al. 2006).
The ANN structure consists of three types of layers. The first is the input layer, which has one neuron for each of the n input variables; here, the input layer is where the predictors (large-scale atmospheric variables) are defined. The second is the hidden layer, which may have several neurons; each hidden neuron transforms the inputs non-linearly through weights and a bias term, as shown in Eq. (1). In this study, the number of hidden neurons was set to double the number of predictors, a choice arrived at by trial and error. The last layer is the output layer (predictand), which in this study contains the daily or monthly precipitation from station records (see Fig. 3).
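The layer structure described above can be sketched as a single forward pass. This is only an illustrative sketch: the 12 inputs and 24 hidden neurons follow the text, but the linear output neuron, the random weights, and the input vector are assumptions made for the example and are not taken from the study.

```python
import math
import random

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Inputs -> tanh hidden layer -> one linear output (assumed) neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

n_predictors = 12             # number of predictors stated in the text
n_hidden = 2 * n_predictors   # hidden neurons: double the number of predictors

rng = random.Random(0)        # fixed seed; weights are illustrative, not trained
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_predictors)]
            for _ in range(n_hidden)]
b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.0

x = [0.1] * n_predictors      # hypothetical standardized predictor vector
y = forward(x, w_hidden, b_hidden, w_out, b_out)
print(y)
```

Training would replace the random weights with fitted values; the forward pass itself stays the same.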
Three popular transfer functions were tried in the ANN construction trials: tangent sigmoid, linear, and log-sigmoid. The tangent sigmoid function was found suitable in this study (see Eq. (2)). The search for the best values of the weights and biases is referred to as the ANN learning phase, which is carried out with a known set of inputs and outputs. In each training pass, a set of inputs is passed forward through the network to generate trial outputs, which are then compared with the observed outputs. When the difference (residual) exceeds the desired value, the error is passed back through the network.
The training algorithm adjusts the connection weights based on the error. This process is referred to as back-propagation. Once the error has been reduced to an acceptable level for the entire training set, the training phase is complete.
After training, the network is evaluated using a set of cases withheld from it during training. Once this evaluation is complete, the model can be applied to new input data.
The network is trained using the Levenberg-Marquardt feed-forward back-propagation algorithm, as shown in Eq. (3). The Levenberg-Marquardt algorithm is a second-order non-linear optimization technique that is typically faster and more reliable than other back-propagation methods (Fistikoglu and Okkan 2011).
In Eq. (3), J is the Jacobian matrix, w is the parameter vector, and μ is the Marquardt parameter.
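To make the roles of J, w, and μ concrete, the sketch below applies the textbook Levenberg-Marquardt update, w_new = w - (JᵀJ + μI)⁻¹ Jᵀe, where e is the residual vector. It is not the study's training code: the two-parameter model y = w0 · tanh(w1 · x), the synthetic data, and the rule of dividing or multiplying μ by 10 are assumptions chosen to keep the example small.

```python
import math

def residuals(w, xs, ys):
    # e_i = model(x_i; w) - y_i for the toy model w0 * tanh(w1 * x)
    return [w[0] * math.tanh(w[1] * x) - y for x, y in zip(xs, ys)]

def jacobian(w, xs):
    # Each row holds (de_i/dw0, de_i/dw1)
    rows = []
    for x in xs:
        t = math.tanh(w[1] * x)
        rows.append((t, w[0] * x * (1.0 - t * t)))
    return rows

def sse(e):
    return sum(r * r for r in e)

def lm_step(w, xs, ys, mu):
    e = residuals(w, xs, ys)
    J = jacobian(w, xs)
    # A = J^T J + mu * I and g = J^T e, written out for two parameters
    a11 = sum(r[0] * r[0] for r in J) + mu
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J) + mu
    g1 = sum(r[0] * ei for r, ei in zip(J, e))
    g2 = sum(r[1] * ei for r, ei in zip(J, e))
    det = a11 * a22 - a12 * a12
    # Solve A d = g with the explicit 2x2 inverse
    d1 = (a22 * g1 - a12 * g2) / det
    d2 = (a11 * g2 - a12 * g1) / det
    return [w[0] - d1, w[1] - d2]

def fit(xs, ys, w, mu=0.01, iterations=50):
    err = sse(residuals(w, xs, ys))
    for _ in range(iterations):
        trial = lm_step(w, xs, ys, mu)
        trial_err = sse(residuals(trial, xs, ys))
        if trial_err < err:
            # Accept the step and move toward Gauss-Newton behaviour
            w, err, mu = trial, trial_err, mu / 10.0
        else:
            # Reject the step and move toward gradient-descent behaviour
            mu *= 10.0
    return w, err

xs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
ys = [2.0 * math.tanh(0.7 * x) for x in xs]   # synthetic targets
w_fit, final_err = fit(xs, ys, [1.0, 1.0])
print(w_fit, final_err)
```

A small μ makes the step close to a Gauss-Newton step, while a large μ shortens it toward a gradient-descent step, which is why μ is adjusted after each trial.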

RESULTS AND DISCUSSION
The performance of the model is evaluated by comparing the results of the statistical downscaling model, P(t)_down, with the observed station precipitation, P(t)_obs. The cumulative precipitation was obtained by aggregating the daily unit into larger units (i.e., 2, 3, 4, 5, 6, 7, 15, and 30 days). Eq. (6) expresses the aggregation process mathematically.

$$P(n)_{cum} = \sum_{t=1}^{n} P(t) \qquad (6)$$
where $P(n)_{cum}$ is the cumulative precipitation aggregated from the daily precipitation $P(t)$, and $n$ is the number of days.
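The aggregation in Eq. (6) can be illustrated with a short sketch. It assumes consecutive, non-overlapping n-day blocks and drops an incomplete trailing block; the text does not say whether the windows overlap, and the daily values below are hypothetical.

```python
def aggregate(daily, n):
    """Sum consecutive, non-overlapping blocks of n daily values."""
    usable = len(daily) - len(daily) % n   # drop an incomplete trailing block
    return [sum(daily[i:i + n]) for i in range(0, usable, n)]

# Hypothetical daily precipitation in mm
daily_obs = [0.0, 4.0, 0.0, 0.0, 12.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0, 7.0]

two_day = aggregate(daily_obs, 2)
five_day = aggregate(daily_obs, 5)
print(two_day)
print(five_day)
```

Applying the same function to the observed and the downscaled daily series gives paired n-day totals that can then be compared for each aggregation unit.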
The performance of the model for all the aggregated units (derived from the daily unit), together with the descriptive statistics of the training time series, is presented in Table 4. Table 5 shows the same statistics for the testing time series.
The results show that the mean values of the observed and downscaled precipitation are very close to each other in both the training and testing periods, regardless of the aggregated unit. In terms of standard deviation, however, the gap between the observed and downscaled values decreases as the aggregated unit becomes larger, which is a strong indication that the performance of the model improves as the aggregated unit increases. The same conclusion can be drawn from a comparison of the coefficient of variation of the observed and downscaled precipitation, as Fig. 4 illustrates.
Fig. 7 shows that the differences between the box plots of observed and downscaled precipitation decrease for long durations such as 15 and 30 days.
To validate the model spatially at the neighbouring stations, the concept of the aggregated unit was also applied. The model behaviour remains valid for these stations (R² increases as the aggregated unit increases), as Fig. 6 and Table 6 indicate.
To sum up, the performance of the model clearly improves as the aggregated unit increases. Although the model gives fair results at the daily level, a significant improvement is recorded at the 30-day level (e.g., R² is more than 0.73).

CONCLUSION
The ANN technique was used to establish statistical downscaling models for estimating station-based precipitation from the large-scale NCEP/NCAR atmospheric variables. The models were established using the records of Izmir station, and the nearby stations were used for spatial validation. Daily and monthly precipitation were downscaled in separate models, and the performance of these models was then compared.
Although the daily model gives only fair results at the daily level, a significant improvement is recorded as the number of aggregated days increases. That is, the accuracy at the weekly level is significantly better than at the daily level, and the accuracy at the monthly level is better than at the weekly level (e.g., R² values of 0.36, 0.59, and 0.73 for the daily, weekly, and monthly aggregations, respectively, for Izmir station in the training period). The model behaviour also remains valid for the neighbouring stations.
In the monthly downscaling model, where mean monthly large-scale atmospheric variables were used, the model gives very good results in the training and test periods (e.g., R² ranging from 0.65 to 0.73).
In general, the ANN-based statistical downscaling model produces very good results for the monthly model, while the daily model produces only fair results. This is explained by the uncertainty in the daily predictors and predictands, which is much higher than that in the monthly data (Khan et al. 2006). In addition, the ANN downscaling model reproduces the mean values very closely: because the ANN is trained by minimizing the mean squared error (MSE), it produces a small amount of precipitation on each dry day, but the mean values are nearly the same as the observed values.
Consequently, the aggregated daily sums approach the observed cumulative values as the number of aggregated days increases. For this reason, the monthly downscaling models perform relatively strongly compared with the daily downscaling. The aggregated monthly model and the monthly model give similar results.
However, the aggregated monthly model gave slightly better results. Although its accuracy is higher, it required a significant amount of time and effort.
This study recommends that, if monthly precipitation is required for climate studies, the monthly downscaling approach be preferred over daily downscaling because of the processing time.
Overall, these results indicate that the ANN method is a good tool for producing local surface variables such as precipitation from large-scale atmospheric variables, especially at the monthly level.

Fig. 1: Location of the selected meteorological stations within the study area.

Fig. 3: Structure of the downscaling model.
... and precipitation observed P(t)_obs from station data. To examine and evaluate the performance of the models, the coefficient of determination (R²) and the root mean square error (RMSE) were used. See Eq. ... model, the time series of Izmir station was divided into training and test datasets. The precipitation data for the years 1948 to 1990 served as the training period, while the data recorded for the years 1991 to 2018 were used as the testing period.
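Because the equations for the two performance indices are not reproduced here, the sketch below shows one common way to compute them, together with a year-based split matching the stated training (1948 to 1990) and testing (1991 to 2018) periods. Taking R² as the squared Pearson correlation is an assumption; the study may have used a different definition. The function names and the record layout (year, value) are hypothetical.

```python
import math

def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R²)."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)

def split_by_year(records, last_training_year=1990):
    """Split (year, value) records into training and testing sets."""
    training = [r for r in records if r[0] <= last_training_year]
    testing = [r for r in records if r[0] > last_training_year]
    return training, testing

# Hypothetical observed and downscaled values in mm
p_obs = [0.0, 5.0, 12.0, 0.0, 3.0]
p_down = [1.0, 4.0, 9.0, 1.5, 2.5]
print(rmse(p_obs, p_down), r_squared(p_obs, p_down))
```

Note that the squared correlation ignores systematic bias in the downscaled series, which is one reason to report RMSE alongside it.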

Fig. 6: The performance of the model (R²) improves with an increase in the duration.

Table 2: Summary of the R² between the precipitation in Izmir station and other predictors for the period from 1948 to 2018.
The spatial resolution is 2.5° x 2.5°. The latitudes range from 36.25°N to 38.75°N, and the longitudes range from 26.25°E to 28.75°E (see Fig. 1).
Atmospheric variables from the National Center for Environmental Prediction and National Center for Atmospheric Research (NCEP/NCAR) reanalysis data set were used as predictors. The NCEP/NCAR data set is considered to represent the atmospheric conditions of the study area considerably well (Fistikoglu and Okkan 2011). The large-scale atmospheric variables of the NCEP/NCAR reanalysis dataset are selected as the

Table 4: Summary of the results of the daily model for the training period for Izmir station

Table 5: Summary of the results of the daily model for the testing period for Izmir station

Table 6: Summary of the R² values of precipitation for 1 day and cumulative days

Table 8: Performance of the cumulative monthly results and the monthly model.