Applying different resampling strategies in machine learning models to predict head-cut gully erosion susceptibility

Citation for published version (APA): Wang, F., Sahana, M., Pahlevanzadeh, B., Pal, S. C., Shit, P. K., Piran, M. J., Janizadeh, S., Band, S. S., & Mosavi, A. (2021). Applying different resampling strategies in machine learning models to predict head-cut gully erosion susceptibility. Alexandria Engineering Journal, 60(6), 5813-5829. https://doi.org/10.1016/j.aej.2021.04.026

Abstract Gully erosion is one of the advanced forms of water erosion. Identifying the effective factors and predicting gully erosion are important tools for controlling and managing this phenomenon. The main purpose of this study is to evaluate the effect of four different resampling algorithms, namely cross-validation (5-fold and 10-fold) and bootstrapping (Bootstrap and Optimism bootstrap), on boosted regression tree (BRT), support vector machine (SVM), and random forest (RF) models in the spatial modeling and evaluation of head-cut gully erosion in the Konduran watershed. For this purpose, based on an extensive field survey, the head-cut points of gully erosion were identified first, and a map of the distribution of head-cut gully erosion in the study area was prepared. Then 18 variables were identified and prepared as factors affecting the occurrence of head-cut gully erosion. To assess the efficiency of the models, receiver operating characteristics (ROC) and area under the curve (AUC) were used. Through the assessment result we indicate

Introduction
Gullies are among the most important processes that result in land degradation. Gullies account for 10% to 90% of soil erosion [1]. Gullies form the most fragile land systems and deteriorate the physico-chemical properties of the soils. They are hot spots of soil erosion: an enormous amount of soil is eroded from these lands and transported to lowland areas, increasing the risk of sedimentation and flooding in a catchment region [2,3].
Gully erosion is a frequent event in arid and semi-arid climatic regions and, in a dry-wet climatic situation, leads to a large amount of sediment yield [4]. Iran experiences approximately 1 mm/yr reduction in soil thickness across the whole country due to water erosion, and 75.8% of the geographical area is exposed [5]. The combination of prolonged dry periods and short wet seasons leads to a large amount of discharge, promoting surface runoff and sedimentation [6]. A high rate of soil erosion through head-cut gullies in agriculture is unsustainable and calls for a management strategy. However, studying gully erosion and predicting head-cut gully erosion is difficult in complex environments [2,7].
Many researchers have addressed gully erosion susceptibility mapping (GESM) using RS and GIS techniques. They have used various traditional data mining approaches, including weights of evidence (WoE) [5,8-10], conditional analysis (CA) [11,12], logistic regression (LR) [12,13], index of entropy (IOE) [14,15], evidential belief function (EBF) [9], certainty factor (CF), frequency ratio (FR) [16], and analytical hierarchy process (AHP) [14,17]. However, regional GESM techniques need data on the factors affecting gully erosion from different sources and at different spatial scales, which may contain ambiguity and uncertainties, and traditional data mining approaches could not establish the relationship between geo-environmental factors and gully erosion processes. New modeling techniques are therefore required that move away from traditional data mining techniques, address these ongoing concerns, and improve model performance and precision in head-cut gully susceptibility mapping.
Fig. 1 Location of the study area in Iran.
Previous studies show that there are more facets that may be investigated and that a large number of potentially valuable approaches have not yet been fully applied to gully erosion susceptibility mapping. Therefore, the present study aims to establish the relationship between gully conditioning factors and head-cut gully occurrence using three ML algorithms, namely BRT, SVM, and RF, and to evaluate the performance of different resampling algorithms in ML models for predicting head-cut gully erosion susceptibility.

Description of the study area
The Konduran watershed is a coastal watershed located 60 km west of Bandar Lengeh, Hormozgan province, Iran, between longitudes 51°26′ and 51°28′ east and latitudes 26°44′ and 26°51′ north (Fig. 1). The highest point, in the northwest of the study area, is 418 m above sea level and the lowest point, in the southwest, is 1 m above sea level. The study area covers 75.95 km². Annual rainfall in this area is 165 mm; the maximum annual rainfall is 392.7 mm and the minimum is 43 mm. The physiographic type of the region is flood plain. The floodplain area is relatively flat with low elevation and gentle slope, which is caused by fine sedimentary deposition over long periods. The soil is deep to very deep with medium to light texture (silt, silt loam, and loamy sand). The predominant geological formations in the area include the Gachsaran formation, consisting of gypsum and anhydrite layers; the Mishan formation, consisting mainly of marine, calcareous material; Aghajari formation outcrops, consisting of rock and marl sand; and salt domes. The closest village to the gully area is Konduran village, which is 500 m from the gully. The development of gully erosion near the village is therefore of great concern to the villagers (Administration of the natural resource of Hormozgan Province).

Methodology
Using extensive field surveys, the locations of areas sensitive to head-cut gully erosion were collected with a global positioning system (GPS) receiver (Garmin 76 CSX) and entered into the ArcGIS software. The distribution map of head-cut gully erosion was prepared with a total of 103 head-cut gullies in the study area, and 103 points were randomly selected as non-head-cut gully points. The data were divided into two sets, training and validation: 70% of the data were randomly selected as training data and the other 30% were used for validation [16,23].
In the present study, for spatial modeling of head-cut gully erosion susceptibility, the effect of different resampling algorithms on different ML algorithms was evaluated. The flowchart of the research steps is shown in Fig. 2.
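The 70/30 division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the point identifiers, the seeds, and the choice to split each class separately (so that both subsets stay balanced) are assumptions; the source only states that the split was random.

```python
import random

def split_train_validation(points, train_fraction=0.7, seed=42):
    """Randomly split a list of points into training and validation subsets."""
    rng = random.Random(seed)
    shuffled = points[:]          # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical identifiers standing in for the 103 + 103 surveyed locations.
gully = [("gully", i) for i in range(103)]
non_gully = [("non_gully", i) for i in range(103)]

# Assumption: split each class on its own to keep the two classes balanced.
gully_train, gully_valid = split_train_validation(gully, seed=1)
non_train, non_valid = split_train_validation(non_gully, seed=2)

train = gully_train + non_train
valid = gully_valid + non_valid
print(len(train), len(valid))
```

With 103 points per class, this yields 72 training and 31 validation points per class.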

Dataset preparation
The choice of factors affecting the occurrence of natural hazard phenomena varies according to the features of the region. Accordingly, researchers have considered sets of different topographic, hydrological, and lithological factors in gully erosion zoning research [23,28,29]. Therefore, in the present study, 18 variables, namely altitude, slope, aspect, distance from rivers, drainage density, plan curvature, profile curvature, distance from roads, normalized difference vegetation index (NDVI), land use, lithology, soil type, clay, silt, sand, topographic position index (TPI), stream power index (SPI), and topographic wetness index (TWI), were considered as effective variables in the incidence of head-cut gully erosion. The digital elevation model (DEM) with a resolution of 12.5 × 12.5 m was obtained from the phased array type L-band synthetic aperture radar (PALSAR) (https://search.asf.alaska.edu/#/), and morphometric factors such as altitude, slope, aspect, and curvature were derived in ArcGIS 10.5.
Topographic characteristics such as altitude play an important role in the expansion and control of gully erosion, mainly by indirectly affecting the characteristics of vegetation and precipitation [14,29]. According to the altitude map of the Konduran watershed, the minimum altitude is 1 m and the maximum is 418 m above sea level (Fig. 3a). Low-slope areas have a high potential for surface runoff concentration and exposure to gully erosion [5,12]. In this research, a slope map was prepared from the DEM (Fig. 3b). The slope aspect can indirectly affect erosion processes by controlling sun exposure, vegetation type, and soil moisture content [15,30]. The aspect variable was derived in ArcGIS 10.5 from the DEM and was classified into 9 categories (Fig. 3c). In general, plan and profile curvature influence the occurrence of gully erosion through the divergence or convergence of water as it flows downslope [22,28]. Therefore, the plan and profile curvature layers were selected according to their effect on the initiation and expansion of gully erosion. Plan and profile curvature maps were prepared using ArcGIS 10.5 (Fig. 3d and Fig. 3e). Drainage density reflects the geological nature, soil characteristics, and vegetation conditions of each region and, by affecting the amount of runoff, causes and spreads erosion [5,29]. To prepare the drainage density map, the line density tool in ArcGIS 10.5 was used (Fig. 3f). In many areas, the stream network is effective in developing gully erosion, and areas near rivers are more susceptible to erosion. In this study, the distance from rivers layer was prepared with the Euclidean distance tool in ArcGIS 10.5 [9,22] (Fig. 3g). Roads create intermittent penetration levels and soil disturbance, interrupting and concentrating the surface flow, discharging it onto the downstream slopes, and intensifying gully erosion [31].
The distance from road map was prepared with the Euclidean distance tool in ArcGIS 10.5 (Fig. 3h). Land use is one of the important factors in the appearance of gully phenomena because of its impact on vegetation cover. In general, regions with low vegetation are more sensitive to erosion than regions with high vegetation, because vegetation increases soil resistance to erosion [5,16]. The land use variable was obtained from Landsat 8 OLI imagery acquired on 21/6/2019, classified with the maximum likelihood algorithm (Fig. 3i). The lithology and type of the formations in the region are among the main factors affecting sedimentation in the watershed [26,32]. Sensitive formations have a greater potential to develop various types of erosion and sedimentation than resistant formations. To prepare the lithological map, we used a 1:100,000 geological map from the national cartographic center of Iran (Fig. 3j).
The soil type map was obtained from information available at the administration of natural resources of Hormozgan province. The soil type map of the region has two types of soils: Badlands and Entisols-Aridisols (Fig. 3k). Soil texture, including the proportions of clay, silt, and sand, affects surface and subsurface flow and the occurrence of piping erosion, which can cause gully erosion [5,16]. Clay, silt, and sand percentage maps were prepared from the SoilGrids site with a pixel size of 250 × 250 m (Fig. 3l, Fig. 3m, and Fig. 3n).
The NDVI shows the vegetation status in the area. Because vegetation status is of great importance in hydrological and soil studies, NDVI was used in this study. NDVI varies between −1 and +1; a value of −1 indicates areas without vegetation cover and +1 indicates areas with dense vegetation cover [33,34]. To prepare the NDVI map, bands 5 (near-infrared) and 4 (red) from Landsat OLI imagery taken on 21/6/2019 were used (Fig. 3o).
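The index combines the two bands named above as (NIR − Red) / (NIR + Red). The sketch below applies this standard definition to single pixels; the reflectance values are hypothetical, and returning 0 for an empty pixel is an assumption made only to avoid division by zero.

```python
def ndvi(nir, red):
    """NDVI for one pixel from near-infrared (band 5) and red (band 4) reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat empty pixels as 0 rather than dividing by zero
    return (nir - red) / total

# Hypothetical reflectance values, for illustration only.
dense_vegetation = ndvi(nir=0.50, red=0.05)
bare_soil = ndvi(nir=0.30, red=0.25)
print(round(dense_vegetation, 3), round(bare_soil, 3))
```

Vegetated pixels reflect strongly in the near-infrared and weakly in the red band, so they score closer to +1 than sparsely vegetated ones.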
Because TPI, SPI, and TWI affect morphometric and hydrological characteristics [16,35,36], they have been considered important variables in causing gully erosion. The SPI, TPI, and TWI factors were derived in SAGA GIS 2.6 (Fig. 3p, Fig. 3q, and Fig. 3s).
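The source does not give the formulas behind these three indices. The sketch below uses the common textbook definitions, which are an assumption here and may differ from what SAGA GIS 2.6 computes: TWI = ln(As / tan β), SPI = As · tan β, and TPI = the elevation of a cell minus the mean elevation of its neighbours, where As is the specific catchment area and β the local slope. The floor on tan β for flat cells is also an assumption.

```python
import math

def _tan_slope(slope_deg, min_tan=1e-6):
    """tan(beta), floored at a small value so flat cells do not divide by zero."""
    return max(math.tan(math.radians(slope_deg)), min_tan)

def twi(specific_catchment_area, slope_deg):
    """Topographic wetness index: ln(As / tan(beta))."""
    return math.log(specific_catchment_area / _tan_slope(slope_deg))

def spi(specific_catchment_area, slope_deg):
    """Stream power index: As * tan(beta)."""
    return specific_catchment_area * _tan_slope(slope_deg)

def tpi(center_elevation, neighbor_elevations):
    """Topographic position index: cell elevation minus mean neighbour elevation."""
    return center_elevation - sum(neighbor_elevations) / len(neighbor_elevations)

# Hypothetical cell: 100 m^2/m of contributing area on a 5-degree slope.
print(round(twi(100.0, 5.0), 2), round(spi(100.0, 5.0), 2))
print(tpi(10.0, [8.0, 9.0, 10.0, 13.0]))
```

Under these definitions, gentle slopes with large contributing areas give high TWI, steep slopes with large contributing areas give high SPI, and negative TPI marks cells lying below their surroundings.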

Multi-collinearity
Multi-collinearity is a condition in which an independent factor is a linear function of other independent factors [37]. High collinearity means that there is a high correlation between factors, which can reduce the accuracy of modeling [38]. To assess multi-collinearity, the variance inflation factor (VIF) is used. Variables with a VIF of less than 5 are considered to have low collinearity and to be suitable for modeling [25,39].
Tolerance is the proportion of a variable's variance that is not explained by the linear relationships of that variable with the other independent variables in the model [37]. Because tolerance is a ratio, its value varies between zero and one. A value close to one means that only a small part of the variance of an independent variable is explained by the other independent variables. A value close to zero means that a variable is almost a linear combination of other independent variables, and the data exhibit multi-collinearity [40].
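Both diagnostics follow from regressing each factor on all the others: tolerance is 1 − R² of that regression, and VIF is its reciprocal. The sketch below computes them with ordinary least squares in plain Python. The factor names are taken from the text, but the values are hypothetical, and the helper functions are illustrative rather than the software used in the study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(y, predictors):
    """R^2 of a least-squares fit of y on the predictors (intercept via centering)."""
    n = len(y)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    xc = [[v - sum(col) / n for v in col] for col in predictors]
    k = len(xc)
    xtx = [[sum(xc[i][t] * xc[j][t] for t in range(n)) for j in range(k)] for i in range(k)]
    xty = [sum(xc[i][t] * yc[t] for t in range(n)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[i] * xc[i][t] for i in range(k)) for t in range(n)]
    ss_res = sum((yc[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum(v ** 2 for v in yc)
    return 1.0 - ss_res / ss_tot

def vif_and_tolerance(columns):
    """Return {name: (VIF, tolerance)} for a dict of equally long numeric columns."""
    results = {}
    for name, col in columns.items():
        others = [c for other, c in columns.items() if other != name]
        tolerance = 1.0 - r_squared(col, others)
        results[name] = (1.0 / tolerance, tolerance)
    return results

# Hypothetical values for three of the conditioning factors.
factors = {
    "altitude": [1, 2, 3, 4, 5, 6],
    "slope": [2, 1, 4, 3, 6, 5],
    "ndvi": [5, 3, 6, 2, 4, 1],
}
table = vif_and_tolerance(factors)
for name, (vif, tol) in table.items():
    print(name, round(vif, 2), round(tol, 2))
```

A factor would be flagged when its VIF exceeds the chosen threshold, or equivalently when its tolerance falls below the reciprocal of that threshold.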

ML algorithms
ML approaches are used to analyze data and build fully automated analytical models. Such approaches learn from data, identify patterns, and make rational decisions accordingly [54]. Supervised learning (SL) is one category of ML approaches, in which the goal is to assign the input data to classes based on pre-defined labels, according to their features. The main tasks in SL are classification and regression. The most well-known approaches in SL are linear regression, random forest, decision tree, SVM, and K-nearest neighbors (KNN).

Boosted regression tree (BRT)
The boosted regression tree (BRT) model is an ML algorithm based on classification and regression trees combined with the boosting algorithm [4,41]. Unlike algorithms that predict the average, this model uses a stage-wise progressive method (analysis and matching between variables and data of the training group) and more non-parametric statistics. It can be used to predict quantitative outcomes (regression tree) or class outcomes (classification tree) [41]. To implement this model, the rpart package in R was used.
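The stage-wise idea can be illustrated with a toy example: each new tree is fitted to the residuals left by the trees before it, and only a fraction of its prediction is added. The sketch below is a simplified stand-in, not the R implementation used in the study. It assumes a single feature, one-split trees (stumps), squared-error loss on 0/1 labels, and hypothetical values for the number of trees and the learning rate.

```python
def fit_stump(x, residuals):
    """Best one-split regression tree on a 1-D feature, chosen by squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:   # skip the maximum so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean

def boost(x, y, n_trees=50, learning_rate=0.1):
    """Stage-wise boosting: every stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * predict_stump(stump, xi)
                       for pi, xi in zip(predictions, x)]
    return base, learning_rate, stumps

def predict(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)

# Hypothetical data: one conditioning factor and a 0/1 gully label.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]
model = boost(x, y)
print(round(predict(model, 2), 3), round(predict(model, 5), 3))
```

A small learning rate makes each stage a modest correction, which is why many trees are usually combined.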

Support vector machine (SVM)
The SVM is used for data classification and regression analysis. After the input data and the target data of the model have been identified, the SVM model analyzes the relationship between the independent and dependent variables and divides the data into separate groups [42]. SVM is an algorithm that finds a specific type of linear model: the hyperplane with the maximum margin. Maximizing the hyperplane margin maximizes the separation between classes. The training points closest to the maximum-margin hyperplane are called support vectors, and these points are used to obtain the boundary between classes. SVM can be thought of as an algorithm in which two classes are separated by a hyperplane that is defined on the training dataset [43]. Choosing the optimal kernel type is an important step in the SVM model, as it can greatly increase the accuracy of the model. The kernel functions used in SVM are usually divided into four groups: linear kernels, polynomial kernels, radial basis kernels, and circular kernels [42]. In this study, the e1071 package in R was used for SVM modeling with a radial kernel function.
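A radial kernel scores the similarity of two points by their squared distance, in the common parameterization K(x, y) = exp(−γ‖x − y‖²). The sketch below evaluates that expression directly; the value of gamma and the feature vectors are hypothetical, since the tuned values are not given in this section.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Hypothetical, already scaled feature vectors for three locations.
a = [0.2, 0.8]
b = [0.25, 0.75]
c = [0.9, 0.1]
print(round(rbf_kernel(a, b), 4), round(rbf_kernel(a, c), 4))
```

Nearby points give values close to 1 and distant points values close to 0; a larger gamma makes the similarity fall off faster, which produces a more flexible decision boundary.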

Random forest (RF)
An effective tool for predicting target variables or classifying patterns is the decision tree. RF is an ensemble method that combines multiple decision trees to produce a repeated prediction of each phenomenon [44]. In general, a single decision tree is prone to over-fitting and has little generalizability: once a decision tree is formed, small changes in the learning patterns can cause fundamental changes in its structure. RF can learn complex patterns and capture nonlinear relationships between independent factors and response factors. It can also handle various sorts of data because it does not require the data to be normally distributed [44]. The method combines several decision trees: several bootstrap samples are extracted from the data, and a number of input variables are randomly involved in the structure of each tree. In the sampling stage, nearly one-third of the data is not sampled; these out-of-bag (OOB) data are used to determine important variables as well as to estimate the error. A tree is then grown on each bootstrap sample. When a tree is built, at each branch m variables are randomly selected from all M independent variables for the split. After the whole forest is built, the validation data are introduced to the trees and the predictions of the trees for each input vector are combined to obtain the output [45]. In this study, random forest model calculations were performed in R using the randomForest package.
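The "nearly one-third" out-of-bag share follows from sampling with replacement: the chance that a given observation is never drawn in n draws is (1 − 1/n)^n, which approaches e⁻¹ ≈ 0.368. The sketch below checks this numerically; the sample size of 144 is an assumption corresponding to 70% of the 206 points, and the seeds are arbitrary.

```python
import random

def out_of_bag_fraction(n, seed=0):
    """Share of n observations left out of one bootstrap sample of size n."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n) for _ in range(n)}   # indices drawn at least once
    return 1 - len(drawn) / n

n = 144  # assumed training-set size (70% of 206 points)
expected = (1 - 1 / n) ** n
simulated = sum(out_of_bag_fraction(n, seed=s) for s in range(200)) / 200
print(round(expected, 4), round(simulated, 4))
```

Because each tree sees a different bootstrap sample, every observation is out-of-bag for roughly a third of the trees, which is what allows the error estimate without a separate hold-out set.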

Resampling
Resampling techniques repeatedly draw samples from a data set and refit a given model on each sample in order to learn more about the fitted model. They can be time-consuming and costly, as they require the same statistical method to be run on many different data sets [46]. To obtain additional information about the fitted model, resampling techniques refit the model of interest to samples drawn from the training set. There are various resampling algorithms, such as cross-validation (CV), leave-one-out cross-validation (LOOCV), K-fold cross-validation, and Bootstrap [46,47].

K-fold cross validation (K-Fold CV)
K-fold CV is based on randomly dividing the set of observations into K sections, or folds, of approximately equal size. Each of the K folds is in turn treated as the validation set while the other K − 1 folds are used as the training set, which yields K estimates of the validation error [48]. The cross-validation process is thus repeated K times, with each of the K folds used exactly once as validation data. The K results can then be averaged to produce a single estimate; the K-fold CV estimate of the validation error is the mean of these calculations. The advantage of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once. Standard values for K are 5 or 10, since these values demand less computation than when K is equal to n [48]. When we have a small number of observations, dividing them into training and validation sets may lead to a very small test set [49]. By chance, we can get almost any performance on such a set. This is even worse when we are faced with a multi-class problem. If we use cross-validation in this case, we build K different models, so we can make predictions for all our data: for each instance, we predict with a model that has not seen that instance. When we build K different models with our learning algorithm and test them on K different test sets, we can have more confidence in the performance of our algorithm, whereas a single evaluation on one test set yields only one result.
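The fold bookkeeping described above can be sketched as follows. The function names, the seed, and the sample size of 144 are assumptions for illustration, and the scoring function passed in is a placeholder that only reports the validation-fold size; in practice it would fit a model on the training indices and return its error on the validation indices.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Yield (training, validation) index lists; each index is validated exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k folds of nearly equal size
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

def cross_validate(n, k, fit_and_score):
    """Average the K validation scores into a single estimate."""
    scores = [fit_and_score(train, valid) for train, valid in k_fold_indices(n, k)]
    return sum(scores) / len(scores)

splits = list(k_fold_indices(144, 5))
print([len(valid) for _, valid in splits])

# Placeholder scorer (hypothetical): returns the size of the validation fold.
mean_validation_size = cross_validate(144, 5, lambda train, valid: len(valid))
print(mean_validation_size)
```

Setting k equal to n turns the same routine into leave-one-out cross-validation, at the cost of n model fits.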

Bootstrap
The bootstrap is a widely applicable algorithm that can be used to quantify the uncertainty associated with a given estimator or statistical learning method, including those for which a measure of variability is otherwise difficult to obtain [50]. The bootstrap produces different data sets by repeatedly sampling observations from the original data set. These data sets can be used to estimate variability instead of sampling independent data sets from the entire population. The sampling used by the bootstrap involves randomly choosing n observations with replacement, which means that some observations may be selected multiple times while other observations are not included at all [50]. This procedure is repeated N times to generate N bootstrap data sets, M*1, M*2, ..., M*N, which can be used to estimate other quantities, including the standard error (Fig. 4).
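The procedure can be sketched for a simple estimator: draw n observations with replacement, recompute the estimate, repeat N times, and take the standard deviation of the N estimates as the standard error. The data values, the number of resamples, and the seed below are hypothetical, and the sketch shows the plain bootstrap only, not the Optimism bootstrap variant.

```python
import random
import statistics

def bootstrap_standard_error(data, estimator, n_resamples=1000, seed=0):
    """Standard error of an estimator from bootstrap resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = [data[rng.randrange(n)] for _ in range(n)]  # n draws with replacement
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)

# Hypothetical measurements, for illustration only.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
se_mean = bootstrap_standard_error(sample, statistics.mean)
se_median = bootstrap_standard_error(sample, statistics.median)
print(round(se_mean, 3), round(se_median, 3))
```

The same function works for estimators such as the median, for which no simple closed-form standard error is available.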

Validation models
In this research, the receiver operating characteristic (ROC) curve parameters were used to assess the efficiency of the models. These parameters include sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC) [16,26]. AUC is equal to the probability that a model assigns a higher score to a randomly chosen occurrence point than to a randomly chosen non-occurrence point. AUC values lie between 0.5 and 1. An AUC of 0.5 indicates that the model is no better than random, and a value of 1 means that the model distinguishes perfectly between occurrence and non-occurrence points. An AUC of 0.7 to 0.8 indicates a good model, 0.8 to 0.9 a very good model, and more than 0.9 excellent detection power [51].
Here TPR is sensitivity (true positive rate), TNR is specificity (true negative rate), TP denotes the true positives of gully occurrence, and TN denotes the true negatives of gully non-occurrence.
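The equations for these measures did not survive in this text, so the sketch below uses the standard definitions as an assumption: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), PPV = TP / (TP + FP), and NPV = TN / (TN + FN). AUC is computed as the share of occurrence/non-occurrence pairs that the model ranks correctly. The labels, scores, and the 0.5 threshold are hypothetical.

```python
def ratio(numerator, denominator):
    """Safe division: NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def classification_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity, PPV and NPV at a given score threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    return {
        "sensitivity": ratio(tp, tp + fn),   # TPR
        "specificity": ratio(tn, tn + fp),   # TNR
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

def auc(labels, scores):
    """Share of (occurrence, non-occurrence) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical validation points: 1 = gully head-cut, 0 = non-gully.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
metrics = classification_metrics(labels, scores)
area = auc(labels, scores)
print(metrics, round(area, 3))
```

Unlike the four threshold-based measures, AUC does not depend on the cut-off chosen, which makes it convenient for comparing models.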

Multi-collinearity analysis
Multi-collinearity in the data can interfere with the accuracy and interpretation of findings wherever a linear statistical procedure is used. Multi-collinearity is described as a linear relationship in a dataset linking two or more independent variables [23,35]. TOL values ≤ 0.1 or VIF values ≥ 10 suggest strong multi-collinearity [52,53]. The multi-collinearity test of the 18 geo-environmental gully head-cut erosion susceptibility parameters showed that the highest and lowest values of the variance inflation factor (VIF) are 4.07 and 1.12, whereas the highest and lowest tolerance values are 0.89 and 0.21, respectively (Table 1). The results showed that all independent variables have VIF < 5, so there is no strong correlation between them; therefore, all 18 variables were selected for head-cut gully erosion susceptibility (HGES) modeling.

Determine best parameters
The performance of the resampling methods in tuning the parameters of the BRT, SVM, and RF models, and the selection of the best method, are shown in Fig. 5, Fig. 6, and Fig. 7, respectively.

Assessment efficiency of models
The accuracy of the maps developed by the OB-BRT, OB-SVM, and OB-RF models was checked with ROC curves. The validation of the outcomes clearly showed that the Optimism Bootstrap BRT, Optimism Bootstrap SVM, and Optimism Bootstrap RF models, with AUC values of 0.85, 0.823, and 0.89, respectively, were more accurate than the rest of the ensemble and standalone ML models (Fig. 8 and Fig. 9). The results were validated using the positions of 103 recorded gully and non-gully points. The graphical outcomes of the validation tests revealed that the ensemble ML models (Optimism Bootstrap BRT, Optimism Bootstrap SVM, and Optimism Bootstrap RF) performed well compared to the others (Fig. 10).

Head-cut gully erosion susceptibility modeling
The BRT, SVM, and RF models, along with four resampling methods, namely K-fold (5 and 10) CV and bootstrapping (Bootstrap (B) and Optimism Bootstrap (OB)), were applied to predict the gully erosion areas. Resampling techniques were used because resampling makes use of all the available data. After the accuracy of the ensemble models based on the resampling algorithms had been determined, the best ensemble model of each group was selected to produce the head-cut gully erosion maps (Fig. 11).
The gully head-cut susceptibility map from the OB-BRT model (Fig. 11) shows the highest level of accuracy in comparison with the standalone BRT, 5-fold CV BRT, 10-fold CV BRT, and B-BRT models. According to the OB-BRT susceptibility map, the largest part of the region is occupied by the very low (20.03%) and low (28.66%) classes, while the moderate (25.60%), high (17.55%), and very high (8.16%) classes cover the rest of the region (Table 2).
The gully head-cut susceptibility map from the OB-SVM model (Fig. 11) shows a high level of accuracy compared to the standalone SVM, 5-fold CV SVM, 10-fold CV SVM, and B-SVM models. According to the OB-SVM susceptibility map, the largest part of the region is occupied by the very low (14.08%) and low (28.73%) classes, while the moderate (29.80%), high (17.70%), and very high (9.70%) susceptibility classes occupy the rest of the studied region (Table 2).
The Optimism Bootstrap RF model (Fig. 11) map displays the maximum degree of accuracy compared to the rest of the standalone and ensemble models such as 5-fold CV RF, 10- (Table 2).
Fig. 12 Result of importance value based on OB-RF model.
Fig. 13 Partial plot for importance variable a) altitude, distance from road, b) distance from river and NDVI.

Importance value
The results indicate that altitude, distance from the river, distance from the road, land use, NDVI, and percentage of clay are of maximum importance, while the other variables, such as TWI, SPI, TPI, aspect, lithology, and slope, are of less importance (Fig. 12). In this region, the causes of gully head-cut formation are closely linked to land use, NDVI, and distance from the road, with the latter being the most influential variable. These factors also play a crucial role in aggravating the degradation caused by gullies. The shapes of most gully head-cuts reflect the impact of surface runoff induced by land use activities on the progress and expansion of this kind of gully head-cut. The partial plots for the important variables show the same importance levels (Fig. 13).

Discussion
In recent years, with the advancement of computer science and modeling, the use of these methods to solve various problems, including natural disasters, has increased. Therefore, comparing the performance of different models and identifying the best model is one of the necessities in these studies. However, when the performance of different models is compared with the common criteria used in various studies, the results may be misleading.
That is why it is very important to find the best method based on the models' abilities [51]. Therefore, in this study, a field-based methodology was explored by implementing three ML models and four resampling procedures for the development of a gully head-cut susceptibility map, taking into account subsurface piping erosion. When modeling gully head-cut susceptibility, it is necessary to assess the predictive capacity and multi-collinearity of the eighteen selected variables; therefore, the importance value of each conditioning variable was calculated [35]. Furthermore, few studies have focused on gully head-cut erosion to establish and predict the relationship between the presence and absence of gully head-cuts, considering a number of individual variables using models [6]. As a result, in the present study, we calculated the importance of each parameter behind the appearance of gully head-cut positions in the Konduran watershed. The findings are consistent with the statement that the intensity of gully head-cut erosion depends on certain variables such as the volume of the runoff, altitude, distance from the road, distance from the river, percentage of clay, NDVI, and land use. A large number of ML models are available, but efficient and advanced methods for spatial modeling of gully head-cut erosion susceptibility are needed. Here, the BRT, SVM, and RF models as well as resampling approaches were applied, and the performance of the resampling-based ensemble ML models was compared with that of the individual ML models (BRT, SVM, and RF) in predicting the gully head-cut erosion susceptibility map and gully head-cut locations based on eighteen selected significant parameters. Resampling methods were used because resampling can help applied analysts and address imbalance in the response variable [51]. However, these methods do not solve all inferential issues.
CV is still used to assess the adequacy of a statistical model. Cross-validation and the bootstrap are likely to be less reliable because they provide less efficient goodness-of-fit estimates than the Optimism Bootstrap, which was the method selected in every case here. Evaluation of modeling performance confirms that the best methods are the OB-BRT, OB-SVM, and OB-RF models, with accuracies of 85%, 82.3%, and 89%, respectively, in the prediction of head-cut locations, compared to the rest of the ensemble and standalone ML methods. In general, the Bootstrap BRT, Bootstrap SVM, and Bootstrap RF set enhances the generalization of the base predictors for gully head-cut locations, leading to the Optimism Bootstrap set of combined models with improved accuracy. The capability of the Optimism Bootstrap resampling method has also been shown in other fields and with different ML models [51,52].
A larger number of gully head-cut occurrences were observed in and around the residential area of the region than in the other categories of land use, which reveals the clear effect of land use on gully head-cuts. Inadequate management practices and land use reforms have been noted to play a significant role in gully head cutting, as subsurface piping and gully head-cuts have contributed to gully formation and growth. Our findings are consistent with those in which it is inferred that gully head-cuts are mainly generated by runoff, land use, percentage of clay, and altitude [25]. In addition, the resampling-based ensemble ML models (OB-BRT, OB-SVM, and OB-RF) identified these gully head-cut locations more accurately on a regional scale than the other tested ML ensembles and standalone models.
Gully head-cut and subsurface piping erosion are among the most significant causes of soil erosion in the research region and are the major sources of sediment supply to the lower area. Such adverse effects have caused huge damage to the economy of the region. Therefore, appropriate management measures should be implemented, keeping in view the health and sustainability of the local environment of this region.

Conclusion
In the present study, we applied different resampling algorithms to three ML approaches to investigate the spatial modeling efficiency of gully head-cut erosion susceptibility prediction in the Konduran watershed. The variable importance of the explanatory factors showed that altitude, distance from the road, distance from the river, percentage of clay, NDVI, and land use had the most significant impact on the occurrence of gully head-cut erosion in the Konduran watershed. Validation of the models was carried out on the basis of the receiver operating characteristic (ROC) curve and the area under the curve. The results showed that the Optimism bootstrap, in comparison with the other resampling methods, performs better in increasing the accuracy of ML models. Validation of the outcomes clearly showed that the OB-BRT, OB-SVM, and OB-RF models had exceptional levels of accuracy, depending on the selected significant variables. The resulting gully head-cut susceptibility map can be useful for land use planning, soil and water conservation, and, as a result, for sustainable development of the region.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.