UTILISATION OF ARTIFICIAL NEURAL NETWORK FOR THE ANALYSIS OF INTERLAYER SHEAR PROPERTIES

For a long time artificial intelligence tools were not used in pavement engineering, but their application is becoming more and more important. As opposed to other subjects in pavement engineering, this is not yet the case for interlayer bonding. The aim of this paper is to apply artificial intelligence in the form of artificial neural networks for knowledge discovery from pavement engineering data in the field of interlayer bonding. The focus is therefore on the practical use of artificial neural networks and their application to datasets on interlayer bonding in order to find patterns within the data and to predict certain interlayer bond properties. It was shown that artificial neural network techniques are suitable for deriving models from datasets and for predicting interlayer shear bond properties such as max shear force, deformation at max shear stress, and max shear stiffness.


Introduction
In order to organise data and to discover knowledge from data, the so-called knowledge discovery techniques with artificial neuronal networks (ANN) were introduced in the 1950s. According to Fayyad (Fayyad et al. 1996; Miradi 2009), knowledge discovery is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. An ANN is a mathematical, computational model that simulates the structure of biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation (Haykin 1999).
The advantage of using ANN for modelling and data evaluation lies in the fact that ANN is capable of processing large amounts of data sets. ANN determines a model based on learning or training process, as opposed to statistical analysis, when a model has to be developed by regression. Furthermore, in most cases, it is unknown if the relationship between the variables is linear or not. Therefore, ANN being a non-linear statistical data modelling tool, has a clear advantage over statistical linear regression analysis. Although ANN does not deliver an equation it is utilised to determine most critical and influencing variables. For many problems these influencing variables are not known in detail or are not completely assessable. In case of multi regression analysis (MRA), their knowledge is indispensable while ANN has the potential to identify the most important ones. As opposed to other areas in science and economy, where computational tools in the field of artificial intelligence were used for discovering knowledge from an increasing amount of data, this was rarely the case in pavement engineering. Here, for a long period of time, data collection and evaluation was rather based on empirical or statistical methods. In her thesis, Miradi (Miradi 2009) performed an artificial intelligence based knowledge discovery study of data on asphalt road pavement problems, in particular, ravelling, cracking and rutting as well as stiffness of cement treated bases. She showed that even without special knowledge in asphalt pavement technology the correct use of artificial intelligence tools leads to meaningful results and findings.
The aim of this paper is to apply ANN for knowledge discovery from pavement interlayer bonding data, covering a key issue in pavement engineering. As opposed to Miradi's thesis (Miradi 2009), which approached the problem of knowledge discovery for asphalt pavements from a mathematical side, this paper deals with the practical application of ANN. Hence, it focuses on the practical use of ANN and its application to datasets on interlayer bonding for determining patterns within the data and predicting certain interlayer bond properties.

Problem description
ANNs are valuable empirical substitutes for conventional physical models when analysing complex relationships involving multiple variables, provided that a sufficiently large database is available. The bond between asphalt pavement layers is influenced by a great variety of different parameters or variables.
Clearly, the interlayer bond will depend on physical and mechanical properties of the main constituents of the asphalt mixtures, the geometrical, chemical and physical characteristics of the interface, the mechanical properties of the pavement structure and the external factors that affect the pavement structure itself, such as traffic and climate. The characteristic values of the bond itself are also heavily dependent on the way they are determined, i.e. testing methods and conditions. Table 1 presents a list of the variables which are expected to govern the general behaviour of the bond. The list consists of 4 major variable sets which include more than 20 different main variables.
The number and complexity of parameters influencing interlayer bonding makes it difficult to quantify the contribution of the different parameters to the measured bonding properties and to find a physical model predicting the interlayer shear bond properties such as max shear force or max shear stiffness. Although the decrease of the interlayer bond with increasing temperature is a well-known fact, the influence of other factors (e.g. influence of tack coat, geometry of the interface etc.) is either unknown to a full extent or intensely debated among researchers and practitioners (Romanoschi, Metcalf 2002;Uzan et al. 1978;Ziari, Khabiri 2007).
Another reason for the fact that ANN has not yet been applied for the evaluation of interlayer bonding is the lack of generally acknowledged and openly accessible databases for interlayer bonding. Furthermore, openly accessible databases are lacking interlayer bond longterm performance data including information on traffic survey and pavement condition data, since the evaluation of the shear bond between asphalt layers is usually determined using cores, which are directly taken after construction and before the road is opened to traffic.

Shear testing
Shear testing was done using the Layer-Parallel Direct Shear (LPDS) test device (Fig. 1). LPDS is an Empa modified version of equipment developed in Germany by Leutner being more versatile in geometry and more defined in the clamping mechanism (Raab, Partl 2008).
The specimens were conditioned in a climate chamber for 8 h and all tests were conducted at a temperature of 20 °C. From the LPDS test the shear force F as a function of the vertical shear deformation w is obtained.
Nominal maximal shear stress τmax, i.e. the average shear stress in the cross section, is obtained by dividing the max shear force by the cross section area of the specimen:
$$\tau_{max} = \frac{F_{max}}{A} = \frac{4 F_{max}}{\pi d^{2}}, \quad (1)$$
where Fmax - maximal force, kN; A - nominal cross section area, mm²; d - specimen diameter, mm.
In addition to the max shear force, the max slope of the force-deformation curve is used to define the max shear "stiffness" value Smax as follows (Raab, Partl 2008):
$$S_{max} = \max \frac{dF}{dw}, \quad (2)$$
where dF - differential shear force; dw - differential shear deformation. In order to compare "stiffness" for different specimen diameters, the shear reaction modulus Kmax (Goodman et al. 1968) is used:
$$K_{max} = \max \frac{d\tau}{dw}, \quad (3)$$
where dτ - differential shear stress; dw - differential shear deformation.
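The evaluation of a single LPDS force-deformation curve can be sketched as follows. This is a minimal illustration, not the evaluation routine used in the study: the function name, the finite-difference approximation of the slope, the factor 1000 for converting kN/mm² to MPa, and the sample curve with a 150 mm specimen diameter are all assumptions made for the example.

```python
import math


def shear_metrics(w_mm, f_kn, d_mm):
    """Return (tau_max [MPa], w at F_max [mm], S_max [kN/mm], K_max [MPa/mm]).

    w_mm and f_kn are equally long lists sampling the curve F(w).
    """
    area = math.pi * d_mm ** 2 / 4.0            # nominal cross section, mm^2
    i_max = max(range(len(f_kn)), key=lambda i: f_kn[i])
    f_max = f_kn[i_max]
    tau_max = 1000.0 * f_max / area             # kN/mm^2 -> N/mm^2 = MPa
    # Max slope dF/dw, approximated by differences between neighbouring samples
    slopes = [
        (f_kn[i + 1] - f_kn[i]) / (w_mm[i + 1] - w_mm[i])
        for i in range(len(f_kn) - 1)
    ]
    s_max = max(slopes)
    k_max = 1000.0 * s_max / area               # d(tau)/dw = (dF/dw) / A
    return tau_max, w_mm[i_max], s_max, k_max


# Hypothetical curve, values chosen only for illustration
tau_max, w_at_fmax, s_max, k_max = shear_metrics(
    [0.0, 0.5, 1.0, 1.5, 2.0], [0.0, 5.0, 15.0, 20.0, 18.0], d_mm=150.0
)
```

Because the nominal area is constant during the test, Kmax follows from Smax by dividing by A, which is what makes results for different specimen diameters comparable.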

Artificial neuronal networks in general
An ANN is a biologically inspired computational model consisting of several single units, artificial neurons, connected with weighting coefficients (Ghaffari et al. 2006). This system is capable of recognizing, capturing and mapping patterns in a set of data due to the high interconnections of neurons processing information in parallel. A basic network is composed of three or more layers (Fig. 2). The first layer contains the input data while the last layer contains the output data. One or more layers known as hidden layers are placed between the input and output layers. The arriving signals, called inputs, are multiplied by the connection weights, first summed and then passed through a transfer function to produce the output for that neuron. The activation function acts on the weighted sum of the neuron's inputs; the most commonly used functions are the sigmoid and the hyperbolic tangent. The way that the neurons are connected to each other has a significant impact on the operation of the ANN (Martínez, Angelone 2010). The most commonly used ANN is a feed forward ANN. In this type of ANN each artificial neuron is only connected to the artificial neurons in the next layer and its output is fed forward to the next layer in the direction from input to output (Miradi 2009).
There are many different learning algorithms, but the most common one is back propagation (Ghaffari et al. 2006). For back propagation, two other parameters, the learning rate and the momentum coefficient, need to be defined. The learning rate is an adjustable factor that controls the speed of the learning process. The momentum coefficient determines the proportion of the last weight change that is added to the new weight change. The following simplified relationship presented by Erb (1993) points out the effects of these two parameters on the weight adjustment:
$$\text{new weight change} = \eta \cdot \text{error} + \beta \cdot (\text{last weight change}), \quad (4)$$
where η - learning rate; β - momentum coefficient.
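The effect of the learning rate and the momentum coefficient in Erb's simplified relationship can be illustrated with a few lines of code. The sketch below assumes a constant, hypothetical error term of 1.0 and uses η = 0.2 and β = 0.9 only as example values; the function name is not taken from any software.

```python
def weight_change(error_term, last_change, eta=0.2, beta=0.9):
    """Simplified rule: new change = eta * error + beta * last change."""
    return eta * error_term + beta * last_change


# With a constant error term the change accumulates over the iterations
# and approaches eta * error / (1 - beta).
change = 0.0
history = []
for _ in range(200):
    change = weight_change(error_term=1.0, last_change=change)
    history.append(change)
```

In this toy setting the first change equals 0.2, the second 0.38, and the sequence levels off near 2.0, i.e. a momentum coefficient of 0.9 amplifies a persistent error signal roughly tenfold compared with the learning rate alone.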
An ANN is trained to map a set of input data by iterative adjustment of the weights. There are two main approaches for weight adjustment: online and batch. The online method modifies and updates the weights for each input data, while the batch method computes the weight update for each input data, but stores these values during one repetition through the training set. At the end, after all input data samples have been presented, all the contributions are added, and only then the weights will be updated (Abraham 2005).
Fig. 2. Schematic of a three layer ANN with four neurons in the input layer, three neurons in the hidden layer and one neuron in the output layer (Miradi 2009)
Information from input data is fed forward through the network to optimize the weights between neurons. Optimization of the weights is made by backward propagation of the error during the training or learning phase. The ANN reads the input and output values in the training data set and changes the value of the weighted links to reduce the difference between the predicted and target (observed) values. The error in prediction is minimized across many training cycles until the network reaches a specified level of accuracy (Ghaffari et al. 2006).
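The difference between online and batch weight adjustment can be made concrete with a deliberately small example. The sketch assumes a single linear neuron y = w · x without bias, a squared-error criterion and three hypothetical samples generated by target = 2 · x; none of this is taken from the applied software.

```python
def train_online(samples, w=0.0, eta=0.1, epochs=50):
    """Update the weight immediately after every sample."""
    for _ in range(epochs):
        for x, target in samples:
            error = target - w * x
            w += eta * error * x
    return w


def train_batch(samples, w=0.0, eta=0.1, epochs=50):
    """Store the contributions and update once per pass through the set."""
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            error = target - w * x
            total += eta * error * x    # stored, not yet applied
        w += total                      # single update after all samples
    return w


samples = [(0.5, 1.0), (1.0, 2.0), (-1.0, -2.0)]   # target = 2 * x
w_online = train_online(samples)
w_batch = train_batch(samples)
```

Both variants approach the weight 2.0 here, but they follow different paths: after a single pass through the set the two weights already differ, because the online method evaluates each error with a weight that has been changed by the preceding samples.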
A basic architecture of an ANN with four neurons in the input layer, three neurons in the hidden layer and one neuron in the output layer is presented in Fig. 2.
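A forward pass through such a 4-3-1 network may be sketched as follows. The weights are arbitrary illustrative numbers, the bias terms are an assumption (the text does not mention them), and the hyperbolic tangent is used in both layers as one of the common activation functions named above.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Feed the inputs forward through one hidden layer to a single output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    weighted_sum = sum(w * h for w, h in zip(output_weights, hidden))
    return math.tanh(weighted_sum + output_bias)


# Hypothetical weights: 3 hidden neurons with 4 input connections each
hidden_weights = [
    [0.2, -0.1, 0.4, 0.0],
    [-0.3, 0.5, 0.1, 0.2],
    [0.1, 0.1, -0.2, 0.3],
]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [0.6, -0.4, 0.3]
output_bias = 0.05

y = forward([0.5, -1.0, 0.25, 1.0], hidden_weights, hidden_biases,
            output_weights, output_bias)
```

Since the output neuron also uses the hyperbolic tangent, the network output is confined to the interval (-1, 1), which is one reason why the target values are scaled to a matching range before training.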

Datasets
As explained earlier, no standard databases were available. The datasets used for this research were gathered over the years by the authors of this paper from two different research projects on in situ data , 2008. Data can be divided into those from new pavements ("New Road") and performance data from old pavements. "LTPP Road" is a dataset combination of data from new pavements and performance data from the same roads after 10 years. For both datasets the single results for max shear force, max shear stress, shear deformation at max shear force and max shear stiffness were determined.

New Road dataset
In the mid 1990s, the Swiss Federal Laboratories for Materials Testing and Research, Empa, was appointed by the Swiss Federal Road Office (ASTRA) to evaluate a simple, practice-oriented and standardisable test method for assessing the interlayer bond between the layers of asphalt pavements. The test method was intended as a quality assurance (QA) tool for inspection immediately after pavement construction. In the course of this research project, a number of Swiss pavements constructed between 1993 and 1997 were investigated, providing a representative selection of materials for heavy vehicle traffic roads during that period of time. In the course of the project only the bond between the surface (layer 1) and the binder or base layer (layer 2) was to be evaluated.
Since the construction took place before the European Standards became effective, all surface courses had been constructed according to the old Swiss Standard SN 640431a Asphalt Concrete, Conception and Requirements (1988). Fig. 3 shows the location of all test sites in Switzerland and Table 2 gives an overview of structures, mixtures and LPDS testing temperature for all test sites.
The investigated asphalt pavements were either new constructions or rehabilitations of layer 1 and layer 2. Therefore, all surface (layer 1) and second layers (layer 2) apart from three binder courses and one upper base layer were totally new. Most pavement surface courses consisted either of stone mastic asphalt (SMA) or asphalt concrete (AC): 9 road sections had SMA and 7 road sections had AC surface courses (layer 1). In addition, three coring sites with special surface courses, i.e. mastic asphalt (MA), hot-rolled asphalt (HRA) and porous asphalt (PA), were included. The surface courses were placed either on AC layers with a nominal max aggregate size of 10, 16, 22 or 32 mm or on MA with a max aggregate size of 16 mm (layer 2).
In two cases, pavements with new surface courses on unknown old base and binder courses were investigated. According to a binder extraction analysis in the lab, the mixture type of these unknown layers was evaluated to be most probably AC 10 according to the Swiss standard SN 640431 Asphalt Concrete, Conception and Requirements (1976). It is important to note that the composition of binder courses was equal to the composition of base courses, since at that time Swiss construction practice did not distinguish between binder and base courses. In order to avoid confusion, all notations in Table 1
All new asphalt pavement mixes of layer 1 and layer 2 were analysed in the laboratory, determining aggregate size distribution, binder content, Marshall values (stability, flow, air void content) and standard binder properties (penetration, softening point ring and ball). Furthermore, the air void content of the pavement layers (mean value) and the tack coat type were determined. For this investigation, at each test site, 40 cores were taken directly after construction of the pavement or pavement rehabilitation. From these cores, interlayer shear tests between the first and second layer were performed at 20 °C and 40 °C using the LPDS shear device. In addition to the max shear force, max shear stress, shear deformation at max shear stress and max shear stiffness S were determined for all cores.

LTPP Road dataset
For a long time, apart from two preliminary investigations in 1999 and 2001 on a limited database (Stöckert 2001), little performance data concerning interlayer bonding were available, until in 2003 Empa conducted a long term pavement performance study on the evaluation of interlayer bonding over time.
Based on the research project from 1999 and the results obtained from more than 1000 cores from 20 different pavements, a decade later the long term bonding properties of the remaining 14 pavements could be determined again. The bonding properties determined at 20 °C with the LPDS of the 14 remaining high volume road pavements for the years 1993 to 1997 were compared to the values for the same road pavements in 2006.
From the remaining pavements, seven had SMA and four AC surface courses. All three coring sites with special surface courses, i.e. MA, HRA and PA could also be included.
For most road sections the average daily traffic (ADT, vpd) and the percentage of heavy vehicles (> 3.5 t) data were also available from Swiss Traffic Survey of 2005 and from Swiss Federal Office of Statistics of 2006. Table 3 shows the remaining road sections with information on the material and the traffic data.
Coring for the investigation of the long-term pavement performance study was conducted a few meters away from the original coring site. For every road section, 5 cores were taken inside and another 5 outside the wheel track. From these cores interlayer shear tests between the first and second layer were performed at 20 °C using the LPDS shear device. In addition to the max shear force, shear stress and shear deformation as well as shear stiffness and shear reaction modulus were determined for all cores.

Data preparation
Before ANN calculations can be conducted, the available data have to be prepared in terms of variable selection, data cleaning and data scaling. In order to prepare the available datasets the input and output variables have to be selected.
The following output variables were chosen:
- max shear force Fmax, kN, which is converted into the max nominal shear stress τmax, MPa;
- shear deformation at max shear stress w, mm;
- max shear stiffness Smax, kN/mm, which is converted into the reaction modulus Kmax, MPa/mm².
Input variable selection is a key step since the choice of the variables influences the quality of the ANN model prediction. Sometimes a variable seems to be important for the ANN software, while this importance cannot be explained physically and contradicts findings in reality. Therefore, it is important to rely not only on machine-aided search mechanisms, but also on experimental knowledge and engineering judgement. Since the interlayer bond generally depends on two different layers, all variables of the mixture and binder characteristics and some variables of the pavement characteristics occur twice, once for each layer.
The input variable selection for ANN modelling of the databases was conducted using a feature selection mode inbuilt in the applied ANN software. When executing an exhaustive search, temperature, aggregates passing through 2 mm and through 0.09 mm sieve of the second layer were detected to be the most important variables for the New Road dataset with a fitness of 56.1%, while the combination of all input variables gained a fitness of 55.6%. It was therefore decided to take all 11 input variables, since in this way more information could be retrieved using the response graph feature.
The following additional input variables have to be taken into account for the LTPP dataset:
- age, year;
- ADT, vpd;
- percentage of heavy vehicles > 3.5 t, %.
As opposed to the input variables for the New Road dataset, the test temperature had to be excluded, since all performance data for the LTPP dataset were only determined for a temperature of 20 °C. The binder contents for the layers were not included because their range was very small and, therefore, their evaluation did not give valuable information. The air void content was also neglected because the values for LTPP Road were not comparable to the values of New Road. In the LTPP investigation air voids had been determined for every single core, while for New Road the air void content represents a global value for the whole pavement. Table 4 depicts all input variables for the New Road and the LTPP Road dataset.
Another step in data preparation is data cleaning. The datasets are not allowed to contain missing data or outliers. In cases of missing output data, the whole row of data was eliminated in this research. In some cases, output data were only missing for one output parameter (such as shear stiffness). In this case, the data line was eliminated for the evaluation of shear stiffness while it was used for the evaluation of shear force and shear deformation. In case of missing input data (variables), it depended on whether it was possible to insert data using values known from standards or guidelines, such as mixture characteristics or traffic data, or whether the whole line of data had to be eliminated. Wrong type values resulting from human error were either corrected or eliminated. Outliers are extreme cases such as measurement errors or other anomalies. Hence, each single outlier was examined and it was decided whether to use or to eliminate the data. The applied software often detected values for extreme cases and characterised them as outliers. Here, it was decided to accept these data (e.g. high binder and low air void content in case of mastic asphalt) when the given data were consistent with reality, and they were then included in the evaluation. In other cases, unreasonably high or low data were either corrected when the correct value was available or eliminated when this was not the case.
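The rule that a data line is dropped only for the output parameter that is missing can be sketched as a simple filter. The field names and values below are hypothetical, and the sketch covers only the elimination path, not the insertion of values known from standards or guidelines.

```python
def rows_for_output(rows, input_names, output_name):
    """Keep rows that have every input and the requested output present."""
    return [
        row for row in rows
        if row.get(output_name) is not None
        and all(row.get(name) is not None for name in input_names)
    ]


# Hypothetical records; None marks a missing value
rows = [
    {"temperature": 20, "air_void_layer_1": 3.1, "f_max": 35.2, "s_max": 12.4},
    {"temperature": 40, "air_void_layer_1": 3.1, "f_max": 14.8, "s_max": None},
    {"temperature": 20, "air_void_layer_1": None, "f_max": 30.1, "s_max": 10.9},
]
inputs = ["temperature", "air_void_layer_1"]

force_rows = rows_for_output(rows, inputs, "f_max")
stiffness_rows = rows_for_output(rows, inputs, "s_max")
```

In this toy example the second record is still used for the shear force model although its stiffness value is missing, while the third record is dropped for both outputs because an input variable is missing.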
After data cleaning, data scaling is done. This procedure eliminates any incompatibility of data caused by the different measurement units, which would otherwise affect the accuracy of the model. Data scaling was done within a range of [-1, 1] using Eqs (5-6):
$$SF = \frac{SR_{max} - SR_{min}}{x_{max} - x_{min}}, \quad (5)$$
$$x_{S} = SR_{min} + (x - x_{min}) \cdot SF, \quad (6)$$
where SF - scaling factor; SRmax - upper scaling range limit; SRmin - lower scaling range limit; x - actual numerical value; xmax - max actual value; xmin - min actual value; xS - scaled value.
Before ANN modelling, the dataset is divided in two subsets, the training and the test set. The training set, about 85% of the dataset, is used for training and the test set, about 15% of the data, is used for testing the evaluated model. The software used in this research divides each dataset into three subsets: the training set, the validation set and the test set. The training set is a part of the input dataset used for neural network training, i.e. for the adjustment of network weights. The validation set is a part of the data used to tune network topology or network parameters other than weights. For example, it is used to define the number of hidden units or to detect the moment when the neural network performance starts to deteriorate. The validation set is used for calculating generalisation loss and retaining the best network (the network with the lowest error on the validation set). The test set is a part of the dataset used only to test how well the neural network will perform on new data. The test set is used after the network has been fully trained, to test what errors will occur during future network application. The test set is not used during training and thus was considered as new data entered by the user for the neural network application.
It was also decided to separate a part of the dataset (about 10% of the data) to have an additional test set, the so-called query set, which was used to query and validate the determined network.
This was done prior to feeding the datasets into the ANN modelling process, which means that several lines of data were excluded and put together in the query set file.
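The scaling to the range [-1, 1] and the separation of the query set before modelling can be sketched as follows. The sketch assumes a linear min-max mapping, a random selection of the query lines and the approximate shares mentioned above; the validation subset formed by the software is omitted because its share is not stated. Function names and the fixed random seed are illustrative choices.

```python
import random


def scale(values, sr_min=-1.0, sr_max=1.0):
    """Map values linearly so that min(values) -> sr_min, max(values) -> sr_max.

    Assumes that the values are not all identical.
    """
    x_min, x_max = min(values), max(values)
    sf = (sr_max - sr_min) / (x_max - x_min)     # scaling factor
    return [sr_min + (x - x_min) * sf for x in values]


def partition(rows, query_share=0.10, test_share=0.15, seed=0):
    """Set the query set aside first, then split the rest into training and test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_query = round(len(rows) * query_share)
    query, rest = rows[:n_query], rows[n_query:]
    n_test = round(len(rest) * test_share)
    return rest[n_test:], rest[:n_test], query   # training, test, query


scaled_temperature = scale([20.0, 30.0, 40.0])
training, test, query = partition(range(100))
```

Setting the query lines aside before any modelling step keeps them fully independent of training, which is the purpose of the additional query file described above.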

New Road
For all three output parameters the same number of hidden layers was used. Since it was found that the result did not differ much when using different numbers of hidden neurons, the min number of 5, which gave a good prediction, was used. The batch back propagation learning algorithm with a learning rate of η = 0.2 and a momentum coefficient of β = 0.9 was found to give the best results for the prediction of all three output parameters.
The hyperbolic tangent was chosen as the activation function for both the hidden layer and the output layer. From Figs 4-6 it becomes clear that for the output variables max shear force Fmax and max shear stiffness Smax a good prediction is possible, while the ANN computation of the output variable shear deformation w at Fmax does not lead to a model which is able to predict its values sufficiently well. In case of the output variables force and stiffness, the linear regression coefficient values R² of 0.96 and 0.85 indicate a good prediction of Fmax and Smax, and the slopes of the regression lines are close to 1.
Regarding the output variable shear deformation w at Fmax, a linear correlation in the form y = ax + b with b = 0 could not be established, and therefore a prediction of the values is not possible.
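The comparison of predicted and observed values by a regression line forced through the origin can be sketched as follows. It is assumed here that the predicted values are regressed on the observed ones and that R² is computed as 1 - SSres/SStot with SStot taken about the mean; this is one common convention and not necessarily the one applied by the software. The data are hypothetical.

```python
def slope_through_origin(observed, predicted):
    """Least-squares slope a of predicted = a * observed (intercept b = 0)."""
    numerator = sum(o * p for o, p in zip(observed, predicted))
    return numerator / sum(o * o for o in observed)


def r_squared(observed, predicted):
    """1 - SS_res / SS_tot for the line through the origin."""
    slope = slope_through_origin(observed, predicted)
    mean_p = sum(predicted) / len(predicted)
    ss_res = sum((p - slope * o) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((p - mean_p) ** 2 for p in predicted)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]        # hypothetical measured values
predicted = [1.1, 1.9, 3.2, 3.8]       # hypothetical model outputs
a = slope_through_origin(observed, predicted)
r2 = r_squared(observed, predicted)
```

A slope close to 1 together with a high R² indicates that the predictions follow the measurements without systematic over- or underestimation; with this definition R² can even become negative when the forced line fits worse than the mean.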

LTPP road
Again, for all three output parameters the same number of hidden layers was used. Since it was found that the result did not differ much when using different numbers of hidden neurons, the min number of 5, which gave a good prediction, was used. The batch back propagation learning algorithm with a learning rate of η = 0.4 and a momentum coefficient of β = 0.9 was found to give the best results for the prediction of all three output parameters. The hyperbolic tangent was chosen as the activation function for both the hidden layer and the output layer.
Figs 7-9 give the result of ANN modelling and show the prediction for the output variables using the query files for validating the determined network. The findings for the LTPP dataset are similar. For the output variables Fmax and Smax a prediction is possible with R² values of 0.75 and 0.74. For this database even the prediction for the output variable shear deformation w at Fmax is possible, although the R² value of 0.52 is clearly not very high. This finding can be attributed to the fact that with age the deformation at Fmax becomes smaller and the distinction between the new and aged values becomes clearer.
It is interesting to note that in the LTPP data, as opposed to the New Road data, a weak correlation for the shear deformation at Fmax can be found. This correlation might be attributed to the fact that, in case of LTPP data, the shear deformation data overall lie within a narrower range and the amount of similar or comparable data (new/old) increased. Similar statements are made for the LTPP dataset (Figs 8-10). However, here Fmax and Smax show only weak correlations with R² values of 0.58 and 0.67, while the R² of the linear correlation for the shear deformation is only 0.50, partly due to the fact that, again for physical reasons, the regression line was forced through the origin. On the other hand, the slope of the linear regression line was close to 1 for all output variables. That the prediction for LTPP Road is not as good as for New Road is explained by the fact that LTPP Road is a combination of two datasets which differ regarding the time of testing.

Discussion
The applied software offers the possibility to analyse ANN results by using the so-called response graphs. The response graph displays the response of the model output by varying one of the variables, while keeping the other input variables constant. The constant value for each variable is the mean value of that variable in the dataset. Fig. 10 gives the response graphs for New Road showing the input variables "Temperature" and "Air void content of layer 1 and layer 2". It was decided to show the response graphs for max F and max S since here physical dependencies are known best.
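The construction of a response graph can be sketched independently of the applied software. In the sketch below, `model` is any callable that maps a list of input values to a prediction; the linear stand-in model and the four data rows are hypothetical and are not fitted to the datasets of this paper.

```python
def response_curve(model, rows, index, steps=5):
    """Vary input `index` over its observed range; hold all others at their mean."""
    n_inputs = len(rows[0])
    means = [sum(row[j] for row in rows) / len(rows) for j in range(n_inputs)]
    low = min(row[index] for row in rows)
    high = max(row[index] for row in rows)
    curve = []
    for k in range(steps):
        value = low + (high - low) * k / (steps - 1)
        point = list(means)
        point[index] = value            # only this variable is changed
        curve.append((value, model(point)))
    return curve


def toy_model(point):
    """Hypothetical stand-in for a trained network: [temperature, air void]."""
    return 30.0 - 0.5 * point[0] + 0.1 * point[1]


rows = [[20.0, 3.0], [40.0, 5.0], [20.0, 4.0], [40.0, 4.0]]
temperature_response = response_curve(toy_model, rows, index=0)
```

Because all other inputs are fixed at their mean, such a curve shows the isolated influence of one variable only at that single operating point; interactions between variables are not visible in it.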
As shown in Fig. 10a, the max F and max S decrease with increasing temperature from 20 °C to 40 °C. The same applies for the max F and max S with increasing air void content of layer 1 (Fig. 10b).
The situation is different for the air void content of layer 2, where an increase in air void content goes with an increase of Fmax and Smax (Fig. 10c). While the first two findings are in agreement with practical experience, the finding that the shear force increases with increasing air void content of layer 2 is debatable. Here, the range of air void content is probably too small for determining a clear dependency, and one has to keep in mind that the increase in shear force is also quite small. Regarding the air void content (Fig. 10b), another explanation is found in the difference between layer 1 and layer 2. The difference in air void content of layer 1 is mainly based on differences in the asphalt concept, with mastic asphalt having very low air void content on the one hand and porous asphalt having very high air void content on the other hand. The air void content of layer 2 lies within a clearly defined range since these layers are all constructed according to the concept of asphalt concrete. In case of the air void content of layer 2, other effects of roughness and interlock could be dominant.
Fig. 11 gives the response graphs for LTPP Road showing the input variables ADT, vehicles > 3.5 t and age.
Fig. 11. Response graphs of the input variables as a function of Fmax and τmax: a - ADT; b - vehicles > 3.5 t; c - age
As shown in Fig. 11a, the shear force decreases with increasing ADT. The same applies for an increasing percentage of heavy vehicles (Fig. 11b). The situation is different when looking at the age, where an increase in operation time of 10 years leads to an increase of shear force and shear stress (Fig. 11c). The findings for the ADT, the percentage of heavy vehicles and the age are in agreement with practical experience.
The results in a paper by Raab (Raab, Partl 2008) clearly state that while the nominal Fmax of intact pavements increases with age or operation time, very high levels of average daily traffic and high percentages of heavy vehicles can lead to pavement deterioration combined with a decrease in shear force and shear stress. In this investigation it was found that very high levels of average daily traffic and high percentages of heavy vehicles can cause damage to the pavement, which results in a decrease of shear forces and stresses mainly in the wheel path. In most cases pavement deterioration is visible (ruts, cracks), but when the pavement is subjected to very high levels of ADT over a long period of time, shear properties were found to decrease without the pavement showing visible defects.

Conclusions
The results presented in this paper support the following conclusions:
1. ANN techniques are a valuable tool to derive models from datasets and to predict interlayer shear bond properties such as max shear force, deformation at max shear stress, and max shear stiffness.
2. The prediction quality and accuracy differ between the various interlayer bond properties. Max shear force and shear stress are predicted best, followed by max shear stiffness, while shear deformation at max shear stress is less representative of the bond properties.
3. Engineering judgement and practical knowledge are indispensable when choosing the important variables for using the artificial neuronal networks technique. Therefore, plausibility checks are necessary.
4. According to the findings of this research, it is recommended to create additional independent query test files. In order to have the most reliable output, the data for these query files must be chosen randomly, but taking into account every investigated characteristic, such as different materials, different temperatures or intermediate layers etc.
5. Regarding the New Road dataset, the best prediction was found for the output parameter "max shear force", followed by the "max shear stiffness", with linear regression coefficient values R² of 0.94 and 0.85 for the query test set. A prediction for the max shear deformation was not possible, since the deformation data seemed to be too diverse within the database.
6. In the response graphs for "temperature" and "air void layer 1", the predicted max shear forces are in good agreement with practical experience and findings from other research, while for "air void layer 2" a connection with practical experience was more difficult.
7. For the LTPP Road dataset, a combination of New Road with its performance data, the prediction of max shear force and max shear stiffness is not as accurate as for the New Road dataset. This results in linear regression coefficient values R² of 0.58 and 0.62. The prediction for the shear deformation even becomes better than for the New Road dataset (R² = 0.42).
8. The response graphs for the LTPP Road dataset for the prediction of the max shear force support findings that aging and trafficking have a positive effect on the max shear force, while the pavement deteriorates, leading to a decrease in shear force, when the average daily traffic and the percentage of heavy vehicles become very large.