Studying hydraulic conductivity of asphalt concrete using a database

Abstract
A new database called AC/k-1624, containing over 1600 measurements of saturated hydraulic conductivity of asphalt concrete, has been assembled and analysed. AC/k-1624 was used to investigate the effect of the grading entropy parameters on saturated hydraulic conductivity. A new prediction model comprising both air voids and grading entropy is presented. The database analysis using different predictors of asphalt hydraulic conductivity reveals that the gradation does affect the hydraulic conductivity, but that the air void level is necessary to make reasonable a priori assessments of hydraulic conductivity for asphalt concrete. The new empirical model is shown to have good predictive capacity for hydraulic conductivity, fitting more closely at higher values, with more scatter observed at lower values. The effects of test type, gradation classification and Nominal Maximum Aggregate Size (NMAS) are also studied, revealing, in general, relatively modest influences on the computed regression coefficients.


Introduction
Assessing the propensity of asphalt concrete pavement layers to allow water to flow through them is important for understanding pavement performance [1]. Hydraulic conductivity (k) of asphalt concrete has been the subject of sustained research efforts in recent decades [e.g., 2-14]. This paper reviews the influence of different predictors for k of asphalt concrete using a database called AC/k-1624. The database contains over 1600 measurements of k on asphalt concrete mixtures and builds upon previous database analyses of this important parameter. An early version of this database was presented in Vardanega and Waters [15] (n = 467) and was subsequently expanded in Vardanega et al. [16] (n = 1318) as well as Feng [17] (n = 1578). The aim of this paper is to bring together and revisit the results of the previous studies and to develop a novel empirical model for asphalt concrete k that incorporates both the percentage air void (AV%) level and a grading entropy parameter (a similar concept for gravels was recently proposed by O'Kelly and Nogal when discussing Feng et al. [18, 19] and presented in detail in O'Kelly and Nogal [20]).
In particular, this study aims to: (i) Report the details of the sources of data used to build AC/k-1624; (ii) Develop transformation models [21, 22] linking measurements of saturated k to simple asphalt concrete mix parameters and determine the key predic-

Notation
The following notation is used in this paper (units given in brackets for those quantities with units):
A: relative base entropy
a: a coefficient
AV%: air void percentage
B: normalised entropy increment
b: a coefficient
C_i: number of elementary statistical cells in fraction i
D_10: effective particle size, for which 10% of the soil is finer (length)
D_20: effective particle size, for which 20% of the soil is finer (length)
D_25: effective particle size, for which 25% of the soil is finer (length)
D_30: effective particle size, for which 30% of the soil is finer (length)
D_40: effective particle size, for which 40% of the soil is finer (length)
D_50: effective particle size, for which 50% of the soil is finer (length)
D_60: effective particle size, for which 60% of the soil is finer (length)
D_70: effective particle size, for which 70% of the soil is finer (length)
D_75: effective particle size, for which 75% of the soil is finer (length)
D_90: effective particle size, for which 90% of the soil is finer (length)
D_x: effective particle size
H: entropy of a set of probabilities
k: hydraulic conductivity (length·time⁻¹)
n: number of data points
N: number of fractions
NV: normalised voids
NMAS: nominal maximum aggregate size (length)
p: p-value
p_i: probability of a system being in cell i of its phase space
PSD: particle size distribution
R²: coefficient of determination
R_p: representative pore size (length)
S: grading entropy
S_0: base entropy
SE: standard error
x_i: relative frequency of fraction i
ΔS: entropy increment

Effective particle size
There have been many studies exploring the effects of mixture gradation on the k of asphalt concrete [2, 15, 37-39]. Waters [2, 40] used the 'normalised air voids' (NV, which incorporates AV% and D_50) as a predictor for k of asphalt concrete. Following this work, Vardanega and Waters [15] reported a database (n = 467) and, following subsequent analysis, showed that the 'representative pore size' (R_p), with D_75 as the effective particle size, was a good predictor of asphalt concrete k, giving the following equation: [Eq. (1)] where: [Eq. (2)] in which D_x is the effective particle size in mm (taken as D_75 in Vardanega and Waters [15]). Vardanega et al. [16] updated the database of Vardanega and Waters [15] (n = 1318) and showed that R_p remained a good predictor of asphalt concrete k.
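As an illustration of how an effective particle size D_x might be read from a gradation curve, the sketch below interpolates between sieve points. The function name, the sieve sizes and the percent-passing values are hypothetical, and interpolating linearly in the logarithm of particle size is an assumed convention, not necessarily the procedure used in the cited studies.

```python
import numpy as np

def effective_size(sieve_mm, pct_passing, x):
    """Return D_x (mm): the particle size for which x% of the mix is finer.

    Interpolates linearly in log10(size) between sieve points (an assumed
    convention). Inputs must be sorted by increasing sieve size, with
    percent passing increasing as well.
    """
    sieve_mm = np.asarray(sieve_mm, dtype=float)
    pct_passing = np.asarray(pct_passing, dtype=float)
    if not (pct_passing[0] <= x <= pct_passing[-1]):
        raise ValueError("x lies outside the measured gradation curve")
    log_d = np.interp(x, pct_passing, np.log10(sieve_mm))
    return 10.0 ** log_d

# Hypothetical gradation (not taken from AC/k-1624)
sieves = [0.075, 0.3, 1.18, 4.75, 9.5, 13.2]
passing = [5.0, 12.0, 28.0, 55.0, 90.0, 100.0]
d50 = effective_size(sieves, passing, 50)
d75 = effective_size(sieves, passing, 75)
```

Any of the D_x values in the notation list could be obtained the same way by changing the third argument.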
The entropy of a set of probabilities p_1, …, p_n can be computed as: [Eq. (3)] where p_i is the probability of a system being in cell i of its phase space [42].
To compute the statistical entropy of the PSD, a double statistical cell system is used, comprising a grid of fractions (real cell system) with successively doubled width and an elementary statistical cell system (imaginary cell system) with a uniform width d_0 [49]. After embedding the PSD information in the double statistical cell system, the grading entropy of an arbitrary soil mixture can be computed using [49]: [Eq. (4)] where N is the number of fractions, x_i is the relative frequency of fraction i, and C_i is the number of elementary statistical cells in fraction i. The grading entropy S can be expressed as a combination of two terms, the base entropy, S_0 (Eq. (5)), and the entropy increment, ΔS (Eq. (6)): [Eq. (5)] and [Eq. (6)] The base entropy, S_0, explains the relative spread of the grain sizes, while the entropy increment, ΔS, describes the statistical entropy of the PSD in terms of the fractions and also explains the relative distribution of the size of the particles.
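For orientation, the quantities described above can be written out in the form commonly used in the grading entropy literature. This is a sketch under stated assumptions, not a reproduction of the paper's own equations: the base-2 logarithm and the assumption that the material in fraction i is spread uniformly over its C_i elementary cells (so each cell has probability x_i / C_i) are both assumed here.

```latex
\begin{align}
H &= -\sum_{j} p_j \log_2 p_j
  && \text{(entropy of a set of probabilities)} \\
S &= -\sum_{i=1}^{N} C_i \,\frac{x_i}{C_i} \log_2 \frac{x_i}{C_i}
   = -\sum_{i=1}^{N} x_i \log_2 \frac{x_i}{C_i}
  && \text{(uniform spread within each fraction)} \\
S_0 &= \sum_{i=1}^{N} x_i \log_2 C_i
  && \text{(base entropy)} \\
\Delta S &= -\sum_{\substack{i=1 \\ x_i \neq 0}}^{N} x_i \log_2 x_i
  && \text{(entropy increment)} \\
S &= S_0 + \Delta S
\end{align}
```

The last line follows from splitting the logarithm of the ratio x_i / C_i in the second line into its two terms.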
To make the entropy increment ΔS independent of the number of fractions N, and to constrain the base entropy, S_0, to a set interval for a varying number of fractions [41, 44], the normalised grading entropy coordinates, the relative base entropy, A, and the normalised entropy increment, B, were introduced [41]. The relative base entropy, A, describes the symmetry of the PSD and is given by: [Eq. (7)] The normalised entropy increment, B, describes the kurtosis of the PSD and is given by: [Eq. (8)] Variations in the PSD can be plotted vectorially as a set of points on the normalised grading entropy diagram rather than as a series of full PSD plots.
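A minimal sketch of how the coordinates A and B might be computed from the relative frequencies of the fractions is given below. It relies on assumptions that this text does not confirm: A is taken as (S_0 − S_0,min) / (S_0,max − S_0,min), B as ΔS / ln N, and logarithms are base 2, as is common in the grading entropy literature. Because each fraction is twice as wide as the previous one, log2(C_i) increases by one per fraction, so A does not depend on d_0. The function name and input values are hypothetical.

```python
import math

def grading_entropy_coordinates(x):
    """Return (A, B) for relative frequencies x of N consecutive fractions.

    Assumptions (not confirmed by the text): fractions are ordered fine to
    coarse, each is twice as wide as the previous one so log2(C_i) rises by 1
    per fraction, A = (S0 - S0_min) / (S0_max - S0_min) and B = dS / ln(N).
    """
    n = len(x)
    if n < 2 or abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("need at least two fractions whose frequencies sum to 1")
    # Base entropy measured from the finest fraction: sum of x_i * (i - 1)
    s0_rel = sum(xi * i for i, xi in enumerate(x))
    a = s0_rel / (n - 1)
    # Entropy increment, skipping empty fractions (0 * log 0 is taken as 0)
    d_s = -sum(xi * math.log2(xi) for xi in x if xi > 0)
    b = d_s / math.log(n)
    return a, b

# Hypothetical four-fraction mixtures
a_uniform, b_uniform = grading_entropy_coordinates([0.25, 0.25, 0.25, 0.25])
a_coarse, _ = grading_entropy_coordinates([0.1, 0.2, 0.3, 0.4])
```

Under these assumptions a mixture with equal amounts in every fraction gives A = 0.5, and shifting mass towards the coarser fractions moves A above 0.5.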
James [50] undertook an early study on the potential use of the grading entropy coordinates as predictors of asphalt concrete k. Feng et al. [18] showed that for a set of constant head permeability tests on road construction granular mixtures (n = 30) subjected to similar compaction effort, the normalised grading entropy coordinates were a reasonable predictor of k (a similar result was reported in Feng et al. [33] for a database (n = 164) of sand-gravel mixtures).

Database
A database of asphalt concrete k measurements (n = 1624), referred to in this paper as AC/k-1624, has been compiled. The database is an expanded version of those presented in Vardanega et al. [16] (n = 1318) and Feng [17] (n = 1578). Table 1 shows the origins of the information used to compile the database, relevant ranges of the key parameters and details on the asphalt concrete samples and testing methods. Anisotropy of k is not studied in this paper, as the direction of the test flow is not specifically stated in most data sources; usually k is measured vertically through the test specimen. Although most of the k data were obtained from laboratory testing on laboratory-fabricated samples, the compiled database also includes some data from field tests [51, 52]. These field k data will inevitably account for both the horizontal and vertical k to some extent [53, 54]. The saturation level of the test samples is sometimes reported in the data sources [e.g., 53-56]. However, based on an examination of the testing methods used in the database, it is assumed that k was measured in saturated or near-saturated conditions. The flow conditions during the k test are reported in some sources (see Table 1). Considering the air void range present in the database (1.7% to 32.67%), it is accepted that non-laminar flow may have occurred in some of the tests on samples with higher air void contents. That said, the k values reported in the database were almost certainly all calculated using Darcy's law and the corresponding assumption of laminar flow. Also, some scatter in the analysis results presented in this paper could be due to the variation in test methods and possibly testing temperature, the latter not often reported in the cited publications.

Statistical methods
When evaluating the quality of linear regressions, the coefficient of determination (R²) must be supplemented with other statistical measures, especially the number of data points used in the regression (n) and the standard error (SE) [80]. Phoon and Kulhawy [21, 22] explained the importance of quoting the standard deviation of a transformation model (regression) in geotechnical research. For the key correlations presented in this paper, accompanying predicted versus measured plots are provided, with the predicted values on the x-axis and the measured values on the y-axis, following [81]. Considering the unquantified variations in sample sources, test methods and temperature, a prediction band of 0.2 to 5 times the predicted value was chosen to examine the prediction accuracy for the data points in AC/k-1624. Stevens [82] emphasized the importance of outliers and influential points, as these points may substantially distort the regression results. Data points with standardized residuals falling outside the interval (−2.5, 2.5) (e.g., see the review of [83]) and/or with a leverage greater than three times the average (e.g., see the review of [84]) were classed as outliers or influential points. Adjusted correlations were then developed with the identified outliers and influential points excluded from the analysis. For the three key correlations discussed in Sections 4.2 to 4.4, around 5 to 7% of the points were classed as outliers or influential points using this methodology.
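The screening rule described above can be sketched for a simple linear regression as follows. The function name and the synthetic data are hypothetical, and the residuals are internally studentized, which is one common definition of a standardized residual; the paper does not state which definition was used.

```python
import numpy as np

def flag_outliers(x, y, resid_limit=2.5, leverage_factor=3.0):
    """Flag points of a simple linear regression y = c0 + c1 * x.

    A point is flagged if its standardized residual lies outside
    (-resid_limit, resid_limit) or its leverage exceeds leverage_factor times
    the average leverage. Residuals are internally studentized here, which is
    an assumption about the definition used.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])       # design matrix
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares fit
    resid = y - X @ beta
    n, p = X.shape
    # Leverage = diagonal of the hat matrix X (X'X)^-1 X'; its mean is p / n
    leverage = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    s = np.sqrt(resid @ resid / (n - p))            # residual standard error
    std_resid = resid / (s * np.sqrt(1.0 - leverage))
    return (np.abs(std_resid) > resid_limit) | (leverage > leverage_factor * p / n)

# Synthetic illustration: 31 points near a line, one with a gross error added
rng = np.random.default_rng(0)
x_demo = np.linspace(0.5, 3.5, 31)
y_demo = 2.0 * x_demo - 6.0 + rng.normal(0.0, 0.1, x_demo.size)
y_demo[15] += 5.0
flags = flag_outliers(x_demo, y_demo)
```

An adjusted correlation would then be obtained by refitting on the points where `flags` is False.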

Air voids
The regression between ln k and ln AV% from AC/k-1624 yielded the following equation: [Eq. (9a)] which can be rearranged to give: [Eq. (9b)] Based on Eq. (9a), about 5% of the data points were identified as outliers or influential points. The adjusted correlation with all identified outliers and influential points removed is (Fig. 1): [Eq. (10a)] which can be rearranged to: [Eq. (10b)] No significant difference between the regression coefficients in Eqs. (9a) and (10a) was observed. The measured k is plotted against the k predicted using Eq. (10) in Fig. 1, with the k level classified (see the shading on Figs. 1-3, which indicates the following categories based on [15]: A1 = 'very low permeability'; A2 = 'low permeability'; B = 'moderately permeable'; C = 'permeable'; D = 'moderately free draining'; E = 'free draining'). Fig. 1 shows that 71.14% of the data points lie within the 0.2 to 5 times range; about 50.06% of the data points fall below the line of equality (overpredictions), while 49.94% are underpredicted by the correlation. Fig. 1 indicates that AV% is a strong predictor of k.
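The two forms of the correlation referred to above (a straight line in log space and its rearranged power law) are related as in the sketch below. The form ln k = ln a + b ln AV%, rearranged to k = a (AV%)^b, is an assumption based on the description in the text; the function name is hypothetical and the data are synthetic, so the fitted numbers have no connection to Eqs. (9) and (10).

```python
import numpy as np

def fit_power_law(av_pct, k):
    """Fit ln(k) = ln(a) + b * ln(AV%) by least squares; return (a, b, r2)."""
    ln_av = np.log(np.asarray(av_pct, dtype=float))
    ln_k = np.log(np.asarray(k, dtype=float))
    b, ln_a = np.polyfit(ln_av, ln_k, 1)   # slope first, then intercept
    fitted = ln_a + b * ln_av
    ss_res = np.sum((ln_k - fitted) ** 2)
    ss_tot = np.sum((ln_k - ln_k.mean()) ** 2)
    # Exponentiating the intercept gives the coefficient of the power law
    return np.exp(ln_a), b, 1.0 - ss_res / ss_tot

# Synthetic data generated from k = 1e-4 * AV%^3 with lognormal scatter
rng = np.random.default_rng(1)
av_demo = rng.uniform(2.0, 30.0, 200)
k_demo = 1e-4 * av_demo ** 3 * np.exp(rng.normal(0.0, 0.5, 200))
a_hat, b_hat, r2 = fit_power_law(av_demo, k_demo)
```

Note that R² computed this way refers to the fit in log space, which is how the correlations in this paper are described.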

Effective particle size
The coefficient of determination (R²) for various effective particle sizes (D_10 to D_90) is given in Table 2. It is observed that D_30 and D_40 yield the highest R², which is close to the result of Waters [85], where D_25 was chosen as the effective particle size for asphalt concrete k predictions. Table 2 also shows the variation in R² when the various effective particle sizes are substituted into R_p (Eq. (2)), which is then used as the predictor. For the studied database (AC/k-1624), D_60 yields the highest R² when adopted as D_x in R_p. Vardanega and Waters [15] showed that D_eff was located in the coarse fraction (D_50-D_90); this is similar to the results shown in Table 2, where R² is highest in the range D_50-D_75. For the analysis in this paper, D_60 is used to compute the representative pore size (R_p). The fitted correlation between ln k and ln R_p (D_x = D_60) for the entire database yields: [Eq. (11a)] which can be rearranged to give: [Eq. (11b)] Around 6.8% of the data points were identified as outliers or influential points based on Eq. (11a). The adjusted correlation with these points removed is (Fig. 2): [Eq. (12a)] which can be rearranged to: [Eq. (12b)] Comparing Eqs. (11a) and (12a), it can be observed that the influence of the potential outliers and influential points is marginal. The k-measured versus k-predicted plot using Eq. (12) (Fig. 2) shows that 77.59% of the data points lie within the 0.2 to 5 times prediction range; 43.42% lie below the line of equality (overprediction), while the remainder (56.58%) lie above the line of equality (underprediction). Fig. 2 shows that the inclusion of a gradation parameter (D_x) does slightly enhance the accuracy of prediction in terms of R² and percentage within the prediction range when compared with using AV% alone (especially for the permeable and free draining categories), though not as markedly as presented in the earlier study [15].

Table 1 notes: Data sources S23-S27 and part of the data in S5 (also in [79]) were used in Vardanega and Waters [15]; S13-S22 in Vardanega et al. [16]; and S4-S12 in Feng [17] and Feng et al. [33]. * Field permeability test data. ** Sources with saturation level clearly stated. *** Number in brackets is the n value for the NMAS value stated.
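The percentages quoted for the predicted versus measured plots can be computed as in the following sketch. With predicted values on the x-axis and measured values on the y-axis, a point below the line of equality has measured k smaller than predicted k, that is, an overprediction. The function name and the input values are hypothetical.

```python
import numpy as np

def prediction_summary(k_measured, k_predicted, lower=0.2, upper=5.0):
    """Return (% within band, % overpredicted, % underpredicted)."""
    k_measured = np.asarray(k_measured, dtype=float)
    k_predicted = np.asarray(k_predicted, dtype=float)
    ratio = k_measured / k_predicted
    within = np.mean((ratio >= lower) & (ratio <= upper)) * 100.0
    over = np.mean(ratio < 1.0) * 100.0    # measured below the line of equality
    under = np.mean(ratio > 1.0) * 100.0   # measured above the line of equality
    return within, over, under

# Hypothetical values in arbitrary but consistent units
measured = [1.0, 0.1, 12.0, 3.0]
predicted = [2.0, 1.0, 2.0, 2.0]
within, over, under = prediction_summary(measured, predicted)
```

Because the band limits 0.2 and 5 are reciprocals, the result is the same whether the band is defined on measured/predicted or on predicted/measured.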

Grading entropy
The grading entropy S describes the disorder of the PSD [44, 45, 49]. Based on its intrinsic features, it is reasonable to infer that the grading entropy (S) could assist with characterization of the fluid path within asphalt concrete mixtures. The target PSD and AV% of asphalt concrete mixtures are usually specified in the field, so these parameters were considered good candidates to predict k. The multiple linear regression of k with both AV% and S yields: [Eqs. (13a) and (13b)] Based on Eq. (13a), 6.4% of the points in the studied database were identified as outliers or influential points. The adjusted regression with these points removed is (Fig. 3): [Eq. (14)] Some variation in the regression coefficients can be observed, especially for S. The k-measured versus k-predicted (Eq. (14)) plot is presented in Fig. 3, where 79.34% of the data points lie within the 0.2 to 5 times prediction range, 50.13% of the points fall below the line of equality (overpredictions) and 49.87% lie above it (underpredictions). Fig. 3 shows that Eq. (14) gives a better prediction of k than Eq. (12) and Eq. (10).
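A two-predictor regression of this kind can be sketched as below. The functional form ln k = c0 + c1 ln AV% + c2 S is purely an assumption made for illustration, since the form of Eqs. (13) and (14) cannot be recovered from this text; the function name is hypothetical and the data are synthetic.

```python
import numpy as np

def fit_two_predictors(av_pct, s, k):
    """Least-squares fit of ln(k) = c0 + c1 * ln(AV%) + c2 * S (assumed form)."""
    av_pct, s, k = (np.asarray(v, dtype=float) for v in (av_pct, s, k))
    X = np.column_stack([np.ones_like(s), np.log(av_pct), s])
    y = np.log(k)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, p = X.shape
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    se = np.sqrt(resid @ resid / (n - p))   # standard error of the regression
    return coef, r2, se

# Synthetic data with known coefficients (-9, 3, 1.5) and scatter 0.4
rng = np.random.default_rng(2)
av_demo = rng.uniform(2.0, 30.0, 300)
s_demo = rng.uniform(1.0, 3.0, 300)
k_demo = np.exp(-9.0 + 3.0 * np.log(av_demo) + 1.5 * s_demo
                + rng.normal(0.0, 0.4, 300))
coef, r2, se = fit_two_predictors(av_demo, s_demo, k_demo)
```

Reporting n, R² and SE together follows the practice described in the statistical methods section.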

Analysis of data subsets
In the following analysis, the entire database AC/k-1624 is used to study whether the test method, gradation classification and NMAS significantly affect the regression results presented in Section 4. The outliers identified in Section 4 are not removed in the analysis that follows, as the outliers and influential points will differ slightly for each data subset. Therefore, the regression results from the data subset analysis should be compared with the fitted results based on the entire database (Eqs. (9b), (11b) and (13b)).

Test method
The database AC/k-1624 includes k data measured by various test methods. In order to investigate the effect of the test method, the database is further divided into 'constant head', 'falling head', 'falling head rising-tail' and 'field test' subsets. For the 'constant head' subset, D_20 gives the highest R², while for the other subsets the peaks are between D_50 and D_70. The analysis results using Eqs. (9b), (11b), (13b) and a modified Eq. (11b) with the favoured D_x for each data subset are summarized in Table 3. The examined models (Eqs. (9b), (11b) and (13b)) still provide statistically strong predictions, mostly within the 0.2 to 5 times range, though some variation appears in the coefficients and exponents of the regression equations among the different subsets. Most of the data are from falling-head tests (n = 1267), and it is therefore not surprising that the regression coefficients for that subset are similar to those shown in Eqs. (9b), (11b) and (13b).
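Refitting a correlation separately for each data subset can be sketched as follows. The power-law form, the function name and the records are hypothetical and are used only to show the grouping step; they are not values from AC/k-1624.

```python
import numpy as np
from collections import defaultdict

def fit_by_subset(labels, x, y):
    """Fit ln(y) = ln(a) + b * ln(x) separately for each label.

    Returns {label: (a, b, n)}; subsets with fewer than 3 points are skipped.
    """
    groups = defaultdict(list)
    for label, xi, yi in zip(labels, x, y):
        groups[label].append((xi, yi))
    results = {}
    for label, pairs in groups.items():
        if len(pairs) < 3:
            continue
        xs, ys = np.array(pairs, dtype=float).T
        b, ln_a = np.polyfit(np.log(xs), np.log(ys), 1)
        results[label] = (float(np.exp(ln_a)), float(b), len(pairs))
    return results

# Hypothetical records: test method, AV% and k (illustrative values only)
methods = ["falling head"] * 4 + ["constant head"] * 4
av = [4.0, 6.0, 8.0, 10.0, 4.0, 6.0, 8.0, 10.0]
k = [2.0 * v ** 3 for v in av[:4]] + [5.0 * v ** 2 for v in av[4:]]
fits = fit_by_subset(methods, av, k)
```

Keeping n alongside each pair of coefficients makes it easier to judge how much weight a subset's fit deserves.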

Gradation parameter
O'Kelly and Nogal [20] processed data from 47 permeability measurements on granular soil and concluded that, for the hydraulic conductivity assessment of coarse-grained soils, separate analyses should be conducted based on the gradation type. Since the asphalt concrete mixture is largely comprised of coarse-grained soil, it is worth examining the potential influence of gradation type for the asphalt concrete database in this study. As per the Unified Soil Classification System [88], the database AC/k-1624 was initially divided into well-graded soil and poorly-graded soil based on the gradation type (Table 4), and then further subdivided into well-graded gravel, poorly-graded gravel, well-graded sand and poorly-graded sand data subsets based on the fraction size (Table 5). The most favoured D_x always falls within the D_50 to D_70 range for all subsets. The analysis results calibrated based on Eqs. (9b), (11b), (13b) and a modified Eq. (11b) with the favoured D_x for each data subset are summarized in Tables 4 and 5.
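The well-graded versus poorly-graded split in the Unified Soil Classification System rests on the coefficients of uniformity and curvature, as in the sketch below. The function name and particle sizes are hypothetical, the fines-content rules of the classification system are ignored, and the gravel or sand decision is passed in as a flag because the exact procedure applied to AC/k-1624 is not given in this text.

```python
def gradation_type(d10, d30, d60, is_gravel):
    """Classify a coarse-grained gradation as 'well-graded' or 'poorly-graded'.

    Uses the USCS coefficients of uniformity Cu = D60 / D10 and curvature
    Cc = D30^2 / (D10 * D60). Fines-content rules are ignored (a
    simplification made for this sketch).
    """
    cu = d60 / d10
    cc = d30 ** 2 / (d10 * d60)
    cu_min = 4.0 if is_gravel else 6.0   # USCS threshold: 4 for gravel, 6 for sand
    well = cu >= cu_min and 1.0 <= cc <= 3.0
    return "well-graded" if well else "poorly-graded"

# Hypothetical particle sizes in mm
dense = gradation_type(0.2, 1.5, 6.0, is_gravel=True)     # Cu = 30, Cc = 1.875
gap = gradation_type(0.2, 4.0, 6.0, is_gravel=True)       # Cu = 30, Cc = 13.3
uniform = gradation_type(0.3, 0.4, 0.5, is_gravel=False)  # Cu = 1.67
```

The D_10, D_30 and D_60 inputs are the same effective particle sizes listed in the notation.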

Nominal maximum aggregate size (NMAS)
The effect of nominal maximum aggregate size (NMAS) on the k of asphalt concrete has been discussed in Hainin et al. [5] and Yan et al. [74]. AC/k-1624 was subdivided into three data subsets based on the NMAS, and its potential effect on the prediction of k was investigated further. Table 6 shows that for the data subsets with coarser NMAS (9.5 mm < NMAS ≤ 12.5 mm and NMAS > 12.5 mm) the most favoured D_x still falls within the D_50 to D_70 range, while for the data subset with NMAS ≤ 9.5 mm, D_90 gives the peak value of R². The analysis results calibrated based on Eqs. (9b), (11b), (13b) and a modified Eq. (11b) with the favoured D_x for each data subset are summarized in Table 6. Some variation in the regression coefficients does exist among the different data subsets, but it is not as marked as for test type (Table 3).

Summary and conclusions
A large database (n = 1624) of k measurements on asphalt concrete, called AC/k-1624, has been presented in this paper. Potential predictors for k of asphalt concrete, including AV%, effective particle size D_x, representative pore size R_p, the grading entropy parameter (S), gradation parameters and NMAS, were investigated using AC/k-1624. AV% is a significant predictor of asphalt concrete permeability, explaining around 67% of the variation (R² = 0.67, Eq. (10)). An effective particle size is often used to incorporate gradation into empirical models for asphalt concrete permeability, e.g. the representative pore size [15] or normalised air voids concepts [2]. Such methods are limited by the need to statistically determine the best D_eff, which may change as the database expands (D_75 was chosen in [15], with further research in this paper showing that D_60 may be a better candidate). For a large database (n > 1500, with outliers removed), the grading entropy framework in the form of the S parameter offers a method to incorporate gradation without having to statistically determine the effective particle size. Inclusion of S explains a further 9 percent of the variation in k (R² = 0.76, Eq. (14)), as opposed to only 3 percent for R_p (R² = 0.70, Eq. (12)).
Eq. (14) is a novel empirical equation which systematically captures the grading and porosity information of asphalt concrete mixtures, calibrated with the large database AC/k-1624 (see also Ching et al. [86] for recent developments in international efforts to develop geodatabases for key geotechnical parameters or regions). The present authors suggest that a similar approach can be used for pavement engineering properties at either international or regional scale. Eq. (14) shows that, from both statistical and practical perspectives, S is a superior way of accounting for gradation changes than the R_p concept for asphalt concrete (based on the analysis of the large database AC/k-1624). Subdivision of the database by test type, grading classification and NMAS shows that no significant increases in R² can be found from Eq. (14), with the exception of constant head and field tests (Table 3), although it is noted that the regression equations are affected to some degree by these subdivisions.
While the statistical trends shown in this paper are significant (p is often < 0.001) and the database is relatively large (n > 1600), much of the scatter is probably due to the fact that different test methods for k and AV% determination were used in the studies included in the database. While it would be preferable to have a database with limited variation of conditions, such a dataset is not available at present. Despite this limitation, the trends shown give useful approximations for k that can be used by pavement engineers to assess the propensity of different mix designs to transmit water. Future uses of large databases such as AC/k-1624 may include Artificial Neural Network (ANN) modelling approaches as reported in [38, 87].

Disclosure statement
The authors wish to report no conflict of interest.

Data availability statement
This research has not generated any new experimental data.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.