Systematic literature review and mapping of the prediction of pile capacities

Abstract Predicting the load capacity of a pile is one of the first steps of foundation design. In geotechnical engineering, there are different ways of predicting soil resistance, which is one of the main parameters. The pile load test is the most accurate method to determine the bearing capacity of foundations, owing to the nature of the experiment. On the other hand, it is an expensive and time-consuming test. Over the years, semi-empirical methods have played an important role in this matter. Initially, many of the proposed methods were based on linear regressions. These are still the most widely used, but recently another method has gained popularity in geotechnics: the Artificial Neural Network (ANN). Over the past few decades, Machine Learning has proven to be a very promising technique in the field, given the complexity and variability of soil materials and properties. With that in mind, this work reviews and maps the main papers published in journals over the last decades. The aim of this paper was to determine the main methods used and the gaps that can be filled by future research. The bibliometric results and the protocol questions, covering types of piles, tests, statistical methods, and characteristics inherent to the data, indicated a lack of works on helical piles and on instrumented pile load test results that separate point and shaft resistance.


Introduction
Estimating the bearing capacity of piles is an important step of foundation design, and one of the best ways to determine this capacity is through pile load tests. Despite the accuracy of this test, it is not always used in small or medium constructions due to its high cost. In such cases, semi-empirical methods are a very important tool for predicting pile load in the foundation design process.
Semi-empirical methods, such as Aoki & Velloso (1975) and Décourt & Quaresma (1978), were created by comparing the bearing capacities obtained from pile load tests against the results of other tests, which are easier to implement but more difficult to interpret, such as the Standard Penetration Test (SPT), the Cone Penetration Test (CPT), or the Pile Driving Analyser (PDA). Besides, most of these methods might have limited information regarding imprecisions in the mobilization of the load by the pile, the diameter, and the regionalization of the data used (Schnaid & Odebrecht, 2012). These imprecisions in the prediction [...]
Nevertheless, Tarawneh & Imam (2014), for example, compared a Multiple Linear Regression (MLR) to an Artificial Neural Network (ANN) model for the prediction of bearing capacity over time. They used a database of 169 piles of different section types and materials. The ANN was more precise, with an R² of 0.94 against 0.841 for the MLR model. Amâncio et al. (2022) compared a multilayer perceptron-based ANN model with the Aoki & Velloso (1975) and Décourt & Quaresma (1978) methods and demonstrated improved accuracy in predicting tip and shaft resistance for 95 instrumented piles. Similarly, Gomes et al. (2021) employed machine learning models to estimate the bearing capacity of 165 precast concrete piles based on SPT results, surpassing the performance of the Décourt & Quaresma (1978) method. The random forest technique exhibited the best performance, with RMSE values below 710, compared to an RMSE of 900 for Décourt & Quaresma (1978).
According to Yong et al. (2020), Machine Learning methods fall into two main divisions. The first is Neurobased Predictive Machine Learning (NPML), to which ANN belongs, and the second is Evolutionary Predictive Machine Learning (EPML), which contains Genetic Programming (GP), a powerful algorithm that provides a mathematical model in the form of a regression. However, EPML models are generally more precise because regressions use functions pre-determined for modeling. Some examples of GP are linear-GP (LGP), Gene Expression Programming (GEP), and Simulated Annealing-GP (SA-GP) (Yong et al., 2020).
In this work, the main aim of the systematic review is to determine the main methods used and the gaps that can be filled by future research. The use of a research protocol identifies the main papers that have been published about the bearing capacity of piles and compiles important information about the methods used and the data that have been applied.
The methodology presents the criteria for the search, exclusion, and inclusion of papers returned by the search string in the citation databases. The bibliometric results give information about the papers published, such as authors, publications over the years, and main journals of publication. Finally, the protocol made it possible to establish some aspects of the research that will guide future works.

Methodology
In this work, a literature mapping was performed based on a search of two important abstract and citation databases: Web of Science (WOS), from Clarivate Analytics, and Scopus (SCP), from Elsevier. For both databases, the string used was (Regression OR neural network) AND (bearing OR load) AND capacity AND piles. The systematic review was then conducted in three phases: planning the research guidelines based on a protocol; searching and selecting the works of interest according to inclusion and exclusion criteria; and extracting information from the papers to understand the subject under investigation.
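As a rough illustration of how the search string acts as a filter, the sketch below applies the same Boolean logic to the text of a record. The function name `matches_search_string` and the sample records are hypothetical, and the real database engines apply their own field, stemming, and phrase rules, which are not reproduced here.

```python
def matches_search_string(text: str) -> bool:
    """Apply (regression OR neural network) AND (bearing OR load)
    AND capacity AND piles as a plain, case-insensitive substring test."""
    t = text.lower()
    method = "regression" in t or "neural network" in t
    action = "bearing" in t or "load" in t
    return method and action and "capacity" in t and "piles" in t


# Hypothetical records, for illustration only.
records = [
    "Neural network prediction of the bearing capacity of driven piles",
    "Regression analysis of settlement of shallow footings",
]
selected = [r for r in records if matches_search_string(r)]
print(selected)
```

Only the first record satisfies every clause; the second mentions regression but none of the remaining terms.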
The Population, Intervention, Comparison, Outcomes, and Context (PICOC) methodology was used to conduct the selection process. The description and application of each of the terms are provided in Table 1.
The collection of the papers is shown in Figure 1 and is divided into identification, selection, and eligibility. The WOS database returned 210 published works and the SCP database returned 156, for a total of 366 publications, including theses, conference papers, book chapters, and journal articles. As the aim of this work was to analyze only papers published in journals in English, the selection was reduced to the 241 works that met these criteria.
Continuing the down-selection, all the duplicated documents were excluded (a total of 17 articles) and, following the flowchart, a selection criterion was applied to the titles, abstracts, and keywords. For this step, the criteria established by the PICOC protocol were applied. Hence, all publications whose subject was related to horizontal load, dynamic load, shallow foundations, or any other geotechnical field were excluded. This resulted in a final selection of 162 papers to pass to the eligibility phase.
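The counts reported for each step can be tabulated as a simple funnel, as sketched below. All figures are taken from the text except `after_dedup`, which is derived here by subtraction and is not stated in the paper; the final count of 80 is the one reported for the eligibility phase.

```python
# Counts reported in the text for each step of the selection.
returned = {"Web of Science": 210, "Scopus": 156}  # records per database
identified = sum(returned.values())                # total publications
journal_english = 241                              # journal papers in English
duplicates = 17                                    # duplicated documents removed
after_dedup = journal_english - duplicates         # derived, not stated in the text
after_screening = 162                              # title/abstract/keyword screening
eligible = 80                                      # after full reading

funnel = [
    ("identified", identified),
    ("journal papers in English", journal_english),
    ("after removing duplicates", after_dedup),
    ("after screening", after_screening),
    ("eligible after full reading", eligible),
]
for stage, count in funnel:
    print(f"{stage:<30}{count:>4}  ({count / identified:.1%} of identified)")
```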
During the eligibility phase, the papers were read in full, which further reduced the number of eligible papers to 80. In the process, information about the bibliometrics and about the methods and criteria of the papers was extracted. In the bibliometric research, the following information was collected from the publications:
• Main journals of publication;
• Number of publications per year;
• Main authors and their countries;
• Main keywords used in the publications.
In line with the PICOC methodology, questions regarding the methods used in each publication were also addressed; they are listed after the Conclusion.

Results
The search on the database platforms, WOS and SCP, was carried out on May 12th, 2021. A total of 366 publications were collected from both platforms, of which only 241 were published in journals. After sorting (reading of titles, abstracts, and keywords), removing duplicates, and reading the full texts, 80 papers were eligible to be analyzed. The results are presented as bibliometric results and protocol results.

Bibliometric
All 80 papers were published in English in the journals listed in Table 2. Figure 2 shows the journals grouped by their latest Journal Citation Reports (JCR) ranking, from 2021. Among the journals, the best JCR factor was 7.963, from Engineering with Computers, which had a total of 11 publications (13.75%), and 51.25% scored over 3.0. A total of 21 publications did not have a JCR factor, which represents 26.25% of the publications listed in this paper.
Figure 3 shows the distribution of these publications over time. In the search, there was no exclusion criterion related to the year of publication. The first eligible publication is from 1995, and the year that registered the largest number of eligible publications up to the date of the search (May 12th, 2021) was 2020, with 19 publications. The year 2021 was omitted from the figure because the data for this year were incomplete at the time of the search. Publications were rare between 1998 and 2009, with only 4 publications, and 90% of publications appeared after 2009.
Semi-empirical methods for predicting bearing capacity have been used for some decades. Even though such prediction methods are well known and have been used for a long time, methods based on machine learning are still a novelty in geotechnical engineering, as also highlighted in the review by Moayedi et al. [...] Jaksa (2017) have compared their methods to other methods, such as Poulos & Davis (1980).
Publications were also sorted by the authors' location at the time of publication, and the data are presented in Figure 4. Author country rankings are shown using only the first author as well as using all authors. In both cases, the top country is Iran. The rest of the top seven in both cases comprises Vietnam, India, China, Malaysia, the United Kingdom, and Australia.
Figure 5 shows the 15 main publishing authors, counting publications as first and second author. The five authors with the most published papers were Armaghani, D.J., Moayedi, H., Rashid, A.S.A., Harandizadeh, H., and Jebur, A.A.
The keywords used by the authors in the papers varied widely, with up to 244 different expressions. The fifteen most recurrent are shown in Table 3. In Table 4, similar keywords were grouped: those that appear both as acronyms and in expanded form, and those that share a main word, such as the types of "pile". Despite the use of many different methods and algorithms, the words ANN, "Artificial Neural Network" and "neuro networks" are still preferred as keywords.

Protocol of search
The protocol search sought to answer important questions on the types of piles studied, the size of the databases used, the methods used, and how they are validated with statistical parameters. The database size is of major importance because it can guide future research on how to collect and analyze such a database. A large database can improve predictions but is very laborious to generate. On the other hand, a small database may be easier to establish but can lead to poor variance and large errors. According to Jebur et al. (2018a), the ideal database size for an ANN depends on the number of individual input parameters, which mostly describe the pile geometry, the soil resistance, and the soil type, and can be described by:

$$N \geq 50 + 8I$$

where N is the database size and I is the number of individual input parameters.
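A minimal sketch of this sizing rule follows, assuming the relation reads N ≥ 50 + 8I, with N the database size and I the number of input parameters. The function name `min_database_size` is hypothetical and only serves the illustration.

```python
def min_database_size(n_inputs: int) -> int:
    """Smallest database size N satisfying N >= 50 + 8 * I (assumed form)."""
    if n_inputs < 1:
        raise ValueError("at least one input parameter is required")
    return 50 + 8 * n_inputs


# Thresholds for a few hypothetical numbers of input parameters.
for n_inputs in (3, 5, 8):
    print(n_inputs, min_database_size(n_inputs))
```

Under this assumption, a model with five input parameters would call for at least 90 records.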
In statistical learning, the relationship in the data is expressed as a function f that represents the systematic relation between one or more inputs, also known as independent variables, and the output, or dependent, variables (James et al., 2013). Statistical learning methods, which include regressions and Machine Learning methods, use this approach to estimate f.
The observations called training data form a partition used exclusively to train, or teach, the method that finds and calibrates f. The other set is the testing data, which is used to confirm f.
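The partition into training and testing data can be sketched as below, using only the Python standard library. The function name `train_test_split`, the 75% training fraction, and the 80 placeholder records are illustrative assumptions, not a procedure prescribed by the reviewed papers.

```python
import random


def train_test_split(records, train_fraction=0.75, seed=0):
    """Shuffle the records and split them into training and testing partitions."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


piles = list(range(80))  # hypothetical pile identifiers
train, test = train_test_split(piles)
print(len(train), len(test))  # 60 20
```

The testing partition is never used to calibrate f, so any error measured on it reflects records the method has not seen.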
Figure 6 shows boxplots of the database sizes and of how the papers divide them into training and test partitions. The average database size used by the authors (Figure 6a) is 304, while the median is 80. Some works used databases larger than 200 records, and four did not report the size at all. Further, four studies used dataset sizes well outside the norm, of 1300, 2314, 4072, and 6437 (Baziar et al., 2015; Alzo'ubi & Ibrahim, 2019; Pham et al., 2020a; Zhang et al., 2021), and were omitted from the diagram for better visualization of the boxplot.
Figure 6b shows the split of the databases between training and test partitions. According to the reviewed papers, the average percentage used as training data is 74% and the median is 75%, while the testing set has an average of 25% and a median of 20%. Shahin (2010) highlights that, just like empirical models, ANNs perform better in interpolation than in extrapolation, so the training data should include the extremes of the database. The author also says that once the input and output data are selected, all variables should be normalized to vary between 0 and 1. This elimination of scales and dimensions allows the algorithm to pay equal attention to all variables during training.
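The normalization to the range 0 to 1 can be sketched with min-max scaling, as below. This is one common way to achieve it and is an assumption here, since the text does not name a specific formula; the function names and the pile lengths are hypothetical.

```python
def fit_min_max(train_column):
    """Return the minimum and maximum of a training column."""
    lo, hi = min(train_column), max(train_column)
    if hi == lo:
        raise ValueError("a constant column cannot be min-max scaled")
    return lo, hi


def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


pile_lengths_m = [6.0, 12.0, 18.0, 30.0]  # hypothetical training values
lo, hi = fit_min_max(pile_lengths_m)
print(min_max_scale(pile_lengths_m, lo, hi))  # [0.0, 0.25, 0.5, 1.0]
print(min_max_scale([36.0], lo, hi))          # [1.25]
```

The second call shows why the training data should contain the extremes: a value beyond the training range maps outside 0 to 1, which means the model is being asked to extrapolate.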
The most used types of piles are presented in Table 5, which shows that 57 out of 80 papers (71.25%) used driven piles of different sections (pipe with open and closed ends, octagonal, square) and different materials (concrete, steel, and timber). The second most cited type is the bored pile (16.25%), and helical piles, which are commonly used around the world, appear in only 3.75% of the papers. In Table 5, papers that used chamber load tests were omitted, since these do not represent a type of pile.
The geotechnical tests from which the data were extracted are shown in Table 7; the most used is the CPT, representing 17% of the papers. Unfortunately, 21% of the papers do not specify exactly which tests were used. Of the specified tests, the second most used is the SPT (15%), followed by the laboratory chamber load test (12%) and the PDA (11%), which is commonly used in driven piles, as it can be obtained during the installation of the pile. In only 7 papers was the pile capacity measured by separating the contributions of the shaft and the tip of the piles (Haque & Abu-Farsakh, 2019; Kiefa, 1998; Lu et al., 2020; Samui, 2012; Teh et al., 1997; Yamin et al., 2018; Zhang et al., 2006).
The methods used in the works are shown in Table 6. Regressions and MARS are mentioned in 13 different works. All the other names represent Machine Learning algorithms, which makes clear that most recent research is based on these methods. Most methods are described by the authors as ANNs (26 times), whilst Back Propagation, Adaptive Neuro-Fuzzy Inference Systems, Gaussian Process, and Levenberg-Marquardt are mentioned a combined total of 47 times. Many studies use optimization algorithms, with some authors referring to their approach as a hybrid method, since optimization is an auxiliary tool to reach the global minimum.
Finally, the statistical parameters were analyzed to evaluate the efficiency of the methods used and to allow comparisons when needed. There were significant differences in the statistical parameters used by the authors; the main ones are listed in Table 8, and the most used are shown in Figure 7. The most used parameter, appearing in 61% of the papers, is the Root-Mean-Square Deviation, followed by the coefficient of determination, R², with 54%, and the correlation coefficient, R, with 32%. In many papers, more than five parameters are used; in these cases, a ranking of the performance under each parameter assists in the evaluation of the methods compared.
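The three most used parameters can be computed as sketched below. The Root-Mean-Square Deviation is written as `rmse`, since the two names denote the same quantity; R² is taken here as 1 - SS_res/SS_tot, which is an assumption, as individual papers may define it as the square of R instead. The measured and predicted capacities are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square deviation between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination, taken as 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1 - ss_res / ss_tot


def pearson_r(measured, predicted):
    """Pearson correlation coefficient R."""
    mean_m = sum(measured) / len(measured)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


measured = [100.0, 200.0, 300.0, 400.0]   # hypothetical load test capacities
predicted = [110.0, 190.0, 310.0, 390.0]  # hypothetical model predictions
print(rmse(measured, predicted))
print(r_squared(measured, predicted))
print(pearson_r(measured, predicted))
```

RMSE carries the units of the capacity itself, whereas R² and R are dimensionless, which is one reason papers tend to report several parameters together.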

Conclusion
This systematic literature review and mapping has shown that Machine Learning has become predominant in the prediction of pile bearing capacity over the last 25 years and has surpassed the more traditional regression-based methods both in number and in performance.
The protocol helped identify the types of piles studied, the geotechnical tests used, the size of the databases collected by the authors and their split between training and testing, and the main statistical tools and parameters.
The mapping of literature enabled a better understanding of the main publications over the years, the most relevant authors, and journals, as well as the main keywords used by the authors.
ANNs have proven to be a very efficient tool when compared to the consolidated classic empirical methods. ANNs have performed better and, in most cases, their results are much closer to the bearing capacities measured by pile load tests. The main algorithms used were Backpropagation, ANFIS, Gaussian Process, and Levenberg-Marquardt. The most recent papers also included meta-heuristic algorithms, in a hybrid approach.
Regarding the database, the average size used by the authors was 304 piles and the median was 80, while the average shares of training and testing data were 74% and 25%, respectively.
This work also showed that the main type of pile investigated is the driven pile, corresponding to almost 63% of the papers, with CPT and PDA as the main tests. This might be explained by the availability of data, since such methods are expected to perform better with a large database. Helical piles, on the other hand, are among the most used piles in the world and, according to this research, were represented in only 4% of the papers, which shows an opportunity for new research. Besides, only seven of the papers mentioned that the pile capacity was measured separately for the shaft and the point resistance.
The questions regarding the methods used in each publication, addressed in line with the PICOC methodology, were:
• Main methods used by the authors to predict the bearing capacity, among linear regression methods and neural network methods;
• Most used statistical methods;
• Geotechnical tests used to generate the methods and types of piles used;
• Size of the database and its split between training and testing;
• Use of instrumented pile load tests in the methods.

Figure 1. Flowchart of selection of papers for reading.

Figure 3. Distribution of papers per year.

Figure 4. Distribution of authors and first authors per country.

Figure 5. Main authors publishing as first authors, second authors, and total publications.

Figure 6. Boxplot: (a) database size used by the authors; (b) training and test share of the database.

Figure 7. The 10 main statistical parameters used in the papers and their usage percentage.

Table 1. Description of the PICOC components of this systematic review.
O (Outcomes): It was expected to obtain results regarding the main tests and types of piles that have been used, the database size, and the statistical parameters used to determine the accuracy of the methods.
C (Context): To better understand the main application of the statistical methods to predict the bearing capacity of piles, and to outline decisions for future works.

Table 2. Journals and papers published.

Table 3. Main keywords as they appear in the papers.

Table 4. Main keywords grouped by recurrence.

Table 5. Types of piles analyzed in the papers.

Table 6. Methods used as they were named in the papers.

Table 7. Geotechnical tests used in the methods by the authors.

Table 8. Main statistical parameters used in the works.