The assessment of Levenberg–Marquardt and Bayesian Framework training algorithm for prediction of concrete shrinkage by the artificial neural network

Abstract Shrinkage and creep are the main time-dependent volume changes of concrete. These undesirable deformations induce stress and cracking, which eventually reduce the service life of concrete structures. Predicting shrinkage and creep strain in concrete structures with acceptable accuracy is therefore essential. Extensive investigations by several researchers have produced different relationships and models for forecasting shrinkage and creep strain based on experimental and analytical observations. Despite these efforts, existing models are not sufficiently accurate in predicting shrinkage strain. In this research, a shrinkage prediction model based on the artificial neural network technique is therefore developed using the RILEM database, and its accuracy is compared with that of the existing standard models by statistical analysis. According to the obtained results, the neural network technique can predict the shrinkage strain with acceptable accuracy, especially over long periods.


PUBLIC INTEREST STATEMENT
The cement industry releases a massive quantity of carbon dioxide into the atmosphere in producing the cement consumed in concrete structures. At the same time, the performance and lifetime of concrete structures deteriorate because destructive agents penetrate the concrete under aggressive conditions. Cracks, especially those induced by concrete volumetric changes, play an essential role in the penetration of contaminating agents and thus in reducing the performance and lifetime of concrete structures. Therefore, evaluating concrete shrinkage strain is vitally important for increasing the lifetime of reinforced concrete structures. However, despite extensive efforts by different researchers in this area, it has not yet been possible to estimate the volumetric strain caused by shrinkage accurately. This research therefore attempts to provide a shrinkage prediction model based on artificial neural network techniques as a powerful numerical method.

Introduction
Shrinkage is the change in concrete volume caused by internal moisture loss (drying shrinkage), drying of fresh concrete (plastic shrinkage), internal water absorption by anhydrous cement particles (autogenous shrinkage) and carbon dioxide penetration (carbonation shrinkage), and it appears in free or unrestrained members without any increase in stress. Among the different types of concrete shrinkage, drying shrinkage is particularly important because of its extensive effects. The shrinkage strain increases over time as desiccation progresses. Therefore, a good estimate of this phenomenon during structural design can help to increase the durability and service life of concrete structures (e.g., bridges) and to reduce its deleterious effects.
Many researchers have proposed methods and equations, based on experimental and analytical investigations, for predicting concrete volume changes. Some of these methods have even been adopted in structural design codes. However, these methods do not provide an accurate estimate of the volume changes, and the predicted values usually differ significantly from the observed results. Another main limitation is that they cannot be applied to modern concrete (e.g., high-performance concrete, self-compacting concrete) because they give unrealistic results.
Several models are used for concrete shrinkage and creep prediction (e.g., ACI209, CEB90, B3, GL2000 and the Sakata model) (Akthem & Jian-Ping, 2005). Prediction models are not limited to these, and several others can be found (Yoo, Kwon, & Jung, 2012). However, many of the new models are combinations or modifications of standard models such as the ACI209 (ACI Committee 209, 2008), CEB90 and B3 code models (Akthem & Jian-Ping, 2005). These models use several different parameters to predict concrete shrinkage and creep, and their accuracy has improved over time through modification and updating of the applied parameters and coefficients. Table 1 lists the parameters used in some well-known models together with their limitations.
According to Table 1, if concrete properties such as w/c ratio and compressive strength, or environmental conditions such as relative humidity and curing conditions, fall outside the defined ranges, these models give incorrect results. The other critical problem is the effect of additives. Mineral and industrial admixtures such as silica fume and fly ash, which are used to improve concrete strength and durability, have a direct effect on the drying process and consequently on dimensional changes. Therefore, the use of additives affects the accuracy of these models. In addition, the mix proportions of modern concrete generally lie outside the defined domain of the prediction models. Further research is therefore required to obtain more precise models for ordinary Portland cement concrete and modern concrete. In other words, the best method is one that can be applied to a wide range of concrete properties and can predict shrinkage and creep strain with acceptable accuracy.
Recent research has aimed to present new methods with fewer limitations. Mazlom's (2008) studies are among these; they provide several equations for predicting the time-dependent volume changes of high-performance concrete containing silica fume, including a hyperbolic equation for the prediction of concrete shrinkage (Equation 1). Abbasnia, Shekarchi, and Ahmadi (2013) presented an equation for shrinkage prediction as a function of humidity changes, derived from the solution of the differential equations governing moisture flow within the concrete (Equation 2). In their research, shrinkage strain is estimated in two main steps. First, internal moisture changes are calculated from the modified Fick's second law and measured by modified SDB sensors; then, the relation between concrete volume changes and the variation of internal humidity is established.
As mentioned, existing prediction models not only lack acceptable accuracy but are also subject to considerable restrictions. Moreover, as different types of additives are developed, the applicability of existing methods becomes more restricted. The neural network technique can overcome these restrictions and give a more accurate estimate of the time-dependent volume changes of concrete, provided that adequate data are available (and the amount of such data is growing) and that appropriate algorithms are selected.
The artificial neural network technique, a numerical method that can be used for estimation or clustering of data, has recently been used for the prediction of concrete properties (Bin, Qian, & Jingquan, 2015).
The artificial neural network technique has been applied successfully in recent years to predict concrete mechanical strength (Razavi, Jumaat, Ahmed, Shafie, & Pegah, 2011; Seung, 2003), compressive strength (Marta, Iñaki, & Olmo, 2003; Manish & Kewalramani, 2006), concrete creep (Lyes, Franc, & Buyle, 2014; Reda Taha, Noureldin, Sheimy, & Shrive, 2003), concrete shrinkage (Gedam, Bhandari, & Akhil, 2014), the effect of some additives on concrete strength (Bakhta, Mohamed, Said, & Arezki, 2011), the properties of self-compacting concrete containing fly ash (Omar, Bakhta, Mohamed, & Arezki, 2016) and the durability of concrete (Parichatprecha & Nimityongskul, 2009). In this regard, Jamshid, David, Pecknold, and Rami (1998) proposed a new method for training neural networks to learn the complex stress-strain behavior of materials, and Parichatprecha and Nimityongskul (2009) successfully used the ANN to predict the durability of high-performance concrete. As mentioned above, the ANN method has also been used to calculate concrete volume changes over time. Ball and Buyle (2013) presented a model for estimating concrete shrinkage based on the artificial neural network technique. They showed that the artificial neural network, as a numerical method, could be used successfully to predict concrete shrinkage and creep strain. In their studies, the RILEM database was used for training the networks, and the more precise algorithm was identified by modeling different networks based on different algorithms. Despite remarkable studies in this area, there is no general, comprehensive assessment of the application of this method. Also, fewer studies have used neural network techniques to predict the volumetric changes of concrete than to predict its mechanical properties.
Moreover, the studies done in this area generally do not consider the effect of factors such as additives and pozzolanic materials. This research therefore attempts to provide a comprehensive model based on neural network techniques to predict shrinkage strain in concrete with different cement types and a broader range of water-to-cement ratios. It also attempts to apply the proposed model to high-performance concrete containing silica fume and fly ash.

Research significance
The primary objective of this paper is to develop a method based on the artificial neural network technique for the prediction of concrete shrinkage strain. To this end, different network architectures and different training algorithms (Levenberg-Marquardt and Bayesian Framework) were used in MATLAB software in an attempt to obtain more precise values of shrinkage strain in concrete with different properties. An experimental program was also carried out to assess and validate the shrinkage prediction method.

Network configuration for prediction of concrete shrinkage
In the proposed method, an attempt was made to employ, as far as possible, all parameters that affect concrete shrinkage in the configuration and training of the networks. Also, a very wide range of parameter variation was selected, considering the information available in the database. As a result, the proposed model can be applied to various types of ordinary and modern Portland cement concrete.

Data preparation
Because of the importance of concrete shrinkage, and to reach a better understanding of this phenomenon, many researchers have carried out experiments over time on concrete with different properties. As a result, concrete shrinkage and creep databases have been created and developed gradually by gathering the experimental observations. The first comprehensive data bank, which includes approximately 300 shrinkage test results, was developed at Northwestern University in 1978. This database was expanded by the ACI 209 subcommittee in collaboration with CEB in 1980. H. Müller undertook further expansion in the RILEM sub-committee. More recently, an enlarged database named the NU-ITI database was created by collecting 490 shrinkage test results in the Institute of Technology and Infrastructure of Northwestern University (Bazant & Li, 2008).
In this research, the most recent RILEM database is used, which was developed by Bazant and Li (2008) by gathering experimental results. This database contains various information about the effects of environmental conditions and concrete properties, such as cement type, aggregate-to-cement ratio, aggregate type, additive type and …, on the induced shrinkage strain. Also, given the quantity of information in the database, silica fume content in particular, together with fly ash content, has been included among the parameters for the configuration of the artificial neural network. Table 2 summarizes the parameters used to form the network and their ranges of variation.
From this database, only the test results for which all of the above parameters (Table 2) are available were used. Also, because knowledge of the total shrinkage is important during design and a large number of autogenous and drying shrinkage data exist, the total shrinkage was selected from the database.
Because the data in the database are widely dispersed, all input data for the configuration of the network were scaled and normalized (so that the mean and standard deviation are equal to zero and one, respectively). Then, because of the large size of the input vector, the Principal Component Analysis (PCA) method was used to eliminate the data that have a negligible effect on the data set. For simulation purposes, new inputs must be normalized by the same methods. After this step, the input data are ready for configuring and training the network.
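The scaling step can be illustrated with a short sketch. It is written in Python rather than in the MATLAB environment used in the study, the function names and sample values are hypothetical (they are not taken from the RILEM database), and the PCA reduction is not shown. The sketch only demonstrates zero-mean, unit-standard-deviation scaling and the reuse of the training statistics for a new input.

```python
from statistics import mean, pstdev

def fit_standardizer(columns):
    """Return one (mean, standard deviation) pair per feature column."""
    return [(mean(col), pstdev(col)) for col in columns]

def standardize(columns, stats):
    """Scale each feature column with previously fitted statistics."""
    scaled = []
    for col, (mu, sigma) in zip(columns, stats):
        if sigma == 0:
            # A constant feature carries no information; map it to zero.
            scaled.append([0.0 for _ in col])
        else:
            scaled.append([(x - mu) / sigma for x in col])
    return scaled

# Hypothetical training features (illustrative values only).
wc_ratio = [0.35, 0.45, 0.55, 0.65]
humidity = [40.0, 50.0, 60.0, 70.0]

stats = fit_standardizer([wc_ratio, humidity])
scaled = standardize([wc_ratio, humidity], stats)

# A new input is scaled with the statistics of the training data,
# not with statistics computed from the new input itself.
new_scaled = standardize([[0.50], [55.0]], stats)
```

Keeping the fitted statistics separate from the scaling function is one way to make sure that new inputs are normalized in exactly the same way as the training data.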

The architecture of the artificial neural network
Different training algorithms were used to select the model with the most accurate results. As mentioned above, the primary objective of this study is to predict the shrinkage strain from input data that include concrete properties and environmental conditions. For this purpose, the standard back-propagation network was used for the configuration of the artificial neural network.
After the network type is determined, the appropriate training algorithm must be selected. For this purpose, networks with different architectures are assessed, and the network with the best performance is then adopted. A training algorithm is a complex mathematical process that reduces the network error. The input data are divided into three categories: 70% of the data are used for training the networks, 15% for testing and 15% for verification.
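The 70/15/15 division can be sketched as a random split of sample indices. This is an illustrative Python sketch, not the routine used in the study; the function name, the sample count of 200 and the fixed random seed are assumptions made only for the example.

```python
import random

def split_indices(n_samples, train_frac=0.70, test_frac=0.15, seed=0):
    """Shuffle sample indices and split them into three disjoint groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_frac * n_samples)
    n_test = round(test_frac * n_samples)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    # Whatever remains (about 15%) is kept for verification.
    verification = indices[n_train + n_test:]
    return train, test, verification

# Hypothetical data set of 200 samples.
train_idx, test_idx, verif_idx = split_indices(200)
```

Assigning the remainder to the last group guarantees that every sample is used exactly once, even when the fractions do not divide the sample count evenly.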
For the configuration of the networks, a primary network with two hidden layers consisting of eight neurons was formed first. The transfer functions of the two hidden layers were the Tan-Sigmoid transfer function (tansig), and the transfer function of the final (output) layer was purelin. Architecture details of this network are shown in Figure 1.
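A forward pass through such a network can be sketched as follows. The sketch is in Python, not MATLAB, and rests on several assumptions that are not stated in the text: four inputs, eight neurons in each hidden layer, and random weights standing in for trained values. The functions tansig and purelin are re-implemented here from their usual definitions (tansig is numerically equal to the hyperbolic tangent, purelin is the identity).

```python
import math
import random

def tansig(n):
    """Tan-sigmoid transfer function; numerically equal to tanh(n)."""
    return math.tanh(n)

def purelin(n):
    """Linear transfer function: returns its input unchanged."""
    return n

def init_layer(n_in, n_out, rng):
    """Create random weights and biases (placeholders for trained values)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

def layer_output(inputs, weights, biases, transfer):
    """Apply one fully connected layer followed by its transfer function."""
    return [transfer(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Propagate inputs through tansig, tansig and purelin layers in turn."""
    transfers = [tansig, tansig, purelin]
    signal = inputs
    for (weights, biases), transfer in zip(layers, transfers):
        signal = layer_output(signal, weights, biases, transfer)
    return signal

rng = random.Random(0)
N_INPUTS, N_HIDDEN = 4, 8  # assumed sizes, for illustration only
layers = [
    init_layer(N_INPUTS, N_HIDDEN, rng),   # first hidden layer (tansig)
    init_layer(N_HIDDEN, N_HIDDEN, rng),   # second hidden layer (tansig)
    init_layer(N_HIDDEN, 1, rng),          # output layer (purelin)
]

# One hypothetical, already normalized input vector.
prediction = forward([0.1, -0.3, 0.7, 0.0], layers)
```

The linear output layer leaves the predicted value unbounded, which suits a regression target such as shrinkage strain, while the bounded tansig layers provide the nonlinearity.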
After configuration of this primary network, several different training algorithms were studied, and the best one was selected. Then, the effect of the number of hidden layers was investigated. First, networks with one hidden layer and different numbers of neurons were tried, but the results were less accurate than those of the two-layer network. Several three-layer networks with different numbers of neurons were also built; however, their performance was lower than that of the two-layer network, and the results were not accurate enough. For this reason, only the two-layer networks were evaluated further, and their results are presented. The specifications of the training algorithms initially used for network training are presented in Table 3.
The last column of Table 3 gives the network performance, that is, the mean square of the network error. The lower the value of network performance, the higher the accuracy of the network. The regression curves of each training algorithm are drawn in Figure 2. The values on the axes indicate the shrinkage strain.
According to Figure 2 and Table 3, the Levenberg-Marquardt algorithm and the Bayesian Framework training algorithm have the best performance among the studied algorithms. Therefore, these algorithms were selected and used for the training and configuration of the networks.
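For orientation, the two selected algorithms are commonly written in the following textbook forms. The notation is introduced here as an assumption and does not appear in the source: w is the vector of network weights, e the vector of network errors, J the Jacobian of e with respect to w, μ the damping parameter, t_i and y_i the target and predicted values, and α and β the regularization parameters.

```latex
% Levenberg-Marquardt weight update (damped Gauss-Newton step)
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}\right)^{-1}\mathbf{J}^{\top}\mathbf{e}

% Bayesian regularization objective: squared errors plus squared weights
F(\mathbf{w}) = \beta E_D + \alpha E_W, \qquad
E_D = \sum_{i}\left(t_i - y_i\right)^{2}, \qquad
E_W = \sum_{j} w_j^{2}
```

A large μ makes the Levenberg-Marquardt step behave like gradient descent with a small step size, and a small μ brings it close to the Gauss-Newton step. In the Bayesian form, the weight penalty E_W discourages large weights, which is the mechanism usually credited for better generalization.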
As shown in Figure 2, these training algorithms have acceptable accuracy for the prediction of long-term shrinkage. It can also be seen that the network trained with the Levenberg-Marquardt method performs better than the network built with the Bayesian Framework method. Nevertheless, the Bayesian Framework method was also used for network generation in order to improve the generality of the networks.
After the selection of the training algorithms, the effect of the number of neurons in the hidden layers was investigated. In this regard, 8, 10 and 12 neurons were considered in the hidden layers. The characteristics of the networks trained with these algorithms (Levenberg-Marquardt and Bayesian Framework) are presented in Table 4. The transfer functions of the first and second hidden layers are tansig, and that of the output layer is purelin.
Among the studied networks, the TS10101 and RTS10101 networks have the lowest mean error, although their mean square error is not the minimum; it is, however, acceptable. Conversely, although the mean square error of the RTS1281 and TS1281 networks is the lowest, these networks do not have the lowest mean error. It must be noted that a lower average network error means an answer closer to reality. For this reason, TS10101 and RTS10101, trained with the Levenberg-Marquardt and Bayesian Framework algorithms, respectively, were chosen as the final networks.
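The two criteria can rank networks differently, as the following sketch illustrates. It assumes that "mean error" denotes the average signed error, which the text does not define explicitly, and the error values are invented for the illustration (in microstrain); they are not results from Table 4.

```python
def mean_error(errors):
    """Average signed error (assumed meaning of 'mean error')."""
    return sum(errors) / len(errors)

def mean_square_error(errors):
    """Average of the squared errors."""
    return sum(e * e for e in errors) / len(errors)

# Hypothetical prediction errors of two networks, in microstrain.
errors_a = [-100.0, 100.0, -100.0, 100.0]  # scattered but unbiased
errors_b = [50.0, 50.0, 50.0, 50.0]        # tight but biased upward

me_a, mse_a = mean_error(errors_a), mean_square_error(errors_a)
me_b, mse_b = mean_error(errors_b), mean_square_error(errors_b)

# Network A has the lower mean error, network B the lower mean square error.
print(me_a, mse_a)
print(me_b, mse_b)
```

Under this assumption, a network selected for its low mean error is the one with the least systematic over- or underestimation, even if individual predictions scatter more widely.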
As mentioned above, the input data are divided into three categories (70% for network training, 15% for testing and 15% for verification). The accuracy obtained on each of these categories with the TS10101 and RTS10101 networks is shown in Figures 3 and 4. According to these figures, the accuracy of the predicted values is acceptable for the training, testing and verification data. It must be noted that the function "trainbr", which performs Bayesian regularization back-propagation, disables validation stops by default. The reason is that validation is usually used as a form of regularization, but "trainbr" has its own form of validation built into the algorithm.

Assessment and validation of the proposed method
In this research, a laboratory program was carried out to evaluate the performance of the proposed method. In the laboratory investigation, the cement was Portland cement type II. The coarse and fine aggregates were crushed; the maximum size of the coarse aggregate was 12.5 mm, and the water absorption of the coarse and fine aggregates was 1.78% and 2.8%, respectively. A polycarboxylate-based superplasticizer was used to maintain suitable workability.
Zeolite was also used as a mineral additive to assess the capability of the method to predict shrinkage strain in specimens containing mineral admixtures (Table 5), since the use of mineral additives is a significant restriction for existing shrinkage models.
The shrinkage test, which measures the change in length of the unloaded concrete specimen, was performed on the specimens according to ASTM C157.

Comparison between obtained results and predicted values
In this section, the shrinkage strain is calculated with the ACI209, B3 and GL2000 models and with the proposed method from the data in the database. Then, the differences between the calculated values and the experimental results (the calculated errors) are obtained.
The distribution of the calculated error for the standard shrinkage models and the network-based models is presented in Table 6. The predicted values are reported for the short and long term. The values in parentheses show the percentage of the errors.
According to Table 6, the neural network-based method gives a balanced estimate of shrinkage strain in both the short and the long term, whereas the other studied models generally overestimate, which could lead to a non-optimal design. Also, the amount of data that can be used with the neural network-based methods is larger than with the others. Accordingly, the network-based methods are applicable to a broader range of concrete parameters, such as low w/c ratios, high-strength concrete and so on, whereas the standard models are subject to many restrictions, so their application is limited or often impossible. The distribution of shrinkage strain error over time for each model is illustrated in Figure 5.
As seen in Figure 5, the error of the network-based model lies within ±250 microstrain. In comparison, the error of the other models (e.g., GL2000 and ACI209) reaches ±500 microstrain. Also, according to this figure, the error of the network-based model decreases over time. In other words, the error of this method is large at the early age of concrete, but its accuracy increases with time.
According to Figure 5, the Levenberg-Marquardt method is more accurate than the Bayesian Framework method. However, this does not necessarily mean that TS10101 is better than the RTS10101 model, because the model based on the Levenberg-Marquardt training algorithm may overfit the training data.
The regression curves of the neural network method and the prediction models are illustrated in Figure 6. As shown in this figure, the neural network method predicts more accurate values than the others. The next most accurate model is the B3 model, which in general has a higher accuracy than the other standard models.
In Table 7, the shrinkage error is divided into two categories, and the accuracy of each model is investigated in a short (0 to 1000 days) and a long (1000 to 9000 days) time interval.
As shown in Table 7, with the TS10101 method about 90% of the errors are within ±100 microstrain, and only 10% exceed 100 microstrain. The corresponding value for RTS10101 is about 85%, while for the other studied models, at best 70% of the errors are within ±100 microstrain.
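The kind of summary reported in Table 7 can be reproduced from a list of prediction errors with a few lines of code. The following Python sketch is illustrative only: the function names are hypothetical, the (age, error) records are invented, and the assignment of the 1000-day boundary to the short interval is an assumption, since the text does not say to which interval that day belongs.

```python
def share_within(errors, tolerance=100.0):
    """Fraction of errors whose magnitude does not exceed the tolerance."""
    if not errors:
        raise ValueError("at least one error value is required")
    return sum(1 for e in errors if abs(e) <= tolerance) / len(errors)

def split_by_age(records, boundary_days=1000.0):
    """Separate (age in days, error) records into short and long intervals."""
    short_term = [err for age, err in records if age <= boundary_days]
    long_term = [err for age, err in records if age > boundary_days]
    return short_term, long_term

# Hypothetical (age in days, error in microstrain) records.
records = [
    (28, 150.0), (90, -80.0), (365, 40.0), (900, -120.0),
    (1500, 60.0), (3000, -30.0), (6000, 20.0), (9000, 110.0),
]

short_term, long_term = split_by_age(records)
short_share = share_within(short_term)  # 2 of 4 errors within ±100
long_share = share_within(long_term)    # 3 of 4 errors within ±100
```

Reporting the share separately for each interval makes it visible whether a model's accuracy improves or degrades with concrete age.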
In this section, the shrinkage strains measured in the experiment and calculated by the neural network technique are presented. The measured and calculated shrinkage strains of each mixture in Table 5 are presented in Figures 7, 8, 9 and 10.
As shown in these figures, the artificial neural network technique with the Levenberg-Marquardt and Bayesian Framework training algorithms could estimate the shrinkage strain with acceptable accuracy in the specimens containing mineral admixture.

Conclusion
In this research, an attempt was made to achieve a more accurate estimate of shrinkage strain by applying the neural network technique, considering the importance of shrinkage strain forecasting for the durability, reliability, and maintenance of concrete structures.
The obtained results are summarized as follows:
• Shrinkage prediction models such as ACI209, B3, and GL2000 have considerable limitations on the input data, whereas the network-based model can use a wide range of concrete properties. This is one of its most important advantages.
• The network-based methods give a balanced estimate of the shrinkage values, whereas ACI209 and B3 overestimate nearly 58%, and GL2000 about 75%, of the observed values.
• For ACI209, B3, and GL2000, the calculated shrinkage error is smaller in the short term than in the long term. In contrast, the error of the network-based models decreases over time.
Figure 5. The distribution of shrinkage error for different shrinkage predicting models (panel (e): GL2000).
• The neural network-based method with the Levenberg-Marquardt algorithm gives more accurate values of the shrinkage strain than the Bayesian Framework algorithm.
• The proposed method has been used successfully to estimate the shrinkage of mix designs containing a mineral additive.