Estimating the Properties of Ground-Waste-Brick Mortars Using DNN and ANN

In this study, deep-neural-network (DNN)- and artificial-neural-network (ANN)-based models, along with regression models, have been developed to estimate the pressure, bending and elongation values of ground-brick (GB)-added mortar samples. The study aims to utilise GB as a mineral additive in concrete in the ratios 0.0%, 2.5%, 5.0%, 7.5%, 10.0%, 12.5% and 15.0%. A total of 756 mortar samples were produced for 84 different series and were cured in tap water (W), 5% sodium sulphate solution (SS5) and 5% ammonium nitrate solution (AN5) for 7, 28, 90 and 180 days. The developed DNN models have three inputs, two hidden layers with 20 neurons and one output, whereas the ANN models have three inputs, one hidden layer with 15 neurons and one output. Twenty-five previously obtained experimental datasets were used to train the models and to generate the regression equations. The remaining 59 datasets, which were not used in training, were used to test the models. When these test data were given to the trained DNN and ANN models, the estimated brick-dust pressure, bending and elongation values were observed to be very close to the experimental values. Although only a small fraction (30%) of the experimental data was used for training, both models performed the estimation at a level that was in accordance with the opinions of experts. The fact that this success was achieved with very little training data shows that the models have been appropriately designed. In addition, the DNN models performed better than the ANN models. The regression models performed worst, with unacceptably high prediction errors. In conclusion, ANN- and DNN-based models are practical and effective for estimating these values.

In the previous few years, deep learning (DL) research has been progressing rapidly and has begun to replace ANN [Janocha and Czarnecki (2016)]. DL is a set of methods comprising ANNs based on a deep architecture in which the number of hidden layers is increased. In this architecture, a feature related to the problem is learned in each layer, and this learned feature forms an input to the upper layer. Thus, from the lowest to the highest layer, a structure can be established in which the simplest to the most complex features are learned [Işık and Artuner (2016)]. DL typically uses ANNs [Deng and Yu (2014)]. There were several advances in supervised DL in the 1990s and 2000s [Schmidhuber (2015)]. Since 2006, deep structured learning, more commonly called deep or hierarchical learning, has emerged as a new ML research field [Deng and Yu (2014)]. DNN has since received a great deal of attention by performing better than alternative ML methods in several significant applications. DNNs are ANNs that are formed by multiple layers of neural networks with a high number of non-linear neurons per layer [Ferreiro-Cabello, Fraile-Garcia, Martinez de Pison Ascacibar et al. (2018); Schmidhuber (2015)]. Currently, DNN is among the most extensively used classifiers [Janocha and Czarnecki (2016)]. DNN is able to extract new properties from raw data and to reduce the size of the dataset [Cılasun and Yalçın (2016); Basturk, Yuksel and Caliskan (2017)]. DNNs provide an excellent set of hypotheses for performing various ML tasks, including classification. The success of DNNs can be largely attributed to the depth of the networks [Arora, Basu, Mianjy et al. (2018)]. Researchers at many universities around the world are actively working on this subject [Deng and Yu (2014)].
Although there are several studies on different mineral additives, few studies address the determination of the properties of mortars produced using ground waste brick, especially under different curing conditions and using DNNs and ANNs. Moreover, whereas the behaviour of mineral additives such as fly ash and slag under different curing conditions is easy to predict, the behaviour of GB is considered difficult to predict. This study investigates the usage of ANN and DNN methods in the field of construction. In this experimental study, cement mortar samples have been produced using GB as a cement replacement material in 0%, 2.5%, 5%, 7.5%, 10%, 12.5% and 15% compositions. The mortar samples have been cured in three different environments (W, SS5, AN5), and their pressure, bending and elongation values have been determined at 7, 28, 90 and 180 days [Demir, Yaprak and Simsek (2011)]. The objective of this study is to estimate the pressure, bending and elongation values of the GB-added mortars with low error margins using DNN and ANN models that take the GB replacement ratio, environmental condition and age as inputs. Experiments have been conducted by varying the parameters of both models so that the models yield optimum results. The estimated values obtained from both models are compared with each other and with the experimental results.

Related work
Recently, a large number of studies have estimated cement and concrete properties using ANN [Uysal and Tanyildizi (2011); Atici (2011); Siddique, Aggarwal and Aggarwal (2011); Bilgehan (2011); Karakurt and Topçu (2011); Öztürk and Turan (2012); Khan (2012); Uysal and Tanyildizi (2012); Hodhod and Salama (2013); Bal and Buyle-Bodin (2013); Bingöl, Tortum and Gül (2013); Diab, Elyamany, Elmoaty et al. (2014); Gülbandılar and Koçak (2017)]. The literature shows that different computer algorithms are used for different purposes in the construction sector. For instance, multi-gene genetic programming and ANN have been used to develop two models for estimating the creep compliance of concrete [Hodhod, Said and Ataya (2018)]. Gazder et al. used ANN to estimate the pressure resistance of plain Portland cement, i.e. the pressure value of concrete manufactured without supplementary cementing materials [Gazder, Al-Amoudi and Khan (2017)]. In another study, Gopalakrishnan et al. developed an application that automatically detects cracks in hot-mix asphalt (HMA) and Portland cement concrete using a deep convolutional neural network that was trained on the 'big data' ImageNet database containing millions of images. The pavement images examined in that study also included various non-crack anomalies and defects [Kasthurirangan, Siddhartha, Alok et al. (2017)]. In yet another study, DL has been designed and applied to model the elastic homogenisation structure-property connection in a high-contrast composite material system [Yang, Yabansu, Al-Bahrani et al. (2018)]. Two significant results have been obtained from the literature review. The first is that such computer-based methods are useful for solving several problems in the field of construction. The second is that no similar study applying the ANN and DL methods to brick-dust has been found in the literature.

Method
This experimental study is an application, in civil engineering, of the DL method that has attracted considerable attention in recent years. For this study, cement mortar samples have been produced using GB as a cement replacement material in 0%, 2.5%, 5%, 7.5%, 10%, 12.5% and 15% compositions. The mortar samples have been cured in three different environments (W, SS5, AN5), and their pressure, bending and elongation values have been determined at 7, 28, 90 and 180 days. Details of this experimental study and of the DNN, ANN and regression models are presented in this section.

Deep learning
DL is a branch of ML that learns representations from data, with an emphasis on learning successive layers of increasingly meaningful representations. Modern DL frequently involves tens or even hundreds of successive layers of representations, all of which are learned automatically from exposure to training data. Other approaches to ML tend to focus on learning only one or two layers of representations of the data; therefore, they are sometimes called shallow learning methods [Chollet (2018)]. To define a model or set up an ML system with classical ML techniques, a feature vector must first be extracted, and this requires experts in the field. Because this process takes a long time and experts are not always available, classical ML techniques cannot work on data without pre-processing or expert assistance. Deep networks, unlike traditional ML and image processing techniques, perform the learning process using raw data. They therefore eliminate the problems mentioned above and drive progress in the field. While using raw data, they derive the necessary information from the representations formed in the different layers. When devising DL applications, the designer has to decide on several parameters that will be employed in the design. These parameters, which vary according to the problem and dataset, are referred to as hyper-parameters. The most important hyper-parameters are the size of the dataset, the mini-batch size, the learning rate, the choice of optimisation algorithm, the number of epochs, the initialisation of the starting values, the activation function, the dropout value, the number of layers, the number of neurons in the hidden layers and, for convolutional neural networks (CNNs), the kernel size.
The values that are employed in the study are given in Tab. 1. The other parameter values that are not included in Tab. 1 will be given in the following sections.

Experimental study
The materials used in this study included cement, ground brick (GB), standard sand, water and super-plasticisers. CEM I 42.5N cement, conforming to TS EN 197-1, and standard sand, compliant with TS EN 196-1, were used. The waste fired clay bricks were obtained from a local brick manufacturer in Eskişehir (Kılınçoğlu). The waste bricks were crushed, dried at 105°C for 24 h and then finely ground in a grinding mill to a Blaine fineness of approximately 5200 cm²/g, and the chemical composition was characterised. The chemical compositions and physical properties of the cement (OPC) and GB are given in Tab. 2. The sum of the SiO2, Al2O3 and Fe2O3 oxides of GB is 88.20%, which makes GB a good pozzolanic material according to ASTM C 618. GB was used as a supplementary cementing material in the mortar mix to replace cement by weight in 0.0%, 2.5%, 5.0%, 7.5%, 10.0%, 12.5% and 15.0% compositions. The consistency and volume expansion values of the mortars were determined in accordance with TS EN 196-3 [Turkish Standards Institution (2002)]. The mortar specimens were prepared using standard sand, binder and water in the ratio 3:1:0.5. To determine the expansion of the mortars, samples were prepared as prisms of dimensions 25 mm×25 mm×285 mm according to ASTM C 1012 [ASTM (2007)]. To determine the flexural and compressive strength after 7, 28, 90 and 180 days, samples were prepared as prisms of dimensions 40 mm×40 mm×160 mm according to TS EN 196-1. Seven different types of mortar mixtures were prepared, and 3 specimens were prepared for each one. The mortar prism samples were stored at a temperature of 20±3°C for 24 hours, then demoulded and cured in lime-saturated tap water (W), 5% sodium sulphate solution (SS5) and 5% ammonium nitrate solution (AN5) for the test duration.
The pH values of the SS5 and AN5 solutions were kept within the range of 6-8 by replacing the solution with a fresh one when required. The length change measurements were conducted on the 25 mm×25 mm×285 mm prism specimens at the end of the curing period. For each mortar, the 40 mm×40 mm×160 mm prisms were used for three-point bending, and the six broken half-prism specimens were used for the compression tests. The compressive test was performed on a 40 mm×40 mm loading area, with the test procedure conforming to TS EN 1015-11:1999 [Turkish Standards Institution (1999)].

DNN and ANN models
In the study, DNN and ANN models, which used back-propagation algorithm as the learning algorithm, were constructed to determine the brick-dust pressure as well as the bending and elongation values without experimentation. The input layer is considered to be similar for both the DNN and ANN models. There are three neurons in the input layer. These neurons and their value ranges are presented in Tab. 4.
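The three-neuron input layer and the 25/59 division of the 84 series reported in this study can be illustrated with a short sketch. The levels below are taken from the text, but the numeric codes for the curing environments, the random seed and the helper names `build_series` and `split_series` are assumptions made for illustration only; the actual value ranges are those given in Tab. 4.

```python
import itertools
import random

# Levels reported in the text; the numeric environment codes are hypothetical.
GB_RATIOS = [0.0, 2.5, 5.0, 7.5, 10.0, 12.5, 15.0]   # % of cement replaced by GB
ENVIRONMENTS = {"W": 1, "SS5": 2, "AN5": 3}           # assumed encoding
AGES_DAYS = [7, 28, 90, 180]


def build_series():
    """Return one (gb_ratio, environment_code, age) input vector per series."""
    return [
        (gb, ENVIRONMENTS[env], age)
        for gb, env, age in itertools.product(GB_RATIOS, ENVIRONMENTS, AGES_DAYS)
    ]


def split_series(series, n_train=25, seed=0):
    """Randomly pick n_train series for training; the rest are held out for testing."""
    rng = random.Random(seed)
    shuffled = series[:]
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


series = build_series()
train, test = split_series(series)
print(len(series), len(train), len(test))
```

Seven replacement ratios, three environments and four ages give the 84 series, of which 25 (about 30%) are kept for training and 59 (about 70%) for testing.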

DNN models
In this study, three DNN models comprising three inputs, one output and two hidden layers are created to determine the brick-dust pressure, bending and elongation values without experimentation. The number of neurons in the hidden layers has been chosen as 20. The number of hidden layers in DNN and ANN and the number of neurons in the hidden layers affect the learning performance. Therefore, different models have been designed, and numerous tests have been conducted by varying the number of hidden layers and the number of neurons in the hidden layers. These experiments showed that a model with 20 neurons and two hidden layers generated the optimal result. The constructed DNN model is given in Fig. 1. Each of the constructed DNN models produces one output: the brick-dust pressure (DNN-1), elongation (DNN-2) or bending (DNN-3) value.

Training of DNN
The Rapid Miner software was used to create and train the DNN. The tests and training processes were performed by applying the BD replacement, environmental conditions and age to the DNN model as inputs; the brick-dust pressure, bending and elongation values were received as outputs. 30% of the experimental data (25) were used for training, and 70% of the experimental data (59) were used to test the DNN model. The majority of the DNN and ANN studies in the literature use 70% of the data for training and 30% for testing. In this study, the network was deliberately trained with few data, which increased the number of test data that the network had not previously seen. Thus, the verification of the DNN model has been performed reliably and effectively. The parameters of the DNN models are presented in Tab. 5. All three DNN models use the same parameters; only the output values are different. The most commonly used activation functions in DL and neural network studies in the literature are the standard logistic sigmoid and the hyperbolic tangent. However, in this study, the rectifier activation function provided better results. The rectifier activation function produces successful results in DNNs where the number of hidden layers is high [Glorot, Bordes and Bengio (2011)]. The rectifier activation function is shown in Eq. (1) [Maas, Hannun and Ng (2013); Zhang, Jiang, Wei et al. (2015)].

$$ h^{(i)} = \max\left(w^{(i)T} x,\ 0\right) \tag{1} $$

where w^{(i)} is the weight vector for the i-th hidden layer and x is the input.

The loss function gives a measure of the accuracy of the prediction model. Loss functions are treated as part of the DNN model construction process and are defined for both quantitative and categorical response variables [Berk (2011)]. The quadratic loss function has been used in the DNN model that is created in this study. Quadratic loss functions were introduced in the 1700s and 1800s. This function is mathematically shown in Eq. (2) [Berk (2011); Benneyan and Aksezer (2006)]. The ADADELTA method was used to optimise the DNN models. This method does not require manual adjustment of the learning coefficient. Additionally, a separate dynamic learning rate is determined for each dimension. Further, it requires less calculation and provides good results on noisy data [Zeiler (2012)]. Owing to the ADADELTA method, it was not necessary to determine the learning and momentum coefficients in the DNN models. Different learning cycles have been tested for training the DNN models; the optimum result has been obtained using a learning cycle of 10,000 epochs.
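The three ingredients named in this section (rectifier activation, quadratic loss and ADADELTA) can be sketched on a toy one-parameter problem. This is only an illustration under stated assumptions: the squared-error form of the loss is the usual textbook one and is assumed here, the values of `rho` and `eps` follow common defaults from Zeiler (2012) rather than Tab. 5, and the sample `(x, y)` is invented. It is not the Rapid Miner implementation used in the study.

```python
import math


def relu(z):
    """Rectifier of Eq. (1): max(z, 0) applied to a pre-activation z."""
    return max(z, 0.0)


def quadratic_loss(y_true, y_pred):
    """Squared error between an experimental value and a prediction (assumed form)."""
    return (y_true - y_pred) ** 2


def adadelta_minimise(grad_fn, w0, steps=200, rho=0.95, eps=1e-6):
    """Minimise a one-parameter function with ADADELTA.

    No learning rate is supplied: the step size is the ratio of two running
    RMS values (past updates over past gradients).
    """
    w = w0
    avg_sq_grad = 0.0    # running average of g^2
    avg_sq_delta = 0.0   # running average of (delta w)^2
    for _ in range(steps):
        g = grad_fn(w)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g * g
        delta = -math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_delta = rho * avg_sq_delta + (1 - rho) * delta * delta
        w += delta
    return w


# Toy problem with invented numbers: fit y = relu(w * x) to a single sample.
x, y = 1.0, 2.0
w_start = 0.5


def grad(w):
    """Derivative of quadratic_loss(y, relu(w * x)) with respect to w."""
    active = x if w * x > 0 else 0.0
    return -2.0 * (y - relu(w * x)) * active


w_end = adadelta_minimise(grad, w_start)
loss_start = quadratic_loss(y, relu(w_start * x))
loss_end = quadratic_loss(y, relu(w_end * x))
print(loss_start, loss_end)
```

Because ADADELTA starts from very small steps, the loss falls slowly at first; this is consistent with the large number of epochs reported for training.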

ANN models
Three feed-forward ANN models comprising three inputs, one output and one hidden layer, with back-propagation as the learning algorithm, were created to determine the brick-dust pressure, bending and elongation values. The created ANN model is shown in Fig. 3.

Figure 3: The developed ANN models
Based on the experiments that have been conducted, the number of neurons in the hidden layer was chosen as 15. The number of neurons in the hidden layer affects the learning performance. If the number of neurons in the hidden layer is too small, the network cannot converge to the ideal value and oscillates; therefore, the network is unable to learn. If the number of neurons is too large, the network only stores the input-output list and generalises weakly. In other words, the network memorises the data [Tortum (2007)]. Therefore, the number of neurons in the hidden layer depends on the dataset of the problem, and the most appropriate number can be found only by trial and error.

Training of ANN
The tests and training processes were conducted by applying the BD replacement, environmental conditions and age to the ANN model as inputs, and the brick-dust pressure, bending and elongation values were received as outputs. 30% of the experimental data (25) were used for training; 70% of the experimental data (59), which were never seen by the network, were used to test the ANN model. The parameters of the ANN models are given in Tab. 6. The number of hidden layers and the number of neurons in these hidden layers are identical in all three ANN models. In model ANN-1, the momentum and learning rate parameters differ from those of the other models. The learning coefficient determines how much the weights are changed in each step. If large values are chosen, the network may jump between local solutions, i.e. oscillate. Selecting small values increases the learning time. The momentum coefficient ensures that a certain ratio of the weight change is added to the subsequent change so that the network does not get stuck at a local optimum during learning. In other words, the weight change in the previous step affects the subsequent change at the rate of the momentum coefficient. Another difference in model ANN-1 is that the learning cycle is 100,000 epochs; in the other ANN models, this value is 10,000 epochs. In addition, the mean square error (MSE) parameter is used to test the network's performance in all the ANN models. The back-propagation algorithm is used as the learning algorithm. It is the most extensively used learning algorithm in many disciplines, especially in engineering, mainly because its learning capacity is high and the algorithm is simple [Elmas (2007)]. The error-correction learning rule provides the basis of the algorithm.
Basically, the error back-propagation process consists of two passes through the layers of the network: a forward pass and a backward pass. In the forward pass, an activity pattern (input vector) is applied to the sensory nodes of the network, and its effect propagates through the network layer by layer. Finally, a set of outputs is produced as the actual response of the network. During the forward pass, the synaptic weights of the network are all fixed. During the backward pass, the synaptic weights of the network are adjusted according to the error-correction rule. The actual response of the network is subtracted from the target response to produce an error signal, which is then propagated backward through the network [Alam (2009)]. The sigmoid transfer function is used in all the ANN models. The sigmoid function is the most extensively used transfer function for the hidden and output layers in a back-propagation network [Ghasemzadeha, Ahmadnejada, Aghaeinejad-Meybodib et al. (2018)]. The commonly used activation function f(x) for prediction purposes is the sigmoid transfer function, which can be represented as follows [López, Rene, Boger et al. (2017)]:

$$ f(x) = \frac{1}{1 + e^{-x}} \tag{3} $$
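The forward pass, backward pass, learning coefficient and momentum coefficient described above can be sketched for a network of the same shape (three inputs, one hidden layer of 15 sigmoid neurons, one output). The class name `TinyANN`, the learning rate, the momentum value, the number of epochs and the three training points are all hypothetical; the values actually used are those in Tab. 6.

```python
import math
import random


def sigmoid(x):
    """Sigmoid transfer function of Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-x))


class TinyANN:
    """3-input, one-hidden-layer, 1-output sigmoid network with bias terms."""

    def __init__(self, n_in=3, n_hidden=15, seed=0):
        rng = random.Random(seed)
        # Each row stores the input weights followed by a bias weight.
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                    for _ in range(n_hidden)]
        self.w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        # Previous weight changes, needed for the momentum term.
        self.v_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.v_o = [0.0] * (n_hidden + 1)

    def forward(self, x):
        """Forward pass with fixed weights; returns inputs, hidden and output."""
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w_h]
        out = sigmoid(sum(w * h for w, h in zip(self.w_o, hidden + [1.0])))
        return xb, hidden, out

    def train_step(self, x, target, lr=0.3, momentum=0.2):
        """Backward pass: propagate (target - output) and adjust the weights."""
        xb, hidden, out = self.forward(x)
        delta_o = (target - out) * out * (1.0 - out)
        delta_h = [delta_o * self.w_o[j] * h * (1.0 - h)
                   for j, h in enumerate(hidden)]
        hb = hidden + [1.0]
        for j in range(len(self.w_o)):
            self.v_o[j] = lr * delta_o * hb[j] + momentum * self.v_o[j]
            self.w_o[j] += self.v_o[j]
        for j, row in enumerate(self.w_h):
            for i in range(len(row)):
                self.v_h[j][i] = lr * delta_h[j] * xb[i] + momentum * self.v_h[j][i]
                row[i] += self.v_h[j][i]


def mse(net, data):
    """Mean square error of the network over (input, target) pairs."""
    return sum((t - net.forward(x)[2]) ** 2 for x, t in data) / len(data)


# Invented, already-normalised samples; not experimental data.
data = [((0.0, 0.0, 0.0), 0.2), ((1.0, 0.5, 0.25), 0.8), ((0.5, 1.0, 1.0), 0.5)]
net = TinyANN()
mse_before = mse(net, data)
for _ in range(1000):
    for x, t in data:
        net.train_step(x, t)
mse_after = mse(net, data)
print(mse_before, mse_after)
```

Because the output neuron is a sigmoid, targets have to be scaled into the interval (0, 1) before training; the sketch assumes this has already been done.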

Regression models
In this study, three different multiple linear regression models (REGs) have been developed in Python to compare the ML methods with traditional methods. The multiple linear regression model has the form of Eq. (4):

$$ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_{p-1} X_{i,p-1} + \varepsilon_i \tag{4} $$

In this equation, Yi is the dependent variable; β0, β1, β2, …, βp-1 are unknown parameters; Xi1, Xi2, …, Xi,p-1 are the predictor variables; and εi is the error term [Salleh and Hasan (2017)].
There are three independent variables in the developed regression models: BD (Xi1), age (Xi2) and environmental conditions (Xi3). These variables are used to forecast the values of brick-dust pressure (REG-1), brick-dust elongation (REG-2) and brick-dust bending (REG-3). The regression equations of the models are denoted in Eq. (5) (REG-1), Eq. (6) (REG-2) and Eq. (7) (REG-3). These equations were obtained using the same training data as the ANN and DNN models. After obtaining the equations, predictions were made using the test data. The comparative analysis of the prediction results is presented in the Results and discussion section.
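A multiple linear regression with three predictors can be fitted by ordinary least squares, as sketched below. The paper does not state which Python routine was used, so solving the normal equations directly is an assumption, and the helper names are hypothetical. The coefficients used to generate the sample targets are synthetic and are not those of Eq. (5) to Eq. (7).

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares for y = b0 + b1*x1 + ... via the normal equations."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    p = len(X[0])
    # Augmented system [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(beta, row):
    """Evaluate the fitted equation for one (BD, age, environment) row."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Synthetic illustration: rows are (BD %, age in days, environment code).
rows = [(0.0, 7, 1), (2.5, 28, 2), (5.0, 90, 3),
        (7.5, 180, 1), (10.0, 7, 2), (15.0, 28, 3)]
true_beta = [40.0, -0.5, 0.05, -2.0]             # made-up coefficients
targets = [predict(true_beta, r) for r in rows]

beta = fit_linear_regression(rows, targets)
print([round(b, 6) for b in beta])
```

A linear form of this kind treats the environment code as a numeric quantity and cannot represent interactions between age and environment, which is one plausible reason for a higher prediction error than that of the network models.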

Results and discussion
In this section, the test data (59 datasets, 70% of the data) that were not used in network training were entered into the trained DNN, ANN and regression models, and the obtained results were examined. The MSE parameter was used to determine the prediction error of the models; it also provides information on model performance. In statistics, the MSE of an estimator is one way to quantify the amount by which an estimator differs from the true value of the quantity being estimated [Casella (1999)]. The MSE parameter is calculated as shown in Eq. (8):

$$ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{8} $$

where n is the number of test data, y_i is the experimental value and ŷ_i is the value predicted by the model. An MSE value close to zero indicates that the prediction error of the model is low and that the model performance is high. Another parameter that is used to measure the performance of the models is the R (regression) parameter. The value of R indicates the relation between the results obtained from the model and the experimental results. As R approaches one, a linear relation can be assumed between the results obtained from the models and the experimental results [Paralı, Sarı, Kılıç et al. (2017)]. The MSE and R values obtained on the test data from the regression equations and the trained ANN and DNN models are presented in Tab. 7. In addition, comparison graphs of the R and MSE parameters obtained from the ANN, DNN and regression models are depicted in Fig. 4.
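The two performance measures can be computed as in the following sketch. It assumes that R is the Pearson correlation between experimental and predicted values, which the text suggests but does not state explicitly; the function names and the sample numbers are illustrative only.

```python
import math


def mean_squared_error(experimental, predicted):
    """Average of the squared differences between experiment and prediction."""
    n = len(experimental)
    return sum((e - p) ** 2 for e, p in zip(experimental, predicted)) / n


def pearson_r(experimental, predicted):
    """Pearson correlation coefficient between experiment and prediction."""
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_e * var_p)


# Invented values, used only to exercise the two functions.
experimental = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(mean_squared_error(experimental, predicted), pearson_r(experimental, predicted))
```

The two measures are complementary: R can be close to one even when predictions are systematically offset, whereas MSE penalises any deviation from the experimental value.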

Figure 4: Comparison of ANN, DNN and REG models
When Tab. 7 and Fig. 4 are examined, it can be observed that the prediction performances of the DNN models are high and that their errors are low. Thus, the brick-dust pressure, elongation and bending values are predicted accurately by the DNN models. The R values of the DNN models are close to 1, which indicates that the relation between the output of the DNN model and the desired output is not coincidental. The R values of the ANN models are lower than those of the DNN models; in particular, the performance of the ANN-2 model, which predicts the brick-dust elongation value, is low. The DNN models also perform better in terms of the MSE parameter; in other words, they make predictions with few errors. The MSE values of the DNN models are very close to zero, so the difference between the output of the network and the experimental output is negligible. The regression models yield relatively high MSE values and relatively low R values compared with the ANN and DNN models. Thus, the regression models estimate the results with a high level of error, and the relation between their outputs and the desired outputs is weaker than that of the other models. Only for the brick-dust elongation does the regression model give a result close to that of the ANN model. In general, the regression model is not an effective method for estimating the brick-dust pressure, elongation and bending values. It has also been observed that the DNN model is better trained than the ANN model in terms of the R and MSE values. In addition, it should be noted that these results were obtained using only 30% of the data for training.
The DNN model exhibits a very good prediction performance even though it has seen only a small amount of the experimental data. This indicates that the DNN model and its parameters are well designed. The regression analysis graphs of the R parameter for the testing set of the DNN, ANN and REG models are depicted in Fig. 5. These graphs show that the DNN model exhibits less deviation than the ANN and REG models and that there is a linear relation between the experimental data and the prediction data. In addition, correlation analysis was performed to determine the direction and strength of the relation between the experimental results and the predicted values obtained from the models, and the Pearson correlation coefficients (r) were calculated. The correlation analysis results are presented in Tab. 8. These results show that there is a strong positive correlation between the experimental pressure values and the pressure values obtained from the ANN (r=0.99, p<0.01) and DNN (r=0.99, p<0.01) models. Similarly, there is a strong positive correlation between the experimental elongation values and the elongation values obtained from the ANN (r=0.92, p<0.01) and DNN (r=0.96, p<0.01) models.
A similar relation is also observed between the experimental bending values and the bending values obtained from the ANN (r=0.95, p<0.01) and DNN (r=0.99, p<0.01) models. Therefore, the results of the correlation analysis indicate that both models made strong predictions with few errors. However, when the Pearson correlation coefficients (r) are examined in detail, the DNN model appears to be the better model. The test data that the DNN and ANN models had never seen before were given as input to these models. The obtained prediction data, experimental data and error values are shown in Tab. 9. When Tab. 9 is examined, the maximum absolute error for the brick-dust pressure output is 2.98 in the ANN-1 model and 1.88 in the DNN-1 model. For the brick-dust elongation output, these values are 0.033 (ANN-2) and 0.026 (DNN-2), respectively. In addition, the number of error-free predictions is eight in the DNN-2 model, while it is three in the ANN-2 model. Finally, the maximum absolute error of the ANN-3 model for the brick-dust bending output is 1.10, while that of the DNN-3 model is 0.50. According to the absolute error values, both models provide output at the desired level. However, the performance of the DNN model is higher than that of the ANN model. The reason for the better performance of DNN compared with ANN is that the DNN model, due to its structure, achieves more accurate results with more data. In addition, DNN performs unsupervised learning within itself to determine the level of importance of the features and uses them accordingly.
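The error statistics quoted from Tab. 9 (maximum absolute error and number of error-free predictions) can be derived from paired experimental and predicted values as sketched here. The function name `error_summary`, the tolerance argument and the sample values are hypothetical and do not come from Tab. 9.

```python
def error_summary(experimental, predicted, tol=0.0):
    """Return the maximum absolute error and the count of error-free predictions.

    A prediction counts as error-free when its absolute error is at most tol.
    """
    abs_errors = [abs(e - p) for e, p in zip(experimental, predicted)]
    return {
        "max_abs_error": max(abs_errors),
        "error_free": sum(1 for a in abs_errors if a <= tol),
    }


# Invented values for illustration only.
experimental = [40.0, 45.5, 50.0]
predicted = [40.0, 44.0, 50.5]
summary = error_summary(experimental, predicted)
print(summary)
```

When the tabulated values are rounded, as for the elongation output, the tolerance would normally be set to the rounding precision rather than to exactly zero.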

Conclusions
In this study, DNN and ANN models as well as regression models were developed to estimate, without experimentation, the pressure, bending and elongation values of mortar samples with different BD replacement ratios and environmental conditions after 7, 28, 90 and 180 days of curing. The developed models were trained with the experimental input and output data. A randomly selected 30% of the experimental data (25) was used to train the models, and 70% (59) of the experimental data was used to test the trained models. When the test data were given to the trained DNN and ANN models as input, the brick-dust pressure, bending and elongation values were found to be close to the actual experimental data. The regression model performed the estimation with a high error margin. In addition, according to the results of the correlation analysis performed with the SPSS software, there is a strong positive correlation (r>0.9, p<0.01) at a 99% confidence level between the experimental results and the prediction values obtained from the models. This relation is slightly stronger in the DNN model than in the ANN model. Although only a small percentage (30%) of the experimental data was used for training, both models performed the prediction process at the expected level. Achieving this success with few training data indicates that the models are well designed. When the DNN and ANN models were compared, the performance of the DNN model was higher for all three predictions (pressure, elongation and bending). The MSE values of the DNN models were 0.504, 0.000063 and 0.036, while those of the ANN models were 1.733, 0.000127 and 0.2476. According to the MSE values, the DNN-model predictions exhibited fewer errors than those of the ANN models.
When the R values were examined to address the relation between the experimental data and the prediction data of the models, the R values for the DNN model were 0.9967, 0.967 and 0.9930, whereas they were 0.9899, 0.9336 and 0.9531 for the ANN models. According to the R parameter, the relation between the prediction values of the DNN models and the experimental data may be assumed to be linear. The performances of the regression models are unacceptably poor compared to the ANN and DNN models. Therefore, using both DNN and ANN models, the brick-dust pressure, bending and elongation values can be predicted with a small margin of error in a considerably short time and without experimentation. Using DNN and ANN models to predict these values is considered to be practical and effective.