Performance evaluation of friction stir welding using machine learning approaches


Method details
The demand for aluminium alloys is increasing in aerospace, shipbuilding, automotive, transport, military and many other industries owing to their unique features, i.e., high strength-to-weight ratio, high formability, excellent corrosion resistance, etc. However, as with ferrous alloys, joining aluminium alloys by conventional processes is very problematic because of high thermal conductivity, aluminium oxide formation, high thermal expansion, hydrogen solubility, etc. [1]. Aluminium has high thermal and electrical conductivity, so more intense heat must be applied during fusion or resistance welding of this metal. This changes the mechanical and metallurgical properties of the joint. Furthermore, aluminium has a high coefficient of thermal expansion, and therefore tacking is mandatory before welding to make the weld uniform [2]. Because solidification cracking sometimes occurs during fusion welding, some aluminium alloys are considered non-weldable [3]. To improve the weld characteristics, friction stir welding (FSW), a revolutionary welding technique, was introduced by The Welding Institute (TWI) in 1991. FSW avoids the arc-welding problems of aluminium alloys because the base metal neither changes phase nor reaches its melting point. The technique can join thin and thick materials with a less skilled operator. FSW is a solid-state welding technique that produces joints with superior mechanical and metallurgical properties compared with fusion welding [4]. Despite these benefits, the technique has still not been fully commercialized around the world owing to high equipment cost and the lack of industry standards, specifications and design allowables. The process principle of FSW is depicted in Fig. 1. The weld is produced with a non-consumable rotating tool with a specially designed pin and shoulder.
The friction between the tool and the workpiece generates heat that softens the material around the pin, and the rotational and translational movement of the tool produces the joint. FSW can also be regarded as a green technology because it requires no filler material or shielding gas [5], and no arc is needed to melt the metal. The FSW process is controlled by various process parameters such as rotational speed, transverse speed, tilt angle and dwell time. These parameters affect the quality of the weld, so it is important to gather detailed information on them.
Numerous studies have been published on this technique over the past 25 years. Patel et al. [6] studied the superplastic behavior of friction stir (FS) processed aluminium alloys. They found that superplasticity depends significantly on tool-related parameters, machine-related parameters, heat distribution, cooling rate and strain rate, and they discussed applications of superplastic behavior in the aviation and automotive industries. Patel et al. [7] reported the effect of polygonal pin profiles on the friction stir processing of AA7075. A non-uniform microstructure was observed in the stir zone for all polygonal pin profiles, and the maximum tensile strength was obtained with a square pin profile. Patel et al. [8] explained the effect of tool pin profile on the temperature distribution during friction stir processing of AA7075. The pentagonal pin profile generated more heat during plunging, and the tool shoulder had more influence on heat generation than the pin. Most researchers have focused on the mechanical and metallurgical properties of FSW joints; few have tried to evaluate the performance of FSW on different materials using modeling and optimization techniques. Okuyucu et al. [9] investigated the mechanical properties of FS welded joints using artificial neural network (ANN) modeling. Rajakumar and Balasubramanian [10] employed response surface methodology (RSM) to optimize the FSW process parameters. Shojaeefard et al. [11] employed ANN modeling to find the correlation between the input and output process parameters. Tansel et al. [12] used genetically optimized neural network systems (GONNS) to model the FSW process. Ghetiya et al. [13] used a Taguchi-based grey relational approach to optimize the input parameters for FS welded AA8011.
They observed that the maximum tensile strength was obtained at a tool diameter of 14 mm, a transverse speed of 80 mm/min and a spindle speed of 1400 rpm. Na et al. [14] employed tungsten arc welding for joining dissimilar metal plates and used support vector regression (SVR) models to predict the residual stresses of the joint. They concluded that these models predict the experimental outcomes very precisely. Wang et al. [15] observed that the support vector machine (SVM) classifies defective and non-defective features of the weld more precisely. Pal and Deswal [16] used Gaussian process regression (GPR) for prediction in water-engineering problems. Apart from SVM and GPR, researchers have mostly used ANN models for predicting the performance of manufacturing processes [17][18][19]. However, hardly any work has been published on modeling the tensile strength of FS welded AA6082 using machine learning approaches such as GPR and SVM. The present study investigates the ultimate tensile strength (UTS) of FS welded AA6082 and examines the potential of three machine learning techniques, i.e., GPR, SVM and multi-linear regression (MLR). In addition, the GPR model results are compared with the SVM and MLR model results.

Experimental procedure
The material in this study is AA6082 in the form of a rolled sheet of 6.35 mm thickness. The chemical composition of the material was determined by EDX analysis, as shown in Fig. 2. The experiments are designed on the basis of a full factorial design in which five levels each of rotational and transverse speed are used to fabricate the joints at a 2° tilt angle and 30 s dwell time. Table 1 lists the input and output process parameters for the present study. The total number of experiments for two factors at five levels is 25, i.e., $5^2$. Every experiment is conducted with two replications to reduce the effect of human error. A vertical milling machine is used for the experiments, with a purpose-made tool and fixture as depicted in Fig. 3. The tool is made of H13 die steel with a hardness of 54-56 HRC. It has a 20 mm diameter, a 6.1 mm pin length and a 6 mm pin diameter. Two plates of dimensions 100 mm × 80 mm × 6.35 mm are butt-welded at a right angle to the rolling direction. Fig. 4(a) shows schematically how the tensile test specimens are cut, together with a schematic of the tensile specimen. First, rectangular strips (150 × 12 × 6.35 mm) are cut from the welded sample on a power hacksaw. These strips are then machined into tensile specimens on the milling machine with an end milling cutter according to the ASTM E8M-04 standard, in a direction at right angles to the welding direction. Next, the specimens are tested on a universal testing machine (UTM) to evaluate the ultimate tensile strength (UTS) of the FS welded joint. Fig. 4(b) shows a photograph and the dimensions of the tensile specimen.
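The run list of the two-factor, five-level full factorial design can be enumerated with a short Python sketch. The level values are the ones that appear in the run listing of the modeling section (rotational speed in rpm, transverse speed in mm/min); the function and variable names are illustrative only, and the ordering (rotational speed varying slowest) is an assumption based on that listing.

```python
from itertools import product

# Factor levels as they appear in the paper's run listing.
rotational_speeds_rpm = [500, 710, 1000, 1400, 2000]
transverse_speeds_mm_min = [20, 40, 63, 80, 100]


def full_factorial(*factors):
    """Return every combination of the factor levels, first factor varying slowest."""
    return list(product(*factors))


runs = full_factorial(rotational_speeds_rpm, transverse_speeds_mm_min)

for run_no, (rpm, feed) in enumerate(runs, start=1):
    print(f"{run_no:2d}  {rpm:5d} rpm  {feed:4d} mm/min")
```

With two replications per run, each of these 25 settings would be welded twice; the replication itself is not modeled in the sketch.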

Modeling of tensile strength
In the present study, three machine learning techniques, namely GPR, SVM and MLR, are used to model the friction stir welding process. GPR, SVM and MLR regression models are developed for better prediction of the ultimate tensile strength of the FS welded AA6082 joint. The most suitable model is then proposed on the basis of model performance and graphical analysis. The experimental outcomes are divided into two sets, a training set and a testing set. Out of 25 observations, 83% of the data are randomly selected for the training set and the remainder for the testing set. Using the training set, regression models are developed with GPR, SVM and MLR. Two kernel functions, i.e., the Pearson VII universal kernel (PUK) and the radial basis function (RBF) kernel, are used with both GPR and SVM regression. Afterward, the adequacy of each developed model is checked on the testing data set.

Table 1 Input and output process parameters.
Exp. no.   Rotational speed (rpm)   Transverse speed (mm/min)   UTS (MPa)
1          500                      20                          …
2          500                      40                          170
3          500                      63                          152
4          500                      80                          135
5          500                      100                         79
6          710                      20                          199
7          710                      40                          192
8          710                      63                          181
9          710                      80                          160
10         710                      100                         110
11         1000                     20                          220
12         1000                     40                          216
13         1000                     63                          203
14         1000                     80                          182
15         1000                     100                         143
16         1400                     20                          238
17         1400                     40                          242
18         1400                     63                          225
19         1400                     80                          193
20         1400                     100                         170
21         2000                     20                          215
22         2000                     40                          242
23         2000                     63                          260
24         2000                     80                          220
25         2000                     100                         192
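A random partition of the 25 experiments into training and testing sets might look like the following sketch. It uses the 19/6 partition reported for Figs. 7 and 8 in the results section; the seed and the function name `split_observations` are assumptions made only so the example is reproducible, and the paper does not state which runs ended up in which set.

```python
import random


def split_observations(n_total, n_train, seed=0):
    """Randomly split experiment numbers 1..n_total into training and testing sets."""
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility only
    experiment_ids = list(range(1, n_total + 1))
    rng.shuffle(experiment_ids)
    train_ids = sorted(experiment_ids[:n_train])
    test_ids = sorted(experiment_ids[n_train:])
    return train_ids, test_ids


train_ids, test_ids = split_observations(n_total=25, n_train=19)
print("training runs:", train_ids)
print("testing runs: ", test_ids)
```

The models are fitted on the training runs only, and the testing runs are held back to check how well each model generalizes.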

Gaussian process regression
Gaussian process models assume a directed dependence of the output (y) on the input (x) of the process and describe the conditional distribution $p(y \mid x)$. In GPR models, each observation carries information about the adjacent observations. A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution [20]. The graphical representation of GPR is shown in Fig. 5. The boxes show the observed variables, whereas the circles represent the unknown values. The lines connect the nodes. Each observed value is conditionally independent of all other nodes given the corresponding value of the latent variable f. This follows from the marginalization property of the Gaussian process. The main assumption of GPR is that y is given by $y = f(x) + \varepsilon$, where $\varepsilon \sim N(0, \sigma^2)$. In GPR, there is a latent variable $f(x)$ for each input value x. In the current study, $\varepsilon$ is the observational error, which is independently and identically distributed with noise variance $\sigma^2$. The observations are then distributed as

$$Y \sim N(0,\; K + \sigma^2 I) \qquad (1)$$

where I is the identity matrix and $K_{ij} = k(x_i, x_j)$. Because $Y \mid X \sim N(0, K + \sigma^2 I)$ is normal, so is the conditional distribution of the test outputs given the training data and the test inputs, $p(Y_* \mid Y, X, X_*)$. One then has $Y_* \mid Y, X, X_* \sim N(\mu, \Sigma)$, where

$$\mu = K(X_*, X)\,\bigl(K(X, X) + \sigma^2 I\bigr)^{-1} Y \qquad (2)$$

$$\Sigma = K(X_*, X_*) + \sigma^2 I - K(X_*, X)\,\bigl(K(X, X) + \sigma^2 I\bigr)^{-1} K(X, X_*) \qquad (3)$$

If there are $n$ training and $n_*$ testing data points, then $K(X, X_*)$ is the $n \times n_*$ matrix of covariances evaluated at all pairs of training and testing points. The same holds for $K(X, X)$, $K(X_*, X_*)$ and $K(X_*, X)$. The covariance matrix K (where $K_{ij} = k(x_i, x_j)$) is generated with the help of a covariance function, which plays the same role as the kernel function in SVM. If the kernel k and the noise level $\sigma^2$ are known, Eqs. (2) and (3) are sufficient for inference. Otherwise, they are obtained by minimizing the negative log-posterior [21]:

$$p(\sigma^2, k) = \tfrac{1}{2}\, Y^{T} \bigl(K + \sigma^2 I\bigr)^{-1} Y + \tfrac{1}{2} \log \bigl|K + \sigma^2 I\bigr| - \log p(\sigma^2) - \log p(k) \qquad (4)$$

The hyperparameters are determined by differentiating Eq. (4) with respect to the parameters of k and $\sigma^2$ and minimizing it.
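The predictive mean and covariance of Eqs. (2) and (3) can be sketched in plain Python as follows. This is a minimal illustration that assumes the standard zero-mean Gaussian process formulas with the noise variance added on the diagonal, not the software used in the study. The RBF covariance, the default `gamma` and `noise_var` values, the helper `solve` and the one-dimensional toy data are all assumptions chosen for the example.

```python
import math


def rbf(a, b, gamma=0.5):
    """RBF covariance k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def solve(A, B):
    """Solve A X = B (A: n x n, B: n x m) by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [[M[i][n + j] / M[i][i] for j in range(len(B[0]))] for i in range(n)]


def gp_posterior(X, y, X_star, kernel=rbf, noise_var=1e-2):
    """Return the predictive mean (list) and covariance (matrix) of Y* given Y."""
    n, m = len(X), len(X_star)
    # K(X, X) + sigma^2 I
    K_y = [[kernel(X[i], X[j]) + (noise_var if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    # K(X, X*), an n x n* matrix
    K_xs = [[kernel(X[i], X_star[j]) for j in range(m)] for i in range(n)]
    alpha = solve(K_y, [[v] for v in y])  # (K + sigma^2 I)^-1 Y
    V = solve(K_y, K_xs)                  # (K + sigma^2 I)^-1 K(X, X*)
    mean = [sum(K_xs[i][j] * alpha[i][0] for i in range(n)) for j in range(m)]
    cov = [[kernel(X_star[a], X_star[b]) + (noise_var if a == b else 0.0)
            - sum(K_xs[i][a] * V[i][b] for i in range(n))
            for b in range(m)] for a in range(m)]
    return mean, cov


# Toy one-dimensional data, made up purely for illustration.
X_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 1.0, 0.5]
mean, cov = gp_posterior(X_train, y_train, [[1.0], [1.5]], noise_var=1e-6)
print("predictive mean:", mean)
print("predictive variances:", [cov[i][i] for i in range(2)])
```

Real inputs such as rotational and transverse speed would normally be scaled before the kernel is applied, because their raw ranges differ by more than an order of magnitude.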

Support vector machine
The SVM is a machine learning approach derived from the statistical learning theory of Vapnik [22]. The goal of this approach is to determine the location of decision boundaries that produce the optimal separation of the classes. The support vector machine minimizes the generalization error. Compared with ANN modeling, SVM takes less time and cost to model the data with the least error [23]. The graphical representation of hyperplanes for the SVM approach is shown in Fig. 6. Consider training data $(x_i, y_i)$, where $x \in R^N$ is a vector in N-dimensional space and $y \in \{-1, +1\}$ is the class label. The training set is called linearly separable if there exist a vector w (which determines the orientation of the discriminating plane) and a scalar b (which determines the offset of the discriminating plane from the origin) such that

$$y_i\,(w \cdot x_i + b) - 1 \ge 0 \qquad (5)$$

The hypothesis space is given by the set of functions

$$f_{w,b} = \operatorname{sign}(w \cdot x + b) \qquad (6)$$
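The role of the orientation vector w and the offset b can be illustrated with a small sketch. It assumes the usual linear SVM conventions, i.e., a decision value of w · x + b, a class given by its sign, and the margin condition y (w · x + b) − 1 ≥ 0. The sample points, labels and candidate planes are made up for the example.

```python
def decision_value(w, b, x):
    """Return w . x + b for one sample x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return the class label (+1 or -1) given by the sign of the decision value."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def separates_with_margin(w, b, samples, labels):
    """Check the margin condition y_i (w . x_i + b) - 1 >= 0 for every sample."""
    return all(y * decision_value(w, b, x) - 1 >= 0 for x, y in zip(samples, labels))


# Illustrative two-dimensional samples with labels in {-1, +1}.
samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 3.0), (4.0, 3.0)]
labels = [-1, -1, 1, 1]

w, b = (1.0, 1.0), -3.5
print([classify(w, b, x) for x in samples])
print(separates_with_margin(w, b, samples, labels))
```

Scaling w and b down by a common factor leaves every predicted label unchanged but can violate the margin condition, which is why the condition fixes the scale of the discriminating plane.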

Multi-linear regression (MLR)
In MLR, the data sets are converted into logarithmic form. The relationship between the ultimate tensile strength of the FS welded joint and the rotational speed (R) and feed rate (F) is non-linear; therefore, the following functional relationship is assumed initially:

$$\mathrm{UTS\ (MPa)} = p_i\, R^{a}\, F^{b} \qquad (7)$$

where $p_i$ is the proportionality constant. Taking logarithms on both sides of Eq. (7) converts the equation into a linear form:

$$\log(\mathrm{UTS}) = \log p_i + a \log R + b \log F \qquad (8)$$

The above equation is multi-linear with two explanatory variables. To develop the multi-linear model, log(UTS) is taken as the output parameter and the two explanatory variables, log R and log F, are taken as input parameters. The MLR fit provides the values of $p_i$, a and b, which gives an equation in the form of Eq. (7):

$$\mathrm{UTS} = 181.882\, R^{0.0494} \ldots$$
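The log-linear fit described above can be sketched with ordinary least squares in plain Python. This is only an illustration under stated assumptions: it fits all 24 runs whose UTS values are legible in the run listing (runs 2 to 25), whereas the paper fits a training subset, so the coefficients it returns should not be expected to match the published equation. The names `fit_power_law`, `predict_uts` and `solve` are hypothetical helpers.

```python
import math

# (rotational speed R in rpm, feed rate F in mm/min, UTS in MPa), runs 2-25.
DATA = [
    (500, 40, 170), (500, 63, 152), (500, 80, 135), (500, 100, 79),
    (710, 20, 199), (710, 40, 192), (710, 63, 181), (710, 80, 160), (710, 100, 110),
    (1000, 20, 220), (1000, 40, 216), (1000, 63, 203), (1000, 80, 182), (1000, 100, 143),
    (1400, 20, 238), (1400, 40, 242), (1400, 63, 225), (1400, 80, 193), (1400, 100, 170),
    (2000, 20, 215), (2000, 40, 242), (2000, 63, 260), (2000, 80, 220), (2000, 100, 192),
]


def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_power_law(data):
    """Least-squares fit of log(UTS) = log(p) + a*log(R) + b*log(F); returns (p, a, b)."""
    rows = [(1.0, math.log(R), math.log(F)) for R, F, _ in data]
    targets = [math.log(uts) for _, _, uts in data]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    log_p, a, b = solve(xtx, xty)
    return math.exp(log_p), a, b


def predict_uts(p, a, b, R, F):
    """Evaluate the power-law model UTS = p * R**a * F**b."""
    return p * R ** a * F ** b


p, a, b = fit_power_law(DATA)
print(f"UTS = {p:.3f} * R^{a:.4f} * F^{b:.4f}")
```

The exponents a and b do not depend on the base of the logarithm, so natural logarithms are used here for convenience.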

Details of kernels for GPR and SVM
Machine learning approaches such as SVM and GPR rely on kernel functions. Numerous researchers have suggested kernels on the basis of their performance, e.g., the radial basis function (RBF) kernel and the Pearson VII universal kernel (PUK) [24][25][26]. In this study, the following two kernel functions are used:

$$k_{\mathrm{RBF}}(x_i, x_j) = \exp\bigl(-\gamma\, \lVert x_i - x_j \rVert^{2}\bigr)$$

$$k_{\mathrm{PUK}}(x_i, x_j) = \frac{1}{\left[\,1 + \left(\dfrac{2\sqrt{\lVert x_i - x_j \rVert^{2}}\;\sqrt{2^{1/\omega} - 1}}{\sigma}\right)^{2}\right]^{\omega}}$$

where $\gamma$, $\sigma$ and $\omega$ are kernel-specific parameters. Gamma ($\gamma$) is a user-defined parameter of the RBF kernel, which affects the accuracy obtained on the training data set. Sigma ($\sigma$) and omega ($\omega$) are the parameters of PUK. Sigma controls the Pearson width, whereas omega is the tailing factor of the peak when PUK is employed for curve-fitting applications. The precision of both GPR and SVM depends on a suitable selection of user-defined parameters. Besides the choice of kernel and kernel-specific parameters, SVM requires the regularization parameter C and the size of the error-insensitive zone $\varepsilon$, whereas GPR requires an optimum value of the Gaussian noise level. This is the noise level after the target has been normalized, standardized or left unchanged. In the present study, a large number of trials are carried out with different combinations of user-defined parameters for GPR and SVM in order to select C, $\gamma$, $\sigma$, $\omega$ and the Gaussian noise. The criteria for choosing the user-defined parameters are minimization of the root mean square error and maximization of the correlation coefficient. On the basis of these criteria, the optimum values of the user-defined parameters are selected. The kernel-specific parameters are the same for both regression techniques. Table 2 lists the optimum values of the user-defined parameters for PUK and RBF kernel-based SVM and GPR. In this study, SVM_PUK, SVM_RBF, GPR_PUK and GPR_RBF denote SVM and GPR with the Pearson VII and radial basis kernel functions, respectively. Root mean square error (RMSE) and correlation coefficient (CC) are used to check the performance of both techniques.

Table 2 User-defined parameters for GPR and SVM using RBF and PUK kernel functions.
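The two kernels can be written as small Python functions. The sketch assumes the commonly cited forms: the RBF kernel exp(−γ‖x − z‖²) and the Pearson VII universal kernel 1 / [1 + (2‖x − z‖ √(2^(1/ω) − 1) / σ)²]^ω. The parameter values in the example are placeholders, not the optimum values of Table 2.

```python
import math


def squared_distance(x, z):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(x, z))


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel with width parameter gamma."""
    return math.exp(-gamma * squared_distance(x, z))


def puk_kernel(x, z, sigma, omega):
    """Pearson VII universal kernel: sigma sets the width, omega the tailing of the peak."""
    distance = math.sqrt(squared_distance(x, z))
    scaled = 2.0 * distance * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega


# Placeholder parameter values, for illustration only.
x, z = (0.2, 0.5), (0.4, 0.1)
print("RBF:", rbf_kernel(x, z, gamma=1.0))
print("PUK:", puk_kernel(x, z, sigma=1.0, omega=1.0))
```

For any omega, the PUK value falls to one half when the two points are sigma/2 apart, which is why sigma is described as controlling the width of the peak.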

Results and discussion
Machine learning approaches, i.e., GPR_PUK, GPR_RBF, SVM_PUK, SVM_RBF and MLR, are used to predict the UTS for each observation in the training and testing data sets, as shown in Tables 3 and 4. Two performance measures, the coefficient of correlation (CC) and RMSE, are used to evaluate the model outcomes and to give a better understanding of the UTS values obtained through GPR_PUK, GPR_RBF, SVM_PUK and SVM_RBF. Table 5 lists the values of CC and RMSE for the GPR, SVM and MLR approaches. The actual and predicted values of UTS for Pearson VII and RBF kernel-based GP and SVM regression, as well as MLR, are plotted in Figs. 7 and 8 for the training and testing data sets, respectively. Fig. 7 is for the 19 training points and Fig. 8 is for the 6 test points. Each graph contains three lines: one perfect-agreement line (plotted at an angle of 45°) and two error lines for examining the scatter of the data around the perfect line. The error lines are plotted at ±10% error.
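The two performance measures and the ±10% error band can be computed as in the sketch below. It assumes the usual definitions, i.e., RMSE as the square root of the mean squared difference and CC as the Pearson correlation between actual and predicted values. The predicted values in the example are hypothetical and are not taken from Tables 3 and 4.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def within_error_band(actual, predicted, tolerance=0.10):
    """Flag each prediction that lies within +/- tolerance of the actual value."""
    return [abs(p - a) <= tolerance * abs(a) for a, p in zip(actual, predicted)]


# Actual UTS values from the run listing, paired with hypothetical predictions.
actual_uts = [170, 199, 216, 242, 260, 192]
predicted_uts = [176, 195, 210, 238, 249, 215]

print("RMSE (MPa):", rmse(actual_uts, predicted_uts))
print("CC:", correlation_coefficient(actual_uts, predicted_uts))
print("inside 10% band:", within_error_band(actual_uts, predicted_uts))
```

A point on the 45° line has zero error, so a model whose points all fall between the two error lines predicts every UTS value to within 10% of the measurement.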
The evaluation of the results suggests that the GPR technique is in better agreement with the experimental outcomes than SVM and MLR, as indicated by Table 5. The predicted values for the training and test data sets agree well with the experimental values. This shows that the GPR approaches are effective in modeling the non-linear relationship between the UTS and the input parameters. A coefficient of correlation of 0.97 (RMSE 5.94 MPa) is attained by RBF kernel-based GPR [...] (Fig. 9). It is clear from the figure that the values predicted by the RBF and Pearson VII based GPR approaches are very close to the actual values of UTS. The predicted values follow the same pattern as the actual values. Fig. 10 shows the residuals between the predicted and actual values of UTS for the test data set. It shows that GPR has smaller residuals than SVM and MLR.

Fig. 7. Actual vs predicted values of UTS using GP, SVM, and MLR using training data.