Composition design of 7XXX aluminum alloys optimizing stress corrosion cracking resistance using machine learning

In this paper, three strategies based on machine learning methods were applied to the composition design of Al-Zn-Mg-Cu series alloys, with stress corrosion cracking (SCC) resistance as the targeted property. Comparing the strategies showed that the efficient global optimization (EGO) method performed better than the response surface optimization (RSO) method and much better than the Random method. Among the designed alloys, Al-6.05Zn-1.46Mg-1.32Cu-0.13Zr-0.02Ti-0.50Y-0.23Ce (named the EGO alloy) had the best stress corrosion cracking resistance. The slow strain rate test (SSRT) technique was used to compare the EGO alloy with the traditional 7N01 alloy. The SCC susceptibility index I_SCC of the EGO alloy was lower than that of the 7N01 alloy after both single and double aging treatments. XRD, SEM, and EDS analysis showed that the rare earth elements formed Al8Cu4(Y, Ce) and a quadrilateral Al20Ti2(Y, Ce) phase in the EGO alloy.


Introduction
7XXX series Al alloys (Al-Zn-Mg-Cu series alloys) have been widely applied in the body structures and key components of high-speed trains [1][2][3] due to their excellent mechanical properties, low density, and good weldability. However, stress corrosion cracking (SCC) remains an issue in 7XXX series Al alloys and may lead to significant service failures [4,5]. SCC resistance has been investigated intensively, and a number of methods have been proposed to improve it [6][7][8]. It has been shown that SCC resistance depends on many factors, such as alloy composition, heat treatment regime, and corrosion solution [9]. Chen et al [10] pointed out that when the Zn/Mg ratio was 4.53, the alloy had excellent SCC resistance, with an I_SCC of 3.4%. The SCC resistance increased significantly as the Zn/Mg ratio decreased, which was attributed to the narrower precipitate-free zone. Other work indicated that the addition of Cu clearly improved the strength and corrosion resistance but degraded the welding properties because it extended the solidification temperature range [11,12]. Also, the addition of microalloying elements such as Zr and Sc [13][14][15] may enhance the corrosion resistance. Therefore, a reasonable ratio of elements, whether main alloying elements or microelements, can effectively improve the SCC resistance of 7XXX series Al alloys.
Material design has long been an active research topic. Designing materials with targeted properties is often time-consuming and laborious due to the complexity of the search space. Composition design methods are varied and include theoretical calculations, orthogonal tests, modifications of traditional compositions, and trial-and-error. The phase field method, which can effectively simulate the microstructure evolution of various materials under different process parameters, is a significant part of the materials simulation space [16][17][18][19][20]. In the case of multiphase crystalline solids, the phase field method is an efficient tool to predict the properties of a complex alloy. Ma et al [21] employed a phase field model incorporating applied thermal stress to predict the equivalent elastic modulus; its relationship with the solid fraction was obtained quantitatively by adding an external constant stress boundary condition. Zhang et al [22] established a thermodynamically consistent phase field model to describe the sintering process with multiphase powders. The results demonstrated that the microstructure and solute distribution during the sintering process can both be obtained quantitatively.
Given the complexity of the design problem and the recent advances in modelling large data sets, machine learning methods have been applied in the field of material science [23,24]. There has been much interest in using machine learning for material design, which can help researchers optimize materials with targeted properties in less time and at lower cost [25,26]. Xue et al [27] employed an adaptive design strategy coupled with experiments to discover the target alloy, Ti50.0Ni46.7Cu0.8Fe2.3Pd0.2, from a potential space of 800000 compositions. Wen et al [28] searched for high entropy alloys (HEAs) with high hardness in the Al-Co-Cr-Cu-Fe-Ni system by formulating a material design strategy based on a machine learning model. These studies demonstrate the feasibility of machine learning as a new route to material design and discovery. Generally, machine learning is used to establish a functional model between inputs (such as the composition) and outputs (such as the target property) based on the known observation data [26][29][30][31]. Then, based on the model, the targeted property of unexplored materials is estimated, which leads researchers to choose the candidate with the best estimated property for experimental measurement. The methods of selecting candidates from the model estimates are referred to as strategies. Different strategies for searching the material design space differ in their results and efficiency; hence, a suitable strategy is of vital significance for material composition design.
So far, few researchers have applied machine learning to the composition design of 7XXX Al alloys. In this paper, three material design strategies based on machine learning methods were investigated for optimizing the stress corrosion cracking (SCC) resistance of Al-Zn-Mg-Cu-Zr-Ti-Y-Ce alloys. Then, the optimization ability of the three strategies was compared. Finally, the designed Al alloy with the best SCC resistance was compared with the traditional 7N01 alloy.
Material design strategies and experimental methods

Initial data set
For this paper, several elements which exert a positive effect on the SCC behavior, such as Zr, Ti, Y, and Ce, were added into 7XXX series Al alloys. The designed alloys are denoted Al-x1Zn-x2Mg-x3Cu-x4Zr-x5Ti-x6Y-x7Ce. Subsequently, 40 data sets with the alloy composition as input and the electrical conductivity (σ) as output were obtained. The initial 40 data sets are given in the Appendix. As is well known, SCC generally occurs in potential-sensitive regions and is intrinsically correlated with the electrical properties of Al alloys. Research has suggested that the SCC resistance of Al alloys is positively correlated with their electrical conductivity [32,33]. Therefore, electrical conductivity is often used to evaluate the SCC resistance of Al alloys [34][35][36]. Hence, electrical conductivity (σ) is taken as the preliminary target output value.

Material design strategies based on a machine learning method
The learning loop was divided into six parts, which are illustrated in figure 1. The first part was setting up the initial data set, consisting of the 40 data sets mentioned above, with the measured electrical conductivity as the output. In the second part, a surrogate model called Kriging was trained to establish the relationship between composition and electrical conductivity. The Kriging model can be represented as follows:

$$y(x) = \sum_{j=1}^{k} \beta_j f_j(x) + Z(x)$$

where $\beta_j$ is the coefficient of the j-th basis function, $f_j(x)$ represents the basis function, k is the number of basis functions, and $Z(x)$ expresses the Gaussian random process error, which obeys the normal distribution. The uncertainty of the prediction at a point is expressed by its mean square error (MSE).
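The Kriging surrogate described above can be illustrated with a minimal sketch. It assumes ordinary Kriging (a single constant basis function), a Gaussian correlation function with a fixed parameter `theta`, and a small `nugget` for numerical stability; none of these choices are stated in the paper, and the toy data are made up rather than taken from the 40-alloy data set.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def corr(a, b, theta):
    """Gaussian correlation between two input vectors (assumed kernel)."""
    return math.exp(-theta * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kriging_predict(X, y, x_new, theta=1.0, nugget=1e-10):
    """Return (prediction, MSE) of an ordinary Kriging model at x_new."""
    n = len(X)
    R = [[corr(X[i], X[j], theta) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Ri_y = solve(R, y)
    Ri_1 = solve(R, [1.0] * n)
    mu = sum(Ri_y) / sum(Ri_1)                 # estimated constant trend
    resid = [yi - mu for yi in y]
    Ri_res = solve(R, resid)
    sigma2 = sum(a * b for a, b in zip(resid, Ri_res)) / n  # process variance
    r = [corr(xi, x_new, theta) for xi in X]
    y_hat = mu + sum(ri * wi for ri, wi in zip(r, Ri_res))
    Ri_r = solve(R, r)
    r_Ri_r = sum(a * b for a, b in zip(r, Ri_r))
    mse = sigma2 * (1.0 - r_Ri_r + (1.0 - sum(Ri_r)) ** 2 / sum(Ri_1))
    return y_hat, max(mse, 0.0)

# Toy 1-D data (made up for illustration only).
X_toy = [[0.0], [0.5], [1.0]]
y_toy = [1.0, 2.0, 1.5]
print(kriging_predict(X_toy, y_toy, [0.5]))   # at a sampled point
print(kriging_predict(X_toy, y_toy, [3.0]))   # far from all samples
```

The model interpolates the sampled points, so the MSE is close to zero there and grows away from the data, which is the uncertainty that a search strategy can draw on. In practice `theta` would be estimated from the data, for example by maximum likelihood.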
Then, in the third part, the efficient global optimization (EGO), response surface optimization (RSO), and Random methods were applied to search for the next candidate in the unexplored space. The Random strategy resembles the trial-and-error approach commonly used in traditional material design: input points are generated randomly in the unexplored space, so the outcome has the maximum uncertainty. The RSO method takes the maximum value predicted by the Kriging model in the unsearched space as the optimal result, which is an exploitation process. The EGO strategy improves on the RSO strategy by combining the predicted value with its MSE in a function called the 'expected improvement (EI)', which determines the next iteration point. Let $f_{\max}$ be the current best function value. Before x is sampled, the value y(x) is uncertain. However, the predicted value, $\hat{y}$, and standard deviation, s, of the prediction function can be used to establish the normal random variable $Y \sim N(\hat{y}, s^2)$. Formally, for the maximization considered here, the improvement at the point x is

$$I(x) = \max(Y - f_{\max},\, 0)$$

To obtain the expected improvement, the expected value is taken as follows:

$$E[I(x)] = (\hat{y} - f_{\max})\,\Phi\!\left(\frac{\hat{y} - f_{\max}}{s}\right) + s\,\phi\!\left(\frac{\hat{y} - f_{\max}}{s}\right)$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the standard normal density and distribution function, respectively. The search process of EI is called exploration. EI is able to balance the search for a better target value against the uncertainty of the model. By selecting the point with the maximum EI, both global search and convergence are taken into account. The three strategies therefore choose the next candidate differently. For the Random method, the randomly generated input value is regarded as the candidate. For the RSO method, the optimal value predicted by the response surface is treated as the candidate. For the EGO method, the input point corresponding to the maximum EI is the next candidate. Candidates obtained by the three strategies were heat treated with the same parameters and tested in the same way.
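How the three strategies can pick different candidates from the same surrogate is sketched below. The sketch assumes the standard closed-form EI for a normally distributed prediction, written for maximization since conductivity is being maximized; the name `f_best`, the candidate labels, and the predicted values are hypothetical. Only the current best value of 39.092% IACS comes from the text.

```python
import math
import random

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(y_hat, s, f_best):
    """Closed-form EI for maximization; s is the predicted standard deviation."""
    if s <= 0.0:                      # no uncertainty: improvement is deterministic
        return max(y_hat - f_best, 0.0)
    z = (y_hat - f_best) / s
    return (y_hat - f_best) * normal_cdf(z) + s * normal_pdf(z)

def pick_candidates(predictions, f_best, rng):
    """predictions maps a candidate label to (y_hat, s); returns one pick per strategy."""
    labels = list(predictions)
    rso = max(labels, key=lambda k: predictions[k][0])            # exploitation only
    ego = max(labels, key=lambda k: expected_improvement(*predictions[k], f_best))
    rnd = rng.choice(labels)                                      # trial-and-error
    return {"Random": rnd, "RSO": rso, "EGO": ego}

# Hypothetical surrogate outputs (conductivity in % IACS) for two candidates.
predictions = {
    "A": (39.0, 0.05),   # high prediction, well-explored region
    "B": (38.8, 1.5),    # slightly lower prediction, large uncertainty
}
picks = pick_candidates(predictions, f_best=39.092, rng=random.Random(0))
print(picks["RSO"], picks["EGO"])
```

In this toy case RSO selects candidate A because its prediction is highest, while EGO selects candidate B because its larger uncertainty leaves more room for improvement over the current best.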
In the last part, the three different candidates are fed back into the database to extend it. However, if the required property is obtained, the material design loop is terminated immediately.
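The feedback step and the early exit can be sketched as a short loop for a single strategy. The function names `propose` and `measure`, the `target` threshold, and the toy one-dimensional functions are all hypothetical; `measure` stands in for alloy synthesis and conductivity measurement, and the default of four iterations mirrors the number used in this paper.

```python
def design_loop(X, y, propose, measure, target, max_iter=4):
    """Run one strategy's feedback loop; stop early once the target is reached."""
    X, y = list(X), list(y)          # work on copies of the database
    n_done = 0
    for _ in range(max_iter):
        x_new = propose(X, y)        # strategy picks the next candidate
        y_new = measure(x_new)       # stands in for synthesis and measurement
        X.append(x_new)              # feed the result back into the database
        y.append(y_new)
        n_done += 1
        if y_new >= target:          # required property obtained: leave the loop
            break
    return X, y, n_done

# Toy usage with made-up 1-D functions (not alloy data).
def toy_measure(x):
    return -(x - 1.0) ** 2           # hypothetical property, best at x = 1

def toy_propose(X, y):
    best_x = X[y.index(max(y))]      # step away from the current best point
    return best_x + 0.25

X0 = [0.0, 0.5]
y0 = [toy_measure(x) for x in X0]
X1, y1, n_iter = design_loop(X0, y0, toy_propose, toy_measure, target=-0.01)
print(n_iter, X1[-1], y1[-1])
```

In a real loop, `propose` would refit the surrogate on the extended database before selecting the next candidate.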

Experimental procedures
The detailed manufacturing process of the alloys used in the learning loop is shown in figure 2(a). The Al-x1Zn-x2Mg-x3Cu-x4Zr-x5Ti-x6Y-x7Ce alloys calculated in the learning loop were prepared in an electric resistance furnace, and the burning loss rate of the different elements was taken into account. The specimens were machined to dimensions of 20 × 20 × 5 mm³ and polished using 400, 800, 1200, and 1500 grit abrasive papers to guarantee a consistent surface. The specimens were then solution treated at 470°C for 2 h and aged at 120°C for 8 h and 170°C for 12 h. Lastly, the electrical conductivities were measured with a 7501 eddy current conductometer.
The detailed manufacturing process of the Al alloy with the highest conductivity obtained by the EGO strategy (called 'EGO alloy' below) is shown in figure 2(b). The EGO alloys were cast in a preheated cylindrical iron mold with an interior diameter of 90 mm. After a homogenizing treatment at 465°C for 24 h, the obtained ingots were machined into ∅86 mm cylinders and then extruded at 420°C with an extrusion ratio of 10. Then, the EGO alloy and 7N01 alloy were heat treated with the same parameters, as shown in table 1, including a single-aging treatment (referred to as '(1)') and a double-aging treatment (referred to as '(2)'). The final SCC susceptibility was studied using a slow strain rate test (SSRT) at a strain rate of 1.0 × 10⁻⁶ s⁻¹ in air and in a 3.5 wt% NaCl solution. The specimens were polished with 1500 grit abrasive paper before testing. To ensure reproducibility, each test was repeated at least three times. The SCC susceptibility is evaluated by:

$$I_{SCC} = \left(1 - \frac{A_{sol}}{A_{air}}\right) \times 100\%$$

where $A_{air}$ and $A_{sol}$ are the areas enclosed by the stress-strain curves of specimens tested in air and in the 3.5 wt% NaCl solution, respectively. To identify the phases of the EGO alloy, x-ray diffraction (XRD) analysis was performed with a Bruker D8 ADVANCE A25X. The specimens were 10 × 10 × 4 mm³. The x-ray scanning ranged from 5° to 85°. Fracture morphology and microstructure were studied with a JEOL JSM 7800F field emission scanning electron microscope (SEM), and the elemental determination employed an Oxford X-Max 80 energy dispersive spectrometer (EDS).
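The susceptibility index can be computed from two stress-strain curves as sketched below. The sketch assumes the usual definition I_SCC = (1 − A_sol/A_air) × 100%, with each area obtained by trapezoidal integration; the function names and the curve data are made up for illustration and are not measurements from this work.

```python
def curve_area(strain, stress):
    """Trapezoidal area under a stress-strain curve given as two equal-length lists."""
    return sum(
        0.5 * (stress[i] + stress[i + 1]) * (strain[i + 1] - strain[i])
        for i in range(len(strain) - 1)
    )

def scc_index(area_air, area_sol):
    """SCC susceptibility index in percent (assumed definition, see lead-in)."""
    return (1.0 - area_sol / area_air) * 100.0

# Made-up curves: strain in %, stress in MPa.
strain_air = [0.0, 2.0, 6.0, 12.0]
stress_air = [0.0, 300.0, 380.0, 360.0]
strain_sol = [0.0, 2.0, 6.0, 10.0]
stress_sol = [0.0, 300.0, 375.0, 350.0]

a_air = curve_area(strain_air, stress_air)
a_sol = curve_area(strain_sol, stress_sol)
print(round(scc_index(a_air, a_sol), 2))
```

A curve in solution that nearly overlaps the curve in air gives an index close to zero, while a large loss of ductility or strength in solution pushes the index up.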

Comparison of the three material design strategies
After establishing the Kriging model, the accuracy of the surrogate model needed to be verified before the model could be used for optimization. Several common methods exist for model verification. For the Kriging response surface, the most accurate verification method is 'cross-validation.' Jones et al [37] pointed out that leaving out a single sample point from the training data has minimal impact on the accuracy of the fitted response surface unless the number of sample points is very small, which is also a prerequisite for cross-validation. The standardized cross-validation residual was proposed for evaluating the surface accuracy, and the formula is shown below:

$$r_i = \frac{y(x_i) - \hat{y}_{-i}(x_i)}{s_{-i}(x_i)}$$

where $y(x_i)$ is the measured value at the i-th sample point, and $\hat{y}_{-i}(x_i)$ and $s_{-i}(x_i)$ are the prediction and its standard error at $x_i$ from the model fitted without that point. Figure 3 shows the response surface cross-validation of the 40 initial training data, in which (a) and (b) are the initial response surface cross-validation and the standardized cross-validation residuals, respectively. It can be observed clearly from figure 3(a) that some data points are close to the standard 45° line, while others are far away. This indicates that the Kriging model has good predictability for many points and poor predictability for a few points, which mainly depends on whether the distribution of the initial points is concentrated. Figure 3(b) shows that all the residuals fall in the range of [−3, 3] with a uniform distribution and no obvious trend with respect to the predicted value. Therefore, the Kriging response surface model built from the initial training set meets the precision requirement, which allowed it to be used for the material design strategies.
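The leave-one-out check can be sketched as follows. The sketch assumes the standardized residual is the measured value minus the leave-one-out prediction, divided by the leave-one-out standard error, and that residuals within [−3, 3] are acceptable. The `fit_predict` callable is hypothetical, and the stand-in predictor used here (mean and sample standard deviation of the remaining points) is only a placeholder for a Kriging model.

```python
import statistics

def standardized_cv_residuals(X, y, fit_predict):
    """Leave each point out in turn and standardize its prediction error."""
    residuals = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        y_hat, s = fit_predict(X_train, y_train, X[i])   # model fitted without point i
        residuals.append((y[i] - y_hat) / s)
    return residuals

def passes_cv_check(residuals, limit=3.0):
    """True if every standardized residual lies within [-limit, limit]."""
    return all(abs(r) <= limit for r in residuals)

def mean_predictor(X_train, y_train, x_new):
    """Placeholder surrogate (not Kriging): ignores x_new entirely."""
    return statistics.mean(y_train), statistics.stdev(y_train)

# Made-up data for illustration only.
X_demo = [[0.0], [1.0], [2.0], [3.0]]
y_demo = [1.0, 2.0, 3.0, 4.0]
res = standardized_cv_residuals(X_demo, y_demo, mean_predictor)
print([round(r, 3) for r in res], passes_cv_check(res))
```

Replacing `mean_predictor` with a function that refits the surrogate on the reduced data and returns its prediction and standard error gives the check described in the text.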
After cross-validation, the obtained Kriging model was employed for material design optimization. The material design strategies mentioned above (EGO, RSO, and Random) were each run for four iterations of the feedback loop, and an Al alloy with a different composition was obtained in each iteration; thus, four alloys were predicted and synthesized for each of the three methods (twelve alloys in total, listed in table 2). After each iteration, the electrical conductivities of the calculated Al alloys were measured, and the results, added to the training data, led to a new round of regression and design. Figure 4 shows the results predicted by each of the material design methods. The Random method behaves like the common trial-and-error method, and the conductivities it obtained were low due to the complexity of the search space. The conductivities of the Al alloys designed by the RSO method are much higher than those of the Random method: the values of the four iterations are 38.899, 38.790, 39.992, and 39.756% IACS, respectively. The 3rd and 4th iterations had higher conductivity than the maximum of the initial training set (39.092% IACS). Based on a trade-off between exploitation and exploration, the EGO strategy performs better than the RSO strategy and produces the best performer in every iteration. In the 3rd iteration, the Al alloy (Al-6.05Zn-1.46Mg-1.32Cu-0.13Zr-0.02Ti-0.50Y-0.23Ce) with the highest conductivity of 42.15% IACS was obtained. The cross-validation results of the final model after iteration using the EGO strategy are shown in figure 5. The 40 initial data points do not lie exactly on the standard 45° line but are scattered on both sides of it. However, the error ranges of almost all points intersect the standard 45° line. At the same time, the four points iterated by EGO are close to the 45° line, which indicates that the prediction ability of the model (near the iteration points) is significantly improved.

SCC susceptibility tested by SSRT
The typical stress-strain curves of the EGO and 7N01 alloys in air and in the 3.5 wt% NaCl solution are displayed in figure 6. Figure 6(a) shows the curves of the alloys after the single-aging treatment (EGO(1) and 7N01(1)). SCC occurs in both the EGO(1) alloy and the 7N01(1) alloy. The ultimate strain of the EGO(1) alloy decreased somewhat while its strength was almost unchanged, whereas the ultimate strength and strain of the 7N01(1) alloy decreased more severely. Figure 6(b) shows the curves of the alloys after the double-aging treatment (EGO(2) and 7N01(2)). The curves of the EGO(2) alloy in 3.5 wt% NaCl and in air almost overlap, which indicates that almost no SCC occurred, while the ultimate strain of the 7N01(2) alloy decreased significantly. Figure 7 shows the loss factors (I_SCC) of the EGO and 7N01 alloys after the single- and double-aging treatments. A high I_SCC means high SCC susceptibility [38,39]. Figure 7 reveals that the I_SCC of the EGO(1) alloy is only 6.82%, which is significantly lower than that of the 7N01(1) alloy, 23.55%. This indicates that the EGO(1) alloy has lower stress corrosion cracking susceptibility. Meanwhile, the I_SCC of the EGO(2) alloy is only 1.15%, whereas that of the 7N01(2) alloy is much higher at 13.16%. The I_SCC results confirm that the EGO alloy has better SCC resistance than the traditional 7N01 alloy. In particular, the EGO alloy with the double-aging treatment shows almost no stress corrosion cracking. Figures 8(a) and (c) show the SSRT fracture micrographs of the single- and double-aged EGO alloys in air. The fracture mode of both the single- and double-aged EGO alloys is mainly ductile transgranular fracture, as indicated by the large number of dimples on the fracture surface. The effect of heat treatment on the SSRT fracture mode is not significant in air.
Figure 8(b) shows the SSRT fracture morphology of the single-aged EGO alloy in a 3.5 wt% NaCl solution. Some corrosion marks appear on the fracture surface, as well as cleavage steps and many dimples, which indicates that the fracture mode is a mixed ductile and brittle fracture. Figure 8(d) shows the SSRT fracture morphology of the double-aged EGO alloy in a 3.5 wt% NaCl solution. Similarly, there are some corrosion marks and many dimples and tearing ridges on the fracture surface. In general, SCC fracture surfaces show more intergranular fracture marks, such as rock-candy fracture or cleavage steps, because SCC begins at grain boundaries. However, the SEM micrographs in figures 8(b) and (d) show little trace of intergranular failure, which indicates that the EGO alloy has good SCC resistance. Figure 9 shows the XRD patterns of the as-cast and double-aged EGO alloys. The major phases in the as-cast alloy were α-Al and Al8Cu4Ce, and those in the double-aged EGO alloy were α-Al, MgZn2, and Al8Cu4Ce. This indicates that the rare earth elements exist in the form of Al8Cu4Ce. Figure 10 shows the SEM images and EDS patterns of the as-cast and double-aged EGO alloys. Phases of various shapes are distributed in the as-cast EGO alloy, as shown in figure 10(a). According to the EDS energy spectrum, the elements at point A include Al, Zn, Mg, Cu, Y, and Ce. The Mg content is very low and can almost be ignored. Combined with the XRD analysis, it can be concluded that this phase is Al8Cu4(Y, Ce), with some Zn dissolved in it. In figure 10(a), another, quadrilateral phase at point B has a different elemental distribution. According to the EDS energy spectrum, the quadrilateral phase contains Ti and the rare earth elements Y and Ce, and it is identified as Al20Ti2(Y, Ce) with some Zn and Mg dissolved in it.
Figure 10(b) shows the SEM images and EDS patterns of the EGO alloy after the double-aging treatment. There are mainly two phases in the alloy, distinguished by a difference in color. The EDS energy spectrum analysis shows that the phases with the two different colors are the same as those in the as-cast aluminum alloy, namely Al8Cu4(Y, Ce) and Al20Ti2(Y, Ce). Al-Zn-Mg-Cu alloy is prone to SCC under the combined action of a corrosive medium and stress. After the aging treatment, there is a prominent potential difference between the η(MgZn2) precipitates at the grain boundaries and the Al matrix in Al-Zn-Mg-Cu alloy, which creates an anodic dissolution channel, also called a corrosion channel. SCC occurs along grain boundaries under the effect of stress. A stress concentration forms at the crack source, which accelerates the crack propagation. Then, the expanding cracks merge into large ones that cause stress corrosion cracking [40,41]. However, the addition of the rare earth elements Ce and Y to the Al-Zn-Mg-Cu alloy leads to the formation of the rare earth phases Al8Cu4(Y, Ce) and Al20Ti2(Y, Ce) mentioned above. Some Zn and Mg are dissolved in the two rare earth phases, which reduces the η(MgZn2) content and makes its distribution at the grain boundaries discontinuous. Therefore, a continuous anodic channel is not easily formed during the corrosion process, which improves the corrosion resistance of the alloy.

Mechanisms for improved SCC resistance
Hydrogen embrittlement also gives rise to SCC behavior. Active hydrogen atoms generated in an aqueous solution concentrate at areas of stress concentration (including the crack tip), at defective grain boundaries, and in areas of high dislocation density, resulting in material embrittlement and hydrogen-induced cracking [41]. However, the rare earth elements Ce and Y have a greater affinity for H atoms due to the characteristics of their electronic structure, and they can dissolve and absorb H atoms in large quantities. Meanwhile, rare earth elements readily react with O to form rare earth oxide films; as a result, a layer of oxide film covering the substrate is able to prevent H from penetrating, which reduces the concentration of H atoms at the grain boundaries, resulting in lower SCC susceptibility [42]. Finally, Ce in corrosive environments tends to react with Cl⁻ and O²⁻ in the corrosion medium, which generates dense reaction products of Ce-containing oxide or Ce-containing chloride. These reaction products are adsorbed on the surface of the Al substrate as a passivation layer, which reduces the pitting corrosion damage caused by Cl⁻ and effectively prevents further corrosion [43,44].

Conclusion
The conclusions of the paper can be summarized as follows:
1. The efficient global optimization (EGO) method, making a trade-off between exploitation and exploration, performed better than the response surface optimization method, and performed much better than the Random method, in finding a 7XXX series Al alloy with improved SCC resistance. Using an initial database containing the composition and conductivity of 40 samples, a learning loop was iterated four times using all three methods with the conductivity as a proxy for SCC resistance. The composition Al-6.05Zn-1.46Mg-1.32Cu-0.13Zr-0.02Ti-0.50Y-0.23Ce, obtained by the EGO strategy, had the best SCC resistance.
2. The EGO alloy had better SCC resistance than the traditional 7N01 alloy. After single-aging and double-aging heat treatments, the I_SCC values of the EGO alloy were significantly lower than those of the 7N01 alloy (6.82% and 1.15% compared to 23.55% and 14.31%).
3. The addition of rare earth elements Y and Ce formed the Al8Cu4(Y, Ce) and quadrilateral Al20Ti2(Y, Ce) phases, which improve the stress corrosion cracking (SCC) resistance of 7XXX Al alloys by decreasing the amount of η(MgZn2) on the grain boundaries, which disrupts the continuous anode channels; by forming rare earth oxide films, which help prevent hydrogen embrittlement; and by creating dense Ce-containing oxides and chlorides, which passivate against pitting damage.
In this paper, the iterative feedback loops were only performed four times due to the limitations of time and experiment. With more iterations, the performance of the alloy could likely be improved further. At the same time, further development of the surrogate model (the Kriging model in this paper) could also reduce the number of iterations needed by the EGO strategy. Only a single objective function (electrical conductivity) was used for composition optimization. A multi-objective function combining electrical conductivity and hardness could be used for optimization so that the mechanical properties and corrosion resistance are improved simultaneously.
Material design by machine learning methods is a significant part of the 'Materials Genome Initiative' proposed in 2011. This paper shows the great potential of machine learning in composition design. Machine learning also has many other applications, including the prediction of material properties and the improvement of material simulation methods. Another effective method, the phase field method mentioned previously, is also one of the main computational tools in the 'Materials Genome Initiative.' It can also be a powerful tool in the future of material property prediction and design. Future advances in material design may result not from the development of a single approach but from a combination of multiple technologies, which will make the field sustainable.