Prediction of hydrogen concentration responsible for hydrogen-induced mechanical failure in martensitic high-strength steels

• Tensile testing under continuous hydrogen-charging.
• Performance of martensitic ultra-high-strength steels in hydrogenated conditions.
• Development of deep learning artificial neural network models.
• Prediction of hydrogen concentration responsible for hydrogen-induced steel failure.


Introduction
The detrimental effect of hydrogen on the mechanical integrity of steels and alloys is a major industrial challenge. Atomic hydrogen enters metallic materials during manufacturing processes or in service through physisorption and chemisorption [1,2]. Being the smallest and lightest element, hydrogen is absorbed after surface adsorption and diffuses through the lattice of steels and alloys, where it tends to be trapped at microstructural features such as interstitial sites, grain boundaries and interfaces [3–5]. During diffusion, hydrogen can be trapped reversibly or irreversibly by microstructural features and defects such as dislocations, voids and precipitates, depending on the local thermodynamic state [6–9]. Consequently, the simultaneous action of a high local stress state and a hydrogen content above a critical concentration causes early degradation of mechanical properties, leading to crack initiation and failure [10–12]. This phenomenon is generally termed hydrogen embrittlement (HE). Real-time experimental observation of the hydrogen-metal interaction leading to early failure is still difficult, and the phenomenon is not fully understood. Nevertheless, several hydrogen-induced failure mechanisms have been proposed.
The mechanisms include hydrogen enhanced decohesion (HEDE) [13,14], hydrogen enhanced localised plasticity (HELP) [15], adsorption induced dislocation emission (AIDE) [16], hydrogen enhanced stress induced vacancies (HESIV) [17], stress-induced hydride formation and cleavage [18] and the newly proposed HEDE + HELP mechanism [19]. The underlying principle of the HEDE mechanism is that hydrogen in the bulk material reduces the interatomic bond strength, leading to decohesion and consequent failure at relatively lower stresses [13]. In reheated-quenched (RQ) and quenched and tempered (QT) martensitic steels, the HEDE mechanism is mostly responsible for failure where the fracture mode is intergranular along the prior austenitic grain (PAG) boundaries [20–22]. The basis of the HELP mechanism is that hydrogen enhances the movement of dislocations, resulting in dislocation pile-up. The movement of dislocations in turn facilitates hydrogen transport and increases its concentration at pile-ups. This mechanism is found to be responsible for the observed plastic strain localisation in hydrogen-induced brittle fractures and for fracture modes transitioning from ductile to transgranular, quasi-cleavage and intergranular in martensitic steels [15,23]. Generally, for direct-quenched (DQ) and QT steels, the synergistic activity of HEDE + HELP, as emphasized by Djukic et al. [19], is seen in action when hydrogen transported by dislocations reaches a critical concentration at pile-ups caused by dislocation barriers such as carbide and inclusion-matrix interfaces. At this critical concentration, hydrogen at interfaces instigates the HEDE mechanism, leading to eventual early fracture [19].
Principally, the magnitude of steels' susceptibility to HE is proportional to their strength and hardness [24]. Thus, DQ and QT martensitic ultra-high-strength steels are at higher risk of HE. In addition to alloying, the volume fraction and shape of retained austenite (RA) as well as the morphology of PAG [25,26] are parameters that influence the susceptibility of martensitic ultra-high-strength steels to hydrogen. Studies have shown that HE associated with RA depends not only on its volume fraction but also on its shape. Zhu et al. [27] reported that massive hydrogen-induced cracks initiate at the interface of blocky RA. On the other hand, steels with filmy RA are relatively more resistant to hydrogen-induced cracks. As for PAGs, which serve as crack initiation points and preferred propagation paths, studies have shown that smaller PAGs reduce the sensitivity to hydrogen in DQ steels [8,28]. This can be associated not only with the reduced trapping ability at the grain boundaries but also with improved diffusion over a larger area, which prevents the hydrogen concentration from reaching critical levels as quickly [29].
Hydrogen embrittlement research has benefited from several investigative techniques over the years, including experimental methods and atomistic simulations [17,30–33]. These help investigate the HE phenomenon from macroscopic to nanoscopic levels in a variety of steels with distinct microstructural compositions and service applications. Contemporary investigation of hydrogen-assisted degradation at the macroscopic level is carried out by mechanical testing after hydrogen charging or under continuous hydrogen charging. The mechanical tests include constant extension rate tests (CERT), constant load tests (CLT) [11], impact toughness tests, four-point bend tests [33] and the recently proposed tuning-fork test [34]. The commonly used hydrogen charging techniques include classical electrochemical charging [35], plasma charging [36] and hydrogen gas charging [37]. It is worth noting that although the results obtained via these methods provide some scientific insight into the HE phenomenon, they are often used only to evaluate the performance of the steel in an attempt to mimic service conditions, in other words, to evaluate the influence of hydrogen on macroscopic mechanical properties. For different steels, the extent of mechanical property degradation during testing depends not only on the hydrogen charging conditions but, more importantly, on the hydrogen concentration in the steels [31,33]. Thus far, there are no swift means of obtaining hydrogen concentration values in steels after hydrogen-assisted failure, other than complex experimental measurements using mercury, gas chromatography and thermal desorption spectroscopy (TDS) techniques. Even with these methods, the measurements need to be performed immediately after fracture on specimens of specific shapes and sizes, which is a challenge for steels that fail in service. So far, the ability to predict hydrogen contents based on service conditions is poor. Meanwhile, this information could assist designers in estimating the probability and magnitude of hydrogen-assisted failures.
In recent times, artificial neural networks (ANNs) have found growing application in several fields, including research in engineering and materials science [38–42]. ANNs, which mimic the neural networks of the human brain, can learn, identify trends and build mathematical relationships between the feature variables of input data to predict specific target outputs [43]. In many disciplines, the large body of archived data from experimental results accumulated over the years is used to train ANN models to predict target outputs with little or no further experimental work.
Hydrogen embrittlement research has benefited from ANNs as well. Titus et al. [38] developed an ANN model to predict the degradation of mechanical properties of aluminium alloys in the presence of hydrogen while considering the influence of different elemental compositions. In other work, ANN models were trained on hydrogen thermal desorption spectroscopy data, believed to contain microstructural information, to predict the HE index of ferritic, austenitic and ferritic-martensitic steels with up to 98% accuracy [44]. Despite the available work using ANNs to predict the mechanical performance of steels and alloys in the presence of hydrogen, there are no available models to predict the actual hydrogen concentrations that result from a combination of hydrogen charging conditions and loading and that are responsible for the performance degradation of the steel.
In this paper, we develop and evaluate the reliability of two ANN models to predict the total hydrogen concentration responsible for performance degradation in DQ and DQ + RQ steels, using data curated from a combination of mechanical testing in the as-supplied state and under continuous hydrogen charging. The hydrogen charging parameters are also considered as input. The success and implementation of this approach will assist designers in estimating the levels of hydrogen required to cause hydrogen-assisted failure of hot-rolled martensitic ultra-high-strength steels, hence influencing the selection of material and service conditions.

Materials
Two direct-quenched martensitic steels with distinct chemical compositions were used in this study. The original microstructures were achieved by hot rolling in the non-recrystallization region followed by direct quenching (DQ1 and DQ2). In addition to the base materials, two further microstructures were obtained by re-heating and quenching (RQ) of DQ2 at austenisation temperatures of 860 °C and 960 °C with a 25 min holding time, followed by quenching in a water-oil emulsion bath (DQ2+RQ at 860 °C ≡ A860 and DQ2+RQ at 960 °C ≡ A960, respectively). The re-austenisation led to equiaxed PAG structures with different grain sizes. Table 1 shows the elemental composition of DQ1 and DQ2 with their corresponding mechanical properties, as well as those of A860 and A960. The different chemical compositions and PAG structures are intended to evaluate the sensitivity of mechanical property degradation to hydrogen uptake. Including these steels and heat treatment conditions gives the proposed model for predicting hydrogen concentration at failure a wider application range for materials of the same class.
The steels used in the present study are predominantly martensitic in microstructure. Fig. 1 shows the similarity in the lath-martensite microstructure and carbide morphology of the steels. In contrast, the steels possess rather dissimilar PAG structures: DQ1 and DQ2 have elongated PAG structures (Fig. 2(a) and (b)) resulting from hot rolling below the non-recrystallization temperature followed by direct quenching, while A860 and A960 have equiaxed PAG structures (Fig. 2(c) and (d)) as a result of additional re-austenisation and quenching. The average PAG sizes of the test materials, calculated from all directions with the linear intercept method, were 11.2 μm for DQ1, 9.6 μm for DQ2, and 9.4 μm and 42.6 μm for A860 and A960, respectively. Besides the shape, the significant difference in average PAG size between A860 and A960, and relative to the original DQ2 condition, should also be emphasised. X-ray diffraction results revealed that the steels contain negligible volume fractions of RA (<1%).

Hydrogen charging and mechanical testing
Electrochemical hydrogen charging was performed with a three-electrode electrochemical cell coupled with a Gamry potentiostat framework. Calomel and platinum electrodes were used as reference and counter electrodes, respectively. The steel specimen was the working electrode. 30 g/l of sodium chloride (NaCl) and 2–6 g/l of ammonium thiocyanate (NH₄SCN) were utilised as the main electrolyte and the atomic hydrogen recombination inhibitor, respectively. Prior to tensile tests under continuous hydrogen charging, the steel specimens were pre-charged for 2 h, which was experimentally determined to provide a sufficient and homogeneous distribution of hydrogen across the thickness of the specimen's gauge section. Electrochemical potentials of −0.8 V_SCE to −1.4 V_SCE were applied for hydrogen charging. After fracture, the gauge part of the specimen was cut to the characteristic size of 1 mm × 5 mm × 10 mm for hydrogen concentration measurement.

Hydrogen concentration measurement
Hydrogen concentration was measured with the TDS technique. The specimens extracted after fracture were cleaned with distilled water and then dried in helium gas flow to prevent the formation of moisture on the specimen surface. The partial pressure of hydrogen was measured in an ultrahigh-vacuum (UHV) chamber (ultimate pressure 10⁻⁹ mbar) coupled with a mass spectrometer (SRS residual gas analyser RGA100). To keep the UHV chamber at the required pressure and reduce pumping time before measurement, the sample was first placed in an airlock compartment and pumped to an intermediate pressure of 10⁻⁶ mbar, after which the specimen was transported to the UHV chamber for hydrogen partial pressure measurement (see Fig. 4). The total time from specimen preparation to TDS measurement did not exceed 10 min, with an effective dwell time in the airlock of 5 min (to reach 10⁻⁶ mbar), ensuring negligible hydrogen loss. All TDS measurements were performed at a heating rate of 10 °C/min. The total hydrogen concentration was calculated by integrating the area below the desorption rate versus temperature curve.
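The integration step described above can be illustrated with a minimal sketch. It uses the trapezoidal rule and assumes that the desorption rate is recorded per second, so that the temperature integral has to be divided by the heating rate; the function names, the units and the triangular peak are illustrative assumptions and not data from this study.

```python
def integrate_trapezoid(x, y):
    """Area under y(x) by the trapezoidal rule; x must be increasing."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired points")
    area = 0.0
    for i in range(1, len(x)):
        area += 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1])
    return area


def total_hydrogen(temps_c, rates_per_s, heating_rate_c_per_min=10.0):
    """Total desorbed hydrogen from a desorption-rate vs temperature curve.

    Assumption: rates are given per second (e.g. wt.ppm/s). Dividing the
    temperature integral by the heating rate in degC/s turns it into the
    equivalent time integral.
    """
    heating_rate_c_per_s = heating_rate_c_per_min / 60.0
    return integrate_trapezoid(temps_c, rates_per_s) / heating_rate_c_per_s


# Made-up triangular desorption peak, for illustration only.
temps = [25.0, 100.0, 175.0]
rates = [0.0, 0.002, 0.0]
print(total_hydrogen(temps, rates))
```

If the instrument already reports the rate per degree Celsius, the division by the heating rate would not be needed.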

Data processing
The data utilised as input for the developed ANN models consist of the inherent mechanical properties of the test steels, the parameters for hydrogen charging, the results from tensile testing in air and under continuous hydrogen charging, and the corresponding measured hydrogen concentrations. Table 2 presents the features of the input data utilised for training, validation and testing of the developed ANN models. The mechanical properties were obtained from classical hardness and tensile tests of as-received specimens. The tensile toughness was obtained by integrating the area under the nominal stress-strain curve from the tensile tests, up to the UTS. The hydrogen sensitivity parameter (HSP) is used to quantify the level of mechanical property degradation resulting from increased hydrogen concentration under mechanical loading with hydrogen charging [32]. HSP was calculated for the test steels in this study considering tensile strength, elongation to fracture and tensile toughness using Equation (1).
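Equation (1) is not reproduced in this text. A commonly used form of such a sensitivity index, consistent with the symbol definitions that follow and assumed here rather than taken from the source, is:

```latex
\mathrm{HSP} = \frac{M_{p}^{\mathrm{air}} - M_{p}^{\mathrm{H}}}{M_{p}^{\mathrm{air}}} \times 100\,\%
```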
Where: M_p^air and M_p^H are the mechanical properties of the test steels in the as-received condition and after electrochemical hydrogen charging, respectively, excluding the hardness. It is important to mention that only the absolute values of the applied potentials and their corresponding currents were used. The collected experimental data was split into training, validation and testing datasets according to the 60-20-20 rule. To ensure the generalisation of the model, a data augmentation method was applied to the training and validation datasets using the stats.truncnorm.rvs function from the statistics module of SciPy, an open-source Python library built on NumPy [45]. With this, probabilities within a normal distribution are defined for each input parameter, as may naturally occur in an experimental setting or in service. Thus, random variates of the desired number are generated within a predetermined standard deviation, taking each experimental data point as the mean [46]. Table 3 shows the standard deviation values attributed to the features of the experimental data utilised for the augmentation. For each sample in the training and validation datasets, a 10-fold set of random variates was generated with a controlled random seed. The resulting input data comprised 649 samples. No augmentation was carried out on the test dataset. In fact, the test data was not exposed to the ANN at any point during training and validation; it was used strictly to test the predictive error of the network.
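The augmentation idea can be sketched without external dependencies. The study uses SciPy's stats.truncnorm.rvs; the sketch below swaps in plain rejection sampling from a normal distribution, which draws from the same kind of truncated distribution. The truncation bounds of two standard deviations, the feature names, the sample values and the choice to keep the original sample next to its ten variates are all assumptions made for illustration; only the 0.5 J standard deviation for the system energy is stated in the text.

```python
import random


def truncated_normal(mean, std, lower, upper, rng):
    """Draw one value from N(mean, std) restricted to [lower, upper]."""
    while True:
        value = rng.gauss(mean, std)
        if lower <= value <= upper:
            return value


def augment(sample, stds, n_variates=10, n_sigma=2.0, seed=42):
    """Return the original sample followed by n_variates noisy copies.

    Features without an entry in `stds` are copied unchanged.
    """
    rng = random.Random(seed)  # controlled seed for reproducibility
    rows = [dict(sample)]
    for _ in range(n_variates):
        row = {}
        for name, value in sample.items():
            std = stds.get(name, 0.0)
            if std == 0.0:
                row[name] = value
            else:
                row[name] = truncated_normal(
                    value, std, value - n_sigma * std, value + n_sigma * std, rng
                )
        rows.append(row)
    return rows


# Hypothetical experimental point; values are made up.
sample = {"system_energy_J": 12.0, "uts_MPa": 1500.0}
stds = {"system_energy_J": 0.5}
rows = augment(sample, stds)
print(len(rows))
```

Only the training and validation samples would be passed through such a routine; the test samples stay untouched, as described above.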
In the preparation of ANN input data, normalisation is typically carried out to bring the input data to a common scale, thus ensuring that all input features have equal importance in the prediction [38,47]. In this study, normalisation of the input data was performed using preprocessing.normalize from the scikit-learn library. The normalisation model was saved and called whenever needed to ensure the reproducibility of the normalisation process.
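As a dependency-free illustration of this step: with its default arguments, scikit-learn's preprocessing.normalize rescales each sample (row) to unit L2 norm. Whether the defaults were kept in this study is not stated, so the sketch below is an assumption about the settings, and the function name is hypothetical.

```python
import math


def l2_normalize_rows(rows):
    """Scale every row to unit Euclidean length; all-zero rows are left as is."""
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(value * value for value in row))
        if norm == 0.0:
            normalized.append(list(row))
        else:
            normalized.append([value / norm for value in row])
    return normalized


# Made-up two-feature samples.
print(l2_normalize_rows([[3.0, 4.0], [0.0, 0.0]]))
```

Because this scaling acts on each sample independently, the same transformation can be re-applied to new samples without storing any fitted statistics.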

Neural network architecture
The multilayer feed-forward artificial neural network (ANN) models, with backpropagation, were developed using Keras, an open-source library built on TensorFlow (version 2.3.0) for machine learning applications in the Python programming language [48]. All code in this study was written and executed on the Jupyter platform.
The ANN is intended to receive input data containing the features presented in Table 2 and to predict a target output: the total hydrogen concentration responsible for mechanical property degradation and failure for the given data sample. Two ANN models of identical architecture were developed: Model I and Model II. The topology of the models, however, differs in the number of input features. In Model I, the charging parameters, i.e. applied potential (E_a), hydrogen charging time (T_ch) and initial current (I_i), were transformed into system energy (E_s) using Equation (2), where E_a is the absolute value of the applied electrochemical potential in volts, I_i is the corresponding initial current in amperes and T_ch is the charging time in seconds. The standard deviation selected for data augmentation of the system energy feature was 0.5 J, a scatter that can arise from variations in E_a, T_ch and I_i that are physically plausible and at the same time representative of a wide range of laboratory testing or real service conditions.

Table 3 – Standard deviations attributed to the features used for data augmentation (columns: input feature, standard deviation).

The ANN models were developed with an input layer of 12 nodes for Model I and 14 nodes for Model II, corresponding to their respective numbers of input features. The input layer is followed by 4 densely connected hidden layers and a single-node output layer. The topology of the networks can be summarised as 12-7-5-3-2-1 and 14-7-5-3-2-1 for Model I and Model II, respectively. The networks operate sequentially: every node in a layer receives input from the nodes in the preceding layer and transmits an output to the nodes in the subsequent layer, within the framework of supervised learning. Fig. 5 illustrates the characteristic function of a unit node within a layer. Hyperparameters, especially the number of hidden layers and nodes, play the most important role in the performance of an ANN [49]. Despite proposals in the literature of equations governing the selection of hyperparameters [40,41,50], there is no rule of thumb or robust methodology for selecting hidden layers and nodes. Some proposed methods are suitable only for specific applications [38]. In this study, the network topology 12-7-5-3-2-1 was selected by trial and error while monitoring the performance of the models. The output from all layers in the model is governed by the rectified linear unit (ReLU) activation function, as it ensures better gradient propagation [51]. To efficiently control the adjustment of the weights and minimise the loss function, which helps in monitoring the model performance, the adaptive moment estimation (Adam) algorithm, which relies on estimates of first and second-order moments, was used as the optimizer. It is based on the stochastic gradient descent method. Several studies attest to the computational efficiency and low memory requirement of Adam, as well as its suitability for problems that are large in terms of input features [52,53]. In this research, Adam was used with its default arguments, including the default values for learning rate and step size. The learning performance of the developed ANNs is monitored by the mean square error (MSE) loss function [54]. The MSE loss score during training of the model is determined by Equation (3).
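Equation (3) is not reproduced in this text. The standard definition of the mean square error, written with the symbols defined in the next sentence, is:

```latex
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - Y_i\right)^2
```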
Where: N is the number of samples of input data, y is the experimental output data and Y is the predicted output by the ANN.
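The ingredients of this section can be sketched in plain Python, without Keras. The sketch assumes that Equation (2), which is not reproduced in this text, is the electrical energy E_s = E_a · I_i · T_ch (volts × amperes × seconds gives joules); it then builds randomly initialised networks with the stated layer sizes and applies the node function of Fig. 5, a weighted sum plus bias passed through ReLU. All function names, the weight initialisation and the input values are illustrative assumptions, and no training is performed.

```python
import random


def system_energy(e_a_volts, i_i_amps, t_ch_seconds):
    """Assumed form of Equation (2): E_s = |E_a| * |I_i| * T_ch, in joules."""
    return abs(e_a_volts) * abs(i_i_amps) * t_ch_seconds


def relu(x):
    """Rectified linear unit."""
    return x if x > 0.0 else 0.0


def init_layers(sizes, seed=0):
    """Random weights and zero biases for each pair of consecutive layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, features):
    """Feed-forward pass: each node outputs relu(weighted sum of inputs + bias)."""
    values = list(features)
    for weights, biases in layers:
        values = [
            relu(sum(w * v for w, v in zip(row, values)) + b)
            for row, b in zip(weights, biases)
        ]
    return values


def count_parameters(sizes):
    """Number of trainable weights and biases for a dense topology."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


MODEL_I = [12, 7, 5, 3, 2, 1]
MODEL_II = [14, 7, 5, 3, 2, 1]

# Made-up charging values: -1.2 V, 10 mA, 2 h.
energy = system_energy(-1.2, 0.01, 7200.0)
prediction = forward(init_layers(MODEL_I), [0.1] * 12)
print(energy, count_parameters(MODEL_I), count_parameters(MODEL_II), prediction)
```

Replacing three charging parameters by one energy value is what reduces the input layer from 14 nodes in Model II to 12 nodes in Model I.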

Results and discussion
Hydrogen charging and mechanical testing were performed on four steels to highlight the deleterious effect of hydrogen on their mechanical performance and, in addition, to curate and process the obtained data for the development of ANN models [...] microstructure, especially the average size and shape of PAG as shown in Fig. 2. In addition, it is important to mention that microstructural defects resulting from deformation during tensile testing may influence the hydrogen concentration measurements, as the different steels deform and retain hydrogen differently. In future studies, this can be evaluated by analysing the spectra obtained from TDS hydrogen concentration measurements.
It is worth noting that the deleterious effect of hydrogen on the test steels shown in Fig. 6 represents the effect on one feature, i.e. mechanical strength, whereas the effect of hydrogen on the mechanical properties of the steels is multi-dimensional (including effects on ductility, steel surface quality, size, etc.). Nonetheless, the results shown in Fig. 6 are clearly representative of the effect of hydrogen on the respective steels. What is important to underline from these results is that the influence of hydrogen on the mechanical integrity of the respective steels is dissimilar. Although the effects of hydrogen on the steels are all deleterious, their nature and magnitudes differ. Input data comprising different steels is important for the reliability of the ANN models and their suitability for a wider scope of applications.
A preliminary statistical correlation analysis was performed on the input data features. A correlation matrix (heatmap of Pearson's correlation) was used to investigate the degree of correlation between the input feature pairs and the target output. It is worth noting that the current features were retained after a preliminary feature selection procedure [55]. Fig. 7(a) and (b) depict the correlation heatmaps between the features of the ANN input data for Model I and Model II, respectively. As shown in Fig. 7(a) for Model I, less than 20% of the feature pairs have a moderate to strong positive or negative correlation (|R| > 0.65). The strongest correlations are mostly between the original M_p and the HSP, which by construction contains the original M_p as a component per Equation (1). The same map further reveals a low correlation (|R| < 0.5) between more than 75% of the feature pairs, of which more than 95% are statistically insignificant at the 0.01 significance level (highlighted by asterisks). For Model II, despite the increase in feature pairs, the map shows that only about 25% of the feature pairs have a relatively strong positive or negative correlation (|R| > 0.65), which is also attributed to the inherent correlation between M_p and HSP and charging time. Low to moderate correlations, with |R| < 0.5, exist between more than 65% of the feature pairs, of which more than 80% are statistically insignificant at the 0.01 significance level (highlighted by asterisks). In other words, for both models, the majority of the input data feature pairs have a weak correlation. Although other non-linear, multivariate forms of correlation may exist, classical linear regression solutions do not suffice to predict the hydrogen concentration with the available data. Nevertheless, more powerful methods like ANN can establish any existing nonlinear or multi-dimensional relationships between the input features and the target output. This is of great value when there is no analytically validated formulation for this multi-physical problem.
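The pairwise analysis behind such a heatmap can be sketched as follows. The code computes Pearson's R for every distinct feature pair and reports the share of pairs above a threshold, mirroring the |R| > 0.65 criterion used above. The feature names and values are made up for illustration, and significance testing is left out.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / math.sqrt(var_x * var_y)


def strong_pair_fraction(columns, threshold=0.65):
    """Fraction of distinct feature pairs with |R| above the threshold."""
    pairs = list(combinations(columns, 2))
    strong = sum(
        1 for a, b in pairs if abs(pearson_r(columns[a], columns[b])) > threshold
    )
    return strong / len(pairs)


# Hypothetical feature columns; values are made up.
columns = {
    "uts_MPa": [1400.0, 1500.0, 1600.0, 1700.0],
    "hsp_uts": [10.0, 20.0, 30.0, 40.0],
    "charging_time_s": [7200.0, 7300.0, 7100.0, 7250.0],
}
print(strong_pair_fraction(columns))
```

A low share of strongly correlated pairs, as reported for both models, indicates that a purely linear description captures little of the structure in the data.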
The learning performance of the ANN models, i.e. the training history and the prediction accuracy in training, validation and testing, is evaluated using the mean absolute error (MAE) metric [56]. MAE measures the error between pairs of true and predicted values and is tracked over the training epochs, where each epoch is one full pass over the training dataset. MAE is calculated with Equation (4).
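Equation (4) is not reproduced in this text. The standard definition of the mean absolute error, with n read here as the number of true-predicted value pairs being compared, is:

```latex
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|T_i - P_i\right|
```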
Where: T_i are the experimental true values, P_i are the predicted values, and n is the number of true-predicted value pairs. Fig. 8(a) shows the final training history of Model I (12-7-5-3-2-1). The utilised topology in terms of hidden layers and neurons was found by trial and error to be the best for Model I. Attempts to add or remove hidden layers resulted in the model quickly overfitting or underfitting and getting stuck in a local minimum, evidenced by a straight line in the MAE for both the training and validation history [57]. With this topology, the MAE for training consistently reduces up to 10⁵ epochs. The MAE for validation data initially declines but starts to overfit at about 10⁴ epochs, followed by a momentary spike. The validation MAE nevertheless continues to decline until 6 × 10⁴ epochs, when the overfitting gap between training and validation starts to widen markedly. Consequently, Model I was re-run, training was stopped at 6 × 10⁴ epochs and the model was saved. The model was then called to make predictions on the training, validation and test datasets. [...] selected to include all four DQ test steels. However, further data and investigation are required to ascertain the predictive performance of the model for the same steels under different mechanical test conditions (e.g. varying strain rates).
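In the study, the stopping point for Model I was chosen by inspecting the training history. As an illustration only, the sketch below swaps in a simple automatic rule, patience-based early stopping on the validation MAE, which tolerates a momentary spike before giving up; the function name, the patience value and the history are hypothetical and do not reproduce the authors' procedure or data.

```python
def best_epoch(val_mae, patience=3):
    """Index of the epoch with the lowest validation MAE seen before
    `patience` consecutive epochs pass without improvement."""
    if not val_mae:
        raise ValueError("validation history is empty")
    best_index = 0
    best_value = val_mae[0]
    waited = 0
    for epoch, value in enumerate(val_mae[1:], start=1):
        if value < best_value:
            best_value = value
            best_index = epoch
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_index


# Made-up validation history with a momentary spike at index 3.
history = [0.9, 0.7, 0.5, 0.55, 0.45, 0.5, 0.6, 0.7, 0.3]
print(best_epoch(history))
```

A larger patience value makes the rule more tolerant of spikes at the cost of extra training epochs.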
To evaluate the effect of increased feature size on the training and prediction performance of the developed ANN, Model II was developed keeping the individual hydrogen charging parameters as input features, resulting in a 14-7-5-3-2-1 network topology. This was done to test the general assumption that an increased feature size may cause quicker overfitting, or allow a good fit for training data but result in poor prediction performance on test data due to lack of generalization [61,62]. Fig. 10

Conclusion
Two artificial neural network (ANN) models were developed to predict the hydrogen concentration responsible for mechanical property degradation in DQ and DQ + RQ martensitic ultra-high-strength steels. The inherent mechanical properties of the steels in the as-supplied state and after hydrogen-induced failure, together with the hydrogen charging parameters, were used as inputs for four steels of different chemical composition and thermal treatment, with similar martensitic microstructures but different PAG size and morphology, to represent this family of steels and a wide range of hydrogen-rich service conditions. The following conclusions can be drawn.
• The developed ANN Model I (12-7-5-3-2-1) and Model II (14-7-5-3-2-1) show comparable hydrogen concentration prediction abilities with remarkable accuracy.
• The ANN models show good hydrogen concentration prediction for all considered steels, irrespective of their different mechanical responses to varying levels of hydrogen concentration resulting from differences in prior austenitic grain (PAG) morphology.
• The increase in feature size in Model II, obtained by considering the original three parameters controlling the hydrogen charging energy, requires a longer training time to reach a good mean absolute error (MAE) trade-off for the training and validation datasets, yet reduces prediction performance on the test dataset compared to Model I.
• The developed models may not be suitable for predicting hydrogen concentration at fracture for other families of DQ high-strength steels, namely those with a significant volume fraction of retained austenite, or for steels with dominantly ferritic or austenitic microstructures, as these are expected to respond differently to hydrogen-induced fracture.
• In future work, the effect of PAG size and boundary surface area as input features on the prediction performance of the developed models can be considered.

For hydrogen charging, a combination of varied applied electrochemical potential and concentration of hydrogen recombination inhibitor was used to provide a wide range of hydrogen concentrations in the studied steels. The pH of the electrolyte was measured to be 5.5 and 4.5 for the electrolyte solution containing 2 g/l and 3 g/l NH₄SCN, respectively. During hydrogen charging, the electrolyte was kept under constant stirring at 40 rev/min and deaerated by nitrogen gas flow. Hydrogen charging was performed at room temperature. Mechanical testing (MT) specimens with a size of 5 mm × 10 mm × 300 mm and a gauge part size of 1.0 mm × 5.0 mm × 20 mm, shown in Fig. 3(a), were wire cut by EDM and polished mechanically, finishing with emery paper No. 1200. The specimens were Teflon-taped, exposing only the gauge part to the hydrogen charging environment. CERTs were conducted under continuous hydrogen charging on a 30 kN MTS benchtop tensile test machine at a strain rate of 10⁻⁴ s⁻¹ until fracture. Fig. 3(b) shows a general view of the mechanical testing setup with continuous hydrogen charging.

Fig. 3 – Depiction of the experimental features: (a) schematic and dimensions of MT specimens, (b) general setup of CERT under continuous hydrogen charging.

Fig. 5 – Characteristic function of a unit node within an ANN hidden layer.
International Journal of Hydrogen Energy 48 (2023) 5718–5730

Fig. 7 – Heatmap of Pearson's linear correlation between the features of the ANN input data: (a) Model I, (b) Model II [where asterisks represent correlations that are insignificant at the 0.01 significance level].

Fig. 9 – Correlation between experimental and predicted hydrogen concentration from ANN Model I, using (a) validation dataset, (b) test dataset.

Table 1 – Chemical composition and mechanical properties of the steels used in this study.

Table 2 – Features of the input data utilised in this study.