Machine learning of metal-ceramic wettability

The adhesion and wetting between metals and ceramics is a fundamental problem in materials science and engineering. For example, materials selection for metal-ceramic composites has so far relied on trial and error and heuristics because their adhesion is poorly understood; the large chemical and structural variability of such interfaces hinders the identification of the governing factors. Here, based on literature data, we have compiled a database of ~1,000 experimentally measured wetting angles at different temperatures and atmospheric conditions and developed a model for the wettability of ionocovalent ceramics (ICs) by metals using a machine learning (ML) algorithm. The random forest model uses the testing temperature and ~40 features generated from the chemical compositions of the metal and the ceramic as predictors, and exhibits strong predictive power with an R² of ~0.86. Moreover, this model and the featurization code are integrated into a single computational pipeline to enable (1) prediction of the wettability of a metal-IC pair of interest and (2) high-throughput searching of the entire Inorganic Crystal Structure Database for ICs with the desired wettability by certain metals. As a demonstration of this pipeline, the wettability of a Li-ion and electron insulator (LEI), CaO, by molten Li is estimated and compared with the result of an ab initio molecular dynamics simulation. This ML pipeline can serve as a practical tool for the methodical design of materials in systems where a certain metal-ceramic wettability is desired. © 2021 The Chinese Ceramic Society. Production and hosting by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
The wettability of non-metals by metals has been of importance in various fields. For example, in structural materials applications, adding non-metallic nanodispersions is one of the well-known strategies used to strengthen the metal or to provide unique properties, such as irradiation resistance in oxide-dispersion-strengthened steels [1–3]. However, these composite materials are deployed only to a limited extent in the industry due to the generally poor wettability of non-metallic phases by metals, which makes it difficult to achieve a uniform dispersion of ceramic fillers in a metal matrix via scalable processing [3]. As another example, in energy applications such as rechargeable solid-state Li-metal batteries, the wettability of solid electrolytes by metallic Li at the anode side is of critical importance for uniform Li deposition and hence the batteries' long-term stability [4–10].
There has thus been intensive research [11–19] over the decades to identify the key factors that govern metal-ceramic wettability and thereby to design metal-ceramic pairs with good wettability or to improve that of a given pair by tuning the processing conditions. Nevertheless, parameterization or prediction of wettability in a practical sense has not yet been realized. The surface of a material has many degrees of freedom. Its chemical composition and atomic structure can differ from those of the bulk, depending on the temperature and atmospheric conditions. Moreover, surface roughness can vary widely depending on the specimen preparation method. All these degrees of freedom can matter when two surfaces form an interface and undergo a reaction. These complexities have caused a large variation in the experimentally measured wetting angles reported in the literature [20,21] and have made it difficult for computational approaches, such as ab initio calculations [22], to approximate real interfaces, leaving the wettability of ceramics by metals poorly predicted. Hence, predicting wettability requires an approach that can handle the multi-dimensionality of the problem. One such approach is machine learning (ML). In particular, neural network-type algorithms can develop multiple hidden layers with different numbers of neurons between an input layer and an output layer. Upon being trained on the multiple variables collected in a database, ML algorithms have succeeded in predicting various material properties with acceptable accuracy (>~80%) [23]. Examples range from electronic properties such as the bandgap [24] and density of states [25] to macroscopic properties such as the superconducting critical temperature [26] and thermal boundary resistance [27].
In this paper, we present an ML model that can predict the observed wetting angles of metal-ionocovalent ceramic (IC) pairs. With the random forest algorithm, we show that an out-of-sample accuracy of ~86% can be achieved even without atmospheric information, which is known to affect the wettability; the input parameters used are the testing temperature and features generated from the chemical compositions of the metals and ceramics. We also demonstrate that the prediction for a metal and a ceramic that do not appear in the database at all is in good agreement with the wetting behavior obtained from ab initio molecular dynamics simulations. This model would allow high-throughput screening of candidate material pairs for which wettability matters and help us discern outliers, thereby enabling the investigation of the physics underlying the wettability.

Wetting angle data collection
We collected approximately 1,000 measured metal-IC wetting angles from the experimental literature [12,25,28–74]. The ICs in the database are mostly oxides and fluorides. The standard approach for evaluating metal-ceramic wettability is to measure the contact angle between a molten metal and a ceramic substrate via the sessile drop method [20], in which the metallic melt is extruded through the hole of a dropping device to remove oxide films on its surface. Modified versions of the method, on the other hand, often omit the removal of oxide films [20]. For instance, one of the modified sessile drop methods places a solid metal block, instead of a metallic melt, directly onto the substrate without mechanical extrusion [20]. In this case, the oxide film can remain on the surface of the metal after melting and affect the wettability [20]. Such data points are nevertheless not excluded in this study, because data from the unmodified sessile drop method can also be affected by oxide films; these can form again in situ depending on the atmospheric conditions (e.g., the type and pressure of the gas used, the oxygen partial pressure) even after the mechanical extrusion has been carried out. Likewise, the wetting angles measured under different atmospheric conditions are all included without distinction, since only a limited number of articles report atmospheric information, although such conditions can change the wetting angle of a given metal-ionocovalent ceramic pair greatly [73,75] (from 68° under vacuum to 132° under Ar for the Fe–Al2O3 pair). It should be noted that the wetting angles measured via these methods are close to advancing contact angles rather than receding or equilibrium contact angles, whose measurement requires the application of controlled vibrations.
The testing temperature is the only extrinsic variable that is included in the database. If the measurement was carried out at a constant temperature and the wetting angle-time profile is provided, the wetting angle at equilibrium was collected to construct the database; when only one wetting angle was reported without the profile, we assumed that this value is the equilibrium wetting angle. On the other hand, if the experiment was conducted under continuous heating at a heating rate of less than 5 °C/min, the wetting angles at every 10 °C were used. The wetting angle database used to generate the results in this work can be downloaded from https://github.com/sokim1/Wettability_Metal-IonocovalentCeramic.

Featurization
A set of attributes was generated using the featurizer package provided by MatMiner [76], which is an open-access Python library for assisting ML in materials science. Among the various featurizers that it offers, we adopted a composition-based featurizer to develop a model generally applicable throughout the metal-IC systems, including those with off-stoichiometric compounds. In particular, a class called "ElementProperty" was used with a preset named "Magpie," which returns 22 elemental features calculated by the rule of mixtures, such as the average number of electrons in each valence shell (s, p, d, and f) among all elements present in the material [77]. This preset was chosen because it takes into account a wide variety of material characteristics, from physical/chemical properties (e.g., melting temperature, specific volume, electronegativity) to electronic/magnetic properties (e.g., bandgap, magnetic moment). The featurizer provides 5 additional statistics (i.e., mean absolute deviation, range, minimum, maximum, and mode) [77]; however, only the average values are used, since the rest are less likely to have an actual physical meaning.
The final input matrix was constructed using these features. As the features in the input matrix are employed as parameters to predict wetting angles, the term "predictor" is adopted in this paper to refer to these features. The testing temperature is included as one of the predictors, and two sets of 22 predictors are derived from the composition of the molten metals and the substrate ceramics, respectively. Among these 45 predictors, the predictor named the number of unfilled f-states of metals is excluded, since metals in our database have either empty or completely filled f-states. Meanwhile, the predictor named bandgap at 0 K ground state is not excluded, as some metals have semiconductor elements (e.g., Ge, Si) as one of the constituent elements. The list of the predictors used is summarized in Table 1.
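The composition-weighted averaging behind these predictors can be illustrated with a minimal sketch. It is not the MatMiner implementation: the two-property lookup table, the simple formula parser, and the names `ELEMENT_DATA`, `parse_formula`, `mean_features`, and `build_predictors` are assumptions introduced here for illustration only.

```python
import re

# Illustrative elemental data (Pauling electronegativity, melting point in K).
# A small stand-in for the Magpie elemental tables, not a copy of them.
ELEMENT_DATA = {
    "Li": {"electronegativity": 0.98, "melting_T": 453.65},
    "Al": {"electronegativity": 1.61, "melting_T": 933.47},
    "Ca": {"electronegativity": 1.00, "melting_T": 1115.0},
    "O": {"electronegativity": 3.44, "melting_T": 54.36},
}


def parse_formula(formula):
    """Return {element: amount} for a simple formula such as 'Al2O3' (no brackets)."""
    amounts = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


def mean_features(formula):
    """Atomic-fraction-weighted mean of each elemental property (rule of mixtures)."""
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    properties = ["electronegativity", "melting_T"]
    return {
        prop: sum(n * ELEMENT_DATA[el][prop] for el, n in amounts.items()) / total
        for prop in properties
    }


def build_predictors(metal, ceramic, temperature_K):
    """One input row: testing temperature plus metal- and ceramic-derived means."""
    row = {"temperature": temperature_K}
    row.update({f"metal_{k}": v for k, v in mean_features(metal).items()})
    row.update({f"ceramic_{k}": v for k, v in mean_features(ceramic).items()})
    return row


print(build_predictors("Li", "CaO", 500.0))
```

Because the parser accepts fractional amounts (e.g., "TiO1.8"), the same averaging applies to off-stoichiometric compositions, which is the motivation the text gives for a composition-based featurizer.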

Machine learning
To develop a model for metal-ionocovalent ceramic wetting angles from the constructed dataset, we used the random forest regressor implemented in the scikit-learn Python module [78]. No preprocessing (e.g., scaling or normalization) of the predictors was carried out. The dataset was split randomly into two subsets at a ratio of 75% to 25%, which were used for training and testing, respectively. The hyperparameters of the random forest regressor were then tuned using a grid search method. Generally, the more complex the relationships a model can learn, the greater the danger of overfitting, where the model learns irrelevant information (e.g., "noise") [26]. Therefore, the hyperparameters of a regression model should be adjusted to avoid overfitting. Examples of such hyperparameters for the random forest regressor include the number of trees in the forest, the maximum depth of each tree, and the minimum number of samples required to split an internal node. An exhaustive search over the specified values of these hyperparameters was performed.
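As a rough illustration of the procedure described above, the following sketch performs a 75%/25% random split and an exhaustive grid search with scikit-learn. The synthetic data, the grid values, the 3-fold cross-validation, and the random seeds are assumptions made for this example; they are not the settings used in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data: 200 samples, 5 predictors (NOT the wetting-angle database).
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))
y = 90.0 + 40.0 * np.sin(3.0 * X[:, 0]) + 10.0 * X[:, 1] + rng.normal(scale=2.0, size=200)

# 75% / 25% random split for training and testing; no scaling or normalization.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Illustrative grid over the three hyperparameters named in the text.
param_grid = {
    "n_estimators": [50, 100],      # number of trees in the forest
    "max_depth": [None, 5],         # maximum depth of each tree
    "min_samples_split": [2, 5],    # minimum samples to split an internal node
}

# Exhaustive search; each combination is scored by cross-validated R^2 on the training set.
search = GridSearchCV(RandomForestRegressor(random_state=0), param_grid, cv=3, scoring="r2")
search.fit(X_train, y_train)

best_params = search.best_params_
test_r2 = search.score(X_test, y_test)  # R^2 of the refitted best model on the held-out 25%
print(best_params, round(test_r2, 3))
```

Keeping the test subset out of the grid search means the reported R² is not influenced by the hyperparameter selection.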

Ab initio molecular dynamics simulation
The ab initio molecular dynamics (MD) simulation of Li–CaO wetting behavior was carried out using density functional theory as implemented in the Vienna Ab initio Simulation Package (VASP). A Li metal droplet was created with Packmol [79], which builds an initial disordered structure while guaranteeing that short-range repulsive interactions do not disrupt the simulations. To build the CaO substrate, its Crystallographic Information File (CIF) was downloaded from the Materials Project database (materialsproject.org) and cut into a two-layered slab using VESTA 3 [80]. To ensure that the Li droplet does not interact with its neighboring images under periodic boundary conditions, the CaO slab was expanded in-plane, and a vacuum region, together with the as-prepared Li droplet, was added on top. The dimensions of the resulting simulation box were 24 × 24 × 25 Å. The Perdew-Burke-Ernzerhof exchange-correlation functional, which adopts the generalized gradient approximation (GGA), was used with an energy cut-off of 850 eV. The Li–CaO system was run for 1800 steps at T = 500 K to achieve energy convergence. The atom trajectories of the last 500 steps were extracted as an XYZ file and then rendered with OVITO 3.4.0 [81], from which the contact angle was measured.
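One simple way to turn a rendered droplet profile into an angle is to approximate the droplet as a spherical cap, for which tan(θ/2) = h/a, with h the cap height and a the radius of the contact area. The sketch below is only an illustration under that assumption; it is not the measurement procedure used in this work, and the function name `contact_angle_deg` is hypothetical.

```python
import math


def contact_angle_deg(cap_height, base_radius):
    """Contact angle (degrees) of a spherical-cap droplet.

    Uses the geometric relation tan(theta / 2) = h / a, where h is the height of
    the cap above the substrate and a is the radius of the contact area.
    Both lengths must be positive and in the same unit.
    """
    if cap_height <= 0 or base_radius <= 0:
        raise ValueError("cap_height and base_radius must be positive")
    return math.degrees(2.0 * math.atan(cap_height / base_radius))


# A hemisphere (h = a) gives 90 degrees; taller caps give obtuse (non-wetting) angles.
print(contact_angle_deg(1.0, 1.0))
print(contact_angle_deg(2.0, 1.0))
```

For a droplet of only a few tens of atoms, thermal fluctuations make h and a poorly defined, so such an estimate is better read as indicating an acute or an obtuse angle than as a precise value.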

Regression model and feature importance
The random forest is an ensemble learning algorithm based on decision trees that does not assume a specific functional form for the relationship between the predictors and the target variable [78]. Instead, it builds a multitude of independent decision trees, which solve the same classification or regression problem [78]. For regression, the model combines the results from the individual trees and returns their average prediction [78]. This method is used in this study because of the benefits that decision trees bring to modeling the wetting angles of metal-IC systems.
First, individual decision trees allow training on datasets with conflicting tendencies, which are common in the wettability problem. For example, it was suggested that strong metallic bonding at the interface can help promote good wettability for metallic elements with a completely filled d-band (e.g., Sn, Cu, Ag, Au) [22,82]. In contrast, for transition metals with a partially filled d-band (e.g., Fe, Co, Ni), covalent bonding has been pointed out to play a major role in the wetting behavior [22,82]. The random forest method can tackle this issue by using decision trees to split data points with conflicting tendencies. Second, decision trees eliminate the need for potentially dangerous preprocessing to handle heterogeneities in the training data [26]. The values of the different predictors used in this study span very different ranges. Melting temperatures, for example, are typically several hundred to a few thousand Kelvin, whereas the number of electrons in d valence shells is at most 10. Preprocessing is often required in such cases to prevent the underrepresentation of certain predictors; nonetheless, it can skew dependencies [26]. Decision trees, by contrast, treat each predictor individually, without being affected by the range that the other predictors span, and hence do not demand preprocessing. These characteristics make the random forest algorithm appropriate for the wettability prediction task. Fig. 1 shows the benchmark result of the trained random forest regression model in predicting the wetting angles of different metal-IC pairs. The average R² value, which represents the prediction accuracy, is ~0.86 (the accuracies obtained using other regressors are summarized in the Supplementary Information). Because R² values can differ from one training/testing split to another, those of 40 different training/testing splits were estimated and averaged.
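For reference, the averaged R² quoted above can be computed as in the following sketch, which uses the usual definition R² = 1 − SS_res/SS_tot. The helper names `r_squared` and `average_r_squared` are introduced here for illustration, and the example numbers are arbitrary rather than taken from the database.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_r_squared(splits):
    """Mean R^2 over (y_true, y_pred) pairs, one pair per random train/test split."""
    scores = [r_squared(y_true, y_pred) for y_true, y_pred in splits]
    return sum(scores) / len(scores)


# Arbitrary example: one perfect prediction and one that always predicts the mean.
example_splits = [
    ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]),
]
print(average_r_squared(example_splits))
```

Averaging over many random splits reduces the dependence of the reported score on which data points happen to fall into the test subset.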
Considering that this model is trained on experimental data, which inevitably have a certain degree of scatter, achieving this level of prediction accuracy with about a thousand data points is notable. For comparison, training on a dataset of more than ~13,000 entries gives an R² value of ~0.88 for the prediction of superconducting critical temperatures [26]. Most of the predicted values are within the 30° range with respect to the grey dotted line, along which the predicted wetting angles agree with those measured experimentally. This level of deviation reflects the fact that the wetting angle of a given metal-ionocovalent ceramic pair can vary by ~60° or more depending on the atmospheric conditions (e.g., oxygen partial pressure) [73,75]. One of the significant advantages of the random forest algorithm is that it can estimate the importance of each predictor by combining information from the individual trees, thereby making the developed regression model more interpretable [78]. However, unlike model construction, where the presence of correlated predictors does not impair the model's ability to learn, importance estimates can be complicated when some predictors are strongly correlated [26]. More specifically, when a material property that plays an important role in determining wettability is correlated with several predictors, the model can access the information about this property through more than one predictor and hence regards each of those predictors as less critical. In other words, the importance of that property becomes diluted across those predictors [26].
To circumvent the issue of correlated predictors and increase the interpretability, standard predictor selection procedures and a backward predictor elimination process were employed. First, pairwise correlation coefficients were calculated via the Pearson method. When a pair of predictors had a correlation coefficient of >0.9, the less important predictor was removed; 12 out of 44 predictors were excluded through this procedure. The regression model was then trained again on the modified input matrix composed of the remaining 32 predictors. Next, starting with this model, the backward elimination process iteratively removed the least important predictor. In every iteration, the model was rebuilt and the importances were evaluated again, since the importance rankings of the predictors were subject to change at each step. Upon iterating until the overall accuracy obtained from an out-of-bag estimate dropped by 2%, only 8 predictors remained. These predictors were sufficient to yield a prediction accuracy of ~0.84 for metal-IC wetting angles, as shown in Fig. 2a.
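The correlation-based filtering step can be sketched as follows. This is a greedy variant written for illustration, not the code used in this work: predictors are visited in order of decreasing importance, and a predictor is kept only if its Pearson correlation with every predictor already kept does not exceed the threshold. Using the absolute value of the coefficient is an assumption (the text specifies only "> 0.9"), and the names `pearson` and `drop_correlated` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences (non-constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def drop_correlated(columns, importances, threshold=0.9):
    """Keep the more important predictor of every highly correlated pair.

    columns:     {predictor name: list of values}
    importances: {predictor name: importance from a previously trained model}
    """
    ranked = sorted(columns, key=lambda name: importances[name], reverse=True)
    selected = []
    for name in ranked:
        # Assumption: |r| is compared with the threshold, so strong negative
        # correlations are treated like strong positive ones.
        if all(abs(pearson(columns[name], columns[kept])) <= threshold for kept in selected):
            selected.append(name)
    return selected


# Arbitrary toy columns: "b" is almost a multiple of "a" and is therefore dropped.
toy_columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [4, 1, 3, 2]}
toy_importances = {"a": 0.5, "b": 0.3, "c": 0.2}
print(drop_correlated(toy_columns, toy_importances))
```

The backward elimination that follows in the text would then retrain the model on the surviving predictors and repeatedly remove the least important one until the out-of-bag accuracy drops by the chosen tolerance.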
There were a few outliers, but even with less than 20% of the total predictors the model appears to capture most of the trends at a level comparable to that with the full list of predictors. Fig. 2b shows the relative importances of the remaining predictors. Different colors are used to distinguish the bars for the metal-related, ceramic-related, and extrinsic predictors. Interestingly, the metal-related predictors account for 76.3% of the importances, whereas the total importance of the ceramic-related predictors is 15.4% and that of the extrinsic predictor is 8.3%. This bias in importances indicates that the metal-IC wettability depends mainly on the metal. Moreover, none of the system-related predictors shows database-wide correlations. Even for the number of d valence electrons of metals, which has the highest importance, a database-wide trend is hardly noticeable in the plot of wetting angles versus predictor values, as illustrated in Fig. 2c. Instead, only a few local correlations are observable. The lack of global correlations implies that the mechanism by which metals wet ionocovalent ceramics differs from system to system. Investigating local correlations would help in understanding the mechanisms that govern the wettability in particular systems. Nevertheless, it should be noted that local correlations observed for a certain predictor could be a manifestation of other effects. For example, the testing temperature appears to set a hard limit on the minimum wetting angle achievable, as displayed in Fig. 2d. However, considering that only a few metals are in a liquid state in those low-temperature regimes, it is possible that the poor wettability therein should be attributed to other properties of the low-melting-point metals.
The system-specific correlations also explain the high ranking of some physically meaningless predictors. For example, the space group number of metals ranked third in terms of predictor importance. However, it is a value obtained by averaging the space group numbers of the constituent elements and thus does not have a physical meaning. Therefore, it is likely that there exist predictors that are highly correlated with the space group number. Indeed, when calculating the correlation coefficients between the space group number and the other metal-related predictors, one can see that the melting temperature is closely correlated with it, as indicated by a dotted circle in Fig. 3. The space group number of metals may have contributed to constructing decision trees that distinguish a subgroup of the dataset where the melting temperature or related factors play an important role; this, in fact, implies that the relative importance of the melting temperature could be greater than its current value. Likewise, it should be noted that the values of some predictors (e.g., the melting temperature of ceramics) are not in good agreement with their real values, since they are calculated solely on the basis of the rule of mixtures.

Pipeline for wettability prediction
To enable the prediction of the wetting angles of arbitrary metal-IC pairs of interest, the featurization code and the regression model were integrated into one computational pipeline. A function to screen the Inorganic Crystal Structure Database (ICSD) was also embedded by utilizing the application programming interfaces (APIs) offered by Automatic-FLOW for Materials Discovery [83], which is an open-access database for high-throughput computational materials design. The first step in the constructed pipeline is system specification, where the metal-IC pairs and temperatures of interest are defined. Second, the full list of Magpie predictors is generated based on the compositions of the constituent materials, thereby completing the input matrix. Then, the input matrix is fed into a random forest regression model (trained on the entire dataset collected), which returns the predicted wetting angles. One should note that the predicted wetting angles are the values expected when the surface roughness is in the range that typical ceramic processing methods generate. The Python code, which works in an interactive manner, can be found at https://github.com/sokim1/Wettability_Metal-IonocovalentCeramic.
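The three pipeline steps (system specification, featurization, prediction) can be outlined with a minimal sketch. It is not the code in the repository above: `screen_ceramics` and the `toy_*` stand-ins are hypothetical, the candidate labels are placeholders, and the angles are made-up numbers used only so that the example runs on its own. In practice, the featurizer and the trained regressor would be passed in instead.

```python
def screen_ceramics(metal, ceramics, temperature_K, featurize, predict_angle,
                    min_angle_deg=90.0):
    """Return (ceramic, predicted angle) pairs with angle >= min_angle_deg, largest first.

    featurize(metal, ceramic, temperature_K) -> one row of predictors
    predict_angle(row)                       -> predicted wetting angle in degrees
    """
    hits = []
    for ceramic in ceramics:
        row = featurize(metal, ceramic, temperature_K)   # step 2: build the input row
        angle = predict_angle(row)                       # step 3: model prediction
        if angle >= min_angle_deg:
            hits.append((ceramic, angle))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# --- Hypothetical stand-ins so the sketch is self-contained ---
def toy_featurize(metal, ceramic, temperature_K):
    return {"metal": metal, "ceramic": ceramic, "temperature": temperature_K}


TOY_ANGLES = {"A": 120.0, "B": 60.0, "C": 95.0}  # made-up values for placeholder labels


def toy_predict(row):
    return TOY_ANGLES[row["ceramic"]]


# Step 1: system specification (metal, candidate ceramics, temperature).
results = screen_ceramics("M", ["A", "B", "C"], 500.0, toy_featurize, toy_predict)
print(results)
```

Passing the featurizer and the predictor as arguments keeps the screening loop independent of where the candidate list comes from, whether a hand-written list or a database query.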
As a demonstration of the pipeline, we predicted the wettability of Li-ion and electron insulator (LEI) candidates by molten Li metal. LEI is a recently proposed class of materials [84]. It is expected to help combat electrochemomechanical degradation in rechargeable solid-state Li metal batteries by functioning as an inert mechanical binder at the interface between the solid electrolyte and the open porous mixed ionic-electronic conductor 3D host [84,85]. Lithiophobicity has been pointed out as one of the important properties required for an LEI to be effective [85]. By feeding the list of LEI candidates identified by Pei et al. [85] via high-throughput screening of the Materials Project database [86] into the pipeline, the wetting angles of those candidate materials were estimated. Table 2 shows the LEI candidates and the corresponding wetting angles at 500 K, which is slightly above the melting temperature of Li metal (~453 K). In accordance with the fact that the metal-related predictors account for more than 75% of the predictor importances, the predicted wetting angles were distributed over a small range (~13°) with an average wetting angle of ~117°.
To check this prediction against first-principles theory, an ab initio molecular dynamics (AIMD) simulation was performed with a molten Li droplet consisting of 54 Li atoms. CaO, whose space group is Fm-3m, was used as the ceramic substrate, since this is one of the most common space groups among the LEI candidates. Wettability is a macroscopic phenomenon, and thus the wetting angles obtained via AIMD simulation cannot directly represent macroscopic wetting angles; nonetheless, AIMD simulation can capture the intrinsic wettability, which is expected to appear when neither surface roughness nor gaseous molecules are present. Fig. 4 shows the positions of Li and the constituent atoms of the ceramic substrate after 1,800 MD steps, which corresponds to ~500 steps after energy convergence is achieved. An exact measurement of the wetting angle was not conducted because of the limited number of atoms in the droplet and their fluctuations throughout the simulation (see Supplementary Movie). Nonetheless, unlike the Li droplet-graphene system, which exhibits an acute wetting angle in ab initio MD simulation [87], the Li–CaO system shows angles greater than 90° throughout the simulation. This result agrees well with the ML prediction, which anticipates CaO to be lithiophobic with a wetting angle of ~119°. The finding that surface roughness and gaseous molecules, which are unavoidable in experiments, generally increase wetting angles [88–90] also supports that the ML prediction is in a reasonable range.

Conclusions
To summarize, we have developed a regression model for metal-IC wettability using the random forest algorithm. Based on ~1,000 experimental data points accumulated over several decades and open-access Python libraries built for high-throughput materials science research, a prediction accuracy of ~0.86 is achieved, and the predictors that are important in modeling wettability are identified; this list would be a useful starting point for investigating the mechanisms underlying the wettability. Moreover, the regression model and the code for generating predictors were integrated into a single pipeline to enable high-throughput searching of metal-IC pairs with the desired wettability. The wetting angle prediction obtained using this pipeline was in good agreement with the density functional theory-based simulation. Composites composed of two or more different classes of materials have the potential to outperform their constituent materials. ML model-guided design can help realize the full potential of metal-ceramic composites.

Declaration of competing interest
The authors declare that there are no conflicts of interest.