Machine Learning Enabled Discovery of Application Dependent Design Principles for Two-dimensional Materials

The large-scale search for high-performing candidate 2D materials is limited to calculating a few simple descriptors, usually with first-principles density functional theory calculations. In this work, we alleviate this issue by extending and generalizing crystal graph convolutional neural networks to systems with planar periodicity, and train an ensemble of models to predict thermodynamic, mechanical, and electronic properties. To demonstrate the utility of this approach, we carry out a screening of nearly 45,000 structures for two largely disjoint applications: namely, mechanically robust composites and photovoltaics. An analysis of the uncertainty associated with our methods indicates the ensemble of neural networks is well-calibrated and has errors comparable with those from accurate first-principles density functional theory calculations. The ensemble of models allows us to gauge the confidence of our predictions, and to find the candidates most likely to exhibit effective performance in their applications. Since the datasets used in our screening were combinatorically generated, we are also able to investigate, using an innovative method, structural and compositional design principles that impact the properties of the structures surveyed and which can act as a generative model basis for future material discovery through reverse engineering. Our approach allowed us to recover some well-accepted design principles: for instance, we find that hybrid organic-inorganic perovskites with lead and tin tend to be good candidates for solar cell applications.


Introduction
Two-dimensional (2D) materials have emerged as attractive candidates for energy applications due to their unique electronic, mechanical, chemical, optoelectronic and magnetic properties. [1][2][3][4] Among their different prototype structures, MXenes have been explored for applications in battery electrodes, water purification, catalysis, lubrication etc. [5][6][7] Their structure and composition allows the careful tuning of properties for these applications. 8 Most device applications of 2D materials require mechanical integration with the substrate, promoting an interest in their mechanical properties. MXenes are known to be mechanically stronger compared to other 2D materials resulting in applications in protective coatings, composites and membranes. 9 2D materials offer a new way of tuning the properties of their 3D counterparts like band gaps through exfoliation. 10 2D counterparts of perovskites have shown promise for solar cell applications 11 (Figure 1). Figure 1: Illustration of 2D materials applications: (a) 2D perovskites whose bandgaps can be the range of 1.5-3 eV have been demonstrating promise in the field of photovoltaics, ; (b) MXenes can be used in composites to increase their mechanical strength, here indicated by the material's response to an external mechanical stress (in grey).
The existence of a variety of 2D structures and atoms to populate their sites imply that a purely experimental or computational approach based on first-principles calculations to identify materials for desired applications is infeasible. For example, an MXene structure of the form M n+1 X n T x (X=C, N) when provided with m metal possibilities, t functional group possibilities combinatorially explodes: the upper bound on the number of materials would be ∼ 2 n m n+1 t x ; for example, the case, n = 2 (i.e. M 3 X 2 ) with m = 10 metal possibilities and t = 3 different terminations (e.g. F, OH, O) on either side of the structure would yield approximately 35, 000 materials, while just doubling the number of metal options from 10 to m = 20 gives ∼ 280, 000 possibilities. With the generation of massive amounts of materials data, 12 data-driven techniques offer a new avenue to tackle this problem. 13 Data-driven methods have shown the promise of not only furthering our fundamental understanding of materials 14 but also provide a platform for performing large scale computational screening through the development of accurate structure-property relationships. [15][16][17][18] Machine learning methods can bypass the use of expensive first-principles calculations and help accelerate the often time consuming discovery and optimization of materials for various applications. [19][20][21] Recently, graph convolution based machine learning models have shown promising generalization capability for predicting the properties of crystals and molecules. 18,[22][23][24] These methods encode the structure of a material as a graph based on the position and coordination of atoms, thus circumventing the use of carefully handcrafted or engineered structural features. This enables them to be used in a variety of applications. Here, we extend crystal graph convolutional neural networks (CGCNN) to study materials with planar periodicity.
Using 100 different, randomly generated training sets, we trained an ensemble of CGCNN models to predict thermodynamic, mechanical, and electronic properties. The ensemble of neural networks shows errors comparable to those from highly accurate first-principles calculations, such as density functional theory (DFT), as discussed in the Supporting Information.
We use this ensemble to screen ∼ 45, 000 2D monolayer materials with focus on mechanically strong MXenes (c 11 , c 22 ≥ 175 N/m) and in perovskites whose band gaps fall within an acceptable range ([1.5, 3] eV) for solar cell applications. The two applications chosen are quite different from each other and aims to demonstrate the generalizability of our model predictions. With these models, We recover some well-accepted design rules: for instance, hybrid organic-inorganic perovskites with either tin or lead as metal components are useful for photovoltaics, as well as fomamidinium, imidazolium, or azetidinium fulfilling the role of organic cations. Similarly, we find that titanium based MXenes tend to be mechanically   robust, however, incorporating the other main elements of group 4 of the periodic table, zirconium and hafnium, also aids in increasing the stiffness of this class of structures. Interestingly, we also identify some lesser investigated principles: all-inorganic perovskites whose band gaps are most likely to fall within a desirable range for solar cell applications have zirconium, niobium, hafnium, scandium, or vanadium as metal components.
We primarily used data from the C2DB database to train the CGCNN model. As of August 2019, C2DB consists of over 3500 structural, thermodynamic, elastic, electronic, magnetic, and optical properties calculated using density functional theory (DFT). Each structure was combinatorially generated from a series of prototype structures that differ in space group, stoichiometry, and thickness. Some example prototypes include BN (space group P3m2), BiI 3 (P3m1), or PbSe (P4/mmm). DFT calculations were performed using the Perdew, Burke, and Ernzerhof (PBE) exchange correlation functional 29 in the projector augmented wave (PAW) code GPAW. 30 The stability of each material is evaluated by predicting enthalpy of formation, the elastic constants, and the phonon frequencies. If a material is stable, its electronic structure and other properties, such as polarizability, are also calculated. The most relevant properties for screening 2D MXenes and perovskites are the heat of formation; bandgap; and the c 11 , c 12 , and c 22 components of the elastic tensor.
Using the CGCNN models trained from the C2DB data, we predict the heat of formation, bandgap, and in-plane elastic tensor components of approximately 20, 000 2D perovskites and 25, 000 2D MXenes. The perovskite structures were taken from two sources: the HOIP dataset, 26 and the Castelli database. 27 The HOIP database contains 1,346 structures that were combinatorially-generated from a series of 135 prototypes. The perovskite prototypes were obtained using the minima-hopping method outlined by Goedecker.

Model Training
In order to screen the ∼ 20, 000 perovskites and ∼ 24, 000 MXenes 2D monolayer structures, as well as to uncover the underlying design principles for their respective applications, a technique that can predict properties accurately at a computational cost much lower than DFT is required. We use the Crystal Graph Convolutional Neural Network (CGCNN) framework 22 as a surrogate technique for predicting material properties. This method provides the accuracy of DFT calculations (discussed in the Supporting Information) but at a fraction of the associated computational cost: while it can take up to 500 CPU hours to compute the c 11 coefficient for one structure with DFT, CGCNNs can predict the same property for roughly 25,000 structures in under 20 GPU minutes once trained. This framework has been successfully used in a variety of applications, from selecting solid electrolyte candidates 18 to screening catalytic materials. 32 At the foundation of the CGCNN is the undirected multigraph representation of the crystal structure, in which nodes represent atoms by their respective features, and edges encode interatomic bond distances. 22 Iterative convolution layers update atomic feature vectors based on neighbor information. A simplified depiction of the CGCNN can be seen in Figure 2. Figure 2: A simple representation of the CGCNN architecture. The atomic structure is first converted into its crystal graph representation. This graph is then passed as input for the convolutional layers, where the atomic feature vectors are updated based on neighbor information and bond lengths. Next, a pooling function is employed to produce an overall vector representation of the crystal, guaranteeing invariance with respect to number of primitive unit cells used in the creation of the original structure. Finally, a set of fully connected hidden layers maps the simplified vector-represented crystal structure to the property of interest.
After optimizing the network architecture (see Supporting Information), we used an ensemble of 100 CGCNN models, each trained on a random set of 70% of the C2DB data to predict the properties of interest: band gap, log(c 11 ), log(c 22 ), c 12 , conduction band minimum (CBM), valence band maximum (VBM), and heat of formation (H form ).

Structure Screening
In order to evaluate the accuracy of the ensemble of 100 models, we used them to predict the properties of the all the structures in the C2DB database. The results for the ensemble predictions of some of the main properties of interest can be seen in the parity plots shown in Figure 3. A discussion on the usefulness of the uncertainty quantification of models can be found in the Supporting Information.
As discussed in the Introduction, we screened MXenes in search of structures that are strong mechanically, with both c 11 , c 22 ≥ 175 N/m, thus exceeding those of graphene oxide, 9 and perovskites whose bandgaps fall in the range [1.5, 3] eV, appropriate for solar cell applications. Since it is also required that these structures be stable, as well as synthesizable, our filtering procedure included the additional requirement that H form ≤ −2 eV/atom for MXenes and inorganic perovskites, and H form ≤ −0.5 eV/atom for hybrid organic-inorganic perovskites. The difference in treatment for the latter stems from the fact that hybrid perovskites are known to be relatively less stable than their inorganic counterparts. 33 These threshold values of H form were chosen as they represent the average of the lowest heats of formation of the structures in their respective datasets. We quantify the confidence of our predictions for a given structure s by its c-value (confidence value) 34 representing the fraction of models in the ensemble that predict structure s to be useful for the application, based on the aforementioned criterion. It is calculated in the following manner: where N = 100 is the number of models used and 1 if the ith model predicts the given structure s to be useful 0 otherwise This enables us to determine the 2D structures with the highest likelihood of being useful for their applications, as shown in Table 1. It is important to note, however, that the training sets for the band gap prediction models contained only materials with non-zero band gap, meaning that a further metallic versus insulator filtering, with subsequent update of the cvalues, is needed. The reason for this is that, when given a conducting material, they predict a positive band gap, since they have only been trained on insulators or semiconductors.

Identifying Design Principles
To uncover the compositional and structural commonalities of useful candidates, we applied an analogous concept to study the design principles that can increase the c-values of different which can be interpreted as the chance of an arbitrary model predicting that a random structure in D DP is a useful candidate.
Next, we introduce a minimum threshold, c cut , to distinguish the best candidate struc- will define a few quantities, all functions of c cut . One of the simplest indicators that a given DP is effective at making a structure useful for the application in a combinatorically generated dataset is the proportion of the set of best candidates B that is comprised of materials satisfying the DP, P DP|best = |B ∩ D DP |/|B|, and how it compares with P DP . Additionally, it is helpful to examine the difference between the likelihood of a random material being amongst the best candidates P best|All = |B|/N dataset and the chance of that happening given that the structure contains the DP, c chanceDP = P best|DP = |B ∩ D DP |/|D DP |. Besides these quantities, it is also important to measure our confidence in these candidates, members of B ∩ D DP , by averaging their c-values: Note that, by construction, c bestDP is a monotonically increasing function of c cut while the set B ∩ D DP is not empty. Finally, although redundant with all previously described measures, we also studied, for completeness, how the elements from B ∩ D DP contribute to c DP : Since This approach of understanding the effect of design principles is best suited for combinatorially generated datasets. Therefore, we used it to study all of the data mentioned in the Databases Section, including that from HOIP. 26 We constructed a list of all possible design principles using the same combinatorics applied in the creation of the respective datasets.
These design principles were then ordered by highest to lowest P DP|best /P DP ratio at a cutoff value of c cut = 0.95 for MXenes, and c cut = 0.80 for both inorganic and organic perovskites.
Our choice for cutoff values was guided by the results from Table 1: we wanted to make sure that the set of best candidates B was sizeable enough for a meaningful analysis of the design principles. Furthermore, for the MXenes, we excluded the design rules whose P DP ≤ 0.008% due to their high specificity. We would like to note the following interesting equality, which establishes a relationship between the two most intuitive criteria of gauging the effectiveness of a given DP discussed previously: The results of this analysis are shown in Table 2. We have also chosen one of the top design principles for MXenes (Figure 4), inorganic ( Figure 5), and organic perovskites ( Figure 6) to represent the dependency between the metrics discussed above and the cutoff c cut , which determines the minimum confidence level of the structures in the set of best candidates B.

Interpreting Design Principles
Our study was able to identify some known design rules: for example, we see that titanium (Ti) based MXenes tend to have high stiffness coefficients, as suggested by Figure 4 and Table   1 in Anasori et al. 8 . Interestingly, however, we discovered that the other main elements of group 4 of the periodic table, namely, zirconium (Zr) and hafnium (Hf), can also increase the mechanical strength of this class of materials, as long as these elements are bonded with oxygen, and the opposite side of the monolayer is either oxygen or fluorine terminated. Similarly, our model was able to recognize that, in order for hybrid organic-inorganic perovskites to have a band gap in the range of [1.5, 3] eV, the B-sites should be occupied by either lead (Pb) or tin (Sn), a relatively well-known design principle in the photovoltaics community. [35][36][37][38] We have also found that the organic A-sites should be composed of fomamidinium (HC(NH 2 ) 2 ), imidazolium (C 3 H 5 N 2 ), or azetidinium(C 3 H 8 N). Curiously, for these hybrid perovskites, our method suggests that the X-sites be populated by fluorine, rather than the usual iodine. A deeper investigation suggests that the reason for this is the en-   The collection of design rules we uncovered shows the capability of our model to both identify established criteria to attaining material performance, as well as to find new, unex- Figure 4: One of the top design principles for MXenes: hafnium (Hf) bonded with oxygen (O) termination, and with a fluorine (F) termination bonded to any other metal. The likelihood that any arbitrary model predicts that a random material satisfying this DP is a useful candidate is of ∼ 57%, as indicated by c DP . Choosing c cut = 1.0 shows that the chance of a structure satisfying the DP being among the best candidates is of nearly c chanceDP = P best|DP ∼ 20%, and these candiates contribute to approximately 30% of c DP , as shown by c contribDP . Note that, for this specific DP, P DP|best almost steadily increase with the value of c cut , indicating that, the more confident we want to be in our set of useful candidates, in general, the more prevalent this DP becomes in this B set. Figure 5: One of the top design principles for inorganic perovskites: A-site occupied by hafnium (Hf), with vanadium (V) in the B-site. The likelihood that any arbitrary model predicts a random material satisfying this DP is a useful candidate is of ∼ 29%, as indicated by c DP . Choosing c cut = 0.80 shows that the chance of a structure from set D DP to be among the best candidates is of c chanceDP = P best|DP = 14%, and these candidates contribute to approximately 50% of c DP , as shown by c contribDP . Additionally, for this specific DP, P DP|best almost steadily increases with the value of c cut , indicating that, the more confident we want to be in our set of useful candidates, in general, the more prevalent this DP becomes in this B set. Finally, an examination of the behavior of c best reveals that, for values of the cutoff c cut > 0.6, all of the structures in set B ∩ D DP have a c-value of nearly 1.0. Figure 6: Top design principle for Khazana perovskites: A-site occupied by fomamidinium (HC(NH 2 ) 2 ), with fluorine (F) in all X-sites. The likelihood that any arbitrary model predicts a random material satisfying this DP is a useful candidate is of ∼ 82%, as indicated by c DP . Choosing c cut = 0.80 shows that the chance of a structure from set D DP to be among the best candidates is of c chanceDP = P best|DP =∼ 65%, and these candidates contribute to approximately ∼ 70% of c DP , as shown by c contribDP . Figure 7: Summary of uncovered design rules for: (a) MXenes, where hafnium (Hf), zirconium (Zr), or titanium (Ti) must be bonded to an oxygen termination, regardless of the central X component or the metal M2 on the opposite side of the structure, which should have either an oxygen or a fluorine termination; (b) inorganic perovskites, in which the Asite must be occupied by scandium (Sc), zirconium (Zr), or hafnium (Hf), the B-site has to contain chromium (Cr), scandium (Sc), niobium (Nb), or vanadium (V), and, as discussed in the main text, all X-sites should be occupied by oxygen; (c) organic-inorganic hybrid perovskites, where all X-sites must be occupied by fluorine, the B-site can be occupied by either tin (Sn) or lead (Pb), and the A-site possibilities are fomamidinium (HC(NH 2 ) 2 ), imidazolium (C 3 H 5 N 2 ), and azetidinium(C 3 H 8 N). Note: in the inorganic perovskite case, even though all X-site occupations should be the same, we represented them by different colors for completeness, since changing the value of c cut used in the analysis allows for other possibilities where all three distinct types of X-sites can be populated by a different atom.
plored avenues for application-focused material discovery, since it can be considered a basis for reverse engineering of 2D structures. We believe that machine learning methods such as

Conclusions
Using fast and accurate graph convolutional neural networks on 2D materials, we screened large combinatorially generated datasets of MXene and perovskite materials in search of those with high likelihood of having properties of interest, as determined by the ensemble of trained CGCNN models. Using the results from the screening process, we were able to uncover the underlying design principles that make said structures useful candidates for their respective applications, which can be used guidance for both experimental and computational testing at different confidence levels.
Furthermore, our design rules can act as a generative model for generating datasets by populating a series of structural sites with a given list of atoms. The metrics we adopted for evaluating design principles could prove most useful when dealing with combinatorically generated databases, typically the case for high-throughput screening. For instance, while c DP indicates how generally good a design rule is, the ratio P DP|best /P DP shows how the knowledge that a candidate satisfies the specified design principle impacts its likelihood of being useful.

Data availability
The CGCNN modified code base can be found on GitHub, as well as instructions on how to download it, set up a virtual environment, and run it. Further details that support the findings of this study are available from the corresponding author, V. Viswanathan, upon reasonable request.

Acknowledgement
The authors thank T. Xie

Supporting Information Available
The following files are available free of charge.
The following files are available free of charge.
• suppinfo.pdf: Details on CGCNN hyperparameter optimization and ensemble prediction performance.  Figure S1: Example of root mean square (RMSE) and mean absolute error (MAE) curves used in optimization of network architecture and hyperparameters. These curves were used to evaluate the prediction performance of models using different number of convolution layers, hidden layers, epochs, number of neighbors used in convolution operations, among others.

CGCNN Network Optimization
Represented here, we have the results from training a model to predict H form using a mean pooling function, 300 epochs, 1 hidden and 2 convolution layers.
In this work, we apply CGCNNs power of accurately predicting properties of periodic materials to investigate 2D materials, namely, MXenes and perovskites. Using a 70:15:15 training:validation:test split ratio on the C2DB database for the heat of formation property (H form ), we first optimized the network architecture, including number of convolution and hidden layers, learning rate, and number of epochs to be used in training, and the models' performances were evaluated as shown in Figure S1. We also tested different possibilities of pooling functions (mean, max, and min), as shown in Figure S2. The final architecture used in the models was composed of 2 convolutional layers and 1 hidden layer post-pooling. The networks were trained with a learning rate of 0.01 and a mean pooling function over 300 epochs. Figure S2: Evaluation of how pooling functions, as well as number of neighbors used in convolution operations, affect RMSE and MAE in log(c 11 ) prediction. We also implemented higher order norm functions as pooling operators.

CGCNN Ensemble Performance
In order to measure the performance of our model ensemble, we apply some of the metrics introduced by Kuleshov et al. 1 and, 2 namely, calibration and sharpness, besides mean absolute error (MAE) and root mean square error (RMSE).
In their work, the authors institute the concept of a calibration plot, which compares, for each predicted data point, the standard deviation of the ensemble predictions (y-axis) and the residual between the mean of the predictions and the true value of the data point -the mean error of the predictions (x-axis). In the regions where the observed estimation interval is greater than the expected interval (green), the model ensemble is called underconfident, since the true value falls within the error bars of the ensemble prediction. On the other hand, in the regions where the observed estimation interval is smaller than the expected interval (red), the model is called overconfident, since the standard deviation of the ensemble predictions do not encompass the true data value. The calibration plot for our 100 model ensemble trained to predict conduction band maximum is shown in Figure S3. However, measuring calibration is not sufficient, though necessary, for useful uncertainty quantification. For instance, a well calibrated model with large uncertainty estimates is less useful than a similarly calibrated model with smaller uncertainties. Thus, the concept of sharpness is introduced: a sharper model is that whose prediction standard deviations are smaller. This metric is measured in the following manner: 2 sharpness = 1 N dataset structure V ar[M(structure)] Figure S4 illustrates histograms of the mean prediction errors (red) and ensemble standard deviations (blue) of the models trained over band gap data. The values of prediction mean absolute error (MAE) and root mean square error (RMSE), as well as ensemble sharpness, are also represented. Table S1 contains some of these metrics for all predicted properties. Taking as example our band gap and heat of formation predictions, one can see that our approach performs better than some first-principle simulations: for band gap, the accepted DFT errors are between 0.25 and 0.4 eV, 3,4 while both our MAE and sharpness fall on the lower end of this range; for H form , DFT errors are of usually 0.1 eV/atom, 5 while all our uncertainty metrics are below this value by a safe margin.
CGCNN is a direction-agnostic framework for machine learning and since the material stiffness only depends on the relative positions of the atoms in the crystal, we can use them to predict the elastic constants as seen from the low MAE and RMSE (Table S1). We regress over the log of c 11 and c 22 since they are positive and want to avoid overweighing elastic constants of very stiff materials. Figure S4: Histograms of errors on band gap prediction and of standard deviation of ensemble predictions.

Graphical TOC Entry
Synopsis: We use crystal graph convolutional neural networks (CGCNN) to screen nearly 45,000 monolayer 2D materials for applications in: composites and photovoltaics. We identify underlying compositional and structural design rules that govern the performance of these structures in these applications.