SurfFlow: high-throughput surface energy calculations for arbitrary crystals

We introduce SurfFlow, an open-source high-throughput workflow package designed for automated first-principles calculations of surface energies in arbitrary crystals. Our package offers a comprehensive solution capable of handling multi-element crystals, nonstoichiometric compositions, and asymmetric slabs, for all potential terminations. To streamline the computational process, SurfFlow employs an efficient pre-screening method that discards surfaces with suspected high surface energy before conducting resource-intensive density functional theory computations. The results generated are seamlessly compiled into an optimade-compliant database, ensuring easy access and compatibility. Additionally, a user-friendly web interface facilitates workflow submission and management, provides result visualization, and enables the examination of Wulff shapes. SurfFlow represents a valuable tool for researchers looking to explore surface energies and their implications in a diverse range of systems.


Introduction
The surface energy of a material determines its stability and influences its adhesion properties, catalytic ability, and ability to form thin films and interfaces.It is defined as the work needed to create a new surface from a bulk crystal.It is a fundamental property that can be used to explain the stability of surface facets, the Wulff shape [1], which is the equilibrium crystal shape of the material, and phenomena such as surface reconstruction [2,3], segregation [4,5,6], and catalysis [7,8,9].Catalytic properties are closely tied to surface energy, and nanomaterials with high surface energies show exceptional properties for electrocatalysis, photocatalysis, and gas sensor applications [10].The surface energies of a pair of materials are also an important descriptor for the adhesion energy of these materials, a property of great importance for the construction of solid-state batteries [11,12].
Experimental data on surface energies are scarce because of the technical difficulty of the measurements.The available data are primarily limited to specific facets of elemental crystals.Experimental measurements of surface energies of solids can be based e.g. on cleavage [13,14], interfacial energy of small elastic smooth spheres [15], or contact angle measurements of various liquids [16,17].The main problem with cleavage-based methods is plastic deformation at the crack tip, while the second approach strictly works only for amorphous solids.Contact angle measurements, on the other hand, are associated with several difficulties and limitations that affect the accuracy of results, such as surface contamination, roughness, and assumptions in the underlying mathematical models.
In the quest to find new and highly functional materials, expensive and time-consuming experimental studies have been supplemented more and more by large-scale computational high-throughput (HT) screenings in recent years, mainly employing density functional theory (DFT).Due to the difficulties with conducting experiments to determine surface energies, and their crucial importance for a range of applications (e.g.Refs.[18,19]), a couple of HT tools to tackle this problem have been published in recent years [20,21,22,23].Calculating surface properties via ab-initio atomistic modeling alleviates the problem of isolating specific facets and provides exceptional control of the system parameters.
Tran et al. generated a database of the surface energies of elemental crystals [20], in which they provide surface energies of more than 100 polymorphs of approximately 70 elements.Yang et al. provide an open-source code [21] to generate organic surfaces from bulk molecular crystals, which was very recently updated to better interface with DFT and neural network interatomic potentials, among other improvements [24].Brlec et al. provide a python library [22] that automates the surface cleaving and the processing of raw DFT outputs to extract materials properties such as surface energy.Furthermore, there has been an interest in predicting surface properties from unrelaxed surfaces, skipping DFT relaxations, and using a neural network model to learn and predict cleavage energies and Wulff constructions [23].
Although these studies are effective in calculating surface energies for specific systems, they are hampered by some constraints: the ability to handle multi-element crystals is very limited, and/or they are not fully automated, requiring the user to manually initiate calculations and run post-processing scripts.This hinders the high-throughput screening approach as manual handling of such a large number of systems quickly becomes impractical.
The main problem that so far eluded a comprehensive HT treatment of surface energies of multi-element crystals is to correctly and automatically handle asymmetric and/or nonstoichiometric slabs, which are unavoidable for most systems, while still keeping the computational effort tractable.This is further complicated by the sheer amount of symmetrically in-equivalent surface directions (Miller indices), which might have several unique terminations each.Looking at surfaces with Miller indices of only up to a maximal index of 3, a rather simple crystal with only two distinct elements might have on the order of 100 unique surface configurations, while only a small fraction of those will show favorable surface energy and actually contribute to the Wulff shape.
In the present work, we aim to solve these issues and present a fully automatic approach to calculate surface energies and Wulff shapes for arbitrary crystals.Our code package efficiently handles multiple terminations of asymmetric and nonstoichiometric slabs.Additionally, it can filter out polar surfaces1 , and predict low energy ones, as well as provide fully automatic calculation handling, error correction, database operations, and output visualization.
The package is developed in Python 3 and builds on well-known and commonly used tools for material science and HT computations like pymatgen [25], atomate [26], and Fireworks [27].It is available from the Python package index (PyPI) or a GitLab repository as open source software and uses the Vienna Ab-initio Simulation Package, vasp [28,29,30], to perform the DFT calculations.See Sec. 3 for the settings used.While the methods and algorithms used in our workflow are not individually new, we believe that no other freely available Python package offers a comparable richness of features, ease of use, and efficiency.
In the following sections, we will present our methodology, highlight our measures to optimize slab size and pre-screen for low-energy surfaces using bond valence information, and benchmark some results against available experimental and calculated data.A detailed flowchart and explanation of the workflow architecture can be found in S5 of the SM.

High-throughput handling of arbitrary surfaces
To model a surface with a DFT code, which generally employs periodic boundary conditions in all directions, requires the construction of a slab, a thin slice of material separated by a vacuum region from its duplicates in one direction.The simplest case of a surface energy (γ) calculation is for a symmetric and stoichiometric slab, meaning that both surfaces of the slab are equivalent and the system contains an integer number of formula units of the primitive bulk cell of the material.Then the surface energy γ is, where A is the area of the exposed surface, E slab the total energy of the relaxed slab, E bulk the energy of the bulk unit 2 , and N the number of formula units in the slab.For all other cases of symmetry and stoichiometry, a more generalized expression can be given where asymmetry leads to two distinct surface energies given by γ1 and γ2, and the bulk energy is replaced by a term involving chemical potentials µi, which quantifies the contributions of the missing atoms in nonstoichiometric slabs.Because the chemical potential of a species is a function of pressure and temperature, instead of a single value for the surface energy, we can talk about a region of stability for the surface that is defined by the environment.
While it is in principle possible to perform reference calculations to determine the region of stability with respect to chemical potentials, this gets progressively more difficult as the number of species in the system increases.It is also not always trivial to choose a bulk or gas-phase reference [32,33].We have thus decided to compute our surface energies with respect to a complete vacuum and not make an attempt to include chemical potentials.
There have been a number of ways presented in the literature to computationally decouple surfaces in asymmetric cases, without relying on chemical potentials.Notable examples are the wedge method [34,35], energy density method [36], surface passivation method [34,37], twinned slab method [38], and methods based on combinations of unrelaxed and relaxed surface energies [39,40].All these approaches intend to isolate one side of the slab by eliminating the contribution to the free energy of the other side.
In this work, we have chosen to employ the method by Tian et al., which calculates the surface energy as the combination of cleavage (very similar to the unrelaxed surface energy given in [39]) and relaxation energies in a way that is highly transferable and able to deal with both asymmetric and nonstoichiometric slabs [40].This approach is also similar to that of Eglitis and Vanderbilt [41] with some improvements in dealing with asymmetric surfaces.The approach is based on the notion that a slab is first cleaved from a bulk, and then the surfaces 2 It has been shown that using a bulk cell with the same lateral symmetry helps to converge the surface energy quickly with respect to slab thickness [31].
relax into their final shape.The cleavage energy is defined as where E unrelax slab is the total energy of the unrelaxed slab, E bulk is the energy of the bulk reference structure, and N is the number of formula units in the slab.(We should note that this definition of the cleavage energy differs from its more common interpretation as the average surface energy of an asymmetric slab.)Relaxation energy for symmetric slabs is simply given as where E relax slab is the total energy of the fully relaxed slab.Two key assumptions in this method allow it to treat asymmetric and nonstoichiometric slabs.First, it is assumed that the cleavage energy (sometimes referred to as the unrelaxed surface energy in literature) is equal for complementary terminations (i.e.sequential layers in the infinite bulk).This assumption follows from the idea that complementary terminations are created simultaneously by cleaving the bulk in a single plane 3 .Thus, one can calculate the contribution of the cleavage energy to the surface energy on each side of the slab separately, provided that the slab is stoichiometric, as Eqn. 3 is valid only for stoichiometric slabs.
The second assumption is that freezing half of the slab is enough to decouple two surfaces of the slab and simulate a semi-infinite slab which is fulfilled if the slab is thick enough that the middle layers are bulk-like.The relaxation energy for asymmetric slabs can then be expressed separately for different terminations where this time we divide by A since only one side of the slab is relaxed.
These assumptions together allow SurfFlow to compute the surface energy γ as where E relaxation is either equation 4 or equation 5, depending on the symmetry of the slab.Of course, equation 6 is dependent on the termination T in the case of asymmetric slabs.
While it is possible in theory to handle all cases of symmetry and stoichiometry using this approach, in practice we have to impose further constraints on the systems to be able to automate the process of calculating surface energies.First, our computational framework deals strictly with bulk-terminated surfaces, which means that adatoms or surface vacancies cannot be considered.
Furthermore, we limit the nonstoichiometric systems to symmetric slabs.This is done by modifying the bottom surfaces of slabs but is not possible for all slabs, thus this option is turned off by default and we generate equivalent stoichiometric, but asymmetric, slabs instead.This is possible in all cases and all possible terminations of a bulk are covered by this approach.For these systems, it suffices to perform two relaxation calculations (top and bottom halves) and two static calculations (static slab and bulk reference) to calculate both surface energies γ(T1) and γ(T2).This case is depicted for an example system (KCl) in Fig. 1.For a more detailed explanation of the symmetric but nonstoichiometric case, see the SM, section 2.
With this method [40], we are able to deal with all terminations using only a few calculations, avoid large system sizes (as for the wedge and twinned-slab methods), or face difficulties in generalizability (as for energy density and passivation methods).

Optimizing performance by slab resizing and predicting low energy surfaces
While simple DFT calculations with a GGA or metaGGA functional can no longer be considered expensive for systems up to around 100 atoms, slabs have usually low symmetry and especially higher index surfaces often need slabs with large lateral extensions.Even if a single calculation is relatively cheap, usually there are many unique directions with several possible terminations each, even for relatively simple crystals.Only surfaces with comparatively low energy are formed in most experimental circumstances, so those are of considerably more interest compared to unstable ones.It is therefore prudent to minimize the computational load by optimizing slabs and pre-screening surfaces to determine if it is likely that they might occur in experiments and/or contribute to the Wulff shape.Optimization of slab size is a simpler task but still not trivial.While pymatgen provides great tools for generating slabs and minimizing their lateral dimensions, the minimal thickness is set by the size of the oriented unit cell (OUC).For high-index surfaces, this OUC might become quite large due to the necessity for periodic boundary conditions.For the slabs, the periodic boundary condition in the direction of the surface normal is broken by the vacuum however, SurfFlow reduces the thickness to match the intended layer count as accurately as possible.However, the original terminations must be preserved, which SurfFlow can achieve by clustering atomic sites into layers and removing appropriate chunks of them.This procedure is described in detail in S5.2 of the supplemental material (SM).
Filtering out surfaces that are unlikely to contribute to the Wulff shape because they should have very high surface energy is a much more complex task.In the first step, we attempt to filter out polar surfaces which usually have higher surface energies than their nonpolar counterparts of the same crystal [42,43,44].The polarity of the surfaces is identified by pymatgen by guessing the most likely oxidation states and calculating the dipole moment in the direction of the surface normal.More about polar slabs can be found in section 3 of the SM.
Next, SurfFlow attempts to rank the remaining surfaces by quantifying the bonds broken during slab generation before actually performing any DFT calculations.Then, the workflow proceeds with N candidates predicted to have low surface energies.These will be used for the Wulff shape generation.(Note that high-index and high energy surfaces can be beneficial for catalysis applications [19], so this option can be turned off when submitting a workflow to include all possible slabs compatible with the slab generation parameters.)In the following paragraphs, we will present the broken bond model used by SurfFlow in some more detail and benchmark the performance on a comprehensive set of materials with different structures, constituents, and bonding types.
Broken bond models have been used to predict surface energies of certain materials in the past [45,46,47].However, the performance of the approach depends heavily on the material and the types of bonds present, which leads to better performance for some systems than for others.SurfFlow adapts this method to apply to a wide range of materials by counting and weighing broken bonds.This is especially important for multi-element crystals where bond strength can vary significantly.The weights assigned to each bond should correspond to its strength, as the energy needed to break them is the main contributor to the surface energy.
SurfFlow makes use of bond valences that have been shown to approximate the bond energy by Etxebarria et al. [48].The valence of a bond between species i and j is given as where R0 is the optimal (for these bonding partners) and Rij is the realized bond length in the investigated material.The parameter b measures the softness of the bond.
Eqn. 7 is the expression most widely used for bond valence, and tables of values for R0 and b for numerous bonds are available in the literature.To be independent of necessarily incomplete tabulated data, SurfFlow uses the sum of the covalent radii of species forming the bond as the ideal bond length, and the frequently used 0.37 Å as b parameter [49] 4 .We define the total bond valence sum (BVS) of a given site i as a sum over all neighbors, We consider all neighbors up to 1.2 times the largest bond length of the bulk, however, due to the exponential decay of the bond valence, mostly nearest neighbors contribute.The total bond valence sum of broken bonds S is then simply a sum of all the partial sums over all surface sites, A similar, but slightly more complex approach was recently used to estimate the relative surface energies of homoatomic transition metal crystals with good success [50].However, for our more diverse data set, the method presented here was found to be slightly more reliable.
No method based on bond breaking can differentiate between in-equivalent but complementary terminations of an asymmetric slab, because both surfaces originate from a single cleavage.However, the bond valence sum correlates generally well with DFT surface energies, with a median Pearson correlation value of 0.87.Thus, in most cases, the method can be used as an excellent preliminary filter to weed out facets and terminations predicted to have high surface energies.

Benchmarking the bond-valence model for predicting low energy surfaces
We have tested our bond valence model for 36 materials (See table 1 in the SM), encompassing different bonding characteristics, crystal structures, and constituting elements, from monoatomic to three-component systems.
All unique non-polar surfaces up to a maximal Miller index (MMI) of 3 have been considered for each material, 653 surface energies in total 5 .The median correlation coefficient across all test systems is r med = 0.870, while two systems with slightly negative correlation reduce the average to ravg = 0.803.However, 30 of the 36 materials tested show a correlation larger than 0.7, as can be seen in the histogram in Fig. 2. The four worst performing systems are bcc Li (mp-135, r = −0.087),γ-TiAl (mp-1953, r = −0.070),as well as bcc and hcp Fe (mp-13, r = 0.573; mp-136, r = 0.550).The by far worst correlation values (essentially 0) are observed for bcc Li and γ-TiAl.Both materials have several surfaces that are extremely close to one another in energy.For Li, a very soft metal, where all surface energies are low, and also the BVS values are very close together, this is not very surprising.We thus recommend not relying on BVS filtering when calculating surface energies of very weakly bound materials.For TiAl, however, the situation is different and might be rooted in difficulties in evaluating Ti d − d interactions correctly in intermetallic systems.Sang and coworkers measured the charge density distribution of γ-TiAl with highly accurate quantitative convergent beam electron diffraction [51].While the qualitative agreement with full potential DFT calculations was good, there were considerable deviations for PAW-based calculations as we conduct them here.A more detailed discussion about all systems with p < 0.6 can be found in S11 of the SM.
In the inset of Fig. 2 we plot examples of the correlation between S and γ for four systems, colored according to the bins into which they are sorted.
To test the bond valence sum filtering approach on Wulff shapes, we use the N surfaces with the lowest bond valence sums to construct Wulff shapes.All of the surface energies are calculated with DFT, however.We then compare the mean absolute errors (MAE) in the area fractions of these Wulff shapes with respect to the full DFT results 5 For a couple of systems we found only 4 non-polar unique surfaces, and have decided to include polar ones with MMI ≤ 2 as well for those.Also, slabs with more than 100 sites were disregarded to conserve computational resources.See S1 in the SM for more details.In Fig. 4 we plot Wulff shapes for the four systems already presented in Fig. 2 calculated by DFT from the lowest N BVS surfaces.These give examples of Wulff shape evolution if more and more surfaces are considered for different classes of BVS/γ correlation.
In the case of CoSi2 (Fig. 4a), where the correlation is really good, we see that for 3 computed surfaces, the (331) facet is quite prominently featured alongside the dominant (111) facet and a sliver of (110).The total MAE of the area fractions is already very low at 1.8%.Adding four more surface energy calculations reduces the MAE a bit more by including (211), (320), and (321) facets, where the latter two disappear in the final shape in favor of the (311) and (310) facets that are both considerably higher in energy at N = 11.For LiPd (Fig. 4b) we have a total of 41 surfaces to consider.Again, a high index surface appears early with the (301) facet, which shows up even at N = 3 and is ever so slightly lower in energy than (101).At N = 9, the (320) and (001) facets also appear, but (320) ends up not contributing to the final shape at N = 38, being replaced by the (101) surface.The (103) facet shown with considerable area fraction at N = 38 does not appear until quite late at N = 32, due to its relatively high surface energy.We see that with many surfaces close in energy, even systems with decent correlations between BVS and γ might not produce the correct Wulff shape unless many surfaces are considered.MAEs are therefore higher here, at 9.4% for N = 9 and 7.2% for N = 14.However, calculating the Wulff shape by computing all unique surfaces with Miller indices smaller than 1 or 2 (as commonly done) results in even higher errors of 14 and 13.2%, and at costs of 7 and 13 calculations, respectively.
For Al3Pt2 (Fig. 4c), where the correlation is in the bottom six of our test set, low-index facets dominate the Wulff shape.Here, MAE starts off relatively high at N = 3 with 11.2% and as we add more surfaces, drops to 6.9% at N = 13 mostly due to the (100) facet appearing which also contributes to the final shape.At N = 27, we see contributions from (111) and (2-12) which remain in the final Wulff shape, and at this value of N , the MAE drops to 3.6%.Calculating up to MMI 1 or 2, however, results in MAE of 5.2 and 0.0%, although at relatively high costs of 11 and 23 calculations, respectively.For this system, this ends up being the better approach, providing higher accuracy at fewer calculations.
For γ-TiAl (Fig. 4d), where we have no correlation and do not expect this method to work well, we expect similar problems.Indeed SurfFlow misses the lowest energy surface, (110), which also contributes to the Wulff shape if we set N < 15.However, the second lowest energy surface, (101) is immediately captured, as is the high energy (001) facet.The MAEs for N = 3 and N = 9 are decent at 9.1 and 6.8%, and at N = 15, we have the correct Wulff shape, even though there are 29 distinct surfaces for MMI = 3.However, computing up to MMI 1 or 2 is more advantageous here as well, with MAE of 3.2 and 2.4%, at costs of 5 and 12 calculations respectively.
Overall, we can confidently say that utilizing the BVS is better than just relying on low-index surfaces, although low surface energy alone does not determine that a facet contributes to the Wulff shape.The average MAE (compared to the full Wulff shape for MMI 3) for all materials is 17.1% if an MMI of 1 is used and 12.5% if MMI is 2. Using the same number of calculations per material as for MMI 1, or 2, but using the BVS predictions to select them, the average MAE goes down to 9.0 and 2.7%, respectively.
Fig. 3 shows a histogram of materials with respect to the coverage6 required to reach 5% of MAE when compared to the full Wulff shape at 100% coverage.The results show that a 40% coverage is sufficient to reach this threshold for the vast majority of materials studied (∼ 75%), while only 4 of the 36 materials need more than 60%.
Finally, the lowest-energy orientation of a crystal is also of considerable interest.In the study of 36 materials, the BVS ranking correctly predicts the surface with the lowest overall energy in 67% of the cases, and in over 85%, it is within the lowest 3 predictions (see inset in Fig. 3).If finding the lowest energy surface for a set of materials is the primary concern, SurfFlow's BVS filtering is thus an excellent tool to save computation time.

Benchmarking the SurfFlow workflow
As mentioned in section 1, it is hard to measure surface energies experimentally.Thus, it is not easy to benchmark our workflow using hard experimental data as a reference.
We have decided to first validate our workflow using some monoatomic systems with data from the materials project as reference [20].The agreement is excellent other than for some outliers which hint at some problems in the data of reference [20] since our results match other previous studies.A detailed discussion of these results can be found in S10 of the SM.
A more rigorous test of the important capability to treat multi-component systems is to compare our calculated Wulff shapes with the experimentally determined and previously calculated nanoparticle shape of the anatase and rutile phases of TiO2 (mp-390 and mp-2657).These oxide materials potentially feature polar surfaces and reconstructions, and thus present a particularly hard challenge for SurfFlow's algorithms.They are also well-studied systems, both experimentally and especially theoretically (see e.g. the reviews by Diebold and Liu et al. [52,53]).
For these calculations we have employed the defaults of SurfFlow (see S5.1 in the SM), disregarded polar surfaces, and utilized the bond valence sum for pre-screening, only calculating the lowest 10 BVS terminations with DFT.
For anatase, SurfFlow predicts the (101) facet to be dominant (99%) in the Wulff shape and also having the lowest surface energy of all facets at 0.41 J/m 2 which is in good agreement with previous results of 0.44 J/m 2 [54,55].The only other surface contributing to the Wulff shape is a small (001) facet, which has a considerably higher surface energy of 0.95 J/m 2 (0.90 J/m 2 [54]).Nanoparticles with this shape were already grown more than 20 years ago by Penn and Banfield [56].Further calculated low energy surfaces are the (100) and ( 201) facets, at 0.51 J/m 2 (0.53 [54]) and 0.63 J/m 2 , respectively.Indeed, TiO2 anatase nanoparticles with exposed (201) facets have been grown previously [57,58].The other high-index surfaces showing relatively low surface energy we calculated are (112), (310), and (103).Another termination of the (201) facet with low BVS, but featuring an undercoordinated oxygen atom did not converge, so only 9 surface energies are reported in table 1.The correlation between the BVS and the DFT computed surface energy is very high at ranatase = 0.96.For rutile, (110) is found to be the dominant facet, occupying 82% of the total area of the Wulff shape, with a surface energy of 0.29 J/m 2 .Two more facets, (201) at 0.80 J/m 2 , and (101) at 0.97 J/m 2 each contribute about 9% to the Wulff shape.In literature, only low index surfaces (MMI=1) are calculated, and the Wulff shapes computed feature the (110), (101), (100), and (001) facets [59,60].We do calculate a low surface energy of 0.58 J/m 2 for the (100) facet, but it does not contribute to the Wulff shape, while the (001) surface is not calculated because its BVS sum is not within the lowest 10 surfaces.
SurfFlow's result for the (110) facet is in line with a previous PBE result of 0. This broad spread of values is due to well-known and large oscillations of the electronic properties of this surface with respect to the odd/even number of titanium layers [62], which makes the surface energy very hard to converge.Such oscillations are not found for the (100) surfaces.Our (110) result was computed for 4 titanium layers according to SurfFlow's defaults, which yields a particularly low value of the surface energy, crowding out the (100) surface from our Wulff shape.High-resolution scanning tunneling microscopy images suggest that the (100) facet should however be present [60].
Again, one of the facets, now (320), did not converge, leaving us with 9 computed surface energies reported in table 1.Note that of the 9 computed surface energies, 6 have MMI > 1, and the correlation between BVS and computed γ is decent at r rutile = 0.80, indicating that those surfaces indeed have low surface energy compared to not calculated lower index surfaces.
In conclusion, we have developed SurfFlow, a Python package that efficiently computes surface energies and Wulff shapes for arbitrary crystals in a high-throughput manner.By employing symmetric considerations, optimizing slab thickness, and implementing a pre-screening approach based on the bond valence sum, we successfully reduce the computational cost associated with determining low-energy surface orientations and terminations.For all but very weakly bound materials we find a good to excellent correlation between the bond valence sum and the surface energy.Utilizing this correlation, we have shown that we can reliably construct more accurate Wulff shapes with fewer calculations than by just computing low-index surfaces.This paper presents the algorithms and approximations utilized as well as the workflow architecture, web tools, and a comprehensive database that includes hundreds of surface energies for crystals with diverse chemistries and structures.Notably, SurfFlow offers features such as streamlined job submission, automatic error correction, post-processing capabilities, and integration with an optimade compliant MongoDB database.

Methods
We use the Vienna Ab-Initio Simulation Package (VASP, version 6.3.1, [28,29,30]) with the potpaw.54potential set [63].The potential mapping is equivalent to the one defined in the MPRelaxSet of the Materials Project, except for tungsten, where we use W sv instead of the deprecated W pv. The user can easily overwrite this potential mapping while submitting workflows.Our package allows for easy changes to computational and structural parameters such as plane wave energy cutoff, k-point density, and slab thickness.We have nevertheless taken care to set default values that promise both accurate results and reasonable computation time.Those are e.g. using 400 eV as the plane wave cutoff, 5.0 Å−1 for the k-point density, and slab thicknesses of at least 8 layers or 10 Å, whatever is greater.All data presented here are computed with those defaults, or in some cases of older data, with higher values.Smearing parameters change for the relaxations depending on material parameters (especially bandgap) and might be corrected during runs by custodian, but for total energies, we always include an extra static run using the tetrahedron method with Blöchl corrections.Our workflow is optimized to use the PBE or SCAN functionals, but LDA or other GGAs, as well as van der Waals corrections, can also be selected.The results calculated for this paper are using the PBE functional.More information about the choice of functional and pseudopotential set is available in section 8 of the SM.

Figure 1 :
Figure 1: Surface energy calculation scheme for an asymmetric and stoichiometric KCl (mp-23193) (111) slab (a) with complementary terminations using the method by Tian et al. [40].Calculations required are relaxations of the initial slab with only the (b) top, and (c) bottom halves relaxed, (c) static initial slab, and (d) static oriented unit cell.
up to the MMI 3. The area fractions describe which percentage of the Wulff shape each facet contributes.

Figure 2 :
Figure 2: Main plot: Histogram sorting the investigated materials into bins according to the Pearson correlation numbers of the bond valence sum with the DFT surface energies.Inset: Surface energy γ vs bond valence sum S for four exemplary materials representing different success situations of the BVS estimation of the surface energy.Colors indicate the bins of the main plot.The shaded regions represent the 95% confidence interval for the regression estimate.

Figure 3 :
Figure 3: Coverage required to reach MAE threshold of 0.05 for area fraction compared to the full Wulff shape (coverage 100%).The inset visualizes how many N lowest BVS predictions must be calculated with DFT for all materials to find the lowest energy surface.

Figure 4 :
Figure 4: Wulff shapes of 4 materials with varying correlations between S and γ for different numbers of calculated Miller-index/surface shift combinations.The facet colors correspond to the surface energy, from dark blue (lowest) to dark red (highest).
31 J/m 2 [54], while Perron et al. report a Perdew & Wang GGA value of 0.48 J/m 2 [61], Bredow et al. find 0.63 J/m 2 for the same functional and Jiang and coworkers compute an even larger value of 0.74 J/m 2 with PBE [60].