Phase-Field Modeling of Fracture with Physics-Informed Deep Learning

We explore the potential of the deep Ritz method to learn complex fracture processes such as quasistatic crack nucleation, propagation, kinking, branching, and coalescence within the unified variational framework of phase-field modeling of brittle fracture. We elucidate the challenges related to the neural-network-based approximation of the energy landscape, and the ability of an optimization approach to reach the correct energy minimum, and we discuss the choices in the construction and training of the neural network which prove to be critical to accurately and efficiently capture all the relevant fracture phenomena. The developed method is applied to several benchmark problems and the results are shown to be in qualitative and quantitative agreement with the finite element solution. The robustness of the approach is tested by using neural networks with different initializations.


Introduction
Predictive modeling and simulation of fracture processes is critically important in design for many engineering applications and still poses significant challenges.In the past two decades, phase-field fracture modeling [1,2] has emerged as a game changer; it offers a unified variational framework and, upon discretization, a flexible computational method to encompass various aspects of fracture including crack nucleation, propagation, merging, and branching [2][3][4], while obviating the need for crack tracking procedures as well as for ad-hoc parameters and criteria.For these reasons, it has found applications and extensions to a variety of fracture problems including thermal cracks [5], ductile fracture [6], drying-induced cracks [7], and hydraulic fracture [8], among many others.Furthermore, it has been demonstrated to accurately capture topologically complex crack paths, even in three dimensions [5], while delivering quantitatively accurate predictions [9].
The first variational phase-field model for brittle fracture was derived by regularizing the free-discontinuity problem obtained from recasting Griffith's fracture criterion in a variational form [1,10]. Regularization is performed by introducing an additional field, known as the phase field, ranging between 0 (representing the intact material) and 1 (representing the fully cracked material).Localization of this field in a narrow band, whose width is controlled by a length-scale parameter inherent to the model, gives rise to a smeared representation of a crack.Later, it was shown that a more flexible but substantially analogous variational modeling framework can be constructed from a special family of gradient damage models [11], whereby the phase field is interpreted as a damage variable and is therefore postulated to be irreversible.In the computational setting, the regularization of a sharp crack to a smeared representation overcomes the computational challenges encountered in handling an a priori unknown crack path.The crack is now represented implicitly by the continuous phase field; along with the displacement field, this field is obtained by solving the non-linear coupled partial differential equations (PDEs) stemming from the stationarity conditions of the governing energy functional.However, two challenging aspects remain: the energy functional is non-convex, which needs special attention in the numerics [12], and the resolution of the (small) length scale in the discretized setting is computationally expensive.This limitation restricts the applicability of the approach to e.g.optimization, uncertainty quantification and inverse problems.
In recent years, considerable effort has been directed towards leveraging advances in machine learning to accurately model physical phenomena [13]; various physics-informed deep learning approaches [14] have been employed to solve PDEs and variational problems [15][16][17][18].One prominent approach, known as physics-informed neural networks (PINNs), learns the solution of a PDE by the unsupervised (or semi-supervised) training of a neural network (NN) [14,16,19,20].PINNs build upon the universal approximation property of NNs [21][22][23], i.e. the ability of NNs to approximate any continuous function, which allows for the use of NNs as the ansatz space for PDE solutions.Notably, PINNs can seamlessly incorporate any available data and compute solutions to forward and inverse problems with the same NN architecture.They have demonstrated accurate approximation of solutions to a variety of PDEs and inverse problems, also in solid mechanics [24][25][26][27][28][29][30][31].
In PINNs, training of the NN representing the solution field involves the minimization of a loss function comprising the PDE residual and additional terms, which enforce initial and boundary conditions and possibly other constraints.An alternative approach known as the deep Ritz method (DRM) [18,37,38], directly minimizes the energy functional instead of the PDE residual.The DRM is particularly useful in obtaining stable solutions when the governing energy functional is non-convex, as in phase-field fracture modeling.Additionally, since the computation of the energy requires a lower order of derivatives compared to the corresponding strong-form PDE, the DRM is expected to be computationally less expensive [37].On the other hand, energy calculation requires integration over the entire domain, and only a few studies have investigated quadrature rules for NNs [39].
PINN approaches have been developed to solve phase-field problems (Cahn-Hilliard and Allen-Cahn equations) involving the phase field as the only scalar unknown [40,41].In contrast, phase-field fracture modeling involves two coupled fields: the displacement (vector) field and the phase field.Furthermore, while the steep variation of the phase field at localization is governed by the regularization length scale, the displacement field undergoes a much sharper variation wherever the phase field localizes.These differences lead to additional and unique challenges in approximating the solution of phase-field fracture problems using NNs.While the DRM has shown some promise, its applicability has been demonstrated primarily in problems involving crack propagation with simple crack paths [42,43].Crack initiation in the absence of a notch has only been demonstrated by assuming a very large value for the regularization length scale [42].Additionally, these approaches have required a large number of quadrature points, exceeding the number needed to resolve the regularization length scale.Notably, a quantitative assessment of the solution accuracy is also lacking.The applicability of approaches other than the DRM has been partially investigated in [42,44].An operator learning approach [45,46], namely variational DeepONets, has also been applied to predict the crack path in a problem involving crack propagation in quasi-brittle materials [47].However, the applicability of physics-informed deep learning to phase-field fracture modeling hinges on the ability of NNs to learn all the fracture processes accurately, especially considering that this flexibility lies at the core of the phase-field approach and is also one of the key reasons of its success.
In this work, we elucidate the challenges involved in learning the solution of phase-field fracture problems, and investigate the design of NNs and of their training strategies aimed at learning crack initiation, propagation, kinking, branching, and coalescence.We also address the challenges associated with learning the solution fields with the same level of domain discretization as required in finite element analysis (FEA).We systematically demonstrate the accuracy of the learned solution by comparing it with the FEA solution.Additionally, we test the robustness of our approach by training NNs with different random initializations.
The paper is organized as follows.A brief review of the phase-field brittle fracture formulation is presented in Section 2. In Section 3, the DRM is summarized, and the challenges in learning the solution of a phase-field fracture problem are discussed along with the design of the NNs and their training strategy.This is followed by numerical examples of crack nucleation, propagation, kinking, branching, and coalescence in Section 4. Finally, we summarize the main conclusions in Section 5.

Phase-field model of brittle fracture
In this section, we summarize the governing equations of the phase-field model of brittle fracture which we aim at solving with the deep learning approach proposed in the following sections.

Basics of the formulation
Let Ω ⊂ R d , d ∈ {1, 2, 3} be a bounded open domain occupied by a d-dimensional body.Γ D,0 , Γ D,1 and Γ N are disjoint sections of the boundary ∂Ω of Ω with prescribed homogeneous Dirichlet, non-homogeneous Dirichlet and Neumann boundary conditions, respectively.The body is assumed to be linear elastic with the strain energy density given by Ψ(ε) = 1  2 λtr 2 (ε) + µtr(ε • ε), where λ and µ are the Lamé constants, and ε = 1 2 (∇u + ∇u T ) is the infinitesimal strain tensor, with u : Ω → R d as the displacement field.The total energy functional for the body acted upon by body force b and surface traction t is given by: Here E el and E d represent the elastic and damage parts of the energy, respectively.α : Ω → [0, 1] is the phase field (or damage field), with α = 0 representing the undamaged state and α = 1 the fully damaged state.The function g(α), known as degradation function, modulates the degradation of the elastic energy with increasing damage.w(α) is also known as local dissipation function, as it dictates the energy dissipated through damage in the unit volume of the material.The parameter l with 0 < l ≪ diam(Ω) is the regularization length controlling the thickness of the transition zone between undamaged and fully damaged material regions; G c represents the fracture toughness of the material, and c w is a normalization constant given by c w = 4 1 0 w(t)dt.In this work, we adopt the following choices [11]: with η = o(l).The elastic strain energy density Ψ is decomposed into active (i.e.damage-driving) and inactive (i.e.damage-resisting) parts, Ψ + and Ψ − , respectively, and the degradation is applied only to Ψ + [48][49][50][51].While several energy decompositions are available [48][49][50]52], in this work we focus on the volumetric-deviatoric decomposition [48] whereby with ⟨a⟩ + = max{0, a}, ⟨a⟩ − = min{0, a}, the bulk modulus K = λ + 2 3 µ, and the deviatoric strain tensor e = ε − tr(ε) 3 I, where I is the third-order identity tensor.

Time-discrete evolution problem
In the time-discrete setting of the quasi-static evolution problem, the state of the system at the (pseudo-)time or loading step n ≥ 1 is obtained as the solution of the minimization problem where denotes the space of kinematically admissible displacement fields, and denotes the space of admissible phase fields.The condition α ≥ α n−1 represents the irreversibility of the phase field evolution.The first-order necessary conditions of the minimization problem deliver the strong form of the governing equations all valid in Ω, and represent the local equilibrium of forces, damage irreversibility, damage criterion and loading-unloading conditions, respectively.The Cauchy stress is defined as σ = g(α) ∂Ψ + ∂ε + ∂Ψ − ∂ε , and () ′ = ∂ ∂α ().The whole set of boundary conditions reads as follows:

Irreversibility of the phase field
One way to enforce the irreversibility of the phase field in the computations is to add to the energy functional in (1) the following energetic penalty [53]: where γ ir is the penalty parameter, whose value can be chosen as suggested in [53]: Here 0 < TOL ir ≤ 1 is the prescribed irreversibility tolerance threshold; in this work, we set it to TOL ir = 5 × 10 −3 .

Non-dimensionalization scheme
Non-dimensionalization not only reduces the number of relevant physical parameters, but also aids in learning by ensuring that the inputs are ∼ 1.So, we non-dimensionalize the energy functional using the following scheme: where x ∈ Ω ( x ∈ Ω) is a material point coordinate, L is the characteristic length of the body, E is the Young's modulus, and Gc El is of the order of the critical strain for crack nucleation in a 1D bar [11].Then the dimensionless form of the energy functional in (1), with no surface tractions and body forces, and including the energetic penalty to enforce the irreversibility of the phase field, reads where and μ = 1 2(1+ν) .Notably, as a consequence, the only parameters in the non-dimensional energy are ν and l.For simplicity of notation, from here onwards, we only use non-dimensional quantities but omit the (•) symbol.

The Deep Ritz method
In the DRM, as in any other physics-informed deep learning approach for solving PDEs and variational problems, the solution field is represented by an NN, frequently a feedforward NN (also known as a multilayer perceptron).A feedforward NN is a composition of affine transformations and scalar nonlinear activation functions.Given an input y ∈ R n , the NN maps it to the output z θ ∈ R m as follows where C k (with 1 ≤ k ≤ K) represents an affine transformation, also known as the k th layer of the NN.Specifically, , where W k and b k are trainable weights and biases and z k is the input to the layer.The symbol • represents the composition of functions, and σ is a scalar activation function.
Commonly used activation functions are the ReLU, sigmoid, and hyperbolic tangent (tanh) functions [54].θ denotes the set of all trainable parameters of the NN, i.e.
Training of the NN aims to find θ such that z θ approximates the target solution.It involves constructing an appropriate loss function and then minimizing the loss with respect to the parameters in θ.The minimization is typically performed using stochastic gradient decent algorithms such as Adam [55] or higher-order optimization algorithms such as L-BFGS [56].
For our problem, we define the displacement and phase fields obtained from the NN as u θ (x) and α θ (x).Then the loss function is expressed as where (18).Taking the log of the energy in ( 20) makes the loss function ∼ 1, allowing us to set the same weight for weight regularization (to be discussed later) and the same stopping criterion for the optimizer for all the problems investigated in this work.Note that, in our experience, the enforcement of the Dirichlet boundary conditions using a soft constraint in DRM is challenging.Hence, we enforce them by constructing an ansatz, following the approach employed in [14], see Section 3.2.1.For a general domain shape for which an ansatz is not feasible, general approaches for the enforcement of boundary conditions as in [57,58] can be employed.

Construction and training of the NN
Our aim is to obtain an NN-based approximation of the solution fields employing the DRM in learning.This approach entails two key challenges.The first challenge lies in ensuring that the NN-based approximation of the energy surface, E θ , is a faithful projection of the energy landscape in the space of NN-parameters.Specifically, it is crucial that the energy barriers and minima are appropriately represented.In turn, the representation of the energy surface relies on three ingredients: the construction of the NN(s) representing the fields, the computation of the gradients of the fields, and the integration over the domain.The second challenge lies in the choice of the optimization algorithm.This algorithm needs to respect the energy barriers and reach the correct local energy minimum regardless of the complexity of the solution associated with that energy minimum.As follows, we discuss all these aspects and refer to some Appendices for additional details.

Choice of the NN architecture
Physically distinct fields such as the displacement field and the phase field are expected to be learned efficiently when represented by independent NNs.However, we find that an intimate coupling between the fields, facilitated by representing the displacement and phase fields as the outputs of the same NN, helps in learning the nucleation of a crack as shown in Appendix A. Hence, we construct an NN which takes as input the position vector of a point in the domain (2 components in 2D, 1 in 1D) and delivers outputs which correspond to the components of the displacement vector and the value of the phase field at that point (3 outputs in 2D, 2 in 1D), see Figure 1.The nondimensionalization scheme outlined in Section 2.4 ensures that the inputs are ∼ 1.In constructing an ansatz to apply Dirichlet boundary conditions, we ensure that the outputs of the NN corresponding to the displacement field are also approximately ∼ 1 as follows: where ûθ and vθ are the direct NN outputs corresponding to the displacement components, and U p is the prescribed displacement.f u and f v are designed for each problem.To ensure that α θ ∈ [0, 1], the application of the sigmoid function to the direct NN output corresponding to the phase field, αθ , is an attractive choice; however, it turns out that its vanishing slope hinders crack nucleation.Therefore, we design the following function, plotted in Figure 1, to overcome this problem: By setting α θ = f α (α θ ) in the 1D bar problem, we find that the order of the smallest β for which an NN can learn the correct critical load for crack nucleation is 10 −3 .So, we set β = 10 −3 in all examples of this paper.

Choice of the activation function
Another important aspect of the NN design is the choice of the non-linear activation function.In PINNs, smooth activations like tanh are frequently employed.However, we find that an NN with this activation fails to learn the correct crack path (see Appendix B).The evolution of the loss during training suggests that an NN with tanh activation fails to represent the energy barrier in the loss landscape which would prevent crack propagation in an incorrect direction.Therefore, we employ ReLU activation with the following modification: where m k are learnable coefficients.While initializing the NN, we set m k = m 0 , where m 0 is a constant.We observe that while the value of m k does not change significantly during training, some choices of m 0 , as detailed in Section 4.1, facilitate learning the critical applied displacement for crack initiation or propagation with higher accuracy.

Gradient computation and quadrature
Computation of the gradients poses another challenge.When employing autodifferentiation to compute the gradients of the fields represented by an NN with a smooth activation function like tanh, the learned fields display high-frequency oscillations (similar to those resulting from Gibbs phenomena) leading to the localization of the stress in random directions near the crack tip (see Appendix C).This can be viewed as an undesired modification of the energy barrier, allowing for crack propagation in unexpected directions and leading to an inaccurate crack path prediction.
While this is an additional argument against the use of tanh activation, the problem is not solved by the use of the ReLU activation function.To address it, we discretize the domain as we would do in FEA.The field values at the nodes are obtained from the NN, whereas the field values and their gradients within each element, in particular at the Gauss points, are computed using shape functions like in FEA, as follows where the superscript h denotes the finite element approximation of the fields, N a (x) is the shape function corresponding to node a of the finite element discretization, and u a θ and α a θ are the values of u θ and α θ at the same node, respectively, with a = 1...n and n as the number of nodes.This approach to gradient computation makes the training considerably faster.
We use the finite element discretization also for the purpose of approximating the integral in (18) in order to compute the energy using the NNbased representation of the displacement and phase fields.We evaluate the integral employing Gauss quadrature at each element of the finite element discretization.

Weight regularization
Weight regularization plays an important role in obtaining the correct solution of a phase-field fracture problem using the same level of domain discretization as needed in FEA.In the problem of crack nucleation in a 1D bar in Section 4.2, when employing autodifferentiation for strain computation, the NN learns incorrect solutions with approximately zero energy after crack nucleation.An analysis of the governing PDEs (see Appendix D) reveals that the phase-field model does not constrain the maximum slope of the displacement field near the localized phase field, thus this slope depends on the expressivity of the NN.Therefore, if the training points are not fine enough to resolve the sharply changing displacement field expressible by the NN, and not only the length scale l, the DRM yields incorrect solutions as shown in Figure D.26.Correct results can be obtained by a finer discretization of the domain than needed to resolve the regularization length scale; this is however computationally inefficient, as it requires an even finer resolution than needed for FEA computations.Note that, while we first recognized the issue when adopting autodifferentiation, the employment of gradient computation by finite element discretization (as illustrated in Section 3.2.3)does not solve this issue.Instead, a solution is to limit du θ dx max by introducing weight regularization.This leads to a computationally efficient approach, in which the discretization of the domain needed for an accurate solution is comparable to the one used in FEA.Therefore, we employ weight regularization throughout the examples in this paper, see Section 4.1 for more details.

Choice of the optimization algorithm
Optimization algorithms play a key role in the training of an NN and selecting one that respects energy barriers and reaches the correct energy minimum is essential.In this study, we consider the L-BFGS algorithm [56] and the resilient backpropagation (RPROP) algorithm [59].While L-BFGS is the fastest of the two, it is unable to reach a good energy minimum with the localized phase field and the sharply varying displacement field (see Appendix E).In contrast, the RPROP algorithm is observed to respect energy barriers and reach the desired energy minimum.It should be noted that the performance of first-order optimization algorithms like Adam for these problems was considerably poorer compared to L-BFGS and RPROP, hence we do not consider them further in this work.

Transfer learning
As loading is applied in incremental steps, we utilize transfer learning to learn the solution quickly.The trained network from the previous loading step is used as the initial network for training in the current loading step.We observe that transfer learning enables learning the solution field in typically fewer than 500 steps of the RPROP optimizer provided a load step does not lead to a sudden crack propagation and associated jump in energy.

Nucleation and damage evolution
Our experience shows that crack nucleation is more challenging to learn than crack propagation.In the AT1 model, the phase field does not evolve slowly with increasing load but jumps abruptly from 0 to 1 at the location of crack nucleation when the threshold loading for crack nucleation is reached.This poses a formidable challenge for NNs to learn crack nucleation in 2D.Therefore, for 2D problems involving crack nucleation we employ the AT2 model, which allows for a slow evolution of the phase field to nonzero values before it jumps to 1 at the threshold loading condition.

Summary
To summarize, our approach involves one NN with adaptive ReLU activation expressing both the displacement and phase fields.Dirichlet boundary conditions are enforced employing an ansatz which also ensures that the NN outputs corresponding to the displacement field are approximately ∼ 1.To incentivize the phase field to lie between 0 and 1 without hindering crack nucleation, we design a new function f α (α θ ).Field gradients are computed numerically like in FEA.To estimate the integral in the computation of the energy, Gauss quadrature is employed for each element in a discretized domain.In the training of the network, weight regularization is employed to constrain the maximum strain expressible by the NN.The RPROP algorithm is found to perform best in training the NN to learn the solution fields.We also employ transfer learning to learn the solution efficiently.Additionally, while the NN learns crack propagation, branching, and coalescence using the AT1 model, it is able to learn crack nucleation in 2D only with the AT2 model.

Numerical examples
In this section, we first demonstrate crack nucleation in a 1D homogeneous bar and an L-shaped panel.We then study crack propagation in a single-edge notched (SEN) specimen under both tensile and shear loading conditions.To illustrate crack branching, we again model a SEN specimen under shear loading without strain energy decomposition.Crack coalescence is demonstrated by subjecting a specimen with three preexisting cracks to tensile loading.
Since these examples are taken from the phase-field fracture literature, for each example we use the same material properties as in the corresponding paper.This not only enables an easier comparison with the results of the referenced papers; it also demonstrates that the developed approach works robustly for different material properties (and geometries).Throughout the 2D examples, we assume plane-strain conditions.

Numerical setup
For the 1D problem, we use an NN of 4 hidden layers and 50 neurons in each layer.The coefficient in the activation function is set to m 0 = 1, and we do not include m k among the learnable parameters.The L-BFGS optimizer is found to be sufficient to learn the solution in this case.Eight NNs with different initializations obtained by using Xavier initialization [60] with random seeds are trained for this problem.
For the 2D problems, we use an NN of 8 hidden layers and 400 neurons in each layer.The size of the network is chosen based on the numerical experiment conducted for the problem of crack nucleation in an L-shaped panel (see Appendix F).We set m 0 = 2 for crack initiation in the L-shaped panel, and m 0 = 3 for the remaining 2D problems.We train 8 NNs with different initializations for each problem employing Xavier initialization with random seeds.In the first loading step, an NN is first trained with L-BFGS before training it with RPROP; for all the subsequent loading steps only RPROP is used.For RPROP, the learning rate is set to 10 −5 with the smallest and largest step sizes of 10 −10 and 50, respectively.The remaining optimizer parameters are kept as in the standard implementation in Pytorch.The stopping criterion is set to correspond to a relative change in the loss function of less than 5 × 10 −6 for 10 consecutive optimization steps, with a maximum of 10000 optimization steps.Additionally, we apply L 2 weight regularization with a weight of 10 −5 .The Pytorch library [61] is used for implementation and training.
In the 1D problem, we discretize the domain into elements of size l/5 and assume linear shape functions.In the 2D problems, we utilize triangular elements with linear shape functions.Only the regions of the domain where the crack is expected to propagate are finely discretized with elements of size l/5, to limit the computational cost.Away from these regions, the element size smoothly increases up to 4l.Discretization is performed using Gmsh [62].We use one Gauss point per element for integration.
The numerical setup for the reference FEA computations is detailed in Appendix G.

Crack nucleation in a 1D homogeneous bar
We first study crack nucleation in a linear elastic bar shown in Figure 2.This problem lends itself to an analysis that sheds light on the reason for the occurrence of incorrect solutions when the domain is not discretized sufficiently finely.In this case, the expression of the energy in (18) simplifies to the following form: The only parameter in the above expression is l, and we set l = 0.05.We employ the AT1 model (see ( 2)).The following ansatz is employed to apply the boundary conditions: Note that, since fracture at the boundaries requires less energy compared to fracture inside the domain, the NN (as well as the FEA) has a tendency to choose the solution with fracture at the boundary.To suppress this solution, we set α θ = 0 at the boundaries using the above ansatz.
For the 1D bar, the energies obtained from the NN are compared with the FEA solution in Figure 3, showing close agreement.Additionally, the displacement and phase fields before and after crack nucleation are compared with the FEA solution in Figure 4, again showing that the DRM achieves a quite accurate solution.

Crack nucleation in an L-shaped panel
This problem illustrates crack nucleation in 2D.The geometry of the panel and the boundary conditions are as shown in Figure 5, whereas the material properties are set to ν = 0.18 and l = 0.01 [53].To prescribe homogeneous and nonhomogeneous Dirichlet boundary conditions, we construct distance functions d 1 (x, y) and d 2 (x, y), respectively, such that d 1 (x, y) equals 1 at y = −0.the section of the boundary where U p is applied and smoothly decreases to 0 away from it (see Appendix H for more details): As discussed in Section 3.2, the NN finds it difficult to learn crack nucleation when the AT1 model is employed; therefore, we use the AT2 model  in this problem.Figure 6 compares the solution obtained from the NN with the FEA solution at U p = 1.2.The crack path in the NN solution closely resembles that in the FEA solution, although the FEA solution shows comparatively sharper turning of the crack.Note that, due to the non-convexity of the governing energy functional, even the FEA solution is in general not unique, and changes in the obtained crack patterns may be induced by numerical perturbations as discussed in [63].standard deviation of the energies obtained from 8 trained NNs with different initializations, and compares them with the energies obtained from FEA.The energies are in close agreement before crack nucleation, and the NN accurately captures the critical U p at which nucleation takes place.However, in the subsequent stage the damage energy obtained from the NN is higher than in FEA.This discrepancy results from the difficulty in learning a curving crack and can be improved by reducing the loading step sizes and imposing a stricter stopping criterion for the optimizer, which however would increase the computational cost.

Crack evolution in a notched specimen
The geometry of the SEN specimen and boundary conditions are as shown in Figure 8, while the material properties are ν = 0.3 and l = 0.01 [53].We define the following ansatz to apply the boundary conditions and to compute α: We define the notch in the sample by setting an initial phase field as described in Appendix I; the NN learns the notch as a localized phase field as shown in Figure 8.

Tensile loading
With this problem, we demonstrate the ability of the NN to learn crack propagation.We subject the SEN specimen to tensile loading, i.e. the loading angle is ω = π/2, and adopt the AT1 model.Figure 9 compares the fields obtained from NN and FEA at U p = 0.2, showing close agreement.In this example, the simplicity of the crack path allows the NN to easily and accurately learn the cracked solution.Figure 10 compares the energies associated with the NN solution with the energies from FEA, exhibiting once again a close agreement.Note that the comparatively higher standard deviation near the critical load for crack propagation results from differences in the prediction of the critical load for different initializations of the NN.

Shear loading
This problem demonstrates the ability of the NN to learn crack kinking, i.e. the change in direction of a crack.To this aim, we subject the SEN specimen to shear loading, i.e. the loading angle is ω = 0, and again use the AT1 model.The displacement and phase fields obtained from the NN at U p = 0.4 are compared with the FEA solution in Figure 11, showing close agreement.Note that FEA encounters convergence issues for U p > 0.4, while the NN does not exhibit any problem even in that range, as demonstrated by Figure 12, which shows the fields at U p = 0.47.
The energies associated with the NN solution are compared with the energies from FEA in Figure 13.They are also very close, although the    critical load for crack propagation in the NN solution is slightly higher than in FEA.Moreover, higher standard deviations in the energies result from slight variations in the critical loads for different NN initializations.

Crack branching in a notched plate
To demonstrate crack branching, we again prescribe the shear loading condition on the SEN specimen, i.e. ω = 0.In this problem, in the absence  of the strain energy decomposition in (4), which differentiates between tensile and compressive stresses, unphysical branching of the crack occurs [3].Since in this case we do not seek a physically realistic prediction of the behavior but rather an accurate solution of the governing PDEs, we purposefully omit the strain energy decomposition, i.e. we take Ψ + = Ψ and Ψ − = 0. Figure 14 illustrates that the NN is able to learn crack branching.The fields obtained from the NN at U p = 0.37 are compared with the FEA solution in Figure 14, displaying again a good agreement, with some differences in the prediction of the curvature of the crack path.
The evolution of the energies with U p obtained from the NN are compared with the FEA energies in Figure 15.The critical load at which branching initiates shows variability for different NN initializations, leading to a higher standard deviation; however, the average value is close to the critical load predicted by FEA.Also, the difficulty encountered by the NN in learning the curvature of the crack path is reflected in the deviation of the energies from those computed from FEA after the critical load.Note that one of the 8 NNs had difficulty in learning branching; the corresponding result is not included in the computation of the mean and standard deviation of the energies.

Crack coalescence in a plate with cracks
To demonstrate crack coalescence, tensile loading is applied on a specimen with preexisting cracks (Figure 16).In this case, we again adopt (29) with loading angle ω = π/2, and the material properties are ν = 1/3 and l = 0.01 [64].We use a larger value for l in comparison to [64] to reduce the computational cost.
Figure 17 demonstrates that the NN is able to learn crack coalescence.Unlike in the FEA solution, cracks starting from the initial cracks and reaching the domain boundaries are at an angle from the horizontal direction.On  the other hand, as discussed earlier, the FEA solution itself is not guaranteed to be unique [63].
The energies obtained from the NN with increasing U p are compared with the FEA energies in Figure 18.The critical load at which propagation and merging occurs, while close to the FEA prediction, shows slight variability for different NN initializations, leading to a higher standard deviation near the critical load.The energies obtained from the NN after the crack propagates through the sample are in close agreement with the FEA energies even though the crack paths show differences.This suggests that different crack paths may be competing at similar energy levels, which is a known phenomenon in phase-field fracture modeling [63].An alternative explanation may be that the NN is not reaching a sufficiently deep energy minimum, in which case results can be improved by prescribing a stricter stopping criterion for the optimizer.Note that here two NNs (out of 8) exhibit difficulty in learning the correct crack evolution, thus the corresponding results are not included in the computation of the mean and standard deviation of the energies.

Computational cost
Training of the NNs is performed on NVIDIA GeForce RTX 2080 Ti GPUs, whereas FEA computations are performed on Intel(R) Xeon(R) W-  2223 CPUs.While our DRM approach is robust and we are able to learn solutions involving various fracture phenomena, we observe that the DRM is computationally more expensive compared to FEA, with the training for NNs taking up to one order of magnitude longer than the corresponding FEA computations.This observation is not suprising; our aim in this work was not to deliver an alternative solution framework for a single boundary value problem in phase-field fracture, but rather to develop a robust approach which can lay the foundation to solve parametric phase-field fracture problems with the DRM.

Conclusions
To harness physics-informed deep learning for phase-field fracture modeling, we propose the design of an NN and of a learning approach based on the DRM, aimed at learning complex fracture processes.The main conclusions of this work are as follows: • Phase-field fracture modeling requires solving a coupled problem which involves fields with sharp spatial variations.Training NNs to learn such fields is challenging; successful training relies, on one hand, on the ability of the NN to accurately approximate the energy surface and, on the other hand, on the ability of an optimizer to reach the correct local minimum respecting the energy barriers.
• We empirically establish that fully connected feed-forward NNs, with a modified ReLU activation and an appropriate ansatz in the construction of the solution fields, along with weight regularization and numerical gradient computation, is able to learn complex fracture processes.Furthermore, RPROP is found to be the most effective optimization algorithm for training.
• Our approach can solve examples of crack initiation, propagation, kinking, branching, and coalescence taken from the phase-field fracture literature, within one single numerical setup.Results in terms of solution fields and energy vs. applied displacement curves are in quantitatively excellent agreement with the respective FEA results.Additionally, our approach is robust to different network initializations.
In this work, we focused on the design of a robust NN and learning approach to learn the solution of boundary value problems involving diverse fracture phenomena in phase-field fracture modeling.While our approach is computationally expensive compared to FEA, our aim is not to deliver an alternative solution framework for a single boundary value problem in phase-field fracture; rather, the present work is intended as the first step in the direction of learning solutions to parametric phase-field fracture models with the DRM.In this setting, NNs can be trained on a few realizations of the parameter space and results can then be inferred online for all other realizations, leveraging the true potential of the DRM.Learning solutions to parametric phase-field fracture models will be the goal of our future research.

Figure 1 :
Figure 1: Scheme of our deep Ritz method for the two-dimensional case (left); function constraining the phase field between 0 and 1 (right)

Figure 6 :
Figure 6: L-shaped panel: displacement and phase fields at U p = 1.2 from NN and FEA.

Figure 7 Figure 7 :
Figure 7: L-shaped panel: elastic and fracture energies vs prescribed displacement from NN and FEA.

Figure 8 :
Figure 8: SEN specimen: geometry and boundary conditions (left); phase field from NN at the smallest applied tensile load (right).

Figure 14 :
Figure 14: Crack branching test: displacement and phase fields at U p = 0.37 from NN and FEA.

Figure 15 :
Figure 15: Crack branching test: elastic and fracture energies vs prescribed displacement from NN and FEA.

Figure 16 :
Figure 16: Specimen with preexisting cracks: geometry and boundary conditions (left) and phase field at the location of the cracks from NN at the smallest applied load in training (right).

Figure 17 :Figure 18 :
Figure 17: Specimen with preexisting cracks: displacement and phase fields at U p = 0.2 from NN and FEA.

Figure A. 19 :Figure A. 20 :
Figure A.19: Phase field at U p = 0.4 after crack propagation (left) and energies (right) for the SEN specimen under shear loading (ω = 0) when the displacement and phase fields are represented by two separate NNs.

layers and 50 Figure C. 24 :
Figure C.24: Displacement field, phase field, and stress field at U p = 0.31 in the SEN specimen under shear loading (ω = 0).