Learning constitutive models from microstructural simulations via a non-intrusive reduced basis method

In order to optimally design materials, it is crucial to understand the structure-property relations in the material by analyzing the effect of microstructure parameters on the macroscopic properties. In computational homogenization, the microstructure is thus explicitly modeled inside the macrostructure, leading to a coupled two-scale formulation. Unfortunately, the high computational costs of such multiscale simulations often render the solution of design, optimization, or inverse problems infeasible. To address this issue, we propose in this work a non-intrusive reduced basis method to construct inexpensive surrogates for parametrized microscale problems; the method is specifically well-suited for multiscale simulations since the coupled simulation is decoupled into two independent problems: (1) solving the microscopic problem for different (loading or material) parameters and learning a surrogate model from the data; and (2) solving the macroscopic problem with the learned material model. The proposed method has three key features. First, the microscopic stress field can be fully recovered. Second, the method is able to accurately predict the stress field for a wide range of material parameters; furthermore, the derivatives of the effective stress with respect to the material parameters are available and can be readily utilized in solving optimization problems. Finally, it is more data efficient, i.e. requiring less training data, as compared to directly performing a regression on the effective stress. For the microstructures in the two test problems considered, the mean approximation error of the effective stress is as low as 0.1% despite using a relatively small training dataset. Embedded into the macroscopic problem, the reduced order model leads to an online speed up of approximately three orders of magnitude while maintaining a high accuracy as compared to the FE$^2$ solver.


Introduction
Simulation methods are widely utilized to bridge the understanding between macroscopic behavior and the microstructure of the material. This can be achieved by using multiscale methods such as computational homogenization (CH). In CH, the microstructure and macrostructure are both separately modeled, with the microstructure essentially replacing the constitutive model of the macroscopic problem. The microstructure no longer be recovered.
Intrusive methods, on the other hand, still consider the microscopic PDE and therefore often require far less data due to the known physics. Instead of replacing the microscopic problem, the solution is accelerated. One notable method is the Transformation Field Analysis (TFA) [35], which is specifically designed for simulations involving inelastic materials. By considering the field of internal variables to be piecewise uniform, the computational cost of evaluating the nonlinear terms is greatly decreased. Moreover, the number of internal variables that needs to be tracked is also reduced. Later, the method was generalized to non-uniform fields and referred to as Nonuniform Transformation Field Analysis (NTFA) [36,37]. Even though successful in accelerating two-scale simulations, the reduced model is designed for a specific microscopic material model and therefore cannot handle material parameters [38].
Another approach called the Self-consistent Clustering Analysis (SCA) was proposed in Liu et al. [39] and extended in [40]. The idea of this method is to find material points within a microstructure that exhibit similar deformation behavior and to group them into clusters. It is then assumed that each cluster exhibits the same deformation, therefore reducing the number of independent points to the number of clusters. The compressed problem is then solved with a FFT approach. A similar approach was proposed in [41], however, in that work the problem is solved with a three-field Hashin-Shtrikman type variational formulation. Similarly to (N)TFA, this method does not offer a direct way of treating material parameters.
In the Proper Generalized Decomposition (PGD) [42,43], the solution to a PDE is approximated by a linear combination of terms that each consist of a separated functional representation for each independent variable (i.e. coordinates, time, and parameters). The PGD approach has been applied to nonlinear solid mechanics problems, see, e.g., [44]. It has also been combined with the LATIN multiscale method for non-linear problems, see, e.g., [43]. However, the method becomes increasingly more difficult for severe nonlinearities, especially in combination with varying parameters [42].
A related approach is the reduced basis (RB) method [45][46][47] which attempts to reduce the parametric problem by searching for the PDE solution on a reduced space spanned by global parameter-independent basis functions. The basis functions can be constructed by a proper orthogonal decomposition (POD) of a collection of high-fidelity simulations, which are often referred to as snapshots. The solution can then be found by either solving the reduced problem with a (Petrov-)Galerkin projection [48][49][50][51][52][53] or as proposed in more recent works by using a regression-based approach to find the basis coefficients [54][55][56][57][58][59]. Material parameters can naturally be dealt with in this framework.
The projection-based approach works especially well when the problem has an affine dependence on the parameters, allowing the decoupling into an offline (construction of the RB) and online stage (solving the reduced set of equations). However, this is not the case for the general solution of a nonlinear problem since the nonlinear terms will still depend on the full scale problem. To work around this issue, methods such as the empirical interpolation method (EIM) [60] and its discrete variant [61], or hyperreduction [52] can be used to approximate the nonlinear term and recover the affine dependence. In the context of CH, POD was first applied on a hyperelastic material in Yvonnet et al. [62]. However, in this work the Galerkin approach was directly applied without any special treatment of the nonlinear term, thus resulting in a rather modest speedup. In Hernández et al. [53], the computation of the nonlinear term was accelerated by a hyperreduction scheme and an algorithm to find appropriate integration points was proposed. In Soldner et al. [50], POD was combined with three different hyperreduction schemes and it was found that some approximations led to convergence problems in some cases. Another problem is that there does not exist a practical way of calculating the consistent effective stiffness [62], as this would require the full assembly of the microscopic tangent matrix. If the tangent matrix was approximated with a hyperreduction scheme, its symmetry might not be preserved and therefore an inconsistent effective stiffness obtained with a perturbation method might be preferable [50].
Instead of solving the reduced system, the regression-based approach infers the coefficients of the corresponding RB functions from a regression. In doing so, the online phase only consists of evaluating the coefficients from which the solution field can be directly obtained. Furthermore, the method becomes non-intrusive, similar to the other data-driven approaches. However, as observed in Hesthaven et al. [55], considerably more snapshots might be needed to construct an accurate regression as compared to an accurate basis. For that reason, active learning algorithms [54,59] or a bifidelity reconstruction [56] were recently proposed in order to minimize the number of snapshots needed.
While the projection-based method efficiently approximates only the field variable (here, the displacements), the regression-based approach could, in principle, be applied to any derived variable obtained in the high-fidelity solution (here, e.g., the stress). In [57], Swischuk et al. utilized a regression-based POD to directly find the strain field. In the case of CH, the microscopic displacements are not needed, but only the effective stress is required. Hence, in this work, we propose a regression-based RB surrogate model for the microscopic stress field, which is specifically geared towards multiscale simulations, to combine the advantages of intrusive and non-intrusive methods: (i) Due to the non-intrusiveness, this method can be easily implemented and the two-scale problem is decoupled into two independent problems, (ii) the approximation of the stress field allows a direct way of computing the effective stress and its derivative with respect to any (loading or material) parameter and (iii) even though the method requires more data as compared to the projection-based POD, it is more data efficient as compared to the pure data-driven methods. Furthermore, there are two reasons that favor the stress-based regression over the usual displacement-based approach: 1. If the displacements were approximated, the stresses would still need to be computed from the displacements. This means that the material model must be implemented again, or the displacement data has to be fed into the microstructure solver to return the stresses, thus negating the advantages of the non-intrusive method. Furthermore this would lead to unnecessary additional computation, leading to decreased efficiency. 2. Inelastic material models are defined in stress rate rather than stresses, including internal variables in their formulation that need to be tracked. It would thus not be possible to determine the stresses using only the displacement field since the evolution of the internal variables would not be known. As a result, the procedure would become highly inefficient or even inapplicable to history-dependent material behavior. On the other hand, works in DNN have shown that history-dependent behavior can be learned from stress and strain data alone without the need for internal variables [25][26][27][28][29][30]63].
The remainder of this paper is organized as follows. In Section 2, the two-scale theory of first-order computational homogenization is reviewed. Following that, Section 3 presents how the non-intrusive reduced order model is constructed. Moreover, an error estimation and a comparison to neural networks is provided. Two microstructural simulations and a two-scale problem are addressed in Section 4, testing the capabilities of the proposed framework. A summary of the findings and concluding remarks are presented in Section 5.
Notation. In this work, italic bold symbols are employed for coordinates and functions, such as the coordinates X, the displacement field u or the stress field P . Upright bold symbols are used for vectors or tensors representing algebraic or discretized quantities, such as the identity matrix I, discrete stress field snapshots P or the snapshot tensor S. When A is a N th-order tensor with indices x 1 , x 2 , ..., x N , then A x1,x2,...,xn with n ≤ N denotes the same tensor with the first n indices held fixed and is of (N − n)th-order. As an example, when A is a matrix, then A i denotes the ith row vector and A ij the entry at the ith row and jth column. Macroscopic variables are denoted by an overline to distinguish them from microscopic quantities; otherwise, the notation described above is used for both micro-and macroscopic quantities.

Macroscopic simulation
Consider a solid bodyΩ that is deformed under prescribed boundary conditions. Under the deformation, each pointX of the undeformed body is continuously mapped onto the coordinatesx of the deformed body. The macroscopic displacement for each point is defined asū :=x −X, and the macroscopic deformation gradient then follows asF The governing partial differential equation (PDE) describing the deformation is given by the quasi-static linear momentum balance, whereP is the macroscopic first Piola-Kirchhoff (PK1) stress tensor,B are the body forces,N is the outward normal on the surface of the body,t 0 andū 0 are the prescribed traction and displacement, and ∂Ω N , ∂Ω D denote the Neumann and Dirichlet boundaries with ∂Ω = ∂Ω N ∪ ∂Ω D and ∂Ω N ∩ ∂Ω D = ∅. The weak form of Eq. (2) is then where δū ∈ H 1 0 (Ω) = {v ∈ H 1 (Ω) | v = 0 on ∂Ω D } is a test function, and a solution for the displacement, u ∈ H 1 (Ω), is sought that fulfillsū =ū 0 on ∂Ω D . To solve for the displacements,ū, the linearization ofḠ in the direction ∆ū around the current deformationū is needed, whereĀ := ∂P ∂F is the macroscopic stiffness tensor. To model a certain material behavior, a constitutive relation which connectsP andĀ to the deformationF has to be provided.

Microscopic simulation
The microscopic simulation is defined on a representative volume element (RVE) Ω. In first-order CH, it is assumed that the microscopic deformed points x are coupled to the macroscopic deformation by where w(X) is a zero-mean microscopic fluctuation displacement field. Note that the macroscopic deformation gradientF depends on the macroscopic coordinatesX. For conciseness, this dependence is ommited here and the following equations are given for a fixedX. The microscopic displacement is then given as and the microscopic deformation gradient as Analogous to the above, the microscopic deformation is governed by the quasi-static linear momentum balance, where body forces are neglected for simplicity. The weak form is given as and the linearization of G in the direction ∆u around the current state u is In contrast to the macroscale, the materials are fully specified by a constitutive law in the microscale. In this work, we consider a hyperelastic Neo-Hookean material model with strain energy density function where C 1 and D 1 are the material parameters stored in µ = [C 1 , D 1 ] T , Tr(•) denotes the trace of a tensor, C = F T F the right Cauchy-Green deformation tensor and J = det(F ) is the determinant of F respectively. The material parameters C 1 and D 1 are related to the Young's modulus E and Poisson ratio ν by The PK1 stress and stiffness tensor can then be derived from Eq. (11),

Scale coupling
The aim of the microscopic simulation is to replace the macroscopic constitutive modelP (X,F ). For every quadrature point in the macrostructure, the macroscopic simulation transfers the macroscopic deformation gradient to the microscopic simulation, which in turn returns an effective stress and stiffness tensor; see Fig. 1 for a visualization. The displacements of both scales are coupled according to Eq. (6). Furthermore, the Hill-Mandel Condition, states that the virtual work exerted on both scales has to be the same; here, {•} = |Ω| −1 Ω {•}dV denotes the volumetric averaging operator with |Ω| the volume of the RVE Ω.
It can be shown that the condition is always fulfilled by introducing appropriate boundary conditions for the microscopic problem. One set of boundary conditions that fulfill Eq. (14) are linear displacement boundary conditions, where the fluctuation displacement w on the RVE boundary is assumed to be zero. With Eq. (5), the following Dirichlet boundary condition follows: In [14], it is shown that, by prescribing Eq. (15) on the boundaries, the macroscopic deformation gradient is always equal to the averaged microscopic deformation gradient, i.e.F = F . It then follows that the averaged microscopic stress has to be equal to the macroscopic stress: However, the averaged stiffness tensor is generally not equal to the macroscopic stiffness tensor due to the changes of w inside Ω, i.e., thus complicating the calculation of the correct effective stiffness [14]. A popular way to compute the stiffness with a perturbation method is suggested in [17], which essentially approximates the stiffness with a finite difference scheme.

Reduced order model
The two-scale simulation presented above can be solved using the finite element method (FEM) on both scales, leading to a nested FE scheme, also known as FE 2 . Although effective, FE 2 method tends to be computationally very expensive. To lower the computational cost, we approximate the microscopic problem using a non-intrusive reduced order model, constructed by a proper orthogonal decomposition (POD) and a Gaussian process regression (GPR), which we will refer to as P-PODGPR later. The proposed method differs from existing works in the literature (see e.g. [54,56,57,64]) in that the non-intrusive reduced order model is constructed directly for the stresses instead of the displacements or strains. As seen in Fig. 1, both displacement and strain are not required by the macroscopic solver and, moreover, the stresses still need to be calculated, which means that the microscopic constitutive law has to be implemented anew, negating the advantages of a non-intrusive method. Moreover, depending on the material model the evaluation might be costly and compromise the speed up significantly. It is also not obvious how one would derive the effective stiffness for that case. On the other hand, reducing the stresses directly circumvents the above-mentioned disadvantages, and both the effective stress and stiffness can be rapidly obtained.

Proper orthogonal decomposition (POD)
Principal component analysis (PCA) is a powerful tool in data science to find a low-dimensional approximation in Euclidean space. By introducing an appropriate function metric, PCA can be generalized for functions in Hilbert spaces and is referred to as proper orthogonal decomposition (POD). POD can utilize the correlation of solutions of a problem for different parameters to find a low-dimensional orthonormal basis, thus reducing the number of unknowns to a very small fraction of the dimension of the original highfidelity model. The procedure for computing the low-dimensional basis of the microscopic stress P ∈ L 2 (Ω) is outlined in the following.

Basis construction
After N pod high-fidelity simulations for different parameters have been carried out, the microscopic stress in all quadrature points P ∈ R Nqp×3×3 , where N qp is the total number of quadrature points in the high-fidelity model, is collected for each solution in the snapshot tensor S ∈ R Nqp×3×3×N pod , Next, the correlation matrix C ∈ R N pod ×N pod can be formed by computing the L 2 -inner product between every pair of two snapshots, where (•, •) L 2 (Ω) denotes the L 2 -inner product and i, j = 1, 2, . . . , N pod . The weights w q are the integration weights corresponding to each quadrature point; they only depend on the mesh and can be easily computed. Note the subtle difference between italic and nonitalic characters used, corresponding to L 2 (Ω) functions and their discrete counterparts respectively. After obtaining the correlation matrix, its eigenvalues λ l and eigenvectors v l , l = 1, 2, . . . , min (L, N pod ) are computed, where L is the maximum number of basis functions specified by the user. The number of basis functions is often chosen from the criterion where E pod corresponds to the energy captured by L basis functions and is specified by the user. Moreover, the eigenvalues can reveal if a problem is reducible using POD: if the eigenvalues decay rapidly, the solution space can be accurately captured by a few basis functions. Finally, the lth POD basis function B l ∈ R Nqp×3×3 is found with By construction, the POD basis is orthonormal, i.e.

Approximation of stress and stiffness
With the POD basis functions, P can be approximated with where the loadingF and material parameters µ are listed separately, as the partial derivative over the loading parameters yields the stiffness tensor, whereas the partial derivative with respect to µ could be used for optimization or inverse problems. Since the basis functions B l are constant, only the coefficients α l ∈ R depend on the parameters and need to be found for a new parameter value. Note that with this approach the ability to determine the microscopic displacement, energy and stiffness tensor is lost, if the constitutive relation P (F ) can not be inverted. However, if knowledge of the deformation is required, a similar procedure using displacement or deformation gradient snapshots could be implemented [54,56]. This should in general work well if the stresses can be approximated accurately, because the stress is a non-linear function of the displacement and therefore should be more difficult to approximate. By averaging the microscopic stress, the macroscopic stressP ∈ R 3×3 is found: Lastly, the effective stiffnessĀ ∈ R 3×3×3×3 is determined by differentiating the macroscopic stress over the macroscopic deformation gradient, As seen above, by introducing the approximation Eq. (23), the macroscopic stiffness tensor can be naturally derived when the derivative of the coefficients α l with respect toF is available. Moreover, the derivatives of stress with respect to the material parameters µ can be obtained analogously and used in optimization or inverse problems.
Remark. As this surrogate model is used to replace the microstructural simulation, it is sufficient to construct the basis for macroscopic stretchesŪ instead ofF , since the macroscopic deformation gradient can be split intoF =RŪ by using the polar decomposition, whereR accounts for rotations. From the principle of material objectivity it follows,P see [65]. Due to the symmetry ofŪ , the number of parameters reduces from 4 to 3 in 2D or from 9 to 6 in 3D. In the following,F is therefore replaced byŪ and all snapshots are assumed to have been obtained for differentŪ and µ.

Regression model
The mapping α l (Ū , µ) for all parameters can be constructed by means of a regression, using the training data that have been collected for POD. If the regression quality is insufficient, more data has to be generated.

Data preparation
The stress data P (i) collected was generated for parameters (Ū (i) , µ (i) ) with i = 1, 2, . . . , N reg , where N reg denotes the total number of snapshots available. The best approximation of the ith snapshot on the POD basis is given by and hence α where P denotes the parameter space. Due to the orthonormality of the POD basis functions, see Eq. (22), the coefficients α l are fully uncorrelated and a separate regression for each POD coefficient can be constructed independently. With this mapping, the stress P is approximated aŝ

Gaussian process regression
Since the material model considered here is hyperelastic and therefore history independent, each stress corresponds to exactly one set of deformations and therefore, a wide range of regression techniques can be used. Many different regression approaches have been used to approximate the coefficients, e.g. radial basis functions [58], neural networks (NNs) [55], and Gaussian process regression (GPR) [54,56,59]. A systematic investigation on these three methods has been conducted in [66]. A comparison of different machine learning methods for regression was performed in [57]. For this work, GPR is used since it offers some desirable properties: 1. It can reconstruct the training data perfectly, i.e. it reproduces the exact solution at the training points. 2. Depending on the chosen kernel, the regression function has a specified smoothness and its derivatives can be obtained analytically. 3. The trained GPR model returns an uncertainty measure for each evaluation, which can be used to estimate the regression error or to develop an active learning scheme [54,67].
In GPR, a scalar regression function f (X) with X ∈ R d is assumed to be distributed as a Gaussian process (GP) with a zero mean function and kernel k θ (X, X ), The form of the kernel k θ (X, X ) has to be chosen by the user, and each kernel has hyperparameters θ that must be fitted to the data. In this work, we use, as in [54], the automatic relevance determination (ARD) squared exponential kernel, where θ = [σ f , l 1 , l 2 , . . . , l d ].
A GPR is separately performed for each POD basis coefficient, yielding in total L GPR models. The ARD kernel is infinitely smooth, hence the Jacobian of α l can be easily obtained and Eq. (25) is fully specified. With the trained GPR models and the POD basis, the surrogate model for the microscopic simulation is complete. Given a parameter, the microscopic stress field can be rapidly evaluated, from which the effective stress and its derivative can be derived. Therefore the surrogate model can be considered a learned constitutive model, where additionally the stress field on the microscale is still accessible.
As indicated earlier, the non-intrusive reduced order model will be referred to as P-PODGPR.

Error estimate
For brevity of notation, we use the symbol π to denote both the loading and material parameters in this section. Using the Cauchy-Schwarz inequality, it can be shown that the total error between the high fidelity simulation and the approximation by P-PODGPR for a parameter π ∈ P consists of the projection error and regression error, P-PODGPR (π) = ||P HF (π) −P RB (π)|| L 2 (Ω) = ||P HF (π) − P RB (π) + P RB (π) −P RB (π)|| L 2 (Ω) ≤ ||P HF (π) − P RB (π)|| L 2 (Ω) projection error If the parameter space has been appropriately explored by the snapshots used for POD and the eigenvalues exhibit a rapid decay, the projection error can be reasonably expected to quickly approach zero for any parameter value as the number of basis functions L is increased. The squared regression error for the stresses can be rearranged, where the orthonormality of B l was used in the second last line and || • || 2 denotes the Euclidean norm. Hence, the regression error is simply equal to the Euclidean norm of the error of the coefficient vectors. This result is analogous to the result presented in [56] for the displacement.

Comparison to Neural Networks
After the pioneering works by Ghaboussi et al. [25,26] and recent advances in Deep Learning, many papers have used methods of deep learning to extract a constitutive model from pairs of stress and deformation data by training a deep neural network (DNN), e.g. [27,30,33,69,70]. Frequently their data is collected from RVE simulations by applying boundary conditions that fulfill Eq. (14) and solving the microscopic simulation to generate training data, analogous to what is done in this work. However, one difference is that in this work the stress field is collected while the other works only collect the averaged stress and dispose of the stress field. A brief comparison between both approaches is given below: 1. Training: The DNN approach essentially performs a regression on the stress and deformation data.
This means that a mapping from R 3×3 → R 3×3 (or R 6 → R 6 in the case where the stress and deformation measures are symmetric) has to be learned. P-PODGPR, on the other hand, uses the correlation of the microscopic stress field solutions and finds a few uncorrelated coefficients which can be independently learned, i.e. a mapping from R 3×3 → R or R 6 → R is learned. 2. Implementation: Both surrogate models, after they have been trained, can be easily adopted into any simulation software as they are both non-intrusive and therefore entirely independent of the software used to solve the microscale simulation. 3. Evaluation: The DNN approach needs to compute one forward pass through the neural network to get the effective stress for a given deformation. The effective stiffness can then be computed with one backward pass with automatic differentiation. However, the microstructural stresses, which are desirable for finding local stress concentrations and designing improved microstructures, cannot be obtained. In the P-PODGPR method the stress field is fully obtained by evaluating L GPR models, from which the effective stress can be computed. The effective stiffness is obtained with Eq. (25) as the GPR model also supports an analytical way of computing the derivative of α l over the deformation.

Results
To demonstrate the performance of P-PODGPR and the influence of the number of basis functions L, number of samples used for the basis construction N pod , and number of samples used for the regression N reg , two single-scale examples with different 2D RVEs are presented. The third example involves a two-scale problem, in which the results obtained with P-PODGPR are compared with the FE 2 solution.

Single-scale RVE simulation
To measure the accuracy of P-PODGPR on test data, the following relative error measure for the effective stress is defined: where • F denotes the Frobenius norm,P HF the effective stress from the high-fidelity simulation and P RB the effective stress resulting from P-PODGPR. The projection error of the basis is attained when P RB =P RB , whereP RB is the effective stress after projecting the high fidelity solution onto the POD basis when the solution is known. As the regression error Eq. (37) goes towards zero,P RB tends towardsP RB , so the projection error can be interpreted as a lower bound of the solution from P-PODGPR. All microscopic simulations were conducted with the FE framework MOOSE [71].

Porous material
A porous microstructure as depicted in Fig. 2a is considered for the first example, where the pores account for 14% of the total area. Four-node quadrilateral elements with four quadrature points are employed. The matrix material is modeled as a Neo-Hookean material with material parameters C 1 = 1 and D 1 = 1. Linear displacement boundaries withŪ − I ∈ [−0.05, 0.05] are considered. Due to the geometry of the problem, such a deformation leads to much higher local strains, see Fig. 2b for an exemplary deformation with [Ū xx ,Ū yy ,Ū xy ] = [0.95, 0.95, 0.05]. If larger displacements on the boundaries were considered, some pores might close, requiring contact detection. In this example, the material parameters µ are fixed, hence this problem has three varying parameters.
Data generation. To investigate the number of precomputations needed for an accurate representation, a set of 500 training snapshots for training P-PODGPR was sampled via a Sobol sequence sampling procedure and another set of 1000 test snapshots, to evaluate the accuracy of the surrogate model, was generated randomly from a uniform distribution. It was observed in [72] that Sobol sequence sampling fills the parameter space more evenly as compared to random sampling and thus provides better results.
Eigenvalues. The eigenvalues of the correlation matrix for different numbers of training snapshots N pod are plotted in Fig. 3. For all cases, an exponential decay is observed, indicating the reducibility of the problem. The von Mises stress of the first three POD basis functions is plotted in Fig. 4. It can be seen that the basis functions specifically capture the stress concentrations around the pores. Subsequently, L = 20 basis functions are considered which corresponds to an energy E pod of 99.9996%.
Influence of N pod and N reg . For L = 20 basis functions, combinations of N pod ∈ {20, 50, 100, 200, 500} and N reg ∈ {50, 100, 200, 300, 400, 500} were investigated. All 1000 test data were evaluated and the resulting mean and maximum error of the stresses are plotted in Fig. 5. It can be observed that for all cases the mean error is below 0.1%. From N pod = 20 to N pod = 50, a significant improvement in the error can be observed. However, for N pod > 50, the error only changes marginally. As N reg is increased, more data becomes available for the regression and therefore the mapping in Eq. (29) becomes increasingly more accurate. Nevertheless, even using only 50 snapshots for both basis and regression yields a mean error of 0.1% and a maximum error of roughly 0.65%.
Influence of L. Using N pod = N reg = 50, the influence of the number of basis functions L was then investigated. The relative errors of the effective stress are given in Fig. 6. The projection error is also plotted to compare the quality of the regression in Eq. (29). As seen from the figure, both the mean and maximum error of the first 8 basis functions nearly perfectly match. However, for a larger number of    basis functions, the discrepancy slowly grows and the error flattens. Generally, the POD coefficients get increasingly more oscillatory with increasing number and hence require more data for an accurate regression.
To show that with increasing N reg , the regression error decreases and approaches the projection error, the error over L for N pod = 50 with N reg = 500 is also plotted in Fig. 7. For this case, the mean error matches the projection error for 13 basis functions, while the maximum error also gets much closer to the projection error. Note that the maximum error of P-PODGPR is sometimes slightly lower than the projection error. This can be explained with errors in the regression that lead to a stress field which after averaging ends up closer to the high fidelity solution than the best projection. This is entirely random and as N reg is further increased, the regression error will tend towards zero and the curves will match eventually.
Comparison with Neural Networks. For comparison, a deep forward neural network with different architectures was trained with the available deformation and stress data collected from the RVE simulations. The stress field data was averaged to give effective stresses. Regarding the network architecture, different neural networks with N h = {1, 2} hidden layers with each N n = {20, 50} neurons were tested. The input layer and output layer both had 3 and 4 neurons respectively, one for each component of the stretch tensor and effective stress. Apart from the output layer an ELU activation function was applied after each layer. Before training, the deformation and stress data were normalized to [0, 1]. Then, all available N reg = 500 training snapshots and all 1000 test snapshots were set as training and validation dataset for the optimization. The      network was trained using a mean squared error loss function, the Adam optimizer with a learning rate of 1 × 10 −4 and a batch size of 32 for 10000 epochs. The validation loss and relative errors P obtained for each architecture are given in Tab. 1, where the best architecture is highlighted. All architectures produce similar results with the second architecture N h = 1, N n = 50 yielding the best results with an average and maximum error of 0.17% and 1.36%. On the other hand, P-PODGPR already yields a lower mean and maximum error of around 0.065% and 0.65% when using only 50 training data, showing that P-PODGPR can utilize the information of each snapshot more efficiently than the neural network. When 500 snapshots are used for the regression in P-PODGPR, a mean and max error of less than 0.02% and 0.2% is achieved, hence outperforming the neural network greatly.

Fiber reinforced material
In the second example, the considered microstructural RVE consists of two different phases: a soft matrix and a stiff fiber material. Both materials are Neo-Hookean, but the matrix has parameters C 1 = 1, D 1 = 1, while the fiber has variable parameters C 1 = D 1 ∈ [50, 150], corresponding to a Young's modulus that is 50-150 times higher than the matrix, with the Poisson ratio remaining the same (ν = 0.25), cf. Eq. (12). The geometry used is depicted in Fig. 8, where the matrix (grey) completely surrounds the fiber (red). Eight node quadrilateral elements with four quadrature points are employed. The volume fraction of the fiber is 12.56%. Linear displacement boundaries withŪ −I ∈ [−0.3, 0.3] and fiber parameters of C 1 = D 1 ∈ [50,150] are considered. Therefore, the considered problem has 4 varying parameters.
Data generation. A set of 1000 training snapshots and 1000 test snapshots were generated. The first set contained the 2 4 = 16 corner points of the parameter domain and the remaining points were sampled from a Sobol sequence, while the second set was generated from a random uniform distribution.
Eigenvalues. The eigenvalues of the correlation matrix for different numbers of training snapshots are plotted in Fig. 9. Similar to the last example, for all cases, an exponential decay can be observed, showing the   reducibility of the problem. The von Mises stress of the first three POD basis functions are plotted in Fig. 10. Subsequently, L = 20 basis functions are considered which correspond to an energy E pod of 99.9901%.
Influence of N pod and N reg . For L = 20 basis functions, combinations of N pod ∈ {20, 50, 100, 200, 500} and N reg ∈ {100, 200, 300, 400, 500, 1000} were tested. The mean and maximum error plots can be seen in Fig. 11. It can be observed that for all cases with N reg > 50, the mean error is below 1%. The mean error improves as more snapshots were considered for the POD basis, suggesting that there was still new information in the snapshots, which fine-tune the optimal basis. The mean error reaches around 0.04% for N reg = 500 snapshots with a maximum error of around 1%. Increasing N reg from 500 to 1000 does not improve the mean error significantly, meaning that the projection error has already been reached with 500 snapshots and L = 20 basis functions. Weighing the approximation error over the number of full computations needed, N pod = N reg = 200 is considered in the subsequent analysis.
Influence of L. Using N pod = N reg = 200, the influence of L is investigated. The mean and maximum error of the approximation error over the number of basis functions L used are shown in Fig. 12. The projection error is also plotted to reveal the quality of the regression of Eq. (29). As seen from the figure, the mean      and maximum error for the first 11 basis functions match the projection error quite well, and afterwards flatten out, similar to the last example. When the number of snapshots N reg is increased to 500, the mean error matches the projection error for 16 basis functions, as seen in Fig. 13, while the maximum error also slightly improves.
Comparison with Neural Network. Similarly to the last example, deep forward neural networks with different architectures were trained for comparison. The available 500 training data and 1000 testing data were used for training and validation. The same network architecture configurations as before were tested, with one additional combination N h = 2, N n = 100. The results are given in Tab. 2. The fourth architecture N h = 2, N n = 50 performs the best with an average error of 0.39% and a maximum error of 2.06%. Same as for the previous example, P-PODGPR outperforms the neural network. With only 200 training snapshots, P-PODGPR achieves roughly the same accuracy as the NN using 500 snapshots. With 500 employed snapshots, P-PODGPR reaches a mean and maximum error of 0.04% and 1%, hence outperforming the neural network, while also being able to recover the microscopic stress field.
Remark on Periodic Boundary Conditions. For the same microstructure (Fig. 8), an analogous analysis using periodic boundary conditions was considered. In contrast to linear boundary conditions, the fluctuation displacement field w is assumed to be periodic on the RVE boundary. The obtained results regarding the approximation quality are shown in Fig. 14, where N pod = 200 and N reg = 500 was chosen. In comparison to Fig. 13, the results are comparable and hence it can be concluded that P-PODGPR works independently of the chosen boundary conditions.

Two-scale simulation
To show the performance of P-PODGPR in a two-scale simulation, the learned constitutive model for the fiber reinforced RVE with N pod = N reg = 200 is embedded inside a FE solver. For the macroscopic problem the Cook's membrane, consisting of the fiber reinforced RVE, is chosen. The geometry and mesh of the membrane are given in Fig. 15. The macroscopic mesh consists of 200 bilinear quadrilateral elements with 4 quadrature points. The quadrature points A and B, as marked on the figure, denote points in which the microscopic stress field is compared against the reference solution. The reference solution is obtained by running a full two-scale FE 2 simulation. The left side of the sample is fixed and the right side is loaded in five time steps with a vertical traction of 0.1. The material parameters of the fiber are taken to be fixed with C 1 = D 1 = 100 in this example.
The yx-component of the macroscopic stressP yx obtained by the full FE 2 and FE with P-PODGPR simulation are given in Fig. 16, while the microscopic stress P yx at point A and B are shown in Fig. 17 and Fig. 18 respectively. The relative error defined as Pyx := |P FE2 yx − P ROM yx |/ |P FE2 yx | , with |P FE2 yx | the averaged absolute stress, is also shown (Fig. 16c, 17c, 18c). As can be seen, the shape of the stress field of both solutions is indistinguishable. Even though the relative errors for the microscopic problem reaches a maximum of 7% near the interface of matrix and fiber, after homogenization the highest error reduces to merely 1%. This discrepancy is due to the fact that the method tries to reduce the L 2 -norm of the error in the stress field and therefore allows locally high errors.
Using 48 cores 1 , the FE 2 simulation takes around 100 minutes while the simulation with P-PODGPR is completed within 1 minute on a single core 2 , resulting in a speedup of about three orders of magnitude. For P-PODGPR, 200 RVE simulations have to be pre-computed, which takes less than one hour on a single core. Performing the POD and GPR to construct P-PODGPR takes around one minute.

Conclusions
In this work, the proper orthogonal decomposition (POD) has been applied on the microscopic stress field to find a reduced basis. A direct mapping from the loading and material parameters to the POD  basis coefficients was built with Gaussian process regression (GPR). For the two considered microstructures, involving a porous and a fiber-reinforced material, the proposed method captured the stress field accurately with a mean error of 0.1%, even in the case of varying material parameters. Finally, the learned constitutive model was employed inside a macroscopic simulation and compared to a full scale FE 2 simulation, reaching high accuracy on both macro-and microscale, while gaining a speedup of the order of 10 3 .
Although the examples presented here are two dimensional, the theory has been derived for three dimensions and can readily be applied to 3D problems. Due to the non-intrusive nature of the method, it can be easily implemented into any existing finite element solver. It is noted that the framework is not limited to POD and GPR, but in fact any method for discovering the reduced basis and any regression method can be employed.
This novel method has the potential to open up new ways of material design, as the construction of the surrogate model only requires a relatively small dataset and the surrogate can cover a large range of microstructural parameters. Since the derivatives with respect to the parameters are at hand, it is furthermore possible to define and solve macroscopic optimization problems. Moreover, the microscopic stress field can be fully recovered and visualized in order to make informed modifications of the microstructure to relieve local stress concentrations and remove flaws. Currently, works on more complicated microstructures are being conducted.

Data availability
The data that support the findings of this study are available from the corresponding author upon request.