Viscoelastic Constitutive Artificial Neural Networks (vCANNs) $-$ a framework for data-driven anisotropic nonlinear finite viscoelasticity

The constitutive behavior of polymeric materials is often modeled by finite linear viscoelastic (FLV) or quasi-linear viscoelastic (QLV) models. These popular models are simplifications that typically cannot accurately capture the nonlinear viscoelastic behavior of materials. For example, the success of attempts to capture strain rate-dependent behavior has been limited so far. To overcome this problem, we introduce viscoelastic Constitutive Artificial Neural Networks (vCANNs), a novel physics-informed machine learning framework for anisotropic nonlinear viscoelasticity at finite strains. vCANNs rely on the concept of generalized Maxwell models enhanced with nonlinear strain (rate)-dependent properties represented by neural networks. The flexibility of vCANNs enables them to automatically identify accurate and sparse constitutive models of a broad range of materials. To test vCANNs, we trained them on stress-strain data from Polyvinyl Butyral, the electro-active polymers VHB 4910 and 4905, and a biological tissue, the rectus abdominis muscle. Different loading conditions were considered, including relaxation tests, cyclic tension-compression tests, and blast loads. We demonstrate that vCANNs can learn to capture the behavior of all these materials accurately and computationally efficiently without human guidance.


Introduction
Many important materials, such as elastomers or soft biological tissues, undergo large deformations and exhibit nonlinear viscoelasticity behavior.Biological tissues additionally typically exhibit a pronounced anisotropy.Numerous experiments have confirmed that elastomers [1,2,3] and similarly ligaments and tendons exhibit nonlinear viscoelasticity [4,5,6,7,8,9,10,11].In the past, many constitutive models have been proposed to characterize these materials.However, selecting an appropriate model and identifying its material parameters requires expert knowledge.Further, when selecting a model, one usually has to make a compromise between its computational efficiency and its ability to capture nonlinear viscoelasticity adequately.Therefore, this contribution aims to develop a data-driven framework that automatically discovers constitutive models for anisotropic nonlinear viscoelasticity at finite strains.Ideally, the framework is simple and numerically efficient but, at the same time, highly versatile, describing a wide range of materials.As a starting point for our development, we review existing modeling approaches and identify their advantages and limitations.
Broadly, existing approaches to describe viscoelasticity can be categorized into hereditary integral and internal variables models.In hereditary integral models, the viscoelastic stress response is calculated by the convolution of the deformation history and an appropriate kernel function [12,13,14].A potential difficulty of these models is that, in particular for multiple integral models, the experimental determination of the kernel functions can be cumbersome [15,16] and very sensitive to noise [17].Also, the numerical implementation is often challenging since one must account -in general -for the whole deformation history.Therefore, hereditary models have mainly been applied to simple one-dimensional problems and have had only a limited impact on finite element (FE) analysis.An important exception is the theory of quasi-linear viscoelasticity (QLV) [18].In QLV, the integrand of the hereditary integral is the product of a time-dependent reduced relaxation function and the rate of the instantaneous elastic stress, usually derived from a hyperelastic strain energy function.Often, the reduced relaxation function is represented by a Prony series [19].Due to its computational efficiency and a large number of candidate functions for the reduced relaxation function and the instantaneous elastic stress, this approach has been used frequently [4,20,21,22,23,24,25,26,27,28,29,30].Due to the linear relationship between the reduced relaxation function and the instantaneous elastic stress within the hereditary integral, QLV falls short of representing fully general nonlinear viscoelastic behavior.In fact, normalized relaxation curves predicted by QLV have the same shape, independent of the strain, which contradicts experimental observations.Within the group of internal variable models, two model families have been particularly successful.The first family follows [31] and assumes an additive split of the stress into an equilibrium part and n non-equilibrium overstresses, in analogy to the generalized Maxwell model.The number of overstresses is arbitrary and can be independently selected for isotropic and anisotropic contributions to the overall material behavior.Linear ordinary differential equations (ODEs) with constant coefficients govern the evolution of the overstresses, serving as internal variables.Closed-form solutions of the evolution equations by convolution integrals result in efficient time integration algorithms [32].Therefore, these models appear in many commercial FE codes [33].Due to the linear evolution equations, models of this family are theoretically restricted to finite linear viscoelasticity (FLV), i.e., finite strains but small perturbations away from the thermodynamic equilibrium.Although originating from different theories, FLV and QLV are similar [34,35].Thus, FLV suffers from the same limitations as QLV, failing to represent general fully nonlinear viscoelastic behavior such as strain (rate)-dependent viscous properties.[36] and [37] attempted to account for nonlinear viscoelastic effects by choosing strain-dependent coefficients of the evolution equation.These attempts led to improved but still not yet fully satisfactory results.FLV models have, for example, been employed in [37,38,39,40,41,42].The second model family describes finite nonlinear viscoelasticity (FNLV), i.e., finite strains and finite perturbations away from the thermodynamic equilibrium [43].Motivated by the decomposition of the strain into an elastic and viscous part in the theory of linear viscoelasticity, FNLV models are based on the multiplicative decomposition of the deformation gradient into an elastic and viscous part proposed by [44].In general, nonlinear ODEs govern the evolution of the viscous part of the deformation gradient, serving as an internal variable.In analogy to the generalized Maxwell model, multiple decompositions of the deformation gradient are possible, each associated with the non-equilibrium stress of a Maxwell element [43] and an associated internal variable (representing a viscous part of the deformation gradient).Applications are documented for rubber [1,2,33,45,46,47] and soft biological tissues [48,49,50,51,52,53,54].The drawbacks of FNLV models are the computational cost, especially for large-scale simulations, and their limited availability in widely used commercial FE software.
The above models have been developed by specialists, and also the selection and calibration of these models for a specific material typically require some expert knowledge.Data-driven modeling approaches such as machine learning circumvent these problems by providing a flexible computational framework that directly infers constitutive relations from data rather than specifying them a priori [55,56,57,58,59,60].In deep learning, modeling the time-history effects of viscoelasticity requires a neural network for temporal signal processing.Therefore, recurrent neural networks (RNNs) and similar architectures, usually employed for speech recognition or time series prediction, have been used intensively.[61] used an Elman network to model materials with fading memory based on fractional differential equations under cyclic loading conditions.Instead of a material model, [62] applied RNNs to fuzzy data to describe time-dependent material behavior within the finite element method.The inelastic material behavior of rubber-like materials was modeled with RNNs and used in FE simulations by [63].[64] modeled small-strain viscoelasticity using long short-term memory (LSTM).RNNs compute a material's stress state based on a time window comprising the strain states of n previous time steps.For FE simulations, this entails a significant increase in computational cost, as the strain states of the previous n time steps have to be stored for each quadrature point.In [64], LSTMs reacted sensitively to time step sizes and loading cases, deviating from those used during training.Moreover, the width of the time window directly affects the fading memory property [65] of the material and is difficult to determine.The thermo-viscoelastic constitutive behavior of polypropylene was modeled by [66] using a mechanistic/datadriven hybrid approach in which a neural network represented the viscous part of their rheological model.[67] employed full-field strain measurements to calibrate the material parameters of a generalized Maxwell model for isotropic linear viscoelasticity in the small-strain regime.[68] modeled the time-dependent behavior of human brain tissue by QLV.Therein, neural networks learned the constant relaxation coefficients and times of the reduced relaxation function.A purely data-driven approach to constitutive modeling, which does not require any explicit constitutive models but builds on material data only, was introduced by [69] and recently extended to inelastic materials [70,71].However, this approach typically requires a large database to describe the material's mechanical behavior [72].
To overcome the limitations of the above-delineated approaches, at least in part, herein we propose viscoelastic Constitutive Artificial Neural Networks (vCANNs), a novel physics-informed machine learning framework for anisotropic nonlinear viscoelasticity at large strains.vCANNs are based on a generalized Maxwell model, enhanced with nonlinear strain (rate)-dependent relaxation coefficients and relaxation times represented by neural networks.We show that the data-driven nature of vCANNs enables them to identify accurate anisotropic nonlinear viscoelastic constitutive models automatically.The number of Maxwell elements adapts automatically during the training, promoting a sparse model through L 1 regularization.Adopting the computationally very efficient framework of QLV and FLV, we leverage these well-established theories to model anisotropic nonlinear viscoelasticity.The achieved degree of accuracy is unmatched by similar traditional approaches that have been proposed.We trained vCANNs on stress-strain data from Polyvinyl Butyral, the electro-active polymers VHB 4910 and 4905, and the rectus abdominis muscle.Different loading conditions were considered, including relaxation tests, cyclic tension-compression tests, and blast loads.In all these cases, vCANNs were found to be able to learn the behavior of the materials within minutes, without human guidance, and with high accuracy.

Theory
In this section, we derive the theoretical framework of vCANNs.They can be considered an extension of CANNs, which were introduced in [72], to anisotropic nonlinear viscoelasticity.Therefore, we initially provide a brief review of CANNs.The section's central part will be devoted to the viscoelastic enhancement of CANNs.

Describing anisotropic hyperelasticity with generalized structural tensors
Many materials of interest exhibit direction-dependent, i.e., anisotropic, mechanical properties.CANNs provide a physics-informed machine learning framework for anisotropic hyperelasticity to describe these materials in a very general way.To this end, CANNs employ invariant theory, and the concept of generalized structural tensors [73].A material is called hyperelastic if a strain energy function Ψ = Ψ(F), depending only on the deformation gradient F, can describe the material's mechanical behavior [74].To fulfill the principle of objectivity [75], one usually represents Ψ in terms of the right Cauchy-Green tensor C = F T F, i.e., Ψ = Ψ(C).For an incompressible hyperelastic material (det C = 1), the instantaneous elastic 2. Piola-Kirchhoff stress tensor is given by where p is a Lagrangian multiplier ensuring incompressibility.The superscript (•) e explicitly distinguishes the instantaneous elastic stress from the viscoelastic stress derived in the next paragraph.In Eq. ( 1), the first and second terms represent the volumetric and isochoric stress contribution, respectively.The treatment of compressible and nearly compressible materials is equally possible with our proposed framework of anisotropic nonlinear viscoelasticity.For brevity, and because finite-strain viscoelasticity plays a particularly prominent role in materials often modeled as (nearly) incompressible (such as rubber materials or biological tissues), we limit the discussion in the following to incompressible materials.
To describe the mechanical behavior of an anisotropic material, one can define several so-called preferred directions represented by unit direction vectors l j ∈ R 3 , j = 1, 2, . . ., J, and define the following J + 1 structural tensors Here, I denotes the second-order identity tensor, and the associated L 0 is used to describe the isotropic part of the material's constitutive behavior.The preferred directions l j can often be interpreted as directions of fiber families embedded in the material.It can be shown that to preserve the material symmetry, the strain energy function Ψ has to be an isotropic function of the quantities C and L j , j = 1, 2, . . ., J [76].It can be shown [77,78] that this is the case if the strain energy function depends only on the following invariants: The latter two types of invariants in Eq. ( 4) are constant and can therefore be omitted from the arguments of Ψ.For practical applications, the influence of the first invariant type in Eq. ( 4) is usually negligible.Therefore, Ψ can commonly be expressed as With the 2R + 1 generalized invariants relying on the R generalized structural tensors where and employing the short-hand notation we can alternatively express Eq. ( 5), according to [73], in the form The generalized structural tensors represent linear combinations of the standard structural tensors L rj = l rj ⊗l rj introduced in Eq. ( 2).We use a double index rj to emphasize that, in principle, each generalized structural tensor Lr can rely on a different subset of J r preferred material directions l rj , j = 1, . . ., J r .In order to describe not only the stress-strain behavior of a material but also the dependence of this behavior on certain in general non-mechanical parameters, it is convenient to augment the arguments of Ψ with a feature T , where N f denotes the number of features.For example, f could carry information on the material's microstructure or production process.Thus, Apart from material symmetry and the principle of objectivity, the strain energy function has to fulfill several other conditions.The strain energy must always be positive, Ψ ≥ 0. Also, the strain energy is required to approach infinity if the material is shrunk to zero or expanded to infinite volume, i.e., Ψ → ∞ for det C → ∞ or det C → 0 + , which is called the growth condition.If a stress-free reference configuration is assumed, the strain energy function and the stress have to fulfill the normalization condition: Ψ(C = I) = 0 and S e (C = I) = −pI + 2 ∂Ψ ∂C C=I = 0. Inserting Eq. ( 11) in Eq. ( 1) yields Nowadays, engineers can choose from a vast catalog of strain energy functions to model materials.Choosing a suitable strain energy function, however, typically requires expert knowledge.To overcome this problem, CANNs introduced a particularly efficient machine learning architecture to learn the relation between the argument in Eq. ( 11) and the resulting strain energy Ψ.Basing CANNs on Eq. ( 11) endows them with substantial prior knowledge from materials theory, namely, the theory of generalized invariants.This prior knowledge significantly reduces the amount of training data CANNs need to learn the constitutive behavior of a specific material of interest.At the same time, given the generality of the theory of generalized invariants, using Eq. ( 11) as a basis does not limit the generality of CANNs in any practically relevant way.Rather the underlying neural network equips the constitutive model with the flexibility to adjust to experimental data from various materials without human guidance.In particular, the preferred material directions l j and the scalar weight factors w rj in Eq.
(2) and Eq. ( 7) are learned by the CANN from the available material data.In the following, we extend this concept to anisotropic nonlinear viscoelasticity.

Viscoelasticity
According to [18], in QLV, the viscoelastic 2. Piola-Kirchhoff stress tensor is given by the hereditary integral where G(t) is the time-dependent fourth-order reduced relaxation function tensor.
The fundamental assumption of QLV is that G(t) depends only on time but not the deformation (timedeformation separability) such that the relaxation behavior is for any applied deformation the same.To overcome this limitation, we allow G to depend on the deformation C. At this point, we go beyond the framework of classical QLV because G is not only time-dependent anymore.We add the deformation rate Ċ to the arguments of G since many materials show not only strain-dependent but also strain rate-dependent viscoelastic behavior [2,79]: In the simplest case, G(t) = G(t)I where G(t) denotes a scalar reduced relaxation function and I the fourth-order identity tensor.However, anisotropic materials may exhibit different viscous properties in different directions.Therefore, a single scalar reduced relaxation function would, in general, be insufficient to capture the complex nature of anisotropic viscoelastic materials.On the other hand, the experimental identification of a fourth-order reduced relaxation function tensor is highly challenging, even for simple classes of anisotropy, and is practically often unfeasible for complex classes.A reasonable compromise between practicability and generality of the constitutive model is to use a scalar-valued reduced relaxation function G r for each stress contribution S e r in Eq. (12).Additionally, we augment the arguments of the reduced relaxation with the structural tensors to account for anisotropy: Experiments suggest that in many rubber materials and soft biological tissues, the viscous effects mostly attribute to the isochoric part of the stress [80].In the incompressible limit, this holds exactly [81].Therefore, in Eq. ( 15), the reduced relaxation functions G r affect only the isochoric part of the stress.
The reduced relaxation functions G r are scalar-valued functions of tensors.To fulfill the principle of material objectivity and to reflect the material symmetry correctly, the reduced relaxation functions have to be isotropic functions of the tensor system {C, Ċ, L 1 , L 2 , . . ., L J } [76].Compared to Eqs. ( 3) and ( 4), the set of isotropic invariants, in terms of which all other isotropic functions can be expressed, is completed by [77,78], Following the same arguments as before, we omit the invariants in Eqs. ( 17) and (18) yielding By introducing the 2R + 1 generalized invariants and the short-hand notations we can express Eq. ( 19) alternatively by G r = G r (t; I) .
Finally, we augment the arguments of the reduced relaxation function with the above-introduced feature vector f : From Eq. ( 15), we obtain the 2. Piola-Kirchhoff stress tensor Prony Series.Motivated by linear viscoelasticity and the generalized Maxwell model (Fig. 1), the most popular choice for the reduced relaxation function G in QLV is the discrete Prony series with Here, g ∞ is a material parameter related to the equilibrium elasticity of the generalized Maxwell model, and g α and τ α are parameters characterizing elasticity and viscous relaxation time of the α-th Maxwell element.The g ∞ and g α are referred to as relaxation coefficients.In principle, the number of Maxwell elements N is arbitrary, which enables the model to describe complex viscoelastic materials.Since the material parameters g ∞ , g α , and τ α are constants, the classical Prony series is limited to linear viscoelasticity.
Generalized Prony Series.The classical Prony series does not depend on the deformation or the deformation rate but on time only.Therefore, to account for nonlinear viscoelasticity, we propose the following generalized Prony series The conditions Eq. ( 26) individually apply to the relaxation coefficients g r∞ (I, f ), g rα (I, f ) and the relaxation times τ rα (I, f ) associated with the instantaneous elastic stress component S e r .N r denotes the number of Maxwell branches of the generalized Maxwell model associated with the instantaneous elastic stress component S e r .In contrast to the classical Prony series, the relaxation coefficients and times in Eq. ( 25) are functions of the invariants I and the feature vector f to capture also anisotropic nonlinear viscoelasticity.
Inserting Eq. ( 27) into Eq.( 24) yields where S ∞ r = g r∞ (I, f ) S e r denotes the equilibrium stress associated with the r-th generalized Maxwell model.Q rα is the viscous overstress in the α-th Maxwell branch of the r-th generalized Maxwell model.To illustrate the proposed constitutive model, we particularized a vCANN for the important case of transverse isotropy in Appendix A. In general, closed-form solutions do not exist for the integrals in Eq. (28) so that a numerical time integration scheme has to be applied.Details are provided in Appendix B.

General
In the previous section, we outlined the theoretical foundations of the model of nonlinear viscoelasticity on which we rely in this paper.The main idea of vCANNs is to implement this theory via a machine learning architecture.This architecture is illustrated in Fig. 2. In our approach, we use feedforward neural networks (FFNNs).The networks consist of H + 1 layers, that is H hidden layers, of neurons.The input passed to the first layer is a vector x 0 ∈ R n0 .The output of the l-th layer is denoted by x l ∈ R n l , respectively, and computed as with the activation function σ l (•) of layer l, weights W (l) ∈ R n l ×n l−1 of layer l, and biases b (l) ∈ R n l of layer l.The activation function is applied element-wise to its argument.The output of the last layer (and thus the output of the network altogether) is x H+1 ∈ R n H+1 .Mathematically, an FFNN with H hidden layers establishes a mapping N : R n0 → R n H+1 , x H+1 = N (x 0 ).Applying the model of nonlinear viscoelasticity outlined in Sec. 2 to compute the stress at each point in time (depending on the strain history) requires implementing Eq. ( 15).To evaluate this equation for a given strain history, we need to define the following functions: the strain energy Ψ and the reduced relaxation functions G r .

Strain energy
To define the strain energy, we use a CANN [72] relying on the generalized invariants of the type Ĩ. Stresses can be computed by automatic differentiation.The CANN automatically ensures material objectivity, material symmetry, and an energy-and stress-free reference configuration, i.e., Ψ(C = I) = 0 and S(C = I) = 0.The latter is ensured by a term in the strain energy that is continuously adopted during machine learning such that these two conditions remain satisfied.Non-negativeness of the strain energy function, i.e., Ψ ≥ 0, is ensured by choosing appropriate activation functions and weight constraints in the last two layers of the CANN.In the second-to-last layer, we apply non-negative activation functions σ H : R → R +0 .In the last layer, we apply a linear activation function σ H+1 (x) = x and enforce non-negative weights and biases (W (H+1) , b (H+1) ≥ 0),  27).The generalized invariants Ĩ and f are fed to the CANN which calculates the instantaneous elastic stress contributions S e r .The internal structure of the CANN is depicted in Fig. 1(a) in [72].With Eq. ( 28), we finally calculate the viscoelastic stress S. Note that Fig.
C.12 is slightly modified compared to Fig. 1(b) in [72] since in vCANNs one has to account for Ċ as an input, too.yielding a non-negative strain energy function.Except for the last layer, where we apply a linear activation function, our default activation function is the softplus function σ l (x) = ln(1 + exp(x)).Apart from the useful property of being positive, the softplus function is a C ∞ -continuous function.Hence, the strain energy function, stress tensor, and elasticity tensor are C ∞ -continuous functions which is numerically favorable, particularly for implementing the vCANN in FE software.
Note that, if necessary, we can easily guarantee polyconvexity [82] of the strain energy function when using CANNs.To this end, we enforce non-negative weights in all layers and a non-negative bias in the last layer (but not necessarily in the previous layers [83]).Then, applying convex, non-decreasing activation functions on all layers renders the network convex [84].We meet this constraint on the activation functions by default since we use linear activation functions in the last layer and softplus activation functions in all other layers.Both activation functions are convex and non-decreasing.Using a neural network with the above features, we only have to ensure that only polyconvex invariants are used in the CANN.In particular, the generalized invariants of the type Ĩ are polyconvex [73].For an overview of other polyconvex invariants, the reader is referred to [85,86,87].

Reduced relaxation functions
To define the reduced relaxation functions G r in Eq. ( 27), we must define strain (rate)-dependent relaxation times and relaxation coefficients.To this end, we represent the unknown relation between the strain (rate) and these parameters by FFNNs.We employ separate FFNNs for each relaxation time and coefficient such that each neural network can focus on a particularly simple task.The constraints on the reduced relaxation functions, (26), are less severe than those on the strain energy functions.The positivity of the relaxation times and coefficients in Eq. ( 26) 2,3 is guaranteed by the same methods described above for the strain energy function.The unity constraint on the relaxation coefficients, Eq. ( 26) 1 , is enforced by a custom normalization layer.Apart from that, the functional relations providing the sought parameters are not restricted.Note that using the invariant basis I as input, the relaxation times and coefficients automatically ensure material objectivity and material symmetry.
The number of Maxwell elements in the generalized Prony series is an important parameter.With sufficiently many Maxwell elements, a Prony series can describe arbitrarily complex viscoelastic materials.Often, materials exhibit numerous different relaxation times [88].A generalized Maxwell model represents each relaxation time by a different Maxwell element.Following these arguments, choosing a large number of Maxwell elements is preferable to represent the relaxation behavior as accurately as possible.However, the model complexity, and thus the computational cost, increases with the number of Maxwell elements.Moreover, complex models with many parameters tend to overfit the experimental data, thereby losing the ability to generalize beyond specific given training data.From that perspective, keeping the number of relaxation times and coefficients small is favorable.To balance between an accurate representation of data and a low model complexity, we proceed as follows.
Identifying the relaxation coefficients and times of a generalized Maxwell model is known to be an ill-posed problem [89].Therefore, in classical approaches, the number of Maxwell elements N r is determined beforehand and fixed during the parameter identification process [90].In some approaches, the number of Maxwell elements and the relaxation times are determined a priori and fixed during parameter identification to remove ill-posedness [89].In our approach, we predefine a maximum number of Maxwell elements N max .The literature shows that the relaxation times of viscoelastic materials are typically uniformly distributed on a logarithmic scale [91].To endow our machine learning architecture with this heuristic prior knowledge, we normalized the output of the N max r FFNNs that learned the relaxation times by time constants T rα , α = 1, 2, . . ., N max r .These time constants were uniformly distributed on a logarithmic scale in some range [T min , T max ].T min and T max are parameters the user can initially define based on prior knowledge or heuristic expectations.It is important to underline that the normalizing constants T rα are not the relaxation times of our model.The vCANN can, and will in general, learn relaxation times τ rα (possibly even considerably) differing from the T rα .Yet, the T rα provide via the normalization of the output of the FFNNs some bias regarding the expected time scales for the different relaxation times, which can significantly accelerate the training if [T min , T max ] is properly chosen.
Initially, we prescribe the maximum number of Maxwell elements N max r , typically much larger than the actual number N r required to accurately describe the viscoelastic material.This allows us to gradually eliminate Maxwell elements during training to obtain a sparse model.This approach is similar to the one of [92,93], where the number of Maxwell elements was adjusted by merging or removing them during parameter identification to avoid ill-posedness and improve the fit.Likewise, [94] proposed to apply Tikhonov-regularization [95] to the material parameters and subsequently cluster Maxwell elements with similar relaxation times.We decided to promote sparsity of our model by applying L 1 regularization to the relaxation coefficients g rα , α = 1, 2, . . ., N max r .Using L 1 regularization, the optimal value for some relaxation coefficients will be zero, eliminating the corresponding Maxwell elements.In this approach, one uses a penalty parameter Λ controlling the sparsity of the model.Choosing Λ = 0 disables regularization, whereas with increasing Λ, the sparsity of the model increases, too.The penalty parameter Λ is a hyperparameter that has to be predefined (and possibly iteratively optimized).
In summary, relaxation times and coefficients, as functions of the invariants I and the feature vector f , are learned by individual FFNNs.Scaling of the FFNNs determining the relaxation times by predefined logarithmically uniformly spaced constants introduces a bias in agreement with the literature findings that can help accelerate the training process.L 1 regularization on the relaxation coefficients promotes sparse reduced relaxation functions.We implemented the complete vCANN framework using the open-source software library Keras with TensorFlow backend [96,97].

Results
In this section, we apply vCANNs to various data sets.We use synthetic as well as experimental data.In [72], we already demonstrated that CANNs could successfully learn the preferred material directions l rj and the scalar weight factors w rj in Eq. ( 2).Therefore, for simplicity, we herein assume them to be known to focus on this paper's main problem, viscoelastic relaxation.We list the corresponding vCANNs and their hyperparameters in Appendix E for each of the following examples.There, we also provide additional information on the training procedure.

Anisotropic viscoelasticity with synthetic data
We created synthetic training data to mimic stress-strain data of viscoelastic soft biological tissues.To this end, we used two hyperelastic constitutive models popular in biomechanics, the Ogden model [98], and the Holzapfel-Gasser-Ogden (HGO) model [99].The Ogden model is a phenomenological model for isotropic rubber-like materials and soft biological tissues and is usually formulated in terms of the principal stretches λ i , i = 1, 2, 3, which are the square roots of the eigenvalues of C, In Eq. ( 31), n is a positive integer, µ p and α p are (constant) material parameters.Many soft biological tissues exhibit stiffening fibers that induce anisotropy.Therefore, the Ogden model is often combined with the HGO model, which adds an anisotropic contribution to the total strain energy function.The strain energy function of the HGO model, with one preferred material direction l and structural tensor L = l ⊗ l, is k 1 ≥ 0 and k 2 > 0 are material parameters, and I 4 = C : L. Since I 4 represents the squared fiber stretch, I 4 < 1 means compression of the fibers, which are assumed to bear load under tension only.Thus, for I 4 < I, the anisotropic strain energy and stress contributions are assumed to be zero.
To produce synthetic data, we used a material model where strain energy was a sum of the Ogden (OG) and HGO strain energy functions, that is, For the viscous part of the constitutive model used for generating synthetic material data, we assumed straindependent relaxation times and coefficients: Using the material parameters in Tab.D.1 and Tab.D.2 for Eqs. ( 34) and ( 35), we simulated uniaxial cyclic tension-compression experiments with relaxation periods between each tension and compression period.After each complete cycle the stretch rate λ was changed according to the sequence λ = {0.02,0.03, 0.04, 0.05} s −1 .
Loading and unloading periods took t move = 10 s, respectively.The relaxation periods took t relax = 60 s.Thus, a single cycle took t cyc = 160 s and the total experiment t total = 640 s.Synthetic training data were generated for different preferred directions, characterized by the acute angle ϕ between the loading and preferred material directions.ϕ = 0 • means that loading direction and preferred direction l are parallel, ϕ = 90 • means that both are to orthogonal.The synthetic training data comprised stress data from fictitious materials with four different preferred directions corresponding to ϕ = {0, 15, 20, 25} • .To validate the model, we generated additional synthetic data for a material with the preferred direction ϕ = 10 • , which is not in the training data set.For this material, we simulated two more loading cycles in addition to the above-described loading history, such that the vCANN had to extrapolate the stress response temporally.The stretch rate of the cycles changed according to the sequence λ = {0.01,0.02, 0.03, 0.04, 0.02, 0.05} s −1 .
We trained a vCANN with the transversely isotropic structure and hyperparameters given in Appendix E.1.Figures 3 and 4 show that the vCANN learns to replicate the training data almost exactly.In compression, the stress response is similar for all preferred directions since only the isotropic matrix of the composite (Ogden model) bears the load.The vCANN replicates this feature accurately.Similarly, the prediction of the validation data set for the unknown preferred direction captures and extrapolates almost perfectly the material response.In particular, the irregular stress response in the time interval [640, 710] s, caused by halving the stretch rate, is predicted precisely.

Passive viscoelastic response of the abdominal muscle
We reproduced relaxation responses of the leporine rectus abdominis muscle reported in [37].The reduced relaxation function's shape depends on the stretch level, thus exhibiting nonlinear viscoelastic behavior.Classical QLV cannot account for this stretch dependency and would predict the same curve for each stretch level.To represent the stretch-dependent relaxation behavior, [37] incorporated stretch-dependent relaxation coefficients and times into a Prony series with one Maxwell element.The authors empirically determined the phenomenological strain dependency of the relaxation coefficients and times.However, their model did not accurately capture the reduced relaxation curves despite utilizing optimization algorithms to fit the material parameters to experimental data (cf.Fig. 10 in [37]).This illustrates the limits of human-designed and human-calibrated constitutive models for viscoelastic materials.
In contrast, vCANNs capture the relaxation curves with high accuracy (Fig. 5) otherwise only matched by much more complex FNLV models based on the multiplicative split of the deformation gradient, see Fig. 5 in [53] for comparison on the same experimental data set.We started the training with N max = 10 Maxwell elements.Six of them were discarded during training, leaving only the reduce set of parameters plotted in Fig. 6.We provide the trained vCANN structure and the corresponding hyperparameters in Appendix E.2.Relaxation time Figure 6: Relaxation times (left) and coefficients (right) learned by the vCANN from the data set of leporine rectus abdominis muscle by [37].Only four of the initial ten Maxwell elements remained after training.

Viscoelastic modeling of VHB 4910
Very-High-Bond (VHB) 4910 is a soft electro-active polymer (EAP) that exhibits nonlinear viscoelastic behavior and can undergo substantial deformations.VHB 4910 was experimentally studied by [100], using uniaxial loading-unloading tests to characterize the rate-dependent behavior.The tests were conducted for three different stretch rates λ = {0.01,0.03, 0.05} s −1 and four different stretch levels λ = {1.5, 2.0, 2.5, 3.0} (Fig. F.13).Moreover, the authors conducted a multi-step relaxation test to determine the equilibrium response of the material.The constitutive model proposed in [100] is based on a multiplicative split of the deformation gradient into an elastic and viscous part.The hyperelastic eight-chain model of [101] was chosen to model the elastic part.The material parameters of the elastic part were identified using the data from a multi-step relaxation test.The strain energy function and evolution equation proposed by [102] were chosen for the viscous part.The viscous material parameters were identified using the loading-unloading data of λ = 0.01 s −1 and λ = 0.05 s −1 at a stretch level of λ = 3.The rheological analog model of the constitutive model was a generalized Maxwell model where the number of parallel branches was chosen to be four.
A few years later, VHB 4910 was again studied to demonstrate the abilities of a novel advanced microstructurallyinformed constitutive model developed in [103].The model relies on advanced knowledge of continuum and statistical mechanics and uses a multiplicative decomposition of the deformation gradient to represent a generalized Maxwell behavior.The elastic material parameters of the model were identified using time-consuming quasi-static tensile tests, and the viscous material parameters using (excluding, however, data with λ = 2.5).The number of Maxwell elements was determined by hand and set to three.
By contrast, we only used loading-unloading data with λ = 0.01 s −1 and λ = 0.05 s −1 at the stretch levels λ = 1.5 and λ = 3 to train the vCANN.Figure 7 shows the training and validation results.The fit of both the training and validation is very accurate and at least on par with the one of the FNLV models used in [100] (Figs.[9][10][11][12], and [103] (Fig. 5(b)-(d)).However, we note that the vCANN automatically learned the number of Maxwell elements required to represent the material behavior well.We initialized the vCANN with 10 Maxwell.After training, only two remained, the viscous properties of which we provide in Fig. 8.Moreover, the application of the vCANN did not require advanced expert knowledge and did not require data from particularly sophisticated experiments.These advantages make vCANNs attractive from a practical point of view, in particular in the context of industrial applications.We list details on the trained vCANN structure and the corresponding hyperparameters in Appendix E.3.
Remark: The relatively large differences between the experimental data and the vCANN model for λ = 2.5 in Fig. 7(d), which can also be observed for the FNLV model in [100], are likely a result of experimental scatter.The loading paths should be almost identical for a fixed strain rate up to the respective maximum stretches.However, this is not the case, as is highlighted in Fig. F.13, which suggests considerable measurement errors in a part of the data, which naturally limited the ability of the vCANN to derive a consistent data-driven model.Relaxation coefficient g Figure 8: Viscous properties of the vCANN learned from experimental data on VHB 4910 from [100].Only two of the initial 10 Maxwell elements were kept and are necessary to describe the material accurately.

Blast load analysis of Polyvinyl Butyral
Polyvinyl Butyral (PVB) is a polymer whose primary application is laminated safety glasses.Under heat and pressure, two glass panes are bonded with an interlayer of PVB into a single unit.Under blast loads, the interlayer binds shards of glass, absorbs energy, and mitigates its transfer to the surrounding frame.It is essential to understand the mechanical behavior of PVB to improve the design of laminated glass structures.Large-scale simulations of these structures require simple material models that capture the mechanical behavior over a wide range of strain rates.[104] conducted high-stretch rate experiments on PVB, with stretch rates between 0.01 s −1 and 400 s −1 .The viscous properties of PVC likely vary within such a wide range of strain rates.The significant change of the stress-stretch curve's shape above 0.2 s −1 visible in Fig. 9 suggests this, too.Ideally, the constitutive model should be able to represent this transition accurately.[104], proposed an FNLV model and used the strain-dependent viscosity function by [47].The model describes the experimental data well at high stretch rates, although it cannot accurately resolve the peak stress and subsequent softening at λ ≈ 1.1 − 1.2.At low strain rates, the fit quality is quantitatively unsatisfactory.In [104], the authors also fitted a standard generalized Maxwell model with constant relaxation coefficients and times for comparison.The model comprised six Maxwell elements whose relaxation times were chosen to be uniformly distributed on the logarithmic scale and kept fixed during the parameter identification of the relaxation coefficients.Notably, to account for the broad stretch rate range, two different models had to be used, one for the low stretch rate regime (up to 8 s −1 ) and the other for the high stretch rate regime (20 s −1 and above).However, both models could not accurately describe the material behavior in their respective stretch rate regimes.
To account for the rate-dependent viscoelastic properties, we trained the vCANN detailed in Appendix E.4. Figure 9 shows that the vCANN successfully learned the constitutive behavior over a wide range of stretch rates.Comparing Figs. 12 to 17 in [104] with Fig. 9, reveals that the trained vCANN outperforms the traditional models.In particular, they capture the peak stress and softening in the initial loading phase up to λ ≈ 1.2.Importantly, the data-driven nature of our approach apparently provided the flexibility to model the transition between the low and high stretch rate regimes, whereas two different classical models were required to capture the two different regimes.The vCANN did not only learn the constitutive behavior of the training data but also made precise predictions in the low and high stretch regimes for unknown the unknown validation.Remarkably, no advanced expert knowledge was necessary to apply the vCANN, and training the vCANN from scratch took less than 10 minutes on a standard desktop computer.Since the traditional models in [104] were fitted using the entire data set, we also trained the vCANN on the entire data set to ensure a fair comparison.

Thermo-viscoelastic modeling of VHB 4905
Another commercially available EAP is VHB 4905.Like most polymers, VHB 4905 is strongly temperature sensitive.Hence [105] conducted an extensive experimental study with a wide range of temperatures at different stretch rates and stretch levels.To demonstrate the utility of the feature vector f in the vCANN architecture (which is optional and was not yet used in the previous examples), we included the temperature Θ into the vCANN input as f .For training we used data of loading-unloading tests with different temperatures Θ = {0, 1020, 40, 60, 80} [°C] and stretch rates λ = {0.03,0.1} s −1 at the stretch level λ = 4. Additionally, we included in the training set data of tests with Θ = {0, 40, 60, 80} [°C], λ = 0.1 s −1 at λ = 2, Fig. 10.The strong nonlinear temperature dependence of the stress response is clearly visible by comparing Fig. 10(a) and Fig. 10(c).In particular, the shape of the stress-stretch curve as well as the stiffness changes between 0°C and 20 °C significantly.
To validate the trained can, we took data from loading-unloading tests with Θ = {0, 1020, 40, 60, 80} [°C], λ = 0.03 s −1 at λ = 3.Moreover, we used data from tests with Θ = {0, 1020, 40, 60, 80} [°C], λ = 0.05 s −1 at λ = 4 for validation, Figs.11.Of note, the vCANN had not received any training data with a stretch level λ = 3 nor with a stretch rate λ = 0.05 s −1 .Yet, the trained vCANN was able to predict very well the material behavior for the unknown stretch level and also for the unknown stretch rate.Both is challenging for classical constitutive models and demonstrates the potential of vCANNs.Details on the trained vCANN and its hyperparamters are documented in Appendix E.5.As seen in [105], different sophisticated load protocols are necessary for classical models to calibrate individual parts of the model separately.Although this procedure is possible with vCANNs due to their modularity, they can be trained on a large data set directly, which is much simpler, faster and requires no advanced expert knowledge.

Conclusion
In this paper, we introduced vCANNs, a physics-informed data-driven framework for anisotropic nonlinear viscoelasticity at finite strains.The viscous part is based on a generalized Maxwell model enhanced with nonlinear strain (rate)-dependent relaxation coefficients and times represented by neural networks.The number of Maxwell elements is not determined a priori but adapts automatically during training.Thereby, vCANNs employ L 1 regularization on the Maxwell branches to promote a sparse model.In contrast, traditional models usually specify and fix the number of Maxwell branches before calibrating the material parameters, which requires additional, often labor-intense tests.vCANNs adopt the computationally very efficient framework of QLV and FLV but generalize these well-established theories to model anisotropic nonlinear viscoelasticity.We demonstrated the ability of vCANNs to learn even challenging viscoelastic behavior of advanced materials by several examples.We also briefly illustrated the ability of vCANNs to process non-mechanical information such as temperature data (or, in other cases also, microstructural or processing data) to predict the behavior of materials under conditions not covered by the training data.We demonstrated that vCANNs could learn the viscoelastic behavior of advanced materials from a database similarly small as the one human experts typically need to calibrate their models.However, vCANNs can learn the material behavior in a fast and fully-automated manner, and their application does not require any expert knowledge.These advantages make vCANNs a favorable tool to support the development of new advanced materials in academia and industry.
Of note, vCANNs are not only helpful from a practical perspective but can also promote our theoretical understanding.For example, it is often believed that the generalized Maxwell model with strain-dependent material parameters cannot describe strain-dependent relaxation curves accurately [53].Interestingly, the application example on the rectus abdominis muscle presented above demonstrates that vCANNs are very well able to accomplish this.These findings raise the question of whether doubts about the capabilities of generalized Maxwell models are mainly a result of difficulties humans face in their proper calibration instead of fundamental shortcomings of this class of models.In such a way, vCANNs can help us with their automated and highly efficient calibration process to understand the actual capabilities and limits of generalized Maxwell models.Exploring this further may be an exciting avenue for future research.
For sufficiently small intervals (and assuming a sufficiently smooth problem) ∆t, τ rα and g rα can through the whole time interval be approximated by the average values of its beginning and end point: giving an update formula for the overstress at time t n+1 .The two update formulae (B.10) and (B.12) are very similar to common recurrence formulae for the stress update found in the literature [31,107,80].The major difference to most formulae reported in the literature is that τrα and ḡrα are not constant but change each time step.In essence, in a discrete time stepping scheme, one solves a different QLV problem in each time step.The algorithmic linearization of the stress tensor S n+1 is essential for solving nonlinear boundary problems.With Eq. ( 28), we obtain an update rule for the stress tensor S n+1 at time point t n+1 : .13) between the actual stress response and the one estimated by the vCANN as the loss function.The gradients of the loss with respect to model parameters are calculated by the backpropagation algorithm using automatic differentiation.The training was terminated based on early stopping.All weights and biases were initialized with Glorot/Xavier uniform initializer and zeros, respectively.No regularization (weight decay) was applied to the weights and biases during training.No dropout layers were used.All vCANNs were trained with Adam optimizer (β 1 = 0.9, β 2 = 0.999, ε = 10 −7 ).

Figure 1 :
Figure1: The generalized Maxwell model: the elastic spring on the left represents the equilibrium stress response S ∞ ; each Maxwell element produces a viscous overstress Q α and represents a relaxation process with a different relaxation time.g∞, g i , and τ i are constant material parameters or deformation (rate)-dependent functions.

Figure 2 :
Figure 2: Schematic illustration of the vCANN architecture: The strain (rate) tensors C, Ċ, and the feature vector f serve as input to the structure learning block (Fig. C.12) which learns the generalized invariants Ĩ and Ĩ.The generalized invariants and the feature vector are fed to the relaxation time ANNs Nτ rα and relaxation coefficient ANNs Ng rα .The outputs of Nτ rα are the relaxation times τrα.The neural networks Ng rα calculate the relaxation coefficients grα which are regularized to promote a sparse model (grey box 'L 1 regularization').The reduced relaxation function Gr is obtained by inserting τrα and grα in Eq. (27).The generalized invariants Ĩ and f are fed to the CANN which calculates the instantaneous elastic stress contributions S e r .The internal structure of the CANN is depicted in Fig.1(a) in[72].With Eq. (28), we finally calculate the viscoelastic stress S. Note that Fig.C.12 is slightly modified compared to Fig.1(b) in[72] since in vCANNs one has to account for Ċ as an input, too.

r.
During the training, the actual number N r of Maxwell elements is determined by the vCANN as a part of the learning process within the allowed range [1; N max r ].To avoid unnecessary constraints to the learning process, one may choose relatively large values for N max r , which typically result after the training in N r N max r

Figure 3 :
Figure 3: Training (top) and validation (bottom) results.Scatter points represent the synthetic training and validation data, respectively; solid lines represent the vCANN predictions.

Figure 4 :
Figure 4: Training (top) and validation (bottom) results.Scatter points represent the synthetic training and validation data; solid lines represent the vCANN predictions.

Figure 5 :
Figure 5: vCANNs can learn to replicate (left) and predict (right) the viscoelastic behavior of abdominal muscle with high accuracy: experimental data from Fig. 4(b) of [37] is reproduced by solid circles; solid lines represent the fit of the vCANN.

Figure 7 :
Figure 7: Results of the trained vCANN for the polymer VHB 4910: achieved fitting of training data (a)-(b) and predictive performance on the validation data (c)-(d).Each subfigure shows the loading-unloading stress response for a fixed maximum stretch but different stretch rates.The scatter points represent experimental data on VHB 4910 reproduced from [100]; solid lines represent the trained vCANN.

Figure 9 :
Figure 9: High-stretch rate experiments on PVB.The vCANN accurately describes the constitutive behavior over a wide range of stretch rates for the training (left) and unknown validation data (right).The scatter points represent experimental data reproduced from [105]; solid lines represent the stress response of the vCANN trained on the complete data set.

Figure 10 :Figure 11 :
Figure 10: Performance of the vCANN for VHB 4905: achieved fitting of training data.The scatter points represent experimental data reproduced from [105]; solid lines represent the vCANN performance In a displacement-driven setting, τrα and ḡrα are known since they depend on the prescribed deformation (rate) at the considered times.With the approximation Eq. (B.3), Eq. (B.2) becomes a linear ODE of first order with constant coefficients and with some known initial value:Multiplying both sides of Eq. (B.4) by the integrating factor exp(t/τ rα ) and applying the product rule gives recurrence update formula for the viscous overstress at time t n+1 , given we know the state at time t n .We used the mid-point rule on the integral in Eq. (B.8) approximating the time variable t by (t n+1 − t n )/2.