1 Introduction

Most classical beam theories [18] are based on the definition of so called beam strain measures, namely the axial-shear and torsional–bending strain vectors. Work conjugates to those variables (typically denoted as resultant contact forces and resultant contact couples [9]) can be derived from a hyperelastic energy functional which is defined in terms of the beam strain measures [18]. Other authors prefer an alternative continuum degenerate approach [10, 11] where, typically, the Lagrangian strain tensor is retained as the main strain measure and particularised for the kinematic description of the beam. In these latter formulations, the second Piola–Kirchhoff stress tensor emerges as the work conjugate variable and can be derived from a hyperelastic energy functional expressed as a function of the Lagrangian strain tensor.

The present manuscript aims to extend the variational and computational framework recently introduced in Bonet et al. [12] in the context of three dimensional elasticity to the geometrically exact Simo [13] beam model, the starting point of so many other finite element beam type formulations [17]. The variational approach proposed herein can therefore be viewed as a continuum degenerate formulation emerging from that presented in [12].

The formulation by Bonet et al. [12] has been particularly formulated for use in large strain scenarios [14, 15], where appropriate convexity criteria [1618] are necessary to ensure the well posedness of the problem. The most well-established of these criteria is the concept of polyconvexity [1925] whereby the strain energy is expressed as a convex multi-value function of an extended kinematic set defined by the deformation gradient (fibre map), its determinant (volume map) and its cofactor (area map).

Numerous authors have previously incorporated the concept of polyconvexity into computational models for both isotropic and non-isotropic materials for a variety of applications [2631]. However, the classical approach consists of ensuring that the strain energy satisfies the polyconvexity condition first but then proceeds towards a computational solution by re-expressing the energy in terms of the deformation gradient alone. Pioneered in [29] and [12, 32, 33], an alternative framework is proposed based on maintaining as independent variables the extended kinematic set on which the strain energy is expressed as a convex function, namely, the deformation gradient, its cofactor and its determinant.

In this paper, the latter approach is particularised for a degenerate beam description. The paper also presents the work conjugates of the extended kinematic set in the context of beam theory as well as novel expressions for the tangent operators. The present formulation utilises a novel algebra based on a tensor cross product operation pioneered in [34] and reintroduced and exploited for the first time in [12, 32, 33] in the context of solid mechanics. The new tensor cross product operation is particularly helpful when dealing with polyconvex constitutive laws, where invariants of the cofactor and the determinant of the deformation feature heavily in the representation of the strain energy functional.

In addition, the paper shows how the novel algebra facilitates the re-expression of any invariant of the deformation gradient, its cofactor and its determinant in terms of the classical beam strain measures. The latter being very useful whenever a classical beam implementation is preferred. As an example, and to the best of the authors knowledge, this is the first time that the strain energy function for a polyconvex Mooney–Rivlin (or Neo-Hookean) material is expressed entirely in terms of the classical beam strain measures.

Convexity of a beam constitutive model with respect to the classical beam strain measures is a well accepted condition [35] that must be satisfied by admissible strain energy functionals in the large strain regime. The present manuscript shows the relationship between a polyconvex constitutive model defined in terms of the deformation gradient, the cofactor and the determinant of the mapping in the continuum and its degenerate beam counterpart defined in terms of the beam strain measures. Hence, the connection between the two most accepted restrictions for the definition of constitutive models in three dimensional elasticity and beams is shown, bridging the gap between the continuum and its degenerate beam description.

The paper is organised as follows. Section 2 briefly revises the computational framework developed in [12] for large strain scenarios and especially tailored for polyconvex constitutive models. Section 3 presents the new degenerate beam formulation which extends the variational framework presented in [12] to the case of geometrically exact beam theory where the deformation gradient, its cofactor and its determinant are retained as the main strain variables. Section 4 presents the classical framework for the geometrically exact beam theory where the physically meaningful axial-shear and torsional–bending strain vectors are used as the main strain measures. Additionally, a link is established between the convexity criteria required for the constitutive model in both continuum and beam descriptions. Section 5 briefly presents the variational principle associated with the proposed continuum degenerate formulation. Section 6 presents the Finite Element discretisation, where the use of the novel tensor cross product algebra leads to alternative tangent operator representations. Section 7 includes representative numerical examples in large strain scenarios, including a comparison against a recently published mixed continuum based formulation [12]. Finally, Sect. 8 provides some concluding remarks and a summary of the key contributions of this paper.

2 Continuum mechanics

2.1 Continuum kinematics

Consider the three dimensional deformation of an elastic body from its initial configuration occupying a volume V, of boundary \(\partial V\), into a final configuration at volume v, of boundary \(\partial v\), where \(\varvec{x}\) represents the current position of a particle originally at \(\varvec{X}\) (see Fig. 1). Virtual and incremental variations of \(\varvec{x}\) will be denoted by \(\delta \varvec{u}\) and \(\varvec{u}\), respectively. It will be assumed that \(\varvec{x}\), \(\delta \varvec{u}\) and \(\varvec{u}\) satisfy appropriate essential (displacement) boundary conditions on \(\partial _u V\). Additionally, the body is under the action of certain body forces per unit undeformed volume \(\varvec{f}_0\) and traction per unit undeformed area \(\varvec{t}_0\) on \(\partial _t V\), where \(\partial _t V \cup \partial _u V =\partial V\) and \(\partial _t V \cap \partial _u V = \emptyset \). The deformation gradient tensor \(\varvec{F}\) (or fibre map) and its determinant J (or volume map) are defined as [16]

$$\begin{aligned} \varvec{F}=\frac{\partial \varvec{x}}{\partial \varvec{X}}=\varvec{\nabla }_0\varvec{x};\quad J=\text {det}\varvec{F}=\frac{dv}{dV}, \end{aligned}$$
(1)

where \(\varvec{\nabla }_0\) denotes the gradient with respect to material coordinates and dv and dV represent elemental volumes in the initial and final configurations, respectively. The elemental area vector is mapped from initial \(d\varvec{A}\) to final \(d\varvec{a}\) configurations by means of the cofactor tensor \(\varvec{H}\) (or area map), which is related to the deformation gradient \(\varvec{F}\) and its determinant J via the Nanson’s rule [16] as

$$\begin{aligned} d\varvec{a}=\varvec{H}d\varvec{A};\quad \varvec{H}=J\varvec{F}^{-T}. \end{aligned}$$
(2)

In references [12, 32], the authors employ an alternative definition of the cofactor \(\varvec{H}\), namely \(\varvec{H}=\frac{1}{2}\varvec{F}\varvec{\times }\varvec{F}\), which simplifies considerably the algebra [33]. The new definition of the area map \(\varvec{H}\) is based on a tensor cross product \(\varvec{\times }\) introduced in [34] and applied within the context of nonlinear elasticity for the first time in [12]. The properties of this tensor cross product developed in [12] have been included in Appendix 1 for completeness.

Fig. 1
figure 1

Deformation mapping of a continuum and associated kinematics variables: \(\varvec{F}\), \(\varvec{H}\), J

Crucially, the first and second directional derivatives of \(\varvec{H}\) with respect to geometry changes are now easily evaluated asFootnote 1

(3)

Similarly, the directional derivatives of the volume map J are easily expressed with the help of (139) and (142) as

$$\begin{aligned} \begin{aligned}&D J\left[ \delta \varvec{u}\right] =\varvec{H}:\varvec{\nabla }_0\delta \varvec{u};\\&D^2{J}\left[ \delta \varvec{u}; \varvec{u}\right] =\varvec{F}:\left( \varvec{\nabla }_0\delta \varvec{u}\varvec{\times }\varvec{\nabla }_0\varvec{u}\right) . \end{aligned} \end{aligned}$$
(4)

2.2 Polyconvex elasticity

Polyconvexity is now well accepted as a fundamental mathematical requirement that must be satisfied by admissible strain energy functionals used to describe elastic materials in the large strain regime [1822, 24, 25, 36, 37]. The strain energy \(\Psi \) per unit undeformed volume must be a function of the deformation gradient \(\varvec{\nabla }_0\varvec{x}\) via a convex multi-valued function W as

$$\begin{aligned} \Psi \left( \varvec{\nabla }_0\varvec{x}\right) =W\left( \varvec{F},\varvec{H},J\right) . \end{aligned}$$
(5)

Crucially, frame invariance (objectivity) implies that W must be independent of the rotational components of \(\varvec{F}\) and \(\varvec{H}\), which is typically achieved by ensuring that W depends on \(\varvec{F}\) and \(\varvec{H}\) via the symmetric tensors \(\varvec{F}^T\varvec{F}\) and \(\varvec{H}^T\varvec{H}\), respectively. The three strain measures \(\varvec{F}\), \(\varvec{H}\), and J have work conjugate stresses \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\), and \(\Sigma _J\) defined by [12, 32]

$$\begin{aligned} \begin{aligned} \varvec{\Sigma _F}\left( \varvec{F},\varvec{H},J\right)&=\frac{\partial W}{\partial \varvec{F}};\\ \varvec{\Sigma _H}\left( \varvec{F},\varvec{H},J\right)&=\frac{\partial W}{\partial \varvec{H}};\\ {\Sigma _J}\left( \varvec{F},\varvec{H},J\right)&=\frac{\partial W}{\partial J}. \end{aligned} \end{aligned}$$
(6)

It is then possible [12, 32, 33] to express the first Piola–Kirchhoff stres tensor \(\varvec{P}\) in terms of the extended strain measures \(\{\varvec{F},\varvec{H},J\}\) and conjugate stresses \(\{\varvec{\Sigma }_{\varvec{F}},\varvec{\Sigma }_{\varvec{H}},\Sigma _J\}\) as

$$\begin{aligned} \varvec{P}=\varvec{\Sigma _F} + \varvec{\Sigma _H}\varvec{\times }\varvec{F} + \Sigma _J\varvec{H}. \end{aligned}$$
(7)

A tangent elasticity operator \(D^2\Psi \left[ \delta \varvec{u};\varvec{u}\right] \) is usually needed [16] in order to ensure quadratic convergence of a Newton–Raphson type of solution process, derived in [12, 32, 33] as follows

$$\begin{aligned} D^2\Psi \left[ \delta \varvec{u};\varvec{u}\right]= & {} \begin{bmatrix} \left( \varvec{\nabla }_0\delta \varvec{u}\right) :&\left( \varvec{\nabla }_0\delta \varvec{u}\varvec{\times }\varvec{F}\right) :&\left( \varvec{\nabla }_0\delta \varvec{u}:\varvec{H}\right) \end{bmatrix}\nonumber \\&\left[ \mathbb {H}_W\right] \begin{bmatrix} :\left( \varvec{\nabla }_0\varvec{u}\right) \\ :\left( \varvec{\nabla }_0\varvec{u}\varvec{\times }\varvec{F}\right) \\ \left( \varvec{\nabla }_0\varvec{u}:\varvec{H}\right) \end{bmatrix}\nonumber \\&+\left( \varvec{\Sigma _H} + \Sigma _J\varvec{F}\right) :\left( \varvec{\nabla }_0\delta \varvec{u}\varvec{\times }\varvec{\nabla }_0\varvec{u}\right) , \end{aligned}$$
(8)

where the Hessian operator \([\mathbb {H}_W]\) denotes the symmetric positive semi-definite operator containing the second derivatives of \(W\left( \varvec{F},\varvec{H},J\right) \) as

$$\begin{aligned} \left[ \mathbb {H}_W\right] =\begin{bmatrix} W_{\varvec{F}\varvec{F}}&W_{\varvec{F}\varvec{H}}&W_{\varvec{F}J}\\ \\ W_{\varvec{H}\varvec{F}}&W_{\varvec{H}\varvec{H}}&W_{\varvec{H}J}\\\\ W_{J\varvec{F}}&W_{J\varvec{H}}&W_{JJ} \end{bmatrix}=\begin{bmatrix} \frac{\partial ^2 W}{\partial \varvec{F}\partial \varvec{F}}&\frac{\partial ^2 W}{\partial \varvec{F}\partial \varvec{H}}&\frac{\partial ^2 W}{\partial \varvec{F}\partial J} \\ \\ \frac{\partial ^2 W}{\partial \varvec{H}\partial \varvec{F}}&\frac{\partial ^2 W}{\partial \varvec{H}\partial \varvec{H}}&\frac{\partial ^2 W}{\partial \varvec{H}\partial J} \\ \\ \frac{\partial ^2 W}{\partial J\partial \varvec{F}}&\frac{\partial ^2 W}{\partial J\partial \varvec{H}}&\frac{\partial ^2 W}{\partial J\partial J} \end{bmatrix}. \nonumber \\ \end{aligned}$$
(9)

Note that the first term on the right hand side of (8) is necessarily positive for \(\delta \varvec{u}=\varvec{u}\) and, therefore, buckling can only be induced by the second “initial stress” term \(\left( \varvec{\Sigma _H} + \Sigma _J\varvec{F}\right) :\left( \varvec{\nabla }_0\delta \varvec{u}\varvec{\times }\varvec{\nabla }_0\varvec{u}\right) \). Finally, it is also possible to re-write the tangent elasticity operator (8) as

$$\begin{aligned} \begin{aligned} D^2 \Psi [\delta \varvec{u};\varvec{u}]&= \varvec{\nabla }_0\delta \varvec{u}:\left[ \mathbb {H}_{W}^{eq}\right] :\varvec{\nabla }_0\varvec{u} \\&\quad +\left( \varvec{\Sigma _H} + \Sigma _J\varvec{F}\right) :\left( \varvec{\nabla }_0\delta \varvec{u}\varvec{\times }\varvec{\nabla }_0\varvec{u}\right) , \end{aligned} \end{aligned}$$
(10)

where use of the tensor cross product operations (133) and (134) yields

$$\begin{aligned} \begin{aligned} \left[ \mathbb {H}_{W}^{eq}\right]&=W_{\varvec{FF}}+\varvec{F}\varvec{\times }W_{\varvec{HH}}\varvec{\times }\varvec{F}\\&\quad + W_{JJ}\varvec{H}\otimes \varvec{H}+W_{\varvec{FH}}\varvec{\times }\varvec{F}+\varvec{F}\varvec{\times }W_{\varvec{HF}}\\&\quad +W_{\varvec{F}J}\otimes \varvec{H}+\varvec{H}\otimes W_{J\varvec{F}}+ \left( \varvec{F}\varvec{\times }W_{\varvec{H}J}\right) \otimes \varvec{H}\\&\quad + \varvec{H}\otimes \left( W_{J\varvec{H}}\varvec{\times }\varvec{F}\right) . \end{aligned}\nonumber \\ \end{aligned}$$
(11)

2.2.1 A polyconvex constitutive model: Mooney–Rivlin material

The well known Mooney–Rivlin constitutive model is polyconvex. An elegant representation of its strain energy functional in terms of the extended kinematic set \(\{\varvec{F},\varvec{H},J\}\) [12, 32, 33] is

$$\begin{aligned} W_{MR}\left( \varvec{F},\varvec{H},J\right) =\alpha \varvec{F}:\varvec{F} + \beta \varvec{H}:\varvec{H}+f(J), \end{aligned}$$
(12)

whereFootnote 2

$$\begin{aligned} f(J)=-2\alpha \text {ln}J -4\beta J + \frac{\lambda }{2}\left( J-1\right) ^2-3(\alpha +\beta ), \end{aligned}$$
(13)

with \(\alpha \) and \(\beta \) suitable non-negative material parameters and f(J) a convex function of J (refer to remark 1). The conjugate stresses \(\{\varvec{\Sigma }_{\varvec{F}},\varvec{\Sigma }_{\varvec{H}},\Sigma _J\}\) (6) for this model are obtained as

$$\begin{aligned} \varvec{\Sigma }_{\varvec{F}}=2\alpha \varvec{F};\quad \varvec{\Sigma }_{\varvec{H}}=2\beta \varvec{H};\quad {\Sigma }_{J}=-\frac{2\alpha }{J} -4\beta + \lambda \left( J-1\right) \nonumber \\ \end{aligned}$$
(14)

and the first Piola–Kirchhoff (7) results in

$$\begin{aligned} \varvec{P}=2\alpha \varvec{F} + 2\beta \varvec{H}\varvec{\times }\varvec{F} - \left( \frac{2\alpha }{J} + 4\beta - \lambda \left( J-1\right) \right) \varvec{H}.\nonumber \\ \end{aligned}$$
(15)

The Hessian operator \(\left[ \mathbb {H}_W\right] \) (9) adopts the following simple expression

$$\begin{aligned} \begin{bmatrix} \mathbb {H}_{W} \end{bmatrix}=\begin{bmatrix} 2\alpha \varvec{\mathcal {I}}&\varvec{0}&\varvec{0} \\ \varvec{0}&2\beta \varvec{\mathcal {I}}&\varvec{0}\\ \varvec{0}&\varvec{0}&\frac{2\alpha }{J^2} + \lambda \end{bmatrix}, \end{aligned}$$
(16)

where \({\mathcal {I}}_{iIjJ}=\delta _{ij}\delta _{IJ}\) and \(\left[ \mathbb {H}_W^{eq}\right] \) (11) is expressed as

$$\begin{aligned} \begin{aligned} \left[ \mathbb {H}_{W}^{eq}\right] =&2\alpha \varvec{\mathcal {I}}+2\beta \varvec{F}\varvec{\times }\varvec{\mathcal {I}}\varvec{\times }\varvec{F}+ \left( \frac{2\alpha }{J^2} + \lambda \right) \varvec{H}\otimes \varvec{H}. \end{aligned} \nonumber \\ \end{aligned}$$
(17)

The clear benefits of employing the new tensor cross product algebra in the context of polyconvex elasticity have been presented in [12, 32] and thoroughly detailed in [33].

Remark 1

An alternative function f(J) to that used in (13) is

$$\begin{aligned} f(J) = -(2\alpha + 4\beta )\ln J + \frac{\lambda }{2}(J - 1 )^2. \end{aligned}$$
(18)

Following [12], the material parameters \(\alpha \), \(\beta \) and \(\lambda \) used in (12) and (18) can be related to the classical Lamé parameters \(\mu _{\text {lin}}\) and \(\lambda _{\text {lin}}\) in linearised elasticity (in the reference configuration) through

$$\begin{aligned} 2\alpha + 2\beta = \mu _{\text {lin}};\quad \; \lambda + 4\beta = \lambda _{\text {lin}}. \end{aligned}$$
(19)

The Poisson ratio \(\nu \) in the reference configuration is related to material parameters \(\lambda _{\text {lin}}\) and \(\mu _{\text {lin}}\) as

$$\begin{aligned} \nu =\frac{\lambda _{\text {lin}}}{2(\lambda _{\text {lin}} + \mu _{\text {lin}})}. \end{aligned}$$
(20)

Substitution of Eq. (19) into (20) leads to the following expression for the Poisson ratio in terms of the material parameters \(\alpha \), \(\beta \) and \(\lambda \)

$$\begin{aligned} \nu =\frac{\lambda + 4\beta }{2(\lambda + 6\beta + 2\alpha )}. \end{aligned}$$
(21)

Classical beam theories do not consider dilatations or contractions of the beam section. Hence, a Poisson ratio of value \(\nu =0\) is consistent with such kinematical description of the beam. However, from Eq. (21), it can be inferred that it is not possible to find any combination of non-negative (and hence, satisfying polyconvexity) values for the material parameters \(\alpha \), \(\beta \) and \(\lambda \) compatible with \(\nu =0\). On the contrary, if the volumetric function f(J) in (13) is used instead, the relation between the material parameters \(\alpha \), \(\beta \) and \(\lambda \) relate to \(\mu _{\text {lin}}\) and \(\lambda _{\text {lin}}\) as

$$\begin{aligned} 2\alpha +2\beta =\mu _{\text {lin}};\qquad \lambda =\lambda _{\text {lin}}, \end{aligned}$$
(22a)

which leads to an expression for the Poison ratio \(\nu _{\text {lin}}\)

$$\begin{aligned} \nu _{\text {lin}}=\frac{\lambda }{2(\lambda + 2\alpha + 2\beta )}. \end{aligned}$$
(23)

The definition of f(J) in (13) allows to model the particular case \(\nu _{\text {lin}}=0\) without violating polyconvexity (non-negative values for \(\alpha \) and \(\beta \) with \(\lambda =0\)).

2.2.2 A non-polyconvex constitutive model: Saint Venant–Kirchhoff material

A popular constitutive model for beams is that of a Saint Venant–Kirchhoff material [1, 3, 8, 10, 38], where the strain energy functional per unit undeformed volume \(\Psi \) is defined in terms of the Green–Lagrange strain tensor \(\varvec{E}\) as

$$\begin{aligned}&\Psi _{SVK}\left( \varvec{\nabla }_0\varvec{x}\right) =\frac{\lambda }{2}\left( \text {tr}\varvec{E}\right) ^2+\mu \text {tr}\left( \varvec{EE}\right) ;\nonumber \\&\varvec{E}=\frac{1}{2}\left( \varvec{C}- \varvec{I}\right) ; \quad \varvec{C} = \varvec{F}^T\varvec{F}, \end{aligned}$$
(24)

where \(\varvec{C}\) is the right Cauchy–Green deformation tensor, \(\lambda \) and \(\mu \) the Lamé coefficients and \(\varvec{I}\) the identity tensor. Alternatively, \(\Psi _{SVK}\) can be expressed in terms of \(\varvec{C}\) as

$$\begin{aligned} \Psi _{SVK}\left( \varvec{C}\right) =\frac{\mu }{4}\left( \text {tr}(\varvec{CC}) - 2\text {tr}(\varvec{C})+3\right) + \frac{\lambda }{8}\left( \text {tr}(\varvec{C}) - 3\right) ^2. \nonumber \\ \end{aligned}$$
(25)

As shown in [30], it is possible to re-express the term \(\text {tr}\left( \varvec{CC}\right) \) in (25) as

$$\begin{aligned} \text {tr}\left( \varvec{C}\varvec{C}\right) =\left( \varvec{F}:\varvec{F}\right) ^2 - 2\left( \varvec{H}:\varvec{H}\right) . \end{aligned}$$
(26)

Introduction of identity (26) into (25) enables to rewrite the strain energy functional in terms of \(\varvec{F}\) and J as

$$\begin{aligned} W_{SVK}\left( \varvec{F},\varvec{H}\right)= & {} \frac{\mu }{4}\left( \left( \varvec{F}:\varvec{F}\right) ^2 - 2\left( \varvec{F}:\varvec{F}\right) -2\left( \varvec{H}:\varvec{H}\right) +3\right) \nonumber \\&+ \frac{\lambda }{8}\left( \left( \varvec{F}:\varvec{F}\right) - 3\right) ^2. \end{aligned}$$
(27)

The conjugate stresses \(\{\varvec{\Sigma }_{\varvec{F}},\varvec{\Sigma }_{\varvec{H}},\Sigma _J\}\) (6) for this model are obtained as

$$\begin{aligned} \varvec{\Sigma }_{\varvec{F}}= & {} \left( {\mu } + \frac{\lambda }{2}\right) (\varvec{F}:\varvec{F})\varvec{F} - \left( \frac{3\lambda }{2} + \mu \right) \varvec{F};\nonumber \\ \varvec{\Sigma }_{\varvec{H}}= & {} -\mu \varvec{H};\quad {\Sigma }_{J}=0 \end{aligned}$$
(28)

and the first Piola–Kirchhoff (7) results in

$$\begin{aligned} \varvec{P}=\left( {\mu } + \frac{\lambda }{2}\right) (\varvec{F}:\varvec{F})\varvec{F} - \left( \frac{3\lambda }{2} + \mu \right) \varvec{F} - \mu \varvec{H}\varvec{\times }\varvec{F}.\nonumber \\ \end{aligned}$$
(29)

The Hessian operator \(\left[ \mathbb {H}_W\right] \) (9) adopts the following simple expression

$$\begin{aligned}&\begin{bmatrix} \mathbb {H}_{W} \end{bmatrix}\nonumber \\&\quad =\begin{bmatrix} 2\left( {\mu } + \frac{\lambda }{2}\right) \varvec{F}\otimes \varvec{F} - \left( \frac{\lambda }{2}(3 - \varvec{F}:\varvec{F}) + \mu (1-\varvec{F}:\varvec{F})\right) \varvec{\mathcal {I}}&\varvec{0}&\varvec{0} \\ \varvec{0}&-\mu \varvec{\mathcal {I}}&\varvec{0}\\ \varvec{0}&\varvec{0}&0 \end{bmatrix}\!.\nonumber \\ \end{aligned}$$
(30)

Unfortunately this model is not polyconvex as can be observed from the lack of positive (semi)-definiteness of the Hessian operator in Eq. (30). Finally, the tensor \(\left[ \mathbb {H}_W^{eq}\right] \) (11) is expressed as

$$\begin{aligned} \begin{aligned} \left[ \mathbb {H}_{W}^{eq}\right]&= 2\left( {\mu } + \frac{\lambda }{2}\right) \varvec{F}\otimes \varvec{F}- \left( \frac{\lambda }{2}(3 - \varvec{F}:\varvec{F})+ \mu (1-\varvec{F}:\varvec{F})\right) \varvec{\mathcal {I}}\\&\quad -\mu \varvec{F}\varvec{\times }\varvec{\mathcal {I}}\varvec{\times }\varvec{F}. \end{aligned} \nonumber \\ \end{aligned}$$
(31)

3 Continuum degenerate polyconvex beam formulation

In this section, a continuum degenerate beam model is presented. Contrary to the work in [10, 11], where the strain energy functional is presented in terms of the Green–Lagrange strain tensor, the formulation hereby presented exploits the extended set of kinematic strain measures \(\{\varvec{F},\varvec{H},J\}\) along with the computational framework outlined in [12, 32, 33].

In addition, a co-rotational formulation in terms of an extended alternative set of co-rotational kinematic entities is also presented. This co-rotational approach will prove to be a natural link between the continuum degenerate approach and a more classical beam description where well-known engineering strain measures feature as the main kinematic entities.

3.1 Beam kinematics

Let us now consider that the elastic body introduced in the previous section is a beam structural element. In particular, it is assumed that the beam in the reference configuration has a straight axis of length L which is completely characterised by the position of a centre line \(\varvec{X}_0(s)\), parametrised in terms of \(s\in [0,L]\), and an orthonormal triad \(\{\varvec{D}_1,\varvec{D}_2,\varvec{D}_3\}\) where the vectors \(\varvec{D}_1\) and \(\varvec{D}_2\) lay parallel to the cross sectional area A(s) of the beam (with boundary \(\partial A(s)\)) and \(\varvec{D}_3\) is aligned along the undeformed centre line, see Fig. 2. In the following, summation over Greek indexes \(\alpha \) ranges from 1 to 2 whereas summation over Latin indexes i ranges from 1 to 3 and Einstein’s convention is assumed for repeated indices unless otherwise stated. The beam reference configuration is defined in terms of the convective coordinates \(\{\eta ^{\alpha },\alpha =1,2\}\) [8] by

$$\begin{aligned} \varvec{X}\left( \eta ^{\alpha },s\right) = \varvec{X}_0\left( s\right) + \bar{\varvec{X}}; \qquad \bar{\varvec{X}} = \eta ^{\alpha }\varvec{D}_{\alpha }. \end{aligned}$$
(32)
Fig. 2
figure 2

Reference and current configuration of a beam

For the current configuration, an orthonormal triad \(\{\varvec{d}_1(s),\varvec{d}_2(s),\varvec{d}_3(s)\}\) can be introduced with \(\varvec{d}_1(s)\) and \(\varvec{d}_2(s)\) laying on the cross sectional area a(s) (with boundary \(\partial a(s)\)) and with \(\varvec{d}_3(s)=\varvec{d}_1(s)\times \varvec{d}_2(s)\), see Fig. 2. Hence, the motion of the beam can be defined as [8, 39]

$$\begin{aligned} \varvec{x}(\eta ^{\alpha },s)=\varvec{x}_0\left( s\right) + \bar{\varvec{x}}(s); \qquad \bar{\varvec{x}}(s) = \eta ^{\alpha }\varvec{d}_{\alpha }\left( s\right) , \end{aligned}$$
(33)

where the orthonormal triads defined above are related via a rotation tensor \(\varvec{R}\in SO(3)\) defined as \(\varvec{R}=\varvec{d}_i\otimes \varvec{D}_i\) [1].

It is known that the rotation tensor \(\varvec{R}\) can be defined in terms of the Rodrigues parametrisation of a rotation vector \(\varvec{\theta }\) [40], with associated skew symmetric second order tensor \(\hat{\varvec{\theta }}=\varvec{I}\varvec{\times }\varvec{\theta }\),Footnote 3 as

$$\begin{aligned} \varvec{R}\left( \varvec{\theta }\right) =\varvec{I}+\frac{\sin {\vert \vert \varvec{\theta }\vert \vert }}{{\vert \vert \varvec{\theta }\vert \vert }}\left( \varvec{I}\varvec{\times }\varvec{\theta }\right) +\frac{1-\cos {\vert \vert \varvec{\theta }\vert \vert }}{{\vert \vert \varvec{\theta }\vert \vert }^2} \left( \varvec{I}\varvec{\times }\varvec{\theta }\right) \left( \varvec{I}\varvec{\times }\varvec{\theta }\right) . \nonumber \\ \end{aligned}$$
(34)

The deformation gradient tensor \(\varvec{F}\) (1) derived from the mapping (33) is obtained as [8]

$$\begin{aligned} \varvec{F}=\varvec{x}^{\prime }_{0}\otimes \varvec{D}_3 + \varvec{d}_{\alpha }\otimes \varvec{D}_{\alpha } + \eta ^{\alpha } \varvec{d}_{\alpha }^{\prime }\otimes \varvec{D}_3, \end{aligned}$$
(35)

where \((\bullet )^{\prime }:=\frac{d (\bullet )(s)}{d s}\). Above Eq. (35) in conjunction with (34) leads to a description of the deformation gradient tensor \(\varvec{F}\) in terms of the centre line \(\varvec{x}_0\) and the rotation vector \(\varvec{\theta }\), typically used in continuum degenerate descriptions [10, 11] of beam structural models. Notice that a computational framework built upon the representation (35) of the deformation gradient tensor is not restricted to beams with a straight axis in the reference undeformed configuration.

It can be interesting to re-write \(\varvec{F}\) in a more meaningful manner from the physical standpoint. With that in mind, following the algebraic manipulations in [8], the deformation gradient tensor \(\varvec{F}\) (35) can be re-expressed as

$$\begin{aligned} \begin{aligned} \varvec{F}=\varvec{R}\left( \varvec{\Gamma }\otimes \varvec{D}_3 + \hat{\varvec{K}}\bar{\varvec{X}}\otimes \varvec{D}_3+ \varvec{I}\right) \end{aligned} , \end{aligned}$$
(36)

where the beam strain measures \(\varvec{\Gamma }\) and \(\hat{\varvec{K}}\) are defined as

$$\begin{aligned} \begin{aligned} \varvec{\Gamma }=\varvec{R}^{T}\varvec{x}_{0}^{\prime }-\varvec{D}_3; \quad \hat{\varvec{K}}=\varvec{R}^{T}\varvec{R}^{\prime }, \end{aligned} \end{aligned}$$
(37)

with \(\hat{\varvec{K}}\) a skew-symmetric tensor that can be re-written in terms of a vector \(\varvec{K}\) as \(\hat{\varvec{K}}=\varvec{I} \varvec{\times }\varvec{K}\).Footnote 4 The beam strain measure \(\varvec{\Gamma }\) is known as the axial-shear strain vector whilst \(\varvec{K}\) is known as the torsional–bending strain vector. Thus, an alternative representation of (36) expressed in terms of the beam strain measures \(\{\varvec{\Gamma },{\varvec{K}}\}\) is [8]

$$\begin{aligned} \begin{aligned} \varvec{F}=\varvec{R}\bar{\varvec{U}};\quad \bar{\varvec{U}}=\varvec{B}\otimes \varvec{D}_3 + \varvec{I};\quad \varvec{B}=\varvec{\Gamma } + \left( \varvec{K}\times \bar{\varvec{X}}\right) . \end{aligned} \nonumber \\ \end{aligned}$$
(38)

Above Eq. (38) is known as the right extended polar decomposition of the deformation gradient \(\varvec{F}\) in terms of the beam strain measures \(\{\varvec{\Gamma },{\varvec{K}}\}\) as presented in [8]. Notice that whilst \(\varvec{R}\) is a rotation tensor, the non-symmetric strain tensor \(\varvec{\bar{U}}\) is not a pure stretch tensor (as in the classical polar decomposition theorem [16]). Note that \(\bar{\varvec{U}}\) is only symmetric in the case that \(\varvec{B}\) is colinear with \(\varvec{D}_3\), which only happens in the absence of shear strain and torsion.

Formulae (38) for the representation of the deformation gradient tensor \(\varvec{F}\) are very useful in the context of geometrically exact beam models when the strain energy functional is defined in terms of the classical beam strain measures \(\{\varvec{\Gamma },{\varvec{K}}\}\) [1, 3, 41]. Notice that a computational framework built upon the representation (36) of the deformation gradient tensor is restricted to beams with a straight axis in the reference undeformed configuration. A more general expression for this representation of the deformation gradient tensor can be found in [1].

3.2 Linearisation of the beam kinematics

As stated above (33), the mapping of the beam is defined in terms of the centre line \(\varvec{x}_0\) and the triad \(\{\varvec{d}_i\}\) (or the rotation vector \(\varvec{\theta }\)). Virtual or incremental variations of the centre line \(\varvec{x}_0\) will be denoted by \(\delta \varvec{u}_0\) or \(\varvec{u}_0\), respectively. In addition, virtual or incremental variations of the rotation vector \(\varvec{\theta }\) will be denoted by \(\delta {\varvec{\theta }}\) or \(\Delta {\varvec{\theta }}\), respectively. All of these fields must satisfy appropriate essential and natural boundary conditions at \(s=0\) and/or \(s=L\) [42]. Thus, the virtual or incremental variations of the mapping (33) are then defined as

$$\begin{aligned} \begin{aligned}&\delta \varvec{u}=\delta \varvec{u}_0+\eta ^{\alpha }D\varvec{d}_{\alpha }[\delta {\varvec{\theta }}];\\&\varvec{u}= \varvec{u}_0+\eta ^{\alpha }D\varvec{d}_{\alpha }[\Delta \varvec{\theta }]. \end{aligned} \end{aligned}$$
(39)

The first and second directional derivatives of the triad with respect to geometry changes are defined byFootnote 5

$$\begin{aligned} \begin{aligned}&D{\varvec{d}}_i[\delta {\varvec{\theta }}]=\delta {\varvec{\theta }}\times \varvec{d}_i;\\&D^2\varvec{d}_i[\delta {\varvec{\theta }};\Delta {\varvec{\theta }}]=\delta {\varvec{\theta }}\times \left( \Delta \varvec{\theta }\times \varvec{d}_i\right) . \end{aligned} \end{aligned}$$
(40)

Analogously, for the derivative of the triad with respect to the beam axis, the first and second directional derivatives are

(41)

The first and second directional derivatives of the deformation gradient \(\varvec{F}\) can be obtained as

(42)

where the directional derivatives of the triad were previously computed in (40)–(41). Notice how the kinematics of the beam introduces, in contrast to the continuum formulation, additional non-linearities into the problem through the non-vanishing second directional derivative of \(\varvec{F}\) (recall that this term vanishes in the more general continuum description). The first and second directional derivatives of the cofactor \(\varvec{H}\) are computed as

(43)

Finally, the first and second directional derivatives of the determinant J are computed as

(44)

Due to the kinematics introduced by the beam problem, further nonlinearities are observed in (43) and (44) with respect to (3) and (4), respectively. Alternatively, a formulation exclusively in terms of the centre line \(\varvec{x}_0\) and the triad \(\{\varvec{d}_i\}\) has been presented in references [1, 3]. This formulation avoids relating directional derivatives of the triad \(\{\varvec{d}_i\}\) with respect to the rotation vector \(\varvec{\theta }\) as in Eq. (40).

3.3 Transition from polyconvex continuum model to polyconvex beam theory

For the case of a polyconvex constitutive model satisfying Eq.  (5), it is possible to express the internal virtual work as follows

$$\begin{aligned} \begin{aligned}&\varvec{P}:D\varvec{F}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&\quad =D\Psi [\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&\quad =DW[D\varvec{F}[\delta \varvec{u}_0, \delta {\varvec{\theta }}],D\varvec{H}[\delta \varvec{u}_0, \delta {\varvec{\theta }}],DJ[\delta \varvec{u}_0, \delta {\varvec{\theta }}]]\\&\quad =\varvec{\Sigma }_{\varvec{F}}:D\varvec{F}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]+\varvec{\Sigma }_{\varvec{H}}:D\varvec{H}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&\qquad \,+\Sigma _{J}DJ[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&\quad =\varvec{\Sigma }_{\varvec{F}}:D\varvec{F}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]+\varvec{\Sigma }_{\varvec{H}}:(\varvec{F}\varvec{\times }D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}])\\&\qquad +\Sigma _{J}\varvec{H}:D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\\&\quad =(\varvec{\Sigma }_{\varvec{F}}+\varvec{\Sigma }_{\varvec{H}}\varvec{\times }\varvec{F}+\Sigma _J\varvec{H}):D\varvec{F}[\delta \varvec{u}_0, \delta {\varvec{\theta }}], \end{aligned} \end{aligned}$$
(45)

which leads to an identical expression for the first Piola–Kirchhoff stress tensor \(\varvec{P}\) as in (7). The tangent elasticity operator is obtained as

$$\begin{aligned} \begin{aligned} D^2\Psi [\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]&=D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]:\varvec{\mathcal {C}}:D\varvec{F}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] \\&\quad +\varvec{P}:D^2\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}], \end{aligned}\nonumber \\ \end{aligned}$$
(46)

where \(\varvec{\mathcal {C}}=\frac{\partial \varvec{P}}{\partial \varvec{F}}=\frac{\partial ^2 \Psi }{\partial \varvec{F}\partial \varvec{F}}\). Use of Eq.  (45) yields

$$\begin{aligned} \begin{aligned}&D^2\Psi [\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&=[\mathbb {S}_{\delta }]_W^T \left[ \mathbb {H}_W\right] [\mathbb {S}_{\Delta }]_W\\&\quad +\left( \varvec{\Sigma _H} + \Sigma _J\varvec{F}\right) :(D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\varvec{\times }D\varvec{F}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] )\\&\quad +\left( \varvec{\Sigma _F} + \varvec{\Sigma _H}\varvec{\times }\varvec{F} + \Sigma _J\varvec{H}\right) :D^2\varvec{F}[\delta {\varvec{\theta }};\Delta {\varvec{\theta }}], \end{aligned} \end{aligned}$$
(47)

where

$$\begin{aligned}&\begin{aligned}&\left[ \mathbb {S}_{\delta }\right] _W^T\\&\quad = \begin{bmatrix} D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]:&(D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\varvec{\times }\varvec{F}):&D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]:\varvec{H} \end{bmatrix};\\&\left[ \mathbb {S}_{\Delta }\right] _W= \begin{bmatrix} :D\varvec{F}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] \\ :\left( D\varvec{F}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] \varvec{\times }\varvec{F}\right) \\ D\varvec{F}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] :\varvec{H} \end{bmatrix}. \end{aligned} \end{aligned}$$
(48)

Note that the first term on the right hand side of (47) is necessarily positive for \(\delta \varvec{u}_0=\varvec{u}_0\) and \(\delta {\varvec{\theta }}=\Delta {\varvec{\theta }}\) and therefore buckling can only be induced by the “initial stress” term associated with the last two terms on the right hand side of (47). In comparison with Eq. (8), an extra nonlinearity present in above Eq.  (47) can be observed, namely the last term on the right hand side.

Finally, proceeding as in Sect. 2.2, it is possible to re-express Eq. (47) in terms of the alternative Hessian operator \([\mathbb {H}_{W}^{eq}]\) defined in Eq. (11) as

$$\begin{aligned} \begin{aligned}&D^2\Psi [\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&=D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]:\left[ \mathbb {H}_{W}^{eq}\right] :D\varvec{F}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] \\&\quad +\left( \varvec{\Sigma _H} + \Sigma _J\varvec{F}\right) :(D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\varvec{\times }D\varvec{F}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] )\\&\quad +\left( \varvec{\Sigma _F} + \varvec{\Sigma _H}\varvec{\times }\varvec{F} + \Sigma _J\varvec{H}\right) :D^2\varvec{F}[\delta {\varvec{\theta }};\Delta {\varvec{\theta }}]. \end{aligned} \end{aligned}$$
(49)

3.4 Polyconvex co-rotational beam formulation

In this Section, an alternative polyconvex co-rotational formulation is presented. This new approach will prove to be very useful establishing a link between the continuum degenerate description and a classical beam description in terms of the beam strain measures \(\varvec{\Gamma }\) and \(\varvec{K}\) defined above. Taking into consideration the right extended decomposition theorem (38) and the necessary objectivity requirement, an equivalent energy functional to \(W(\varvec{F},\varvec{H},J)\) can be defined in terms of the strain tensor \(\varvec{\bar{U}}=\varvec{R}^T\varvec{F}\) (38), its cofactor \(\bar{\varvec{W}}=\varvec{R}^T\varvec{H}\) and its determinant J,Footnote 6 namely \(\hat{W}(\varvec{\bar{U}},\bar{\varvec{W}},J)\). Notice that both representations \(W(\varvec{F},\varvec{H},J)\) and \(\hat{W}(\varvec{\bar{U}},\bar{\varvec{W}},J)\) have identical expressions in their own arguments. Analogously to (6), a new set of stress variables \(\{\varvec{\Sigma }_{\bar{\varvec{U}}},\varvec{\Sigma }_{\bar{\varvec{W}}},\Sigma _J\}\) conjugate to \(\{\bar{\varvec{U}},\bar{\varvec{W}},J\}\) can be defined as

$$\begin{aligned} \begin{aligned} \varvec{\Sigma }_{\bar{\varvec{U}}}\left( {\bar{\varvec{U}}},\bar{\varvec{W}},J\right)&=\frac{\partial \hat{W}}{\partial {\bar{\varvec{U}}}}=\varvec{R}^T\frac{\partial W}{\partial \varvec{F}};\\ \varvec{\Sigma }_{\bar{\varvec{W}}}\left( {\bar{\varvec{U}}},\bar{\varvec{W}},J\right)&=\frac{\partial \hat{W}}{\partial \bar{\varvec{W}}}=\varvec{R}^T\frac{\partial W}{\partial \varvec{H}};\\ {\Sigma }_{J}\left( {\bar{\varvec{U}}},\bar{\varvec{W}},J\right)&=\frac{\partial \hat{W}}{\partial J}=\frac{\partial W}{\partial J}. \end{aligned} \end{aligned}$$
(50)

Considering the right extended decomposition theorem (38), the internal virtual work can be written as

$$\begin{aligned} \varvec{P}:D\varvec{F}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]= & {} \varvec{P}:\varvec{R}D\bar{\varvec{U}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\nonumber \\&+\varvec{P}:D\varvec{R}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\bar{\varvec{U}}. \end{aligned}$$
(51)

Conservation of angular momentum in the continuum implies

$$\begin{aligned} \varvec{P}:D\varvec{R}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\bar{\varvec{U}}= & {} \varvec{FS}:D\varvec{R}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\bar{\varvec{U}}\nonumber \\= & {} \underbrace{\bar{\varvec{U}}^T\varvec{S}\bar{\varvec{U}}}_{\text {Symmetric tensor}}: \underbrace{\varvec{R}^TD\varvec{R}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]}_{\text {Skew symmetric tensor}}\nonumber \\= & {} 0 \end{aligned}$$
(52)

where \(\varvec{S}\) is the second Piola–Kirchhoff stress tensor. Hence, an alternative co-rotational representation for the internal virtual work (51) is as follows

$$\begin{aligned} \varvec{P}:D\varvec{F}[\delta \varvec{u}_0, \delta {\varvec{\theta }}] = \bar{\varvec{P}}:D\bar{\varvec{U}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}];\qquad \bar{\varvec{P}} = \varvec{R}^T\varvec{P},\nonumber \\ \end{aligned}$$
(53)

which results in

$$\begin{aligned} \begin{aligned}&\bar{\varvec{P}}:D\bar{\varvec{U}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&=D\Psi [\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&=D\hat{W}[D\bar{\varvec{U}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}],D\bar{\varvec{W}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}],DJ[\delta \varvec{u}_0, \delta {\varvec{\theta }}]]\\&=\varvec{\Sigma }_{\bar{\varvec{U}}}:D\bar{\varvec{U}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]+\varvec{\Sigma }_{\bar{\varvec{W}}}:D\bar{\varvec{W}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&\quad +\Sigma _{J}DJ[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&=\varvec{\Sigma }_{\bar{\varvec{U}}}:D\bar{\varvec{U}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}]\\&\quad +\varvec{\Sigma }_{\bar{\varvec{W}}}:(\bar{\varvec{U}}\varvec{\times }D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}])+\Sigma _{J}\bar{\varvec{W}}:D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\\&=(\varvec{\Sigma }_{\bar{\varvec{U}}}+\varvec{\Sigma }_{\bar{\varvec{W}}}\varvec{\times }\bar{\varvec{U}}+\Sigma _{J}\bar{\varvec{W}}):D\bar{\varvec{U}}[\delta \varvec{u}_0, \delta {\varvec{\theta }}], \end{aligned} \end{aligned}$$
(54)

which leads to the following expression for the co-rotational stress tensor \(\bar{\varvec{P}}\)

$$\begin{aligned} \bar{\varvec{P}} = \varvec{\Sigma }_{\bar{\varvec{U}}}+\varvec{\Sigma }_{\bar{\varvec{W}}}\varvec{\times }\bar{\varvec{U}}+\Sigma _{J}\bar{\varvec{W}}. \end{aligned}$$
(55)

Following a similar procedure to that of Eq. (47), an alternative expression for the tangent operator of the energy \(\Psi \) can be obtained in terms of \(\{\bar{\varvec{U}},\bar{\varvec{W}},J\}\) as follows

$$\begin{aligned} \begin{aligned}&D^2\Psi [\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&=[\mathbb {S}_{\delta }]_{\hat{W}}^T \left[ \mathbb {H}_{\hat{W}}\right] [\mathbb {S}_{\Delta }]_{\hat{W}}\\&\quad +\left( \varvec{\Sigma }_{\bar{\varvec{W}}} + \Sigma _{J}\bar{\varvec{U}}\right) :(D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\varvec{\times }D\bar{\varvec{U}}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] )\\&\quad +\left( \varvec{\Sigma }_{\bar{\varvec{U}}} + \varvec{\Sigma }_{\bar{\varvec{W}}}\varvec{\times }\bar{\varvec{U}} + \Sigma _{J}\bar{\varvec{W}}\right) :D^2\bar{\varvec{U}}[\delta {\varvec{\theta }};\Delta {\varvec{\theta }}], \end{aligned} \end{aligned}$$
(56)

where

$$\begin{aligned} \begin{aligned}&\left[ \mathbb {S}_{\delta }\right] _{\hat{W}}^T\\&\quad = \begin{bmatrix} D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]:&(D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\varvec{\times }\bar{\varvec{U}}):&D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]:\bar{\varvec{W}} \end{bmatrix};\\&\left[ \mathbb {S}_{\Delta }\right] _{\hat{W}}= \begin{bmatrix} :D\bar{\varvec{U}}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] \\ :\left( D\bar{\varvec{U}}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] \varvec{\times }\bar{\varvec{U}}\right) \\ D\bar{\varvec{U}}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] :\bar{\varvec{W}} \end{bmatrix} \end{aligned} \end{aligned}$$
(57)

and with the Hessian operator \([\mathbb {H}_{\hat{W}}]\) denoting the symmetric positive semi-definite operator containing the second derivatives of \(\hat{W}\left( \bar{\varvec{U}},\bar{\varvec{W}},J\right) \) as

$$\begin{aligned} \left[ \mathbb {H}_{\hat{W}}\right] =\begin{bmatrix} \hat{W}_{{\bar{\varvec{U}}}{\bar{\varvec{U}}}}&\hat{W}_{{\bar{\varvec{U}}}\bar{\varvec{W}}}&\hat{W}_{{\bar{\varvec{U}}}J}\\ \\ \hat{W}_{\bar{\varvec{W}}{\bar{\varvec{U}}}}&\hat{W}_{\bar{\varvec{W}}\bar{\varvec{W}}}&\hat{W}_{\bar{\varvec{W}}J}\\\\ \hat{W}_{J{\bar{\varvec{U}}}}&\hat{W}_{J\bar{\varvec{W}}}&\hat{W}_{JJ} \end{bmatrix}= \begin{bmatrix} \frac{\partial ^2 \hat{W}}{\partial \bar{\varvec{U}}\partial \bar{\varvec{U}}}&\frac{\partial ^2 \hat{W}}{\partial \bar{\varvec{U}}\partial \bar{\varvec{W}}}&\frac{\partial ^2 \hat{W}}{\partial \bar{\varvec{U}}\partial J} \\ \\ \frac{\partial ^2 \hat{W}}{\partial \bar{\varvec{W}}\partial \bar{\varvec{U}}}&\frac{\partial ^2 \hat{W}}{\partial \bar{\varvec{W}}\partial \bar{\varvec{W}}}&\frac{\partial ^2 \hat{W}}{\partial \bar{\varvec{W}}\partial J} \\ \\ \frac{\partial ^2 \hat{W}}{\partial J\partial \bar{\varvec{U}}}&\frac{\partial ^2 \hat{W}}{\partial J\partial \bar{\varvec{W}}}&\frac{\partial ^2 \hat{W}}{\partial J\partial J} \end{bmatrix}.\nonumber \\ \end{aligned}$$
(58)

As mentioned above, the identical representation of the energy functionals \(W(\varvec{F},\varvec{H},J)\) and \(\hat{W}(\varvec{\bar{U}},\bar{\varvec{W}},J)\) with respect to their respective arguments results in identical expressions of the Hessian operators \(\begin{bmatrix}\mathbb {H}_{W}\end{bmatrix}\) and \(\begin{bmatrix}\mathbb {H}_{\hat{W}}\end{bmatrix}\) in their respective arguments. Therefore, both Hessian operators \(\begin{bmatrix}\mathbb {H}_{W}\end{bmatrix}\) and \(\begin{bmatrix}\mathbb {H}_{\hat{W}}\end{bmatrix}\) will be positive semi-definite, provided that \(W(\varvec{F},\varvec{H},J)\) and hence, \(\hat{W}(\varvec{\bar{U}},\bar{\varvec{W}},J)\) satisfy polyconvexity. The latter representation of the strain energy functional will be further pursued in the next Section.

4 Relationship with the classical beam formulation

Alternatively to continuum degenerate based formulations, as that presented in Sect. 3, classical beam formulations employ a kinematical description based on the strain measures \(\{\varvec{\Gamma },\varvec{K}\}\) by means of the definition of the deformation gradient tensor presented in Eq.  (38).

The objective of this Section is twofold. On the one hand, the strain energy per unit undeformed volume (i.e. \(W(\varvec{F},\varvec{H},J)=\hat{W}(\varvec{\bar{U}},\varvec{\bar{W}},J)\)) will be re-written in terms of the alternative beam strain vector \(\varvec{B}\) defined in (38). Further manipulations will lead to the introduction of the strain energy per unit undeformed length in terms of the beam strain measures \(\varvec{\Gamma }\) and \(\varvec{K}\). On the second hand, relationships will be established between the tangent operators of the different energy representations to relate the concept of polyconvexity at a continuum and beam level.

4.1 Beam kinematics and linearisation

4.1.1 The classical beam strain measures

Manipulation of equations in (37) enable the strain measures \(\{\varvec{\Gamma },{\varvec{K}}\}\) to be expressed in terms of the centre line and the triads [1] as

$$\begin{aligned} \varvec{\Gamma }=\left( \varvec{d}_i\cdot \varvec{x}^{\prime }_{0}\right) \varvec{D}_i - \varvec{D}_3;\qquad \varvec{K}=\frac{1}{2}(\varvec{d}_i\cdot \varvec{d}^{\prime }_j)(\varvec{D}_j\times \varvec{D}_{i}).\nonumber \\ \end{aligned}$$
(59)

From equations in (59), the first and second directional derivatives of the beam strain measures \(\{\varvec{\Gamma },\varvec{K}\}\) follow as

(60)

and

$$\begin{aligned} \begin{aligned}&D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\\&\quad =\frac{1}{2}\left( D\varvec{d}_i[\delta {\varvec{\theta }}]\cdot \varvec{d}^{\prime }_{j} + \varvec{d}_i\cdot D\varvec{d}^{\prime }_{j}[\delta {\varvec{\theta }}]\right) (\varvec{D}_j\times \varvec{D}_i);\\&D^2\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&\quad =\frac{1}{2}\left( D\varvec{d}_i[\delta {\varvec{\theta }}]\cdot D\varvec{d}^{\prime }_{j}\left[ \Delta {\varvec{\theta }}\right] +D\varvec{d}_i\left[ \Delta {\varvec{\theta }}\right] \cdot D\varvec{d}^{\prime }_{j}[\delta {\varvec{\theta }}]\right. \\&\qquad +\left. D^2\varvec{d}_i[\delta {\varvec{\theta }};\Delta \varvec{\theta }]\cdot \varvec{d}^{\prime }_{j}+\varvec{d}_i\cdot D^2\varvec{d}^{\prime }_{j}[\delta {\varvec{\theta }};\Delta \varvec{\theta }]\right) \\&\qquad (\varvec{D}_j\times \varvec{D}_i). \end{aligned} \end{aligned}$$
(61)

4.1.2 The strain vector \(\varvec{B}\)

The directional derivative of the strain vector \(\varvec{B}\) defined in (38) can be written as

$$\begin{aligned} \begin{aligned}&D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]=D\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }}] + D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\times \bar{\varvec{X}}\\&= \begin{bmatrix} \varvec{I}&-\varvec{I}\varvec{\times }\bar{\varvec{X}} \end{bmatrix}\begin{bmatrix} D\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\\ D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}] \end{bmatrix};\\&D^2\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]= D^2\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}] \\&\quad + D^2\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\times \bar{\varvec{X}}\\&=\begin{bmatrix} \varvec{I}&-\varvec{I}\varvec{\times }\bar{\varvec{X}} \end{bmatrix}\begin{bmatrix} D^2\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\ D^2\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}] \end{bmatrix} \end{aligned} \end{aligned}$$
(62)

where use of the tensor cross product formula (130) has been made above.

4.1.3 Co-rotational strain measures

It is possible to obtain explicit expressions of the strain tensor \(\varvec{\bar{U}}\), its cofactor \(\bar{\varvec{W}}\) and its determinant J in terms of the strain vector \(\varvec{B}\) (linearly related to \(\varvec{\Gamma }\) and \(\varvec{K}\)). Recalling equation (38) which gives an explicit representation of the strain tensor \(\varvec{\bar{U}}=\varvec{B}\otimes \varvec{D}_3 + \varvec{I}\) in terms of the strain vector \(\varvec{B}\), the cofactor \(\bar{\varvec{W}}\) can be evaluated making use of property (143) as

$$\begin{aligned} \begin{aligned} \bar{\varvec{W}}&=\frac{1}{2}\varvec{\bar{U}}\varvec{\times }\varvec{\bar{U}}\\&=\frac{1}{2}\left( \varvec{B}\otimes \varvec{D}_3 + \varvec{I}\right) \varvec{\times }\left( \varvec{B}\otimes \varvec{D}_3 + \varvec{I}\right) \\&=\frac{1}{2}\left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\left( \varvec{B}\otimes \varvec{D}_3\right) + \left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\varvec{I} \\&\quad + \frac{1}{2}\varvec{I}\varvec{\times }\varvec{I}\\&=\frac{1}{2}\underbrace{\left( \varvec{B}\times \varvec{B}\right) }_{\varvec{0}}\otimes \underbrace{\left( \varvec{D}_3\times \varvec{D}_3\right) }_{\varvec{0}}+\left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\varvec{I} + \varvec{I}\\&= \left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\varvec{I} + \varvec{I}, \end{aligned} \end{aligned}$$
(63)

where use of the properties (141) and (144) has been made. Similarly, the determinant J can be computed by means of the property (142) and using the above result (63)

$$\begin{aligned} \begin{aligned} J&=\frac{1}{3}\bar{\varvec{W}}:\varvec{\bar{U}}\\&= \frac{1}{3}\left[ \left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\varvec{I} + \varvec{I}\right] :\left[ \varvec{B}\otimes \varvec{D}_3 + \varvec{I}\right] \\&= \frac{1}{3}\left[ \left[ \left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\varvec{I}\right] :(\varvec{B}\otimes \varvec{D}_3)\right. \\&\quad \left. + \left[ \left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\varvec{I}\right] :\varvec{I} + \varvec{B}\cdot \varvec{D}_3 + 3\right] \\&= \frac{1}{3}[\underbrace{\left( \varvec{B}\times \varvec{B}\right) }_{\varvec{0}}\cdot \underbrace{\left( \varvec{D}_3\times \varvec{D}_3\right) }_{\varvec{ 0}}+ \underbrace{\left[ \left( \varvec{B}\otimes \varvec{D}_3\right) \varvec{\times }\varvec{I}\right] :\varvec{I}}_{2\varvec{B}\cdot \varvec{D}_3} \\&\qquad + \varvec{B}\cdot \varvec{D}_3 + 3]= \varvec{B}\cdot \varvec{D}_3 + 1, \end{aligned} \end{aligned}$$
(64)

where use of properties (139), (141) and (144) has been made. The directional derivatives of \(\{\varvec{\bar{U}},\bar{\varvec{W}},J\}\) are obtained from above equations as

$$\begin{aligned}&\begin{aligned}&D\varvec{\bar{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]= D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\otimes \varvec{D}_3;\\&D^2\varvec{\bar{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]=D^2\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\otimes \varvec{D}_3, \end{aligned} \end{aligned}$$
(65)
$$\begin{aligned}&\begin{aligned}&D\bar{\varvec{W}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]= (D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\otimes \varvec{D}_3)\varvec{\times }\varvec{I};\\&D^2\varvec{\bar{\varvec{W}}}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&\quad = (D^2\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\otimes \varvec{D}_3)\varvec{\times }\varvec{I}, \end{aligned} \end{aligned}$$
(66)

and

$$\begin{aligned} \begin{aligned}&D{J}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]=D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\cdot \varvec{D}_3;\\&D^2{J}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]=D^2\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\cdot \varvec{D}_3. \end{aligned} \end{aligned}$$
(67)

4.2 Polyconvex beam theory

It is customary in classical beam formulations to define a strain energy functional per unit undeformed length \(W_b(\varvec{\Gamma },\varvec{K})\) dependent on the classical beam strain measures \(\varvec{\Gamma }\) and \(\varvec{K}\). However, it is insightful to obtain the relationship between the strain energy functional per unit undeformed length \(W_b(\varvec{\Gamma },\varvec{K})\) and the strain energy functional per unit undeformed volume \(W(\varvec{F},\varvec{H},J)\) (5) (or its alternative equivalent representation \(\hat{W}(\varvec{\bar{U}},\bar{\varvec{W}},J)\)) defined previously.

Moreover, by use of Eq. (38), where the strain tensor \(\varvec{\bar{U}}\) is expressed in terms of the strain vector \(\varvec{B}\), in conjunction with the definitions of the cofactor (143) and the determinant of a tensor (142), it is feasible to obtain an explicit expression of the strain energy functional in terms of the strain vector \(\varvec{B}\) (linearly related to \(\varvec{\Gamma }\) and \(\varvec{K}\)), namely \(\bar{W}(\varvec{B}(\varvec{\Gamma },\varvec{K}))\). The relationship between the alternative strain energy functionals can be summarised as

$$\begin{aligned} \begin{aligned}&W_b(\varvec{\Gamma },\varvec{K})=\int _{A(s)}\bar{W}(\varvec{B}(\varvec{\Gamma },\varvec{K}))dA;\\&\bar{W}(\varvec{B}(\varvec{\Gamma },\varvec{K}))=\hat{W}(\varvec{\bar{U}},\bar{\varvec{W}},J)=W(\varvec{F},\varvec{H},J). \end{aligned} \end{aligned}$$
(68)

4.2.1 Convexity in terms of the strain vector \(\varvec{B}\)

In the context of beam mechanics, an appropriate restriction for a constitutive model is convexity with respect to the beam strain vector \(\varvec{B}\). Essentially, the strain energy \(\Psi \) per unit undeformed volume of the beam must be a function of the deformation gradient via a convex function \(\bar{W}\) as

$$\begin{aligned} \Psi \left( \varvec{\nabla }_0\varvec{x}\right) =\bar{W}\left( \varvec{B}\right) , \end{aligned}$$
(69)

where \(\bar{W}\) is convex with respect to its three variables, namely the \(3\times 1\) components of \(\varvec{B}\). Following a similar approach to that in Sect. 2.2, it is possible to define a new work conjugate stress vector \(\varvec{\Sigma _{B}}\) to the beam strain vector \(\varvec{B}\) as

$$\begin{aligned} \varvec{\Sigma _{B}}\left( \varvec{B}\right) =\frac{\partial \bar{W}}{\partial \varvec{B}}. \end{aligned}$$
(70)

This set of conjugate measures enables the directional derivative of the strain energy \(\bar{W}\) to be expressed as

$$\begin{aligned} D\bar{W}\left[ D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\right] =\varvec{\Sigma _{B}}\cdot D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]. \end{aligned}$$
(71)

Recalling that the directional derivative of the strain energy \({\Psi }\) fulfils

$$\begin{aligned} D{\Psi }[\delta \varvec{u}_0,\delta {\varvec{\theta }}]=\bar{\varvec{P}}:D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}], \end{aligned}$$
(72)

enables, after comparison of Eqs. (72) and (71) (making use of Eq. (65)), to conclude

$$\begin{aligned} \varvec{\Sigma _B} = \bar{\varvec{P}}\varvec{D}_3. \end{aligned}$$
(73)

It transpires from Eq. (73) and the definition of \(\bar{\varvec{P}}\) in (53) that the stress vector \(\varvec{\Sigma _B}\) represents the traction vector on the cross section of the beam rotated back to the reference underformed configuration.

Alternatively to Eqs. (47) and (56), the tangent operator for the strain energy \(\Psi \) can be re-expressed in terms of the directional derivatives of the strain vector \(\varvec{B}\) as

$$\begin{aligned} \begin{aligned}&D^2\Psi [\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&\quad =D(\varvec{\Sigma _{B}}\cdot D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}])[\varvec{u}_0,\Delta {\varvec{\theta }}]\\&\quad = D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\cdot \begin{bmatrix} {\mathbb H_{\bar{W}}} \end{bmatrix} D\varvec{B}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] \\&\qquad +\varvec{\Sigma _{B}}\cdot D^2\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}], \end{aligned} \end{aligned}$$
(74)

where the Hessian operator \([\mathbb {H}_{\bar{W}}]\) denotes the symmetric positive semi-definite operator containing the second derivatives of \(\bar{W}\left( \varvec{B}\right) \)

$$\begin{aligned} \left[ \mathbb {H}_{\bar{W}}\right] = \frac{\partial ^2 \bar{W}}{\partial \varvec{B}\partial \varvec{B}}. \end{aligned}$$
(75)

Note that the first term on the last right hand side of Eq. (74) is necessarily positive for \(\delta \varvec{u}_0=\varvec{u}_0\) and \(\delta {\varvec{\theta }}=\Delta {\varvec{\theta }}\). Therefore, buckling can only be induced by the “initial stress” term in (74) defined by \(\varvec{\Sigma _{B}}\cdot D^2\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\), where the second directional derivatives of the beam strain vector \(\varvec{B}\) were previously evaluated in (62).

4.2.2 Polyconvexity in terms of the beam strain measures \(\varvec{\Gamma }\) and \(\varvec{K}\)

Equivalently, an appropriate restriction for a constitutive model is polyconvexity with respect to the beam strain measures \(\{\varvec{\Gamma },\varvec{K}\}\) [35]. It is well known that this condition ensures ellipticity of the static problem and, equivalently, hyperbolicity of its dynamic counterpart [35]. Essentially, the strain energy \(\Psi _b\) Footnote 7 per unit undeformed length of the beam must be a function of the deformation gradient via a convex multi-valued function \(W_b\) as

$$\begin{aligned} \Psi _b\left( \varvec{\nabla }_0\varvec{x}\right) =W_b\left( \varvec{\Gamma },\varvec{K}\right) , \end{aligned}$$
(76)

where \(W_b\) is convex with respect to its 6 variables, namely, the \(3\times 1\) components of \(\varvec{\Gamma }\) and \(\varvec{K}\). Moreover, invariance with respect to rotations is automatically satisfied since \(\varvec{\Gamma }\) and \(\varvec{K}\) are defined in the reference configuration. Following a similar approach to that in Sect.  2.2, it is possible to define work conjugates \(\varvec{\Sigma _{\Gamma }}\) and \(\varvec{\Sigma _{K}}\) to the beam strain measures \(\varvec{\Gamma }\) and \(\varvec{K}\), respectively, as

$$\begin{aligned} \varvec{\Sigma _{\Gamma }}\left( \varvec{\Gamma },\varvec{K}\right) =\frac{\partial W_b}{\partial \varvec{\Gamma }};\quad \varvec{\Sigma _K}\left( \varvec{\Gamma },\varvec{K}\right) =\frac{\partial W_b}{\partial \varvec{K}}. \end{aligned}$$
(77)

It is easy to realise that \(\varvec{\Sigma }_{\varvec{\Gamma }}\) represents the axial-shear force vector rotated back to the reference undeformed configuration, whilst \(\varvec{\Sigma }_{\varvec{K}}\) is the torsional–bending moment rotated back to the reference undeformed configuration, work conjugates of the axial-shear strain vector \(\varvec{\Gamma }\) and the torsional–bending strain vector \(\varvec{K}\), respectively. This set of conjugate measures enables the directional derivative of the strain energy \(W_b\) to be expressed as

$$\begin{aligned} DW_b\left[ D\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }}],D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\right]= & {} \varvec{\Sigma _{\Gamma }}\cdot D\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\nonumber \\&+ \varvec{\Sigma _K}\cdot D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}].\nonumber \\ \end{aligned}$$
(78)

Recalling the relation between the strain energies \(W_b\) and \(\bar{W}\) in (68) and hence, between \(\Psi _b\) and \({\Psi }\) and Eq.  (71), it is possible to obtain the following expression for the directional derivative of \(\Psi _b\)

$$\begin{aligned} D\Psi _b[\delta \varvec{u}_0,\delta {\varvec{\theta }}]= & {} \int _{A(s)}D\Psi [\delta \varvec{u}_0,\delta {\varvec{\theta }}]\;dA\nonumber \\= & {} \int _{A(s)} \varvec{\Sigma _B}\cdot D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\; dA. \end{aligned}$$
(79)

The directional derivative of \(\varvec{B}\) can be expressed in terms of the directional derivatives of \(\{\varvec{\Gamma },\varvec{K}\}\) through Eq. (62), leading to the following modified version of Eq.  (79)

$$\begin{aligned} \begin{aligned} D\Psi _b[\delta \varvec{u}_0,\delta {\varvec{\theta }}]&=\int _{A(s)} \varvec{\Sigma _B}\,dA\cdot D\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }}] \\&\quad \,- \int _{A(s)} \varvec{\Sigma _B}\,dA\cdot (\varvec{I}\varvec{\times }\,\bar{\varvec{X}})D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\\&=\int _{A(s)} \varvec{\Sigma _B}\,dA\cdot D\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\\&\quad \,+ \int _{A(s)} (\bar{\varvec{X}}\times \varvec{\Sigma _B})\;dA\cdot D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}] \end{aligned} \end{aligned}$$
(80)

where use of the tensor cross product formula (130) has been made. Comparison of Eq. (80) with (78) leads to the relation between the conjugate beam measures \(\varvec{\Sigma _{\Gamma }}\) and \(\varvec{\Sigma _{K}}\) (77) with \(\varvec{\Sigma _B}\) (70) as

$$\begin{aligned} \varvec{\Sigma _{\Gamma }}=\int _{A(s)}\varvec{\Sigma _B}\,dA;\quad \varvec{\Sigma _{K}}=\int _{A(s)}\bar{\varvec{X}}\times \varvec{\Sigma _B}\,dA. \end{aligned}$$
(81)

Note that the tangent operator of the strain energy \(W_b\) naturally emerges as

$$\begin{aligned} \begin{aligned}&D^2W_b[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&\quad = \int _{A(s)}D^2\Psi [\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\,dA\\&\quad = \begin{bmatrix} \mathbb {S}_{\delta } \end{bmatrix}_{W_b}^T\begin{bmatrix} \mathbb {H}_{W_b} \end{bmatrix}\begin{bmatrix} \mathbb {S}_{\Delta } \end{bmatrix}_{W_b} \\&\qquad + \varvec{\Sigma _{\Gamma }}\cdot D^2\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\\&\qquad + \varvec{\Sigma _{K}}\cdot D^2\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}], \end{aligned} \end{aligned}$$
(82)

with

$$\begin{aligned}&\begin{bmatrix} \mathbb {S}_{\delta } \end{bmatrix}_{W_b}^T = \begin{bmatrix} D\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\cdot&D\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\cdot \end{bmatrix};\nonumber \\&\begin{bmatrix} \mathbb {S}_{\Delta } \end{bmatrix}_{W_b} = \begin{bmatrix} D\varvec{\Gamma }[\varvec{u}_0,\Delta {\varvec{\theta }}] \\ D\varvec{K}[\varvec{u}_0,\Delta {\varvec{\theta }}] \end{bmatrix}. \end{aligned}$$
(83)

and the Hessian operator \([\mathbb {H}_{W_b}]\) is

$$\begin{aligned} \begin{aligned} \left[ \mathbb {H}_{W_b}\right] =\begin{bmatrix} \frac{\partial ^2 W_b}{\partial \varvec{\Gamma }\partial \varvec{\Gamma }}&\frac{\partial ^2 W_b}{\partial \varvec{\Gamma }\partial \varvec{K}}\\ \frac{\partial ^2 W_b}{\partial \varvec{K}\partial \varvec{\Gamma }}&\frac{\partial ^2 W_b}{\partial \varvec{K}\partial \varvec{K}} \end{bmatrix} \end{aligned} \end{aligned}$$
(84)

4.3 Relationship between tangent elasticity operators

Use of Eqs. (62) and (81) in the second term on the right hand side of Eq. (82) enables both geometric terms of the tangent operators for \(W_b(\varvec{\Gamma },\varvec{K})\) and \(\bar{W}(\varvec{B})\) to be related as

$$\begin{aligned}&\varvec{\Sigma _{\Gamma }}\cdot D^2\varvec{\Gamma }[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}] + \varvec{\Sigma _{K}}\cdot D^2\varvec{K}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\nonumber \\&\quad = \int _{A(s)}{\varvec{\Sigma _{B}}\cdot D^2\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\,dA} \end{aligned}$$
(85)

Hence, the constitutive term for the tangent operators for both \(W_b(\varvec{\Gamma },\varvec{K})\) and \(\bar{W}(\varvec{B})\) must coincide, namely

$$\begin{aligned} \begin{bmatrix} \mathbb {S}_{\delta } \end{bmatrix}_{W_b}^T\begin{bmatrix} \mathbb {H}_{W_b} \end{bmatrix}\begin{bmatrix} \mathbb {S}_{\Delta } \end{bmatrix}_{W_b}= \begin{aligned} D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\cdot \begin{bmatrix} \mathbb {H}_{\bar{W}} \end{bmatrix}D\varvec{B}[\varvec{u}_0,\Delta {\varvec{\theta }}]. \end{aligned}\nonumber \\ \end{aligned}$$
(86)

Use of Eqs. (83) and (62) into (86) finally yields

$$\begin{aligned} \begin{aligned} \left[ \mathbb {H}_{W_b}\right]&=\begin{bmatrix} \frac{\partial ^2 W_b}{\partial \varvec{\Gamma }\partial \varvec{\Gamma }}&\frac{\partial ^2 W_b}{\partial \varvec{\Gamma }\partial \varvec{K}}\\ \frac{\partial ^2 W_b}{\partial \varvec{K}\partial \varvec{\Gamma }}&\frac{\partial ^2 W_b}{\partial \varvec{K}\partial \varvec{K}} \end{bmatrix}\\&=\int _{A(s)}\begin{bmatrix} \varvec{I} \\ \varvec{I}\varvec{\times }\bar{\varvec{X}} \end{bmatrix}\begin{bmatrix} {\mathbb H_{\bar{W}}} \end{bmatrix} \begin{bmatrix} \varvec{I}&-\varvec{I}\varvec{\times }\bar{\varvec{X}} \end{bmatrix}\,dA. \end{aligned} \end{aligned}$$
(87)

Equation (87) establishes a relationship between the tangent operators \(\left[ \mathbb {H}_{W_b}\right] \) and \(\begin{bmatrix} {\mathbb H_{\bar{W}}} \end{bmatrix}\) for the internal energies \(W_b(\varvec{\Gamma },\varvec{K})\) and \(\bar{W}(\varvec{B})\), respectively. Notice that these two energies are defined in terms of the classical beam strain measures. In addition, it is also possible to relate these two Hessian operators with that of the energy \(\hat{W}(\bar{\varvec{U}},\bar{\varvec{W}},J)\) (56) defined in terms of the more general continuum strain measures. This can be obtained by re-expressing the tangent operator for the energy \(\hat{W}(\bar{\varvec{U}},\bar{\varvec{W}},J)\) (56) in terms of the strain vector \(\varvec{B}\).

Notice first that consideration of Eq. (65) enables the simplification of the second term in the right hand side of Eq. (56) as

$$\begin{aligned}&\begin{aligned}&\left( \varvec{\Sigma }_{\bar{\varvec{W}}} + \Sigma _{J}\bar{\varvec{U}}\right) :(D\bar{\varvec{U}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\varvec{\times }D\bar{\varvec{U}}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] )\\&\quad =(D{\varvec{B}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\times D{\varvec{B}}\left[ \varvec{u}_0,\Delta {\varvec{\theta }}\right] )\\&\qquad \cdot \left( \varvec{\Sigma }_{\bar{\varvec{W}}} + \Sigma _{J}\bar{\varvec{U}}\right) (\varvec{D}_3\times \varvec{D}_3) = 0. \end{aligned} \end{aligned}$$
(88)

Regarding the third term on the right-hand side in Eq.  (56), use of Eqs. (55), (73) and (65) leads to the following identity

$$\begin{aligned}&\begin{aligned}&\left( \varvec{\Sigma }_{\bar{\varvec{U}}} + \varvec{\Sigma }_{\bar{\varvec{W}}}\varvec{\times }\bar{\varvec{U}} + \Sigma _{J}\bar{\varvec{W}}\right) :D^2\bar{\varvec{U}}[\delta {\varvec{\theta }};\Delta {\varvec{\theta }}]\\&\quad =\varvec{\Sigma _B}\cdot D^2\bar{\varvec{B}}[\delta {\varvec{\theta }};\Delta {\varvec{\theta }}]. \end{aligned} \end{aligned}$$
(89)

Finally, introduction of the expressions in (65), (66) and (67) for the directional derivatives of \(\bar{\varvec{U}}\), \(\bar{\varvec{W}}\) and J, respectively, together with Eqs. (88) and (89) enables both constitutive terms in Eqs. (56) and (74) to be related as

$$\begin{aligned} \begin{aligned} D\varvec{B}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\cdot \begin{bmatrix} \mathbb {H}_{\bar{W}} \end{bmatrix}D\varvec{B}[\varvec{u}_0,\Delta {\varvec{\theta }}] = [\mathbb {S}_{\delta }]_{\hat{W}}^T \left[ \mathbb {H}_{\hat{W}}\right] [\mathbb {S}_{\Delta }]_{\hat{W}}, \end{aligned}\nonumber \\ \end{aligned}$$
(90)

with the Hessian operator \(\begin{bmatrix}\mathbb {H}_{\bar{W}}\end{bmatrix}\) (75) related to the Hessian operator \(\begin{bmatrix} \mathbb {H}_{\hat{W}} \end{bmatrix}\) via

$$\begin{aligned}&\begin{bmatrix} \mathbb {H}_{\bar{W}} \end{bmatrix}=\begin{bmatrix} {\mathbb {H}_{\hat{W}}^{eq}} \end{bmatrix}_{\varvec{D}_3\varvec{D}_3};\nonumber \\&\left[ {\begin{bmatrix} {\mathbb {H}_{\hat{W}}^{eq}} \end{bmatrix}_{\varvec{D}_3\varvec{D}_3}}\right] _{IJ}={\varvec{D}_3}_{K}{\varvec{D}_3}_{L}\begin{bmatrix}{\mathbb {H}_{\hat{W}}^{eq}}\end{bmatrix}_{IKJL} \end{aligned}$$
(91)

where

$$\begin{aligned} \begin{aligned} \begin{bmatrix}\mathbb {H}_{\hat{W}}^{eq}\end{bmatrix}&=\hat{W}_{\bar{\varvec{U}}\bar{\varvec{U}}}+\varvec{I}\varvec{\times }\hat{W}_{\bar{\varvec{W}}\bar{\varvec{W}}}\varvec{\times }\varvec{I}+ \hat{W}_{JJ}\varvec{I}\otimes \varvec{I}\\&\quad +\hat{W}_{\bar{\varvec{U}}\bar{\varvec{W}}}\varvec{\times }\varvec{I}+\varvec{I}\varvec{\times }\hat{W}_{\bar{\varvec{W}}\bar{\varvec{U}}}\\&\quad +\hat{W}_{\bar{\varvec{U}}J}\otimes \varvec{I}+\varvec{I}\otimes \hat{W}_{J\bar{\varvec{U}}}+ \left( \varvec{I}\varvec{\times }\hat{W}_{\bar{\varvec{W}}J}\right) \otimes \varvec{I}\\&\quad + \varvec{I}\otimes \left( \hat{W}_{J\bar{\varvec{W}}}\varvec{\times }\varvec{I}\right) . \end{aligned} \end{aligned}$$
(92)

Notice the similarities between above equation (92) and (11), where the second order tensors \(\varvec{F}\) and \(\varvec{H}\) have been replaced with the identity tensor \(\varvec{I}\).

The relationship between the different tangent operators \([\mathbb {H}_{\bar{W}}]\), \([\mathbb {H}_{\hat{W}}]\) and \([\mathbb {H}_{W_b}]\) as summarised in above Eqs.  (85) and (91) establishes an elegant link between the beam and the more general continuum models. For a given continuum polyconvex constitutive model (characterised by positive semi-definite Hessian operators \([\mathbb {H}_{\hat{W}}]\) (58) and \([\mathbb {H}_{W}]\) (9)), the associated (beam) Hessian operators \([\mathbb {H}_{\bar{W}}]\) (91)–(92) and \([\mathbb {H}_{W_b}]\) (87) will be positive semi-definite too. Therefore, as expected, satisfaction of the ellipticity condition [19] in the continuum through a polyconvex constitutive law implies satisfaction of the ellipticity condition [35] in the consistently derived beam problem. The opposite cannot be inferred. Further inclusion of suitable growth conditions into the constitutive model guarantees well posedness of the overall problem [19].

4.4 Polyconvex constitutive models in classical beam theory

In general, given an arbitrary objective energy functional expressed as a function of the extended kinematic set \(\{\bar{\varvec{U}},\bar{\varvec{W}},J\}\), it is possible to obtain an equivalent expression in terms of the strain vector \(\varvec{B}\) (and hence in terms of \(\{\varvec{\Gamma },\varvec{K}\}\)) after some algebraic manipulations.Footnote 8

4.4.1 A Mooney–Rivlin constitutive model

As presented in Appendix 2, an expression for the strain energy functional \(\bar{W}(\varvec{B})\) can be obtained for a Mooney–Rivlin model (154) as (where use has been made of the function f as defined in (18))

$$\begin{aligned} \begin{aligned} \bar{W}(\varvec{B})&=\alpha \left( \varvec{B}\cdot \varvec{B} + 2\varvec{B}\cdot \varvec{D}_3 + 3\right) \\&\quad -2\left( \alpha +2\beta \right) \text {ln}\left( \varvec{B}\cdot \varvec{D}_3 + 1\right) + \frac{\lambda }{2}\left( \varvec{B}\cdot \varvec{D}_3\right) ^2\\&\quad \, + \beta (2\varvec{B}\cdot \varvec{B} - \underbrace{\left( \varvec{D}_3\times \varvec{B}\right) \cdot \left( \varvec{D}_3\times \varvec{B}\right) }_{\xi }\\&\quad \,+4\varvec{B}\cdot \varvec{D}_3 + 3). \end{aligned} \end{aligned}$$
(93)

As shown in Appendix 2, the stress vector \(\varvec{\Sigma _B}\) is obtained as

$$\begin{aligned} \begin{aligned} \varvec{\Sigma _B}&= 2\alpha \left( \varvec{B} + \varvec{D}_3\right) + 2\beta \left[ \left( \varvec{I} + \varvec{D}_3\otimes \varvec{D}_3\right) \varvec{B}+ 2\varvec{D}_3\right] \\&\quad +\left[ \lambda \left( \varvec{B}\cdot \varvec{D}_3\right) -\frac{2\left( \alpha +2\beta \right) }{\left( \varvec{B}\cdot \varvec{D}_3+1\right) }\right] \varvec{D}_3, \end{aligned} \end{aligned}$$
(94)

and the Hessian operator \([\mathbb {H}_{\bar{W}} ]\) as

$$\begin{aligned} \begin{bmatrix} \mathbb {H}_{\bar{W}} \end{bmatrix}= & {} 2\alpha \varvec{I} + 2\beta \left( \varvec{I} + \varvec{D}_3\otimes \varvec{D}_3\right) \nonumber \\&+\left[ \lambda +\frac{2\left( \alpha +2\beta \right) }{\left( \varvec{B}\cdot \varvec{D}_3+1\right) ^2}\right] \varvec{D}_3\otimes \varvec{D}_3. \end{aligned}$$
(95)

Hence, convexity is guaranteed provided that the material parameters \(\alpha \), \(\beta \) and \(\lambda \) are non-negative. Therefore, conforming to the conclusions obtained in Sect. 4.3, both Hessian operators \(\begin{bmatrix}\mathbb {H}_{W}\end{bmatrix}\) (16) and \(\begin{bmatrix}\mathbb {H}_{\bar{W}}\end{bmatrix}\) (95) are positive semi-definite for the Mooney–Rivlin model.

For the degenerate case of a Neo-Hookean model (\(\alpha =\mu /2\), \(\beta =0\)), the stress vector \(\varvec{\Sigma _B}\) and the Hessian operator \([\mathbb {H}_{\bar{W}}]\) are obtained as

$$\begin{aligned} \begin{aligned}&\varvec{\Sigma _B}= \mu \left( \varvec{B} + \varvec{D}_3\right) +\left[ \lambda \left( \varvec{B}\cdot \varvec{D}_3\right) -\frac{\mu }{\left( \varvec{B}\cdot \varvec{D}_3+1\right) }\right] \varvec{D}_3;\\&\begin{bmatrix} \mathbb {H}_{\bar{W}} \end{bmatrix}= \mu \varvec{I}+\left[ \lambda +\frac{\mu }{\left( \varvec{B}\cdot \varvec{D}_3+1\right) ^2}\right] \varvec{D}_3\otimes \varvec{D}_3. \end{aligned} \end{aligned}$$
(96)

4.4.2 A Saint-Venant constitutive model

The Saint-Venant–Kirchhoff strain energy in Eq.  (24) can be expressed in terms of the strain vector \(\varvec{B}\) using Eq. (38)

$$\begin{aligned} \varvec{E}=\frac{1}{2}\left( \varvec{B}\otimes \varvec{D}_3 + \varvec{D}_3\otimes \varvec{B}\right) +\frac{1}{2}\left( \varvec{B}\cdot \varvec{B}\right) \varvec{D}_3\otimes \varvec{D}_3. \end{aligned}$$
(97)

Introduction of Eq. (97) into (24) leads to a final expression of the strain energy \(\bar{W}_{SVK}\) in terms of \(\varvec{B}\). Successive derivations of this expression yield the following Hessian operator \(\begin{bmatrix}\mathbb {H}_{\bar{W}}\end{bmatrix}\)

$$\begin{aligned} \begin{aligned} \begin{bmatrix} \mathbb H_{\bar{W}} \end{bmatrix}&= \lambda [(\varvec{D}_3 + \varvec{B})\otimes (\varvec{D}_3+\varvec{B}) + (\varvec{B}\cdot \varvec{D}_3 + \frac{1}{2}\varvec{B}\cdot \varvec{B})\varvec{I}]\\&\quad +\mu [\varvec{D}_3\otimes \varvec{D}_3\ + \varvec{I} + 2(\varvec{B}\cdot \varvec{D}_3)\varvec{I}\\&\quad +2(\varvec{B}\otimes \varvec{D}_3+\varvec{D}_3\otimes \varvec{B})\\&\quad +2\varvec{B}\otimes \varvec{B} + (\varvec{B}\cdot \varvec{B})\varvec{I}]. \end{aligned} \end{aligned}$$
(98)

Notice that for the particular case of \(\varvec{B}=-\varvec{D}_3\)

$$\begin{aligned} \begin{aligned} \left. \begin{bmatrix} \mathbb {H}_{\bar{W}} \end{bmatrix}\right| _{\varvec{B}=-\varvec{D}_3} =&-\frac{\lambda }{2}\varvec{I} -\mu \varvec{D}_3\otimes \varvec{D}_3. \end{aligned} \end{aligned}$$
(99)

which is clearly non positive semi-definite. Since positive semi-definiteness of \(\mathbb {H}_{\bar{W}}\) is not guaranteed for all values of the strain vector \(\varvec{B}\), it can be concluded that the constitutive model is not convex in \(\varvec{B}\). In order to overcome this shortcoming, it is customary [8] to neglect the higher order terms in (97) (last term on the right hand side of (97)) resulting in a linearised version of the Saint Venant–Kirchhoff model as

$$\begin{aligned} \bar{W}^{\text {lin}}_{SVK}(\varvec{B})=\frac{\lambda }{2}\Psi _1^{\text {lin}}+\mu \Psi _2^{\text {lin}}, \end{aligned}$$
(100)

where

$$\begin{aligned} \Psi _1^{\text {lin}}=\left( \varvec{B}\cdot \varvec{D}_3\right) ^2\quad \Psi _2^{\text {lin}}=\frac{1}{2}\left[ \left( \varvec{B}\cdot \varvec{D}_3\right) ^2+\left( \varvec{B}\cdot \varvec{B}\right) \right] .\nonumber \\ \end{aligned}$$
(101)

The Hessian operator \(\mathbb H_{\bar{W}}\) (75) for the linearised Saint-Venant constitutive model is finally computed as:

$$\begin{aligned} \begin{aligned} \left[ \mathbb H_{\bar{W}}\right] =&(2\lambda + \mu )\varvec{D}_3\otimes \varvec{D}_3+ {\mu }\varvec{I}. \end{aligned} \end{aligned}$$
(102)

Therefore, this linearised version of the Saint-Venant–Kirchhoff constitutive model is convex with respect to \(\varvec{B}\) and hence, according to Sect. 4.3, \(W_b(\varvec{\Gamma },\varvec{K})\) will be convex with respect to \(\{\varvec{\Gamma },\varvec{K}\}\). However, on the contrary to the Mooney–Rivlin and Neo-Hookean models, the energy functional \(\Psi ^{\text {lin}}_{SVK}\) still fails to satisfy adequate growth conditions, namely the coercivity conditionFootnote 9 [30, 36]

$$\begin{aligned} \lim _{J\left( \varvec{\nabla }_0\varvec{x}\right) \rightarrow 0^+}\Psi \left( \varvec{\varvec{\nabla }_0}\varvec{x}\right) \rightarrow \infty . \end{aligned}$$
(103)

Hence, under high compression scenarios, the Saint Venant–Kirchoff constitutive model does not perform well. Given the limitations of this model in large strain scenarios, more suitable constitutive laws need to be considered.

5 Variational formulation

The objective of this section is to succinctly present the variational formulation of the geometrically exact beam theory following a continuum degenerate approach. This will be used as a starting point when describing the discretisation strategy employed. As a starting point, let us consider the following standard total energy minimisation variational principle

$$\begin{aligned} \begin{aligned}&\Pi _M({\varvec{x}_0}^*,{\varvec{\theta }}^*) \\&\quad = {\mathop {\min } \limits _{\tiny {\begin{array}{c} \varvec{x}_0, \varvec{\theta } \end{array}}}} \left\{ \int \limits _V {\left. \Psi \left( \varvec{\nabla }_0\varvec{x}\right) \right| _{\tiny {\begin{array}{c} \varvec{x}=\varvec{x}_0+\bar{\varvec{x}},\\ \varvec{d}_i=\varvec{R}(\varvec{\theta })\varvec{D}_i \end{array}}}\,dV} -W^{\text {ext}}\vert _{\tiny {\begin{array}{c} \varvec{x}=\varvec{x}_0+\bar{\varvec{x}},\\ \varvec{d}_i=\varvec{R}(\varvec{\theta })\varvec{D}_i \end{array}}} \right\} . \end{aligned}\nonumber \\ \end{aligned}$$
(104)

where \({\varvec{x}_0}^{*}\) and \({\varvec{\theta }}^*\) denote the exact solution and \(W^{\text {ext}}\) the work done by external forces. The stationary condition of this functional leads to the principle of virtual work, written asFootnote 10

$$\begin{aligned} \begin{aligned} D\Pi _M[\delta \varvec{u}_0,\delta {\varvec{\theta }}]=&\int \limits _V {\varvec{P}:D\varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\,dV}\\&\quad -D W^{\text {ext}}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]=0. \end{aligned} \end{aligned}$$
(105)

Following the continuum degenerate approach described in Sect.  3, substitution of \(D \varvec{F}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]\) by its expression in Eq. (42) allows to re-write the first time on the right hand side of (105) as

$$\begin{aligned} \begin{aligned} D \Pi _M^{int}[\delta \varvec{u}_0,\delta {\varvec{\theta }}]&=\int _0^L{ \delta \varvec{u}^{\prime }_{0}\cdot \left[ \int _{A(s)}{\varvec{P}\varvec{D}_3\,dA}\right] }ds\\&\quad + \int _0^L D\varvec{d}_{\alpha }[\delta {\varvec{\theta }}]\cdot \left[ \int _{A(s)}{\varvec{P}\varvec{D}_{\alpha }\,dA}\right] ds\\&\quad +\int _0^L D\varvec{d}^{\prime }_{\alpha }[\delta {\varvec{\theta }}]\cdot \left[ \int _{A(s)}{\eta ^{\alpha }\varvec{P}\varvec{D}_{3}\,dA}\right] ds, \end{aligned} \nonumber \\ \end{aligned}$$
(106)

where the first Piola–Kirchhoff stress tensor \(\varvec{P}\) is evaluated via Eq.  (7) in terms of the conjugate stresses \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\) and \(\Sigma _J\). An iterativeFootnote 11 Newton–Raphson process is applied to obtain the solution. This is usually achieved by solving a linearised system for the increments \(\{\varvec{u}_0,\Delta \varvec{\theta }\}\) as [1]

$$\begin{aligned}&D^2\Pi _M(\varvec{x}_0^k,\varvec{\theta }^{k})[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u}_0,\Delta {\varvec{\theta }}]\nonumber \\&\quad =-D\Pi _M(\varvec{x}_0^k,\varvec{\theta }^{k})[\delta \varvec{u}_0,\delta {\varvec{\theta }}], \end{aligned}$$
(107)

where the update of the position of the center line \(\varvec{x}_0\) and of the rotation matrix \(\varvec{R}\) is obtained via

$$\begin{aligned}&\varvec{x}_0^{k+1}=\varvec{x}_0^{k} + {\varvec{u}_0};\quad {\varvec{R}}^{k+1}=\varvec{R}\left( \Delta \varvec{\theta }\right) {\varvec{R}}^{k};\nonumber \\&\quad \varvec{d}_i^{k+1}=\varvec{R}^{k+1}\varvec{D}_i. \end{aligned}$$
(108)

and with the rotation matrix \(\varvec{R}\left( \Delta \varvec{\theta }\right) \) obtained particularising Eq. (34) to the incremental rotation vector \(\Delta \varvec{\theta }\). Finally, in the absence of follower loads, the second derivative of the total energy functional is given by

$$\begin{aligned}&D^2\Pi _M(\varvec{x}_0^k,\varvec{\theta }^{k})[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u},\Delta {\varvec{\theta }}]\nonumber \\&\quad =\int \limits _V{D^2\Psi (\varvec{x}_0^k,\varvec{\theta }^{k})[\delta \varvec{u}_0,\delta {\varvec{\theta }};\varvec{u},\Delta {\varvec{\theta }}]\,dV}, \end{aligned}$$
(109)

where the tangent operator is evaluated using Eq.  (49).

6 Finite element discretisation

Traditional finite element discretisations based on rotations do not satisfy objectivity [1, 2, 4, 6]. In order to circumvent this shortcoming, a formulation in which the nodal degrees of freedom correspond to displacements of the centre line \(\varvec{x}_0\) and the director triad \(\{\varvec{d}_1,\varvec{d}_2,\varvec{d}_3\}\) is considered following reference [1, 6]. In this approach, \(\varvec{x}_0\) and the director vectors are interpolated as

$$\begin{aligned} \varvec{x}_0=\varvec{x}_0^aN_u^a\left( s\right) ;\quad \varvec{d}_i=\varvec{d}_i^aN_d^a\left( s\right) . \end{aligned}$$
(110)

where \(N_u^a\) and \(N_d^a\) are standard Finite Element shape functions. Using a Galerkin approach

$$\begin{aligned} \delta \varvec{u}_0=\delta \varvec{u}_0^aN_u^a\left( s\right) ;\quad D\varvec{d}_i[\delta {\varvec{\theta }}]=D\varvec{d}_i^a[\delta {\varvec{\theta }}]N_d^a\left( s\right) . \end{aligned}$$
(111)

Particularisation of Eqs. (40) to the triads \(\varvec{d}_k\) collocated at the nodes of the underlying mesh, it results in

$$\begin{aligned} \begin{aligned}&D\varvec{d}_{i}^a[\delta {\varvec{\theta }^a}]=\delta {\varvec{\theta }}^a\times \varvec{d}_{i}^a;\\&D^2\varvec{d}_{i}^a[\delta {\varvec{\theta }^a};\Delta {\varvec{\theta }^b}]= \Delta \varvec{\theta }^{a} (\delta {\varvec{\theta }}^a\cdot \varvec{d}_{i}^a)\delta _{ab}- \varvec{d}_{i}^a (\delta {\varvec{\theta }}^a\cdot \Delta \varvec{\theta }^{a})\delta _{ab} \end{aligned} \end{aligned}$$
(112)

where the cross product property \(\varvec{a}\times \left( \varvec{b}\times \varvec{c}\right) =\varvec{b}\left( \varvec{a}\cdot \varvec{c}\right) - \varvec{c}\left( \varvec{a}\cdot \varvec{b}\right) \) has been used in Eq. (112) in order to obtain the expression for \(D^2\varvec{d}_{i}^a[\delta {\varvec{\theta }^a};\Delta {\varvec{\theta }^b}]\).

6.1 Discretised kinematics

Based on the discretisation presented and making use of Eq.  (112), the discretised form of the first and second directional derivatives of \(\varvec{F}\) in Eq. (42) can be obtained as

$$\begin{aligned} \begin{aligned}&D\varvec{F}[\delta \varvec{u}_0^a,\delta {\varvec{\theta }}^a]=\delta \varvec{u}_0^a\otimes {N^{a\prime }_{u}}\varvec{D}_3+ (\delta {\varvec{\theta }}^a\times \varvec{d}_{\alpha }^a)\otimes \varvec{Q}^a_{\alpha };\\&D^2\varvec{F}[\delta \varvec{u}^a_0,\delta {\varvec{\theta }}^a;\varvec{u}^b_0,\Delta \varvec{\theta }^b]\\&\quad =(\Delta \varvec{\theta }^{a} (\delta {\varvec{\theta }}^a\cdot \varvec{d}_{\alpha }^a)\delta _{ab} - \varvec{d}_{\alpha }^a (\delta {\varvec{\theta }}^a\cdot \Delta \varvec{\theta }^{a})\delta _{ab})\otimes \varvec{Q}^a_{\alpha }, \end{aligned} \end{aligned}$$
(113)

with

$$\begin{aligned} \varvec{Q}^a_{\alpha }=\left( N_d^a\varvec{D}_{\alpha } + \eta ^{\alpha }N_{d}^{a\prime }\varvec{D}_3\right) . \end{aligned}$$
(114)

6.2 Discretisation of the weak forms

Substitution of the discretisation expressions for \(\varvec{x}_0\) and \(\varvec{d}_k\) and its virtual variations in (110) and (111) respectively, into (106) leads to the final discretised form of the principle of virtual work as

$$\begin{aligned} \begin{aligned} D \Pi _M^{int}[\delta \varvec{u}_0^a,\delta {\varvec{\theta }}^a]&=\delta \varvec{u}_0^a\cdot \int _0^L{N_{u}^{a\prime }\left[ \int _{A}{\varvec{P}\varvec{D}_3\,dA}\right] }ds\\&\quad \,+ \delta {\varvec{\theta }}^a\cdot \varvec{d}_{\alpha }^a\times \int _0^L\left[ \int _{A}{\varvec{P}\varvec{Q}^a_{\alpha }\,dA}\right] ds. \end{aligned} \end{aligned}$$
(115)

where \(\varvec{P}\) is evaluated according to Eq. (7). Notice that the first term in the right hand side of Eq. (115) can be identified as the weak form of conservation of linear momentum whereas the second term can be identified as the weak form of the conservation of angular momentum.

6.3 Discretisation of linearisations

Bearing in mind Eq. (49), the discretisation of the second directional derivative of the potential \(\Pi _M\) in Eq. (109) will be presented in this section. For instance, second derivatives with respect to the displacement of the centre line are discretised as

$$\begin{aligned} D^2\Pi _M[\delta \varvec{u}_0^a;\varvec{u}_0^b]= & {} \int _V{D\varvec{F}[\delta \varvec{u}_0^a]:\begin{bmatrix}\mathbb {H}_{W}^{eq}\end{bmatrix}:D\varvec{F}[\varvec{u}_0^b]}\,dV\nonumber \\&+\int _V{\varvec{\Sigma }_{\text {g}}:\left( D\varvec{F}[\delta \varvec{u}_0^a]\varvec{\times }D\varvec{F}[\varvec{u}_0^b]\right) \,dV},\nonumber \\ \end{aligned}$$
(116)

where \(\varvec{\Sigma }_{\text {g}}= \varvec{\Sigma _H} + \Sigma _J\varvec{F}\). Equation (116) can be equivalently expressed as

$$\begin{aligned} D^2 \Pi _M[\delta \varvec{u}_0^a;\varvec{u}_0^b]=\delta \varvec{u}_0^a\cdot \varvec{K}_{uu}^{ab}\varvec{u}_0^b, \end{aligned}$$
(117)

where the stiffness matrix \(\varvec{K}_{uu}^{ab}\) is obtained as

$$\begin{aligned} \begin{aligned} \varvec{K}_{uu}^{ab}&=\int _0^LN_{u}^{a\prime }N_{u}^{b\prime }\left[ \int _{A(s)}\left( \begin{bmatrix}\mathbb {H}_{W}^{eq}\end{bmatrix}_{\varvec{D}_3\varvec{D}_3}+{\varvec{\mathcal E}}\right. \right. \\&\left. \left. \quad \,:\varvec{\Sigma }_{\text {g}}\left( \varvec{D}_3\times \varvec{D}_3\right) \right) \,dA\right] \,ds, \end{aligned} \nonumber \\ \end{aligned}$$
(118)

where \(\varvec{\mathcal {E}}\) is the third order permutation tensor. It is easy to observe that the second (geometric) term in (118) vanishes, leading to the final expression for \(\varvec{K}_{uu}^{ab}\) as

$$\begin{aligned} \begin{aligned} \varvec{K}_{uu}^{ab}&=\int _0^LN_{u}^{a\prime }N_{u}^{b\prime }\left[ \int _{A(s)}\varvec{D}_3:\begin{bmatrix}\mathbb {H}_{W}^{eq}\end{bmatrix}:\varvec{D}_3\,dA\right] \,ds. \end{aligned} \end{aligned}$$
(119)

Similarly, the discretisation of the cross derivatives of the potential \(\Pi _M\) with respect to changes in the displacement of the centre line and rotations would lead to an expression of the form

$$\begin{aligned} \begin{aligned}&D^2\Pi _M[\delta \varvec{u}_0^a;\Delta {\varvec{\theta }}^b]\\&\quad = \int _V{D\varvec{F}[\delta \varvec{u}_0^a]:\begin{bmatrix}\mathbb {H}_{W}^{eq}\end{bmatrix}:D\varvec{F}[\Delta {\varvec{\theta }}^b]}\,dV \\&\qquad +\int _V{\varvec{\Sigma }_{\text {g}}:\left( D\varvec{F}\left[ \delta \varvec{u}_0^a\right] \varvec{\times }D\varvec{F}\left[ \Delta {\varvec{\theta }}^b\right] \right) \,dV}, \end{aligned} \end{aligned}$$
(120)

which can be written equivalently as \(D^2 \Pi _M[\delta \varvec{u}_0^a;\Delta \varvec{\theta }^b]=\delta \varvec{u}_0^a\cdot \varvec{K}_{u\theta }^{ab}\Delta \varvec{\theta }^b \), with the stiffness matrix \(\varvec{K}_{u\theta }^{ab}\) being

(121)

where

$$\begin{aligned} \varvec{b}^{ab}_{\alpha }=\varvec{\Sigma }_{\text {g}}\left( N_{u}^{a\prime }\varvec{D}_3\times \varvec{Q}^b_{\alpha }\right) . \end{aligned}$$
(122)

Finally, the discretisation of the second derivatives with respect to changes in rotations is obtained as

$$\begin{aligned} \begin{aligned} D^2\Pi _M[\delta {\varvec{\theta }}^a;\Delta {\varvec{\theta }}^b]&= \int _V{D\varvec{F}[\delta {\varvec{\theta }}^a]:\begin{bmatrix}\mathbb {H}_{W}^{eq}\end{bmatrix}:D\varvec{F}[\Delta {\varvec{\theta }}^b]}\,dV\\&\quad +\int _V{\varvec{\Sigma }_{\text {g}}:(D\varvec{F}[\delta {\varvec{\theta }}^a]\varvec{\times }D\varvec{F}[\Delta {\varvec{\theta }}^b])\,dV}\\&\quad +\int _V{\varvec{P}:D^2\varvec{F}[\delta {\varvec{\theta }}^a;\Delta {\varvec{\theta }}^b]\,dV}. \end{aligned} \nonumber \\ \end{aligned}$$
(123)

Equation (123) is equivalent to \(D^2 \Pi _M[\delta {\varvec{\theta }}^a;\Delta \varvec{\theta }^b]=\delta {\varvec{\theta }}^a\cdot \varvec{K}_{\theta \theta }^{ab}\Delta \varvec{\theta }^b \), with the stiffness matrix \(\varvec{K}_{\theta \theta }^{ab}\) being defined as

$$\begin{aligned} \begin{aligned} \varvec{K}_{\theta \theta }^{ab}&=-\int _0^L \left( \int _{A(s)}\varvec{Q}^a_{\alpha }:\begin{bmatrix}\mathbb {H}_{W}^{eq}\end{bmatrix}:\varvec{Q}^b_{\beta }\,dA\right) \varvec{\times }\left( \varvec{d}_{\alpha }^a\otimes \varvec{d}_{\beta }^b\right) \,ds\\&\quad \, +\int _0^L \left[ \int _{A(s)}\left( {\varvec{\mathcal E}}:\varvec{\Sigma }_g\left( \varvec{Q}^a_{\alpha }\times \varvec{Q}^b_{\beta }\right) \right) \varvec{\times }\left( \varvec{d}_{\alpha }^a\otimes \varvec{d}_{\beta }^b\right) \,dA\right] \,ds \\&\quad \, +\int _0^L\left[ \int _{A(s)} \left( \varvec{d}_{\alpha }^a\otimes \varvec{P}\varvec{Q}^a_{\alpha } - \left( \varvec{d}^{a}_{\alpha }\cdot \varvec{P}\varvec{Q}^a_{\alpha }\right) \varvec{I}\right) \delta _{ab}\,dA\right] \,ds. \end{aligned} \nonumber \\ \end{aligned}$$
(124)

7 Numerical examples

Note that it is not the primary aim of this paper to propose a new finite element discretisation. Nevertheless, the objective of this section is to present a series of numerical examples in order to show the applicability of the variational and computational frameworks presented in Sects. 3, 5 and 6. For all the examples included, linear shape functions have been used for the interpolation of the centre line and the director triad as presented in [1]. The consideration of dynamic effects for one of the examples presented has been implemented in a straightforward manner [4] and thus, it is not further discussed. Reduced numerical integration [1] of the internal virtual work contributions has been carried out in order to alleviate locking effects, whilst exact integration of the inertial contributions has been implemented. For the last example presented, a comparison of the beam model against a sophisticated mixed continuum polyconvex computational framework as reported in [12] is carried out.

Fig. 3
figure 3

Bending test example. Geometry and boundary conditions. \(L=1\,\mathrm{{m}}\) and \(h=\frac{1}{25}\,\mathrm{{m}}\)

Fig. 4
figure 4

Bending test for a discretisation of 24 linear elements (\( 25 \times 6\) degrees of freedom defined in the variables \(\varvec{x}_0\) and \(\Delta \varvec{\theta }\)). Results shown for a Saint Venant–Kirchhoff constitutive model defined in (27) with material properties \(\mu _{SVK}\) and \(\lambda _{SVK}\) defined in (126). a Deformed shape for \(M=M_{max}\). b Contour plot of the hydrostatic pressure distribution \(p\,(\mathrm{{N/m}}^2)\) for \(M=0.8M_{max}\)

Fig. 5
figure 5

Bending test for a Mooney–Rivlin model defined in (12) and (13) with material properties \(\alpha _{MR}\), \(\beta _{MR}\) and \(\lambda _{MR}\) defined in (126). a Order of accuracy for the variables \(\varvec{x}\), \(\varvec{x}_0\), \(\varvec{R}\), \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\) and \({\Sigma _J}\). Two meshes analysed for the maximum applied moment \(M=M_{max}\): 80 and 160 linear elements. Reference solution taken from a 640 element mesh. b Quadratic convergence of the Newton–Raphson algorithm

Fig. 6
figure 6

Beam with slope discontinuity. a Geometry and boundary conditions. \(L=1\,\mathrm{{m}}\) and \(h=0.1\,\mathrm{{m}}\). b Absolute value of the tip displacement versus absolute value of the applied force \(F\,(\mathrm{{N}})\). Material parameters of \(\mu _{\text {lin}}=0.5\times 10^6\,\mathrm{{N/m}}^2\) and \(\nu =0\) in the reference configuration. Results shown for the Mooney–Rivlin model defined in Eqs. (12) and (13) for a discretisation of 18 linear elements (\( 19 \times 6\) degrees of freedom)

Fig. 7
figure 7

Beam with slope discontinuity. Reference configuration (shadowed) and deformed configuration. Contour plot of a hydrostatic pressure \(p\,(\mathrm{{N/m}}^2)\), b Jacobian J, c stress component \(\varvec{\sigma }_{11}\,(\mathrm{{N/m}}^2)\) and d stress component \(\varvec{\sigma }_{22}\,(\mathrm{{N/m}}^2)\). Material parameters of \(\mu _{\text {lin}}=0.5\times 10^6\,\mathrm{{N/m}}^2\) and \(\nu =0\) in the reference configuration. Results shown for the Mooney–Rivlin model defined in Eqs. (12) and (13) for a discretisation of 480 linear elements (\( 481 \times 6\) degrees of freedom)

7.1 Bending test

This example consists of a straight cantilever beam of length L with squared cross sectional area of side \(\frac{1}{25}m\). This example has already been reported in [1, 3] using a linearised Saint Venant–Kirchhoff constitutive model (100) with \(\lambda _{\text {lin}}=1\,\mathrm{{N/m}}^2\) and \(\mu _{\text {lin}}=\frac{1}{2}\,\mathrm{{N/m}}^2\) the Lamé coefficients of the material defined in the reference configuration. With these parameters, the Young’s modulus is computed as \(E=1\,\mathrm{{N/m}}^2\). The geometry and boundary conditions of the problem are shown in Fig. 3. The maximum bending moment applied at the free end of the beam is defined by \(M_{max}=\frac{2\pi EI}{L}\), where I denotes the second moment of area of the beam.

For this example, four nonlinear constitutive models are compared by employing the computational framework described above. For consistency with the reference [1, 3], the material characterisation of the different constitutive models employed is carried out at the origin utilising the Lamé coefficients \(\lambda _{\text {lin}}\) and \(\mu _{\text {lin}}\) defined above. Specifically, the four nonlinear constitutive models are: (i) a Saint Venant–Kirchhoff model defined in (27) with material parameters \(\lambda _{SVK}\) and \(\mu _{SVK}\); (ii) a Mooney–Rivlin model defined in (12)–(13) with material parameters \(\alpha _{MR}\), \(\beta _{MR}\) and \(\lambda _{MR}\); (iii) a Neo-Hookean model defined in (12)–(13) with material parameters \(\alpha _{NH}\), \(\beta _{NH}=0\) and \(\lambda _{NH}\), and iv) a modified Mooney–Rivlin model defined by the expression

(125)

The strain energy defined defined above in Eq. (125) has been extracted from a modified Mooney–Rivlin model given in [29]. Material characterisation renders the following equivalences between the respective material parameters and those of the linearised model (i.e. \(\lambda _{\text {lin}}\) and \(\mu _{\text {lin}}\)), as

$$\begin{aligned} \begin{aligned}&\mu _{SVK} = \mu _{\text {lin}};\quad \lambda _{SVK} = \lambda _{\text {lin}};\\&\alpha _{NH} = \mu _{\text {lin}};\quad \lambda _{NH} = \lambda _{\text {lin}};\\&2\alpha _{{MR}}+2\beta _{{MR}}=\mu _{\text {lin}};\quad \lambda _{MR}=\lambda _{\text {lin}};\\&\alpha _{MMR}+\beta _{MMR}=\mu _{\text {lin}};\\&\lambda _{MMR}=\lambda _{\text {lin}} - 2/3\alpha _{MMR}- 14/3\beta _{MMR}. \end{aligned} \end{aligned}$$
(126)

In order to close the definition of the material paremeters for the Mooney–Rivlin and modified Mooney–Rivlin models, the following choice of material parameters is preferred \(\alpha _{{MR}}=\beta _{{MR}}=\mu _{\text {lin}}\) and \(\alpha _{{MMR}}=3/4\mu _{\text {lin}}\) and \(\beta _{{MMR}}=\mu _{\text {lin}}/4\).

As reported in reference [43], for a linearised Saint Venant–Kirchhoff model (100) with Poisson ratio \(\nu =0\) and an applied moment of value \(M=M_{max}\), the beam closes on itself forming a closed loop which can be described with an available analytical solution. In general, for a nonlinear constitutive model (or when the Poisson ratio \(\nu \) is not equal to zero), the exact closure of the beam configuration cannot be a priori guaranteed. Figure 4a shows the deformed shape of the beam for the general nonlinear Saint Venant–Kirchhoff model (27). As can be observed in this figure, there can be observed a small interpenetration due to the nonlinearity of the considered model.

With respect to the stress distribution, Fig. 4b shows the contour plot of the hydrostatic pressure for the Saint Venant–Kirchhoff constitutive model. For the other three constitutive models, the hydrostatic pressure distribution is almost identical, with hardly any differences, and is thus not displayed. This can be explained due to the moderate strains undergone through the deformation process. For this particular example, the choice of a specific model does not seem to be relevant for the overall solution.

The objective of this example is also to demonstrate the p-order of accuracy of the formulation, as a function of the chosen finite element approximation spaces. For this purpose, and particularising for the Mooney–Rivlin model, the beam is initially discretised with 80 elements and, subsequently, h-refinement is carried out generating a total of 3 discretisations. As a closed form solution is not available for this problem (due to the nonlinearity of the constitutive model adopted), the finest mesh is used to generate numerically the so-called “benchmark” solution for comparison purposes. The two coarsest meshes are compared against the finest mesh. The error between the benchmark solution and the other discretisations is measured in the \(L^2\) norm for all the unknown variables. Let us define for a tensor (e.g. scalar, vector or second order) field, the \(L^2\) norm as

$$\begin{aligned} \Vert \varvec{\zeta }\Vert _{L^2}=\left[ \int _{V} \left( \varvec{\zeta }:\varvec{\zeta }\right) \;d V\right] ^{1/2} \end{aligned}$$
(127)

associated with the magnitude of the tensor field \(\varvec{\zeta }\). Although the integrands associated to the internal virtual work have been underintegrated along the centre line, the integral defined in Eq. (127) for the evaluation of the error is computed exactly. Notice that in the evaluation of the error norm, a reconstruction of the continuum is carried out. In our case, \(\varvec{\zeta }\) can be any of the kinematic or kinetic variables, namely \(\varvec{x}\),Footnote 12 \(\varvec{x}_0\), \(\varvec{R}=\varvec{d}_i\otimes \varvec{D}_i\) \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\) and \(\Sigma _J\). This enables the definition of the following error norm \(\Vert \varvec{\zeta }_{i}-\varvec{\zeta }_{b}\Vert _{L^2}/\Vert \varvec{\zeta }_{b}\Vert _{L^2}\), where \(\varvec{\zeta }_{b}\) stands for the benchmark solution and \(\varvec{\zeta }_{i}\) the solution of the i-th mesh, with \(i=1, \ldots , (b-1)\). This can then be used to assess the convergence of the algorithm under h-refinement.

Figure 5a shows the order of accuracy of the variables \(\varvec{x}\), \(\varvec{x}_0\), \(\varvec{R}\), \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\) and \(\varvec{\Sigma _J}\). The convergence observed is \(p+1\) in \(\varvec{x}\), \(\varvec{x}_0\), \(\varvec{R}\) and \(\Sigma _J\), and p for \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\). As expected, primary variables lead to \(p+1\) order of convergence whereas, with the exception of \(\Sigma _J\) for this particular example, derived magnitudes (i.e. stresses) lead to a reduced order of convergence p. For completeness, the quadratic convergence of the Newton–Raphson algorithm is shown in Fig. 5b.

Fig. 8
figure 8

Beam with slope discontinuity: order of accuracy of the \(\varvec{x}\), \(\varvec{x}_0\), \(\varvec{R}\), \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\) and \({\Sigma _J}\). Mooney–Rivlin model defined in (12) and (13). Two meshes analysed: 120 and 240 linear elements. Reference solution taken from a 480 element mesh

Fig. 9
figure 9

Constrained torsion–compression example. Geometry and boundary conditions. \(L=20\,\mathrm{{m}}\) and \(h=1\,\mathrm{{m}}\)

Fig. 10
figure 10

Constrained torsion–compression example. From left to right: Saint Venant–Kirchfoff with 30 % of total loading, Neo-Hookean (100 % of the loading), Mooney–Rivlin (100 % of the loading) and modified Mooney–Rivlin (100 % of the loading) models. Contour plot of a stress component \(\varvec{\sigma }_{32}\,(\mathrm{{N/m}}^2)\), b conjugate stress \(\varvec{\Sigma _{F}}_{11}\,(\mathrm{{N/m}}^2)\), c conjugate stress \(\varvec{\Sigma _{H}}_{22}\,(\mathrm{{N/m}}^2)\) and d conjugate stress \({\Sigma _{J}}\,(\mathrm{{N/m}}^2)\). Material parameters \(\mu _{\text {lin}}=1/2.7\,\mathrm{{N/m}}^2\) and \(E=1\,\mathrm{{N/m}}^2\) in the reference configuration. Results for a discretisation of 50 linear elements (\( 51 \times 6\) degrees of freedom associated to the spatial coordinates \(\varvec{x}_0\) and incremental rotations \(\Delta \varvec{\theta }\))

7.2 Beam with slope discontinuity with zero Poisson ratio

In this example, a series of interconnected beams with a Young modulus of \(E=10^6\,\mathrm{{N/m}}^2\) and a Poisson ratio of \(\nu =0\) are considered. The geometry and boundary conditions of the problem are shown in Fig. 6a. In order to model the connection between the beams at any angle, continuity of the incremental rotation angle \(\Delta \varvec{\theta }\) is strongly imposed. This example has been presented in [3] using the linearised Saint Venant–Kirchhoff constitutive model (100). In this case, the Mooney–Rivlin constitutive model defined in Eqs. (12) and (13) is considered, with material parameters obtained after material characterisation in the undeformed reference configuration and the consideration of \(\alpha _{{MR}}=\beta _{{MR}}\).

The objective of this example is to demonstrate the use of a polyconvex model with null Poisson ratio in the reference configuration for the case of a non-straight beam configuration. Notice the excellent agreement between the results presented in reference [3] for a linearised Saint Venant–Kirchhoff model and those for the current Mooney–Rivlin model in Fig. 6b.

Figure 7 shows the contour plot for different representative variables, namely the hydrostatic pressure p, the Jacobian J and different stress components (i.e. \(\varvec{\sigma }_{11}\) and \(\varvec{\sigma }_{22}\)) (notice that axis in reference \(\{OX_1,OX_2,OX_3\}\) and deformed \(\{ox_1,ox_2,ox_3\}\) are coincident). In order to emphasize the high deformation that the beam is subjected to, a shadowed representation of the beam in its undeformed configuration is included. For visualisation purposes, the different spans of the structural system in the deformed configuration are shown slightly disconnected, enabling a clearer observation of the contour plot of the different variables.

Finally, Fig. 8 shows the expected \(p+1\) order of accuracy for the primary variables \(\varvec{x}\), \(\varvec{x}_0\) and \(\varvec{R}\), and the reduced p order of accuracy for the derived magnitudes \(\varvec{\Sigma _F}\), \(\varvec{\Sigma _H}\) and \(\varvec{\Sigma _J}\), where two meshes of 120 and 240 elements are compared with a reference solution defined from a mesh of 480 elements.

7.3 Constrained torsion–compression example: coercivity of the models

In this example, a straight beam is subjected to high compressive and torsional effects. The objective of this example is to show the un-realistic behaviour of the Saint Venant–Kirchhoff (24) model specially under high compression scenarios in contrast to that of the Neo-Hookean, Mooney–Rivlin (12)–(13) and the modified Mooney–Rivlin (125) models.

The beam configuration and geometry and the boundary conditions of the problem are shown in Fig. 9. The free end is subjected to a compressive force \(F=-9\times 10^{-1}\,{\text {N}}\) in the \(OX_3\) axis and to a torsional moment \(M=-2.7\times 10^{-2}\,{\text {Nm}}\) about the \(OX_3\) axis. The displacement of all the nodes along the centre line is constrained to remain aligned with the straight axis of the undeformed beam (i.e. \(\varvec{x}_0\cdot \varvec{D}_{\alpha }=0\)). In this way, physical buckling is prevented and the lack of coercivity of the models can be explored. The material properties in the reference configuration are \(E=1\,{\text {N/m}}^2\) and \(\nu =0.35\).

Fig. 11
figure 11

Constrained torsion–compression example. Curve relating the absolute value of the compressible load at the free end of the beam and its associated displacement in the \(OX_3\) axis. Saint-Venant model (SV), Neo-Hookean model (NH), Mooney–Rivlin model (MR) and Modified Mooney–Rivlin model (MMR). Material parameters \(\mu _{\text {lin}}=1/2.7\,\mathrm{{N/m}}^2\) and \(E=1\,\mathrm{{N/m}}^2\) in the reference configuration. Results for a discretisation of 50 linear elements (\( 51 \times 6\) degrees of freedom associated to the spatial coordinates \(\varvec{x}_0\) and incremental rotations \(\Delta \varvec{\theta }\))

Figure 10a shows the contour plot of \(\varvec{\sigma }_{12}\) (notice that axis in reference \(\{OX_1,OX_2,OX_3\}\) and deformed \(\{ox_1,ox_2,ox_3\}\) are coincident) and the deformed configuration of the beam for \(100\,\%\) of the total load for the Neo-Hookean, Mooney–Rivlin and modified Mooney–Rivlin models. For the Saint Venant–Kirchhoff model, only 30 % of the total loading has been applied. Figure 10b–d shows the contour plot for some representative conjugate stresses for the different constitutive models. Note that the conjugate stress \(\varvec{\Sigma _H}_{22}\) vanishes for the Neo-Hookean model (see Fig. 10c) whilst the conjugate stress \(\Sigma _J\) vanishes for the Saint-Venant model (see Fig. 10d).

Fig. 12
figure 12

Twisting column. \(L=6\,{\text {m}}\), \(h=1\,{\text {m}}\) and Dirichlet boundary conditions described in Eq. (128) for the static case in Sect. 7.4.1. \(L=15\,m\), \(h=1\,m\) and initial angular velocity \(\varvec{\Omega }_0\) for the dynamic case in Sect. 7.4.2

Fig. 13
figure 13

Static twisting column. Comparison between beam and continuum representations. Contour plot of stress \(\varvec{\Sigma _F}_{13}\,(\mathrm{{N/m}}^2)\) for (left) a Hu–Washizu mixed formulation [12] with \(3969\times 3\) degrees of freedom associated to the displacement \(\varvec{x}\), \(2304\times 4 \times 9\) degrees of freedom for \(\varvec{F}\), \(\varvec{H}\) \(\varvec{\Sigma _F}\) and \(\varvec{\Sigma _H}\), 2304 degrees of freedom associated to J and \(\Sigma _J\); (right) results displayed for the beam model with a discretisation of 24 linear elements (\( 25 \times 6\) degrees of freedom associated to the spatial coordinates \(\varvec{x}_0\) and incremental rotations \(\Delta \varvec{\theta }\))

Figure 11 displays the curve relating the absolute value of the displacement in the \(OX_3\) axis of the free end of the beam versus the absolute value of the force F applied. For small strains, the constitutive response of the four constitutive models is, as expected, almost identical. However, the lack of the required coercivity and convexity requirements for the Saint Venant–Kirchhoff constitutive model leads to the unphysical buckling observed in Fig. 11.

7.4 Static and dynamic twisting column

7.4.1 Static twisting column

This example includes the twisting of a cantilever beam of length \(L=6\,{\text {m}}\) and a squared cross sectional area of side \(a=1\,{\text {m}}\) as shown in Fig. 12. The beam is clamped at \(X_3=0\) and subjected to a torsion on its free end, namely \(X_3=6\). The torsion at the free end is generated through Dirichlet boundary conditions as follows

$$\begin{aligned} \left( \varvec{I}-\varvec{D}_3\otimes \varvec{D}_3\right) \varvec{x}=\theta \varvec{D}_3 \times \varvec{X}, \end{aligned}$$
(128)

where \(\theta \) is the angle of rotation. As can be observed, the section is not restricted to in-plane torsion and zero Neumann boundary conditions are imposed normal to the cross sectional area. The material properties of the beam are compatible with a shear modulus and Poisson ratio in the reference configuration of \(E=1\,{\text {Pa}}\) and \(\nu =0.35\), respectively. A similar example has been presented by the authors in previous reference [12].

The objective of this example is to benchmark the current beam formulation testing the capabilities of the beam representation against a very robust and precise continuum formulation. In particular, a mixed formulation associated to a special Hu–Washizu type of variational principle as presented in reference [12] is considered for that purpose. In this mixed formulation, the unknowns are displacements \(\varvec{x}\), the fibre, area and volume maps \(\{\varvec{F},\varvec{H},J\}\) and their stress conjugates \(\{\varvec{\Sigma }_{\varvec{F}},\varvec{\Sigma }_{\varvec{H}},\Sigma _J\}\). Regarding the selection of functional spaces: continuous quadratic interpolation of the displacement field \(\varvec{x}\), piecewise linear interpolation of the strain and stress fields \(\varvec{F}\), \(\varvec{H}\), \(\varvec{\Sigma }_{\varvec{F}}\) and \(\varvec{\Sigma }_{\varvec{H}}\) and piecewise constant interpolation of the Jacobian J and its associated stress conjugate \(\Sigma _J\) are considered.

Fig. 14
figure 14

Static twisting column. Comparison between beam and continuum representations. Contour plot of stress \(\varvec{\Sigma _H}_{32}\,(\mathrm{{N/m}}^2)\) for a Hu–Washizu mixed formulation [12] with \(3969\times 3\) degrees of freedom associated to the displacement \(\varvec{x}\), \(2304\times 4 \times 9\) degrees of freedom for \(\varvec{F}\), \(\varvec{H}\) \(\varvec{\Sigma _F}\) and \(\varvec{\Sigma _H}\), 2304 degrees of freedom associated to J and \(\Sigma _J\)

Fig. 15
figure 15

Dynamic twisting column. Contour plot of stress \(\varvec{\sigma }_{31}\,(\mathrm{{N/m}}^2)\) for (left) a Hu–Washizu mixed formulation [12] with \(3969\times 3\) degrees of freedom associated to the displacement \(\varvec{x}\), \(2304\times 4 \times 9\) degrees of freedom for \(\varvec{F}\), \(\varvec{H}\) \(\varvec{\Sigma _F}\) and \(\varvec{\Sigma _H}\), 2304 degrees of freedom associated to J and \(\Sigma _J\). Generalised alpha method with \(\rho _{\infty }=1\) and \(\Delta t=10^{-3}\,\mathrm{{s}}\); (right) results displayed for the beam model with a discretisation of 30 linear elements (\( 31 \times 6\) degrees of freedom associated to the spatial coordinates \(\varvec{x}_0\) and incremental rotations \(\Delta \varvec{\theta }\)). Generalised alpha method with \(\rho _{\infty }=1\) and \(\Delta t=10^{-2}\,\mathrm{{s}}\)

Fig. 16
figure 16

Dynamic twisting column. Contour plot of stress \(\varvec{\Sigma _{H}}_{31}\,(\mathrm{{N/m}}^2)\) for (left) a Hu–Washizu mixed formulation [12] with \(3969\times 3\) degrees of freedom associated to the displacement \(\varvec{x}\), \(2304\times 4 \times 9\) degrees of freedom for \(\varvec{F}\), \(\varvec{H}\) \(\varvec{\Sigma _F}\) and \(\varvec{\Sigma _H}\), 2304 degrees of freedom associated to J and \(\Sigma _J\). Generalised alpha method with \(\rho _{\infty }=1\) and \(\Delta t=10^{-3}\,{\text {s}}\); (right) results displayed for the beam model with a discretisation of 30 linear elements (\( 31 \times 6\) degrees of freedom associated to the spatial coordinates \(\varvec{x}_0\) and incremental rotations \(\Delta \varvec{\theta }\)). Generalised alpha method with \(\rho _{\infty }=1\) and \(\Delta t=10^{-2}\,\mathrm{{s}}\)

The Mooney–Rivlin model defined in (12)–(13) has been considered. Despite the obvious differences between beam and continuum formulations (i.e. different kinematical description, different interpolation spaces), Fig. 13 shows reasonable agreement between both beam and continuum representations in terms of tangential stresses. This agreement is excellent for stages of the deformation in which the kinematical assumptions of the beam model are applicable. Hence, when warping of the cross section of the column is pronounced, as in Fig. 13c, the comparison is still reasonable but not as accurate, as expected. The warping of the cross sectional area shown by the continuum representation leads eventually to the buckled configuration represented in Fig. 14. Appropriate incorporation of warping effects into the kinematics of the beam [9] would enable to capture this nonlinear behaviour.

7.4.2 Dynamic twisting column

Finally, a beam with geometry depicted in Fig. 12 is considered. In this time dependent problem, an initial angular velocity \(\varvec{\Omega }_0=35\sin \left( \frac{\pi X_3}{2L}\right) \varvec{D}_3\) compatible with the boundary conditions is prescribed. The material properties of the beam are compatible with a shear modulus and Poisson ratio in the reference configuration of \(E=0.0179\,{\text {GPa}}\) and \(\nu =0.3\), respectively. A similar example has been presented by the authors in reference [32].

The objective of this section is to benchmark the current beam formulation with the continuum formulation described in Sect.  7.4.1 in a time dependent problem. For both the continuum and beam degenerate problems, a generalised alpha method time integrator is employed [44].

The Mooney–Rivlin model defined in (12)–(13) is considered with material parameters defined in (126). Figures 15 and 16 compare well in terms of displacements between the continuum and beam descriptions. In addition, Figs. 15 and 16 also show a good agreement between the contour plots of the tangential stress \(\varvec{\sigma }_{31}\) (notice that axis in reference \(\{OX_1,OX_2,OX_3\}\) and deformed \(\{ox_1,ox_2,ox_3\}\) are coincident) and the conjugate stress \(\varvec{\Sigma _H}_{31}\), respectively, for the continuum and beam models. However, as expected, the simplifications introduced in the kinematical description of the beam model (lack of contraction/expansion of the cross section of the beam) lead to different results between both continuum and beam models regarding normal stresses.

8 Concluding remarks

This paper has provided a novel variational and computational approach to formulate polyconvex large strain geometrically exact beam theory, extending the original ideas introduced by Bonet et al. [12]. In addition, three key novel contributions are incorporated in the present work. First, the deformation gradient, its cofactor and its determinant, namely \(\{\varvec{F},\varvec{H},J\}\) are used for the first time as the main strain measures in the context of beam theory. Their respective work conjugates, namely \(\{\varvec{\Sigma _F},\varvec{\Sigma _H},\Sigma _J\}\) also feature for the first time in a geometrically exact beam formulation. Moreover, their co-rotational strain \(\{\varvec{\bar{U}},\varvec{\bar{W}},J\}\) and stress \(\{\varvec{\Sigma }_{\varvec{\bar{U}}},\varvec{\Sigma }_{\varvec{\bar{W}}},\Sigma _J\}\) counterparts are also presented.

For the first time, the strain energy of a Mooney–Rivlin model has been presented in terms of the classical beam strain measures \(\{\varvec{\Gamma },\varvec{K}\}\) by taking advantage of the novel algebra associated to the tensor cross product operation presented in [12] and thoroughly detailed in [33]. Notice that this re-expression procedure can be generalised to more complex constitutive models (i.e. anisotropy, higher nonlinearities).

Finally, the authors have shown that polyconvexity of a continuum constitutive model defined via a continuum strain energy functional \(W(\varvec{F},\varvec{H},J)\) implies convexity with respect to the classical beam strain measures of the equivalent beam strain energy functional \(W_b(\varvec{\Gamma },\varvec{K})\), stating explicitly the relationship between alternative tangent operators.

Further extension of our work will include multi-physics electro-magneto-mechanical effects.