Bayesian inference on a microstructural, hyperelastic model of tendon deformation

Microstructural models of soft-tissue deformation are important in applications including artificial tissue design and surgical planning. The basis of these models, and their advantage over their phenomenological counterparts, is that they incorporate parameters that are directly linked to the tissue’s microscale structure and constitutive behaviour and can therefore be used to predict the effects of structural changes to the tissue. Although studies have attempted to determine such parameters using diverse, state-of-the-art, experimental techniques, values ranging over several orders of magnitude have been reported, leading to uncertainty in the true parameter values and creating a need for models that can handle such uncertainty. We derive a new microstructural, hyperelastic model for transversely isotropic soft tissues and use it to model the mechanical behaviour of tendons. To account for parameter uncertainty, we employ a Bayesian approach and apply an adaptive Markov chain Monte Carlo algorithm to determine posterior probability distributions for the model parameters. The obtained posterior distributions are consistent with parameter measurements previously reported and enable us to quantify the uncertainty in their values for each tendon sample that was modelled. This approach could serve as a prototype for quantifying parameter uncertainty in other soft tissues.


Introduction
Fibrous soft tissues such as tendons, skin and arteries are vital to life.Tendons and ligaments, for example, enable movement by transmitting forces around the body [1].It is critical, therefore, that we understand soft-tissue mechanical behaviour to advance fields such as tissue engineering [2] and surgery [3].Soft tissues exhibit complex macroscopic phenomena, including anisotropy and nonlinearity, that are induced predominantly by the microstructure of the tissue.Anisotropy arises from the presence of collagen fibrils, which locally reinforce the tissue in a preferred direction.Initially, the fibrils are crimped and stress-free, but they straighten as the tissue deforms, contributing to its resistance to further deformation once taut [4].This gradual recruitment of collagen fibrils leads to the nonlinear stress-strain profile typical of soft tissues [5], as illustrated in figure 1a with a plot of the Cauchy stress, σ, against stretch, λ.
Additionally, soft tissues are viscoelastic, so assuming that their behaviour can be described by an elastic model is a simplification.Practically speaking, before tests to measure mechanical properties are performed, a tissue is subjected to cyclic loading until the stress-strain behaviour of the tissue is consistent between consecutive cycles (figure 1b).Then, the tissue can be treated as pseudoelastic and modelled as a particular elastic material upon loading and a different elastic material upon unloading [6].In reality, energy is dissipated in the tissue during the loading-unloading cycle, but we can apply elasticity theory to the tissue as long as we only examine one loading path.Furthermore, for sufficiently slow (quasi-static) or extremely rapid deformations, the loading and unloading curves are relatively close to one another as hysteresis is minimized in these regimes.
To model soft-tissue deformation, we will use the theory of hyperelasticity, relating the stress to the strain via a strainenergy function (SEF).There are two approaches to developing a hyperelastic model: the phenomenological and structural approaches (although any one SEF can incorporate features of both).Phenomenological models seek to achieve the best quantitative fit to experimental data.They do not attempt to determine how the microstructure influences the macroscopic behaviour observed in mechanical testing because the model's parameters do not necessarily have a clear physical interpretation.By contrast, structural models incorporate physically relevant parameters to elucidate the relationship between the arrangement and properties of the tissue's constituents and its mechanical behaviour.Incorporating microstructural information into a SEF often increases its complexity and it is important that a structural SEF remains tractable if it is to be viable for studying soft-tissue deformation.This is important when employing a Bayesian framework with Monte Carlo sampling, where we require a large number of solutions of the forward model.Therefore, simplifying assumptions about the microstructure are often required.
Values for unknown structural parameters can be obtained via imaging methods such as serial block facescanning electron microscopy [7,8] and X-ray computed tomography [9][10][11][12], and for constitutive parameters using micromechanical techniques like force spectroscopy [13] and atomic force microscopy [14].These techniques are challenging, however, and a wide range of values has been reported for certain quantities.The collagen fibril Young's modulus, for example, has been reported to have a value ranging from 32 MPa [13] to 2.8 GPa [14].This uncertainty makes it difficult to predict soft-tissue mechanical behaviour using optimization techniques alone.Therefore, in this paper, we take a Bayesian approach to the modelling process to characterize the likely ranges of values that microstructural and micromechanical parameters can take.
Due to their importance and the fact that they have been studied extensively, we focus on tendons in this paper.The mechanical properties of different tendons are distinct from one another, with energy-storing tendons being more extensible than positional tendons due to differences in their microstructures [15,16].One feature that is common to all tendons is that their collagen is structured in a regulated, hierarchical fashion and aligned closely with the tendon's axis [1].Collagen molecules form cross-links and aggregate into fibrils with diameters ranging from 12 to 500 nm [7].Collections of fibrils collect into larger structures called fibres, with diameters of 150-1000 μm, which themselves form fascicles, with diameters of 1000-3000 μm, [17].In other soft tissues, collagen fibrils are less strongly aligned and form a network, but, by aligning many fibrils in one direction, the tendon is stronger in that direction [18].
Models such as the Holzapfel-Gasser-Ogden (HGO) model [19], which was initially created to study arteries, have been adapted to study tendons [20].This model is structural in the sense that it incorporates a strain invariant that is directly related to the stretch in the collagen fibres, but phenomenological in the sense that an exponential function is used to describe collagen recruitment, and the stretches in individual fibrils are not tracked.Other models have explicitly incorporated the crimp morphology of the fibrils [21][22][23][24][25] and produced a good fit to experimental data.Several probability density functions (PDFs) have been used to describe the distribution of fibril length, including the Weibull [26] and triangular distributions [27][28][29], as summarized in a review article by Thompson et al. [30].
In this paper, we derive a new structural SEF for modelling soft tissues that assumes collagen fibrils are linearly elastic and have a triangular length distribution.We test the efficacy of the model using nonlinear optimization to find a parameter vector that produces a local best fit to data.Secondly, we repeat the fitting process using a Bayesian framework.This enables us to incorporate prior beliefs about the unknown model parameters, a statistical model for noisy observations of the stress-strain curves and our nonlinear model to obtain posterior distributions for the model parameters.Through these distributions, we identify and quantify the uncertainty in the parameters and the directions in parameter space in which the model is more or less sensitive.royalsocietypublishing.org/journal/rsif J. R. Soc.Interface 19: 20220031 The structure of the paper is as follows.In §2, we describe the underpinning continuum mechanical theory that is required to model the deformation of an anisotropic soft tissue and, using physical considerations, derive a new constitutive equation to model tendons.In §3, we use nonlinear optimization to fit the model to experimental stress-strain data and compare its quality of fit to that of the widely used HGO model and a microstructural tendon model.In §4, we account for noise in the experimental data using a likelihood function that allows us to study the problem under a Bayesian framework.In §5, we derive the posterior distribution for the model's microstructural and micromechanical parameters.In §6, we summarize our findings and discuss potential ways to expand upon our work.

Model derivation 2.1. Preliminaries
Prior to considering the constitutive response of soft tissues, we need to consider kinematics, i.e. how to formulate the mechanism of deformation.First, we distinguish between two configurations, the reference (initial) configuration and the deformed configuration.Points on the reference and deformed bodies are described by the vectors X and x, respectively.The two sets of coordinates are related via the deformation mapping, χ, i.e. x = χ(X).We define the deformation gradient, F, as where r X represents the gradient operator with respect to the reference coordinates.From the deformation gradient, we define two symmetric measures of the deformation, known as the left and right Cauchy-Green deformation tensors, B = FF T and C = F T F, respectively [31].
The SEF W allows one to define the constitutive equation of a hyperelastic material, relating stress to strain via derivatives of W. In order to determine the exact form of this constitutive response, we must first identify the symmetry properties of the material.The mechanical behaviour of transversely isotropic materials, such as tendons, is only invariant for rotations around a preferred direction, M. Furthermore, the SEF must be objective as the laws of physics are the same in any inertial frame of reference.As the SEF is invariant under a coordinate transformation, we can write it as a function of invariants of the deformation.For an isotropic material, there are only three invariants, I 1 = tr(C), I 2 ¼ 1 2 ððtrðCÞÞ 2 À trðC 2 ÞÞ and I 3 ¼ detðCÞ.For a transversely isotropic material, we must introduce an additional two pseudoinvariants that depend on M:

The model
Collagen fibrils in tendons are crimped when the tendon is relaxed, but straighten out as it is stretched [4].We model the distribution of fibril lengths using a triangular distribution, which enables us to obtain an explicit, analytical form for the SEF.It is unlikely that such an analytical form could be derived for other PDFs; however, when it is symmetric, the triangular distribution coarsely approximates the normal distribution (figure 2a), and it has been shown to provide a reasonable approximation to fibril length distributions in tendons, as measured via second-harmonic generation imaging [29].An individual collagen fibril is assumed to be stress-free until becoming taut at a recruitment stretch λ r .Once taut, it is assumed to be linearly elastic.The nonlinearity of the SEF arises through the gradual recruitment of collagen fibrils [32].Fibrils in the tendon are assumed to be locally coaligned.We follow the widely used assumption that we can accurately describe soft tissue mechanics using only the isotropic invariant I 1 to model the tendon's non-collagenous matrix (NCM) and the anisotropic invariant I 4 [19,22] to model the fibrils (and therefore, assume there is no dependence on I 2 , I 3 or I 5 ).We decouple the contributions of the collagen fibrils and NCM in the SEF and assume that each component's contribution is proportional to its volume fraction.Finally, we assume that tendon is incompressible.Thus, the SEF, W(I 1 , I 4 ), is where ϕ is the collagen volume fraction.
To determine the form of W coll (I 4 ), we start by defining the stress exerted upon a single collagen fibril.We assume the fibrils are slack while crimped and obey Hooke's Law once taut, so that the stress can be expressed as where E is Young's modulus of the collagen fibrils.We can determine the total (Cauchy) stress acting upon the collagen fibrils that are aligned in a given direction within a representative volume element by calculating the following integral: where f (λ r ) represents the PDF of the recruitment stretch.We derive SEFs for two different triangular distributions: a symmetric distribution and a general distribution.We refer to them as the symmetric triangular (ST) and general triangular (GT) models, respectively.For both distributions, the first fibril becomes mechanically active at λ = a, and the last fibril becomes mechanically active at λ = b.For the ST distribution, the mode is half-way between a and b, whereas, for the GT distribution, the mode is designated by a third parameter c, with a < c < b.The PDF for the GT distribution, f gen (λ r ), is ðbÀaÞðbÀcÞ , c l r b, 0, if l r .b: The PDF for the ST distribution, f sym (λ r ), is obtained by setting c = (a + b)/2 in (2.5) (figure 2b).Using (2.3) and the PDF of the fibril recruitment stretch distribution, we can evaluate the integral in (2.4) analytically.Exploiting the fact that I 4 = λ 2 , i.e.I 4 is equal to the square of the stretch of the fibrils, we obtain  [22] to write the left side of (2.6) in terms of W coll (I 4 ).Eventually, we obtain where G(I 4 ) is a piecewise constant that ensures the continuity of W coll (I 4 ).We further assume that the mechanical response of the NCM can be modelled by a neo-Hookean SEF [23], giving where μ is the NCM shear modulus.A full derivation of this SEF is provided in the electronic supplementary material.
For an incompressible, transversely isotropic SEF that is a function of I 1 and I 4 only, the constitutive equation, in terms of the Cauchy stress, is where p is a Lagrange multiplier associated with the incompressibility constraint, and m = FM is the direction of the collagen fibrils in the deformed configuration.The analytical form of the SEF presented above allows the stress to be calculated rapidly compared with constitutive models that require numerical integration over the collagen recruitment stretch (e.g.[28,29]).This rapidity allows the stress to be calculated millions of times within a relatively short time frame, which is exploited in our Markov chain Monte Carlo (MCMC) approach, below.

Nonlinear optimization
The first two sets of stress-strain data that we fitted were experiments on mouse tail tendons collected by Goh et al. [33,34].We used two datasets designated as mtt01 1 t5c and mtt01 1 t6b trunc.We shall refer to them as t5c and t6b for brevity.In these datasets, tendon specimens of length 7 mm were stretched at a displacement rate of 0.067 mm s −1 (a strain rate of approx.1% s À1 ) [33].The second two sets of data were collected by Thorpe et al. [15] and were previously modelled using a different SEF by Shearer et al. [16].In these, the strain rate was 5% s -1 [15].We used the datasets designated as equine common digital extensor tendon (CDET) from horse number 39 and equine superficial digital flexor tendon (SDFT) from horse number 16.These datasets were selected as they have a particularly large elastic region, with the onset of failure not occurring until around 10% strain.For brevity, we refer to them as CDET and SDFT, respectively.Each dataset continues up to failure of the tendon, which is beyond the scope of our model.In order to fit only to data that is consistent with the assumptions of our model, we used the data at stretches below the point at which the maximum gradient occurs for the data collected by Goh et al., and we used all data points up to 10% strain for the data collected by Thorpe et al. in accordance with [16].
To derive the constitutive equation, we assumed that a cylindrical tendon sample, described using cylindrical polar coordinates, is stretched along the axis of the aligned fibrils, which are oriented along the Z-axis (figure 3).For this deformation, given the assumed incompressibility and symmetry of the material, the reference and deformed coordinates, (R, Θ, Z) and (r, θ, z), are related by The Cauchy stress, (2.9), gives the force acting on the deformed material per unit deformed area; however, the quantity recorded in the experiments being modelled is the engineering stress, the force per unit reference area, which is denoted as N in figure 4. Therefore, after substituting our SEF into (2.9),we divide the resulting expression through by λ to obtain where I 4 = λ 2 by (3.1) and the assumed alignment of the fibrils.
To provide a benchmark for the fit of our model to data, we also fitted the tendon data using the commonly used HGO model [19].The HGO model was originally developed for modelling arteries and incorporates two families of collagen fibres; however, it has been adapted to study an extensive range of biological soft tissues, including tendons, and has been implemented in several finite-element software packages.To model collagen in tendons, which contain one family of collagen fibres, we use the following transversely isotropic version of the HGO SEF: where c HGO and k 1 are parameters with dimensions of stress and k 2 is a dimensionless model parameter.The engineering stress produced by this SEF (3.3) is To test our model against an existing microstructural tendon model, we used the following SEF [22]: ð3:5Þ where θ o is the initial crimp angle of the outermost, mostcrimped fibrils in the tendon's fascicles.We adapted the SEF by including a shifting parameter, γ, that corresponds to the engineering strain at which the first collagen fibril becomes mechanically active.In (3.5), this corresponds to replacing λ with λ − γ, that is, replacing The engineering stress for this modified tendon model is ð3:6Þ As ϕ, μ and E only appear in the SEF (2.8) in the distinct terms (1 − ϕ)μ and ϕE, we treated (1 − ϕ)μ and ϕE as two independent fitting parameters.Thus, the ST SEF contains four fitting parameters, (1 − ϕ)μ, ϕE, a and b.The GT SEF has an additional fitting parameter, c.In order to obtain physically realistic values for the parameters, we constrained them as follows: for the ST model, 0 < (1 − ϕ)μ, 0 < ϕE, 1 < a < b, a < λ max , where λ max represents the maximum stretch in the data; for the GT model, we replaced 1 < a < b with 1 < a < c < b; for the HGO model, (3.3), 0 < c HGO , 0 < k 1 and 0 < k 2 ; and for the modified tendon model, 0 The mean absolute error, Δ, between the experimental data, y, and simulated data, ŷ, is where d is the length of the dataset.Similarly, the mean relative error, δ, is The values of Δ and δ when fitting each model to the four datasets are given in table 1.To achieve the closest possible fit to the data, we ran the fitting function once and then restarted it a further four times with the estimated parameters at the end of a run used as the initial estimates for the next run.We found these restarts to have little to no effect on the quality of fit.Both versions of our SEF achieve a closer fit to the data than the microstructural tendon model for each dataset.Additionally, they only perform worse than the HGO model for the relative fit to the t6b dataset.Between the ST and GT models, the mean absolute and relative errors for the four datasets are similar, with the former surprisingly matching and outperforming the latter, in terms of the average relative error obtained, for the t5c and t6b datasets, respectively.This is particularly interesting because the GT model contains an additional degree of freedom.This is likely because nonlinear optimization only provides a local best fit to data and only 1000 iterations of the Nelder-Mead algorithm were performed each time the algorithm was run.Two examples of the fit of our model to the experimental data are presented in figure 4. The electronic supplementary material shows all 16 fits.The parameter values found at the end of the fifth run of the nonlinear fitting function are listed in table 2.

Markov chain Monte Carlo
Through nonlinear optimization, we have found the best fit to experimental data local to the algorithm's initial guesses for the parameter values; however, this approach does not quantify the uncertainty in the parameter values.Uncertainties arise for a number of reasons, including observational noise in the experimental stress-strain data.To address this, we apply a Bayesian framework to the same problem studied with the optimization approach and estimate the likely ranges of the true values of the model's parameters.Using the posterior distributions we obtain from the algorithm, we can estimate likely parameter values and quantify the uncertainty in those estimates.The goal of Bayesian statistics is, given new data, to update any prior knowledge about the values of a model's parameters via the likelihood of a particular parameter vector θ (the vector of constitutive and structural parameters in our SEF) producing the observed (experimental) data y.Through this, we obtain what is known as the posterior probability distribution of θ, π(θ|y), which is related to π 0 (θ), the prior probability of θ and the likelihood via Bayes' rule pðujyÞ / LðyjuÞp 0 ðuÞ, ð4:1Þ where L(y|θ) denotes a function that is proportional to the likelihood density.The posterior is only known up to a constant of proportionality, which often cannot be explicitly computed.Under those circumstances, a common method to characterize the posterior distribution is to sample from it using numerical methods such as MCMC.
Monte Carlo methods can be used to estimate expectations with respect to a particular measure, for example π(θ|y); however, for Bayesian inverse problems, we cannot usually directly sample from the posterior distribution.Instead, we can indirectly sample from the posterior using MCMC methods, which construct an ergodic Markov chain whose unique stationary density is equal to the posterior.Monte Carlo estimates taken with respect to this Markov chain can be shown to converge to expectations taken with respect to the posterior distribution.Initially, Markov chains do not sample from the stationary distribution and values proposed in the MCMC algorithm are dependent on the chain's starting position.This initial period is called the burn-in phase, the size of which depends on the quality of  the initial guess and the rate of mixing of the Markov chain.We do not include samples from the burn-in phase when calculating MCMC estimates, or when visualizing the posterior distribution.

Hierarchical Bayesian approach and conjugate priors
In §2.2, we derived a deterministic SEF.Now, in order to derive the likelihood function for this modelling problem, we assume that the observed data are given by the model output perturbed by some noise.We choose the standard modelling assumption that the noise is additive, mean-zero, Gaussian and independently and identically distributed (IID), giving us a diagonal covariance matrix for which the entries are equal to the observational noise variance σ 2 .This gives us the following statistical model for our observations: 2), we derive the likelihood, which is induced by the statistical model on the left of this equation, once the noise is assumed to be a zero-mean, homoscedastic, multivariate, Gaussian random variable This is sufficient if we have a clear idea of the value of the observational noise variance σ 2 , but in practice this is rarely the case.The value of σ 2 can be very important, potentially causing under-or over-fitting.Therefore, we take a hierarchical Bayesian approach and assign a prior distribution to σ 2 .A priori, we assume that the parameters are independent of one another, so the joint prior distribution is the product of the parameters' individual prior distributions.That is, where h denotes the length of θ.By (4.1) and (4.4), pðu, s 2 jyÞ / Lðyju, s 2 Þp 0 ðu 1 Þ Á Á Á p 0 ðu h Þp 0 ðs 2 Þ: ð4:5Þ Using a conjugate prior for σ 2 , we avoid having to infer σ 2 explicitly by integrating out the dependence of the posterior distribution with respect to σ 2 since this integral can be computed analytically.In this instance, our likelihood function is a Gaussian PDF, so an appropriate conjugate prior for the observational noise variance is an inverse-gamma distribution.After multiplying the likelihood function by the product of the prior densities (4.4), we arrive at the posterior distribution.The marginal distribution on the model parameters can then be derived by integrating out σ 2 , giving a Student's t-distribution multiplied by the prior density on the remaining unknowns: pðujyÞ / t 2as y; MðuÞ, b s a s I d p 0 ðuÞ, ð4:6Þ where we define t 2as ðÃ; g, cÞ as the posterior predictive density of * according to a Student's t-distribution of 2a s degrees of freedom with mean γ and covariance matrix ψ, and where a s , b s .0 are parameters of the hyperprior on σ 2 .For the implementations of the RWM algorithm detailed below, we set the values of the hyperparameters to be α σ = 3 and β σ = 0.3.A full derivation of the posterior predictive is provided in the electronic supplementary material.We cannot integrate (4.6) analytically and determine the normalization constant; therefore, we choose to characterize the posterior predictive by sampling from the target distribution using the random walk Metropolis (RWM) algorithm.This method enables us to construct an ergodic Markov chain with invariant density equal to π(θ|y).We can then use the computed Markov chain for Monte Carlo estimates and to visualize the target distribution.Producing a Markov chain of length n and using Uð0, 1Þ to denote a uniform distribution between zero and one, the RWM algorithm is given in algorithm 1.
The covariance matrix of the proposal distribution, Σ, affects the efficiency of RWM algorithms.It controls the scale of the proposal variance and the correlations between coordinates in the sampled vector.Adaptive random walk algorithms allow us to adapt Σ to optimize efficiency.To address the scale and correlation of the sample vector, we adapt the covariance matrix to Σ = β 2 ζ, where β 2 is a scaling parameter, with β > 0, and z [ R hÂh is the covariance matrix of the parameters constructed from a chosen set of parameter vectors.
Regarding scale, it has been shown that the optimal acceptance rate for multivariate RWM is 0.234 [35].For a given value of ζ, β can be to achieve an acceptance rate close to this value.Small values of β lead to a proposal density closely concentrated around the current state, which leads to a high acceptance rate but slow exploration.Conversely, large values of β lead to a diffuse proposal density where sampled vectors are likely to be in the tails of the posterior distribution, leading to low acceptance rates and therefore slow exploration.
Efficient proposal distributions reflect the correlation structures in the target density.For instance, if the probability density is concentrated close to a lower dimensional manifold, then proposal distributions which favour bigger moves in the directions parallel to the manifold will lead to faster exploration than isotropic proposal distributions.We do not know the correlation structure of the target a priori, but this can be learned through initial exploration with an isotropic proposal distribution.
We employed an adaptive RWM method, recalculating Σ after every block of 500 samples.To construct ζ, we used the position of the Markov chains over the last 10 000 samples in the chain.To ensure Σ was positive definite during the algorithm, we regularized by adding the identity matrix multiplied by a small number, 1 × 10 −5 , to ζ whenever it was recalculated.We let the value of β 2 depend on the acceptance rate within a block, α block .The conditions for updating β 2 at the end of each block were -α block < α LowerTol : multiply β 2 by 0.95 2 ; -α LowerTol ≤ α block ≤ α UpperTol : keep β 2 at the same value; -α UpperTol < α block : multiply β 2 by 1.05 2 , where α LowerTol and α UpperTol denote the lower and upper bounds of the allowed acceptance rates for the algorithm, which we set equal to 0.184 and 0.284, respectively (0.234 ± 0.05).The tolerances account for the range of acceptance rates for which an RWM algorithm is assumed to run efficiently enough.We must stop iterating β 2 and ζ at some point in the algorithm, since adaptive MCMC algorithms must satisfy the property of diminishing adaptation in order to maintain ergodicity [36].We stopped adaptation of Σ at the end of the burn-in phase, which consisted of the first 500 000 samples of the Markov chain.Convergence was checked through the repeatability and smoothness of the computed histograms, but more formal methods to quantify convergence are available [37].

Application of Bayesian methods to tendon deformation
Before running the RWM algorithm, we transformed the parameters so their support extended over the whole of R because sampling parameters whose support matches that of the proposal distributions improves efficiency.Here, we discuss the approach for the ST model.The GT model is discussed in the electronic supplementary material.Each element of the parameter vector, ψ = [(1 − ϕ)μ, ϕE, a, b], is non-negative, and the uncertain parameters a, b must satisfy a > 1 and a < b.These two conditions give rise to the natural choice of parameters for inference a − 1 > 0 and b − a > 0.
Along with the other non-negative, uncertain parameters, we assigned lognormal priors to ensure well-posedness.
Taking the logarithm of these parameters, we obtained the parameter vector u [ R 4 , where where T(ψ) represents an invertible, nonlinear transformation of the target parameters ψ.This transformation, in turn, leads to a transformation of the likelihood and posterior distributions.When performing RWM on the transformed parameters, θ, the target density is given by the pullback p of the posterior π(ψ|y ) through the map T, which has density pðuÞ ¼ pðT À1 ðuÞjyÞ Á j det D T À1 ðuÞj, ð5:2Þ where D T À1 ðuÞ is the Jacobian of T −1 .The value of this additional factor is detailed in the electronic supplementary material.As we assigned a lognormal prior to their exponents, each parameter in θ has a normal prior distribution.The electronic supplementary material details how the two parameters of the lognormal prior, and, thus, the mean and variance of the corresponding normal prior, were chosen for (1 − ϕ)μ, ϕE, a − 1 and b − a.
To validate our approach and its implementation, we first used it on a synthetic dataset created with a chosen parameter vector for the ST model, before moving on to the CDET and SDFT data, for which we used both the ST and GT models.We also ran the ST algorithm on the data collected by Goh et al., as discussed in the electronic supplementary material.All runs of the algorithm described below included a burn-in phase of 500 000 samples that were not included in the results.

Synthetic data
Fitting to synthetic data acts as a proof-of-concept that enables us to study the posterior distributions when we know the 'true' parameter values associated with the data.To create the synthetic stress values, we inputted the same set of strains as the SDFT dataset and the parameters [(1 − ϕ)μ, ϕE, a, b] = [7 MPa, 800 MPa, 1.03, 1.13] into the SEF.To simulate experimentally collected data, we added IID mean zero Gaussian noise to the stresses, and to test the algorithm rigorously, we made the synthetic data noisier than the real data by choosing a variance of 0.01 for the noise.We produced a Markov chain of 1.5 million samples for the synthetic data.
The marginal posterior distributions and the twodimensional joint distributions of the parameters that were obtained when fitting the synthetic data are shown in figure 5.Although they do not align exactly with the modes of their respective posteriors, the parameter values used to create the synthetic data are not located in the tails of the posterior, but lie in regions of relatively high posterior probability.The smoothness of the empirical distribution also implies that the algorithm sampled efficiently.
Figure 6 shows the fit to the data and a 5σ confidence band around the mean predicted stresses for a sample of 50 000 parameter vectors.A close fit is achieved for the whole stress-strain curve.This synthetic experiment demonstrates that accurate estimates of the constitutive parameters, and the uncertainty in those estimates, can be derived through a Bayesian framework, and characterized using our tuned adaptive algorithm.

SDFT and CDET data: ST model
We now analyse the algorithm's predictions when we fit the ST model to the high-resolution tendon data collected by Thorpe et al.We used the ST model based on the assumption that the fibril length distribution is symmetric; however, we relax this assumption in §5.3.We produced a Markov chain of 1.5 million samples for the SDFT data and 10.5 million for the CDET data.The additional nine million samples for the CDET data were performed to help produce smoother posteriors as their irregular shapes caused slower mixing of the Markov chains.For the SDFT data, figure 7 contains the estimated posteriors and contour plots obtained from the RWM algorithm and figure 8 shows a confidence band of 5σ around the mean stress-strain curve of 50 000 parameter vectors plotted against the data.As in the synthetic example, the empirical posterior distribution is smooth, implying good convergence.For the structural parameter ϕE, we have a physically realistic posterior: the 95% credible interval for ϕE is 814-827 MPa, which is feasible compared to literature values for ϕ and E in tendon [14].The stretches at which the first and last collagen fibrils tauten are also realistic.There are parameters with strong positive correlations: (1 − ϕ)μ and a, and ϕE and b.These indicate that to replicate the experimental data closely, the NCM must be stiffer if collagen fibrils are slack for longer, and the fibrils must be stiffer if fewer are mechanically active.Likewise, strong negative correlations between a and b and (1 − ϕ)μ and ϕE indicate that the final fibril must be taut sooner if the first is slack for longer, and the fibrils must be stiffer if the NCM is more compliant.These are all physically reasonable royalsocietypublishing.org/journal/rsif J. R. Soc.Interface 19: 20220031 correlations, demonstrating the benefit of full posterior characterization as opposed to traditional optimization.Figure 8 demonstrates that the algorithm identifies parameter vectors that fit the SDFT data closely.For all stretches, the data lie close to the mean stress, and either within or close to the 5σ confidence band that is narrower than for the noisier synthetic data.Again, the confidence band is consistently sized, demonstrating the model's ability to quantify how different microstructural components and phenomena (the NCM and the gradual tautening of collagen fibrils) influence the macroscopic mechanical response of the tendon.As the strain nears 10%, however, the model's predicted stresses are slightly higher than the experimental data, indicating a degree of discrepancy between the model and data.This could be due to damage to some fibrils as the stretch nears 10% strain, contradicting an assumption of the model.To achieve the best fit to the data overall, while retaining the linearity of the model in region III of the stress-strain curve, some underestimates of the experimental stress occur at smaller stretches to compensate for the overestimates as the strain approaches 10%.
For the CDET data, the parameter (1 − ϕ)μ possesses a high predicted posterior probability mass close to zero (figure 9), with a long tail for larger values.Due to the strong negative correlation between (1 − ϕ)μ and ϕE, the  royalsocietypublishing.org/journal/rsif J. R. Soc.Interface 19: 20220031 shape of the marginal distribution for ϕE is also affected.The shape of the posterior for (1 − ϕ)μ likely occurs because few data points lie in the toe region, with the proposed values of a being close to one, meaning that the collagen fibrils dominate the response to the deformation even at small stretches.
As the density lies close to zero for one of the parameters, taking the logarithm results in a curved posterior, whose global covariance structure is less informative for making effective proposals.Therefore, in order to obtain smoother posteriors, 10 million samples were taken.Alternative approaches would be to use more sophisticated methods such as the Metropolis-adjusted Langevin algorithm or Hamiltonian Monte Carlo [38].The posteriors for a and b are smooth, implying a good level of convergence to the posterior distributions.Furthermore, close fits to the data from the sampled parameters are still achieved (figure 10).With a 95% credible interval of 1360-1380 MPa, we again obtain physically reasonable estimates of ϕE [14].

SDFT and CDET data: GT model
To investigate possible asymmetry of f(λ r ), we applied the GT version of the model within the inference to the SDFT and CDET data.As with the ST version of the model, we produced a chain of 1.5 million samples for the SDFT data and 10.5 million for the CDET data.The posteriors and contour plots obtained are plotted in figures 11 and 12.To evaluate the skewness of the GT distributions, we plotted histograms of the quantity   Contour plots and marginals of the posterior are shown in figure 11.The estimated values of the constitutive parameters are again realistic, with the 95% confidence interval of ϕE being 823-833 MPa.We note, by comparing with the ST case, that the inclusion of the additional parameter c has slightly increased the predicted values of ϕE and decreased those of (1 − ϕ)μ.
In figure 12, we can see that the contour plots and histograms are not as smooth as for the SDFT data, indicating that even more samples might be required to achieve a high degree of convergence.The contour lines also possess a more complex shape than we see for the SDFT data.These features are also present when using the ST model (figures 7 and 9).In the GT model fit, a 95% confidence interval for ϕE is 1360-1380 MPa, which is the same as the ST case to three significant figures.Compared to figure 9, the posteriors for a and b are also similar.
Finally, figure 13 demonstrates that the value of c, the modal fibril length, is predicted to be significantly closer to b than a for the SDFT data, suggesting that this tendon may have a skewed fibril length distribution.By contrast, for the CDET data, c covers a range of values but is most commonly found near the middle of a and b, indicating that the distribution is close to symmetric in this case.

Discussion
In this paper, we developed a new model of soft-issue mechanical behaviour that only contains microstructurally relevant parameters and used a Bayesian MCMC framework to determine the likely range of their values.Our approach was similar to that of Akintunde et al. [20], who used a similar framework to compare the ability of three existing SEFs to fit agedependent, murine tendon stress-strain data.These SEFs, which were derived by Gasser, Ogden and Holzapfel (GOH) [39], Freed & Rajagopal (FR) [21] and Shearer (SHR) [22] take different approaches to modelling soft-tissue fibres.The GOH model accounts for fibril recruitment phenomenologically using an exponential function, whereas the FR model uses an implicit elasticity approach with a phenomenologically chosen implicit energy function.Neither of these models can be used to predict fibril length distributions.The SHR model, on the other hand, makes the specific assumption that fibril length varies radially within a fascicle, with fibrils at the centre of the fascicle being the shortest, and those on its periphery being the longest.While there is evidence that this assumption holds for some tendons [40], it is likely not valid for all, or for other biological soft tissues.This was one of the motivations for developing the new constitutive model presented above, which is more general.When fitted to experimental murine and equine tendon data using nonlinear optimization, the new SEF provides a closer fit than the SHR model to all four datasets studied and a closer fit than the HGO model to three out of the four datasets, only narrowly providing a worse relative fit to the t6b dataset.
We also implemented an adaptive RWM algorithm to characterize posterior probability distributions to quantify the uncertainty in the values of the fitting parameters.This algorithm samples effectively when fitting to both synthetic  and high-resolution experimental data.Furthermore, it samples parameter vectors that provide a close fit to the data, with 95% credible intervals for the important physical parameter ϕE containing realistic values when compared with existing estimates of the parameters ϕ and E. It is intriguing that the GT model predicted that the SDFT has a skewed fibril length distribution, but that the CDET's is likely symmetric.As the SDFT and CDET are archetypal examples of energy-storing and positional tendons, respectively, this may point to a potential difference in the collagen fibril arrangements of these two tendon types.Future work could examine the fit of the GT model in the RWM algorithm to a large number of datasets to determine whether there are indeed differences in the skewness of the fibril length distributions of energy-storing and positional tendons, or whether this result is just an example of inter-sample variation.We emphasize that the distributions that we arrive at here only describe the samples used to create the data, and that they do not reflect the natural variation across all tendons.
As the model is pseudoelastic, Young's and shear moduli predicted by our model are specific to the strain rates used in the experiments we fitted.The effective moduli would increase with increasing strain rate.Our findings suggest that ϕE differs from sample to sample.Consequently, either the collagen volume fraction varies between the samples we fitted, or there may not be a universal collagen fibril Young's modulus.In particular, it may be wrong to assign a Young's modulus to collagen on the fibrillar level as molecular differences may cause some fibrils to be stiffer than others.If so, our model would need to be modified to allow for variation in the constitutive, as well as the structural, parameters.The Bayesian approach assumes that our model is 'correct' in the sense that it incorporates all of the physics necessary to predict the microstructural and constitutive parameters accurately.If a significant feature is absent from the model, there would be inaccuracies in the predicted parameter values; however, the quality of fit our model achieved, along with the agreement of the predicted parameter values with experimental values reported in the literature, together, support the assumption that the most important physical features of tendon deformation under the experimental conditions of our test data are included.A more comprehensive model can always be developed, however.There are phenomena not explicitly considered here, such as chemomechanical coupling and swelling due to variation in water content [41] that could affect the parameter values in our model; therefore, our results apply only to the specific conditions that were imposed in the experiments we modelled.Explicit incorporation of such effects could be one way to improve upon our model.
We used the triangular distribution to model the distribution of collagen fibril lengths in tendon.Other tractable SEFs could be derived by modelling fibril lengths with alternative distributions, such as the step distribution, for example, which would also lead to a convenient analytic representation.Additionally, more efficient sampling methods, such as Hamiltonian Monte Carlo, which uses derivatives of the log-posterior with respect to model parameters to propose parameter vectors in areas of high posterior probability, could be used instead.We have studied tendons, which possess more strongly aligned collagen than tissues such as skin, where fibrils are generally splayed.By applying our model, and the Bayesian approach used here, to other soft tissues, we could quantify uncertainty in a broader range of scenarios.A plausible test of the model and its assumptions would be to fit multiple datasets simultaneously, enforcing the constitutive parameters to be the same between the datasets and varying the structural parameters only.The posterior probability distributions could then quantify the inter-sample variation in the constitutive parameters of a particular tissue in any given species.

Figure 1 .
Figure 1.(a) The stress-strain behaviour of soft tissues.Region I: only the compliant components are loaded; the collagen fibrils are crimped and slack.Region II: gradually, the stiff collagen fibrils straighten and become taut.Region III: all the collagen fibrils are taut; the soft tissue is stiff and linearly elastic.(b) Successive loading-unloading cycles of a viscoelastic soft tissue until the tissue can be treated as pseudoelastic.

Figure 2 .
Figure 2. (a) The PDFs, f(λ r ), for a normal distribution and two symmetric triangular distributions, with minimal value a and maximal value b, that approximate the normal distribution.s.d.= standard deviation.(b) The PDFs, f(λ r ), for the symmetric and general triangular distributions.

Figure 3 .
Figure 3.The tendon sample in the reference configuration (upper left) and face with normal in the Z-direction (upper right), and the deformed body (lower left) and face (lower right) after a force of Fe z is applied to the tissue.

Figure 4 .
Figure 4. Fits of (a) the ST model to the t5c data, and (b) the GT model to the CDET data.Yellow dots represent the experimental data and black, dashed lines represent model fits.
:2Þ where d is the length of y, I d is a d × d identity matrix, N ð0, s 2 I d Þ represents a normal distribution with mean 0 and covariance matrix σ 2 I d , y [ R d , and M(θ) denotes the output of the model given input values θ.From (4.

Figure 5 .
Figure 5. Plots of the posterior distributions calculated using the RWM algorithm on synthetic data.Main diagonal: marginal posteriors.Lower half: twodimensional contour plots of the joint distributions.Upper half: posterior correlations between parameters.The parameter values used to create the synthetic data are represented by a red line on the posteriors and a black dot on the contour plots.For the correlation values, three asterisks represent p < 0.001.In order to create this figure, the 1 million samples were thinned by a factor of 10.

12 Figure 6 .
Figure6.The 5σ confidence band (blue) around the mean of the predicted stresses from 50 000 parameter vectors from the Markov chains against the synthetic data (yellow dots).

Figure 7 .
Figure 7. Approximate posteriors and contour plots of the parameters for the SDFT data.Samples were thinned by a factor of 10.

Figure 8 .
Figure 8.The 5σ confidence band (blue) around the mean of the predicted stresses from 50 000 parameter vectors from the Markov chains against the SDFT data (yellow dots).

Figure 9 .
Figure 9. Approximate posteriors and contour plots of the parameters for the CDET data.Samples were thinned by a factor of 10.

(
2c − b − a)/(b − a) in figure13.This quantity can take values between −1 and 1 and is equal to 0 for an ST distribution.

Figure 11 .
Figure 11.Approximate posteriors and contour plots of the parameters of the GT model for the SDFT data.Samples were thinned by a factor of 10.

Figure 10 .
Figure 10.The confidence band (blue) around the mean of the predicted stresses from 50 000 parameter vectors from the Markov chains against the CDET data (yellow dots).

Figure 12 .
Figure 12.Approximate posteriors and contour plots of the parameters of the GT model for the CDET data.Samples were thinned by a factor of 10.

Table 1 .
Mean relative and absolute errors for the four models fitted to experimental tendon data.All values are given to three significant figures.

Table 2 .
Parameter values (to three significant figures) for each model's best fit to the four data sets.To avoid confusion with c in the HGO model, the modal fibril length in the GT model is written as c mode .