Whole-limb scaling of muscle mass and force-generating capacity in amniotes

Skeletal muscle mass, architecture and force-generating capacity are well known to scale with body size in animals, both throughout ontogeny and across species. Investigations of limb muscle scaling in terrestrial amniotes typically focus on individual muscles within select clades, but here this question was examined at the level of the whole limb across amniotes generally. In particular, the present study explored how muscle mass, force-generating capacity (measured by physiological cross-sectional area) and internal architecture (fascicle length) scale in the fore- and hindlimbs of extant mammals, non-avian saurians (‘reptiles’) and bipeds (birds and humans). Sixty species spanning almost five orders of magnitude in body mass were investigated, comprising previously published architectural data and new data obtained via dissections of the opossum Didelphis virginiana and the tegu lizard Salvator merianae. Phylogenetic generalized least squares was used to determine allometric scaling slopes (exponents) and intercepts, to assess whether patterns previously reported for individual muscles or functional groups were retained at the level of the whole limb, and to test whether mammals, reptiles and bipeds followed different allometric trajectories. In general, patterns of scaling observed in individual muscles were also observed in the whole limb. Reptiles generally have proportionately lower muscle mass and force-generating capacity compared to mammals, especially at larger body size, and bipeds exhibit strong to extreme positive allometry in the distal hindlimb. Remarkably, when muscle mass was accounted for in analyses of muscle force-generating capacity, reptiles, mammals and bipeds almost ubiquitously followed a single common scaling pattern, implying that differences in whole-limb force-generating capacity are principally driven by differences in muscle mass, not internal architecture.
In addition to providing a novel perspective on skeletal muscle allometry in animals, the new dataset assembled was used to generate pan-amniote statistical relationships that can be used to predict muscle mass or force-generating capacity in extinct amniotes, helping to inform future reconstructions of musculoskeletal function in the fossil record.


INTRODUCTION
Countless aspects of organismal biology vary with the size of the organism as a whole (Schmidt-Nielsen, 1985;Vogel, 2003). In a mechanical context, the differential scaling of experienced by larger animals, but this deficit is fully counteracted only in extreme cases of positive allometry, where the PCSA exponent exceeds 1.
It remains unknown as to how the aforementioned scaling patterns for individual muscles translate to scaling at the level of the whole limb. Likewise, it remains unknown as to whether architectural disparities observed between muscles, functional groups of muscles (e.g., ankle extensors) or even species remain at the level of the whole limb. The limbs of an animal must exist and function as a single integrated entity, which imposes at least two requirements on the constituent muscles:
1. The muscles collectively share a common volume, whereby changes in the size of one muscle may affect the size of adjacent muscles (Fig. 1A). For example, in order for the limb to avoid becoming too heavy (or having too high a rotational inertia), increase in the size of one muscle may necessitate a decrease in the size of another.
2. Architectural specialization of muscles can limit their ability to effectively contribute to a diverse range of tasks (Wilson & Lichtwark, 2011), and hence specialization in one muscle may necessitate concomitant change in another so that limb functionality is not compromised. For example, a muscle with short fascicles will have a high PCSA but a reduced working range, which may require another muscle to compensate by having longer fascicles, but with reduced PCSA (Fig. 1B).
Figure 1: The requirements on muscles as part of a single, functionally integrated whole may impose constraints on their construction. (A) Changes in the size of one muscle may necessitate change in the size of adjacent muscle, such that total muscle volume (illustrated here in cross-section) may remain relatively constant. (B) Functional specialization of one muscle's architecture may necessitate concomitant changes to the architecture of other muscles, in order for a limb to remain capable of effectively executing a diverse range of tasks; total physiological cross-sectional area (PCSA) may therefore remain relatively constant. In this example, moving from left to right muscle 1 decreases PCSA but muscle 2 increases PCSA, such that total PCSA remains unaltered. DOI: 10.7717/peerj.12574/fig-1
These two requirements imply that animal limbs are subject to a 'fascicle packing problem', and raise the question of how flexible (evolutionarily labile) animal limbs are in terms of their muscular anatomy. Are there different strategies for packing PCSA into a given volume of limb muscle, or do the differences observed between individual muscles and between species 'cancel out' at the level of the whole limb? Depending on how stringent the above requirements are, the total force-producing capacity (strength) of an animal's limb may therefore be strongly tied to total limb muscle mass, irrespective of a given species' size or functional requirements.
Understanding how whole-limb muscle mass and force-generating capacity relate to one another, and how this relation scales with body size, is not just pertinent to the study of extant species. Muscle force-generating capacity is a key unknown in studies of extinct animal function and behaviour, and empirical data from extant species play a vital role in informing inferences of extinct species (Bates & Falkingham, 2018; Bishop, Cuff & Hutchinson, 2021a; Fahn-Lai, Biewener & Pierce, 2020; Lautenschlager et al., 2018; Sellers et al., 2013). The use of empirical data from extant members of a specific clade may be appropriate for extinct members of that clade (e.g., using data derived from extant palaeognath birds to guide inferences of extinct palaeognath birds; Bishop, 2015), but it is not immediately clear how this should be approached for extinct species that are phylogenetically distant or morphologically disparate from extant species. In particular, the more phylogenetically distant or morphologically disparate an extinct taxon is, the lower the confidence that can generally be placed in inferences of muscle origin or insertion (Carrano & Hutchinson, 2002; Witmer, 1995), relative size, internal architecture and even whether a given muscle exists in the extinct taxon (e.g., differentiation from adjacent muscles). Developing an understanding of how muscle mass and force-generating capacity scale with body size across a wide range of species, even at a broad anatomical resolution such as the whole limb, could therefore provide a useful starting point for better informed inferences of extinct taxa. Even if just overall bulk and strength of limb musculature were able to be confidently constrained, this would represent a practical advance upon which future refinements could be made.
The present study sought to address the above outstanding issues by conducting a holistic assessment of whole-limb muscle mass and force-generating capacity in extant terrestrial amniotes. It had three key aims: (1) to contribute new data from hitherto unsampled clades (ameridelphian marsupials and lacertoid lizards), broadening the diversity of the available comparative dataset; (2) to investigate how muscles scale with body size at the level of the whole limb, in both the fore- and hindlimb, providing a first assessment of the stringency of the 'fascicle packing problem'; (3) to derive generic, amniote-wide statistical predictive relationships that would have utility in deriving inferences of musculature in extinct amniote species. This study is the first comparative synthesis of muscle architecture scaling across extant mammals, birds and non-avian saurians. In addition to providing a novel perspective on the topic of muscle scaling in animals, the results from this study provide a platform for more rigorous inference of muscle strength in extinct amniote clades.

MATERIALS & METHODS

Dataset
The present study is founded upon muscle architecture data derived from dissections (Table 1); all raw data used are provided in Table S1. The sources of the data are described in the following two subsections. Although this study addresses 'whole limb' scaling, it is restricted in scope to the musculature crossing the three primary joints of the limbs (shoulder/hip, elbow/knee and wrist/ankle), ignoring the intrinsic musculature of the manus and pes. This was necessitated by the practical difficulties of accurately dissecting and measuring the latter muscles (particularly for small species), such that data are very rarely reported in the comparative literature. Ignoring manual and pedal muscles is also justified given that they comprise a small fraction of total limb musculature, and presumably contribute only a small fraction towards limb support and propulsion during locomotor activities. Although extrinsic shoulder muscles attaching to the scapula can be important for locomotion in therian mammals (Hudson et al., 2011a;Jenkins & Weijs, 1979), these were excluded from consideration to facilitate a fair comparison across all species; only muscles that explicitly attached to the humerus or more distal forelimb skeleton were included. The forelimb dataset comprised architectural measurements for a total of 912 muscles in 31 species (21 mammals and 10 non-avian saurians, the latter hereafter referred to as 'reptiles' for simplicity) spanning more than four orders of magnitude in body mass, and the hindlimb dataset comprised measurements for a total of 1,181 muscles in 36 species (19 mammals, 12 reptiles and five birds) spanning almost five orders of magnitude in body mass. In the hindlimb dataset, birds and humans were collectively treated as a single group, 'bipeds', such that 'mammals' herein refer to all mammals except humans. 
Given the small sample size (and taxonomic skew) of birds in the current dataset, and the broad aims of the study, it was not considered justifiable to analyze birds and humans as separate entities here. Furthermore, as the present study concerns terrestrial amniotes, the forelimb of bipeds was not investigated.
For each muscle, its physiological cross-sectional area (PCSA) was calculated as
$$\mathrm{PCSA} = \frac{m_{\mathrm{muscle}} \cos \alpha_o}{\rho \, \ell_o} \qquad (1)$$
where $m_{\mathrm{muscle}}$ is belly mass, $\alpha_o$ is pennation angle, $\ell_o$ is fascicle (or 'fibre') length and $\rho$ is muscle tissue density, the latter taken to be constant for vertebrate skeletal muscle at 1,060 kg/m³ (Hutchinson et al., 2015; Mendez & Keys, 1960). It is important to note that this equation assumes that all fascicles of a muscle act in parallel in generating force, allowing their individual cross-sectional areas to be summed. In reality, the constituent fibres of a given fascicle are often shorter than the fascicle itself, wherein their ends overlap or interdigitate (Gaunt & Gans, 1992; Infantolino, Neuberger & Challis, 2012), although they may still activate simultaneously with one another, thus functioning as a single fibre (Bodine et al., 1982). Additionally, as per common practice, it was assumed that measurements of fascicle length and pennation angle corresponded to the muscle's optimal fibre length and optimal pennation angle, respectively (Zajac, 1989).
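As a minimal sketch of the PCSA calculation described above (belly mass multiplied by the cosine of pennation angle, divided by the product of density and fascicle length), the following Python function works in SI units. The function name, argument names and the example muscle are illustrative assumptions; they are not taken from the study's Supplemental Information, which is written in R.

```python
import math

MUSCLE_DENSITY = 1060.0  # kg/m^3, the constant density quoted in the text


def pcsa(mass_kg, fascicle_length_m, pennation_deg, density=MUSCLE_DENSITY):
    """Physiological cross-sectional area in m^2: m * cos(alpha) / (rho * l)."""
    if fascicle_length_m <= 0:
        raise ValueError("fascicle length must be positive")
    cos_alpha = math.cos(math.radians(pennation_deg))
    return mass_kg * cos_alpha / (density * fascicle_length_m)


# Hypothetical muscle (not from the dataset): 10.6 g belly, 50 mm fascicles,
# parallel-fibred (0 degrees pennation); gives roughly 2.0e-4 m^2 (2 cm^2).
example_pcsa = pcsa(0.0106, 0.05, 0.0)
```

Keeping all inputs in SI units avoids the unit conversions that are a common source of error when masses are recorded in grams and fascicle lengths in millimetres.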

Published data
The majority of data used in this study were sourced from previously published comparative studies on muscle architecture, biomechanics or scaling (see Table 1 for references; see also Martin et al., 2020 for a review of methods used to measure muscle architecture). Unfortunately, many of the earlier studies on muscle architecture scaling cited in the Introduction did not report their raw measurements and so their data were unable to be incorporated into a taxonomically richer analysis. Literature data were selected according to three requirements:
1. All or almost all of the muscles of the limb had been measured and reported (save the manual and pedal muscles, as noted above), since the overarching aim of the present study is considering the whole limb. Studies in which a few small muscles (e.g., deep muscles such as the gemellus, quadratus femoris or popliteus) were not reported were still included, since the omission of such small muscles is expected to have minimal effect on the overall results. Studies that measured multiple muscles under a single name (e.g., multiple heads of the flexor carpi radialis as a single muscle) were also included, since this nevertheless accounts for all the muscle mass present. In contrast, studies that did not report one or more major muscles were excluded from consideration. All datasets ultimately selected for use in the present study included all major extensor and adductor (antigravity) muscles of the limb. The specific muscles included (and which, if any, were excluded) in a given source study are outlined in Table S1.
2. A few previous studies had neglected to measure (or at least report) pennation angle, but to maximize consistency across the present dataset these studies were excluded. Studies that reported dry muscle mass only were also excluded.
3. The data reported were for adult or large-sized individuals, to reduce potential confounding effects of ontogenetic variation (Table S1). In approximately two-fifths of species sampled, multiple similarly-sized individuals had been investigated, but the data reported by the relevant studies were only presented as a species mean, wherein a given architectural parameter for a given muscle had already been averaged across the individuals studied; in these cases, the mean body mass of that sample was used. For all other species, architectural and body mass data were reported for separate individuals. When data for multiple individuals of a given species had been reported separately (e.g., Lamas, Main & Hutchinson, 2014; Allen et al., 2015; Martin et al., 2019), those for the largest individual were used, to reduce possible ontogenetic effects. This approach was deemed more appropriate than computing a species average across all individuals, because in several instances the sample of individuals investigated by prior studies (especially those focused on ontogeny) exhibited high disparity in body sizes, undermining the value of a species mean; moreover, such an arithmetic mean would not account for ontogenetic allometry within the species, and hence could introduce further error into the analyses. Ultimately, each species contributed only a single datapoint to the analyses.
Table 1 note: List of species with architectural data for individual muscles, the major group they belong to (for the purposes of the statistical analyses conducted here), body mass, whether a given species contributed to the forelimb ('fore') or hindlimb ('hind') datasets, and studies in which the data were originally published. Note that birds and humans were analysed together in this study as 'bipeds'. See Table S1 for additional species and studies that contributed data on muscle mass only.
Data were also sourced for a further 15 species from six additional studies, which had reported just muscle mass (Table S1). These studies either reported mass of each individual muscle separately (Hudson et al., 2011a;Hudson et al., 2011b;Ogihara et al., 2009;Payne et al., 2006;Zihlman, McFarland & Underwood, 2011), or reported total muscle mass for the limb as a whole (Grand, 1977), and helped further increase taxonomic coverage in the final dataset.

New data
To broaden the taxonomic diversity of the dataset, and contribute novel data that future studies may draw upon, dissections were undertaken on a single adult individual each of the Virginia opossum (Didelphis virginiana) and the Argentine black and white tegu (Salvator merianae). The former is an ameridelphian marsupial, and apart from a single species of bandicoot (australidelphian; Martin et al., 2019) is the only other marsupial in the dataset; the latter is a lacertoid lizard, representing a hitherto unsampled part of squamate phylogeny. Intact whole carcasses of wild individuals were obtained as part of a prior study (Fahn-Lai, Biewener & Pierce, 2020), sourced from Worcester County, Massachusetts (D. virginiana, Massachusetts Division of Fisheries and Wildlife) and Everglades National Park, Florida (S. merianae, Daniel Beard Center, United States Geological Survey). Upon acquisition, the specimens were immediately frozen at −20 °C, and subsequently fully thawed prior to dissection and architectural measurement. The right fore- and hindlimbs were dissected in both cases. Muscle architecture measurement followed standard dissection procedures (Allen et al., 2015; Biewener & Full, 1992; Fahn-Lai, Biewener & Pierce, 2020), using a magnifying lamp where necessary. To minimize desiccation of the fresh specimens, tissues were kept moistened with paper towel soaked in saline solution throughout. Muscle belly mass was measured with an electronic balance (Precisa 320 XB; Precisa Gravimetrics AG, Dietikon; 0.0001 g precision), fascicle length with digital calipers (0.01 mm precision) and pennation angle with a transparent protractor (1° precision). Measurements of fascicle lengths and pennation angles were made at random locations throughout a muscle belly, and depending on muscle size up to ten measures each were made, after which the arithmetic mean was derived.

Anatomical comparisons
Four main, and two subsidiary, comparisons were undertaken for both fore- and hindlimbs, as described below. For each comparison, patterns were examined for the limb as a whole, as well as just the proximal and distal limb muscles separately, given that some previous studies have noted scaling differences between the proximal and distal limb for individual muscles or functional groups (Alexander et al., 1981; Cuff et al., 2016a, 2016b; Dick & Clemente, 2016; Eng et al., 2008). 'Proximal' muscles were classified as those in which the majority of their bulk resides proximal to the elbow (forelimb) or knee (hindlimb) joints, whereas 'distal' muscles have the majority of their bulk residing distal to those joints. This volume-based approach is more relevant to the 'fascicle packing problem', and avoids the complications caused by flexor or extensor muscles crossing the joints involved. Furthermore, given that a substantial component of hindlimb locomotor muscle in reptiles is the caudofemoralis longus (CFL), which chiefly resides in the proximal tail, whole hindlimb and proximal hindlimb analyses were also re-run with this muscle excluded from the reptile sample. In addition to providing more nuanced insight into questions of scaling and fascicle packing, these additional variations provide greater sophistication to predictive relationships derived from the data. Ultimately, eight different sub-analyses were performed for each anatomical comparison (48 in total).

#1-Total muscle mass ($\Sigma m_{\mathrm{muscle}}$) v. body mass ($m_{\mathrm{body}}$)
This comparison examines how much of total biomass is invested in limb musculature; under isometry, the scaling exponent would be 1. It ignores the potential for systematic variation in relative body composition in terms of other tissue types (bone, integument, etc.), but is nonetheless informative because it focuses on one of the tissues primarily involved in body support and propulsion during terrestrial locomotion.

#2-Mean size-normalized isometric strength v. $m_{\mathrm{body}}$
Following calculation of PCSA as per Eq. (1), the arithmetic mean across all muscles was taken. Although PCSA is an estimator of maximal isometric force-generating capacity of a muscle, as a measure of area it is not a particularly intuitive descriptor of force, in and of itself. To present mean PCSA in a more tangible form, it was converted to maximal isometric force in multiples of body weight (BW):
$$F^{*}_{\max} = \frac{\sigma \cdot \mathrm{PCSA}}{m_{\mathrm{body}} \, g} \qquad (2)$$
where $\sigma$ is maximal isometric stress, assumed here as 300,000 N/m² (Bates & Falkingham, 2018; Hutchinson, 2004; Medler, 2002), and $g$ is the acceleration due to gravity. Dividing by body mass means that under isometry the scaling exponent is that of PCSA minus 1.0 (i.e., ⅔ − 1 = −⅓), which in turn makes it more straightforward to interpret in the context of scaling. That is, PCSA can scale with positive allometry and yet force-generating capacity can decline with increasing size if ⅔ < PCSA exponent < 1, whereas the sign of the exponent for $F^{*}_{\max}$ is immediately indicative of whether a strength deficit exists at larger body size: only if it is positive is relative force-generating capacity (i.e., relative performance) maintained with increasing size. It should be noted that using a different value for $\sigma$ would not alter the resulting scaling exponent.
The relative masses and force-generating capacities of individual limb muscles typically do not follow an even or symmetrical distribution (Table S1), such that the analysis of means as above may be influenced by one or two exceedingly strong, or weak, muscles in the limb. Thus, the median PCSA for all muscles was also computed, and converted to $F^{*}_{\max}$ as per Eq. (2).

#2c-Total $F^{*}_{\max}$ v. $m_{\mathrm{body}}$
In the current dataset there is considerable variation in the number of individually measured and reported muscles for a given species. These differences can reflect investigator judgement in the splitting or grouping of muscle heads for measurement, but can also be due to legitimate anatomical differences between species; for example, crocodylians have a single long digital flexor in the hindlimb, whereas birds have up to six. Variation in the number of muscles may influence the mean or median PCSA, and so to account for this the total PCSA was also computed and converted to $F^{*}_{\max}$ as per Eq. (2).
Again, under isometry the scaling exponent would be −⅓.
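The three strength summaries (mean, median and total) can be sketched together as follows. This is a hedged illustration only: the function names and the three-muscle limb are hypothetical, and the value of g (9.81 m/s²) is an assumption, since the text does not state the value used.

```python
from statistics import mean, median

SIGMA = 300_000.0  # N/m^2, maximal isometric stress assumed in the text
G = 9.81           # m/s^2, gravitational acceleration (value assumed here)


def fmax_star(pcsa_m2, body_mass_kg, sigma=SIGMA, g=G):
    """Maximal isometric force in body weights: sigma * PCSA / (m_body * g)."""
    return sigma * pcsa_m2 / (body_mass_kg * g)


def limb_strength_summary(pcsa_values_m2, body_mass_kg):
    """Mean, median and total size-normalized strength for one limb."""
    return {
        "mean": fmax_star(mean(pcsa_values_m2), body_mass_kg),
        "median": fmax_star(median(pcsa_values_m2), body_mass_kg),
        "total": fmax_star(sum(pcsa_values_m2), body_mass_kg),
    }


# Hypothetical limb of a 3 kg animal with three muscles (PCSA in m^2);
# the single strong muscle pulls the mean above the median.
summary = limb_strength_summary([1.0e-4, 2.0e-4, 6.0e-4], 3.0)
```

Because the conversion is linear in PCSA, the total is simply the mean multiplied by the number of muscles, which is why the total is insensitive to how an investigator splits or lumps muscle heads whereas the mean and median are not.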

#3-Characteristic fascicle length v. $m_{\mathrm{body}}$
Fascicle length is almost ubiquitously investigated in studies of muscle architecture scaling, yet in a broad comparison across amniotes it is not sensible to investigate fascicle length in and of itself, either for individual muscles, functional groups or for the whole limb. This is due to the great variation that can exist in muscle size, origins, insertions and lengths, as well as differences of subdivision (differentiation) between different taxonomic groups, which can all lead to marked variation in fascicle length irrespective of body size. An alternative approach is to compute a single 'characteristic fascicle length' for the limb as a whole, as the weighted harmonic mean of the fascicle lengths of each individual muscle (cf. Alexander et al., 1981):
$$L^{*} = \frac{\sum_i \mathrm{PCSA}_i \, \ell_{o,i}}{\sum_i \mathrm{PCSA}_i} \qquad (3)$$
This effectively replaces the musculature of the whole limb (or proximal or distal compartment thereof) with a single equivalent muscle. Note that in using PCSA, Eq. (3) weights fascicle lengths by both muscle mass and $\cos(\alpha_o)$, and thus the equivalent whole-limb muscle also factors in pennation. Furthermore, it can be seen that $L^{*}$ is approximately inversely proportional to mean relative (mass-normalized) PCSA (Eng et al., 2008). As for 'real' fascicle length, the scaling exponent under isometry would be ⅓.
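A minimal sketch of the characteristic fascicle length follows, assuming the weights of the harmonic mean are muscle mass multiplied by the cosine of pennation angle, as the text describes. The second function evaluates the algebraically equivalent PCSA-weighted form, which follows because PCSA multiplied by fascicle length is proportional to that same weight. All names and example values are hypothetical.

```python
import math


def characteristic_fascicle_length(masses, fascicle_lengths, pennation_deg):
    """Harmonic mean of fascicle length, weighted by m * cos(alpha)."""
    weights = [m * math.cos(math.radians(a))
               for m, a in zip(masses, pennation_deg)]
    return sum(weights) / sum(w / length
                              for w, length in zip(weights, fascicle_lengths))


def pcsa_weighted_length(masses, fascicle_lengths, pennation_deg, density=1060.0):
    """The same quantity written as sum(PCSA * l) / sum(PCSA)."""
    pcsas = [m * math.cos(math.radians(a)) / (density * length)
             for m, length, a in zip(masses, fascicle_lengths, pennation_deg)]
    return sum(p * length for p, length in zip(pcsas, fascicle_lengths)) / sum(pcsas)


# Hypothetical two-muscle limb: equal mass, no pennation, 20 mm and 60 mm
# fascicles. The harmonic mean (about 30 mm) sits below the arithmetic mean.
l_star = characteristic_fascicle_length([0.01, 0.01], [0.02, 0.06], [0.0, 0.0])
```

The harmonic form gives more influence to short-fascicled muscles, which is appropriate here because those muscles contribute disproportionately to total PCSA.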

#4-$\Sigma\mathrm{PCSA}$ v. $\Sigma m_{\mathrm{muscle}}$
By removing the context of $m_{\mathrm{body}}$, this provides a direct assessment of the 'fascicle packing problem', by testing how whole-limb force-generating capacity compares against the amount of available muscle mass. Under isometry, the scaling exponent would be ⅔; note that isometry may reflect scale invariance of pennation angle (which would be theoretically expected; see also Dick & Clemente, 2017; Cieri, Dick & Clemente, 2020), fascicle length, or a more complex combination of both parameters. Comparison #3 also addresses the fascicle packing problem, but indirectly.
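The isometric expectation of ⅔ can be recovered from the definition of PCSA under geometric similarity. The short derivation below is a sketch that assumes constant muscle density, a scale-invariant pennation angle, and fascicle length scaling as a linear dimension of the muscle mass; it treats the limb musculature as a single equivalent muscle.

```latex
\begin{align*}
\ell_o &\propto \left(\Sigma m_{\mathrm{muscle}}\right)^{1/3}, \\
\Sigma\mathrm{PCSA} &\propto \frac{\Sigma m_{\mathrm{muscle}}\,\cos\alpha_o}{\rho\,\ell_o}
  \propto \frac{\Sigma m_{\mathrm{muscle}}}{\left(\Sigma m_{\mathrm{muscle}}\right)^{1/3}}
  = \left(\Sigma m_{\mathrm{muscle}}\right)^{2/3}.
\end{align*}
```

An observed exponent above ⅔ therefore implies that fascicles lengthen more slowly than geometric similarity predicts, that pennation changes with size, or both.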

Statistical analyses
All data processing and analyses were conducted in R v.4.1.0 (R Core Team, 2021), Harmon et al., 2008) and 'nlme' (v. 3.1; Pinheiro et al., 2021) packages; the full set of code and data used are provided in the Supplemental Information. Data were logarithmically transformed (base 10) prior to analysis, facilitating the use of linear statistical models. Phylogenetically informed statistical analyses were conducted using a single, fully resolved, time-calibrated phylogenetic tree of the study taxa (Fig. S1). The tree was generated using the TimeTree database (www.timetree.org; Hedges, Dudley & Kumar, 2006), which included all taxa except for Isoodon fusciventer; this was substituted with Isoodon obesulus (the only species of Isoodon in the database), which has no effect on divergence times with respect to other taxa in the present study.
For each of the above anatomical comparisons (for whole-limb and proximal and distal compartments), allometric scaling equations of the form $\log_{10} Y = \log_{10} A + B \log_{10} X$ were derived for mammals, reptiles and bipeds separately, using phylogenetic generalized least squares (pGLS or 'phylogenetic regression'; Smaers & Rohlf, 2016) to determine slopes (exponents, $B$) and intercepts ($\log_{10} A$). This simultaneously estimated the λ parameter of Pagel (1999) using maximum likelihood, offering greater flexibility than a strict Brownian motion model of evolution in accounting for phylogenetic signal in the data. Additionally, the 95% confidence interval (CI) of the slope was determined, using the t-distribution, to facilitate comparison against the slope expected under isometry: if the expected exponent fell outside of the CI, the scaling was deemed to significantly depart from isometry at the P = 0.05 level. Subsequently, all groups were collated together and a new pGLS fit was computed to generate a 'pan-amniote' regression for a given anatomical comparison (again for whole-limb and proximal and distal compartments), thus deriving a statistical predictive framework that can be applied to extinct species. In addition to estimator coefficients, 95% CIs and prediction intervals were also derived (Smaers & Rohlf, 2016), which can provide error margins for future estimations.
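The logic of fitting a log-log line and testing its slope against isometry can be illustrated with a deliberately simplified sketch. Note the assumption: this uses ordinary least squares, which corresponds only to the non-phylogenetic case (λ = 0), and is not the pGLS procedure used in the study. The function names and data are hypothetical, and the critical t value must be supplied by the user for the relevant degrees of freedom.

```python
import math


def fit_allometry(x, y):
    """OLS fit of log10(y) = log10(A) + B*log10(x).

    Returns (log10_A, B, standard error of B). No phylogenetic correction.
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((v - intercept - slope * u) ** 2 for u, v in zip(lx, ly))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope


def departs_from_isometry(slope, se_slope, isometric_slope, t_crit):
    """True if the isometric exponent lies outside slope +/- t_crit * SE."""
    lower = slope - t_crit * se_slope
    upper = slope + t_crit * se_slope
    return not (lower <= isometric_slope <= upper)


# Invented data (not from Table S1): body mass and total limb muscle mass, kg
body = [0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0]
muscle = [0.011, 0.09, 1.1, 9.0, 110.0, 900.0]
log_a, b, se_b = fit_allometry(body, muscle)
# Two-tailed 95% critical t for df = n - 2 = 4 is 2.776
significant = departs_from_isometry(b, se_b, 1.0, t_crit=2.776)
```

With these invented data the fitted slope is close to 1 and its confidence interval includes 1, so the scaling would be classed as indistinguishable from isometry.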
To ascertain whether mammals, reptiles or bipeds exhibited different allometric trajectories for a given anatomical comparison, a phylogenetic analysis of covariance (pANCOVA) was performed, which tested for differences in slope and intercept between groups (both separately and together; Smaers & Rohlf, 2016). Here the off-diagonal elements of the variance-covariance matrix were scaled by the λ parameter estimated during calculation of the pan-amniote regression, to minimize false negatives caused by an overly conservative assumption of strict Brownian motion. The analyses were also run without accounting for phylogeny (ANCOVA, λ = 0), to evaluate the effect of phylogenetic relatedness on the results, given that the mammal, reptile and biped samples are nearly mutually exclusive phylogenetically (Fig. S1). Statistical significance was set at P = 0.05.
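The λ scaling of the variance-covariance matrix described above amounts to multiplying its off-diagonal elements by λ while leaving the diagonal untouched. The following is a minimal sketch of that single step, with a hypothetical function name and a made-up three-taxon matrix; restricting λ to the range 0 to 1 is an assumption made here for simplicity.

```python
def scale_by_lambda(vcv, lam):
    """Return a copy of a square variance-covariance matrix with its
    off-diagonal elements multiplied by lam; the diagonal is unchanged."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda is assumed to lie between 0 and 1 here")
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical three-taxon tree of depth 10: two sister taxa share 8 units
# of history, and the third taxon shares none with either.
vcv = [[10.0, 8.0, 0.0],
       [8.0, 10.0, 0.0],
       [0.0, 0.0, 10.0]]
no_signal = scale_by_lambda(vcv, 0.0)  # diagonal matrix: ordinary ANCOVA
brownian = scale_by_lambda(vcv, 1.0)   # unchanged: strict Brownian motion
```

Intermediate values of λ shrink the expected covariance between related taxa, which is how the analysis sits between ordinary ANCOVA and a strict Brownian motion pANCOVA.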

RESULTS
Scaling results and exponents are presented graphically in Figs Tables S4-S9. Coefficients for the pan-amniote regressions are reported in Table 6, with coefficients for hindlimb analyses excluding the CFL from the reptile dataset presented in Table S10.

Total muscle mass v. body mass
Reptiles almost ubiquitously show negative allometry, whereas mammals and bipeds do not show significant departure from isometry (Fig. 2). The one exception to this generalization is negative allometry in the distal hindlimb of mammals: larger mammals tend to have relatively lighter (less muscled) distal hindlimbs. When the CFL is excluded, the values of the slopes (and CIs) for reptiles change minimally, but this is sufficient to render the revised scaling statistically indistinguishable from isometry (Fig. S4). Analyses of covariance (without accounting for phylogeny) indicate that mammals, reptiles and bipeds each exhibit significantly different allometric trajectories, in terms of both slope and intercept (Table 2, Table S4). Reptiles have a lower slope and intercept compared to mammals, indicating that they have proportionately lower muscle mass, especially at large body sizes. In contrast, bipeds have proportionately greater hindlimb muscle mass than both mammals and reptiles, especially at larger body size and particularly in the distal limb. Many of these differences disappear (i.e., become statistically non-significant) once phylogeny is taken into consideration using the pANCOVA.
Results are plotted on logarithmic coordinates, along with phylogenetic regression (pGLS) and 95% confidence intervals (CIs). Red = mammals, blue = reptiles, green = bipeds. Note the difference in vertical
hindlimb (Fig. S5). Notably, bipeds exhibit strong positive allometry throughout the hindlimb, particularly in the distal limb (Fig. 3); notwithstanding the small sample size, mean size-normalized force-generating capacity appears to be scale-invariant. Analyses of covariance without accounting for phylogeny indicate that mammals, reptiles and bipeds frequently exhibit significantly different allometric trajectories, especially in terms of slope (Table 3, Table S5). Differences in intercept were mostly detected for the forelimb, where reptiles exhibit a markedly lower intercept than mammals, indicating lower mean force-generating capacity regardless of body size. Most of the differences between mammals and reptiles disappear once phylogeny is taken into consideration using pANCOVA; in contrast, most differences between mammals and bipeds, and reptiles and bipeds, were retained following phylogenetic correction, attesting to strong allometric deviation in the biped sample.

Characteristic fascicle length ($L^{*}$) v. body mass
In the forelimb, both mammals and reptiles exhibit negative allometry, although in reptiles this is not statistically significant in the distal limb, due to wide CIs in that instance (Fig. 4). Mammals and bipeds also exhibit negative allometry in the hindlimb, although in mammals this is driven by the distal limb only; mirroring patterns noted above, bipeds show a stronger departure from isometric scaling in the distal limb. Reptiles do not show any significant departure from isometry in the hindlimb, a result that remains unaltered when the CFL is excluded from the dataset (Fig. S8). In stark contrast to the previous comparisons, ANCOVA (without accounting for phylogeny) revealed almost no significant difference among mammals, reptiles or bipeds (Table 4, Table S8). The only difference detected (which disappeared following phylogenetic correction using pANCOVA) was that mammals have a lower intercept in the distal hindlimb compared to both reptiles and bipeds, indicating that, overall, the distal hindlimb of mammals possesses proportionately shorter muscle fascicles.

Total PCSA v. total muscle mass
Mammals and bipeds almost ubiquitously exhibit positive allometry, whereas reptiles generally do not exhibit statistically significant departures from isometry (Fig. 5). Two exceptions to this are mammals not showing a significant departure from isometry in the distal forelimb, and reptiles exhibiting positive allometry in the forelimb as a whole.

Pan-amniote regression
Coefficients for the pan-amniote regressions (Table 6) provide an empirical basis for estimating some important measures of muscle mass and force-generating capacity in extinct terrestrial amniotes. Note that the coefficients reported here were computed to the exclusion of bipeds, given that bipeds have been shown above to frequently differ from quadrupeds, and that the majority of species throughout tetrapod history were quadrupedal. The code provided in the Supplemental Information enables predictive relationships to be derived that include bipeds in the dataset. In addition to the coefficients, mean percent prediction error (%PE) is also reported, expressed in terms of the original, non-log-transformed dimensions of the data (Smith, 1980); this provides an alternative to prediction intervals (see SI code) as a way of deriving upper and lower estimates for a given taxon. Coefficients computed when the CFL was excluded from the reptile dataset (Table S10) are largely similar to those reported in Table 6, although notably mean %PE is generally higher.
Table note: Each pairwise comparison was tested for differences in slope (S), intercept (I) and slope and intercept (S + I). Results for analyses without controlling for phylogeny are also presented (ANCOVA, †); significant results are in boldface; df = degrees of freedom.
Note: These are reported for data on a log10 scale; also reported is the mean percent prediction error (%PE), expressed in original (antilog) terms. Results for hindlimb analyses where the caudofemoralis longus was excluded from the reptile dataset are reported in Table S10.
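To show how a pan-amniote regression might be applied to an extinct taxon, the sketch below back-transforms a log10 regression and derives rough upper and lower estimates from a mean %PE. Several assumptions apply: the coefficients are placeholders and are NOT values from Table 6, %PE is taken as the absolute difference between observed and predicted values as a percentage of the predicted value, and the bounds are treated as symmetric about the estimate. All function names are hypothetical.

```python
def predict_from_allometry(x, log10_a, b):
    """Back-transform log10(Y) = log10(A) + B*log10(X) to Y = A * X**B."""
    return (10.0 ** log10_a) * x ** b


def percent_prediction_error(observed, predicted):
    """Absolute prediction error as a percentage of the predicted value."""
    return abs(observed - predicted) / predicted * 100.0


def bounds_from_mean_pe(predicted, mean_pe):
    """Lower and upper estimates from a mean %PE (symmetry is assumed)."""
    return predicted * (1 - mean_pe / 100.0), predicted * (1 + mean_pe / 100.0)


# PLACEHOLDER coefficients for illustration only; NOT taken from Table 6.
LOG10_A, B, MEAN_PE = -1.0, 1.0, 25.0
estimate = predict_from_allometry(50.0, LOG10_A, B)  # X = 50 kg body mass
low, high = bounds_from_mean_pe(estimate, MEAN_PE)
```

In practice the coefficients and mean %PE would be read from the relevant row of Table 6 (or Table S10 when the CFL is excluded), using the same units as the original regression.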

DISCUSSION
Through a synthesis of new dissection data with previously published results, the present study aimed to holistically appraise limb muscle scaling in terrestrial amniotes. In particular, it sought to investigate how muscle mass (size) and force-generating capacity (strength) scale at the level of the whole limb, to explore whether this reflects previously observed patterns noted for individual muscles or functional groups across disparate clades, and to assess how tightly constrained amniote limbs are in terms of whole-limb muscular composition. A subsidiary objective was to generate statistical relationships that have predictive value in inferring function in extinct amniotes.
In synthesizing data from numerous studies, it must be acknowledged that the resulting dataset will likely contain a certain level of 'noise' due to various sources. One pertinent source of error is the measurement of fascicle length in fresh specimens: as was the case in the present study, this is typically undertaken following removal of a given muscle from the limb, whereupon the fascicles may non-systematically deviate from a 'reasonable' or functionally relevant length. This problem may be partially mitigated by using formalin-fixed specimens, where the limb joints can be locked in physiologically realistic poses prior to fixation (thus limiting fascicle length change after the muscle is removed), although this approach can involve its own set of challenges, such as muscle shrinkage (Kikuchi & Kuraoka, 2014; Martin et al., 2020). Digital methods that permit architectural measurement in situ, such as contrast-enhanced scanning and automated fascicle tracking (e.g., Sullivan et al., 2019), may be able to avoid these issues, but thus far are limited in spatial scale and hence anatomical and taxonomic scope. Additional sources of noise in the present study include variation in the approaches of prior studies, in terms of investigator, occasional exclusion of some small muscles, subjective subdivision of muscle complexes and so on. Nonetheless, given the wide diversity and size range of species covered here, this was considered acceptable for the present study's broad scope.

Mass and force-generating capacity scaling across terrestrial amniotes
In terms of total muscle mass (Fig. 2), the present study found that reptile (i.e., non-avian saurian) limbs typically show negative allometry with respect to body mass (exponents 0.884 to 0.939), whereas mammals (exponents 0.989 to 1.019) and bipeds (exponents 1.104 to 1.109, but wide CIs) typically display isometry. This is in partial agreement with prior studies, where weak negative to modest positive allometry has been recovered (exponents 0.9 to 1.15 across all amniotes; see references in Introduction). The differences in findings may be due to multiple factors, including variation in sample sizes affecting CI calculation, the line-fitting approach used and even the species that contribute to the underlying datasets. They may also reflect a genuine biological phenomenon, where allometric patterns observed for a select few muscles (i.e., those which extend the range of exponents recovered by previous studies) are 'cancelled out' by those of many other muscles, when considered together at the level of the whole limb. Irrespective of the proximate cause(s), it is evident that reptiles have proportionately less muscle mass than mammals or bipeds, and bipeds have proportionately greater hindlimb muscle mass, especially in the distal limb. Neither result is surprising, given that the reptiles sampled have short limbs and a long, massive tail that contributes substantially toward total body mass, and that bipeds (by virtue of being bipeds) ought to invest a greater fraction of body mass into longer, more heavily muscled hindlimbs.
Muscle force-generating capacity in the present study was expressed in a more intuitive fashion by normalizing to units of body weight. Thus, comparing exponents derived here with those of prior studies that examined PCSA directly requires subtracting 1.0 from the PCSA exponent (or, conversely, adding 1.0 to the normalized exponent). Three different measures of normalized force-generating capacity (F*max) were explored here: mean (Fig. 3), median (Fig. S2) and total (Fig. S3). All three showed a consistent overarching set of patterns with respect to body mass. Mammals and reptiles overall exhibit isometric scaling, although in some circumstances mammals tended towards positive allometry (total range of exponents −0.30 to −0.245) whereas reptiles tended towards negative allometry (total range of exponents −0.465 to −0.302). This is partially consistent with the results of prior studies (corrected exponents −0.31 to −0.09; see references in Introduction). Again, discrepancies in findings may reflect the different spatial scales concerned; for example, the high end of previously reported exponents derives from the plantaris muscle of the mammalian hindlimb (Pollock & Shadwick, 1994) and the proximal forelimb of felids (Cuff et al., 2016a). Of note, reptiles often displayed a markedly lower intercept than mammals or bipeds, indicative of reduced relative force-generating capacity and again consistent with possessing short limbs and massive tails. It therefore follows that how 'body mass' is gauged (total body mass, body mass excluding tail, etc.) will influence interpretations of interspecific differences in force-generating capacity, although from a purely mechanical perspective total body mass is appropriate in the context of terrestrial locomotion, since ultimately the limbs must support and propel all of it.
Bipeds were found to almost ubiquitously display positive allometry in muscle force-generating capacity, and were frequently found to significantly differ from mammals and reptiles even after phylogeny was accounted for (in 13 of 15 comparisons with mammals, all comparisons with reptiles); again, this can be related to the greater investment of biomass in the hindlimbs of these species. Notably, positive allometry was exceedingly strong in the distal hindlimb (total range of exponents −0.016 to 0.025), to the point that relative force-generating ability may increase with increasing body size. This is comparable to exponents reported for bipedal hopping macropods (corrected exponents −0.22 to 0.33; Bennett & Taylor, 1995; McGowan, Skinner & Biewener, 2008), although, intriguingly, it is higher than the exponents recovered by Maloiy et al. (1979) for the ankle extensors of a selection of ground-dwelling birds (corrected exponent −0.24 to −0.19). These results should be viewed with some caution, given that only six species were examined in the present study. Nonetheless, they raise the interesting question of whether extreme positive allometry would persist at larger body sizes beyond the largest extant biped, the ostrich (Struthio camelus). For example, how would muscle strength fare in multi-tonne non-avian theropod dinosaurs, and what implications may this have for locomotor performance? These are questions that must await future investigation, but the comparative dataset marshalled here can provide an empirical basis for such efforts.
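The exponent correction used when comparing body-weight-normalized force with directly measured PCSA can be made concrete with a short Python sketch. It relies on the standard geometric-similarity expectation that PCSA scales with body mass to the power 2/3, so normalized force is isometric at an exponent of −1/3. The helper names and the rule of classifying allometry by whether a confidence interval excludes the isometric value are assumptions for illustration, and the confidence intervals passed to the classifier are hypothetical.

```python
ISOMETRIC_NORMALIZED_EXPONENT = -1.0 / 3.0  # PCSA ~ m^(2/3), divided by weight ~ m^1


def pcsa_to_normalized(pcsa_exponent):
    """Convert an exponent for PCSA v. body mass to one for force per body weight."""
    return pcsa_exponent - 1.0


def normalized_to_pcsa(normalized_exponent):
    """Convert an exponent for force per body weight back to one for PCSA."""
    return normalized_exponent + 1.0


def classify_allometry(ci_lower, ci_upper, isometric=ISOMETRIC_NORMALIZED_EXPONENT):
    """Assumed rule: allometry is inferred only if the CI excludes the isometric value."""
    if ci_lower > isometric:
        return "positive allometry"
    if ci_upper < isometric:
        return "negative allometry"
    return "isometry"


# Range of exponents reported in the text for the biped distal hindlimb.
biped_distal_hindlimb = (-0.016, 0.025)
as_pcsa = tuple(round(normalized_to_pcsa(e), 3) for e in biped_distal_hindlimb)
print(as_pcsa)

# Hypothetical confidence intervals, for illustration only.
print(classify_allometry(-0.10, 0.12))
print(classify_allometry(-0.40, -0.28))
```

Expressed as PCSA exponents, the biped distal hindlimb values sit close to 1.0, which is why relative force-generating capacity in that compartment is roughly constant or even increasing with body mass.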
In lieu of investigating fascicle length per se, this study examined an equivalent region-level metric, characteristic fascicle length (L*), which tended to exhibit negative allometry with respect to body mass in both mammals and bipeds, as well as the forelimb of reptiles. The exponents obtained (0.171 to 0.324) fell within the range reported previously for actual fascicle length in individual muscles or functional groups (0.14 to 0.5; see references in Introduction). Negative allometry indicates that muscle fascicles are on the whole proportionately shorter in larger species. This is especially true in the mammalian distal hindlimb, which also exhibits a lower intercept (Fig. 4); shorter muscle fascicles in the distal hindlimb of mammals in turn translate to relatively greater PCSA per unit mass in this compartment (Fig. 5). All else being equal, proportionately shorter muscle fascicles in larger species will translate to reduced joint excursion, and at least for mammals and birds this correlates with the habitual use of more extended limb postures in larger species (Biewener, 1989; Biewener, 2005; Bishop et al., 2018; Gatesy & Biewener, 1991).

Fascicle packing
Analysis of covariance of muscle mass and force-generating capacity identified numerous differences in scaling patterns between mammals, reptiles and bipeds, but when phylogeny was taken into account the majority of these became statistically non-significant (Tables 2, 3, S2, S3). Hence there are genuine physical differences in mass or force-generating capacity between each group (which is biomechanically relevant), and these largely exist due to each group having evolved along different phylogenetic trajectories. However, when total muscle mass was factored into consideration, either directly (comparison #4: ΣPCSA v. Σm_muscle) or indirectly (comparison #3: L* v. m_body, wherein Σm_muscle is used to compute L*), ANCOVA recovered almost no statistically significant difference between any group; this was true regardless of whether phylogeny was accounted for or not, and was observed for the limb as a whole as well as proximal and distal compartments separately (Tables 4, 5). That is, the fore- and hindlimbs of reptiles, mammals and bipeds almost ubiquitously follow a single common scaling pattern. This in turn implies that observed differences in whole-limb force-generating capacity between groups are principally driven by differences in limb muscle mass, not internal architecture: one group does not 'pack their fascicles better' for a given volume of limb muscle; they just invest a greater fraction of total biomass into limb muscle in the first instance. The only possible exception to this generality is the distal hindlimb of mammals, which at larger body sizes at least does indeed appear to 'pack its fascicles better', with a greater PCSA (and in turn, force-generating capacity) per unit muscle mass, correlating with proportionately shorter fascicles for a given body size. All else being equal, this result suggests that larger species may benefit from greater elastic energy storage in the distal hindlimb tendons during locomotion (cf. Pollock & Shadwick, 1994), although it should nevertheless be treated with caution, since a large proportion of the currently sampled mammals are 'cursorial' taxa, which may bias the analysis.
The existence of a single overarching pattern, across a diverse array of terrestrial amniotes that span more than four orders of magnitude in body mass, is remarkable. It implies the tendency towards some adaptive optimum in organismal 'design' or, more likely, the presence of one or more constraints that prevent significant and systematic deviation from a common pattern. These constraints may be functional in nature, such as trade-offs that could occur between conflicting requirements of individual muscles in the execution of disparate tasks (see Introduction), or may have a developmental basis (e.g., Evans et al., 2021). Given the multidimensional and nonlinear aspects of muscle architecture and function, and terrestrial locomotor biomechanics in general, it would be naïve to suggest that a bivariate statistical model such as those derived here can sufficiently represent the mechanical phenomena involved (Taylor & Thomas, 2014). Deciphering the proximate underlying cause(s) of this strong consistency must therefore await future study. It is also important to recognize that there is still scope for (nonsystematic) variation; for example, the greyhound (Canis familiaris, 27 kg) and snow leopard (Panthera uncia, 36 kg) have near-identical total hindlimb muscle mass (~1.64 kg), yet the greyhound, which is selectively bred for high-speed running, has approximately double the PCSA of the snow leopard (0.028 v. 0.014 m²; Table S1). Important insight into the fascicle packing problem can therefore be gained by better understanding the reasons for variation about the common pan-amniote pattern, through exploring variables that were not investigated in the current analyses, such as segment lengths, muscle moment arms, joint mobility, posture and locomotor ecology.
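The greyhound and snow leopard comparison can be restated as PCSA per unit muscle mass, which is one simple way to express 'fascicle packing'. The Python sketch below uses only the approximate figures quoted in the text (about 1.64 kg of hindlimb muscle in each species; total PCSA of 0.028 and 0.014 m²); the dictionary layout and variable names are illustrative assumptions.

```python
# Approximate whole-hindlimb values as quoted in the text (see Table S1).
hindlimb = {
    "greyhound": {"muscle_mass_kg": 1.64, "pcsa_m2": 0.028},
    "snow leopard": {"muscle_mass_kg": 1.64, "pcsa_m2": 0.014},
}

# PCSA per unit muscle mass (m^2 per kg) for each species.
pcsa_per_kg = {
    species: values["pcsa_m2"] / values["muscle_mass_kg"]
    for species, values in hindlimb.items()
}

# Ratio of greyhound to snow leopard packing.
packing_ratio = pcsa_per_kg["greyhound"] / pcsa_per_kg["snow leopard"]

for species, value in pcsa_per_kg.items():
    print(f"{species}: {value:.4f} m^2 per kg of hindlimb muscle")
print(f"greyhound / snow leopard = {packing_ratio:.2f}")
```

Because the two muscle masses are nearly identical, the roughly twofold difference in PCSA per kilogram is attributable to architecture rather than to the amount of muscle, which is the kind of non-systematic variation about the common pattern that the text describes.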

Considerations for future studies
A number of other points are worth noting in the context of future comparative investigations of muscle architecture. Firstly, despite a superficially broad taxonomic coverage, the dataset of the present study is still biased, with few birds, marsupials or non-varanid squamates, numerous 'cursorial' eutherians and not one testudine; future studies should therefore target currently under- or unsampled parts of amniote phylogeny. Increased diversity will be especially useful for elucidating the true extent, and underlying cause(s) of, conservatism in whole-limb fascicle packing. Furthermore, it is recommended that as much architectural data be collected and reported as possible, even if it may not all be immediately used in a given study, since this will maximize its potential utility for later studies. One reason why several prior datasets were excluded from the present study was their omission of pennation angle from the set of reported (or even collected) measurements. A second point worth considering is that, out of necessity, the present study ignored the intrinsic musculature of the manus and pes; if further datasets for these muscles become available in the future, they should be analysed, since this may reveal important differences between plantigrade and digitigrade species.
One final consideration is that the present study found numerous differences between mammals, reptiles and bipeds in the allometry of muscle mass and force-generating capacity with respect to body mass. This is hardly surprising, but it has an important implication for how comparative studies should undertake their analyses. In order to facilitate comparison within or across species, previous studies that have explored muscle architecture and function typically normalize raw architectural measurements by body mass, or the dimensionally appropriate exponent of body mass such as m_body^(1/3) for normalized fascicle length (Allen et al., 2010; Bates & Schachner, 2012; Dick & Clemente, 2016; Fahn-Lai, Biewener & Pierce, 2020; Martin et al., 2019; Regnault et al., 2020). This may be acceptable for comparisons of closely related species, but becomes questionable when divergent allometries with body mass are involved. A more sensible approach for future studies would be to normalize against more directly relevant parameters, such as limb or limb segment length, or alternatively by the appropriate clade-specific scaling exponents. Exactly which is the most appropriate normalizing metric to use may vary depending on a given study's core questions, and deserves further scrutiny.
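To illustrate why normalizing by the isometric exponent can mislead when a clade scales allometrically, the Python sketch below generates fascicle lengths from a hypothetical clade that follows an exponent of 0.25 (chosen to lie within the 0.171 to 0.324 range reported above) and normalizes them in two ways. The coefficient, exponent, body masses and function names are all assumptions for illustration, not fitted values from this study.

```python
ISOMETRIC_LENGTH_EXPONENT = 1.0 / 3.0  # length ~ m^(1/3) under geometric similarity
CLADE_EXPONENT = 0.25                  # hypothetical clade-specific exponent
CLADE_COEFFICIENT = 0.05               # hypothetical coefficient (m per kg^0.25)


def fascicle_length(body_mass_kg):
    """Hypothetical clade trend: L = a * m^b, with no scatter about the line."""
    return CLADE_COEFFICIENT * body_mass_kg ** CLADE_EXPONENT


def normalize(value, body_mass_kg, exponent):
    """Divide a measurement by body mass raised to the chosen exponent."""
    return value / body_mass_kg ** exponent


small_kg, large_kg = 1.0, 100.0

# Normalizing by m^(1/3) leaves a residual size effect within the clade.
iso_small = normalize(fascicle_length(small_kg), small_kg, ISOMETRIC_LENGTH_EXPONENT)
iso_large = normalize(fascicle_length(large_kg), large_kg, ISOMETRIC_LENGTH_EXPONENT)

# Normalizing by the clade-specific exponent removes it.
clade_small = normalize(fascicle_length(small_kg), small_kg, CLADE_EXPONENT)
clade_large = normalize(fascicle_length(large_kg), large_kg, CLADE_EXPONENT)

print(f"isometric normalization: {iso_small:.4f} v. {iso_large:.4f}")
print(f"clade-specific normalization: {clade_small:.4f} v. {clade_large:.4f}")
```

Under the isometric normalization the larger animal appears to have 'shorter' fascicles purely because of the clade's allometry, whereas the clade-specific normalization returns the same value at both sizes; normalizing by limb or segment length would need measured lengths and is not shown here.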

CONCLUSION
Whole-limb scaling of muscle mass and force-generating capacity generally follows the same patterns reported previously for individual muscles or functional groups of muscles, although some instances of tendency towards isometry (as opposed to previously reported positive or negative allometry) were noted. This is the first time that muscle scaling has been addressed at such a broad spatial scale, across a diverse array of terrestrial amniotes. Additionally, some important differences were observed between proximal and distal limb compartments, particularly in the hindlimb of mammals and bipeds, which may reflect the 'cursorial' habits of many of the species investigated here. The mammalian distal hindlimb has proportionately less muscle mass and shorter fascicles, but per unit muscle mass has a higher PCSA; and the distal hindlimb of bipeds has proportionately greater muscle mass and PCSA.
Almost all differences in force-generating capacity between groups appear to be due principally to differences in muscle mass, rather than muscle architecture. Thus, one group does not systematically 'pack their fascicles better' than another; instead, they simply invest more biomass into limb muscle. The underlying reason(s) for a single overarching relationship across extant amniotes remain to be determined, although it echoes conservatism in other aspects of musculoskeletal design, such as tissue mechanical properties.
A comparative dataset of extant amniotes spanning almost five orders of magnitude in body size has been assembled, which can be built upon and used by future studies in the analysis of muscle architecture diversity. This dataset also forms the basis for a suite of pan-amniote predictive equations that can be used to estimate bulk muscle mass and force-generating capacity, along with error margins, for extinct species. These can be used if estimates of body mass are available, or alternatively from estimates of total limb muscle volume, as derived from digital volumetric reconstructions based on skeletal material. Reliably inferring the force-generating capacity of individual muscles in extinct species remains elusive (Bishop, Cuff & Hutchinson, 2021a), but inferences of total limb muscle force-generating capacity provide a step toward achieving this goal.