On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision

Background The effects of current anthropogenic influences on eelgrass (Zostera marina) meadows are noticeable. Eelgrass ecological services grant important benefits for mankind. Preservation of eelgrass meadows include several transplantation methods. Evaluation of establishing success relies on the estimation of standing stock and productivity. Average leaf biomass in shoots is a fundamental component of standing stock. Existing methods of leaf biomass measurement are destructive and time consuming. These assessments could alter shoot density in developing transplants. Allometric methods offer convenient indirect assessments of individual leaf biomass. Aggregation of single leaf projections produce surrogates for average leaf biomass in shoots. Involved parameters are time invariant, then derived proxies yield simplified nondestructive approximations. On spite of time invariance local factors induce relative variability of parameter estimates. This influences accuracy of surrogates. And factors like analysis method, sample size and data quality also impact precision. Besides, scaling projections are sensitive to parameter fluctuation. Thus the suitability of the addressed allometric approximations requires clarification. Results The considered proxies produced accurate indirect assessments of observed values. Only parameter estimates fitted from raw data using nonlinear regression, produced robust approximations. Data quality influenced sensitivity and sample size for an optimal precision. Conclusions Allometric surrogates of average leaf biomass in eelgrass shoots offer convenient nondestructive assessments. But analysis method and sample size can influence accuracy in a direct manner. Standardized routines for data quality are crucial on granting cost-effectiveness of the method.


Background
Based on published data for 17 ecosystem services in 16 biomes, Costanza et al. [1] estimated the value of ecosystem services at the level of the whole biosphere. They found a lower bound in the range of US$16-54 trillion (1012) per year, with an average of US$33 trillion per year. Marine systems produced near 63% of the annual value. Almost half derived from coastal ecosystems. Approximately 25% of this share related to algal beds and seagrasses. This contribution to human welfare is deem relevant. Thus maintaining the health of marine ecosystems is subject of scientific concerns. Currently, increasing anthropogenic pressures pose important threats. And global climate change could also threaten future viability of seagrass meadows [2,3]. For instance, water quality and other local stressors promote unprecedented meadow loss [4][5][6]. This reduces mitigation of wave action [7] and filtration [8]. Diminishes food and shelter for a myriad of organisms [9][10][11]. Weakens nutrient cycling [12,13], erosion abatement and shoreline stabilization [14][15][16].
Eelgrass (Zostera marina L.) is a dominant along the coasts of both the North Pacific and North Atlantic [20]. This species supports communities known as among the richest and most diverse in sea life [21]. Contribution of organic materials for food webs in shallow environments [22] is noticeable. Indeed, eelgrass produced up to 64% of the whole primary production of an estuarine system [23]. Current deleterious effects of anthropogenic influences on eelgrass prompted special restoration strategies. Among remediation efforts replanting plays an important role [24][25][26][27]. Transplant success amounts to reinstatement of ecological functions of natural populations. Evaluation relies on monitoring standing stock and productivity of transplanted plants. Then comparing with assessments of a reference population, which usually settle nearby [28].
Combined biomass of leaves in shoots is an important component of standing stock. Assessments rely on the estimation of the biomass of individual leaves. This requires shoot removal followed by dry weight measurement procedures in the laboratory. Elimination of shoots could infringe damage to natural eelgrass populations [29]. And reduced shoot density makes these effects even more perceptible for transplanted plots. Allometric methods make it possible simplified-indirect estimations of eelgrass productivity and standing stock. Echavarría-Heras et al. [30] considered an allometric representation for eelgrass leaf biomass and related length. Agreeing with Solana et al. [31], the involved parameters are invariant within a given geographical region. Estimates and leaf length measurements grant nondestructive approximations of observed leaf biomass values. This way, leaf length measurements grant nondestructive approximations for observed leaf biomass values. Leaf growth rates estimation relies on successive measurements of leaf biomass. Then the allometric model in [30] entails nondestructive assessments of eelgrass productivity. But, invariance does not impede local factors to imply variability of parameter estimates. Besides, local influences other factors could explain numerical differences in parameter estimates. There are methodological influences that may render biased parameter estimates. Analysis method, sample size, and data quality can influence scaling results (e.g. [32,33]). And, since scaling relationships are particularly sensitive to parametric uncertainties, Echavarría-Heras et al. [30] concluded that the actual precision of derived allometric surrogates requires clarification.
Here we deal with allometric surrogates for average leaf biomass in eelgrass shoots. These derive from the model w(t) = βa(t) α for leaf biomass w(t) and area a(t) measured at time t, and α and β parameters. Leaf area is more informative of eelgrass leaf biomass than corresponding length. Thus, the present scaling endures a boost in precision of parameter estimates by the model in [30]. This could increase the accuracy of derived surrogates for leaf biomass in shoots. Besides, eelgrass leaf area and length admit an isometric representation [34,35]. Then, the time invariance found by Solana-Arellano et al. [31] also holds for parameter estimates of the present scaling. This by the way imbeds a nondestructive advantage to the present shoot-biomass substitutes. But, agreeing with Echavarría-Heras et al. [30], we must examine influences on precision of estimates for suitability of projections. Since, such an analysis was not produced before, we took here the try of filling that gap. Achieving the related goals, required the assemblage of an extensive data set. It comprises coupled measurements of eelgrass leaf biomasses and related areas. This is further called "raw data set". A data cleaning procedure adapted from Echavarría-Heras et al. [30] removed inconsistent leaf biomass replicates from the raw data. Thereby forming what we call a "processed data set". Differences in reproducibility strength allowed to assess data quality effects in precision. A similar procedure evaluated sample size effects. And a sensitivity analysis evaluated robustness of the projection method. This supports consistent, cost-effective allometric projections of observed values from raw data. But, this depends on nonlinear regression as an analysis method. Besides, sample size must be optimal. Data quality as expected improved reproducibility strength of the allometric projection method. But, this factor was more relevant in optimizing sample size. A detailed explanation of used procedures appears in the methods section. The results section is not only devoted to the presentation of our findings. It also examines the relative strengths of factors influencing the precision of proxies. A Discussion section emphasizes on the gains and the limitations of the method. Appendix 1 deals with the model selection problem. Appendix 2 is about data processing methodology. Appendix 3 presents the procedure for sensitivity assessment.

Methods
The symbol w m (t) will stand for average leaf dry weight of shoots collected at sampling date t. An the average of these values over all sampling dates symbolized through 〈w m (t)〉. Formal representations of these variables appear in Appendix 3 (cf. Eq. (15) and Eq. (16)). The symbols w(t) and a(t) will one to one stand for biomass and area of an individual leaf collected at a sampling time t. We assume that these variables are linked through the scaling relationship The present raw data come from a coastal lagoon located in San Quintin Bay, México [30]. This comprises 10,412 leaves and measured lengths [mm], widths [mm] and dry weights (g). The product of length times width provided estimations of leaf area [mm 2 ] [36]. In what follows the symbol n ra stands for number of leaves in raw data. Processed data results by applying direct and statistical data cleaning techniques. The direct hinges on the consistency of allometric models for eelgrass leaf biomass. Leaf length or area are allometric descriptors of eelgrass leaf biomass [30,31,34]. A model selection exploration corroborated a power function like trend assumption for the present data. Details appear in Appendix 1. Severe deviations, from the mean response curve, are inconsistent and must be removed. This took care of sets containing less than ten leaf dry weight replicates. The statistical procedure worked on sets with a larger number of replicates. It centers on properties of the median of a group of data. This is immune to sample size and also a robust estimator of scale. The adapted Median Absolute Deviation (MAD) data cleaning procedure [37] appears in Appendix 2. Processing data resulted in a number of n qa = 6094 pairs of leaf dry weights and areas.
Parameter estimatesα andβ and leaf area values yield allometric proxies w m ðα;β; tÞ for w m (t). (cf. Eq. (19)). The symbol hw m ðα;β; tÞi (cf. Eq. (20)) stands for the pertinent average over sampling dates. We use Lin's Concordance Correlation Coefficient (CCC) [38] as an evaluation of reproducibility. This meant as the extent to which two connected variables fall on a line through the origin and with a slope of one. We represent this statistic by means of the symbolρ. Agreement defined as poor wheneverρ < 0:90, moderate for 0:90≤ρ < 0:95, good for 0:95≤ρ ≤0:99 or excellent forρ>0.99 [39]. Values ofρ gave an evaluation of the strength of the w m ðα;β; tÞ devise to reproduce observed values.
In getting parameter estimatesα andβ we relied on two procedures. The traditional analysis method of allometry and nonlinear regression. Assessing analysis method effects on reproducibility strength of w m ðα;β; tÞ depended on testing differences inρ . The traditional approach involves a linear regression equation (cf. Eq. (4)). This obtained through logarithmic transformation of response and descriptor in Eq. (1). The nonlinear regression analysis method relied on maximum likelihood [40,41]. This approach fitted the model of Eq. (1) in a direct way in the original arithmetical scale. The nonlinear fit allowed the consideration of homoscedasticity or heteroscedasticity (cf. Eqs. (5) and (6)). All the required fittings for both raw and processed data depended on the use of the R software.
We also fitted the model of Eq. (1) to samples of different sizes taken out from primary and processed data sets. Each sample of size k; with 100 ≤ k ≤ n ra produced estimatesαðkÞ for α andβðkÞ for β, and resulting w m ðαðkÞ;βðkÞ; tÞ projections. The symbolρðkÞ denotes the value ofρ for a sample of size k. Differences inρðkÞ allow exploring sample size influences in reproducibility.
Deviations Δα q and Δβ r convey fluctuating values α q = α + Δα q and β r = β + Δβ r for the parameters α and β one to one. The modulus of the vector of parametric changes (Δα q , Δβ r ) defines a tolerance range θ(q, r). And the value of θ(q, r) determined by the standard errors of parameter estimates denoted by mean of θ ste . A fixed value of θ(q, r) leads to four possible characterizations of the pair (Δα q , Δβ r ). Each one associates to a trajectory w m (α q , β r , t) shifting from a reference one w m (α, β, t). The symbol δw mθ (α q , β r , t) (cf. Eq. (42)) denotes deviations between reference and average of shifting trajectories at sampling dates. And the average of δw mθ (α q , β r , t) values taken over all sampling dates denoted through 〈δw mθ (α q , β r , t)〉 (cf. Eq. (43)). The absolute value of the ratio of 〈δw mθ (α q , β r , t)〉 to 〈w m (α, β, t)〉 defines a relative deviation index ϑ(θ). It measures sensitivity of 〈w m (α, β, t)〉 to fluctuations of tolerance θ(q, r) on α and β. Appendix 3 presents detailed formulae. Figure 1 shows the variation of leaf dry weight and area observed in the raw data. Smallest and largest leaf areas were 2 mm 2 and 7868 mm 2 respectively. Associated dry weights were 1 × 10 −5 g and 0.1588 g one to one. The time average of mean leaf dry weight in shoots was 〈w m (t) 〉 = 0.01461g (cf. Eq. (16)). Each leaf area measurement associate to several replicates of leaf biomass. Number of replicates increased from a single association up to a largest value of 84. Dispersion masks a power function like trend. Contents of Appendix 1 corroborate this at formal level. And exploration of dispersion reveals severe deviations from the inherent power function-like trend. Inconsistencies are more visible for leaves with areas under 350 mm 2 and also for those over 5000 mm 2 . This hints about the relevance of data quality. Figure 2 exhibits the spreading of leaf dry weight after data quality control. About 40% of the replicates in the raw data were eliminated. A power function like trend appears more depicted than that showing on Fig. 1. But dispersion still shows significant deviations around the expected power function like trend. This suggests lack of standardized routines for data gathering. In this work the length times width proxy [36] approximated leaf area. Errors in estimation of area of older damaged leaves could explain uneven replicates. Faulty  Although about 40% of the replicates in the raw data were found inconsistent and eliminated, this plot still shows significant residual variability around an expected power function like trend equipment, or incorrect recording could explicate inconsistencies for small leaves. Table 1 gives estimatesα andβ for the parameters α and β and corresponding standard errors. Assuming heteroscedasticity in the model of Eqs. (5) and (6) did not affect estimates. Thus, presentation of results of nonlinear regression refers to the homoscedastic case of the model of Eqs. (5) and (6). Figure 3 displays mean response curves fitted using raw data. Figure 4 shows those associated to quality controlled data. Results for the log-linear transformation method included correction for bias of retransformation [42]. The smearing estimate of bias of Duan [43] provided the form of the correction factor.

Results
For raw data the log-linear transformation method producedρ ¼ 0:8910, entailing poor reproducibility. This explains a biased distribution of replicates around the mean response curve (Fig. 3a). Meanwhile, estimates acquired by nonlinear regression from raw data conveyed adequate reproducibility (ρ ¼ 0:9307Þ. This explains a displayed fair distribution of projected leaf biomass values (Fig. 3b).
Estimates via log-linear transformation for processed data seemed enhance reproducibility (ρ ¼ 0:9777 =0.9455). But, Fig. 4a, reveals a bulk of inconsistent replicates for leaves areas under 5000 mm 2 . Notice that this subset of replicates distributes almost evenly around the mean response curve. Yet replicate spread for areas beyond 5000 mm 2 shows significant bias (Fig. 4a). Meanwhile, nonlinear regression and processed data associate toρ ¼ 0:9777. This corresponded to good reproducibility strength. Indeed, spread of replicates around the mean response is fair (Fig. 4b).
Estimates acquired from raw data via the traditional analysis method of allometry returned a value ofρ ¼ 0:9 285 for w m ðα;β; tÞ projections ( Table 2). This seems to correspond to moderate reproducibility. Yet, corresponding rms = 0.01265 was largest among analysis method-data set combinations (Table 2). Figure 3a shows a relative wider bias in spread around the mean response curve for larger leaves. This explains resulting inconsistencies in reproducibility of w m ðα;β; tÞ projections shown in Fig. 5a. Display reveals biased w m ðα;β; tÞ projections for near 50% of sampling dates. This, led to a smallest value of 0.8436 for the ratio of projected to observed averages.
Instead, nonlinear regression and raw data produced a value ofρ ¼ 0:9915: And root mean squared deviation attained a value of rms = 0.00460 (Table 2). This suggest a remarkable reproducibility strength for w m ðα;β; tÞ projections ( Table 2). Correspondence between projected and observed values, shown in Fig. 5b. corroborates high agreement. Moreover, the ratio of projected to observed leaf dry weight averages attained an outstanding value of 0.9773 (Table 2).
Processed data and log-linear transformation analysis producedρ ¼ 0:9489 for w m (α, β, t) projections. This figure is bigger than corresponding to raw data for this method. Nevertheless, lines in Fig. 6a show that this result does not correspond to a real gain in reproducibility. Besides data quality could not significantly reduce calculated root mean squared deviation (Table 2). Also, a value of 0.8588 for a ratio of projected to observed leaf dry weight averages is still low for suitable agreement (Table 2). Thus, regardless data quality, log-linear analysis failed to produce consistent w m (α, β, t) projections of w m (t) averages.
In turn, w m (α, β, t) projections made by nonlinear regression and processed data yield the highest value of ρ ¼ 0:9976. (Table 2). And also the smallest root mean squared deviation among analysis method-data set combinations ( Table 2). As shown by Fig. 6b this corresponds to a fairly good reproducibility strength.
Additionally, data quality and nonlinear regression led to an outstanding value of 0.9975 for the ratio of projected 〈w m (α, β, t)〉 to observed 〈w m (t)〉 averages.
Results exhibit that log-linear transformations failed to produce consistent projections for observed w m (t) averages. In contraposition, nonlinear regression entailed parameter estimates of noteworthy reliability. Thus studying sample size effects on reproducibility addressed only this analysis method. Figure 7 exhibits variation of CCC values depending on sample size k. This is expressed as a percentage of the extent of data set ( n ra = 10412 for raw) or (n qa = 6094 for processed). For raw data, a sample sized k = 0.20n ra endures reasonable reproducibility ðρðkÞ ¼ 0:9889Þ . But samples beyond this threshold would not induce a significant change in reproducibility. Meanwhile, for the quality controlled data, the sample size threshold for excellent reproducibility was k = 0.10n rq . This leads to a high reproducibility strength ofρðkÞ ¼ 0:9929. Thus, a sample 10% the size of processed data set produced excellent reproducibility. Again sample sizes beyond this threshold failed to in-creaseρðkÞ values.
The simulation code of Eqs. (39) through (44) explored the sensitivity of the w m (α, β, t) projection method, to numerical variation of parameters α and β. Available parameter estimates yield reference values for α and β (Table 2). Again, since nonlinear regression associates to highest reproducibility strength, for easier presentation, we only explain results using this analysis method.
The variation of the absolute deviation index for raw data is shown in Fig. 9. A range 0 ≤ θ(q, r) ≤ θ ste , places the relative ϑ(θ) deviation index within the domain 0≤ ϑ(θ) ≤ 0.0205 (Table 3). Therefore, for a bound of θ(q, r) set by the standard errors of estimates largest absolute deviation between w m (α q , β r , t) and w m (t) amounts to about 2% of 〈w m (t) 〉. Moreover, a range 0 ≤ θ(q, r) ≤ 2θ ste produces 0 ≤ ϑ(θ) ≤ 0.031. This leads to an equivalent 3% of 〈w m (t) 〉. Figure 10 displays the dynamics of ϑ(θ) depending on θ(q, r) for processed data. We have that θ(q, r) varying in a range of 0 ≤ θ(q, r) ≤ θ ste implies 0 ≤ ϑ(θ) ≤ 0.003 (Table 3). This time largest absolute deviation was only 0.03% of 〈w m (t)〉. Comparing with results for raw data, we ascertained remarkable gain in precision of w m (α, β, t) projections. This exploration highlights on importance of data quality control as a procedure leading the consistent w m (α, β, t) projections. But, results show that the projection method is robust even when parameter estimates are obtained from raw data.

Discussion
Results of Solana-Arellano et al. [31] explain invariance of the allometric parameters α and β in Eq. (1). This suggest w m (α, β, t) proxies as possible nondestructive estimations of the average leaf dry weight in eelgrass shoots. These assessments are essential for monitoring the efficiency of transplanted eelgrass plots, fundamental in remediation aims. The present examination shows that the w m (α, β, t) proxies could in fact offer reliable and cost-effective assessments. This on condition that practitioners take in to account our guidelines. For instance, our results typify the extent on which accuracy of estimates of the parameters α and β influences the predictive power of the w m (α, β, t) projections.
And, our findings clarify that there are methodological factors affecting the accuracy of estimates. Related influences could spread significant bias in approximations supported by the w m (α, β, t) device. Indeed, analysis method turned into a main factor inducing bias in parameter estimates of the model of Eq. (1). Moreover, only parameter estimates acquired by nonlinear regression yield consistency of the model of Eq. (1) ( Table 1 and lines in Fig. 3b and Fig. 4b).
And, only these estimates upheld conclusive predictive power of the w m (α, β, t) proxies ( Table 2, as well as, lines in Fig. 5b and Fig. 6b). Our results also show that data quality could not improve the performance of w m (α, β, t) projections acquired via log-linear transformations. Without doubt, parameter estimates acquired from processed data by this method still led to significant bias in w m (α, β, t) projections (Fig. 6a). Meanwhile, data processing improved reproducibility of projections built for raw data using nonlinear regression ( Table 2 and lines in Fig. 5b and Fig. 6b).  Besides, relevance of data quality was also evident for optimizing sample size. Indeed, while for raw data, a sample of approximately 2000 leaves shows reasonable reproducibility, for the quality controlled data this threshold drops to near 1000. However, samples sized beyond these thresholds would not induce a noteworthy gain in reproducibility. This result on its own, ties to efficiency of the w m (α, β, t) projection method. Undoubtedly routines for leaf dry weight assessment are tedious and time consuming. So, reducing size of data set for parameter estimation increases costeffectiveness of the w m (α, β, t) projection method. Nonlinear regression estimation also showed advantages in sensitivity over the log-linear analysis counterpart. Estimates from raw data led to a largest absolute deviation between w m (α, β, t) and w m (t) values amounting only 3% of the average of w m (t) over sampling dates. And, for processed data, the fluctuation range for equivalent sensitivity widened to 2.8 times the range for standard errors of estimates. But, on spite of data quality relevance, sensitivity results for raw data reveal that the w m (α, β, t) projection method is robust relative to expected fluctuations in parameter estimates.
Our results show that both the accuracy and costeffectiveness of projections can be enhanced by the addition of data quality control procedures. However, including data processing can become a weakness for the w m (α, β, t) projection method. Indeed, data cleaning procedures convey niceties that relate in a fundamental way to detection and rejection of inconsistent replicates. Also, compromising about which particular rejection edge should work, is hard to determine. Thus, the use of any data processing will endure a doubt, that the examiner selects an arrangement producing the most probable results [37]. In that order of ideas, when attempting to enhance the reproducibility power of w m (α, β, t) projections it is desirable to avoid depending in any form of data processing. For that aim, prior to data assembly, we must bear in mind standardized routines yielding accurate measurements for w(t) and a(t). This will favor direct identification of the model of Eq. (1) in a consistent way. It is of a fundamental importance to be aware, that errors in leaf dry weight or area assessment differentiate in terms of leaf size. Certainly, leaves produced anew normally present a complete and undamaged span. But, they normally yield very reduced dry weight values. Therefore, we can expect estimation errors imputable to the precision of the analytical scale for individual leaf dry weight assessments. To this, we may add errors in the reading and/or recording of the scale output. These issues could explain a perceptible accumulation of inconsistent replicates for leaves with areas between 2 mm 2 and 350 mm 2 (Fig. 1). And, even after data cleaning procedures, leaf dry weight replicate spread for leaves bigger than 2000 mm 2 shows significant residual variability (Fig. 2). Likewise, as far as, bigger and  older leaves is concerned, there are issues on dry weight estimation errors. These seem to relate to damage caused by exposure to environmental factors. The fact that we estimated leaf area by means of the product of related length and width could explain these effects. For complete undamaged eelgrass leaves, the use of a leaf times width proxy grants simplified and accurate estimations of leaf area [36].
But, this approach could deliver inaccurate estimations for long and damaged leaves. Actually, bigger leaves remain exposed during significant periods of time to environmental influences such as drag forces or herbivory. Fig. 8 Examples of changing trajectories w m (α q , β r ,t). Black lines a reference trajectory w m (α,β,t). This produced by raw data and nonlinear regression as an analysis method. For Δα q . Δβ r > 0, shifting trajectories w m (α q , β r ,t) overestimate or underestimate w m (α,β,t) projections (see red or blue lines in panel a)). The case Δα q •Δβ r < 0, leads to relative smaller deviations between w m (α q , β r ,t) and w m (α,β,t) (see red and blue lines in panel b)) Fig. 9 The variation of the relative deviation index ϑ(θ) (raw data). The standard errors of estimates acquired by nonlinear regression from raw data, produced ϑ(θ ste ) = 0.0205. And for a range 0 ≤ θ(q,r) ≤ θ ste the largest value of the absolute deviation between w m (α q , β r ,t) and w m (t) is around 2% of 〈w m (t) 〉 (see Appendix 3 for details) This could remove large amounts of leaf tissue while leaving length unaffected. Thus, causing overestimation of true leaf area when using a width and length product estimation. At the same time, lost portions of a leaf will produce a smaller dry weight than expected for an overestimated area. These effects will bring dry weight replicates that deviate from the power function-like trend associated to the model of Eq. (1). Estimation bias for the dry weights of smaller and longer leaves could explain the anomalous proliferation of inconsistencies (around 41 % ) found while applying the proposed data cleaning procedure to the present raw data. These effects will propagate uncertainty of parameter estimates of the model of Eq. (1), influencing accuracy of the w m (α, β, t) projections. Hence, for the sake of consistent reproducibility of observed values via w m (α, β, t) projections, we need to be aware of these effects. And as elaborated, a good starting point for granting consistency, is appropriate gathering of primary data for the identification of the model of Eq. (1). This will make subsequent data cleaning procedures unnecessary.

Conclusion
This research show that precise estimates of allometric parameters in Eq. (1) grant accurate non-destructive projections of the average leaf dry weight in eelgrass shoots. These projections offer efficient appraisal of eelgrass restoration projects, thereby contributing to the conservation of this important seagrass species. Our findings support views in Hui and Jackson [32], Packard and Birchard [33] and Packard et al. [44], on the relevance of analysis method in scaling studies. Indeed, we found that for assuring suitability of the w m (α, β, t) proxies, the use of nonlinear regression is crucial. On spite of claims that the use of the traditional log-linear analysis method is a must in allometric examination [45], exploration of spread of present crude data reveals curvature [46]. This explaining failure of the traditional analysis method to produce unbiased results for the present data. Besides proxies supported by nonlinear regression and raw data, are robust.
Data cleaning could only marginally enhance the accuracy of proxies produced by nonlinear regression and raw data. But results underline a relevant influence of data quality in setting optimal sample size for a suitable precision of parameter estimates. Nevertheless, the use of data cleaning procedures leads to intricacies. They in a fundamental way relate to choosing thresholds for rejection of detected inconsistencies, which are often regarded as subjective. Thus, instead of using later data cleaning, data gathering should seek for suitability. Special care must be taken when Table 3 Sensitivity of the w m (α, β, t) projections to changes in estimates of the parameters α and β. Included are calculated θ ste values. This gives θ(q, r) as determined by the standard errors of estimates. We also present corresponding values of the relative deviation index ϑ(θ ste ) Analysis method Data θ ste ϑ(θ ste ).  . 10 The variation of the relative deviation index ϑ(θ) (processed data). The standard errors of estimates acquired by nonlinear regression from processed data, produced θ ste =3.954 × 10 − 3 ). This set a range 0 ≤ ϑ(θ) ≤ 0.003. Thus, the largest absolute deviation between w m (α q , β r ,t) and w m (t) amounts to about 0.3% of the 〈w m (t) 〉 average. Data quality control could be a factor improving accuracy of w m (α,β,t) projections processing bigger and older leaves. These are often damaged or even trimmed so that their dry weights do not conform to a true weight to area relationship. Irregularities in raw data may also associate to biased estimation of leaf length or width. Moreover, in a lesser way faulty equipment for leaf dry weight assessment, rounding off, or even incorrect data recording could as well contribute. Taking advantage of a time invariance of the parameters in Eq. (1) the w m (α, β, t) device could offer to the eelgrass conservation practitioner highly consistent and truly nondestructive assessments of the average value of leaf dry weight in shoots. But the explained guidelines on analysis method, sample size and data appropriateness are mandatory for cost-effectiveness. Moreover, the present results suggest that the use of the w m (α, β, t) method could be extended to other seagrasses species with similar leaf architecture to eelgrass.

Model selection
This Appendix covers the selection problem of Eq. (1) as a model for the variation of eelgrass leaf dry weight in terms of related area. On first sight the spread in Fig. 1 may suggest that instead of clinging to the power function model of Eq. (1) we should explore fitting an alternative linear model. There are two possible ways to achieve a linear model fit for the present data. One by fitting an isometric model. This follows by setting α = 1 in Eq. (1) to yield a linear model with w-intercept through origin, that is, A second choice is considering a standard linear equation with intercept different from zero, that is, where β is the slope and δ the intercept. But, by looking at the spread in Fig. 1, we realize that this model turns out to be inconsistent. This because for a vanishing leaf area we expect a vanishing dry weight. In any event, we fit the model of Eq. (3) as a device to analyze the suitability of the model of Eq. (2).

Fitting of the model of Eq. (1) by means the traditional analysis method of allometry
For this fit we used raw data. Appling a logarithmic transformation on both sides of the Eq. (1), since we have available data pairs (w i , a i ), we get where the additive errors ε 1 , ε 2 , …, ε n are independent and identically distributed, with distribution N(0, σ 2 ). Fitting the model of Eq. (4) to data produced entries in Tables 4, 5 and 6. Residual and normal probability plots for the fit of the model of Eq. (4) appear in Fig. 11. We notice that a value α = 1.0 is not included in the confidence interval for the parameter α. This impairs Eq. (2) as a consistent model for the present raw data.

Fitting of the model of Eq. (1) via non-linear regression
For the aim of fitting the model of Eq. (1) to raw data using nonlinear regression, we use the equation the error term ε i has a normal distribution with mean μ zero and standard deviation σ i . This chosen to contemplate heteroscedasticity or homoscedasticity. Moreover, σ i depends on the predictor variable a i through the model where γ 0 and γ 1 are parameters to be estimated. The case γ 1 different from zero associates to heteroscedasticity. Whenever γ 1 vanishes allows the consideration of homoscedasticity. The response variable w i resulting from Eq. (5) is normally distributed, with mean μ i = βa i α and standard deviation σ i given by Eq. (6). Fitting this regression model to raw data relied on a likelihood approach, i.e., acquiring estimates for the parameters β, α, γ 0 and γ 1 , which yield the largest value of the log likelihood function [40,41] lðβ; α; γ 0 ; with μ i = βa i α and σ i given by Eq. (6).
Estimatesβ andα are very similar for the homoscedastic and heteroscedastic cases of Eq. (5). Notice that a value α = 1.0 is not included in the confidence interval for the parameter α. Thus, this fit also excludes the model of Eq. (2) (isometric case). Confidence intervals in Table 7 show a wide overlap region. It follows that parameter estimates do not differ in a significant way. This is also supported by the estimated CCC value (ρ ¼ 0:9307) that was the same for the heteroscedastic and homoscedastic cases. Likewise, plotting the response curves w i = βa i α on a dispersion diagram, reveals that curves estimated considering heteroscedasticity or homoscedasticity coincide (Fig. 12). Figure 13 shows the scatter diagram of dry weight residuals against leaf area. It also displays the region bounded by lines for 95% confidence intervals (blue lines). Figure 14 displays raw data and response curve (median) for the heteroscedastic model. It includes curves for region comprising 95% of leaf dry weight values (blue lines).
Thus, predictive strength from the homoscedastic or heteroscedastic regression models are similar. This because assumed distributions of the error term, in both models are symmetric. What makes a difference is the form capturing the uncertainty pattern.
Fitting of the linear model of Eq. (3) Since, raw data composes of pairs (w i , a i ), in order to test the consistency of the model of Eq. (3), we consider the regression equation where the additive errors ε 1 , ε 2 , …, ε n are independent and identically distributed N(0, σ 2 ). Fitting regression Eq. (8) to raw data produced the results in Table ( Residual and normal probability plot for this fit appear in Fig. 15. Looking at the confidence intervals in Table 7, we observe that a value α = 1 is not included. This implies rejection for the model of Eq. (2) (isometric case). Also, the intercept in Eq. (8) is negative, which is inconsistent. This is inconsistent with observations. Thus, we must reject the model of Eq. (3).
In summary, fitting results for the model of Eq. (1) excludes the value α = 1. Now, for α = 1, the model of Eq.
(1) takes the form of a straight line through the origin (isometric model of Eq. (2)). In the other hand, for the fit of the linear model of Eq. (3) the intercept is negative.  11 Residuals and normal probability plots for the fitting of the model of Eq. (4). This plot associates to a fit to raw data by means of the traditional analysis method of allometry. We observe a biased distribution of residuals around the zero line. Also, normal Q-Q plot shows heavier tails than expected for a normal distribution (left and right panels one to one) This means that this model could predict negative values of leaf dry weight. Thus, the model of Eq. (1) is the only one supporting a good selection criterion.

Data processing
This Appendix explains the Median Absolute Deviation (MAD) criterion for data cleaning. A first step is detecting the group of leaf dry weight replicates that associate to a given leaf area. The symbol G(a) stands for this group, and the number of replicates it contains denoted by means of the symbol n(G). Formally, G(a) = {w i (a) | 1 ≤ i ≤ n(G) }. The median of G(a) is denoted by means of MED(G(a)), that is, And, for 1 ≤ i ≤ n(G) the absolute deviation of replicate w i (a) from the median MED(G(a)) represented by means of δ i (a), that is, Similarly, the median of the set of absolute deviations is signified by the symbol MED(δG(a)), that is, We use the character MAD(G(a)) to denote median absolute deviation of a group G(a). After Huber [47] and recalling that eelgrass leaf dry weight values are lognormally distributed, this is calculated through, where b = 1/Q(0.75), being Q(0.75) the 0.75 quantile of the lognormal distribution.
The removal of inconsistent replicates in a group G(a) follows the decision criterion where T is the rejection threshold. Following Miller [48], we set T = 3. In what follows, the symbol n qa stands for the size of the resulting quality controlled data set.  For that aim we label a generic eelgrass shoot by using a subscript s. Correspondingly, the characters nl(s) and w s (t) shall stand for the associated number of leaves and their combined dry weight. Similarly, ns(t) denotes the number of shoots collected at a sampling date t. And, the number of sampling date symbolized by means of n(t).
Observed dry weight in a shoot s is represented by mean of where ∑ nl(s) indicates summation of the leaves that the shoot s, holds. Meanwhile, the matching average leaf dry weight of shoots denoted by the symbol w m (t) is given by where ∑ ns(t) indicates summation of the ns(t) shoots collected at a time t.
The character 〈w m (t)〉 stands for the average of w m (t) values over the number n(t) of sampling times. It is given by where X nðtÞ indicates summation over the number of sampling times. We assumed that w(t) and a(t) are linked through the scaling expression of Eq. (1). Then, we can obtain an allometric proxy for w s (t). The symbol w s (α, β, t) represents this surrogate and it is given by Moreover, that is, w s (α, β, t) is an approximation to the true value of w s (t), being ϵ s (α, β, t) the error of estimation. Similarly, Eq. (17) produces an allometric proxy for w m (t), the average leaf dry weight in shoots at a time t. This surrogate, represented here by means of w m (α, β, t), is given by In turn, the average of w m (α, β, t) over the number n(t) of sampling times, denoted through 〈w m (α, β, t) 〉, is calculated by means of Similarly, where ϵ m (α, β, t) stands for the resulting approximating error. Moreover, the root mean squared deviation between w m (t) and w m (α, β, t) is denoted by means of rms(α, β) and given by again ∑ n(t) indicates summation over the sampling dates n(t).
We assume changes in parameters α and β given by factors of their standard errors. Then, a factor q associates to a shift of the value of the parameter α. This deviation, denoted by means of the symbol α q , is given by where with ste (α) symbolizing the standard error of α and q satisfying.
Similarly, a factor r, such that.
associates to a fluctuation in β denoted by means of Δβ r and expressed by way of being ste(β) the standard error of β. This returns a variation of the parameter β, denoted via β r and represented in the form Qualitative exploration of sensitivity of w m (α, β, t) A qualitative study is intended to set domains where reference estimations w m (α, β, t) are underestimated or overestimated by changing values w m (α q , β r , t).For that aim, we firstly set reasonable local variation ranges for the numerical deviations α q and β r . Fittings of Eq. (1) to eelgrass leaf dry weight and area data demonstrate that the inequality 0 < β < α holds [30]. As stated by Eqs. (24) and (27), letting |q| < ste(α)/α along with | r| < ste(β)/β set domains |Δα q | ≤ α and |Δβ r | ≤ β. This enables suitable domain to pursue a qualitative study of the effects of parametric changes in w s (α, β, t) (cf. Eq. (17)). For that aim, let's consider deviations δw s (α q , β r , t) defined through Then, and factoring the term βa(t) α δw s α q ; β r ; t where μ a Δα q Δβ r ; a t ð Þ with and Now, since a(t) is positive, for fixed values of the parameters α and β in the domain |Δα q | < α, the factor π(Δα q ) remains positive and increasing. Respectively, for Δβ r varying in the interval |Δβ r | < β, the factor φ(Δβ r ) is positive and decreasing. Furthermore, for Δα q = Δβ r = 0, we have π(Δα q ) = φ(Δβ r ) = 1 (Fig. 16).  From Eq. (31) the proxies w s (α, β, t) will be overestimated by shifting projections w s (α q , β r , t) whenever the inequality is satisfied. And from Eq. (32) this last inequality leads to But, for −α ≤ Δα q ≤ 0 the factor π(Δα q ) is continuous and increases monotonically from a value π(−α) > 0, until it reaches a value of π(0) = 1, while the factor φ(Δβ r ) takes positive and arbitrarily large values whenever Δβ r approaches -β, and decreases monotonically towards a value of φ(0) = 1. Thus, the continuity of φ(Δβ r ) and π(Δα q ) implies the order relationship π(Δα q ) < φ(Δβ r ), in the domain−α ≤ Δα q ≤ 0 and -β < Δβ r ≤ 0. In the other hand in the domain 0 ≤ Δα q ≤ α and 0 ≤ Δβ r ≤ β, both π(Δα q ) and φ(Δβ r ) steer away from one being π(Δα q ) increasing and φ(Δβ r ) decreasing. Therefore, by continuity the statement of inequality (36) will hold only in the domain Δα q > 0 and Δβ r > 0.
Similarly, reference values w s (α, β, t) will be underestimated by fluctuating ones w s (α q , β r , t) whenever, the statement holds, or equivalently whenever which as it has been discussed above, only holds whenever the increments Δα q and Δβ r vary in. the domain −β < Δβ r < 0 and −α < Δα q < 0.
Since, the w s (α, β, t) projections are always positive, the w m (α, β, t) reference projections will be overestimated or underestimated by changing w m (α q , β r , t) values in the same domains of variation of Δα q and and Δβ r where the w s (α, β, t) projections are overestimated or underestimated by w s (α q , β r , t) values.

Exploration of sensitivity of w m (α, β, t) by simulation methods
In order to perform a simulation study of sensitivity, we considered a combined range of variability for α and β given by the modulus of the vector of parametric changes (Δα q , Δβ r ). This is denoted by means of the symbol θ(q, r) and calculated through Equivalently, using Eqs. (24) and (27) yield And, setting q = 1 and r = 1 give (Δα q , Δβ r ) = (ste(α), ste(β)). This produces vectors of shifting parametric values with modulus determined by the standard errors of estimates. Denoting by means of the symbol θ ste the associated range of variability we have By virtue of Eq. (39), every fixed value of θ(q, r) returns different pairs (Δα q , Δβ r ) of increments, each one giving rise to a couple of shifting parameter values (α q , β r ). Besides, for each ordered pair of changing parameters, leaf data in shoots and Eq. (19) return a shifting trajectory, that is denoted trough the symbol w m (α q , β r , t). The average of the values taken by the fluctuating trajectories at a Fig. 16 The variation of the auxiliary factors φ(Δβ r )and π(Δβ r ). This plot presents the variation of the auxiliary factors φ(Δβ r ) and π(Δβ r ) defined by Eqs. (33) and (34) respectively. Both Δα q and Δβ r r vary in the horizontal axis. Similarly, the corresponding variations of π(Δβ r ) and φ(Δβ r ) are projected in the vertical axis sampling date t is denoted by 〈w m (α q , β r , t)| θ〉. The procedure estimates the deviations between the observed w m (t) and the average 〈w m (α q , β r , t)| θ〉 trajectory for each time t. This is denoted by δw mθ (α q , β r , t) and calculated by way of δw mθ α q ; β r ; t À Á ¼ w m t ð Þ− w m α q ; β r ; t À Á jθ : The average of the δw mθ (α q , β r , t) deviation values over all sampling times n(t), represented thorough the symbol 〈δw mθ (α q , β r , t)〉, is given by where, X nðtÞ indicates summation over the involved n(t) sampling dates. These statistics along with, the average of w m (t) values over the number n(t) of sampling times are used to produce a relative deviation index represented by means of the symbol ϑ(θ) and given by The value of ϑ(θ) stands a measure of the sensitivity of the reference trajectory w m (α, β, t) to a change of tolerance θ(q, r) on the values of α and β. Moreover, ϑ(θ) assess in what percentage of 〈w m (t)〉 the absolute deviation between w m (α q , β r , t) and w m (t) amounts.

Application to present data
The results of Appendix 1 show that there are not differences in parameter estimates of the nonlinear regression model of Eqs. (5) and (6) assuming heteroscedasticity or homoscedasticity. Therefore, in order to explore the sensitivity of w m (α, β, t) by means of the simulation code of Eqs. (39)-(44), we take as a reference parameter estimates acquired from the raw data using the homoscedastic case. Parameter estimates wereα ¼ 1:104 with steðα Þ ¼ 5:101 Â10 −3 for α and ofβ ¼ 8:71 Â 10 −6 with steðβÞ ¼ 3:53 Â10 −7 for β (Table 1). Then, letting q = 1 and r = 1 in Eqs. (24) and (27) allows to consider a combined range of parametric change set by the standard errors of estimates. Eq. (41) returned θ ste = 5.101 × 10 −3 . Afterwards, for every fixed value of θ(q, r) we determined all ordered pairs (Δα q , Δβ r ) complying with condition set by Eq. (39). This returned fluctuating values α q for α and β r for β, given one by one by Eqs. (23) and (28). Observed leaf area data and shifting parameter values, shaped the associated changing w m (α q , β r , t) trajectories. Moreover, while 0 ≤ θ(q, r) ≤ θ ste , the values of the relative deviation index ϑ(τ) satisfied 0≤ ϑ(θ) ≤ 0.0205 (Fig. 9). Thus, for a parametric variation range set by the standard errors of estimates the maximum value of the absolute deviation ϑ(θ) between w m (α q , β r , t) and w m (t) amounts to approximately 2% of the average of w m (t) taken over all sampling dates (〈w m (t) 〉 cf. Eq. (16)). Moreover, for θ(q, r) in a range 0 ≤ θ ≤ 2θ ste we have 0 ≤ ϑ(θ) ≤ 0.031. This leads to a maximum absolute deviation between w m (α q , β r , t) and w m (t) of around 3% of 〈w m (t) 〉. This shows the robustness of the w m (α, β, t) projection method.