A note on the estimation of variance for big BAF sampling

The double sampling method known as “big BAF sampling” has been advocated as a way to reduce sampling effort while still maintaining a reasonably precise estimate of volume. A well-known method for variance determination, Bruce’s method, is customarily used because the volume estimator takes the form of a product of random variables. However, the genesis of Bruce’s method is not known to most foresters who use the method in practice. We establish that the Taylor series approximation known as the Delta method provides a plausible explanation for the origins of Bruce’s method. Simulations were conducted on two different tree populations to ascertain the similarities of the Delta method to the exact variance of a product. Additionally, two alternative estimators for the variance of individual tree volume-basal area ratios, which are part of the estimation process, were compared within the overall variance estimation procedure. The simulation results demonstrate that Bruce’s method provides a robust method for estimating the variance of inventories conducted with the big BAF method. The simulations also demonstrate that the variance of the mean volume-basal area ratios can be computed using either the usual sample variance of the mean or the ratio variance estimators with equal accuracy, which had not been shown previously for Big BAF sampling. A plausible explanation for the origins of Bruce’s method has been set forth both historically and mathematically in the Delta Method. In most settings, there is evidently no practical difference between applying the exact variance of a product or the Delta method—either can be used. A caution is articulated concerning the aggregation of tree-wise attributes into point-wise summaries in order to test the correlation between the two as a possible indicator of the need for further covariance augmentation.


Background Introduction
Double sampling is a method employed in forestry that samples a population in two phases over a set of sample units. The primary sample includes all sample units and trees that meet the criterion for selection, on which one or more measurements related to the main attribute of interest are recorded to be further assessed on the second *Correspondence: jeff.gove@usda.gov 1 USDA Forest Service, Northern Research Station, 271 Mast Road, Durham NH, 03824, USA Full list of author information is available at the end of the article phase sample. The second phase normally includes more detailed measurements on a subset of whole sample units or a subset of individual trees within all primary sample units (e.g. Bruce 1961;Bell et al. 1983;Odewald and Jones 1992). For example, in point double sampling, trees are counted on a large number of sample points using a relascope to estimate basal area per hectare, but measurements of individual trees needed to compute other variables (such as volume per hectare) are confined to a subsample of the points. More details on double sampling in forestry can be found in Valentine (2008, p. 262) andde Vries (1986, p. 104). Big BAF sampling (Marshall et al. 2004) is a simple form of double sampling that can be used in a horizontal point sampling (HPS) inventory. The target attributes are frequently basal area and volume, though more measurements could be taken to estimate the density as well; and recently Chen et al. (2019) have demonstrated its use for the estimation of carbon rather than volume. The big BAF method uses two basal area factors (BAFs) on the same full set of sample points in a forest inventory: the smaller BAF is used to select a sample for the estimation of basal area (the BAF c sample), while the larger BAF is used to select trees on which to take more detailed measurements for volume estimation (the BAF v sample). Therefore, the relation to double sampling comes about not through a second phase subsample of individual points, but by using the larger BAF to select a subset of trees from the full set of inventory points as the second phase sample.
The big BAF method has evidently been suggested several times prior to its actual adoption as a form of double sampling. Iles (2012) notes that in the United States Grosenbaugh (1952, p. 35) may have been the first to suggest something that resembles big BAF when he somewhat casually noted, in reference to horizontal point sampling, that "An obvious adaptation is to use a larger critical angle to obtain a smaller sample of the average ratio, while using a smaller critical angle to obtain a larger sample of average basal area per acre." Iles (2012) also notes that Grosenbaugh mentioned the use of two prism factors in an earlier letter to Wheeler in 1949. The next published mention of using two prism factors in a double sampling context was the suggestion by Bell et al. (1983 p. 702), but was not formalized until Marshall et al. (2004) described the method in detail. Since that time it has also been included in several texts on forest mensuration and sampling (e.g., Valentine 2008, p. 268 andKershaw et al. 2016, p. 377).
In addition to the aforementioned papers, several other studies have used big BAF sampling. Corrin (1998), Crowther (1999) and Desmarais (2002) all give practical examples of the use of big BAF in operational field inventories in both the eastern and western forest types of the U.S. and Canada. Brooks (2006) followed with a more detailed analysis of sampling an even-aged Appalachian hardwood forest with the big BAF method that used 13 different "big" BAFs paired with 6 different "small" BAFs, all of which were compared against a fixed-radius plot inventory. In a more recent study, Rice et al. (2014) evaluated several different sampling schemes including fixed-area plots, various BAFs under HPS, big BAF, and horizontal line sampling. Sampling was conducted in partial harvests in the Acadian mixedwood forest type in northern Maine. In each case a second phase sample was chosen with either a fixed number of trees or a second large BAF. Only the smallest BAF under HPS turned out to be better than big BAF in terms of standard error. In a comprehensive study on an Acadian forest in New Brunswick, Canada, Yang et al. (2017) analyzed how the interplay between sample size and choice of BAFs affects the costs of a big BAF sample, and determined optimal sampling plans under cost-constraints. A follow-up study by Chen et al. (2019) used a suite of simulations and the cost models of Yang et al. (2017) to show how big BAF sampling could be readily applied to the estimation of carbon.

Big BAF background
Let F c and F v (m 2 ·ha −1 ) be the basal area factors for the selection of count and volume trees, respectively. Thus, the BAF c sample of trees are selected with the F c gauge, while the BAF v sample uses the F v gauge to select the trees for volume. Notably, F c < F v , and possibly F c F v .
The double sampling big BAF estimator begins by forming the volume to basal area ratios (VBAR) for each tree in the BAF v sample of n points (e.g., Kershaw et al. 2016, p. 377); i.e., for the ith tree on a given sample point we have where v is some evaluation of tree volume and b is tree basal area. And averaging over all trees on the BAF v sample gives (Gregoire and Valentine 2008, p. 258) Here m v s is the number of BAF v trees measured for volume on the sth point; it follows that the total number of trees sampled for volume is therefore m v = n s=1 m v s . It is important to realize that in (2) the total number of VBAR trees (those trees on the BAF v sample for which volumes have been measured) must include replicates for any tree sampled on more than one point. The double sum ensures that the summation over the n sample points will include multiple counts on the point-wise tallies.
The full set of counts with BAF c on all n points provides an estimate of the average basal area given by the designunbiased (Palley and Horwitz 1961) estimator where m c = n s=1 m c s is the number of BAF c "in" trees on all n sample points, and m c s is the number of BAF c trees tallied on the sth point. Similarly, the estimator for the total is simplyB c =m c AF c , where A is the area of the tract in hectares. Note thatB c can refer to either the total or per unit area estimate according to the context. In particular, the simulations are all in terms of totals.
The product of the mean VBAR and basal area gives an estimate of the volume under big BAF leading to the familiar estimator Because this estimator is the product of two random variables-in this case derived from the above estimators-its variance is estimated by the variance of a product. As noted by Iles (2012), the usual double sampling variance can not be applied because of the design of the big BAF method. That is, the second phase sample is carried out on all sample points (with the larger BAF v ) so the first and second phase sample sizes are equivalent in terms of the number of sample points. In addition, Lynch et al. (2021, in press) point out that the big BAF estimator in (4) is a ratio estimator and thus contains a bias (see also Palley and Horwitz (1961) for a double sampling application). These authors derive the bias, which in most practical cases should be small and can safely be ignored. The exact variance of a product was popularized by Goodman (1960), who also presented an unbiased estimator of the variance that is applicable to big BAF sampling as follows where the variance estimators for (2) and (3) are given by (e.g., Gregoire and Valentine 2008, p. 257-259) var In particular, Palley and Horwitz (1961, Appendix 1) demonstrated that (7) is design-unbiased. Kershaw et al. (2016, p. 380) provide equivalent estimators. The associated standard error estimators corresponding to (6) and (7) are given as se V = var V and se B c = var B c , respectively. WhenB c represents the total, (7) is multiplied by A 2 .

Historical background
It has been noted (e.g., Marshall et al. 2004;Gregoire and Valentine 2008, p. 259) that Bell and Alexander (1957) were the first to employ a version of the product variance in the calculation of standard error for (4). Their familiar form for the estimator was written in terms of standard error as a percent; viz., Somehow, perhaps simply through the Bruce (1961) reference being better known to foresters (alternatively, perhaps Bruce had popularized it in unpublished works that were used by Bell and Alexander (1957)), (8) has become commonly known in the United States as "Bruce's method, " even though the latter was not published until several years after Bell and Alexander (1957). Additional attribution for (8) is often shared between Bruce (1961) and Goodman (1960) in these references as well as others (e.g., Chen et al. 2019;Iles 2012;Yang et al. 2017), where it is sometimes also noted that Bruce's method derives from the first two terms of (5) (e.g. Marshall et al. 2004).
Of course there is a small problem here, because (8) was known to foresters (Bell and Alexander 1957) before the Goodman (1960) paper was ever published. Therefore, attributing (8) to the first two terms of (5), for example, while mathematically correct, can not possibly be chronologically true. Clearly the above authors know this; thus, their statements linking the two derive from this recognition of the mathematical relationship, and are not meant to suggest lineage. Where then did Bell and Alexander (1957) come up with (8)? There are two obvious possibilities that are plausible answers to this question. First, it turns out that Goodman (1960) was not the first author to publish the exact variance of a product. In fact, in a reference that would probably have been rather obscure at the time, Barnett (1955) published a succinct derivation of the exact product variance for two independent random variables. Thus, it is possible that Bell and Alexander (1957) were aware of this reference and considered the third term small and extraneous, therefore dropping it. However, given that the reference appeared in the actuarial literature, this seems the less likely of the two explanations. A second and perhaps more plausible explanation lies in the fact that an approximation to the exact variance of a product had been known in statistics and used in published studies for decades prior to the 1950s. This variance approximation is derived through a Taylor series expansion and is commonly known as the Delta Method (see the "Methods" section and Supplementary Material for more details).
The motivation for this study then is two-fold. First, we will demonstrate that the Delta method yields Bruce's method when the covariance terms in the first-order Taylor series expansion are truncated, leaving only the variance terms when independence of (2) and (3) is tenable. The second objective is to illustrate just how close the approximation and the exact variances are using a small simulation study. The close agreement between the two has been pointed out before (e.g., Gregoire and Valentine 2008, p. 259;Marshall et al. 2004;Iles 2012), where it has been noted that the third term in (5) is quite small in relation to the first two, which dominate. In concert with these objectives, we demonstrate that the ratio variance estimator may provide a useful alternative to the normal sampling variance for the variance ofV. Much like the recent paper by Kerr (2014), who cleared up some misconceptions in the literature about the historical attribution of "de Liocourt's q" distribution, our intent is chiefly to clarify the possible lineage of Bruce's method in a historical context. Again, much like the paper by Kerr (2014) where it is almost assuredly true that others had recognized the missattibution of q to de Liocourt but never published the observation, we also acknowledge 'up front' that much of this material may be already known and understood by those who are students of the history of forest biometrics, but have never written these results up, perhaps feeling they were 'universally' known within the discipline.

The variance of a product
A careful reading of Goodman's 1960 paper presents a detailed, though somewhat abstruse description of the derivation of the exact variance of a product of two random variables. Formulas for the case of independent random variables and independent estimators (i.e., the sample mean) are given as well as associated forms when independence can not be assumed. The notation can be difficult in places. We present a slightly different derivation (Supplementary Material Section S.3.1) that may be somewhat easier to follow. In a companion paper, Goodman (1962) extended the results to more than two random variables. This followed with a paper on the exact covariance for products of random variables by Bohrnstedt and Goldberger (1969). All of the authors mentioned previously who have used big BAF sampling have assumed independence either implicitly, or explicitly by citing a small correlation betweenV andB c . In such cases, the exact variance of the product estimator given in (5) should be used; a somewhat more tractable version of the full variance is presented in Eq. (S.12) of the Supplementary Material. Goodman (1960, p. 708) refers to "the usual formula" for the variance of a product as an "approximation" and cites " Yates (1953, p. 198)" as one source for this formula. This edition of Yates (1953) is unavailable to the authors, however, in both Yates (1949, p. 198) and in a later edition (Yates 1981, p. 189-190) we find what we assume to be the same approximation referred to by Goodman (1960) along with an extension when the random variables can not be assumed independent. The exact same wording is used in each of these editions; therefore, it is reasonable to assume that the presentation in the edition cited by Goodman is identical or nearly so. The genesis of the approximation is unfortunately not given in the Yates reference either; however, it is indeed the Delta method.

The Delta method
The Delta method was evidently well known in statistics at the time, though it was not always referred to as such and sometimes the results were simply stated as in the Yates (1981) reference. Ver Hoef (2012) presents an interesting history of the Delta method and traces its roots back to Dorfman (1938). In a comment on this paper, Portnoy (2013) suggests that its origins go back further and found the earliest reference was a paper by Friedrich W. Bessel (who also gave us the unbiased correction to the sample variance estimator) published in 1838. In twentieth-century statistics, Portnoy (2013) again mentions Doob (1935) who notes "There is a well-known δ-method used in statistics to find limiting variances of statistics" (p. 167) and goes on to cite two earlier works that employed it. A perhaps more popular source that covered the Delta method and was certainly available in the 1950s was Cramér (1946, p. 353) who also provides a proof of the Delta method for the mean and variance.
As noted previously, the Delta method is a first-order approximation to the variance using a Taylor series expansion. The derivation for the sample variance approximation is given in Supplementary Material (see Section S.2). The Delta method yields an approximate estimator for the variance of the mean for big BAF sampling under the assumption of independence from Eq. (S.7) as and dividing both sides byB 2 cV 2 gives. . .
taking the square root of both sides and converting to Comparing the final result in (10) with that in (8), we see that Bruce's method is the Delta method. The full result given in the Eq. (S.10) includes a covariance term when the assumption of independence is not tenable.

The ratio variance estimator
There is another method by which the variance ofV may be estimated as an alternative to using the usual sampling variance of the mean presented in (6). Gregoire and Valentine (2008, p. 259) have suggested that becauseV is actually a ratio estimator, then the ratio variance estimator would be an appropriate alternative to (6). They note that this may be worth considering if the actual variance ofV is of interest in and of itself. However, a comparison of the two variance estimators in the context of their use in the product variance estimator for big BAF sampling may be of some interest as well if there is reason to favor one over the other. It is straightforward to show (Gregoire and Valentine 2008, p. 258) that the big BAF estimator for volume (4) can be written aŝ comes from the big BAF sample and it is equivalent toV in (2).
The alternative ratio variance estimator forV is (Gregoire and Valentine 2008, p. 259). . . where the point-wise estimator of basal area applied to the big BAF sample. The associated ratio standard error estimate is given by se R V = var R V . Simulations may be used to explore the differences, if any, between the two variance estimators forV, and potentially provide some insight into whether the use of one is preferred over the other in the product estimator. Note, however, that Eq. (12) may be preferred from a theoretical standpoint because it is based on independent random samples of points; whereas by contrast individual trees used in (6) are correlated within points, resulting in the number of individual volume trees in the sample, m v , being a random variable.

Simulation experiments
The simulation experiments were conducted on two different small populations of trees. The simulator used was the sampSurf package (Gove 2012), which was developed for the R statistical analysis system (R Core Team 2020). The sampSurf simulator employs the simple model of a "sampling surface" (Williams 2001a;2001b) in which a raster tract with area A is tessellated into square grid cells of fixed area. Trees are added to the tract, and their inclusion zones are created based on the sampling method-in this case horizontal point sampling. Each grid cell has a conceptual sample point at its center, the total for the cell is accumulated for all trees whose inclusion zone includes the grid cell center. The surface itself therefore, is the total attribute value of interest over all grid cells. The tracts used here are square with a grid cell size of 1 m 2 for both of the simulated populations.
The experimental design employed nine sets of simulations using all combinations of the BAF pairs (F c , F v ) with F c ∈ {3, 4, 5} and F v ∈ {10, 20, 30} for each population. For each simulation, four sampling surfaces were constructed: one each for total volume and basal area using each of the two BAF sets. This yielded 36 total sampling surfaces on each of the two populations. For each of the 9 simulation sets in each population, random samples of size n = 10, 25, 50, and 100 were drawn in a Monte Carlo experiment that was replicated 1,000 times. For each sample on each sampling surface the requisite summary statistics for HPS and big BAF sampling were computed. Thus, because both basal area and volume surfaces were created for each pair of big BAF factors (e.g., (F c , F v ) = (3, 10)) various quantities that are not available in a typical field big BAF inventory, such as individual tree VBARs for all selected trees on the count sample (from the BAF c volume sampling surface), were available in the simulations. This allows, for example, the comparison of the big BAF results with that of a full HPS inventory where all trees are measured on the BAF c sample. The simulations conducted here are modest in extent when compared with e.g., Chen et al. (2019), but they serve to illustrate the similarities in variance estimators.

The mixed northern hardwood population
The mixed northern hardwood tree population is a somewhat larger version of the population used in Gove (2017); it is completely synthetic and is contained on a tract with a total area A = 3.17 ha (31,684 grid cells). An external buffer with width 18 m encloses the internal stand with an area of 2 ha. The internal portion of the tract was populated with m = 667 trees having a total basal area of 48.4 m 2 (approximately 333 trees · ha −1 with a basal area of 24.2 m 2 ·ha −1 ) giving a quadratic mean stand diameter ofD q = 30.3 cm. This places the stand in the fully stocked region of the northern hardwoods stocking guide (Leak et al. 2014). Tree diameters at breast height were generated from a three-parameter Weibull distribution (Bailey and Dell 1973) with location, shape, and scale parameters α = 10 cm, γ = 2, ζ = 20 cm, respectively. The associated tree heights were generated using a metric version of the all species equation for northern hardwoods in New Hampshire (Fast and Ducey 2011) augmented by an additive Gaussian perturbation with standard deviation of 2.5 m. The trees were distributed throughout the internal tract area using a spatial inhibition process with inhibition distance of 3 m (Venables and Ripley 2002, p. 434). Bound-ary overlap was corrected by allowing the inclusion zones to penetrate into the buffer region (Masuyama 1953;Gregoire and Valentine 2008, p. 224). Individual point samples drawn according to the Monte Carlo scheme described above are allowed to fall anywhere within the entire tract region A, thus preserving the full inclusion probabilities for each tree.
The sampSurf simulator requires taper data for each tree, either from direct measurements or generated from an appropriate taper function. The following built-in default taper function (Van Deusen 1990) for diameter at height 0 ≤ h ≤ H was used in the generation of each tree where D b is the butt diameter and D u the top diameter at height H. The taper is controlled by the parameter r, and was randomly generated for each tree within the range r ∈[ 1.5, 3]; with overall stem forms following 0 < r < 2 a neiloid, r = 2 a cone, and r > 2 a paraboloid. The individual tree volumes corresponding to each tree's taper were generated by the methods given in (Gove 2011b, p. 8).
The correlation between VBAR and BA for the population is ρ(Vb) = 0.62. Histograms of the resulting DBH and height distributions are shown in Figure S.2.

The eastern white pine population
The eastern white pine (Pinus strobus L.) population was created from Barr & Stroud FP-12 dendrometry data taken over a 20-year period in various pure, even-aged white pine stands in southern New Hampshire (see Gove et al. 2000 for more details). The dendrometry measurements were processed using the R Dendrometry package (Gove 2011a). The population is composed of m = 316 white pine trees, some of which were measured more than once during this period. This population is set within a tract of 1 ha in size with an 18 m buffer surrounding it to fully contain the inclusion zones, yielding a total tract size of A = 1.85 ha (18,496 grid cells). The total basal area for the population is 47.2 m 2 , yieldingD q = 43.6 cm. This places the stand well within the range of full stocking on the eastern white pine stocking guide (Leak and Lamson 1999). The trees, having been measured in multiple stands without location information, were distributed within the internal tract area using a spatial inhibition process with inhibition distance of 3 m, similar to the northern hardwood stand. Boundary overlap was again handled using Masuyama's method in the simulations by allowing sample points to fall in the buffer outside the internal 1 ha tract area containing the trees. Because the white pine population was dendrometered, these measurements provide the taper data for the sampSurf simulator rather than a taper equation. Within sampSurf , such data are modeled using a cubic spline fitted to the raw measurements for each individual tree (Gove 2011b), though calculation of volumes is via Smalian's method (Kershaw et al. 2016, p. 141). Histograms of the white pine DBH and height distributions are shown in Figure S.5.

Results
The sampling surface results from the population specifications given in the previous section are found in Table 1. Note that the resulting surface estimates of volume and basal area are quite close to the population values for each species. An example set of sampling surfaces with F c = 3 and F v = 30 is presented in Figures S.1 and S.4 for the northern hardwoods and white pine populations, respectively. The surface results in Table 1 clearly illustrates the higher stocking of volume and basal area in the white pine stand than for the hardwoods, which mimics what would often be found in practice when comparing fully-stocked stands of these forest types in New England. The species histograms also indicate that the northern hardwoods diameter distribution ( Figure S.2) is more positively skewed than the white pine distribution ( Figure S.5), and the tree heights are shorter in general for the hardwoods, which, combined with the synthetic taper equation used for the hardwood volumes accounts for much of the difference in total volume between the two populations. Thus, the white pine, while on a tract half the size of the hardwood tract, and with half the population size, still carry much higher volume per tree than the hardwoods. The coefficients of variation in percent (CV%) (see, e.g., Freese 1962, p. 13) are given in Table 1 for both volume and basal area. The results clearly demonstrate that as the BAF increases, the population variance increases, yielding a larger CV%. This is a natural phenomenon of the variance that is related to a smoothing of the sampling surface itself as the inclusion zone size (or plot size) expands. In this case, the related BAFs for both volume and basal area demonstrate that the larger inclusion zones produce smaller variance. This phenomenon is a consequence of the so-called 'empirical law' of the variance first recognized by Smith (1938). In forestry it was applied in terms of the CV (Freese 1961) and extended to the variable plot method by Wensel and John (1969) (see, e.g., Gove 2017 or Lynch 2017 for recent literature reviews). The CV% for the northern hardwoods is consistently lower (with the exception of F v = 30) for both volume and basal area, presumably due to the shorter, smaller diameter trees compared to the white pine population. The sampling efficiency is defined as the percentage of surface grid cell centers-or sample points-that are covered by one or more inclusion zones. Of course, this must decrease with increasing BAF in accord with the process behind the prior observations on the inclusion zone size in relation to overall surface variance.

Monte Carlo simulations Observed sample ratios
In theory, the number of count and volume trees sampled on an inventory using the big BAF method should come close to the ratio of the two BAFs used in sampling. For example, using F c = 4 and F v = 10, the sample ratio, F v F c , should be 10 4 = 2.5 count trees sampled for each volume tree. However, in practice the actual ratio in an inventory could vary widely purely by chance, and that variation could influence the distribution of the resulting estimates, as well as the accuracy of approximate variance formulae and the resulting confidence limit coverage. The results from the Monte Carlo simulations are presented for each sample size in Tables S.1 and S.3. The population ratio in the tables are calculated as above, directly from the ratio of the respective BAFs. The results for each sample size are given as a ratio of means estimate, where the means are over all individual sample ratios from the Monte Carlo replicates. It can easily be seen that for each sample size and species the results, on average, match very closely with the population target ratio. However, as noted above, the individual sample ratios for each Monte Carlo replicate can vary quite a bit from these mean values. Figures S.3 and S.6 display two sets of histograms for the northern hardwoods and white pine results. The results clearly demonstrate that at the smaller sample sizes the distribution of the sample ratios is positively skewed, with some observed sample ratios several times the mean. The distributions quickly trend to more Gaussian with much smaller variance as the sample size increases. This same trend manifests itself in the other BAF combination for each population.
Tables S.1 and S.3 illustrate that there are several combinations of BAFs that produce similar population sample ratios. For example, BAF pairs of (F c , F v ) = (3, 20) and (5, 30) both yield approximately six count trees per volume tree. The companion results in Tables S.2 and S.4 present the standard deviations for individual Monte Carlo replicate sample ratios by sample size. Note that the (3, 20) BAF pair tends to have smaller variation in observed sample ratios than the larger (5, 30) pair over all sample sizes for both populations. One might argue that this could be due to the fact that the (3, 20) pair samples almost one more count tree per volume tree than the (5, 30) pair. However, this trend seems to hold for other similar combinations, where the sample intensity is reversed. For example, the (3, 10) and (5, 20) pairs have sample ratios of three to four count to volume trees, respectively, and again the smaller pair of BAFs has the smaller variation in observed sample ratios, even though the larger pair samples relatively more count trees on average for both populations. However, this trend is not universal. It is quickly observed that within a given BAF v gauge and sample size, n, the variability in sample ratio decreases with increasing BAF c . Similarly, the variability increases with increasing BAF v for a given BAF c . The increase in this latter case is much larger than for the former. Evidently a plausible explanation for this phenomenon is that as the BAFs converge, the target sample ratios also decrease; and in the limit, this decrease sim-ply yields a sample taken with one BAF as under traditional HPS. Thus, it appears that the larger the difference between the BAFs, the larger the variation in hitting the target sample ratio. Finally, as expected, the Monte Carlo sample ratio variability decreases within a given BAF pair as sample size increases in accord with the convergence to Gaussian distributions (e.g., Figures S.3 and S.6).

Standard error comparisons
The Delta method and Goodman's method variance of the mean estimators in Eqs. (9) and (5) are compared in the following in terms of the corresponding standard error of mean equations (i.e., their square roots). Figure 1 presents the results of the Monte Carlo simulations for the northern hardwoods in terms of the average standard errors by sample size and (F c , F v ) pair. Each panel corresponds to a 'Smith' plot (Smith 1938), with the inverse of BAF v used as a surrogate for average inclusion zone size under HPS (e.g., Arvanitis and O'Regan 1967;Gove 2017;Yang et al. 2017). The reference for comparison in terms of the smallest standard error attainable under the simulations is that for the BAF c HPS variance. The most obvious and expected trend is that the standard errors decrease with increasing sample size over all BAF pairs. A second salient result is that, within a given sample size, the standard errors increase with increasing BAF c ; thus, either BAF could have been used in the Smith plot abscissa, though the decrease in variance with inverse BAF c is less pronounced. For n = 10, the standard errors partition nicely into two sets: the first corresponding to the use of se V , while the second uses the ratio standard error estimate, se R V . Within each set, the results for Goodman's method are uniformly lower than those for the Delta method. These distinctions quickly disappear as the sample size gets larger and are indistinguishable at n = 50. In accordance with Smith's theory, all standard errors decline as the average inclusion area increases through the inverse of BAF v , quickly converging towards the reference HPS standard error at F v = 10 for all values fof BAF c (note carefully the standard error scales on each panel to Fig. 1 The northern hardwoods Monte Carlo standard error simulation results as the average over 1,000 replications for each BAF pair and sample size with the Delta Method ( ) and Goodman's (+), using the variance of the mean (6) (dashed) and the ratio variance estimator (12) (dotted). The reference line (solid, •) is the average Monte Carlo standard error for the BAF c HPS results see the magnitude of the difference), though convergence increases as BAF c increases.
For the white pine population (Figure S.8) the standard errors have merged to where they are no longer individually indistinguishable at n = 25 for all combinations of (F c , F v ) pairs. There is virtually no discernible difference between the Delta method and Goodman's method at any sample size, except for the slight difference at the smallest sample size of n = 10, which converges by F v = 10. A similar observation applies for the difference between the two different standard error estimators for theV component. Lastly, overall convergence to the HPS reference is much more gradual than for the northern hardwood results in all cases, evidently requiring a smaller BAF v , approaching BAF c , where the distinction of big BAF sampling from pure HPS becomes irrelevant.
For completeness, the Monte Carlo results for se V and se R V are presented in Figures S.7 and S.9 for the northern hardoods and white pine populations, respectively. Since these results are unaffected by the choice of BAF c , only one level of conditioning is shown in the panels. The results for both populations echo the trends above, showing se V > se R V for n ∈ (10, 25), with any differences disappearing as the sample size increases. The results for the two populations differ only in magnitude of the average estimated standard errors.

Confidence interval capture rates
The confidence interval capture rates for the northern hardwoods population are found in Fig. 2. The rates are all nominally 95% and range within the Monte Carlo experiments from 93.5%-95.2%, depending on the sample size and BAFs used. Each result is the capture rate over all the 1,000 Monte Carlo replicates. Within this range the rates can vary somewhat based on sample size and the BAFs used. In general there is nothing remarkable about the results since this range of capture rates is quite close to the nominal rate in all cases. The lowest rates occur in the smallest sample sizes (n = 10, and 25) for F v = 30, which is a direct result of the small number of trees selected for measurement with this BAF v . Beyond that, the capture rates for the ratio estimator, se R V , are slightly lower at n = 10, F v = 30 and F c ∈ (3, 4) than for se V , a result that is in accord with the smaller standard errors realized for the ratio method at these settings. Of course the confidence interval capture rates depend on the individual Monte Carlo sample draws; thus, even the HPS capture rates vary somewhat (94%-95%) from the nominal, which is to be expected.
The confidence interval capture rates for the white pine population are found in Figure S.10. The rates are again all nominally 95% and range within the Monte Carlo experiments from 93.2%-96%. This slightly wider range for capture results, as compared with the northern hardwoods, may reflect the marginally higher degree of variability in individual tree size within the white pine population as compared to the northern hardwoods. Indeed, even the HPS intervals vary from 93.2%-95.9%, covering essentially the full range of variability in the big BAF results. However, none of the capture rate results is unreasonable for a variable population in comparison to the nominal level. And in general, the degree of concordance between each of the different variance estimator combinations for big BAF sampling is remarkable and more consistent than in the less variable northern hardwoods population.

Correlation
The reason for examining the correlations between VBARs and basal area relates to the question of whether the simple Delta method or Goodman's method variance estimators in Eqs. (9) and (5) are relevant, or whether more terms are required to account for covariances in the case where independence is untenable. However, as discussed in more detail later, the question of how exactly to calculate the relevant correlations is not a straightforward one-in fact it is a bit of a conundrum. For now, we present the results for three approximate estimators for the correlation. In addition, the population correlation, ρ(V i , b i ), i = 1, . . . , m, was computed on an individual tree basis over all trees in each population.
The first estimator is based on those individual trees that were sampled using the BAF v angle gauge and are tree-wise correlationsρ(V i , b i ) over all trees i = 1, . . . , m v . The second is a point-wise estimatorρ V v s , B c s , s = 1, . . . , n, between the total VBAR and basal area per point with Additionally, V v s = 0 on count points where no big BAF trees have been selected. Lastly, the third estimator, ρ V v s ,B c s , is also point-wise withV v s = V vs m vs andB c s = B cs m cs . Note, of course, thatB c s is constant at each point or zero. This will diminish the correlation somewhat.
The results for the northern hardwood population are shown in Fig. 3. First, the tree-wise correlation summaries,ρ(V i , b i ), appear to be fairly constant at all but the smallest sample size of n = 10, where there is a slight decline as BAF v increases, regardless of BAF c . Sinceρ(V i , b i ) is an estimator of the population correlation, ρ(V i , b i ), it is not surprising that increasing the sample size shows a convergence toward the population value. However, whileρ(V i , b i ) is an estimator of the population correlation, ρ(V i , b i ), it is not an estimator for the sample correlation required to make decisions about Fig. 2 The northern hardwood Monte Carlo simulation results for confidence interval capture rates as the average over 1,000 replications for each BAF pair and sample size with the Delta Method ( ) and Goodman's Method (+), using the usual sampling variance of the mean (6) (dashed) and the ratio variance estimator (12) (dotted). The reference lines (solid, •) are the average Monte Carlo capture rates for the BAF c HPS results the necessity for handling covariance in the big BAF sampling variance. Therefore, these results are of limited interest in addressing the question of whether the addition of sample covariance is required in variance estimation.
A somewhat more reasonable approach is to aggregate the tree-wise component into a point-wise component, the second and third estimators of correlation,ρ V v s , B c s andρ V v s ,B c s , are an attempt at so doing. These correlations uniformly decrease with increasing BAF v over all sample sizes. Presumably, this is due to fewer trees being sampled with the larger BAF v gauges. For example, Figures S.1b and S.4b both show the high number of sample points with either one or no sample trees selected, whereas the companion BAF c figures show large number of overlap zones for basal area estimation, with few zero count points. This is only one Monte Carlo realization out of 1,000, but serves to illustrate the point that is well-known in HPS. Because the sample point selection is completely random, this same trend manifests itself over all sample sizes.
One might object that sinceB c s is based on the expanded single-tree basal area, this might correlate well with individual tree VBARs on points with m v s = 1. However, that misses the variation in the V i among trees, each of which are matched with the same average value of estimated basal area; this result lowers the point-wise correlations in such cases. Sampling more volume trees per point with BAF v must tend to average out some of the variability resulting in a higher correlation at smaller values of BAF v . The results for the white pine population parallel those discussed here (Figure S.11) with the exception that there is no discernible convergence ofρ(V i , b i ) to the population value.
These results begin to elucidate the problem of computing a covariance or correlation between point-wise aggregates such as mean basal area B c s and mean VBARs V v s : collapsing the latter into an average per point can yield misleading results that get worse the fewer points sampled with any measurable BAF v trees tallied. Moreover, using the totals rather than averages per point, results in similar trends, though raised somewhat higher. Brooks (2006) evidently used a hybrid of the two, where the point-wise basal area, B c s , was paired withV v s to determine the correlations.

Discussion
The Delta method has been suggested here as the potential antecedent to what foresters call "Bruce's method. " In addition, the suggestion has been advanced that the Delta method was a well-known method for approximating the variance of a product of two random variables at the time when Bruce's method evidently first appeared in the American literature on forest sampling by Bell and Alexander (1957). It can be conjectured that their application was either adopted by Bruce (1961) (though not cited there), or alternatively that Bruce (1961) may have known about it independently. However, another source contemporary to these authors that covers various aspects of forest sampling is that of Freese (1962). It is interesting to note that Freese (1962) evidently also knew about the Delta method approximation and advocated its use (Goodman's method was published a couple years earlier, though Freese may not have been aware of its existence at the time). In fact, he gives a version (i.e., Freese 1962, p. 17) of the full first-order expansion that includes the covariance and is equivalent to Eq. (S.10). Similar to the aforementioned authors, however, he gives no source for this equation. Again, we are left with the conjecture that the Delta method was a well-known method in use in statistics at the time, in fact, so much so, that it was unnecessary to give citations when it was used.
As previously noted, there is somewhat of a conundrum in regard to calculating either the correlation or the covariance on which it depends in extended product variance estimation. The problem relates to the use of either the Delta method in (9) or Goodman's method in (5) to determine the overall variance for the inventory. By this point, it should be clear that these estimators have two components based on var V in (6) and var B c in (7). To reiterate, var V is calculated tree-wise over all m v tally trees, but var B c is calculated pointwise over all sample points, n, in the inventory. The two variances thus have different sample support, one is trees, the other points. The problem then lies in the fact that to compute a valid covariance between two random variables, they must share the same sample support: either trees or points, not both. Theρ V v s ,B c s correlations presented previously, for example, attempt to circumvent the inherent support of each variable by combining the two in a way that allows an approximation to the covariance (which could be quite poor in reality) to be computed over the sample points. However, this is obviously not what is required. Any such approximation that aggregates one or the other random variable to a different sample support can be thought of as a transformation from one sample space to another. The consequences of such a transformation are unknown. The use ofρ V v s ,B c s was presented as an example to illustrate the aforementioned problem. In addition, it was noted that there are other ways to aggregate the tree-wise measurements into point-wise summaries (and each of these produces different estimates of the correlation), but none of these is what is required to calculate a valid correlation and no one approximation can be preferred over the others since the true value is not calculable.
The ramifications of the previous discussion are that if a correlation is computed based on aggregation, it is bound to be misleading at best. In addition, the covariance on which it is based, and which is the real basis for the problem because of its need for commensurate support, will also be incorrect no matter how the tree-wise VBARs are aggregated. Thus, there seems to be little possibility of using either the Delta method or Goodman's method in the extended case (i.e., either Eqs. (S.10) or (S.12)) where there may be correlation since neither the indicator of such,ρ, nor the covariance itself can be properly formulated. This seems to be a peculiarity in the application of either the Delta method or Goodman's method to big BAF sampling (via Bruce's method) that does not manifest itself in other applications of these methods. This is because in general applications the sample support will normally be the same for both components of these composite variances. There appears to be no immediate solution to this dilemma with the problem as traditionally formulated. It is suggested that the only reasonable application of (9) or (5) is where no aggregation procedure is used to approximate the correlation or covariance in big BAF sampling. Fortunately, as noted in the simulation results, both meth-ods appear to work well as judged by confidence interval capture rates in the big BAF setting.
In the simulations, we compared the results of using two different methods for calculating the standard error of tree VBARs, the normal theory (6) and ratio (12) estimators. In general, the results were indistinguishable between the two estimators, with the exception of the combination of the smallest sample size and largest BAF v s. In such cases, there was a small but operationally insignificant difference between the two. Thus, we can not recommend one over the other in general, other than to use the one that is most appealing. However, there is one interesting (but hopefully unnecessary in practice) case where the ratio variance is clearly superior. In the case where only one tree (m v = 1) is sampled with BAF v over all points, the standard estimator (6) fails because of the sample size of one. The ratio estimator, however, is based on the number of sample points; therefore, the only time it can not be used to estimate a variance for VBAR trees is when no trees at all are sampled (or n = 1). Again, this is simply pointed out here as a curiosity and we caution against using this little observation in practice for obvious reasons.
There are other variance estimators that can be applied to big BAF sampling. Two nonparametric approaches are the jackknife and bootstrap estimators. Some simulations were run using these estimators, but there was no apparent advantage to their use in terms of standard error or confidence interval coverage rate. In addition, the usual HPS estimator for variance in volume was calculated in the simulations for the BAF v sample, similar to that presented here for the BAF c sample (the target variance estimate in the simulation results). However, predictably, the results of these estimates were much higher than those reported in the standard error comparisons. In addition, the confidence interval capture rates were also consistently higher than the nominal level, on the order of 95.3%-99.9%, with the rates increasing with increasing BAF v .

Conclusions
The origins of Bruce's method as a variance estimator in double sampling designs, and specifically the big BAF method, appears to be unknown as it was used in early publications with no citation given. A plausible explanation for Bruce's method has been given through the derivation of the Delta method, which was known and used in the statistics literature and popular texts prior to the use of Bruce's method. The relationship of the Delta method approximation to Goodman's method is quite well-known and was mentioned by Goodman (1960), who post-dates the original use of Bruce's method in forestry. The simulation results show that there is no practically important difference between the approximate and exact methods, as would be expected based on the small variance cross-product difference between the two. Furthermore, our results have demonstrated that one has a choice between two different methods for calculating the variance of the tree VBARs, again with no discernible difference in practice, except when very few trees are chosen with the BAF v gauge in the second phase sample. We caution that there does not seem to be an exact method for the determination of the covariance or correlation between tree VBARs and basal area point counts because of the disparity in sample support under the usual interpretation of (4) used here; however, there is an alternative interpretation under development that will accommodate a true point-wise covariance and correlation interpretation (Lynch et al. 2021, in press). This suggests that using an approximate correlation from aggregation to determine whether the corresponding covariance approximation term is required should not be recommended. The simple Bruce's method (Delta method) appears to be robust enough based on the capture statistics that it is probably a good approximation in most cases.