Tree variability limits the detection of nutrient treatment effects on sap flux density in a northern hardwood forest

The influence of nutrient availability on transpiration is not well understood, in spite of the importance of transpiration to forest water budgets. Soil nutrients have the potential to affect tree water use through indirect effects on leaf area or stomatal conductance. For example, following addition of calcium silicate to a watershed at Hubbard Brook, in New Hampshire, streamflow was reduced for 3 years, which was attributed to a 25% increase in evapotranspiration associated with increased foliar production. The first objective of this study was to quantify the effect of nutrient availability on sap flux density in a nitrogen, phosphorus, and calcium addition experiment in New Hampshire in which tree diameter growth, foliar chemistry, and soil nutrient availability had responded to treatments. We measured sap flux density in American beech (Fagus grandifolia, Ehr.), red maple (Acer rubrum L.), sugar maple (Acer saccharum Marsh.), white birch (Betula papyrifera Marsh.), or yellow birch (Betula alleghaniensis Britton.) trees, over five years of experiments in five stands distributed across three sites. In 2018, 3 years after a calcium silicate addition, sap flux density averaged 36% higher in trees in the treatment than the control plot, but this effect was not very significant (p = 0.07). Our second objective was to determine whether this failure to detect effects with greater statistical confidence was due to small effect sizes or high variability among trees. We found that tree-to-tree variability was high, with coefficients of variation averaging 39% within treatment plots. Depending on the species and year of the study, the minimum difference in sap flux density detectable with our observed variability ranged from 46% to 352%, for a simple ANOVA. We analyzed other studies reported in the literature that compared tree water use among species or treatments and found detectable differences ranging from 16% to 78%. Future sap flux density studies could benefit from power analyses to guide sampling intensity. Including pretreatment data, in the case of manipulative studies, would also increase statistical power.

Northern hardwood forests are poorly adapted to water stress due to the abundance of water in this region (Pederson et al., 2014). Models predict that the northeastern USA will experience more severe dry periods in the context of overall wetter conditions in the near future due to climate change (Hayhoe et al., 2008). Other studies have suggested that water limitation could become more important with intensifying summer droughts (Brzostek et al., 2014). Because of these projections it is increasingly important to understand the controls on water use in northern hardwood species. The effects of nutrient availability on water use are not currently represented in forest hydrology models because they are poorly understood.
There have been conflicting reports on the effects of nutrient availability on tree water use. An increase in sap flux density has been observed with the addition of multiple element fertilizers in Norway spruce in northern Sweden (Phillips et al., 2001) and in eucalyptus forests in Hawaii (Hubbard et al., 2004). Nitrogen addition reduced evapotranspiration rates in loblolly pine in North Carolina (Ward et al., 2018). Calcium additions (lime and gypsum) increased transpiration in central Amazonia (Da Silva, Goncalves & Feldpausch, 2008). Calcium silicate additions may have increased sap flow at Hubbard Brook in New Hampshire, based on an observation of decreased stream flow for 3 years, after which stream flow returned to pretreatment rates (Green et al., 2013).
In 13 stands distributed across three sites in the White Mountains of New Hampshire, N and P have been added in full factorial combination since 2011 to study Multiple Element Limitation in Northern Hardwood Ecosystems (MELNHE). By 2013, foliar nutrients had responded to treatment (Wild & Yanai, 2015). Relative basal area growth across all 13 stands responded to P addition but not to N addition by 2015, based on the average tree (Goswami et al., 2018), but the dominant trees grew more in response to N addition (Hong et al., 2022). The MELNHE study design included calcium silicate additions in seven of the stands to test the hypothesis that an increase in tree water use explained the reduction in runoff following the earlier calcium silicate addition at Hubbard Brook (Green et al., 2013).
One goal of this study was to determine the role of water use by trees in reducing runoff after a calcium silicate addition by measuring sap flux density of five common northern hardwood species in plots treated with calcium silicate at the same rate as the whole-watershed addition at Hubbard Brook (Green et al., 2013). We also investigated the importance of other nutrients to tree water use by quantifying sap flux density in plots receiving additions of N and P. We predicted that sap flux density would increase with the addition of a limiting nutrient, because nutrient availability is a driver of productivity and photosynthesis. Additionally, we calculated minimum detectable differences in this study and in previously published studies to determine whether failures to detect treatment effects were due to small effect sizes or to high variability among trees. Power analyses of studies such as these can help to guide future research plans.

Site description
We studied tree water use in five naturally regenerated hardwood stands located in three forested sites in the White Mountain National Forest, New Hampshire, USA: two in each of the Bartlett Experimental Forest (44 • 02 N, 71 • 17 W) and Hubbard Brook Experimental Forest (43 • 93 N,71 • 73 W), and one in Jeffers Brook (44 • 03 N, 71 • 88 W; Table 1). The climate is humid continental, with a mean annual temperature of 4.4 • C and precipitation of 1400 mm (Bailey et al., 2003;Smith & Martin, 2001). Soils are moderately well to well drained (Schaetzl et al., 2009), coarse-loamy Spodosols and Inceptisols developed in glacial drift derived from granitic and metamorphic silicate rocks (Vadeboncoeur et al., 2012;Vadeboncoeur et al., 2014). Dominant tree species are American beech (Fagus grandifolia Ehrh.), sugar maple (Acer saccharum Marsh.), and yellow birch (Betula alleghaniensis Britton) in mature stands with the inclusion of white birch (B. papyrifera Marsh.) and red maple (A. rubrum L.) in successional stands (Fatemi et al., 2012;Naples & Fisk, 2010). Tree height ranged from 9.9 m to 21.3 m in the successional stands and 10.3 m to 33.2 m in the mature stands.
These five stands are part of a study of Multiple Element Limitation in Northern Hardwood Ecosystems (Table 1; Hong et al., 2021). Within each of these stands there are four plots measuring 50 × 50 m, except for Hubbard Brook Successional, which has 30 × 30 m plots. These plots have been treated since 2011 with 30 kg ha −1 yr −1 of N as NH 4 NO 3 , 10 kg ha −1 yr −1 of phosphorus as NaH 2 PO 3 , both N and P, or were left untreated. A fifth plot in these stands received a one-time application of 1,150 ha −1 of calcium silicate (wollastonite) in 2011, or in 2015 for Hubbard Brook Successional.

Field methods
Sap flux density measurements were taken during the summers of 2013, 2014, 2015, 2017, and 2018 for periods of 7 to 48 days. In 2013 and 2014, we examined three tree species in three mature stands in plots that received calcium silicate and control treatments (six plots total; Table 2). In 2015, our goal was to assess the effects of N and P treatments (four plots) on one species in one successional stand (Table 3). During those years, data were sparse due to poor contact between the probes and the sapwood, but this problem was corrected in 2017. In 2017, one species in a mature stand in Bartlett was examined, and in 2018, three species were examined in the Hubbard Brook Successional stand. The number of trees of each species monitored in each plot varied by stand and year (Tables 2 and 3). Table 1 Site descriptions of the five forested stands studied in the White Mountains of New Hampshire. Date of stand initiation was determined from the last clearcut or heavy harvest event. Quadratic mean diameters and basal areas are based on trees with diameters greater than 2 cm. Stem density is calculated for trees greater than 10 cm in diameter. Sap flux density measurements were made using the thermal dissipation method (Granier, 1987). This technique used a pair of stainless-steel probes 20 mm long and 1.8 mm in diameter. One probe was wrapped with copper constantan thermocouple wire (Type T) and generated a constant flow of heat (0.2 W). The other was a reference probe that received no heat. Both probes were coated with heat-conducting paste and inserted into aluminum sleeves 20 mm long and 2.4 mm in diameter for protection during insertion into the tree. The thermal dissipation method may underestimate sap flux density depending on species, portion of probe in the heartwood, and wood type (Flo et al., 2019;Peters et al., 2018), but a consistent bias would not impair our ability to detect treatment effects. We Table 3 Studies of sap flux density in response to nitrogen and phosphorus addition showing the date of the study, the number of annual nutrient applications prior to each study, and the number of trees of each species used in analysis in each treatment plot. tested for treatment differences in sap flux in the youngest xylem; reporting sap flow per tree or scaling to the stand level would require information on sapwood depth, which was not measured. We selected trees with full canopies, without noticeable dead branches or cankers. A total of 202 trees were instrumented, of which 174 produced usable data. The number of trees instrumented per species per plot ranged from 1 to 6, averaging 3.2 ± 1.1 (standard deviation; Tables 2 and 3). Before probe installation, ∼4 cm 2 of bark was removed to the cambium at a height of 1.37 m for the reference probe and 1.47 m for the heated probe on the south-facing side of the tree. After the bark was removed, a hole 21 mm deep and 2.8 mm in diameter was drilled in the middle of the exposed cambium for each probe.

Dates (# of years of treatment)
Once probes were inserted, they were protected from precipitation with a plastic cover sealed with acid-free silicone caulk. Closed-cell reflective polyethylene insulation was stapled over the plastic cover to shield solar radiation. The probes were connected to cables no longer than 20 m to minimize loss of power from cable resistance, trampling by humans, and disturbance by wildlife.
Temperature differences between the heated and reference probes in each plot were recorded every 15 min as an average of thirty 30-second readings on a multiplexor and stored on a multichannel data logger (Campbell Scientific CR800). To power the data logger and multiplexer, the first study, in 2013, used two 12 V deep-cycle marine batteries per plot charged by two solar panels; later studies used three batteries linked together.
Gaps in usable data were usually due to cable disturbances or improper probe installation; in the first year solar panels proved insufficient to maintain their charge, and data were very spotty. The number of days of usable data was also improved by more frequent maintenance of wire connections, probes, and batteries.

Data processing
Temperature differences were converted into sap flux density using BaseLiner (version 3.0.10, developed by Ram Oren, Duke University; Oishi, Hawthorne & Oren, 2016). BaseLiner used the following equation to calculate sap flux density (g H 2 O m −2 of sapwood day −1 ): where the constants, 119 (Watts • C −1 ) and 1.23, were derived using the quantity of heat applied to the probes (Granier, 1985;Lu, Urban & Ping, 2004). T max ( • C) is the ''baseline'' value that was manually chosen as the maximum temperature difference between the two probes from 10:00 PM to 4:30 AM EDT. T max was set for each day for each tree, defining nocturnal flux as zero, which accounted for any thermal drift that may have occurred from day to day. T ( • C) is the temperature difference between the probes at each 15-minute observation period. We used the same equation for all trees because species-specific coefficients were available for only one of the five species studied (Peters et al., 2021).
The experimental unit was a treatment plot, and the observational unit was a tree. Because of the many barriers to continuous collection of high-quality data, selecting data for analysis was an important step. Twenty-eight trees (not shown in Tables 2 and 3) never produced usable data due to faulty probe installation in the early years of the study. For each tree, we used the average daily sap flux density calculated from summed 15-minute averages for each day. Days used in analyses were mostly dry and sunny (PAR > 800 W/m 2 for at least 7 hours). Days were excluded from analyses if sap flux values did not follow the characteristic diurnal curve or if the majority of the probes in the stand were not functioning simultaneously (Tables 2 and 3).

Data analysis
The five studies of calcium silicate addition were analyzed in ANOVA Type III sum of squares with average daily sap flux density as the response variable and treatment, species, and the number of years post treatment as categorical fixed effects. The post-treatment years were nested within two categories: ''early'' and ''late'', based on the finding that evapotranspiration increased for 3 years after a calcium silicate addition and then returned to normal (Green et al., 2013). Treatments were nested within stands and stands were treated as a random effect. The assumption of normality of residuals was met through a logarithmic transformation of sap flux density. We tested a model that included additional interaction terms, but the Akaike's Information Criterion (AIC), 44.4, was higher than that of the simpler model (1.48) (Table S1). Data from 2018 were analyzed independently using the same model without time since treatment. For this analysis, the assumption of normality of residuals was met without transformation.
Our study design included all combinations of N and P addition, but no combinations of calcium with N or P. To take advantage of the factorial N × P addition, we analyzed the N and P treatments separately from the calcium silicate addition treatments. Thus, the three studies in N and P addition plots were analyzed together in a randomized full factorial ANOVA with average daily sap flux density as the response variable and species and stand as fixed effects. Three stands were studied in three different years, which precluded distinguishing the effects of stands from the effects of the year of measurement. A model including the interaction of species and treatment was not an improvement, according to the AIC (97 compared to 86) ( Table S2). Normality of the residuals was achieved by logarithmic transformation of sap flux density. Additionally, the year with the best data quality (2018) was analyzed independently using the same model with the exception of excluding stand as a fixed effect. For this analysis, the assumption of normality of residuals was met without transformation. We tested whether nutrient additions (Ca and N × P) affected sap flux density at any time of day, using trees from 2017, which had the largest sample size of a single species (sugar maple). These effects were tested for each hour over three days using ratios of instantaneous sap flux density to the average for the day. There were no significant treatment effects when a Bonferroni correction was applied (p ≥ 0.32). Therefore, only results using daily average sap flux densities were reported.
All analyses were conducted using the ''lme4'' package in R version 3.5.2 (R Core Team, 2018).
Our second objective was approached by calculating coefficients of variation (CV) and minimum detectable differences for our study and published studies. Coefficients of variation across trees were calculated for each treatment by stand and year studied using the average daily sap flux density per tree. Coefficients of variation across treatments were also calculated for each stand and year studied for comparison to within-treatment CV's. Published studies were chosen for minimum detectable difference analyses from a search using the keywords ''sap flux'', ''sap flow'', ''trees'', and ''treatment addition'', and were selected if they reported sample sizes, mean sap flux density, and standard deviation or error for each treatment or species of interest.
The minimum detectable difference in sap flux density among treatments or species in our study and for published studies was calculated using the following equation (Zar, 1984): where k is the number of treatments, S 2 is the sample variance, φ 2 is the non-centrality parameter dependent on β and α, and n is the sample size. Power (1-β) was set at 0.8 and α at 0.05. For an ANOVA, S 2 is the mean squared error. This equation was rearranged to calculate the sample size needed to detect a 20% and 50% difference using data from 2017 and 2018. To estimate the minimum detectable difference from published studies, S 2 was the average of the variances across treatments (Eq. (2)).

Characteristics of sap flux density
Sap flux density peaked between 12:30 and 3:00 PM and was lowest between 2:30 and 4:30 AM Eastern Daylight Savings Time. Trees varied consistently in sap flux density across days (Fig. 1).
The first year, 2013, had the highest CV's among the five studies (Table 4). Coefficients of variation across trees within plots for each year ranged from 9% to 136% and averaged 38% within treatment plots across trees. CV's across treatments averaged 17% with a range of 8% in Bartlett Mature in 2017 to 49% in Jeffers Brook Mature in 2013. Average CV's within treatment plots were higher than CV's across treatments.

Effects of nutrient additions
In 2018, the year with the best data quality, the effect of Ca addition was a marginally significant increase of 36% (p = 0.07). With all years included, the effect of calcium silicate addition on sap flux density was not consistent (p = 0.30 for the main effect of treatment). Sap flux density did not differ consistently between ''early'' (2-and 3-years post treatment) and ''late'' (4 and 6 years post treatment) years (p = 0.26) nor was there a difference in response to a calcium silicate addition as the number of years post treatment increased (p = 0.88 for the interaction of treatment and time period). The five species were indistinguishable in sap flux density (p = 0.42; Fig. 2; Table 5). The effects of N and P on sap flux density were studied in white birch in Bartlett Successional in 2015, sugar maple in Bartlett Mature in 2017, and red maple, sugar maple, and white birch in Hubbard Brook Successional in 2018 (Table 3). Sap flux density differed across these three studies (p < 0.01), which could be because sites were different or because conditions differed during the measurement periods among the three years. Sap flux density in the Hubbard Brook Successional stand, measured in 2018 after 7 years of fertilization, was 63% higher than in Bartlett Successional measured in 2015 and Bartlett Mature measured in 2017. There was, however, no detectable difference in sap flux density due to the main effects of N (p = 0.50) or P (p = 0.95) or their interaction (p = 0.33). Species were also indistinguishable in sap flux density (p = 0.58; Figs. 3 and 4; Table 6).
In 2018, the best year for data quality, there was still not a consistent effect of N (p = 0.26) or P (p = 0.10) addition on sap flux density. Trees in the N and P plots had 14 and 20% higher sap flux densities, respectively, than those in the controls but trees in the NP plot did not, resulting in a weak N × P treatment interaction (p = 0.06). This is apparent in Figs. 3 and 4, where the filled blue symbols, showing the plots receiving the element not depicted on the x and y axes, fall below the open symbols. However, the interaction is based on only one plot receiving both treatments and is thus not very convincing: in this case, an environmental factor that differentially affected water use in one treatment plot would explain the observed ''interaction'' of N and P, whereby the NP plot had lower values than predicted by the main effects of N and P, which were positive, if not significant. The p value of 0.10 for the main effect of P seems possibly worthy of attention, but both of the two main effects became less significant when the interaction term was excluded from the model (p = 0.66 for P), again showing that the analysis based on only one stand is not very robust.

Power analyses
The lack of significant treatment effects does not necessarily mean that effect sizes were small (Amrhein, Greenland & McShane, 2019); rather, the minimum detectable differences  for a simple ANOVA were large, ranging from 50 to 1582 g m −2 day −1 (50% to 352% of the mean) for Ca additions and from 779 to 1209 g m −2 day −1 (46% to 134%) for N × P additions, depending on the year (Fig. 5). Improvements in methods and larger sample sizes resulted in smaller detectable differences over time, associated with lower variability from tree to tree (Table 4). Even in 2018, where detectable differences were smallest, the variation from tree to tree within a plot was large (Fig. 6). The average number of trees monitored per plot increased over time from two in 2013 to nine in 2018. Even with this increase in sample size, we were still unable to detect a treatment effect. We calculated the number of trees needed to detect a given effect size, using the studies with the best data quality, and assuming there were no differences among

Minimum detectable differences for published studies
Effect sizes reported in eight previously published studies of tree water use ranged from 8% to 243% (Table 7). The largest reported effect size was a 90% increase in sap flux density with the addition of N (80 kg ha −1 yr −1 ) in loblolly pine (Pinus taeda) seedlings (Samuelson et al., 2008). A 43% increase was reported with the addition of macro and micronutrients to a Eucalyptus saligna plantation in Hawaii (Hubbard et al., 2004), and a 35% increase was detected in a study involving the addition of P (50 kg ha −1 of P 2 O 5 −1 ) and Ca (2 t ha −1 of CaCO 3 ) on Vismia japurensis, Bellucia grossularioides, and Laetia procera in Amazonia (Da Silva, Goncalves & Feldpausch, 2008).
Minimum detectable differences ranged from 16% to 75% in temperate forests and 52% to 74% in tropical and subtropical forests (Table 7). Two studies of temperate deciduous forest species had lower calculated detectable differences than ours. A study of eight tree species in four forest types in Wisconsin had a detectable difference in sap flux density of only 16% with a sample size of 8 per species, due to low variability among trees (Ewers et al., 2002). Another study in the same region used more trees from a larger area, resulting  The three stands were studied in three different years (Table 3). in higher variability and slightly higher detectable differences (Loranty et al., 2008). Some studies had much higher effect sizes than ours (Kunert, Schwendenmann & Hölscher, 2010;Nagler, Glenn & Thompson, 2003;Samuelson et al., 2008) and thus were able to detect them with statistical confidence although their power to detect a difference was not better. Publication of insignificant findings is rare (Møller & Jennions, 2001); one earlier study took place in two MELNHE stands (including one that we studied, Bartlett Mature) with effect sizes well below detection, similar to ours (Hernandez-Hernandez, 2014).

DISCUSSION
The initial impetus for this study was the surprising decrease in stream flow for three years following a calcium silicate addition at Hubbard Brook, where tree water use was invoked as a likely mechanism (Green et al., 2013). Unfortunately, even in 2018, the year with the lowest tree-to-tree variability and the highest replication, our statistical power was too low to detect a response of sap flux density to calcium addition smaller than 50% with an α of 0.05 (Fig. 5); we obtained a p value of 0.07 for an average increase in sap flux density of 36%, which is larger than the 25% increase in water use postulated by Green et al. (2013). Thus, we lack the statistical confidence to either support or contradict the explanation that increased sap flux density accounted for the decline in runoff in the earlier calcium addition experiment. There are other possible mechanisms for increased water use in response to nutrient addition. In the MELNHE study, the relative basal area increment of the average tree increased in response to P addition (Goswami et al., 2018), and larger trees grew more in response to N addition (Hong et al., 2022). In the calcium silicate addition, trees grew more than in the unfertilized control (Battles et al., 2014;Huggett et al., 2007). An increase in diameter growth could result in increased sapwood area (Nilsson et al., 2021), which could conduct water to a greater leaf area (Wang et al., 2012) without an increase in sap flux density (Hubbard et al., 2004). It is also possible that nutrient availability affects the length of the growing season (Escudero et al., 1992), which would affect annual water use without affecting sap flux density. Other studies of sap flux density have reported statistically significant effects to treatment additions, with effect sizes as low as 35% (Table 7). The difference detectable with our factorial ANOVA comparing N and P additions is somewhat smaller than the 46% shown in Fig. 5, because Eq. (1) describes power for a simple ANOVA. The benefit of factorial designs is evidenced by two of the published studies we analyzed. The study involving combinations of phosphate, lime, and gypsum additions (Da Silva, Goncalves & Feldpausch, 2008) was able to detect a significant difference of 35%, whereas our calculated minimum detectable difference for a simple ANOVA was 52%. Similarly, the full factorial study of irrigation and N fertilization (Samuelson et al., 2008) reported a 90% increase with N fertilization with a p-value of 0.01, which shows greater power than the detectable difference of 97% calculated using Eq. (2) for a simple ANOVA with α = 0.05. Unfortunately, without a power analysis for factorial designs, we cannot better quantify the effect size ruled out by our findings in the N × P addition experiment. It remains possible that nutrient availability has an ecologically important effect on sap flux.
Our study might have benefitted from pretreatment measurements since sap flux density varied consistently from tree to tree (Figs. 1 and 6). Pretreatment data, like factorial designs,   (5) 90% increase with N fertilization (p < 0.01) 74 Samuelson et al.
In addition to including pre-treatment measurements, future sap flux density studies could benefit from incorporating explanatory variables such as leaf area, canopy position, and stomatal conductance, since these characteristics have important influences on sap flux density (Green et al., 2013;Hubbard et al., 2004;Peters et al., 2021;Phillips et al., 2001). Environmental variables such as soil moisture and light environment may also be of interest since they influence sap flux density (Oren et al., 1998a;Oren et al., 1998b). Sapwood area measurements not only allow for scaling up to whole-tree water use but also make it possible to ensure that probes are within the sapwood. If probes are inserted into heartwood, which may have occurred in our study, estimated sap flux measurements can be reduced up to 50% (Clearwater et al., 1999). Many of these variables likely contributed to the variation we observed among trees.

CONCLUSION
This study aimed to assess the influence of soil nutrient availability on sap flux density in northern hardwoods. Specifically, we sought to verify whether increased tree water use could explain the observed reduction in runoff following a calcium silicate application in an earlier study, and we tested for effects of N and P addition on sap flux density in a long-term N by P factorial addition experiment. Unfortunately, poor statistical power due to high tree-to-tree variability prevented detection of potentially important treatment effects. Previous studies that reported significant differences had larger effect sizes; few had sufficient statistical power to detect small differences. To improve the ability to detect a treatment effect, future studies should instrument more trees to increase statistical power, collect pretreatment data, when applicable, to control for tree-to-tree variability, and include additional tree metrics such as leaf and sapwood areas.