Diagnosis of environmental controls on daily actual evapotranspiration across a global flux tower network: the roles of water and energy

The relative contributions of environmental factors to daily actual evapotranspiration (ETa) across climate zones remain a wide-open research question, especially regarding the roles played by soil water content (SWC; water supply) and net radiation (Rn; energy supply) in controlling ETa. Here, the boosted regression tree method was employed to quantify environmental controls on daily ETa using the global FLUXNET dataset. Similar to the general trend suggested by the Budyko theory at annual scales, the results showed that the relative control of SWC on daily ETa increased with increasing aridity index (Φ); however, Rn played a major role at most FLUXNET sites (roughly Φ < 4), indicating that Rn could be a leading control on daily ETa even at water-limited sites. The variability in the relative controls of SWC and Rn also partly depended on factors affecting water availability for daily ETa (e.g. vegetation characteristics and groundwater depth). Our study showed that, other than SWC and Rn, the net effect of environmental controls (particularly leaf area index) on daily ETa was more important at drier sites than at relatively humid sites. This suggests that near-surface hydrological processes are more sensitive to variations in vegetation, which can extract deep soil water and enhance ETa, especially under arid and semi-arid climatic conditions. Our findings illustrate how environmental controls on daily ETa change as the climate dries, which has important implications for many scientific disciplines including hydrological, climatic, and agricultural studies.


Introduction
Knowledge of actual evapotranspiration (ETa) is fundamental in understanding terrestrial water cycles and ecosystem functioning (Oki and Kanae 2006, Maxwell and Condon 2016), and also has important implications for hydrological, climatic, and agricultural studies (Wang and Dickinson 2012, Fisher et al 2017). Across different spatiotemporal scales, ETa interacts with a multitude of land surface and meteorological processes through complex feedback mechanisms, e.g. the significant links between regional precipitation (P) and evaporation (Zveryaev and Allan 2010) and the coupling of changes in gross primary production with surface energy partitioning (Ciais et al 2005). As a result, ETa tends to display significant variability over space and time (Vivoni et al 2008, Trambauer et al 2014). Therefore, it remains challenging to elucidate the impacts of different environmental factors (e.g. energy and water) on ETa under varying land surface and climatic conditions (Wang and Dickinson 2012).
At annual and mean annual time scales, ETa is primarily modulated by energy and water inputs as illustrated by the well-known Budyko framework (Budyko 1974). This framework has been widely used to characterize water and energy exchanges across the land-atmosphere interface by linking ETa with potential evapotranspiration (ETp; energy input) and P (water input), and has also been applied at monthly or even shorter time scales (Zhang et al 2008, Chen et al 2013). To this end, hydroclimatic regimes can be classified into (a) water-limited and (b) energy-limited environments, depending on the relative supplies of energy and water (Haghighi et al 2018). Given that ETa responses to changes in energy and water inputs differ considerably between the two environments, their differentiation is important for understanding a range of hydrological processes, such as the impacts of climate change on streamflow (Arora 2002) and regional water balance (Renner and Bernhofer 2012). In addition, the relationships of ETa with soil water content (SWC; an indicator of water supply) and ETp vary significantly across sites (Wang and Dickinson 2012) and exhibit multidimensional functions modified by a suite of atmospheric, soil, eco-physiological, and biogeochemical factors (Zhang et al 2001, Vivoni et al 2008, Chang et al 2018, Haghighi et al 2018). For instance, Gu et al (2006) found that vapor pressure deficit (VPD) interfered with how SWC and net radiation (Rn) affected surface energy partitioning, which was also demonstrated by Novick et al (2016). Vivoni et al (2008) showed that vegetation greening affected the relationship between ETa and SWC along a latitudinal gradient of land covers. Such complex environmental conditions make it challenging to isolate the net effect of water and energy supplies on ETa.
Previous studies have also suggested that the impacts of different controlling factors on ETa vary across different time scales (Cheng et al 2011, Ding et al 2013). At diurnal scales, the variability in ETa is mainly affected by atmospheric water demands, which are primarily determined by air temperature (Ta), Rn, and VPD (Wilson et al 2003, Shi et al 2008). At seasonal and interannual scales, the site-specific variability in ETa could be explained by changes in either annual P or Rn, depending on the water and energy availability (Brümmer et al 2012, Liu and Feng 2012). As a further demonstration, Scott and Biederman (2019) showed that the dominant controls on ETa changed from sub-daily (weather) to seasonal (soil moisture) scales in a semi-arid savanna ecosystem.
From the aforementioned studies, the question arises as to whether the conceptualization of the Budyko framework can be extended to daily time scales (e.g. whether the water input is the primary control on daily ETa in water-limited environments, and vice versa), as applied previously (e.g. Zhang et al 2008). This question is particularly challenging due to the complex interactions of daily ETa with various environmental factors, which have yielded mixed findings in previous studies. For instance, Ford et al (2014) found that daily ETa was strongly influenced by SWC when Rn was above normal in a semi-arid region, while Liu et al (2019) showed that Rn was the leading factor even at arid sites. Rim (2008) showed that the combined effect of Rn, SWC, and wind speed (WS) accounted for >70% of daily ETa variations in semi-arid watersheds, with Rn being the dominant factor. In addition, Williams and Torn (2015) reported that leaf area index (LAI), rather than SWC, had a significant control on daily surface heat flux partitioning in the U.S. Southern Great Plains. Gong et al (2007) reported a similar finding at a semi-arid site in northwest China. Note that those studies were based on data from a limited number of sites with certain climatic conditions. Due to the complex interplay between ETa and the surrounding environment, the controls on daily ETa variability across different climate zones and ecosystems at the global scale are still not well understood, which calls for further research.
To this end, long-term observed data from a global network of eddy covariance (EC) towers (i.e. FLUXNET) were analyzed to systematically assess the relative controls of environmental variables on daily ETa within a quantitative framework, using the boosted regression tree (BRT) technique. Based on the BRT results, we evaluated the roles played by different factors across the globally distributed network, particularly by water and energy as stressed by the Budyko theory, in affecting ETa at the daily time scale under different land surface and climatic conditions.

Data acquisition
FLUXNET is a global network of EC towers that measure site-level water, carbon, and energy exchanges between the atmosphere and biosphere (Baldocchi et al 2001). In situ data on meteorological and soil variables are also collected at the FLUXNET sites. In this study, the FLUXNET 2015 Tier 1 dataset with daily values (e.g. latent heat (LE), sensible heat (H), soil heat flux (G), P, SWC, Ta, soil temperature (Ts), Rn, relative humidity (RH), VPD, and WS) was obtained. This dataset has undergone rigorous data processing and quality control for research purposes (Pastorello et al 2014, Vuichard and Papale 2015). Full details of FLUXNET and the data processing are available at https://fluxnet.fluxdata.org/. Based on data availability, EC sites with at least 2 years of observations were initially chosen to ensure the inclusion of sufficient data points in the statistical analysis.
At the selected FLUXNET sites, daily ETp was calculated from measured weather data using the Penman-Monteith equation (Allen et al 1998). The aridity index Φ (Φ = ETp/P, where ETp and P here denote mean annual values) was used to define the climate dryness at the sites. Based on the classification from the Food and Agriculture Organization (FAO; Spinoni et al 2015), the FLUXNET sites were divided into four categories: humid (Φ ⩽ 1.3), sub-humid (1.3 < Φ ⩽ 1.5), semi-arid (1.5 < Φ ⩽ 5), and arid (5 < Φ < 20) sites.
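The classification above can be sketched as a small helper. This is only an illustration of the thresholds quoted in the text; the function names and the "unclassified" label for Φ ⩾ 20 are assumptions, not part of the original analysis.

```python
def aridity_index(mean_annual_etp_mm, mean_annual_p_mm):
    """Return the aridity index: mean annual ETp divided by mean annual P."""
    if mean_annual_p_mm <= 0:
        raise ValueError("mean annual precipitation must be positive")
    return mean_annual_etp_mm / mean_annual_p_mm


def aridity_class(phi):
    """Map the aridity index to the four classes listed in the text."""
    if phi <= 1.3:
        return "humid"
    if phi <= 1.5:
        return "sub-humid"
    if phi <= 5:
        return "semi-arid"
    if phi < 20:
        return "arid"
    # Values of 20 or more fall outside the four classes given in the text;
    # labelling them "unclassified" is an assumption made for this sketch.
    return "unclassified"
```

For example, the two sites discussed later in the text, AU-ASM (Φ = 5.43) and US-KS2 (Φ = 1.17), fall into the arid and humid classes, respectively.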
The land cover at the FLUXNET sites varied considerably, including cropland, forest, grassland, savanna, shrubland, and wetland according to the classification system given by the International Geosphere-Biosphere Programme (Loveland et al 2001). LAI data, which are routinely used to quantify vegetation coverage, were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite land products. Specifically, based on the geographical coordinates of the FLUXNET sites, LAI data with a temporal resolution of 4 d and a spatial resolution of 500 m were retrieved from the MCD15A3H dataset (after 2002; version 6; https://e4ftl01.cr.usgs.gov/MOTA/) to assess the impact of vegetation on ETa. The global soil database of Shangguan et al (2014) with a spatial resolution of 30" × 30" and the global dataset of the depth to groundwater table (DTGT) of Fan et al (2013) with a spatial resolution of ∼1 km were used to obtain the soil texture and DTGT, respectively, at each FLUXNET site, according to their geographical coordinates.

Data pre-processing
All sites were initially screened to remove low-quality data. First, only measurements after 2002 were used, as this is the starting date of the LAI dataset, and sites with no SWC or energy flux data were omitted. Secondly, the time periods with missing energy flux data (i.e. LE, H, Rn, and G, with a maximum gap of 51.6% of the raw data) were removed. Thirdly, to minimize the effect of freezing conditions on SWC measurements, the time periods with Ts below 0 °C were not included in the analysis. Fourthly, to address the energy balance closure problem, data with |ECR − 0.83| ⩾ 0.22 (forest sites) and |ECR − 0.87| ⩾ 0.15 (other sites) were removed (Stoy et al 2013), where the ECR (energy closure ratio; defined as (LE + H)/(Rn − G) based on the daily data) is a standard measure for evaluating the performance of EC measurements. Finally, data with evaporative fraction (EF; defined as LE/(LE + H)) values greater than 1 were also removed (Liu et al 2019), where EF represents the fraction of available energy used for ETa.
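A minimal sketch of the daily screening rules for Ts, ECR, and EF described above is given below. The function name and argument names are hypothetical, and the handling of zero denominators is an assumption; the thresholds are those quoted in the text.

```python
def keep_day(le, h, rn, g, ts_celsius, is_forest):
    """Return True if one daily record passes the Ts, ECR, and EF screens.

    le, h, rn, g are daily fluxes in the same units (e.g. W m-2).
    """
    # Freezing soil: days with Ts below 0 degrees C are excluded.
    if ts_celsius < 0:
        return False

    available = rn - g
    turbulent = le + h
    # Assumption: records with a zero denominator are discarded.
    if available == 0 or turbulent == 0:
        return False

    # Energy closure ratio, ECR = (LE + H) / (Rn - G).
    ecr = turbulent / available
    center, tolerance = (0.83, 0.22) if is_forest else (0.87, 0.15)
    if abs(ecr - center) >= tolerance:
        return False

    # Evaporative fraction, EF = LE / (LE + H); values above 1 are removed.
    ef = le / turbulent
    return ef <= 1
```

Applying such a predicate row by row to the daily records, after dropping days with missing fluxes, would reproduce the order of the screening steps listed in the paragraph.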
LAI data were then further processed to obtain a high-quality LAI dataset. First, on the basis of the quality control information, data acquired under unfavorable conditions (e.g. the presence of cloud and/or snow cover) were filtered out. Secondly, a cubic smoothing spline method was used to interpolate the 4 day interval data into daily values (Horn and Schulz 2010). Finally, a robust method based on the Savitzky-Golay filter was used to smooth out noise in the LAI time series (Chen et al 2004).
In addition, multicollinearity among predictor variables (e.g. SWC, Rn, LAI, RH, VPD, WS, and Ta) may affect the performance of the BRT model in predicting the response variable (i.e. ETa). Before the BRT analysis, Pearson's correlation coefficient (rp) and variance inflation factor (VIF) analysis (i.e. |rp| < 0.7 and VIF < 5) were used to test the multicollinearity among predictor variables (Dormann et al 2013), which led to the use of five factors (SWC, Rn, LAI, RH (or VPD), and WS) as predictor variables in this study. Note that due to the strong multicollinearity between RH and VPD at most sites, either RH or VPD was used based on two criteria: (a) no multicollinearity between the retained RH or VPD and other variables; (b) the selected predictor variables at the sites for each aridity class remained the same as much as possible. Sites exhibiting high multicollinearity between any two of the five variables were also removed.
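The pairwise part of this screening can be illustrated as follows. This sketch covers only the |rp| < 0.7 criterion; the VIF check, which requires regressing each predictor on the others, is omitted. Function names and the example values are hypothetical.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def collinear_pairs(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    predictors maps a variable name to its daily series.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Illustrative values only: RH and VPD are strongly (negatively) correlated.
example = {
    "RH": [80.0, 70.0, 60.0, 50.0, 40.0],
    "VPD": [0.4, 0.8, 1.2, 1.6, 2.0],
    "WS": [2.0, 1.0, 3.0, 1.0, 2.0],
}
flagged_example = collinear_pairs(example)
```

In this toy example only the RH and VPD pair is flagged, which mirrors the reason given in the text for retaining only one of the two at each site.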
After the above procedures, a total of 78 FLUXNET sites were used in the following analysis with an average of 852 data points (i.e. 852 d) (see figure 1(a) and table S1 available online at http://stacks.iop.org/ERL/15/124070/mmedia, for site-specific details). Data retained after the filtering processes were used to assess the impact of different variables on daily ETa.

Boosted regression tree method
The BRT method was adopted to quantify the relative controls of predictor variables (i.e. SWC, Rn, LAI, RH/VPD, and WS) on the response variable (i.e. ETa) at the selected sites (Breiman et al 1984, Elith et al 2008). The BRT method couples the regression tree method with boosting algorithms to model the relationship between predictor and response variables, and it does not require any specific data distribution. This method is particularly suitable for analyzing data with outliers, interactions among predictor variables, and missing values (Elith et al 2008). Given its advantages, the BRT method has been increasingly used in a variety of research fields, such as public health (Bhatt et al 2013), nitrogen oxide distributions (Sayegh et al 2016), and carbon stocks (Lin et al 2016).
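To make the idea of boosting regression trees and deriving relative contributions concrete, the following is a simplified sketch, not the procedure used in this study. It boosts depth-one trees (stumps) on squared error and scores each predictor by the summed squared-error reduction of its splits, scaled to 100%. Stochastic subsampling, deeper trees, and cross-validated tree numbers, which are typical of BRT implementations, are left out; all names and default settings are assumptions.

```python
def fit_stump(X, residuals):
    """Find the single split that most reduces squared error.

    Returns (gain, feature, threshold, left_mean, right_mean) or None.
    """
    n = len(X)
    total = sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            low, high = X[order[k - 1]][j], X[order[k]][j]
            if low == high:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split.
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (low + high) / 2, left_sum / k, right_sum / (n - k))
    return best


def boosted_stumps(X, y, n_trees=200, learning_rate=0.1):
    """Fit boosted stumps; return (relative influence in %, fitted values)."""
    n = len(y)
    fitted = [sum(y) / n] * n
    influence = [0.0] * len(X[0])
    for _ in range(n_trees):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        best = fit_stump(X, residuals)
        if best is None or best[0] <= 1e-12:
            break  # nothing left to explain
        gain, j, threshold, left_mean, right_mean = best
        influence[j] += gain
        for i in range(n):
            step = left_mean if X[i][j] <= threshold else right_mean
            fitted[i] += learning_rate * step
    total = sum(influence)
    relative = [100.0 * v / total if total > 0 else 0.0 for v in influence]
    return relative, fitted
```

With rows of X holding, for instance, daily SWC, Rn, LAI, RH (or VPD), and WS, and y holding daily ETa, the first return value would play the role of the percentage contributions reported in the results, under the simplifications stated above.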
In this study, the open-source BRT package developed by Elith et al (2008) in R software (version 3.5.1, R Core Team, 2018) was employed to derive the BRT model at each site. A detailed introduction to the BRT method and the procedures for constructing the BRT model and for computing the relative control of each predictor variable on ETa are provided in the supplemental text S1. The performance of the BRT model was evaluated using the Nash-Sutcliffe Efficiency (NSE), root mean square error (RMSE), and coefficient of determination (R²) by comparing observed ETa with modeled ETa for each site (table S2).
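The three performance measures can be written out as follows. This is a sketch using the standard definitions; the text does not state how R² was computed, so taking it as the squared Pearson correlation between observed and modeled values is an assumption, as are the function names.

```python
import math


def nse(observed, modeled):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def rmse(observed, modeled):
    """Root mean square error, in the units of the inputs."""
    sse = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    return math.sqrt(sse / len(observed))


def r_squared(observed, modeled):
    """Squared Pearson correlation (assumed definition of R²)."""
    n = len(observed)
    mean_o, mean_m = sum(observed) / n, sum(modeled) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modeled))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_m = sum((m - mean_m) ** 2 for m in modeled)
    return cov**2 / (var_o * var_m)
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model performs no better than the mean of the observations.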

Two typical examples from the FLUXNET dataset
The dataset covered a wide range of hydroclimatic regimes as illustrated in figures 1(b)-(d), which show the distributions of mean annual P, ETp, and ETa (the latter calculated from the unfiltered daily latent heat flux data) across different climatic zones. The diversity of hydroclimatic conditions at the FLUXNET sites led to considerable variations in mean annual ETa, from 122.0 to 1459.4 mm yr⁻¹, providing an ideal dataset for examining the impacts of various influencing variables on daily ETa under different environmental conditions.
To illustrate the roles played by water and energy inputs in controlling daily ETa, figure 2 provides the time series of daily P, SWC, Rn, and ETa at two typical sites with contrasting conditions (an arid climate at AU-ASM with Φ = 5.43 and a humid climate at US-KS2 with Φ = 1.17). The temporal variations in daily ETa were largely controlled by the P inputs at AU-ASM, whereas they corresponded well with the changes in Rn at US-KS2, clearly demonstrating the different roles of water and energy in controlling ETa under water-limited and energy-limited conditions according to the Budyko framework. In addition, the BRT results revealed that the contributions of SWC and Rn to ETa were 59.9% and 8.9% at AU-ASM, and 21.0% and 64.5% at US-KS2, respectively, illustrating the viability of using the BRT method for diagnosing the controls on daily ETa. More intriguingly, at AU-ASM, similar SWC levels during the same season in different years led to different magnitudes of daily ETa (as indicated by the open squares in figures 2(a)-(c)). This observation suggested that at this arid site, factors other than SWC were also non-negligible in controlling daily ETa, which has important implications for hydrological cycles in arid regions (e.g. episodic groundwater recharge (GR); Crosbie et al 2012).
Figure 2. Example daily curves from two typical sites (an arid climate at AU-ASM and a humid climate at US-KS2) used in this study including precipitation (P), soil water content (SWC), net radiation (Rn) and actual evapotranspiration (ETa).

Environmental controls on daily ETa across the FLUXNET sites
Figure 3 displays the relative contributions of different environmental factors in affecting daily ETa for four aridity classes from the BRT analysis (see table S2 for site-specific contributions). The BRT models exhibited good performance, with high NSE and R², and low RMSE values at most sites. Overall, Rn was the dominant factor with an average contribution of 49.2% across all sites, followed by LAI (24.8%), SWC (14.8%), RH/VPD (7.0%), and WS (4.2%). As expected, the Rn contribution tended to decrease with an increase in climate dryness, while the SWC contribution showed an opposite trend. Interestingly, the impact of LAI appeared to be greater under drier conditions, whereas no clear patterns could be observed for RH/VPD and WS.

Environmental controls on daily ETa at humid and sub-humid sites
At the humid and sub-humid sites (i.e. Φ ⩽ 1.5), daily ETa fluxes were predominantly affected by Rn with an average contribution of 59.8%, while the average SWC contribution was comparatively small (9.4%), reinforcing the vital role of energy in determining daily ETa at sites with relatively abundant water supplies.
Note that exceptions existed mainly in the cold region of the northern USA with forest land cover (see also figure S1), where LAI was the primary contributing factor; this is clearly demonstrated by the fact that the rapid increases in daily ETa during early summers mainly coincided with leaf emergence (figure S2). In this region, the climate is of a continental type with short growing seasons and cold winters (Cook et al 2004). The seasonal variations in LAI were pronounced and ETa occurred mainly in the growing seasons. One likely reason for the dominant control of LAI on daily ETa is the dense forest cover, which could block the incoming solar radiation and thus suppress soil evaporation (Bagley et al 2017). Therefore, ETa primarily consisted of plant transpiration, which in turn was controlled by LAI.

Environmental controls on daily ETa at semi-arid and arid sites
At the sites with relatively abundant energy inputs (i.e. Φ > 1.5), the SWC constraint on ETa increased with increasing climate dryness, while the impact of Rn showed an opposite trend (figure 3). This trend was aligned with the general understanding of the roles that are played by water and energy in controlling ETa. However, interestingly, the BRT results indicated that the average contribution of Rn (32.1%) outweighed that of SWC (23.4%), which was consistent with the recent findings at a few water-limited sites in China (Liu et al 2019), but was still unexpected at the global scale. This underscores the importance of energy in influencing ETa and consequently other hydrological processes (e.g. GR), even under water-limited conditions across the globally distributed FLUXNET sites.
Several reasons might explain the seemingly contradictory finding that Rn was the leading factor at some sites with Φ > 1.5. First, soil texture varied noticeably across the FLUXNET sites. It appeared that daily ETa at some sites with high clay content tended to be more affected by Rn (e.g. the RU-Ha1 site with an average clay content of 38% in the topsoil (0-10 cm)). Wang et al (2009) found that under similar climatic conditions, clayey soils had higher water holding capacities, giving soil water a greater opportunity to be evapotranspired. Secondly, the presence of shallow groundwater could alleviate the constraint of water stress on ETa in water-limited regions, primarily due to capillary rise into the root zone and/or direct root water uptake from groundwater for ETa (Yue et al 2016). For instance, at the CN-Cng site in northeast China, where the land cover was grass with Φ = 3.63 and the respective contributions of Rn and SWC were 63.6% and 5.5%, the long-term average DTGT was 0.83 m. As shallow groundwater can provide an additional water source for ETa, energy subsequently becomes a limiting factor for ETa at sites that are classified as water-limited on the basis of climatic conditions alone (Yue et al 2016). Therefore, the interplay between DTGT and rooting depth is another important factor to be considered in water-limited regions for quantifying ETa (Maxwell and Condon 2016).
It should be emphasized that, based on the BRT results, LAI was shown to have an important influence on daily ETa with an average contribution of 34.6% across the semi-arid and arid sites (most notably in Australia and the USA). Note that the sites with LAI as the leading factor were mostly grassland or savanna ecosystems (see figure S1). These sites exhibited similar seasonal patterns of vegetation growth that followed soil water and solar radiation status. In the growing season, transpiration made a major contribution to ETa, the daily ETa rate remained continuously high, and its amount accounted for most of the annual ETa. As an indicator of biomass, higher LAI generally leads to greater requirements for plant transpiration, which was strongly coupled to photosynthesis under arid conditions (Scott et al 2015, Williams and Torn 2015). Moreover, plants in arid regions tend to access deeper soil water, which is a temporally more stable (or less variable) source for plant transpiration (e.g. Schenk and Jackson 2002, Wang et al 2015). LAI seemed to be an improvement over available soil water as a measure of the land surface state relevant to surface heat flux partitioning (Williams and Torn 2015); thus, the ETa responses to soil water status are mostly reflected by the changes in LAI. Therefore, our results suggested that the change in vegetation conditions would have a greater impact on ETa and subsequently on hydrological cycles in drier regions than in humid regions. In short, the net effect of environmental controls, other than SWC and Rn, on daily ETa appeared to be more critical at drier sites (Φ > 1.5) than at humid sites (Φ ⩽ 1.5).

Roles of energy and water in controlling daily ETa
To compare the impacts of Rn and SWC on daily ETa, figure 4(a) shows the changes in the Rn and SWC contributions with Φ across the FLUXNET sites. Although the impact of SWC on daily ETa increased, as expected, as the climate became progressively drier, Rn was the leading factor at most sites. Based on the fitted curves in figure 4(a), the overall trend suggested that the impact of SWC exceeded that of Rn when Φ was roughly greater than 4 within the selected FLUXNET sites. Although the threshold of Φ ≈ 4 would likely vary depending on the choice of the sites, our study provides strong evidence from the globally distributed FLUXNET network that Rn can play a major role in controlling daily ETa even in water-limited environments (e.g. Φ > 1.5 as defined by the FAO for semi-arid and arid climates).
As an effective tool for diagnosing the coupling mechanism between water and energy exchanges across the land-atmosphere interface, the Budyko hypothesis states that the limiting factor for controlling ETa gradually shifts from energy in humid environments to water in arid environments (Budyko 1974, Haghighi et al 2018). Following Williams et al (2012) who also used the FLUXNET dataset, the relationship between Φ and ETa/P at mean annual time scales is demonstrated in figure 4(b) for the globally distributed FLUXNET sites. Those sites were divided into two groups based on the relative contributions of Rn and SWC. Note that ETa/P ratios were greater than 1 at some sites, particularly at water-limited sites, owing to additional water supplies from other sources (e.g. shallow groundwater). In addition, it should be emphasized that long-term data were used to quantify the controls of Rn and SWC on daily ETa in this study (e.g. an average of 852 d for the selected FLUXNET sites), which was in line with the annual time scale required by the Budyko framework.
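For orientation, the shift from energy to water limitation can be illustrated with the classical Budyko (1974) curve, which relates ETa/P to Φ. The text does not state which functional form, if any, was drawn in figure 4(b), so the use of this particular curve and the function name below are assumptions.

```python
import math


def budyko_evaporative_index(phi):
    """Classical Budyko curve: ETa/P as a function of the aridity index."""
    if phi <= 0:
        raise ValueError("aridity index must be positive")
    return math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))
```

The curve approaches ETa/P = Φ for small Φ (energy limit) and ETa/P = 1 for large Φ (water limit), so observed ratios above 1, as noted for some water-limited sites, lie outside what the curve allows and point to water sources other than local precipitation.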
Nevertheless, our results further demonstrated that the contribution of Rn could exceed that of SWC in controlling ETa at water-limited sites. We propose that the differences between our findings at the daily scale and the Budyko theory at the annual scale are also highly dependent on site-specific conditions (as indicated by the scatter of the data points). Part of the reason was that at daily or shorter time scales (unlike annual time scales for the Budyko theory; Istanbulluoglu et al 2012), concentrated rainfall events could suppress daily ETa due to lower Rn and higher RH during rainfall events at water-limited sites (e.g. the AU-ASM example in figure 2(a)); as a result, rainfall water might have a better chance of passing below the root zone or becoming surface runoff. It is thus reasonable to argue that the temporally averaged SWC state does not necessarily reflect the amount of soil water available for daily ETa, especially from a modeling perspective.

Conclusions
The global FLUXNET dataset was analyzed in this study using the BRT method to assess the contributions of various influencing factors in controlling daily ETa across different climate zones and ecosystems. The average contributions of Rn, LAI, SWC, RH/VPD, and WS were 49.2%, 24.8%, 14.8%, 7.0%, and 4.2%, respectively, across the selected FLUXNET sites. Overall, the BRT results showed that SWC became more important in controlling daily ETa with increasing Φ; meanwhile, the daily FLUXNET data revealed that Rn still played a pivotal role at water-deficient sites, which was closely related to other environmental factors, such as vegetation, soil texture, and groundwater depth. Moreover, LAI exerted a stronger influence on daily ETa at drier sites than at humid sites, suggesting that hydrological processes in drier regions are more sensitive to variations in vegetation conditions. As a result, the net effect of environmental controls other than SWC and Rn on ETa was more important at drier sites.

Data availability statements
LAI data are available at https://e4ftl01.cr.usgs.gov/MOTA/, and the global soil database is available at http://globalchange.bnu.edu.cn/. The depth to groundwater table data should be obtained from the authors of the reference provided in this paper.
The data that support the findings of this study are openly available at the following URL/DOI: https://fluxnet.org/data/fluxnet2015-dataset/.