Quantifying distribution in carbon uptake and environmental measurements with the Gini coefficient

The Gini coefficient is a measure used in economics to evaluate the equitability of the distribution of a resource across a population. This project applied the Gini coefficient as a classification method for a decade-long data set consisting of environmental observations and carbon flux data for a coniferous forest in Finland. Our results show consistency in the Gini coefficient for environmental variables, even with interannual variation in the measurements during the carbon uptake period (the period when the ecosystem is absorbing carbon from the atmosphere). The Gini coefficient calculations showed that this ecosystem has an inequitable distribution of carbon uptake and release within the carbon uptake period, which is comparable to the inequitable distribution of temperature and precipitation during the same time period. We also calculated the percentage of the carbon uptake period that has passed for different cumulative proportions of a measurement. Future applications of the Gini coefficient to other ecosystems will enhance knowledge of the distribution of environmental and flux measurements across the carbon uptake period.


Introduction
Understanding anticipated changes in climate due to anthropogenic inputs is a major scientific focus of the twenty-first century (Stocker et al., 2013). Short-term changes in temperature, as well as variation in the timing and amount of precipitation, have measurable effects on the interannual patterns of carbon gain for terrestrial ecosystems (Ciais et al., 2005; Keenan et al., 2014; Piao et al., 2008; Wohlfahrt et al., 2013). Likewise, ecological forecasting and mathematical modeling studies suggest long-term changes in patterns of precipitation, temperature, and other climate extremes impact plant habitat, species composition, and ultimately carbon dynamics of terrestrial ecosystems (Hoover & Rogers, 2015; Park, Jeong, Ho, & Kim, 2015). To better understand future carbon dynamics of terrestrial ecosystems, examining the distributions of both carbon gain and loss, as well as environmental measurements linked to short-term carbon dynamics, will be essential.
Nearly continuous measurements of precipitation, temperature, and fluxes of carbon dioxide and water at the ecosystem scale are available through ecological networks such as FLUXNET (Baldocchi et al., 2001, http://daac.ornl.gov/FLUXNET). This network contains over 500 sites worldwide in a variety of biomes, with some sites providing over fifteen years of continuous data (Baldocchi, 2008; Baldocchi & Meyers, 1998). Ecosystems in this network are instrumented to collect data on carbon dioxide, water vapor, energy exchanges, and other environmental measurements such as precipitation and temperature. One key measurement is the net ecosystem carbon exchange (abbreviated in this study as NEE; see Table 1 for a summary and description of all measurements used in this study). Half-hourly changes in NEE are driven by two biological processes at the whole-ecosystem level: gross primary productivity (GPP), which transforms atmospheric carbon to simple sugars by photosynthesis, and total ecosystem respiration (TER), which includes both autotrophic and heterotrophic respiration. Typically GPP and TER are derived from measurements of NEE using functional relationships that include temperature, moisture, or other environmental measurements (Desai et al., 2008; Reichstein et al., 2005). The measurements of NEE, GPP, and TER are all rates of change (all have units of g carbon m$^{-2}$ day$^{-1}$). A value of NEE is calculated as the difference between the two positive rates GPP and TER. Positive values of NEE indicate that, at that point in time, the ecosystem is a net source of carbon to the atmosphere; negative values imply that the ecosystem is absorbing carbon from the atmosphere. Because NEE is a rate of change, the cumulative sum or integral of NEE across a year provides a measurement of the net carbon uptake accumulated by an ecosystem (Zobitz, 2013).
CONTACT John M. Zobitz zobitz@augsburg.edu
The period of decreasing annual cumulative NEE during the year is called the carbon uptake period, which is similar in length to the growing season (Churkina, Schimel, Braswell, & Xiao, 2005). This study aims to understand patterns in the distribution of environmental and micrometeorological measurements during the carbon uptake period.
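As a minimal sketch of how a carbon uptake period might be located from daily NEE, the following Python code takes the period to run from the peak of cumulative NEE to its subsequent minimum. This peak-to-trough rule, the function name `carbon_uptake_period`, and the example values are assumptions made for illustration; the study does not spell out its algorithm, and its analysis was carried out in R rather than Python.

```python
from itertools import accumulate


def carbon_uptake_period(daily_nee):
    """Return (first_day, last_day) as 0-based day indices, or None.

    Assumed rule (not taken from the study): the period covers the days
    between the peak of cumulative NEE and its overall minimum.
    """
    # cumulative[k] is the NEE accumulated over the first k days.
    cumulative = list(accumulate(daily_nee, initial=0.0))
    # Number of days elapsed when cumulative NEE is lowest.
    trough = min(range(len(cumulative)), key=cumulative.__getitem__)
    # Highest cumulative NEE at or before the trough (start of the decline).
    peak = max(range(trough + 1), key=cumulative.__getitem__)
    if peak == trough:
        return None  # cumulative NEE never declines
    return peak, trough - 1


# Hypothetical daily NEE (g carbon m^-2 day^-1); negative values are uptake.
example_nee = [1.0, 0.5, -0.5, -2.0, -1.5, -0.5, 0.5, 1.0]
period = carbon_uptake_period(example_nee)
```

Real flux records are noisy, so a practical implementation would likely smooth the cumulative curve before locating its turning points.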
One approach to understanding how a quantity is distributed is with the Gini coefficient. This measure was developed by Corrado Gini as a way to quantify how equitably a resource is distributed in a given population (Gini, 1932). Standard applications of the Gini coefficient are to income distribution across groups through time (Jantzen & Volpert, 2012; US Census Bureau, Data Integration Division, n.d.). In other contexts the Gini coefficient has been applied to measure disproportionality in ship traffic (González Cancelas, Palomino Monzón, Soler-Flores, & Almázan Gárate, 2013), energy resources (Catalano, Leise, & Pfaff, 2009), inequality in plant size or fecundity (Damgaard & Weiner, 2000), and spatial distributions of different land types (Huang, Xia, & Yang, 2013), as well as to classify forest structure (Russell, Woodall, D'Amato, Domke, & Saatchi, 2014).
The objective of this study is to investigate the feasibility of the Gini coefficient as a classification scheme for ecosystems using data provided by FLUXNET. To simplify the analysis, this study calculates the Gini coefficient using data from a coniferous forest in Finland during its carbon uptake period. We address the following key questions: (1) How disproportionate are the distributions of GPP, TER, and other environmental measurements? (2) Are there similarities between the distributions of environmental measurements (temperature, radiation, precipitation) and the distributions of GPP and TER?

Environmental site description
This study utilized data from Hyytiälä, Finland, designated as 'Free Fair-Use' in the FLUXNET database (http://fluxnet.fluxdata.org/data/la-thuile-dataset/). This site (61.8475 latitude, 24.2950 longitude) is a coniferous evergreen needleleaf forest dominated by Scots pine (Pinus sylvestris), located at an elevation of 185 meters above sea level. Average growth of the trees is about 8 m$^{3}$ hectare$^{-1}$ year$^{-1}$. The climate of Hyytiälä is characterized by cool, humid summers (Suni et al., 2003). Data analyzed in this study span the years 1997 to 2006.

Environmental and carbon flux data
This study utilized six different measurements: three measurements of environmental or meteorological data along with three measurements of the carbon flux density (NEE, GPP, and TER). All measurements are provided in half-hourly increments. To smooth out stochasticity in the data we aggregated, or in some cases averaged, the environmental and flux measurements to a daily value. NEE is calculated as the difference between the two positive rates GPP (carbon uptake) and TER (carbon release), according to the following algebraic equation:
$$\mathrm{NEE} = \mathrm{TER} - \mathrm{GPP}$$
At the ecosystem scale TER and GPP cannot be measured directly but rather are inferred from process-based models (Desai et al., 2008; Reichstein et al., 2005; Reichstein, Stoy, Desai, Lasslop, & Richardson, 2012). Typically TER is assumed to be an exponential function of temperature (TER = f(T)). Once TER is specified by an air temperature measurement, GPP is calculated as the difference between TER and NEE (GPP = TER − NEE). We recognize that, given this direct dependence of modeled respiration on temperature, we would expect strong similarities in the calculated results for the Gini coefficient between TER and temperature. Environmental data included air temperature, precipitation, and photosynthetic photon flux density (a measure of the spectral range of solar radiation available for photosynthesis), which in this study we refer to as radiation (Campbell & Norman, 1998). The measurements are obtained through the FLUXNET database and processed according to standard published methodologies (Papale et al., 2006). Gaps in a half-hourly measurement are due to instrument malfunction or to periods of atmospheric stability, during which the eddy covariance technique can underestimate the flux.
In these situations, gap-filling techniques are employed for a half-hourly measurement to produce a reasonable flux value (Reichstein et al., 2005).
Notes: Abbreviations are as follows: CUP = carbon uptake period (days), or the period of decreasing annual cumulative net carbon uptake, similar in length to the growing season; Precip = total precipitation (mm); Radiation = average daily incoming photosynthetically active radiation (µmol m$^{-2}$ day$^{-1}$); T = average temperature (°C); NEE = total net ecosystem exchange, or net carbon absorbed by the ecosystem (g carbon m$^{-2}$); GPP = total gross primary productivity, or total carbon absorbed through photosynthesis (g carbon m$^{-2}$); TER = total ecosystem respiration, or total carbon released during respiratory processes (g carbon m$^{-2}$).
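The flux partitioning described in this section (TER as an exponential function of temperature, with GPP obtained from TER and NEE) can be illustrated with a short Python sketch. The Q10 form of the exponential, the parameter values `r_ref`, `q10`, and `t_ref`, and the function names are assumptions chosen for illustration only; they are not the functional forms or fitted values of the cited partitioning methods.

```python
def respiration_q10(temp_c, r_ref=2.0, q10=2.0, t_ref=10.0):
    """TER as an exponential function of air temperature.

    Hypothetical Q10 model: respiration equals r_ref at t_ref (deg C)
    and is multiplied by q10 for every 10 deg C of warming.
    """
    return r_ref * q10 ** ((temp_c - t_ref) / 10.0)


def partition_flux(nee, temp_c):
    """Split a measured NEE into (TER, GPP) using NEE = TER - GPP."""
    ter = respiration_q10(temp_c)
    gpp = ter - nee  # rearranged from NEE = TER - GPP
    return ter, gpp


# Hypothetical day: net uptake of 3 g carbon m^-2 day^-1 at 20 deg C.
ter, gpp = partition_flux(nee=-3.0, temp_c=20.0)
```

Because TER in such a scheme is a deterministic function of temperature, any statistic of its distribution, including the Gini coefficient, inherits structure from the temperature record.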
After the data were accessed we used the software programs R and RStudio for analysis and visualization (R Core Team, 2014; RStudio Team, 2015). Measurements of NEE, GPP, TER, radiation, and precipitation were summed to daily values, and air temperature was averaged over the course of the day. A daily value of these measurements was excluded from the results if more than 50% of the aggregated daily data were gap-filled. Table 2 reports the average environmental conditions and the length of the carbon uptake period for each of the years studied within the data, given as an aggregate.
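The daily aggregation and the 50% gap-filling rule described above can be sketched as follows. This is an illustration in Python (the study used R); the record layout `(day, value, is_gap_filled)`, the function name `aggregate_daily`, and the sample records are hypothetical, and any unit conversion from half-hourly rates to daily totals is omitted.

```python
from collections import defaultdict


def aggregate_daily(records, how="sum", max_gap_fraction=0.5):
    """Aggregate half-hourly records to daily values.

    records: iterable of (day, value, is_gap_filled) tuples (assumed layout).
    how: "sum" for fluxes and precipitation, "mean" for air temperature.
    A day is dropped when more than max_gap_fraction of its records
    were gap-filled.
    """
    by_day = defaultdict(list)
    for day, value, gap_filled in records:
        by_day[day].append((value, gap_filled))

    daily = {}
    for day, items in sorted(by_day.items()):
        gap_fraction = sum(1 for _, filled in items if filled) / len(items)
        if gap_fraction > max_gap_fraction:
            continue  # too much of this day was gap-filled
        total = sum(value for value, _ in items)
        daily[day] = total if how == "sum" else total / len(items)
    return daily


# Four records per day for brevity; a real day has 48 half-hourly records.
records = [
    (1, 1.0, False), (1, 2.0, False), (1, 3.0, True), (1, 4.0, False),
    (2, 1.0, True), (2, 1.0, True), (2, 1.0, True), (2, 1.0, False),
    (3, 2.0, True), (3, 2.0, True), (3, 2.0, False), (3, 2.0, False),
]
daily_sum = aggregate_daily(records, how="sum")
daily_mean = aggregate_daily(records, how="mean")
```

In this example day 2 is excluded because three of its four records are gap-filled, while day 3 is retained because exactly half (not more than half) are gap-filled.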

Calculation of the Gini coefficient & cumulative proportion function
The Gini coefficient is a summary statistic that quantifies how much a distribution of measurements differs from the uniform (equitable) distribution (Catalano et al., 2009; Farris, 2010; Jantzen & Volpert, 2012). While it is typically applied to income, in this study we calculate the Gini coefficient to quantify the distribution of a measurement across the carbon uptake period. Figures 1 and 2 illustrate how we calculated the Gini coefficient from the carbon flux data. Panel (a) of each figure displays the time series of measurements of GPP (carbon uptake) and TER (carbon release) for Hyytiälä in 2001. Approximately 870 g carbon m$^{-2}$ was taken up through photosynthesis during this carbon uptake period; a histogram of these measurements is shown in Figure 1(b). Similarly, approximately 570 g carbon m$^{-2}$ was released to the atmosphere through respiration during the carbon uptake period of 2001.
The Gini coefficient is calculated from the histogram in Figure 1(b) by first computing the Lorenz curve in Figure 1(c) (Lorenz, 1905). The Lorenz curve plots the cumulative frequency distribution of the measurement (in this case GPP in Figure 1(b)) as a function of the cumulative frequency distribution of the number of measurements in each bin (Farris, 2010). The point (0.67, 0.50) on the Lorenz curve shown in Figure 1(c) signifies that half of total GPP (= 436 g carbon m$^{-2}$) is contained within the lowest 67% of the GPP measurements during this time period. If the GPP were distributed equally (meaning every day had the same value of GPP), then the Lorenz curve would be equivalent to the function y = x, or the 1:1 line, which is also called the line of perfect equality. Once the Lorenz curve is determined, the Gini coefficient is calculated as twice the area between the line of perfect equality and the Lorenz curve (equivalently, that area divided by the area of 1/2 beneath the line of perfect equality), which in this case gives a Gini coefficient of 0.37. Through this approach, the Gini coefficient is scaled from zero (indicating a resource is equally distributed) to unity (a large proportion of the resource is allocated to a single measurement). We calculated the Gini coefficient for each environmental and carbon flux measurement during the carbon uptake period from 1997 to 2006.
The Gini coefficient removes the influence of time because data are sorted by increasing values (see the histogram of GPP in Figure 1(b)). However, one way to examine the distribution of a measurement with respect to time is with cumulative proportions. Figure 2(b) displays the cumulative proportion of the carbon released by respiration (TER) during the carbon uptake period of 2001. The point (169, 0.5) in Figure 2(b) signifies that half of the carbon released by respiratory processes (= 285 g carbon m$^{-2}$) has occurred by day 169 of 2001 (June 18). Typically, the carbon uptake period has interannual variation in its starting and ending points and duration (see Table 2).
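The Lorenz curve and Gini coefficient calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' R implementation: it works on individual (for example, daily) values instead of histogram bins, it assumes all values are non-negative with a positive total, and the function names are hypothetical.

```python
def lorenz_curve(values):
    """Return (xs, ys): cumulative share of measurements vs. share of total.

    Values are sorted in increasing order, so the curve lies on or
    below the line of perfect equality y = x.
    """
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    xs, ys = [0.0], [0.0]
    running = 0.0
    for i, value in enumerate(ordered, start=1):
        running += value
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


def gini(values):
    """Twice the area between the line y = x and the Lorenz curve."""
    xs, ys = lorenz_curve(values)
    # Trapezoid rule for the area under the piecewise-linear Lorenz curve.
    area_under = sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
        for i in range(1, len(xs))
    )
    return 1.0 - 2.0 * area_under


equal_days = gini([5.0, 5.0, 5.0, 5.0])     # every day contributes equally
one_big_day = gini([0.0, 0.0, 0.0, 10.0])   # one day holds the whole total
```

With n values the largest attainable result from this discrete calculation is 1 - 1/n, which approaches unity as the number of measurements grows.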
In order to compare cumulative proportions across different years, we scale the carbon uptake period (horizontal axis in Figure 2(b)) as a cumulative proportion. Figure 2(c) is effectively the inverse function of Figure 2(b), but standardizes the carbon uptake period on a unit scale. Switching the axes makes it easier to identify when a given cumulative proportion of a measurement occurs. The point (0.31, 0.50) in Figure 2(c) signifies that during the first half of the carbon uptake period, 31% of the total TER had occurred. If the ecosystem were releasing carbon at a uniform rate during this time period, then the data would fall on the 1:1 line.
Figure 3 shows box plots for the calculated Gini coefficients for the environmental and flux variables during the carbon uptake period for each year examined in this study. The whiskers are the maximum and minimum Gini coefficient values. Outer limits on the boxes are the first and third quartiles (25% and 75%, respectively), and the middle line in the box is the median. The Gini coefficient for precipitation was fairly consistent at approximately 0.2, regardless of the amount of cumulative precipitation or length of the carbon uptake period (see Table 2).
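The rescaling of the carbon uptake period to a unit interval, described earlier in this section, can be sketched as below. The Python function name and the example values are hypothetical, and the sketch assumes non-negative daily values so that the cumulative proportion never decreases.

```python
from itertools import accumulate


def scaled_cumulative_curve(daily_values):
    """Return (measurement_fraction, period_fraction) pairs, one per day.

    measurement_fraction: share of the period total accumulated so far.
    period_fraction: share of the carbon uptake period that has elapsed.
    Both axes run from 0 to 1, so different years can be overlaid.
    """
    total = sum(daily_values)
    n_days = len(daily_values)
    return [
        (cumulative / total, day / n_days)
        for day, cumulative in enumerate(accumulate(daily_values), start=1)
    ]


# Hypothetical four-day period in which most of the flux arrives late.
curve = scaled_cumulative_curve([1, 1, 2, 4])
```

In this example the second pair is (0.25, 0.5): halfway through the period only a quarter of the total has accumulated, so the point lies above the 1:1 line.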

Results
Figures 4 and 5 display the cumulative proportion for the environmental or flux measurements across all years of the represented data. To facilitate comparison we calculated the box plot every 10% of the cumulative measurement on the horizontal axis. For example in Figure 5(b), 50% of the cumulative TER flux during the carbon uptake period occurs during the first 60% of the carbon uptake period, whereas for GPP, 50% of the cumulative flux occurs during the first 55% of the carbon uptake period. Presenting the data in this way reveals how a measurement is distributed during the carbon uptake period. Values below the blue 1:1 line in Figures 4 and 5 signify that the measurement is disproportionately larger than a uniform distribution, and values above the blue 1:1 line suggest that the measurement is disproportionately smaller than a uniform distribution. For example, precipitation (Figure 4(a)) suggests that this ecosystem has disproportionately less precipitation during the first half of the carbon uptake period than the second half, whereas radiation (Figure 4(c)) is more uniform.
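One way to evaluate such curves at every 10% of the cumulative measurement and summarize them across years is sketched below in Python. The linear interpolation within a day, the reduced summary (minimum, median, maximum rather than a full box plot), the function names, and the three example "years" are all assumptions for illustration.

```python
from itertools import accumulate
from statistics import median


def period_fraction_at(daily_values, target):
    """Fraction of the period elapsed when the cumulative proportion
    of the measurement first reaches target (0 < target <= 1)."""
    total = sum(daily_values)
    n_days = len(daily_values)
    prev_share, prev_time = 0.0, 0.0
    for day, cumulative in enumerate(accumulate(daily_values), start=1):
        share, time = cumulative / total, day / n_days
        if share >= target:
            # Interpolate linearly between the previous and current day.
            step = (target - prev_share) / (share - prev_share)
            return prev_time + step * (time - prev_time)
        prev_share, prev_time = share, time
    return 1.0  # guards against floating-point round-off at target = 1


def summarize_across_years(values_by_year):
    """Map each 10% level to (min, median, max) of the period fraction."""
    levels = [k / 10 for k in range(1, 10)]
    summary = {}
    for level in levels:
        fractions = [period_fraction_at(v, level) for v in values_by_year.values()]
        summary[level] = (min(fractions), median(fractions), max(fractions))
    return summary


# Hypothetical years: uniform, late-loaded, and early-loaded fluxes.
years = {1997: [2, 2, 2, 2], 1998: [1, 1, 2, 4], 1999: [4, 2, 1, 1]}
summary = summarize_across_years(years)
```

A median above the level itself corresponds to a point above the 1:1 line, meaning the measurement accumulates more slowly than a uniform distribution would.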

Disproportionality of measurements across the carbon uptake period
Our results indicate that the Gini coefficient for different environmental measurements tends to be fairly consistent across the years represented in the data (Figure 3), in spite of the variation in environmental conditions and the length of the carbon uptake period (Table 2). For this coniferous forest, environmental and flux measurements are not uniformly distributed across the growing season (the Gini coefficient is nonzero). These results support application of the Gini coefficient to other types of flux data as a classification scheme.
Comparing the different environmental measurements, radiation had the least variability in the Gini coefficient calculation (smallest interquartile width) in Figures 3 and 4. Not surprisingly, an increased interquartile width in the Gini coefficients (Figure 3) also corresponded to larger variation in the percentile plots. This suggests that the distribution of the light environment of this ecosystem is more consistent and uniform compared to precipitation and temperature. The uniformity of the light environment could be attributable to the forest structure: coniferous forests have less dynamic variability in their light environment than deciduous forests or agricultural ecosystems. Figure 4 suggests larger variation in precipitation both in the early and late season, with the ecosystem receiving a larger pulse of precipitation in the early season (box plots above the line y = x). The disproportionality in precipitation may affect patterns of carbon uptake through photosynthesis, as Figure 5(a) shows a similar disproportionality in carbon absorbed through photosynthesis or GPP. More uniform temperature and radiation (Figures 4(b)-(c)) in the late season may lead to more uniformity in carbon uptake and release through photosynthesis and respiration (Figure 5).
Placing the results of this study in context, climate change is ultimately predicted to lengthen the carbon uptake period as the growing season length increases (Churkina et al., 2005; Wohlfahrt et al., 2013). Observational studies have found that an increase in the carbon uptake period could lead to an overall carbon gain (Keenan et al., 2014) or loss (Ciais et al., 2005; Piao et al., 2008) for terrestrial ecosystems. Our study corroborates the conclusion that the timing of carbon uptake and release (early or late season) matters. Figure 5 demonstrates the differences in magnitude between TER (carbon release) and GPP (carbon uptake) and how the two change over the course of the carbon uptake period. Our results indicate a disproportionate amount of GPP acquired during the early spring compared to TER, but that disproportionality does not last throughout the carbon uptake period. Since the GPP flux tends to be larger in magnitude than TER, variation in the timing of spring onset may not be as deleterious as a lengthening of the carbon uptake period due to a late autumn onset.

Figure 3.
Box plots of the Gini coefficients for the environmental and flux measurements during the carbon uptake period (1997 to 2006). The label 'Precip' refers to incoming precipitation, 'T' refers to air temperature, 'Radiation' refers to incoming photosynthetically active radiation, 'GPP' refers to gross primary productivity (carbon absorbed by photosynthesis), and 'TER' refers to total ecosystem respiration (carbon released by respiratory processes).

Figure 4.
Cumulative proportion plot for the environmental measurements examined in this study, evaluated every 10% of the cumulative measurement, taken across all years of available data. Values below the blue 1:1 line signify that the measurement is disproportionately larger than a uniform distribution, and values above the blue 1:1 line suggest that the measurement is disproportionately smaller than a uniform distribution.

Figure 5.
Cumulative proportion plot for the flux measurements examined in this study, evaluated every 10% of the cumulative measurement, taken across all years of available data. Values below the blue 1:1 line signify that the measurement is disproportionately larger than a uniform distribution, and values above the blue 1:1 line suggest that the measurement is disproportionately smaller than a uniform distribution.

Similarities in the distributions of different measurements across the carbon uptake period
It is well known that two different Lorenz curves can yield the same Gini coefficient (Catalano et al., 2009; Farris, 2010), suggesting that the Gini coefficient is more a measure of disproportionality within a sample than a basis for comparison across different samples. Despite this potential pitfall, Figure 3 supports a similarity in the Gini coefficients for temperature, radiation, TER, and GPP.
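The caveat that different Lorenz curves can share one Gini coefficient can be checked with a small numerical example. The two data sets below are hypothetical and chosen only to make the point; the Python sketch uses the mean-absolute-difference form of the Gini coefficient, which for non-negative values agrees with twice the area between the line of perfect equality and the piecewise-linear Lorenz curve.

```python
def gini_mean_difference(values):
    """Gini coefficient as the mean absolute difference between all
    pairs of values, divided by twice the mean."""
    n = len(values)
    mean = sum(values) / n
    pairwise = sum(abs(a - b) for a in values for b in values)
    return pairwise / (2.0 * n * n * mean)


def share_of_lowest(values, fraction):
    """Share of the total held by the lowest `fraction` of the values
    (one point on the Lorenz curve)."""
    ordered = sorted(values)
    count = round(fraction * len(ordered))
    return sum(ordered[:count]) / sum(ordered)


# Hypothetical samples: half the days low and half high, versus
# three similar days and one large day.
sample_a = [1.0, 1.0, 3.0, 3.0]
sample_b = [2.0, 2.0, 2.0, 6.0]

gini_a = gini_mean_difference(sample_a)
gini_b = gini_mean_difference(sample_b)
half_a = share_of_lowest(sample_a, 0.5)
half_b = share_of_lowest(sample_b, 0.5)
```

Both samples give a Gini coefficient of 0.25, yet the lowest half of the values holds one quarter of the total in the first sample and one third in the second, so their Lorenz curves differ.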
Similarities in the Gini coefficients for temperature and TER can be expected because TER is modeled as a function of temperature (Desai et al., 2008; Reichstein et al., 2005). However, the similarities between the Gini coefficients for TER and GPP and the Gini coefficient for radiation were not to be expected a priori, especially given the differences between the cumulative proportion plots for radiation and those for GPP and TER (Figures 4 and 5). These differences may also be a response to the distribution of precipitation and air temperature (Figures 4(a)-(b)) and the wide variation in these measurements.
The results presented here illuminate two potential benefits of applying the Gini coefficient to environmental and flux measurements in terrestrial ecosystems. By standardizing everything to a unit scale and looking at the deviation from a uniform (equally distributed) resource, the Gini coefficient and cumulative measurement plots help characterize differences between ecosystems in the context of changing length of the carbon uptake period (Bao, Wen, Sun, Zhao, & Wang, 2014), the frequency and timing of precipitation, and short-term drought (Wei et al., 2014). The analysis presented here can also classify differences across ecosystems by how measurements are distributed, rather than through annual mean temperatures, rainfall, or moisture levels.

Conclusions
The Gini coefficient can be used as a classification for ecosystems using three environmental sub-classifications: air temperature, moisture, and radiation. While radiation tended to be consistent across seasons, the modeled variation in GPP (carbon uptake) and TER (carbon release) is more of a response to the variation in air temperature and to the models used to determine the functional forms of GPP and TER. For our site, carbon release and uptake are much lower during the first half of the growing season than during the second half. Future planned work will expand this analysis to other FLUXNET sites.
OzFlux, TCOS-Siberia, USCCC. We acknowledge the financial support to the eddy covariance data harmonization provided by CarboEuropeIP, FAO-GTOS-TCO, iLEAPS, Max Planck Institute for Biogeochemistry, National Science Foundation, University of Tuscia, Université Laval and Environment Canada and US Department of Energy and the database development and technical support from Berkeley Water Center, Lawrence Berkeley National Laboratory, Microsoft Research eScience, Oak Ridge National Laboratory, University of California-Berkeley, University of Virginia. MNS was supported by the McNair Scholars Program at Augsburg College. JMZ thanks N. L. Schoenborg for helpful discussions on this manuscript. Both authors thank M. Ott for help with R and RStudio.

Disclosure statement
No potential conflict of interest was reported by the authors.