Drivers of increasing global crop production: A decomposition analysis

Rising crop production over the last half century has had far-reaching consequences for human welfare and the environment. With food demand projected to rise, one of the central challenges in minimizing agriculture’s impacts on the climate and biodiversity is to increase crop production with higher yields rather than more cropland. However, quantifying progress is challenging. When analyzed at the most aggregated, global level, yields can be defined as the total crop output per unit area per year, but aggregate yields are driven by multiple factors, only some of which have a clear relationship to improved agricultural production. To date, no research has simultaneously determined how much of rising crop production has been met by rising aggregate yields versus cropland expansion, while also quantifying the unique contribution of each yield driver. Using LMDI decomposition analysis, we find that rising aggregate yields contributed far more than cropland expansion (89% compared to 11%). That is, growing global food demand has by and large been met by growing more crops on the same amount of land, rather than expanding cropland. Our second-stage decomposition showed that nearly two-thirds of aggregate yield improvements have come from pure yield, or the output of a given crop per unit of harvested cropland area in a given country per year. The remainder has come from less-discussed drivers of aggregate yields, including cropping intensity, changes in the geographic distribution of cropland, and crop composition. Further, we use attribution analysis to show the contributions to different decomposition factors from countries grouped by climate, income, and region, as well as from different crops. Such granular yet comprehensive breakdowns of crop production and aggregate yields can support more accurate forecasts and can help focus policies on the most promising levers to meet rising food demand sustainably.


Introduction
Increases in crop production over the last half century, stemming from farmland expansion and improved technology, have had far-reaching consequences for human welfare and the environment. The growth in global cropland area has caused biodiversity losses due to habitat displacement in some regions and driven an increase in agricultural carbon emissions, whereas increasing yield via intensification has led to local losses of on-farm biodiversity [1]. With crop demand projected to increase by at least another 50% by 2050 [2,3], these impacts are likely to intensify.
One of the central challenges in meeting future food demand with minimal environmental impact is to increase crop production through improving yield rather than by expanding cropland area. Since the 1960s, new technologies and practices associated with the 'Green Revolution' raised yields to such a degree that crop production was increased by 250% even as cropland area only expanded by about 15% [4]. While increases in demand (due to rising population size and increased per-capita consumption) offset some of those gains, the net result was still large areas of land spared from conversion to farmland [5].
In the most general sense, crop yield can be defined as the total output of all crops per unit area per year, which we refer to as aggregate yield. However, this metric, while widely used to quantify progress in agricultural production, encompasses several factors that have entirely different relationships to farming practices, technological change, and food systems at large. As Beddow and Pardey [6] note, a measure of aggregate (or average) yield 'becomes problematic when one assumes that the yield measure implies something about the state of technology.' To address this problem, we introduce a framework that breaks aggregate yield into four factors that can all be quantified. The first is what we refer to as pure yield, or the output of a given crop per unit of harvested cropland area in a given location (such as a country). The second is cropping intensity, or the average frequency with which each hectare of standing cropland is harvested. Pure yield and cropping intensity are clear reflections of production methods, but they are nonetheless important to distinguish since they do not necessarily reflect the same practices and have different implications for sustainability [7].
The next two factors can affect aggregate yield without any changes in how farmers produce a given crop. These two factors instead reflect broader patterns of trade and diets. Country share describes the geographic distribution of cropland. A shift in cropland from lower-yielding to higher-yielding countries, for instance, would boost aggregate yield without any one country improving its yields. Finally, crop composition refers to the proportion of cropland dedicated to different crops in each country. Changes in crop composition can affect aggregate yield in that a shift towards higher-yielding crops would result in higher aggregate yields without any one crop actually having improved its yield.
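The country-share effect described above can be illustrated with a minimal sketch. The two countries, their areas, and their yields are hypothetical numbers chosen for illustration only; they are not taken from the FAO data used in this study.

```python
def aggregate_yield(areas, yields):
    """Area-weighted average yield: total output divided by total area."""
    total_output = sum(a * y for a, y in zip(areas, yields))
    return total_output / sum(areas)

# Hypothetical per-country yields (tonnes per hectare); held constant throughout.
yields = [2.0, 6.0]

# Same total area (100 ha), but cropland shifts towards the higher-yielding country.
before = aggregate_yield([50.0, 50.0], yields)
after = aggregate_yield([30.0, 70.0], yields)

print(before, after)
```

Aggregate yield rises even though neither country's own yield changed, which is why the share factors need to be separated from pure yield. The same logic applies to crop composition, with crops in place of countries.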
The research literature has typically not studied aggregate yields in a way that is simultaneously comprehensive and systematic. Although trends in global crop production and crop yield have been widely studied, these studies have either analyzed crop production or crop yield as a whole, where individual factors are lumped together, or in a more disaggregated fashion, looking at individual factors (such as pure yield or cropping intensity) in isolation.
For example, Alexandratos and Bruinsma [2] assessed drivers of increases in global crop production, but did not define the contribution of crop composition or country shares. This makes it hard to connect the trends to technological drivers or policy levers, as changing country shares have more to do with food demand and trade than with production techniques and technologies.
Beddow and Pardey [6] showed that the spatial movement of corn production within the United States has increased aggregate yield, since counties with higher yields have taken on a larger share of total production. To date, there is no study that does this at the global level, let alone one that connects geographical shifts with other factors driving aggregate yields.
Other studies have focused on trends in a given yield metric. Trends in pure yield have been documented, inter alia, by Grassini et al [8], who identified stagnating yields for certain crops in certain regions; and Ray et al [9], who found recent trends in crop yields to be insufficient for meeting food demand by 2050. Another trend analysis identified cropping intensity as an important driver of increased aggregate yield over time [7]. While such analyses can connect more directly to technologies, practices, and policies, they are unable to identify the key components (and their relative contribution) driving global aggregate yields.
To overcome this tradeoff between specificity and conflation, we apply decomposition analysis, a method for breaking down aggregate trends into contributing factors [10]. We quantify the contributions of cropland expansion and all four yield factors to increases in global crop production, including food, feed, and fiber crops. We also undertake an attribution analysis of country share, crop composition, and pure yield, showing the respective contributions from different climate zones, geographic regions, income groups, and pairs of climate zones and income groups, as well as from individual crops. This enables us to present, in a form not before shown, a detailed picture of the way in which agriculture has been able to meet growing food demand over the last half century.

First- and second-stage decompositions
The purpose of an index decomposition analysis (IDA) [11] is to express the overall change in an aggregate quantity over a given time interval in terms of contributions from several factors. Our analysis begins with the aggregate quantity $P$, global crop production. Crop production is primarily made up of crops used for food and animal feed, but also includes a small number of crops with other uses such as fiber and fuel. More precisely, $P$ is the sum of the production, given in tonnes per year, of every crop in every country, which we express as:

$$P = \sum_{i,j} P_{ij},$$

where $i$ and $j$ index countries and crops respectively. The first step in the analysis is to express each $P_{ij}$ in the form of a multiplicative identity, the factors of which determine the components of the decomposition. We develop these factors in two stages. The first-stage decomposition identifies the contributions to increased global crop production $P$ from global cropland area $A$ (measured in hectares) and aggregate yield, or tonnes of crop output per hectare per year, denoted $\hat{Y}_{ij}$. This decomposition is given by the following identity:

$$P_{ij} = A \cdot \frac{P_{ij}}{A} = A \, \hat{Y}_{ij}.$$

Aggregate yield can in turn be disaggregated into four factors: cropping intensity ($I = H/A$), country share ($S_i = H_i/H$), crop composition ($S_{ij} = H_{ij}/H_i$), and pure yield ($Y_{ij} = P_{ij}/H_{ij}$), where $H$, $H_i$, and $H_{ij}$ denote harvested area globally, in country $i$, and for crop $j$ in country $i$, respectively. In the second-stage decomposition, we express the overall change in total production $P$ in terms of the contribution from cropland area $A$ as well as the four factors composing aggregate yield:

$$P_{ij} = A \cdot \frac{H}{A} \cdot \frac{H_i}{H} \cdot \frac{H_{ij}}{H_i} \cdot \frac{P_{ij}}{H_{ij}} = A \, I \, S_i \, S_{ij} \, Y_{ij}.$$

The crop-production data, including cropland and harvested area, were sourced from the Food and Agriculture Organization (FAO) of the United Nations [4]. Further details concerning the sourcing, selection, and preparation of the data, including the treatment of sovereign states that began or ceased to exist during the period of analysis, can be found in the supplementary information (available online at stacks.iop.org/ERL/15/0940b6/mmedia).
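As a minimal sketch of the second-stage identity, the following checks on toy data that the five factors multiply back to each country-crop production value. It assumes the share definitions implied by the factor descriptions in this paper (country share as a country's share of global harvested area, crop composition as a crop's share of that country's harvested area, pure yield as production per harvested hectare). All numbers and variable names are hypothetical.

```python
# Toy data: 2 countries (rows, index i) x 2 crops (columns, index j).
H = [[40.0, 10.0], [20.0, 30.0]]    # harvested area H_ij, hectares (hypothetical)
P = [[120.0, 50.0], [40.0, 150.0]]  # production P_ij, tonnes per year (hypothetical)
A = 80.0                            # global standing cropland area, hectares (hypothetical)

H_total = sum(sum(row) for row in H)  # global harvested area H
I = H_total / A                       # cropping intensity I = H / A

def factors(i, j):
    """Return (S_i, S_ij, Y_ij) for country i and crop j."""
    H_i = sum(H[i])
    S_i = H_i / H_total      # country share of global harvested area
    S_ij = H[i][j] / H_i     # crop share of the country's harvested area
    Y_ij = P[i][j] / H[i][j] # pure yield
    return S_i, S_ij, Y_ij

def reconstructed(i, j):
    """Rebuild P_ij from the five factors A, I, S_i, S_ij, Y_ij."""
    S_i, S_ij, Y_ij = factors(i, j)
    return A * I * S_i * S_ij * Y_ij

# Aggregate yield is total production over standing cropland area.
aggregate_yield = sum(sum(row) for row in P) / A

for i in range(2):
    for j in range(2):
        print(i, j, P[i][j], reconstructed(i, j))
```

Because every intermediate area term cancels, the identity holds exactly for any positive data; the decomposition then asks how much of the change in the sum of these products is attributable to each factor.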

Decomposition method
We apply the Logarithmic-Mean-Divisia-Index method, and specifically the LMDI-I version, which has many desirable attributes including perfect (residual-free) decomposition, zero-value robustness, time-reversal symmetry, and consistency in aggregation [12]. The last of these is important because it ensures consistency between the first- and second-stage decompositions defined above. A closely related method is LMDI-II, which satisfies an additional normalization constraint but does not possess the consistency in aggregation and perfect decomposition properties which we require for the present analysis [13]. We will henceforth refer to the method used here (LMDI-I) simply as LMDI. The LMDI method originates in the context of energy and emissions studies, where it continues to be applied extensively. In recent years, its application has broadened to include areas such as land use and food production [14][15][16][17].

The approach is based on the Divisia method which, for a time interval $t = 0$ to $t = T$, expresses the decomposed changes in the integral form

$$\Delta P_{X^k} = \sum_{i,j} \int_0^T P_{ij} \, \frac{d \ln X^k_{ij}}{dt} \, dt,$$

where $X^k_{ij}$, $k = 1, 2, 3, 4, 5$, denote the factors $A$, $I$, $S_i$, $S_{ij}$, and $Y_{ij}$, respectively. The LMDI method is a technique for approximating this integral given discrete rather than continuous data. The corresponding LMDI decomposition factors are:

$$\Delta P_{X^k} = \sum_{i,j} L\left(P^T_{ij}, P^0_{ij}\right) \ln\left(\frac{X^{k,T}_{ij}}{X^{k,0}_{ij}}\right),$$

where

$$L(a, b) = \frac{a - b}{\ln a - \ln b} \quad (a \neq b), \qquad L(a, a) = a,$$

is the logarithmic mean. The specific expressions for each decomposition factor, given in both the Divisia and LMDI forms, are shown in table 2; for further mathematical aspects, see the supplementary information.

An important distinction in time-series applications is the use of fixed versus rolling (chaining) baselines. The fixed method performs a single decomposition using only end-point data for the entire period; this provides a less accurate approximation of the Divisia integral and obscures trends and path dependencies [18].
We opt for a rolling baseline which involves a separate decomposition for each interval of time at which the data is available, with the results summed over all periods to give the total. For more extensive treatment of this question, see the supplementary information.
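The difference between fixed and rolling baselines can be sketched as follows. This is a minimal illustration of additive LMDI-I in its standard form, not the code used for this study: the function names and the toy series are hypothetical, and the sketch assumes strictly positive factor values (zero values, which occur in real crop data, would need separate handling).

```python
import math

def logmean(a, b):
    """Logarithmic mean L(a, b); equals a when a == b."""
    if a == b:
        return a
    return (a - b) / (math.log(a) - math.log(b))

def lmdi_step(cells0, cells1):
    """Additive LMDI-I over one interval.

    Each cell (e.g. one country-crop pair) is a list of positive factor
    values whose product is that cell's output. Returns one contribution
    per factor, summed over cells.
    """
    n_factors = len(cells0[0])
    contrib = [0.0] * n_factors
    for f0, f1 in zip(cells0, cells1):
        weight = logmean(math.prod(f1), math.prod(f0))
        for k in range(n_factors):
            contrib[k] += weight * math.log(f1[k] / f0[k])
    return contrib

def lmdi_chained(series):
    """Rolling baseline: decompose each consecutive interval, then sum."""
    total = [0.0] * len(series[0][0])
    for prev, cur in zip(series, series[1:]):
        for k, c in enumerate(lmdi_step(prev, cur)):
            total[k] += c
    return total

# Hypothetical series: one cell, two factors (area in ha, yield in t/ha), three years.
series = [
    [[100.0, 2.0]],
    [[110.0, 2.5]],
    [[105.0, 4.0]],
]

fixed = lmdi_step(series[0], series[-1])  # end points only
chained = lmdi_chained(series)            # interval by interval

print("fixed  :", fixed, sum(fixed))
print("chained:", chained, sum(chained))
```

Both variants sum exactly to the total change in output (no residual), but they attribute it differently across factors, because only the chained version sees the intermediate year.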
Another methodological choice in IDA is whether to use multiplicative or additive decompositions. In practice, it is straightforward to map between the results of each type and we use the additive form as it is more easily interpreted and visualized [19]. Note, however, that when applying a rolling baseline to time-series data the transformation between the cumulative results of the two methods has a more complicated form than the commonly known rule for single-step decompositions. Given that both methods are common, we provide an alternative analysis of the second-stage decomposition using chained multiplicative LMDI including the extended methodology due to Choi and Ang [13]; see the supplementary information.
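For a single interval, the commonly known rule linking the two forms in LMDI-I is that each additive contribution equals the logarithmic mean of the aggregate at the two end points multiplied by the logarithm of the corresponding multiplicative effect. The sketch below checks this on hypothetical data; it does not cover the chained case, where, as noted above, the relationship is more complicated.

```python
import math

def logmean(a, b):
    """Logarithmic mean L(a, b); equals a when a == b."""
    if a == b:
        return a
    return (a - b) / (math.log(a) - math.log(b))

def additive(cells0, cells1):
    """Additive LMDI-I contributions per factor for one interval."""
    n = len(cells0[0])
    out = [0.0] * n
    for f0, f1 in zip(cells0, cells1):
        w = logmean(math.prod(f1), math.prod(f0))
        for k in range(n):
            out[k] += w * math.log(f1[k] / f0[k])
    return out

def multiplicative(cells0, cells1):
    """Multiplicative LMDI-I effects per factor for one interval."""
    total0 = sum(math.prod(c) for c in cells0)
    total1 = sum(math.prod(c) for c in cells1)
    l_total = logmean(total1, total0)
    n = len(cells0[0])
    log_effect = [0.0] * n
    for f0, f1 in zip(cells0, cells1):
        w = logmean(math.prod(f1), math.prod(f0)) / l_total
        for k in range(n):
            log_effect[k] += w * math.log(f1[k] / f0[k])
    return [math.exp(v) for v in log_effect]

# Hypothetical data: two cells, two factors each (area, yield).
cells0 = [[10.0, 2.0], [20.0, 1.0]]  # total output 40
cells1 = [[12.0, 2.5], [18.0, 1.5]]  # total output 57

delta = additive(cells0, cells1)
effect = multiplicative(cells0, cells1)
l_total = logmean(57.0, 40.0)

for d, e in zip(delta, effect):
    print(d, l_total * math.log(e))  # the two columns should agree
```

The additive terms sum to the change in total output and the multiplicative effects multiply to the ratio of end-point totals, so either form can be recovered from the other in the single-step case.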

Attribution analysis
To further investigate the composition of each factor, we perform an attribution analysis [13], which examines the contributions made by each country and each crop to a decomposition factor (such as country share or crop composition) at the global level.
The country attribution analysis applies to the country share, crop composition, and pure yield factors. The sign of a country's contribution to any given factor is determined by the direction of change of that factor in that country, and the magnitude is weighted by the other factors (see table 2, Divisia form). For example, if a country increases its share of global harvested area, it will show up as a positive contribution to the global total. The magnitude of this contribution is a function of how much the country's share increased or decreased, weighted by yields and crop composition in that country. A country's contribution to the crop composition factor will depend on how much crop composition increased or decreased crop output in that country, scaled by a country-share weighted yield. Similarly, a country's contribution to pure yield will depend on that country's change in yields, weighted by the harvested area in that country.
We group countries with similar characteristics using three schemes (income group, geographical region, and climate zone) to assess groups' contributions to each decomposition factor. The contribution of a group is the sum of contributions from all individual countries included in the group. The sum of all groups' contributions is the factor total. Income groups include low, lower-middle, upper-middle, and high following the World Bank classification [20]. Climate groups include arid, cold, temperate, and tropical, as determined by the Köppen-Geiger classification [21]. Regions include East Asia and Pacific, Europe and Central Asia, Latin America and Caribbean, Middle East and North Africa, North America, South Asia, and Sub-Saharan Africa.
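The grouping step amounts to summing country-level contributions within each group, so that group totals add up to the factor total. A minimal sketch, with hypothetical country labels, group assignments, and contribution values:

```python
from collections import defaultdict

# Hypothetical per-country contributions to one decomposition factor (million tonnes).
contribution = {"country_a": 120.0, "country_b": -45.0, "country_c": 30.0, "country_d": -80.0}

# Hypothetical assignment of each country to a climate group.
climate_group = {
    "country_a": "tropical",
    "country_b": "cold",
    "country_c": "tropical",
    "country_d": "temperate",
}

def group_totals(contribution, scheme):
    """Sum country contributions within each group of a grouping scheme."""
    totals = defaultdict(float)
    for country, value in contribution.items():
        totals[scheme[country]] += value
    return dict(totals)

by_climate = group_totals(contribution, climate_group)
factor_total = sum(contribution.values())

print(by_climate, factor_total)
```

The same function can be reused with an income or region mapping; in every scheme the group totals sum to the same factor total, since each country belongs to exactly one group per scheme.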
The crop attribution analysis applies to the crop composition and pure yield factors. The sign of an individual crop's contribution depends on the direction of change in its relative share, in the same way as described above for country attribution. The magnitudes are scaled by crop pure yield and crop share of global harvested area for crop composition and pure yield attribution, respectively.

First-stage decomposition
From 1961 to 2015, global crop production ($P$) increased by 6 billion tonnes, or 250%, and it increased more than twice as quickly after 2000 as in the preceding decades. The first-stage decomposition breaks this increase in global crop production into contributions from two factors: cropland area and aggregate yield, the latter defined as total crop output per unit area per year. It shows that cumulatively, the majority (89%) of the increase came from aggregate yield $\hat{Y}_{ij}$, with global cropland area $A$ contributing only 11% (figure 1). Although cropland area's absolute contribution was greatest in the 1980s, and again between 2010 and 2015, the latter time interval corresponds to an overall acceleration in production, such that the relative contribution of cropland area was not higher than the historical baseline.

Second-stage decomposition and attribution analysis
The second-stage decomposition also includes cropland area, but refines the analysis by decomposing changes in aggregate yield into contributions from four factors: pure yield (the output of a given crop per unit of harvested area in each country), cropping intensity (the global average frequency with which each hectare of cropland is harvested), crop composition (the proportion of cropland dedicated to a given crop in each country), and country share (each country's share of global harvested area). We present the results of the second-stage decomposition together with the attribution analysis, which assesses contributions from country groups and crops to different decomposition factors. For example, countries that shifted towards higher-yielding crops would increase the contribution of the crop composition factor to rising global production, and the same is true for a high-yielding crop that increased its share of global harvested area.
The second-stage decomposition (figure 2) shows that pure yield ($Y_{ij}$) was the most important driver of rising crop production. Pure yield contributed 56% of the total production increase and 63% of the increase in aggregate yield. The other yield factors combined (cropping intensity, crop composition, and country share) accounted for about one-third of the aggregate yield increase.
Pure yield rose in 80% of countries and declined only modestly by global standards in the remaining countries. The attribution analysis (figure 3) shows that richer high-latitude countries contributed the majority (over 60%) of pure yield. Our results suggest that high- and upper-middle-income countries saw comparatively rapid progress in pure yields, as they contributed a larger share to pure yield than their share of harvested cropland area; the inverse was true for low- and lower-middle-income countries.
Cold, temperate, and tropical zones contributed similar amounts to the global rise in pure yield (figure 4). Among income groups, high- and upper-middle-income countries contributed the most (figure 4), especially China, the US, former Soviet states, and Brazil (table S6). A small number of crops accounted for most of the pure yield factor, the top three being maize (16%), wheat (14%), and rice (12%) (table S5), as a result of their rapid yield progress and large share of total production.
Next to pure yield, the second-most important factor was cropping intensity, which contributed 20% of the total increase in crop production and 23% of the increase in aggregate yield (figure 2). Cropping intensity's influence was most pronounced between 1961 and 1980, and again from 2000 to 2015.
Crop composition contributed nearly as much as cropping intensity to rising global production, accounting for 17% of the total increase in production and 19% of the increase in aggregate yield (figure 2). This implies that countries have shifted, on average, towards higher-yielding crops, thus raising aggregate yield. Most countries (123 of 195) experienced a positive contribution from crop composition over time, with positive contributions summing to 1776 million tonnes. The remaining countries had negative contributions summing to 732 million tonnes, for a net of 1044 million tonnes, indicating a substantial amount of offsetting (table S4).
Crop composition increased production most in tropical upper-middle- and lower-middle-income countries, with Brazil alone accounting for 20% of the total (figure 3). Countries in which shifting crop composition decreased production, indicating a shift towards lower-yielding crops, were concentrated in high-income countries across all climate zones, especially Europe and Central Asia (figure 4). The impact of crop composition can also be broken down by crop, showing that sugar cane, palm oil, maize, and soybeans together accounted for over 40% of this decomposition factor (table S5). The production impact of these crops is enhanced by their high yields and relatively large share of harvested area.

[Figure caption] Countries with similar characteristics have been grouped together in three schemes (climate, income, and region) to assess groups' contributions to each decomposition factor. For instance, within the climate scheme, countries can be arid, cold, temperate, or tropical, each group contributing a positive or negative amount to a decomposition factor (such as country share). The figure represents these contributions with stacked bars, where each group is identified by color. Positive contributions are above the horizontal axis, and negative contributions are below it. For instance, tropical countries had a positive contribution to the country share factor, meaning that they took on a larger share of global production. The sum of the contributions from different groups is the factor total, indicated by a dashed line. This value is the same as the final-year values in figure 2.

The country share factor made a net negative contribution, reflecting a shift of harvested area towards lower-yielding countries, coinciding with a ten-fold increase in international crop trade in the last 60 years [22].
Although country share had the smallest contribution of all factors, this masks an important trend in the geography of global production, as the small cumulative total for country share results from large offsetting shifts in different countries. Changes in the production of 85 countries drove up the global total (by 772 million tonnes) and 110 countries drove it down (by 1009 million tonnes) (table S4). Particularly large negative contributions came from former Soviet states and the US, whereas Brazil and Indonesia had the largest positive contributions.
High-latitude (cold and temperate), rich (high- and upper-middle-income) countries, Europe and Central Asia in particular, had large negative contributions, whereas all but the high-income countries in the tropics saw large positive contributions (figure 3). Overall, the positive contribution from the tropical zone has been more than offset by the negative contributions from the other climate zones, leading to a net negative effect.
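The reported percentages and tonnages can be cross-checked with simple arithmetic. The sketch below uses only rounded figures quoted in this section, so small discrepancies are expected (for example, 20/89 gives about 22.5%, against the reported 23%). The country share percentage is not reported directly here; the value computed below is merely what the rounded figures imply.

```python
# Rounded figures quoted in the text.
total_increase_mt = 6000.0   # "6 billion tonnes", in million tonnes
aggregate_yield_share = 89   # % of total increase from aggregate yield
share_of_total = {           # % of total increase, by yield factor
    "pure yield": 56,
    "cropping intensity": 20,
    "crop composition": 17,
}

# Each factor's share of the aggregate yield increase.
share_of_aggregate_yield = {
    name: 100 * value / aggregate_yield_share
    for name, value in share_of_total.items()
}

# Country share implied as the remainder of aggregate yield (an inference, not a reported value).
implied_country_share = aggregate_yield_share - sum(share_of_total.values())

# Net contributions from offsetting positive and negative country totals (million tonnes).
net_crop_composition_mt = 1776 - 732
net_country_share_mt = 772 - 1009

print(share_of_aggregate_yield)
print(implied_country_share, 100 * net_country_share_mt / total_increase_mt)
print(net_crop_composition_mt, 100 * net_crop_composition_mt / total_increase_mt)
```

Under these rounded inputs, the tonnage-based and percentage-based figures for crop composition and country share agree to within about one percentage point.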

Discussion
Using LMDI decomposition, applied here in a novel way to global crop-production data, we identified the specific factors accounting for the rapid rise in crop production (including food, feed, and fiber crops) from 1961 to 2015. We did so in two stages. First, we showed that rising aggregate yield (total crop production over total cropland area) contributed far more than cropland expansion (89% compared to 11%). In other words, growing global crop demand has by and large been met by growing more crops on the same amount of land, rather than converting natural habitats into cropland. Our second-stage decomposition showed that nearly two-thirds of aggregate yield improvements have come from pure yield, or the output of a given crop per unit of harvested cropland area in a given country. The remainder has come from less-discussed drivers of aggregate yields, including cropping intensity, changes in the geographic distribution of cropland (country share), and crop composition. The last two do not have a direct relationship with farming methods and are more closely related to patterns of diets and trade.
Further, we have shown, via attribution analysis, that different regions, climate zones, and income groups have evolved in markedly different ways over the last half century, with some contributing positively to crop production (in part through planting higher-yielding crops or increasing their share of global cropland area) and others negatively. Developing countries in the tropics stand out as strong positive contributors to many of the factors, whereas high-latitude developed countries, especially in Europe and Central Asia, have often contributed negatively.
Our method reflects best practices in decomposition analysis that stem from other production and economic contexts [13,23]. The use of LMDI-I decomposition ensures perfect decomposition (that is, no residuals) as well as consistency in aggregation. In contrast to several other works applying LMDI to crops and land use, we opted to use a rolling instead of fixed baseline; this substantially improves estimation of the decomposition results over the total time period (see the supplementary information for a comparison). We used the additive (instead of multiplicative) version of LMDI, which lends itself better to visualization and interpretation, largely because the decomposition terms have the same units as the aggregate quantity.
In this way, our methods represent an important advance over existing research. The only directly comparable study was done by Alexandratos and Bruinsma [2], which also decomposes crop production into cropland expansion, cropping intensity, and yields. Using an unspecified decomposition method 'in the main based on expert judgment,' they found that cropland expansion had accounted for 14%, cropping intensity for 9%, and yields for 77% of increases in price-weighted crop production between 1961 and 2007. While their results are similar to ours (10%, 14%, and 76% over the same period), theirs do not distinguish between pure yield, crop composition, and country shares, and hence likely overestimate the role of improved farming methods in rising crop production. Their dataset was also smaller, using only 34 crops (compared to 150 in our study) and 105 countries/territories (195 here).
A number of papers have decomposed cropland area into population, diets, and yield, or some further expansion of those factors [14][15][16][17]. Their finding that yields do not fully offset increases in production (production here being the product of population and per-capita consumption) is analogous to our finding that cropland area expansion contributed to production increases. In terms of methods, Kastner et al [16] and Alexander et al [15] use LMDI decomposition; however, both apply fixed baselines over long time periods (ranging from 17 to 42 years), which can lead to significant approximation errors (see the supplementary information). Huber et al [14] use the Laspeyres index, which lacks many of the desirable properties of LMDI. Ausubel et al [17] use a decomposition method called the ImPACT identity, which is based on a simple (unweighted) sum of the logarithmic changes of the respective factors.
To date, no studies have separated out cropping intensity, crop composition, or country share as contributors to aggregate yield. Huber et al [14] acknowledge that crop composition might affect their results, but dismiss its importance based on an inspection of crop-share data. Our results suggest that crop composition does in fact affect aggregate yields substantially. Huber et al also perform a secondary decomposition to determine the effect of trade balances for individual countries but do not account for changing country shares in their global decomposition.
Only one paper, to our knowledge, has rigorously quantified the effect of spatial relocation on crop yields [6]. Focusing on US corn production, it finds that 16%-21% of the increase in corn production since 1879 came from spatial movement of production. While this study represents an important conceptual and methodological innovation, our method has several advantages over it. First, we improve upon their use of Laspeyres and Paasche indices by using the LMDI method. Second, ours is global: we include all crops and all countries. Finally, in addition to spatial reallocation, we are able to disaggregate several confounding yield factors simultaneously.
Our findings have important implications both for the potential to meet future food demand from existing farmland and for policies to support this effort. While, as our results demonstrate, aggregate yield has nearly kept pace with rising crop demand to date, it is not guaranteed that the drivers of aggregate yield improvement included in our analysis will be able to contribute as much in the future as they have in the last few decades.
This uncertainty around future contributions applies, for example, to pure yield and cropping intensity. Global improvements in pure yield slowed in the last 20-30 years, and several important crop-producing regions have seen yields stagnate, potentially as a result of approaching the biophysical limits of current crop varieties [8,9]. Ray et al [9] suggest that if yields (in this case, the combination of pure yield and cropping intensity) continue on their recent trajectory, they will not meet projected crop demand by 2050. A further concern is that many of the technologies that boosted yields during the Green Revolution, including synthetic fertilizers and irrigation, are now fully utilized in many places [24]. While some studies (e.g. [7]) have found a large theoretical potential to increase cropping intensity, globally by more than 0.5 (the equivalent of adding half a harvest every year), the practical and economic potential may be far lower [24]. Research and development to improve crop germplasm and agronomy; extension services bringing new technologies to farmers; infrastructure such as roads and irrigation schemes; and institutions such as credit and insurance [25,26] will all be critical to future improvements in yields.
Aggregate yield has also received a substantial boost from shifting crop composition. Our finding that growing more sugar cane, oil palm, maize, and soybeans increased total crop production the most is consistent with the use of more maize and soybeans for animal feed [4], and rapidly increasing global consumption of edible oils and sugar as part of the nutrition transition [27]. These crops, with the possible exception of soybeans, have high yields relative to other crops [4], meaning that producing proportionally more of them implies that less land is needed for any given level of total crop output. However, any effort to encourage the use of higher-yielding crops, even though it may be beneficial for global land use and food production, risks conflicting with other social and economic priorities, such as nutritional quality and public health. As such, it is not clear that this is a practical policy lever for raising aggregate yields. In fact, the shift towards lower-yielding crops observed in many high-income countries suggests that this factor could go from being a positive to a negative contributor to global crop production in the future, driving increases in land used for agriculture.
If the recent trends in country share documented here continue, it could increasingly offset gains from improved farming practices, with negative consequences for land use and the environment. For example, West et al [28] observed that average crop yields in the tropics are about half of those in temperate regions, but the carbon loss from conversion of natural habitats to cropland is nearly double. As a result, any shift in production from temperate to tropical regions risks increasing the carbon footprint of agriculture. Furthermore, Hertel et al [5] found that the increase in crop production that would result from yield improvements in Sub-Saharan Africa may increase global farmland area, since, if the region takes on a larger share of global crop production, it would reduce global average yields. It follows that in order to have high aggregate global yield, countries with high average crop yields should strive to maintain or increase their share of global production or cropland area. However, this might conflict with other priorities, such as ecological restoration in developed countries, or increased export revenue in developing countries.
Some caveats to our method are worth noting. First, FAO data can be unreliable. In particular, the data on standing cropland area (A) are uncertain. This affects both the contribution from standing cropland area and cropping intensity (I). In particular, if, as crop production rose rapidly in the period after 2000, measures of standing cropland area fell behind the true rate of expansion, our results would overestimate the contribution of aggregate yields and underestimate the contribution of cropland area. The contributions from the country share, crop composition, and pure yield are not affected by mismeasurement of standing cropland area, since they are a function of production and harvested area. The lack of crop-specific cropland data also meant that our cropping intensity is a global average only, rather than a composite of all countries and crops, as it is for country share, crop composition, and pure yield. This lack of spatial resolution affects the computation of the corresponding decomposition factor since the local information contained within each weighting term also becomes globally aggregated. Alexandratos and Bruinsma [2] address this issue by estimating local cropland areas, but the complexity of the estimation process meant that only a restricted set of crops could be analyzed by them in this way.
Second, our use of tonnes as the unit of crop production differs from that of Alexandratos and Bruinsma [2] (who use price-weighted production) and Huber et al, Alexander et al, and Kastner et al [14][15][16] (who use calories). Price-weighted production has a simpler economic interpretation but deviates from the 'biophysical accounting' philosophy we have preferred here. In order to comprehensively account for cropland uses irrespective of the final product, and to be consistent between standing cropland area and harvested area, we also included crops for non-food uses like fiber, which ruled out calories as a metric. Third, the chain-like multiplicative form of the identity leads naturally to 'share'-type quantities (such as country share and crop share of harvested area). This facilitates questions concerning the role of proportionate change as a driver of global crop production, at the cost of not permitting a direct evaluation of the role of absolute changes.
Altogether, the transparent decomposition of aggregate yield presented herein allows for a clearer identification of the threats and opportunities for future agricultural land use. To date, improvements in agricultural practices have allowed for crop production to increase dramatically without a corresponding expansion of cropland area. The effect of farming practices has also been boosted by a shift to higher-yielding crops, but it has been offset by shifts in cropland towards lower-yielding countries. A better understanding of drivers of crop production and aggregate yield offers the potential for more accurate forecasts and can help focus policies on the most promising levers to meet rising food demand sustainably.