Closing the gap: global potential for increasing biofuel production through agricultural intensification

Since the end of World War II, global agriculture has undergone a period of rapid intensification achieved through a combination of increased applications of chemical fertilizers, pesticides, and herbicides, the implementation of best management practice techniques, mechanization, irrigation, and more recently, through the use of optimized seed varieties and genetic engineering. However, not all crops and not all regions of the world have realized the same improvements in agricultural intensity. In this study we examine both the magnitude and spatial variation of new agricultural production potential from closing of ‘yield gaps’ for 20 ethanol and biodiesel feedstock crops. With biofuels coming under increasing pressure to slow or eliminate indirect land-use conversion, the use of targeted intensification via established agricultural practices might offer an alternative for continued growth. We find that by closing the 50th percentile production gap—essentially improving global yields to median levels—the 20 crops in this study could provide approximately 112.5 billion liters of new ethanol and 8.5 billion liters of new biodiesel production. This study is intended to be an important new resource for scientists and policymakers alike—helping to more accurately understand spatial variation of yield and agricultural intensification potential, as well as employing these data to better utilize existing infrastructure and optimize the distribution of development and aid capital.


Introduction
Despite recent swings in oil prices and biofuel production, biofuel subsidies and use mandates remain in place and ethanol and biodiesel production are expected to grow by 70% and 60%, respectively, between 2009 and 2018 [1]. Increasing demand from the energy sector will complicate already complex agricultural markets, which must balance distributed and volatile global supply with demand from food, feed and fiber industries. Nevertheless, biofuels remain one of the few short-term policy alternatives for reducing dependence on imported petroleum, addressing air quality goals, and potentially reducing greenhouse gas (GHG) emissions in the transportation sector [2]. With an increasing (and increasingly affluent) population, global demand for agricultural products is expected to grow 50% by 2050 [3]. Biofuels and the introduction of unpredictable energy markets will continue to contribute to and complicate the growth of global agricultural demand in the years ahead. In this study we examine the magnitude and spatial variation of new biofuel production potential available through rapid intensification of cropping systems.
Generally speaking, there are two ways to increase agricultural production: (1) bring new land under cultivation, and/or (2) increase agricultural productivity on existing croplands. Approximately one half of all land suitable for growing crops has already been cultivated [4]. However, much of the remaining land suitable for expanding agriculture rests under tropical rainforests in South America, Central Africa and South East Asia, lands rich in both stored carbon and biodiversity [4]. There is increasing concern that growing demand for biofuels will encourage land conversion in tropical forests, which in turn may lead to large net increases in the release of carbon into the atmosphere [5][6][7][8]. To help alleviate this concern, the US EPA and regulators in other countries have included life-cycle GHG reduction requirements in national biofuel policies, but it remains to be seen how effectively indirect land-use change will be addressed [9]. Given the issues associated with further agricultural expansion, this study addresses the second option, potential pathways for increasing productivity on existing croplands.
Since the end of World War II, global agriculture has undergone a period of rapid intensification achieved through a combination of increased applications of chemical fertilizers, pesticides, and herbicides, rapid development and implementation of best management practice techniques, mechanization, irrigation, and through the use of optimized seed varieties and genetic engineering. Grain yields in particular have increased tremendously, with maize yields quadrupling in the United States since the 1940s [10]. There is still much room for improvement, however, as not all crops and not all regions have experienced the same level of intensification. A large disparity remains in the use of high yielding cultivars, inputs, irrigation and the employment of best-in-class management practices [11]. While many studies have examined past yield performance data to better understand driving factors, there have been comparatively few forward-looking attempts to estimate 'yield gaps', which we define as the difference between current agricultural yields and future potential based on climatic and biophysical characteristics of the growing region. The FAO regularly assesses potential biomass based on net primary productivity theories first developed by Lieth in the 1970s, which assume ideal conditions for photosynthesis (absorption of solar energy by plants and storage of the energy as plant material) and depend upon plant physiology expertise from a world-wide network of field agronomists [12,13]. Two more recent approaches, by Neumann et al and Licker et al, assess yield potential based on spatialized yield and harvested area data from the M3 cropland datasets (defined below) [14][15][16][17]. The methodology employed in Licker et al forms the basis of this study.
However, instead of focusing only on the maximum potential yield, we examine a range of intensification levels and present results for global median yields, an intensification target that might be more attainable than those calculated in previous studies for near-term biofuel investment in underdeveloped regions.
Utilizing the M3 cropland datasets, we calculate median yields and yield gaps for ten ethanol and ten biodiesel crops and present both global and individual results for 238 countries, territories and protectorates (157 of which are reported here). Instead of relying on plant physiology and optimal interception of solar radiation to determine maximum physiological production potential, our analysis takes a new, data-driven approach based on existing reported yields and cultivated area. In a study of six crops, Lobell and Field estimated that, although there are many factors that impact agricultural yields, approximately 30% of variation in recent decades could be explained by climatic characteristics of the cultivated lands [18]. Thus by first controlling for biophysical factors, this study attempts to identify concentrated areas of low yielding agriculture that might benefit from targeted implementations of modern agricultural practices. Translating this additional crop production into liters of biofuels allows us to understand the magnitude of potential biofuel production available from more intensive use of existing cropland resources. However, we note that this conversion is theoretical. The fate of any agricultural production gains initiated by the biofuel industry will ultimately be decided by the global agricultural markets: it could be converted to liquid biofuels and contribute new valuable co-products, such as protein meals and distillers grains, to global supply, or it might face competition from growing food and feed demands [19]. This study is intended to be an important new resource for researchers and policymakers alike, helping to more accurately understand spatial variation of yield and agricultural intensification potential, as well as employing these data to better utilize existing infrastructure and optimize the distribution of development and aid capital so that responsible intensification might be promoted over further expansion of agriculture.

Methodology
The analysis methodology employed by this study was first developed to assess maximum global yield potential in Licker et al [14]. In this study we expand and adapt the analysis to examine various levels of intensification for the 20 most common biofuel crops; ethanol crops include barley, cassava, maize, potato, rice, sorghum, sugar beet, sugarcane, sweet potato and wheat, and biodiesel crops include castor, coconut, cotton, groundnut, mustard, oil palm, rapeseed, sesame, soybean and sunflower. We also provide complete results at the individual country level to help policy-makers and farmers work together to identify and close yield gaps. Both the current study and Licker et al are dependent on the M3 cropland datasets, the only 5 min global gridded datasets of agricultural yield and related harvested area for all 175 agricultural crops reported by the FAOSTAT database. For detailed information on the methodology, datasets, or limitations of the analysis beyond what is presented below, please refer to [14,15,17,20].
The primary challenge of taking a data-driven approach to calculate agricultural production potential is determining how to fairly and accurately compare reported yields, which might come from areas with very diverse climates and growing conditions. To address this challenge, we grouped gridded yield and area data from the M3 cropland datasets into 100 unique climate zones, defined by different combinations of growing degree days (GDD) and soil moisture availability. We then calculated agricultural yield gaps by comparing existing yield performance to various levels of yield performance within the corresponding climate zone. The primary results examine production potential from closing 50th percentile yield gaps. However, we also calculate 75th and 90th percentile yield gap results (full results are available online) for countries which can still benefit from intensification, but that are already at or above the 50th percentile potential.

Datasets
The M3 cropland datasets are one of the most comprehensive global collections of actual agricultural census data, gathered from approximately 22 000 county, state and country-level census reporting units. Global census data from 1997-2003 were aggregated to smooth anomalous climate and market events and combined with a newly generated map of global croplands [17] to create a detailed 5 min resolution (∼10 km) 'snapshot' of crop area and yields for all 175 crops in the FAOSTAT database, circa the year 2000. Figure 1(a) illustrates the maize fractional area per grid cell as reported by the M3 cropland datasets.
As seen in figure 1(a), very small parcels of cultivated land (represented in white) are often widely distributed across the globe, especially for important staple food crops. However, these grid cells with exceptionally small cultivated areas are not necessarily representative of conditions in commercial-scale farms that might be able to take advantage of increased resources and advanced agricultural practices. To ensure that the yield data from these small, unrepresentative cultivated lands would not adversely influence our study, data points making up the bottom 5% of harvested area for each of the 20 crops were masked out from further consideration. This was accomplished by sorting all the data points for a given crop by harvested area (the product of fractional harvested area and total hectares per grid cell) from smallest to largest, and then aggregating area up until the bottom 5% of the total harvested area was reached. Figure 1(b) shows the same maize dataset as figure 1(a), but with the pixels representing the bottom 5% of harvested area removed. In total, the bottom 5% of area represented 67% of the total valid data points for maize, but only comprised 3.5% of global production. Figure 2 shows maize yields with the pixels representing the bottom 5% of area removed. Together these area-filtered, yield and harvested area datasets for maize and the 19 other biofuel feedstocks form the basis for all yield gap calculations that follow.
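The area filter described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `mask_bottom_area`, the toy areas, and the choice to drop a cell whose running total lands exactly on the 5% threshold are ours, not details taken from the original analysis.

```python
import numpy as np

def mask_bottom_area(harvested_area, fraction=0.05):
    """Return a boolean mask that is False for the cells making up the
    bottom `fraction` of total harvested area (hypothetical helper)."""
    area = np.asarray(harvested_area, dtype=float)
    order = np.argsort(area, kind="stable")   # smallest cells first
    cumulative = np.cumsum(area[order])       # running total of area
    threshold = fraction * area.sum()
    keep = np.ones(area.shape, dtype=bool)
    # Assumption: a cell is dropped while the running total is still
    # at or below the threshold.
    keep[order[cumulative <= threshold]] = False
    return keep

# Toy data: four tiny cells and three large ones (hectares).
areas = np.array([1.0, 1.0, 1.0, 1.0, 20.0, 25.0, 51.0])
keep = mask_bottom_area(areas)
```

In this toy case four of the seven cells are masked even though they hold only 4% of the area, which mirrors the pattern reported for maize, where most data points sit in the bottom 5% of harvested area.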
To better understand how the unequal distribution of irrigation infrastructure [21] and sustainable water resources might impact our results, we re-ran our analysis with irrigated areas excluded (supplemental materials figure S1, available at stacks.iop.org/ERL/6/034028/mmedia). This resulted in a decrease in total area of 33% and a decrease in existing production of 32%. Although infrastructure investments may expand the area of irrigated cropland in the future, excluding irrigated areas from the quantification of yield potential and yield gaps gives us a more conservative point of comparison regarding the potential gains from closing yield gaps. We filtered irrigated areas using data from MIRCA2000 [20], a dataset that provides monthly rainfed and irrigated areas for 26 crops and crop classes at a 5 min resolution. For each crop class, we calculated the per cent area irrigated in each grid cell averaged across the growing season (weighting the percentage area irrigated in each month by total cultivated area in each month). We defined rainfed areas as grid cells with <10% of growing area irrigated throughout the growing season. Crop-specific MIRCA2000 irrigation datasets were used for 15 of the crops in our analysis. For the remaining crops, we used 'others perennial' for coconut and 'others annual' for castor, mustard, sesame and sweet potato.
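As a rough sketch of the rainfed filter, the area-weighted seasonal average reduces to total irrigated area divided by total cultivated area over the growing months. The function name, the array layout (months by grid cells) and the toy numbers below are illustrative assumptions and do not reproduce the MIRCA2000 file format.

```python
import numpy as np

def seasonal_irrigated_fraction(irrigated_area, total_area):
    """Area-weighted mean irrigated share per grid cell.

    Both inputs are arrays of shape (months, cells). Weighting each
    monthly share by that month's cultivated area is equivalent to
    sum(irrigated) / sum(total) over the months.
    """
    irrigated = np.asarray(irrigated_area, dtype=float).sum(axis=0)
    total = np.asarray(total_area, dtype=float).sum(axis=0)
    # Cells with no cultivated area in any month get a fraction of 0.
    return np.divide(irrigated, total, out=np.zeros_like(total), where=total > 0)

# Toy example: 3 months, 2 grid cells (hectares); month 3 is fallow.
total = np.array([[10.0, 10.0], [10.0, 10.0], [0.0, 0.0]])
irrigated = np.array([[0.0, 5.0], [1.0, 5.0], [0.0, 0.0]])

frac = seasonal_irrigated_fraction(irrigated, total)
rainfed = frac < 0.10   # the <10% threshold stated in the text
```

Here the first cell (5% irrigated) is kept as rainfed and the second (50% irrigated) is excluded.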
Growing degree days (GDD) have long been used to represent the length and thermal properties of agricultural growing seasons necessary to drive photosynthetic reactions [22], as well as in the modeling of plant phenological development. We calculated GDD using equation (1):
GDD = \sum_i \max(0, T_i - T_b)    (1)
where T_i is the temperature at each time step (in degrees Celsius) and T_b is a crop-specific baseline temperature. In total, five different GDD base temperatures were used for the 20 crops in this study: 0 °C (wheat), 1 °C (barley), 2 °C (mustard, potato, rapeseed, sugar beet), 5 °C (rice) and 8 °C (cassava, castor, coconut, cotton, groundnut, maize, oil palm, sesame, sorghum, soybean, sugarcane, sunflower, sweet potato). Soil moisture availability takes into account soil type and water-holding ability, and is a function of the potential plant water uptake rate [24]. Soil moisture availability is a non-linear function of the ratio of the soil water, actual evapotranspiration (AET), to the soil available water capacity, potential evapotranspiration (PET), as used in Prentice et al and Ramankutty et al [4,25]. Potential plant water uptake is high as long as soil water exceeds half of the available water capacity, although it decreases rapidly below this threshold. The calculation of daily soil moisture availability follows a simple two-layer bucket approach, driven by the Priestley-Taylor equation to estimate PET. A more detailed description of the surface energy and water budget calculation is given by Ramankutty et al [4].
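A minimal sketch of the GDD calculation follows. It assumes the common convention that time steps colder than the base temperature contribute zero; the function name, the small lookup table (a subset of the base temperatures listed above) and the sample temperatures are illustrative only.

```python
import numpy as np

# Subset of the crop-specific base temperatures listed in the text (deg C).
BASE_TEMP_C = {"wheat": 0.0, "barley": 1.0, "potato": 2.0, "rice": 5.0, "maize": 8.0}

def growing_degree_days(temps_c, base_c):
    """Sum of positive temperature excesses over the base temperature.

    Assumes each entry in temps_c is the temperature of one time step
    and that steps below the base temperature add nothing.
    """
    temps = np.asarray(temps_c, dtype=float)
    return float(np.maximum(temps - base_c, 0.0).sum())

# Four hypothetical time steps (deg C).
temps = [4.0, 10.0, 15.0, 6.0]
gdd_maize = growing_degree_days(temps, BASE_TEMP_C["maize"])   # 0 + 2 + 7 + 0
gdd_wheat = growing_degree_days(temps, BASE_TEMP_C["wheat"])   # 4 + 10 + 15 + 6
```

The same temperature series gives very different totals for maize and wheat, which is why separate climate-zone matrices are needed for each base temperature.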

Defining climate zones and yield gaps
Equal sized ranges of GDD and crop soil moisture availability were then used to create a 10 × 10 matrix of unique climate zones. The gridded area and yield data were then 'binned' into one of the 100 zones based on methodology from Zaks et al [26]. To ensure statistical relevance, climate zones with fewer than five data points were removed from further consideration. Because GDD is dependent on crop-specific base temperatures, five different matrices for binning were created, each corresponding to one of the GDD base temperatures described above. Figure 3 shows the distribution of the 100 climate zones with GDD base temperature of 8 °C, which was used by the majority of crops in our assessment.
Figure 1. Part (a) shows the original maize fractional area dataset, while (b) shows the fractional area remaining after filtering out the bottom 5% of cultivated lands. The filtered lands represent 67% of the total maize data points, but contribute only 3.5% to global maize production.
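The binning step can be illustrated as follows. This sketch assumes the equal-sized ranges span the observed minimum to maximum of each variable, which the text does not state explicitly; the function names and sample values are hypothetical.

```python
import numpy as np

def climate_zone(gdd, moisture, n_bins=10):
    """Assign each grid cell to one of n_bins * n_bins climate zones."""
    gdd = np.asarray(gdd, dtype=float)
    moisture = np.asarray(moisture, dtype=float)
    # Assumption: equal-width bins between the observed min and max.
    gdd_edges = np.linspace(gdd.min(), gdd.max(), n_bins + 1)
    moist_edges = np.linspace(moisture.min(), moisture.max(), n_bins + 1)
    # Digitizing against the interior edges yields indices 0..n_bins-1.
    gdd_bin = np.digitize(gdd, gdd_edges[1:-1])
    moist_bin = np.digitize(moisture, moist_edges[1:-1])
    return gdd_bin * n_bins + moist_bin

def well_populated(zones, min_points=5, n_zones=100):
    """True for cells whose climate zone holds at least min_points cells."""
    counts = np.bincount(zones, minlength=n_zones)
    return counts[zones] >= min_points

gdd = np.array([0.0, 500.0, 5500.0, 10000.0])
moisture = np.array([0.0, 0.05, 0.55, 1.0])
zones = climate_zone(gdd, moisture)
mask = well_populated(zones, min_points=2)   # small threshold for the toy data
```

With real data the default `min_points=5` would match the cutoff described above; the toy example lowers it only because it has four cells.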
Using the yield distribution within each climate zone to calculate yield gaps would be meaningless without first factoring in the corresponding areal extent associated with each yield value from the M3 datasets. We therefore performed an area-weighted assessment by sorting the gridded data in each climate zone by yield from lowest to highest, and then aggregating harvested area until, for example, the 50th percentile area data point was reached. The yield value associated with that area value was chosen as the 50th percentile yield for later yield gap calculations. We repeated this process for each crop and at different levels of yield improvement, including 75th and 90th percentile yields. Finally, new crop production was converted to volumes of biofuel using tons-to-fuel conversion factors from Johnston et al [27].
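The area-weighted percentile described above might be sketched like this. The function name and the toy yields and areas are our own, and the rule of taking the first cell at which cumulative area reaches the target is one reasonable reading of the text rather than a confirmed detail.

```python
import numpy as np

def area_weighted_percentile_yield(yields, areas, percentile=50):
    """Yield of the cell at which cumulative harvested area, sorted by
    yield from lowest to highest, first reaches the given percentile."""
    y = np.asarray(yields, dtype=float)
    a = np.asarray(areas, dtype=float)
    order = np.argsort(y, kind="stable")      # lowest yield first
    cum_area = np.cumsum(a[order])
    target = percentile / 100.0 * cum_area[-1]
    idx = np.searchsorted(cum_area, target, side="left")
    idx = min(idx, len(y) - 1)                # guard against rounding at 100%
    return float(y[order][idx])

# One hypothetical climate zone: yields in t/ha, areas in ha.
yields = [1.0, 2.0, 3.0, 4.0]
areas = [10.0, 10.0, 70.0, 10.0]
p50 = area_weighted_percentile_yield(yields, areas, 50)
p95 = area_weighted_percentile_yield(yields, areas, 95)
```

An unweighted median of these four yields would be 2.5 t/ha, whereas the area-weighted value is 3.0 t/ha because most of the harvested area sits in the third cell.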
While this study is primarily based on biophysical constraints, political, cultural, and economic realities certainly factor into the underlying distribution of yields and area as reported by the M3 cropland datasets. The agricultural census data used here is based on current management practices and cultivars distributed in the field circa the year 2000. Therefore, the results identified in this analysis may be viewed as a conservative estimate of intensification potential that might be achieved with better global distribution of modern agricultural practices. Future increases in yields resulting from biotechnology, seed genetics and agricultural technology are not captured by this analysis. Similarly, this study attempts to ensure an apples-to-apples evaluation of cultivated lands by limiting the statistical comparison to data points in the same climate zone. However, if none of the data points in a particular climate zone utilize high yielding modern agricultural practices, then the potential calculated by this study may be lower than what is possible from a climatic and biophysical standpoint.
On the opposite end, approximately 450 billion liters of ethanol and 33 billion liters of biodiesel could be produced from closing the 90th percentile yield gaps. When removing irrigated pixels and re-running the analysis, we see an average decrease of 36% in ethanol production and 18% in biodiesel production, with both figures remaining very consistent across all levels of percentile improvements. Not surprisingly, excluding irrigated areas from the assessment impacts cotton and rice production potential the most, with 64% and 50% drops respectively. Other crops with decreases in production potential greater than 30% include sugarcane (−44%), groundnut (−44%), wheat (−42%), potato (−37%), sugar beet (−33%), sesame (−31%) and maize (−30%).
For a complete comparison of how the rainfed-only results differed from those presented below, please see table S1 and figures S2(a) and (b) (which compare with figures 4(a) and (b) above) in the supplemental materials (available at stacks.iop.org/ERL/6/034028/mmedia). Depending on whether or not intensification efforts include new and/or improved irrigation infrastructure, biofuel production potential would be expected to change accordingly. While it is possible to close yield gaps for various levels of improvement, the detailed look at intensification potential that follows utilizes 50th percentile yield gaps as a middle-of-the-road yield improvement target. The global potential for intensifying yields on existing agricultural lands to median levels for ten common ethanol crops and ten common biodiesel crops is shown in table 1. In addition to listing 'additional' production potential based on 50th percentile yield improvements, the table also includes the total number of data points for each crop, the total harvested area and current global production as reported in the M3 cropland datasets. The 'additional 50th percentile production' in column 5 consists of the total production difference between current yields and the corresponding 50th percentile yields calculated by this study. The production gap in this column only encompasses grid cells that currently perform lower than their corresponding 50th percentile yields. We do not capture aggregate production resulting from grid cells that already yield more than the 50th percentile estimates. The final column in table 1 represents the total biofuel volumes (in liters) that could be produced if this new 50th percentile production potential were converted into either ethanol or biodiesel, as dictated by the feedstock crop.
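The 'additional production' column can be illustrated with a short sketch: only cells below the target yield contribute, and the resulting tonnage is multiplied by a tons-to-fuel factor. The function name and toy data are ours, and the conversion factor below is a placeholder chosen for the example, not a value from Johnston et al [27].

```python
import numpy as np

def additional_production(yields, areas, target_yield):
    """Extra tons produced if every cell below target_yield is raised to it.

    Cells already at or above the target contribute nothing.
    """
    y = np.asarray(yields, dtype=float)
    a = np.asarray(areas, dtype=float)
    gap = np.maximum(target_yield - y, 0.0)   # t/ha shortfall per cell
    return float((gap * a).sum())

# Hypothetical climate zone: yields in t/ha, areas in ha, target = 3 t/ha.
yields = [1.0, 2.0, 3.0, 4.0]
areas = [10.0, 10.0, 70.0, 10.0]
extra_tons = additional_production(yields, areas, target_yield=3.0)

# Placeholder conversion factor, for illustration only.
HYPOTHETICAL_LITERS_PER_TON = 400.0
extra_liters = extra_tons * HYPOTHETICAL_LITERS_PER_TON
```

The first two cells supply all 30 extra tons here; the cell yielding 4 t/ha is ignored rather than counted as a negative gap.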

Results
The new production potential for ethanol crops ranges from 7.4 million tons for sorghum to 110.6 million tons for sugarcane. However, while the new tonnage varies considerably, the overall percentage increases are roughly on par, ranging from 10% to 17%. When converting this potential to liters of fuel we see that maize, wheat and rice make up the vast majority (∼75%) of the ethanol potential identified by this study, which is not surprising given that they also make up ∼75% of currently harvested area of the crops in question. New production potential from biodiesel crops ranges from only 34 000 tons for mustard up to 13.3 million tons for soybean. In terms of percentage increases, biodiesel crops show slightly less potential for improvement, ranging from 7% to 14%. Overall biodiesel fuel production is more evenly distributed amongst the ten crops in question, with soybean, rapeseed, oil palm, sunflower and groundnut (peanut) each representing between 1.0 and 2.4 billion liters of potential fuel.
Figure 5. Normalized biofuel production potential from closing 50th percentile yield gaps. Part (a) illustrates the distribution of aggregate ethanol potential from ten starch and sugar crops and (b) illustrates the distribution of aggregate biodiesel potential from ten oilseed crops. Pixels in white (0) are already at or near their corresponding 50th percentile yields, while pixels in yellow, green and black (1) illustrate increasing levels of intensification potential.
In aggregate, the 50th percentile production gaps identified by this study would translate into approximately 112.5 billion liters of ethanol and 8.5 billion liters of biodiesel above what is currently produced. Figures 5(a) and (b) illustrate the spatial distribution of this new ethanol and biodiesel potential. Because of the varying amount of oil in biodiesel crops and starch/sugar in ethanol crops, it is not possible to directly compare the different crop production potentials without first converting them into a common unit: liters of fuel. The new production potentials illustrated in figures 5(a) and (b) represent the aggregate fuel volumes across all ethanol and biodiesel crops, respectively (individual maps for each of the 20 crops are included in the supplemental materials available at stacks.iop.org/ERL/6/034028/mmedia). The results were normalized based on maximum total fuel potential to facilitate comparison between grid cells. In the maps below, grid cells in white (0) are already at or above their corresponding 50th percentile production levels, while grid cells in yellow, green and black (1) have the most aggregate potential for intensification. Using these new maps it becomes clear where the most concentrated areas of new biofuel potential exist globally.
The aggregate potentials identified by this study are important for understanding the near-term limits to which biofuels can contribute to overall liquid fuel production. For example, the Renewable Fuel Standard Program (RFS2) Final Rule of the 2007 Energy Independence and Security Act (EISA) requires the United States to blend 36 billion gallons of biomass-based fuels into transportation fuels by 2022 [9]. While the biofuel potential presented here would not be eligible to meet those volumes due to the 'advanced biofuels' requirement, we find it an interesting point of comparison to understand the overall magnitude of our results. The overall volume target from the US, 36 billion gallons (136 billion liters), is very close to the 121 billion liters of biofuel potential identified here. Even if all countries across the globe were to increase yields for all 20 of the crops in this study simply to median levels of what was possible in the year 2000, there would still not be enough production to meet the 136 billion liter US biofuels target for 2022, let alone the projected 1.1+ trillion liters of combined liquid fuels (including petroleum fuels) the US will be consuming by 2020 [28]. Globally, more than 50 countries have biofuel use mandates on the books totaling more than 220 billion liters.
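The comparison above rests on a simple unit conversion, reproduced here as a short check. The variable names are ours; the volumes are those quoted in the text, and the gallon-to-liter factor is the standard definition of the US liquid gallon.

```python
LITERS_PER_US_GALLON = 3.785411784   # exact definition of the US liquid gallon

rfs2_target_gal = 36e9                                   # RFS2 volume for 2022
rfs2_target_l = rfs2_target_gal * LITERS_PER_US_GALLON   # about 136 billion liters

ethanol_l = 112.5e9      # 50th percentile ethanol potential
biodiesel_l = 8.5e9      # 50th percentile biodiesel potential
combined_l = ethanol_l + biodiesel_l                     # 121 billion liters

shortfall_l = rfs2_target_l - combined_l
share_of_target = combined_l / rfs2_target_l

print(f"RFS2 target: {rfs2_target_l / 1e9:.1f} billion liters")
print(f"Global 50th percentile potential: {combined_l / 1e9:.1f} billion liters")
print(f"Shortfall: {shortfall_l / 1e9:.1f} billion liters")
```

The combined global potential covers a little under nine-tenths of the US target, leaving a gap of roughly 15 billion liters.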
We are not claiming that biofuels should not be pursued at scale volumes, simply that policy-makers need to set realistic expectations for offsetting the demand for petroleum fuels. Not surprisingly, much of the new potential for biofuel production from intensification is located in developing countries and former Soviet Union states. Of the 112.5 billion liters of ethanol potential, only 9.4 billion liters (∼8%) are located in developed countries (as classified by the United Nations). At ∼25% (or 2.1 billion liters), developed countries hold a higher fraction of the overall 8.5 billion liters of biodiesel production potential; however, yield gaps in developing countries are still considerably greater. The growth potential from agricultural biofuels is clearly limited in developed countries that already employ high yielding, modern agricultural practices, such as the United States, which explains the shift in research and development dollars towards next-generation, non-food feedstocks and agricultural wastes in most developed countries.
The M3 cropland data sets are currently undergoing a major expansion with the addition of five-year time steps from 1965 to 2010 [40]. However, even before these data are available for further analysis, we are still confident that the year-2000 results presented here can be useful in identifying spatial yield trends and underperforming regions. For example, our year-2000 results show a large disparity in yield performance between Eastern and Western Europe, which is not surprising given the region's political upheaval not even a decade earlier. Since that time, yields in Eastern Europe have definitely begun to improve, as can be seen by maize yield trends from the FAOSTAT in figure 6 [29]. Romania, in particular, has seen maize yields more than double and overall production increase by 63% between 2000 and 2009, even though cultivated lands decreased by 23%. However, while yields are definitely improving in Eastern Europe, figure 6 illustrates that the yield gap, as defined here, still appears to exist as Western Europe has continued to boost yields in tandem. While certainly some crops in some regions have begun to close yield gaps, until the expanded M3 cropland data sets become available, the year-2000 results presented here still represent one of few analyses of spatial yield performance and future potential.

Discussion
A remarkable, untapped agricultural production potential exists that could be used to meet fuel, food, feed and fiber needs globally. Identifying both the aggregate global potential and spatial patterns of biofuel production potential is useful for establishing bounds and providing a reality check for policymakers and researchers interested in strategies for increasing biofuel production. However, additional crop production from agricultural intensification will clearly never be achieved for all crops and all countries due to differences in infrastructure and investment around the globe. Likewise, additional production will not necessarily be used for biofuels due to competing demands for increased production of food and fiber [3]. Despite these caveats, our exploration of the upper limits of agricultural biofuel production aids understanding of the true potential for biofuels to supply growing energy demand.
In spite of the limitation for aggregate potential production, we believe a major contribution of this study is the 'first pass' identification of specific countries and agricultural areas that could most benefit from targeted investments in infrastructure development and agricultural intensification. For example, world bodies such as the UN, the World Bank and global aid organizations might use this work to optimize the distribution of development dollars across multiple countries, or, within a single country, to begin directing scarce resources to the most promising crops or underdeveloped regions. Recognizing that results for data-poor countries may not be reliable, table 2 includes a selection of ten country-crop pairings that exhibit particularly high potential for producing biofuels, but which require less overall cultivated lands. The results shown here are a small subset of the 1200+ country-crop combinations analyzed in this study. In addition to having the same columns from table 1 above, table 2 also includes two new columns that might aid optimization efforts: (1) the total area associated with the new intensified production (not simply the total agricultural area of the crop, as not all grid cells in a particular country can be improved), and (2) the fuel produced per hectare (calculated by dividing the new 50th percentile biofuel production potential by the total area which can be intensified). For example, there are many country-crop combinations with greater overall volume potential than those highlighted in table 2. However, these high-volume combinations are often limited to low yielding and highly distributed grain crops such as wheat and rice. Closing yield gaps for these crops might indeed provide large volumes of biofuels (or food), but the crops' spatial distribution would make it difficult to close the gaps everywhere given the limited infrastructure, resources and investment capital in many developing countries.
As an alternative, the results in table 2 show some of the 'lowest hanging fruit': country-crop combinations that have high production potential on a limited footprint, making strategic optimization efforts more likely to succeed.
All of the country-crop combinations in table 2 have the following properties: (1) additional production potential between 25 million and 150 million liters of fuel, which is the production capacity range of most medium-scale commercial biofuel plants, (2) fuel production potential that requires less than 50 000 hectares of cultivated lands to limit the total area requiring intensification, and (3) fuel yields in the top 10% of all country-crop combinations, which equals a minimum of approximately 1200 liters/ha. Using these criteria, we can see that the majority of country-crop pairings include sugarcane, as identified in figure 7, which is not surprising since it is the highest energy yielding agricultural biofuel crop. The lone biodiesel entry in table 2 (highlighted in italic) is based on oil palm, which is also the highest yielding biodiesel agricultural crop. Although the filtering criteria and ultimate country-crop selections used here were subjective, we encourage users of this study to apply their own vetting processes on the downloadable results to aid in resource optimization efforts for all crops.
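Readers who want to apply their own vetting to the downloadable results could start from a filter like the one below. It is a sketch only: the `Pairing` class, the `shortlist` function and every record in the example are hypothetical, and the fixed 1200 liters/ha cutoff stands in for the top-10% criterion described above.

```python
from dataclasses import dataclass

@dataclass
class Pairing:
    """One country-crop result (hypothetical record layout)."""
    country: str
    crop: str
    fuel_liters: float   # additional 50th percentile fuel potential
    area_ha: float       # area that can be intensified

    @property
    def liters_per_ha(self) -> float:
        return self.fuel_liters / self.area_ha

def shortlist(pairings, min_liters=25e6, max_liters=150e6,
              max_area_ha=50_000, min_liters_per_ha=1200):
    """Keep pairings that meet all three criteria from the text."""
    return [p for p in pairings
            if min_liters <= p.fuel_liters <= max_liters
            and p.area_ha < max_area_ha
            and p.liters_per_ha >= min_liters_per_ha]

# Invented records for illustration; none come from the study's results.
records = [
    Pairing("Country A", "sugarcane", 60e6, 20_000),    # 3000 l/ha
    Pairing("Country B", "wheat", 900e6, 2_000_000),    # too large and diffuse
    Pairing("Country C", "oil palm", 30e6, 15_000),     # 2000 l/ha
    Pairing("Country D", "maize", 40e6, 45_000),        # below 1200 l/ha
]
selected = shortlist(records)
```

Changing the keyword arguments lets a user test how sensitive the shortlist is to each threshold, which matters because the criteria used for table 2 were subjective.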
We caution that biofuels must be developed sustainably and responsibly. Modern intensive agricultural practices can be highly damaging to the environment, with row-cropping and plantation agriculture linked to the depletion of organic matter and nutrients resulting in decreased soil quality [30,31], increased extent and frequency of marine hypoxia [32,33] and the eutrophication of lakes, rivers and coastal waterways [34,35]. Increased agricultural efficiency can help alleviate some of the most egregious excesses of intensive agricultural systems [36][37][38][39]. Forthcoming analysis of management practices driving yield gaps at the global scale will aid understanding of large-scale patterns of nutrient and water requirements for intensifying agricultural systems, input-yield tradeoffs and opportunities to increase input use efficiency [41]. From a sustainable development perspective, countries that choose to invest in closing yield gaps should simultaneously incentivize precision application of inputs and irrigation and low- or no-till agricultural practices to minimize soil erosion and agri-chemical runoff.
This analysis was conducted from a biofuel perspective to help understand how this new and rapidly growing demand will affect global agricultural markets. However, the methodology presented here can be used more directly to identify production gaps in food insecure countries, many of which exhibit the largest yield gaps.
Only a small fraction of our results are shown here due to space limitations; the full individual country-crop results (including high intensification targets of 75th percentile and 90th percentile yields not discussed here), in both tabular and spatial netCDF formats, are available as supplemental materials at: http://sage.wisc.edu/energy and http://environment.umn.edu/gli/gli publications.html.