More sustainable vegetable oil: balancing productivity with carbon storage opportunities

Intensive cultivation and post-harvest vegetable oil production stages are major sources of greenhouse gas (GHG) emissions. Variation between production systems and reporting disparity have resulted in discordance in previous emissions estimates. To assess systems-wide GHG implications of meeting increasing edible oil demand, we performed a unified re-analysis of life cycle input data from diverse oil palm, soybean, rapeseed, and sunflower production systems, from a saturating search of published literature. The resulting dataset reflects almost 6,000 producers in 38 countries, and is representative of over 71% of global vegetable oil production. Determination of the carbon cost of agricultural land occupation revealed that carbon storage potential drives variation in production GHG emissions, and indicates that expansion of production in low carbon storage potential land, whilst reforesting areas of high carbon storage potential, could reduce net GHG emissions whilst boosting productivity. Nevertheless, there remains considerable scope to improve sustainability within current production systems.


Introduction
From around 800,000 years ago, up to the year 1800, atmospheric carbon dioxide (CO₂) concentrations averaged around 225 ppm (Bereiter et al., 2015). Despite regular fluctuations, coinciding with ice ages and interglacial periods, concentrations never rose above 300 ppm during this time. However, since the early 1900s, atmospheric CO₂ concentrations have failed to drop below 300 ppm (Ahn et al., 2012). In every year since 2015, they have remained above 400 ppm, 70-80% higher than preindustrial concentrations (Thoning and Tans, 1989; Keeling et al., 2001). The Intergovernmental Panel on Climate Change (IPCC) stated that the dominant cause of global warming since 1950 has been anthropogenic contributions to greenhouse gas (GHG) emissions (IPCC, 2013a, 2013b). As the human population has grown, food production has risen markedly; today, food supply chains are responsible for 26% of all GHG emissions (Poore and Nemecek, 2018). As we strive to provide greater amounts of nutritious food to over 800 million currently undernourished people (FAO et al., 2019), whilst meeting additional demand as the population continues to grow (United Nations, 2019), carefully targeted global food system interventions are required to limit the effects of increased food production on planetary health.
Vegetable oils are a major source of dietary polyunsaturated fatty acids (Dubois et al., 2007), and are a crucial component of wide-ranging cuisines. Steadily increasing demand for vegetable oil over at least the last 60 years, for food, industrial and energy uses, has led to increased oil crop production through expansion of cultivation area (Phalan et al., 2013; Ritchie and Roser, 2019; FAO, 2021; Fig. 1) and intensification of production practices (Pretty, 2018). Since 2014, oil crops have occupied over 300 million hectares (ha) globally, approximately 19% of total cropped land (excluding pasture; FAO, 2021). Strikingly, over 85% of the world's vegetable oil is produced by just four crops: oil palm, soybean, rapeseed and sunflower (FAO, 2021), which are distributed across a range of climate zones. Clearing of native vegetation to meet growing demand for these crops (OECD; FAO, 2018) represents a considerable source of GHG emissions, further exacerbated by intensive cultivation and post-harvest processing (Özilgen and Sorgüven, 2011). An assessment of the GHG emissions resulting from vegetable oil production is critical if we are to optimise oil production systems to reduce their environmental impact.
Although there are numerous published life cycle assessments (LCA; ISO, 2006) of GHG emissions from individual vegetable oil production systems, considerable variation exists between studies. Whilst some of this reflects regional variation and varied production practices (Poore and Nemecek, 2018), it is likely that considerable variation between studies is also a result of non-harmonious reporting. Additionally, functional units used in analyses vary between studies, further complicating meaningful comparison. Further, analyses comparing variation between vegetable oil production systems are limited: a study that compared GHG emissions between five oil crops examined production in regions representing only 46% of global oil production, and did not consider variation within each crop type (Schmidt, 2015). A systematic evaluation of variation in the sustainability of each oil crop between diverse production systems is of utmost importance for the identification and promotion of more sustainable systems.
Here we present the results of a harmonised re-analysis of GHG emissions from palm, soybean, rapeseed, and sunflower oil production, using raw input and emissions source data obtained through a saturating search of published literature. The resulting dataset comprises diverse vegetable oil production systems across 38 countries, and is representative of >71% of global vegetable oil production. We combine this with a systems-wide analysis of the carbon costs of agricultural land occupation, following carbon storage opportunity principles (Searchinger et al., 2018): these principles allow both recent land use changes and the choice to continuously occupy ancestrally cleared land to be considered equally. This unified, systematic analysis reveals the carbon impacts of vegetable oil production decisions at a global scale, and provides information on how to reduce GHG emissions, both within and between crop systems.

Study aim and strategy
The aim of this study was to characterise global systems-wide variation in life cycle greenhouse gas (GHG) emissions resulting from the production of oil palm (Elaeis guineensis), soybean (Glycine max), rapeseed (Brassica spp.) and sunflower (Helianthus annuus) derived vegetable oil. This was achieved through a harmonised re-analysis of primary data sources extracted from a saturating search of published literature, combined with systems-wide calculations of the carbon cost of agricultural land occupation as modelled through the concept of carbon storage opportunities (Searchinger et al., 2018). Due to variation in emissions calculations, system boundaries, and functional and time units between studies, extraction of raw emissions source data from the literature, rather than reported emissions values, was crucial for achieving consistent and comparable results. Life cycle input data were used to calculate associated GHG emissions, reported as carbon dioxide equivalent (CO₂e), based on a custom database of emission factors curated as part of this study (Supplementary Data 1). Thus, the life cycle emissions values reported here reflect not just a comparison of the results of previous LCAs, but rather a comprehensive reassessment of variation in vegetable oil GHG emissions calculated from a saturating dataset of agricultural and processing inputs. Land use carbon storage indicators were generated taking account of native and crop-specific vegetation and soil carbon stocks. Associated CO₂e values were then additionally assigned to each production system as a function of the difference between native and agricultural land use carbon stocks, permanently amortised over 100 years. To reduce the impacts of within-study reporting biases, studies reporting the same production system in terms of geography and technique were combined into single production systems. Thus, reporting gaps were filled and additional confidence could be attributed to data items reported in multiple studies. Finally, life cycle emissions were attributed to the vegetable oil fraction of crop output by economic allocation.

System boundaries
Each system studied was split into distinct production stages: 1. Land use, 2. Cultivation and harvest, 3. Seed drying and storage (all crops except oil palm), 4. Transport to processing facilities, 5. Processing and refining, and 6. Treatment of palm oil mill effluent (POME; only for palm oil production systems). Stages post-refining such as packaging, distribution and use are omitted, due to their limited reporting and highly variable nature. Wastewater treatment (as palm oil mill effluent) was only considered for palm systems due to an absence of reporting in the literature set for other crop systems. A full list of data items collected within each production stage can be found in Supplementary Data 2. The life cycle of oil palm production is considerably different from that of the other crop types included here. Whilst soybean, rapeseed and sunflower are annual crops, sown and harvested within the same twelve-month period, a single oil palm plantation is generally maintained for around 25-30 years, and includes seedling production and juvenile stages during which time no vegetable oil is produced. To account for this, the entire oil palm life cycle was modelled from seedling production to end of productive lifespan per hectare. Resulting GHG emissions were then divided by the total plantation lifespan in years to obtain normalised annual GHG emissions per hectare. Inputs of services such as cleaning, marketing, accounting, and overheads including office space electricity and upkeep were omitted, due to a lack of reporting in studies included in the meta-analysis.
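The normalisation of oil palm emissions to an annual basis can be illustrated with a minimal sketch. The function name and the numerical values below are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
def annualise_plantation_emissions(lifetime_kg_co2e_per_ha, lifespan_years):
    """Spread whole-plantation emissions evenly over every year of the
    plantation lifespan, including the non-productive juvenile years."""
    if lifespan_years <= 0:
        raise ValueError("lifespan_years must be positive")
    return lifetime_kg_co2e_per_ha / lifespan_years


# Hypothetical plantation: 50,000 kg CO2e per ha over a 25-year lifespan.
annual = annualise_plantation_emissions(50_000.0, 25)
print(annual)  # kg CO2e per ha per year
```

Dividing by the full lifespan, rather than by productive years only, keeps the emissions of the seedling and juvenile stages attached to the oil that is eventually produced.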

Functional units
For systems modelling and spreadsheet management, energy and material inputs are referred to on a per hectare (ha) basis, since this unit is most relevant to decision making at the cultivation stage. For the purpose of reporting final results, the functional unit is defined as one kg of refined vegetable oil, which enables clear comparison of results between crop systems.

Information sources, search strings and record compilation
To thoroughly extract all relevant literature, eight individual bibliographic databases were consulted. These were Web of Science (all databases), Scopus, PubMed, PubMed Central, Wiley Online Library, SpringerLink, JSTOR, and ScienceDirect. These databases were selected based on their multidisciplinary content, search string capacity and overall performance, as analysed by Gusenbauer and Haddaway (2020). Search strings were formulated to identify studies that concerned oil palm, soybean, rapeseed and/or sunflower in the context of oil production and sustainability. Biofuel/biodiesel was also included in the search strings to incorporate studies which may include data relating to earlier production stages (e.g. cultivation of relevant crops). Search strings varied depending on the required syntax of each bibliographic database, but broadly followed the string used for Web of Science as per below:
Palm: (("palm" OR "elaeis guineensis") AND ("life cycle assessment" OR "life cycle analysis" OR "lca" OR "greenhouse gas emissions" OR "greenhouse emissions" OR "carbon footprint" OR "sequestration" OR "nutrient loss") AND ("oil" OR "biodiesel" OR "biofuel"))
Soybean: (("soy" OR "soya" OR "soybean" OR "soyabean" OR "glycine max") AND ("life cycle assessment" OR "life cycle analysis" OR "lca" OR "greenhouse gas emissions" OR "greenhouse emissions" OR "carbon footprint" OR "sequestration" OR "nutrient loss") AND ("oil" OR "biodiesel" OR "biofuel"))
Rapeseed: (("rapeseed" OR "canola" OR "rape" OR "oilseed rape" OR "brassica") AND ("life cycle assessment" OR "life cycle analysis" OR "lca" OR "greenhouse gas emissions" OR "greenhouse emissions" OR "carbon footprint" OR "sequestration" OR "nutrient loss") AND ("oil" OR "biodiesel" OR "biofuel"))
Sunflower: (("sunflower" OR "helianthus") AND ("life cycle assessment" OR "life cycle analysis" OR "lca" OR "greenhouse gas emissions" OR "greenhouse emissions" OR "carbon footprint" OR "sequestration" OR "nutrient loss") AND ("oil" OR "biodiesel" OR "biofuel"))
Full search strings used for all other databases are included in Supplementary Data 3 along with number of search results returned for each. Additional searches were performed in Web of Science filtered to only include results from The International Journal of Life Cycle Assessment with search strings limited to include only crop identifier terms (Supplementary Data 3). In general, searches were directed to scan only text in the title, abstract and in any keywords, since searching in full text records led to too many spurious results. Initial searches were performed on 13th February 2020. In addition, Web of Science email alerts were set up for each of the full search terms listed above. New publications that were indicated by these alerts were screened ad hoc throughout the remainder of 2020, and relevant literature items were added to the respective GHG emissions models where necessary. Thus, the literature included in the meta-analysis described here can be considered to represent the entire set of relevant literature present in the consulted databases from the start of 2000 to the end of 2020. Records were managed in EndNote X9 (Clarivate Analytics, Philadelphia, PA, USA).
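The Web of Science strings listed earlier in this section share a common structure: a crop-specific group of terms, combined with fixed topic and product groups. A minimal sketch of how such strings could be assembled is given here. The helper names (`or_group`, `build_search_string`) are hypothetical, and the sketch reproduces only the Web of Science form, not the database-specific syntax variants in Supplementary Data 3.

```python
# Topic and product terms common to all four crop searches.
TOPIC_TERMS = [
    "life cycle assessment", "life cycle analysis", "lca",
    "greenhouse gas emissions", "greenhouse emissions",
    "carbon footprint", "sequestration", "nutrient loss",
]
PRODUCT_TERMS = ["oil", "biodiesel", "biofuel"]

# Crop identifier terms, as listed in the search strings.
CROP_TERMS = {
    "Palm": ["palm", "elaeis guineensis"],
    "Soybean": ["soy", "soya", "soybean", "soyabean", "glycine max"],
    "Rapeseed": ["rapeseed", "canola", "rape", "oilseed rape", "brassica"],
    "Sunflower": ["sunflower", "helianthus"],
}


def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_search_string(crop_terms):
    """Combine crop, topic and product groups with AND."""
    groups = [or_group(crop_terms), or_group(TOPIC_TERMS), or_group(PRODUCT_TERMS)]
    return "(" + " AND ".join(groups) + ")"


for crop, terms in CROP_TERMS.items():
    print(f"{crop}: {build_search_string(terms)}")
```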

Literature eligibility criteria
Studies were assessed for eligibility for inclusion against nine criteria (Supplementary Data 4). These were formulated to fulfil the PRISMA statement reporting guidelines, designed to promote transparent and complete reporting of systematic reviews and meta-analyses (Liberati et al., 2009). Literature was required to be original and complete, published in English between the beginning of the year 2000 and the end of 2020, and to significantly concern production of oil palm, soybean, rapeseed and/or sunflower over other crops, in a commercially viable setting as opposed to experimental or speculative (e.g. on abandoned quarries), in the context of sustainability. Studies were also required to contain life cycle input data relevant to the system boundaries described above, and to frame their input data in terms of one or more of the functional units used here or enable recalculation into such units based on available data. Studies were incorporated into the life cycle database regardless of the nominally defined vegetable oil end-use, provided that earlier production stages relevant to vegetable oil production in general were also included. In such cases, later processing stages relevant only to specific industrial uses of vegetable oil were not incorporated.

Screening
After removal of duplicates, records were exported from EndNote using custom output styles to Microsoft Excel for screening. Records were initially screened based on publication year, language and type, then by title and finally abstract, to quickly exclude irrelevant literature. Full text articles were accessed online for the remaining records. Text, tables, figures and supplementary information were consulted to ensure that only relevant literature was retained for analysis. On occasion, unique records corresponding to the same study and/or dataset were identified, for example where a conference paper was submitted prior to a full journal submission. In these cases, only the most complete or recent record was retained. An overview of the number of sources identified, screened, excluded and retained for analysis is reported for each crop in PRISMA-style flow diagrams (Liberati et al., 2009) in Supplementary Data 5-8.

Data collection process
Data collection for the meta-analysis utilised custom life cycle input databases managed in Microsoft Excel. Each literature record was given a unique source identifier and apportioned to a unique row within the relevant spreadsheet. Where present in each record, summary information including study location, cultivation practices and oil extraction methods was noted. Relevant data were then identified in tables, figures, text and supplementary information, extracted manually, and used to populate the life cycle input database. The reporting of certain data items was rationalised in the database to provide a suitable number of values for comparison. For example, chemical disease/pest controls were grouped into herbicide, insecticide, fungicide and unspecified pesticide items, rather than reporting specific chemicals used. Similarly, fertilisers were grouped into major data items including synthetic nitrogen (N), urea N, manure (total weight), phosphate (as P₂O₅), and potassium oxide (K₂O). Life cycle input data were all expressed per hectare in the initial databases. Data that were expressed in alternative units in the literature were converted using other available data. Study-specific input data were used to perform conversions as much as possible, which mostly involved converting by a function of reported yield. However, system yield values were assumed in cases where such information was not available, firstly by considering average values reported across other studies in the life cycle input database reporting the same production system, and as a last resort from the online statistical database FAOSTAT (FAO, 2021). Consistent units were utilised for individual data items, including kg for material inputs, and MJ for energy inputs. Where these were reported differently in literature records, values were converted using consistent conversion ratios, e.g. 1 kWh = 3.6 MJ, 1 L diesel = 0.832 kg. A full list of data items collected, conversion factors used, and assumptions made are reported in Supplementary Data 9. Where no data were available in individual studies for a given data item, and no value could be inferred from other available data, cells were left blank in the life cycle input database. The exception was where it was deemed likely that the true value for a specified category was zero if not reported. For example, if a study reported kg of urea N applied to a field but failed to mention synthetic N, it was assumed that no synthetic N was used. In these cases, zero values were added to relevant cells in the input database.
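The unit harmonisation described above can be sketched as follows. Only the two conversion ratios quoted in the text are used; the function names and the per-tonne example are hypothetical illustrations of converting by reported yield, and the full set of conversion factors remains that of Supplementary Data 9.

```python
MJ_PER_KWH = 3.6          # 1 kWh = 3.6 MJ
KG_PER_L_DIESEL = 0.832   # 1 L diesel = 0.832 kg


def kwh_to_mj(kwh):
    """Convert an energy input from kWh to MJ."""
    return kwh * MJ_PER_KWH


def diesel_litres_to_kg(litres):
    """Convert a diesel input from litres to kg."""
    return litres * KG_PER_L_DIESEL


def per_tonne_to_per_ha(value_per_tonne_crop, yield_tonnes_per_ha):
    """Re-express an input reported per tonne of crop on a per-hectare
    basis, using the yield reported (or assumed) for the system."""
    return value_per_tonne_crop * yield_tonnes_per_ha


# Hypothetical record: 10 kWh electricity and 100 L diesel per hectare,
# plus 5 kg of an input per tonne of crop at a yield of 3 t per ha.
print(kwh_to_mj(10))
print(diesel_litres_to_kg(100))
print(per_tonne_to_per_ha(5.0, 3.0))
```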

Assessing risk of bias and record consolidation
It was assumed that reporting bias existed within studies, including variation in included data items, and choice of analysing first-hand production data, survey data, regional average data and/or data from unverified assumptions. Bias was also assumed across studies, including underrepresentation of some systems in the literature. To highlight, and where possible address this, the following measures were taken. For each record, it was noted what kind of system was used to acquire input data. Where this was survey or first-hand production data, the number of participants/farms represented was noted. Many records appeared highly similar in nature, corresponding to the same crop grown in the same geographic region in a highly similar manner. Such records were consolidated into individual production systems, based on geographic production range and cultivation/processing methods, as per Poore and Nemecek (2018). For each data item for each production system, the mean of all reported values was then calculated and used as the system standardised value. Since cells in the database were left blank when the corresponding data were not available in individual studies, such cells were left out of mean calculations, and thus missing data did not affect subsequent analyses. On the other hand, any imputed zero values were included in mean calculations, on the assumption that they were representative of genuine, within-system variation. This approach enabled most data items to be filled for each system, whilst highlighting the extent to which each system was represented in the literature. Where systems were still missing a value for a given data item, the value or mean from a highly similar production system was used where possible. For example, no data for diesel required for cultivation were available for cold-press rapeseed production in Spain, so the mean value for conventional rapeseed production in Spain was used, on the assumption that only later processing steps were likely to differ between these two systems. Where no data from a highly similar production system were available, missing values were imputed as the mean of data item values across all production systems for that crop type (Supplementary Data 9). Where appropriate, imputed values were weighted by system yield.
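The consolidation and gap-filling rules above can be sketched in a few lines. This is a simplified illustration, not the spreadsheet procedure itself: blank cells are represented by `None`, the function names are hypothetical, and the optional yield weighting of imputed values is omitted.

```python
from statistics import fmean


def system_mean(values):
    """Mean of reported values for one data item in one production system.
    Blank cells (None) are skipped; imputed zeros are kept."""
    reported = [v for v in values if v is not None]
    return fmean(reported) if reported else None


def fill_missing(system_value, similar_system_value, crop_system_values):
    """Fallback order: the system's own mean, then the value from a highly
    similar system, then the mean across all systems for that crop."""
    if system_value is not None:
        return system_value
    if similar_system_value is not None:
        return similar_system_value
    return system_mean(crop_system_values)


# Hypothetical diesel inputs (kg per ha) from four records of one system:
# one blank cell and one imputed zero.
print(system_mean([10.0, None, 0.0, 20.0]))
# No own value and no similar system: fall back to the crop-wide mean.
print(fill_missing(None, None, [4.0, None, 6.0]))
```

Treating blanks and zeros differently matters: a blank leaves the mean unchanged, whereas an imputed zero lowers it.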

GHG emission factors database
To enable calculation of GHG emissions from the life cycle input database, a custom emission factors database was compiled. This comprised estimated carbon dioxide (CO₂), methane (CH₄) and nitrous oxide (N₂O) emissions associated with the manufacture, distribution and use of the energy and material inputs under study here. Collection of emission factors relating to the three gases individually allowed consistent calculation of CO₂e emission factors, which comprehensively represent Global Warming Potential (GWP). For this study, IPCC AR5 GWP100 conversion factors with climate-carbon feedbacks were used (IPCC, 2013a, 2013b). Emission factors were collected from multiple emissions databases including BioGrace (Neeft et al., 2012), UK Government GHG Conversion Factors for Company Reporting 2019 (UK Government, 2019), the EMEP/EEA air pollutant emission inventory guidebook 2019 (European Environment Agency, 2019), and the software GREET 2019 (version 1.3, Argonne National Laboratory, IL, USA), or from literature sources (Supplementary Data 1). Electricity emissions were calculated using country-specific emission factors to reflect regional variation in electricity generation practices. For some data items, only CO₂e emission factor values were available, many of which were calculated using previous GWP conversion estimates (see Supplementary Data 1). Where recalculation to AR5 values was not possible, these were retained as a best estimate of the emissions associated with the given factor. Of the gases under study here, only the CH₄ conversion factor differs between IPCC AR4 and AR5 (with climate-carbon feedbacks). Hence, for data items for which AR4 conversion factors are used here, it is likely that only minimal error in final emissions calculations exists.
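As an illustration of how per-gas emission factors are combined into a single CO₂e value, a minimal sketch follows. The GWP100 values are the published IPCC factors (AR5 with climate-carbon feedbacks: CH₄ = 34, N₂O = 298; AR4: CH₄ = 25, N₂O = 298); the function name and the example quantities are hypothetical.

```python
# 100-year Global Warming Potentials, kg CO2e per kg of gas.
GWP100_AR5_CCF = {"CO2": 1.0, "CH4": 34.0, "N2O": 298.0}
GWP100_AR4 = {"CO2": 1.0, "CH4": 25.0, "N2O": 298.0}


def co2e(co2_kg=0.0, ch4_kg=0.0, n2o_kg=0.0, gwp=GWP100_AR5_CCF):
    """Combine per-gas emissions (kg) into kg CO2e using the given GWPs."""
    return (co2_kg * gwp["CO2"]
            + ch4_kg * gwp["CH4"]
            + n2o_kg * gwp["N2O"])


# Hypothetical input emitting 1 kg CO2, 0.01 kg CH4 and 0.001 kg N2O.
print(co2e(1.0, 0.01, 0.001))                  # AR5 with feedbacks
print(co2e(1.0, 0.01, 0.001, gwp=GWP100_AR4))  # AR4, differs only via CH4
```

Because the two sets of factors differ only for CH₄, an emission factor retained at AR4 values deviates from its AR5 equivalent only in proportion to its methane content.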

Modelling land use through carbon storage opportunity costs
Land use was modelled here using the principle of carbon storage opportunity costs (Searchinger et al., 2018). This compares the carbon stock of native vegetation and soil in a given area, with the carbon stock of vegetation and soil, at equilibrium, of the same area used for production of a given crop. The difference in carbon stored between the two systems can be considered a carbon storage opportunity cost if the land use with the lower carbon storage potential is maintained. This is balanced by productivity, whereby carbon storage opportunity cost is divided by the quantity of food produced. Carbon storage opportunity cost forms a multi-use indicator, allowing comparison of carbon storage potentials in native vegetation and soils across geographic ranges, between different land uses in a given area, and between different areas of cropland with contrasting food productivity and/or native carbon stocks (Searchinger et al., 2018). Importantly, it allows for comparison of the carbon cost of agricultural land occupation between crop systems, irrespective of if or when land use change actually occurred.
For each production system, IPCC Climate Zone (Bickel et al., 2006), soil type (European Commission, 2010) and native land cover (Aalde et al., 2006) data were sourced and used to infer native (i.e. reflecting land use that would exist if land had never been cleared for agriculture) and agricultural vegetation and soil carbon stocks from IPCC values (IPCC, 2006) via datasets presented in Flynn et al. (2012). Agronomic input levels were grouped by total N application rates, where rates above 100 kg ha⁻¹ were considered high, between 50 and 100 kg ha⁻¹ medium, and below 50 kg ha⁻¹ low input, and used to infer agricultural soil carbon stocks. Resulting data were fed into carbon stock change calculations in the Excel tool provided by Flynn et al. (2012) to determine differences in stored carbon between native and agricultural land uses. Carbon stock changes calculated here only consider the differences in carbon stock between native and agricultural land use and not the emissions related to sowing and maintaining crops, which are instead considered in a separate analysis (see Section 2.11). It was assumed that any land use changes can support agricultural production for 100 years, as per Schmidt (2010). Carbon stock change values were therefore divided by 100 to permanently amortise carbon stock changes over 100 years, and attributed to each crop system as an annual emissions penalty (or saving, in cases where more carbon could be stored in the crop system than the native system; Supplementary Data 10-13). This amortisation period differs from that described by Searchinger et al. (2018), which instead made use of a 4% discount rate to the costs and benefits of future changes, equivalent to a 25-year amortisation period. We chose a longer amortisation period to enable attribution of the costs or benefits of land use decisions over a longer timeframe, which we feel is more reflective of the duration of environmental impacts and productivity benefits of clearing land for agriculture. Such a period was also previously used to evaluate the impacts of land use change between specific rapeseed and palm oil production systems (Schmidt, 2010). In addition, we feel that an amortisation period of 100 years better reflects the time required to regenerate ancestrally cleared land. Thus, the carbon costs of agricultural land use values generated here can be more directly compared with the carbon gains that could be realised if alternative land is in parallel set aside for regeneration of native land cover. Use of this timeframe assumes that regeneration of ancestrally cleared land can restore carbon to native levels within 100 years. This is likely to be true for regeneration of aboveground carbon stocks for most forest systems (Poorter et al., 2016; Bernal et al., 2018). For belowground carbon stocks, such regeneration is potentially less likely in such a timeframe, particularly for peatlands for which restoring carbon stocks may take significantly longer (Warren et al., 2017). However, we maintain that an amortisation period of 100 years is a suitable middle ground between attributing land use change emissions appropriately over a useful lifespan of land cleared for agricultural production, and assessing the rates of carbon stock regeneration in spared land. Multi-cropping within one year and fallow periods were not included in land use calculations, due to limited reporting within literature sources and because they were expected to largely offset one another. Carbon stored in agricultural biomass was assumed to be at equilibrium, as any biomass that is degraded after harvest was assumed to be regenerated in the next growing cycle.
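The amortisation step and the input-level grouping can be sketched as below. This is an illustration under stated assumptions rather than a reproduction of the Flynn et al. (2012) Excel tool: carbon stocks are assumed to be in tonnes of carbon per hectare, carbon is converted to CO₂ with the molar mass ratio 44/12, and the handling of application rates of exactly 50 or 100 kg ha⁻¹ (assigned to "medium") is an assumption, since the text does not specify it. All names and example values are hypothetical.

```python
C_TO_CO2 = 44.0 / 12.0     # mass of CO2 per unit mass of carbon
AMORTISATION_YEARS = 100   # amortisation period used in this study


def input_level(total_n_kg_per_ha):
    """Group a system by total N application rate (kg N per ha).
    Boundary values of exactly 50 or 100 are treated as medium (assumption)."""
    if total_n_kg_per_ha > 100:
        return "high"
    if total_n_kg_per_ha >= 50:
        return "medium"
    return "low"


def annual_land_use_cost(native_t_c_per_ha, agricultural_t_c_per_ha):
    """Annual carbon storage opportunity cost in kg CO2e per ha per year.
    Negative when the crop system stores more carbon than native cover."""
    difference_kg_c = (native_t_c_per_ha - agricultural_t_c_per_ha) * 1000.0
    return difference_kg_c * C_TO_CO2 / AMORTISATION_YEARS


# Hypothetical site: 150 t C per ha under native cover, 60 t C per ha cropped.
print(input_level(120), annual_land_use_cost(150.0, 60.0))
```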

Economic allocation and emissions reporting
For each system, life cycle input and direct emissions source data were multiplied by the respective emission factor from the emission factors database (Supplementary Data 1), and total life cycle and production stage specific emissions were determined. This was on an annual basis per hectare for soybean, rapeseed and sunflower. For oil palm, total emissions values were initially determined across the entire lifespan of the oil palm plantation, and then normalised to an annual basis by dividing by the plantation lifetime in years including non-productive years. Annual life cycle emissions values were then combined with amortised carbon storage opportunity losses/gains, and divided by annual, system specific oil yield, for reporting of life cycle GHG emissions per kg refined oil. This process is summarised in Eq. 1:

$$\text{Life cycle GHG emissions} = \frac{C_{LU100} + \sum \left(\text{inputs and direct emissions} \times EF\right)}{\text{vegetable oil yield}} \qquad (1)$$

where life cycle GHG emissions are given as kg-CO₂e kg-oil⁻¹, $C_{LU100}$ is the carbon storage opportunity cost of agricultural land occupation amortised over 100 years, expressed as kg-CO₂e ha⁻¹, $\sum(\text{inputs and direct emissions} \times EF)$ is the sum of all production system specific life cycle input items and direct emissions sources multiplied by their respective emissions factor (EF), expressed as kg-CO₂e ha⁻¹, and vegetable oil yield is the annual, system specific oil yield, expressed as kg-oil ha⁻¹. Note that $C_{LU100}$ can be negative, in cases where the carbon storage potential of land used for agriculture is higher than that of the native land cover.
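The calculation described in this paragraph (amortised land use cost plus summed production emissions, divided by oil yield) can be sketched as follows. The data item names, emission factors and quantities are hypothetical placeholders and are not values from Supplementary Data 1.

```python
def life_cycle_emissions_per_kg_oil(c_lu100, inputs, emission_factors, oil_yield):
    """Life cycle GHG emissions in kg CO2e per kg refined oil.

    c_lu100          -- amortised land use cost, kg CO2e per ha (may be negative)
    inputs           -- annual quantity of each input or direct source, per ha
    emission_factors -- kg CO2e per unit of each input (same keys as inputs)
    oil_yield        -- annual oil yield, kg oil per ha
    """
    if oil_yield <= 0:
        raise ValueError("oil_yield must be positive")
    production = sum(qty * emission_factors[item] for item, qty in inputs.items())
    return (c_lu100 + production) / oil_yield


# Hypothetical system with two inputs.
inputs = {"diesel_kg": 50.0, "synthetic_n_kg": 100.0}
factors = {"diesel_kg": 3.0, "synthetic_n_kg": 5.0}
print(life_cycle_emissions_per_kg_oil(350.0, inputs, factors, 1000.0))
```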
Life cycle GHG emissions are reported as a whole for each crop system, and additionally as a proportion reflective of the economic value of the oil portion of total crop produce. Economic allocation of emissions was determined to be the most suitable method for distinguishing between emissions from different products, as this can reasonably be expected to influence land use decisions for a given area of land. Economic values of crop portions were determined primarily using the World Bank Commodities "Pink Sheet" data (The World Bank, 2021) and USDA Oilseeds World Market and Trade reports (USDA, 2020) (Supplementary Data 14). Price data from October 2018 to September 2019 were used, rather than more recent data, in order to avoid impacts of COVID-19 on prices. The economic value of palm kernel meal was calculated using data from the Economics and Industry Development Division of the Malaysian Palm Oil Board (MPOB, 2020), using average export price data from January to December 2019. Allocation was performed separately for each system, based on quantified co-product outputs. Production emissions were allocated proportionately between co-products for all emissions sources with the exception of emissions only relevant to refining of vegetable oil after separation from co-products, which were allocated in full to the oil fraction.
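Economic allocation as described above can be illustrated with a short sketch. The prices, masses and function names are hypothetical; actual prices are those compiled in Supplementary Data 14.

```python
def oil_allocation_share(oil_kg, oil_price, coproducts):
    """Share of shared emissions attributed to oil, by economic value.
    coproducts is a list of (mass_kg, price_per_kg) pairs."""
    oil_value = oil_kg * oil_price
    total_value = oil_value + sum(mass * price for mass, price in coproducts)
    return oil_value / total_value


def allocated_oil_emissions(shared_kg_co2e, refining_kg_co2e, share):
    """Shared emissions are split by value; refining-only emissions
    are attributed in full to the oil fraction."""
    return shared_kg_co2e * share + refining_kg_co2e


# Hypothetical hectare: 200 kg oil at 0.75 per kg and 800 kg meal at 0.3125 per kg.
share = oil_allocation_share(200.0, 0.75, [(800.0, 0.3125)])
print(share)
print(allocated_oil_emissions(1000.0, 50.0, share))
```

In this example the oil makes up one fifth of the output mass but 37.5% of its value, so it carries 37.5% of the shared emissions plus all refining emissions.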

Figure generation
Figs. 1 and 6 were generated in GraphPad Prism 9 (GraphPad Software, San Diego, CA, USA). Individual bar graphs in Fig. 4 were generated in Microsoft Excel 2016. Fig. 5 was generated in OriginPro 2021 (OriginLab Corporation, Northampton, MA, USA), where system specific life cycle GHG emissions estimates were binned into 0.5 kg CO₂e intervals and plotted using B-Spline line functions, weighted by system contribution to world production (Supplementary Data 15). Individual pie charts in Fig. 7 were generated in Microsoft Excel 2016, then manually scaled by represented emissions. Simple linear regression analyses were performed in GraphPad Prism.
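The production-weighted binning used for Fig. 5 can be sketched as follows. This covers only the binning step, not the B-Spline smoothing performed in OriginPro, and the function name and example values are hypothetical.

```python
import math
from collections import defaultdict


def weighted_bins(emissions, weights, width=0.5):
    """Sum each system's weight (e.g. share of world production) into the
    bin of the given width that contains its emissions estimate.
    Returns a dict keyed by the lower edge of each bin."""
    bins = defaultdict(float)
    for value, weight in zip(emissions, weights):
        lower_edge = math.floor(value / width) * width
        bins[lower_edge] += weight
    return dict(sorted(bins.items()))


# Hypothetical systems: emissions in kg CO2e per kg oil, weights in % of production.
print(weighted_bins([0.2, 0.4, 0.7, 1.6], [10, 5, 20, 1]))
```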

Curating the global vegetable oil emissions database
We modelled life cycle GHG emissions resulting from palm, soybean, rapeseed and sunflower oil production by combining land use emissions analyses with a harmonised re-analysis of raw crop production and processing data obtained from a saturating search of published literature. Literature searches identified 2814 unique sources for potential inclusion, of which 253, published between the years 2000 and 2020, were retained for quantitative analysis after assessment against nine inclusion criteria (Supplementary Data 4-6, Supplementary Data 16). The literature set contains records corresponding to major production regions for palm (Indonesia, Malaysia), soybean (China, USA, Brazil, Argentina), rapeseed (Canada, Germany, India), and sunflower oil (Ukraine; FAO, 2021). However, no relevant literature records were identified for rapeseed production in China, or sunflower production in Russia, despite these being the second largest producers of rapeseed and sunflower oil, respectively (FAO, 2021). It is possible that sources exist for these production systems in non-English languages, which were not included in this analysis. Across sources based on first-hand agricultural data or surveys that disclosed the study year, data are typically reported from between the years 2007 and 2017 for palm oil, 2007 and 2018 for soybean oil, 2004 and 2017 for rapeseed oil, and 2003 and 2014 for sunflower oil. However, two palm oil studies and one study each for soybean and sunflower oil report findings based on longer-term field studies, the oldest of which started in 1985.
We used the literature set to assess oil production GHG emissions from crop cultivation through to oil refining (Fig. 2). Where multiple production systems, distinct in crop type, geography and/or production type, were included within single literature sources, we treated each as an individual record. Data from a total of 439 records across crops were compiled (Supplementary Data 17). We then consolidated records into specific oil production systems, based on geography and cultivation/processing methods, to eliminate as far as possible the errors and reporting gaps present in individual records (Supplementary Data 10-13). Thus, 112 distinct vegetable oil production systems are represented here, with the combined data derived ultimately from almost 6000 producers in 38 countries. Together, the 112 vegetable oil production systems represented here are responsible for production of 71.3% of the world's vegetable oil (Supplementary Data 17).

High yielding crops for lower land use impacts
It is relatively common for vegetable oil LCAs to consider land use as an impact category, either highlighting the area required for production (Parra et al., 2020) or the effects such land occupation might have on biodiversity (Wahyono et al., 2020). However, many vegetable oil LCAs, as well as various GHG emissions calculators, do not include this impact category at all (Colomb et al., 2013; Cerri et al., 2017; Yousefi et al., 2017; Ankathi et al., 2019; Espino et al., 2019; Fridrihsone et al., 2020). Further, of the studies that do include land use as an impact category, many omit the carbon costs of land use occupation (Brondani et al., 2015; De Marco et al., 2016), or alternatively, only consider the carbon costs of recent land use changes (Esteves et al., 2016; Folegatti Matsuura et al., 2017). Failure to assign land use costs to crops grown on ancestrally cleared land could result in intergenerational inequity. For instance, most land clearance for agriculture in Europe took place prior to the 1800s, whereas cropland in various developing regions, including Latin America and SE Asia, has been expanding steadily over the last 100 years (Goldewijk et al., 2017). Whilst only minimal carbon stock changes might be expected from continuous agricultural occupation of ancestrally cleared land, it is likely that such land could store more carbon if it were set aside for regeneration of native vegetation. To overcome land use change metric inequity, we modelled the impacts of agricultural land occupation here using carbon storage opportunity principles, as described in detail by Searchinger et al. (2018). In essence, we explicitly acknowledge that for each year of continuous agricultural land use, an opportunity to sequester carbon from the atmosphere is lost.
The environmental impacts of land use can be balanced by productivity. If a given system can produce large amounts of food per unit area, it may be more efficient to use that land for agriculture, freeing up space elsewhere to store carbon more effectively. We consider two vegetable oil production systems from our analysis in Fig. 3. The presented systems are representative of approximately 26% of global palm oil, and 12% of global rapeseed oil production, respectively. Native tropical rainforest in Indonesia has a total carbon stock of 290 t per hectare, whereas one hectare of oil palm has a carbon stock of 136.6 t (Fig. 3a). Deforesting one hectare of rainforest to grow oil palm would therefore represent a carbon storage opportunity cost of 153.4 t, whilst yielding 3585 kg refined oil per year. Forest in Germany has a carbon stock of 179 t per hectare and one hectare of rapeseed 99.4 t (Fig. 3b). Whilst the carbon storage opportunity cost between these land uses is only 79.6 t, rapeseed is less productive than oil palm: within the systems presented here, 2.59 ha of rapeseed are required to provide the same quantity of oil per year as one hectare of oil palm. The carbon storage opportunity cost between 2.59 ha of temperate forest and rapeseed is 206.4 t, higher than that of the oil palm system (153.4 t). We alternatively compare the total carbon stocks of these two scenarios assuming that one offsets the other (Fig. 3c). In Scenario 1, we dedicate 2.59 ha to rapeseed production in Germany, sparing one hectare of land in Indonesia. Total carbon stored among all vegetation and soils in this scenario is 548 t. In Scenario 2, we dedicate one hectare to oil palm production in Indonesia, sparing 2.59 ha of temperate forest in Germany. The carbon stored in the latter scenario is higher (601 t), suggesting that this is the more efficient use of land for oil production.
Fig. 2. System boundaries of the harmonised reanalysis of life cycle greenhouse gas emissions from vegetable oil production. Major inputs and emission sources indicated.
The analysis presented above demonstrates that vegetable oil produced in high yielding systems can have a lower associated carbon footprint than in alternative lower yielding systems, even if the alternative systems have a lower land use carbon footprint per hectare. However, it is stressed that for the palm oil production system presented above to result in more carbon stored overall, the area used to produce the same quantity of rapeseed oil must actively be dedicated to regeneration of forest. Rapeseed also yields more animal feed than oil palm per hectare: based on the present analysis, one hectare of rapeseed production in Germany yields 2121.82 kg pressed seed cake per year, whereas one hectare of oil palm production in Indonesia only yields 460.16 kg palm kernel meal. Increased production of animal feed as a co-product could offset demand for animal feed from oil crops elsewhere, potentially shifting the balance of results presented. This metric also only considers GHG emissions, and not the impact of land use on other sustainability indicators such as biodiversity. For instance, oil palm expansion has been linked to extensive reduction in species richness and abundance across taxa including insects, birds, small mammals and primates, and surviving species are more likely to be generalists as opposed to the specialised species found in native rainforest (Yule, 2010; Foster et al., 2011; Drescher et al., 2016; Dislich et al., 2017). The impacts of land use change in any region on biodiversity must be properly considered before making any global land use change decisions, as it is unlikely for biodiversity to be completely restored to pre-clearance levels in reforested land once lost (Bremer and Farley, 2010).
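The carbon stock comparison between the Indonesian oil palm and German rapeseed systems in this section can be retraced with a few lines of arithmetic. The sketch below uses only the per-hectare stocks and the 2.59 ha area ratio quoted in the text; the function and variable names are illustrative assumptions. Because the area ratio is rounded, the totals differ from the published values by a fraction of a tonne.

```python
# Carbon stocks (t per hectare) quoted in the text for the two example systems.
RAINFOREST_INDONESIA = 290.0
OIL_PALM_INDONESIA = 136.6
FOREST_GERMANY = 179.0
RAPESEED_GERMANY = 99.4

# Hectares of rapeseed needed to match the annual oil output of 1 ha of oil palm.
# The text reports this rounded to 2.59, so results below are approximate.
AREA_RATIO = 2.59


def opportunity_cost(native_stock, crop_stock, area_ha=1.0):
    """Carbon storage forgone (t) by cropping `area_ha` instead of native cover."""
    return (native_stock - crop_stock) * area_ha


palm_cost = opportunity_cost(RAINFOREST_INDONESIA, OIL_PALM_INDONESIA)
rapeseed_cost = opportunity_cost(FOREST_GERMANY, RAPESEED_GERMANY, AREA_RATIO)

# Scenario 1: grow rapeseed in Germany, keep 1 ha of rainforest in Indonesia.
scenario_1 = RAPESEED_GERMANY * AREA_RATIO + RAINFOREST_INDONESIA
# Scenario 2: grow oil palm in Indonesia, keep 2.59 ha of forest in Germany.
scenario_2 = OIL_PALM_INDONESIA + FOREST_GERMANY * AREA_RATIO

print(f"Opportunity cost, oil palm: {palm_cost:.1f} t")
print(f"Opportunity cost, rapeseed: {rapeseed_cost:.1f} t")
print(f"Scenario 1 total stock: {scenario_1:.1f} t")
print(f"Scenario 2 total stock: {scenario_2:.1f} t")
```

The ordering of the two scenarios does not depend on the rounding: Scenario 2 stores roughly 50 t more carbon for the same oil output.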

Low native carbon stock land for sustainable oil production
We applied carbon storage opportunity costs/gains between native and agricultural land uses as a carbon penalty/credit to each vegetable oil production system. This was expressed as CO2e, amortised permanently over 100 years (Supplementary Data 10-13). Only one vegetable oil production system in this study was associated with a carbon storage opportunity gain: areas of Canada for which the land cover is native temperate steppe store 11.75 t less carbon per hectare than the same land used for no-till rapeseed production. This is a result of low initial carbon stocks in native biomass, combined with high agricultural inputs including manure addition to the soil, which can build soil carbon stocks (Flynn et al., 2012). Allocated to refined oil, this carbon storage opportunity gain corresponds to a 0.46 kg CO2e reduction in life cycle emissions per kg rapeseed oil produced in this system (Fig. 4c). However, it is worth noting that whilst high input agriculture can build soil carbon stocks in some instances, care should also be taken to avoid excessive inputs of fertilisers and other agrochemicals and to manage these accordingly to limit harmful runoff to waterways (Carpenter and Bennett, 2011; Withers et al., 2014; Huang et al., 2017).
Soybean and rapeseed grown in the USA, and soybean, rapeseed and sunflower grown in Iran, can also have low associated GHG emissions, resulting from low native vegetation and soil carbon stocks. Land use emissions from other rapeseed production systems ranged from 0.90 to 4.91 kg CO2e per kg refined oil (Fig. 4c; Supplementary Data 12), whilst sunflower land use emissions fell within a similar range from 0.99 to 6.90 kg CO2e per kg refined oil (Fig. 4d; Supplementary Data 13). Land use emissions for most soybean systems ranged from 0.36 to 5.53 kg CO2e per kg refined oil, but two systems, corresponding to production in South Africa and Nigeria, had higher emissions of 7.05 and 15.32 kg CO2e per kg refined oil, respectively (Fig. 4b; Supplementary Data 11). Meanwhile, land use emissions from palm systems fell into two groups, with emissions from oil palm grown on mineral soils ranging from 0.87 to 1.81 kg CO2e per kg refined oil, and on peat soils from 22.75 to 28.81 kg CO2e per kg refined oil (Fig. 4a; Supplementary Data 10). Unsurprisingly, yield was negatively correlated with land use emissions for oil palm (df = 27; R² = 0.28; P = 0.004), soybean (df = 25; R² = 0.51; P < 0.001), rapeseed (df = 35; R² = 0.14; P = 0.023) and sunflower (df = 21; R² = 0.60; P < 0.001): greater productivity per hectare could effectively spare land elsewhere for regeneration of native land cover. Soybean (df = 25; R² = 0.42; P < 0.001), rapeseed (df = 35; R² = 0.12; P = 0.038) and sunflower (df = 21; R² = 0.33; P = 0.006) land use emissions were also positively correlated with native vegetation carbon stocks, whereas oil palm land use emissions were very much a product of native soil carbon stocks (df = 27; R² = 0.98; P < 0.001; all simple linear regressions).
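As a rough illustration of how a per-hectare carbon storage opportunity cost becomes a per-kilogram land use figure, the sketch below amortises the cost over 100 years and divides by annual oil yield. It rests on assumptions not confirmed in this section: that carbon stocks are in tonnes of carbon (hence the 44/12 conversion to CO2), and that allocation to oil can be represented by a single hypothetical factor `oil_allocation`. The procedure actually used is documented in Supplementary Data 10-13.

```python
C_TO_CO2 = 44.0 / 12.0     # CO2:C molar mass ratio; assumes stocks are in t carbon
AMORTISATION_YEARS = 100   # amortisation period stated in the text


def land_use_emissions_per_kg(native_stock_t_ha, crop_stock_t_ha,
                              oil_yield_kg_ha_yr, oil_allocation=1.0):
    """Land use emissions in kg CO2e per kg refined oil.

    A negative result is a carbon credit (cropland stores more than native
    cover). `oil_allocation` is a hypothetical share of the burden assigned
    to oil rather than co-products; 1.0 means no allocation.
    """
    cost_t_carbon = native_stock_t_ha - crop_stock_t_ha
    annual_kg_co2e = cost_t_carbon * C_TO_CO2 * 1000.0 / AMORTISATION_YEARS
    return annual_kg_co2e * oil_allocation / oil_yield_kg_ha_yr


# Indonesian oil palm on mineral soil, using stocks and yield quoted earlier.
palm_mineral = land_use_emissions_per_kg(290.0, 136.6, 3585.0)

# A Canada-like case: native cover stores 11.75 t/ha less than the crop.
# The stocks are offsets from zero and the yield is a placeholder, not a
# reported value; only the sign of the result is meaningful here.
steppe_credit = land_use_emissions_per_kg(0.0, 11.75, 1000.0)

print(f"Oil palm, mineral soil: {palm_mineral:.2f} kg CO2e per kg oil")
print(f"Steppe example is a credit: {steppe_credit < 0}")
```

Under these assumptions the unallocated oil palm value comes out at about 1.57 kg CO2e per kg oil, which sits inside the 0.87 to 1.81 range reported above for mineral soils; this is a plausibility check, not a reproduction of the published figure.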

Current production systems not optimised for sustainability
We combined systems' land use emissions data with life cycle GHG emissions assessed through comprehensive re-analysis of published data. Variation in total vegetable oil production emissions across global production systems is presented in Fig. 5. For each crop, production emissions are fitted against the contribution of each system to global production. Based on the economically allocated dataset, total GHG emissions resulting from vegetable oil production in the across-crop median production system are 3.81 kg CO2e per kg refined oil. Life cycle GHG emissions from the median palm oil production system are roughly equal to the across-crop median: 3.73 kg CO2e per kg refined oil. Median life cycle GHG emissions from soybean oil production are higher than the across-crop median: 4.25 kg CO2e per kg refined oil. Meanwhile, median rapeseed and sunflower oil life cycle GHG emissions are lower than the across-crop median: 2.49 and 2.94 kg CO2e per kg refined oil, respectively. Life cycle GHG emissions from palm oil production are dependent on soil type and choice of methane capture technology. Palm oil produced on peat soils is associated with the greatest life cycle GHG emissions across all crops. In contrast, capturing methane emitted by palm oil mill effluent (POME) can reduce emissions by over 50% within certain production systems, in most cases to levels lower than the median emissions values for rapeseed and sunflower systems. The proportion of palm oil mills that have adopted methane capture technologies varies by region. In Indonesia, which is the world's largest producer of palm oil by country, currently only around 6% of POME produced is treated using methane capture technologies, the rest generally being treated using a series of aerobic and anaerobic open lagoons (Global Methane Initiative, 2015; Schmidt and De Rosa, 2020). Meanwhile in Malaysia, the world's second largest producer of palm oil, around 27.7% of mills have adopted methane capture technologies (Loh et al., 2019). However, whilst additional methane capture facilities are currently under construction or in planning (Loh et al., 2017), the majority of palm oil mills have yet to adopt methane capture technologies.
For soybean and rapeseed oil production systems, the lowest emissions are associated with vegetable oil production systems on land with low native carbon stocks, specifically in the USA and in no-till systems in Canada. Emissions resulting from rapeseed production in conventional tillage systems in Canada are more than twice as high as in no-till systems, as a result of differences in carbon stored in soils between each system (Fig. 5; Supplementary Data 10-13).
The world's largest producer of sunflower oil, Ukraine, has the production system associated with the second lowest crop-specific GHG emissions. However, within all other crops, it is clear that there is significant scope to reduce GHG emissions (Fig. 5). This could be achieved through more widespread adoption of emissions-reducing technologies, or through shifting the geographic production range. Soybean, rapeseed and sunflower oil life cycle GHG emissions are also strongly negatively correlated with yield, as are emissions from palm oil produced from trees grown on peat (Fig. 5 inset). It follows that if GHG emissions per hectare can stay broadly the same whilst productivity increases, the total emissions per unit of final product are effectively reduced. A major focus should therefore be on sustainably increasing production on land already occupied by agriculture. However, oil crop land occupation is still growing in some regions. For instance, oil palm plantation land area continues to grow at a rate of around 1.8% per year in Indonesia, and a rate of 1.4% per year in Malaysia (OECD; FAO, 2018). Meanwhile, yield increases are predicted to only account for 55% of overall production growth from soybean, and around 60% of overall production growth of oilseed rape and sunflower (OECD; FAO, 2018). Whilst increasing productivity on land already occupied by agriculture should be favoured, care should also be taken to avoid increasing production through means that result in large amounts of additional emissions. For instance, one might seek to increase yield through applying greater quantities of synthetic nitrogen. However, synthetic nitrogen is associated with almost 6 kg CO2e per kg applied (Neeft et al., 2012). Therefore, a better approach might be to identify genotypes with a high yield potential under relatively low nitrogen supply (Storer et al., 2018).
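The trade-off described above, between raising yield and adding emissions through extra synthetic nitrogen, can be made concrete with a small sketch. Only the nitrogen footprint of roughly 6 kg CO2e per kg applied comes from the text; the baseline emissions, yields and nitrogen doses are hypothetical values chosen for illustration, as are the function names.

```python
N_FOOTPRINT = 6.0  # kg CO2e per kg synthetic N applied ("almost 6" in the text)


def emissions_per_kg_oil(emissions_kg_co2e_ha, oil_yield_kg_ha):
    """Emissions intensity: per-hectare emissions divided by per-hectare yield."""
    return emissions_kg_co2e_ha / oil_yield_kg_ha


def intensity_with_extra_n(base_emissions_ha, base_yield_ha,
                           extra_n_kg_ha, extra_yield_kg_ha):
    """Intensity after adding nitrogen and gaining some (assumed) extra yield."""
    emissions = base_emissions_ha + N_FOOTPRINT * extra_n_kg_ha
    return emissions_per_kg_oil(emissions, base_yield_ha + extra_yield_kg_ha)


# Hypothetical baseline: 3000 kg CO2e per ha and 1000 kg oil per ha.
base_intensity = emissions_per_kg_oil(3000.0, 1000.0)

# Yield rises 20% with unchanged per-hectare emissions (e.g. a better genotype).
higher_yield_intensity = emissions_per_kg_oil(3000.0, 1200.0)

# 50 kg extra N per ha: a weak and a strong (hypothetical) yield response.
weak_response = intensity_with_extra_n(3000.0, 1000.0, 50.0, 50.0)
strong_response = intensity_with_extra_n(3000.0, 1000.0, 50.0, 200.0)

# Extra oil yield needed for the added nitrogen to leave intensity unchanged.
break_even_yield_gain = N_FOOTPRINT * 50.0 / base_intensity

print(base_intensity, higher_yield_intensity, weak_response, strong_response)
print(f"Break-even yield gain: {break_even_yield_gain:.0f} kg oil per ha")
```

In this toy case the nitrogen only pays for itself in emissions terms if it delivers at least 100 kg of additional oil per hectare, whereas a yield gain achieved without extra inputs lowers the intensity directly.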

Mitigating emissions through management choices
Whole life cycle emissions range from 0.73 kg CO2e per kg refined oil for no-till rapeseed oil production in Canada, to 31.20 kg CO2e for smallholder palm oil production in Indonesia on peat soils without the use of methane capture technology in the processing mill (Fig. 6; Supplementary Data 10-13). Whilst much of this variation is driven by land use, considerable variation in emissions from other production stages also exists. For instance, soybean cultivation emissions range from 0.27 to 3.89 kg CO2e per kg oil. The range in cultivation emissions is lower for other oilseeds, but still varies 3.55-fold and 5.75-fold between rapeseed and sunflower production systems, respectively (Fig. 6). Solutions to reduce production stage emissions are specific for each system. Production of soybean, rapeseed and sunflower in Iran is associated with high emissions from electricity generation, used to power irrigation systems. Similarly, high GHG emissions from seed drying and storage in some regions are a result of high electricity production footprints (Supplementary Data 1, Supplementary Data 22-25). Reducing electricity requirements for irrigation or seed drying is perhaps unrealistic, but shifting to more sustainable sources of electricity could bring cultivation emissions down (Fehrenbach et al., 2016). Decreasing the distance between cultivation and processing centres, or increasing the fuel efficiency of transport vehicles, could reduce transport emissions. Adoption of POME methane capture technologies could considerably reduce emissions from most palm oil production systems: POME is a bigger source of GHG emissions than land use in the median palm oil production system, responsible for 49% of life cycle emissions (Fig. 7a). For all other crops, land use is the dominating source of life cycle GHG emissions (Fig. 7b,c,d). However, synthetic nitrogen application represents a further major source of emissions, particularly for rapeseed (Fig. 7c) and sunflower (Fig. 7d) oil production systems, whilst agricultural diesel use forms the biggest source of non-land-use GHG emissions from both soybean (Fig. 7b) and sunflower oil production systems.

Conclusions
Vegetable oil demand and production are projected to continue to grow, particularly in developing regions in line with rising per capita income. Results presented in this paper indicate that to reduce the negative consequences associated with land use change, this growing demand should be met through increasing productivity on previously cleared land. However, we have also shown that expansion of vegetable oil production in areas of low native carbon stocks or high productivity potential could, in principle, lead to greater net carbon storage, as long as currently occupied areas with lower productivity potential, or higher carbon storage potential, are in parallel set aside for regeneration of native land cover. Selection of oil crop cultivars that are more nitrogen-use efficient will also help to limit the impacts of the high carbon footprint of synthetic nitrogen. In practice, any system to offset emissions from agricultural expansion in one region by regenerating land cover in another would likely require concerted efforts of multiple governments and stakeholders, and perhaps a global carbon credit system, whereby producers pay for regeneration and maintenance of forests elsewhere. It is difficult to see a global sustainability accounting system being implemented over the next few years. However, without globally integrated solutions to rising carbon emissions that acknowledge both production system and land use impacts, we are unlikely to reach net zero emissions targets.

CRediT authorship contribution statement
TDA contributed to study design, performed initial literature review, curated greenhouse gas emission databases, performed and contributed to interpretation of data analyses, and led writing of the manuscript. DES, PW and SJR contributed to study design, advised data analysis procedures, contributed to the interpretation of analyses and critically reviewed and edited the manuscript. All authors approve of the final version of the manuscript.

Fig. 3. Effect of land use on carbon storage. a: Carbon stored in one hectare (ha) of native tropical forest and a large-scale palm oil plantation in Indonesia on mineral soils. b: Carbon stored in temperate forest and a conventional rapeseed field in Germany in an area which yields the same quantity of vegetable oil as one hectare of oil palm. c: Total carbon storage potential of two vegetable oil production scenarios that result in the same quantity of vegetable oil. In Scenario 1, rapeseed cultivation is favoured, allowing tropical forest in Indonesia to be maintained or reforested. In Scenario 2, oil palm cultivation is favoured, allowing temperate forest in Germany to be maintained or reforested. Production systems shown here were selected as representative examples of each crop, based on life cycle emissions from each falling close to the crop specific median, and on a relatively large number of literature records for each being available.

Fig. 4. Life cycle GHG emissions per kg oil for palm (a), soybean (b), rapeseed (c) and sunflower (d) oil production systems split into emissions corresponding to the carbon cost of agricultural land occupation and all other emissions sources.

Fig. 5. Greenhouse gas (GHG) emissions as CO2 equivalent (CO2e) resulting from global palm (28 systems; 147 records), soybean (26 systems; 106 records), rapeseed (36 systems; 128 records) and sunflower (22 systems; 58 records) oil production systems. Emissions allocated by economic value to oil shown as filled curves, with non-allocated emissions shown as dotted curves for reference. The height of each curve represents the percentage of global production from each crop that results in the specified GHG emissions. Median GHG emissions from each crop indicated by white diamonds. Median GHG emissions from all crops combined, weighted by contribution to world vegetable oil production, shown as dashed blue line. Note one data point from non-allocated dataset outside of displayed range for soybean (conventional production in Nigeria; 49.56 kg CO2e per kg oil). Figure annotated with selection of pronounced production systems for reference, referring to the allocated emissions dataset in each case. Figure inset (bottom right) shows simple linear regressions between oil yield and life cycle GHG emissions for each crop based on allocated datasets. Palm data split into crops grown on peat (df = 5) and other soils (df = 21). Soybean (df = 25), rapeseed (df = 35) and sunflower (df = 21) oil production systems presented as single regression analyses.

Fig. 7. Contribution of input parameters to total life cycle greenhouse gas emissions for specific palm, soybean, rapeseed and sunflower oil production systems. Systems shown are palm oil production in Indonesia on mineral soil without methane capture technology (a), conventional soybean production in China (b), conventional rapeseed production in Germany (c), and conventional sunflower production in Ukraine (d). Production systems shown here were selected as representative examples of each crop, based on life cycle emissions from each falling close to the crop specific median, and on a relatively large number of literature records being available for each. Pie charts on left in each panel scaled to allow comparison between crops, with emissions not caused by palm oil mill effluent (POME) or land use shown in expanded pie charts to the right, again scaled to enable between-crop comparisons.