A rapid review of meta-analyses and systematic reviews of environmental footprints of food commodities and diets

Systematic reviews, sometimes including meta-analyses, are often presented as an approach for identifying healthy and sustainable diets. Here we explore to what extent systematic review protocols have been adopted by studies comparing environmental impacts of foods based on Life Cycle Assessment (LCA) results, and to what extent these studies comply with the PRISMA protocol for transparent reporting. Out of 224 studies screened, seven explicitly define themselves as systematic reviews and/or claim to carry out meta-analyses. Of these, only one acknowledges a review protocol, while none complies with all the PRISMA criteria. Nor do we believe that reviews of LCA results can comply with all the criteria or carry out meta-analyses, owing to underreporting of standard deviations and artificial sample sizes in LCAs. Nonetheless, reviews of food commodities and diets based on LCA results would benefit from better alignment with the criteria in systematic review protocols.


Introduction
Over the past decade the scientific literature has been increasingly populated by studies recommending diets for improved environmental and/or health performance. Conflicting conclusions from different systematic reviews and meta-analyses are, however, commonplace. For example, a series of articles published in Annals of Internal Medicine recently made headlines by downplaying health risks related to excessive consumption of red and processed meats (Johnston et al., 2019; Vernooij et al., 2019; Zeraatkar et al., 2019a, 2019b). Their conclusions were based upon a systematic review, including a meta-analysis, carried out in line with the PROSPERO protocol (Zeraatkar et al., 2017). Their general recommendation that adults can continue their current meat consumption, however, conflicts with almost all other similar studies to date (Chan et al., 2011; Larsson and Orsini, 2014; Wang et al., 2016). The results also sparked a heated debate among the scientific community, with criticism of the interpretation and of the exclusion of environmental impacts. An editorial in the same journal later argued that ethical concerns about animal welfare and environmental impacts related to beef would provide better arguments for reduced meat consumption than health concerns (Carroll and Doherty, 2019). Ironically, the literature on environmental impacts related to food commodities is subject to several similar academic challenges with regard to existing literature reviews, with some studies downplaying the environmental impacts of animal production (White and Hall, 2017).
Food is recognized as a major driver behind humanity's transgression of planetary boundaries, impacting the climate system (25% of global emissions), biogeochemical flows (100%), and biodiversity loss (75%) (Gordon et al., 2017; Steffen et al., 2015; Willett et al., 2019). An upsurge of studies quantifying the environmental impacts of different food commodities has hence been published over the past decade, igniting discourse in both public and policy domains (Aleksandrowicz et al., 2016; Reinhardt et al., 2020; Wilson et al., 2019). Their aim is generally to benchmark the environmental footprint of different food commodities or diets (set kinds and amounts of food commodities), to model the impacts of future food scenarios to give policy advice, and/or to compare the impacts of food to other sectors (e.g. travel, housing, or consumer goods) (Aleksandrowicz et al., 2016; Girod et al., 2014).
The majority of these footprint studies of foods focus on global warming, but sometimes also account for freshwater footprints, land use, eutrophication, acidification, and/or energy use (Clark and Tilman, 2017). The environmental footprints of different food commodities are generally derived from different Life Cycle Assessment (LCA) studies, and assembled to draw quantitative conclusions about the environmental impact(s) of food commodities or diets.
LCAs, much like observational health studies of diets, are riddled with prominent and discrete factors that influence the results. However, unlike dietary health studies based upon randomized trials or interventions, which are subject to confounding factors such as lifestyles, LCA results are mainly influenced by modeling choices (Cucurachi et al., 2016). Among other things, these include: underlying models; model parameters; assumptions; process data; impact assessment method; and methodological choices, such as co-product allocation and system boundary setting in LCAs (Heijungs and Guinée, 2007; Menten et al., 2013; Reap et al., 2008; Tu et al., 2018). Not accounting for these poses the danger of enduring misinformation to the public and policy makers, such as the claim that "lettuce produces more GHG than bacon does", contested by Cucurachi et al. (2016).
Some footprint studies of foods also identify themselves as systematic reviews or meta-analyses. According to Denyer and Tranfield (2009): 'A systematic review should not be regarded as a literature review in the traditional sense, but as a self-contained research project in itself that explores a clearly specified question, usually derived from a policy or practice problem, using existing studies.' A meta-analysis, by contrast, is a statistical subset of systematic reviews where summary statistics are used as extensions for formulas used in the primary studies (Borenstein et al., 2009). These two terms are often used interchangeably, but in reality, a meta-analysis is strictly numerical and should only be conducted in the context of a systematic review. Thus, a meta-analysis uses summary statistics to reevaluate the results of primary studies, allowing for conclusions to be made across a larger sample size made up by smaller sample sizes from several studies (Hunter and Schmidt, 2015). Individual study results are, in turn, weighted based upon their underlying sample size: the larger the sample, the smaller the variance, thereby suggesting how precise an estimate should be (Shadish and Haddock, 2009).
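As a minimal sketch of the weighting described above, the following Python example pools study means using fixed-effect inverse-variance weights. The function name and all input values are hypothetical and chosen only for illustration; they are not taken from any of the reviewed studies, and fixed-effect pooling is only one of several possible weighting schemes.

```python
import math


def inverse_variance_pool(estimates, variances):
    """Pool study means with fixed-effect inverse-variance weights.

    estimates: mean result reported by each study
    variances: variance of each study mean (standard error squared)
    Returns the pooled mean and its standard error.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("estimates and variances must be non-empty and equal in length")
    # A smaller variance (typically a larger sample) gives a larger weight.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * x for w, x in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical example: two studies with the same standard deviation
# but different sample sizes (values are illustrative only).
means = [20.0, 30.0]
sds = [4.0, 4.0]
sample_sizes = [4, 16]

# Variance of each study mean: sd^2 / n
variances = [sd ** 2 / n for sd, n in zip(sds, sample_sizes)]

pooled_mean, pooled_se = inverse_variance_pool(means, variances)
print(f"pooled mean = {pooled_mean:.2f}, standard error = {pooled_se:.3f}")
```

In this sketch the study with the larger sample pulls the pooled mean toward its own value, which is why the weighting breaks down when variances or sample sizes are missing or arbitrary.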
Meta-analysis was first described in 1976 by its originator, the statistician Gene Glass, as 'an analysis of analyses' (Glass, 1976). As it gained traction, a need for more solid reviewing practices was highlighted in the mid-1980s for the medical and social sciences (Light and Pillemer, 1984; Mulrow, 1987). This later evolved into the Quality of Reporting of Meta-analyses (QUOROM) statement (Moher et al., 2000), which, in turn, was updated to address conceptual and practical advances in the science of systematic reviews under the PRISMA acronym (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) (PRISMA, 2019). PRISMA has in turn been digitalized in a registry housed by the University of York and named PROSPERO (Booth et al., 2012). As of October 1st, 2019, PROSPERO also requires users to register prior to starting a review in order to avoid bias from altered search terms or unpublished findings (PROSPERO, 2019). There are also a number of alternative protocols, including the Collaboration for Environmental Evidence (CEE) Evidence Synthesis, established in 2003 (CEE, 2019), and the Cochrane Handbook for Systematic Reviews of Interventions by the Cochrane collaboration, founded in 1993 (Cochrane, 2020). Both CEE and Cochrane also facilitate their own journals.
The medical sciences were among the first to promote protocols for systematic reviews, in an effort to avoid the risk of bias or systematic error in evidence-based health care (Aromataris and Pearson, 2014). The adoption of protocols has, however, been slow even within the medical sciences, with fewer than half of the systematic reviews on MEDLINE in 2004 working from a protocol. One major reason for this is that satisfactory systematic reviews and meta-analyses are resource intensive (Tricco et al., 2015). Some protocols, in addition, require two individuals to do the same work independently (Tricco et al., 2015).
Regarding LCA, several studies, and even a special issue, highlight the need to harmonize LCA results before drawing conclusions (Lifset, 2012;Menten et al., 2013;Wiloso et al., 2012), with systematic review checklists specifically designed for LCA data (Zumsteg et al., 2012). Systematic reviews of LCA results from the energy sector have also adopted reviewing protocols (e.g. Blanco et al., 2020).
Given the still ongoing disputes in the medical sciences (Barnard et al., 2017), from which most protocols for systematic reviews and meta-analyses originate, we carried out a rapid review (Khangura et al., 2012) to evaluate to what extent these protocols have been adopted by the environmental food community, and whether they are adaptable to LCA results. Rapid reviews are shorter, broader, and less comprehensive than systematic reviews, generating descriptive, rather than qualitative, summaries (Khangura et al., 2012). Our aim was to evaluate 'to which extent systematic reviews and meta-analyses of food LCA results acknowledge protocols for systematic reviews or meta-analyses, and to which extent they are applicable to LCA results.'

Material and method
Literature was selected based upon the search phrase "+Diet + Food + Sustainable +"food consumption" +"Greenhouse gas emissions" +LCA" in Google Scholar (scholar.google.com, accessed April 2020). Only articles published in December 2019 or before were included, yielding 1690 results. Among these, only peer-reviewed scientific articles using LCA data to compare food commodities or diets and defining themselves as systematic reviews or meta-analyses were considered (see Fig. 1), resulting in only eight studies. These studies were then evaluated using the PRISMA 2009 checklist to see to what extent they fulfilled the set criteria.

Results
While not quantified in detail, most exclusions from the original article search were case studies and literature outside the main focus. Other exclusions were books, conference proceedings, reports, studies based on economic input-output models, and comparisons limited to certain food groups. Of the remaining 224 peer-reviewed articles comparing food LCA results, only eight identify themselves as systematic reviews or meta-analyses (Table 1). One article, Pairotti et al. (2015), was however excluded from further analysis, as its full analysis remains unavailable and could therefore not be fairly evaluated. Clune et al. (2017) was the only study that identifies itself as a systematic review and refers to a review protocol, namely PRISMA. Another set of food-related studies identify themselves as systematic reviews, and many refer to the PRISMA protocol, but these are reviews of the outcomes of already aggregated comparisons of the environmental impacts of diets (e.g. vegetarian vs. pescatarian), rather than of food items (LCAs of e.g. broccoli vs. fish) (Aleksandrowicz et al., 2016; Chai et al., 2019; Hallström et al., 2015; Jones et al., 2016).
While all seven remaining studies claim to carry out meta-analyses, none of them actually do, because they all fail to weight individual LCA study results by their underlying sample size. Few LCA studies present variances around their results, and even fewer derive these variances from empirical data (Bamber et al., 2020; Henriksson et al., 2013; Kuczenski, 2019). Among these, most are represented by artificial sample sizes derived using Monte Carlo simulations (Heijungs, 2020). This is because LCA results are aggregated sets of unit processes (e.g. fertilizer production, grow-out, and processing), where each unit process is represented by its own sample, which translates poorly to the traditional sample sizes of medical studies (e.g. number of patients). Consequently, only a minuscule fraction, if any, of all food LCA studies are theoretically eligible for meta-analysis. Thus, any one study could only carry out a systematic review at best, but many of the reviewed studies are in fact nothing of the kind.
Consequently, no study complies with all of the PRISMA criteria for systematic reviews (Table 1); more extensive argumentation for the criteria evaluations is presented in the Supporting Material (SM). All articles provide a structured summary in the form of an abstract, but there is a general lack of explicit questions that the reviews set out to address (PRISMA criterion 4; Table 1). Clark and Tilman (2017) specify some questions (e.g. organic versus conventional production, and grass- versus grain-fed beef), but their post-data analyses drew many broader conclusions beyond these pre-defined comparisons.
The methodology behind data extraction is also largely incomplete, with only Clune et al. (2017) presenting an electronic search strategy (PRISMA criterion 8). There is also a general lack of scrutiny about risk of bias related to methodological choices (PRISMA criterion 12). De Laurentiis et al. (2019) highlight the influence of methodological inconsistencies among studies, but downplay their importance with reference to Clune et al. (2017). Clune et al. (2017), however, only report that the median results for beef between their review of LCA results and another study (Lesschen et al., 2011) vary by 2.6%. Conversely, for other food commodities, such as butter, the estimates of Clune et al. (2017) range from 3.7 to 25 kg CO2-eq. per kg butter. Heller and Keoleian (2015) and Tom et al. (2016) carried out sensitivity analyses on whole diets using only minimum and maximum values (PRISMA criterion 16), resulting in three- to five-fold differences. With regard to other types of biases (PRISMA criterion 15), Clune et al. (2017) report only minor variation among conference papers, journal papers, and grey literature, while Clark and Tilman (2017) exclude LCAs from for-profit companies and highlight the underrepresentation of LCAs detailing food production in low-income countries.
Five of the seven articles present ranges around some or all of their food commodity categories, of which only De Laurentiis et al. (2019) determine confidence intervals (CIs) when possible (PRISMA criteria 20 and 21). These ranges and CIs, however, only represent the distribution of single-value or mean LCA results among studies, and disregard variability and uncertainty related to individual LCA estimates (Henriksson et al., 2013). As a result, these ranges and CIs are largely defined by the crudeness of bins (e.g. broccoli, tomatoes, olives, etc., versus simply vegetables), and disregard methodological inconsistencies, variability among farms, and uncertainty in emissions models. Sensitivity analyses were also carried out with regard to origin of commodities, functional unit, and mitigation measures (De Laurentiis et al., 2018; Mohareb et al., 2018).
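To illustrate how the reported range depends on the crudeness of bins, the short Python sketch below compares ranges for individual food items with the range of a single aggregated bin. All numbers and names are hypothetical placeholders, not values from the reviewed studies, and the sketch deliberately ignores uncertainty within each individual LCA estimate, which is the limitation noted above.

```python
# Hypothetical point-value LCA results (kg CO2-eq. per kg product).
# These numbers are illustrative only and not taken from any study.
results_by_item = {
    "broccoli": [0.4, 0.6, 0.5],
    "tomatoes": [0.8, 2.2, 1.4],
    "olives": [3.0, 4.2, 3.6],
}


def value_range(values):
    """Return the (minimum, maximum) of a list of point values."""
    return min(values), max(values)


# Fine bins: one range per food item.
fine_ranges = {item: value_range(vals) for item, vals in results_by_item.items()}

# Coarse bin: all items pooled into a single "vegetables" category.
all_values = [x for vals in results_by_item.values() for x in vals]
coarse_range = value_range(all_values)

for item, (low, high) in fine_ranges.items():
    print(f"{item}: {low} to {high}")
print(f"vegetables (pooled): {coarse_range[0]} to {coarse_range[1]}")
```

The pooled range spans every item-level range, so its width reflects the choice of bin at least as much as any uncertainty in the underlying footprints.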
Most studies are fairly complete in identifying their target groups and presenting their results in perspective to earlier research (PRISMA criteria 3, 24, and 26). However, the discussions on limitations diverge (PRISMA criteria 22 and 25), from those arguing that the bias related to individual LCA results is largely negligible (Clune et al., 2016; De Laurentiis et al., 2018), to those cautioning about the strong influence of methodological choices on conclusions (Heller and Keoleian, 2015; Tom et al., 2015), while others seem mainly concerned about food production systems not represented in the LCA literature (Clark and Tilman, 2017; Mohareb et al., 2018).
Table 1 The seven reviews of environmental consequences of food commodities under review and evaluated against the PRISMA protocol (PRISMA, 2019). A check means that a criterion is fulfilled (green), a check in parentheses that a criterion is partially fulfilled (yellow), and a cross that a criterion is unfulfilled (red).
Note: Pairotti et al. (2015) fulfilled all criteria but refers to an intermediate project report for the full analyses that could not be accessed online, nor provided by the corresponding author; it was consequently excluded from further analysis.

Discussion
None of the environmental footprint studies under review that claim to carry out meta-analyses actually do. In fact, given that many LCA studies of foods base their models on individual farms, and most present their results only as point values, there are no variances by which to weight individual studies. Moreover, where variances are available for LCA results, they are often derived using a data quality pedigree, based on generic uncertainty estimates, and propagated using Monte Carlo simulations (Ciroth et al., 2013; Henriksson et al., 2013). This results in large artificial sample sizes that sometimes are derived from individual data points (Heijungs, 2020; Heijungs et al., 2016). Actual variances are therefore often underestimated and sample sizes arbitrary (Henriksson, 2015; Kuczenski, 2019), skewing the estimate of effect that is critical for correct weighting of results in meta-analyses (Shadish and Haddock, 2009). We therefore argue that meta-analyses of LCA results are incompatible with the current standard of carrying out and reporting LCAs.
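The arbitrariness of Monte Carlo sample sizes can be sketched in a few lines of Python. The example below assumes a single hypothetical data point with a generic lognormal uncertainty (the median of 10 and the spread of 0.3 are placeholders, not values from any study or pedigree matrix) and shows that the standard error of the simulated mean depends mainly on how many runs the analyst chooses.

```python
import math
import random
import statistics


def simulate_footprint(n_runs, seed=0):
    """Draw n_runs Monte Carlo values around one hypothetical data point.

    Assumption for illustration: a lognormal distribution with a median of
    10 (arbitrary units) and a generic spread of sigma = 0.3.
    """
    rng = random.Random(seed)
    return [rng.lognormvariate(math.log(10.0), 0.3) for _ in range(n_runs)]


def standard_error(samples):
    """Standard error of the mean of the simulated values."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Same single data point, two arbitrary choices for the number of runs.
se_small = standard_error(simulate_footprint(100))
se_large = standard_error(simulate_footprint(10_000))

# Inverse-variance weights that a meta-analysis would assign to each result.
weight_ratio = (se_small / se_large) ** 2

print(f"standard error with 100 runs:    {se_small:.3f}")
print(f"standard error with 10,000 runs: {se_large:.3f}")
print(f"ratio of inverse-variance weights: {weight_ratio:.0f}")
```

Because both simulations rest on the same single data point, the much larger weight given to the longer run carries no additional empirical information, which is the concern raised above.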
Only one of the seven studies under review, Clune et al. (2017), acknowledges a systematic review protocol, namely PRISMA. However, Clune et al. (2017) do not fulfill all the criteria listed in the PRISMA checklist. Consequently, we conclude that no eligible systematic review has been conducted to date on the environmental footprints of food commodities, only rapid reviews (Khangura et al., 2012; Tricco et al., 2015). Nor do we believe that LCA results can appropriately meet all PRISMA criteria, given their characteristics and limited anchoring in randomized samples. We acknowledge, however, that systematic reviews are highly resource intensive and time-consuming, and therefore argue for better coordination rather than replication of efforts. Publishing systematic review protocols at the onset of the research, as done by Jarmul et al. (2019), would for this purpose help identify parallel efforts. We also acknowledge that the PRISMA guidelines and other relevant protocols have largely evolved around the medical sciences and completely different types of data, which limits their applicability to LCA reviews. Nonetheless, reviews of LCA results would benefit from adopting many of the criteria specified in these guidelines.
For example, one of the most troublesome shortcomings of the reviewed articles in relation to the PRISMA checklist is the failure to 'Provide an explicit statement of questions being addressed' (PRISMA criterion 4). This matters because the review question should determine the eligibility criteria, the search for relevant studies, the collection of data from the included studies, and the presentation and discussion of the findings (Squires et al., 2013). The review question "should specify the types of population (participants), types of interventions (and comparisons), and the types of outcomes that are of interest" (Squires et al., 2013). The studies reviewed, by contrast, all maintained an exploratory approach, which hampers a structured systematic review and echoes similar concerns in the medical sciences (Barnard et al., 2017).
Other major shortcomings among the reviewed studies are that they insufficiently account for methodological bias among studies when drawing conclusions (PRISMA criterion 22), and that they do not present risk of bias or study characteristics for individual studies (PRISMA criteria 18 and 19). Consequently, the current practice of evaluating environmental impacts of food commodities and diets could be biased and/or challenged on a methodological basis. For example, it could be argued that the LCA results were selectively chosen to support a predefined hypothesis. This would be counterproductive at a time when shifts in diets are urgently needed to improve both human and planetary health (e.g. Gordon et al., 2017; Willett et al., 2019; Gerten et al., 2020), while misinformation remains a major hurdle for changing consumer preferences (Garcia et al., 2019). Estimates of environmental impacts consequently need to face at least the same level of scrutiny as clinical medical data in order to avoid suboptimal reporting, biased results and policy, and ultimately public confusion.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.