Have food supply chain policies improved forest conservation and rural livelihoods? A systematic review

To address concerns about the negative impacts of food supply chains in forest regions, a growing number of companies have adopted policies to influence their suppliers’ behaviors. With a focus on forest-risk food supply chains, we provide a systematic review of the conservation and livelihood outcomes of the mechanisms that companies use to implement their forest-focused supply chain policies (FSPs)—certifications, codes of conduct, and market exclusion mechanisms. More than half of the 37 cases that rigorously measure the outcomes of FSP implementation mechanisms find additional conservation and livelihood benefits resulting from the policies. Positive livelihood outcomes are more common than conservation additionality and most often pertain to improvements in farm income through increases in crop yields on coffee and cocoa farms that have adopted certifications or codes of conduct. However, in some cases certifications lead to a reduction in net household income as farmers increasingly specialize in the certified commodity and spend more on food purchases. Among the five cases that examine conservation and livelihoods simultaneously, there is no evidence of tradeoffs or synergies—most often an improvement in one type of outcome is associated with no change in the other. Interactions with public conservation and agricultural policies influence the conservation gains achieved by all mechanisms, while the marketing attributes of cooperatives and buying companies play a large role in determining the livelihood outcomes associated with certification. Compliance with the forest requirements of FSP implementation mechanisms is high, but challenges to geospatial monitoring and land use related selection biases limit the overall benefits of these policies. 
Given the highly variable methods and limited evidence base, additional rigorous research across a greater variety of contexts is urgently needed to better understand if and when FSPs can be successful in achieving synergies between conservation and livelihoods.


Introduction
The production of food commodities, including cattle products, soybeans, palm oil, coffee, and cocoa, is the largest driver of forest loss and degradation in the tropics (Curtis et al 2018). While these forest-risk commodities are central to the livelihoods of millions of rural households, small farmers' participation in, and share of the value generated by, such supply chains have continued to decline over the past three decades (Lee et al 2012, Cohn et al 2016). Since the early 2000s, food companies have responded to increased information and pressure about the environmental and social outcomes of their sourcing activities by adopting a range of supply chain sustainability initiatives, leading to greater private sector involvement in food system governance (Sikor et al 2013, Garrett and Rueda 2019).
Because deforestation has been central to the naming and shaming campaigns of civil society, many of these initiatives have focused on supply chain impacts within forest regions. Specifically, many companies have adopted 'forest-focused supply chain policies (FSPs)', formal and/or public declarations about how forest impacts are considered within their supply chain. Such policies also often include statements about social safeguards and livelihood targets. FSPs can be implemented through a variety of mechanisms, such as the use of market exclusion mechanisms (excluding suppliers based on undesirable production practices, e.g. deforestation) and/or codes of conduct and certification requirements (setting criteria that farmers should meet for inclusion and/or preferred market access). See table 1 for a list of the FSPs included and tables S1, S2 (available online at stacks.iop.org/ERL/16/033002/mmedia) for their contextualization within the broader set of sustainable supply chain initiatives.
Over 484 major food retailers, traders, and processors now have some form of FSP (Rothrock et al 2019). Yet, the degree to which these policies improve or hinder global conservation and rural livelihoods remains unclear. Recent systematic reviews of food certification programs have centered on livelihood impacts (DeFries et al 2017, Oya et al 2018) or conservation impacts (Tscharntke et al 2015), with little crossover. To date, syntheses of food sector zero-deforestation commitments have focused on conservation impacts only, and have been limited to soy, cattle and oil palm supply chains (Lambin and Thorlakson 2018, Pacheco et al 2018). Nor has any review paper examined the conservation and livelihood impacts of the full range of initiatives that food companies can use to govern their supply chains across all forest-risk food commodities. Given the current shortfalls of public policy in protecting forests and helping small farmers, and the rapid growth of FSPs, there is a pressing need for a synthesis of the literature to identify which (if any) private sector policy tools are effective at promoting conservation and whether they also deliver benefits to rural livelihoods, or rather, produce tradeoffs. There is also an urgent need to identify current research gaps and guide future research toward regions, policies, and methods that can help inform our understanding of what types of FSP implementation mechanisms are likely to be both effective and equitable.
Here, we provide a systematic review of the literature from 1 January 2000 to 1 November 2020 on food company FSP implementation mechanisms using Web of Science and Scopus (see section 2; tables S3 and S4) to answer the following questions: a) What are the effects of existing food company FSP implementation mechanisms on conservation and livelihood outcomes? b) Are synergies or tradeoffs between conservation and livelihoods observed? and c) In what contexts have these mechanisms been most effective at improving conservation and/or livelihoods?
Besides offering an expanded policy and commodity scope, the present review innovates on the existing literature by disaggregating the evidence on supply chain policy impacts according to different outcomes. To evaluate the effects of FSP mechanisms, first we ask: Do farmers comply with the conservation requirements of FSPs? Next we ask: Do FSPs provide additional conservation and/or livelihood benefits (i.e. are there measurable improvements in conservation and/or livelihood outcomes beyond business as usual)? Finally, we ask: Do spillovers offset or enhance the net conservation and/or livelihood benefits of FSPs (i.e. does the FSP result in either positive or negative conservation and/or livelihood impacts on non-targeted actors, regions, or supply chains)? Disaggregation of compliance, additionality, and spillovers is crucial to avoid misunderstandings about the overall effectiveness of FSP implementation mechanisms, since both compliance and additionality are necessary for generating improvements in forest conservation within a region. Yet, compliance does not always result in additionality and some additionality can occur with moderate compliance, while spillovers may enhance or offset the overall impacts of such initiatives on forest conservation.
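The distinction between compliance, additionality, and spillovers can be made concrete with a small sketch. The function names and all numbers below are hypothetical and chosen only for illustration; they are not drawn from any reviewed study, and the simple difference against a business-as-usual value stands in for whatever counterfactual method a given study actually uses.

```python
def compliance_rate(n_compliant: int, n_assessed: int) -> float:
    """Share of assessed farms that meet the conservation requirement."""
    return n_compliant / n_assessed


def additionality(counterfactual_loss: float, observed_loss: float) -> float:
    """Forest loss avoided on targeted farms relative to business as usual."""
    return counterfactual_loss - observed_loss


def spillover(counterfactual_loss: float, observed_loss: float) -> float:
    """Extra forest loss on non-targeted farms (positive means leakage)."""
    return observed_loss - counterfactual_loss


def net_benefit(avoided: float, leaked: float) -> float:
    """Avoided loss on targeted farms minus leakage to non-targeted farms."""
    return avoided - leaked


# Hypothetical scenario A: every farm complies, but none would have cleared
# forest anyway, so compliance is complete and additionality is zero.
rate_a = compliance_rate(n_compliant=50, n_assessed=50)
added_a = additionality(counterfactual_loss=0.0, observed_loss=0.0)

# Hypothetical scenario B: targeted farms avoid some loss (hectares), but
# increased clearing on non-targeted farms offsets it entirely.
added_b = additionality(counterfactual_loss=100.0, observed_loss=60.0)
leaked_b = spillover(counterfactual_loss=50.0, observed_loss=90.0)
net_b = net_benefit(added_b, leaked_b)

print(rate_a, added_a)
print(added_b, leaked_b, net_b)
```

Scenario A shows why a high compliance rate alone says little about effectiveness, and scenario B shows how a measured improvement among targeted actors can coexist with no net regional gain.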
Our study also contributes to the literature by integrating results on the livelihood additionality and spillovers resulting from FSPs, particularly the presence or absence of synergies or tradeoffs between conservation and livelihood outcomes. Livelihood goals are directly included in many FSPs (see table 1) and the study of these outcomes is critical to understanding the equity outcomes of private sector policies. While these equity concerns are sufficiently important in their own right to merit consideration (Pascual et al 2010, McDermott 2013), they are also important for the long-term conservation effectiveness of FSPs. Farmers are more likely to adopt pro-conservation behaviors if they receive some type of livelihood benefit (Brown 2002). If powerful actors (e.g. traders, slaughterhouses) leave their suppliers with little choice but to comply with conservation directives or exit a 'sustainable' supply chain, the non-compliant farmers may pursue even more unsustainable land use activities as a result of exclusion (Klein et al 2015, Friedman et al 2018). However, to date, there has been little empirical examination of these purported relationships beyond findings that larger, wealthier farmers are more able to participate in conservation initiatives (Bremer et al 2014, Klein et al 2015, Winters et al 2015, Friedman et al 2018).
While the primary focus of the review is to synthesize results on the impacts of FSP implementation mechanisms, our study concludes by summarizing current knowledge about the conditions that moderate these impacts. This fills a critical knowledge gap about the mechanisms and contextual factors underlying the success of voluntary environmental policies and aims to help make sense of the heterogeneity in conservation and livelihood results across regions.

Table 1 (excerpt).
G4 Cattle Agreement. Multi-stakeholder market exclusion mechanism. Zero-deforestation (R) (for direct and indirect suppliers): do not deforest primary or non-primary forests on properties producing cattle after October 2009. Must not have been implicated as using slave labor or invading indigenous territories or be accused of land grabbing.
Table note: The G4 agreement is now called the G6 Agreement, since it includes six major slaughterhouses, but existing papers pertain to the G4. One study also examined the MPF-TAC cattle agreement (which is more of a public-private agreement between slaughterhouses and federal prosecutors). However, this study did not present separate results for MPF-TAC because it was highly correlated with G4 exposure.

Methods
Systematic reviews provide a transparent and replicable process for collecting literature based on predefined search and paper inclusion criteria (Tranfield et al 2003, Higgins et al 2019). We followed the best practices laid out by Haddaway et al (2015) to avoid bias, increase transparency, and enhance consistency and objectivity. First, we established a protocol for identifying relevant and rigorous literature and sent this methodology to our peers for comments. Our search included all known food company FSP implementation mechanisms (Rueda et al 2017) (see table S1 regarding the use of the FSP nomenclature and S2 for a list of mechanisms that can be used to implement FSPs). The first step in the protocol was to search Web of Science (WoS) and Scopus within the title, abstract, and keywords for all potentially relevant papers using a predefined set of search strings (see tables S3-S6 for search terms). The overall numbers of returns were 1488 papers in the WoS search (1114 for conservation and 374 for livelihood outcomes) and 1341 papers in the Scopus search (1022 for conservation and 319 for livelihood outcomes). We did not use Google Scholar to identify gray literature for two reasons. First, we were not able to obtain an unbiased sample (results were largely irrelevant and known studies were not showing up in the searches even after screening the first 200 returns for each mechanism). Second, given the sensitivity of causal assessments and impact evaluations to econometric specifications and study design, we found it challenging to compare the scientific value of the gray literature returned by Google Scholar, which did not undergo an external review process, to the peer-reviewed studies.
After collecting all returns from WoS and Scopus we screened abstracts using pre-defined inclusion criteria based on relevance: (a) the paper pertained to an FSP implementation mechanism and (b) the paper included quantitative or qualitative causal analysis of the impact of a specific initiative on conservation and/or livelihood outcomes associated with the production of a specific commodity. Fair Trade and Organic certifications were excluded as stand-alone mechanisms because they did not contain any specific criteria about deforestation or reforestation behaviors for the relevant time period (deforestation requirements for Fair Trade were established in 2019 and no reforestation requirements exist). Conservation outcomes included within the scope were: deforestation (measured as a loss in tree cover), reforestation (measured as an increase in tree cover or changes in the density and diversity of tree species), and fires (measured as the reduced occurrence of fires). For livelihood impacts we included studies that focused on financial outcomes due to the greater consistency and comparability across studies, including target commodity income (often proxied by gross, rather than net revenues), as well as overall farm and household income (sometimes proxied by household expenditures for farm households or by company share value for plantations). Table S8 lists studies that pertain to FSPs, but did not measure the conservation or livelihood impacts covered by the study and states the reason that each paper was excluded.
Next, we divided this set of relevant papers into individual commodity-region-mechanism cases (e.g. a study that examined outcomes of RSPO certification in Indonesia and Malaysia separately would be classified as two separate cases; a study that examined UTZ certification and the CAFE code of conduct separately, even within the same region, would be classified as two separate cases). We then assigned scores for conservation compliance, additionality, and spillovers, as well as livelihood additionality and spillovers, using predefined criteria (see tables S9 and S10 for list of coding criteria and table S11 for the detailed results).
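The case-splitting rule can be sketched as follows. This is a minimal illustration rather than the coding procedure actually used: the `Case` type, the `split_into_cases` helper, and the paper and region labels are hypothetical, and the sketch assumes a paper reports every region-mechanism combination separately.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Case:
    """One commodity-region-mechanism case drawn from a paper."""
    paper: str
    commodity: str
    region: str
    mechanism: str


def split_into_cases(paper, commodity, regions, mechanisms):
    """Create one case per (region, mechanism) pair reported separately."""
    return [Case(paper, commodity, region, mechanism)
            for region, mechanism in product(regions, mechanisms)]


# The two examples from the text, with placeholder paper and region labels.
rspo_cases = split_into_cases(
    "paper_A", "oil palm", ["Indonesia", "Malaysia"], ["RSPO"])
coffee_cases = split_into_cases(
    "paper_B", "coffee", ["region_X"], ["UTZ", "CAFE"])

for case in rspo_cases + coffee_cases:
    print(case)
```

Under this rule, a single paper can contribute several cases, which is why the number of cases in a review of this kind can exceed the number of papers.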
The final step of the screening process was to assess the rigor of the methods for evaluating the policy outcome of a given FSP mechanism, allowing this review to compare and synthesize rigorously determined results only. This process was undertaken for each policy outcome, since methods could be rigorous on one type of outcome (e.g. compliance), but not others (e.g. additionality). Briefly, a conservation compliance paper was included if there was an independent (rather than self-reported) measure of the conservation outcome in question. An additionality or spillover paper was included if a counterfactual scenario about 'business as usual' (the absence of the FSP mechanism treatment) was established to analyze the impacts of the mechanism. These protocols are described in table S9. Cases that met the rigor criteria for a particular outcome were scored as a 1 and included in the study results for that outcome; all others were coded as 0 and excluded from analysis but retained in table S11 for reference. Through an iterative collective refinement of the wording of the coding rubric and by providing detailed justifications of our scores, we were able to come to agreement on all evaluations. Scores, justifications, and contextual notes from the initial coder were reviewed by two people. In cases where reviewers were not in agreement about scoring results and contextual notes, authors engaged directly to discuss and come to a shared decision.
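As a rough sketch of the per-outcome rigor screen described above, the rule can be written as a small function. The field names (`independent_measure`, `has_counterfactual`), the outcome labels, and the example cases are assumptions made for illustration; the full criteria are those in table S9, which this sketch does not reproduce.

```python
OUTCOMES = (
    "conservation_compliance",
    "conservation_additionality",
    "conservation_spillover",
    "livelihood_additionality",
    "livelihood_spillover",
)


def rigor_score(outcome, independent_measure, has_counterfactual):
    """Return 1 if a case meets the rigor criterion for the outcome, else 0."""
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome}")
    if outcome == "conservation_compliance":
        # Compliance requires an independent (not self-reported) measure.
        return int(independent_measure)
    # Additionality and spillovers require a business-as-usual counterfactual.
    return int(has_counterfactual)


def cases_for_analysis(cases, outcome):
    """Keep the ids of cases scored 1 for this outcome; others stay on file."""
    return [
        case["id"] for case in cases
        if rigor_score(outcome, case["independent_measure"],
                       case["has_counterfactual"]) == 1
    ]


# Hypothetical cases: each is rigorous on one type of outcome but not the other.
example_cases = [
    {"id": "case_1", "independent_measure": True, "has_counterfactual": False},
    {"id": "case_2", "independent_measure": False, "has_counterfactual": True},
]

print(cases_for_analysis(example_cases, "conservation_compliance"))
print(cases_for_analysis(example_cases, "livelihood_additionality"))
```

Scoring each outcome separately is what allows the same case to enter the compliance results while being excluded from the additionality results, or the reverse.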
Finally, after reviewing the literature, we created categories about frequently mentioned contextual factors explaining FSP implementation mechanism success or failure, as written by the authors Pfaff 2019), we went back to each study to assess whether or not they mentioned the particular factor and noted how the authors believed the factor influenced outcomes.

Scope of the existing evidence base
We found 43 papers that met our relevance criteria, comprising 53 cases (policy-commodity-region bundles) (see tables S3-S6 for summary statistics on the search). Of the relevant papers, there were 37 cases with sufficiently rigorous measurement of at least one conservation-related (deforestation, reforestation, or fire-related compliance, additionality, or spillovers) or livelihood-related (commodity income or household income additionality or spillovers) policy outcome (19 for conservation only, 13 for livelihoods only, and five that assessed both) (figure 1).
In terms of the conservation outcomes, certification studies mostly focused on tree density and tree diversity within farm plots, while market exclusion studies mainly examined deforestation (Newton and Benzeev 2018). There are two coffee papers that examined both deforestation and tree density and diversity outcomes (Hardt et al 2015, Rueda et al 2015). There have been few peer-reviewed studies of the conservation outcomes of FSP implementation mechanisms on cocoa farms or for any type of outcome in East or West Africa or South Asia (figures 1 and 3), despite large food commodity-driven deforestation threats and conservation opportunities in those regions. No market exclusion studies examined livelihood outcomes.

FSP outcomes
This section summarizes the results of sufficiently rigorous studies (per the criteria in tables S9 and S10) to answer our first two overarching research questions: a) What are the effects of existing food company FSP implementation mechanisms on conservation and livelihood outcomes? and b) Are synergies between conservation and livelihoods observed? The first question is disaggregated by our sub-questions on compliance, additionality, and spillovers, while the second focuses only on tradeoffs and synergies pertaining to additionality.

Do farmers comply with the conservation requirements of FSPs?
Conservation requirements differ substantially between food company FSP implementation mechanisms (table 1). Our results, based on the 15 cases that rigorously assessed compliance with conservation requirements, indicate mixed levels of compliance across mechanisms and commodities (figures 2 and 3). Studies generally find compliance with the reforestation (tree planting and diversity) and deforestation criteria of coffee certifications. Ongoing clearing was limited to individual trees and reflected natural cycles of replanting old coffee trees (Rueda and Lambin 2013, Hardt et al 2015, Caudill and Rice 2016). However, market exclusion studies in the Brazilian Amazon came to diverging conclusions about compliance with the mechanism, depending on the time period, actors, and regions examined. Soy Moratorium studies in Mato Grosso (Azevedo et al 2015) and Rondônia (Costa et al 2017) that examined whether soy was planted in areas deforested any time after the policy's deforestation cutoff date in 2008 concluded that there were notable instances of non-compliance. Studies that examined land use in a shorter time period after the cut-off date concluded that there was high compliance overall (Rudorff et al 2011, Gibbs et al 2015). One study of the G4 Cattle Agreement that included all cattle producing farms in Southwestern Pará (Klingler et al 2018) (including indirect suppliers) found high non-compliance, whereas the other two studies that only focused on the land clearing behaviors of direct suppliers to committed slaughterhouses found high compliance (Gibbs et

Figure caption: Map of FSP implementation mechanism cases by outcome and commodity. These results include only the cases where the methodology for the associated outcome was deemed to be sufficiently rigorous (n = 15 for conservation compliance, n = 12 for conservation additionality, n = 18 for livelihood additionality).

Do FSPs provide additional conservation and livelihood benefits?
There were 12 cases for conservation additionality and 18 cases for livelihood additionality that met our criteria for methodological rigor (figures 3 and 4). Improvements in conservation outcomes were found in 43% of the cases, including within RA certified coffee farms in some regions and in the Brazilian Amazon as a result of the Soy Moratorium and G4 Cattle Agreement (one study each). Improvements in income from the target commodity were observed in 65% of the cases and improvements in farm or household income were observed in 53% of the cases, including various certifications and codes of conduct for coffee, RA-UTZ certification for cocoa, and RSPO certification for oil palm.

Coffee certifications and codes of conduct
Coffee certifications have been found to have both conservation and livelihood benefits, but never in the same case. In 38% of the coffee studies, RA certification was associated with both reduced deforestation (Takahashi and Todo 2013, Rueda et al 2015) and increased tree cover (Rueda et al 2015, Haggar et al 2017, Takahashi and Todo 2017). Regions with positive outcomes included Colombia (Rueda et al 2015), Ethiopia (Takahashi and Todo 2013, which comprise a single case), and Nicaragua (Haggar et al 2017). However, most often, no conservation additionality was found, including studies of RA, UTZ, CAFE, AAA and 4C (Haggar et al 2017, Vanderhaegen et al 2018, Pico-Mendoza et al 2020). Improved income outcomes were identified in half of the coffee certification cases. Studies that simultaneously examined commodity and household income outcomes found improvements in only commodity income in 60% of the cases and improvements in both types of income in 40% of cases. Positive livelihood outcomes were found for every coffee standard: Starbucks CAFE (Ruben and  (Chiputwa et al 2015), and RA in Nicaragua (Haggar et al 2017), found no additional improvement in commodity or household income, with one case leading to a reduction in commodity income (UTZ for coffee production in Nicaragua (Haggar et al 2017)) and another to a reduction in household income (UTZ-FT in Kenya (Van Rijsbergen et al 2016)). In the latter case, UTZ-FT was associated with an increase in commodity income due to improved yields, but a decrease in household income because certified farmers became more specialized in coffee production alongside a drop in the coffee price relative to other crops (figure 4). Income benefits failed to materialize in places where there were both low yield benefits from certification and no price premiums associated with the certified good passed down to farmers.
Among the cases that simultaneously and rigorously examined conservation and livelihood additionality (figure 5), none pertained to deforestation and none found evidence of simultaneous improvements (synergies) or simultaneous declines in conservation and livelihood indicators as a result of certification. Most found a combination of a success in one category, but no additionality in the other. Three cases, which included multi-certification impacts in Uganda (Vanderhaegen et al 2018) and CAFE in Nicaragua (Haggar et al 2017), found increases in commodity income from certifications, but no conclusive change in reforestation. A study of RA certification on coffee farms in Nicaragua found an increase in conservation, but no increase in commodity income (Haggar et al 2017).

Figure caption (fragment): There were no win-wins, tradeoffs, or lose-loses identified. Most cases found improvement in one indicator, but no change in another, while one study found a decrease in income, but no change in tree cover. There were no studies that examined deforestation and livelihoods.

Cocoa RA-UTZ certifications
Only two cases rigorously examined the income outcomes of RA-UTZ certification in cocoa production and these were located in Southern Ghana (Brako et al 2020, Iddrisu et al 2020). Both studies found increases in cocoa, farm, and total household income from UTZ-RA certification. The certification effect was particularly large in the study by Brako et al (2020), resulting in 28% higher household income. This was due to the large improvements in yields, which arose from adoption of Good Agricultural Practices and were enabled by greater access to technical assistance and financing as a result of the Licensed Buying Companies managing the certification.

Oil palm RSPO certification
RSPO certification results, which pertained mostly to Southeast Asia, but also one case in Ghana, are mixed for conservation outcomes, but positive for livelihood outcomes. A study by Carlson et al (2018) found that RSPO certification reduced deforestation by 33% relative to non-certified plantations. Yet, the authors found that, on average, RSPO certified plantations contained little residual forest (and more oil palm area) when they received certification and may have deliberately cleared their forest prior to declaring their intent to be certified. Furthermore, certification had no impact on forest loss in peatlands. None of the RSPO studies found reduced fire occurrence on certified plantations. Morgans et al (2018) studied livelihood additionality from RSPO certification in Indonesia, finding that although fire occurrence did not decrease, plantations certified with RSPO had a greater increase in profits (as proxied by certified company share values). Brako et al (2020) found substantially higher oil palm, total farm, and household income from RSPO certification. This is due to higher yields through access to improved varietals, but not as a result of any premium.

Cattle and soy ZDCs
The two studies assessing the impacts of the G4 Cattle Agreement found evidence of reduced deforestation among farms that are direct suppliers to slaughterhouses with zero-deforestation commitments (Gibbs et al 2016, Alix-Garcia and Gibbs 2017). However, these reductions are partially due to selection bias and largely erased by leakage. Farms that sold to slaughterhouses with zero-deforestation commitments already had less forest remaining before the agreements were signed (Gibbs et al 2016) and reductions in deforestation among farmers that registered earlier in an environmental cadaster (to be able to sell to these slaughterhouses) were completely offset by farmers who registered later (Alix-Garcia and Gibbs 2017).
The one sufficiently rigorous study of the Soy Moratorium in the Brazilian Amazon (which uses deforestation for soy in the Cerrado as a counterfactual) found that the moratorium had conservation benefits (Gibbs et al 2015). In the 2 years preceding the agreement, the authors find that nearly 30% of soy expansion occurred through deforestation rather than by replacement of pasture or other previously cleared lands. After the Soy Moratorium, deforestation for soy dramatically decreased, falling to only ∼1% of expansion in the Amazon biome by 2014. In the Cerrado biome, where the Soy Moratorium does not apply, the annual rate of soy expansion did not decrease in a similar fashion. In the region of Mapitoba, nearly 40% of total soy expansion (2007-2013) occurred at the expense of native vegetation.

Do FSP spillovers offset or enhance net conservation and livelihood benefits?
To date, understanding of policy spillovers has been particularly limited in land system science (Meyfroidt et al 2020), with supply chain policies being no exception. Negative policy spillovers arise when the implementation of a policy in the target region increases incentives to undertake the prohibited behaviors in non-targeted regions, leading to a shift in environmental and social impacts across actors and space (Meyfroidt et al 2020). Negative spillovers may also occur if the introduction of a financial reward or penalty for certain behaviors undermines (crowds-out) intrinsic motivations for conservation or fair wages (Grillos et al 2019). Positive spillovers arise when both market exclusion mechanisms and certifications lead to increased trust and learning among local communities beyond the target population, resulting in enhanced conservation and livelihood benefits (Garbach et al 2012, Simonet et al 2019). Only five cases rigorously examined forest spillovers from FSP implementation mechanisms, while two examined income spillovers. Among the forest related spillovers, results differed substantially. Gollnow et al (2018) found no additional deforestation spillovers within and across nearby soy and cattle properties as a result of the Soy Moratorium. However, Alix-Garcia and Gibbs (2017) identified that cattle producers who registered in the environmental land registry later increased their deforestation relative to producers that registered earlier (a spillover from early registrants to late registrants). This is plausible because late-registering and non-supplying farms had alternative markets to sell to or could launder their product through compliant farms. For RSPO certification, Heilmayr et al (2020) found both positive and negative spillovers, depending on the land designation.
They found that RSPO certified companies decreased deforestation in their non-certified properties when the land was designated as Forest Estates (subject to greater environmental protections). However, deforestation increased in nearby lands designated as APL ('other use lands') that were exposed to a larger proportion of RSPO certified companies. These are lands where oil palm can be legally grown under federal regulations, in contrast to Forest Estates. They hypothesize that the positive spillover in non-certified Forest Estate lands and negative spillover to APL occurred because the RSPO certification reduced fresh fruit bunch prices from Forest Estate lands (due to greater compliance with the federal regulations prohibiting deforestation in those areas as a result of their commitment to legal and responsible practices as part of certification).
With respect to reforestation outcomes, one study of UTZ-RA-4C certification on coffee farms in Uganda by Vanderhaegen et al (2018) identified negative spillovers on other conservation outcomes, including biodiversity and carbon stocks. One study on RA certification, which assessed coffee farms in Ethiopia (Takahashi and Todo 2017), found positive conservation spillovers in the form of reduced degradation in nearby forests. No studies rigorously assessed conservation spillovers for Bird Friendly certification or any code of conduct.
Only two studies examined livelihood spillovers (both focused on coffee certifications). A study of RA certification on coffee farms in Colombia identified positive commodity income spillovers across certified and non-certified farms due to the upgrading of regional coffee supply chains (Rueda and Lambin 2013). Yet, a study of UTZ-FT in Kenya found no significant impact of UTZ or multi-certification on livelihood conditions at the cooperative level (vis-à-vis non-certified cooperatives) and did not identify any changes in the structure of the coffee value chain (Van Rijsbergen et al 2016). No studies assessed livelihood spillovers for RSPO certification, market exclusion mechanisms, or codes of conduct.

Factors supporting improved conservation and livelihood outcomes
Here we summarize the conclusions that the existing evidence base offers with respect to our final question: In what contexts have FSP mechanisms been most effective at improving conservation and/or livelihoods? We found five groups of contextual factors mentioned at least twice in the studies: a) existing land use attributes (i.e. regional rates of commodity-driven deforestation, type of agricultural practice-monoculture versus agroforestry, and associated commodity yields), b) characteristics of the producers and cooperatives in the regions where the policy was implemented (i.e. how much wealth do the farmers have, how well organized are the cooperatives), c) monitoring capacities in relation to the supply chain complexity, d) public sector involvement (i.e. the stringency of public policies and their enforcement, including jurisdictional and hybrid governance efforts), and e) NGO involvement (i.e. participation in the diffusion of the certification and/or capacity building among farmers) (figure 6).
Existing land use practices and baseline farm and cooperative characteristics were the most frequently mentioned factors influencing the conservation and livelihood additionality of FSP mechanisms. These contextual attributes are linked to the issues of selection bias and mechanism equity. Selection bias and resulting equity concerns emerge when the farmers who are most likely to adopt or continue complying with a policy are those who are already compliant or sufficiently wealthy to bear the costs of adoption and compliance (McDermott 2013). Across nearly all certification cases, larger, wealthier, and/or more educated farmers were most likely to adopt the certification.
For certifications focusing on enhancing tree diversity and tree cover on coffee or cocoa farms (e.g. RA, UTZ, and Bird Friendly), an important baseline land use characteristic is whether or not the farmers were already using some type of agroforestry system (or, conversely, full-sun monocultures). Differences in agroforestry practices prior to the adoption of a certification or standard can have a larger impact on tree diversity and cover than the FSP mechanisms (Pico-Mendoza et al 2020). And since some standards promote greater intensification of input use vis-à-vis existing low-input shade-garden practices, exposure to FSP mechanisms rarely incentivizes more diversified practices. However, through the resulting move away from low-input practices, FSP mechanisms can increase yields, leading to higher income from the target commodity (Vanderhaegen et al 2018, Brako et al 2020). The ability of farmers to maintain low-input systems, such as agroforestry in some cases, thus strongly depends on their ability to capture higher prices through quality premia, a rare occurrence (Haggar et al 2017).
For RSPO certification, which focuses on zero clearing of primary areas or HCV (rather than reforestation or tree diversity), the conservation impact is strongly affected by the amount of primary forest remaining in the properties that adopted the certification. In analyzing the impacts of RSPO certification on oil palm plantations, Carlson et al (2018) found that primary forest clearing rates and fire usage are lower in RSPO certified properties versus comparable ones. But the additionality of certification is reduced because the plantations that adopted RSPO had little primary forest left before certification (Carlson et al 2018). Livelihood impacts from RSPO in oil palm farms are, like cocoa farms, highly influenced by the age of trees and, like all crops, the varieties used prior to adoption of the certification. Where yield gaps are high due to older trees and lower yielding varieties, participation in the certification can lead to major improvements in yields and income (Mitiku et al 2017, Brako et al 2020). With respect to baseline farm and cooperative characteristics, many studies found that farmers that adopted UTZ, RA, or CAFE Practices had significantly higher incomes and education before certification than their matched counterparts (Ruben and Zuniga 2011, Haggar et al 2017, Brako et al 2020, Iddrisu et al 2020). These biases occur partially due to the high financial and informational costs of compliance and adoption, e.g. mapping and registering properties or adopting new restoration and management practices. Low land or labor endowments raise the cost of shifting land or labor away from consumption or income diversification. Given the high up-front costs of adopting certifications and limited education and financial resources of many farmers, participation is often organized through cooperatives or Licensed Buying Companies (for cocoa).
Consequently, the underlying capacities of such cooperatives and companies play an important role in determining livelihood additionality. Where these entities have limited market access and lack capacities to provide any value-added processing to the sourced product, returns to farmers from certification tend to be lower (Chiputwa et al 2015, Mitiku et al 2017, Dietz et al 2019). Conversely, when such entities bring enhanced access to credit and technical training, benefits can be multiplied (Brako et al 2020). State and NGO involvement become particularly important for strengthening local capacities to adopt and participate in supply chains with FSPs (Mitiku et al 2017).
Studies examining conservation additionality of the RSPO Certification in Indonesia and Soy Moratorium and G4 Cattle Agreement in the Brazilian Amazon pointed to important interactions with the public sector and limitations in current supply chain monitoring capacities. In Brazil the public sector's involvement in land registration and deforestation monitoring has been critical to existing successes, but policy compliance remains difficult to monitor on properties that have not registered their land on the national property cadaster and/or do not sell directly to the companies that adopted the FSP mechanism (Gibbs et al 2016, Alix-Garcia and. Monitoring of small-scale deforestation and of small-scale properties, in general, is a challenge for RSPO, the Soy Moratorium, and the G4 Cattle Agreement. For example, the Soy Moratorium excludes monitoring of deforestation events smaller than 25 ha in size (over the course of the monitored period) (Rudorff et al 2011). In the context of the Brazilian market exclusion mechanisms and RSPO certification in Indonesia, supply chain efforts can help enhance compliance with federal deforestation policies, filling gaps in capacities and political will (Heilmayr et al 2020).

Many knowledge gaps remain about FSP implementation mechanisms
Only 37 cases rigorously assessed the causal impacts of FSP mechanisms on either conservation or livelihoods and only five cases simultaneously examined conservation and livelihood outcomes in regions where FSPs are being implemented. The literature on spillovers associated with FSP implementation mechanisms is particularly limited and major gaps in our understanding of FSPs remain in East and West Africa and South Asia. Among studies with sufficient methodological rigor, there have been more studies assessing the outcomes of coffee certifications and codes of conduct, especially within Latin America, than other commodities or regions. The emphasis on coffee FSPs in the existing evidence base is likely due to how globally widespread coffee production and associated FSP mechanisms are and how early FSPs were adopted in coffee systems compared to other commodities.
The literature on FSPs includes many ex-ante theoretical and quantitative analyses (table S8). Additionally, many studies focused on actors' perceptions of processes. While these are very useful, especially given how new many of these mechanisms are, post-hoc analyses based on rigorous measurements of on-the-ground impacts are still needed to verify or negate theory, models, and perceptions, and to better understand the effect size of FSP mechanisms. Even among the studies that were included for additionality because they rigorously established a counterfactual, there was significant variation in how counterfactuals were established. Evaluations of the Soy Moratorium stand out in this respect. We found that only one of the nine published Soy Moratorium studies rigorously assessed the mechanism's additionality. The other studies included no counterfactual for how much deforestation for soy in the Amazon would have occurred in the absence of the Soy Moratorium (see tables S8 and S10).
Careful construction of counterfactuals is crucial for all mechanisms, not only because of selection bias, but because many other contextual factors may be changing simultaneously with the policy treatment. For example, during the period when the Soy Moratorium and G4 Cattle Agreements were introduced, public policies to reduce deforestation in the Brazilian Amazon were substantially strengthened and prices became less favorable for soy production (Assunção et al 2015). Simultaneous policy changes in this region included improvements in the enforcement of deforestation rules on private properties, greater incentives for sustainable intensification, and an expansion in protected areas, and have been found to have large impacts on deforestation and land use practices (Assunção et al 2015, le Polain de Waroux et al 2017, Garrett et al 2018).
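The role of a counterfactual when background conditions shift can be illustrated with a minimal difference-in-differences sketch. The function name `did_estimate` and all numbers below are hypothetical and are not drawn from any study in this review; the estimate is only meaningful if treated and comparison units would have followed parallel trends without the policy.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: change in the treated group
    minus the change in the comparison group over the same period."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical annual deforestation rates (% of forest area per year).
treated_pre, treated_post = 3.0, 1.0   # properties exposed to the policy
control_pre, control_post = 2.5, 1.5   # comparable unexposed properties

# A naive before/after comparison attributes the whole decline to the policy.
naive_change = treated_post - treated_pre

# The comparison group also declined (e.g. because of public enforcement),
# so only the additional decline is attributed to the policy.
did = did_estimate(treated_pre, treated_post, control_pre, control_post)

print(f"naive before/after change: {naive_change}")
print(f"difference-in-differences: {did}")
```

In this made-up case, half of the apparent decline disappears once the concurrent trend in the comparison group is subtracted, which is the kind of overstatement a missing counterfactual can produce.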

FSP implementation mechanisms can bring benefits, but simultaneous improvements in livelihoods and conservation remain elusive
Some certifications and codes of conduct, when coupled with positive incentives such as improved market access, have either incentivized farmers to adopt more tree cover or have enabled them to generate higher target commodity and household income. But no certification or code of conduct studies find simultaneous improvements in conservation and livelihood outcomes, and livelihood benefits appear to be more common than conservation benefits (65% versus 43% of the cases). Reforestation-oriented mechanisms have greater conservation additionality in regions where existing practices involve full-sun monocultures, but adoption of agroforestry practices has been associated with reductions in yields and income for the target commodity in the included studies. These outcomes may be offset by other livelihood benefits, such as improved climate resilience and food security, but this was not measured in the included studies.

Given existing results, there is a risk that FSP implementation mechanisms could exacerbate rural inequalities
Selection bias poses a serious challenge for the effectiveness of FSPs, as well as their equity. Farmers with less forest remaining and larger properties or those already connected to well-functioning cooperatives can more easily adopt certifications and/or the necessary practices to continue supplying to companies with FSPs. For this reason, both the underlying land use characteristics and existing household and cooperative characteristics in regions where FSPs are implemented play a large role in amplifying or mitigating trends toward selection and adoption by larger and wealthier farms. Deforestation-oriented mechanisms have greater conservation additionality in regions and on properties where there is still ample forest remaining. Yet food companies may have limited incentive or capacity to continue sourcing from such suppliers and regions if the costs of monitoring and enforcing compliance with the FSP are high.
Given the selection biases identified by several studies, there is a risk that FSPs could exacerbate rural income inequality and harm smaller farmers, rather than improving livelihoods. This is especially likely for market exclusion mechanisms that reject noncompliant farmers. Farm income is strongly affected by yields (including crop tree density) and in many cases the transition to an agricultural system with greater shade and tree diversity can entail productivity tradeoffs. Price premiums are often insufficient to compensate for these tradeoffs. The unequal adoption and implementation of FSPs across space also runs the risk of driving harmful land use and labor practices towards less regulated regions and sectors (le Polain de Waroux et al 2016).

Limitations and future research needs
Despite growing attention to food company FSPs in the academic literature and the fact that all FSP mechanisms examined here have existed for over a decade, there are very few papers that look at tradeoffs or synergies across conservation and livelihood outcomes. Across all types of mechanisms, spillover impacts remain poorly understood. There are even fewer empirical studies that explain why mechanisms do or do not create additionality. Aside from RA and UTZ certification, most studies of existing FSP implementation mechanisms are clustered around particular commodities and production regions. To date there have been no field experiments to assess the impacts of FSP implementation mechanisms, which is an approach that can help reduce confounding factors and increase the external validity of the impact and mechanism assessment (Handberg and Angelsen 2015).
Case selection and methodological choices have limited the insights that can be gained from existing studies. These studies are highly clustered in the major production regions of each commodity and few replicated comparative assessments have been conducted (rigorous comparative analysis only exists for coffee certifications and codes of conduct). This clustering inhibits our ability to draw cross-mechanism conclusions about supportive or inhibitory contexts.
Individual scientists' choices about which cases to study, their methods of establishing counterfactuals, and the degree to which they control for or explain important contextual factors have a strong influence on the results they generate. Insights from single observational studies often lead to generalized conclusions in policy making arenas about the potential effectiveness of different approaches. Through systematic assessment of these results we have aimed to reduce confusion about the existing evidence concerning private sector forest-focused policies used to govern food supply chains and to provide insights about what we can and cannot generalize about these mechanisms at this point.
To move forward in closing the large knowledge gaps about the causal impacts of FSP implementation mechanisms, we list some urgent needs for the research community, funders, and practitioners working on this topic:

Establish best practices for measuring policy treatments and outcomes
Given how widely current FSP impact evaluation methods vary, workshops or other types of discussions must be initiated to consolidate best practices to measure supply chain policy treatments and their conservation and livelihood impacts. While impact evaluations for certification programs have been around for decades, guidelines for assessing the impacts of policies applied beyond individual farms must still be developed. For example, most research on market exclusion mechanisms uses temporal dummies marking the onset of a policy within a territory as a proxy for the policy treatment. This approach lacks precision about spatial and temporal variations in how the policy is implemented. Suggested improvements include synthesizing data on: a) how much of the market for the target commodity such companies handle within regions; b) the number of interactions these firms have with farms in their supply sheds; and/or c) volumes sold by individual properties to companies with and without FSPs and links to second-tier (indirect) suppliers.
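As a rough illustration of suggestion c), the sketch below replaces a binary policy dummy with a continuous exposure measure: the share of each property's sold volume that goes to companies with an FSP. The record layout, the function name `fsp_exposure`, and the farm and buyer identifiers are all hypothetical; real sales data would also need to account for indirect suppliers.

```python
from collections import defaultdict


def fsp_exposure(sales, fsp_buyers):
    """Return, for each property, the share of its sold volume
    that was bought by companies with an FSP (0.0 to 1.0)."""
    total = defaultdict(float)
    to_fsp = defaultdict(float)
    for prop, buyer, volume in sales:
        total[prop] += volume
        if buyer in fsp_buyers:
            to_fsp[prop] += volume
    # Skip properties with no recorded volume to avoid dividing by zero.
    return {p: to_fsp[p] / total[p] for p in total if total[p] > 0}


# Hypothetical sales records: (property, buyer, volume in tonnes).
sales = [
    ("farm_a", "buyer_1", 75.0),
    ("farm_a", "buyer_2", 25.0),
    ("farm_b", "buyer_2", 40.0),
]

# Assume only buyer_1 has adopted an FSP.
exposure = fsp_exposure(sales, fsp_buyers={"buyer_1"})
print(exposure)
```

A regional onset dummy would code both farms as equally treated, whereas this measure separates a farm selling mostly to an FSP company from one that sells none of its output to such a company.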

Simultaneously measure a variety of conservation and livelihood outcomes, including spillovers
Research should aim to conduct simultaneous assessment of conservation and livelihood outcomes to better assess whether or not conservation benefits can be obtained without harming rural livelihoods, or rather if key tradeoffs emerge. This necessitates collecting data on property boundaries during local interviews and field measurements. By matching remotely sensed data to property boundaries, researchers can reconstruct land cover and land use histories to better understand conservation outcomes of certifications. As deep learning based on remotely sensed data continues to improve (Karlson et al 2016, Schindler 2018), researchers will be able to study tree cover at finer scales and identify species diversity and agroforestry systems. By focusing not only on treated properties, but also surrounding forests and communities, researchers can better identify spillovers that may negate or enhance the benefits and costs incurred by the targeted actors and properties.

Assess the mechanisms of impact and key contextual variables
Future research on FSPs should not just measure impacts, but also aim to identify the mechanisms leading to current successes and failures. Experimental designs could help generate more precise understanding of impact pathways by reducing confounding contextual factors. To tease out the importance of different contextual mediating factors, impact evaluations that employ parallel, comparable methods should be replicated across a wide range of commodities and regions.

Conclusion
The growth of FSPs among food companies is representative of an increasing trend in private sector, flow-based governance of food systems, leading to novel telecouplings between food demand and supply regions (Sikor et al 2013, Garrett and Rueda 2019). These now common governance initiatives are a potential leverage point for creating positive conservation and livelihood impacts in the world's major food production regions. Yet current evidence on the impacts of FSPs and associated implementation mechanisms has not been systematically examined across all forest-risk commodities. Here we aimed to address this knowledge gap through a systematic review of the existing literature on FSP implementation mechanisms, with attention to whether such mechanisms go beyond compliance toward achieving positive conservation and livelihood additionality without negative spillovers. We also summarize existing knowledge of the underlying contextual factors explaining success or failure in generating additionality.
We find evidence that FSP implementation mechanisms have delivered improved conservation and livelihood outcomes in more than half of the cases where additionality was rigorously assessed, and in some cases have generated positive spillovers. Compliance does not appear to be the major challenge limiting the potential of most FSPs. Yet, even though most FSPs have dual conservation and livelihood goals, there is little evidence of win-wins across both types of outcomes, and adoption patterns give rise to equity concerns. Larger farms and farms that have already met the FSP criteria are more likely to continue participating in supply chains with FSPs, whereas smaller farms and farms with greater costs of compliance are more likely to be excluded. This outcome anticipates the creation of a bifurcated market where larger, established producers dominate access to FSP supply chains and smaller producers are downgraded to producing for local markets, perhaps with lower prices. Emerging landscape and jurisdictional approaches that aim to reconcile social and environmental objectives with greater participation from local stakeholders and wall-to-wall coverage of all actors in a region offer an opportunity to tackle this challenge.
We find glaring gaps in the FSP impact assessment literature, including a clustering of existing studies in limited geographies, few rigorous assessments of the additionality outcomes associated with market exclusion mechanisms, and very few studies that have assessed tradeoffs between conservation and livelihoods or spillovers. Going forward, agreement by both researchers and practitioners on best practices for assessing outcomes of FSPs is urgently needed so that attention can be directed to efficiently filling the many knowledge gaps presented in this analysis in ways that are comparable across initiatives and regions.
The research focus on FSPs should not come at the expense of more research on how public governance can be improved to enhance conservation and rural livelihoods. Supply chain policies are rife with legitimacy, credibility and procedural equity concerns and are no replacement for centralized or grassroots approaches to improving the institutions governing sustainable resource use (Bastos Lima and Persson 2020, Delabre et al 2020, Reed et al 2020). Nevertheless, given the continued popularity of supply chain approaches, especially in the context of growing supply chain diligence mandates in the global North (Bager et al 2020), it remains pertinent and pressing to understand how such approaches can be improved to ensure they generate benefits to both people and nature.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).