Evaluating the scaling potential of sustainable land management projects in the Sahelian Great Green Wall countries

The Great Green Wall (GGW) Initiative aims at combatting land degradation while achieving socio-economic development across the Sahel through a mosaic of sustainable land management (SLM) and restoration practices. As the Global Environment Facility (GEF) is the main funding mechanism for land degradation neutrality related projects, we have analyzed its previous SLM projects in four pilot countries in an effort to assess their capacity to foster scaling of interventions and fast track progress towards the GGW objectives. We developed a literature-based scaling evaluation framework and scoring methods to harmonize the GEF agency based project ratings in terms of performance and persistence along seven evaluation domains. We found that projects perform better over time particularly in terms of monitoring, financing and resilience to shocks but are overall only moderately likely to achieve benefits persistent over time, which is necessary to allow for the scaling of interventions. While these efforts should be maintained and further pursued, we also recommend special attention to be placed on a number of interventions that are often less successful or ignored by projects such as enforcing mechanisms for new SLM regulations, empowering vulnerable groups and ensuring sufficient capacity and finances for sustaining achievements even during periods of political or climatic instability.


Introduction
Land degradation is defined as the long term declining trend in land productivity, ecosystem functioning or human land-based benefits caused either by direct human activity or climatic changes. It is already occurring over a fourth of the Earth's ice free land area and affecting around 1.5 billion people's livelihoods [1]. Projected increases in population induced pressures on natural resources as well as the multiple adverse impacts of climate change are expected to exacerbate land degradation in the future. This will make even more people vulnerable, particularly in the drylands of Asia and Africa [1]. Reducing and reversing land degradation is thus important for achieving the Sustainable Development Goal (SDG) 15 of life on land as well as the targets related to food (SDG 2) and water (SDG 6) security [1,2].
It is in this context that the Great Green Wall (GGW) for the Sahara and the Sahel Initiative emerged with the aim to restore 100 million ha of degraded land, sequester 250 million tons of carbon and create 10 million jobs by 2030 while enhancing food security and supporting local communities to adapt to climate change [3]. The GGW supports a mosaic of sustainable land management (SLM) practices adapted to local contexts to create resilient landscapes [3] and achieve land degradation neutrality (LDN) [2,4]. As the financial mechanism of the United Nations Convention to Combat Desertification, the Global Environment Facility (GEF) is the main contributor to the GGW with its land degradation and multifocal areas focused on achieving LDN by creating an enabling environment, demonstrating new technologies and investing in their scaling up [5].
However, overall progress towards the GGW targets remains slow and a pledge of over USD 19 billion was made at the One Planet Summit in January 2021 in order to speed up and further scale existing interventions [3]. As such, thorough planning and evaluation of previous projects is needed in order to efficiently use these new funds. Unfortunately, evaluations of GEF projects are independently performed by their implementing agencies, which hinders the effective comparison of their performance [5]. Moreover, there are concerns about the accuracy of reported project ratings that tend to be 'overly optimistic' while toning down the challenges faced [5]. This in turn hinders the identification of priority issues to address so as to achieve positive ecological and social outcomes at scale.
This study, part of the GEF project 'Large-scale Assessment of Land Degradation to guide future investments in SLM in the GGW', presents a novel attempt at homogenizing project ratings in terms of performance and persistence over time. It aims at evaluating the potential of previous SLM related GEF projects in four GGW pilot countries (Senegal, Niger, Burkina Faso and Ethiopia) to bring about broad transformations allowing the achievement of LDN at scale that persist after project completion (figure S1). Starting from a review of the scientific literature on success factors and barriers to scaling (detailed in the SI), complemented with the GEF's Independent Office of Evaluation (IEO) reports, we developed a novel scaling evaluation framework and scoring methods applicable to GEF projects. Through an in-depth analysis of project reports and their theories of change, we present an evidence-based evaluation of which project interventions are more likely to contribute to achieving LDN at scale in each of the studied countries and conclude with a set of recommendations for future projects in the region.

Project selection
The selected projects in this study were extracted from the GEF database in April 2021 (www.thegef.org/projects-operations/database) based on search terms 'sustainable land management' and 'Great Green Wall' in Niger, Burkina Faso, Senegal and Ethiopia under the land degradation focal area (figures S1 and S2(a)). The search resulted in 30 'approved' or 'completed' projects spanning from the GEF-3 to GEF-6 periods with project documents, implementation or terminal reports available (full list in table S1). These projects are mostly implemented by investment agencies (World Bank, International Fund for Agricultural Development and African Development Bank), followed by the United Nations Development Programme and United Nations Environment Programme (UNEP) (policy agency) (figure S2(b)), and amount to over 1.1 billion USD with GEF contributing over 155 million USD (figure S2(c)). The main objectives and Global Environmental Benefits expected from these projects address (a) ecosystem functioning, including reducing pressures on natural resources and halting or reversing land degradation; (b) livelihoods, aiming at alleviating poverty and increasing incomes and food security; (c) improving climate resilience, reducing GHG emissions and sequestering carbon; and, to a lesser extent, (d) fostering an enabling environment through institutional and policy changes as well as sustainable investment mechanisms; and finally (e) biodiversity conservation in terms of wildlife and habitat protection (figure S2(d)).

Characteristics of a scaling evaluation framework
The GEF defines scaling as achieving Global Environmental Benefits to a higher degree and/or over larger geographical areas [5]. Scaling can be achieved by creating and sustaining enabling environments over long time periods through replication, mainstreaming and market change processes [5]. The theory of change, described in SI, is elaborated by all GEF projects as a pathway from project activities to broad transformations leading to the desired benefits [6][7][8][9]. The intervention logic it presents is used in this study to inform the proposed scaling evaluation framework and domains in order to evaluate projects. The scientific literature explores a range of factors essential for scaling SLM interventions touching upon participatory approaches, capacity building, funding mechanisms and incentives, and promoting an enabling environment for investments in terms of policy, institutions and governance capacity. Scaling is not only conditional upon addressing these factors, but it also requires identifying and remedying financial, socio-political, institutional and governance, as well as environmental risks and obstacles to large scale adoption of SLM practices [10][11][12][13].
Taking into account all these factors (described in detail in SI), we combined elements from the literature [12,14] and the GEF IEO reports [5,11] to elaborate a set of evaluation elements organized along seven evaluation domains: capacity building, monitoring and evaluation (M&E) systems, implementation strategies, resilience to shocks (here referring to ecosystems' resilience to extreme events, political resilience to instability and socio-economic resilience of livelihoods), stakeholder engagement and participatory processes, funding mechanisms and institutional contexts (figure 1(e)). These evaluation domains reflect the capacity of the various types of project activities (figure 1(a)) to bring about transformational processes (figure 1(b)) that persist after project completion, as well as to tackle risks and barriers (figure 1(d)) hindering large scale global environmental benefits (figure 1(c)). The framework thus enables us to evaluate the projects based on intervention performance and the persistence of achievements, which we use as a proxy for evaluating scaling potential.

Scoring methods
We conducted an in-depth analysis of all project information documents as well as midterm and/or terminal and final evaluation reports for 25 projects. The five remaining projects are in the early stages of implementation and do not have evaluation reports available. They are thus not considered in the subsequent analyses. We reviewed these reports along the evaluation domains presented in figure 1(e). We introduce two complementary scores (at domain and project level) to standardize project evaluations across GEF agencies and over time. While relying on time consuming in-depth exploration of project reports, these methods are simple and easily reproduced for other projects and regions.
The scoring of each evaluation domain is based on the total of achieved elements compared to the ones planned in the theory of change (ToC). An element is given a score of 1 if achieved, a score of 0.25, 0.5 or 0.75 if partly achieved (based on the comments in the reports), and a score of 0 if not achieved. These scores are then averaged over each evaluation domain and used to analyze project achievements. This part of the analysis was performed by one person who read through all the reports in detail, which ensures consistency in the scoring between projects. We also present a range of possible scores by computing a pessimist score attributing 0.25, and an optimist one attributing 0.75, to all partially achieved elements (table S2). This was done in an effort to reduce potential human interpretation errors only for elements that were considered as partially achieved. We viewed not achieved and completely achieved elements (those attributed scores of 0 and 1 respectively) as less open to interpretation.
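The domain-level scoring described above can be illustrated with a minimal sketch. The element scores below are hypothetical and not taken from any project report, and the function name `domain_score` is ours. The sketch assumes that only elements planned by the project enter the average.

```python
from statistics import mean

# Hypothetical scores for the planned elements of one evaluation domain:
# 1 = achieved, 0 = not achieved, 0.25/0.5/0.75 = partly achieved.
element_scores = [1, 0.5, 0.75, 0, 1]

PARTIAL_SCORES = {0.25, 0.5, 0.75}


def domain_score(scores, partial_override=None):
    """Average the element scores of one domain.

    If partial_override is given, every partly achieved element is
    re-scored with that value (0.25 for the pessimist scenario, 0.75
    for the optimist one); scores of 0 and 1 are left unchanged.
    """
    adjusted = []
    for s in scores:
        if partial_override is not None and s in PARTIAL_SCORES:
            adjusted.append(partial_override)
        else:
            adjusted.append(s)
    return mean(adjusted)


assessed = domain_score(element_scores)                          # as scored by the assessor
pessimist = domain_score(element_scores, partial_override=0.25)  # lower bound
optimist = domain_score(element_scores, partial_override=0.75)   # upper bound
print(assessed, pessimist, optimist)
```

In this made-up case the assessed score lies between the pessimist and optimist bounds, which is the range that table S2 is meant to convey for each domain.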
However, we acknowledge that there might be a risk that they could be scored differently by another assessor. Thus, the replicability of this method and interpretation of results should be considered with care. We also established an overall score for each project based on the total of achieved elements compared to the ones planned. We allocate a value of 1 to achieved elements, 0 to partially achieved and −1 to elements not achieved which might pose a risk to other achievements. For example, if a project successfully provided trainings to government employees, then the capacity building element gets allocated 1 but if there is a high turnover rate then that causes a risk to the achievement of built capacities and so the political stability/turnover element gets a score of −1. The final score is the sum of scores for all elements presented as a percentage of the maximum achievable score which is the total number of elements considered by a project.
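As a rough illustration of the overall project score, the sketch below applies the +1 / 0 / −1 allocation to a hypothetical project. The outcome counts are invented, the function name `overall_score` is ours, and the sketch assumes that every element recorded as not achieved is treated as a risk and receives −1. It returns the score as a fraction of the maximum; multiplying by 100 gives the percentage form.

```python
# Values allocated to each element outcome in the overall project score.
ACHIEVED, PARTIAL, NOT_ACHIEVED = 1, 0, -1


def overall_score(outcomes):
    """Sum of element values divided by the maximum achievable score.

    The maximum achievable score equals the number of elements
    considered by the project, so the result lies between -1 and 1.
    """
    if not outcomes:
        raise ValueError("a project must consider at least one element")
    return sum(outcomes) / len(outcomes)


# Hypothetical project: 6 elements achieved, 3 partly achieved, 1 not achieved.
outcomes = [ACHIEVED] * 6 + [PARTIAL] * 3 + [NOT_ACHIEVED] * 1
project_score = overall_score(outcomes)  # (6 + 0 - 1) / 10
print(project_score)
```

Because unachieved elements subtract from the total, a project with many unmet elements can end up with a negative overall score under this scheme.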

GEF projects in the Sahel are targeting foundational and demonstration activities more often than investment activities
When reviewing project information documents, we found that all projects included all three types of activities namely knowledge and information, implementation strategies, and institutional capacity activities [14] (figure 2). However, some activities appear to be prioritized more than others. Within the knowledge and information component, over 90% of the projects designed capacity building, M&E, information access and awareness raising activities, but knowledge generation activities were only targeted by 70% of the projects. Under the implementation component, higher priority was given to technologies and approaches (93% of projects) but less to implementing mechanisms and bodies (73% of projects) and financial mechanisms (67% of projects). Finally, the institutional component presents the highest variability with the majority of projects having planned activities related to regulatory frameworks (90% of projects) and governance structures (87% of projects), while only a third of the projects considered conflict resolution and trust building processes (figure 2).
These findings can be explained by the fact that GEF projects in Least Developed Countries put more emphasis on 'foundation' and 'demonstration' approaches rather than 'investment approaches' [5]. Laying the ground for enabling environments through capacity building and institutional support for governance structures and regulatory frameworks (foundation), and demonstrating the applicability and raising awareness of new technologies (demonstration), are all still needed in most of these countries before larger investments can be made to scale interventions [5]. Moreover, the main objectives of the land degradation focal area are geared towards implementing SLM on the ground, building capacity and mainstreaming SLM at local to regional levels, which would explain the greater focus on implementation activities [16].

High performance variability across evaluation domains
While projects target all types of activities, they are not always able to deliver persistent achievements on all of them (figure 3). The aspects most successfully achieved and sustained after project completion are related to the implementation of SLM and their associated livelihood benefits (median score of 0.9), resilience to external shocks whether climatic, socio-economic or political (0.9), the development of participatory approaches enabling governance and ownership of project activities at local level (0.8), as well as capacity building (0.8) (figure 3). On the other hand, institutional arrangements and regulatory frameworks (0.6), M&E (0.6) and financial mechanisms (0.5) remain a challenge in most projects (figure 3). This ranking of scores remains the same under the pessimist and optimist scenarios (table S2), even if the actual scores might differ.
These results are consistent with the GEF IEO reports showing that land degradation projects often have satisfactory outcome ratings but significantly lower sustainability ratings (here referring to achievements that persist after project completion) [16]. Environmental sustainability (implementation and resilience domains) in African land degradation projects is most often achieved followed by institutional sustainability (capacity and institution domains). The lowest ratings are reported in terms of financial sustainability and implementation of M&E systems [16]. Nevertheless, there might be some bias associated with the reporting itself, as the GEF reports generally put more effort into measuring direct impacts on ecosystem changes and capacity building activities [17].
Moreover, when looking at the distribution of scores by project, the highest variability is observed for resilience (here in terms of political stability, alternative livelihoods, and climate resilience), monitoring and financial aspects (figure 3). These domains are the least consistent among GEF projects with some projects achieving very high scores and others hardly resulting in any persistent achievements (figure 3). Current barriers hindering high scores in resilience are related to risks of extreme events such as floods and droughts occurring during the initial phases of project implementation and damaging the first interventions. Political instability diverting funds to national security rather than SLM, insufficient promotion of alternative livelihoods, and insufficient institutionalization of surveillance systems to prevent poaching and illegal deforestation and to ensure respect of grazing-free corridors are also recurrent barriers in the four countries.
While M&E systems are implemented in almost every project, it is less common for them to be useful in enabling decision making or to be maintained and used after project completion, and both are highly project and context dependent. Another common issue mentioned in project reports is that these systems are often either redundant or based on too many indicators for which data are not gathered. The GEF land degradation report recommends that tracking tools include more appropriate biophysical indicators and baselines as well as geospatial data in order to improve the accuracy and effectiveness of monitoring land degradation and progress towards achieving LDN [16]. Improved socio-economic monitoring is also necessary in order to relate investments in SLM to impacted populations' well-being [17].
For their part, financial mechanisms still more often than not rely almost exclusively on limited government and international development funds which risk being diverted in times of crisis. Moreover, projects often face reluctance from private companies to invest in SLM as they lack financial or regulatory incentives or perceive very low returns on investments in SLM and high risks of market failure [5]. There have however been substantial efforts by some projects to involve insurance companies and banks with microcredit and saving schemes, or to improve access to markets (local, export or carbon) that resulted in increased incomes and self-investments made by farmers. Schemes such as payment for ecosystem services as well as compensation for ecosystem damage have been shown to benefit both the local environment and municipality funds.
In contrast, the evaluation domains with less variability between projects are generally better established and have been prioritized for longer periods by development agencies [5] and as such suffer from less common issues that reduce scores. The main drivers of lower scores in the implementation domain are related to the high maintenance costs of the more expensive infrastructure interventions which are seldom accounted for in project funding. High staff turnover puts at risk the accomplishments of capacity building actions [18]. Finally, while all projects do include women and vulnerable people as project beneficiaries, they still struggle to substantially increase their participation in decision making.

Improving performance over time
We find that in general projects in the selected Sahelian countries are moderately successful in sustaining project achievements with a median overall score of 0.47 (figure 4(a)). Throughout GEF periods 3-5, the median is almost constant at 0.45 but increases to 0.7 during the GEF 6 period (figure 4(a)). While the median seems to remain constant, the distribution of scores as well as minimum and maximum values vary between GEF periods (figure 4(a)). All periods but GEF 5 have large overall score variability, but the minimum overall score increases during GEF 5 and 6 compared to the first two periods, suggesting an overall improvement over time albeit still project dependent.
Furthermore, when exploring the progress of the domain evaluation scores over time, the biggest improvements are observed for the monitoring, resilience and funding domains (figure 4(b)). The evaluation score variability between projects in these domains also decreases over time suggesting that projects are consistently improving the design and implementation of relevant activities (figure 4(b)). On the other hand, median scores for stakeholder engagement, capacity building and institutions are more variable between GEF periods with the first two domains presenting lower variability over time while the institutions variability increases (figure 4(b)). Finally, the median score for the implementation domain remains consistently high over time albeit presenting a slight decline and higher variability across periods (figure 4(b)).
The move towards more integrated and programmatic approaches rather than single projects could explain the increases in scores as they allow for better resource allocation and coherence in achieving Global Environmental Benefits [19]. Indeed, the Integrated Approach Pilots [7] (here GEF 6 projects) are better able to improve local systems as they target market access, policy reforms and engagement with the private sector [16,19]. GEF 7 builds on lessons learnt from these pilot projects in its new impact programs as well as in new stakeholder involvement policies that emphasize the role of the private sector as an agent for market transformation and not only as a source of funding, as well as knowledge platforms, gender mainstreaming and climate resilience [5]. It is thus likely that scores will continue to rise in the coming GEF replenishment periods.

Differences between countries
The differences in performance can also be explained by local contexts, which activities are most often targeted or ignored and whether their achievement is satisfactory or not (figure 5). All countries successfully implement capacity building activities for participatory management and technical implementation, awareness raising campaigns, gender sensitive activities, appropriate and demand driven SLM practices and demonstration activities (figure 5). However, they all fail at delivering M&E systems that are effectively used to support decision making.
In Ethiopia, capacity building on regulatory frameworks is the most successful knowledge management activity, while trainings on M&E systems are often needed but seldom planned, and their usefulness in identifying and promoting best practices is hardly ever reported. Satisfactory implementation strategies relate to generating a sense of ownership of project interventions and providing incentives to local beneficiaries to implement SLM techniques, enabling access to markets and encouraging self-financing mechanisms, and achieving some level of climate and livelihood resilience, while the influence of political stability is often not reported. Local government funding mechanisms for their part seem to be needed but they are generally not included in project design. In terms of institutional activities, projects in Ethiopia are most successful in creating functional land commissions and enforcing regulatory frameworks, but transparency issues and conflict resolution mechanisms, as well as the capacity of projects to empower women and vulnerable people to contribute to decision making processes are not reported (figure 5).
Figure 5. Average proportion of activities achieved in a 'satisfactory' or 'unsatisfactory' manner, not targeted by projects but reported to be 'needed', and those 'not mentioned' in project reports for all projects in Ethiopia (n = 4), Burkina Faso (n = 5), Niger (n = 6) and Senegal (n = 6).
M&E systems developed by projects in Niger are often functional and enable tracking progress, but are not used to assess interventions' impacts. Capacity building for legal enforcement is frequently considered as needed in evaluation reports but not planned in project design while financial training is rarely mentioned by projects. In terms of implementation, most projects in Niger provide sufficient incentives to local beneficiaries facilitating the broad replicability of easy to implement SLM interventions. They also successfully promote financial mechanisms for local government and self-financing schemes as well as improved market access and value chain creation. However, they still fail at attracting the private sector either as investor or stakeholder and often do not report on issues related to political stability. Finally, projects commonly succeed in creating managing bodies and governance mechanisms but fail at enforcing regulatory frameworks or conflict resolution mechanisms. They also do not report on tenure systems and staff turnover (figure 5).
In Burkina Faso, knowledge management activities are most successful when they target capacity building in terms of regulatory frameworks and tenure systems, but most reports ignore capacity building towards M&E systems or whether best practices are identified and disseminated through these systems. In terms of implementation strategies, more focus is successfully given towards creating and easing access to markets. However, projects fail at improving climate resilience even though they aim at it, while political stability is often mentioned as a limiting factor but not addressed in project design. When it comes to institutional capacity, projects in Burkina Faso are often successful in terms of improving or creating governance structures, but they fail at improving overall transparency and seldom mention involving the private sector as a stakeholder (figure 5).
The most commonly successful information generation activities in Senegal are linked to developing functional M&E systems capable of tracking progress through appropriate and measurable indicators. However, they often fail to produce impact assessments. Moreover, capacity building activities towards land tenure are seen as needed but ignored by project design. Within implementation strategies, projects in Senegal tend to successfully enable ownership of SLM interventions by local communities and promote self-funding mechanisms, but they fail to encourage private sector investments, achieve climate resilience or attain broad replicability of some interventions mostly due to high input costs on which many interventions depend. Institutionally, most projects are able to improve governance in Senegal but they do not provide any information on staff turnover or empowering women's contributions to decision making (figure 5).
For their part, regional projects have the highest proportion (54%) of elements not mentioned in project reports (figure S3). This could either be due to the narrower scope of regional projects often aiming at fostering South-South collaboration, capacity building or enabling environments, or to the difficulties in reporting on cross-country projects.

Uncertainties and limitations
The analysis presented above depends entirely on the information retrieved from project reports and as such is limited by availability, accuracy and completeness of these reports, particularly when midterm reports are used instead of terminal evaluations. Terminal evaluations are preferred as they are quality controlled both by the GEF agency and the GEF itself. However, only 64% of the projects had terminal reports available and we have had to rely on midterm and implementation reports for the other projects.
Regardless, many of the evaluation elements presented in our framework are not mentioned at all by project reports and could add a degree of uncertainty in the scoring methods but also bias towards projects or domains with fewer elements considered. Figure 5 attempts to shed light on the extent of this uncertainty at country level as between 19% and 25% of evaluation elements in single country projects and up to 54% for regional projects are not mentioned by project reports. For transparency, table S1 also reports the number of elements considered in the analysis for each project. Furthermore, to account for possible biases in our interpretation of achievement with respect to evaluation scores, we have computed these scores based on a pessimist and optimist scenario (table S2). The results show that our method can still be considered as internally robust as the differences between scenarios are relatively small. In the optimist scenario, the stakeholder dimension joins implementation and resilience amongst the top scoring dimensions while finance is the only dimension with a very low score. In the pessimist scenario, it is only implementation that achieves the highest score. Such an uncertainty analysis can be used in future studies in a way that ensures consistency and enables a comparison of possible ranges of project achievements.
However, we have not assessed whether the method can be effectively considered replicable as it was only performed by one person analyzing reports in only one order. As such, future studies could explore these issues by having multiple people perform the scoring, analyzing reports in different orders and replicating the process a few months later.
Moreover, while we have used the median instead of the average to present scores by period and location, caution should still be applied regarding conclusions due to the relatively small sample size, and the very large variations between projects and evaluated elements.
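The preference for the median in small samples can be illustrated with a short sketch. The overall scores below are hypothetical and were chosen only to show how a single very low project shifts the mean much more than the median.

```python
from statistics import mean, median

# Hypothetical overall scores for five projects in one GEF period;
# one project performs far worse than the others.
overall_scores = [0.45, 0.5, 0.4, 0.55, -0.6]

period_mean = mean(overall_scores)      # pulled down by the single outlier
period_median = median(overall_scores)  # middle value of the sorted scores

print(round(period_mean, 2), period_median)
```

With only a handful of projects per period or country, a summary statistic that is less sensitive to one extreme project gives a more stable picture, although it does not remove the need for caution noted above.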

Conclusions
We have taken into account the sustainability evaluation criteria adopted by the GEF IEO as a measure of persistence of project achievements after completion, the literature based guidelines on influencing factors for broad adoption [12], and how the theory of change of GEF projects is used to represent the project interventions' pathway towards achieving Global Environmental Benefits. This was done in order to elaborate 31 evaluation elements organized along seven evaluation domains: implementation strategies and livelihood benefits, resilience to shocks, stakeholder engagement, capacity building, institutional contexts, M&E systems, and financial mechanisms. We have established two scoring methods enabling the evaluation of projects' performance and their likelihood to bring about persistent change, allowing the scaling of SLM practices towards LDN. These scores evaluate each domain separately as well as the overall project performance and thus enable a standardized way of comparing projects and domains, or of assessing changes over time and space. Taken together and accounting for the full list of evaluation elements, these scores and the evaluation framework can provide a full picture of the state, strengths and weaknesses of GEF projects in the region. They could thus be used by the GEF to guide progress reports and terminal evaluations or to design future projects.
We have shown that while all projects design activities targeting knowledge and information and implementation strategies and institutional capacity, they are overall only moderately likely to achieve persistent benefits allowing the scaling of SLM by the pilot countries, ultimately contributing to LDN and its associated environmental and socio-economic benefits. We have however presented initial evidence of an apparent improvement over time from the GEF 3 to the GEF 6 periods particularly regarding achievements in terms of M&E systems, resilience and funding mechanisms. We have also presented differences between countries in terms of both types of activities targeted and how often they tend to be achieved as well as which activities tend to be overlooked by projects even if they are needed.
This study thus recommends that future projects in the GGW put more focus on designing M&E systems that target indicators for which data can be gathered, enable decision making and are well integrated into existing national institutions, so as to allow the continuity of maintenance and use after project completion. We would also recommend that new projects reuse or complement monitoring systems from previous projects so as to avoid redundancy and ensure continuity in achieved progress. Special attention to sustainable financial mechanisms is also required to increase the likelihood of future projects leading to scaling of SLM practices. Diversified financing mechanisms would reduce the risks associated with strong dependence on international and governmental funding. Better market integration, access to credit and saving schemes, and involvement of the private sector, either as an investor or as a party taxed for environmental damage, are examples of the strategies that could create opportunities for local people and municipalities to increase their revenues and capacity to reinvest in SLM practices and community services. Moreover, more effort should be put into broadly communicating and raising awareness of projects' achievements and regulatory changes, and into ensuring alternative livelihood options. Other aspects that need stronger collaboration between development agencies and the different countries' institutions are related to governments' commitment to enforcing new SLM regulations, empowering women and other vulnerable groups to contribute to decision making, addressing high staff turnover and maintaining SLM related infrastructure and activities even during periods of political or climatic instability in the Sahel.

Data availability statement
No new data were created or analyzed in this study.