Science of Scaling: Understanding and guiding the scaling of innovation for societal outcomes

This Editorial to the Special Issue “Science of Scaling: connecting the pathways of agricultural research and development for improved food, income and nutrition security” presents the framing, overview and analysis of 10 articles focussed on scaling innovation in the agricultural research for development sector. The publications cut across three categories that focus on: (i) Understanding the scaling trajectory retrospectively from a longer term, systems perspective, (ii) Understanding scaling of innovation retrospectively as part of shorter term agricultural research for development interventions, and (iii) Conceptual or methodological approaches aimed at guiding scaling prospectively. Cross-cutting review of the publications leads to several insights and critically questions dominant ways of understanding and guiding scaling of innovation in the agricultural research for development sector. This provides a starting point for proposing more outcome-oriented scaling as a third wave of understanding and guiding scaling, beyond technology adoption (first wave) and the scaling of innovation (second wave). The Editorial proposes three Research Domains for the Science of Scaling: (1) ‘Understand the big picture of scaling innovation’ that can inform more realistic ideas about the factors, conditions and dynamics that affect innovation and scaling processes; (2) ‘Develop instruments that nurture efficient and responsible scaling’ that comprises new approaches, concepts and tools that can facilitate the development of evidence-based scaling strategies; and (3) ‘Create a conducive environment for scaling innovation’ that focusses on the institutional arrangements, partnership models, and monitoring and learning for scaling of innovation.


Introduction
Achieving impact at scale is one of the greatest challenges facing the development community (CGIAR, 2015) and the term 'scaling' is increasingly popular in the world of public research for development (Hall and Dijkman, 2019). Scaling usually refers to the adaptation, uptake and use of innovations such as practices, technologies, and market or policy arrangements across broader communities of actors and/or geographies (Eastwood et al., 2017; Glover et al., 2017). In research for development, scaling is usually perceived to be the result of deliberate efforts and interventions that lead to defined societal outcomes such as securing public health, sustaining food availability, living within planetary boundaries, creating jobs and growth, and promoting equality of opportunity. In that sense, scaling is associated with positive change, and high target numbers have become an indicator for those funding, implementing, and evaluating research for development to assess the success of projects, policies, programs and other types of interventions. Rising popularity has contributed to the perception that 'scaling' is something one can do and should aspire to when pursuing Sustainable Development Goals (Wigboldus et al., 2016).
In the agricultural sector, one of the largest public research for development players is the CGIAR, a global partnership that unites public and private organisations engaged in research for a food secure future (Barrett, 2020). In the agricultural research for development (AR4D) context, there is increasing pressure to demonstrate fast and visible returns on investment and impact at scale (Glover et al., 2016). On the one hand, this pressure has stimulated more critical thinking about how to better link investments in research to development outcomes through theories of change and impact pathways (Douthwaite et al., 2003). On the other hand, it has resulted in sometimes unreasonable and unrealistic expectations about the responsibilities of AR4D organisations for achieving societal outcomes (Hall and Dijkman, 2019). The need to show results has tempted organisations to over-promise and focus on quick wins rather than investing in the development of more structural capacity for innovation and scaling in agricultural systems (Hall and Dijkman, 2019; Leeuwis et al., 2018). Similar trends and discussions about the contribution of research to societal outcomes have been observed in other publicly-funded research for development sectors (Penfield et al., 2014).
This Editorial to the Special Issue on 'Science of Scaling' takes stock of how the world of AR4D is engaging with scaling in theory and practice in the context of increased pressure to demonstrate impact. We define the Science of Scaling as: "The design, testing and validation of scientific theories, concepts and methods to understand and guide scaling of innovation to achieve societal outcomes." The next section of this paper presents observations on how scaling has been interpreted and used in recent years and explains how it adds to earlier conceptions of dissemination, adoption, and uptake of innovation (Section 2). Subsequently, we elaborate the aims and objectives of this Special Issue and continue with an overview of the contents (Section 3). We then provide cross-cutting observations and lessons learned (Section 4) that inform a research agenda for further development of the Science of Scaling (Section 5). The final section provides the main conclusions and an outlook for the Science of Scaling (Section 6).

From technology adoption to scaling of innovation
Scaling is not the first term used to characterise processes of expansion and the achievement of development outcomes through research and innovation. Formerly, concepts like adoption or diffusion of innovations were used (Rogers, 1962), as was, arguably, the practice of extension, a notion that dates back to the 1840s (Leeuwis, 2004). Scaling of innovation, although often still interpreted along the lines of adoption, diffusion or extension, refers to more sophisticated and holistic approaches and strategies whereby innovations contribute to and become embedded in broader processes of systemic change in society (Wigboldus et al., 2016). Below, we compare two ways of understanding scaling: 'technology adoption' (wave 1) and 'scaling of innovation' (wave 2).

Perception of innovation
Compared to technology adoption, there is today much greater attention to the multidimensional character of innovation and the belief that innovation and its uses at scale are influenced by broader societal transformation processes. It is increasingly recognized that the uptake or spreading of a technological innovation requires, or goes along with, other changes, including changes in labour organization, service delivery, regulatory frameworks, policies or cultural meanings (Geels, 2002; Wigboldus et al., 2016). To refine discussions of technology transfer, Smits (2002) conceptualized innovation as a combination of hardware (i.e., new technologies), software (i.e., new knowledge, norms and modes of thinking) and orgware (i.e., new institutions and forms of organization) (in Leeuwis and Aarts, 2011). Consequently, scaling is understood and approached more as a set of interdependent changes in a broader system, rather than as the scaling of a specific technology. Garb and Friedlander (2014) use the example of scaling drip irrigation to show that the copy-pasting of technological innovations from one location to another is likely to fail when there is insufficient attention to the unique socio-organizational conditions surrounding irrigation use in specific contexts. A similar need for unpacking, adaptation, repacking and reconfiguring innovations has been emphasized by Glover et al. (2017) as a key condition for scaling.

Types of scaling process
In relation to its multidimensional character, scaling innovation includes simultaneous processes of upscaling, outscaling and downscaling (Wigboldus et al., 2016). The term outscaling refers to the spreading of something within the same sphere, whereas upscaling refers to the creation of conducive conditions and policies for scaling at higher levels (Hermans et al., 2013). For example, if farmers and researchers in a community develop a novel way of preventing soil erosion with the help of bunds, this new practice may spread within the community through horizontal exchange of ideas, or outscaling. If the newly developed practice then becomes part of the national extension policy or is integrated into provincial regulations regarding natural resources management, we can speak of upscaling. In another sense, upscaling can create an enabling environment for further outscaling beyond the community in which the new practice was developed. Hence, outscaling and upscaling are interdependent. In addition, the outscaling or upscaling (i.e., the increase) of a new practice or innovation (e.g., the use of organic pesticides) may simultaneously imply the downscaling or adaptation (i.e., the reduction) of existing practices (e.g., the use of chemical pesticides) in a local context.

Levels of scaling considered
Scaling requires interactions between different levels of scale (e.g., field, farm, community, region, country, continent) wherein it is recognized that something beneficial at one level (e.g., an individual farm producing a new crop desired at the market) may turn out to be less beneficial at other levels (e.g., one million farms growing the new crop and glutting the market) (Wigboldus et al., 2016). Similarly, there may be unanticipated scaling processes that cut across domains and levels, whereby scaling of agricultural productivity using chemical fertilizer (which may have positive effects in a certain context) may also result in the scaling of environmental pollution or degradation (a negative effect). Hence, the trade-offs and synergies of scaling an innovation need to be considered at different levels, for both potential positive and potential negative impacts.

Stakeholder processes
The multi-dimensional and multi-level character of scaling implies that multiple actors and stakeholders are involved, and that the 'individual' can no longer be the only entry point for understanding processes of adoption and scaling of innovation. This shift is linked with greater attention to scaling activities involving decision-making and change in stakeholder networks and stands in sharp contrast to theories of diffusion of innovations, which are based on the notion that the spread of innovations is the result of many individually made decisions (de Roo et al., 2019). Acknowledging that scaling involves decision-making and change in stakeholder networks requires that scaling as a process goes beyond 'extension' as a mechanism to provide decision support to the individual adopter (e.g., the farmer) (Wigboldus et al., 2016). The multi-level and multi-dimensional character of scaling also puts more emphasis on the facilitation of multi-stakeholder processes and multi-stakeholder networks, including support to processes of learning, decision-making, collective action, negotiation and conflict that are inherent to scaling (Hermans et al., 2015; Hermans et al., 2017; Leeuwis and Aarts, 2008).

Drivers of scaling
A final shift in emphasis is linked to the idea that changing conditions at higher levels, or in other spheres, can generate a largely self-organised dynamic of scaling at another level, referred to as 'pull scaling', in contrast to 'push scaling' (Wigboldus and Brouwers, 2016; Wigboldus and Leeuwis, 2013). For example, if a retailer creates a significant price incentive for vegetables that have been grown without the use of pesticides, many horticulturists may be 'pulled' in this direction with little intervention effort, while a government extension campaign 'pushing' for integrated pest management without such a direct incentive may have a more limited response. This notion of 'pull' scaling is linked to the idea that there may be leverage points from which major influence can be exerted on the dynamic of the entire system (Wigboldus et al., 2016). Pull scaling also resonates with the observation that self-organising processes in complex systems may result in 'tipping points' (Gladwell, 2000; Leeuwis and Aarts, 2011) and/or 'windows of opportunity' (van Mierlo et al., 2013) where systems change (and scaling happens) in a rapid manner. In complex systems thinking, such rapid transformation or scaling is seen to arise from multiple coinciding trends, events and influences, rather than from a single, orchestrated intervention (Hall and Dijkman, 2019). This complexity imposes limits on the ability to plan, organise or control scaling efforts, and calls for continuous monitoring through short-term feedback loops and adaptive management to respond to the contexts of changing systems in which innovation and scaling processes are embedded (Arkesteijn et al., 2015; Klerkx et al., 2010).
The various shifts in thinking about scaling are summarized in Table 1, and, as a whole, reflect a greater recognition of the complexities for innovations to have impact at scale. The evolution from technology adoption approaches towards scaling of innovation approaches creates an interesting challenge for those interested in the science and practice of scaling in AR4D. On the one hand, individuals and organisations concerned with the science of scaling generally understand and embrace the multi-dimensional, multi-level and multi-stakeholder nature of scaling, and often conclude that intervention strategies fail to do sufficient justice to the complex nature of innovation and scaling. On the other hand, individuals and organisations concerned with the practice of scaling usually struggle to operationalise complex systems approaches within interventions that often have pre-defined scaling targets and expect fast results and clear return on investment (Andersson and Sumberg, 2017; Glover et al., 2016; Hall and Dijkman, 2019). This challenge informed the following question that the Science of Scaling seeks to address: What kinds of scientific concepts, methods and tools can support research for development in understanding and guiding scaling, without losing, oversimplifying or ignoring the key drivers of scaling innovation in complex adaptive systems?

Overview of the Special Issue on Science of Scaling
The publications in this Special Issue focus on how novel connotations and practices associated with the term scaling have been picked up by researchers who study scaling, and how these notions are translated into methodologies, approaches, metrics and tools that aim to assess or guide scaling investments. In the invitation for the Special Issue, we invited contributions on both successful and unsuccessful experiences regarding scaling to inspire both theoretical and/or practical reflection. Among the ten publications that were accepted for publication, we distinguish three categories (Table 2 and Appendix A: Supplementary Data).
The first category contains two publications that focus on understanding the scaling trajectory retrospectively from a longer term, systems perspective. Low and Thiele (2020) underline how unpredictable but critical inflexion points (or windows of opportunity) such as the 2008 food price crisis 'benefitted' the scaling of innovation. The second paper, by Shilomboleni et al. (2019), entitled "Scaling up innovations in smallholder agriculture: Lessons from the Canadian international food security research fund", applies a systems perspective to draw programmatic lessons from a large scale investment in scaling. They distinguish between investments that focussed on the deployment of innovations and investments that sought to catalyze systemic change, and advocate for more outcome-oriented scaling processes.
The second category includes five publications focussed on understanding scaling of innovation retrospectively as part of shorter term AR4D interventions. Totin et al. (2020) present findings from "Scaling practices within agricultural innovation platforms: Between pushing and pulling". Their work investigates the scaling approaches employed by innovation platforms under the Sub-Saharan Challenge Program in Rwanda and examines how space for scaling is created in innovation platforms. They explore how a combination of push approaches (orchestrated efforts to promote an innovation for adoption at scale) and pull approaches (creating enabling conditions for the broader use of innovations) formed key success factors for scaling. The paper concludes that flexibility in AR4D interventions is a key success factor for scaling of innovation, as is embedding local level innovation initiatives within higher level government policies to ensure conducive institutional conditions for scaling. A similar study is presented by Seifu et al. (2020) in their paper "Anchoring innovation methodologies to 'go-to-scale': a framework to inspire agricultural Research for Development." They analyze how multi-level innovation platforms served as the principal approach to scaling sustainable intensification in Ethiopia and seek to validate the concept of anchoring as a condition for successfully linking niche-level innovation to regime-level innovation, building on the multi-level perspective by Geels and Schot (2007). Their paper shows the importance of combined institutional anchoring (e.g., lobbying, political support), methodological anchoring (e.g., practice, joint learning, experimentation) and network anchoring (e.g., connecting and mobilizing key actors) when niche-level innovations are to be integrated at the regime-level, which they define as a key (pre-)condition for scaling of innovation. Another case study from Ethiopia is presented by de Roo et al. (2019), who critically assess "Scaling modern technology or scaling exclusion? The socio-political dynamics of accessing in malt barley innovation in two highland communities in Southern Ethiopia". The authors approach scaling from the viewpoint of ensuring access to material and social components of an innovation package. They provide strong empirical evidence that ignoring the [...] Prain et al. (2020) also use the multi-level perspective, but focus on research-development partnerships and partnership drivers and dynamics for scaling complex innovations. Their case of scaling Farmer Business Schools in Asia explains changing roles and contributions of research and development partners in innovation and scaling processes. Staff and leadership stability is identified as one of the key drivers of successful partnerships for scaling and the authors identify four phases in research-development partnership processes: (1) exploring 'fit' between partners, (2) defining goals, benefits and building trust, (3) power balancing, accountability and learning, and (4) transformation.
Finally, Van Loon et al. (2020) present "Scaling agricultural mechanization services in smallholder farming systems: Case studies from sub-Saharan Africa, South Asia, and Latin America" in which they analyze three mechanization case studies to identify opportunities and challenges for scaling agricultural mechanization services. The scalability assessment emphasizes the different dimensions or ingredients for scaling, how they are interrelated, and how they are defined in space and time. Across the cases, the main bottlenecks for scaling mechanization are perceived to be of a non-technological nature, for example access to finance or sectoral collaboration.
The third category of three publications presents conceptual or methodological approaches aimed at guiding scaling prospectively. The paper by Woltering et al. (2019) "Scaling -from 'reaching many' to sustainable systems change at scale: A critical shift in mindset" builds a strong case for a systems approach to scaling. They argue that (pilot) project characteristics and narrow scaling conceptions of technical replication and "reaching many" create obstacles for a shift towards scaling through sustainable systems change. They also compare a number of existing tools and approaches that can support achieving impact at scale, including the Scaling Scan that is used in the Van Loon et al. (2020) article. Sartas et al.'s (2020a) paper "Scaling Readiness: science and practice of an approach to enhance the impact of research for development" operationalises complex systems concepts for assessing the scaling readiness of innovations with the Scaling Readiness decision support framework. They explain how the Scaling Readiness approach is rooted in innovation system science and complex adaptive systems thinking that supports the scientific assessment of innovation packages in terms of their readiness for scaling. Scaling Readiness enables the identification of bottlenecks for scaling using a stepwise process. Drawing on management science and network science, they explain how Scaling Readiness develops evidence-based scaling strategies to overcome scaling bottlenecks, while appreciating the need for stakeholder agreement and collective action to achieve impact at scale. Hammond et al. (2020) propose a practical framework for targeting with "Towards actionable farm typologies: Scaling adoption of agricultural inputs in Rwanda". Their paper demonstrates how the use of the Rural Household Multi-Indicator Survey (RHoMIS; Hammond et al., 2017) for rapid characterization of rural households can support scaling partners to move beyond one-size-fits-all approaches to scaling of innovation. The grouping of rural households can inform strategies for scaling of innovation that are more focused on addressing the specific needs and opportunities of such groups without compromising too much the economies of scale that enable large-scale rollout of rural development interventions by organisations.

Cross-cutting observations on the Science of Scaling
Analysis across the ten publications reveals a number of recurrent trends, directions and debates. We synthesize them under several cross-cutting themes that contribute to our collective understanding of scaling of innovation in AR4D.

Innovations scale as part of spatially- and temporally-defined packages
Several publications emphasize that innovations scale as part of innovation packages (de Roo et al., 2019; Sartas et al., 2020a; Van Loon et al., 2020). These packages are defined in different ways. Sartas et al. (2020a) distinguish between core innovations (what it is that an intervention is trying to scale) and complementary innovations (the enabling conditions or other innovations that are required for the core innovation to be used), and describe how the innovation package (core and complementary innovations combined) is defined in space and time. de Roo et al. (2019) show how social innovations (e.g., access to land, agronomic knowledge, the 'right' people) can enable or constrain the adoption of technological innovations, and how overlooking or ignoring social and political conditions can potentially scale social exclusion rather than innovation. Several publications emphasize how innovations need adaptation as they reach new settings, which aligns with earlier work from Garb and Friedlander (2014) who refer to "translation" and "re-innovation" as a prerequisite for scaling. Also, the conditions for scaling may suddenly change over time due to shocks in the system, new market opportunities or trends, and other types of tipping points or windows of opportunity. The publication by Low and Thiele (2020) nicely illustrates how scaling of Orange Fleshed Sweet Potato 'benefitted' from the major floods in Mozambique in 2000-2001 and from events such as the 2008 food price crisis and the formation of the Scaling Up Nutrition Movement in 2011. Several publications in the Special Issue say that flexibility and the ability to forge coalitions with other innovation networks are major drivers for successful scaling of innovation (Seifu et al., 2020; Totin et al., 2020).

Numbers are only part of the story
Both Sartas et al. (2020a) and Woltering et al. (2019) explain that the notion of "reaching many" is problematic and misleading. First, it creates the wrong incentives in AR4D, where achieving and reporting numbers within an intervention or project cycle has become more important than developing (sustainable) mechanisms and partnerships that can catalyze long-term systemic change. In the quest to reach high numbers, many AR4D organisations scale the innovations themselves, rather than investing in embedding those innovations within the systems, strategies and practices of government and private sector partners who have the mandate and capacity to deliver at scale and sustain delivery of services over time (Woltering et al., 2019). Sartas et al. (2020a) introduce principles of network science that not only measure how many farmers or other end-users have adopted innovations, but also consider the relative position and relations of actors in the network as an important indicator of the innovation's scaling potential. For example, if all innovation adopters were directly incentivized by the intervention (i.e., farmers being paid by the AR4D project to test or adopt the innovation), then this result would score low in terms of the 'readiness' of that innovation to go to scale, irrespective of the number of innovation adopters. By emphasizing that numbers only tell part of the story, they put more focus on the relative position of those using innovations in the innovation network, and attribute more value to innovation use by actors that were not directly related to or incentivized by the AR4D intervention that supported the design, testing, validation and scaling of the innovation.

Short-term scaling interventions versus long-term processes of systemic change?
In their report on agri-food system innovation, Hall and Dijkman (2019) analyze a number of case studies that, similar to the Orange Fleshed Sweet Potato case presented by Low and Thiele (2020), show how technology, regulation, stakeholder coalitions, market forces, and champions and leadership are among the main drivers in transition and/or scaling processes. This raises the question of what is, or should be, the role and contribution of shorter-term interventions such as AR4D projects to longer term transition or systemic change processes. The papers from Woltering et al. (2019) and Shilomboleni et al. (2019) build a case for innovation and scaling pathways that move beyond short-term aspirations of projects (i.e., fast and quantifiable value for money) to longer-term changes. Like Glover et al. (2016), they argue that to achieve long-term systemic change, scaling may require prolonged investment in forging stakeholder coalitions, market development and policy advocacy. These types of processes cannot be easily captured and/or measured in simple numerical terms.
A more pragmatic approach would integrate long-term visioning, agenda-setting and incentive systems that encourage sustainable systems change with targeted short-term interventions to build capacities and create more conducive conditions for enabling systemic change.
This would require a redirection of AR4D investments from stand-alone innovation delivery projects towards strengthening innovation and scaling systems capacity to effectively address questions such as what innovations work where, under what conditions, for whom, and for which outcomes. Decision support tools such as Scaling Scan (Woltering et al., 2019) and Scaling Readiness (Sartas et al., 2020a) can support AR4D organisations to reflect continuously on bottlenecks that limit innovation and scaling, and explore what options, actions and partnerships could be required to improve the functioning of the innovation system, both short- and long-term.

Scaling requires new skills, conditions, and capacities, and is not neutral
Several publications in the Special Issue emphasize the importance of lobbying, networking, and building trust and relationships between innovation and scaling partners as a key condition for success (Prain et al., 2020). Seifu et al. (2020) explain that anchoring small changes in the wider context might be easy but more profound changes require strategic networking, wider experimentation and dialogue with regime authorities. They underline that the power, knowledge and actionability for effective scaling is a complex multi-actor and multi-level business, requiring learning, negotiation, networking, and collective action. Many AR4D organisations have begun to understand that scaling requires different kinds of capacities than those possessed by the typical agricultural scientist.
Low and Thiele (2020) and Seifu et al. (2020) stress the importance of scaling leadership and championing. Scaling champions are typically people who understand a scaling partner's needs and worldviews, and have the capacity and stamina to convince, pursue, identify and capture windows of opportunity (see also: Klerkx and Aarts, 2013). For these champions to perform these functions, they need a flexible and enabling environment that allows them to identify and navigate opportunities for and barriers to scaling. Most AR4D organisations do not offer these conditions as part of their research for development structures. AR4D organisations also need to invest more time and energy in understanding the realities and needs of public and private scaling partners, and in co-designing innovation validation and scaling processes. The collaborative partnership model documented in Hammond et al. (2020) is a good illustration of how Science of Scaling can respond to a specific demand for more (cost-)effective strategies by partners that can operate at scale.

Scaling goes hand in hand with reduced influence over how innovations are used in society
Unintended scaling effects described by de Roo et al. (2019) amplify the need for tools and approaches that can support the design and implementation of responsible scaling strategies that are concerned with processes of anticipation, inclusiveness, responsiveness, and reflexivity (Wigboldus and Brouwers, 2016). Woltering et al. (2019) propose a "responsibility check" and include responsible scaling as an indicator for success and to anticipate the potential negative social and environmental impacts of using innovations at scale (see also Wigboldus and Brouwers, 2016). Interventions should be designed with "scale in mind" (Redding et al., 2017), which implies that even during early stages of innovation design and testing there is a clear idea about how such innovations can contribute to societal outcomes. This idea connects with the pathways of innovation and scaling as part of a Theory of Change (Douthwaite and Hoffecker, 2017) that needs to be monitored, evaluated and updated based on principles of reflexive monitoring and adaptive management (Arkesteijn et al., 2015; Klerkx et al., 2010).
We run into a dilemma here, as processes of scaling unavoidably go together with reduced influence over how innovations are used and how they impact the livelihoods of heterogeneous end-users. A way to deal with this dilemma may be embedded in the typology approach of Hammond et al. (2020). By identifying and clustering different end-user groups, researchers can assess ex-ante whether the intended innovation package(s) and scaling strategy could achieve the desired outcomes for different groups of beneficiaries (e.g., by age, ethnic group or economic class). This work can provide active feedback on whether and how innovations are being used or abused by different groups of end-users and help to validate and modify the scaling strategy. Furthermore, this preparation would help usher in a move away from one-size-fits-all approaches, by thinking more specifically about what innovations work for whom, under which kinds of conditions, and what would be the most sustainable approach to scale those innovations. In line with the proposal by Seifu et al. (2020), such testing and validation could be part of a pre-scaling phase that offers a safety-net for both the end-users and the organization(s) supporting the scaling to identify unintended consequences at an early stage, and update the innovation package and the scaling strategy accordingly. The innovation readiness levels presented by Sartas et al. (2020a) can support a process where innovations that are proven under 'controlled conditions' proceed through steps of testing and validation under '(semi-)uncontrolled conditions' before they can actually be designated as 'ready for scaling'. In addition, there needs to be active monitoring of the intended and unintended consequences of scaling innovations, and a degree of flexibility to adapt the scaling strategy and outcomes if things tend to move in the wrong direction (Totin et al., 2020).

Need for fit-for-purpose partnerships and collaboration models for scaling
Considerable debate surrounds the role of AR4D organisations in scaling. For example, the CGIAR Strategy and Results Framework (2015) states: "Research by CGIAR and its partners can support the drive to disseminate innovations, but the scaling up effort must be led by national institutions, supported by regional or international development organizations where appropriate". Nevertheless, many CGIAR projects struggle with how best to engage in scaling, particularly in contexts where national scaling partners may be perceived as weak or lacking capacity. The collaborations between an AR4D organization and a scaling partner described in Hammond et al. (2020) and Prain et al. (2020) provide good examples of how an AR4D organization can deploy scientific knowledge, methods and analyses to support the scaling organization in the design, implementation and monitoring of effective scaling strategies. Shilomboleni et al. (2019) emphasize the importance for AR4D organisations to collaborate with government and private sector scaling partners to implement scaling activities. Partnerships between AR4D organisations and scaling partners should be based on mutual interests and, ideally, on principles of co-investment in a jointly defined Theory of Change towards reaching societal outcomes. Currently, scaling partners often participate in AR4D-initiated scaling interventions and projects, rather than the other way around. This creates mismatches in terms of the types of innovations that are being proposed, tested, and scaled, often leading to disappointing impacts. Prain et al. (2020) add that partnerships have different stages and that the nature of partnerships is likely to change as scaling processes and scaling bottlenecks change. They use the concept of 'fit' in the context of matching research and development partners.
Two papers focus on the role of innovation platforms as a partnership model for innovation and scaling in AR4D. Seifu et al. (2020) conclude that rather than promoting the innovation platform approach as a magic bullet, it is essential to make an ex-ante appraisal of problems, existing contexts, and innovation mechanisms to establish the best option for that context. This process is supported by conclusions from Lamers et al. (2017), Hermans et al. (2017) and Sartas et al. (2018, 2019), who studied AR4D innovation platforms. Both Totin et al. (2020) and Seifu et al. (2020) conclude that innovation platforms need to be anchored in the broader political and development agendas at the regime level if innovations are to survive beyond the protected niche space where they were designed and tested (see also Schut et al., 2018).
The analysis of research-development partnerships by Prain et al. (2020) confirms the importance of what Seifu et al. (2020) call "network anchoring" as a key condition for successfully linking niche-level innovation to regime-level innovation. Prain et al. (2020) describe the changing types of partnership models (e.g., initial networking, exploring complementarity, co-producing and learning) and the roles and activities of both research and development partners as scaling processes evolve over time. Their findings align with earlier observations that the type of research, activities and partnerships are likely to change as innovation and scaling processes evolve over time. Prain et al. (2020) describe a "partnership health check-up" to ensure that research and development partners continuously reflect and agree upon the mutual objectives, the division of roles and responsibilities, expected investments, and the processes of communication, learning and decision-making. Sartas et al. (2020a) propose Social Network Analysis as a scientific method to map stakeholder networks and support partnership selection based on evidence about which partners are best positioned in a network to fulfill specific innovation or scaling functions.
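To make the idea of being "best positioned in a network" more concrete, the following is a minimal sketch of one common Social Network Analysis measure, degree centrality, which scores each actor by the share of other actors it is directly tied to. It is an illustration only and not the procedure used by Sartas et al. (2020a); the actor names and ties are hypothetical, and in practice other measures (and real survey data) would be needed to assess suitability for a specific innovation or scaling function.

```python
from collections import defaultdict

# Hypothetical stakeholder ties (who collaborates with whom); illustrative only.
ties = [
    ("research_institute", "extension_service"),
    ("research_institute", "seed_company"),
    ("extension_service", "farmer_cooperative"),
    ("extension_service", "ngo"),
    ("seed_company", "farmer_cooperative"),
]


def degree_centrality(edges):
    """Return each actor's share of possible direct ties (between 0 and 1)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        # With fewer than two actors there are no possible ties to count.
        return {actor: 0.0 for actor in neighbours}
    return {actor: len(links) / (n - 1) for actor, links in neighbours.items()}


centrality = degree_centrality(ties)

# Rank actors from most to least directly connected.
ranked = sorted(centrality.items(), key=lambda item: item[1], reverse=True)
for actor, score in ranked:
    print(f"{actor}: {score:.2f}")
```

In this toy network the extension service has the most direct ties, which might suggest it as a candidate broker for reaching other actors; a high score alone does not establish that a partner is willing or able to fulfill that role.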

From scaling of innovation to achieving outcomes at scale
In line with the shifts in our understanding of scaling (Table 1), we observe that in many of the cases presented in this Special Issue, there is still a strong 'push' or preference in terms of which innovation is proposed or preferred to contribute to impact at scale (Low and Thiele, 2020; Van Loon et al., 2020). In the AR4D sector there is strong path dependency and leading AR4D establishments (such as those in the CGIAR) have defined their organizational mandates and responsibilities around specific commodities (e.g., rice, wheat, livestock, agroforestry), thematic areas (e.g., climate change, agrobiodiversity, integrated pest management), or specific innovations (e.g., small scale mechanization, digital extension tools). Sartas et al. (2020a) and Woltering et al. (2019) emphasize the strong dependency on donor-funded projects in AR4D, resulting in a situation where AR4D organisations put their specific commodities, themes and innovations at the centre of scaling efforts, rather than choosing those innovations that can achieve outcomes at scale in the most efficient way. Even in cases where an innovation or scaling process is organised in a more demand-oriented or participatory way (Seifu et al., 2020; Totin et al., 2020), the innovations proposed for scaling are often those that can be supported or developed by the organisations involved in the AR4D intervention.
This selection bias triggers the question of whether scaling should be about the innovation, or about the aspired outcomes (i.e., what the use of that innovation at scale seeks to accomplish). Taking a more systems-oriented and outcome-oriented approach would start by: 1. Mapping the main livelihood or development challenges in specific contexts or locations (e.g., malnutrition), followed by: 2. Making an inventory of different types of innovations with high scaling potential for different locations (e.g., orange-fleshed sweet potato in location A; Vitamin A-rich bananas in location B; and biofortified beans in location C), followed by: 3. Developing a better idea about the context-specific measures and conditions through which such innovations could be accessed, adapted and used by different groups of end-users (e.g., providing access to credit, ensuring market access, having functional seed systems), and then: 4. Identifying the key bottlenecks for scaling innovation packages and developing scaling strategies and partnership processes to overcome those bottlenecks.
This hypothetical example shows how a focus on achieving outcomes at scale (in this case, combatting malnutrition) may stimulate more critical thinking about the diverse innovations, innovation packages, strategies and partnerships that may be required. Outcome-oriented scaling provides inspiration for a third wave of understanding and guiding scaling, beyond technology adoption (first wave) and scaling of innovation (second wave) presented in Table 1.

Three Research Domains for the Science of Scaling
The contributions to this Special Issue suggest that the Science of Scaling can support the AR4D sector by helping to unpack the notion of scaling, with the ultimate goal of using scientific concepts, tools and evidence to better understand and guide scaling efforts and investments in practice.
Based on the analysis of publications that were submitted to the Special Issue, and our broader reflections, we have identified three Research Domains that can advance the Science of Scaling (Fig. 1).

Research Domain 1: Understand the big picture of scaling innovation
There is scope to connect discussions about scaling more systematically to general theory about system innovation and transition processes (Elzen et al., 2012; Geels and Schot, 2007; Loorbach, 2007). Such theories typically engage with long-term transformative processes and changing paradigms and perspectives on innovation and scaling in society, often from a historical perspective. This form of engagement arguably fits well with the kinds of transformative changes that are implied by the Sustainable Development Goals. Case studies from AR4D may well enrich general theory formation and vice versa, but an important condition is that both positive and negative experiences with scaling are reported.
As part of the call for submissions to the Special Issue on Science of Scaling, we explicitly welcomed publications on well-posed and planned but ultimately unsuccessful scaling initiatives from which insights can be drawn. Except for de Roo et al. (2019), we did not receive cases of failed scaling or ones that showed unintended negative consequences of scaling. Woltering et al. (2019) emphasize that "pilots never fail, pilots never scale", an idea they relate to the fear of losing funds and credibility when communicating the failures and lessons in scaling. They emphasize that pilots can fail and do not always have to scale, if they make a clear contribution to increasing the readiness of the innovation for scaling. This may include the identification of (new) bottlenecks in the enabling environment (e.g., access to credit, markets, information) that prevent innovations from going to scale, and an awareness that as long as those bottlenecks are not addressed, investments in scaling may be in vain. As part of their Orange Fleshed Sweet Potato innovation and scaling history, Low and Thiele (2020) emphasize that innovation and scaling trajectories are lengthy, pass through different phases, and that pilots are critical for generating evidence to convince new research and/or scaling investors. The Scaling Readiness approach presented by Sartas et al. (2020a) provides a framework to keep track of innovation development and scaling, where metrics and indicators are introduced to monitor how the maturity and scalability of innovations evolve over time. This work can contribute to more realistic innovation and scaling pathways understood as dynamic and long-term processes (Penfield et al., 2014).
There is a need for critical ex-post analyses of innovation and scaling histories in AR4D to inform general theory and hypotheses development about the factors, conditions and dynamics that influence scaling as part of systemic transformation processes. We see three key research questions as part of this Research Domain: 1. What are the key factors and drivers that affect scaling over longer time spans and to what extent do such factors and drivers result from self-organization or from deliberate intervention? 2. How do technical, organizational, economic, institutional, behavioural, discursive and political dimensions of change co-evolve over time and which type(s) of change provides leverage over others in processes of scaling innovation? 3. What are the strengths and weaknesses of scaling models governed through the public sector, the private sector, or public-private partnership? Which model is most appropriate for scaling innovation in different contexts?

Research Domain 2: Develop instruments that nurture efficient and responsible scaling
A stronger theoretical understanding of innovation and scaling provides the basis for the development of new approaches, concepts, and tools to guide decision-making about scaling strategies. These approaches, concepts, and tools are the subject of Research Domain 2. Theories such as the multi-level perspective by Geels and Schot (2007) are useful to analyze scaling pathways retrospectively, but need complementary action-oriented tools that can guide the development and implementation of strategies for scaling innovation. For example, Seifu et al. (2020) introduce the notion of 'anchoring' as a concept to embed niche-level innovations into regime-level processes and systems. Similarly, the experiences reported by de Roo et al. (2019) on 'scaling exclusion' underline the need to pay greater attention to 'responsible scaling' as proposed by Wigboldus (2018). To operationalise the concept of responsible scaling, better tools are needed for anticipating the likely positive and negative consequences of scaling innovations for different societal interests and/or segments of the population. Such tools can help AR4D to move beyond one-size-fits-all scaling approaches and tailor scaling strategies for different types of end-users (Hammond et al., 2017; Hammond et al., 2020). Operationalising the idea of 'responsible scaling' may also require more guidance on how and by whom decisions on scaling (e.g., which outcomes, innovations and next- or end-users are prioritised) can be taken in a democratic and transparent manner. Both directions require new thinking and approaches that are less focussed on the wish to scale particular innovations, and more on how the realization of particular outcomes and impacts (e.g., gender equity, poverty reduction, and improved health) may need the scaling of a variety of innovations for different objectives, for different groups of end-users, and for different contexts. In other words: we need to consider that the scaling of innovations cannot be a goal in itself, but must serve a larger purpose in terms of the societal outcomes and goals the scaling of the innovation seeks to achieve (Sartas et al., 2020a; Woltering et al., 2019).
New innovation and scaling theories need to be translated into approaches, concepts and tools that can guide the development and implementation of evidence-based strategies for scaling. We strongly suggest that such approaches, concepts and tools build on those that already exist, including Scaling Scan (Jacobs et al., 2018), Scaling Readiness (Sartas et al., 2020b), Responsible Scaling (Wigboldus and Brouwers, 2016), To SCALE (FHI 360, 2004), and the Scaling Up Framework (MSI, 2016). In this domain, we see a specific need for answering the following research questions: 1. How can (un)intended positive and negative consequences of scaling be anticipated and differentiated across dimensions, levels, and societal groups, and how can such trade-offs and synergies guide investments in responsible scaling? 2. What kind of approaches could support a shift from 'scaling innovations' to achieving 'outcomes at scale' and to what extent does this contribute to the achievement of Sustainable Development Goals? 3. What is the (comparative) value of different intervention strategies, methods and tools aimed at scaling in/across different locations, cultures, levels, and spheres?

Research Domain 3: Create a conducive environment for scaling innovation
The structural embedding and use of new approaches, concepts, and tools poses new demands on AR4D organisations, possibly requiring reconfiguration of organizational mandates, capacities and models (Barrett, 2020). Low and Thiele (2020) emphasize the importance of their multi-disciplinary team and diverse partnerships as a key condition for success. Both Low and Thiele (2020) and Seifu et al. (2020) refer to the important role of so-called innovation and scaling champions to mobilize and align people and resources across project levels.
Enabling and empowering such champions to operate effectively and to navigate the complexities and politics of scaling innovation requires flexibility and room to maneuver, as highlighted by Totin et al. (2020). Prain et al. (2020) cite staff stability as an important driver for impactful research-development partnerships for scaling.
Several authors emphasize the tensions occasionally caused by novel approaches to innovation and scaling in AR4D establishments (Leeuwis et al., 2018; Schut et al., 2016). Thus, it is important to continuously monitor and learn about the extent to which institutional arrangements, organizational cultures and structures provide a conducive environment for AR4D organisations to deliver effectively against their mandate and to contribute to societal outcomes. The use of digital tools can enable citizen science and crowdsourcing of data as part of monitoring, evaluation and learning mechanisms. This combination enables rapid and continuous feedback from innovation users on the extent to which innovations serve their purpose and what modifications or complementary innovations would be required (Steinke et al., 2020; Van Etten et al., 2019). Experiences with attempts to reconfigure institutional arrangements, partnerships, and cooperation models, and monitoring and learning mechanisms need to be documented, reflected upon and changed to accommodate new practices of scaling. The following research questions can guide scientists in contributing to this Science of Scaling Research Domain: 1. What kinds of institutional arrangements (e.g., incentive systems, fund allocation, adaptive management) can contribute to creating an enabling and flexible environment necessary for impactful innovation and scaling processes? 2. What partnership models are effective in fostering conducive and equitable collaboration between national and international innovation and scaling partners? 3. What mechanisms, indicators and (digital) tools are relevant to capturing innovation and scaling processes, and how can these be used in monitoring and evaluation to foster learning and accountability?
We acknowledge that the Research Domains and the corresponding research questions need to be operationalized for more meaningful use in specific interventions, projects, and/or case studies.

Conclusions and outlook
This Editorial to the Special Issue introduces, frames, and draws cross-cutting lessons from ten Science of Scaling publications. The publications present multiple case studies from Asia, Africa, and Latin America, and include more conceptually and methodologically oriented studies. Based on our synthesis of the publications, we propose three Research Domains as part of a Science of Scaling agenda, each with a set of research questions that, if addressed, can inform more effective innovation and scaling efforts. Together, the three Research Domains actively connect the science to the practice of scaling, a connection that is essential to translate progressive innovation and systems transformation theory into approaches and tools that can support scaling of innovation in practice.
The Science of Scaling seeks to support a shift away from 'finding specific solutions' and 'bringing those to scale'. Rather, we are more concerned with contributing to enabling conditions and strengthening capacities in innovation systems where scientists, governments, the private sector, civil society organisations, and development donors and investors can effectively collaborate and overcome both current and future (agricultural) development challenges. For that to happen, a common understanding among those stakeholders is required on (1) the nature of innovation and scaling processes; (2) how such processes can be nurtured and made more efficient and responsible; and (3) the types of institutional arrangements, partnerships and monitoring and learning systems that provide a conducive environment for scaling innovation. These three elements for success correspond with the three Research Domains presented in this Editorial.
A wind of change is currently blowing through a major player in the international AR4D sector, the CGIAR, which is embracing novel ways of thinking about innovation and scaling. The emerging third wave of understanding scaling introduced in this Editorial, outcome-oriented scaling, aligns well with the CGIAR's ambition to put achieving societal outcomes and impact at the core of its approach. Outcome-oriented scaling puts more emphasis on the aspired outcomes of using innovations at scale, and subsequently identifies those innovations that are (most) ready for scaling, and those scaling strategies and partnerships that are most resource efficient to achieve those outcomes.
Science of Scaling is a topical and relevant new science field that requires more attention and investment by research for development organisations and their donors if they are serious about achieving impact at scale and, ultimately, helping nations and regions achieve the Sustainable Development Goals.

Declaration of Competing Interest
None.

Fig. 1. Three Research Domains, their foci and orientations within the Science of Scaling.

Table 1
Comparison of two ways of understanding scaling: technology adoption (wave 1) and scaling of innovation (wave 2).
socio-political dynamics of access to technological innovations may result in unintended and undesirable scaling outcomes. Prain et al. (