Scaling Readiness: Science and practice of an approach to enhance impact of research for development

Scaling of innovations is a key requirement for addressing societal challenges in sectors such as health, agriculture, and the environment. Research for development (R4D) programs, projects and other interventions struggle to make particular innovations go to scale. Current conceptualizations of scaling are often too simplistic; more systemic and multidimensional perspectives, frameworks and measures are needed. There is a gap between new complexity-aware theories and perspectives on innovation, and tools and approaches that can improve strategic and operational decision-making in R4D interventions that aim to scale innovations. This paper aims to bridge that gap by developing the key concepts and measures of Scaling Readiness. Scaling Readiness is an approach that encourages critical reflection on how ready innovations are for scaling and what appropriate actions could accelerate or enhance scaling. Scaling Readiness provides action-oriented support for (1) characterizing the innovation and innovation system; (2) diagnosing the current readiness and use of innovations as a proxy for their readiness to scale; (3) developing strategy to overcome bottlenecks for scaling; (4) facilitating and negotiating multi-stakeholder innovation and scaling processes; and (5) navigating and monitoring the implementation process to allow for adaptive management. Scaling Readiness has the potential to support evidence-based scaling strategy design, implementation and monitoring, and – if applied across multiple interventions – can be used to manage a portfolio of innovation and scaling investments.


Introduction
Academic and professional interest in how innovations spread in society has long historical roots, going back to the work of Ryan and Gross (1943) and Rogers (1962) on the adoption and diffusion of innovations. Today, such processes of adoption or diffusion are generally labelled as the scaling of innovations. Innovations can be technologies, products, services and practices, but also organizational and institutional arrangements. Scaling refers to the increased use of innovations beyond the group involved in their initial design and testing. Scaling is considered important in the context of global investments in research and development to address societal challenges related to health, agriculture, and the environment. The scaling of innovations is particularly relevant to research for development (R4D) organizations that have a mandate to develop, test and validate innovations to achieve Sustainable Development Goals (SDGs) and to demonstrate to donors that their research innovations are adopted in society to show return on investment (Renkow and Byerlee, 2010).
Experience shows that achieving impact at scale is more complex and difficult than anticipated in intervention proposals (Alvarez et al., 2010; Thornton et al., 2017). The earlier conception that innovations could simply be transferred by intermediaries and change agents (e.g., extension officers or health educators) and then diffuse within communities of individual beneficiaries (Rogers, 1962) has been largely refuted (Röling, 1988; Leeuwis, 2004). Historians of technology, for example, argue that scaling of innovation involves competition between supporters of different technological solutions, and those who defend interests and sunk investment associated with incumbent technological systems (Geels and Schot, 2007; Schot and Geels, 2008). Others stress that the scaling of one innovation (e.g., using a new seed variety) depends on the simultaneous upscaling of other complementary practices (e.g., weeding, pesticide-use, distribution of inputs, credit provision) and the downscaling of existing practices (e.g., the current dominant seed variety) (Wigboldus et al., 2016). This dynamic also points to the existence of interdependencies among the people who are involved in these practices (Leeuwis and Aarts, 2020). Moreover, several authors argue persuasively that scaling of something in one domain (e.g., in agriculture) may have implications for outcomes in another domain (e.g., in health) and that local level scaling processes may influence, and be influenced by, dynamics at higher levels (Cash et al., 2006; Giller et al., 2008; Schut et al., 2014; Wigboldus et al., 2016). In view of such interdependencies, it has been argued that the development of scalable innovations depends on conducive interactions in multi-stakeholder networks, wherein what may be desirable and possible in one context may be problematic and unfeasible in others (Cole et al., 2016; Hall, 2007; Klerkx et al., 2012).
Individuals and organizations that have a mandate or the ambition to scale innovations could benefit from complexity-aware models and operational and strategic decision support tools to better guide and justify their scaling strategies and investments (Lanham et al., 2013; Patton, 2010; Westley and Antadze, 2010). Our review of the innovation and scaling literature has revealed that there is no comprehensive framework that provides the scientific foundation and key concepts for a set of tools, methods, and processes to guide the development, implementation and monitoring of scaling strategies in R4D practice. The objective of this paper is to fill this gap by translating scientific insights and theories about innovation and scaling into a practical approach that provides guidance and recommendations for taking innovations to scale.
The approach presented in this paper, Scaling Readiness, is centred around the ambition to assess the readiness of innovations to achieve specific impact at scale, within a specific context, and to develop, implement and monitor scaling strategies to achieve those scaling objectives. Although developed and published in the context of the agricultural R4D sector, it has been designed to support other R4D sectors as well (e.g., health, environment, education). The notion of "readiness" refers to whether an innovation has been tested and validated for the role it is intended to play in society. The concept resonates with levels of technology readiness that have been proposed by the National Aeronautics and Space Administration (NASA) of the United States, the European Commission (EU) and technology studies scholars to assess advancements in technology development, commercialization, and transition pathways (European Commission, 2014; Verma & Ramirez-Marquez, 2006; Kobos et al., 2018). We use the term "Scaling Readiness" in two ways: (1) as a brand name for a decision-support process that we are developing and describing in this paper (capitalized), and (2) as a key variable and measure for assessing the maturity and scalability of an innovation (not capitalized).
Scaling Readiness contributes in meaningful ways to other frameworks and approaches that have been proposed in relation to readiness and/or scaling (inter alia Kuehne et al., 2017; Wigboldus et al., 2016). First, Scaling Readiness not only assesses the maturity of technological innovations, but also of other types of innovations (including social and institutional innovations), which enables a significant broadening of scope when compared to technological readiness (Verma & Ramirez-Marquez, 2006). Second, Scaling Readiness extends conventional readiness assessments by incorporating a measure of the actual use of the innovations in the locations where scaling is desired and thereby grounds readiness in the specific local context. Third, the scope of Scaling Readiness goes beyond the PRactice-Oriented Multi-level perspective on Innovation and Scaling (PROMIS) assessment from Wigboldus et al. (2016) by providing a hands-on and action-oriented process that fosters collective decision-making, learning and strategizing. Furthermore, Scaling Readiness reduces complexity into a single metric indicator that can also be used to assess changes in scalability over time. Fourth, the emphasis on decision-support and complex interdependencies distinguishes Scaling Readiness from the ADOPT approach proposed by Kuehne et al. (2017), which essentially predicts adoption of a specific innovation based on Rogers' (1962) perceived attributes of innovations (i.e., trialability, observability, relative advantage, etc.) and several population characteristics in a given context. In summary, Scaling Readiness aims to advance and complement other tools, approaches, and frameworks by: (1) paying attention to both technological and non-technological innovations; (2) considering contextual conditions; (3) providing hands-on decision support to address scaling bottlenecks; and (4) striking a balance between analytical complexity and practical applicability.
This contribution is structured in five sections. Section 2 presents the key concepts underlying Scaling Readiness. Section 3 explains how these key concepts are operationalized, measured and made actionable. Section 4 sketches the Scaling Readiness process as we propose it and its potential use in the R4D sector. Conclusions and reflections are presented in Section 5.

Key concepts of Scaling Readiness
This section presents key concepts that underlie Scaling Readiness and which guided the design of our approach in terms of its requirements and measures. The key concepts of Scaling Readiness are a combination of adapted, existing concepts and new concepts introduced by the authors. We group them under five sub-sections that are linked to the five steps proposed in Section 4.

Scaling is subject to a specific spatial and temporal context
A consistent finding across different sectors is that scaling is influenced by contextual conditions, and that 'one-size-fits-all' approaches are unlikely to be effective (Schut et al., 2014; Sartas et al., 2019; Shelton, 2014; Thornton et al., 2010; Levin, 1998; Innes and Booher, 1999). The innovation systems literature conceptualizes innovation as the outcome of (changes in) interactions between networks of interdependent actors and stakeholders, the socio-technical context in which they operate, and the rules and institutions that govern their interactions (Klerkx et al., 2010). This finding suggests, first, that an innovation that may be appropriate and scalable in one context may not fit another context. Second, an intervention strategy that may effectively support the scaling of innovation in one context may not be effective in another context.
A range of different spatial conditions play a role here, including agro-ecological conditions (Douthwaite et al., 2005) and the socio-institutional features of the innovation system in question (Klerkx et al., 2010; Schut et al., 2016). In addition, several studies have noted that the success of similar R4D interventions may vary considerably over time (Abrahams et al., 2004; Baur et al., 2003; Delisle et al., 2005; Henderson, 2000; Laws et al., 2013; Sartas et al., 2019). In agriculture, for example, the 'green revolution' shows how the scaling of uniform high input use in farming (e.g., fertilizers, improved crop varieties) had differential impact over space and time (Evenson and Gollin, 2003; Hazell and Ramasamy, 1991).
Taken together, these studies imply that we cannot usefully make generic statements about whether an innovation is ready for scaling or not. Thus, Scaling Readiness must assess maturity and scalability in specific spatial and temporal contexts and provide context-specific decision support to develop meaningful scaling strategies.

Taking the scaling context and objectives of R4D interventions as starting point
Stating that an evaluation of scaling readiness must be contextual opens the question of what and whose boundaries of context should be considered. This question relates not only to the geographical location (space) and temporal horizon (time) to be taken into account but also requires defining what may usefully scale and for what underlying objective. For instance, an R4D intervention that aims to decrease vitamin A deficiency in rural communities in tropical areas by scaling of vitamin A rich crops could consider supporting the development of three alternative cropping systems and value chains (e.g., rice, banana and sweet potato) to address the vitamin A deficiency problem. However, each crop has a different set of resource requirements and will lead to different impacts depending on the context. Under similar conditions, the same amount of vitamin A can be produced with the least amount of land if rice is used, with the least amount of water if sweet potato is used, and with the least amount of labour if banana is used. Thus, a scaling strategy for decreasing vitamin A deficiency needs to consider other objectives and conditions as well. Yet another dimension of context concerns who the expected beneficiaries of scaling the innovation are. For example, if women are defined as the main target group, then scaling an innovation that reduces labour by women may make most sense.
While we realize that different views may exist on what the relevant geographies, objectives and beneficiaries should be for a specific scaling initiative, the starting point of our approach is to assess the scaling readiness of innovations based on the expected goals and impacts of a specific R4D intervention (Kuehne et al., 2017; Cohen and Axelrod, 2001). We take this as a practical and legitimate starting point given that our purpose is to provide decision support to R4D interventions. At the same time, we expect that the need to explicate such parameters will foster critical reflection in situations where boundaries, scaling objectives and target beneficiaries are not clearly defined. Those considerations will trigger further reflection within R4D interventions on whether scaling is indeed feasible, desirable and responsible.

Innovations scale as part of packages
As indicated, innovation systems literature emphasizes that the upscaling of specific innovations (e.g., a new crop variety) may simultaneously require the upscaling of other innovations (e.g., a seed multiplication and distribution system), or the downscaling of existing practices (e.g., use of the currently dominant crop variety) (Kilelu et al., 2013; Wigboldus et al., 2016). These dynamics imply that innovations cannot be usefully scaled in isolation but must be regarded as part of a collection of innovations, or an innovation package. The innovation package then becomes the unit of analysis for assessing scaling readiness.

Packages include core and complementary innovations
R4D interventions often focus on scaling a specific core innovation (e.g., a new drug or new crop variety) that is assumed to contribute to a societal benefit. These core innovations often form the heart of an R4D intervention. However, the scaling of core innovations is influenced by interactions with other innovations or conditions that can be either enabling or constraining. We refer to these other innovations as complementary innovations. For instance, scaling a new animal vaccine (the core innovation) also requires (1) new vaccine dosage and application practices; (2) certification from vaccine control agencies; (3) establishing or improving vaccine delivery systems; and (4) education about vaccine characteristics and use (the complementary innovations) (Curry et al., 2013; Paina and Peters, 2012).
What constitutes a meaningful and viable innovation package depends again on the context, which implies that packages can change over time and are likely to differ across locations. Similarly, the composition of an innovation package may need to vary for different beneficiary groups. Using the animal vaccine example again, for countries where resource poor populations are impacted by a specific animal disease, subsidized vaccine distribution through public veterinary services may be an important complementary innovation to ensure equitable and affordable access.
In sum, Scaling Readiness needs to take into account interdependencies between core and complementary innovations and needs to support the characterization and definition of innovation packages that are specific to the R4D intervention: its context, objectives, users and beneficiaries.

The scaling readiness of an innovation is a function of innovation readiness and innovation use
As mentioned in Section 1, the technology readiness levels proposed by NASA and the EU are, in essence, a measure of the maturity of a technology wherein maturity is defined as a demonstrated capacity to perform a specific function or contribute to a specific objective within a research or development environment (e.g., in the laboratory, under controlled conditions or under uncontrolled conditions). Levels of readiness range from an 'idea' to 'innovation that is validated for use in an uncontrolled environment' with in-between gradations of 'proof of concept', 'tested prototype' and 'demonstrated under controlled conditions'.
However, in spite of this elaboration, the maturity scale is not sufficient for understanding the potential of a core innovation and/or an innovation package as a whole and its readiness to go to scale. Many innovations documented as ready have failed to be used at scale, such as improvements to child and maternal health (Althabe et al., 2008) and agroforestry management practices that use fodder shrubs or improve tree fallows (Franzel et al., 2004). In addition, not every innovation may have a demonstrated capacity to perform a specific function or have a desired impact. For example, multi-stakeholder innovation platforms have been increasingly utilized in the agricultural R4D sector to advance innovation and scaling, but evidence of their effectiveness to achieve impact is scarce (Biermann et al., 2007; Cole et al., 2016; Sartas, 2018; Servaes, 2016; Warner, 2006). To put it more simply, innovations with a low potential for achieving impact are sometimes used at scale, whereas innovations with a high potential for achieving impact are not necessarily used at scale. Thus, while it is important to capture the maturity of innovations that are part of an innovation package (i.e., innovation readiness), it is also necessary to incorporate additional variables if we want to fully understand and assess scaling potential.
Inspired by innovation scholars (Geels and Schot, 2007; Leeuwis and Aarts, 2011) and network science (Hermans et al., 2017; Sartas, 2018), we argue that the scaling potential of a core innovation and/or innovation package is, at a given point in time, also shaped by the social networks in which the innovations are embedded, supported and used. In other words, whether or not an innovation is likely to scale depends on who and how many users are already using it, and how such users are positioned in the innovation network. Thus, it makes sense to distinguish between network environments in which the innovation still receives considerable support and protection (e.g., a project or intervention), and network environments in which it has been used without any form of support (e.g., as part of livelihood systems). This thinking aligns with the literature on strategic niche management, which points to the importance of gradually reducing protection of innovation initiatives (niches) over time and the ability of niche-level innovations to reconfigure dominant policies, procedures and practices (regimes) (Hommels et al., 2007; Smith and Raven, 2012).
If innovations are used only by R4D intervention teams, their partners and beneficiaries who are directly linked to or incentivized by the intervention, then the scaling potential is still low, irrespective of the number of team members, partners and direct beneficiaries using those innovations. When we frame the intervention in these terms, it creates a different perspective on claimed scaling achievements such as "this new crop variety is used or adopted by 25,000 farmers in Zambia". Such statements do not reveal much about the performance of the R4D intervention unless we are provided with information on who these farmers are and what their relation to the intervention was. Numbers tell only part of the story (Woltering et al., 2019), and the position of those using innovations in the innovation network is a much better indicator of the innovation's scaling potential. Such a variable also captures whether the innovation users operate within a protected space (controlled environment), or whether they use the innovation in more unprotected conditions (uncontrolled environment) (Smith and Raven, 2012). Therefore, we propose a scaling readiness variable that indicates in what type of networks an innovation or innovation package is already being used. We will refer to this concept as innovation use.
Scaling readiness then becomes a function of innovation readiness and innovation use. In our approach, we use these variables to generate diagnostic information relevant to scaling strategy development. We will further elaborate and operationalise these variables in Section 3.

Identify bottlenecks for scaling strategy development
When we acknowledge that innovations scale as part of innovation packages, then the next step is to try to understand which of the innovations limits the scaling of the innovation package and what is the most resource-efficient strategy to overcome such bottlenecks.

Innovations with low innovation readiness and use constitute bottlenecks for scaling
Liebig's Law of the Minimum is a good way to explain the focus on the innovation package as the unit of analysis for assessing its scaling potential. According to Liebig, plant growth is limited by the nutrient in shortest supply (Austin, 2007; de Baar, 1994; Van Der Ploeg et al., 1999). Therefore, to analyse plant growth, it is necessary to know the relative availability of all nutrients necessary for the growth of the plant, and to assess which of the (micro-) nutrients is constraining efficient nutrient uptake. Similarly, the scaling of innovations is limited by the core or complementary innovation in the innovation package that is least developed or, in other words, forms the bottleneck.
This dynamic is illustrated in the adapted version of 'Liebig's barrel' (Whitson and Walster, 1912) presented in Fig. 1. Each individual innovation can be considered as a stave in the barrel. Just as the capacity of a barrel with staves of unequal length is limited by the shortest stave, so is the scaling potential of an innovation package determined or limited by the innovation with the lowest readiness and use. Different staves are combined by a hoop (the R4D approach) that binds staves as an innovation package. The length of each stave corresponds with innovation readiness, and width of the stave corresponds with innovation use. The higher and wider the staves are, the more water the barrel can hold. In other words, the higher the readiness and use of the innovations, the higher the capacity of the innovation package to achieve impact at scale.
Notably, Fig. 1 has clear implications for strategy development and R4D investment. For example, the scaling of a drought tolerant crop variety may be limited by the absence or poor performance of a seed delivery system. One can continue to invest in plant breeding to improve the drought tolerant crop variety, but as long as the seed delivery system is not being improved, such an investment will not result in any impact at scale. In Fig. 1, this is illustrated by the water drops that drip into the barrel (the R4D scaling investment) and leak from the lowest stave (the bottleneck innovation that limits scaling). Thus, bottleneck analysis can form the basis for prioritizing resource allocation and investment in R4D interventions and can inform strategic and operational decision-making for scaling. Accordingly, Scaling Readiness must be able to support the identification and prioritisation of those innovations in a package that form the bottleneck that prevents achieving a defined impact at scale objective.
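The barrel logic can be illustrated with a minimal sketch. It assumes, purely for illustration, that each innovation carries an integer readiness level and an integer use level and that the two are combined by multiplication; the combination rule, the `Innovation` class, the function names and all level values are hypothetical and are not taken from the measures defined in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Innovation:
    name: str
    readiness: int  # innovation readiness level (stave length in the barrel)
    use: int        # innovation use level (stave width in the barrel)


def scaling_readiness(innovation: Innovation) -> int:
    # Assumption: the two levels are combined by multiplication.
    return innovation.readiness * innovation.use


def find_bottleneck(package: list[Innovation]) -> Innovation:
    # Law of the Minimum: the package is limited by its weakest component.
    return min(package, key=scaling_readiness)


# Hypothetical package and levels, for illustration only.
package = [
    Innovation("drought-tolerant crop variety", readiness=8, use=6),
    Innovation("seed delivery system", readiness=3, use=2),
    Innovation("extension and training material", readiness=6, use=4),
]

bottleneck = find_bottleneck(package)
print(bottleneck.name)  # seed delivery system
```

In this sketch, raising the levels of the crop variety leaves the result unchanged, because the seed delivery system remains the shortest stave. This mirrors the argument that further investment in plant breeding does not yield impact at scale while the delivery system is the bottleneck.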

Strategies for overcoming bottleneck innovations must be realistic and resource-efficient
Once bottleneck innovations are identified, R4D interventions can pursue different options to overcome the bottleneck and increase the impact potential of the innovation package. Strategies for overcoming bottleneck innovations must be realistic in view of available time, human and financial resources. Here, we borrow from the organizational science literature that provides various options for organizations to improve their effectiveness and efficiency when faced with a constraint. Depending on the situation, they can substitute some inputs for others (Wolf et al., 2001; Brach et al., 2012), outsource operations (Roberts et al., 2013; Gunasekaran et al., 2015), choose to invest in design and improvement (Tidd and Bessant, 2018; Snapp and Pound, 2017), or change the locations where they operate (Lin et al., 2016; Reis et al., 2016). We recognize that applying such options in R4D interventions may cause tension, as it can require a degree of flexibility that R4D projects and programs may not have (e.g., Leeuwis and Aarts, 2011; Woltering et al., 2019). Nevertheless, it is important that Scaling Readiness facilitates critical reflection, discussion and prioritisation on a variety of strategic options to overcome bottlenecks to reach the expected objectives.

Scaling requires multi-stakeholder agreement and coalition formation
We have seen that scaling involves the simultaneous up- or downscaling of different innovations in a package. R4D interventions, therefore, cannot realize their scaling ambitions on their own, but are dependent on other actors in the innovation system (Eastwood et al., 2017). These other actors may not necessarily recognize this interdependence and may pursue different and potentially even conflicting goals and interests (Wigboldus et al., 2016; Sahay and Walsham, 2006; Galaz et al., 2008), and, therefore, may not necessarily agree on a proposed scaling strategy. For example, the most efficient short-term strategy for addressing a bottleneck innovation related to the scaling of a drought-tolerant crop variety may be to bypass working with government extension and/or seed systems that face severe capacity constraints. However, this option may not be acceptable for other partners, or may have negative consequences for the long-term sustainability, credibility and impact of the R4D organisation in that specific location or context. As a result of such tensions and interdependencies in innovation systems and the inability of a single stakeholder group to scale innovations in complex livelihood systems, scaling of innovation requires forging agreements and accommodations among interdependent actors. There is a growing body of evidence from different sectors that demonstrates this is a process that requires active facilitation (Leeuwis, 2004; Giller et al., 2008).

Coalition formation requires facilitated learning and negotiation
Forging agreement among interdependent actors about scaling ambitions and scaling strategies amounts to building effective coalitions for change (Biggs and Smith, 1998). A coalition is a network of stakeholders that actively supports change in a particular direction. The stakeholders may support the change for different reasons and objectives, and there may be a continuation of tensions and disagreements on specific issues. Two intertwined processes need to be facilitated. First, stakeholders need to learn about one another's context and perspectives, discover how they depend on one another to fulfil their ambitions, develop common starting points to build upon, and develop mutual relationships and trust (Kahan and Rapoport, 2014). Second, in the process of reaching agreement, there is likely to be tension and conflict, and typically these differences need to be settled through negotiation. Negotiation is essentially a process of giving and taking among the participants with regard to the proposed scaling objectives, pathways and desired outcomes. In this context, Scaling Readiness must anticipate that scaling strategies cannot be usefully imposed by R4D interventions on rational grounds (e.g., bottleneck assessments) and needs to offer facilitated spaces where interdependent innovation system actors can reach agreement regarding the actions and strategies that can support effective scaling of innovation.
Fig. 1. Scaling Readiness Barrel to illustrate how innovation(s) with the lowest readiness limit an innovation package's capacity to achieve impact at scale.
M. Sartas, et al. Agricultural Systems 183 (2020) 102874

Scaling is an emergent and unpredictable process of change
Scaling is a complex process and R4D interventions operate under real and uncontrolled conditions. This combination implies that stakeholders and intervention teams are likely to be confronted with unforeseen developments and with activities that give rise to unintended consequences and outcomes (Geels and Schot, 2007; Schot and Geels, 2008). Moreover, the scaling context is bound to change continuously. Thus, R4D interventions require mechanisms to capture and navigate the ever-changing environment in which they operate.

Scaling processes require reflexive monitoring and learning
Interactions among the innovations in an innovation package evolve with time (Hall and Clark, 2010; Pahl-Wostl, 2007; Paina and Peters, 2012; van den Bergh et al., 2011). The addition of an innovation to an innovation package, or the improvement of an innovation with low readiness for scaling, interacts with and influences other innovations in the package either positively or negatively. As a consequence, scaling processes cannot be simply designed or engineered at the onset of an R4D intervention by projecting the outcomes assuming linear growth (Klerkx et al., 2010; Paina and Peters, 2012). For instance, a new soybean variety introduced to a village by an R4D intervention can be used at scale and lead to an initial increase in the income of farmers producing soybeans since they can sell their produce (positive outcome). However, the success of this initial scaling of the soybean variety may, over time, lead to the collapse of the soybean price and an eventual decrease in the income of the farmers as overproduction results in a market glut (negative outcome) (Gilbert and Morgan, 2010). Sustaining that income growth may require additional innovations (e.g., enabling access to export markets or local value addition) to continue ensuring positive impact, which may not always be possible. In addition, the relations among partners in a scaling coalition may change over time due to internal and external developments.
Scaling Readiness must be sensitive to emergent and unpredictable dynamics in two respects. First, it needs to consider short-term horizons for the design and implementation of scaling strategies by focusing on overcoming the key bottleneck innovations through concrete activities and partnerships. Second, Scaling Readiness should promote continuous and reflexive monitoring, evaluation and learning (Van Mierlo et al., 2010) to assess whether the scaling strategies had the desired effect over a long-term horizon, and to determine if the bottleneck innovations were addressed. The diagnosis of a reconfigured innovation package may result in the identification of new bottlenecks that require additional strategies and further agreement between stakeholders.
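The short-cycle logic of diagnosing, acting and re-diagnosing can be sketched as a simple loop. The sketch below is an illustration under stated assumptions, not the monitoring procedure of Scaling Readiness: the package is represented as a hypothetical mapping from innovation name to a pair of readiness and use levels, the pair is scored by multiplication (an assumption), and the "improvements" stand in for levels that monitoring would observe after a strategy has been implemented.

```python
def bottleneck(package):
    # package maps innovation name -> (readiness level, use level).
    # Assumption: the two levels are combined by multiplication.
    return min(package, key=lambda name: package[name][0] * package[name][1])


def run_cycles(package, improvements):
    """Diagnose the bottleneck, apply one observed change, then re-diagnose."""
    history = [bottleneck(package)]
    for name, new_levels in improvements:
        package = {**package, name: new_levels}  # reconfigured package
        history.append(bottleneck(package))
    return history


# Hypothetical levels, for illustration only.
package = {
    "crop variety": (8, 6),
    "seed delivery system": (3, 2),
    "credit provision": (4, 3),
}
improvements = [("seed delivery system", (6, 5))]

print(run_cycles(package, improvements))
# ['seed delivery system', 'credit provision']
```

Once the seed delivery system has been strengthened, credit provision becomes the new bottleneck in this toy package, which is why each cycle ends with a fresh diagnosis rather than a fixed long-term plan.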

Measures of Scaling Readiness
In this section, we explain how to operationalise and measure the key variables in our approach: innovation readiness, innovation use and scaling readiness. As indicated in Section 2, these variables can only be assessed contextually; that is, in connection with specific R4D intervention goals that are pursued in a particular time and space. This condition implies that an innovation may have a high level of scaling readiness in one location, or at a specific moment in time, or for a specific goal, but at the same time have a low level of scaling readiness in a different location, moment in time, or when directed towards a different goal.

Innovation readiness measurement
In the R4D literature, there are different scales for measuring the maturity of an innovation. Although not all of them refer to the term readiness, these scales do capture the capacity of innovations to perform a specific function, or contribute to a specific objective within a specific context. Examples include service innovation readiness (Yen et al., 2012) and technology readiness (Parasuraman, 2000; Richey et al., 2007; Sauser et al., 2008; Verma & Ramirez-Marquez, 2006). We chose to build our measure of innovation readiness on the technology readiness index developed by NASA (Parasuraman, 2000; Sauser et al., 2008), which has also been adopted by the Horizon 2020 Programme of the European Union (European Commission, 2014). This readiness index is one of the oldest available and has been used for assessing the readiness of technological innovations, and for making strategic research investments. We modified the index categories to make them suitable for assessing both technological and non-technological innovations, thus transforming the technology readiness index into a scale for assessing the readiness of all types of R4D innovations (Table 1). According to the innovation readiness scale, innovations evolve from an 'idea' (level 0) towards being 'ready' (level 9) following a process of research, development, testing and validation in controlled and uncontrolled conditions.

Innovation use measurement
Innovation use measures the current use of the innovations that compose the innovation package in the specific spatial-temporal context where the R4D intervention aims to reach its goals. This metric maps the use of core and complementary innovations by various groups of stakeholders against their position in an innovation network (Fig. 2). If only the intervention team is using the innovation (that it has designed and tested), then the scalability potential of the innovation is low. If other intervention teams and partners or aspired end-users start using the innovation (without being actively involved in its design and testing) then the scalability potential is higher. Different types of innovations (e.g., products, practices, services, organizational and institutional arrangements) may have different user groups. For example, a farmer would be the typical end-user of a new crop variety, whereas a seed multiplier would be the typical end-user of a system that promotes certified seed multiplication. It is important to mention that typical end-users of R4D innovations, such as farmers, can also be part of the intervention team or its effective partners if they are directly engaged in the design and testing of the innovation, or incentivized by the R4D intervention to use the innovation.
The stakeholder characterization follows a modified version of the innovation network-based typology offered by Sartas et al. (2018).
The first group of stakeholders are those directly involved in innovation development, testing and validation and who have direct influence over the strategizing and implementation of the R4D activities. We refer to these stakeholders as the intervention team. In R4D, this group typically includes managers of the R4D interventions, researchers and research or development practitioners and support staff.
The first progression towards scaling is the use of the innovation(s) by the next group of stakeholders, who we call effective partners. Effective partners consist of stakeholders who directly collaborate with the intervention team within the R4D intervention. The key difference is that they do not have the same direct influence on the R4D activities as the intervention team. They are effective in the sense that they make explicit contributions to improving innovation readiness and/or the innovation use. Effective partners typically consist of representatives of government, civil society, and the private sector, and representatives of the innovation's end-users, such as farmers, patients, technicians or small-scale business managers.
The next progression is the use of the innovation(s) by stakeholders who influence the R4D intervention and its activities although they are not directly involved in the innovation design and testing. We refer to this group as innovation network stakeholders. They are typically the management of R4D organizations and the effective partner organizations, such as Ministers and high-level officials, directors of non-governmental organizations, CEOs of companies, or established opinion leaders.
M. Sartas, et al. Agricultural Systems 183 (2020) 102874
A next progression is the use of the innovation(s) by stakeholders who are not collaborating directly with the R4D intervention teams, effective partners, nor innovation network stakeholders. These stakeholders do not have any direct connection with or influence on the R4D intervention activities. However, their own work can have important implications for scaling the innovation, as they are also in the business of designing, testing and validating innovations that can be complementary to or competing with the innovation. We refer to them as stakeholders in the innovation system which are typically other R4D intervention teams and their effective partners operating in the same spatial-temporal context, and working on similar innovations or scaling objectives.
A final progression towards scaling is the use of the innovation(s) by stakeholders who were not involved in any R4D interventions or activities and had no influence on the R4D activities. We refer to them as stakeholders in the livelihood system since they include all the remaining actors who are not part of innovation systems. Innovation use by stakeholders or beneficiaries in the livelihood system is a key outcome for many R4D interventions that aim to scale innovations for achieving impact. Similar to innovation readiness, innovation use is measured using a 9-level scale (Table 2).
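The progression of stakeholder groups maps onto the innovation use scores of Table 2 in a regular way: each group beyond the intervention team occupies two consecutive levels, one for rare use and one for common use. The following minimal Python sketch makes that mapping explicit. The function name and the string labels are illustrative assumptions and are not part of any Scaling Readiness tooling.

```python
from typing import Optional

# Stakeholder groups beyond the intervention team, ordered by increasing
# distance from the R4D intervention (Table 2). Labels are illustrative.
OUTER_GROUPS = (
    "effective partners",
    "innovation network",
    "innovation system",
    "livelihood system",
)


def innovation_use_score(group: Optional[str], common: bool = False) -> int:
    """Return the 0-9 innovation use score for the outermost group using the innovation.

    group  -- None if nobody uses the innovation, "intervention team",
              or one of OUTER_GROUPS
    common -- True if use within that group is common, False if it is rare
              (ignored for None and for the intervention team)
    """
    if group is None:
        return 0
    if group == "intervention team":
        return 1
    if group not in OUTER_GROUPS:
        raise ValueError(f"unknown stakeholder group: {group!r}")
    # Each outer group occupies two levels: rare (even score), common (odd score).
    return 2 + 2 * OUTER_GROUPS.index(group) + (1 if common else 0)
```

For example, `innovation_use_score("innovation system")` gives 6 (rare use in the innovation system), and `innovation_use_score("livelihood system", common=True)` gives 9.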

Scaling readiness measurement
While innovation readiness and innovation use are assessed separately for different (core and complementary) innovations in a package, the unit of analysis for assessing scaling readiness is the innovation package as a whole. Following Liebig's Law of the Minimum (Fig. 1; Section 2.3.1), the overall potential of the innovation package to have impact at scale is determined by the innovation in that package with the lowest combined innovation readiness and innovation use score. As indicated earlier, this is assessed by multiplying the two separate scores for each innovation in the package. The lowest product is referred to as the scaling readiness score of the overall innovation package. We propose that innovation readiness and innovation use have the same weight in determining the potential impact at scale of an innovation package. Having one simple measure of scaling readiness allows for comparison and aggregation across a portfolio of innovation packages and R4D interventions. The various measurements discussed so far can be visualized as a two-dimensional graph (Fig. 3) that presents all the key concepts and measures of Scaling Readiness, using a stylised innovation package with a new crop variety as the core innovation. This example represents the different real-life innovation packages in root, tuber and banana cropping systems (Bentley et al., 2018).
In Fig. 3, the overall scaling readiness of the innovation package is 2, based on the innovation with the lowest product of innovation readiness and innovation use: the novel seed quality assurance policy (innovation readiness 2 × innovation use 1 = 2). This innovation forms the bottleneck for this innovation package to contribute to impact at scale in a defined space and time and for a specific objective.
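The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class and function names are assumptions, and all levels in the example package are hypothetical except those of the seed quality assurance policy, which are taken from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Innovation:
    name: str
    readiness: int  # innovation readiness level, 0-9 (Table 1)
    use: int        # innovation use level, 0-9 (Table 2)

    def __post_init__(self) -> None:
        # Both scales run from 0 to 9.
        for label, value in (("readiness", self.readiness), ("use", self.use)):
            if not 0 <= value <= 9:
                raise ValueError(f"{label} must be between 0 and 9, got {value}")

    @property
    def score(self) -> int:
        # Readiness and use carry equal weight, so the score is their product.
        return self.readiness * self.use


def scaling_readiness(package: Iterable[Innovation]) -> Tuple[int, Innovation]:
    """Return the package's scaling readiness score and its bottleneck innovation.

    Following the Law of the Minimum, the package score is the lowest
    readiness x use product among its innovations.
    """
    bottleneck = min(package, key=lambda innovation: innovation.score)
    return bottleneck.score, bottleneck


package = [
    Innovation("new crop variety", readiness=8, use=6),    # hypothetical levels
    Innovation("extension service", readiness=6, use=4),   # hypothetical innovation
    Innovation("novel seed quality assurance policy", readiness=2, use=1),  # from the text
]
score, bottleneck = scaling_readiness(package)
print(score, bottleneck.name)  # 2 novel seed quality assurance policy
```

Returning the bottleneck together with the score keeps the diagnosis actionable, since the strategy work targets the bottleneck innovation rather than the package average.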

Table 1
Innovation readiness levels, basic descriptions and the type of science and evidence to support readiness level claims.
Level | Name | Basic description | Type of science | Evidence
4 | Application Model (unproven) | Researching the capacity of the innovation to meet specific goals using existing applied-science-evidence. | Basic science | Generic
5 | Application Model (proven) | Validation of the capacity of the innovation to meet specific goals using existing applied science evidence. | Applied science | Generic
6 | Application (unproven) | Testing of the capacity of the innovation to meet specific goals within a controlled environment that reflects the specific spatial-temporal context in which the innovation is to contribute to achieving impact. | Applied science | Generic
7 | Application (proven) | Validation of the capacity of the innovation to meet specific goals within a controlled environment that reflects the specific spatial-temporal context in which the innovation is to contribute to achieving impact. | Applied science (controlled) | Specific to intervention context
8 | Incubation | Testing the capacity of the innovation to meet specific goals or impact in natural/real/uncontrolled conditions in the specific spatial-temporal context in which the innovation is to contribute to achieving impact with support from an R4D. | Applied science | Specific to intervention context
9 | Ready | Validation of the capacity of the innovation to meet specific goals or impact in natural/real/uncontrolled conditions in the specific spatial-temporal context in which the innovation is to contribute to achieving impact without support from an R4D. | Applied science (uncontrolled) | Specific to intervention context

Scaling readiness diagnosis merits independent and evidence-based assessment
Now that we have presented the various measures in Scaling Readiness in more detail, it is important to reflect briefly on how and by whom such assessments may be made. In R4D, the design and scaling of health, agricultural, environmental and other societal innovations often depend on continuous coordinated support from donor-funded interventions (Sartas, 2018). Sustaining this support depends on, among other factors, the perceived potential and impact of the innovations at scale, and the progress achieved by researchers and innovation developers during previous interventions. Therefore, such closely involved parties are likely to have an interest in overstating innovation impact potential towards donors. This possibility can create a conflict of interest when assessing innovation readiness and innovation use (Suri, 2011; Vera-Cruz et al., 2008). In Scaling Readiness, therefore, documented evidence (e.g., scientific papers demonstrating proof-of-concept, data collected through rigorous and/or independent monitoring and evaluation systems) is required to support claims of innovation readiness and innovation use levels. Whenever such documents are not accessible, experts are requested to provide their judgements. We seek to minimize self-reporting biases by encouraging the assessment of innovation readiness and innovation use by independent experts (Grinstein & Goldman, 2006).

Proposed practical uses of Scaling Readiness
In this section we discuss how Scaling Readiness and its key concepts and measures can be used in practice to contribute to more efficient scaling interventions in R4D. Our discussion is informed by initial testing of the key concepts of Scaling Readiness through several pilot cases. We are currently validating Scaling Readiness in several R4D interventions that aim to scale agricultural innovations, including the scaling of crop varieties, post-harvest technologies, digital decision-support tools, and crop management practices.

Fig. 2. Stakeholder typology for those involved in innovation design, testing, validation, and use, based on a network approach.

Table 2
Innovation use scores, levels and their basic description.
Innovation use score | Innovation use level | Description
0 | None | Innovation is not used for achieving the objective of the intervention in the specific spatial-temporal context where the innovation is to contribute to achieving impact
1 | Intervention team | Innovation is only used by the intervention team who are developing the R4D intervention
2 | Effective partners (rare) | Innovation has some use by effective partners who are involved in the R4D intervention
3 | Effective partners (common) | Innovation is commonly used by effective partners who are involved in the R4D intervention
4 | Innovation network (rare) | Innovation has some use by stakeholders who are not directly involved in the R4D intervention but are connected to the effective partners
5 | Innovation network (common) | Innovation is commonly used by stakeholders who are not directly involved in the R4D intervention but are connected to the effective partners
6 | Innovation system (rare) | Innovation has some use by stakeholders who work on developing similar, complementary or competing innovations but who are not directly connected to the effective partners
7 | Innovation system (common) | Innovation is commonly used by stakeholders who are developing similar, complementary or competing innovations but who are not directly connected to the effective partners
8 | Livelihood system (rare) | Innovation has some use by stakeholders who are not in any way involved in or linked to the development of the R4D innovation
9 | Livelihood system (common) | Innovation is commonly used by stakeholders who are not in any way involved in or linked to the development of the R4D innovation

Fig. 3. Stylized example of an innovation package (with 8 innovations) that have been assessed for their innovation readiness (y-axis) and innovation use (x-axis) specific to space, time and R4D intervention goals.

Proposition 1: Scaling Readiness can support the development of better-informed scaling strategies for R4D interventions
One of the aims of Scaling Readiness is to assist R4D interventions in developing, implementing and monitoring scaling strategies in a structured and evidence-based way. To this end, we propose an iterative cycle of five steps that builds on key concepts, measures and requirements discussed so far (Fig. 4).
Step 1: Characterize - This first step is to characterize the innovation system in the spatial and temporal context in which scaling is to deliver impact. Characterization supports R4D intervention teams by clearly defining their goals, available resources (time, budget) and scaling objectives. This step also supports defining the innovation package by identifying the core and complementary innovations that are required to contribute to impact at scale. The failure to unpack innovation packages has frequently been reported as a key constraint for enhancing scaling and impact in the R4D sector (Mangham and Hanson, 2010; Moore et al., 2015; Paina and Peters, 2012). In Scaling Readiness, visualizations of different innovations and their interdependency are used in both conceptual and empirical ways, which was highly appreciated by R4D intervention teams involved in the initial testing and validation of Scaling Readiness.
Step 2: Diagnose - Diagnosis supports independent and evidence-based assessment of the individual innovations in the package in terms of their innovation readiness and innovation use. This step subsequently helps to identify bottleneck innovations that keep the innovation package from contributing to the defined goals and impact in a particular context. Schut et al. (2016) reported how R4D actors and organizations are often unable or unwilling to work on bottlenecks for innovation and scaling, as they perceive this to be outside of their organisation's mandate or comfort zone. Stronger evidence-based identification of bottlenecks for innovation and scaling may translate into real action by R4D establishments to invest in activities and partnerships to overcome bottlenecks.
Step 3: Strategize - The diagnosis of the innovation package and the identification of scaling bottlenecks is meant to kick-start a process of scaling strategy development. The position of the bottlenecks in the Scaling Readiness graph provides valuable information on what kinds of investments, activities and partnerships are required to improve their innovation readiness and/or innovation use. For each innovation with low readiness or use (Fig. 3) the intervention team can discuss how the innovation can be moved along the x-axis (e.g., What strategies can be used to access and engage other innovation networks?) or along the y-axis (e.g., How can we generate further evidence about the innovation's capacity to contribute to impact?). Similarly, the strategic options listed below support critical reflection on how best to overcome the bottlenecks to scaling, and what kinds of partnerships and activities are most essential:
1. Substitute: Can the bottleneck be replaced by another innovation with higher readiness and/or use in the given context?
2. Outsource: Are there organizations or external experts that can more efficiently improve the Scaling Readiness of the bottleneck?
3. Develop: Can the intervention team improve the readiness and/or the use by investing available intervention capacities and resources?
4. Relocate: Can the intervention objectives be realized more effectively in another location where innovations have higher readiness and use levels?
5. Reorient: Can the objective or outcome of the intervention be reconsidered if addressing the bottleneck is not possible and relocation is not an option?
6. Postpone: Can scaling the innovation package be achieved at a later point in time?
7. Stop: If none of the above strategic options are feasible, should the team consider stopping the intervention?

Fig. 4. Scaling Readiness proposes a stepwise approach to support the development of better-informed scaling strategies for R4D interventions.
These strategic options are ranked in terms of the resources necessary to implement them from the least to most resource demanding. To achieve maximum efficiency, R4D interventions are advised to use the least resource demanding option to address bottleneck innovations. Strategic options 1-3 focus on addressing the core or complementary innovation with the lowest innovation readiness or use (the bottleneck). Strategic options 4-6 focus on finding improved conditions for the intervention as a whole and in a different space (relocate), objective (reorient), or time (postpone). Strategic option 7 is a last resort when finding better conditions is not possible.
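Because the options are ranked from least to most resource demanding, choosing among them amounts to taking the first feasible option in that order. The sketch below illustrates this under one assumption: that the intervention team has already judged which options are feasible. The function name and the lowercase option labels are illustrative.

```python
from typing import Iterable

# Strategic options in the order given in the text, from least to most
# resource demanding; "stop" is the last resort.
STRATEGIC_OPTIONS = (
    "substitute",
    "outsource",
    "develop",
    "relocate",
    "reorient",
    "postpone",
    "stop",
)


def least_demanding_option(feasible: Iterable[str]) -> str:
    """Return the first feasible option in ranked order, or "stop" if none is feasible."""
    feasible_set = set(feasible)
    unknown = feasible_set - set(STRATEGIC_OPTIONS)
    if unknown:
        raise ValueError(f"unknown strategic options: {sorted(unknown)}")
    for option in STRATEGIC_OPTIONS[:-1]:
        if option in feasible_set:
            return option
    return "stop"
```

For instance, if the team judges only "develop" and "relocate" to be feasible, the function returns "develop".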
Step 4: Agree - The proposed scaling strategy needs to be shared and agreed upon with the broader R4D intervention partners and other stakeholders, such as donors. This step ensures sufficient buy-in for the proposed strategy and validates whether the implementation of the strategy is technically feasible and socially and politically acceptable. If the proposed scaling strategy is found unfeasible or undesirable, then the strategic options mentioned above should be reconsidered, and the team moves back to Step 3. When the scaling strategy is agreed upon, then a scaling action plan can be developed to provide details of the types of partnerships and activities to address the core bottleneck(s). Available time and financial resources will determine whether overcoming the bottleneck within the boundaries of the R4D intervention is realistic, and if it merits additional resource mobilization efforts.
Step 5: Navigate - If agreement is reached on a scaling strategy and a scaling action plan, the implementation and monitoring of the agreed-upon activities begins. Scaling Readiness facilitates and monitors the scaling strategy and action plan implementation through a process of reflexive monitoring and learning. This process requires that R4D intervention teams periodically reflect on the implementation of the scaling strategy and action plan and update these plans, if necessary, to guide implementation towards the desired results. Monitoring can be based on short-term feedback loops that guide the implementation of the scaling action plan, but also on long-term feedback loops to see if the scaling strategy has had the desired effect in terms of increasing the scaling readiness of the innovation package. This long-term feedback loop would require going through another Characterize (Step 1) and Diagnose (Step 2) effort.
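The five steps and their two feedback paths (from Agree back to Strategize when no agreement is reached, and from Navigate back to Characterize for the long-term loop) can be read as a small state machine. The following sketch is one possible encoding; the names `Step` and `next_step` are assumptions made for illustration.

```python
from enum import Enum


class Step(Enum):
    CHARACTERIZE = 1
    DIAGNOSE = 2
    STRATEGIZE = 3
    AGREE = 4
    NAVIGATE = 5


def next_step(current: Step, agreed: bool = True) -> Step:
    """Return the step that follows `current` in the iterative cycle.

    agreed -- only consulted at the Agree step: if stakeholders reject the
              proposed strategy, the team moves back to Strategize.
    """
    if current is Step.AGREE and not agreed:
        return Step.STRATEGIZE
    if current is Step.NAVIGATE:
        # The long-term feedback loop restarts the cycle at Characterize.
        return Step.CHARACTERIZE
    return Step(current.value + 1)
```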

Proposition 2: Scaling Readiness can support the management of R4D portfolios and investments
Many organizations involved in the R4D sector manage not just one, but multiple R4D interventions as part of their portfolio. With increasing pressure to show returns on investment and impact at scale (Renkow and Byerlee, 2010; Woltering et al., 2019), these R4D organizations can use the key concepts and measures of Scaling Readiness to monitor and manage their R4D portfolio and guide investments to increase the overall scaling readiness of their innovation portfolio. This can be accomplished in two ways.
First, through its standardized measures, Scaling Readiness can support comparison of different innovation packages for particular spatial and temporal contexts and goals. Based on their bottleneck diagnosis (Step 2 of Scaling Readiness), an innovation package can be positioned in a low impact potential at scale zone (innovation package I), in a medium impact potential at scale zone (innovation packages II, III and VI), or in a high impact potential at scale zone (innovation packages IV and V). Based on its overall score, development-focussed investments in improving innovation readiness or dissemination-focussed investments in improving innovation use may be prioritised (Fig. 5). This can help R4D organizations identify the innovation packages with the highest potential to achieve impact at scale. It can also be useful for R4D portfolio managers and decision-makers to consider impact potential at scale zones to compare different innovation packages and the type of investments that may be required to further improve their scaling readiness. Innovation packages with the same scaling readiness scores might need very different strategies and partnerships to improve their readiness or use. For example, in Fig. 5, although the bottlenecks in innovation packages III and VI are both positioned in the medium impact potential at scale zone, the required strategies to bring them to the high impact potential at scale zone can be different for each package. For innovation package III, investments with a focus on dissemination to increase the innovation use are required, whereas for innovation package VI, investments with a focus on developing and increasing innovation readiness are desirable. This comparative analysis may facilitate resource allocation decisions or prioritisation of investments in R4D organizations that have both research (increase innovation readiness) and delivery (increase innovation use) mandates.
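The distinction between development-focussed and dissemination-focussed investment can be illustrated with a simple rule of thumb applied to the bottleneck innovation of a package. The rule below (invest in whichever of the two dimensions scores lower) is an assumption made for illustration and is not a decision rule stated by Scaling Readiness; the function name is likewise hypothetical.

```python
def investment_focus(readiness: int, use: int) -> str:
    """Suggest an investment focus for a bottleneck innovation.

    Assumed heuristic: strengthen the weaker dimension. Low readiness calls
    for development-focussed investment, low use for dissemination-focussed
    investment, and equal levels for both.
    """
    for label, value in (("readiness", readiness), ("use", use)):
        if not 0 <= value <= 9:
            raise ValueError(f"{label} must be between 0 and 9, got {value}")
    if readiness < use:
        return "development"
    if use < readiness:
        return "dissemination"
    return "development and dissemination"
```

Two bottlenecks with the same product, for example readiness 8 with use 3 and readiness 3 with use 8, would receive different suggestions ("dissemination" and "development" respectively), which mirrors the point that equal scores can call for different strategies.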
Second, using Scaling Readiness continuously can reveal if the scaling readiness of an innovation package increases, stays the same, or decreases over time as a result of R4D investments and/or changing innovation systems context. This information can help portfolio managers make smarter decisions about where to prioritise their investments. For example, innovation packages with rapidly increasing scaling readiness could be allocated extra investment. Innovation packages with substantial investment but without significant progress in terms of their increased innovation readiness and/or use can be put on hold, under closer monitoring, or stopped. This use of Scaling Readiness can also facilitate so-called stage-gate management of an innovation portfolio in which the scaling readiness scores can inform decision-makers and/or R4D donors whether or not an investment has resulted in the desirable increase in innovation readiness and/or innovation use.
Here it is important to emphasize that the scaling readiness of individual innovations and innovation packages can increase or decrease due to a variety of factors. Readiness and use of complementary innovations in an innovation package (for example, innovations that provide access to finance or to markets) may change as a result of a financial crisis or the bankruptcy of an important agricultural business in a specific region. Although such factors are often beyond the direct influence of the R4D intervention team, they need to be taken into account as they form new bottlenecks for scaling innovation which may require a reorientation of resources, activities and partnerships.
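Tracking whether a package's scaling readiness increases, stays the same, or decreases across repeated assessments can be sketched as follows. This is a minimal illustration that compares the first and the latest assessment only; the function name is an assumption, and any threshold for what counts as "rapid" or "significant" progress would have to be set by the portfolio manager.

```python
from typing import Sequence


def readiness_trend(scores: Sequence[int]) -> str:
    """Classify the change in scaling readiness between the first and latest assessment."""
    if len(scores) < 2:
        raise ValueError("at least two assessments are needed to determine a trend")
    change = scores[-1] - scores[0]
    if change > 0:
        return "increasing"
    if change < 0:
        return "decreasing"
    return "stable"
```

For example, a package assessed at 2, then 6, then 12 is classified as "increasing", whereas one that drops from 12 to 6 after an external shock is classified as "decreasing".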

Conclusion
Scaling of innovations is the outcome of complex dynamics in innovation and livelihood systems and scaling strategies need to recognize the characteristics of complex systems dynamics. To connect the science and the practice of scaling, there is a need to develop complexity-aware models that can guide operational and strategic decision-making on scaling of innovation in R4D. Scaling Readiness incorporates key concepts and measures to: (1) characterize innovation systems, R4D interventions and their scaling goals, and innovation packages; (2) diagnose the scaling readiness of innovation packages as a function of their innovation readiness and innovation use; (3) support strategy development to overcome the main bottlenecks for scaling; (4) guide stakeholder agreement and coalition formation for scaling strategy implementation; and (5) navigate R4D interventions in effective scaling action through reflexive monitoring and adaptive management. In doing so, Scaling Readiness has the potential to enhance the scaling performance of R4D interventions and can support portfolio management of innovations and scaling investments.
The initial testing and validation of Scaling Readiness key concepts, measures and practices within pilot projects in the agricultural R4D sector has been very promising and intervention teams, to date, have greatly appreciated the stepwise and action-oriented process that Scaling Readiness facilitates. Scaling Readiness Guidelines and a Scaling Readiness Web-portal (www.scalingreadiness.org) have been developed to enable access to the first generation of Scaling Readiness methods and tools. A next step is documenting the rigorous application of Scaling Readiness in R4D interventions and analysing its contribution to the intervention's scaling performance to demonstrate its value to enhance impact in the R4D sector.

Funding
This work was carried out as an integral part of the CGIAR Research Program on Roots, Tubers and Bananas (RTB). We would like to acknowledge the CGIAR Fund Donors (https://www.cgiar.org/funders/) for their provision of core funding without which this work would not have been possible.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.