Do Scaling Agile Frameworks Address Global Software Development Risks? An Empirical Study

Driven by the need to coordinate the activities of multiple agile development teams cooperating to produce a large software product, software-intensive organizations are turning to scaling agile software development frameworks. Despite the growing adoption of various scaling agile frameworks, there is little empirical evidence of how effective their practices are in mitigating risk, especially in global software development (GSD), where project failure is a known problem. In this study, we develop a GSD Risk Catalog of 63 risks to assess the degree to which two scaling agile frameworks, Disciplined Agile Delivery (DAD) and the Scaled Agile Framework (SAFe), address software project risks in GSD. We examined data from two longitudinal case studies, each implementing one of the frameworks, to identify the extent to which the framework practices address GSD risks. Scaling agile frameworks appear to help companies eliminate or mitigate many traditional risks in GSD, especially those relating to users and customers. However, several important risks were not eliminated or mitigated. These persistent risks belonged mainly to the Environment quadrant, highlighting the inherent risk in developing software across geographic boundaries. Perhaps these frameworks (and arguably any framework) would have difficulty alleviating issues that appear to be outside the immediate control of the organization.

distribution, cultural differences, and communication infrastructure [12,13]. As many organizations are now adopting agile or hybrid development methodologies [14,15] in globally distributed organizations [16] that are scaling [17], we ask, How does the adoption of scaling agile framework practices address global software development risks?
To answer this question, we present an industrial multiple case study [18,19] of the adoption of two scaling agile frameworks: Disciplined Agile Delivery (DAD) [20] and the Scaled Agile Framework (SAFe) [21].
In the next section, we provide background on project risks in Global Software Development, and give a short overview of our two scaling agile frameworks, DAD and SAFe. Following that, in Section 3, we describe our methodology, and present our results in Section 4. We discuss the implications of the results in Section 5, along with the study limitations, and conclude with a summary of our study and contribution in Section 6.

Background
In this section, we look to the literature to provide an outline and define types of risk in global software development projects, and then consider some of the agile frameworks that support large-scale software development. We complete this section with our research question derived from the need to have a better understanding of the impact these new frameworks might have on project success or failure.

Software Project Risk
The literature on software development risks extends back some years; for instance, Boehm identified a top 10 list of software project risks in 1991 [22]. In the early 2000s a rigorous set of studies combining a Delphi study and survey methods resulted in a comprehensive set of software project risks [11,23,24]. Wallace and Keil [11] structured those within a "risk categorization framework" which they derived from the Delphi study and validated through a substantial survey of over 500 project managers. That framework comprised 53 project risks mapped to the four quadrants shown in Fig. 1.
Each of the quadrants in Fig. 1 represents a focus on concerns associated with grouped aspects of risk. For example, Customer Mandate is perceived as having a high level of importance, and a low level of perceived control, whereas Execution is perceived as being of moderate importance, with a high level of perceived control. The Customer Mandate quadrant, for instance, concerns "risk factors relating to customers and users, including lack of top management commitment and inadequate user involvement" [23]. The quadrant Scope and Requirements involves "the ambiguity and uncertainties that arise in establishing the project's scope and requirements" [11,23]. Execution concerns "the actual execution of the project... and many of the traditional pitfalls associated with poor project management" [11,23]. The Environment quadrant focuses on risks associated with the external or internal environment, "including changes in organizational management that might affect a project" [11]. The full list of the 53 risks (plus the 10 new GSD risks we add as part of this study) is shown in Fig. 2.
Apart from some work by Gotterbarn and Rogerson [25] on ethical risk analysis, the subsequent focus of attention was on software project failures, and contributors to software project outcomes [26,27]. As noted in the framework for understanding influences on software systems development and deployment project outcomes developed by McLeod and MacDonell [26], four broad areas of focus were identified: institutional context, people and action, development processes, and project content. Lehtinen et al. [27], through root cause analysis, followed on the work by McLeod and MacDonell to identify the causes and effects of project failures, adopting a four-element framework of people, methods, task, and environment to guide their analysis. While there is no direct correspondence to Wallace and Keil's four quadrants [11], the coverage and similarities in both of these subsequent studies are strong, suggesting that the Wallace and Keil framework still has validity and is a good candidate for evaluating how software risks are addressed in a scaling agile implementation. Wallace and Keil's risk categorization framework, with its tabulation of risks by quadrant, also supported detailed and aggregated mapping of risks addressed for each method (Section 3).
Indeed, more recent work on outsourced and global software development risks [28,13] again drew on the work of Wallace and Keil [11]. The outcome was a dual conceptual risk framework for outsourced project risks comprising the four elements from Wallace and Keil, and some additional outsourcing-specific risks. 8 The mapping of global software development risks by Verner and colleagues [13] resulted in four risk components from the outsourced risk framework being identified (project scope and requirements, project execution, project planning and control, and organizational environment). A second mapping to the ISO 12207 software lifecycle process, also by Verner and colleagues [13], identified organizational management, development process, acquisition process, and training, but again was consistent with aspects of Wallace and Keil [11].
This is an authors' preprint. Please cite as: Sarah Beecham, Tony Clear, Ramesh Lal, and John Noll (2020) "Do Scaling Agile Frameworks Address Global Software Development Risks?" Journal of Systems and Software, Special Issue on Global Software Engineering.
In their review Verner and colleagues [13] identified a broader range of GSD-specific risks and mitigations: high level and detailed level GSD vendor selection risks; requirements engineering risks; software development process risks; architectural design risks; configuration management risks; culture and social integration risks; training risks; communication and collaboration risks; planning risks; coordination risks, and control risks.
Verner et al.'s [13] set of risks is an alternative candidate for assessing how well our scaling agile implementations (in our two case studies) addressed GSD risks. Indeed, Verner and colleagues [13] suggested that with some development the Abdullah and Verner outsourcing framework [28] could be applied as a "useful framework for GSD projects." Verner et al.'s review [13] reflects the work of 24 separate studies of risk in GSD, including frameworks for distributed software development threats in Ågerfalk et al. [29], and managing risks in distributed software projects in Persson et al. [12]. Since Verner et al.'s systematic literature review synthesises the body of work in GSD risk (resulting in 85 risks), all the risks mentioned are also considered in our assessment of scaling agile frameworks' resistance to risk in a GSD context (see Section 3). Another candidate set of risks for GSD was enumerated by Chandli and colleagues [6]; however, they focus solely on project management risks, whereas we wanted to take a wider perspective to include technical risks. So, since other candidate risk frameworks were either incomplete or failed to validate their findings to the same degree as Wallace and Keil, we chose the Wallace and Keil framework of four quadrants to serve as a frame for incorporating the GSD-related risks detailed in Verner et al. [13]. The combined view, fully described in Section 4 and summarised in Fig. 2, shows how we augment Wallace and Keil's framework to cater for Global Software Development risks.
Key insights from this combination indicate that the risks identified in the Customer Mandate quadrant are not apparent in other papers on risk we have reviewed here, apart from some coverage in the "GSD vendor selection" [28] grouping, and the more recent distributed agile development risk observation of "requirements conflicts amongst multiple product owners" [30]. In keeping with this observation, Verner et al., in their tertiary review, criticise the limited focus on developer and vendor perspectives and observe that "the client is pretty much ignored" [13] in the literature. Perhaps the vendor perspective leads to a proxy customer situation, which historically has been generally well aligned within a vendor organisation; however, in a distributed agile setting this can lead to conflict between product owners [17]. In our aim to create a comprehensive risk catalog, we found no additional risks relating to the Customer Mandate and Scope and Requirements quadrants, both of which are thoroughly covered by Wallace and Keil [11]. Unsurprisingly, the gaps were observed in the Environment and Execution quadrants. In our GSD risk mapping exercise, we were able to categorise all of the risks (or threats) listed in Verner et al. [13] according to the Wallace and Keil [11] four quadrants, despite many of the GSD risks appearing at different levels of granularity, and several being presented as compound risks. The new GSD risks are Delays caused by global distance, Lack of architecture-organization alignment, Lack of face-to-face interaction inhibits knowledge sharing, Lack of process alignment, Lack of tool/infrastructure alignment, and Unstable country/regional political/economic environment, all of which fall under the Environment quadrant. We also have three new risks in the Execution quadrant, namely Ineffective collaboration, Ineffective coordination, and Lack of trust [31].
In summary, the overall degree of commonality of the other frameworks with the Wallace and Keil framework [11] was sufficient to favor this more established framework for our comparison, augmented with new GSD-specific risks from Verner et al. [13].

Scaling Agile Frameworks
In response to the difficulty of introducing traditional agile methods (originally designed for small co-located teams) into large-scale projects and organizations [32], several scaling agile frameworks have emerged. These frameworks attempt to scale agile practices for enterprise-wide agility, to include agile for distributed teams, large projects and critical systems [33]. Indeed, twenty such frameworks were identified by Uludağ et al. [34], of which the Scaled Agile Framework and DAD are among the more popular (according to the 12th annual State of Agile report [35]).
There are debates over the precise definition of "large scale agile," with Dikert et al. [36] observing that "what is seen as large-scale depends very much on the context and the person defining it." Kalenda et al. [37] distinguish between "large scale" and "very large scale" and consider a number of further aspects of scale. However, in our study, we opt for the simplicity and comparability of a definition based on number of people, teams and locations. Therefore, we build on the definition of "large scale" proposed by Dikert et al. [36], and add the stipulation that large scale agile GSD explicitly incorporates a global focus. Our definition thus covers "software development organizations with 50 or more people or at least six teams" [36], where these people or teams must work across sites located in at least two different countries. Development companies, therefore, follow a geographically separated country or company sourcing strategy according to Vizcaino et al.'s GSD ontology [38, p. 74].
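The working definition above can be read as a simple predicate. The following is a minimal sketch of that reading, not an instrument used in the study: the `Organization` record, its field names, and the example organizations ("Country A", "Country B") are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organization:
    """Hypothetical record of an organization's size and geographic spread."""
    people: int
    teams: int
    countries: frozenset  # countries hosting at least one development site


def is_large_scale(org: Organization) -> bool:
    # Threshold quoted from Dikert et al.: 50 or more people OR at least six teams.
    return org.people >= 50 or org.teams >= 6


def is_large_scale_agile_gsd(org: Organization) -> bool:
    # Added stipulation: sites located in at least two different countries.
    return is_large_scale(org) and len(org.countries) >= 2


# Illustrative (invented) organizations.
single_site = Organization(120, 10, frozenset({"Country A"}))
distributed = Organization(40, 6, frozenset({"Country A", "Country B"}))
small_global = Organization(20, 3, frozenset({"Country A", "Country B"}))

for org in (single_site, distributed, small_global):
    print(org.people, org.teams, len(org.countries), is_large_scale_agile_gsd(org))
```

Under this reading, a large single-country organization and a small two-country organization both fall outside the definition; only the organization that meets the size threshold and spans two countries qualifies.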
Our research is guided by the observation that "scaling isn't easy; large projects often are globally distributed and have many teams that need to collaborate and coordinate" [33]. In their column on Scaling Agile [33], Chris Ebert and Maria Paasivaara note that there is little support in the empirical literature on large-scale agile practice transformation.
The two scaling agile frameworks in our study both consider risk. In their introduction to Disciplined Agile Delivery, Ambler and Lines note: "The Disciplined Agile Delivery (DAD) process framework is a people-first, learning-oriented hybrid agile approach to IT solution delivery. It has a risk-value life cycle, is goal-driven, and is enterprise aware" [20, emphasis added]. In a similar vein, the Scaled Agile Framework (SAFe) places great emphasis on risk mitigation. For example, on the first day of SAFe's Program Increment (PI) Planning ceremony 9 "teams identify risks and dependencies and draft their initial team PI objectives" (emphasis added). SAFe stipulates the importance of PI Planning by saying, "if you are not doing it [PI Planning], you are not doing SAFe." 10

Disciplined Agile Delivery (DAD)
Disciplined Agile Delivery (DAD) is summarised in the review by Alqudah & Razali [39,40] as comprising a set of roles, practices, and phases. The full details of the method created by Scott Ambler are given in the book [41] and the website for the method 11 , where the authors state that "the Disciplined Agile process decision framework provides light-weight guidance to help organizations streamline their processes in a context-sensitive manner, providing a solid foundation for business agility." We saw our case study site's implementation of DAD unifying the four levels of the enterprise, encompassing business and software engineering functions (product management, portfolio management, program management, and project management). DAD refers to its framework as a toolkit allowing the creation of an organisation-specific scaling agile approach regardless of the organisation's size. "DAD adopts practices and strategies from existing sources and provides advice for when and how to apply them together. In one sense methods such as Scrum, Extreme Programming (XP), Kanban, and Agile Modeling (AM) provide the process bricks and DAD the mortar to fit the bricks together effectively" 12 .
However, central to the DAD framework is portfolio management, which provides the ability to make the right decisions the first time, removing the bias in decision making when identifying specific business value ideas to be pursued either for development as products, features, or for further experiments, to get feedback and certainty. The governance of the DAD framework comprises a set of mandated practices spanning the entire functional setup involved in identifying, developing, and making software available for use. Hence, portfolio management is driven based on development and operations intelligence (metrics) including required guidance and suggestions from the other functional units. The other major functional process areas contributing to portfolio management are the product management, and enterprise architecture (including the IT governance group) within the organisation. Therefore, portfolio management can make informed decisions in terms of budget and resources, i.e. their development capacity to be successful in the marketplace.
With the DAD framework, a portfolio can be delivered through various IT delivery approaches (based on the size of the organisation) such as program management, agile delivery, continuous delivery, or lean delivery. There is portfolio management guidance and support for IT delivery regardless of the delivery means. However, the program management approach is driven through several projects. While there are mandated practices spanning the functional setups, portfolio management emphasizes team autonomy and self-organisation within a cohesive set of practices 13 . The DAD framework identifies a set of primary and secondary roles and their responsibilities regardless of IT delivery approach, comprising leadership and more technical roles, and encouraging the development of "T-skilled engineers" [39].
DAD allows for formal people management processes 14 , a missing element in many scaling agile frameworks, although as noted in an earlier study [42], we did not see evidence of those processes in action. While DAD is driven by agile values and principles, and therefore does not advocate "big upfront design," in reality the process of managing work item lists from conception to readiness for development encompasses a large amount of massaging of functionality and architectural and design work prior to entering the project level phases. At project level three phases are incorporated: inception (scoping and sprint planning), construction (agile testing and coding), and transition (readiness for deployment or DevOps).

Scaled Agile Framework (SAFe)
SAFe was released in 2011 by Dean Leffingwell and is a living framework that is continually updated, having (at the time of writing this paper in 2020) reached version 5.0. According to Leffingwell et al., "SAFe applies the power of Agile but leverages the more extensive knowledge pools of systems thinking and Lean product development" [21]. According to case studies featured on the Scaled Agile Framework website 15 (which admittedly may present a one-sided view), SAFe offers many business benefits, including:
• 20-50% increase in productivity,
• 50%+ increases in quality,
• 30-75% faster time to market,
• measurable increases in employee engagement and job satisfaction.
However, these benefits may not be universal, or may come at a cost to work satisfaction amongst the teams [43]. In her study, Paasivaara reports that one of the teams experienced most changes as negative, where "teams felt lack of autonomy, as they could no longer decide some things on their own, such as the sprint length. With fixed increments they felt moving backward, towards the old waterfall." Paasivaara concludes that perhaps this negative attitude is due to teams being new to SAFe, so that they may not have had time to witness the benefits. This is in contrast to the other case in Paasivaara's study, where all participants described SAFe adoption as "highly successful." The perceived lack of autonomy, and its effects on staff morale, was also found in Noll et al.'s study of SAFe, which (following self-determination theory) argues that there needs to be a balance between autonomy, the feeling of relatedness (support and trust among colleagues), and the ability and skills to take on new tasks [44].
SAFe offers a soft introduction to the agile world through specifying a range of structured patterns often needed when organizations transition from a more traditional environment, particularly in the context of a large project [32]. SAFe considers the whole enterprise, and is organized according to four levels of the organization: Team, Program, Large Solution 16 , and Portfolio. Each level integrates agile and lean practices, manages its own activities, and aligns with the other levels. Also, depending on the size of the operation, and the stage the company is at in terms of scaling, there are varying levels of complexity. For example, organizations can start with entry-level "Essential SAFe" (with just two organizational levels represented: Team and Program). This can build to "Large Solution SAFe" (with an additional level of the Large Solution), to "Portfolio SAFe" (in which the Portfolio level replaces the Large Solution level). Finally, there is "Full SAFe" in which all four levels of Team, Program, Large Solution, and Portfolio are represented.
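The four configurations just described can be summarised as a lookup from configuration name to the organizational levels it represents. This is only an illustrative sketch of the paragraph above; the constant and function names are our own and are not part of SAFe.

```python
# Levels represented in each SAFe configuration, as described in the text.
SAFE_CONFIGURATIONS = {
    "Essential SAFe": ("Team", "Program"),
    "Large Solution SAFe": ("Team", "Program", "Large Solution"),
    "Portfolio SAFe": ("Team", "Program", "Portfolio"),
    "Full SAFe": ("Team", "Program", "Large Solution", "Portfolio"),
}


def levels_added(smaller: str, larger: str) -> list:
    """Return the levels present in `larger` but absent from `smaller`."""
    return sorted(set(SAFE_CONFIGURATIONS[larger]) - set(SAFE_CONFIGURATIONS[smaller]))


print(levels_added("Essential SAFe", "Large Solution SAFe"))
print(levels_added("Essential SAFe", "Full SAFe"))
```

Every configuration contains the Team and Program levels; the configurations differ only in whether the Large Solution and Portfolio levels are present.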
The Team level outlines techniques similar to those used in standard Scrum, with two-week sprint cycles. As in Scrum, teams of 5 to 9 members contain three roles: the Product Owner, Scrum Master, and team member. Each agile team is responsible for defining, building, and testing stories from its team backlog in a series of iterations. Teams have common iteration cadences and synchronization to align their activities with other teams so that the entire organization is iterating in unison. Teams use Scrum, XP, or Kanban to deliver prototypes every two to four weeks [21]. Important for scaling, all SAFe teams form part of a team of agile teams, called an Agile Release Train (ART), that aims to deliver a continuous flow of incremental releases of value.
At the Program level, SAFe extends Scrum using the same ideas but on a higher level. This level defines the concept of an agile release train (ART), which is analogous to sprints at the Team level, but works at a different cadence on a larger timescale. The ART is composed of five sprint cycles. There is also a sixth "innovation planning sprint," which allows teams to innovate, inspect, and adapt. Teams, roles, and activities are organized around the ART [21].
Existing roles are stretched and new roles created to cater for the new responsibilities and practices, where a Product Manager serves as the content authority for the ART, and is accountable for identifying program backlog priorities. The Product Manager works with the Product Owners (POs) to optimize feature delivery and direct the work of POs at the team level. SAFe sees the emergence of the role of a Release Train Engineer (RTE) "who facilitates Program-level processes and execution, escalates impediments, manages risk, and helps to drive continuous improvement" [21]. Creating this role was a key success criterion in the case study described by Ebert and Paasivaara [33].
The Large Solution (previously Value Stream) level is optional, depending on the size of the organization; in larger organizations, implementation of this level calls for a value stream engineer (VSE) who plays a similar role to the RTE by facilitating and guiding the work of all ARTs and suppliers. Leffingwell et al. [21] describe these and further roles such as Business Owner, DevOps team member, Release Manager, and Solution Manager as important.
The highest level of the SAFe hierarchy (made optional in the more recent 4.6 and 5.0 versions) is the Portfolio level. This set of executive-level processes completes the vertical enterprise view, in which senior management make strategic decisions, deliver value, and prioritize 'epics' that are filtered down to the program level, where they are decomposed into features, which in turn are fed to the team level in the form of user stories.
While there is some early evidence in favor of SAFe and its adoption [43,45], perhaps it is too early to judge its true merits or whether the promised benefits can be universally enjoyed. From the current literature, it is unclear how well this lean agile enterprise approach mitigates risk in global software development.

DAD and SAFe comparison
In a recent study of scaling agile strengths, weaknesses, opportunities, and threats (SWOT) in GSD, Sinha et al. [46] identified six threats, comprising Lack of face to face communication, Improper task allocation, Cultural differences, Temporal differences, Linguistic differences, and Lack of agile coaching for scaling. Under the weaknesses quadrant, they include a Lack of knowledge sharing. These are recurrent themes in the agile and GSD risk literature, all of which we include in our GSD Risk Catalog.
When trying to decide which of the many scaling agile frameworks to adopt, Diebold and colleagues differentiate between a collection of frameworks that includes DAD and SAFe [47]. They find SAFe to have a low level of flexibility (incorporating practices such as Scrum/Kanban/Lean, with specific XP practices "mandated"), whereas DAD has a medium level of flexibility (with practices including Scrum/Lean, and a mixed set of methods) [47]. They also suggest that further comparative studies be conducted to help with decision making.
Disciplined Agile Delivery (DAD) and Scaled Agile Framework (SAFe) both draw from a variety of agile and lean practices. However, according to Vaidya's observations on three different scaling agile frameworks [48], an organization's context is what matters most when deciding on which framework to adopt and which associated practices provide the desired results. We therefore present two case studies in two different contexts, each applying a different scaling agile framework, to provide some insight into how risk is mitigated. Both frameworks (DAD and SAFe) place great emphasis on value and risk, so it seemed appropriate to evaluate their efficacy at addressing global software development project risks by seeing how well they covered the software project risks identified in our GSD Risk Catalog.
[Figure: diagram of the sets of risks, where G is the set of risks in our GSD Risk Catalog, V is the set of risks identified by Verner and colleagues [13], and W is the set of risks catalogued by Wallace and Keil [11].]
With this aim in mind, we set out to address the research question, How does the adoption of scaling agile framework practices address global software development risks? By 'address' risk, we will specifically look for how the frameworks either mitigate (reduce) or eliminate the GSD risk. In the next section, we explain our approach to answering this question.

Method
To answer the research question noted at the end of the previous section, we took a three-phased approach. First, we created a catalog of global software development risks that we call the GSD Risk Catalog [31]. Second, we created a theoretical mapping between the risks in our GSD Risk Catalog and the DAD and SAFe practices that we deemed to address each risk; we also rated the degree to which the DAD and SAFe practice(s) eliminated or mitigated the given risk. Lastly, we moved from our theoretical model to a real-world empirical setting, in which we examined the extent to which each practice (in DAD and SAFe respectively) was implemented in a multiple case study comprising two GSD organisations. This served to illustrate both the extent to which each scaling agile framework addressed the risks in our catalog, and how the two case studies implemented the risk-mitigating practices. We examined interview transcripts, survey results, and observation notes to identify instances where a risk listed in our GSD Risk Catalog was evident in either of our cases. From this evidence, we surmised the extent to which practices in each scaling agile framework addressed GSD risks.
We describe these three phases in detail in the following subsections.

Table 1: Snippet of the extraction and mapping of Verner et al. [13] risks onto GSD Risk Catalog risks

Quadrant 1: Customer Mandate
Lack of user participation
• National, organizational, and cultural differences of participants can cause problems like rework, loss of data, confusions, etc.
• Lack of collaboration for RE between distributed stakeholders happens due to differences in culture, language distance and processes
• Application of agile practices causes problems in distributed development because of the degree of interaction between stakeholders and number of face-to-face meetings needed
• Requirements information not properly shared with distributed stakeholders affecting their interaction

Quadrant 2: Scope and Requirements
Incorrect system requirements
• Application of agile practices causes problems in distributed development because of the degree of interaction between stakeholders and number of face-to-face meetings needed
• Collaboration difficulties caused by geographic distance in agile development may cause misunderstandings and conflicts
• Problems caused because team members do not share equal knowledge of the domain
• Lack of a common understanding of requirements leads to problems in system functionality
• Lack of collaboration for RE between distributed stakeholders happens due to differences in culture, language distance and processes

new: Lack of trust
• Collaboration difficulties caused by geographic distance in agile development may cause misunderstandings and conflicts
• Mutual trust is important but hard to obtain and lack of trust causes problems. This can be due to lack of face-to-face interaction, cultural differences, and weak social relations
• Fear about the future of jobs and roles, erodes trust
• Limited face-to-face meetings caused by geographic distance impact trust, decision quality, creativity, and general management; knowledge creation is limited within organization. This may lead to problems in creating collaboration know-how and domain knowledge
• Trust among stakeholders necessary to achieve innovation, flexibility, cooperation, and efficiency in distributed environment. Since often a short life span, important to achieve mutual trust rapidly, but if trust is misplaced, entire organization may suffer
• A vendor with poor relationship management can result in problems such as lack of trust

new: Delays caused by global distance
• Application of agile practices causes problems in distributed development because of the degree of interaction between stakeholders and number of face-to-face meetings needed
• Configuration management problems cause dependency, delay and increased time is required to complete maintenance requests
• Temporal and physical distribution increases complexity of planning and coordination activities, makes multisite virtual meetings hard to plan, causes unproductive waits, delays feedback, and complicates simple things
• Inability to communicate in real time causes collaboration problems; Stakeholders located in different time zones can lead to problems in communicating
• Not tailoring organizational structures to reduce delays in problem resolution causes difficulties and can result in site wars and reduce project cohesion
• Choosing a vendor with a lack of control over a project can result in problems such as cost and schedule overruns; and Poor schedule management

Phase 1: GSD Risk Catalog Development
Given the context of our study is risk in globally distributed organizations, we first augmented the well-recognised Wallace and Keil [11] set of risks with additional risks identified in Verner et al.'s [13] tertiary review of risks in a GSD context.
Since there was no one clear, validated set of risks to draw on specific to GSD, we took the 85 risks detailed in Verner et al.'s review of 24 GSD risk studies [13] and mapped them onto the established Wallace and Keil risk framework [11]. A snippet of this extraction and mapping is shown in Table 1.
First, two researchers (author 1 and author 4) independently compared each risk identified by Verner and colleagues to each risk in Wallace and Keil's risk catalog. If the risk from Wallace and Keil was equivalent to, or would be a consequence of, a risk from Verner et al., we created a correspondence between the two. Mapping the two sets of risks was not straightforward, since some risks in Verner and colleagues' catalog are expressed at a high level, or as a combination of risks; in such cases, the risk from Verner et al. was mapped to multiple Wallace and Keil risks. Also, we marked those risks in Verner and colleagues' catalog that did not correspond to a Wallace and Keil risk, or were incompletely captured by a set of Wallace and Keil risks, for later consideration.
Second, authors 1 and 4 each reviewed the independent mapping of the other researcher to establish the level of agreement; disagreements were discussed between these two authors until agreement was reached. Then, author 2 moderated the final result to yield a single unified mapping. From this mapping, three broad categories emerged: 1. a mapped Verner risk, in which a Verner risk was equivalent to one or more Wallace and Keil risks, 2. an unmapped Verner risk with no Wallace and Keil equivalent, and finally 3. an unmapped Wallace and Keil risk with no Verner et al. equivalent.
In the third step, we coalesced unmapped or partially mapped Verner et al. risks into a small set of new risks. This was necessary because many risks in Verner and colleagues' catalog were at a different level of abstraction compared to Wallace and Keil's risks, were duplicated at the level of abstraction we needed, or appeared to describe the consequences of a risk rather than the risk itself.
For example, Verner and colleagues identify two risks related to the rule of law: "Lack of protection for intellectual property rights in the vendor country" and "Problems because of differences in legal systems such as jurisdiction." In the GSD Risk Catalog, these rather specific risks coalesced into the more general Country-specific regulations risk. Similarly, Verner and colleagues list "High organizational complexity, scheduling, task assignment and cost estimation become more problematic in distributed environments as a result of volatile requirements diversity and lack of informal communication" and "Lack of a common understanding of requirements leads to problems in system functionality"; we coalesced these (and ten others) into Lack of face-to-face interaction inhibits knowledge sharing.
As before, authors 1 and 4 independently reviewed the candidate GSD Risk Catalog to double-check that the mappings made sense, and that the new risks were placed in the correct quadrant. Again, disagreements were discussed and a resolution agreed between these authors.
Finally, authors 2 and 3 reviewed the resulting GSD Risk Catalog to ensure agreement. Table 1 shows an extract of the result (see the companion technical report [31] for the full mapping). The mapping process is depicted in Fig. 3. As is shown, we added ten risks to Wallace and Keil's catalog.
In summary, this comparison of the GSD risks in Verner et al. [13] to the well-established software project risk inventory of Wallace and Keil [11] ensured that technical risks as well as project management risks associated with Global Software Development (GSD) are considered in our analysis. This comparison resulted in ten new risks that accommodate risks from Verner et al.'s study that were not adequately captured by a risk in the Wallace and Keil inventory. This combined set of 63 risks, which we label the "GSD Risk Catalog," is categorised according to Wallace and Keil's four quadrants: Customer Mandate, Scope and Requirements, Execution, and Environment (see Fig. 1), with the majority of new GSD risks coming under the Environment quadrant.
The full list of 63 risks in our derived GSD Risk Catalog is presented in Fig. 2, discussed in Section 4.2.

Phase 2: Theoretical Mapping
Making use of this newly created GSD Risk Catalog, we then mapped each practice identified in our scaling agile frameworks to every risk it appeared to mitigate. This mapping involved three steps:
1. Scaling Agile practice mapped to Risk factors: As part of our ongoing, longitudinal case studies, in previous work we identified sets of practices from SAFe [45] and DAD [42,49]. Four researchers, working in pairs, compared each practice in the scaling agile framework to each of the 63 risks in the GSD Risk Catalog (Fig. 2). Authors 1 and 4 mapped SAFe practices to the GSD Risk Catalog, and authors 2 and 3 mapped DAD practices to the GSD Risk Catalog. To ensure all researchers worked to the same standard, an example of how to 'map' a practice to a risk was shared amongst all researchers.
2. Strength of mitigation assessment: Once the mapping of practices to risks was completed, each risk was rated according to the degree to which the mapped practices eliminated or mitigated the risk: whether the practices "definitely" address the risk, address the risk "somewhat", or do "not at all" address the risk. So, if a practice or set of practices unequivocally addressed the risk, we coded the practice as "definitely"; if the practice(s) to some extent contributed to elimination or mitigation, we coded the practice "somewhat"; and when we could not identify a practice that would eliminate or mitigate the risk we coded the risk as "not at all." Again, authors 1 and 4 assessed SAFe practices, and authors 2 and 3 assessed DAD practices.
3. Inter-rater cross-check (within frameworks): When all possibilities were exhausted, each author reviewed the mapping of his or her peer within the pair. Any disagreements were discussed within each pair until a consensus was reached.
The output from this theoretical mapping was a theory of risk mitigation according to DAD and SAFe, which is presented in Table A.11 (Appendix A) and Table B.12 (Appendix B).
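The coding rule in step 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the instrument the authors used: the names `Mitigation`, `code_risk`, and `count_addressed` are hypothetical, the `unequivocal` flag stands in for the raters' own judgement, and the example ratings are placeholders, not values taken from Tables A.11 or B.12.

```python
from enum import Enum


class Mitigation(Enum):
    """Three-point scale for how far mapped practices address a risk."""
    NOT_AT_ALL = "not at all"
    SOMEWHAT = "somewhat"
    DEFINITELY = "definitely"


def code_risk(mapped_practices, unequivocal):
    """Apply the coding rule to a single risk.

    mapped_practices: practices the raters mapped to the risk (may be empty).
    unequivocal: the raters' judgement that the practices fully address the risk.
    """
    if not mapped_practices:
        # No practice could be identified for this risk.
        return Mitigation.NOT_AT_ALL
    return Mitigation.DEFINITELY if unequivocal else Mitigation.SOMEWHAT


def count_addressed(ratings):
    """Count risks rated 'definitely' or 'somewhat'."""
    return sum(1 for r in ratings.values() if r is not Mitigation.NOT_AT_ALL)


# Placeholder ratings for illustration only (assumed, not the paper's data).
example = {
    "Team members not familiar with the task(s) being automated": code_risk(
        ["Spikes", "Inception phase involving entire DAD delivery team"], True
    ),
    "Users lack understanding of system capabilities and limitations": code_risk(
        ["hypothetical practice"], False
    ),
    "Organization undergoing restructuring during the project": code_risk([], False),
}
print(count_addressed(example))
```

Counting every risk that is not rated "not at all" mirrors how the totals of addressed risks per framework are reported later in the paper.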

Phase 3: Empirical Evidence
In the last stage, we examined data collected in a longitudinal multiple case study of two companies engaged in scaling agile development adoption according to the DAD (Case A) and SAFe (Case B) frameworks. A two-case multiple case study has advantages over a single case, as, according to Yin, ". . . analytic benefits from having two (or more) cases may be substantial. . . " since, "analytic conclusions independently arising from two cases . . . will be more powerful than those coming from a single-case . . . alone" [50]. Furthermore, Yin states that case study research is particularly relevant when asking "How" and "Why" types of questions [50]. Our case boundaries include time (in years), geographic locations, domain, and practice adoption. We apply the multiple case study design to address our "How" research question, and to test our theoretical mapping of scaling agile framework practices (described in Phase 2 of our method), in which we hypothesise that a given set of scaling agile practices can mitigate GSD risk. The business model for both cases is to develop, maintain and sell software to clients throughout the globe. The original aim of both of our case studies was to gain broad insight into issues and benefits of scaling agile framework adoption; as such, while not focused specifically on risk, these studies yielded a rich source of data from which we were able to identify many issues related to risk, test our theory, and compare and contrast across cases.

DAD evaluation-Case A
For the empirical investigation of the DAD method, interview transcripts from Case A (described in Section 4) were examined by authors 2 and 3. In Case A, author 3 conducted interviews of participants in March of 2017 [42,49]; see Section 4.3.1 for a description of the company ("Company A") involved in Case A, and details regarding the interview participants (Table 2). The interviews were conducted at Company A's software engineering lab at Box Hill, Melbourne, Australia. A total of eight interviews (each an hour long) were conducted during one week in March 2017. Participants were identified based on the various software engineering roles within this software vendor organisation. These roles were: Principal Software Engineer, Senior Software Engineer, Software Engineer, Team Leader, Engineering Manager, Product Manager or Product Owner, Quality Assurance Manager, and the Director of Software Engineering. All the participants in this investigation were part of the company's DAD transformation. The interview instruments were based on the reasons and approach for switching to the DAD method. Participants agreed to have the interviews recorded, and the recordings were later transcribed. Although all interviews were conducted at the Melbourne site only, the projects covered development across Australia, the USA, and India.

SAFe Evaluation-Case B
Data relating to SAFe were obtained from results of an ongoing longitudinal participant-observer study (called "Case B"), with moderate researcher involvement, that began at the end of 2015 and continued through to the autumn of 2019. Similar to Case A, the main purpose of our collaboration was to observe how the company ("Company B") transitioned from a plan-driven development process to a scaling agile development process based on SAFe practices. Company B is specifically interested in how to adopt the new agile, lean, and Kanban practices in their highly distributed setting, where teams and individual team members are globally distributed (see Fig. 5). During our four-year collaboration, authors 1 and 4, along with their colleagues, conducted 31 interviews of team members and managers in a variety of roles, chosen to be representative of all levels of the development organization. Interviews were conducted on-site in the company's Dublin headquarters, and via video conference, over two years, starting in November 2015. Participants in sixteen of these interviews are directly quoted in our study, as listed in Table 3.
We also observed distributed development teams conducting Scrum 'ceremonies,' such as daily standups, sprint planning, and retrospective meetings; and, we observed weekly program-level "scrum of scrums" style meetings. These observations began in November 2015 and continued, focusing on different teams, until the end of 2017. Observations helped to place the interviews in context, but did not directly provide any data for this study.
Finally, a series of three SAFe "self-assessment" surveys were administered to various teams, and program and portfolio level participants, in February 2017, July 2017, and March 2018 [45,52,17]; see Table 4 for details regarding the participants in the surveys. The self-assessment surveys identified the level to which the participants perceived they implemented various SAFe practices and ceremonies.

Within and between multiple case study evaluation
As a first step, we examined these data for evidence that the companies had experienced problems (or not) related to the risks in our GSD Risk Catalog. Then, in the second step, we assessed the frequency at which each company performed the respective scaling agile framework practices mapped to risks in the GSD Risk Catalog. In Case A, this frequency was assessed to be "always," as the company had completed its agile adoption at the time the interviews took place. In Case B, data from the self-assessment surveys were examined to determine the company's self-assessed frequency of practice performance.
Finally, working in pairs (authors 2 and 3 for Case A, and authors 1 and 4 for Case B), we connected the output of the previous two steps, to understand whether the scaling agile practices eliminated or mitigated the corresponding risks: 1. if the practices were implemented in the company, and no evidence of the risk was seen, the practices could have been material in eliminating the risk; 2. if the practices were implemented in the company, but there was evidence that the risk was a problem for the case, the practices still might have been effective at mitigating or reducing the risk; or, 3. in that same situation, the evidence might instead indicate that the theoretical mapping is not effective in practice.
To determine which of these alternatives was the case, we considered three additional elements:
1. Strength of theoretical mapping: the degree to which the practices address the risk. Risks that are only "somewhat" addressed (by the practice) are perhaps more likely to be seen as problems.
2. Strength of practice implementation in cases: the frequency at which the associated practices were performed. If this was less than "always," it is possible the practices were not effective because they were not thoroughly implemented.
3. Level of control: whether the risk can be eliminated, or only mitigated. Certain risks, such as Unstable country/regional political/economic environment, are part of the environment; they cannot be eliminated, but their impact can be reduced.
We present the results of applying this method in the sequel.
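The interpretation logic described above can be sketched as two small Python functions. This is an illustrative reading of the method, not the authors' procedure: the function names, return strings, and the treatment of the "practices not implemented" combinations are assumptions made for the sketch.

```python
def interpret(has_mapped_practices, practices_implemented, risk_observed):
    """Classify one risk from the three observations made per case."""
    if not has_mapped_practices:
        return "not addressed by the framework"
    if not practices_implemented:
        # Assumed handling: the mapping cannot be tested without implementation.
        if risk_observed:
            return "possible improvement area"
        return "inconclusive"
    if not risk_observed:
        return "practices may have eliminated the risk"
    return "practices may have mitigated the risk, or the mapping is ineffective"


def candidate_explanations(strength, frequency, can_be_eliminated):
    """List the additional elements that could explain an observed risk.

    strength: "definitely" or "somewhat" (theoretical mapping).
    frequency: how often the practices are performed, e.g. "always" or "often".
    can_be_eliminated: False for risks outside the organization's control.
    """
    reasons = []
    if strength == "somewhat":
        reasons.append("risk is only partly addressed by the practices")
    if frequency != "always":
        reasons.append("practices are not thoroughly implemented")
    if not can_be_eliminated:
        reasons.append("risk can only be mitigated, not eliminated")
    return reasons


print(interpret(True, True, False))
print(candidate_explanations("somewhat", "often", True))
```

When `candidate_explanations` returns an empty list for an observed risk, none of the three elements accounts for it, which leaves an ineffective mapping as the remaining alternative.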

Results
In this section we first present a new catalog of GSD risks created by comparing and merging risks identified by Verner and colleagues [13] with Wallace and Keil's list of risks [11]. Then, we show the theoretical mapping of SAFe and DAD practices to these GSD risks. Finally, we present empirical evidence of the effectiveness of our theoretical mapping, which indicates the extent to which the scaling agile frameworks eliminate or mitigate risks in GSD.

GSD Risk Catalog development
To create a comprehensive catalog of risks faced by global software development projects, we compared 85 risks identified by Verner and colleagues [13] in their tertiary study of risks in global software development, to the 53 in Wallace and Keil's [11] risk framework. We found many risks identified in Wallace and Keil related to GSD risks, and that many of the risks listed by Verner et al. identify more than one risk. For example, a risk listed under "Requirements engineering risks and mitigation advice" states, "A lack of suitable tools or methodologies available for requirements elicitation may lead to problems in obtaining the real requirements. [13, Table 9, p. 64]" This statement articulates two risks: a lack of suitable tools for requirements elicitation, and a lack of suitable methodologies for requirements elicitation. Over a third (32) of the risks in Verner and colleagues' list could be classified as "compound" risks of this nature.
Other risks identified by Verner et al. are general, high-level risks that could have multiple consequences for a software development project. For example, the category "Software development process risks and mitigation advice" includes this risk: "Application of agile practices causes problems in distributed development because of the degree of interaction between stakeholders and number of face-to-face meetings needed" [13]. This high-level risk also leads to several new, GSD-specific risks not found in Wallace and Keil's catalog, including: Delays caused by global distance, Ineffective collaboration, Ineffective coordination, and Lack of face-to-face interaction inhibits knowledge sharing. Of the 85 risks identified by Verner and colleagues, 79 correspond to, imply, or result in at least one risk in Wallace and Keil's catalog.
We also found six that had no correspondence to any of Wallace and Keil's risks, for example, "Lack of well-defined modules causes problems with progressive integration" [13]. In addition, we found 43 risks that, due to being high-level or compound in nature, not only corresponded to one or more risks in Wallace and Keil's catalog, but also suggested a new risk, not in Wallace and Keil's catalog. As a consequence, we formulated 10 additional risks; these are listed in Table 5.
Combining the 10 new risks with the 53 in Wallace and Keil's catalog yielded a GSD Risk Catalog of 63 risks, illustrated in Fig. 2. The full correspondence of risks identified by Verner and colleagues to risks in the GSD Risk Catalog is available in a companion technical report [31].
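As a quick consistency check, the counts reported in this subsection can be reconciled with a few lines of arithmetic. The variable names are illustrative; all figures are the ones stated in the text above.

```python
# Counts reported in the text (variable names are illustrative).
wallace_keil_risks = 53
new_gsd_risks = 10
verner_risks = 85
verner_with_correspondence = 79
verner_without_correspondence = 6
verner_compound = 32

# Size of the combined GSD Risk Catalog.
catalog_size = wallace_keil_risks + new_gsd_risks

# Every Verner risk either corresponds to a Wallace and Keil risk or does not.
accounted_for = verner_with_correspondence + verner_without_correspondence

# Share of Verner risks classified as compound, as a fraction.
compound_share = verner_compound / verner_risks

print(catalog_size, accounted_for, compound_share > 1 / 3)
```

The 43 high-level or compound risks that suggested a new risk are a subset of the 79 with a correspondence, so they do not enter the total.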

Theoretical Mapping of Scaling Agile Practices to GSD risks
To understand the extent to which scaling agile frameworks address GSD risks, we assessed how well practices in DAD and SAFe address the risks in our GSD Risk Catalog. The result is a theoretical mapping of scaling agile practices to GSD risks. Table A.11 (Appendix A) and Table B.12 (Appendix B) show our assessment of the degree to which DAD and SAFe practices (respectively) address the risks in our GSD Risk Catalog, along with examples of DAD and SAFe practices that address the given risk. Space does not allow all the associated DAD and SAFe practices to be included, so the tables present selected examples; the complete mapping is available as part of a technical report [31]. Fig. 4 summarizes the extent to which each scaling agile framework theoretically addresses the GSD Risk Catalog risks. Looking at the total, Fig. 4 shows that both frameworks address most of the risks. The raw figures have been normalized across quadrants to allow a comparison of how each framework addresses risks in the GSD Risk Catalog quadrants outlined in Fig. 2. Despite the frameworks addressing a similar number of risks, we see some differences when we compare risk mitigation within some quadrants.
Both frameworks address the eight risks in the Customer Mandate quadrant to a certain extent. SAFe appears slightly more aligned to working with the customer and user than DAD (Fig. 4). Fig. 4 shows little difference in how the frameworks address the Scope and Requirements set of risks. Both DAD and SAFe address all ten risks in this quadrant completely, except that DAD has slightly weaker support for one of the factors, Users lack understanding of system capabilities and limitations.
The Execution quadrant has the most risk factors (31). As Fig. 4 shows, both DAD and SAFe address most of these 31 risk factors to some extent, although DAD appears to be the stronger framework when it comes to project [...]
The Environment quadrant (Fig. 4) has the next highest number of risk factors (since many of the GSD specific risks are categorized in this quadrant). However, this quadrant has the most factors not addressed by the frameworks. For example, these scaling agile methods do not appear to fully support Unstable country/regional political/economic environment and Organization undergoing restructuring during the project.
Looking across all quadrants (far left in Fig. 4), the total numbers of risks we hypothesize each framework addresses are very similar: DAD practices are associated with eliminating (termed "definitely") or mitigating (termed "somewhat") 58 risks, and SAFe practices with eliminating or mitigating 57 risks. In the next section, we examine whether two companies that implement DAD and SAFe practices experience any of the associated risks, in order to test our hypotheses.

Empirical Evidence
To gain some insight into the effectiveness of our theoretical mapping of scaling agile practices to GSD risks, we conducted a multiple case study involving two companies engaged in global software development. As described in our method, we examined a range of data collected from these cases for evidence of how scaling agile practices might have eliminated or mitigated GSD risks. In this section, we introduce the multiple case study settings and summarize the types of risk we observed in the cases, as specified in the GSD Risk Catalog.

Case study setting
Company A, based in Melbourne, Australia, has been using DAD for some years. Company B, based in Dublin, Ireland, was undergoing a transition from a traditional, plan-driven approach to agile development using SAFe.
As illustrated in Fig. 5, both Company A and Company B have development teams around the world.

Case study A.
Company A is a Melbourne-based company that produces highly intelligent enterprise asset management software for a global customer base. This software vendor has undergone a transition towards scaling agile development using Disciplined Agile Delivery (DAD) over the period from 2015 to 2017, and is continuing as of this writing in 2020. Accompanying this transition has been a move to provide the software not solely as an in-house product, but through a cloud-delivered "software as a service" (SaaS) model. The vendor has ten development teams across three different countries, with the engineering operation based in Melbourne and development teams in the USA and India (see Fig. 5). The marketing and after-sales and support teams are based in Australia and the USA.
The company has been actively implementing agile methods since 2003, so could be considered a mature agile practitioner. Nonetheless, the transition from their previous "hybrid-agile" to a scaling agile approach using DAD had been driven by pressures of scope and quality: namely, the inability to deliver the desired scope for planned releases, and inadequate quality of the software releases delivered into a SaaS environment.
With their DAD approach, programs run by the Melbourne site involve their four local project teams plus three project teams in the USA, and one in India. Case A is a global software vendor that meets our definition of large scale agile GSD (see Section 2.2).
The DAD approach includes the practice of self-contained teams, so all DAD (project) teams are local and co-located, with their own work item lists (product backlogs) allocated by the program management. With the DAD approach, a program is delivered through several projects. Hence, program planning (creating the program portfolio, which is essentially the work item list) involves the product manager, program manager, and enterprise architects, and works at a global level: a program is global, whereas projects are local only. While the DAD project teams are local, there is a daily tactical huddle where all the leadership roles (Architect, Tech Lead, and Product Owner) of the DAD project teams under a program meet. The DAD project teams also do a collective show and tell for every sprint. Case A is conducting global software development and follows DAD recommendations for self-contained, co-located teams.

Case Study B.
The company we studied for Case Study B is Ocuco Ltd., a medium-sized Irish software company that develops practice and lab management software for the optical industry.
Ocuco Ltd., which we will refer to as "Company B" in the sequel, employs approximately 300 staff members in its software development organization, including support and management staff. Company B has annual sales exceeding €20 million, from customers in Britain and Ireland, continental Europe, the Nordic region, North America, and China.
Company B has ten development teams whose members are distributed across Europe and North America (see Fig. 5), involving approximately 50 developers in twelve countries; as such, Company B also meets our definition of large scale agile GSD (see Section 2.2).
As part of their transition from a plan-driven development approach, to agile software development following SAFe, Company B began introducing Scrum at the team level approximately six months before we began our study in 2015. SAFe is being rolled out to the various teams and projects in stages, with the newer projects leading the way to implementing SAFe practices such as PI Planning, Automated Testing, and Continuous Integration, whereas much of the organization is involved in SAFe recommended practices such as Communities of Practice.
While the purpose of the collaboration with both cases was similar, which was to observe how teams adopted or transitioned to scaling agile in a globally distributed setting, there are distinct differences in the study setting. Company A has been using agile methods for nearly two decades, while Company B had just begun a transition to agile development with SAFe at the time we started our collaboration. So, Company A would be considered a mature agile organization, which at the time of the study was scaling their development using DAD. Company B is more of a nascent agile company, introducing many new agile and lean practices as they attempt to scale agile development across teams and up the organizational hierarchy.

Empirical study results
In this section we present results of our investigation into the extent to which GSD Risk Catalog risks were observed (or not) in Cases A and B. We then establish whether the observed risks in the given case, were associated with practices implemented by Company A or B. We first provide an overview of this examination showing the risks not observed, that appear to support our mapping since the risk may have been eliminated. Then, we examine the risks that we observed in the cases.

Risks not observed in cases as issues.
Risks shown in Table 6 were not observed in the case studies as having become issues; this set of practices relates to the first row in Table 7. This table provides some evidence that many risks can be eliminated through the adoption of scaling agile practices.
This category, where the scaling agile mitigation practice is implemented, and no related risks are observed, represents the ideal case, where a risk is identified and addressed before it becomes a problem. An example comes from Company A, involving spikes, that concerns the risk of Team members not familiar with the task(s) being automated: "When you got something like what we're doing right now, rolling out a new dashboard, massive architectural spikes at the start. . . you got to bring that kind of stuff to the architecture upfront" (PA2). The DAD practice "Inception phase involving entire DAD delivery team", along with "Spikes", ensures that technical understanding is developed early in the development lifecycle.
Figure 6: Theoretical mapping evaluation - observations in two cases - All Risks
Company B employed the SAFe practice "Develop a feature team that is organized around user-centered functionality" to address a similar instance of this risk: they employ former users of the product, who have formal qualifications in the domain, as QA staff; this ensures that implemented features are usable by actual practitioners.
The proportions of these risks are shown in Fig. 6 (middle bars). In Case A, practices associated with 58 risks were implemented, yet fewer than half (27) of these risks were observed in Case A; this suggests that DAD was effective at eliminating over half (53%) of the risks for which Company A implemented associated DAD practices. Similarly, in Case B, practices associated with 57 risks were implemented, with 23 of these risks not observed in the case, suggesting that SAFe helped eliminate up to 40% of risks for which Company B implemented associated SAFe practices.
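The percentages above follow directly from the reported counts. The short sketch below reproduces the arithmetic; the helper name `share_not_observed` is hypothetical, and the Case B figure of 34 observed risks is the complement of the 23 risks not observed.

```python
def share_not_observed(risks_with_practices, risks_observed):
    """Return the count and percentage of addressed risks that were not observed."""
    not_observed = risks_with_practices - risks_observed
    return not_observed, 100 * not_observed / risks_with_practices


# Case A (DAD): 58 risks with implemented practices, 27 observed as issues.
a_not_seen, a_pct = share_not_observed(58, 27)

# Case B (SAFe): 57 risks with implemented practices, 34 observed as issues.
b_not_seen, b_pct = share_not_observed(57, 34)

print(a_not_seen, round(a_pct))  # 31 53
print(b_not_seen, round(b_pct))  # 23 40
```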

Risks observed in cases.
Table 8 shows that many GSD risks were observed to have materialized into issues in one or both cases. The evidence of risks being present is drawn from interview transcripts.
To understand whether the implementation of scaling agile practices affects the occurrence of risks, we assessed the frequency with which each company implemented each practice. Company A has been using DAD for more than five years; as such, Company A always performs all except five practices in DAD that address risks in the GSD Risk Catalog. Evidence of practice implementation by Company B comes from self-assessment surveys.
Company B was in the middle of a transition to SAFe when we began our study; consequently, they perform the SAFe practices at different frequencies, ranging from "never" or "rarely," through "occasionally" and "often," to "very often" or "always". We were able to establish the extent to which the SAFe practices were implemented through a series of self-assessment surveys [52,17] (participants detailed in Table 4). Based on these survey responses, we determined that, in Case B six risks had no associated practices that were performed more often than "rarely." Table 9 lists the risks from the GSD Risk Catalog for which either company rarely or never performs any of the associated framework practices (see Appendix A, Table A.11 and Appendix B, Table B.12 for our theoretical mappings). These results are not surprising: in the case of Company A, they have a stable organization and operate in developed countries with a stable political and economic environment; they also do not engage outside suppliers or interface with many systems.
Company B likewise has a stable if rapidly growing organization with experienced management, and experiences very low turnover. So it is perhaps to be expected that they do not implement practices aimed at reducing the risks associated with these characteristics.
In Case B, there are instances where associated scaling agile practices were not (yet) fully implemented. There are 22 risks in Table 10 where the mode of Company B's frequency of performance of the associated practices is "often" ('3' is "often" performed, '4' is "very often"), indicating they do not always perform these practices. As such, risks associated with these practices might not be fully addressed, and so could be expected to become problems occasionally. Conversely, Company A always performs associated DAD practices. This might account for the fact that fewer risks (27 vs 34) were observed in Case A than Case B (Fig. 6, rightmost bars).
Table 7 summarizes the results in terms of the frequencies of combinations of risks addressed and seen, and practices implemented. The first column indicates a risk has been addressed by a scaling agile framework; the second column indicates whether the mapped scaling agile practices have been implemented by the company; the third column indicates a risk has been observed as an issue in the associated case; and, the fourth and fifth columns show the number of risks that are in the state indicated by the first three columns for each case.
The first row shows the number of risks from the GSD Risk Catalog that have practices mapped from the respective scaling agile framework (DAD for Case A, and SAFe for Case B), but have not been observed in the respective case study, and have mapped practices implemented by the case company. This row supports the theoretical mapping: the practices were implemented and the risks were not seen to be present in the case study organization, indicating that the practices were possibly effective in eliminating the associated risks.
The second row shows the number of risks that have practices mapped to them, were observed to have occurred in the respective cases, but the practices were not implemented. This row might indicate areas where the case companies could improve their practices to address observed risks, but no risks were found in this category.
The third row shows the number of risks that have practices mapped, but were not observed in the cases, nor were the associated practices implemented. This category is also empty. The last two rows show the number of risks that do not have associated scaling agile practices. In all but one instance these risks were not observed in either case. The one risk observed in this category, from Case B, is Organization undergoing restructuring during the project. This risk stemmed from the transition from a waterfall to agile development approach: the product owner of a project focused on a large customer noted, "So, we worked in waterfall fashion in the past and I think this is difficult for people to move from the waterfall way to the Scrum way" (PB10).
By contrast, the fourth row shows the number of risks that have practices mapped, were observed in the cases, and the practices were also implemented in the cases. Table 10 lists the risks involved in the fourth row of Table 7. This table shows that few (3 of 8) risks from the Customer Mandate quadrant were seen in either case when the associated actions were implemented. Conversely, the majority (8 of 10) of Scope and Requirements risks were seen. Also, [...] Table 10 are from the Execution quadrant. Section 5 examines the implications of these risks. Fig. 6 shows the proportions of these categories in rows one and four of Table 7 (risks addressed, practices implemented, and issues not seen for row one, or seen for row four). Of note is the greyed portion of risks at the top of the stacked bar chart (representing those risks with no associated practices); these only appear in the category of "risk seen." Conversely, it is only those risks with associated practices that are not seen. The case where the risk is seen, and the practice is implemented, is harder to interpret; we discuss the possibilities in the next section (Section 5).
Table 10: Framework risk mitigating practice empirical evaluation. "Degree of impl." column indicates frequency practice is performed ('3' is "often" performed, '4' is "very often" performed, and '5' is "always" performed).
As Fig. 6 shows, the majority of risks that were seen as issues in either case are also "definitely" addressed (blue portion of the bars), while a slightly higher proportion of risks not seen in both cases were only "somewhat" addressed. This suggests that the theoretical strength of mitigation or elimination does not strongly affect whether the risk was observed in a case.
It seems some risks cannot be eliminated, but can be mitigated. [...] and Lack of top management support for the project-were realized as issues in the cases.
For example, Company A experienced Conflict between users: "Because they've [product managers] been in the US and I've been here. . . That's sort of things that people could not appreciate and they wouldn't even hear it because they know the product, it works like this, and they assumed the customers wanted the way it was" (PA1).
A project manager provides an example of Lack of cooperation from users seen in Company B: ". . . now, in an Agile world there is no way that I could tell them when they are going to get done until the estimate is there, until we start a sprint planning. . . you can't just say, 'we are doing Agile, so, you got to wait for our next planning. . . " (PB11).
Fig. 8 shows the proportions of risks addressed, seen, and associated practices implemented, for the Scope and Requirements quadrant (quadrant 2). The companies in both cases experienced the problems in this quadrant, including Conflicting system requirements and System requirements not adequately identified.
For example, in Case A, one participant mentioned, ". . . everything has to be user stories, sometimes we think that's the underlined problem . . . we get too focused on that because while it's good to have those stories to get you going . . . the magic happens every day in the team making adjustments, embracing that, rather than trying to design user stories to end." (PA2) Similarly, in Case B, a senior manager noted, "Like I remember a few weeks ago we were with a customer from Scotland that asked for a particular improvement in a particular area, and the whole issue had gotten completely, you know, misunderstood by development and had been sitting in the backlog for a very long time, and [we] would set up a call with the customer to try and clear up what exactly what they want. Because a lot of the time there is this 'Chinese whispers' thing going on with. . . people misunderstanding things, recording the wrong request and so on" (PB4).
In Case B, the company encountered a problem where the team responsible for maintaining and enhancing the core product was devoting more effort to fixing issues raised by customers than to implementing new features; this caused the core product development to drift away from the product road-map, resulting in important features being delayed. This is another example of the Conflicting system requirements risk; the associated SAFe practice, "Continuously communicate emerging requirements and opportunities back into the program vision through product owner," mitigates this risk by ensuring the product vision takes into account issues raised by customers; also, the product owner would be fully aware of the product vision and associated road-map, and therefore would be able to make correct decisions about the relative priorities of fixes and new features. As a result of recognizing the link between this risk and the corresponding SAFe practice, Company B moved from part-time product owners who doubled as technical support staff, to dedicated personnel who focus solely on the product owner role. This does not eliminate the risk of conflicting system requirements, but it does reduce the impact by ensuring the product owner prioritizes conflicting requirements properly.
Another example from Case A, relating to conflicting requirements, shows the power of spikes to mitigate (rather than eliminate) this risk: an interviewee noted, "absolutely, it [requirements conflict] mainly happens in inception, but throughout the project as we get closer to needing to execute in a specific story, we analyze risks and potentially do spikes on things, hopefully before those stories are done" (PA7). The DAD practice, "Engineers in the inception phase minimize risk through spikes," helps to resolve conflicts when they arise.
Some risks, all from the Execution quadrant (quadrant 3, Fig. 9), suggest problems related to agile development in a GSD context: 1. Lack of an effective project management methodology, 2. Ineffective coordination, 3. Ineffective collaboration, 4. Ineffective communication, 5. Lack of trust, 6. Negative attitudes by development team, and 7. Inadequate estimation of required resources.
For Case A, Lack of an effective project management methodology manifests itself in technical issues. For example, one interviewee described how automation affects their ability to deliver: "Automation has played a bigger role for us, we reduced the release cycle from 6 weeks to 2 weeks, without automation to give us that nightly check ability, it's very difficult, and our product is very complex, just setting up the environment and do manual testing is weeks of effort." But they are making progress: "And also, in the criteria being done, it's as close as possible to 100% automation. . . it's something we didn't have before. We used to concentrate on a lot of manual work, now we say to the team, it's just your responsibility as QA to automate" (PA6).
Both companies experienced High level of technical complexity, and Team members lack specialized skills required by the project, risks that are related to technical complexity of development. For example, a participant from Case A reflected on the long-term effects of legacy code: "The thing that we have to deal with is that we have a lot of legacy code that was already built in those 3 tiers, and we brought a lot of that across. . . we were sort of forced to stay with it because the need to reuse our code, we just didn't have time to build everything. . . lot of room for improvement I think, if we started from scratch I think things will be a bit different" (PA2). A scrum master in Case B observed, "Even though I have [Sr. Developer] here in Portland [to assist] but he is very heavily involved with other things. So, he is quite busy" (PB10). Similar to the Environment quadrant risks, these two risks are inherent aspects of the project and so can be mitigated (for example, by enlisting more experienced developers) but not eliminated.
[...] US and Europe. I think our biggest database has got like 50 tenants in it, and the other are starting to fill up as well, and we've reached the limit before, and we just simply start up another server. It is not a simple matter of one server, you have to, it's all clustered machines, you have to through 3-4 different servers you have to bring up to every new database" (PA2). In Case B, where growth by acquisition in different countries has changed the organizational structure, a developer commented, ". . . it happens regularly in every week-that someone forgets to unlock the unit. So, when that happens-previously all developers were in Dublin and that shouldn't be a problem-now we have a problem" (PB13). This introduces Delays caused by global distance, also in quadrant 4: "The difficulty is that, when I [PO] am here on-site I only get couple of hours in the morning to deal with Dublin stuff. If I do not get the things that I need from them even though these two hours I am pretty much isolated for the day" (PB10).
We identified several instances where the risks observed related to the very process of transition to the new framework (Case B), or to the process of ongoing adaptation and adoption of new practices as the process matured (Case A). For instance, among the risks observed in each case we noted that Inadequate estimation of project budget was an issue, and, for Case A, that it was hoped DAD would offer some solutions: ". . . now the DAD process is coming" (PA4). Inadequately trained development team members appeared to be an issue, again with Case A indicating that new DAD practices and roles involved "a completely different way of thinking for us" (PA2). The risk that Team members lack specialized skills required by the project was evidenced differently in Case A, as team members "lacking the expertise of integrating all the different products and offerings under one company product" (PA2), and in Case B, as team members who "don't fully understand [the] job" (PB10), so both new technical demands and new roles posed challenges.
In the next section, we discuss the implications of these results.

Discussion
This study was motivated by the lack of empirical evidence as to the efficacy of scaled agile frameworks in general, and specifically in managing risk in Global Software Development (GSD) settings, where teams are distributed around the world. Our results in Section 4 provide a promising set of responses to address this gap, as captured by our research question:

How does the adoption of scaling agile framework practices address global software development risks?
The comprehensive GSD Risk Catalog we derived complements the earlier software risk framework of Wallace & Keil [11] by incorporating ten new GSD-specific risks, making a total of 63 risks. The new GSD-specific risks are situated within the Environment and Execution quadrants of Wallace & Keil [11].
The theoretical mapping suggests that the two scaling agile frameworks investigated, DAD and SAFe, could contribute strongly to eliminating or mitigating risks in our GSD Risk Catalog. However, the empirical assessment of issues related to those risks provides a more nuanced picture of the frameworks and their strengths and limitations. On the one hand, roughly half (31 of 58 or 53% for Case A, and 23 of 57 or 40% for Case B) of risks were not seen as issues for the companies when they implemented the associated practices. This suggests that the respective scaling agile frameworks are effective at totally eliminating a subset of GSD risks. Fig. 6 shows that a majority of risks are addressed by both DAD and SAFe, both from a theoretical and empirical point of view. The vast majority of risks seen in Case A are addressed by DAD, while a slightly lower proportion of risks seen in Case B are addressed by SAFe. Looking at the figure, in both cases it appears that the strength of mitigation has little influence on whether the company will experience the risk.
At the time of our study, Company B was in the middle of their transition from plan-driven to agile development with SAFe. As such, for the risks that were seen as issues in Case B, a majority (22 of 34, or 65%) of associated SAFe practices were performed less than "always" (Table 10, column five). So this could account for the somewhat higher number of issues seen in Case B than in Case A.
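As a quick arithmetic check on the proportions reported in this discussion, the following Python sketch recomputes each percentage from its count and total. Only the counts are taken from the text; the helper name `share` and the dictionary labels are illustrative assumptions and not part of the study's tooling.

```python
def share(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Counts as reported in the text; the labels are illustrative only.
reported = {
    "Case A: risks not seen, practices implemented": (31, 58),
    "Case B: risks not seen, practices implemented": (23, 57),
    "Case B: issues whose practices were performed less than 'always'": (22, 34),
}

for label, (count, total) in reported.items():
    print(f"{label}: {count} of {total} = {share(count, total)}%")
```

The recomputed values agree with the rounded figures quoted in the text, with Case A slightly above one half and Case B below it.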
The remaining risks (that were observed in the cases) are possibly mitigated rather than eliminated. That is, the risk became a problem, but its impact was reduced by the associated scaling agile practices. Wallace and Keil note that companies have low control over the risks in the Customer Mandate (quadrant 1) and Environment (quadrant 4) quadrants (see Fig. 1); we would therefore not expect risks in these quadrants to be eliminated, because they result from external forces out of control of the companies.
Delays caused by global distance is an example of an Environment risk: if teams are located in Vancouver and Dublin, or Melbourne and New York, the only way to eliminate delays caused by lack of timezone overlap is to shift working hours, or close one location; neither of these is likely to be practical. Verner and colleagues also recognize that global distance [53] introduces difficulties for agile development, noting that "Lack of synchronous communication in agile development causes problems" and "Collaboration difficulties caused by geographic distance in agile development may cause misunderstandings and conflicts" [13, p. 64]. While Verner and colleagues don't offer any mitigation advice for these risks (unlike other risks associated with agile methods in GSD), we assert that SAFe practices such as "Calculate the Cost of Delay" and "Manage and optimize the flow of value through the program using various tools, such as the Program and Value stream Kanbans and information radiators" [31] can reduce the impact of these risks by highlighting time-critical information.
Risks in the Customer Mandate quadrant (quadrant 1) are determined by the users, who are also largely out of the control of the development organization. This would be the case for the Conflict between users and Lack of cooperation from users risks. DAD practices such as "Product manager and product owner roles, business case, feature funnel, Mandated DOD [definition of done], TDD [test-driven development] practices, and including manual systems testing before deployment into production environment" and "DevOps practices-collaboration between Operations and SE (program and project) teams" reduce the impact of these risks [31], but do not eliminate them. Verner and colleagues also recognize continuous integration and test-driven development as ways to mitigate risks associated with agile development in GSD [13].
Some risks in the Scope and Requirements quadrant (quadrant 2) also depend on the customer. Continually changing system requirements is an example: if the customer decides to change some of the requirements, for example in response to changing market conditions, the development teams need to react accordingly or the customer will not be satisfied. SAFe practices such as "Continuously communicate emerging requirements and opportunities back into the program vision through product owner," "Work with stakeholders to understand the specific business targets behind the user-system interaction," and "Perform system demo as near as possible to the end of the iteration" help ensure that requirements changes are detected and accounted for as soon as possible. DAD has similar practices, including having a "product manager who does market investigation and feedback on potential features and functionalities from potential and current customers" and a "product owner with UX responsibility, for storyboarding, do user research in the field".
We note that overall, few risks in the Customer Mandate quadrant were seen by either company (Fig. 7), despite this quadrant being classified as having a "low" perceived level of control [11] (Fig. 1). Conversely, Company B experienced nearly two-thirds (18 of 28) of the risks in the Scope and Requirements quadrant (see Fig. 9), despite this quadrant having a "high" perceived level of control [11].
Agile methods emphasize the need to accept requirements change, rather than prevent or control it through requirements freezes and change control boards, in order to deliver the most value to customers and end-users [54]; possibly, this shift in attitude towards requirements and users means that Wallace and Keil's original classification of the perceived level of control of the Customer Mandate and Scope and Requirements quadrants is reversed in the context of agile software development.
Agile methods also promote intense interactions (both formal and informal) among stakeholders, including the development organization, customers, and users. This means that there is a closer relationship between developers and other stakeholders than in traditional plan-driven approaches. Yet, according to Pikkarainen et al., agile methods applied in larger development situations involving multiple external stakeholders "can sometimes even hinder the communication" [55].
In our theoretical mapping, we contended that many scaling agile practices, such as the "all hands" PI ceremony, recognize the importance of all distributed team members meeting face-to-face, and that these practices, coupled with enhanced tools for video-conferencing and information sharing, and daily remote stand-up meetings, would contribute to a highly collaborative environment. Yet the major issues experienced by both cases appear in the Environment and Execution quadrants. This may be due to the situation where the "application of agile practices causes problems in distributed development because of the degree of interaction between stakeholders and number of face-to-face meetings needed" [13]. It would appear that Global Software Development impedes this kind of interaction [3]: of the 17 risks observed in both cases, eight are the new GSD risks added to the GSD Risk Catalog to augment Wallace and Keil's inventory: 1. Ineffective collaboration, 2. Ineffective coordination, 3. Lack of trust, 4. Country-specific regulations, 5. Delays caused by global distance, 6. Lack of architecture-organization alignment, 7. Lack of face-to-face interaction inhibits knowledge sharing, and 8. Lack of process alignment. Lack of architecture-organization alignment is a recurring issue in GSD [8] that appears to be a risk that can at best be mitigated; recent studies into architectural design in GSD indicate that the architecture does not always reflect the structure of the organization, and architects interviewed stated that working across geographic boundaries required new strategies [56].
Also, both cases experienced Inadequate estimation of required resources. Estimation is a persistent problem in any software development context; in their study of 145 software projects, Kitchenham and colleagues found that fewer than two-thirds of the projects produced estimates within 25% of the actual time required [57]. So, inadequate estimation may be a fact of life in software development that cannot be eliminated, especially in a global setting.
Agile methods accept that initial estimates are not accurate, but that they will improve as development teams gain more experience with the requirements and their own capabilities. As such, this risk is one that is likely to be a problem initially, but will be mitigated over time; a scrum master in Case B confirmed this, observing, "For the first few sprints we completely over-committed to a lot of stuff which we just couldn't deliver. So, we are trying to fit in with the velocity that is based on the size of the team" (PB10).
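To make the "within 25% of the actual time" criterion concrete, the following Python sketch computes the share of estimates that fall inside a relative tolerance of the actual value. It is a minimal illustration under stated assumptions: the function names and the example (estimate, actual) pairs are hypothetical, and none of the numbers come from the cited study or from Case A or Case B.

```python
def within_tolerance(estimate: float, actual: float, tolerance: float = 0.25) -> bool:
    """True if the estimate deviates from the actual value by at most
    `tolerance` (as a fraction of the actual value). Assumes actual > 0."""
    return abs(estimate - actual) <= tolerance * actual


def share_within(pairs, tolerance: float = 0.25) -> float:
    """Fraction of (estimate, actual) pairs whose estimate is within tolerance."""
    hits = sum(within_tolerance(est, act, tolerance) for est, act in pairs)
    return hits / len(pairs)


# Hypothetical (estimate, actual) effort pairs, for illustration only.
example_pairs = [(100, 110), (80, 120), (50, 45), (200, 150)]
print(share_within(example_pairs))
```

Tracking such a ratio sprint by sprint would be one simple way to see whether estimates improve as a team gains experience, in line with the scrum master's observation above.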
It is surprising that Lack of trust is an issue in both cases, since the Agile Manifesto values "individuals and interactions over processes and tools," and among its twelve principles is "Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done" [58]. As noted in Section 4.3.2, in Case A, this is related to Ineffective communication that stems from GSD, while in Case B, some of the problem derives from the fact that the company is still in the process of moving from a plan-driven to an agile development approach.
Verner and colleagues also recognized that GSD presents difficulties for agile software development. Six risks from Verner et al's list [13, Table 10, p. 64] specifically concern agile development; five of these explicitly cite the impact of GSD on communication, collaboration, or coordination: 1. "Application of agile practices causes problems in distributed development because of the degree of interaction between stakeholders and the number of face-to-face meetings needed," 2. "Lack of synchronous communication in agile development causes problems," 3. "Collaboration difficulties caused by geographic distance in agile development may cause misunderstandings and conflicts," 4. "Poor communication bandwidth for agile development causes problems with communication and knowledge management," 5. "Lack of tool support for agile development causes problems with agile practices," and 6. "Large teams involved with agile development can cause problems related to communication and coordination." The fact that both cases experienced these risks suggests that certain risks are endemic to GSD, and, due to their impact on communication among developers, customers, and users, may be beyond the capabilities of scaling agile frameworks to eliminate.
However, we know from observation that both companies can exploit communication and coordination technologies such as video conferencing, real-time "chat," and issue management software, to effectively implement scaling agile practices related to communication, coordination, and collaboration. And we know from associated studies with Case B, that practitioners are highly motivated (with a few exceptions) [44], a status further supported by very low staff turnover [59,60,61]. So even though we still observed instances of risks related to communication and coordination becoming issues, this does not mean that scaling agile frameworks are ineffective at addressing GSD risks; rather, we hypothesize that scaling agile practices can reduce the probability of GSD risks becoming problems, and potentially reduce the impact of such problems when they do occur, but that these risks cannot be eliminated entirely.
In summary, the scaling agile philosophy of collaboration, both horizontally and vertically throughout the enterprise [21,17], appears to eliminate or mitigate software development risks through better sharing of information, joint decision-making, progressive refinement, and adaptation of goals. As noted in Ambler and Lines [41, p. 8], "high collaboration is a hallmark of agility." A strong governance structure, and set of supporting roles and practices within each framework, enable such behavior, and contribute to reducing risk and thereby to the improved outcomes advocated by each method [45,42].

Threats to Validity

Construct Validity
We have used qualitative data from interviews and observations to determine whether risks from our GSD Risk Catalog were experienced by either of the companies in our case studies. This data collection did not specifically set out to examine risks that the organizations were experiencing. However, the broader questions asked as part of these longitudinal studies were to understand the challenges and issues that the practitioners experienced in their development, whilst transitioning to their target scaling agile frameworks, in a GSD setting.
The measure we used to test whether a scaling agile practice mitigated a given risk was the absence of qualitative evidence that a company experienced the risk, combined with evidence that the company had implemented practices deemed to address it; we took this combination as an indication that the practices addressed the risk by eliminating it. But this relationship may be coincidental; the risk might not have materialized anyway, or might have been eliminated by some other means.
Also, we speculated that the presence of risks combined with practice implementation may indicate that the practices do not eliminate risks. Again, this could be coincidental.
Future work could involve following up with the participants to verify that the scaling agile practices helped, or did not help, to eliminate risks, and why.
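The interpretation rule described in this subsection can be sketched in a few lines of Python. This is a simplified illustration, not the procedure actually used to code the case data: the class name, field names, and category labels are assumptions introduced here, and the third example risk is a hypothetical placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEvidence:
    name: str
    practice_mapped: bool       # theoretical mapping links a practice to the risk
    practice_implemented: bool  # case data show the mapped practice being performed
    risk_observed: bool         # qualitative evidence that the risk became an issue


def interpret(evidence: RiskEvidence) -> str:
    """Tentative reading of one risk in one case; coincidence cannot be ruled out."""
    if not evidence.practice_mapped:
        return "no associated practice"
    if not evidence.practice_implemented:
        return "practice not implemented"
    if evidence.risk_observed:
        return "possibly mitigated, not eliminated"
    return "possibly eliminated"


examples = [
    # Observed in Case B with no associated scaling agile practice (Section 4).
    RiskEvidence("Organization undergoing restructuring during the project",
                 False, False, True),
    # Observed in both cases while the associated practices were implemented.
    RiskEvidence("Conflicting system requirements", True, True, True),
    # Hypothetical placeholder: practice implemented, risk not observed.
    RiskEvidence("Hypothetical risk X", True, True, False),
]

for item in examples:
    print(f"{item.name}: {interpret(item)}")
```

The word "possibly" in the labels reflects the caveat above: the absence or presence of a risk may be coincidental rather than caused by the practice.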

Internal Validity
Our approach to mapping agile practices to the risks we deemed them to mitigate was careful, involved inter-rater cross-checks, and was supported by evidence from each framework's documentation and case data. But we may have misinterpreted some of the risks and corresponding practices. Moreover, the complexity and interlocking nature of many roles and practices were such that making a direct mapping was challenging in some cases. To what extent the identified practices were mutually self-reinforcing across cases was hard to determine, especially since the cases represent different application domains. The observed differences and similarities across frameworks may therefore be due to situational differences rather than framework differences.
There are trade-offs between types of case study [19]: for example, a multiple case study has several validity gains over a single case study, having higher transferability (broader relevance to other cases, and observations are made in a naturalistic real-world setting) and higher confirmability (with more support for findings across studies) [62]. Yet the level of credibility is considered lower because, for example, cause and effect can be confused across cases due to the varied settings, which are only visited at one point in time [63]. As noted above, the presence or absence of risks might not be related to scaling agile framework practices, but rather could be the result of confounding variables. For example, we did not observe the Unstable country/regional political/economic environment risk in either case. But this is almost certainly because both companies are located in, and sell to, stable economies and countries. Similarly, Case A experienced the Project involves the use of new technology risk; since, at the time of our study, they were moving to SaaS, this risk may simply have been new enough that it was not yet mitigated.
There is also a possibility that researcher bias influenced the mappings that form the core of this study. To reduce this possibility, we first did independent mappings as individuals, then compared the results. We resolved all disagreements by discussion within each pair of researchers, thus achieving complete agreement in the end.
The data collected from Case A might be limited by the fact that interviews and observations were done in one location. Although the Melbourne site has interactions with teams and customers across the globe, and so the participants would have a good understanding of Company A's processes, they might not have a complete view of the issues faced by Company A, nor the real frequency with which DAD practices are performed at other sites. As this would tend to underestimate the issues, and overestimate the frequency of performance, we feel our interpretations stemming from Case A are conservative.

External Validity
Our findings suggest that these conclusions may hold for other companies implementing scaling agile frameworks, but in particular, the DAD and SAFe frameworks, as we observed that two companies with different characteristics and product domains nevertheless experienced similar risks, especially as related to Global Software Development. That said, each organization's implementation will inevitably have its own characteristics, as DAD and SAFe are large, complex, and adaptable frameworks, frequently supported by experienced agile coaches [41, p. 73], who are needed to help tailor the frameworks to the circumstances of each adopting organization. Therefore, the contributing role of context would need further investigation through additional studies.

Conclusion
In this paper, through a three-phase process, we have illustrated how two scaling agile frameworks, DAD and SAFe, largely address the 63 software development risks in the GSD Risk Catalog.
The first phase involved identifying Global Software Development risks faced by software development organizations, by examining the literature on risks in both conventional and GSD contexts. The result was a GSD Risk Catalog of 63 risks, divided into four quadrants following Wallace and Keil [11]: 1. Customer Mandate, 2. Scope and Requirements, 3. Execution, and 4. Environment.
The next phase consisted of a theoretical mapping of scaling agile practices to risks in the GSD Risk Catalog. We compared practices from DAD and SAFe to risks in the GSD Risk Catalog, creating a mapping of practices to risks that shows how scaling agile practices eliminate or mitigate those risks.
To assess the strength of the scaling agile frameworks to mitigate or eliminate risk, and avoid criticism that we "speculated that the strategy would have helped observed problems" [13], we performed an empirical assessment of the resulting theoretical mappings. This empirical assessment determined the frequency with which practices in each framework were performed in two companies, and the risks encountered by those companies. Through examination of observation and interview notes and transcripts, and self-assessment survey results, from two case studies of global software companies, we were able to support much of our theoretical mapping, and provide evidence that both DAD and SAFe appear to eliminate or mitigate the majority of risks in the GSD Risk Catalog.
Thus, this study adds to the limited empirical evidence of the efficacy of scaling agile frameworks. It suggests that the claims that these frameworks are risk- and value-driven approaches have some validity.
Of the four quadrants in the GSD Risk Catalog, the Customer Mandate quadrant appears to be better addressed through the SAFe framework than DAD. Scope and Requirements risks are addressed well by both methods. Execution risks are better mitigated by DAD than SAFe, and Environment risks are less well addressed by either approach. This suggests that the Environment set of risks is less amenable to being addressed by a process framework.
A further outcome of creating the GSD Risk Catalog is the addition of ten new risks related to Global Software Development, which were not identified in the Wallace and Keil inventory [11] (see Table 5). These new risks appear to be endemic and suggest a risk tariff in GSD; all of these except Lack of tool/infrastructure alignment and Unstable country/regional political/economic environment were experienced by both companies.
The result of these three phases is a scaling agile risk theoretical mapping that shows how two scaling agile frameworks, Disciplined Agile Delivery and the Scaled Agile Framework, can potentially eliminate or mitigate software project risks in global software development.
These findings in a global software development scaling agile context echo Oehmen and colleagues' [10] assertion that risks cannot be avoided; at best organizations manage risk by applying practices that lead to a structured reduction of uncertainty.

GSD Risk Catalog risk/Practices Level
Lack or loss of organizational commitment to the project definitely
Product management, portfolio management, program management teams. Product planning, portfolio planning, and program planning. Work item list.

Users not committed to the project somewhat
Product manager, product owner, domain expert. Business case, feature funnel, storyboarding, DevOps practices, real time monitoring of the product environment by end-users, feedback from end-users.

Users resistant to change definitely
Product management, vision planning, program management, story boarding. Product manager, proxy user product owner, on-site customer with UX responsibility. Both roles require individuals with solid local technical product knowledge.

Users with negative attitudes toward the project somewhat
Product manager of features and functionalities, product owner with UX responsibility. Business case, feature funnel, storyboarding, user stories, DOD including acceptance tests, iteration show and tell, beta testing, production environment test, DevOps practices, real time monitoring of the product environment.

Quadrant 2: Scope and Requirements
Conflicting system requirements definitely
Product manager who does market investigation and feedback on potential features and functionalities from potential and current customers. Vision planning, product owner with UX responsibility, for storyboarding, do user research in the field. Engineers in the inception phase minimise risk through spikes. Vision planning, business case, feature funnel.

Continually changing project scope/objectives definitely
Program management creates work item list for the entire program. Delivery teams choose user stories for their work item list (project). Program daily huddles. DOD, UAT. Programs are time-boxed, defined, and relatively short duration.

Continually changing system requirements definitely
Product manager who does market investigation and feedback on potential features and functionalities from potential and current customers. Vision planning, product owner with UX responsibility, for storyboarding, do user research in the field. Engineers in the inception phase minimise risk through spikes. Vision planning, business case, feature funnel.

Difficulty in defining the inputs and outputs of the system definitely
Program management including Enterprise architects and Product owners. Program work item list, architecture design, storyboarding, user stories. Project-level product owners, team architects, software, and QA engineers. Inception phase including spikes to understand user stories. UAT during construction phase.

Ill-defined project goals definitely
Product owner, Team lead, Architecture owner. Project work item list, DOD, inception phase, construction phase, transition phase.

GSD Risk Catalog risk/Practices Level
Self-contained DAD delivery team. Primary and secondary roles, including three leadership roles and separate HR manager. Self-organising teams, empowerment, task sharing, T-skilling.

High level of technical complexity definitely
Entire DAD delivery team involved in inception phase. Spikes. Provide story points and estimates. Sprint planning.

Highly complex task being automated definitely
Entire DAD delivery team involved in inception phase. Spikes. Provide story points and estimates. Sprint planning.

Immature technology definitely
DAD delivery teams are empowered to make tool decisions. Temporary roles in DAD delivery teams. Coaching and training for upskilling on unfamiliar technology.

Inadequate estimation of project budget somewhat
Features are delivered through program.

Inadequate estimation of project schedule definitely
User stories and estimates. Priority setting. Spikes. User stories, story points. Re-estimating.

Inadequate estimation of required resources definitely
Self-contained DAD delivery team with up to 13 individuals. Primary and secondary roles, including three leadership roles and separate HR manager. Empowered self-organising teams. Task sharing. T-skilling.

Inadequately trained development team members definitely
Project management including Product owner, Architecture owner. Self-organising teams, T-skilled, coaching by Architecture owner and Product owner, pair programming. Bring in outside expertise to develop and upskill team members.

Ineffective project manager definitely
Shared leadership roles (Product owner, Architecture owner, Technical lead) for DAD delivery teams. Project management tasks are shared by the entire DAD delivery team (individuals in primary roles).

Inexperienced project manager definitely
Shared leadership roles (Product owner, Architecture owner, Technical lead) for DAD delivery teams. Project management tasks are shared by the entire DAD delivery team (individuals in primary roles).

Inexperienced team members definitely
Self-contained DAD delivery team with up to 13 individuals. Primary and secondary roles. Three leadership roles. Task sharing, pair programming.

Lack of an effective project management methodology definitely
Clearly defined program management roles and practices. Clearly defined project management practices and roles, including DAD method phases and activities.

Lack of commitment to the project among development team members definitely
Project management including Team lead. Self-organising teams, inception phase engineers empowered to run spikes, provide story points and re-estimate each user story. Sprint planning and commitment to sprint objectives.

Lack of people skills in project leadership definitely
Clearly defined leadership roles and practices within program and project management, based on self-organising DAD delivery teams.

Large number of links to other systems required not at all

Negative attitudes by development team definitely
Self-organising teams. Inception phase engineers empowered to run spikes, provide story points, and re-estimate each user story. Sprint planning and commitment to sprint objectives.

new: Ineffective collaboration definitely
All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles; everyone is treated as equal, is empowered, and takes part in collective decisions. Three leadership roles within a DAD project team: team lead, AO, and PO. All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles (software engineers). Three leadership roles within a DAD project team: team lead, AO, and PO. Co-location, shared workspace, daily stand-up meetings, pair programming. All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles. Three leadership roles within a DAD project team: team lead, AO, and PO. Co-location, shared workspace, daily stand-up meetings, pair programming.

All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles (software engineers). Three leadership roles within a DAD project team: team lead, AO, and PO. Co-location, shared workspace, daily stand-up meetings. All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles (software engineers). Three leadership roles within a DAD project team: team lead, AO, and PO. Co-location, shared workspace, daily stand-up meetings, iteration planning, iteration reviews. All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles (software engineers). Three leadership roles within a DAD project team: team lead, AO, and PO. Co-location, shared workspace, daily stand-up meetings, pair programming.

new: Lack of trust definitely
All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles; everyone is treated as equal, is empowered, and takes part in collective decisions. Any training need for the team is met through the secondary role: a coach, consultant, or skilled individual member from another team can join for a period of time to upskill those who need upskilling. All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles; everyone is treated as equal, is empowered, and takes part in collective decisions. T-skilled means every team must learn on the fly all the broad skills required to deliver projects. All DAD project teams are feature teams: self-contained, self-organizing, and T-skilled, with all team members in primary roles only; everyone is treated as equal, is empowered, and takes part in collective decisions. However, the AO, PO, and Team lead can make decisions so that teams can make progress if stuck with a problem.
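
Each row of this table pairs a catalog risk with one of three levels ("definitely", "somewhat", "not at all"). As an illustration only, the following Python sketch shows one way such rows could be encoded and summarized. The `Level` enum and the helper names `tally` and `unaddressed` are assumptions introduced here, and the four rows are a small subset copied from the table, not the full catalog.

```python
from collections import Counter
from enum import Enum


class Level(Enum):
    """Degree to which framework practices address a risk (scale used in the table)."""
    DEFINITELY = "definitely"
    SOMEWHAT = "somewhat"
    NOT_AT_ALL = "not at all"


# Illustrative subset of (risk, level) rows copied from the table.
ROWS = [
    ("High level of technical complexity", Level.DEFINITELY),
    ("Immature technology", Level.DEFINITELY),
    ("Inadequate estimation of project budget", Level.SOMEWHAT),
    ("Large number of links to other systems required", Level.NOT_AT_ALL),
]


def tally(rows):
    """Count how many risks fall under each level."""
    return Counter(level for _, level in rows)


def unaddressed(rows):
    """Return the risks rated 'not at all', i.e. with no mitigating practice listed."""
    return [risk for risk, level in rows if level is Level.NOT_AT_ALL]


if __name__ == "__main__":
    for level, count in tally(ROWS).items():
        print(f"{level.value}: {count}")
    print("Unaddressed:", unaddressed(ROWS))
```

Keeping the level as an enumeration rather than free text guards against spelling variants when rows are counted per quadrant or per framework.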

Quadrant 2: Scope and Requirements
Conflicting system requirements definitely
Apply exploration enablers, which provide a way for development teams to flesh out the details of requirements and design. Continuously communicate emerging requirements and opportunities back into the program vision through product owner. Work with stakeholders to understand the specific business targets behind the user-system interaction.

Continually changing project scope/objectives definitely
Perform system demo as near as possible to the end of the iteration. Integrate every other iteration. Actively participate in ongoing agreements to maintain business and development alignment as priorities and scope are inevitably changed.

Continually changing system requirements definitely
Perform system demo as near as possible to the end of the iteration. Integrate every other iteration. Continuously communicate emerging requirements and opportunities back into the program vision through product owner.

Difficulty in defining the inputs and outputs of the system definitely
Define PI planning's primary outputs. Demonstrate each new feature in an end-to-end use case. Develop a feature team that is organized around user-centered functionality. Each team is capable of delivering end-to-end value. Feature teams operate primarily with user stories, refactors, and spikes.

Ill-defined project goals definitely
Implement value stream coordination to ensure that the enterprise moves forward with each value stream in lockstep with the enterprise objectives. Align development to business via business context, vision, and Team and Program PI Objectives. Create a set of 'SMART' team PI objectives for each individual team with business value assigned.

Incorrect system requirements definitely
Epics and lightweight business cases. Ensure that Epics and Enablers are reasoned and analyzed prior to reaching a Program Increment boundary, are prioritized appropriately, and have established acceptance criteria to guide a high-fidelity implementation. Define Roadmap.

System requirements not adequately identified definitely
Epics and lightweight business cases. Ensure that Epics and Enablers are reasoned and analyzed prior to reaching a Program Increment boundary, are prioritized appropriately, and have established acceptance criteria to guide a high-fidelity implementation. Define Roadmap.

Unclear system requirements definitely
Epics and lightweight business cases. Ensure that Epics and Enablers are reasoned and analyzed prior to reaching a Program Increment boundary, are prioritized appropriately, and have established acceptance criteria to guide a high-fidelity implementation. Define Roadmap.

Undefined project success criteria definitely
Define Success criteria to validate the implementation. Impacts the identification, success criteria, and prioritization of epics in the funnel and backlog states. Success criteria provide a mechanism to understand progress towards the intent.

Users lack understanding of system capabilities and limitations definitely
Perform system demos. Integrate to illustrate a particular feature, capability, or nonfunctional requirement. Demonstrate each new feature in an end-to-end use case.

Development team unfamiliar with selected development tools somewhat
Understand requirements for working on "technical infrastructure, tooling, and other systemic impediments". It may be more efficient to perform an upgrade or migration at a time when there isn't a critical demo just a few days away. Perform Daily Stand-up to understand team's status, "escalate problems, and get help from other team members".

Frequent conflicts among development team members somewhat
Vote of confidence/commitment from the entire program to these objectives. Develop PI commitment. Establish an agreement to determine how the work is performed for each activity type.


Highly complex task being automated definitely
Perform initial exploration of epics and rank them roughly by using Weighted Shortest Job First (WSJF) to determine which epics should move to the next step for deeper exploration. Apply Weighted Shortest Job First prioritization method for job sequencing. Define Weighted Shortest Job First (WSJF).

Immature technology definitely
Work with Agile Teams that perform research spikes, create proof of concepts, mock-ups, etc. Support technology/engineering aspects of program and value stream kanbans. Local stories representing new functionality, refactors, defects, research spikes, and other technical debt are identified, written as enabler stories, estimated, and sequenced.

Inadequate estimation of project budget definitely
Develop incremental implementation by keeping the epics in the portfolio backlog until implementation capacity is available. Avoid overhead and enable the train to make fast, local decisions within the constraints of the allocated budget. Provide WIP limits to ensure that the teams responsible for analysis undertake it responsibly and do not create expectations for implementation or time frames that far exceed capacity and reality.

Inadequate estimation of project schedule definitely
Agile Estimating and Planning. Adopt Agile estimating and planning by using the currency of story points. Epic Progress Measure.

Inadequate estimation of required resources definitely
Develop incremental implementation by keeping the epics in the portfolio backlog until implementation capacity is available. Provide WIP limits to ensure that the teams responsible for analysis undertake it responsibly and do not create expectations for implementation or time frames that far exceed capacity and reality. Use capacity allocation to estimate portfolio epics based on the given knowledge of program velocities.

Inadequately trained development team members somewhat
Conduct specialized training to keep up with advancements in their respective fields.

Ineffective communication definitely
Facilitate continuous improvement by quantitative metrics, customer feedback, and the Inspect and Adapt retrospective cycle. Establish high-bandwidth communication across all team members and stakeholders. Perform system demo as near as possible to the end of the iteration.

Ineffective project manager not at all
Inexperienced project manager not at all

Inexperienced team members somewhat
Involve a subject matter expert in basic exploration and sizing. Interact with analyst and subject matter experts during specification workshops. Work with stakeholders and subject matter experts to define the epic and its potential benefits.

Lack of an effective project management methodology definitely
Coach leaders, teams, and Scrum masters in lean-Agile practices and mindsets. Use "Agile Project Management Tools to capture stories and status, defects, test cases, estimates, actuals, assignments, burn-down chart".

Lack of commitment to the project among development team members definitely
Vote of confidence/commitment from the entire program to these objectives. Develop PI commitment. Limit the commitments to longer-term work, because some other item may come along that's more important than a prior commitment.

Lack of people skills in project leadership somewhat
Define Scrum Master role. Exhibits Lean-Agile leadership. Protects and communicates.

Large number of links to other systems required definitely
Perform initial exploration of epics and rank them roughly by using Weighted Shortest Job First (WSJF) to determine which epics should move to the next step for deeper exploration. Apply Weighted Shortest Job First prioritization method for job sequencing. Define Weighted Shortest Job First (WSJF).

Negative attitudes by development team somewhat
Perform Iteration Retrospective as the check step for the overall iteration. Use "iteration retrospective to drive program level changes to process either immediately or in the Inspect and Adapt workshop". Perform Iteration Retrospective to identify ways to improve.

new: Ineffective collaboration definitely
Manage dependencies by applying an extensive degree of cooperation; a common value stream backlog; implementation of new, crosscutting capabilities; additional system integration; additional roles and responsibilities; special considerations for pre-, post-, and PI planning activities; different degrees and types of DevOps support. Product manager creates features in collaboration with product owner and other key stakeholders. Features are also created as a result of decomposition of epics. Encourage collaboration between teams and System and Solution Architects, Engineering, and User Experience designers.
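
Several rows of this table cite Weighted Shortest Job First (WSJF) for ranking epics and sequencing jobs. The table does not spell out the calculation, so the following Python sketch assumes the usual SAFe definition: WSJF is the cost of delay divided by job size, where cost of delay is the sum of relative scores for user-business value, time criticality, and risk reduction or opportunity enablement. The `Epic` class, its field names, and `rank_by_wsjf` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Epic:
    """An epic scored on relative scales (all field names are illustrative)."""
    name: str
    business_value: int      # user-business value
    time_criticality: int    # how quickly value decays if delayed
    risk_reduction: int      # risk reduction / opportunity enablement
    job_size: int            # relative size, used as a proxy for duration

    @property
    def cost_of_delay(self) -> int:
        # Assumed definition: sum of the three relative scores.
        return self.business_value + self.time_criticality + self.risk_reduction

    @property
    def wsjf(self) -> float:
        if self.job_size <= 0:
            raise ValueError("job_size must be positive")
        return self.cost_of_delay / self.job_size


def rank_by_wsjf(epics: Iterable[Epic]) -> List[Epic]:
    """Return epics ordered from highest to lowest WSJF score."""
    return sorted(epics, key=lambda e: e.wsjf, reverse=True)


if __name__ == "__main__":
    # Hypothetical epics and scores, not taken from the case studies.
    backlog = [
        Epic("A", business_value=8, time_criticality=5, risk_reduction=3, job_size=8),
        Epic("B", business_value=5, time_criticality=8, risk_reduction=2, job_size=3),
        Epic("C", business_value=3, time_criticality=2, risk_reduction=1, job_size=13),
    ]
    for epic in rank_by_wsjf(backlog):
        print(f"{epic.name}: WSJF = {epic.wsjf:.2f}")
```

Under these assumed scores, the small but time-critical epic B ranks ahead of the larger epic A, which is the behaviour the "shortest job first" weighting is meant to produce.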


Manage dependencies by applying extensive degree of cooperation; a common value stream backlog; implementation of new, crosscutting capabilities; additional system integration; additional roles and responsibilities; special considerations for pre-, post-, and PI planning activities; different degree and types of DevOps support. Implement value stream coordination to ensure that the enterprise moves forward with each value stream in lockstep with the enterprise objectives. Align development to business via business context, vision, and Team and Program PI Objectives.

new: Lack of trust definitely
Manage dependencies by applying extensive degree of cooperation; a common value stream backlog; implementation of new, crosscutting capabilities; additional system integration; additional roles and responsibilities; special considerations for pre-, post-, and PI planning activities; different degree and types of DevOps support.

One of the largest projects attempted by the organization definitely
Perform initial exploration of epics and rank them roughly by using Weighted Shortest Job First (WSJF) to determine which epics should move to the next step for deeper exploration. Apply Weighted Shortest Job First prioritization method for job sequencing. Define Weighted Shortest Job First (WSJF).

Poor project planning definitely
Implement value stream coordination to ensure that the enterprise moves forward with each value stream in lockstep with the enterprise objectives. Align development to business via business context, vision, and Team and Program PI Objectives. Create a set of 'SMART' team PI objectives for each individual team with business value assigned.

Project affects a large number of user departments or units somewhat
Demonstrate each new feature in an end-to-end use case. Develop a feature team that is organized around user-centered functionality. Each team is capable of delivering end-to-end value. Feature teams operate primarily with user stories, refactors, and spikes. Use capabilities as end-to-end solution services that support the achievement of user goals.

Project involves the use of new technology definitely
Work with Agile Teams that perform research spikes, create proof of concepts, mock-ups, etc. PI Features are broken into stories and placed on team backlog. Local stories representing new functionality, refactors, defects, research spikes, and other technical debt are identified, written as enabler stories, estimated, and sequenced.

Project involves use of technology that has not been used in prior projects definitely
Work with Agile Teams that perform research spikes, create proof of concepts, mock-ups, etc. Ensure that the demo environments are adequate to the challenge of reliably demonstrating new solution functionality. Local stories representing new functionality, refactors, defects, research spikes, and other technical debt are identified, written as enabler stories, estimated, and sequenced.

Project milestones not clearly defined definitely
Implement value stream coordination to ensure that the enterprise moves forward with each value stream in lockstep with the enterprise objectives. Align development to business via business context, vision, and Team and Program PI Objectives. Create a set of 'SMART' team PI objectives for each individual team with business value assigned.

Project progress not monitored closely enough definitely
Perform system demos. Perform system demo as near as possible to the end of the iteration. Demonstrate each new feature in an end-to-end use case.

Team members lack specialized skills required by the project somewhat
Involve a subject matter expert in basic exploration and sizing. Interact with analyst and subject matter experts during specification workshops. Work with stakeholders and subject matter experts to define the epic and its potential benefits.

Team members not familiar with the task(s) being automated definitely
Develop a feature team that is organized around user-centered functionality. Each team is capable of delivering end-to-end value. Feature teams operate primarily with user stories, refactors, and spikes. Refine the backlog. Become involved with an Agile team for a short period of time.
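
The estimation rows above rely on story points as a planning currency and on capacity allocation informed by known program velocities. As a rough illustration of that reasoning, the following Python sketch estimates how many iterations an epic might take from its size in story points, a list of past iteration velocities, and the share of capacity allocated to it. The function name `iterations_needed`, its parameters, and the example figures are assumptions made for this sketch, not values from the case studies.

```python
import math
from statistics import mean
from typing import Sequence


def iterations_needed(epic_points: float,
                      past_velocities: Sequence[float],
                      allocation: float = 1.0) -> int:
    """Estimate the iterations required to deliver an epic.

    epic_points:     estimated epic size in story points
    past_velocities: story points completed in previous iterations
    allocation:      fraction (0, 1] of capacity reserved for this epic
    """
    if not past_velocities:
        raise ValueError("at least one past velocity is required")
    if not 0 < allocation <= 1:
        raise ValueError("allocation must be in the interval (0, 1]")
    effective_velocity = mean(past_velocities) * allocation
    if effective_velocity <= 0:
        raise ValueError("effective velocity must be positive")
    # Round up: a partly used iteration still occupies the schedule.
    return math.ceil(epic_points / effective_velocity)


if __name__ == "__main__":
    # Hypothetical figures: a 120-point epic, recent velocities of 28, 32, 30.
    print(iterations_needed(120, [28, 32, 30]))                  # full capacity
    print(iterations_needed(120, [28, 32, 30], allocation=0.5))  # half capacity
```

Averaging past velocities is the simplest possible forecast; a team could equally use a pessimistic velocity (for example the minimum of recent iterations) when a schedule commitment is at stake.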

Quadrant 4: Environment
Change in organizational management during the project not at all

Corporate politics with negative effect on project somewhat
Provide decision-making filters in the portfolio kanban, thereby influencing the portfolio backlog. Define Strategic Themes. Formulate Strategic themes.

Dependency on outside suppliers definitely
Business Owner prepares to communicate the business context, including milestones and significant external dependencies, such as those of suppliers. Work with Customers, stakeholders, and Suppliers to establish high-level Solution Intent; help establish the solution intent information models and documentation requirements. Work with Suppliers, making sure the requirements for supplier-delivered capabilities are understood, and assist with the conceptual integration of these concerns.

Many external suppliers involved in the development project definitely
This is an authors' preprint. Please cite as: Sarah Beecham, Tony Clear, Ramesh Lal, and John Noll (2020) "Do Scaling Agile Frameworks Address Global Software Development Risks?" Journal of Systems and Software, Special Issue on Global Software Engineering.