A systems approach to life cycle risk prediction for complex engineering projects

To deliver challenging and complex engineering projects successfully, an organisation must have an in-depth understanding of the technical and commercial risks. Without this knowledge, decisions can fail to address, manage or mitigate potential and residual risks, causing cost blowouts, schedule delays and technical failures. This paper presents a new method that both quantifies and models the relative risk profile of a project throughout the project lifecycle. It allows the continued management and visualisation of risks and enables a process of dynamic analysis that progressively reduces or mitigates residual risks to acceptable levels. This research explores the use of an enterprise model based on three elements: product, process and people. These elements interact but are constrained in a business/engineering environment. The elements can be used to develop a ubiquitous set of generic project risks prevalent across complex engineering projects. The method is illustrated by three case studies taken from the defence environment, but the general theory and method can be applied to non-defence organisations and industries.

Subjects: General Engineering Education; Mathematics & Statistics for Engineers; Engineering Management


ABOUT THE AUTHORS
Matthew Cook is a Lead Engineer who has worked primarily in the defence industry for over 10 years. He completed a postgraduate thesis that focused on risk analysis and quantitative modelling of complex engineering projects. This work has led to extended research as a PhD Candidate at RMIT University, Australia.
John P.T. Mo is Professor of Manufacturing Engineering at RMIT University, Australia. Prior to joining RMIT, he was a Senior Principal Research Scientist in CSIRO for 11 years leading research teams in manufacturing and infrastructure systems. He has over 350 refereed publications including one patent and three books.

PUBLIC INTEREST STATEMENT
Very complex engineering projects such as the design and build of ships and aircraft are extremely challenging and risky. In order to ensure they meet the technical, financial and delivery challenges, organisations need to carefully manage the projects' risks. This is far from straightforward, as even the identification of risks can prove difficult and ambiguous. Risk is the probability of an event occurring, multiplied by the consequence or severity. These risks need to be controlled and mitigated throughout the project to ensure success. This research has focused on developing a mathematical model that provides a clear visualisation interface of risk levels (as a profile) and allows the constant management and monitoring of risks throughout the project lifecycle.

https://doi.org/10.1080/23311916.2018.1451289

Introduction
Many of the platform systems used in the defence environment have significant service life expectations. Modifications and enhancements are inevitable to ensure the platform continues to meet the capability need of the customer and address changes/advancements in technology. In some cases, changes to the baseline are un-configured and, as a consequence, projects can prove ambiguous and challenging. This can lead to key risks and their severity going unrecognised or misunderstood, resulting in technical failures, safety concerns, schedule and budget blowouts and ultimately a disgruntled customer.
In order to develop a sound strategy when undertaking complex projects, engineering organisations tend to develop their own bespoke methods for handling the risk of project failure, with varying degrees of success (Baccarini, 1999). Risk is defined in ISO 31000 (ISO, 2009) as "the possibility that something unpleasant or unwelcome will happen". When undertaking extensive, highly complex and challenging projects, it is essential that any large organisation develops a sound understanding of the risks that may preclude success.
Unfortunately, it is all too common in industry to find that risk is treated as an afterthought and, in some cases, as a box-ticking exercise. It is critical that a strategy is developed for handling/mitigating the identified risks to ensure the success of the project (Felder & Collopy, 2012). How well this is achieved can make or break the project and even the organisation. There are many risk management and analysis tools available in the marketplace today which offer a variety of attributes (Sharma & Bhat, 2012). Many of these tools are underpinned by the Monte Carlo analysis method, a random-sampling technique developed in the 1940s that estimates uncertain quantities (risks) by simulating a large number of possible outcomes or scenarios. These forecast outcomes can go some way towards informing project decisions, but the analysis process can be complicated, time-consuming and tedious.
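The Monte Carlo idea mentioned above can be illustrated with a minimal sketch. It is not taken from any of the commercial tools cited: the function name, the risk register values and the assumption that each risk occurs independently with a fixed impact are all hypothetical and chosen only to show the principle of sampling many scenarios and summarising the outcomes.

```python
import random
import statistics


def simulate_total_impact(risks, n_trials=10_000, seed=0):
    """Return one simulated total impact per trial.

    risks: list of (probability, impact) pairs. Assumption: each risk
    occurs independently with the given probability and, when it occurs,
    adds its fixed impact (for example, days of delay) to the total.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    totals = []
    for _ in range(n_trials):
        total = sum(impact for p, impact in risks if rng.random() < p)
        totals.append(total)
    return totals


# Hypothetical register: (probability of occurring, delay in days)
example_risks = [(0.30, 20.0), (0.10, 60.0), (0.50, 5.0)]
totals = simulate_total_impact(example_risks)
mean_delay = statistics.mean(totals)
p90_delay = statistics.quantiles(totals, n=10)[-1]  # 90th percentile
print(f"mean delay: {mean_delay:.1f} days, 90th percentile: {p90_delay:.1f} days")
```

The mean summarises the expected delay, while the upper percentile gives a feel for the scenarios a project manager would want to plan against.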
In some organisations, risk management and analysis tools may be used by a dedicated risk professional who is trained in such practice. However, in many cases risk analysis is left to either the Project Manager or the Engineer, depending on the type of organisation (Millet & Wedley, 2002). While these individuals are no doubt well aware of potential risks relating to their project, the need for a sound understanding of complex risk modelling tools, combined with schedule pressures, can lead to poor risk capture and analysis, and in particular to a lack of versatility in managing the risks throughout the project lifecycle. This research has focused on developing a quantitative risk model that can identify risks, develop a risk profile that can be presented in a visual format, and manage/track residual risks throughout a project's life cycle.
Cohn (2010) proposes a relationship between the potential number of days lost due to software development problems and the number of software iterations. The model, known as a risk burndown chart, is limited to software effort loss during the development stages. No details are offered on how it could be extended to the more complicated financial and schedule impacts found in complex engineering projects.
There is significant value in an analysis tool that provides an organisation with a clear understanding of project risks both prior to and throughout the lifecycle of a project. The proposed tool is underpinned by a risk model which is constructed on a comprehensive enterprise model and can generate a quantitative risk profile. The methodology for the analysis of risks and the development of a risk model for managing complex engineering projects is outlined in this paper. This includes an explanation of how the risk model can assist large organisations in identifying, visualising and managing risks through the project life cycle. It is important to highlight that a number of the calculation methods used in this particular example are not mandatory; the model has been designed with a certain amount of flexibility for alternative interpretations. In addition, the methods by which data were collected and analysed for the risk model development are described.
This paper offers a method of initially associating generic risks within a defined framework. It then details a process for assessing risks based on their severity for each project. This data is subsequently modelled, and a novel approach of comparison with a baseline or Ideal project is proposed, along with a method of presenting and managing project risks throughout the lifecycle of the project. Although focused primarily on Engineering Services Support to the Australian Naval Maritime environment, this research is transferable to other disciplines, industries and countries.

Literature review
There were several distinct stages in the approach taken by this research. Initially, an investigation into the contractual environment affecting defence industry stakeholders and the government was undertaken. From there, focused research was conducted within the Australian Naval Maritime sector to develop an analysis method that could improve the current methods used by the industry for identifying, understanding and managing risks throughout project lifecycles. The literature review is organised in four sections where the research gaps are highlighted.
Johnson (2007) identified that risk management provides the most important single framework for strategic, tactical and operational decision-making across the US Military. Composite Risk Management (CRM) has been introduced to guide decision-making across the US Army in training, combat and peacekeeping operations as well as off-duty activities. The paper went on to identify some of the paradoxes that complicate the application of risk management to guide military decision-making but the coverage is incomplete. Many risks present in operations are not explored or accounted for within the work.

Identification of risks
In the area of technical risk assessment, Cork (1998) described a framework for assessing the risk that a proposed machine or system (such as a ship) would not operate to its required level when first developed. The method was based upon the breakdown of the system under assessment into a hierarchy of functionally and/or structurally defined assessment areas. Within each area, technical risks, and methods of assessing these risks, were identified. The framework provides a systematic structure for selecting assessment methods and integrating their results into the overall assessment of the system, but it lacked ongoing management of risks through a project lifecycle.
Mo (2014) described the application of performance measures to assets and support services. The methodology details a number of calculations that provide both cost and availability measures. The calculations also provide indicators of where a company should increase capacity, effort and expenditure to reduce or mitigate risk. One current method is known as the 3PE model, which categorises risks into three groups: Product, Process and People in an Environment. Although limited to maintenance cases, this model appears to be relevant to this research and could be modified and potentially used to assess project risks.
In their paper on Hybrid Risk Management methodology, Ting, Kwok, and Tsang (2009) highlighted that a large variety of tools and approaches had been introduced for risk management in the preceding decade. The research, however, focuses primarily on Hierarchical Holographic Modelling (HHM) and Enterprise Risk Management (ERM). It was proposed that HHM could be used to identify particular events or circumstances (risks and opportunities) and ERM to assess the risks in terms of likelihood and magnitude. Business Recovery Planning (BRP) was used to create and validate a logical plan for how an organisation would recover after a critical risk was realised. However, this approach was not quantitative and hence the process could alter with changes of personnel over the project lifecycle.
Yim, Castaneda, Doolen, Tumer, and Malak (2013) explored the relationship between different engineering design functions and the presence or absence of risk indicators in system level design projects. A comparative analysis of risk indicators was completed to identify those indicators that appear to be related to the level of functional complexity. Engineering project managers could utilise the results of this research to develop appropriate risk mitigation plans for engineering design projects based on the level of functional complexity. These plans, however, were qualitative and the method is unable to prioritise risks by numerical assessment.
Walker, Holmes, Hedgeland, Kapur, and Smith (2006) propose that technical risk is the product of the probability of a technical event and the cost of that event. A technique for more objectively assessing and communicating technical risk in an evolutionary development is proposed, and a tool realising this technique has been developed for the Eclipse IDE. The tool offers insight into technical risk estimation but does not account for commercial and financial risk.
Davis, Finlay, McLenaghen, and Wilson (2006) found that operational risk management was one of the outstanding action items on most firms' to-do lists. They concluded that companies should invest in Key Risk Indicators (KRI), which could help improve the ability to convey risk appetite, optimise risk and return, and improve the likelihood of achieving primary business goals.
Fattahi and Khalilzadeh (2016) evaluated risk based on the failure mode and effects analysis (FMEA) method and incorporated an extended fuzzy analytic hierarchy process (AHP) and fuzzy multiplicative multi-objective optimization risk assessment (MULTIMOORA) to cater for the weighting requirements of three criteria: time, cost and profit. This forms part of a rather complicated multi-criteria decision-making indicator used in an industry case study.

Modelling of risks
In large engineering projects the chief element of risk arises from the fact that there are many variables and every step through the lifecycle of a project is laden with risk. Ayyub (2003) explained the fundamental concepts, techniques and applications of uncertainty and detailed risk modelling and analysis. The computational algorithms illustrated data needs, sources and collection. Practical use of the methods was presented, but further development was required for the techniques to be applied effectively in large-scale engineering projects. Cornalba and Giudici (2004) studied the risks to which a banking organisation could be subjected. They presented possible approaches based on Bayesian networks to measure and predict operational risks. This research shows that the ability to model and quantify risks is fundamental to managing them and has been attempted in many industry sectors. This paper approaches the problem from the whole-of-system perspective in order to develop a workable model for representing the risks in operations; however, there is virtually no development of such techniques in other project sectors.
The collection of data in both qualitative and quantitative forms presents several challenges. It is this data that forms the backbone of any subsequent analysis and modelling. New approaches and tools combining soft and hard modelling are needed to deal with increasing complexity. Szczerbicki and Orlowski (2003) suggest that a new approach based on feedback mechanisms in combination with fuzzy methodologies is required. The approach proposed in their paper is a combination of fuzzy logic techniques used for knowledge acquisition and processing, and non-fuzzy models to support project management processes. Pedroni, Zio, Pasanisi, and Couplet (2017) provided guidelines for the treatment of uncertainty in risk assessment. The guidelines addressed modelling issues in terms of the dependency of input variables and parameters as well as quantitative representation. They also reviewed tools derived from classical probability theory. This research highlighted a general direction for the modelling approach, but application to specific cases was left open for further work.
In order to analyse and measure risks, Liew and Lee (2012) proposed a risk mitigation framework which they divided into four main steps and a quantitative method. During the risk identification and assessment stages, the authors provided several methods of risk classification and early elimination. It was reasoned that risks with a low probability of occurrence and a lower loss potential to the organisation can be eliminated. The authors went on to describe how suggestions to mitigate risks are proposed and tested in order to find which are the best. Finally, risk monitoring was recommended to ensure that the risk management techniques being practised are still effective and that any new risks are identified and dealt with in the dynamic business environment. This work focused primarily on supply chain activities and was not expanded to cover other areas of engineering project risk.

Current industry practice on project risk management
A typical engineering project in the defence environment will go through some form of Systems Engineering process (Defence, 2012). Such a process usually includes a number of mandatory stages and theoretical gates which need to be passed before the change can be progressed. Through its lifecycle management strategy (Cook & Mo, 2015), an organisation like BAE Systems operates a Risk and Opportunity Management Plan (ROMP) for business units (Figure 1). The main tool for risk management in ROMP used at BAE Systems Australia is a risk register which is usually populated by project managers and engineers. This Systems Engineering Management process (SEMP) has been used as a basis for the design lifecycle in industry. Each stage encompasses a series of gates that must be achieved before the task can progress. There are extensive risks associated with each of these gates that must be mitigated or resolved to an acceptable level to ensure the success of the project.

Research gap
This literature review highlights that there has been extensive research on risk prediction in complex project management, but there are deficiencies in several areas. The identification of risks is still an intuitive process that depends both on who the risk analyst is and on how the analysis is conducted. The 3PE model, reviewed in Section 2.1, can be explored further with a focus on how to systematise the risk identification process so that a consistent set of risk items can be identified. The incorporation of indicators is considered a step towards providing an objective value for decision makers. However, the use of indicators without understanding the nature of the risk could lead to incorrect decisions and needs to be tackled. The link between indicators and the nature of risk (i.e. modelling) requires further research. On the modelling of risks, Bayesian and fuzzy methods are primarily tools for manipulating risk values, so they are useful for deriving indicators. The risk itself is still treated as a black box, and a method to visually bind the nature of risk to the indicator is still lacking.

Figure 1. Systems engineering management process used in a typical defence company.
Hence, it is understandable that industry practice is to use a SEMP to manage the situation because it is basically a step-wise gate control process.
This research considers all these deficiencies and develops a quantitative process that can be used in computational analysis and can cover all types of risk in the life cycle of engineering system development. The method is based on the 3PE model and expands it into a comprehensive risk profile throughout the life cycle. This novel process is illustrated with three case studies as examples for further research to be implemented in a real industry environment.

Research approach
The primary aim of this research is to develop a novel risk model that provides continuous visualisation and management of risks throughout the project lifecycle. Based on a generic enterprise model framework, the risk model is designed to generate a quantitative risk profile. This profile is built by segmenting the enterprise into functional elements which will theoretically allow the concise identification and visualisation of key risk drivers and the dynamic management of risks through the project life.
In order to set a qualitative baseline that could then be used for both quantitative assessment and analysis, an investigation into risks surrounding complex engineering projects was undertaken based on the 3PE model described by Mo (2014), see Figure 2. The main elements in the 3PE model are people, process and product, which are located within an environment. For each of the elements, a list of generic risks more or less common to all engineering projects was developed; in total, over three hundred risks were listed.
A survey was developed based on these 3P elements. The data generated from the risk survey was analysed and interpreted by various methods to determine meaningful and useful results. Visualisation tools were also employed to highlight, manage and control risk as a project progressed through the life cycle. The outcomes of this analysis can then be used as a basis to plan necessary risk mitigation actions that can significantly reduce the risk of conducting complex engineering projects. The risk indicator, which can be used to assess/measure how risky a project is through the lifecycle, can be estimated from the 3PE model as a normalised distribution of risk in the project and is denoted by $N(\mu_j, \sigma_j)$, where $j$ is a particular instance of a project. The risk value is high if the project is risky (i.e. has a high probability of failure). Hence, a risky project will be indicated by a high mean value from an assessment of the 3PE model.
In order to evolve the risk model further, the theory of generating a percentage of failure for a given project was explored. The hypothesis is that an "Ideal" or "Perfect" project would have minimal risk that could be easily mitigated, and would have the lowest percentage of failure among all projects, so that it can be established as the benchmark. The Ideal project is defined as a distribution $N(\mu_I, \sigma_I)$.
To calculate the risk of not achieving the Ideal project level, the differential distribution is used to show the risk of the project in relation to the Ideal project. The mean and standard deviation of the differential distribution are calculated using Equations (1) and (2), and the risk indicator at the time of measuring the risks is then defined by Equation (3). The Ideal project is thought to have the minimum risk score in all aspects of the project 3PE factors. Hence, any other project will have a probability of failure higher than that of the Ideal project. Because of the SEMP, it is natural to think of each of the stages as being designed to mitigate or resolve some of the unknowns in the project lifecycle. Hence, the normalised distribution of each project should also be a function of time. This means that Equation (3) is not a static estimate, and the mean of the Ideal project can vary with time $t$ as the project progresses through the different stages of the SEMP, as shown in Equation (4).
In a properly managed project, the project failure function $F(t)$ should decrease over time due to efforts to decrease $\mu_j(t)$ towards $\mu_I(t)$ at the different stages of the SEMP.
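The displayed forms of the equations referred to above are not reproduced in this text. One reading consistent with the surrounding description is sketched below, offered as an assumption rather than as the authors' exact formulation. It treats the project and Ideal distributions as independent normals; the symbols $X_d$, $\mu_d$, $\sigma_d$ and $\Phi$ (the standard normal cumulative distribution function) are introduced only for this sketch.

```latex
\begin{align*}
\mu_d &= \mu_j - \mu_I
  && \text{mean of the differential distribution} \\
\sigma_d &= \sqrt{\sigma_j^{2} + \sigma_I^{2}}
  && \text{standard deviation, assuming independence} \\
F &= P(X_d > 0) = \Phi\!\left(\frac{\mu_d}{\sigma_d}\right),
  \quad X_d \sim N(\mu_d, \sigma_d)
  && \text{risk indicator at the time of measurement} \\
F(t) &= \Phi\!\left(\frac{\mu_j(t) - \mu_I(t)}
  {\sqrt{\sigma_j(t)^{2} + \sigma_I(t)^{2}}}\right)
  && \text{time-dependent form}
\end{align*}
```

Under this reading, $F(t)$ falls towards 0.5 as $\mu_j(t)$ approaches $\mu_I(t)$, which matches the expectation that the failure function decreases in a properly managed project.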

Case studies
A core aim of this research is to address the inherent subjectivity of risk assessment, along with defining a baseline or "Ideal" project that has minimal risk factors and achieves appropriate success throughout the project lifecycle. This Ideal project can be used for comparison against risk analysis conducted on new projects in a similar manner. The closer a new project aligns with the Ideal project, the lower its chance of failing could theoretically be considered to be. Organisations can subsequently assess their risk profile for a project and determine what strategy and/or approach is required to reduce the risk percentage to an acceptable level to ensure project success.
In order to make sense of the data and begin to develop a risk model, the idea of fitting the data to a normal distribution was explored. This would allow data from the three projects to be compared not only within the 3PE model sections but, more importantly, between the three projects and against the Ideal project. Three projects from BAE Systems Australia-Naval Defence were chosen for this research. While fundamentally different, each project was reasonably well understood and at a different stage of the life cycle.
(1) PL1 = This project was completed on budget and schedule with successful commissioning on site and acceptance by the customer. This project is considered medium size and combined OEM equipment with BAE Systems design and installation. The normalised distribution of risk in this project is denoted by $N(\mu_1, \sigma_1)$.
(2) PL2 = This is a current project that is an alliance between BAE Systems, another company (located within Australia) and the Commonwealth of Australia (CoA). During this programme, some of the highest risks related to the uncertainty in the development of new technology by the collaborating company. This is considered a major project, with significant risk surrounding the product. The normalised distribution of risk in this project is denoted by $N(\mu_2, \sigma_2)$.
(3) PL3 = BAE Systems Naval Maritime has been tasked with the design, manufacture and installation of an enhancement for a specific class of ships for the RAN. The project is considered medium size, with risks regarded as manageable because the design, fabrication and installation are to be fully controlled by BAE Systems. The normalised distribution of risk in this project is denoted by $N(\mu_3, \sigma_3)$.
To understand the perceived severity of the risks in the three projects, the risk registers of the three chosen projects were examined and the risk items were categorised under the 3P elements, i.e. People, Process and Product. Over 1,500 items were found but, unfortunately, many items were incomplete and the records could not be used to form a definitive database for this research. To overcome this, these items were used as a basis to generate a survey. However, it was not feasible to ask respondents to answer such a large number of questions. After several iterations of simplification, a survey was developed that contained 30 questions, with 10 for each of the 3P elements. The questions are attached in Appendix 1.
To provide quantitative data for this research, the respondents were asked to score each of the generic risks out of 10, with a score of 1 representing no perceived risk and 10 representing extreme risk. Sixteen staff from the Engineering and Project Management teams were asked to participate in the survey. The respondents included design engineers, project managers, draftsmen, cost analysts and ILS (integrated logistics support) engineers. A return rate of 87.5%, i.e. 14 questionnaires, was recorded.
Due to the constraints of the research, only a relatively small data-set was obtained from the survey. To achieve some meaningful analysis from the data, it was assumed that the data is normally distributed. For each project, the mean and standard deviation were calculated and the results can be seen in Table 1. To visualise the resulting data, a bell-curve for the combined 3P risk values is shown in Figure 3.
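The step from survey scores to a normal risk profile can be sketched as follows. This is a minimal illustration only: the scores are hypothetical and are not the survey data behind Table 1, the function name is invented for this example, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def risk_profile(scores):
    """Return (mean, standard deviation) for a list of 1-10 survey scores.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    if any(not 1 <= s <= 10 for s in scores):
        raise ValueError("scores must lie between 1 and 10")
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical scores for one project (not the data behind Table 1)
hypothetical_scores = [3, 5, 4, 6, 2, 5, 4, 7, 3, 5]
mu, sigma = risk_profile(hypothetical_scores)

# Normal curve fitted to the scores, usable for plotting a bell-curve
curve = statistics.NormalDist(mu, sigma)
density_at_mean = curve.pdf(mu)
print(f"mean = {mu:.2f}, standard deviation = {sigma:.2f}")
```

Evaluating `curve.pdf` over a range of score values gives the points needed to draw a bell-curve of the kind described for Figure 3.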
From the perceived nature of the three projects within BAE Systems Australia, it was generally acknowledged that serious challenges relating to task PL2 needed to be mitigated and it is therefore considered a "risky" project. PL1 has actually been completed and was generally considered a success, while task PL3 has a clear scope and is predicted to sit somewhere between projects PL1 and PL2. This is reflected in Figure 3 where PL2 has a higher risk severity than the other two projects (clearly shifted more to the right on the graph).
To further expand the risk model, a measure of the percentage of failure that a particular project could expect was explored. As previously mentioned, the PL1 project is considered a successful project and it is reasonable to judge that the data from this project could be a starting point for success. The strategy taken by this research is to assume that an Ideal project would be better than the PL1 results, for each question, by one point on the 1 to 10 scale (i.e. 10%). A new project that is modelled against the Ideal project must align within a tolerance band for the project to approach success (see Figure 4). While it must be acknowledged that there are other methods of setting a benchmark or Ideal project, in the context of this research the choice does not affect the methodology discussed in this paper.
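The benchmark construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the example scores are hypothetical, and clamping at the scale minimum of 1 (so that a score cannot fall below "no perceived risk") is a choice made for this sketch rather than something stated in the text.

```python
import statistics


def ideal_scores(pl1_scores, improvement=1, floor=1):
    """Derive Ideal-project scores by lowering each PL1 score by one point.

    Assumption: scores are clamped at `floor`, the lowest value on the
    1-10 scale, so a question already scored 1 stays at 1.
    """
    return [max(floor, s - improvement) for s in pl1_scores]


# Hypothetical per-question PL1 scores (not the survey data)
pl1 = [4, 3, 5, 1, 6, 2]
ideal = ideal_scores(pl1)
print(ideal)
print(statistics.mean(pl1), statistics.mean(ideal))
```

The mean and standard deviation of the derived scores would then play the role of the Ideal project distribution in the comparison with each real project.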
The risk indicators for new projects as compared to the Ideal project can then be expressed as the probability that the differential distribution is greater than zero. The results of the calculation can be seen in the graph in Figure 5, which can be interpreted as follows, using PL2 as the example. The differential normal distribution of PL2 against the Ideal project is calculated by Equations (1) and (2) as N(0.9148, 2.2580) using the data in Table 1. According to Equation (3), the probability of failure is the area under the curve to the right of the Y-axis, i.e. 73.3%. Likewise, the probabilities of failure for PL1 and PL3 are 64.1% and 61.7% respectively. Hence, PL2 is clearly a risky project.
It should be emphasised that these results are essentially a snapshot of three projects that were progressing through the design process, in this case the BAE Systems SEMP. Although the data was collected at different stages of the lifecycle for each project, the survey was targeted at understanding perceived risks at the infancy of each project. The computed risk values in Table 1 reflect the level of risk relevant to the position of each project (PL1, PL2 and PL3) in reference to the SEMP. Management is then able to adjust the course of actions and mitigation strategies dynamically throughout the lifecycle of the projects.
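The area calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that the project and Ideal distributions are independent normals, so that the differential has mean equal to the difference of the means and variance equal to the sum of the variances, and that the second parameter of each distribution is a standard deviation. The function names and the example inputs are hypothetical.

```python
import math


def differential(mu_j, sigma_j, mu_i, sigma_i):
    """Mean and standard deviation of (project - Ideal).

    Assumption: the two normal distributions are independent.
    """
    return mu_j - mu_i, math.sqrt(sigma_j ** 2 + sigma_i ** 2)


def probability_above_zero(mu_d, sigma_d):
    """Area of N(mu_d, sigma_d) to the right of zero, via the error function."""
    return 0.5 * (1.0 + math.erf(mu_d / (sigma_d * math.sqrt(2.0))))


# Hypothetical project and Ideal parameters (not the values in Table 1)
mu_d, sigma_d = differential(5.0, 3.0, 4.0, 4.0)
failure = probability_above_zero(mu_d, sigma_d)
print(f"differential N({mu_d}, {sigma_d}), failure indicator = {failure:.3f}")
```

Because the text does not state whether the second parameter of N(0.9148, 2.2580) is a standard deviation or a variance, this sketch should not be expected to reproduce the quoted percentages exactly.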
Since PL2 and PL3 were still in progress at the time of this research, the above analysis suggests that PL2 would need more attention, and perhaps more resources, to mitigate its higher risk profile. PL3 and PL1 have a similar risk level. Since PL1 has been completed without any major problem, PL3 can potentially be managed with a routine risk management process.

Conclusion
This paper considers the deficiencies of current risk identification and assessment methods for complex engineering projects and develops a quantitative process that provides a risk profile of a project under three generic system elements: people, process and product. The method can also be used to track the change of the risk profile in the system engineering lifecycle. The proposed risk model not only identifies the risks but also, and crucially, allows managers and engineers to both visualise the risks and manage them at different stages of the project's development. The risk model developed in this research uses a normal distribution to form the basis of computing the failure probability for each of the projects. The profiles are then compared, and an Ideal project can then be established as the relative benchmark.
The set of generic risks and the computational risk profile are applied to three well understood projects, as a method of generating quantifiable data. The early stages of a risk model were developed to compare the risk profile of the three projects and the initial results proved encouraging. The baseline ideal project was proposed as a test case for meaningful comparison among the three projects and a percentage of failure value for each of the projects was established. While the results appear to follow the perceived nature of the three projects, the risk model is by no means conclusive as a data-set of three projects is potentially inadequate due to constraints within the projects. It is therefore planned that as the research continues, more completed projects will be modelled and surveyed so the database for determining the Ideal project can be expanded so that a more generalised benchmark can be established.