Project duration-cost-quality prediction model based on Monte Carlo simulation

Based on earned value management theory, forecast data for project duration and cost are obtained through Monte Carlo simulation and analyzed in depth using mathematical statistics; project quality is then calculated by establishing quality correlation functions. Building on existing research, the model works at two levels. At the process level, the management focus is identified by calculating process influence and quality. At the project level, the operation of the entire project is simulated and predicted by calculating the total construction period, cost, and quality.


Introduction
Two of the most important tasks in project management are the effective identification and correction of deviations, and the risk early-warning elements of greatest concern to project investors are project duration, cost, and quality. The Critical Path Method, an early correction technique in project management, has many drawbacks [1][2][3]. Earned value management later became a research hotspot because it integrates a wide range of project information [4][5][6]. By analyzing historical data on project progress, many scholars have found that project cost and duration can be described by random variables [7][8][9][10], which means that they can be estimated by Monte Carlo simulation. Many scholars have combined Monte Carlo simulation with Bayesian methods, risk matrices, machine learning, and other techniques to simulate project data, and these studies provide many directions and ideas for subsequent project management research [11][12][13][14][15][16]. Because combining earned value management theory with Monte Carlo simulation can give managers a more accurate decision-making reference, many scholars have studied the combination extensively and in depth [17][18]. Although this research has provided project managers with many management methods, it does not consider the internal connections between the specific simulation objects. Recognizing the mutual influence among project duration, cost, and quality, other scholars have carried out a series of studies that take these relationships into account [19][20][21].
Building on this research, this article combines earned value management theory with Monte Carlo simulation to predict project duration, cost, and quality. Unlike the studies above, it proposes a new approach to forecasting project control elements and a scheme for calculating project quality. On the basis of considering the relationship

Model design
This paper combines earned value management with Monte Carlo simulation and quantifies project quality by constructing duration-quality [21][22] and cost-quality correlation functions. At the activity level, the Monte Carlo simulation data are used to define activity impact [21][23], which allows management priorities to be identified.

Assumptions and symbol definitions
Hypothesis 1: There is a broad correlation between the duration and quality of a single engineering activity, and between its cost and quality. Uncertain factors can directly extend the construction period or force the project into a rushed state; in either case the original resource plan is disrupted and project quality is affected. Uncertain factors can also cause the budget to be overrun or tightened, which affects the purchase and deployment of resources in the system and hence the quality of the process. Leaving aside the relationship between construction period and cost, the hypothesized relationship among construction period, cost, and quality is shown in Figure 1.
Hypothesis 2: Activity duration and activity cost are independent of each other. Many studies have assumed that project duration and cost are independent [17][18][21]. On the basis of that research, this article assumes that the duration and cost of a single project activity are independent of each other, and it expresses activity quality numerically by constructing a duration-quality correlation function and a cost-quality correlation function.
Hypothesis 3: Project duration and cost both follow a triangular distribution. A review of the literature [7][8][9][10] and of practice shows that the completion period and completion cost of activities can be described by a β (Beta) probability distribution or a triangular probability distribution. Because duration and cost forecasts in engineering practice are usually given as optimistic, most likely, and pessimistic values, which matches the three parameters of the triangular distribution, this article uses the triangular distribution to describe project duration and cost.
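As a minimal sketch of Hypothesis 3, the snippet below draws activity durations from a triangular distribution with NumPy. The function name and the three-point values are hypothetical and chosen only for illustration; they are not data from the example project.

```python
import numpy as np

def sample_triangular(optimistic, most_likely, pessimistic, n_draws=10_000, seed=0):
    """Draw n_draws values from a triangular distribution.

    Requires optimistic <= most_likely <= pessimistic and optimistic < pessimistic.
    """
    rng = np.random.default_rng(seed)
    return rng.triangular(optimistic, most_likely, pessimistic, size=n_draws)

# Hypothetical three-point estimate for one activity's duration, in months.
durations = sample_triangular(4.0, 5.0, 7.5)

# The theoretical mean of a triangular distribution is (a + m + b) / 3.
print(durations.mean(), (4.0 + 5.0 + 7.5) / 3)
```

The same call can be reused for cost by passing the optimistic, most likely, and pessimistic cost values.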

Determination of construction period and cost simulation parameters
This part defines the basic indicators of the project simulation in the following order: the formulas for the parameters required by the simulation, the formula for activity quality, and the formula for overall project quality.

Basic indicator calculation definition.
This article assumes that the project activity network diagram is known and that the monitoring time nodes are labeled t (t = 1, 2, …, N). At monitoring point t, activity i has a schedule performance index SPI, a cost performance index CPI, and a schedule-cost joint performance index, calculated by formulas (1) to (3). Two further quantities are the planned single-day value of activity i and the actual single-day completed value of the activity at monitoring point t. To account for uncertain factors that may arise in the remaining work, the current schedule performance index is used to adjust the value of work completed per day for the remainder of activity i; formulas (4) to (6) give these calculations.
Here, PV is the planned value of the part of activity i scheduled for completion by monitoring point t, EV is the earned value of activity i at monitoring point t, and AC is the actual cost of activity i at monitoring point t. The remaining symbols are the planned value at completion, the planned completion period (PCP), and the number of working days of activity i at monitoring point t.
Formulas (1) and (2) are the traditional earned value management indices, while formulas (3) to (6) are the unit-value indicators defined in this article for the activity simulation.
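The traditional earned value indices can be illustrated with a short sketch. SPI = EV / PV and CPI = EV / AC are the standard definitions. The joint index is computed here as SPI × CPI, which is a common convention but an assumption about this article's formula (3); the function name and input numbers are hypothetical.

```python
def earned_value_indices(pv, ev, ac):
    """Return (SPI, CPI, joint index) for one activity at one monitoring point.

    pv: planned value, ev: earned value, ac: actual cost (same currency unit).
    """
    spi = ev / pv    # schedule performance index, standard EVM definition
    cpi = ev / ac    # cost performance index, standard EVM definition
    sci = spi * cpi  # ASSUMPTION: joint index taken as the product SPI * CPI
    return spi, cpi, sci

# Hypothetical activity: 100 planned, 80 earned, 90 spent.
spi, cpi, sci = earned_value_indices(pv=100.0, ev=80.0, ac=90.0)
print(spi, cpi, sci)
```

Values below 1 indicate that the activity is behind schedule (SPI) or over budget (CPI) at the monitoring point.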

Triangular distribution parameter determination.
In this paper, the optimistic and pessimistic values of activity duration and cost are entered into the simulation software as the lower and upper limits of the triangular distribution, and the most likely values are entered as its mode.
The optimistic case assumes that activity i completes its remaining work as originally planned; on this basis the optimistic completion time and the optimistic completion cost (EAC) of the activity at monitoring point t are calculated. The most likely case assumes that the activity continues at the work performance actually observed at monitoring point t; on this basis the most likely completion time and the most likely completion cost are calculated. To allow for uncertainty after monitoring point t, the current performance is adjusted further, and the most pessimistic completion time and the most pessimistic completion cost are calculated accordingly. Formulas (7) to (9) give the optimistic, most likely, and pessimistic values of the construction period of activity i that this paper defines for the Monte Carlo simulation. Formulas (10) to (12) are formulated with reference to the results of Yang Li et al. [17].
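The three cost scenarios resemble the standard estimate-at-completion (EAC) variants of earned value management. The sketch below uses those textbook variants as a stand-in; it is not a reproduction of formulas (10) to (12), and the function name, the symbol `bac` (budget at completion), and the input numbers are assumptions.

```python
def three_point_eac(bac, pv, ev, ac):
    """Return (optimistic, most_likely, pessimistic) cost at completion.

    Uses standard EVM estimate-at-completion variants as an illustration.
    The ordering optimistic <= most_likely <= pessimistic holds when the
    activity is over budget and behind schedule (CPI <= 1 and SPI <= 1).
    """
    spi = ev / pv
    cpi = ev / ac
    remaining = bac - ev                        # budgeted value of unfinished work
    optimistic = ac + remaining                 # remaining work goes as planned
    most_likely = ac + remaining / cpi          # current cost performance persists
    pessimistic = ac + remaining / (cpi * spi)  # cost performance eroded by schedule slippage
    return optimistic, most_likely, pessimistic

# Hypothetical activity: budget 100, planned 50, earned 40, spent 50.
low, mode, high = three_point_eac(bac=100.0, pv=50.0, ev=40.0, ac=50.0)
print(low, mode, high)
```

The three returned values can serve directly as the lower limit, mode, and upper limit of a triangular distribution for the activity's cost.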

Engineering quality calculation model
After the data have been obtained by Monte Carlo simulation, the quality-related parameters are calculated as follows: ① calculate the quality influence function from the simulation results; ② construct the duration-quality function (function 1) and the cost-quality correlation function (function 2), and determine the weight of each; ③ construct the integrated quality function Q and the project quality formula, and compile the activity impact table.
Definition of the impact of activities on the project as a whole.
The impact of an activity's duration, cost, and quality on the project refers to the effect of fluctuations in activity i's duration, cost, and quality on the total duration, total cost, and total quality of the project, respectively.
This article refines the approach of Li Tao et al. [21], using Monte Carlo simulation data to define the impact of a single activity on the overall project in terms of construction period, cost, and quality, as shown in formula (13). Because the formulas are analogous, only the impact of duration fluctuations of activity i on the overall project duration is given; the corresponding formulas for cost and quality are not repeated.
Formula (13) involves the duration impact of activity i and the simulated duration (man-hours) of the activity in the m-th simulation run. It uses the mean E(·) and variance D(·) of the simulated durations of activity i, together with E(T) and D(T), the mean and variance of the total project duration.
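Formula (13) is not reproduced in this text. Because it is built from the means and variances of both the activity duration and the total project duration, one plausible reading is a correlation-type index between the two simulated series. The sketch below implements that reading as an assumption, not as the article's exact definition; the function name and sample data are hypothetical.

```python
import numpy as np

def duration_impact(activity_durations, total_durations):
    """Correlation-type impact of one activity on total project duration.

    ASSUMPTION: the impact is taken as the Pearson correlation between the
    simulated activity durations and the simulated total durations.
    """
    x = np.asarray(activity_durations, dtype=float)
    y = np.asarray(total_durations, dtype=float)
    covariance = np.mean((x - x.mean()) * (y - y.mean()))
    return covariance / np.sqrt(x.var() * y.var())  # population variances

# Hypothetical check: total duration is the activity plus an independent remainder.
rng = np.random.default_rng(0)
activity = rng.triangular(4.0, 5.0, 7.5, size=10_000)
remainder = rng.triangular(10.0, 12.0, 15.0, size=10_000)
print(duration_impact(activity, activity + remainder))
```

Under this reading, a value near 1 means that fluctuations of the activity largely drive the total duration, and a value near 0 means they barely matter. The same function applies to cost by passing simulated costs instead of durations.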

Quality function definition method.
This article assumes that the quality of activity i changes as the activity's duration and cost fluctuate, and that the quality of completed and unstarted activities meets requirements. The duration and cost impacts obtained from repeated simulations are used to weight the contribution of activity i to overall project quality. Whereas Li Tao et al. [21] and Zhang Lianying et al. [22] use only a duration-quality function to calculate the quality of activity i, this paper also constructs a cost-quality function, so that activity quality reflects both duration and cost. The method of calculating overall project quality also differs from the network disjoint method used in those studies: here, weights derived from historical data are used to calculate the quality of activity i and of the project as a whole, according to formulas (14) to (19).
In these formulas, t and c denote the center points of the construction period and cost of the project [23][24]. The project is assumed to be in its best state when duration and cost are at their center points; when the actual duration or cost is below or above the center point, quality is affected. For the m-th set of duration and cost data simulated by Monte Carlo, the formulas give the duration fluctuation function value and the cost fluctuation function value of the process, and from them the quality of activity i, which lies between 0 and 1. The total quality of the project is obtained from the activity qualities. The degree of impact of activity i's duration fluctuations on the overall project duration and the degree of impact of its cost fluctuations on the overall project cost are both calculated by formula (13).
The weight with which activity i represents overall project quality is calculated from the duration impact of all unfinished activities on the total project duration and their cost impact on the total project cost, using the impact values obtained after multiple simulations of duration and cost; the calculation is shown in formula (19). The value of a is generally close to 3 [22]. The weights α and β satisfy α + β = 1 and express whether process i is cost-sensitive or time-sensitive. r is an empirical constant of the project: the more optimistic the expectation of activity quality, the smaller its value. r, α, and β are determined by the project leader according to the project type and characteristics.
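The weighting step can be sketched without the exact duration-quality and cost-quality functions, which are not reproduced in this text. The snippet assumes a linear weighted combination: activity quality is α times the duration-based quality plus β times the cost-based quality, and project quality is a normalized weighted sum of activity qualities. Function names, the assignment of α to duration and β to cost, and all input values are assumptions.

```python
import numpy as np

def activity_quality(q_duration, q_cost, alpha, beta):
    """Combine duration-based and cost-based quality values (each in [0, 1]).

    ASSUMPTION: linear combination with alpha + beta = 1.
    """
    if abs(alpha + beta - 1.0) > 1e-9:
        raise ValueError("alpha and beta must sum to 1")
    return alpha * np.asarray(q_duration, dtype=float) + beta * np.asarray(q_cost, dtype=float)

def project_quality(activity_qualities, weights):
    """Weighted average of activity qualities, one row per activity.

    activity_qualities: shape (n_activities, n_simulations).
    weights: relative weights per activity, normalized inside the function.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(activity_qualities, dtype=float)

# Hypothetical single simulation run for three activities, alpha:beta = 4:6.
q = np.vstack([
    activity_quality([0.80], [0.85], alpha=0.4, beta=0.6),
    activity_quality([0.90], [0.80], alpha=0.4, beta=0.6),
    activity_quality([0.75], [0.85], alpha=0.4, beta=0.6),
])
print(project_quality(q, weights=[8, 1, 2]))
```

Because the weights are normalized, a ratio such as 8:1:2 can be passed directly.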

Overview of calculation methods
After the project starts, activities fall into three states: completed, in progress, and not yet started. For completed activities, the actual data are written directly into the model. For activities in progress, the unfinished part is obtained by simulation and added to the actual data for the finished part. Activities not yet started are entered according to the planned data. The simulation tool used in this paper is Crystal Ball, simulation software developed by Oracle. The simulation proceeds as follows:
① Process the basic data at the project monitoring point and determine which activities need simulation prediction.
② Calculate the basic indicators according to formulas (1) to (6).
③ Calculate the optimistic, most likely, and pessimistic values of activity duration and cost, and the other parameters required for simulation, according to formulas (7) to (12).
④ Define the software parameters from these values according to the state of each activity at monitoring point t.
⑤ Establish the relationships between the variables in the Excel table.
⑥ Run the simulation software and analyze the estimated results.
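The steps above can be sketched in Python as a stand-in for the Crystal Ball and Excel workflow. The four-activity network, its precedence relations, and its three-point estimates are hypothetical and are not the example project's data; each run samples every activity from a triangular distribution and propagates finish times through the network.

```python
import numpy as np

# Hypothetical network: A precedes B and C; B and C both precede D.
# Values are (optimistic, most likely, pessimistic) durations in months.
ESTIMATES = {"A": (2, 3, 5), "B": (4, 5, 8), "C": (3, 6, 7), "D": (1, 2, 3)}
PREDECESSORS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
ORDER = ["A", "B", "C", "D"]  # a topological order of the network

def simulate_total_duration(n_runs=10_000, seed=0):
    """Return an array of simulated total project durations, one per run."""
    rng = np.random.default_rng(seed)
    draws = {
        name: rng.triangular(low, mode, high, size=n_runs)
        for name, (low, mode, high) in ESTIMATES.items()
    }
    finish = {}
    for name in ORDER:
        start = np.zeros(n_runs)
        for pred in PREDECESSORS[name]:
            # An activity starts when its latest predecessor finishes.
            start = np.maximum(start, finish[pred])
        finish[name] = start + draws[name]
    return finish[ORDER[-1]]

total = simulate_total_duration()
print(total.mean(), total.std())
```

For a completed activity, the triangular draw would be replaced by its actual duration; for an activity in progress, only the remaining part would be sampled.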

Basic data of the calculation example
An infrastructure project completed by an infrastructure company in 2019 is chosen as the example for the simulation. The project has nine activities, whose logical relationships are shown in Figure 2. The total planned duration (PD) of the project is 20 months, and the total planned cost is 2.9 million yuan. A project risk assessment is planned every 2 months, and the progress after 8 months of construction is shown in Table 1. The project is a cost-sensitive general project, and the simulation parameters are α:β = 4:6, a = 3, and r = 25.

Project total construction period and cost forecast
Using formulas (1) to (12) in Part 3, the earned value indicators and the triangular distribution parameters are calculated as shown in Table 2. With these data, Crystal Ball is used for the simulation, with the number of runs set to 10,000. The results show that the total project duration follows a triangular distribution and the total cost follows a Beta distribution. Based on the data from 8 months of construction, the project has an 80% probability of being completed within a total construction period of 22.54 months and within a total cost of 3.0923 million yuan. The forecast mean project duration is 22.06 months with a standard deviation of 0.53 months; the forecast mean total cost is 3.0593 million yuan with a standard deviation of 3.82 (apparently in units of 10,000 yuan, that is, about 0.0382 million yuan). The software simulation results are shown in parts A and B of Figure 3.
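The reported mean, standard deviation, and 80% value are summary statistics of the simulated sample. A minimal sketch of how such figures can be read from a set of simulation results follows; the function name and the sample are hypothetical and do not reproduce the Crystal Ball output.

```python
import numpy as np

def forecast_summary(samples, certainty=80):
    """Summarize simulated totals: mean, sample standard deviation, and the
    value that is not exceeded in `certainty` percent of the runs."""
    s = np.asarray(samples, dtype=float)
    return {
        "mean": s.mean(),
        "std": s.std(ddof=1),
        "value_at_certainty": np.percentile(s, certainty),
    }

# Hypothetical simulated total durations in months.
rng = np.random.default_rng(0)
simulated = rng.triangular(21.0, 22.0, 23.2, size=10_000)
summary = forecast_summary(simulated)
print(summary)
```

The value at the 80% level is always at least the median, so for a roughly symmetric forecast it lies above the mean, as it does in the reported results.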

Completion quality prediction and activity influence calculation
With the simulation data, formulas (14) to (18) are used to calculate the quality of activities 4 to 6, as shown in parts a, b, and c of Figure 4. After repeated simulation experiments, the final weights of activities 4, 5, and 6 in the overall project quality are in the ratio 8:1:2.
The predicted project completion quality calculated by formula (19) is shown in the remaining part of Figure 4, and the overall impact of the three activities on the project is shown in Table 3. Figure 4 shows that activity quality follows a right-skewed Beta distribution, consistent with the results of Li Tao et al. [21], Sheng Ju et al. [25], and Wang Chunzhi et al. [26]. The mean quality of activities 4, 5, and 6 is 0.8279, 0.8339, and 0.8160, with variances of 0.0224, 0.0205, and 0.0257, respectively. When activity man-hours and cost jointly characterize quality, activity 5 therefore performs best and activity 6 worst; because activity 6 also has the largest variance, it carries the greatest quality risk. The overall quality of the project has a mean of 0.83 and a variance of 0.01. Table 3 shows that in the later stage of the project, schedule and quality management should focus on activity 4, and cost management should focus on activity 3.

Conclusions
This paper uses a large amount of simulation data to address a shortcoming of traditional earned value theory, namely that a single predicted value has little reference significance. Building on existing research, it constructs a project cost-quality correlation function using mathematical statistics and calculates activity quality by combining the duration-quality and cost-quality functions with weights. For total project quality, the duration and cost impacts obtained from repeated simulations determine the weight with which each activity represents overall quality. The simulation results indicate that the approach is feasible for predicting project schedule, cost, and quality and for identifying the impact of activities on them, which supports the rationality and effectiveness of the simulation model.
Future work could consider the interaction between activities and the interaction between different indicators. Taking more uncertain factors into account could further improve prediction accuracy.