Stability Assessment Method Considering Fault Fixing Time in Open Source Project

Recently, open source software (OSS) has been adopted in various situations because of quick delivery, cost reduction, and standardization of systems. Many OSS are developed under a peculiar development style known as the bazaar method. Under this method, faults are detected and fixed by users and developers around the world, and the fixes are reflected in the next release. Also, the fault fixing time tends to become shorter as the development of OSS progresses. However, several large-scale open source projects have the problem that fault fixing takes a long time because the fault correctors cannot handle the many fault reports quickly. Furthermore, imperfect fault fixing sometimes occurs because the fault fixing is performed by various people in various environments. Therefore, OSS users and project managers need to know the degree of stability of open source projects by grasping the fault fixing time. In this paper, to assess the stability of large-scale open source projects, we derive the imperfect fault fixing probability and the transition probability distribution. For the derivation, we use a software reliability growth model based on the Wiener process, considering that the fault fixing time in open source projects changes depending on various factors such as the fault reporting time and the assignees for fixing faults. In addition, we apply the proposed model to actual open source project data and examine the validity of the model.
Keywords: Reliability, Stochastic differential equation, Open source project.


Introduction
The source code of open source software (OSS) is freely available for use, reuse, modification, and redistribution by OSS users. Many OSS are known for their high performance and reliability even though they are free of charge. Also, many IT companies develop OSS for commercial use. In particular, OSS are developed using the bazaar method (Raymond, 1999), in which the source code is implemented in public through the Internet. OSS development is thus promoted by an unspecified number of users and developers. The bug tracking system is one of the systems used to develop OSS. Much fault information, such as fix status, details, and fix priorities, is registered through the bug tracking system. Although OSS have been actively developed and used in recent years, there are problems in promoting open source projects. Over 100 faults are reported per day in massive open source projects (Sun et al., 2010). Because such projects have a large number of fault reports, it is difficult to fix the faults quickly (Hooimeijer et al., 2010). In addition, imperfect fault fixing sometimes occurs because the fault fixing is performed by various people in various environments. Generally, as the development of OSS progresses, the number of unfixed faults and of faults with long fixing times decreases, and the OSS becomes stable. To support stable OSS development, several previous studies predict the fix time of each reported fault (Bougie et al., 2010; Giger et al., 2010; Akbarinasaji et al., 2018). However, it is difficult to evaluate the future stability of open source projects by predicting individual fault fixing times. By predicting the future outlook of a project, OSS developers and managers can plan ahead. Based on the above, the purpose of this paper is to predict the required fault fixing time and to evaluate the stability of open source projects based on this information.
In this paper, we derive the imperfect fault fixing probability and the transition probability distribution of the model used in this study in order to assess project stability in OSS development. Here, the imperfect fault fixing probability means the probability of inefficient fault fixing. By deriving these results, we can obtain indicators of when the project becomes stable. Grasping the stability of the project helps OSS managers operate the project stably and helps OSS users decide which OSS to select. We focus on OSS development using a bug tracking system in the operational phase, as depicted in Figure 1. We derive the imperfect fault fixing probability and the transition probability distribution using the exponential model and the delayed S-shaped model, both taken from the software reliability growth models (Musa et al., 1987; Lyu, 1996; Kapur et al., 2011; Yamada, 2014). The exponential model and the delayed S-shaped model are among the best-known software reliability growth models (Ranjan et al., 2019). By using stochastic differential equation models, we can describe irregular situations such as OSS development. In particular, we derive the imperfect fault fixing probability and the transition probability distribution considering the characteristics of open source projects, assuming that the numbers of developers and users change irregularly. It is possible to find regularities among the various factors in open source projects and to apply mathematical models with multiple parameters. However, such models are difficult to use in practice because of the parameter estimation they require. In this paper, we therefore apply a stochastic differential equation model with noise based on the Wiener process, considering the specific circumstances of open source projects.
The proposed model makes it possible to evaluate a project quantitatively while indirectly taking into account external factors in open source projects.

Wiener Process Models for Assessing OSS Stability
Considering the characteristics of fault fixing in open source projects, the time-dependent expenditure of fault fixing effort remains irregular because OSS developers differ in skill and development environment. Let $\Omega(t)$ denote the cumulative fault fixing time up to operational time $t$ ($t \ge 0$); $\Omega(t)$ gradually increases as the operational procedures go on. Based on the software reliability growth modeling approach, the following linear differential equation in terms of fault fixing time can be formulated:

$$\frac{d\Omega(t)}{dt} = \beta(t)\{\alpha - \Omega(t)\}, \tag{1}$$

where $\beta(t)$ is the increase rate of fault fixing time at operational time $t$ and a non-negative function, and $\alpha$ means the estimated fault fixing time required until the end of operation. Generally, $\Omega(t)$ means the number of faults in software. However, we consider $\Omega(t)$ as the cumulative fault fixing time for stability assessment of the software in terms of fixing time.
Therefore, we extend Eq. (1) to the following stochastic differential equation with Brownian motion (Wong, 1981):

$$\frac{d\Omega(t)}{dt} = \{\beta(t) + \sigma\gamma(t)\}\{\alpha - \Omega(t)\}, \tag{2}$$

where $\sigma$ is a positive constant representing the magnitude of the irregular fluctuation, and $\gamma(t)$ is a standardized Gaussian white noise. By using Itô's formula (Arnold, 1974), we can obtain the solution of Eq. (2) under the initial condition $\Omega(0) = 0$ as follows:

$$\Omega(t) = \alpha\left[1 - \exp\left\{-\int_0^t \beta(s)\,ds - \sigma\omega(t)\right\}\right],$$

where $\omega(t)$ is a one-dimensional Wiener process, which is formally defined as the integration of the white noise $\gamma(t)$ with respect to time $t$. Moreover, we define the increase rate of fault fixing time $\beta(t)$ as:

$$\beta(t) = \frac{dF_*(t)/dt}{\alpha - F_*(t)}.$$

In this paper, we assume the following equations based on software reliability models $F_*(t)$ as the cumulative fault fixing time function of the proposed model:

$$F_e(t) = \alpha\left(1 - e^{-bt}\right),$$

$$F_s(t) = \alpha\left\{1 - (1 + bt)e^{-bt}\right\},$$

where $b$ is the increase rate of fault fixing time. Also, $\Omega_e(t)$ means the cumulative fault fixing time for the exponential software reliability growth model with $F_e(t)$. Similarly, $\Omega_s(t)$ is the cumulative fault fixing time for the delayed S-shaped software reliability growth model with $F_s(t)$.
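The solution of the stochastic differential equation can be illustrated by simulating sample paths. The following is a minimal sketch, assuming the exponential form $\Omega_e(t) = \alpha[1 - \exp\{-bt - \sigma\omega(t)\}]$ and the delayed S-shaped form $\Omega_s(t) = \alpha[1 - (1 + bt)\exp\{-bt - \sigma\omega(t)\}]$ that follow from the model definitions above. The function names, the discretization step, and all parameter values are hypothetical and are not taken from the paper.

```python
import math
import random


def wiener_path(n_steps, dt, rng):
    """Return W(0), W(dt), ..., W(n_steps * dt) for a standard Wiener process."""
    w = [0.0]
    for _ in range(n_steps):
        # Independent Gaussian increments with variance dt.
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


def fixing_time_paths(alpha, b, sigma, n_steps, dt, seed=0):
    """Simulate one sample path of the cumulative fault fixing time
    for the exponential and the delayed S-shaped model (assumed forms)."""
    rng = random.Random(seed)
    w = wiener_path(n_steps, dt, rng)
    times = [k * dt for k in range(n_steps + 1)]
    exp_path = [
        alpha * (1.0 - math.exp(-b * t - sigma * wk))
        for t, wk in zip(times, w)
    ]
    s_path = [
        alpha * (1.0 - (1.0 + b * t) * math.exp(-b * t - sigma * wk))
        for t, wk in zip(times, w)
    ]
    return times, exp_path, s_path


# Hypothetical parameter values, chosen only for illustration.
times, omega_e, omega_s = fixing_time_paths(
    alpha=1000.0, b=0.05, sigma=0.02, n_steps=100, dt=1.0
)
```

Both paths share the same Wiener path here, so the S-shaped path never exceeds the exponential one; in an independent comparison each model would be driven by its own noise.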
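Therefore, the cumulative fault fixing times up to time $t$ are obtained as follows:

$$\Omega_e(t) = \alpha\left[1 - \exp\{-bt - \sigma\omega(t)\}\right],$$

$$\Omega_s(t) = \alpha\left[1 - (1 + bt)\exp\{-bt - \sigma\omega(t)\}\right].$$

In this model, we assume that the parameter $\sigma$ reflects noise caused by external factors arising from several triggers in open source projects. Then, the expected cumulative fault fixing times spent up to time $t$ are respectively obtained as follows:

$$\mathrm{E}[\Omega_e(t)] = \alpha\left[1 - \exp\left\{-bt + \frac{\sigma^2}{2}t\right\}\right],$$

$$\mathrm{E}[\Omega_s(t)] = \alpha\left[1 - (1 + bt)\exp\left\{-bt + \frac{\sigma^2}{2}t\right\}\right].$$

Similarly, we consider the sample path of the fault fixing time required for OSS maintenance, i.e., the remaining fault fixing time needed from time $t$ to the end of the project:

$$\Omega_{re}(t) = \alpha\exp\{-bt - \sigma\omega(t)\},$$

$$\Omega_{rs}(t) = \alpha(1 + bt)\exp\{-bt - \sigma\omega(t)\}.$$

Then, the expected fault fixing times required for OSS maintenance from time $t$ until the end of operation are respectively obtained as follows:

$$\mathrm{E}[\Omega_{re}(t)] = \alpha\exp\left\{-bt + \frac{\sigma^2}{2}t\right\},$$

$$\mathrm{E}[\Omega_{rs}(t)] = \alpha(1 + bt)\exp\left\{-bt + \frac{\sigma^2}{2}t\right\}.$$

Since the Wiener process $\omega(t)$ is a Gaussian process, $\log\{\alpha - \Omega(t)\}$ is also a Gaussian process. The mean of $\log\{\alpha - \Omega(t)\}$ is derived as follows:

$$\mathrm{E}[\log\{\alpha - \Omega(t)\}] = \log\alpha - \int_0^t \beta(s)\,ds.$$

Also, the variance of $\log\{\alpha - \Omega(t)\}$ is derived as follows:

$$\mathrm{Var}[\log\{\alpha - \Omega(t)\}] = \sigma^2 t.$$

Thus, the following equation is derived:

$$\Pr[\Omega(t) \le x \mid \Omega(0) = 0] = \Phi\left(\frac{\log\dfrac{\alpha}{\alpha - x} - \displaystyle\int_0^t \beta(s)\,ds}{\sigma\sqrt{t}}\right),$$

where $x$ is the cumulative fault fixing time at time $t$. Also, $\Phi$ means the standard normal distribution function and is defined as follows:

$$\Phi(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z}\exp\left(-\frac{y^2}{2}\right)dy.$$

Considering the above points, the transition probability distributions of $\Omega_e(t)$ and $\Omega_s(t)$ are obtained as:

$$\Pr[\Omega_e(t) \le x \mid \Omega_e(0) = 0] = \Phi\left(\frac{\log\dfrac{\alpha}{\alpha - x} - bt}{\sigma\sqrt{t}}\right),$$

$$\Pr[\Omega_s(t) \le x \mid \Omega_s(0) = 0] = \Phi\left(\frac{\log\dfrac{\alpha}{\alpha - x} - bt + \log(1 + bt)}{\sigma\sqrt{t}}\right).$$

Also, the imperfect fault fixing probability is derived as follows:

$$\Pr[\Omega_e(t + \Delta t) \le \Omega_e(t)] = \Phi\left(-\frac{b\sqrt{\Delta t}}{\sigma}\right),$$

$$\Pr[\Omega_s(t + \Delta t) \le \Omega_s(t)] = \Phi\left(\frac{\log\dfrac{1 + b(t + \Delta t)}{1 + bt} - b\,\Delta t}{\sigma\sqrt{\Delta t}}\right),$$

where it is defined as the probability that $\Omega(t)$ is not exceeded in a minute time interval $\Delta t$ ($\Delta t > 0$).

Specifically, by deriving the accumulated fault fixing time and the remaining fault fixing time, it is possible to grasp the time at which the project converges. As a result, OSS users can grasp the scale of the remaining faults at a specific time. In addition, the project manager and the developers can grasp the amount of effort required in the future, which helps them decide on future development policies. The transition probability distribution shows the probability that the cumulative fixing time takes a specific value at a given operating time. In other words, it is possible to grasp the future progress of the project by evaluating the transition probability distribution at an arbitrary cumulative fixing time.
It is useful for OSS users to understand the required total cumulative fixing time $\alpha$, because it helps them select highly reliable OSS. The project managers and developers can also consider future project management plans based on the predictions of the proposed method.
A high probability of imperfect fault fixing means that additional fault fixing time is required because fixes have to be redone. In other words, the project is unstable in this case, so it is useful for OSS users, project managers, and developers to understand the optimal time in terms of the probability of imperfect fault fixing.
By using the above equations, we can evaluate the stability of open source projects by deriving the accumulated fault fixing time, the remaining fault fixing time, the transition probability distribution of the proposed models, and the probability of imperfect fault fixing.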
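The expected values and transition probability distributions described above can be evaluated numerically. The sketch below assumes the closed forms that follow from the Gaussian property of $\log\{\alpha - \Omega(t)\}$ with mean $\log\alpha - \int_0^t \beta(s)\,ds$ and variance $\sigma^2 t$, with $\int_0^t \beta(s)\,ds = bt$ for the exponential model and $bt - \log(1 + bt)$ for the delayed S-shaped model. Function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

# Standard normal distribution function (Phi in the text).
PHI = NormalDist().cdf


def expected_cumulative_exp(t, alpha, b, sigma):
    """E[Omega_e(t)]: expected cumulative fault fixing time (exponential model)."""
    return alpha * (1.0 - math.exp(-b * t + 0.5 * sigma ** 2 * t))


def expected_remaining_exp(t, alpha, b, sigma):
    """Expected remaining fault fixing time from time t (exponential model)."""
    return alpha * math.exp(-b * t + 0.5 * sigma ** 2 * t)


def transition_prob_exp(x, t, alpha, b, sigma):
    """Pr[Omega_e(t) <= x | Omega_e(0) = 0] for 0 <= x < alpha and t > 0."""
    z = (math.log(alpha / (alpha - x)) - b * t) / (sigma * math.sqrt(t))
    return PHI(z)


def transition_prob_s(x, t, alpha, b, sigma):
    """Pr[Omega_s(t) <= x | Omega_s(0) = 0] for 0 <= x < alpha and t > 0."""
    z = (math.log(alpha / (alpha - x)) - b * t + math.log(1.0 + b * t)) / (
        sigma * math.sqrt(t)
    )
    return PHI(z)


# Hypothetical values: probability that at most 400 days of cumulative
# fixing time have been spent by week 10.
p_exp = transition_prob_exp(400.0, 10.0, alpha=1000.0, b=0.05, sigma=0.1)
p_s = transition_prob_s(400.0, 10.0, alpha=1000.0, b=0.05, sigma=0.1)
```

At the same $x$ and $t$ the S-shaped probability is never smaller than the exponential one, which reflects the slower initial growth of the delayed S-shaped curve.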

Application of Proposed Method to Actual Data
We discuss the applicability of the proposed model as a method for evaluating project stability by applying it to actual open source project data.

Used Data Set
In this paper, we used one open source project to evaluate the proposed model and the stability of the project. To apply the proposed model to actual project data, we use the Eclipse data obtained from Bugzilla. Eclipse is an open source integrated development environment used in computer programming. This project uses Bugzilla as its open source bug tracking system, and the information about reported faults is freely available from it. Figure 2 shows a part of the Eclipse data used in this paper. The fault fixing time used in this paper is the difference between the time when the fault was reported (Opened) and the time when the fault information was changed (Changed). In particular, we use the data of Eclipse version 4.7 (Oxygen). The data used in this research contain 1760 fault fixing records, so the project is regarded as a large-scale project. We counted only fixed faults as the number of fixed faults. In this paper, we apply the remaining fault fixing time, the imperfect fault fixing probability, and the transition probability distribution derived in Section 3 to the Eclipse project data. Also, we have estimated the parameters by the method of maximum likelihood.
In this paper, the unit of fault fixing time is "day", and the fault fixing time represents the weekly average time. Table 1 shows the results of parameter estimation and AIC (Akaike's Information Criterion). In terms of AIC, the exponential model fits the data better than the delayed S-shaped model. Therefore, we mainly focus on the results of the exponential model.
International Journal of Mathematical, Engineering and Management Sciences, Vol. 5, No. 4, 591-601, 2020. https://doi.org/10.33889/IJMEMS.2020.5.4.048
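The data preparation described above can be sketched as follows. This is only an illustration: the timestamp format, the sample records, and the choice to group reports by the ISO week of their Opened time are assumptions, since the actual Bugzilla export layout and aggregation rule are not specified here.

```python
from collections import defaultdict
from datetime import datetime

# Assumed timestamp layout; a real Bugzilla export may differ.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def fixing_time_days(opened, changed, fmt=TIME_FORMAT):
    """Fault fixing time in days: Changed time minus Opened time."""
    delta = datetime.strptime(changed, fmt) - datetime.strptime(opened, fmt)
    return delta.total_seconds() / 86400.0


def weekly_average_fixing_time(records, fmt=TIME_FORMAT):
    """Average fixing time (days) per ISO week of the Opened time.

    records: iterable of (opened, changed) timestamp strings.
    Returns a chronologically sorted list of ((iso_year, iso_week), average).
    """
    buckets = defaultdict(list)
    for opened, changed in records:
        iso = datetime.strptime(opened, fmt).isocalendar()
        buckets[(iso[0], iso[1])].append(fixing_time_days(opened, changed, fmt))
    return [(week, sum(v) / len(v)) for week, v in sorted(buckets.items())]


# Hypothetical records, not actual Eclipse data.
sample = [
    ("2017-06-26 10:00:00", "2017-06-28 10:00:00"),
    ("2017-06-27 00:00:00", "2017-07-01 00:00:00"),
    ("2017-07-03 12:00:00", "2017-07-04 00:00:00"),
]
weekly = weekly_average_fixing_time(sample)
```

The cumulative sum of such weekly values would then serve as the observed cumulative fault fixing time to which the models are fitted.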

Transition Probability Distribution and Imperfect Fault Fixing Probability
Figures 5 and 6 show the transition probability distribution of the fault fixing time in Eq. (22). Figure 5 shows the result of substituting the actual data for x in Eq. (22), so the value of x differs at every time point. That is, we calculate the probability of the actual value under the prediction model. Figure 6 substitutes fixed values for x in Eq. (22), which lets us roughly estimate the time needed to reach any value x. Figure 5 thus shows the relationship between the cumulative fixing time $\Omega(t)$ predicted from the estimated parameters and the actual open source project data. When this probability is high, we can judge that the cumulative fixing time $\Omega(t)$ predicted from the estimated parameters is lower than the actual data. Also, Figure 6 shows the probability of reaching a certain cumulative fixing time at time $t$. In particular, Figure 6 shows the probability distribution of the time to reach 10%, 30%, 50%, and 70% of the estimated cumulative fixing time $\alpha$. By using this result, it is possible to grasp the time at which the project converges. Therefore, users concerned about the risk of fault occurrence can address it by identifying the time after which no further fault fixing is required. In this case, the probability distribution of the time to reach nearly 100% of the estimated cumulative fixing time should be calculated.
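A calculation of the kind shown in Figure 6 can be sketched as follows, assuming the exponential-model transition probability $\Pr[\Omega_e(t) \le x] = \Phi\big((\log\{\alpha/(\alpha - x)\} - bt)/(\sigma\sqrt{t})\big)$. When x is a fraction p of $\alpha$, the term $\log\{\alpha/(\alpha - x)\}$ reduces to $-\log(1 - p)$, so $\alpha$ drops out. The function names and the parameter values are hypothetical and are not the estimates reported in Table 1.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf


def prob_not_yet_reached(p, t, b, sigma):
    """Pr[Omega_e(t) <= p * alpha] for a fraction 0 < p < 1 and t > 0."""
    z = (-math.log(1.0 - p) - b * t) / (sigma * math.sqrt(t))
    return PHI(z)


def median_reach_time(p, b):
    """Time at which the probability above equals 0.5 for fraction p."""
    return -math.log(1.0 - p) / b


def reach_curves(fractions, times, b, sigma):
    """Transition probability over time for each target fraction of alpha."""
    return {
        p: [prob_not_yet_reached(p, t, b, sigma) for t in times]
        for p in fractions
    }


# Hypothetical parameters; fractions follow the 10%, 30%, 50%, 70% levels.
fractions = [0.1, 0.3, 0.5, 0.7]
curves = reach_curves(fractions, times=[1, 5, 10, 50], b=0.05, sigma=0.1)
medians = {p: median_reach_time(p, b=0.05) for p in fractions}
```

Each curve decreases with time, and the point where it crosses 0.5 gives a rough estimate of when the project reaches that share of the total fixing time; choosing p close to 1 corresponds to the near-100% case mentioned above.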
Finally, Figure 7 shows the results of the sensitivity analysis of the imperfect fault fixing probability using the exponential model. We can judge that the smaller the value of the parameter varied in the analysis, the more frequently imperfect fault fixing occurs. In particular, the probability is constant over time under the exponential model. Therefore, we can judge that imperfect fault fixing always occurs with a certain constant probability.
Figure 7. Imperfect fault fixing probability using exponential model in Eq. (24)
By using these indicators, open source project managers and OSS users will be able to make various decisions.
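The constant probability under the exponential model can be checked numerically. The sketch below assumes that the imperfect fault fixing probability is $\Pr[\Omega_e(t + \Delta t) \le \Omega_e(t)]$ with $\Omega_e(t) = \alpha[1 - \exp\{-bt - \sigma\omega(t)\}]$; under that assumption the event is equivalent to $b\,\Delta t + \sigma\{\omega(t + \Delta t) - \omega(t)\} \le 0$, which gives $\Phi(-b\sqrt{\Delta t}/\sigma)$ independently of $t$. Function names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf


def imperfect_prob_exp(b, sigma, dt):
    """Closed form under the stated assumption: Phi(-b * sqrt(dt) / sigma)."""
    return PHI(-b * math.sqrt(dt) / sigma)


def simulate_imperfect_prob_exp(b, sigma, dt, n_trials=20000, seed=1):
    """Monte Carlo estimate of Pr[Omega_e(t + dt) <= Omega_e(t)].

    The Wiener increment over dt is N(0, dt) and does not depend on t,
    so the current time does not enter the simulation.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        increment = rng.gauss(0.0, math.sqrt(dt))
        if b * dt + sigma * increment <= 0.0:
            hits += 1
    return hits / n_trials


# Hypothetical sensitivity sweep over b with sigma and dt held fixed.
sweep = {b: imperfect_prob_exp(b, sigma=0.1, dt=1.0) for b in (0.01, 0.05, 0.1)}
estimate = simulate_imperfect_prob_exp(b=0.05, sigma=0.1, dt=1.0)
```

Under this assumed form the probability rises as b decreases or as $\sigma$ increases, and it approaches 0.5 as the interval $\Delta t$ shrinks.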

Conclusions
In general, the development effort and the fixing time of individual faults can be predicted and assessed by using conventional OSS reliability evaluation methods. However, there has been no research on estimating the fault fixing time over a long period. It is difficult to use the conventional software reliability growth models for the fault fixing time, because they mainly evaluate the number of faults in software development. Open source projects can be controlled more easily if the future stability and reliability of a project can be assessed from fault fixing time data. The proposed method will therefore help assess the stability of OSS systems developed under open source projects, which are affected by various people and environments. Also, if the manager grasps the future progress of the project, appropriate control of the management effort for OSS will be indirectly linked to the quality, safety, reliability, and cost reduction of OSS.
In this paper, we discuss the stability

Conflict of Interest
The authors confirm that there is no conflict of interest to declare for this publication.