Modeling and analysis of leftover issues and release time planning in multi-release open source software using entropy based measure

In Open Source Software (OSS), users report different issues on issue tracking systems. Due to time constraints, it is not possible for developers to resolve all the issues in the current release. The leftover issues that are not addressed in the current release are added to the issue content of the next release. Fixing issues results in code changes that can be quantified with a measure known as the complexity of code changes, or entropy. We have developed a 2-dimensional entropy based mathematical model to determine the leftover issues of different releases of five Apache open source products. A model for release time prediction using entropy is also proposed. This model maximizes the satisfaction level of users in terms of the number of issues addressed.


INTRODUCTION
Release engineering focuses on qualitative and quantitative approaches for developing a roadmap that shapes a product ready for release. A systematic review of the different approaches to release planning has been conducted, covering research papers published in various academic journals and conferences [50]. In the OSS development model, users are grouped into categories depending upon their skills and involvement in software development. In many cases, users also act as developers [48]. Contributors located at different geographical locations request new features (NFs), feature improvements (IMPs) and bug fixes through a centralized software platform [1 and 51]. To address the issues reported by users and to fix the bugs, the source code of the software needs to be changed, and these code changes are recorded in source code repositories. These source code changes have been quantified using Shannon entropy [2, 3 and 12].
vol 34 no 1 January 2019
In every organization, the software development team desires to produce high-quality software releases with a low fault content. To meet the large volume of requirements, teams also want to release software frequently [14]. In the literature, the release time problem for proprietary software has been discussed widely by considering a single factor: the bugs fixed in different releases. A release planning study based on interviews with people involved in release planning of OSS found that release planning based on the implementation of features requested by users faces various challenges, and that practitioners prefer release planning based on a fixed time interval [25].
In feature implementation based release planning, practitioners observed that features do not take uniform time to implement; some features take far longer than others.
In OSS development, and even in closed source development, developers and release managers mainly prefer time based releases, as feature based release planning has not been addressed adequately.
OSS development is carried out by volunteer contributors (active users) and does not take cost criteria into account [42,43].
To release the next version of the software, one needs to consider the number of addressed issues requested by the users. In OSS, active users' satisfaction can be measured in terms of the number of issues fixed. Different issues are characterized by different severity levels, where the severity of an issue shows the extent of its impact [44 and 52]. By considering the severity levels of different issues, we have calculated active users' satisfaction as the number of issues fixed multiplied by their severity levels. To increase the active users' satisfaction level, we maximize this quantity.
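The severity-weighted satisfaction measure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weights 10, 3 and 1 are the IEEE 982.2 weights quoted in the optimal release planning section, while the function names and the issue counts are hypothetical.

```python
# Severity weights follow the IEEE 982.2 values quoted later in the paper.
# The function names and the counts below are hypothetical.
SEVERITY_WEIGHT = {"high": 10, "medium": 3, "low": 1}

def satisfaction(fixed_by_severity):
    """Sum of (number of issues fixed x severity weight) over severity levels."""
    return sum(SEVERITY_WEIGHT[level] * count
               for level, count in fixed_by_severity.items())

def satisfaction_ratio(fixed_by_severity, reported_by_severity):
    """Achieved satisfaction as a fraction of the maximum attainable value."""
    best = satisfaction(reported_by_severity)
    return satisfaction(fixed_by_severity) / best if best else 0.0

fixed = {"high": 4, "medium": 10, "low": 20}      # hypothetical counts
reported = {"high": 5, "medium": 12, "low": 30}   # hypothetical counts
print(satisfaction(fixed))                         # 4*10 + 10*3 + 20*1 = 90
print(round(satisfaction_ratio(fixed, reported), 3))
```

Expressing the result as a ratio makes it comparable with the percentage satisfaction levels reported in the release planning results.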
In this paper, we propose models for a multi-release software product. We determine the release time by considering the fixing of NFs, IMPs and bugs together with code changes. The models have been validated using datasets of various products of the Apache software project. We have taken two existing software reliability growth models [37 and 38] for comparison. We observed that our models give the best fit in the largest number of cases.
The remainder of the paper has seven sections. Section 2 summarizes the related work. Data collection and model construction are discussed in section 3. The experimental setup is given in section 4. Section 5 documents the results and discussion. Optimal release planning is discussed in section 6. Threats to validity are given in section 7, and section 8 concludes the paper.

RELATED WORK
Monolithic software development offers a way for multiple releases as a result of functionality enhancements [21]. A better understanding of the release process helps developers to design frequent and fault free software releases [14]. "A timely introduction of a clunker and a delayed entry of a masterpiece can destroy a product's chance of success" [27]. Uncertainty analysis has been carried out in different disciplines in order to measure the uncertainty arising from internal or external factors. In software development too, uncertainty analysis has been used by applying Shannon information theory to quantify frequent source code changes [2]. A study based on the quantified code changes has been conducted [3]. In this study, the authors also estimated the rate at which source code changes take place.
A quantitative approach to understanding software development and maintenance activities has proven to be very useful. In the literature, the search based software engineering approach has been used to discuss the next release problem [8 and 9]. The release time of software can be affected by many factors such as complexity, bugs, software architecture, software domains and tools [22, 23 and 24].
During the past three decades, various mathematical models have been proposed to quantify the quality of software. Various release time planning approaches have been discussed in the literature for commercial/proprietary software. These approaches have considered the fixing of bugs of the current and the just previous release [7]. In the case of OSS, the release time strategy differs from that of proprietary/closed source software in that it considers not only the number of bugs fixed but also the number of NFs and IMPs implemented. "When using a feature-based strategy, an open source project might make a release even if not all planned features have been implemented" [25].
For a single release problem in closed source software, the authors of [4] proposed to determine the optimal release time by minimizing the development cost, which includes testing cost, debugging cost and fixing cost, or by maximizing the reliability level subject to a predefined budget. When add-ons are included and faults are removed, a proportion of new faults is generated; a release time problem that considers this phenomenon has been developed in [36]. An attempt has been made to develop a multiple release time problem [7]. The approach determined the optimal time and the optimal amount of resources. But, again, the next release was decided on the basis of the number of bugs removed in the testing phase and the operational phase. A Dempster-Shafer theory and differential evolution based multi software reliability allocation for a multimedia system has been proposed in [32]. Optimal software release time planning that considers the cost incurred by delay has been proposed in [33]. A multi-criteria based model has been proposed for software reliability prediction [34]. In a study [35], the failure processes of testing have been investigated by considering the delay effect in fault fixing. Recently, a mathematical model for optimal time determination in multi-release software has been proposed in [53]. The authors considered different types of users, namely innovators and imitators, in predicting the release time.
In this paper, next release planning is proposed based on the predefined addressed features and bugs.
An economic outlook is presented by the Cobb-Douglas function [6, 7 and 13].
In our work, we have used the Cobb-Douglas function to model the growth in the fixing of different issues.

Data collection
We have taken data from five products of the Apache open source project, namely Avro, jUDDI, Hive, Pig and Whirr [10 and 42], where bugs, NFs and IMPs are presented with different signs as shown in Figure 1. The fault tracking data of the Mozilla project for three successive versions, Firefox 3.0, 3.5 and 3.6 (https://bugzilla.mozilla.org/) [39], and of the gnome-control-center product (https://bugzilla.gnome.org/) [40] for 4 successive versions from 2.0 to 2.3 have been selected [35]. The historical code change data has been extracted using the GitHub tool [11]. Figure 2 shows a sample of different issue reports of the Avro product. Figure 3 shows a screenshot of downloading the code change history for the Avro product from the GitHub repository.
The process of data collection for Apache project and Entropy calculation has been carried out using the methods and formulas as discussed and proposed in [2, 12 and 53].
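As an illustration of the entropy calculation, the following sketch computes the Shannon entropy of code changes for one time period. It assumes that the probability attached to a file is its share of all changes recorded in that period, which is the usual reading of the complexity of code changes measure; the file names and counts are hypothetical, and the exact formulas used in the paper are those of the cited works.

```python
import math

def shannon_entropy(changes_per_file):
    """Shannon entropy (in bits) of the distribution of changes across files."""
    total = sum(changes_per_file.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in changes_per_file.values():
        if count > 0:
            p = count / total          # share of the period's changes in this file
            entropy -= p * math.log2(p)
    return entropy

def normalized_entropy(changes_per_file):
    """Entropy divided by its maximum log2(n), so periods with different file counts compare."""
    n = len(changes_per_file)
    if n < 2:
        return 0.0
    return shannon_entropy(changes_per_file) / math.log2(n)

# Hypothetical change counts for one period.
period = {"A.java": 4, "B.java": 2, "C.java": 2}
print(shannon_entropy(period))   # 0.5*1 + 0.25*2 + 0.25*2 = 1.5
```

Entropy is highest when changes are spread evenly over many files and zero when all changes fall in a single file.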

Modeling for Multi-Release Software Product
In OSS, once issues are reported, triaging takes place and the issues are assigned to developers. The source code of different files gets changed while these issues are fixed, and the product goes to its next release. But some issues are still left in the current release and get fixed in the next release. A mathematical model is therefore needed to predict the leftover issues of a release that are to be fixed in the next releases. The class of time-domain software reliability models assumes that software failures display the behavior of a Non-Homogeneous Poisson Process (NHPP) [16][17][18][19][20][26].
Let the Poisson probability mass function of a random variable N(t) with parameter X(t) be defined as

$$\Pr\{N(t) = n\} = \frac{(X(t))^{n} \, e^{-X(t)}}{n!}, \quad n = 0, 1, 2, \ldots \tag{1}$$

Various time domain models have appeared in the literature which describe the stochastic failure process by an NHPP [26]. Let 'a' denote the potential number of issues. Issues can be 1) bugs/NFs/IMPs, 2) NFs+IMPs and 3) bugs+NFs+IMPs. For a finite failure NHPP, in which all 'a' issues would be fixed in infinite time, the fixing of these issues can be written as the following differential equation (2), as in [7 and 53].

$$\frac{dX(t)}{dt} = \frac{f(t)}{1 - F(t)} \, [a - X(t)] \tag{2}$$

Here, F(t) is a distribution function and f(t) = dF(t)/dt is a density function. The quantity [a − X(t)] denotes the expected number of issues remaining in the software at time t.
Solving the above equation with the initial condition X(0) = 0, we get the following.

$$X(t) = a \, F(t) \tag{3}$$

Here, X(t) is the cumulative number of fixed bugs/new features/feature improvements/(new features + feature improvements) at any given time t.
During data collection, we observe that the cumulative number of issues fixed grows slowly in the beginning, then increases, and then stabilizes in subsequent releases. To model this behaviour of issue fixing, we propose a model based on the logistic function, given in equation (4).
By using equation (4) in (3), we get equation (5). Here, γ is a constant and, depending upon its value, the model can capture different types of issue fixing growth curves. ϕ denotes the issue fixing rate per remaining issue.
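To make the shape of such a growth curve concrete, the sketch below evaluates a mean value function of the form X(t) = a F(t) with a logistic-type distribution function. The specific form F(t) = (1 − e^(−ϕt)) / (1 + γ e^(−ϕt)) is an assumption chosen because it is a common logistic-type choice with parameters ϕ and γ; it is not claimed to be the paper's equation (4), and the parameter values are hypothetical.

```python
import math

def logistic_F(t, phi, gamma):
    """Assumed logistic-type distribution function with F(0) = 0 and F(inf) = 1."""
    e = math.exp(-phi * t)
    return (1.0 - e) / (1.0 + gamma * e)

def mean_value(t, a, phi, gamma):
    """Expected cumulative number of issues fixed by time t, X(t) = a * F(t)."""
    return a * logistic_F(t, phi, gamma)

# Hypothetical parameters: 150 potential issues, fixing rate 0.4, shape constant 5.
for week in (0, 4, 8, 16, 32):
    print(week, round(mean_value(week, a=150.0, phi=0.4, gamma=5.0), 1))
```

With γ = 0 this form reduces to the exponential curve 1 − e^(−ϕt), while larger γ values delay the initial growth and give an S-shaped curve.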
Here, equation (5) is used to predict the potential number of bugs, new features, feature improvements, and (new features + feature improvements) at any given time for Release 1 data. Now, we consider that the source code is changed to fix bugs and to implement new features and feature improvements. The changes in the source code of the software can be quantified using an entropy based measure [2, 3 and 12]. To incorporate the source code changes made for fixing the issues, we extend the Cobb-Douglas type function [13], giving equation (6). We consider time and entropy simultaneously instead of time alone. In (6), s and u represent the time and the entropy (the complexity of code changes) respectively. β is the code change elasticity to issues fixing time.
We can develop a 2-dimensional model by using calendar time 's' and entropy 'u', which results in equation (7). The above model can also be written in the form given in equation (8). We used the model given in equation (7) to predict the potential number of total issues at any given time for Release 1 data. We consider the same fixing rate of issues across different releases for the sake of simplicity.
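The idea of replacing calendar time by a Cobb-Douglas-type combination of time and entropy can be sketched as below. Both the composite form s^α u^β and the logistic-type growth curve are assumptions made for illustration only; the paper's equations (6) and (7) are not reproduced here and may differ in detail. All parameter values are hypothetical.

```python
import math

def composite_effort(s, u, alpha, beta):
    """Assumed Cobb-Douglas-type composite of calendar time s and entropy u."""
    return (s ** alpha) * (u ** beta)

def mean_value_2d(s, u, a, phi, gamma, alpha, beta):
    """Expected issues fixed when the composite replaces t in an assumed logistic curve."""
    tau = composite_effort(s, u, alpha, beta)
    e = math.exp(-phi * tau)
    return a * (1.0 - e) / (1.0 + gamma * e)

print(composite_effort(4.0, 9.0, 0.5, 0.5))   # 2.0 * 3.0 = 6.0
# Hypothetical values: 12 months elapsed, cumulative entropy 8.5.
print(round(mean_value_2d(12.0, 8.5, a=350.0, phi=0.3, gamma=4.0, alpha=0.6, beta=0.4), 1))
```

Setting alpha = 1 and beta = 0 removes the entropy term, so the sketch falls back to a purely time-based curve.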

Multi-Release Modeling based on unresolved Bugs passed on to different Releases [36, 53 and 54]
In software, different issues, namely bugs, new features and feature improvements, are reported and get fixed in the current release. The remaining unresolved issues, which are left over, move to the next release. The mean value function of issues based on calendar time and on entropy has been evaluated by using (5) and (7) respectively. During our empirical investigation, we found that new features and feature improvements are fixed in the current release and the unresolved ones are fixed in the next release; that is, the next release considers the leftover new features and feature improvements of the just released version of the software. But, in the case of bugs, leftover bugs of Release 1 are fixed in Release 2, Release 3 and Release 4. This means that in an open source development environment, leftover bugs of different releases are passed on to higher releases (up to the next three to four releases). Here, we consider that leftover bugs of Release 1 can pass up to Release 4. Based on this empirical evidence, the mean value functions for different releases have been modeled as follows.
computer systems science & engineering
We consider that the bugs reported and fixed in the first release are modeled by the following equation

$$X_1(t) = a_1 F_1(t), \quad 0 \le t < t_1 \tag{9}$$

where a_1 is the potential number of bugs to be fixed in the first release, which ends at time t_1. The leftover bugs of the first release, i.e. a_1(1 − F_1(t_1)), are added to the potential bugs of the second release with fixing rate F_2(t − t_1). Therefore, the cumulative number of bugs fixed in the second release is given by

$$X_2(t) = \left[a_2 + a_1\left(1 - F_1(t_1)\right)\right] F_2(t - t_1), \quad t_1 \le t < t_2 \tag{10}$$

In the above equation, a_2 is the potential number of bugs to be fixed in the second release. Following the modeling for the second release, and taking into consideration the fact that the next release will contain the remaining bugs of all the previous releases, we can write the expressions for Release 3 and Release 4.

Multi-Release Modeling based on Leftover Issues of just previous Release [53 and 54]
If we consider that the next release consists of the remaining issues of the just previous release, then we can write the following expressions for different releases. The mathematical model for the cumulative addressed issues of Release 1 is given as equation (13).
a_1 is the potential number of addressed issues of Release 1. The leftover issues, i.e. a_1(1 − F_1(t_1)), are added to the issue content of Release 2 with fixing rate F_2(t − t_1). The mathematical equations for estimating the cumulative addressed issues in Release 2 and Release 3 are given in equation (14) and equation (15).
Similarly, we can write equation (16) for the n-th release.
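The carry-over of leftover issues from one release to the next can be sketched numerically as follows. This is an illustrative simplification under stated assumptions: the distribution function is an assumed logistic-type form, the leftover of a release is taken as its issue content minus the issues fixed in it (the a_i − A_i reading used in the numerical examples), and each release's clock restarts at the previous release time. All parameter values are hypothetical.

```python
import math

def fix_fraction(t, phi, gamma):
    """Assumed logistic-type F(t); stands in for the paper's distribution function."""
    e = math.exp(-phi * t)
    return (1.0 - e) / (1.0 + gamma * e)

def chain_releases(params, release_times):
    """Chain releases so each one inherits the previous release's leftover issues.

    params        : list of (a_i, phi_i, gamma_i), one tuple per release
    release_times : cumulative release times [t_1, t_2, ...]
    """
    rows, leftover, start = [], 0.0, 0.0
    for (a, phi, gamma), end in zip(params, release_times):
        content = a + leftover                        # a_i plus inherited issues
        fixed = content * fix_fraction(end - start, phi, gamma)
        leftover = content - fixed                    # passed on to the next release
        rows.append({"content": content, "fixed": fixed, "leftover": leftover})
        start = end
    return rows

# Hypothetical parameters chosen so that F = 0.5 at the end of each release.
demo = chain_releases([(100.0, math.log(2), 0.0), (50.0, math.log(2), 0.0)], [1.0, 2.0])
for i, row in enumerate(demo, start=1):
    print(i, round(row["content"], 2), round(row["fixed"], 2), round(row["leftover"], 2))
```

For the first two releases this reproduces the structure described above: Release 1 fixes a_1 F_1(t_1) issues, and Release 2 starts from a_2 + a_1(1 − F_1(t_1)).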

EXPERIMENTAL SETUP
In this section, we discuss the multi-release setup based on unresolved bugs passed on to different releases and the multi-release setup based on the leftover issues of the just previous release.

Multi-Release based on unresolved Bugs passed on to different Releases
In an open source development environment, leftover bugs of different releases are passed on to the higher releases. We validated the proposed models given in equations (9), (10) and (11) for case 1 (in Table 1), on weekly bugs fixed data of three releases of Firefox and Gnome-control-center projects [35]. Results have been presented in Table 3 given in the next section.

Multi-Release based on the Leftover Issues of just previous Release
If we consider that the next release consists of the remaining issues of the just previous release, then the potential values of different issues (cases 1-5 of Table 1) in Release 1 can be estimated using equation (13), and the potential value of all the issues (case 6 of Table 1) can be estimated using the model given in (7), as this model takes entropy into account. For the first release, the leftover issues have been predicted by using the parameter estimation results of the first release. Release 2 parameters are estimated from the Release 1 leftover issues together with the Release 2 dataset, using equation (14). With the resulting leftover issues, together with the Release 3 dataset, we have estimated Release 3 parameters using equation (15). For Release 4 and Release 5, we have followed the same process and used the generalized model for the n-th release given in equation (16), which considers the leftover issues of the just previous release.
We have validated the proposed models discussed here for Apache open source products, namely Hive, Pig, Avro, jUDDI and Whirr [10]. We have used Nonlinear Regression (NLR) in the Statistical Package for Social Sciences (SPSS) software to estimate the parameters (a, φ, β and α). Table 2 shows the values of the different parameters we have used in NLR.

Table 2: Parameter values used in NLR

Parameter             Value
Estimation method     Sequential Quadratic Programming
Step limit            2
Infinite step size    1.00E+20

From the results, it has been observed that the proposed models representing all cases of Table 1 give a high goodness of fit.

NUMERICAL EXAMPLES
Numerical examples are given here for illustration for Firefox, Gnome-control-center and Apache software products. We have measured the performance of the different estimation models for all the releases. Table 3 shows the parameter estimates of case 1 defined in Table 1 for the Firefox and Gnome-control-center projects. Here, we consider that leftover bugs of Release 1 can pass up to Release 3. The number of real fixed bugs of the i-th release is denoted by 'A_i', the potential number of bugs of the i-th release by 'a_i', and the leftover bugs of the i-th release by 'a_i − A_i'. The issue fixing efficiency for the i-th release is represented by the parameter φ_i. γ_i shows the variation in the bug fixing pattern of the i-th release.

Multi-Release based on unresolved Bugs passed on to different Releases
In Table 3, we have documented the leftover bugs of each release, which are add-ons to the fault content of the next releases. In the case of Firefox, the estimation results show that there are 50 initial bugs in Release 1, of which 48 are fixed. Hence 50 − 48 = 2 bugs remained unresolved and were added to the fault content of the next release. We can draw similar inferences for other releases. For Release 2, 45 bugs have been fixed and the estimation also shows (a_2 + a_1(1 − F_1(t_1))) = 45 bugs, which means all the bugs have been detected and fixed in Release 2.
For Release 3, the model underfits.
In the case of Gnome-control-center, the estimation results show that there are 43 initial bugs in Release 1, of which 42 are fixed. Hence 43 − 42 = 1 bug remained unresolved and was added to the fault content of the next release. We can draw similar inferences for other releases. For Release 2, 35 bugs have been fixed and the estimation shows (a_2 + a_1(1 − F_1(t_1))) = 95 bugs, which means 60 bugs remained unresolved and were added to the fault content of the third release. Similarly, we observe that 220 leftover bugs were added to the next release from Release 3.
We observe that the estimation models exhibit a good fit in terms of MSE, Bias, VAR, RMSPE and R^2 for all the releases.

Multi-Release based on the Leftover Issues of just previous Release
Tables 4 to 8 present parameter estimates of Case 1 to Case 4 (Table 1) for different Apache datasets. Here, we considered that the leftover issues of the just previous release are added to the issue content of the next release. The number of real fixed issues (bugs/IMPs/NFs/NFs+IMPs) of the i-th release is denoted by 'A_i', the potential number of different issues of the i-th release by 'a_i', and the leftover issues of the i-th release by 'a_i − A_i'. The issue fixing efficiency for the i-th release is represented by the parameter φ_i. γ_i shows the variation in the bug fixing pattern of the i-th release. β_i is the code change elasticity to issues fixing time for the i-th release.
In Table 4, we have documented the leftover bugs of each Avro release, which are add-ons to the fault content of the next releases. The estimation results show that there are 149 initial bugs in Release 1, of which 134 are fixed. Hence 149 − 134 = 15 bugs remained unresolved and were added to the fault content of Release 2. We can draw similar inferences for other releases. For Release 2, 81 bugs have been fixed and the estimation shows (a_2 + a_1(1 − F_1(t_1))) = 91 bugs, which means 10 bugs remained unresolved and were added to the fault content of the third release. Similarly, we observe that 16 ... Similar interpretations can be drawn for other products. By analyzing the above empirical results, we can evaluate the maintainability and the adaptability of the software in fixing different issues. The number of issues left in different releases determines the release readiness and the stability of the software. In the case of the Apache datasets, we observed that for the Avro product 61.5% of the potential issues get fixed before the next release. For the Pig product, all releases follow the same fixing level except for new features fixing in Release 4. For the Hive product, the fixing process of all the issues achieved 61.5% performance. We observe that for the jUDDI product, the fixing process for different issues, namely bugs, NFs and IMPs, follows the same pattern except for bug fixing in Release 4 and total issues fixing in Release 2.
The releases of the Whirr product follow the same pattern except for improvements and new features fixing in the fourth release. Results show that 61.5% of the potential issues of the current release get fixed before the next release, except for four cases across all five products. All the estimation models (Table 1)

Comparison of Performance for the proposed models
The proposed issue estimation models (Case 5 and Case 6 in Table 1) have been compared with the Goel-Okumoto model [37] (equation (17)) and the Yamada delayed S-shaped model [38] (equation (18)).

$$X_i(t) = a_i \left(1 - e^{-b_i t}\right) \tag{17}$$

$$X_i(t) = a_i \left(1 - (1 + b_i t) \, e^{-b_i t}\right) \tag{18}$$

X_i(t) denotes the cumulative number of fixed errors of the i-th release at time t. 'a_i' is the potential number of errors. 'b_i' is the fixing rate of errors for the i-th release. Tables 9 to 13 present the results of these two models in comparison with the proposed models. For the Avro product (Table 9), the estimation results of the proposed models (Case 5 and Case 6 in Table 1) show that 332 issues (bugs+NFs+IMPs) have been fixed in Release 1. The estimated potential number of issues is 357. This means that 357 − 332 = 25 issues which would have been fixed in Release 1 will now be fixed in Release 2. Similarly, 14 issues which would have been fixed in Release 4 will be fixed in Release 5. Similar interpretations can be drawn for other products. In 20 of the 23 releases in total, the proposed models give a better R^2 than the two existing models.
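The two baseline models and the R^2 criterion used for the comparison can be sketched as follows. The mean value functions are the standard Goel-Okumoto and Yamada delayed S-shaped forms; the observed data and the parameter values are hypothetical and serve only to show how R^2 would be computed for each model.

```python
import math

def goel_okumoto(t, a, b):
    """Goel-Okumoto exponential mean value function."""
    return a * (1.0 - math.exp(-b * t))

def yamada_delayed_s(t, a, b):
    """Yamada delayed S-shaped mean value function."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical cumulative fixed-issue counts for weeks 1..6.
weeks = [1, 2, 3, 4, 5, 6]
observed = [5, 12, 20, 27, 32, 35]
go_fit = [goel_okumoto(t, a=40.0, b=0.3) for t in weeks]
ys_fit = [yamada_delayed_s(t, a=40.0, b=0.6) for t in weeks]
print(round(r_squared(observed, go_fit), 3), round(r_squared(observed, ys_fit), 3))
```

In practice the parameters a and b would be estimated by nonlinear regression for each release before R^2 is compared across models.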
We designed the experiment to test the statistical significance of the proposed model as discussed in case 6 in Table 1. The statistical significance has been validated using non-parametric Kolmogorov-Smirnov (K-S) test. The P-values of the experiment have been given in Table 14.
We have taken level of significance α = 0.025. We observed that the proposed model is statistically significant.

OPTIMAL RELEASE PLANNING
In closed source software, it is not possible to address/fix all the issues in the current release due to time and resource constraints; in OSS, the same holds due to active users' demands. By putting a constraint on the minimum number of fixed issues with different severity levels before the next release of the software, we can maximize the active users' satisfaction. The following objective function considers two different severity groups, S_n1 (average severity for n-th release issues) and S_n2 (average severity for (n − 1)-th and n-th release issues which will be addressed in the (n + 1)-th release).

$$\text{Maximize } S_n(t) = S_{n1} X_n(t) + S_{n2} \left[a_n + a_{n-1}\left(1 - F_{n-1}(t_{n-1})\right) - X_n(t)\right] \tag{19}$$

In the IEEE 982.2 defect indices definition [41], a weight of 10 is used for high severity issues, 3 for medium severity issues and 1 for low severity issues. We have used the same weights for the different severity levels as given in [41]. By using (7) and (8), we can write the objective function (20) for active users' satisfaction.
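The structure of objective (19) can be illustrated with a short sketch. The function and variable names are hypothetical; the average severities are computed with the 10/3/1 weights quoted above, and the issue counts are made-up numbers rather than values from the paper's tables.

```python
# Weights follow the IEEE 982.2 values quoted in the text.
WEIGHTS = {"high": 10, "medium": 3, "low": 1}

def average_severity(counts):
    """Weighted average severity of a group of issues (hypothetical helper)."""
    total = sum(counts.values())
    return sum(WEIGHTS[k] * n for k, n in counts.items()) / total if total else 0.0

def satisfaction_objective(x_n, a_n, leftover_prev, s_n1, s_n2):
    """S_n = S_n1 * X_n + S_n2 * (a_n + leftover_prev - X_n), following (19).

    x_n           : issues fixed in release n by time t, X_n(t)
    a_n           : potential issues of release n
    leftover_prev : a_{n-1} * (1 - F_{n-1}(t_{n-1})), inherited from release n-1
    """
    content = a_n + leftover_prev
    return s_n1 * x_n + s_n2 * (content - x_n)

print(average_severity({"high": 2, "medium": 3, "low": 5}))        # (20 + 9 + 5) / 10 = 3.4
print(satisfaction_objective(80.0, 90.0, 10.0, s_n1=4.0, s_n2=2.0))  # 320 + 2*20 = 360.0
```

The first term rewards issues fixed in the current release, and the second accounts for the issues deferred to the following release at their own average severity.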
We have solved the above nonlinear problem by using a Genetic Algorithm (GA) [28].
To calculate the optimal release time of the n-th release, the estimated parameters of the n-th and (n − 1)-th releases have been used. We have solved equation (20) in MATLAB with an optimization tool, using the solver "ga-Genetic Algorithm". Table 15 shows the parameter values used to obtain the optimal solution. We have taken the Heuristic crossover function and the Tournament selection function with an Elite count of 2.
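For readers without MATLAB, the following pure Python sketch shows the same ingredients (tournament selection, heuristic crossover and an elite count of 2) on a single decision variable, the release time. It is not the MATLAB `ga` solver and does not reproduce equation (20): the objective `toy_satisfaction` is a hypothetical stand-in with a known maximum at t = 12, and the population size, crossover ratio and mutation settings are assumptions.

```python
import random

def genetic_maximize(fitness, lower, upper, pop_size=30, generations=60,
                     elite_count=2, tournament_size=4, ratio=1.2,
                     mutation_rate=0.2, seed=0):
    """Maximize fitness(t) for lower <= t <= upper with a small real-coded GA."""
    rng = random.Random(seed)

    def clamp(x):
        return min(upper, max(lower, x))

    population = [rng.uniform(lower, upper) for _ in range(pop_size)]
    history = []

    def tournament():
        # Pick the fittest of a random subset of the current population.
        return max(rng.sample(population, tournament_size), key=fitness)

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_population = ranked[:elite_count]       # elites survive unchanged
        while len(next_population) < pop_size:
            p1, p2 = tournament(), tournament()
            better, worse = (p1, p2) if fitness(p1) >= fitness(p2) else (p2, p1)
            # Heuristic crossover: step from the worse parent past the better one.
            child = worse + ratio * (better - worse)
            if rng.random() < mutation_rate:
                child += rng.gauss(0.0, 0.05 * (upper - lower))
            next_population.append(clamp(child))
        population = next_population
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

def toy_satisfaction(t):
    """Hypothetical concave objective with its maximum at t = 12 months."""
    return 100.0 - (t - 12.0) ** 2

best_t, best_history = genetic_maximize(toy_satisfaction, 1.0, 36.0)
print(round(best_t, 2))
```

Because the elites are copied unchanged into each new generation, the best objective value found never decreases from one generation to the next.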
Similarly, we have observed that the estimated optimal release time for the third release of jUDDI is 19 months with 92% active users' satisfaction. The real release time we have observed is 14 months. The estimated optimal release time is close to the real release time. For Hive and Pig, we have estimated optimal release times of 14 months and 10 months for the fourth releases at 98% and 98.5% users' satisfaction levels respectively. In the case of Hive, the real release time is 13 months, and in the case of Pig it is 9 months. This shows that the estimated optimal release time is close to the real release time.
In the case of the Whirr product, we have estimated an optimal release time of 10 months for the third release at a 97% users' satisfaction level, which is 2 months more than the real release time (8 months).
We observed that the release time problem based on the complexity of code change metric is more practical and gives a close prediction.

THREATS TO VALIDITY
Factors affecting the validity of the proposed work are as follows:

Internal Validity:
Some of the assumptions made by us may not always reflect reality (e.g. leftover issues of the n-th release may not have a larger severity when they are added to the initial issue content of the (n + 1)-th release). Moreover, we have not empirically validated the real number of leftover issues.

Construct Validity:
We used the information available in the GitHub repository, such as the code change history, for calculating entropy. We processed this information manually, so it may contain some manual errors. Our work considers the total code changes in files resulting from the fixing of different issues; statement level code changes need to be considered instead of file level changes. We have taken some versions collectively as a release where the individual versions have few data points. The selection of the releases is based on equal data points for every release. We do not claim any causal findings for other choices of releases.

CONCLUSION
We estimated the potential value of different issues based on time and the complexity of code changes (entropy) in different releases of Apache open source products. The issues left unresolved in different releases have also been calculated. Results show that across the five products, in 19 of the 23 releases at least 61.5% of the potential issues were addressed before the next release of the software. The leftover issue content carries over into the upcoming releases. The performance of the proposed model (case 6 of Table 1) has been compared with two models (the Goel-Okumoto model and the Yamada delayed S-shaped model). In 20 of the 23 releases, the proposed entropy based model gives a higher R^2 than these two models.
A model for release time prediction using entropy is also proposed. We optimized the objective function of the release time problem using a Genetic Algorithm in MATLAB. The estimated optimal release time is close to the real release time of the jUDDI, Whirr, Hive, Pig and Avro products at 92, 97, 98, 98.5 and 99.7% satisfaction levels.