Identifying Significant Components of Complex Software for Improving Reliability: Using Invocation Relationships and Component Characteristics

The scale of software systems keeps growing, which tends to lower system reliability. To improve reliability, a set of important components that strongly influence the system reliability is usually determined first. However, existing approaches regard only the components that are often called by others as significant, overlooking the fact that components which often call others also strongly affect the system. Besides, these approaches all require component invocation probabilities, which cannot be obtained easily in large-scale systems. To address these problems, we propose a novel approach for identifying the significant components in complex systems. The approach includes two component ranking algorithms, which take into account not only the components that are frequently invoked but also the components that often invoke others. Neither algorithm requires component invocation probabilities: they can rank components based on the component invocation relationships alone, or combine the invocation relationships with the component characteristics for more accurate results. The significant components are selected according to the two ranking results. Extensive experiments are provided to evaluate the approach and compare it with existing methods.


Introduction
Nowadays, various forms of computers are everywhere in our daily life: desktop computers and laptops, smart phones, electronic control units in cars, and embedded systems in intelligent electrical appliances, offering tremendous gains in convenience. However, any failure of these computers may greatly affect people, making reliability a particular concern. As an indispensable part of a computer, software is becoming large in scale and rather complex. Unfortunately, the huge number of components and their complicated invocation relationships pose a serious threat to system reliability and easily cause system failures. Finding an efficient method for improving the reliability of complex software has become an urgent and practical problem.
Improving software reliability is not a thoroughly new topic. There are four main methods in traditional software reliability engineering: fault prevention, fault removal, fault tolerance, and fault forecasting [1]. However, due to the massive number of components in a large-scale software system, applying any of these methods to all components would incur too much cost and be inefficient. In fact, there is no need to do so, because the 80-20 rule also holds in software reliability: Microsoft pointed out that 80% of the failures and crashes in Windows and Office can be ascribed to 20% of the bugs [2]. Therefore, identifying the significant components that impact the system reliability much more than others is the first step in optimizing the reliability of complex software.
For small-scale software, sensitivity analysis based on a Markov model can be employed to locate the significant components that impact the system reliability much more than others [3][4][5]. But applying this classic approach to a large-scale system leads to the state-space explosion problem. Researchers have attempted to attack this issue from other perspectives [6][7][8][9][10]. FTCloud [6], one of these ideas, performs component ranking by invocation frequencies and then finds a set of important components which impact the system reliability much more than other components. FTCloud2 [7], a refinement of FTCloud, adds attention to the component characteristics, namely components are divided into critical and noncritical ones. However, there are still two issues in these component ranking approaches. On one hand, they take into account only components that are frequently invoked by other components, but ignore components that frequently invoke other components. On the other hand, they require stable usage profiles (i.e. component invocation probabilities), which must be recorded over a long period and cannot be obtained when building new complex software.
To address these problems, we put forward a new component ranking approach for identifying significant components in complex software. The approach utilizes the invocation relationships as well as the component characteristics to perform component ranking twice. One ranking captures the impact of components that are frequently called, and the other captures the effect of components that often invoke others. According to the two ranking results, a set of significant components that can greatly influence the system reliability is determined. Compared to the existing approaches, the proposed method does not require component invocation probabilities, takes into account the component characteristics, and achieves better results. The method can also be used to locate the most significant components when designing the architecture.
The rest of the paper is organized as follows. The structure of a software system is modelled in Section 2. Section 3 details the two component ranking algorithms and discusses how to determine significant components. Extensive experiments are provided to evaluate our approach in Section 4. Finally, we conclude the paper.

Component Invocation Graph
The structure of a software system, i.e. the component invocation relationships, can be modelled by a directed graph G = (V, E), where each node c_i in V represents a component and each directed edge (c_i, c_j) in E indicates that component c_i invokes component c_j.
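To make the model concrete, the invocation relationships can be stored as adjacency lists. The following is a minimal sketch; the names (`build_graph`, the example components A, B, C) are our own illustrations, not from the paper:

```python
# Component invocation graph: a directed edge u -> v means "u invokes v".
from collections import defaultdict

def build_graph(invocations):
    """Build caller/callee adjacency lists from (caller, callee) pairs."""
    out_edges = defaultdict(set)  # component -> components it invokes
    in_edges = defaultdict(set)   # component -> components that invoke it
    for caller, callee in invocations:
        out_edges[caller].add(callee)
        in_edges[callee].add(caller)
    return out_edges, in_edges

out_e, in_e = build_graph([("A", "B"), ("A", "C"), ("B", "C")])
print(sorted(out_e["A"]))  # components invoked by A: ['B', 'C']
print(sorted(in_e["C"]))   # components that invoke C: ['A', 'B']
```

Keeping both directions explicit is convenient because the two ranking algorithms in Section 3 traverse the graph along in-edges and out-edges respectively.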

Ranking and Significant Component Determining
Section 3.1 details one of the algorithms and Section 3.2 introduces the other. Then the determination of the significant components, which greatly impact the system reliability, is discussed in Section 3.3.

Component Ranking Inspired by PageRank
In a component invocation graph, some components may have larger indegrees than others. Intuitively, these components are frequently invoked by others and therefore influence the global reliability of the software much more than other components [3]. Based on the principle of PageRank [11], an algorithm is put forward to score the significance of the software components.
Consider a system which has n components. Each component c_i is assigned a significance value S(c_i) that balances the value derived from the components invoking it and a basic value of its own:

S(c_i) = (1 - d) * v_i + d * Σ_{c_j ∈ In(c_i)} S(c_j) / |Out(c_j)|,   (5)

where In(c_i) denotes the set of components that invoke c_i, Out(c_j) denotes the set of components invoked by c_j, and the parameter d balances the derived value and the basic value; it is usually set as 0.85 [6,7]. The basic significance value v_i is uniform (v_i = 1/n) when the critical components are unknown; when they are known, larger basic values are assigned to the critical components. Formula (5) can be written in vector form:

S = (1 - d) v + d M S,   (6)

where the matrix M = [m_ij]_{n×n} is defined by m_ij = 1/|Out(c_j)| if c_j invokes c_i, and m_ij = 0 otherwise. Formula (6) can be solved by calculating the eigenvector with eigenvalue 1. This algorithm ranks the components based only on the component invocation relationships if the critical components cannot be determined. Once they are provided, the algorithm achieves more accurate ranking results. The algorithm outputs the significance values of all the components, and a component with a larger value impacts the global reliability much more than one with a smaller value.
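The ranking above can be sketched as a standard PageRank power iteration over the invocation graph. The function name `rank_by_invoked` and the uniform basic value are illustrative assumptions, and the critical-component weighting is omitted for brevity:

```python
# Sketch of the PageRank-style significance ranking (power iteration).
def rank_by_invoked(out_edges, nodes, d=0.85, iters=100):
    """A component invoked by significant components is itself significant."""
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - d) / n for v in nodes}  # uniform basic value
        for u in nodes:
            targets = out_edges.get(u, ())
            if targets:
                share = d * score[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:  # component with no out-edges: spread its score uniformly
                for v in nodes:
                    new[v] += d * score[u] / n
        score = new
    return score

nodes = ["A", "B", "C"]
out_edges = {"A": ["B", "C"], "B": ["C"]}
s = rank_by_invoked(out_edges, nodes)
# C is invoked most often, so it receives the highest significance value
print(max(s, key=s.get))  # -> C
```

Power iteration converges to the same dominant eigenvector that Formula (6) describes, without forming the n×n matrix explicitly, which matters for large-scale systems.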

Component Ranking Inspired by TrustRank
Opposite to the condition discussed in Section 3.1, some components in a component invocation graph may have larger outdegrees than others. These components tend to call others frequently, and their failures may affect many subsequent components. Therefore, such components should also be considered significant. Following the idea of inverse PageRank in TrustRank [12], we propose another method to estimate the significance of the software components.
Assume that a software system has n components. A significance value S'(c_i) is computed for each component c_i from the components it invokes:

S'(c_i) = (1 - d) * v_i + d * Σ_{c_j ∈ Out(c_i)} S'(c_j) / |In(c_j)|.   (8)

Formula (8) can be written in vector form:

S' = (1 - d) v + d U S',   (9)

where the matrix U = [u_ij]_{n×n} is defined by u_ij = 1/|In(c_j)| if c_i invokes c_j, and u_ij = 0 otherwise. In this way, it can be solved by calculating the eigenvector with eigenvalue 1. Similarly, this algorithm can rank the components using only the component invocation relationships, and achieves more accurate ranking results when the information of component characteristics is added. The algorithm also outputs the significance values of all the components, and a component with a larger value has more influence on the system reliability.
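The inverse ranking can be sketched by running the same power iteration on the reversed invocation graph, so that components invoking many significant components rank highly. The helper names below are our own, and the critical-component weighting is again omitted:

```python
# Sketch of the inverse-PageRank-style ranking on the reversed graph.
def reverse_graph(out_edges, nodes):
    """Turn every edge u -> v into v -> u."""
    rev = {v: [] for v in nodes}
    for u, targets in out_edges.items():
        for v in targets:
            rev[v].append(u)
    return rev

def rank(out_edges, nodes, d=0.85, iters=100):
    """Plain PageRank power iteration with a uniform basic value."""
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - d) / n for v in nodes}
        for u in nodes:
            targets = out_edges.get(u) or nodes  # dangling: spread uniformly
            for v in targets:
                new[v] += d * score[u] / len(targets)
        score = new
    return score

nodes = ["A", "B", "C"]
out_edges = {"A": ["B", "C"], "B": ["C"]}
s = rank(reverse_graph(out_edges, nodes), nodes)
# A invokes the most components, so it now ranks highest
print(max(s, key=s.get))  # -> A
```

Reversing the graph is exactly what replacing the matrix M of Formula (6) with the matrix U of Formula (9) does: the roles of callers and callees are swapped.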

Significant Components Determining
On the basis of the above two algorithms, two series of significance values for all the components in the software can be obtained. One focuses on the components which are frequently invoked, and the other pays attention to the components that frequently call others. Both ranking results are effective and valuable.
Therefore, the Top-(K/2)% components are respectively selected from the two ranking results, and a total of K percent most significant components are determined.
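The selection step can be sketched as follows. Since the handling of duplicates between the two Top-(K/2)% lists is not detailed here, this sketch simply takes their union, so the merged set may contain fewer than K percent of the components; the function name is illustrative:

```python
# Sketch: merge the Top-(K/2)% of each ranking into one significant set.
def select_significant(scores_in, scores_out, k_percent):
    n = len(scores_in)
    half = max(1, round(n * k_percent / 100 / 2))  # Top-(K/2)% of n components
    top_in = sorted(scores_in, key=scores_in.get, reverse=True)[:half]
    top_out = sorted(scores_out, key=scores_out.get, reverse=True)[:half]
    return set(top_in) | set(top_out)

scores_in = {"A": 0.1, "B": 0.3, "C": 0.6}   # from the PageRank-style ranking
scores_out = {"A": 0.5, "B": 0.3, "C": 0.2}  # from the inverse ranking
print(select_significant(scores_in, scores_out, 80))  # e.g. {'C', 'A'}
```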

Experiments
Extensive experiments are provided in this section to validate the proposed approach, estimate the impact of the parameters, and draw performance comparisons between our approach and FTCloud / FTCloud2.

Experimental Setup
To validate the effectiveness and efficiency of the approach proposed in this paper, we compare five approaches, which are as follows:
• RandSel. We randomly choose K percent of the components and then improve their reliability by applying fault-tolerance strategies.
• OurID1. The Top-K percent most important components are identified by the proposed method using only the invocation relationships. Then their reliability is improved.
• OurID2. The Top-K percent most important components are identified by the proposed method with the component characteristics also considered. Then their reliability is improved.
• FTCloud. The Top-K percent most important components are identified by FTCloud. Then their reliability is improved.
• FTCloud2. The Top-K percent most important components are identified by FTCloud2. Then their reliability is improved.
A graph whose degree distribution exhibits a power law is called a scale-free graph. Many previous works show that not only many networks but also program structures, such as MySQL and the Linux Kernel, exhibit approximately scale-free properties [13,14]. Similarly to [6,7], we apply Pajek [15], a network analysis tool, to randomly generate scale-free directed graphs as component invocation graphs for the experimental studies and performance comparisons. In this section, three graphs with 500, 1000 and 2000 nodes respectively are generated.
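As a rough illustration only (our own sketch, not the Pajek procedure used in the experiments), a scale-free directed graph can be grown by preferential attachment, where each new node invokes already popular nodes with probability proportional to their in-degree:

```python
# Sketch: grow a scale-free directed graph by preferential attachment.
import random

def scale_free_digraph(n, m=2, seed=42):
    rng = random.Random(seed)
    edges = []
    pool = list(range(m))  # nodes repeated in proportion to their in-degree
    for new in range(m, n):
        targets = set()
        while len(targets) < m:        # pick m distinct popular targets
            targets.add(rng.choice(pool))
        for t in targets:
            edges.append((new, t))     # edge: new node invokes target
            pool.append(t)             # raise t's chance of future selection
        pool.append(new)
    return edges

edges = scale_free_digraph(500)
print(len(edges))  # each of the 498 added nodes invokes 2 targets -> 996 edges
```

The resulting in-degree distribution has a heavy tail: a few components are invoked by very many others, which is exactly the structure the ranking algorithms exploit.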
In this section, f_int denotes the internal failure probability of the components, with two initial settings: 0.05 and 0.1. For fair comparisons, we suppose that the internal failure probability (f_int) can be reduced by 30 percent after improving the reliability.
In OurID1, OurID2, FTCloud and FTCloud2, the parameter d is used to balance the weight of the derived significance and the basic significance of the component itself. Similar works [6,7] have shown that 0.85 is an optimal value. Thus, in this section, we also set the parameter d to 0.85.
In Section 4.2, OurID1 and RandSel are compared to validate the identification results of the proposed method. The impacts of Top-K and the component internal failure probability are discussed in Sections 4.3 and 4.4, respectively. A performance comparison between OurID1 and FTCloud is drawn in Section 4.5; in this comparison, the component characteristics are not involved. In Section 4.6, OurID2 and FTCloud2 are compared with the component characteristics considered.

Validity of Identification
To validate that the significant components identified by our approach have a greater impact on the system, RandSel and OurID1 are compared on each graph with different K values. Table 1 shows that OurID1 provides better reliability performance than RandSel in all experimental settings. This shows that improving the reliability of the significant components identified by OurID1 yields higher system reliability than enhancing the reliability of randomly selected components. Clearly, OurID1 is able to find significant components that have a stronger effect on the global reliability.
As the K value grows, the reliability provided by OurID1 increases monotonically, while the reliability provided by RandSel may or may not increase. This result indicates that it is effective to apply OurID1 to locate important components.
As the node number grows, OurID1 consistently provides better reliability performance. This observation suggests that OurID1 is capable of effectively identifying significant components in software systems of different sizes.
When the component internal failure probability changes from 0.05 to 0.1, OurID1 consistently provides higher reliability than RandSel, indicating that OurID1 is robust to the component internal failure probabilities.

Impact of K
To examine the impact of the parameter K on the identification results, we compare RandSel and OurID1 under multiple K values and record the reliability results in Table 2. The node number is 1000. When the K value is set to no less than 80 but below 100, using OurID1 obtains better reliability performance than using RandSel regardless of the component internal failure probabilities. When K reaches 100, there is no difference between the two methods, since all the components are selected by either method. This observation demonstrates that the advantage of OurID1 holds across different K values.

Impact of Component Internal Failure Probability
To show that OurID1 is robust to the component internal failure probability (f_int) settings, OurID1 and RandSel are compared under component internal failure probability values ranging from 0.05 to 0.1.

Performance Comparison with FTCloud
FTCloud requires the invocation relationships as well as the component invocation probabilities, while our approach ranks the components without needing the invocation probabilities. Furthermore, our approach takes into account both the components which are frequently invoked and the ones which frequently invoke other components, while FTCloud considers only the former.
The comparative experiments are conducted with three settings of node numbers (500, 1000 and 2000) and various K values. The initial component internal failure probability is set as 0.1 (f_int = 0.1). The experiment results on software system reliability are shown in Table 3. When the node number is 500, OurID1 provides better reliability performance than FTCloud under all the Top-K% settings except 2%. When the node number grows to 1000, the reliability performance of OurID1 is better than that of FTCloud under all the Top-K% settings. When the node number reaches 2000, OurID1 consistently outperforms FTCloud. This observation indicates that OurID1 is more effective than FTCloud in larger software systems.
When the node number and the Top-K% value are both small, i.e. only a few (possibly fewer than 30) significant components are selected, the performance of OurID1 is slightly worse than that of FTCloud, but the difference is very small. This is acceptable, because OurID1 requires less information and can be applied more widely: FTCloud requires stable component invocation probabilities and can only be used to optimize systems that have been running for a period of time, whereas OurID1 works with the component invocation relationships alone and can be applied as soon as the system structure information is available.

Performance Comparison with FTCloud2
Compared to FTCloud, FTCloud2 introduces the component characteristics, but still ignores the impact on the system reliability of the components that frequently call others. The method put forward in this paper allows users to rank components while taking the component characteristics into account, and requires less information than FTCloud2.
The comparisons between our method and FTCloud2 are performed with three settings of node numbers (500, 1000 and 2000) using various K values. The initial component internal failure probability is also set as 0.1 (f_int = 0.1). The weighting parameter for the critical components is set as 0.7 [7]. Table 4 records the experiment results on the global system reliability. Similarly to the results in Section 4.5, OurID2 outperforms FTCloud2 in all experimental settings except when the node number and the Top-K% value are both rather small. OurID2 achieves better results without requiring the component invocation probabilities, whereas FTCloud2 utilizes not only the invocation relationships but also the invocation probabilities.

Conclusion & Future Work
We introduce a ranking-based approach for identifying the significant components in complex software. The approach includes two component ranking algorithms proposed from two different perspectives. One regards the components that are often invoked by others as significant; the other views the components that frequently call others as significant. The components are ranked twice by the two algorithms, and the top K percent most important ones are selected based on the two ranking results. The effectiveness and efficiency of the approach are validated by extensive experiments. Compared with FTCloud and FTCloud2, our approach requires less information but provides better performance in most cases, and can also be employed when designing the system architecture.
The determination of significant components in this approach treats the two ranking results equally, but there may be a better method to select the important components; this will be studied in the future. Future work also includes more experimental analysis on actual software systems and component ranking that considers more factors.