An efficient method of logical expression constraint solving in path-oriented unit test

This paper extends the theory of constraint solving and presents a method for solving the logical expression constraint problems that arise when test cases are generated automatically. It defines the concept of the logical expression constraint on a path of a C program. To improve solving efficiency, it also presents pre-processing and heuristic backtracking algorithms. Our experimental results show that the method can narrow the domains of variables from meaningless infinite intervals to finite ranges and achieve better coverage, and that the algorithms improve testing efficiency and halve the time needed to generate test cases.


Introduction
Path-oriented automatic test case generation is one of the typical approaches to unit testing. It depends on establishing and solving a constraint system, which is mainly solved by search. Researchers have studied this method extensively [1][2][3], but the problem of solving logical expression constraints has received little attention. In practical projects, however, logical expression constraints make up a relatively high proportion of the constraints: statistics from three large projects [4] (aa200c, qlib, and deco) indicate that logical expression constraints exceed 30% of the total. Based on the existing interval algorithm [5] in CTS (Code Testing System), this paper proposes a method for solving logical expression constraints in C language programs, which supports logical expression constraint solving and improves efficiency.

Problem definition
A Constraint Satisfaction Problem (CSP) consists of a set of variables, the domain of each variable, and a set of constraints. Solving a CSP means finding values for the variables, within their domains, such that the constraints are satisfied. A logical expression constraint t carries a logical value: when the value is T, the expression of t must evaluate to True, and the logical value F represents False. The logical expression constraint solving problem studied in this paper is particular in that its goal is not to find values for the variables but to reduce their domains [6]. When values are later searched for within the reduced domains, the search space of the constraint solving process is smaller, which improves the efficiency of test case generation.
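Domain reduction for a single constraint can be illustrated with a minimal sketch. It assumes integer variables and atomic expressions of the form "x op constant"; the names Interval, FULL, NEGATED and reduce_domain are hypothetical and do not come from CTS or from the interval algorithm in [5].

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lo, hi]; infinite bounds use math.inf."""
    lo: float
    hi: float

    def is_empty(self) -> bool:
        return self.lo > self.hi


FULL = Interval(-math.inf, math.inf)

# Negation of each relational operator, used when the logical value is F.
NEGATED = {">": "<=", ">=": "<", "<": ">=", "<=": ">"}


def reduce_domain(domain: Interval, op: str, constant: int, value: bool) -> Interval:
    """Narrow `domain` so that (x op constant) evaluates to `value` (assumed form)."""
    if not value:
        op = NEGATED[op]
    lo, hi = domain.lo, domain.hi
    if op == ">":
        lo = max(lo, constant + 1)   # integers: x > c means x >= c + 1
    elif op == ">=":
        lo = max(lo, constant)
    elif op == "<":
        hi = min(hi, constant - 1)   # integers: x < c means x <= c - 1
    elif op == "<=":
        hi = min(hi, constant)
    else:
        raise ValueError(f"unsupported operator: {op}")
    return Interval(lo, hi)


positive = reduce_domain(FULL, ">", 0, True)          # x > 0 must hold
bounded = reduce_domain(positive, "<", 10, True)      # and x < 10 must hold
conflict = reduce_domain(Interval(1, 5), ">", 7, True)
```

Here an empty interval (is_empty() returns True) signals that no value in the domain can satisfy the constraint, which is the situation treated as a conflict later in the paper.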

Introduction of algorithm
This paper establishes a framework for solving logical expression constraints and proposes three algorithms within it. The overall procedure is shown in Algorithm 1. It first solves the non-LCS constraints on the path and takes the result as the initial variable domain. It then calls the "pruning" algorithm and the Variable Level Determination Algorithm for pre-processing, and calls the heuristic backtracking algorithm if a conflict arises during solving.

"Pruning" algorithm for simplification
A logical expression constraint can be represented by a set of atomic expression constraints (ACS). In this representation, each atomic expression is treated as a whole and the relationships among the variables inside it are ignored. We can then find the combinations of logical values of the atomic expressions that satisfy the logical expression constraint by building a truth table, and express the result as a union of sets.
For example, consider a simple logical expression constraint t that joins its atomic expressions with && and || and carries the logical value F. Its truth table is shown in Table 1.
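A truth-table enumeration of this kind can be sketched as follows. The three placeholder atoms e1, e2 and e3, the shape (e1 && e2) || e3, and the function name satisfying_rows are assumptions made for illustration; they are not the operands of the constraint in Table 1.

```python
from itertools import product


def satisfying_rows(expr, n_atoms, target):
    """Enumerate the truth table of `expr` over `n_atoms` atomic expressions
    and keep the rows whose result equals the required logical value."""
    return [
        values
        for values in product([True, False], repeat=n_atoms)
        if expr(*values) == target
    ]


# Hypothetical constraint t = {e1 && e2 || e3, F}.
# As in C, `and` binds more tightly than `or`.
rows = satisfying_rows(lambda e1, e2, e3: e1 and e2 or e3, 3, False)

for row in rows:
    print(row)
```

Each surviving row is one candidate assignment of logical values to the atoms, and the union of these rows is the set of combinations that the solver has to consider. The table has 2^n rows for n atoms, which is why the number of combinations grows quickly.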
Table 1 shows that the combinations of atomic expressions are diverse and complex. In particular, when a logical expression constraint consists of many atomic expression constraints, the solution space grows exponentially. From Table 1 we can also obtain the ACS sets that satisfy the constraint. To improve solving efficiency, the set of atomic expressions can be simplified. Simplification is based on the initial variable domain: an atomic expression constraint that is always satisfied is removed by the "pruning" method, which avoids unnecessary solving. For example, in the ACS set, if it is certain that the initial variable domain of is

Variable Level Determination Algorithm based on variable dependencies
According to the dependencies between variables, the Variable Level Determination Algorithm (VLDA) sorts the ACS and solves the domains of high-priority variables first, so that the resulting variable domains are more accurate, which benefits the selection and generation of test cases.
For example, suppose the logical expression constraint t is a conjunction (&&) of atomic expressions with logical value T, where a, b and c are integer variables and the initial domain of each variable is (−∞, +∞). The ACS can be obtained directly from t. If we call the interval algorithm on each atomic expression in order from left to right, starting from the initial variable domains, only one variable's domain is narrowed to an interval with a finite lower bound, while the other two remain (−∞, +∞). Looking at the logical expression constraint, however, we can read off the dependencies between variables: two of the variables each depend on the value of another variable, and the remaining variable depends on a constant. If we solve the atomic expressions in the order given by these dependencies, all three domains are narrowed to intervals with a finite lower bound and upper bound +∞. The algorithm therefore narrows the domains of the variables and improves the efficiency of test case generation. The complete VLDA is defined in Algorithm 2.
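The effect of the solving order can be reproduced with a small sketch. It is not the paper's Algorithm 2: the atoms a > b, b > c and c > 0 are hypothetical stand-ins chosen to have the same kind of dependency structure, every atom is assumed to have the form "left > right", each variable is assumed to appear on the left of at most one atom, and the names solve and dependency_order are invented for this illustration.

```python
import math

# Each atom means `left > right`; `right` is a variable name or an integer constant.
ATOMS = [("a", "b"), ("b", "c"), ("c", 0)]


def solve(atoms):
    """Single interval pass over the atoms in the given order (integer variables)."""
    names = {left for left, _ in atoms} | {r for _, r in atoms if isinstance(r, str)}
    dom = {name: [-math.inf, math.inf] for name in names}
    for left, right in atoms:
        right_lo = dom[right][0] if isinstance(right, str) else right
        # left > right implies left >= (lower bound of right) + 1
        dom[left][0] = max(dom[left][0], right_lo + 1)
    return dom


def dependency_order(atoms):
    """Order atoms so that the atom bounding a variable is solved before
    any atom whose right-hand side uses that variable."""
    defined_by = {left: (left, right) for left, right in atoms}
    ordered, seen = [], set()

    def visit(atom):
        if atom in seen:          # also stops on cyclic dependencies
            return
        seen.add(atom)
        right = atom[1]
        if isinstance(right, str) and right in defined_by:
            visit(defined_by[right])
        ordered.append(atom)

    for atom in atoms:
        visit(atom)
    return ordered


left_to_right = solve(ATOMS)
by_dependency = solve(dependency_order(ATOMS))
```

In this sketch the left-to-right pass narrows only c, because a and b are solved while the variables they depend on are still unbounded. The dependency order solves c > 0 first, so the bound propagates to b and then to a.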

Heuristic backtracking algorithm for conflicts
After pre-processing, we can select an element of the ACS and call the interval algorithm to solve its atomic expressions and reduce the variable domains. However, a conflict may occur, that is, no value in the domain of some variable can satisfy the constraints. In this case backtracking is required, and a new element of the ACS is selected and solved. One way to select is blind: an element is chosen at random, which is simple and easy to implement but easily leads to repeated conflicts. The other is heuristic: information is collected from the conflict, and according to this information the search moves in the direction most likely to reach the final state.
Each ACS that satisfies the LCS is represented by a path from the root node of a binary tree to a leaf node, as shown by state1 in Figure 1. After the pre-processing algorithm is called, the result is shown in state2. As the size of the binary tree shows, the pre-processing algorithm greatly reduces the search space. Next, we call the interval algorithm. Suppose we select path1 and solve from the root node. When computing an atomic expression with logical value T on path1, we find a conflict, so we need to backtrack and choose the corresponding expression with logical value F on path2. From the conflict, we can identify the variable involved in the conflicting expression and therefore the relevant expressions on path1; the same holds for the conflicting expression with logical value T on path4 and path5. We can therefore exclude path1, path2, path4 and path5 when backtracking.
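The idea of using conflict information to skip paths can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: every atom is assumed to have the form "variable op constant" over integers, a path is a tuple of (atom name, logical value) pairs, and the names narrow, first_conflict and heuristic_search are hypothetical. After a conflict, the sketch collects the assignments on the failed path that involve the conflicting variable and discards every remaining path that repeats all of them.

```python
import math

NEGATE = {">": "<=", ">=": "<", "<": ">=", "<=": ">"}


def narrow(dom, op, c):
    """Intersect the integer interval `dom` with the set defined by `x op c`."""
    lo, hi = dom
    if op == ">":
        lo = max(lo, c + 1)
    elif op == ">=":
        lo = max(lo, c)
    elif op == "<":
        hi = min(hi, c - 1)
    else:  # "<="
        hi = min(hi, c)
    return lo, hi


def first_conflict(path, atoms):
    """Return the index of the first assignment that empties a domain, or None."""
    doms = {}
    for i, (name, value) in enumerate(path):
        var, op, c = atoms[name]
        if not value:
            op = NEGATE[op]
        lo, hi = narrow(doms.get(var, (-math.inf, math.inf)), op, c)
        if lo > hi:
            return i
        doms[var] = (lo, hi)
    return None


def heuristic_search(paths, atoms):
    """Try paths in order; after a conflict, drop every remaining path that
    repeats the conflicting assignments. Returns (path or None, paths tried)."""
    remaining, tried = list(paths), 0
    while remaining:
        path = remaining.pop(0)
        tried += 1
        k = first_conflict(path, atoms)
        if k is None:
            return path, tried
        var = atoms[path[k][0]][0]
        # Assignments on the failed path that constrain the conflicting variable.
        culprit = {step for step in path[: k + 1] if atoms[step[0]][0] == var}
        remaining = [p for p in remaining if not culprit <= set(p)]
    return None, tried


atoms = {"e1": ("x", ">", 5), "e2": ("x", "<", 3), "e3": ("y", ">", 0)}
p1 = (("e1", True), ("e2", True), ("e3", True))
p2 = (("e1", True), ("e2", True), ("e3", False))
p3 = (("e1", True), ("e2", False), ("e3", True))
found, tried = heuristic_search([p1, p2, p3], atoms)
```

In this example p1 fails because x > 5 and x < 3 cannot both hold. Since p2 repeats the same two assignments, it is skipped without being solved, and p3 is found after two attempts instead of three. A blind strategy would have had to try p2 as well.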

Experiments
The RDP (Recursive Descent Parsing) method solves logical expression constraints recursively and adds all possible values of the variables to the solution set, which makes the result (−∞, +∞). A domain of (−∞, +∞) is meaningless for interval reduction. In the experiment, we compared the reduction of the variable domains achieved by the strategy of this paper with that of the RDP method. Statistics from three large projects indicate that the more complex a logical expression constraint is, the less often it occurs, and that the most complex logical expression constraints consist of six atomic expression constraints. The experiment is set up as follows. Six variables are arbitrarily combined into a set of linear atomic expressions using mathematical and relational operators. We then select atomic expressions from the set and combine them into logical expressions with the logical operators (&&, ||, !), which serve as the condition of an if statement in the program under test [7]. The experiment covers 100 test programs, and the results for some functions are listed in Table 2. The results show that the strategy of this paper supports logical expression constraint solving, reduces most variable domains to a finite range, and avoids test case search failures.
The experiment also compares the time needed to generate test cases to illustrate the efficiency of the strategy. The results show that for 86 of the 100 functions, test cases are generated more efficiently than with the RDP method, and the solving time for the remaining functions is similar. The results for some functions are listed in Table 3. They show that the strategy of this paper significantly improves the efficiency of test case generation.
This paper also analyses the 14 of the 100 functions for which test case generation takes longer after the logical expression constraint solving strategy is added. There are two main reasons. First, the baseline test case generation time is small, so the difference is within a few milliseconds and is affected by the state of the machine. Second, when a conflict under the strategy of this paper requires backtracking but solving still fails, a large number of interval operations are performed and more time is wasted.
Tables 2 and 3 (column headers): function; LCS; reduction of variable domain; times of searching test cases; using the RDP method; using the strategy of this paper.

Conclusion
Based on the interval algorithm, this paper supports logical expression constraint solving in path-oriented unit testing and can successfully reduce the domains of the input variables. The "pruning" and variable level determination algorithms also improve the efficiency of test case generation. The experimental results show that the proposed logical expression constraint solving strategy supports complex logical expressions and that efficiency is significantly improved compared with the RDP method. However, when many conflicts occur during backtracking, solving takes additional time. Making conflict backtracking faster and more efficient is a problem for future study. In addition, the algorithm of this paper supports only linear constraints, and solving nonlinear constraints is also left for future work.