Consistency Validation Method for Java Fine-grained Lock Refactoring

Many existing refactoring tools reduce the possibility of lock conflicts and improve system concurrency by reducing lock granularity and narrowing the scope of locked objects. However, such refactorings can change concurrent program behavior and introduce concurrency errors, and often produce code that either does not compile or compiles with altered semantics. To address the problem that refactoring coarse-grained locks into fine-grained locks can change concurrent program behavior, a refactoring consistency validation method for fine-grained locking is proposed. Firstly, the types of behavioral changes caused by the existing refactoring engine are analyzed in terms of thread interactions. Secondly, the relevant consistency checking rules are summarized according to these types. Finally, with the help of program analysis techniques such as call graph analysis, alias analysis, and side-effect analysis, checking algorithms are designed from the consistency rules to check the consistency of the program before and after refactoring. We implement an automatic validation tool as an Eclipse plug-in. Our approach is verified on ten open-source projects, including HSQLDB, Xalan, and Cassandra. A total of 1,483 refactored methods were tested and 60 inconsistent synchronization behaviors were found, improving the robustness of refactoring with respect to data dependence and execution order.


I. INTRODUCTION
Locks are used to control access to shared resources by multiple threads and are one of the most commonly used synchronization mechanisms, but their use can easily lead to lock contention. Lock contention occurs when multiple threads attempt to access a shared resource protected by the same mutual-exclusion lock during program execution. In a highly concurrent environment, especially when the critical section is large or threads enter it frequently, the performance degradation caused by lock contention can be significant. A critical section is a program fragment that accesses a shared resource and cannot be executed by multiple threads at the same time. One important property of a lock is its granularity, a measure of the amount of data the lock protects. A well-chosen lock granularity maximizes the use of shared resources, so optimizing lock granularity is important. Coarse-grained locking uses a small number of locks, each protecting a large amount of data; fine-grained locking uses a large number of locks, each protecting a small amount of data. With coarse-grained locks, the lock overhead is low when a single thread accesses the protected data, but performance degrades when multiple threads access it simultaneously because lock contention increases. Conversely, fine-grained locks increase lock overhead but reduce lock contention.
Because developing concurrent programs is difficult, developers tend to use coarse-grained locking, such as synchronized methods or synchronized blocks in Java, to reduce the burden. However, coarse-grained locking may force many operations to execute sequentially, reducing the efficiency of the program. To optimize synchronization, developers have introduced fine-grained synchronization mechanisms that reduce the locking granularity and narrow the range of locked objects, reducing the possibility of lock conflicts.
Since read operations do not by themselves affect data integrity or consistency, replacing an exclusive lock with a read-write split lock and partitioning the system's function points can, in suitable scenarios, effectively improve the concurrency of the system. Among fine-grained lock refactoring tools, FineLock [1] adopts the read-write split-lock approach and implements automatic fine-grained lock refactoring based on a push-down automaton to convert coarse-grained locks into fine-grained locks automatically, improving refactoring efficiency compared with manual refactoring.
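As a minimal illustration of the coarse-to-fine conversion such tools perform (a hand-written sketch, not FineLock's output), the synchronized methods below are split onto a ReentrantReadWriteLock so that concurrent readers no longer block one another:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Coarse-grained: every access serializes on the object's monitor.
class CoarseCounter {
    private int count = 0;
    public synchronized int get() { return count; }
    public synchronized void increment() { count++; }
}

// Fine-grained: reads share a read lock; only writes take the write lock.
class FineCounter {
    private int count = 0;
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();

    public int get() {
        rwLock.readLock().lock();
        try { return count; } finally { rwLock.readLock().unlock(); }
    }

    public void increment() {
        rwLock.writeLock().lock();
        try { count++; } finally { rwLock.writeLock().unlock(); }
    }
}
```

Under read-heavy workloads the fine-grained version reduces contention, at the cost of the extra lock bookkeeping discussed above.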
In existing automated refactoring, improperly refining coarse-grained locks can aggravate the uncertainty and concurrency of running concurrent programs, and the refactored programs may exhibit new concurrent behaviors such as deadlock, livelock, and data races. For sequential programs, developers are usually encouraged to use regression testing to ensure that refactoring does not change program behavior [2]. However, in a concurrent environment this approach is effective only for the few thread schedules actually exercised by the tests.
There are many studies on the correctness of refactoring concurrent programs. Some researchers believe that preconditions can be set at the beginning of refactoring to prevent inconsistent behavior, but automated refactoring tools cannot check all refactoring preconditions [3][4], so precondition checking alone cannot provide complete consistency verification. In addition, modular reasoning [5] and framework-protected behavior [6] have been proposed to improve the correctness of concurrent programs. Among consistency detection tools for software refactoring, representative ones are Randoop [7][8][9] and EvoSuite [10], which not only detect errors introduced by refactoring but also help improve refactoring efficiency. The refactoring detection method IDiff [11] can effectively detect differences, moves, and refactoring-related changes in source code and helps in understanding software evolution. The model refactoring checking tool CVT [12] helps check the consistency between a model and its evolution. There has been some work on detecting inconsistent behavior before and after the refactoring of concurrent programs [13][14], but correctness verification for fine-grained lock refactoring has yet to be studied in depth. For example, although FineLock has a certain consistency detection mechanism, it still lacks checks that ensure data synchronization and a consistent execution order. Using consistency checking to examine the synchronization behavior before and after fine-grained lock refactoring, together with the program's original external characteristics, allows software developers and maintainers to gain more insight into the software's evolutionary history and thus maintain it better.
To check the synchronization behavior changes caused by fine-grained lock refactoring, the paper proposes a refactoring consistency validation method. We analyzed three kinds of behavior changes caused by the existing refactoring engine and summarized the relevant consistency validation rules. According to the proposed rules, variable overlapping validation, condition missing validation, and sequential violation validation are designed to verify consistency before and after refactoring. In experiments, we perform fine-grained lock refactoring of the benchmark programs with the FineLock refactoring tool, and then use the programs before and after refactoring as input for consistency checking. We validated the tool on ten large-scale real applications, checked a total of 1,483 refactorings for read-write lock separation, and found 60 inconsistent synchronization behaviors. Experimental results show that the proposed method can effectively discover the inconsistent synchronization behavior caused by fine-grained lock refactoring. The main contributions of the paper include the following:
- We found that the existing refactoring technology can introduce concurrency errors that may change the order of threads (see §3).
- We proposed a refactoring consistency validation method oriented to fine-grained locks (see §4).
- We developed an automated verification tool implemented as an Eclipse plug-in (see §5).
- We evaluated our tool on several real-world applications (see §6).
Finally, we conclude the paper in §7.

II. RELATED WORKS
The paper implements a refactoring consistency validation tool for fine-grained locks. We mainly focus on two aspects of related work: fine-grained lock refactoring and refactoring verification.

A. FINE-GRAINED LOCK REFACTORING
In the area of optimizing concurrent code through refactoring, Schäfer, in collaboration with the IBM T.J. Watson Research Center, designed Relocker [15], a refactoring tool for Java explicit locks that refactors synchronized locks into reentrant locks and reentrant locks into read-write locks. Tao et al. [16] proposed an automatic lock decomposition refactoring method for Java programs that divides lock protection domains based on class attribute domains, and implemented the automatic refactoring tool as an Eclipse plug-in. Yu et al. [17], in their research on optimizing synchronization bottlenecks, proposed a lock decomposition method that reconstructs lock dependencies and uses fine-grained locks to protect disjoint sets of shared variables. Zhang et al. [18] proposed FineLock, an automatic refactoring method for fine-grained locks that uses lock degradation and lock decomposition to protect critical sections in a fine-grained way.
The above studies achieve fine-grained protection of critical sections through lock allocation, lock reservation, atomic blocks, lock degradation, lock decomposition, and other techniques, thereby reducing contention for the critical section. Our research checks the consistency of synchronization behavior before and after fine-grained lock refactoring, focusing mainly on lock decomposition and lock degradation.

B. REFACTORING VERIFICATION
In terms of consistency verification of refactorings, Ubayashi et al. [19] proposed RbC (Refactoring by Contract), a contract-based technique for verifying refactorings, to deal with bugs introduced by refactorings of aspect-oriented programs. A contract in RbC consists of preconditions, postconditions, and invariants. With RbC, it is checked whether the refactoring preserves behavior and whether it actually improves the internal structure. Yin et al. [20] proposed Echo, a new method for formally verifying the functional correctness of software that can also be used to verify refactorings; Echo mainly proves that the semantics of the refactored program is equivalent to that of the original program. Garrido et al. [21] specified three useful Java refactorings, giving detailed correctness proofs for two of them. Each of these methods defines specifications and conditions to verify the correctness of refactorings.
Software refactoring modifies software to improve its structure, clarity, extensibility, and reusability without changing its functionality and external visibility. It is therefore necessary to check the consistency of the program before and after refactoring to verify that its functionality and externally visible behavior are preserved. For consistency testing of sequential programs, Silva et al. [10] proposed a regression test suite to verify refactorings; regression tests can also be used for consistency checking before and after refactoring. Abadi et al. [22] proposed a method for verifying parallel code after refactoring based on symbolic interpretation, which leverages the original sequential code, already tested and verified in most cases, and checks whether the refactored code is equivalent to it. Dao et al. [11] proposed a tool for behavioral consistency checking in model refactoring. In the area of consistency checking for concurrent programs, Hofer et al. [23] proposed a new approach to analyzing lock contention in Java applications by tracing locking events in the Java virtual machine, which reveals the causes of lock contention and identifies lock performance bottlenecks. Schäfer et al. [24] proposed a behavior-preservation technique to avoid changing program behavior: by analyzing the possible causes of inconsistent behavior changes in current refactorings, the technique introduces synchronization dependencies, simulates the ordering constraints of the Java memory model, and is proven to guarantee behavior preservation.
Zhang et al [12] proposed a refactoring consistency detection method for concurrent software refactoring, which uses control flow analysis and data flow analysis to detect changes before and after refactoring, and synchronization dependency analysis to detect changes in synchronization dependencies before and after refactoring.
The above studies implement refactoring correctness checking in various forms with good tool automation, but they target concurrent programs in general; we focus on consistency checking for programs refactored with fine-grained locks.

III. MOTIVATION
This section shows code snippets of existing FineLock refactoring tools that change the behavior of programs before and after software refactoring, giving the motivation for the paper.
To illustrate the change in the program's synchronization behavior before and after refactoring, the code structure is described. Figure 1 is adapted from the Guava API documentation and shows a refactoring that splits the critical section in Figure 1(a) into the critical sections in Figure 1(b), protected by a write lock and a read lock, respectively. If two threads execute this code at the same time, the synchronized modifier in Figure 1(a) ensures that only one thread accesses the get() method at a time; both threads return the value of value, null, and the result remains consistent across runs. After refactoring, if thread2 requests the write lock while thread1 holds it, thread2 enters the wait state. When thread1 releases the write lock, the Java memory model flushes the shared variable value in the thread's local memory, null, to main memory. If thread2 acquires the write lock before thread1 acquires the read lock, the original operational semantics may change even though both threads still return null. Figure 2 shows two implementations of the processCached() method, a typical cache-handling operation taken from the Java API documentation for read-write locks. The method processCached() simulates operations on a database and a cache: it first determines whether the data exists in the cache and, if so, reads the data directly from the cache; otherwise it writes the data from the database into the cache.
In Figure 2(a), the method uses synchronized for synchronization control, and the whole method body is protected by one lock: coarse-grained protection. Figure 2(b) is the fine-grained version, which first acquires the read lock and tests cacheValid (lines 3-4). If the condition does not hold, it reads directly and releases the read lock (lines 15-17). If it holds, it releases the read lock and acquires the write lock (lines 5-6); after the cache is written from the database, it acquires the read lock and then releases the write lock, completing the lock-degradation operation (lines 8-11). However, after refactoring, the conditional statement and its statement body are placed in different critical sections. If two threads execute this code at the same time, then even if the data is not in the cache when first checked, another thread may fill the cache between the release of the read lock and the acquisition of the write lock, and without re-checking the condition the cache can be written redundantly.

Because of the exclusivity of the synchronized keyword, all threads must pass through the shared region protected by synchronized serially. Therefore, in Figure 3(a), when method m() is executed, flag1 and flag2 are read directly after self-incrementing, and if the if condition holds, the bug string is output. When method m1() is executed, flag1 is likewise read directly after the assignment operation, and if the if condition holds, the bug string is output. Figure 3(b) shows the refactored code. In method m(), a write lock is first applied to the self-increment operations (lines 7-12), and then a read lock is applied to the if statement and the output statement (lines 14-20). In method m1(), the assignment operation is first protected by a write lock (lines 24-28), and then the if statement and the output statement are protected by a read lock (lines 30-36). After refactoring, the lock object of method m() is rwlock while the lock object of method m1() is nulock.
If two threads execute the methods of class C concurrently, the execution order CS1->CS3->CS2->CS4 may occur, making the program unable to output the bug string.
From the above examples, we can see that the refactoring uses lock downgrading and lock decomposition to achieve fine-grained protection of the critical section, which reduces lock contention to some extent; however, improper lock downgrading and lock decomposition may change the synchronization behavior and lead to program errors.
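For reference, the lock-downgrading idiom that Figure 2(b) approximates is shown below, modeled on the CachedData example in the ReentrantReadWriteLock javadoc. Note the second check of cacheValid under the write lock; it is precisely this kind of re-determination whose absence the paper's condition-missing validation targets. The loadFromDatabase() method is a stand-in for the real query.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CachedData {
    private Object data;
    private volatile boolean cacheValid;
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

    Object processCached() {
        rwl.readLock().lock();
        if (!cacheValid) {
            // Must release the read lock before acquiring the write lock.
            rwl.readLock().unlock();
            rwl.writeLock().lock();
            try {
                // Re-check: another thread may have filled the cache in the
                // window between the unlock and the lock above.
                if (!cacheValid) {
                    data = loadFromDatabase();
                    cacheValid = true;
                }
                // Lock downgrade: take the read lock before releasing the write lock.
                rwl.readLock().lock();
            } finally {
                rwl.writeLock().unlock();
            }
        }
        try {
            return data;
        } finally {
            rwl.readLock().unlock();
        }
    }

    Object loadFromDatabase() { return "db-value"; } // stand-in for the real query
}
```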

IV. CONSISTENCY VALIDATION RULES
The paper focuses on the consistency checking operation of FineLock, a fine-grained lock refactoring tool that uses a push-down automaton to construct different lock patterns. Although consistency detection rules are defined in FineLock to constrain the refactoring of lock degradation and lock decomposition, there are still inconsistent behaviors as described in the previous section. In order to further ensure the correctness of refactoring, we provide an additional description of FineLock's consistency checking rules.
Definition 1 (Invariance of the external behavior of the program before and after refactoring). The application P before and after refactoring is denoted as Pbefore and Pafter, respectively. Since the code in an application P is finite, the set of critical sections contained in P, denoted C, is also finite.
The lock protection relationship states that a variable v within a critical section c is protected by a lock l. Definition 7 describes the difference between the lock sets before and after refactoring: after FineLock refactors a synchronized lock into a read-write lock, the resulting lock set is denoted L. The happens-before relationship is defined as an ordering between read and write operations under the Java memory model and is an important criterion for memory consistency in the Java language. One way to establish this relationship is through synchronization in the program: every operation before an unlock happens before any operation performed after a subsequent acquisition of the same lock.

Definition 8 (Conditional layout).
Based on the above definition, the consistency validation rules for fine-grained lock refactoring are given below.
Rule 1 states that every critical section under lock protection before refactoring remains under lock protection after refactoring, and that there is a one-to-one correspondence between the locks before and after refactoring. Our consistency validation method targets fine-grained lock refactoring, after which the critical sections must still be under lock protection; leaving a critical section unprotected changes the program behavior. If Rule 1 is broken, consistency is violated, but a consistency violation is not necessarily caused by breaking Rule 1: Rule 1 is a necessary but not sufficient condition for behavioral consistency of the program before and after refactoring. Rule 2 requires that data-dependent statements remain within the same critical section after refactoring. Rule 3 states that the original control condition and its conditional-end instruction must lie in the same critical section; after decomposition of the critical section, if they fall into different critical sections and no secondary determination is made within the statement block, thread interaction may change the original operational semantics. Breaking Rule 3 causes visibility problems and violates the consistency of the program.

A. OVERVIEW
In the process of checking consistency, the source program Cbefore is first refactored with fine-grained locks to obtain the refactored program Cafter. Based on the source code, we use the WALA [25] software analysis tool to generate the corresponding call graph and intermediate representation (IR) for the source program Cbefore and the refactored program Cafter. The analysis methods used are mainly alias analysis and side-effect analysis. Alias analysis resolves the aliasing of access variables. Side-effect analysis determines whether the relevant variables involved in a critical section are modified and generates the read/write field sets. Finally, variable overlapping validation, condition missing validation, and sequential violation validation are designed to verify the consistency before and after refactoring according to the summarized consistency rules. Variable overlapping validation checks whether the refactoring destroys data dependencies that existed before the refactoring, condition missing validation analyzes whether the refactoring results in race conditions, and sequential violation validation analyzes whether the refactoring causes deviations in the execution order of threads. The validation framework is shown in Figure 4.

B. CALLGRAPH ANALYSIS
In the framework, the WALA analysis tool is used to generate a call graph for the source and refactored programs. In the generation process, we first obtain the object selected by the user in the inspection operation through Eclipse JDT, then store the object in the analysis scope to build a class hierarchy, and finally generate the call graph from the class hierarchy. In the implementation, the program's call graph is obtained through the makeCallGraph() method of WALA's CallGraphBuilder interface. The call graph contains nodes and edges, where nodes represent methods and edges represent calls between methods.

C. ALIAS ANALYSIS
Aliasing means that two access variables point to the same memory location, so that if the object's value changes through one of them, the change is visible through the other.
Before performing the consistency check, all methods in the program are first traversed via the call graph, and the relevant variables involved in synchronized methods or synchronized blocks are collected. When determining statement dependencies during the check, alias analysis of the relevant variables is required to decide whether two variable accesses refer to the same memory location, avoiding misjudged dependencies. In a program, two kinds of statements may create aliases: assignment between object variables, and binding of object-type arguments to parameters during method calls. If two variables are aliases of each other, represented by a pair (var × var), the alias set is represented as [var × var]*. For example, if x and y are aliases of each other, they are denoted as (x × y), which is equivalent to (y × x).
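The two aliasing sources named above, assignment between object variables and parameter binding at a method call, can be illustrated with a small sketch (the Box type and names are illustrative):

```java
// Two references aliasing the same object: a write through one reference
// is visible through the other, so a dependence analysis must treat
// accesses via x and via y as accesses to the same memory location.
class Box { int value; }

class AliasDemo {
    static int demo() {
        Box x = new Box();
        Box y = x;          // assignment between object variables: alias pair (x × y)
        update(y);          // parameter binding at the call creates an alias (y × b)
        return x.value;     // reads the write performed through y
    }

    static void update(Box b) { b.value = 42; }
}
```

Without alias information, an analysis would miss that `update(y)` writes the field later read through `x`.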

D. SIDE-EFFECT ANALYSIS
Side-effects are defined as modifications of memory units during program execution. The side-effect analysis in the paper traverses the intermediate representation (IR) of a method to determine whether each instruction modifies memory units. The analysis is shown in Algorithm 1.
In the analysis of method call instructions, the call depth is limited to 5 to ensure the execution efficiency of the tool. First, the instruction set corresponding to the critical section is obtained and the sideEffectAnalysis method is called on each instruction (lines 1-5). Second, the call-depth limit is checked (line 7). After analyzing each instruction, if the instruction modifies a field, the field is written into Fprotected_write (lines 8-9). If the currently called method produces no side-effects, the field is written to Fprotected_read (lines 10-11). Finally, if the instruction is a method call, the call-depth counter is incremented by one (line 13) and the sideEffectAnalysis method is recursively called on the instructions of the called method (lines 14-17).
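A simplified, hypothetical sketch of the traversal Algorithm 1 describes is given below. The toy Instr type stands in for WALA's IR instructions; the real implementation operates on SSA instructions, but the classification into protected-read/protected-write sets and the depth-limited recursion into callees follow the same shape.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of Algorithm 1: classify field accesses in a critical section into
// protected-write and protected-read sets, recursing into calls up to 5 levels.
class SideEffects {
    static final int MAX_DEPTH = 5;

    // Toy IR instruction: kind is "getfield", "putfield", or "invoke".
    record Instr(String kind, String field, List<Instr> callee) {}

    final Set<String> protectedWrite = new HashSet<>();
    final Set<String> protectedRead = new HashSet<>();

    void analyze(List<Instr> body, int depth) {
        if (depth > MAX_DEPTH) return;                 // call-depth limit
        for (Instr ins : body) {
            switch (ins.kind()) {
                case "putfield" -> protectedWrite.add(ins.field()); // modifies memory
                case "getfield" -> protectedRead.add(ins.field());  // no side-effect
                case "invoke"   -> analyze(ins.callee(), depth + 1); // recurse into callee
            }
        }
    }
}
```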

E. VARIABLE OVERLAPPING VALIDATION
According to Rule 2 introduced in the previous section, we designed the variable overlapping validation, which judges the relationships between statements based on data dependency. After fine-grained refactoring, if dependent statements are distributed to different critical sections, the execution order of the statements may be broken, resulting in changes in synchronization behavior. Algorithm 2 gives the basic structure of variable overlapping validation.
Algorithm 2 scans each synchronized method and synchronized block, analyzes the fields used in them to determine dependencies, and stores the read and write mappings between statements and protected fields in two sets of key-value pairs. First, each critical section is analyzed and the call-depth counter, RprotectedMap, and WprotectedMap are initialized (lines 4-5). Second, each instruction is traversed and the protected fields in the critical section are divided into protected reads and protected writes, denoted Fprotected_read and Fprotected_write respectively. The read and write operations on each field f are analyzed for each instruction through the sideEffectAnalysis method (line 9). After side-effect analysis, the protected reads and writes are mapped to the corresponding statements (lines 10-11). Finally, statements that share variables across the read mapping RprotectedMap and the write mapping WprotectedMap are compared; if such dependent statements are distributed to different critical sections after refactoring, the method is reported as a variable-overlap inconsistency.
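The final comparison step can be sketched as follows. The map layout (critical-section identifier mapped to the field sets it reads or writes) is an assumption made for illustration, not the paper's actual data structure:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the variable-overlap check: two distinct critical sections that
// touch the same field, one writing and one reading, are data-dependent, so
// splitting them apart during refactoring may break the original order.
class VariableOverlap {
    // criticalSectionId -> fields read / written inside that section
    final Map<String, Set<String>> readMap = new HashMap<>();
    final Map<String, Set<String>> writeMap = new HashMap<>();

    List<String[]> findOverlaps() {
        List<String[]> overlaps = new ArrayList<>();
        for (var w : writeMap.entrySet()) {
            for (var r : readMap.entrySet()) {
                if (w.getKey().equals(r.getKey())) continue; // same section: fine
                for (String f : w.getValue()) {
                    if (r.getValue().contains(f)) {          // write/read on same field
                        overlaps.add(new String[]{w.getKey(), r.getKey(), f});
                    }
                }
            }
        }
        return overlaps;
    }
}
```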

F. CONDITIONAL-MISSING VALIDATION
Condition-missing validation checks the most common race condition, i.e., check first and act later: it determines whether a conditional statement and its statement block are distributed to different critical sections after fine-grained lock refactoring without the program re-determining the state. The condition missing validation is shown in Algorithm 3.
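As an illustration of the check-then-act pattern this validation targets, the hypothetical sketch below splits an absence check and the subsequent insertion into separate read-lock and write-lock critical sections; the class and method names are illustrative, not taken from the paper's benchmarks:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class Registry {
    private final Map<String, String> map = new HashMap<>();
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

    // Racy variant a naive decomposition can produce: the absence check and
    // the insertion sit in different critical sections, and the condition is
    // never re-determined. Another thread may insert k in the gap.
    void putIfAbsentRacy(String k, String v) {
        boolean absent;
        rwl.readLock().lock();
        try { absent = !map.containsKey(k); } finally { rwl.readLock().unlock(); }
        // <-- window: another thread may insert k here
        if (absent) {
            rwl.writeLock().lock();
            try { map.put(k, v); } finally { rwl.writeLock().unlock(); }
        }
    }

    // Safe variant: the condition is re-determined inside the write lock.
    void putIfAbsentSafe(String k, String v) {
        rwl.writeLock().lock();
        try { if (!map.containsKey(k)) map.put(k, v); }
        finally { rwl.writeLock().unlock(); }
    }

    String get(String k) {
        rwl.readLock().lock();
        try { return map.get(k); } finally { rwl.readLock().unlock(); }
    }
}
```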
First, the instruction set of the method is generated, and two read-lock instruction sets Rprotected and R1protected, the write-lock instruction set Wprotected, the condition-variable set IFprotected_read, the read and write field sets Fprotected_read and Fprotected_write, the read-lock counter, and the call-depth limit are initialized (lines 1-4). If a statement is a read-lock operation and the counter is zero, the statements protected by the read lock are written to Rprotected; if an instruction is a conditional judgment, the condition variable is written to IFprotected_read (lines 6-11). If a statement is a write-lock operation and the counter is one, the statements protected by the write lock are written to Wprotected (lines 17-22), and the sideEffectAnalysis method analyzes the write operations on each field f (line 21). If an instruction modifies a field, the field is written to Fprotected_write. If it is a method-call instruction, the call-depth counter is incremented by one and the sideEffectAnalysis method is recursively applied to the instructions of the called method. If a statement is a read-lock operation and the counter is one, the statements protected by the read lock are written to R1protected (lines 24-28). Finally, if the first read-lock critical section contains a conditional-judgment instruction but no conditional-end instruction, the write-lock critical section contains no conditional-judgment instruction, the second read-lock critical section contains a conditional-end instruction but no conditional-judgment instruction, and the condition variable is written (line 32), then the method signature is returned; otherwise, null is returned.

G. SEQUENTIAL VIOLATION VALIDATION
Sequential violation means that the refactoring may rearrange the execution order of multiple critical sections, because fine-grained locking changes the lock objects of synchronized methods and blocks. When a thread releases a lock, the Java memory model flushes the shared variables in the thread's local memory to main memory, following the memory semantics of locks. If the critical sections are reordered (for example, three critical sections write-read-write), the same memory location may be modified consecutively. When a thread then acquires the read lock, the critical section code must read the shared variables from main memory, and the data that should have been read may already have been overwritten.
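A minimal sketch of how decomposition removes the ordering between critical sections: once a writer and a reader guard the same field with different lock objects, no happens-before edge connects the unlock in one to the lock in the other, so under concurrent execution the read may observe stale or already-overwritten data. The class and lock names are illustrative, not from the paper.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class Shared {
    private int data = 0;
    private final ReentrantReadWriteLock lockA = new ReentrantReadWriteLock();
    private final ReentrantReadWriteLock lockB = new ReentrantReadWriteLock();

    void writer(int v) {
        lockA.writeLock().lock();      // releasing lockA flushes data to main memory
        try { data = v; } finally { lockA.writeLock().unlock(); }
    }

    int reader() {
        lockB.readLock().lock();       // acquires a *different* lock: no
        try { return data; }           // happens-before edge with writer()
        finally { lockB.readLock().unlock(); }
    }
}
```

Before decomposition, both methods would have synchronized on the same monitor, so every unlock in writer() happened-before the next lock in reader(); after decomposition that guarantee is gone, which is what the sequential violation validation looks for.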

Algorithm 4: Sequential violation validation
Input: Cla, the target class. Output: SeqVMap, the set of sequential-violation method pairs.
Algorithm 4 presents the basic structure of sequential violation validation, which scans each synchronized method and synchronized block in the class and analyzes the fields and lock objects. First, the synchronized blocks and methods in the class are traversed (lines 1-2), and the write-field mapping SWprotectedMap and the read-field mapping SRprotectedMap are initialized; their keys are the lock objects associated with field writes and reads, respectively. In each critical section, the protected fields are divided into protected reads and protected writes, namely Fprotected_read and Fprotected_write (lines 6-9). Side-effect analysis is performed on each instruction (line 10), after which the protected reads and writes are mapped to the corresponding lock object (lines 11-14). If the write mappings of methods with different lock objects share a variable, and the same variable is also read in the two methods, a possible sequential violation is reported (lines 17-27).

VI. IMPLEMENTATION
We implement our consistency validation tool as an Eclipse plugin. This experiment uses the WALA tool for code analysis. The interface of the validation tool is shown in Figure 5.

FIGURE 5. Consistency validation tool interface
The user selects the source program and the program after fine-grained lock refactoring as input, clicks the Detection button in the menu bar to start the plug-in, and selects the type of validation to run. After execution completes, the result pane shows the project path and the number of each category of synchronization behavior change in the Information column, and the method names and classes of each inconsistency type are presented to the user in the Variable-Overlap, Condition-Miss, and Sequential-Violation columns.

VII. EVALUATION
This section conducts an experimental evaluation of the proposed tools. First, the experimental configuration and selected test programs are introduced, and then the experimental results are analyzed.

A. EXPERIMENTAL SETUP
All experiments are done on an HP Z240 workstation with a 3.6GHz Intel Core i7-7700 processor and 8GB RAM. The workstation runs Windows 10 and has Eclipse 4.12.0, JDK 1.8.0_221, and WALA 1.5.2 installed.

B. BENCHMARKS
Ten real applications were used to evaluate the effectiveness of the proposed validation tool. First, refactoring operations were performed on these applications by FineLock, and the programs before and after refactoring were used as the check objects. These applications include HSQLDB [26], Cassandra [27], SPECjbb2005 [28], JGroups [29], Xalan [30], Fop [31], RxJava [32], Freedomotic [33], Antlr [34], and MINA [35]. HSQLDB is an open-source Java database. Cassandra is an open-source distributed NoSQL database system from Apache. SPECjbb2005 is a Java application server test program. JGroups is a toolkit for reliable messaging; it can be used to create clusters whose nodes can send messages to each other. Xalan and Fop are Apache's XSLT transformation processor and formatting object processor, respectively. RxJava is Netflix's library for composing asynchronous, event-based programs using observable sequences on the Java VM. Freedomotic is an open-source, flexible, and secure Internet of Things (IoT) development framework. Antlr is a parser generator, and MINA is Apache's network application framework. The version information of these programs, the number of synchronized methods (Sync_M) and synchronized blocks (Sync_B), the number of refactoring operations (lock downgrade, lock decomposition, read lock, write lock), and the refactoring times are presented in Table 1.

C. RESULT AND ANALYSIS
In the experiment, the tool was used to check the consistency of the ten benchmarks, and selected cases are presented to illustrate the results.

1) RESULT
After checking the consistency of the above benchmark programs, we classified the causes of inconsistent synchronization behavior before and after refactoring into variable overlap due to statement dependencies, missing conditions due to competing accesses, and violations of thread execution order; the results are shown in Table 2. The experiments reveal 60 inconsistencies in the benchmarks. There are 15 variable-overlap cases, mainly distributed in HSQLDB, SPECjbb2005, Xalan, and JGroups. There are 7 missing-condition cases; RxJava and MINA have none, because their source programs contain fewer built-in monitor objects and the refactoring performs no lock downgrading there. There are 38 sequential violations, 17 of which were detected in HSQLDB because a large number of its refactorings are converted to the lock decomposition mode; the numbers of lock decompositions and lock downgrades in Fop, RxJava, Antlr, and MINA are low, so inconsistencies there are few or absent.
As shown in Table 2, the total time consumed by the ten test programs is 1431 seconds, an average of 143.1 seconds per program. HSQLDB contains the most synchronized methods and blocks, 684 in total, and takes 384 seconds to check. Cassandra is relatively large, so its traversal analysis is long: although only 10 inconsistencies were detected, the check takes 371 seconds. SPECjbb2005, JGroups, and Xalan take 127, 134, and 137 seconds, respectively; RxJava, Freedomotic, Antlr, and MINA are relatively small programs and take about 30 to 60 seconds each. No inconsistent synchronization behavior was detected in Fop, yet it still takes 89 seconds. Analyzing these validation times, we found that the time is spent mainly on static analysis of the program: the larger the program, the longer the static analysis and therefore the total validation time. Although our validation times are not especially fast, manual validation spends far more time searching through code, while the proposed tool completes the check automatically and greatly reduces the time required.

2) CASE STUDY
Take SPECjbb2005 as an example of presenting the inconsistent objects found, as shown in Table 3. In the table, the Class column lists the paths of the classes in which methods with inconsistent synchronization behavior exist, and the Method column lists the method signatures that violate the consistency rules. We manually checked the reported inconsistent methods, and the synchronization behavior of the listed methods has indeed changed after refactoring. Examining the TimerData class in the project shows that its method updateTPMC() exhibits variable overlap: the dependent statements related to the variable tpmc are split into two critical sections by the fine-grained locking, which causes an error. No missing conditions were detected in this project, because there were relatively few lock downgrading operations and they were almost always refactored correctly. The test results show that the synchronization behavior changes due to sequential violations are mainly distributed in the District class, which stores modified user information and zone adjustments. For the program segment selected from SPECjbb2005, Figure 6 shows the refactoring that splits the critical section in Figure 6(a) into a critical section locked by a write lock (lines 2-9) and one locked by a read lock (lines 10-15) in Figure 6(b). According to the definition and Rule 1 in Section 3, before refactoring there is a critical section Ci locked by synchronized in the method updateTPMC, in which statement 4 performs a write operation op_w on the tpmc variable and statement 5 performs a read operation op_r on it, and there is a data dependency between the two statements. After refactoring, Ci is split into two critical sections Ci1 and Ci2 locked by the read/write lock tlock, op_w and op_r are distributed to different critical sections, and this splitting destroys the dependency.
If two threads execute this code at the same time, their execution paths are shown in Figure 7. In the source program, the synchronized modifier ensures that updateTPMC() is accessed by only one thread at a time, so thread T1 executes all statements in the method body before T2 starts, and the reading of the tpmc value is unaffected. The refactored execution sequence is shown in Figure 7(b): after T1 executes the write op_w on the tpmc variable, it releases the write lock; if T2 then acquires the write lock and writes to the variable, it affects T1's subsequent read of tpmc, which is inconsistent with the original behavior of the program. We refer to this phenomenon as variable overlap.
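The split described above can be sketched in Java. This is a hypothetical illustration of the updateTPMC pattern, not the project's exact code: the field names, the arithmetic, and the method variants are assumptions made for the example.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch of the updateTPMC pattern from SPECjbb2005.
// Fields, arithmetic, and method names are assumptions, not the real code.
class TimerData {
    private final ReentrantReadWriteLock tlock = new ReentrantReadWriteLock();
    private double tpmc;
    private double temp;

    // Before refactoring: one synchronized critical section Ci keeps the
    // write to tpmc (op_w) and the dependent read (op_r) atomic together.
    synchronized double updateTPMCCoarse(double measured) {
        tpmc = measured * 60.0;   // statement 4, op_w: write tpmc
        temp = tpmc / 2.0;        // statement 5, op_r: read tpmc, depends on op_w
        return temp;
    }

    // After refactoring: the dependency is split across Ci1 (write lock)
    // and Ci2 (read lock). Another thread's write can interleave between
    // the two sections, so temp may be computed from a different tpmc.
    double updateTPMCFine(double measured) {
        tlock.writeLock().lock();
        try {
            tpmc = measured * 60.0;      // Ci1: op_w
        } finally {
            tlock.writeLock().unlock();
        }
        tlock.readLock().lock();
        try {
            temp = tpmc / 2.0;           // Ci2: op_r, dependency no longer protected
            return temp;
        } finally {
            tlock.readLock().unlock();
        }
    }
}
```

In a single thread both variants compute the same value; the inconsistency only manifests when a second thread's write lands between Ci1 and Ci2.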

FIGURE 7. Thread Execution Path Diagram
From the analysis of Figures 6 and 7, such cases match the variable-overlap check. According to Algorithm 2, side-effect analysis of the instructions within the critical section yields the read and write mappings WprotectedMap and RprotectedMap for temp and tpmc; combined with Rule 2, this determines that there is a data dependency between statements 4 and 5 before refactoring and that the refactoring fails to preserve it. Figure 8 shows a case of missing conditions, taking HSQLDB as an example. Before refactoring, the method registerServer() contains a synchronization block with the monitor object serverMap; after fine-grained lock refactoring, a lock downgrading operation is performed on this synchronization block. Figure 8(b) shows that the conditional statements and statement blocks related to serverMap are split into two critical sections.

FIGURE 8. Conditional Missing Test Result
If two threads T1 and T2 execute the method after refactoring, both may first read the data and store it in their own buffers, as shown in Figure 9. If T1 executes the state check on serverMap first and modifies the state after the check is satisfied, T2's check may then fail and affect its subsequent execution. Before refactoring, however, T1 executes all its operations and updates memory before T2 reads the state, so such refactoring causes the missing-condition problem.
From the analysis of Figures 8 and 9, such cases match the missing-condition check: according to Algorithm 3 and Rule 3, the refactored conditional-judgment instruction and the final instruction are distributed to different critical sections, which makes race conditions likely when threads access them.

FIGURE 9. Thread Execution Path Diagram
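The lock-downgrading hazard above can be sketched as a check-then-act split. This is a hypothetical illustration of the registerServer pattern: the class name, map contents, and return values are assumptions, not HSQLDB's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch of the registerServer pattern from HSQLDB.
// Names and bodies are assumptions made for the example.
class ServerRegistry {
    private final Map<String, Object> serverMap = new HashMap<>();
    private final ReentrantReadWriteLock rwlock = new ReentrantReadWriteLock();

    // Before refactoring: the check and the update form one atomic
    // check-then-act on the monitor object serverMap.
    boolean registerServerCoarse(String name, Object server) {
        synchronized (serverMap) {
            if (!serverMap.containsKey(name)) {  // condition
                serverMap.put(name, server);     // dependent update
                return true;
            }
            return false;
        }
    }

    // After lock downgrading: the condition is checked under the read lock
    // and the update happens under the write lock. Between the two sections
    // another thread can register the same name, so both threads may pass
    // the check -- the missing-condition behavior change.
    boolean registerServerFine(String name, Object server) {
        boolean absent;
        rwlock.readLock().lock();
        try {
            absent = !serverMap.containsKey(name);   // condition
        } finally {
            rwlock.readLock().unlock();
        }
        if (absent) {
            rwlock.writeLock().lock();
            try {
                serverMap.put(name, server);         // condition may no longer hold
                return true;
            } finally {
                rwlock.writeLock().unlock();
            }
        }
        return false;
    }
}
```

Single-threaded, both variants behave identically; only an interleaving between the read-lock and write-lock sections exposes the lost condition.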
Taking the CommitLog class examined in Cassandra as an example, Figure 10 illustrates a violation of sequential consistency. As shown in Figure 10(a), the methods start() and shutdownBlocking() both perform checks on the started variable (lines 4-17), and both contain statements that modify started. Figure 10(b) shows the result of refactoring with the FineLock fine-grained lock refactoring tool: both methods undergo lock decomposition, with start() decomposed into read-lock (lines 4-9) and write-lock (lines 11-23) sections on the lock object nulock, and shutdownBlocking() decomposed into read- and write-lock sections on the lock object tlock.
The execution path is shown in Figure 11. Before refactoring, both methods lock the same Object, which guarantees the execution order of the methods. After refactoring, the started variable may be read in the start() method and then written in the shutdownBlocking() method, causing the subsequent read of started to be wrong and clearly breaking the program's original execution order.
From the analysis of Figures 10 and 11, such cases match the sequential-violation check. According to Algorithm 4, the lock objects of synchronized methods and blocks are collected; the monitor objects of the two methods are found to be identical before refactoring, but after refactoring they are locked by the different lock objects nulock and tlock, so threads are likely to violate the sequential consistency of the pre-refactoring program.
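The loss of mutual exclusion between the two methods can be sketched as follows. The lock names nulock and tlock follow the paper's description, but the method bodies are simplified assumptions, not Cassandra's actual CommitLog code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch of the CommitLog pattern from Cassandra.
// The lock names follow the paper; the bodies are assumptions.
class CommitLog {
    private final ReentrantReadWriteLock nulock = new ReentrantReadWriteLock();
    private final ReentrantReadWriteLock tlock = new ReentrantReadWriteLock();
    private boolean started;

    // Before refactoring, both methods were synchronized on the same
    // monitor, so their bodies were mutually exclusive and every read and
    // write of `started` was totally ordered. After lock decomposition,
    // start() uses nulock ...
    void start() {
        boolean shouldStart;
        nulock.readLock().lock();
        try {
            shouldStart = !started;        // read of started under nulock
        } finally {
            nulock.readLock().unlock();
        }
        if (shouldStart) {
            nulock.writeLock().lock();
            try {
                started = true;            // write of started under nulock
            } finally {
                nulock.writeLock().unlock();
            }
        }
    }

    // ... while shutdownBlocking() uses tlock. The two methods no longer
    // exclude each other, so a write here can interleave anywhere inside
    // start(), breaking the original execution order.
    void shutdownBlocking() {
        tlock.writeLock().lock();
        try {
            started = false;               // write of started under a different lock
        } finally {
            tlock.writeLock().unlock();
        }
    }

    boolean isStarted() {
        return started;
    }
}
```

Because nulock and tlock are distinct, acquiring one never blocks the other; only restoring a shared lock (or the original monitor) restores the pre-refactoring ordering.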

D. Limitations
Static analysis is performed without running the program, while dynamic analysis records function calls as the program actually runs. Dynamic analysis provides more call information, such as the order and number of calls, but branching statements may cause certain statements not to be executed and therefore not recorded in the call graph. Dynamic analysis gives a clearer picture of the calls and the execution of threads, so combining dynamic and static analysis to accomplish consistency checking deserves further study.

VIII. CONCLUSION
This paper proposes a refactoring consistency validation method for fine-grained locks. It uses WALA to generate intermediate code and analyzes the three behavioral changes caused by the existing refactoring engine: variable overlap, missing conditions, and sequential violations. We then summarize the corresponding verification rules. Based on these rules, variable-overlap validation, missing-condition validation, and execution-order validation are designed to verify the consistency before and after refactoring through call graph analysis, alias analysis, and side-effect analysis. The validation tool was implemented as an Eclipse plug-in and verified on the fine-grained lock refactorings of ten projects including HSQLDB, Cassandra, and Xalan. Experimental results show that the consistency validation tool can effectively check the three refactoring-induced concurrency problems discussed in the paper. In future work, we will explore more concurrency issues caused by fine-grained lock refactoring, and we will use more practical applications to verify the validation tool.