An Approach for Mining Multiple types of Silent Transitions in Business Process

The purpose of process discovery is to construct a process model from the business process execution data recorded in an event log. Many situations lead to silent transitions that appear in the process model but whose execution is not recorded in the event log. Mining silent transitions is therefore one of the difficult problems in process mining. Existing approaches have limitations in discovering silent transitions in concurrent structures and may produce many redundant silent transitions, which complicate the discovered process model. This paper presents a novel approach to discover multiple types of silent transitions from an event log. The basic behavior relations between activity pairs in the event log are used to construct a process model with silent transitions of and-gateway type and loop type. In addition, the technique of behavior distance is proposed to discover silent transitions of skip type. Finally, a process model with multiple types of silent transitions is obtained. Experimental results show that the proposed approach finds multiple types of silent transitions correctly and produces far fewer redundant silent transitions than existing methods. It also significantly improves the F-measure of the model.


I. INTRODUCTION
Nowadays, information systems record the real execution of business processes in the form of event logs. The goal of process mining is to discover, monitor, and enhance real-life processes by automatically extracting valuable information from these event logs. It includes three main domains: process discovery, conformance checking, and process enhancement. Process discovery, which automatically constructs process models by applying different discovery techniques to event logs, is a prerequisite for the other two domains [1]. To date, dozens of process discovery algorithms have been proposed [2][3][4]. Nevertheless, many problems in process mining remain to be solved, such as short loops, duplicated transitions, invisible (or silent) transitions, non-free-choice constructs, noise, infrequent behavior, and incompleteness [5]. Further open issues include privacy preservation in cross-organization business process mining [6,7], efficiency when dealing with large-scale event logs [8], and process model discovery and process similarity measurement that consider both control-flow structures and data-flow information [9]. In this paper, we focus on the detection of silent transitions from event logs. Various factors in real life may cause silent transitions to appear in the process model; these silent transitions only play a routing role, but they cannot be ignored. Silent transitions are difficult to discover because they do not appear in any event trace. The main causes of silent transitions are as follows [10]:
• Some actual tasks are allowed to be missing in some event traces.
• Some meaningless tasks are used only for routing purposes in process models.
• The enactment services of process models allow skipping or redoing the current task and jumping to any previous task, but such execution logic is not expressed in the control logic of the process model.
Several mining approaches are capable of discovering silent transitions [10][11][12][13][14][15][16][17][18][19], such as the α# algorithm [10,11], the genetic algorithm [12], α$ [15], the Inductive Mining (IM) algorithm [2], and Coupled Hidden Markov Model-Nonfree Choice Invisible Task (CHMM-NCIT) [17]. These mining algorithms use techniques such as directly-follows relationships, Markov models, or genetic algorithms to detect silent tasks. However, some yield many redundant silent tasks, and some fail to detect silent tasks that should appear in concurrent structures. The existing process discovery algorithms thus have difficulty discovering silent tasks from given event logs accurately and cannot guarantee an appropriate result. They pay little attention to the behavioral relationship between activities in the event logs when detecting silent transitions. This paper therefore presents a novel approach to discover silent transitions based on the behavior distance. The proposed approach can discover multiple types of silent transitions and deal with silent transitions that appear in concurrent structures, without producing a large number of redundant silent transitions. Experimental results on event logs show that the proposed method is promising compared with existing methods. The main contributions of the paper are as follows:
1. We present a new technique, behavior distance, which quantifies the behavior relationship between activities in the event logs. The behavior distance between activities based on the log, the model, and the concurrent structure is then presented, respectively, to discover silent transitions accurately.
2. The traditional behavioral profile is refined to capture the behavior relationship between activities more accurately. We construct the process model with silent transitions by analyzing the behavior relationship between activities in the event logs.
3. A novel approach to discover multiple types of silent transitions is proposed based on behavior distances and the basic behavioral relations of event logs. The proposed approach finds multiple types of silent transitions correctly and produces far fewer redundant silent transitions than existing methods.
The remainder of the paper is structured as follows. Section II reviews related work on silent tasks. Section III introduces a motivating example. Section IV presents the preliminary concepts. Section V gives the approach for discovering silent transitions of and-gateway type and loop type. Section VI describes the technique for discovering silent transitions of skip type. Section VII shows the evaluation results of the proposed approach. Finally, Section VIII concludes the paper and discusses future work.

II. RELATED WORK
The IM algorithm [2] obtains a process model with silent transitions using the technique of process tree cutting. However, it may yield too many redundant silent transitions, which makes the process model very complicated. Building on [10], the work in [11] gives a method for mining process models with prime invisible tasks. This approach detects invisible tasks by identifying mendacious dependencies, which are separated from causal dependencies. Consequently, if two tasks have no directly-follows relation in the event log, the silent task between them will not be detected, and it is difficult to detect a silent transition that appears in a concurrent structure. In [12], the authors propose a genetic mining algorithm that supports the detection of invisible tasks, duplicated tasks, and non-free-choice constructs. However, this algorithm needs many user-defined parameters and cannot always guarantee the most appropriate result. A mining algorithm based on Synchronet, a synchronization-based model of workflow logic and workflow semantics, is proposed in [13]. The authors state that short loops and invisible tasks can be dealt with easily. However, neither the original model nor the mined model contains any invisible task. The work in [14] presents a decision mining approach to mine decisions from process logs, which emphasizes detecting data dependencies that affect the routing of cases. When interpreting the control-flow semantics at a decision point, the authors propose an approach to identify decision branches starting from invisible tasks. However, not all kinds of silent tasks can be handled by this approach. The work in [15] provides the α$ algorithm to handle silent transitions in non-free-choice Petri nets, but it still cannot detect a silent transition that appears in a concurrent structure.
The works in [16,17] present a method that uses hidden Markov models to construct process models with free-choice and non-free-choice constructs containing invisible prime tasks from incomplete event logs. Both Coupled Hidden Markov Model-Nonfree Choice Invisible Task (CHMM-NCIT) and Coupled Hidden Markov Model-Invisible Task (CHMM-IT) use a Coupled Hidden Markov Model (CHMM) and apply the Baum-Welch algorithm to determine the weights of the CHMM variables. However, the time complexity of these algorithms is very high. In [18], the authors propose a method to construct silent tasks, including silent tasks in non-free-choice relations, when converting a declarative model into an imperative model in the form of a Petri net. This method therefore cannot discover silent tasks when constructing the process model from event logs. In [19], a graph-based algorithm is proposed to discover silent transitions. However, when event logs are very large, it is difficult to obtain database graphs that express the dependencies between events.
None of the methods mentioned above concentrates on the behavioral relation between activities when detecting silent tasks, and they still cannot handle all kinds of silent transitions well without producing too many redundant silent transitions. This paper presents a novel approach to discover silent transitions based on behavioral relationships without generating a large number of redundant silent transitions.

III. MOTIVATION
In the following, we illustrate the limitations of existing approaches to mining silent transitions with an example. Consider the event log $L_1 = \{\sigma_1, \sigma_2, \ldots, \sigma_{13}\}$. The frequency of traces is not considered in this paper. The process models shown in Figure 1 and Figure 2 are obtained by applying the approaches proposed in [3] and [12], respectively, to this log. Applying the method proposed in this paper yields the process model shown in Figure 3. The approach in [3] has difficulty discovering silent transitions between activity pairs that have no directly-follows relation, such as the silent transition $t_5$ shown in Figure 3. The Inductive Mining method [12] produces many redundant silent transitions, which make the model complex and incomprehensible and lead to inaccurate analysis results. To address these problems, this paper presents a novel method for discovering a process model with non-redundant silent transitions, which overcomes the drawbacks of existing approaches to mining silent transitions. The effectiveness of the method is validated through a set of experiments.

IV. PRELIMINARIES
The basic definitions and related knowledge of Petri nets are not introduced in this paper; more details can be found in [20]. The basic knowledge of process mining can be found in [1]. Here, we only present the basic preliminaries and notations used throughout the paper.
Definition 1 [21] (Labeled Petri Net) A labeled Petri net is a 4-tuple $N = (P, T, F, \lambda)$, where $(P, T, F)$ is a Petri net, $\mathcal{A}$ represents the universe of labels that describe actions with an explicit meaning, and $\tau$ represents the particular label that has no domain interpretation, where $\tau \notin \mathcal{A}$. $\lambda: T \to \mathcal{A} \cup \{\tau\}$ is a function that assigns labels to transitions. If $\lambda(t) = \tau$, [...]
[...] we say that there is a directly-follows relation between $a$ and $b$, which is written as $a >_L b$.
Definition 4 [22] (Weak Order (Log)) Let $L$ be an event log over [...] we say that there is a weak order relationship between $a$ and $b$, which is written as $a \succ_L b$.
Definition 5 [1] (Free-choice) A Petri net is free-choice if any two transitions sharing an input place have identical input sets, i.e., $\forall t_1, t_2 \in T: {}^{\bullet}t_1 \cap {}^{\bullet}t_2 \neq \emptyset \Rightarrow {}^{\bullet}t_1 = {}^{\bullet}t_2$. A Petri net is sound if it is safe, has proper completion, and has no dead transitions, and if all transitions are on a path from the input place $i$ to the output place $o$.
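The two log-based relations named above can be computed directly from a log. The following is a minimal sketch, assuming the usual readings of these relations: $a >_L b$ holds if $b$ immediately follows $a$ in at least one trace, and the weak order holds if $a$ occurs somewhere before $b$ in at least one trace. Representing a trace as a tuple of activity labels, and the function names, are illustrative assumptions rather than notation from the paper.

```python
from itertools import combinations


def directly_follows(log):
    """Pairs (a, b) such that b immediately follows a in some trace."""
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}


def weak_order(log):
    """Pairs (a, b) such that a occurs before b (at any gap) in some trace."""
    return {(t[i], t[j]) for t in log for i, j in combinations(range(len(t)), 2)}


# Hypothetical two-trace log in which b and c appear in both orders.
log = [("a", "b", "c", "d"), ("a", "c", "b", "d")]
df = directly_follows(log)
wo = weak_order(log)
```

In this toy log the pair (a, d) is in the weak order relation but not in the directly-follows relation, which is the kind of pair for which approaches based only on directly-follows information have no evidence.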

V. CONSTRUCTING A PROCESS MODEL WITH SILENT TRANSITIONS OF AND-GATEWAY AND LOOP TYPE
Building on [23], this section shows how the behavioral relationships between activities in the event log are associated with their structural relationships in the process model, by means of the behavioral relations and behavioral distances in the log. The correctness of this correspondence is proved by propositions, which provide a theoretical basis for finding the initial process model with silent transitions of and-gateway type and loop type from the event log. Moreover, the approach to discover silent transitions of and-gateway and loop type from the event log is provided.
The directly-follows relation and the weak order relation between activity pairs are qualitative descriptions of behavior relations. Behavior distance, presented in the following, provides a quantitative analysis of behavior relations.
Definition 6 (Trace-based Behavior Distance) Let $L$ be an event log over $T$. For $\sigma \in L$ [...], the behavior distance between $a$ and $b$ is defined as: [...] (1)
Definition 7 (Log-based Maximum Behavior Distance and Minimum Behavior Distance) Let $L$ be an event log over $T$. For [...], the log-based minimum behavior distance $BDis_{min}(a, b, L)$ and the log-based maximum behavior distance $BDis_{max}(a, b, L)$ are defined as: [...]
If an activity $a$ and an activity $b$ are in an exclusive relationship, i.e., they can never both appear in a trace, then $BDis_{min}(a, b, L)$ [...]. The behavior distance matrix of the event log $L_1$ shown in Section III is given in Figure 4.
Literature [11] indicates that mendacious dependencies are important for the discovery of silent transitions.
However, classical behavioral profiles do not distinguish mendacious dependencies from directly-follows dependencies. Moreover, cyclic structures have a substantial impact on interleaving behavioral relations; consider, for example, two exclusive transitions inside a cycle. Therefore, when activity pairs are in an interleaving relation, the classical profile cannot determine the structural characteristics between the activities, such as true concurrency or a cyclic structure. The existing behavioral profile is thus too coarse. Identifying causal dependencies, mendacious dependencies, and the different interleaving order relationships is very important for the detection of silent transitions. Therefore, Definition 8 presents the basic behavior relation based on the log.
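To make the coarseness of the classical profile concrete, the sketch below derives the three classical relations (strict order, exclusiveness, interleaving) from the weak order relation, following the standard behavioral-profile construction rather than the refined relation of this paper. The function names, the relation labels, and both toy logs are illustrative assumptions.

```python
from itertools import combinations


def weak_order(log):
    """Pairs (a, b) such that a occurs before b in some trace."""
    return {(t[i], t[j]) for t in log for i, j in combinations(range(len(t)), 2)}


def classical_profile(log):
    """Classify every ordered pair of distinct activities from weak order only."""
    activities = sorted({act for trace in log for act in trace})
    wo = weak_order(log)
    profile = {}
    for a in activities:
        for b in activities:
            if a == b:
                continue
            forward, backward = (a, b) in wo, (b, a) in wo
            if forward and backward:
                profile[(a, b)] = "interleaving"
            elif forward:
                profile[(a, b)] = "strict_order"
            elif backward:
                profile[(a, b)] = "reverse_strict_order"
            else:
                profile[(a, b)] = "exclusive"
    return profile


# Hypothetical log of a model where a and b are truly concurrent.
concurrent_log = [("s", "a", "b", "e"), ("s", "b", "a", "e")]
# Hypothetical log of a model where a and b are exclusive inside a loop.
loop_log = [("s", "a", "e"), ("s", "b", "e"), ("s", "a", "b", "e"), ("s", "b", "a", "a", "e")]

rel_concurrent = classical_profile(concurrent_log)[("a", "b")]
rel_loop = classical_profile(loop_log)[("a", "b")]
```

Both logs yield the same relation for the pair (a, b), although the underlying structures differ, which is the ambiguity the refined relation is meant to resolve.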
Definition 8 (Basic Behavior Relation based on Log) Let $L$ be an event log over $T$, let [...]. For the event log $L_1$ shown in Section III, the basic behavior relationships between activities are shown in Figure 5.
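The behavior distances of Definitions 6 and 7 can be illustrated with a short sketch. Since the defining equations are not reproduced here, the code rests on a loudly stated assumption: the trace-based distance from $a$ to $b$ is taken to be the smallest positional gap between an occurrence of $a$ and a later occurrence of $b$, so that a trace containing a, c, b in sequence gives distance 2 and a directly-follows pair gives distance 1. The log-based minimum and maximum are then taken over all traces in which the distance is defined. The function names and the use of `None` for undefined distances are hypothetical.

```python
def trace_distance(trace, a, b):
    """Smallest j - i with trace[i] == a, trace[j] == b and i < j, else None.

    ASSUMPTION: this positional-gap reading stands in for Equation (1),
    which is not available in the text.
    """
    best = None
    last_a = None  # position of the most recent occurrence of a
    for pos, act in enumerate(trace):
        # Check b before updating last_a so that a == b never yields gap 0.
        if act == b and last_a is not None:
            gap = pos - last_a
            if best is None or gap < best:
                best = gap
        if act == a:
            last_a = pos
    return best


def log_distances(log, a, b):
    """(minimum, maximum) trace distance from a to b over the log."""
    found = [d for d in (trace_distance(t, a, b) for t in log) if d is not None]
    return (min(found), max(found)) if found else (None, None)


# Hypothetical log: b follows a directly in one trace and after c in the other.
log = [("a", "b", "c"), ("a", "c", "b")]
min_ab, max_ab = log_distances(log, "a", "b")
```

Under this assumption, a pair whose maximum distance exceeds 1 has been observed with other activities in between, which is the situation used in the proof of Proposition 1.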
[...] holds, then $t$ is called a silent transition of skip type.
[...] holds, then $t$ is called a silent transition of loop type.
[...] holds, then $t$ is called a silent transition of switch type.
[...] then $t_1$ and $t_2$ are called silent transitions of and-split type and and-join type, respectively. Both are called silent transitions of and-gateway type.
[...] holds, then $t$ is called a silent transition of side type. Figure 6 illustrates silent transitions of side type, skip type, loop type, switch type, and and-gateway type.
[...] This indicates that there are multiple connecting places between activity $a$ and activity $b$. Preserving only one place and deleting the other places does not affect the behavioral relationships between activity $a$ and activity $b$.
When activity $a$ is fired, activity $b$ and activity $c$ are concurrent. Hence, there exists an execution sequence of the form $\ldots a\,c\,b \ldots$, which produces $BD_{max}(a, b) \neq 1$. This leads to a contradiction. □
Proposition 2 If activity $a$ and activity $b$ are in strict order relation based on the log, then there exists a path composed of transitions and places between activity $a$ and activity $b$ in the Petri net.
Proof. This follows directly from the definition of strict order. □
Proposition 3 For any sound free-choice system, if activity $a$ and activity $b$ are in exclusiveness relation based on the log, then exclusiveness and structural exclusiveness coincide.
Proof. Assume activity $a$ and activity $b$ are not in structural exclusiveness. As $a$ and $b$ are exclusive based on the log, it is impossible that they are structurally concurrent or in a structural cycle. If a path from activity $a$ to activity $b$ exists, then, since $a$ and $b$ are exclusive based on the log, the possible structure between them is as shown in Figure 7, i.e., there exists $c$ [...]. Clearly, activity $b$ is then not in a free-choice structure, which leads to a contradiction. □
FIGURE 7. Activity $a$ and activity $b$ are in exclusiveness while there is a path from $a$ to $b$.
Proposition 4 If activity $a$ and activity $b$ are in the concurrent interleaving order relation based on the log, then they are structurally concurrent.
Proof. Assume activity $a$ and activity $b$ are not in a concurrent structure in the process model. From the definition of the concurrent interleaving order relation, activity $a$ and activity $b$ are then either in an exclusive structure or in a sequential structure, so two cases have to be considered. For the former, without loss of generality, we only consider the concurrent relation of length 1. In order to guarantee that both [...] hold and that the system is sound, the possible structures are as shown in Figure 8. Obviously, [...], which contradicts $a \parallel_L b$. For the latter, as they are in a sequential structure, the possible structures are as shown in Figure 9, i.e., [...]. Obviously, [...] both hold, which contradicts $a \parallel_L b$. As both assumptions lead to contradictions, activities $a$ and $b$ are structurally concurrent.
[...], then the structure between activity $a$ and activity $b$ is divided into three cases.
For convenience, only the most basic possible structural behavior relationships are considered here. Other complex behavior relationships can be transformed into the following basic structures by clustering equivalence classes. In Figure 10(a), [...] when the occurrence sequence is of the form [...], which leads to a contradiction. In Figure 10 [...] holds, which leads to a contradiction. In Figure 10(b), activity $a$ and activity $b$ are exclusive inside a cycle structure. Obviously, activity $a$ and activity $b$ can each occur any number of times, which makes the assumption hold. Hence, for case 2, activity $a$ and activity $b$ must be in an exclusive relationship inside a cyclic structure. [...] Therefore, the assumption holds. In Figure 10(b), as activity $a$ and activity $b$ are in exclusive relation inside the loop structure, the numbers of their occurrences are independent, which yields a contradiction. Hence, for case 3, activity $a$ and activity $b$ must be in a concurrent relationship inside a cyclic structure.
[...]
Proof. Obviously it is correct. Although the silent transition of and-gateway type added by Lemma 1 does not affect the behavior relation of any other activities of the net system, it may produce many redundant silent transitions of and-gateway type. Algorithm 1 removes these redundant silent transitions of and-gateway type according to the structural features between the silent transition and its pre- or post-transitions.
Theorem 1 Let $L$ be the event log over $T$. For $T' \subseteq T$, if $T'$ is a maximal subset of activities that are in the loop interleaving order relation and every $t' \in T'$ belongs to the loop-body part, then there exists a silent transition of loop type that belongs to the loop-redo part.
Proof. Assume that there is no silent transition of loop type, i.e., there exists a visible activity $t$ acting as the loop-redo transition. As every $t' \in T'$ belongs to the loop-body part, the possible net construction of $T'$ and $t$ is shown in Figure 11. It is easy to see that every $t' \in T'$ and $t$ are in the loop interleaving order relation. Hence, the maximal subset of activities that are in the loop interleaving order relation is $T' \cup \{t\}$, which contradicts the condition. Therefore, Theorem 1 holds.
Silent transitions of side type are not discussed further here: for models with multiple start or end activities, they can be realized simply by manually adding a unique start or a unique end transition. Algorithm 1 presents how to construct an initial process model with silent transitions of and-gateway type and loop type from the event log. First, the behavior distance matrix and the behavior relation matrix between activities are obtained from the event log $L$; then, several sub-modules are constructed based on the basic behavior relations. Subsequently, these sub-modules are merged into a whole process model.
Meanwhile, silent transitions of and-gateway and loop type are discovered using Lemma 1 and Theorem 1. Finally, the redundant silent transitions of and-gateway type are removed from the process model.
SubSet denotes an activity subset obtained from the behavior relation matrix, and subModules denotes the set of sub-modules composed of these activity subsets. The function Construct(S, T, structural exclusiveness) constructs a sub-module with S and T in structural exclusiveness; the other cases are similar. InsertSilent(and-type) discovers a silent transition of and-type; the other cases are similar. Merge($Md_i$, $Md_j$, $Md_{ij}$) combines the sub-modules $Md_i$ and $Md_j$ into a larger sub-module $Md_{ij}$.
Step1-Step2 create the behavior distance matrix and the behavior relation matrix from the event log. Step4-Step30 construct several sub-modules with silent transitions from the activity subsets. In Step31-Step36, the sub-modules are combined into a complete process model through the causal dependency and strict order relations in the behavior relation matrix. Step37-Step45 remove the redundant silent transitions of and-gateway type.
The execution of Algorithm 1 is illustrated using the event log $L_1$ from Section III. The behavior distance matrix $BDisMatrix_{n \times n}$ and the basic behavior relation matrix $BRelMatrix_{n \times n}$ are shown in Figure 4 and Figure 5, respectively. The four sub-modules shown in Figure 12 are constructed according to Step4-Step30. Subsequently, they are merged into a whole process model by Step31-Step36, as shown in Figure 13. Finally, according to Step37-Step45, the redundant silent transitions of and-gateway type in Figure 11 are removed, and the process model with silent transitions of and-gateway type and loop type, free of redundant silent transitions, is built, as shown in Figure 14.

VI. MINING SILENT TRANSITIONS OF SKIP TYPE
Section V has shown how to construct a process model with silent transitions of and-gateway type and loop type. This section discusses how to discover silent transitions of skip type based on the initial process model constructed by Algorithm 1, thereby further optimizing the initial process model.
[...]
Proof. Assume that there is no silent transition of skip type between $a$ and $b$. As [...] post-transition set of $t$, which is defined similarly. [...] (2) when $t_1$ and $t_2$ are both silent transitions, the corresponding possible substructure is shown in Figure 16. [...]
Based on Algorithm 1, Algorithm 2 presents a method to discover silent transitions of skip type. The basic idea of Algorithm 2 is to compare the minimum behavior distance of activities in the log with their minimum behavior distance in the model, and to determine whether there is a silent transition of skip type between them using Theorem 2 and Theorem 3.
For the initial Petri net model $M_0$ in Figure 13, according to Step2-Step9 of Algorithm 2, we can find a silent transition of skip type between the activity pair (J, M). The silent transition $t_5$ between A and I can be discovered using Step18-Step23 of Algorithm 2. Finally, an optimized Petri net $M_1$ with multiple types of silent transitions is obtained, as shown in Figure 17.
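The comparison at the heart of Algorithm 2 can be sketched as follows. This is only an illustration of the stated basic idea, not the algorithm itself: it assumes that the minimum behavior distances in the log and in the initial model are already available as dictionaries keyed by activity pairs, and it assumes that a pair is a candidate for a skip-type silent transition when the log shows the two activities closer together than the model allows. The conditions of Theorem 2 and Theorem 3 are not modeled, and the function name, pair names, and numbers are hypothetical.

```python
def skip_candidates(log_min, model_min):
    """Activity pairs whose minimum distance in the log is smaller than in the model.

    ASSUMPTION: a smaller log distance means the log contains behavior that
    bypasses activities the model forces between the pair.
    """
    return sorted(
        pair
        for pair, d_log in log_min.items()
        if pair in model_min and d_log < model_min[pair]
    )


# Hypothetical distances: the model forces y between x and z,
# but the log contains a trace in which z directly follows x.
log_min = {("x", "y"): 1, ("y", "z"): 1, ("x", "z"): 1}
model_min = {("x", "y"): 1, ("y", "z"): 1, ("x", "z"): 2}

candidates = skip_candidates(log_min, model_min)
```

Each candidate pair would still have to be checked against the theorems before a silent transition is inserted between its pre- and post-places.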

VII. EXPERIMENTS
In this section, a series of experiments with real and synthetic event logs is conducted to assess the quality of our approach. Section A presents the evaluation criteria used in this paper. Section B presents the synthetic event logs used in this section. In Section C, we compare the proposed method with existing approaches in terms of the number of silent transitions discovered and the quality of the process model using various synthetic event logs. In Section D, we discuss the experimental results on real event logs.

A. EVALUATION CRITERIA
The quality of a process mining result is characterized along many dimensions. Usually, four main quality dimensions are used: fitness, precision, simplicity, and generalization. In this paper, we use the two most important quality criteria: fitness and precision. Fitness measures how much of the behavior in the event log can be replayed by the model. A model with good fitness allows for the behavior seen in the event log. The model has perfect fitness if all traces in the log can be replayed by the model from beginning to end. There are various ways of defining fitness. Here, we use an alignment-based technique to compute the fitness of the discovered model [24].
Precision measures how much behavior the model allows that is not observed in the event log. Here, we use the method proposed in [25] to calculate the precision. A model with low precision is underfitting. Therefore, a good mining algorithm should score as well as possible on precision when the fitness is good. Sometimes, setting aside a small amount of behavior causes a slight decrease in the fitness value while the precision value increases much more. Therefore, we use the F-measure, which combines fitness and precision as follows [26].
$$F\text{-}measure = \frac{2 \times Precision \times Fitness}{Precision + Fitness}$$
Rediscovery mainly concerns whether the discovered process model is equivalent to the original model. When comparing two models, we only consider their behavioral relationship in this paper. However, traditional trace equivalence yields a true-or-false answer and therefore cannot be applied directly if the models overlap only partially; it cannot intuitively reflect the extent of behavioral equivalence between two models. Therefore, this paper quantifies the degree of consistency between the mined model and the original model using the trace consistency measurement proposed in [23].
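As a small worked illustration of the F-measure used in this section, the sketch below computes the harmonic mean of precision and fitness. The function name and the convention of returning 0 when both inputs are 0 are assumptions made for the example; the input values are hypothetical and are not results from the experiments.

```python
def f_measure(precision, fitness):
    """Harmonic mean of precision and fitness; 0.0 if both are 0 (assumed convention)."""
    if precision + fitness == 0:
        return 0.0
    return 2 * precision * fitness / (precision + fitness)


# Hypothetical models: one with perfect fitness but low precision,
# one that gives up a little fitness for much higher precision.
overly_general = f_measure(precision=0.4, fitness=1.0)
balanced = f_measure(precision=0.9, fitness=0.95)
```

The second hypothetical model scores higher, which reflects why a slight loss in fitness can be acceptable when precision increases much more.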

B. SYNTHETIC EVENT LOGS
The event logs used in this section come from two sources. One is the synthetic event log from Section III, and the other is generated by simulating known process models that contain multiple silent transitions of different types. For real event logs, no reference process model is available to compare with the results of process discovery algorithms, and the correct number of silent transitions in the process model is also uncertain. We therefore designed fifteen artificial process models with different behavior and different types of silent transitions. Each process model contains fewer than 20 activities. Fifteen groups of event logs are generated by simulating the fifteen artificial process models. Unlike real event logs, each synthetic event log thus has a corresponding reference process model. We use these reference models as the ground truth that indicates how many silent transitions exist in the original models. Some of the models are shown in Figure 18. The first model contains silent transitions of skip type; the second contains silent transitions of loop and skip type; the third contains silent transitions of side, and-gateway, and skip type; the fourth contains many silent transitions of skip type and loop type; the fifth contains silent transitions of loop, and-gateway, and skip type; and the last contains all of the previous types of silent transitions.

C. EVALUATION RESULTS WITH SYNTHETIC EVENT LOGS
For the synthetic event log introduced earlier and the 15 groups of artificially generated event logs, we compare the number of silent transitions, the fitness of the model [21], the precision [22], and the consistency between the discovered model and the original model [23].
The comparison results for the event log $L_1$ are shown in Table 1. The experimental results indicate that the proposed approach is superior to the others in terms of the number of silent transitions and the fitness and precision of the model. We then apply the IM algorithm, the α# algorithm, and the proposed method to the 15 groups of artificially generated event logs and compare the results to evaluate the proposed method.
For the event log generated by model 1 shown in Figure 18(a), both the proposed method and the α# algorithm rediscover the three silent transitions of the original model. The IM algorithm, in contrast, produces four silent transitions: it discovers two silent transitions on the path from activity B to activity D instead of one. The process model constructed by the IM algorithm therefore allows additional behavior. For the event log generated by model 2 shown in Figure 18(b), three silent transitions are found by the proposed method and by the α# algorithm. The difference is that the α# algorithm inserts a silent transition between the pre-set place and the post-set place of G, so the discovered model generates an additional executable sequence of the form '... fhi ...'. The IM algorithm discovers five silent transitions, including a redundant silent transition of and-type. For the event log generated by model 3 shown in Figure 18(c), the proposed method and the IM algorithm both rediscover the four silent transitions of the original model, and the models they discover have perfect fitness and precision. The α# algorithm, however, discovers only one silent transition of side type, and the fitness of the discovered model is less than 1. For the event log generated by model 4 shown in Figure 18(d), the proposed method discovers the 4 silent transitions of the original model. The α# algorithm discovers 3 silent transitions and fails to discover the silent transition that occurs in the concurrent structure, which leads to a fitness less than 1. The model discovered by the IM algorithm contains 14 silent transitions, which leads to a precision less than 1. For the event logs generated from the models in Figures 18(a), 18(b), 18(c), and 18(d), the proposed method rediscovers the original model in every case with perfect fitness and precision.
For the event log generated by model 5 shown in Figure 18(e), the proposed method discovers 6 silent transitions, whereas the α# algorithm discovers only 1 silent transition, which leads to fitness and precision less than 1. However, the IM algorithm and the proposed method discover 7 silent transitions and have perfect fitness and precision. For the event log generated by model 6 shown in Figure 18(f), the proposed method discovers 8 silent transitions, whereas the α# algorithm discovers only 4, which leads to precision and fitness both less than 1. The model obtained by the IM algorithm includes 5 silent transitions, which leads to a fitness less than 1.
Figure 19(a) indicates that the method proposed in this paper can identify multiple types of silent transitions in the original model. The number of silent transitions lies between those of the IM algorithm and the α# algorithm, without too many redundant silent transitions. Figures 19(b) and 19(c) indicate that the proposed method may cause a small decrease in fitness in some situations, yet always yields a notable increase in precision compared with the others. Figure 19(d) shows that, compared with the IM algorithm and the α# algorithm, the behavior of the model obtained by this method is more consistent with that of the original model. Figure 19(e) shows that the process models obtained by the proposed approach have a better F-measure than the others. In conclusion, the experimental results show that the proposed method can find multiple types of silent transitions and improves the quality of process discovery results.

D. EVALUATION USING REAL EVENT DATA
In this section, we discuss the results obtained on real logs from the Business Process Intelligence Challenge (BPIC), which are accessible via the 4TU Center for Research Data. Table 2 reports the characteristics of all logs used: the number of cases, the number of events, the number of unique labels, and the minimum and maximum number of events per trace. We mined process models from four BPIC logs with three discovery algorithms: the proposed approach, the IM algorithm, and the α# algorithm. The results of this experiment are given in Table 3.
Figure 20 shows the number of silent transitions obtained when applying the three techniques to the different real-life event logs. For these four real event logs, the proposed method produces fewer redundant silent transitions than the IM algorithm. Moreover, the number of silent transitions mined by the proposed method lies between those of the other two algorithms. Figure 21 shows the fitness, precision, and F-measure values obtained by applying the different methods to the real event logs. For the BPIC_2012 and BPIC_2013_Open event logs, the process model obtained by the α# algorithm has perfect fitness. However, the precision of the model is poor because it contains several isolated activities, which can produce arbitrary behaviors. For the BPIC_2013_Open and BPIC_2013_Close event logs, the process models constructed by the IM algorithm both have perfect fitness, but because of redundant transitions, loops, and concurrency, their precision is very poor. This is especially true for the BPIC_2013_Close event log, whose model can produce executable sequences of any length. For the BPIC_2013_Incidents event log, the model generated by the IM algorithm contains 50 silent transitions, including a large number of redundant silent transitions. Compared with the IM algorithm, the process model obtained by the proposed method achieves higher fitness and precision and contains fewer redundant silent transitions.
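The log characteristics listed for Table 2 are simple aggregates over the traces. As a minimal sketch, assuming a log is held in memory as a list of traces and each trace is a list of activity labels, they can be computed as below. The names `LogStats`, `log_characteristics`, and `toy_log` are illustrative and do not come from the paper, and the toy log is not one of the BPIC logs.

```python
from collections import namedtuple

# Illustrative container for the five characteristics named in the text.
LogStats = namedtuple(
    "LogStats", "cases events unique_labels min_trace_len max_trace_len"
)


def log_characteristics(log):
    """Compute basic characteristics of an event log.

    `log` is assumed to be a list of traces, each trace a list of
    activity labels (a simplification of a real XES log).
    """
    if not log:
        raise ValueError("the event log contains no traces")
    lengths = [len(trace) for trace in log]
    labels = {activity for trace in log for activity in trace}
    return LogStats(
        cases=len(log),
        events=sum(lengths),
        unique_labels=len(labels),
        min_trace_len=min(lengths),
        max_trace_len=max(lengths),
    )


# Toy log made up for illustration only.
toy_log = [
    ["a", "b", "d"],
    ["a", "c", "d"],
    ["a", "b", "c", "b", "d"],
]
stats = log_characteristics(toy_log)
print(stats)
```

For the toy log this gives 3 cases, 11 events, 4 unique labels, and trace lengths between 3 and 5.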
For the BPIC_2012 event log, the process model generated by the α# algorithm contains a large number of isolated activities, resulting in very low precision. Although the process models built by the other algorithms have perfect fitness, their precision is very poor because these models produce too many extra behaviors that do not appear in the event log. Hence, the quality of the obtained process models is still relatively poor. In general, it is more reasonable to use the F-measure to assess the quality of a process model. As shown in Figure 21, the proposed method discovers a process model with a high F-measure value for all event logs.
Based on the experiments using real event logs, we conclude that the proposed method incurs a small drop in fitness in exchange for a significant increase in precision. As a result, the F-measure improves significantly compared with the other methods, and this gain is explained by the increase in precision. The results on both synthetic and real-life event logs therefore confirm the effectiveness of our technique.
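The trade-off described above can be illustrated numerically. The sketch below assumes that the F-measure is the usual harmonic mean of fitness and precision, F = 2 · fitness · precision / (fitness + precision). The function name `f_measure` and the input values are hypothetical and are not taken from Figure 21.

```python
def f_measure(fitness, precision):
    """Harmonic mean of fitness and precision (assumed definition)."""
    if fitness + precision == 0:
        return 0.0  # avoid division by zero when both scores are 0
    return 2 * fitness * precision / (fitness + precision)


# Hypothetical scores, chosen only to illustrate the trade-off:
# a perfectly fitting but imprecise model versus a model that gives up
# a little fitness for much higher precision.
baseline = f_measure(fitness=1.00, precision=0.30)
candidate = f_measure(fitness=0.95, precision=0.80)

print(f"baseline  F-measure: {baseline:.3f}")
print(f"candidate F-measure: {candidate:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two arguments, a model with perfect fitness but poor precision scores low, while a small loss of fitness paired with a large precision gain raises the F-measure.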

VIII. CONCLUSION AND FUTURE WORK
This paper proposes a novel method, based on behavior distance, for mining process models with silent transitions. First, to capture the dependencies between activities more accurately, the minimum and maximum behavior distances between activities are calculated from the event log. Then, the basic behavior relationship between pairs of activities is obtained from the weak order relationship and the behavior distance between activities. After that, an initial process model with silent transitions of and-gateway type and loop type is constructed by analyzing the basic behavior relationship and the behavior distance. On this basis, silent transitions of skip type are mined using the behavior distance between activities in the log, in the model, and in concurrent structures. Finally, experiments on synthetic and real-life event logs verify the effectiveness of the proposed method. The experimental results show that the proposed method successfully discovers various types of silent transitions and improves the model's F-measure without significantly reducing its fitness. Future work has two main aspects: (1) extending the method to non-free-choice structures; and (2) correctly discovering the possible behavioral relationships between activities when the behavioral relationships in the log are incomplete.
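The first two steps summarized above can be sketched in a few lines. The formal definitions are given in the earlier sections of the paper and are not reproduced here, so the sketch rests on two loudly stated assumptions: the behavior distance between activities a and b in a trace is taken to be the difference between the position of an occurrence of a and that of a later occurrence of b, and a is taken to be in weak order with b if a occurs before b in at least one trace. The names `behavior_distances`, `weak_order`, and `toy_log` are illustrative.

```python
def behavior_distances(log):
    """Return {(a, b): (min_distance, max_distance)} over the whole log.

    ASSUMPTION: the distance between a at position i and b at a later
    position j of the same trace is j - i. The paper's own definition
    may differ in detail.
    """
    distances = {}
    for trace in log:
        for i, a in enumerate(trace):
            for j in range(i + 1, len(trace)):
                pair = (a, trace[j])
                d = j - i
                lo, hi = distances.get(pair, (d, d))
                distances[pair] = (min(lo, d), max(hi, d))
    return distances


def weak_order(distances):
    """Pairs (a, b) such that a occurs before b in at least one trace."""
    return set(distances)


# Toy log made up for illustration only.
toy_log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "d"],
]
dist = behavior_distances(toy_log)
order = weak_order(dist)

print(dist[("a", "d")])
print(("b", "c") in order and ("c", "b") in order)
```

In the toy log, b and c appear in both orders, which is the kind of evidence that points to a concurrent structure, while the pair (a, d) has distances ranging from 1 to 3, which hints that the activities between a and d may be skipped. How the paper turns such evidence into and-gateway, loop, and skip silent transitions is specified in its method sections, not in this sketch.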