Automated Discovery of Business Process Simulation Models from Event Logs

Business process simulation is a versatile technique to estimate the performance of a process under multiple scenarios. This, in turn, allows analysts to compare alternative options to improve a business process. A common roadblock for business process simulation is that constructing accurate simulation models is cumbersome and error-prone. Modern information systems store detailed execution logs of the business processes they support. Previous work has shown that these logs can be used to discover simulation models. However, existing methods for log-based discovery of simulation models do not seek to optimize the accuracy of the resulting models. Instead they leave it to the user to manually tune the simulation model to achieve the desired level of accuracy. This article presents an accuracy-optimized method to discover business process simulation models from execution logs. The method decomposes the problem into a series of steps with associated configuration parameters. A hyper-parameter optimization method is used to search through the space of possible configurations so as to maximize the similarity between the behavior of the simulation model and the behavior observed in the log. The method has been implemented as a tool and evaluated using logs from different domains.


Introduction
Business Process Simulation (BPS) is a widely used technique for quantitative analysis of business processes [1]. The main idea of BPS is to generate a set of possible execution traces of a process from a business process model annotated with parameters such as the arrival rate of new process instances, the processing time of each activity, etc. The resulting execution traces are then used to compute performance measures of the process, for example, cycle time, resource utilization, and waiting times for each task in the process.
A key ingredient for BPS is the availability of a simulation model (herein the BPS model) that accurately reflects the actual dynamics of the process.
Traditionally, BPS models are created by domain experts using manual data gathering techniques, such as interviews, contextual inquiries, and on-site observation. This approach is time-consuming [2]. Furthermore, the accuracy of a BPS model discovered in this way is limited by the accuracy of the business process model that is used as a starting point. Yet, oftentimes process models produced by domain experts do not capture all possible execution paths (e.g. exceptional paths are left aside). Indeed, given that business process models are often designed for documentation and communication purposes, they need to strike a balance between completeness and understandability.
Previous studies have advocated the use of Process Mining (PM) techniques to discover BPS models from business process execution logs (also known as event logs) [3,4]. The key idea behind these studies is that a business process simulation model can be obtained by first extracting a process model from an event log using an automated process discovery technique, and then enhancing this model with simulation parameters derived from the event log (e.g. arrival rate, processing times, and conditional branching probabilities).
However, existing proposals in this area do not consider the question of measuring and automatically tuning the accuracy of the resulting BPS models. Instead, existing proposals rely on user intervention to manually tune the resulting BPS models throughout the different steps in their construction.
This article addresses this gap by proposing an accuracy-optimized method to automatically discover BPS models from event logs. The method decomposes the problem at hand into a series of steps, each of which can be configured via one or more hyper-parameters (see footnote 1). A hyper-parameter optimization method is used to search through the space of possible configurations of the BPS discovery method so as to maximize the similarity between the behavior of the BPS model and the behavior observed in the log. In order to guide this optimization procedure, this article puts forward a measure of accuracy for log-derived BPS models.
The proposed method has been implemented as a tool (namely Simod) that generates process simulation models from an event log in the eXtensible Event Stream (XES) format. The resulting simulation model can be executed using two simulators: BIMP [5] and Scylla [6]. The magnitude of the accuracy enhancements achieved by the proposed method has been evaluated via experiments on three event logs from different domains.
This article is a significantly extended version of a tool demonstration paper [7]. The previous tool paper outlines the high-level architecture of Simod.
This article adds a detailed description of the algorithms employed, a definition of the BPS model accuracy measure, and an evaluation of the proposed method.
The rest of the article is structured as follows. Section 2 introduces basic terminology in the field of BPS and discusses existing approaches for BPS model discovery. Section 3 presents the proposed method and the approach for measuring BPS model accuracy. Section 4 discusses the experimental evaluation.
Finally, Section 6 draws conclusions and outlines directions for future work.
Footnote 1: The term hyper-parameter is used to refer to any of the parameters of the method for discovering BPS models, while the term parameter refers to any of the parameters of the generated BPS model, such as the arrival rate or the processing times of activities.

Background and Related Work
This section presents background concepts used throughout the article, followed by an analysis of the related work in data-driven BPS.

Business process simulation
This article considers business process models represented in the Business Process Model and Notation (BPMN). In its basic form, a BPMN process model consists of a set of activity nodes (or activities for short) and gateways that are interconnected by sequence flows. A split gateway has multiple outgoing sequence flows. An exclusive decision gateway is a split gateway that encodes one decision, i.e. when the execution of the process reaches this gateway, only one of its outgoing sequence flows is taken. An inclusive decision gateway allows several of its outgoing flows to be taken when the conditions they encode are satisfied.
However, any inclusive decision gateway can be trivially transformed into a combination of exclusive decision gateways and parallel gateways, hence we can restrict ourselves to exclusive decision gateways without loss of generality. The branches coming out of a decision gateway are called conditional branches.
The cycle time of a process instance (herein called a case) is the amount of time between the moment the case starts its execution and the moment it ends.
By extension, we can also define the cycle time of an instance of an activity as the amount of time between the moment the activity instance is enabled (i.e. ready to be executed) and the moment it completes. The processing time of an activity instance is the amount of time between the moment the activity instance is started and the moment it is completed. Usually, there is a delay between the moment an activity instance is enabled and the moment it starts. This delay is called waiting time. Again by extension, we define the processing time of a case as the amount of time when the process instance is active, meaning that at least one activity instance of this case has started but not yet completed. The waiting time of a case is the cycle time of the case minus the processing time.
These definitions also apply to a process, which consists of a set of observed cases. The cycle time of a process is the mean cycle time of its cases. Similarly, the cycle time of an activity is the mean cycle time of its activity instances.
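To make the time measures above concrete, the following Python sketch computes them from timestamps. It is an illustration only: the function names are hypothetical, a case is assumed to start when its first activity instance starts, and the processing time of a case is computed as the total length of the intervals during which at least one activity instance is running.

```python
from datetime import datetime, timedelta


def activity_times(enabled, started, completed):
    """Return (cycle, processing, waiting) time of one activity instance."""
    cycle = completed - enabled        # enabled -> completed
    processing = completed - started   # started -> completed
    waiting = started - enabled        # enabled -> started
    return cycle, processing, waiting


def case_times(intervals):
    """Return (cycle, processing, waiting) time of one case.

    intervals: list of (start, end) pairs, one per activity instance.
    Assumption: the case starts with its first activity start.
    """
    ordered = sorted(intervals)
    case_start = ordered[0][0]
    case_end = max(end for _, end in ordered)

    # Merge overlapping intervals so concurrent work is counted once.
    processing = timedelta(0)
    cur_start, cur_end = ordered[0]
    for start, end in ordered[1:]:
        if start <= cur_end:
            cur_end = max(cur_end, end)
        else:
            processing += cur_end - cur_start
            cur_start, cur_end = start, end
    processing += cur_end - cur_start

    cycle = case_end - case_start
    return cycle, processing, cycle - processing
```

The cycle time of a process (or of an activity) would then be the mean of the corresponding values over all observed cases (or activity instances).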
Given the above terminology, a BPS model consists of a process model plus the following elements [1]:
• The mean inter-arrival time of cases and its associated probability distribution function, e.g. one case is created every 10 seconds on average with an exponential distribution.
• The probability distribution of the processing times of each activity. For example, the processing times of an activity may follow a normal distribution with a mean of 20 minutes and a standard deviation of 5 minutes, or an exponential distribution with a mean of 10 minutes.
• For each conditional branch in the process model, a branching probability (i.e. the percentage of times the conditional branch in question is taken when the corresponding decision gateway is reached).
• The resource pool that is responsible for performing each activity in the process model. For example, in an insurance claims handling process, a possible resource pool would be the claim handlers. Each resource pool has a size (e.g., the number of claim handlers or the number of clerks). The instances of a resource pool are the resources.
• A timetable for each resource pool, indicating the time periods during which a resource of a resource pool is available to perform activities in the process (e.g. Monday-Friday from 9:00 to 17:00).
• A function that maps each task in the process model to a resource pool.
A BPS model consisting of the above elements can be executed using a discrete event simulator. The simulation of a BPS model yields a simulation log, consisting of the execution traces generated during the execution, as well as a collection of performance measures (e.g. cycle time, resource utilization, etc.).
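The elements listed above can be pictured as a single data structure. The sketch below is a hypothetical in-memory representation, not the format used by BIMP, Scylla, or Simod; all class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Distribution:
    name: str                 # e.g. "exponential", "normal"
    params: Dict[str, float]  # e.g. {"mean": 10.0}


@dataclass
class ResourcePool:
    name: str
    size: int        # number of resources in the pool
    timetable: str   # e.g. "Mon-Fri 09:00-17:00"


@dataclass
class BPSModel:
    process_model: str                                    # hypothetical path to a BPMN file
    inter_arrival: Distribution                           # inter-arrival time of cases
    processing_times: Dict[str, Distribution]             # activity -> distribution
    branching_probabilities: Dict[str, Dict[str, float]]  # gateway -> {branch: probability}
    resource_pools: Dict[str, ResourcePool]               # pool name -> pool
    task_to_pool: Dict[str, str]                          # activity -> pool name

    def validate(self) -> List[str]:
        """Return a list of consistency problems (empty if none are found)."""
        problems = []
        for gateway, flows in self.branching_probabilities.items():
            if abs(sum(flows.values()) - 1.0) > 1e-9:
                problems.append(f"branching probabilities of {gateway} do not sum to 1")
        for task, pool in self.task_to_pool.items():
            if pool not in self.resource_pools:
                problems.append(f"task {task} is mapped to unknown pool {pool}")
            if task not in self.processing_times:
                problems.append(f"task {task} has no processing time distribution")
        return problems
```

The `validate` method only checks the two constraints that follow directly from the definitions: the branching probabilities of a gateway sum to one, and each task maps to an existing pool and has a processing time distribution.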

Related work
Data-driven approaches to BPS can be classified in two categories. The first category consists of approaches that provide conceptual guidance to discover BPS models. The second category consists of approaches that seek to automate the discovery of BPS models. Below we review each of these two categories.
Conceptual guidance for data-driven simulation. These approaches discuss how process mining techniques can be used to extract, validate and tune BPS model parameters, without seeking to provide fully automated support.
The authors in [8] identify four main components of a simulation model, namely entities, activities, resources and gateways. The authors then identify BPS modeling tasks related to each of these components (e.g., modeling gateway routing logic, modeling activities). Later, in [4] the same authors present a literature review on the use of process mining techniques to support each of these modeling tasks. This literature review sheds insights into the question of how to choose process mining techniques for each of the BPS modeling tasks.
In this paper, we use these insights as a starting point for the design of an automated method for discovery of BPS models from event logs.
In [9], the authors present an approach to enhance a given process model with simulation parameters. This approach differs from the one presented in this article in that it assumes that a process model is given as input (in addition to the event log). The approach also assumes that the process model perfectly fits the event log. In reality, though, the traces in the event log may deviate from the behavior captured in the process model. Moreover, the approach in [9] does not seek to provide an automated end-to-end approach for discovering BPS models from event logs; rather, it focuses on providing guidance for some of the steps in the discovery of a BPS model.
The authors in [2] present a methodology for process improvement based on data-driven simulation. As a first step, the authors propose discovering simulation models from data in order to represent the current state of the process. Next, they propose the manual evaluation of possible scenarios to lead the process to a desired state. The work is illustrated with three case studies from the gas, government and agriculture industries. This work is useful for scoping the relevance of using data for discovering simulation models, as well as for identifying the need to automate this task in order to explore possible scenarios more quickly and efficiently. However, it does not provide concrete guidance for discovering accurate BPS models from process execution data.
Automated discovery of BPS models using process mining. The methods in this category seek to automate the discovery of BPS models from event logs by means of process mining techniques.
Rozinat et al. [10] propose a semi-automatic approach to discover BPS models based on Colored Petri Nets (CPNs). In this work, an event log is used as input for the discovery of various elements of a BPS model, including the control-flow structure (i.e. the process model), the conditional branching probabilities, and the resource pools. However, the automatic discovery of activity processing times and case inter-arrival times (and their probability distributions) is left aside. In [3], the authors go further by proposing a technique to discover more complete BPS models that include processing times and case inter-arrival times. These simulation parameters are then combined with the process model into a single CPN, which can be simulated using a CPN tool.
One limitation of the work of Rozinat et al. [3] is that it does not seek to automatically adjust or fit the probability distributions of the processing times of activities (nor the probability distribution of case inter-arrival times). Also, Khodyrev et al. [11] propose a process mining approach to generate BPS models tailored for short-term prediction of performance measures. The authors extract the structure of the process as a Petri net, and the dependencies between elements and variables are established using decision trees. Subsequently, two experiments were conducted with real event logs, in which the authors sought to predict specific performance measures for each process, such as the number of open or closed events during the prediction period, or the processing and cycle times.
A major limitation of this approach is that it does not discover the resource perspective (i.e. the resource pools) of the BPS model. Instead, it is assumed that each activity may be performed by an infinite number of resources. Also, while the discovery of the conditional branching probabilities and of the activity processing times is done automatically, the integration of these elements into a BPS model is left to the user. This latter approach does not define how to measure and optimize the accuracy of the resulting BPS model.
Finally, Gawin et al. [12] combine multiple process mining techniques to create a BPS model that reflects the actual process behavior. Specifically, process mining techniques are employed to extract the process model structure, the resource pools, the activity processing times, and the decision logic of decision gateways. Interviews and process documentation techniques are used to elicit the case inter-arrival times, the costs of resource use, and the definition of resource schedules. The simulation parameters discovered in this way are then manually linked in the ADONIS tool, leading to a BPS model that is then executed with a capacity analysis algorithm of this latter tool. This latter work differs from the one reported in this article in that it does not seek to automate the extraction of all elements of a BPS model (nor their assembly). Also, it does not seek to measure and optimize the accuracy of the BPS model. Table 1 summarizes the capabilities of the above approaches for BPS model discovery. In this table, the symbol (+) implies that the feature is supported, (−) implies not supported, and (+/−) implies partially supported, for example, supported but not in an automated manner.

[Table 1: Capabilities of the approaches for BPS model discovery. Columns include Rozinat et al. and [12] (2015). Characteristics compared include sequence flow discovery, branching probabilities discovery, and accuracy assessment.]

This article advances the state of the art in two ways. First, it proposes a fully automated method for discovering each of the perspectives of a BPS model and assembling the resulting perspectives into a complete BPS model.
Second, it proposes an approach to measure the accuracy of a BPS model and to optimize the accuracy of an automatically discovered BPS model.

Approach
The proposed method takes as input an event log in which every event (corresponding to the execution of an activity instance) has a case identifier, an activity label, a resource attribute (indicating which resource performed the activity), and two timestamps: the start timestamp and the end timestamp.
The resource attribute is required to discover the available resource pools, their timetables, and the mapping between activities and resource pools. Equally, the start and end timestamps are required to compute the processing time of activities and their respective probability distributions. Figure 1 illustrates the steps of the proposed method and their inputs and outputs. The following describes these steps and exemplifies them using a synthetic event log of a purchase-to-pay (P2P) process. This event log meets the aforementioned requirements and consists of 21 activities, 27 resources, and 9119 events related to 608 cases.
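The input requirements above can be checked before any discovery step is run. The following sketch assumes that events have already been loaded into Python dictionaries; the key names (`case_id`, `activity`, `resource`, `start`, `end`) are hypothetical and are not XES attribute keys.

```python
# Attributes that every event must carry according to the input requirements.
REQUIRED_ATTRIBUTES = ("case_id", "activity", "resource", "start", "end")


def find_invalid_events(events):
    """Return (index, reason) pairs for events that violate the requirements."""
    problems = []
    for index, event in enumerate(events):
        missing = [name for name in REQUIRED_ATTRIBUTES if event.get(name) is None]
        if missing:
            problems.append((index, "missing: " + ", ".join(missing)))
        elif event["end"] < event["start"]:
            # An activity instance cannot complete before it starts.
            problems.append((index, "end before start"))
    return problems
```

An empty result indicates that the log carries the attributes needed to discover resource pools and processing time distributions.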

Pre-processing
In this step, an automated process discovery technique is applied to extract a BPMN process model from the event log. Most of the time, due to the characteristics of process discovery algorithms, the discovered models do not reflect 100% of the possible paths that can be taken in a business process (in other words, the fitness is not 100%). Therefore, one of the main concerns at this stage is to avoid distorting the observed reality. To this end, the proposed method also measures the conformance between the discovered BPMN model and the event log, and provides the possibility of applying repair actions on the log in order to improve the fitness between the model and the log.
Control Flow Discovery. This step is critical since it defines the activities, the decision gateways, and the way these are related in the process. We use the Split Miner algorithm [13] to generate BPMN v2.0 models from event logs. We selected this automated process discovery method because it achieves high levels of accuracy (precision and fitness) while at the same time producing simple process models, relative to other state-of-the-art process discovery techniques [14].
However, other process discovery methods, such as the Inductive Miner, could be used instead [15]. Split Miner allows the discovery of models with different levels of sensitivity, which depend on the parameters epsilon (ε) and eta (η).
The parameter ε refers to the parallelism threshold and determines the quantity of concurrent relations between events to be captured. ε is defined in a range between 0 and 1, and the larger the value of this parameter, the greater the number of possible relationships between the different events to be considered in the analysis. On the other hand, η refers to the percentile for the frequency threshold, acting as a filter over the incoming and outgoing edges of each node and retaining only the η percentile of most frequent edges. η is defined in a range between 0 and 1, and the larger the value of this parameter, the greater the percentage of frequencies to be retained. For the tool implementation, we used the command-line version of Split Miner, which takes as input an event log in XES format and outputs a BPMN process model. Table 2 outlines the structure of this log, while Figure 2 illustrates the resulting model using ε = 0.3 and η = 0.7.
Log Repair. Once conformance (fitness) is measured, model repair, event log repair, or both can be performed to improve conformance between the event log and the model, as explained in [17]. In our case, we perform log repair, for which we propose the removal, replacement, or repair of those traces in the log that do not fully fit a trace in the model. To repair a given trace, its corresponding trace alignment is scanned from left to right to apply one of two operations when a move on model (MM) or a move on log (ML) is found.
The event in the trace responsible for an ML is removed. Conversely, when an MM is found, the log is annotated with an event for the corresponding activity that has zero processing time and a special resource called AUTO. This means that this activity does not consume any resources and hence does not have an impact on the cycle time of the process. Finally, the algorithm advances one step in the trace alignment when a synchronous move (SM) is found.
[...] of the start event of the model). Then, the algorithm iterates over each event in the input trace. Each event denotes an execution of an activity, and has a start time and an end time, which allows us to calculate the processing time of the activity. Before handling a given event e, the algorithm fires every gateway that is enabled in the current marking, until no more gateways can be fired.
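The three alignment cases used for log repair (SM, ML, MM) can be sketched as follows. This is a simplified illustration, not the Simod implementation: the alignment is assumed to be a list of `(move, payload)` pairs, events are dictionaries with numeric timestamps, and the timestamp given to an inserted AUTO event (the end of the previous kept event) is an assumption, since the text only requires a zero processing time.

```python
def repair_trace(alignment):
    """Repair one trace by scanning its alignment from left to right.

    alignment: list of (move, payload) pairs where
      ("SM", event)     synchronous move, the event is kept as is
      ("ML", event)     move on log, the event is removed
      ("MM", activity)  move on model, a zero-duration AUTO event is added
    """
    repaired = []
    for index, (move, payload) in enumerate(alignment):
        if move == "SM":
            repaired.append(dict(payload))
        elif move == "ML":
            continue  # drop the event that the model cannot replay
        elif move == "MM":
            if repaired:
                timestamp = repaired[-1]["end"]
            else:
                # No earlier event: fall back to the next synchronous event.
                timestamp = next(
                    (p["start"] for m, p in alignment[index + 1:] if m == "SM"), 0)
            repaired.append({"activity": payload, "resource": "AUTO",
                             "start": timestamp, "end": timestamp})
        else:
            raise ValueError(f"unknown move type: {move}")
    return repaired
```

Because the inserted event has identical start and end timestamps and uses the AUTO resource, it adds no processing time and occupies no resource pool.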
When an XOR-gateway is fired, the conditional flow that leads to an activity corresponding to event e is traversed. Accordingly, the traversal frequency of this conditional flow is increased by one.
Next, the algorithm iterates over the activities enabled in the current marking. If an activity is enabled and the enablement time of the next occurrence of this activity has not yet been initialized (the activity was not enabled before), then the enablement time of this activity is set to be equal to the current execution time. At this point, and given that the model can parse every trace in the (repaired) input log, the activity corresponding to event e must be enabled.
[...] normal, exponential, uniform, fixed-value, triangular, gamma and log-normal.
In the running example, we find that the PDF that best fits the observed inter-arrival times is an exponential PDF with a mean of 15455 seconds. [...] For example, each of the 21 activities in the purchasing process event log was analyzed. As can be seen in Table 3, most of the processing times of the activities follow a uniform distribution with a mean of 3600 seconds.
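Selecting the best-fitting PDF for a sample of inter-arrival or processing times can be illustrated with a short sketch. The selection criterion used here (maximum log-likelihood under maximum-likelihood parameter estimates) is an assumption, and only four of the listed families (exponential, normal, uniform, fixed-value) are covered, using the Python standard library.

```python
import math
from statistics import fmean, pstdev


def fit_candidates(samples):
    """Return {family: (parameters, log_likelihood)} for a few PDF families."""
    n = len(samples)
    mean, std = fmean(samples), pstdev(samples)
    low, high = min(samples), max(samples)
    fits = {}
    if mean > 0 and low >= 0:
        # Exponential MLE: rate = 1 / mean, so logL = -n * (log(mean) + 1).
        fits["exponential"] = ({"mean": mean}, -n * (math.log(mean) + 1))
    if std > 0:
        # Normal MLE with population standard deviation.
        fits["normal"] = ({"mean": mean, "std": std},
                          -0.5 * n * (math.log(2 * math.pi * std ** 2) + 1))
    if high > low:
        # Uniform MLE on [min, max].
        fits["uniform"] = ({"min": low, "max": high}, -n * math.log(high - low))
    return fits


def best_fit(samples):
    """Return (family, parameters) of the candidate with the highest likelihood."""
    if min(samples) == max(samples):
        return "fixed-value", {"value": samples[0]}
    fits = fit_candidates(samples)
    family = max(fits, key=lambda name: fits[name][1])
    return family, fits[family][0]
```

A goodness-of-fit test or an information criterion could be substituted for the raw log-likelihood, which tends to favor families with more parameters.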
Resource pools. Resource pools (which correspond to organizational roles or groups) are discovered by using the algorithm proposed in [20]. This algorithm [...] [20] then assigns each activity to one or more resource pools. However, in a BPS model, each activity must be assigned to exactly one resource pool.
Hence, we post-process the output of the algorithm in [20] to fulfil this property by assigning each activity to the pool that most frequently performs it.
In our running example, there are 26 resources. The algorithm in [20] identifies 5 resource pools and assigns each activity to exactly one pool (see footnote 6).
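The post-processing rule described above (each activity goes to the pool that most frequently performs it) can be sketched as follows. The input format is an assumption: a list of `(activity, resource)` pairs taken from the log and a mapping from each resource to its discovered pool.

```python
from collections import Counter, defaultdict


def assign_activities_to_pools(events, resource_to_pool):
    """Map each activity to the single pool that performs it most often.

    events: iterable of (activity, resource) pairs.
    resource_to_pool: dict mapping each resource to its discovered pool.
    """
    counts = defaultdict(Counter)
    for activity, resource in events:
        counts[activity][resource_to_pool[resource]] += 1
    # most_common(1) keeps the first pool encountered in case of a tie.
    return {activity: pool_counts.most_common(1)[0][0]
            for activity, pool_counts in counts.items()}
```

For example, an activity performed twice by a member of one pool and once by a member of another is assigned to the first pool.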
Simulation model assembly. Once we have compiled all the simulation parameters, we put them together with the BPMN model into a single data structure. This step is dependent on the target simulation tool (e.g. BIMP or Scylla).
In BIMP for example, this step involves embedding the simulation parameters inside the BPMN model, using proprietary XML tags.
Simulate Process. In this last step, the BPS model is given as input to a process model simulator. The simulator outputs a simulated event log. Below we discuss how the accuracy of the resulting BPS model is assessed and optimized.
Footnote 6: It turns out that this event log contains information about roles and about the mapping from activities to roles. We found that the algorithm in [20] re-discovered the roles already present in the event log (without using this information) with 100% accuracy.

Assessment and optimization
The objective of this stage is to assess how accurate the event log generated in the simulation stage is in relation to the event log used as input. As mentioned before, it is possible to discover multiple versions of the process due to the characteristics of process model discovery algorithms, which could be more [...] the Student's t-test, which is used to rule out models that are not representative for explaining the value of an objective variable. However, this test helps to reject models but not to select the best one.
Another possible quantitative evaluation is to measure the distance between the generated values and the ground truth. This kind of comparison is widely used in the area of predictive process monitoring, in which the Damerau-Levenshtein (DL) distance and the Mean Absolute Error (MAE) have been used as metrics to assess the similarity between two process traces. DL is applied to measure the similarity of two sequences of discrete attributes, such as sequences of activity or resource names. In practice, to compare two sequences, these are encoded as strings, which makes it possible to count how many insertion, deletion, substitution or transposition operations are necessary for the sequences to become identical. The fewer operations are required, the closer the sequences are. For its part, the MAE is a well-known metric that measures the distance between a prediction and an expected value of a continuous variable. In the case of a trace, the attributes to be evaluated are the execution times of the events or the cycle time of the trace.
Although these metrics allow us to establish the precision of a model, they still do not provide a single measure to establish how accurate a simulation model is.
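As a point of reference for the two metrics discussed above, the sketch below implements the DL distance in its common optimal string alignment variant (adjacent transpositions only) together with the MAE. The function names are illustrative.

```python
def damerau_levenshtein(a, b):
    """Number of insertions, deletions, substitutions and adjacent
    transpositions needed to turn sequence a into sequence b."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i  # delete all remaining symbols of a
    for j in range(len(b) + 1):
        d[0][j] = j  # insert all remaining symbols of b
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # match or substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]


def mean_absolute_error(predicted, expected):
    """Mean absolute difference between paired continuous values."""
    return sum(abs(p - e) for p, e in zip(predicted, expected)) / len(expected)
```

The first function compares the control-flow perspective only, and the second the temporal perspective only, which is why neither yields a single accuracy figure for a trace.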
In addition to these metrics, we adapted the model proposed in [21,22] for measuring timed strings to the business process domain. In this article, we present an extension of string distance metrics for sequences composed of symbols available solely in defined time intervals. In order to define our measure, we must first introduce some definitions.
Having A as the set of all possible process activities and R as the set of all possible process roles, L is the finite alphabet of all possible activity-role tuples in the process.
Similarly, we define an event as e = (l, p, w), where l is a symbol taken from the alphabet L, p is the processing time, and w is the waiting time of the event; both are non-negative real numbers (p, w ∈ ℝ, p, w ≥ 0). In this context, E is the set of all events with their respective times.
A trace is a non-empty sequence of events σ = ⟨e_1, e_2, ..., e_n⟩, with e_i = (l_i, p_i, w_i) ∈ E. The set of all process traces is S. An event log L is a set of completed traces from S, and K is the number of traces in the event log.
For the metric definition, we use the Damerau-Levenshtein (DL) algorithm as a base. Suppose we have two traces σ ∈ S and σ′ ∈ S, for which we denote by d(i, j) the cost of the least-cost sequence of edit operations that transforms the prefix σ_{0:i} into the prefix σ′_{0:j}, with 0 ≤ i ≤ |σ| and 0 ≤ j ≤ |σ′|. Then we define the recursion function between two traces as the minimum cost over the following cases, each of which corresponds to one recursive call:
• d(i − 1, j) + 1 corresponds to an event deletion
• d(i, j − 1) + 1 corresponds to an event insertion
• d(i − 1, j − 1) + c(σ_i, σ′_j) corresponds to an event match or mismatch
• d(i − 2, j − 2) + [...] corresponds to an event transposition
As can be seen, the main variation with respect to the DL algorithm lies in the function c that calculates the cost of an operation, which is typically 1 or 0. In this model, assuming that there are two events e = (l, p, w) and e′ = (l′, p′, w′), the function that calculates the cost is defined as: [...] where |p − p′| is the absolute error of the normalized processing time and |w − w′| is the absolute error of the normalized waiting time. The coefficient β1 represents the significance of the processing time in the total duration of the event in the ground truth. With this modification, the DL algorithm includes a penalty related to the differences in processing and waiting times, thus providing a single measure of accuracy. In this paper, we call this measure Timed String Distance (TSD).
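The following sketch illustrates how a timed variant of the DL recursion might be implemented. It is not the authors' implementation and relies on several assumptions that go beyond the text: events with different labels cost 1; events with equal labels cost β1·|p − p′| + (1 − β1)·|w − w′|; β1 is taken as the share of processing time in the duration of the ground-truth event (0.5 when that duration is zero); and a transposition of two adjacent events with swapped labels costs one unit, as in the standard DL algorithm. Times are assumed to be normalized to [0, 1].

```python
def event_cost(simulated, truth):
    """Cost of aligning a simulated event with a ground-truth event.

    Events are (label, processing_time, waiting_time) tuples with
    normalized times. The weighting scheme is an assumption.
    """
    (label_s, p_s, w_s), (label_t, p_t, w_t) = simulated, truth
    if label_s != label_t:
        return 1.0
    total = p_t + w_t
    beta1 = p_t / total if total > 0 else 0.5
    return beta1 * abs(p_s - p_t) + (1 - beta1) * abs(w_s - w_t)


def timed_string_distance(sim_trace, truth_trace):
    """DL-style distance whose match cost penalizes time differences."""
    n, m = len(sim_trace), len(truth_trace)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)
    for j in range(1, m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1,                                   # deletion
                d[i][j - 1] + 1,                                   # insertion
                d[i - 1][j - 1] + event_cost(sim_trace[i - 1],
                                             truth_trace[j - 1]))  # (mis)match
            if (i > 1 and j > 1
                    and sim_trace[i - 1][0] == truth_trace[j - 2][0]
                    and sim_trace[i - 2][0] == truth_trace[j - 1][0]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)        # transposition
    return d[n][m]
```

With labels given as activity-role tuples, two traces that agree on every label but differ in timing obtain a distance between 0 and their length, whereas a plain DL distance would report 0.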
Hyper-parameter optimization. The accuracy of a simulation model depends to a large extent on the process structure that is used as its basis. However, when using discovery algorithms, it is possible to find multiple versions of the process from the same event log. This is because the algorithms seek a balance between interpretability and generalization. As a consequence, the discovered models frequently do not represent all the behaviors recorded in the event log, which affects the accuracy of the simulation model.
In the case of the Split Miner algorithm, as explained in Subsection 3.1, the η and ε parameters allow different structures of the process to be extracted by filtering the relationships between events. However, determining which structure is the most accurate is a complex process of trial and error. Furthermore, the conformance between the event log and the model used as a base is also affected by the discovered structure, which in turn affects the accuracy of the model.
In a traditional approach, an expert would perform the search manually modifying the values of the parameters based on their expertise and intuition.
However, this is a time-consuming approach that may fail to find the global optimum [23]. With this in mind, we propose to use a Tree-structured Parzen Estimator (TPE) as a hyper-parameter optimizer [24] to find the best settings based on the historical accuracy of the executed models. TPE is a sequential algorithm that, on each trial, defines the next parameter configuration based on historical results and on nested functions that select the parameter values according to a probability distribution and a range defined for each parameter. The objective function seeks to minimize the loss, calculated as the inverse of the TSD measure, and the search space considered is presented in Table 4.
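The structure of such a search loop can be sketched as follows. To keep the example free of external dependencies, plain random search stands in for TPE, which is a different and simpler technique; a TPE implementation could be substituted for the sampling step. The search space shown (ε and η in [0, 1] plus three log repair options) is inferred from the text, the loss is assumed to be one minus a similarity score in [0, 1], and `evaluate_similarity` is a hypothetical callback that would discover, simulate and score a BPS model.

```python
import random

# Log repair options named in the text; the identifiers are illustrative.
REPAIR_METHODS = ["removal", "replacement", "repair"]


def sample_configuration(rng):
    """Draw one configuration from the assumed search space."""
    return {
        "epsilon": rng.uniform(0.0, 1.0),  # parallelism threshold
        "eta": rng.uniform(0.0, 1.0),      # frequency percentile threshold
        "repair": rng.choice(REPAIR_METHODS),
    }


def random_search(evaluate_similarity, max_evals=100, seed=0):
    """Return (best_configuration, best_loss) after max_evals trials."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(max_evals):
        config = sample_configuration(rng)
        loss = 1.0 - evaluate_similarity(config)  # assumed loss definition
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss
```

Unlike random search, TPE uses the losses of earlier trials to bias where the next configuration is sampled, which is what makes it sequential.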

Evaluation
The method described above has been implemented as an open-source tool, namely Simod, which is packaged as a Python Jupyter Notebook (see footnote 7). Simod takes as input an event log in XES format, and produces as output a BPMN model that includes simulation parameters, ready to be simulated using the BIMP simulator [5] (see footnote 8). The source code of Simod can also be configured to produce models for the Scylla simulator [6], but BIMP is used as the default simulator.

Datasets
A pre-requisite to discover a BPS model from an event log is that the events in the log have both start and end timestamps. Unfortunately, this pre-requisite is not fulfilled by publicly available real-life event logs such as those in the 4TU Collection of event logs (see footnote 9). As an alternative, we validate the proposed approach using one synthetic event log and two real-life event logs that satisfy the above requirement: one coming from a Business Process Management System (BPMS) and one coming from a factory production system.
The descriptive statistics of these event logs can be found below, whereas their structure is described in Table 5.
The first event log is a synthetic log, generated from a model not available to the authors, of a purchase-to-pay (P2P) process. This is the same log used as a running example in Section 3. The second log stems from an Academic Credentials Recognition (ACR) process at University of Los Andes in Colombia. The log comes from a deployment of a Business Process Management System (BPMS), specifically Bizagi. The model corresponding to this log was not available to the authors of this article. The third log is that of a manufacturing production (MP) process, exported from an Enterprise Resource Planning (ERP) system [25]. The tasks in this process refer to steps (or "stations") in the manufacturing process.
The P2P log corresponds to the scenario in which the event log is generated from a process model defined in ideal conditions, so a 100% fit between the two is expected, as well as high accuracy in the simulation. The ACR log corresponds to a service process executed on a BPMS. It is a relatively complex process, which delivers a service to hundreds of users and involves over a dozen workers. Finally, the MP event log is the scenario in which the process structure is unknown, and where the behavior of the resources can affect the structure of the process significantly.
Footnote 7: The tool and the event logs used in the evaluation are available at https://github.com/AdaptiveBProcess/Simod
Footnote 8: Available at http://bimp.cs.ut.ee
Footnote 9: https://data.4tu.nl/repository/collection:event_logs_real

Experimental setup
For each event log, we generated 100 BPS models with different combinations of preprocessing parameters. Each combination consisted of a value of the parallelism threshold (ε), a value of the frequency threshold (η), and a log repair method. The Simod hyper-parameter optimizer was used to choose the values in each combination. Each resulting simulation model was executed 5 times, each run generating a new event log of the same size as the original event log. Subsequently, we evaluated the accuracy of the models as the distance between the simulated event log and the ground-truth event log using the TSD. To measure the distance for the entire event log, we paired each generated trace with the most similar trace (w.r.t. the TSD) of the ground-truth log. Once the pairs (generated trace, ground-truth trace) were formed, we calculated the mean TSD between them. Additionally, we used a warm-up and cool-down of 0.2, and the results of each combination were averaged to find their convergence. In total, 100 combinations of hyper-parameters and 500 simulated event logs were evaluated for each input event log.
In the P2P event log, the repair technique always obtained the best accuracy results by a wide margin, and this is constant for values of η greater than 0.14. Likewise, in the ACR event log it is observed how the values of η and ε have [...]
The results obtained with different parameters for model discovery suggest that there is a positive correlation between the frequency threshold (η) and the accuracy of the discovered BPS model. For instance, the accuracy of the models discovered from the ACR event log improves as the value of η increases (cf. Figure 5). The η parameter plays a more significant role than the parallelism threshold (ε), for which there is no clear trend.
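The log-level distance described in the experimental setup can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in Simod: a plain Levenshtein distance over activity labels, normalized by the length of the longer trace, stands in for the TSD (which additionally accounts for time), and a ground-truth trace may be matched by more than one simulated trace, a detail the text does not specify. All function names are hypothetical.

```python
from statistics import mean

def edit_distance(a, b):
    """Plain Levenshtein distance between two activity sequences.

    Stand-in for the TSD: it ignores timestamps entirely.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def normalized_distance(a, b):
    """Edit distance scaled to [0, 1] by the length of the longer trace."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0

def log_distance(simulated, ground_truth):
    """Mean distance between each simulated trace and its closest
    ground-truth trace (assumption: matching with replacement)."""
    return mean(
        min(normalized_distance(s, g) for g in ground_truth)
        for s in simulated
    )

# Toy traces, for illustration only.
ground_truth = [["A", "B", "C"], ["A", "C"]]
simulated = [["A", "B", "C"], ["A", "B"]]
distance = log_distance(simulated, ground_truth)
```

Under these assumptions, a distance of 0 means that every simulated trace has an identical counterpart in the ground-truth log; a similarity score can be obtained as one minus the distance.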

Results
The results of applying the log repair methods show an improvement in the accuracy of the BPS models. This effect can be observed in most cases, independently of the event log. Nevertheless, it is most evident in the P2P event log (the synthetic log). The event log with the lowest accuracy was MP, reaching a maximum of 0.35, in contrast to 0.82 and 0.91 for the other two event logs (see Table 6). This large difference may be related to the structure of the event log, since this is the one with the fewest traces and events available, as well as the largest number of activities per trace due to loops in the process. These characteristics mean that none of the techniques used can improve accuracy significantly, evidencing the need for more data to reduce the error.

Threat to validity
The experimental evaluation is restricted to one synthetic and two real-life event logs. As such, the generalizability of the results is limited: The results might be different for other event logs, and as shown in the evaluation, particularly for event logs for which the automated process discovery technique does not manage to discover an accurate process model.
Each parameter is extracted using one particular algorithm, because our focus was on the automated discovery of simulation models and on improving their precision relative to the process model used as the basis. One possible extension of this tool is to offer multiple extraction options for each parameter.
Due to the sequential nature of the BPS techniques, multitasking, batching, and deliberate delays within a process due to relative priorities (e.g. a process being "low priority") are not taken into account in the proposed approach.
Addressing these problems would require the development of new simulation techniques that are beyond the scope of this work.

Conclusion and Future Work
This paper outlined a method for automated discovery of business process simulation models from event logs and defined a measure for assessing the accuracy of a BPS model relative to an event log. The proposed method takes as input an event log, automatically discovers a process model, aligns the log to the model (and repairs it accordingly), and applies a range of replay and organizational mining techniques to extract all the parameters required for simulation. Once a BPS model is discovered, its accuracy is measured using a timed string-edit distance between the simulation log(s) it generates and the original log. A hyper-parameter optimizer is then used to search through the space of possible configurations so as to maximize the accuracy of the final BPS model.
The proposed method has been implemented as an open-source tool, namely Simod, and evaluated using one synthetic and two real-life event logs from different domains. The evaluation shows that the hyper-parameter optimization method significantly improves the accuracy of the resulting BPS model, relative to an approach where default parameters are used. Also, it was observed that the best configuration found varies from one event log to another, further emphasizing the need for automated hyper-parameter optimization in this setting.
The evaluation reported in this paper is limited in terms of number of datasets due to the difficulty in obtaining access to real-life event logs where every (human) activity has both a start and an end timestamp, which is essential in order to determine the processing times of the activities. A direction for future work is to conduct a more systematic evaluation using a larger set of event logs, so as to identify possible relations between the characteristics of an event log and the associated (optimal) hyper-parameter settings and thus derive guidance for the configuration of BPS models.
Another limitation of the present study is that it relies on a relatively simple approach to business process simulation, in which activity instances (a.k.a. work items) are assigned to resources on a first-in, first-out basis (no notion of prioritization), each resource only performs one work item at a time (no multitasking), and a resource starts a work item assigned to it immediately after the assignment and works on it uninterruptedly until it is completed (no delays due to fatigue effects, no pauses, and no batching). Extending the proposed approach to lift these limitations and studying the effects of these phenomena on the accuracy of BPS models is another avenue for future work.
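The simulation assumptions listed above (first-in, first-out assignment, one work item per resource at a time, immediate and uninterrupted execution) can be made concrete with a minimal sketch. It is an illustration only, not the simulator used in the evaluation; the function name and the representation of work items as (arrival time, processing time) pairs are hypothetical.

```python
import heapq

def simulate_fifo(work_items, n_resources):
    """Schedule work items on a pool of identical resources.

    work_items: list of (arrival_time, processing_time) pairs.
    Returns a list of (start_time, end_time) pairs in arrival order.
    Assumptions: FIFO assignment, no multitasking, no pauses, no batching.
    """
    # Min-heap holding the time at which each resource becomes free.
    free_at = [0.0] * n_resources
    heapq.heapify(free_at)
    schedule = []
    for arrival, duration in sorted(work_items):  # first in, first out
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait only if no resource is free
        end = start + duration               # uninterrupted execution
        heapq.heappush(free_at, end)
        schedule.append((start, end))
    return schedule

# Toy workload, for illustration only.
items = [(0, 5), (1, 2), (2, 1)]
one_resource = simulate_fifo(items, n_resources=1)
two_resources = simulate_fifo(items, n_resources=2)
```

Phenomena such as prioritization, multitasking, or batching would each require replacing one of the commented steps, which is why lifting them calls for a different simulation technique rather than a parameter change.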