Fast and accurate quantitative business process analysis using feature complete queueing models

Quantitative business process analysis is a powerful approach for analyzing timing properties of a business process, such as the expected waiting time of customers or the utilization rate of resources. Multiple techniques are available for quantitative business process analysis, each with its own advantages and disadvantages. This paper presents a novel technique, based on queueing models, that combines the advantages of existing techniques: it leads to accurate analyses, is computationally inexpensive, and is feature complete with respect to its support for basic process modeling constructs. An extensive quantitative evaluation has been performed that compares the presented queueing model to existing queueing models from literature. This evaluation shows that the presented model outperforms existing models by one order of magnitude in accuracy. The resulting queueing model can be used for fast and accurate timing predictions of business process models. These properties are useful in optimization scenarios. © 2021 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
One of the goals of modeling business processes is to facilitate quantitative analysis of their properties. Quantitative analysis techniques can be applied to answer questions such as: what is the waiting time for a case if it would arrive now; how busy are my resources expected to be; and which task is the bottleneck in the process? They perform their calculations on the properties of individual tasks, cases, and roles, including the arrival rate of cases, the distributions of the processing times of tasks, and the number of resources available per role. In this paper a new method for translating business process models to queueing models is introduced to facilitate quantitative analysis of these business process models. The method is based on previous work by the same authors [1], which is extended with a new approximation for the parallel construct.
Fig. 1 shows an example of a business process model in which timing and resource properties are modeled and that can consequently be analyzed quantitatively. The business process is modeled in the Business Process Model and Notation (BPMN). This notation can be used to model the 'flow' of cases through an organization, as tasks are performed on them. Cases arrive at a start event, denoted by a circle with a thin outline. They then follow the flow of the arrows in the system and at each activity, denoted by a rounded rectangle, they are served by a resource of the type associated with the lane that the activity is in. For example, the task 'register application' must be performed by a 'back office' resource. When a case arrives at a choice gateway, denoted by a diamond with an × in it, a choice must be made between alternative paths (in case there are multiple outgoing arrows), or alternative paths are merged again (in case there are multiple incoming arrows). When a case arrives at a parallel gateway, denoted by a diamond with a + in it, multiple paths continue in parallel (in case there are multiple outgoing arrows), or parallel paths are merged again (in case there are multiple incoming arrows). We annotate the BPMN models with the number of resources c per resource type, the arrival rate λ of customer cases, the expected processing times E(P) of activities, and choice probabilities. These parameters are defined precisely in Section 3.
Three types of techniques exist for quantitative analysis of business processes: flow analysis, queueing model analysis, and simulation [1,2]. Flow analysis uses a set of simple equations to model the process, whereas queueing model analysis generates a queueing model to analyze the process. In simulation, a simulation model is generated and run multiple times in a simulation engine to obtain the quantitative results for the business process. Each of these techniques has its advantages and disadvantages, especially regarding the duration of the computations, accuracy, and support for different process modeling constructs and probability distributions. The advantages and disadvantages of flow analysis and simulation are discussed in detail in Section 2.
This paper focuses on queueing model analysis, since it has the benefit of being fast and accurate at the same time, while flow analysis is less accurate and accurate simulation is slow. While the benefits of fast computation times are limited in traditional what-if analysis, they are important when real-time decisions must be made by finding an optimum among a large number of predicted future scenarios. An example of such a decision is deciding if a temp agency should be called now to hire temporary resources to be able to complete the work in progress on time. This combination of prediction and optimization is called prescriptive analytics. The optimal parameters for the process model are set by executing the model with different parameters given a certain optimization function (for example minimizing costs or minimizing throughput time) [3]. Prescriptive analytics has been proposed for different fields, for example to optimize capacity planning [4] and to optimize resource allocation [5].
However, while queueing model analysis has the benefits of both being fast and accurate, it has limited support for typical business process modeling patterns (see for example [6]). Of particular concern is the limited support for parallelism and role-based resource assignment. From a structured literature review three queueing models are obtained that support parallelism [7-9], but these queueing models work with approximations and are consequently less accurate.
Therefore, this paper presents a queueing model for business process model analysis that is feature complete with respect to the basic process modeling patterns, including parallelism and role-based resource assignment. In case of exponential arrivals and processing times the results are even exact. In an extensive evaluation we show that for common business process modeling patterns, our technique increases the accuracy of the calculations by up to one order of magnitude over the currently most promising existing queueing model. This paper extends previous work by the same authors [1] by adding support for parallelism (including role-based resource assignment) and providing an extensive evaluation.
To analyze business process models in terms of a queueing model, we propose a three-step approach. First, compute the arrival rate of each individual task. Second, transform each role in the business process into a queue, in which items can arrive for each of the tasks that are handled by the role, with the corresponding arrival rate. Third, compute the overall resource utilization, processing time, waiting time and sojourn time for each of the queues and then for the business process model as a whole.
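The first step can be illustrated with a minimal sketch in Python. The model representation below (a dictionary mapping each task to its predecessor and the probability of the branch leading into it) and the task names are simplifications invented for this example, not the paper's data structure; it only shows how a case arrival rate propagates through choice gateways.

```python
# Sketch: propagate the case arrival rate through a process model to obtain
# per-task arrival rates (step 1 of the approach). The dict-based model
# representation is a hypothetical simplification for illustration only.

def task_arrival_rates(case_arrival_rate, branches):
    """branches maps task -> (predecessor task or None, probability of the
    branch leading into the task)."""
    rates = {}

    def rate(task):
        if task in rates:
            return rates[task]
        pred, prob = branches[task]
        r = (case_arrival_rate if pred is None else rate(pred)) * prob
        rates[task] = r
        return r

    for task in branches:
        rate(task)
    return rates

# Example resembling Fig. 4: registration sees every case; a 50/50 choice
# gateway then splits cases between a simple and an extended check.
rates = task_arrival_rates(1.0, {
    "register": (None, 1.0),
    "simple_check": ("register", 0.5),
    "extended_check": ("register", 0.5),
})
```

With these numbers, the registration task receives the full arrival rate of 1.0, while each check receives 0.5.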
Against this background, the remainder of this paper is structured as follows. Section 2 discusses related work on quantitative analysis of business process models and the state-of-the-art literature on queueing analysis. Section 3 explains how basic queueing model analysis can be performed on individual tasks (step 1 of the method). Section 4.1 explains how multiple tasks that are assigned to a single role can be transformed into a queue (step 2 of the method), and Section 4.2 explains how multiple resources in a role can be considered. Section 5 presents the extension for the parallel construct. Finally, Section 6 shows how these elements can be combined to transform a complete business process model into a queueing model. Section 7 evaluates the proposed method by comparing its performance to existing queueing models for quantitative analysis. Section 8 presents the conclusions of this research and future work.

Related work
This section discusses the differences between the three techniques for quantitative analysis of business process models and related work on queueing models. An overview of the advantages and disadvantages of all techniques is presented in Table 1; the techniques are compared on their support for probability distributions, their computation time and their support for the main constructs in business process modeling [6,10,11]: sequence, choice, loop and parallelism.

Flow analysis
Flow analysis [12] is a technique which uses the order of activities in a process, the so-called process structure, and a set of mathematical formulas to compute the performance of a process. Since the formulas presented are linear in the number of activities in a process model, the computation time is fast. All main business process modeling constructs are supported, but there is a major drawback. Flow analysis uses the average waiting times and processing times per task to perform the analysis. There is no support for probability distributions, which has a severe impact on the validity of the analysis. If there is a (high) variation in service times, the actual performance of the business process can deviate significantly from the value predicted by flow analysis. For example, if the arrival rate in a process is on average one arrival every time unit, so λ = 1, and the processing time of the only task in the model is on average 0.99 time units, so E(P) = 0.99, flow analysis would conclude that the model performs flawlessly. However, if we simulate this model with a Poisson arrival process and an exponential processing time, we see that the waiting time in the model increases very rapidly towards infinity, as shown in Fig. 2. Therefore, in reality, the process model does not perform flawlessly at all and customers would in practice not be happy with this process. Consequently, flow analysis is the least precise form of quantitative business process analysis.
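The blow-up can be reproduced with a short simulation. The sketch below (our illustration, not the paper's implementation) uses the Lindley recursion W_{n+1} = max(0, W_n + P_n − A_{n+1}) for an M/M/1 queue with λ = 1 and E(P) = 0.99; queueing theory puts the expected waiting time near 98 time units, while flow analysis would predict no structural waiting at all. Note that at ρ = 0.99 the simulated mean converges slowly, so it only roughly approaches the analytic value.

```python
import random

random.seed(42)

lam, ep = 1.0, 0.99          # arrival rate and mean processing time
mu = 1.0 / ep                # processing rate

# Analytical M/M/1 expected waiting time: E(W) = lam / (mu * (mu - lam)),
# which equals 0.99^2 / 0.01 = 98.01 time units here.
ew_analytic = lam / (mu * (mu - lam))

# Lindley recursion: each customer's waiting time from the previous one's.
w, total, n = 0.0, 0.0, 200_000
for _ in range(n):
    a = random.expovariate(lam)   # inter-arrival time to next customer
    p = random.expovariate(mu)    # processing time of current customer
    w = max(0.0, w + p - a)
    total += w
ew_sim = total / n                # large (tens of time units), not ~0
```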

Simulation
Simulation [13] is a technique which creates a simulation model of the existing process model and runs it in a simulation engine. Simulation has extensive support for business process modeling patterns. Therefore, it is the technique with the fewest limitations on analyzing the business process. The downside of simulation is that every simulation requires a warm-up time and a sufficient number of replications [14-17], which leads to long computation times. It must be noted that many business process simulation tools do not include replications and warm-up times, making them much faster, but at a serious penalty to accuracy [18].
If warm-up time is not considered in a simulation, the results of the simulation do not represent the values from reality, because the system is not (yet) fully loaded. Fig. 3 illustrates how the lack of warm-up time could affect the observed expected waiting time E(W). The expected waiting time only becomes stable around 4,000 simulated hours. Furthermore, the number of replications should be sufficient; the gray area in the figure is the confidence interval at 100 replications. If the number of replications is lowered, the gray area would increase, representing an increase in the error of the simulation model, which could lead to incorrect conclusions about the business process [18]. The warm-up time and replications make the simulation model relatively slow compared to flow analysis and queueing models. Decreasing the warm-up time or reducing the number of replications would speed up the simulation, but decrease the accuracy of the results obtained. Therefore, if a large number of alternative settings needs to be explored quickly, simulation is too slow to be a valid option. Also see Section 7 for a comparison of the time performance of simulation and queueing analysis.

Queueing analysis
Similar to flow analysis, queueing analysis is an analytical method to analyze business processes. Therefore it has the benefit of very short computation times. Moreover, queueing analysis does support probability distributions, making it accurate for computing processing times and waiting times. This is a major advantage over flow analysis. In existing literature we identified three existing queueing models [7-9].
The paper of Xie et al. [7] describes a business process as a control structure diagram. The four basic control structures defined are a sequence of activities, a loop, a parallel set of activities and a choice between activities. The control structure diagram can be broken down into these basic control structures. The paper makes two assumptions: first, all process instances arrive according to a Poisson process and the processing time of each task follows an exponential distribution; second, the queue size is infinite and tasks are performed in the order in which they arrive. For each of the different basic control structures a formula is presented to compute the mean cycle time of an activity. In order to compute the overall mean cycle time of the process, each basic control structure is analyzed separately. The mean cycle time of the basic control structure is computed and the structure is replaced by a single activity with that mean cycle time. This process is repeated until there is one queue left, which provides the mean cycle time.
The paper of Sheng et al. [8] describes a business process as a network of M/M/c queues. Using the control structures of each activity in the process, the arrival rate into the queue is computed based on the initial arrival rate in the system. This technique uses the probability density function of the cycle time, waiting time and service time for each activity. In order to combine the probability density functions from different activities, they are aggregated per control structure. Here also the four control structures sequence, loop, parallel and choice are supported. The probability density functions are aggregated following specific formulas per control structure. Combining all control structures, the probability density functions can be computed for the complete workflow. The probability density functions obtained represent the probability that the process takes a certain amount of cycle time, waiting time and processing time.
The paper of Ha et al. [9] describes a process as a set of activities, grouped into control structure blocks, connected by a set of links. A control structure block can represent a sequence, a loop (or so-called repeat block) or a parallel set of activities. Choice is handled by indicating a probability on an arc in the process. The activities are handled in a first-in-first-out manner. The cases are equally distributed over the different resources, based on the number of resources available. The expected cycle time is computed differently per block type, but involves the expected cycle time of the activities; furthermore, the variance of the cycle time is determined per block type. Aggregating over all the different blocks, the average total cycle time can be computed. A comparison of these techniques is presented in Table 2, comparing their support for common business process modeling constructs, including shared resources, general probability distributions, and the main business process patterns.
As shown in Table 2, the queueing models all support the common business process modeling constructs. However, existing queueing models only support the exponential distribution, or even only average service rates in the case of the work of Ha et al. [9]. The queueing model proposed in this paper supports general distributions for all constructs except the parallel construct, but in the quantitative evaluation section it is shown that the approximation is quite robust for general distributions. The paper of Sheng et al. [8] assumes dedicated resources per task, whereas the papers of Xie et al. [7] and Ha et al. [9] have separate queues per resource, where work items are allocated to specific resources. These models therefore support resource sharing only partially, since resources can work on the same activity but do not share the same work queues. An extensive quantitative analysis is provided in Section 7.
Another way of analyzing business processes via queueing analysis is by using queue mining [19,20]. This technique uses an event log to mine a queueing network. The work on queue mining differs in focus from the work described in this manuscript: it focuses on mining a queueing network from an event log, while the work described in this manuscript focuses on analyzing a queueing network after it has been mined. Because of this focus, the work on queue mining also requires an event log to work, while the work described in this manuscript works based on a business process model that can be obtained through (queue) mining but also by other means. Consequently, queue mining and the analysis techniques described in this paper complement each other to facilitate quantitative analysis based on real-world data (event logs).
Congestion graphs are related to queue mining and can also be used for analyzing business processes [21]. Like queue mining, congestion graphs use an event log to generate a process model, which indicates the congestion in the system. If there is more congestion at a certain activity, there are more cases that must be served in that queue, so the waiting time increases. This technique uses the congestion properties to predict the waiting time in a system.
Since queue mining and congestion graphs differ from the other techniques for quantitative business process analysis presented here, in that they require an event log to work, they are not taken into account in the evaluation section of this paper.

Queueing theory
As presented in the previous section, queueing analysis is an advanced analytics tool for analyzing business processes. It is performed by transforming a business process to a network of queues [22] and subsequently using queueing theory to analyze that network. The state of the art in queueing theory regarding networks of queues served by multiple resources are generalized Jackson networks [23,24]. A generalized Jackson network is a system of queues which all have independent Poisson arrival processes and general processing times, where each queue is served by a single resource [24]. To cover the situation in which there are multiple resources for a single queue, the processing time is divided by the number of resources available, so that the queue can again be treated as having a single resource. These Jackson networks are used in operations research to predict job shop scheduling in a factory, see e.g. [25], which can be considered similar to analyzing resource occupancy in a business process model. In job shop scheduling, each station where jobs can be processed is modeled as a separate queue, for which the order in which the jobs need to be scheduled is determined. Furthermore, these networks are used in telecommunications and computer systems [26]. In this paper we will transform a BPMN process into a queueing network, which can be used to analyze the process performance.
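As a small illustration of the Jackson-network idea (a sketch with invented rates, not the paper's method): in an open Jackson network each station can be analyzed as an independent M/M/1 queue, so a tandem of stations fed by the same Poisson stream can be evaluated with the per-station formula E(S_i) = 1/(µ_i − λ) and the end-to-end sojourn time is the sum over stations.

```python
# Sketch: expected end-to-end sojourn time of a tandem Jackson network.
# Each station behaves as an independent M/M/1 queue, so its expected
# sojourn time is 1/(mu - lam); the total is the sum over all stations.

def jackson_tandem_sojourn(lam, service_rates):
    total = 0.0
    for mu in service_rates:
        if lam >= mu:
            raise ValueError("unstable station: arrival rate >= service rate")
        total += 1.0 / (mu - lam)
    return total

# Three stations in sequence, all fed by the same Poisson arrival stream:
# 1/(2-1) + 1/(3-1) + 1/(1.5-1) = 1 + 0.5 + 2 = 3.5 time units.
e_s = jackson_tandem_sojourn(lam=1.0, service_rates=[2.0, 3.0, 1.5])
```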
For the support of parallelism, fork-join networks are commonly used in queueing theory [27-29]. A fork-join network consists of a set of parallel queues, each with their own servers. An arrival at the so-called fork initiates all queues simultaneously, and at the join all paths need to synchronize [27]. Approximations for computing a fork-join network under a first-come-first-served policy are described by Nguyen [30,31]. A downside of these fork-join networks is that they require dedicated servers for each parallel task. Since business processes very often have shared resources in parallel tasks, a new approximation for parallel tasks with shared resources is presented in this paper.

Basic queueing models
A simple way to translate business processes to queueing models is to replace each activity in a business process by a queue [2]. Each queue has an arrival rate and a processing rate. The arrival rate is the rate at which cases arrive at the queue. The processing rate is how fast a resource can process these cases. According to Kendall et al. [32], a queue can be denoted by its arrival process (A), processing time distribution (S) and number of servers (c) as A/S/c. For each probability distribution a different letter is used to indicate the type of queue: M stands for an exponential distribution, D for a deterministic value and G for a general distribution. Furthermore, these queues have no limit in terms of the maximum number of cases in a queue. The service discipline, i.e. the order in which the queued cases are served, is for these queues First In, First Out (FIFO).
The assumption of a FIFO service discipline may not always hold in practice [33]. The technique presented can still be used to transform BPMN models into queueing networks and analyze their performance, approximating the actual queueing discipline with a FIFO discipline. However, the results of the analysis may then differ from the actual performance of the process. The extent to which this approximation impairs the conclusions depends on various model properties. A queue which has exponential inter-arrival times and an exponential processing time with only one server is denoted as an M/M/1 queue. For this type of queue, mathematical formulas for the expected waiting time and expected processing time are available.
According to Kleinrock [34], the expected processing time and expected waiting time of an M/M/1 queue can be computed when the arrival rate (λ) and the processing rate (µ) are known. For the expected processing time of each case, denoted as E(P), it holds that E(P) = 1/µ. Based on this expected processing time the expected waiting time, denoted as E(W), can be computed. By obtaining both the expected processing time and the expected waiting time, also the expected sojourn time, denoted as E(S), can be computed. This is the total time a case spends in the queue, combining the time it is waiting and the time it is being processed. Furthermore, the fraction of the time the resource serving the queue is occupied, the utilization rate denoted by ρ, can be computed as shown in Definition 1.

Definition 1. Utilization rate of an M/M/1 queue

ρ = λ/µ = λ · E(P)
By using Little's Law [35] the expected number of cases in the queue can be computed. Little's Law describes the expected number of cases in the queue, denoted as E(L), as the time a case spends in the queue multiplied by the arrival rate. Furthermore, the expected number of cases waiting in the queue, denoted as E(L_q), is computed as the expected waiting time multiplied by the arrival rate. These formulas are shown in Definition 2.
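The M/M/1 formulas and Little's Law can be combined in a few lines. The sketch below is our illustration (function and variable names are ours); it computes ρ, E(P), E(W), E(S), E(L) and E(L_q) from λ and µ.

```python
# Sketch: basic M/M/1 performance measures plus Little's Law.

def mm1_metrics(lam, mu):
    if lam >= mu:
        raise ValueError("unstable queue: lam must be < mu")
    rho = lam / mu                 # utilization rate
    ep = 1.0 / mu                  # expected processing time E(P)
    ew = rho / (mu - lam)          # expected waiting time E(W)
    es = ep + ew                   # expected sojourn time E(S)
    el = lam * es                  # Little's Law: E(L) = lam * E(S)
    elq = lam * ew                 # E(L_q) = lam * E(W)
    return {"rho": rho, "E(P)": ep, "E(W)": ew, "E(S)": es,
            "E(L)": el, "E(Lq)": elq}

# Example: one arrival per time unit, service twice as fast:
# rho = 0.5, E(W) = 0.5, E(S) = 1.0, E(L) = 1.0, E(Lq) = 0.5.
m = mm1_metrics(lam=1.0, mu=2.0)
```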

Definition 2. Little's Law

E(L) = λ · E(S)
E(L_q) = λ · E(W)
As an example of calculating the waiting time of a process, consider the process model in Fig. 4, which extends the running example in Dijkman et al. [1]. Fig. 4 describes an insurance process, where applications arrive with λ = 1. Each application is first registered, with an expected processing time of E(P_A) = 0.1. Fifty percent of the cases are determined to be simple and are handled by the 'Back Office' with a simple check (E(P_S) = 0.2). The other fifty percent are complex cases and are therefore handled by the 'Front Office', where an extended check is performed on the application (E(P_E) = 0.4). After the checks, the result is shared with the customer who filed the application (E(P_D) = 0.4). In parallel to that, the result is logged in the system (E(P_L) = 0.1). The 'Back Office' has one employee assigned to it, the 'Front Office' has two employees assigned to it.
In order to transform a process, as shown in Fig. 4, into a queueing model it can be assumed that each task is transformed into an M/M/1 queue with a dedicated resource. For the first task in the process it then holds that ρ = 2 · 25/60 = 5/6.

Advanced queueing models
In order to represent business processes, two extensions to standard Jackson networks are needed: queueing models with different types of cases and queueing models with multiple resources per role. In this section both extensions are discussed.

Different types of cases in a queue
In a business process it is common to have different types of cases. For example, an insurance company can have three types of insurance claims running through the same process (e.g. fire, car and travel insurance claims). Therefore, to compute the performance of a process, these different types of cases should be accommodated. It is assumed that the type of each case is known before the process starts. Furthermore, when translating a business process into a queueing network, the roles in a business process can be considered a queue and each individual activity assigned to that role a different case type entering that queue [1]. This allows resources to be shared between tasks; if all resources were dedicated to a specific task, there would be no different case types in a queue, and consequently no sharing of resources.
Fig. 5 shows a queue with two types of cases [1]. Both types of cases, respectively A and B, have their own processing time, E(P_A) and E(P_B), and arrival rate, λ_A and λ_B. In order to compute the expected waiting time and expected sojourn time, an adaptation needs to be made to the queueing formulas as presented in Section 3. Since the processing time distribution is no longer exponential, because the two case types have different processing times, the queue type is changed to one with a general processing time distribution. The arrival rate of cases in the queue is equal to the sum of both arrival rates. For the expected processing time E(P) and the second moment of the processing time E(P²) a weighted approach is taken. Let λ be the total arrival rate, with λ_A and λ_B the arrival rates of cases of type A and B respectively. According to Kleinrock [37], the fraction of cases of type A is equal to λ_A/λ, and the fraction of cases of type B is equal to λ_B/λ. Therefore the contribution of type A to the expected processing time is E(P_A) multiplied by λ_A/λ. Generalizing, the set T holds all case types and case types are denoted by t, where t ∈ T. The arrival rate for case type t is λ_t, the expected processing time is E(P_t) and the expected squared processing time is E(P_t²). The overall arrival rate, expected processing time and squared processing time are computed as defined in Definition 3 [37].

Definition 3. Arrival rate and expected processing time in a multi-case type M/G/1 queue

λ = Σ_{t∈T} λ_t
E(P) = Σ_{t∈T} (λ_t/λ) · E(P_t)
E(P²) = Σ_{t∈T} (λ_t/λ) · E(P_t²)

Using the formulas from Definition 3 the expected waiting time E(W) can be computed. In order to make this computation, an assumption needs to be made on the joint inter-arrival time. The joint arrival process is assumed to be a Poisson process, so the joint inter-arrival times are exponentially distributed. For multiple case types a Poisson process is a good representation when multiple arrival processes are joined together [38]. Since it cannot be assumed that the processing time distribution is exponential, as assumed in Section 3, new formulas need to be applied [1]. The new formula for the expected waiting time E(W) uses the expected remaining processing time E(R) of the case in service. By using the joint expected processing time E(P), the joint expected squared processing time E(P²), and ρ from Definition 1, the following formulas are defined:

Definition 4. Expected waiting time in a multi-case type M/G/1 queue

E(R) = E(P²) / (2 · E(P))
E(W) = ρ · E(R) / (1 − ρ)

The formula for E(W) in Definition 4 is based on the assumption that a case arrives in the queue and sees an average of E(L_q) (see Definition 2) cases in front of it, based on the PASTA property [39]. Each case in front of the arriving case is processed with the expected processing time E(P). Furthermore, the server might already be working on a case, with probability ρ; that case only needs to complete its expected remaining processing time E(R). This gives E(W) = E(L_q) · E(P) + ρ · E(R). From Little's Law (see Definition 2), E(L_q) can be expressed as λ · E(W). By substitution, and using λ · E(P) = ρ, we obtain the formula for E(W) as presented in Definition 4. For details on the proof of both formulas a reference to general queueing theory is made [34,36].
When applying these formulas to the example process as displayed in Fig. 4, the following analysis can be made. First the role, in this example the 'Back Office' role, is transformed into a queue. Both activities associated with the role are then transformed into case types for this queue: A 'Register application' and S 'Perform simple check'. For these case types, let λ_A = 5/4 and λ_S = 5/8, and let the processing times be exponentially distributed with E(P_A) = 1/10 and E(P_S) = 2/10, such that E(P_A²) = 1/50 and E(P_S²) = 2/25. Using these values, E(P) and E(P²) can be computed:

λ = λ_A + λ_S = 15/8
E(P) = (λ_A/λ) · E(P_A) + (λ_S/λ) · E(P_S) = (2/3) · (1/10) + (1/3) · (2/10) = 2/15
E(P²) = (2/3) · (1/50) + (1/3) · (2/25) = 1/25

Using the values obtained for E(P) and E(P²), the utilization rate, expected waiting time and expected sojourn time can be computed:

ρ = λ · E(P) = (15/8) · (2/15) = 1/4
E(R) = E(P²) / (2 · E(P)) = (1/25) / (4/15) = 3/20
E(W) = ρ · E(R) / (1 − ρ) = (1/4) · (3/20) / (3/4) = 1/20
E(S) = E(W) + E(P) = 1/20 + 2/15 = 11/60
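This worked example can be checked numerically. The sketch below is our illustration; it implements the standard multi-class M/G/1 relations (weighted first and second moments of the processing time, E(R) = E(P²)/(2·E(P)) and E(W) = ρ·E(R)/(1 − ρ)) for the 'Back Office' role.

```python
# Sketch: multi-case-type M/G/1 queue for the 'Back Office' role.
# Each case type t has arrival rate lam_t; processing times are exponential,
# so the second moment is E(P_t^2) = 2 * E(P_t)^2.

def mg1_multiclass(classes):
    """classes: list of (lam_t, E(P_t), E(P_t^2)) tuples."""
    lam = sum(l for l, _, _ in classes)                # joint arrival rate
    ep = sum(l / lam * p for l, p, _ in classes)       # weighted E(P)
    ep2 = sum(l / lam * p2 for l, _, p2 in classes)    # weighted E(P^2)
    rho = lam * ep                                     # utilization rate
    er = ep2 / (2 * ep)             # expected remaining processing time
    ew = rho * er / (1 - rho)       # M/G/1 expected waiting time
    return lam, ep, ep2, rho, ew, ew + ep

# Case types A ('Register application') and S ('Perform simple check').
lam, ep, ep2, rho, ew, es = mg1_multiclass([
    (5 / 4, 1 / 10, 2 * (1 / 10) ** 2),   # type A
    (5 / 8, 2 / 10, 2 * (2 / 10) ** 2),   # type S
])
```

With these inputs the sketch reproduces ρ = 1/4, E(W) = 1/20 and E(S) = 11/60.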

Multiple server queues
Until this point the assumption has been that there is only a single resource available per queue. Commonly, in a business process there are multiple resources available to serve the cases in a queue. In order to account for multiple resources serving a queue, the formulas for ρ and E(W) need to be adjusted to those of an M/G/c queue, where c is the number of resources serving the queue, as defined according to Buzacott [40].

Definition 5. Utilization rate and expected waiting time in an M/G/c queue

ρ = λ · E(P) / c
E(W) = Π_W · E(R) / (c · (1 − ρ))

where Π_W is the probability that an arriving case finds all c resources busy (the Erlang C delay probability):

Π_W = ((cρ)^c / (c! · (1 − ρ))) / (Σ_{n=0}^{c−1} (cρ)^n / n! + (cρ)^c / (c! · (1 − ρ)))
The formula for ρ changes since the work is now done by c resources instead of only one; therefore the formula for ρ from Definition 1 is divided by c. The formula for E(W) is adjusted for the fact that cases in the queue are now processed c times as fast. The formula is exact when the processing times are exponentially distributed and it is an excellent approximation for general processing times [40]. For an extensive proof of these formulas see [38]. Suppose that the 'Back Office', from the example in Fig. 4, now has three resources instead of only a single resource. Then the value of c increases to 3, which reduces the utilization rate to ρ = 1/12. Also Π_W can be computed: with cρ = 1/4, Π_W ≈ 0.0022, so that E(W) = Π_W · E(R) / (c · (1 − ρ)) ≈ 0.0022 · (3/20) / (3 · (11/12)) ≈ 0.00012.
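The multi-server computation can be sketched as follows (our illustration; it uses the Erlang C delay probability and the approximation E(W) ≈ Π_W · E(R)/(c·(1 − ρ)), which reduces to the exact M/G/1 result for c = 1 and is exact for exponential processing times).

```python
from math import factorial

def erlang_c(c, a):
    """Delay probability Pi_W for c servers with offered load a = lam * E(P)."""
    rho = a / c
    tail = (a ** c) / (factorial(c) * (1 - rho))
    base = sum(a ** n / factorial(n) for n in range(c))
    return tail / (base + tail)

def mgc_waiting_time(lam, ep, ep2, c):
    """Approximate E(W) for an M/G/c queue (exact for exponential service)."""
    a = lam * ep
    rho = a / c
    er = ep2 / (2 * ep)                     # expected remaining processing time
    return erlang_c(c, a) * er / (c * (1 - rho)), rho

# 'Back Office' example with c = 3 resources: rho = 1/12, E(W) ~ 0.00012.
ew, rho = mgc_waiting_time(lam=15 / 8, ep=2 / 15, ep2=1 / 25, c=3)
```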

Parallel tasks
The queueing models presented in the previous section can be used to transform business process models with sequence, choice and loop constructs. In order to represent the parallel construct in a single queue, a new approximation needs to be made. This section first presents exact formulas for E(P) and E(W) for a parallel construct with two tasks, a and b, exponential processing times, a Poisson arrival process, and two shared servers, as illustrated in Fig. 6. It then shows how these formulas can be extended to the general case.
For ease of reasoning and readability, the formulas for the parallel construct with two tasks are developed in two steps. First, the formulas are presented under the condition that the number of jobs waiting when a new job arrives is fixed. Second, the formulas for the unconditional case are presented, i.e. the case in which an arriving job sees an arbitrary number of other jobs waiting in front of it.
Against this background, the remainder of this section is structured as follows. Section 5.1 presents the conditional formulas for the expected processing time and waiting time, E(P) and E(W), for the case with two parallel tasks and two shared resources. Section 5.2 presents the unconditional formulas for this case. In general, parallel constructs with more than two resources or more than two tasks also exist, and Section 5.3 shows how the formulas for E(P) and E(W) can be adapted for this. Finally, if there are dedicated resources within a parallel construct, the formulas for the exact solution with shared resources need to be adapted as well. This adaptation is described in Section 5.4.

Conditional formulas
To compute the expected processing time and expected waiting time, E(P) and E(W), of the parallel construct, a Markov chain is constructed for the quasi-birth-death process (QBD) [41] of the parallel construct. For the parallel construct with two tasks, a and b, jobs arrive according to a Poisson process with rate λ. Each job is a pair (b, a), with processing times that are exponentially distributed with rates µ_a and µ_b. It is assumed that a is always the first task of the job to be served. Typically, the first task to be served would be the task with the longest E(P), because that leaves the most flexibility for the parallel construct. Furthermore, there are two servers, which can both process tasks a and b.
The Markov chain is illustrated in Fig. 7. A state in the Markov chain is denoted as (n, i), where n represents the number of pairs (b, a) in the queue, and i denotes the state of the servers.
For n > 0, the state i can be one of:
1. Both servers serve type a (and hence the next first in the queue is b).
2. There is one a and one b in service and the first in the queue is a.
3. There is one a and one b in service and the first in the queue is b.
4. Both servers serve type b (and hence the next first in the queue is a).
Consider, for example, the state (2, 1), which means that there are two pairs (b, a) in the queue and both servers are currently busy performing a task a, meaning that there must be an (additional) item b in the queue, which is the next one to be served. From this state, a new pair (b, a) can arrive, with rate λ, leading to the state (3, 1). Also, an a can be completed, and since both servers are currently performing a task a, this happens at twice the processing rate µ_a of that task. Besides the states with n > 0, there are seven 'boundary states' with n = 0, which represent the situation in which there are no complete pairs in the queue:
• the state (0, 0) represents the empty system;
• the state (0, a) represents an empty queue, one active server that serves a;
• the state (0, b) represents an empty queue, one active server that serves b;
• the state (0, 1) represents a queue with a single b, both servers serving a;
• the state (0, 2) represents an empty queue, one server that serves a, and one that serves b;
• the state (0, 3) represents a queue with a single b, one server that serves a, and one that serves b;
• the state (0, 4) represents an empty queue and both servers serving b.
For this Markov chain, formulas can be derived for the waiting time W_n and for the processing time P_n, conditional on the number of jobs (b, a) in the queue. A recursive formula is derived for W_{n,i}, the waiting time of a job that arrives and finds the system in state (n, i), together with its boundary conditions. In order to obtain the values of the waiting time in each state, W_n, a computation is made in which the matrix A is the transition matrix between the different phases of the waiting time W_n. For the processing time, the system can be in the same four states as identified for the waiting time. These are expressed through P_{n,i}, the probability that a batch will start service while the other server is serving an a job, given that the state upon arrival is (n, i). This results in an expression for P_n, in which P_0 covers the three possible boundary states the system can be in: the first is the empty system, where there are no jobs in progress; the second is the case where only one server is busy processing an a task; and the third is the case where only one server is busy processing a b task.

Unconditional formulas
In order to obtain the unconditional expressions for the waiting time and the processing time of the parallel construct, the QBD that is represented by the Markov chain from Fig. 7 should be solved to find the steady state. To solve the QBD, three matrices are defined that make up the QBD process [41]: B is the matrix of a state transition backwards in the Markov chain from Fig. 7 (i.e. a transition where a full job is processed and leaves the queue), F is the matrix of a state transition forwards (i.e. a transition where a job arrives in the queue), and L is the matrix that contains the transition rates of events that do not change the number of full jobs in the queue (its diagonal makes each row sum to 0). Using an iterative procedure, these matrices are used to obtain the R matrix, which solves the matrix-quadratic equation F + R · L + R² · B = 0 with all elements being non-negative and the spectral radius smaller than 1. In order to obtain the steady state π_n = (π_{n,1}, π_{n,2}, π_{n,3}, π_{n,4}), the relation between the R matrix and the steady state is used: π_n = π_0 · Rⁿ. The only unknown variable in this equation is π_0, which can be computed by solving a system of linear equations. Using the solution π_0 obtained from this system, the value of the expected waiting time until the batch is served can be computed. For the expected processing time there are three scenarios: the first is when the batch arrives in an empty system and starts service immediately; the second is when the batch waits and starts service while the other server is serving an a job, which has probability θ; and the third is when the batch waits and starts service while the other server is serving a b job, with probability 1 − θ. In order to obtain the expected service time, the boundary conditions ((0, 0), (0, a), (0, b)) are multiplied by the steady state probabilities for the first scenario.
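The successive-substitution scheme for R can be sketched as follows. As a minimal, hypothetical illustration, the QBD is reduced to its single-phase special case (an M/M/1 queue), where F, L and B are scalars and the equation F + R · L + R² · B = 0 has R = λ/µ as its minimal solution; the same iteration applies to the 4 × 4 matrices of the parallel construct, with the division replaced by a multiplication with L⁻¹.

```python
# Successive substitution R_{k+1} = -(F + R_k^2 * B) * L^{-1}, shown on the
# scalar (single-phase) QBD that corresponds to an M/M/1 queue.
lam, mu = 1.0, 2.0
F, B, L = lam, mu, -(lam + mu)   # forward, backward, and local rates

R = 0.0                          # start the iteration from the zero 'matrix'
for _ in range(200):             # converges to the minimal solution
    R = -(F + R * R * B) / L

# R now approximates lam/mu, and the steady-state probabilities follow
# the matrix-geometric form pi_n = pi_0 * R^n.
```

For λ = 1 and µ = 2 the iteration converges to the minimal root R = 0.5 of the quadratic, rather than the second root 1, which is why the zero starting point is used.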
In order to construct the final formula for the expected processing time E(P), the three scenarios described above need to be taken into account, denoted by E(P_{(0,0)}), E(P_{(0,a)}) and E(P_{(0,b)}) (Eqs. (23)-(25)), where S_a and S_b are the exponential service time distributions of a and b with rates µ_a and µ_b.

Generalization of the parallel construct
The general case of our model has k identical servers, where the parallel construct has m tasks, and the processing time of task j follows an exponential distribution with rate µ_j. For each server there are m options for the type it is serving, plus there are m options for the type of task that is first in line. Therefore the number of phases in the Markov model is equal to m^(k+1), but some phases are transient. This means that there are no state transitions from reachable states that lead to these phases; for example, if there is a task a in front of the queue, the phases where task b is in front of the queue are unreachable. For the QBD this is not a major issue, but the R matrix will include many entries with a value of zero. These transient phases can be pruned from the model to simplify computations. Adding more tasks or more resources only increases the state space; the method of computing E(W) and E(P) remains the same.
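The size of the phase space can be illustrated with a short sketch; the tuple encoding of a phase (one entry per server plus the head of the queue) is an assumption made for illustration only:

```python
from itertools import product

# Phases of the general parallel construct: each of the k servers serves one
# of the m task types, and the head of the queue is one of the m types as
# well, giving m**(k+1) phases before transient ones are pruned.
m, k = 2, 2                                       # two task types, two servers
task_types = ["a", "b"]
phases = list(product(task_types, repeat=k + 1))  # (server_1, server_2, head)
```

For m = k = 2 this enumerates 8 phases, of which the paper's Markov chain keeps only the 4 reachable ones after pruning.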

Dedicated resources
In case there are dedicated resources for both tasks in a parallel construct, a couple of changes have to be made to model this situation. The stability condition changes, and therefore the model can handle fewer cases than a system where resources are shared between tasks. The situation with dedicated resources can be modeled as a two-dimensional Markov chain with states being the pairs (n_a, n_b), respectively the number of tasks of type a and b. The state space is unbounded in two variables, therefore a numerical approximation is used that truncates the system at a maximum number of items K, after which cases are rejected from the system. In order to compute the sojourn time of the parallel construct with dedicated resources, the non-zero transition rates between states and the steady state are defined. Since the two jobs in a parallel process need to wait for each other, the expected sojourn time is equal to the maximum of the sojourn times of the two tasks. Conditional on the state, both sojourn times are Erlang distributed with parameters (n_a + 1, µ_a) and (n_b + 1, µ_b), and the expectation of the maximum follows from the fact that min{S_a, S_b} + max{S_a, S_b} = S_a + S_b. The unconditional expected sojourn time is the weighted average of these conditional expectations, where the weights are the steady state probabilities of the Markov chain. The steady state probability of observing a state (n_a, n_b) is derived using the equations π · P = 0.
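For the boundary case in which both dedicated servers are idle, the Erlang distributions reduce to exponentials and the expected maximum can be written out directly, using min{S_a, S_b} + max{S_a, S_b} = S_a + S_b and the fact that the minimum of two independent exponentials is itself exponential with rate µ_a + µ_b. A small sketch:

```python
# E[max(Sa, Sb)] for independent exponential processing times with rates
# mu_a and mu_b: E[max] = E[Sa] + E[Sb] - E[min], where
# min(Sa, Sb) ~ Exp(mu_a + mu_b).
def expected_max_exponential(mu_a: float, mu_b: float) -> float:
    return 1.0 / mu_a + 1.0 / mu_b - 1.0 / (mu_a + mu_b)
```

For µ_a = µ_b = 1 this gives 1 + 1 − 1/2 = 1.5, i.e. waiting for both parallel tasks takes 50% longer than a single task on average.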

From BPMN model to queueing model
By using the results from the previous section, parallel constructs can now also be analyzed in business processes. In order to analyze a business process in the form of a BPMN model using queueing models, a conversion between the two is necessary. The procedure to convert a BPMN model into a queueing model is the following [1]:
1. Calculate the arrival rate for each task in the BPMN model.
2. Create a queue for each role in the BPMN model and calculate the arrival rate, processing time, utilization rate, waiting time, and sojourn time for that queue.
3. Calculate the overall processing time, waiting time, and sojourn time.
This procedure to convert BPMN models into queueing models and compute the overall processing time, waiting time and sojourn time has been implemented in a simulator [42].

Calculate arrival rate for each task
First, the arrival rate of each task in the BPMN model should be computed. The arrival rate of the start event of a business process should be specified in the process. If there are no loops in a business process, the computation of the arrival rates of the other tasks is fairly straightforward. The total arrival rate at a task is the sum of all incoming arrival rates. Furthermore, the total incoming flow of a task is equal to the total outgoing flow of that task. For gateways in the process the following applies:
• For a choice in a process, the fraction of the arrival rate, indicated by the percentage on the arc given in the process model, is added to the arrival rate of the activity that the branch leads to.
• For a join of a choice in the process all incoming arrival rates are added together.
• For a parallel split all activities in the parallel split have the same arrival rate.
• For a parallel join the arrival rate is the same as one of the incoming arcs.
If we apply this to the example in Fig. 4, all cases flowing into task A ('Register application'), denoted by λ_A, are split between task S ('Perform simple check') and task E ('Perform extended check'), both with a fraction of 50%. Consequently, the arrival rate for task S is equal to λ_S = 0.5 · λ_A and the arrival rate for task E is equal to λ_E = 0.5 · λ_A. The process then splits into the two parallel tasks D ('Discuss result') and L ('Log result'). Since these two tasks are in parallel, both have the same arrival rate, which is the flow from task S and task E combined, resulting in λ_D = λ_S + λ_E and λ_L = λ_S + λ_E. When there are loops in the process, an arrival rate in the process may partially depend on itself. To illustrate this, the example from Fig. 4 is extended with a loop as shown in Fig. 8.
In the example shown in Fig. 8 a loop is added: after the check, the application of the customer might turn out to be incorrectly registered, which happens in 20% of the cases. These cases are redirected to the start of the process to be registered again. Since there is now a loop in the process, the arrival rate of task A depends on the arrival rates of tasks S and E. Since these two already depended on the arrival rate of task A, the arrival rate at task A is indirectly dependent on itself. In order to compute the arrival rate at task A, a system of linear equations is solved, which is defined as follows.
Definition 6 (Arrival Rate into Task). Let λ be the arrival rate into the BPMN process that consists of a set of tasks T. Furthermore, let for each t, t′ ∈ T: p_{t,t′} be the probability that after executing task t for a case, task t′ will be executed for that case, and p_{⊙,t} be the probability that t is executed after the start event, denoted by ⊙ (⊙ ∉ T). For each task t ∈ T, the arrival rate λ_t can now be defined as [1]:

λ_t = λ · p_{⊙,t} + Σ_{t′∈T} p_{t′,t} · λ_{t′}

Rewriting the formulas by transferring the sum to the left-hand side of the equation, a system of linear equations is obtained, which can be solved to a unique solution, since every case will at some point leave the business process. When applying this to the running example in Fig. 8 (with λ = 1), solving the system of linear equations yields λ_A = 5/4, λ_S = 5/8, λ_E = 5/8, λ_D = 1 and λ_L = 1.
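The system of linear equations for the running example can also be solved numerically by fixed-point iteration, which converges because every case eventually leaves the process. The sketch below hard-codes the routing of Fig. 8 (a 50/50 choice between S and E, a 20% loop back to A); the single-letter task names follow the abbreviations used above:

```python
# Arrival rates for the looped example of Fig. 8, computed by iterating
# lambda_t = lambda * p_start,t + sum_t' lambda_t' * p_t',t to a fixed point.
lam = 1.0                                    # external arrival rate
rates = {t: 0.0 for t in "ASEDL"}
for _ in range(200):                         # converges geometrically
    rates["A"] = lam + 0.2 * (rates["S"] + rates["E"])  # 20% loop back to A
    rates["S"] = 0.5 * rates["A"]
    rates["E"] = 0.5 * rates["A"]
    rates["D"] = 0.8 * rates["A"]            # the 80% that leave the loop
    rates["L"] = rates["D"]                  # parallel branch, same rate
```

The iteration converges to λ_A = 5/4, λ_S = λ_E = 5/8 and λ_D = λ_L = 1, matching the solution of the linear system.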

Create a queue for each role
The logical translation of a business process model to a queueing model is to create a queue for each role, as already discussed in Section 4.1. Therefore, in the second step, for each role in the business process a queue is created, where each task associated with that role is a case type and the number of servers is the number of resources in that role. Based on Dijkman et al. [1], a role queue can be defined as follows.

Definition 7 (Role Queue). A role r, with tasks T_r associated with that role, behaves like a multi-case type queue with c_r servers and case types T_r, where each case type t ∈ T_r has arrival rate λ_t, expected processing time E(P_t) and expected squared processing time E(P_t²).

Using this definition, the formulas from Section 4 can be used to compute the waiting time E(W_r) and the utilization rate ρ_r for this role. In using these formulas, the assumption is made that the combined arrival process of the tasks can be represented by a Poisson process. If we apply this to the example in Fig. 8, for the role of 'Back Office' a multi-case queue with a single server and two case types A and S is created. The two case types have the following properties: λ_A = 5/4, λ_S = 5/8, E(P_A²) = 0.02 and E(P_S²) = 0.08. Using these properties, E(W_BackOffice) = 1/20 and ρ_BackOffice = 1/4. Applying this to the complete process from Fig. 8, the queueing network shown in Fig. 9 is obtained. Similarly, for the 'Front Office' with two servers, it can be computed that E(W_FrontOffice) ≈ 1/40 and ρ_FrontOffice = 3/8.
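The utilization rate of a role queue is the total offered load divided by the number of servers, ρ_r = (Σ_{t∈T_r} λ_t · E(P_t)) / c_r. A minimal sketch with illustrative 'Back Office' numbers (the processing time means 0.1 and 0.2 are assumptions chosen for this example, not values taken from the paper):

```python
# Utilization of a multi-case-type role queue:
# rho_r = (sum over case types of lambda_t * E(P_t)) / c_r servers.
def role_utilization(case_types, servers):
    """case_types: iterable of (arrival_rate, mean_processing_time) pairs."""
    return sum(lam_t * ep_t for lam_t, ep_t in case_types) / servers

# Illustrative 'Back Office' role: one server, case types A and S.
rho_back_office = role_utilization([(1.25, 0.1), (0.625, 0.2)], servers=1)
```

With these assumed means the offered loads of the two case types are 0.125 each, giving a utilization of 1/4.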

Calculate the overall timing properties
Finally, the overall waiting time, processing time and sojourn time of the process as a whole can be computed. Tasks within the same role have the same expected waiting time, since they are in the same queue. For the example from Fig. 8, tasks A and S of the 'Back Office' have the same expected waiting time; for the 'Front Office' the tasks E, D and L also have the same expected waiting time. The overall times can be computed by analyzing all possible paths through the process and adding up the times of the different tasks, where each path is weighted by the probability that it is taken. The only issue here is that loops in the process create an infinite set of paths through the process. This is solved by the restriction to only analyze block-structured BPMN models [43,44]. This restriction limits the process models that are suitable for analysis with this method, but it is not considered a major drawback, since block structure is often assumed in process-related algorithms [45-47] and some business process modeling languages are block structured by definition, such as the Business Process Execution Language (BPEL) [48], a business process modeling language focused on modeling processes with web services [49]. For more information on algorithms that translate a BPMN model into a block structure, we refer to their definitions [43,44] and to the open source library that implements such an algorithm [50]. A block in a block structure is a subgraph with one entry and one exit point; multiple connected subgraphs of this kind form a block-structured process. Due to the block structure of the process, each part of the process can be computed individually. A block-structured BPMN model can be described by a process structure tree, as defined in Dijkman et al. [1]. Let t be a task. Furthermore, let p, p_1, p_2, . . ., p_n be probabilities.

Definition 8 (Process Structure Tree).
• t is a process structure tree.
• if Pt, Pt_1, Pt_2, . . ., Pt_n are process structure trees, then so are ⟲(Pt, p), →(Pt_1, Pt_2, . . ., Pt_n), ⊗(Pt_1, Pt_2, . . ., Pt_n, p_1, p_2, . . ., p_n) and ⊕(Pt_1, Pt_2, . . ., Pt_n).
The various types of subtrees, indicated by (⟲, →, ⊗, ⊕), are the loop, sequence, choice, and parallelism subtrees, respectively. The semantics of each type of subtree can be defined as shown in Definition 9, which is an extension of Definition 4 in Dijkman et al. [1].
Definition 9 (Process Structure Tree Semantics). In a process structure tree of type:
• t, an arriving item will be processed and leave the queue associated with task t according to the usual queueing semantics.
• ⟲(Pt, p), an arriving item is sent as an arriving item to subtree Pt; an item that leaves subtree Pt is sent as an arriving item to subtree Pt with probability p, or leaves the tree with probability 1 − p.
• →(Pt_1, Pt_2, . . ., Pt_n), an arriving item is sent as an arriving item to subtree Pt_1; an item that leaves subtree Pt_i for 1 ≤ i < n is sent as an arriving item to subtree Pt_{i+1}; and an item that leaves subtree Pt_n leaves the tree.
• ⊗(Pt_1, Pt_2, . . ., Pt_n, p_1, p_2, . . ., p_n), an arriving item is sent to subtree Pt_i with probability p_i; when it leaves that subtree, it leaves the tree.
• ⊕(Pt_1, Pt_2, . . ., Pt_n), an arriving item is sent as an arriving item to all subtrees Pt_1, Pt_2, . . ., Pt_n simultaneously; when it has left all subtrees, it leaves the tree.
By taking these semantics into account, the waiting time, processing time and sojourn time of a process structure tree can be computed. By combining the different process structure trees of a business process, the waiting time, processing time and sojourn time of a block-structured BPMN model can be computed. Proposition 1 is an extension of Proposition 6 in Dijkman et al. [1].

Proposition 1 (E(W), E(P), E(S) of a Process Structure Tree). Let r be a role and T_r be the tasks associated with that role. Furthermore, let E(P_x), E(W_x), E(S_x) be the processing time, waiting time, and sojourn time of a component x, which can be either a task, a role, or a process structure tree. In a process structure tree Q of type:
• t, the expected waiting time E(W_Q) of the tree is E(W_r) for the role r with t ∈ T_r; the expected processing time E(P_Q) of the tree is E(P_t); and the expected sojourn time E(S_Q) of the tree is E(W_Q) + E(P_Q).
• ⟲(Pt, p), the expected waiting time E(W_Q) of the tree is (1/(1 − p)) · E(W_Pt); E(P_Q) and E(S_Q) are defined analogously.
• →(Pt_1, Pt_2, . . ., Pt_n), the expected waiting time E(W_Q) of the tree is Σ_{i=1}^{n} E(W_{Pt_i}); E(P_Q) and E(S_Q) are defined analogously.
• ⊗(Pt_1, Pt_2, . . ., Pt_n, p_1, p_2, . . ., p_n), the expected waiting time E(W_Q) of the tree is Σ_{i=1}^{n} p_i · E(W_{Pt_i}); E(P_Q) and E(S_Q) are defined analogously.
• ⊕(Pt_1, Pt_2, . . ., Pt_n), the expected waiting time E(W_Q) of the tree is computed by Eq. (20) and the expected processing time E(P_Q) is computed by Eq. (25). E(S_Q) is defined analogously.
The process structure tree of type t is very straightforward: the expected waiting time for a task is E(W_t) and the processing time is E(P_t). For the process structure tree of type ⟲, the loop probability p is involved in computing the expected waiting time and expected processing time. The loop is always executed once; the probability that it is executed again is p, the probability that it is executed a third time is p · p, and so on, multiplied by p for every further iteration. Consequently, the probability that the loop is executed at least n + 1 times is pⁿ, and the expected number of times the loop is executed is Σ_{i=0}^{∞} pⁱ. This is a geometric series, which equals 1/(1 − p) [2]. Therefore the expected waiting time of a loop is (1/(1 − p)) · E(W_Pt) and the expected processing time of the loop is (1/(1 − p)) · E(P_Pt). For the process structure tree of type →, the sequence of n tasks, the expected waiting time is the sum of the waiting times of the n tasks, Σ_{i=1}^{n} E(W_{Pt_i}), and the expected processing time is the sum of the processing times of the n tasks, Σ_{i=1}^{n} E(P_{Pt_i}). For the process structure tree of type ⊗, the choice between n tasks, the expected waiting time is Σ_{i=1}^{n} p_i · E(W_{Pt_i}) and the expected processing time is Σ_{i=1}^{n} p_i · E(P_{Pt_i}). For the process structure tree of type ⊕, the parallelism of n tasks, the formulas as presented in Section 5 need to be applied to compute E(W) and E(P).
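The recursive computation over a process structure tree can be sketched as below. The tuple encoding of tree nodes is a hypothetical choice made for illustration, and the parallel (⊕) case is omitted because it requires the formulas of Section 5; per-task E(W) and E(P) values are assumed to be given:

```python
# Expected waiting and processing time of a process structure tree.
# Node encodings (illustrative): ("task", name), ("loop", child, p),
# ("seq", [children]), ("choice", [children], [probabilities]).
def expected_times(node, ew, ep):
    kind = node[0]
    if kind == "task":
        return ew[node[1]], ep[node[1]]
    if kind == "loop":                       # executed 1/(1-p) times on average
        w, s = expected_times(node[1], ew, ep)
        return w / (1.0 - node[2]), s / (1.0 - node[2])
    if kind == "seq":                        # sum over the sequence
        pairs = [expected_times(c, ew, ep) for c in node[1]]
        return sum(w for w, _ in pairs), sum(s for _, s in pairs)
    if kind == "choice":                     # probability-weighted sum
        pairs = [expected_times(c, ew, ep) for c in node[1]]
        return (sum(p * w for p, (w, _) in zip(node[2], pairs)),
                sum(p * s for p, (_, s) in zip(node[2], pairs)))
    raise ValueError(f"unsupported subtree type: {kind}")
```

For example, a loop over a single task with p = 0.5 doubles both expectations, in line with the geometric series 1/(1 − p) = 2.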
For the running example from Fig. 8, the process structure tree is →(⟲(→(A, ⊗(S, E, 0.5, 0.5)), 0.2), ⊕(D, L)).

Evaluation of accuracy of queueing models
In this section, the accuracy of the queueing models presented in this paper is evaluated. Section 7.1 discusses the evaluation of the queueing models compared to each other for the various process constructs. Section 7.2 discusses the evaluation of the developed method on a real-life size process model.

Evaluation of queueing models
To verify that the method in this paper computes the expected waiting time, expected processing time and expected sojourn time correctly, it is evaluated and compared to existing queueing models for quantitative business process analysis [7-9], as presented in Section 3. In the paper of Ha et al. [9] some inconsistencies were found in the formulas for the parallel construct. These inconsistencies concern the definitions of k_A and d_b in the paper. k_A is a fraction whose denominator can be zero in case A consists of only one task, which will frequently be the case because A is a powerset of a set of tasks. d_b relies on the cycle time of the task or tasks b; however, it is undefined how the cycle time is calculated if b is a set of multiple tasks. The authors were contacted to obtain more insight in order to implement these formulas, but no response had been received at the time of writing. Therefore this queueing model is not taken into account in the evaluation.
In the evaluation, the four main business process modeling constructs [6,10,11] are investigated with exponential arrival and processing times. In order to also test the validity of the newly developed approximation for the parallel construct, it is additionally evaluated under a normal distribution. Since this distribution is not supported by the approximation, it can be seen whether the approximation still provides a good enough estimate of the process performance. For the reference values, a simulation model is run in the simulator presented in Peters et al. [51] (see Fig. 10).
Table 3 shows the simulation parameters that are used per construct. Note that the choice for a particular processing time is arbitrary, because the expected waiting time depends on the utilization rate, which is determined by the processing time in combination with the arrival rate and the number of servers.
For that reason, a fixed processing time is used and the utilization rate is varied between a minimum of 0.30 and a maximum of 0.95 with increments of 0.001. For each combination of parameter settings and utilization rate, a simulation model is executed with a warm-up time of 1000 h and a total duration of 5000 h. This ensures that the system is completely loaded before the performance of the techniques is measured, for a long enough period. In each model, resources are shared between tasks that are performed by the same role, as is common in business processes. Note that the models obtained from literature do not support shared resources; dedicated resources are used instead to approximate the results for those models.
In Table 4 the results are presented for each of the setups. The table reports the Mean Absolute Error (MAE), the Mean Squared Error (MSE) and the difference in percentage between the queueing model and the simulation result (%Err). The Mean Absolute Error is computed as MAE = (1/n) Σ_{i=1}^{n} |ŷ_i − y_i| (Eq. (27)): each value ŷ_i is compared against the value y_i obtained in the simulator, and the sum over the absolute differences is divided by the number of observations. The Mean Squared Error is computed as MSE = (1/n) Σ_{i=1}^{n} (ŷ_i − y_i)² (Eq. (28)): each difference between ŷ_i and the simulated value y_i is squared, and the sum over these squared differences is divided by the number of observations. In Table 4 the best value among the three models compared is marked in bold; from this it is clear that the queueing model presented in this paper outperforms the other queueing models in most cases. Only for the setup of the loop are all values almost identical, because there was only one resource involved and therefore the other models did not suffer from the lack of support for resource sharing. In the next paragraphs the individual results of the different setups are discussed in more detail. For the sequential construct, where task a is directly followed by task b, the comparison between the queueing models is provided in Fig. 11. Both models from the literature assume that there are dedicated resources for both tasks; they therefore yield the same performance and an overlapping line in the graph. The model presented in this paper approaches the black line significantly more closely than the other two methods and is therefore considered better for predicting sequential tasks with shared resources.
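The two error measures can be written down directly; a short sketch:

```python
# Mean Absolute Error and Mean Squared Error between model predictions
# y_hat and simulated reference values y (Eqs. (27) and (28)).
def mae(y_hat, y):
    return sum(abs(p - r) for p, r in zip(y_hat, y)) / len(y)

def mse(y_hat, y):
    return sum((p - r) ** 2 for p, r in zip(y_hat, y)) / len(y)
```

MSE penalizes large deviations more heavily than MAE, which is why both are reported in Table 4.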
For the choice construct again two tasks are present, but after task a there is a choice: with probability 0.5 the process continues with task b, and with probability 0.5 it terminates directly. From Fig. 12 it can be observed that the models from literature are closer to the simulation and to the method proposed in this paper. This can be explained by fewer cases being processed in task b, which minimizes the effect of shared resources identified in the sequential pattern.
For the loop construct, a single task a has, after completion, a probability of 0.1 to be performed again and a probability of 0.9 that the process terminates. The model of Sheng et al. only takes into account the first three iterations of the loop, so a difference might be expected in the sojourn time. However, with a probability of 0.1 of re-entering the loop, the fourth iteration has a probability of 0.1^4 of occurring, which does not noticeably influence the results on the sojourn time. All models perform equally well for the loop setup, as can be seen in Fig. 13, given that there is now no difference between the models due to shared resources, since there is only a single activity.
For the parallel construct, two tasks are put in parallel, where the waiting time ends as soon as the first task starts service and the processing time ends when the last task is finished. From Fig. 14 it can be clearly observed that the solution in this paper indeed conforms to the simulation results. The models from literature deviate significantly from the real result for the parallel construct, since they do not support resources shared between the two parallel tasks. Their sojourn time is therefore significantly higher at a given utilization rate, since cases need to wait until the dedicated resource becomes available, even when the other resource might be idle.
In order to check whether the proposed solution for the parallel construct is also valid in cases where general distributions are used, a comparison is made between a model with normally distributed processing times and the formulas as presented in this paper. In Fig. 15 it can be seen that the proposed solution still outperforms the existing models from literature for the parallel construct. Since the deviation from the simulated result is small, relative to the other techniques, this is a good indication that the presented solution is robust for general distributions.

Real-life process
In order to verify that the presented queueing analysis method also works for a real-life size business process, the BPI 2017 dataset [52] is used as a basis for constructing a business process model to analyze with the method described in this paper. The BPI 2017 dataset is chosen because it contains a set of activities in the workflow for which the processing time is precisely recorded. We constructed a business process model for the BPI 2017 dataset by using the Disco process mining toolkit [53] and focusing only on the tasks involved in the workflow, indicated by the letter W in the dataset (i.e. the tasks that involve human activity). Using a filter in Disco to remove the other elements of the data, the process model shown in Fig. 16 is obtained. The process is simplified by using only 10 percent of all paths to improve readability of the process. It is important to note that this simplification of paths does not influence the performance of the queueing analysis method.
The process described in the BPI 2017 dataset and illustrated in Fig. 16 is an application process for a loan. In the process, three different roles are identified: the application administration, the back office, and the fraud detection department. All tasks are assumed to have an exponentially distributed processing time with the recorded mean value as their parameter. Furthermore, in order to obtain the arrival rate, the number of cases is divided by the total time over which the dataset is recorded, which results in a case arriving on average every 18 min. The interarrival time is again assumed to be exponentially distributed.
In order to validate the results of the queueing analysis on a real-life size process model, a simulation of the model is executed. The simulation can be considered a more accurate representation of reality than the queueing analysis, because it makes fewer assumptions than the queueing model. Therefore, a trade-off can be made between the accuracy of the simulation and the speed of the queueing analysis in analyzing business processes. For this case study, a simulation was run with a warm-up time of 1000 h, a total duration of 5000 h and 30 replications. Running the simulation on a computer equipped with an Intel i7-6700K running at 4.4 GHz with 16 GB RAM, the simulation took 4 min to complete. Compared to the queueing analysis method, which takes only a few milliseconds, it is substantially slower. This is no issue if only a single simulation needs to be executed, but for optimization purposes, where thousands of scenarios need to be analyzed, the difference between the execution times of both techniques becomes relevant.
To compare the different methods on this case, a simulation study is performed at the obtained λ of 0.2. Given that the waiting time in that scenario is minimal, since the utilization rate is relatively low, another scenario is also investigated with a λ of 0.4, where the utilization rate is high. The results of the comparison are summarized in Table 5.
In the first scenario, with λ equal to 0.2, an expected sojourn time of 54.28 h is obtained from the simulation model, with a 95-percent confidence interval ranging from 53.30 h to 55.26 h. The expected sojourn time from the queueing analysis is 54.17 h. For the technique from Sheng et al. [8] an expected sojourn time of 59.66 h is obtained, and for the technique from Xie et al. [7] an expected sojourn time of 60.05 h. Both other techniques fall outside the confidence interval, whereas the queueing analysis falls within it.
In the second scenario, with λ equal to 0.4, an expected sojourn time of 89.62 h is obtained from the simulation model, with a 95-percent confidence interval ranging from 83.59 h to 95.64 h. The expected sojourn time from the queueing analysis is 89.85 h, which approaches the expected value of the sojourn time of the simulation model. For the technique from Sheng et al. [8] an expected sojourn time of 83.30 h is obtained. This time is lower than expected and outside the confidence interval, which is caused by the frequent loop in the back office not being incorporated. For the technique from Xie et al. [7] an expected sojourn time of 102.30 h is obtained, which is higher than expected and also outside the confidence interval. Therefore we can conclude that the proposed technique works best in a scenario with a high, and more realistic, utilization rate. Concluding, for this single case study the computational efficiency gain of the queueing analysis is multiple orders of magnitude over the simulation model, while the accuracy of the queueing analysis is in line with that of the simulation model at a 95% level of confidence. These results are in line with the findings of the theoretical evaluation above.

Conclusion and future work
In this paper a new method for translating BPMN models to queueing models is introduced. This method is based on Dijkman et al. [1], which is extended with a new approximation for the parallel construct. The new method translates every role into a queue and analyzes the queueing model based on the possible paths through the process. In order to make this method suitable for business process analysis, a suitable approximation for the parallel construct needed to be developed. The new approximation is evaluated and compared to the state of the art. From this evaluation it is clear that the new approximation for the parallel construct performs significantly better than the state of the art in predicting the sojourn time under different utilization rates. Therefore this new method for analyzing business processes using queueing models is an improvement over the current state of the art and is well suited for business process analysis, because it supports the sequence, choice, loop and parallel constructs. A single case study with a real-world dataset confirmed the findings from the theoretical evaluation, showing that the analysis results of the queueing model are in line with those of a simulation model (with a 95% level of confidence), while being several orders of magnitude faster. In future work, these results, and in particular the accuracy results, can be studied further in more case studies. We also aim to develop methods in which the proposed model is used to optimize business processes.

Fig. 1. Process model with parameters for quantitative analysis.

Definition 1. Utilization rate, expected waiting time, and expected sojourn time in an M/M/1 queue.

Fig. 5. Queue with two case types A and B.

Fig. 6. Parallel construct with two tasks a and b.

Fig. 7. The Markov chain for the parallel construct.

Now E(W) of the process as a whole can be recursively calculated: E(W_⊕(D,L)) ≈ 0.046, which leads to E(W) ≈ 0.155.

Table 1. Comparison of quantitative analysis techniques.

Fig. 2. Simulated values for single task.

Table 2. Qualitative evaluation of existing techniques: Xie et al. [7], Sheng et al. [8], Ha et al. [9], this paper. b: Only average service rates supported. c: Except for the parallel construct.

Table 4. Evaluation results.