Adaptive scheduling in service systems

This paper considers appointment scheduling in a setting in which at every client arrival the schedule of all future clients can be adapted. Starting our analysis with an explicit treatment of the case of exponentially distributed service times, we then develop a phase-type-based approach to also cover cases in which the service times' squared coefficient of variation differs from 1. The approach relies on dynamic programming, with the state information being the number of clients waiting, the elapsed service time of the client in service, and the number of clients still to be scheduled. The use of dynamic schedules is illustrated through a set of numerical experiments, showing (i) the effect of wrongly assuming exponentially distributed service times, and (ii) the gains (over static schedules, that is) achieved by rescheduling.


Introduction
In a broad range of service systems, the demand is regulated by working with scheduled appointments. When setting up such a schedule, the main objective is to properly weigh the interests of the service provider and its clients. One wishes to efficiently run the system, but at the same time, the clients should be provided with a sufficiently high level of service. This service level is usually phrased in terms of the individual clients' waiting times, whereas the system efficiency is quantified by the service provider's idle time. In order to generate an optimal schedule, one is faced with the task of identifying the clients' arrival times that minimize a cost function that encompasses the expected idle time as well as the expected waiting times.
The above framework has been widely adopted in healthcare (Cayirli & Veral, 2003; Gupta & Denton, 2008), but there is ample scope for applications in other domains as well. When thinking for instance of a delivery service, clients are given times (or time windows) at which the deliverer will hand over the commodity (parcel, meal, etc.). In this case, idle times manifest themselves if the deliverer happens to arrive earlier than scheduled, whereas clients will experience waiting times when the deliverer arrives later than scheduled. Other examples in which the above scheduling framework can be used can be found in applications in logistics, such as the scheduling of ship docking times (Wang, 1997).
In the literature, the dominant approach considers static schedules: the clients' arrival times (chosen to minimize the given cost function) are determined a priori, and are not updated on the fly; examples of such studies are, besides the seminal paper of Bailey (1952), Hassin & Mendel (2008), Kuiper, Kemper & Mandjes (2015), Pegden & Rosenshine (1990), and Wang (1997). One could imagine, however, that substantial gains can be achieved if one has the flexibility to now and then adapt the schedule. For instance, if one is ahead of schedule or lagging behind, one may want to notify the clients to provide them with a new appointment time, so as to better control the cost function. In the context of a delivery service, the client will appreciate being given updates on the delivery time, so as to better plan her other activities of the day. Besides, the option of updating the schedule enables the deliverer to mitigate the risk of losing time because of arriving at the client's address before the scheduled time. Throughout this paper we will phrase the problem in terms of arriving clients, their appointment epochs, and their service times.
In the present work, we consider the setting in which the schedule is adapted at any client arrival time. In this respect, it should be noted that, evidently, for any specific application that one would like to analyze, domain-specific details need to be added. The approach that we develop is particularly useful in situations in which clients are essentially already on standby, but benefit from being given a precise estimate of their appointment time.
A few examples in which our setup could be used are:
• Consider a hospital in which a certain type of surgery is performed. The patients are already on standby (i.e., physically present at the hospital, say, residing in a day surgery waiting room). Then with our algorithm, whenever a patient's surgery starts, the start time of the next patient's surgery can be reliably scheduled. The clear benefit is that it is preferred that the patient spends her waiting time in the waiting room rather than in the immediate vicinity of the operating theater. Furthermore, it is often the case that various preparatory tasks need to be performed, such as commencing an anesthetic regime, that can be better timed when there is a more precise estimate of the operating time.
• Consider a healthcare facility, say an operating theater, that is used by multiple doctors, say on-call surgeons. In order to use the operating theater maximally efficiently, the surgeons can be given 'appointment times', i.e., times at which they have to be at the facility to potentially start their surgery. In the language of our scheduling problem, the clients are now the surgeons, each service time corresponds to the total time that it takes a specific surgeon to treat a consecutive stream of patients that were assigned to her, waiting time is the time the surgeon has to wait because of a preceding doctor still being busy, and idle time corresponds to time intervals in which the healthcare facility is not used. In the most basic variant, the patients could be assumed to be on standby, while in more advanced variants two stages of waiting are incorporated in the cost function: (1) time spent in 'standby mode', in which the customer has to be at the service location or close to their home, and (2) the 'true' waiting time, in which service is imminent.
• Consider the situation of a parcel delivery service, in which couriers drop off parcels according to some predetermined optimized route. Typically, clients are provided a relatively wide window, devised as a single appointment time with a confidence interval, in which they are supposed to be close to their home. With our approach, they can be given a refinement of the time at which they actually need to be at home. For example, this can be done at the moment the delivery person enters the previous zone (e.g., a street, a zip code, or a small neighborhood). The (random) time between the first parcel delivery of two subsequent zones can be interpreted as a 'service time' in our model. This comprises both the handling time and travel time of the driver. Sending this real-time update on the delivery enables the client to be aware of when to step outside, thus reducing the handling time of the delivery driver.
• Various other applications can be thought of. Also in the setting of, say, a repairperson or a plumber, clients are typically given wide intervals in which they have to be present, but they would be pleased to be sent (reliable) adjustments on the fly. In this context, a client's 'service' should be thought of as capturing the time until the repairperson is available to start working on the next service (that is, the travel time is included).
Besides being applicable in the settings mentioned above, we would like to stress that our approach provides a benchmark for what one could gain by adaptively updating the appointment schedule. Concretely, with adaptive schedules becoming increasingly important, our numerical output can be used to quantify the potential gain. Our machinery also facilitates the comparison of (conceptually or computationally) simpler adaptive scheduling approaches with the 'ideal' DP-based one; as such, it is useful when exploring the complexity/efficiency tradeoff.
The approach followed relies on dynamic programming (Bellman, 1957): the new schedule is evaluated by recursively reducing it to simpler sub-problems of a lower dimension. Importantly, the schedule update is performed by making use of all information available at the client arrival time under consideration, namely (a) the number of clients who are already waiting at that time, (b) the elapsed service time of the client in service (if any), and (c) the number of clients still to be scheduled. In many appointment scheduling studies, such as Hassin & Mendel (2008) and Pegden & Rosenshine (1990), service times are assumed to be exponentially distributed, having the advantage of facilitating more explicit analysis. In particular, due to the memoryless property, it is not necessary to track information about the service time as in (b) above.
Empirical studies, such as Cayirli, Veral, & Rosen (2006) for the healthcare setting, however, show that in various practical contexts this exponentiality assumption is not justified. In de Vuyst & Bruneel (2014), this is resolved by choosing a discrete-time setting that obtains accurate predictions for any known service-time distribution at low computational cost. As the exact service-time distribution is generally unknown, this led us to develop a method that can in principle deal with any type of service-time distribution. In our approach, the service-time distributions are parameterized by their mean and variance (or, more conveniently, by their mean and squared coefficient of variation (SCV)). Then the idea is to generate a phase-type distribution with this mean and variance. Informally, phase-type distributions are effectively generalizations of exponential distributions, thus offering considerably more flexibility but at the same time still allowing explicit computations.
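To make the two-moment idea concrete, the following sketch (our own illustration; the function names are hypothetical, and the recipe is the classical two-moment fit: a mixture of Erlang distributions for SCV at most 1, and a balanced-means hyperexponential for SCV exceeding 1) fits a phase-type distribution to a given mean and SCV:

```python
import math

def fit_phase_type(mean, scv):
    """Two-moment phase-type fit (hypothetical helper, following the classical
    recipe): a mixture of Erlang(K-1) and Erlang(K) laws with a common rate for
    SCV <= 1, a balanced-means hyperexponential for SCV > 1."""
    if scv <= 1.0:
        K = math.ceil(1.0 / scv)          # so that 1/K <= scv <= 1/(K-1)
        p = (K * scv - math.sqrt(K * (1.0 + scv) - K * K * scv)) / (1.0 + scv)
        mu = (K - p) / mean               # with prob. p: Erlang(K-1, mu), else Erlang(K, mu)
        return ("erlang-mixture", {"K": K, "p": p, "mu": mu})
    # hyperexponential with balanced means: p1/mu1 = p2/mu2 = mean/2
    p1 = 0.5 * (1.0 + math.sqrt((scv - 1.0) / (scv + 1.0)))
    return ("hyperexponential", {"p": (p1, 1.0 - p1),
                                 "mu": (2.0 * p1 / mean, 2.0 * (1.0 - p1) / mean)})

def moments(kind, par):
    """First two moments of the fitted distribution, for verification."""
    if kind == "erlang-mixture":
        K, p, mu = par["K"], par["p"], par["mu"]
        m1 = (K - p) / mu
        m2 = (p * (K - 1) * K + (1 - p) * K * (K + 1)) / mu ** 2
    else:
        (p1, p2), (mu1, mu2) = par["p"], par["mu"]
        m1 = p1 / mu1 + p2 / mu2
        m2 = 2 * (p1 / mu1 ** 2 + p2 / mu2 ** 2)
    return m1, m2
```

For SCV equal to 1 the fit collapses to a single exponential phase, as one would expect.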
The main complication of working with non-exponential distributions lies in the fact that the state of the system has to contain some information about the service time of the client in service. It could, for example, be the elapsed service time or the residual service time. In the case when the non-exponential distribution is of phase type, it suffices to track the current phase of the service time. This complication leads to serious computational challenges: due to the large state space comprising (a), (b), and (c), the numerical evaluation of the schedule is far from trivial.
We proceed by detailing the paper's main contributions.
• In the first place, we set up a framework for dynamic appointment scheduling. As first steps, we deal with systems with exponentially distributed service times, for which still relatively explicit computations can be performed. Importantly, however, our approach is then extended to service-time distributions with an arbitrary SCV. This requires various explicit calculations for queues with deterministic interarrival times and service times that correspond to either a mixture of Erlang distributions or a hyperexponential distribution. Our framework revolves around a dynamic-programming recursion. The focus is primarily on the homogeneous case, i.e., the situation that the clients' service times are identically distributed; in the exponential case, however, we also deal with the heterogeneous case, as some of the underlying computations help in the analysis of the case of a general SCV.
• The second main contribution concerns our computational approach. It relies on a delicate combination of various numerical procedures, some of which are applied in a rather pragmatic manner. These procedures include efficient search algorithms that are required to determine the optimal appointment times for a given instance, storing a sizeable set of dynamic schedules that correspond to specific parameter settings, and various interpolation and extrapolation steps. All procedures that we applied are backed by extensive testing.
• Our approach is illustrated by a set of numerical experiments.
In the first place, we assess to what extent assuming exponentiality leads to performance degradation. In the second place, we quantify the gain of dynamic schedules over static, precalculated schedules. This is done as a function of the SCV, for various values of the parameter that weighs the cost of waiting relative to the cost of server idleness.
In the setup presented in this paper, an initial schedule is computed before the start of the first appointment. This schedule is dynamically updated whenever a new client arrives. It is up to the scheduler to determine how frequently schedule updates should be communicated, as this may depend on the application area and on the lengths of the service times and client travel times (if applicable). In the healthcare context, for instance, our concept of dynamic schedules is primarily useful when dealing with treatments that take relatively long, as the patient should have enough time to appear at the healthcare facility. For similar reasons, in the parcel delivery context, our setup is well-suited in situations in which the time between subsequent deliveries is sufficiently long. In addition, applications are envisaged in the context of e.g. electricians, plumbers, heating engineers, etc., in which a typical service takes a relatively long time. In practical situations, one may schedule the next client in the way described in our paper, but still provide all subsequent clients with an estimate of the time at which their service is expected to start (which is then later adjusted).
Dynamic schedules have been studied extensively, but with a focus on settings that crucially differ from the one described above. Some papers study what could be called 'advance admission scheduling', in which a planner is to dynamically assign appointments for future days, taking into account the clients' preferences and future demand; see for example Hahn-Goldberg (2014); Wang & Fung (2015); Zander & Mohring (2016). Another related domain concerns the sequencing of the clients: whereas in our framework the order of the clients is fixed, one could also study the question of identifying the best sequence (Berg, Denton, Erdogan, Rohleder, & Huschka, 2014; Mak, Rong, & Zhang, 2015). In this respect, there is empirical evidence and formal backing to opt for the rule that sequences clients in order of increasing variance of their service durations (de Kemp, Mandjes, & Olver, 2022; Kong, Lee, Teo, & Zheng, 2016). A relevant related research line concerns settings in which there are routine clients, who can be assigned an appointment time in advance, in combination with last-minute clients (Chen & Robinson, 2014; Erdoğan & Denton, 2013); this problem has been formulated in terms of a (multistage) stochastic linear program. Our type of modeling has only recently been introduced in the delivery literature; in the broader context of home delivery services, e.g. Tsang & Shehadeh (2023) and Zhan, Wang, & Wan (2021) work with a setup that is very similar to ours, viz. an objective function comprising a weighted sum of mean busy and idle times (where it is noted that Zhan et al. (2021) has a strong focus on the routing problem, i.e., finding the order that minimizes the objective function, while in our study the objective is to exploit real-time information to reduce costs by updating the schedule). Finally, we would like to mention the significant recent contributions of Wang, Liu, & Wan (2020), focusing on managing walk-in clients, and Zacharias & Yunes (2020), exploiting the concept of multimodularity to set up a highly general framework covering e.g. no-shows, non-punctuality, and walk-ins.
This paper is organized as follows. Section 2 introduces the modeling framework, defining the dynamics of the underlying system, and formally defining the cost function. We then focus on schedules with exponentially distributed service times: the homogeneous case is dealt with in Section 3, whereas the heterogeneous case can be found in Section 4. In Section 5 we develop our phase-type-based framework, setting up the dynamic programming routine that deals with distributions with a general SCV. Numerical experiments are presented in Section 6. In Section 7 we discuss robustness properties, as well as a computationally attractive variant in which the so-called value-to-go is ignored. We conclude the paper with some final remarks.
We consider a sequence of n ∈ N clients with service times that are represented by the independent, non-negative random variables B_1, ..., B_n. The idea is that we schedule the jobs one by one, in the sense that at the moment client i enters the system, the arrival epoch of client i + 1 is scheduled. This we do relying on dynamic programming, with a pivotal role being played by the Bellman equation. At (immediately after, that is) each of the arrival epochs, the state of the system is the number of clients waiting and the elapsed service time of the client in service, where in addition one knows the characteristics of the clients that are still to be served. Clients are assumed to arrive punctually. As soon as a client's service has been completed, the server proceeds by serving the next client (if present).
Our objective function, or cost function, reflects the interests of both the clients and the service provider, in that it captures the 'disutilities' experienced by both parties involved. Both components are weighed to reflect their relative importance. The clients' disutility is measured through their waiting times, and the service provider's disutility through the queue's idle times. Concretely, the cost function that we will be working with is, for a weight ω ∈ (0, 1),

ω ∑_{i=1}^n E I_i + (1 − ω) ∑_{i=1}^n E W_i,   (1)

where I_i is the idle time and W_i the waiting time associated with client i. The arrival epoch of client i will be denoted by t_i, where evidently t_1 can be set equal to 0 (and obviously I_1 = 0); the corresponding interarrival times t_i − t_{i−1} are denoted by x_i (where we set x_1 = 0). For a pictorial illustration, we refer to Fig. 1 (Kuiper, 2016).
Note that we can write the schedule's makespan (i.e., the time until the last client has left the system) in multiple ways:

∑_{i=1}^n (I_i + B_i) = t_n + W_n + B_n = ∑_{i=2}^n x_i + W_n + B_n;

the left-hand side follows from the observation that every point in time during the makespan belongs to either an idle time or a service time, whereas the other two expressions follow from the fact that the makespan equals the arrival epoch of the last client increased by her waiting time (if any) and her service time. Realizing that the E B_i are given numbers, the consequence of the above identity is that we can equivalently consider the cost function

ω E(t_n + W_n) + (1 − ω) ∑_{i=1}^n E W_i.

Let C_i(k, u) denote the minimal cost incurred from the arrival of the i-th client (until the end of the schedule), given there are k clients in the system immediately after the arrival of the i-th client, and the client in service has an elapsed service time u ≥ 0. The objective is to minimize (1), which is equivalent to evaluating C_1(1, 0). In the next section, we start our analysis with the easiest case: the homogeneous exponential case, i.e., each B_i exponentially distributed with mean 1/μ, to gradually proceed to the setup with a general SCV in Section 5.
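As an illustration of these model ingredients, the following Monte Carlo sketch (an illustrative check of ours, not part of the paper's methodology; the function name and signature are hypothetical) evaluates the cost function, idle and waiting times weighted by ω and 1 − ω, for a given static schedule via the Lindley-type recursion implied by the dynamics above:

```python
import random

def schedule_cost(x, omega, sampler, runs=20000, seed=1):
    """Monte Carlo estimate of omega * sum E[I_i] + (1 - omega) * sum E[W_i]
    for a static schedule with interarrival times x = (x_2, ..., x_n);
    `sampler(rng)` draws one service time. A sketch for illustration only."""
    rng = random.Random(seed)
    n = len(x) + 1
    tot = 0.0
    for _ in range(runs):
        wait = idle = 0.0
        w = 0.0                      # W_1 = 0 (the first client never waits)
        for i in range(1, n):
            # Lindley-type step: client i's sojourn minus the gap to client i+1
            d = w + sampler(rng) - x[i - 1]
            w = max(d, 0.0)          # waiting time of client i+1
            idle += max(-d, 0.0)     # server idleness before client i+1
            wait += w
        tot += omega * idle + (1.0 - omega) * wait
    return tot / runs
```

With deterministic unit service times and unit gaps the cost vanishes, which is a convenient sanity check.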

Homogeneous exponential case
When the service times are exponentially distributed, the state of the system is just the number of clients waiting. Observe that the elapsed service time of the client in service is irrelevant, by virtue of the memoryless property of the exponential distribution. In this section, we set up an efficient technique to determine the optimal arrival time of the next client in case the B_i stem from the same exponential distribution, for the dynamic scheduling mechanism we defined in Section 2.
The main result of this section is Theorem 1, where we present an efficient, dynamic-programming-based (Bellman, 1957) algorithm to dynamically find the optimal arrival times. The algorithm dynamically evaluates the cost function by conditioning on the state of the system (i.e., the number of clients in the system) at arrival epochs. To evaluate this cost function, we first determine the contribution due to the idle and waiting times (computed in Proposition 1), and then determine the transition probabilities (in Lemma 1).
Suppose we wish to evaluate the cost between the arrivals of the i-th and (i + 1)-st clients. For ease, we (locally) identify time 0 with the arrival of this i-th client, and we assume that, immediately after this arrival, there are k clients in the system (where k = 1, ..., i). We let t be the time at which the (i + 1)-st client is scheduled to arrive. Let N_s denote the number of clients in the system at time s ∈ [0, t], including the client in service.
The first observation is that there are essentially two (mutually exclusive) scenarios upon the arrival of client i + 1: (i) the newly arrived client immediately gets treatment, i.e., the system idles before t, and (ii) the newly arrived client has to wait in the queue, i.e., there is still a previously arrived client in service at time t. A second observation is that, considering the objective function that we defined in (1), in the former scenario there is a contribution to the mean idle time as well as the mean waiting time, whereas in the latter scenario only the mean waiting time increases.
• The contribution of the idle time (in scenario (i)) to the cost function, due to the interval [0, t], is ω f_k(t), where

f_k(t) := E[ ∫_0^t 1{N_s = 0} ds | N_{0+} = k ].   (2)

• The contribution of the waiting time (in both scenarios) to the cost function, due to the interval [0, t], is (1 − ω) g_k(t), where

g_k(t) := E[ ∫_0^t (N_s − 1)⁺ ds | N_{0+} = k ].   (3)

Here we used that if there are ℓ clients in the system, then ℓ − 1 of them are waiting.
The quantities f_k(t) and g_k(t) can be evaluated by means of a direct calculation of the integrands appearing in the right-hand sides of (2) and (3). The main idea is to condition on the number of service completions up to time s. To this end, observe that the event {N_s = 0}, conditional on {N_{0+} = k}, describes the situation where all k clients have left the system before time s. Hence, this event amounts to a Poisson process with intensity μ making at least k jumps in the interval [0, s]. Likewise, {N_s = k − ℓ} (with ℓ = 0, ..., k − 1), conditional on {N_{0+} = k}, amounts to this Poisson process making precisely ℓ jumps in [0, s], which is precisely the number of clients that have left in this interval. With Pois(μ) denoting a Poisson random variable with mean μ, we thus find

f_k(t) = ∫_0^t P(Pois(μs) ≥ k) ds,   g_k(t) = ∫_0^t ∑_{ℓ=0}^{k−1} (k − ℓ − 1) P(Pois(μs) = ℓ) ds.   (4)

The above expressions for f_k(t) and g_k(t) can be considerably simplified. To this end, we first write the integrals featuring in (4) in closed form. Define E(ℓ, μ) as an Erlang random variable with shape parameter ℓ and rate parameter μ. Then

∫_0^t μ e^{−μs} (μs)^{ℓ−1}/(ℓ − 1)! ds = P(E(ℓ, μ) ≤ t) = P(Pois(μt) ≥ ℓ),   (5)

so that, for ℓ = 0, 1, ...,

∫_0^t P(Pois(μs) = ℓ) ds = (1/μ) P(Pois(μt) ≥ ℓ + 1).

Let us start by evaluating f_k(t). By virtue of (5),

f_k(t) = t − ∑_{ℓ=0}^{k−1} ∫_0^t P(Pois(μs) = ℓ) ds = t − (1/μ) ∑_{ℓ=1}^{k} P(Pois(μt) ≥ ℓ).

The quantity g_k(t) can be dealt with in a similar way, but the calculations are slightly more involved: appealing to (5) once more, swapping the order of summation, and using that ∑_{ℓ=0}^{k−2} (k − 1 − ℓ) equals k(k − 1)/2 (with the empty sum being defined as 0), one arrives at the expressions recorded below. The following proposition summarizes what we have found above. We use the notation F_μ(k) := P(Pois(μ) ≤ k) for the distribution function of a Poisson random variable with mean μ.
Proposition 1. For k = 1, ..., i and t ≥ 0,

f_k(t) = t − k/μ + (1/μ) ∑_{ℓ=0}^{k−1} F_{μt}(ℓ),   g_k(t) = k(k − 1)/(2μ) − (1/μ) ∑_{ℓ=0}^{k−2} (k − 1 − ℓ) F_{μt}(ℓ).

Besides the quantities f_k(t) and g_k(t), in the dynamic-programming routine an important role is played by the transition probabilities: for k = 1, ..., i and ℓ = 1, ..., k + 1,

p_{kℓ}(t) := P(ℓ clients are present immediately after the next arrival | N_{0+} = k),

with t the corresponding interarrival time. These can be computed in a standard way; we state their explicit form for completeness.
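These quantities are easy to validate numerically. The sketch below (our own verification code, with the closed forms, which follow from the Erlang-Poisson identity, restated in the comments) computes f_k(t) and g_k(t) both in closed form and by direct numerical integration of their defining integrals:

```python
import math

def pois_cdf(m, k):
    """P(Pois(m) <= k)."""
    return sum(math.exp(-m) * m ** j / math.factorial(j) for j in range(k + 1))

def f_closed(k, t, mu):
    # f_k(t) = t - k/mu + (1/mu) * sum_{l=0}^{k-1} F_{mu t}(l)
    return t - k / mu + sum(pois_cdf(mu * t, l) for l in range(k)) / mu

def g_closed(k, t, mu):
    # g_k(t) = k(k-1)/(2 mu) - (1/mu) * sum_{l=0}^{k-2} (k-1-l) F_{mu t}(l)
    return k * (k - 1) / (2 * mu) - sum((k - 1 - l) * pois_cdf(mu * t, l)
                                        for l in range(k - 1)) / mu

def f_numeric(k, t, mu, steps=20000):
    # midpoint integration of P(N_s = 0) = P(Pois(mu s) >= k) over [0, t]
    h = t / steps
    return sum((1 - pois_cdf(mu * (j + 0.5) * h, k - 1)) * h for j in range(steps))

def g_numeric(k, t, mu, steps=20000):
    # integrate E[(N_s - 1)^+] = sum_{l=0}^{k-2} (k-1-l) P(Pois(mu s) = l)
    h = t / steps
    tot = 0.0
    for j in range(steps):
        m = mu * (j + 0.5) * h
        tot += h * sum((k - 1 - l) * math.exp(-m) * m ** l / math.factorial(l)
                       for l in range(k - 1))
    return tot
```

For instance, for k = 3, t = 2, and μ = 1, both routes give g_3(2) = 3 − 5e^{−2}.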
Lemma 1. For k = 1, ..., i, ℓ = 2, ..., k + 1 and t ≥ 0,

p_{kℓ}(t) = e^{−μt} (μt)^{k+1−ℓ}/(k + 1 − ℓ)!,

corresponding to precisely k + 1 − ℓ service completions in [0, t]; in addition, p_{k1}(t) = 1 − ∑_{ℓ=2}^{k+1} p_{kℓ}(t).

We have now collected the ingredients of our dynamic-programming procedure.

Theorem 1. With C_n(k) := (1 − ω) k(k − 1)/(2μ), the C_i(k) can be determined recursively: for i = n − 1, ..., 1 and k = 1, ..., i,

C_i(k) = min_{t ≥ 0} { ω f_k(t) + (1 − ω) g_k(t) + ∑_{ℓ=1}^{k+1} p_{kℓ}(t) C_{i+1}(ℓ) },

the optimal interarrival time being the minimizing t.

In Theorem 1, we kept track of the contributions of the waiting times to the slots between subsequent arrivals. More precisely, in the quantity g_k(t) we keep track of the contributions to the expected waiting time by all clients in the system, between two subsequent arrivals (with an interarrival time t), given that there are k clients present at the beginning of the slot (i.e., directly after the arrival of a client). There is a convenient alternative, though: we can work instead with h_k, denoting the expected waiting time of a client if the number of clients immediately after her arrival is k (i.e., also counting the waiting time outside the interarrival time under consideration). Some thought reveals that in this way one also gathers the sum of the clients' mean waiting times. Indeed, the quantities g_k(t) represent the aggregate waiting time (of all clients present, that is) in the interval of length t, whereas h_k is the waiting time of the single client in the rest of the schedule, in both cases conditional on k clients being present at the beginning of the slot. Informally, one could say that working with the g_k(t) one first adds up the contributions of the clients and then aggregates over time, whereas working with the h_k one does the opposite.
Working with the h_k has an important advantage, namely its simplicity: we have

h_k = (k − 1)/μ,   (7)

independently of t; indeed, the newly arrived client has to wait for the k − 1 clients in front of her, each of whom (by the memoryless property) occupies the server for an expected duration 1/μ. This leads to the following alternative dynamic-programming scheme. Let f_k(t) and p_{kℓ}(t) be given by Proposition 1 and Lemma 1, and let h_k be given by (7). Then, with ˜C_n(k) := (1 − ω) h_k, we can determine the ˜C_i(k) recursively:

˜C_i(k) = (1 − ω) h_k + min_{t ≥ 0} { ω f_k(t) + ∑_{ℓ=1}^{k+1} p_{kℓ}(t) ˜C_{i+1}(ℓ) }.

The resulting value of the objective function (i.e., ˜C_1(1)) and the corresponding schedule coincide with their counterparts produced by the algorithm of Theorem 1.
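A compact way to implement a backward recursion of this type is a grid search over candidate interarrival times. The sketch below is our own illustrative implementation under stated assumptions (exponential(μ) services, waiting charged via h_k = (k − 1)/μ at each arrival, and a finite grid replacing the paper's optimized search routines); it is not the paper's exact algorithm:

```python
import math

def pois_pmf(m, j):
    return math.exp(-m) * m ** j / math.factorial(j)

def trans_prob(k, l, t, mu):
    """P(l clients right after the next arrival | k right after this one),
    with interarrival time t and exponential(mu) services."""
    if 2 <= l <= k + 1:
        return pois_pmf(mu * t, k + 1 - l)        # exactly k+1-l departures
    if l == 1:                                    # k or more departures
        return 1.0 - sum(pois_pmf(mu * t, j) for j in range(k))
    return 0.0

def f_idle(k, t, mu):
    """Expected idle time in [0, t] given k clients at time 0+."""
    F = lambda l: sum(pois_pmf(mu * t, j) for j in range(l + 1))
    return t - k / mu + sum(F(l) for l in range(k)) / mu

def dynamic_schedule(n, mu, omega, grid=None):
    """Backward DP sketch: with k clients present at an arrival, choose the
    next interarrival time minimizing expected idle cost plus the arriving
    client's expected wait (k-1)/mu plus the value-to-go."""
    if grid is None:
        grid = [0.05 * j for j in range(1, 201)]  # candidate interarrival times
    C = {k: (1 - omega) * (k - 1) / mu for k in range(1, n + 1)}  # after client n
    tau = {}
    for i in range(n - 1, 0, -1):                 # clients n-1, ..., 1
        Cnew = {}
        for k in range(1, i + 1):
            best, best_t = float("inf"), None
            for t in grid:
                v = omega * f_idle(k, t, mu) + sum(
                    trans_prob(k, l, t, mu) * C[l] for l in range(1, k + 2))
                if v < best:
                    best, best_t = v, t
            Cnew[k] = (1 - omega) * (k - 1) / mu + best
            tau[(i, k)] = best_t
        C = Cnew
    return C[1], tau
```

For n = 2, μ = 1, and ω = 1/2, the first-slot objective reduces to 0.5(t − 1) + e^{−t}, minimized at t = ln 2 ≈ 0.69, which the grid search recovers.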
Example 1. We present an instance in which we apply Theorem 1. Denote by τ_i(k) the optimal time between the i-th and (i + 1)-st arrivals, given that immediately after the i-th arrival there are k = 1, ..., i clients present. Consider the case of ω = 1/2 and n = 15. We assume μ = 1; the values corresponding to any other value of μ can be found by multiplying all interarrival times by 1/μ. The optimal interarrival times τ_i(k) corresponding to this schedule are given in Table 1.
Example 2. In this example, we assess the gain achieved by scheduling dynamically rather than in advance. We let K_pre(n, ω) be the value of the objective function (for a given ω) when we precalculate the schedule, i.e., optimizing (1) at time 0 over all arrival times t_1 = 0, t_2, ..., t_n. Also, K_dyn(n, ω) is the value of the objective function when at every arrival we determine the optimal arrival epoch for the next client, taking into account the number of clients present. In addition, we define r(n, ω) as the ratio of K_dyn(n, ω) and K_pre(n, ω). The results, shown in Table 2, indicate that the gain of scheduling dynamically can be substantial. In particular, this holds when the weight ω is relatively high and/or the number of clients n gets large.
Remark 2. We observed that the gain of adapting the schedule becomes more pronounced as ω increases. This phenomenon is essentially due to the intrinsic asymmetry between idle times and waiting times. In the regime that ω is relatively large, the schedule will be such that interarrival times are relatively short, so as to avoid idle time. This means that there are systematically relatively many clients in the system, where for each number of clients there is a specific ideal (i.e., cost-minimizing) time until the next arrival. In other words, there is a substantial gain due to the fact that current information is taken into account when determining the next arrival time. If ω is relatively small, on the contrary, interarrival times are relatively long to avoid waiting times. In the extreme case of a very small ω, effectively every new client will find the system empty. This means that there is no gain in letting the time until the next arrival depend on the state of the system, as the state of the system upon client arrival is virtually always the same.

We conclude this section by discussing the concept of stationary schedules. Table 1 reveals that the τ_i(k) depend on k (i.e., the number of clients present), but hardly on i (i.e., the index of the client who just arrived). Indeed, this suggests that there is an 'approximate stationary policy' that provides τ(k), i.e., the interarrival time when k clients are present, in settings where the total number of clients n is large. Such a policy can be evaluated as follows.
Suppose that, when observing k clients after a new client has joined, we consistently schedule the next client after x_k time units. Then the number of clients immediately after client arrivals forms a (discrete-time) Markov chain, with the transition probability of going from k to ℓ clients given by p_{kℓ}(x_k) (where p_{kℓ}(x) has been defined above). The equilibrium distribution of this Markov chain is given by π = (π_1, π_2, ...). Observe that π depends on all the interarrival times x_k (due to the fact that the Markov chain depends on all the x_k), so that we prefer to write π(x), with x = (x_1, x_2, ...).
Then the minimal long-term per-client cost is given by

C = min_x ∑_{k=1}^∞ π_k(x) ( ω f_k(x_k) + (1 − ω) g_k(x_k) ),

with f_k(x) and g_k(x) as defined above, and τ(k) is the k-th component of the optimizing x. To numerically evaluate the long-term per-client cost C, there is the practical (but not complicated) issue that the above sum should be truncated at a suitably chosen level (say K).
It is noted that the τ(k) that are found are of great practical interest, as these can serve as the initial values for the τ_i(k) in the numerical evaluation of our (non-stationary) dynamic-programming based schedule, thus accelerating computation times considerably. The search for the optimizing x is of low computational cost, as this concerns just the minimization of a function with a K-dimensional argument, which can be done relying on standard software. It should be kept in mind that in every step we have to compute, for the current value of x = (x_1, ..., x_K), the equilibrium distribution π(x) corresponding to the transition probabilities p_{kℓ}(x_k), but this is a matter of using a standard routine to solve a system of linear equations. In Table 3 we present the optimal stationary dynamic schedules for various values of the weight ω. Notice the striking agreement between the τ_i(k) values presented in Table 1 on one hand, and the τ(k) values in the column corresponding to ω = 0.5 in Table 3 on the other hand.
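The evaluation step of such a stationary policy can be sketched as follows (our own illustrative code, under stated assumptions: exponential(μ) services with closed-form f_k and g_k, a truncation level K, and power iteration in place of a direct linear-equation solver; a full minimization over x would wrap long_run_cost in a standard optimizer):

```python
import math

def pmf(m, j): return math.exp(-m) * m ** j / math.factorial(j)
def cdf(m, k): return sum(pmf(m, j) for j in range(k + 1))

def f_idle(k, t, mu):
    return t - k / mu + sum(cdf(mu * t, l) for l in range(k)) / mu

def g_wait(k, t, mu):
    return k * (k - 1) / (2 * mu) - sum((k - 1 - l) * cdf(mu * t, l)
                                        for l in range(k - 1)) / mu

def transition_row(k, t, mu, K):
    """Row of the truncated chain: from k clients (just after an arrival)
    to l clients just after the next arrival, with l capped at K."""
    row = [0.0] * (K + 1)               # indices 1..K used
    for l in range(2, k + 2):
        row[min(l, K)] += pmf(mu * t, k + 1 - l)
    row[1] += 1.0 - cdf(mu * t, k - 1)  # k or more departures: system emptied
    return row

def stationary(x, mu, K, iters=2000):
    """Equilibrium distribution via power iteration."""
    P = {k: transition_row(k, x[k - 1], mu, K) for k in range(1, K + 1)}
    pi = [0.0] + [1.0 / K] * K
    for _ in range(iters):
        new = [0.0] * (K + 1)
        for k in range(1, K + 1):
            for l in range(1, K + 1):
                new[l] += pi[k] * P[k][l]
        pi = new
    return pi

def long_run_cost(x, mu, omega, K):
    """Long-run per-client cost of the stationary rule x = (x_1, ..., x_K)."""
    pi = stationary(x, mu, K)
    return sum(pi[k] * (omega * f_idle(k, x[k - 1], mu)
                        + (1 - omega) * g_wait(k, x[k - 1], mu))
               for k in range(1, K + 1))
```

Each transition row sums to 1 by construction, so the power iteration preserves total probability.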
Stationary schedules have been studied in the static case as well; see e.g. Kuiper et al. (2015), and also the related heavy-traffic analysis of Kuiper, Mandjes, & de Mast (2017).

Heterogeneous exponential case
In this section, we consider the case that the service requirements are heterogeneous, but still exponentially distributed. As will become clear in the next section, some of the elements appearing in the present section help in the analysis of the case of service times with a general SCV. In particular, our approach deals with evaluating convolutions and recursions, which is used extensively in the general case. The main result of this section is Theorem 2, which is the counterpart of Theorem 1 for heterogeneous exponential service times.
We denote by 1/μ_i the mean of B_i, the service requirement of client i. For ease we assume that all μ_i are distinct; later in the section, we comment on the case that some of the μ_i coincide. In the sequel we will intensively use, for k = 1, 2, ..., ℓ = 0, 1, ... and s ≥ 0, the notation φ_{kℓ}(s) for the density of the sum of the independent exponentially distributed random variables E_k, E_{k+1}, ..., E_{k+ℓ}, where E_j denotes an exponentially distributed random variable with mean 1/μ_j. The following lemma shows that this density can be written as a mixture of exponential terms, with the μ_j in the exponents; it should be noted that the corresponding weights are not necessarily positive. It is in principle possible to derive a closed-form expression for φ_{kℓ}(s); this result is included in Appendix A.

Lemma 2. For k = 1, 2, ..., ℓ = 0, 1, ... and s ≥ 0, there are coefficients c_{kℓj} such that

φ_{kℓ}(s) = ∑_{j=k}^{k+ℓ} c_{kℓj} e^{−μ_j s},

where c_{k0k} = μ_k and, for ℓ = 0, 1, ...,

c_{k,ℓ+1,j} = c_{kℓj} μ_{k+ℓ+1}/(μ_{k+ℓ+1} − μ_j), j = k, ..., k + ℓ,   c_{k,ℓ+1,k+ℓ+1} = − ∑_{j=k}^{k+ℓ} c_{k,ℓ+1,j}.
Proof. We prove the statement by induction. It is clear that c_{k0k} = μ_k, as claimed, due to φ_{k0}(s) = μ_k e^{−μ_k s} for s ≥ 0. We proceed by verifying the induction step, checking whether φ_{k,ℓ+1}(s) has the right form, given that φ_{kℓ}(s) has. To this end, note that, using the usual convolution representation,

φ_{k,ℓ+1}(s) = ∫_0^s φ_{kℓ}(u) μ_{k+ℓ+1} e^{−μ_{k+ℓ+1}(s−u)} du,   (8)

which by the induction hypothesis can be written as

∑_{j=k}^{k+ℓ} c_{kℓj} (μ_{k+ℓ+1}/(μ_{k+ℓ+1} − μ_j)) ( e^{−μ_j s} − e^{−μ_{k+ℓ+1} s} ).   (9)

This expression reveals the stated recursion.
Remark 3. In the above setup we assumed that all μ i are distinct, but the calculations can be adapted to the case that some μ i coincide.Upon inspecting the proof of Lemma 2 , it is seen that the terms in the expression for ϕ k (s ) will not be just exponentials, but products of polynomials and exponentials.More concretely, if in the right-hand side of (9) , for some j we have that μ k + +1 = μ j , we obtain that the corresponding summand should read c k j μ j se −μ j s .
It can be checked that if m of the μ_j are equal, this leads to a term proportional to s^{m−1} e^{−μ_j s}; cf. the density of the Erlang distribution.
The computations below are performed for the case that ϕ_{kℓ}(s) is a sum of exponentials, but they can easily be adapted to the case that some terms are products of polynomials and exponentials.
The following interesting result is proved in Appendix A and will be useful to simplify several expressions later in this section.
In this case of heterogeneous exponentially distributed service times, the setup of the homogeneous counterpart essentially carries over, but now we work with the terms f_{ki}(t) and g_{ki}(t), where the subscript i indicates that the i-th client entered at time 0. Again, to make the notation lighter, we shift time in these computations such that the arrival of the i-th client corresponds to time 0.
We start our computations with the evaluation of f_{ki}(t), by first considering the integrand. It is readily checked, by Lemma 2, that the event of interest corresponds to clients i − k + 1 up to (and including) i being served before time s; by Lemma 3, we then obtain an explicit expression for f_{ki}(t). As a sanity check, observe that indeed f_{ki}(0) = 0, as desired: the mean amount of idle time over a period of length 0 should equal 0. We proceed by analyzing g_{ki}(t). Using (8), we observe that, for ℓ = 0, ..., k − 1, the event of interest corresponds to clients i − k + 1 up to (and including) i − k + ℓ being served before time s, but the service of client i − k + ℓ + 1 being completed only after time s. Now that we have expressions for the mean idle and waiting times, we compute the corresponding transition probabilities.
Let these transition probabilities be defined as p_{kℓ,i}(t), for ℓ = 2, ..., k + 1. They can be calculated using the machinery that we developed above. As for the homogeneous case, we can set up a dynamic program to identify the optimal strategy.

Theorem 2. Let f_{ki}(t), g_{ki}(t) and p_{kℓ,i}(t) be as defined above. Then we can determine the C_i(k) recursively.
As in the homogeneous case, in the above dynamic programming recursion we can work with h_{ki} instead of g_{ki}(t). With this dynamic programming algorithm at our disposal, we can assess which order of the clients minimizes the objective function C_1(1). Since this is not the main focus of the paper, we refer the interested reader to Appendix B, where the numerical examples for this section can be found.
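To illustrate the structure of such a dynamic program, the sketch below implements a simplified homogeneous variant (all service rates equal to μ): the expected idle time in (0, t) follows from the Poisson number of service completions, the expected waiting time of the newly arrived client is taken as (k − 1)/μ, and the interarrival time is minimized over a finite grid. The cost terms and grid are our own simplifications, not the paper's exact definitions:

```python
import math

def poisson_sf(k, lam):
    """P(Poisson(lam) >= k)."""
    return 1.0 - sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k))

def mean_idle(k, t, mu=1.0, steps=200):
    """f_k(t): expected server idle time in (0, t) with k clients present at 0;
    equals the integral over (0, t) of P(all k services completed by s)."""
    h = t / steps
    return h * sum(poisson_sf(k, mu * (j + 0.5) * h) for j in range(steps))

def trans_prob(k, ell, t, mu=1.0):
    """p_{k,ell}(t): P(ell clients just after the next arrival | k just after this one)."""
    if ell == 1:                 # system emptied before the next arrival
        return poisson_sf(k, mu * t)
    d = k + 1 - ell              # number of service completions in (0, t)
    return math.exp(-mu * t) * (mu * t)**d / math.factorial(d)

def schedule(n, omega, mu=1.0, t_grid=None):
    """Backward DP over clients i = n, ..., 1; returns costs C and arrival gaps tau."""
    t_grid = t_grid or [0.1 * j for j in range(1, 101)]
    C = {(n, k): (1 - omega) * (k - 1) / mu for k in range(1, n + 1)}
    tau = {}
    for i in range(n - 1, 0, -1):
        for k in range(1, i + 1):
            C[(i, k)], tau[(i, k)] = min(
                (omega * mean_idle(k, t, mu) + (1 - omega) * (k - 1) / mu
                 + sum(trans_prob(k, ell, t, mu) * C[(i + 1, ell)]
                       for ell in range(1, k + 2)),
                 t)
                for t in t_grid)
    return C, tau
```

The backward recursion minimizes, for every (i, k), over the interarrival time t on a finite grid, exactly the structure used in the discretized procedures discussed later in the paper.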

Service times with general SCV
Having dealt with the exponential case, in the present section we extend our analysis to the case of the service times having a general SCV. This we do relying on the concept of phase-type distributions. The main idea behind our approach is that we fit our service-time distribution, which is characterized by its mean and SCV, to specific types of phase-type distributions. For these, we point out how to produce dynamic schedules using dynamic programming. In this section we consider the homogeneous phase-type case (i.e., the clients' service times are i.i.d. random variables with a given mean and SCV); at the expense of a substantial amount of additional calculations, this can in principle be extended to the corresponding heterogeneous case.
We define phase-type distributions in the following way (Asmussen, 2003, Section III.4). Let α ∈ R^d be a probability vector, i.e., its entries are non-negative and sum to 1. In addition, we have the transition rate matrix Q = [ T  t ; 0^T  0 ], with 0 denoting a d-dimensional all-zeroes (column) vector, t := −T1, and 1 a d-dimensional all-ones (column) vector. The pair (α, T) represents a phase-type distributed random variable B: the initial phase is sampled from the distribution α, then the state evolves according to a continuous-time Markov chain with rate matrix Q, until the absorbing state d + 1 is reached. The value of the phase-type distributed random variable B records the time it takes for the process we just described to reach state d + 1. The exponentially distributed times spent in the states 1, ..., d are often referred to as phases.
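The definition translates directly into a simulation recipe: sample the initial phase from α, then jump between the transient phases according to T, with exit rate t = −T1 towards the absorbing state. The sketch below (our own illustration, with our own function name) does exactly that:

```python
import random

def sample_phase_type(alpha, T, rng=random):
    """Sample one realization of a phase-type random variable (alpha, T):
    run the CTMC through its transient phases until absorption in state d+1."""
    d = len(alpha)
    exit_rate = [-sum(T[z]) for z in range(d)]   # t = -T 1, rates to absorption
    z = rng.choices(range(d), weights=alpha)[0]  # initial phase from alpha
    total = 0.0
    while True:
        rate = -T[z][z]                          # total rate out of phase z
        total += rng.expovariate(rate)           # exponential sojourn in phase z
        # next state: phase j with rate T[z][j], absorption with exit_rate[z]
        probs = [T[z][j] / rate if j != z else 0.0 for j in range(d)]
        probs.append(exit_rate[z] / rate)
        nxt = rng.choices(range(d + 1), weights=probs)[0]
        if nxt == d:                             # absorbed: B = total time elapsed
            return total
        z = nxt
```

For α = (1, 0) and T = [[−2, 2], [0, −2]] this samples an Erlang distribution with two phases of rate 2, whose mean is 1.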
In the context of appointment scheduling with phase-type service requirements, an evident major complication is that the phase is a non-observable quantity. Indeed, when scheduling the arrival epoch of the (i + 1)-st client, which happens at the moment client i arrives, the information one has is the number of clients in the system and the elapsed service time of the client in service (in addition to the number of clients still to be served). Importantly, one does not know the phase that the client in service is in. This can be resolved, however, by working with the distribution of the phase, conditional on the elapsed service time. It is this crucial idea that is applied extensively in the procedure outlined below.
Two special subclasses of phase-type distributions are of particular interest: the weighted Erlang distribution and the hyperexponential distribution. More concretely, as motivated in great detail in e.g. Kuiper et al. (2015); Tijms (1994), we propose to approximate the service-time distribution by a weighted Erlang distribution in case the SCV is smaller than 1, whereas the hyperexponential distribution can be used if the SCV is larger than 1. In this context we refer to the method of moments, frequently used in statistics; see also Asmussen, Nerman, & Olsson (1996) for statistical procedures to estimate phase-type distributions from data.
In the next subsection, we detail the procedure to identify the parameters of the weighted Erlang distribution and the hyperexponential distribution. Importantly, the error due to replacing a non-phase-type distribution with its phase-type counterpart (with the same mean and SCV) is negligible, as was shown in e.g. Kuiper (2016, pp. 110-111). We provide a detailed account of this aspect in Section 7.

Fitting phase-type distributions
As mentioned above, we aim at identifying, for any given distribution of the service time B, a phase-type distribution that matches the first and second moment, or, equivalently, the mean and the SCV. Here, the coefficient of variation CV(X) of a random variable X is defined as the ratio of the standard deviation to the mean, CV(X) = √(Var X)/E X; the squared coefficient of variation S(X), or SCV, is defined as the square of CV(X), i.e., S(X) = Var X/(E X)². As in Kuiper et al. (2015), if S(B) is smaller than 1, then we match a mixture of two Erlang distributions (also referred to as a weighted Erlang distribution) with K and K + 1 phases (for a suitably chosen value of K ∈ N). If, on the contrary, S(B) is larger than 1, then we use a mixture of two exponential distributions (also referred to as a hyperexponential distribution). When S(B) equals 1, both cases reduce to the homogeneous exponential case considered in Section 3.
• In the first case, i.e., when S(B) ≤ 1, our fitted distribution can be represented as follows: with probability p the random variable B is the sum of K independent exponential terms with rate μ, and with probability 1 − p it is the sum of K + 1 such terms (with the obvious independence assumptions), for some K ∈ N, μ > 0, and p ∈ [0, 1], where the branch is selected through U ∼ Unif[0, 1]. In words: with probability p we have an Erlang distribution with K phases and with probability 1 − p an Erlang distribution with K + 1 phases. Now define the function f(·) (and its derivative) through the SCV of this mixture as a function of p. Due to f(0) = 1/(K + 1) and f(1) = 1/K, we find that S(B) lies between these two values. Hence, given S(B) ≤ 1, we can set K = ⌊1/S(B)⌋. It remains to identify the probability p. Solving the resulting equation yields two roots, the lowest of which (the solution with the minus sign, that is) lies in the interval [0, 1].
• In the second case, i.e., when S(B) > 1, we fit a hyperexponential distribution: with probability p the random variable B is exponential with rate μ_1, and with probability 1 − p it is exponential with rate μ_2, where μ_1, μ_2 > 0 and (without loss of generality) μ_1 > μ_2. Note that there are now three parameters to pick, which have to satisfy two equations. To find a unique solution, as pointed out in Kuiper et al. (2015); Tijms (1994), we impose the additional condition of balanced means, i.e., μ_1 = 2pμ and μ_2 = 2(1 − p)μ for some μ > 0. Hence, given the expectation E B of our service-time distribution, we find the parameters μ_1 and μ_2 by setting μ = 1/E B and using the balanced means condition. As μ_1 > μ_2, we obtain the unique solution with p > 1/2; one verifies that, as desired, the resulting S(B) > 1.

In the following two subsections we develop dynamic programming algorithms for the weighted Erlang case and the hyperexponential case, respectively.
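A sketch of the two moment fits described above; the closed-form expressions for p are the standard mixed-Erlang and balanced-means formulas from Tijms (1994), while the function name is ours:

```python
import math

def fit_phase_type(mean, scv):
    """Match a phase-type distribution to a given mean and SCV.

    scv < 1: mixture of Erlang(K, mu) (w.p. p) and Erlang(K+1, mu) (w.p. 1-p),
             with K = floor(1/scv), so that 1/(K+1) <= scv <= 1/K.
    scv > 1: hyperexponential with balanced means (p/mu1 = (1-p)/mu2).
    scv == 1: plain exponential."""
    if scv < 1:
        K = math.floor(1 / scv)
        p = ((K + 1) * scv - math.sqrt((K + 1) * (1 - K * scv))) / (1 + scv)
        mu = (K + 1 - p) / mean
        return ("weighted-erlang", K, p, mu)
    if scv > 1:
        p = 0.5 * (1 + math.sqrt((scv - 1) / (scv + 1)))
        return ("hyperexponential", p, 2 * p / mean, 2 * (1 - p) / mean)
    return ("exponential", 1 / mean)
```

One can verify the fit by recomputing the first two moments of the returned distribution; both the mean and the SCV are reproduced exactly.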
Let Z_t be the phase the process is in at time t. We argued above that the phase as such is not observable. For that reason, we wish to determine the distribution of the phase as a function of the (observable) elapsed service time: we compute, for t ≥ 0 and z = 1, ..., K + 1, the probabilities γ_z(t) that the client in service is in phase z, given that the elapsed service time is t. It is easily verified that indeed γ_1(0) = 1 and γ_z(0) = 0 for z = 2, ..., K + 1. We proceed by successively evaluating the mean idle time, the mean waiting time, and the transition probabilities, thus facilitating setting up the dynamic program. We do this by first conditioning on the phase z = 1, ..., K + 1 of the client in service; for this we make use of the notation f̄_{kz}(t) and ḡ_{kz}(t) (or, equivalently, h̄_{kz}); they can be seen as more general versions of the f_k(t) and g_k(t) (or h_k) of the homogeneous exponential case. Next, we focus on computing the mean idle and waiting time given the observable information, i.e., the elapsed service time u ≥ 0 besides the number of clients k in the system. Since these terms depend on the newly introduced terms, we use the notation f•_{k,u}(t) and ḡ•_{k,u}(t) for the idle and waiting time given the state information.
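For the weighted Erlang fit, this conditional phase distribution has a simple closed form: being in phase z ≤ K at elapsed time u means exactly z − 1 phase completions (a Poisson count), while phase K + 1 is only reachable through the (1 − p)-branch. The sketch below is our own implementation of this observation, consistent with γ_1(0) = 1:

```python
import math

def phase_given_elapsed(u, K, p, mu):
    """gamma_z(u): distribution of the phase of the client in service, given
    elapsed service time u, for a weighted Erlang service time (Erlang(K, mu)
    w.p. p, Erlang(K+1, mu) w.p. 1-p). Returns [gamma_1, ..., gamma_{K+1}]."""
    # phase z <= K: exactly z-1 completed phases, both mixture branches alive
    w = [math.exp(-mu * u) * (mu * u) ** (z - 1) / math.factorial(z - 1)
         for z in range(1, K + 1)]
    # phase K+1: K completed phases, only the (1-p)-branch still in service
    w.append((1 - p) * math.exp(-mu * u) * (mu * u) ** K / math.factorial(K))
    tot = sum(w)  # equals P(B > u)
    return [x / tot for x in w]
```

As u grows, the mass shifts towards the higher phases, which is precisely what makes the schedule depend on the elapsed service time.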

Mean idle time
We start by evaluating the mean idle time, given Z_{0+} = z; using the same arguments as in the previous sections, this quantity equals f̄_{kz}(t). Let Bin(z, p) denote a binomially distributed random variable with parameters z ∈ N and p ∈ [0, 1]. If z = 1, ..., K, to achieve N_s = 0 we should have that at time s the number of phases that have been served is K − z + 1 (the 'certain phases' of the client in service), plus 1 with probability 1 − p (the 'uncertain phase' of the client in service), plus (k − 1)K (the 'certain phases' of all k − 1 waiting clients), plus Bin(k − 1, 1 − p) (the 'uncertain phases' of all waiting clients). Hence, with f_k(t) as given by Proposition 1, we arrive at (10). The case that z = K + 1 has to be treated separately, as in this case the number of phases corresponding to the client in service is K + 1 with certainty. This means that for z = K + 1, in order to achieve N_s = 0, it is required that at time s the number of phases that have been served is 1 (the (K + 1)-st phase of the client in service), plus (k − 1)K (the 'certain phases' of all waiting clients), plus Bin(k − 1, 1 − p) (the 'uncertain phases' of all waiting clients).
We thus arrive, for z = K + 1, at (11). The quantity f•_{k,u}(t) then follows, where the f̄_{kz}(t) are given by (10) for z = 1, ..., K, and by (11) for z = K + 1. This completes our computation of the mean idle time given the observable quantities (viz. the number of clients present at the beginning of the slot and the elapsed service time of the client in service).
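The phase-counting argument above can be turned into code directly: conditioning on the 'uncertain phases', the probability that all work is done by time s is a binomial mixture of Erlang distribution functions. A sketch, with our own function names:

```python
import math

def erlang_cdf(m, x, mu):
    """P(Erlang(m, mu) <= x) = P(Poisson(mu*x) >= m)."""
    return 1.0 - sum(math.exp(-mu * x) * (mu * x) ** j / math.factorial(j)
                     for j in range(m))

def p_all_done(k, z, s, K, p, mu):
    """P(N_s = 0 | Z_{0+} = z): all phases of the k clients present at time 0
    are completed by time s, for weighted Erlang service times."""
    # certain phases: remainder of the client in service, plus K per waiting client
    base = (K - z + 1 if z <= K else 1) + (k - 1) * K
    # uncertain (K+1)-st phase of the client in service, absent if z = K + 1
    in_service = [(p, 0), (1 - p, 1)] if z <= K else [(1.0, 0)]
    total = 0.0
    for w1, extra in in_service:
        for b in range(k):  # Bin(k-1, 1-p) uncertain phases of waiting clients
            w2 = math.comb(k - 1, b) * (1 - p) ** b * p ** (k - 1 - b)
            total += w1 * w2 * erlang_cdf(base + extra + b, s, mu)
    return total
```

Integrating this probability over (0, t), as in the exponential case, then yields the mean idle time f̄_{kz}(t).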

Mean waiting time
Now that we have analyzed the mean idle times, we continue by computing the mean waiting times. As pointed out in Remark 1, it is more convenient to work with the term h•_{k,u}, representing the expected waiting time of the newly arrived client, rather than ḡ•_{k,u}(t), i.e., the expected waiting time of all clients present between the client arrivals. For completeness, the evaluation of ḡ•_{k,u}(t) is provided in Appendix C; we now focus on the recursion for h•_{k,u}, with h̄_{1z} := 0, for k = 2, ..., i. This completes our computation of the mean waiting time given the observable quantities.

Transition probabilities
As a next step we evaluate the transition probabilities at arrival epochs, i.e., the distribution of (N_{t+}, B_{t+}) conditional on N_{0+} = k and B_{0+} = u. The analysis borrows elements from e.g. de Kemp et al. (2022); Wang (1993, 1997). We start by separately analyzing the cases N_{t+} = k + 1 and N_{t+} = 1. As these are the two most extreme scenarios (i.e., no client has left the system or all clients have left the system), we use the notation P↑_{k,u}(t) and P↓_{k,u}(t) for the corresponding transition probabilities. First observe that N_{t+} = k + 1 corresponds with the scenario that no client leaves in the interval under study.
Also, considering the scenario N_{t+} = 1, in which all clients leave the system before time t, it is noticed that we have already evaluated the corresponding right-hand side above. We are left with the evaluation, for v ∈ (0, t) and ℓ = 2, ..., k, of p_{kℓ,uv}(t). In order to analyze p_{kℓ,uv}(t), in the sequel we work extensively with terms involving the independent random variables E(k, μ) and E(ℓ − k, μ). The following lemma gives an elegant expression for these terms.
Lemma 4. For v ≤ t and k < ℓ, the quantity under consideration can be expressed in terms of Poisson and binomial probabilities.

What is stated in Lemma 4 has an appealing intuitive explanation. Observe that the number of events before t should be at least k and should be smaller than ℓ; this leads to the Poisson probability. Conditional on this number being j, more than j − k of them should happen between t − v and t, which, for each of them, happens with probability v/t, and in addition the corresponding events are independent; this leads to the binomial probability. A formal proof is given in Appendix A.
In order to compute p_{kℓ,uv}(t), we first consider the probability of {N_{t+} = ℓ, B_{t+} ≤ v} conditional on {N_{0+} = k, Z_{0+} = z}; applying the usual procedure, we will later use the resulting probabilities to express their analogs in which the conditioning is on {N_{0+} = k, B_{0+} = u}. The reasoning behind this expression is the following. It is first observed that we can rewrite the event in terms of the client counts at times (t − v)− and t−: at time t−, there should be precisely ℓ − 1 clients, whereas at time (t − v)−, this number of clients was not reached yet. For an example we refer to Fig. 2. Hence, at time (t − v)−, the number of phases that have been completed is less than K − z + 1 (certain phases of the client in service at time 0), plus (k − ℓ)K (certain phases of the k − ℓ clients who must have left before time t, the last of which leaves at time t − v at the latest), plus Bin(k − ℓ + 1, 1 − p) (uncertain phases of these k − ℓ + 1 clients). Adding up these numbers, one finds I_{kℓmz} in case Bin(k − ℓ + 1, 1 − p) = m. At the same time, it should not be the case that by time t another client has been served, i.e., the number of phases completed cannot exceed I_{kℓmz} + K if this client has K phases (this occurs with probability p), and I_{kℓmz} + K + 1 otherwise. Combining the above expressions, we find that (12) follows, with q'_{kℓ,zv}(t) denoting the derivative of q_{kℓ,zv}(t) with respect to v.

Dynamic program
As before, we let C_i(k, u), with i = 1, ..., n and k = 1, ..., i, correspond to the cost incurred from the arrival of the i-th client onward, given that there are k clients in the system immediately after the arrival of this i-th client, and that, in addition, the elapsed duration of the job in service immediately after this arrival is u.
Theorem 3. We can determine the C_i(k, u) recursively, for i = 1, ..., n − 1, k = 1, ..., i, and u ≥ 0, with a terminal condition for k = 1, ..., n and u ≥ 0.

From a numerical perspective, solving the above dynamic programming problem is challenging, in particular due to the fact that the argument u can in principle attain all non-negative real values (smaller than the current time, that is). More precisely, it is a dynamic programming problem in which one of the arguments is real-valued. In practice, this means that one has to impose some sort of discretization on the variable u in order to approximate the optimal strategy. The approach we propose here is to evaluate, for a given Δ > 0, the quantity ξ_i(k, m), which is to be interpreted as an approximation for C_i(k, mΔ) that becomes increasingly accurate as Δ ↓ 0, for k = 1, ..., i and m ∈ N_0 (with N_0 := N ∪ {0} = {0, 1, ...}). Define by [x]_y the truncation of x to the interval [0, y], for some y > 0: [x]_y := min{max{0, x}, y}. Also, for t ∈ N_0 and Δ > 0, we work with the quantities q_{kℓ,mj}(t), to be interpreted as approximations of the transition probability of the number of clients jointly with the age of the client in service (truncated to a multiple of Δ, that is), from the state (k, mΔ) to the state (ℓ, jΔ), over a time interval of length tΔ, for some t ∈ N_0. The approximation of the dynamic programming recursion thus becomes a minimization over t ∈ N_0 of the per-slot cost terms plus the sum of the products q_{kℓ,mj}(t) ξ_{i+1}(ℓ, j), with the analogous terminal condition for k = 1, ..., n and m ∈ N_0.

Hyperexponential distribution
In this case the service time B equals, with probability p ∈ [0, 1], an exponentially distributed random variable with mean μ_1^{−1}, and with probability 1 − p an exponentially distributed random variable with mean μ_2^{−1}. This means that T = diag(−μ_1, −μ_2) and α_1 = p, i.e., α = (p, 1 − p)^T.
As in the weighted Erlang case, the phase is not observable: it cannot be seen which of the two exponential random variables B corresponds to. As before, we determine the distribution of the phase conditional on the elapsed service time; it is easily checked that γ_1(u) = p e^{−μ_1 u}/(p e^{−μ_1 u} + (1 − p) e^{−μ_2 u}) and γ_2(u) = 1 − γ_1(u). Mimicking the structure we have used in the weighted Erlang case, we subsequently discuss the mean idle time, the mean waiting time, the transition probabilities, and the resulting dynamic program.

Mean idle time
Again we first determine the mean idle time, given Z_{0+} = z, i.e., f̄_{kz}(t). Translating the expression for f̄_{kz}(t) into an expression for f•_{k,u}(t) (i.e., a quantity in which the conditioning is on information that is observable) works as in the weighted Erlang case. Below we concentrate on the case z = 1, but evidently the case z = 2 can be dealt with fully analogously, due to the inherent symmetry of the hyperexponential distribution. The starting point is an identity in terms of the quantities ρ_t[m, k], defined through the independent Erlang random variables E(m, μ_1) and E(k, μ_2). The following lemma, proved in Appendix A, provides an expression for these quantities. We proceed by pointing out how the term ρ_t[m, k] can be evaluated. To this end, we set up a recursive procedure: integration by parts yields, for k, m = 1, 2, ..., a relation that after rearranging provides a recursion. In addition, ρ_t[m, 0] and ρ_t[0, k] can be given in closed form (with μ_1 ≠ μ_2). Using the recursive relation in combination with these initial values, we can evaluate the quantities ρ_t[m, k].

Mean waiting time
Again the evaluation of ḡ•_{k,u}(t) is deferred to Appendix C; here we focus on the mean waiting time h•_{k,u} of the newly arrived client.

Transition probabilities

We continue by analyzing the probability of the event {N_{t+} = ℓ, B_{t+} ≤ v}. We start by considering the case ℓ ∈ {2, ..., k}, i.e., for the moment we exclude the scenarios in which no or all clients leave before time t. As we have observed, we can rewrite this probability in terms of the quantities χ^v_{t,1}[k, ℓ] and χ^v_{t,2}[k, ℓ], with E(k, μ_1), E(ℓ, μ_2), and E(1, μ_i) independent Erlang distributions. The following lemma, proved in Appendix A, provides expressions for these quantities.

Lemma 6. For v ≤ t and k, ℓ = 1, 2, ..., the stated expressions hold.

It is relatively straightforward to provide closed-form expressions for the probabilities χ^v_{t,1}[k, ℓ] and χ^v_{t,2}[k, ℓ] in case k and/or ℓ equals 0, thus complementing the cases dealt with in Lemma 6; for χ^v_{t,1} these follow using Lemma 4. The expressions for χ^v_{t,2}[0, ℓ] and χ^v_{t,2}[k, 0] follow by symmetry: they are the same, but with the roles of μ_1 and μ_2 swapped. We are left with analyzing the cases N_{t+} = k + 1 and N_{t+} = 1. Recall that N_{t+} = k + 1 corresponds with the scenario that no client leaves in the interval under study, whereas N_{t+} = 1 corresponds with the scenario that all clients leave before time t. Here it is noticed that the probabilities appearing on the right-hand side can be evaluated with arguments similar to the ones used above (in particular, using that the number of clients waiting at time 0 whose service corresponds to an exponential distribution with rate μ_1 is binomially distributed with parameters k − 1 and p).

Dynamic program
We are now in a position to define our dynamic programming algorithm. The cost C_i(k, u) is as defined before.
As in the case of the weighted Erlang service times, this procedure can be numerically evaluated by discretizing the elapsed service time u (as multiples of some Δ > 0, that is), and by searching over arrival times that are also multiples of this Δ.

Numerical evaluation
As mentioned above, primarily due to the fact that we have to compute the optimal t ≥ 0 (or t ∈ N_0, in the discretized version) for any value of the elapsed service time u, performing the dynamic programming procedure is numerically challenging. In this section we present our applet that provides dynamic appointment schedules, provide more detail about its implementation, and discuss a number of illustrative experiments.

Applet
We have developed an applet providing the optimal appointment time of the next client for a given instance.3 In this context, an instance is a combination of the SCV of the (homogeneous) service times S(B), the weight ω, the number of clients n, the index i ∈ {1, ..., n} of the entering client, the number of clients in the system k ∈ {1, ..., i} when the i-th client enters (including this entering client), and the elapsed service time u ≥ 0 of the client in service (if any). To cover situations we came across in the service system literature, we let S(B) ∈ [0.2, 2]. The weight ω can be chosen in the set {0.1, 0.2, ..., 0.9}, and one can choose the number of clients n ∈ {1, 2, ..., 20}. Note that above we normalized time such that E B = 1, whereas in the applet E B can be freely chosen.

Implementation
To avoid doing computations in real time, we have chosen an approach in which the computations are performed off-line on a dense grid of instances, varying ω and the SCV. It is inevitable to follow an approach that uses pre-computed values, due to the intrinsic complexity of our dynamic programming problem, which involves multiple dimensions of which one (the elapsed service time) is effectively continuous; this can be considered a manifestation of the notorious curse of dimensionality.
It is important to note that the majority of the computation time of ξ_i(k, m), for a given instance, lies in finding the value of t that minimizes the expression in (13), which we denote by τ_i(k, m). Starting with i = n and decreasing i in steps of 1, we have gained a significant speed-up by storing all computed values of ξ_{i+1}(·, ·) and caching the q_{kℓ,mj}(t) to avoid multiple computations of the same quantity. To quickly find the optimal value of t, it is essential to have a good initial guess. We have employed the following three-step approach for each value of ω:

1. Observe that S(B) = 1 corresponds to the relatively straightforward case of exponentially distributed service times, as dealt with in Section 3. Hence, Step 1 consists of computing ξ_i(k, m) and τ_i(k, m) for S(B) = 1, for k = 1, ..., i. Note that ξ_i(k, m) and τ_i(k, m) do not depend on m, due to the memoryless property of the exponential distribution.

2. In Step 2, our goal is to obtain good initial estimates of τ_i(k, m). We choose a relatively large step size Δ, which reduces the computation time of (13), to obtain good initial estimates; the case i = n is particularly easy, because in this case only the last client is suffering from potentially high waiting times.
3. In the final step, we take a smaller value of Δ (typically a factor 10 smaller) to compute ξ_i(k, m) and τ_i(k, m) for all values of i, k, and m. Finding the minimum of (13) is done by a nearest-neighbor search starting in the estimate obtained in Step 2 and stopping when the absolute decrease in the objective function is smaller than ε = 10^{−5}. Since Δ is small, the difference between ξ_i(k, m) and ξ_i(k, m + 1) is very small as well.
For this reason, we have decided to compute ξ_i(k, m) only for m ∈ {0, 10, 20, ..., 250}, and to obtain the other values by interpolation or extrapolation.
A final remark: when computing the optimal schedule (i.e., the arrival times) for a given value of n, say n = N, we can automatically derive the schedules for all n = 1, 2, ..., N − 1 from this schedule. To clarify this, suppose that we have n = N clients that need to be scheduled. Our algorithm computes the optimal arrival times τ_i^{(N)}(k, m) and the corresponding costs ξ_i^{(N)}(k, m) for i = 1, ..., N, where we use the superscript (N) to indicate that these quantities correspond to the schedule for N clients. Using this schedule, we can immediately obtain the optimal schedule for scheduling N′ < N clients. Indeed, due to the assumption of homogeneous service times, the transition probabilities and the expected idle and waiting times do not depend on the client position i, and therefore the recursion for N′ clients coincides with the last N′ steps of the recursion for N clients.

Experiments
In this subsection we present a series of experiments that highlight the performance of our dynamic scheduling approach.We start by providing insight into the efficiency gain, to later study in greater detail the role played by the model parameters.

Experiment 1: Efficiency gain
The first experiment quantifies the gain of using dynamic schedules relative to using precalculated schedules. In Fig. 3 we show, for ω ∈ {0.1, 0.3, ..., 0.9} and n = 15, as a function of S(B), the cost of the dynamic schedule and the cost of the precalculated schedule. In Table 4, we see the ratio between the two costs. We conclude that using dynamic schedules particularly pays off for high values of the weight ω, in line with what we discussed in Remark 2, and for high values of the SCV of the service times S(B).
Obviously, in practice, one does not always have good a priori estimates of (the parameters of) the service-time distributions. We continue by doing a simulation experiment in which we sample a random SCV value in each run. This provides more insight into the impact of fluctuations in the SCVs on the quality of the dynamic schedules. Additionally, it enables us to create confidence intervals (CIs) for the mean decrease in costs. The SCV of the clients is sampled uniformly at random from the interval [0.5, 1.5], while the mean service time is set equal to 1. For each set of parameter values, we determine the optimal precalculated schedule and the optimal dynamic schedule and compare their costs. We repeat this simulation experiment 100,000 times and compute 95% CIs for the mean difference of the costs of precalculated and dynamic scheduling, for ω ∈ {0.2, 0.5, 0.8}. The results, shown in Table 5, confirm that there is indeed a significant difference. For practical applications, the prediction intervals (PIs) are arguably even more interesting. Each prediction interval consists of the 2.5% and the 97.5% percentiles of the simulated costs, which is an excellent quantification of the magnitude of the random fluctuations one can expect in the costs. The PIs indicate that fluctuations of more than 30% can be expected due to the randomness in the parameters, confirming that it is really important for practical implementation of the scheduling policies to have accurate estimates of the service-time distributions. In this example, we have varied the variance of the service-time distributions; in Example 3 in Appendix B, we study the impact of varying their means in a heterogeneous case.
Experiment 2: Impact of model parameters

We continue by studying the impact of incorporating the elapsed service time u. As argued before, in the case of exponentially distributed service times this elapsed service time has no impact, due to the memoryless property. Define by τ_i(k, u) the optimizing argument in the definition of C_i(k, u).

Table 4
Cost of dynamic and precalculated schedule in Experiment 1, for various values of S(B) and ω ∈ {0.1, 0.2, ..., 0.9}.

Table 5
Cost of dynamic and precalculated schedule, with n = 15 and random S(B) ∈ [0.5, 1.5]; 95% confidence intervals and prediction intervals are computed for the difference in costs.

In Fig. 4 we show the dependence of the optimal interarrival time τ_i(k, u) and the cost C_i(k, u) on u. For illustrational purposes, we do this for specific values of i, k, and ω, but the conclusions below hold in general. In the first place, we observe that in case S(B) = 1 the elapsed service time has no impact on τ_i(k, u) and C_i(k, u), in line with the reasoning above. In case S(B) < 1, the service times are 'relatively deterministic'. This explains why in this regime both τ_i(k, u) and C_i(k, u) decrease in u: the longer the elapsed part of the current service, the sooner it is expected to finish, and the more accurately this completion epoch can be predicted. If, on the other hand, S(B) > 1, then the service times are 'less deterministic' than when stemming from the exponential distribution, so that τ_i(k, u) and C_i(k, u) increase in u. In the latter case, in which we work with the hyperexponential distribution, the remaining service time is increasing in u: with increasing probability, the sampled exponential distribution is the one with rate min{μ_1, μ_2} (i.e., the one with the highest mean).
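The last observation can be made concrete: for a hyperexponential service time, the mean residual service time E[B − u | B > u] is increasing in the elapsed time u. A quick check (with our own helper function), using the balanced-means fit for S(B) = 2 and E B = 1:

```python
import math

def hyperexp_mean_residual(u, p, mu1, mu2):
    """E[B - u | B > u] for a hyperexponential B: exponential with rate mu1
    w.p. p, and with rate mu2 w.p. 1 - p."""
    surv = p * math.exp(-mu1 * u) + (1 - p) * math.exp(-mu2 * u)
    tail = p / mu1 * math.exp(-mu1 * u) + (1 - p) / mu2 * math.exp(-mu2 * u)
    return tail / surv  # integrated survival function over the survival function

# balanced-means fit for mean 1 and SCV 2
p = 0.5 * (1 + math.sqrt((2 - 1) / (2 + 1)))
mu1, mu2 = 2 * p, 2 * (1 - p)
```

At u = 0 the mean residual time equals the mean E B = 1; as u grows it increases towards 1/μ_2, the mean of the slower branch, which is exactly why τ_i(k, u) and C_i(k, u) increase in u when S(B) > 1.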

Experiment 3: Impact of ignoring coefficient of variation
In many approaches, owing to its convenient properties, it is assumed that service times have an exponential distribution. In this experiment, our objective is to quantify the loss of efficiency due to ignoring the non-exponentiality of the service-time distribution. In the experiment performed, we let the service times be endowed with an SCV different from one. The following three schedules will be compared: (i) a static schedule assuming exponential service times, computed using the methodology described in e.g. Kuiper et al. (2015); (ii) a dynamic schedule assuming exponential service times, computed using the DP-based machinery developed in Section 3; (iii) a dynamic schedule assuming the correct SCV, computed using our methods developed in Section 5.
As before, all mean service times are set equal to one. We estimate the cost of (i) and (ii) by simulation and compare it with the cost of (iii). As can be seen in Figs. 5 and 6, wrongly assuming exponentiality may lead to a significant loss of efficiency. Below we provide more detailed observations.
• Particularly when comparing (i) and (iii), which we do in Fig. 5, we observe a substantial loss: using a precalculated schedule based on an SCV equal to 1 performs considerably worse than the dynamic schedule based on the correct SCV. This conclusion is in line with what was observed in Experiment 1.
• When comparing (ii) and (iii), which we do in Fig. 6, the loss is relatively small for values of the SCV around 1. This may have to do with the fact that in dynamic schedules there are ample opportunities to correct when unforeseen scenarios occur, thus mitigating the effect of the randomness in the service times. We observe, however, that for lower values of the SCV, the loss can be substantial; it is noted that in the healthcare context such low values are common, as was reported in Cayirli et al. (2006).

Discussion
In this section we reflect on two relevant aspects of our approach. The first is related to its robustness: what is the impact of replacing general distributions by their phase-type counterparts (i.e., phase-type distributions with the same mean and SCV)? The second relates to an attempt to substantially reduce the computation times needed, by ignoring a part of the cost function (the so-called value-to-go).

Non-phase-type service times
Concretely, we systematically study the inaccuracy that may be introduced by replacing the service-time distributions by their phase-type counterparts. One would like to know how well our method works if the actual service times are not of phase type, but rather e.g. lognormally (motivated by various empirical studies in the healthcare context, such as Cayirli et al. (2006)) or Weibull distributed.
To be able to quantitatively assess this issue, we need to have access to optimal dynamic schedules for the situation that the clients' service times have lognormal or Weibull distributions. We have obtained these by discretizing time and expressing all quantities relevant in the dynamic programming formulation of the optimal dynamic schedule in terms of convolutions. This technique has also been applied in related papers (cf. Deceuninck, Fiems, & de Vuyst (2018); Begen (2011); de Vuyst & Bruneel (2014); Zacharias & Yunes (2020)). Then existing software for dealing with convolutions is used for the numerical evaluation. We refer to Appendix D for a compact description of the underlying algorithm.
We performed the following experiment. Denote by γ_XY the total cost when the dynamic schedule is based on the service times having distribution X, while in reality they have distribution Y; here the indices X and Y are P (i.e., phase type), L (i.e., lognormal), or W (i.e., Weibull). These costs are determined by simulating the system under the dynamic schedule that was determined under X, using service times that are sampled under Y. Table 6 presents the output (where we have chosen n = 15 and ω = 0.5).
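For completeness, matching a lognormal or Weibull distribution to a prescribed mean and SCV, as one would need in order to sample the service times in such an experiment, can be sketched as follows (a minimal illustration based on the standard moment identities; the function names are ours):

```python
import math
from scipy.optimize import brentq
from scipy.special import gamma as Gamma

def lognormal_params(mean, scv):
    """Lognormal parameters (mu, sigma) matching a given mean and SCV,
    using E B = exp(mu + sigma^2/2) and SCV = exp(sigma^2) - 1."""
    sigma2 = math.log(1.0 + scv)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)

def weibull_params(mean, scv):
    """Weibull (shape, scale) matching a given mean and SCV; the shape is
    found numerically from SCV = Gamma(1+2/k)/Gamma(1+1/k)^2 - 1."""
    def scv_of_shape(k):
        return Gamma(1 + 2.0 / k) / Gamma(1 + 1.0 / k) ** 2 - 1.0
    shape = brentq(lambda k: scv_of_shape(k) - scv, 0.1, 50.0)
    scale = mean / Gamma(1 + 1.0 / shape)
    return shape, scale
```

For SCV equal to 1 the Weibull fit reduces to the exponential distribution (shape 1), as it should.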
To assess the accuracy of our phase-type-based method, one needs to compare the columns γ_PL and γ_LL in the lognormal case, and the columns γ_PW and γ_WW in the Weibull case. The remarkably strong agreement indicates that the dynamic schedules based on the phase-type approximation perform only marginally worse than the dynamic schedules based on the actual distribution (lognormal or Weibull). The numbers in the table provide a strong justification for replacing the service times with their phase-type counterparts. We note that the above findings are in line with what was found in the corresponding experiments for the static case, as reported in Kuiper (2016, Section 2.6.1).
In the last column of Table 6 we also included the cost in case of phase-type service times, with the dynamic schedule being based on the same phase-type distribution. The numbers reveal that in some cases there are substantial differences between γ_PP and the values in the other columns. The conclusion from this observation is that, while using the phase-type-based optimal schedule leads to different costs in the lognormal, Weibull, and phase-type cases (compare the columns γ_PL, γ_PW, and γ_PP), in all three cases these numbers are close to the respective optimal costs.
The main conclusion from the above experiment is that virtually no accuracy would have been gained by basing the dynamic schedules on the actual service-time distributions rather than on their phase-type counterparts. In addition, as pointed out in Schwarz, Selinka, & Stolletz (2016), the alternative method used to obtain the schedules for lognormal and Weibull service times (i.e., discretization of the service times and numerical evaluation of the convolutions) is markedly slower than our approach. As an indication, for small instances with n = 4 the computation time of our approach is about 60% of that of the alternative approach; for n = 8 this is about 35%, for n = 12 about 17%, and this percentage decreases even further for larger n. In these experiments, we tuned the underlying numerical algorithms such that both approaches reached the same level of precision. It is in addition noted that, when increasing this level of precision, the relative advantage (in terms of computation times) of our method increases even further.

Ignoring the value-to-go
A naïve, and obviously suboptimal, approach is to schedule the next client without considering future arrivals. More concretely, one could determine the length of the interval until the next arrival, say x, by minimizing ω f_k(x) + (1 − ω) g_k(x), but now with g_k(x) defined as the waiting time of the next client only; we thus ignore the so-called value-to-go. The attractive aspect of this approach is the low computational effort needed. Observe that, as the effect of future clients is not incorporated, the dynamic schedule depends only on the current number of clients k and the elapsed service time u (of the client in service, that is); the total number of clients n does not play a role.
We proceed by providing further details on this approach, in which the value-to-go is ignored. Suppose that after a client's arrival there are k clients in the queue, of whom one was already in service (her elapsed service time being u). Let S ≡ S(k, u) denote the total (remaining) service time of these clients, with f_S(·) its density and F_S(·) its cumulative distribution function; these can be evaluated using the techniques of Section 5, as we will explain below. Then the mean idle time until the next client's arrival is E (x − S)^+, whereas the mean waiting time of the next client is E (S − x)^+. Now performing the optimization, in which we recognize the classical newsvendor problem (Stevenson, 2013), we obtain (by differentiating with respect to x, and equating the resulting expression to 0) that the optimizing x solves F_S(x) = 1 − ω; in words, this means that the optimal interarrival time is the (1 − ω)-quantile of S. This is a highly natural formula, with an appealing interpretation: it quantifies how much the interarrival time should go up when decreasing ω (evidently, the more we care about idle times, the smaller the interarrival time should be).
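As an illustration of the quantile rule, suppose (for simplicity, and ignoring the elapsed-time correction of Section 5) that S(k, u) is approximated by the sum of k exponential(μ) service times, so that S has an Erlang distribution. A minimal sketch (the interface is ours) then reads:

```python
from scipy.stats import gamma

def next_interarrival(k, mu, omega):
    """Sequential rule ignoring the value-to-go: the (1 - omega)-quantile
    of S, with S approximated as Erlang(k, mu), i.e., the sum of k
    exponential(mu) service times (no elapsed-time correction)."""
    return gamma.ppf(1.0 - omega, a=k, scale=1.0 / mu)
```

For ω = 1/2 this returns the median of S; decreasing ω, i.e., putting less weight on idle time, increases the interarrival time, in line with the interpretation above.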
We thus conclude that for ω = 1/2 the optimal interarrival time x equals the median of S; as an aside we mention that, interestingly, it is also argued in Kemper, Klaassen, & Mandjes (2014) that if we had wanted to minimize the sum of the second moments of I(x) and W(x) (again with ω = 1/2), then x equals the mean E S. In Kemper et al. (2014) this idea of ignoring the value-to-go has been analyzed in detail for the static setting; there it is called sequential optimization (whereas minimizing the total cost is called simultaneous optimization). As pointed out in Kemper et al. (2014), the sequential problem optimizes an intrinsically different objective function than its simultaneous counterpart. In the static context of Kemper et al. (2014) it is for instance observed that the interarrival times do not have the dome shape; this makes sense, as the drop at the right part of the dome shape is essentially due to the fact that there are relatively few remaining clients to be scheduled, i.e., information one does not have in the sequential setup.
The above considerations also mean that in our dynamic context it is not clear how to rigorously justify the use of a sequential approach, other than by its low computational burden. Indeed, as soon as one is able to determine the distribution function of S ≡ S(k, u), for any k and u, one can apply this rule without the need to solve a complex dynamic programming problem. Importantly, with the formulas we have derived, it is straightforward to determine the distribution of S. Concretely, it can be checked that F_{S(k,u)}(x) is the P↓_{k,u}(x) computed in Sections 5.2 (for the weighted Erlang case) and 5.3 (for the hyperexponential case). We conclude this subsection by reporting on a series of numerical experiments, quantifying the increase in the total cost when using the sequential approach described above (instead of the optimal dynamic schedule). We performed numerical computations for various values of n, S(B), and ω. The first conclusion is that for small values of ω the value-to-go can safely be ignored; this makes sense, as for these ω the schedule will be such that the number of waiting clients remains low. In the second place, the effect of ignoring the value-to-go becomes more pronounced with increasing S(B). And, finally, there is little effect of the number of clients n on the (percentage-wise) increase of the cost. As an illustration, Table 7 provides the cost γ− that ignores the value-to-go, the cost γ of our approach, and the ratio γ/γ−, for n = 10 and ω = 0.9.

Table 7
Cost of optimal dynamic schedule and the variant that ignores the value-to-go, as well as their ratio, for n = 10 and ω = 0.9.

Concluding remarks
In this paper, we have studied dynamic appointment scheduling, focusing on a setting in which at every arrival the arrival epoch of the next client is determined. The fact that the service times may have distributions with an SCV substantially different from 1 has been dealt with by working with phase-type distributions. Our approach therefore extends beyond the conventional framework in which service times are exponentially distributed. It led us to resolve the specific complication that the elapsed service time of the client in service has to be accounted for. We rely on dynamic programming to produce the optimal dynamic schedule, where the state information consists of the number of clients in the system and the elapsed service time of the client in service (if any), besides the number of clients still to be scheduled. We have developed an applet that provides the optimal dynamic schedule in real time. Experiments have been performed that provide insight into the achievable gain as a function of the model parameters.
The model presented in this paper can be seen as a base model that can be enhanced by adding realistic features. In the first place, borrowing ideas developed in Kong, Li, Liu, Teo, & Yan (2020) and Kuiper, Mandjes, de Mast, & Brokkelkamp (2022), no-shows and additional non-anticipated arrivals can be included in our framework. As explained in detail in Kuiper et al. (2022), this can be done in a natural way by adjusting the parameters of the service times. Overtime can be dealt with similarly, as pointed out in, e.g., Kuiper (2016, Section 6.4.3). Also, it is worthwhile to explore including non-punctuality, potentially building on ideas developed in Kemper et al. (2014) and Zacharias & Yunes (2020).
It is important to notice that in some settings (such as the surgery example and the repairman example mentioned in the introduction), there could be two 'stages of waiting': (1) a 'standby mode' in which the customer has to be at the service location or close to their home, and (2) a true 'waiting mode' in which service is imminent and, for example, anesthesia and drug treatment starts or the customer has to be in the home expecting the repairperson to turn up. The costs of being in the waiting mode for a long time are likely to be much higher than those in the standby mode. While our current model does not explicitly incorporate 'standby time', in a more advanced model it could be included.
In our framework, at any arrival epoch, we schedule the arrival epoch of the next client. In the introduction, we provided a set of application areas in which such a mechanism is natural. An evident alternative is to reschedule periodically, for instance every hour. It is noted, though, that the latter approach has the intrinsic drawback that at some rescheduling moments potentially no client is present. Given that it typically takes the client some time to arrive at the service facility, this is an undesirable feature (one that the approach presented in this paper does not have).
It is anticipated that the framework presented in this paper can be extended in various ways. For instance, a challenging variant concerns the situation in which at every arrival not the next client, but the one after that, is scheduled. One could in addition think of a setup in which the full remaining schedule is periodically recalculated, for instance every hour; at every adaptation, the clients are sent an update of their scheduled arrival epochs. In another version, potentially useful for a delivery service, at any adaptation also the order of the clients can be optimized.
where, for x ∈ [0, 1] and k, these are the incomplete beta function and the beta function, respectively. It is a well-known property (Davis, 1972) of these functions that, in particular, it holds that, for any x ∈ [0, 1] and k, i ∈ N, Combining the above findings, after some standard algebra, we obtain the stated result.
Proof of Lemma 5.
Using the same argument, it follows that σ t [ m, k ] equals This proves the claim.
Proof of Lemma 6.

Appendix B. Numerical Results for the Heterogeneous Exponential Case
In this appendix, we present some numerical results for the dynamic appointment scheduling model with heterogeneous exponentially distributed service times, as discussed in Section 4. With heterogeneous service times, there is an additional, interesting challenge: finding the optimal scheduling order of the clients. As backed by the results in de Kemp et al. (2022), the so-called smallest-variance-first rule (ordering the clients such that their mean service times, and hence also the corresponding variances, are increasing) is a logical candidate.
Example 3. As in Example 2, we wish to quantify the gain achieved by scheduling dynamically rather than in advance. As before, we let K_pre(n, ω) and K_dyn(n, ω) be the value of the objective function of the precalculated and the dynamic schedule, respectively. Again r(n, ω) denotes the ratio of K_dyn(n, ω) and K_pre(n, ω).
The results are shown in Table 8. One of the conclusions from this table is that the cost grows superlinearly in the number of clients n. As in the homogeneous case, the gain is more pronounced when ω and/or n are large.
We proceed by studying the effect of the spread of the parameters. To this end, we take n = 10 and equally spaced parameters in an interval of length δ that is centered around 1. More formally, we choose, for i = 1, ..., n = 10 and δ ∈ (0, 2), μ_i = 1 − δ/2 + (i − 1) δ/(n − 1). The obtained results are given in Table 9. For small values of δ, the parameters are relatively homogeneous, so that the gains resemble those achieved in the homogeneous case. The main conclusion is that the higher the value of δ (i.e., the more spread in the parameters), the higher the gain.
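Under this reading of the construction (equal spacing in an interval of length δ centered at 1; the symbol δ and the function name are our own), the parameters can be generated as follows:

```python
def spaced_rates(n=10, delta=1.0):
    """Equally spaced service rates in an interval of length delta
    centered at 1: mu_i = 1 - delta/2 + (i - 1) * delta / (n - 1).
    Requires delta < 2 so that all rates stay positive."""
    return [1 - delta / 2 + (i - 1) * delta / (n - 1)
            for i in range(1, n + 1)]
```

For delta = 1 and n = 10 this yields rates running from 0.5 up to 1.5, with average rate 1.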
As announced before, we can now also analyze the effect of the order in which the clients are scheduled. We do so for equally spaced μ_i in the interval [0.5, 1.5] and n = 10. We first compute the cost when the μ_i are increasing. Secondly, we pick a randomly selected permutation of the μ_i. Finally, we take decreasing μ_i, corresponding to the smallest-variance-first strategy studied in de Kemp et al. (2022). The results are given in Table 10. The main conclusion is that the numerics align with the findings of de Kemp et al. (2022), in the sense that the cost of serving clients with decreasing μ_i is substantially lower than for increasing or randomly ordered μ_i.
It is interesting to conduct an experiment similar to Experiment 1 in Section 6.3, in which we studied the impact of randomized parameter values on (the quality of) the schedules. This time, we fix the number of clients at n = 10 and sample the μ_i values uniformly at random from the interval [0.5, 1.5]. We compute 95% confidence intervals (CIs) and prediction intervals (PIs) for the difference of the costs of precalculated and dynamic scheduling, for ω ∈ {0.2, 0.5, 0.8}, based on 100,000 independent simulation runs.
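Such intervals can be obtained along the following lines. This is a generic sketch of how CIs and PIs are typically computed from simulation output (a normal approximation for the CI and empirical quantiles for the PI), which need not coincide with the exact construction used for Table 11:

```python
import numpy as np

def ci_and_pi(samples, level=0.95):
    """Normal-approximation confidence interval for the mean, and an
    empirical prediction interval for a single future observation."""
    x = np.asarray(samples, dtype=float)
    n = len(x)
    m, s = x.mean(), x.std(ddof=1)
    z = 1.959963984540054            # 97.5% standard-normal quantile
    ci = (m - z * s / np.sqrt(n), m + z * s / np.sqrt(n))
    alpha = 1.0 - level
    pi = (np.quantile(x, alpha / 2), np.quantile(x, 1 - alpha / 2))
    return ci, pi
```

Note that the CI shrinks at rate 1/√n as the number of simulation runs grows, whereas the PI converges to the corresponding quantiles of the cost-difference distribution itself.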
The results, shown in Table 11, are comparable to those in Experiment 1, confirming once again that dynamic scheduling strongly decreases the cost of a schedule. The absolute decrease remains approximately constant for ω > 0.5, something that can also be observed in Tables 9 and 10.

Fig. 2. Example with k = 5 and ℓ = 3. B_0^+ is the residual service time of client 0 (i.e., the client in service at time 0), and B_i is the service time of client i.

Fig. 5. Ratio of the cost when using a dynamic schedule based on the correct SCV to the cost when using a precalculated schedule based on an SCV equal to one, for n = 15 and several values of ω.

Fig. 6. Ratio of the cost when using a dynamic schedule based on the correct SCV to the cost when using a dynamic schedule based on an SCV equal to one, for n = 15 and several values of ω.

Table 2
Cost of dynamic and precalculated schedule for Example 2 .

Table 3
Optimal interarrival times τ (k ) corresponding to the stationary schedule.

Table 6
Cost of basing optimal dynamic schedule on a phase-type distribution, while the actual service-time distribution is lognormal or Weibull.

Table 11
Cost of dynamic and precalculated schedule for Example 3, with n = 10 and random μ_i. 95% confidence intervals and prediction intervals are computed for the difference in costs.