Staffing for many-server systems facing non-standard arrival processes

Arrival processes to service systems often display (i) larger than anticipated fluctuations, (ii) a time-varying rate, and (iii) temporal correlation. Motivated by this, we introduce a specific non-homogeneous Poisson process that incorporates these three features. The resulting arrival process is fed into an infinite-server system, which is then used as a proxy for its many-server counterpart. This leads to a staffing rule based on the square-root staffing principle that acknowledges the three features. After a slight rearrangement of servers over the time slots, we succeed to stabilize system performance even under highly varying and strongly correlated conditions. We fit the arrival stream model to real data from an emergency department and demonstrate (by simulation) the performance of the novel staffing rule.


Introduction
The design of staffing algorithms for service systems has been attracting a great deal of interest ever since Erlang published his first papers. The delay experienced by the system's users is predominantly caused by the inherent randomness in the arrival stream. From a managerial point of view, it is natural to address this randomness in such a way that operational costs and customer satisfaction are balanced. An important complication that recently received a lot of attention is that, as has been observed in various empirical studies, the variance of the arrival stream is larger than the corresponding mean; this phenomenon, called overdispersion, is not captured by standard Poisson processes. The challenge that arises is to develop staffing algorithms that are based on more sophisticated, realistic arrival stream models. See [9,14,19,27,35] for related work on the design of staffing rules for service systems with overdispersed input. Such staffing rules have broad application potential, in settings that include call center environments [1,3,5,7,40], cloud computing [34,36], and health care delivery [28].
Staffing in stochastic service systems. Due to the intrinsic stochastic variability of most service systems, they are naturally described by a stochastic model. More specifically, queueing models have proven to reveal 'good' staffing rules; they determine the number of staff needed to effectively but efficiently cope with the demand imposed on the system. Evidently, any staffing rule should be such that the average workload brought in per time unit is smaller than the system capacity, to make sure that the delays experienced by patients remain bounded. Moreover, certain performance targets are set to guarantee the patients a specific 'Quality-of-Service' (QoS) level, which is typically expressed in terms of waiting time. The objective of the system operator is, on the other hand, to bring the utilization level as close as possible to 1, so as to control operational cost by efficiently using the resources. Staffing rules aim to strike a proper balance between the interests of the patients and the system operator.
The goal that is often strived for is to design a service system in such a way that its patients go into service more or less immediately upon arrival [19,40]. Consequently, a commonly chosen service-level agreement (SLA) is to bound the probability of delay by some (typically small) QoS parameter ε > 0 [5,18,42]. In the typical situation that the arrival rate varies over time, this delay probability is clearly time-dependent too. The manager's objective would be to set up a staffing schedule that minimizes the operational costs under the constraint that the delay probability, at any point of time, does not exceed ε. The ideal staffing algorithm is one that stabilizes the probability of delay over time around some value close to ε, bringing the system in the so-called Quality-and-Efficiency-Driven (QED) regime [13]. Note that, in case there are periods where the algorithm induces a probability of delay significantly smaller than ε, it would mean that (at least locally) 'too many' resources are deployed; the variability in the arrival stream is anticipated suboptimally.
Realistically modeling the arrival process. Queueing models have been used intensively to describe and understand congestion phenomena that arise in case of scarce resources. As a first step in designing a realistic model, it is key to study the arrival process at hand. We will continue this introduction with a short recap on the properties that should be present in a realistic arrival stream model.
A common assumption in queueing theory is that of Poissonian arrivals, entailing that the mean and variance of the number of arrivals (roughly) match. However, it is often observed that service systems face arrival streams that are highly variable (mean variance; overdispersion), while in specific cases systems have to deal with almost deterministic arrivals (variance mean; underdispersion).
As an example of the latter, consider service systems in healthcare with scheduled yet not necessarily punctual arrivals (so that arrival epochs randomly fluctuate around the appointed arrival time), as studied in e.g., [21,22,25]. In such settings clearly some sort of 'induced deterministicness' plays a role, in the sense that arrivals are actively being directed to (or away from) the system. In this thesis however, we will focus on 'undirected' arrival streams only.
For 'undirected' arrival streams, overdispersion is a phenomenon commonly found in data. Examples where one could expect to encounter overdispersed arrivals include a call center of a bank, an insurance company and an emergency department in a hospital; see e.g. [3,23,24,27]. In such settings, arrivals are usually triggered (or inhibited) by occasional events or (un-)favorable circumstances which can cause unforeseen peaks (or dips) on top of the usual daily patterns. This so-called 'random environment' gives rise to an effect commonly referred to as parameter uncertainty [3,40], which naturally leads to overdispersion.
Speaking of daily patterns: in nearly all practical applications, the mean number of arrivals is not constant over time (e.g. over the course of the day) and follows a predictable pattern. It must be noted that the variability that causes overdispersion is of a different nature than the variability induced by nonstationarity. Nonstationarity can be modeled by a non-homogeneous Poisson proces, replacing the constant arrival rate of a Poisson process by a (deterministic) time-varying one.
However, for non-homogeneous Poisson processes the mean and variance of the number of arrivals still match, hence such processes fail to capture the entirety of the desired dynamics observed in arrival processes. Nevertheless, nonstationarity is another important feature of a real-life arrival process [10,11,37] and as such should be incorporated in any realistic arrival stream model as well.
Besides being overdispersed and having a time-varying rate, a realistic arrival stream might even have dependencies between the numbers of arrivals in disjoint time intervals. That is to say: it's highly unlikely that the random environment affects the arrival stream in an i.i.d. fashion over the different intervals; the effects at hand possibly play a role for a longer period of time. Indeed, arrival data often exhibits these kinds of dependencies, e.g. in call centers [16,17].
Existing staffing methods. As mentioned, our objective is to develop a staffing rule such that the delay probability is sufficiently low, uniformly over time. With this rather stringent service-level requirement in mind it is fairly natural to approximate the system relying on its infinite-server counterpart. The famous square-root staffing principle is based on exactly this observation. In the classical setting (M/G/∞ with arrival rate R and unit-mean service times) it uses that the steady-state number of busy servers is Poisson distributed with mean R. By asymptotic normality the coefficient of variation (i.e., standard deviation divided by the mean) has the approximate form 1/ √ R, and that the steady-state probability of delay in a corresponding finite-server setting with s servers, say p s (R) can be approximated by for large R, with Φ(·) the distribution function of a standard Normal random variable. For β such that 1 − Φ (β) = ε we find for the required number of servers [38,39] Note that Eqn. (1) has an appealing interpretation: the number of servers s should evidently be taken larger than the expected workload R, with the extra term β √ R ('uncertainty hedge') being of the same order as the natural load fluctuations of the workload process. Refinements of order smaller than √ R are explored in e.g. [18,29,42].
Although the excess probability P(M > s) corresponding to the infinite-server system (with M denoting the stationary number of busy servers in this M/G/∞ queue) is likely to be smaller than the probability of delay in its finite-server counterpart, still square-root staffing rules have shown to give accurate results [5,18]. This can be explained by the fact that as R grows large, the hedge β √ R prevents congestion more and more effectively. If the arrival process is an NHPP with nonstationary arrival rate λ(·), then the number of arrivals N (s, t) in the interval [s, t), with s < t, is Poisson distributed with parameter t s λ(r) dr.
Note that such a non-homogeneous arrival process not yet exhibits overdispersion (E N (s, t) = Var (N (s, t))). For the resulting model M/G/∞-based staffing rules cannot be applied directly, but various approaches have been proposed.
In a first approach, the nonstationarity is essentially ignored: one uses a simple stationary approximation (SSA), based on a stationary model in which the arrival rate is chosen equal to the long-run average [32]. This method performs poorly in most scenarios [11], for example when the actual rate is slowly varying with respect to the service time or when the relative amplitude (level of nonstationarity) of the rate is larger than 10%. A second approach is the pointwise stationary approximation (PSA) [10,11,37], which considers the system at time t as if it has dealt with an arrival rate λ(t) with s t servers from the start (i.e., assuming steady state), thus ignoring nonstationarity in a different way. This method works well in settings where the arrival rate changes sufficiently slowly [10,37], so it covers scenarios on the other side of the spectrum.
As a comprimise between the two extremes, [37] suggests the average stationary approximation (ASA) that generalizes both SSA and PSA, replacing the arrival rate at time t with for some positive constant a and mean service time E S. An alternative to this was proposed by [19], saying that one could replace R in the staffing formula by m ∞ (t), the expected 'offered load' in an infinite-server system with time-dependent arrival rate λ(t) at time t: where S denotes the service time. They showed that this method stabilizes the probability of delay close to some target value at all times, independently of the arrival rate being slowly varying or not. Importantly, in [19] asymptotic normality was used to arrive at the approximation, hence their method follows the tradition of the square-root staffing procedure described above.
As mentioned above, NHPP models fail to exhibit overdispersion, a phenomenon observed across various types of service systems; see e.g., [6,20,23,31,33]. The parameter uncertainty underlying overdispersion potentially jeopardizes the effectiveness of the square-root rule, typically leading to overoptimistic staffing algorithms. This complication was brought forward in many studies, e.g.
Relatively little attention has been paid to staffing rules in the context of arrival processes in which the numbers of arrivals in disjoint intervals are dependent.
Contributions and organization. The contributions of this paper are twofold. In the first place, we present a flexible model for the arrival process, based on [15], that can deal with (any level of) overdispersion, nonstationarity and dependencies between arrivals of consecutive time slots. The challenge being to come up with a model that remains of practical use/computationally tractable, we believe that the model proposed here is among the simplest models with these three properties.
Moreover, fitting data to our model is a relatively straightforward task. The model is presented in In the second place, we develop staffing rules meeting the criteria mentioned in the introduction, to go with this comprehensive yet simple model for the arrival stream. It requires low computational cost to determine staffing prescriptions based on these rules. In Section 2.2 we present the new staffing rule. Subsequently, in Section 2.3 we present a case study based on a healthcare-related data set to show that the rule succeeds to stabilize the delay probability around the targeted ε. The observations here lead to a much improved version of the staffing rule that was introduced in Section 2.2. This concludes Section 2.
In the rest of the paper we work out the details necessary for implementation and further asses the performance of the proposed staffing rules. Section 3 presents straightforward statistical procedures to estimate the modeling parameters. We perform the estimation procedure both for a real data set from an emergency department, and for a stylized example. In Section 4 we perform extensive experiments to assess the performance of our staffing rule in settings with overdispersion, a timevarying arrival rate, and temporal correlation. Here, we incorporate impatience into the model (as in reality, customers might abandon the system before their service begins), in order to analyze how this affects the performance. Finally, Section 5 concludes the paper.

Model and staffing rule
In this section we first present our arrival stream model meeting the criteria mentioned in the introduction (overdispersion, time-varying rate, correlation between disjoint time intervals). Our model is arguably the simplest among all models satisfying these requirements. It is relatively compact and only requires a few input parameters. We then introduce a suitable staffing rule to match such arrival streams. It is new compared to the earlier described methods in the introduction, as it combines all three features of realistic arrival processes while still using the concept of squareroot staffing, where the mean under the square-root is replaced by the variance of the number of customers in the approximative infinite-server system. We conclude the section by an illustrative case study, in which we demonstrate the procedure and its performance.

2.1.
Model description. The model we consider could be termed a mixed M t /G/s t queue with infinite waiting room. We systematically introduce the components of the model, starting with the arrival process.
Arrival process. In our setup the arrival process is a Cox process, i.e., a time-dependent Poisson process with random arrival rate. At time t, the arrival rate is Λ(t) 0. This Λ(t) consists of a deterministic trend λ(t) (capturing the daily pattern), which is inflated by a stochastic busyness factor that incorporates the desired overdispersion and temporal correlation. As a consequence, the model proposed possesses the three desired properties.
More specifically, the arrival rate is built up as follows. Following common practice, we assume that λ(t) is piecewise constant on time intervals of fixed size ∆. For t ∈ [j∆, (j + 1)∆) we can therefore write λ(t) = λ j . We introduce a sequence of random variables W ≡ {W j } j∈Z , which are independent and distributed as a random variable W 0; we normalize them such that EW = 1, and assume The busyness factor of slot j is affected by the current value of the W process (W j , that is), but also by the previous I values (W j−I up to W j−1 ). The parameter I ∈ N reflects the amount of dependence between the stochastic arrival rates pertaining to consecutive disjoint slots.
Let N be the total number of time slots of size ∆ in the considered time frame, i.e. N = 24 when ∆ = 1 hour and the considered time frame is a day. The level of dependence from previous values of the W process is dealt with in an autoregressive way, with parameter α ∈ (0, 1). Concretely, this means that for t ∈ [j∆, (j + 1)∆) the stochastic arrival rate is given by here is a normalizing constant that ensures that the busyness factor has mean 1: It means that the busyness factor gets a new value every ∆ time units; thus, ∆ −1 can be regarded as the sampling frequency. Note that the process W is not observable; as we show later, in our staffing formula we just need to know Var (W ).
The values λ j reflect the mean arrival rates during the individual time slots. When assuming periodicity in the data (e.g., daily and weekly patterns), one can estimate these values in a straightforward way from historic data. The value of ∆ is situation-dependent; one often picks 5, 10 or 15 minutes. This leaves us with estimating α, I, and Var (W ). The procedure we followed is that we use standard least-squares tools to estimate α and Var (W ) for given I; this we do for multiple values of I, so as to select an 'optimal' I. We elaborate on these estimation issues in Section 3.
Service times. The patients' service times are independent and identically distributed samples from a general non-negative distribution; we denote the underlying random variable by S, and write P (t) := P(S > t). In the numerical experiments in Sections 2.3 and 4 we focus on the case of exponentially distributed service times (with mean µ −1 ), but in the staffing rule one could pick in principle any distribution.

Number of servers.
At time t, the number of servers is s t . The value of s t is as determined in Section 2.2. We assume that services are always completed, even if s t drops to a value that is insufficient to serve all patients present; as this assumption is fairly natural in practice this is the way the system dynamics will be modeled in the simulation experiment.
here m ∞ (t) and v ∞ (t) are the mean queue length and variance of the mixed M t /G/∞ counterpart of the mixed M t /G/s t system introduced in Section 2.1. Note that given an overdispersed arrival The use of such a rule is justified by asymptotic normality, which is backed by the results in [15].
The initial choice for the constant β is (with Φ(·) the normal CDF): It is expected that this choice is not optimal, given the approximative nature of the procedure. In fact, β is always smaller than optimal, since the actual number of customers in a finite-server system will be higher than predicted by an infinite-server proxy (where each customer is served immediately upon arrival and hence can leave without waiting). The idea is to slightly tweak the value β in order to more closely attain the desired service level (i.e., P(delay) < ε). More importantly, irrespective of the level the shape of s t as a function of time should ensure a delay probability that is relatively flat over time. This depends mostly on the shape of m ∞ (t) and v ∞ (t).
Hence, the next step is to determine expressions for m ∞ (t) and v ∞ (t). Following the approach of [15], we deduce that the queue-length process of an infinite-server queue fed by a Cox process arrival process with arrival rate Λ(t) is again a Cox process. More specifically, the distribution of the number of patients at time t is Poisson with random parameter We thus obtain that and, by the law of total variance (conditioning on Λ(s), As an aside, note that indeed m ∞ (t) v ∞ (t), which reflects the overdispersion that we introduced.
These expressions can easily be simplified using the observation that Λ(t) is piecewise constant. For the case that S is exponentially distributed, µ ∞ (t) and v ∞ (t) can be evaluated in closed form in t = n∆ with n ∈ N; see Appendix A .
2.3. Case study: MOL staffing for overdispersed hospital arrival data. We continue by illustrating our approach and its performance in a case study. The data set used was provided by the SEElab and originates from the emergency department (ED) of an Israeli hospital. It contains We aggregate different weekdays separately, which implies that we have N = 224 observations for each day of the week (see Table 1). Figure 1 presents the sample mean and variance of the number of arrivals per slot for a 5, 10, 15, 30 and 60 minute resolution, based on these 224 observations.
When considering the hourly mean arrival rates (same timescale as LOS) for each hour of the week it is fairly variable: with a time average of 14, its minimum is 2.1 and its maximum 33.2. In conclusion, the level of nonstationarity is high and the rate is rapidly changing with respect to the LOS. Note that this means that both of the methods mentioned in the introduction, SSA and PSA, would not be accurate.
An observation from Figure 1 is that different weekdays indeed show different patterns in the arrival stream and also the level of overdispersion and nonstationarity is visibly different although data has already been averaged over 224 samples. Note for example that Sunday (in Figure 1 the 4th day) is an exceptionally busy day with high peaks in the mean and variance, in contrast to Friday and Saturday (i.e., Israeli weekend). When fitting the model to the arrival data we find that choosing the parameters α, I and Var (W ) differently for different weekdays significantly improves the fit. As this modeling decision also affects the staffing rule and the subsequent performance analysis, we decided to simplify reality and examine only isolated Sundays in the rest of this case study. Note that consequently we pretend that Sundays succeed one another, so that time slots around midnight are correlated in the model although in reality there is a week of (ignored) events in between. It is expected that the error resulting from this simplification is small as the arrival volume is small around midnight, and even smaller for small values of I.
In the rest of this subsection we will present our rule's performance when staffing a multi-server system where the arrival stream is taken from the data set, restricted to Sundays. In Section 4 we present a systematic evaluation using stylized input.
Given ∆ = 1 hour and I = 10 (where we intentionally pick a large value of I to be on the safe side), we find by the statistical inference procedure that will be described in Section 3 (cf. Table 2) that α = 0.81 and Var (W ) = 0.11 are the best fit for the data. As this will be the input for the  simulation, we expect that using these parameters for the staffing rule will give the best result. To check this, we will also generate the delay probabilities if we plug in different parameter settings in the staffing rule, i.e., I = 0, 1, 5 (with corresponding α and Var (W ) as given in Table 3). As in Table 1, the mean length of stay is 109 minutes, so the hourly service rate to be used in the simulation is µ = 60/109 ≈ 0.55.  Next, the empirical probability of exceeding the staffing level in an infinite-server system is computed using the simulation results. As the staffing rule is based on analysis of infinite-server systems, it can be expected that this probability behaves well: asymptotic normality predicts that it should be close to the required level ε in every time slot, which implicitly says that the service level should be more or less stable. However, Figure 3 shows that the exceedance probability sometimes crosses the required service level and does not follow a smooth straight line.  This can partly be explained by rounding errors (note that the 'tooths' in the lines are often caused by a difference of 1 server), and moreover it is noted that asymptotic results in the end are just approximations. All in all the performance of this very straightforward and easy-to-use staffing rule is satisfying. But importantly, the infinite-server results can of course only be used as a proxy. The actual delay probabilities from the finite-server setting (where the staffing rule dictates the number of patients that can be served at a time) will be significantly larger due to queueing caused by the waiting patients. If this queueing bias would result in a uniform shift upwards, the staffing rule would still prove perfectly useful, as we can easily tune the delay probability down by tweaking β. Unfortunately this is not the case; Figure 3 shows a heavy spike around noon, so the system is locally performing unacceptably poorly. Note that in the (finite-server) setting with abandonments the performance would certainly be better, depending on the abandonment rate. Now, instead of going immediately into service as in the infinite-server setting, all customers initiate an exponential clock (with a rate that might even be comparable to the service rate) right upon arrival, for potential abandonment of the system. The infinite-server proxy is way more accurate in such a setting. In Section 4 we will consider this adaptation, but for now we try to further improve the staffing rule for the basic setting (i.e., the setting without abandonments).
Note that, although for most of the day the delay probability seems rather stable, around noon it takes on values twice the targeted service level. Comparing Figure 2, we find that around noon, which is not incidentally precisely the area where the increase in load is extremely high (due to nonstationarity of the arrival stream), the prescribed number of servers follows the same slope as that of the square-root of the variance. However, apparently this is not enough; the system can not deal with the backlog that is rapidly building up around noon. Based on this observation, we cook up a heuristic that could potentially overcome this hurdle: the hedge β v ∞ (t) is replaced by a more involved one, that accounts for extreme fluctuations in the arrival rate in settings where the level of nonstationarity is high.
Slope heuristic. Let v n the ratio between the variance in time slot n + 1 and n. The idea is to scale up the number of servers when v n 1 while mildly reducing the number of servers when v n < 1, without changing the total number of staffed servers over the day. It is important to only make subtle changes, so that the 'shape' (viz., Figure 2) prescribed by the infinite-server proxy stays unaltered. Consequently, we are after an increasing function f (x) with the property that Functions f δ (x) = x δ with 0 < δ ≤ 1 satisfy these conditions and have the advantage that they can easily be tuned via the parameter δ. We arrive (for t = n∆) at for n = 1, . . . , 24. Then δ can be picked such that the variance of the resulting delay probability (given a staffing level according to s δ n∆ for n = 1, . . . , 24) is minimized. Alternatively, a few values for δ are compared to arrive at a value for which this variance is relatively small. ♦

Remark 2.1. Note that Eqn. (8) simplifies to Eqn. (3) if λ(t) is constant. The heuristic introduces
an extension to the staffing rule that was originally proposed to account for nonstationarity; if there is no nonstationarity present (λ(t) ≡ λ), the slope-adapted rule reduces to the original rule, in which case the latter's performance is satisfactory.
Moreover, note that in the infinite-server setting performance would not improve by using the rule in Eqn. (8); in this setting Eqn. (3) is the best we can get. That is to say, implementation of this heuristic is useful in situations where an extremely steep slope in the arrival rate causes an avalanche of queueing patients once the number of servers is restricted. ♦ Using that EΛ j = λ j , the W j have unit mean and that c α is a normalizing constant, theΛ j are unbiased estimators for the λ j .
Covariance matrix. Recall that the covariance of two independent mixed Poisson random variables (meaning that the enveloping Poisson random variables are independent) with dependent parameters equals the covariance of the parameters. In addition, recall that the variance of a mixed Poisson random variable is the sum of the expectation and variance of its parameter.
Procedure for α, I and Var (W ). The idea is to vary I in an outer loop and to estimate α and   To be able to determine the values of α and Var (W ) given λ j , λ j+k and I, we need the empirical covariance matrix Σ derived from the arrival data. Note that an estimate of any two nonzero entries of Σ provides enough information to solve for α and Var (W ), after having equated them to the expression in Eqn. (12). However, each nonzero pair leads to a different solution. We wish to determine values for α and Var (W ) such that the theoretical covariance matrix Σ(α, Var (W )) as given by Eqn. (12) is the 'best' approximation for Σ. Therefore, the next step in the procedure is to minimize the average of the entrywise mean squared errors, where we sum over the entries for which the theoretical covariance matrix is nonzero (noting that the number of nonzero entries, being equal to N (2I + 1), depends on the choice of I). In Table 2 this value is labeled with MSE*, with a separate column for the exact MSE values where all entries of the empirical covariance matrix are taken into account. The gain is computed as the relative gain in (exact) MSE compared to the standard Poisson case (where I = Var (W ) = 0). We observe that from I = 5, not much improvement is still to be gained, so I = 5 seems to be a good choice when we aim for moderate complexity and a good fit.
Some more examples. In Table 3 the same procedure is used to obtain the best fit for I = 0, 1, 5, 10, which are used in the case study in Section 2.3. Similar gain percentages in MSE are obtained by choosing I larger, although in Table 2 I = 5 and I = 10 achieve a better fit than in Table 3. At the same time, the variance of W as well as the correlation parameter α is consistently smaller in the data set that corresponds to Sundays; although total arrival volume is larger here, temporal correlation and overdispersion seems to be less prominent.  Table 4. Fitted parameters given I and corresponding MSE.
We add a theoretical example to assess the precision of this minimization method. With the λ j as above, set α * = Var (W ) * = 0.5 and I * = 5. As these parameters together define the arrival process, this gives a certain covariance matrix Cov * . We use the minimization method to find, given some choice of I ∈ {0, 1, . . . , 7}, the optimal values for α and Var (W ) in terms of the MSE of Cov(α, Var (W )) with respect to Cov * . The results can be found in Table 4, together with the corresponding MSE. Observe that the method recovers the true values α * and Var (W ) * in case we set I = I * = 5. For lower degree of correlation, it is found that a larger value for α (more dependence) is compensated by a smaller value for Var (W ) (less overdispersion), however apart from the case I = 4 choosing I too small inevitably leads to a big loss in precision. For I = 4 the MSE is acceptably small. On the other hand, setting I too large leads only to small errors, which means that selecting a value I above the true value leads to marginal differences.

Performance
In this section we extend the numerical work on the performance of the presented staffing rules in the finite-server setting with described arrival stream, based on simulations instead of data. The goal is to assess the individual effects of nonstationarity, temporal correlation and overdispersion.
Moreover, to reduce the gap between the finite-server system and its infinite-server proxy we add an abandonment rate θ to the model. Note that in any service system that involves waiting, it is natural to have a (possibly small) positive abandonment rate.
We start with a particular stylized instance for the arrival stream, again inspired by the hospital data, i.e., the levels of overdispersion, nonstationarity and temporal correlation are comparable. The daily pattern is represented by a sine function with a cycle length of 24 hours (with ∆ = 1 hour), having a dip early in the morning (at 4:30) and a peak late in the afternoon (at 16:30). That is, where N is the system size and p the level of nonstationarity. The parameters are set as in Table 5. Table 5. Parameter setting base case.
Note that, as in Section 2.3, the concerning system is fairly small whereas the level of nonstationarity is extremely high. As the service rate is low, this means that the patient 'sees' effectively different arrival rates during its stay and nonstationarity can not be ignored. The correlation structure is abundantly present and the level of overdispersion seems mild (though nonzero). However, the effect of Var (W ) being positive on the size of the hedge (i.e., on √ v ∞ ) is quite large; compare columns 1 and 3 in Table 6. On the other hand, taking I > 0 slightly mitigates this effect; compare columns 2 and 3 in Table 6. This table was included to show the effects of different modeling choices made independent of the impact of a (wildly fluctuating) daily pattern, which will be added in the experiment that follows.
In Table 6 the staffing levels for 3×3 different settings are given. Here abandonments are incorporated to different extents: the abandonment rate is θ = a * µ, for a = 0, 0.5, 1 ('no', 'mild' or 'max'). Note that incorporating abandonments does not affect the prescription for the staffing level, as our staffing rule does not account for it.
We find that in this instance with stationary deterministic daily pattern, performance does not change significantly when a correlation structure is added to the model, where it is noted that we account for it in the staffing rule (in this case that means that less servers were used to achieve approximately the same delay probability). We do however need significantly more servers to attain a comparable level for the delay probability when switching from the 'no overdispersion' setting (column 1) to the setting with overdispersion (column 3), which of course is the motivation for the staffing rule introduced in this paper. Note that performance gets worse anyway, despite the complication in the number of servers.
From Table 6 it becomes very clear that firm improvement in performance can be achieved by incorporating a positive abandonment rate. Nevertheless, we see that even without a daily pattern, the β constant needs to be tuned somewhat until the delay probabilities match the targeted probabilities, partly due to the overdispersion (the first column displays better performance) and partly due to the inaccuracy of the infinite-server proxy (when abandonments are incorporated performance gets better until it's nearly perfect). Strangely, only in the cases where ε = 0.1 with no/mild abandonments, the performance got worse when correlation was left out. Of course the proxy is least accurate in this case, but apparently having dependence between arrival rates 'helps' here.
standard  Table 6. Delay probability obtained through simulation, for the setting without nonstationarity (we set p = 0). The first column gives the probability when setting Var (W ) = 0, in the third column the correlation structure is ignored (I = 0). The middle column is the delay probability as dictated by our model given the stated parameter setting (see Table 5).
From the plots in Figure 5 we observe that incorporating only mild abandonments significantly improves the performance of our staffing rule, which makes sense as the infinite-server proxy is more accurate for finite-server models with abandonments. Note that the finite-server setting endowed with an abandonment rate θ = µ coincides with the infinite-server setting, the setting in the last row of plots. Note that the somewhat erratic nature of the delay probability is due to inevitable rounding errors resulting from the fact that the number of servers needs to be integer. Figure 5. Delay probability in finite-server setting, staffing level according new rule.
In the first column we set Var (W ) to zero (Subfigures (a), (d) and (g)), in the arrival stream as well as in the staffing rule.
Given that the delay probabilities in this setting define some sort of baseline for the performance (this should be the easiest setting to handle), it is remarkable that the performance does not get (significantly) worse when taking into account overdispersion and correlation. In that sense it looks like our staffing rule is prescribing the correct number of servers. The plots even suggest slight improvement in many settings. Comparing the overdispersed setting where correlation is left out (column 3) with the setting with both overdispersion and correlation (column 2), there is only a very slight improvement in performance over all plots with different abandonment rates and staffing levels.
That is, our rule seems to account well for overdispersion, but it is struggling slightly harder to deal with the correlation structure. However, it can be concluded that nonstationarity is the main factor that complicates achieving stable delay probabilities, mostly in the setting without abandonments.
In order to make a fair comparison with the case study in Section 2.3, it is necessary to apply the slope-adapted staffing rule (cf. Eqn. (8)) here as well. We will only apply it to the case with no abandonments, hence we mirror Figure 5  it can be concluded that the slope-adapted staffing rule indeed stabilizes the delay probabilities (a) (b) (c) Figure 6. Delay probability in finite-server setting, staffing level determined with slope-adapted staffing rule. Here δ from Eqn. (8) is of the form δ = k 24 , for the value of k that maximally stabilizes the delay probability. over the day. However, further improvement could be made by tweaking β, to get the stabilized probabilities below the targeted level. The resulting improvement is not shown, as the procedure and its effect are trivial.

Conclusion and discussion
In this paper we propose new staffing rules for a specific queueing model with overdispersed and nonstationary input with temporal correlation. The objective is to stabilize the delay probability throughout the day around a fixed target value, which the final staffing rule developed succeeds to do.
In the numerical experiments in Section 2.3 the originally proposed rule based on an infinite-server proxy proves insufficient for staffing purposes. The main complication turns out to be nonstationarity.
Considering the same model with abandonments, we observe significantly better performance, due to the fact that the infinite-server proxy is more accurate for finite-server models with abandonments.
Applying the introduced slope heuristic, the adapted staffing rule renders a major improvement (already without abandonments!).
The observed performance is robust for the choice of parameters for overdispersion and temporal correlation; as long as the combination of parameters results in an accurate estimate for the variance in the number of arriving customers, the prescribed (slope-adapted) staffing level is appropriate. In Appendix B, it is shown that this variance is decreasing in α and I and at the same time increasing in Var (W ), so that different parameter settings can result in the same variance. Although the statistical procedure in Section 3 does not lead to a unique 'optimal' choice for the parameters α, I and Var (W ), because of this robustness it is sufficient to select a reasonable parameter setting.
Note that the implementation of nonstationarity in the model is rather straightforward: fitting a constant arrival rate to fixed time slots is the simplest and also a widely used procedure to implement time-of-day or time-of-week effects. It is remarked, though, that the discontinuities might cause poor predictions close to the slot boundaries. Therefore, in [43] a slight adaptation is suggested: it is proposed to use piecewise linear (hence continuous) rates.
For j = 1, . . . , N , we introduce (using the periodicity) This leads to an expression for m ∞ (t) in terms of a finite sum: We now move on to compute v ∞ (t). To this end, we define The idea is to rearrange the contributions to the random arrival rate due to each of the W j in the expression for v ∞ (t) in Eqn. (7): Noting that the W j are independent and identically distributed, the expression in the previous display becomes The next step is again to exploit the periodicity. For this we study ∞ j=1 B j , under the assumption I < N (which is fairly natural). Elementary calculus reveals that v ∞ (t) can be expressed as a finite sum, due to Note that for convergence of the infinite series, we have to assume that e −µ∆ < α.
The final expression for v ∞ (t) is as follows: where D j := Appendix B. Variance of the arrival process as a function of the parameter space In this appendix we show that the variance of the arrival process is decreasing in α and I and increasing in Var (W ). With the nonstationary doubly-stochastic arrival rate process Λ(t) given by Eqn.
(2), we get a nonstationary mixed Poisson arrival process. This process is overdispersed, which becomes visible once we write down its variance: Var Poisson(Λ(t)) = λ j + λ 2 which is larger than λ j as the second term at the right-hand side of Eqn. (15) is positive for Var (W ) > 0. It is hence directly seen that Eqn. (15) is increasing in Var (W ). Observe that the depends both on α and I. We state and prove that Eqn. (16) is (strictly) decreasing in α and I.
Note that indeed 1 + α I+1 1 − α I+1 < 1 + α I 1 − α I for I = 0, 1, . . . , for all α ∈ (0, 1). Now consider the function f I (α) = 1−α 1+α 1+α I 1−α I for some I > 1 (for I = 1 we find this function is constant; note that this corresponds to the case where I = 0 in our model, i.e. there is no correlation between past time slots). After taking the logarithm and differentiating, we end up with the condition that the function is (strictly) decreasing if
Appendix C. Constraint on I In this appendix we explain why we take I, the number of elapsed time slots that affect the busyness factor, to be at most equal to (N − 1)/2 . Note that the number of nonzero entries in the covariance matrix equals N (2I + 1), as for each time slot j = 0, . . . , N − 1 we have a nonzero entry on the diagonal and for k = 1, . . . , I both Σ j,j+k and Σ j,j−k are nonzero. Of course the dimension of the matrix only allows for N 2 entries. In other words: N (2I + 1) ≤ N 2 must hold, i.e., To be even more precise, strictly it is only required to set I ≤ N/2 , however in the case where N is even, for k = I = N/2 we should replace Eqn. (11) by for j = 0, . . . , N − 1.
The following example serves as an illustration of the complication that arises when I is not restricted as in Eqn. (17).