Operations Research Letters Point and interval estimation of decomposition error in discrete-time open tandem queues

We analyze the approximation quality of the discrete-time decomposition approach, compared to simulation, with respect to the expected value and the 95th-percentile of waiting time. For both performance measures, we use OLS regression models to compute point estimates, and quantile regression models to compute interval estimates, of decomposition error. The ANOVA reveals the major influencing factors on decomposition error, while the regression models are demonstrated to provide accurate forecasts and precise confidence intervals for decomposition error. © 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Queueing models are widely used for performance evaluation of production and logistics systems which are subject to the influence of randomness [23,33,35,39,41]. When applying continuous-time queueing models, engineers calculate the first and second moment of performance indicators of interest (e.g. throughput, waiting time, and the number of customers in the queue) using the well-known formulas for M/M/1 and M/G/1 queues as well as approximation formulas for G/G/1 queues. Books that provide an overview of continuous-time queueing models are written by Buzacott and Shanthikumar [6] and Wolff [38].
However, production and logistics systems are typically designed to guarantee performance not on average, but with a given probability (e.g. 95%), which necessitates the calculation of the distribution of key performance indicators (such as waiting time) to know, for example, which percentage of orders is processed in 3 h or less, or which promised throughput time will be met in 95% of the cases [29]. Applying discrete-time queueing models allows for the computation of the entire probability distributions of key performance indicators under very general assumptions. Discrete-time modelling means that events are only recorded at moments that are multiples of a constant time unit t_inc. Thus, a discrete random variable X is described by its probability mass function, which assigns a probability to each multiple of t_inc.
Given the discrete random variables for the inter-arrival and service time, the probability distributions of performance measures can be computed, for example the waiting time [10] or the inter-departure time [15] distributions. A comprehensive introduction to discrete-time queueing models can be found in [1,5]. The models have been successfully applied in various use cases related to logistics and production systems [7,27-30].
The analysis of discrete-time open queueing networks relies on a decomposition approach. As in the continuous-time domain, the technique is known to yield approximate results in the case of non-Poisson arrivals and generally distributed service times. The drawback with approximations is that we cannot quantify the deviation of the performance measures calculated with a decomposition approach from their actual values. While the approximation quality of decomposition approaches has been studied in the literature for the continuous-time domain (see e.g. [16,34]), decomposition error in the discrete-time domain has not yet been comprehensively examined. So far, no estimator is available to predict decomposition error for a given queueing network in the discrete-time domain.
In this paper, we investigate discrete-time open tandem queues to analyze and forecast the approximation quality of the discrete-time decomposition technique, compared to simulation. We limit ourselves to the analysis of tandem queues with external Poisson arrivals that become non-renewal at the downstream queue with the aim to reveal fundamental dependencies regarding the approximation quality of the discrete-time decomposition approach.

Theoretical background
Open queueing networks allow for the analysis of systems with infinite buffer capacity and generally distributed inter-arrival and service times. Generalizations of Jackson's product form solution [12,13] with respect to generally distributed inter-arrival and service times are proposed by Reiser and Kobayashi [24] with modifications presented by Kuehn [22], Shanthikumar and Buzacott [32], Whitt [36], and Bitran and Tirupati [3,4]. Each decomposition approach relies on two basic assumptions [8]: First, it is assumed that the individual queueing systems can be treated as being statistically independent GI/G/1 queues. Second, it is assumed that the point process which forms the input to each GI/G/1 queue can be approximated by a renewal process. It is therefore important to emphasize that congestion measures obtained by decomposition techniques are approximate, since the assumption of independence among queueing systems does not properly account for the correlations of the arrival stream which have a significant effect on the performance measures [16].
Decomposition approaches for discrete-time open tandem queues are based on these conditions, as well. The arrival stream of a downstream queue is approximated as a renewal process by the inter-departure time distribution of the upstream queue, which can be efficiently computed with the algorithm by Jain and Grassmann [15]. The waiting time distribution of the resulting GI/G/1 queue is obtained with the algorithm presented by Grassmann and Jain [10]. Further performance measures, such as the distribution of customers, can be computed with the approaches presented by Hasslinger [11], and Grassmann and Tavakoli [9].
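As an illustration of how such a waiting time distribution can be obtained in the discrete-time domain, the following sketch iterates the Lindley recursion W_{n+1} = max(0, W_n + B_n - A_n) directly on probability mass functions until a fixed point is reached. This is a naive fixed-point scheme for exposition only, not the Grassmann-Jain algorithm; the array encoding (index = number of time steps) and the truncation length are illustrative assumptions:

```python
import numpy as np

def waiting_time_pmf(a, b, tol=1e-10, max_len=4000, max_iter=100000):
    """Stationary waiting-time pmf of a discrete-time GI/G/1 queue,
    obtained by iterating the Lindley recursion on pmfs.

    a, b : pmfs of the inter-arrival time A and the service time B,
           index i = i time units (t_inc = 1 is assumed)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    offset = len(a) - 1
    # pmf of U = B - A; u[i] = P(U = i - offset)
    u = np.convolve(b, a[::-1])
    w = np.array([1.0])                    # start from W_0 = 0
    for _ in range(max_iter):
        v = np.convolve(w, u)              # pmf of W_n + U (shifted by offset)
        # max(0, .): collect all mass at non-positive values in the zero bin
        w_new = np.concatenate(([v[:offset + 1].sum()], v[offset + 1:]))
        w_new = np.trim_zeros(w_new[:max_len], trim='b')
        w_new /= w_new.sum()
        if len(w_new) == len(w) and np.abs(w_new - w).sum() < tol:
            break
        w = w_new
    return w_new
```

For example, a deterministic inter-arrival time of two steps and a deterministic service time of one step yield a waiting time of zero with probability one.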
In an effort to investigate the approximation quality of the decomposition techniques, tandem lines have been studied extensively in the literature. Suresh and Whitt [34] examine the impact of non-renewal processes on the approximation quality with different traffic intensities. Wu and McGinnis [40] introduce the intrinsic ratio, a fundamental property of tandem queues that is based on the insight that some servers are directly affected by the external arrival process. Whitt [37] suggests using a variability function (instead of a single parameter as in the QNA) for the arrival stream of the downstream queue, which is a function of the traffic intensity of the incoming queue. Sagron et al. [25] extend this method to multi-class systems that address the scenario when the upstream server in a tandem queue experiences downtimes (e.g. set-up, maintenance, and repair), events that increase the station's departure variability, while causing starvation of a downstream bottleneck station. To achieve better computational efficiency, Sagron et al. [26] approximate the between-class effect (the variability caused by interactions with other classes) in a queue with downtimes using a Regression-Based Variability Function (RBVF). RBVF receives the squared coefficient of variation of the arrival and service times, as well as the expected value of the service process as input and approximates the variability function using methods of linear regression.

Methodology
The object of investigation in this paper is a tandem queue, that is, two discrete-time queueing systems are arranged one after the other. The upstream queueing system is fed by an external arrival stream of customers with arrival rate 1/E(A_u). If the service station is busy upon arrival of a customer, this customer waits for service in the waiting room. After being served at the upstream station with service rate 1/E(B_u), all customers enter the waiting area of the downstream queueing system. The size of the waiting area is infinite, meaning that all customers wait to be served with service rate 1/E(B_d) at the downstream station. We assume that the system is stable, that is, the utilizations of both stations are smaller than 1. Since the arrival process at the downstream queue is approximated as a point process with inter-arrival time distribution A_d, only the downstream queueing system is prone to decomposition error. For the sake of clarity, Table 1 defines the system performance metrics of the tandem queue.
In our analyses, we assume that the random variables describing the service processes follow discretized gamma distributions. Let X be a gamma-distributed random variable with shape parameter k and scale parameter θ. The probability density function of X is given by [2]

f(x; k, θ) = x^(k-1) e^(-x/θ) / (Γ(k) θ^k),  x > 0,

where Γ(k) is the gamma function. We use the squared coefficient of variation (scv) as a normalized measure of statistical dispersion to measure the process variability. Let E(X) denote the expected value of X, and Var(X) its variance. The variability of X is defined as

scv(X) = Var(X) / E(X)^2.

In order to generate gamma-distributed random variables X with predefined values for E(X) and scv(X), we use the well-known closed-form expressions for the shape and scale parameters of the gamma distribution,

k = 1 / scv(X),  θ = E(X) · scv(X).

Finally, we define σ^τ_X as the τ-percent percentile of the probability mass function (pmf) of the random variable X. In this paper, we are interested in the error of the waiting time W at the downstream queue computed by the discrete-time decomposition approach, compared to discrete-event simulation. We conduct two distinct studies with different dependent variables. In Study I, let Δ(E) be the divergence of the expected value of waiting time,

Δ(E) = (E_Sim(W) - E_Queue(W)) / E_Sim(W),  (1)

where E_Sim(W) and E_Queue(W) denote the expected value of waiting time computed with simulation and with the discrete-time queueing approach, respectively. In Study II, let Δ(σ) be the divergence of the 95th-percentile of waiting time,

Δ(σ) = (σ^95_W,Sim - σ^95_W,Queue) / σ^95_W,Sim,  (2)

where σ^95_W,Queue denotes the 95th-percentile of waiting time computed with the discrete-time queueing approach, and σ^95_W,Sim the 95th-percentile of waiting time obtained with simulation.
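A sketch of how such discretized service time distributions might be generated, using the closed-form shape and scale parameters. The interval-based discretization scheme is an assumption chosen for illustration, as is the assumption t_inc = 1; the paper does not prescribe a particular scheme:

```python
import numpy as np
from scipy import stats

def discretized_gamma_pmf(mean, scv, eps=1e-9):
    """Discretized gamma pmf with given expected value and squared
    coefficient of variation, using the closed-form parameters
    k = 1/scv and theta = mean * scv. The continuous density is
    discretized by collecting the mass of each interval
    [i - 1/2, i + 1/2) into the point i (t_inc = 1 is assumed)."""
    k = 1.0 / scv                # shape parameter
    theta = mean * scv           # scale parameter
    dist = stats.gamma(a=k, scale=theta)
    n = int(np.ceil(dist.ppf(1.0 - eps))) + 1   # truncation point
    edges = np.arange(n + 1) - 0.5
    edges[0] = 0.0
    pmf = np.diff(dist.cdf(edges))
    return pmf / pmf.sum()
```

Setting scv = 1 recovers a discretized exponential distribution, the special case mentioned in the design of experiments.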
In the following, we introduce the methodologies used for the computation of point and interval estimates of decomposition error and briefly describe the empirical evaluation criteria, the simulation model, and our design of experiments.

Point and interval estimates
We use Ordinary Least Squares (OLS) multiple linear regression to compute point estimates, and quantile regression to compute interval estimates for decomposition error. The methodological background on OLS regression can be found e.g. in [31]. Quantile regression aims at the estimation of conditional quantile functions, that is, models in which quantiles (percentiles) of the conditional distribution of the dependent variable are expressed as functions of observed covariates [17,20]. Unlike OLS, which is used to compute the conditional mean of the dependent variable, quantile regression can be used to explain the determinants of the dependent variable at any point of the pmf of the dependent variable.
The dependent variables of the regression models in Study I and Study II are Δ(E) and Δ(σ), respectively. In both studies, we consider the same sample of N observations for the estimation of decomposition error. To simplify the notation introduced in the following, we do not differentiate between the two studies, but instead set y_n = Δ(E) in Study I, and y_n = Δ(σ) in Study II for a given data point n. The observations include y and X, where y denotes the N-vector of decomposition errors, and X is the (N × K) design matrix with K - 1 independent (explanatory) variables. Point estimates for decomposition error are computed with the well-known multiple linear regression model

y = Xβ + ε,  (3)

where ε is the N-vector of the random error terms of the regression model. The estimates β̂ for problem (3) are found by minimizing the sum of squared residuals,

β̂ = argmin_β Σ_{n=1..N} (y_n - x_n^T β)^2.

In contrast to OLS, quantile regression finds the estimates β̂(τ) for a given quantile τ ∈ (0, 1) by minimizing the weighted sum of absolute deviations,

β̂(τ) = argmin_β Σ_{n=1..N} ω_n |y_n - x_n^T β|,  (4)

where the weight ω_n is defined as ω_n = τ if y_n ≥ x_n^T β, and ω_n = 1 - τ otherwise. The quantile regression estimates β̂(τ) in problem (4) can be computed very efficiently by linear programming methods. In this paper, we use the modified version of the Barrodale and Roberts algorithm [18,19] to calculate the quantile regression estimates.
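The two estimators can be sketched as follows. The quantile regression below casts problem (4) as a generic linear program, a stand-in for the Barrodale-Roberts algorithm used in the paper; all function and variable names are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def ols(X, y):
    """OLS point estimates (least-squares solution of y = X beta)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def quantile_regression(X, y, tau):
    """Quantile regression estimates for quantile tau, minimizing the
    weighted sum of absolute deviations. The residual is split into
    non-negative parts u, v, which turns the problem into a linear
    program:
        min  tau * sum(u) + (1 - tau) * sum(v)
        s.t. X beta + u - v = y,  u, v >= 0."""
    N, K = X.shape
    c = np.concatenate([np.zeros(K), tau * np.ones(N), (1.0 - tau) * np.ones(N)])
    A_eq = np.hstack([X, np.eye(N), -np.eye(N)])
    bounds = [(None, None)] * K + [(0.0, None)] * (2 * N)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:K]
```

Fitting the model for a pair of quantiles such as τ = .025 and τ = .975 yields the endpoints of a 95% confidence interval for a given covariate vector, mirroring how the paired quantile regression models are used below.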
We always consider the quantile regression models in pairs, so that they form the upper and lower endpoints of the 90%, 95% or 99% confidence interval (CI) of decomposition error, respectively.

Goodness of fit criteria and likelihood ratio tests
To evaluate the accuracy of the fitted OLS models, we are interested in the empirical distribution of the error term ε in problem (3). A preliminary evaluation of the data set shows that the Gauss-Markov conditions [31], and especially E(ε) = 0, hold for our data set. Consequently, mean error measurements for the cumulated error terms of ε (such as MSE and RMSE) will be (nearly) zero and therefore not meaningful for interpretation. Instead, we evaluate the absolute values |ε_n|, n = 1, ..., N, and denote |ε_n| as the forecasting error (FE) for observation n. To determine the accuracy of the OLS model, we compute the relative frequency distribution of FE over all observations in ε. Interpreting the relative frequency distribution of FE, the higher the percentage of small values, the better the model fits the data and thus the higher the accuracy of the model.
The goodness of fit criterion of quantile regression is calculated with the algorithm by Koenker and Machado [21]. Analogous to the conventional R² statistic of OLS regression, we call it Pseudo R².
Let β̂(τ) denote the minimizer of problem (4), and V(τ) the error sum of the conditional quantile function. Further, let Ṽ(τ) denote the error sum of the corresponding conditional quantile function that is restricted to only consider the intercept parameter of β̂(τ).
Conventionally, the goodness of fit criterion is defined as

Pseudo R²(τ) = 1 - V(τ) / Ṽ(τ).

Note that Pseudo R² is not comparable to the standard coefficient of determination R², although it lies between 0 and 1. It is only useful for the comparison between quantile regression models, since it is based on the weighted sum of absolute residuals, while R² is based on residual variance. Finally, it should be noted that Pseudo R² may be a skewed measure, as it is not corrected by the degrees of freedom. However, a definition for the goodness of fit that follows the concept of Adjusted R² known from OLS regression is not available for quantile regression analyses.
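A minimal sketch of the criterion, exploiting the fact that the intercept-only quantile regression solution is the empirical τ-quantile of y (a simplified illustration of the Koenker-Machado measure, not the authors' implementation):

```python
import numpy as np

def check_loss(residuals, tau):
    """Koenker check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r * (tau - (r < 0))))

def pseudo_r2(y, y_hat, tau):
    """Koenker-Machado goodness of fit, 1 - V(tau) / V~(tau), where the
    restricted (intercept-only) model is solved by the empirical
    tau-quantile of y."""
    v = check_loss(np.asarray(y) - np.asarray(y_hat), tau)
    v_tilde = check_loss(np.asarray(y) - np.quantile(y, tau), tau)
    return 1.0 - v / v_tilde
```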
We use likelihood ratio tests to test the overall significance of the OLS regression models [31]. We are interested in testing whether all the independent variables have any effect on decomposition error and test the general linear hypothesis

H: Cβ = γ,  (5)

where C is an (M × K) matrix of rank M < K and γ is an M-vector.
Note that hypothesis (5) allows us to test the overall significance of the OLS model, where C selects all K - 1 slope coefficients and

γ = 0,  (6)

as well as the significance of selected independent variables (so-called nested models), where C selects the m coefficients under test and

γ = γ_m,  (7)

for arbitrary values of m and γ_m. Hypothesis (5) is rejected if

F = ((SSE_r - SSE_u) / M) / (SSE_u / (N - K - 1)) > F_{1-α}(M, N - K - 1),  (8)

where SSE_r and SSE_u denote the residual sums of squares of the restricted and the unrestricted model, respectively. We report the test statistic (8) as well as the p-value of the hypothesis test, which is the probability of observing a value of F larger than the one observed under H with degrees of freedom (M, N - K - 1) and significance level α. Generally speaking, when the test statistic is large and the p-value is small, we can safely reject H and conclude that the OLS model provides a better fit to the data than a model which contains no independent variables (hypothesis (6)) or the nested model (hypothesis (7)).
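The test can be sketched as follows, assuming the residual sums of squares of the restricted and unrestricted models are available (function and variable names are illustrative):

```python
import numpy as np
from scipy import stats

def f_test(sse_restricted, sse_full, m, df_resid):
    """F statistic and p-value for testing m linear restrictions
    against the full OLS model.
    m        : number of restrictions tested
    df_resid : residual degrees of freedom of the full model"""
    F = ((sse_restricted - sse_full) / m) / (sse_full / df_resid)
    p = stats.f.sf(F, m, df_resid)   # P(F' > F) under the hypothesis
    return F, p
```

For the overall-significance test, the restricted model contains only an intercept, so its residual sum of squares is the total sum of squares around the mean of y.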

Simulation model
We use a discrete-event simulation model of a tandem queue to obtain the waiting time distribution at the downstream station. Each simulation run is composed of 50 replications with 10,000,000 simulated time steps each. In each simulation run, the first 100,000 time steps are discarded. The observed width of the 95%-CI of the expected value of waiting time is 0.0286, which is less than 0.5% of the average simulated waiting time. We therefore consider the performance metrics obtained with the simulation model, despite being prone to some variance, to be valid estimates for the waiting time.
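A stripped-down sketch of such a tandem-queue simulation based on the Lindley recursion is given below. Unlike the simulation model of the paper, it operates in continuous time and uses a single replication; all parameter choices and names are illustrative assumptions:

```python
import numpy as np

def simulate_tandem(n_customers, rng, draw_a, draw_b_u, draw_b_d, warmup=1000):
    """Waiting times at the downstream queue of a tandem line, simulated
    via the Lindley recursion W_{n+1} = max(0, W_n + B_n - A_{n+1}).
    draw_a / draw_b_u / draw_b_d are callables returning one
    inter-arrival / upstream service / downstream service time."""
    w_u = w_d = 0.0              # waiting times of the previous customer
    b_u_prev = b_d_prev = 0.0    # service times of the previous customer
    d_prev = 0.0                 # previous departure epoch at the upstream queue
    t = 0.0                      # arrival epoch at the upstream queue
    waits = []
    for n in range(n_customers):
        a = draw_a(rng)
        t += a
        w_u = max(0.0, w_u + b_u_prev - a)
        b_u_prev = draw_b_u(rng)
        d = t + w_u + b_u_prev   # departure from the upstream queue
        a_d = d - d_prev         # inter-arrival time at the downstream queue
        d_prev = d
        w_d = max(0.0, w_d + b_d_prev - a_d)
        b_d_prev = draw_b_d(rng)
        if n >= warmup:
            waits.append(w_d)
    return np.array(waits)
```

With exponential inter-arrival and service times, the departures of the upstream queue are Poisson (Burke's theorem), so the downstream queue behaves like an M/M/1 queue, which provides a convenient sanity check for the recursion.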

Design of experiments
Each tandem queue is parameterized with rate and variability parameters of the external arrival stream and the service processes in both queueing systems. For the sake of conciseness, we limit ourselves to experiments where the arrival process at the first queue is Poisson, and the service times at both queues are gamma-distributed. Given its flexibility, the gamma distribution allows for the modelling of a wide range of dispersion and is therefore well suited to represent the stochastic behaviour of the service process. Further, it is well known that the exponential distribution is a special case of the gamma distribution when the scv-value equals 1. We first consider tandem queues where the utilization parameters at the upstream and the downstream queue are equal. This allows us to define a generic utilization parameter ρ for the tandem queue, ρ = ρ_u = ρ_d. A relaxation of this assumption will be discussed in Section 5.
Based on these conditions, we parameterize each tandem queue with four parameters: the external arrival rate, the service rate, and the variability parameters of both service processes. We define the utilization of the tandem queue ρ, the variability parameters of both service processes scv(B_u) and scv(B_d), and the variability of the arrival process at the downstream queueing system scv(A_d) as independent variables (IVs) of the regression models. We partition the data set into two subsets, the training data set, which consists of 932 randomly chosen data points, and the test data set, which consists of the remaining 234 data points. The data sets are accessible in a repository [14] and described in detail in the accompanying data article.

Results
We first consider the distribution of decomposition error in the overall data set. The empirical cumulative distribution of decomposition error reveals that both positive errors (meaning that discrete-time queueing theory underestimates the waiting time) and negative errors (overestimation of the waiting time) are found. We find relative errors between -21.9% and 32.5% (referring to Study I) and between -30.8% and 36.7% (referring to Study II). The mean absolute values of decomposition error equal 3.93% and 4.51% regarding the expected value and the 95th-percentile of waiting time, respectively.

Study I: Expected value of waiting time
The OLS regression coefficients for Study I are presented in the accompanying data article. Recall that in Study I, the dependent variable is Δ(E), cf. equation (1). The OLS regression analysis is found to be statistically significant (F(10, 921) = 2123, p < .001), explaining the majority of the variance of the relative error of the expected value of waiting time (adjusted R² = 0.958). The ANOVA reveals all direct effects and the majority of the interaction effects to be statistically significant. Since the non-significant coefficient is small, we did not find evidence that the regression model performs significantly better without incorporating this interaction (F(921, 922) = 1.234, p = .267). We identify the service process variability at the upstream queueing system and the arrival process variability at the downstream queueing system, as well as the utilization, as major impact factors. Despite being statistically significant, the effect of the variability of the service process at the downstream queueing system is found to be a minor influencing factor.
The Pseudo R² of each quantile regression model is well above 0.8. All quantile regression equations show similar patterns of changes in coefficient values as the OLS regression. We find the majority of direct and interaction effects to be statistically significant. As in the OLS regression, the interaction effect between the service process variability (at the upstream queueing system) and the utilization is found to be non-significant in each model. While the absolute sizes of the coefficients for most factors vary little across the equations, it should be noted that the weights of the service process variability at the upstream queueing system and the arrival process variability at the downstream queueing system rise with increasing quantile.

Study II: 95th-percentile of waiting time
The regression coefficients for Study II are presented in the accompanying data article. In Study II, the dependent variable is Δ(σ), cf. equation (2). We find a statistically significant OLS regression equation (F(10, 921) = 1064, p < .001), which explains the majority of the variance (adjusted R² = 0.920) of decomposition error regarding the 95th-percentile of waiting time. The impact patterns of the interaction effects are the same as in Study I. Again, we did not find evidence that the OLS estimate performs better without incorporating the non-significant interaction effect between the service process variability and the utilization (F(921, 922) = 0.917, p = .339). Analogous to Study I, the service process variability (at the upstream queueing system), the arrival process variability (at the downstream queueing system), and the utilization are found to be the major direct effects. Despite being statistically significant, the service process variability at the downstream queueing system is a minor impact factor.
The Pseudo R² of all quantile regression models is well above 0.6. Except for the service process variability at the downstream queueing system, which is non-significant for the models with τ ≤ .05, all direct effects are found to be statistically significant in each regression model. The majority of interaction coefficients is found to be significant or marginally significant. However, we did find non-significant coefficients for the interaction effect of the service process variability and the arrival process variability (both at the downstream queueing system), as well as in the Q(.975) model. As in Study I, the absolute sizes of coefficients vary little for most factors across the equations. However, the weight of the utilization increases with rising quantiles, while (in contrast to Study I) the weight of the arrival process variability decreases.

Performance of point and interval estimates
The accuracy of the point estimates is presented in Table 2. For the majority of data points, we find an absolute error of the OLS predictions of less than 1 percentage point from the simulated value. The mean absolute forecasting errors are less than 1 percentage point in Study I and only slightly above 1 percentage point in Study II. In both studies, this accuracy is achieved for the training and the test data set, which indicates that our OLS prediction approach is robust to overfitting.
Despite the minor mean errors, the results suggest that the accuracy of point estimates decreases when forecasting severe values of decomposition error. To investigate this effect, we examine the subsets of test data with minor decomposition errors, that is, all data points with absolute decomposition errors smaller than 3% (in the following referred to as subset (a)), and with severe decomposition errors, that is, all data points with absolute decomposition errors above 10% (subset (b)). The sample sizes of subsets (a) and (b) are 136 and 23 in Study I, and 127 and 33 in Study II, respectively. The relative frequency distributions of FE and its mean errors (cf. Table 2) suggest that subset (a) is forecast with significantly higher accuracy than the data points from subset (b) in both studies. Further, the share of data points that is forecast with a FE greater than 0.05 is significantly higher in subset (b). However, it cannot be concluded that data points with severe absolute decomposition errors are frequently predicted with minor accuracy. In the test data from Study I, we find 96% of the data points with an absolute decomposition error greater than 10% to be forecast with a FE less than 0.05 (in Study II the share is 89%). Interval estimation compensates for this effect. By providing the 90%, 95%, and 99% confidence intervals, we evaluate the precision of the point estimates. Table 3 presents the performance of the interval estimates for Study I and Study II, listing the mean interval lengths and the actual shares of decomposition errors included in the respective confidence intervals. As expected, the average interval lengths increase with rising confidence in finding a data point in the corresponding interval.
In both studies, the average interval lengths differ only marginally between training and test data, which indicates that the approach of interval estimation is robust to overfitting. In the training data set, the confidence intervals contain exactly the respective share of values they were determined for. These shares are only slightly missed for the test data.
The interval estimates are designed to indicate uncertainty in the forecast of point estimates. The results are presented in Table 3. In subset (a), the precision of interval estimations increases compared to the entire test data set. This is indicated by the narrower intervals, as well as the high shares of values that are included in the respective intervals (which is especially noteworthy for Study II). As discussed above, in subset (b), the forecast uncertainty of the point estimates increases, which is indicated by longer mean intervals and a smaller share of values contained in the intervals.
We conclude that minor decomposition errors are predicted with satisfactory point estimation accuracy and great precision. Predicting severe decomposition errors is subject to uncertainty: the absolute error of the point estimate might be considerable, which is indicated by large confidence intervals. By combining the two methods, we satisfy both the need for an accurate point estimation forecast and the quantification of its uncertainty.

Bottlenecks and longer lines
The investigations of the heavy-traffic bottleneck phenomenon in open queueing systems [34] suggest that the performance of bottleneck downstream queues is strongly related to the variability of the non-renewal arrival process, which impacts the approximation quality of decomposition methods. Therefore, we extend our analyses to tandem queues and longer lines with bottlenecks. As Suresh and Whitt [34] mention, in a narrower sense, the bottleneck is the queue with the highest traffic intensity. However, increasing the traffic intensity of a queue by only a small amount may shift the bottleneck position. Therefore, it is intuitive to state that either of the queues is the bottleneck if its utilization exceeds the other's by more than some threshold ε, that is, |ρ_u - ρ_d| > ε.
We create a further data set containing 969 data points, following the procedure described in Section 3.4, but with the relaxation that the expected values of service times are now independent. We choose ε = 0.1 and find 403 data points where the downstream queue is the bottleneck. We use OLS and quantile regression to identify the major and minor effects on decomposition error in bottleneck queues. The coefficients of the regression analyses, where the dependent variables are Δ(E) (Study I) and Δ(σ) (Study II), are provided in the accompanying data article. We find the previously identified major and minor effects on decomposition error to apply in this analysis as well. However, the empirical distributions of the decomposition error show that the approximation quality of the decomposition approach depends significantly on which of the queues is the bottleneck. In the case of similar traffic intensities, we find mean absolute values of decomposition error of 5.45% (6.51%) for the expected value (95th-percentile) of waiting time, which is in line with the expectations from previous examinations. When the bottleneck is downstream, the mean absolute values of decomposition error regarding the expected value (95th-percentile) of waiting time equal 4.87% (5.50%). In contrast, when the bottleneck is upstream, we find mean absolute values of decomposition error of 1.36% (1.46%) for the expected value (95th-percentile) of waiting time. Similar results are observed in longer lines. We investigate a set of lines with i queues in series, where i equals 3, 5, 7, and 9. For each line length i, we evaluate 250 data points. The utilization parameters of the first i - 1 queues are equal, and the last queue in each case is the bottleneck. Table 4 shows the mean absolute decomposition errors for the expected value (Study I) and the 95th-percentile (Study II) of waiting time.
It can be clearly seen that the last queues are prone to significant decomposition errors with 9.69% on average in Study I, and 11.67% on average in Study II. This is significantly more than the decomposition errors for the intermediate queues which are 2.82% on average in Study I, and 2.85% on average in Study II. The results confirm the long-range variability effect formulated by Suresh and Whitt [34], that states that variability in the external arrival stream or the service times can have a dramatic effect on a downstream queue with a much higher traffic intensity.

Concluding remarks
From the analyses of decomposition techniques in the continuous-time domain, it is well known that utilization and the variability parameters of arrival and service processes are significant for the approximation quality of congestion measures. Based on the regression coefficients, we identify utilization and arrival process variability as major impact factors on decomposition error. Service process variability was revealed as a minor impact factor.
Utilization is found to be the enabler for decomposition error: in low-traffic queueing systems, the mean absolute decomposition error is significantly lower than the mean absolute errors in the entire data set. Severe absolute decomposition errors are only observed in heavy-traffic systems. In tandem queues with bottlenecks, we find the decomposition error to be significantly higher when the bottleneck is downstream. This leads to the conclusion that downstream bottlenecks are analyzed with limited accuracy, which is of particular concern since the performance evaluation of bottlenecks is especially critical. The arrival process variability determines the tendency (that is, overestimation or underestimation of the waiting time) of the decomposition technique. For scv-values of the arrival process at the downstream queue lower than 1.0, the decomposition approach underestimates waiting time. Overestimation of waiting time occurs for scv-values of the downstream arrival process greater than 1.0. Variability of the service process is a minor impact factor. This is indicated by the fact that when the arrival process at the downstream queue is Poisson, we did not find considerable decomposition errors, regardless of the utilization of the queueing system or the scv-value of the service process.
We conclude that the discrete-time decomposition approach analyzes low-traffic queueing systems with high accuracy. In heavy-traffic systems, the approximation quality depends on the arrival process variability. The analysis of queueing systems with highly volatile as well as deterministic arrival processes is prone to considerable decomposition errors. When the arrival process is Poisson, the decomposition approach yields high accuracy, regardless of the service process variability.

Declaration of competing interest
The authors declare that they have no conflict of interest.

Data availability
Data are available in a repository (cited in the manuscript).