Diffusion limits for shortest remaining processing time queues

We present a heavy traffic analysis for a single server queue with renewal arrivals and generally distributed i.i.d. service times, in which the server employs the Shortest Remaining Processing Time (SRPT) policy. Under typical heavy traffic assumptions, we prove a diffusion limit theorem for a measure-valued state descriptor, from which we conclude a similar theorem for the queue length process. These results allow us to make some observations on the queue length optimality of SRPT. In particular, they provide the sharpest illustration of the well-known tension between queue length optimality and quality of service for this policy.


Introduction
In a single server queue employing the Shortest Remaining Processing Time (SRPT) policy, preemptive priority is given to the job that can be completed first, that is, the job with the shortest remaining processing time. More precisely, consider a single server queue with renewal arrivals and i.i.d. service times, and let $I(t)$ index, in the order of their arrival, those jobs that are in the queue at time $t$. For $i \in I(t)$, let $w_i(t)$ denote the residual service time at time $t$ of job $i$. This is the remaining amount of processing time required to complete this job. If $j \in I(t)$ is the smallest index such that $w_j(t) \le w_i(t)$ for all $i \in I(t)$, then under SRPT, $\frac{d}{dt} w_j(t+) = -1$ and $\frac{d}{dt} w_i(t+) = 0$ for all $i \in I(t) \setminus \{j\}$.
Interest in the SRPT policy goes back to the first optimality result of Schrage [15], who showed that SRPT minimizes the number of jobs in the system, or queue length, at each point in time (see also Smith [18]). More explicitly, given fixed arrival and service processes, if $Z(t)$ is the queue length at time $t$ under SRPT and $Q(t)$ is the queue length at $t$ under an arbitrary work conserving policy, then almost surely,
$$Z(t) \le Q(t) \quad \text{for all } t \ge 0. \tag{1.1}$$
This holds with no distributional assumptions on the underlying arrival and service processes. Expressions for the mean response time for an M/G/1 SRPT queue were developed earlier by Schrage and Miller [16], and extended later in Schassberger [14] and Perera [12] (see Schreiber [17] for a survey of the same time period). Another notable contribution was made by Pavlov [10] and Pechinkin [11], who characterized the heavy traffic limit of the steady state distributions for the queue length of an M/G/1 SRPT queue.
Recently, there has been renewed interest in the SRPT policy, mainly in computer science. For example, Bansal and Harchol-Balter [1] study fairness for SRPT ([1] is also a good source for a more extended list of prior work on SRPT). More recent work seeks to provide a framework for comparing policies in the M/G/1 setting; see for example Wierman and Harchol-Balter [20].
There has also been a recent body of work on the tail behavior of single server queues under SRPT; see for example Núñez Queija [8] and Nuyens and Zwart [9]. Using large deviations techniques, they discuss the advisability of implementing SRPT.
In [3], Down and Wu employ diffusion limits to show certain optimality properties of a multi-layered round robin routing policy for a system of parallel servers, each operating under SRPT. This was done under the assumption of a finitely supported service time distribution, mainly due to the absence at the time of diffusion limits for more general service time distributions. In the case of a general service time distribution, Down, Gromoll, and Puha [4] developed fluid limits for SRPT queues, and used these to obtain a formula for state-dependent response times (on fluid scale) of jobs entering the system (see also [5]).
In this paper, we prove a diffusion limit theorem that holds for a general service time distribution, under usual heavy traffic assumptions. We do this for a measure-valued state descriptor, so that diffusion limits for various other performance measures may be obtained as corollaries; see Theorem 3.1. In particular, we obtain a diffusion limit theorem for the queue length process. This result reveals just how optimal SRPT is, in the sense of (1.1), as explained below.
Let $\widehat{Z}^r(t) = r^{-1} Z^r(r^2 t)$, $t \ge 0$, be the $r$th diffusion scaled queue length process from an $r$-indexed sequence of SRPT models, as detailed in Section 3. In particular, we assume the fairly standard heavy traffic assumptions (3.4), (3.5), (3.7), (3.8), (3.9), and (3.11). We use $W^*(\cdot)$ to denote the limit in distribution of the corresponding sequence of diffusion scaled workload processes (see (3.10)). As noted there, $W^*(\cdot)$ is the same for all work conserving policies and is a reflected Brownian motion in $\mathbb{R}_+$ [7]. We use $\nu$ to denote the limiting service time distribution (see (3.5)) and $x^*$ to denote the supremum of the support of $\nu$. Informally, $x^*$ is the largest possible job size. Then,

Theorem 1.1. As $r \to \infty$, $\widehat{Z}^r(\cdot) \Rightarrow W^*(\cdot)/x^*$, where the limit is interpreted as identically zero when $x^* = \infty$.

This result follows from Theorem 3.1 by the continuous mapping theorem. Theorem 1.1 makes a striking statement about the queue length optimality of SRPT. Consider the following simple lower bound, valid for any work conserving policy and service time distribution $\nu$. Assume for the moment that $x^* < \infty$. Let $Q(t)$, $t \ge 0$, be the queue length process under an arbitrary work conserving policy. Then at each time $t \ge 0$, the workload $W(t)$ is bounded above by $Q(t) x^*$, because it is the sum of $Q(t)$ residual service times, each of which is bounded above by $x^*$. So almost surely,
$$Q(t) \ge \frac{W(t)}{x^*} \quad \text{for all } t \ge 0. \tag{1.2}$$
Note that (1.2) makes sense when $x^* = \infty$ as well, as the right side is interpreted as zero. Unlike (1.1), which gives a universal lower bound (over all work conserving policies) in terms of the queue length process of one such policy, (1.2) gives a universal lower bound in terms of the common workload process of all such policies. In particular, we may combine these bounds and have, almost surely,
$$\frac{W(t)}{x^*} \le Z(t) \le Q(t) \quad \text{for all } t \ge 0.$$
The bound (1.2) is intuitively appealing because it results from the hypothetical configuration of residual service times that minimizes the queue length at time $t$, given the workload at $t$. At each $t \ge 0$, the queue length minimizing configuration is the one that puts as many residual service times as possible at $x^*$, such that they sum to $W(t)$. (To be precise, all of them if $x^*$ divides $W(t)$ and all but one of them otherwise.) Additionally, since the workload process is a much simpler object than the queue length process under SRPT, (1.2) may be easier to work with in practice, when $x^* < \infty$, than (1.1).
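As a small numerical illustration of the bound (1.2), the following Python sketch compares the queue length of a configuration of residual service times with the workload divided by $x^*$. The function names and the sample configurations are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math

def workload_lower_bound(workload, x_star):
    """Right side of (1.2): W(t) / x*, read as zero when x* is infinite."""
    if workload < 0 or x_star <= 0:
        raise ValueError("need workload >= 0 and x_star > 0")
    return 0.0 if math.isinf(x_star) else workload / x_star

def satisfies_bound(residuals, x_star):
    """Check Q(t) >= W(t) / x* for one configuration of residual service times."""
    if any(w <= 0 or w > x_star for w in residuals):
        raise ValueError("each residual time must lie in (0, x_star]")
    queue_length = len(residuals)   # Q(t)
    workload = sum(residuals)       # W(t)
    return queue_length >= workload_lower_bound(workload, x_star)

# Hypothetical snapshots with largest possible job size x* = 2.
spread = [2.0, 0.5, 1.5]   # three jobs, workload 4
packed = [2.0, 2.0]        # same workload, every residual time at x*
```

In this example both configurations carry the same workload, but only the packed one, with every residual service time equal to $x^*$, has a queue length equal to the bound.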
Of course, this bound is hypothetical because no work conserving policy, including SRPT, can achieve such optimal configurations for all $t \ge 0$, although many may achieve it for some $t$ (including for example all times $t$ for which $W(t) = 0$). The interesting fact contained in Theorem 1.1 is that, on diffusion scale in heavy traffic, SRPT actually achieves the hypothetical lower bound asymptotically, almost surely for all $t \ge 0$.
So SRPT is not only better than any other work conserving policy in the sense of (1.1), it is in fact as optimal as possible in the heavy traffic limit. Of course, this optimality is from the point of view of the server, who one imagines wants to minimize queue length. As is well known, SRPT performs poorly from the point of view of large jobs (see e.g. [4]), who wish to minimize their time in queue, but tend to wait for long periods as they are preempted by smaller jobs. Indeed the queue length optimality of SRPT comes at the expense of long sojourn times for large jobs, and this tension is made explicit by Theorem 3.1, which gives the measure-valued diffusion limit. From this result, we see that in the heavy traffic limit, all mass is concentrated at $x^*$. So asymptotically for all $t \ge 0$, the queue consists entirely of jobs of the largest possible size, whereas smaller jobs are flushed out instantly. That is, the diffusion limit in Theorem 3.1 puts the contrast between queue length optimality and poor performance for large jobs in the sharpest light.
In the remainder of the paper, we give a precise definition of the stochastic model for an SRPT queue (Section 2), state our assumptions and main result (Section 3), and provide the proofs (Section 4).

Notation
The following notation will be used throughout the paper. Let $\mathbb{N}$ denote the set of positive integers and let $\mathbb{R}$ denote the set of real numbers. For $a, b \in \mathbb{R}$, we write $a \vee b$ for the maximum of $a$ and $b$, and $\lfloor a \rfloor$ for the largest integer less than or equal to $a$. The nonnegative real numbers $[0, \infty)$ will be denoted by $\mathbb{R}_+$. Let $\mathbf{M}$ denote the set of finite, nonnegative Borel measures on $\mathbb{R}_+$. For $\xi \in \mathbf{M}$ and a Borel measurable function $g : \mathbb{R}_+ \to \mathbb{R}$ that is integrable with respect to $\xi$, define $\langle g, \xi \rangle = \int_{\mathbb{R}_+} g(x)\,\xi(dx)$. The set $\mathbf{M}$ is endowed with the weak topology. That is, for $\xi_n, \xi \in \mathbf{M}$, we have $\xi_n \xrightarrow{w} \xi$ if and only if $\langle g, \xi_n \rangle \to \langle g, \xi \rangle$ as $n \to \infty$, for all $g : \mathbb{R}_+ \to \mathbb{R}$ that are bounded and continuous. With this topology, $\mathbf{M}$ is a Polish space [13]. We denote the zero measure in $\mathbf{M}$ by $\mathbf{0}$ and the measure in $\mathbf{M}$ that puts one unit of mass at the point $x \in \mathbb{R}_+$ by $\delta_x$. For $x \in \mathbb{R}_+$, the measure $\delta_x^+$ is $\delta_x$ if $x > 0$ and $\mathbf{0}$ otherwise. For $\xi \in \mathbf{M}$, we say that $x \in \mathbb{R}_+$ is a $\xi$-continuity point if $\langle 1_{\{x\}}, \xi \rangle = 0$. Let $\mathbf{M}_a$ denote those elements of $\mathbf{M}$ that do not charge the origin. We say that a measure $\xi \in \mathbf{M}$ has a finite first moment if $\langle \chi, \xi \rangle < \infty$, where $\chi(x) = x$ is the identity function on $\mathbb{R}_+$. Let $\mathbf{M}_\chi$ denote the set of all such measures and let
We use "$\overset{d}{=}$" for equality in distribution and "$\Rightarrow$" to denote convergence in distribution of random elements of a metric space. Unless otherwise specified, all stochastic processes used in this paper are assumed to have paths that are right continuous with finite left limits (r.c.l.l.). For a Polish space $S$, we denote by $D([0, \infty), S)$ the space of r.c.l.l. functions from $[0, \infty)$ into $S$, endowed with the Skorohod $J_1$-topology [6].
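The pairing notation can be made concrete for measures that are finite sums of unit point masses. The Python sketch below stores such a measure as a list of atoms; this representation and the names `pair`, `delta_plus`, `one`, and `chi` are assumptions made for illustration, and the sketch does not cover general elements of the space of finite Borel measures.

```python
def pair(g, atoms):
    """<g, xi> for the point measure xi that puts unit mass at each atom."""
    return sum(g(x) for x in atoms)

def delta_plus(x):
    """Atoms of delta_x^+: a unit mass at x if x > 0, the zero measure otherwise."""
    return [x] if x > 0 else []

one = lambda x: 1.0   # constant function 1
chi = lambda x: x     # identity function chi(x) = x

# Sum of delta_{0.5}^+, delta_{2.0}^+ and delta_{0.0}^+; the last term is the zero measure.
xi = delta_plus(0.5) + delta_plus(2.0) + delta_plus(0.0)
total_mass = pair(one, xi)      # number of atoms
first_moment = pair(chi, xi)    # sum of the atom locations
```

For a measure of this form, pairing with the constant function 1 counts the atoms and pairing with the identity function adds up their locations.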

Stochastic Model for an SRPT Queue
Our stochastic model of an SRPT queue consists of the following: a random initial condition $\mathcal{Z}(0) \in \mathbf{M}$ specifying the state of the system at time zero, stochastic primitives $E(\cdot)$ and $\{v_k\}_{k \in \mathbb{N}}$ describing the arrival of jobs to the queue and their service times, and a measure valued state descriptor $\mathcal{Z}(\cdot)$ describing the time evolution of the system. These are defined below.
Initial condition. The initial condition specifies the number $Z(0)$ of jobs in the queue at time zero, as well as the initial service time of each job. Assume that $Z(0)$ is a nonnegative integer valued random variable that is finite almost surely. The initial service times are the first $Z(0)$ elements of a sequence $\{\tilde{v}_j\}_{j \in \mathbb{N}}$ of strictly positive, finite random variables. The initial job with service time $\tilde{v}_j$, $j \le Z(0)$, is called job $j$.
A convenient way to express the initial condition is to define an initial random measure $\mathcal{Z}(0) \in \mathbf{M}$ by
$$\mathcal{Z}(0) = \sum_{j=1}^{Z(0)} \delta_{\tilde{v}_j},$$
which equals $\mathbf{0}$ if $Z(0) = 0$. Our assumptions imply that $\mathcal{Z}(0)$ satisfies (2.1). In particular, the number of initial jobs and the initial workload are finite almost surely, and so $\mathcal{Z}(0) \in \mathbf{M}_0$ almost surely.
Stochastic primitives. The stochastic primitives consist of an exogenous arrival process $E(\cdot)$ and a sequence of initial service times $\{v_k\}_{k \in \mathbb{N}}$. The arrival process $E(\cdot)$ is a rate $\alpha \in (0, \infty)$ delayed renewal process such that the interarrival times have standard deviation $a \in [0, \infty)$. For $t \in [0, \infty)$, $E(t)$ represents the number of jobs that arrive to the queue during the time interval $(0, t]$. Jobs arriving after time 0 are indexed by integers $j > Z(0)$.
Then job $j \in \mathbb{N}$ arrives at time $T_j$. Hence, for $i < j$, $T_i \le T_j$, and we say that job $i$ arrives before job $j$.
For each $k \in \mathbb{N}$, the random variable $v_k$ represents the initial service time of the $(Z(0) + k)$th job. That is, job $j > Z(0)$ has initial service time $v_{j - Z(0)}$. Assume that the random variables $\{v_k\}_{k \in \mathbb{N}}$ are strictly positive and form an independent and identically distributed sequence with common Borel distribution $\nu$ on $\mathbb{R}_+$. Assume that the mean $\langle \chi, \nu \rangle \in (0, \infty)$ and standard deviation $b \in [0, \infty)$.
It will be convenient to combine the stochastic primitives into a single, measure valued load process.
Definition 2.1. The load process is given by
Evolution of the residual service times. In an SRPT queue, the smallest nonzero residual service time decreases at rate one until either it becomes zero or a job arrives that has a smaller initial service time, at which time the rate changes to zero and the new smallest nonzero residual service time begins decreasing at rate one. We adopt the convention that in case of a tie, the residual service time of the job that arrived first (that is, the job with smaller index) begins decreasing at rate one.
For $j \in \mathbb{N}$ and $t \in [0, \infty)$, let $w_j(t)$ denote the residual service time of job $j$. By convention, for $j \in \mathbb{N}$ and $t \in [0, T_j]$,
Furthermore, for $j \in \mathbb{N}$, if $D_j$ denotes the time at which job $j$ completes service and departs the system, then $w_j(t) = 0$ for all $t \ge D_j$. On $(T_j, D_j)$, $w_j(\cdot)$ is nonincreasing. In particular, $w_j(\cdot)$ decreases at rate one when job $j$ is in service, and is constant when job $j$ is not in service. See [4] for a detailed definition of the residual service times.
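The evolution of the residual service times described above can be sketched as a small event-driven simulation. The Python code below is a minimal illustration under stated assumptions, not the construction used in [4]: arrival and service times are given as a deterministic list, jobs with arrival time 0 stand in for the initial jobs, and the function name `simulate_srpt` and the sample data are hypothetical.

```python
import math

def simulate_srpt(jobs, t_end):
    """Residual service times at time t_end under SRPT.

    jobs: list of (arrival_time, service_time) pairs sorted by arrival time;
    the position in the list is the job index. Jobs with arrival time 0 play
    the role of initial jobs. Returns {job index: residual service time} for
    the jobs still in the system at t_end.
    """
    if not math.isfinite(t_end) or t_end < 0:
        raise ValueError("t_end must be finite and nonnegative")
    residual = {}      # jobs currently in the system
    now = 0.0
    next_job = 0
    while True:
        next_arrival = jobs[next_job][0] if next_job < len(jobs) else math.inf
        horizon = min(next_arrival, t_end)
        # Serve until the next arrival or t_end, whichever comes first.
        while residual and now < horizon:
            # Smallest residual time; ties go to the smaller index.
            j = min(residual, key=lambda i: (residual[i], i))
            if residual[j] <= horizon - now:
                now += residual[j]
                del residual[j]              # job j completes and departs
            else:
                residual[j] -= horizon - now
                now = horizon
        now = horizon
        if next_arrival > t_end:
            return residual
        residual[next_job] = jobs[next_job][1]   # admit the arriving job
        next_job += 1

# Hypothetical data: job 0 is preempted by the shorter job 1 at time 1.
jobs = [(0.0, 3.0), (1.0, 1.0), (1.5, 4.0)]
state = simulate_srpt(jobs, 2.5)
```

In this example job 1 departs at time 2, after which job 0 resumes service, so at time 2.5 the system holds jobs 0 and 2 with residual service times 1.5 and 4.0. The total remaining work, 5.5, equals the 8 units of arrived work minus the 2.5 units of time the server was busy, as work conservation requires.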

Diffusion Limit Theorem
We first define a sequence of systems over which the limit is taken. Let $\mathcal{R}$ be a sequence of positive real numbers increasing to infinity. Consider an $\mathcal{R}$-indexed sequence of stochastic models, each defined as in Section 2. For each $r \in \mathcal{R}$, there is an initial condition $\mathcal{Z}^r(0)$; there are stochastic primitives $E^r(\cdot)$ and $\{v^r_k\}_{k \in \mathbb{N}}$ with parameters $\alpha^r$, $a^r$, $\nu^r$, $\beta^r$, $b^r$, and $\rho^r$, and an arrival process $A^r(\cdot)$ with arrival times $\{T^r_j\}_{j \in \mathbb{N}}$; there is a corresponding measure valued load process $\mathcal{V}^r(\cdot)$; there is a state descriptor $\mathcal{Z}^r(\cdot)$. The stochastic elements of each model are defined on a probability space $(\Omega^r, \mathcal{F}^r, \mathbf{P}^r)$ with expectation operator $\mathbf{E}^r$. A diffusion scaling (or central limit theorem scaling) is applied to each model in the $\mathcal{R}$-indexed sequence as follows. For each $r \in \mathcal{R}$ and $t \in [0, \infty)$, let
Also, for each $r \in \mathcal{R}$ and $t \in [0, \infty)$, let
Let $\alpha, a \in (0, \infty)$ and define $\alpha(t) = \alpha t$ for all $t \in [0, \infty)$. Let $\nu$ be a probability measure such that
For the sequence of stochastic primitives we make the following asymptotic assumptions. For the exogenous arrival processes, assume that as $r \to \infty$, $\alpha^r \to \alpha$, $a^r \to a$, and
$$\widehat{E}^r(\cdot) \Rightarrow E^*(\cdot), \tag{3.4}$$
where $E^*(\cdot)$ is a Brownian motion starting from zero with drift zero and variance $a^2 \alpha^3$ per unit time. This implies a functional weak law of large numbers for the exogenous arrival processes. In particular, it implies that as $r \to \infty$,
$$\bar{E}^r(\cdot) \Rightarrow \alpha(\cdot),$$
where $\bar{E}^r(t) = E^r(r^2 t)/r^2$ for all $t \in [0, \infty)$ and $r \in \mathcal{R}$. For the sequence of service time distributions, assume that as $r \to \infty$,
Then $\beta^r \to \alpha$, $\rho^r \to 1$, and $b^r \to b$ as $r \to \infty$. It also follows that $\{\nu^r, r \in \mathcal{R}\}$ satisfies a Lindeberg-Feller condition, i.e., for all $\varepsilon > 0$,
In addition, assume the heavy traffic condition that for some $\gamma \in \mathbb{R}$,
Finally, if $x^* < \infty$, also assume that for all $x > x^*$,
$$\lim_{r \to \infty} r \langle \chi 1_{(x,\infty)}, \nu^r \rangle = 0. \tag{3.8}$$
For the sequence of diffusion scaled initial conditions $\{\widehat{\mathcal{Z}}^r(0) : r > 0\}$, assume that as $r \to \infty$,
for some random variable $W^*_0$. Then from (3.4), (3.5) (which implies (3.6)), (3.7), (3.9), and the fact that SRPT is a work conserving discipline, it follows that, as $r \to \infty$,
$$\widehat{W}^r(\cdot) \Rightarrow W^*(\cdot), \tag{3.10}$$
where $W^*(\cdot)$ is a reflected Brownian motion with initial value $W^*(0) \overset{d}{=} W^*_0$, variance $(a^2 + b^2)\alpha$ per unit time, and drift $-\gamma$ (see [7]). Further assume that, as $r \to \infty$, $\widehat{\mathcal{Z}}^r(0) \Rightarrow$
Note that (3.11) implies that $\widehat{\mathcal{Z}}^r(0)$ converges in distribution to a random measure that is almost surely an invariant state (see [4, Corollary 3.7]).
This result, in the first case when $x^* < \infty$, is a continuous analog of the diffusion limit result for a multi-class static buffer priority queue, where in the diffusion limit work only resides in the lowest priority class [19]. In an SRPT queue, those jobs with larger service times receive lower priority. Hence, an informal restatement of the first case is that in the diffusion limit the work concentrates in jobs with the largest possible service time, i.e., the lowest priority. The case when $x^* = \infty$ is the natural extension of this result when there is no largest possible service time. Indeed, for the work to get pushed out to infinity on diffusion scale while the diffusion scaled workload process converges, the queue length must necessarily tend to zero.

Proofs
Throughout this section we assume that (3.4), (3.5), (3.7), (3.8), (3.9), and (3.11) hold. In Section 4.1, we state a well known result and use it to derive three diffusion limit results to be used in the sequel. In Section 4.2, Theorem 3.1 is proved.

Diffusion Limits for Load Related Processes
The following result is well known and follows from [13, Theorem 3.1] used to extend [2, Section 17.3].

Proposition 4.1 For each $r \in \mathcal{R}$, let $\{x^r_k\}_{k=1}^{\infty}$ be an independent and identically distributed sequence of nonnegative random variables on $(\Omega^r, \mathcal{F}^r, \mathbf{P}^r)$ with finite mean $\mu^r$ and standard deviation $\sigma^r$, that is independent of $E^r(\cdot)$. Suppose that for some finite nonnegative constants $\mu$ and $\sigma$, $\mu^r \to \mu$ and $\sigma^r \to \sigma$ as $r \to \infty$. Further assume that for each $\varepsilon > 0$,
For $r \in \mathcal{R}$, $n \in \mathbb{N}$, and $t \in [0, \infty)$, let
$$X^r(n) = \sum_{k=1}^{n} x^r_k \quad \text{and} \quad \widehat{X}^r(t) = \frac{X^r(\lfloor r^2 t \rfloor) - \lfloor r^2 t \rfloor \mu^r}{r}.$$
Then as $r \to \infty$,
where $E^*(\cdot)$ is given by (3.4) and $X^*(\cdot)$ is a Brownian motion starting from zero with zero drift and variance $\sigma^2$ per unit time, that is independent of $E^*(\cdot)$.
where for each $r \in \mathcal{R}$ and $t \in [0, \infty)$, $\alpha^r(t) = \alpha^r t$.
Note that the limiting process $X^*(\alpha(\cdot)) + \mu E^*(\cdot)$ in Proposition 4.1 is a Brownian motion starting from zero with zero drift and variance $\alpha \sigma^2 + \mu^2 \alpha^3 a^2$ per unit time. We apply this proposition to three processes of interest here, which we refer to respectively as the total load, the truncated load, and the tail load processes. For $r \in \mathcal{R}$ and $t \in [0, \infty)$, let
Then, for $r \in \mathcal{R}$, let the total load and scaled total load processes be given respectively by
Then, for $r \in \mathcal{R}$,
From (3.5) and Proposition 4.1, it follows that as $r \to \infty$,
where $V^*(\cdot)$ is a Brownian motion starting from zero with zero drift and variance $\alpha(a^2 + b^2)$ per unit time.
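The two variance expressions above can be checked against each other numerically. The Python sketch below evaluates $\alpha \sigma^2 + \mu^2 \alpha^3 a^2$ under the assumption that, for the total load, the summands in Proposition 4.1 are the service times, with limiting mean $\mu = 1/\alpha$ (critical load) and limiting standard deviation $\sigma = b$. This reading, the function name, and the parameter values are assumptions made for illustration.

```python
import math

def composite_variance(alpha, a, mu, sigma):
    """Variance per unit time of X*(alpha(.)) + mu E*(.).

    X*(alpha(.)) contributes alpha * sigma^2 and mu E*(.) contributes
    mu^2 * a^2 * alpha^3; the two terms add because X* and E* are independent.
    """
    return alpha * sigma**2 + mu**2 * alpha**3 * a**2

# Hypothetical parameter values.
alpha, a, b = 2.0, 0.5, 1.5

# Assumed total load case: mu = 1 / alpha and sigma = b.
total_load_variance = composite_variance(alpha, a, mu=1.0 / alpha, sigma=b)
```

Under these assumptions the general expression reduces to $\alpha b^2 + \alpha a^2 = \alpha(a^2 + b^2)$, which agrees with the variance stated for $V^*(\cdot)$.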
Next we consider the truncated load process. For $r \in \mathcal{R}$ and $x \in \mathbb{R}_+$, let
Then, for $r \in \mathcal{R}$ and $x \in \mathbb{R}_+$,
Note that (3.5) implies that for any $\nu$-continuity point $x \in \mathbb{R}_+$, as $r \to \infty$,
Hence (3.5) and Proposition 4.1 imply that for any $\nu$-continuity point $x \in \mathbb{R}_+$, as $r \to \infty$,
where $V^*_x(\cdot)$ is a Brownian motion starting from zero with drift zero and finite variance per unit time.
Finally, for each $r \in \mathcal{R}$ and $x \in \mathbb{R}_+$, we consider the tail load process
Note that (3.5) (which implies (4.1)) also implies that for any $\nu$-continuity point $x \in \mathbb{R}_+$, as $r \to \infty$,
Hence, (3.5) and Proposition 4.1 imply that for any $\nu$-continuity point $x \in \mathbb{R}_+$, as $r \to \infty$,
where $T^*_x(\cdot)$ is a Brownian motion starting from zero with drift zero and variance $s_x^2$ per unit time. Here,
Notice that if $x^* < \infty$ and $x > x^*$, then $x$ is a $\nu$-continuity point and $\langle 1_{(x,\infty)}, \nu \rangle = 0$. Hence, if $x > x^*$, then in (4.4), $s_x^2 = 0$, i.e.,

Proof of the Main Theorem
Here we use the diffusion limits for the load related processes derived in Section 4.1 to prove the main result. We use the result about the scaled truncated load process to prove that, on diffusion scale, the truncated queue length tends to zero when the truncation is below $x^*$, the supremum of the support of the limiting service time distribution. Then we use the result about the scaled tail load processes to prove that, on diffusion scale, the queue length above $x$ tends to zero when $x$ is above $x^*$. Then these two results are put together to show that in the diffusion limit, the queue mass concentrates at $x^*$. For $r \in \mathcal{R}$ and $x \in \mathbb{R}_+$, let

Proof. Since $\widehat{Z}^r_y(\cdot) \le \widehat{Z}^r_x(\cdot)$ for each $0 < y \le x < x^*$, it suffices to verify (4.8) for $x \in (0, x^*)$ that are $\nu$-continuity points. Fix such an $x$. For $r \in \mathcal{R}$ and $t \in [0, \infty)$, let $\tau^r_x(t) = \sup\{s \in [0, t] : \widehat{Z}^r_x(s) = 0\}$, which is taken to be zero if $\{s \in [0, t] : \widehat{Z}^r_x(s) = 0\} = \emptyset$. Then, for $r \in \mathcal{R}$ and $t \in [0, \infty)$, (4.9)
First, we obtain an upper bound on $\widehat{Z}^r_x(\tau^r_x(\cdot))$. Fix $r \in \mathcal{R}$ and $t \in [0, \infty)$. Either $\tau^r_x(t) = 0$ or $\tau^r_x(t) > 0$. If $\tau^r_x(t) = 0$, then $\widehat{Z}^r_x(\tau^r_x(t)) = \widehat{Z}^r_x(0)$. Otherwise, $\tau^r_x(t) > 0$. If $\widehat{Z}^r_x(\tau^r_x(t)) = 0$, then any nonnegative upper bound suffices. Hence, without loss of generality, we also assume that $\widehat{Z}^r_x(\tau^r_x(t)) > 0$. Then $\widehat{Z}^r_x(\tau^r_x(t)-) = 0$ and $\widehat{Z}^r_x(\tau^r_x(t)) > 0$. Hence, in the $r$th system at time $r^2 \tau^r_x(t)$, the exogenous arrival process jumps and at least one of the entering jobs has an initial service time in $[0, x]$, and/or the residual service time of the job in service just before time $r^2 \tau^r_x(t)$ decreases to $x$. Therefore,
$$\widehat{Z}^r_x(\tau^r_x(t)) \le \widehat{E}^r(\tau^r_x(t)) - \widehat{E}^r(\tau^r_x(t)-) + \frac{1}{r}.$$
Combining the bounds for $\tau^r_x(t) = 0$ or $\tau^r_x(t) > 0$ gives
where we adopt the convention $\widehat{E}^r(0-) = \widehat{E}^r(0) = 0$. Hence, for $r \in \mathcal{R}$ and $t \in [0, \infty)$,
We have that for each $r \in \mathcal{R}$ and $t \in [0, \infty)$,
where we adopt the convention that $\widehat{E}^r(t) = \widehat{E}^r(0)$ if $t < 0$.
Therefore, for each $r \in \mathcal{R}$ and $t \in [0, \infty)$,
By (3.4), the fact that $E^*(\cdot)$ is continuous almost surely, and (4.12), it follows that, as $r \to \infty$,
(see [2, Section 17]). Hence, as $r \to \infty$,
Hence, all that remains is to prove (4.11). For this, for each $r \in \mathcal{R}$ and $t \in [0, \infty)$, we exploit the behavior of $W^r_x(\cdot)$ (defined in (4.6)) on time intervals of the form $(r^2 \tau^r_x(t), r^2 t]$ to derive an expression that relates $\widehat{W}^r_x(t)$ and $\theta^r_x(t)$. In particular, since for each $r \in \mathcal{R}$ and $t \in [0, \infty)$, $Z^r_x(s) > 0$ for all $s \in (r^2 \tau^r_x(t), r^2 t]$ and the service discipline is SRPT, it follows that for each $r \in \mathcal{R}$ and $t \in [0, \infty)$,
Using the same line of reasoning that gave rise to (4.10), for $r \in \mathcal{R}$ and $t \in [0, \infty)$,
$$\widehat{W}^r_x(t) \le \widehat{W}^r_x(0) + \widehat{V}^r_x(t) - \widehat{V}^r_x(\tau^r_x(t)-) + \frac{x}{r} + \left(\alpha^r \langle \chi 1_{[0,x]}, \nu^r \rangle - 1\right) r\, \theta^r_x(t).$$