Queueing models of RAID systems with maxima of waiting times

https://doi.org/10.1016/j.peva.2006.11.002

Abstract

A queueing model is developed that approximates the effect of synchronizations at parallel service completion instants. Exact results are first obtained for the maxima of independent exponential random variables with arbitrary parameters, and this is followed by a corresponding approximation for general random variables, which reduces to the exact result in the exponential case. This approximation is then used in a queueing model of RAID (Redundant Array of Independent Disks) systems, in which accesses to multiple disks occur concurrently and complete only when every disk involved has completed. We consider the two most common RAID variants, RAID0-1 and RAID5, as well as a multi-RAID system in which they coexist. This can be used to model adaptive multi-level RAID systems in which the RAID level appropriate to an application is selected dynamically. The random variables whose maximum has to be computed in these applications are disk response times, which are modelled by the waiting times in M/G/1 queues. Computing the mean value of their maximum requires the second moment of queueing time, which we obtain in terms of the third moment of disk service time, itself a function of seek time, rotational latency and block transfer time. Sub-models for these quantities are investigated and calibrated individually in detail. Validation against a hardware simulator shows good agreement at all traffic intensity levels, including the threshold for practical operation above which performance deteriorates sharply.

Introduction

Traditional queueing networks, e.g. product-form networks, cannot model synchronizations at parallel service completion instants. We approximate this effect in a queueing model of RAID (Redundant Array of Independent Disks) systems, derived by considering the explicit flow of control in the physical architecture. The contention in each parallel phase of processing is represented using an approach based on the M/G/1 queue. The synchronization time is then the maximum of a collection of M/G/1 queue sojourn times (also called waiting times or response times). We assume these sojourn times to be independent, taking them first to be exponential random variables, as in an M/M/1 queue, and then to be generally distributed.

Based on initial work in [12], Section 2 derives an exact recurrence formula for the Laplace transform of the probability density function of the maximum of a set of independent exponential random variables, from which the mean and higher moments follow. In the special case that all the constituent exponential distributions are identical, the well-known result for the mean value of the maximum in terms of harmonic numbers follows immediately. The recurrence is then generalized to approximate the mean of the maximum of independent, generally distributed random variables. This simplifies to the previous exact result when the constituent distributions are exponential but in general requires their second moments. The accuracy of the approximation is assessed by comparison with simulation results obtained for Erlang and Pareto constituent distributions, which typify the cases of small and large variances respectively.
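As a small cross-check on the exponential case described above, the mean of the maximum can also be computed by inclusion-exclusion over subsets of the rates. This is a sketch only: it is not the recurrence of Section 2 (which is not reproduced here), its cost grows as 2^n, and the function names are illustrative. For identical rates it should agree with the harmonic-number result.

```python
from itertools import combinations


def mean_max_exponentials(rates):
    """Exact mean of the maximum of independent Exp(rate_i) variables.

    Inclusion-exclusion over non-empty subsets S of the rates:
        E[max] = sum_S (-1)^(|S|+1) / sum_{i in S} rate_i
    Exponential cost in len(rates); intended for small n only.
    """
    if not rates or any(r <= 0 for r in rates):
        raise ValueError("rates must be a non-empty sequence of positive numbers")
    total = 0.0
    for size in range(1, len(rates) + 1):
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(rates, size):
            total += sign / sum(subset)
    return total


def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))
```

With n identical rates `lam`, `mean_max_exponentials([lam] * n)` should equal `harmonic(n) / lam` up to rounding.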

RAID storage systems and existing analytical models are briefly reviewed in Section 3 and the results of Section 2 are then used in our new multi-level RAID performance model in Section 4. This model assumes Poisson external requests but allows general disk seek, latency and transfer times. We determine the higher moments of the queueing time in the M/G/1 queue by differentiating its Laplace–Stieltjes transform at the origin. The second moment is then given in terms of the third moment of the service time, which is obtained in turn from the assumed distributions of seek time, rotational latency and block transfer time. Detailed studies of their principles of operation show that RAID levels 0–1 and 5 produce quite different demands on the disks in the array for each type of input-output access. This difference is amplified in the corresponding queueing times; it is seen in both the explicit simulation of the physical systems’ operation and in the calculation of mean and variance of queueing time in the analytical model.
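The chain of moments described above can be sketched numerically. The code below assumes a FCFS M/G/1 queue, uses the standard formulas for the first two moments of queueing time (the second involving the third moment of service time), and builds the service-time moments from independent seek, latency and transfer components. The component distributions and all numerical values are hypothetical placeholders, not the calibrated sub-models of the paper.

```python
from math import comb


def moments_of_sum(*components):
    """First three raw moments of a sum of independent random variables.

    Each component is a tuple (E[X], E[X^2], E[X^3]).
    Uses E[(A+C)^k] = sum_j comb(k, j) E[A^j] E[C^(k-j)].
    """
    acc = (1.0, 0.0, 0.0, 0.0)  # raw moments 0..3 of the constant 0
    for m1, m2, m3 in components:
        c = (1.0, m1, m2, m3)
        acc = tuple(
            sum(comb(k, j) * acc[j] * c[k - j] for j in range(k + 1))
            for k in range(4)
        )
    return acc[1:]


def uniform_moments(upper):
    """Raw moments of a Uniform(0, upper) variable."""
    return (upper / 2, upper**2 / 3, upper**3 / 4)


def constant_moments(value):
    """Raw moments of a deterministic value."""
    return (value, value**2, value**3)


def mg1_moments(arrival_rate, service_moments):
    """Mean and second moment of queueing and response time (FCFS M/G/1)."""
    s1, s2, s3 = service_moments
    rho = arrival_rate * s1
    if not 0 <= rho < 1:
        raise ValueError("queue is unstable: utilisation must be below 1")
    w1 = arrival_rate * s2 / (2 * (1 - rho))
    w2 = 2 * w1**2 + arrival_rate * s3 / (3 * (1 - rho))
    # Under FCFS, queueing time is independent of the job's own service time.
    t1 = w1 + s1
    t2 = w2 + 2 * w1 * s1 + s2
    return {"rho": rho, "wait": (w1, w2), "response": (t1, t2)}


# Hypothetical disk, times in milliseconds (assumed values for illustration).
service = moments_of_sum(
    uniform_moments(10.0),   # seek time, assumed uniform
    uniform_moments(8.0),    # rotational latency, assumed uniform
    constant_moments(1.0),   # block transfer time, assumed constant
)
result = mg1_moments(0.05, service)  # 0.05 requests per ms
```

For exponential service the response-time moments returned here reduce to those of an exponential variable with rate μ − λ, which gives a convenient sanity check.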

The accuracy of the model is assessed in Section 5 by comparing the analytical predictions with a simulation of the actual system at the operational level. The quantitative results are presented as graphs of mean system response time against traffic intensity, showing generally good agreement and hence providing justification for our approach. The validity of the assumption of Poisson arrivals was tested numerically by comparing with simulation models with non-Poisson input; the simulation output (mean response time) shows little change. This is consistent with the commonly observed robustness of the Poisson assumption for external arrivals. In addition, in Section 6 we further investigate possible causes of inaccuracy in the model’s approximations. We isolate two possible sources, apart from the precision of the mean–max algorithm assessed in Section 2.4: (a) the representation of the delay at a single disk as the response time in an M/G/1 queue; and (b) the effect of assuming such response times are independent when arrivals actually occur simultaneously. The paper concludes in Section 7 with a summary of the present contribution, open questions and suggestions for further research.

Section snippets

Maximum of random variables

Suppose a task forks into a number of subtasks that are processed in parallel independently. The task’s completion instant is that of the last subtask to complete processing, whereupon the subtasks combine (join) to re-form the original task. The fork-join time of the task, i.e. the time elapsed between the fork instant and the join instant, is therefore the maximum of the subtasks’ processing times. In a Markovian environment, we derive the following:

Proposition 1

The maximum of n independent, negative

RAID storage system

A RAID storage system consists of a disk system manager and a collection (array) of independent disks. The disk system manager is a software component; it receives requests from the multiple system users. These requests are considered logical because they are independent of the physical configuration of the storage system. Requests may arrive from different users at various rates λj. The disk system manager subdivides the data into blocks called stripe units and distributes them across the

The multi-level RAID analytical model

Our aim is to determine the mean logical request response time for data stored according to RAID0-1 and RAID5 patterns in a single multi-level RAID storage system. We consider relevant hardware parameters and requests’ execution schedules, for which we give task graphs to highlight the one or two synchronization points. We then determine the mean logical request response time using the fork-join model of Section 2 in an M/G/1 queueing context.

Results and discussion

In order to validate our model and assess its accuracy, we developed a detailed event-driven simulator. This simulator is written in C and is composed of three main parts. The first part is a logical request generator, which uses standard random number generation functions to produce inter-arrival times for the logical requests with arbitrary probability distributions. The second part is a logical to physical mapping, which contains all the physical request generation functions. This part deals

Sources of approximation

We have already investigated in Section 2.4 one possible source of inaccuracy in our model, namely the mean–max approximation of Section 2.3, which is only exact for parallel exponential delays. We concluded that only for coefficients of variation (ratio of standard deviation to mean) much less than one is the approximation likely to be poor. Fortunately this is the least likely scenario, file access times being notoriously variable, sometimes even having heavy-tailed distributions.
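To make the coefficient-of-variation criterion concrete, the sketch below evaluates it for the two families used as test cases, Erlang and Pareto, from their standard closed forms. The function names and the parameter values chosen are illustrative assumptions, not values taken from the paper's experiments.

```python
from math import sqrt


def cv_erlang(k):
    """Coefficient of variation of an Erlang-k distribution: 1/sqrt(k)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 1.0 / sqrt(k)


def cv_pareto(alpha):
    """Coefficient of variation of a Pareto distribution with shape alpha.

    The variance is finite only for alpha > 2, where
    CV = 1 / sqrt(alpha * (alpha - 2)).
    """
    if alpha <= 2:
        raise ValueError("variance is infinite for alpha <= 2")
    return 1.0 / sqrt(alpha * (alpha - 2))


low_variance = cv_erlang(16)    # well below one
high_variance = cv_pareto(2.2)  # above one
```

Erlang-k always has a coefficient of variation of at most one, decreasing as k grows, whereas a Pareto distribution exceeds one whenever its shape parameter lies between 2 and 1 + √2.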

However,

Conclusion

We have developed a new, efficient approximation to compute the mean duration of certain synchronized fork-join operations. The approximation is exact in the case of exponentially distributed constituent delays, where exact results were also obtained for higher moments and the Laplace transform of the duration’s probability density function itself. Using these results, quite intricate analytical models, based on simple queueing theory, were derived, which take into account the detailed

References (23)

  • S. Chen et al., The design and evaluation of RAID5 and parity striping disk array architecture, Journal of Parallel and Distributed Computing (1993)
  • A. Merchant et al., An analytical model of reconstruction time in mirrored disks, Performance Evaluation (1994)
  • E. Bachmat, J. Schindler, Analysis of methods for scheduling low priority disk drive tasks, in: Proc. ACM Sigmetrics,...
  • The RAID Advisory Board, The RAIDBOOK: A Source Book for RAID Technology (1993)
  • H. Bohnenkamp et al., The mean value of the maximum
  • S. Chen et al., A performance evaluation of RAID architecture, IEEE Transactions on Computers (1997)
  • S. Chen, Design, modeling and evaluation of high performance, Ph.D. Thesis, University of Massachusetts, September...
  • G. Gibson, D.A. Patterson, R.H. Katz, A case for redundant arrays of inexpensive disks (RAID), in: Proc. SIGMOD...
  • G. Gibson, D.A. Patterson, P.M. Chen, R.H. Katz, Introduction to redundant arrays of inexpensive disks (RAID), in: IEEE...
  • J. Xu, E. Varki, A. Merchant, X. Qiu, An integrated performance model of disk arrays, in: Proc. International Symposium...
  • J. Xu et al., Issues and challenges in the performance analysis of real disk arrays, IEEE Transactions on Parallel and Distributed Systems (2004)

Peter Harrison is currently a professor of computing science at Imperial College, London, where he became a lecturer in 1983. He graduated from Christ’s College Cambridge as a Wrangler in Mathematics in 1972 and went on to gain Distinction in Part III of the Mathematical Tripos in 1973, winning the Mayhew prize for Applied Mathematics. He obtained his Ph.D. in Computing Science at Imperial College in 1979. He has researched into stochastic performance modelling and algebraic program transformation for some twenty years, visiting IBM Research Centres during two summers. He has written two books, had over 150 research papers published and held a series of research grants, both national and international. The results of his research have been exploited extensively in industry, forming an integral part of commercial products such as Metron’s Athene Client–Server capacity planning tool. Currently, his main research interests are stochastic process algebra, where he has developed the RCAT methodology for finding separable solutions, response time analysis and optimization of fluid-based models. He has taught a range of subjects at undergraduate and graduate level, including Operating Systems: Theory and Practice, Functional Programming, Parallel Algorithms and Performance Analysis.

Soraya Zertal is a Lecturer in the Architecture and Parallelism research group in the PRiSM Laboratory at the University of Versailles, France. She obtained her Ph.D. in Computing Science from the University of Versailles in 2000, a Master's degree from the University of Versailles in 1996 and an engineering degree from the University of Constantine in 1993. Her research interests include parallel architecture and storage systems, specifically algorithms for data placement, and performance modelling using both simulation and analytical methods.
