Distributed gang scheduling in networks of heterogenous workstations
Introduction and background
Concurrent processing via distributed systems is currently receiving considerable attention [1, 2, 3, 4]. Its feasibility is supported by a variety of software systems [5] and by the large number of problems that can be solved by distributed computing [6]. The Parallel Virtual Machine (PVM), P4, Linda, and others [5] are examples of software systems that can be used to execute parallel application programs onto a distributed
Assumptions and terminology
In this section, we present and discuss the assumptions and the related terminology used in the scheduling algorithm.
The number of processes of an application or job, which we refer to in this paper as Virtual Processors (VPs), defines the VP-count X. Scheduling in a heterogeneous environment should assign each workstation a number of VPs proportional to its processing power. The term 'Siblings' is used to indicate all the concurrent VPs of a job. In the discussion of workstation scheduling,
The scheduling model
The interaction properties among the VPs of a job influence the choice of scheduling strategy. Fine-grain applications that exhibit close cooperation among their VPs have been shown to perform well when executed under gang scheduling [14]. Gang scheduling is a technique in which all the VPs of an application are scheduled on the available processors to run at the same time. Multiple jobs are scheduled by time-sharing the workstations. We consider scheduling fine-grain applications where the VPs exhibit close interactions
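The time-sharing idea behind gang scheduling can be illustrated with a minimal sketch. It is not the scheduler proposed in the paper: the function name `gang_schedule`, the job names, and the simple round-robin ordering are all assumptions made for illustration. Each time slot is given to exactly one job, and every VP of that job runs during the slot.

```python
from itertools import cycle, islice

def gang_schedule(jobs, num_slots):
    """Build a round-robin timeline of time slots (hypothetical helper).

    jobs maps a job name to its placement, i.e. a dict of
    workstation name -> number of VPs of that job on the workstation.
    Each slot belongs to one job, so all of its sibling VPs run together.
    """
    timeline = []
    # Cycle through the jobs in a fixed order, one job per time slot.
    for slot, job in enumerate(islice(cycle(sorted(jobs)), num_slots)):
        timeline.append((slot, job, jobs[job]))
    return timeline

# Two jobs sharing two workstations (illustrative values only).
jobs = {
    "A": {"ws1": 2, "ws2": 1},
    "B": {"ws1": 1, "ws2": 1},
}
for slot, job, placement in gang_schedule(jobs, 4):
    print(slot, job, placement)
```

In this sketch a job never has some of its VPs running while its siblings are waiting, which is the property that benefits closely interacting VPs.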
Allocation algorithms
The allocation of an application to a set of workstations with unequal processing powers is performed in two steps. First, each workstation in the initial set is assigned a number of VPs proportional to its processing capacity such that the turnaround time (TAT) of the application is minimized. Two algorithms are introduced to produce the optimal number of VPs per workstation for the minimal TAT: the Assignment Equation (AE) and the Rounding Algorithm (RA). Second, further optimization is
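As a rough illustration of the first step, the sketch below splits a VP-count across workstations in proportion to their processing power and then rounds to whole VPs. It does not reproduce the paper's Assignment Equation or Rounding Algorithm, whose details are not given here; the largest-remainder rounding, the function names, and the TAT estimate (VPs divided by power, taking the slowest workstation) are assumptions made for illustration, and powers are assumed to be positive integers.

```python
from fractions import Fraction

def proportional_vp_counts(vp_count, powers):
    """Split vp_count VPs across workstations in proportion to powers.

    Hypothetical helper: uses largest-remainder rounding, which is an
    assumed stand-in and not the paper's Rounding Algorithm.
    """
    total = sum(powers)
    # Exact proportional shares, kept as fractions to avoid float drift.
    shares = [Fraction(vp_count * p, total) for p in powers]
    counts = [int(s) for s in shares]  # whole part of each share
    leftover = vp_count - sum(counts)
    # Give the remaining VPs to the largest fractional remainders.
    order = sorted(range(len(powers)),
                   key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

def turnaround_estimate(counts, powers):
    """Assumed TAT model: the job finishes when the most heavily
    loaded workstation (VPs per unit of power) finishes."""
    return max(Fraction(c, p) for c, p in zip(counts, powers))

powers = [3, 2, 1]                      # relative processing powers
counts = proportional_vp_counts(10, powers)
print(counts, turnaround_estimate(counts, powers))
```

The rounding step matters because a fractional share cannot be assigned; under the assumed TAT model, different roundings of the same shares can yield different turnaround estimates.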
Static allocation
In this section, we present algorithms corresponding to operations that are not restricted by workstation architecture. An example of such an operation is the spawning of a new job, which can be scheduled without architecture restrictions, i.e. statically. The spawn activation calls the Regular Pattern Search (RPS) algorithm, which is presented next.
Dynamic allocation
In this section, we present some allocation algorithms and Bin operations which consider architecture heterogeneity. First, we address the heterogeneity issue.
Comparison with related work
First, we present some of the known scheduling algorithms for distributed environments; then we compare them with the approach proposed in this paper.
The scheduling algorithm of Kremien et al. [12] subdivides the system into domains. Load information is exchanged only among processors in the same domain. Every node independently determines the nodes it includes in its domain. A domain includes all the nodes having opposite load status, i.e. over-loaded and under-loaded nodes. When an
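The domain idea can be sketched as follows, under one reading of the description above: an over-loaded node places under-loaded nodes in its domain, and vice versa. This is an illustrative sketch, not the algorithm of [12]; the thresholds `low` and `high`, the load values, and the function names are assumptions.

```python
def classify(load, low=1, high=3):
    """Label a load value using assumed thresholds (hypothetical)."""
    if load < low:
        return "under"
    if load > high:
        return "over"
    return "normal"

def build_domain(node, loads, low=1, high=3):
    """Return the nodes whose load status is opposite to that of node.

    loads maps node name -> load as currently known to node.
    A node with normal load has no opposite status, so its domain is empty.
    """
    own = classify(loads[node], low, high)
    opposite = {"over": "under", "under": "over"}.get(own)
    if opposite is None:
        return set()
    return {n for n, load in loads.items()
            if n != node and classify(load, low, high) == opposite}

loads = {"a": 5, "b": 0, "c": 2, "d": 0}   # illustrative load values
print(build_domain("a", loads))  # under-loaded partners of over-loaded "a"
print(build_domain("b", loads))  # over-loaded partners of under-loaded "b"
```

Because each node computes its own domain from the loads it knows about, no global view of the system is required in this sketch.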
Summary and conclusion
We presented a distributed scheduling algorithm that enables the concurrent execution of applications on networks of non-dedicated heterogeneous workstations. Such networked environments impose several requirements on a scheduling algorithm. The scheduling algorithm must be dynamic, since it executes applications in a non-dedicated environment. The algorithm must also take migrations into consideration; migrations are therefore invoked only at points where they are deemed cost-effective.
Acknowledgements
The first author acknowledges the comments and suggestions made by Jon Walpole and Steve Otto during the sabbatical year spent at the Oregon Graduate Institute of Science and Technology.
References (19)
- T.L. Casavant, J.G. Kuhl, A taxonomy of scheduling in general-purpose distributed computing systems, IEEE Trans....
- M.W. Mutka, Estimating capacity for sharing in a privately owned workstation environment, IEEE Trans. Software...
- V.S. Sunderam, PVM: A framework for parallel distributed computing, Concurrency: Practice and Experience 2 (4) (1990)...
- G.A. Geist, V.S. Sunderam, Network based concurrent computing on the PVM system, Concurrency: Practice and Experience 4...
- C.C. Douglas, T.G. Mattson, M.H. Schultz, Parallel programming systems for workstation clusters, Technical Report 975,...
- J. Boyle, R. Butler, T. Disz, B. Glickfeld, E. Lusk, R. Overbeek, J. Patterson, R. Stevens, Portable Programs for...
- A.S. Tanenbaum, Operating Systems Design and Implementation, Prentice Hall, New Jersey,...
- M.K. Litzkow, M. Livny, M.W. Mutka, Condor – A hunter of idle workstations, in: Proceedings of the Eighth International...
- F. Douglis, J. Ousterhout, Process migration in the Sprite operating system, in: Proceedings of the Seventh International...
Khaled Al-Saqabi received his B.S.E. degree from the University of South Florida, his M.S. degree from the Ohio State University, and his Ph.D. degree from North Carolina State University in 1982, 1985, and 1989, respectively. He is an assistant professor in the Department of Electrical and Computer Engineering at Kuwait University. He was a member of the BLITZEN group at the Micro Electronics Center of North Carolina (MCNC), Research Triangle Park, between 1987 and 1989. He was a visiting research professor at the Oregon Graduate Institute of Science and Technology (OGI) from 9/93 to 9/94. His research interests are distributed operating systems and communication software design.
Mansoor Sarwar is currently an Associate Professor of Electrical Engineering at the University of Portland, Oregon. He earned his undergraduate degree in Electrical Engineering from the University of Engineering and Technology, Lahore, Pakistan, and M.S. and Ph.D. degrees in Computer Engineering from Iowa State University. His current teaching and research interests are in experimental performance evaluation, parallel and distributed computing, operating systems, and engineering education.
Kassem Saleh, born in Beirut, Lebanon, in 1963, received the B.Sc. and M.Sc. degrees in Computer Science and the Ph.D. degree in Electrical Engineering from the University of Ottawa, Canada, in 1985, 1986 and 1991, respectively. He was a computer systems specialist at Bell Canada from 1985 to 1991, then joined Concordia University as an assistant professor for one year. He is currently an assistant professor in the Department of Electrical and Computer Engineering at Kuwait University. He was awarded the IBM Telecommunications Software Scholarship in 1988, the George Franklin Prize for the best paper in 1990 from the Canadian Interest Group on Open Systems (CIGOS), and the Distinguished Young Researcher Award from Kuwait University in 1995. His research and teaching interests include software engineering, distributed system design and communications protocol engineering.