Using coordination to parallelize sparse-grid methods for 3-D CFD problems
Introduction
One of the major challenges in science and technology is the fast numerical solution of partial differential equations. Important examples of such equations are those of fluid mechanics. When partial differential equations are solved numerically, they must be discretized, i.e., their solution, which is a set of functions defined over a domain, is approximated by a set of, say, O(N^d) real numbers, where d is the space dimension of the problem (d=1, 2 or 3). Thus, the original differential equations are transformed into a system of O(N^d) algebraic equations with the aforementioned O(N^d) real numbers as the unknowns. For d=3 the size of the system can be very large. To solve these large systems, various techniques have been developed. Among these, multigrid methods are optimal in the sense that the amount of computational work needed to solve the algebraic system grows only linearly with the number of unknowns; for all other known solution methods, the amount of work grows faster than linearly. For literature on multigrid techniques, see, e.g., Refs. [1], [2], [3], where Ref. [1] is recommended for an elementary introduction.
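The linear-complexity claim rests on the multigrid idea of smoothing the error on a fine grid and correcting the remaining smooth error on recursively coarser grids. As a hedged illustration only (a generic 1-D Poisson model problem, not the 3-D Euler solver of this paper), a recursive V-cycle can be sketched as:

```python
import numpy as np

def v_cycle(u, f, h, n_smooth=3):
    """One V-cycle for -u'' = f on [0,1] with zero Dirichlet boundaries.
    u and f live on a grid of n = 2**k + 1 points with spacing h."""
    n = len(u)
    omega = 2.0 / 3.0
    # Pre-smoothing: a few damped-Jacobi sweeps.
    for _ in range(n_smooth):
        u[1:-1] += omega * (0.5 * (u[:-2] + u[2:] + h * h * f[1:-1]) - u[1:-1])
    if n > 3:
        # Residual r = f + u'' (discrete), restricted by full weighting.
        r = np.zeros(n)
        r[1:-1] = f[1:-1] + (u[:-2] - 2.0 * u[1:-1] + u[2:]) / (h * h)
        m = (n + 1) // 2
        rc = np.zeros(m)
        rc[1:-1] = 0.25 * r[1:n-2:2] + 0.5 * r[2:n-2:2] + 0.25 * r[3:n-1:2]
        # Coarse-grid correction, computed recursively on the coarser grid.
        ec = v_cycle(np.zeros(m), rc, 2.0 * h, n_smooth)
        # Linear interpolation of the correction back to the fine grid.
        e = np.zeros(n)
        e[::2] = ec
        e[1:-1:2] = 0.5 * (e[:-2:2] + e[2::2])
        u += e
    # Post-smoothing.
    for _ in range(n_smooth):
        u[1:-1] += omega * (0.5 * (u[:-2] + u[2:] + h * h * f[1:-1]) - u[1:-1])
    return u
```

Each V-cycle costs O(n) work (the grids shrink geometrically), and a fixed number of cycles suffices for a given accuracy, which is the sense in which multigrid is optimal.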
Novel multigrid techniques to speed up the solution of systems of discrete equations are the so-called sparse-grid techniques; see Ref. [4] and its references. Sparse-grid techniques are very attractive from the viewpoint of computational efficiency, particularly for 3-D problems. The gain in efficiency is achieved through a strong reduction of the number of grid points. Of course, this comes at the expense of numerical accuracy. Fortunately, the sparse-grid-of-grids approach has a better ratio of discrete accuracy to number of grid points [5] than a standard multigrid method (which in turn already performs much better in this sense than a single-grid method).
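To make the reduction in grid points concrete: a regular grid with mesh width 2^-n per direction has (2^n - 1)^d interior points, whereas a standard sparse grid keeps only the hierarchical increments with level sum |l|_1 <= n + d - 1, giving O(2^n n^(d-1)) points. A small counting sketch (standard sparse-grid combinatorics, not code from the paper):

```python
from math import comb

def full_grid_interior(n, d):
    """Interior points of a full d-dimensional grid with mesh width 2**-n."""
    return (2**n - 1) ** d

def sparse_grid_interior(n, d):
    """Interior points (hierarchical basis functions) of the standard
    sparse grid of level n in d dimensions. All multi-indices l >= 1 with
    |l|_1 = k contribute 2**(k - d) points each, and there are
    comb(k - 1, d - 1) such multi-indices."""
    return sum(2**(k - d) * comb(k - 1, d - 1) for k in range(d, n + d))
```

For n = 10 and d = 3, the full grid has roughly 1.07e9 interior points while the sparse grid has only a few tens of thousands, which is the source of the efficiency gain (at some cost in accuracy, as noted above).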
The efficiency of sparse-grid methods can be improved still further; an advantage of these methods is that they lend themselves well to implementation on a parallel computer or a cluster of workstations. In this paper we present the parallel implementation of an existing sparse-grid solution method for the steady, 3-D Euler equations of gas dynamics [6], [7]. Our starting point is a sequential Fortran 77 code for this standard problem. If, for instance, entire subroutines of this code can be plugged into a new parallel structure, the resulting renovated software can take advantage of the improved performance offered by modern parallel computing environments, without rethinking or rewriting the bulk of the existing code [8]. The good parallel computing properties of sparse-grid solution techniques allow us to perform such a coarse-grain restructuring. The restructuring is organized according to a master/worker protocol and essentially consists of picking out the computation subroutines in the original Fortran 77 code and gluing them together with coordination modules written in Manifold. Hardly any rewriting of, or changes to, these subroutines are necessary: within the new structure, they have the same input/output and calling-sequence conventions as they had in the old structure, and they still manipulate the same global data. The Manifold glue modules are separately compiled programs that have no knowledge of the computation performed by the Fortran modules; they simply encapsulate the protocol necessary to coordinate the cooperation of the computation modules running in a parallel computing environment. Manifold is a coordination language developed at CWI (Centrum voor Wiskunde en Informatica) in the Netherlands. It is very well suited for managing complex, dynamically changing interconnections among sets of independent, concurrent, cooperating processes [9], [10].
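The master/worker protocol described above can be sketched in a language-agnostic way. The snippet below is a hypothetical Python analogue (the paper's actual glue is written in Manifold around unchanged Fortran 77 subroutines); worker_task is merely a stand-in for one unchanged computation routine:

```python
from concurrent.futures import ThreadPoolExecutor

def worker_task(grid_id):
    """Stand-in for an unchanged computation routine; in the paper this
    would be a Fortran subroutine relaxing one grid of the sparse family.
    The dummy result returned here is purely illustrative."""
    return grid_id, grid_id * grid_id

def master(grid_ids, n_workers=4):
    """Master side of the protocol: hand independent grids to a pool of
    workers, collect the per-grid results, and return them; the real code
    would then combine/exchange the results between the grids."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return dict(pool.map(worker_task, grid_ids))
```

The key property exploited here is the one claimed in the text: the workers need no knowledge of each other, so the coordination layer (Manifold in the paper, a thread pool in this sketch) is cleanly separated from the computation.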
The rest of this paper is organized as follows. In Section 2, we introduce the discrete equations under consideration. In Section 3, we describe the concept of sparse-grid methods; for this, standard multigrid methods are described first. In Section 4, we briefly describe the sequential implementation of the 3-D CFD code and pay attention to its good parallel computing properties. Next, in Section 5 we show how we can restructure this sequential 3-D software into a parallel code, using the coordination language Manifold. In Section 6, we give an analysis of the speed-up figures in a multi-user single-machine environment and show performance results for the test case of a half-wing in transonic flight: the standard test case of the ONERA M6 wing (Fig. 1) at a far-field Mach number of 0.84 and 3.06° angle of attack. Finally, the conclusion of the paper is in Section 7.
Section snippets
Continuous equations
In this paper, we consider the flow of a perfect, di-atomic gas (e.g., air) in three dimensions (3-D). The unknown quantities that describe the gas flow are the gas velocity components in the three coordinate directions, u, v and w; the gas density ρ; and the gas pressure p. Neglecting friction forces, the gas flow is described by the steady, 3-D Euler equations

∂f(q)/∂x + ∂g(q)/∂y + ∂h(q)/∂z = 0,

in which q is the so-called state vector

q = (ρ, ρu, ρv, ρw, ρe)^T,

with e the sum of internal and kinetic energy, satisfying

p = (γ − 1) ρ (e − (u² + v² + w²)/2).
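For reference, the state vector can be converted back to the primitive variables with the standard perfect-gas relations; the sketch below assumes the usual ratio of specific heats γ = 1.4 for a di-atomic gas:

```python
GAMMA = 1.4  # ratio of specific heats for a perfect di-atomic gas

def primitives(q):
    """Recover (rho, u, v, w, p) from the conservative state
    q = (rho, rho*u, rho*v, rho*w, rho*e), where e is the sum of
    internal and kinetic energy per unit mass."""
    rho, mu, mv, mw, rho_e = q
    u, v, w = mu / rho, mv / rho, mw / rho
    kinetic = 0.5 * (u * u + v * v + w * w)
    # Perfect-gas law: p = (gamma - 1) * rho * (e - kinetic energy)
    p = (GAMMA - 1.0) * (rho_e - rho * kinetic)
    return rho, u, v, w, p
```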
Sparse-grid methods
In summary, by discretizing the flow problem, we create a set of, say, N^d finite volumes Ω_{i,j,k}, N^d gas states q_{i,j,k} and N^d nonlinear equations of the form (3). As mentioned in the introduction, N^d may be very large, particularly in 3-D (d=3). All methods to solve such large systems of equations are iterative: a guessed initial solution is improved step-by-step during the solution process. As also mentioned in the introduction, most iterative methods have the drawback that the rate of
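The step-by-step improvement of a guessed initial solution can be illustrated with a generic nonlinear Gauss–Seidel relaxation, sketched below under the simplifying assumption of one scalar equation per unknown (the actual scheme of Refs. [6], [7] relaxes 5-component gas states per finite volume):

```python
def nonlinear_gauss_seidel(F, x, n_sweeps=50, eps=1e-7):
    """Nonlinear Gauss-Seidel relaxation: visit the unknowns one by one and
    drive each unknown's own equation F(i, x) = 0 toward zero with a few
    scalar Newton steps, immediately reusing the updated values."""
    for _ in range(n_sweeps):
        for i in range(len(x)):
            for _ in range(3):
                fi = F(i, x)
                x[i] += eps                  # finite-difference dF_i/dx_i
                dfi = (F(i, x) - fi) / eps
                x[i] -= eps
                if dfi != 0.0:
                    x[i] -= fi / dfi         # scalar Newton update
    return x
```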
The 3-D CFD Fortran code
It is our experience that the development of a parallel implementation is much easier if a sequential prototype is made available first. Working this way, we can fully concentrate on the algorithmic aspects of our application and need not be occupied with all the ins and outs of parallel programming tools. For the present 3-D CFD algorithm, it quickly becomes clear which parts can run in parallel. The 3-D CFD code we consider in this section is sequential and is based on a data structure which is
Restructuring the 3-D CFD code
In this section we describe the restructuring of the Fortran code, as presented in Section 4, into a parallel application. For the parallelization we use Manifold, a coordination language for managing complex, dynamically changing interconnections among sets of independent, concurrent, cooperating processes [9]. Manifold is based on the IWIM (Idealized Worker Idealized Manager) model of communication [10]. The basic concepts in the IWIM model are processes, events, ports and channels.
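The IWIM concepts can be mimicked in a few lines: processes as threads that only know their own ports, and a channel, set up by a coordinator, that moves units between two ports. This is only a conceptual Python analogue, not Manifold syntax:

```python
import queue
import threading

class Port:
    """An IWIM-style port: an opening through which a process sends or
    receives units, with no knowledge of who is connected to it."""
    def __init__(self):
        self.q = queue.Queue()

def channel(src: Port, dst: Port, n_units):
    """A channel installed by the coordinator: it moves units from src to
    dst; neither endpoint process knows the identity of the other."""
    for _ in range(n_units):
        dst.q.put(src.q.get())

def producer(out: Port):
    for i in range(3):          # the process only writes to its own port
        out.q.put(i)

def consumer(inp: Port, results):
    for _ in range(3):          # the process only reads from its own port
        results.append(inp.q.get())

def coordinate():
    """The coordinator wires ports together without touching the data."""
    out, inp, results = Port(), Port(), []
    threads = [threading.Thread(target=producer, args=(out,)),
               threading.Thread(target=channel, args=(out, inp, 3)),
               threading.Thread(target=consumer, args=(inp, results))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The separation mirrors the restructuring in the text: the producer and consumer play the role of the unchanged Fortran computation modules, and coordinate plays the role of a Manifold glue module.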
The crux of our restructuring is to
Speed-up analysis
A number of experiments were conducted to obtain concrete numerical data to measure the effective speed-up of our parallelization. All experiments were run on a single multi-processor machine in a real contemporary computing environment, i.e., an environment in which it cannot be guaranteed that one is the only user. In such an environment, care should be taken in interpreting speed-up numbers. This is shown in the following multi-user, single-machine analysis, in which we make the following
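The basic quantities of such an analysis are the speed-up S = T_1/T_p and the parallel efficiency E = S/p. A minimal sketch, assuming (as one reasonable choice on a shared machine) that the minimum wall-clock time over repeated runs is the least-perturbed estimate of each timing:

```python
def speedup_and_efficiency(t_serial, t_parallel, n_workers):
    """Compute speed-up S = T1/Tp and efficiency E = S/p from lists of
    repeated wall-clock timings. On a shared (multi-user) machine the
    minimum over runs is taken as the least-perturbed estimate."""
    t1 = min(t_serial)
    tp = min(t_parallel)
    s = t1 / tp
    return s, s / n_workers
```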
Conclusions
One of the promises of sparse-grid techniques, their good parallelization property, has been realized for the computation of a realistic and practically relevant test case from steady gas dynamics. The intrinsically low computational complexity of sparse- and semi-sparse-grid methods, plus the additional gains in computing time through parallelization, make both methods really appealing for very computing-intensive work. (As far as CFD applications are concerned, here one may think of, e.g.,
Acknowledgements
The authors want to thank Farhad Arbab for his suggestions to improve this paper.
References (20)
- et al., Multiple grid and Osher's scheme for the efficient solution of the steady Euler equations, Appl. Numerical Math. (1986)
- W.L. Briggs, A Multigrid Tutorial, SIAM, Philadelphia,...
- W. Hackbusch, Multi-Grid Methods and Applications, Springer, Berlin,...
- P. Wesseling, An Introduction to Multigrid Methods, Wiley, Chichester,...
- P.W. Hemker, Finite volume multigrid for 3D-problems, in: H. Deconinck, B. Koren (Eds.), Euler and Navier–Stokes...
- et al., Multilevel Gauss–Seidel-algorithms for full and sparse grid problems, Comput. (1993)
- P.W. Hemker, B. Koren, J. Noordmans, 3D multigrid on partially ordered sets of grids, Proceedings of the Fifth European...
- B. Koren, P.W. Hemker, P.M. de Zeeuw, Semi-coarsening in three directions for Euler-flow computations in three...
- C.T.H. Everaars, F. Arbab, F.J. Burger, Restructuring sequential Fortran code into a parallel/distributed application,...
- F. Arbab, Coordination of massively concurrent activities, Report CS-R9565, CWI, Amsterdam, 1995. Available on-line at...