A parallel mixed time integration algorithm for nonlinear dynamic analysis

https://doi.org/10.1016/S0965-9978(02)00021-2

Abstract

This paper presents a parallel mixed time integration algorithm formulated by synthesising implicit and explicit time integration techniques. The proposed algorithm is an extension of the mixed time integration algorithms [Comput. Meth. Appl. Mech. Engng 17/18 (1979) 259; Int. J. Numer. Meth. Engng 12 (1978) 1575] that have been successfully employed for solving media-structure interaction problems. The parallel algorithm for the nonlinear dynamic response of structures employing the mixed time integration technique has been devised within the broad framework of domain decomposition. Concurrency is introduced into the algorithm by integrating the interface nodes with an explicit time integration technique and then solving the local submeshes with an implicit algorithm. A flexible parallel data structure has been devised to implement the parallel mixed time integration algorithm, and a parallel finite element code has been developed using the portable Message Passing Interface (MPI) software development environment. Numerical studies have been conducted on PARAM-10000 (an Indian parallel supercomputer) to test the accuracy and the performance of the proposed algorithm. The numerical studies indicate that the proposed algorithm is well suited to parallel processing.

Introduction

Unconditionally stable implicit time integration algorithms, which accurately integrate the low frequency content of the response and successfully damp out the high frequency modes, are ideally suited for structural dynamic applications. For large-scale structural dynamic analysis with nonlinearities, however, implicit time integration algorithms become highly compute intensive, with the equation-solving phase dominating the computational cost. Considerable effort has been made in the past two decades to reduce this cost for different classes of problems while retaining the requisite stability of the algorithms.

However, in recent years the most exciting possibility in algorithm development for nonlinear dynamic analysis has been the emergence of parallel processing machines. In the last few years, significant advances in hardware and software technologies have been made in parallel and distributed computing, including the development of portable software development environments, cost-effective parallel computing using clusters of workstations, and heterogeneous computing. At present, parallel processing as a discipline appears to be fairly mature and ready for the development of serious commercial applications.

Even though there have been significant advances in hardware and the associated software development areas, the development of parallel approaches that allow compute-intensive applications to exploit the latest computing environments is clearly lagging behind. Hence, major research efforts are still essential in this direction.

Moving conventional algorithms (developed for sequential machines) onto parallel hardware is not straightforward, as the programming models required to take advantage of parallel and distributed computer architectures are significantly different from the traditional paradigm for a sequential machine. Implementing an engineering application, besides optimising matrix manipulation kernels for the new computing environment, requires careful consideration of the overall organisation and data structures of the program.

Many researchers have devised algorithms for nonlinear dynamic analysis exploiting parallelism in both explicit and implicit time integration techniques. Explicit time integration algorithms such as the central difference method can easily be moved onto parallel processing machines, as the resulting dynamic equilibrium equations are uncoupled when the mass and damping matrices are diagonal. However, explicit algorithms are only conditionally stable, and the resulting time step restriction prevents them from being applied efficiently to structural dynamic analysis.
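
To make the uncoupling concrete, the sketch below (not taken from the paper; all names are illustrative) shows one central difference step with lumped (diagonal) mass and damping, so that each degree of freedom is advanced by a scalar division rather than a simultaneous equation solve.

```python
import numpy as np

def central_difference_step(m, c, f_int, d_n, d_prev, f_ext, dt):
    """One explicit central difference step (illustrative sketch).

    m, c   : lumped (diagonal) mass and damping stored as 1-D arrays
    f_int  : callable returning the internal (restoring) force for d_n
    d_n    : displacements at t_n; d_prev : displacements at t_{n-1}
    Because m and c are diagonal, the update reduces to an element-wise
    division: the equations are fully uncoupled and trivially parallel.
    """
    lhs = m / dt**2 + c / (2.0 * dt)
    rhs = (f_ext - f_int(d_n)
           + (2.0 * m / dt**2) * d_n
           - (m / dt**2 - c / (2.0 * dt)) * d_prev)
    return rhs / lhs  # d_{n+1}, obtained without any matrix factorisation
```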

Implicit algorithms, on the other hand, are unconditionally stable, but their parallel implementation is not so straightforward. The most time-consuming part of implicit time integration is the simultaneous solution of equations; in fact, implicit analyses parallelise well except for the solution of equations, which must be performed for every time step (and every iteration in the case of nonlinear analysis). Unconditional stability nevertheless makes the implicit algorithms very attractive. Since the effective stiffness matrix is symmetric and positive definite, a number of parallel linear solution techniques can be employed. The difficulty encountered in parallel solution algorithms is the amount of inter-processor communication involved: during the parallel solution phase each processor has to communicate with all other processors, and communication latencies and synchronisation delays increase the communication overheads further. The implicit algorithms can also be reordered to minimise the communication overheads in a parallel implementation.

Considerable efforts have been made to improve the performance of these implicit algorithms in parallel processing environments. No attempt is made here to present a comprehensive review; however, some selected earlier works are outlined to give a flavour of the directions in which parallel approaches for dynamic analysis employing implicit time integration techniques have been devised. Hajjar and Abel [4] employed the implicit Newmark-β constant average acceleration algorithm for the dynamic analysis of framed structures on a network of workstations. For devising parallel algorithms, a domain decomposition strategy was coupled with a preconditioned conjugate gradient (PCG) algorithm for the iterative solution of the interface stiffness coefficient matrix, and a profile storage scheme was employed for storing the global matrices. It was concluded that the PCG algorithm is attractive for parallel processing as it requires less inter-processor communication and allows the workload to be balanced among processors more easily than direct methods. Chiang and Fulton [5] investigated implicit Newmark-type methods with a skyline Cholesky decomposition strategy on the FLEX/32 shared-memory multicomputer and the Intel iPSC Hypercube local-memory computer. It was shown that the shared-database nature of the decomposition algorithm made the FLEX/32 multicomputer a more efficient parallel environment than the Hypercube computer.
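
To illustrate why a PCG solver maps naturally onto distributed memory, the generic Jacobi-preconditioned conjugate gradient sketch below (not the implementation of Ref. [4]) shows that the only operations involving the assembled system are matrix-vector products and dot products, both of which reduce to local work followed by short global reductions.

```python
import numpy as np

def pcg(A, b, tol=1e-8, max_iter=1000):
    """Jacobi-preconditioned conjugate gradient for a symmetric positive
    definite matrix A (serial sketch).  In a domain decomposition setting,
    the product A @ p and the dot products are the only steps that require
    inter-processor communication."""
    x = np.zeros_like(b)
    M_inv = 1.0 / np.diag(A)              # Jacobi (diagonal) preconditioner
    r = b - A @ x
    z = M_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```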

Bennighof and Wu [6] devised domain decomposition methods that permit independent subdomain computation. In these algorithms, independently computed subdomain responses are corrected, using the interface portions of the independent subdomain responses, to obtain the response of the global problem. These corrections are carried out less frequently (say, every fifth time step) in order to reduce the cost of correcting the independent subdomain responses. A drawback of these methods, however, is that they are conditionally stable, and their stability behaviour is much more complex than that of standard global time stepping algorithms. Bennighof and Wu [7] later proposed unconditionally stable parallel algorithms with multi-step independent computations for linear dynamic analysis; these algorithms, however, are not suitable for nonlinear implementations.

Noor and Peters [8] derived a finite element discretisation based on a mixed formulation, where the unknowns consist of the generalised displacements, the velocity components, and the internal forces of the structure. Each set of unknowns has its own discrete shape functions, independent of the others, and the formulation allows the stress resultants to be discontinuous at the inter-element boundaries. The unknowns at the boundary nodes are assumed to be only the generalised displacement and velocity components. The response of each substructure is regarded as the sum of the symmetric and asymmetric responses of the substructure, which is achieved by means of transformation matrices in conjunction with an operator splitting technique. The solution encompasses two levels of iteration, with the outer loop solved using the Newton–Raphson method and the inner loop using preconditioned conjugate gradient techniques. The numerical integration is performed using an implicit multi-step, one-derivative scheme.

Storaasli and Ransom [9] devised a parallel Newmark implicit method on the Finite Element Machine. It can be observed from this brief review that most of the algorithms for implicit transient dynamic analysis have been devised either by using a parallel solver (direct or iterative) or by reordering the computations, and that the majority of the implementations target hardware and software development environments that are now outdated.

In this paper an effort has been made to extend the mixed time integration algorithms proposed by Belytschko et al. [1], [2] to impart parallelism into the time integration algorithm with efficient inter-processor communication. The state-of-the-art Message Passing Interface (MPI) [10] portable software development environment has been employed for the parallel program development. To date, very few MPI-based parallel implementations of dynamic analysis codes employing time integration techniques [11], [12] have been reported in the literature.

Section snippets

Newmark's time stepping algorithm

The spatial discretisation of the structure leads to the governing equilibrium equation of structural dynamics, which can be expressed as [M]{a} + [C]{v} + [K]{d} = {f(t)}, with d(0) = d0 and v(0) = v0, where M is the mass matrix, C is the damping matrix and K is the stiffness matrix, and a, v and d are the acceleration, velocity and displacement vectors, respectively. Solution of this initial value problem requires integration through time. This is achieved numerically by discretising in time the continuous
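
For reference, a minimal linear Newmark step (constant average acceleration, beta = 1/4, gamma = 1/2) can be sketched as below. This is the generic textbook form, not code from the paper; in nonlinear analysis the product K @ d is replaced by the internal force vector and the step is iterated to equilibrium.

```python
import numpy as np

def newmark_step(M, C, K, d, v, a, f_next, dt, beta=0.25, gamma=0.5):
    """One Newmark step for the linear system  M a + C v + K d = f(t).
    beta = 1/4, gamma = 1/2 gives the unconditionally stable constant
    average acceleration scheme."""
    # effective stiffness and effective load at t_{n+1}
    K_eff = K + (gamma / (beta * dt)) * C + (1.0 / (beta * dt**2)) * M
    f_eff = (f_next
             + M @ (d / (beta * dt**2) + v / (beta * dt)
                    + (0.5 / beta - 1.0) * a)
             + C @ ((gamma / (beta * dt)) * d + (gamma / beta - 1.0) * v
                    + dt * (0.5 * gamma / beta - 1.0) * a))
    d_next = np.linalg.solve(K_eff, f_eff)   # the implicit equation solve
    a_next = ((d_next - d) / (beta * dt**2) - v / (beta * dt)
              - (0.5 / beta - 1.0) * a)
    v_next = v + dt * ((1.0 - gamma) * a + gamma * a_next)
    return d_next, v_next, a_next
```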

Mixed time integration algorithm

The mixed time integration algorithms were developed by Belytschko et al. [1], [2], where implicit and explicit algorithms are employed on implicit and explicit mesh partitions, respectively. These algorithms were originally developed for solving media-structure interaction problems efficiently. The flexible part of the mesh is integrated explicitly and the stiff part of the mesh is integrated implicitly. The two subsets of the nodes
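
As a schematic of how such an explicit-implicit (E-I) partitioned step can be organised, the sketch below assumes a linear system with lumped (diagonal) mass and damping, so that only the stiffness couples the two partitions. It follows the spirit of the E-I partition rather than reproducing the exact algorithm of Refs. [1], [2]; the index arrays and function name are illustrative.

```python
import numpy as np

def mixed_ei_step(K, m, c, d, v, a, f_next, dt, exp_idx, imp_idx,
                  beta=0.25, gamma=0.5):
    """Schematic explicit-implicit partitioned step for  M a + C v + K d = f.

    m, c             : lumped (diagonal) mass and damping as 1-D arrays
    exp_idx, imp_idx : integer index arrays for the explicit and implicit
                       node groups (only K couples the two groups here)
    """
    e, i = exp_idx, imp_idx

    # 1. explicit predictor on the explicit partition
    d_new = d.copy()
    d_new[e] = d[e] + dt * v[e] + 0.5 * dt**2 * a[e]
    v_pred = v + dt * (1.0 - gamma) * a              # predictor velocities

    # 2. implicit Newmark solve on the implicit partition; the freshly
    #    predicted explicit displacements enter the right-hand side as data
    K_eff = (K[np.ix_(i, i)]
             + np.diag(gamma / (beta * dt) * c[i] + m[i] / (beta * dt**2)))
    f_eff = (f_next[i] - K[np.ix_(i, e)] @ d_new[e]
             + m[i] * (d[i] / (beta * dt**2) + v[i] / (beta * dt)
                       + (0.5 / beta - 1.0) * a[i])
             + c[i] * (gamma / (beta * dt) * d[i]
                       + (gamma / beta - 1.0) * v[i]
                       + dt * (0.5 * gamma / beta - 1.0) * a[i]))
    d_new[i] = np.linalg.solve(K_eff, f_eff)

    # 3. corrector: accelerations and velocities on both partitions
    a_new = np.empty_like(a)
    a_new[i] = ((d_new[i] - d[i]) / (beta * dt**2) - v[i] / (beta * dt)
                - (0.5 / beta - 1.0) * a[i])
    a_new[e] = ((f_next[e] - K[e, :] @ d_new - c[e] * v_pred[e])
                / (m[e] + dt * gamma * c[e]))
    v_new = v_pred + dt * gamma * a_new
    return d_new, v_new, a_new
```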

Message Passing Interface

Each processor in a distributed memory parallel computer has its own local memory and local address space. Since the only way to exchange data between these processors is to use explicit message passing, any nontrivial parallel system requires communication among processors. The message passing model assumes a set of processes that have only local memory but are able to communicate with other processes by sending and receiving messages. Data transfer from the local memory of one processor to the local
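
As a minimal illustration of this model, the example below uses the mpi4py Python bindings (the MPI calls themselves are the same whatever language binding is used) to send an array from the local memory of one process to another.

```python
# Minimal MPI point-to-point exchange (mpi4py bindings, illustrative only).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = np.arange(10, dtype='d')     # a result held only by process 0
    comm.Send([data, MPI.DOUBLE], dest=1, tag=11)
elif rank == 1:
    buf = np.empty(10, dtype='d')       # receive buffer in process 1's memory
    comm.Recv([buf, MPI.DOUBLE], source=0, tag=11)
    print("process 1 received", buf)
```

A two-process run (e.g. mpirun -np 2 python example.py) prints the received array on process 1.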

Parallel implementation of mixed time integration algorithm

The mixed time integration algorithm has been implemented using the MPI software development environment. In the parallel, domain decomposition based mixed time integration algorithm, the nodes lying along the interface are divided into primary boundary nodes and secondary boundary nodes, with each interface node being a primary boundary node in one submesh and a secondary boundary node in all other submeshes sharing that particular node. Nodes which are not interface nodes are called internal nodes.
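
One way this classification might drive the per-step interface exchange is sketched below using mpi4py; the bookkeeping dictionaries are hypothetical and do not reproduce the actual SPANDAN data structure. Each submesh sends the explicitly integrated values of its primary boundary nodes to the neighbouring submeshes that hold the same nodes as secondary boundary nodes, overwrites its own secondary boundary values with the data received, and then proceeds with the implicit solve over its internal nodes.

```python
from mpi4py import MPI
import numpy as np

def exchange_interface(comm, d, primary, secondary):
    """Hypothetical interface exchange for one time step.

    primary   : {neighbour rank: local DOF indices this submesh owns as
                 primary boundary nodes and must send}
    secondary : {neighbour rank: local DOF indices owned elsewhere
                 (secondary boundary nodes) to be overwritten on receipt}
    d         : local solution vector, already advanced explicitly at the
                 primary boundary nodes
    """
    reqs = []
    # post non-blocking sends of the explicitly updated primary values
    for nbr, idx in primary.items():
        reqs.append(comm.Isend(np.ascontiguousarray(d[idx]), dest=nbr, tag=0))
    # receive the matching values for the secondary boundary nodes
    for nbr, idx in secondary.items():
        buf = np.empty(len(idx), dtype=d.dtype)
        comm.Recv(buf, source=nbr, tag=0)
        d[idx] = buf
    MPI.Request.Waitall(reqs)
    # ...the implicit solve over the internal nodes of the submesh follows
```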

SPANDAN—finite element code for parallel nonlinear dynamic analysis

The proposed parallel algorithm has been implemented in the finite element code Software for PArallel Nonlinear Dynamic Analysis (SPANDAN). SPANDAN consists of a suite of parallel algorithms for nonlinear dynamic analysis employing explicit and implicit time integration techniques with parallel sparse preconditioned conjugate gradient solvers, profile direct solvers, as well as hybrid (combination of iterative and direct) solvers. SPANDAN is developed using the MPICH implementation of the MPI software

Numerical studies

The parallel code SPANDAN has been ported onto the PARAM-10000 supercomputer at the NPSF (National PARAM Supercomputing Facility), Pune, India. The PARAM scalable parallel computer is a distributed memory machine based on a heterogeneous open frame architecture, which unifies cluster computing with massively parallel processing (MPP). PARAM-10000 is built as a cluster of 40 workstations. Each workstation in the cluster has four UltraSPARC processors running at 300 MHz and 512 MB of

Conclusions

A parallel algorithm for the mixed time integration technique has been discussed in this paper. The approach is an extension of the mixed time integration algorithm originally developed by Belytschko and his co-workers. Unlike its sequential counterpart, the proposed parallel algorithm has as its prime objective the introduction of concurrency into the implicit time integration technique.

The parallel mixed time integration algorithm has been developed within the broad framework of domain decomposition. A parallel

Acknowledgements

The author wishes to acknowledge the support of National PARAM Supercomputing Facility (NPSF), CDAC, Pune, India, for extending their computing facilities for this work. This paper is being published with the permission of the Director, Structural Engineering Research Centre, Madras.
