Simulation of heat transfer considering permafrost thawing in 3D media

An approach to the mathematical modeling of heat transfer with a permafrost-thawing algorithm in 3D media, based on the idea of localizing the phase transition area, is considered. The paper presents a problem statement for non-stationary heat transfer and a description of a numerical method based on a predictor-corrector scheme. To better characterize the proposed splitting method, its order of approximation accuracy was studied for the case of an inhomogeneous right-hand side. In the numerical implementation, the phase change accompanying permafrost thawing is treated within a temperature range and requires recalculation of the coefficients of the heat equation at each iteration step with respect to time. A brief description of the parallel algorithm based on a 3D decomposition method and the parallel sweep method is presented. The parallel algorithm implementations were studied on a high-performance computing system of the Siberian Supercomputer Center of the SB RAS. Results of the permafrost algorithm on models with wellbores are also presented.


Introduction
An approach to the mathematical modeling of heat transfer in permafrost [1][2][3][4][5][6][7][8], based on the idea of localizing the phase transition area, is considered. In such a case, numerical calculations can be carried out with the through-counting method. This paper presents a problem statement for the non-stationary heat conduction equation and a description of a numerical method based on a predictor-corrector scheme. To better understand the splitting method of the predictor-corrector type, which preserves the balance properties well, the order of accuracy of the proposed method was studied for an inhomogeneous right-hand side. Using different mesh models, estimates of the approximation order with respect to time and space have been obtained. These results are the basis for considering a more complicated 3D problem: a model of the interaction of wellbores whose temperature differs from that of the surrounding permafrost area. The phase transition in the numerical implementation is treated within a temperature range and requires recalculation of the coefficients of the heat equation at each iteration step with respect to time. At the same time, the general methodology and approach based on the predictor-corrector scheme are preserved. In this case, three-point equations appear in the predictor part, which can easily be solved by the sweep method. The approaches described for the heat transfer simulation can be adapted to high-performance multicore clusters [9]. In this paper we also present some results obtained for parallel algorithm implementations using MPI and OpenMP.

The problem statement and the method used
Consider the domain \Omega with linear dimensions L_x, L_y, L_z in 3D Cartesian coordinates. The wellbore zone \Omega_w (a part of the simulated domain \Omega) is described by a parallelepiped with a prescribed temperature. Consider the non-stationary heat conduction equation in \Omega, written in operator form as

\frac{\partial T}{\partial t} = \Lambda(T) + f,   (1)

where \Lambda(T) = \frac{1}{c\rho}\,\nabla\cdot(\lambda\nabla T) and f is the source term. We will discuss the general idea of the predictor-corrector method [10, 11] in the case when the operator of the right-hand side can be represented as the combination of one-dimensional operators

\Lambda = \Lambda_x + \Lambda_y + \Lambda_z.   (2)

The predictor (3)-(5) advances the solution from t^n to t^{n+1/2} by successive implicit fractional steps along the coordinate directions,

\frac{T^{n+1/6} - T^n}{\tau/2} = \Lambda_x T^{n+1/6},   (3)

\frac{T^{n+2/6} - T^{n+1/6}}{\tau/2} = \Lambda_y T^{n+2/6},   (4)

\frac{T^{n+1/2} - T^{n+2/6}}{\tau/2} = \Lambda_z T^{n+1/2},   (5)

and the corrector (6) completes the time step:

\frac{T^{n+1} - T^n}{\tau} = \Lambda T^{n+1/2} + f^{n+1/2}.   (6)

The general form of the finite-difference approximation of the predictor steps (3)-(5) is the same for each of the operators \Lambda_x, \Lambda_y, \Lambda_z. The standard three-point approximation, e.g. for the operator \Lambda_x,

(\Lambda_x T)_i \approx \frac{1}{c\rho}\,\frac{\lambda_{i+1/2}(T_{i+1} - T_i) - \lambda_{i-1/2}(T_i - T_{i-1})}{h_x^2},   (7)

yields at every fractional step a system of linear algebraic equations with a tridiagonal matrix, whose right-hand side is assembled from the previous fractional step. The tridiagonal systems arising for the operators \Lambda_x, \Lambda_y, \Lambda_z can be solved by the sweep method.
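As an illustration, each one-dimensional implicit step of the predictor reduces to a tridiagonal system that the sweep (Thomas) method solves in O(n). Below is a minimal Python sketch; the function names and the constant-coefficient discretization with zero Dirichlet boundaries are illustrative assumptions, not the authors' code.

```python
import numpy as np

def thomas(a, b, c, d):
    """Sweep (Thomas) method for a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    a[0] and c[-1] are ignored. Forward elimination followed by
    back substitution, O(n) operations.
    """
    n = len(d)
    cp = np.empty(n)
    dp = np.empty(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def implicit_step_x(T, kappa, tau, h):
    """One implicit 1D fractional step (I - tau*kappa*d2/dx2) T_new = T,
    with T = 0 Dirichlet boundaries; kappa plays the role of lambda/(c*rho)."""
    n = len(T)
    r = kappa * tau / h**2
    a = np.full(n, -r)
    b = np.full(n, 1.0 + 2.0 * r)
    c = np.full(n, -r)
    a[0] = c[0] = a[-1] = c[-1] = 0.0   # boundary rows: identity
    b[0] = b[-1] = 1.0
    d = T.copy()
    d[0] = d[-1] = 0.0
    return thomas(a, b, c, d)
```

In a real implementation the coefficients a, b, c would carry the temperature-dependent values of lambda and c*rho at each grid node.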

Phase transition algorithm for permafrost thawing
We will assume that the water-ice phase transition (solid phase - liquid phase) occurs at a specified temperature T^*. Let us denote by S = S(t) the interface between the phases, where such a phase transition occurs and t is the time variable. This boundary divides the simulation domain \Omega into two sub-domains \Omega^- and \Omega^+. Let \Omega^- be the frozen zone and \Omega^+ the thawed zone, so that the subdomains can be defined as

\Omega^- = \{x \in \Omega : T(x, t) < T^*\}, \qquad \Omega^+ = \{x \in \Omega : T(x, t) > T^*\}.

The coefficients of the heat equation are then piecewise functions of temperature,

c\rho(T) = \begin{cases} c^-\rho^-, & T < T^* \\ c^+\rho^+, & T > T^* \end{cases}   (10)

\lambda(T) = \begin{cases} \lambda^-, & T < T^* \\ \lambda^+, & T > T^* \end{cases}   (11)

and the basic equation with permafrost thawing for the whole domain \Omega, using the presented functions (10) and (11), becomes

\left(c\rho(T) + L\rho\,\delta(T - T^*)\right)\frac{\partial T}{\partial t} = \nabla\cdot(\lambda(T)\nabla T),   (12)

where L is the specific heat of the phase transition, \delta is the Dirac delta function, and \rho^+, c^+ and \rho^-, c^- are the density and specific heat of the thawed and frozen zones [7]. In this case, the new coefficients of equation (12) are functions of temperature requiring recalculation at each time step of the predictor-corrector scheme for the numerical simulation of the heat transfer.
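Since the phase transition is treated within a temperature range, a common through-counting device is to smear the latent-heat delta term over a small interval [T* - Delta, T* + Delta]. The sketch below illustrates such a coefficient recalculation; the box smoothing profile and all material constants are illustrative assumptions, not values from the paper.

```python
import numpy as np

def effective_coefficients(T, T_star=0.0, delta=0.5,
                           c_rho_frozen=1.9e6, c_rho_thawed=2.4e6,
                           lam_frozen=2.2, lam_thawed=1.4,
                           L_rho=1.1e8):
    """Temperature-dependent coefficients c*rho and lambda with the
    latent heat smeared over [T_star - delta, T_star + delta].

    The delta function is replaced by a box profile of width 2*delta,
    so its integral over temperature equals L*rho (through-counting).
    All material values here are illustrative placeholders.
    """
    T = np.asarray(T, dtype=float)
    frozen = T < T_star - delta
    thawed = T > T_star + delta
    mushy = ~(frozen | thawed)
    c_rho = np.where(frozen, c_rho_frozen, c_rho_thawed).astype(float)
    lam = np.where(frozen, lam_frozen, lam_thawed).astype(float)
    if np.any(mushy):
        # linear blend across the mushy zone plus the smeared latent heat
        w = (T[mushy] - (T_star - delta)) / (2.0 * delta)  # 0..1
        c_rho[mushy] = ((1 - w) * c_rho_frozen + w * c_rho_thawed
                        + L_rho / (2.0 * delta))
        lam[mushy] = (1 - w) * lam_frozen + w * lam_thawed
    return c_rho, lam
```

In the predictor-corrector time loop, this recalculation would be performed for the whole temperature field before each step, as stated in the text.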

Studying accuracy order of the predictor-corrector scheme
We will find the numerical solution of the non-stationary heat equation (13) and study the accuracy of the described predictor-corrector approach using a test function g (14). After substituting the test function g into equation (13), we obtain the form (15) of the right-hand side function f. The boundary conditions were taken as T = 0.
We have performed tests to study the approximation accuracy of the scheme with respect to time and space with an inhomogeneous right-hand side (table 1 and table 2). To quantify the accuracy, we used the error value (Error in the tables) equal to the maximum absolute difference between the numerical and exact solutions over the grid nodes, divided by the maximum absolute value of the numerical solution. It can be seen from the tables that the proposed predictor-corrector scheme has fourth-order accuracy.
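The error metric just described and the observed order it yields on successively refined grids can be sketched as follows (function names are illustrative; the numbers in the test below are synthetic, not values from tables 1-2).

```python
import numpy as np

def rel_max_error(T_num, T_exact):
    """Error metric from the text: max |T_num - T_exact| over the grid
    nodes, divided by max |T_num|."""
    return np.max(np.abs(T_num - T_exact)) / np.max(np.abs(T_num))

def observed_order(errors, refinement=2.0):
    """Estimate the order p from errors on successively refined grids,
    assuming error ~ C * h^p and a refinement factor `refinement`:
    p = log(e_k / e_{k+1}) / log(refinement)."""
    e = np.asarray(errors, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(refinement)
```

For a fourth-order scheme, halving the step should divide the error by roughly 16, so `observed_order` returns values close to 4.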

Studying the parallel implementation
The developed parallel implementation enables large-scale mathematical modeling of the heat transfer problem on high-performance computing systems. It is based on a 3D decomposition of the computational domain into subdomains, each of which is calculated independently at a dedicated computational node of the cluster. Within one node, additional parallelization can be carried out using OpenMP, since a cluster node may comprise several multi-core processors. Thus, the parallel algorithm and program are developed based on a combination of the MPI and OpenMP technologies for parallel computations. The parallel algorithm can be represented by several stages:
- preparation (creation of a 3D topology and a number of 1D topologies, data initialization),
- parallel sweeps along the coordinates X, Y and Z, which require data exchanges,
- a corrector executed in parallel on all processors.
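The 3D decomposition can be sketched as a mapping from an MPI rank to Cartesian coordinates and a local index range (row-major rank ordering, as in MPI Cartesian topologies). The function and the remainder-distribution rule below are illustrative assumptions, not the authors' code.

```python
def decompose(rank, dims, global_shape):
    """Map an MPI rank to its 3D grid coordinates and local index range.

    dims = (Px, Py, Pz) processes per axis; global_shape = (Nx, Ny, Nz)
    grid nodes. Returns (coords, starts, counts). The remainder
    N % P nodes go to the first processes along each axis.
    """
    px, py, pz = dims
    coords = (rank // (py * pz), (rank // pz) % py, rank % pz)
    starts, counts = [], []
    for c, p, n in zip(coords, dims, global_shape):
        base, rem = divmod(n, p)
        cnt = base + (1 if c < rem else 0)
        st = c * base + min(c, rem)
        starts.append(st)
        counts.append(cnt)
    return coords, tuple(starts), tuple(counts)
```

With this mapping, each process's nearest neighbors along an axis are simply the ranks whose coordinate differs by one in that axis, which is what the data exchanges in the sweep stages rely on.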
First, using MPI, a 3D topology is created that supports communication between the computational nodes of the cluster. The computational domain (the grid model) is divided into 3D subdomains, and all the data necessary for the calculation are placed in the RAM of the selected cluster nodes. We also create an additional set of 1D topologies along the axes Ox, Oy and Oz; they are necessary for the parallel implementation of the sweep algorithm [12] along each direction. The schematic representation of parallel computations and the interconnection between computational subdomains are shown in figures 1 and 2. Each process in the 3D topology knows its nearest neighbors along the coordinate axes for data exchange. The parallel algorithm consists of the following steps:
1) creating topologies,
2) initialization of the required input data and buffer arrays,
3) parallel sweep in the X direction,
4) parallel sweep in the Y direction,
5) parallel sweep in the Z direction,
6) data exchanges,
7) parallel corrector.
Steps 3)-7) are repeated in a loop over the time variable for a given mesh. The parallel sweep algorithm [12] consists of three steps. First, each subdomain solves three local systems of linear algebraic equations. Second, a reduced system is constructed from the values at the points located on the boundaries of neighboring subdomains along each direction and solved; here we use the 1D topologies to gather the coefficients on the root process of each topology and then scatter the solution back to the corresponding group of processes (figure 3). Third, the full solution is calculated from the solutions and values obtained at the first two steps. The results of studying the algorithm for the predictor-corrector scheme were obtained on a high-performance cluster whose computational nodes each contain two 6-core Xeon X5670 CPUs (the cluster of the Siberian Supercomputer Center of the SB RAS).
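The three steps of the parallel sweep can be emulated serially for one direction: each subdomain expresses its interior unknowns through its two edge unknowns (local solves, step 1), a small reduced tridiagonal system couples the edges across subdomains (the gather/scatter step 2), and the interiors are then recovered locally (step 3). The sketch below is one such two-level scheme written for clarity, not the authors' implementation [12]; dense solves stand in where a real code would use the sweep method.

```python
import numpy as np

def _tridiag(a, b, c):
    """Dense tridiagonal matrix; a[0] and c[-1] are ignored."""
    return np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)

def parallel_sweep_emulated(a, b, c, d, blocks):
    """Serial emulation of a two-level parallel sweep for
    a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    `blocks` lists contiguous inclusive ranges (i0, i1), size >= 2 each.
    Interior unknowns of a block are written x = p + x_left*q + x_right*r.
    """
    K = len(blocks)
    parts = []
    for (i0, i1) in blocks:                      # step 1: local solves
        idx = np.arange(i0 + 1, i1)              # interior indices
        if len(idx) == 0:
            parts.append(None)
            continue
        A = _tridiag(a[idx], b[idx], c[idx])
        p = np.linalg.solve(A, d[idx])                          # particular
        q = np.linalg.solve(A, -a[idx[0]] * np.eye(len(idx))[0])   # left edge
        r = np.linalg.solve(A, -c[idx[-1]] * np.eye(len(idx))[-1]) # right edge
        parts.append((idx, p, q, r))
    R = np.zeros((2 * K, 2 * K))                 # step 2: reduced system
    g = np.zeros(2 * K)
    for k, (i0, i1) in enumerate(blocks):
        lo, hi = 2 * k, 2 * k + 1
        R[lo, lo] += b[i0]; g[lo] += d[i0]
        R[hi, hi] += b[i1]; g[hi] += d[i1]
        if k > 0:
            R[lo, lo - 1] += a[i0]               # couple to previous block
        if k < K - 1:
            R[hi, hi + 1] += c[i1]               # couple to next block
        if parts[k] is None:                     # the two edges are adjacent
            R[lo, hi] += c[i0]
            R[hi, lo] += a[i1]
        else:                                    # eliminate interior unknowns
            idx, p, q, r = parts[k]
            R[lo, lo] += c[i0] * q[0];  R[lo, hi] += c[i0] * r[0]
            g[lo] -= c[i0] * p[0]
            R[hi, lo] += a[i1] * q[-1]; R[hi, hi] += a[i1] * r[-1]
            g[hi] -= a[i1] * p[-1]
    edges = np.linalg.solve(R, g)                # gathered on a root in MPI
    x = np.zeros(len(d))                         # step 3: recover interiors
    for k, (i0, i1) in enumerate(blocks):
        x[i0], x[i1] = edges[2 * k], edges[2 * k + 1]
        if parts[k] is not None:
            idx, p, q, r = parts[k]
            x[idx] = p + x[i0] * q + x[i1] * r
    return x
```

The reduced system here has size 2K for K subdomains and is itself tridiagonal in the edge ordering, which is what makes the gather-solve-scatter stage cheap relative to the local solves.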
The results presented in table 3 and table 4 (in seconds) were obtained for two implementations of the algorithm: one using only MPI, and one combining MPI and OpenMP. OpenMP is applied to perform calculations in parallel within each multicore cluster node. One subdomain is calculated by one MPI process, and the number of OpenMP threads per MPI process was chosen to be 3. In the model tests, the total computational domain consists of 2x2x2, 3x3x3 and 4x4x4 subdomains along the coordinate axes Ox, Oy and Oz, so the total number of MPI processes in the tests was 8, 27 and 64. For example, for the MPI&OpenMP program, 27 MPI processes can be treated as 9 nodes with 9 cores each, i.e. 3 MPI processes with 3 OpenMP threads per node. The results in tables 3 and 4 show good behavior of the parallel implementation. The use of OpenMP parallelization within a single computational node makes the program run about 2.5-2.6 times faster in the scalability test; in the speed-up test, the achieved acceleration is about 1.4-1.6 times. Scaling here means that the running time of the parallel algorithm does not change significantly when the scale of the computational model increases proportionally to the number of computational resources used; thus, each MPI process calculates the same number of grid nodes. In the scalability tests, 10 iterations with respect to time were performed. In the speed-up test, the total size of the computational model, consisting of several subdomains, remains the same for all versions of the partitioning while the number of computational resources varies; it is expected that the program runs faster as the number of computing resources increases. In the speed-up test, 10 iterations with respect to time were likewise performed for all models.

Conclusion
In this paper, an approach to simulating heat transfer considering permafrost thawing has been discussed. We studied the accuracy of the described predictor-corrector scheme for numerical modelling. An approach with the governing equation for the permafrost is presented, and we apply it to simulate heat transfer with permafrost thawing in a medium with wellbores. We have also discussed the parallel algorithm based on the sweep method for calculations on multi-core cluster systems. The results of the parallel algorithm on different tests are presented.