The parallel & distributed code for numerical simulation of relativistic magnetohydrodynamics simulation

A new code for modeling relativistic magnetohydrodynamic flows is described in the paper. The developed program code is based on a combination of adaptive nested grids. Magnetic hydrodynamics of the process is simulated using nested grids. Subgrid processes are simulated using regular grids. In the paper, we will outline the main components of program code building. These steps are common to other program codes for magnetized flows simulation.

The top ten supercomputers listed in the 2019 November version of the Top 500 list are equipped with graphic accelerators and Intel/Sunway processors with AVX technologies support. Most likely, the first ExaScale performance supercomputer will be built based on the hybrid approach. The code development for the hybrid supercomputers is not a solely technical problem, but an individual complex scientific problem, requiring co-design of algorithms during all stages of problem solving -from physical statement to development tools.
The problem of Mind the Gap for sub-grid processes remains even when using top-level supercomputers when solving these problems. One possible solution to such problems is the use of multi-level nested grids. The approach is to use adaptive nested grids to simulate hydrodynamics. The next level of nesting of grids allows to reproduce the sub-grid more correctly. Using the resources of SSCC, we were able to partially solve the Mind the Gap problem by reproducing seven orders of magnitude. We hope that regular access to more productive supercomputers will allow us to advance several orders of magnitude. Following is a short review of codes, that allow you to use a high resolution.
The CAFE [22] program code developed to solve the problems of special relativistic ideal magnetic gas dynamics. Such software package is based on the WENO scheme of fifth-order accuracy with a method for solving the Riemann problem HLLE. To ensure the non divergence  [23] code. The formulation of the method for cylindrical coordinates is given and code testing is given in detail.
The GIRAFFE [24] code is developed to simulate the electrodynamic general relativistic hydrodynamics that is to study of highly magnetized plasma flows. The code is built on the infrastructure of The Einstein Toolkit framework and uses its main software components. Using this infrastructure allowed us to concentrate on writing the mathematical part of the code, the number of lines of which was only 1600 lines, which is very few for such simulations.
The main objects for modeling with the IllinoisGRMHD code [25] are black holes and neutron stars, which are the main sources of gravitational waves. The code implements a model of general relativistic hydrodynamics using an adaptive grid apparatus. A combination of the piecewise parabolic method and the Lax-Harten method is used in the study. The code was tested on a one-dimensional relativistic magnetohydrodynamic test, on point explosion in twodimensional cylindrical symmetry, on the Bondi accretion problem, on star collapse. And also convergence tests were performed.
The model of semi-relativistic magnetic hydrodynamics is implemented in the MURAM code [26]. The code is specifically designed to simulate the upper convection region of the solar corona. The code implements the physics of thin losses based on radiation, the process of thermal conductivity taking into account equilibrium ionization. Restriction of speeds of Alfvn and of heat-conducting term is realized in the code using semi-relativistic magnetic hydrodynamics with artificially reduced speed of light.
The WhiskyMHD code [27] is adapted for modeling the general relativistic magnetic hydrodynamics of compact objects of solar masses. Among the potential range of tasks solved by the code is the collapse of rotating magnetized stars into black holes and the phenomenology of gamma-ray bursts. The code implements the polytropic equation of state and the ideal gas model.
In the second section, we describe the concept of co-design, within which the numerical model was developed. We also briefly summarize the information about numerical methods that was used. The third section will be devoted to the parallel implementation of the code. In the fourth section the results of relativistic magnetohydrodynamics jet simulation will be presented. The conclusion is given in the fifth section.

The co-design of numerical model
As mentioned in the introduction, the development of software for supercomputers is a complicated scientific problem and it requires the co-design at all stages of the numerical model creation. We outline six co-design stages of numerical modeling Fig. 1. The main difference between the co-design and the classic design of the computational model is the possibility of returning to the previous development stage with the constraints at the current stage. This makes it possible to build in a short time an effective computational model that takes into account all the developments.

The parallel & distributed computing
The relativistic magnetohydrodynamics of jet is performed on architecture with shared memory on adaptive nested meshes and is distributed using OpenMP tools within a single process. In our computational experiments we used an Intel Optane node which has 700 GB RAM for a single process. The sub-grid processes is performed on an architecture with distributed memory, with a software implementation based on a one-dimensional geometrical decomposition of a regular calculation domain by MPI tools and subsequent decomposition of the calculations into threads using OpenMP tools within a single process. A diagram of calculations organization is shown in Fig

The numerical model
By the physics of process we apply a set of physical variables: ρ is density, is magnetic field vector. We introduce the concept of enthalpy h: where γ is adiabatic exponent. Put the speed of light c ≡ 1. Then the Lorentz factor Γ is determined by equation: A similar expression holds for the square of the magnetic field vector. We introduce the conservative variables of relativistic magnetic hydrodynamics. D = Γρ is relativistic density and ⃗ m = (m x , m y , m z ) is relativistic angular momentum defined by equations: is scalar product of the velocity vector by the magnetic field vector. Due to invariance the magnetic field vector is a vector of conservative variables. Relativistic total energy is determined by the equation: As a result, the vector of conservative variables U has the form: To form the flow of the vector of conservative variables F = F (U ) along the x coordinate, we introduce a 4-dimensional vector ) .
Also for calculations, we need an expression for the square of the vector ⃗ b, which can be determined by the equation: We dwell on the outline of the scheme for the one-dimensional case. The conversion to multidimensional formulation is done elementarily. The flow of the vector of conservative variables F = F (U ) along the x coordinate is determined by the equation: As a result, the equation of special relativistic magnetohydrodynamics in a one-dimensional formulation can be written as: For the last equation, we can write the Godunov scheme, defining the cells as half-integer numbers, and the nodes as natural numbers. The scheme for the next time step has the form: where △x is constant mesh step, τ is time step, which can be obtained quite simply from the speed of propagation of waves limited by the speed of light. Then the Courant condition for the time step is written elementarily: where CF L is Courant-Friedrichs-Lewy number. We use a similar approach to modify the Rusanov scheme, where we take the speed of light as the maximum wave propagation speed. Then the numerical scheme has a simple form: .
Next, we proceed to the procedure for recovering physical variables ρ, p, ⃗ v from the vector of conservative variables U . In contrast to ideal magnetic hydrodynamics, the restoration of physical variables in the case of the Lorentz factor is not a trivial task. Moreover, a solution cannot be obtained analytically in the general case. Therefore, we will solve such a problem numerically. To do this, we introduce the variable W = ρhΓ 2 and consider the energy function written through the variable W , for which it is necessary to find the zeros of the function: , which, after recalculating the next time step, can be calculated using the vector of conservative variables. To use the Newton method, the first derivative of the function under consideration is necessary: where dp dW Consider the iterative process to obtain physical variables on the n + 1 time layer, provided that conservative variables on the n + 1 layer are already known. To denote the number of the iterative step, we will use the subscript for the variables.
(i) Take the value from the previous time step: (ii) Calculate the necessary auxiliary variables: (iii) Also from the previous layer we take the speed ⃗ v 0 = ⃗ v n , pressure p 0 = p n and the Lorentz (iv) For steps k = 1, 2, . . . we perform the following sequence of calculations: (v) Calculate the next values of the function f k (W ) and the derivative df k (W )/dW by the equations: (vi) Recalculate the value of W k according to Newtons method: (viii) If |W k − W k−1 | > ε, then increase k by one and go to the fourth step of the algorithm, otherwise, go to the next step. (ix) We get the final values of the physical variables on the n + 1 time layer by the equations: The iterative process is complete.
At the final stage, we perform the divergence cleaning operation for the magnetic field.
To fulfill the condition ▽ · (B) = 0 we used a scheme based on the Stokes theorem: To do this, by solving the Riemann problem at the Euler stage of the numerical method, the values of the electric field vector E = v × B in the nodes of the computational grid E x,y,z i±1/2,k±1/2,l±1/2 are determined. Then, the magnetic field values are recalculated from the previous time step, located in the center of the cells, using the finite-difference scheme of the equation ∂B ∂t = ▽ × E. Thus, we obtain the corresponding values of the magnetic field vector on the faces of the cells B x i±1/2,k,l , B y i,k±1/2,l and B z i,k,l±1/2 , which provide the condition of non divergence of the magnetic field. To perform transition to the values of the vector B in the cell, we average the values in each direction of the corresponding components of the magnetic field from the faces of the cell.

The performance analysis
As noted above, the hydrodynamics numerical simulation of SNIa is made on architecture with shared memory. Therefore, we consider a parallel implementation of the second level of nested meshes based on domain decomposition [28]. The MPI tools are used to perform a one-dimensional geometrical decomposition of the calculation domain. In the case of Intel Xeon Phi processors the OpenMP tools are employed. When using Intel Xeon Phi (KNL) processors the calculations are vectorized with some low-level tools [29,30].
The speedup of the code on a mesh of size 512 3 has been studied. For this, the total numerical method time was measured in seconds at various numbers of threads. The speedup P was calculated as where T otal 1 is the calculation time using one thread, and T otal K is the calculation time on K threads. The actual performance has also been estimated. The results of these investigations of the speedup and performance on the mesh of size 512 3 are shown in Fig. 3. A performance of 173 gigaflops and a 48x speedup are obtained on a single Intel Xeon Phi processor. The scalability of the code on calculation grid size of a 512p × 512 × 512 was studied using all threads for each of the processors, where p is the number of processors being used. Thus, a subdomain of size of 512 3 was used for each processor. To study the scalability, the total where T otal 1 is the calculation time with the use of one processor, and T otal p is the calculation time with the use of p processors. The results of these investigations of the scalability are shown in Fig. 4. A 97 per cent scalability is reached with 16 processors, which is a rather good result.

The numerical simulation
Let us simulate a galactic jet of density ρ J = 10 −1 cm −3 and radius R J = 200 parsec. The jet moves at relativistic mode with Lorenz factor Γ = 10. The temperature of the galactic atmosphere T A = 10 7 K, and the density ρ A = 1 cm −3 . Axial magnetic field according to the magnetization parameter β ≡ pmag p = 1. The adiabatic index γ is taken to be equal to 5/3. The density of dense cloud ρ C = 10 cm magnetized jets, like a GRS 1758-258 jet -the formation of the North lobe and South lobe. Also at the base of the jet remains a fairly dense core. Unlike similar jets, in the absence of a strong magnetic field, the development of instability is suppressed, which makes it possible to form lobes.

Conclusion
The new parallel & distributed hydrodynamical code for numerical simulation of relativistic magnetohydrodynamics simulation was described in the paper. The code is developed on the basis of combination of adaptive nested mesh for hydrodynamical simulation and regular mesh that is a second level of nested mesh for hydrodynamical simulation of sub-grid physics. A performance of 173 gigaflops and a 48x speedup are obtained on single Intel Xeon Phi processor. A 97 % scalability is achieved on 16 processors. Results of relativistic magnetohydrodynamics jet simulation on massive parallel supercomputers are presented.