Numerical simulations of black hole accretion flows

We model the structure and evolution of black hole accretion disks, and their neighboring regions, using numerical simulations. The numerics is governed by the equations of general relativistic magneto-hydrodynamics (GRMHD). In particular, such disks and outflows can be found at the base of very energetic ultra-relativistic jets produced by cosmic explosions, so called gamma-ray bursts (GRBs). Another, more persistent type of the jet phenomena, are blazars, emitted from the centers of galaxies. Long-lasting, detailed computations are essential to properly determine the physics of these explosions, and confront the theoretical models with any potential observables. From the point of view of numerical methods and computational techniques, three ingredients need to be considered. First, the numerical scheme must work in a conservative manner, which is achieved by solving a set of non-linear equations at each time-step, to advance the conserved quantities from one time step to the next. Second, the efficiency of computations intrinsically depends on the code parallelization methods, which may use various techniques. Third, the analysis of results is possible via the post-processing of the computed time-dependent physical quantities, and visualization of the flow properties. This is done via implementing various packages and libraries that are standardized in the field of computational astrophysics and supported by community developers. In the present paper, we discuss the physical picture of the cosmic sources which are modeled using numerical framework. We also describe several technical issues, in the particular context of our own experience with the performance of the GRMHD code which we develop. We also present a suite of performance tests, done on the High-Performance Computer cluster (HPC) in the Center for Mathematical Modeling of the Warsaw University.


Introduction
Astrophysical black holes are ubiquitous in the Universe. They occupy centers of galaxies [28], they may be found in the binary systems with ordinary stars, where the streams of plasma lead to the phenomenon called a 'microquasar' [23], and, finally, they may be responsible for the most extreme cosmic explosions -the gamma ray bursts [20]. In all these types of sources one common physical process is in work: accretion of matter onto a black hole. It is the most efficient of the known energy release mechanisms, which is orders of magnitude stronger than the nuclear fusion reactions that fuel ordinary stars. The gravitational potential energy in the field of the compact star is governed by its mass-to-radius ratio. Hence, per unit rest-mass energy of the gas fallen into the black hole, we can extract up to almost sixty per-cent of power available to be released in the form of the electromagnetic radiation [10].
Gamma ray bursts (GRBs) are electromagnetic transients observed from the most distant parts of the Universe. Their brightness detected by the human-made telescopes implies that the intrinsic power of the events is enormously large. During the collapse of a massive star into a black hole in a hyper-nova process a long duration burst (t ∼ 100 − 1000 seconds) can be observed. It is required that the progenitor star has enough angular momentum to form an accretion disk around a black

GRMHD Equations
From the steady state based on the analytical, equilibrium solution driven by the main physical parameters of the black hole accretion disk (namely, BH mass, its spin, and size and the mean accretion rate in the torus), as well as the seed configuration of magnetic field, we follow the dynamical evolution. This is achieved by solving numerically the continuity Eq. (1), the four-momentum-energy conservation Eq. (2), and induction Eq. (3) equations in GR framework: Here the stress tensor separates into gas and electromagnetic parts: T µν = T µν gas + T µν EM , T µν gas = ρhu µ u ν + pg µν = (ρ + u + p)u µ u ν + pg µν , Other symbols in Eq. (4) have their usual meaning: u µ is the four-velocity of gas, u is internal energy, ρ is density, p denotes pressure, and b µ is the magnetic four-vector. Note that in Eq. (3) B i is the magnetic field three-vector, b i is the spatial part of the magnetic field four-vector and u i is the spatial part of the four-velocity. Finally, F is the Faraday tensor, and in the force-free approximation E ν = u ν F µν = 0. The space-time metric g µν is described in Eq. (1) with determinant g ≡ Det(g µν ) and Γ λ νκ is the spatial connection.

Our Code for the High Accuracy Relativistic MHD
HARM (High Accuracy Relativistic Magneto-hydrodynamics) is a conservative shock capturing scheme, for evolving the equations of GRMHD, developed by C. Gammie et al. [11]. The integrated equations are of the form: where U is a vector of "conserved variables", such as particle number density, or energy or momentum, F i are the fluxes in finite control volume, and S is a vector of source terms. U is conserved in the sense that, if S = 0, it depends only on fluxes at the boundaries. The vector P is composed of "primitive" variables, such as rest-mass density, internal energy density, velocity components, and magnetic field components, which are interpolated to model the flow within zones. Initial conditions P < l a t e x i t s h a 1 _ b a s e 6 4 = " d O P X w / / 6 E m W q V W p f V L m 0 j C G F C 9 M = " > A A A C 0 H i c j V H L S s N A F D 2 N r 1 p f V Z d u g k V w V V I R 1 F 3 R j c u K x h b a K s l 0 W k P z Y j I R Q i n i 1 i 9 w q z 8 l / o H + h X f G F N Q i O i E z Z 8 6 9 5 8 z c u W 7 s e 4 m 0 r N e C M T M 7 N 7 9 Q X C w t L a + s r p X X N y 6 T K B W M 2 y z y I 9 F y n Y T 7 X s h t 6 U m f t 2 L B n c D 1 e d M d n q h 4 8 5 a L x I v C C 5 n F v B s 4 g 9 D r e 8 y R R F 1 1 3 M j v J V l A y 6 g x v i 5 X r K q l h z k N a j m o I B + N q P y C D n q I w J A i A E c I S d i H g 4 S + N m q w E B P X x Y g 4 Q c j T c Y 4 x S q R N K Y t T h k P s k O Y B 7 d o 5 G 9 J e e S Z a z e g U n 3 5 B S h M 7 p I k o T x B W p 5 k 6 n m p n x f 7 m P d K e 6 m 4 Z r W 7 u F R A r c U P s X 7 p J 5 n 9 1 q h a J P g 5 1 D R 7 V F G t G V c d y l 1 S / i r q 5 + a U q S Q 4 x c Q r 3 K C 4 I M 6 2 c v L O p N Y m u X b 2 t o + N v O l O x a s / y 3 B T v 6 p b U 4 N r P d k 4 D e 6 9 6 V L X O 9 i v 1 4 7 z T R W x h G 7 v U z g P U c Y o G b L I W e M Q T n o 1 z I z P u j P v P V K O Q a z b x b R g P H 6 1 I l T U = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " d O P X w / / 6 E m W q V W p f V L m 0 j C G F C 9 M = " > A A A C 0 H i c j V H L S s N A F D 2 N r 1 p f V Z d u g k V w V V I R 1 F 3 R j c u K x h b a K s l 0 W k P z Y j I R Q i n i 1 i 9 w q z 8 l / o H + h X f G F N Q i O i E z Z 8 6 9 5 8 z c u W 7 s e 4 m 0 r N e C M T M 7 N 7 9 Q X C w t L a + s r p X X N y 6 T K B W M 2 y z y I 9 F y n Y T 7 X s h t 6 U m f t 2 L B n c D 1 e d M d n q h 4 8 5 a L x I v C C 5 n F v B s 4 g 9 D r e 8 y R R F 1 1 3 M j v J V l A y 6 g x v i 5 X r K q l h z k N a j m o I B + N q P y C D n q I w J A i A E c I S d i H g 4 S + N m q w E B P X x Y g 4 Q c j T c Y 4 x S q R N K Y t T h k P s k O Y B 7 d o 5 G 9 J e e S Z a z e g U n 3 5 B S h M 7 p I k o T x B W p 5 k 6 n m p n x f 7 m P d K e 6 m 4 Z r W 7 u F R A r c U P s X 7 p J 5 n 9 1 q h a J P g 5 1 D R 7 V F G t G V c d y l 1 S / i r q 5 + a U q S Q 4 x c Q r 3 K C 4 I M 6 2 c v L O p N Y m u X b 2 t o + N v O l O x a s / y 3 B T v 6 p b U 4 N r P d k 4 D e 6 9 6 V L X O 9 i v 1 4 7 z T R W x h G 7 v U z g P U c Y o G b L I W e M Q T n o 1 z I z P u j P v P V K O Q a z b x b R g P H 6 1 I l T U = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " d O P X w / / 6 E m W q V W p f V L m 0 j C G F C 9 M = " > A A A C 0 H i c j V H L S s N A F D 2 N r 1 p f V Z d u g k V w V V I R 1 F 3 R j c u K x h b a K s l 0 W k P z Y j I R Q i n i 1 i 9 w q z 8 l / o H + h X f G F N Q i O i E z Z 8 6 9 5 8 z c u W 7 s e 4 m 0 r N e C M T M 7 N 7 9 Q X C w t L a + s r p X X N y 6 T K B W M 2 y z y I 9 F y n Y T 7 X s h t 6 U m f t 2 L B n c D 1 e d M d n q h 4 8 5 a L x I v C C 5 n F v B s 4 g 9 D r e 8 y R R F 1 1 3 M j v J V l A y 6 g x v i 5 X r K q l h z k N a j m o I B + N q P y C D n q I w J A i A E c I S d i H g 4 S + N m q w E B P X x Y g 4 Q c j T c Y 4 x S q R N K Y t T h k P s k O Y B 7 d o 5 G 9 J e e S Z a z e g U n 3 5 B S h M 7 p I k o T x B W p 5 k 6 n m p n x f 7 m P d K e 6 m 4 Z r W 7 u F R A r c U P s X 7 p J 5 n 9 1 q h a J P g 5 1 D R 7 V F G t G V c d y l 1 S / i r q 5 + a U q S Q 4 x c Q r 3 K C 4 I M 6 2 c v L O p N Y m u X b 2 t o + N v O l O x a s / y 3 B T v 6 p b U 4 N r P d k 4 D e 6 9 6 V L X O 9 i v 1 4 7 z T R W x h G 7 v U z g P U c Y o G b L I W e M Q T n o 1 z I z P u j P v P V K O Q a z b x b R g P H 6 1 I l T U = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " Q Z Q B t X X y c / w g w e N y 5 m 2 x b E r r t n E = " O v e f M 3 L l 2 6 P J Y m O Z r R p u a n p m d y 8 7 n F h a X l l f 0 1 b W L O E g i h 1 W d w A 2 i u m 3 F z O U + q w o u X F Y P I 2 Z 5 t s t q d v 9 Y x m v X L I p 5 4 J + L Q c h a n t X z e Z c 7 l i C q r W 8 1 7 c D t x A O P l u H J 6 J K 3 / c I 4 V R m 1 / Z 2 2 n j e L p h r G J C i l I I 9 0 V A L 9 B U 1 0 E M B B A g 8 M P g R h F x Z i + h o o w U R I X A t D 4 i J C X M U Z R s i R N q E s R h k W s X 2 a e 7 R r p K x P e + k Z K 7 V D p 7 j 0 R 6 Q 0 s E 2 a g P I i w v I 0 Q 8 U T 5 S z Z 3 7 y H y l P e b U C r n X p 5 2 v 3 W p 3 2 v 1 n q p Z J N e v 4 N r S H D 4 4 D n s U = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " Q Z Q B t X X y c / w g w e N y 5 m 2 x b E r r t n E = " O v e f M 3 L l 2 6 P J Y m O Z r R p u a n p m d y 8 7 n F h a X l l f 0 1 b W L O E g i h 1 W d w A 2 i u m 3 F z O U + q w o u X F Y P I 2 Z 5 t s t q d v 9 Y x m v X L I p 5 4 J + L Q c h a n t X z e Z c 7 l i C q r W 8 1 7 c D t x A O P l u H J 6 J K 3 / c I 4 V R m 1 / Z 2 2 n j e L p h r G J C i l I I 9 0 V A L 9 B U 1 0 E M B B A g 8 M P g R h F x Z i + h o o w U R I X A t D 4 i J C X M U Z R s i R N q E s R h k W s X 2 a e 7 R r p K x P e + k Z K 7 V D p 7 j 0 R 6 Q 0 s E 2 a g P I i w v I 0 Q 8 U T 5 S z Z 3 7 y H y l P e b U C r n X p 5 2 v 3 W p 3 2 v 1 n q p Z J N e v 4 N r S H D 4 4 D n s U = < / l a t e x i t > < l a t e x i t s h a 1 _ b a s e 6 4 = " Q Z Q B t X X y c / w g w e N y 5 m 2 x b E r r t n E = " O v e f M 3 L l 2 6 P J Y m O Z r R p u a n p m d y 8 7 n F h a X l l f 0 1 b W L O E g i h 1 W d w A 2 i u m 3 F z O U + q w o u X F Y P I 2 Z 5 t s t q d v 9 Y x m v X L I p 5 4 J + L Q c h a n t X z e Z c 7 l i C q r W 8 1 7 c D t x A O P l u H J 6 J K 3 / c I 4 V R m 1 / Z 2 2 n j e L p h r G J C i l I I 9 0 V A L 9 B U 1 0 E M B B A g 8 M P g R h F x Z i + h o o w U R I X A t D 4 i J C X M U Z R s i R N q E s R h k W s X 2 a e 7 R r p K x P e + k Z K 7 V D p 7 j 0 R 6 Q 0 s E 2 a g P I i w v I 0 Q 8 U T 5 S z Z 3 7 y H y l P e b U C r n X p 5

Equation of State
The equation of state in the plasma is based on equilibrium of the nuclear reactions, which, when reached, defines the proton-to-baryon density ratio, and hence the pressure, internal energy and entropy of the gas.
In the hyper-accreting matter in the GRB central engine, temperatures are above 10 9 − 10 10 K, and the plasma is fully ionized, and composed of free nuclei, n, p, and electron-positron pairs, e + , e − . The chemical and pressure balance required by nuclear reactions between these species, namely the electron and positron capture on nuclei, and the β-decay. Reactions are in the form: p + e − → n + ν e , p +ν e → n + e + , p + e − +ν e → n, n + e + → p +ν e , n → p + e − +ν e , n + ν e → p + e − .
The rates for these reactions are given by appropriate integrals [27]. The electron neutrinos and anti-neutrinos are the source for cooling of the plasma, in addition to advective and radiative cooling (the latter is in fact negligible, due to the huge opacities for the photons). The above nuclear processes have been studied in numerous works, devoted to the neutron star Equation of State, and later on in the context of GRB central engines [5,7,[14][15][16][17][18]26].
Other neutrino emission processes that occur at lower rates are: electron-positron pair annihilation, bremsstrahlung, and plasmon decay: We calculate their rates numerically, with proper integrals over the distribution function of relativistic, partially degenerated species. These processes lead to formation of heavy lepton neutrinos, ν τ and ν µ .

Pressure Components
In the EOS, contribution to the pressure is by the free nuclei and e + −e − pairs, helium, radiation and the trapped neutrinos: where P nucl includes free neutrons, protons, and the electron-positrons: Here in Eq. (6), F k are the Fermi-Dirac integrals of the order k, and η e , η p and η n are the reduced chemical potentials, η i = µ i /kT , is the degeneracy parameter (where µ i is the standard chemical potential) Reduced chemical potential of positrons is η e+ = −η e − 2/β e . Relativity parameters are defined as β i = kT /m i c 2 . EOS computed numerically by solving the balance of nuclear reactions [12,16,35].
To sum up, the proper description of the hyper-accretion in GRBs requires detailed treatment of microphysics. Based on the solutions for degenerate Fermi gas EOS, with P (ρ, T ) non-linear dependence, a non-trivial transformation between conserved variables and 'primitives' in HARM due to GRMHD scheme.
The interpolation over the tables of EOS is done (using the Akima-spline method [2]) during the dynamical simulation at each and every time step. In order to save the computer power, we usually compute a small matrix of 4 × 4 = 16 points at each grid cell, whenever the value must be interpolated, and only then we store the table. To perform the interpolations with maximum efficiency, we use the multi-threading feature of the Linux operating system (with pthread command).

Neutrino Cooling
The effect important for the state of accretion disk is neutrino cooling. The neutrinos of three flavors are emitted via the above weak interactions, and the neutrino cooling rate is finally given by the two-stream approximation, and includes the scattering and absorptive optical depths for neutrinos of all three flavors: as given by [7]. This expression assumes that neutrinos are thermalized. Ideally, neutrino transport should be accounted for (see e.g. [25]).

Conversion Scheme
The composition-dependent EOS is in our simulations coupled to the conservative scheme. The HARM (high-accuracy relativistic magneto-hydrodynamics) scheme solves equations Eq. 5 where U is a vector of "conserved" variables, (i.e., the number density, energy or momentum density in the coordinate frame), F i are the fluxes, and S is a vector of source terms. that do not involve derivatives of P and therefore do not affect the characteristic structure of the system. In non-relativistic MHD, both P → U and U → P have a closed-form solution. However, in GRMHD U(P) is a complicated, nonlinear relation. Inversion P(U) is calculated once or twice in every time step, numerically. The transformation between 'conserved' (momentum, energy density) and 'primitive' (rest mass density, internal energy) variables requires to solve a set of 5 non-linear equations. This inversion is complex for a non-adiabatic relation of the pressure with density. We are doing it numerically and interpolate over the table of pre-computed EOS results.

Visualization and Post-processing of Results
The code produces a set of outputs that can be divided into three categories. The Initialization Output is only produced once, before the integration begins, and contains the constant quantities of the simulation like the grid and coordinates, the metric and its determinant on the grid points. Some model parameters are also stored during the Initialization stage, e.g. the BH spin parameter, the adiabatic index etc.
The Results Output contains the main results of the integration and stores the physical quantities of the flow (density, internal energy, contravariant velocity, magnetic field vector), while it contains also some other derived quantities of physical interest. Among the various options we used for the form of the dumping data (text, raw binary, HDF5, etc), the HDF5 proved to be the most advantageous. The 1-D, and 2-D models of an ideal conducting fluid does not pose significant restrictions on the dumped data type. The situation alters when we assume a composition dependent EOS or we proceed to the 3-D simulations where both file size and structure complexity increase dramatically. The hierarchical structure of the file makes easy to locate specific quantities through a POSIX -like syntax.
The size of the dumped file or of the objects it includes is not limited, while the special extensions, slib, zlib, can be used to compress the resulting file. But the most significant features that HDF5 provided, were the performance of its collective (parallel) I/O driver and the portability of the data in both C/C++ and Python/IPython interfaces. The robust performance of the I/O driver was noticed in both of the HPC and local servers we used. The post-processing and the visualization of the results for the 1-D, 2-D simulation is performed by the standard packages of the Python3 language (numpy, matplotlib/pylab, scipy). With the extension of the h5py package, the import of the results is straightforward and sequentially the powerful routines of the python libraries are used for further manipulation. An exemplary plot (Fig.  2) was produced using the matplotlib/pyplot module. The parallelization of the above scripts proved to be a challenging task and we concluded that the mpi4py module is easier portable to a series of machines from our Desktops and 32-Intel-cores local server, up to the Cray XC40 of the Okeanos HPC, mostly because of the parallel HDF5/h5py and the mpi4py module compatibility. It is clear that the parallelization is essential especially for the 3D post-process where various manipulations, e.g. interpolation, coordinate transformations, have to be done on the points of extended grids. Once these calculations are completed, we use PyVtk module to produce a VTK file output and more advanced tools for the 3D visualizations, e.g. VisIt. An example plot is shown in Fig. 3.
The Debug Output is the final set of output produced during the run time. Beyond the simple execution messages and the validity of the ∇ · B = 0 condition, the code performs a number of physical diagnostics during each step of integration and dumps the results through a series of binary and ppm graphic files (per process). The motivation for these tests is to help user to identify unphysical results and avoid time consuming calculations.
To sum up, the HDF5 format is characterized by: • Robust and satisfactory performance of the MPI I/O process.
• High portability in many interfaces (C/C++ and Python).
• Easy to locate quantities through the POSIX type structure. Furthermore, our Python main processing (cython under development) allows for: • Calculation of various physical quantities (h5py).

The Boundary Conditions
The HARM code does not solve the equations of the GRMHD in the Boyer-Lindquist coordinates system, but rather on a modified version of the so called Kerr-Schild coordinate system (KS). The KS system is not singular on the black hole horizon and therefore, the matter can accrete smoothly through this surface and the evolution of the flow can be followed, see for example [9]. A further transformation applies on the radial component of the specific system using a logarithmic mapping [11]. As a consequence, our points distribution is denser close to the horizon, when we assume an equally spaced grid on the r-direction.
The boundary conditions that apply on the radial direction are the free inflow-outflow conditions, modified by a specific extrapolation schema that reduces the unphysical behavior. This behavior is induced mostly by the variation of the metric between the normal and the ghost cells [11] and the selected extrapolation was chosen such as to maximize the robustness of the code. In reality, the radial boundary conditions have negligible effect on the physics of the problem under the proper The 2-dimensional simulations are, by definition, axisymmetric, i.e.the derivatives of quantities in the φ-direction are neglected; note however, that the vectors (velocity field, magnetic field) still have all the three components. In contrast for the 3-dimensional simulations the derivatives of the quantities are computed so the non-axisymmetric evolution is being followed properly. The periodic boundary condition is always used in the azimuthal direction and the flow quantities at φ 0 + 2π are the same as at φ 0 describing the smooth continuation of the solution between the neighboring domains.
The real technical problem is posed by the boundary condition at the z-axis, namely in the polar direction. Physically, the vertical axis is only the symmetry axis in the 2D flows, but it should not work as a real boundary in the 3D flows. In order to get some intuition on this effect, the reader can think of a Cartesian coordinate system. In such a case boundary conditions has to be set in an inner box, that will lay well inside the horizon and an outer one which lays at great distances, but not on the axis of the black hole rotation (see Fig. 4). A technique to overcome the extra needed boundary conditions is to cancel in practice the axis existence by attaining the 'ghost cells' variables from the values of the corresponding normal grid cells on the other side of the axis; notice though that the vectors φ− components must change sign because of the opposite direction of the φ−unitary. Therefore the application of the above technique requires by our algorithm to connect the two correspond grid domains and sets further complexity in our MPI implementation.

Parallelization Methods in HARM Simulations
The HARM code works in 1, 2, and 3-Dimensions. In the latter case, the optimal time for any realistic simulation requires parallel computing.
In the simulation presented in Fig. (3), i.e., the non-magnetized, 3-dimensional case, the grid resolution was 192x192x192 points. The number of HPC nodes was N=64, the number of tasks per node was n=24. The number of run-time steps of integration was about 110000 and the total real time of computation was about 12.7 hours.

MPI in Practice
The distribution of the processes among the physical system directions is provided by the function MPI Dims create(nprocs, 2, dims); currently is 2D, but it can easily generalized to 3D. According to this routine the divisors are set to be as close as possible using an appropriate divisibility algorithm, while dims[i] are ordered in decreasing order: An alternative procedure incorporated also in our schema distributes the number of processes using two criteria. In order to reduce bottlenecks, the N x × N y grid is distributed to the n x × n y processes on each direction such as N x −n x [ Nx nx ] processes loaded with [ Nx nx ] points and n x ([ Nx nx ]+1)− N x processes with [ Nx nx ] + 1 ones where the square brackets denote the integer part of the division; the same holds for the y direction. We then perform an optimization routine requiring that the grid points load per each process is as balanced as possible, while in addition the necessary MPI communications between the neighboring domains are minimum. The two criteria are calculated with different weight allowing us to perform further optimization based on the specific integrating system and the machine specifications.
The assignment of the Cartesian grid to the MPI communicator is performed with the default MPI Cart create and MPI Cart coords routines that provide a specific rank per direction to every process. The latter is of primary importance for identifying the mirror to the rotation axis points and applying the proper boundary conditions (see Sec. 3).

Supercomputer Performance Tests
We used both the standard Message Passing Interface (MPI) technique, and also a more advanced, Hybrid parallelization method. The MPI+hyper-threading was also tested (cf. Fig. 5, red and blue lines). The computational grid was divided into slices where every process works on its own area and boundaries are exchanged when needed. The pure MPI program running on the cluster of 1024 nodes and using 24 threads per node creates 24576 processes in total. It is a huge number, and the first expectation is that interprocess communication for boundaries exchange should decrease the overall performance.
The code occurred to be well-scalable for a uniform resolution, e.g. the number of the grid points in 3 dimensions equal to N r × N θ × N φ = 192 × 192 × 192. For non-uniform grids, the dependence on N nodes and N tasks/node is not monotonic, due to the properties of function MPI Dims create. Another method was to use the MPI + OpenMP shared memory, i.e., a Hybrid approach 3 . Here, only one MPI process is created per node and called master. The parallel execution is done for every loop in the code in fork-join mode. It was rather straightforward (in comparison to MPI) to add OpenMP calls, using only several pragma statements. Our preliminary tests were aimed to check if pure MPI-HARM, running on N nodes with 24 cores each will be more/less efficient with comparison to MPI+OpenMP hybrid solution running on the same number of nodes. The results are In our code, there is one significant difference with respect to the common MPI usage. We have computational domain divided into pieces and every MPI process uses only small memory region. This gives a faster memory access and it might make a difference.

Models Specific to Black Hole Accretion Systems
Below, we present some exemplary results of our simulations. Then, in the next Section, we discuss their results in the context of the visualization and post-processing methods. These 'technical' aspects are by no means trivial from the computational point of view, and proper analysis of the physics involved is tightly coupled with the post-processing demands. Finally, these simulations are at the limits of our computational resources, and fine numerical techniques have to be used to increase the efficiency of the simulations and code performance on the available computer clusters.

Magnetized Torus
In the purpose of better understanding of the black hole accretion and jets variability, we investigate the role of magnetically arrested disks (MADs) [32], as producer of turbulence in relativistic jet. MADs state occurs when the magnetic pressure force, pushes outward on the accretion disk gas. Because we need a certain amount of magnetic flux surrounding the black hole, to have enough magnetic pressure balancing the accretion, we need a large initial magnetized torus. To do this, we implement a thick disk model as an intial condition in HARM using the Chakrabarti's prescription [4]. In this model, the angular momentum distribution is chosen to have a power law distribution. Alternatively, the default disk model of Fishbone & Moncrief [8], is assuming the angular momentum to be constant in the disk. This model allows to create an initial torus configuration with a large amount of poloidal magnetic flux. The initial and evolved state of the trous, seeded by

Ejection of Relativistic Jets
If the black hole starts fastly rotate, the jet ejection is inevitable. The presence of magnetic fields and/or neutrino-antintineutrino pairs provide the mechanism for jets acceleration. The Blandford-Znajek process, which allows for the extraction of rotational energy of the black hole and transports it to the remote jets via magnetic fields, can be quantified in our simulations with the following expressionĖ Here, in Eq. (7) the magnetic part of the stress-energy tensor is integrated over the black hole horizon. In addition, the magnetic fields transport angular momentum in the accretion disk and allow accretion. In Fig. 7 we show the resulting power available for the jets, as dependent on the black hole rotation spin. The jets are accelerated at the vicinity of the black hole due to the central engine activity, and at large distances their kinetic energy has to be ultimately transferred to the emitted gamma ray radiation. To achieve this, the jet speed, expressed with the dimensionless Lorentz factor, , has to be on the order of 100. This is required to avoid so-called 'compactness problem', since the observed gamma ray spectra in GRBs are non-thermal. If the Lorentz factors in jets were small, the huge optical depth due to electron-positron pair creation would produce rather thermal emission [24]. The Lorentz factors of the jet estimated at 'infinity' in our simulations can easily reach the values around 80-100 (see Fig. 2). Power available to accelerate the relativistic jets, produced within the accretion torus in the GRB central engine. Several results are plotted, for a varying black hole spin parameter, a, as denoted in the figures. Two mechanisms are compared: Blandford-Znajek process (left) and neutrino anihillation (right). Note, that the latter has to be reduced by an efficiency factor, on the order of η ν ∼ 0.01, due to the uncertainties in the neutrino pair annihilation process.

Formation of Heavy Nuclei in the Neutron-rich Plasma
In the hyper-accreting disk at the GRB central engine, the conditions in the degenerated Fermi gas allow for substantial overabundance of the neutrons over protons. This is quantified with socalled 'electron fraction' ratio: Y e = 1 1 + n n /n p , which in the highly neutronized matter is much smaller than 0.5.

Summary
Numerical modeling of black hole accretion flows in extremely high energetic systems, such as the gamma ray bursts, is essential from the point of view of the recent observational discoveries. The discovery of gravitational waves in 2015, which was awarded the Nobel Prize in Physics in 2017, boosted the research also in high energy astrophysics, while it is related to the 'multi-messenger' astronomy. An example of the recently announced event is the kilonova signal, and the short gamma ray burst accompanied by the gravitational wave emission.
Kilonova effect • Optical and near-Infrared emission, powered by radioactive decay of r-process nuclei [19].
• Kilonova candidates can be distinguished from supernova by faster time evolution, fainter absolute magnitudes, and redder colors. • Dynamical ejecta from compact binary mergers, M ej ∼ 0.01M , can emit about 10 40 − 10 41 erg/s in a timescale of 1 week. • Subsequent accretion can provide bluer emission, if it is not absorbed by precedent ejecta [30].
• In the GRB 130603B afterglow, the excess NIR flux corresponds to absolute magnitude M (J) ∼ 15.35 mag at ∼ 7d after the burst, consistent with the kilonova behavior. The lightcurve is in agreement with predicted r-process kilonova optical emission [31]. Electromagnetic counter part: GW 170816 • Gravitational waves were discovered with the detection of binary black hole mergers and they should also be detectable from lower mass neutron star mergers. • NS-NS are predicted to eject material rich in heavy radioactive isotopes that can power a kilonova. • The gravitational wave source GW170817 arose from a binary neutron star merger in the nearby Universe with a relatively well confined sky position and distance estimate. • A rapidly fading electromagnetic transient in the galaxy NGC4993, is spatially coincident with GW170817 and a weak short gamma-ray burst [6,29].

Conclusion
With our simulations, we found that the proper microphysics treatment in GRMHD simulations of hyper-accretion is essential for determining the disk structure: thickness, chemical composition of torus and its ejecta. Furthermore, we concluded that the neutrinos and Blandford-Znajek process have comparable role in powering the GRB jets. As the large scale jets in GRBs are concerned, the variability of these jets is related to the disk's magneto-rotational turbulence timescale. The ultimate Lorentz factors are found to be on the order of few 100. Thus, the late-time X-ray and high frequency radio emission can provide constraints on the properties of the disk-jet system for a particular source, e.g. GW+EM170817. Finally, the magnetically driven, low Y e winds from accretion disks in GRB engine can provide power to kilonova emission, which was found in this source.