Xcompact3D: An open-source framework for solving turbulence problems on a Cartesian mesh

.


Motivation and significance
Understanding, predicting and controlling turbulent flows is of central importance and a limiting factor to a vast range of industries and scientific endeavours. It remains one of the most challenging problems in science. Furthermore, the design of many engineering and industrial systems as well as the prediction of The implementation of a numerical code for high-fidelity simulations of turbulent flows requires a wide range of expertise including both numerical methods and computer science. Correctly setting up a simulation for a given flow requires similar expertise. A major motivation of the Xcompact3D framework is the provision of a tool which can be used ''off-the-shelf'' to obtain high-quality simulations of turbulent flows. Further motivating the development of Xcompact3D (né Incompact3D [1,2]) is to bring together the numerous versions of Incompact3D developed in the last few years and to add new capabilities to the solver while making it more user friendly and flexible.
As a Cartesian mesh solver, Xcompact3D has a well-defined raison d'être and mission within the CFD community. Its unique features are its accuracy, efficiency, scalability, independence of external libraries apart from a MPI library, user-friendliness and its customisability. In particular, ready-to-run simulations are provided for users with limited experience in CFD as well as a template for the only input file needed to run a simulation in a new set-up. Compilation is very fast and simple with only one Makefile and no dependence on external libraries (except for MPI). The numerical parameters of a simulation (domain size, number of mesh nodes, number of cores, etc.) can easily be changed without having to recompile the code. The manuscript focuses on exactly these aspects. Section 2 is dedicated to the software architecture of Xcompact3D including its parallel computing strategy, functionalities, main features and scalability. Subsequently, illustrative examples and impact case studies are shown in Section 3 and Section 4. Finally conclusions and future directions are briefly discussed in Section 5.

Software description
The turbulent flows described in Xcompact3D are governed by the Navier-Stokes equations, given for low Mach number (LMN) flows as where u i is the velocity vector, p the pressure, ρ the density, T the temperature, τ ij the viscous stress tensor, κ the heat diffusion coefficient and D (·) /Dt is the material derivative. Re is the Reynolds number, a dimensionless number defined as the ratio between inertial and viscous forces, which determines the range of turbulent scales to simulate. Similarly, the Prandtl number Pr is a dimensionless number defined as the ratio of momentum to thermal diffusivity. As shown in (1)-(5) under the LMN approximation the pressure field may be decomposed into the thermodynamic p (0) , and mechanical p (1) pressures. The former is constant in space which allows filtering of the acoustic pressure waves from the solution, whilst the mechanical pressure behaves as the pressure field of an incompressible flow; for further details the interested reader is referred to [3][4][5]. This formulation allows Xcompact3D to tackle flows with variable density in the regime 0 ≤ M ≲ 0.3 where φ is a scalar, D a scalar diffusion coefficient, Sc the Schmidt number defined as the ratio of kinematic viscosity to mass diffusivity and subscript s is the scalar counter.

Software architecture
An overview of the modular structure of the framework is presented in Fig. 1. Xcompact3D is build around the 2D Decomp & FFT library [2,6] which is designed to implement a 2D pencil decomposition (as shown in Fig. 2(a)) for data distribution on distributed-memory platforms. This library also provides a highly scalable and efficient interface to perform three-dimensional distributed Fast Fourier Transforms, which are used to solve the Poisson equation (needed in the Navier-Stokes equations subjected to LMN or incompressibility approximations). Equations 1 − 5 are solved with high-order compact finite-difference schemes on a Cartesian mesh, with a conventional fractional step strategy based on the general principle of projection methods [7,8]. Since DNS of high Reynolds number flows is very often not possible due to its computational cost, LES can be performed with an implicit approach for which the sub-grid scale (SGS) stress tensor is modelled using numerical dissipation [9,10]. Different explicit SGS models are also available for completeness, namely the standard Smagorinsky model and its so-called dynamic version (see [11] and [12] for more details). For flows around relatively complex geometries, it is possible to use a customised Immersed Boundary Method (IBM) [13]. IBMs, originally proposed by [14] are very attractive for Cartesian meshes, avoiding sophisticated body-fitted approaches and their associated loss of simplicity and increase of computational cost. Finally, please note that there exists a new Python library for the post-processing of the data, Py4Incompact3d, based on the same schemes as in Xcompact3D. 1

High-order finite-difference schemes and spectral Poisson solver
The superior properties of high-order schemes for DNS/LES in comparison to more conventional low-order schemes is now fully recognised, particularly regarding their ability to accurately capture a wider range of turbulent scales for a given spatial resolution. Standard spectral methods based on Fourier or Chebyshev representation yield highly accurate and efficient solutions to the Navier-Stokes equations, however with severe restrictions to their applicability. High-order compact finite-difference schemes on the other hand, approach the accuracy of spectral methods and allow greater flexibility in the selection of boundary conditions (in Xcompact3D, it is possible to use periodic, Dirichlet and Neumann boundary conditions). Although compact schemes are implicit in space, they are very competitive in terms of computational efficiency by comparison to more advanced spectral element methods designed for body-fitted meshes (see [15,16] for more details). Table 1 presents an overview of the compact schemes used in Xcompact3D. The velocity field information is on the main mesh (cross nodes) while the pressure field on the half-staggered one (dots), see inset diagram in Table 1. Various second derivative schemes are available, given in Table 1, depending on the type of simulation (hyper-viscous scheme for ILES, other schemes for DNS and explicit LES). The filtering scheme is only for use in explicit LES for the dynamic Smagorinsky model. Various sixthorder interpolation schemes with different properties for the small scales are available, but the recommendation is to use the optimised scheme as it can improve the quality of the solution in the context of LES. Note that the more conventional second-order schemes are also presented for completeness.
Xcompact3D advances the governing equations via the fractional step method [7,8], forming a Poisson equation for the pressure by taking the divergence of the momentum equations. This operation employs the interpolatory first derivatives given in Table 1 so that pressure is evaluated at the centre of control volumes (circle nodes in inset diagram) where the incompressibility condition is checked, the pressure gradients to correct the velocity are then evaluated by the reverse operation; this partial staggering is done to prevent spurious oscillations [1]. One of the main originalities of Xcompact3D is that the Poisson equation is solved in spectral space using the concept of modified wave numbers [17] for which operations in physical space are strictly equivalent to those in spectral space. Such a direct strategy, to avoid the use of expensive iterative techniques, is not new for periodic and/or free-slip boundary conditions for the velocity field [18], however it has not been used in the past for Dirichlet boundary conditions combined with high-order finite-difference schemes [1].

Time advancement
The flow field is initialised either with an initial condition (init_xcompact3d in Fig. 1) or by loading a restart file. The Navier-Stokes equations advance in time through the fractional step method, implemented in 4 key steps. First, the right hand side terms are evaluated numerically, calc_transeq_rhs in  Fig. 1. At the end of a time step, the user can decide which quantities to store (postprocessing in Fig. 1).

The 2D Decomp & FFT library
The finite-difference schemes and spectral Poisson solver employed by Xcompact3D naturally break down into a series of 1D subproblems. It is therefore natural to parallelise the computational domain with a pencil decomposition as illustrated Fig. 3. Results from Taylor-Green Vortex simulations at Re = 1, 600 (top row, with reference data of [10] in black) in a DNS setup and for Re = 10,000 (bottom row, with reference DNS data of [10]). In the figures X3D-O2 corresponds to DNS data produced with conventional second-order schemes, X3D-SS corresponds to the standard Smagorinsky model, X3D-DS to its dynamic version and X3D-iSVV to the implicit approach. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) in Fig. 2(a). Each decomposition (in the x, y and z axes respectively) allows independent computation of derivatives/ interpolations/filtering. Global transpositions to switch from one pencil to another are performed with the MPI command MPI_ALLTOALL(V). The 2D Decomp & FFT library also provides a parallel I/O module to help Xcompact3D handle large data, using MPI-IO. More details about the parallel computing strategy implemented in Xcompact3D can be found in [2].
The scalability of Xcompact3D has been tested recently on various Tier-1/0 supercomputers ARCHER, MareNostrum and HazelHen with simulation sizes of more than 68 billion mesh nodes. The scalability plots have been obtained for the Taylor-Green case (see details in the next section) with a very high Reynolds number of 10,000 in a DNS set-up. Xcompact3D exhibits very good strong (see Fig. 2(b)) and weak (see Fig. 2(c)) scaling properties across the three HPC platforms and for up to 100,000 computational cores (with one MPI-process per core). Despite the large number of global communications (up to 70 per time step), the scalability is excellent, except maybe when the number of mesh nodes per core is too small, and the cost of the small communications dominates the calculation.

Illustrative examples
Example applications are defined in the Xcompact3D source under the examples/ directory. These demonstrate the use of different aspects of the framework and allow new users to begin using Xcompact3D immediately, even if they are not familiar with the framework and its numerical methods. In this section, we present four of these examples to illustrate the use of Xcom-pact3D to solve well-known CFD benchmarks. Each case consists of run-time settings specified in examples/<CASE>/input.i3d which are easily modifiable by the user and Fortran code implementing case-specific boundary/initial conditions, source terms etc. in src/BC-<CASE>.f90 which must be compiled. Note that a special case USER is supplied for users to create their own setups. The input files for each case described can be found in the source repository or can be downloaded here with the supporting data used to generate the figures in this section. Please note if a user does not know how to define a parameter, default values are defined in src/parameters.f90 so it should not be a problem to run a simulation even if some parameters are not defined in the input file.

Taylor-Green vortex
The Taylor-Green Vortex (TGV) is a well-known benchmark in CFD, modelling the transition from an initially laminar state to a fully turbulent one [10,24,25]. It is attractive as a test case due to its simple setup, with the possibility to use a combination of periodic and free-slip boundary conditions. Further details of the case setup can be found in the corresponding input.i3d files.
The DNS case was performed on a CPU-based cluster at Imperial College London (see performance here) using 32 cores for 30 min (20,000 time steps) while the LES cases were performed on 64 cores on the same cluster for about 2 h (40,000 time steps). Quantitative comparisons of the temporal evolution of kinetic energy and its dissipation with the reference data of [10] are presented in Fig. 3. For the DNS case at Re = 1, 600 the 6th order compact schemes show excellent agreement with the results of [10], whereas the 2nd order schemes show some discrepancies.  [19] (×) and [20] (+ ) -top row; and for flow past a cylinder at Re = 300 at several downstream stations compared with reference data of [21] (♦) -bottom row.
It clearly demonstrates the ability of high-order schemes to capture a wider range of turbulent scales at a given spatial resolution by comparison to more conventional second-order schemes. The LES results at Re = 10,000 show that the implicit model is able to reproduce the main features of the flow as already reported in [10]. The explicit LES models however struggle to reproduce the DNS data, especially for the kinetic energy dissipation. Note finally that the explicit LES results are different from the ones obtained in [10] (different constant and version for the models).

Channel flow and flow past a cylinder
The canonical turbulent channel flow and the flow past a cylinder are very well-documented CFD benchmarks, widely studied theoretically, experimentally and numerically. The main challenge associated with these flows from a simulation point of view is related to the presence of a wall in the computational domain, either at the boundaries for the channel flow or immersed inside the computational domain for the cylinder flow. To demonstrate Xcompact3D's capabilities to handle both wall-bounded flows and flows with immersed boundaries, DNS of a turbulent channel flow at Re τ = 180 (based on the half-height of the channel and the friction velocity at the wall) and of a flow past a circular cylinder at Re = 300 (based on the diameter D of the cylinder and the free stream velocity) are presented and compared with reference data. The numerical parameters for these two simulations can be found in the relevant input file. The channel case was performed on ARCHER with 512 computational cores for 6 h (300,000 time steps) while the cylinder case was performed on ARCHER with 512 computational cores for 60 min (160,000 time steps). Some results are shown in Fig. 4. It can be seen that a very good agreement is obtained with the reference data for the first and second-order moments for the channel flow, and for the mean and fluctuating streamwise components of the velocity downstream of the circular cylinder.

Lock-exchange flow
Finally, as an example of the variable density flow capability for Xcompact3D, the well-known lock-exchange case for gravity currents is studied, and data are compared with the data of [23]. The simulations were performed on Imperial's cluster with 512 computational cores for 12 h (100,000 time steps). In this setup, uniformly suspended particle sediments are enclosed in a small portion of the domain separated by a gate with clear fluid. When the gate is removed the particle-fluid mixture flows due to gravity with a mutual inverse interaction between the ''heavy'' particle-mixture flow and the ''light'' clear fluid. An illustration of the flow can be seen in Fig. 5(a) where the gravity currents develop a highly 3D turbulence when evolving away from their initial reservoir. When studying buoyancy-driven flows, such as gravity currents, the Boussinesq approximation is frequently used allowing the use of an incompressible formulation only when the density variations are small, typically less than 1%. When changes in density are higher, the low Mach number solver needs to be used (non-Boussinesq case). The reference data of [23] is used to validate the setup in the Boussinesq limit, and a good comparison of energy budgets is found as shown in Fig. 5(b). With limited data for comparison, the energy budget for the non-Boussinesq case is presented with density ratio ρ 2 /ρ 1 = 0.7 in Fig. 5(c), noting that the total energy is conserved.

Impact
Xcompact3D is dedicated to DNS/LES, combining the versatility of industrial codes with the accuracy of the best academic codes based on spectral methods (the most accurate ones). It has been used recently worldwide for a variety of projects ranging from the study of the main features of gravity currents [22,26,27], a performance analysis of active and passive drag reduction techniques [28][29][30], the heat transfer features of a turbulent impinging jet [31,32], the flow over a periodic hill to design a machine learning framework [33,34], active control of turbulent jets [13,35], and nonequilibrium scalings of turbulent wakes [36][37][38]. Two impact case studies are highlighted in more detail below.

Fractal-generated turbulence
Laboratory and computational works have recently used multiscale/fractal objects to generate turbulence and have shown that complex multiscale boundary/initial conditions can drastically influence the behaviour of a turbulent flow [39][40][41][42][43]. Fractal geometry is a concept where a given pattern is repeated and split into parts, each being a reduced-copy of the whole, see Fig. 6 for a fractal square grid. Multiscale objects can be designed to be immersed in any fluid flow where there is a need to passively control and design the turbulence generated by the object. They offer the opportunity to discover new complex flow effects that can help understand how to manage complex fluid flows. The first DNS of the flow generated by fractal objects were performed with Xcompact3D nearly ten years ago [44,45] and have allowed the discovery of the Space Scale Unfolding (SSU) mechanism as a way to dramatically improve stirring in turbulent flows [46,47].

Wind farm simulator
Modern wind farms consist of multiple turbines clustered together to achieve large-scale energy extraction from a given site. Turbine-turbine interactions through wake interference, enhanced by a complex terrain and atmospheric turbulence phenomena, can lead to reductions of the wind farm energy production by 20% [49]. To this end, a Wind Farm Simulator (WInc3d), built on Xcompact3D, has been developed in order to investigate such turbine-turbine interactions by conducting high-fidelity, turbulence-resolving simulations of the turbine wakes and the ambient atmospheric turbulence [48,50] (see Fig. 6). Additionally, it has also been demonstrated that the implicit LES approach implemented in Xcompact3D is perfectly capable of reproducing the main features of laboratory-scale turbine wakes, with the possibility to investigate various control strategies and operational scenarios [50].

Conclusions and future direction
In this paper, we have shown how Xcompact3D can easily be used to simulate a wide range of turbulent flows. This original framework can be used on a laptop for simple flow configurations but can also be used on the most powerful CPUbased supercomputers in the world for large-scale simulations. Its user-friendliness, simplicity, versatility, accuracy, scalability, portability and efficiency makes it a very attractive tool for the CFD community. The next planned major development of the framework is to port Xcompact3D to GPU-based supercomputers. In addition, we are currently working on a free surface solver to simulate air-water interfaces and on implementing the full compressible Navier-Stokes equations to study compressible turbulent flows.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.