Flow–heat topology optimization of internally cooled high temperature applications using a voxelization approach for domain initialization

ABSTRACT A method is presented for obtaining topology optimized designs for internally cooled high temperature applications, using a flexible geometry description, by means of a voxelization methodology and a novel boundary detection algorithm. A conjugate heat transfer approach is taken; the physics is described by a Stokes–Brinkman model for the flow, weakly coupled with a convection–diffusion model for the heat transfer. A practically relevant optimization formulation, consisting of a maximum temperature objective with a mass flow constraint, is used, and applied to an industrial-relevant non-trivial geometry resembling a guide vane in a gas turbine. Temperatures and velocities from the optimized design are compared with the response from a Stokes flow model with body-fitted mesh and a high-fidelity Reynolds-averaged Navier–Stokes model. A comparison of the performance from a mixed and a penalty approach for solving the flow problem is included. The voxelization approach shows good promise for handling complex design domains.


Introduction
Topology optimization (TO) (Bendsøe and Sigmund 2004) is a potent design tool for engineers in the early design process owing to its generous design freedom, parametrizing every point in the design space.Realization of non-trivial TO designs in 3D was for long not feasible, but lately it has become possible by means of additive manufacturing (AM) (Gibson et al. 2021), a technique by which parts are made by adding material in a layer-by-layer fashion.
Lately, AM has gained interest in the gas turbine industry, where AM components can be found in the hot sections of turbines, e.g.around the combustion chamber and turbine blades (Andersson et al. 2017).The gas turbine could act as a grid stabilizer in a future, sustainable but more volatile, energy system, using hydrogen as fuel (Öberg, Odenberger, and Johnsson 2022).However, the higher flame temperature of hydrogen (Glassman, Yetter, and Glumac 2015) increases the rate of wear and tear in turbine components owing to, for example, increased thermomechanical loads and deteriorated material properties, causing fatigue, creep and large plastic deformations.
A gas turbine component that is a promising candidate to be designed using TO and manufactured by AM is the guide vane.Guide vanes are found in the hot part of gas turbines, statically mounted in a radial configuration with fixed inner and outer radii, directing the hot gas flow from the combustors to the turbine blades, see Figure 1.Owing to the hostile environment, the guide vanes need to be cooled from within to keep the material at permissible temperatures.Coolant enters the interior of the guide vane radially from the outward facing short end and is released back into the hot gas stream through the trailing edge of the guide vane (and to some extent also through so-called shower heads, which create a protective film of coolant along the leading edge, a configuration that is not considered herein).The coolant consists of compressed air taken from the latter stages of the compressor and is therefore very expensive to use, as it would otherwise have been used to power the turbine.However, the coolant needs to be highly pressurized to avoid back-flow of the hot gas stream through the guide vane.
Since guide vanes essentially act like compact heat exchangers when cooled from within, it is possible to get an idea of how a good interior design might look by investigating the design of heat exchangers in general.A main concept is to have a large surface-to-volume ratio to enhance the heat transfer between solid and fluid.Furthermore, fins, tubes and lattice structures are common design features of heat exchangers (Holman 2010).This corresponds to the guide vane's having a large and complex interior interface relative to the space occupied by the coolant.
A desirable design of the guide vane is one where the outer surface temperature is kept as low as possible, but without spending too much cooling air, and thus decreasing the efficiency of the turbine.This constitutes a conflict of interest that is addressed by analysing a multiphysics TO problem of conjugate heat transfer (CHT), using the maximum temperature on the surface of the component as objective, and the coolant mass flow as constraint.
Herein, numerical examples using a 3D geometry resembling a gas turbine guide vane are presented.To facilitate the computations, the PETSc 1 suite (Smith 2011) is utilized, which uses the message passing interface (MPI) standard for parallelization to efficiently solve partial differential equations numerically.Aage, Andreassen, and Lazarov (2015) demonstrated the capability for doing high performance computing (HPC) TO in PETSc, and the implementation presented herein is inspired by that work.
A motivation of this work is to allow for 3D design domains with non-trivial shapes.To that end, a voxelization method (Zhang et al. 2021) is employed to analyse general geometries on unfitted structured meshes, somewhat similarly to CutFEM (Burman et al. 2015) and other approaches for unfitted meshes (Badia, Martorell, and Verdugo 2022).A boundary detection algorithm is implemented to refine the imposing of boundary conditions.It is believed that the voxelization, together with the new boundary detection algorithm, contributes to making TO a viable option in the industrial design process by achieving a higher degree of automation.The voxelization approach has the potential to produce high quality meshes with extremely fine resolution, something that is beneficial for simulations of complex geometries, such as flow through narrow channels.
It is not common practice in the TO literature to include numerical verifications of results from density-based TO models with a monolithic mesh (i.e.interpolating between solid and fluid using the Brinkman parameter), especially not for fluid applications.One exception is the recent work from Rogié and Andreasen (2023), in which a heat sink is optimized and later verified against a CFD model using a body-fitted mesh, and other examples include Zhao et al. (2018), Haertel et al. (2018) and Pietropaoli, Gaymann, and Montomoli (2020).A numerical verification is often a way to avoid the complicated procedure of manufacturing and experimental testing.Before AM was common, it was only reasonable to manufacture and conduct experimental testing of simple (often extruded 2D) geometries (Koga et al. 2013;Dede, Joshi, and Zhou 2015;Li et al. 2019), and few complex 3D designs have been considered (Lei et al. 2018;Lazarov et al. 2018).Details are included on the numerical comparison of responses from the TO model and two models using a segregated mesh: a Stokes flow model (a TO model twin), and a high-fidelity Reynolds-averaged Navier-Stokes (RANS) turbulence model.Dede (2009) and Yoon (2010) published the first articles on multiphysics flow-heat TO.Since then, much has been published on laminar flow in 2D, mainly including examples of heat exchangers with trivial geometries, see e.g.Alexandersen and Andreasen (2020) and Fawaz et al. (2022) for comprehensive reviews.
Few articles include 3D examples of TO for CHT (Alexandersen and Andreasen 2020).Several approaches are taken to avoid modelling full 3D, such as using an extruded 2D design for planar flow in a 3D context (Koga et al. 2013;S. Qian et al. 2018).More advanced models include two-layer models (McConnell and Pingen 2012;Zeng, Kanargi, and Lee 2018;Haertel et al. 2018;Yan et al. 2019) and other 2D-design/3D-application models (Haertel and Nellis 2017).When full 3D examples are presented, low-fidelity flow models are often used, such as the Darcy flow model (Zhao et al. 2018;Kambampati and Kim 2020).Even examples without flow models exist, e.g.where the convection is simulated by interpolating the heat transfer coefficient based on the design density values in 2D (Bruns 2007), 3D (Zhou et al. 2016) and 3D with 2D design (Lundgren et al. 2019).
Steady, laminar Navier-Stokes (NS) flow models have been used recently for CHT problems in 3D (Yaji et al. 2015;Pietropaoli, Montomoli, and Gaymann 2019;Sun, Liebersbach, and Qian 2020).Regarding NS with turbulence modelling, Kontoleontos et al. (2013) presented 2D examples with a prescribed temperature distribution in the solid, and Dilgen et al. (2018) modelled turbulence in 3D CHT problems, something that is rare in a TO context (Alexandersen and Andreasen 2020;Fawaz et al. 2022).However, a few more recent articles on that topic exist (Pietropaoli, Gaymann, and Montomoli 2020;Holka et al. 2022).Fawaz et al. (2022) showed that articles investigating the TO of heat exchangers mostly consider only the temperature as an objective (Yaji et al. 2015;Haertel and Nellis 2017;Haertel et al. 2018;Zhao et al. 2018), or a weighted sum multiphysics-objective (Marck, Nemer, and Harion 2013;Pietropaoli et al. 2017;Sato et al. 2018;Thore, Lundgren, and Lundgren 2023).One exception is the work by X. Qian and Dede (2016), where the weighted sum multiphysics-objective is accompanied by a tangential thermal gradient constraint.Herein, a restriction is used for the mass flow instead of the temperature.By doing this, it is not possible to prescribe the inlet mass flow or flow velocity, as is commonly seen in the literature.Very few publications treat CHT with unspecified inlet mass flow.However, some examples exist, e.g.Yaji et al. (2015) in a transient context, and Haertel and Nellis (2017) and Haertel et al. (2018) for simplified (2D-)3D models.
A mass flow constraint similar to the one used herein has been used before in pure flow models (Gersborg-Hansen, Berggren, and Dammann 2006;Aage et al. 2008;Behrou, Ranjan, and Guest 2019), but to the authors' best knowledge, the only example of application in flow-heat models is Thore et al. (2022), where it was combined with a thermal compliance objective.

The conjugate heat transfer problem
The computational domain (Figure 2) is denoted by ⊂ R d , d = 2, 3. Its boundary = in ∪ out ∪ U ∪ q is made up of disjoint parts, with outward normal n.The domain can be divided into disjoint parts f and s , containing a fluid and a solid, respectively.s represents a metal structure that almost entirely encloses f , by means of the outer non-designable solid shell s,fix ⊂ s .A coolant driven by a traction t F occupies f , enters at the inlet in , and exits at the outlet out .The thermal driving force is a hot gas stream heating U by convection.The boundary q is adiabatic.

State problems
A standard density-based TO formulation is used wherein the design at a point x ∈ is described by ρ(x) ∈ [0, 1], such that ρ(x) = 1 ⇔ x ∈ s .The design is regularized using a linear density filter (Bourdin 2001), with radius R, to avoid checkerboards and mesh dependency (Sigmund and Petersson 1998).The regularized design is denoted ρ = ρ(ρ).

The flow problem
The strong formulation of the Stokes flow problem in , augmented with a Brinkman term −α( ρ)u (Borrvall and Petersson 2003) for the purpose of interpolating between s and f , is to find the velocity u and the hydrostatic pressure p satisfying The fluid is characterized by the constitutive relation where μ > 0 is the dynamic viscosity.The Brinkman parameter describes the impermeability of the material as a function of the regularized design variable, where q is a penalty factor, and the impermeability parameter α 1 represents the resistance to flow permeation in s .
In Section 3.1, the numerical solving of the flow problem is discussed, using both a mixed and a penalty approach.

The heat problem
The strong formulation of the convection-diffusion heat problem in is to find the temperature T satisfying where T ∞ is the ambient temperature of the hot gas, U is the heat transfer coefficient, and c = c v ρ f , in which c v and ρ f are the specific heat capacity and the material density of the coolant, respectively.The thermal conductivity k( ρ) = k f + (k s − k f ) ρ is linearly interpolated between the solid (k s ) and fluid (k f ) conductivities.

Discretization
The state problems are solved with the finite element (FE) method on the approximated domain h .The FE approximated hydrostatic pressure, velocity and temperature are denoted with p h , u h and T h , respectively.The mesh is structured and made up of n e equally-sized and cuboid-shaped eight-noded HEX-elements.The same mesh is used for both the flow and the heat problem, and both use standard Galerkin methods.
For the design problem, an element-wise constant design density ρ h is used.The corresponding regularized, physical density ρh = ρh (ρ h ) is also approximated as element-wise constant, with element values denoted ρe .

The FE flow problem
Two different approaches are investigated for solving the flow problem, and a comparison between them is to be found in Section 4.2.1.

Penalty approach
In the penalty approach, the volume continuity equation in (F) is not considered as a separate equation of state.Instead, the penalty parameter λ ≥ 0 is introduced, and the pressure is which embodies the near-incompressible assumption, in the sense that λ → ∞ ⇒ ∇ • u → 0. The velocity is approximated as continuous and element-wise trilinear (Q1) and the resulting matrix formulation reads where u ∈ R 3n n is the global velocity vector, and n n is the number of nodes in the domain.The element stiffness matrix and load vector are where N F is the flow shape function matrix (such that u h | e = N F u e ) and B F its gradient, C μ and C λ are constitutive matrices for the viscosity and the penalty contributions, respectively, and the element domain e ⊂ h .The use of Q1 elements requires reduced integration on the penalty term in (4) to avoid volumetric locking (Hughes 2000).The penalty approach has shown good capability of approximating the velocity, despite the possible occurrence of pressure instabilities (Johnson and Pitkäranta 1982).However, the performance of this method depends on the choice of the penalty parameter λ in (2).This parameter cannot be too low, or otherwise the flow will not be incompressible (enough).And if the parameter is too large, the convergence of the linear state problem is affected, even when using direct methods such as Cholesky factorization.Hughes (2000) suggests a value of 10 7 < λ/μ < 10 9 .Given the specific problem and parameters herein, a good number is found to be in the range 10 9 -10 10 .

Mixed approach
In the mixed approach, where both the velocity and the pressure are unknown, the volume continuity equation in (F) is considered as a separate equation of state.A pressure-stabilized Q1-Q0 method is used, in which the pressure is approximated as element-wise constant (Q0) and the pressure-jump between adjacent elements is penalized to ensure stability and accuracy of the method (Hughes and Franca 1987).The matrix formulation of this approach reads where p ∈ R n e is the global pressure vector.The element contributions are where A ef is the area of the face between elements e and f, F e is the set of faces on e \ h , h max is the largest element side length (which is the same for all elements in the uniform mesh used), and the nonnegative number δ is a user-specified parameter that should be large enough to get rid of oscillations in the pressure field, but small enough to avoid (roughly) constant pressure solutions-leading, similarly to what happens with a too small penalty parameter, to designs without channels.The vector c e ∈ R n f is defined from the pressure jump between two adjacent elements as where p e collects p e and the neighbouring element pressures, such that n f ≤ 7 in 3D, excluding faces on h .

The FE heat problem
The temperature is approximated as continuous and element-wise trilinear.The matrix version of the heat problem reads where t ∈ R n n is the global temperature vector.The element stiffness matrix and load vector in the heat problem are where N T is the temperature shape function matrix (such that T h | e = N T t e ) and B T its gradient.
The convection term in the heat problem can cause oscillations in the temperature solution for high flow speeds.A cheap way of taking care of these oscillations is by adding the Streamline-Upwind (SU) stabilization to (6).The stabilization parameter, evaluated at each Gauss point, is where • is the Euclidean norm, Pe i the directional Péclet number and [•] i represents the ith component of the vector.

Voxelization
The original TO implementation in PETSc (Aage, Andreassen, and Lazarov 2015) only allows cuboid geometries with a uniform element size.The aim is to extend the analysis beyond cuboid geometries, and to that end a so-called voxelization method is employed, to handle more general geometries.
The voxelization method used is based on the work by Zhang et al. (2021), where standard triangle language (STL) files are used for the geometry description.The voxelization approach eliminates the need for advanced and time-consuming manual meshing procedures, while still allowing general geometries to be analysed.Also, the element quality will turn out to be really good, since all elements are cuboids of similar size, and will be very near perfect cubes if the number of elements in each direction is chosen wisely.This is particularly important for iterative linear solvers, whose performance can be greatly affected by mesh quality.
The output of the voxelization, illustrated in Figure 3, is a cuboid mesh consisting of voxels.After the mesh is scaled and translated, it encloses perfectly, and the voxels are recognized as the FEs used to discretize , by assigning elements intersecting s,fix to s,fix h , the remaining elements intersecting to the designable domain h \ s,fix h , and the rest to v,fix h .The total (cuboid) domain is tot h = h ∪ v,fix h , but the state problems are solved on h only, by means of prescribing a zero-value to degrees-of-freedom (DOFs) outside h .Furthermore, ρ e = ρe = 0 and ρ e = ρe = 1 are prescribed for elements in v,fix h and s,fix h , respectively.Implementation-wise, two different STL files, describing and s,fix , are needed to make the partitioning of domains.The result is two vectors identifying elements in s,fix h and the designable domain h \ s,fix h , respectively.

Boundary detection algorithm
The imposing of the boundary conditions is refined compared with Zhang et al. (2021), in which the conditions are imposed on a set of elements closest to the boundary.Herein, the boundary conditions are instead only imposed on sets of nodes belonging to the voxelized boundary h , by means of a boundary detection algorithm.
For each boundary condition, except adiabatic and traction-free boundaries, an STL file is provided, which is voxelized using the algorithm by Zhang et al. (2021).In the numerical example, two STL files are provided describing U and in , respectively.The output from the voxelization consists of vectors identifying HEX-elements intersecting the boundary definitions in the STL files.To identify which nodes of each HEX-element are on h , an algorithm to identify all nodes lying on h is used, see Figure 4.The result is a vector (b) that can be used together with the output from the voxelization process to identify the exact nodes to which the boundary conditions should be applied.Here, e is the element number, n tot n is the number of nodes in tot h , n tot e is the number of elements in tot h , x v,fix is a vector indicating if an element belongs to v,fix h , and the output b is a vector indicating if a node is on the boundary or not.

The design problem
The objective is to minimize the maximum temperature in h .The maximum temperature is expected to be found on the convection boundary U h = U ∩ h , and to this end the maximum temperature is approximated using an objective function based on an P -norm as where P is the power parameter, and f T is included in a compliance-like fashion to weigh each nodal temperature in order to obtain a measure only over the nodes on U h .The mass flow constraint reads where −n is the inward boundary normal, m is the specified maximum mass flow, and the surface integral is (exactly) evaluated using a 2 × 2 numerical integration scheme.
The complete design problem reads Table 1.Continuation parameters.

Numerical examples
PETSc (v.3.16.3)(Smith 2011) is used for the numerical calculations.The TO implementation is inspired by the code described in Aage, Andreassen, and Lazarov (2015), and runs on the high performance computing clusters Sigma (NSC 2023a) and Tetralith (NSC 2023b), where each computational node has two Intel Xeon Gold 6130 CPUs for a total of 32 cores and 96 GiB of RAM.
The design problem is solved using the method of moving asymptotes (MMA) solver (Svanberg 1987), with analytical gradients computed using the discrete adjoint method (see the appendix).The initial guess is ρ e = 0.5 everywhere.To avoid getting stuck in poor local minima, while achieving close-to-binary-valued designs, a couple of continuation steps for the Brinkman parameters α and q in (1) is needed, see Table 1.Each continuation step runs for a maximum of 250 iterations, except for the final step, where 500 iterations is allowed.A stopping criteria is set to ρ k − ρ k−1 ∞ < 0.001, where ρ k is the design density vector in MMA iteration k, but this criteria was never met during this work.MMA and design problem parameters are to be found in Table 2.
The majority of the time to convergence of the optimization problem is spent solving the flow problem in each MMA iteration.The penalty approach is used to produce the designs shown in Section 4.2, and the solution of the state problem is obtained using Cholesky factorization from the parallel MUMPS suite (v.5.4.1)(Amestoy et al. 2001(Amestoy et al. , 2019)).An iterative refinement process with three steps is implemented to achieve a slightly better residual at a very cheap cost.The flow problem parameters are to be found in Table 3.
The heat state problem is solved by the iterative FGMRES solver, with a multigrid preconditioner (PCMG).Parameters for the heat problem are to be found in Table 4.
The adjoint problems are solved with the same solvers as the respective state problem.The symbolic factorization of the flow stiffness matrix is done once in the beginning of the optimization process, while the numerical factorization is done every iteration, but is saved during the MMA iteration for use in the adjoint flow problem, making that solution time negligible.
The value of the dynamic viscosity μ in Table 3 is not based on the fluid material parameter, which is several orders of magnitude lower.Instead, a modelling choice has been made to use an  increased value, mainly to get a smooth convergence of the model.It is stressed that this choice is not a fundamental aspect of this work.

Geometry
A tapered 2D shape has been extruded in the normal direction to obtain a geometry resembling that of a true guide vane.The height is 120 mm, and the other dimensions are to be found in Figure 5.The boundary definitions are given in Figure 6, and the domain definitions are illustrated in Figure 7.The complete mesh (covering the total domain tot h ) is 240 × 30 × 200 elements, utilizing symmetry in the y-direction.For h , the number of elements is 910,410, and the number of nodes is 969,443.

Results
The visualization in Figure 8 shows the solid domain s h = { e | ρe ≥ 0.5} after each continuation step.The colour indicates the relative normal distance from the symmetry plane, to illustrate the depth in the figure.The first two iteration steps show designs without continuous channels from inlet to outlet, due to the low value of α, which makes significant flow through solid parts possible.The channels are more clearly defined when raising the impermeability parameter α in steps 3 and 4.
The iteration history in Figure 9 shows good convergence, with some oscillations in the fourth continuation step.Disregarding the spikes in the maximum temperature in Figure 9(a), which reach 1390K but are cut from the figure for visualization reasons, the appearance of the objective function and the maximum temperature are similar, indicating good agreement between the two measures.Probably, the number of iterations in the initial phase could have been reduced, based on the flat appearance of the objective function during the final parts of the first two continuation steps.
An enlarged version of Figure 8(d) appears in Figure 10, together with a temperature plot of the solid domain with flow streamlines, also temperature coloured.Figure 11 It seems to be optimal to only have one large channel occupied by some free-floating islands (solid subdomains disconnected from h ), clearly visible in Figures 10(b) and 11(b) as regions with considerably lower temperature than other solid regions nearby.Probably, the role of these islands is to slow the flow, so that the mass flow constraint is fulfilled.The free-floating islands makes it impossible to realize the obtained designs.To achieve designs without these islands, it is necessary to include something in the optimization problem that prohibits the forming of islands, either in an heuristic way where islands are detected and removed, or by improving the structural integrity of the design by introducing a linear elasticity field.
In Figure 10(b), it can be seen that the fluid temperature at the outlet is above 1200K, indicating that the coolant has taken up almost as much thermal energy as possible.It is also noted that the minimum temperature in h is lower than the inlet temperature T 0 .This numerical artefact seems to have little impact on the resulting design.
The temperature plot in Figure 12(a) shows that the highest temperatures are found on the leading edge, while the largest temperature gradients are found close to the inflow boundary, visible in Figure 12(b).The way of imposing the inlet temperature boundary condition has significant impact on the temperature gradients close to the inlet, since not only the inlet is restricted to T 0 , but also the entire top surface, in an unnatural way.This effect is extra influential due to the relatively small inletto-top area ratio.However, this is not a fundamental aspect of the model; it is believed that this effect could be attenuated with more accurate boundary conditions for the heat problem.Apart from the top region, the temperature distribution visible in Figure 12 is fairly uniform, which can be expected when minimizing the maximum temperature.
The final volume fraction is around 76%, with respect to h .A binary-measure is calculated as 143, which is interpreted as sufficient in this context.

Mixed versus penalty
The penalty approach, used to produce the optimized designs herein, is compared with the mixed Q1-Q0 approach in terms of the time it takes to solve the flow state problem once in the beginning of the optimization process, depending on the number of cores used to run the example.Included in the comparison are three different set-ups: the penalty approach solved with the same settings as described in Section 4 together with the REDISTRIBUTE preconditioner, the mixed approach solved with an iterative solver, and the mixed approach with the REDISTRIBUTE preconditioner solved with an iterative solver.For this comparison, an optimized version of PETSc (v. 3.18.1),linked with Intel Math Kernel Library (v.2018) and MUMPS (v.5.5.1)(Mumps Technologies 2022), was used on the Tetralith cluster (NSC 2023b).
The REDISTRIBUTE preconditioner works by eliminating the fixed DOF from the system matrix.By doing this, a smaller linear system is obtained, compared with imposing the Dirichlet boundary conditions by zeroing rows and columns using the PETSc function MatZeroRowsColumns.This reduction in problem size is believed to be able to reduce the computational time for both the mixed and penalty approaches.
An implementation of the mixed approach is done in PETSc based on the DMStag object, which represents a staggered grid, in this case having velocity DOF at the vertices and the pressure DOF associated with the 'element' (DMSTAG_ELEMENT).The matrix problem is solved using the BCGSL solver with the FIELDSPLIT preconditioner, with both parts using the ASM preconditioner.Two stopping tolerances-10 −10 and 10 −6 -are used for the relative residual L 2 -norm of the preconditioned problem.The smaller tolerance matches the relative residual norm from the direct solver used for the penalty approach, while the larger tolerance is provided as a reference for what possibly can be earned by stopping the iterative solver a bit earlier.The user-specified pressure stabilization parameter in ( 5) is δ = 10 −2 .
Figure 13 shows times for solving the state problem once, using the same geometry and mesh as in Section 4.2.The times for assembly of the system matrix for the two approaches are quite similar and negligible, and hence not accounted for.It should be added that the adjoint problem most often took a bit longer to solve than the state problem for the examples treated herein.It is clear that the penalty approach is competitive when fewer-here 32 cores, which correspond to one computational node on Tetralith (NSC 2023b)-computational resources are available.For visualization reasons, the penalty approach without using the REDISTRIBUTE preconditioner is not included in Figure 13.This approach requires 158 seconds on 32 cores and 70 seconds on 256 cores to solve the flow state problem.The mixed approach is indeed faster when using the REDISTRIBUTE preconditioner, although it is at the same time more unstable and prone to breakdowns, and has a worse convergence relative to the problem size compared with the mixed approach without the REDISTRIBUTE preconditioner.It is not obvious that the mixed set-up using the REDISTRIBUTE preconditioner benefits from also using the FIELDSPLIT as the inner preconditioner, but no better inner preconditioner has been found.
Included in Figure 13 is an illustration of the trade-off for when the mixed approach is competitive with the penalty approach.The solid black trade-off line is half of the time it takes to solve the state flow problem once with the penalty approach using the REDISTRIBUTE preconditioner, and it should be interpreted as the maximum time the mixed approach can take without being inferior to the penalty approach.
When comparing the computational times for the penalty and the mixed approach, there are a number of things to keep in mind.First, when using the penalty approach, solving the adjoint flow problem for a given design is essentially free in terms of computational time compared with solving the state problem.To be competitive on the presented flow problem, with one state and one adjoint flow problem, the iterative solver must thus be (roughly) twice as fast.
Secondly, the times for the mixed approach are most likely not the best that can be achieved, and can probably be improved by further experiments with combinations of the plethora of solvers and preconditioners, and settings for these, offered by PETSc.In addition, iterative linear solvers are wellsuited for GPU computations, which might offer further speed-up.In practice, the stopping tolerance for the iterative solver may be overly strict; lowering the tolerance can result in non-negligible savings on computational time-illustrated in Figure 13-though at the same time, less accurate solves might lead to less accurate derivatives and thus more MMA iterations.The possibility of using early stopping for iterative solvers in TO has been investigated recently (Amir, Stolpe, and Sigmund 2010;Amir, Aage, and Lazarov 2014;Limkilde, Evgrafov, and Gravesen 2018).
Finally, the performance of iterative solvers tends to degrade as the design becomes closer to binary-valued.Based on this, it can be expected that times will increase as the optimization process progresses.The experience from the presented numerical examples is that the solution time rises by 10%-30% because of this.However, this effect will probably be counteracted when reusing the flow solution from the previous MMA iteration as an initial guess for the iterative solver, since the change in design, and thus the change in the flow, is reduced between later MMA iterations.Based on experience from the presented numerical examples, the time earned from this can compensate for the time lost owing to the design becoming closer to binary-valued.

Result comparison
An investigation of the validity of the flow and temperature response in the TO model is conducted by comparing the TO model with two other models: a high-fidelity CFD simulation using RANS equations and a low-fidelity Stokes flow model, both using a mesh with segregated domains (bodyfitted meshes) and solved by conjugate heat transfer analysis.
The computational mesh for the segregated meshing approach consists of about 7.7 million solid and fluid tetrahedral elements and is created as follows: first, an STL file encapsulating the fluid volume, defined by the isosurface ρ = 0.5, is exported from ParaView (v. 5.8.1) to HyperMesh (v. 2017.3),where it is remeshed with triangular elements with an approximate size of 3.5 × 10 −4 m.The STL file defining , used in the voxelization process, is then imported and meshed with the same element size.Finally, 3D tetrahedral meshes are generated for the fluid and solid parts, and linked via a common solid-fluid interface.This is referred to as a segregated mesh, in contrast to the monolithic mesh used in the TO model.
The RANS simulations are performed utilizing the finite-volume based software ANSYS Fluent (v. 2022 R2).The SST turbulence model developed by Menter (1994) is used to perform a steady-state RANS simulation.This implies that the RANS equations along with two additional transport equations for the turbulence kinetic energy and the specific dissipation rate are solved and the model blends effectively the k-ω formulation for the near wall region with the k-ε formulation in the far field (ANSYS, Inc 2022).The velocity and pressure coupling in the momentum equations is done by means of the COUPLED scheme in ANSYS Fluent.The spatial discretization for gradients is least-squares cell-based, while pressure, momentum, energy and turbulent quantities use a second-order upwind scheme.
When performing the simulation, residuals level out to 10 −4 for all quantities, i.e. continuity, momentum, energy, k and ω.In addition, velocity and temperatures at local points are monitored to make sure that the solution is stable and converged, which is achieved after about 3000 iterations.
The boundary conditions and the material properties for the two segregated models are exactly those described in Section 4, except for the inlet condition and the viscosity for the RANS model, which use a mass flow boundary condition at the inlet, where 1 g/s of coolant enters, and a viscosity set to 3.25 × 10 −5 kg/ms, representing operating conditions.This means that the only difference between the Stokes TO model and the segregated Stokes model is the meshing approach.It is noted that the Brinkman term is not used for the segregated models.The purpose of the segregated Stokes model is to be able to study separately the effect of interpreting the TO design on a body-fitted mesh and the effect of changing to a more advanced flow model.The segregated Stokes simulations are performed using COMSOL (v.6.0).
It is noted that the process for generating body fitted meshes from TO designs is far from automatic.One reason is that the STL from ParaView can include errors, such as overlapping elements, duplicated elements or bad connectivity between elements.And even if the STL is flawless, such errors might appear after the remeshing.Another issue is that the remeshing of the STL surfaces introduces an arbitrariness when it comes to the interpretation of the fluid-solid interface, by the fact that the original isosurface ρ = 0.5 is overridden by the remeshing in HyperMesh.There are several options for doing this remesh, including shrink wrap and surface deviation in the automesh panel.
Figures 14 and 15 show the velocity and the temperature at the fluid symmetry plane for the three different models, while Figure 16 shows the temperature on the convective boundary.
Figure 14 shows a fairly similar flow behaviour between the TO model and the segregated Stokes model.The RANS model on the other hand has a more diffusive and longer sustain of the momentum compared with the TO model, which excludes the convective terms.The exchange of momentum associated with convective acceleration also indicates a somewhat different trajectory of the flow compared with the Stokes models.This effect can also be seen in Figure 15(a), where a horizontal flow path closest to the inlet is clearly visible in the dark colour.A corresponding flow path is not visible in the RANS model in Figure 15(c).The consequence is a hotter surface in the RANS model at that height, as seen in Figure 16.These possible consequences of this effect may need to be further investigated in relation to the use of the Stokes equations in TO.
It is obvious from Figure 14 that the monolithic meshing causes large resistance to flow near the solid-fluid interface, owing to the Brinkman term.This is probably the cause of the very much higher flow velocities (maximum 52.0 versus 9.92 m/s) and mass flow (7.6 versus 1.0 g/s) in the segregated Stokes model.To avoid this effect, the isosurface can be defined much closer to the fluid bulk, i.e. making the flow channels narrower.As indicated by Figure 15, the segregated Stokes model does not achieve mixing of cooling, i.e. on the symmetry plane the fluid temperature remains fairly similar to the inlet temperature.The RANS model, in contrast to the Stokes models, resembles a more diffusive mixing of the cooling when exposed to a hot solid surface, primarily associated with the role of convective acceleration and momentum exchange from the solid wall to the symmetry plane, see Figure 15(c).
In Figure 16, it can be seen that the TO model is cooler on the surface than the segregated Stokes model.This could be due to an overestimation of the cooling effect.This hypothesis is supported by the fact that the segregated Stokes model has a higher mass flow and higher flow speed closer to the fluid-solid interface, which most likely increases the heat transfer from the solid to the fluid.

Concluding remarks
A method for TO using complex design domains based on a voxelization technique with a novel boundary detection algorithm has been proposed.The use of voxelization circumvents complex and time-consuming generation of body-fitted meshes, and can give perfectly shaped elements, something that is appealing e.g. when using iterative solvers.This facilitates the possibilities for analysing complicated geometries with high resolution in a parallel computational environment.The method is demonstrated on an industrially relevant test case, with a geometry resembling a guide vane, and shows that the voxelization approach is a viable option when dealing with non-trivial design domains.
Furthermore, a demonstration has been made for a working HPC implementation of a large 3D example of a coupled flow-heat problem with a conjugate heat transfer approach, using PETSc (Smith 2011).The bottleneck is to solve the flow problem, and for that reason a comparison is included between the use of a penalty and a mixed FE approach that shows the penalty approach to be competitive if less resources are available and the problem size is not too large.
Finally, a comparison between the optimized designs from the TO model and two models using segregated meshes is provided, as an illustration on the methodology of performing a post-analysis validation of TO results.The key differences are believed to be due to the penalization of flow in intermediate design density regions, and the non-existing fluid mixing when using laminar flow models.The consequence is that the cooling effect is lower in the TO model.A possible remedy is to decrease the width of the cooling channels when creating the segregated mesh, which would lower the mass flow given a certain pressure drop.

Appendix. Sensitivities
The design problem is solved using gradient-based methods.Therefore, the sensitivities of the objective (7) and constraint (8) functions are derived using a discrete adjoint approach.

A.1 Objective and constraint function sensitivities
Consider the Lagrangian function where φ h is the objective function in (P) D , λ i ∈ R ni×1 for i = {F, T} are adjoint vectors, and r i are the residuals The Lagrangian φ will be equal to the original objective function φ as long as the residuals are zero, i.e. the state problems are solved exactly.Differentiating the Lagrangian by the regularized FE design variable ρe gives the sensitivity where the first term disappears since there is no explicit design dependency in the objective function ( 7), and the adjoint vectors are obtained from the adjoint problems where the adjoint vector λ g is the solution to 4 ) This adjoint problem shows it to be unnecessary to solve, since it resembles the flow state problem to such a degree that the solution vector in (A4) becomes a pure scaling of the solution vector u in (3).The details are provided in Section A.3.3.

A.2 Element stiffness matrix sensitivities
In (A1), the derivatives of the fluid and heat stiffness matrices with respect to ρe are required.The derivative of the fluid stiffness matrix with respect to ρe is where α( ρ) is specified in (1), and thus The derivative of the heat stiffness matrix with respect to ρe is

A.3.1 Heat adjoint
In (A2), the derivative of the objective function ( 7) with respect to the nodal temperatures t is required.It reads , forj = 1, . . ., n n .

A.3.2 Fluid adjoint
In (A3), the derivative of the global heat stiffness matrix with respect to the nodal velocities u is required.The righthand side of the fluid adjoint matrix problem is assembled at element level, and therefore the derivative of the element heat stiffness matrix with respect to the nodal velocity component u k e is given as ∂K e The derivative in the second term in (A6) is, for i = 1,2,3, given as The final expression for the stabilization parameter derivative with respect to the element nodal velocities reads The derivative in the second term in (A5) can be evaluated by use of (A7), which gives

A.3.3 Mass flow constraint adjoint
The right-hand side of (A4) is where A is the assembly operator, n e is the number of elements that have faces in in h , and n e is the local element outward normal.
The right-hand side of (A4) is a pure scaling of the load vector in (3), since both t e F and −n e are oriented in the same direction (into the domain, normal to the boundary).This means that the constraint adjoint vector λ g will be a pure scaling of the state solution vector u in (3).
The scaling factor for the specific problem treated herein becomes where the constant input traction t in is given in Table 3.

Figure 1 .
Figure 1.The SGT-800 gas turbine, with a detailed picture of the guide vanes (dashed) at the exit of the combustion chamber.Courtesy of Siemens Energy AB.

Figure 3 .
Figure 3. Voxelization of (a 2D version of) the geometry, later described in Section 4.1.

Figure 4 .
Figure 4. Flow chart for boundary node identification process.Here, e is the element number, n tot n is the number of nodes in tot h , n tot e is the number of elements in tot h , x v,fix is a vector indicating if an element belongs to v,fix h , and the output b is a vector indicating if a node is on the boundary or not.

Figure 6 .
Figure 6.Illustration of (left) boundary definitions and (right) the designable domain \ s,fix with inlet and outlet.The bottom (not shown here) is adiabatic (part of q ).

Figure 7 .
Figure 7. Illustration of (left) the design domain and (right) the fixed domain.

Figure 8
Figure 8. s h after each continuation step, coloured by the relative normal distance to the symmetry plane: (a) Step 1; (b) Step 2; (c) Step 3; (d) Step 4.

Figure 9 .
Figure 9. Iteration history.The spikes are due to the change of parameters at each continuation step.The objective function is normalized by its value in MMA iteration k = 1: (a) normalized objective value and maximum nodal temperature in h ; (b) constraint function value g h , and mass flow limit m.
Figure 10.s,opt h after the fourth continuation step.The temperature range is [min(T), max(T)]: (a) the relative normal distance to the symmetry plane; (b) the temperature of the solid domain and the flow streamlines.

Figure 11 .
Figure 11.Isosurfaces of the optimized design after the fourth continuation step: (a) f,opt h inside parts of h ; (b) s,opt h with freefloating islands reflected across the symmetry plane.

Figure 12 .
Figure 12.The temperature and its gradient on the convection boundary U : (a) temperature; (b) temperature gradient with log scale.

Figure 13 .
Figure 13.Timing for different approaches and solver settings for the flow state problem.

Figure 14 .
Figure 14.Relative flow velocities on the symmetry boundary for different models, ranging from zero (dark) to one (light).Maximal velocities are 9.92, 52.0 and 3.59 m/s, respectively: (a) Stokes TO model; (b) segregated Stokes model; (c) segregated RANS model.

Figure 15 .
Figure 15.Fluid temperatures on the symmetry boundary for different models, ranging from inlet temperature T 0 (dark) to ambient temperature T ∞ (light): (a) Stokes TO model; (b) segregated Stokes model; (c) segregated RANS model.

Figure 16 .
Figure 16.Solid temperatures on the convection boundary for different models, ranging from 1345K (dark) to ambient temperature T∞ (light): (a) Stokes TO model; (b) segregated Stokes model; (c) segregated RANS model.
3 )Note the transpose in both instances of K T in the adjoint problems.The derivatives of the stiffness matrices in (A1) and the right-hand sides in the adjoint problems are provided in Sections A.2 and A.3, respectively.The sensitivity of the mass flow constraint (8) becomes ∇ ρe g h (u) T ] i [N F ] mk [B T ] mj dV, which is non-symmetric, and T u e (u e ) T B T ] ij dV + e c 2 τ ∂[B T T u e (u e ) T B T ] .The element nodal velocity vector u e contains 24 components u k e .The derivative of the stabilization parameter τ in (

Table 2 .
Numerical and MMA parameters for the design problem.Note: The notation (•) ci indicates the ith continuation step.

Table 3 .
Numerical values for the flow problem.

Table 4 .
Numerical values for the heat problem.