The FLAMINGO project: cosmological hydrodynamical simulations for large-scale structure and galaxy cluster surveys

We introduce the Virgo Consortium's FLAMINGO suite of hydrodynamical simulations for cosmology and galaxy cluster physics. To ensure the simulations are sufficiently realistic for studies of large-scale structure, the subgrid prescriptions for stellar and AGN feedback are calibrated to the observed low-redshift galaxy stellar mass function and cluster gas fractions. The calibration is performed using machine learning, separately for three resolutions. This approach enables specification of the model by the observables to which they are calibrated. The calibration accounts for a number of potential observational biases and for random errors in the observed stellar masses. The two most demanding simulations have box sizes of 1.0 and 2.8 Gpc and baryonic particle masses of $1\times10^8$ and $1\times10^9 \text{M}_\odot$, respectively. For the latter resolution the suite includes 12 model variations in a 1 Gpc box. There are 8 variations at fixed cosmology, including shifts in the stellar mass function and/or the cluster gas fractions to which we calibrate, and two alternative implementations of AGN feedback (thermal or jets). The remaining 4 variations use the unmodified calibration data but different cosmologies, including different neutrino masses. The 2.8 Gpc simulation follows $3\times10^{11}$ particles, making it the largest ever hydrodynamical simulation run to $z=0$. Lightcone output is produced on-the-fly for up to 8 different observers. We investigate numerical convergence, show that the simulations reproduce the calibration data, and compare with a number of galaxy, cluster, and large-scale structure observations, finding very good agreement with the data for converged predictions. Finally, by comparing hydrodynamical and `dark-matter-only' simulations, we confirm that baryonic effects can suppress the halo mass function and the matter power spectrum by up to $\approx20$ per cent.


INTRODUCTION
The standard model of cosmology allows us to compute the evolution of the universe and the growth of structure in it starting shortly after inflation.Among its ingredients are the physics of the standard model of particle physics, whose constituents include baryonic matter, photons and neutrinos, and a distribution of initial density perturbations.However, to fit the cosmological data two additional, ad-hoc contributions to the energy content are required, namely non-baryonic dark matter and dark energy (for a review see e.g.Workman et al. 2022).
★ E-mail: schaye@strw.leidenuniv.nl In the simplest version of the model, referred to as LCDM (or ΛCDM), the universe is spatially flat; the dark matter is assumed to be non-interacting (apart from gravitational interactions) and cold (i.e. the free streaming scale is negligible for practical purposes); the dark energy is a cosmological constant (i.e. the dark energy has an equation of state  =  2 with  = −1); there are three neutrino flavours and the sum of the neutrino masses is equal to the minimum allowed by ground-based neutrino oscillation experiments (0.06 eV; e.g.Esteban et al. 2020); there are no primordial tensor metric fluctuations; and the primordial density perturbations are adiabatic, Gaussian and their power spectrum is described by a power law.This model has only 7 free parameters.One of these, the current temperature of the cosmic microwave background (CMB) radiation, is measured very accurately from its spectrum (Fixsen 2009).This leaves only 6 parameters that need to be measured more precisely, which include the amounts of cold dark matter (CDM) and baryons, the normalisation and slope of the primordial power spectrum of density perturbations, the optical depth to photon scattering due to reionization (which is determined by galaxy formation physics) and a final parameter which must depend on the normalisation of the expansion history and can therefore be thought of as the Hubble constant.
The standard model of cosmology can reproduce an impressively diverse set of observations that span a wide range of length and time-scales (for a review see e.g.Lahav & Liddle 2022).Examples are the abundances of the light elements; anisotropies in the CMB; standard rods like the baryon acoustic oscillations (BAO) in the distributions of galaxies and intergalactic hydrogen; geometric probes such as the Alcock-Paczynski effect; the distance-redshift relation of standard candles like supernovae of type Ia; the age-redshift relation of cosmic chronometers; and last but not least the growth of structure as measured e.g. through the redshift evolution of the abundance of galaxy clusters, galaxy clustering and associated redshift distortions, cosmic shear and CMB lensing, the Ly forest, and the Sunyaev-Zel'dovich effect (SZE).
Although the overall agreement of the model with observations is impressive, there is some tension between different data sets (see Abdalla et al. 2022 for a recent review of the problems and proposed solutions).The Hubble constant measured from the local distance ladder, particularly the value of  0 = 73 ± 1 km s −1 Mpc −1 measured by Riess et al. (2022) from supernovae Ia calibrated using Cepheid stars (which in turn are calibrated using Gaia parallaxes), is significantly greater than the  0 = 67.4± 0.5 km s −1 Mpc −1 inferred from the CMB by Planck Collaboration et al. ( 2020) assuming the LCDM model.Another area of tension concerns the clumpiness of the matter distribution.For example, CMB anisotropies imply  8 ≡  8 (Ω m ) 0.5 = 0.83 ± 0.01 (Planck Collaboration et al. 2020), whereas cosmic shear measurements and a variety of other lowredshift probes of large-scale structure (LSS) find  8 = 0.77 ± 0.02 (e.g.Heymans et al. 2021;Abbott et al. 2022).At present it is unclear whether these tensions require an extension of the base model or whether they are due to underestimated or unidentified systematic errors.
A diverse set of surveys is about to map the LSS down to smaller scales than was possible before, which will greatly reduce the statistical uncertainties on the cosmological parameters.However, their scientific potential can only be realized if the model predictions are at least as accurate and precise as the observations.While predictions for the CMB are thought to be sufficiently robust, the same is not true for the growth of structure and its observational manifestations on scales ≲ 10 Mpc.On these scales baryonic matter cannot be assumed to trace the CDM.On scales of 1 ≲  ≲ 10 Mpc the baryons are predicted to be distributed more smoothly than the CDM, mainly due to their redistribution by galactic winds driven by feedback from star formation and particularly active galactic nuclei (AGN), while on smaller scales the baryons are predicted to cluster more strongly than the CDM due to their ability to radiate away their binding energy, which allows them to condense into galaxies (e.g.Van Daalen et al. 2011).This prediction from hydrodynamical simulations is confirmed by halo models and enhanced dark matter only (DMO) simulations that use the observed hot gas and stellar content of galaxies and clusters as input (e.g.Schneider et al. 2019;Debackere et al. 2020;Giri & Schneider 2021).The emerging picture is that the low baryon fractions of groups and low-mass clusters are closely related to the baryonic suppression of the matter power spectrum on rela-tively large (1 ≲  ≲ 10 Mpc) scales (e.g.Semboloni et al. 2011Semboloni et al. , 2013;;Van Daalen et al. 2020;Aricò et al. 2021;Salcido et al. 2023).If these baryonic effects are not accounted for, then they will, for example, result in catastrophic systematic errors on upcoming cosmic shear surveys (e.g.Semboloni et al. 2011;Eifler et al. 2015;Huang et al. 2019;Lu & Haiman 2021;Martinelli et al. 2021).
Hydrodynamical simulations (for a recent review see Vogelsberger et al. 2020) offer a number of key advantages over DMO-based techniques.First, they can provide much more detail, e.g.galaxy colour gradients, galaxy shapes, and intrinsic alignments between galaxies, which may all bias cosmic shear results, and spatially-resolved X-ray and SZE (virtual) observations of clusters.Second, they selfconsistently model the gravitational back-reaction onto the dark matter due to the redistribution of baryons.Third, they self-consistently model the relations between different physical processes and galaxy properties, e.g. the fact that the dynamical friction experienced by a satellite of a given total mass will depend on its stellar mass, or that the star formation activity of a central galaxy may correlate with the distribution of gas around it.Fourth, they predict not only the properties of the galaxies, but also the 3-D distribution, kinematics, temperature, and chemical composition of the gas.This enables direct comparisons with more types of data, such as diffuse X-ray emission, the SZE, dispersion measures, as well as their cross correlations with galaxy clustering, cosmic shear, and CMB lensing.Even if they cannot yet replace DMO simulations or DMO-based semi-analytic or semi-empirical models in parameter inference, hydrodynamical simulations are needed to validate those methods' assumptions and to calibrate the corrections for baryonic effects that they apply.
The necessity for cosmological simulations to model baryonic effects blurs the line between the fields of LSS and galaxy formation and demands a new approach.This development constitutes a challenge for the modellers, but it also represents an opportunity, as probes of LSS can also be used to advance our understanding of the formation and evolution of galaxies and clusters of galaxies.
Predictions for observables that are directly sensitive to the distribution of baryons are crucial, because the ab initio predictive power of the simulations is limited.Without additional observational constraints, it is, for example, impossible to predict the effect of outflows driven by AGN feedback with the accuracy needed for upcoming experiments.
One approach is to brute force the problem by running a very large number of hydrodynamical simulations that vary all the relevant subgrid parameters and then look for predictions that depend on cosmology but are insensitive to the uncertain subgrid physics, or vice versa (e.g.Villaescusa-Navarro et al. 2021;Ni et al. 2023).However, computational expense then dictates the use of volumes that are too small for most of the commonly used probes of cosmology.Another approach is to calibrate the subgrid physics explicitly, which requires a choice of calibration target.For example, the EAGLE simulations of galaxy formation (Schaye et al. 2015;Crain et al. 2015) were calibrated to the local galaxy stellar mass function (SMF) as well as the relations between galaxy mass and size, and between galaxy and supermassive black hole (BH) mass, while the Illustris-TNG simulations (Pillepich et al. 2018) included additional constraints such as the cosmic star formation history, the intragroup medium, the mass-metallicity relation and galaxy quenching.Because the TNG model was calibrated for the resolution of the TNG100 simulation, the lower-resolution (205 Mpc/ℎ) 3 TNG300 (Springel et al. 2018) and (500 Mpc/ℎ) 3 MillenniumTNG (MTNG; Pakmor et al. 2022) simulations do not fit the calibration data as well, but are more useful for cosmology due to their larger volumes.
For observational cosmology the most relevant calibration targets are arguably the SMF, because it constrains the galaxy-halo connection, which is, for example, critical for galaxy clustering, and the gas and baryon fractions of groups and clusters, because those correlate with the degree to which feedback processes suppress the matter power spectrum on large scales.The (400 Mpc/ℎ) 3 BAHAMAS simulations (McCarthy et al. 2017(McCarthy et al. , 2018) ) were calibrated to match these two observables, both at  ≈ 0. Because the SMF and gas fractions are not precisely known, BAHAMAS included a strong and a weak AGN feedback model, which skirted the observational error bars on the cluster gas fractions.The ANTILLES suite (Salcido et al. 2023) uses a similar approach but includes a much larger number of models spanning a wider range of subgrid parameter values, though in a much smaller volume of (100 Mpc/ℎ) 3 .
Other suites of hydrodynamical simulations with volumes ≫ (10 2 Mpc) 3 that were run to  = 0, such as the cosmo-OWLS (Le Brun et al. 2014) and Magneticum (Dolag et al. 2016) projects, did not have an explicit calibration strategy.For cluster physics zooms of haloes selected from very large volume but low-resolution (DMO) simulations are often used, both as stand alone samples (e.g.Hahn et al. 2017;Cui et al. 2018Cui et al. , 2022;;Tremmel et al. 2019;Henden et al. 2020;Pellissier et al. 2023) and to complement large-volume hydrodynamical runs by extending their mass range, e.g. the MAC-SIS sample (Barnes et al. 2017a) for BAHAMAS and the Hydrangea (Bahé et al. 2017) and C-EAGLE (Barnes et al. 2017b) samples for EAGLE.
Here we present the FLAMINGO1 project, which improves on BAHAMAS and other large-volume hydrodynamical simulations in many respects.Below we list some of FLAMINGO's key features.
First, FLAMINGO uses three different resolutions that are all calibrated to the same data.The two flagship runs use volumes of (2.8 Gpc) 3 and (1 Gpc) 3 and baryonic particle masses of 1×10 9 M ⊙ (which we will refer to as intermediate/m9 resolution) and 1×10 8 M ⊙ (high/m8 resolution), respectively.While the former has the same resolution as BAHAMAS, its volume is more than two orders of magnitude larger.To highlight the dynamic range captured by this simulation, Fig. 1 zooms in from the full simulation box with the large-scale structure of the cosmic web to a single massive galaxy cluster and its internal structure.The color scale of the background image encodes the density of neutrinos, while the intensity encodes the CDM density, which can clearly be seen to be modulated on smaller scales than the neutrino density.The consecutive zooms in the three insets show, respectively, the gas temperature, CDM surface density and X-ray surface brightness.
Second, the simulations use very large numbers of particles, up to 3 × 10 11 , which is the largest number of resolution elements for any existing cosmological hydrodynamical simulation run to  = 0. See Fig. 2 for a comparison with simulations from the literature.
Third, the calibration of the subgrid physics is not done by trial and error, but systematically using machine learning (Gaussian process emulators), also accounting for observational errors and biasing (Kugel et al. 2023).
Fourth, massive neutrinos are modelled using particles with a new, self-consistent and efficient implementation, the '  'method (Elbers et al. 2021), that was designed to reduce shot noise.
Fifth, besides the fiducial simulations, FLAMINGO includes eight astrophysics variations, all in (1 Gpc) 3 volumes with intermediate resolution.These models are each calibrated using our emulators by shifting the calibration data according to the allowed error bars.Model variations are then no longer expressed only as variations of particular subgrid parameters that have no direct connection with observables, like the subgrid AGN heating temperature that was varied in the cosmo-OWLS and BAHAMAS projects.Instead, the variations can be specified in terms of the data that they are calibrated to.For example, we include runs where the observed stellar masses or cluster gas fractions have been shifted by different multiples of the expected systematic errors.Multiple subgrid parameters are then adjusted to accomplish the new fits.However, having models that span the uncertainties in the observables used for calibration may not be sufficient to quantify the uncertainty in the predictions for observations that were not considered in the calibration.A different model, in particular a different implementation of AGN feedback, may result in different predictions even when the model is calibrated to the same data.FLAMINGO therefore includes two simulations that use jet-like AGN feedback instead of the fiducial thermal AGN feedback, but that are calibrated to the same data.
Sixth, besides the fiducial cosmology, which assumes cosmological parameter values from the Dark Energy Survey year three (3x2pt plus external constraints) for a spatially flat universe with    2 = 0.06 eV; (Abbott et al. 2022), FLAMINGO includes four (spatially flat) cosmology variations that each use the fiducial calibration data and were run in (1 Gpc) 3 volumes with intermediate resolution.The variations are the Planck Collaboration et al. ( 2020) cosmology, one with    2 = 0.06 eV and two with    2 = 0.24 eV, and a model with a lower amplitude of the power spectrum, as preferred by many LSS surveys (Amon et al. 2023).
Seventh, we produce on-the-fly full-sky lightcone output for up to eight different observers, including HEALPix maps for gravitational lensing, X-ray emission, the thermal and kinematic SZE, and dispersion measures.
Eigth, we use a new hydrodynamics code (Swift; Schaller et al. 2023) with improved subgrid models.
Nineth, we use 3-fluid initial conditions with separate transfer functions for CDM, baryons and neutrinos, perturbing particle masses rather than positions to suppress discreteness noise (Hahn et al. 2020;Hahn et al. 2021;Elbers et al. 2022).
This paper serves to introduce the FLAMINGO project, document the simulation methods, describe the simulation suite and the calibration strategy, provide a comparison with some key observables, and to report some first results on the effects of baryonic physics on structure formation.
The paper is organized as follows.In Section 2 we discuss the simulation methods.This section is meant as a reference and can be skipped by readers not interested in the technical details.The calibration of the subgrid galaxy formation physics as well as the observational bias factors is detailed in Kugel et al. (2023) and summarized in Section 3. Section 4 summarizes the numerical, cosmological and subgrid parameters for the simulation runs and presents some visuals.We compare with the calibration data in Section 5 and with a selection of other observables in Section 6, including the cosmic star formation history ( §6.1), galaxy properties ( §6.2), cluster scaling relations ( §6.3), and the cross-correlation of the thermal SZE and CMB lensing signals ( §6.4).We investigate baryonic effects on the halo mass function and the matter power spectrum in Sections 7 and 8, respectively, and we conclude in Section 9. A summary of the on-the-fly generation of lightcone data as well as the associated data products (particle data and HEALPix maps) can be found in Appendix A.

The gravity and hydrodynamics solver Swift
The simulations were performed using Swift (Schaller et al. 2023), a fully open-source coupled cosmology, gravity, hydrodynamics, and  galaxy formation code2 .Swift uses task-based parallelism within compute nodes and interacts between compute nodes via MPI using non-blocking communications resulting in excellent scaling up to > 10 5 compute cores (Schaller et al. 2016(Schaller et al. , 2023)).The short-and long-range gravitational forces are computed using a 4 th -order fast multipole method (Greengard & Rokhlin 1987;Cheng et al. 1999;Dehnen 2014) and a particle-mesh method solved in Fourier space, respectively, following the force splitting approach of Bagla & Ray (2003).The accuracy of the gravity solver is mainly controlled by an adaptive acceptance criterion for the fast multipole method similar to the one proposed by Dehnen (2014).
The equations of hydrodynamics are solved using the smoothedparticle hydrodynamics (SPH) method (for a review, see Price 2012).In particular, the FLAMINGO simulations use the SPHENIX flavour of SPH (Borrow et al. 2022) that was specifically designed for galaxy formation simulations.The particle smoothing is done using a Wendland (1995) C2 kernel with the resolution parameter  = 1.23483 .The SPHENIX scheme uses a density-energy formulation of the equations of motion combined with artificial viscosity and conduction terms.Viscosity and conduction limiters are included in the solver to prevent spurious energy losses across feedback-generated shocks or radiative cooling events.Swift also uses the mechanism limiting the time step of inactive neighbour particles of Durier & Dalla Vecchia (2012) to properly evolve the fluid even in the most extreme shocks.
Time integration is performed using a standard leapfrog scheme with the individual time steps of the particles set by the minimum of the gravity time step (Δ = (0.025/) 1/2 ), where  is the gravitational softening length and  the acceleration, and the Courant-Friedrichs-Lewy (CFL) condition for hydrodynamics with parameter value 0.2.All particles use the same fixed, but time evolving, gravitational softening length (Table 2) and we impose a floor on the gas particle smoothing length at 0.01 of the gravitational softening length.

Treatment of neutrinos
The baseline cosmology for the FLAMINGO simulations includes a single massive neutrino species with a mass of 0.06 eV and two massless species, representing the minimum neutrino mass scenario under the normal ordering (Esteban et al. 2020;de Salas et al. 2021).The FLAMINGO suite also includes variations with larger neutrino masses (see §4 and Table 4).A key requirement for our implementation of massive neutrinos is therefore that small and large neutrino masses should both be treated accurately.To accomplish this task without exceeding the memory and time constraints imposed by the size and scope of FLAMINGO, we make use of the recently proposed   method (Elbers et al. 2021).
Massless neutrinos are included in the calculation of the Hubble expansion rate and in the initial conditions, but are otherwise treated as a smooth component.Massive neutrinos are included at both the background and perturbation levels using the   method.This method uses particles to capture the full non-linear evolution of the neutrino phase-space distribution, but statistically weights the particle contributions by comparing the known phase-space density, which is manifestly conserved by the symplectic leapfrog integration scheme of Swift, with the phase-space density expected at the background level.This minimizes the level of shot noise in the neutrino density field and thus significantly reduces the required number of simulation particles.The suppression of shot noise is particularly strong at early times, eliminating the spurious back-reaction on the CDM and baryon components that otherwise results.
As the centre of expansion for the multipoles in Swift is the centre of mass of each tree node and not the geometric centre of the node (see Schaller et al. 2023 for details), we cannot allow for negatively weighted particles to dominate as this would push the centre of expansion outside of the cell.This would in turn lead to large errors in the calculation of the gravity forces.We therefore treat neutrinos as ordinary massive particles in short-range interactions and only apply the   weighting scheme in the mesh-based long-range gravity calculation.This choice nevertheless ensures that the back-reaction on large scales is eliminated and that non-linear neutrino interactions are not neglected.The weights are also used in the power spectrum calculation and preserved for post processing, since they reveal additional information about the phase-space distribution and make it possible to probe neutrino clustering on smaller scales (Elbers et al. 2021;Adamek et al. 2022).
Massive neutrinos are relativistic at early times.We account for this by using relativistic velocities for the neutrino particles.Relativistic corrections to the acceleration are negligible in the time frame of the simulations ( ≤ 31) and are not included to preserve symplecticity of the leapfrog integration scheme (Elbers 2022).Further modifications are needed to account for the presence of neutrinos in the initial conditions, as discussed in Section 2.4.

Subgrid prescriptions
Like any hydrodynamical simulation, FLAMINGO relies on subgrid prescriptions to model unresolved physical processes.These prescriptions build on those developed for the OWLS (Schaye et al. 2010) project, which were also used for cosmo-OWLS (Le Brun et al. 2014) andBAHAMAS (McCarthy et al. 2017), and some of which were developed further for the EAGLE project (Schaye et al. 2015).For FLAMINGO the models were ported from the code gadget (Springel 2005) used for these previous projects to the Swift code used for FLAMINGO.

Radiative cooling and heating
Radiative cooling and heating rates are implemented element-byelement and are taken from Ploeckinger & Schaye (2020, their fiducial model UVB_dust1_CR1_G1_shield1). They used the spectral synthesis and radiative transfer code cloudy (Ferland et al. 2017, version 17.01) to tabulate the rates as a function of density, temperature, chemical composition, and redshift.The gas is assumed to be in ionisation equilibrium, and exposed to the CMB, the evolving metagalactic UV/X-ray background radiation given by a modified version of the Faucher-Giguère (2020) model, and, at high densities, also to a diffuse interstellar radiation field.The gas and dust column densities used to account for self-shielding and to compute the intensity of the interstellar radiation field scale with the density and temperature like the local Jeans column density.We do not allow cooling below a temperature of 100 K.
At  > 3 ( > 7.2) Ploeckinger & Schaye (2020) attenuate the Faucher-Giguère (2020) spectrum above the H i (He ii) ionisation energies using H i (He ii) column densities tuned to match the effective photo-ionisation and photo-heating rates that Faucher-Giguère (2020) finds to be needed to match observations of the optical depth seen by the CMB and observations of the intergalactic medium.In the models used to compute the cooling rates, hydrogen and helium reionize at  = 7.8 and  = 3.5, respectively.To account for the extra heat due to spectral hardening and non-equilibrium effects, we inject an extra 2 eV per hydrogen atom at H reionization.We also inject 2 eV per hydrogen atom at  = 3.5, spread over a Gaussian redshift interval with () = 0.5, to account for the later reionization of He ii.Fig. 3 shows that this yields a thermal evolution of the IGM that agrees with observations.Compared with the rates of Wiersma et al. (2009a) used by BA-HAMAS, the Ploeckinger & Schaye (2020) rates use a newer version of cloudy, a more recent model for the background radiation, a lower redshift of reionization, and they account for self-shielding, the presence of dust, cosmic rays and an interstellar radiation field.
A new addition is the treatment of rapidly cooling gas.In BA-HAMAS and our other gadget simulations we computed the entropy that a particle is expected to reach at the end of its time step based on its current radiative cooling rate and then adjusted the time derivative of the entropy such that the particle would gradually drift to this final entropy over the course of its time step.However, this is inappropriate Redshift z Figure 3.The temperature of gas at the cosmic mean density as a function of redshift.The peaks at  ≈ 7 and 3 are due to H and He reionization, respectively.The thermal evolution is in good agreement with observations of the Ly forest.Data points are based on measurements of absorption line widths as a function of strength (Schaye et al. 2000;Rorai et al. 2018;Hiss et al. 2018;Telikova et al. 2019;Gaikwad et al. 2020), on the small-scale cut-off in the flux power spectrum (Boera et al. 2019;Walther et al. 2019), or on both types of methods (Gaikwad et al. 2021).
if the cooling time is short compared with the time step.Therefore, if the internal energy 4 is expected to change by more than one third, then we immediately set it to the value that we estimate it will reach at the end of the time step.

Star formation and the pressure of the interstellar medium
Since FLAMINGO lacks the resolution to model the multiphase ISM, we follow Schaye & Dalla Vecchia (2008) and impose a densitydependent lower limit on the pressure corresponding to the temperature onto gas with proper hydrogen number density  H > 10 −4 cm −3 and overdensity greater than 10.This relation, which is often referred to as an 'equation of state', corresponds to a constant Jeans mass of ∼ 10 7 M ⊙ , which is, however, unresolved by the FLAMINGO simulations.The temperatures of gas near  EoS should be interpreted as representative of the pressure of a multiphase medium and should therefore not be used to compute observables.
During the simulation, gas particles are converted into collisionless star particles.Gas particles with proper density 5  H >  * H = 4 While the gadget simulations used entropy as an independent variable, in FLAMINGO's hydrodynamics solver sphenix the internal energy is used instead.
5 Due to a bug, in all intermediate-resolution simulations except for the Jet 10 −1 cm −3 , overdensity > 100 and pressure at most 0.3 dex above the temperature  EoS are eligible for star formation.The proper density threshold is motivated by the models of Schaye (2004) for the transition from the warm, atomic interstellar gas phase to the cold, molecular phase, though we neglect the predicted (weak) dependence on metallicity.The overdensity threshold ensures intergalactic gas at very high redshift does not form stars.The temperature ceiling prevents star formation in high-temperature gas.
The low-resolution runs do not sample the density distribution to sufficiently high values to form enough stars if the above star formation criteria are used, even in the absence of feedback.These simulations therefore use lower thresholds of  * H = 10 −3 cm −3 and overdensity > 10 combined with a temperature ceiling of  < 10 5 K.
Gas particles are stochastically converted into star particles using the pressure-dependent star formation rate (SFR) of Schaye & Dalla Vecchia (2008), where  g is the gas particle mass,  = 5/3 is the ratio of specific heats,  is the gravitational constant, and  is the pressure.This relation is derived from the observed Kennicutt-Schmidt surface density star formation law, under the assumption of vertical, hydrostatic equilibrium with a gas fraction of unity,  g = 1.We use the values  = 1.515 × 10 −4 M ⊙ yr −1 kpc −2 and  = 1.4 measured by Kennicutt (1998), where the former has been converted to the value for a Chabrier (2003) stellar initial mass function (IMF).Note that equation (2) implies a minimum possible nonzero SFR (attained for a single gas particle with density  H =  * H ) of  * ≈ 0.4 M ⊙ yr −1 ( g /10 9 M ⊙ ).

Stellar mass loss
Star particles are treated as simple stellar populations with a Chabrier (2003) IMF for zero age main sequence masses between 0.1 and 100 M ⊙ .Time-dependent stellar mass loss by stellar winds from massive stars, asymptotic giant branch stars, core-collapse supernovae, and type Ia supernovae are implemented as in Wiersma et al. (2009b) with the modifications described in §4.4 of Schaye et al. (2015).Briefly, we track the abundances of the elements H, He, C, N, O, Ne, Mg, Si, and Fe.These abundances are used to compute the corresponding element-by-element radiative cooling rates, along with those of Ca and S, whose abundances we assume to track that of Si with mass ratios of 0.094 and 0.605, respectively.Mass loss occurs when a star leaves the main sequence.At each time step 6 , we therefore compute the (pre-main sequence) mass range of stars that leave the main sequence using the mass-and metallicity-dependent stellar lifetimes of Portinari et al. (1998).The mass (and momentum) models, particles with metallicity equal to precisely zero used a threshold of  * H = 10 cm −3 .Tests show that this only has significant effects on galaxies with fewer than 10 star particles, where it artificially suppresses the stellarto-halo mass ratio.Hence, the bug compromised our ability to fit the SMF to even lower masses in this, in any case, poorly resolved regime.The error was fixed before running the low-and high-resolution simulations and the simulations using jet-like AGN feedback. 6To limit the computational expense, mass loss is executed after 10 time steps for stellar particles with ages > 100 Myr.
released by stars in this mass range is transferred from the star particle to its gaseous neighbours using SPH-weighting (but the weights are computed using the gas particles' initial mass; see Schaye et al. 2015) according to the stellar yields tabulated by 7 Marigo (2001), Portinari et al. (1998) and Thielemann et al. (2003).In contrast to BAHAMAS and EAGLE, we force the time steps of star particles with ages ≤ 45 Myr to be no longer than 1 Myr in order to ensure that the evolution of massive stars is sufficiently well sampled.
Supernova type Ia (SNIa) rates per unit formed stellar mass are taken from Schaye et al. (2015), who found that the following function results in an evolving cosmic SNIa rate density that agrees with observations, where  is the age of the stellar population,  = 2 × 10 −3 M −1 ⊙ and  = 2 Gyr.Motivated by the idea that SNIa require a compact stellar remnant, and in contrast to Schaye et al. (2015), we set the rate to zero for ages below 40 Myr, which corresponds to the lifetime of an 8 M ⊙ star.However, this does not have a significant impact on the cosmic SNIa rate.
While stellar mass loss reduces the masses of star particles, it increases the masses of gas particles, thus causing more metal-rich particles to be more massive.To keep the masses of baryonic particles similar, we split particles whose mass exceeds 4 g into two equal mass particles, where  g is the mean, initial gas particle mass.Note that this was not done in BAHAMAS.

Stellar energy feedback
Massive stars and supernovae inject energy into their surroundings.We implement stellar energy feedback kinetically by stochastically kicking SPH neighbours of young star particles, as in Dalla Vecchia & Schaye (2008).We assume that stars with masses between 8 and 100 M ⊙ each inject 10 51 erg at the end of their main sequence lifetime and that a fraction  SN of this energy couples to the ISM.For our IMF the fraction  SN = 1 then corresponds to an energy budget of 1.18 × 10 49 erg M −1 ⊙ .Besides the energy fraction  SN , the feedback prescription uses a second parameter, the target wind velocity Δ SN .The values of the stellar feedback parameters  SN and Δ SN are assumed to be constant and are calibrated as described in §3.
In contrast to BAHAMAS, we do not impose the fixed time delay of 30 Myr between star formation and feedback from Dalla Vecchia & Schaye (2008).Instead, we follow Richings & Schaye (2016) and evaluate the probabilities for feedback at each time step according to the energy associated with massive stars leaving the main sequence during that time step.
Differently from Dalla Vecchia & Schaye (2008) and BAHAMAS, we do not simply increase the velocity of SPH particles by the wind velocity, because this violates energy conservation if those particles have non-negligible velocities relative to that star particle.We instead use the method8 of Chaikin et al. (2022a), which conserves linear momentum, angular momentum and energy by kicking particles in pairs in random but opposite directions.In this scheme the actual post-kick velocity of the wind particles relative to the star will differ somewhat from the target kick velocity if the particles are moving with respect to the star or if the two target particles have different masses.
While Dalla Vecchia & Schaye (2008) and BAHAMAS used massweighting 9 to select the neighbours to be kicked, we use the method of Chaikin et al. (2022b) to ensure the energy distribution is statistically isotropic.As demonstrated by Chaikin et al. (2022b), this difference is important because mass-weighting is biased to higher densities, which results in stronger cooling losses.
Note that contrary to what is done in e.g. the IllustrisTNG (Pillepich et al. 2018) and SIMBA (Davé et al. 2019) cosmological simulations, we do not decouple wind particles from the hydrodynamics.As demonstrated in Dalla Vecchia & Schaye (2008), such decoupling has major consequences and results in a qualitatively different feedback prescription.Decoupled winds also underestimate the energy required because of the neglect of thermal losses and work done during the opening up of the channels through the ISM that the decoupled models implicitly assume to exist.
As in BAHAMAS, the energy from SNIa is injected thermally at every time step, assuming each supernova provides 10 51 erg.As discussed in e.g.Dalla Vecchia & Schaye (2012), such a 'thermal dump' results in only small temperature increases of the gas, which means it is subject to excessively large radiative losses and has very little effect.

Black holes
The origin of supermassive BHs is still unclear (e.g.Volonteri et al. 2021), but proposed mechanisms for their seeding such as direct collapse from metal-free gas in low-mass haloes, population III stellar remnants, and mergers inside star clusters are all unresolved by our simulations.Following Di Matteo et al. (2008) and Booth & Schaye (2009), we therefore place seed BHs in haloes that are sufficiently massive and do not yet contain a BH.Starting from time  = 1/(1 + ) = 0.05, after every Δ log 10  = 1.00751 we run a friends-offriends (FoF) halo finder with linking length 0.2 times the mean inter-particle distance.The minimum halo mass for seeding is set to 2.757 × 10 11 M ⊙ ( g /1.07 × 10 9 M ⊙ ).If the FoF halo were to consist purely of dark matter, then this would correspond to 49 particles.If the halo contains baryonic particles, then the number of particles is larger because baryonic particles are less massive than CDM particles.We use a BH seed mass of 10 5 M ⊙ .However, for the low-resolution simulations the seed mass had to be increased by a factor of 100 because those runs lack the resolution to follow the rapid growth phase of the BHs (e.g.Bower et al. 2017).The BH seed is placed at the position of the densest gas particle in the halo (if the halo does not contain gas then it is not seeded), which is converted into a collisionless BH particle of the same mass as the progenitor gas particle.
Because the BH mass can be much smaller than the particle mass, BH particles carry a subgrid BH mass that is initially set equal to the seed mass.As in Springel et al. (2005), BH processes such as gas accretion and BH mergers are computed using the subgrid BH mass, while gravitational forces are computed using the particle mass.
For comparison, BAHAMAS used the Booth & Schaye (2009) values of a minimum halo mass for seeding equal to 100 dark matter particles and a BH seed mass of 10 −3  g .McCarthy et al. (2017) found that the minimum halo mass for seeding has a non-negligible 9 All SPH neighbours are given a probability proportional to their mass.effect on the SMF, though its effect can be largely compensated for using other subgrid parameters.
Massive BHs are subject to significant dynamical friction, which causes them to sink to the center of the galaxy and limits their excursions thereafter.Cosmological simulations lack the resolution to capture this dynamical friction for two reasons.First, to experience dynamical friction, the BH must be much more massive than the surrounding particles.For FLAMINGO this condition is violated across the entire range of BH masses.Second, scattering of particles with small impact parameters is important for dynamical friction, but is not captured because the gravitational forces are softened and because of poor sampling.To compensate for the inability to simulate dynamical friction directly, we follow Springel et al. (2005) and Booth & Schaye (2009) and reposition the BHs by hand.Bahé et al. (2022) have recently demonstrated that BH repositioning is critical for simulations like BAHAMAS, and hence for FLAMINGO, as well as for much higher-resolution simulations like EAGLE.Without repositioning, BH growth through gas accretion and mergers is dramatically reduced and AGN feedback is consequently ineffective.
At each time step, we move the BH to the location of the SPH neighbour within three gravitational softening lengths that has the lowest gravitational potential, provided it is lower than at the location of the BH.Note that the velocity of the BH is not explicitly altered when it is repositioned.While BAHAMAS allowed repositioning onto star and dark matter particles, we only consider gas particles in order to reduce the computational cost of neighbour finding (the gaseous neighbours need to be found in any case in order to compute the BH accretion rate).In contrast to BAHAMAS, we do not impose an upper limit on the velocity of gas particles on to which the BHs can be repositioned.As discussed in Bahé et al. (2022), such a restriction is unnecessary and at the resolution of BAHAMAS it strongly reduces the efficiency of AGN feedback, though it is possible the effect would be reduced if repositioning onto stellar or dark matter particles were allowed and the effect can be compensated with changes to the subgrid parameters.
When computing the gravitational potential for the purpose of repositioning, we should subtract the contribution of the BH itself in order to prevent the BH from becoming trapped by its own local potential well.This subtraction was done neither in BAHAMAS nor in any other simulation we are aware of.We also neglected to do so for nearly all our intermediate-resolution simulations, but did implement it for the high-and low-resolution simulations, as well as the intermediate-resolution Jet models, after recognizing the problem.To test its effect, we repeated a (400 Mpc) 3 intermediate-resolution simulation that did include the subtraction of the BH's contribution to the potential.For massive galaxies ( * ≳ 10 12 M ⊙ ) this roughly doubled the fraction of quenched galaxies while for active galaxies the specific SFRs decreased by a factor of a few.However, for intracluster gas and lower-mass galaxies we did not find any significant differences.It should thus be kept in mind that for high-mass galaxies the quenched fractions in the intermediate-resolution FLAMINGO simulations with our fiducial implementation of AGN feedback may be artificially low.The failure to subtract the BH's own potential for the purpose of repositioning may also have caused excessive star formation in brightest cluster galaxies in earlier simulations.
The prescription for the merging of nearby BHs is taken from Bahé et al. (2022).BHs are merged if they are separated by less than 3 gravitational softening lengths,  < 3, and if their relative velocity satisfies Δ < √︁ 2 BH /, where  BH is the mass of the most massive of the two BHs and  is their separation.This differs from BAHAMAS, which used the Booth & Schaye (2009) criterion  < ℎ and Δ < √︁  BH /ℎ, where ℎ is the SPH smoothing length of the most massive BH.The new criterion, which we think is more appropriate because merging is a gravitational process, leads to somewhat more rapid merging, particularly in low-density environments.When two BHs merge, we conserve momentum, subgrid BH mass, and particle mass and the particle carrying the lower-mass BH is removed from the simulation.As in Springel et al. (2005), BHs are assumed to accrete at a modified Bondi-Hoyle rate limited by the Eddington rate.The modified Bondi-Hoyle rate is given by where  and  s are the gas density and the speed of sound of the ambient medium,  is the velocity of the BH with respect to its environment, and the coefficient  is a boost factor described below (unmodified Bondi-Hoyle accretion corresponds to  = 1).The Eddington rate is where  p is the proton mass,  T is the Thomson cross section for electron scattering,  is the speed of light, and  r = 0.1 is the assumed radiative efficiency, i.e. the fraction of the accreted rest mass energy that is converted into light.
Because the simulations generally do not resolve the Bondi radius,  B =  BH / 2 s ≈ 4 kpc ( BH /10 8 M ⊙ )( s /10 km s −1 ) −2 , and because the simulations do not model the multiphase ISM, it is justified and necessary to boost the Bondi-Hoyle accretion rate.Following Booth & Schaye (2009), we multiply the Bondi-Hoyle rate by the factor where  * H = 0.1 cm −3 is the threshold for star formation and  BH is a parameter that is calibrated (see §3 and Table 1).As discussed by Booth & Schaye (2009), at low densities the accretion rate should not be boosted for two reasons.First, for sufficiently high BH masses and low temperatures we can resolve Bondi-Hoyle accretion and for densities  H <  * H we do not expect the gas to contain a cold phase (Schaye 2004).Second, if the large boost factors that are needed at high densities are applied everywhere, then for typical BH masses the Bondi-Hoyle rate only falls below the Eddington rate for extremely low densities,  H ≪  * H , which forces AGN feedback to reduce the density of the ISM and the inner CGM to unrealistically low values in order to regulate BH growth through feedback.
While the subgrid BH mass is smaller than its host particle's mass, the growth of the subgrid BH does not require the transfer of mass from the surrounding gas particles.If the BH subgrid mass exceeds the BH particle mass, then we transfer the difference in mass from the BH's SPH neighbours to the BH particle using the method of Bahé et al. (2022).The mass (and momentum) taken from each neighbour is proportional to that neighbour's contribution to the SPH density at the location of the BH.Neighbours with mass less than half the initial baryonic particle mass are excluded (a limit that is not reached in practice).This method of letting the BH 'nibble' on its neighbours differs from the method of Booth & Schaye (2009) used in BAHAMAS, where entire gas particles were stochastically swallowed by the BH.The new method leads to a smaller relative difference between the BH's particle and subgrid masses, particularly when the latter is similar to the mass of gas particles.
Gas accretion increases the BH subgrid mass as  BH = (1 −  r )  accr (Booth & Schaye 2009) and decreases the BH particle mass by  r  accr (Bahé et al. 2022).We note that the correction to the particle mass, which accounts for the loss of rest mass to radiation, was neglected in BAHAMAS.

AGN feedback
In our fiducial model AGN feedback is implemented thermally as in Booth & Schaye (2009), but in some of our intermediate-resolution runs we instead use anisotropic, kinetic feedback in order to enable tests of the sensitivity of the results to the implementation of AGN feedback.For simplicity we employ only a single mode of AGN feedback in each run.

Fiducial model: thermal injection
Although our fiducial model does not include jets, the outflows emerging from our thermal AGN feedback prescription are anisotropic because they naturally take the path of least resistance, and in clusters AGN feedback episodes result in buoyantly rising bubbles of high-entropy gas (e.g.Nobels et al. 2022).
A fraction  f of the energy available in the time step,  r  accr  2 Δ, is assumed to couple to the gas surrounding the BH.In order to prevent numerical overcooling, i.e. the overestimate of the radiative cooling rate due to the underestimate of the post-shock gas temperature that results from having to heat at least the mass of one gas particle (e.g.Dalla Vecchia & Schaye 2012), we do not inject feedback energy at every time step.Instead, we store the energy in the subgrid reservoir of the BH particle until it suffices to increase the temperature of  heat gas particles by Δ AGN .To ensure the feedback is well sampled, we adopt the BH time step limiter of Bahé et al. (2022).Unless it would result in a time step smaller than 10 5 yr, we limit the time step of each BH particle such that its energy reservoir will not increase by more than the energy needed to heat one particle if the BH continues to accrete at its current rate.
As demonstrated by Booth & Schaye (2009, 2010), because selfregulation determines the amount of energy that AGN provide for a given galaxy-scale gas accretion rate, the coupling efficiency,  f , effectively determines the BH mass of galaxies that are regulated by AGN feedback, but other galaxy properties are insensitive to its value.If  f is increased, then the BH has to accrete less gas in order to inject the same amount of energy.As in Booth & Schaye (2009) and BAHAMAS, we set  f = 0.15, which means a fraction  f  r = 0.015 of the accreted rest-mass energy is used for feedback.We will show in Section 5.4 that this value results in good agreement with the observed  = 0 relation between BH and stellar mass.
As in (Cosmo-)OWLS, we set  heat = 1 whereas BAHAMAS used  heat = 20.Hence, at a fixed mass resolution,  g , and for a fixed Δ AGN , AGN events in FLAMINGO are an order of magnitude less energetic than in BAHAMAS.However, McCarthy et al. (2017) found that reducing  heat to unity only slightly increases the SMF and cluster gas fractions.In contrast to our previous work, we do not use mass-weighting to select the SPH neighbour that receives the energy.Instead, we inject the feedback energy into the gas particle nearest to the BH.This choice minimizes the occurrence of feedback at large distances.As demonstrated by Chaikin et al. (2022b) for the case of stellar feedback, for small values of  heat selecting the nearest particles yields a nearly isotropic distribution of energy and gives results that are nearly indistinguishable from their isotropic scheme.
Finally, Δ AGN is a parameter of the model that we calibrate (see §3 and Table 1).

Model variation: kinetic, jet-like injection
For accretion rates that are not extremely small compared to the Eddington rate, our fiducial, thermal implementation can be thought to represent the effect of radiatively-driven winds from geometrically thin accretion discs (Shakura & Sunyaev 1973).However, when the accretion rate drops below some critical fraction of the Eddington rate (  BH /  Edd ∼ 10 −2 ), accretion discs are expected to be radiativelyinefficient, advection-dominated, geometrically thick (Narayan & Yi 1995) and to be efficient at launching collimated jets (e.g.Yuan & Narayan 2014) whose power (or efficiency) depends on the dimensionless BH spin  =  BH /( 2 BH ) ∈ [−1, 1], where  BH is the BH angular momentum (Tchekhovskoy et al. 2010;Narayan et al. 2022), and that are directed along the BH spin axis (i.e. the direction of its angular momentum vector).
Our 'Jet' simulations employ a simplified version of the BH spin and AGN jet implementation of Huško et al. (2022).However, for simplicity, and to maximize the difference with respect to our fiducial model, we use the jet mode (with its accompanying accretion disc subgrid physics) for all accretion rates, and we employ a constant jet feedback efficiency of 0.015 for consistency with the fiducial, thermal model.This efficiency results in BH masses that are consistent with observations (see Fig. 12).Given this choice, the BH spins obtained using the aforementioned model are used only to determine the direction that the jets are launched in.
The model accounts for the following processes when evolving BH spin: 1) gas accretion; 2) jet spindown (negligible in this case due to the small jet efficiencies we have assumed); 3) BH-BH mergers (as in Rezzolla et al. 2008); and 4) Lense-Thirring torques that mediate the angular momentum transfer between the disc and the BH.If the accretion rate is high, then these torques can cause the BH spin to be redirected towards the angular momentum of the outer regions of the accretion disc on Myr time-scales (King et al. 2005).At low accretion rates (matching the assumptions used here), the redirection is instead much slower.For the BH spin axis to change appreciably, the BH therefore needs to accrete a large fraction of its current mass (see appendix B of Huško et al. 2022), or the spin evolution needs to be dominated by major BH-BH mergers, which are relatively rare.Given these considerations, redirection is expected to occur on Gyr rather than Myr time-scales, at least in galaxy clusters.
We assume that the direction of the angular momentum of the outer accretion disc (whose size is up to 10 5 gravitational radii, ∼ pc scales) is the same as that of the gas in the BH smoothing kernel, which we acknowledge to be a strong assumption given the low resolution of the simulations.While this may imply that the direction of the BH spin vector (and thus the jets) is not entirely realistic, Huško et al. (2022) showed that jet redirection is unimportant for long-term feedback effects, as long as it occurs relatively rarely.To be more precise, provided the jets do not redirect more frequently than the typical duration of a jet episode (≲ 10 2 Myr), their effects are insensitive to their direction and thus the time-scale of redirection (which we expect to be ∼ 1 Gyr here).The most important point is that the jets do not redirect frequently (e.g. on Myr time-scales), since that would, at the resolutions employed in this project, correspond more to an isotropic kinetic wind (as in e.g.MTNG) than jets.
The jets are launched by kicking gas particles from within the BH's SPH kernel.We use a constant target jet velocity,  jet .Every time the BH's feedback energy reservoir exceeds 2 × (1/2) g  2 jet , two particles are kicked.We choose the two particles closest to the BH spin axis (in terms of angular distance), and their velocities are increased along unit vectors chosen randomly from within two cones with 7.5°opening angles around the BH spin axis (one on each side of the BH spin axis, for each particle).Note that as was the case for stellar feedback (see §2.3.4),energy conservation implies that the actual magnitude of the velocity increase can be different from the target velocity, depending on the initial particle velocity.
AGN jets can have velocities approaching the transrelativistic regime on the scales resolved by our simulations.However, due to the low mass resolution, jets using such high velocities would be extremely poorly sampled.Thus,  jet is treated as a subgrid parameter of the model whose value we calibrate (see §3 and Table 1) and whose role is analogous to that of Δ AGN for the case of thermal AGN feedback.

Initial conditions
Initial conditions (ICs) for purely gravitational -body simulations of dark matter are commonly set up with higher-order Lagrangian perturbation theory (LPT), which is known to be significantly more accurate than the first-order Zel'dovich approximation (Scoccimarro 1998;Sirko 2005;Crocce et al. 2006).However, doing so for simulations with multiple fluid components with distinct transfer functions, such as hydrodynamical simulations and simulations with neutrinos, is non-trivial.For FLAMINGO, we made use of several recent theoretical developments that enable multi-fluid third-order Lagrangian perturbation theory (3LPT) ICs that accurately reproduce the relative growth of the individual fluid components.These developments were incorporated in the monofonIC code (Hahn et al. 2020;Michaux et al. 2021).For FLAMINGO, a modified version of monofonIC was used that implements the effects of massive neutrinos10 .
We use the prescriptions for 3-fluid ICs with CDM, baryons, and massive neutrinos outlined in Elbers et al. (2022), which builds on the 2-fluid formalism of Rampf et al. (2021) and Hahn et al. (2021).CDM and baryon particles are set up in a two-stage process.First, the combined mass-weighted CDM + baryon fluid is initialized with single-fluid 3LPT, accounting for the presence of neutrinos.This single fluid is then split into separate components with distinct transfer functions by perturbing the masses and velocities in accordance with the first-order compensated mode.Hahn et al. (2021) showed that discreteness errors can be suppressed by perturbing particle masses rather than displacements, thereby eliminating spurious growth of the compensated mode (see also Bird et al. 2020;Liu et al. 2023).
The underlying Gaussian random fields were chosen from subregions of panphasia to facilitate future zoom-in resimulations (Jenkins 2013), see appendix B for details.To limit cosmic variance without compromising the ability to do zooms, we used partially fixed ICs (Angulo & Pontzen 2016), setting the amplitudes of modes with ( ) 2 < 1025 to the mean variance, where  is the wavenumber and  is the side length of the simulation box.A paired simulation with inverted phases was run for the  = 1 Gpc DMO fiducial cosmology simulation to enable further limiting of the cosmic variance (Angulo & Pontzen 2016).The starting redshift,  = 31, was chosen to be as late as possible to limit discreteness errors and reduce computational cost, but before shell-crossing, such that LPT remains valid, and before BH seeding and star formation are initiated.The linear power spectra and transfer functions were computed with class (Lesgourgues 2011; Lesgourgues & Tram 2011).Given that FLAMINGO includes simulations with side lengths of  = 2.8 Gpc and greater, relativistic effects become a factor on the largest scales.The ICs were therefore set up in -body gauge (Fidler et al. 2015(Fidler et al. , 2017)).In the absence of radiation and neutrinos, this is enough to ensure that the relativistic fluid equations coincide with the usual Newtonian equations solved by -body codes at first order.
Since our simulations include massive neutrinos, we also need to account for their presence (Zennaro et al. 2017;Aviles & Banerjee 2020).At first order, massive neutrinos change the transfer functions and introduce a scale-dependence in the growth factors (e.g.Lesgourgues & Pastor 2006).This is taken into account in a generalized back-scaling procedure (Zennaro et al. 2017), the purpose of which is to correct for the remaining differences between the relativistic and Newtonian fluid systems.This is accomplished by starting with the desired result: the  = 0 transfer functions computed by class, and evolving them back to  = 31 with a Newtonian fluid approximation implemented in the zwindstroom code (Elbers et al. 2022).In practice, this amounts to a small boost in the initial power spectrum on large scales.The simulations correctly model the expansion history, including the effects of massive neutrinos, an amount of radiation corresponding to a CMB temperature of  = 2.7255 K, and an effective number of relativistic species  eff = 3.046 at high redshift.Accordingly, the same is done in the back-scaling calculation.Neutrinos also change the growth rate of matter perturbations relative to the geometric expansion, which feeds back into the higher-order LPT solutions.While the full third-order theory requires successive expensive convolutions (Aviles & Banerjee 2020), the effects can be captured to high accuracy by scale-independent correction factors obtained from an all-order recursive solution in the small-scale limit (Elbers et al. 2022), which were included.Finally, the neutrino particles themselves also require ICs.It is understood that neutrino perturbations are suppressed relative to dark matter perturbations, such that neutrinos can still be treated linearly at  = 31.To take into account linear perturbations to the neutrino phase-space distribution function, we integrated neutrinos from  = 10 9 , when all modes of interest were outside the horizon, down to  = 31 using the fastdf code (Elbers 2022).This represents a substantial improvement over the Zel'dovich approximation, which neglects higher moments of the neutrino distribution and underestimates the power spectrum on large scales (Elbers 2022).

Structure finding
The identification of haloes and substructures in the outputs of the simulations was performed using the VELOCIraptor subhalo finder (Elahi et al. 2019).We summarize the procedure here.In a first phase, haloes are identified in configuration space using a 3D FoF algorithm with a linking length  = 0.2 of the mean interparticle separation (Davis et al. 1985).The FoF search includes all particle types except neutrinos.Within each halo, we then search for substructures that are dynamically distinct from the mean background halo.This is achieved by performing an iterative 6D FoF search, in phase space and again including all particle types except neutrinos, on each halo individually with a velocity-space linking length set to the velocity dispersion of the original 3D FoF object and a real-space linking length set to 0.1 times the one used in the initial 3D FoF search, i.e. 0.1×0.2= 0.02.The most prominent object thus found (the one that is the most distinct from the background halo) is labelled as the central and the remaining objects as satellites.Once these substructures have been identified, we clean them up via an unbinding procedure which removes particles that are not gravitationally bound to the object.The most bound particle in each cleaned object is then used to define its centre.
Finally, we note that all the galaxy and cluster measurements presented here were computed using the Spherical Overdensity and Aperture Processor (SOAP), a tool that we developed for the FLAMINGO project.SOAP takes the (sub)halo centers and particle membership determined by VELOCIraptor as input, and computes a large number of (sub)halo properties for a range of apertures, which can be 3D or projected, include or exclude other substructures and unbound particles, and whose sizes can be specified as physical radii or mean internal overdensities.

CALIBRATION OF SUBGRID PARAMETERS AND OBSERVATIONAL BIASES
Subgrid prescriptions for unresolved physical processes necessarily involve choices and free parameters.Some parameter values are chosen based on theoretical considerations.An example is the star formation threshold density, which is motivated by the radiative transfer models of Schaye (2004) for the atomic to molecular phase transition.
Others are fixed based primarily on numerical considerations, e.g. the equation of state imposed on the ISM discussed in Section 2.3.2.Some parameters can be taken directly from observations.This is for example the case for the parameters  and  appearing in the pressure-based star formation law (eq.2), which are, respectively, the normalisation and the slope of the Kennicutt-Schmidt surface density law (eq.3).Other parameters are fit to specific observations that directly constrain the corresponding subgrid process.This is for example the case for the normalisation, , and time delay, , appearing in the SNIa delay function (eq.4), which are fit to the observed cosmic SNIa rate density assuming the observed cosmic star formation history.A less intuitive example is the efficiency of AGN feedback,  f , which, in galaxies regulated by AGN feedback, determines the masses of BHs but has little effect on everything else and is chosen to reproduce the observed relation between stellar mass and BH mass for massive galaxies (Booth & Schaye 2009, 2010).
However, there are also subgrid parameters that affect multiple observables of primary interest and that are not directly constrained by specific observations.In those cases we have to choose observables to calibrate to and we require a method to set the values of said subgrid parameters.We have chosen to calibrate to two observables that are particularly important for the goals of FLAMINGO: the presentday galaxy stellar mass function (SMF) and the gas mass fraction in clusters of galaxies.The SMF is important because it constrains the relation between stellar mass and halo mass, where the former is observed and the latter determines the clustering properties.The gas fraction in groups and clusters is important because it largely determines the baryonic suppression of the matter power spectrum (e.g.Semboloni et al. 2011Semboloni et al. , 2013;;McCarthy et al. 2018;Schneider et al. 2019;Debackere et al. 2020;Van Daalen et al. 2020;Aricò et al. 2021;Delgado et al. 2023;Salcido et al. 2023) and of the cluster mass function (e.g.Velliscig et al. 2014;Cui et al. 2014;Debackere et al. 2021).We wish to calibrate our galaxy formation model to these observations by varying as few parameters as possible.
From previous simulation work (e.g.McCarthy et al. 2017) we know that the SMF and gas fractions are most sensitive to stellar and AGN feedback, respectively, though these processes are not independent (Booth & Schaye 2013).After some experimentation, we found it necessary and sufficient to vary two parameters related to stellar feedback, the energy budget (  SN ) and the kick velocity (Δ SN ), and two parameters related to BHs, the AGN heating temperature (Δ AGN ) and the logarithmic slope of the density dependence of the BH accretion rate boost factor ( BH ).For the model variations using AGN jet feedback the parameter Δ AGN is replaced by the jet velocity,  jet .For the low-resolution runs we do not use stellar feedback, because the mass range in which stellar feedback dominates remains unresolved, thus leaving only the two BH parameters.
Below we summarize our calibration method and results.For more details, motivation, discussion and additional results, including the covariance between the different parameters, we refer to the companion paper by Kugel et al. (2023).

Emulation
In previous projects such as EAGLE and BAHAMAS we calibrated the subgrid parameters through trial and error, mostly be systematically varying one parameter at a time.For FLAMINGO we instead use a more systematic and quantitative Bayesian approach that has already been applied to the semi-analytic model GALFORM (Bower et al. 2010;Rodrigues et al. 2017).After settling on the set of parameters to vary and the ranges over which to vary them (i.e.our priors), which we based on physical arguments and a small number of test runs, we employ machine learning to fit the subgrid parameters to the calibration data.We use Gaussian process emulators (e.g.Rasmussen & Williams 2006) trained on 32-node Latin hypercubes of simulations.The dimensionality of the hypercube equals the number of subgrid parameters and the side lengths are our priors.The 32 nodes are distributed quasi-randomly throughout the hypercube such that the minimum distance between the nodes is maximized.A separate hydrodynamical simulation is run for each node.We then build a different emulator for each observable based on all 32 simulations.The SMF emulator takes as input the stellar mass,  * , and the subgrid parameter vector, , and it predicts the SMF,  ( * ).The inputs for the gas fraction emulator are the total cluster mass,  500c (i.e. the mass inside the radius  500c within which the mean density is 500 times the critical density), and the subgrid parameters .It outputs the gas fraction as a function of mass,  gas,500c ( 500c ).
As discussed in Schaye et al. (2015), it is generally necessary to recalibrate subgrid parameters for unresolved processes in the ISM when the resolution is changed.A higher-resolution simulation resolves smaller scales and higher gas densities and will therefore, for example, yield different radiative losses and different BH accretion rates (and hence different AGN feedback), which will, in turn, change the structure of the ISM even on scales resolved by the lowerresolution run.To compensate, it is therefore generally necessary to adjust the parameter values when the resolution is modified.This is particularly true if, as is the case here, we demand a very good fit to the calibration data, and if the statistical errors on both the observations and the predictions are small due to the availability of large surveys and large simulation volumes, respectively.A comparison of recalibrated simulations of different resolutions is referred to as a 'weak convergence test'.We therefore calibrate each of the three FLAMINGO resolutions separately using their own Latin hypercubes and emulators.For high, intermediate, and low resolution (respectively m8, m9, and m10 in Table 2) we employ hypercubes consisting of (100 Mpc) 3 , (200 Mpc) 3 and (400 Mpc) 3 simulations, respectively.
We fit the emulator predictions to the data using Markov chain Monte Carlo sampling of the parameter space, accounting for both errors in the data and the emulator uncertainty.We compute the log likelihood separately for the SMF, the X-ray gas fractions and the weak lensing gas fractions (see §3.2).These are then combined, giving equal weight to the two types of gas fraction data and equal weight to the combined gas fraction result and the SMF.We use the maximum likelihood values of the subgrid parameters as our fiducial values.

Calibration data
For the SMF we calibrate to the recent results for  = 0 from Driver et al. (2022) for the GAMA survey.For the gas fractions of groups and clusters we use X-ray observations at  ≈ 0.1 compiled from the literature by Kugel et al. (2023), which measure the total mass inside  500c ,  500c , under the assumption of hydrostatic equilibrium, and weak gravitational lensing data at  ≈ 0.3 from the HSC-XXL survey of Akino et al. (2022).
We do not attempt to measure these observables from virtual observations using observational methods, because we lack the resolution to create realistic mock data for galaxies and low-mass clusters.Virtual observations could probably be used to measure gas fractions of clusters sampled with many particles, but more work is needed to see if the simulations are sufficiently realistic for mock observational analyses to be preferable to using observationally inferred gas and total masses.
If we assume that the bulk of the cluster gas is detectable in Xrays, then there is no ambiguity about the definition of cluster gas fractions, which observations express as the gas mass fraction within  500c , and which can thus be measured straightforwardly from the simulation output.However, the situation is more murky for stellar masses.Observed stellar masses are typically based on extrapolated Sérsic fits to surface brightness or inferred mass profiles.Even for stellar mass profiles, we cannot mimic this procedure at our resolution down to the low masses (corresponding to ∼ 10 stellar particles per object) for which we aim to reproduce the SMF.We therefore choose to define the stellar mass of a galaxy as the stellar mass that is gravitationally bound to the subhalo and contained within a 3D aperture of radius 50 kpc, which is well resolved and which de Graaff et al. (2022) found to yield results close to the masses inferred from virtual observations of the (much higher-resolution) EAGLE simulation.An observational stellar mass bias factor that is discussed below accounts for any systematic offset due to differences in mass definitions.To account for random measurement errors present in the observations, which lead to an Eddington bias (i.e. if the SMF is steep, then the number of objects that scatter up into a given observed mass bin strongly exceeds the number that scatter down into that bin, thus flattening the observed slope), we add lognormal scatter of (log 10  * ) = min (0.070 + 0.071, 0.3) dex (Behroozi et al. 2019) to the simulation stellar masses before training the emulator.
We only fit to the data over a limited range of masses.The lower mass limit for the SMF is determined by the resolution limit of the simulation.For high, intermediate, and low resolution the emulator predictions are fit to the data with  * > 10 8.67 , 10 9.92 , and 10 11.17 M ⊙ , respectively.The upper mass limit is always 10 11.5 M ⊙ , because at higher masses there are significant systematic differences between different data sets and the simulation predictions become highly sensitive to the size of the aperture.For the cluster gas fractions the lower mass limit is always  500c = 10 13.5 M ⊙ , while the upper mass limit is determined by the volume of the simulations used for the Latin hypercube.For high, intermediate and low resolution the upper mass limits are, respectively, 10 13.73 , 10 14.36 , and 10 14.53 M ⊙ .

Observational bias factors
We allow for three types of potential observational biases, which for simplicity we assume to be constants.First, a stellar mass bias factor accounts for possible systematic errors in the observationally inferred stellar masses, e.g. because of uncertainty in spectral energy distribution modeling.It shifts the SMF horizontally, i.e. along the mass axis.Second, a cosmic variance bias factor accounts for systematic uncertainty in the galaxy number densities due to the possibility that the finite-sized observed volume is unrepresentative.It shifts the SMF vertically, i.e. along the number density axis.
Third, a hydrostatic mass bias factor accounts for the possibility that measurements of total cluster masses inferred from X-ray observations are systematically offset from the true masses because they assume the gas is in hydrostatic equilibrium and neglect non-thermal pressure.Indeed, comparisons between X-ray and weak lensing observations indicate that the former significantly underestimate the total masses (e.g.Hoekstra et al. 2015;Eckert et al. 2016), assuming that the weak lensing masses are relatively unbiased.The hydrostatic mass bias shifts the observed cluster gas fraction -mass relation horizontally, i.e. along the mass axis.We neglect the effect on the gas fraction within  500c .This correction is small, because  500c increases if the mass increases, and the gas mass increases if  500c increases.Hence, hydrostatic mass bias will only change the gas fraction insofar as the cumulative gas fraction changes between the biased and corrected values of  500c .Even for a mass bias as large as 25 per cent,  500c ∝  1/3 500c changes by only 10 per cent, which is expected to lead to only small differences in the gas fraction (see e.g.fig.6 of Velliscig et al. 2014).Note that correcting for this small effect would be difficult, because observational studies do not report measurements at the corrected values of  500c .Our priors for the bias factors are based on results taken from the literature: Behroozi et al. (2019) for the stellar mass bias, Driver et al. (2022) for cosmic variance, and Hoekstra et al. (2015); Eckert et al. (2016) for hydrostatic mass bias.
When calibrating the intermediate-resolution simulations, we fit simultaneously for the subgrid parameters and bias factors.Because observational biases should be independent of the resolution of the simulations, we do not re-fit the bias factors for the low-and highresolution models.We choose to use intermediate resolution to fit the biases because only this resolution probes a large mass range (≳ 1 dex) for both types of observables.Similarly, we do not re-fit the biases when we change the model, i.e. the cosmology or the subgrid feedback parameters.
The best-fitting (i.e.maximum likelihood) stellar mass bias factor correspond to an increase of the observed stellar masses by 0.026 dex and the best-fitting cosmic variance bias corresponds to a change in the observed galaxy number densities by a factor 0.995 (i.e. a decrease of 0.5 per cent).The 1 posterior confidence intervals for these biases are 0.06 ± 0.11 dex and 0.98 ± 0.06, respectively, and are thus consistent with the data being unbiased.These can be compared with our adopted Gaussian priors of N (, ) = N (0, 0.14) and N (1, 0.06), respectively, where  is the mean and  the standard deviation.The best-fitting hydrostatic mass bias corresponds to dividing the observed X-ray total masses by a factor 0.743.The evidence for such a bias is strong, with a posterior of 0.74 ± 0.09, but fully in line with (and largely determined by) our prior of N (0.74, 0.10) based on the literature.In all the plots shown in this work the observational data has been shifted by the best-fitting bias factors.

Subgrid parameter values
The fiducial subgrid parameter values for all resolutions are listed in Table 1.These are the maximum likelihood values from table 2 of Kugel et al. (2023), which also lists the priors and the posteriors.For intermediate resolution, which is BAHAMAS resolution, we use  SN = 0.238, Δ SN = 562 km s −1 , Δ AGN = 10 7.95 K and  BH = 0.373.These differ from the BAHAMAS values11 of 0.16, 300 km s −1 , 10 7.8 K and 2. However, except for  BH = 2, all the BAHAMAS values fall within our 1 posteriors.We need a significantly smaller value of  BH , which implies significantly smaller boosts of the Bondi-Hoyle accretion rates.This is despite the fact that we use an order of magnitude smaller BH seed mass than in BA-HAMAS.The difference in  BH is mostly due to our improvement in the BH repositioning scheme, in particular the absence of a velocity ceiling for particles to be eligible as repositioning targets, discussed in Section 2.3.5 and Bahé et al. (2022).
Compared with intermediate resolution, at high resolution we require about twice as much energy from stellar feedback but about half as large a kick velocity.The increase in energy is probably needed to compensate for the increase in radiative losses due to the higher densities that are resolved at higher resolution.The reduction in the wind velocity likely reflects the fact that at higher resolution we calibrate down to lower galaxy masses and thus lower circular velocities.At high resolution the AGN heating temperature is only 0.12 dex higher than for intermediate resolution, but the slope of the density dependence of the BH accretion rate boost factor is significantly decreased to  BH = 0.038, implying almost no boost of the Bondi-Hoyle rate.This reduction is likely due the higher gas densities that can be resolved.The best-fitting values for low resolution (Δ AGN = 10 8.29 K,  BH = 0.373) are not directly comparable to those of the other resolutions because of the absence of stellar feedback, the higher threshold for star formation and the larger BH seed mass (see §2.3).
The cosmologies we consider are sufficiently close that it is unnecessary to recalibrate the subgrid model when changing cosmology (e.g.McCarthy et al. 2018).We do, however, change the subgrid model for a series of intermediate-resolution (1 Gpc) 3 runs that vary the stellar and/or AGN feedback.The 'M*' and 'fgas' models were calibrated analogously to the fiducial models, but after shifting all the observed galaxy stellar masses (for M*) or cluster gas fractions (for fgas).For models M*− the observed SMF was shifted to 0.14 dex lower stellar masses.For models fgas+2, fgas−2, fgas−4, and fgas−8 the observed cluster gas fraction data points (one point per mass bin) were all shifted by, respectively, +2, −2, −4, −8 times the error estimated from bootstrapping in the case of the X-ray data (table 5 of Kugel et al. 2023) or the error on the fit for the weak lensing data from Akino et al. (2022) (as given in §3.2.2 of Kugel et al. 2023).While even the 2 gas fraction variations are formally ruled out at more than 2 confidence if the data points are independent, systematic errors are likely correlated between different mass bins.Moreover, we wish to use the feedback variations to obtain conservative estimates of the potential effects of baryonic physics.
The values of the subgrid parameters for the calibrated feedback variations are listed in Table 1.These are the maximum likelihood values from table 8 of Kugel et al. (2023), which also lists the posteriors.The changes relative to the fiducial model are modest.The main effect of the (0.14 dex) reduction in the observed stellar masses is an increase in  SN (by 0.13 dex for the fiducial gas fractions and by 0.14 dex for models fgas−4) though Δ SN , Δ AGN and  BH all increase as well.The differences in the gas fractions are driven mostly by Δ AGN , which changes by 0.13-0.24dex for each 2 shift in the gas fractions, with higher Δ AGN corresponding to lower  gas .
Models Jet and Jet_fgas−4 use kinetic, jet-like AGN feedback instead of our fiducial thermal AGN feedback implementation.Compared with our fiducial model, model Jet requires slightly weaker stellar feedback and a similar value of  BH .If we convert the fiducial AGN temperature increase of Δ AGN = 10 7.95 K into a velocity (using 2 1 2  p  2 jet = 3 2  B , where  p is the mean particle mass) we obtain a value that is just 0.21 dex higher than the adopted jet velocity of 836 km s −1 .Compared with model Jet, model Jet_fgas−4 requires a higher jet velocity and a smaller value of  BH .

THE SIMULATIONS
The FLAMINGO suite presented here consists of the 16 hydrodynamical simulations listed in Table 2 and the 12 gravity-only simulations listed in Table 3.Most of the runs use a (1 Gpc) 3 cubic volume, denoted by 'L1' in the simulation identifier, but one run has a volume of (2.8 Gpc) 3 ('L2p8').The hydrodynamical simulations span three resolution levels ('m10', 'm9' and 'm8', where the number indicates the rounded logarithm base 10 of the baryonic particle mass12 ), with the mass (spatial) resolution between consecutive resolutions changing by a factor of 8 (2).Most of our runs are of intermediate resolution ('m9'), which corresponds to an (initial) mean baryonic particle mass of ≈ 1 × 10 9 M ⊙ , a mean cold dark matter particle mass of ≈ 6 × 10 9 M ⊙ , and a maximum proper gravitational softening length of 5.7 kpc.At  > 2.91 the softening length is held constant in comoving units at 22.3 kpc.All runs use equal numbers of baryonic and dark matter particles, while the number of neutrino particles is a factor 1.8 3 smaller.Tables 2 and 3 list the parameter values that determine the numerical resolution for all the runs.
The values of the cosmological parameters for our fiducial model are the maximum posterior likelihood values from the Dark Energy Survey year three (DES Y3; Abbott et al. 2022) '3×2pt + All Ext.' ΛCDM cosmology ('D3A' in Table 4).These values assume a spatially flat universe and are based on the combination of constraints from three DES Y3 two-point correlation functions: cosmic shear, galaxy clustering, and galaxy-galaxy lensing, with constraints from external data from baryon acoustic oscillations (BAO), redshift-space distortions, SnIa, and the Planck observations of the CMB (including CMB lensing), Big Bang nucleosynthesis, and local measurements of the Hubble constant (see Abbott et al. 2022 for details).Our fiducial cosmology D3A uses the minimum neutrino mass allowed by neutrino oscillation experiments of   = 0.06 eV (Esteban et al. 2020;de Salas et al. 2021), which is consistent with the 95 per cent confidence upper limit of 0.13 eV from DES Y3.In this model the neutrino contribution is provided by one massive and two massless species.
FLAMINGO includes 4 intermediate-resolution hydrodynamical simulations with the fiducial calibration of the subgrid physics in (1 Gpc) 3 volumes that vary the cosmological parameters.Three of the alternative cosmologies we consider are variations on Planck Collaboration et al. ( 2020): their best-fitting ΛCDM model with the minimum allowed neutrino mass,   = 0.06 eV ('Planck'); a model with a high neutrino mass,   = 0.24 eV, (allowed at 95 per cent confidence by Planck; Planck Collaboration et al. 2020) in which the other cosmological parameters take their corresponding best-fitting values from the Planck MCMC chains ('PlanckNu0p24Var'); and a model with the same high neutrino mass,   = 0.24 eV, that keeps all other parameters fixed to the values of model Planck, except for Ω CDM which was reduced in order to keep Ω m constant ('PlanckNu0p24Fix').Note that for the latter model we fix the primordial power spectrum amplitude,   , rather than  8 .All models with   = 0.24 eV use three massive neutrino species of 0.08 eV.Finally, we include the 'lensing cosmology' from Amon et al. (2023) ('LS8').This model has a lower amplitude of the power spectrum,  8 = 0.766, compared with 0.815 and 0.833 for D3A and Planck, respectively.Amon et al. (2023) show that the lensing cosmology is consistent with observations of galaxy clustering from BOSS DR12 (Reid et al. 2016) and galaxy-galaxy lensing from D3A (Abbott et al. 2022), HSC Y1 (Aihara et al. 2018) and KiDS-1000 (Kuijken et al. 2019) over a wide range of scales, 0.15 − 60 ℎ −1 Mpc, if allowances are made for theoretical uncertainties associated with baryonic feedback and assembly bias.In contrast, they find that the Planck cosmology does not fit the same data on small scales.We note that Heymans et al. (2021) showed that the LS8 model is also consistent with KiDS-1000 cosmic shear measurements.
All the FLAMINGO cosmologies are spatially flat,  Ω  = 1, where the sum is over all components , which includes dark energy, cold dark matter, baryons, massive neutrinos, massless neutrinos, and radiation.All runs assume initial baryonic mass fractions of hydrogen and helium of 0.752 and 1 − 0.752 = 0.248, respectively 13 .The values of the cosmological parameters that vary between runs can be found Table 4.
All runs use the subgrid model described in section 2.3.Up to four subgrid parameters, of which two relate to stellar feedback, one to BH growth and one to AGN feedback, are calibrated to observations of the present-day SMF and low- cluster gas fractions as described in section 3 and in more detail in Kugel et al. (2023).In summary, eight intermediate-resolution, (1 Gpc) 3 volumes vary the subgrid feedback.'Jet' in the simulation name indicates that the AGN feedback is kinetic, jet-like rather than thermal.'M*−' indicates that the observed stellar masses were decreased by the expected systematic error of 0.14 dex before calibration, which mainly results in somewhat stronger stellar feedback.Finally, 'fgas±' indicates that for each cluster mass bin the observed cluster gas fraction was shifted by ± times the error (see §3.4).Because one of the main motivations for the model variations is predicting the observational signatures of scenarios that result in larger differences between the LSS in hydrodynamical and DMO simulations, we include more models that vary the gas fractions than the SMF and more models 13 The transfer function used for the initial conditions assumes fractions of 0.754579 and 0.245421, respectively.
MNRAS 000, 1-43 (2022) Table 2. Hydrodynamical simulations.The first four lines list the simulations that use the fiducial galaxy formation model and assume the fiducial cosmology, but use different volumes and resolutions.The remaining lines list the model variations, which all use a 1 Gpc box and intermediate resolution.The columns list the simulation identifier (where m8, m9 and m10 indicate log 10 of the mean baryonic particle mass and correspond to high, intermediate, and low resolution, respectively; absence of this part implies m9 resolution); the number of standard deviations by which the observed stellar masses are shifted before calibration, Δ * ; the number of standard deviations by which the observed cluster gas fractions are shifted before calibration, Δ  gas ; the AGN feedback implementation (thermal or jets); the comoving box side length, ; the number of baryonic particles,  b (which equals the number of CDM particles,  CDM ); the number of neutrino particles,   ; the initial mean baryonic particle mass,  g ; the mean CDM particle mass,  CDM ; the Plummer-equivalent comoving gravitational softening length,  com ; the maximum proper gravitational softening length,  prop ; and the assumed cosmology which is specified in Table 4.For each hydrodynamical simulation there is a corresponding gravity-only simulation (postfix '_DMO' in the simulation name), which uses the same total mass-weighted CDM+baryon perturbations but eliminates the baryon-CDM isocurvature and decaying modes, while leaving the neutrino part untouched.The effective total matter (baryon+CDM) fluid is then discretised using the same number of particles as used for CDM alone (and and also for baryons alone) in the hydrodynamical simulation.This implies that the particle mass is increased by a factor of (Ω CDM + Ω b )/Ω CDM relative to the mean mass of CDM particles in the hydrodynamical simulation.
For four DMO simulations we did not run the hydrodynamical counterpart.The intermediate-resolution (1 Gpc) 3 simulation PlanckNu0p12Var_DMO has a neutrino mass of   = 0.12 eV (allowed at 95 per cent confidence by Planck plus BAO; Planck Collaboration et al. 2020) and the other cosmological parameters take their corresponding best-fitting values from the Planck MCMC chains.Simulation L5p6_m10_DMO uses 5040 3 particles in a (5.6 Gpc) 3 box, which corresponds to low resolution ( CDM = 5.38×10 10 M ⊙ ).Simulation L11p2_m11_DMO uses the same number of particles, but in a volume of (11.2 Gpc) 3 ( CDM = 4.30 × 10 11 M ⊙ ).Model L1_m9_ip_DMO is identical to run L1_m9_DMO except that the phases were inverted in the initial conditions.Note that all the intermediate-resolution hydrodynamical runs that vary the subgrid physics correspond to the same DMO simulation L1_m9_DMO.
Our two most demanding 14 simulations are L2p8_m9, which uses 2 × 5040 3 + 2800 3 ≈ 3 × 10 11 (i.e.0.3 trillion) particles, and L1_m8, which has 2 × 3600 3 + 2000 3 ≈ 1 × 10 11 particles.As far as we know, the former uses more particles than any previous cosmological, hydrodynamical simulation that includes radiative cooling and that was run to  = 0. Its number of particles exceeds that of the similar-resolution BAHAMAS simulations by more than two orders of magnitude.In Fig. 2 we compared the resolution and box size of FLAMINGO with simulations from the literature.
Before showing some quantitative results, we will first provide a visual impression of the simulations and some of the data products.More visualisations, including videos and interactive sliders, can be found on the FLAMINGO website 15 .We already illustrated the large dynamic range in Fig. 1, which zooms in on a region centered on the most massive halo in the L2p8_m9 simulation.
Fig. 4 compares the CDM and neutrino distributions in a 20 Mpc thick slice centered on the most massive halo in the L2p8_m9 simulation.While neutrinos trace the CDM on very large scales, they are distributed much more smoothly on scales ≲ 10 2 Mpc.To visualize the difference in the distributions of gas and CDM we have to zoom in.Fig. 5 compares the gas and CDM distributions in a 50x50x20 Mpc slice through simulation L1_m8, while the inset zooms in further onto a ∼ 10 14 M ⊙ halo.Clearly, on scales ≲ 1 Mpc the gas distribution is much smoother than that of the CDM.Together these two figures illustrate the need for the explicit inclusion of particles representing gas, neutrinos and CDM, and the large dynamic range that is required to simultaneously cover the large-and small-scale differences in the spatial distributions of these species.In addition, there are stellar particles, which trace the CDM better than is the case 14 L2p8_m9 and L1_m8 took 31M and 17M core hours, respectively.These run times include the time spent creating lightcone outputs.These simulations used, respectively, 240 and 120 compute nodes with dual 64-core AMD EPYC 7H12 2.6GHz processors on the DiRAC COSMA8 system in Durham, UK. 15 https://flamingo.strw.leidenuniv.nl/for gas and neutrinos, and BH particles, which are not shown here.
The FLAMINGO website has an interactive slider versions of these figures as well as of other figures.
The three different simulation resolutions are compared in Fig. 6, which shows the same 63x63x20 Mpc region in, from left to right, the L1_m8, L1_m9, and L1_m10 simulations.On large scales the images look identical, but the zooms shown in the insets demonstrate clearly that there is structure down to smaller scales if the simulation resolution is higher.Fig. 7 illustrates the lightcone output for the case of the thermal SZE (see appendix A for a detailed description of the implementation of the lightcone output).The panels show full-sky HEALPix maps of the dimensionless Compton  parameter, where  B is Boltzmann's constant,  e is the electron mass,  e is the free electron number density, d is the proper distance along the path traveled by the CMB photon, and the redshift limits are indicated below each panel.See appendix A2.3 for our numerical implementation of the above integral.The bottom right panel of Fig. 7 shows the contributions of all shells from redshift 0 to 5, while the panels above it show the contributions of smaller redshift intervals, as indicated.The lightcone maps output during the simulation have a redshift width of Δ = 0.05, but note that thinner shells can be created in post-processing using the particle lightcone output.The HEALPix maps have 12 × 2 28 ≈ 3 × 10 9 pixels, corresponding to a maximum size of about 13.46 arcsec.LSS is more clearly visible in the lowest-redshift shells, but this is largely due to the larger angular size of the structures.Zooming in would also reveal LSS in the somewhat higher redshift shells (not shown).

COMPARISON WITH OBSERVATIONS USED FOR CALIBRATION
In this section we compare the simulations to the observables used to calibrate the subgrid model, i.e. the  = 0 SMF ( §5.1) and  ≈ 0.1 − 0.3 gas fractions of low-mass clusters ( §5.2).In addition, we show three closely related quantities: the stellar-mass-halo-mass (SMHM) relation ( §5.1) and cluster stellar and baryon mass fractions ( §5.3).Because the volume of the flagship runs is 3 orders of magnitudes greater than that of the calibration runs, FLAMINGO enables probing these observables up to much higher masses than used for the calibration, which means that results for massive clusters can, in fact, be considered predictions.Finally, we compare with observations of the relation between stellar and BH mass ( §5.4).Although we did not explicitly calibrate to this last relation, we include it here because we would have adjusted the subgrid AGN feedback efficiency, which we took from Booth & Schaye (2009), if the BH masses had been in substantial disagreement with the data.

Galaxy stellar mass function
The  2022), shifted by the best-fitting systematic cosmic variance and stellar mass biases, which are however both negligible (−0.0022 dex and +0.026 dex, respectively).In the bottom panels, the simulation SMFs have been divided by a spline fit through the observations in order to reduce the dynamic range and thus facilitate the comparison with the data.The specifications of the simulations can be found in Tables 2 and 4. The parts of the SMFs below the resolution-dependent lower mass limits for the calibration are displayed using dotted line styles.The upper mass limit for the calibration is always the same and indicated by the vertical dashed line in all panels.The error bars labelled 'Sys.err' indicate the ±1 systematic errors due to cosmic variance (vertical error bar; negligible) and uncertainty in the stellar mass determination for a fixed IMF (0.14 dex; horizontal error bar).As in all figures, the simulation SMFs include the random lognormal scatter in the stellar mass that is expected to be present in the data (0.3 dex).Except for models M*− , which were calibrated to the observed SMF shifted to 0.14 dex lower masses, all the m10-and m9-resolution models are consistent with the data, while the differences for the m8 resolution are only ≈ 0.1 dex.For reference, the grey dot-dashed curves show the MTNG simulation, which underestimates the number densities by ≈ 0.3 dex, and the grey dashed curves show the BAHAMAS simulation, which agrees very well with the FLAMINGO models that have the same resolution (m9).
therefore shows the simulation result divided by a spline fit through the observations.As discussed in §3.2, we added 0.3 dex lognormal scatter to the simulated stellar masses to mimic Eddington bias due to random measurement errors.As discussed in §3.3, the observed SMF has been shifted by the best-fitting bias factors for cosmic variance and stellar mass, which are, however, far too small to make a visual difference.
The simulations are generally in good agreement with the data over the mass range used for calibration, which extends down to masses corresponding to only ∼ 10 star particles.For the highresolution simulation (m8) the agreement is, however, not quite as good as for the lower resolutions.At higher masses than used for the calibration,  * ≳ 10 12 M ⊙ , the results for the different resolutions diverge and only the low-resolution model (m10) appears to agree with the highest mass data point.However, m9 also agrees with the data if we allow for the systematic error on the stellar mass.Indeed, at fixed low number density the stellar masses still only differ at the factor of ≈ 2 level between the different resolutions, but this corresponds to a diverging difference in number density due to the exponential decline of the SMF at high mass.Moreover, in this mass range the SMF is also sensitive to the assumed Eddington bias, with a smaller random error on the stellar mass resulting in a steeper cut off in the SMF (e.g.Furlong et al. 2015).In addition, above 10 11 M ⊙ the stellar masses become sensitive to the choice of aperture and the treatment of intracluster light (see Kugel et al. 2023), which implies that comparison to the data requires a more sophisticated analysis of the simulations than our 50 kpc spherical apertures.For these reasons we decided against using these high masses in the calibration 16 .The SMFs for the L2.8 (blue) and L1 (green) intermediate-resolution simulations are nearly identical and hence the green curve is invisible except at the highest masses.
For comparison, the grey dashed and dot-dashed curves show, respectively, the BAHAMAS (McCarthy et al. 2017) and MTNG (Pakmor et al. 2022) simulations.While the former shows similarly good agreement with the data as FLAMINGO, the latter strongly underpredicts the SMF.MTNG uses 30 kpc apertures, whereas we use 50 kpc apertures.This difference in aperture is unimportant for  * < 10 11 M ⊙ , but for higher masses the SMF becomes increasingly sensitive to the aperture, with larger apertures resulting in higher masses.Had we used 30 kpc apertures, then this would have improved the agreement between FLAMINGO and observations at the highmass end (Kugel et al. 2023).
The right panels of Fig. 8 are similar to the left panels, but show the cosmology and feedback variations, which all use intermediate resolution and  = 1 Gpc.The SMFs of most models are nearly identical and agree similarly well with the data as the fiducial model.This confirms that re-calibration was unnecessary for the cosmology variations and that models varying only the cluster gas fractions are properly re-calibrated to the SMF.The models using jet-like AGN feedback appear to undershoot the data at  * ≈ 2 × 10 11 M ⊙ , but the difference in number density can easily be accounted for by shifting the masses by the expected systematic error on the observed stellar mass.The only two models that do not fit the data are the ones that are not supposed to: the two M*− variations, which were calibrated to the observed SMF after decreasing the observed masses by the systematic error of 0.14 dex.Interestingly, these models actually still fit the original, unperturbed data in the mass range where the SMF is flat,  * < 10 10.5 M ⊙ .
Fig. 9 is similar to Fig. 8 but plots the SMHM relation for central galaxies.The curves and shaded regions indicate the median and the 16th -84th percentile scatter, respectively.The curves become dotted below the resolution-dependent lower mass limits for the calibration while the dashed horizontal line indicates the constant upper mass limit.The sharp downturns at low halo masses are due to the limited resolutions of the simulations.Note that lower percentiles cut off at higher halo masses, which implies that for the lowest halo masses that still have realistic median stellar masses, a significant fraction of the haloes do not form any stars.The data points are from Behroozi et al. (2019), who used their semi-empirical 'UniverseMachine' model to infer the SMHM relation from observations.These data points are model-dependent and should therefore only be used to rule out extreme simulations.The m9 resolutions agree best with UniverseMachine, while m8 and m10 predict, respectively, higher and lower stellar-to-halo mass ratios for  * > 10 11 M ⊙ .At lower masses the m9 and m8 simulations both predict higher SMHM ratios than UniverseMachine down to their resolution limits.The right panel shows that for halo masses ≲ 10 13 M ⊙ the M*− runs agree better with UniverseMachine than the fiducial model does.

Cluster gas fractions
Fig. 10 is similar to the top row of the previous figure, but shows the median mass fraction in gas as a function of halo mass for  = 0.1 clusters of galaxies.The gas fraction,  gas,500c , and halo mass,  500c , are measured within  500c .The dashed vertical line indicates the lower halo mass limit for the calibration, which is always  500c = 10 13.5 M ⊙ .The upper mass limits are indicated by the coloured downward pointing arrows.They depend on the resolution, because we used smaller box sizes for the calibration of higherresolution models (see §3).Note that the predictions extend to much higher (≈ 0.5 − 1.5 dex, depending on the resolution) halo masses than the maximum mass used for the calibration.The gas fractions increase monotonically with halo mass, but even at the highest halo masses they fall well short of the universal baryon fraction, Ω b /Ω m (horizontal, dotted line).
The simulations are compared with  ≈ 0.1 X-ray observations compiled from the literature by Kugel et al. (2023) and with  ≈ 0.3 X-ray/weak lensing observations from Akino et al. (2022); Mulroy et al. (2019); Hoekstra et al. (2015).The error bar on each X-ray gas fraction data point reflects the error on the median of all the clusters in the mass bin, but it should be noted that the scatter is much greater than the error on the median and that the indicated errors likely underestimate the true uncertainty.The observed X-ray halo masses have been corrected for the best-fitting hydrostatic mass bias (i.e. they have been shifted to higher halo masses as indicated by the arrow labelled 'HSE bias').This correction brings them in line with the lensing data, as expected given that our best-fitting bias factor largely reflects the prior that we adopted and that is based on comparisons of X-ray and weak lensing masses in the literature (Hoekstra et al. 2015;Eckert et al. 2016).We caution that we have not attempted to account for possible observational selection effects.
The simulations shown in the left panel, which all use the fiducial cosmology and were calibrated to the fiducial data, are in good agreement with the observations, though the gas fractions may be slightly underestimated for high-mass ( 500c ∼ 10 15 M ⊙ ) clusters.The fiducial BAHAMAS model (grey dashed curve) predicts gas fractions that are somewhat lower at the low-mass end, but which agree very well with FLAMINGO for  500c > 10 14 M ⊙ .In contrast, MTNG (grey dot-dashed curve) predicts gas fractions that are much higher than any of the simulations and that are inconsistent with the data.
In the right panel, where all the m9-resolution model variations are compared, the models that were calibrated to the same data generally fall nearly on top of each other, making them nearly indistinguishable.This confirms that re-calibration was unnecessary for cosmology variations and that model M*−, for which the target SMF was changed while leaving the gas fraction data unchanged, was properly calibrated.As expected, simulation fgas+2 overshoots the fiducial data, while models fgas−2, fgas−4, and fgas−8 undershoot the fiducial data by increasing amounts.Each of these models is, in fact, in good agreement with its own calibration targets.
Beyond the upper mass limit for the calibration of the m9 resolution,  500c = 10 14.36 M ⊙ , the different models converge towards similar values, indicating that the results become less sensitive to the strength of the AGN feedback.Another result of note is that the Jet models yield shallower relations between gas fraction and halo mass.At low mass, below the minimum mass for calibration, model Jet predicts substantially higher gas fractions than the fiducial model.At high mass, above the calibration limit, model Jet_fgas−4 predicts slightly lower gas fractions than model fgas−4.

Cluster star and baryon fractions
Fig. 11 shows the median cluster stellar mass fractions as a function of halo mass, both measured within  500c .From the SMHM relation it follows that the maximum stellar mass used for calibration,  * = 10 11.5 M ⊙ , corresponds to a halo mass  500c ∼ 10 14 M ⊙ .Hence, for higher halo masses the stellar masses of the central galaxies were not calibrated.However, what is plotted in Fig. 11 is the total stellar mass, which is dominated by satellites and the extended, diffuse component that is traced by the intracluster light and which originates from disrupted satellite galaxies (e.g.Bahé et al. 2017;Mitchell & Schaye 2022).Hence, the star fraction is predominantly determined by the part of the SMF that accounts for most of the mass, i.e. the knee, which we did calibrate to, though the contribution of the central galaxy is not negligible.
The simulations predict star fractions that decrease with halo mass, which is consistent with the observed trend.However, the predicted stellar masses are on the high side for massive clusters.This discrepancy worsens with increasing resolution, which is consistent with the increase in the stellar-to-halo mass ratio with resolution (Fig. 9).The low-resolution model agrees well with the data, but this may be for the wrong reason because its SMF already cuts off at  * ∼ 10 11 M ⊙ due to resolution effects.However, it is unclear whether it is the total stellar mass fraction that should be compared with the data.The extended low surface brightness stellar emission around galaxies, including the intracluster light, is difficult to detect.Indeed, that is why we measure the stellar mass in apertures of 50 kpc when we calibrate to the observed SMF.As shown by the dotted curves, including only the stellar mass within the 50 kpc apertures (summed over all member galaxies) results in much lower stellar mass fractions, where the relative difference increases with halo mass.The total and the aperture-limited stellar mass fractions bracket the observed values.We also note that the offsets between different data sets exceed the scatter within a given data set, which suggests that the systematic errors are large compared with both the quoted errors and the intrinsic scatter.Nevertheless, it should be kept in mind that FLAMINGO may overestimate the stellar masses of massive clusters.
The right panel shows that the M*− models, which were calibrated to the SMF after shifting it to 0.14 dex lower stellar masses, yield lower stellar mass fractions, similar to those in the BAHAMAS simulation (grey, dashed curve).The same holds for Jet_fgas−4, which Fig. 9 shows gives similar stellar masses as the M*− models for the centrals in haloes of mass ≳ 10 13 M ⊙ .
The bottom row of Fig. 11 shows the median baryon mass fraction within  500c as a function of  500c , where the baryon fraction is just the sum of the gas and the true total stellar mass fractions that we discussed before.The baryon fractions increase with halo mass and at the high-mass end the different models converge to values slightly below the universal baryon fraction.There is a large amount of scatter in the data, much more than is predicted by the simulation, which may indicate that the observational scatter is dominated by measurement or modeling errors.The simulations are in good agreement with the data, with the exception of the  500c ∼ 10 15 M ⊙ data from Zhang et al. (2011).However, as can be seen from Fig. 10, the baryon fractions measured by Zhang et al. (2011) are much lower, and hence inconsistent with the gas fractions from a variety of studies targeting larger samples of similarly high cluster masses.The BAHAMAS simulation shows a similar trend as FLAMINGO, but yields slightly lower baryon fractions for  500c ≲ 10 14 M ⊙ .8, the left panel shows the simulations using the fiducial galaxy formation model and the fiducial cosmology, while the right panel shows the model variations, as indicated in the legend.The vertical dashed line indicates the lower (mass limit for the calibration.The resolution-dependent upper mass limits for the calibration are indicated by the coloured downward pointing arrows.The rightward pointing arrow labelled 'HSE bias' indicates the correction for hydrostatic bias that has been applied to the observed X-ray data.For reference, the horizontal dotted lines indicate the universal baryon fraction.The simulations are compared with the data used for the calibration: the compilation of  ≈ 0.1 X-ray data from Kugel et al. (2023) and the  ≈ 0.3 weak lensing (plus X-ray) data from Akino et al. (2022), Mulroy et al. (2019), andHoekstra et al. (2015).For comparison, the grey dot-dashed curves show the MTNG simulation (at  = 0), which predicts too high gas fractions, and the grey dashed curves show the fiducial BAHAMAS simulation (at  = 0), which is closest to model fgas−2.

Supermassive black holes
Fig. 12 shows the median relation between the stellar mass and the mass of the most massive BH of the galaxy.The simulations are compared with observations from Graham & Sahu (2023) for different galaxy morphologies.At the high-mass end ( * ≳ 10 11.5 M ⊙ ) FLAMINGO agrees with the observations for ellipticals, while at the low-mass end ( * ≲ 10 11 M ⊙ ) there is good agreement with the data for discs.In contrast, the MTNG simulations follows the data for elliptical galaxies even at masses were most galaxies are observed to be disky (e.g.Moffett et al. 2016).
At low stellar masses the curves asymptote to the BH seed mass, which is 10 5 M ⊙ for m8 and m9, but 10 7 M ⊙ for the m10 resolution.As the galaxy (and halo) mass increases, there comes a point when the BH starts to grow rapidly through gas accretion.As can be seen in the figure, this rapid growth phase begins at  * ∼ 10 10.5 M ⊙ and ends when the BH mass has increased to ∼ 10 8 M ⊙ and the stellar mass has increased by only a factor of a few to values slightly less than 10 11 M ⊙ .This behavior is similar to that in the EAGLE simulation (Schaye et al. 2015), for which Bower et al. (2017) found that the rapid growth phase is triggered when stellar feedback becomes inefficient, which in itself is related to the appearance of a hot, gaseous halo for halo masses ∼ 10 12 M ⊙ .The rapid growth phase ends when the BH become sufficiently massive to regulate its own growth via AGN feedback.Beyond this point the relation remains super-linear, which is steeper than observed for elliptical galaxies, but similar to the observations including all galaxies if we account for the fact that low-mass ( * < 10 11 M ⊙ ) galaxies tend to be disky.
The normalisation of the high-mass end of the relation is sensitive to the assumed efficiency of AGN feedback (Booth & Schaye 2009, 2010).We use a feedback efficiency of  f = 0.15 and a radiative efficiency of  r = 0.1, which implies that 1.5 per cent of the accreted rest mass energy is used to heat the gas surrounding the BH.These values are identical to those used by Booth & Schaye (2009) and were also used in the OWLS (Schaye et al. 2010) and EAGLE (Schaye et al. 2015) simulations.Although we therefore did not tune the efficiency, Booth & Schaye (2009) motivated this choice by the desire to fit the normalisation of the BH mass -stellar mass relation, and if the chosen efficiency had resulted in a poor fit then we would have changed it.As shown by Booth & Schaye (2009, 2010), this would have made hardly any difference for observables other than the BH mass, which can be understood if AGN feedback is self-regulating.We therefore consider the normalisation of the BH mass -galaxy mass relation to be part of the calibration.Moreover, by changing the Booth & Schaye (2009) boost factor for the BH accretion rate (see §2.3.5),we have some control over the galaxy mass corresponding to the rapid BH  10, but showing the median stellar (top panels) and baryonic (bottom panels) mass fractions of  = 0.1 clusters.The solid curves in the top panels, as well as the grey dashed curves showing the BAHAMAS simulation, show the true median total stellar mass fractions, while the dotted curves show the fractions we obtain if we only include the mass that is within our fiducial apertures of 50 kpc (summed over all galaxies).The simulation results are compared with observations from Zhang et al. (2011), Gonzalez et al. (2013), Kravtsov et al. (2018), andAkino et al. (2022), where the total and brightest cluster galaxy stellar masses from the last study have been increased by factors of 1.15 and 1.30 to account for blue galaxies and intracluster light, respectively.Observed stellar masses have been corrected to our Chabrier IMF following Chiu et al. (2018).Our fiducial correction for hydrostatic mass bias has been applied to the observed data points.While the star fractions appear higher than observed, they are highly sensitive to the apertures within which the galaxy masses are measured.
growth phase, which determines the mass above which star formation is quenched.
The right panel of Fig. 12 shows that there is little difference between the different models.For models M*−, which were calibrated to produce 0.14 dex lower stellar masses than the fiducial model, the rapid growth phase is shifted to lower stellar masses by about this amount.This indicates that the rapid growth phase is determined mostly by the halo rather than by the stellar mass, which is consistent with the conclusions of Bower et al. (2017).

COMPARISON WITH OBSERVATIONS NOT USED FOR CALIBRATION
In the previous section we compared FLAMINGO to observations used for the calibration of the subgrid physics.In this section we compare to selected observables that were not considered in the calibration: the cosmic star formation history ( §6.1), the stellar mass dependence of specific star formation rates, passive fractions, stellar metallicities, the sizes of active and passive galaxies ( §6.2), cluster scaling relations ( §6.3), and the cross-correlation of thermal SZE and CMB gravitational lensing convergence maps ( §6.4).

Cosmic star formation history
Fig. 13 shows the cosmic SFR density as a function of redshift.The SFR density was computed on-the-fly with high cadence by summing the instantaneous SFRs of all gas particles.As in earlier figures, the left panel shows the models that use the fiducial cosmology and galaxy formation model but which differ in terms of resolution and volume, while the right panel compares model variations in a (1 Gpc) 3 volume at m9 resolution.The L2p8_m9 and L1_m9 models lie on top of each other, which implies that the predictions of the L1 models are already converged with box size.The kink at  = 7.8 is due to photo-heating associated with reionization (see §2.3.1).
Close inspection reveals a much less prominent kink at  = 2.91, the redshift where we switch from a gravitational softening length that is fixed in comoving units to one that is fixed in physical units.At present we do not have an explanation for the small kink at  ≈ 2  and m9).The simulations are compared with data from Graham & Sahu (2023) for elliptical (E), spiral (S) and SO galaxies.There is good agreement with the data if we consider that most galaxies with masses < 10 11 M ⊙ are disky.
that is only visible for the high-resolution model and that is absent for smaller volume simulations of the same model (not shown).
The models are compared with a compilation of pre-2016 data from Behroozi et al. (2019) and with a number later FIR and radio surveys.For Behroozi et al. (2019) we show both the originally reported measurements (labelled 'Observed') and Behroozi et al.'s best estimate of the intrinsic values after accounting for systematic errors (labelled 'True').All models are in agreement with the data below  ≈ 2. At higher redshifts the different resolutions start to diverge and eventually fall below the observations.This is expected, we resolve star formation in the haloes that dominate the cosmic SFR up to some resolution-dependent redshift.At very high redshift the low-resolution (m10) simulation yields higher SFRs than the m9 model because it uses a much lower threshold density for star formation (see §2.3.2).
The right panel shows that for  < 2 the M*− models predict lower SFRs.The other models are very close to the fiducial one 17 .BAHAMAS and MTNG predict shallower declines from  = 2 to  = 0 than observed and predicted by FLAMINGO. 17At  > 2 the Jet models predict slightly higher star formation rate densities, which is due to the fact that the bug affecting star formation in zero metallicity gas, described in footnote 5, was fixed for the Jet models (as well as for the fiducial m8 and m10 simulations).

Galaxy properties
Although FLAMINGO generally has too low resolution for detailed studies of galaxy structure and evolution, it can provide useful predictions for studies of relatively massive galaxies, particularly for integrated (i.e.spatially unresolved) properties.In order to demonstrate the uses and limitations of the simulations when it comes to galaxy properties, we show in Fig. 14 a number of observables as a function of stellar mass at  = 0. From top to bottom, the different rows show the median, instantaneous specific star formation rate (sSFR) for active galaxies, i.e. galaxies with sSFR ≥ 10 −11 yr −1 , the fraction of galaxies that are passive (i.e.sSFR < 10 −11 yr −1 ), the stellar metallicity18 , and the projected stellar half-mass radii of active and passive galaxies.Unless specified otherwise, all quantities except galaxy sizes are computed using all particles bound to the subhalo and within a 3D spherical aperture of radius 50 kpc, while galaxy sizes are computed from all bound particles inside a projected 2D circular aperture of radius 50 kpc.As in earlier figures, the left and right panels compare different resolutions/box sizes and different models, respectively.
The near perfect agreement between L2p8_m9 and L1_m9 implies that the galaxy properties are converged with the simulation box size, except at  * > 10 12 M ⊙ , where the L1_m9 simulation suffers from small number statistics.Convergence with the numerical resolution is less good, except at high mass.The curves are dotted below the Cosmic time [Gyr] Figure 13.Cosmic SFR density as a function of redshift (bottom axis) and cosmic time (top axis, assuming the fiducial cosmology).As in Fig. 8, the left panel shows the simulations using the fiducial galaxy formation model and the fiducial cosmology, while the right panel shows the model variations, as indicated in the legend.The predictions are compared with pre-2016 data from the literature compiled by Behroozi et al. (2019), who provide both the reported measurements and their best estimate of the 'true' values after accounting for systematic effects, and more recent radio data from Novak et al. (2017); Enia et al. (2022) and far-infrared data from the ALPINE survey (Gruppioni et al. 2020;Khusanova et al. 2021).For reference, we also show predictions from BAHAMAS and MTNG.FLAMINGO agrees well with the data for  < 2, but at higher redshifts the results become increasingly sensitive to the resolution.The differences between the model variations are small, though the models that were calibrated to lower SMFs (M*− ) yield lower SFRs at  < 2. mass limit for the calibration of the SMF, but it is clear that for some properties, e.g.passive fractions and sizes, resolution effects kick in at higher masses than this calibration limit.This is hardly surprising given that the SMF was calibrated to masses corresponding to fewer than 10 stellar particles.The upturns of the sSFR and passive fraction towards low masses, as well as the downturn of the metallicity, are all resolution effects.The minimum possible nonzero SFR (see §2.3.2),i.e. the rate of a single star-forming gas particle with density equal to the star formation threshold, corresponds to a minimum possible nonzero sSFR of 0.4 Gyr −1 ( g /10 9 M ⊙ )( * /10 9 M ⊙ ) −1 for the star formation threshold used for the m8 and m9 resolutions.Equating this to the upper limit used to define passive galaxies, sSFR = 0.01 Gyr −1 , yields a critical stellar mass of  * = 4 × 10 10 M ⊙ ( g /10 9 M ⊙ ).For m10 this is an overestimate, because it uses a much lower star formation threshold.Galaxies with masses lower than this critical value will be active even if they have only a single star-forming gas particle.Galaxies with much lower masses can either have zero sSFR and thus be passive or have a sSFR (far) above the main sequence for star forming galaxies.Hence, to form the right amount of stars in their haloes, such galaxies must be passive most of the time.Therefore, for galaxies with masses below the critical value we expect both the sSFRs of active galaxies and the passive fractions to be too high for purely numerical reasons and this is indeed what we find.
Resolution effects are also visible for galaxy half-mass radii (solid curves that become dotted below the lower mass limit for the calibra-tion of the SMF).The size-mass relation flattens when the size drops below about three gravitational softening lengths.This flattening is likely caused by spurious collisional heating by dark matter particles rather than by the softening itself, which actually reduces the collisional heating (Ludlow et al. 2021(Ludlow et al. , 2023)).At high mass there is reasonable agreement between the sizes for the different resolutions, though the high-resolution m8 model, which is closest to the data, predicts somewhat smaller values.At the high-mass end the definition of size is however sensitive to the treatment of the extended, diffuse component, as can be seen by comparing to the dashed lines, which show the half-mass radii computed by including all bound particles within 100 kpc circular apertures.Note that at the high-mass end the size-mass relation asymptotes to a size equal to about half the aperture.
Comparing with the observations, we see that in the resolved mass range (which, for properties other than metallicity, begins at higher masses than the mass limit used for calibration) there is generally good agreement with the observations for sSFR, passive fraction and metallicity.The exception is the passive fraction at  * ≳ 10 12 M ⊙ , where the simulations predict an increasing fraction of active galaxies with increasing mass that is inconsistent with the inference from the semi-empirical model of Behroozi et al. (2019) that we compare with.The decline of the passive fraction and the increase in the sSFR of active galaxies with mass for  * ≳ 10 12 M ⊙ are less steep at high resolution (m8) than at intermediate resolution (m9).The results from small-volume test runs that we performed suggest   12, but showing a selection of median galaxy properties as a function of stellar mass.All properties are shown at  = 0.1 with the exception of the passive fractions, which are at  = 0. From top to bottom the rows show specific SFR of active galaxies, passive fraction (i.e. the fraction of galaxies with sSFR < 10 −2 Gyr −1 ), stellar metallicity, and projected stellar half-mass radii for active and passive galaxies, respectively.Solid/dotted curves are computed using all particles bound to the subhalo and inside our fiducial 3D apertures of radius 50 kpc, except for half-mass radii which are projected and computed within 2D apertures of radius 50 kpc.The dashed curves in the panels showing galaxy sizes are computed using 2D apertures of radius 100 kpc.As indicated in the legends, the simulations are compared with observed  ≈ 0.1 sSFRs from Bauer et al. (2013),  ≈ 0 passive fractions compiled by Behroozi et al. (2019),  ≈ 0.1 stellar metallicities from Gallazzi et al. (2005, error bars indicate the scatter), and  ≈ 0.1 galaxy sizes from Van der Wel et al. (2014); Lange et al. (2015).Many of the galaxy properties are sensitive to numerical resolution.Over the mass ranges where the simulation results are resolved, which depends on the property and the resolution, the agreement with the data is generally good.However, the passive fractions are underestimated for the highest masses ( * ≳ 10 12 M ⊙ ) and the sizes are slightly too large.
this is probably due to the fact that in the m9 simulations the BHs are less well pinned to the halo center because we neglected to exclude the contribution of the BH to the gravitational potential for the purpose of BH repositioning (see §2.3.5).Galaxy sizes are generally overestimated, except for the high-resolution simulation at high mass ( * ≳ 10 11 M ⊙ ).
The right panels of Fig. 14 show that the differences between the models is small, though the M*− models, which were calibrated to a lower SMF, are indeed shifted towards lower stellar masses (i.e. to the left in the figure) and to lower metallicities.The only other models predicting significantly different properties are the Jet models, which for  * ≫ 10 11 M ⊙ yield lower sSFRs for active galaxies and higher passive fractions. 19

Cluster scaling relations
Modeling galaxy clusters is an important goal of FLAMINGO.Clusters are widely used for observational cosmology, mainly to measure the halo mass function, and they are also of great interest for studies of the evolution of massive galaxies and the intracluster medium.While the simulations were calibrated to reproduce the observationally inferred cluster gas fractions inside  500c , this does not guarantee that the X-ray and SZE observations are reproduced, since those can depend on the profiles of the density, temperature, and metallicity as well as the clumpiness and multiphase structure of the gas.Moreover, the observed gas fractions are model dependent and subject to selection effects.We will present detailed studies of the hot gas and its observational signatures elsewhere, but will provide a preview by comparing with some observed cluster scaling relations between integrated thermodynamic properties and halo mass.
Fig. 15 compares the predictions of the models using the fiducial calibration and cosmology for the median relations between  = 0 X-ray luminosity and temperature (top left panel), X-ray luminosity and halo mass (top right panel), temperature and halo mass (bottom right panel), and integrated SZE Compton  and halo mass (bottom left panel).Here the X-ray luminosities are measured in the 0.5-2 keV band and the temperatures are mass-weighted averages over all gas with  > 10 5 K.As will be detailed in Braspenning et al. (in preparation), the particle X-ray luminosities are computed using emissivity tables generated with cloudy (Ferland et al. 2017, version 17.02) and account for the individual elemental abundances of each particle.Hence, the the X-ray luminosities are consistent with the element-by-element radiative cooling rates used during the simulation, which also used cloudy (see §2.3.1).To avoid artefacts due to the subgrid prescription for AGN feedback, particles that were subject to the injection of AGN feedback energy in the last 15 Myr and whose temperatures are between 10 −1 Δ AGN and 10 0.3 Δ AGN , were excluded, but this made negligible difference.Using spectroscopic-like instead of mass-weighted temperatures gives nearly identical median relations, but results in some outliers, particularly if recently heated gas is not excluded (not shown).Luminosities, temperatures and masses are measured inside  500c .
The cluster Compton  parameter is the integral of the thermal SZE,  = ∫ ddΩ, where  is given by equation ( 8), dΩ is the solid angle and the integral extends from  = 0 to the redshift of photon decoupling and over the observed angular aperture.Using dΩ = d  d  / 2 A , where  A () is the angular diameter distance of 19 These differences may partly be due to the fact that the Jet models, as well as the fiducial m8 and m10 simulations, used the improved implementation of BH repositioning discussed in §2.3.5.
the cluster and d  and d  are the proper sizes in the directions of the observer's spherical polar coordinates  and , we have ddΩ = , where d is the proper volume element.Hence, if we ignore contributions from structures in front and behind the cluster and limit the integral to 3D distances <  from the cluster center, we obtain We exclude gas recently heated by AGN using the same criteria as mentioned above for X-ray luminosities, but this makes negligible difference.To facilitate comparison with Planck measurements we measure  within 5 500c .
The convergence with simulation box size and resolution are both excellent.Even the low-resolution simulation L1_m10 yields converged X-ray scaling relations for haloes with temperatures  X,500c > 2 × 10 7 K and halo masses  500c > 10 13.5 M ⊙ .For the SZE the L1_m10 results are even converged down to the lowest masses shown,  500c = 10 13 M ⊙ .The scatter, which is indicated by the shaded region for L2p8_m9, increases towards lower masses and is largest for the luminosity.
The predictions are compared with a compilation of X-ray observations from the literature (see the legend) and SZE observations of  < 0.25 clusters from Planck Collaboration et al. (2016), where the latter were compiled by McCarthy et al. (2017).We scaled the X-ray luminosities to the 0.5-2 keV band and  = 0 using PIMMS20 (Mukai 1993).The observationally inferred halo masses have been corrected for hydrostatic mass bias by dividing the masses by a factor 0.743 (see §3.3), as indicated by the arrows labelled 'HSE bias', but the biases on other quantities (as a result of the small change in  500c ) are neglected.The predicted median X-ray luminosity is ≈ 0.5 dex low compared with the lowest data point from Lovisari et al. (2015).However, this data point represents only 7 objects, of which the lowest-luminosity one is far below the lower error bar indicating the 16th percentile.Moreover, as discussed by Lovisari et al. (2015), their low-mass clusters are biased to high luminosities due to Malmquist bias.We conclude that the agreement with the data is excellent for all observables and over the full range spanned by the data.The same holds for the BAHAMAS simulation, which is also shown (grey dashed lines).
In Fig. 16 we compare the X-ray luminosity -temperature relation for simulations that vary the calibration data for our fiducial cosmology and intermediate resolution.The different cosmologies are indistinguishable (left panel).The simulations that were calibrated to different gas fractions do show large differences, with higher gas fractions yielding a higher luminosity at fixed temperature (middle panel).The fiducial model fits the data best and the most extreme model, fgas−8, is clearly inconsistent with the observations.Varying the SMF at fixed gas fraction has very little effect (right panel).The jet implementation of AGN feedback gives nearly identical results as the fiducial thermal implementation when the models are calibrated to the same gas fractions (right panel).Model Jet does differ from L1_m9 for  X,500c < 2 × 10 7 K, but this can be attributed to the fact that model Jet has higher gas fractions for  500c < 10 14 M ⊙ (see Fig. 10), which corresponds to  X,500c ≈ 2×10 7 K (see Fig. 15).
In summary, the X-ray and SZE cluster scaling relations are converged and insensitive to the investigated variations in cosmology, to the uncertainty in the SMF and, for a fixed gas fraction, to the implementation of AGN feedback (we showed the model comparison only , where  () ≡  ()/ 0 , to correct the data to  = 0 assuming self-similar evolution).Observed X-ray luminosities have been scaled to  = 0 and the 0.5-2 keV band using PIMMS (Mukai 1993).The arrow labelled 'HSE bias' indicates the correction for hydrostatic mass bias that has been applied to the observed X-ray data.The grey dashed lines show the results from the BAHAMAS simulation (McCarthy et al. 2017).The simulations are converged with box size and resolution (except for the low-resolution L1_m10 at the lowest masses) and the agreement with the data is excellent.
for the luminosity -temperature relation, but these conclusions also hold for the other relations shown in Fig. 15).The X-ray luminosity -temperature relation is sensitive to the gas fraction.The model calibrated to the fiducial (i.e.unperturbed) gas fractions is in excellent agreement with the data, while the models with gas fractions that differ more disagree more strongly with the observations.The luminosity -mass relation paints a similar picture, while the other scaling relations are less sensitive to the gas fractions (not shown).

Large-scale structure
The large volumes of the FLAMINGO simulations make them well suited for comparisons to LSS observables such as galaxy clustering, CMB lensing, the SZE, and cosmic shear.Indeed, the ability to make such comparisons was a primary consideration in the design of FLAMINGO.To facilitate such comparisons, we have produced onthe-fly full-sky lightcones for multiple observer locations (described The median cluster X-ray luminosity -temperature relation at  = 0 for clusters with mass  500c > 10 13 M ⊙ .The curves are plotted down to the median temperature for the minimum included halo mass.The X-ray luminosities are for the 0.5-2 keV band and the temperatures are mass-weighted; both are measured within  500c .The different panels compare simulations with different cosmologies (left), simulations that were calibrated to different gas fractions (middle), and models that were calibrated to different stellar mass functions (right) and simulations that use jet-like AGN feedback (also the right panel).For reference, the fiducial simulation with the same box size and resolution as the models shown, L1_m9, is repeated in every panel.The data points show the same observations as in the top-left panel of Fig. 15.The X-ray luminosity -temperature relation is sensitive to the gas fraction, but not to the cosmology, stellar mass function or the implementation of AGN feedback (model Jet differs at low temperature, but there its gas fractions are also higher).
in Appendix A), including HEALPix maps, and various 3D power spectra of each full simulation volume with a high time cadence.
In future papers we will explore various measures of the clustering of matter, gas, and galaxies, elucidating the feedback and cosmology dependencies of these quantities, and we will confront the simulations with a wide variety of LSS observables.Here we present an example of the kind of comparisons that are enabled by the FLAMINGO data set.Specifically, we discuss here the spatial cross-correlation between the thermal SZE signal and the CMB lensing convergence field, .As this cross-correlation depends on both the clustering of matter and how hot baryons trace the matter field, it is sensitive to both cosmology and feedback variations, making it an interesting test case.
We first describe the construction of the thermal SZE and CMB lensing maps.As described in Section A2.3, we accumulate the Compton  values of individual particles crossing the lightcone onto HEALPix maps over fixed intervals in redshift.To construct a total Compton  map, we only need to sum these maps along the line of sight, which we do back to  = 3 (which we have verified is sufficient for the SZE-CMB lensing cross power spectrum).To construct CMB lensing  maps, we follow the method described in McCarthy et al. (2018), which employs the so-called Born approximation (i.e.light ray paths are approximated as straight lines).In short, for each HEALPix total mass map (of which there are 60 per lightcone back to  = 3), we compute a projected (2D) overdensity map, ( , ).The maps are then integrated along the line of sight weighted by the CMB lensing kernel to yield the total convergence map where  is the comoving distance,  max is the maximum redshift of integration (taken to be  = 3), and  CMB = 1100 is the surface of last scattering.
We use the NaMaster21 package (Alonso et al. 2019) to compute the angular cross-spectrum between the dimensionless scalar (spin-0) quantities  and .To save computational effort, the HEALPix maps here have been downsampled from  side = 16384 to  side = 4096, corresponding to an angular resolution of ≈ 0.86 arcmin, which is sufficient for current measurements.When computing the crossspectra, we initially use a multipole moment resolution (bandpower) of Δℓ = 8 but then employ a Savitzky-Golay filter of order 3 and window size of 25 to further smooth the simulated spectra.We deconvolve the  side = 4096 pixel window function from the computed cross-spectra using the pixwin function within the HEALPix package.
We compare our map-based cross-spectra with a Limber approximation-based calculation integrating the 3D matter-electron pressure cross-spectrum along the line of sight back to  = 3, where  m,e is the 3D matter-electron pressure cross-spectrum and the normalisation factor, , is defined as The derived cross-spectra are compared in Fig. 17.In the top left panel, we examine the dependence on box size, resolution, and method for computing the cross-spectrum (the fiducial map-based method versus 1D Limber integration), as well as the role of cosmic variance.At fixed box size (L1), there is excellent convergence between the three different resolutions.The large 2.8 Gpc run (comparing the mean from the 8 independent lightcones) has slightly more Thermal SZE and CMB lensing convergence maps are constructed from full-sky lightcone-based HEALPix maps (downsampled to  side = 4096 for the present test) produced on-the-fly while running the simulations.Cross-spectra are computed using the NaMaster pseudo- ℓ package.In the top left panel, the shaded region represents the scatter between the 8 independent lightcones (i.e. 8 different observer locations in the simulation volume) for the L2p8_m9 simulation, while the dashed curve represents the cross-spectrum derived by integrating the 3D matter-electron pressure cross-spectrum along the line of sight from 3D power spectra produced on-the-fly by employing the Limber approximation.The - cross-spectrum is well converged with the numerical resolution at fixed box size, but comparison of the two intermediate-resolution simulations shows that the 1 Gpc volume misses a small amount of power relative to the 2.8 Gpc simulation (top left panel).This particular cross-spectrum is more sensitive to variations in cosmology (top right panel) than variations in feedback implementation (bottom panels), though the effect of the latter is also clearly visible.Comparison to the measurements of Hill & Spergel (2014) reveals a weak preference for a low  8 cosmology or a high neutrino mass relative to the Planck cosmology.
power than the 1 Gpc runs on all scales, which is likely due to a larger number of very rare, massive clusters.In the context of the 2.8 Gpc run, the Limber 1D calculation agrees well with the mean map-based calculation (compare the dashed curve labelled 'P(k)' with the solid blue curve).The shaded region encapsulates the cone-to-cone scatter from the large run, illustrating that cosmic variance becomes significant at the largest scales shown, ℓ ≲ 500.
In the top right panel of Fig. 17 we explore the cosmology dependence with fixed baryon physics (the fiducial calibrated model).
Here we see that switching from the Planck cosmology to the lensing LS8 cosmology results in an amplitude reduction of ≈ 30%.The impact of massive neutrinos is also clearly discernible, although over this range of scales it would be difficult to discriminate between a variation in   and a variation in  8 , as both mainly alter the amplitude.
In the bottom panels of Fig. 17 we explore the impact of varying the feedback on the - cross-spectrum.Although the impact of feedback variations is evident, the cross-spectrum is clearly more sensitive to the variations in cosmology we have explored (top right panel) than to the variations in feedback (in spite of the large variations in the latter).The bottom-left panel shows that smaller cluster gas fractions lead to more power for ℓ < 600, but less power on smaller scales.This is consistent with the findings of McCarthy et al. (2018), who previously explored this observable using the BAHAMAS simulations, and found that stronger feedback tends to suppress (enhance) the cross-spectrum on small (large) angular scales.This suggests the cross-spectrum probes both the locations of gas ejection from haloes and gas accumulation at larger physical distances from galaxies (i.e.where the outflows stall).the cross-correlation between the high-frequency 857 GHz Planck data and the SZE map.Nevertheless, Hurier (2015) estimate that the derived cross-spectrum is still likely to be biased high due to residual CIB contamination at the level of 20% ± 10%.We have therefore rescaled the measurements down by 20%.Furthermore, Hill & Spergel (2014) actually measure the - cross-spectrum, where  is the lensing potential, but this is straightforwardly converted into a - spectrum via  ℓ = 2 ℓ /(ℓ(ℓ + 1)).
Comparing the simulated cross-spectra with the observations, we infer that current data is not sufficiently precise to distinguish between the feedback variations, but that there is a weak preference for a low- 8 (or a high neutrino mass) cosmology, which is consistent with other recent findings in the literature based on LSS observables (e.g.Amon et al. 2023).However, we expect that forthcoming measurements of this cross-spectrum from the Advanced Atacama Cosmology Telescope (De Bernardis et al. 2016) and the Simons Observatory (Ade et al. 2019) will yield considerably more precise measurements of this observable, potentially allowing for competitive constraints on both cosmology and astrophysics.

HALO MASS FUNCTION
The halo mass function (HMF) forms the basis of a variety of models of LSS, ranging from halo models of the power spectrum to halo occupation distribution models of galaxy clustering (for a recent review see Asgari et al. 2023).Provided we can calibrate the observable-mass relation for clusters, the HMF can also itself be used as a constraint on cosmology.However, baryonic effects are expected to change halo masses and hence the HMF.Radiative cooling and condensation can draw matter in, both baryons and dark matter, while galactic winds driven by star formation and AGN can expel baryons and cause the dark matter distribution to expand.Contraction tends to dominate in the central regions whereas the outer regions of the  halo tend to expand relative to a DMO model (e.g.Velliscig et al. 2014).Hence, baryonic effects can both increase and decrease halo masses and this will directly impact the HMF, as has been demonstrated using both hydrodynamical simulations (Stanek et al. 2009;Cui et al. 2014;Cusworth et al. 2014;Velliscig et al. 2014;Bocquet et al. 2016) and observationally constrained analytic models (e.g.Debackere et al. 2021).Before investigating baryonic effects, we first show the HMFs for the DMO simulations.Fig. 18 shows the  = 0 HMF,  () = To facilitate a quantitative comparison of the small, but significant differences, the top panel of Fig. 19 shows the ratio of each HMF to that of L2p8_m9_DMO.Convergence with the box size roughly follows expectations based on the Poisson errors shown for L2p8_m9_DMO (dotted curves).Convergence with numerical resolution is excellent, with systematic deviations smaller than 5 per cent down to haloes of 100 particles, below which the HMF begins to decrease relative to higher-resolution simulations.
For  (Bocquet et al. 2020) (bottom panel only).Although there is agreement at the per cent level with Magneticum and MiraTitan for 10 13 ≲  200c /M ⊙ ≲ 10 14 , the differences with (and between) predictions from the literature generally far exceed the Poisson errors (which should also be small for the literature studies).The discrepancies are likely dominated by differences between the halo definitions used by different halo finders (e.g.Bocquet et al. 2016;Euclid Collaboration et al. 2023).Such large discrepancies are obviously problematic and suggest that a different approach may be needed for cluster counts cosmology (e.g.Debackere et al. 2022).
Each curve in Fig. 20 shows the ratio of the  = 0 HMF in a hydrodynamical and the corresponding DMO simulation as a function of halo mass,  200m .The top left panel compares the effects of baryons for the fiducial FLAMINGO simulations.The convergence with box size is excellent, with L1_m9 and L2p8_m9 agreeing nearly perfectly up to  200m = 10 15 M ⊙ , beyond which the smaller volume runs out of haloes.Convergence with resolution is also excellent, at least in the mass range of clusters.L1_m9 is nearly identical to the high-resolution L1_m8 for  200m > 10 13 M ⊙ .For 11.7 < log 10  200m /M ⊙ < 13.0 the differences are < 5 per cent, but at lower mass the two resolutions diverge.This is expected, because for intermediate resolution (m9) a significant fraction of the haloes with mass 10 11.7 M ⊙ have not formed any stars at all (see Fig. 9).
Focusing for the moment on the high-resolution L1_m8 simulation, which covers the largest range in halo mass, we see that baryon physics reduces the HMF by more than 5 per cent for both  200m < 7 × 10 11 M ⊙ and 2 × 10 12 M ⊙ <  200m < 1 × 10 14 M ⊙ , where the low-and high-mass suppression can be attributed to the ejection of gas by stellar and AGN feedback, respectively.For 10 12 M ⊙ haloes the HMF is nearly unchanged by baryonic effects.This is the halo mass for which stellar feedback becomes inefficient but AGN feedback has not yet had a large impact, resulting in a peak in the ratio  * / 200m (which manifests itself as the inflection in the SMHM relation shown in Fig. 9).At very high masses,  200m ≫ 10 14 M ⊙ the HMFs in the DMO and hydrodynamical simulations converge for all resolutions.This is consistent with the high baryon fractions of such massive haloes (see the bottom row of Fig. 11), and suggests that even AGN feedback is unable to remove, or keep out, a large fraction of the baryons.The largest deviation, ≈ 18%, occurs at  200m = 2 × 10 13 M ⊙ .
The top right panel of Fig. 20 shows that the effect of baryons on the HMF is insensitive to cosmology, though it should be noted that we have not varied the universal baryon fraction, which may be expected to have the largest impact, at least without recalibration of the subgrid physics.
The bottom left panel compares the models that have been calibrated to different cluster gas fractions but to the same SMF.For  200m > 10 13 M ⊙ smaller gas fractions yield a stronger suppression of the HMF, which is unsurprising.For the model with the smallest gas fraction the reduction of the HMF exceeds 5 per cent up to 10 15 M ⊙ .
The bottom right panel shows the remaining models.Comparing the models calibrated to the fiducial SMF with the M*− models, for which the observed stellar masses have been reduced by the expected systematic error (0.14 dex) before calibration, we see that the latter simulation predicts a 5-10 per cent stronger baryonic suppression of the HMF for  200m ∼ 10 12 M ⊙ but almost no difference for higher masses, where stars contribute a smaller fraction of the halo mass.Model Jet predicts a weaker baryonic effect than L1_m9 for 10 13 − 10 14 M ⊙ haloes, which is consistent with its higher gas fractions for these masses (Fig. 10).Models Jet_fgas−4 and fgas−4 are in close agreement, which is consistent with the good agreement between their gas fractions (Fig. 10).Hence, it seems that, at least at our relatively low resolution, the effect of AGN feedback on the HMF is insensitive to the subgrid implementation, kinetic jet-like versus isotropic and thermal, provided the models are calibrated to reproduce the same halo gas fractions.The observational uncertainty on this quantity, i.e. the difference between the fgas−2 and the fgas+2 models, then implies a ∼ 10% uncertainty on the HMF at 10 14 M ⊙ .
Each panel of Fig. 20 also shows the baryonic suppression of the HMF predicted by the Magneticum simulations (Bocquet et al. 2016) and the two cosmo-OWLS simulations that bracket the observed cluster gas fractions (Velliscig et al. 2014).For  200m ≫ 10 13 M ⊙ the cosmo-OWLS models indeed bracket the fiducial FLAMINGO model, with cosmo-OWLS model AGN8.0 agreeing very closely for  200m > 10 14 M ⊙ , the mass range where this model fits the observed gas fractions well (Le Brun et al. 2014).In the mass range 10 14 − 10 15 M ⊙ Magneticum also agrees very well with fiducial FLAMINGO, but at higher masses the fit from Bocquet et al. (2016) gives strange results, which is probably due to extrapolation beyond the mass range covered by the simulations.At the low-mass end,  200m ≪ 10 13 M ⊙ , both cosmo-OWLS and Magneticum predict a stronger suppression of the HMF than any of the FLAMINGO variations.At least for cosmo-OWLS this can be explained by the fact that the simulations strongly underpredict the SMF for  * ≲ 10 11 M ⊙ (see fig. 1 of McCarthy et al. 2017), which suggests that feedback is too strong and hence that the gas fractions may also be too low.We do not show the MTNG results from Hernández-Aguayo et al. (2022) because they plot the baryonic suppression as a function of  200c rather than  200m .However, a direct comparison (not shown here) reveals that MTNG predicts a suppression curve with a similar shape but shifted to about a factor of two higher masses.For log 10  200c /M ⊙ = 14.0 − 14.5 MTNG agrees well with model fgas+2, while it predicts even higher  hydro /  DMO than fgas+2 for log 10  200c /M ⊙ = 14.5 − 15.0.These differences are qualitatively consistent with the differences in the gas fractions measured at  500c shown in Fig. 10.
We close this section by noting that failing to account for baryonic effects of the magnitude found here would significantly bias cosmological parameters inferred from cluster counts (Castro et

MATTER POWER SPECTRA
Obtaining precision measurements of the total matter power spectrum is a key goal of observational surveys.Baryons modify the power spectrum in various ways.Before recombination, BAO imprint wiggles on very large scales, ∼ 10 2 Mpc, that are well understood theoretically and can be modelled accurately using CMB Boltzmann codes.However, non-linear effects are not completely negligible for the observational signature of BAO in the low- galaxy distribution (e.g.Eisenstein et al. 2007) and simulations like FLAMINGO may be of some help in modelling them.On smaller scales baryonic effects on the power spectrum are much larger and much more uncertain.
Outflows driven by stellar and AGN feedback suppress the power on intermediate scales, while gas cooling boosts the power on small scales (e.g.Van Daalen et al. 2011).For current and upcoming surveys the former is much more important than the latter.Modelling the effect of galaxy formation on the power spectrum is one of the motivations for the FLAMINGO project.Because the baryonic suppression is thought to depend mainly on the gas fractions in clusters (e.g.Semboloni et al. 2011Semboloni et al. , 2013;;Schneider et al. 2019;Van Daalen et al. 2020;Aricò et al. 2021;Salcido et al. 2023), FLAMINGO was calibrated to reproduce this observable.
The solid, black curve in Fig. 21 shows the  = 0 total matter power spectrum in the fiducial L2p8_m9 simulation.The total mat-ter power spectrum is given by () = δ() 2 , where δ is the Fourier transform of the density contrast,  = ( − ρ)/ ρ, with  the total density (i.e.CDM+gas+star+BHs+massive neutrinos) and ρ the cosmic mean density.Comparison with the dotted, black line, which shows the linear theory prediction, demonstrates that modelling non-linear effects is essential for wavenumbers  > 0.1 ℎ Mpc −1 ( < 10 2 Mpc).
The colored lines show the contributions of CDM (green), gas (yellow), stars plus BHs (red), and neutrinos (blue).Solid lines indicate the auto-spectra, i.e. δ () , which dominate over the auto-spectra.
For  ≪ 0.1 ℎ Mpc −1 the baryons trace the CDM, but the relative contribution of neutrinos declines visibly with  over all the scales probed by the simulation.For  ∼ 1 to 10 2 ℎ Mpc −1 the contribution of gas is suppressed.Debackere et al. (2020) used an observationally constrained halo model to show that this suppression is mainly due to ejection of gas from low-mass ( 500c /M ⊙ = 10 13 − 10 14 ) clusters for  ≲ 10 ℎ Mpc −1 and from groups ( 500c /M ⊙ = 10 12 − 10 13 ) for  ≳ 10 ℎ Mpc −1 .At very small scales ( ∼ 10 3 ℎ Mpc −1 ) gas boosts the power spectrum because dissipation allows it to condense into galaxies.Because only cold and dense gas turns into stars, the stellar component (BHs contribute negligibly) boosts the small-scale power  and for  ≫ 10 ℎ Mpc −1 this effect dominates over the suppression due to gas expulsion.Note that the CDM contribution shown in the figure is taken from the hydrodynamical simulations.Its shape differs from that in the corresponding DMO simulation (not shown) because of the gravitational back reaction of the baryons on the CDM (Van Daalen et al. 2011, 2020).
The total baryonic suppression/boost factor, i.e. the ratio of the total matter power spectrum in the hydrodynamical and the corresponding DMO simulations, is shown in Fig. 22. Clockwise starting from the top left, the different panels compare the fiducial models, the cosmology variations using the fiducial galaxy formation model, the models using jet-like AGN feedback as well as models calibrated to different SMFs, and the models calibrated to different gas fractions.Each panel clearly shows that baryons trace the total matter on large scales (i.e. hydro / DMO ≈ 1), suppress the power on intermediate scales ( hydro / DMO < 1) and boost the power on small scales ( hydro / DMO > 1).
The top-left panel demonstrates that the baryonic effects are converged with the box size, because the results for L1_m9 and L2p8_m9 fall on top of each other.Convergence with the resolution is excellent on large scales, but L1_m10 and L1_m9 deviate by more than 1 per cent from the high-resolution L1_m8 simulation for  > 3 and  > 20 ℎ Mpc −1 , respectively.
For comparison, the grey curves show the MTNG and fiducial BAHAMAS simulations.MTNG predicts much weaker baryonic effects, which is consistent with its cluster gas fractions being too high (see Fig. 10).BAHAMAS predicts significantly stronger baryonic suppression than fiducial FLAMINGO, even though its cluster gas fractions are only slightly lower (Fig. 10).The difference in star fractions is a bit larger (Fig. 11), but that does not explain the fact that the difference in power suppression increases towards large scales, particularly since the BAHAMAS gas and star fractions agree better with FLAMINGO for more massive clusters, whose contribution to the power spectrum increases towards larger scales.The halo models of Debackere et al. (2020) suggest that the fact that for  ∼ 1 ℎ Mpc −1 BAHAMAS predicts a stronger suppression indicates that the gas is ejected to larger distances than in FLAMINGO.
The top-right panel shows that the baryonic effect is insensitive to the cosmology, as had already been shown by Van Daalen et al. (2011); Mummery et al. (2017);Van Daalen et al. (2020).However, we note that none of these papers nor FLAMINGO tests a wide range of cosmological parameters.In particular, none explore large variations in the cosmic baryon fraction, which might be expected to show the strongest interaction with the galaxy formation physics.
The bottom-left panel compares the models calibrated to the same SMF but to different cluster gas fractions.Clearly, the smaller the gas fraction, and hence the baryon fraction, the stronger the baryonic suppression.Since the models were calibrated to the same SMF, they converge on small scales where stars dominate over gas.The fgas−2 and fgas+2 models could be taken as upper and lower limits on the magnitude of the baryonic suppression, given that they were calibrated to gas fractions that were shifted by ±2 times the estimated error on the gas fractions (see Kugel et al. 2023).However, comparison with the BAHAMAS prediction shows that this would be naive, because depending on , BAHAMAS predicts a suppression as strong as in fgas−4 or even fgas−8, even though the BAHAMAS gas fractions are much closer to the fiducial FLAMINGO model than to these fgas variations (Fig. 10).This again suggests that the gas (and star) fractions inside  500c do not fully constrain the baryonic suppression.
Indeed, the bottom-right panel of Fig. 22 shows that the Jet models predict a different shape for the scale dependence, with relatively stronger suppression on larger scales.Hence, for a fixed baryon fraction in clusters, there is a residual dependence on the implementation of AGN feedback, as anticipated by Debackere et al. (2020) who showed that the distance out to which the gas distribution is modified is also important.However, the differences between the fiducial and jet-like AGN feedback at fixed gas fraction, as well as those between fiducial FLAMINGO and BAHAMAS, are small in terms of percentage points.While the relative differences in the baryonic suppression factors are large for  < 1 ℎ Mpc −1 , on such large scales  hydro is only a few per cent smaller than  DMO .Finally, the comparison of the different M* variations at fixed gas fraction suggests that uncertainties in the SMF only become important for  ≳ 10 ℎ Mpc −1 .Fig. 23 shows the effect of baryons on the  = 0 matter power spectrum at a fixed scale of  = 1.0 ℎ Mpc −1 as a function of the mean baryon fraction within  500c in clusters of mass  500c = 10 14 M ⊙ .This is similar to fig.16 of Van Daalen et al. (2020), who showed that the latter is a remarkably good predictor of the baryonic suppression for  ≤ 1 ℎ Mpc −1 .Each data point represents a different simulation.The dashed curve shows the fit of Van Daalen et al. (2020) to the cosmo-OWLS (Le Brun et al. 2014) and BAHAMAS simulations with box sizes of at least 400 Mpc/ℎ.The coloured points correspond to the FLAMINGO simulations shown in Fig. 22, except for the cosmology variations which would have fallen on top of L2p8_m9 and L1_m9.The grey points are for simulations taken from the literature.
Note that the widely used galaxy formation simulations EA-GLE (Schaye et al. 2015), Illustris-TNG (Pillepich et al. 2018), and Horizon-AGN (Dubois et al. 2014) predict very small suppression factors because their cluster baryon fractions are larger than observed.used jet-like AGN feedback, the Jet models do follow the relation, although Jet_fgas−4 is the most deviant of the FLAMINGO models.Much more discrepant are Illustris and SIMBA (Davé et al. 2019), but we note that this may in part be due to their relatively small box sizes (∼ 10 2 Mpc), which will cause them to underestimate the contribution of massive haloes to the power spectrum, which are less affected by baryonic physics.If the difference cannot be explained by the simulation volume, then, based on the results of Debackere et al. (2020), it is likely that AGN feedback affects baryons out to larger distances than in the simulations that do follow the Van Daalen et al. (2020) relation.

SUMMARY
Observational cosmology based on measurements of the growth of large-scale structure (LSS), such as cosmic shear, galaxy-galaxy and cosmic microwave background (CMB) lensing, galaxy clustering, and Sunyaev-Zel'dovich Effect (SZE)/X-ray observations of hot gas, is increasingly limited by the accuracy of theoretical predictions.The models that are compared with the data are nearly always based on dark matter only (DMO) simulations, though some allow for marginalization over expected baryonic effects associated with galaxy formation.However, baryonic effects become increasingly important as observations target smaller scales, and may be considerably more complex than is assumed in the corrections applied to DMO simulations.Hydrodynamical simulations can in principle help resolve this issue, but they tend to have volumes that are too small to study LSS, they often do not reproduce the relevant observables, and/or they do not include model variations that cover the relevant parameter space.The FLAMINGO project aims to address these shortcomings.
FLAMINGO consists of a suite of new large-volume cosmological, hydrodynamical simulations.There are three different resolutions, corresponding to baryonic particle masses of 1.3 × 10 8 M ⊙ (referred to as 'high' or 'm8' resolution), 1.1 × 10 9 M ⊙ (intermediate/m9 resolution) and 8.6 × 10 9 M ⊙ (low/m10 resolution), where the latter is used only for convergence testing.The flagship runs are the (1 Gpc) 3 high-resolution L1_m8 and the (2.8 Gpc) 3 intermediate-resolution L2p8_m9 simulations.The latter follows 2.8 × 10 11 particles, which makes it the largest hydrodynamical simulation ever run to  = 0. Importantly, the FLAMINGO suite contains 12 additional simulations at m9 resolution in the 1 Gpc box that vary the cosmology and galaxy formation physics (see Tables 2 and 4 for an overview of all the hydrodynamical simulations).In addition, there is a DMO counterpart to each hydrodynamical run, plus some additional DMO simulations, including a 5.6 Gpc and a 11.2 Gpc box (see Table 3).Besides regular snapshot outputs, lightcone output is generated onthe-fly from the perspective of a number of different observers (8 for L2p8 and 2 for the L1 simulations).
The simulations are performed with the Swift code (Schaller et al. 2023) using the sphenix SPH scheme (Borrow et al. 2022).The simulations include neutrino particles using the new   method of Elbers et al. (2021).The initial conditions include separate fluids for CDM, baryons and neutrinos, and discreteness errors are suppressed by perturbing particle masses rather than displacing particles (Hahn et al. 2020;Hahn et al. 2021;Elbers et al. 2022).The simulations include subgrid models for unresolved physical processes whose importance is widely accepted.Radiative cooling is calculated element-by-element while accounting for self-shielding (Ploeckinger & Schaye 2020).Star formation is implemented using a pressure law that reproduces the observed Kennicutt-Schmidt law (Schaye & Dalla Vecchia 2008).Stellar mass loss is tracked element-by-element for winds from AGB and massive stars as well as core collapse and Type Ia supernovae (Wiersma et al. 2009b;Schaye et al. 2015).Energy feedback from star formation is injected stochastically and isotropically in kinetic form (Dalla Vecchia & Schaye 2008;Chaikin et al. 2022b,a).BHs are seeded, repositioned down the gravitational potential gradient, and can grow through mergers and gas accretion (Springel et al. 2005;Booth & Schaye 2009;Bahé et al. 2022).Most models use thermal, isotropic AGN feedback (Booth & Schaye 2009), but two models instead use jet-like kinetic AGN feedback (Huško et al. 2022).
The subgrid models for BH accretion and for stellar and AGN feedback are calibrated to the observed  = 0 SMF, gas mass fractions inside  500c for clusters at  ≈ 0.1 − 0.3 from a combination of X-ray and weak lensing data, and the  = 0 relation between BH mass and stellar mass.Contrary to common practice, the calibration (to the SMF and cluster gas fractions) is not performed by trial and error, but using machine learning (Kugel et al. 2023).In particular, for each resolution and each observable a Gaussian process emulator is trained on a 32-node Latin hypercube consisting of simulations with the target resolution but much smaller volumes (which limits the maximum cluster mass used for the calibration to log 10  500c /M ⊙ = 13.7,14.4, and 14.5 for m8, m9 and m10, respectively).Four subgrid parameters are varied: the amount of supernova energy, the target velocity for kinetic stellar feedback, the AGN heating temperature or jet velocity, and the density dependence of the BH accretion rate (at m10 resolution we do not need stellar feedback and vary only the two BH parameters).
Another novelty is that the calibration accounts for expected observational errors and biases.We impose random errors on the simulated stellar masses to account for Eddington bias.During the calibration of the fiducial intermediate-resolution model we fit for systematic errors in the SMF due to cosmic variance, bias in the inferred stellar mass, and for hydrostatic mass bias in the cluster gas fractions inferred from X-ray observations.The best-fitting bias factors, which are negligible for cosmic variance and stellar mass, and which is consistent with the literature for the hydrostatic mass bias, are then applied to the calibration data for all resolutions and models.
The emulators are not only used to design simulations that reproduce the observations, but also to create models in which the SMF and/or cluster gas fractions are shifted to higher/lower values.This allows us to specify model variations in terms of the number of  by which they deviate from the calibration data, which is more intuitive and useful than specifying simulations solely by the values of subgrid parameters that are not directly observable.FLAMINGO includes four models in which cluster gas fractions are varied (by +2, −2, −4 and −8, respectively) while keeping the SMF unchanged, one model in which the SMF is reduced by decreasing the stellar masses by the expected systematic error (0.14 dex; Behroozi et al. 2019) while keeping gas fractions fixed, and two models that simultaneously vary the gas fractions and the SMF.Finally, two models use jet-like AGN feedback rather than the fiducial isotropic and thermal feedback, one of which is calibrated to the fiducial data and one to gas fractions shifted down by 4.Comparison of these last two models with the corresponding fiducial ones enables estimates of the uncertainty due to differences in the implementation of AGN feedback for a common calibration.
The flagship runs and galaxy formation variations all assume the cosmological parameters from the Dark Energy Survey year three (3x2pt plus external constraints; Abbott et al. 2022) for a spatially flat universe and the minimum allowed summed neutrino mass of    2 = 0.06 eV (one massive and two massless species).In addition, FLAMINGO includes three runs based on the Planck Collaboration et al. (2020) cosmology, one with    2 = 0.06 eV and two with    2 = 0.24 eV.Finally, one model is motivated by the preference of many LSS surveys for a lower amplitude of the power spectrum than inferred from the CMB (Amon et al. 2023).
The fiducial models reproduce the calibration data, i.e. the  = 0 SMF (down to log 10  * /M ⊙ = 8.7, 9.9 and 11.2 for m8, m9, and m10, respectively; Fig. 8), the gas fractions of  ≈ 0.1 − 0.3 clusters (Fig. 10) and the  = 0 BH mass-stellar mass relation (Fig. 12), within the mass ranges for which the results are converged with the resolution and box size.The same holds for all cosmology variations using the fiducial galaxy formation parameters, which implies that the changes in cosmology did not necessitate re-calibration.Similarly, the galaxy formation variations calibrated to perturbed data yield good fits to their own calibration targets.An exception is the SMF at  * ≳ 10 12 M ⊙ where we find large differences between the different simulations and with the calibration data.However, in this regime the mass is sensitive to the aperture within which it is measured (we apply a 50 kpc 3D aperture to the simulations) and the treatment of the intracluster light and we therefore ignored masses exceeding 10 11.5 M ⊙ when calibrating to the observed SMF.This systematic uncertainty also complicates comparison with the observed stellar mass fractions in clusters (Fig. 11).For cluster gas fractions the agreement with the data extends to higher masses than considered during the calibration, e.g. to more than an order of magnitude higher cluster masses for the high-resolution model.
Although the resolution of the FLAMINGO simulations is too low for detailed studies of galaxy structure and evolution, except perhaps for massive galaxies, we did compare the simulations to a number of observables that characterize galaxy properties.The simulations reproduce the observed cosmic star formation history (Fig. 13).In the stellar mass range for which we find convergence (which, depending on the property, can begin at higher masses than for the SMF) there is generally good agreement with the observations for sSFR, passive fraction and metallicity.The exception is the passive fraction at  * ≳ 10 12 M ⊙ , where the simulations predict an increasing fraction of active galaxies with increasing mass that is not observed.The Jet models look better in this respect, though they also do not quench star formation sufficiently in very massive galaxies.As expected given the relatively low resolution, galaxy sizes are generally overestimated, except for the high-resolution simulation at high mass ( * ≳ 10 11 M ⊙ ) (see Fig. 14).
We compared the simulation predictions to observations of a number of low-redshift cluster scaling relations, namely the relations between X-ray luminosity and temperature, X-ray luminosity and halo mass, X-ray temperature and halo mass, and SZE Compton Y parameter and halo mass (Fig. 15).The simulation predictions are converged and the fiducial model is in excellent agreement with the data.The X-ray relations are sensitive to the gas fractions (Fig. 16), but not to the investigated variations in the cosmology, SMF, and AGN feedback implementation.More detailed comparisons, including profiles, will be presented in a future study.
As a first test of the predicted large-scale distributions of matter and hot gas, we investigated the cross-correlation of the thermal SZE and CMB lensing convergence signals.The simulation predictions are converged, but on large scales cosmic variance becomes significant, as evidenced by the scatter between the past lightcones of different observers.The results differ more for the cosmology than for the galaxy formation variations.Higher cluster gas fractions result in an increase of the cross spectrum for ℓ > 600 but a decrease on larger scales.We found good agreement with the data, which are slightly better fit by the low  8 and high neutrino mass cosmologies (Fig. 17).
We provided two examples of applications relevant for observational cosmology: the suppression due to baryonic effects of the halo mass function and the matter power spectrum, both at  = 0. Except at masses  200m ∼ 10 12 M ⊙ , where galaxy formation is most efficient, the HMF in the hydro simulations is suppressed relative to that in the corresponding DMO simulation.For low-mass clusters ( 200m ∼ 10 13 M ⊙ ) the mass function is reduced by ≈ 20 per cent.At higher masses the baryonic effects are weaker and for the models with the fiducial gas fractions the difference in the mass function is smaller than 5 per cent for  200m ∼ 10 15 M ⊙ (Fig. 20).In the mass range of clusters, the main factor determining the suppression is the gas fraction.The observational uncertainty on this quantity translates into a ∼ 10 per cent uncertainty on the HMF at 10 14 M ⊙ .
The matter power spectrum is suppressed for 1 ≲  ≲ 10 ℎ Mpc −1 , mainly because gas is distributed more smoothly than CDM on these scales, and boosted for  ≳ 10 2 ℎ Mpc −1 , mainly due to stars (Fig. 21).The reduction in power peaks at  ∼ 10 ℎ Mpc −1 , where it exceeds 10 per cent in all models, and remains greater than 1 per cent down to at least  = 1 ℎ Mpc −1 even for our model with too high gas fractions (Fig. 22).On large scales the baryonic suppression in the fiducial model is smaller than for the fiducial BAHAMAS simulation, but all these simulations follow the Van Daalen et al. (2020) relation between the reduction in power for  = 1 ℎ Mpc −1 and the mean baryon fraction in haloes of mass  500c = 10 14 M ⊙ (Fig. 23).
Together with Kugel et al. (2023), where we describe the calibration of the model using Gaussian process emulation, this paper serves to document the methods used for the FLAMINGO suite of simulations.In addition, we have provided an overview of basic results for galaxies and clusters, and a preview of some of its applications to observational cosmology.Upcoming papers will investigate cluster selection effects, thermodynamic profiles of clusters, galaxy clustering, the effect of massive neutrinos on LSS, and the consistency between LSS and the primary CMB (the so-called  8 tension).There are many more potential applications, including, for example, the validation and improvement of methods to correct DMO simulations for baryonic effects.Thanks to the availability of multiple resolutions and box sizes, and the enormous data volume, we anticipate that the simulations may also prove useful for machine learning applications.
In the future we intend to use the FLAMINGO galaxy formation model and calibration strategy to run a new suite of simulations that will enable emulation of LSS observables as a function of both cosmological parameters and baryonic effects.This will enable further investigation of the interplay between baryonic effects and cosmology.Moreover, it will allow the application of observational constraints, such as the cluster gas fractions used in this work, during the inference of cosmological parameters from LSS data.This approach has the potential to reduce the uncertainty in the magnitude of baryonic effects and their impact on precision cosmology.Some additional information as well as visualizations can be found on the FLAMINGO website. 22Table A1.Number of observer positions and the maximum redshifts at which lightcone particle and HEALPix map outputs are generated in each simulation.From left to right the particle types are dark matter (DM), neutrinos, black holes (BH), stars and gas.Filtered gas particles are those satisfying the density and temperature criteria described in appendix A1.2.In simulations with two lightcones the individual particles are only output for the first lightcone.need to output any periodic copy of a particle which crosses the observer's lightcone.We therefore generate a list of all periodic copies of the simulation volume that overlap the shell around the observer.Then, whenever a particle is drifted during the time step, we iterate over the periodic copies in the list and check whether that periodic copy of the particle crossed the observer's lightcone during the drift operation 23 .If so, the particle's position is interpolated to the redshift at which it crossed the lightcone and the particle is added to a buffer.At the end of each step we check whether the size of the buffer exceeds a specified threshold size and if it does, we write out the particles.

A1.2 Redshift limits
Table A1 gives the maximum redshifts at which we output particles of each type for each simulation.There are two redshift thresholds for the output of gas particles crossing the lightcone.All gas particles are output at redshifts less than the limit shown in the 'All gas' column in the table.Gas particles which have both temperature  > 10 5 K and hydrogen number density  H > 10 −6 (1 + ) 4 cm −3 are output at redshifts below the limit shown in the 'Filtered gas' column.
23 As an additional optimization, we take advantage of the way that Swift stores particles in a cubic grid of cells.Before the particles in a particular cell are drifted, we take the list of periodic replications of the volume computed at the start of the time step and find the subset of those replications in which particles in the current cell may cross the lightcone.Then, when drifting a particle, we only iterate over this subset of replications rather than over the full list.
These temperature and density thresholds were chosen based on experimentation with X-ray observables.
The redshift limits for dark matter, neutrino and unfiltered gas particles of  = 0.25 and  = 0.78 approximately correspond to the simulation box size in the 1 Gpc and 2.8 Gpc boxes, respectively.Gas particles that contribute significantly to X-ray (and SZE) observations are stored up to  = 0.5 because the output is not prohibitively large and it enables the generation of multi-frequency X-ray emission lightcones in post-processing.BH particles are stored at redshifts below  = 15 because the output size is modest and they are useful for the construction of halo lightcone catalogues in post-processing.

A2 HEALPix maps
Lightcone particle outputs rapidly grow in size as the upper redshift limit is increased and can become impractical to store.We therefore also implement a scheme to store spherical maps of arbitrary quantities on the lightcone with user specified angular resolution and redshift bins.
The observer's past lightcone is split into a set of concentric spherical shells in comoving distance.For each shell we create one full sky HEALPix (Gorski et al. 2005) map for each quantity to be recorded.Whenever a particle is found to have crossed the lightcone according to the criteria above, we determine which shell it lies in at the time of crossing and accumulate the particle's contributions to the HEALPix maps for that shell 24 .
to the map, where  g is the particle's mass, Ω pixel is the solid angle of a HEALPix pixel and  A is the angular diameter distance to the observer.As for the X-ray emission, gas particles whose temperatures are affected by recent, direct AGN feedback are excluded.
We also construct smoothed maps of the Doppler  parameter of the kinematic SZE.When a gas particle crosses the lightcone the dimensionless quantity which ought to be accumulated to the map is: where  r is the particle's radial velocity relative to the observer.Due to a bug in our implementation, an extra factor of  was introduced.This has been approximately corrected by dividing each map by the expansion factor at the shell mid point.Note that if necessary the maps can be corrected precisely using the particle lightcone outputs, at least for the observers and redshifts for which such data was stored (see Table A1).Again, particles that have recently received AGN feedback energy are excluded using the same criteria as described for X-ray emission.Finally, we compute a smoothed map of the dispersion measure (DM).Each time a gas particle crosses the lightcone the following quantity ought to be accumulated to the HEALPix map for the appropriate shell: where  is the expansion factor at which the particle crossed the lightcone.However, due to a bug, the  factor was missing in our implementation and so we have approximately corrected the dispersion measure maps by multiplying each map by the expansion factor at the mid point of the shell.Note that if necessary the maps can be corrected precisely using the particle lightcone outputs, at least for the observers and redshifts for which such data was stored (see Table A1).Gas particles recently heated by AGN are excluded from the dispersion measure maps in the same way as for the X-ray maps.

APPENDIX B: THE CHOICE OF THE LINEAR PHASES FOR THE INITIAL CONDITIONS
The Gaussian phases for all the FLAMINGO volumes use a newer version of the Panphasia hierarchical Gaussian White Noise field than was described and published in Jenkins (2013).There are two main changes: (i) a larger set of polynomials, completed to sixth order, is used instead of the S8 scheme; (ii) a faster and more flexible pseudo-random generator, ThreeFry4x64 (Salmon et al. 2011), is used instead of a multiple linear congruence generator.
The text descriptors in Table A3 specify the phases for the different FLAMINGO volumes and, in principle, define the phases for all future possible zoom simulations of these volumes.The public version of monofonIC (Hahn et al. 2020) and the version of monofonIC we used to create the Flamingo initial conditions, are both able to use the new Panphasia field descriptors to set up the phases.
The initial conditions for the FLAMINGO simulations represent a significant advance in accuracy for cosmological initial conditions with CDM, baryons and neutrinos.To our knowledge, there is currently no code that is able to make zoom initial conditions to the same degree of accuracy as our cosmological volumes when including all three of these components.DMO zoom initial conditions can be generated for the FLAMINGO volumes with the latest version of the IC_Gen code (Jenkins 2013).
This paper has been typeset from a T E X/L A T E X file prepared by the author.

Figure 1 .
Figure 1.A projection through a 40 Mpc thick slice through the fiducial, intermediate-resolution simulation with box side length 2.8 Gpc (run L2p8_m9 in Table 2) at  = 0.The luminosity of the background image gives the CDM surface density whilst the colour encodes the surface density of massive neutrinos, both on a logarithmic scale (see Fig. 4 for a side-by-side comparison of images of the CDM and neutrino surface densities with color bars for each).The insets show three consecutive zooms centred on the most massive halo (total mass  200m = 6.7 × 10 15 M ⊙ ).First inset: a projection of a 200 × 200 × 40 Mpc 3 sub-volume containing the cluster, showing the mass-weighted gas temperature along the line of sight.Second inset: CDM surface density in a 40 × 40 × 20 Mpc 3 region.Final inset: X-ray surface brightness in the 0.5-2 keV band in a 20 × 20 × 20 Mpc 3 region, computed from the  = 0 snapshot but placing the cluster at  = 0.025.

Figure 4 .Figure 5 .Figure 6 .
Figure 4. Comparison of the CDM (left panel) and neutrino (right panel) surface density in a 20 Mpc thick slice through the  = 0 snapshot of the L2p8_m9 simulation.Note that the dynamic range covered by the color bar is much smaller in the right panel.On scales of ≲ 10 2 Mpc the neutrino distribution is much smoother than that of the CDM.

Figure 7 .
Figure 7. Full sky maps of the thermal Sunyaev-Zel'dovich effect as quantified by the Compton  parameter (eq.8) for different redshift intervals.The first three maps show the shells from  = 0 − 0.05,  = 0.05 − 0.1 and  = 0.1 − 0.15, while the fourth one in the back shows a much larger redshift range,  = 0.15 − 5.0, and hence gives higher values.The map on the bottom right shows the total integrated Compton  from  = 0 to 5, i.e. the sum of the four maps shown above it.

Figure 8 .
Figure8.The  = 0 galaxy stellar mass function for the fiducial galaxy formation model and cosmology with different resolutions and box sizes (left) and the model/cosmology variations in (1 Gpc) 3 volumes at m9 resolution (right; note the different -axis ranges).The simulation results are compared with observations from the GAMA survey fromDriver et al. (2022), shifted by the best-fitting systematic cosmic variance and stellar mass biases, which are however both negligible (−0.0022 dex and +0.026 dex, respectively).In the bottom panels, the simulation SMFs have been divided by a spline fit through the observations in order to reduce the dynamic range and thus facilitate the comparison with the data.The specifications of the simulations can be found in Tables2 and 4. The parts of the SMFs below the resolution-dependent lower mass limits for the calibration are displayed using dotted line styles.The upper mass limit for the calibration is always the same and indicated by the vertical dashed line in all panels.The error bars labelled 'Sys.err' indicate the ±1 systematic errors due to cosmic variance (vertical error bar; negligible) and uncertainty in the stellar mass determination for a fixed IMF (0.14 dex; horizontal error bar).As in all figures, the simulation SMFs include the random lognormal scatter in the stellar mass that is expected to be present in the data (0.3 dex).Except for models M*− , which were calibrated to the observed SMF shifted to 0.14 dex lower masses, all the m10-and m9-resolution models are consistent with the data, while the differences for the m8 resolution are only ≈ 0.1 dex.For reference, the grey dot-dashed curves show the MTNG simulation, which underestimates the number densities by ≈ 0.3 dex, and the grey dashed curves show the BAHAMAS simulation, which agrees very well with the FLAMINGO models that have the same resolution (m9).

Figure 9 .
Figure9.Median stellar mass as a function of halo mass for central galaxies at  = 0 (solid curves in the top panels).The shaded regions (left panel only) show the 16th to 84th percentile scatter.As in Fig.8, the left panels show the simulations using the fiducial galaxy formation model and the fiducial cosmology, while the right panels show the model variations (which all use the same box size and resolution as L1_m9), as indicated in the legend, and the error bar labelled 'Sys.err' indicates the expected systematic error on the observed galaxy mass for a fixed IMF (0.14 dex).The horizontal dashed lines indicate the upper stellar mass limit for the calibration.Median curves are dotted below the resolution-dependent lower mass calibration limit.For comparison, the data points show the stellar-mass-halo-mass relation inferred byBehroozi et al. (2019) using the semi-empirical model 'UniverseMachine'.The bottom panels show (log 10 of) the ratio of the median relations and a spline fit through the data points.For consistency withBehroozi et al. (2019), we adopted the definition of halo mass fromBryan & Norman (1998).The sharp cut offs at low halo masses are due to resolution effects.

Figure 10 .
Figure10.The median gas mass fraction within  500c as a function of halo mass ( 500c ) for clusters at  = 0.1.The shaded region in the left panel shows the 16th to 84th percentile scatter for model L2p8_m9.As in Fig.8, the left panel shows the simulations using the fiducial galaxy formation model and the fiducial cosmology, while the right panel shows the model variations, as indicated in the legend.The vertical dashed line indicates the lower (mass limit for the calibration.The resolution-dependent upper mass limits for the calibration are indicated by the coloured downward pointing arrows.The rightward pointing arrow labelled 'HSE bias' indicates the correction for hydrostatic bias that has been applied to the observed X-ray data.For reference, the horizontal dotted lines indicate the universal baryon fraction.The simulations are compared with the data used for the calibration: the compilation of  ≈ 0.1 X-ray data fromKugel et al. (2023) and the  ≈ 0.3 weak lensing (plus X-ray) data fromAkino et al. (2022),Mulroy et al. (2019), andHoekstra et al. (2015).For comparison, the grey dot-dashed curves show the MTNG simulation (at  = 0), which predicts too high gas fractions, and the grey dashed curves show the fiducial BAHAMAS simulation (at  = 0), which is closest to model fgas−2.

Figure 11 .
Figure11.As Fig.10, but showing the median stellar (top panels) and baryonic (bottom panels) mass fractions of  = 0.1 clusters.The solid curves in the top panels, as well as the grey dashed curves showing the BAHAMAS simulation, show the true median total stellar mass fractions, while the dotted curves show the fractions we obtain if we only include the mass that is within our fiducial apertures of 50 kpc (summed over all galaxies).The simulation results are compared with observations fromZhang et al. (2011),Gonzalez et al. (2013),Kravtsov et al. (2018), andAkino et al. (2022), where the total and brightest cluster galaxy stellar masses from the last study have been increased by factors of 1.15 and 1.30 to account for blue galaxies and intracluster light, respectively.Observed stellar masses have been corrected to our Chabrier IMF followingChiu et al. (2018).Our fiducial correction for hydrostatic mass bias has been applied to the observed data points.While the star fractions appear higher than observed, they are highly sensitive to the apertures within which the galaxy masses are measured.

Figure 12 .
Figure12.Median mass of the most massive black hole in the galaxy versus the galaxy's stellar mass at  = 0.The left panel shows the simulations using the fiducial galaxy formation model and cosmology, while the right panel shows the model variations (which all use the same box size and resolution as L1_m9), as indicated in the legend.The curves become dotted below the minimum stellar mass used for the calibration of the galaxy mass function and stop at the stellar mass corresponding to the initial mass of a single baryonic particle.The shaded region in the left panel shows the 16th to 84th percentile scatter for model L2p8_m9.The grey dot-dashed and grey dashed curves show the MTNG and BAHAMAS simulations, respectively.The arrows in the left panel indicate the seed BH masses (which are identical for m8 and m9).The simulations are compared with data fromGraham & Sahu (2023) for elliptical (E), spiral (S) and SO galaxies.There is good agreement with the data if we consider that most galaxies with masses < 10 11 M ⊙ are disky.

Figure 14 .
Figure 14.Similar to Fig.12, but showing a selection of median galaxy properties as a function of stellar mass.All properties are shown at  = 0.1 with the exception of the passive fractions, which are at  = 0. From top to bottom the rows show specific SFR of active galaxies, passive fraction (i.e. the fraction of galaxies with sSFR < 10 −2 Gyr −1 ), stellar metallicity, and projected stellar half-mass radii for active and passive galaxies, respectively.Solid/dotted curves are computed using all particles bound to the subhalo and inside our fiducial 3D apertures of radius 50 kpc, except for half-mass radii which are projected and computed within 2D apertures of radius 50 kpc.The dashed curves in the panels showing galaxy sizes are computed using 2D apertures of radius 100 kpc.As indicated in the legends, the simulations are compared with observed  ≈ 0.1 sSFRs fromBauer et al. (2013),  ≈ 0 passive fractions compiled byBehroozi et al. (2019),  ≈ 0.1 stellar metallicities fromGallazzi et al. (2005, error  bars indicate the scatter), and  ≈ 0.1 galaxy sizes fromVan der Wel et al. (2014);Lange et al. (2015).Many of the galaxy properties are sensitive to numerical resolution.Over the mass ranges where the simulation results are resolved, which depends on the property and the resolution, the agreement with the data is generally good.However, the passive fractions are underestimated for the highest masses ( * ≳ 10 12 M ⊙ ) and the sizes are slightly too large.

Figure 15 .
Figure 15.Cluster scaling relations at  = 0 for the simulations using the fiducial cosmology and the fiducial galaxy formation model but different resolutions and box sizes.The solid lines are the median relations between X-ray luminosity and temperature (top left), X-ray luminosity and halo mass (top right), temperature and halo mass (bottom right) and SZE Compton  and halo mass (bottom left).Luminosities, temperature and masses are measured within  500c while Compton  is measured within 5 500c .All haloes with  500c ≥ 10 13 M ⊙ are included and the luminosity -temperature relations are shown down to the median luminosity for haloes of this minimum mass.The shaded region indicates the 16th to 84th percentile scatter for L2p8_m9.The X-ray luminosities are for the 0.5-2 keV band and the temperatures are mass-weighted.As indicated in the legends, data points show X-ray data (medians and 16th-84th percentiles) fromPratt  et al. (2009, 31  clusters at 0.08 <  < 0.15), Lovisari et al. (2015, 23 clusters at  < 0.035), Lovisari et al. (2020, 120 clusters at 0.059 <  < 0.546), Gaspari et al. (2019, 85 clusters at  < 0.04), and Migkas et al. (2020, 313 clusters at  < 0.3), and SZE data for 616  < 0.25 clusters from Planck Collaboration et al. (2016) compiled by McCarthy et al. (2017) (we multiply by  () −2/3 , where  () ≡  ()/ 0 , to correct the data to  = 0 assuming self-similar evolution).Observed X-ray luminosities have been scaled to  = 0 and the 0.5-2 keV band using PIMMS(Mukai 1993).The arrow labelled 'HSE bias' indicates the correction for hydrostatic mass bias that has been applied to the observed X-ray data.The grey dashed lines show the results from the BAHAMAS simulation(McCarthy et al. 2017).The simulations are converged with box size and resolution (except for the low-resolution L1_m10 at the lowest masses) and the agreement with the data is excellent.

Figure 17 .
Figure17.The thermal SZE () -CMB lensing convergence () cross-spectrum.Thermal SZE and CMB lensing convergence maps are constructed from full-sky lightcone-based HEALPix maps (downsampled to  side = 4096 for the present test) produced on-the-fly while running the simulations.Cross-spectra are computed using the NaMaster pseudo- ℓ package.In the top left panel, the shaded region represents the scatter between the 8 independent lightcones (i.e. 8 different observer locations in the simulation volume) for the L2p8_m9 simulation, while the dashed curve represents the cross-spectrum derived by integrating the 3D matter-electron pressure cross-spectrum along the line of sight from 3D power spectra produced on-the-fly by employing the Limber approximation.The - cross-spectrum is well converged with the numerical resolution at fixed box size, but comparison of the two intermediate-resolution simulations shows that the 1 Gpc volume misses a small amount of power relative to the 2.8 Gpc simulation (top left panel).This particular cross-spectrum is more sensitive to variations in cosmology (top right panel) than variations in feedback implementation (bottom panels), though the effect of the latter is also clearly visible.Comparison to the measurements ofHill & Spergel (2014) reveals a weak preference for a low  8 cosmology or a high neutrino mass relative to the Planck cosmology.

Figure 18 .
Figure 18.The  = 0 halo mass function for the DMO simulations using our fiducial D3A cosmology.The left axis indicates the number density, while the right axis shows the number of haloes in the L2p8_m9_DMO simulation per halo mass bin of width 0.1 dex.Arrows indicate the halo mass corresponding to 100 dark matter particles for each resolution.

Figure 19 .
Figure 19.The ratio of the halo mass functions in the DMO simulations using our fiducial D3A cosmology and that for the L2p8_m9_DMO simulation for two different halo mass definitions:  200m (top panel) and  200c (bottom panel).Grey shaded regions show deviations of ±5% and ±10%.Dotted lines show the 1 Poisson errors on the counts in each bin for L2p8_m9_DMO.For comparison, we show the predictions for our fiducial cosmology from the universal HMF fits from Tinker et al. (2008) (top panel only) and Bocquet et al. (2016, based on the DMO Magneticum simulations), from the Aemulus emulator (McClintock et al. 2019) (top panel only) and from the MiraTitan Universe emulator (Bocquet et al. 2020) (bottom panel only).
d d log 10  (), for the DMO FLAMINGO models and the halo mass definition  =  200m , where  200m is the mass inside  200m , the radius within which the mean density is 200 times the cosmic mean.The right axis shows the number of haloes per 0.1 dex mass bin in L2p8_m9_DMO.

Figure 20 .
Figure 20.The effect of baryonic physics on the  = 0 halo mass function.Each curve shows the ratio of the halo mass function in a hydrodynamical simulation and the corresponding DMO model.The different panels compare the models with different box sizes/resolutions (top left), cosmologies (top right), cluster gas fractions (bottom left), SMFs (bottom right), and AGN feedback implementations (also bottom right).For reference, model L1_m9 is repeated in all panels.Grey shaded regions indicate ±5% and ±10% deviations from L2p8_m9.For comparison, we also show the results from the Magneticum simulations (grey squares;Bocquet et al. 2016) and two cosmo-OWLS models that bracket the observed cluster gas fractions (grey upwards and downwards pointing triangles;Velliscig et al. 2014).

Figure 21 .
Figure 21.Matter power spectra as a function of wavelength  (top axis; in units of Mpc) and wavenumber  = 2 / (bottom axis; in units of ℎ Mpc −1 ) for model L2p8_m9 at  = 0.The solid black curve shows the total matter power spectrum, while the dotted curve is the linear theory prediction, which strongly underestimates the power for  > 0.1 ℎ Mpc −1 .Coloured lines show the contributions from individual species (see the legend), where solid lines indicate auto-spectra and dashed lines show cross-spectra with CDM.The downward arrow indicates the gravitational softening length.The importance of different species varies with scale.Gas and stars provide the dominant non-CDM contributions for  < 10 and  > 10 ℎ Mpc −1 , respectively.Neutrinos are always much less important and only non-negligible on very large scales.

Figure 23 .
Figure 23.The ratio of the  = 0 total matter power spectrum of hydrodynamical and DMO simulations,  hydro / DMO , at wavenumber  = 1.0 ℎ Mpc −1 as a function of the mean baryon fraction within  500c in haloes of mass  500c = 10 14 M ⊙ for different FLAMINGO simulations and simulations from the literature: cosmo-OWLS (LeBrun et al. 2014), fiducial BAHAMAS  (McCarthy et al. 2017), EAGLE(Schaye et al. 2015), Illustris(Vogelsberger et al. 2014), IllustrisTNG(Springel et al. 2018), Horizon-AGN(Dubois et al. 2014), and SIMBA(Davé et al. 2019).MillenniumTNG uses the same model as TNG300 and should therefore give nearly identical results in this plot.Note that the L2p8_m9 point falls on top of the L1_m9 point.The FLAMINGO cosmology variations are not shown as they would also be indistinguishable from L2p8_m9 (see the top-right panel of Fig.22).The dashed line shows the relation between these quantities thatVan Daalen et al. (2020) fit to the cosmo-OWLS and BAHAMAS simulations.The FLAMINGO simulations are consistent with the relation established for previous simulations to within |Δ | / ∼ 10 −2 (grey shaded region).

Table 1 .
Values of the calibrated subgrid parameters (see §2.3 for definitions of the parameters) for the fiducial model for each of the three simulation resolutions and for the feedback variations at intermediate resolution.The low-resolution simulations do not include stellar feedback.

Table 3 .
Gravity-only simulations.The columns list the simulation identifier; the comoving box side length, ; the number of CDM particles,  CDM ; the number of neutrino particles,   ; the mean CDM particle mass,  CDM ; the Plummer-equivalent comoving gravitational softening length,  com ; the maximum proper gravitational softening length,  prop ; and the assumed cosmology which is specified in Table4.Simulation L1_m9_ip_DMO is identical to L1_m9_DMO except that the phases in the initial conditions were inverted.Note that there are no hydrodynamical counterparts for L5p6_m10_DMO, L11p2_m11_DMO, PlanckNu0p12Var_DMO, and L1_m9_ip_DMO.

Table 4 .
The values of the cosmological parameters used in different simulations.The columns list the prefix used to indicate the cosmology in the simulation name (note that for brevity the prefix 'D3A' that indicates the fiducial cosmology is omitted from the simulation identifiers); the dimensionless Hubble constant, ℎ; the total matter density parameter, Ω m ; the dark energy density parameter, Ω Λ ; the baryonic matter density parameter, Ω b ; the sum of the particle masses of the neutrino species,    2 ; the amplitude of the primordial matter power spectrum,  s ; the power-law index of the primordial matter power spectrum,  s ; the amplitude of the initial power spectrum parametrized as the r.m.s.mass density fluctuation in spheres of radius 8 ℎ −1 Mpc extrapolated to  = 0 using linear theory,  8 ; the amplitude of the initial power spectrum parametrized as  8 ≡  8 √︁ Ω m /0.3; the neutrino matter density parameter, Ω     2 /(93.14 ℎ 2 eV).Note that the values of the Hubble and density parameters are given at  = 0.The values of the parameters that are listed in the last three columns have been computed from the other parameters.withstrongerthanwith weaker feedback. Tvalues of the calibrated subgrid parameters are listed in Table1.
comparison we also show predictions from the literature for our cosmology (grey open symbols) for  ( 200m ) (top panel) and  ( 200c ) (bottom panel), where  200c is the mass inside  200c , the radius within which the mean density is 200 times the critical density.We compare with the universal HMF fits from Tinker et al.