DESI mock challenge: Halo and galaxy catalogues with the bias assignment method

Context. We present a novel approach to the construction of mock galaxy catalogues for large-scale structure analysis based on the distribution of dark matter halos obtained with effective bias models at the field level.
Aims. We aim to produce mock galaxy catalogues capable of generating accurate covariance matrices for a number of cosmological probes that are expected to be measured in current and forthcoming galaxy redshift surveys (e.g. two- and three-point statistics). The construction of the catalogues shown in this paper is part of a mock-comparison project within the Dark Energy Spectroscopic Instrument (DESI) collaboration.
Methods. We use the bias assignment method (BAM) to model the statistics of the halo distribution through a learning algorithm using a few detailed N-body simulations and approximated gravity solvers based on Lagrangian perturbation theory. We introduce cosmic-web-dependent corrections for modelling redshift-space distortions at the N-body level (in both the halo and galaxy distributions), as well as a multi-scale approach for accurate assignment of halo properties. Using specific models of halo occupation distributions to populate halos, we generate galaxy mocks with the expected number density and central-satellite fraction of emission-line galaxies, which are a key target of the DESI experiment.
Results. BAM generates mock catalogues with per cent accuracy in a number of summary statistics, such as the abundance and the two- and three-point statistics of halo distributions, both in real and redshift space. In particular, the mock galaxy catalogues display ∼3%-10% accuracy in the multipoles of the power spectrum up to scales of k ∼ 0.4 h Mpc⁻¹. We show that covariance matrices of two- and three-point statistics obtained with BAM display a similar structure to the reference simulation.
Conclusions. BAM offers an efficient way to produce mock halo catalogues with accurate two- and three-point statistics, and is able to generate a variety of multi-tracer catalogues with precise covariance matrices of several cosmological probes. We discuss future developments of the algorithm towards mock production in DESI and other galaxy-redshift surveys.


Introduction
The cosmological volume spanned by the nearly 40 million galaxies and quasars that are to be surveyed by the Dark Energy Spectroscopic Instrument (DESI Collaboration 2016a) poses unprecedented challenges for both theoretical and numerical cosmology. DESI is a robotic, fibre-fed, highly multiplexed spectroscopic surveyor operating on the Mayall 4 m telescope at Kitt Peak National Observatory (DESI Collaboration 2022). It can obtain simultaneous spectra of almost 5000 objects over a ∼3° field (DESI Collaboration 2016b; Silber et al. 2023; Miller et al., in prep.), and is currently conducting a five-year survey covering nearly one-third of the sky. DESI uses multiple supporting software pipelines and products, including significant imaging from the DESI Legacy Imaging Surveys (Zou et al. 2017; Dey et al. 2019; Schlegel et al., in prep.) as well as an extensive spectroscopic reduction pipeline (Guy et al., in prep.), a template-fitting pipeline to derive classifications and redshifts for each targeted source (Bailey et al., in prep.), a pipeline to assign fibres to targets (Raichoor et al., in prep.), a pipeline to tile the survey and to plan and optimise observations as the campaign progresses (Schlafly et al., in prep.), and a pipeline to select targets for spectroscopic follow-up (Myers et al. 2023). The DESI target selection relies on the public Legacy Surveys (Dey et al. 2019), with preliminary target selection details published for the MWS (Allende Prieto et al. 2020), the LRG sample (Zhou et al. 2020), BGS (Ruiz-Macias et al. 2020), ELGs (Raichoor et al. 2020), and QSOs (Yèche et al. 2020). Specific target selection approaches for DESI are varied and extensive. In particular, it is important that we mention the work describing the DESI Survey Validation (SV) phase (DESI collaboration, in prep.), two papers describing the process through which truth tables were produced via visual inspection of target spectra acquired during the SV phase and how these are used to inform target selection for the DESI Main Survey (Alexander et al. 2023; Lan et al. 2023), as well as a series of papers describing the selection of DESI bright-time and dark-time science targets (MWS, Cooper et al. 2023; BGS, Hahn et al. 2022; LRG, Zhou et al. 2023; ELG, Raichoor et al. 2023; QSO, Chaussidon et al. 2023). The Early DESI Data Release (DESI collaboration, in prep.) and the Siena Galaxy Atlas (SGA, Moustakas et al., in prep.) are forecast for 2023.
The precision of the measurements of the statistical properties of the spatial distribution and weak-lensing signals to be obtained from such an unprecedented number of tracers will shed light on the most intriguing features of the standard cosmological model; for example, the nature of dark energy (e.g. Levi et al. 2013; DESI Collaboration 2016a) and primordial non-Gaussianities (see e.g. Vargas-Magana et al. 2019; Alam et al. 2021). The accomplishment of these goals depends heavily on access to precise and accurate covariance matrices for the statistical analysis of several cosmological probes, such as clustering, weak-lensing signals, redshift-space distortions, and baryon acoustic oscillations (e.g. Dodelson & Schneider 2013; Taylor et al. 2013; Percival et al. 2014; Paz & Sánchez 2015; Pearson & Samushia 2016; Howlett & Percival 2017; Lacasa 2018; O'Connell & Eisenstein 2019).
This paper is part of a mock challenge within the DESI collaboration (see e.g. Garrison et al. 2018; Grove et al. 2022; Ding et al. 2022), which is designed to establish a road map towards the construction of mock galaxy catalogues with per cent accuracy and precision in a number of statistical properties of the spatial distribution of galaxies. In particular, this article describes the application of a calibrated approach to producing mock catalogues, the so-called bias assignment method (BAM; Balaguera-Antolínez et al. 2019).
In recent years, machine-learning techniques have made an appearance in the cosmological scenario (see e.g. Dvorkin et al. 2022, for a recent review) with a number of different goals and applications. Among others, these techniques have been used to learn the spatial distribution of dark matter tracers from a large number of detailed N-body simulations (see e.g. Villaescusa-Navarro et al. 2021; Kreisch et al. 2022; Piras et al. 2023), generate corrections to the displacement field in Lagrangian perturbation theory (e.g. He et al. 2019), increase the mass resolution of fast and computationally cheap simulations (typically characterised by low mass resolutions, e.g. Li et al. 2021; Forero-Sánchez et al. 2022), learn the galaxy-dark matter connection from hydro-simulations (e.g. Zhang et al. 2019), and provide a platform to obtain covariance matrices from fast and/or inaccurate sets of mocks (see e.g. Chartier et al. 2021; de Santi & Abramo 2022).
BAM is the latest of a class of algorithms designed to produce mock galaxy catalogues. Its unique combination of physical content and learning scheme means that it can be regarded as both a calibrated method and a physically supervised machine-learning approach to the production of mock galaxy catalogues. The method represents a step forward in precision as well as efficiency, as it has been demonstrated to provide covariance matrices of the halo power spectrum with per cent accuracy (with respect to an N-body simulation) and at low cost in terms of computing time as well as in the number of training sets (Balaguera-Antolínez et al. 2020). BAM has also been shown to be potentially useful for generating mock catalogues for Lyman-α and quasars by learning from hydrodynamic simulations (Sinigaglia et al. 2021, 2022).
In this work, we present a methodology that uses BAM to generate ensembles of halo catalogues with phase-space coordinates. The methodology implemented for constructing halo catalogues presented here improves on previous approaches by including more precise recipes for the peculiar velocities and intrinsic properties of halos (such as virial mass and velocity dispersion). Building on the set of halo catalogues, we implement a halo occupation distribution (HOD) framework (e.g. Cooray 2002; Cooray & Sheth 2002; Berlind & Weinberg 2002; Kravtsov et al. 2004) to populate these halos with galaxies, and in particular emission line galaxies (ELGs), which are a key target of the DESI galaxy redshift survey (e.g. Raichoor et al. 2020). The strategy that we envisage for BAM allows us to implement further approaches to populate dark matter halos with galaxies, such as sub-halo abundance matching (SHAM; see e.g. Vale & Ostriker 2004; Kravtsov et al. 2004; Conroy et al. 2006; Favole et al. 2016), and to generate galaxy cluster catalogues (see e.g. Cai et al. 2009; Balaguera-Antolínez et al. 2012) based on different halo properties (Hearin et al. 2016; Wechsler & Tinker 2018). This provides high flexibility when producing mock catalogues containing a number of different dark matter tracers with the same underlying dark matter density field. This is optimal for multi-tracer analyses (see e.g. Hamaus et al. 2012; Abramo & Leonard 2013; Abramo et al. 2016; Wang & Zhao 2020; Zhao et al. 2021), which are expected to be performed in many experiments. Indeed, BAM is expected to provide halo and mock galaxy catalogues for several ongoing galaxy-redshift surveys, such as DESI (Levi et al. 2013), Euclid (Amendola et al. 2018), J-PAS (Benitez et al. 2014), and the Nancy Grace Roman Space Telescope (Spergel et al. 2015).
A130, page 2 of 28 Balaguera-Antolínez, A., et al.: A&A 673, A130 (2023)
The outline of this paper is as follows. In Sect. 2, we describe the BAM approach to calibrating the halo bias. In Sect. 2.4, we describe the reference N-body simulation and the different models of halo bias used in this work. Section 2.7 describes the methodology used to learn the halo bias, while Sect. 3 is devoted to the construction of halo catalogues. In Sect. 4, we present the HOD model used to generate galaxy catalogues and describe the main statistical properties of the resulting ensemble. We end with conclusions and a list of potential developments designed to improve our method.

The halo bias in BAM
The halo bias (i.e. the link between the halo and dark matter distribution) is a quantity of paramount relevance to the understanding of halo, and subsequently galaxy, clustering, as it represents the midpoint between the distribution of light (galaxies) and the distribution of the underlying dark matter in the Universe. It is very well established that the bias of dark matter tracers needs to be modelled beyond the standard linear scale-independent scheme: scale dependencies induced by the process of halo formation and merging, the non-linear evolution of the dark matter density field (see e.g. Matsubara 1999; Sigad et al. 2000; Somerville et al. 2001; Smith et al. 2007; Zentner 2007; Tinker et al. 2010; Valageas 2011; Pollack et al. 2012; Sheth et al. 2013; Ahn et al. 2015; Pujol et al. 2017; Desjacques et al. 2018; Han et al. 2019; Nasirudin et al. 2020, and references therein), and the discrete representation of halo and matter density fields generalise the concept of halo bias to a non-local and stochastic quantity (see e.g. Fry & Gaztanaga 1993; Tegmark & Peebles 1998; Dekel & Lahav 1999; Blanton 2000; Simon 2005).
The BAM algorithm is designed to capture the aforementioned properties of the bias of dark matter tracers (halos in this case) at the field level by assuming that the number counts of dark matter halos in a cell of volume ∂V depend on a set of properties {Θ_dm} of the underlying dark matter (DM) density field evaluated on the same cells. This dependency is assumed to be represented by a probability distribution of the halo occupation number N_h conditional on a set of N_p properties of the underlying dark matter field. Accordingly, we represent the halo bias B(N_h | {Θ_dm}) as a multidimensional histogram (Eq. (1)), in which γ_ℓ ≡ [Θ^ℓ_dm − Δ_ℓ/2, Θ^ℓ_dm + Δ_ℓ/2) represents the set of bins (of width Δ_ℓ) defined for the ℓ-th property of the density field, and 1_A(x) is the indicator function, equal to 1 if x ∈ A and 0 otherwise. The quantity B carries no information on the phases of the density fields, and therefore represents a statistical target that can be learned and mapped into a different realisation of the dark matter density field. Equation (1) approximates the true underlying halo bias, as it ignores key aspects such as the effects of the mass assignment and the correlation between pairs in different property bins, among others. The impact of these effects in the measurement of the halo bias is captured (and corrected for) within the iterative process, as discussed in Sect. 2.7.
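The histogram described above can be illustrated with a short NumPy sketch. This is not the BAM implementation: the function name `measure_halo_bias`, its arguments, the cap `n_max` on the occupation number and the per-bin normalisation are all assumptions made for illustration. It counts cells in joint bins of halo occupation and dark matter properties, then normalises along the occupation axis to obtain a conditional distribution.

```python
import numpy as np

def measure_halo_bias(n_halo, properties, prop_edges, n_max):
    """Histogram of halo counts per cell, conditional on binned DM properties.

    n_halo     : integer array of halo number counts, one entry per cell
    properties : list of arrays (same shape as n_halo), one per DM property
    prop_edges : list of bin-edge arrays, one per property
    n_max      : largest occupation number tracked (hypothetical cap)
    Returns an array of shape (n_max + 1, nbins_1, ..., nbins_Np) whose
    first axis sums to 1 wherever the property bin is populated.
    """
    sample = np.column_stack([n_halo.ravel()] + [p.ravel() for p in properties])
    count_edges = np.arange(n_max + 2) - 0.5  # one bin per integer count
    hist, _ = np.histogramdd(sample, bins=[count_edges] + list(prop_edges))
    # Normalise over the occupation axis (assumed convention).
    norm = hist.sum(axis=0, keepdims=True)
    return np.divide(hist, norm, out=np.zeros_like(hist), where=norm > 0)
```

Because only counts per bin are stored, the result holds no phase information, which is what allows it to be applied to an independent density field.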

The ingredients of BAM
The BAM machinery relies on a number of ingredients, which are mainly related to properties and outputs of detailed N-body simulations. These can be enumerated as follows:
1. Initial conditions (ICs) of a reference N-body simulation. These ICs are represented by an initial Gaussian random field built at a much lower resolution than that originally used by the N-body run. A subset of these ICs corresponds to downsampled versions of the original ensemble, evolved by the N-body code to redshifts at which dark matter halo catalogues are identified and used in this analysis.
2. A set of a few dark matter halo catalogues containing phase-space coordinates as well as halo properties that will be used for the assignment of galaxies by means of, for example, an HOD prescription. These halos correspond to the ICs whose initial seeds are the same as those in the subset described in point (1).
3. An approximated gravity solver (or surrogate) that evolves, in a fast way, the low-resolution ICs to the redshift of the tracer catalogue.
Given the above set of ingredients, the generation of mock galaxy catalogues in BAM is performed in four stages:
1. Stage I: Calibration. Learning process in which the halo bias (introduced in the previous section) and the BAM kernel (introduced in Sect. 2.7) are calibrated using the two-point statistics of the reference as a target (or cost function).
2. Stage II: Halo mock production. Generation of independent halo number count fields through the sampling of independent dark matter density fields using the halo bias.
3. Stage III: Phase-space coordinates and properties. Assignment of (a) positions, (b) velocities, and (c) intrinsic properties to dark matter halos.
4. Stage IV: Galaxy catalogues. Implementation of an HOD model to populate dark matter halos with galaxies.
We cover each of these steps throughout this article. To facilitate understanding of the processes involved in the method, we depict the different steps as a flow chart in Fig. 1.
Fig. 1. Flow chart representing the different stages involved in the generation of mock galaxy catalogues with BAM described in Sect. 2.2. The process is mainly divided into two sections: learning phase and mock production. In the learning phase, a number of kernels and halo biases are calibrated from different realisations of the reference simulation and are stacked to generate one version of the kernel and bias used in the mock production phase. The different colours in the arrows indicate the different stages involved in the process (e.g. calibration, generation of independent halo number counts, assignment of halo properties, construction of galaxy catalogues).

Training set: Reference simulation and initial conditions
We use the Scinet LIghtCone Simulations (SLICS) described by Harnois-Déraps et al. (2018), which consist of an ensemble of cosmological N-body simulations run in a comoving box of L_box = 505 h⁻¹ Mpc per side, following the non-linear evolution of 1536³ particles initialised on a mesh of 3072³ points, from an initial redshift of z_ini = 120 down to z = 0.
The original set of initial conditions of this simulation consists of about 1000 realisations in the form of particle positions and velocities, which need to be converted to density fields. A fraction of this set is to be used in particle mesh codes (such as FastPM; Feng et al. 2016) as part of the mock-comparison project in DESI (Variu et al., in prep.). Accordingly, the initial density fields are obtained from the displacement field Ψ_Z(q) by reversing the Zeldovich displacement (Zel'dovich 1970) as δ^HR_IC(q) = −∇_q · Ψ_Z(q), where Ψ_Z(q) = x − q is computed from the particle positions x relative to a regular distribution with coordinates q on a N³_HR = 1536³ lattice (which we refer to as high-resolution, HR).
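The relation δ = −∇·Ψ can be sketched with a spectral divergence on a periodic mesh. The function `density_from_displacement` and its array layout are illustrative assumptions. The text does not say how the divergence is evaluated in practice, so the Fourier-space derivative below is just one possible choice.

```python
import numpy as np

def density_from_displacement(psi, box_size):
    """Return delta = -div(psi) for a periodic displacement field.

    psi      : array of shape (3, N, N, N), displacement components
    box_size : comoving side length of the box
    The divergence is evaluated spectrally: div(psi) -> i k . psi_k.
    """
    n = psi.shape[1]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    div_k = sum(1j * k * np.fft.fftn(p) for k, p in zip((kx, ky, kz), psi))
    return -np.fft.ifftn(div_k).real
```

For a single sinusoidal displacement along one axis, the function returns the corresponding cosine overdensity with the opposite sign, as expected from the minus sign in the relation.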
For applications to BAM, we adopted a resolution of N³_LR = 192³ cells and applied an ideal (real) low-pass filter in order to obtain low-resolution initial conditions. The fiducial spatial resolution yields fields represented by a regular mesh with cell volume ∂V ∼ (2.6 h⁻¹ Mpc)³ and a Nyquist frequency of ∼1.2 h Mpc⁻¹. For comparison against the reference simulation, and according to DESI scientific requirements, we adopt a maximum wavenumber of ∼0.4 h Mpc⁻¹, which amounts to ∼30% of the Nyquist frequency. At those scales, mass assignment effects inherent to the interpolation of halos in a mesh are expected to be negligible (see e.g. Jing 2005).
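The quoted cell size and Nyquist frequency follow from the box size and mesh resolution. The snippet below reproduces the arithmetic, taking the Nyquist frequency as π divided by the cell size; the variable names are illustrative.

```python
import math

L_BOX = 505.0   # box side in Mpc/h (from the text)
N_LR = 192      # cells per side of the low-resolution mesh (from the text)
K_MAX = 0.4     # maximum wavenumber of interest in h/Mpc (from the text)

cell_size = L_BOX / N_LR           # comoving cell side in Mpc/h
k_nyquist = math.pi / cell_size    # Nyquist frequency in h/Mpc
fraction = K_MAX / k_nyquist       # k_max in units of the Nyquist frequency

print(f"cell size     = {cell_size:.2f} Mpc/h")
print(f"Nyquist       = {k_nyquist:.2f} h/Mpc")
print(f"k_max/Nyquist = {fraction:.2f}")
```

The ratio comes out at about one-third, in line with the ∼30% quoted above.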

The reference halo catalogues
The corresponding halo catalogues from the SLICS consist of a set of virialised objects identified at z = 1.04 with a spherical overdensity algorithm (see e.g. Harnois-Déraps et al. 2013). We used a set of 80 realisations of halo catalogues to assess the accuracy of our mocks in terms of two- and three-point statistics. A subset (of a maximum of 27 randomly selected references) was also used as part of the training set from which BAM learnt the halo bias (described in Sect. 2.7). The mass resolution of the SLICS is ∼2.8 × 10⁹ M⊙ h⁻¹. We selected dark matter halos with masses above 2 × 10¹¹ M⊙ h⁻¹, which agrees with the expected mass cut at which dark matter halos can host ELGs (see e.g. Alam et al. 2020). Number counts are generated over a mesh with our fiducial resolution of 192³ cells using the nearest-grid-point mass assignment (Hockney & Eastwood 1988). Although the halo-finder algorithm allows the determination of different halo properties (e.g. mass, spin, concentration, velocity dispersion), the current set of reference catalogues involves only the virial mass and the velocity dispersion, along with halo coordinates and peculiar velocities, obtained from the position of the density used to identify each halo. These quantities are sufficient to apply an HOD prescription and populate halos with central and satellite galaxies. We note that the BAM method can be applied to reference halo catalogues built with different halo finders (see e.g. Balaguera-Antolínez et al. 2020).
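Nearest-grid-point assignment places each halo in the single cell that contains it. A minimal sketch follows; the helper `ngp_counts` is hypothetical, and it assumes periodic coordinates in [0, L_box) with cell centres at half-integer multiples of the cell size.

```python
import numpy as np

def ngp_counts(positions, box_size, n_grid):
    """Nearest-grid-point number counts on an n_grid^3 periodic mesh.

    positions : array of shape (N_halos, 3), comoving coordinates
    """
    # Index of the cell containing each halo, wrapped periodically.
    idx = np.floor(positions / box_size * n_grid).astype(int) % n_grid
    counts = np.zeros((n_grid,) * 3, dtype=np.int64)
    # np.add.at accumulates correctly when several halos share a cell.
    np.add.at(counts, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return counts
```

With `box_size=505.0` and `n_grid=192`, the sum of the returned mesh equals the number of halos in the catalogue.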

Fast gravity solver
BAM relies on a combined Lagrangian and Eulerian perturbation theory approach, dubbed augmented Lagrangian perturbation theory (ALPT; see Kitaura & Hess 2013; Kitaura et al. 2014), to map the initial conditions represented by Lagrangian coordinates q (regularly spaced points at the redshift z_ini) into final (Eulerian) comoving coordinates r(z) via r(z) = q + Ψ(q, z), where Ψ(q, z) represents the displacement field. This displacement is assumed to be split into long- and short-range components, Ψ(q, z) = Ψ_short(q, z) + Ψ_long(q, z). ALPT implements the displacement field from second-order Lagrangian perturbation theory (2LPT) to model the large-scale (long-range) displacement (see e.g. Buchert & Ehlers 1993; Bouchet et al. 1995; Bernardeau et al. 2002), as given by Eqs. (2) and (3), where D^(1)(z) is the growth factor (see e.g. Heath 1977; Bouchet et al. 1995). The potentials φ^(i)(q) are the solutions of the Poisson equations ∇²_q φ^(i) = δ^(i), where i = 1 corresponds to the linear density obtained in Sect. 2.3, and where we use the notation ∂_ij φ ≡ ∂²φ(q)/∂q_i ∂q_j. Equation (3) shows how 2LPT takes into account the Hessian of the initial gravitational potential, and is therefore expected to develop the main features of the cosmic web on large scales. However, given that 2LPT is not accurate on small scales (see e.g. Kitaura & Hess 2013), its displacement is filtered with a Gaussian kernel G_s(q), as Ψ_long(q, z) = Ψ_2LPT(q, z) ⊗ G_s(q), with a smoothing scale of r_s = 20 h⁻¹ Mpc. While ALPT models the large scales using Lagrangian perturbation theory, it relies on Eulerian perturbation theory to model the small-scale clustering signal. In particular, the short-range displacement is written as Ψ_short(q, z) = (1 − G_s(q)) ⊗ Ψ_sc(q, z), where the displacement Ψ_sc(q, z) is derived within the spherical collapse (SC) approximation (see e.g. Bernardeau 1994; Bernardeau et al. 2002), with ψ_sc(q, z) = ∇ · Ψ_sc(q, z), where ψ_sc(q, z) is the solution to the Poisson-like equation (see e.g. Mohayaee et al. 2006; Neyrinck 2013) given in Eq. (4). The regular Lagrangian coordinates are then mapped into Eulerian coordinates using the total displacement:
r(z) = q + Ψ_long(q, z) + Ψ_short(q, z). (5)
With the dark matter particles evolved, a cloud-in-cell mass-assignment scheme is implemented to generate an approximated dark matter density field (A-DMDF) on the fiducial mesh. To improve the description of the non-linear dark matter field with a low number of particles, we implement the phase-space mapping technique (Abel et al. 2012; Hahn et al. 2013).
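For orientation, the standard expressions for the 2LPT displacement, its second-order source term and the spherical-collapse divergence, as commonly quoted in the ALPT literature (e.g. Kitaura & Hess 2013), are collected below. They are a reconstruction under that assumption, not a transcription: the sign conventions, the fitting formula for the second-order growth factor and the notation may differ from those of Eqs. (2)-(4) in this article.

```latex
\begin{align}
\Psi_{\rm 2LPT}(q,z) &= -D^{(1)}(z)\,\nabla_q\phi^{(1)}(q)
                        + D^{(2)}(z)\,\nabla_q\phi^{(2)}(q),
  \qquad D^{(2)} \simeq -\tfrac{3}{7}\,\Omega_{\rm m}^{-1/143}\,\bigl[D^{(1)}\bigr]^{2},\\
\delta^{(2)}(q) &= \sum_{i>j}\Bigl[\partial_{ii}\phi^{(1)}\,\partial_{jj}\phi^{(1)}
                   - \bigl(\partial_{ij}\phi^{(1)}\bigr)^{2}\Bigr],\\
\nabla_q\cdot\Psi_{\rm sc}(q,z) &= 3\left[\left(1-\tfrac{2}{3}\,D^{(1)}(z)\,
                   \delta^{(1)}(q)\right)^{1/2}-1\right].
\end{align}
```

In this form, the second line makes explicit the dependence on the Hessian of the linear potential, and the third is the local relation whose divergence is split off by the complementary filter (1 − G_s).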
We note that the method can in principle implement any approximated gravity solver (with the correct large-scale clustering signal), given that the BAM kernel is meant to correct for missing power towards small scales, with the aim being to generate the correct tracer power spectrum through a learning procedure (further discussed in Sect. 2.7). Nevertheless, because we use the positions and velocities (see Sect. 3.3) of the dark matter particles computed with such approximated methods, and because the tidal field (see Eq. (3)) is a key ingredient of the method, the desired trade-off between precision, speed, and physical content means that we favour 2LPT or ALPT over, for example, the Zeldovich approximation (see White 2014, for a review on the Zeldovich approximation). Further developments designed to improve the precision without a significant increase in required computing time were recently presented by Kitaura et al. (2023) and will be implemented in the BAM machinery in future applications.
Finally, we highlight that the resulting mass of the dark matter particles (∼10¹⁴ M⊙ h⁻¹) used to define the cosmic web is nearly five times larger than in the reference N-body simulation (see Sect. 2.4).

Properties of the dark matter density field in BAM
BAM explicitly determines several properties {Θ_dm} of the underlying DM density field upon which the occupation number of dark matter halos is assumed to depend, as explained in Sect. 2. In general, such properties can be nominally divided into local and non-local, depending on the quantity used to infer them. As a local property, we can readily use the dark matter overdensity at each cell (obtained using a given mass-assignment scheme), whereas non-local properties (also dubbed environmental) can be extracted from quantities defined on scales larger than the cell volume, such as the tidal field tensor T_ij = ∂_i ∂_j Φ (where Φ is the comoving gravitational potential satisfying the Poisson equation ∇²Φ = δ_dm). In particular, previous implementations of BAM used the cosmic-web classification (CWC), which relies on the value of the eigenvalues λ_i (i = 1, 2, 3) of the tidal field (see e.g. Hahn et al. 2007; van de Weygaert et al. 2009; Forero-Romero et al. 2009; Aragon-Calvo 2016; Yang et al. 2017; Paranjape et al. 2018) with respect to some arbitrary threshold λ_th. Similarly, the information of the velocity shear of the DM particles (see Bond et al. 1996; Libeskind et al. 2018; Kitaura et al. 2022) and its eigenvalues can be used to characterise the halo occupation number. In this work, we restrict ourselves to the CWC.
The CWC allows us to define the behaviour of the halo number counts in knots (labelled k), filaments, sheets, and voids (v, with λ_1 < λ_th, λ_2 < λ_th and λ_3 < λ_th)⁵. Furthermore, the CWC permits exploration of the dependency of halo occupancy on the mass M_k of collapsing regions, defined as the number of dark matter particles in sets (regions) formed by cells classified as knots. These regions are identified through a friends-of-friends percolation algorithm (Zhao et al. 2015).
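A common way to implement this kind of classification (following the scheme of Hahn et al. 2007, which is assumed here rather than taken from the text) is to count how many eigenvalues of the tidal tensor exceed the threshold in each cell. The integer labels and the function name are hypothetical. The default threshold λ_th = 0 matches the value the paper states it adopts.

```python
import numpy as np

KNOT, FILAMENT, SHEET, VOID = 3, 2, 1, 0  # illustrative integer labels

def cosmic_web_type(tidal, lambda_th=0.0):
    """Classify cells by the number of tidal eigenvalues above lambda_th.

    tidal : array of shape (..., 3, 3), symmetric tidal tensor per cell
    Returns an integer array: 3 = knot, 2 = filament, 1 = sheet, 0 = void.
    """
    eigenvalues = np.linalg.eigvalsh(tidal)  # shape (..., 3)
    return (eigenvalues > lambda_th).sum(axis=-1)
```

A void corresponds to all three eigenvalues lying below the threshold, consistent with the condition quoted in the text, and a knot to all three lying above it.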
The set of properties (CWC+M_k) has been explored in previous BAM publications (see e.g. Balaguera-Antolínez et al. 2020), where it was shown to be key to reconstructing the halo number counts based on the dark matter density field. In the same context, Kitaura et al. (2022) introduced the invariants of the tidal field, I_i, in the definition of halo bias used in BAM. This approach is designed to bridge a phenomenological description of the tidal field (e.g. with the CWC) and theoretical models of perturbation theory in which higher order terms can be written in terms of combinations of the eigenvalues of the tidal field.
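As an illustration, the three principal invariants of a symmetric 3×3 tensor can be computed from its eigenvalues as below. The identification of these textbook invariants with the I_i used in BAM is an assumption, since the defining expressions did not survive in the text. The function name is hypothetical.

```python
import numpy as np

def tidal_invariants(eigenvalues):
    """Principal invariants of a symmetric 3x3 tensor from its eigenvalues.

    eigenvalues : array of shape (..., 3)
    Returns (I1, I2, I3) = (trace, sum of pairwise products, determinant).
    """
    l1, l2, l3 = np.moveaxis(np.asarray(eigenvalues, dtype=float), -1, 0)
    i1 = l1 + l2 + l3
    i2 = l1 * l2 + l1 * l3 + l2 * l3
    i3 = l1 * l2 * l3
    return i1, i2, i3
```

Unlike the cosmic-web classification, these quantities involve no threshold, so they are fixed entirely by the tidal tensor.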
In summary, we explore the following models for the reconstruction of halo density fields and the generation of mock catalogues:
1. TkWEB: Use the local density, cosmic-web types, and the mass of collapsing regions.
2. IkWEB: Use the invariants of the tidal field and the mass of collapsing regions.
3. TIWEB: Use the cosmic web classification and one invariant of the tidal field.
The functions f_i(x) represent non-linear transformations designed to improve the extraction of the bias information in each variable x = {δ_dm, I_2, I_3}. These transformations involve a free parameter α (fixed to ∼0.11). The function f_1(x) has the form already used in Balaguera-Antolínez et al. (2020), while the shape of f_2,3(x) is designed to map the (large) dynamic range spanned by the invariants I_2,3 to the interval [−1, 1], thus simplifying its binning. Other sets of properties, such as the eigenvalues of the tensor ∂_i ∂_j δ_dm (see e.g. Peacock & Heavens 1985; Bardeen et al. 1986), can also be applied to characterise the bias of dark matter tracers (see e.g. Sinigaglia et al. 2021).
As previously mentioned, the physical motivation behind the choice of these models lies in the fact that the local dark matter density is not the only driver of halo clustering. Several works have already presented evidence of assembly bias in halos and galaxies (see e.g. Kauffmann et al. 1997; Sheth & Tormen 2004; Gao et al. 2005; Wechsler et al. 2006; Gao & White 2007; Croton et al. 2007; Angulo et al. 2008; Dalal et al. 2008; Faltenbacher & White 2010; Lee et al. 2017; Lazeyras et al. 2017; Montero-Dorta et al. 2017; Mao et al. 2018; Musso et al. 2018; Contreras et al. 2019; Xu et al. 2021). This type of bias not only includes the clustering of halos as a function of their intrinsic properties but also as a function of their environment (see e.g. Yang et al. 2017; Fisher & Faltenbacher 2018), a dependency that can be covered with approaches such as the TkWEB model. Furthermore, the inclusion of the mass of collapsing regions allows us to include short-range non-local bias, focusing on regions with a distinct (collapsing) dynamical state.
⁵ In this work, we use λ_th = 0.
On the other hand, implementing the invariants of the tidal field (i.e. the IkWEB model) allows us to assess the halo bias of Eq. (1) in a more complete fashion. This can be understood from the degree of arbitrariness arising in the framework of the CWC, whose characterisation depends on the parameter λ_th. The invariants of the tidal field do not suffer from this freedom and therefore contain all the information in the cosmic-web decomposition. Equally important, the connection between the invariants of the tidal field and the different terms present in a perturbative approach (see e.g. McDonald & Roy 2009; Kitaura et al. 2022) allows BAM to explicitly include a non-negligible signal of non-local bias up to third order in perturbation theory, a signal that is expected to be measured in forthcoming experiments (see e.g. Goldstein et al. 2022). Finally, the TIWEB model is designed to use the information from the TkWEB, adding the information from one invariant of the tidal field or a function thereof. One such function is the so-called tidal anisotropy parameter, defined as a function of the eigenvalues of the tidal field (Paranjape et al. 2018). This property is used in the assignment procedure for intrinsic halo properties, which is explained in Sect. 3.5.
Along with the models, the total number of bins adopted to discretise the information in the different dark matter properties (e.g. f_1(δ_dm), M_k) is also important when assessing whether the process can fall into an over-fitting regime. This can be quantified by computing the ratio η between the total number of bins and the total number of spatial cells used in the field description of halos and dark matter. After a series of numerical tests (mainly focused on the ideal number of dark matter property bins needed to achieve the convergence of the method as explained in Sect. 2.7), we obtain η ∼ 0.02, 2.2, and 0.9 for TkWEB, IkWEB, and TIWEB, respectively. This implies that the IkWEB model is likely to incur overfitting. This situation does not arise during the calibration procedure because the kernel and bias are applied to the same dark matter field from which these quantities are obtained. However, when implementing these products on independent dark matter density fields (as described in Sect. 3.2), the IkWEB model will be more sensitive to any difference in the dark matter distribution of the new field with respect to the reference. In that case, the algorithm generates biased estimates of the halo power spectrum for some realisations (e.g. those with density peaks not present in the reference), which leads to mode coupling in the covariance matrix of the power spectrum.
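To give a sense of scale, the quoted ratios can be converted into approximate bin counts for the fiducial 192³ mesh. The snippet below is simple arithmetic on the values in the text; the helper name `implied_bins` is illustrative.

```python
N_CELLS = 192**3  # number of cells in the fiducial mesh (from the text)
ETA = {"TkWEB": 0.02, "IkWEB": 2.2, "TIWEB": 0.9}  # ratios quoted in the text

def implied_bins(eta, n_cells=N_CELLS):
    """Total number of property bins implied by eta = N_bins / N_cells."""
    return eta * n_cells

for model, eta in ETA.items():
    print(f"{model}: eta = {eta:<4} -> about {implied_bins(eta):.1e} bins")
```

For IkWEB the implied number of bins exceeds the number of cells, so many bins are populated by at most one cell, which is why that model is the one flagged as prone to over-fitting.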

Learning phase: Iterative procedure and calibration of halo bias
The learning procedure in BAM is designed to generate two main outputs, namely (i) the so-called BAM kernel, and (ii) the corresponding (multi-dimensional) halo bias introduced by Eq. (1). The role of the halo bias is to assign the number of tracers in cells according to the underlying dark matter properties, keeping track of all the statistical anisotropies of the latter. The role of the kernel is twofold: it corrects for any effective large-scale contribution from non-local bias dependencies not accounted for in the set {Θ_dm}, and it also corrects for any aliasing effects caused by the representation of the DM field and the halo distribution on a mesh with respect to the original halo-finding algorithm used to construct the reference catalogue.
Let us now describe the procedure developed in the so-called learning phase of BAM. The main scope of the process is to modify the A-DMDF such that, when sampling it using the halo bias obtained with Eq. (1), we reconstruct the statistics of halo number counts to per cent precision up to the Nyquist frequency. Let us focus on the i-th iteration: at this stage, the algorithm starts by determining the properties {Θ^i_dm} from a dark matter density field obtained as the convolution of the input A-DMDF with the so-called BAM kernel K (to be defined below), which in turn is the result of the previous iteration:
δ̃^i_dm(r) = (K ⊗ δ_dm)(r), (6)
where the kernel is a Dirac delta function for the first iteration, and remains spherically symmetric in subsequent iterations.
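Since the kernel is isotropic, the convolution can be carried out as a multiplication in Fourier space by a function of |k| only. The sketch below assumes the kernel is tabulated in spherical shells; the function `convolve_with_kernel`, the shell layout and the unit weight for modes outside the tabulated range are illustrative choices.

```python
import numpy as np

def convolve_with_kernel(delta, kernel_k, k_edges, box_size):
    """Convolve a periodic field with an isotropic kernel tabulated in k-shells.

    delta    : real array of shape (N, N, N)
    kernel_k : kernel value in each spherical shell, length len(k_edges) - 1
    k_edges  : shell boundaries in |k|; modes outside keep unit weight
    """
    n = delta.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kz1d = 2.0 * np.pi * np.fft.rfftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, kz1d, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2)
    shell = np.digitize(kmag, k_edges) - 1
    weights = np.ones_like(kmag)
    inside = (shell >= 0) & (shell < len(kernel_k))
    weights[inside] = np.asarray(kernel_k)[shell[inside]]
    # Multiplication in Fourier space is a convolution in configuration space.
    return np.fft.irfftn(np.fft.rfftn(delta) * weights, s=delta.shape)
```

A kernel equal to one in every shell leaves the field unchanged, which corresponds to the Dirac delta used in the first iteration.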
With this new density field, the halo bias B(N^ref_h | Θ^i_dm) is measured using Eq. (1), and is then used to sample the density field δ̃^i_dm to obtain a new version of the halo number counts, N^i_h (which we also refer to as the reconstructed field; Eq. (7)). The main statistical property adopted as the target for the BAM algorithm is the halo power spectrum. This is obtained as a spherical average of the 3D Fourier transform of the new halo number count field, N^i_h(k), performed in shells identified with a wavenumber k_n (Eq. (8)), where N_n is the number of Fourier modes in the n-th shell (of width Δk_n), and the sum incorporates all vector modes with magnitude in that shell. Here, S = 1/n̄ is the Poisson shot noise (Peebles 1980) of the reference halo catalogue. We then define a power transfer function, T_i(k_n) (Eq. (9)), where P_ref(k_n) denotes the power spectrum of the reference halo catalogue (measured as in Eq. (8)). The sampling procedure of Eq. (7) is performed such that the new halo density field (HDF) not only contains the same number of objects as that of the reference but also shares its number-count statistics (number-count distribution function).
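A shell-averaged power spectrum with Poisson shot-noise subtraction can be sketched as follows. The Fourier normalisation (P = |δ_k|²/V, with δ_k the volume-weighted discrete transform), the use of the overdensity rather than raw counts, and the function name are assumptions. The estimator in BAM may use different conventions.

```python
import numpy as np

def shell_power_spectrum(counts, box_size, k_edges):
    """Shell-averaged power spectrum of a number-count field, minus shot noise.

    counts  : array of shape (N, N, N) with number counts per cell
    k_edges : boundaries of the spherical shells in |k|
    Returns (power per shell, number of modes per shell).
    """
    n = counts.shape[0]
    n_shells = len(k_edges) - 1
    delta = counts / counts.mean() - 1.0
    delta_k = np.fft.fftn(delta) * (box_size / n) ** 3
    power_3d = (np.abs(delta_k) ** 2 / box_size**3).ravel()
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()
    shell = np.digitize(kmag, k_edges) - 1
    valid = (shell >= 0) & (shell < n_shells)
    n_modes = np.bincount(shell[valid], minlength=n_shells)
    p_sum = np.bincount(shell[valid], weights=power_3d[valid], minlength=n_shells)
    shot_noise = box_size**3 / counts.sum()  # S = 1 / (mean number density)
    with np.errstate(invalid="ignore", divide="ignore"):
        return p_sum / n_modes - shot_noise, n_modes
```

Applying the same function to the reconstructed and reference count fields gives the two spectra that enter the power transfer function.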
For each spherical shell in Fourier space, BAM implements a Metropolis-Hastings algorithm (see e.g. Heavens 2009, and references therein) to accept or reject the corresponding value of the transfer function defined in Eq. (9). As metric, BAM uses the quadratic difference between the mock and reference power spectra in units of the Gaussian variance (see e.g. Dodelson 2003) of the latter (the standardised Euclidean distance). That is, we define a mode-by-mode likelihood L^i(k_n) based on this distance. The algorithm maximises the function L^i(k_n) by accepting or rejecting, shell by shell, the new power spectrum and therefore the corresponding transfer function. If the transfer function at a given mode is not accepted, the algorithm retains the previously accepted value. To express this fact, we define a set of weights ω^i(k_n) constructed according to the rejection criteria. These weights are used to define (and to update) the BAM-kernel in Fourier space (which is isotropic, being only a function of k_n), by making products of the weights at the current step with those from the preceding iterations (at the same shell k_n). We use this new version of the kernel to convolve the input DM field, as in Eq. (6), from which a new iteration follows (in practice, the convolution with the kernel is done in Fourier space).
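As a hedged illustration of the shell-by-shell accept/reject step, the sketch below applies a standard Metropolis-Hastings rule to each shell and updates an isotropic kernel as a running product of weights. Several ingredients are assumptions rather than the BAM definitions: the transfer function is taken as the ratio of reference to reconstructed power, the Gaussian variance as 2 P_ref² / N_n, and the function name `update_kernel` is hypothetical.

```python
import numpy as np

def update_kernel(kernel, p_new, p_ref, n_modes, prev_dist, rng):
    """One accept/reject sweep over Fourier shells (illustrative only)."""
    # Assumed transfer function: ratio of reference to reconstructed power.
    transfer = p_ref / p_new
    # Assumed Gaussian variance of the reference power in each shell.
    sigma2 = 2.0 * p_ref ** 2 / n_modes
    # Standardised squared distance between mock and reference spectra.
    dist = (p_new - p_ref) ** 2 / sigma2
    # Metropolis-Hastings rule: an improvement is always accepted, otherwise
    # the move is accepted with probability exp(-(dist - prev_dist) / 2).
    log_ratio = -0.5 * (dist - prev_dist)
    accept = np.log(rng.uniform(size=dist.size)) < log_ratio
    # Rejected shells get unit weight, so their kernel value is retained.
    weights = np.where(accept, transfer, 1.0)
    new_dist = np.where(accept, dist, prev_dist)
    return kernel * weights, new_dist, accept
```

Calling this function once per iteration, with the spectrum measured from the newly sampled field, reproduces the structure of the loop described in the text: sample, measure, accept or reject per shell, and update the kernel multiplicatively.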
We note that the transfer function defined in Eq. (11) is applied to the dark matter density field, and not directly to the field we are trying to reconstruct. Hence, there is no explicit need to define it under a square root (see e.g. Weinberg 1992).
The learning (or calibration) phase is considered to converge when the absolute residuals R, computed over the N_F spherical shells used to measure the power spectrum, reach the threshold of ∼1%. We used 300 iterations, although the calibration already reaches sub-per cent residuals after about 150 iterations. In terms of computing time, on a 32-thread workstation, the calibration procedure (with 300 iterations) takes ∼1 h.
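A convergence check of this kind could be sketched as follows. The exact definition of R is not reproduced here; the mean absolute fractional deviation over shells is an assumed form, and `mean_abs_residual` is a hypothetical helper name.

```python
import numpy as np

def mean_abs_residual(p_new, p_ref):
    """Assumed residual: mean over shells of |P_new / P_ref - 1|."""
    p_new = np.asarray(p_new, dtype=float)
    p_ref = np.asarray(p_ref, dtype=float)
    return np.mean(np.abs(p_new / p_ref - 1.0))
```

With this form, a calibration loop would stop (or be flagged as converged) once the returned value drops to about 0.01.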
In order to verify that the outputs of the iterative process are independent of the realisation used as a reference, we repeated this procedure for a number of reference simulations (IC plus corresponding halo catalogues) available in the SLICS set. Figure 2 shows slices through the different density fields involved in the calibration procedure performed with one randomly selected reference simulation. In particular, we show the halo number counts on a mesh (second row) reconstructed using the three halo bias models described in Sect. 2.6.
Figure 3 shows the summary statistics arising from the products of the iterative stage, using different models for the multidimensional halo bias of Eq. (1), and using one reference simulation. All the models shown are in a position to generate sub-per cent residuals in the calibration (see panel (a)) with reconstructed power spectra (panel (b)) within a 5% difference with respect to the reference (up to the Nyquist frequency). The models of halo bias shown display minor differences in their performances when explored at the level of the reconstructed power spectrum, as can be inferred from panel (c) of the previously mentioned figure, where we show the ratio of the power spectrum from the reconstructed field to that measured from the reference. It is only on the first Fourier mode that the differences with respect to the references are above 2%, while for the rest of the probed Fourier modes and up to the Nyquist frequency, the differences oscillate around ∼0.6%. We explicitly verified that similar trends are obtained when another realisation is used to perform the calibration. It is key to note that the fluctuations on large scales are not only a consequence of the small volume but are also linked to the stochastic nature of halo bias as expressed by Eq. (1).
The differences among the implemented models of halo bias can be observed in the shape of their corresponding kernel, as shown in panel (d) of Fig. 3. We note that the definition of the kernel implies that it does not explicitly encode any information on the anisotropies of the halo density field, which are clearly present in the large-scale distribution in the form of a filamentary structure. Indeed, the kernel in configuration space is fully symmetric, although the patterns can change according to the model of halo bias (and the type of mass-assignment scheme). The information on the anisotropies in the 3D halo distribution is instead statistically encoded in the halo bias, and as such, the model (i.e. the set of properties {Θ}) is key to reproducing higher order statistics, as we show below.
In general, the overall shape of the kernel agrees within all tested models: a constant amplitude towards large scales, with a scale dependency on small scales. The difference in the large-scale amplitude encodes the different content of information on the assembly bias, and is accounted for as long as different non-local terms are included. That is, the higher the amount of information on halo bias, the closer the kernel is to unity on large scales. The constant amplitude of the kernel towards large scales is a property that can be used to generate mock catalogues on larger cosmological volumes. This will be the subject of forthcoming publications.

Construction of halo catalogues
In this section, we describe the steps followed to generate halo number counts on a cubic mesh starting from independent dark matter density fields and using the outputs described in Sect. 2.7. To compare the summary statistics of the mocks produced within BAM with those from the reference, we make all comparisons with a set of N_sim = 80 mocks. We test different models of halo bias, and based on the performance of the summary statistics obtained from the mocks constructed with these models, we adopt one of them to generate the final set of mock galaxy catalogues.

The halo bias and kernel
Based on the set of N_sim initial conditions described in Sect. 2.3, we generated the same number of realisations of (approximated) dark matter density fields δ^j_dm (j = 1, ..., N_sim) using the methods described in Sect. 2.5. These are convolved with the BAM-kernel (obtained from the learning phase, Sect. 2.7) to generate a new dark matter density field δ̃^j_dm ≡ K ⊗ δ^j_dm, after which the non-local properties (e.g. types of cosmic web) of the resulting field δ̃^j_dm are determined. According to these properties, the algorithm populates these dark matter fields with a number of haloes in cells by sampling the halo bias (Eq. (14)).

A previous analysis with BAM (Balaguera-Antolínez et al. 2019) showed that when using reference simulations probing larger cosmological volumes (e.g. approximately three times that of the SLICS), only one realisation (one member of the reference set) is sufficient to generate an ensemble of number counts with precise summary statistics (up to the four-point statistics). However, numerical tests with the current setup have shown that this procedure, which is based on one single calibration (i.e. based on a single realisation), can suffer from effects that are due to the relatively small volume of the reference simulation (cosmic variance). To circumvent this, we generalise the sampling procedure of Eq. (14) and allow each dark matter density field to be sampled with the bias and kernel independently inferred from one or more reference catalogues. That is, we calibrated N_ref halo biases and kernels from the same number of SLICS references (as shown in Sect. 2.7) and constructed a total bias by 'stacking' the independent halo biases, along with a kernel obtained as the average of those from each reference (Eq. (15)).

Adding the results of different calibrations as expressed by Eq. (15) is equivalent to increasing the volume of the reference simulation (keeping the same minimum tracer mass and spatial resolution) to an effective value V_eff. However, we note that the stacked version of the halo bias B will differ from that measured from an N-body simulation probing the volume V_eff (with the same initial conditions) because of the absence of super-sample modes (e.g. Rimes & Hamilton 2006; Takada & Hu 2013) in the halo bias, with this difference manifesting as an underestimation of high-density peaks, leading to biased estimates of covariance matrices of clustering probes. This is not a problem for the present case because we apply the kernel and bias of Eq. (15) to the DM field with the volume of the reference simulation.
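The stacking of independent calibrations can be sketched as below. This is only an illustration under stated assumptions: the halo bias of each reference is represented as a joint histogram with rows indexing halo number counts and columns indexing bins of dark matter properties, stacking is taken to mean summing these histograms before normalising, and `stack_calibrations` is a hypothetical name.

```python
import numpy as np

def stack_calibrations(joint_histograms, kernels):
    """Combine calibrations from several references (illustrative only)."""
    # Sum the joint histograms (N_h, property bin) over all references.
    total_hist = np.sum(np.asarray(joint_histograms, dtype=float), axis=0)
    # Normalise each property bin to obtain P(N_h | property bin).
    norm = total_hist.sum(axis=0, keepdims=True)
    bias_tot = np.divide(total_hist, norm,
                         out=np.zeros_like(total_hist), where=norm > 0)
    # Average the isotropic kernels shell by shell.
    kernel_tot = np.mean(np.asarray(kernels, dtype=float), axis=0)
    return bias_tot, kernel_tot
```

Summing before normalising weights each reference by the number of cells it contributes to a given property bin, which is one natural reading of "stacking"; other weightings are possible.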
In Balaguera-Antolínez et al. (2019), it was also demonstrated that the implementation of a kernel along with its corresponding halo bias (i.e. the set of outputs obtained from the learning phase with a given IC) is key to delivering halo fields with accurate summary statistics, in particular, the covariance matrix of the power spectrum. Therefore, the implementation of Eq. (15) can be a potential source of inaccuracy because the resulting kernel K_tot is not necessarily matched to the halo bias represented by B_tot. Instead, it is closer to what we can obtain using a reference simulation with fixed-amplitude initial conditions (Angulo & Pontzen 2016); that is, Eq. (15) is designed to suppress cosmic variance in the kernel while keeping it in the total halo bias. Accordingly, these two quantities are not physically (statistically) compatible because the abundance of massive halos (or high-density regions) present in the total bias is sensitive to the amount of cosmic variance of the corresponding IC (see e.g. Heß et al. 2013; Aragon-Calvo 2016), which is the same cosmic variance that an averaged kernel is designed to suppress. Keeping this in mind, we implemented Eq. (15) to assess whether or not increasing the effective volume can provide better statistics at the number-counts level. We discuss the results in the following section.

Balaguera-Antolínez, A., et al.: A&A 673, A130 (2023)

Fig. 4. Power spectrum (left) and reduced bispectrum (right, isosceles configurations) computed from sets of mock halo number-count catalogues (of 80 realisations each) obtained from the calibration of BAM using the TkWEB model as described in Sect. 2.6. The first row shows the mean in each summary statistic. The second row shows the ratio of the mean statistics to that from the reference (RTR_mean). The third row shows the variance in the respective statistics, and the fourth their respective ratio to the variance from the reference ensemble (RTR_var). The shaded area in the second row denotes the 5% deviation from unity.
In Appendix D, we present the performance of BAM using larger cosmological simulations that were generated with an IC that was in turn generated with variance-suppressing methods (Chuang et al. 2019; Garrison et al. 2018; Maksimova et al. 2021). In forthcoming publications, we shall address this subject in more detail.

Stage II: Generation of halo number counts
According to the discussion of the previous section, we generated N_sim halo number-count fields, increasing the effective volume by a factor of 2 and 3, that is, using N_ref = 2³ and N_ref = 3³ calibrations obtained from the same number of references. To obtain a global picture of the performance of the different characterisations of the halo bias, we repeated this procedure for all the models proposed in Sect. 2.6. As an example, Fig. 4 shows a comparison between the summary statistics of 80 BAM mocks and the same number of realisations from the reference set. This shows that BAM can generate mock catalogues whose mean and variance of the halo power spectrum agree to within 5% with the same statistics obtained from the reference set. We verified that similar results are obtained with the TIWEB model.
To further assess the level of accuracy with respect to the same statistics from the reference, we use the three-point statistics in Fourier space. In particular, we explore the reduced bispectrum (or hierarchical three-point amplitude) Q(θ_12 | k_1, k_2) (Peebles 1980), where θ_12 is the cosine of the angle between the sides k_1 and k_2. We use estimates of the bispectrum to assess the precision of the method (see e.g. Pollack et al. 2012; Gil-Marín et al. 2012), using an isosceles configuration with k_1 = k_2 = 0.2 h Mpc⁻¹ as an example. We remind the reader that this quantity is not constrained in the calibration procedure and can therefore be used as a yardstick to determine which of the models (or amount of effective volume) provides the best scenario to generate mock catalogues in the form of halo number counts. In the case of the TkWEB model (Fig. 4), the signal of the reduced bispectrum is mostly within 5% of that of the reference, except for low values of θ_12, where the difference can be of the order of 10%. The variance of the bispectrum for such a configuration is also within 5%−10% of that of the reference. We verified that the results based on the TIWEB model show the same general trend.
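For reference, the commonly used definition of the reduced bispectrum, Q = B / (P_1 P_2 + P_2 P_3 + P_3 P_1), can be sketched as follows. The helper names are hypothetical, and the triangle convention (θ_12 taken as the angle between the vectors k_1 and k_2, with k_3 closing the triangle) is an assumption that may differ from the one adopted for the figures.

```python
import math

def third_side(k1, k2, theta12):
    """Length of k3 = -(k1 + k2) for an angle theta12 between k1 and k2."""
    k3_sq = k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * math.cos(theta12)
    return math.sqrt(max(0.0, k3_sq))  # guard against round-off below zero

def reduced_bispectrum(bispectrum, p1, p2, p3):
    """Hierarchical amplitude Q given B(k1, k2, k3) and the three P(k_i)."""
    return bispectrum / (p1 * p2 + p2 * p3 + p3 * p1)
```

For the isosceles example with k_1 = k_2 = 0.2 h Mpc⁻¹, `third_side(0.2, 0.2, theta12)` gives the wavenumber at which the third power spectrum must be evaluated.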
The correlation matrix r_ij = C_ij / √(C_ii C_jj) (where C_ij is the covariance matrix) of the statistics under inspection (power spectrum in this case) for different halo bias models is shown in Fig. 5 along with a number of references used as training sets. In general, we can conclude that the TkWEB model generates correlation coefficients that are in good agreement with those from the reference. The TIWEB model displays extra coupling, which tends to decrease as the number of training references increases, which emphasises the need for larger cosmological volumes when one or two realisations are expected to be used as a training set and more detailed models are to be used. We have similarly verified that (as anticipated in Sect. 2.6) the IkWEB model displays strong mode coupling towards small scales even with N_ref = 27, and we therefore discard it for the present applications. Such extra couplings are likely to be a consequence of the overfitting regime in which this model has been applied (as shown in Sect. 2.6), enhanced by the lack of compatibility between kernel and bias, as discussed in Sect. 3.1.
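The correlation matrix defined above can be estimated from an ensemble of measured spectra with a few lines of NumPy. This is a generic sketch, with `correlation_matrix` a hypothetical helper and the input assumed to be an array with one mock per row and one wavenumber bin per column.

```python
import numpy as np

def correlation_matrix(samples):
    """r_ij = C_ij / sqrt(C_ii C_jj) from an (n_mocks, n_bins) array."""
    cov = np.cov(samples, rowvar=False)   # sample covariance between bins
    sigma = np.sqrt(np.diag(cov))         # standard deviation of each bin
    return cov / np.outer(sigma, sigma)
```

Because the normalisation removes the amplitude of the variance, this matrix isolates the mode coupling, which is why it is compared separately from the variance itself.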
In terms of three-point statistics, Fig. 6 reveals that both the TkWEB and TIWEB models can generate sets of number counts whose noise in the correlation matrix of the reduced bispectrum (for isosceles configurations) qualitatively agrees with that observed from the reference simulation, especially for N_ref = 27. Figure 7 complements the presentation of the performance of the statistical properties of the halo mocks by showing the behaviour of the reduced bispectrum in several configurations (using the TkWEB model with N_ref = 27) relative to the corresponding signal from the reference: the left column shows ratios of the mean (solid lines) and variance (dashed lines) of the BAM ensemble to the results from the SLICS; the BAM mocks reproduce the mean reduced bispectrum, with average deviations (computed over the θ_12 range) of ∼7%, while the variance shows an average deviation of ∼2% with respect to the reference. We expect that the implementation of improved gravity solvers, which provide a more accurate description of the underlying DM field (e.g. Kitaura et al. 2023), will help to reduce the difference in the mean signal.
The right column of Fig. 7 shows two elements of the correlation matrix of the reduced bispectrum as obtained from the reference (solid lines) and the BAM set (dashed lines), showing that in general, the BAM approach is able to replicate the noise in the correlation matrix of three-point statistics (in real space).
Based on these results, we adopt the TkWEB model to generate independent realisations of halo number counts; this model will be used to generate the final set of halo catalogues as described in the following section. We used N_ref = 27 references, but note that this particular model is already good enough to allow us to use the calibration from only one reference simulation.

Stage III-a: Assignment of halo coordinates
To transform the set of number counts obtained in Sect. 3.2 into an ensemble of discrete tracers, we assign coordinates and velocities following the approach of Kitaura et al. (2016), which consists in using the phase-space coordinates of dark matter particles generated by the approximated gravity solver (Sect. 2.5). The sampling of the halo number counts field is complemented with a set of random tracers (e.g. tracers with random coordinates within each cell), which are used when, at a given cell, the number of halos requested is larger than the available number of dark matter particles. The fraction of such random tracers depends on the redshift of the reference, and for the current setup represents ∼20% of the total number of tracers.
We use dark matter particles to sample the halo number count field in an attempt to maintain a precise clustering signal on scales below the fiducial cell size. As the randomly distributed tracers impact the shape of the power spectrum when analysed at higher Nyquist frequencies, BAM introduces a subgrid modelling based on the collapse of the random tracers towards their closest DM particles. This collapse is modulated by a fraction f_col of the separation between each random tracer and its nearest DM particle. That is, for a separation d_r between a random particle and its closest DM particle, we displace the former towards the latter such that their new separation is the product f_col d_r. This is depicted in Fig. 8. Numerical experiments (some of which are discussed in Appendix D) have shown that this parameter depends on the redshift and the nature of the approximated gravity solver (Balaguera-Antolínez et al., in prep.). For the current setup, f_col ∼ 0.35 provides a good description of the halo power spectrum. Furthermore, the parameter f_col can be generalised to depend on halo properties once these are assigned (Sect. 3.5).
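The collapse of random tracers can be illustrated with a minimal sketch. It uses a brute-force nearest-neighbour search and ignores periodic boundaries, both of which are simplifications chosen for brevity; `collapse_random_tracers` is a hypothetical name, and a tree-based search would be needed for realistic particle numbers.

```python
import numpy as np

def collapse_random_tracers(random_pos, dm_pos, f_col):
    """Move each random tracer so its distance to the nearest DM particle
    shrinks from d_r to f_col * d_r (non-periodic, O(N_r * N_dm) memory)."""
    diff = random_pos[:, None, :] - dm_pos[None, :, :]
    nearest = np.argmin(np.sum(diff ** 2, axis=2), axis=1)
    target = dm_pos[nearest]
    # Keep the direction of the separation vector, rescale its length.
    return target + f_col * (random_pos - target)
```

Setting `f_col = 1` leaves the tracers untouched, while `f_col = 0` places them on top of their nearest dark matter particle; the value of about 0.35 quoted in the text sits between these limits.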
Panel (a) of Fig. 9 shows the mean real-space power spectrum obtained from an ensemble of 80 BAM halo catalogues with coordinates assigned as previously described. Each realisation is embedded in a 400³ cubic mesh using the triangular-shaped-cloud interpolation scheme (Hockney & Eastwood 1988). Panel (b) shows that the mean power from BAM deviates from that of the SLICS by less than 3% up to k ∼ 0.4 h Mpc⁻¹.

Stage III-b: Assignment of halo velocities
The displacement obtained with ALPT (see Eq. (5)) provides the velocities of dark matter particles at their Eulerian coordinates r (Eq. (16)), expressed in terms of the 2LPT velocity field (see e.g. Buchert & Ehlers 1993; Kitaura et al. 2014). In the latter, f^(i) ≡ f^(i)(z) = d ln D^(i)(a)/d ln a are the growth indices, computed as f^(1)(z) ∼ Ω_mat(z)^(5/9) and f^(2)(z) ∼ 2 Ω_mat(z)^(6/11) (see e.g. Lahav et al. 1991). The velocity field associated with the SC model is analogously derived as v_sc(q, z) = ∇ψ_SC(q, z), where ψ_SC(q, z) is the solution of the corresponding Poisson equation.
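The growth-index approximations quoted above are simple to evaluate. The sketch below assumes a flat ΛCDM background to obtain Ω_mat(z), which is an assumption of this example; the function names are hypothetical.

```python
def omega_matter(z, omega_m0):
    """Matter density parameter at redshift z (flat LCDM assumed)."""
    a3_inv = (1.0 + z) ** 3
    return omega_m0 * a3_inv / (omega_m0 * a3_inv + 1.0 - omega_m0)

def growth_rates(z, omega_m0):
    """Approximate first- and second-order growth indices f1 and f2."""
    om = omega_matter(z, omega_m0)
    return om ** (5.0 / 9.0), 2.0 * om ** (6.0 / 11.0)
```

In an Einstein-de Sitter universe (`omega_m0 = 1`), these expressions reduce to f^(1) = 1 and f^(2) = 2 at all redshifts, which is a convenient limit to check.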
We assign peculiar velocities to dark matter halos in two steps. First, we generate a velocity field in Eulerian space using the velocities computed with Eq. (16) and implement an NGP interpolation scheme. It is well known that this kind of approach introduces sampling artefacts due to the fact that it relies on the particles to generate the velocity field (see e.g. Zheng et al. 2013; Zhang et al. 2015), which means cells without tracers are incorrectly assigned a null velocity. Alternatives such as the 'nearest point' (see e.g. Zhang et al. 2015; Chen et al. 2018) or more sophisticated algorithms such as the Delaunay tessellation (see e.g. Romano-Díaz & van de Weygaert 2007) or the Kriging scheme (see e.g. Yu et al. 2015) are designed to reduce the spurious bias introduced by these sampling artefacts. We implement a hybrid approach and use NGP as the primary method, assigning to empty cells the average velocity computed from the first neighbour cells. A second step consists of a trilinear interpolation of the resulting velocity field at the position of both dark matter particles and random tracers (introduced in the previous section).

Figure 10 shows an example of the resulting distribution of the modulus of the halo peculiar velocity v = |u| from one realisation of the SLICS and BAM sets (sharing the same seed). One strong feature arising from this comparison is the difference in the abundance (in terms of v) towards high velocities: above ∼300 km/s the abundance from the BAM halos (i.e. ALPT) is, in general, underestimated with respect to the reference. To correct this deviation, we introduce an isotropic correction to the i-th component of the velocity of each particle through the density-dependent factor γ(r) = (1 + δ_dm(r))^α. Numerical experiments have revealed that α ∼ 0.2 leads to good agreement in the halo velocity distribution, as is also presented in Fig. 10. We verified that this correction is indeed needed to obtain good agreement between the clustering signal of the BAM mocks and that from the reference. We speculate that the origin of this correction is linked to the lack of small-scale modelling of coherent flows in ALPT combined with the resolution used in the analysis. A more detailed analysis (exploring e.g. redshift and cosmology dependencies) will be presented in future publications.
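A minimal sketch of the density-dependent velocity correction is given below. It assumes that the correction acts multiplicatively and identically on the three velocity components, and that the dark matter overdensity has already been evaluated at the position of each tracer; `boost_velocities` is a hypothetical name.

```python
import numpy as np

def boost_velocities(velocities, delta_dm, alpha=0.2):
    """Scale each tracer velocity by gamma = (1 + delta_dm) ** alpha."""
    gamma = (1.0 + np.asarray(delta_dm, dtype=float)) ** alpha
    # The same factor multiplies all three components (isotropic correction).
    return np.asarray(velocities, dtype=float) * gamma[:, None]
```

With α ∼ 0.2, tracers at the mean density (δ_dm = 0) are left unchanged, whereas a tracer in a cell with δ_dm = 31 has its velocity doubled, which illustrates how the factor populates the high-velocity tail.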
The second correction to the velocities is applied as part of the subgrid modelling, focusing again on the random tracers (as depicted in Fig. 8). In this case, along with the collapse towards the closest dark matter tracer (discussed in the previous section), we induce a rotation (or collapse) of the random velocities, modifying their orientation (through an angle β) and magnitude (through a parameter ζ), which can be a function of the tracer properties or local density. In this work, we empirically set these parameters such that a rotation is applied to the velocity vector to align it with the axis connecting the random particle and the dark matter particle, keeping its magnitude fixed. We verified that this rotation helps to improve the signal in redshift space towards small scales, and we leave a thorough study of its impact on the velocity field to a future study.

Fig. 8. Subgrid modelling for the assignment of coordinates in phase space. The coordinates of the random tracers are modulated by the fraction f_col such that if d_r denotes the separation between the random particle and its closest dark matter particle, the new separation is f_col d_r. Velocities are modified in two steps: (i) an isotropic density-dependent correction γ(δ_dm) is applied to the velocities of both random tracers and dark matter tracers (mapping u_r to its corrected value), and (ii) after collapsing the random tracers, their velocities are modified using the parameters β (direction) and ζ (modulus). The angles α and β are defined on the plane generated by the vector joining random particles with their closest dark matter particle and the velocity of the random particle r_r.
The performance of the velocity assignment in terms of the two-point statistics is presented in panels (c) to (h) of Fig. 9, where we show the mean halo power spectrum in redshift space. This signal is obtained by transforming the halo coordinates (x, y, z) along a line-of-sight axis (taken to be one of the three Cartesian coordinates, e.g. the z-direction) using the distant observer approximation to its redshift coordinate via z → s = z + v_z/(aH(a)) (see e.g. Kaiser 1987). The clustering in this space is summarised through the Legendre decomposition of the power spectrum into multipoles P_ℓ(k) which, according to the distant observer approximation, can be measured (see e.g. Hamilton 1998) as averages in spherical shells of the three-dimensional halo power spectrum weighted by the Legendre polynomial of order ℓ, L_ℓ(x), and corrected for the Poisson shot noise S (as in Eq. (8)). We measure the monopole (ℓ = 0), the quadrupole (ℓ = 2), and the hexadecapole (ℓ = 4) as main statistical probes of redshift-space distortions.
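The mapping to redshift space under the distant observer approximation can be sketched as follows. The example assumes positions in h⁻¹ Mpc, velocities in km/s, a flat ΛCDM Hubble rate expressed as H(a) = 100 E(a) h km/s/Mpc, and periodic wrapping in a cubic box; these choices and the name `to_redshift_space` are assumptions of this illustration.

```python
import numpy as np

def to_redshift_space(z_pos, v_z, scale_factor, omega_m0, box_size):
    """Shift the line-of-sight coordinate: s = z + v_z / (a H(a))."""
    # Hubble rate in h km/s/Mpc for a flat LCDM background (assumed).
    hubble = 100.0 * np.sqrt(omega_m0 / scale_factor ** 3 + 1.0 - omega_m0)
    s = (np.asarray(z_pos, dtype=float)
         + np.asarray(v_z, dtype=float) / (scale_factor * hubble))
    # Tracers pushed outside the box re-enter on the opposite side.
    return np.mod(s, box_size)
```

At a = 1, a line-of-sight velocity of 100 km/s corresponds to a displacement of 1 h⁻¹ Mpc, which gives a sense of the scale of the distortions.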
In general, the redshift-space power spectrum probed on scales up to k ∼ 0.4 h Mpc⁻¹ agrees within the 1σ uncertainty region with that of the reference, as is demonstrated by panels (c) to (h) in Fig. 9. Figure 11 shows the correlation coefficient of the halo power spectrum in real and redshift space computed from the halo distribution. The bottom panels of Fig. 11 show two elements of the correlation matrix, which reveal good agreement between the two compared sets, both in terms of the width of the correlation coefficients and the underlying noise. We verified that this agreement is also observed when using BAM realisations with seeds different from those of the reference set.

In all panels of Fig. 10, the corrected histogram is obtained after applying the isotropic velocity correction described in Sect. 3.3.

Stage III-c: Assignment of halo properties
Given that the BAM mock halo catalogues are to be considered as the building blocks of galaxy catalogues (through the implementation of an HOD model), the BAM algorithm pays special attention to the assignment of halo properties such as the virial mass M_vir and the velocity dispersion σ_v. This step is indeed a critical and far-from-trivial task within the construction of BAM mock catalogues (Balaguera-Antolínez et al., in prep.), given that a simultaneous generation of precise clustering and halo properties would imply the assessment of the distribution of pairs in all possible bins of halo properties, a computation which goes openly against the need for speed in the generation of mock catalogues.
The procedure encoded in BAM finds its motivations in early methods developed by Zhao et al. (2015; see also Chuang et al. 2015a), which were envisaged to generate luminous red galaxy catalogues (see e.g. Kitaura et al. 2016; Rodríguez-Torres et al. 2016). Those algorithms used the properties of the underlying dark matter density field, as in BAM. However, BAM takes the method to a greater level of detail, in which more properties of the dark matter and dark matter tracers are considered.
The assignment procedure in BAM (see Fig. 1) relies on a hierarchical approach in which a 'main property' is defined and assigned, followed by the assignment of secondary properties using the scaling relation with respect to the main property. To determine the main property, different options can be considered; for example, selecting the halo property with the tightest correlation with the underlying dark matter density field, or choosing the halo property that drives the main dependencies in the HOD framework. The latter option would lead us to treat the virial mass M_vir as the main property (which mainly determines galaxy number-count statistics in the HOD framework), followed by the velocity dispersion (which dictates the redshift space distribution of satellite galaxies). Nevertheless, given that BAM is designed to explore environmental dependencies (defined through the dark matter density field), we select the first option. Accordingly, while the correlation between virial mass and local dark matter density is ∼28%, the velocity dispersion displays tighter correlations (∼58%) with the local dark matter density. This is not surprising, as quantities directly derived from the dynamical properties of the dark matter particles in halos trace the depth of the potential wells very well (see Appendix A) and are less prone to ambiguities typical of the definition of the mass of a dark matter halo (see e.g. Skibba & Macciò 2011; Zehavi et al. 2019). In Appendix B, we describe the methodology implemented for the assignment of properties within BAM.
Figure 12 shows the correlation between different halo properties with respect to the underlying dark matter density field, again for different cosmic-web types.We verified that the scaling relations between virial mass and velocity dispersion show an acceptable level of agreement with those from the reference set.

Results
Figure 13 shows the halo abundance as a function of the two assigned halo properties from a set of 80 realisations of BAM and the same number of SLICS simulations.We see good agreement in general, but this agreement partially breaks down for tracers with high velocity dispersion in low-density regions (voids) where BAM overestimates the abundance in terms of that particular property.
Similarly, we also verified that the multi-scaling approach generates better results than a direct assignment of properties.Although this approach naturally goes towards solving the problem of halo exclusion, it relies on the specification of the different thresholds, and we checked that the precision of the mean power spectrum of halos is sensitive to such figures, especially on the high-mass halo population.New alternatives are being explored and will be presented in forthcoming publications.
Figure 14 shows the ratio between the mean power spectra from the BAM set and that from the SLICS, both in real and redshift space and in three disjoint halo-mass bins.The area around that curve denotes the standard deviation.In general, the trend shown as a function of the halo mass is similar in the two sets of mock catalogues.However, closer inspection reveals ∼5% deviations towards small scales both in real and redshift space, and in particular, for the most massive halos.
Figure 15 shows the variance (left column) and correlation matrix (right column) of the halo power spectrum in real and redshift space for the full halo population. We can read from this figure that in real space, the BAM mocks display a closer correlation between modes on small scales compared to the reference simulation. The situation is mildly better in redshift space, where indeed the correlation matrix for the quadrupole agrees to a greater extent with the SLICS simulations. The most significant discrepancy appears when exploring the clustering as a function of the halo mass. In particular, high-mass halos display covariance matrices of the power spectrum with a strong mode coupling. Such coupling comes from realisations where the power spectra deviate considerably from the expected mean of the ensemble (i.e. that from the reference suite). These discrepancies originate from two different aspects of the BAM approach. On the one hand, the assignment of halo properties turns out to be complex, especially for massive objects, where effects such as halo exclusion (see Appendix A) are not fully modelled. On the other hand, the deviations seen in redshift space are inherited from those in real space, plus any remaining deviation of the velocity field generated with ALPT from the true halo velocity field. These two aspects mean that there is room for improvement in the assignment of coordinates, velocities, and properties of the halo population, especially towards small scales.

Fig. 11. Correlation matrix of halo power spectrum with BAM. Top row: Correlation matrix r_ij = C_ij / √(C_ii C_jj) (where C_ij is the covariance matrix) obtained from the BAM mock halo catalogues and the SLICS references computed in real space and redshift space, the latter expressed through the monopole P_0(k), the quadrupole P_2(k), and the hexadecapole P_4(k). Bottom row: Examples of elements of the correlation matrix r_ij at two different wavenumbers k_j ∼ 0.1 and ∼0.32 h Mpc⁻¹ from the two sets of halo catalogues.

Fig. 12. Joint probability distribution B(x, δ_dm) of halo properties x (number counts, virial mass and velocity dispersion) and the underlying dark matter density, interpolated on a 192³ mesh using a CIC mass-assignment scheme and for different cosmic-web environments. Solid lines (and coloured regions) denote contours enclosing 98% and 68% of the total number of cells in a reference SLICS simulation. Dotted lines represent the same quantity obtained from one BAM halo catalogue.
With the procedures described in the previous sections, we generated 770 mock halo catalogues based on the same number of initial conditions. One of the great advantages of the method is the small computing-time requirements: the generation of this set of halo catalogues was achieved in ∼2 days (∼4 min per mock) using a workstation with 128 threads and 256 GB of random access memory (RAM).

Stage IV: Construction of galaxy catalogues
One of the main advantages of BAM is its capability to provide catalogues of different dark matter tracers, all sharing the same underlying dark matter and halo distribution. This is key to providing covariance matrices for multi-tracer analysis (see e.g. Hamaus et al. 2012; Abramo & Leonard 2013; Abramo et al. 2016; Wang & Zhao 2020; Zhao et al. 2021). While this is not a unique feature of this method (see e.g. Zhao et al. 2021), it represents an improvement over approaches that need to be calibrated with a particular galaxy population.
To assign galaxies to the dark matter halos of BAM, we implemented the HOD prescription based on the high-mass quenched model (see e.g. Alam et al. 2020), which describes the abundance of emission line galaxies (ELGs) in dark matter halos (see e.g. Gonzalez-Perez et al. 2018). This model suppresses the probability for central ELGs to be found in very massive dark matter halos. In particular, the probability that a halo of mass $M_{\rm vir}$ hosts a central ELG is expressed through Eq. (18), where $\phi(x) = \mathcal{N}(\log_{10} M_c, \sigma_M)$, $M \equiv \gamma M_{\rm vir}$, $\Phi(x) = \int_{-\infty}^{x}\phi(x')\,{\rm d}x'$, and $A = 2(p_{\rm max} - 1/Q)/\max(2\phi(x)\Phi(x))$. Here, $Q$ denotes the quenching efficiency, $p_{\rm max}$ controls the saturation level of occupancy, and $M_c$ is the cut-off mass for ELGs, which determines the maximum of the occupation distribution. The number of ELG satellites is generated from a Poisson realisation with a mean modelled as a power law with a lower cut-off (Eq. (19)), where $M_1$ is a characteristic satellite mass, while the parameter $\kappa$ defines a cut-off mass (in units of $M_c$) below which the occupancy of satellites drops to zero. The satellites are distributed within dark matter halos following an NFW density profile (Navarro et al. 1996). The random components of the velocities of the satellite galaxies are drawn from a normal distribution, $v_s \sim \mathcal{N}(0, \sigma_v)$. The total velocity of a satellite galaxy is then given by the sum of the peculiar velocity of its parent halo and the random component.
The velocities of the central galaxies are the same as those of their parent halos.
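The satellite part of this prescription can be sketched as follows. This is a minimal illustration and not the authors' code: the power-law form `((M - kappa*M_c)/M_1)**alpha` is an assumption consistent with the verbal description (it is not copied from Eq. (19)), all parameter values are hypothetical, and the NFW placement of satellites is omitted.

```python
import math
import random

def mean_satellites(m_vir, m_c, m_1, kappa, alpha):
    """Assumed mean occupation: ((M - kappa*M_c)/M_1)**alpha above the cut-off, else 0."""
    m_cut = kappa * m_c
    if m_vir <= m_cut:
        return 0.0
    return ((m_vir - m_cut) / m_1) ** alpha

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means of this toy case."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def satellite_velocities(v_halo, n_sat, sigma_v, rng):
    """Parent-halo peculiar velocity plus a Gaussian random component N(0, sigma_v)."""
    return [tuple(v + rng.gauss(0.0, sigma_v) for v in v_halo) for _ in range(n_sat)]

# Hypothetical parameter values, chosen only to exercise the functions.
rng = random.Random(7)
n_bar = mean_satellites(5e13, m_c=1e12, m_1=1e13, kappa=1.0, alpha=1.0)
n_sat = poisson(n_bar, rng)
vels = satellite_velocities((100.0, -50.0, 20.0), n_sat, 300.0, rng)
```

Central galaxies would simply inherit the velocity of their parent halo, so no random component is drawn for them.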
The HOD prescription of Eqs. (18) and (19) has been applied to both the SLICS and BAM halos to obtain their respective galaxy catalogues (see Alam et al. 2020, for the set of parameters $\{Q, \gamma, M_c, \kappa, \alpha, \sigma_M, M_1\}$). The performance of the BAM galaxy mocks in terms of the galaxy occupation distribution is presented in Fig. 16, where the largest deviations with respect to the reference appear in the satellite population at the high-halo-mass end (∼60% difference at $M_{\rm vir} \sim 4\times10^{14}\,M_\odot\,h^{-1}$, the mass scale below which ≥99% of the sample is contained).
Fig. 14. Ratio (ratio-to-reference) between the mean power spectrum from the set of 80 BAM mock halo catalogues and that obtained from the same number of SLICS catalogues, both in real and redshift space (monopole $\ell = 0$, quadrupole $\ell = 2$, hexadecapole $\ell = 4$), in three bins of halo virial mass. The shaded areas denote the 1σ region (standard deviation) computed from the means and their respective errors.
Let us now summarise the performance of the mock galaxy catalogues produced with BAM through the assessment of different statistical probes.
Fourier space. Figure 17 shows the ratio of the mean power spectrum of galaxies in BAM with respect to the signal from the same type of populations in the reference simulation, both in real and redshift space. The most noticeable difference in real space comes from the clustering of satellites, which on small scales directly probes the density profile of the dark matter halo (see e.g. Cooray & Sheth 2002). As the two sets of mocks share the same density profile (by construction), the differences in the clustering pattern can be traced back to the assignment of halo masses, as this is a key property shaping the mass-concentration relation.
Similarly inherited from the parent halo population, the galaxy redshift-space power spectrum displays a lack of power towards small scales. We note that, unlike for the halo population, in which such deviations can be tackled solely through the velocity field of the approximated gravity solver, at this stage we must also include (in addition to the velocity field of the dark matter particles) information on the halo properties used to derive galaxy coordinates in phase space.
Figures 18-20 show the variance and the correlation matrix of the mock galaxy catalogues, split into the two types of population and into different bins of (host) halo mass. In general, the covariance matrices show good agreement with the reference. The extra correlation in the high-halo-mass bins is inherited from the parent halo distribution but is only embodied within the statistics of the satellite population. The behaviour of the correlation matrices for central galaxies (in all host-mass bins) contrasts with that from the parent halo distribution, as it lacks the extra mode coupling presented in Fig. 19. The reason for this is that, as pointed out above, the HOD model applied here suppresses the abundance of central ELGs in high-mass halos, which negates the possible deviations coming from the combination of cosmic variance and inaccuracies in the kernel-bias connection (see Sect. 2).
Configuration space. We measure the standard correlation function (Peebles 1980) in real and redshift space (the latter similarly represented by the multipole decomposition) based on the natural estimator (e.g. Kerscher et al. 2000) $\xi(s,\mu) + 1 = DD(s,\mu)/RR(s,\mu)$, where $DD(s)$ ($RR(s)$) is the number of galaxy (random) pairs at a separation $s$ in redshift space, and $\mu = \hat{s}\cdot\hat{z}$ (according to the distant observer approximation; see Sect. 3.4). The multipoles of the two-point correlation function are obtained as a Legendre decomposition, in analogy to Eq. (17). Figure 21 shows the comparison of the two-point correlation function of the two data sets (see e.g. Favole et al. 2017, for a clustering analysis of ELGs). The picture observed in Fourier space (see Fig. 17) is replicated here, in which a systematic bias (∼5%) can be seen in the quadrupole, albeit not statistically significant. The real-space correlation function and its monopole in redshift space agree very well with those of the reference. Importantly for the validation of this suite, the position and the amplitude of the baryonic acoustic peak are well preserved in the BAM mocks.
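The natural estimator and the multipole decomposition can be illustrated with a short sketch. It assumes pair counts that are already normalised and binned in (s, µ) with µ in [0, 1], and it assumes the standard Legendre decomposition ξ_ℓ(s) = (2ℓ+1) ∫₀¹ ξ(s, µ) P_ℓ(µ) dµ. The toy input is constructed from a known monopole and quadrupole and is not a measurement.

```python
def legendre(ell, mu):
    """Legendre polynomials P_ell(mu) for ell = 0, 2, 4."""
    if ell == 0:
        return 1.0
    if ell == 2:
        return 0.5 * (3.0 * mu**2 - 1.0)
    if ell == 4:
        return (35.0 * mu**4 - 30.0 * mu**2 + 3.0) / 8.0
    raise ValueError("only ell = 0, 2, 4 are implemented")

def natural_estimator(dd, rr):
    """xi(s, mu) = DD/RR - 1 for normalised pair counts dd[i][j], rr[i][j]."""
    return [[d / r - 1.0 for d, r in zip(dd_row, rr_row)]
            for dd_row, rr_row in zip(dd, rr)]

def multipoles(xi, ell):
    """xi_ell(s) = (2 ell + 1) * int_0^1 xi(s, mu) P_ell(mu) dmu (midpoint rule)."""
    n_mu = len(xi[0])
    dmu = 1.0 / n_mu
    mus = [(j + 0.5) * dmu for j in range(n_mu)]
    return [(2 * ell + 1) * sum(x * legendre(ell, mu) for x, mu in zip(row, mus)) * dmu
            for row in xi]

# Toy input: two separation bins with known monopole a and quadrupole b.
n_mu = 200
mus = [(j + 0.5) / n_mu for j in range(n_mu)]
a, b = [0.5, 0.1], [-0.2, -0.05]
rr = [[1.0] * n_mu for _ in a]
dd = [[1.0 + ai + bi * legendre(2, mu) for mu in mus] for ai, bi in zip(a, b)]
xi = natural_estimator(dd, rr)
xi0, xi2, xi4 = multipoles(xi, 0), multipoles(xi, 2), multipoles(xi, 4)
```

In this toy case the recovered `xi0` and `xi2` reproduce the input amplitudes up to the discretisation error of the µ binning, and `xi4` is consistent with zero.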
Marked statistics. We can jointly assess the clustering properties and the quality of the assignment of halo properties using marked statistics in Fourier space (see e.g. Balaguera-Antolínez […]), where $WW_\alpha(s)$ represents the number of galaxy pairs weighted by a given property $\alpha$ at separation $s$. Figure 22 shows the galaxy marked correlation function in real and redshift space, using the halo mass and velocity dispersion as marks, as obtained from the two suites of mocks. In general, the trend followed by the measurements from the two ensembles is consistent, showing how the BAM mocks can properly encode the information of galaxy bias as a function of different properties. Statistically significant differences appear when the halo mass is used as the mark, especially at small separations, in line with the trends observed in the power spectrum of Fig. 17.
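As an illustration of a mark-weighted pair count, the sketch below uses a common definition of the marked correlation function, M(s) = WW(s) / [DD(s) m̄²], where m̄ is the mean mark. This normalisation is an assumption and may differ from the one adopted in the paper. The brute-force pair loop, the non-periodic unit box and the random toy catalogue are likewise illustrative.

```python
import math
import random

def marked_correlation(positions, marks, edges):
    """M(s) per separation bin: mean of m_i*m_j over pairs, divided by mean(m)**2."""
    n_bins = len(edges) - 1
    ww = [0.0] * n_bins   # mark-weighted pair counts WW(s)
    dd = [0] * n_bins     # unweighted pair counts DD(s)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            s = math.dist(positions[i], positions[j])
            for b in range(n_bins):
                if edges[b] <= s < edges[b + 1]:
                    ww[b] += marks[i] * marks[j]
                    dd[b] += 1
                    break
    mean_mark = sum(marks) / len(marks)
    return [w / (d * mean_mark**2) if d > 0 else float("nan")
            for w, d in zip(ww, dd)]

# Toy catalogue: 200 random points in a unit box (hypothetical data).
rng = random.Random(5)
pos = [(rng.random(), rng.random(), rng.random()) for _ in range(200)]
edges = [0.0, 0.2, 0.4, 0.6]
m_const = marked_correlation(pos, [2.5] * 200, edges)                      # identical marks
m_rand = marked_correlation(pos, [rng.random() for _ in range(200)], edges)  # random marks
```

With identical marks the statistic equals unity in every bin, so departures from unity measure how the chosen property (e.g. halo mass or velocity dispersion) correlates with the clustering.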
Statistical compatibility with the reference suite. We verified the statistical compatibility between the two sets of mock catalogues by means of a Kolmogorov-Smirnov test (e.g. Press et al. 2002) based on the $\chi^2$ distributions of power spectra. The test, displaying p-values of ∼0.3, forecasts precise and accurate results (compared to those obtained from the reference N-body simulation) when the set of BAM mocks is subjected to a likelihood analysis (Chuang et al., in prep.).
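A two-sample Kolmogorov-Smirnov test of this kind can be sketched in a few lines. The version below computes the maximum distance D between the two empirical cumulative distributions and uses the asymptotic p-value approximation given in Numerical Recipes. The input samples are toy numbers standing in for the χ² values of the two ensembles, which are not reproduced here.

```python
import math

def ks_two_sample(a, b):
    """Two-sample KS statistic D and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n_a, n_b = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n_a and j < n_b:
        x = min(a[i], b[j])
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        d = max(d, abs(i / n_a - j / n_b))  # distance between empirical CDFs
    sqrt_ne = math.sqrt(n_a * n_b / (n_a + n_b))
    lam = (sqrt_ne + 0.12 + 0.11 / sqrt_ne) * d
    if lam < 0.2:
        return d, 1.0  # the series converges slowly here; p is 1 to high accuracy
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

# Toy samples (hypothetical): identical sets versus completely separated sets.
same_d, same_p = ks_two_sample(range(50), range(50))
far_d, far_p = ks_two_sample(range(50), range(100, 150))
```

A large p-value, such as the ∼0.3 quoted above, means that the hypothesis that both samples are drawn from the same distribution is not rejected.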
The main body of this paper is devoted to the construction of galaxy catalogues based on the generation of halo catalogues endowed with intrinsic properties (such as virial mass and velocity dispersion) to which an HOD prescription is applied. This procedure allows us to discriminate between central and satellite galaxies, and paves the way towards the assignment of galaxy properties by linking them with the properties of their host halos (see e.g. de Santi et al. 2022). Nevertheless, if the main goal is to generate galaxy catalogues without further properties, we could similarly have applied the BAM machinery directly using a set of reference galaxy catalogues as the training set. In Appendix C, we discuss this option and present the results of such a direct approach.

Discussions and conclusions
In this paper, we present the construction of mock galaxy catalogues based on the bias assignment method (BAM; Balaguera-Antolínez et al. 2019). We used the initial conditions of the reference N-body simulation (SLICS; Harnois-Déraps et al. 2018), down-sampled them, and evolved them using augmented Lagrangian perturbation theory (Kitaura & Hess 2013). This approximated density field is accurate enough on scales of $k \sim 0.4\,h\,{\rm Mpc}^{-1}$ to robustly study the bias relation between the cosmic web and the halo number counts. The remaining differences between the approximate gravity solver and the exact solution from an N-body simulation (on scales above the mesh resolution) are automatically taken into account in the effective halo bias extracted from the reference simulation. In particular, the approximated density field is used along with the corresponding halo catalogue of the reference N-body simulation to iteratively learn the halo bias and a kernel, which are the main outputs of the learning phase of the method. We show how the characterisation of the halo bias as a function of cosmic-web type and the implementation of a number of realisations as a training data set (so as to increase the effective volume of the reference simulation) can generate ensembles of independent realisations of halo number counts with ∼2%−5% precision in the two- and three-point statistics, as well as in the variance obtained from the corresponding covariance matrices.
We describe a procedure to assign halo coordinates, velocities, and intrinsic properties, with the aim of generating a set of 770 dark matter halo catalogues with the same number of properties as in the reference set. We assigned velocity dispersion and virial mass following a hierarchical and multi-scale approach (see Sect. 3.5) designed to replicate the abundance as a function of these properties. The halo two-point statistics replicate those of the reference with 5% precision at $k \sim 0.4\,h\,{\rm Mpc}^{-1}$, which is the maximum wavenumber adopted by the DESI mock challenge. We verified that covariance matrices of the two- and three-point statistics measured from the mock catalogues generated by BAM are in good qualitative agreement with those obtained from the reference N-body simulation. A thorough likelihood analysis using these covariance matrices will be performed by Chuang et al. (in prep.) as part of the DESI mock comparison project.
Based on this set of halo catalogues, we generated the same number of galaxy catalogues using an HOD prescription designed to replicate the abundance of emission-line galaxies. It is important to stress that the same HOD parameters were applied to both BAM and the N-body-based halo catalogues. The quality of the suite of mocks is assessed not only in Fourier space but also in configuration space through the correlation function and the marked correlation function.
Despite the good performance of the method in terms of the different statistical probes explored, we identified a number of items in which BAM has to be improved to reduce the deviations observed with respect to a reference simulation. These are:
- Peculiar velocities. Along with the density-dependent bias correction (described in Sect. 3.4), the small-scale clustering signal in redshift space demands further treatment of the peculiar velocities. This treatment starts with a thorough analysis of the velocity assignment, especially for random tracers, as shown in Fig. 8. Generalisations of such an approach to take into account halo properties and/or modification of the full population (dark matter and tracers) are part of this task. A deeper understanding of the origin of the different corrections in the velocities applied in this work is to be addressed in forthcoming publications.
- Assignment of halo properties. A thorough approach to this task is to impose the pair distribution of tracers as a function of the different halo properties. However, this is a highly expensive task. Although the multi-scale algorithm described in Appendix B is an improvement with respect to previous algorithms, further developments need to be investigated and implemented. To that end, we are currently including a second learning phase and using marked statistics as the main diagnostic. The goal is to replicate the clustering pattern observed in the reference as a function of all halo properties.
In general, the accurate performance of BAM does not depend only on these planned improvements. It also depends on the characteristics of the reference simulation used as the training data set. The SLICS, with a relatively small cosmological volume and initial conditions generated from Gaussian random fields, is highly prone to cosmic variance. This is reflected in the limited statistical information contained in a halo bias obtained from only one reference, which in turn can lead to inaccuracies in the generation of mock halo catalogues with different seeds. To take this into account, we extended the method to use more than one reference simulation (see Eq. (15)) as the training data set.
In Appendix D, we describe our motivation to implement a reference simulation based on initial conditions with variance suppression (fixed amplitude initial conditions; see e.g. Angulo & Pontzen 2016; Chuang et al. 2019; Maion et al. 2022) covering larger cosmological volumes. This scenario can substantially improve the accuracy in the two- and three-point statistics, as well as the procedure to assign halo properties. Such a setup will also allow the method to extrapolate the generation of mock catalogues to volumes larger than that of the reference.
The present paper demonstrates the potential of BAM to speedily deliver mock halo catalogues (with a number of properties) that are both flexible and accurate enough to implement any mechanism to generate galaxy samples within the context of the halo model. The next step is the generation of larger sets of mock halo catalogues (larger cosmological volumes and light cones) with more halo properties (e.g. spin, concentration, maximum circular velocity) on which different methods for galaxy occupation and selection functions can be imposed to replicate the sky observed by different experiments.
1. We randomly select one tracer (without assigned property) from the mock, and then identify the fiducial cell where it resides and the corresponding set of properties $\{\Theta\}_{\rm mock}$.
2. Returning to the list $\{\sigma_v\}(\{\Theta\}_{\rm ref})$, we randomly assign any of the corresponding values still available (i.e. with $\sigma_v < \sigma_v^{\rm th}$).
Finally, if there are still tracers with no assigned properties (e.g. due to cosmic variance), BAM statistically assigns values of velocity dispersion by sampling the global halo abundance measured from one reference catalogue. This last case represents ∼5% of the total assignment, depending on the realisation. For the current case, we implemented N = 3 levels with thresholds at $\sigma_v^{\rm th}$ = 4, 8 and 10 km/s. Once velocity dispersion is assigned, we measure the scaling relation $P(M_{\rm vir}|\sigma_v, \{\Theta\})$ (using one realisation of the reference set) to assign virial masses according to Eq. (B.1). For this step, we implemented, along with the information of the velocity dispersion, the local dark matter density and the cosmic-web classification. The scheme can be generalised to any other set of halo properties tabulated in the reference catalogue (e.g. spin, concentration, maximum circular velocity).
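The assignment steps listed above can be sketched in simplified form. The example below handles a single level (the threshold hierarchy is omitted), represents the property set {Θ} as a hashable bin label, and falls back to a draw from the global reference distribution when no reference value is left for a bin. The function name, the bin labels and the numbers are hypothetical and do not correspond to the BAM implementation.

```python
import random
from collections import defaultdict

def assign_velocity_dispersion(mock_theta, ref_theta, ref_sigma, rng):
    """Assign sigma_v to mock tracers by drawing, without replacement, from
    reference values that share the same property bin Theta."""
    pool = defaultdict(list)
    for theta, sigma in zip(ref_theta, ref_sigma):
        pool[theta].append(sigma)
    for values in pool.values():
        rng.shuffle(values)
    order = list(range(len(mock_theta)))
    rng.shuffle(order)                           # visit tracers in random order
    assigned = [None] * len(mock_theta)
    for i in order:
        values = pool.get(mock_theta[i])
        if values:
            assigned[i] = values.pop()           # this value is no longer available
        else:
            assigned[i] = rng.choice(ref_sigma)  # fallback: global reference abundance
    return assigned

# Toy example: bins labelled by (density bin, cosmic-web type).
rng = random.Random(2)
ref_theta = [(0, "knot"), (0, "knot"), (1, "void")]
ref_sigma = [300.0, 250.0, 80.0]
mock_theta = [(1, "void"), (0, "knot"), (0, "knot"), (2, "sheet")]
sigma_mock = assign_velocity_dispersion(mock_theta, ref_theta, ref_sigma, rng)
```

Drawing without replacement within each bin reproduces the reference abundance bin by bin; the fallback branch plays the role of the global sampling used for tracers left without a value.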

Appendix C: Calibration using galaxy catalogues
In this Appendix, we assess the precision of mock galaxy catalogues built using the set of ELG catalogues described in Sect. 4 as a reference (i.e. SLICS halos plus HOD). Following the procedures described in Sects. 2.7 and 3.2 and using the TkWEB model to characterise the galaxy bias in BAM, we generated 770 realisations of galaxy number counts. Coordinates of DM particles are used to define the positions of the galaxy-type tracers, while random tracers are similarly introduced, collapsing them towards their closest DM particles. Numerical tests indicate that for the current galaxy population, a fraction $f_{\rm col} = 0.05$ has to be used to obtain per cent accuracy in the real-space mean power spectrum on small scales. We note that this collapsing fraction is smaller than that used for halos (∼0.35), which is expected, as galaxies populate smaller scales than their parent halos, thus generating the need for a stronger collapse of the random set.
The assignment of velocities to this set of tracers is not evident, as no clear identification of centrals or satellites is available. In order to match the large-scale clustering signal, bulk velocities can be assigned as shown in Sect. 3.4 with a constant velocity bias of ∼20%. On the other hand, small-scale clustering can be replicated by adding to the bulk velocities a random component drawn from a normal distribution, with a width that can vary among the different cosmic-web types (see Sect. 3.4). Such a random component can be added to each component of the velocity (as would be the case of galaxies inside dark matter halos) or to the magnitude of the velocity, a situation that can be linked to parent halos in the mildly non-linear regime (Hikage & Yamamoto 2016; Zheng et al. 2019). Either of these options can be used to replicate the small-scale redshift-space clustering signal up to $k \sim 0.4\,h\,{\rm Mpc}^{-1}$. For example, assigning a random component to each velocity component in knots demands a velocity dispersion of σ ∼ 350 km/s, while adding the noise and keeping the direction of the velocity of each tracer fixed demands σ ∼ 740 km/s. Here, we do not aim to precisely determine the best scenario for the assignment of random components to galaxy-like tracers. For the current test, we used σ = 350 km/s applied only to galaxies in knots.
Figure C.1 shows the comparison of different summary statistics in Fourier space for the set of BAM galaxies generated from halo catalogues (BAMh) and the BAM set generated directly from the calibration to the SLICS galaxy catalogues (BAMg) described in Sect. 4. We can summarise the results of this comparison as follows:
- In real space (first column of Fig. C.1), the BAMg set yields a more accurate description of the mean power spectrum, especially towards small scales ($k \sim 0.4\,h\,{\rm Mpc}^{-1}$). The variance of the power spectrum is also improved. With respect to the reference correlation matrix, this quantity displays less mode coupling for the BAMg than the BAMh set. In general, the improvement in real space of the set BAMg is due to the absence of the inaccuracies in the galaxy distribution that are associated with the assignment of halo properties in the BAMh case.
- The behaviour in redshift space from the BAMg is consistent with that from BAMh. Noticeable differences are (i) the mean of the quadrupole on scales $k \geq 0.2\,h\,{\rm Mpc}^{-1}$ and (ii) the correlation matrix, where the set BAMg displays less extra coupling than seen in BAMh.
These results are remarkable for two reasons: On the one hand, the fact that the BAMh set is consistent with the BAMg highlights the ability of the method to generate galaxy catalogues with information on the hosting halos without a significant decrement in the precision of two-point statistics. On the other hand, the behaviour of the BAMg set shows how the method can be further adapted to generate galaxy catalogues by directly learning from a reference galaxy catalogue containing galaxy velocities based on theoretical models.
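The two options for the random velocity component discussed in this Appendix (noise added to each Cartesian component, or noise added to the magnitude with the direction kept fixed) can be sketched as follows. The function names and the example bulk velocity are hypothetical; the dispersions are the values quoted in the text.

```python
import math
import random

def add_dispersion_per_component(v, sigma, rng):
    """Option (a): add independent N(0, sigma) noise to each Cartesian component."""
    return tuple(c + rng.gauss(0.0, sigma) for c in v)

def add_dispersion_to_magnitude(v, sigma, rng):
    """Option (b): perturb |v| by N(0, sigma) while keeping the velocity on the
    same line; a large negative draw reverses the sense along that line."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return v
    scale = (norm + rng.gauss(0.0, sigma)) / norm
    return tuple(c * scale for c in v)

rng = random.Random(3)
v_bulk = (200.0, -100.0, 50.0)                           # km/s, hypothetical bulk velocity
v_a = add_dispersion_per_component(v_bulk, 350.0, rng)   # sigma quoted for option (a)
v_b = add_dispersion_to_magnitude(v_bulk, 740.0, rng)    # sigma quoted for option (b)
```

Option (b) leaves the velocity parallel to the bulk flow, which is why a larger dispersion is needed to produce a comparable scatter along the line of sight.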

Fig. 2. Slices, 25 Mpc $h^{-1}$ thick, through the different density fields involved in the calibration of BAM and its products. The bottom panel shows the reconstruction of the halo number density field using different models of halo bias (see Sect. 2.6). The rightmost column shows the galaxy density field from the reference and from BAM, built by populating halo catalogues with a model of halo occupation distribution (see Sect. 4).

Fig. 3. Summary statistics of the calibration procedure in BAM based on one SLICS realisation. Panel (a): Residuals computed from the reference power spectrum and the halo power spectrum at different iterations within the calibration procedure. Absolute residuals (see Eq. (13)) show that the calibration leads to a precise (<1%) reconstruction of the halo number counts and its two-point statistics, while relative values show that the deviation around the reference is randomly distributed, with a ∼0.15% amplitude. The different lines in each case show the behaviour under different models (TkWEB, IkWEB, TIWEB) of halo bias (see Sect. 2.6 for details). Panel (b) shows the power spectrum from the reconstructed halo number counts field in each of the halo models. Panel (c) shows the transfer function $T_i(k)$ computed as in Eq. (11), evaluated at the last iteration of the calibration procedure; the shaded area denotes the 3% deviation around unity. Panel (d) shows the BAM kernel computed using Eq. (12).

Fig. 5. Correlation coefficients of the halo power spectrum obtained from a set of 80 realisations of halo number counts using the SLICS and BAM mock catalogues, calibrated with different numbers of references $N_{\rm ref}$ and two characterisations of the properties of the dark matter density field (TkWEB and TIWEB).

Fig. 7. Comparison of the signal of reduced bispectrum obtained from 80 mock catalogues generated with BAM (using the TkWEB model) and the same signal from the reference set, for several triangle configurations, which are specified in each panel. The left column shows the ratio to the reference (RTF) of the mean (solid lines) and variance (dashed lines); the shaded areas in the panels of this column denote a 20% deviation from unity. The right column shows two elements of the correlation coefficients of the bispectrum $r_{ij}$.

Fig. 9. Power spectrum of halo catalogues in real space $P(k)$ (panels (a) and (b)) and redshift space, with the latter represented by the monopole $P_0(k)$ (panels (c) and (d)), the quadrupole $P_2(k)$ (panels (e) and (f)), and the hexadecapole $P_4(k)$ (panels (g) and (h)). Panels (a), (c), (e), and (g) show the mean power spectrum from the 80 SLICS realisations (grey dashed line) and the mean from the same number of BAM mocks (solid blue lines). Panels (b), (d), (f), and (h) show the ratio (RTR) of the BAM mean spectrum to that of the reference. The shaded areas denote the standard deviation computed from the mean and variance of each set.

Fig. 10. Example of the distribution of halo peculiar velocities $v = |u|$ in one reference (solid black histogram) and one BAM (solid blue and filled histogram) halo catalogue, in different types of cosmic web (see Sect. 3.3). In all panels, the corrected histogram is obtained after applying the isotropic velocity correction described in Sect. 3.3.

Fig. 13. Cumulative halo abundance as a function of virial mass $M_{\rm vir}$ (left column) and velocity dispersion $\sigma_v$ (right column) obtained in different cosmic-web types (normalised to the number of halos in each cosmic-web type). Error bars indicate the mean and standard deviations computed from 80 realisations.

Fig. 15. Correlation matrix of power spectrum. Left column: Ratio between the variance of the power spectrum from the BAM mocks and that measured from the SLICS in different halo-mass bins. Right column: Correlation coefficients obtained from the BAM mock halo catalogues and the SLICS references computed in real and redshift space, with the latter expressed through the monopole $P_0(k)$, the quadrupole $P_2(k)$, and the hexadecapole $P_4(k)$. Three bins of halo mass are shown (rows).

Fig. 16. Mean number of galaxies (central and satellite) from the BAM and SLICS ensembles as a function of host halo mass. The points with error bars denote the mean and variance from each ensemble.
Fig. 17. Same as Fig. 14 but for the mock galaxy catalogues generated as explained in Sect. 4.
Fig. 19. Same as Fig. 15 but for the central galaxy population.

Fig. 20. Same as Fig. 15 but for the satellite galaxy population.

Fig. 21. Galaxy correlation function in real space $\xi(r)$ (top panel) and redshift space in the form of monopole $\xi_0(s)$, quadrupole $\xi_2(s)$, and hexadecapole $\xi_4(s)$. The solid line and shaded area denote the mean and sample variance from the SLICS simulation, respectively.

Fig. 22. Galaxy marked correlation in real space $M(r)$ and redshift space $M(s)$. The top panel shows the result of using the halo virial mass as the mark. The middle panel uses the velocity dispersion and the third shows the results of using the cross-marked correlation function.
$\ldots_{\rm vir}\,|\,\sigma^{i}_{v} = \sigma^{\rm ref}_{v},\ \{\Theta\}^{\rm mock}_{j} = \{\Theta^{\rm ref}\})$. (B.1)
Fig. C.1. Comparison of the summary statistics of the set of BAM galaxies generated from the halo catalogues (BAMh) with those of the BAM galaxies generated from calibration (BAMg) obtained from the SLICS galaxy catalogue. The top row shows the ratios to the references (RTF) from the mean and variance of the power spectrum (real and redshift space). The middle row shows the absolute value of the difference between the correlation coefficients $r_{ij}$ of the BAM sets (BAMg, upper diagonal; BAMh, lower diagonal) and those from the SLICS. The third row shows the elements of the correlation coefficients (in only one Fourier bin, to avoid clutter) from the three sets.