Impact of Property Covariance on Cluster Weak lensing Scaling Relations

We present an investigation into a hitherto unexplored systematic that affects the accuracy of galaxy cluster mass estimates with weak gravitational lensing. Specifically, we study the covariance between the weak lensing signal, $\Delta\Sigma$, and the"true"cluster galaxy number count, $N_{\rm gal}$, as measured within a spherical volume that is void of projection effects. By quantifying the impact of this covariance on mass calibration, this work reveals a significant source of systematic uncertainty. Using the MDPL2 simulation with galaxies traced by the SAGE semi-analytic model, we measure the intrinsic property covariance between these observables within the 3D vicinity of the cluster, spanning a range of dynamical mass and redshift values relevant for optical cluster surveys. Our results reveal a negative covariance at small radial scales ($R \lesssim R_{\rm 200c}$) and a null covariance at large scales ($R \gtrsim R_{\rm 200c}$) across most mass and redshift bins. We also find that this covariance results in a $2-3\%$ bias in the halo mass estimates in most bins. Furthermore, by modeling $N_{\rm gal}$ and $\Delta\Sigma$ as multi-(log)-linear equations of secondary halo properties, we provide a quantitative explanation for the physical origin of the negative covariance at small scales. Specifically, we demonstrate that the $N_{\rm gal}$-$\Delta\Sigma$ covariance can be explained by the secondary properties of halos that probe their formation history. We attribute the difference between our results and the positive bias seen in other works with (mock)-cluster finders to projection effects. These findings highlight the importance of accounting for the covariance between observables in cluster mass estimation, which is crucial for obtaining accurate constraints on cosmological parameters.


INTRODUCTION
Cluster abundance and its evolution with redshift are linked to the constituents of the Universe through the growth of cosmic structure.(Allen et al. 2011, for a review).Cluster abundance measured in large-scale galaxy surveys offers power constraints on cosmological parameters (e.g., Vikhlinin et al. 2009;Mantz et al. 2015;de Haan et al. 2016;Mantz et al. 2016a;Dark Energy Survey Collaboration et al. 2016;Pierre et al. 2016;Costanzi et al. 2021).These constraints are based on accurate cluster mass measurements, which are not directly observable and must be inferred.Cluster mass calibration has been identified as one of the leading systematic uncertainties in cosmological constraints using galaxy cluster abundance (Mantz et al. 2010;Rozo et al. 2010;von der Linden et al. 2014;Applegate et al. 2014;Dodelson et al. 2016;Murata et al. 2019;Costanzi et al. 2021).Accurate mappings between a population of massive clusters and ★ E-mail:zzhang13@uchicago.edu their observables are thus critical and essential in cluster cosmology.Considerable effort has been put into measuring the statistical relationships between masses and observable properties that reflect their baryon contents (see Giodini et al. 2013, for a review) and quantifying the sources of uncertainties.
The Dark Energy Survey (DES) cluster cosmology from the Year 1 dataset (Abbott et al. 2020) reported tension in Ω  -the mean matter density of the universe -with the DES 3x2pt probe that utilizes three two-point functions from the DES galaxy survey (Abbott et al. 2018).The tension between these two probes that utilize the same underlying dataset may be attributed to systematics that bias the weak lensing mass of clusters low at the low mass end (To et al. 2021;Costanzi et al. 2021).A possible origin for this discrepancy is that cluster masses are biased low due to systematics in cluster mass calibration.On the other hand, the tension can also originate from new physics that extends the Standard Cosmological model.Thus, it is important to understand the systematics of cluster mass calibration.Cluster masses estimated from X-ray and SZ data are known  to suffer from hydrostatic bias (Pratt et al. 2019).Conversely, cluster masses estimated from weak lensing have the potential to be more accurate compared to X-ray and SZ cluster masses.The systematics in the weak lensing mass calibration has just started to be explored recently (Applegate et al. 2014;Schrabback et al. 2018;McClintock et al. 2019;Kiiveri et al. 2021;Wu et al. 2022).
A relatively unexplored category of cluster systematics is the covariance between different cluster properties, including cluster observables and mass proxies.In cluster mass calibration, it is often assumed that this property covariance is negligible.However, as initially pointed out by Nord et al. (2008) and later shown in Evrard et al. (2014) and Farahi et al. (2018a), non-zero property covariances between cluster observables can induce non-trivial, additive bias in cluster mass.As property covariance is additive, the systematic uncertainties that it induces will not be mitigated with the reduction of statistical errors as the sample size of the cluster increases.To achieve accurate cosmological constraints with the next generation of largescale cluster surveys, it is imperative that systematic uncertainties in the property covariance be accurately and precisely quantified (Rozo et al. 2014).
Although the property covariance linking mass to observable properties is becoming better understood and measured (Wu et al. 2015;Mantz et al. 2016a;Farahi et al. 2018aFarahi et al. , 2019;;Sereno et al. 2020), studies that specifically investigate weak lensing property covariance are scarce, which poses a challenge for upcoming lensing surveys of galaxy clusters such as the Rubin (Ivezić et al. 2019) observatories.To achieve the percentage-level lensing mass calibration goals for the upcoming observations, the property covariance of weak lensing must be quantified.
The physical origins of property covariance in lensing signals of galaxy clusters can be attributed to the halo formation history of the cluster and baryonic physics (Xhakaj et al. 2022).Developing a firstprinciple physical model for the property covariance as a function of halo formation history and baryonic physics is a daunting task due to the highly non-linear and multi-scale physics involved in cluster formation.To make progress, in this paper, we adopt a simulationbased, data-driven approach whereby we develop semi-analytical parametric models of property covariance, which we then calibrate with cosmological simulations.We then apply our model to quantify the bias induced due to a non-zero property covariance in the expected weak lensing signal and the mass-observable scaling relation.
As will be presented in §3, a key element of this analysis is the estimation of true cluster richness by encircling clusters within a 3D radius within the physical vicinity of the halo center, as opposed to a 2D projected radius used by cluster finders as redMaPPer (Rykoff et al. 2014a) by identifying galaxies within the red-sequence band in the color-magnitude space -the major difference being the removal of projection effects, or the mis-identification of non-cluster galaxies in the 2D projected radius from the photometric redshift uncertainty of the red-sequence when estimating the true richness from a gravitationally bound region around the halo.Furthermore, as this simulation-based study does not introduce other observational systematics as shape noise of galaxies, point spread function, miscentering, among others, this study can be used to explore the intrinsic covariance between observables prior to the addition of extrinsic systematics as projection effects.Our results will not only provide insight into the physical origin of the covariance, the difference between the total covariance as measured by observations and the intrinsic covariance will provide estimates on the amplitude of the extrinsic component.The goals of this work are to (i) develop an analytical model that accounts for and quantifies the effect of non-zero covariance on cluster mass calibration, (ii) quantify this property covariance utilizing cosmological simulations, (iii) update uncertainties on inferred cluster mass estimates, and (iv) explain the physical origin of the covariance using secondary halo parameters.The rest of this paper is organized as follows.In §2, we present a population-based analytical framework.In §3, we describe the simulations and data-vector employed in this work.In §4, we present our measurements and the covariance model.In §5, we present the impact of the covariance on weak lensing mass calibration.In §6, we quantify the physical origin of the covariance by parameterizing it using secondary halo parameters.In §7, we compare our work with those that employ realistic cluster finders.We conclude in §8.

THEORETICAL FRAMEWORK
This section presents a theoretical framework that examines the impact of covariance on mass-observable scaling relations.In §2.1 we introduce the definitions of richness and weak lensing excess surface mass density and their scaling relations with cluster mass.We then describe the model of property covariance of richness and excess surface mass density in §2.2.In §2.3, we model the impact of covariance on stacked observable scaling relations.Finally, in §2.4,we develop a theoretical framework that explains the covariance based on a set of secondary halo parameters.A graphic representation of the outline of the paper is shown in Figure 1.The notations used in this section to describe the covariance are listed in Table 1 and notations for scaling relations are listed in Table 2.

Excess Surface Mass Density ΔΣ from Weak Lensing
In weak lensing measurements of galaxy clusters, the key observable is the excess surface mass density, denoted ΔΣ.The excess surface mass density is defined as where Σ(, , <   ) denotes the average surface mass density within projected radius   , and Σ(, ,   ) represents the average of the surface mass density at   .We model the average surface mass density Σ as where   is the mean matter density at the redshift of the cluster,  is the projected radius in the plane of the sky,  is the comoving distance along the line of sight centered around the cluster, and  ℎ () is the halo-matter correlation function which characterises the total mass density within a halo.Under the halo model, the halomatter correlation function consists of a "one-halo" term: and a "two-halo" term: where  NFW is the Navarro-Frenk-White (NFW) density profile (Navarro et al. 1997), and  lin is the linear matter correlation function, and  is the halo bias parameter.
In weak lensing, the excess surface density ΔΣ is tied to the tangential shear   of the galaxies relative to the center of each foreground halo by the relation where the critical surface density Σ crit defined as and where   ,   , and   refer to the angular diameter distances to the source, the lens, and between the lens and source, respectively.In this work, for each halo of mass  at redshift , we compute the corresponding ΔΣ profile.We compare these measurements with theoretical predictions -in the one-halo regime we model the cluster overdensity as NFW profiles with their concentration determined by concentration-mass models of Prada et al. (2012), Ludlow et al. (2016) and Diemer & Joyce (2019), whereas in the two-halo term we adopt the linear matter correlation   multiplied by halo biases using the Tinker et al. (2010), Pillepich et al. (2010) and Bhattacharya et al. (2011) models to derive the halo-matter correlation  ℎ .At the transition radius between the one-and two-halo regimes, we follow SDSS (Zu et al. 2014) in setting the halo-matter correlation to the maximum value of the two terms, i.e., In Fig. 2, the theoretical models described above are compared with our measurements of ΔΣ in cosmological simulations to validate our data product.We model the mean ⟨ΔΣ | , ,   ⟩ at fixed mass , redshift , and projected radius   as a log-linear relation given by where  ΔΣ is the power-law slope of the relation and  ΔΣ is a normalization that is a function of redshift and mass.

Optical Richness 𝑁 gal
Optical richness  gal is an observable measure of the abundance of galaxies within a galaxy cluster.It is often defined as the number of detected member galaxies brighter than a certain luminosity threshold within a given aperture or radius around the cluster centre.
Richness is often used as a proxy for cluster mass, as more massive clusters are expected to have more member galaxies (e.g., Rozo et al. 2014;Rykoff et al. 2014a).The richness-mass scaling relation relates the richness of a galaxy cluster to its mass.In this work, we consider the mean  gal - Δ scaling relation expressed as where  Δ is the mass of the halo within a radius where the mean density is Δ times the critical density of the universe,   gal (, ) is the power-law slope of the relation,   gal (, ) is a normalisation that is a function of redshift and mass.

Halo Mass and Radius Definitions
A common approach to defining a radial boundary of a galaxy cluster is such that the average matter density inside a given radius is the product of a reference overdensity Δ ref times the critical (  ) or mean density (  ) of the universe at that redshift.The critical density is defined as where  () 2 = Ω ,0 (1 + ) 3 + Ω Λ,0 , Ω ,0 is the present day matter fraction of the universe, Ω Λ,0 is the dark energy fraction at the present age such that Ω ,0 +Ω Λ,0 = 1 for a flat universe ignoring the minimal contribution from the radiation fraction.The mean (background) density is defined as The overdensity Δ  = 200 is commonly chosen as the reference overdensity in optical weak lensing studies and is closely related to the virial radius.Another radius definition is the virial radius  vir , with overdensity values calibrated from cosmological simulations (Bryan & Norman 1998).In this work, we use  200c and  vir to scale various observations, including the ΔΣ measurements and richness.Since the covariance is close to zero at the outskirts  ≳  200c as shown in §4, we adopt   = / 200c and   = / vir as our normalised radii, as the cluster properties are more self-similar with respect to   () compared to   () (Diemer & Kravtsov 2014;Lau et al. 2015).To test for the robustness of our covariance against different radii definitions, we also introduce a physical radius of a toy model of a constant  = 1 Mpc/h; here ℎ = 0.6777 is the reduced Hubble constant used in this study.

Covariance between ΔΣ and 𝑁 gal
In optical surveys, we cannot expect the covariance between richness  gal and the excess surface mass density ΔΣ to be zero.Ignoring this covariance will lead to bias in cluster mass inferred from the excess surface mass density of the cluster selected based on richness.This work aims to quantify and analyse this covariance and its impact on the mass calibration relation.To achieve this objective, we must first specify the joint probability distribution of excess surface mass density and richness, (ΔΣ, ln  gal | , ,   ).In this work, we assume that the joint distribution follows a multivariate normal distribution (Stanek et al. 2010;Evrard et al. 2014;Mulroy et al. 2019;Miyatake et al. 2022), which is fully specified with two components, the mean vector and the property covariance.We have checked the goodness of this assumption in Appendix D.
From the mean observable-mass scaling relations in Equation ( 8) and Equation ( 9), the scaling relation between these two observables can be modeled as a local linear relation given by where  and  are the normalization and slope of the model.
The property covariance matrix is a combination of scatter and correlation between the scatter of ΔΣ and ln  gal at a fixed halo mass, redshift, and projected radius.We use   gal (, ) and  ΔΣ (, ,   ) to denote the scatter of the observable-mass relation for ln  gal and ΔΣ, respectively, and use   gal ,ΔΣ (, ,   ) to denote the correlation between these scatters.The covariance matrix is then given by Cov ,  (, ,   ) =  ,  (, ,   )   (, ,   )   (, ,   ), (13) where  and  ∈ {ΔΣ, ln  gal }.Specifically, the covariance between ΔΣ and  gal can be expressed in terms of the residuals about the mean quantities Cov ΔΣ,  gal (, ,   ) = Cov(res ΔΣ (, ,   ), res  gal (, )), (14) where the residuals of the ΔΣ and  gal are, respectively, are given by res To model the mass dependencies of the mean profiles of ΔΣ and ln  gal , we employ the Kernel Localised Linear Regression (KLLR, Farahi et al. 2022a) method.This regression method fits a locally linear model while capturing globally non-linear trends in data and has shown to be effective in modeling halo mass dependencies in scaling relations (Farahi et al. 2018a;Wu et al. 2022;Anbajagane et al. 2022).By developing a local-linear model of ΔΣ − ln  gal with respect to the halo mass and computing the residuals about the mean relation, we remove the bias in the scatter due to the mass dependence and reduce the overall size of the scatter.As shown in Fig. 4 the 1 −  of the covariance is determined by bootstrap resampling.

Corrections to the ΔΣ − 𝑁 gal relation due to Covariance
The shape of the halo mass function plays an important role in evaluating the conditional mean value of ⟨ΔΣ |  gal , ,   ⟩ where the scatter between two observables with a fixed halo mass is correlated.Ignoring the contribution from the correlated scatter, to the zeroth order, the expected ΔΣ evaluated at fixed richness is given by Equation ( 12).This is the model that has been used in mass calibration with stacked weak lensing profiles (Johnston et al. 2007;Kettula et al. 2015;McClintock et al. 2019;Chiu et al. 2020;Lesci et al. 2022). The ) −1 is the compression factor due to curvature of the HMF, the subscript 1 denoting that the scatter is taken from the HMF expanded to first order; here we omit the (, ) dependence of the covariance as a shorthand notation.These expansions around the pivot mass are for halos centered around a narrow enough mass bin.We show explicitly in Fig. 9 that the first-order expansion converges using our binning method.The derivations for the first and secondorder expansion terms can be found in Evrard et al. (2014) and Farahi et al. (2018b) and the derivation for this particular expression of the second-order term is shown in Appendix C.
Here  1 and  2 are the parameters for the first and second-order approximations to the mass dependence of the halo mass function (e.g., Evrard et al. 2014 These property covariance correction terms are absent in the current literature.A key feature of this approximation method is that the second-order solution has better than percent-level accuracy when the halo mass function is known (Farahi et al. 2016).In Figure 9, we demonstrate that the statistical uncertainties for the first-order correction in Equation ( 17) is larger than the uncertainty in the halo mass function and the uncertainty due to the second-order halo mass approximation.

Secondary halo parameter dependence of
Cov(ΔΣ, ln  gal |, ) We elucidate the physical origin of the covariance between ΔΣ and ln  gal by developing a phenomenological model based on the secondary halo parameters listed in Table 3.These parameters are computed from the ROCKSTAR halo finder (Behroozi et al. 2013).They capture the halo's mass accretion history , which we hypothesise is the driving force behind the observed covariance.To incorporate these parameters into our model, we extend Equations ( 8) and ( 9) Offset of density peak from mean particle position (kpc ℎ −1 ) by introducing multi-linear terms that include the secondary halo parameters denoted by the vector Π: where Π is a vector of secondary halo parameters of potential interest listed in Table 3, and  ΔΣ and   gal are normally distributed intrinsic scatter terms with zero means and uncorrelated variances.In Appendix E we show that the residual conditioned on secondary halo parameters can largely be assumed to be Gaussian.Additionally, we assume ⟨  gal  ΔΣ ⟩ = 0, which implies that the scatter about the mean relations is uncorrelated after factoring in the secondary properties.Due to the bilinearity and distributive properties of covariance, combining Equation (20) and Equation ( 21) yields: where we omit the explicit ΔΣ (, ) to simplify the notation.The KLLR method is utilised to estimate these mass-dependent parameters.All terms involving ⟨ΔΣ⟩ and ⟨ln  gal ⟩ vanish, as these terms are independent of Π by definition.Terms involving   gal and  ΔΣ also go to zero, as they are uncorrelated Gaussian scatters.Only the term Cov( ì •Π) remains, and hence our final expression for the covariance is To estimate the error in the covariance due to each of the secondary halo parameters, we compute Cov(ΔΣ, Π  |, ) for each secondary halo parameter  in each (  , , ) bin and take their standard deviations as the error measurement.Modeling the richness-mass relation as in Equation ( 20) and using the same derivation as in Equation ( 22 We test the validity of this model by checking how well the secondary halo parameters can explain covariance between lensing and richness in §6.After subtracting the covariance from each of the Π  terms, the full covariance should be consistent with null, given the uncertainty.Our results confirm that the dependency of secondary halo parameters can indeed explain the covariance.

DATASET AND MEASUREMENTS
In this section, we describe the measurements on the individual ingredients that make up the covariance -ΔΣ the lensing signal in §3.1 and ln  gal the log-richness measurement in §3.2.

Measurements of ΔΣ
We employ the MultiDark Planck 2 (MDPL2) cosmological simulation (Klypin et al. 2016) to measure halo properties.The MDPL2 is a gravity-only -body simulation, consisting of 3840 3 particles in a periodic box with a side length of  box = 1 ℎ −1 Gpc, yielding a particle mass resolution of approximately  p ≈ 1.51 × 10 9 ℎ −1  ⊙ .
The simulation was conducted with a flat ΛCDM cosmology similar to Planck Collaboration et al. ( 2014), with the following parameters: ℎ = 0.6777, Ω m = 0.307115, Ω Λ = 0.692885,  8 = 0.829, and  s = 0.96.We use the surface over-density of down-sampled dark matter particles to measure the weak lensing signal.We selected cluster-sized halos using the ROCKSTAR (Behroozi et al. 2013) halo catalogue, which includes the primary halo property of mass and redshift and a set of secondary halo properties listed in Table 3 that we utilise in this analysis.To capture the contribution of both the one-and two-halo terms to  ℎ , we use a projection depth of   = 200 Mpc ℎ −1 to calculate ΔΣ (e.g., Costanzi et al. 2019;Sunayama et al. 2020).The MDPL2 data products are publicly available through the MultiDark Database (Riebe et al. 2013) and can be downloaded from the CosmoSim website 1 .The excess overdensity, ΔΣ, is calculated by integrating the masses of the dark matter particles in annuli of increasing radius centred around the halo centre.However, since clusters do not have a welldefined boundary, we compare the results of two radial binning schemes.The first scheme uses 20 equally log-spaced ratios between 0.1 and 10 times  vir , while the second scheme spans 0.1 to 10 times  200c .We consider the measurements binned at  200c as our final results to be consistent with the weak lensing literature.Figure 2 shows that our measurements are consistent with most models of the concentration-mass and halo-bias models at a 1 level.
At a projection depth of   = 200 Mpc ℎ −1 , the projection effects can be modeled as a multiplicative bias (Sunayama 2023).In Sunayama (2023) the projection effects on ΔΣ are modelled as ΔΣ obs = (1 + )ΔΣ true , where  = 18.4 ± 8.6%.Although the multiplicative bias of projection effects may increase the amplitude of Cov(ΔΣ, ln  gal ) by a factor of (1 + ), we argue that it does not introduce an additive bias into our model for Cov(ΔΣ, ln  gal ).This is because, under the richness model and ΔΣ in Equations ( 20) and ( 21), only terms in richness that are correlated with projection effects will contribute to the covariance.As demonstrated in §3.2, we enclose the halo within a 3D physical radius, so the number count  gal of galaxies should not include projection effects.Therefore, projection effects should not introduce an additive bias to our covariance.
To remove the 2D integrated background density, we first computed the background density of the universe (  ) at the cluster redshift using the cosmological parameters of the MDPL2 simulation.The integrated 2-D background density is given by Σ  = 2    , where factor 2 comes from the integration of the foreground and background densities.

Dataset for 𝑁 gal -SAGE galaxy catalog
The Semi-Analytic GALAXY Evolution (SAGE) is a catalogue of galaxies within MDPL2, generated through a post-processing step that places galaxies onto N-body simulations.This approach, known as a semi-analytic model (SAM), is computationally efficient compared to hydrodynamical simulations with fully self-consistent baryonic physics.SAMs reduce the computational time required by two to three orders of magnitude, allowing us to populate the entire 1 (Gpc/h) 3 simulation volume with galaxies.SAGE's statistical power enables us to conduct stacked weak lensing analyses.
The baryonic prescription of SAGE is based on the work of Croton et al. (2016), which includes updated physics in baryonic transfer processes such as gas infall, cooling, heating, and reionization.It also includes an intra-cluster star component for central galaxies and addresses the orphan galaxy problem by adding the stellar mass of disrupted satellite galaxies as intra-"cluster" mass.SAGE's primary data constraint is the stellar mass function at  = 0. Secondary constraints include the star formation rate density history (Somerville et al. 2001), the Baryonic Tully-Fisher relation (Stark et al. 2009), the mass metallicity relation of galaxies (Tremonti et al. 2004), and the black hole-bulge mass relation (Stark et al. 2009).

Model for 𝑁 gal
To determine the number of galaxies inside a cluster-sized halo, we utilise the SAGE semi-analytic model and compute the total number of galaxies within a 3D radius around the halo centre.We compare the true richness ( gal ) to  200c scaling relations between different models and the observed richness-mass relations from Costanzi et al. (2021) using data from the DES Year-1 catalogue and massobservable-relation from the South Pole Telescope (SPT) cluster catalogue (see Figure 3).The observed richness-mass relation is fitted as a log-linear model with 2 −  error bars that trace the posterior of the best-fit richness-mass model parameters.The  500c mass definition in the catalogue is converted to  200c using an NFW profile for the surface density of the cluster and adopting the Diemer & Joyce The measured ΔΣ profiles using downsampled particles for every 10 particles and theoretical ΔΣ as computed from the NFW profile using different concentration-mass relations (LHS in legend) in the one-halo regime and different halo-bias models (RHS in legend) in the two halo regimes, with errors taken to be 1 −  standard deviations; the measurements are consistent with theoretical predictions and the size of the errors is too large to distinguish between models.The same conclusion (not shown) holds for ΔΣ binned by  vir .(2019) concentration-mass relation anchored at  = 0.35, which is roughly the median redshift of the cluster sample.
We use the KLLR method to determine the local linear fit for our  gal -mass model, which relaxes the assumption of global loglinearity (Anbajagane et al. 2020).Realistic cluster finders, such as redMaPPer (Rykoff et al. 2014a), impose a colour-magnitude cut or a stellar mass cut, which are highly dependent on the red-sequence model or the spectral energy density model.We found that imposing a stellar mass cut of 10.5 log(/ ⊙ ) would correspond roughly to the bottom 5% percentile of SDSS detected galaxies (Maraston et al. 2013).However, this drastically decreases the number of galaxies in a halo, with most having  gal in the single digits.As we are interested in the intrinsic covariance from the physical properties of the halo, we do not impose additional magnitude or stellar mass cuts.We confirm that, as described in Croton et al. (2016), the galaxy stellar mass distribution at  = 0 is consistent with the best-fit double Schechter function calibrated with low-redshift galaxies from the Galaxy and Mass Assembly (GAMA) (Baldry et al. 2012) down to stellar masses of M > 10 8.5  ⊙ .
Figure 3 illustrates that our  gal -mass models, which count the number of galaxies within a physical 3D radius and impose no colourmagnitude cut as redMaPPer does, resemble the general behavior of the observed richness-mass relations in terms of both slope and intercept.However, we acknowledge that redMaPPer may suffer from projection effects that artificially inflate the number count of redsequence galaxies within its aperture because of line-of-sight structures.Additionally, the  gal count within the  vir radius exceeds that of  200c as  vir is greater than  200c .In the toy model scenario, where we use a constant 1 Mpc ℎ −1 radius, the slope of the massrichness relation starts to decrease as the mass increases due to the increasing physical size of the clusters, as expected.The diversity of cluster radii and the resulting variation in the local slope and intercept of the  gal -mass relations demonstrate the robustness of our covariance model.In Section 4, we show that different radii/mass definitions have little impact on the parameters of our covariance model, thus establishing its independence from different reference radii, the definitions of cluster edges, and the resulting richness-mass relations.

RESULTS: COVARIANCE SHAPE AND EVOLUTION
In this section, we report the measurements for our covariance.In Figure 4, we find an anti-correlation between  gal and ΔΣ at small scales across most redshift and mass bins spanned by our dataset, which we fit with the best-fit "Sigmoid" functional form of the expression with  ≡ log / 200c the log-radius and x ≡ ( − )/ the scaled and offset log-radius.In Appendix A, we offer statistical verification of the best-fit functional form.
We first describe the evolution of the covariance in §4.1 by binning across the (, ) bins.Next, in §4.2, we present an alternative binning scheme based on halo peak height that can provide insight into the dependence of the time formation history of the covariance scale.
Our results suggest that the shape of the covariances can be accurately described by the full error function.Additionally, for  ≥  vir or  ≥  200c , the covariance aligns with the null-correlation hypothesis.This alignment is reflected in the fact that all nine bins with constrained posterior shape have best-fit  values within 2 of  = −1.Deviations from  = −1 can be interpreted as evidence of disagreements with the Press-Schechter formalism (Press & Schechter 1974) of spherical collapse halos, which can be originated from the presence of anisotropic or non-Gaussian matter distribution around halos at large scales (Lokken et al. 2022), or it can be an indicator of an open-shell model of halos that allows for the bulk transfer of baryonic and dark matter in and out of the halo potential well during the non-linear collapse.
With  = −1 fixed, the reduced error function marginally improves the constraints in most bins.However, with the reduced model, we can provide posterior constraints for  200c ∈ [5×10 14 , 1×10 15 )  ⊙ ℎ −1 at  = 0.49 and  200c ∈ [2 × 10 14 , 5 × 10 15 )  ⊙ ℎ −1 at  = 1.03, which the full model failed to constrain but with very loose posterior constraints.The estimated parameters for both the full and reduced models are presented in Table D1 and Table D2, respectively.
To assess the impact of varying the definition of the halo radius on our measurements of the shape of the covariance, we considered two factors: the scale dependence of ΔΣ discussed in §2.1.Figure 3 in §3.2. Figure 5 demonstrates that there is no apparent evolution of the shape parameters  ∈ {, , } when altering the scale dependence for ΔΣ or the true richness count.However, we find marginal 3 evidence of a difference in the amplitude parameter of the covariance  when changing the scale normalisation from   = / 200c to   = / vir while using the same true richness count.As halos exhibit more self-similarity in the inner regions when scaled by  200c (Diemer & Kravtsov 2014), we adopt this as our radius normalisation and use the number of galaxies enclosed within  200c as our true richness count.Subsequently, we explored the evolution of the shape parameters with respect to (, ) and found no strong mass dependence.However, we observed a monotonically decreasing redshift dependence of the amplitude parameter , as illustrated in Figure 6.To explain both the halo mass and the redshift dependence, we used the peak height of the halo, (, ).

Binned by peak height
An alternative binning scheme that encapsulates both the halo mass and redshift information is to bin halos by the peak height parameter, defined as where   () is the collapse overdensity at which gravitational collapses enter the non-linear regime and (, ) is the smoothing scale seen in Equation (B2) at the radius of the cluster.the linear growth factor defined as for a ΛCDM cosmology, where  () ≡  ()/ 0 is the normalised Hubble parameter.(, ) depends strongly on redshift, and hence, the peak height  strongly depends on the halo radius and the redshift of non-linear collapse.
The peak height has been adopted to simplify the mass and redshift dependence in various halo properties, such as halo concentration (Prada et al. 2012) and halo triaxiality (Allgood et al. 2006).Here, we explore whether the peak height can serve as a universal parameter to explain the scale and shape of Cov(ΔΣ, ln  gal | , ).We bin the halos into deciles of  and set posterior constraints on the shape of the covariance using our erf model in the full model case.In the when binned in deciles of halo peak height.The first nine bins reject the null hypothesis at a  < 0.01 level, and the highest decile rejects the null hypothesis at a  < 0.05 level.We can provide posterior constraints for all bins in peak height except for the one with the highest peak height value.
highest decile (90%-100% percentile), we reject the null-correlation hypothesis at the  = 0.01 level, but due to the size of the error bars, the shape of the parameters , , and  and largely unconstrained and  = 10 12 × 0.16 +0.34 −0.12 .Due to the large degeneracy, we exclude the highest decile from our dataset and limit the range of our model to  ∈ [1.57, 3.40), which spans 0% to 90% of our sample set.The large error bars may be due to the fact that the halo abundance as a function of  falls precipitously around  ∼ 4, so the highest decile spans a wide tail of high  ∈ [3.4,4.6).The plots in Figure 7 are the best-fit templates when binned by peak height, and Figure 8 shows the best-fit parameters as a function of peak height.We do not see a strong dependence on the peak height for ,  and .For , its dependence on  can be modeled as a log-linear relation of the form log 10 () =   + , with   = 13.07 +0.26 −0.26 and  = −0.44 +0.11  −0.11 .At the highest decile, the  = 10 12 × 0.16 +0.34  −0.12 falls within the 1 confidence band of the log-linear fit.Compared to the first nine deciles, the fit yields a  2 -value of 0.73.The negative slope between  and  indicates that more massive halos at the cosmic era of their formation exhibit a lesser anti-correlation between ΔΣ and ln  gal .

IMPACT OF COV(ΔΣ, ln 𝑁 gal | 𝑀, 𝑍) ON WEAK LENSING MEASUREMENTS
To assess the impact of Cov(ΔΣ, ln  gal | , ) on the scaling relation ⟨ΔΣ |  gal , ⟩, we utilise Equation ( 17) for the first-order correction and Equation (18) for the 2nd order.The mean mass of the halos in each (, ) bin is chosen as the pivot mass around which the HMF is Taylor expanded, and the intercept   gal and slope   gal for the richness-mass scaling relation shown in Equation ( 9 2016), and find that the difference is subdominant to the first-order correction, which is at a ∼ 1% level at small scales, as shown in Figure 9.
To estimate the mass bias in each bin, we stack ⟨ΔΣ | , ⟩ and model the profiles as if they were individual halos with a mean mass, redshift and concentration as described in Equations ( 2)-( 7).We assume NFW profile using the concentration-mass model of Diemer & Joyce (2019) in the one-halo regime.The two-halo regime should not be affected, as the covariance is consistent with zero at  ≳  200c .We convert the 3D overdensity of the modeled halo  ℎ to ΔΣ using Equations ( 2) and ( 5), and then apply the first-order correction in Equation ( 17).Using a Monte Carlo method we obtain the expected mass with and without this correction and report the change in the mean halo mass with this correction for each (, ) bin.As shown in Figure 9, we find that adding the correction leads  17) and ( 18) as denoted by ΔΣ cov and without applying corrections as denoted by ΔΣ fid .The slope and curvature of the halo mass function are calculated numerically from the Tinker et al. (2008) halo mass function in our nominal correction.The errors are taken from bootstrapped errors of the covariance.We compare the results with first-order corrections from other halo mass functions using Watson et al. (2013); Bocquet et al. (2016); Despali et al. (2016).We find that the percentile difference in ΔΣ far exceeds the uncertainty in the choice of halo mass functions, and that second-order corrections are subdominant to the first-order correction itself, which is at a ∼ 1% level at small scales for ΔΣ and propagates into an upward correction of stacked halo mass of  / ∼ 2 − 3% for most bins after applying the correction.
to an upward correction of the stacked halo mass of approximately / ∼ 2 − 3% for most (, ) bins.

Secondary Halo Parameter Dependence of ln 𝑁 gal
We employed a multi-variable linear regression model to determine the best-fit when incorporating secondary properties in the regression.Initially, considering the full set of parameters listed in Table 3, we applied a backward modeling scheme to identify the relevant parameters of interest.Details of this process can be found in Appendix E, which led to the selection of the following secondary halo parameters for our model: Π ⊂ {Γ 2dyn ,  1/2 ,  vir , /||,  off }.The resulting model demonstrated good explanatory power, as indicated by a high  2 coefficient.Additionally, the model passed various tests, including variance inflation, global F-statistic, partial F-statistic, T-statistic, scatter heteroscedasticity, and scatter normality in most cases.Specifically, through a comparison of  partial values, we found that richness could be modeled by a multi-linear equation involving all secondary halo parameters.Further information can be found in Table E1, where the F-statistic demonstrates that all parameters are statistically significant.Only when considered collectively can they accurately reflect the dependence of richness on halo formation history.
To establish informative priors for upcoming weak lensing surveys such as HSC and LSST, we examined whether the dependence of  gal on secondary halo properties, as inferred from the slope   gal , aligns with arguments based on halo formation physics.We expected that   gal , vir resulting from the formation of satellite galaxies (equivalent to  gal − 1 in the presence of a central galaxy) within halos would exhibit a negative relation, i.e.,   gal , vir < 0. Simulationbased studies have suggested that early-forming halos possess higher concentrations (Wechsler et al. 2002), and correspondingly, highconcentration halos (which form early) have fewer satellite galaxies due to galaxy mergers within the halos (Zentner et al. 2005).This effect is known as galaxy assembly bias (Wechsler & Tinker 2018) -the change in galaxy properties inside a halo at fixed mass due to the halo formation history.There is marginal evidence of the existence of assembly bias from recent observations using galaxy clustering techniques (Zentner et al. 2019;Wang et al. 2022), as well as measurements of the magnitude gap between the brightest central galaxy (BCG) and a neighboring galaxy as a proxy for formation time (Hearin et al. 2013;Golden-Marx & Miller 2018;Farahi et al. 2020).
As noted in Table E1, the signs of   gal , for the remaining parameters  ∈ { 1/2 , /||, Γ 2dyn ,  off } align with our expectations of assembly bias in most bins -late-forming clusters undergo more rapid mass accretion (higher Γ 2dyn ) and are less virialized (higher T/|U|), and because they also from the galaxy assembly bias mentioned above are richer in galaxy number counts when conditioned on the mass, we expect a positive partial slope   gal ,Γ 2dyn and   gal ,T/|U| .The case for  1/2 and  off is more complicated.Under the isolated formation of halos  1/2 and  off would be smaller for earlier forming halos due to the monotonic mass accretion and relaxation of halos over long time scales.However, as halos undergo mergers and tidal stripping the monotonicity of the parameters over time is not guaranteed.Therefore, we see a mixture of positive and negative partial slopes   gal ,a 1/2 and   gal ,X off in these cases.To describe the physical mechanisms on a case-by-case basis would require that we probe into the halo merger tree history of individual halos.
In this paper, we take a closer look at the sign of   gal , vir and observe that while the partial scope matches our expectations in most bins, in some mass bins at medium and high redshifts it changes signs from negative at lower redshifts to positive at higher ones.While we observe a diminishing impact of secondary halo properties on richness (indicated by a smaller absolute value for   gal , vir ), the reversal of the coefficient's sign cannot be solely attributed to statistical fluctuations around zero, as some values are inconsistent with zero at levels exceeding 3.
This issue can be attributed to the effect of major mergers on concentration.Recent studies (Ludlow et al. 2012;Wang et al. 2022;Lee et al. 2023) have shown that halos, during major merger events, experience a transient fluctuation in concentration before returning to the mean relation over a time period slightly less than the dynamical time of the halo.The measured concentration spike during major mergers, particularly prominent at higher redshifts, could explain a positive  {  gal , vir } .
To test this hypothesis, we employ a toy model that divides halos in each (, ) bin based on the median Γ 1dyn into low-Γ 1dyn and high-Γ 1dyn subsamples.Given the timescale of mergers to be roughly the dynamical time of the halo, we choose Γ 1dyn as a good proxy for potential merger events even though this parameter is excluded in the final linear regression model due to multicollinearity (see Appendix E.) Figure 10 displays the halo concentration plotted against the richness residuals, separated by Γ 1dyn , at benchmark bins of  200c ∈ [5 × 10 13 , 1 × 10 14 )  ⊙ ℎ −1 at three different redshift snapshots of  = 0, 0.49, 1.03.At  = 0, we observe a negative slope as expected from halo formation physics for both low-Γ 1dyn and high-Γ 1dyn subsamples, as well as for the overall sample.Furthermore, we observe a change in the slope between the low-Γ 1dyn and high-Γ 1dyn sub-samples, which can be explained by the gradual increase (or decrease) in concentration (Γ 1dyn ) over time, even without major merger events (Wechsler et al. 2002;Zhao et al. 2003;Lu et al. 2006).At redshifts of  = 0.49, 1.03, we observe a positive slope in the overall and/or high-Γ 1dyn samples, which contradicts the scaling relations between HOD and concentration in models that track their gradual evolution over  ≫  1dyn .However, in the presence of major mergers, when Γ 1dyn is significantly enhanced, the halo concentration may also experience a transient spike after the merger.The deviation from hydrostatic equilibrium provides a plausible explanation for a positive { gal ,  vir }, which could be fully tested on MDPL2 through the reconstruction of halo merger trees, an analysis beyond the scope of this paper.

Secondary Halo Parameter Dependence of ΔΣ
In this section, we employ a multi-linear regression approach to model the lensing signal, similar to the methodology described in §6.1.We extend this approach to the model (ΔΣ| gal (Π), , ) as a linear function of Π. Upon analysing different (, ,   ) bins, we observe that the reduced parameters Π ⊂ { 1/2 ,  vir , /, Γ 2dyn ,  off } pass the variance inflation factor (VIF) test for multi-collinearity, or in other words we showed that the variance is not inflated and thereby made less reliable in the case that the secondary halo parameters in the full model are highly correlated.As with the case for the The scatter plot illustrates the data points, while the shaded regions show the best-fit linear fit with 1 confidence interval for the main sample and each sub-sample.At  = 0, the richness-concentration relation exhibits a negative slope, consistent with our expectations of halo formation physics.The slopes for the low and high Γ 1dyn subsamples diverge due to the negative correlation between concentration and MAR.However, at  = 0.49 and  = 1.03, the slopes for the entire sample and/or the high Γ 1dyn subsample become positive, contrary to our observations of the richness-concentration relation.In contrast, the low Γ 1dyn subsample still shows a negative slope.These findings suggest that at medium to high redshifts, a subset of unrelaxed and recently merged halos with high MAR could elevate the concentration from its expected value at hydrostatic equilibrium.lensing signal, this indicates that ln  gal can be described without redundancy by a linear decomposition of these reduced parameters.Furthermore, most bins exhibit homoscedasticity, as confirmed by passing the Breusch-Pagan Lagrange multiplier test.This implies that the scatter terms  ΔΣ and  Π  remain constant within each bin, with a few exceptions.Lastly, the scatter  ΔΣ|  gal in most bins (with a few exceptions) meets the criteria of the Shapiro-Wilk test for Gaussianity, suggesting that the distribution closely resembles a Gaussian distribution.
The multi-linear regression is a good fit to the conditioned lensing signal if we assume that (ln  gal |, ) and (ΔΣ|, ) can be modeled with a normal distribution (e.g., Anbajagane et al. 2020;Costanzi et al. 2021;To et al. 2021).In this case (ΔΣ |  gal , , ) is a multi-linear equation with respect to the secondary halo parameters with mean and is normally distributed around the mean with variance The parameters 1, 2, 3, and  0 can be explicitly derived where (ln  gal |, ) and (ΔΣ|, ) are known, but the exact values are not essential for this paper.We refer the reader to Appendix F for derivations of Equations ( 29) & (30).We note that only in bins of  ≲  200c do the multilinear regression models pass the global F-statistic test and the T-statistic test for each parameter.This result suggests that, at  ≳  200c , we find little correlation between ΔΣ and Π  .Because the scatter still passes the Shapiro-Walk test for Gaussanity, the conditional probability (ΔΣ| gal , , ) at large scales is still normally distributed, but with  ΔΣ−Π  = 0.By setting  ΔΣ−Π  = 0 the variance can be reduced to We visualise the dependence of  ΔΣ−Π  on / 200c in Figure 11.By dividing ΔΣ into quintiles of Π  we find a strong correlation for all parameters at  ≲  200c and a null correlation at  ≳  200c .On small scales, our results show a positive correlation for concentration and a negative correlation for { 1/2 , /, Γ 2dyn ,  off }.We observe that this trend holds for all (, ) bins plotted for a benchmark bin of  200c ∈ [2 × 10 14 , 5 × 10 14 ) ⊙ ℎ −1 at  = 0.
The dependence of ΔΣ on secondary halo parameters qualitatively agrees with Xhakaj et al. (2022) wherein they targeted a narrow mass bin, with residual mass dependency inside the bin resampled so that mass follows the same distribution.In our work, we remove the mass dependency with the KLLR method (Farahi et al. 2022a), which achieves the same effect.We extend their results to mass and redshift bins probed by the optical surveys and quantitatively show that the dependence of ΔΣ on Π can be modeled as a multi-linear equation.

Results: Secondary Halo Parameter Dependence of
Cov(ΔΣ, ln  gal | , ) In Figure 12, we observe that the total covariance Cov(ΔΣ, ln  gal |, ), which remains after removing the contribution of each secondary halo parameter   gal , Cov(ΔΣ, Π  |, ), is consistent with zero at a significance threshold of 0.05 in all bins.The errors on the total covariance and individual contributions are computed using bootstrapping, and the errors on the remaining term are determined by adding the errors of the total and individual terms in quadrature.
Based on our hypothesis in Equation ( 24), we conclude that the set of secondary halo parameters Π, which are related to the formation time and the mass accretion history of the halos, can fully explain the joint distribution of ΔΣ and ln  gal given the precision allowed by current errors, limited by the resolution limit (see Appendix B for information on particle resolution and measurement errors).
Since the joint distribution of ΔΣ and ln  gal follows a multivariate normal distribution, (ΔΣ, ln  gal |, ) is completely characterised by its mean relation and Cov(ΔΣ, ln  gal | , ).It should be noted that the contribution of each individual parameter to the total covariance,   gal , Cov(ΔΣ, Π  | , ), is determined by the richness dependency captured by the slope   gal , and the ΔΣ dependency represented by Cov(ΔΣ, Π  | , ).Qualitatively, individual contributions to total covariance maintain their sign when both ΔΣ and  gal contributions preserve their sign.Consistent with the arguments of halo formation, Π correlates with ΔΣ at small scales, as demonstrated in Figure 11 across all (, ) bins.In most cases, the dependence of the richness on secondary halo parameters also maintains its sign across the (, ) bins.In instances where we encounter a sign reversal in the ln  gal −  vir relation, we speculate that it is due to a transient increase in concentration following a major merger.
Furthermore, the total and individual contributions to the covariance tend to decrease in magnitude at smaller scales with increasing redshift.This decrease in covariance can be attributed to two factors: the decreasing explanatory power of Π on richness, as indicated by the decreasing values of  2 and  partial values in Table E1, and the decreasing absolute value of Cov(ΔΣ, Π  | , ).This trend aligns with the idea that as halos have more time to form, the secondary halo properties related to the mass accretion history become more significant both in richness and in ΔΣ.As discussed in §4, the dependence of the mass and redshift on covariance can be explained by the halo peak height, (, ).
Intrinsic vs. Extrinsic Covariance.Our study distinguishes between the intrinsic covariance investigated here and that observed in empirical cluster data sets.This distinction arises from systematic biases introduced by the cluster-finding algorithm (extrinsic component) and the underlying physics governing halo formation (intrinsic component).Specifically, our analysis involves counting galaxies in 3D physical space.In contrast, a realistic cluster finder like redMaPPer (Rykoff et al. 2014b) employs a probabilistic assignment of galaxies to halos in 2D physical space, considering projected radii and redshift through color matching onto the red sequence.Our study does not account for the observational systematics in redMaPPer associated with uncertainties in photometric redshifts and projection effects (Rozo et al. 2015;Farahi et al. 2016).24), where Cov(ΔΣ, Π | , ) comes from the dependence of ΔΣ and the slope   is the dependency of richness.The thick black dashed line is the remaining covariance after removing the contribution from each Π  term; the errors are obtained by adding the total and individual errors in quadrature without considering the correlations between terms.In agreement with our hypothesis in Equation ( 24), the remaining term is consistent with null at a  < 0.01 level for all bins.
We find that the fractional amplitude of the bias and the scale dependence on the lensing signal observed in our results (Fig. 9) are comparable to those reported by Farahi et al. (2022b), who measured the covariance between the dark matter density and the galaxy number count enclosed inside a halo after applying a realistic stellar mass cut.The enclosed mass within a 3D radius using the IllustrisTNG100 simulation is anchored at  = 0.24.By comparing our findings to those obtained using a realistic cluster finder such as redMaPPer, we can unravel intrinsic and extrinsic contributions to the covariance between weak lensing observables (Wu et al. 2022).This work provides more profound insights into the distinct effects originating from the underlying physics and the methodology employed in cluster-finding algorithms (Euclid Collaboration et al. 2019).
Projection Effects.A noteworthy distinction arises regarding the covariance observed in our study compared to others examining clusters in projected space.Projection effects can potentially introduce a sign flip in covariance, as they can positively bias both ΔΣ and richness (Costanzi et al. 2019;Wu et al. 2022;Zhang & Annis 2022;Zhang et al. 2023).Particularly, Zhang et al. (2023) showed that the lensing signal can be affected both at large and small scales from the preferential alignment of halo orientation with the underlying large-scale structure filament.Wu et al. (2022) detected a positive correlation between ΔΣ and ln  gal employing Buzzard simulations, where ΔΣ were measured using dark matter particles and galaxy counts were performed within a cylindrical region of depth 60 Mpc/h.Their investigation revealed that the positive correlation primarily stems from galaxy number counts beyond the halo's virial radius and within 60 Mpc/h.On the other hand, using the Dark Quest emulator and HOD-based galaxy catalogues (Nishimichi et al. 2019), Sunayama et al. (2020) found negligible deviations from the mean relation at small scales and an overall reduction in selection bias at large scales, approximately halving the effect observed by Wu et al. (2022).Furthermore, Huang et al. (2022), using data from the Subaru Hyper Suprime-Cam (HSC) survey, observed that the selection bias is most prominent in the vicinity of the transition from the one-halo to the two-halo regime, as evidenced by the comparison between the outer stellar mass proxy and richness.In a study based on the IllustrisTNG300 simulation, Zhang & Annis (2022) discovered a net positive correlation between the fitted weak lensing mass and the projected 2D number count of the halo when conditioned on the halo mass.
These results suggest that projection effects can potentially introduce a positively correlated bias to both ΔΣ and  gal .We can estimate the impact of projection effects by comparing the intrinsic covariance measured in our study with the total covariance observed in the projected space.Consequently, our results serve two essential purposes: (i) elucidating the physical origins of the negative covariance and (ii) discerning intrinsic and extrinsic components to determine the covariance attributable to projection effects accurately.
Radial Dependence.There is a notable difference in the reported amplitude and scale dependence of covariance, which can be attributed to discrepancies in the employed halo occupation density models.Notably, a distinctive scale dependence discrepancy exists between simulation-based investigations of projection effects (Sunayama et al. 2020;Wu et al. 2022;Salcedo et al. 2020) and observational data from the HSC (Huang et al. 2022).In particular, the analysis of observational data reveals a prominent 1 Mpc bump, which could be explained by uncertainties inherent in observations, such as miscentering effects.It is crucial to gain insight into the sensitivity of the covariance with respect to the model parameters and the influence of selection effects.Understanding these factors is essential to comprehensively interpret and account for the observed covariance in galaxy cluster survey studies.
Accuracy vs. Precision.The statistical power of current and future surveys enables us to determine the normalisation and slope of the mass-observable relations at a few percent levels (e.g., Farahi et al. 2016;Mantz et al. 2016b;Mulroy et al. 2019;To et al. 2021).However, these estimates are susceptible to known and unknown sources of systematic errors that inflate the uncertainties.These uncertainties introduce biases and degrade the accuracy of the results.Therefore, it is essential to carefully identify, quantify, and account for these systematic effects to ensure robust and reliable measurements.In this work, we focus on studying one of these sources of systematic uncertainty that was not considered previously.

SUMMARY
This work reveals insights into the scale-dependent covariance between weak lensing observables and the physical properties of the halo.Using the MDPL2 N-body simulation with galaxies painted using the SAGE semi-analytic model, we present several key findings: • We observe that the intrinsic covariance between ΔΣ and ln  gal enclosed within a 3D radius is negative at small scales and null at large scales in (ln , ) ranges that cover optical surveys.• We model the shape of the covariance across all bins using an error function that is insensitive to the radius definition used to define halo boundaries.
• We find that the magnitude of the covariance is relatively insensitive to mass and decreases considerably with increasing redshift.The (, ) dependence of the shape of the covariance can be encapsulated by the peak height parameter (, ), which suggests that the scale of the covariance is related to the formation history of halos.• We show that incorporating the covariance into ⟨ΔΣ| gal , ,   ⟩ using the first-order expansion of the halo mass function yields about > 1% bias on ⟨ΔΣ| gal , ,   ⟩ at small scales, which implies a mass bias of > 2% in the halo mass estimates in most bins.• Our analysis reveals that the covariance between ln  gal and ΔΣ can be fully explained by secondary halo parameters related to the history of the halo assembly.This finding provides strong evidence that the non-zero covariance results from the variation in the formation history of dark matter halos.
This work underscores the importance of accounting for covariance in cluster mass calibration.Incorporating the covariance between richness and the weak lensing signal and its characterisation should be an essential component of weak lensing cluster mass calibration in upcoming optical cluster surveys.The results of this work can be introduced as a simulation-based prior for the forward modeling of scaling relations used by cluster cosmological analysis pipelines.Within the LSST-DESC framework, this systematic bias would be implemented in the CLMM pipeline (Aguena et al. 2021) to update the stacked weak lensing mass in each richness bin.Considering this covariance paves the path toward percent-level accuracy cosmological constraints, thereby enhancing the precision and reliability of our scientific conclusions.Moving forward, it is imperative to integrate this understanding into the design and analysis of future optical cluster surveys.

APPENDIX A: FUNCTIONAL FORM
This section aims to characterize the shape of the covariance across mass and redshift bins by fitting a template curve.The process involves several transformations and adjustments.First, a logarithmic transformation is applied to the radial bins, denoted as  = log 10 (/ 200c ).Then, a horizontal offset is introduced using a parameter , and scaling is applied using a parameter .This results in a transformed variable x = ( − )/.
To analyze the transformed data vector  ( x), we test a set of functional forms presented in Table A2.The normalization factors and coefficients associated with these functions are chosen in such a way that  ( x) approaches 1 at large scales, -1 at small scales,  ′ (0) = 1 and  (0) = 0.A linear transformation of  ( x) is then performed, given by   ( x) +  , where  represents a vertical shift and  represents a scaling factor.The magnitude of  is comparable to the magnitude of Cov(ΔΣ, ln  gal | , ), while the parameters , ,  are of the order of unity.These parameters, along with , form the set of parameters denoted as  ∈ {, , , }, which define our best-fit model.If  = −1, it implies a zero covariance at large scales.We fit two models: a full model with all parameters free, and a reduced model with  = −1, and the rest of the parameters free.The priors for the parameters are specified in Table A1.At our current resolution (nth=10, solid green line), the standard error is just above the cosmic variance at small scales and drops below the cosmic variance at large scales.The solid black line is density fluctuation estimated from the cosmic variance floor, as described in Equations ( B2) and (B1).In the ideal case that Poisson noise accounts for all the standard error, fully sampling all particles (th = 1, red dotted line) will reduce the standard error by a factor of √ 10, rendering it just below the cosmic variance floor at small scales.In the realistic case that the standard error for ΔΣ contributes from both Poisson noise and the intrinsic diversity of halo profiles, the fully sampled standard error should be on par with the cosmic variance at small scales.which is smoothed over an area of  = 4 2 , Δ 2 () is the matter power spectrum for a wavenumber , and  1 is the Bessel function of the first order.Figure B1 shows the standard error of ΔΣ at a benchmark bin  200c ∈ [1 × 10 14 , 2 × 10 14 )  ⊙ ℎ −1 at z=0.00.At particle downsampling factors of 200, 100, and 10, the reduction in error is consistent with the Poisson term of √ , indicating that at these sampling rates, Poisson noise dominates.At our current downsampling rate of nth=10, the standard error is just above the cosmic variance floor at small scales and drops below the cosmic variance floor at large scales.In the ideal case that Poisson noise accounts for all the standard error, fully sampling all particles (nth=1, red dotted line) will reduce the standard error by a factor of √ 10, rendering it just below the cosmic variance floor at small scales.In the realistic case that the standard error for ΔΣ contributes from both Poisson noise and the intrinsic diversity of halo profiles, the fully sampled standard error should be on par with the cosmic variance at small scales.A future study with fully sampled particles should yield greater statistical constraints.The exact values of 1, 2, 3, and  0 are not essential for this paper as we aim to derive a general expression for (ΔΣ |  gal , , ) as a function of secondary halo parameters.

Figure 1 .
Figure 1.Graphic representation of the modeling of Cov(ΔΣ, ln  gal | , ) -the covariance between the halo weak lensing signal ΔΣ () and log-richness ln  gal conditioned on mass and redshift -and its dependence with secondary halo parameters Π.The labels marked S. XX point to the location in the text.A full list of the notations used in this paper is introduced inTable 1 & 2.

Figure 3 .
Figure3.Using different prescriptions of the richness count, we compare with the SPT-DES data(Costanzi et al. 2021).The richness estimator, with no stellar or color-magnitude cut, shows a similar trend with the data.In §4, we show that the results are robust to changes in the definition of the number count estimator.

Figure 4 .
Figure 4. Measured against the left hand side y-axis are measurements of Cov(ΔΣ,  gal | , ) with 1 errors and different functional forms using the full model.The functions are classes of "Sigmoid" functions.In all bins, the error function outperforms other functional forms in their DIC parameters, providing good  2 values.For  200c ∈ [5 × 10 14 , 1 × 10 15 ) at  = 0.49 and  200c ∈ [2 × 10 14 , 5 × 10 15 ) at  = 1.03, the posteriors of the full models do not converge as the size of the covariance is too small.Measured against the right-hand side y-axis are the correlation coefficients  ΔΣ,  gal |, with smoothed bands representing the 1 −  error.The errors are measured by bootstrap resampling.

Figure 5 .
Figure5.Evolution of Cov(ΔΣ,  gal | , ) shape parameters with respect to mass and radial binning schemes and  gal definition at fixed redshift at z=0.ΔΣ is binned in equal log-space radial bins in / 200c or / vir ; for each radial binning, the number count of galaxies inside the cluster is given by a constant radius of 1 Mpcℎ −1 or  200c when binned by  200c and  vir when binned by  vir .We find no strong evolution in the shape or scale of the covariance under different binning schemes or  gal definitions.The trend is consistent across different redshift bins and demonstrates the robustness of the covariance under different true richness definitions.The error bars indicate the 1 −  distribution of the posteriors.

Figure 6 .
Figure 6.Evolution of Cov(ΔΣ,  gal | , ) shape parameters of the error function with respect to mass and redshift, binned in units of  200c and with  gal taken to be the number of clusters inside the  200c radius of the cluster.There is no strong dependence of , , and  with respect to mass and redshift and a strong monotonically decreasing  with respect to redshift.At  200c ∈ [5 × 10 14 , 1 × 10 15 ),  = 0.49 and  200c ∈ [2 × 10 14 , 5 × 10 15 ),  = 1.03 the covariance is consistent with null at  = 0.01 and  = 0.05 levels, respectively.The error bars indicate the 1 −  distribution of the posteriors.

Figure 7 .
Figure 7. Best-fit "full" error function model for Cov(ΔΣ, ln  gal | , )when binned in deciles of halo peak height.The first nine bins reject the null hypothesis at a  < 0.01 level, and the highest decile rejects the null hypothesis at a  < 0.05 level.We can provide posterior constraints for all bins in peak height except for the one with the highest peak height value.

Figure 8 .
Figure 8.The evolution of shape parameters for Cov(ΔΣ, ln  gal | , ) binned in deciles of the peak height , excluding the highest decile.The parameters , , and  show little dependency with  while the amplitude  exhibits a log-linear relationship with  of the form shown in Equation (28).The mean  is consistent with -1.The horizontal teal bands fill the 1 range around the mean, and the pink line is the best log-linear fit between  and  with 1 confidence bands.

Figure 9 .
Figure 9.The percent level change in stacked ΔΣ measurements after including the covariance terms in Equations (17) and (18) as denoted by ΔΣ cov and without applying corrections as denoted by ΔΣ fid .The slope and curvature of the halo mass function are calculated numerically from theTinker et al. (2008) halo mass function in our nominal correction.The errors are taken from bootstrapped errors of the covariance.We compare the results with first-order corrections from other halo mass functions usingWatson et al. (2013);Bocquet et al. (2016);Despali et al. (2016).We find that the percentile difference in ΔΣ far exceeds the uncertainty in the choice of halo mass functions, and that second-order corrections are subdominant to the first-order correction itself, which is at a ∼ 1% level at small scales for ΔΣ and propagates into an upward correction of stacked halo mass of  / ∼ 2 − 3% for most bins after applying the correction.

Figure 10 .
Figure10.Residual log-richness versus concentration relation in subsets of halo mass accretion rate (MAR).The figure consists of three panels -left, middle, and right panels corresponding to  = 0,  = 0.49, and  = 1.03, respectively, all on a benchmark mass bin of  200c ∈ [5 × 10 13 , 1 × 10 14 )  ⊙ ℎ −1 .The sample is split into low and high Γ 1dyn based on their median values.The scatter plot illustrates the data points, while the shaded regions show the best-fit linear fit with 1 confidence interval for the main sample and each sub-sample.At  = 0, the richness-concentration relation exhibits a negative slope, consistent with our expectations of halo formation physics.The slopes for the low and high Γ 1dyn subsamples diverge due to the negative correlation between concentration and MAR.However, at  = 0.49 and  = 1.03, the slopes for the entire sample and/or the high Γ 1dyn subsample become positive, contrary to our observations of the richness-concentration relation.In contrast, the low Γ 1dyn subsample still shows a negative slope.These findings suggest that at medium to high redshifts, a subset of unrelaxed and recently merged halos with high MAR could elevate the concentration from its expected value at hydrostatic equilibrium.

Figure 11 .
Figure11.The dependence of ΔΣ on accretion history parameters in  200c ∈ [2 × 10 14 , 5 × 10 14 )  ⊙ ℎ −2 ,  = 0.In each of the 5 panels is plotted the ΔΣ in different quintiles of the accretion history parameter Π ∈ { 1/2 ,  vir , /| |, Γ 2dyn ,  off } compared to the mean ΔΣ.For { 1/2 , /| |, Γ 2dyn ,  off } there is a strong negative correlation at small scales at  ≲  200c and for  vir we find a strong positive correlation at  ≲  200c .Comparing the width of ΔΣ binned at different quintiles of Π with the standard deviation of the profiles, we find that accretion history parameters account for much of the variance at small scales and play a negligible role at large scales.

)Figure 12 .
Figure12.The dependence of Cov(ΔΣ, ln  gal ) on secondary halo parameters Π.The solid blue line is the total covariance.The dashed lines of the lines represent the covariance contribution coming from each of the secondary halo parameters modeled and governed by Equation (24), where Cov(ΔΣ, Π | , ) comes from the dependence of ΔΣ and the slope   is the dependency of richness.The thick black dashed line is the remaining covariance after removing the contribution from each Π  term; the errors are obtained by adding the total and individual errors in quadrature without considering the correlations between terms.In agreement with our hypothesis in Equation (24), the remaining term is consistent with null at a  < 0.01 level for all bins.

Figure B1 .
Figure B1.The standard error of ΔΣ measurements tested on a benchmark bin of  200c ∈ [1 × 10 14 , 2 × 10 14 )  ⊙ ℎ −1 at z=0.The standard error is estimated using the bootstrap method for the  = 500 clusters with dark matter particles downsampled by a factor of 200, 100, and 10 (solid lines).At our current resolution (nth=10, solid green line), the standard error is just above the cosmic variance at small scales and drops below the cosmic variance at large scales.The solid black line is density fluctuation estimated from the cosmic variance floor, as described in Equations (B2) and (B1).In the ideal case that Poisson noise accounts for all the standard error, fully sampling all particles (th = 1, red dotted line) will reduce the standard error by a factor of sity fluctuation, given byΔΣ() =     (),(B1)where   = 200ℎ −1 Mpc is the projection depth,   () is the mean density of the universe at that redshift, and () is the root mean squared matter density fluctuation, given by  2 ()

Table 1 .
Notations employed in our framework for the covariance in §2.3

Table 3 .
Notations employed in exploring the secondary halo parameter dependence

Table E1 .
200c ( ⊙ ℎ −1 )   gal    gal ,    gal  10 6 ×   gal  10 3 ×   gal   = 0.00 [5 × 10 13 , 1 × 10 14 ) Best-fit parameters, global  2 , and explanatory power indicators for log-richness modeled in Equation (20).Values in parentheses represent 1 confidence intervals to the partial slopes .The partial -statistic, defined in Equation (E1), is used to quantify the explanatory power of each variable.A higher partial F-statistic indicates a greater amount of predictive power uniquely attributed to that variable.Staistical significance is determined by an F-statistic of  > 10.zero mean Gaussians with correlation efficient , and then by using the variance of quotient approximation  are the mean and variances of (ΔΣ | , ) and   ,  2 the mean and variances of (ln  gal | , ) in our specific case.