Cross Correlation between the Thermal Sunyaev-Zel'dovich Effect and Projected Galaxy Density Field

We present a joint analysis of the power spectra of the Planck Compton $y$-parameter map and the projected galaxy density field using the Wide Field Infrared Survey Explorer (WISE) all-sky survey. We detect the statistical correlation between WISE and Planck data (g$y$) with a significance of $21.8\,\sigma$. We also measure the auto-correlation spectrum for the tSZ ($yy$) and the galaxy density field maps (gg) with a significance of $150\,\sigma$ and $88\,\sigma$, respectively. We then construct a halo model and use the measured correlations $C^{\rm gg}_{\ell}$, $C^{yy}_{\ell}$ and $C^{{\rm g}y}_{\ell}$ to constrain the tSZ mass bias $B\equiv M_{500}/M^{\rm tSZ}_{500}$. We also fit for the galaxy bias, which is included with explicit redshift and multipole dependencies as $b_{\rm g}(z,\ell)=b_{\rm g}^0(1+z)^{\alpha}(\ell/\ell_0)^{\beta}$, with $\ell_0=117$. We obtain the constraints to be $B =1.50{\pm 0.07}\,(\textrm{stat}) \pm{0.34}\,(\textrm{sys})$, i.e. $1-b_{\rm H}=0.67\pm 0.03\,({\rm stat})\pm 0.16\,({\rm sys})$ (68\% confidence level) for the hydrostatic mass bias, and $b_{\rm g}^0=1.28^{+0.03}_{-0.04}\,(\textrm{stat}) \pm{0.11}\,(\textrm{sys})$, with $\alpha=0.20^{+0.11}_{-0.07}\,(\textrm{stat}) \pm{0.10}\,(\textrm{sys})$ and $\beta=0.45{\pm 0.01}\,(\textrm{stat}) \pm{0.02}\,(\textrm{sys})$ for the galaxy bias. Incoming data sets from future CMB and galaxy surveys (e.g. Rubin Observatory) will allow probing the large-scale gas distribution in more detail.


INTRODUCTION
The thermal Sunyaev-Zeldovich (tSZ) effect is a secondary anisotropy of the cosmic microwave background (CMB) radiation caused by the inverse Compton scattering of CMB photons off warm-hot electrons. This phenomenon results in an effective CMB spectral distortion which can be quantified as (Carlstrom et al. 2002): where T CMB = 2.725 K is the mean CMB temperature, y is the Compton parameter and the function g(x) quantifies the frequency dependence of the tSZ effect. The latter is expressed as a function of the dimensionless frequency x ≡ hν/k B T CMB , with h the Planck con- † Corresponding author: Y.-Z. Ma, ma@ukzn.ac.za stant, ν the photon frequency and k B the Boltzmann constant. Neglecting relativistic corrections, the function g(x) reads: The Compton parameter y quantifies the amplitude of the tSZ effect independently of the observing frequency. It is proportional to the electron pressure integrated along the line-of-sight (LoS) distance l: where n e and T e are the electron number density and temperature, σ T is the Compton cross section and m e is the electron mass. Low-mass and high-z clusters, for which an individual detection is generally difficult, provide a significant integrated contribution to y which is detectable by measuring the angular power spectrum of the tSZ effect, particularly at small scales (Trac et al. 2011;Planck Collaboration et al. 2011). The tSZ angular power spectrum is then an excellent probe of the physical conditions of the hot gas around dark matter haloes (Komatsu & Seljak 2002).
A key feature of the tSZ is that it is not explicitly dependent on redshift; the LoS integral in Eq. (3) implies that all of the warm-hot gas encountered by CMB photons from the last-scattering surface up to the observer contributes to the spectral distortion. In this context, cross-correlating the observed tSZ with other large-scale structure (LSS) tracers is a very useful tool to recover information on the redshift of the responsible hot gas; this, in turn, allows for a better characterisation of the diffuse gas component distribution in relation with the cosmic web, eventually providing insights into the growth of structures. Such LSS tracers are usually provided by optical survey measurements, and many works in recent years have contributed to their exploitation in this sense. This type of cross-correlation analysis has been conducted using galaxy clustering (Pandey et al. 2019;Koukoufilippas et al. 2020;Chiang et al. 2020), weak lensing (Van Waerbeke et al. 2014;Hill & Spergel 2014;Ma et al. 2015;Atrio-Barandela & Mücket 2017), cosmic shear (Hojjati et al. 2017), luminous red galaxies (Tanimura et al. 2019;de Graaff et al. 2019), cosmic voids (Alonso et al. 2018), galaxy groups Lim et al. 2020) and galaxy clusters (Komatsu & Kitayama 1999;Bolliet et al. 2018;Bolliet 2020;Rotti et al. 2021).
One of the key parameters entering SZ-based studies of the gravitational clustering of dark matter haloes is the tSZ mass bias, which is defined as the ratio (Planck Collaboration et al. 2016a): whereas in some literature it is inversely defined as 1 − b H ≡ M tSZ 500 /M 500 . In Eq. (4) M 500 is the cluster overdensity mass defined with respect to the Universe critical density at that redshift (see also Sec. 4.1 for details), and represents the "true" cluster mass. The quantity M tSZ 500 is instead the cluster mass inferred from the measured tSZ flux. Any systematics affecting the mass measurement is then encoded in the bias parameter B. The main contribution to B is most likely the assumption of hydrostatic equilibrium in the intra-cluster medium, which is made in the modeling of the ICM pressure profile (Arnaud et al. 2010), but not necessarily satisfied by the detected clusters. To this, we can add the contribution of instrumental calibration and additional systematics in the underlying X-ray modeling, which is required to provide mass proxies and calibrate the mass estimation (Planck Collaboration et al. 2016b). The current uncertainty on the mass bias is one of the major issues hindering the full exploitation of clusterrelated observables as cosmological tools.
Several studies have recently conducted crosscorrelation studies between the tSZ and other tracers to constrain the tSZ mass bias. For example, Koukoufilippas et al. (2020) cross-correlated the tSZ maps from Planck with the projected galaxies sourced from a combination of the near-infrared 2 Micron All-Sky Survey (2MASS), optical SuperCOSMOS, and the midinfrared Wide Field Infrared Survey Explorer (WISE, Wright et al. 2010) at z 0.4, and achieved the constraint 1 − b H = 0.59 ± 0.03. Makiya et al. (2020) used the tSZ map from Planck and the 2MASS redshift survey (2MRS) catalogue to the same aim, constraining the bias to be 1 − b H = 0.649 ± 0.041. Similar studies can be found in Hurier & Angulo (2018), Bolliet et al. (2018), Salvati et al. (2019), Osato et al. (2020) and Zubeldia & Challinor (2019). These findings are summarised in Table 3. Our study pursues a similar scientific goal, but for the first time utilising uniquely the all-sky WISE galaxy catalogue in combination with Planck tSZ maps. More precisely, we calculate the crosscorrelation spectrum between the Planck tSZ map and the projected galaxy density field map obtained from the WISE catalogue, and the auto-correlation spectrum for each observable. We then employ a halo model framework to theoretically predict all three correlation cases, and fit for the tSZ mass bias by jointly comparing the predicted spectra with our measurements. Our analysis will also allow us to place novel constraints on the galaxy bias, which is a fundamental ingredient to model the projected galaxy field, as well as on other parameters quantifying the foreground contaminations affecting the tSZ map.
This paper is organized as follows. In Section 2 we describe the data set we use. The measurements of the auto-and cross-correlations are presented in Section 3, while Section 4 details their theoretical predictions in a halo model framework. In Section 5, we present the methodology and results of our parameter estimation, and discuss the resultant implications. The concluding remarks are presented in Section 6. Throughout this work we assume a spatially flat Λ-CDM cosmology with cosmological parameters fixed to Planck 2018 best-fitting values, i.e. Ω c h 2 = 0.120, Ω b h 2 = 0.0223, h = 0.674, n s = 0.965, τ = 0.0540 and ln(10 10 A s ) = 3.043 (Planck Collaboration et al. 2020).
In the analysis of this paper we employ the Compton-y map from the Planck 2015 data release (Planck Collaboration et al. 2016c) and the projected galaxy density field from the WISE All Sky Catalogue (Wright et al. 2010). A detailed description of this data set is provided in the following.

Compton parameter map
We use the full sky Compton parameter map issued by the Planck Collaboration (Planck Collaboration et al. 2016c). The map was generated by a combination of Planck individual frequency maps with tailored algorithms that enhance the SZ signal and suppress the contribution from the CMB and other Galactic and extra-Galactic foregrounds. The individual frequency maps were convolved to a common resolution of 10 , which also sets the resolution of the resulting y map. We stress that the latter is still affected by foreground residuals, particularly by thermal dust emission at large scales and clustered cosmic infrared background (CIB) and Poisson-distributed radio and infrared sources at small scales (Planck Collaboration et al. 2014a, 2016c. These spurious contaminants will be accounted for in our modeling analysis in Sec. 4.4. These data are publicly available and can be downloaded from the Planck Legacy Archive 1 . The legacy products provide two independent all-sky maps produced using different linear combination algorithms, namely the Needlet Independent Linear Combination (NILC) method (Remazeilles et al. 2011) and the Modified Internal Linear Combination Algorithm (MILCA, Hurier et al. 2013). The results presented in Planck Collaboration et al. (2016c) suggest that, at multipoles < 30, the amplitude of the tSZ power spectrum measured on the MILCA-reconstructed y map is slightly higher than the one measured on the NILC-reconstructed y map. This difference is ascribed to a higher degree of contamination from thermal dust emission at large scales in the MILCA map. Hence, we kept the NILC map as our preferential choice and will present the results for this y map only. We remind, however, that for the remaining part of the explored multipole range, the power spectra extracted from the two versions of the y map prove consistency with Planck Collaboration et al. (2016c), and we did check that, in our power spectrum measurement, results obtained using the two maps agree within their uncertainties.
The maps are delivered in HEALPix format (Górski et al. 2005) with an original pixel resolution set by the 1 http://pla.esac.esa.int/pla Figure 1. WISE galaxy overdensity map, computed using Eq. (5). The masked region, where the overdensity is null, includes the contribution of both the Galactic plane mask and the cuts applied to the WISE catalogue to remove pointings affected by Moon contamination.
parameter N side = 2048. To optimise the efficiency of our data processing pipeline, the pixelisation was degraded to a lower N side = 512, which is sufficient for our purpose. To suppress the contribution from Galactic foregrounds, we finally impose a 40% Galactic plane mask, which is also available in the Planck Legacy Archive.

Galaxy overdensity maps
The WISE satellite scanned the whole sky in four photometric bands at 3.4, 4.6, 12 and 22 µm (labelled as bands W1 to W4). The survey had enhanced sensitivity and angular resolution compared to previous infrared missions such as the InfraRed Astronomical Satellite (IRAS Neugebauer et al. 1984) or the Two Micron All-Sky Survey (2MASS, Skrutskie et al. 2006). In the context of this paper WISE represents therefore the ideal candidate to map the distribution of galaxies over the full sky.
The resulting WISE All-Sky Data were made publicly available in 2012 and can be accessed at the NASA/IPAC Infrared Science Archive 2 . In this paper, we employed the WISE All-Sky Source Catalogue (Wright et al. 2010;Cutri et al. 2012) which contains positional and photometric data for more than 563 million sources detected at more than 5σ in at least one band. Among these are Galactic stars, galaxies and quasars, plus other unidentified sources. Hence, the catalogue needs to be suitably queried to extract the galactic objects representing the targets of our analysis.
The selection is performed based on the source flux values across different wavelengths, as bands W1 and W2 are mainly sensitive to Galactic or extra-Galactic starlight, while bands W3 and W4 probe thermal dust emission from the interstellar medium. Specifically, we follow the criteria outlined in Ferraro et al. (2015): we first apply the cut W1 < 16.6, which ensures a 95% completeness of the resulting catalogue. According to the same reference, this cut also ensures substantial sample uniformity on the sky at high galactic latitudes, despite the WISE inhomogeneous scanning strategy. Second, we consider sources satisfying the condition W1 -W2 > 0, which is typically found in galaxies. We then make use of additional flags to remove contaminations and spurious signals. Pointings close to the Moon are affected by its infrared emission, the effect being quantified by the field moon lev; we mitigate the Moon contamination by discarding all sources for which moon lev> 4. Finally, artifacts are eliminated by selecting only sources with the associated field cc flag= 0. The resulting queried catalogue consists of 50,030,431 sources, which are the galaxies we employ to reconstruct the matter overdensity field.
The galaxy number density map is generated by projecting the object catalogue onto an HEALPix map with resolution N side = 512. To the map we overlay the same Galactic plane mask used for the Compton parameter map, which, combined with the queries we applied on the WISE catalogue, yields a final unmasked sky fraction of f gg sky = 0.40. The resulting mean number of galaxies per pixel (computed outside the masked area) isn g 39.96. If we denote by n g the number of galaxies in a generic pixel, then the corresponding galaxy density fluctuation δ g for that pixel is computed as: Fig. 1 shows the resultant WISE galaxy overdensity map, where the masked region covers the Galactic plane plus a series of stripes that result from the excision of Moon contaminated pointings. WISE photometric data cannot yield direct estimates of the object redshifts. The analysis conducted in this paper, however, does not require the knowledge of individual source redshifts, as only the redshift distribution of the galaxy number density p s (z) will be needed in our theoretical modeling (Sec. 4.3). To this aim, we adopted the statistical distribution of galaxies derived in Yan et al. (2013) via cross-matching with SDSS DR7 data (Abazajian et al. 2009). The distribution is plotted in Fig. 2 and is found to peak at z ∼ 0.24, spanning the range from z = 0 to z ∼ 0.85. For the subsequent theoretical modeling described in Sec. 4 it is useful to have the redshift distribution parametrised analytically, which allows to evaluate the expected galaxy number density for any value of redshift within the covered range. For this purpose, we employ the Python Scipy cubic spline, which we found can reproduce the features of the histogram in Fig. 2 with higher fidelity compared to a polynomial fitting.

MEASUREMENTS OF ANGULAR POWER SPECTRA
The Compton parameter map and the projected galaxy overdensity map described in Secs. 2.1 and 2.2 are used to measure the auto-correlation angular power spectrum of each observable, and the cross-correlation angular power spectrum between the two. To this aim, we employ the Spatially Inhomogeneous Correlation Estimator for Temperature and Polarisation 3 (PolSpice, Challinor et al. 2011) package, which is a tool to statistically analyze any data pixelated over a sphere. The software accepts as input any combination of maps in HEALPix format, together with a possible sky mask or pixel weighting scheme, and it delivers as output the corresponding auto-or cross-power spectrum or two-point correlation function. For our purpose, it is more convenient to work with power spectra, so we will not be using correlation functions in this paper.
By inputting the maps described in Sec. 2 and the associated masks to PolSpice, we obtain the resulting tSZ angular power spectrum C yy , the power spectrum of the galaxy overdensity field C gg and the cross-correlation power spectrum C gy . Hereafter we shall label them as  Table 1, with error bars computed as the square root of the diagonal terms in the corresponding covariance matrices. Lines represent the associated theoretical predictions computed using the best-fit parameters quoted in Table 2 (full covariance case), obtained as described in Sec. 5. For each case we show separately the contribution of the one-halo and the two-halo terms, and the contribution of the foregrounds affecting the correlation. the yy, gg and gy power spectrum, respectively. These spectra are plotted separately in the three panels of Fig. 3. In order to smooth out the power spectrum scatter across neighbouring multipoles, the data points show the bandpowers computed over a set of effective multipole bins eff , adopting the same binning scheme as in Planck Collaboration et al. (2016a). Overall, we consider N bands = 19 multipole bins from a minimum eff = 10 to a maximum eff = 1247.5; the corresponding bandpowers are computed by averaging the spectrum values in each bin. In Table 1 we report the values eff we consider, together with their multipole range limits and the bandpowers for yy, gg and gy. This binning procedure will also be applied to the theoretical prediction of the power spectra as described in Sec. 4.5. The spectra plotted in Fig. 3 confirm that our data set allows measuring the auto-correlation of the galaxy density field and the Compton parameter maps. More importantly, a cross-correlation between the two is detected. It is convenient to quote the significance of these measurements, computed as where C represents any of the three power spectra and Cov −1 is the inverse of its covariance matrix, which is required to account for possible correlations between multipole pairs; the superscript "T" denotes transposition. The analysis of the covariance is described in details in Sec. 3.2; it is worth mentioning here that the diagonal terms of the covariance matrix quantify the uncertainties for the measured correlation at each multipole eff , and are plotted as error bars in Fig. 3. The significance computed with Eq. (6) are s = 88 for the gg spectrum, s = 150 for the yy spectrum and s = 21.8 for the gy spectrum. The latter value confirms that our measurement of the tSZ and galaxy density cross-correlation is robust. Finally, it is worth mentioning that our reconstruction of the yy power spectrum is consistent with the results obtained in Planck Collaboration et al. (2016c). To this aim, it is possible to compare the yy amplitudes listed in Table 1 with the ones from Table 3 in Bolliet et al. (2018). Marginal differences can be due to details in the data processing, such as the fact that we are using a non-apodized version of the mask, or the fact that Planck considers cross-correlations between data from the first and second halves of each pointing period, while we compute the auto-correlation from the full y map. Also, the contribution of our non-Gaussian uncertainties (Sec. 3.2) implies our final error bars on the yy power spectrum are larger than the ones shown in Planck Collaboration et al. (2016c).

Shot-noise correction
The discrete nature of the WISE galaxy catalogue and its splitting into different pixels induces a shot-noise component in the overdensity map that can bias our computation of the gg power spectrum. In order to estimate the shot noise power spectrum, we follow the same procedure outlined in Ando et al. (2018) and Makiya et al. (2018). We randomly split the WISE catalog into Table 1. Summary of our power spectra measurements. By labelling AB any of the possible combinations gg, gy or yy, we report for each effective multipole eff the power spectra value C AB , the Gaussian error contribution σG(C AB ) and the non-Gaussian error contribution σNG(C AB ). eff 10 16 C yy 10 17 σG(C yy ) 10 15 σNG(C yy ) 10 5 C gg 10 6 σG(C gg ) 10 5 σNG(C gg ) 10 10 C gy 10 11 σG(C gy ) 10 11 σNG(C gy ) two halves with a similar number of galaxies, and project them onto the sky generating two maps which we label δ g,1 and δ g,2 . We then compute the half-sum (HS) and half difference (HD) maps as: By construction, the HS map contains both signal and noise, while in the HD map, the signal cancels out, leaving only the shot noise contribution. We then get to a noise-cleaned estimate of the gg power spectrum as The power spectrum plotted in the first panel of Fig. 3 has already been shot noise-corrected. The correction is not applied to the cross-correlation of the galaxy overdensity field with the Compton map, as the shot noise in the former is uncorrelated from the noise affecting the latter.

The covariance matrix
The covariance matrix is required to complete the statistical characterisation of the measured power spectra, as well as to quantify their agreement with our theoretical models (Sec. 5.1). We expect to observe a non-zero correlation between different multipoles, with increasing statistical weight as their separation decreases. In order to maximise the statistical information provided by our measurements, in this paper we combine the contribution of all of the observed spectra together. To this aim, we define a 3N band -length vector C tot obtained by concatenating the three spectra gg, gy and yy as: We need to derive a general matrix Cov tot that quantifies the covariance for the full vector C tot (i.e the three spectra gg, gy and yy and the cross-correlation between different observed spectra). Applying the definition of covariance, it can be computed as: Cov gg,gg Cov gg,gy Cov gg,yy (Cov gg,gy ) T Cov gy,gy Cov gy,yy where the brackets denote the statistical sample average. The last equality shows that the full covariance can be expressed as a set of six independent N band × N band covariance matrices of the form Cov AB,CD (C AB 1 , C CD 2 ), where again capital letters denote any one of y or g. We recognise that the diagonal blocks, for which AB = CD, are the covariance matrices for each of the correlations gg, gy and yy, while the off-diagonal blocks quantify the "cross" covariance between different measured spectra. The square roots of the diagonal elements in the N band × N band matrices Cov gg,gg , Cov gy,gy and Cov yy,yy provide an estimate of the uncertainties associated with the corresponding power spectrum measurements, and are represented by the error bars in Fig. 3. By construction, these uncertainties also carry the contribution from any foreground residual contamination in the maps.
The covariance matrix needs to be evaluated analytically. Under the assumption that the thermal SZ fluctuations are purely Gaussian, the covariance matrix would be diagonal with no correlation between different multipoles. It could be evaluated from the knowledge of the measured spectra and the available sky fraction. However, works on hydrodynamical simulations proved that SZ fluctuations can indeed be non-Gaussian (Seljak et al. 2001;Zhang et al. 2002). We then write the covariance as the sum of a Gaussian and a non-Gaussian term following Makiya et al. (2018): where f ABCD sky is the observed sky fraction. The spectra in the square brackets (i.e.,Ĉ AC ,Ĉ AD ,Ĉ BD andĈ BC ) are the observed power spectra which include the contribution from noise and foregrounds, δ 1 2 is the Kronecker delta, and ∆ gives the discrete difference between multipole bins. We write the non-Gaussian term of the covariance matrix as The angular trispectrum is given by (see also Makiya et al. 2018 andBolliet et al. 2018): where χ(z) is the comoving distance, H(z) is the Hubble parameter and dn/dM is the halo mass function. The sky fraction in Eq. (13) depends on the chosen mask.
As we used different masks on the WISE and tSZ maps, f ABCD sky is determined by the combinations of the two. The sky fraction is equal to f yy sky = 0.59 and f gg sky = 0.40 for the yy-and gg-auto correlation, respectively. We determined the sky fraction for the cross-correlation as f gg,yy sky = f gg sky f yy sky ≡ 0.486 (Makiya et al. 2018). In Table 1 we report the diagonal values of the Gaussian and non-Gaussian covariance matrices obtained for yy, gg and gy.
Before closing this section we compute the correlation coefficient matrix, which can be obtained from the full covariance by normalising it to the diagonal values of the associated N band × N band covariance matrices: Notice that, according to this definition, only the diagonal elements of the diagonal blocks of the covariance matrix from Eq. (10) are equal to unity, whereas the diagonal elements in the non-diagonal blocks are not normalised to one. The resulting correlation matrix, made by its six independent building blocks, is plotted in Fig. 4, binned over the effective multipoles eff . The off-diagonal elements are negligible in each block due to the increasing multipole separation and the binning over effective multipoles. On the contrary, the non-zero diagonals inside the off-diagonal blocks reflect the existing correlation between C gg , C gy and C yy , which is nonnegligible. Such correlation is indeed captured by the full covariance matrix computed with Eq. (11), which is the one we employ to conduct the parameter estimation in Sec. 5.

THEORETICAL MODELING
We detail in this section the formalism we employ to theoretically predict the observed auto-correlation power spectra C yy and C gg , and the cross-correlation power spectrum C gy theoretically. The following describes how the mass bias parameter we are interested in enters this prediction, together with a set of nuisance parameters whose correlation will be explored in Sec. 5.  Table 1.

Halo model
Our theoretical framework for the angular power spectrum calculation is based on the halo model, which is an established approach to the problem of predicting cross-correlations under the assumption that all galaxies live in haloes (Komatsu & Kitayama 1999;Osato et al. 2020). We label C AB the generic cross-correlation power spectrum between observables A and B. According to the halo model, it can be decomposed into the contribution of a one-halo (intra-halo) term C AB,1h and of a two-halo (inter-halo) term C AB,2h , as: The one-halo term quantifies the integrated contribution of all the observable haloes considered individually, and can be expressed as: where cχ 2 (z)/H(z) = d 2 V /(dzdΩ) is the comoving volume per unit redshift and solid angle, and A and B are the spherical Fourier transforms of the corresponding generic observables on the sky. The integral endpoints z min , z max and M min , M max are to be chosen depending on the redshift and mass spans of the targeted observables.
The two-halo term quantifies the effect of inter-halo correlations, and can be written as: where P lin m (k, z) is the linear matter power spectrum and b(M, z) is the halo bias.
In our implementation we will use the mass function parametrisation from Tinker et al. (2008), the halo bias parametrisation from Tinker et al. (2010) and compute the linear matter power spectrum using camb (Lewis et al. 2000). As per the integration extrema, we set z min = 10 −3 , z max = 5 when computing the yy autocorrelation, as this interval safely includes all contributions from galaxy clusters (we do not set z min = 0 to avoid divergences in the computation of angular sizes). For the gy cross-correlation and the gg auto-correlation, instead, we set z min = 3 × 10 −2 and z max = 1, as this the redshift range spanned by WISE galaxies (see Fig. 2). Regarding the mass, we set a lower limit of 10 11 h −1 M , below which the ICM pressure becomes negligibly low and an upper limit of 10 16 h −1 M , after which the mass function severely cuts off the halo abundance. We checked that the final predictions do not vary appreciably if changing the mass limits by a few per cent in log 10 (M ).
The remaining quantities to be determined are the Fourier transforms A (M, z) and B (M, z) for the generic observables. We detail in the rest of this section their evaluation for the Compton parameter and the galaxy density field. Before, it is worth reminding that the mass function is parametrised in terms of the overdensity mass M 200,m (Tinker et al. 2008), i.e. the mass enclosing a radius whose mean density equals 200 times the matter cosmic density at that redshift. The cluster pressure profile, which is used to compute the Compton parameter, is instead typically expressed as a function of M 500,c , namely the mass enclosing a radius R 500 whose mean density equals 500 times the critical density of the Universe ρ crit : The critical density, in turn, can be expressed as with Ω m and Ω Λ the matter and dark energy density parameters, respectively. Finally, the galaxy density field is usually modelled via the halo virial mass M vir , defined as: where R vir is the virial radius and the overdensity ∆ vir is parametrised in Bryan & Norman (1998) as (see also eqs. D2, D9, D10 in Komatsu et al. 2011) with In our implementation of the halo model formalism, we choose to set M 200,m as our independent variable, and the aforementioned integration limits are quoted for this mass definition. When computing the Fourier transform for the Compton parameter or the galaxy density field, the value is properly converted into M 500,c and M vir , respectively. This conversion is performed employing the Python COLOSSUS package 4 (Diemer 2018) assuming a Navarro-Frenk-White (NFW, Navarro et al. 1996) profile and the mass-concentration relation from Duffy et al. (2008).

Fourier space Compton-y parameter
The Compton parameter defined in Eq. (3) is proportional to the electron pressure P e = k B n e T e integrated along the line of sight. The effect is particularly relevant in the direction of galaxy clusters; assuming that galaxies reside in virialised dark matter haloes, the y definition allows to define an effective halo 2-dimensional Compton parameter profile projected on the sky. For a generic halo of mass M 500 (to simplify the notation we will drop the subscript "c" hereafter) at redshift z, the latter is a function of the 3-dimensional electron pressure profile P e (r; M 500 , z), with r the comoving radial separation from the halo center.
The associated 2-dimensional SZ Fourier transform for a single halo can then be computed as (Planck Collaboration et al. 2014a, 2016c: where we introduced the scaled radial separation x = a r/R 500 (with a the scale factor at the halo redshift), and 500 = aχ/R 500 . We evaluate the integral between the limits x min = 0 and x max = 6 as the physical scale 5 R 500 is usually considered to mark the outer boundary of a galaxy cluster. We adopt the electron pressure profile parametrisation derived in Arnaud et al. (2010): where h 70 = h/0.7, α p 0.12 represents the departure from the standard self-similar solution and P(x) is the "universal" pressure profile (UPP). The latter is parametrised as a generalised Navarro-Frenk-White profile (Nagai et al. 2007):  (24) shows that the pressure depends on the effective tSZ cluster mass M tSZ 500 already introduced in Eq. (4). The departure from the assumption of hydrostatic equilibrium in the ICM, together with other model systematics, is quantified by the bias parameter B or by its equivalent b H . We want to stress that we also account for the hydrostatic mass bias when computing R 500 , so that our definition has been rescaled as R 500 (1 − b H ) 1/3 . As anticipated in Sec. 1, providing an independent estimate of b H is a major goal of our paper. We find it more convenient to fit for the bias expressed as B, which will be the independent parameter in the analysis described in Sec. 5.1. We will also report the corresponding constraints on the quantity 1 − b H in Table 3.

Galaxy density field Fourier transform
The projected galaxy density field measured at a generic directionn on the sky can generally be expressed by integrating the matter overdensity δ m over the comoving distance as (Ferraro et al. 2015;Hill et al. 2016;Ferraro et al. 2016): where the kernel function W g (χ) = b g p s (χ) is the product of the galaxy bias b g and the source distribution function p s (χ). The latter quantifies the probability of detecting a source in the interval [χ, χ + dχ], and has to satisfy the normalisation condition: The source function can be more conveniently expressed as a function of redshift via the variable change (we will continue to call the redshift distribution p s in a slight abuse of notation). For our WISE catalogue, the source distribution as a function of redshift is plotted in Fig. 2. The galaxy bias b g is an unconstrained quantity in our model. To a first approximation, it could be factorised out of the integral in Eq. (26) as a mean value for our data set, assuming it is independent of redshift; such approach was adopted for example in Ferraro et al. (2016) and Hill et al. (2016). Given the wide range of z values spanned by the WISE catalogue, it is more meaningful to explore a possible redshift dependence of the halo bias. Furthermore, we notice that the high-points have the smallest error bars and consequently a higher statistical weight in the parameter estimation described in Sec. 5. As those points lie in the range where the one-halo term is dominant, the latter is expected to be driving the fit in determining the most likely value for the galaxy bias. In order to break this coupling between small and large scales, and increase the statistical weight of the two-halo term in the fit for b g , we also include an explicit bias dependence on the multipole. As a result, we parametrise the galaxy bias as: letting the normalisation b 0 g at z = 0, = 0 and the scaling power indices α and β be free parameters in our model. The pivot scale 0 = 117 is computed as the median of the available multipole range, and it roughly corresponds to the scale at which the one-and two-halo terms have comparable amplitudes. The parametrisation in Eq. (29) is a generalisation of the redshift dependence for b g explored in Ferraro et al. (2015).
For a generic halo of mass M at redshift z, let ρ(ar; M, z) be the matter density at a radial separation ar from its center (r being the comoving separation and a the scale factor), and ρ m (z) the mean matter density at the same redshift. The associated matter overdensity halo profile is then defined by the ratio: The 2-dimensional Fourier transform for the matter overdensity defined above can be computed as: In order to Fourier transform the galaxy overdensity, we have to include the kernel function W g introduced in Eq. (26), as: The computation of the Fourier transform in Eq. (32) requires the choice of a functional form for the matter density halo profile ρ(ar; M, z), for which we shall assume again an NFW parametrisation as: where we denote by R the physical radial separation from the halo centre. The NFW profile is governed by two parameters, the normalisation density ρ 0 and the scale radius r s . The normalisation density can be computed by imposing that the volume integral of the halo density within its radius yields the total halo mass. As in this context we are working with virial quantities, the mass normalisation condition reads: By introducing the halo concentration parameter as the ratio between the virial and the scale radius, c vir = R vir /r s , the integral in Eq. (34) can be carried out to yield the normalisation density: where the function m(x) is defined as m(x) = ln(1 + x) − x/(1 + x) (Cooray & Sheth 2002). We shall use the parametrisation from Duffy et al. (2008) to compute the concentration parameter of a generic halo of mass M vir at redshift z: The normalised NFW profile can then be plugged back into Eqs. (30) and (31) to compute the 2-dimensional Fourier transform of the matter overdensity field. By defining the scaled radial separation as x = ar/r s , and s = aχ/r s , we obtain the expression: where˜ ≡ / s and in the last step we made the matter density redshift dependence explicit, ρ m (z) = ρ crit (0)Ω m a −3 (ρ crit (0) is the critical density at redshift 0). By substituting the expression for the NFW profile from Eq. (33) and its normalisation from Eq. (34), we obtain where we defined for convenience the sine and cosine integral functions: The final expression for the Fourier transform of the galaxy overdensity field g is therefore: .
The result in Eq. (40) can then be plugged into Eqs. (17) and (18) to compute either the auto-correlation power spectrum for the galaxy density field, or its crosscorrelation power spectrum with the Compton parameter map.

modeling the foreground contribution
As anticipated in Sec. 2.1, the Compton parameter map is affected by residual foreground contaminations that are not completely suppressed in the component separation analysis. These contaminations will bias the tSZ power spectrum measured from the map in Sec. 3, so that the pure tSZ power spectrum model described in Sec. 4.2 is no longer representative of the observed autocorrelation. We shall model the real yy auto-correlation spectrum, which hereafter is labelled C yy , as the halomodel predicted tSZ-only spectrum, C tSZ , plus a set of foreground terms as: The relation above considers the contribution from the clustered cosmic infrared background (CIB), infrared sources (IR), radio sources (Rad) and instrumental correlation noise (CN). The study of these foreground contributions to the Planck tSZ map has already been tackled in previous works, and we will employ for our analysis their values tabulated in Planck Collaboration et al. (2016c). Although such templates define the contaminants dependence at different angular scales, we let their actual contribution to the yy power spectrum be controlled by the set of amplitude parameters A CIB , A IR and A Rad , which are not constrained a priori and shall be fitted against our observables. We only fix the value for A CN = 0.903, as this value is required to reproduce the yy spectrum at the highest multipole = 2742, where instrumental noise is the dominant contribution (Bolliet et al. 2018;Makiya et al. 2018).
As WISE data probe the galaxy overdensity field up to z ∼ 0.8, the cross correlation with the Compton map may also be affected by CIB contaminations, as the latter is indeed a relevant foreground at z ∼ 1 (Makiya et al. 2018). Our theoretical modeling should therefore incorporate this possible contribution in the prediction of the gy cross-correlation. The Planck Collaboration delivered three maps of CIB anisotropies outside the Galactic plane region at the frequencies 353, 545 and 857 GHz (Planck Collaboration et al. 2016d). To assess the CIB contamination level in our power spectra, we compute the cross-correlation of each of these maps (downgraded to a 10' resolution) with the WISE galaxy overdensity map. The resulting spectra are plotted in Fig. 5; the decrease in power at ∼ 1000 is due to the beam smoothing, while there is no straightforward interpretation for the observed peak at low-. Using a different CIB map affects the amplitude of the resulting correlation, but does not result in an appreciable change in the spectrum shape. Therefore, we can consider the average of these three spectra, C g−CIBavg , also plotted in Fig. 5, to be representative of the CIB contamination dependence on . The CIB contamination is then included in our modeling of the cross-correlation between WISE and Planck data as: where C g−tSZ is the cross-power spectrum between the Compton map and the galaxy overdensity map computed using the halo model (Eq. (16)) and B CIB is a free parameter which gauges the actual CIB contamination at the power spectrum level 5 . B CIB is then different from the parameter A CIB which controls the amplitude of the CIB contamination in the yy auto-correlation, as it is not dimensionless. Since the CIB maps report the foreground specific intensity in units MJy sr −1 , while the galaxy overdensity map is dimensionless, we expect the amplitude coefficient B CIB to have dimensions sr MJy −1 . Its actual value has to be fitted against our measurements, as it is described in Sec. 5.1.

modeling observational effects
The power spectra predictions computed with the algorithm described above cannot be directly compared with the measurements presented in Sec. 3, because they do not include the effect of the beam convolution or the artificial mode coupling induced by the mask. The recipe described in Hivon et al. (2002) allows to account for these effects and to compute the associated pseudopower spectrum C P . The pseudo power spectrum is obtained from the theoretical power spectrum C using the following transformation: In the equation above, B = exp − 2 σ 2 b /2 is the beam window function, where σ b is related to the beam full width at half maximum θ FWHM by σ b = θ FWHM / √ 8 ln 2 = 0.00742 (θ FWHM /1 • ). For the Planck Compton maps we have θ FWHM = 10 , and our projected galaxy density map was degraded to the same resolution. The factor M in Eq. (43) is the mode-coupling matrix which is calculated as: where the term in round brackets is the Wigner-3j symbol, and W is the power spectrum of the mask. In our implementation we employ the MASTER code (Hivon et al. 2002) to perform this computation 6 . Notice that although our measured spectra are binned over the effective multipoles eff , the coupling in Eq. (43) is to be evaluated for all individual multipoles . Hence, we perform the spectrum binning into bandpowers only 6 Specifically, we employ the FORTRAN90 routines available at Prof. E. Komatsu webpage (https://wwwmpa.mpa-garching.mpg.de/ ∼ komatsu/crl/list-of-routines.html) to carry out the computation of the M 1 2 matrix.
on the final pseudo-power spectrum C P , and not on the simple theoretical prediction C , before comparing it with our measurements.

PARAMETER ESTIMATION
The theoretical modeling described in Sec. 4 allows us to compute the auto-and cross-correlation power spectra between the Compton parameter and galaxy overdensity maps, provided a set of parameters is defined. The parameters include the SZ mass bias parameter B, the galaxy bias parameter defining the kernel of the projected density field (more specifically, its normalisation b 0 g and its redshift and multipole scaling power indices α and β), and the nuisance parameters quantifying the amplitude of the foreground contaminations. In this section, we describe the methodology we employ to provide constraints on these parameters against the power spectra measured in Sec. 3, and discuss the resultant estimates.

Methodology
For the rest of this work we shall fix the cosmology and only let the foreground parameters and the mass and galaxy bias parameters free to vary. Although the bias parameters are the main focus of our work, we are also interested in assessing to what extent the degener- acy with the foreground parameters can affect their estimation. The parameter space we explore is then eightdimensional, each point of which can be expressed as a vector Θ = {B, A CIB , A IR , A Rad , B CIB , b 0 g , α, β}. The best-fitting 8-tuple can be determined by maximising a suitable likelihood function L, or by minimising a corresponding χ 2 function defined as χ 2 = −2 ln L. As anticipated in Sec. 3.2, we want to fit our model against all of the observed auto-and cross-correlations at the same time. The theoretical model can then predict a theoretical vector C theo (Θ) as defined in Eq. (9), which would depend this time on the parameter set Θ. If we label by C obs the vector whose components are the spectra measured in Sec. 3, the likelihood function of Θ can be computed by using the full covariance matrix Cov tot , defined in Eq. (10), as: where Cov −1 tot denotes the inverse of the covariance matrix, and the theoretical vector is made of the pseudopower spectra computed with Eq. (43). The procedure we follow for inverting the full covariance matrix is detailed in Appendix A.
We also consider a parameter estimation performed without using the full covariance matrix, reverting instead to the computation of three independent χ 2 , one for each spectra, and summing their contribution together: By construction, this is equivalent to using the full covariance matrix but setting the non-diagonal building blocks to zero (i.e. neglecting the correlation between different spectra). We shall refer to this method as the "diagonal blocks" case, in contrast to the "full covariance" case, which also takes into account the nondiagonal blocks in Eq. (10).
We explore the parameter space using a Markov Chain Monte Carlo (MCMC) approach. We employ the Python emcee package 7 (Foreman-Mackey et al. 2013), which allows to set priors on the parameters and specify the number of chains. We employed 100 chains with a total number of effective steps of 50000 after burn-in removal and chain thinning. The thinning factor was chosen as half the auto-correlation time, which represents the number of steps taken by each chain before reaching a position that is uncorrelated from the starting one, so that our thinned chains can be considered independent draws of parameter values from their posterior distributions; the large number of points per chain (> 50) still available after thinning ensures that our chains have reached convergence. We then use the Python Get-Dist package (Lewis 2019) 8 to retrieve the final posterior distributions on the parameters, which are plotted in Fig. 6 for both the full covariance and the diagonal blocks cases. The final estimates on the fitted parameters for both cases, together with their uncertainties and initial priors, are summarised in Table 2. The associated best-fit predictions for the gg, gy and yy power spectra are overplotted to the measured data points in Fig. 3.
We stress that the parameter errors extracted from their posterior distributions only quantify their statistical uncertainty. For the case of the full covariance, in Table 2 we quote, in addition, our estimates for the systematic uncertainties affecting the parameters. These systematic errors were obtained by modifying the WISE redshift distribution shown in Fig. 2. A more detailed description is provided in Appendix. B, together with a general discussion on the possible sources of systematics affecting our analysis.

Discussion
7 https://emcee.readthedocs.io/en/stable/. Table 3. Comparison between different constraints for the tSZ mass bias, expressed as 1 − bH, including the one obtained in this work. For each case we report the main observable(s) employed in the analysis, the corresponding specific survey/instrument and the considered mass range, although we redirect to the corresponding references for details. Constraints above the horizontal line all involve the use of the tSZ Compton maps. "WL" refers to weak lensing for brevity.  We discuss in this section the results obtained from the parameter estimation analysis, beginning with the difference between the diagonal blocks and the full covariance cases. The contours in Fig. 6 show that the results are generally compatible: the only posterior distributions that show a mild tension between the two cases are those involving the clustered cosmic infrared background A CIB and the amplitude of the infrared source contamination A IR . However, these deviations are always within one sigma. We conclude therefore that the usage of the full covariance matrix, which takes into account the cross-correlation between different observed spectra, produces a consistent result with respect to the case of considering only the corresponding covariances. This could be expected as the off-diagonal blocks of the full covariance matrix provide only a second-order contribution with respect to the diagonal ones.
We observe hints of anti-correlation between the parameters A CIB and A Rad , which can be expected as the associated foreground spectra have a very similar multipole dependence. The tight anti-correlation between the galaxy bias normalization b 0 g and the slope of the redshift dependence α is simply a result of our chosen functional form for b g (Eq. (29)); similar considerations apply to the joint posterior distribution between b 0 g and β. A positive correlation is found instead between A CIB and B CIB , which is understandable as they both gauge the level of CIB contamination, although in different power spectra. We also observe a positive correlation between B and B CIB : a higher value of B CIB requires a lower contribution from the tSZ power spectrum to fit our data points, thus favouring a lower Compton parameter amplitude which can be achieved via a higher bias B.
We consider now the best-fit values quoted in Table 2. The nuisance parameters controlling the foreground amplitudes in the yy auto-spectrum have also been constrained in previous works (Makiya et al. 2018;Bolliet et al. 2018;Rotti et al. 2021). Our A CIB and A Rad estimates are consistent with the findings of Makiya et al. (2018) and Rotti et al. (2021) respectively, while we find a higher value for the A IR parameter instead. However, the actual, individual values of the nuisance parameters are not of particular interest, as different amplitudes in Eq. (41) can lead to the same observed power spectrum. What matters in this context is the overall foreground contamination from all these sources combined. From the third panel of Fig. 3 we see that the foreground contribution dominates the yy auto-correlation at multipoles 200. This clearly shows the necessity of in-cluding the nuisance parameters in our fit and that their fitted values allow recovering the observed spectral amplitude at the smallest scales. In addition, we first provide an estimate of B CIB , which controls the CIB contamination to the crosscorrelation of the galaxy field with the Compton parameter map. The best-fit value is of order 10 −5 sr MJy −1 , and the associated contamination to the gy crosscorrelation allows us to recover the amplitude of the measured power spectra at large scales, as clearly shown in the second panel of Fig. 3. Similarly, the B CIB contribution is particularly relevant also at very small scales, where it has an amplitude larger than the two-halo term and enables our theoretical prediction to match the observed spectral amplitude. This proves the importance of accounting for the CIB contribution when tSZ is crosscorrelated with galaxy catalogues at z ∼ 1 or above.
We consider now the constraints obtained on the tSZ mass bias parameter, which is the main goal of our work. As already mentioned, several previous works have fitted for b H against different observables (see, e.g. Table 1 in Ma et al. 2015), as summarised in Table 3. With our analysis we obtain the estimate B = 1.50±0.07 (stat) ± 0.34 (sys), which corresponds to 1 − b H = 0.67 ± 0.03 (stat) ± 0.16 (sys); the latter value is also reported in Table 3. Our finding is largely in agreement with other estimates and is particularly consistent with Makiya et al. (2018), who also considered the cross-correlation between tSZ and galaxy density field. Although results from hydrodynamical simulations suggest that the tSZ mass underestimates the total cluster mass by only 5% to 20% (Meneghetti et al. 2010;Truong et al. 2018;Angelinelli et al. 2020;Ansarifard et al. 2020;Pearce et al. 2020;Barnes et al. 2021;Gianfagna et al. 2022), to solve the tension between cluster-based and CMB-based estimates on cosmological parameters, higher bias values are required (Planck Collaboration et al. 2016b). The results from Rotti et al. (2021) show that by excising the contribution of detected/resolved clusters from the Planck y map, a power spectrum analysis yields 1 − b = 0.85 ± 0.04, in agreement with simulations, while the inclusion of massive clusters leads to the estimate 1−b = 0.60±0.05, which is lower than our findings. The reference points out that this can result from a possible mass dependence of the bias or CIB contaminations, with novel data required to provide a deeper understanding. Our constraint suggests that the tSZ mass underestimates the true cluster mass by ∼ 33%, thus corroborating the hypothesis that the mass bias is higher than the results favoured by numerical simulations alone.
Finally, the linear galaxy bias amplitude at z = 0 and = 117 is constrained to b 0 g = 1.28 +0.03 −0.04 (stat) ± 0.11 (sys), while the power index for its redshift and multipole dependences are α = 0.20 +0.11 −0.07 (stat) ± 0.10 (sys) and β = 0.45±0.01 (stat) ± 0.02 (sys). From the first panel in Fig. 3 we see that the inclusion of a redshift and multipole dependence in the galaxy bias allows to recover the measured gg correlation at small scales, but tends to underestimate the spectral amplituce at the largest scales. This could be a result of our theoretical modeling, and could possibly be solved by opting for a full halo occupation distribution (HOD) approach instead of the halo model. For our parametrisation, the value of α denotes a mild redshift dependence, which results in a mean value for the galaxy bias across WISE redshift range (at the reference multipole) ofb g 1.37. It is indeed expected to obtain a higher linear galaxy bias with increasing redshift for an approximately magnitude-limited galaxy sample, in agreement with several observational and theoretical studies (Somerville et al. 2001;Gaztañaga et al. 2012;Crocce et al. 2016;Merson et al. 2019). We can compare our findings with previous estimates of the galaxy bias from works employing WISE data. In Ferraro et al. (2015) a similar functional form for the redshift evolution of the galaxy bias was considered, with a fixed exponent α = 1 and a fitted normalisation b 0 g = 0.98 ± 0.10. The same reference also considered a model with a constant bias, which yielded the estimateb g = 1.41 ± 0.15, which is compatible with our redshift-averaged value of 1.37. The works in Hill et al. (2016) and Ferraro et al. (2016) favour instead the mean value b g = 1.11 ± 0.08, which is lower than our normalisation at z = 0. Finally, we point out that our overall error estimate for b 0 g , considering both the statistical and the systematic contributions, is comparable with the uncertainties quoted by those studies.

Effective mass range
In order to better understand the best-fit values presented in Sec. 5.2, it is interesting to investigate what mass range has the highest statistical weight in constraining our model. To this aim, we follow the formalism presented in Rotti et al. (2021), which allows to estimate the mean mass that contributes to the computation of the power spectrum for each multipole . The computation is performed separately for the one-halo and the two-halo term via a weighted average over mass and redshift. For the generic AB cross-correlation, the mean mass contributing to the one-halo term reads: where it is understood the integrals are to be evaluated within the chosen ranges [z min , z max ] and [M min , M max ], and we have introduced the short-hand notation: The weighing factor f AB (M, z) is obtained as the product of the comoving volume element, the halo mass function and the Fourier transforms A (M, z) and B (M, z). Similarly, for the two-halo term we have: where and with P lin (k, z) the linear matter power spectrum and b(M, z) the halo bias (same as in Eq. 18). The resulting mean masses M AB,1h and M 2 AB,2h are plotted as a function of the multipoles in Fig. 7, for the gg, gy and yy cases. We see that in all cases the mean mass decreases with , thus suggesting the contribution from lower mass clusters dominates the computation of the power spectra at smaller scales, as expected. In general, the multipole dependence is stronger for the one-halo term than for the two-halo term, with the former being dominated by higher masses. When comparing different correlation cases, we notice the yy correlation is dominated by higher masses compared to the gg correlation, with the gy case in between. On average, the mean mass that mostly contributes to our yy measurement is of order 10 15 h −1 M ; this corresponds to the scale of massive clusters, which provide indeed the strongest signal in the y map. The gg power spectrum, instead, mainly receives its contribution from masses below 4×10 14 h −1 M . This may be linked to a difference in the nature of the projected galaxy density field and the Compton parameter as LSS tracers. While galaxy overdensities can be more readily linked to the underlying dark matter distribution, the hot gas responsible for the tSZ effect requires a higher gravitational potential; the latter is only reached at the peaks of the dark matter distribution, where the inter-galactic medium can be ionised and local galaxies virialise into galaxy clusters. We can expect therefore higher halo masses to provide the main contribution to the yy spectrum. Figure 7. The mean halo mass, as a function of , which mostly contributes to the computation of the auto-and crosspower spectra, calculated using Eq. (47) for the one-halo term and Eq. (49) for two-halo term. Results are shown for all our correlations cases, marked as gg, gy and yy.
In Table 3, we list the mass range explored by previous works in the literature. We notice that the effective mass range derived in this section is consistent with the ones employed by Hoekstra et al. (2015) and von der Linden et al. (2014); our estimate for the cluster mass bias is compatible with the results provided by those works.

Possible systematics
This section is dedicated to a review of the most likely sources of systematics that can affect our findings, related to both our data set and the methodology adopted in this paper.
Regarding our data set, the strongest source of systematic is the choice of a sensible redshift distribution to describe our selected WISE galaxy sample. As already commented in Sec. 2.2, we adopt the redshift distribution derived in Yan et al. (2013) for the sample of sources detected in the W1 band with S/R > 7. This distribution may not be entirely representative of the galaxy sample employed in this analysis, due to the cut we applied on WISE data. Besides, the redshift distribution was derived by cross-matching WISE detections with optical SDSS sources; this procedure may not be adequate for WISE higher redshift detection, which could be missed by the SDSS selection. In order to take these issues into account, we follow the strategy adopted in Ferraro et al. (2015): we repeat the full analysis by considering two modified versions of the WISE redshift distribution, obtained by shifting the fitted p s (z) function by ∆z = ±0.1. These distributions are compared in Fig. 8, while the results of the corresponding parameter estimation are shown in Fig. 9 (for simplicity we moved these results to Appendix B); the best-fit parameter val-ues are quoted in Table 4. For each parameter, we take the largest of the two offsets between these new bestfits, and the ones obtained from our fiducial choice for the p s (z) distribution, as the systematic uncertainty on that parameter. The resultant systematic error bars are quoted together with the statistical errors in Table 2.
Another possible source of systematics is our use of a halo model to predict theoretically the measured correlations. In this context, using a full HOD model could provide a better fit to the gg power spectra, especially at large scales. This approach was employed for example in Makiya et al. (2018). We feel, however, that the use of an HOD model may provide a better fit at the expense of increasing the number of free parameters; in the current analysis we prefer to keep using a halo model with less but more physically representative parameters (the halo mass bias and the galaxy bias). In particular, it is interesting to provide new constraints on the galaxy bias parameter using the cross-correlations obtained from our WISE projected density maps; the galaxy bias would not appear in our modeling if we employed an HOD approach.
Finally, for our modeling of the cluster pressure profile we employ the UPP form from Eq. (25), using the bestfit parameters obtained in Planck Collaboration et al. (2013); the latter were estimated on a set of 62 clusters with masses M 500 > 2 × 10 14 h −1 M at z < 0.45. As our analysis extends to higher redshifts and includes lower masses, it is legit to argue that our chosen UPP parametrization may not be suitable for objects with lower masses and higher redshifts. Adaptations of the UPP form to accommodate mass and redshift dependence have been explored for example in Battaglia et al. (2012) andLe Brun et al. (2015). Nonetheless, the analysis we described in Sec. 5.3 proves that the main masses contributing to our yy spectrum are of order 10 15 h −1 M ; for these high masses, the UPP fitted over the resolved Planck -detected clusters is adequate. The introduction of a mass and redshift dependence in the UPP parameters would introduce a further complication in our modeling, and although it could be tackled by future studies, it goes beyond the scope of the current analysis.
To summarise this discussion, we do acknowledge that our results are affected by systematic issues. The most relevant one, at the data level, is the choice of the redshift distribution describing our WISE sample; as commented above, the choice of offset versions for the p s (z) allows to provide an estimate for the additional systematic component in our error bars. We choose a substantial offset of ∆z = ±0.1, and take the offset between the extreme cases as a measurement of our systematic er-rors; it is then reasonable to believe that this additional uncertainty is likely overestimated (at least as far as the choice of a redshift distribution is concerned). Hence, even though we do not explicitly quantify the systematic errors stemming from the other two points described in this section, our conservative apporach enables us to consider the quoted error bars as representative of the overall systematic uncertainty affecting our analysis.

CONCLUSIONS
The current work aimed at providing novel constraints on the cluster mass bias parameter B, which quantifies the deviation of the tSZ-estimated cluster mass from the actual cluster mass. The difference between the two masses is due to the assumption of hydrostatic equilibrium in the ICM and other systematics affecting the determination of the underlying mass proxies. Although this task has already been tackled by previous work, no definitive conclusion has been reached about the value of B. The uncertainty on B is one of the major issues hindering the effective use of cluster number counts as a cosmological probe.
In this work, we fitted for the bias parameter by studying the correlation between the Sunyaev-Zel'dovich effect and the galaxy overdensity field. The former is quantified by an all-sky map of the Compton parameter published by the Planck Collaboration. The latter is obtained by projecting on the sky a galaxy catalogue acquired with the WISE infrared satellite. A proper mask was overlaid to these maps to excise regions affected by Galactic foregrounds or noisy pointings. With this data set, we measured the power spectra quantifying the Compton parameter auto-correlation (yy), the galaxy density auto-correlation (gg) and the cross-correlation between the two (gy). We made use of the PolSpice package, which allows computing power spectra of sky maps with customised masks, and also output the crosscorrelation between pairs of multipoles. To maximise the statistical information encoded in our data set, we joint the three spectra in a unique vector and computed its full covariance, which also includes the correlations between different spectra in its non-diagonal terms.
Our theoretical prediction for the observed spectra was based on a halo model. We detailed how we computed the Fourier transforms of the Compton parameter and the galaxy field and used them to evaluate the one-halo and two-halo terms. The hydrostatic mass bias parameter is a key ingredient in the modeling of the Compton parameter Fourier modes. Similarly, the Fourier transform of the galaxy density field is dependent on the linear galaxy bias, which controls the amplitude of the redshift distribution of the observed sources.
We allowed such bias to depend on both redshift and scale and parametrised it in terms of its amplitude b 0 g at z = 0 and on the respective power-law indices α and β. In our modeling, we also included the effect of foreground residuals in the Compton map, which affect its auto-correlation and its cross-correlation with the galaxy field. Eventually, we obtained a recipe to predict each correlation starting from a set of model parameters, including the hydrostatic mass bias B, the galaxy bias parameters b 0 g , α and β, and a set of nuisance parameters that quantify the foreground contamination in our measurements.
We derived the posterior probability distributions for these parameters using an MCMC approach implemented with the Python emcee package. As a sanity check, we also repeated the fit by neglecting the nondiagonal terms in the full covariance (i.e. by joining a posteriori the gg, gy and yy likelihoods), which yielded estimates close to the full-covariance case. While the posterior probability distributions quantified the statistical errors on the parameter estimates, we also evaluated additional systematic uncertainties; the latter were computed as the maximum offsets between these best-fit values and the ones obtained by adopting different redshift distributions to model our galaxy sample. Specifically, we considered two additional versions of the fiducial p s (z) distribution obtained by shifting the baseline redshift by ∆z = ±0.1. We showed this is an important variation in the p s (z), ensuring the resulting uncertainty is quite conservative and sufficient to include other possible sources of systematics in our analysis (e.g. the choice of an halo model, or the use of a universal pressure profile).
The joint distributions between parameter pairs did not show any strong degeneracy between the nuisance parameters and the parameters of most interest (B, b 0 g , α, β) except for the case of B and B CIB . We observed degeneracy though between b 0 g and α, which is expected as α controls the slope of the redshift dependence. The final best-fit estimates are B = 1.50±0.07 (stat) ± 0.34 (sys), corresponding to 1 − b H = 0.67 ± 0.03 (stat) ± 0.16 (sys), b 0 g = 1.28 +0.03 −0.04 (stat) ± 0.11 (sys), and the power index for its redshift and multipole dependences are α = 0.20 +0.11 −0.07 (stat) ± 0.10 (sys) and β = 0.45±0.01 (stat) ± 0.02 (sys). These results, together with the constraints for the foreground coefficients, prove effectiveness in reproducing the observed power spectra.
We find a linear galaxy bias normalisation in broad agreement with the estimates found in previous works, with an increase in the precision as far as the statistical error bar is concerned. The small statistical uncer-tainty we obtain for b 0 g is a result of our careful treatment of systematic contaminations from CIB and other foregrounds when modeling the reconstructed power spectra. We find a moderate bias dependence on the multipole, with smaller scales favouring a larger bias and a mild redshift dependence.
Finally, our estimate for the mass bias B suggests a ∼ 33% decrement of the tSZ mass with respect to the true cluster mass. This value is larger than estimates from numerical simulations, but it agrees with previous analyses that exploited this type of cross-correlation studies ( Table 3). This large value for the bias helps releasing the tension between CMB-based and cluster-based constraints of cosmological parameters (e.g. σ 8 Ω 0.3 m , Planck Collaboration et al. 2014b). This type of cross-correlation analysis can be applied to future data sets, which will allow improving our understanding of the halo warm-hot gas physics (Pandey et al. 2020). Improved constraints of the bias can be obtained, for instance, with the next generation of galaxy surveys, e.g. Vera C. Rubin Observatory (LSST; LSST Dark Energy Science Collaboration 2012), Euclid (Amendola et al. 2018) and DESI (DESI Collaboration et al. 2016), and CMB missions, such as the LiteBird (Matsumura et al. 2014) and CMB Stage-4 experiments (Abazajian et al. 2016 We detail in the following the procedure we adopted to invert the full covariance matrix defined in Eq. (10). In order to simplify the notation, we can re-label its six independent building blocks as: a ≡ Cov gg,gg b ≡ Cov gg,gy c ≡ Cov gg,yy d ≡ Cov gy,gy e ≡ Cov gy,yy f ≡ Cov yy,yy , so that the full covariance matrix reads: In the trivial case of considering only the diagonal blocks and setting the non-diagonal ones to zero, b = 0, c = 0, e = 0, the inverse covariance can be computed as: i.e. by simply inverting the diagonal blocks. This is the inverse covariance we employ for the parameter estimation when neglecting the cross-correlation between different power spectra (the fourth columns in Table 2).
In the more general case, it is still possible to partition the full covariance matrix into four blocks as: Matrices with the general form expressed by Eq. (A2), where A, B, D have arbitrary size, can be inverted as: where: In the derivation above it is implied that A, D and D − B T A −1 B must be square, invertible matrices. The diagonal case can be recovered by setting B = 0. The only non-trivial term now is the inverse A −1 , which can be computed as follows:  where: In our χ 2 computation we implement the transformations in Eqs. (A3) to (A6) in order to minimize possible numerical errors deriving from the inversion of a high rank matrix.

B. QUANTIFYING THE IMPACT OF THE CHOICE OF THE REDSHIFT DISTRIBUTION
We show in this section some further details on the estimation of the systematic error bars based on different choices of the WISE redshift distribution. The offset p s (z) distributions, compared with the original one, are shown in Fig. 8, while the posterior distributions obtained repeating the parameter estimation procedure with each of them are shown in Fig. 9. The best-fit parameters obtained in each case (which are used to estimate the systematic error component), are instead quoted in Table 4.