Does the Correlation between 2MRS Galaxies and the CMB Indicate an Unmodeled CMB Foreground?

We revisit the claimed detection of a new cosmic microwave background (CMB) foreground based on the correlation between low-redshift Two Micron All Sky Survey Redshift Survey (2MRS) galaxies and CMB temperature maps from the Planck and Wilkinson Microwave Anisotropy Probe missions. We reproduce the reported measurements but argue that the original analysis significantly underestimated the uncertainties. We cross-correlate the 2MRS galaxy positions with simulated CMB maps and show that the correlation measured with the real data for late-type spiral galaxies at angular scales θ ≥ 0.°1 and redshift cz < 4500 km s−1 is consistent with zero at the 1.7σ level or less, depending on the exact CMB map and simulation construction. This was the sample that formed the basis for the original detection claim. For smaller angular separations the results are not robust to galaxy type or CMB cleaning method, and we are unable to draw firm conclusions. The original analysis did not propose a specific, falsifiable physical correlation mechanism, and it is impossible to rule out any contribution from an underlying physical effect. However, given our calculations, the lack of signal from expanding the redshift range, and the lack of corroboration from other galaxy surveys, we do not find the evidence for a new CMB foreground signal compelling.


INTRODUCTION
Measurements of the CMB fluctuations, particularly from the Wilkinson Microwave Anisotropy Probe (WMAP; Bennett et al. 2013) and Planck (Planck Collaboration VI 2020) satellite missions, are a cornerstone of modern cosmology. The public data products from these experiments have seen extensive scrutiny within the community, motivated in part by seemingly anomalous features in the data (e.g., Bennett et al. 2011; Schwarz et al. 2016; Planck Collaboration VII 2019, and references therein), and by ongoing discrepancies between the CMB predictions within the standard ΛCDM model and other observations (e.g., Riess et al. 2022). Obtaining cosmological constraints requires separating the CMB from astrophysical foreground microwave signals, and accurate foreground separation has become increasingly important as measurement precision has improved.
Recently, Luparello et al. (2023; hereafter L23) reported a detection of a new, previously unmodeled CMB foreground signal by correlating the positions of low-redshift galaxies (cz ≤ 4500 km s−1; z ≤ 0.015) in the 2MASS Redshift Survey (2MRS; Huchra et al. 2012) with CMB maps provided by the Planck and WMAP teams. They showed that the thermal Sunyaev-Zel'dovich and integrated Sachs-Wolfe effects, known physical mechanisms that produce correlations between large-scale structure and the CMB, cannot explain this signal. If L23 have measured a genuine correlation and discovered a new foreground, this could have wide-reaching implications. It may necessitate revising values of cosmological parameters estimated from the CMB, which are used very widely in modern cosmological and astronomical analyses. It could also have important implications for the interpretation of large-scale CMB anomalies, as discussed by Hansen et al. (2023) and Lambas et al. (2024).
In this Letter we revisit the L23 analysis in an attempt to better understand the origin of the reported correlation. We reproduce their main results for elliptical and spiral galaxies in Section 2, address the key issue of estimating uncertainty in the correlation statistic in Section 3, and provide concluding remarks in Section 4.

COMPUTING THE 2MRS-CMB CORRELATION
Here we summarize the calculations performed by L23 and show that we can adequately reproduce them from public data. We follow L23 and use the SMICA temperature map and mask provided in the 2018 Planck data release.¹ For the various 2MRS galaxy samples we use the public catalog² described by Huchra et al. (2012).
The angular correlation statistic used by L23 is defined in their Equation 1, in which i and k label the CMB map pixels and galaxies, respectively, and C_k is the set of N_pix,k pixels lying within the annulus with radius between θ and θ + δθ from galaxy k. We exclude pixels that are masked in the SMICA confidence mask from the sum and adjust N_pix,k accordingly. Hereafter we drop the expectation brackets and denote the correlation statistic simply as ∆T(θ). To approximately match the angular binning shown in the L23 figures we use logarithmically spaced angular bins with radii from 0.01° to 20°. Figure 1 shows the correlation statistic we computed for the elliptical and late-type (Sb, Sc, and Sd class) spiral galaxies in the redshift range 300 km s−1 ≤ cz ≤ 4500 km s−1. These results can be directly compared to panels (a) and (e) of Figure 2 of L23. In addition to the mean correlation, we compute the standard deviation of the correlation statistic across each galaxy sample and plot the mean plus and minus the standard deviation as the upper and lower colored lines in each panel, again following L23. We also computed the correlation using the real SMICA CMB map and mask together with 100 simulated galaxy catalogs consisting of random, unclustered galaxy positions with the same number of entries as the real 2MRS data. The mean and standard deviation across the random galaxy samples are included as solid black lines in Figure 1 (cf. black lines and green hashed regions in Figure 2 of L23).
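The stacking procedure described above can be illustrated with a minimal sketch. It assumes that the statistic is formed by averaging unmasked pixel temperatures in each annulus around each galaxy and then averaging those per-galaxy values with equal weight; the exact normalization of L23's Equation 1 is not reproduced here, so this weighting is an assumption. The function names (`log_bin_edges`, `annulus_means`, `stack_profiles`) and the unit-vector inputs are hypothetical, not taken from the released code.

```python
import numpy as np

def log_bin_edges(theta_min_deg=0.01, theta_max_deg=20.0, n_bins=20):
    """Logarithmically spaced annulus edges in degrees (n_bins is an assumed value)."""
    return np.geomspace(theta_min_deg, theta_max_deg, n_bins + 1)

def annulus_means(pix_vec, pix_temp, pix_good, gal_vec, edges_deg):
    """Mean unmasked temperature in each annulus around each galaxy.

    pix_vec  : (n_pix, 3) unit vectors of the map pixel centers
    pix_temp : (n_pix,) temperatures
    pix_good : (n_pix,) bool, True where the confidence mask keeps the pixel
    gal_vec  : (n_gal, 3) unit vectors of the galaxy positions
    Returns an (n_gal, n_bins) array, NaN where an annulus has no usable pixel.
    """
    edges = np.radians(edges_deg)
    n_bins = len(edges) - 1
    per_gal = np.full((len(gal_vec), n_bins), np.nan)
    for k, g in enumerate(gal_vec):
        # Angular separation between galaxy k and every pixel.
        sep = np.arccos(np.clip(pix_vec @ g, -1.0, 1.0))
        which = np.digitize(sep, edges) - 1  # -1 or n_bins means outside all annuli
        for b in range(n_bins):
            sel = (which == b) & pix_good
            if sel.any():
                per_gal[k, b] = pix_temp[sel].mean()
    return per_gal

def stack_profiles(per_gal):
    """Mean and galaxy-to-galaxy standard deviation in each angular bin."""
    return np.nanmean(per_gal, axis=0), np.nanstd(per_gal, axis=0)
```

The brute-force separation calculation is only practical for small inputs; at full map resolution a pixel-query routine from a HEALPix library would normally replace it.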
We reproduce the essential features of the L23 analysis: there is a seemingly highly significant detection of negative ∆T(θ) on angular scales θ ≳ 1° for the elliptical and spiral galaxies at cz < 4500 km s−1, given the uncertainties estimated from either (1) the standard deviation across the 2MRS galaxies for each annular bin, or (2) the spread in the mock galaxy samples with random positions. This negative ∆T extends to smaller scales for the spirals but not for the ellipticals. There are some differences between our Figure 1 and Figure 2 of L23, particularly for smaller angular separations (θ ≲ 0.1°). These may arise, at least in part, from differences in angular bin boundaries, or in how pixels lying at bin boundaries are assigned. As discussed in the next Section, results at θ < 0.1° also show more dependence on the choice of CMB map. The more puzzling result from L23, and the primary motivation for our analysis, is the behavior at larger scales.

Figure 1. Correlation between subsets of 2MRS galaxies at cz < 4500 km s−1 and the Planck SMICA CMB temperature map. The top and bottom panels can be directly compared with panels (a) and (e) of Figure 2 of L23. We show two sets of uncertainty bands. The black lines and hashed regions correspond to the mean and standard deviation from mock galaxy samples with randomized positions. The colored bands correspond to the standard deviation across the real galaxy sample for each angular bin. The spirals in particular appear to exhibit a significant negative correlation over a wide range of scales, deviating from zero by many times the width of either set of uncertainty bands. (Legend entries recovered from the figure: Sb + Sc + Sd spiral galaxies; random galaxy positions.)

STATISTICAL SIGNIFICANCE OF CORRELATION
From a cosmological viewpoint, both the positions of the 2MRS galaxies, which trace the underlying density field, and the temperature fluctuations of the CMB, are random quantities that contribute scatter to the 2MRS-CMB correlation measured by L23. When comparing the data to simulations featuring no statistical correlation, one could generate synthetic versions of both the 2MRS catalogs and CMB skies. On average, however, the same correlation function uncertainty should be obtained from holding one of the data sets fixed and generating synthetic versions of only the other.³ This was the approach taken by L23, who correlated synthetic galaxy catalogs against the (fixed) CMB data map. However, they used random, unclustered galaxy positions, whereas the distribution of the real 2MRS galaxies is highly anisotropic on the sky. This is clear by eye from Figure 4 of L23, and arises from a combination of the clumpy nature of the underlying density field and any non-uniformity in the selection of the 2MRS sample.
Given the complexities of the real 2MRS data processing and selection, we do not have a straightforward way to generate more realistic, clustered, synthetic catalogs. We therefore considered a simpler method to estimate the ∆T uncertainties: correlating the real 2MRS data against simulated CMB temperature maps. We note that L23 did also use simulated CMB maps in their analysis (see their Section 4.2), but not for assessing detection significance.
One might wonder why it is informative to vary the CMB map given the consistency of results from switching out the Planck map for a WMAP-derived one.While the consistency of the Planck and WMAP results shown by L23 rules out a systematic origin specific to one of the CMB data sets (see their Appendix B), here we are instead trying to quantify the scatter in ∆T expected from the cosmic variance in the CMB map (i.e., the different ways the harmonic mode amplitudes and phases could have been arranged while originating from the same underlying statistical distribution).
³ The L23 correlation ∆T(θ) is directly related to the usual two-point angular correlation function. For uncorrelated fields a and b the variance of the two-point correlation depends on a four-point product of the form a²b². One of a² and b² can be replaced with a fixed (measured) a₀² or b₀² without biasing the variance estimate.
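The statement in the footnote can be checked numerically in a toy setting. The sketch below assumes two uncorrelated white-noise fields of unit variance, which is far simpler than the clustered galaxy and CMB fields in the actual analysis, and compares the variance of their mean product when both fields are redrawn with the variance when one field is held fixed. All function names are hypothetical.

```python
import numpy as np

def cross_stat(a, b):
    """Mean product of two fields, standing in for a zero-lag two-point correlation."""
    return np.mean(a * b, axis=-1)

def variance_both_random(n_pix, n_sims, rng):
    """Variance of the statistic when both fields are redrawn for every realization."""
    a = rng.normal(size=(n_sims, n_pix))
    b = rng.normal(size=(n_sims, n_pix))
    return cross_stat(a, b).var()

def variance_one_fixed(a0, n_sims, rng):
    """Variance of the statistic when field a is held at one measured realization a0."""
    b = rng.normal(size=(n_sims, a0.size))
    return cross_stat(a0, b).var()
```

For unit-variance white noise both estimates scatter around 1/n_pix; the fixed-field version differs only through the realized mean of a0², which equals the ensemble value on average.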
We generated Gaussian, statistically isotropic CMB temperature maps using the synalm routine in healpy,⁴ the Python implementation of the HEALPix library (Górski et al. 2005). To produce maps with power consistent with the real SMICA maps we used the best-fit Planck 2018 ΛCDM cosmological parameters from the joint fit to temperature, polarization and lensing power spectra (Planck Collaboration VI 2020). We then smoothed the simulated maps with a Gaussian beam with FWHM of 5′, matching the SMICA processing (Planck Collaboration IV 2020). To improve agreement with the small-scale power in the real SMICA map (multipole moment ℓ > 1500) we finally added a white noise component with power C_ℓ = 1.5 × 10⁻⁴ µK².
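As a rough illustration of the power budget of such simulations, the sketch below builds the expected power spectrum of a simulated map: an input CMB spectrum multiplied by the squared Gaussian beam window, plus a flat white-noise term. The input array `cl_cmb` (in µK², indexed by ℓ from 0) is a placeholder that would come from a Boltzmann code, and the function names are hypothetical. The paper smooths and adds noise at the map level; the spectrum-level expression here is the equivalent statement for the expected power only.

```python
import numpy as np

def gaussian_beam_window(ell, fwhm_arcmin):
    """Gaussian beam transfer function B_ell for a given FWHM in arcminutes."""
    sigma = np.radians(fwhm_arcmin / 60.0) / np.sqrt(8.0 * np.log(2.0))
    return np.exp(-0.5 * ell * (ell + 1) * sigma**2)

def simulated_spectrum(cl_cmb, fwhm_arcmin=5.0, noise_cl=1.5e-4):
    """Expected power of a beam-smoothed CMB map with unsmoothed white noise added.

    cl_cmb   : input CMB temperature spectrum in uK^2, index = multipole ell
    noise_cl : white-noise level in uK^2 (the value quoted in the text)
    """
    cl_cmb = np.asarray(cl_cmb, dtype=float)
    ell = np.arange(cl_cmb.size)
    return cl_cmb * gaussian_beam_window(ell, fwhm_arcmin) ** 2 + noise_cl
```

With a 5′ beam the window falls to roughly one half in amplitude by ℓ ≈ 2000, which is why a noise term of this kind only matters for the small-scale power.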
Figure 2 shows ∆T(θ) obtained from correlating the elliptical and spiral 2MRS samples used in Figure 1 (cz < 4500 km s−1) with 1000 simulated CMB maps. We applied the same SMICA confidence mask used in the analysis of the data map. We show the individual simulation results in Figure 2 to emphasize that there are very significant off-diagonal correlations between different θ-bins. More quantitatively, we find that neighboring bins are up to 97-99% correlated, depending on the exact 2MRS sample, while, for example, the widely spaced bins centered around θ of 0.1° and 20° are correlated at the 70-90% level.
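To see why correlations of this strength matter for judging significance, consider a toy case, not taken from the paper: ten bins of unit variance that are all 98% correlated with one another, and a measurement that sits 1σ low in every bin. Treating the bins as independent gives a quadratic form d·d/σ² of 10, while including the off-diagonal covariance gives about 1. The numbers and function names below are illustrative assumptions.

```python
import numpy as np

def quadratic_form_full(d, cov):
    """d^T C^{-1} d using the full covariance matrix."""
    return float(d @ np.linalg.solve(cov, d))

def quadratic_form_diagonal(d, cov):
    """Same quantity if the off-diagonal covariance elements are (wrongly) ignored."""
    return float(np.sum(d**2 / np.diag(cov)))

n_bins, rho = 10, 0.98
# Equicorrelated covariance: unit variance on the diagonal, rho everywhere else.
toy_cov = np.full((n_bins, n_bins), rho) + (1.0 - rho) * np.eye(n_bins)
toy_data = -np.ones(n_bins)  # every bin 1 sigma below zero

full_value = quadratic_form_full(toy_data, toy_cov)      # n / (1 + (n - 1) rho)
naive_value = quadratic_form_diagonal(toy_data, toy_cov)  # n
```

A coherent offset across strongly correlated bins is effectively a single fluctuation, so counting it once per bin overstates the significance by a large factor.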
We construct a simple χ² statistic of the form

$$\chi^2 = (d - m)^{T} C^{-1} (d - m),$$

where d is the (real or simulated) ∆T(θ) vector, m is the model prediction, fixed to zero to correspond to the null case of zero underlying 2MRS-CMB correlation, and C⁻¹ is the inverse covariance matrix. We find that simply inverting the sample covariance from the simulations is sufficiently accurate given the number of realizations and θ-bins. We recover the χ² distribution with the expected number of degrees of freedom for all the 2MRS-CMB simulation results, which also confirms that the distribution of the ∆T statistic is sufficiently Gaussian at all θ.
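A minimal sketch of this test is given below, assuming the simulated ∆T(θ) vectors are stored as rows of an array `sims` with shape (number of simulations, number of θ-bins). The covariance is the plain sample covariance of the simulations, with no correction factor applied to its inverse, in line with the statement above; the function names and the empirical probability-to-exceed defined as the fraction of simulations with larger χ² are illustrative choices rather than the released implementation.

```python
import numpy as np

def chi2_null(vectors, cov):
    """chi^2 = d^T C^{-1} d for one vector or a stack of vectors (model m = 0)."""
    v = np.atleast_2d(np.asarray(vectors, dtype=float))
    solved = np.linalg.solve(cov, v.T).T  # C^{-1} d for every row of v
    return np.einsum("ij,ij->i", v, solved)

def empirical_pte(data_vec, sims):
    """Fraction of simulations whose chi^2 exceeds that of the data.

    Returns (pte, chi2_data, chi2_sims). The covariance is estimated from
    the same simulation ensemble that defines the reference distribution.
    """
    cov = np.cov(sims, rowvar=False)
    chi2_sims = chi2_null(sims, cov)
    chi2_data = float(chi2_null(data_vec, cov)[0])
    return float(np.mean(chi2_sims > chi2_data)), chi2_data, chi2_sims
```

If the statistic is close to Gaussian, the mean of `chi2_sims` should sit near the number of θ-bins, which is the consistency check described in the text.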
On the real data, we find that the χ² value for the 2MRS elliptical sample at cz < 4500 km s−1 is consistent with the simulations, with a probability-to-exceed (PTE) of 0.64 based on the fraction of simulations with higher χ². For the late-type (Sb, Sc, Sd) spirals, we find that the χ² is far higher than in any of the simulations, but that this discrepancy is driven entirely by the correlation pattern at θ < 0.1°. While the measured correlation clearly lies within the envelope of simulated results in Figure 2, the bin-to-bin variation is significantly different from that predicted in the simulations over this range. It is not clear what drives this behavior, although we have verified that the high χ² is not from any single bin. Overall the results at θ < 0.1° are not robust to galaxy type (see also Figure 2 of L23) or CMB map choice (Section 3.1, below). We therefore focus on results at θ > 0.1°. In this case, the χ² from the spiral sample is compatible with the simulations, with a PTE of 0.20 (corresponding to a 0.8σ deviation from ∆T = 0). We report PTE values for θ > 0.1° in Table 1.
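The conversion between a PTE and an equivalent Gaussian significance can be sketched as follows. The convention is an inference rather than something stated in the text: the quoted pairs (PTE 0.20 with 0.8σ, and PTE 0.14 with 1.1σ) are consistent with a one-sided Gaussian tail, which is what this hypothetical helper assumes.

```python
from statistics import NormalDist

def pte_to_sigma(pte):
    """Equivalent one-sided Gaussian deviation for a probability-to-exceed.

    Assumes the one-sided convention: pte is the upper-tail probability of a
    standard normal variable. Values of pte above 0.5 map to negative numbers.
    """
    if not 0.0 < pte < 1.0:
        raise ValueError("pte must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - pte)

spiral_sigma = pte_to_sigma(0.20)  # about 0.84
joint_sigma = pte_to_sigma(0.14)   # about 1.08
```

Under a two-sided convention the same PTE values would map to larger deviations (about 1.3σ and 1.5σ), so the convention should be stated whenever such numbers are compared.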
The consistency between the elliptical and spiral galaxy correlations at θ ≳ 1° may seem to argue against a statistical fluke explaining the measured ∆T. To test this, we formed a joint data vector, concatenating the ∆T(θ) results from the two samples, and computed the χ². These results are shown in Table 1. Combining the ∆T(θ) measurements only slightly reduces the PTE for the SMICA CMB map compared to spirals alone (0.20 to 0.14). Based on the simulations, it is not in fact surprising to see similar correlation patterns with the elliptical and spiral samples. This is again due to strong off-diagonal covariance elements for ∆T(θ), including in this case between the two galaxy samples, as suggested by eye from their spatial distributions (Figure 4 of L23). L23 performed further splits of the 2MRS sample by galaxy and environment properties. We have not investigated all of these; however, we did perform calculations for the large spiral sample (with physical radius greater than 8.5 kpc, see Section 2.2 of L23). These results are shown in Table 1. This was the sub-sample with seemingly the strongest correlation signal, and L23 reported a ≥ 4σ deviation from zero in each of the ten angular bins from 1° to 10° in their Section 5. Overall, none of the calculations we have performed indicate deviations at even the 2σ level compared to results from correlating against the simulated CMB maps for θ > 0.1°.
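The joint test can be sketched as below. The key assumption, which is ours, is that row s of both simulation arrays comes from the same simulated CMB map, so that the sample covariance of the stacked vectors captures the cross-covariance between the two galaxy samples. The array and function names are hypothetical.

```python
import numpy as np

def joint_vectors(data_a, data_b, sims_a, sims_b):
    """Concatenate two samples' data vectors and their matched simulation ensembles.

    sims_a, sims_b : (n_sims, n_bins_a) and (n_sims, n_bins_b); row s of each
    must come from the same simulated CMB map.
    """
    if sims_a.shape[0] != sims_b.shape[0]:
        raise ValueError("simulation ensembles must be matched one-to-one")
    return np.concatenate([data_a, data_b]), np.hstack([sims_a, sims_b])

def cross_block_correlation(sims_a, sims_b):
    """Correlation coefficients between bins of sample a (rows) and sample b (columns)."""
    n_a = sims_a.shape[1]
    corr = np.corrcoef(np.hstack([sims_a, sims_b]), rowvar=False)
    return corr[:n_a, n_a:]
```

The stacked arrays can then be passed to the same χ² calculation used for a single sample; large entries in the cross block indicate that similar patterns in the two samples are expected even with no underlying signal.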

Checking results with more realistic SEVEM CMB simulations
While our simplified CMB simulations were constructed to roughly match the power in the real SMICA CMB map, they do not contain realistic foreground residuals or noise. We investigated whether this could be driving the anomalous results for the spiral galaxies at small θ. The Planck 2018 data release did not include a full set of SMICA-cleaned simulated maps (i.e., from applying the SMICA algorithm to simulations containing CMB, foregrounds and noise). However, 600 such simulations were provided for the alternative SEVEM foreground separation method (Planck Collaboration IV 2020) in the subsequent PR4/NPIPE release⁵ (Planck Collaboration Int. LVII 2020). We therefore loaded the PR4/NPIPE SEVEM cleaned data map plus the SEVEM simulations, and performed the same ∆T(θ) calculation as before, retaining the 2018 SMICA confidence mask. These results are shown in Figure 3.
While the ∆T(θ) results with the SEVEM CMB map are not identical to those from SMICA, particularly for the spiral galaxies at small θ, the overall results are qualitatively consistent. The spiral galaxy χ² remains anomalously high, and this is again due to bins at θ < 0.1°. This indicates that our more simplistic CMB simulations described above were not the main factor causing the high χ². Excluding θ < 0.1°, the SEVEM χ² is consistent with the simulations at the 0.3σ and 1.7σ levels for the elliptical and spiral samples (PTE values are reported in Table 1). Concatenating the data vectors from the two samples does not produce a significant detection, with the combined χ² consistent with the simulations at the 1.6σ level.

Understanding the discrepancy with L23 uncertainties
The uncertainty bands plotted in Figures 2 and 3 are many times wider than those shown by L23. As mentioned earlier (see Figure 1), it is notable that L23 found reasonable agreement between two different methods for estimating the uncertainties: (1) internal standard deviation across galaxies in the real 2MRS samples, and (2) scatter from correlating randomly generated, unclustered galaxy positions with the real CMB map. Our results imply both these methods must significantly underestimate the true uncertainties, and here we investigate this in more detail.
The standard deviation in ∆T(θ) across the galaxies in the sample would be an appropriate measure of uncertainty if pixels belonging to annuli around the galaxy positions were independent for a given θ (i.e., if each galaxy contributed a unique set of pixels to the sum in Equation 1). For larger θ, however, this is increasingly not the case, since the annuli overlap. The highly anisotropic nature of the distribution of the 2MRS samples on the sky (Figure 4 of L23) exacerbates this issue. More quantitatively, for the 2MRS elliptical galaxies at cz ≤ 4500 km s−1, we find that around 75% of pixels in the correlation sum are unique for θ ≃ 1°, decreasing to only 5-10% for θ ≥ 10°. For the spiral galaxies, the sample is several times larger, and around 60% of pixels are unique for θ ≃ 1°, decreasing to 1-3% for θ ≥ 10°. This suggests, independent of our results using simulated CMB maps, that the L23 uncertainties are increasingly underestimated going to larger θ.
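One way to quantify this overlap is sketched below, assuming that "unique" means the number of distinct pixels divided by the total number of pixel entries in the sum over all galaxies for a given annulus. The text does not spell out the definition, so this is an assumption; counting only pixels that appear exactly once would be an alternative. The function name and the list-of-index-arrays input are hypothetical.

```python
import numpy as np

def unique_pixel_fraction(pixel_lists):
    """Distinct pixels divided by total pixel entries across all galaxy annuli.

    pixel_lists : sequence of integer index arrays, one per galaxy, holding the
    unmasked map pixels that fall in that galaxy's annulus for a fixed theta bin.
    """
    arrays = [np.asarray(p, dtype=np.int64).ravel() for p in pixel_lists]
    all_pix = np.concatenate(arrays) if arrays else np.empty(0, dtype=np.int64)
    if all_pix.size == 0:
        return float("nan")
    return np.unique(all_pix).size / all_pix.size
```

A value of 1 means no annuli overlap, while small values mean the same pixels are counted many times, so the per-galaxy values are far from independent.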
As mentioned earlier, the 2MRS galaxies are not uniformly distributed across the sky but cluster, and this was not taken into account when L23 constructed simulated galaxy catalogs based on random sky positions. To roughly estimate the degree to which this might cause an underestimate in ∆T(θ) uncertainties, we correlated the real SMICA CMB map with 2MRS samples with randomly rotated galaxy coordinates.⁶ This has the advantage of preserving the internal 2MRS clustering pattern without requiring an actual model of the clustering (power spectrum, etc.), while also not relying on simulated CMB maps. Unfortunately this method is not perfect, and one particular issue we found is that systematically more galaxies are 'lost' to the SMICA sky mask in the rotated coordinates than in the true coordinates. This is simply because in the true coordinates the Galactic plane cut in the SMICA mask roughly matches the portion of the sky with fewest 2MRS galaxies. Ignoring this caveat, the uncertainties in ∆T(θ) for the rotated galaxies are larger than those from the random, unclustered samples by a factor of roughly two for smaller scales θ ≲ 0.1°, and up to a factor of five or more for θ ≳ 10° for the late-type spirals, based on 1000 random galaxy coordinate rotations.
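A random rigid rotation of a catalog can be sketched as follows. This is one possible construction, not necessarily the one used for the results above: a rotation matrix is drawn uniformly from SO(3) via a QR decomposition of a Gaussian matrix and applied to the galaxy unit vectors, which leaves every galaxy-galaxy separation unchanged. The function names are hypothetical.

```python
import numpy as np

def random_rotation(rng):
    """3x3 rotation matrix drawn uniformly from SO(3)."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs so the draw is uniform
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]       # flip one axis to turn a reflection into a rotation
    return q

def lonlat_to_vec(lon_deg, lat_deg):
    """Unit vectors from longitude and latitude in degrees."""
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def rotate_catalog(gal_vec, rot):
    """Apply the same rotation to every galaxy unit vector, shape (n_gal, 3)."""
    return gal_vec @ rot.T
```

Because the whole catalog moves together, the clustering pattern is preserved while its alignment with the CMB map and mask is randomized, which is also why more galaxies can land in masked regions after rotation.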
While we regard our calculations using the simulated CMB maps as more robust, the calculations just described provide further support for our argument that both methods used by L23 significantly underestimate the true ∆T(θ) uncertainties.

CONCLUSIONS
We have reproduced the main results for the correlation statistic ∆T(θ) from the L23 analysis using elliptical and spiral 2MRS galaxies at cz < 4500 km s−1 and the Planck SMICA CMB temperature map. For separations θ ≳ 0.1° we find negative ∆T(θ), roughly consistent for both galaxy types, which was the basis for the L23 claim of a new foreground detection.
We cross-correlated the 2MRS positions with simulated CMB maps that had zero underlying correlation with the galaxies. We found that for separation θ > 0.1° the measured elliptical and spiral correlations are actually consistent with zero at the 0.2σ and 0.8σ level (or 0.3σ and 1.7σ for the alternative PR4/NPIPE SEVEM CMB maps and simulations). While the correlation pattern for the elliptical and spiral samples is similar on these scales, the simulations show that this is not particularly surprising, and a joint analysis does not show a significant deviation from zero signal (1.1σ and 1.6σ for SMICA and SEVEM). We highlighted the strong off-diagonal covariance elements that must be accounted for in a quantitative goodness-of-fit assessment. We argued that L23 significantly underestimated the uncertainties (Section 3.2), which gives the impression of a significant signal detection.
On smaller scales, the correlation measured in the spiral galaxies is discrepant with the simulated results. The ∆T(θ) results at θ < 0.1° are not robust between different galaxy samples (as already shown by L23), or for the different CMB maps we considered. We therefore do not interpret the discrepancy with the simulations as indicating a genuine signal detection, although what exactly drives this difference is not clear.
L23 did not propose a specific, falsifiable physical mechanism for the 2MRS-CMB correlation, and we cannot rule out a physical mechanism making some contribution to the θ > 0.1° measurement. Based on our results in this work, however, we view a statistical fluctuation as an adequate explanation for the apparent correlation. While we have focused here on the redshift range highlighted by L23 as providing evidence for the new foreground signal, 0.001 < z < 0.015, it should also be pointed out that the correlation signal is diluted, and shifts towards zero, when a broader redshift range is considered. See Appendix A and Figure A.1 of L23 for results for z < 0.04, or cz < 12000 km s−1. Furthermore, to our knowledge, no corroborating signal has been reported from any other low-redshift galaxy surveys. The 2MASS data have been extremely valuable for many analyses in the past two decades; however, more recent surveys such as the 6dF Galaxy Survey (6dFGS; Jones et al. 2004) and Sloan Digital Sky Survey (SDSS; e.g., Gunn et al. 2006) have the advantage of providing better characterized selection functions for cosmology-focused correlation studies.
Finally, we note here that the kinematic Sunyaev-Zel'dovich effect (kSZ; Sunyaev & Zeldovich 1980), the Doppler shifting of the CMB photons from bulk motion of free electrons, may make some contribution to the measured correlation. This effect has the same frequency scaling as the primary CMB temperature fluctuations. Typically, the kSZ effect would average to zero in a cross-correlation between galaxies and the CMB, without some additional weighting, because roughly half the galaxies have peculiar velocity vectors pointing towards us, while the others point away (e.g., Hand et al. 2012). However, for the small volume and very low redshift considered here, some coherent bulk motion may produce a net positive or negative signal. We have not performed any quantitative test of this.
Code to reproduce calculations in this work is available in a Zenodo repository (Addison 2024).
I would like to thank Chuck Bennett, Mark Halpern, Gary Hinshaw, and Janet Weiland for helpful discussions relating to this work, and for providing comments on a draft of the manuscript. This work was supported in part by NASA ROSES grant 80NSSC24K0625. This work was based on observations obtained with Planck (http://www.esa.int/Planck), an ESA science mission with instruments and contributions directly funded by ESA Member States, NASA, and Canada. I acknowledge the use of the Legacy Archive for Microwave Background Data Analysis (LAMBDA), part of the High Energy Astrophysics Science Archive Center (HEASARC). HEASARC/LAMBDA is a service of the Astrophysics Science Division at the NASA Goddard Space Flight Center. This research has made use of NASA's Astrophysics Data System Bibliographic Services.

Figure 2. Correlation between 2MRS galaxy samples at cz < 4500 km s−1 and real or simulated CMB maps. The orange and blue lines are identical to those in Figure 1. The grey lines show results using 1000 simulated CMB maps with no statistical correlation with the 2MRS galaxies, and the black lines denote the mean and ±1σ, ±2σ intervals for the simulation ensembles. For separations θ > 0.1° the elliptical and spiral galaxy data correlations are consistent with zero within 0.2σ and 0.8σ (see text). For the concatenated elliptical-spiral data vector the difference is 1.1σ. Different θ bins are highly correlated, making a by-eye χ² estimate using the per-bin scatter inaccurate.

Figure 3. Same as Figure 2, except using the Planck PR4/NPIPE SEVEM foreground separation method for the CMB maps, and the 600 simulated SEVEM-cleaned maps provided with the PR4 release. The SEVEM simulations include more realistic foreground and noise residuals than those used in Figure 2; however, the χ² for the spiral galaxies at θ < 0.1° remains a strong outlier from the simulations (see text). The data correlations at θ > 0.1°, where results are more robust to CMB map choice and galaxy sample, are consistent with zero at the 0.3σ and 1.7σ levels for the ellipticals and spirals, or 1.6σ for the joint case.