A Comparison of Cosmological Parameters Determined from CMB Temperature Power Spectra from the South Pole Telescope and the Planck Satellite

The Planck cosmic microwave background (CMB) temperature data are best fit with a LCDM model that is in mild tension with constraints from other cosmological probes. The South Pole Telescope (SPT) 2540 $\text{deg}^2$ SPT-SZ survey offers measurements on sub-degree angular scales (multipoles $650 \leq \ell \leq 2500$) with sufficient precision to use as an independent check of the Planck data. Here we build on the recent joint analysis of the SPT-SZ and Planck data in \citet{hou17} by comparing LCDM parameter estimates using the temperature power spectrum from both data sets in the SPT-SZ survey region. We also restrict the multipole range used in parameter fitting to focus on modes measured well by both SPT and Planck, thereby greatly reducing sample variance as a driver of parameter differences and creating a stringent test for systematic errors. We find no evidence of systematic errors from such tests. When we expand the maximum multipole of SPT data used, we see low-significance shifts in the angular scale of the sound horizon and the physical baryon and cold dark matter densities, with a resulting trend to higher Hubble constant. When we compare SPT and Planck data on the SPT-SZ sky patch to Planck full-sky data but keep the multipole range restricted, we find differences in the parameters $n_s$ and $A_se^{-2\tau}$. We perform further checks, investigating instrumental effects and modeling assumptions, and we find no evidence that the effects investigated are responsible for any of the parameter shifts. Taken together, these tests reveal no evidence for systematic errors in SPT or Planck data in the overlapping sky coverage and multipole range and, at most, weak evidence for a breakdown of LCDM or systematic errors influencing either the Planck data outside the SPT-SZ survey area or the SPT data at $\ell>2000$.


Introduction
Anisotropies in the cosmic microwave background (CMB) have provided a wealth of information about the universe. The CMB temperature anisotropy power spectrum in particular provides some of the tightest current constraints on cosmological models. The most precise measurement of the CMB temperature power spectrum at medium and large angular scales has been made by the Planck satellite as published in the 2015 February Planck data release (Planck Collaboration et al. 2016a). Sensitive measurements of the CMB temperature anisotropy have also been made using ground-based telescopes such as the South Pole Telescope (SPT, Carlstrom et al. 2011) and the Atacama Cosmology Telescope (ACT, Swetz et al. 2011). Story et al. (2013, hereafter S13) used SPT data from the 2540 $\text{deg}^2$ SPT-SZ survey to make the most precise measurement of the CMB temperature power spectrum damping tail at angular multipoles $\ell > 2000$ and a measurement at $650 \leq \ell \leq 2000$ that is second only to Planck in precision.
With the exquisite precision of the Planck measurements, signs of moderate discrepancy have been noted between cosmological parameters estimated from the Planck CMB power spectra and other cosmological measurements. For example, Riess et al. (2016) found the value of $H_0$ determined from measurements of Type Ia supernovae (SNe Ia), calibrated with Cepheids, to be inconsistent with the Planck value by 3σ. Additionally, the amplitude of density fluctuations in the local universe implied by Planck CMB power spectrum data disagrees with certain local measurements of the density fluctuations at the 2σ level (e.g., Kilbinger et al. 2013; Planck Collaboration et al. 2016b).
Some discrepancy has also been noted between the cosmological parameter constraints from measurements of the CMB using different instruments or multipole ranges. Many authors, including Calabrese et al. (2017), have demonstrated 1-2σ differences in the values of $H_0$ and $\sigma_8$ between pre-Planck and Planck data. Addison et al. (2016) point out discrepancies between cosmological parameters determined from two halves of the Planck data (split at $\ell = 1000$), and between the best-fit cosmologies of Planck and SPT, although Planck Collaboration et al. (2016c) argue that these discrepancies are statistically insignificant.
While the statistical significance of these reported discrepancies ranges from low to moderate, they could be hints of the ΛCDM model breaking down or systematic contamination in one or more of the measurements. Because Planck and the SPT provide the most precise temperature power spectrum measurements, it is particularly important to carefully investigate any differences between these two data sets.
In a previous paper, Hou et al. (2017, hereafter H17) compared the SPT-SZ and Planck data at the map and power-spectrum level in a study similar to that performed by Louis et al. (2014) on ACT and Planck data. H17 used the Planck 143 GHz and SPT 150 GHz maps to create three sets of binned power spectrum measurements, or "bandpowers," in the SPT-SZ sky patch, namely the cross-spectrum of two independent halves of the Planck 143 GHz data (143×143), the cross-spectrum of the SPT 150 GHz data and Planck 143 GHz data (150×143), and the cross-spectrum of two independent halves of the SPT 150 GHz data (150×150). We refer to these collectively as the "in-patch" bandpowers. In H17, these bandpowers were shown to be consistent with each other and marginally consistent with the bandpowers obtained from the full Planck map. In this paper, we extend this comparison to the cosmological parameters obtained from these bandpowers and the Planck full-sky power spectrum.
We start by comparing the best-fit parameters obtained from the full-sky Planck data and the best-fit parameters from the SPT-SZ data, with a null hypothesis that the ΛCDM model is correct and the statistical models of both data sets are accurate. Under this null hypothesis, the parameters derived from the two data sets are marginally discrepant: the $\chi^2$ value for these parameter differences should be exceeded only 3.2% of the time (the probability to exceed (PTE) the $\chi^2$ between the parameter sets is 0.032; see Section 3 for details). Assuming that the null hypothesis is correct, this 3.2% probability must be understood as resulting from a somewhat (but not highly) unlikely statistical fluctuation. Other possible explanations include uncharacterized systematic errors or a breakdown in ΛCDM. In this paper, we attempt to distinguish between these three possibilities.
After quantifying the discrepancy in parameters determined from the full SPT and Planck temperature power spectra, we test for systematics by restricting the SPT-SZ and Planck data sets to the anisotropy modes that are measured well in both data sets. Specifically, we restrict both data sets to the SPT-SZ footprint, using the H17 in-patch bandpowers, and consider a fixed multipole range. Such a restriction greatly reduces the covariance of parameter differences given our null hypothesis by eliminating nearly all the sample variance contribution.
After testing for systematic errors with the restricted data sets, we test for other potential sources of the observed parameter differences between the full data sets. We first explore how parameters shift when the in-patch bandpowers are restricted to different ranges of angular scales; this tests for scale-dependent systematics or an inadequate cosmological model. Next we study how parameters shift from the in-patch bandpowers to the full-sky Planck bandpowers; this tests if the SPT-SZ patch is sufficiently unusual to challenge the assumptions of statistical isotropy and Gaussianity underlying our cosmological model or if there are systematic errors in the Planck data outside of the SPT-SZ patch. We also discuss the influence on parameters of the tilt in the in-patch bandpowers relative to Planck full-sky data first noted in H17.
Finally, we explore several other factors that could cause the mild discrepancy between the Planck- and SPT-derived parameters. We examine the SPT foreground model, the SPT calculation of beam uncertainty, the SPT τ prior, and the effects of lensing. These tests probe analysis and uncertainty modeling choices that could introduce systematic differences.
This paper is organized as follows. In Section 2 we describe our method for parameter estimation and comparison. In Section 3 we explore the consistency of Planck and SPT parameters estimated from the full data sets and from data restricted to the same sky patch and multipole range. In Section 4 we test for sources of systematic error from the foregrounds and beam uncertainty. We also discuss the influence of lensing and the τ prior on the parameter estimates. The conclusions are presented in Section 5.

Bandpowers
Several of our tests in this work make use of the publicly available Planck 2015 baseline high-ℓ temperature and low-ℓ temperature and polarization bandpowers. These are optimally combined multifrequency bandpowers, and we refer to them as the Planck Full Sky (PlanckFS) bandpowers (Planck Collaboration et al. 2016d). The Planck parameters we use are obtained from the baseline ΛCDM Markov chain Monte Carlo (MCMC) chain (Planck Collaboration et al. 2016a). We also use the designation PlanckFS to refer to the parameter estimates from this chain.
We also use bandpowers created from the SPT-SZ 150 GHz and Planck 143 GHz maps of the 2540 $\text{deg}^2$ SPT-SZ survey region. We refer to the cross-correlation of two half-depth maps as either 150×150 or 143×143. The cross-spectrum 150×143 is the correlation of the full-depth SPT and full-depth Planck maps. A detailed description of the creation of these bandpowers is provided by H17.

Cosmological Parameter Likelihood
We obtain parameter estimates by searching the space defined by the likelihood of the cosmological and nuisance parameters $\Theta$ given the data $D^{ij}$, a set of temperature bandpowers where $i$ and $j$ both run over the two frequency bands (143 GHz for Planck and 150 GHz for SPT). We assume the likelihood to take the Gaussian form
$$-2\ln\mathcal{L}(\Theta \mid D^{ij}) = \sum_{bb'} \left(D_b^{ij} - M_b^{ij}\right) \left(\Sigma^{ij}\right)^{-1}_{bb'} \left(D_{b'}^{ij} - M_{b'}^{ij}\right), \qquad (1)$$
where $\Sigma^{ij}_{bb'}$ is the bandpower covariance, $M_b^{ij}$ are the model temperature bandpowers (Equation (2)), and we have ignored the normalization constant in the likelihood. The model bandpowers are built from the theory spectrum and the following terms. The $Y^i$ and $Y^j$ terms are temperature calibration parameters, $W^{ij}_{b\ell}$ is the bandpower window function (e.g., Knox 1999), and $F^{ij}_\ell$ is the foreground model from Story et al. (2013) with frequency dependence included (George et al. 2015). The term $a_\ell$ includes aberration effects due to our proper motion with respect to the CMB, following Jeong et al. (2014), with $\beta = 1.23 \times 10^{-3}$ and $\langle \cos\theta \rangle = -0.26$. The calibration parameters are based on work from H17, where the three in-patch bandpowers are simultaneously calibrated to each other over the common multipole range.
The cosmological parameters in $\Theta$ are the optical depth τ and five others: the physical baryon density, $\Omega_b h^2$; the physical matter density, $\Omega_m h^2$; the angular scale of the sound horizon, $\theta_{\rm MC}$; the amplitude combination $A_s e^{-2\tau}$; and the scalar spectral index, $n_s$. While the Hubble constant $H_0$ is derived from these five parameters, we discuss it throughout this work because the discrepancy between the CMB-determined value of $H_0$ and local measurements is of particular interest. We place a Gaussian prior on the optical depth τ of 0.07±0.02.
Our parameter vector also includes six nuisance parameters. The $Y^i Y^j$ term is treated as a single parameter, and we include five parameters for foregrounds (three for the template amplitudes and two for the frequency dependence), giving $\Theta$ a total of 12 elements. All other parameters are fixed to the baseline model values from Planck Collaboration et al. (2016a), which are the default settings for the 2016 May version of CAMB.
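As a rough illustration of a Gaussian bandpower likelihood with a Gaussian prior on τ, the following minimal Python sketch evaluates −2 ln L for a toy data vector. The function names (`neg2_log_like`, `gaussian_prior`) and all numerical values are hypothetical, and the model bandpowers are passed in directly rather than assembled from the calibration, window-function, foreground, and aberration terms used in the analysis.

```python
import numpy as np

def neg2_log_like(data, model, cov):
    """Return -2 ln L for Gaussian bandpowers, dropping the normalization."""
    resid = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    # Solve cov @ x = resid instead of forming the explicit inverse.
    return float(resid @ np.linalg.solve(cov, resid))

def gaussian_prior(value, mean, sigma):
    """Return -2 ln of a Gaussian prior (up to a constant)."""
    return ((value - mean) / sigma) ** 2

# Toy numbers, not SPT or Planck values.
data = np.array([2.0, 1.5, 1.0])
model = np.array([1.9, 1.6, 1.0])
cov = np.diag([0.01, 0.01, 0.04])

# Prior on tau of 0.07 +/- 0.02, evaluated at a hypothetical tau = 0.09.
total = neg2_log_like(data, model, cov) + gaussian_prior(0.09, 0.07, 0.02)
print(f"-2 ln L (toy) = {total:.3f}")
```

A sampler or minimizer would call such a function once per proposed parameter vector, rebuilding the model bandpowers each time.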

The Covariance for the In-patch Bandpowers
The covariances for the bandpowers 150×150, 143×143, and 150×143 contain sample and noise variance along with beam uncertainty, $\Sigma_{bb'} = \Sigma^{S}_{bb'} + \Sigma^{N}_{bb'} + \Sigma^{B}_{bb'}$, where the S, N, and B superscripts signify the sample, noise, and beam covariances, respectively. A correlation matrix is formed for each source of beam uncertainty, and the beam covariance is built from these matrices.
Note that our comparisons between parameters from 150×143 and 143×143 with parameters from PlanckFS have inconsistent treatments of the 143 GHz calibration uncertainty. In the former two, the uncertainty is set to zero, while in the latter, the 0.07% absolute calibration uncertainty reported by the Planck team is included. We expect that these inconsistent treatments have negligible impact on our results because a 0.07% map-level calibration uncertainty would be a highly subdominant contribution to any of our in-patch parameter uncertainties; for example, the fractional uncertainty on $A_s e^{-2\tau}$ is approximately 3% for 143×143. We also find that fixing the relative calibration uncertainty between Planck and SPT has negligible impact on our parameter comparisons.
Planck Collaboration et al. (2016d) show that the uncertainty in the Planck beams has an effect smaller than 0.2% on the 143×143 bandpowers, and the resulting impact on parameter estimation is also extremely small (Planck Collaboration et al. 2016a). We thus make the simplifying assumption that the Planck beam uncertainty is zero.

Changes to Likelihood Since S13
In 2013, S13 presented the SPT-SZ 150 GHz bandpowers and resulting ΛCDM parameter constraints. There are some differences between the parameter estimation for S13 and for this analysis. First, we here assume massive neutrinos with a total mass of 0.06 eV and a Planck-based τ prior. Calabrese et al. (2017) point out the importance of these assumptions when comparing parameters estimated from the Planck data with other CMB results.
We handle calibration uncertainty by accounting for it in our model instead of including it in the bandpower covariance, and we use a different calibration prior than was used in S13. The method outlined above for handling beam uncertainty differs from S13 in that the beam covariance is now formed in a model-dependent way based on $M_\ell^{ij}$, the model for the cosmology of each MCMC sample. The previous methods for handling calibration and beam uncertainty used in S13 produced biased parameter constraints, lowering $A_s e^{-2\tau}$ and $n_s$ (see the Appendix for further detail). The new, much tighter calibration prior is based on the in-patch bandpower comparisons in H17.
In S13, aberration effects due to our proper motion with respect to the CMB were not included in the parameter estimation. Based on Jeong et al. (2014), we include aberration in Equation (2). Accounting for aberration leads to a shift of approximately 0.3σ in the S13 $\theta_{\rm MC}$ value toward the PlanckFS value.
In Figure 1 we show the differences in parameter estimates that are due to the above changes by comparing the parameters from S13 to parameters estimated from the same bandpowers, but with the updated likelihood (S13*). The decrease in $\theta_{\rm MC}$ is primarily the result of including aberration effects, $A_s e^{-2\tau}$ is increased by the new calibration prior, and $n_s$ is mostly increased by the new method of handling beam uncertainties. The changes to the likelihood relative to S13 lead to greater consistency between Planck and SPT.
In H17 and this work, we use the 150×150 bandpowers generated from half-power SPT maps instead of the bandpowers from S13, which were generated from cross-spectra of hundreds of single-observation maps. This choice makes the data easier to simulate and simplifies the 143 GHz cross-spectrum analysis, since those data were created in a similar manner. No significant difference was found between the 150×150 and S13 bandpowers; for more details see the Appendix of H17. In Figure 1 we compare the differences between parameters estimated from S13, S13* (the S13 bandpowers with the updated likelihood), and 150×150 (H17).

Parameter Comparison and Parameter-difference Covariance
To obtain parameter estimates for our in-patch bandpowers, we use the Metropolis-Hastings algorithm to produce a chain from which we generate the posterior for $\Theta$ and then marginalize over the nuisance parameters (foreground and calibration parameters). We generate the chains using the likelihood sampler Cosmoslik (Millea 2017).
The primary statistic we use to infer the compatibility between various parameter distributions is
$$\chi^2 = \Delta p^{T} C^{-1} \Delta p,$$
where $C$ is the parameter difference covariance and $\Delta p = p_\alpha - p_\beta$ is the difference between the parameter vectors of the two data sets being compared. The $p_\alpha$ are either the means of the parameter posteriors or obtained through minimization of the negative log-likelihood. The latter method is used when simulations are required, as minimizing the negative log-likelihood requires significantly less computation time than running an MCMC. The $p_\alpha$ are composed of the five non-τ cosmological parameters: $\Omega_b h^2$, $\Omega_m h^2$, $\theta_{\rm MC}$, $A_s e^{-2\tau}$, and $n_s$. When comparing parameters from the in-patch bandpowers to PlanckFS, the parameter difference covariance is approximated as the sum of the two parameter covariances, $C \approx C_{\text{in-patch}} + C_{\rm PlanckFS}$; the small correlations between the in-patch and full-sky parameter sets are ignored.
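The compatibility statistic described above can be sketched in a few lines of Python. This is an illustrative sketch, not the analysis code: the helper names (`parameter_difference_pte`, `equivalent_sigma`) and the toy covariances are hypothetical, and the difference covariance is formed as a simple sum, which is appropriate only when correlations between the two parameter sets are ignored.

```python
import numpy as np
from scipy.stats import chi2, norm

def parameter_difference_pte(p_a, p_b, cov_diff):
    """Return (chi2, PTE) for the difference between two parameter vectors."""
    delta = np.asarray(p_a, dtype=float) - np.asarray(p_b, dtype=float)
    chi2_val = float(delta @ np.linalg.solve(cov_diff, delta))
    # Survival function = probability to exceed chi2_val for delta.size dof.
    pte = float(chi2.sf(chi2_val, df=delta.size))
    return chi2_val, pte

def equivalent_sigma(pte):
    """Two-sided Gaussian-equivalent significance of a PTE."""
    return float(norm.isf(pte / 2.0))

# Toy example with five parameters and independent data sets.
cov_a = np.diag(np.full(5, 0.36))
cov_b = np.diag(np.full(5, 0.64))
cov_diff = cov_a + cov_b          # correlations between the sets ignored
p_a, p_b = np.ones(5), np.zeros(5)

chi2_val, pte = parameter_difference_pte(p_a, p_b, cov_diff)
print(f"chi2 = {chi2_val:.2f}, PTE = {pte:.3f}")
print(f"PTE of 0.032 ~ {equivalent_sigma(0.032):.2f} sigma (two-sided)")
```

Under the two-sided Gaussian convention assumed here, a PTE of 0.032 corresponds to slightly more than 2σ.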
The parameter difference covariance for comparisons between the in-patch bandpowers cannot be calculated as simply. The parameters are obtained from bandpowers in the same sky cut, and therefore a large portion of the sample variance is common between all three sets and must be accounted for. It is necessary to estimate the covariance from the fluctuations across a set of simulations.
Figure 1. Parameter estimates for S13, S13* (obtained from the same bandpowers as S13 but with the likelihood modifications discussed in Section 2.4), and 150×150 (H17). The vertical bars are the 1σ PlanckFS parameter constraints. The estimates are based on the multipole range $650 \leq \ell \leq 3000$. The shift in $A_s e^{-2\tau}$, and the reduction in the error bar on that parameter combination, between S13 and S13* come from a combination of the new calibration constraint from H17 and the correction of a bias in the calibration uncertainty treatment. The shift in $n_s$ comes from the correction of the beam uncertainty bias. The shift in $\theta_{\rm MC}$ is primarily due to the inclusion of aberration effects.
To calculate the in-patch parameter difference covariance matrices, we generate 400 bandpower simulations for each of the in-patch spectra. The creation of the simulations is described in H17. To simulate the calibration uncertainty, we multiply each simulation by a random draw from the appropriate calibration prior. The simulated bandpowers are then substituted into Equation (1), and we calculate a set of parameter estimates through minimization of the negative log-likelihood. The minimization is made using the SciPy minimize function with the Nelder-Mead method (Jones et al. 2001). Running the minimizer on the 150×150 bandpowers returns parameter values similar to the results obtained from the MCMC procedure.
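The minimization step can be illustrated with a toy problem. The sketch below fits a hypothetical two-parameter model to simulated data using `scipy.optimize.minimize` with the Nelder-Mead method; the model, noise level, starting point, and tolerances are assumptions chosen for illustration and are unrelated to the ΛCDM likelihood itself.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_like(params, ell, data, sigma):
    """Negative log-likelihood for a toy power-law model (hypothetical)."""
    amp, tilt = params
    model = amp * (ell / 1000.0) ** tilt
    return 0.5 * np.sum(((data - model) / sigma) ** 2)

rng = np.random.default_rng(0)
ell = np.linspace(650.0, 2000.0, 28)
true_amp, true_tilt, sigma = 1.0, -0.5, 0.02
data = true_amp * (ell / 1000.0) ** true_tilt + rng.normal(0.0, sigma, ell.size)

result = minimize(
    neg_log_like,
    x0=[0.8, 0.0],                       # deliberately offset starting point
    args=(ell, data, sigma),
    method="Nelder-Mead",                # derivative-free simplex search
    options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 2000},
)
best_fit = result.x
print("converged:", result.success, "best fit:", best_fit)
```

Because Nelder-Mead needs no gradients, it can be applied to a likelihood whose model comes from a Boltzmann code, at the cost of more function evaluations as the number of parameters grows.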
With the 400 sets of parameters for each of the three in-patch bandpowers, we obtain the parameter difference covariances for the three in-patch comparisons. The stability of our covariances was tested by splitting the simulations into two groups of 200 and recalculating $\chi^2$ with each half. We find the results from the two halves to be consistent, and, as we show below, the simulated parameter differences follow a $\chi^2$ distribution with five degrees of freedom. After calculating the $\chi^2$ for a set of parameter differences, we convert it into a PTE, which we use to infer compatibility between the sets of parameters.
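The simulation-based covariance estimate and the split-half stability check can be sketched as follows. The arrays stand in for best-fit parameters from matched simulations of two in-patch spectra; the shared "common" term is a hypothetical stand-in for sample variance common to both, and all amplitudes are invented for illustration.

```python
import numpy as np

def difference_covariance(params_a, params_b):
    """Covariance of parameter differences across matched simulations.

    params_a, params_b: arrays of shape (n_sims, n_params).
    """
    diffs = np.asarray(params_a) - np.asarray(params_b)
    return np.cov(diffs, rowvar=False)

def chi2_of(delta, cov):
    return float(delta @ np.linalg.solve(cov, delta))

rng = np.random.default_rng(1)
n_sims, n_params = 400, 5
common = rng.normal(size=(n_sims, n_params))       # shared fluctuations
sims_a = common + 0.1 * rng.normal(size=(n_sims, n_params))
sims_b = common + 0.1 * rng.normal(size=(n_sims, n_params))

cov_full = difference_covariance(sims_a, sims_b)
cov_half1 = difference_covariance(sims_a[:200], sims_b[:200])
cov_half2 = difference_covariance(sims_a[200:], sims_b[200:])

delta = np.full(n_params, 0.1)                     # hypothetical observed difference
chi2_half1 = chi2_of(delta, cov_half1)
chi2_half2 = chi2_of(delta, cov_half2)
print("variance of one set:", np.var(sims_a[:, 0]))
print("variance of difference:", cov_full[0, 0])
print("chi2 from each half:", chi2_half1, chi2_half2)
```

Differencing matched simulations cancels the shared fluctuations, so the difference covariance is much smaller than the covariance of either parameter set alone.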

SPT-SZ and Planck Consistency Tests
In this section, we use the method of Section 2.5 to quantify the significance of differences in parameters estimated from SPT-SZ data and Planck data. Our primary metric is a $\chi^2$ statistic and its associated PTE. We first compare the parameter constraints from the PlanckFS and SPT 150×150 data sets and find a relatively low PTE of 3.2%. We then perform a series of tests investigating possible causes for this low PTE.
In Section 3.2 we test the hypothesis of a systematic error in one or both experiments by restricting the Planck and SPT data sets to modes on the sky that are measured well by both experiments. In particular, we restrict the Planck data to the SPT-SZ patch, and only consider multipoles in the range $650 \leq \ell \leq 2000$. By doing so, we greatly decrease the expected variance in parameter differences under our null hypothesis, primarily because we eliminate nearly all the sample variance contribution. The volume of parameter space within the 1σ uncertainties in parameter differences is reduced by a factor of over 300 relative to the comparison of the full data sets, greatly increasing sensitivity to systematic errors.
In Section 3.3 we reintroduce sky modes that are only measured well by one of the two experiments, either by Planck outside the SPT-SZ survey region or by SPT in the multipole range above which the in-patch Planck data become very noisy. Specifically, we first explore the consistency of parameters from in-patch bandpowers over several multipole ranges by varying $\ell_{\max}$, the maximum multipole included for parameter estimation. In Section 3.4 we then compare the various in-patch bandpowers to the PlanckFS data set, also comparing different ℓ ranges. We then discuss specific features of the data and parameter shifts of interest in Sections 3.5 and 3.6.

SPT and Planck Parameter Comparison, Full Data Sets
Comparing the parameters derived from the SPT 150×150 and PlanckFS data sets, we find a PTE of 0.032.

Comparison of Planck and SPT in the SPT-SZ Survey Region
In this section, we compare parameters derived from modes that are well measured by both Planck and SPT. We consider data within the SPT-SZ sky region, which covers 2540 $\text{deg}^2$, or about 6% of the sky, and within the multipole range $650 \leq \ell \leq 2000$. Specifically, we compare parameters estimated in this multipole range from the in-patch cross-spectrum bandpowers presented in H17. This comparison provides a sensitive test of unaccounted-for systematic errors in either experiment.
The lower cutoff of $\ell = 650$ is set by the SPT analysis in S13, which did not report data at larger angular scales (lower multipoles) because of the increasing noise from the atmosphere on these scales. The upper cutoff of $\ell = 2000$ is set by high-ℓ noise in the Planck 143 GHz data resulting from the larger Planck beam (roughly 7 arcmin FWHM, compared to 1 arcmin for SPT 150 GHz) and the slightly higher noise per pixel in the Planck maps (∼25 μK arcmin for Planck 143 GHz compared to ∼18 μK arcmin for SPT). The variance of the 143×143 bandpowers (the set with the largest noise variance) is dominated by sample variance up to approximately $\ell = 1500$; as a result, the three sets of bandpowers have similar uncertainty in this range. The 143×143 error bars begin to grow significantly larger than those for 150×150 around $\ell = 1800$, and 150×143 begins to show the same behavior around $\ell = 2200$. We choose $\ell_{\max} = 2000$ to maximize the signal-to-noise ratio of the comparison between 150×150 and 150×143. When restricted to the SPT-SZ sky area and this range of angular scales, both experiments are measuring a very similar set of modes on the sky with a similar signal-to-noise ratio per mode. Given our null hypothesis, the expected covariance of parameter differences for these modes is thus greatly reduced, making it easier for us to see the impact of any systematic errors.
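The effect of beam size and map depth on the usable multipole range can be illustrated with the standard white-noise, Gaussian-beam approximation for the beam-deconvolved noise power, $N_\ell = \Delta^2 \exp[\ell(\ell+1)\theta_{\rm FWHM}^2 / (8 \ln 2)]$. This is a simplified sketch, not the noise model used in H17 or S13: it takes the approximate depths and beam widths quoted above, ignores atmospheric and other non-white noise, and the function name `noise_power` is hypothetical.

```python
import numpy as np

ARCMIN = np.pi / (180.0 * 60.0)  # radians per arcminute

def noise_power(ell, depth_uk_arcmin, fwhm_arcmin):
    """Beam-deconvolved white-noise C_ell in uK^2 for a Gaussian beam."""
    ell = np.asarray(ell, dtype=float)
    depth = depth_uk_arcmin * ARCMIN      # map depth in uK-radian
    fwhm = fwhm_arcmin * ARCMIN           # beam FWHM in radians
    return depth**2 * np.exp(ell * (ell + 1.0) * fwhm**2 / (8.0 * np.log(2.0)))

ells = np.array([650, 1500, 2000, 2500])
planck_143 = noise_power(ells, depth_uk_arcmin=25.0, fwhm_arcmin=7.0)
spt_150 = noise_power(ells, depth_uk_arcmin=18.0, fwhm_arcmin=1.0)
noise_ratio = planck_143 / spt_150

for ell, ratio in zip(ells, noise_ratio):
    print(f"ell = {ell}: Planck/SPT noise power ratio ~ {ratio:.1f}")
```

In this approximation the noise ratio grows steeply with ℓ, from a factor of a few at ℓ = 650 to a few tens at ℓ = 2000, which is the qualitative behavior that motivates an upper multipole cutoff for the Planck in-patch data.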
The parameter estimates for the in-patch bandpowers over this multipole range are in Figure 2, and Figure 3 shows the ratio of the in-patch bandpowers and models to the best-fit PlanckFS model. The in-patch parameters are more similar to each other than to the Planck full-sky values, and the features apparent by eye in the bandpower and best-fit-model ratios to PlanckFS are similar among the three in-patch sets. These plots still include sample variance in the in-patch error bars, however, so it is difficult to assess the statistical consistency of the three data sets from them alone. Figure 4 shows the distribution of $\chi^2$ values for differences in simulation parameters calculated in this comparison (as expected, the histogram closely follows a $\chi^2$ distribution for five degrees of freedom). This statistic accounts for the large decrease in sample variance in the parameter difference covariance and provides a quantitative assessment of the consistency among the three in-patch parameter sets. The $\chi^2$ values of the data differences are shown by vertical red lines, and none of these values lie notably outside the main distribution. The PTEs from this test are included in Table 1 and confirm that all three sets of in-patch bandpowers are consistent with each other in the multipole range $650 \leq \ell \leq 2000$.
Some parameter differences among the three in-patch sets are visible in Figure 2. Between the 143×143 and 150×143 data sets, the largest differences are the slightly lower preferred values of $\Omega_m h^2$ and $n_s$ in 150×143. This trend continues in 150×150, but as shown in Figure 4 and Table 1, these differences are consistent with our null hypothesis.
We pay special attention to the comparison between the parameters derived from 150×150 with $\ell_{\max} = 2000$ and those from 150×143 with $\ell_{\max} = 2000$ because this comparison provides the most stringent test of our null hypothesis. Figure 5 shows that the expected covariance of parameter differences, indicated by the contours in the upper triangle, is quite small, comparable to the covariance of parameter uncertainties in the PlanckFS posterior. In this regard, examining these parameter differences provides us with a much more powerful test than the comparison between the PlanckFS parameters and the 150×150 full ℓ-range parameters.
In addition to the visual impression of this increase in precision of the test given by Figure 5, we also provide a quantitative description of the increase in precision. We do so by simultaneously diagonalizing the covariances for the 150×150 and PlanckFS parameter differences at $\ell_{\max} = 3000$ and the 150×150 and 150×143 parameter differences at $\ell_{\max} = 2000$. We then take the product of the square roots of the eigenvalue ratios to calculate the reduction in the 1σ volume for the five-dimensional parameter space. Comparing the volumes, we find a ratio of 0.003, i.e., the volume in the parameter difference space containing 68% of the probability is roughly 300 times smaller for the $\ell_{\max} = 2000$ 150×150 versus 150×143 parameter differences than for the full ℓ-range 150×150 versus PlanckFS parameter differences. Despite the precision of this test, we find a perfectly acceptable PTE for this comparison (see Table 1).
In conclusion, when Planck and SPT data are restricted to modes on the sky that are measured well in both experiments, we find that the best-fit cosmological parameters are fully consistent. The observed consistency between the two data sets in this stringent test provides strong evidence against instrumental systematics affecting either data set on these angular scales, on this part of the sky.
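The volume comparison can be sketched with a generalized eigenvalue problem, which simultaneously diagonalizes two covariance matrices. The example below is illustrative only: the covariances are random toy matrices rather than the parameter difference covariances of the analysis, and `volume_ratio` is a hypothetical helper name. The product of the square roots of the generalized eigenvalues equals the square root of the ratio of determinants, which the sketch uses as a cross-check.

```python
import numpy as np
from scipy.linalg import eigh

def volume_ratio(cov_a, cov_b):
    """Ratio of 1-sigma ellipsoid volumes, V_a / V_b, for two covariances."""
    # Solving cov_a v = lam * cov_b v diagonalizes both matrices in a shared
    # basis; each lam is the variance ratio along one shared axis.
    lam = eigh(cov_a, cov_b, eigvals_only=True)
    return float(np.prod(np.sqrt(lam)))

rng = np.random.default_rng(3)
a = rng.normal(size=(5, 5))
cov_wide = a @ a.T + 5.0 * np.eye(5)   # toy positive-definite covariance
cov_tight = 0.25 * cov_wide            # every variance reduced by a factor of 4

ratio = volume_ratio(cov_tight, cov_wide)
det_check = np.sqrt(np.linalg.det(cov_tight) / np.linalg.det(cov_wide))
print(f"volume ratio = {ratio:.5f}, determinant cross-check = {det_check:.5f}")
```

In this toy case each of the five axes shrinks by a factor of two in standard deviation, so the volume shrinks by a factor of 2^5 = 32.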

Relaxing Restrictions on the Multipole Range
Next, we reintroduce smaller angular scales measured within the SPT-SZ survey region. Given the ΛCDM model and a spectrum measured in the range $650 \leq \ell \leq 2000$, one can predict the spectrum at other angular scales. Given the consistency found in the previous section, finding significant discrepancy from extending the multipole range could indicate either a systematic affecting SPT data at high ℓ or a failure of the ΛCDM model. Parameter estimates for the in-patch bandpowers at various values of $\ell_{\max}$ are shown in Figure 2, and PTEs are reported in Table 1.
For 143×143, increasing ℓ max from 2000 to 2500 adds little information to the parameter estimates. This is consistent with Figure 3, where the error bars of the 143×143 bandpowers have become significantly larger by ℓ 1800 = . Some parameters in the 150×143 measurement do shift when we expand the ℓ range: MC q , h b 2 W , and n s shift away from the 143×143 values with increasing ℓ max . Nevertheless, at ℓ 2500 max = , we still find that the 150×143 and 143×143 measurements are consistent with a PTE of 0.62. 36 For 150×150, when ℓ max is increased from 2000 to 2500, MC q increases in a manner similar to what we saw with 150×143, the baryon density increases, and the matter density decreases. These shifts correspond to an increase in H 0 . At ℓ 2500 max = , the 150×150 measurement remains consistent with 150×143 and 143×143, with PTEs of 0.66 and 0.38, respectively. The trend in parameter shifts continues when we increase ℓ max to 3000 to include the full range of the 150×150 data, yet the PTEs remain moderate. We also plot in Figure 2 parameter results for ℓ max = 1800, and we see the trend toward the PlanckFS values continues. Uncertainties rapidly grow for ℓ max < 1800. Finally, we calculate the 2 c and PTEs for the comparison of parameters from 150×150 data at ℓ 2000 max = to parameters from 150×150 data with ℓ 2500 max = and 3000. These PTEs are 0.88 and 0.75, respectively. Thus, while the parameter shifts with increasing ℓ max are suggestive of a potentially interesting trend, they are consistent with our expectations under the null hypothesis. . There is a noticeable trend in the 150×150 density parameters toward better agreement with PlanckFS as ℓ max is lowered. 36 Note that for the bottom two rows of Table 1 the PTE increases as ℓ max is increased from 2000 to 2500. 
This increase is driven by the increase in the parameter difference covariances as sources of fluctuation are added that are not common to the two data sets in question-most predominantly from noise in the 143GHz map.

Relaxing Restrictions on Sky Coverage
In the previous sections, we have found that the ΛCDM parameters estimated from the Planck and SPT in-patch bandpowers are consistent for all ℓ ranges considered. In this section, we relax the restrictions on sky coverage and compare the in-patch parameters to the PlanckFS parameters in different fixed ℓ ranges. This effectively tests ΛCDM and the assumptions of statistical isotropy in the CMB. The PTEs between the in-patch parameters and PlanckFS parameters in all ℓ ranges tested are listed in Table 2 and the parameter constraints are shown in Figure 2.
We first compare PlanckFS to 143×143 at $\ell_{\max} = 2000$ and 2500. Because of the rapidly increasing noise at high ℓ in the 143×143 bandpowers, we do not consider $\ell_{\max} = 3000$ for this comparison, and we see very little difference in the parameters and comparison PTEs for $\ell_{\max} = 2000$ and 2500. These PTEs are 0.29 and 0.31, respectively. Although the PTE values indicate no discrepancy between the data sets, we do see small differences in $\theta_{\rm MC}$, $A_s e^{-2\tau}$, and $n_s$ between PlanckFS and 143×143 at $\ell_{\max} = 2000$ and 2500. The two main differences between the sets of bandpowers that could drive parameter differences are the Planck low-ℓ data (which are not included in the in-patch bandpowers) and the sky outside of the patch. However, we can rule out the low-ℓ data as an explanation based on the results of Planck Collaboration et al. (2016c), who found only marginal parameter shifts when cutting the low-ℓ ($\ell < 650$) data. Therefore, the majority of the parameter differences between PlanckFS and 143×143 can be attributed to differences in the Planck data from the SPT-SZ patch and the Planck data from the rest of the sky.
As noted in H17, all three sets of in-patch bandpowers have more power than the PlanckFS model at moderate ℓ and less power at high ℓ, creating a tilt. In Figure 3 we show the ratios of the in-patch bandpowers to the PlanckFS best-fit model. We also show the ratios of best-fit in-patch models to the PlanckFS best-fit model. The tilt is clearly visible by eye in all bandpowers and best-fit models, and this tilt drives the differences in $A_s e^{-2\tau}$ and $n_s$. We discuss this tilt further in Section 3.5. There is also an oscillatory pattern in the best-fit model ratios, consistent with the difference in best-fit $\theta_{\rm MC}$, although this feature is not as obviously discernible directly in the bandpower ratios as is the tilt.
Finally, we compare PlanckFS to 150×150 with varying $\ell_{\max}$. In Figure 2 we see a trend toward the PlanckFS values as $\ell_{\max}$ is lowered, with near-convergence of $\Omega_b h^2$, $\Omega_m h^2$, and $H_0$ by $\ell_{\max} = 1800$. For $\ell_{\max} = 1800$, 2000, and 2500, we find 150×150 and PlanckFS to be at least marginally consistent (minimum PTE of 0.094). The discrepancy between the data sets only approaches the 2σ level when we extend the parameter estimation to the full 150×150 multipole range, where we find the PTE of 0.032 with which we began this investigation.
Figure 5 (caption, partial). Black stars indicate the parameter difference values from the same comparison in the data. It is visually apparent that this comparison constitutes a much more stringent consistency test than comparing to PlanckFS; in fact, this comparison reduces the parameter volume by a factor of 300 (see the text for details). The observed consistency provides strong evidence against a systematic difference in the modes measured in common between the two experiments.
As noted in the previous section, the only notable difference between the 150×143 and 150×150 data at $\ell_{\rm max} = 2500$ is the marginally lower matter density (and hence higher Hubble constant) that 150×150 prefers; these parameters are pushed even farther in this direction when $\ell_{\rm max}$ is increased to 3000. From this comparison, we see that the 150×150 bandpowers at $\ell > 1800$ drive some of the discrepancy with PlanckFS.

Bandpower Ratios
H17 showed that the bandpowers from the SPT-SZ patch (from both SPT and Planck data) have a tilt relative to the PlanckFS bandpowers. We also see this feature prominently in the ratios of in-patch bandpowers to the PlanckFS best-fit model, as plotted in Figure 3. In this section, we investigate the impact of this tilt on cosmological parameters. To do so, we fit a power law to the ratio of the in-patch bandpowers to the PlanckFS best-fit model and multiply this power law into the theory spectrum in Equation (2) when estimating new parameters.
The power law is characterized by a tilt index $n$, with $n = 0$ corresponding to no tilt. Using the PlanckFS model instead of the PlanckFS bandpowers allows us to better assess how this tilt drives the best-fit parameters away from the PlanckFS values. We assume a Gaussian likelihood, in which $D_\ell^{\rm PlanckFS}$ is the PlanckFS best-fit model. In Figure 6 we present the best-fit power laws for each spectrum with $\ell_{\rm max} = 2500$. As the amount of SPT data included is increased (i.e., as we go from 143×143 to 150×143 to 150×150), we see an increase in the tilt, with 150×150 having a best-fit value of $n$ that is discrepant with 0 by 2.2σ. We note, however, that the best-fit tilt values from the three bandpower sets are consistent within 1σ.
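The kind of fit described above can be illustrated with a minimal sketch that fits a two-parameter power law to a set of bandpower ratios using a Gaussian likelihood. The functional form `amp * (ell / ell_pivot) ** n`, the pivot multipole, the independent (diagonal) errors, and the synthetic ratios are all assumptions made for this example; they are not taken from the analysis, which would use the full bandpower covariance.

```python
import math

ELL_PIVOT = 1500.0  # hypothetical pivot multipole, chosen only for this example


def power_law(ell, amp, n, ell_pivot=ELL_PIVOT):
    """Assumed ratio model: amp * (ell / ell_pivot) ** n; n = 0 means no tilt."""
    return amp * (ell / ell_pivot) ** n


def best_amp(ells, ratios, sigmas, n):
    """Amplitude that minimizes chi^2 at fixed n (linear least squares)."""
    shape = [power_law(ell, 1.0, n) for ell in ells]
    num = sum(r * t / s**2 for r, t, s in zip(ratios, shape, sigmas))
    den = sum(t * t / s**2 for t, s in zip(shape, sigmas))
    return num / den


def chi2(ells, ratios, sigmas, amp, n):
    """Gaussian chi^2 with independent errors on each bandpower ratio."""
    return sum(((r - power_law(ell, amp, n)) / s) ** 2
               for ell, r, s in zip(ells, ratios, sigmas))


def fit_power_law(ells, ratios, sigmas, n_lo=-1.0, n_hi=1.0, tol=1e-10):
    """Golden-section search over n, with the amplitude profiled out."""
    def profiled(n):
        return chi2(ells, ratios, sigmas, best_amp(ells, ratios, sigmas, n), n)

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = n_lo, n_hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = profiled(c), profiled(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]: shrink from the right.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = profiled(c)
        else:
            # Minimum lies in [c, b]: shrink from the left.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = profiled(d)
    n_best = 0.5 * (a + b)
    return best_amp(ells, ratios, sigmas, n_best), n_best


# Synthetic, noiseless ratios with a small negative tilt (illustrative only).
demo_ells = [650.0 + 50.0 * i for i in range(38)]
demo_ratios = [power_law(ell, 1.01, -0.05) for ell in demo_ells]
demo_sigmas = [0.01] * len(demo_ells)
demo_amp, demo_n = fit_power_law(demo_ells, demo_ratios, demo_sigmas)
print(f"amp = {demo_amp:.4f}, n = {demo_n:.4f}")
```

Profiling out the amplitude reduces the problem to a one-dimensional search over the tilt, which keeps the sketch dependency-free; a real analysis would more likely sample both parameters jointly.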
To connect this feature with parameters, we remove this tilt from the 150×150 bandpowers by multiplying the theory spectrum in Equation (2) by the best-fit power law for 150×150 and run a new chain. As shown in Figure 7, removing this tilt from the full range of the 150×150 bandpowers results in a decrease in $A_se^{-2\tau}$ and an increase in $n_s$, bringing both into significantly better agreement with PlanckFS. By contrast, we note that removing the tilt hardly affects the two density parameters (and $H_0$); therefore, the preference for higher $H_0$ by 150×150 is due to high-$\ell$ information unrelated to the tilt.

Discussion of Shifts in Cosmological Parameters, and the Hubble Constant in Particular
The picture that emerges from the previous sections is the following: when SPT and Planck data are restricted to modes on the sky that are measured well in both experiments (i.e., to modes in the SPT-SZ survey region in the multipole range $650 \leq \ell \leq 2000$), the best-fit parameters from the two data sets are fully consistent. When the comparison is relaxed to either just the same sky or just the same multipole range, parameter constraints from the two experiments are still marginally consistent, although parameter differences begin to emerge. Only when we relax all restrictions on sky coverage and multipole range do we find a difference greater than 2σ between Planck and the SPT. This difference arises in roughly equal parts from the SPT data at high $\ell$ and from differences in the SPT-SZ patch relative to the whole sky at moderate $\ell$. The latter fluctuation accounts for nearly all of the difference in $A_se^{-2\tau}$ and a sizeable fraction of the differences in $\theta_{\rm MC}$ and $n_s$, but almost none of the differences in the other parameters. The remainder of the parameter differences arises from the high-$\ell$ fluctuation.
Note. The entries for 150×143 and 143×143 at $\ell_{\rm max} = 3000$ are blank since these spectra have negligible signal-to-noise ratio above $\ell = 2500$.

Of the parameter differences driven by high-$\ell$ SPT data, the Hubble constant is of particular interest given the discrepancy between the value derived from Planck CMB power spectrum data assuming ΛCDM and the traditional distance ladder measurement of Riess et al. (2016). As can be seen in Figure 2, half of the Hubble constant difference between SPT and Planck arises from the SPT data at $\ell > 2000$. In our parameterization, the Hubble constant is a derived parameter that can be calculated from $\Omega_b h^2$, $\Omega_m h^2$, and $\theta_{\rm MC}$. From the perspective of the Hubble constant, the angle $\theta_{\rm MC}$ is essentially fixed: the uncertainties and shifts between data sets are so small that the impact on $H_0$ is negligible. Thus the observed variation in $H_0$ between data sets is due to changes in the two density parameters. Changing either the baryon density or the matter density (and enforcing a flat universe) would result in a change in the angular size of the sound horizon at recombination and hence a different observed value of $\theta_{\rm MC}$. The only parameter available to preserve the observed $\theta_{\rm MC}$ is $H_0$.

Specifically, the baryon density affects the sound speed in the early universe. Increasing the baryon density decreases the sound speed and thus the physical size of the sound horizon at recombination. To preserve the angular size, the angular diameter distance to the redshift of recombination, $z_*$, must be made smaller. At fixed matter density and with flatness enforced, this can only be achieved by increasing $H_0$. Changing the matter density, meanwhile, affects the expansion rate, both in the early universe and from recombination to today (as can be seen from Equation (15)). In the early universe, this change would affect the physical size of the sound horizon, although its impact is softened by the contribution of radiation density to the expansion rate. Decreasing the matter density at late times would increase the angular diameter distance to recombination, which, at fixed baryon density, would make the angular size of the sound horizon too small. To preserve the measured angular size, $H_0$ must increase. Thus both the increase in $\Omega_b h^2$ and the decrease in $\Omega_m h^2$ driven by the high-$\ell$ SPT data lead to an increase in the inferred value of $H_0$.
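The distance part of this argument can be checked numerically with a short sketch. It integrates the inverse expansion rate of a flat ΛCDM model out to recombination and shows how the distance responds to $H_0$ and to the physical matter density. The function names, the value of $z_*$, the radiation density, and the example parameter values are assumptions chosen for illustration, and the sound horizon itself is not computed here. The sketch returns the comoving distance; the angular diameter distance differs only by a factor of $1/(1+z_*)$, so the direction of each change is the same.

```python
import math

C_KM_S = 299792.458       # speed of light in km/s
OMEGA_R_H2 = 4.18e-5      # assumed photon + massless-neutrino density (Omega_r h^2)


def hubble_rate(z, h, omega_m_h2):
    """H(z) in km/s/Mpc for flat LCDM, with physical densities Omega * h^2."""
    omega_lambda_h2 = h**2 - omega_m_h2 - OMEGA_R_H2  # flatness fixes this term
    e2 = (OMEGA_R_H2 * (1.0 + z) ** 4
          + omega_m_h2 * (1.0 + z) ** 3
          + omega_lambda_h2)
    return 100.0 * math.sqrt(e2)


def comoving_distance(z_star, h, omega_m_h2, n_steps=4000):
    """Comoving distance to z_star in Mpc (Simpson's rule in x = ln(1 + z))."""
    if n_steps % 2:
        raise ValueError("n_steps must be even for Simpson's rule")
    x_max = math.log(1.0 + z_star)
    dx = x_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        z = math.expm1(i * dx)
        # dz = (1 + z) dx, so the integrand in x carries an extra (1 + z).
        f = C_KM_S * (1.0 + z) / hubble_rate(z, h, omega_m_h2)
        weight = 1.0 if i in (0, n_steps) else (4.0 if i % 2 else 2.0)
        total += weight * f
    return total * dx / 3.0


Z_STAR = 1090.0  # assumed redshift of recombination
d_ref = comoving_distance(Z_STAR, h=0.67, omega_m_h2=0.143)
d_high_h = comoving_distance(Z_STAR, h=0.70, omega_m_h2=0.143)
d_low_m = comoving_distance(Z_STAR, h=0.67, omega_m_h2=0.135)
print(f"reference:            {d_ref:.0f} Mpc")
print(f"higher H0:            {d_high_h:.0f} Mpc")
print(f"lower matter density: {d_low_m:.0f} Mpc")
```

At fixed matter density the higher-$H_0$ case gives a shorter distance, while lowering the matter density at fixed $H_0$ lengthens it, which is the pair of responses the argument relies on.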

Additional Tests
In the previous section, we found no evidence that the parameter differences between SPT and PlanckFS at $650 \leq \ell \leq 2000$ are driven by instrumental systematics. In this section, we investigate potential systematic contributions to the parameter differences driven by SPT data at $\ell > 2000$ by examining the SPT foreground model and the SPT calculation of beam uncertainty. We also investigate other known sources of potential systematic uncertainty (not specific to high-$\ell$ data), including the SPT τ prior and the effects of lensing. These tests differ from those in the previous section in that they are more specific probes for systematic errors that do not aim to reduce the comparison to parameters estimated from the same modes. Instead, they focus on places where systematics may have entered into the parameter estimation, either through instrumental effects or faulty modeling assumptions.

Parameter Dependence on Beams and Foregrounds
In this section, we investigate the possible impact of foreground and beam misestimation on parameter differences, particularly at high $\ell$. To test for beam systematics, we include the amplitudes of the fractional beam uncertainty as parameters in the MCMC, rather than analytically marginalizing over the beam uncertainty. In practice, this means that we modify Equation (5) to include parametrized amplitudes of the SPT beam error templates for each source of beam uncertainty and multiply this into Equation (2). The covariance in Equation (1) no longer has Equation (7) included in it, and our model bandpowers now include the beam-error amplitudes explicitly.

At each step of the new 150×150 chain, we calculate the five-parameter $\chi^2$ from Equation (8) using the difference between the cosmological parameters of the current step and the PlanckFS means. If a beam or foreground parameter were connected to the low PTE found in Section 3, we would expect the $\chi^2$ posterior to be correlated with that parameter. In other words, we treat the five-parameter $\chi^2$ as a derived parameter and look for correlations or degeneracies between this derived parameter and the beam or foreground parameters. We do not find any significant correlation; the largest correlation found between any foreground or beam parameter and the five-parameter $\chi^2$ was 0.12, indicating a very weak linear response and no clear direction in the foreground and beam parameter space in which the parameter comparison $\chi^2$ can be lowered.
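The correlation test described above can be sketched as follows: compute a parameter-difference $\chi^2$ at every chain step, treat it as a derived parameter, and correlate it with a nuisance parameter. This is only an illustration. It assumes the $\chi^2$ is the quadratic form $\Delta p^T C^{-1} \Delta p$ with a precomputed inverse covariance, assumes equally weighted chain samples, and uses a synthetic chain; all function and variable names are hypothetical.

```python
import math
import random


def chi2_difference(params, reference_means, inv_cov):
    """Quadratic form delta^T inv_cov delta for one chain step (assumed form)."""
    delta = [p - m for p, m in zip(params, reference_means)]
    size = len(delta)
    return sum(delta[i] * inv_cov[i][j] * delta[j]
               for i in range(size) for j in range(size))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally weighted samples."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlate_chi2_with_nuisance(chain_params, chain_nuisance,
                                 reference_means, inv_cov):
    """Correlation between the derived chi^2 and one nuisance parameter."""
    chi2s = [chi2_difference(step, reference_means, inv_cov)
             for step in chain_params]
    return pearson(chi2s, chain_nuisance)


# Synthetic five-parameter chain with an unrelated nuisance parameter.
rng = random.Random(0)
n_params = 5
identity = [[1.0 if i == j else 0.0 for j in range(n_params)]
            for i in range(n_params)]
reference = [0.0] * n_params
chain = [[rng.gauss(0.5, 1.0) for _ in range(n_params)] for _ in range(2000)]
nuisance = [rng.gauss(0.0, 1.0) for _ in range(2000)]
r_demo = correlate_chi2_with_nuisance(chain, nuisance, reference, identity)
print(f"correlation between chi^2 and nuisance parameter: {r_demo:+.3f}")
```

A correlation coefficient only captures a linear response, so a small value, as in the text, rules out a simple direction in nuisance-parameter space along which the comparison $\chi^2$ falls, but not every possible nonlinear dependence.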
Furthermore, we find that the foreground and beam posteriors are not driven significantly from their priors in any chain; a significant shift would be expected if any of these components were poorly described by the model. The largest deviation of a parameter's posterior mean from the prior mean was 0.13σ, and most parameters showed shifts of less than 0.1σ. These tests provide more support for our hypothesis that the somewhat low PTE between 150×150 and PlanckFS is not due to a systematic error in the beam or foreground treatment.
We expand on the foreground tests by adding free parameters for the kinematic Sunyaev-Zel'dovich (kSZ) effect (the S13 foreground model has a single parameter for the sum of kSZ and tSZ) and for a cross-correlation between the tSZ and the cosmic infrared background based on George et al. (2015), since these components were also included in the determination of the PlanckFS parameters. This change has a negligible impact on the parameter posteriors for 150×150. This result is expected: the motivation for the simplified foreground parameterization in S13 was that these extra foreground components are expected to have a similar power-spectrum shape to tSZ in the multipole range examined in S13 and here, and are thus only distinguishable via their frequency dependence. Thus, the differences between Planck and SPT do not appear to be related to foreground components that are included in the Planck analysis but not in S13.

τ Prior
Potential systematic errors may also creep in through the optical depth measurement, given the proven challenges in recovering the reionization peak from the midst of the Galactic foregrounds. We test whether the τ prior is contributing to the low PTE by running 150×150 chains with a low optical depth, $\tau = 0.05 \pm 0.02$, and a high optical depth, $\tau = 0.10 \pm 0.02$. For the $\tau = 0.05$ prior there are small shifts in both density parameters toward PlanckFS values, but the shifts only change the PTE from 0.032 to 0.058. For the $\tau = 0.10$ prior we see the opposite effect and calculate a PTE of 0.015. The small improvement in the PTE as τ is lowered argues against the idea that the parameter differences between PlanckFS and 150×150 are significantly connected to the τ prior.

Gravitational Lensing
Two of the most discrepant parameters between 150×150 and PlanckFS are $\Omega_m h^2$ and $A_se^{-2\tau}$. Since these parameters both impact the lensing amplitude, we test the hypothesis that the parameter differences are lensing-related. The impact of lensing on parameter estimates is often studied by marginalizing over an artificial lensing-power scaling parameter, $A_L$. Here we follow Planck Collaboration et al. (2016c) and instead fix the amount of lensing power. With this choice we avoid some difficulties of interpreting parameter constraints after marginalization over $A_L$. Marginalizing over $A_L$ projects out lensing information. That is useful, but interpretation is complicated by the fact that one also removes any sensitivity to parameter variations that produce effects that can be mimicked by lensing. In contrast, by fixing the lensing potential we can remove the contribution of lensing variation to our parameter constraints, while keeping the contribution of any non-lensing responses of the power spectrum to parameter variations.
Specifically, we fix the 150×150 lensing potential to its best-fit ΛCDM value. In practice, we modify our model of the bandpowers (Equation (2)) using unlensed (UL) spectra together with cosmological parameters $\theta_*$ fixed to the 150×150 ΛCDM best-fit values.
We first note from the results of this test, shown in Figure 7, that there is still some preference in the SPT data for lower matter density even with the lensing information removed, although it is weaker. Adding the lensing information strengthens this preference.
We note also that the PTE for comparison with PlanckFS only improves to 0.045 with lensing fixed. We attribute this to the small shift in $A_se^{-2\tau}$ that also occurs when we fix the lensing potential. Thus, although lensing has an impact, removing the impact of lensing on the SPT parameter estimates does not significantly improve the agreement with the PlanckFS parameter estimates.
Finally, we note that in Planck Collaboration et al. (2016c) the Planck collaboration performed a similar test with the Planck data. Fixing the lensing potential lowers the matter density for Planck and increases it for SPT, bringing the preferred matter density values for these two data sets closer together. However, the shifts are relatively small when compared to the full SPT and Planck parameter differences.

Conclusions
The Planck CMB temperature data at moderate angular scales ($650 \leq \ell \leq 2000$) prefer a ΛCDM model that mildly differs from some other cosmological probes. In this paper, we have used measurements from the SPT as an independent check of the Planck data at these angular scales. This check was performed by comparing ΛCDM parameter estimates using observations of the CMB temperature anisotropies from the Planck satellite and the SPT. When comparing parameter constraints from the full multipole range of SPT data to parameter constraints from Planck full-sky data, we found a slight difference between the two, with a PTE of 0.032. We have attempted to distinguish between three possibilities for the observed parameter differences: slightly unusual statistical fluctuations, unaccounted-for systematic error, or a breakdown of ΛCDM. To this end, we compared parameter estimates that were restricted to measurements of the same modes on the sky, and then we relaxed the range of angular scales and sky coverage.
We have arrived at three primary conclusions:

1. When Planck and the SPT are restricted to measure the same modes on the sky (specifically, the SPT-SZ patch at $650 \leq \ell \leq 2000$), the resulting cosmological parameters are fully consistent. This stringent test provides strong evidence against a systematic contamination in either experiment at these angular scales and on this patch of sky.
2. The observed discrepancy between Planck on the full sky (PlanckFS) and the SPT arises both from the sky area (that is, the SPT-SZ patch versus the full sky at $650 \leq \ell \leq 2000$) and from data at $\ell > 2000$ …

… and therefore in $H_0$. While these shifts are intriguing in the context of broader discussions of the value of $H_0$, when considered alone, they are nevertheless consistent with expectations given the null hypothesis that the ΛCDM model is correct and the statistical models of both data sets are accurate.

Figure 7. Parameter estimates for 150×150 bandpowers in the fiducial case and as a result of two tests for systematics discussed in Section 4. "Lensing fixed": the parameter estimates from the chain with lensing fixed to the 150×150 best-fit. Lensing information is important for constraining the matter density. "Tilt removed": the parameter estimates after removing the best-fit power law from Section 4.3 from the 150×150 bandpowers.
We arrived at these conclusions from the following set of tests and calculations. We first quantified the difference between the best-fit ΛCDM models for Planck and SPT and found a PTE of 3.2%. We tested for systematic errors in one or both experiments by restricting Planck and SPT data to nearly the exact same modes on the sky. To this end, we restricted the Planck data to the SPT-SZ sky patch and limited each data set to the multipole range $650 \leq \ell \leq 2000$. Using the measured bandpowers and simulations described in H17 to create parameter difference covariances, we calculated $\chi^2$ values and PTEs for parameter differences between different spectra. We found PTEs of 0.74 and 0.32 for 150×143 and 143×143, respectively, when compared with 150×150 at $\ell_{\rm max} = 2000$. This is an extremely precise test of the consistency between the two measurements, as nearly all sample variance is eliminated from the comparison. We quantified the increased precision of this test by calculating the reduction in the volume of the 68% confidence region for the expected distribution of parameter differences between different data comparisons; using this metric, the 150×143 versus 150×150 comparison is over 300 times more stringent than the 150×150 versus PlanckFS comparison. These powerful tests would have magnified any evidence for systematic errors in either experiment; instead, their results strongly disfavor the presence of significant systematic errors in either the SPT or Planck data sets in the modes that are measured well by both experiments.
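For readers who want to relate a five-parameter $\chi^2$ to a PTE, the following minimal sketch evaluates the upper-tail probability of a $\chi^2$ distribution with five degrees of freedom in closed form. It assumes that the parameter differences are Gaussian, so that the comparison statistic follows this distribution; the function name is hypothetical, and the procedure actually used to obtain the PTEs quoted in the text may differ.

```python
import math


def pte_chi2_5dof(chi2_value):
    """Probability to exceed chi2_value for a chi^2 variable with 5 dof.

    Uses the closed form of the regularized upper incomplete gamma function
    Q(5/2, x/2), which exists because the number of dof is odd.
    """
    if chi2_value < 0.0:
        raise ValueError("chi^2 must be non-negative")
    root = math.sqrt(chi2_value)
    tail = math.erfc(root / math.sqrt(2.0))
    series = (math.sqrt(2.0 / math.pi) * math.exp(-chi2_value / 2.0)
              * (root + root**3 / 3.0))
    return tail + series


for value in (2.0, 6.0, 12.0):
    print(f"chi^2 = {value:4.1f} -> PTE = {pte_chi2_5dof(value):.3f}")
```

The closed form avoids any dependency beyond the standard library; with SciPy available, `scipy.stats.chi2.sf(chi2_value, 5)` computes the same quantity.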
Next, we found that the discrepancy between PlanckFS and SPT comes from two parts of the data. The first part is differences between the SPT-SZ patch at intermediate scales ($650 \leq \ell \leq 2000$) and the whole sky over the full range of angular scales measured by Planck ($\ell \leq 2000$); this can be seen from Table 2, Figures 4 and 5, and the text of Section 3.4. The second part is the inclusion of high-$\ell$ data in the SPT-SZ patch ($2000 < \ell \leq 3000$); this can be seen from Table 1, where the PTE between PlanckFS and 150×150 drops below 5% only with the inclusion of data up to $\ell_{\rm max} = 3000$. The discrepancy between PlanckFS and the SPT can be alleviated by removing either of these parts of the data. By restricting Planck to the SPT-SZ patch, all comparisons are consistent (Table 2); alternatively, removing the high-$\ell$ SPT data increases the PTE to 0.24 ( … $= 0.57, 0.20$, respectively), the data sets remain consistent; only when we relax both the sky coverage and the $\ell$ range does the PTE drop below 0.05.

Third, we related certain of the parameter differences noted above to specific features in the bandpowers. The in-patch data bandpowers have a tilt relative to PlanckFS; this can be seen from H17 and Figures 3 and 6. This tilt is seen by both Planck and SPT data, and is thus unlikely to arise from systematics in this range of angular scales in either experiment. We find that this tilt is connected to the ΛCDM parameters $A_se^{-2\tau}$ and $n_s$ (see Section 3.5 and Figure 7). The tilt is not connected to the density parameters and $H_0$; this is confirmed by the fact that when the tilt is artificially removed, $H_0$ remains high relative to PlanckFS (see Section 3.5 and Figure 7).
Finally, we performed an additional set of tests designed to investigate whether specific potential sources of systematic error could be responsible for any of the measured parameter differences. We investigated the effects of the SPT instrument beam, the treatment of foregrounds in SPT data, and the influence of assumptions about the optical depth to reionization and the amplitude of gravitational lensing in the analysis. We found no evidence of any coupling of these effects to the measured parameter differences.
We conclude that our tests reveal weak evidence at most for a breakdown of ΛCDM or systematic errors influencing either the Planck data outside the SPT-SZ survey area or the SPT data at $\ell > 2000$. Instead, the discrepancy between SPT and Planck under ΛCDM may be explained by two individually insignificant statistical fluctuations: one between the SPT-SZ survey area and the full sky, the other in the high-$\ell$ data that are better constrained by the SPT.
Whether this explanation is correct will ultimately be determined most directly by additional observations of the CMB temperature anisotropies at $\ell > 2000$, both within and beyond the SPT-SZ patch. Additionally, measurements of the EE and TE CMB polarization power spectra, from, e.g., Advanced ACTPol ( …

… where $S$ and $N$ signify the sample and noise covariances, $W_{b\ell}$ are the window functions, the beam correlation term, $\rho^{B}_{\ell\ell'}$, is formed in a manner similar to Equation (4), and $\sigma_Y$ is the calibration uncertainty.
Using the data in the calculation of the covariance instead of a fiducial model introduces a bias in the likelihood. Elements of $D$ with lower values will have a smaller beam and calibration uncertainty than larger elements of $D$. When we fit a model to $D$, the preference will be to fit the elements of $D$ with lower values better than larger elements. The calibration uncertainty bias has a greater effect at low $\ell$ and the beam uncertainty bias has a greater effect at high $\ell$, explaining the connection to the parameters $A_se^{-2\tau}$ and $n_s$. If the beam and calibration uncertainty are added into the covariance in a model-dependent way, the error bars on all elements of $D$ are adjusted with the model. The resulting fits are unbiased, since a model that favors fitting the smallest elements of $D$ produces smaller error bars for all elements of $D$ and gives a worse overall fit.
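The direction of this bias can be seen in a deliberately simple numerical sketch, separate from the analytic example in the text. It fits a single constant amplitude to a few measurements, once with fractional errors scaled by the data values and once with errors scaled by a fixed fiducial value. The function names and the numbers are hypothetical, and the only error source is a fractional uncertainty, so this is an exaggerated toy case rather than the covariance used in the analysis.

```python
def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean: the least-squares constant amplitude."""
    weights = [1.0 / sigma**2 for sigma in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def fit_with_data_scaled_errors(data, frac_unc):
    """Errors proportional to each data value, so low points get more weight."""
    return weighted_mean(data, [frac_unc * abs(d) for d in data])


def fit_with_fiducial_scaled_errors(data, frac_unc, fiducial):
    """Errors proportional to one fiducial value, so all points weigh equally."""
    return weighted_mean(data, [frac_unc * abs(fiducial)] * len(data))


# Two measurements scattered symmetrically about a true amplitude of 1.
demo_data = [0.9, 1.1]
biased = fit_with_data_scaled_errors(demo_data, frac_unc=0.05)
unbiased = fit_with_fiducial_scaled_errors(demo_data, frac_unc=0.05, fiducial=1.0)
print(f"data-scaled errors:     {biased:.4f}")
print(f"fiducial-scaled errors: {unbiased:.4f}")
```

With data-scaled errors the low measurement carries more weight, so the fitted amplitude falls below the midpoint of the two measurements, while the fiducial-scaled fit recovers the midpoint.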
This bias can best be understood in the context of a simple example. Since the bias works in the same manner for both beam and calibration uncertainty, we focus on just the latter. We assume a set of data ($d$) drawn from a Gaussian distribution with covariance $C$. Our model has the form $Y m(\theta)$, where we explicitly include the calibration $Y$, and $m(\theta)$ depends on the remaining parameters in our model ($\theta$).
With a Gaussian prior on $Y$ of $\mathcal{N}(1, \sigma_Y)$, the probability for the model parameters $\theta$ and $Y$ given $d$ can be written as … (20)