Prospects for kSZ$^2$-Galaxy Cross-Correlations during Reionization

We explore a new approach for extracting reionization-era contributions to the kinetic Sunyaev-Zel'dovich (kSZ) effect. Our method utilizes the cross-power spectrum between filtered and squared maps of the cosmic microwave background (CMB) and photometric galaxy surveys during the Epoch of Reionization (EoR). This kSZ$^2$-galaxy cross-power spectrum statistic has been successfully detected at lower redshifts ($z \lesssim 1.5$). Here we extend this method to $z \gtrsim 6$ as a potential means to extract signatures of patchy reionization. We model the expected signal across multiple photometric redshift bins using semi-numeric simulations of the reionization process. In principle, the cross-correlation statistic robustly extracts reionization-era contributions to the kSZ signal, while its redshift evolution yields valuable information regarding the timing of reionization. Specifically, the model cross-correlation signal near $\ell \sim 1,000$ peaks during the early stages of the EoR, when about 20% of the volume of the universe is ionized. Detectible $\ell$ modes mainly reflect squeezed triangle configurations of the related bispectrum, quantifying correlations between the galaxy overdensity field on large scales and the smaller-scale kSZ power. We forecast the prospects for detecting this signal using future wide-field samples of Lyman-break galaxies from the Roman Space Telescope and next-generation CMB surveys including the Simons Observatory, CMB-S4, and CMB-HD. We find that a roughly 13$\sigma$ detection is possible for CMB-HD and Roman after summing over all $\ell$ modes. We discuss the possibilities for improving this approach and related statistics, with the aim of moving beyond simple detections to measure the scale and redshift dependence of the cross-correlation signals.


INTRODUCTION
The Epoch of Reionization (EoR) is the time period when early star-forming galaxies and accreting black holes first formed, emitted ultraviolet (UV) light, and gradually photoionized hydrogen throughout essentially the entire volume of the universe. There has been a great deal of recent observational progress in studying the EoR (Planck Collaboration et al. 2020;Becker et al. 2015;McGreer et al. 2015;Davies et al. 2018;Mason et al. 2019), but future efforts are required to better understand when precisely reionization occurs and to measure the spatial variations in the ionization state of the intergalactic medium (IGM) during the reionization process. An improved understanding here will reveal the properties of the universe during this largely unexplored observational frontier, and help in testing models of the earliest phases of galaxy and cosmological structure formation (Loeb & Furlanetto 2013).
One powerful signature of the EoR is referred to as the patchy kinetic Sunayev-Zel'dovich (kSZ) effect (Sunyaev & Zeldovich 1972;Gruzinov & Hu 1998;Knox et al. 1998). The ionization state of the IGM during reionization is expected to be highly inhomogeneous, with "bubbles" of ionized gas forming around highly clustered ionizing sources, which gradually grow and merge as the ionizing photons propagate into the IGM and new UVluminous sources turn on. Some of the photons from the cosmic microwave background (CMB) scatter off of free electrons during the EoR and receive Doppler shifts owing to the line-of-sight bulk peculiar velocities of the ionized regions. This imprints secondary anisotropies in the CMB, which are sourced in part by the patchy ionized structure in the EoR. The CMB anisotropies from the kSZ effect are hence sensitive to the bubble sizes during reionization (e.g. McQuinn et al. 2005). Furthermore, the overall duration of reionization partly sets the amplitude of the patchy kSZ signal (Gruzinov & Hu 1998). In longer reionization scenarios, it is more likely that CMB photons scatter off of free electrons within ionized bubbles during reionization; this imprints largeramplitude features on the CMB maps on the angular scales spanned by the ionized regions. Hence, the patchy kSZ signal contains a wealth of information regarding the timing, duration, and spatial structure of the reionization process (Battaglia et al. 2013a;Natarajan et al. 2013;Choudhury et al. 2021;Paul et al. 2021).
There are, however, a number of challenges that must be overcome to extract this signal from CMB surveys and to best exploit it as a probe of reionization. The first issue is that one must separate the kSZ contributions to the CMB anisotropies from other sources of variations at relevant frequencies, including fluctuations in the cosmic infrared background (CIB) sourced by dusty star-forming galaxies, radio sources, and the thermal Sunyaev-Zel'dovich (tSZ) effect, predominantly from clusters and groups at much lower redshift. In principle, these can be distinguished spectrally because the kSZ effect maintains a blackbody spectral shape, in contrast to these other sources of anisotropies. The kSZ fluctuations are, however, much smaller in amplitude and so this places stringent requirements on the accuracy and precision of component-separation algorithms (Zahn et al. 2012;Calabrese et al. 2014;Reichardt et al. 2021). This issue is further complicated by spatial correlations between different components, such as the tSZ and CIB, which must be properly accounted for (Addison et al. 2012).
A still more daunting challenge is separating the patchy reionization anisotropies from post-reionization contributions to the kSZ effect, the so-called "late-time kSZ" signal. The late-time kSZ effect shares, of course, the blackbody spectral shape of the patchy component, and so must be distinguished based on its angular dependence alone. Unfortunately, the expected angular power spectrum of the post-reionization signal is relatively similar in shape and amplitude to the contributions from the EoR (Zahn et al. 2012). Furthermore, the late-time kSZ power spectrum itself is somewhat uncertain because it depends, for instance, on feedback effects from galaxy formation and supermassive black holes, which are hard to model (Park et al. 2018). Finally, the kSZ signal is a projected quantity and so by itself provides mainly an integral-type constraint on the reionization history and not detailed information on how reionization evolves with redshift. One promising approach to help with these challenges is to exploit a suitable fourpoint function statistic, which can aid in extracting the patchy reionization contributions and their redshift evolution (Smith & Ferraro 2017;Ferraro & Smith 2018;Alvarez et al. 2021).
Another possibility, which we pursue here, is to turn to cross-correlations. The naive choice of a two-point correlation between the kSZ signal and another tracer of large-scale structure (LSS) during the EoR is not, itself, promising: In this statistic, a cancellation occurs because ionized regions may be moving either toward or away from the observer. 1 This issue may be easily circumvented, however, by simply filtering the CMB map to suppress contamination from primary anisotropies and other contributions and then squaring the filtered map (Doré et al. 2004;Ferraro et al. 2016;Hill et al. 2016). The filtered and squared map will then correlate with a suitable LSS tracer, and photometric samples are sufficient here, i.e. expensive spectroscopic observations are not required. Indeed, this basic approach has been used successfully to study the late-time kSZ effect by combining Planck CMB data with photometric galaxy catalogs from the WISE survey at z 1.5 to measure the kSZ 2 -galaxy cross-power spectrum Kusiak et al. 2021). Moreover, the future outlook for more precise measurements of this statistic appears outstanding, with forecasts promising measurements at many hundreds of sigma statistical significance (Ferraro et al. 2016, hereafter F16). In the context of reionization studies, Ma et al. (2018) studied the possibility of measuring the kSZ 2 -21 cm cross-power spectrum, while La Plante et al. (2020) explored the related kSZ-kSZ-21 cm bispectrum. Both of these correlations with redshifted 21 cm data are, unfortunately, problematic in that the kSZ signal is sensitive only to Fourier modes with very long-wavelength line-of-sight components, yet these modes are almost inevitably lost to fore-ground contamination in the 21 cm surveys (La Plante et al. 2020).
Here we instead study, for the first time, the kSZ 2galaxy cross-power spectrum during the EoR. 2 This is motivated in part by the unprecedented photometric samples of reionization-era galaxies expected from the Roman Space Telescope and anticipated advances in CMB observations, with an upcoming suite of wide-field, high-sensitivity and fine-angular-resolution surveys, including the Simons Observatory (SO 3 ; Ade et al. 2019), CMB-S4 4 (Abitbol et al. 2017), andCMB-HD 5 (Sehgal et al. 2019). In particular, the Roman high-latitude survey (HLS) will use the Lyman-break selection technique to detect more than ∼ 10 6 galaxies at z ≥ 6 across roughly 2200 deg 2 on the sky (Spergel et al. 2015). The photometric galaxy samples provide clean measurements of transverse modes and so, unlike the case of the foreground-corrupted 21 cm field, are well suited for combining with kSZ data. Furthermore, the high-redshift galaxy samples should be well correlated with the patchy kSZ signal but not the late-time kSZ effect. Likewise, most of the other variations at the relevant frequencies, such as the tSZ, CIB, and radio sources, are produced at lower redshifts and so should be uncorrelated -or very weakly correlated -on average with the z ≥ 6 Lyman-break galaxies from the Roman HLS sample. In other words, other sources of anisotropy in the CMB maps will produce a noise source for kSZ 2 -galaxy correlation measurements but not an average bias. The exception here is CMB lensing, which will contribute to our estimates and must be separated based on its rather different angular dependence (F16) (see also Section 4.2).
Finally, if the measurements can be made precisely enough, the kSZ 2 -galaxy cross-power spectrum may be measured in different photometric galaxy redshift bins to study the evolution of the patchy kSZ effect with redshift. Thus, the kSZ 2 -galaxy cross-correlation offers a potential means to robustly extract the reionization-era contributions to the kSZ signal and may also access further tomographic information regarding the reionization history.
These features motivate the current work, which aims to model the kSZ 2 -galaxy cross-power spectrum and to forecast the expected signal-to-noise ratio (S/N) achievable with upcoming surveys. In order to do so, we make use of seminumeric simulations of reionization, which are fast and flexible, yet compare reasonably well to more detailed radiative transfer calculations (Battaglia et al. 2013b). In Sec 2, we describe the simulation methods used throughout this paper. In Section 3, we present the simulated kSZ 2 -galaxy cross-power spectrum models and discuss their interpretation. In Section 4 we give S/N forecasts for a number of upcoming surveys and describe the prospects for actually measuring the simulated signals. Section 5 discusses the possibilities for more precise measurements further in the future. We conclude and mention possible directions for extensions to this paper in Section 6. Throughout we assume a flat ΛCDM cosmology with Ω m = 0.316, Ω b = 0.049, h = 0.673, σ 8 = 0.812, and n s = 0.966. These parameters are consistent with those reported by the Planck18 analysis (Planck Collaboration et al. 2020).

METHODS
Here we present our basic approach for crosscorrelating kSZ and galaxy survey data. First, we describe the seminumeric reionization simulations used throughout this work (Section 2.1). We then discuss how the mock kSZ data (Section 2.2) and simulated maps of the galaxy distribution are generated (Section 2.3). Finally, we introduce the kSZ 2 -galaxy cross-power spectrum statistic employed in this paper (Section 2.4).

Seminumeric Reionization
Here we make use of so-called seminumeric simulations of reionization. This approach provides a fast and flexible treatment of the reionization process. Given the relatively weak signal modeled here, it is helpful to have rapid reionization calculations: This allows us to span large volumes, to average over many independent simulation realizations, and to explore a few different reionization models. Although more-detailed reionization simulations incorporating radiative transfer and hydrodynamics have been performed (Trac & Cen 2007;Trac et al. 2008;Iliev et al. 2014;Ocvirk et al. 2016Ocvirk et al. , 2020, the box sizes in these calculations are too small to capture the large scales of interest for our current study. One such seminumeric method for modeling reionization is zreion (Battaglia et al. 2013b). This technique has been applied to studies of the kSZ (Battaglia et al. 2013a;Natarajan et al. 2013;La Plante et al. 2020) and 21 cm signals (La Plante et al. 2014;La Plante & Ntampaka 2019). This model approximates each simulated volume element as either completely ionized or completely neutral. In order to compute the ionization field x i (r, z) for each point and redshift in the simulation volume (where x i = 0 represents neutral gas and x i = 1 is ionized), we begin by defining a local redshift of the reionization field z re (r). This field expresses the redshift at which a particular cell in the simulation volume is first reionized. We then define the fractional fluctuation in this field, δ z (r), as wherez is a model parameter that defines the midpoint of reionization. The reionization field δ z is assumed to be a biased tracer of the matter overdensity field on large scales. This relationship is quantified by a bias This bias is parameterized as a function of Fourier mode k in the following way: We use the value of b 0 = 1/δ c = 0.593, where δ c is the critical overdensity in the spherical collapse model. The reionization history is then determined by the set of model parameters {z, α, k 0 }. The midpoint of reionization is fixed byz, defined in Equation (1), whereas α and k 0 influence the duration of reionization as described in Equation (3). Battaglia et al. (2013b) demonstrate that the zreion model compares favorably with full radiation hydrodynamic simulations of reionization. In particular, for a suitable choice of model parameters, it reproduces the average ionization history and the cell-by-cell ionization field in the more-detailed simulation (and hence also summary statistics such as the power spectrum of the ionization field). For this study, our fiducial model adoptsz = 8, α = 0.2, and k 0 = 0.9 h Mpc −1 . 6 These values produce an ionization history that is relatively late and extended and consistent with measurements of the electron-scattering optical depth, τ , from Planck (Planck Collaboration et al. 2020), as well as observations of the Lyman-α forest toward high-redshift quasars 6 The values ofz, α, and k 0 in Battaglia et al. (2013b) were calibrated from hydrodynamic simulations and are used as our "early" simulation parameters discussed below in Section 3.1. and studies of Lyman-α emitters (McGreer et al. 2015;Davies et al. 2018;Mason et al. 2019).
For the current work, we simulate realizations of the matter density field using second-order Lagrangian perturbation theory (2LPT). This method is sufficiently accurate for the present study, in which we are interested in high, reionization-era redshifts (i.e., z 6 or so), and mostly consider large spatial scales Lidz et al. 2007). Our simulations track 1024 3 particles in a cubic simulation volume with a comoving side length of 2 h −1 Gpc. In what follows, we restrict our analysis to multipole moments that are well sampled and resolved in the simulations. Although our simulations capture multipoles from roughly ∼ 20-19,000, we conservatively consider only 100 ≤ ≤ 10, 000 in our analyses.
To generate an ionization field, we first use 2LPT to evolve the particles to the midpoint of reionization,z. The particles are deposited onto a grid with triangularshaped clouds (TSC), and the resulting matter density field δ m (r) is tabulated. We then apply the bias factor in Equation (3) to the matter density field, yielding δ z (k). The local redshift of the reionization field, z re (r), follows after applying an inverse FFT to δ z (k) and using Equation (1) to relate δ z (r) and z re (r). The ionization field at a particular redshift, z 0 , is set to 1 provided z re (r) is greater than z 0 (meaning that portion of the simulation volume was reionized at an earlier time) and 0 otherwise.

The kSZ Field
The kSZ effect results from CMB photons that scatter off of free electrons that are moving with some peculiar velocity relative to the observer, v ·n, wheren is a unit vector along the line of sight. The resulting CMB temperature fluctuation is (Sunyaev & Zeldovich 1972) Here χ is the comoving distance along the line of sight to the scattering location and q is the local electron momentum field with q = v(1 + δ m )(1 + δ x )/c. The patchy reionization effects arise through the spatial fluctuations in the ionization fraction, 1 + δ x = x i / x i . Here x i is the ionization fraction at the scattering location, and x i is the global average ionization fraction. The quantity g(χ) is the visibility function, which describes the probability that a CMB photon scatters between χ and χ − dχ, without subsequent scattering along its path to the observer. This may be written as (Alvarez 2016) where σ T is the cross section for Thomson scattering and n e,0 = [1 − (4 − N He )Y /4]Ω b ρ crit /m p is the mean electron number density at z = 0. 7 Here, τ (χ) is the electron-scattering optical depth between the observer and a comoving distance χ. In this equation, we can safely compute e −τ (χ) using the global average reionization history because the electron-scattering probability is small, and its variations across different lines of sight are unimportant here. We set the number of helium ionizations per hydrogen atom, N He = 1, so that helium is singly ionized along with hydrogen. (We can safely ignore doubly ionized helium because helium becomes twice ionized mainly at significantly lower redshifts; La Plante et al. 2017.) The contribution of free electrons from singly ionized helium leads to a weak dependence on the helium mass fraction, Y . We use Equations (4) and (5) to generate kSZ maps from our simulations. The details of the kSZ map-generating procedure are as follows. We first construct the redshift of the reionization field z re (r) as described above in Section 2.1. We then build kSZ sight lines through the volume, interpolating from the comoving coordinates in the simulation box onto a fixed grid in angular coordinates (θ x , θ y ). The matter density and velocity fields at each position along a given line of sight are linearly interpolated from 2LPT simulation snapshots at two nearby scale factors, a. The ionization field is subsequently determined using the values of z re (r). We then sum up the contributions across the entire length of each sight line and produce a two-dimensional map of the kSZ fluctuation using Equations (4) and (5). Note that we only include reionization-era contributions to the simulated kSZ signal in this work because the late-time kSZ signal is only a source of noise in the present study. We do, of course, account for the late-time kSZ signal in our estimates of the variance of the cross-correlation signal (see Section 4).

The Galaxy Field
In the current study, we adopt a simple linear-biasing model for the spatial distribution of the Lyman-break galaxies in the Roman HLS. We base our models on the BlueTides simulation (Feng et al. 2015. This simulation resolves reionization-era galaxies and has been used to make detailed predictions for an HLS-like survey (Waters et al. 2016, hereafter W16). We use the redshiftdependent linear-bias factor from their Figure 14 and 7 Note that the density and ionization fraction fluctuations are included in the local electron momentum field, q (Equation 4), and so g(χ) describes the scattering probability as a CMB photon propagates through the universe at the cosmic mean density and ionization fraction.
the number density of mock HLS Lyman-break galaxies from their Figure 3 in what follows. In the future, it may be interesting to move beyond the linear-biasing model adopted here (see, e.g. Mirocha et al. 2021 for one approach), but we expect the linear-biasing description to be relatively accurate for the forecasts here because most of the S/N comes from large angular scales (see Section 4). In order to model the two-dimensional galaxy distribution as a function of direction on the sky,n, we generate sight lines through the simulation box that are fixed in angular coordinates, (θ x , θ y ). The resulting twodimensional galaxy fluctuation field is Here W g (z) is a redshift window, which describes the probability that a mock HLS Lyman-break galaxy lies within a particular range in redshift for measurements in a given photometric redshift bin. We approximate this by a top-hat function in redshift, with a midpoint z 0 and a width ∆z: where the normalization W 0 is a constant chosen so that the window function obeys the constraint that dz W g (z) = 1. The minimal redshift width, ∆z, which may be achieved in practice depends on the accuracy of the redshift determination for the Lyman-break galaxies in the HLS. Several studies (Bouwens et al. 2015;Finkelstein 2016) find photometric redshift uncertainties of σ z ∼ 0.2-0.3 for z 6 Lyman-break galaxies, as selected using the color filters available for the Hubble Wide Field Camera 3 (WFC3). The photometric redshift accuracy for Roman's Wide Field Instrument (WFI) may be higher given its larger set of filters. In addition, a grism instrument on Roman can provide spectroscopic redshifts for a subset of the galaxies in the HLS sample. In this work, we will vary ∆z to test its impact on the results but will generally consider redshift bins that are substantially coarser than the photometric redshift uncertainties. In our S/N calculations, we ignore contamination in the Lyman-break samples from lowerredshift interloping galaxies and foreground stars (see, e.g., Finkelstein 2016 for a discussion.)

Cross-correlation
As motivated in the Introduction, we study the filtered kSZ 2 -galaxy cross-power spectrum in this work as a means to extract patchy reionization contributions  (8). The kSZ signal from reionization is determined by averaging over 30 simulation realizations, as described in Section 2.2. The late-time kSZ component is defined in Equation (9). The lensed primary CMB anisotropies are computed using CAMB (Lewis et al. 2000). The post-ILC noise power spectra, N , are shown for various CMB experiments.
to the kSZ signal. The filter we apply suppresses modes where the reionization-era kSZ angular power spectrum is small relative to other contributions to the anisotropies on the relevant scales, which act as "noise" for the patchy kSZ measurement (Doré et al. 2004;Ferraro et al. 2016;Ma et al. 2018). Specifically, the lensed primary CMB anisotropies; residual foregrounds left over after component separation from the CIB, radio sources, and the tSZ; the late-time kSZ signal; and instrumental noise all contribute to the effective noise here. In this case, an appropriate filter may be written as (F16): where C kSZ,reion is the angular power spectrum of the kSZ signal due to reionization (z 6), C T T is the lensed primary CMB spectrum, C kSZ,late is the late-time (z 6) contribution to the kSZ, and N accounts for both residual foregrounds and detector noise for a given experiment, after deconvolving the instrumental beam. To generate C kSZ,reion , we run 30 independent realizations of our simulations and average their C spectra together. The resulting maps have a power spectrum amplitude of ( + 1)C /2π ∼ 1 µK 2 at ∼ 3000, which is in line with other seminumeric simulations of reionization for similar reionization histories (Mesinger et al. 2012;Paul et al. 2021). For the late-time kSZ signal, we use the power-law fit of Park et al. (2018), which is Note-These numbers define the noise values for the naive experimental noise defined in Equation (11), as well as the beam size used in b( ) (used both for the naive noise as well as the filter with the post-ILC noise estimates).
In addition to this filter, we convolve the simulated signal maps with an instrumental beam to account for the finite angular resolution of the upcoming surveys. The beam is described by a Gaussian model and denoted by b( ), while the parameters for each instrument are listed in Table 1. Accounting for both the noise-suppressing filter and the instrumental beam, the observed temperature fluctuation ∆T f ( ) in harmonic space is related to the underlying variations before smoothing, ∆T ( ), as: where we have defined f ( ) ≡ F ( )b( ). We consider the prospects for a range of upcoming CMB surveys, spanning the noise (N ) appropriate for each of SO, CMB-S4, and CMB-HD. Given the frequency coverage of these surveys, residual foregrounds will inevitably remain after component-separation algorithms use the blackbody spectral shape to separate the kSZ signal from other contributions to the anisotropies. For example, the standard internal linear combination (ILC) method constructs a weighted linear combination of maps at different frequencies (Bennett et al. 1992;Tegmark et al. 2003;Eriksen et al. 2004;Delabrouille et al. 2009). The weights are chosen to give a unit response to the desired blackbody spectrum and to minimize the variance of the resulting map. We consider applying the ILC technique in harmonic, i.e., space, and use the publicly available orphics code (Hotinli The overall amplitude of this quantity is arbitrary, and so we normalize each curve to unity at the value where the filter is maximal. The filter downweights low modes where the primary anisotropies swamp the kSZ signal, as well as higher modes where residual foregrounds and instrumental noise dominate. In general, this filter peaks at higher values for the experiments with higher angular resolution and lower pixel noise. Note that although the CMB-HD filter would allow modes at 10, 000 such scales are not well resolved in our simulated maps but in any case contribute negligibly to the S/N. In practice, we hence truncate the filter beyond = 10, 000. et al. 2021), 8 to estimate the expected angular power spectrum of residual CIB, radio sources, tSZ, and instrumental noise. Note that the residual foreground contributions are quite significant here, and so it is important to account for them along with the instrumental noise.
For comparison, we will also consider cases where N consists solely of instrumental noise with negligible foreground contamination. In the instrumental-noisedominated limit, the relevant power spectrum is  (Hotinli et al. 2021). The more realistic case, including residual foregrounds after ILC component separation will be referred to as the "post-ILC noise" scenario, or the "ILC noise" case throughout. Figure 1 shows the different components that contribute to the filter defined in Equation (8). For multipoles of 4000, the lensed primary CMB anisotropies are the dominant source of noise for all upcoming CMB surveys, and significantly lower values are completely swamped by the primary anisotropies. Hence such scales will be heavily downweighted in the filter. At higher values, residual foregrounds and instrumental noise are the strongest contributions. While the combined noise is larger than the patchy kSZ signal at all scales, it can nevertheless be detected statistically after averaging over many modes.
The resulting filters for SO, CMB-S4, and CMB-HD are shown in Figure 2. As expected from Figure 1, the filter downweights low-modes that are swamped by the primary anisotropies, as well as higher values where the residual foreground and instrumental noise dominate. In each case, the filter peaks at relatively small scales near 3000 5000, with the higher resolution and more sensitive instruments giving peaks at smaller angular scales.

RESULTS
We now turn to apply the filter of Equation (10) to the simulated kSZ map, square the signal, and crosscorrelate with the projected galaxy distribution. It is instructive to first visually examine the resulting simulated fields. Specifically, Figure 3 shows the simulated filtered kSZ 2 maps, generated following Section 2.2, along with projected galaxy fluctuation fields (Section 2.3) and ionization fraction variations. It is impossible to visually discern correlations between the kSZ 2 signal and the ionization fluctuations because the kSZ signal receives contributions from the entire reionization history and is sourced partly by density and velocity variations. Likewise, the kSZ 2 -galaxy crosscorrelation is too weak to see but may nevertheless be measured statistically. We include maps of the simulated signals just to provide some overall impression of the quantities we model and correlate here.
The color bars in Figure 3 are also instructive as they indicate the amplitude of the fluctuations in these fields. For instance, the very brightest fluctuations in the kSZ 2 map are at the kSZ 2 ∼ 30µK 2 level (the color bar in Figure 3 is truncated above 5 µK 2 for visualization purposes), while the rms fluctuation in this quantity is kSZ 2 RMS = 1.6 µK 2 . Note that this is a strongly non-Gaussian field, with a long large-fluctuation tail. The galaxy abundance fluctuations are order ∼ unity, with an RMS of δ g,RMS = 0.54. Note that regions with δ g < −1 are unphysical, and their occurrence here is an artifact of our linear-biasing model. We nevertheless stick to this simple model in this work, as it should still provide a reliable forecast for the expected S/N of the cross-spectrum. As we will see, the kSZ 2 and galaxy fields are only weakly correlated and so the  (8) and b( ) defined in Equation (11). Here we use the parameters for SO. Center: the corresponding high-redshift galaxy field. This two-dimensional field was generated by averaging across 9 ≤ z ≤ 10, corresponding to a window function Wg defined in Equation (7) with z0 = 9.5 and ∆z = 1. We have also applied a low-pass filter removing all modes with ≥ 1000 to emphasize the large-scale features that are most important for the cross-correlation (discussed more below in Section 3). Right: the corresponding ionization field, xi, averaged over 9 ≤ z ≤ 10. The objective here is to use correlations between the kSZ 2 and galaxy fields to learn about the average ionization fraction evolution and its fluctuations.
Although the correlations are too weak to be seen by eye in the simulated maps here, they may be detected statistically by averaging over multiple simulation realizations, or by using larger patches of the sky in the real universe.
cross-correlation signal is substantially smaller than the product of the RMS variations in each field. In order to quantify this, Figure 4 shows the angular cross-power spectrum between the kSZ 2 and galaxy fluctuation fields, C kSZ 2 ×δg , as a function of . Here In order to reduce the noise on the simulation measurements, we average over 30 independent simulation realizations. The error bars in Figure 4 give estimates of the error on the mean simulated signal. These results illustrate several interesting features of the signal. First, the kSZ 2 -galaxy cross-power spectrum peaks near ∼ 1000, virtually independently of redshift. As discussed further below (see Equation 12), this implies that this statistic is mostly picking out fairly squeezed-triangle configurations of the related bispectrum. The SO filter peaks at ∼ 3000, while the cross-power here is maximum near ∼ 1000. The CMB-S4 and CMB-HD filters pick out still higher modes and the signal extends down to ∼ a few hundred, so the statistic mainly quantifies how the smallscale kSZ power varies with the galaxy overdensity on larger scales. Although the signal is also appreciable at ∼ 3000, corresponding to equilateral-type triangles, we will see that most of the S/N comes from squeezed con-figurations (see Section 4). Second, the cross-correlation is positive for all redshifts and scales shown. This arises because large-scale overdensities in the matter field contain more galaxies, and also larger fluctuations in the ionization fraction on small scales during most of the EoR, and hence greater high-kSZ power. Third, the amplitude of the power spectrum peaks at around ( + 1)C kSZ 2 ×δg /(2π) ∼ 0.02 µK 2 . Although this is a small signal, one should keep in mind that the filter applied suppresses the primary anisotropies and other sources of variations in the map. The peak is reached rather early in reionization: The largest signal in the examples shown occurs in the rightmost panel, at a volume-averaged ionization fraction of x HII ∼ 0.20. As discussed further below, this results in part because the filter picks out kSZ variations on fairly small angular scales. Finally, another interesting feature of the results in Figure 4 is that the signal depends little on the redshift extent of the galaxy distribution considered. We suspect that this reflects the following trade-offs. First, the patchy kSZ signal receives contributions from a broad range of redshifts across the EoR (Equations 4 and 5), and so increasing the extent of the galaxy window boosts the overlap with the kSZ visibility function. Second, however, the projected galaxy fluctuations decrease in amplitude for a broader redshift window. Third, the CMB filter extracts the kSZ fluctuations over a fairly narrow range of modes, concentrated on small scales that peak relatively early in the EoR. Hence, although a broader galaxy window promotes overlap with the visibility function, this does not boost the signal apprecia-   bly because the small-scale kSZ 2 signal -and its correlation with matter overdensities -is sharply peaked in redshift. It seems that, in total, these three effects conspire to give little dependence on ∆z. We discuss the interpretation of the simulated signal further below in Section 3.1, but leave a more complete treatment to possible future work. Figure 5 shows the redshift evolution of the crossspectrum at several example values of in more detail, with the corresponding volume-averaged ionization fractions given along the upper x-axis. Here we take galaxy redshift windows with a redshift extent of ∆z = 1comfortably broader than the photometric redshift accuracy of the Lyman-break galaxies detectable in the Roman HLS -and show the central redshift in each bin. Similar to Figure 4, the shaded regions show the standard error of the mean computed from our 30 simulation realizations. The example modes in the figure all illustrate a similar trend with redshift and ionization fraction: The signal rises and falls and peaks near an ionization fraction of x HII ∼ 0.2. The ionization fraction variations are evidently the dominant source of signal here, as the cross-power is nearly vanishing towards the end of reionization, around z ∼ 6 in this model. This figure also highlights the potential power of future kSZ 2 -galaxy cross-power spectrum measurements as a tomographic reionization probe. If the S/N of the measurements is large enough, one can determine the crosspower spectrum in different photometric redshift bins and extract how reionization evolves with redshift. The small, yet nonvanishing, value of the cross-correlation in the highest-redshift bin in Figure 5 reflects contributions from the low-redshift end of the bin. In this bin, a few percent of the simulation volume is ionized, and the ionized regions are strongly concentrated around rare overdense peaks in the matter distribution.

Dependence on Ionization History
In order to explore further the trends with ionization fraction seen in Figure 5, we produced two additional reionization models. In these added cases, the timing of reionization differs from that in the fiducial scenario considered thus far. Specifically, our alternate models include a "short" reionization history with the same midpoint ofz = 8 (defined by where x HII ∼ 0.5), but a shorter reionization duration. We also model an "early" reionization scenario, which has a similar overall duration to our fiducial model, yet with a midpoint of z = 10. The corresponding k 0 and α parameters (see Equation 3) are k 0 = 0.185 hMpc −1 , α = 0.564 for the short history. The early scenario uses the same k 0 and α parameters as the short scenario but adopts a higher value ofz = 10. The early scenario is disfavored by current constraints, such as the electron-scattering optical depth inferred from Planck18: the early scenario has τ early = 0.079, whereas the current Planck constraint is τ Planck18 = 0.054 ± 0.007 (Planck Collaboration et al. 2020). Thus, the optical depth in the early scenario is 3.6σ above the Planck18 central value. 9 Nevertheless, this scenario still provides a useful and illustrative example for testing the impact of alternate reionization models on the kSZ 2 -galaxy cross-spectrum.  x HII Figure 5. The cross-correlation statistic C kSZ 2 ×δg as a function of z0. Different lines correspond to different modes. The redshift values shown on the bottom x-axis correspond to the central redshifts, z0, of the galaxy window, Wg. In each case the width is fixed to ∆z = 1. The corresponding volume-averaged ionization fractions, xHII, are shown on the upper x-axis. For most modes, the signal peaks relatively early in reionization, near when the volume-averaged ionization fraction is xHII ∼ 0.2. The shaded regions show model uncertainties as in Figure 4. See Section 3 for more discussion.
contrasts the reionization histories in the three different models explored here. 10 Figure 7 compares the behavior of the kSZ 2 -galaxy cross-power spectrum in the three different models. The left-hand panel shows that the early model has the largest amplitude signal. This results because of the increased gas density at high redshift, which leads to a larger electron-scattering opacity at early times in this model. The increased opacity, along with the enhanced clustering bias of the galaxies at high redshift, boosts the signal in this scenario. Although the signal is larger, we will see that the cross-spectrum is less detectable in this case (see Section 4). The fiducial and short reionization histories have similar peak amplitudes, but the rise and fall with redshift in the fiducial model are more gradual than in the short history. This illustrates that the cross-power spectrum statistic modeled here can be 10 Note that we do not vary the abundance and clustering of the Roman HLS Lyman-break galaxies self-consistently with the reionization history model. That is, we just fix the bias factors and abundances of these sources to the BlueTides-model values throughout. This should be a good approximation in that the HLS observations will capture just the bright end of the galaxy populations and not the typical ionizing sources. Therefore, the observable Lyman-break galaxies are largely decoupled from the ionizing sources themselves. x HII Fiducial Early Short Figure 6. The volume-averaged ionization fraction of hydrogen, xHII, as a function of redshift for our fiducial, early, and short reionization scenarios. The different models shown here help in understanding how the kSZ 2 -galaxy cross-power spectrum signal depends on the underlying reionization history.
used to determine how rapidly reionization evolves with redshift, provided it can be measured precisely enough across different photometric redshift bins. Another interesting feature of Figure 7 is emphasized in the right panel, in which the cross-spectrum is plotted as a function of the volume-averaged ionization fraction x HII . Although the timing of reionization differs markedly across our three example models, they all show a similar evolution when considered as a function of x HII . In each case, the signal rises rapidly at small ionization fractions, reaches a peak at around x HII ∼ 0.15-0.2, and then shows a more gradual decline with increasing ionization fraction. One detail appears slightly differently in the early reionization history, however: This model shows a negative cross-correlation signal toward the end of reionization. This may result because the large-scale matter densities are almost entirely ionized by the late stages of reionization in this scenario. In this case, the ionization field is smooth on small scales within overdense regions, and so there is less small-scale kSZ power in such portions of the universe (leading to a negative correlation). At any rate, the main result of Figure 7 is that the kSZ 2 -galaxy cross-power spectrum evolves in a generic way with average ionization fraction across these three different scenarios. It may therefore help to determine the timing of reionization, with the peak providing a potential marker of when 15-20% of the volume of the universe is ionized.
A more quantitative understanding of the trends observed in our simulated examples may be obtained by relating the kSZ 2 -galaxy cross-power spectrum to an  Figure 5. This shows the behavior of the signal for the different ionization histories of Figure 6, discussed in Section 3.1. Right: the same as the left panel but plotted as a function of ionization fraction xHII instead of redshift. Although the timing of reionization differs in each scenario, in all three cases the signal peaks at a similar average ionization fraction. The overall rise and fall in the amplitude of the cross-spectrum are also comparable when the three models are considered as a function of a ionization fraction. The generic behavior here is promising to use this statistic to extract information about the timing of reionization. underlying bispectrum. We leave a more complete treatment along these lines to possible future work and confine ourselves here to a few general remarks. The crosspower spectrum may be written, in the Limber approximation, as (Doré et al. 2004): The cross-bispectrum in the above equation, B δpnpn , involves the matter density fluctuations, δ m , and the spatial variations in the line-of-sight component of the electron momentum field, pn at two different multipoles. Note that the ionization fraction fluctuations are embedded in the electron momentum field here (see Equations 4 and 5). A few immediate conclusions can be drawn from Equation (12). First, recall that f (L) is rather sharply peaked around 3000 L 5000, with the detailed filter shape depending on the CMB survey considered (see Figure 2). If we consider L, the regime where the S/N peaks (see Section 4), then the above cross-spectrum is picking up contributions from the bispectrum for squeezed-triangles configurations, as mentioned earlier. The squeezed-triangle bispectrum describes the correlation between the low-fluctuations in the galaxy density field and the high-L kSZ power. Second, near the redshift of the signal peak at z ∼ 9.5, the CMB filter extracts primarily somewhat small-scale electron momentum fluctuations with a wavenumber of k ∼ L/χ(z) = 0.3-0.5 Mpc −1 for 3000 L 5000. At these multipoles, the angular power spectrum of the density-weighted ionization fraction fluctuations (in ∆z = 1 bins) peaks when the average ionization fraction is x HII = 0.38. That is, the small-scale density-weighted ionization power reaches its maximum fairly early in the reionization process. This peak occurs, however, slightly after that in the kSZ 2 -galaxy cross-spectrum signal. This presumably relates to the fact that our statistic is sensitive to how the small-scale kSZ variations correlate with the large-scale galaxy fluctuations, and so its amplitude is not entirely set by the strength of the small-scale electron momentum fluctuations alone.
In future work, it may be interesting to decompose the bispectrum of Equation (12) into several constituent terms involving various products of the ionization, density, and velocity fields. One can then carry out the integrals in Equation (12) to quantify the relative contributions of the different terms to the cross-spectrum and further analyze their dependence on the galaxy window and reionization model.

OBSERVATIONAL PROSPECTS
We now turn to consider the prospects for detecting the kSZ 2 -galaxy cross-power spectrum with upcoming surveys. As motivated previously, we focus on the case of the Roman HLS Lyman-break galaxy sample in combination with SO, CMB-S4, and CMB-HD. First, we start by determining the total S/N at which we can measure C kSZ 2 ×δg in a given photometric redshift bin, summed over all modes. We consider various values for the galaxy redshift window parameters, z 0 and ∆z ( Equation 7). Here we follow the closely related calculations in F16. Assuming our fiducial case describes the true underlying kSZ 2 -galaxy cross-power spectrum signal, the total S/N of a measurement is (F16) where f sky is the fraction of overlapping sky surveyed by both experiments, C δgδg is the angular power spectrum of the galaxy map including shot noise, and CT 2T 2 ,f is the total power spectrum of the filtered CMB 2 map. The equation above adopts the usual Gaussian approximation for the cross-spectrum variance. Furthermore, we also compute CT 2T 2 ,f in the Gaussian approximation (F16): Here we define where the various components have the same meaning as in Equation (8).
Equation (13) requires an estimate of C δgδg , which includes both clustering and shot-noise terms. This is written as where C clust is the clustering component of the galaxy C spectrum estimated from our simulations and C SN is the shot-noise contribution. Both the clustering and shot-noise terms depend on the galaxy window, W g . As discussed previously, we assume a linear-biasing model and use the galaxy bias versus redshift from W16. The shot noise is computed as Ng , where N g is the total expected number of galaxies for the HLS survey in a given photometric redshift bin. Throughout this analysis, we assume a survey area for the HLS of 2200 deg 2 (Spergel et al. 2015), which translates to f sky = 0.053. We suppose here that the CMB survey covers the entire HLS field. We use results from the BlueTides simulations in W16 for the expected number of HLS galaxies and the corresponding linear-bias factors. Specifically, we adopt their simulated galaxy abundance and clustering for the case of a 5σ detection threshold. We take the geometric mean between the optimistic "intrinsic" brightness of the simulated galaxy samples and their "dust-corrected" galaxy abundance model. These values are broadly consistent with the predicted number of galaxies from Spergel et al. (2015), which are based off of measurements from Bradley et al. (2012) at z ∼ 8 and extrapolated to higher redshifts, assuming a model similar to that of Postman et al. (2012). Figure 8 shows the angular power spectrum of the galaxy distribution in our models as a function of redshift. An interesting feature here is that, perhaps contrary to naive expectations, the Roman HLS galaxy samples are not entirely swamped by shot noise, as emphasized earlier by W16. For instance, in the case shown, which adopts a photometric redshift bin of ∆z = 1, the shot noise is subdominant to the clustering term for 1000 even at z ∼ 10. This is partly a consequence of the linear-bias factors, which are expected to be huge for the Roman HLS galaxies, ranging from b g ∼ 9 at z = 6 to b g ∼ 20 at z = 12. Note also that the clustering term actually increases slightly toward high redshift, even though the matter density fluctuations are smoother at early times. This occurs because b g grows relatively steeply with redshift, while at the same time the comoving volume enclosed within a shell of fixed redshift width ∆z decreases. These effects tend to boost the fluctuations in the projected galaxy distribution toward high redshift and more than compensate for the smoother matter field. Nevertheless, as can be gleaned from the figure, at sufficiently high , shot noise dominates, especially in the highest-redshift bins. We explore the dependence of the overall S/N on the galaxy shot-noise contribution more below in Section 5.2.  (13), for a single redshift bin, as a function of the galaxy window function Wg parameters z0 and ∆z. This shows the result for CMB-S4 assuming the fiducial reionization scenario. The maximum response comes from a relatively wide window of ∆z ∼ 4.2, centered near z ∼ 9.2. Interestingly, the peak in S/N is at a slightly smaller redshift than the peak in the signal, owing primarily to the smaller galaxy shot-noise term at lower redshift. The corresponding S/N is 2.8σ. The dashed line in the upperleft part of the figure corresponds to combinations of z0 and ∆z that would include contributions from post-reionization (z ≤ 6), which are not included in our simulations. See Section 4 for further discussion. Figure 9 shows the predicted S/N for the case of CMB-S4 and the Roman HLS galaxies as a function of the galaxy window function parameters z 0 and ∆z. In Figure 4 we found that the cross-spectrum depends relatively little on the extent of the galaxy window, while the galaxy angular power spectrum in the cross-spectrum variance (i.e. in the denominator of Equation 13) decreases as the width of the window increases. Hence, the S/N for a single redshift bin is largest for a broad redshift window. In the CMB-S4 case, we find that the highest S/N is attained for a redshift bin of ∆z ∼ 4.2, centered on a redshift just a little bit lower than the redshift of the signal peak. In the case of our fiducial reionization history, and CMB-S4 noise properties, the total S/N reaches 2.8σ for a single, broad (∆z ∼ 4.2) redshift bin centered on z 0 ∼ 9.2. Given that our forecasts are inevitably idealized, this seems somewhat marginal if tantalizingly close to a detectable signal.
Note that given the wide parameter space of z 0 and ∆z considered, there are some combinations that include contributions from the post-reionization epoch (z ≤ 6). Figure 9 shows a dashed line, where combinations above this line would include such contributions. Rather than include the post-reionization epoch and potentially double-count the low-z portion of the signal, we truncate the window to have a minimum value of z = 6, both for the signal and noise components of the S/N. As such, the S/N in this portion of the figure may not adequately capture the interplay between the reionization-era kSZ and low-redshift kSZ signals. Nevertheless, this portion of parameter space may still be useful with real data, as looking at post-reionization redshift values can confirm that reionization is complete by a particular redshift.
Although this model is only borderline detectable with CMB-S4, the prospects are more promising for the case of Roman HLS and CMB-HD cross-correlations. Specifically, Figure 10 shows the S/N (summed over all modes) for different CMB surveys (each in combination with the HLS galaxies) and our three reionization history models. In each case, we consider a single redshift bin and choose the galaxy window parameters z 0 and ∆z that maximize the S/N. The first key conclusion here is that the cross-spectrum signal is detectable at high (13σ) significance in our fiducial model for CMB-HD. The signal in the short reionization history is somewhat less detectable (8σ) because of the more rapid rise and fall of the cross-spectrum in this scenario. The earlier reionization history is still less detectable (4σ): Even though the signal is stronger in this model (see Figure 7), the peak occurs at higher redshifts where the shot noise is larger. Note that the fiducial case is likely the most plausible of these models given current observational constraints on the reionization history. Hence, although these forecasts depend somewhat on the precise reionization model adopted, the prospects of detecting this signal look quite good for the case of CMB-HD. See Table 2 for the numerical values in each reionization history and experiment.
Next, Figure 10 further illustrates that most of the S/N comes from modes with 1000, with the cumulative S/N rising steeply between 500 1000, especially in the most detectable CMB-HD case. This cements our previous remarks that this statistic primarily measures squeezed-triangle configurations. Finally, one can see that strong detections appear a little out of reach for CMB-S4 (∼3σ) and SO (∼1σ) in our fiducial reionization history.

From Detection to Characterization
In the preceding discussion, we focused primarily on the detection of the overall signal, where we combined the total sensitivity across relatively wide ranges in redshift and summed over all modes. However, the projected sensitivity of CMB-HD is such that it becomes possible for us to move beyond a simple detection and potentially characterize the evolution of the signal as a function of redshift. Such an analysis would be useful for inferring the reionization history, as Figure 7 shows that the signal follows generic trends-at least across the three cases considered-with the volume-averaged ionization fraction of the universe. This is illustrated explicitly in Figure 11 which shows the projected CMB-HD × HLS error bars across several independent photometric redshift bins. Here we have binned the galaxy distribution into bins of width ∆z = 1 and calculate the S/N (Equation 13) in a few different ranges at each redshift. Specifically, we determine the S/N in three bins, 250 ≤ ≤ 750, 750 ≤ ≤ 1250, and 1500 ≤ ≤ 4500 centered on = {500, 1000, 3000}, respectively. As discussed previously, the expected photometric redshift accuracy is σ z ∼ 0.2-0.3 and so we should be able to safely measure the evolution across several independent ∆z = 1 redshift bins. Although the error bars are large in the two highest-redshift bins, the remaining results look promising for CMB-HD: One should be able to at least roughly detect the peak in the signal and the subsequent decline in the kSZ 2 -galaxy cross-power spectrum as the universe is progressively reionized. Note that the different nonoverlapping redshift bins here should be only weakly correlated with x HII Figure 11. Forecast observational errors for the case of CMB-HD. This is similar to Figure 5, except here we show the expected error bars for a real planned survey rather than uncertainties on the simulation measurements. In the case of a future measurement using the planned CMB-HD survey and Lyman-break-selected galaxies from the Roman HLS, the characteristic redshift evolution of the signal may be measured with moderate statistical significance (see further discussion in Section 4.1).
each other. 11 From the error forecasts in Figure 11, it is thus clear that CMB-HD can measure the overall evolution of this signal as a function of redshift, yielding important information regarding the timing and duration of reionization.

CMB Lensing Leakage
As discussed in previous works, e.g., F16, the kSZ 2galaxy cross-power spectrum estimates discussed here will be contaminated by CMB lensing. The lensing effects must be separated from the desired kSZ signal using its different angular dependence, which should be distinctive and clean to model. Here we calculate the lensing contamination for the very high-redshift application considered in the present work.
In order to investigate this, we determine the lensing leakage terms as described in Section B of Kusiak et al. (2021) (see their Equations (9)-(15)). We ignore the magnification bias terms considered in that work (their Section C), which require modeling the full luminosity 11 In the Gaussian and Limber approximations, the covariance between CMB 2 -galaxy cross-power spectra estimates in two redshift bins, i and j, with galaxy fluctuation fields g i and g j , is (2 +1)f sky . This formula assumes that the redshift bins are nonoverlapping. The covariance here is weak relative to the variance in the cross-spectrum estimates in each redshift bin.  Figure 12. A comparison between the cross-correlation signal of interest C kSZ 2 ×δg and the leakage due to CMB lensing ∆C T 2 ×δg . These results adopt the S4 filter and a galaxy window with z0 = 9.5 and ∆z = 1. Although the lensing leakage is stronger in amplitude than the kSZ 2 -galaxy crosspower spectrum signal, the dependence is very different, as in F16. It should therefore be possible to fit for both contributions without significantly increasing the errors on the kSZ 2 -galaxy cross-power spectrum.
function of the galaxy samples involved, although these would be interesting to consider in future studies. In calculating the leakage, we adopt a linear-biasing model and the linear theory matter power spectrum, which should be a good approximation for our high-redshift application here. As in previous work, the lensing leakage is denoted by ∆C T 2 ×δg .
The resulting leakage is shown in Figure 12, as compared to the cross-correlation signal of interest, C kSZ 2 ×δg . This is for the case of a ∆z = 1 galaxy redshift bin, centered at z 0 = 9.5, close to the peak redshift for the kSZ 2 -galaxy cross-power spectrum. The lensing leakage shows a distinctive anticorrelation at low and a positive correlation at higher , as discussed in F16. Although we are considering very high-redshift galaxy samples here, the CMB lensing leakage is still quite important. This occurs in part because the CMB lensing kernel is quite broad, and because although the matter distribution is relatively smooth at high redshift, the Roman HLS galaxies are highly biased tracers of the matter distribution. Indeed, the overall amplitude of the leakage is strongest around ∼ 1000-2000 where it reaches ( + 1)∆C T 2 ×δg /(2π) ∼ −0.12µK 2 . This is a factor of ∼ 5 larger in absolute value than the signal we seek to measure. Nevertheless, we believe this is not a big obstacle for our measurements. First, the leakage has a very different dependence than the signal we seek to measure. Therefore, one can hope to simultaneously fit for both components without significantly inflating the error bar on the kSZ 2 -galaxy cross-spectrum. Second, the lensing contribution should be robust to model, up to an overall bias factor which can be constrained independently from the auto-power spectrum of the galaxy distribution itself. Third, the kSZ 2 -galaxy cross-power spectrum evolves strongly with redshift and the patchy reionization contribution vanishes entirely once reionization completes, below z 5.5-6 or so. The lensing contribution evolves much less sharply in redshift, especially in its dependence. That is, the differing redshift evolution of the two contributions offers a further handle for distinguishing them. We will not quantify the prospects for joint fits further here: While it is important to account for the lensing leakage, it should not be a major impediment for measuring the kSZ 2 -galaxy cross-power spectrum. An alternate approach is to reconstruct the projected mass distribution using lensing-induced correlations between E-and B-mode polarization signals. This polarization-only estimate can be used to determine the lensing contribution to our kSZ 2 -galaxy statistic without contamination from the kSZ signal, which is intrinsically unpolarized (F16).

FURTHER IMPROVEMENTS
Although the CMB-HD results above appear promising, it is useful to consider possible directions for improving these measurements further. There are essentially three areas where there is room for improvement. First, the S/N forecasts are sensitive to residual foregrounds in the CMB maps, and hence further foreground mitigation may help. Second, a stronger detection is possible if a larger common region of the sky can be covered by both the galaxy and CMB surveys. Finally, a deeper galaxy survey would reduce the shot noise and boost the overall detection significance.

CMB Noise Models
In the calculations thus far, we have included estimates of the residual foreground contributions to the effective noise variance after harmonic-based ILC component separation. In general, the resulting variance is substantially larger than the purely instrument-based noise power spectrum (described by Equation 11). In order to quantify the impact of foregrounds, we consider here the idealized pure instrumental-noise-dominated limit in which residual foregrounds are absent.
The first and second columns of Table 2 show the cumulative S/N we forecast for the different experiments and noise models. We include both the post-ILC noise levels for each experiment, as well as the naive instrument-only noise. For each experiment, including only the instrumental noise (i.e., assuming the foreground contamination can be removed entirely) increases the S/N. Although the difference is relatively modest for SO, the gain is much more substantial for CMB-S4 and CMB-HD. The S/N increases roughly by a factor of 4 for CMB-S4 and by a factor of around 3 for CMB-HD. The larger boosts from CMB-S4 and CMB-HD arise because of the improved sensitivity and angular resolution of these experiments relative to SO. This larger improvement factor of about 4 for CMB-S4 compared to about 3 for CMB-HD owes to a greater difference in the power spectrum of the filtered CMB-squared map CT 2T 2 ,f at low-modes, where most of the S/N arises for each instrument. In the case of CMB-HD, the cumulative S/N forecasted in the noise-dominated limit is 33: This is an impressive detection, which would allow a fairly detailed characterization of the cross-spectrum signal along the lines of, yet better than, Figure 11. Although it will be hard to remove the foregrounds down to below the instrumental-noise limit, note that the high sensitivity and angular resolution of CMB-HD and CMB-S4 may allow additional mitigation steps beyond the ILC component separation considered here. In particular, bright infrared sources and radio galaxies may be identified directly in the CMB maps and masked. This would help to reduce the level of residual CIB and radio galaxy foregrounds in the maps, and push toward the noise-dominated sensitivity limit, although a detailed investigation is beyond the scope of this paper. In principle, the CMB surveys might allocate additional observing time to the HLS field to help with bright foreground source excision on the crosscorrelation field. Another futuristic possibility is to employ additional frequency channels to aid componentseparation efforts. Along these lines, we have also considered the prospects of measuring the cross-power spectrum with the proposed space-based Probe of Inflation and Cosmic Origins experiment (PICO; Hanany et al. 2019). Although it has 21 frequency channels, which help with the foreground separation, it has coarser angular resolution and we find that it will be unable to detect our cross-spectrum statistic.

Galaxy Survey Sky Coverage and Depth
The other potential axes for improving the S/N involve the sky coverage and survey depth of the galaxy survey. First, let us consider the sky coverage. The relevant sky area here is, of course, the common region of sky surveyed by both the galaxy survey and the CMB experiments. In practice, the CMB surveys plan to cover very large fractions of the sky (with coverage of f sky ∼ 0.4 or larger), and so the main limitation here is the coverage of the HLS. At fixed galaxy density, the S/N scales simply with the square root of f sky (Equation (13)), and so it is easy to quantify the potential benefit here. Specifically, the sky coverage of the HLS is f sky = 0.05, and so in the limiting (yet hard to achieve in practice) case that one could cover the entire sky at the HLS depth, the S/N boost would be a factor of √ 20.
Combined with CMB-HD-type sensitivity, the total S/N in this case would reach 50 for post-ILC noise or 150 in the instrumental-noise-dominated limit. In order to quantify the impact of shot noise, we can again take the idealized limit that only sample variance in the galaxy distribution contributes to the crossspectrum variance. In order to make this estimate, we assume that the clustering bias remains constant as the shot noise is reduced to zero. Note that at least in the linear-biasing-and sample-variance-dominated regime, Equation (13) is independent of b g and so our conclusions should be insensitive to this simplifying assumption. In this case, we find that the total S/N forecast with CMB-HD is 22 for the post-ILC case and 62 for the instrumental-noise limit. In the ultimate, full-sky, negligible foreground and shot-noise limit with CMB-HD level instrumental noise, the S/N would reach 280.
Although these simplified limiting cases are illustrative, a more nuanced question relates to whether one can improve the S/N at fixed total observing time for the galaxy survey. In other words, for the purposes of optimizing the S/N of the cross-spectrum considered here, would it be better to adopt a deeper or wider observing strategy for the HLS? A detailed exploration of this is beyond the scope of the present work, but here we simply note that at 1000 the galaxy abundance fluctuations are mainly dominated by clustering rather than shot noise (Figure 8). This suggests that a more optimal strategy would be to cover a wider region of the sky, less deeply, until the clustering and shot noise become comparable for the most important modes. This is unlikely, however, to boost the expected S/N substantially.

CONCLUSION
We have modeled the kSZ 2 -galaxy cross-power spectrum signal during the EoR. The usual CMB angular power spectrum approaches for detecting the patchy reionization kSZ signal require templates for both the patchy reionization and late-time kSZ signals. Here, the post-reionization kSZ effect only contributes to the cross-correlation noise and does not produce an average bias and so this method may allow one to directly extract patchy reionization contributions to the CMB anisotropies. Furthermore, we find that the signal -as measured in different photometric galaxy redshift bins -evolves in a distinctive way across the EoR. Specifically, the signal peaks at an amplitude of C kSZ 2 ×δg ∼ 0.02 µK 2 (after suitable filtering) for modes around 500 3000 when the volume-averaged ionization fraction is around x HII = 0.15-0.20.
These predictions rely somewhat on the seminumeric zreion model introduced in Battaglia et al. (2013b). Although these models produce ionization fields that are in reasonable agreement with more-detailed radiation hydrodynamical simulations of reionization, they should be further cross-checked for the statistics considered here.
We forecast that a 12σ detection of this signal is possible using future cross-correlation measurements with CMB-HD and Lyman-break galaxies from the Roman HLS. Furthermore, we expect to be able to detect the distinctive redshift evolution in our models at moderate statistical significance. In the case of CMB-S4, however, we find only a 2.7σ detection significance and a 1.4σ significance for SO, each in correlation with the Roman HLS sample. Although the prospects are marginal for these upcoming experiments, they may provide bounds on the signal strength and test for unanticipated possibilities, while providing a real-world test of the technique discussed here. Furthermore, it may be possible to extend and generalize the approach discussed here. Specifically, we model the kSZ 2 -galaxy cross-power spectrum in this work mainly because it has been successfully applied to real data in the post-reionization universe (Kusiak et al. 2021) and by virtue of its simplicity. However, it may be interesting to generalize this and calculate the full kSZ-kSZ-galaxy bispectrum for a broad range of triangle configurations. This may boost the detection sensitivities and allow measurements in advance of CMB-HD.