Cross-correlating Carbon Monoxide Line-intensity Maps with Spectroscopic and Photometric Galaxy Surveys

, , , , , , , , , , , , , , , , , , and

Published 2019 February 25 © 2019. The American Astronomical Society. All rights reserved.
, , Citation Dongwoo T. Chung et al 2019 ApJ 872 186 DOI 10.3847/1538-4357/ab0027

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/872/2/186

Abstract

Line-intensity mapping is an emerging field of observational work, with strong potential to fit into a larger effort to probe large-scale structure and small-scale astrophysical phenomena using multiple complementary tracers. Taking full advantage of such complementarity means, in part, undertaking line-intensity surveys with galaxy surveys in mind. We consider the potential for detection of a cross-correlation signal between COMAP and blind surveys based on photometric redshifts (as in COSMOS) or based on spectroscopic data (as with the HETDEX survey of Lyα emitters). We find that obtaining ${\sigma }_{z}/(1+z)\lesssim 0.003$ accuracy in redshifts and ≳10−4 sources per Mpc3 with spectroscopic redshift determination should enable a CO-galaxy cross spectrum detection significance at least twice that of the CO auto spectrum. Either a future targeted spectroscopic survey or a blind survey like HETDEX may be able to meet both of these requirements.

Export citation and abstract BibTeX RIS

1. Introduction

The technique of line-intensity mapping or intensity mapping (LIM or IM) images the aggregate emission in specific spectral lines from the galaxy population at large, rather than attempting to resolve individual galaxies. Line-intensity surveys thus trade understanding of individual galaxies for improved statistical insight into global astrophysics and cosmology within a significant survey volume. The 21 cm hydrogen line is one example of a line emitted commonly enough to be viable as a target—but other lines, such as carbon monoxide and ionized carbon lines, can be tied to molecular gas and star formation activity. Surveys in each of these lines have the potential to yield, for example, a greatly improved understanding of cosmic star formation and ionization histories. (See Kovetz et al. (2017) for a general overview of the theoretical and experimental landscape.)

While LIM is relatively new, with 21 cm detections at z ≲ 1 only arising within the past decade (Chang et al. 2010; Switzer et al. 2013; Anderson et al. 2018), searching for individual galaxies is a tried-and-true method of mapping the luminous matter beyond our own galaxy. Current and future galaxy surveys are massive undertakings in collecting and processing high-resolution optical and infrared (IR) imagery, extracting galaxy catalogs from this imagery, and calculating redshifts and other galaxy properties for each object. The resulting data represent a wealth of astrophysical and cosmological information that serve as important tests of our models of the early universe.

However, optical and infrared surveys cannot detect indefinitely faint galaxies. One of the deepest surveys currently public is the Hawk-I UDS and GOODS Survey (HUGS) (Fontana et al. 2014), which reaches AB magnitude limits of K ≃ 26–28 (5σ limit per 0.4 square arcseconds) throughout the 10 arcmin wide GOODS-South (Giavalisco et al. 2004) and 20 arcmin wide UKIDSS (Lawrence et al. 2007) Ultra Deep Survey fields. The depth of this imaging has allowed studies of z ≳ 4 galaxies with stellar mass functions measured down to as low as ${10}^{9}\,{M}_{\odot }$ (Grazian et al. 2015). While impressive, the scientific output of HUGS and other ultra-deep surveys is ultimately limited by their field size and therefore sample variance. Looking at shallower but wider fields, the COSMOS2015 catalog (Laigle et al. 2016) is complete down to Ks = 24.0 (AB magnitude, 3σ, 3'' aperture) over a square-degree-scale intersection of the COSMOS (Scoville et al. 2007) and UltraVISTA (McCracken et al. 2012) fields, corresponding to a 90% stellar mass completeness limit of 1010 M. These data cover an area several orders of magnitude beyond typical ultra-deep fields, but the correspondingly reduced depth and mass completeness may lead to missing a majority of the total cosmic star formation activity happening in galaxies below the COSMOS2015 catalog's stellar mass limit. (See Juneau et al. 2005 and Sobral et al. 2014 for studies at z ≲ 2 of contributions of galaxies of different stellar mass ranges to the global star formation rate (SFR).)

However, resolving and cataloging individual galaxies over COSMOS-scale fields with HUGS-level depth is challenging, with the cameras currently online. Considering the 10–30 hr exposure times per 70 square arcminute pointing used in HUGS, covering the 1.58 square degree (or 5688 square arcminute) area of the COSMOS2015 catalog with the same camera (Hawk-I, the High Acuity Wide-field K-band Imager, at the ESO VLT) to the same depth as the HUGS data would require ∼103 hr. A project requiring this amount of time is difficult to run on community instruments and would only output near-IR imagery, with further follow-up requiring more time on other instruments.

Such is the niche that line-intensity surveys aim to fill, by operating dedicated instruments to map line emission over galaxy survey fields to greater depths than conventional galaxy surveys. As previously mentioned, however, the increased depth is not necessarily accompanied by an understanding of each individual object emitting in the observed line—only a statistical understanding of the whole emitter population—and furthermore, it requires removal of significant foregrounds and systematics to meaningfully achieve.

Overall, the range of different trade-offs, systematics, advantages, and challenges in galaxy and line-intensity surveys means that the two techniques provide complementary views into the early universe, and could be even more powerful in coordination. This will only become truer with further developments in LIM, as well as in near-IR imaging technology and analysis. Work is already progressing on determining how to exploit cross-correlations both within LIM (as in Breysse & Rahman 2017) and between line-intensity and galaxy surveys (as in Wolz et al. 2017) to provide novel insights into star formation and galaxy evolution.

This leads into our own interest in prospects for cross-correlation between galaxy surveys and line-intensity surveys, which is specifically in the context of the Carbon monOxide Mapping Array Pathfinder (COMAP) (as explored in Li et al. 2016). The initial phase of COMAP targets the CO(1-0) line (rest frequency 115.27 GHz) at redshifts 2.4–3.4 over square-degree-scale patches. The patch size and redshift range are well-matched to a galaxy catalog like the COSMOS2015 catalog, leading to the question of whether a potential COMAP detection of CO could be augmented by cross-correlation with the COSMOS2015 data, or potentially even an independent spectroscopic follow-up.

We bring up the idea of spectroscopic follow-up specifically because galaxy surveys typically undertake wide-field photometric imaging followed by deeper, targeted spectroscopy of objects selected from the former. However, surveys operating outside of this paradigm are set to come online in the near-future. One example is the Hobby–Eberly Telescope Dark Energy Experiment (HETDEX) (Hill et al. 2008), a wide-field, blind spectroscopic survey and a possible platform for Lyα LIM (e.g., as considered in Fonseca et al. 2017). While the main product of the HETDEX survey will be a catalog of ∼106 Lyα emitters (LAEs) over ∼400 square degrees of sky, the locations of these LAEs are not predetermined. Rather, the survey footprint is blindly and sparsely sampled (with a fill factor of 1/4.5; see Section 2.3 for details) using the VIRUS spectrograph (Hill et al. 2014), with individual LAEs extracted from the resulting spectra. This places HETDEX at the intersection of conventional catalog-oriented surveys and blind line-intensity surveys, and potentially allows for generation of both LAE catalogs and Lyα intensity cubes from the same data. The redshift coverage of HETDEX (z = 1.9–3.5) is well-matched to that of COMAP, which naturally then leads also to the question of how detectable a COMAP–HETDEX cross-correlation would be, as well as how it would compare to a COMAP–COSMOS cross-correlation—using HETDEX not only as a conventional cataloging machine, but also as a line-intensity mapper.

We aim to answer the following questions.

  • 1.  
    What number of sources do we need for significant cross-correlation, in the case of a hypothetical spectroscopic follow-up to complement COMAP?
  • 2.  
    What redshift accuracy must the reference galaxy catalog achieve to enable significant cross-correlation?
  • 3.  
    What would be the detection significance of the various cross-power spectra under consideration?

The paper is structured as follows. In Section 2, we outline the different experimental methods that COMAP, COSMOS, and HETDEX use to survey galaxies, then we introduce our methods for simulating CO, galaxy, and Lyα observations in Section 3. We present expected cross-correlation results in Section 4. After some discussion of these results and their implications for COMAP in Section 5, we present our conclusions in Section 6.

Where necessary, we assume base-10 logarithms, and a ΛCDM cosmology with parameters Ωm = 0.286, ΩΛ = 0.714, Ωb = 0.047, ${H}_{0}=100h$ km s−1 Mpc−1 with h = 0.7, σ8 = 0.82, and ns = 0.96, broadly consistent with nine-year WMAP results (Hinshaw et al. 2013). Distances carry an implicit h−1 dependence throughout, which propagates through masses (all based on virial halo masses, proportional to h−1) and volume densities (∝h3).

2. Context: Experimental Methods

2.1. CO LIM: COMAP

Table 1 describes the current anticipated parameters for the initial phase (or Phase I) of COMAP. The receiver is currently undergoing commissioning at the Owens Valley Radio Observatory (OVRO) in California, where we expect the Phase I instrument to undertake a two-year observing campaign.

Table 1.  COMAP Instrumental and Survey Parameters Assumed for This Work

Parameter Value
System temperature 40 K
Angular resolution 4'
Frequency resolution 15.625 MHz
Observed frequencies 26–30 GHz; 30–34 GHz
Number of feeds 19
Survey area per patch ∼2.5 deg2
On-sky time per patch 1500 hr

Note. Feeds are single-polarization. The survey observes frequencies of 26–34 GHz with two separate backend systems, each covering a 4 GHz band in that range. The angular resolution above is the full width at half maximum of the Gaussian beam profile, for the receiver's central pixel. We simulate only one patch, though we expect to observe more than one—at least, for CO autocorrelation. A patch with 8.6% observing efficiency could expect 1500 hr of integration time in two years, compared to typical values of ∼10% for fields close to the celestial equator (conditioning observability from the COMAP site on solar altitudes below −10°, field altitudes above 30°, and elongations greater than 30° from the Moon).

Download table as:  ASCIITypeset image

Table 2.  HETDEX Instrumental and Survey Parameters Assumed for This Work

Parameter Value
On-sky area per fiber 1.8 arcsec2
Resolving power 700
Observed wavelengths 350–550 nm
(or frequencies) (857–545 THz)
Fill factor 1/4 (1 in SHELA field)
Line sensitivity × 10−17 erg s−1 cm−2

Note. Per Hill & HETDEX Consortium (2016), the line sensitivity estimate is based on integrating 20 minutes per shot with three dithered exposures of 180 s, and the actual fill factor outside the SHELA field is closer to 1/4.5 in reality.

Download table as:  ASCIITypeset image

While a wide range of predictions exist for the CO power spectrum at z ∼ 3 (Righi et al. 2008; Visbal & Loeb 2010; Pullen et al. 2013; Breysse et al. 2014; Li et al. 2016; Padmanabhan 2018), the sensitivity calculations made in Li et al. (2016)—given their fiducial model—place COMAP Phase I squarely in a regime where instrumental noise dominates over sample variance, which is still true after various changes to COMAP parameters made since the writing of Li et al. (2016). This dictates the optimal observing strategy to some extent, pushing COMAP toward surveying, at most, several small fields (as close as possible to the ∼1 deg2 field of view) with maximum observing efficiency. In a 2D analysis assuming a total on-sky time of one year (∼9000 hr) split across four patches (for ∼2200 hr per patch), Breysse et al. (2014) found that a survey footprint of four patches with almost 4 deg2 per patch would maximize the total signal-to-noise ratio (S/N). If the optimal area scales linearly with on-sky time, the fiducial area per patch of 2.5 deg2 is close to ideal for a survey time of 1500 hr per patch as assumed in this work.

2.2. Conventional Galaxy Survey: COSMOS2015

Conventional galaxy surveys are a natural target for cross-correlation with LIM, and the successful detection of 21 cm line emission from galaxies at z ∼ 1 comes from cross-correlation with spectroscopic galaxy surveys (Chang et al. 2010; Switzer et al. 2013). However, spectroscopic data are currently limited in depth and abundance at z ∼ 3, so we look to existing public photometric data sets.

The COSMOS2015 catalog contains half a million galaxies observed in $1\lt z\lt 6$ across 1.58 square degrees of sky near the celestial equator. The catalog is Ks-selected (the Ks band being at 2.2 mm), and as mentioned previously, the completeness limit is Ks = 24.0. The Ks magnitude correlates well with stellar mass up to z ∼ 4 (and magnitudes in longer-wavelength bands may be used at higher redshifts; see, e.g., Davidzon et al. 2017). The redshift distribution skews largely toward lower redshift, but the source abundances are still relatively high for the redshift range relevant to COMAP, within which we find just under 20,000 sources over 1.58 square degrees (of which 0.2 square degrees are masked due to saturated pixels) with Ks ≤ 24.

The critical limiting factor of the COSMOS2015 catalog for studies of 3D large-scale structure is the redshift accuracy. Laigle et al. (2016) quote photometric redshift errors at 3 < z < 6 to be ${\sigma }_{z}=0.021(1+z)$, with some fraction of catastrophic failures; certain subsets even reach ${\sigma }_{z}\,\lesssim 0.01(1+z)$. However, Davidzon et al. (2017) suggest that the error is higher for z ≳ 3 galaxies, and is closer to ${\sigma }_{z}=0.03(1+z)$.

Deep low- to medium-resolution spectroscopic follow-up exists in the COSMOS field, but the surveys either do not satisfactorily cover z > 2 or are limited in area. A recent catalog of ten thousand objects selected across the COSMOS field (Hasinger et al. 2018) only contains ∼102 objects in the redshift range of interest to COMAP, with the majority of spectroscopic redshifts well below (or above) that range. Meanwhile, the VIMOS Ultra-Deep Survey (VUDS) (Le Fèvre et al. 2015) reports one of the largest z > 2 emission-line galaxy samples, with ∼2800 spectroscopic redshifts at z = 2.5–3.5 down to iAB ≃ 25 over a square degree of sky. However, the initial data release (the only public data release, at time of writing; see Tasca et al. 2017) covers less than 10% of this area, and even the full square degree is split across three patches covering ∼103 square arcminutes each, including half a square degree of the COSMOS field, or around a third of the COSMOS2015 coverage (and a fifth of the expected area per COMAP patch). Such surveys are well-suited for measuring stellar mass functions and average spectral properties, but the areas covered are less than ideal for cross-correlation against line-intensity maps. Uncertainty in the resulting cross spectra is roughly proportional to the inverse square root of the survey volume, so factors of 3–5 in sky area can noticeably affect detection significance.

2.3. Blind Spectroscopic Survey: HETDEX

The central stated goal of the HETDEX survey is constraining the expansion history of the universe, and specifically detecting dark energy at 3σ significance, by identifying ∼106 LAEs through a wide-field spectroscopic survey (Hill et al. 2008; Hill & HETDEX Consortium 2016). However, we note a few interesting differences between HETDEX and previous conventional spectroscopic galaxy surveys measuring dark energy, e.g., the Dark Energy Spectroscopic Instrument (DESI) (DESI Collaboration et al. 2016), the SDSS-IV Extended Baryon Oscillation Spectroscopic Survey (eBOSS) (Dawson et al. 2016), and WiggleZ (Drinkwater et al. 2018).

  • 1.  
    HETDEX targets a redshift range of z = 1.9–3.5,10 well beyond the typical redshifts of z ∼ 1 of other dark-energy-centric optical and NIR surveys. (eBOSS and DESI will target quasars, and thus the Ly-α forest at z ≳ 2, but emission line galaxies only up to z ∼ 2.)
  • 2.  
    HETDEX does not target specific points on the sky based on prior imaging, but rather samples its survey footprint with integral field spectroscopy, integrating for ∼20 minutes at each spot in the sky, and picks sources out from the noisy spectra.

The first point is of interest to us because, unlike many other dark-energy-centric surveys, future HETDEX detections will fall squarely in a redshift range relevant to COMAP Phase I. The second point is of interest because, in principle, the noisy spectra uniformly sampled across the survey footprint could be processed and analyzed as a line-intensity cube.

Due to the survey and instrument design of HETDEX, the survey footprint will not be completely filled in with data. Rather, the integral field unit (IFU) arrangement of the VIRUS instrument covers only 1/4.5 of the area of each 20'-diameter "shot," and while some dithering (in three exposures) will fill in areas between fibres, no attempt will be made to fill in the IFU spacing of 100'', given a greater need for survey volume than for capture of small scales. The patch used for the Spitzer/HETDEX Exploratory Large Area (SHELA) survey (Papovich et al. 2016) is an exception, and the area between IFUs will be filled in for this field only (Hill & HETDEX Consortium 2016). At 13 × 2 square degrees, the SHELA patch could entirely contain a single (appropriately oriented) COMAP patch.

HETDEX is subject to interloper emission, detecting ∼106 galaxies from z < 0.5 emitting in the [O ii] doublet (Hill et al. 2008). When extracting individual emitters from LAE survey spectra, imposing a minimum equivalent width cutoff removes the low-z emitters (Cowie & Hu 1998; Adams et al. 2011), and more sophisticated classification using Bayesian methods may also recover more of the underlying LAE sample (Leung et al. 2017). While we find no literature discussing foreground removal strategies in the context of LIM with HETDEX, such literature does exist in the context of [C ii] observations (Cheng et al. 2016; Lidz & Taylor 2016; Sun et al. 2018) and some strategies may be applicable beyond their original context. Furthermore, the low-z [O ii] emission will have no corresponding component in COMAP data, potentially leading to its amelioration in COMAP–HETDEX line-intensity cross-correlation.

To conclude this section, Table 3 shows a summary of the coverage and redshift precision of all surveys discussed above. For simulation purposes, we will expand or truncate coverage as necessary to match the COMAP coverage, as explained in Section 3.3.

Table 3.  A Summary of Survey Coverage and Redshift Precision for all Experiments Considered Above

Experiment Field size Range of z ${\sigma }_{z}/(1+z)$ Source Selection Source Count
or catalog (deg2)       (per deg2 per Δ z = 1)
COMAP 2.5 2.4–3.4 ∼1/2000 none (surveys aggregate CO emission) ...
COSMOS2015 1.58 1–6 ∼0.02 Ks-band magnitude ≲24.0 ∼13000
HETDEX 300 + 150 1.9–3.5 ∼1/700 Lyα luminosity ≳3 × 1042 erg s−1 ∼1400
(in SHELA) (26) ... ... ... ∼6000

Note. The COMAP survey footprint will comprise two or more patches of 2.5 square degrees each; the HETDEX survey footprint includes a 300 deg2 "Spring" field and a 150 deg2 "Fall" field, with possible 50–60% extensions to each. The "HETDEX in SHELA" row describes HETDEX full-fill coverage of the 13 × 2 deg2 SHELA field.

Download table as:  ASCIITypeset image

3. Simulation Methods

We simulate all surveys under consideration using halo catalogs derived from a dark matter simulation. We describe the dark matter simulation in Section 3.1, the models of various halo properties in Section 3.2, and then the mocks of CO, Lyα, and conventional galaxy survey data using these properties in Section 3.3. Finally, we outline the calculation of auto and cross power spectra from these data in Section 3.4, and the calculation of sensitivity estimates with respect to those spectra in Section 3.5.

3.1. Dark Matter Simulation

We use a cosmological N-body simulation as the basis for our simulations. In particular, we use the c400-2048 box, which is part of the Chinchilla suite of dark-matter-only simulations. Li et al. (2016), who used the same simulation, provide implementation details of the simulation and subsequent halo identification. The simulation spans 400h−1 Mpc on each side, and has a dark matter particle mass of $5.9\times {10}^{8}{h}^{-1}\,{M}_{\odot };$ we include dark matter halos more massive than ${M}_{\mathrm{vir}}={10}^{10}\,{M}_{\odot }$ in our analysis, meaning that we assume halos with lower virial halo mass are not massive enough to host galaxies with substantial star formation activity. (Li et al. 2016 justify the same choice of cutoff mass for CO simulations in their Appendix A; we consider its effect on Lyα simulations in our Appendix A alongside other details of Lyα modeling.)

To simulate galaxies in our field of observation, we use dark matter halos identified in "lightcone" volumes, enclosing all halos within a given sky area and redshift range, with each lightcone based on arbitrary choices of observer origin and direction within the cosmological simulation. We use 100 lightcones spanning z = 1.5–3.5 and a flat-sky area of 100' × 100', each populated with ∼106–107 halos. These lightcones form the basis for our simulated observations in z = 2.4–3.4. Note that this redshift range spans approximately 1 Gpc, so some line-of-sight repetition of the N-body data will occur, as we exploit the periodic boundary conditions of the simulation box—which is only $400{h}^{-1}\approx 570$ Mpc along each side—to extend the lightcone beyond the actual simulated comoving volume. However, the lightcone extents are not so much greater than the simulation volume that we expect this periodicity to impact the results of our study.

3.2. Deriving Halo Properties

We derive CO and Lyα line luminosities for each of the halos (above the 1010 M cutoff mass) from their virial masses and redshifts, with Figure 1 showing the average halo mass–line luminosity relations at z = 2.8 (the COMAP mid-band redshift). In addition, we calculate stellar masses (for the galaxy survey selection) and SFRs (as an intermediate property for the line luminosities) for each halo. Below, we explain the derivation of each of these properties.

Figure 1.

Figure 1. Mean relation at redshift 2.8 between halo mass and line luminosity for CO(1-0) and Lyα emission. The shaded area around each mean curve indicates the 1σ log-scatter of the log-normal distribution at each halo mass.

Standard image High-resolution image

Stellar mass—We assign a stellar mass M* to each halo using the best-fit stellar mass–halo mass relation from Behroozi et al. (2013a, 2013b). We apply the mean relation and the redshift-dependent scatter from the model, which is ≈0.23 dex at the redshifts considered here. The stellar mass is a property in itself, and unlike the SFR, it does not influence other properties.

Star formation rate—We convert halo masses to SFRs for each halo via interpolation of data from Behroozi et al. (2013a, 2013b). The main focus of these papers is to constrain the stellar mass–halo mass relation and derived quantities by comparing simulation data with observational constraints, and the resulting data include the average SFR in a halo given its mass and redshift.

We approximate halo-to-halo scatter in SFR by adding 0.3 dex log-normal scatter to the SFR obtained above, preserving the linear mean. The assumption of 0.3 dex scatter is reasonable, given the 0.2–0.4 true or intrinsic scatter observed in the SFR–stellar mass relation (Speagle et al. 2014; Salmon et al. 2015), combined with the tight ∼0.2 dex scatter in the stellar mass–halo mass relation of Behroozi et al. (2013a).

We also re-express SFR as infrared (IR) luminosity, using a known tight correlation:

Equation (1)

As in Behroozi et al. (2013a) and Li et al. (2016), we assume a Chabrier initial mass function (IMF) (Chabrier 2003).

CO luminosity—We convert between IR luminosity and observed CO luminosity through power-law fits to observed data, commonly given in the literature:

Equation (2)

where for our fiducial model, we take α = 1.37 and β = −1.74 from a fit to high-redshift galaxy data (z ≳ 1) given in Carilli & Walter (2013), following Li et al. (2016).

In this context, ${L}_{\mathrm{CO}}^{{\prime} }$ (or indeed, any ${L}_{\mathrm{line}}^{{\prime} }$) is the observed luminosity (or velocity- and area-integrated brightness temperature) of the halo, which we convert into an intrinsic luminosity for each halo, as in Li et al. (2016):

Equation (3)

We also add 0.3 dex log-normal scatter in CO luminosity, again preserving the linear mean. We model this scatter as completely independent of the scatter in SFR. If we were simulating CO emission alone, as in Li et al. (2016), we could end up with the same log-normal distribution width with a total scatter of ${\sigma }_{\mathrm{tot}}={({\sigma }_{\mathrm{SFR}}^{2}/{\alpha }^{2}+{\sigma }_{{L}_{\mathrm{CO}}}^{2})}^{1/2}=0.37$ (with all σ in units of dex) on top of the mean SFR–LCO relation. However, unlike in Li et al. (2016), the scattered SFR for a given halo informs both CO and Lyα line luminosities. Furthermore, our approach to scatter is subtly different from adding log-normal scatter to CO luminosity for a given halo mass and redshift; see Appendix B for details.

Note that, while the scatter exhibited is representative of the amount of scatter seen in the high-redshift galaxy data obtained via Carilli & Walter (2013), it may not be representative of the galaxy population at large that COMAP will study, and a larger scatter in CO luminosity than we have assumed here may reduce our ability to cross-correlate CO against galaxy catalogs. However, this means that, were COMAP to indeed confidently detect CO while finding little in the way of CO-galaxy cross-correlation, that in itself would lead to interesting insights about how stochastic CO is in these high-redshift galaxies (or at least, those galaxies with sufficiently high mass and luminosity to be cataloged in the cross-correlation sample).

Lyα luminosity—Before dust absorption and other attenuation mechanisms, the Lyα line is 8.7 times stronger (for case B recombination) than Hα, a frequently chosen emission-line tracer of star formation activity (Kennicutt & Evans 2012). We use an intrinsic Lyα–SFR calibration (via Hα) with an escape fraction encapsulating possible attenuation of the intrinsic Lyα luminosity. We outline the specifics behind our model in Appendix A, the end result of which is that, for a given halo, we calculate

Equation (4)

with

Equation (5)

The escape fraction fesc increases with lower SFR and higher redshift, so as to allow this model to match observed LAE luminosity functions (LFs) (including those from Sobral et al. (2017) and Gronwall et al. (2007); see Appendix A for more references) in the redshift range of interest (z ∼ 2–4).

As with the CO luminosity, we add 0.3 dex log-normal scatter in Lyα luminosity. Due to the non–power-law nature of the SFR–LLyα relation, the net scatter in LLyα varies non-monotonically with halo mass (between 0.31 and 0.42 dex in the mean relation shown in Figure 1 at z = 2.8).

Note that our model is tuned to LAE observations and assigns luminosities only at the centers of dark matter halos; thus, it does not account for the finer details of Lyα radiative transfer beyond each LAE, per se—which would result, for example, in diffuse Lyα halos or blobs (Steidel et al. 2011). Incorporating such details would introduce additional components to the Lyα emission, which we discuss in more detail in Appendix C. The signal as simulated here could be modified by these components in ways that may be detectable in HETDEX data through LIM, but not necessarily through LAE identification. All of this suggests the need to study implications of diffuse Lyα emission for HETDEX and cross-correlation with COMAP, but we leave this for future work.

3.3. Mock Surveys

After the processing outlined in the last section, each halo has a sky position, redshift (excluding peculiar velocities, which have minimal effects on the results of this work11 ), virial halo mass, stellar mass, CO luminosity, and Lyα luminosity. We now use these properties to simulate survey data for COMAP Phase I, a COSMOS2015-like mass-selected galaxy catalog, and a HETDEX-like Lyα survey.

For ease of analysis, all survey cubes are generated with the same grid of voxels, based on the COMAP observation. The angular extent of each voxel is ${\delta }_{x}={\delta }_{y}=0\buildrel{\,\prime}\over{.} 4$ or 1.16 × 10−4 rad in each direction (oversampling the COMAP beamwidth by a factor of 10 and the on-sky VIRUS IFU width by a factor of 2.1), and each voxel spans ${\delta }_{\nu }=15.625\,\mathrm{MHz}$ in COMAP frequency (equivalent to 335 GHz in HETDEX frequency) unless otherwise specified. The simulated cubes span 100' × 100' and a continuous 26–34 GHz band in COMAP frequency (557–729 THz in HETDEX frequency), enclosing a total comoving volume of 190 × 190 × 1000 = 3.6 × 107 Mpc3.

3.3.1. CO Intensity Data

We follow Li et al. (2016) again in generating a temperature cube, taking the same set of steps:

  • 1.  
    Bin the halo luminosities into resolution elements in frequency and angular position, resulting in a certain luminosity Lline,vox for each voxel that is simply the cumulative line luminosity of all halos in that voxel.
  • 2.  
    Convert these luminosities into surface brightness (apparent spectral intensity, in units of luminosity per unit area, per unit frequency, per unit solid angle):
    Equation (6)
    where DL is the luminosity distance to that voxel.
  • 3.  
    Convert to the expected brightness temperature contribution from each voxel. The Rayleigh–Jeans brightness temperature for a given surface brightness is
    Equation (7)
    from which we obtain our temperature TCO(x) at each voxel position x in the data cube.

3.3.2. Galaxy Overdensity Field

We devise an ideal NIR-selected galaxy survey tracing galaxies down to a certain stellar mass limit. We claim that the galaxy–halo connection allows us to model this, starting with a catalog of halos and imposing stellar mass cuts corresponding to realistic magnitude limits.

Mass Completeness—To crudely simulate the Ks magnitude cut used by catalogs like COSMOS2015, we assume the Ks magnitude correlates reliably with the stellar mass in our redshift range, an assumption that Laigle et al. (2016) support—at least in relating completeness limits for the two quantities. Each step down in magnitude is a factor of 100.4 up in brightness, so a constant mass-to-light ratio would result in the same factor up in stellar mass. Laigle et al. (2016) find their ${K}_{s,\mathrm{lim}}=24.0$ limit to be equivalent to a stellar mass completeness limit of ${M}_{* ,\mathrm{lim}}={10}^{10}\,{M}_{\odot }$ in our redshift range. We extrapolate this to different completeness limits with the following relation:

Equation (8)

The limits of ${K}_{s,\mathrm{lim}}=(25.0,24.0,23.0,22.0)$ are then equal to $\mathrm{log}({M}_{* }/{M}_{\odot })=(9.6,10.0,10.4,10.8)$. We use these stellar mass cuts to select our mock galaxy survey sample in each lightcone. Of these, the $\mathrm{log}({M}_{* }/{M}_{\odot })\gt 10.0$ cut matches the COSMOS2015 source abundance within a factor of order unity, so we take this as our fiducial M* cut.

Redshift accuracy—To simulate uncertainty in redshifts derived from imagery, we apply different levels of scatter in observed redshift relative to the true cosmological redshift. While we do simulate cross-correlations against a survey with perfect galaxy redshift knowledge, not even spectroscopic surveys have such information. Therefore, we simulate normal scatter of redshifts with ${\sigma }_{z}/(1+z)=0.0007$, 0.003, 0.01, 0.02, and 0.03. The first scenario meets the minimum redshift accuracy required for cosmological applications of the Subaru Prime Focus Spectrograph (PFS) (Takada et al. 2014). The second scenario corresponds to lower-resolution spectroscopy, as seen in HST grism surveys like 3D-HST (${\sigma }_{z}/(1+z)=0.003$) (Momcheva et al. 2016), prism surveys like PRIMUS (${\sigma }_{z}/(1+z)=0.005$) (Coil et al. 2011; Cool et al. 2013), or even narrow-band photometric surveys like the PAU Survey (${\sigma }_{z}/(1+z)=0.0037$) (Eriksen et al. 2019). The last three scenarios represent optimistic, fiducial, and pessimistic expectations for photometric redshift accuracy, based on the discussion in Section 2.2.

After applying the stellar mass cut and redshift scatter (if applicable), we calculate the galaxy overdensity across the voxel grid. We count the number of galaxies ${N}_{\mathrm{gal},\mathrm{vox}}({\boldsymbol{x}})$ in each voxel, divide by the comoving volume of the voxel to get the number density ${n}_{\mathrm{gal},\mathrm{vox}}={N}_{\mathrm{gal},\mathrm{vox}}/{V}_{\mathrm{vox}}$. The quantity we deal with then is normalized by the average number density ${\bar{n}}_{\mathrm{gal}}$ across all voxels observed:

Equation (9)

3.3.3. Lyα Survey Simulation

We simulate two data products for the Lyα survey: an LAE overdensity cube, and a Lyα line-intensity cube. This is in view of our earlier statement in Section 2.3 that, while the primary data product from HETDEX will be a catalog of high-redshift LAEs, the collection of spectra across the survey footprint could be treated and analyzed as a Lyα line-intensity data cube.

We calculate the relative LAE overdensity ${\delta }_{\mathrm{LAE},\mathrm{vox}}$ for each voxel in much the same way as in the galaxy survey cubes, except our selection criterion is now the Lyα luminosity of each halo rather than the stellar mass. The HETDEX Pilot Survey (Adams et al. 2011; Blanc et al. 2011) reported luminosity limits of (3–6) × 1042 erg s−1 with 5σ line flux sensitivities of 5 × 10−17 erg s−1 cm−2, which is not far from the goal for the final survey shown in Table 2, so we set luminosity cuts at (3 × 1042, 6 × 1042) erg s−1.

The Lyα line-intensity cube is generated in much the same way as the CO temperature cube, but rather than converting the observed intensity to a brightness temperature (which is no longer applicable for observations in optical bands), we work with the intensity per unit log-frequency interval ${\nu }_{{\rm{L}}y\alpha }{I}_{\nu ,\mathrm{Ly}\alpha }$, in units of erg s−1 cm−2.

VIRUS is expected to have a resolving power of R ∼ 700 (Hill et al. 2014), and the Lyα emission in LAEs from the HETDEX Pilot Survey has been observed with velocity offsets of several hundred km s−1 relative to the galaxy systemic redshifts (Chonis et al. 2013). We translate all of this to an expectation of redshift precision of ${\sigma }_{z}/(1+z)\approx 0.0015$, mostly based on $1/R\approx 0.0014$ but adding on possible velocity offsets of the Lyα line relative to the rest of the galaxy (which will result in residuals even after the subtraction of an average velocity offset). We simulate normal scatter with this error in the LAE redshifts when calculating the LAE overdensity. When simulating the Lyα intensity cube, we do not apply this scatter; however, we do account for the attenuation from spectral resolution when calculating the cross spectrum and its detection significance.

Sparse sampling—As previously mentioned, the great majority of the HETDEX survey footprint will be sampled sparsely. To emulate this in our simulations, we leave regions of two pixels by two pixels unmasked, each separated by two masked pixels. This results in a fill factor of 1/4 with a regular pattern of 48'' × 48'' squares with centers spaced apart by 96'', approximating both the fill factor and the IFU on-sky spacing that will show up in HETDEX data. While the actual IFU arrangement and shot tiling is more complex and results in additional biasing of the P(k) measurement, this serves as a first pass at simulating the effect of sparse sampling on both auto and cross spectra, at a level sufficient for this work. A detailed analysis from Chiang et al. (2013) shows that resulting measurement biases for more complex sparse sampling scenarios (versus perfect tiling as simulated in this work) are within 10% up to scales of $0.5h$ Mpc−1.

3.4. Simulated Auto and Cross Spectra

We have now defined a grid of voxels and four quantities associated with each voxel: the CO brightness temperature ${T}_{\mathrm{CO}}$, the mass-selected galaxy overdensity ${\delta }_{\mathrm{gal}}$ (for four different mass cuts), the Lyα spectral intensity per log-frequency interval ${\nu }_{{\rm{L}}y\alpha }{I}_{\nu ,\mathrm{Ly}\alpha }$, and the luminosity-selected LAE overdensity ${\delta }_{\mathrm{LAE}}$ (for two different luminosity cuts). Following previous works (Visbal & Loeb 2010; Li et al. 2016), we use Fourier estimators of the auto and cross power spectra. If $\tilde{A}({\boldsymbol{k}})$ and $\tilde{B}({\boldsymbol{k}})$ are the Fourier transforms of the fields $A({\boldsymbol{x}})$ and $B({\boldsymbol{x}})$, then the full 3D auto spectra are

Equation (10)

and the full 3D cross spectrum is

Equation (11)

Because the fields are defined on a grid of voxels at discrete values of ${\boldsymbol{x}}$, the Fourier transforms and full 3D spectra are also defined at discrete ${\boldsymbol{k}}$.

With the assumption of isotropy, we then spherically average the power spectra in shells of $k=| {\boldsymbol{k}}| $ of width ${\rm{\Delta }}k=0.035$ Mpc−1, each containing some number of modes Nmodes(k) for which $k-{\rm{\Delta }}k/2\lt | {\boldsymbol{k}}| \lt k+{\rm{\Delta }}k/2$, to obtain the spherically averaged 3D power spectra PA(k), PB(k), and ${P}_{{\rm{A}}\times {\rm{B}}}(k)$. Here, we take $A({\boldsymbol{x}})={T}_{\mathrm{CO}}$, while $B({\boldsymbol{x}})$ can be any of the other three fields defined above.

Broadly speaking, we can consider each of the auto and cross P(k) to be the sum of a clustering component that dominates at low k and a constant shot-noise term that dominates at high k. The clustering component traces the matter power spectrum with some bias associated with the quantity being observed, while the shot-noise component arises from Poisson fluctuations. Cross shot noise can have interesting interpretations explored in previous work: Wolz et al. (2017) show that H i-galaxy cross shot noise may be used to infer the H i content of the cross-correlated galaxies, and Breysse & Rahman (2017) show that 12CO–13CO cross shot noise within COMAP may be used to learn about 12CO–13CO isotopologue ratios and 12CO saturation. One could extend the latter idea to cross-correlate between two separate line-intensity surveys, e.g., between COMAP and HETDEX, in order to learn about the molecular fraction of Lyα emitters (using CO as a proxy for molecular gas). However, as we will see, such astrophysical interpretation of the cross shot noise must account for attenuation both from sparse sampling, as seen in HETDEX, and from redshift errors in all galaxy samples. We leave a detailed investigation into effects of such attenuation on astrophysical inferences for future work.

3.5. Sensitivity Estimates

For our purposes, we take the sources of uncertainty to be sample variance, thermal noise in the CO temperature or Lyα intensity field, and shot noise in the galaxy density field.12 From Visbal & Loeb (2010),

Equation (12)

where the "total" power spectra include noise, interloper emission, and other components not necessarily correlated to the tracer. We ignore those components in this work, apart from instrumental noise:

Equation (13)

Equation (14)

Therefore, it is best to treat the signal-to-noise estimates given in this work as upper bounds on what the actual surveys may ultimately achieve. In general, the instrumental (thermal) noise power spectrum is given by the root-mean-square temperature or intensity fluctuation per voxel ${\sigma }_{n}$ and the comoving voxel volume ${V}_{\mathrm{vox}}$; it is assumed to be pure white noise, and thus constant across all k (Lidz et al. 2011):

Equation (15)

(This is analogous to the inverse of the weight per solid angle $w={({\sigma }_{\mathrm{pix}}^{2}{{\rm{\Omega }}}_{\mathrm{pix}})}^{-1}$ in the calculation of uncertainties in 2D ${C}_{{\ell }}$ analysis from Knox 1995.) We then need only calculate ${\sigma }_{n,\mathrm{COMAP}}$ and ${\sigma }_{n,\mathrm{HETDEX}}$ based on the expected instrumental and survey parameters.

Calculation of the COMAP instrumental noise follows the same procedure outlined in Appendix C of Li et al. (2016), using the parameters in Table 1. Specifically, ${\sigma }_{n,\mathrm{COMAP}}$ derives from the system temperature ${T}_{\mathrm{sys}}=40$ K, the number of feeds ${N}_{\mathrm{feeds}}=19$, the frequency resolution ${\delta }_{\nu }=15.625\,\mathrm{MHz}$, and the survey time per pixel ${\tau }_{\mathrm{pix}}=(1500\mathrm{hr})\cdot {\delta }_{x}{\delta }_{y}/(2.5{\deg }^{2})$ (being simply the total survey time per patch divided by the number of pixels per patch):

Equation (16)

We simulate and assume only one patch of 2.5 deg2, so the signal-to-noise estimates are also given per patch. If we fix the on-sky time per patch and the solid angle per patch, uncertainties from COMAP instrumental noise (which we expect to dominate total uncertainties) will decrease as the square root of the number of patches (from the linear increase in the number of modes averaged to obtain our best P(k) estimate).

For the HETDEX instrumental noise, we refer to the sensitivity metrics given in Hill et al. (2014). Each VIRUS fiber covers a solid angle of 1.8 square arcseconds at one time, and a dither pattern of three exposures allows the area within each IFU to be completely covered. The line sensitivity expected from each 20 minute shot is ≲4 × 10−17 erg s−1 cm−2, and dividing by the 5.4 square arcsecond solid angle per dithered fiber gives ${\sigma }_{n,\mathrm{HETDEX}}\,=3.15\times {10}^{-7}$ erg s−1 cm−2 sr−1. Thus, we now have ${P}_{n,\mathrm{COMAP}}\,={\sigma }_{n,\mathrm{COMAP}}^{2}{V}_{\mathrm{vox}}$ and ${P}_{n,\mathrm{HETDEX}}={\sigma }_{n,\mathrm{HETDEX}}^{2}{V}_{\mathrm{vox}}$. For COMAP, the dependence on the voxel size (simulated or otherwise) cancels out because ${\sigma }_{n}^{2}$ is proportional to the inverse of the frequency bandwidth per channel as well as the inverse of the solid angle per pixel (via ${\tau }_{\mathrm{pix}}$). For HETDEX, we choose ${V}_{\mathrm{vox}}$ in the context of ${P}_{n,\mathrm{HETDEX}}$ to correspond to the 5.4 square arcsecond solid angle per dithered fiber (canceling the same factor we used to convert from line sensitivity to intensity fluctuation per voxel) and the spectral resolution of the instrument (or a redshift interval of $(1+z)/R\sim 0.005$), which comes out to ∼0.03 Mpc3. As with the cross power spectrum uncertainty in Equation (12), the errors on the individual auto power spectra are also given by dividing the "total" spectra by the square root of Nmodes(k). In the case of galaxy or LAE overdensities, the "total" power spectrum is equal to the simulated power spectrum, as we never subtract the shot noise term of $1/\bar{n}$. In the other cases, we use the "total" spectra from above:

Equation (17)

Equation (18)

When calculating S/N, we also need to account for attenuation in the CO signal due to the beam. As discussed in Li et al. (2016), the beam resolution limit attenuates the Fourier-transformed CO temperature field ${\tilde{T}}_{\mathrm{CO}}({\boldsymbol{k}})$ at each ${\boldsymbol{k}}$ by a factor of $\exp (-{k}_{\perp }^{2}{\sigma }_{\mathrm{beam}}^{2}/2)$, where ${k}_{\perp }$ is the transverse component of ${\boldsymbol{k}}$ and ${\sigma }_{\mathrm{beam}}$ the width of the Gaussian profile of the beam (projected into the comoving survey volume). However, angular resolution limits differ significantly between the CO temperature field and the field being cross-correlated against, and the latter (which may be due to pixelization, galaxy survey limits, fiber diameters, and so on) will be much finer than the COMAP ${\sigma }_{\mathrm{beam}}$. Thus, while the full 3D CO auto spectrum ${P}_{\mathrm{CO}}{({\boldsymbol{k}})\propto {\tilde{T}}_{\mathrm{CO}}({\boldsymbol{k}})}^{2}$ is attenuated at each ${\boldsymbol{k}}$ by $\exp (-{k}_{\perp }^{2}{\sigma }_{\mathrm{beam}}^{2})$, the cross spectra only scale linearly with the attenuated CO signal and consequently are attenuated at each ${\boldsymbol{k}}$ by approximately $\exp (-{k}_{\perp }^{2}{\sigma }_{\mathrm{beam}}^{2}/2)$.

The spherically averaged auto and cross P(k) are correspondingly attenuated by

Equation (19)

where the average is over all (discrete) ${\boldsymbol{k}}$ in the $P({\boldsymbol{k}})$ averaging that fall within the shell corresponding to k, and ${\sigma }_{\perp }$ is the applicable resolution limit for the P(k) being calculated.13 For the CO auto spectrum, ${\sigma }_{\perp }$ is simply ${\sigma }_{\mathrm{beam}}$, and given the large beam size of the COMAP telescope, effectively ${\sigma }_{\perp }\approx {\sigma }_{\mathrm{beam}}/\sqrt{2}$ for the cross spectra as discussed above.

Recall that, toward the end of Section 3.3, we also discussed attenuation in cross spectra between CO and Lyα intensity fluctuations due to the limited spectral resolution of HETDEX. This follows an average similar to that in Equation (19), but with ${k}_{\parallel }$, the line-of-sight component of ${\boldsymbol{k}}$:

Equation (20)

Were we dealing with the HETDEX auto spectrum, we would base ${\sigma }_{\parallel }={\sigma }_{\parallel ,\mathrm{HETDEX}}$ on the redshift-space resolution of $0.0015(1+z)$ at the average redshift of the survey volume, again based on the resolving power $R\sim 700$ of VIRUS (Hill et al. 2014) and observed Lyα component velocity offsets (Chonis et al. 2013) as discussed in Section 3.3. However, in cross-correlation with COMAP, which has much higher redshift precision, we can assume that ${\sigma }_{\parallel }\approx {\sigma }_{\parallel ,\mathrm{HETDEX}}/\sqrt{2}$ using arguments similar to those for ${\sigma }_{\perp }$.

In discussing our results, we will often quote the total S/N over "all" scales, meaning all scales $k\in (0.017,4.2)$ Mpc−1 (in linear bins of width 0.034 Mpc−1) that our simulations nominally access (with the minimum and maximum k respectively corresponding to the lightcone and voxel angular widths in comoving space). Following Li et al. (2016), we calculate this total ${\rm{S}}/{\rm{N}}$ as

Equation (21)

where ${P}_{\mathrm{obs}}(k)=P(k){W}^{2}(k)$ (or $P(k){W}^{2}(k){W}_{z}^{2}(k)$ in the case of the CO-Lyα intensity cross spectrum) and the sum is over all k-bins with central values in the previously specified range of $(0.017,4.2)$ Mpc−1. Note that ${W}^{2}(k)$ and ${W}_{z}^{2}(k)$ also modify the auto and cross spectra (before thermal noise) in the expressions for ${\sigma }_{P}(k)$.

4. Results

We examine how the S/N and the cross-correlation signal itself vary with survey variables when cross-correlating COMAP against a conventional galaxy survey or a HETDEX-like survey. While we present the signal-to-noise ratio based on the auto or cross spectra in isolation, we will mostly plot the signal in the form of the normalized cross-correlation coefficient between tracers A and B:

Equation (22)

the value of which varies from −1 for perfect anticorrelation to +1 for perfect co-correlation. This allows us to evenly compare different cross-correlation scenarios—for which the cross P(k) otherwise have significantly varying units and amplitudes—and relevant scales over which correlations wax or wane. We do not account for instrumental noise or beam response in the plotted r(k), although we do in calculating power spectra S/N.

We present r(k) and overall cross P(k) signal-to-noise for cross-correlation against a conventional galaxy survey in Section 4.1, and against a HETDEX-like survey in Section 4.2. We then summarize the relevant auto and cross P(k) for COMAP and sensitivities for all scenarios in Section 4.3.

4.1. Cross-correlations with Conventional Galaxy Surveys

We explore two different variables in the galaxy survey: (1) the mass completeness of a perfect redshift survey (${\sigma }_{z}/(1+z)=0$), and (2) the redshift accuracy of a survey with the fiducial mass completeness cut ($\mathrm{log}({M}_{* }/{M}_{\odot })\gt 10.0$).

4.1.1. Mass Completeness

Figure 2 shows how r(k) varies based on the mass completeness of an ideal survey with perfectly determined redshifts. Note that because the CO emission traces faint galaxies well below halo masses of ${10}^{12}\,{M}_{\odot }$ in between the brighter galaxies with ${M}_{\mathrm{vir}}\gtrsim {10}^{12}\,{M}_{\odot }$, r(k) falls off with higher k as the CO and galaxy surveys begin to trace less similar fluctuations at smaller comoving scales. The falloff is greater with less complete surveys, and it impacts all scales significantly once we reach ∼4 × 103 galaxies in our survey volume (or densities of ∼103 galaxies per deg2 per ${\rm{\Delta }}z=1$).

Figure 2.

Figure 2. Median (curves) and 95% sample intervals (shaded areas) across 100 lightcones of normalized cross-correlation coefficient r(k) for simulated CO-galaxy cross-correlation, assuming a perfect redshift survey (${\sigma }_{z}/(1+z)=0$). The different curves show r(k) for different stellar mass cuts used to select the galaxy sample for cross-correlation, as indicated in the legend. These curves show the underlying r(k) rather than a detectable signal, because all galaxy redshifts are assumed to be perfectly known in these simulations.

Standard image High-resolution image

We show signal-to-noise ratios for all simulated cross spectra in Table 4. All of these cross spectra—even the one with the lowest assumed density—might be detected with a signal-to-noise ratio above 20, far higher than the 4.6 expected from the CO auto spectrum alone, at least provided that the galaxy sample fully covers the COMAP volume.

Table 4.  Mean over All Simulated Observations of 100 Lightcones of Total ${\rm{S}}/{\rm{N}}$ for ${P}_{\mathrm{CO}\times \mathrm{gal}}(k)$ over All Modes

$\mathrm{log}({M}_{* ,\min }/{M}_{\odot })$ ${\sigma }_{z}/(1+z)$ Median Galaxy Count ${\rm{S}}/{\rm{N}}$
9.6 0. $5.4\times {10}^{4}$ 33.2
10.0 0. $2.9\times {10}^{4}$ 32.4
10.4 0. $1.3\times {10}^{4}$ 29.6
10.8 0. $3.5\times {10}^{3}$ 22.7
9.6 0.0007 $5.4\times {10}^{4}$ 30.1
10.0 0.0007 $2.9\times {10}^{4}$ 29.1
10.4 0.0007 $1.3\times {10}^{4}$ 26.6
10.8 0.0007 $3.5\times {10}^{3}$ 20.4
10.0 0.003 $2.9\times {10}^{4}$ 19.1
10.0 0.01 $2.8\times {10}^{4}$ 10.6
10.0 0.02 $2.7\times {10}^{4}$ 7.32
10.0 0.03 $2.7\times {10}^{4}$ 5.93

Note. For comparison, the ${\rm{S}}/{\rm{N}}$ for PCO(k) is 4.6. All signal-to-noise ratios are quoted for a single patch of 2.5 deg2 observed for 1500 hr; we may expect an improvement of up to a factor of $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each.

Download table as:  ASCIITypeset image

4.1.2. Redshift Accuracy

Now fixing the mass completeness cut at the fiducial value of $\mathrm{log}({M}_{* ,\min }/{M}_{\odot })=10.0$, we vary the redshift accuracy in the survey. Figure 3 shows the resulting r(k) values for each value of ${\sigma }_{z}/(1+z)$ assumed.

Figure 3.

Figure 3. Median (curves) and 95% sample intervals (shaded areas) across 100 lightcones of normalized cross-correlation coefficient r(k) for simulated CO-galaxy cross-correlation. The different curves show r(k) for different redshift errors used to select the galaxy sample used in the cross-correlation exercise, as indicated in the legend. The galaxy samples in these simulations are selected based on a stellar mass cut of $\mathrm{log}({M}_{* ,\min }/{M}_{\odot })=10.0$.

Standard image High-resolution image

Note that, even for ${\sigma }_{z}/(1+z)=0.0007$, we find significant attenuation of cross-correlation at large k, suggesting the effect is particularly great for the cross shot-noise component of the signal (which dominates over the clustering component for $k\gtrsim 1$ Mpc−1).14 It is realistic to expect attenuation of this kind given the possible fitting errors in spectroscopic redshift determination, as well as mismatches in line-of-sight pixelization or resolution between the two survey data. This level of precision is also where redshift-space distortions (RSD) from peculiar velocities of galaxies begin to distort line-of-sight structure. As this precision is thus sufficient for galaxy redshift surveys looking for RSD, they have little motivation to pursue higher spectral resolution—e.g., Gaztañaga et al. (2012) and Eriksen & Gaztañaga (2015) demonstrate that ${\sigma }_{z}/(1+z)=0.003$ is adequate for the purposes of the PAU Survey (mentioned above in Section 3.3.2).

For both ${\sigma }_{z}/(1+z)=0.0007$ and ${\sigma }_{z}/(1+z)=0.003$ (the higher- and lower-resolution spectroscopic errors), we see significant attenuation of the cross-correlation at large k, i.e., the smallest scales simulated, but relatively little attenuation at the largest scales probed. This works to our advantage, as our single-dish line-intensity survey aims to detect CO fluctuations at these largest scales rather than the CO shot noise. The S/N reflects this, falling only from 32.4 to 29.1 if ${\sigma }_{z}/(1+z)\,=0.0007$, and then to 19.1 if we increase ${\sigma }_{z}/(1+z)$ to 0.003.

These numbers become more discouraging as we approach errors more typical of wide-band photometric surveys, settling in a range closer to 6–11. We show the effect graphically in Figure 4, and again summarize the signal-to-noise ratios calculated in Table 4. Even taking these ratios at face value, the high ${\sigma }_{z}$ significantly dulls the advantage of cross-correlation over auto-correlation in detection significance. We must also carefully consider the integration time of 1500 hr per patch assumed for all scenarios. In the specific case of COMAP, when observing from the site in California, this integration time takes 2–3 times longer to achieve on an equatorial field like COSMOS versus on a field at 50–70° decl. If we had fixed the "real" survey duration for all scenarios instead of the integration time, we would expect to see no advantage in detection significance from cross-correlating against a COSMOS2015-like galaxy catalog over CO autocorrelation in a field of our choice.

Figure 4.

Figure 4. Median (circles) and 95% sample intervals (error bars) across 100 lightcones of total signal-to-noise over all scales ${\rm{S}}/{\rm{N}}={[{\sum }_{k}({\rm{S}}/{\rm{N}}{)}_{k}^{2}]}^{1/2}$ for simulated CO-galaxy cross spectra for different galaxy ${\sigma }_{z}/(1+z)$ values. The annotations indicate what instrument, technique, or catalog can achieve redshift accuracy broadly similar to each of our simulated scenarios. The galaxy sample is simulated with a minimum stellar mass of $\mathrm{log}({M}_{* ,\min }/{M}_{\odot })=10.0$. All ${\rm{S}}/{\rm{N}}$ are quoted for a single patch of 2.5 deg2 observed for 1500 hr; we may expect an improvement of up to a factor of $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each.

Standard image High-resolution image

One natural step we might take to ameliorate the problem of photometric redshift errors is to coarsen the line-of-sight resolution of the data, so as to make the redshift errors less relevant. While this will result in boosting the signal closer to its true value, by essentially removing attenuated line-of-sight modes from consideration, it will also result in increased uncertainties in the end result as the Fourier-space volume—and thus the number of modes decreases. The net result is largely a loss in total cross signal-to-noise, as we show in Figure 5, and only a slight gain for photometric scenarios of ${\sigma }_{z}/(1+z)\geqslant 0.01$ (which plateaus when ${\sigma }_{z}/(1+z)\approx {\delta }_{\nu }/\nu $).

Figure 5.

Figure 5. A demonstration of the effect of COMAP line-of-sight resolution on the signal-to-noise ratio for auto and cross spectra. We express frequency resolution here as number of channels across the spectrometer bandwidth (also expressed as $\nu /{\delta }_{\nu }$ per channel for $\nu =30\,\mathrm{GHz}$ by multiplying by 3.75), and show how it affects total S/N over all scales ${\rm{S}}/{\rm{N}}={[{\sum }_{k}({\rm{S}}/{\rm{N}}{)}_{k}^{2}]}^{1/2}$ for simulated CO auto spectra and CO-galaxy cross spectra for different galaxy ${\sigma }_{z}/(1+z)$ values. The thick curves and shaded areas show the median and 95% interval across 100 lightcones. The simulated galaxy sample is selected with a minimum stellar mass of $\mathrm{log}({M}_{* ,\min }/{M}_{\odot })=10.0$. All signal-to-noise ratios are quoted for a single patch of 2.5 deg2 observed for 1500 hr; we may expect to see improvement of up to a factor of $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each.

Standard image High-resolution image

4.2. Cross-correlations with a HETDEX-like Survey

We consider two variations on cross-correlating against a HETDEX-like survey: one in which we cross-correlate against a Lyα intensity cube, and one in which we cross-correlate against the LAE overdensity field with different luminosity cuts, as discussed in Section 3.3. We fix ${\sigma }_{z}/(1+z)=0.0015$ in all cases, however.

We first consider the COMAP×LAE scenario, and show the simulated r(k) in Figure 6. For luminosity cuts of $(3\times {10}^{42},6\times {10}^{42})$ erg s−1, we find an average of $(1.4\times {10}^{4},4.2\times {10}^{3})$ LAEs in the survey volume, with approximately one-fourth as many LAEs when the volume is sparsely sampled with a fill factor of 1/4. The number of LAEs with ${L}_{\mathrm{Ly}\alpha }\gt 3\times {10}^{42}$ erg s−1 approximately matches the expected source abundance in Hill et al. (2008) of 8 × 105 LAEs across ∼400 (sparsely sampled) square degrees in a redshift interval of ${\rm{\Delta }}z=1.6$.

Figure 6.

Figure 6. Median (curves) and 95% sample intervals (shaded areas) across 100 lightcones of normalized cross-correlation coefficient r(k) for simulated CO-LAE cross-correlation. The different curves show r(k) for different ${L}_{\mathrm{Ly}\alpha }$ cuts used to select the LAE sample used in the cross-correlation, both with and without sparse sampling. LAE redshifts are scattered by a normal distribution with ${\sigma }_{z}/(1+z)=0.0015$. For comparison, we also show r(k) from cross-correlation of COMAP with a galaxy survey with a minimum stellar mass of $\mathrm{log}({M}_{* ,\min }/{M}_{\odot })=10.0$, assuming both perfect redshifts (black) and scattered redshifts with ${\sigma }_{z}/(1+z)=0.02$ (purple).

Standard image High-resolution image

If the HETDEX data filled the survey volume completely, as in the SHELA field, the COMAP×LAE cross spectrum would be detectable with total S/N as high as 20.7 for ${L}_{\mathrm{Ly}\alpha }\gt 3\times {10}^{42}$ erg s−1, even with the LAE redshifts scattered by ${\sigma }_{z}/(1+z)=0.0015$ (without which the ${\rm{S}}/{\rm{N}}$ might be higher by around 30%). However, with the quarter-fill factor, the ${\rm{S}}/{\rm{N}}$ does drop to 14.5. The numbers are lower by 20%–25% for the more stringent cut of ${L}_{\mathrm{Ly}\alpha }\gt 6\times {10}^{42}$ erg s−1.

We now consider cross-correlating against a Lyα intensity cube generated from HETDEX, showing r(k) in Figure 7. Compared to cross-correlation against LAE overdensity, the roll-off of r(k) with greater k is slower, even with sparse sampling and limited spectral resolution. The signal is potentially detectable at an S/N of 29.1 without sparse sampling, and an S/N of 23.3 with sparse sampling. This results in a slight edge versus cross-correlating against individually identified LAEs, although the simulated advantage may change with the Lyα model; see Section 5.2 (and Appendix C) for further discussion.

Figure 7.

Figure 7. Median (curves) and 95% sample intervals (shaded areas) across 100 lightcones of normalized cross-correlation coefficient r(k) for simulated CO-Lyα intensity cross-correlation. The different curves show r(k) for different assumptions of sparse sampling and HETDEX resolution. Apart from the r(k) curve that we label "$R\to \infty $," the COMAP–HETDEX r(k) curves are attenuated by the amount expected for the VIRUS resolving power of $R\sim 700$. We show the same CO-galaxy r(k) curves for comparison as we did in Figure 6.

Standard image High-resolution image

We summarize the signal-to-noise ratios from COMAP–HETDEX cross-correlation (and expected LAE counts for applicable scenarios) in Table 5, and compare the LAE cross-correlation S/N graphically against S/N from cross-correlation against a mass-selected galaxy sample in Figure 8. Note that, unlike the COSMOS field, the HETDEX survey footprint is partly well-matched with areas of relatively high observing efficiency for COMAP. Therefore, a CO observing campaign with sufficient data (in one patch of several) for a $\sim 5\sigma $ CO auto detection could readily overlap with HETDEX to generate a $\sim 15\sigma $ detection in cross-correlation, per the signal-to-noise ratios in Table 5.

Figure 8.

Figure 8. Median (curves) and 95% sample intervals (shaded areas surrounding curves) of cross spectrum signal-to-noise ratio for cross-correlation of CO temperature against mass-selected galaxies and LAEs (without and with sparse sampling), as a function of minimum stellar mass or Lyα luminosity. We loosely align the two survey limit metrics based on source abundances as reported in Tables 4 and 5—e.g., cuts of ${M}_{* }\gt {10}^{10.4}\,{M}_{\odot }$ and ${L}_{{\rm{L}}y\alpha }\gt 3\times {10}^{42}$ erg s−1 both result in ∼104 sources in a 2.5 deg2 patch of the COMAP survey volume, or ∼5 × 103 sources per deg2 per ${\rm{\Delta }}z=1$. Note, however, the different redshift resolutions assumed for the mass-selected galaxy sample and the LAE sample. All signal-to-noise ratios are quoted for a single 2.5 deg2 patch observed for 1500 hr; we may expect to see an increase of up to a factor of $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each.

Standard image High-resolution image

Table 5.  Mean over All Simulated Observations of 100 Lightcones of Total ${\rm{S}}/{\rm{N}}$ for ${P}_{\mathrm{CO}\times \mathrm{Ly}\alpha }(k)$ over All Modes

${L}_{\mathrm{Ly}{\alpha },\min }$ (erg s−1) Median LAE count ${\rm{S}}/{\rm{N}}$
none (LIM) 29.1 (23.3)
× 1042 $1.4\times {10}^{4}$ ($3.5\times {10}^{3}$) 21.2 (14.5)
× 1042 $4.2\times {10}^{3}$ ($1.1\times {10}^{3}$) 16.7 (10.6)

Note. Counts and ${\rm{S}}/{\rm{N}}$ in parentheses are for quarter-fill sparse sampling; counts and ${\rm{S}}/{\rm{N}}$ not in parentheses are for full-fill sampling. We assume ${\sigma }_{z}/(1+z)=0.0015$ in all cases. For comparison, the ${\rm{S}}/{\rm{N}}$ for PCO(k) is 4.6. All ${\rm{S}}/{\rm{N}}$ are quoted for a single patch of 2.5 deg2 observed for 1500 hr; we may expect to see an increase of up to a factor of $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each, and a further, roughly linear increase with more integration time.

Download table as:  ASCIITypeset image

4.3. Final Summary: Power Spectra and Sensitivities

To conclude this section, we show a plot of all auto and cross P(k) with sensitivities in Figure 9. Note the shape of the CO auto P(k), which flattens beyond $k\sim 1$ Mpc−1 as the shot-noise component of the power spectrum begins to dominate over the clustering component following the underlying matter distribution. Any such shot-noise component in the cross spectra is far less apparent, as expected from the random redshift errors wiping out smaller-scale correlations. (The exception is the CO×HETDEX LIM cross P(k) plotted, which does not incorporate this effect, as it is folded into the accompanying sensitivity curve instead. Thus, a shot-noise component is visible for this set of P(k).) This matches what we also demonstrate in the r(k) plots of Figures 3 and 6.

Figure 9.

Figure 9. Upper four panels: median (curves) and 95% sample intervals (shaded areas surrounding curves) across 100 lightcones of indicated auto and cross spectra, with sensitivities indicated by the shaded areas of corresponding color and line style extending to the lower limit of each panel. The M*-selected galaxies exceed a minimum stellar mass of 1010 ${M}_{\odot }$, and the LAE samples exceed a minimum Ly-α luminosity of 3 × 1042 erg s−1. Unlike the CO × M*-selected galaxy and CO × HETDEX LAE cross spectra, the CO ×  HETDEX LIM cross spectra do not account for the effect of redshift precision, which is folded instead into the sensitivity curve. Lower panel: median (curves) and 95% sample intervals (shaded areas surrounding curves) of signal-to-noise ratio for all auto and cross spectra as a function of k, calculated in k-bins of width ${\rm{\Delta }}k=0.035$. All ${\rm{S}}/{\rm{N}}$ are quoted for a single patch of 2.5 deg2 observed for 1500 hr; we may expect to see an increase by a factor of up to $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each, and a further, roughly linear increase with more integration time.

Standard image High-resolution image

Note also that the shapes of the sensitivity curves clearly show the impact of cross-correlating against data with significantly higher angular resolution than the COMAP data (and thus reducing ${\sigma }_{\perp }$ in Equation (19) by a factor of $\sqrt{2}$). We also plot the signal-to-noise ratio at each k for all spectra considered, which shows the same.

Note finally that we find a different total signal-to-noise ratio if we consider uncertainties on the anisotropic power spectrum P(k, μ) (where $\mu ={k}_{\parallel }/k$ is the cosine of the k-space spherical polar angle, using ${k}_{\parallel }$ to describe the line-of-sight component of the vector ${\boldsymbol{k}}$) instead of the spherically averaged P(k), which may be higher for galaxy samples with $\sigma {z}/(1+z)\geqslant 0.01$. However, the enhancement is not enough to alter the fundamental conclusions of our work; see Appendix E for further discussion.

5. Discussion

Broadly, the above results show that photometric redshift errors significantly attenuate the CO-galaxy cross-correlation at all scales, and overlapping with a spectroscopic data set is the best path to a confident detection. We now provide additional discussion of specific topics driving these results.

5.1. Effects of Redshift Errors

Our treatment of the effects of redshift errors is considerably simplified from what we may expect from real-world survey data. In particular, we simulate photometric redshifts that lack both bias and catastrophic outliers, either of which could potentially further affect the cross-correlation signal.

Furthermore, photometric redshift algorithms usually compute a redshift probability density function (PDF) for each galaxy, which can be reduced to the point estimate that we consider. Indeed, Asorey et al. (2016) propose using the full redshift PDF for each galaxy in studies of angular galaxy clustering within bins of roughly twice the typical PDF widths.

However, the technique is of limited applicability when using LIM to probe 3D clustering. In general, techniques using galaxy redshift PDFs will not (and are not meant to) recover the true redshift of the galaxy or the true galaxy density fluctuations in the survey volume. In fact, using the redshift PDFs for each galaxy instead of the estimated redshifts will merely convolve the PDF (around the estimated redshift) with the scatter distribution of the estimated redshift (versus the true redshift). This can only result in additional line-of-sight smearing and thus attenuation of the signal.

Furthermore, the detectability of the line-intensity signal relies in large part on the decrease in uncertainty due to being able to access the line-of-sight fluctuation modes. In Figure 5, where we actually explore the idea of wider frequency channels (albeit not the idea of using full redshift PDFs), the signal-to-noise remains low despite recovery of the signal due to the loss of line-of-sight information.

Ultimately, we must contend with some suppression of the cross spectrum signal from errors in galaxy redshifts, even for spectroscopic redshifts. In principle, we might consider calculating a transfer function to compensate for this suppression. However, the signal being suppressed in the first place diminishes our confidence in its detection, regardless of whether or not we can undo the suppression in analysis.

Furthermore, an accurate transfer function will require accurate and precise characterization of redshift errors—both of the mean offset or bias, as well as of the variance or scatter. For comparison, when binning populations of galaxies in a photometric survey to gauge cosmological parameters, both bias and scatter in redshift must be characterized within 0.003 to limit significant degradation in error, even using bins of ${\rm{\Delta }}z\sim 0.1$ (Huterer et al. 2006; Ma et al. 2006). In practice, LIM can probe fluctuations across bins of ${\rm{\Delta }}z$ as fine as ∼0.001, although the science is often more astrophysical than cosmological, especially in deep surveys with small sky fractions.

Such requirements in and of themselves should not preclude the possibility of studies of shot noise cross-correlation of the kind posited by Wolz et al. (2017) (which proposes cross-correlation against a spectroscopic galaxy survey, for which the errors may be more precisely characterized than for photometric redshifts) and Breysse & Rahman (2017) (if extended to cross-correlations between line-intensity surveys). However, future studies may wish to take careful inventory of expected redshift errors. A thorough study of the feasibility of such techniques and the impact of these effects on shot noise detection significance is beyond the scope of this particular work, but still highly desirable as these cross-correlations become increasingly viable.

5.2. The Slim Advantage of Lyα LIM over Individual Galaxy Detections

Recall that we consider two possible outputs of HETDEX: the LAE overdensity ${\delta }_{\mathrm{LAE},\mathrm{vox}}$ for all emitters above a certain luminosity cut, and the total Lyα line-intensity quantified as ${\nu }_{{\rm{L}}y\alpha }{I}_{\nu ,\mathrm{Ly}\alpha }$. One might expect that, because the latter includes Lyα emission from emitters below a realistic luminosity cut, the line-intensity cube should trace structures absent in the ${\delta }_{\mathrm{LAE},\mathrm{vox}}$ cube—and potentially significantly improve detectability in cross-correlation against TCO.

However, our simulated cross-correlation of ${T}_{\mathrm{CO}}$ against ${\nu }_{{\rm{L}}y\alpha }{I}_{\nu ,\mathrm{Ly}\alpha }$ only provides a slim advantage (40–60%, based on the figures in Table 5) in signal-to-noise ratio over cross-correlation against LAEs detected with ${L}_{\mathrm{Ly}\alpha }\gt 3\times {10}^{42}$ erg s−1. This seems significant enough until one considers that the signal-to-noise ratio is still about the same as a conventional spectroscopic galaxy survey with ${M}_{* ,\min }\sim {10}^{10}\,{M}_{\odot }$. Furthermore, in our model, LAEs with ${L}_{\mathrm{Ly}\alpha }\gt 3\times {10}^{42}$ erg s−1 are typically in halos with ${M}_{\mathrm{vir}}\gtrsim {10}^{12}\,{M}_{\odot }$ (looking at Figure 1), and the line-intensity data should trace twice as much (or more) signal by including emission from lower-mass halos (looking at Figure 13 in Appendix A.4).15 Note first that this ignores both the additional information that may be captured in LIM (but is beyond the scope of our LAE-based simulations) and the additional challenges that would be inherent to LIM, including contamination from zodiacal light and interloper emission in [O ii] (the latter of which would be rejected in COMAP–HETDEX cross-correlation but would nonetheless contribute additional uncertainty about that cross signal).16 With these caveats in mind, we should ask why overcoming such challenges and observing the full LAE population would appear to result in so little improvement in forecast ${\rm{S}}/{\rm{N}}$.

One possible explanation may be the halo mass–line luminosity relations in Figure 1. The large step down in Lyα escape fraction around $\mathrm{SFR}\sim 1\,{M}_{\odot }\,{\mathrm{yr}}^{-1}$ or halo mass ${M}_{\mathrm{vir}}\sim {10}^{11}\,{M}_{\odot }$ means that the slope of the halo mass–${L}_{\mathrm{Ly}\alpha }$ relation begins to decline above this mass, before the turnaround at ${10}^{12}\,{M}_{\odot }$ seen in both this relation and in the halo mass–${L}_{\mathrm{CO}}$ relation. This would lead to differences between the CO and Lyα signals in the relative contributions of halos with ${M}_{\mathrm{vir}}\lesssim {10}^{11}\,{M}_{\odot }$ and those with ${M}_{\mathrm{vir}}\gtrsim {10}^{12}\,{M}_{\odot }$, which may adversely affect the cross-correlation of the line-intensity signals. However, whether only the most massive halos behave similarly enough to co-correlate would certainly be a highly model-dependent effect. Furthermore, the Lyα relation is based solely on observed LAE densities, which may not result in a complete model of Lyα emission (as discussed briefly in Section 3.2).

Another (not mutually exclusive) possibility is that LAE detection, and not Lyα intensity mapping, may actually be the optimal observation for HETDEX, which has low instrumental noise and high angular resolution—and thus, low source confusion. Cheng et al. (2018) provide an overview of optimal observations in different noise and confusion regimes using an analytic source model. Their work suggests that detecting individual sources rather than aggregate LIM is the optimal observation for HETDEX, as well as for the higher-redshift Lyα observations that would be possible with the Cosmic Dawn Intensity Mapper (CDIM) (Cooray et al. 2016), although not for the SPHEREx concept (Doré et al. 2014).

However, Cheng et al. (2018) are also careful to note that the Lyα model used only incorporates point sources (just as our own model relies only on LAE LFs) and does not account for the expected extended emission from radiative transfer beyond galaxies. We have also discussed this as a limitation of our model, and repeat our caveat from Section 3.2 that a need exists for future work on implications of extended Lyα emission for HETDEX and COMAP–HETDEX cross-correlation.

6. Conclusions

We find that cross-correlation of COMAP Phase I data with galaxy surveys, both targeted galaxy surveys and blind Lyα surveys, could result in high signal-to-noise detections, but not unconditionally.

  • 1.  
    With perfect or at least very precise redshift knowledge, the exercise could be done with as few as several thousand sources covering the COMAP survey volume, corresponding to a source abundance of ≳10−4Mpc−3 or ≳102 per square degree per ${\rm{\Delta }}z=0.1$.
  • 2.  
    However, to provide a significant advantage in cross-correlation alone over auto-correlation alone in signal-to-noise ratio, the galaxy catalog must achieve a redshift accuracy of ${\sigma }_{z}/(1+z)\lesssim 0.003$, which is best obtained with low- to medium-resolution spectroscopy and will be challenging (at best) with photometry at high redshift.
  • 3.  
    If the redshift accuracy and source density satisfy the above, cross-correlations could result in a cross spectrum detection at an S/N of up to 15–30, compared to ${\rm{S}}/{\rm{N}}\lesssim 5$ for a CO auto spectrum detection (in a single patch). We expect this to be true in the case of cross-correlation with HETDEX, although this (and cross-correlation with Lyα surveys in general) requires further investigation with more faithful treatment of radiative processes.

We take care to note that targeting specific fields like COSMOS and targeting auto-correlation may not be mutually exclusive in general. Choosing one of the two is a necessity for noise-dominated surveys operating from sites where such fields cannot be mapped with high observing efficiency, but some line-intensity surveys may be in a position to observe a field like COSMOS with high observing efficiency. CONCERTO,17 for instance, is set to operate on the Atacama Pathfinder EXperiment (APEX) antenna located on Llano de Chajnantor, which is well-suited for observing the COSMOS field and other equatorial fields (being located at 23fdg01 south latitude, versus the 37fdg23 north latitude of the OVRO site). Catalogs in those fields may be able to provide priors for point sources and low-redshift CO emitters acting as foregrounds for the high-redshift [C ii] emission that CONCERTO targets. In particular, cross-correlation between $z\sim 0.5$–2 galaxy surveys and CO emission captured in CONCERTO and other [C ii] experiments could enable novel science at both intermediate and high redshift, but we leave this possibility for others to investigate in future work. Overlapping with galaxy survey fields may also allow use of line-intensity maps as spectroscopic references (in addition to, e.g., quasars) for inferring clustering-redshift distributions of cataloged objects (see Ménard et al. (2013) for a description of the clustering redshift technique).

Furthermore, raw projected S/N is not an adequate singular basis for dismissing cross-correlating against less precise redshifts. Qualitative differences between the interpretation of an autocorrelation and that of a cross-correlation mean that, even with the same S/N, a cross-correlation measurement may lend more confidence about the origin of the CO signal, in addition to allowing for better rejection of systematics and uncertainties beyond the fundamental sources of noise accounted for here. (H i intensity mapping provides a case study where auto spectra are biased by unknown systematics and a more robust constraint emerges from putting together the auto spectrum with the cross spectrum from cross-correlation of the H i data against a spectroscopic galaxy survey—see Switzer et al. (2013).) That said, we qualitatively expect and quantitatively confirm that the redshift accuracy typical of photometric surveys significantly affects our ability to detect 3D clustering in cross-correlation, and this gives us cause for concern in using photometric catalogs for cross-correlation against line-intensity surveys.

Existing photometric surveys should still inform cross-correlation prospects, as we may expect significant spectroscopic follow-up with instruments like PFS—and even surveys like HETDEX, which will use the COSMOS field for science verification and calibration (Hill & HETDEX Consortium 2016). However, we expect such surveys to operate beyond COSMOS—like in the SHELA patch, which already has deep multi-wavelength imagery in the optical and infrared through Spitzer/IRAC, NEWFIRM, and DECam, and will see full-fill HETDEX data in the next several years (Hill & HETDEX Consortium 2016; Papovich et al. 2016), which could enable the high-S/N cross-correlation detections discussed above. Cross-correlating against a photometric catalog will not be a quick path to a detection for near-future line-intensity surveys like COMAP, but we find hope for future prospects as we wait for an influx of high-quality wide-field spectroscopic data in the coming years.

D.T.C., M.P.V., S.E.C., and R.H.W. acknowledge support via NSF AST-1517598 and a seed grant from the Kavli Institute for Particle Astrophysics and Cosmology. K.A.C. acknowledges funding from the NSF via award AST-1518282, as well as from the Keck Institute for Space Studies. H.K.E., M.K.F., H.T.I., and I.K.W. acknowledge support from the Research Council of Norway through grant 251328. J.O.G. acknowledges support from the Keck Institute for Space Studies, NSF AST-1517108, and the University of Miami. S.E.H. acknowledges support from an STFC Consolidated Grant (ST/P000649/1). H.P.'s research is supported by the Tomalla Foundation. We thank Tony Li for initial simulations and discussions that evolved into this work, and Clive Dickinson for crucial discussions and comments at the inception of this work. We also thank Lluís Mas-Ribas for an enlightening discussion about Lyα blobs. Some of this work was presented and refined at the workshop "Cosmological Signals from Cosmic Dawn to the Present" held at the Aspen Center for Physics, which is supported by National Science Foundation via grant PHY-1607611. We would like to acknowledge the organizers and participants of that workshop, including Yun-Ting Cheng for providing a draft version of work in preparation. We thank Matthew Becker for access to the Chinchilla cosmological simulation (c400-2048) used in this work. Finally, we thank an anonymous referee whose comments and suggestions greatly improved this manuscript. This research made use of NASA's Astrophysics Data System Bibliographic Services. This work also used computational resources at the SLAC National Accelerator Laboratory.

Software: Astropy, a community-developed core Python package for astronomy (Astropy Collaboration et al. 2013); Matplotlib (Hunter 2007); hmf (Murray et al. 2013).

Appendix A: Lyα Model Details

We consider the SFR–${L}_{\mathrm{Ly}\alpha }$ relation from Section 3.2 in two parts: the "intrinsic" Lyα luminosity per unit SFR based purely on ionizing emissivity (i.e., the numeric coefficient in Equation (4)), and the escape fraction that modifies this luminosity (as given in Equation (5)). We explain our rationale for each in Appendices A.1 and A.2, respectively, and compare simulated LFs and power spectra to observations and previous work in Appendix A.3. We also consider our choice of ${10}^{10}\,{M}_{\odot }$ as the minimum emitting halo mass in Appendix A.4.

A.1. Intrinsic Luminosity per SFR

The conversion is based on assuming a certain intrinsic Hα luminosity per SFR and a Lyα/Hα line ratio of 8.7. Stellar synthesis modeling done in Kennicutt et al. (1994) (via Kennicutt 1998) suggested that, for a Salpeter IMF,

Equation (23)

The calibration arises from models of stellar evolutionary tracks, ionizing emissivity, and recombination rates for gas with $T={10}^{4}$ K; see Kennicutt et al. (1994) and Hummer & Storey (1987) for the last item.

Murphy et al. (2011) update this calibration with revised stellar synthesis models incorporating a Kroupa IMF, yielding a SFR per luminosity 0.68 times that of the old calibration, or a luminosity per SFR ${(0.68)}^{-1}=1.47$ times that of the old calibration. Any difference in this calibration due to using a Chabrier IMF is comparatively small, and we use the value from Murphy et al. (2011) without alteration.18

The convention in previous literature is to convert the above ${L}_{{\rm{H}}\alpha }/\mathrm{SFR}$ ratio into a ${L}_{\mathrm{Ly}\alpha }/\mathrm{SFR}$ ratio by assuming a Lyα/Hα ratio of 8.7. Common citations for this convention include

  • 1.  
    Pengelly (1964) (communicated by Seaton), the first of three papers including Pengelly & Seaton (1964) and Seaton (1964);
  • 2.  
    Brocklehurst (1971) (again communicated by Seaton, and in fact the content is very similar to Pengelly ( 1964));
  • 3.  
    Hummer & Storey (1987);
  • 4.  
    and Hu et al. (1998) (who cite Brocklehurst ( 1971), but are sometimes cited in isolation—for example, in Hayes (2015) and Bridge et al. (2018).

Of these, only the last citation is strictly appropriate, as it is the only one explicitly stating a Lyα/Hα ratio of 8.7. The first three deal with hydrogen and helium recombination rates and line ratios—including the Balmer series and specifically the Hα/Hβ ratio—for different possible gas densities and temperatures, and for different assumptions about whether the gas is optically thin (case A) or thick (case B) to the recombination lines. However, as Henry et al. (2015) note, there needs to be additional information to link the Lyα line to the Balmer series.

Osterbrock (1989) is a possible source for the Lyα/Hα ratio. In the low-density limit, 68% of recombinations lead to Lyα emission, and 45% lead to H-α emission; see Dijkstra et al. (2014) or Dijkstra (2017). Combining this with the ratio of photon energies, we obtain a Lyα/Hα flux ratio of 8.2.

However, this is in the low-density limit, and collisional excitations at higher densities result in an enhanced ratio. Typical assumptions for the electron density fall within the range of ${n}_{e}={10}^{2}$–103 cm−3, and if we consult tables of line ratios as in Dopita & Sutherland (2003) (which Henry et al. (2015) consult for Hα/Hβ and Lyα/Hβ tables, and are synthesised from Storey & Hummer (1995)), we find that 8.2–9.1 is a reasonable range for the line ratio, given the density range at $T={10}^{4}$ K (and assuming case B recombination). The conventional value of 8.7 appears to have been chosen (or at least kept) as a happy intermediate.

The resulting intrinsic conversion between SFR and Lyα luminosity is

Equation (24)

This is the origin of our value for the numeric coefficient in Equation (4).

A.2. The Escape Fraction

The above conversion operates under the assumption that recombination balances photoionization within H ii regions. However, this does not include the possibility of ionizing radiation being absorbed by dust before it is able to trigger a photoionization event, or the possibility of recombination line emission being absorbed by dust. There is also the possibility of ionizing photons escaping the galaxy without a photoionization event (in which case it will likely trigger an event in the intergalactic medium) or being absorbed in H i regions without triggering recombination line emission.

In this model, we ignore the last two possibilities for simplicity but model the first two, in an abbreviated version of the sort of model found in Cai et al. (2014):

Equation (25)

where C is the value obtained in Equation (24), and both escape fractions are functions of SFR and redshift. For this work, as shown in Equations (4) and (5), we lump the two escape fractions together into a single effective escape fraction relative to the intrinsic Lyα prediction.

We take ${f}_{\mathrm{esc}}^{\mathrm{ion}}\sim {f}_{\mathrm{esc}}^{\mathrm{Ly}\alpha }$, effectively squaring one escape fraction of UV photons against dust. Note that the escape/absorption mechanisms for ionizing photons (${\lambda }_{\mathrm{rest}}\leqslant 912$ Å) and Lyα photons (${\lambda }_{\mathrm{rest}}=1216$ Å) are in fact different, and any correlation between the two is subject to large amounts of scatter. The escape fractions are at least around the same order of magnitude, however—see the numbers obtained through simulations in Yajima et al. (2014) and the dust attenuation factors assumed in Cai et al. (2014) (the dust optical depth at 1216 Å is assumed to be 1.08 times the dust optical depth at 1350 Å, which itself is assumed to be $1/\gamma \simeq 1.18$ times the dust optical depth of ionizing photons).

The escape fraction is highly contrived, to two ends:

  • 1.  
    It increases monotonically with redshift (converging to 1 as $z\to \infty $).
  • 2.  
    It decreases with higher SFR.

The latter is easier to justify—it is natural to associate higher SFR with more dust, and there is observational evidence for this correlation (Santini et al. 2014). However, this in turn makes the former more difficult to justify: cosmic SFR density evolves nonmonotonically with redshift—increasing up to $z\sim 3$ before showing a clear decline after $z\sim 2$ (Madau & Dickinson 2014)—suggesting that the redshift evolution of the escape fraction cannot be monotonic. However, (a) the scope of our modeling is limited to $z\gtrsim 2$, where the behavior may as well be monotonic, and (b) it may be possible for factors other than SFR to influence dust content in older/late-type galaxies, although there is considerable uncertainty around the latter.

Assigning specific numbers requires either a sophisticated simulation incorporating radiative transfer (like Yajima et al. 2014) or observational constraints. Two simulation forecasts influence our choice of form and approximate parameter values, with subsequent fine-tuning based on observational constraints:

  • 1.  
    Results in Yajima et al. (2014) show a median ${f}_{\mathrm{esc}}^{\mathrm{ion}}\sim 0.2$ evolving very weakly with redshift and a nonmonotonic evolution of ${f}_{\mathrm{esc}}^{\mathrm{Ly}\alpha }$, with median values ranging between ∼0.3 and ∼0.9. Scatter around the median is quite large, however.
  • 2.  
    Results in Garel et al. (2012) show a simulated escape fraction of near-unity for most simulated galaxies with SFR less than 1 ${M}_{\odot }$ yr−1. The distribution of ${f}_{\mathrm{esc}}(\mathrm{SFR})$ evolves strongly into a flat one with higher SFR, with an average of 21% for simulated galaxies with SFR greater than 10 ${M}_{\odot }$ yr−1.

We model the total escape fraction ${f}_{\mathrm{esc}}\equiv {f}_{\mathrm{esc}}^{\mathrm{ion}}{f}_{\mathrm{esc}}^{\mathrm{Ly}\alpha }$ as the squared product of a generalized logistic function in redshift and an algebraic function in SFR:

Equation (26)

This form combines S-shaped curves in each variable, and shows the desired asymptotic behavior discussed above. The function in redshift is an overall normalization between 0 and 1, with a characteristic redshift z0 acting as an inflection point of the redshift evolution, and ξ and ζ controlling the shape. Meanwhile, the function in SFR changes between f0 and 1 around a characteristic ${\mathrm{SFR}}_{0}$, with ς again controlling the shape.

A.3. Tuning and Comparison to Previous Work

With the model above (including 0.3 dex log-normal scatter in SFR and in Lyα luminosity), we are able to translate the analytic halo mass function fit from Behroozi et al. (2013a, 2013b) into simulated LFs at different redshifts. We use a brute-force technique, randomly drawing from the halo mass function and applying the Lyα model to the masses drawn, then binning the resulting luminosities to obtain an LF.

We tune the escape fraction parameter values based on comparing these simulated data to observed LFs at four different redshifts: $z\sim 0.3$ from Cowie et al. (2010), $z\sim 0.92$ from Barger et al. (2012), $z\sim 2.23$ from Sobral et al. (2017), and $z\sim 3.1$ from Gronwall et al. (2007). The resulting parameters are $\xi =1.6$, z0 = 3.125, $\zeta =1/4$, f0 = 0.18, ${\mathrm{SFR}}_{0}\approx 1.29\,{M}_{\odot }$ yr−1, and $\varsigma =0.875$. We match the higher-redshift data better than the lower-redshift data, suggesting that our model cannot completely describe the strong evolution of the LAE LF from $z\sim 0.3$ to $z\sim 2$ (which is expected, given the monotonic redshift evolution of the escape fraction, as discussed in the previous section). However, note also that there is support for a composite Schechter/power-law LF for low-to-intermediate redshift LAEs, and by and large we are trying to match only the Schechter part of this. We show a comparison of the simulated LFs to these observed data in Figure 10.

Figure 10.

Figure 10. Comparison of simulated Lyα LFs (dashed curves) to observed LAE LFs (solid curves) at four different redshifts, each from a different work: $z\sim 0.3$ from Cowie et al. (2010), $z\sim 0.92$ from Barger et al. (2012), $z\sim 2.23$ from Sobral et al. (2017), and $z\sim 3.1$ from Gronwall et al. (2007). We generate the simulated LFs using 107 random draws from the halo mass function, calculating a model Lyα luminosity for each mass and binning these into log-luminosity bins. The model is specifically tuned to match the four observed LFs as much as possible.

Standard image High-resolution image

Without further tuning, we also compare against LFs derived in Sobral et al. (2018) from a compilation of deep and wide LAE surveys (dubbed S-SC4K), and the plots in Figure 11 show a reasonable match up to $z\sim 5$.

Figure 11.

Figure 11. Comparison of simulated Lyα LFs (dashed curves) to S-SC4K LAE LFs (solid curves) at six different redshifts from Sobral et al. (2018). The latter are derived from a compilation of deep and wide LAE surveys. We generate the simulated LFs by taking 5 × 106 random draws from the halo mass function, calculating a model Lyα luminosity for each mass, and binning these into log-luminosity bins.

Standard image High-resolution image

Because the Chinchilla lightcones used in this work span z = 1.5–3.5, we can use these to simulate Lyα fluctuations and power spectra at $z\sim 2$ through the same methods used in the main work. We compare these in Figure 12 to power spectra in previous work in Pullen et al. (2014) (using only the halo contribution) and Fonseca et al. (2017), and find our model yields predicted power spectra squarely in between the two previous works.

Figure 12.

Figure 12. Comparison of the simulated Lyα spherically averaged 3D power spectrum P(k) at redshift 1.9–2.5 to simulated P(k) at similar redshifts in the previous works of Fonseca et al. (2017) and Pullen et al. (2014). In the case of Pullen et al. (2014), we use only the halo contribution, which is subdominant to IGM excitations in the model used in that work. Upper panel: P(k) values from Pullen et al. (2014) (cyan), Fonseca et al. (2017) (orange), and this work (faint black lines, one for each lightcone used). Lower panel: P(k) values from Pullen et al. (2014) (cyan), Fonseca et al. (2017) (orange) normalized by the P(k) values from each lightcone in this work (dubbed PStanford(k) in the plot).

Standard image High-resolution image

A.4. Minimum Halo Mass for Lyα Emission

As was the case for Li et al. (2016), our choice to assign no line luminosities to halos below ${10}^{10}\,{M}_{\odot }$ in virial halo mass is partly pragmatic. From the point of view of simulation constraints, because we use a cosmological N-body box whose dark matter particle mass is only $5.9\times {10}^{8}{h}^{-1}\,{M}_{\odot }=8.4\times {10}^{8}\,{M}_{\odot }$, the halo population is severely incomplete for ${M}_{\mathrm{vir}}\lesssim {10}^{10}\,{M}_{\odot }$. Unlike CO, however, Lyα emission does not require a particularly dusty or high-metallicity environment, so any physical mass cutoff for Lyα emission would likely be much lower than for CO emission.

Note, however, that this cutoff mainly affects our simulations of HETDEX line-intensity cubes, because the Lyα luminosity cutoffs used for our mock LAE catalogs correspond typically to halo masses well above a ${10}^{10}\,{M}_{\odot }$ cutoff. Furthermore, even considering line-intensity cubes, our model ${L}_{{\rm{L}}y\alpha }({M}_{\mathrm{vir}})$ relation falls off quite sharply for ${M}_{\mathrm{vir}}\lesssim {10}^{11}\,{M}_{\odot }$. Therefore, even if we assigned model luminosities to a well-represented population of halos with ${M}_{\mathrm{vir}}\lesssim {10}^{10}\,{M}_{\odot }$, they would likely not contribute significantly to the signal.

We show this in Figure 13 via analytic calculations of contributions to average line intensity from different ranges of halo masses. We calculate the line luminosity per volume ${{dL}}_{\mathrm{line}}/{dV}=\int {L}_{\mathrm{line}}(M)({dn}/{dM}){dM}$, where dn/dM is the halo mass function fit in Behroozi et al. (2013b) at the appropriate redshift, and convert this into observer quantities of CO brightness temperature and Lyα $\nu {I}_{\nu }$. For both CO and Lyα emission, the contribution to mean line intensity falls off below ${10}^{11}\,{M}_{\odot }$ in halo mass for both lines, but especially rapidly for Lyα emission; the analytic results suggest that we have captured a great majority of any expected signal using our cutoff halo mass of ${10}^{10}\,{M}_{\odot }$, at least for our assumed model.

Figure 13.

Figure 13. Expected contributions by halo mass to mean intensity of CO and Lyα emission at z = 2.80 (corresponding to the midpoint of the COMAP observing frequency band). We quantify the contribution from halo masses ${M}_{\mathrm{vir}}\in [M,M+{dM}]$ to average CO temperature as ${dT}/d(\mathrm{log}M)\propto {L}_{\mathrm{CO}}(M){dn}/d(\mathrm{log}M)$, and to average Lyα $\nu {I}_{\nu }$ as $d(\nu {I}_{\nu })/d(\mathrm{log}M)\propto {L}_{\mathrm{Ly}\alpha }(M){dn}/d(\mathrm{log}M)$. All calculations use the line-luminosity models in this work and the halo mass function fit from Behroozi et al. (2013b). The gray shaded area indicates the range of halo masses below the cutoff mass of ${10}^{10}\,{M}_{\odot }$ used in our simulations.

Standard image High-resolution image

Appendix B: Implementation of Log-scatter in Calculating Line Luminosities for Halos

As we describe in Section 3.2, we use scaling relations to convert a halo's virial mass ${M}_{\mathrm{vir}}$ and redshift z to SFR, and then to CO or Lyα line luminosity. We add log-normal scatter to halo properties at two points in the calculation:

  • 1.  
    We add log-normal scatter to SFR while preserving the linear mean $\mathrm{SFR}({M}_{\mathrm{vir}},z)$. In practice, this means that, for each halo, we calculate the expected mean SFR, then multiply this by a sample value from a log-normal distribution with a log-space standard deviation of ${\sigma }_{\mathrm{SFR}}=0.3$ (in units of dex) and a mean logarithm of $-{\sigma }_{\mathrm{SFR}}^{2}\mathrm{ln}10/2$. Thus, the mean logarithm is not equal to the logarithm of the linear mean SFR value, but rather $\mathrm{log}[\left\langle \mathrm{SFR}\right\rangle /({M}_{\odot }\,{\mathrm{yr}}^{-1})]-{\sigma }_{\mathrm{SFR}}^{2}\mathrm{ln}10/2$, which is necessary for the linear mean of the distribution to be the desired $\left\langle \mathrm{SFR}\right\rangle $.
  • 2.  
    We then add log-normal scatter to ${L}_{\mathrm{CO}}$ and ${L}_{{\rm{L}}y\alpha }$ in the same manner, multiplying the mean line luminosity from the ${L}_{\mathrm{IR}}$${L}_{\mathrm{line}}$ relation by a sample value from a log-normal distribution with a log-space standard deviation of ${\sigma }_{{L}_{\mathrm{line}}}$ (again in units of dex) but a mean logarithm of $-{\sigma }_{{L}_{\mathrm{line}}}^{2}\mathrm{ln}10/2$.

The way in which we implement log-scatter thus preserves the linear mean SFR for a given halo mass and redshift, and preserves the linear mean line luminosity for a given SFR and redshift. This is justified, as we take at least the halo mass–SFR and SFR–${L}_{\mathrm{CO}}$ relations from literature. However, this is not always the same as preserving the linear mean line luminosity for a given halo mass and redshift, and we will focus on the CO luminosity to illustrate this point.

The log-normal distribution with natural log mean μ and natural log standard deviation σ has linear mean $\exp (\mu +{\sigma }^{2}/2)$. However, in using our scaling relations, we want to preserve the linear mean of the dependent variable at each step. This means that, if we have y as a function of x and there is a mean relation $\left\langle y\right\rangle (x)$ and desired log-scatter ${\sigma }_{y}$ in units of dex or ${\sigma }_{y}\mathrm{ln}10$ in natural log space, simply drawing from a log-normal distribution with natural log mean $\mathrm{ln}[\left\langle y\right\rangle (x)]$ and natural log standard deviation ${\sigma }_{y}\mathrm{ln}10$ will result in a linear mean of $[\left\langle y\right\rangle (x)]\exp ({\sigma }_{y}^{2}{\mathrm{ln}}^{2}10/2)$.

Therefore, what we need to do to preserve the linear mean of $\left\langle y\right\rangle (x)$ is draw from a log-normal distribution with natural log mean $\mathrm{ln}[\left\langle y\right\rangle (x)]-{\sigma }_{y}^{2}{\mathrm{ln}}^{2}10/2$ and natural log standard deviation ${\sigma }_{y}\mathrm{ln}10$. What we do in practice is equivalently multiply $\left\langle y\right\rangle (x)$ by a random variable Zy drawn from a log-normal distribution with natural log mean $-{\sigma }_{y}^{2}{\mathrm{ln}}^{2}10/2$ and natural log standard deviation ${\sigma }_{y}\mathrm{ln}10$. It is common notation to express that Zy is drawn from such a log-normal distribution—or equivalently, that $\mathrm{ln}{Z}_{y}$ is drawn from a normal distribution with the appropriate mean and standard deviation—by writing $\mathrm{ln}{Z}_{y}\sim { \mathcal N }(-{\sigma }_{y}^{2}{\mathrm{ln}}^{2}10/2,{\sigma }_{y}\mathrm{ln}10)$.

Our fiducial model includes an $\mathrm{SFR}({M}_{\mathrm{vir}})$ relation (which also depends on redshift, but let us fix this for the time being) and a ${L}_{\mathrm{CO}}(\mathrm{SFR})$ relation, both of which we scatter separately. Then, for each halo i,

Equation (27)

and

Equation (28)

The $\left\langle {L}_{\mathrm{CO}}\right\rangle (\mathrm{SFR})$ relation specifically takes the form

Equation (29)

once we have combined all the relations between SFR, IR luminosity, CO luminosity in observer units, and CO luminosity in intrinsic units

Equation (30)

Equation (31)

Equation (32)

The overall offset in the log mean versus naïvely combining the relations in log space then comes out to be

Equation (33)

Note that this procedure, used for the work in our main text, should preserve the linear mean SFR for a given halo mass and the linear mean CO luminosity for a given SFR.

We now return to the idea of preserving the linear mean CO luminosity for a given halo mass and redshift, and how our fiducial model actually will not accomplish this. As we note in Section 3.2, we may describe the total log-scatter in ${L}_{\mathrm{CO}}$ with a total log-space standard deviation of ${\sigma }_{\mathrm{tot}}={({\sigma }_{\mathrm{SFR}}^{2}/{\alpha }^{2}+{\sigma }_{{L}_{\mathrm{CO}}}^{2})}^{1/2}$—where the exponent of the SFR–${L}_{\mathrm{CO}}$ power law scales the originally applied log-scatter in SFR by $1/\alpha $. We may then consider combining the average $\mathrm{SFR}({M}_{\mathrm{vir}},z)$ and ${L}_{\mathrm{CO}}(\mathrm{SFR})$ relations into a ${L}_{\mathrm{CO}}({M}_{\mathrm{vir}},z)$ relation and simply applying a single log-scatter of ${\sigma }_{\mathrm{tot}}$ (0.37 dex in our case) while preserving the linear mean ${L}_{\mathrm{CO}}$ for fixed ${M}_{\mathrm{vir}}$ and z. Then, for each halo,

Equation (34)

from which we would obtain

Equation (35)

Thus, our fiducial log-mean ${L}_{\mathrm{CO}}({M}_{\mathrm{vir}})$ offset of Equation (33) differs from the log-mean offset required to preserve the linear mean ${L}_{\mathrm{CO}}$ for a given halo mass, The difference in decimal log space between the right-hand sides of Equation (33) and Equation (35) is $-{\sigma }_{\mathrm{SFR}}^{2}\mathrm{ln}10(1/\alpha -1/{\alpha }^{2})/2$, corresponding to a multiplicative factor of $\exp [-{({\sigma }_{\mathrm{SFR}}\mathrm{ln}10)}^{2}(1/\alpha -1/{\alpha }^{2})/2]$. By separately preserving the linear mean SFR for a given halo mass and the linear mean CO luminosity for a given SFR, the linear mean CO luminosity for a halo mass is actually modified by this factor, relative to the expected value from combining the scaling relations with zero scatter.

For $\alpha =1.37$ and ${\sigma }_{\mathrm{SFR}}=0.3$ (in units of dex), the effect is quite small—the linear mean ${L}_{\mathrm{CO}}$ is 6% below what might be expected from combining the mean scaling relations. However, the effect increases exponentially with ${\sigma }_{\mathrm{SFR}}$, so for ${\sigma }_{\mathrm{SFR}}$ of 1.0 dex, the linear mean ${L}_{\mathrm{CO}}$ falls to half of what would be expected. This explains why, in Figure 5 of Li et al. (2016), the P(k) values at low k fall with increasing ${\sigma }_{\mathrm{SFR}}$ (although at the same time, increasing log-scatter in SFR also increases shot noise, which cushions the effect of not preserving the linear mean ${L}_{\mathrm{CO}}$ for a given ${M}_{\mathrm{vir}}$ value). Therefore, the details of the implementation of log-scatter become important if the scatter in SFR is high and the SFR–${L}_{\mathrm{CO}}$ power law is significantly sub- or superlinear.

Appendix C: Lyα Modeling Beyond This Work: Overview of Radiative Processes

The models used for CO and Lyα emission are both very simple models built on the galaxy–halo connection, assigning a luminosity to each halo identified in a dark matter simulation. This is already a significant simplification for CO emission, which depends on gas metallicity, AGN feedback, and other physical and environmental factors that a dark-matter-only simulation will not capture. The simplification is even more drastic in the case of Lyα emission, as its radiative transfer through the neutral gas of the circumgalactic and intergalactic media (CGM and IGM) alters observations beyond the simple escape fractions we posit.

  • 1.  
    Scattering in the CGM results in diffuse Ly-α halos or blobs, significantly increasing the total flux over radii of ∼10'' (Steidel et al. 2011). Because this diffuse surface brightness is extended and still relatively faint per solid angle, conventional targeted LAE surveys would not detect it, but line-intensity mappers like HETDEX may be able to.
  • 2.  
    Scattering in the IGM may result in anisotropic clustering observed in the Lyα intensity cube, as demonstrated in a simulation study from Zheng et al. (2011). An analysis by Croft et al. (2016) of Lyα intensity in galaxy spectra from the Baryon Oscillation Spectroscopic Survey (BOSS), cross-correlated with BOSS quasars, reports this effect. However, Croft et al. (2018) have since reported a non-detection of any cross-correlation signal against the Lyα forest and a lower quasar cross-correlation signal than first reported, and they no longer claim a quantitative measurement of anisotropic clustering. In addition, the results of another simulation study from Behrens et al. (2018) show a smaller anisotropy than was found in Zheng et al. (2011). IGM scattering may have a greater effect by smoothing small-scale fluctuations, potentially leading to a strong dependence of the power spectrum log-slope on the mean IGM neutral fraction; see Visbal & McQuinn (2018), showing this at $z\sim 7$.
  • 3.  
    Emission from excitations in the IGM could be an additional factor, but while Pullen et al. (2014) found this to be a dominant contributor to the Lyα intensity signal, Silva et al. (2013) and Comaschi & Ferrara (2016) did not.

Overall, radiative transfer significantly impacts the expected Lyα signal, and future forecasts should account for the effects discussed above through sophisticated modeling of Lyα radiative processes.

Appendix D: An Analytic Check on the Effect of Redshift Errors on Power Spectra

While we use ${W}^{2}(k)$ and ${W}_{z}^{2}(k)$ in the main text to describe attenuation of the auto and cross power spectra due to instrumental resolution, we may also use the same formalism to analytically calculate the expected attenuation of spectra due to redshift errors, by approximating the resulting effect on the galaxy density field as a simple convolution with a Gaussian profile. Because a discrete and relatively limited population of galaxies make up the density field, this is only an approximation—but at large scales, it will suffice.

Given the relevant comoving size ${\sigma }_{\parallel }$ of the Gaussian profile, we average the expected attenuation of $\exp (-{k}_{\parallel }^{2}{\sigma }_{\parallel }^{2})$ within each k-shell to find Wz(k). In the main text, we average across the discrete grid of ${\boldsymbol{k}}$ values that correspond to the discrete Fourier transform used to calculate the power spectra. However, in this section, we will obtain a closed-form expression for the attenuation with an analytic average, calculated across the full range of $\mu ={k}_{\parallel }/k$, which is the cosine of the spherical polar angle of ${\boldsymbol{k}}$. This ranges from −1 to 1, but the quantity averaged is an even function of μ, so

If we want to describe attenuation due to redshift errors that follow a Gaussian distribution with standard deviation ${\sigma }_{z}$ in redshift space, we would set ${\sigma }_{\parallel }\approx {\sigma }_{\parallel ,\mathrm{gal}}\equiv c{\sigma }_{z}/H(z)$, and the resulting ${W}_{z}^{2}(k)$ would describe attenuation of the galaxy auto spectrum due to redshift errors. As we discussed in Section 3.5, if COMAP has much finer redshift resolution than the galaxy survey, then we would set ${\sigma }_{\parallel }={\sigma }_{\parallel ,\mathrm{gal}}/\sqrt{2}$ to calculate the appropriate ${W}_{z}^{2}(k)$ for the CO-galaxy cross spectrum.

Thus, the expected attenuation of the galaxy density auto spectrum is

Equation (36)

and the analogous ${W}_{z,\mathrm{CO}\times \mathrm{gal}}^{2}(k)$ for the CO-galaxy cross spectrum is

Equation (37)

Because $\mathrm{erf}(x)/x\to 2/\sqrt{\pi }$ as $x\to 0$, both of the above should equal 1 for ${\sigma }_{z}=0$. However, once ${\sigma }_{\parallel ,\mathrm{gal}}=c{\sigma }_{z}/H(z)\,\gtrsim {k}^{-1}$, the auto spectrum attenuates significantly at the given k, and the cross spectrum does the same once ${\sigma }_{\parallel ,\mathrm{gal}}\gtrsim {2}^{1/2}{k}^{-1}$.

As we note in Section 3.5, the range of k represented in our simulations is ∼0.02 to 4 Mpc−1 (although we only plot k above ∼0.05 Mpc−1, as the lightcone-to-lightcone variance at $z\sim 0.02$ Mpc−1 is quite high), and $c/H(z=2.8)\sim {10}^{3}$ Mpc. Ergo, for the cross spectrum to decrease appreciably at $k\sim 1$ Mpc−1, we only require ${\sigma }_{z}/(1+z)\gtrsim 0.0004$ at z = 2.8, and it will start decreasing appreciably at the lowest scales simulated once ${\sigma }_{z}/(1+z)\gtrsim 0.02$ (and at the lowest scales plotted once ${\sigma }_{z}/(1+z)\gtrsim 0.01$).

The attenuation of the cross spectrum is, of course, different from the attenuation of r(k). Because the galaxy–galaxy auto spectrum is attenuated by ${W}_{z,\mathrm{gal}}^{2}$ and the line–galaxy cross spectrum by ${W}_{z,\mathrm{CO}\times \mathrm{gal}}^{2}$ (and the line–line auto spectrum by a comparatively negligible amount), r(k) is attenuated by a factor of

Equation (38)

which is approximately 1 up to $k{\sigma }_{\parallel }\simeq 1$, and $\approx {\pi }^{1/4}/{(k{\sigma }_{\parallel })}^{1/2}$ for $k{\sigma }_{\parallel }\gtrsim 3$.

We note again that all of this assumes a Gaussian smoothing of the galaxy density field, while what really happens is Gaussian scattering of discrete redshifts. Because galaxies and very bright CO emitters (the dominant source of the shot noise in the CO auto spectrum) are discrete objects, and we have here considered only continuous CO temperature and galaxy density contrast fields, this analytic calculation is only an approximation and breaks down particularly at high k. The power spectrum of the galaxy overdensity field goes to the inverse of the comoving galaxy density as $k\to \infty $ and Poisson noise dominates. Therefore, while redshift errors will attenuate the cross shot noise (which does require coincidence of the CO peaks and the galaxies), the shot-noise component of the galaxy auto spectrum will remain unchanged. This means that, at high k, the r(k) attenuation is simply ${W}_{z,\mathrm{CO}\times \mathrm{gal}}^{2}(k)\sim 1/(k{\sigma }_{\parallel })$.

We compare our analytic expectations to simulations in Figure 14, and find good agreement at low k. However, with the shot-noise component of the galaxy auto spectrum unattenuated, the simulated attenuation of r(k) is near the expression of Equation (36) at the lowest k values considered but quickly approaches ${W}_{z,\mathrm{CO}\times \mathrm{gal}}^{2}(k)$ instead for higher k.

Figure 14.

Figure 14. A comparison of analytic expectations of attenuation of CO-galaxy cross-correlation and galaxy autocorrelation against simulations. The mock galaxy sample is selected based on a minimum ${10}^{10}\,{M}_{\odot }$ stellar mass. The solid curves show median quantities for different redshift errors; the dotted curves in the upper panels show the median spectrum for ${\sigma }_{z}/(1+z)=0$ multiplied by the analytically calculated attenuation of the CO-galaxy cross spectrum from Equation (37) (upper left panel) and of the galaxy auto spectrum by Equation (36) (upper right panel) for each nonzero ${\sigma }_{z}/(1+z)$. In the lower panel, analytic expectations of attenuation of r(k) based on Equation (38) (dashed curves) and based only on Equation (37) (dashed–dotted curves) bracket the actual simulated results.

Standard image High-resolution image

Appendix E: Sensitivities for the Anisotropic Power Spectrum

A consideration of S/N for the anisotropic power spectrum, which is averaged in bins of k and $\mu ={k}_{\parallel }/k$ (the cosine of the spherical polar angle of ${\boldsymbol{k}}$), is important for several reasons:

  • 1.  
    The signal loss from galaxy redshift errors as considered in the main results of our paper should be highly anisotropic, disproportionately affecting line of sight modes (with large μ). (This should be the only source of significant anisotropy in our $P({\boldsymbol{k}})$, given that we ignore the peculiar velocities of halos in our lightcones.)
  • 2.  
    The beam attenuation calculated in Equation (19) implicitly assumes a power spectrum that is isotropic in all three dimensions, which will not be the case if the previous point holds. We may then be mistaken in our estimate of signal loss due to the beam.

We expect the signal loss from redshift errors to be significant enough that the above points should not affect the basic conclusions of this work, but consider calculation of P(k,μ) and sensitivities for one realization (for cross-correlation with M*-selected galaxy samples only, using the fiducial ${M}_{* }\,\gt {10}^{10}\,{M}_{\odot }$ cut).

We can readily generalize the expressions in the main body of this work for P(k, μ) instead of P(k), working with two-dimensional bins in k and μ (instead of binning only in k) to draw out line-of-sight anisotropies. If we consider the previous averaging of $P({\boldsymbol{k}})$ into P(k) to be in spherical shells of k, we now also divide ${\boldsymbol{k}}$-space into sectors based on values of μ, and the averaging of $P({\boldsymbol{k}})$ into P(k,μ) takes place within intersections of k-shells and μ-sectors. We still assume that the power spectrum is isotropic in the two transverse dimensions, so we can still separately average the beam attenuation as

Equation (39)

which is similar to Equation (19) in that it is an average of $\exp (-{k}_{\perp }^{2}{\sigma }_{\perp }^{2})$ (the expected attenuation of $P({\boldsymbol{k}})$ for any given ${\boldsymbol{k}}$) within a bin of discrete ${\boldsymbol{k}}$ corresponding to the discrete Fourier transform. However, this average is over all discrete ${\boldsymbol{k}}$ that fall within the spherical shell centered at k and the spherical sector corresponding to μ, with the same bins defined by these shell-sector intersections used for averaging of $P({\boldsymbol{k}})$ into P(k,μ). (Averaging the above across all μ would reproduce the ${W}^{2}(k)$ of Equation (19).)

E.1. An Approximate Analytic Example

Note in particular that, even with the same 3D $P({\boldsymbol{k}})$, there is no reason to expect the total signal-to-noise to be the same for P(k) and P(k, μ). For simplicity, take the CO auto spectrum (ignoring or absorbing ${W}^{2}(k)$) as an example. Suppose that we have nμ μ-bins in each k-shell, such that

Equation (40)

and ${N}_{m,i}(k)$ represents the number of modes in the μ-bin centered at ${\mu }_{i}$ falling within the k-shell centered at k. (From here on, summing over i always implies a sum over all nμ applicable values.)

In the main text, we calculate the spherically averaged spectrum P(k) and then the signal-to-noise ratio at each k (which is then added in quadrature over all k to obtain total S/N across all modes):

Equation (41)

where $P(k)=[{\sum }_{i}P(k,{\mu }_{i}){N}_{m,i}(k)]/{N}_{\mathrm{modes}}(k)$. We substitute and simplify to obtain

Equation (42)

However, if we took the S/N for each ${\mu }_{i}$ and then averaged, we would have

Equation (43)

It is difficult to see a way that the two can be generally equal. It is reasonable to approximate ${N}_{m,i}(k)\approx {N}_{\mathrm{modes}}(k)/{n}_{\mu }$ (intervals in μ are roughly linear with intervals in the polar angle up to $\mu \sim 0.5$, so the number of modes in each μ-bin should be similar), in which case

Equation (44)

and

Equation (45)

In the case that sample variance dominates our uncertainties, i.e., ${P}_{n}\ll P(k,{\mu }_{i})$,

Equation (46)

However, we now demonstrate that, in the extreme case where instrumental noise dominates the uncertainties and only one of nμ bins has a nonzero value of $P(k,{\mu }_{i})$, the signal-to-noise ratio is significantly higher for P(k,μ) than for P(k). (Note that it would be misleading to assume ${P}_{n}\ll P(k,{\mu }_{i})$ for all i when the right-hand side is zero for most i.) Take ${\mu }_{0}$ to be the bin with nonzero P(k,μ):

Equation (47)

and

Equation (48)

If ${P}_{n}\gg P(k,{\mu }_{0})$, as is typical in our simulated surveys,

Equation (49)

resulting in a difference of a factor of $\sqrt{{n}_{\mu }}$ in signal-to-noise at this k.

E.2. Results from One Realization

For one lightcone out of our 100, we simulate cross-correlations between a CO cube and galaxy sample with ${M}_{* }\gt {10}^{10}\,{M}_{\odot }$ for all ${\sigma }_{z}/(1+z)$ values considered in the main text. However, we now obtain the anisotropic P(k,μ), which we show in Figure 15. As anticipated, the signal loss with increasing ${\sigma }_{z}/(1+z)$ is highly anisotropic, and is greater for higher μ at any given k.

Figure 15.

Figure 15. The anisotropic CO-galaxy cross spectrum ${P}_{\mathrm{CO}\times \mathrm{gal}}(k,\mu )$ shown for one realization of COMAP cross-correlation against a galaxy sample with ${M}_{* }\gt {10}^{10}\,{M}_{\odot }$. Each panel shows the cross spectrum for the galaxy ${\sigma }_{z}/(1+z)$ value indicated above it.

Standard image High-resolution image

Table 6 also shows the total signal-to-noise ratios across all modes for this realization, comparing between P(k) and P(k,μ). The difference is notable for high values of ${\sigma }_{z}/(1+z)$, but not as great as the $\sqrt{{n}_{\mu }}$ in the simple calculation above (which would have been a factor of almost 8). Furthermore, as we show in Figure 16, the change in signal-to-noise with frequency resolution differs from what we show in Figure 5 for P(k). Working with P(k,μ), by definition, separates the line-of-sight modes from the transverse modes, and thus the different degrees of attenuation experienced due to redshift error. Thus, the only effect of decreasing the number of frequency channels in the survey volume is to decrease the number of modes averaged and consequently to increase uncertainties; the slight gain in P(k) signal-to-noise shown in Figure 5 for ${\sigma }_{z}/(1+z)\geqslant 0.01$ is absent in the P(k,μ) signal-to-noise curves in Figure 16. We thus consider our basic conclusion—that photometric errors significantly reduce any advantage in detection significance from cross-correlation—to be unchanged.

Figure 16.

Figure 16. A demonstration of the effect of COMAP line-of-sight resolution on the signal-to-noise ratio for auto and cross spectra—and specifically how it differs, for the anisotropic P(k, μ), from what we show for P(k) in Figure 5. We express frequency resolution here as the number of channels across the spectrometer bandwidth, and show how it affects total S/N over all scales ${\rm{S}}/{\rm{N}}={[{\sum }_{k}({\rm{S}}/{\rm{N}}{)}_{k}^{2}]}^{1/2}$ in one realization for simulated CO auto spectra and CO-galaxy cross spectra—both spherically averaged P(k) (dashed curves) and anisotropic P(k,μ) (solid curves)—for different galaxy ${\sigma }_{z}/(1+z)$ values. The simulated galaxy sample is selected with a minimum stellar mass of $\mathrm{log}({M}_{* ,\min }/{M}_{\odot })=10.0$. All signal-to-noise ratios are quoted for a single patch of 2.5 deg2 observed for 1500 hr; we may expect improvement by up to a factor of $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each.

Standard image High-resolution image

Table 6.  Total Signal-to-noise Ratio over All Modes for Spherically Averaged Power Spectra (${\rm{S}}/{{\rm{N}}}_{\mathrm{sph}}$) and Anisotropic Power Spectra (${\rm{S}}/{{\rm{N}}}_{\mathrm{aniso}}$) in One Realization

Power Spectrum ${\sigma }_{z}/(1+z)$ for galaxies ${\rm{S}}/{{\rm{N}}}_{\mathrm{sph}}$ ${\rm{S}}/{{\rm{N}}}_{\mathrm{aniso}}$
CO-galaxy cross $0.$ 33.2 37.7
CO-galaxy cross 0.0007 29.9 28.6
CO-galaxy cross 0.003 19.9 19.9
CO-galaxy cross 0.01 11.3 13.3
CO-galaxy cross 0.02 7.7 10.6
CO-galaxy cross 0.03 6.6 9.5
CO auto ... 4.7 5.0

Note. For the galaxy sample, $\mathrm{log}({M}_{* ,\min }/{M}_{\odot })=10$, with a galaxy count of $2.9\times {10}^{4}$ without redshift errors. All signal-to-noise ratios are still quoted for a single patch of 2.5 deg2 observed for 1500 hr; we may expect to see this increase by up to a factor of $\sqrt{2}$ if two equivalent patches are observed for 1500 hr each, and a further, roughly linear increase with more integration time.

Download table as:  ASCIITypeset image

Footnotes

  • 10 

    A 350 nm minimum wavelength (below which strong ozone absorption features exist; see Schachter 1991) is typical in ground-based optical spectrographs, setting the minimum redshift for ground-based LAE surveys beyond z ∼ 1 by necessity.

  • 11 

    While peculiar velocities do alter P(k) at the scales studied, the effect is at a factor of order unity (boosting the CO P(k) by 30%, at most, given the mass-averaged bias expected of CO emission), and is as such unlikely to weaken how well our different tracers cross-correlate with each other. Furthermore, we wish to obtain an even comparison to previous studies like Li et al. (2016) that also neglect peculiar velocities.

  • 12 

    Unlike thermal noise in the line-intensity maps, the shot noise in the galaxy density field emerges naturally from the simulation procedure outlined above, and it may be considered a component of the observed/simulated galaxy power spectrum (as is shot noise in the line-intensity power spectra).

  • 13 

    Some works—e.g., Li et al. (2016)—denote ${W}^{2}(k)$ as W(k). Here, we adopt the convention that $W({\boldsymbol{k}})$ refers to the window function applied to the Fourier-transformed field, not its squared magnitude.

  • 14 

    While this redshift error seems well-matched to the spectral resolution of the COMAP data ($R=\lambda /{\rm{\Delta }}\lambda ={\nu }_{\mathrm{obs}}/{\delta }_{\nu }\sim 2000$ corresponds to ${\sigma }_{z}/(1+z)\,=0.0005$), the COMAP channel width is in reality equivalent to a full width at half maximum, whereas the galaxy redshift errors have been described here as 1σ errors in each direction.

  • 15 

    The auto power spectrum signal-to-noise is also quite similar for the HETDEX-as-LIM survey and the HETDEX-as-LAE survey, with a signal-to-noise ratio of up to 113 for HETDEX LIM and 81 for HETDEX LAE before sparse sampling, even across the limited sky area and range of k considered in this work. With quarter-fill sparse sampling, this ratio drops to 104 for HETDEX LIM and 59 for HETDEX LAE.

  • 16 

    See also Gong et al. (2014) for a more direct masking-based rejection of [O ii] emission and a discussion of benefits and limitations of mitigation through cross-correlation. That work considers Lyα emission from $z\sim 7$, but the broad conclusions should be extensible to $z\sim 3$.

  • 17 
  • 18 

    See the rescalings of SFR–FUV conversion factors in Madau & Dickinson (2014) with different choices of IMF. These suggest that the right-hand side of Equation (1) should be multiplied by 1.6 if using the Salpeter IMF instead of the Chabrier IMF, and the right-hand side of Equation (23) multiplied by 1.5 if using the Kroupa IMF instead of the Salpeter IMF. By contrast, the conversion factors assuming the Kroupa IMF or the Chabrier IMF are within 6% of each other, which we are happy to deem subdominant to other modeling uncertainties.

Please wait… references are loading.
10.3847/1538-4357/ab0027