Euclid: Cosmology forecasts from the void-galaxy cross-correlation function with reconstruction

We investigate the cosmological constraints that can be expected from measurement of the cross-correlation of galaxies with cosmic voids identified in the Euclid spectroscopic survey, which will include spectroscopic information for tens of millions of galaxies over $15\,000$ deg$^2$ of the sky in the redshift range $0.9\leq z<1.8$. We do this using simulated measurements obtained from the Flagship mock catalogue, the official Euclid mock that closely matches the expected properties of the spectroscopic data set. To mitigate anisotropic selection-bias effects, we use a velocity field reconstruction method to remove large-scale redshift-space distortions from the galaxy field before void-finding. This allows us to accurately model contributions to the observed anisotropy of the cross-correlation function arising from galaxy velocities around voids as well as from the Alcock-Paczynski effect, and we study the dependence of constraints on the efficiency of reconstruction. We find that Euclid voids will be able to constrain the ratio of the transverse comoving distance $D_{\rm M}$ and Hubble distance $D_{\rm H}$ to a relative precision of about $0.3\%$, and the growth rate $f\sigma_8$ to a precision of between $5\%$ and $8\%$ in each of four redshift bins covering the full redshift range. In the standard cosmological model, this translates to a statistical uncertainty $\Delta\Omega_\mathrm{m}=\pm0.0028$ on the matter density parameter from voids, better than can be achieved from either Euclid galaxy clustering and weak lensing individually. We also find that voids alone can measure the dark energy equation of state to $6\%$ precision.


Introduction
Euclid (Laureijs et al. 2011) is an upcoming space mission that aims to explore the nature of dark matter and dark energy through observations of galaxy clustering and weak lensing. Clustering statistics will mainly be extracted from the spectroscopic dataset, which will contain accurate spectroscopic redshifts for tens of millions of galaxies, spanning a wide redshift range (0.9 ≤ z < 1.8) and covering almost a third of the sky. The primary analyses of such data usually focus on the galaxy auto-correlation statistics, in particular the two-point correlation function (2PCF) and its Fourier-space counterpart, the power spectrum (e.g. Gil-Marín et al. 2016; Philcox & Ivanov 2022), as well as N-point correlation functions (e.g. Wang et al. 2004; Nichol et al. 2006; Marín et al. 2013; Guo et al. 2015; Slepian et al. 2017). Other proposed statistics include marked correlation functions or power spectra (e.g. White 2016; Satpathy et al. 2019; Massara et al. 2023), counts-in-cells statistics (e.g. Yang & Saslaw 2011; Uhlemann et al. 2017), density-dependent clustering (e.g. Tinker 2007; Chiang et al. 2014; Bayer et al. 2021; Bonnaire et al. 2022; Paillas et al. 2023), anisotropic clustering (e.g. Paz et al. 2011; Kazin et al. 2012; Correa et al. 2019), wavelet-based methods (e.g. Valogiannis & Dvorkin 2022a,b), non-linear transformations of the galaxy field (e.g. Neyrinck et al. 2009; Carron & Szapudi 2014; Wolk et al. 2015), and density field reconstruction (e.g. Wang et al. 2022).
The anisotropic distribution of galaxies around voids is a particularly interesting observable (Lavaux & Wandelt 2012) that has been shown to be a valuable source of cosmological information, providing parameter constraints that are highly complementary to those obtained from galaxy clustering (e.g. Nadathur et al. 2019b; Hamaus et al. 2022; Woodfinden et al. 2022). Individual voids have irregular shapes and arbitrary orientations on the sky, but, given a large enough sample of voids, the assumption of statistical isotropy of the Universe (together with some additional assumptions discussed in more detail in Sect. 2) should give rise to a spherically symmetric distribution of galaxies around voids on average (Ryden 1995). Two effects that spoil this symmetry are distortions due to the Alcock-Paczyński (AP) effect (Alcock & Paczynski 1979), arising from the choice of the fiducial cosmological model used to convert observed galaxy redshifts and angles on the sky into distances, and redshift-space distortions (RSD) arising from the peculiar velocities of galaxies around these voids.
As voids are underdense environments, the RSD contributions to the anisotropy can be relatively successfully modelled using linear theory (Hamaus et al. 2014; Cai et al. 2016; Nadathur & Percival 2019; Paillas et al. 2021) without needing to exclude small scales. This provides constraints on the growth rate of structure parametrised by f σ8 or β (e.g. Paz et al. 2013; Hamaus et al. 2016; Hawken et al. 2017; Nadathur et al. 2019b; Achitouv 2019; Hawken et al. 2020; Aubert et al. 2022; Woodfinden et al. 2022). More importantly, Hamaus et al. (2015) and Nadathur et al. (2019b) show that the anisotropies from RSD contributions can be easily distinguished from those arising due to the AP effect. This allows the stacked void profile -- equivalent to the void-galaxy cross-correlation function (CCF) -- to be used as a 'standard sphere' for the AP test, as originally advocated by Lavaux & Wandelt (2012). This observable can then be used to measure the AP parameter, which is the ratio of the transverse comoving distance D_M(z) to the Hubble distance D_H(z) as a function of redshift. The constraints obtained on this quantity by applying this method to the BOSS and eBOSS surveys are a factor of 1.7 to 3.5 more precise than the constraints from galaxy clustering and the BAO in the same data (Hamaus et al. 2016; Nadathur et al. 2019b, 2020b). This precise AP measurement underpins the power of the void-galaxy CCF as a cosmological probe: Nadathur et al. (2020a) showed that the combination of voids and BAO measurements alone provides greater than 10σ evidence for late-time cosmic acceleration, independent of the cosmic microwave background or supernovae.
In this work we aim to forecast the constraints on the AP parameter D_M/D_H and the growth rate f σ8 that can be obtained from measurements of the void-galaxy CCF by Euclid. This paper is one of a set of four companion papers published on behalf of the Euclid Consortium, which forecast the expected constraints from cosmic voids using different observables. Contarini et al. (2022) explored the use of the void size function, Bonici et al. (2023) investigated the cross-correlation of voids with weak lensing, while this paper and Hamaus et al. (2022) examine different methods of studying the void-galaxy CCF. The primary difference between our work and that of Hamaus et al. (2022) is that we use a velocity-field reconstruction method to approximately remove large-scale galaxy RSD effects before running the void-finding pipeline (Nadathur et al. 2019a), in order to mitigate systematic errors due to anisotropic selection effects. On the other hand, instead of applying reconstruction, Hamaus et al. (2022) modify terms in the theoretical model used for the fits, and so measure a very different pattern of anisotropy in the CCF.
The structure of this paper is as follows: in Sect. 2 we provide a theoretical description of the void-galaxy cross-correlation function, as well as the AP test. In Sect. 3 we describe the Flagship galaxy mock and the reconstruction method we apply, which enables us to recover the approximate real-space galaxy positions. We also describe the void finder and detail the properties of the void catalogue. In Sect. 4 we describe our analysis pipeline: how we obtain the data vectors and translate them into constraints on the parameters of interest. Sect. 5 describes the results we obtain, which are then summarised and placed in a broader context in Sect. 6.

The void-galaxy cross-correlation function in redshift space
The void-galaxy CCF ξ represents the excess probability of finding a galaxy at a given separation from a void centre. We use the notation ξ^rr to denote this quantity when the cross-correlation is performed between the real-space positions of galaxies and the positions of voids identified in this real-space galaxy distribution. For convenience, we refer to ξ^rr as the real-space void-galaxy CCF. For the same fixed void centres, the cross-correlation with the redshift-space galaxy positions is then denoted ξ^rs, and will be referred to as the redshift-space CCF. Here r refers to real space, s to redshift space, the first superscript refers to the voids, and the second to the galaxies. We denote the vectors from the void centre to the galaxy in real and redshift space by r and s respectively, and decompose them into their components perpendicular and parallel to the line-of-sight direction, s = s_⊥ + s_∥. Thus s_⊥ = r_⊥, while

$s_{||} = r_{||} + \frac{v_{||}}{aH}\,,$ (1)

where a is the scale factor, H the Hubble rate, v_∥ is the component of the galaxy peculiar velocity along the line of sight (not the pairwise velocity; see e.g. Massara et al. 2022), and the line of sight is directed through the void centre.
Since, by construction, the number of void-galaxy pairs is conserved under the mapping between real and redshift space, ξ^rs and ξ^rr are related by a convolution with the (position-dependent) velocity distribution:

$1 + \xi^{rs}(\mathbf{s}) = \int \left[1 + \xi^{rr}(\mathbf{r})\right] P(v_{||}, \mathbf{r})\, {\rm d}v_{||}\,.$ (2)

This is recognisable as the streaming model (e.g. Peebles 1979), which was used to describe the void-galaxy CCF by Paz et al. (2013) and also, for example, by Cai et al. (2016) and Paillas et al. (2021). Although -- for reasons explained below -- we have chosen to hold fixed the void positions identified in the real-space galaxy field, Eq. (2) would be equally valid if both the CCFs were defined with respect to the void positions in the redshift-space galaxy field, that is if ξ^rs and ξ^rr were replaced by ξ^ss and ξ^sr respectively. Indeed this is the approach taken in a number of other papers (Paz et al. 2013; Hamaus et al. 2016; Achitouv 2017). However, the number of voids in a size-selected sample is not conserved under the RSD mapping (Chuang et al. 2017; Nadathur et al. 2019a; Correa et al. 2022), so Eq. (2) cannot be used to relate CCFs obtained from running void-finding independently in the two spaces, for example ξ^ss and ξ^rr.

We can view the galaxy peculiar velocities as being composed of a term describing a coherent velocity outflow from the void centre, plus a random velocity component along a random direction. Given the assumption of statistical isotropy in real space, when averaged over sufficiently large numbers of voids, the coherent outflow v(r) = v_r(r) r̂ must be directed radially outwards from the void centre and be spherically symmetric about it. Thus the line-of-sight component of the galaxy velocity can be written as

$v_{||} = v_r(r)\, \mu_r + \tilde{v}_{||}\,,$ (3)

where μ_r ≡ cos(θ) is the cosine of the angle θ between r and the line of sight, and ṽ_∥ is a zero-mean random variable describing the projection of the stochastic velocity component along the line of sight. Using Eq. (3) to change variables, Woodfinden et al. (2022) show that Eq. (2) is equivalent to

$1 + \xi^{rs}(\mathbf{s}) = \int \left[1 + \xi^{rr}(\mathbf{r})\right] J\, P(\tilde{v}_{||}, r)\, {\rm d}\tilde{v}_{||}\,,$ (4)

where

$J = \left[1 + \frac{v_r}{r a H} + \left(\frac{v_r'}{aH} - \frac{v_r}{r a H}\right)\mu_r^2\right]^{-1}$ (5)

is the Jacobian of the transformation from real to redshift space, prime denotes derivatives with respect to r, and P(ṽ_∥, r) now describes the distribution of ṽ_∥ about zero. Equation (4) is the same model described by Nadathur & Percival (2019). Several other works (e.g. Cai et al. 2016; Hawken et al. 2020; Aubert et al. 2022) use approximations to this expression, first by assuming a Dirac delta function form for the distribution P(ṽ_∥, r), which reduces the integral equation to

$1 + \xi^{rs}(\mathbf{s}) = \left[1 + \xi^{rr}(\mathbf{r})\right] J\,,$ (6)

and then approximating the square brackets in Eq. (5) by a series expansion, which one can choose to truncate at some desired order. Hamaus et al. (2020) introduce additional nuisance parameters to Eq. (6), and Hamaus et al. (2022) further modify some terms of the Jacobian expansion.
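As an illustration of the delta-function limit of the streaming model, Eq. (6), the following sketch evaluates ξ^rs(s_⊥, s_∥) for toy template profiles. The profile shapes, amplitudes and the value of aH are purely illustrative assumptions, not the Flagship templates used in this work; the implicit real-to-redshift-space mapping from Eq. (1) is solved by fixed-point iteration.

```python
import numpy as np

# Toy templates (illustrative assumptions, not measured Flagship profiles)
def xi_rr(r, rv=40.0, delta_c=-0.8):
    """Toy real-space void-galaxy CCF: deep centre, approaching 0 at large r."""
    return delta_c * np.exp(-(r / rv) ** 2)

def v_r(r, rv=40.0, vmax=150.0):
    """Toy coherent radial outflow profile [km/s]."""
    return vmax * (r / rv) * np.exp(-0.5 * (r / rv) ** 2)

def dv_r(r, rv=40.0, vmax=150.0):
    """Analytic derivative dv_r/dr of the toy outflow profile."""
    return (vmax / rv) * (1.0 - (r / rv) ** 2) * np.exp(-0.5 * (r / rv) ** 2)

def xi_rs(s_perp, s_par, aH=100.0, n_iter=20):
    """Delta-function streaming model, Eq. (6), for separations r > 0.
    Solves s_par = r_par + v_r(r) * mu_r / (aH) by fixed-point iteration,
    then applies the Jacobian of Eq. (5)."""
    s_perp = np.asarray(s_perp, dtype=float)
    s_par = np.asarray(s_par, dtype=float)
    r_par = np.array(s_par, copy=True)
    for _ in range(n_iter):
        r = np.hypot(s_perp, r_par)
        mu_r = r_par / r
        r_par = s_par - v_r(r) * mu_r / aH
    r = np.hypot(s_perp, r_par)
    mu_r = r_par / r
    # Eq. (5): Jacobian of the real-to-redshift-space mapping
    J = 1.0 / (1.0 + v_r(r) / (r * aH)
               + (dv_r(r) / aH - v_r(r) / (r * aH)) * mu_r**2)
    return (1.0 + xi_rr(r)) * J - 1.0
```

Along the transverse direction (μ_s = 0) only the v_r/(r aH) term of the Jacobian contributes, while at large separations both the toy profile and the velocity vanish, so ξ^rs reverts to ξ^rr ≈ 0.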
A key assumption for all models based on Eq. (4) is that the coherent velocity term v_r(r) is spherically symmetric and radially directed, which depends crucially on the assumption that in real space there is no preferred orientation direction for the large number of individually asymmetric voids. This follows from statistical isotropy only provided there is no orientation-dependent selection effect introduced in creating the void sample. Nadathur et al. (2019a) argued that exactly such an effect is introduced for voids identified in redshift space, since regions with velocity outflows oriented along the line-of-sight direction appear to have lower densities in redshift space. This means that elongated voids aligned along the line-of-sight direction are more likely to be selected as void candidates if void-finding is performed on the redshift-space galaxy positions.
This orientation-dependent bias has a large effect on the measured CCFs (Nadathur et al. 2019a; Correa et al. 2022; Paillas et al. 2023). The CCF with real-space galaxy positions, ξ^sr, becomes strongly anisotropic, the coherent velocity outflow v_r no longer has spherical symmetry, and the multipoles of the CCF with redshift-space galaxy positions, ξ^ss, have a very different shape. The loss of spherical symmetry in particular introduces significant challenges for the modelling approach described above. To avoid this problem, in this paper we do not consider the case of voids identified in the redshift-space galaxy field. As the real-space galaxy positions are in general not known, we used the reconstruction technique described in Sect. 3.2 below to approximately recover them before performing void-finding, and used these void positions to approximately measure ξ^rs and ξ^rr.

Modelling the velocity distribution
Given the stated assumptions, the convolution described by Eq. (4) -or, equivalently, Eq. (2) -is completely general and exact.In order to make specific predictions, however, we need to specify a model for the velocity field around voids to obtain the radial term v r (r) and the distribution P(ṽ || , r).
The radial outflow velocity v_r itself is commonly modelled (e.g. Cai et al. 2016; Nadathur et al. 2019b; Woodfinden et al. 2022) using a linearised form of the continuity equation (Peebles 1980):

$v_r(r) = -\frac{1}{3}\, f\, a H\, r\, \Delta(r)\,,$ (7)

where

$\Delta(r) = \frac{3}{r^3} \int_0^r \delta(r')\, r'^2\, {\rm d}r'$

is the fully non-linear enclosed density profile of dark matter, δ(r) = ρ/ρ̄ − 1 is the matter overdensity, and f is the linear growth rate. Nadathur et al. (2019b) found that in simulated galaxy catalogues the dark matter profile of voids approximately scales as Δ ∝ σ8, where σ8 describes the amplitude of the matter density perturbations on 8 h⁻¹ Mpc scales. When combined with Eq. (7), this implies a scaling v_r ∝ f σ8. Equation (7) has been found to give a reasonably good description of the matter velocity field around voids by some studies (Hamaus et al. 2014; Nadathur & Percival 2019). However, other works (Achitouv 2017; Paillas et al. 2021) have found that it can be necessary to add correction terms. Moreover, it has been shown by Massara et al. (2022) that for some void finders, when void-finding is performed on sparse tracers of the matter density field such as galaxies, an effective velocity bias can arise between the matter velocity, which satisfies Eq. (7), and the galaxy velocity appearing in Eq. (5). This can mean that even if Eq. (7) accurately describes the velocity field for matter, galaxy velocities can differ from this in the inner regions of voids. These considerations will need to be taken into account when improving the velocity modelling beyond that in Eq. (7). It is not our aim to attempt this task in this paper, but only to forecast the constraints from the void-galaxy cross-correlation function that could be obtained from such a model in the future. We therefore applied a template-based method that assumes only that the scaling inferred from Eq. (7) holds. That is, we measured the mean velocity profile v_r^sim(r) from the Flagship simulation as described in Sect. 4.2 below, and assume that the velocity in other cosmological models is given by

$v_r(r) = \frac{f\sigma_8}{\left(f\sigma_8\right)^{\rm sim}}\, v_r^{\rm sim}(r)\,.$ (8)

For the velocity distribution P(ṽ_∥, r), we assume a Gaussian form, which has been shown to be a good approximation by Nadathur & Percival (2019) and Paillas et al. (2021). As with v_r(r), we assume the width of the dispersion σ_v∥(r) is spherically symmetric about the void centre and take a template-based approach to its modelling. We measured a template dispersion profile σ_v∥^sim(r) from the Flagship simulation (Sect. 4.2). At large distances from the void centre, this profile asymptotes to a constant value (see e.g. Hamaus et al. 2015). To account for changes in other cosmologies, we renormalised this function by an amplitude factor σ_v, taken to be a free parameter to be marginalised over:

$\sigma_{v_{||}}(r) = \frac{\sigma_v}{\sigma_v^{\rm sim}}\, \sigma_{v_{||}}^{\rm sim}(r)\,,$ (9)

where σ_v^sim is the asymptotic amplitude of the template profile. In summary, the template-based approach to modelling the velocity field around voids requires the specification of two template functions, v_r^sim(r) and σ_v∥^sim(r), and depends on two free parameters, f σ8 and σ_v.
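A minimal numerical sketch of the linearised-continuity velocity, Eq. (7), together with the template rescalings of Eqs. (8) and (9). The density profile and the values of f, aH, and the fiducial amplitudes are assumed for illustration only and do not correspond to the measured Flagship templates.

```python
import numpy as np

# Toy matter density profile (an assumption for illustration)
r = np.linspace(0.1, 120.0, 600)          # separation grid [h^-1 Mpc]
delta = -0.8 * np.exp(-(r / 40.0) ** 2)   # delta(r) = rho/rho_bar - 1

def enclosed_density(r, delta):
    """Delta(r) = (3/r^3) * int_0^r delta(r') r'^2 dr' (trapezoid rule)."""
    integrand = delta * r**2
    cumul = np.concatenate(
        [[0.0], np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))])
    return 3.0 * cumul / r**3

def v_r_linear(r, delta, f=0.9, aH=100.0):
    """Eq. (7): v_r(r) = -(1/3) f aH r Delta(r). Outflow (v_r > 0) for voids."""
    return -f * aH * r * enclosed_density(r, delta) / 3.0

def v_r_template(r, v_sim, fs8, fs8_sim):
    """Eq. (8): rescale the measured template profile by f sigma_8."""
    return (fs8 / fs8_sim) * v_sim

def sigma_template(sigma_sim, sigma_v, sigma_v_sim):
    """Eq. (9): renormalise the dispersion template by the amplitude sigma_v."""
    return (sigma_v / sigma_v_sim) * sigma_sim
```

Because Δ(r) < 0 inside an underdense void, Eq. (7) predicts a positive (outward) v_r, and the whole profile simply scales linearly with f σ8 under the template assumption.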

The real-space correlation function
Although the velocity model has been specified as in Sect. 2.2, the RSD mapping of the void-galaxy correlation described by Eq. (4) only relates ξ^rs to its counterpart in real space, ξ^rr. Though some empirical fitting formulae have been used (Paz et al. 2013; Hamaus et al. 2014; Hawken et al. 2017), models which attempt to predict ξ^rr directly from the cosmological parameters do not exist at present. Previous studies have therefore taken different approaches to obtain the real-space CCF. Hamaus et al. (2020, 2022) apply an inversion algorithm (Pisani et al. 2014) to recover it from the redshift-space CCF projected along the line of sight (implicitly assuming that the real-space correlation function is spherically symmetric about the void centre). Nadathur et al. (2019b, 2020b) and Woodfinden et al. (2022) instead measure the average ξ^rr from a large number of mocks that match the galaxy clustering seen in the data sample: this is effectively a template-based method similar to that described for v_r(r) and σ_v∥(r) above.
In this paper, we use the reconstruction-based estimators described in Sect. 3.2 to estimate the real-space CCF ξ^rr(r) from the simulation data. While we have argued that, in the absence of orientation-dependent selection effects, statistical isotropy should imply spherical symmetry when measured in the true background cosmology (that is, ξ^rr(r) depends only on the separation r = |r|), this will in general not be true in the presence of the AP distortions discussed in Sect. 2.4, and could also be spoiled by imperfections in the reconstruction-based RSD removal step. Our implementation of Eq. (4) therefore uses the full anisotropic ξ^rr(r) without imposing an assumption of isotropy.
Our treatment of ξ rr here differs from that in Nadathur et al. (2019b) and Woodfinden et al. (2022) because, unlike in those works, we currently only have access to the single representative Flagship simulation here, so we cannot construct a template based on the mean of thousands of mocks (this situation will change by the time of the final Euclid data analyses).This means that our estimates of ξ rr are necessarily much noisier.As it is measured from the same data as the redshift-space correlation function ξ rs we wish to fit, correlations between the theory model and data vector may also be introduced, and we consider the implications of this in Appendix A.

Alcock-Paczyński effect
Besides RSD, another source of anisotropies in the observed void-galaxy cross-correlation function is the AP effect.This arises from the conversion of observed galaxy angular positions and redshifts to distances in order to apply the modelling developed above.This conversion requires the assumption of a fiducial cosmological model which may differ from the true background and thus lead to incorrectly inferred distances, introducing anisotropies.
To parameterise this effect, we introduce two AP scaling parameters describing the ratio of distances parallel and perpendicular to the line of sight to those in the fiducial model,

$\alpha_{||} = \frac{D_{\rm H}(z)}{D_{\rm H}^{\rm fid}(z)}\,, \qquad \alpha_\perp = \frac{D_{\rm M}(z)}{D_{\rm M}^{\rm fid}(z)}\,,$

where D_M(z) is the comoving angular diameter distance and D_H(z) = c/H(z) is the Hubble distance at redshift z. In this model the true void-galaxy separations are related to those measured in the fiducial cosmology by

$s_{||} = \alpha_{||}\, s_{||}^{\rm fid}\,, \qquad s_\perp = \alpha_\perp\, s_\perp^{\rm fid}\,,$

where the superscript fid denotes quantities calculated in the fiducial cosmology.
The parameters α_∥ and α_⊥ can also be recast into the following combinations:

$\alpha = \left(\alpha_{||}\, \alpha_\perp^2\right)^{1/3}\,,$

which describes an isotropic volume dilation, and

$\epsilon = \frac{\alpha_\perp}{\alpha_{||}}\,,$

describing anisotropic distortions. The dilation parameter α can be measured using a standard ruler, such as the BAO scale (e.g. Blake & Glazebrook 2003; Seo & Eisenstein 2003), by comparing a measured distance scale to one predicted by fundamental physics models. In our analysis, however, the calibrated template functions used depend on the void size cut applied to the sample. This cut is numerically identical to the one applied to the data. However, in general if the template cosmology differs from the truth, due to AP dilation the same numerical cut might not select the same population of voids, leading to a shift in the mean void size. Therefore, we conservatively choose to avoid using the observed void size as an absolute ruler, marginalising over values of α in order to reduce the possible effects of the template cosmology. To achieve this we isotropically rescaled all of the template profiles ξ^rr(r), v_r(r) and σ_v∥(r) with α, as described by Nadathur et al. (2019b). This rescaling removes all sensitivity of the model to α. The model then depends only on the AP distortion parameter ε, via its effect on the quadrupole and (to a lesser extent) hexadecapole moments.
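The AP parameter combinations can be computed directly from the background distances. The sketch below assumes flat ΛCDM with a simple trapezoid-rule integration for the comoving distance; the H_0 value and grid resolution are illustrative choices.

```python
import numpy as np

C_KMS = 299792.458  # speed of light [km/s]

def distances_flat_lcdm(z, Om, H0=67.0):
    """Comoving distance D_M and Hubble distance D_H = c/H(z) for flat LCDM,
    via trapezoid-rule integration of 1/E(z). In a flat universe D_M = D_C."""
    zz = np.linspace(0.0, z, 4096)
    Ez = np.sqrt(Om * (1.0 + zz) ** 3 + (1.0 - Om))
    integrand = 1.0 / Ez
    DM = (C_KMS / H0) * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(zz))
    DH = C_KMS / (H0 * Ez[-1])
    return DM, DH

def ap_parameters(z, Om_true, Om_fid, H0=67.0):
    """AP scalings: alpha_par = D_H/D_H^fid, alpha_perp = D_M/D_M^fid,
    dilation alpha = (alpha_par * alpha_perp^2)^(1/3), distortion
    epsilon = alpha_perp / alpha_par."""
    DM, DH = distances_flat_lcdm(z, Om_true, H0)
    DMf, DHf = distances_flat_lcdm(z, Om_fid, H0)
    a_par, a_perp = DH / DHf, DM / DMf
    return a_par, a_perp, (a_par * a_perp**2) ** (1.0 / 3.0), a_perp / a_par
```

When the assumed fiducial cosmology equals the truth, all four parameters reduce to unity and no AP anisotropy is introduced.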

Multipole decomposition
The model described above applies generally to the full three-dimensional correlation function, which may be written as ξ^rs(s, μ_s), where s = |s| and μ_s ≡ s_∥/s. In practice it is preferable to compress the information into a small number of Legendre multipoles,

$\xi_\ell^{rs}(s) = \frac{2\ell + 1}{2} \int_{-1}^{1} \xi^{rs}(s, \mu_s)\, P_\ell(\mu_s)\, {\rm d}\mu_s\,,$ (16)

where P_ℓ(μ) are the Legendre polynomials of order ℓ. Symmetry dictates that only the even multipoles are non-zero, and we restrict our attention to the monopole, quadrupole and hexadecapole moments (ℓ = 0, 2, 4) only, as these contain almost all of the measurable information in the correlation function.
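The multipole projection of Eq. (16) can be implemented directly with a trapezoid rule over a μ grid (a sketch; the grid resolution is an arbitrary choice):

```python
import numpy as np

def multipoles(xi_smu, mu, ells=(0, 2, 4)):
    """Eq. (16): xi_ell(s) = (2l+1)/2 * int_{-1}^{1} xi(s, mu) P_l(mu) dmu,
    with xi_smu of shape (n_s, n_mu) and mu a 1-d grid on [-1, 1]."""
    legendre = {0: np.ones_like(mu),
                2: 0.5 * (3 * mu**2 - 1),
                4: 0.125 * (35 * mu**4 - 30 * mu**2 + 3)}
    out = {}
    for ell in ells:
        integrand = xi_smu * legendre[ell]
        out[ell] = 0.5 * (2 * ell + 1) * np.sum(
            0.5 * (integrand[..., 1:] + integrand[..., :-1]) * np.diff(mu),
            axis=-1)
    return out
```

By the orthogonality of the Legendre polynomials, an input built from a monopole plus quadrupole term is decomposed back into exactly those amplitudes, with a vanishing hexadecapole.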
The real-space cross-correlation function ξ^rr can also be decomposed into multipoles analogously to Eq. (16). As mentioned in Sect. 2.3, a commonly used approximation, justified by the assumption of spherical symmetry in real space, is that only the monopole term ξ^rr_0 is non-zero. However, this assumption can be spoiled by the AP effect or imperfect reconstruction of real-space galaxy positions, so for generality we keep track of the higher-order multipoles of ξ^rr as well.

Galaxy catalogue
We used the Euclid Flagship galaxy mock (Castander et al., in preparation), which is constructed from an N-body simulation with 2 trillion particles of mass m_p ≈ 2 × 10⁹ h⁻¹ M_⊙. The simulation was run using PKDGRAV3 (Potter et al. 2017) in a box of side L = 3780 h⁻¹ Mpc, with a flat ΛCDM cosmology and matter density Ω_m = 0.319, baryon density Ω_b = 0.049, dark energy density Ω_Λ = 0.681, scalar spectral index n_s = 0.96, Hubble parameter h = H_0/(100 km s⁻¹ Mpc⁻¹) = 0.67, and RMS density fluctuations on 8 h⁻¹ Mpc scales σ8 = 0.83. These values were chosen to be very close to those obtained by Planck Collaboration et al. (2020). The halo catalogue was constructed on the fly, with haloes identified using Rockstar (Behroozi et al. 2013), and then populated with a halo occupation distribution (HOD) model which had been calibrated to reproduce observables such as galaxy luminosity and clustering statistics as a function of galaxy luminosity and colour. This was used to create a lightcone mock covering an octant (5157 deg²) of the sky, which was then cut to match the expected observed Hα flux f_Hα > 2 × 10⁻¹⁶ erg s⁻¹ cm⁻² and the expected redshift range 0.9 ≤ z < 1.8 for the Euclid spectroscopic galaxy sample (Laureijs et al. 2011; Costille et al. 2018). To simulate the expected sample completeness of Euclid, we randomly downsampled the final catalogue, retaining 60% of all objects.
For each mock galaxy in this catalogue we record both the true redshift (corresponding to the background Hubble expansion) and the observed redshift, which includes the Doppler effect due to the galaxy peculiar velocities. We also consider the effects of redshift errors by adding an additional random component drawn from a Gaussian with RMS σ_z = 0.001. The final resulting galaxy catalogue contains about 6.5 × 10⁶ mostly central galaxies. For the analysis described below we divided this sample into four non-overlapping redshift bins as illustrated in Fig. 1, chosen to match those used in Euclid forecasts by Euclid Collaboration: Blanchard et al. (2020). The effective redshift of each bin is generically defined as the weighted average redshift of void-galaxy pairs in the bin,

$z_{\rm eff} = \frac{\sum_{i,j} W_i\, w_j\, \left(Z_i + z_j\right)/2}{\sum_{i,j} W_i\, w_j}\,,$

where Z_i is the redshift of the ith void centre, z_j is the redshift of the jth galaxy, and W_i and w_j are optional weights associated with individual voids and galaxies when computing the CCF (as the Flagship mock does not include observational systematics, we do not use weights, that is, they are all set to unity). The effective redshifts of the different bins are z_eff = 1.0, 1.2, 1.4 and 1.64.
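The pair-weighted effective redshift can be sketched as below. Note that the exact pair-redshift definition (the mean of the void and galaxy redshifts, weighted by W_i w_j) is an assumption reconstructed from the surrounding text, and the brute-force double loop over all pairs is for illustration only; in practice only pairs within the correlation range would be counted.

```python
import numpy as np

def effective_redshift(Z_void, z_gal, W=None, w=None):
    """Weighted mean pair redshift z_eff over all void-galaxy pairs,
    with pair redshift (Z_i + z_j)/2 and pair weight W_i * w_j.
    All weights default to unity, as used for the Flagship mock."""
    Z_void = np.asarray(Z_void, dtype=float)
    z_gal = np.asarray(z_gal, dtype=float)
    W = np.ones_like(Z_void) if W is None else np.asarray(W, dtype=float)
    w = np.ones_like(z_gal) if w is None else np.asarray(w, dtype=float)
    pair_z = 0.5 * (Z_void[:, None] + z_gal[None, :])   # (n_void, n_gal)
    weights = W[:, None] * w[None, :]
    return np.sum(weights * pair_z) / np.sum(weights)
```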

Reconstruction
As discussed in Sect. 2, in order to avoid an orientation-dependent selection bias in the void sample that would invalidate the theory model used, void-finding must be performed on the real-space galaxy distribution without large-scale RSD effects. To achieve this, we used a velocity-field reconstruction method, related to the reconstruction commonly used in BAO analyses (Eisenstein et al. 2007; Padmanabhan et al. 2012), to approximately remove galaxy RSD before the void-finding step (Nadathur et al. 2019a). Reconstruction was performed using an iterative fast Fourier transform (FFT) algorithm (Burden et al. 2015), which solves the Zeldovich equation in redshift space for a Lagrangian displacement field Ψ,

$\nabla \cdot \Psi + \frac{f}{b}\, \nabla \cdot \left[\left(\Psi \cdot \hat{\mathbf{r}}\right) \hat{\mathbf{r}}\right] = -\frac{\delta_g}{b}\,,$ (18)

where f is the growth rate, b is the linear galaxy bias, and δ_g is the galaxy overdensity in redshift space. This step is implemented in the pyrecon package. It is performed on a regular grid, and densities estimated on the grid are first smoothed with a Gaussian kernel of width R_s before solving for the displacement. Given the solution

Fig. 2: Quadrupole moments of the galaxy-galaxy power spectrum in each redshift bin of the Flagship mock, measured in redshift space (yellow), real space (blue) and after applying reconstruction to remove RSD from the galaxy positions (red). Reconstruction was performed using the fiducial values of β^fid listed in Table 1 and using smoothing scales R_s of 8 and 9 h⁻¹ Mpc for the first two and last two bins, respectively. The similarity of the red and blue lines and their proximity to zero shows that reconstruction successfully removes the large-scale RSD-induced anisotropy in the galaxy positions.
to Eq. (18), individual galaxy positions are shifted by −Ψ_RSD evaluated at their locations, where

$\Psi_{\rm RSD} = f\, \left(\Psi \cdot \hat{\mathbf{r}}\right) \hat{\mathbf{r}}\,.$

This shift subtracts the estimated RSD contribution to the observed galaxy redshifts, in order to approximately recover the real-space galaxy field. The algorithm for solving Eq. (18) is the same as that used for BAO reconstruction (e.g. in the eBOSS analyses; Gil-Marín et al. 2020; Bautista et al. 2021), with the difference that the shifts applied to the galaxy positions only attempt to correct for large-scale RSD and not for non-linear evolution.
If we define a new variable ψ ≡ bΨ, Eq. (18) can be rewritten as

$\nabla \cdot \psi + \beta\, \nabla \cdot \left[\left(\psi \cdot \hat{\mathbf{r}}\right) \hat{\mathbf{r}}\right] = -\delta_g\,.$

The shifts applied to the galaxy positions in order to remove RSD are then Ψ_RSD = β (ψ · r̂) r̂, and it is clear that the output of our reconstruction procedure depends only on the parameter β and not on f and b individually. The fiducial growth rate for the Flagship cosmology can be calculated as f(z) = Ω_m^γ(z) (Linder 2005), where Ω_m is the matter density parameter and γ = 6/11 ≈ 0.55. To determine the fiducial linear bias b, we measured the galaxy power spectrum in real space, P_g(k), calculated the theoretical matter power spectrum P_m(k) for Flagship using CAMB (Lewis et al. 2000), and fit for b² = P_g(k)/P_m(k), using only large-scale modes at k < 0.07 h Mpc⁻¹.

Table 1: Summary properties of the galaxy and void catalogues used. For each redshift bin we indicate the approximate total number of galaxies N_g, the number of voids N_v after the void size cut R_v > R_cut was applied, and the fiducial values for the parameters f σ8, linear galaxy bias b, β = f/b and velocity dispersion amplitude σ_v. Void numbers refer to those in the perfect reconstruction case, and vary by a few percent in the other reconstruction cases.
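In the simplified distant-observer limit, where the line of sight is a fixed axis ẑ, writing ψ = ∇φ turns the displacement equation into an algebraic relation in Fourier space, −(k² + β k_z²) φ_k = −δ_k. The sketch below illustrates this principle only: the actual pipeline uses pyrecon's iterative solver with radial lines of sight, and the grid size, box size, and β value here are arbitrary choices.

```python
import numpy as np

def rsd_shift_plane_parallel(delta_g, beta, boxsize):
    """Line-of-sight RSD shift Psi_RSD,z = beta * psi_z on the grid, in the
    plane-parallel limit. With psi = grad(phi), the displacement equation
    nabla.psi + beta * d(psi_z)/dz = -delta_g becomes, in Fourier space,
    -(kx^2 + ky^2 + (1+beta) kz^2) phi_k = -delta_k."""
    n = delta_g.shape[0]
    delta_k = np.fft.rfftn(delta_g)
    kfull = 2 * np.pi * np.fft.fftfreq(n, d=boxsize / n)
    khalf = 2 * np.pi * np.fft.rfftfreq(n, d=boxsize / n)
    kx, ky, kz = np.meshgrid(kfull, kfull, khalf, indexing="ij")
    denom = kx**2 + ky**2 + (1.0 + beta) * kz**2
    denom[0, 0, 0] = 1.0                  # avoid 0/0 for the mean mode
    phi_k = delta_k / denom
    phi_k[0, 0, 0] = 0.0                  # zero-mean potential
    # Spectral derivative along the line of sight: psi_z = d(phi)/dz
    psi_z = np.fft.irfftn(1j * kz * phi_k, s=delta_g.shape)
    return beta * psi_z
```

For a single plane wave δ_g = A cos(k_z z) the solution is analytic, Ψ_RSD,z = −β A sin(k_z z)/[(1+β) k_z], which provides a direct check of the FFT implementation.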
The fiducial values b^fid and β^fid obtained in this way are reported in Table 1 for each redshift bin. However, the growth rate f is a parameter in any fit we perform, and so we do not want to fix β to this fiducial value, but instead to marginalise over it. To achieve this in a computationally feasible way, we precomputed the reconstruction on a grid of β values in the range [β^fid − 0.15, β^fid + 0.15] around the fiducial value. The data vectors obtained from each of these approximations to the real-space galaxy field consist of the monopole, quadrupole and hexadecapole (all measured over separations from 0 to 120 h⁻¹ Mpc), and are interpolated over when varying β in the analysis.
As the Zeldovich approximation breaks down on small scales, reconstruction also depends on the width of the Gaussian smoothing kernel, R_s, which dictates the smallest scales used. Reconstruction with a smoothing radius that is too small will over-correct the RSD, due to contributions from non-linear small-scale fluctuations, while a smoothing radius that is too large will fail to remove all of them. To determine the optimal smoothing radius, we computed the quadrupole of the post-reconstruction galaxy power spectrum, P_2(k), to use as a proxy. In real space this quantity is consistent with zero, while RSD introduce significant anisotropies in redshift space (Fig. 2). In each redshift bin we tested reconstruction with multiple smoothing radii at the fiducial value β^fid, and picked the one that results in a quadrupole roughly consistent with zero. This gives R_s = 8 h⁻¹ Mpc for the first two redshift bins, and R_s = 9 h⁻¹ Mpc for the last two.
The reconstruction method outlined here cannot perfectly remove all redshift-space effects in the galaxy distribution. We are interested in quantifying the impact of these inaccuracies, as well as those caused by redshift errors, on the void-finding and data analysis pipeline below. We therefore repeated our analysis pipeline in three different scenarios, labelled as:
1. perfect reconstruction, using the true real-space galaxy positions from the simulation;
2. realistic reconstruction, applying reconstruction to the observed redshifts to obtain approximate real-space galaxy positions;
3. realistic reconstruction with σ_z, applying reconstruction to observed redshifts including redshift errors.

Void finding
Voids were found in the real-space or post-reconstruction galaxy distribution using the voxel algorithm, implemented in REVOLVER (Nadathur et al. 2019c). voxel is a watershed-based void-finding algorithm that operates on a particle-mesh interpolation of the galaxy density field. In the particle-mesh assignment step, tracers (in our case galaxies) are first placed onto a grid to estimate the local density field, a step which is much faster than commonly used tessellation algorithms, and scales linearly with the number of tracers, O(N) (as opposed to O(N^{3/2}) for tessellation). The mesh size is determined by requiring each cubic grid cell, or voxel, to have a side length a_vox set by the approximate mean number density n̄ of the galaxies. As n̄ varies by a factor of about 5 over the Flagship redshift range, in order to maintain an appropriate resolution in the highest-density bin while keeping a reasonable number of voxels for computational efficiency, we performed the void-finding separately in each of the four redshift bins, using a different a_vox for each. The voxel algorithm (like all void-finding algorithms) unavoidably misses some voids near the redshift edges of the galaxy sample, so this analysis choice leads to the drops in void numbers around redshifts z = 1.1, 1.3 and 1.5 seen in Fig. 1. An alternative approach could involve the use of overlapping redshift bins, which would require a more delicate treatment of the correlations between redshifts. We have found that this does not improve our final constraints enough to justify the added complexity of the analysis.
In order to correct for the survey window and redshift-dependent or angular selection effects, this density estimate is normalised by the density of a random unclustered catalogue of points with the same window and selection effects imposed, projected onto the same grid. In addition to being significantly faster, this method of estimating the galaxy density field can therefore more easily account for complex survey masks and systematic effects than the popular tessellation-based density estimation methods used in other void-finding codes, such as those based on ZOBOV (Neyrinck 2008). For our purposes with the Flagship catalogue, we created a random catalogue with 50 times as many points as the galaxy catalogue, covering the Flagship octant and following the same redshift distribution $n(z)$.
The estimated galaxy density field is then smoothed with a Gaussian filter of smoothing length $r_{\rm s} = \bar{n}^{-1/3}$, in order to reduce shot-noise fluctuations on smaller scales. Voids are then identified as the sites of local minima in the smoothed density field, and their extents are determined using a watershed method, similar to that of ZOBOV, to create a final catalogue of non-overlapping voids. Each void is composed of a contiguous set of voxels, which define its volume. Individual voids have irregular shapes and orientations, but we define an effective void radius $R_{\rm v}$ as the radius of a sphere with the same volume,
$$R_{\rm v} = \left(\frac{3\,N_{\rm vox}\,V_{\rm vox}}{4\pi}\right)^{1/3},$$
where $N_{\rm vox}$ is the number of voxels in the void and $V_{\rm vox} = a_{\rm vox}^3$ is the voxel volume. The centre of each void is identified as the location of the lowest-density voxel within it, being the position of the density minimum. This results in voids with very few galaxies near the centre.
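The sequence above (particle-mesh assignment, Gaussian smoothing, identification of local minima, and conversion of voxel volume to an effective radius) can be sketched in Python. This is an illustrative toy, not the REVOLVER implementation: the function names and the exact voxel-size constant are our own assumptions, and the watershed step that assigns voxels to voids is omitted.

```python
import numpy as np
from scipy import ndimage

def find_density_minima(galaxies, box_size, nbar):
    """Toy sketch of the voxel pipeline: nearest-grid-point particle-mesh
    assignment, Gaussian smoothing with r_s = nbar^{-1/3}, and detection of
    local minima of the smoothed field as candidate void centres."""
    a_vox = 0.5 * nbar ** (-1.0 / 3.0)        # indicative voxel size choice
    n_grid = max(2, int(box_size / a_vox))
    edges = [np.linspace(0.0, box_size, n_grid + 1)] * 3
    counts, _ = np.histogramdd(galaxies, bins=edges)   # O(N) in tracers
    delta = counts / counts.mean() - 1.0               # density contrast
    cell = box_size / n_grid
    r_s = nbar ** (-1.0 / 3.0)                         # smoothing length
    delta_s = ndimage.gaussian_filter(delta, sigma=r_s / cell)
    # voxels equal to the minimum over their 3x3x3 neighbourhood
    is_min = delta_s == ndimage.minimum_filter(delta_s, size=3)
    return delta_s, np.argwhere(is_min)

def effective_radius(n_vox, a_vox):
    """Radius of the sphere whose volume equals n_vox voxels of side a_vox."""
    return (3.0 * n_vox * a_vox ** 3 / (4.0 * np.pi)) ** (1.0 / 3.0)
```

In the survey analysis the gridded density is additionally normalised by an unclustered random catalogue to correct for the window and selection function, a step not reproduced in this sketch.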
For each reconstruction scenario considered, the full void catalogues obtained in this way contain about 87 000 voids (with small variations in number) across all redshift bins. However, we expect that smaller voids will not be adequately described by our approach to modelling the coherent velocity outflow $v_r(r)$. This is due to a combination of factors: the dynamics of small voids are more likely to be dominated by the presence of neighbouring large density fluctuations, so that the isolated void model is inappropriate (see also Schuster et al. 2023); they are more likely to be segments of larger voids that appear artificially small because they extend outside the survey boundaries; and they are more likely to be entirely spurious detections arising from shot-noise fluctuations (Cousinou et al. 2019). Therefore, we applied a void size cut, excluding from the catalogue used for clustering measurements all voids with effective radius below a minimum threshold chosen for each redshift bin. Keeping only voids that survive this cut reduces the total numbers by about 40%. The radius cuts and the final sizes of the void catalogues are presented in Table 1.

Measuring the void-galaxy cross-correlation function
We measured correlation functions using the Python wrapper pycorr for Corrfunc (Sinha & Garrison 2020; Sinha & Garrison 2019). Corrfunc is a pair-counting engine that counts the number of pairs of voids and galaxies within a given distance and angular separation bin. This was then compared to the equivalent pair counts computed using random unclustered catalogues to obtain an estimate of the CCF using the Landy-Szalay estimator (Landy & Szalay 1993):
$$\xi(r, \mu) = \frac{DD_{12} - DR_{12} - RD_{12} + RR_{12}}{RR_{12}}. \quad (22)$$
Here $D$ and $R$ refer to the data and unclustered random catalogues respectively, subscripts 1 and 2 are used to differentiate between voids and galaxies, and each term $XY$ refers to the number of pairs of species $X$ and $Y$ in the $(r, \mu)$ bin, normalised relative to the total possible number of such pairs. Thus $DD_{12}$ refers to the normalised number of void-galaxy pairs, $DR_{12}$ to the normalised number of pairs of voids and galaxy randoms, and so on. We created the unclustered random catalogues as described in Sect. 3.3, such that they have 50 times as many points as the respective data samples, are uniformly sampled from the Flagship octant footprint, and follow the redshift distributions of voids and galaxies respectively (Fig. 1). We measured pair counts and correlation functions in 30 radial bins over $0 < r < 120\,h^{-1}$ Mpc and in 200 angular bins over $-1 < \mu < 1$, keeping distances in units of $h^{-1}$ Mpc rather than rescaling them in units of the void size, as is done in some void publications.
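As a minimal illustration of the estimator, it can be evaluated from raw pair counts as follows. The function and variable names are our own; pycorr/Corrfunc handle the actual pair counting.

```python
import numpy as np

def landy_szalay_cross(dd, dr, rd, rr, n_d1, n_d2, n_r1, n_r2):
    """Cross-correlation Landy-Szalay estimator from raw pair counts.

    dd, dr, rd, rr: raw pair counts per (r, mu) bin for data1-data2,
    data1-random2, random1-data2 and random1-random2 pairs (1 = voids,
    2 = galaxies). n_* are catalogue sizes, used to normalise each term
    by the total number of possible pairs."""
    DD = dd / (n_d1 * n_d2)
    DR = dr / (n_d1 * n_r2)
    RD = rd / (n_r1 * n_d2)
    RR = rr / (n_r1 * n_r2)
    return (DD - DR - RD + RR) / RR
```

For unclustered data the four normalised terms coincide and the estimator returns zero, as expected.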
It is important to note that shot-noise fluctuations in the galaxy random catalogue used in the mesh density estimation step during void finding (Sect. 3.3) will be slightly correlated with the positions of identified voids. We therefore used an independent realisation of the galaxy random catalogue (generated using the same method but with a different random seed) for measuring correlations via Eq. (22). If instead the same random catalogue were used for both, small artefacts could be propagated to the correlation $\xi$ in the regions very close to the void centres.
We used the estimator in Eq. (22), with the pair separations determined by the positions of voids and galaxies in different combinations of real and redshift space, to measure the different correlation functions relevant to our analysis. For instance, in the idealised case of perfect reconstruction, voids are identified in the true real-space galaxy field, and their cross-correlations with the real- and redshift-space galaxy positions give $\xi^{rr}$ and $\xi^{rs}$, respectively. On the other hand, when we use realistic reconstruction we use the superscript $p$ to distinguish positions in this post-reconstruction approximation of the real-space galaxy field from the true real-space positions. Voids are found in the post-reconstruction field and, as before, their positions are held fixed to allow measurements of $\xi^{ps}$ and $\xi^{pp}$ with the galaxies in redshift space and approximate real space, respectively.
Figure 3 shows the recovered monopole, quadrupole and hexadecapole moments of the true real-space void-galaxy CCF $\xi^{rr}$ in the first Flagship redshift bin, and their comparison with the equivalent multipoles of $\xi^{pp}$ in the realistic reconstruction scenarios with and without redshift errors. Reconstruction recovers the real-space CCF multipoles well, and anisotropies are negligible in all cases, as evidenced by the quadrupole and hexadecapole moments being close to zero. This demonstrates that our void sample selection successfully avoids the problem of orientation-dependent selection bias discussed in Sect. 2.

Constructing template functions
As discussed in Sect. 2, we adopt a template-fitting approach to the analysis, and so we require templates for the three functions $v_r(r)$, $\sigma_{v_\parallel}(r)$ and $\xi^{rr}(r)$. These were measured from the Flagship simulation data as follows.
To measure the radial velocity profile $v_r(r)$, we measured the mean radial component of galaxy peculiar velocities (given in the simulation box rest frame) in spherical shells around the void centre, where the average was taken over all void-galaxy pairs in that separation bin. We used 30 linearly spaced radial bins, up to $r = 120\,h^{-1}$ Mpc. The mean radial velocity profiles at different redshifts are shown in the top panel of Fig. 4. The velocity is positive over almost the entire radial range, corresponding to an outflow from the void centre as expected. The profile shows a peak in the vicinity of the edge of the void, and at larger radii it goes to zero, showing that there is no net bulk motion far outside of the void, as expected.

Fig. 3: Void-galaxy CCF in real space, measured in the Flagship simulation $z_{\rm eff} = 1.0$ redshift bin. From top to bottom, the panels show the monopole, quadrupole and hexadecapole moments of the CCF. The multipoles of $\xi^{rr}$, corresponding to the perfect reconstruction scenario and measured using the true real-space positions, are shown in blue, while light and dark red respectively show the multipoles of $\xi^{pp}$ obtained using reconstruction without and with redshift errors. The inset panels show the residuals with respect to the perfect reconstruction case $\xi^{rr}_\ell$ for each multipole. In all cases $\xi_2$ and $\xi_4$ are close to zero, indicating that reconstruction successfully removes orientation-dependent selection bias.
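The shell-averaged radial velocity measurement can be sketched with a brute-force pair loop. This is illustrative only (the function name is ours, and a tree-based pair search would be used at scale):

```python
import numpy as np

def radial_velocity_profile(void_centres, gal_pos, gal_vel, r_max=120.0, n_bins=30):
    """Mean radial component v_r(r) of galaxy peculiar velocities in
    spherical shells around void centres, averaged over all void-galaxy
    pairs falling in each linearly spaced separation bin."""
    edges = np.linspace(0.0, r_max, n_bins + 1)
    v_sum = np.zeros(n_bins)
    n_pair = np.zeros(n_bins)
    for c in void_centres:
        sep = gal_pos - c                        # vectors from void centre
        r = np.linalg.norm(sep, axis=1)
        mask = (r > 0) & (r < r_max)
        # radial (outward) component of each galaxy's peculiar velocity
        v_r = np.einsum('ij,ij->i', gal_vel[mask], sep[mask]) / r[mask]
        idx = np.digitize(r[mask], edges) - 1
        np.add.at(v_sum, idx, v_r)
        np.add.at(n_pair, idx, 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    with np.errstate(invalid='ignore'):
        return centres, np.where(n_pair > 0, v_sum / n_pair, 0.0)
```

A purely radial outflow of constant speed should be recovered exactly in every populated bin, which makes a convenient sanity check.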
We also measured the width of the line-of-sight velocity PDF, $\sigma_{v_\parallel}(r)$, from the distribution of the galaxy peculiar velocities in the same spherical shells. The full vector form of Eq. (3) can be written as
$$\mathbf{v} = v_r(r)\,\hat{\mathbf{r}} + \tilde{v}\,\hat{\mathbf{n}}, \quad (23)$$
where the stochastic velocity component has random magnitude $\tilde{v}$ and is directed along a random direction $\hat{\mathbf{n}}$. The quantity we wish to estimate is $\sigma^2_{v_\parallel}(r) = \langle \tilde{v}^2 (\hat{\mathbf{n}} \cdot \hat{\mathbf{r}})^2 \rangle$, and from Eq. (23) we obtain (Fiorini et al. 2022)
$$\sigma^2_{v_\parallel}(r) = \frac{1}{3}\left[\langle |\mathbf{v}|^2 \rangle - v_r^2(r)\right],$$
where $\langle |\mathbf{v}|^2 \rangle$ and $v_r(r)$ are quantities independent of $\mu_r$ and can be measured in radial bins. The bottom panel of Fig. 4 shows the normalised template functions for $\sigma_{v_\parallel}(r)$ measured for each of the four redshift bins.
The measured amplitudes at large distances, σ v || , are listed in Table 1, where we see that the dispersion is higher at lower redshift, consistent with the growth of large-scale structure.From the figure we also see that the dispersion decreases in the void interior, where the potential is steep and the bulk outflow motion dominates.
The final template function required is that for $\xi^{rr}(r)$. In the two realistic reconstruction scenarios we test, we assume that the true real-space information is not available, so that $\xi^{rr}$ cannot be directly measured. Instead we used the quantity $\xi^{pp}(r)$, which can be measured using the approximate real-space catalogues, as an estimator for $\xi^{rr}(r)$. This is the same approach as used by Nadathur et al. (2019b, 2020b) and Woodfinden et al. (2022), except that here we do not have a large number of mock catalogues over which to average this estimator. This greatly increases the noise in the estimate, which is especially relevant when attempting to determine the dependence of $\xi^{pp}$ on the reconstruction parameter $\beta$. For the current work we therefore decided to fix this estimator to $\xi^{pp}(\beta_{\rm fid})$ evaluated at the fiducial $\beta$ value given in Table 1. This approach will be revisited for a future full Euclid analysis, by which time we expect a full complement of mocks to be available. As mentioned in Sect. 2.3, our general code implementation seeks to preserve information on possible anisotropies in $\xi^{pp}(\mathbf{r})$ which may be introduced by the AP effect or by imperfections in reconstruction, without imposing the assumption of spherical symmetry, $\xi^{pp}(\mathbf{r}) = \xi^{pp}(r)$. In practice we achieve this by measuring the monopole $\xi^{pp}_0(r)$, quadrupole $\xi^{pp}_2(r)$ and hexadecapole $\xi^{pp}_4(r)$ moments and using them to reconstruct $\xi^{pp}(\mathbf{r})$. In the idealised perfect reconstruction scenario, things are much simpler: we use the true real-space information from Flagship and directly measure the multipoles $\xi^{rr}_0(r)$, $\xi^{rr}_2(r)$ and $\xi^{rr}_4(r)$ from the data.
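Rebuilding the full anisotropic function from its measured multipoles is a straightforward Legendre sum, $\xi(r,\mu) = \sum_\ell \xi_\ell(r) L_\ell(\mu)$ for $\ell = 0, 2, 4$. A minimal sketch (function name is ours):

```python
import numpy as np
from scipy.special import eval_legendre

def ccf_from_multipoles(xi0, xi2, xi4, mu):
    """Rebuild xi(r, mu) from its ell = 0, 2, 4 Legendre multipoles,
    preserving any anisotropy carried by the quadrupole and hexadecapole.

    xi0, xi2, xi4: arrays of shape (n_r,); mu: array of shape (n_mu,).
    Returns an array of shape (n_r, n_mu)."""
    xi_l = np.stack([xi0, xi2, xi4])                               # (3, n_r)
    legendre = np.stack([eval_legendre(l, mu) for l in (0, 2, 4)]) # (3, n_mu)
    return np.einsum('lr,lm->rm', xi_l, legendre)
```

Since $L_\ell(1) = 1$ for all $\ell$, the value along the line of sight is simply the sum of the multipoles, which provides an easy consistency check.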

Model fitting and parameter inference
We compare the model to the measured redshift-space CCF in terms of the compression to multipoles. In each redshift bin, we define the data vector $\boldsymbol{\xi}^s \equiv \left(\xi^{rs}_0, \xi^{rs}_2, \xi^{rs}_4\right)$, formed by concatenating the monopole, quadrupole and hexadecapole moments of $\xi^{rs}(s, \mu_s)$, measured in 30 radial bins. In the two realistic reconstruction scenarios, where the true real-space information is not used, we instead use the CCF $\xi^{ps}(s, \mu_s)$, measured using the void positions identified in the post-reconstruction, approximate real-space galaxy distribution, to construct $\boldsymbol{\xi}^s$. In this case the data vector inherits a dependence on the reconstruction parameter $\beta$ from the void positions.
To determine the uncertainty in this measurement of $\boldsymbol{\xi}^s$, we used a jackknife resampling method to estimate the covariance matrix. For each of the four redshift bins we divided the Flagship data sample into $N_{\rm JK} = 900$ non-overlapping sub-regions of equal volume. We iterated over all sub-regions, in each case estimating the cross-correlation function $\boldsymbol{\xi}^{s(k)}$ by removing all the void-galaxy pairs for which both void and galaxy lie in the $k$th sub-region, and reweighting those where either void or galaxy lies within the sub-region using the matched weighting scheme prescribed by Mohammad & Percival (2022) and implemented in pycorr. We then used these realisations to estimate the covariance between the $i$th and $j$th bins of the data vector,
$$C_{ij} = \frac{N_{\rm JK} - 1}{N_{\rm JK}} \sum_{k=1}^{N_{\rm JK}} \left(\xi^{s(k)}_i - \overline{\xi^{s}_i}\right)\left(\xi^{s(k)}_j - \overline{\xi^{s}_j}\right),$$
where $\overline{\xi^{s}_i} = \frac{1}{N_{\rm JK}}\sum_{k=1}^{N_{\rm JK}} \xi^{s(k)}_i$ is the mean of the $N_{\rm JK}$ jackknife realisations. The correlation structure of the resultant covariance matrix is shown in Fig. 5. Due to the limitations of having only the single Flagship mock, there will be an additional correlation between the model prediction and the measured data vector that has been neglected in this estimate of the covariance but is discussed in Appendix A.
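Given the delete-one realisations, the covariance estimate itself is a few lines. A sketch assuming the realisations have already been measured (the matched pair-weighting is handled by pycorr and not reproduced here):

```python
import numpy as np

def jackknife_covariance(xi_jk):
    """Jackknife estimate of the covariance of a data vector.

    xi_jk: array of shape (N_JK, n_bins), one delete-one realisation per
    jackknife region. Returns the (n_bins, n_bins) covariance matrix with
    the standard (N_JK - 1)/N_JK jackknife prefactor."""
    n_jk = xi_jk.shape[0]
    diff = xi_jk - xi_jk.mean(axis=0)
    return (n_jk - 1) / n_jk * (diff.T @ diff)
```

The result is symmetric and positive semi-definite by construction, which is worth verifying numerically before inverting it in a likelihood.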
At any point in parameter space, we computed the model prediction for $\boldsymbol{\xi}^s$ as described in Sect. 2 using the public code victor. We assume that the likelihood has a Gaussian form,
$$\mathcal{L} \propto \exp\left[-\frac{1}{2}\left(\boldsymbol{\xi}^s - \boldsymbol{\xi}^{s,\rm th}\right)^{\sf T} \mathsf{C}^{-1} \left(\boldsymbol{\xi}^s - \boldsymbol{\xi}^{s,\rm th}\right)\right],$$
where $\boldsymbol{\xi}^{s,\rm th}$ is the model prediction and $\mathsf{C}$ the covariance matrix. In the realistic reconstruction scenarios we consider, this likelihood evaluation depends on four parameters: explicitly on $f\sigma_8$, $\sigma_{v_\parallel}$ and $\epsilon$, and implicitly on the reconstruction parameter $\beta$ via its effect on the recovered void positions in the post-reconstruction field, and thus on the measured correlations. In the idealised case of perfect reconstruction there is no $\beta$ dependence. We sampled the full parameter space using the interface between victor and the Markov chain Monte Carlo (MCMC) sampling code Cobaya (Torrado & Lewis 2021). Repeating reconstruction, void-finding and CCF measurement at each $\beta$ value along the chain would be too slow for the MCMC, so instead we followed the method of previous works (Nadathur et al. 2019b, 2020b) and pre-computed the correlations on a grid of $\beta$ values around the fiducial $\beta_{\rm fid}$, interpolating the data vector at the time of evaluation of the likelihood. To avoid extrapolating outside of this grid, the prior on $\beta$ was limited to the same range as the interpolation; we used uninformative flat priors for all other parameters. Convergence of the MCMC chains was checked using the Gelman-Rubin $R - 1$ statistic, with the convergence criterion $R - 1 < 0.01$, and 20% of the initial steps were discarded as burn-in.
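A schematic of the likelihood evaluation with a pre-computed $\beta$ grid, as a simplified stand-in for the victor/Cobaya machinery (function names are ours, and we use simple bin-by-bin linear interpolation, whereas the actual interpolation scheme may differ):

```python
import numpy as np

def interpolate_data_vector(beta, beta_grid, xi_grid):
    """Interpolate the pre-computed data vector over the beta grid,
    bin by bin. xi_grid has shape (n_beta, n_bins)."""
    return np.array([np.interp(beta, beta_grid, xi_grid[:, i])
                     for i in range(xi_grid.shape[1])])

def log_likelihood(data_vec, model_vec, cov_inv):
    """Gaussian log-likelihood ln L = -0.5 (d - m)^T C^{-1} (d - m),
    up to an additive normalisation constant."""
    resid = data_vec - model_vec
    return -0.5 * resid @ cov_inv @ resid
```

Restricting the prior on $\beta$ to the grid range, as done in the analysis, guarantees that `np.interp` never extrapolates.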

Fits to Flagship data
In Fig. 6 we show the measured monopole, quadrupole and hexadecapole moments of the redshift-space CCF $\xi^{ps}$ and the predictions of the best-fit model, together with the residuals, in each of the redshift bins for the realistic reconstruction case and for the catalogue with redshift errors. It can be seen that the model describes the observed multipoles well, although the hexadecapole moment does not contain much information except in the lowest redshift bin, where the density of galaxies and voids is highest. This is the most realistic of the scenarios considered, but a similar quality of fit was obtained for the idealised perfect reconstruction scenario and when neglecting redshift errors.

Table 2: Marginalised posterior constraints on the AP distortion parameter $\epsilon$ and growth rate $f\sigma_8$ obtained from fits to the Flagship mock data in four redshift bins. We present results in three scenarios: idealised perfect reconstruction (using true real-space galaxy positions), realistic reconstruction, and realistic reconstruction performed on a catalogue with added redshift errors. Fits are performed using the covariance $\mathsf{C}^{\rm data}$ estimated for Flagship. The true value of $\epsilon$ is 1, as the data are analysed in the Flagship fiducial cosmology. The true values of $f\sigma_8$ in each redshift bin are as listed in Table 1.

The marginalised one-dimensional posterior constraints on the growth rate $f\sigma_8$ and the AP distortion parameter $\epsilon$ individually obtained from analysis of the Flagship mock are summarised in Table 2. For the idealised case of perfect reconstruction, we recover unbiased constraints on both $f\sigma_8$ and $\epsilon$ in all redshift bins. We find a measurement precision on $\epsilon$ of about 0.5-0.6% in each individual redshift bin, while the precision on $f\sigma_8$ varies between $\sim 8.5\%$ for the lowest redshift bin and $\sim 12\%$ in the two highest redshift bins. Using realistic reconstruction and adding redshift errors to the mocks does not significantly degrade these constraints, and the central values recovered are consistent to within the stated uncertainty in each case. The reconstruction method used is therefore not causing a significant loss of information.
The marginalised 1D and 2D posteriors on f σ 8 and ϵ are shown in Fig. 7, where the dashed crosshairs indicate the fiducial values for the Flagship cosmology, which are consistently recovered.Comparing the perfect and realistic reconstruction contours, the largest shift (of ∼ 1σ) is seen in the lowest redshift bin.
The introduction of redshift errors does not cause significant additional shifts in any redshift bin.
Figure 8 shows the marginalised 1D measurements of the growth rate and of the ratio of the transverse comoving distance $D_{\rm M}$ to the Hubble distance $D_{\rm H}$, derived from the $\epsilon$ constraints, from the Flagship data as a function of redshift. These measurements are compared to the 68.3% (1σ) and 95.5% (2σ) confidence intervals on these quantities derived from extrapolating the CMB constraints from Planck Collaboration et al. (2020) down to these redshifts, assuming a ΛCDM cosmological model. Results for $\epsilon$ itself are shown in the inset plot.

Full Euclid volume
The planned final footprint of Euclid is $15\,000$ deg$^2$, almost three times larger than that of the Flagship mock used in the analysis above. The constraints we expect from using the full Euclid volume will therefore be correspondingly better. A simple scaling of the covariance matrix suggests that it should shrink proportionally to the increase in the volume, that is, by a factor of 2.91. This corresponds to measurement uncertainties on the binned multipole moments that should be over 40% smaller than those shown in Fig. 6. We first confirmed this scaling with survey volume by computing covariance matrices for eight volume subsections of the Flagship simulation, ranging from 0.125 to 0.875 of the full volume, and verified that the terms of $\mathsf{C}$ scale roughly linearly with the survey volume. We therefore obtained the expected covariance matrix for the full Euclid survey by scaling the one obtained in Sect. 4.3 by a factor of 1/2.91.

Table 3: Forecast marginalised posterior constraints on $\epsilon$ and $f\sigma_8$ from the full Euclid survey, obtained from fitting a synthetic data vector using an appropriately scaled data covariance matrix. Uncertainties in the two parameters are correlated, with correlation coefficient $\rho$.
To test how this reduction in uncertainties in the measurement of the void-galaxy CCF propagates through to the recovered cosmological parameter constraints, we repeated the parameter inference step with this rescaled covariance, using a synthetic data vector generated assuming this level of measurement noise. To do this, we considered the idealised perfect reconstruction scenario. Using the real-space CCF multipoles $\xi^{rr}_\ell(r)$ measured from Flagship, we computed the fiducial model prediction, $\boldsymbol{\xi}^{s,\rm th}$, for the redshift-space data vector. We then created a synthetic data vector
$$\boldsymbol{\xi}^{s,\rm mock} = \boldsymbol{\xi}^{s,\rm th} + \mathsf{L}\mathbf{Z},$$
where $\mathsf{L}\mathsf{L}^{\sf T} = \mathsf{C}^{\rm Euclid}$, $\mathsf{L}$ is obtained by a Cholesky decomposition of the scaled covariance matrix $\mathsf{C}^{\rm Euclid}$, and $\mathbf{Z}$ is a vector of independent standard normal random variables. We then repeated the parameter inference of Sect. 4.3, fitting to $\boldsymbol{\xi}^{s,\rm mock}$ using the rescaled covariance matrix. The results of this fitting procedure provide the forecast for Euclid constraints from the void-galaxy CCF measurement, and form the main result of this paper. The forecasts are summarised in Table 3 and shown as a function of redshift in Fig. 9, compared to existing constraints from the application of the same void-galaxy method to current data.

Fig. 10: Forecast constraints on the matter density parameter $\Omega_{\rm m}$ and the dark energy equation of state $w$ in a $w$CDM cosmological model from a measurement of the void-galaxy CCF with Euclid as outlined in this work (red contours), corresponding to $w = -1.00^{+0.06}_{-0.05}$. We also show previous forecasts for the Euclid void-galaxy CCF, obtained using a different model and measurement method, by Hamaus et al. (2022), for the 'independent' (green contours) and 'calibrated' (blue, labelled 'cal.' here) cases in that paper, corresponding to whether nuisance parameters are marginalised over or fixed. For context, we show current BAO and 'full-shape' measurements in the SDSS galaxy, quasar and Lyman-α datasets from the MGS, BOSS and eBOSS surveys (yellow) presented by Cuceu et al. (2023).

As expected, we recover the fiducial cosmological parameters to within the 68.3% (1σ) statistical uncertainty in each redshift bin. Comparing Table 2 and Table 3 shows that while the forecast uncertainty in $\epsilon$ scales approximately as the $1/\sqrt{2.91}$ factor expected from the simple volume scaling, the forecast constraints on $f\sigma_8$ are slightly more pessimistic. Our final results suggest that with Euclid voids we will be able to measure the distance ratio $D_{\rm M}/D_{\rm H}$ to a precision of 0.3-0.4% in four redshift bins over the range $0.9 \leq z < 1.8$. This is comparable even to the extrapolated constraints from Planck, which assume ΛCDM. Voids will also provide a precision of about 5-8% in measurement of the growth rate over these redshifts, a large increase in the statistical power of this method compared to current surveys, and a useful consistency check for results from galaxy clustering. Fig. 9 highlights the key point that AP constraints from voids will be competitive with those derived from model-dependent extrapolations to low redshift from the CMB (and will exceed those that can be obtained from galaxy clustering alone). AP measurements therefore present the best potential for information gain from using the void-galaxy CCF. On the other hand, while the constraints on $f\sigma_8$ are weaker than those that can be obtained from galaxy clustering alone and do not add much cosmological information, they can serve as useful consistency checks.
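The synthetic data vector construction used for this forecast, drawing correlated Gaussian noise via a Cholesky factor of the volume-rescaled covariance, can be sketched as follows (function name is ours):

```python
import numpy as np

def synthetic_data_vector(xi_th, cov, seed=0):
    """Draw a noisy synthetic data vector xi_mock = xi_th + L Z, where
    L L^T = cov is the Cholesky factorisation of the covariance and Z
    is a vector of independent standard normal random variables."""
    L = np.linalg.cholesky(cov)
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(xi_th.size)
    return xi_th + L @ z

# Rescaling the covariance by the volume ratio (2.91 for the full Euclid
# footprint relative to the Flagship octant) before drawing:
# cov_euclid = cov_flagship / 2.91
```

Averaged over many seeds, the draws are unbiased around the fiducial model, with scatter per bin set by the diagonal of the covariance.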

Cosmology forecasts
We now investigate the impact on cosmological model forecasts of the expected measurement precision that can be achieved with Euclid voids, summarised in Table 3. Euclid Collaboration: Blanchard et al. (2020) have presented forecasts for the constraints on model parameters that can be achieved with the primary Euclid probes, namely spectroscopic galaxy clustering, weak lensing, photometric galaxy clustering, and their cross-correlations. In this work, we compare voids to these benchmarks as a separate standalone probe, and do not investigate the combination of voids with other observables (see, for instance, Nadathur et al. 2019b, 2020b; Bonici et al. 2023 for examples of such combinations).
In this scenario, voids probe cosmology through the AP measurement of $D_{\rm M}(z)/D_{\rm H}(z)$ and the stringent constraints this places on the background expansion history. In contrast, the growth rate constraints are weaker than those that can be obtained from galaxy clustering. These growth measurements can be used as tests of the consistency of the model in different regimes, but in the absence of any discrepancy they do not add much cosmological information, even when voids are combined with other probes (Nadathur et al. 2020b); when treating voids as a standalone probe, as we do here, they do not provide sufficiently strong constraints to be of interest. We therefore neglect the $f\sigma_8$ results and focus entirely on the geometrical AP constraints. To do so, we centred the $D_{\rm M}(z)/D_{\rm H}(z)$ values at their expected values for the Flagship cosmology and used the measurement uncertainties from Table 3.
Within the flat ΛCDM model, a measurement of $D_{\rm M}/D_{\rm H}$ at a given redshift directly translates to a constraint on the matter density parameter $\Omega_{\rm m}$ (or, equivalently, on $\Omega_\Lambda = 1 - \Omega_{\rm m}$). The precision that we can achieve on $D_{\rm M}(z)/D_{\rm H}(z)$ with voids corresponds to $\Omega_{\rm m} = 0.3183 \pm 0.0028$ (at the 68.3% confidence level), a relative statistical uncertainty of $\pm 0.009$. The corresponding forecast relative uncertainty from Euclid spectroscopic galaxy clustering in the optimistic (pessimistic) forecast scenario is 0.013 (0.021), from weak lensing 0.012 (0.018), and from the combination of the two 0.006 (0.009) (Euclid Collaboration: Blanchard et al. 2020). Thus for this model and parameter, Euclid voids alone seem to outperform both galaxy clustering and weak lensing individually, and look to be competitive with their combination. It is worth noting, however, that this forecast is based only on the statistical constraints, obtained by rescaling a Flagship covariance up to the full Euclid volume.
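To illustrate why the AP ratio constrains $\Omega_{\rm m}$ directly, note that both $H_0$ and $c$ cancel in $D_{\rm M}/D_{\rm H}$, since $D_{\rm H}(z) = c/H(z)$ and $D_{\rm M}(z) = c\int_0^z {\rm d}z'/H(z')$. A minimal numerical sketch (our own function, using simple trapezoidal integration; the `w` argument covers a constant dark energy equation of state, with $w = -1$ reducing to flat ΛCDM):

```python
import numpy as np

def dm_over_dh(z, omega_m, w=-1.0, n_steps=2048):
    """D_M(z)/D_H(z) in a flat wCDM background. H0 and c cancel in the
    ratio, so it depends only on Omega_m (and w)."""
    def E(zz):
        # dimensionless Hubble rate H(z)/H0 for flat wCDM
        return np.sqrt(omega_m * (1 + zz) ** 3
                       + (1 - omega_m) * (1 + zz) ** (3 * (1 + w)))
    zz = np.linspace(0.0, z, n_steps)
    f = 1.0 / E(zz)
    dz = zz[1] - zz[0]
    integral = dz * (f.sum() - 0.5 * (f[0] + f[-1]))  # D_M in units of c/H0
    return E(z) * integral                            # divide by D_H = 1/E(z)
```

At fixed redshift the ratio increases monotonically with $\Omega_{\rm m}$, which is what allows a sub-percent $D_{\rm M}/D_{\rm H}$ measurement to pin down the matter density so tightly.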
We also consider a minimal one-parameter extension to the ΛCDM model known as $w$CDM, in which the cosmological constant Λ is replaced by a dark energy component with a constant equation of state that is not restricted to $w = -1$. Figure 10 shows the forecast constraints obtained on $\Omega_{\rm m}$ and $w$ in the dark red contours. We recover $w = -1.00^{+0.06}_{-0.05}$ and $\Omega_{\rm m} = 0.3183 \pm 0.0028$, corresponding to a relative precision of 6% on $w$. For comparison, the yellow contours in Fig. 10 show the current constraints on these parameters from SDSS at multiple redshifts between $z = 0.15$ and $z = 2.33$, comprising the BAO results from different galaxy and quasar samples reported by Alam et al. (2021), together with recent improved high-redshift AP measurements from the 'full shape' of the Lyman-α forest clustering (Cuceu et al. 2023), which give $w = -0.90 \pm 0.12$. Due to a well-known geometrical degeneracy, which is broken by low-redshift AP measurements, the constraints on this model from the CMB (including CMB lensing) are relatively weak, $w = -1.57^{+0.50}_{-0.40}$ (Planck Collaboration et al. 2020). For completeness, Fig. 10 also shows two sets of forecasts for the Euclid void-galaxy CCF obtained by Hamaus et al. (2022) through a different method for modelling and measurement to that used in this work.
These results highlight the excellent promise of the void-galaxy CCF as a standalone cosmological probe. When the assumption of flatness is removed and more general dark energy models with additional degrees of freedom are allowed, the void measurement of $D_{\rm M}/D_{\rm H}$ is no longer sufficient to provide useful model constraints on its own. However, the high precision of the AP measurement still provides large gains in information and improves the figure of merit for dark energy in these models when voids are used in combination with other probes, as shown by Nadathur et al. (2020a). This technique therefore retains great potential to enhance Euclid science.

Conclusions
We have presented a forecast of the cosmological constraints that can be obtained from measurement of the anisotropic void-galaxy cross-correlation function (CCF) in the spectroscopic sample of the upcoming Euclid galaxy survey. The pattern of anisotropies in the redshift-space CCF can be used to simultaneously fit for AP distortions, caused by differences between the fiducial analysis model and the true background cosmology, and for RSD, caused by the peculiar motions of galaxies around voids. Similar cosmological analyses of the void-galaxy CCF have previously been applied to data from the Sloan Digital Sky Survey (Nadathur et al. 2019b, 2020b; Hamaus et al. 2020; Woodfinden et al. 2022), and have been shown to provide significantly better AP measurements than those obtained from traditional galaxy-clustering analyses, including BAO. Precise AP measurements, particularly at high redshift, provide very powerful probes of cosmological models of dark energy (Nadathur et al. 2020a; Cuceu et al. 2023).
Our forecast is based on an analysis of the lightcone Flagship galaxy mock catalogue, which was calibrated to match the expected properties of the Euclid spectroscopic galaxy sample, and covers 5157 deg$^2$ of the sky over the redshift range $0.9 \leq z < 1.8$. We analysed this mock data using the same method previously used by Nadathur et al. (2019b) and Woodfinden et al. (2022).
A key component of this method is the use of velocity field reconstruction to approximately remove large-scale RSD from the galaxy distribution before identifying voids. This step is necessary to remove a selection bias in the construction of the void sample that depends on the orientation of the voids with respect to the line-of-sight direction, thus ensuring the validity of the theoretical modelling. To assess the efficiency of the particular algorithm used to perform this reconstruction, we compared the results obtained using the true real-space galaxy field (corresponding to the idealised perfect reconstruction scenario, which will not be achievable in practice with Euclid data) to those obtained with realistic reconstruction, with and without spectroscopic redshift errors included in the mock data.
We performed the analysis in four non-overlapping redshift bins, compressing the anisotropic CCF information to its first three even-order multipoles (monopole, quadrupole, and hexadecapole), which contain all of the practically measurable information. We found that using a realistic reconstruction algorithm did not significantly increase the statistical uncertainty of the recovered marginalised constraints on the cosmological parameters $\epsilon$ and $f\sigma_8$ compared to the idealised scenario, and also did not introduce significant systematic offsets at the level of statistical precision available from the Flagship mock. Realistic spectroscopic redshift errors were found to have a similarly minor effect on the results. However, we caution that the analysis here was limited by the availability of only a single simulation realisation, so we cannot guarantee that systematic offsets will not be present when dealing with Euclid data. A more detailed study of systematic errors should be performed when a larger number of Euclid mocks becomes available.
We then extrapolated from the results obtained from the Flagship mock to the full Euclid survey, covering an almost three times larger survey volume, to obtain our forecast for the achievable statistical precision. We found that void-galaxy CCF measurements with Euclid should be able to achieve a statistical precision of down to 0.3% on the measurement of the distance ratio $D_{\rm M}(z)/D_{\rm H}(z)$ over the redshift range $0.9 \leq z < 1.8$, and of 5-8% on the growth rate parameter $f\sigma_8$. Within the context of flat ΛCDM cosmological models, the statistical uncertainty we achieve on $D_{\rm M}(z)/D_{\rm H}(z)$ translates to a relative statistical uncertainty of 0.009 on the matter density parameter $\Omega_{\rm m}$, which is $\sim 30\%$ better than forecasts of what is achievable from either the Euclid spectroscopic galaxy clustering or weak lensing probes individually, and is comparable to their combination (Euclid Collaboration: Blanchard et al. 2020). In the $w$CDM model scenario, we found that voids alone can measure the dark energy equation of state to a relative precision of 6%, $w = -1.00^{+0.06}_{-0.05}$. It should be noted that the modelling method adopted for this forecast is the template-fitting approach used in several previous works. In this approach, template functions describing the real-space CCF, the void velocity profile (or equivalently, the matter density profile) and the velocity dispersion profile are first constructed by calibrating against simulation results, and are then allowed to vary with cosmological parameters, as described in Sect. 2. The recent work of Massara et al.
(2022) suggests that improvements to this template-fitting approach may be required in the future, but our aim is to forecast the constraints possible with currently available techniques, so we have not considered these here. We have also not accounted for possible systematic effects which may enter this analysis if the simulations from which the templates are constructed differ strongly from the true cosmological model. A quantitative study of these effects (such as that conducted in the context of SDSS by Woodfinden et al. 2022) would require mock catalogues constructed at different cosmologies, but at present we only have access to a single realisation of the Euclid Flagship mock at a fixed cosmology. A large number of mock catalogues would also allow a better estimate of the covariance matrix than the jackknife resampling technique used here.
In this work, the template we used for the real-space CCF was necessarily estimated from the single realisation of the Flagship data itself. This increases the noise in this estimate, in a manner that is correlated with the measurement noise in the data vector. When the final Euclid data are analysed, we anticipate that large numbers of mock realisations will be available, so that while a similar procedure could be used in the final analysis, it will not be mandatory. If both the real-space and redshift-space CCFs are indeed measured from the same data, it would be necessary to account for the correlation between them for a fully self-consistent analysis. We describe how to do this in Appendix A, although we choose not to include it in the main analysis presented above. However, as shown in Appendix A, the effect of accounting for this correlation is to decrease the effective statistical uncertainties on the recovered cosmological parameters. The forecasts using the larger statistical uncertainties presented in our headline analysis are therefore expected to be conservative.
Our treatment of possible observational systematics is quite simple and ignores some effects that we expect will be present in the Euclid data. In particular, although we have incorporated the effects of a uniformly low completeness of the spectroscopic sample and a Gaussian distribution of redshift errors, we have not considered the purity of the sample or errors due to spectral line misidentification. These effects will be explored in detail in a future work. This paper is one of several companion papers forecasting the constraining power of different void observables in Euclid (Hamaus et al. 2022; Contarini et al. 2022; Bonici et al. 2023), which all use the same common baseline of assumptions to allow a consistent comparison across methods. Hamaus et al. (2022) have also investigated forecasts for constraints obtained from the void-galaxy CCF with Flagship, using a different approach. Unlike in our work, that paper did not use reconstruction to remove the orientation-dependent sample selection bias in the void catalogues, and thus obtained sharply different measurements of the CCF from those we find here. They then introduced additional nuisance parameters and modified the growth-dependent terms of the theoretical model derived from the conservation equation in Sect. 2 in order to describe the Flagship data. Other minor differences in approach include the assumption that the matter density profile around voids is simply related to the galaxy profile by the large-scale linear galaxy bias $b$ (instead of our template method for predicting the mean galaxy velocity), and the neglect of velocity dispersion around the mean. Despite these differences, the forecasts of Hamaus et al.
(2022) are broadly compatible with ours. In their baseline model (which they refer to as 'independent'), which marginalises over nuisance parameters as we do here, they predict a $0.5\%$ measurement of the AP parameter, somewhat weaker than our $0.3\%$, and a $4\%$ measurement of the growth rate, slightly better than our value of $5$-$8\%$. They further report that if the nuisance parameters are instead fixed to their best-fit values (referred to as the 'calibrated' case), these results could improve to $0.4\%$ and $1\%$ precision on the AP parameter and growth rate, respectively. The additional improvement in the AP measurement that is possible with our method can have a large effect on the cosmological constraints obtained, as shown in Fig. 10, where the uncertainties in $w$ and $\Omega_{\rm m}$ are significantly reduced. Taken together, the results of these two papers showcase both the promise and the robustness of the void-galaxy cross-correlation method applied to Euclid data.

Appendix A: Accounting for correlation between data and theory
Equation (25) defines the jackknife estimate of the covariance $C_{\rm data}$ of the redshift-space data vector $\xi^s_{\rm data}$, which we have used for computing the likelihood in the main analysis. However, this is not the only source of uncertainty: the computation of the theory model $\xi^s_{\rm theory}$ depends on the multipoles of the real-space CCF $\xi^{rr}_\ell$, which are themselves determined from measurements performed on the galaxy catalogue using the estimators described in Sect. 4.2. These estimators have an associated uncertainty, which is negligible when $\xi^{rr}_\ell$ is estimated from the mean of a large number ($N\approx1000$) of mock realisations, as done, for example, by Nadathur et al. (2019b) and Woodfinden et al. (2022), but is significant in our case as the estimate is derived from a single Flagship realisation. Equally, since $\xi^{rr}_\ell$ is estimated from the same Flagship void and galaxy catalogues that are used to measure the data vector, the uncertainties in $\xi^s_{\rm theory}$ and $\xi^s_{\rm data}$ will in principle be correlated with each other. In order to self-consistently account for these model uncertainties and their correlations with the data vector, we can define a combined effective data vector $\xi_{\rm tot}\equiv\xi^s_{\rm theory}-\xi^s_{\rm data}$ and estimate its covariance $C_{\rm tot}$ using the same jackknife procedure described in Sect. 4.3. Then Eq. (26) for the log-likelihood remains unchanged, provided that $C^{-1}$ is understood to refer to the inverse covariance matrix of $\xi_{\rm tot}$ and not of $\xi^s_{\rm data}$ alone. We can write this schematically as in Eq. (A.1), where the last term represents the cross-covariance between the data vector and the theory model. The familiar scenario corresponds to a situation where the second term on the right is negligible and the last term completely vanishes. This is the case by construction in previous studies in the literature (e.g. Nadathur et al. 2020b; Woodfinden et al.
2022), since $\xi^{rr}_\ell$ was taken as the mean from many mocks, and thus data and theory are not correlated. For the method used in this work, given the limitations of only a single Flagship mock sample, this is not true. For future Euclid data analyses, whether this term is important or not will depend on the choice of how to determine $\xi^{rr}_\ell$. For Flagship, we found a high degree of correlation between the model and the data vectors, which changes the correlation structure of the full covariance $C_{\rm tot}$ with respect to $C_{\rm data}$. This is shown in Fig. A.1, and can be compared to Fig. 5. Two features are immediately noticeable: off-diagonal correlations, between the monopole and quadrupole components in particular, are enhanced, and within each diagonal block of the covariance matrix, correlations between different radial bins are relatively suppressed. This behaviour arises because the uncertainties in the measurements of the real- and redshift-space monopoles $\xi^{rr}_0$ and $\xi^{rs}_0$ in each bin are naturally strongly correlated, as the real- and redshift-space galaxy positions are closely correlated. In each radial bin, the monopole component $\xi^{rr}_0$ strongly influences the corresponding modelled multipoles of $\xi^s_{\rm theory}$, thus creating the correlation between the elements of $\xi^s_{\rm theory}$ and the monopole component of $\xi^s_{\rm data}$. Even more importantly, the strong correlation between real- and redshift-space multipole moments causes the amplitude of the diagonal elements of $C_{\rm tot}$ to be significantly smaller than those of $C_{\rm data}$. This is shown in Fig. A.3. It may seem unusual that accounting for additional uncertainty in the model effectively reduces the total uncertainty $C_{\rm tot}$, but this is simply a reflection of the correlation: variations in $\xi^s_{\rm theory}-\xi^s_{\rm data}$ are reduced because uncertainties move both components in the same direction. It is intuitive that when $\xi^s_{\rm theory}$ and $\xi^s_{\rm data}$ are highly correlated with each other, models that deviate from observation should be more severely penalised in the log-likelihood, and this is achieved by the smaller values of $C_{\rm tot}$. An equivalent way to view this is that when $\xi^s_{\rm theory}$ and $\xi^s_{\rm data}$ are both obtained from
the same realisation of the initial conditions, the cosmic variance in them cancels out.
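The combined jackknife estimate and its decomposition in Eq. (A.1) can be sketched as follows. This is a minimal NumPy illustration using synthetic leave-one-region-out measurements with an artificial shared noise component, not the actual Flagship pipeline; all array shapes, noise levels, and function names are hypothetical:

```python
import numpy as np

def jackknife_cov(a, b):
    """Jackknife (cross-)covariance between two sets of delete-one-region
    measurements, each of shape (n_regions, n_bins); passing the same
    array twice gives an ordinary jackknife auto-covariance."""
    n = a.shape[0]
    da = a - a.mean(axis=0)
    db = b - b.mean(axis=0)
    return (n - 1) / n * da.T @ db

# Synthetic delete-one measurements sharing the same underlying realisation,
# mimicking the correlated noise in xi_theory and xi_data.
rng = np.random.default_rng(0)
n_regions, n_bins = 100, 6
shared = rng.normal(size=(n_regions, n_bins))   # common "cosmic variance"
xi_data = shared + 0.1 * rng.normal(size=(n_regions, n_bins))
xi_theory = shared + 0.1 * rng.normal(size=(n_regions, n_bins))

C_data = jackknife_cov(xi_data, xi_data)
C_theory = jackknife_cov(xi_theory, xi_theory)
C_cross = jackknife_cov(xi_data, xi_theory)

# Jackknife covariance of the combined vector xi_tot = xi_theory - xi_data
C_tot = jackknife_cov(xi_theory - xi_data, xi_theory - xi_data)

# Eq. (A.1) holds as an exact algebraic identity, written symmetrically
# since the estimated cross term need not be exactly symmetric:
assert np.allclose(C_tot, C_data + C_theory - C_cross - C_cross.T)

# Strong data-theory correlation shrinks the diagonal of C_tot
assert np.all(np.diag(C_tot) < np.diag(C_data))
```

When the shared component dominates both vectors, as in this toy setup, the diagonal of $C_{\rm tot}$ is much smaller than that of $C_{\rm data}$, mirroring the cancellation of cosmic variance described above.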
Changing the covariance matrix used in the likelihood evaluation in this way naturally propagates through to the recovered statistical uncertainties on the cosmological parameters. To estimate this effect, we performed all the model fits again using the appropriate full covariance $C_{\rm tot}$ instead of $C_{\rm data}$. The results obtained for $D_{\rm M}/D_{\rm H}$ and $f\sigma_8$ are shown in Fig. A.2, where the black points are for the perfect reconstruction scenario fit using $C_{\rm tot}$, and the blue points are for fits to the same data using $C_{\rm data}$. It is clear from this that correctly accounting for the correlation between model and data leads to a very significant increase in the statistical precision of the fit, roughly by a factor of 3 in both $D_{\rm M}/D_{\rm H}$ and $f\sigma_8$. The main results of our paper, which do not account for this effect, are therefore conservative overestimates of the statistical error that can be achieved with Euclid if the real-space cross-correlation function is measured from the data instead of being taken from a simulation.
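The refits described above reuse the Gaussian log-likelihood of Eq. (26), simply with $C_{\rm tot}$ substituted for $C_{\rm data}$. Schematically, with purely illustrative three-bin numbers (not the actual fit; the roughly tenfold variance reduction is an assumption chosen to mimic the shrinkage seen for Flagship):

```python
import numpy as np

def log_likelihood(residual, cov):
    """Gaussian log-likelihood up to an additive constant:
    -0.5 * r^T C^{-1} r, with r = xi_theory - xi_data."""
    return -0.5 * residual @ np.linalg.solve(cov, residual)

# The same residual vector evaluated against a broad covariance (C_data
# alone) and a tighter one (C_tot, shrunk by the data-theory correlation).
r = np.array([0.10, -0.05, 0.02])
C_data = 0.04 * np.eye(3)
C_tot = 0.004 * np.eye(3)  # illustrative ~10x smaller variances

# The tighter covariance penalises the same model-data mismatch more
# heavily, which is what sharpens the parameter constraints in Fig. A.2.
assert log_likelihood(r, C_tot) < log_likelihood(r, C_data)
```

The mechanics are unchanged from the main analysis; only the matrix inverted in the quadratic form differs.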
However, the reduction in statistical uncertainty makes systematic errors more important. While in the perfect reconstruction scenario we are able to recover unbiased estimates of $D_{\rm M}/D_{\rm H}$ and $f\sigma_8$ at this higher precision, this is not so for the realistic reconstruction scenario (red points in Fig. A.2), where systematic offsets from the fiducial values are apparent. The source of these offsets is likely to be the residuals in the modelling of the monopole moment at small separations, an example of which can be seen in Fig. 6. These are caused by the imperfections of the practical reconstruction technique. Figure A.3 shows that the relative reduction in covariance terms from using $C_{\rm tot}$ instead of $C_{\rm data}$ is largest for the monopole, so these residuals have a larger effect on the recovered fit. For the realistic reconstruction scenario, the full covariance matrix $C_{\rm tot}$ therefore does not give a good estimate of the total error budget, and systematic uncertainties would have to be folded in.
As we only have one Flagship simulation available, we cannot quantitatively assess the true size of these systematic uncertainties. However, it appears that the apparent large improvement in statistical precision would not be practically realised for Euclid in the reconstruction scenarios, even if $\xi^{rr}_\ell$ were estimated from the data and the full covariance $C_{\rm tot}$ were used. Therefore, the more pessimistic forecasts we have obtained for the existing methodology using $C_{\rm data}$ alone are the more realistic ones.
Fig. 1: Left: Distribution of void sizes in each of the four redshift bins of the Flagship mock catalogue. The vertical bars indicate the void size cuts applied to each catalogue, and dashed lines represent the voids which are discarded. Right: Comoving number density $n(z)$ of galaxies and voids as a function of redshift in Flagship. For voids, $n(z)$ is multiplied by 100 for visibility. The edges of the redshift bins are shown by dashed vertical lines. The decrease in the number of voids near the edges of each bin arises because void-finding is performed on each bin individually, as discussed in the text; some void candidates near the redshift edges of the bin are then missed.

Fig. 4: Measured radial outflow velocity profile $v_r(r)$ (top), and normalised line-of-sight velocity dispersion $\sigma_{v_\parallel}(r)$ (bottom) of galaxies around voids found in the four redshift bins of Flagship. Shading represents the 68.3% ($1\sigma$) measurement uncertainty coming from Poisson noise. The dispersion amplitude $\sigma_v$ for the four bins can be found in Table 1.

Fig. 5: Normalised covariance matrix $C_{\rm data}$ for the data vector from the Flagship $z_{\rm eff}=1.0$ redshift bin, estimated using jackknife resampling. Dashed lines differentiate between the monopole, quadrupole and hexadecapole blocks of the data vector.

Fig. 6: Measured multipole moments of the redshift-space void-galaxy CCF, $\xi^{ps}_\ell$, for realistic reconstruction applied to the catalogue including redshift errors, shown as the data points. Error bars are determined from the diagonal entries of the covariance matrix $C_{\rm data}$. The solid lines show the corresponding best-fit models, and insets show the residuals of the model with respect to the data. The rows correspond to (from top to bottom) the monopole, quadrupole and hexadecapole moments ($\ell=0,2,4$). Columns correspond to the different redshift bins, labelled with the effective redshift $z_{\rm eff}$.

Fig. 7: Marginalised 1D and 2D posterior constraints on the parameters $f\sigma_8$ and $\epsilon$ from fits to the Flagship void-galaxy CCF measurements in four redshift bins. Contours show the 68.3% ($1\sigma$) and 95.5% ($2\sigma$) confidence intervals obtained in the case of perfect reconstruction (blue), realistic reconstruction (light red), and realistic reconstruction with added redshift error (dark red). Dashed crosshairs indicate the true values for the Flagship cosmology.

Fig. 8: Redshift dependence of the measured cosmological parameters obtained from fits to Flagship simulation data. The top panel shows constraints on the ratio of comoving transverse and Hubble distances divided by redshift, $D_{\rm M}/(zD_{\rm H})$, which is derived from $\epsilon$ (inset). The bottom panel shows measurements of the growth rate $f\sigma_8$. Coloured data points indicate the different reconstruction scenarios considered, plotted at slightly shifted redshift locations for clarity. The grey shaded bands represent the 68.3% ($1\sigma$) and 95.5% ($2\sigma$) confidence intervals from fits to Planck Collaboration et al. (2020), extrapolated down to these redshifts assuming $\Lambda$CDM, and centred on the Flagship cosmology.

Fig. 9: Redshift dependence of the forecast measurements that could be obtained from the $15\,000$ deg$^2$ Euclid survey (purple data points). These are generated from a random realisation of the Euclid CCF (Sect. 5.2), so the central values can be shifted relative to the Flagship results in Fig. 8. Low-redshift measurements from current SDSS data, obtained using the same analysis method by Nadathur et al. (2020b) and Woodfinden et al. (2022), are included for comparison.
Fig. A.1: Normalised full covariance matrix $C_{\rm tot}$ from the Flagship $z_{\rm eff}=1.0$ redshift bin, estimated using jackknife resampling. Dashed lines differentiate between the monopole, quadrupole and hexadecapole blocks of the data vector.

$C_{\rm tot} = C_{\rm data} + C_{\rm theory} - 2\,C_{\rm data,theory}$ ,   (A.1)

Fig. A.2: Redshift dependence of the measured cosmological parameters obtained from Flagship using different covariance prescriptions. The blue points show results for fits to the data vector in the perfect reconstruction case using $C_{\rm data}$, and are the same as in Fig. 8. Black points are for fits to the same data vector but using the full covariance $C_{\rm tot}$. Light red points are for the fit to the realistic reconstruction case when using $C_{\rm tot}$. The use of the full covariance $C_{\rm tot}$ greatly reduces the statistical uncertainty but can introduce systematic offsets when combined with the use of reconstruction.

Fig. A.3: Diagonal elements of the full covariance matrix $C_{\rm tot}$ (black lines), as well as the three different contributions from Eq. (A.1), showing the reduced variance that results from properly accounting for the correlation between data and theory. The three panels show the monopole, quadrupole and hexadecapole parts, respectively.