Field-level Lyman-alpha forest modelling in redshift space via augmented non-local Fluctuating Gunn-Peterson Approximation

Context. Devising fast and accurate methods to predict the Lyman-alpha forest at the field level avoiding the computational burden of running large-volume cosmological hydrodynamic simulations is of fundamental importance to quickly generate the massive set of simulations needed by the state-of-the-art galaxy and Ly α forest spectroscopic surveys


Introduction
The neutral hydrogen (H I) absorption features imprinted in quasar spectra, also known as the Lyman-α forest (Lyα forest, hereafter), represent a promising cosmological probes over the current and next decades.The Lyα forest traces the density field continuously along the line-of-sight, and the high number density of quasar sight lines per square degree delivered by state-ofthe-art cosmological redshift surveys will allow building accurate three-dimensional maps of the Universe at z ≳ 1.8, i.e at distances which are currently beyond the reach of wide-field galaxy redshift surveys.In the await of next-generation astronomical facilities such as the Square Kilometre Array Observatory 1 and the Extremely Large Telescope 2 , the Lyα forest is anticipated to play a pivotal role in refining the constraints on cosmological models and advancing our comprehension of the high-redshift Universe.
To maximize the scientific return and optimize the extraction of cosmological information from the incoming unprecedented Lyα forest data sets provided by surveys as DESI (Levi et al. 2013), Euclid (Amendola et al. 2018), WEAVE-QSO (Pieri et al. 2016), and Subaru-PFS (Takada et al. 2014), it is extremely important to develop accurate analytical models and numerical tools to efficiently interpret and analyze Lyα forest observables.
In particular, a primary goal is to formalize an effective model of the bias relation linking the Lyα forest to the dark matter field.In fact, an effective bias mapping allows the prediction of the Lyα forest from the dark matter field in a fast and accurate way, enabling the generation of massive sets of Lyα forest mock lightcones.Mock catalogs have become the standard tool used in large-scale cosmological surveys to robustly address uncertainties on cosmological parameters, to test pipelines and commissioning tools, and to assess the feasibility of studies targeting new observables, among others.Furthermore, knowing an effective bias description make it possible to forward-model the Lyα forest at the field level and iteratively reconstruct the Lyα forest (see e.g.Horowitz et al. 2019;Porqueres et al. 2020) and its Baryon Acoustic Oscillations (BAOs), improving on methods de-evolving cosmic structures back in time with some approximation for the displacement field (e.g.Eisenstein et al. 2007, and later works implementing higher-order Lagrangian Perturbation Theory refinements).
In addition, the dark matter field -Lyα forest connection has found an interesting application in tomographic studies as well, such as in the LATIS survey (Newman et al. 2020), in the CLAMATO survey (Lee et al. 2018;Horowitz et al. 2022), and in the PSF Galaxy Evolution Survey (Greene et al. 2022).
Pioneering studies (Hui & Gnedin 1997;Rauch 1998;Croft et al. 1998;Weinberg et al. 1999) proposed to model the Lyα forest using a deterministic scaling relation mapping -known as Fluctuating Gunn-Peterson approximation (FGPA, hereafter) -between the dark matter density and the H I optical depth.While useful, fairly accurate on large linear scales, and fast to compute, the FGPA has been shown to lose accuracy and not adequately model the summary statistics of the Lyα forest in the weakly non-linear regime (k > 0.1 h Mpc −1 ) (e.g.Sinigaglia et al. 2022).A plethora of works employed the FGPA to forwardmodel the Lyα forest from dark matter density fields obtained through N-body simulations (Meiksin & White 2001;Viel et al. 2002;Slosar et al. 2009;White et al. 2010;Rorai et al. 2013;Lee et al. 2014), approximated gravity solvers (Horowitz et al. 2019) and Gaussian or lognormal random fields (Le Goff et al. 2011;Font-Ribera et al. 2012;Farr et al. 2020).In particular, the LyAlpha-Colore method (Farr et al. 2020), applying the FGPA on lognormal fields, with free parameters constrained so as to match the position and width of the acoustic peak in the Lyα forest two-point correlation function and the amplitude of the 1D Lyα forest power spectrum, was adopted to produce the mock lightcones used for the BAO measurements of the Lyα forest from the final eBOSS data release (SDSS DR16) (du Mas des Bourboux et al. 2020).
On the other hand, more sophisticated iterative methods to model the Lyα forest and employing different strategies have been proposed, such as e.g.LyMas (Peirani et al. 2014(Peirani et al. , 2022) ) and the Iteratively-Matched Statistics (Ims, Sorini et al. 2016).These techniques have been shown to yield promising results, although still feature deviations of order 5 − 20% in the 3D power spectrum.Moreover, those techniques were not applied to approximated gravity solvers, but rely on full N-body simulations.Therefore, they are not able to overcome the computational burden of running massive sets of simulations.
With the flourishing of machine learning methods, and the extraordinary attention that this field is receiving in astrophysics, the generation of AI 3 -accelerated simulated cosmological volumes is witnessing a rapid expansion.
supervised learning strategy and the exploitation of the hierarchy of baryon quantities, has been shown to model two-and threepoint statistics to 1% and ∼ 10% level of accuracy, respectively.Therefore, this technique represents a promising way forward to produce Lyα forest mock catalogs for the next-generation surveys (see e.g., Balaguera-Antolínez et al. 2022, for a concrete application to halo mock catalogs generation for the DESI survey).
Deep learning has also achieved competitive results.Harrington et al. (2021) employed deep convolutional generative adversarial networks to learn how to correct the FGPA feeding the density and velocity DM fields as inputs.In a companion paper, Horowitz et al. (2021) exploited the idea of image generation behind deep generative methods and built a conditional convolutional auto-encoder, able to sample representations in latent space of the hydrodynamic fields of a simulation and synthesize the Lyα forest with a deep posterior distribution mapping.
Nonetheless, machine learning techniques to date still require an adequately large training set in order to appropriately address the issue of overfitting, making it still expensive in the case of predicting the Lyα forest.
In this work we propose to elaborate on the FGPA formalism, augmenting it in light of the recent findings on the importance of accounting for long-range and short-range non-local terms in the mapping of dark matter tracers (see e.g.Balaguera-Antolínez et al. 2019;Kitaura et al. 2022;Sinigaglia et al. 2021).In particular, we explicitly introduce the dependence on the cosmic web environments through the cosmic web classification (Hahn et al. 2007) in the FGPA modeling.We showcase the application of our algorithm to map the Lyα forest on a few Mpc cells mesh in redshift space, and assess its accuracy by evaluating the deviation of relevant summary statistics (mean transmitted flux, probability distribution function, power spectrum, and bispectrum) of the predictions from the ones of a reference cosmological hydrodynamic simulation.
The paper is organized as follows.§2 introduces the cosmological hydrodynamic simulation we use to validate our method.
§3 summarizes the cosmic web classification and its connection to non-local bias.In §4 we review the FGPA and describe our improvements in modeling redshift-space distortions (RSD hereafter) and non-local bias terms.§5 presents the analysis, results of our predictions and a discussion of them.In §6 we describe out to apply our machinery to the generation of Lyα forest simulations with lightcone geometry.§7 discusses potential future improvements for our model.We conclude in §8.

Reference cosmological hydrodynamic simulation
In this section, we present our reference cosmological hydrodynamic simulation.
The reference simulation has been run with the cosmological smoothed-particle hydrodynamics (SPH) code GADGET3-OSAKA (Aoyama et al. 2018;Shimizu et al. 2019), a modified version of GADGET-3 and a descendant of the popular N-body/SPH code GADGET-2 (Springel 2005).It embeds a comoving volume V = (500h −1 Mpc) 3 and N = 2 × 1024 3 particles of mass m DM = 8.43 × 10 9 h −1 M ⊙ for DM particles and m gas = 1.57× 10 9 h −1 M ⊙ for gas particles.The gravitational softening length is set to ϵ g = 16h −1 kpc (comoving), but we allow the baryonic smoothing length to become as small as 0.1ϵ g .This means that the minimum baryonic smoothing at z = 2 is about physical 533 h −1 pc.The star formation and supernova feedback are treated as de-scribed in Shimizu et al. (2019).The code contains also important refinements, such as the density-independent formulation of SPH and the time-step limiter (Saitoh & Makino 2009;Saitoh & Makino 2013;Hopkins et al. 2013).
The main baryonic processes which shape the evolution of the gas are photo-heating, photo-ionization under the UV background radiation (Haardt & Madau 2012), and radiative cooling.All these processes are accounted for and solved by the Grackle library4 (Smith et al. 2017), which determines the chemistry for atomic (H, D and He) and molecular (H 2 and HD) species.The chemical enrichment from supernova is also treated with the CELib chemical evolution library by Saitoh (2017).The initial conditions are generated at redshift z = 99 using MUSIC2 (Hahn et al. 2021) with cosmological parameters taken from Planck Collaboration et al. (2020).
In this work we use the output at z = 2 (for which the computation times amount to ∼ 1.44 × 105 CPU hours), reading both gas and dark matter properties.
To compute the Lyα forest flux field F = exp(−τ) 5 , we first obtain the H I optical depth τ by means of a line-of-sight integration as follows (Nagamine et al. 2021): where e, m e , c, n HI , f , x j , ∆l denote respectively the electron charge, electron mass, speed of light in vacuum, H I number density, the absorption oscillator strength, the line-of-sight coordinate of the j-th cell and the physical cell size.The Voigt-line profile ϕ(x) in Eq. ( 1) is provided by the fitting formula of Tasitsiomi (2006).Where necessary, relevant quantities (e.g.H I number density) are previously interpolated on the mesh according to the SPH kernel of the simulation.Coordinates x j of cells along the line-of-sight refer to the outcome of interpolation of particles based either on their positions in real space r j or redshift-space s j = r j + v los j /aH, where v los j , a and H are the j-th particle velocity component along the line-of-sight, the scale factor and the Hubble parameter at z = 2, respectively.
We interpolate the dark matter and the Lyα forest fields in real and in redshift space onto a 256 3 cells cubic mesh using a CIC mass assignment scheme (Hockney & Eastwood 1981).Such resolution corresponds to a physical cell-volume of ∂V ∼ (1.95 h −1 Mpc) 3 and a Nyquist frequency of k nyq ∼ 1.6 h Mpc −1 .
It is worth noting that the simulations utilized for the analyses of the Lyα forest in Momose et al. (2021); Nagamine et al. (2021), the exploration of the cosmological H I distribution in Ando et al. (2019), and the examination of bias in cosmological gas tracers in Sinigaglia et al. (2021Sinigaglia et al. ( , 2022) ) were all executed using the same code.

Cosmic web classification and non-local bias
The cosmic web (see e.g., Bond et al. 1996) arises as a result of gravitational instability and the formation and growth of cosmic structures from tiny perturbations of the primordial matter density field in the Early Universe.While several different methods have been proposed in the literature to mathematically define the cosmic web (see Cautun et al. 2014;Libeskind et al. 2017, for a summary), we focus here on the following procedure.
To quantitatively describe the large-scale matter distribution and split it into different cosmic web environments, Hahn et al. (2007) proposed a classification scheme based on the signature of the eigenvalues of the gravitational tidal field tensor where ϕ denotes the gravitational potential and r stands for Eulerian coordinates.Considering the equations of motion in comoving coordinates r = −∇ϕ(r) (3) for a test particle subject to the gravitational potential ϕ, and assuming ∇ϕ(r) = 0 at the center of mass of dark matter haloes (i.e.there is a local minimum), one can linearize the equation of motions and realize that the dynamics close to local extrema of the gravitational potential in the linear regime is ruled by the three eigenvalues λ i (i = 1, 2, 3) of T i j (we refer the reader to Hahn et al. 2007, for more details on the calculations).In close analogy to Zel'dovich approximation (Zel'dovich 1970), and sorting the λ i in decreasing order such that λ 1 ≥ λ 2 ≥ λ 3 , Hahn et al. (2007) defined a region at coordinates r to belong to: At this point, one can drop the assumption of local extrema at the center of mass of the dark matter halo and generalize the cosmic web classification to any point of the density field.This implies the emergence of a constant additive acceleration term to the linearized equations of motion, which however does not affect the role played by the tidal tensor in linear order.
Elaborating on this work, Forero-Romero et al. (2009) proposed to relax the threshold to classify the cosmic web, based on simple dynamical arguments and realizing that highering the threshold from λ th = 0 to λ th = 0.1 provides a classification that better matches the visual appearance of the cosmic web.
The T-web classification has also been shown to provide a direct connection between the phenomenology of the cosmic web and the mathematical description of halo bias relying on Eulerian perturbation theory (see e.g.McDonald & Roy 2009).In fact, discriminating between different T-web environments corresponds to considering a perturbative bias expansion up to third order including both local and non-local long-range bias terms.In this sense, following Kitaura et al. (2022), let us denote the three invariants of the tidal field tensor as: - These three items represent an alternative formulation of the perturbative bias expansion up to third order (Kitaura et al. 2022), but they can also be straightforwardly linked to the phenomenological T-web classification as follows: knots: Eventually, one can generalize this description by relaxing the value of the eigenvalues threshold and replacing zero with any other arbitrary value.
From an intuitive point of view, this means that T-web models the full perturbative expansion up to the third order adopting a low-resolution binning, i.e. just four categories identified as cosmic web types (see Kitaura et al. 2022, and references therein).While using the invariants I i of T i j has been shown to provide a more accurate description of non-local bias and of anisotropic clustering than T-web, the T-web classification has the advantage of being much less likely to incur into overfitting.
In this work, we apply the T-web classification to the modeling of the Lyα forest, in order to introduce non-local contributions within the FGPA prescription.In particular, as will be described in more detail in §4, we make both the model for RSDs and for the FGPA dependent on the T-web by fitting the parameters of such models separately for each cosmic web type.Also, we extract the cosmic web classification both in real and in redshift space.Table 1 reports the volume filling factor of each cosmic web type for the real space dark matter field, and the redshift space dark matter field obtained by applying the RSD description including velocity bias dependent on the real-space T-web with parameters described in the upper part of Table 2 (see §4.2 for the details about the procedure), and used to compute our final non-local FGPA model.One realizes that there is only a tiny sub-percent difference in volume filling factors between real and redshift space T-web classification, with sheets being the most frequent cosmic web type (∼ 50% of the volume), knots being the rarest (∼ 2 − 3% of the volume), and filaments and voids representing intermediate cases.

FGPA modeling
In this section, we first introduce the FGPA approximation in the standard scenario, and present later on our improvement on the modeling within such a framework.

Standard FGPA
A popular way to compute a fast proxy for the Lyα forest consists in the FGPA, which assumes equilibrium between opticallythin photoionization and collisional recombination of the intergalactic H I. In particular, because the majority (> 90% in volume and > 50% in mass, Lukić et al. 2015) of the gas probed by the Lyα forest found in regions with mildly non-linear density contrasts is diffuse and is not shock-heated, the gas density ρ gas and temperature T gas can be linked through the power-law relation where ρ is mean gas density and T 0 and γ depend on the reionization history and on the spectral slope of the UV background model.These vary commonly within the ranges 4000 K ≲ T 0 ≲ 10 4 K and 0.3 ≲ γ ≲ 0.6 (see e.g., Hui & Gnedin 1997).Assuming photo-ionization equilibrium, one can express the Lyα forest optical depth τ as a function of the gas density as (5) This expression represents the FGPA, as τ is described as a field that fluctuates with the underlying gas distribution.While α = 2 − 0.7 γ ∼ 1.6 (see e.g., Weinberg et al. 1999;Viel et al. 2002;Seljak 2012;Rorai et al. 2013;Lukić et al. 2015;Cieplak & Slosar 2016;Horowitz et al. 2019), A is a normalization constant which depends on redshift and on the details of the hydrodynamics (e.g.Weinberg et al. 1999).Given that in the cool low-density regions, dark matter and gas densities display a very high cross-correlation (see e.g.Sinigaglia et al. 2021) and that the FGPA consists in a useful tool to be applied to the dark matter field to obtain Lyα forest predictions without solving the equations of hydrodynamics, δ gas can be replaced with δ dm in Eq. ( 5), so that the FGPA is applied directly to the dark matter field: The FGPA has been shown to represent a good approximation for very high-resolution full N-body simulations (e.g.Sorini et al. 2016;Kooistra et al. 2022), i.e. when representing density fields on regular grids with cell size l ∼ 150 − 200 kpc h −1 , where particle positions and velocities are known with the high accuracy thanks to the exact solution of collisionless and fluid dynamics for dark matter and gas particles/cells, respectively.However, this is not the case if either a coarser grid resolution (≳ 1 − 2 Mpc h −1 ) and/or approximated gravity solvers are to be adopted, as in the case of the massive generation of large-volume Lyα forest boxes encompassing all the universe up to z ∼ 4 (e.g.Farr et al. 2020).In particular, the summary statistics predicted by the FGPA have been shown to depart from the corresponding hydrodynamic flux field statistics obtained with the full hydro computation already at cell resolutions of l ∼ 0.8 Mpc h −1 , both in real and in redshift space (Sinigaglia et al. 2022).
In this work, we compute the FGPA based on the cubic mesh used to CIC-interpolate the simulation particles, i.e. with physical cell size l ∼ 1.95 Mpc h −1 .To motivate the choice of a coarse grid, we notice that such resolution implies that one needs to resort to a grid with N = 5120 3 cells to represent a simulation box of volume V = (10 Gpc h −1 ) 3 , covering a full-sky cosmic realization at 0 ≤ z ≲ 3.8, to cover the relevant volume needed for Lyα forest studies.This implies already a quite heavy computational burden, especially when hundreds of mock lightcones are to be generated, and resorting to higher resolution meshes would make the process practically unfeasible.
To put ourselves in realistic observational conditions, we compute the FGPA in redshift space.To this end, we displace particles from real to redshift space via the mapping (Kaiser 1987;Hamilton 1998) where s i and r i are the Eulerian comoving coordinates of the ith particle in redshift space and real space, ri = r i /|r i |, v i is its velocity, and a and H are the scale factor and the Hubble parameter at the redshift of interest.respectively.Under the assumption of plane-parallel approximation that we adopt, and therefore neglecting the lightcone geometry, Eq. ( 7) simplifies to only one scalar component along the line of sight.
In this setup, we compute the standard6 FGPA by setting α = 1.6 and determining the parameter A as the value which matches the normalization of the Lyα forest 3D power spectrum on large scales.In this case, we find A = 0.27.

RSD modeling
While the mapping from real to redshift space presented in Eq. ( 7) holds exactly for dark matter particles, it does not account for any velocity bias contribution in the Lyα forest.Therefore, following Sinigaglia et al. (2022), we generalize such a model as follows.
We start by considering the dark matter field in real space.For each cell i, we assign a set of N fictitious particles, with Eulerian position r i coinciding with the center of the cell.Applying an inverse nearest grid point (NGP) scheme, each particle j inside cell i is assigned a mass M i = ρ i V i /N, where ρ i and V i are the dark matter density and volume of the cell, respectively.The jth particles is then displaced from real to redshift space using a modified version of Eq. ( 7) accounting for velocity bias: where s j and r j are redshift space and real space Eulerian comoving coordinates of the jth particle, v dm, j is the modeled velocity of the particle, and b v is a velocity bias factor.The particle velocity v dm, j is modeled as the sum of two components Here, v coh dm, j = v sim dm,i corresponds to the velocity field at cell i, interpolated on the mesh with the same mass assignment scheme as the density field, and models the large-scale coherent flows.We draw the attention of the reader to the fact that the velocity bias b v directly multiplies the coherent flows component of the velocity field.On the other hand, v disp dm, j is built as a velocity dispersion term, randomly sampled from a Gaussian with zero mean and variance proportional to the value of the density field inside the cell i in real space through a power law: , with B and β free parameters (Heß et al. 2013;Kitaura et al. 2014).This latter component models the quasi-virialized velocity dispersion motions, which are smoothed out by the interpolation of the velocity field on the mesh.
To determine the optimal values for the RSDs and FGPA normalization parameters, we jointly maximize the crosscorrelation coefficient between the FGPA prediction and the Lyα forest flux field from the reference simulation and match the normalization of the Lyα forest 3D power spectrum at large scales.After fixing α = 1.6 as fiducial value, we find the choice A = 0.7, b v = −0.8,b v = 1.5, β = 2.0 to fulfill the aforementioned constraints.

Non-local bias contribution
Looking at Eq. ( 6), one easily realizes that the standard FGPA prescription provides a way to compute the Lyα forest optical depth relying exclusively on the local dark matter density, and neglecting any non-local contribution.While the simplicity of the FGPA model is appealing as it allows to compute the Lyα forest with just a few parameters, it attempts at modeling distinct cosmic web environments with the same physical model, therefore failing at capturing the intrinsic diversity of physical conditions.As a major contribution of this paper, we propose to make the FGPA cosmic web dependent, to relax the exponent α and let it variable, and to determine the sets of coefficients that best describe each cosmic web type.We point out that we are considering at this stage the dark matter field in redshift space as modeled in Eq. ( 8), and we therefore extract the T-web classification directly in redshift space.
Analogously, we also make the RSD model in Eq. ( 8) dependent on the cosmic web environment, thereby passing from the 3 free parameters {b v , B, β}, to 12 parameters (3 parameters × 4 cosmic web environments).However, in order to keep the number of free parameters as small as possible, we notice that the virial theorem allows us to write v ∝ √ ρ, therefore we can fix β = 0.5.Moreover, peculiar velocity are closer to the linear or quasi-linear regime in sheets and voids, hence one can expect the velocity dispersion component to be negligible there, allowing us to set B sh = B vd = 07 .These two tricks allow us to reduce the number of free parameters from 12 to 6.We notice that, in contrast to the cosmic web classification discussed above, here the T-web is extracted in real space, as the T-web itself is needed to perform the mapping from real to redshift space.In summary, to introduce the non-local dependence on the cosmic web in the FGPA we: 1. extract the T-web from the dark matter in real space; 2. determine the distinct parameters {b v , B, β} depending on the real-space T-web; 3. map particles from real to redshift space adopting the cosmic web dependent version of Eq. ( 8); 4. extract the T-web from the dark matter in redshift space; 5. use the redshift-space T-web to model non-localities in the FGPA by making the free parameters {A, α} dependent on the cosmic web category each cell belongs to.

Threshold bias
It is well known from cosmological hydrodynamic simulations that the Lyα forest transmitted flux one-point probability distribution function (PDF) is starkly bimodal, featuring two sharp peaks around F = 0 and F = 1, and displaying a much lower occurrence of cells with intermediate values 0.1 ≲ F ≲ 0.9 (e.g., Nagamine et al. 2021, and references therein).These characteristics can be mainly attributed to the exponential mapping F = exp(−τ), which regularizes the domain of τ mapping it in the interval [0, 1], truncating low values of τ to F ∼ 1, as well as rapidly saturating high values of τ to F ∼ 0. The bimodal nature of the Lyα forest PDF, together with the high non-linearity induced by the exponential mapping, makes it particularly hard to capture such behavior and accurately predict fluxes, which can suffer rapid variations within the domain, even in neighboring cells.
To improve on the FGPA model regarding this aspect, we adopt the perspective of introducing a threshold bias (Kaiser 1984;Bardeen et al. 1986;Sheth et al. 2001;Mo & White 2002;Kitaura et al. 2014;Neyrinck et al. 2014;Vakili et al. 2017), which in state-of-the-art models of halo bias is used to suppress the formation in overdense regions.
So far, the model has 22 = 6 + 16 free parameters (6 for RSD as described in §4.2, 16 = 4 × 4 for the augmented nonlocal FGPA prescription).We describe the procedure we adopt to determine such parameters and our findings in §5 and Table 2, respectively.

Stochasticity
As final ingredient, we add a stochastic component ϵ to the model expressed in Eq. ( 9) and the subsequent non-local modification: where ϵ j i = f ϵ,i N j is the noise term in the jth cell for i = {knots, filaments, sheets, voids}, and f ϵ,i is a normalization constant.While f ϵ is a parameter controlling the amplitude of the stochastic term, N is sampled from a negative binomial distribution (see Kitaura et al. 2014;Vakili et al. 2017), parametriszd as8 where N is the sampled term and corresponds to the number of failures in the standard negative binomial distribution definition, n stands for the number of successes9 , N + n is the total number of trials, and p is the success probability.
In particular, as will be commented more in detail in §5, the latest non-local modification of Eq. ( 9) without stochasticity yields predictions that lack power towards small scales, and display suppression of small-scale structures, especially in the underdense environments.To this end, we decide to model the stochastic term only in voids, leaving the improved non-local FGPA model deterministic in knots, sheets, and voids, to limit the loss of cross-correlation.This implies f ϵ = 0 in knots, filaments, and sheets.As will be shown later in the paper, we find this choice to be sufficient to accomplish the desired goal.We can develop an intuition about the application of discrete tracers statistics to the Lyα forest in voids as follows.From a mathematical point of view, we start by noticing that we can in principle sample the noise term from any arbitrary continuous distribution.In fact, we are effectively restricting our study to the second moment of the distribution.Starting from this, we can then wonder how to model the variance of the distribution.One possibility would be to assume a Poisson variance, or going beyond it, as done in this paper by choosing the negative binomial.Therefore, our model is statistically robust.This stochastic component can then be interpreted as follows.Assuming that H I is found in sparse clouds in voids at high redshift, then we can treat them as point sources and model them in a similar way as haloes and galaxies.We will test the validity of this assumption in future work.For the moment, we argue that this effective model works well and achieves the goal it was designed for (see §5 for the results).

Parameter
The negative binomial distribution can be interpreted as an overdispersed alternative to the Poisson distribution, where one can allow the mean and variance to be different.This aspect can turn out to be useful in cases when two events have a positively correlated occurrence (i.e., the two events have a positive covariance term and are not independent), and this causes a larger variance than in the case in which the events are independent.In the presented parametrization, n controls the deviation from Poissonity, making the distribution converge to the Poisson distribution for large n and causing an overdispersion for small n.While a thorough investigation of the deviation from Poissonity for the Lyα forest goes beyond the scope of this paper, it is sensible and convenient to allow the distribution from which we sample the stochastic term to feature a non-Poissonian variance including a correlation term (see e.g.Peebles 1980, for a detailed treatment applied to galaxies).

Analysis, results and discussion
In this section, we describe the procedure we adopt to determine the optimal parameters for our model, and present our findings.

Fit of model parameters
Given the model developed in §4.2 and §4.3, we fit the coefficients of our model as follows.
We first determine the RSD model parameters by finding the values which maximize the cross-correlation between the predicted and reference Lyα forest field (see the upper part of Table 2).In particular, we start with an initial guess based on the prior knowledge we have, i.e. we initialize the parameters b v,i = −0.8, and B i = 1 ∀i, i = {knots, filaments, sheets, voids}.As anticipated in §4.2, we fix β i = 0.5 for all the cosmic web types, and set B sh = B vd = 0. We then vary the parameters around the initial guess, one cosmic web at a time, apply the standard FGPA prediction, and monitor the variation of the cross correlation between our prediction and the reference Lyα forest field from the simulation.We notice that at this point we are still applying the standard FGPA, with the same parameters for all cells, because we have not yet fit the parameters for the non-local FGPA model.In order to be able to predict flux with the non-local FGPA at this point, one should perform an end-to-end automatic extraction of the parameters, going from the real-space dark matter field to the redshift-space non-local FGPA Lyα forest flux.However, this would imply a computationally expensive optimization procedure, involving interpolation of particles to the grid at each step.Therefore, we limit ourselves to a simpler manual determination which, while in principle suboptimal, turned out to be very instructive and allowed us to develop a good understanding of how different choices of the model parameters impact the cross correlation at different scales.
Afterward, we determine the parameters for the modified non-local FGPA model.Here, we resort to automatic parameter estimation.In particular, we determine the optimal values for our parameters by sampling their posterior distribution through the affine-invariant emcee Markov Chain Monte Carlo (MCMC) sampler (Goodman & Weare 2010;Foreman-Mackey et al. 2013).
We proceed as follows.We separately fit the Lyα forest flux PDFs ρ(F) of knots, filaments, sheets, and voids, i.e. performing a separate fit for each cosmic web type.In this way, each Markov Chain samples from the posterior distribution p( {knots, filaments, sheets, voids}, where "ref" denotes reference simulation Lyα forest PDF ρ ref (F).We assume a Gaussian likelihood for ρ ref (F): where ρ ref (F) and ρ mock (F) are the flux PDFs of the reference and predicted Lyα forest fields, and σ F corresponds to the number of cells containing a flux value inside the same bin of the PDF as F. Furthermore, we choose the following flat priors for the model parameters: 0 < A i < 2, 1 < α i < 5, 0 < δ * 1,i < 1.5, 0 < δ * 2,i < 1.5, ∀i.After running the chain with 32 walkers for 2000 iterations, we compute the autocorrelation length τ cl using the in-built emcee function, and find that in all cases τ cl ∼ 100 iterations.Therefore, we conservatively discard the first 500 iterations, i.e. ∼ 5 times the chain autocorrelation length.As an example, we show the resulting posterior distribution of the {A, α, δ * 1 , δ * 2 } free parameters obtained by fitting the Lyα forest PDF in knots.One clearly notices that there is a tight correlation between δ * 1 and δ * 2 .Eventually, we compute the median of the resulting posterior distribution, and the 16 th and 84 th percentiles intervals as associated uncertainties.
Up to this point, the model does not include the stochastic noise term, and the predicted P(k) suffers from a lack of power towards small scales with respect to the reference P(k) (see §5.4 for details).Therefore, the model parameters determined thus far correspond to the non-local FGPA without stochasticity.As described in §4.5, we include a further additive noise term, exclusively in voids, randomly sampled from a negative binomial distribution.To determine the best parameters for the binomial distribution θ ′ = { f ϵ , n, p}, we first manually vary the parameters until we reach a combination of θ ′ which enhances enough the small-scale power and provides a good fit of the power spectrum.(see §5.4 for details).Eventually, we refit the parameters altogether, including the random component, and using as an initial guess the optimal parameter values previously found.Furthermore, we adopt the following upper and lower bounds as priors on θ ′ : 1 < f ϵ < 2, 0 < n < 1, 0 < p < 1.
We report the final estimated parameter values and their uncertainty in Table 2.We find clear differences in the best-fitting values of the same parameter in different cosmic web environments.This supports our finding that including non-local terms is playing an informative role in the model about missing physical dependencies in the standard FGPA.
As a final note, we stress that only the fit of parameters for the negative binomial distribution governing the random sampling of the stochastic term was determined by considering deviations between the reference and predicted power spectrum.In fact, in the fit of the parameters of the non-local FGPA prescription, only the PDFs were taken into account.In this sense, the fact that this bias model reproduces the power spectrum and the bispectrum with such good accuracy constitutes a reassuring sanity check.displaying less pronounced redshift space distortions.When introducing a velocity bias correction to the standard FGPA (bottom left), redshift space distortions appear to model more appropriately the observed elongation of structures in the cosmic web, however, it turns out not to capture the saturation at F ∼ 1 in the underdense regions, also reflected in the probability distribution function (see also Figure 3).We again stress that the free parameters for the FGPA model have been chosen in this case in order to match the large-scale amplitude of the power spectrum.It would be also possible to find the parameters which best fit the PDF, but this yields a severe bias (≳ 30%) in the largescale predicted power spectrum amplitude.Eventually, our non-local FGPA model10 (top right) visually resembles the reference simulation to a good degree of approximation.In particular, it succeeds both in reproducing the redshift-space structure of the cosmic web, and in properly predicting the right values of flux in all the different density regimes.Without the stochasticity, underdense regions in our predicted flux would appear emptier than they actually are.This aspect is reflected in a lack of power towards small scales, as will be commented more in detail in §5.4.The addition of the noise term ϵ, only in voids, is meant to compensate for the lack of substructures.Therefore, we do not expect those small-scale structures to visually match the reference simulation in position.Rather, in a global sense, with the addition of the noise term voids appear to be more filled with substructures than in the case where we do not model stochasticity.

Probability distribution function
Figure 3 shows a comparison between the Lyα forest PDF from the reference simulation (red solid), and the prediction from standard FGPA (dashed yellow), from FGPA with velocity bias (purple dotted), from FGPA with velocity and threshold bias (black dashed-dotted), and from our non-local FGPA without and with the addition of stochastic fluctuations in voids (green dashed and blue dashed-dotted, respectively).The sharp bimodality of the Lyα forest PDF makes it particularly non-trivial to correctly predict both flux regimes, yielding the correct shape and average flux.While the standard FGPA model fails to reproduce the height of both peaks, the FGPA with velocity (and threshold) bias prescriptions achieve a better prediction towards F ∼ 0, but a worse result at F ∼ 1, as already  commented in §5.2.In particular, the FGPA model with threshold bias achieves a better prediction of the PDF with respect to the FGPA model with only velocity bias, but none of the two is able to properly account for the low-density peak.The non-local FGPA, however, succeeds in reproducing both peaks, and qualitatively achieve the best result among the studied cases also in the intermediate flux regime 0.1 ≲ F ≲ 0.9.We stress that our final model, including stochasticity, with model parameters fitted to reproduce the PDFs of knots, filaments, sheets, and voids separately, yields a predicted average flux Fpred ∼ 0.75, in great agreement with the value from the simulation Fref ∼ 0.76, even though the model has not been explicitly calibrated to reproduce such a quantity.
We point the reader's attention to the fact that the difference between the non-local FGPA (green dashed) and the FGPA model with velocity and threshold bias (black dashed-dotted) consists in making the model sensitive to non-local terms.It turns out to be clear that the non-local dependencies play an important role in improving the prediction of the Lyα forest PDF.

Power spectrum and cross-correlation
In Figure 4 we present a comparison between Lyα forest 3D power spectra P(k) in the top panel, power spectrum ratios P mock (k)/P ref (k) in the mid panel, and cross-correlation coefficients C(k) in the bottom panel.We display the reference simulation summary statistics as a red solid line, while we plot the predicted Lyα forest flux summary statistics as a yellow dashed line in the case of standard FGPA, as a purple dotted line in the case of FGPA with velocity bias, as a black dashed-dotted line in the case of FGPA with velocity and threshold bias, as a green dashed line in the case of non-local FGPA, and as a blue dashed-dotted line the non-local FGPA with stochasticity.The top and bottom panels clearly highlight that the standard FGPA (yellow dashed) and its augmented version including velocity bias (purple dotted) and velocity and threshold bias (black dashed-dotted) rapidly lose power towards small scales, reach- between the predicted and reference power spectra.Bottom: comparison of cross-correlation coefficients C(k) between the reference Lyα forest forest and the predictions from the different tested models.In each panel, the investigated reference simulation summary statistics are shown as a red solid line, while the prediction by the standard FGPA is displayed as a yellow dashed line, the FGPA with velocity bias as a purple dotted line, the FGPA with velocity and threshold bias as a black dashed-dotted line, the non-local FGPA as a green dashed line, and the non-local FGPA with stochasticity as a blue dashed-dotted line.In the mid panel, the gray shaded areas stand for 1%, 2%, 5%, and 10% deviations, from darker to lighter.In the bottom panel, the gray shaded areas stand for 5%, and 10% deviations, from darker to lighter.
Article number, page 10 of 15 , respectively, at the Nyquist frequency k nyq ∼ 1.6 h Mpc −1 .As already anticipated, the P(k) predicted by the non-local FGPA (green dashed) suffers from a ∼ 20% lack of power towards small scales with respect to the reference P(k).
Eventually, the P(k) yielded by the non-local FGPA model with stochasticity ensures a good fit of the target power spectrum, featuring maximum deviation ∼ 3% and average deviations ∼ 0.1% up to k ∼ 0.4 h Mpc −1 , maximum deviation ∼ 5% and average deviations ∼ 1.8% up to k ∼ 1.4 h Mpc −1 , and maximum deviation ∼ 8% and average deviations ∼ 0.8% up to the Nyquist frequency k ∼ 1.6 h Mpc −1 .An inspection of the bottom panel of Figure 4, instead, reveals that the standard FGPA (yellow dashed) delivers a lower C(k) with respect to all the other cases which incorporate the modeling of velocity bias.In fact, looking at Figure 2, it can be appreciated by eye how the standard FGPA (top right) does not model properly the enhancement of cosmic structure elongation along the line of sight clearly seen in the reference simulation (top left).In this sense, an adequate treatment of velocity bias ensures a better cross-correlation at all scales.The non-local FGPA (green dashed) displays a larger C(k) than the standard FGPA (yellow dashed), than the local FGPA with velocity bias (purple dotted) and than the local FGPA with velocity bias (purple dotted) at all scales, from a 1% gain in C(k) on large scales, up to a ∼ 3% gain in C(k) at k ∼ 1.0 h Mpc −1 .Eventually, the nonlocal FGPA with stochasticity (blue dashed-dotted) features a slightly lower C(k) than the non-local FGPA (green dashed) up to k ∼ 0.5 h Mpc −1 , while it starts to depart from the latter at larger k, lacking a ∼ 6% C(k) at k ∼ 1.0 h Mpc −1 .This is expected by construction, due to the addition of a random component, which has the effect of lowering the cross correlation.However, the loss of C(k) is not dramatic, and the non-local FPGA model with stochasticity (blue dashed-dotted) still improves the local FGPA with velocity bias C(k) (purple dotted) up to k ∼ 0.8 h Mpc −1 .Remarkably, the FGPA model with velocity and threshold bias is found to provide the highest C(k) across all scales, even though very similar to the prediction by the non-local FGPA model.
Here again, the difference between the non-local FGPA (green dashed) and the FGPA model with velocity and threshold bias (black dashed-dotted) consists in the inclusion of the nonlocal dependence on the cosmic web.The remarkable difference (≳ 20% gain in the accuracy of the predicted P(k) at the Nyquist frequency) makes it clear the improvement accounted for by the non-local model.

Bispectrum
To address the capability of the model to reproduce higher-order statistics, we also assess the accuracy of the predicted bispectrum.We report in Figure 5 the reduced bispectrum Q(θ 12 ) as a function of the subtended angle θ 12 , for four different triangular configurations: The plot displays the bispectrum of the reference simulation (red solid), compared to the predictions from the standard FGPA (yellow dashed), the FGPA with velocity bias (purple dotted), the FGPA with velocity and threshold bias (black dashed-dotted), the non-local FGPA (green dashed), and the nonlocal FGPA with stochasticity (blue dashed-dotted).The latter (i.e.our preferred model) reproduces the target bispectrum with reasonable accuracy for all the probed triangular configurations, while the others start to feature significant deviations at k ≳ 0.2 h Mpc −1 .In particular, the non-local FGPA model with stochasticity reproduces not only the overall shape of the reference bispectrum, but also some peculiar features, such as the position and amplitude of the peak around θ ∼ 2.3 in the k 1 = k 2 = 0.5 h Mpc −1 configuration, as well as the position (although not very much the amplitude) of the peak around θ ∼ 2.1 in the k 1 = 0.3, k 2 = 0.5 h Mpc −1 .

Lightcone generation
In this section we illustrate how our method can be generalized to other redshifts and applied to generate Lyα forest lightcones.
First, we show that the method generalizes also at other redshifts.To do so, we re-calibrate the parameters of our model also at z = 1.8 and 2.2. Figure 6 shows the resulting Lyα forest power spectra ratios at z = 1.8 (purple dotted) and at z = 2.2 (green dashed).The achieved accuracy is very similar to the one shown in Fig. 4 (mid panel) at z = 2. Subsequently, we demonstrate that once the calibration is performed for some redshift snapshots, we can interpolate the parameters for the redshifts in between and ensure a continuous smooth redshift evolution.In particular, we linearly interpolate the parameters obtained at z = 1.8 and z = 2.2 and predict the Lyα forest field at z = 2.We show the resulting P(k) at z = 2 in Fig. 6 (blue dashed-dotted), noticing that again the achieved accuracy is very similar to the one we get via direct calibration at z = 2.
Afterwards, we re-calibrate the model parameters also on the remaining redshift snapshots and generate a lightcone box as follows.We consider the redshift range ∆z = [1.8,3.8].The comoving radial distance subtended by this redshift interval is l ∼ 1500 h −1 Mpc.Therefore, after having mapped the dark matter fields at all redshift from real to redshift space, we replicate three times the same simulation box along the redshift direction.Then, we connect the redshift space dark matter field snapshots transitioning from one to the following at the average redshift between the two.Even though this implies a sharp transition in the dark matter field, we paint the Lyα forest on the lightcone by interpolating the parameters of the model with a cubic spline interpolation scheme at each redshift slice.This ensures a smooth redshift evolution.We show a slice through the final lightcone box in Fig. 7.One can clearly visualize how the Lyα forest changes from z = 1.8 to z = 3.8, with voids becoming bigger and emptier towards the low-redshift end.
In a forthcoming publication, we will present full-sky Lyα forest mocks by generating dark matter fields with lightcone geometry and smooth evolution in redshift already in the dark matter field (Sinigaglia, Kitaura et al. in prep.).

Potential future improvements
While the presented model, as anticipated, turns out to be sufficient to fulfill the purpose it was designed for, there are further potential improvements which we leave to be explored in future works.
In this work, we have included non-local dependencies in the bias formulation only through the T-web.Such dependencies are known to have a long-range effect (e.g.McDonald & Roy 2009), and we are hence neglecting short-range terms, whose effect kicks in when it comes to modeling the non-linear clustering towards small scales.Short-range non-local terms can be −15 The plot displays the bispectrum of the reference simulation (red solid), compared to the predictions from the standard FGPA (yellow dashed), the FGPA with velocity bias (purple dotted), the FGPA with velocity and threshold bias (black dashed-dotted), the non-local FGPA (green dashed), and the non-local FGPA with stochasticity (blue dashed-dotted).The latter (i.e.our preferred model) reproduces the target bispectrum with reasonable accuracy for all the probed triangular configurations, while the others start to feature significant deviations at k ≳ 0.2 h Mpc −1 .
0.1 0.5 1.0 1.5  et al. 2022) or in Lagrangian space (e.g.Zennaro et al. 2022), which characterizes the shape of the local maxima of the density field (Peacock & Heavens 1985).More in general, one can build short-range non-local terms as scalars computed from arbitrarily higher-order derivatives of the density field, such as ∂ l i ∂ l j δ (e.g.Kitaura et al. 2022).Alternatively, in analogy with the Iweb description based on the invariants of the tidal field tensor (see also §3), one can work in a framework relying on the invariants of the curvature tensor.This latter description has been proven to encode crucial information to model the bias of baryon density fields (e.g.Sinigaglia et al. 2021Sinigaglia et al. , 2022)).Since we do not model such family of dependencies here, we stress that we may be missing some relevant piece of physical information.However, we also stress that computing such terms requires an accurate enough modeling of small scales, which is not guaranteed here.Therefore, by introducing short-range non-local terms we may face the risk of introducing a noisy component, which goes in detriment of the final accuracy.Moreover, in the computationally cheapest way of modeling such dependencies, we would be extracting the analogous of the T-web based on δ i j , which would imply a non-negligible number of additional free parameters in our model.In connection to the previous point, we notice that Kitaura et al. (2022) showed that the I-web model encodes a larger amount of information on the clustering of biased tracers with respect to the T-web, at the expense of a larger number of parameters (the number of bins used to described the invariants) and therefore at a larger risk to incur into overfitting.Moreover, while describing the non-local quantities by explicitly using the invariants of the tidal tensor is possible (e.g.Pellejero-Ibañez et al. 2020), it implies devising a suitable functional form for each additional term used in the bias, which is in principle nontrivial.
Another point which can be improved consists in the way we model the low-density regions, identified as cosmic voids according to the T-web classification.In fact, in order to compensate the lack of substructures, we randomly sample a noise component from a negative binomial distribution.At this point we notice that this could be avoided, or its impact alleviated, by improving the modeling of the deterministic component.One easy modification could consist in binning the density distribution in voids into an arbitrary number of intervals, and identify distinct parameters for such distinct density bins.However, this would again introduce a larger number of free parameters in our model.
Eventually, if a noise term is to be included as done in this work, we notice that we have randomly sampled the stochastic component from a negative binomial distribution for the reasons previously stated, but this is not the only possible solution.In fact, one may find that using a different distribution may be more convenient.
We leave all these potential improvements to be investigated in future publications, or in applications adopting this model.

Conclusions
This work presents a significantly augmented version of the widely-used FGPA to predict the Lyα forest in redshift space, as observed from current and forthcoming spectroscopic galaxy surveys.This new model relies on explicit modeling of velocity bias and redshift space distortions and of a modification to the FGPA introducing a cut-off and a boosting scale, with free parameters (determined based on both heuristic and physical arguments, as well as on an efficient MCMC scheme) made dependent on the cosmic web environment as described by the T-web classification (knots, filaments, sheets, and voids, Hahn et al. 2007).In this sense, we label our model non-local FGPA.In fact, the Lyα forest flux in each cell is here made dependent on the content of the surrounding cells and the geometry of the gravitational field, and the model effectively incorporates non-local bias information up to third-order in Eulerian perturbative bias expansion (see Kitaura et al. 2022).In addition, since we find that such a model fails to reproduce the small-scale clustering in the underdense regions, we add a further stochastic term only in voids, randomly sampled from a negative binomial distribution.This random component has the effect of enhancing the smallscale power, and hence, of improving the fit of the Lyα forest forest 3D power spectrum towards small scales.
We predict Lyα forest fluxes with standard FGPA, our preferred non-local FGPA model with stochasticity, and three other intermediate models in between, on a mesh with V = (500 h −1 Mpc) 3 volume and N c = 256 3 cells, with physical cell resolution l ∼ 1.95 h −1 Mpc, trying to reproduce a full cosmological hydrodynamic N-body/SPH simulation (spanning the same volume) and run with N = 1024 3 particles.We assess the accuracy of the prediction of such models by comparing the results regarding the mean transmitted flux F, the PDF, the power spectrum, and the bispectrum, with the analogous summary statistics computed from the reference simulation.
The augmented non-local FGPA with stochasticity improve upon the standard FGPA model in all the investigated summary statistics.In fact, the non-local FGPA accurately reproduces the mean transmitted flux F, the flux PDF, the power spectrum, and the bispectrum.In particular, our model yields a mean transmitted flux Fpred ∼ 0.75, in excellent agreement with the value Fref ∼ 0.76, and a power spectrum featuring maximum deviation ∼ 3% and average deviations ∼ 0.1% up to k ∼ 0.4 h Mpc −1 , maximum deviation ∼ 5% and average deviations ∼ 1.8% up to k ∼ 1.4 h Mpc −1 , and maximum deviation ∼ 8% and average deviations ∼ 0.8% up to the Nyquist frequency k ∼ 1.6 h Mpc −1 .The predicted PDF and the bispectrum clearly outperform the results from any other model tested in this paper as well.
Compared to other schemes based on machine learning (e.g., Harrington et al. 2021;Horowitz et al. 2021;Sinigaglia et al. 2022), and other methods based on iterative calibrations (e.g., Sorini et al. 2016;Peirani et al. 2022), our model offers the appeal of being a purely analytical method, and therefore it is fast to compute, less prone to the overfitting problem, and it can be straightforwardly generalized to larger/smaller volumes, different mesh resolutions, and at different redshift snapshots.This method can even ensure a continuous smooth treatment of redshift evolution via interpolation of the fitting coefficients in between the redshift snapshots used for calibration, avoiding dis-continuity problems at the edge of distinct contiguous redshift shells.
We point out that the resolution (1.95 h −1 Mpc physical cell size) considered in this work is not sufficient to provide realistic Lyα forest high-resolution spectra with correct modeling of the 1D power spectrum, needed to perform studies of small-scale (k ∼ 5 − 10 h Mpc −1 ) Physics, such as bounds on the mass of warm dark matter particles (Viel et al. 2005(Viel et al. , 2013) ) and of neutrinos (Palanque-Delabrouille et al. 2015) and constraints on the thermal state of the intergalactic medium (Garzilli et al. 2017, and references therein), among others.While the application of the non-local FGPA to high-resolution density fields is possible, one should bare in mind that resolving sub-Mpc scales on a 3D mesh makes it computationally very expensive to realize largevolume cosmological boxes.Therefore, in future works, we will explore novel techniques based on Bayesian inference aimed at upsampling the resolution of 1D Lyα forest skewers extracted from low-resolution Lyα forest flux boxes generated following the procedure presented in this work.
We plan to adopt this novel scheme to generate Lyα forest mock lightcones for the DESI survey, in combination with accurate approximated gravity solvers correcting for shell-crossing (e.g.Kitaura & Hess 2013;Tosone et al. 2021;Kitaura et al. 2023).In particular, we aim at using a recent fast structure formation model based on iterative Eulerian Lagrangian Perturbation Theory (eALPT, Kitaura et al. 2023), which has been shown to reproduce the clustering of N-body simulations with percent accuracy deep into the non-linear regime.Furthermore, the application of the non-local FGPA model to approximated dark matter fields generated through the ALPT or eALPT approximations can potentially represent a new avenue in Bayesian density field reconstruction studies using the Lyα forest.We will investigate this point in future publications.
In conclusion, we argue that this novel non-local FGPA scheme represents a significant step forward with respect to previous analytical efforts, and it can become instrumental in the generation of fast accurate mocks for covariance matrices estimation in the context of current and forthcoming Lyα forest surveys.
The code implementing the model presented in this work is made publicly available.11

Figure 2 Fig. 1 .
Figure2shows a comparison of slices through the simulation box, obtained by averaging two contiguous slices 1.95 h −1 Mpc thick parallel to the (x, z) plane, displaying redshift space distortions along the z axis.A visual inspection between the prediction from the standard FGPA (bottom right) and the reference cosmological simulation (top left) immediately reveals that the former fails to adequately render the redshift-space cosmic structures,

Fig. 2 .
Fig. 2. Slices through the simulation box, obtained by averaging two contiguous slices 1.95 h −1 Mpc thick parallel to the (x, z) plane, displaying redshift space distortions along the z axis.The plot displays a Lyα forest transmitted flux F/F c slice through the reference cosmological hydrodynamic simulation (top left), and through the predicted Lyα forest boxes obtained through standard FGPA (bottom right), FGPA with velocity and threshold bias (bottom left), and our final non-local cosmic-web dependent FGPA and stochasticity (top right).The maps are color-coded from red to blue for values ranging in the interval [0.1],where red indicates underdense and blue overdense regions.All the slices share the same color scale and the same extrema.

Fig. 3 .
Fig. 3. Comparison between probability distribution functions of transmitted Lyα forest flux F/F c as extracted from the reference simulation (red solid), and as predicted by the standard FGPA (yellow dashed), by the FGPA with velocity bias (purple dotted), by the FGPA with velocity and threshold bias (black dashed-dotted), by the non-local FGPA(green dashed), and by the non-local FGPA and stochasticity (blue dasheddotted).

Fig. 4 .
Fig.4.Top: comparison of 3D Lyα forest power spectrum P(k).Mid: comparison of 3D Lyα forest power spectrum ratios P mock (k)/P ref (k) between the predicted and reference power spectra.Bottom: comparison of cross-correlation coefficients C(k) between the reference Lyα forest forest and the predictions from the different tested models.In each panel, the investigated reference simulation summary statistics are shown as a red solid line, while the prediction by the standard FGPA is displayed as a yellow dashed line, the FGPA with velocity bias as a purple dotted line, the FGPA with velocity and threshold bias as a black dashed-dotted line, the non-local FGPA as a green dashed line, and the non-local FGPA with stochasticity as a blue dashed-dotted line.In the mid panel, the gray shaded areas stand for 1%, 2%, 5%, and 10% deviations, from darker to lighter.In the bottom panel, the gray shaded areas stand for 5%, and 10% deviations, from darker to lighter.

Fig. 7 .
Fig. 7. Slice through the lightcone box with redshift evolution from z = 1.8 to z = 3.8.The lightcone is obtained by replicating three times the same simulation box along the redshift direction and connecting the different redshift space dark matter field snapshots.Afterwards, the Lyα forest was painted on the dark matter lightcone by interpolating the parameters of the model calibrated at different redshift snapshots, as illustrated in §6.Once can visually appreciate the redshift evolution, with voids becoming bigger and emptier towards the low-redshift end.

Table 2 .
Best-fit model parameters for our preferred non-local FGPA prescription with stochasticity.The left column reports the symbol used throughout the paper for each parameter, the central column reports the value and the associated uncertainties, and the right column provides a brief description of what the parameter stands for.