An emulator for the Lyman-$\alpha$ forest in beyond-$\Lambda$CDM cosmologies

Interpreting observations of the Lyman-$\alpha$ forest flux power spectrum requires interpolation between a small number of expensive simulations. We present a Gaussian process emulator modelling the 1D flux power spectrum as a function of the amplitude and slope of the small-scale linear matter power spectrum, and the state of the intergalactic medium at the epoch of interest ($2<z<4$). This parameterisation enables the prediction of the flux power spectrum in extended cosmological models that are not explicitly included in the training set, eliminating the need to construct bespoke emulators for a number of extensions to $\Lambda$CDM. Our emulator is appropriate for cosmologies in which the linear matter power spectrum is described to percent level accuracy by just an amplitude and slope across the epoch of interest, and in the regime probed by eBOSS/DESI data. We demonstrate this for massive neutrino cosmologies, where the emulator is able to predict the flux power spectrum in a $\Sigma m_\nu=0.3$ eV neutrino cosmology to sub-percent accuracy, without including massive neutrinos in the training simulations. Further parameters would be required to describe models with sharp features in the linear power, such as warm or light axion dark matter. This work will facilitate the combination of upcoming DESI data with observations of the cosmic microwave background, to obtain constraints on neutrino mass and other extensions to $\Lambda$CDM cosmology.

interpolation between a small number of expensive simulations. We present a Gaussian process emulator modelling the 1D flux power spectrum as a function of the amplitude and slope of the small-scale linear matter power spectrum, and the state of the intergalactic medium at the epoch of interest (2 < z < 4). This parameterisation enables the prediction of the flux power spectrum in extended cosmological models that are not explicitly included in the training set, eliminating the need to construct bespoke emulators for a number of extensions to ΛCDM. Our emulator is appropriate for cosmologies in which the linear matter power spectrum is described to percent level accuracy by just an amplitude and slope across the epoch of interest, and in the regime probed by eBOSS/DESI data. We demonstrate this for massive neutrino cosmologies, where the emulator is able to predict the flux power spectrum in a Σm ν = 0.3 eV neutrino cosmology to sub-percent accuracy, without including massive neutrinos in the training simulations. Further parameters would be required to describe models with sharp features in the linear power, such as warm or light axion dark matter. This work will facilitate the combination of upcoming DESI data with observations of the cosmic microwave background, to obtain constraints on neutrino mass and other extensions to ΛCDM cosmology.

Introduction
The Lyman-α forest (Lyα forest) is a series of absorption features in the spectra of highredshift quasars, which is caused by the presence of intervening neutral hydrogen. During the last decade, the Baryon Oscillation Spectroscopic Survey (BOSS, [1]) and its extension eBOSS [2] have obtained over 200,000 quasar spectra. These surveys were able to measure the 3D correlation of absorption features and detect baryon acoustic oscillations (BAO) around z = 2.3 [3]. In addition, the same dataset has been used to provide precise measurements of the flux power spectrum along the line of sight, P 1D [4,5]. In this paper we present a framework to interpret these P 1D measurements and obtain robust constraints on cosmological parameters. Starting in 2021, the Dark Energy Spectroscopic Instrument (DESI, [6]) will observe 700,000 Lyα forest quasar spectra, and will provide the most precise measurement of the P 1D to date. P 1D measurements are a unique probe of the clustering of matter on megaparsec scales [7][8][9][10][11][12][13]. In particular, measurements from BOSS, eBOSS and DESI 1 are sensitive to the clustering of matter on scales of 0.2 Mpc −1 < k < 3 Mpc −1 [22]. This information is highly complementary to other probes, such as observations of the cosmic microwave background (CMB), and combined analyses have provided some of the tightest constraints on the shape of the primordial power spectrum and upper limits on the neutrino mass scale [23][24][25][26][27][28][29][30][31][32].
Obtaining cosmological constraints from P 1D measurements requires theoretical modelling. Perturbative approaches are not sufficiently accurate on the small scales to which the P 1D is sensitive. Moreover, the absorption features are also sensitive to the state of the intergalactic medium (IGM), which is dependent on baryonic physics and the ultra-violet background [33]. Modelling these effects simultaneously requires expensive hydrodynamical simulations, each costing ∼ 10 5 CPU hours, which means, in practice, that only a small number ( 100) of simulations can be run for a given analysis. This is far fewer than the ∼ 10 6 likelihood evaluations necessary for common sampling techniques, e.g., Markov chain Monte Carlo methods (MCMC), to provide robust statistical constraints on parameters. A solution to this problem has been to construct frameworks that interpolate between simulations. Similar problems exist in other areas of cosmology, such as in analysis of galaxy surveys, where Gaussian processes have been used to interpolate probabilistically (or emulate) the galaxy power spectrum [34] and correlation function [35] as functions of cosmology from a small number of expensive simulations. Analysis of the P 1D over the last decade most commonly used quadratic polynomial interpolation [36,37]; however recent work has demonstrated the advantages of using a Gaussian process emulator to perform the interpolation [15,20,21,[38][39][40][41].
Historically, approaches to combining small-scale clustering information from the Lyα forest with CMB observations fall into two categories. Early work on the Lyα forest recognised that the Lyα forest probes the universe at an epoch that is close to an Einstein de-Sitter (EdS) model, and therefore considered the cosmological information in the Lyα forest to be concentrated in the linear matter power spectrum [7,[9][10][11]. The procedure was then to constrain the z ∼ 3, linear matter power spectrum on megaparsec scales from Lyα forest observations [8,9,12,13], and combine these measurements with the CMB to constrain cosmological parameters [23][24][25][26][27][28]. More recent analyses [30-32, 36, 37, 42] have instead modelled the P 1D as a function of ΛCDM parameters, plus neutrino mass in the case of studies into massive neutrinos.
The work we present here returns to the original approach. The motivation for this is twofold. First, the effects of ΛCDM parameters on the Lyα forest are strongly degenerate with one another. Minimising parameter degeneracies is important at all stages of the analysis. Degeneracies can be artificially broken by interpolation errors, and lower dimensional parameter spaces allow for a more efficient distribution of simulations. Second, there are several well-motivated extensions to ΛCDM that will further characterise physical properties of the universe, such as the number and mass scale of the neutrinos, or running of the spectral index (see section 7 of ref. [43] for further examples). When combined with CMB observations, the small-scale clustering information provided by the P 1D provides constraints on these and many other extensions to ΛCDM. However, given the large number of prospective parameter spaces that one would want to explore, it is impractical to construct an emulator for each. A potential solution is to emulate the P 1D in a parameter space that can simultaneously describe the effect of these extended models on the flux power spectrum.
In this work we investigate this idea by constructing a Gaussian process emulator modelling the P 1D with just two parameters in the cosmological sector: the amplitude and slope of the linear matter power spectrum at the epoch of interest. We demonstrate that this emulator can accurately predict the P 1D in massive neutrino cosmologies without including massive neutrinos in the training simulations.
As we will show in section 2, parameterising the linear power spectrum in terms of just an amplitude and slope well describes two commonly investigated extensions to ΛCDM, neutrino mass [44,45] and curved cosmologies [43], over the relevant range of length scales. This is because these cosmologies induce smooth variations in the small-scale linear power spectrum. Many proposed alternatives to cold dark matter (such as light axion [46,47], self-interacting [48] or sterile neutrino [49,50] dark matter) feature a sharp suppression of the linear matter power spectrum on small scales, which this parameterisation does not describe. High resolution P 1D measurements [16,51] have been used to put bounds on properties of these models by constraining this cutoff scale [17-19, 21, 32, 38]. Recent results bound the cutoff scale at wavenumbers higher than those probed by the P 1D measured from BOSS/eBOSS/DESI, which is the focus of this paper. Therefore, while in principle our parameterisation of the linear power spectrum can be extended to include a cutoff scale, we leave this investigation for future work.
The outline of this paper is as follows: in section 2, we describe the parameterisation of the emulator in terms of the linear matter power spectrum and state of the IGM. Section 3 describes the suite of simulations, and the post-processing steps. In section 4, we detail the construction of the Gaussian process emulator and validate the predictions on test data. In section 5, we run MCMC chains to obtain constraints on the primordial power spectrum from simulated mock data. We discuss our results in section 6.

Parameterisation
In this section we describe the parameterisation of our emulator. The absorption features that constitute the Lyα forest are caused by the presence of intervening neutral hydrogen along the lines of sight to quasars. The density of this neutral hydrogen is determined both by fluctuations in the baryon density which trace the underlying dark matter field, and the thermal and ionisation state of the IGM. We use a total of six parameters to describe these effects: two to describe the clustering of matter and four to characterise uncertainty in the state of the IGM. These are discussed in sections 2.1 and 2.2 respectively.

Linear matter power spectrum
The Lyα forest is sensitive to the matter density field. We parameterise the P 1D with the amplitude ∆ 2 p (z) and slope n p (z) of the combined cold dark matter + baryon linear power spectrum 2 , P lin , defined as: where we use a pivot scale k p = 0.7 Mpc −1 which was used in the measurement of the smallscale linear matter power spectrum of ref. [13]. To numerically evaluate ∆ 2 p and n p , we use CAMB to generate a linear matter power spectrum and fit a second order log polynomial over the range 0.5k p < k < 2k p . We do not include parameters to describe changes to the growth rate or expansion history. This choice is motivated by the fact that in the redshift range that gives rise to the Lyα forest (5 > z > 2), the universe is very close to Einstein-de Sitter (EdS) with Ω m (z = 2) ≈ 0.91 and Ω m (z = 5) ≈ 0.99 for typical cosmological models allowed Figure 1. Information compression applied to two extended-ΛCDM parameter spaces. For 100 randomly sampled points in the Planck chain, we plot the ratio of the z = 3 linear matter power spectrum for each model with the baseline ΛCDM best fit model [43]. In panel (a), we sample from a chain with free Σm ν . In panel (b), for each of these models we rescale the amplitude and slope of the linear matter power spectrum to match at k p = 0.7 Mpc −1 , shown in the vertical dashed line. The colour of each line represents the value of Σm ν in that model. In the bottom panels we repeat the procedure for a chain with free Ω k . Panel (c) shows the ratio for 100 randomly chosen samples, and in panel (d) we have rescaled the slope and amplitude to match at the vertical dashed line. The vertical shaded area represents the length scales to which the P 1D is sensitive in BOSS data [22], and the horizontal gray band represents the region of 1% agreement. by the CMB. In an EdS universe, the expansion rate evolves H(z) =ȧ/a ∝ (1 + z) 3/2 , and the growth factor evolves proportional to a. In particular, for the expansion history, there is a 1.2% difference in H(z = 2) between flat cosmologies with h = 0.67 and h = 0.74 at fixed physical density ω m = Ω m h 2 , which is the range of the current tension between early and late universe constraints on H 0 [52]. This difference drops to 0.5% at z = 4 as we move deeper into the EdS regime. Similarly, when we compare the logarithmic growth rates for cosmologies with Ω m = 0.25 and Ω m = 0.35, we see a 2.1% difference at z = 2 and a 0.4% difference at z = 4. We construct our emulator under the assumption that the relationship between the matter power spectrum and the P 1D is insensitive to these slight changes in expansion history and growth rate, and we consider the results shown in sections 4 and 5 a test of this assumption. Figure 1 demonstrates why we consider just two parameters describing a slope and an amplitude to be sufficient to parameterise the linear matter power spectrum within this regime. In panel (a), we randomly sample 100 points from the Planck chains 3 [43] with free Σm ν , and plot the ratio of the linear matter power spectrum at z = 3 with the baseline ΛCDM best fit model, which we denote as P best fit lin . The colour of each line indicates the neutrino mass value for each model. We see the expected behaviour where cosmologies with a higher neutrino mass tend to lead to a suppression of the power spectrum on small scales at late times. In panel (b), we take the same 100 samples and rescale the amplitude and slope of the power spectrum in order to match the small-scale linear power in P best fit lin as parameterised by equations 2.1 and 2.2. The vertical shaded region shows the approximate length scales to which the P 1D is sensitive in BOSS data [22]. The upcoming DESI results will provide an improved measurement of the P 1D over the same range of wavenumbers, so the arguments presented in this paper will continue to apply. The horizontal band shows the region of 1% agreement. The effects on the linear matter power spectrum of variations in the 7 free parameters are captured by changes to the slope and amplitude to within 1% (and often much better) when considering only the length scales within the shaded region. This is compared with the most recent constraints on the amplitude of the small-scale linear power spectrum from BOSS which have an uncertainty of 6% [5]. In the bottom panels (c) and (d) we repeat the procedure for a chain with free curvature parameter Ω k instead of Σm ν . Again we see that the all the variations in cosmology are degenerate with changing just the slope and amplitude of the power spectrum on the scales of interest.
A natural extension to ∆ 2 p and n p would be to include a parameter describing the second order derivative in the linear power around k p (a running of the small-scale linear power spectrum). The necessity of this parameter would be determined by the variation in the running within the prior range of the cosmological models under investigation, and the k range and precision of the observational dataset. It would also be possible to include an extra parameter describing the growth rate if deemed necessary. We leave such extensions to future work.

Intergalactic medium
The neutral hydrogen density is determined both by the density of hydrogen (dependent upon the clustering as described in the previous section, as well as the mean baryon density ω b ) and the ionisation state, which is itself affected by several astrophysical factors. A primary factor is the intensity of the ionising metagalactic UV background (UVB), which is highly uncertain. Assuming ionisation equilibrium this is balanced with the rate of recombination, which is dependent on the gas temperature. We parameterise these effects in terms of an effective optical depth, τ eff , which is related to the mean transmitted flux fraction F by The thermal state of the gas around the mean cosmic density regions that are probed by the Lyα forest is well approximated by a power-law: is the baryon overdensity , T 0 is the gas temperature in K at mean density [53] and γ is a free parameter. The effect of T 0 on the recombination rate is captured by F , however thermal motion of the gas particles also induces a Doppler broadening of the absorption features. This has the effect of smoothing the P 1D on small scales. We describe this with a thermal broadening scale where the factor (1 + z)/H(z) is required to convert the broadening scale from km/s to Mpc (to match the units in which we store the P 1D ). We include the parameters σ T and γ in order to characterise the effects of gas temperature on the P 1D .
On small scales the baryons are supported by gas pressure, leading to a small-scale suppression in the baryon power spectrum which therefore affects the Lyα forest. Unlike the thermal broadening, this is a 3D effect that also affects correlations perpendicular to the line of sight. Following ref. [54] we parameterise this effect with a pressure smoothing scale, k F , in units of Mpc −1 .

Simulations
To create a set of training data for constructing the emulator we present in section 4, we run a suite of 30 hydrodynamical simulations using the TreeSPH code MP-Gadget 4 [55], a variant of the widely used Gadget-2 [56] with improved parallelisation and gravitational force solver. The speed of these simulations is increased using the QuickLymanAlpha option which turns regions of baryon overdensity exceeding 10 3 and with temperatures T < 10 5 K into stars. These highly non-linear regions are computationally expensive to evolve but do not contribute to the Lyα forest [26]. Our simulation box size is L = 67.5 Mpc with 768 3 CDM and baryon particles. These values were chosen as a compromise between modelling the low k modes that play an important role in cosmological constraints from BOSS data, and modelling the small-scale effects of pressure smoothing of the IGM within computational constraints.
The simulations start at z = 99 from initial conditions created using MP-GenIC, which computes initial displacements using the Zel'dovich approximation. The baryons and dark matter are initialised on an offset grid using species-specific transfer functions calculated in CLASS [57]. Following the simulations presented in ref. [58], in order to reduce the effects of cosmic variance we adopt the 'paired-and-fixed' approach from refs. [59][60][61]. In this method, the initial amplitudes of the Fourier modes are fixed, with random phases. The same random seed is used for all simulations, meaning that the residual cosmic variance after applying the paired-fixed approach is the same in each simulation. We do not include any modelling of AGN feedback. We output snapshots at 4 > z > 2 at intervals of ∆z = 0.25. Note that this redshift information is not included in the emulator. For each simulation snapshot we calculate the transmitted flux fraction along 500 2 lines of sight using the postprocessing code "fake spectra" 5 [62]. We take the Fourier transform of the flux fraction along each line of sight, and find the average of each Fourier mode to calculate the P 1D . The line-of-sight resolution is set to 0.05 Mpc, which is high enough to resolve the thermal broadening scale σ T whose distribution is shown in figure 2. Box sizes and flux line-of-sight resolutions are chosen such that our mock P 1D have the same spacing in k for all simulations and snapshots.
In order to ensure that our training data are well distributed within the range of physically interesting models, the suite was set up using a Latin hypercube. The Latin hypercube spans the parameters [∆ 2 , with the ranges shown in the left column of table 1. We use a flat ΛCDM cosmology, and vary only the primordial power spectrum in order to sample ∆ 2 p and n p . This will allow us to test later our hypotheses from section 2. We fix ω c = 0.12, ω b = 0.022 throughout this paper, since they are very well constrained by the CMB and otherwise poorly constrained by the Lyα forest. We also explore a range of IGM histories so that the astrophysical parameters in section 2.2 are well sampled. This is done using modifications [63] around a fiducial reionisation model presented  in ref. [64], using a spatially uniform UVB. We define z rei as the midpoint of HI reionisation, i.e. when the ionisation fraction of HI is 0.5, and modify the UVB rates such that this point ranges between z rei ∈ [5.5, 15]. Following ref. [65], the thermal history is varied using parameters H A and H S , which respectively rescale the amplitude of the gas heating rates, and the slope of their density dependence, = H A 0 ∆ H S b . These translate most directly into changes to T 0 (and therefore σ T ) and γ respectively.
In addition to the training simulations, we run three more simulations that we use for various tests, shown in the right side of table 1. The first is a simulation at the centre of the Latin hypercube of simulation parameters in table 1. We refer to this as the central sim. Second we run a ΛCDM simulation with a different background expansion and growth, labelled h sim, with a larger value of h = 0.74 at fixed physical densities (ω c , ω b ). Finally we run a simulation with 0.3 eV massive neutrinos, labelled ν sim, where the neutrinos are implemented using the linear response approximation from ref. [66]. Again we keep ω c and ω b fixed. The values of A s in these test simulations are all set such that ∆ 2 p (z = 3) = 0.35, which is the midpoint of the Latin hypercube of the training data. We use the same reionisation model in all test simulations. These simulations are therefore constructed in such a way that they lie in the same central part of emulator parameter space. In the case of the ν sim, this leads to an increase in A s to compensate for the suppression of the linear power on small scales caused by massive neutrino free streaming [58].
The 30 training simulations, each with 9 snapshots, produce a total of 270 mock P 1D measurements. The Latin hypercube of simulation parameters in table 1 does not propagate into a Latin hypercube in the parameters described in section 2, but it does ensure that the physically motivated regions of this parameter space are well populated with training points. In figure 2 we show the distribution of these training points in three projections of our six dimensional emulator parameter space. The points are coloured corresponding to the redshift of each training point, although we show this purely for illustrative purposes. We assume that these 6 parameters are sufficient to describe the P 1D without including explicit redshift information, and so we construct a single training set combining points from all redshifts.  plane. Both of these parameters are strongly correlated with redshift, and so large parts of this plane are unphysical, and therefore not populated with training points. Although we do not explicitly include redshift information in the emulator, we observe the redshift evolution of P 1D through the evolution of ∆ 2 p and F . In the right panel we show the distribution of training points in k F and n p . There is no redshift evolution of n p , and so once again the trajectories of individual simulations are identifiable. The pressure smoothing scale evolves monotonically with redshift as the gas accumulates thermal energy from the UV background.

Gaussian process emulator
The purpose of this paper is to model the P 1D as a function of the parameters described in section 2 using the simulations presented in section 3. To do this we use a Gaussian process (GP) [67] implemented in the Python package GPy [68]. Our simulations are constructed in such a way that every snapshot provides a measurement of the P 1D in 85 linearly spaced k bins covering the range [0.0931 < k < 7.91] Mpc −1 . Therefore we have a set of training points for each k bin, and we drop the k notation for the following discussion. We denote a point in parameter space as θ = [∆ 2 p , n p , F , σ T , γ, k F ]. A GP models the function one wants to learn, in this case P 1D (θ), as a collection of random variables that form a joint Gaussian distribution, i.e. P 1D (θ) ∼ N (0, K(θ, θ i )). Here θ i are the locations of the training points where we have modelled the P 1D using simulations, and K(θ, θ ) is a covariance kernel whose precise form will be discussed shortly. In order to approximate the Gaussian process prior of zero mean, we normalise the P 1D by the median value of the training data. At a test point θ , the joint distribution of the test data P 1D (θ ) and training data P 1D (θ i ) can be written as where σ 2 n is a hyperparameter representing Gaussian noise on the training data. The mean and variance of the posterior predictive distribution given the training information at θ i are then taken as our estimates of the value and interpolation uncertainty for the P 1D (θ ): There is significant freedom in the choice of covariance kernel, which represents a weak prior on the behaviour of the underlying function one wants to model. We use a squared exponential kernel, also known as a radial basis function with an independent correlation length l i in each parameter direction. It follows that our GP has a total of 8 hyperparameters: 6 correlation lengths, σ 2 0 and σ 2 n . The hyperparameters are optimised by maximising the marginal log likelihood of the training data [67]. This is done across all k bins simultaneously, so we use the same set of hyperparameters for the full bandpower description of the P 1D . To avoid numerical issues, all of our parameters θ are rescaled to vary within a unit volume before being passed to the emulator.
We briefly highlight the differences in this setup compared to previous applications of GPs to emulation of the P 1D [20,39,40]. These emulators trained an independent GP on each redshift bin, whereas we combine training points from all redshifts in one GP. These works also used a different covariance kernel, combining the squared exponential in equation 4.4 with a linear term, and used an isotropic correlation length. The parameterisation used in these works was such that the P 1D varied by a similar order of magnitude within the prior volume in each parameter direction, meaning that an isotropic kernel performed well. A major difference in our setup is that the sensitivity of the P 1D to each parameter varies strongly. This is reflected in the optimised correlation lengths, which range from 0.47 for F to 11.5 for n p . Since we combine all training points in a single emulator and we use larger simulations with more k bins, there is more information in the training set with which to optimise a larger number of hyperparameters in an anisotropic kernel.

Emulator validation
In this section we validate the emulator predictions on simulation data. In figure 3 we show the results of a leave-one-out validation test iterated across the full suite of 30 training simulations 6 . We compare the P 1D in the nine snapshots of each validation simulation with the predicted P 1D from an emulator trained on the rest of the suite. The solid red lines and shaded red regions respectively show the mean and variance standard deviation of the percentage error on these 30 predictions. The black solid lines show the ± median 1 − σ uncertainty on the predictions as estimated by the GP using equation 4.3, and the gray shaded area shows the region of 1% error. We show the accuracy of the emulator predictions at different redshifts to compare it with the uncertainty on BOSS measurements of the P 1D from ref. [5], represented by blue shaded regions 7 .
The emulator predictions are unbiased, with a mean error of < 1% at the redshifts and wavenumbers measured by BOSS, eBOSS and DESI. The variance indicated by the shaded  . The gray shaded area shows the region of 1% error. We show the performance of the emulator at several redshifts to compare it with the observational uncertainties from the latest analysis of BOSS/eBOSS data (blue shaded regions [5]). We expect that P 1D measurements from DESI will cover a similar range of wavenumbers, but the systematic contribution to the data covariance is currently difficult to predict. red region can be interpreted as an empirical estimate of the average error on emulator predictions. In all regimes probed by BOSS data this quantity is < 0.5%, considerably lower than observational uncertainties. The median theoretical error is slightly larger implying that emulator uncertainties are slightly overestimated, which was also the case in refs. [20,39,40]. Nevertheless, these uncertainties are still also well below those of the BOSS data. We do not include lines for DESI here as the exact increase in sensitivity is still uncertain. However it is worth noting that in the case where DESI errors become comparable to these uncertainties, there are several avenues to be explored such as changes to the covariance kernel to improve emulator performance or the addition of refinement simulations using Bayesian optimisation  [20,40]. Besides the measurements from BOSS and eBOSS, the Lyα forest P 1D has also been measured from small samples of very high-resolution spectra [15-17, 51, 69]. These measurements extend to higher redshift and smaller scales, and e.g. can be useful in breaking some of the parameter degeneracies (see appendix A). However, our simulations do not have the resolution required to resolve the pressure scale for some of the colder IGM models at z > 4 (see right panel of figure 2).

Predictions at a redshift without training data
In constructing a single GP using training points from all redshifts, we are assuming that the P 1D is sufficiently well described by the six parameters from section 2, without the need to explicitly include redshift information in the emulator. Under this assumption, if one were to construct two simulations that had the same set of θ parameters at different redshifts, the P 1D at these two different redshifts would be the same. In practice, engineering such a simulation is non-trivial due to the complex relationship between simulation parameters that model the IGM and the astrophysical emulator parameters. Therefore, in order to test this hypothesis, we construct an emulator dropping all training points at z = 3 (the points shown in red in figure 2). We then use this emulator to predict the P 1D in the z = 3 snapshots for 3 randomly chosen simulations, i.e. we are using training points at z ≤ 2.75 and 3.25 ≤ z to predict the P 1D at z = 3.
The accuracy of these predictions is shown in figure 4, with the theoretical uncertainty (computed using equation 4.3) on each prediction shown in the shaded coloured region. In each case the predictions have sub-percent accuracy over the full k range. This is a novel result that has not been possible with previous Lyα forest emulators, which validates our choice to combine training data from all redshifts and omit redshift information. The ability to emulate the P 1D at arbitrary redshifts also confers several advantages. In particular, our emulator could be used to analyse DESI measurements of P 1D , regardless of the particular redshift binning of the data, which is currently unknown. Additionally, it is possible to combine training data from simulation suites with different output redshifts, making the construction of training sets more flexible.

Predicting extended models
Finally, we verify the emulator predictions on two test simulations: the h sim and ν sim. The suite of training simulations has a fixed expansion history and growth rate, with only the primordial power spectrum varied in order to sample ∆ 2 p and n p . By fixing the growth rate in all training simulations, the emulator is learning the transfer function between the linear matter power spectrum and the P 1D for a fixed relationship between the density power spectrum and the velocity power spectrum. The velocity power spectrum will affect the P 1D through redshift space distortions (RSDs). Therefore it is important to test the accuracy of the emulator predictions in cosmologies with different growth rates to the training set. This will act as a test of our hypothesis that the P 1D is sensitive primarily to the linear matter power spectrum and insensitive to minor changes in the background expansion.
On the left side of figure 5, we show the accuracy of the emulated P 1D when compared with the truth in the h sim. The vertical dotted line shows the highest wavenumbers in BOSS measurements [5], and the gray shaded area shows the region of 1% accuracy. One way to consider this difference in h within flat cosmologies is as a variation in the onset of dark energy. We are therefore changing the growth factor in the redshift range 4 > z > 2, which will have two clear effects. First, the redshift evolution of ∆ 2 p will change, and this effect is captured by our parameterisation. Second, the velocity power spectrum and therefore the RSDs will be different to the simulations in the training set. We are interested in determining whether this effect, which is not captured by any of our parameters, will impact the accuracy of the emulator predictions to a statistically-significant degree. Similarly to the tests shown in figure 3, the predictions have sub-percent accuracy in the regime probed by BOSS and DESI data. Therefore we can conclude that the effect on the growth rate for this scale of change in h is negligible.
In ref. [58] it was shown that the effects of massive neutrinos on the P 1D are degenerate with changes to the amplitude of the linear power spectrum to within 1%. In that work it was difficult to disentangle effects of massive neutrinos on the IGM, that were partially accounted for by changing F and σ T in postprocessing. The emulator presented in this paper allows for this degeneracy to be tested more clearly since the effects of variations in the IGM and linear power spectrum can be predicted independently. The right side of figure 5 shows the accuracy of predictions compared with the truth in the ν sim, which has Σm ν = 0.3 eV. When adding massive neutrinos, we have kept ω b and ω c fixed. Therefore we are modifying the late-time growth and expansion rate governed by Ω m , which will have similar effects to the previous case on the IGM, the growth of structure, and baryon velocities. Another effect is the suppression of late-time, small-scale clustering due to free-streaming. This suppression is fairly large -approximately 8% in the case of Σm ν = 0.3 eV, and it is this suppression that makes the Lyα forest a valuable probe of neutrino mass. Again the effect on the amplitude of clustering is captured in our setup by changes to ∆ 2 p . We see similar results to the previous test, in that the emulator predictions are accurate to less than 1 %, with the best performance at high z. Note that this neutrino mass is larger than current bounds Σm ν < 0.12 (95% credibility) from Planck + baryon acoustic oscillations (BAO) [43]. In this extreme case the emulator still performs well; the methodological errors for smaller neutrino masses will be even smaller. These results demonstrate that the relationship between the linear matter power spectrum and the P 1D is insensitive to the specific background expansion used in training the emulator. The test simulations were set up such that each emulator parameter is the same in both simulations to within 2%. This choice was made so that the accuracy of the predictions shown in figure 5 serves as a test of our parameterisation rather than as a test of the interpolation.

Parameter constraints
The ability of Lyα forest information alone to constrain cosmological parameters is limited by parameter degeneracies and insensitivities. The constraining power of the P 1D on cosmological parameters comes when combined with CMB information, and we leave a discussion on this for future work. However, in order to test the robustness of our emulator, we present parameter constraints from a Lyα forest-only likelihood on the amplitude and slope of the primordial power spectrum, A s and n s when using simulated mock data. In order to min-imise degeneracies, these quantities are defined at a scale k = 0.7 Mpc −1 . To further test the assumptions of fixing the growth and expansion rate, we also explore the sensitivity of the P 1D to H 0 . Finally, we show constraints on Σm ν in order to investigate the degeneracy between Σm ν and A s [42,58]. Given that constraints on Σm ν are a major motivation in P 1D analysis, understanding this degeneracy will be important in the analysis of DESI data.

Likelihood parameters
Given that our emulator parameters are not associated with any redshift, we do not attempt to constrain the emulator parameters themselves. For our likelihood parameter vectorθ we consider two classes of parameters -cosmological and IGM parameters. In section 5.3 the cosmological parameters we vary are A s and n s , and we use uniform priors covering the ranges [1.1, 2.5] × 10 −9 and [0.89, 1.05] respectively. In section 5.4 we also vary Σm ν and H 0 , with uniform priors covering the ranges [0, 1] eV and [55, 90] km s −1 Mpc −1 respectively. We keep the physical baryon and cold dark matter densities, ω c and ω b , fixed to the values set in section 3. For each combination of these parameters we use CAMB to calculate the linear matter power spectra at each redshift where we have data, and we compute the corresponding values of ∆ 2 p (z) and n p (z). Therefore each likelihood evaluation results in nine calls to the emulator.
To model the redshift evolution of the IGM parameters we use the redshift evolution of each parameter in the central sim described in table 1 as a fiducial model. The evolution of the IGM for this simulation and all 30 training simulations is shown in appendix B. To vary the IGM, we use a multiplicative rescaling m(z) of this fiducial model using a power law around a pivot redshift: where we set z = 3 as it is the midpoint of our redshift outputs. X 0 controls the amplitude of the rescaling, and X 1 accommodates changes in the redshift evolution of a given parameter. The same approach is used for all IGM parameters, meaning we have a total of eight free parameters to model variations in the IGM. This model will not perfectly describe the evolution of the IGM in any simulation except the central sim and so we are not interested in constraints on these parameters per se, but instead interested in ensuring that we allow for enough variation in the IGM for a thorough marginalisation. The prior range for each parameter is ∈ [−0.4, 0.4], and we have verified that marginalised constraints on cosmological parameters are not impacted by expanding this range.
An advantage of omitting redshift information from the emulator is the increased flexibility in modelling the redshift evolution of the IGM. Since our emulator parameters are decoupled from simulation and likelihood parameters, the redshift dependence of the IGM is not fixed at the point of running the simulations as is the case in many previous setups. Whilst we use equation 5.1 to vary the IGM in this analysis, in principle any form of redshift evolution can be chosen provided it does not involve calls to the emulator that lie outside the convex hull of the training set.
Our baseline likelihood parameter vector is thereforẽ where τ i represents rescalings to the effective optical depth, τ eff , σ T i to σ T , γ i to γ, k F i to k F . The quantities that relate to physical length scales, σ T and k F are evaluated in velocity units.

Likelihood function
To more closely resemble a real analysis we perform our likelihood evaluations in velocity units. Since our emulator is constructed in comoving units, a cosmological model is required to convert between these using the relation k = qH(z)/(1+z), where k and q are wavenumbers in comoving and velocity units respectively, and denote a flux power spectrum in velocity units asP 1D (q ). We use the same q bins as the BOSS data [5], and use a cubic interpolation to rebin our mock and emulated flux power spectra to match the BOSS bins. We use a Gaussian likelihood function: whereP emu 1D (q i , z,θ) is the theoretical prediction from the emulator for a given set of parametersθ, and z is the redshift at which we are making the calls to the emulator. The covariance matrix C ij is a combination of the data and emulator covariance: In order to obtain results that resemble an analysis using current datasets, for the data covariance C data ij (z) we use the covariance matrices from the most recent measurement of theP 1D (q ) from BOSS [5]. Our simulation snapshot outputs are different to the redshift binning of the BOSS measurements, so for each z where we have mock data, we simply use the covariance matrix from the closest BOSS redshift bin, prioritising the lower z bin in the case where the bins are equidistant. The emulator covariance matrix is constructed from the theoretical uncertainty on each emulator call, which is a function of the emulator hyperparameters and the local density of the training points. Given that both these properties are common to all q bins, we assume the errors are maximally correlated and build C emu ij (z,θ) under this assumption. Since we use the same random seed in all simulations, any residual cosmic variance that remains after applying the 'paired-and-fixed' approach is the same in all simulations. We do not add noise realisations to the mock data drawn from the mock data covariance. The likelihood function is sampled using MCMC sampling implemented in emcee [70], and we check for convergence of the integrated autocorrelation time.

MCMC results
In figure 6 we show the accuracy of marginalised constraints on A s and n s for each of the 30 training simulations when running with 10 free parameters: A s and n s and the 8 IGM parameters described earlier. We fix h and Σm ν to the true values of h = 0.67 and Σm ν = 0 eV. As in section 4.1, for each constraint we leave the validation simulation out of the emulator training set. The error bars show the 1 − σ uncertainty. In all but 3 cases we recover the correct cosmology to within 1 − σ.

Extended models
In figure 7 we show constraints on A s , n s and H 0 after marginalising over the IGM parameters, using the h sim as mock data. We set a wide prior range of H 0 ∈ [55, 90]kms −1 Mpc −1 . Since the Lyα forest is observed in velocity units, our emulator predictions undergo a conversion from comoving to velocity units using the factor H(z)/(1 + z). Therefore we expect the constraints on the amplitude of the power spectrum to be degenerate with H(z). This is evident in the lower left contour of figure 7. The degeneracy is weakened by the fact that we are probing a high redshift where the differences in H(z) are smaller. For example, across the wide prior range, there is only a 6% difference in H(z = 2), and a 1.5% difference at H(z = 4). The constraints on A s and n s are accurate. This is an important demonstration, as we are correctly recovering parameters in a cosmology with a different background evolution to the one used in constructing the emulator.
In figure 8 we show marginalised constraints when using the ν sim as mock data, allowing neutrino mass to vary in the range Σm ν ∈ [0.0, 1.0] eV. The effects of massive neutrinos on cosmology are to vary both the expansion history and the growth rate, and to suppress the amplitude of matter clustering on small scales. Given the results of the previous figure, we expect the effects of massive neutrinos on the expansion and growth to have negligible contributions to the constraints. We use an extreme value of Σm ν = 0.3 eV that is significantly higher than the current bound Σm ν < 0.12 eV (95% credibility) from joint Planck + BAO data [43]. The neutrino mass is strongly degenerate with A s when using the Lya forest alone [42,58]; stronger bounds arise from a combination with CMB data by breaking the degeneracy (e.g [31,32]).

Discussion
We have presented a Gaussian process emulator that predicts the P 1D of the Lyα forest as a function of the amplitude and slope of the small-scale linear matter power spectrum and the state of the intergalactic medium. This work is an application of modern simulation interpolation techniques to a modelling approach that was more prevalent in early analysis of the Lyα forest, based on the fact that the cosmological information in the Lyα forest is concentrated in the linear matter power spectrum at the observed redshift. This is due to a unique property of the Lyα forest, in that it probes the universe at a period very close to an Einstein de-Sitter model. In this regime, variations in the P 1D due to differences in the growth rate and expansion history (allowed by current data) are negligible. For the first time we test the accuracy of this approximation at the level required by current datasets, and show that it holds. In particular, we showed that even in models with massive neutrinos, the cosmological information in P 1D analysis is contained in the slope and amplitude of the linear power spectrum within current observational uncertainty.
In addition to being lower dimensional in the cosmological sector, our emulator affords other practical advantages. By emulating the flux power spectrum independently of redshift, predictions can be made to sub-percent accuracy at redshifts not necessarily included in the training simulations. This affords flexibility with regards to the analysis of future data. Moreover, the redshift evolution of the IGM can be modelled at the point of constructing a likelihood function, as opposed to being determined at the point of constructing the emulator as in many previous frameworks. Whilst we demonstrated that this emulator can be applied to massive neutrino cosmologies without massive neutrinos used in the training set, the same principle can be applied to a wider range of extended cosmological models, e.g. running of the spectral index in the primordial power spectrum or curved universes [43]. There are two conditions under which our implementation would not be valid. Models with sharp features in the small-scale matter power spectrum (such as warm dark matter models [17,71] or ultra-light axion dark matter [46,47]), or models with a non-standard relationship between the baryon density and velocities (such as modified gravity models [72]). However, simple extensions to the emulator parameter space could incorporate some of these models too (e.g. [20,38]), and the motivations for parameterising the P 1D in terms of the linear power spectrum still apply.
The joint analysis of CMB observations with measurements of the small-scale linear matter power spectrum from the Lyα forest has long been a powerful combination in constraining extensions to ΛCDM [23][24][25][26][27][28][73][74][75]. We presented an emulator that will facilitate the combination of current (BOSS/eBOSS) and upcoming DESI measurements of the P 1D with CMB observations, to provide competitive constraints on beyond-ΛCDM cosmologies.

Acknowledgments
The authors thank Jose Oñorbe for sharing his code for the reionisation model of ref. [ A Parameter dependence of the P 1D In figure 9 we use the GP presented in section 4 to show the fractional change of the P 1D in response to changes of each of the parameters described in section 2. We show the fractional change around a central point in parameter space set by the midpoint of all training points at z = 3, i.e. the points shown in red in figure 2. We denote this point as P 0 1D (k ). We show the change in the P 1D with respect to P 0 1D (k ) when moving from edge to edge of the convex hull of the full training set, to ensure that we are not extrapolating into regions of parameter space where the GP cannot reliably predict the P 1D . The blue dotted lines show the highest k probed by BOSS [5]. The effects of IGM parameters are most dominant at higher wavenumbers than those probed by BOSS data, with a degeneracy between F and γ when looking only at the lower k modes.

B IGM evolution
We plot the redshift evolution of each IGM parameter for all 30 training simulations in gray lines in figure 10, with the central simulation shown in red. The values in red are used to provide a fiducial model for the marginalisation over the IGM described in section 5. The  . Fractional effect on the P 1D of changing each of the six emulator parameters. We chose as a reference model, P 0 1D (k ) the midpoint of the z = 3 training points, and show the effect on the P 1D when moving from edge to edge of the convex hull of the training set. Lowest values for each parameter are coloured blue, transitioning to the highest values in red. The blue dotted line represents the highest k bin in BOSS data [5]. mean flux, F is found by averaging over the flux transmission fraction in every line-ofsight pixel of a given snapshot. To calculate the thermal parameters, σ T and γ, the particle snapshot data is binned in the log 10 T − log 10 ∆ b plane, where T is in units of K. We then perform a linear fit to the mode within the range −1.5 < log 10 ∆ b < 0.5 to obtain values for T 0 and γ from equation 2.3. T 0 is then converted into a value for σ T using equation  2.4. In order to calculate the pressure smoothing scale, k F , we follow the approach in ref. [54]. We calculate the real-space Lyman-α forest 3D flux power spectrum, ∆ 2 F (k), and fit an exponential cutoff of the following form: where A and n are free parameters.

C Full posterior with IGM parameters
In section 5 we presented MCMC posteriors for several analyses on mock data, after marginalising over the 8 IGM parameters. In figure 11 we show the full posterior for one of the analyses, including cosmological and IGM parameters.  Figure 11. Marginalised 1D and 2D posterior distributions for all free parameters when using the h sim as mock data (the same as in figure 7).