Bayesian inference methodology for primordial power spectrum reconstructions from Large Scale Structure

We use Bayesian inference to develop a non-parametric method to reconstruct the primordial power spectrum Pℛ (k) from Large Scale Structure (LSS) data. The performance of the method is assessed by testing it against simulations of the clustering of high-z (QSOs) objects. Their clustering is derived from different templates of the primordial power spectrum motivated by models of inflation: the Standard Model power law characterized by the two parameters As and ns ; a local feature template; and a global oscillatory template. The primordial power spectrum is reconstructed using N knots in the log {k, Pℛ (k)} plane while sampling the cosmological parameters {H 0, Ω b , Ω c }. We use two statistical tests to examine the reconstructions for signs of primordial features: a global test comparing the evidences and a novel local test quantifying the power of the hypothesis test between the power law model and the marginalized probability over N model. We also discuss results of an application to low-z (ELGs) objects with two different photometric errors keeping the cosmology fixed. The method shows good performance in all scenarios considered. In particular, the tests show no feature detection for the standard power-law primordial power spectrum; yet, the method is able to detect power spectrum deviations at a percent level for all considered features, combining either the low-z or the high-z redshift bins. In addition, we include a test proof-of-concept application to real data from the Sloan Digital Sky Survey Luminous Red Galaxy Data Release 4 (SDSS LRG 04), finding no preference for deviations from the primordial power law. The method is flexible, model independent, and suitable for its application to existing and future LSS surveys.

Most cosmological observations support the hypothesis that the primordial fluctuations were adiabatic, Gaussian and quasi-scale invariant, and that the background universe was spatially isotropic and homogeneous [1].These properties, together with several shortcomings of the standard Hot Big Bang scenario [2,3], provide strong motivation for cosmological inflation [2][3][4][5][6][7], a hypothetical epoch of exponential expansion in the early universe.However, the nature and origin of the fields that drove inflation remain largely unknown and poorly constrained by current observations.The primordial correlation functions encode very valuable information about the physical mechanism that generated the initial conditions for cosmic structure formation.Some well-motivated theoretical scenarios can produce distinctive features in those functions, such as the primordial scalar power spectrum of curvature perturbations1 P R (k).P R (k) is a key quantity to probe the physics of the very early universe, allowing us to test and constrain different inflationary models.P R (k) is usually parametrized by a simple power law with two parameters: the amplitude A s and the spectral index n s of the primordial comoving curvature perturbations.
The predictions of the simplest single field slow-roll models of inflation, of nearly-Gaussian and quasi-scale invariant power spectrum for scalar and tensor perturbations [8,9], are consistent with the latest results of the ESA Planck satellite that, in particular, provides a value for the scalar power spectrum parameters of A s = 2.10 +0.03 −0.04 × 10 −9 and n s = 0.965 ± 0.004 [1].Models as the Higgs inflation [10] or the Starobinsky R 2 inflation [11] are favoured by the latest Planck data [12] from a plethora of slow-roll inflationary models [13].
Departures from the slow-roll scenario can have significant cosmological implications, such as the production of primordial black holes and the enhancement of the inflationary gravitational wave spectrum at small scales [14].Therefore, it is important to look for observational signatures of slow-roll deviations [15][16][17] or modified scalar field dynamics during inflation [18].There are various mechanisms acting in different inflationary models that can produce deviations from the power law form of P R (k).For instance: logarithmic oscillations arising from non-Bunch-Davies initial conditions [19][20][21] or from axion monodromy [22]; linear oscillations predicted by boundary effective field theory models [23,24]; localized oscillatory features induced by a step in the inflaton potential [25] or in the sound speed [26,27].Other models exhibit cutoffs at large scales [28,29] or more general modulations [30,31].One way of exploring the presence of features is by looking to the Cosmic Microwave Background (CMB) anisotropies.Some anomalies in the CMB, that were originally detected in the NASA WMAP data [32], were confirmed later by Planck [33]: power suppression at large scales [34], low variance [35,36], hemispherical power asymmetry [37,38], preference for odd parity [39], tension in the lensing parameter A l [40], and the "cold spot" [41,42] are some of the more remarkable ones.The statistical significance of each of these anomalies is inconclusive, but it is worth studying them due to the relevance that their existence would have to uncover new physics beyond the ΛCDM model.We will refer to this model, including the primordial power law power spectrum, as the Standard Model (SM) of cosmology.
Large Scale Structure (LSS) stage IV galaxy surveys [43][44][45] are expected to reveal more details than CMB experiments at intermediate scales, ranging from wavenumbers k of about 0.01 h Mpc −1 to 1 h Mpc −1 .The high signal to noise S/N of their galaxy power spectra allows us to probe the potential existence of features in the primordial seeds of the LSS.Developing accurate methods to perform this analysis is crucial for extracting the maximum amount of information from these forthcoming surveys.
The shape of the primordial power spectrum can be determined following two different approaches: parametrization and reconstruction.Parametrization relies on selecting a specific form or model of P R (k) and constraining its parameters using CMB and/or LSS data.Various parametrizations of P R (k) have been applied in the literature such as [28,[46][47][48][49][50][51][52][53][54][55][56][57][58][59] and those used in the inflation analysis of the Planck data [60][61][62].Also parametrizations of the two point correlation function can be found [63].Reconstruction, on the other hand, does not assume any model or template for P R (k), but rather infers its shape from the data.Several methods have been developed for model independent reconstruction of P R (k) based on different inference methods, such as Bayesian inference, linear interpolation methods [46] [64], combination of top hat functions [65], wavelet expansion of P R (k) [66,67], smoothing splines [68][69][70], fixed wavenumber knots joined with cubic splines [60,62], the critical filter method [71], different spline techniques [72][73][74] or placement of N free knots in the {k, P R } plane [60,62,[72][73][74]; penalized likelihood, using function space generalization of the Fisher matrix formalism [75] or a P R (k) ansatz [60][61][62]; Principal Component Analysis, using expansion of a orthonormal set of basis functions [76]; and sparsity of the primordial power spectrum, using a sparsity-based linear inversion method [77].These works have been applied to CMB and/or LSS data to reconstruct P R (k) and test for deviations from the standard power law form.
Most non-parametric methods of P R (k) reconstruction do not find any statistically significant deviations from the primordial power law [46, 64, 65, 68-70, 74, 77].The precision of these methods vary from subpercent to 30%, depending on the data and the technique used.The Planck collaboration applied three non-parametric methods to test the power law hypothesis and found no statistically significant evidence for deviations with a precision approaching 1%.The most notable deviation, although not statistically significant, was a deficit in power at k ≈ 0.001 h Mpc −1 (ℓ ≈ 30) [60][61][62].Some methods have also focused on detecting features at scales between k ≈ 0.01 h Mpc −1 and k ≈ 0.2 h Mpc −1 [66,67,71,74,76,77].In [73], models that can account for a lack of power at k ≈ 0.001 h Mpc −1 and in k > 0.1 h Mpc −1 are slightly favoured against the power law parametrization in a Bayesian sense.However, all methods are consistent with a featureless tilted power law P R (k) that agrees with Planck.Earlier studies with WMAP data have reported some features, such as a modulation around k ≈ 0.009 Mpc −1 [54] and fine structure at k ≈ 0.002 Mpc −1 and k ≈ 0.009 Mpc −1 [78].Moreover, the power law parametrization is claimed to be disfavoured against the Lasenby & Doran model [79], which produces a lack of power at scales k ∼ 10 −4 Mpc −1 [72].
Reconstruction techniques can identify coarse characteristics in P R (k), but they are limited in detecting higher frequency features that are predicted by various physical mechanisms [80].The parametric approach can provide the necessary resolution to detect such features, but it depends heavily on the chosen model or template of P R (k), which makes reconstructions more suitable for obtaining model independent information.In our non-parametric methodology, free placement of knots in the log {k, P R } plane does not require any prior k-binning of P R (k).As a result, it is sensitive to both global and local features in P R (k).Model complexity is penalized by comparing evidences Z, which facilitates the comparison between different N configurations when reconstructing P R (k).This advantage is harder to obtain in other methods, such as those based on basis functions, top hats, or wavelet expansion.Moreover, our methodology is flexible and adaptable to additional data from diverse surveys, and it can increase the number of sampled knots if needed.However, one limitation of our methodology is the lack of smoothness introduced by the linear splines connecting the knots.Other approaches using smoothing splines or cubic splines offer greater sensitivity to curvature features in P R (k).Additionally, our method may miss very sharp features, which is a common drawback of non-parametric approaches compared to parametric ones.Nevertheless, the discrete width at which the sampler evaluates the knots in the log {k, P R } plane can be reduced, at the expense of a higher computational effort.
Assuming a specific deviation from the standard inflationary model may be challenging due to the vast number of proposed models.Our objective is to detect features in P R (k) from LSS galaxy clustering data, both simulated and real, with a non-parametric Bayesian method that does not impose any assumptions regarding any particular inflationary model.By reconstructing P R (k) and quantifying statistical deviations from a power law, we aim to provide information about the very early universe.
The structure of the paper is as follows: section 2 describes the methodology used to perform the P R (k) reconstructions with nested sampling, along with the details of the tests used for the subsequent analysis.Section 3 describes the galaxy power spectrum simulations used to test our method, including the survey specifications and modelling.Section 4 presents the results obtained from applying the methodology to simulated spectra, considering four different cases for the primordial power spectrum: the power law spectrum of the Standard Model (SM), a bump and an oscillatory feature from the same local template, and a global log-log oscillatory feature template.The smallest power deviations that can be detected are also given for each feature template.Appendix F focuses on the application of the methodology to real observational data from the SDSS LRG 04 catalogue, discussing the obtained results about possible deviations from the Standard Model.Section 5 presents the conclusions derived from this work and some possible lines of future work.Finally, appendices A to F include the appendices that cover the solution of the label switching problem, the PolyChord sample selection criterion, and a description of the feature model used for the bump/oscillatory local template.

Methodology
We use a model independent and non-parametric approach to reconstruct the primordial power spectrum P R by sampling N knots freely in the log {k, P R } plane using nested MCMC sampling.This approach follows [73] and some of the inflation analyses of the Planck collaboration [60,62].In the first subsection we explain our procedure, priors and likelihood, and the method to represent the P R (k) reconstructions.In the second subsection we present two tests for detecting primordial features in the reconstructed primordial power spectra.

P R (k) reconstructions
We use a Bayesian framework in this work.Given a cosmological model M characterized by a parameter set Θ and a set of cosmological data D, the Bayes Theorem allows us to update a prior P (Θ|M ) to a posterior P (Θ|D, M ) using the likelihood L = P (D|Θ, M ) and an evidence Z, which can be computed as: To sample the posteriors and evidences for our reconstructions we use Cobaya [81] [82]: a framework for sampling and statistical modelling.In this framework we use the nested sampler PolyChord [83,84], as the Metropolis-Hastings sampler in Cobaya is insufficient due to the computational complexity of our posteriors and evidences.The PolyChord algorithm improves the sampling efficiency for our evidence computation in eq.(2.1) and enables the acquisition of posterior samples, which present multi-modal behaviour.PolyChord has been used for many cosmological purposes, from which we highlight the reconstructions of the deceleration parameter q(z) [85] and the primordial power spectrum in a non-parametric Bayesian approach [73], a methodology followed in this work.The specific priors used are described in table 1.The evidences obtained from eq. (2.1) will be normalized to have its maximum value equal to 1.In the present work the P R (k) reconstructions are performed in a model independent approach.The underlying assumption is that the primordial power spectrum can be represented as a series of N knots, connected by linear splines in logarithmic scale.The knots are pairs of coordinates {log(k i ), log(P i R )} that are sampled with PolyChord.We consider different number of knots configurations with N ∈ [2,9], that we can marginalize in order to achieve a non-parametric method.The number of knot parameters are 2N − 2, and together with the three cosmological parameters {H 0 , Ω b , Ω c } makes a total of 2N + 1 parameters to be sampled.The sampler PolyChord was used with 25 live points for each sampled parameter, for a total of 50N + 25.We employed ≈ 5 × 10 4 CPU hours, using 128 CPU cores for each reconstruction.Figure 1 illustrates the methodology for a case of N = 5.The knot position is determined by maximizing the likelihood of the knot-constructed spectrum given the galaxy power spectrum data from a specific LSS survey.To achieve this, a model to relate the knot-constructed primordial power spectrum into a galaxy power spectrum is required.The likelihood is given below in this section, and the model is described in the next section.

Sampled
We assume that the P (0) R (k) is given in the scale range k ∈ [0.02, 0.45] h Mpc −1 , divided in 125 elements with uniform logarithmic steps Log(∆k) = 0.02.For low-z objects, that are also discussed at the results, this range is reduced to k = 0.2 h Mpc −1 , with the same Figure 1: Illustration of the methodology for a 5 knots case (N = 5).We fix both k 1 = 0.02 h Mpc −1 and k N ≡ k 5 = 0.45 h Mpc −1 , which correspond to the largest and smallest scales of the survey.The sampled parameters are the log(x) (see next paragraph) for each {k 2 ,k 3 ,k 4 } and the logarithms of {P 1 R ,P2 R ,P 3 R ,P 4 R ,P 5 R }, denoted as log(y).We impose a flat prior with the upper and lower limits for all the {log(x i ), log(y i )} coordinates according to the dashed lines: scales within the survey limits and P R from 1.0 × 10 −10 to 5.6 × 10 −9 .step. 2 We adopt flat priors for both P R in a wide range P R ∈ [0.10, 5.60] × 10 −9 and x ∈ [0, 1].For the cosmological parameters {H 0 , Ω b , Ω c } we use Planck DR 3 uncorrelated Gaussian posteriors.This decision is motivated to reduce possible degeneracies between the knots and the cosmological parameters when exploring the parameter space and, therefore, decrease the computational cost.In the future, primordial power spectrum reconstructions will benefit, most likely, from a wide range of cosmological observations, including the full Planck likelihood.
We reconstruct P R (k) by maximizing a likelihood function that quantifies the agreement between the theoretical model of the monopole galaxy power spectrum P (0) g (model) and the one derived from the data catalog P (0) g (data).We evaluate the spectra for different bins of redshift z and for each value of the survey scale grid.The likelihood L used in this analysis follows a multi-variable Gaussian form: where g,ij (data)]; i is the index for the different redshift bins, with Z the total number of redshift bins for the survey; j is the index accounting for the scales, with K the total number of scales for that survey; and σ 2 ij is the variance calculated from P (0) g,ij (model), as explained in the next section.In general a covariance matrix is needed.Since we consider uncorrelated scales and redshift bins, it is diagonal and we can reduce it to its variance elements.The reconstruction of P R (k) are obtained from the values of P (0) g,ij (model) that maximize L.
We summarize our sample selection criteria in appendix B. After the sampling is done and a first set of chains is chosen, we get a large number of the sampled primordial power spectra for each number of knots configuration, each one denoted as P R,j,N (k).Each primordial spectrum has associated its own normalized importance weight ω j,N .
We can merge all the primordial spectra coming from any N configuration by reconstructing a marginalized probability over N of P R (k).Since PolyChord calculates the individual evidences Z N , the marginalization is done by combining all spectra from all N with a new set of evidence-dependent weights q j,N : Each weight q j,N can be interpreted as the joint probability of both the chain j and the N knots configuration.This joint probability is the product of two factors: the conditional probability of the chain j given N , ω j,N , and the probability of the N configuration given by the normalized evidence.Their corresponding constructions of the primordial power spectrum P R,j,N (k) have an associated probability q j,N , from which we determine the confidence intervals of P R (k) according to the q j,N distribution.We test our method on simulated data with different P R (k), based on the power spectrum templates given in eq.(C.1) and eq.(4.1) and described in [19][20][21]60].These templates cover a broad range of theoretical motivated models.We also apply our methodology to real observational data obtained from the Sloan Sky Digital Survey Luminous Red Galaxy data release 4 (SDSS LRG 04) in appendix F.
We focus on the small scale regime where the LSS surveys can achieve higher precision than the CMB experiments.For simplicity we consider uncorrelated bins of k.An estimate of these correlations was obtained by performing simulations based on the mock generation code lognormal galaxies [86] for the redshift bins and volumes considered in this paper, finding an upper limit for the non-diagonal terms of ≈ 30% at the smallest scales.
Two statistical tests are used to analyze the reconstructed spectra and assess possible deviations from the power law assumption, as explained below.

Feature detection from P R (k) reconstructions
To detect features in the reconstructed P R (k), we perform two tests for each reconstruction.The first test is a global one based on the comparison of the evidence ratios and the second test is a novel local one based on a hypothesis test.Both tests are applied to the comparison of the N = 2 and the marginalized probability over N reconstructions.

Global test
Evidence comparison for model selection and comparison with the SM is widely used [87].In this work we use the evidence ratio Z max /Z 2 , where Z 2 is the evidence of the primordial power law reconstruction and Z max is the highest value among the evidences Z N except Z 2 .The Z max /Z 2 ratio measures how a power law primordial spectrum compares with a different knot number configuration spectrum with the highest evidence.We have checked that the sampling errors in the estimates of the evidences are less than 1% for values above 1% of the maximum evidence.In this way, we are confident in the application of the Jeffreys criterion [88] for quantifying the feature detection status.
This global test provides an effective way of measuring how different N configurations compare when reconstructing P R (k).The detection of a feature can be claimed, but with only this analysis we could not identify at which scales the deviations are more significant.To locate the deviations at the scales they occur, we use a local test, explained below.

Local test
We perform another complementary analysis of the P R (k) reconstructions.This analysis is a novel approach for detecting features in P R (k), providing insights about possible features present in P R (k) in a localized way, since it enables us to identify the scales at which the deviations occur.
The local test is a hypothesis test applied to the distributions of the N = 2 and the marginalized probability over N reconstructions for each value of the survey grid scale.The details of the hypothesis test can be found in [89].The significance level considered for all the tests is α = 0.05.The power of the test 1 − β quantifies the separation between both distributions providing a measure of the significance of the deviation from the power law.
We establish the feature detection status using the following thresholds for the power of the test: 1 − β > 0.95 at any k indicates that both distributions are separated enough to consider the feature detection at that scale; 0.95 > 1 − β > 0.5 shows a hint of a feature; and 0.5 > 1 − β shows no evidence of the presence of a feature.
The global test is a statistically more robust test for feature detection, since it compares evidences computed for all the scales and it is less sensitive to fluctuations, but the local test enables us to locate features at certain scales k, providing an indicator for the existence and location of features in P R .We apply both test to all our reconstructions, exploiting the complementary information that both methods offer.

Galaxy power spectrum simulations: survey specifications and modelling
In order to test the methodology we apply it to simulations.In this work we simulate galaxy power spectra using galaxy clustering modeling.Recent galaxy clustering models account for various effects, such as Baryon Acoustic Oscillations (BAO) modelling [90,91], non-linear corrections [57,92,93], the Fingers of God [94] and Alcock-Paczynski [95,96] effects, or complex shapes of the redshift-space power spectrum [97][98][99][100].A simple BAO modelling and non-linear corrections will be considered for the SDSS LRG real data application in appendix F. All the simulations described in this section have been done assuming a linear regime using a Kaiser redshift-space power spectrum, encompassing both the Fingers of God and the Alcock-Paczynski effects.Although this model resembles those used in the forecasts of upcoming surveys [91], we focus on testing our methodology, without aiming for a state-ofthe-art modelling of the simulated data.A comparison of the deviations w.r.t.non-lineal modelling [91,92] at scales k ∼ 0.1 h Mpc −1 shows an underestimation of power of ≈ 30%.However, this underestimation in the modelling at the smallest scales is not biasing our results since it is present coherently in both the model and the simulations.Therefore, we consider that for our analysis, scales up to k = 0.4 h Mpc −1 are adequate enough for our purposes.When dealing with low-z objects we reduce the smallest scales considered to 0.2 h Mpc −1 , since non-linear effects happen at larger scales.We explain how our simulations are done in the rest of this section.

Simulated surveys specifications
We differentiate between low and high redshift objects: Luminous Red Galaxies (LRGs) and Emission Line Galaxies (ELGs) are examples of the former, while Quasi-Stellar Objects (QSOs) of the later.Typical redshifts covered by current planned surveys for galaxies are z ≲ 1 whereas for QSOs z ∼ 2. In this work we show the results for high-z objects, for which we choose a redshift binning with a step ∆z = 0.2 and evaluate the power spectra at the central values of the bins.
We estimate mean values of the photometric redshift error δ z from J-PAS representative data [101][102][103] by weighting the δ z associated to each redshift bin with the densities derived below (see table 2).For high redshift objects a single value δ z = 0.0036 is obtained considering all the QSOs, and for low-z objects, a low photometric error of δ z = 0.0038 is obtained using the 30% best determined redshifts, and a high photometric error of δ z = 0.039 is derived when the 50% best ones are considered.
The observed sky fraction assumed for the simulations is f sky = 0.2575, equivalent to an area of 10620 deg

Bias model
Galaxy biasing is a complex process that needs to be modeled to extract cosmological information from current and future LSS data.Different tracers of the LSS exhibit different biases, which can be derived from bottom-up or top-down approaches, with local or non-local bias, and considering non-linearities.In this work we focus on spectra with scales larger than 0.45 h Mpc −1 .For these scales we assume scale independent bias models and neglect the effects of scale-dependent deviations in the bias.
The QSO bias model is obtained by matching the integrated correlation function for QSOs with the expected one for the WMAP/2dF cosmology [104].The model is parametrized as [105]: For the low-z galaxies we adopt the relation in [106]: b(z) = b 0 D(z) with an ELG-like biasing, with b 0 = 0.84 following [107] and consistent with [43].
Galaxy clustering observables require non-linear and non-local bias models for a broad range of scales, as linear and scale independent bias models are inadequate [108].For instance, the power spectrum multipoles and the bispectrum of BOSS CMASS DR11 galaxies were analyzed using such models [109][110][111], and they are also necessary for more recent galaxy clustering data.However, the main goal of this work is to test our reconstructions methodologically rather than to model galaxy clustering with the latest techniques.

Number density of galaxies
To estimate the uncertainties in the monopole galaxy power spectrum, we need the densities of the survey objects, which depend on both the redshift and the photometric error, i.e. n(z, σ z ), where σ z = δ z (1 + z).We use realistic values of densities n for both low-z and high-z power spectrum simulations, based on the expectations of the J-PAS collaboration

Low-δ z
High-δ z z-bin n/10 −3 h 3 Mpc −3 n/10 [43].To estimate the densities we follow [101] for low redshift (ELGs) and [102,103] for high redshift (QSOs).The values of the densities that we use in our power spectrum model are obtained by fitting the density data with the following function for n(z): By evaluating that function at the central values of the z-bins for each object type, we obtain the densities required.The parameters {C 0 , z 0 , α} are considered as free parameters, with β = 3/2 [107].Table 2 shows the values of the densities used for each z-bin.

Construction of the monopole galaxy power spectrum
Our observable of interest is the monopole galaxy power spectrum P (0) g (k, z), from which we reconstruct P R (k) using the likelihood of eq.(2.2).We estimate the mean value Pg (0) (k, z) following the model in [107], which is based on [112] and inspired by the Kaiser model [97].This model accounts for redshift space distortions at a basic level, with a Gaussian uncertainty in its radial position.Pg (0) (k, z) is obtained through the matter power spectrum, which is computed with CAMB [113].We describe the construction of the galaxy power spectrum below.
We start by choosing a primordial power spectrum P R (k).We will consider either the or various templates with deviations from the power law, all based on physically motivated models, such as the ones in eq.(C.1), that we explain in appendix C, or in eq.(4.1).
The linear matter power spectrum P m (k) reads as: A simple way to estimate a galaxy power spectrum Pg (k) is to assume that it is proportional to the matter power spectrum.The proportionality is taking into account through the bias b: This simple method is not accurate enough to describe a galaxy power spectrum [109], not even in the linear regime.We use a more complex description of the galaxy power spectrum, incorporating the Kaiser model for the redshift space distortions and both the Fingers of God and the Alcock-Paczynski effects: with µ being the cosine of the angle between the wavevector ⃗ k and the line of sight, b(z) the bias described in the previous subsection and f (z) the growth function.The subscript 'fid' indicates that a variable is evaluated at a fixed fiducial cosmology.The model has an exponential term e −k 2 µ 2 σ 2 z (z) that accounts for a reduction of power due to uncertainties induced by the photometric error of the measurements through the parameter σ z = δz(1+z) H(z) , with .The function E(z) is defined as: the angular distance D A (z) can be constructed from the comoving radial distance χ(z): and the growth function f (z) is estimated as: being γ = 0.545 the growth index.The Fingers of God effect is taken into account with the function F FoG (k, µ, z), which is modelled as a Lorentzian following [114]: where the dispersion parameter σ p,fid (z) is obtained integrating the linear matter power spectrum as: In practice, we perform this integral with finite boundaries k min = 10 −5 h Mpc −1 and k max = 10 3 h Mpc −1 .The Alcock-Paczynski effect [115] distorts the scales k and µ as: ) with: A factor A E fid (z) must be included in the galaxy power spectrum for properly modelling the Alcock-Paczynski effect.
The ℓ-multipoles of Pg (k, µ, z) can be computed as: with L ℓ (µ) the Legendre polynomial of degree ℓ.The monopole (ℓ = 0) is: Measurements of higher order multipoles, as the quadrupole Pg (2) (k) in [92], are usually significantly noisier than those of the monopole.While these measurements can potentially provide valuable cosmological information, we believe its impact on the determination of the P R (k) shape will be small.Therefore our focus remains primarily on the monopole.Eq. (3.5) does not capture the redshift space power spectrum accurately, as it neglects the non-linear effects [116,117] and the smoothed out BAOs [118].More advanced models take into account both the non-linearities, such as the ones in [97][98][99], and the BAO smoothing, as in [91].We will not consider these two effects here, but would be required for state of the art surveys.
The expected value of the monopole galaxy power spectrum at wavenumber k and redshift z is given by eq.(3.15).To estimate its uncertainties, its variance matrix has to be computed.We follow the method in [119], which requires an additional term: the Fourier number of modes assigned to the k-shell N k (z).For this, we define the comoving radial distance as: Then we estimate the volume in a redshift bin of central value z and upper and lower limits z + and z − : The number of Fourier modes N k inside a k-shell between k + and k − for the redshift bin z are: where k + and k − represent the upper and lower limits of the bin associated to k.The number of Fourier modes contributes to the sampling variance effect.The shot noise is included by the inverse of the density n.Then, the variance accounting for both effects can be written as: The standard deviations are considered as the uncertainties in our observable monopole galaxy power spectrum, i.e. δ Pg (0) (k, z) = σ(k, z).
The sample volume at the largest scales and the dominance of non-linear effects at the smallest ones determine the scales at which we apply our methodology.A scale range of k ∈ [0.02 − 0.2] h Mpc −1 for low-z objects and k ∈ [0.02 − 0.45] h Mpc −1 for high-z ones is used, as non-linear effects appear at smaller scales for high redshifts.The noise properties also vary with scale k: the sampling variance is higher at large scales and the shot noise tends to dominate at small ones.The variation of the signal to noise ratio with scale k can be seen in figure 2, presented in the next subsection for different redshifts.
We use this model of the monopole galaxy power spectrum to generate simulations and test our methodology for the reconstruction of the primordial one in section 4.

Signal to noise analysis
In the previous section we have described the observed power spectrum model, the physical properties of the objects and the characteristics of the survey.We now study the expected signal to noise ratio S/N for each clustering scenario.The sensitivity of our methodology to power deviations will depend on the S/N of P (0) g (k i , z j ) for the different objects, given by S/N (k, z) = In figure 2 the S/N as a function of the scale k is represented for the SM.This figure reflects the dominance of the sampling variance for most of the redshift bins of the low-z objects (top panels) and the shot noise dominance for the high-z objects (bottom panel).For the high-z objects (plotted in the bottom panel of figure 2) the shot noise dominates at all scales due to their lower densities compared to the low-z ones.As a consequence, those bins with the highest densities will have the smallest noise and the highest S/N.The volume differences between redshift bins are much smaller than in the case of the low-z objects, implying similar sampling variance among them.In the low-z scenario the shot noise term 1 n in eq.(3.19) is subdominant for the majority of redshift bins, except for the highest ones that have the lowest densities (listed in table 2) at small scales.The number of modes N k (z) for a certain k leads to a very different sampling variance between redshift bins, explaining the low S/N at the smallest bins.In the case of low-z with high-δ z (top right panel), the shot noise has a slightly higher effect than for low δ z due to the reduction of P (0) g (implying a reduction in both the signal and the sampling variance), caused by the higher photometric error (see eq. (3.5)).
In section 4 we apply the methodology to high-z simulations where we consider the favorable bin z = 2.3 having the highest S/N of all the bins up to k = 0.15 h Mpc −1 , and for smaller scales is slightly lower than for the z = 2.5 bin where the S/N is maximum.In this way we can obtain the minimum amplitude of features that the method is able to detect at low or high redshift.

Application to simulated spectra
In this section we apply the methodology to simulations of different observational scenarios, characterized by the monopole galaxy power spectrum model Pg (0) (k i , z j ) at a given scale k i and redshift bin z j , and the survey and object specifications described in section 3. We simulate the "observed" monopole power spectrum at a scale k i and redshift bin z j , P g (k i , z j ), by drawing random samples from a Gaussian distribution with mean Pg (0) (k i , z j ) and standard deviation δ Pg (0) (k i , z j ) = σ(k i , z j ) obtained from the variance matrix given by eq.(3.19).We consider a high-z QSO survey with a relatively low δ z (see section 3.1).Also we discuss the obtained results for a low-z galaxy survey with two photometric redshift errors, a low and a high δ z with a fixed cosmology.First we test the methodology with SM simulations, i.e. assuming a power law for the primordial power spectrum P R (k).Then we apply the methodology to simulations with primordial features.We consider two features templates, a local and a global one.From the local template given in eq.(C.1), we first generate a feature that consists of a 20% bump in power at k ≈ 0.4 h Mpc −1 (see figure 6).Using the same template, we also generate a second local feature having an oscillatory behaviour with excess and deficit of power of approximately 10% at the maximum and minimum of the oscillation (see figure 10).We also apply the methodology to a different feature template, namely a global log-log oscillatory feature with oscillations at all scales (see eq. (4.1)).We consider three amplitudes for the global template, corresponding to approx.10%, 3% and 1.5% power deviations from the featureless primordial power law.We have checked that different realizations of the same observational scenario provide very similar reconstructions leading to the same feature detection results derived from the global and local tests.

Standard Model
The Standard Model simulations are performed considering P R (k) as a power law . We consider simulations of high-z objects P (0) g (k) at the z = 2.3 bin, a favorable case since it has the minimum shot noise and the highest S/N values for almost all the scale range (see figure 2).The P  As expected, the evidences of the reconstructed spectrum peak at the 2 knots case, Z 2 , and exhibit a quasi-exponential decay with increasing N , as shown in figure 4. The recovery of the power law is clear, with negative feature detection as summarized in the second column of table 3.
The value of 1 − β, is very small (< 0.05) at all scales, showing a very good consistency between the N = 2 and the marginalized reconstruction.The contours displayed in the top panel of figure 5 are narrower at the largest scales and wider at the smallest ones k ≈ 0.5 h Mpc −1 due to the dominance of the shot noise.The hypothesis test do not show any local deviations between the marginalized and the N = 2 case, showed at the bottom panel of figure 5.
The results obtained for the case of the SM show the preference for the power law, with a good recovery of the input values for the parameters {A s , n s } with the global test and no deviation detected with the local one, validating the performance of the proposed methodology.

Local bump feature
We study a local feature in P R (k) following the template in eq.(C.1).We tune the parameters of this template to have a power excess relative to the featureless model of 5% at k ≈ 0.2 hMpc −1 and of 20% at k ≈ 0.4 h Mpc −1 .We call this feature template 'bump'.The primordial spectrum for this bump feature, and the parameters used to generate it, are shown in figure 6.
We use the same redshift bin than for the SM.The corresponding simulated power spectrum shows the slight increase of power at the smallest scales of the template, as plotted in figure 7. 2.0 × 10 -9 2.5 × 10 -9 3.0 × 10 -9 Figure 5: Contours of the high-z P R (k) reconstructions for the SM simulations in the N = 2 case (magenta) and for the case of marginalized probability over N knots (blue).We include in dashed black the power law that sourced the simulations.In the bottom panel the values of the power 1 − β obtained from the hypothesis test are plotted.
The global test for the bump feature is based on the ratio of evidences that are shown in figure 8. Z 3 is three times as high as in the case of the SM and Z 4 forty times higher.6: Local bump feature model of the primordial power spectrum and the power law primordial power spectrum (blue).The parameters used to generate the bump feature are given in the table inside the figure.Note the increase of power at the scale k = 0.45 h Mpc −1 where it reaches 20%.At large scales, the feature model tends to the SM power law.

Param. Value
Nevertheless, the maximum evidence is still obtained for N = 2 and the ratio Z 3 /Z 2 is below 0.3.
The P R (k) reconstruction is represented in the top panel of figure 9, whereas the hypothesis test is represented in the panel below.Although there is a minor increase at the extreme scales of the explored range, the power of the test remains below the threshold of 1 − β = 0.5 for all scales.
To summarize, for the bump feature template, even for the redshift bin with best S/N , we can not detect the feature, as shown in the third column of table 3. The application of our methodology to the bump feature provides the same qualitative results as for the SM case.

Local oscillatory feature
We examine how our methodology performs with another local feature in the primordial power spectrum P R (k).The template for generating this feature is the same as for the bump one (see eq. (C.1)).We now choose values of the different parameters to generate a power excess and a deficit of amplitude ≈ 10%.We call this feature "oscillatory", as it completes an oscillation in the range of scales considered.This feature is still local since at large scales the model tends to the power law.The primordial spectrum for the local oscillatory feature, and the parameters used to generate it, are shown in figure 10.
In figure 11 the P (0) g (k) simulation for the local oscillatory feature template is shown.We keep the same favorable bin z = 2.3 (see figure 2).The simulated P (0) g (k) shows the power excess and deficit coming from the primordial feature.
The evidences of the global test appear in figure 12.The minimum evidence is for two knots, N = 2, with values below 10 −5 relative to the highest evidence, which is N = 4. Significant values of the evidence ≳ 10% are found at N = 5, 6.The reconstruction for the oscillatory feature are shown in the top panel of figure 13.The contours follow a N = 4 shape mainly (i.e., with two changes of slope), accounting for a complete oscillation and leading to a decisive detection.The hypothesis test displayed at the bottom panel of figure 13 also provides clear detection since there are three scale ranges where 1 − β > 0.95, being 1 − β > 0.6 at the largest scales (see the last two columns of table 3).2.0 × 10 -9 2.5 × 10 -9 3.0 × 10 -9 Figure 9: Contours of the high-z P R (k) reconstructions for the local bump feature simulations for the N = 2 case (magenta) and for the case of marginalized probability over N knots (blue).We include in dashed black the bump feature template that sourced the simulations.In the bottom panel the values of the power 1 − β obtained from the hypothesis test are plotted.

Global oscillatory feature
We have tested our methodology with the power law and the bump and oscillatory feature templates, which are local features that approach the power law at large scales.We now consider a different type of feature in P R (k) which is global, i.e.P R (k) deviates from the power law at all scales in the form of oscillations.This kind of feature was motivated by a better fit to the Planck data [120,121], although a modulation is usually required.In those works the oscillations are located at scales from k ≈ 0.04 h Mpc −1 to k ≈ 0.4 h Mpc −1 , where lensing is significant.The oscillations help to alleviate cosmological tensions, such as those in H 0 , σ 8 or the lensing amplitude A l .These global oscillations in the primordial power spectrum are predicted by inflation models with non-Bunch-Davies initial conditions [19][20][21] or with axion monodromy [22].
The model used is oscillatory in log-log scale and provides global oscillations for the range of scales that we use (k ∼ 0.01 − 0.1 h Mpc −1 ).It is parametrized including a modulation in the power law primordial power spectrum P SM R :  10: Local oscillatory feature model of the primordial power spectrum (in red) and the power law primordial power spectrum (in blue).The parameters used to generate this oscillatory feature are given in the table inserted in the figure.The power excess peaks at k = 0.07 h Mpc −1 , reaching a 10% power excess, and is followed by a trough with a minimum power deficit of near 10% at k = 0.20 h Mpc −1 .The model tends to the power law at large scales.where A log , ω log and ϕ log are the amplitude, frequency and phase of the oscillations, respectively.Figure 14 shows this modified primordial spectrum template, with the parameters 2.0 × 10 -9

Param. Value
2.5 × 10 -9 3.0 × 10 -9 Figure 13: Contours of P R (k) reconstructions of high-z objects for the oscillatory feature simulations for the N = 2 case (magenta) and for the case of marginalized probability over N knots (blue).The oscillatory feature template of the primordial power spectrum is shown as a dashed black line.In the bottom panel of each figure the values of the power 1 − β obtained from the hypothesis test are plotted.used for our specific feature listed in a table inside.4.1) used as primordial power spectrum for three different values of its amplitude parameter A log : 0.1 (red), 0.02, (green) and 0.01 (purple), corresponding to 10%, 3% and 1.5% power deviations w.r.t. the power law (blue).Their frequency and phase are kept fixed at ω log = 4.0 and ϕ log = 0 respectively.Note the bump peaking near k = 0.1 h Mpc −1 and the two deficits of power at scales k ≈ 0.05 h Mpc −1 and k ≈ 0.25 h Mpc −1 .
The simulated P (0) g (k) for the global oscillatory template are shown in the top panels of figure 15 for two of the three values of the amplitude considered, A log = 0.1, 0.02, 0.01.The P (0) g (k) oscillations across multiple scale ranges can be clearly seen in the bottom panels.
Figure 16 shows their global tests.For the A log = 0.1 case, N = 5 has the highest evidence, with contributions above the 1% coming from N ∈ [6,8].The N ∈ [2, 4] configurations have negligible contributions, being N = 2 the lowest.This indicates a decisive evidence for feature detection for this amplitude.For the A log = 0.02 feature, N = 2 has the largest contribution, with Z 3 ≈ 0.1Z 2 , and the rest of evidences following a quasi-exponential fall.This yields an inconclusive result on the global test.Table 4 summarizes the results of the global test for these two amplitudes, including also a third one of A log = 0.01.
For the A log = 0.1 feature the marginalized contours, represented in the left panel of figure 17, deviate strongly from the power law according to the global oscillatory template, as expected from the global test.The hypothesis test, in the left bottom panel, indicates values of 1 − β very close to 1 in those ranges of k where the oscillations are located.
The right panel of figure 17 shows the reconstructions for the smaller amplitude A log = 0.02.For this weaker oscillations, the marginalized reconstruction follows the power law one, showing no appreciable deviations.Consequently, no detection of the feature is found with the hypothesis test.
Table 4 summarizes all the results of the global and local tests applied to this feature template.The global oscillatory feature studied can be clearly detected with our methodology for the high-z scenario in both global and local tests when the amplitude causes 10 % power deviations w.r.t. the power law.If the deviations are reduced to the 3 %, we did not find any deviation from the power law.
0.025 0.000 0.025 Figure 15: Upper panels: Simulated monopole galaxy power spectrum P (0) g (k) for the highz survey at the bin z = 2.3 performed for the global oscillatory feature template.Left: amplitude of the feature A log = 0.1, corresponding to 10% deviations.Right: amplitude of the feature A log = 0.02, corresponding to 3.% deviations.Bottom panels: relative differences with respect to the Standard Model monopole galaxy power spectrum P (0) g,SM (k).evidences for the reconstruction with 10% power deviations.Right panel: evidences for the reconstructions with 3% power deviations.

Summary of the results for low-z and high-z objects
We explore the impact of sampling the cosmological parameters according to Planck DR 3 uncorrelated Gaussian posteriors in appendix E. As it is shown in that section, the impact of varying the cosmological parameters on the reconstructions is small.As a result, the outcomes of the global and local tests are very similar in both cases.Therefore, in addition to the previous results for high-z objects, we also reconstruct P R (k) for both low-z (ELGs) and high-z objects, but we keep the cosmological parameters fixed in order to reduce the computational load.
In tables 3 and 4 we summarize the detection status for the considered local and global features respectively, based on the evidences and hypothesis test results.
For a sampled cosmology using the high redshift z = 2.3 bin we could not detect the local bump feature with a power excess of 5% at k ≈ 0.2 h Mpc −1 , but the global and local oscillatory ones are detected with decisive statistical evidence for deviations of the 10% w.r.t. the power law.When the global oscillatory model is reduced to 3% deviations, no feature is detected.
For a fixed cosmology we compute reconstructions even combining the information of different redshift bins.The smallest power deviations that we are able to detect with substantial evidence, are ≈ 2% for all the considered features combining either low or high z redshift bins, in scenarios with different redshift bins and photometric errors.As shown in these tables, the local bump feature can only be detected when combining all the redshift bins of the high-z objects.In this case the detection is achieved with decisive evidence, as we also do for all local and global oscillatory realizations with deviations larger than 2%.

Conclusions
We have presented a flexible methodology that reconstructs the primordial power spectrum P R (k) in a model independent way using Bayesian inference.The main advantages of our methodology are that it reconstructs P R (k) in a non-parametric way, without assuming any specific model.This is done by sampling an arbitrary number N of knots without any prior in k-space, which allows a quantitative comparison of different reconstructions based on the evidence Z. Models with lower Z are penalized, contributing less to the marginalized probability over N of the P R (k) reconstructions.
This methodology is applied to detect deviations from the Standard Model primordial power spectrum.For this, we use a global indicator (evidence comparison) and a local test (hypothesis test), as they provide complementary information.The global test compares the overall fit of different N knot configurations and determines the preferred number of knots.convincingly.
For a first application to real data, we used a simple semi-empirical description of nonlinearities and a BAO modelling, following [122], to reconstruct P R (k) from the SDSS LRG 04 catalogue (figure 22).The evidence substantially supports the power law model over higher N > 2 knots configurations, and the local test shows no significant deviations at any scale.
The next step is to apply this methodology to more recent galaxy surveys such as BOSS [92,123] and the forthcoming J-PAS [43], DESI [44] and Euclid surveys [45], previously incorporating non-linear perturbative effects, BAO modelling and more advanced redshiftspace distortions models [98,99].In order to maximize the sensitivity that can be obtained with this methodology, combinations of the information from different redshift bins should be applied, particularly those with a higher signal-to-noise ratio.Also the inclusion of Planck DR 3 posteriors with correlations between the sampled parameters as cosmological priors and a covariance matrix for the k-bins will be explored in applications to recent surveys, something not applied in the present work in order to reduce the computational effort.
Finally, we can envisage several extensions of this methodology that will be addressed in the near future: the use of higher order splines to interpolate between knots, allowing smoother reconstructions; the inclusion of higher order multipoles, ℓ > 0, of the galaxy power spectrum in the reconstruction, such as the quadrupole or the hexadecapole; the incorporation of weak lensing data and its modelling, which complements the clustering information and allows smaller scales to be explored; and the extension of the methodology to include CMB data in addition to the LSS ones.

A Solution of the label switching problem
The label switching problem appears in Bayesian analysis of models with multiple indistinguishable parameters whose order is arbitrary.The permutation of any of these parameters is equivalent [130], leading to highly multi-modal posterior distributions in models with many such parameters.
As described in section 2.1, we sample N − 2 coordinates for the normalized scales x i , with 2 ≤ i ≤ N − 1 (in the case of one or two knots, no k coordinates are sampled).The variables x i are indistinguishable parameters that the sampler can order arbitrarily, causing the label switching problem.A change of coordinates of these scales x i into a (N − 2)dimensional hypertriangle of coordinates x ′ i solves the problem [130]: with x ′ 1 = x 1 = 0 and x ′ N = x N = 1.This transformation can be interpreted as an ordering of the ks for sampling, removing the degeneracy of L with respect to its different k permutations and thus avoiding this problem.

B Sample selection in PolyChord
In this paper, we use the sampler PolyChord to sample the primordial power spectrum.At each step of the sampling, a set of parameters determining the power spectrum is obtained.The chains represent samples of the primordial power spectrum with a probability according to the posterior.In this method, the likelihood L increases monotonically as the sampling progresses (see left panel of figure 18).PolyChord provides the importance weights ω [84], that once normalized can be interpreted as the probability of that particular point of the parameter space.These weights increase with time N 0 until they reach a maximum value, then they drop by a few orders of magnitude (see right panel of figure figure 18).Thus, the reconstructions with the highest ω won't have the highest L values.However, they correspond to the tail where L is barely improved at each step.
We filter the i obtained reconstructions for each number of knots N to remove the burn-in and to exclude reconstructions with a low likelihood value.For that, we use a Monte Carlo method that selects reconstructions according to their values of ω i,N .If ω i,N is larger than a random value from 0 to 1, that reconstruction is kept.Otherwise, it is discarded.Due to the shape of importance weights through the chain step ω(N 0 ) (the highest L points do not have the largest ω, see the correspondence in figure 18), the reconstructions with the highest likelihood also fail this Monte Carlo filtering in most cases, and the ones with the lowest likelihood fail in almost all cases.The retained reconstructions P R (k) have high ω values while still having high enough L values.We label by P R,j,N the reconstructions that passed through this filter.

C Local feature template
The parametrization of the primordial power spectrum used as the local bump and local oscillatory feature template has a localized oscillatory burst [15].Steps in the warp or potential, over which the inflaton rolls in much less than an e-fold, generate oscillations in the power spectrum that can be modelled with this parametrization [15].It is written as a perturbation of the standard power law P R (k): where the first and second order terms I 0 (k) and I 1 (k) are: The burst in P R (k) corresponds to a sudden step-like feature in the inflation potential [25] or the sound speed [26,27].Specifically, this parametrization describes a tanh-step in the inflationary potential and in the warp term of a DBI model [15].
The window functions used in the modified power spectrum model of eq.(C.1) are:

D Posteriors sample
In figure 19 we provide a triangle plot of the posterior distributions for the sampled parameters in the 4 knots case of the local oscillatory feature reconstruction studied in section 4.2.
No strong correlation or degeneracy among the parameters is present, specially between the cosmological and knot ones.Similar posteriors are obtained for other cases of features and number of knots.

E Comparison between fixed and sampled cosmology for the P R (k) reconstructions
In this appendix we quantify the impact of varying the cosmological parameters on the P R (k) reconstructions obtained with a fixed cosmology.The results of this comparison are presented in figure 20 for high-z objects in the cases of the Standard Model and local features.In all cases we obtain increments of the 1σ contours smaller than 20%.For the specific case of the local oscillatory feature deviations up to a factor of 2 appear at the smallest scales for the 3σ contours (lower panel).For the rest of scales the differences are much smaller, even smaller than 1 %.Regarding the impact in the evidences and the hypothesis test, we obtain small changes, as expected from the high similarity in the reconstructions.This is shown in figure 21, where in the left panel we compare the values of the evidences and in the right panel the hypothesis tests.The values of the evidences are very similar.In relation to the power of the test (right panel), the maximum deviations for a sampled cosmology are similar, and the feature detection intervals appear at the same scales but with reduced widths.
F Application to real data: SDSS LRG 04

F.1 The SDSS LRG 04 data
We present a first application of our methodology to real LSS data.We choose the catalogue provided by the Sloan Digital Sky Survey Luminous Red Galaxies 04 data release (SDSS LRG 04).The catalogue [90] contains a sample of 58360 luminous red galaxies, selected3 from the 4th data release of the SDSS [122].These galaxies have a redshift range of 0.155 < z < 0.474, and they form a particularly clean and uniform galaxy sample, consisting of luminous earlytypes galaxies at all redshifts.
Two improvements in the construction of the galaxy power spectrum model for the SDSS LRG 04 are introduced compared to our previous simulations model used in section 3. Its galaxy power spectrum model incorporates a BAO modelling and a semi-empirical description of non-linearities: where b = 1.89 is the galaxy bias and Q nl and A g are parameters accounting for non-linearities [131,132].Non-linearities are introduced in a semi-empirical way through the factor 1+Q nl k 2 1+Agk .According to eq. (F.1), P dewiggled (k) accounts for the BAO signal, written as: where the matter power spectrum is calculated with CAMB, and P nowiggle (k) is the 'nowiggle' power spectrum defined in [133], suppressing oscillations of P m (k).W SDSS (k) is a window function chosen as: W SDSS (k) ≡ e −(k/k * ) 2 /2 , (F.3) with k * being the wiggle suppression scale defined in [134].This BAO modelling mimics the smoothing of the power spectrum caused by the non-linear matter clustering.The analysis of the SDSS LRG 04 data is performed with fixed cosmological parameters (also compatible with Planck DR 3), and relies on a likelihood function that assumes a cosmology independent and diagonal covariance matrix.The covariance matrix is derived from simulations [90].
Figure 22 shows P m (k), P nowiggle (k), P dewiggled (k) and P g (k) along with the SDSS LRG 04 data, ensuring that the constructed P g (k) reproduces the best fit in [90].The linear matter power spectrum provides a reasonably good parametrization of the shape of the monopole g (k) is plotted, in cyan the matter power spectrum P m (k), in red P nowiggle (k), and in blue P dewiggled (k).The uncertainties on P (0) g (k), indicated by the vertical bars, are 1σ and they are uncorrelated, even though the horizontal error bars overlap [90].Note that the models without the non-linear correction 1+Q nl k 2 1+Agk are not able to properly track the non-linearity of data at scales bigger than k ≈ 0.09 h/Mpc.In the left panel the effect of the non-linear correction is shown explicitly with two plots of the no wiggle spectrum P nowiggle (k): without the non-linear correction (continuous red line) and with it (dashed red line).
galaxy power spectrum at large scales.At small scales k ∼ 0.1 linear theory fails to fit the data accurately, as shown in the left panel of figure 22.The non-linearities alter the broad shape of the matter power spectrum and wash out baryon wiggles on small scales, among other effects.In particular, most of the departure from the linear regime at the small scales is caused by considering the underlying matter power spectrum given by the multiple galaxies sharing the same dark matter halo instead of the one given by the dark matter halos themselves (see [131] for more details).
In the case of SDSS LRG 04, the non-linear effects start to be significant at k ≈ 0.09 h/Mpc (see figure 22).The addition of the BAO modelling and the non-linear correction term 1+Q nl k 2 1+Agk helps significantly in fitting the data [90], although the non-linear correction has a semi-empirical origin and it is not physically derived.More realistic and physically motivated models of galaxy clustering are needed for upcoming stage IV LSS surveys, including a bias model such as in [109], a model of redshift space distortions as in [99], and a more physically motivated derivation on non-linearities as in [92].

F.2 Results and statistical interpretation
Figure 23 shows the evidences for the SDSS LRG 04.The power law model has the highest evidence Z. Z 3 contributes slightly over a 10% relative to Z 2 , while the evidences for configurations with higher number of knots decrease quasi-exponentially.The N = 1 configuration, which corresponds to a primordial power spectrum with n s = 1, has an evidence many orders of magnitude smaller that Z 2 , Z 1 ≈ 10 −9 Z 2 , as shown in the right panel of figure 23.Thus, it is strongly disfavoured.The Jeffreys criterion favours the power law in our SDSS LRG 04 analysis.
The marginalized probability over N and N = 2 reconstructions are very similar (top panel of figure 24), except for the 3σ confidence level contours of the marginalized one that the Hubble function.The power suppression results from the convolution of the line of sight comoving position r with a Gaussian e −(∆r ) 2 2σ 2 z

Figure 2 :
Figure 2: Signal to noise ratios S/N (k) for the SM at different redshift bins for different scenarios.Top panels: for low-z objects with low (left) and high (right) δ z .Bottom panel: for high-z objects.

g
(k) is represented in figure3.

Figure 3 :
Figure 3: Top panel: Simulated monopole galaxy power spectrum P (0) g (k) of high-z objects performed for the SM, i.e. with its P R (k) being the power law.Bottom panel: relative difference with respect to the Standard Model monopole galaxy power spectrum P (0) g,SM (k).

Figure 4 :
Figure 4: Evidence Z of the Standard Model reconstructions for each of the N knots configuration considered for the high-z case.

Figure 7 :
Figure 7: Top panel: Simulated monopole galaxy power spectrum P (0) g (k) of high-z objects performed for the local bump feature template.Bottom panel: relative difference with respect to the Standard Model monopole galaxy power spectrum P (0) g,SM (k).

Figure 8 :
Figure 8: Evidence Z of the local bump feature template reconstructions for each of the N knots configurations considered.

Figure 11 :
Figure 11: Top panel: Simulated monopole galaxy power spectrum P (0) g (k) of high-z objects performed for the local oscillatory feature template.Bottom panel: relative difference with respect to the Standard Model monopole galaxy power spectrum P (0) g,SM (k).

Figure 12 :
Figure 12: Evidence Z of the oscillatory feature reconstructions of high-z objects for each of the N knots configuration considered.

Figure 14 :
Figure14: Global oscillatory feature template of eq.(4.1) used as primordial power spectrum for three different values of its amplitude parameter A log : 0.1 (red), 0.02, (green) and 0.01 (purple), corresponding to 10%, 3% and 1.5% power deviations w.r.t. the power law (blue).Their frequency and phase are kept fixed at ω log = 4.0 and ϕ log = 0 respectively.Note the bump peaking near k = 0.1 h Mpc −1 and the two deficits of power at scales k ≈ 0.05 h Mpc −1 and k ≈ 0.25 h Mpc −1 .

Figure 16 :
Figure 16: Evidence Z of the global oscillatory template reconstructions for each of the N knots configurations considered in the high-z survey with a single bin z = 2.3.Left panel: evidences for the reconstruction with 10% power deviations.Right panel: evidences for the reconstructions with 3% power deviations.

Figure 17 :
Figure 17: Contours of the P R (k) reconstructions of the global oscillatory feature template with the high-z survey specifications at the bin z = 2.3 for the N = 2 case (magenta) and for the case of marginalized probability over N knots (blue).The oscillatory feature template is shown as a dashed black line.In the bottom panel of each figure the value of the power 1 − β used for detecting features in the hypothesis test.Left figure: reconstructions with A log = 0.1, corresponding to 10% power deviations w.r.t. the power law.Right figure: reconstructions with A log = 0.02, corresponding to 3% power deviations w.r.t. the power law.

Figure 18 :
Figure 18: Left panel: minus likelihood −L against time N 0 .Note how L increases monotonically with the advance of the step.Right panel: normalized importance weights ω against N 0 .Note that the maximum in the weights is found at steps when the Markov chain is halfway completed, i.e at N 0 ≈ N 0,max /2.
/k s ) D k/k s x s , (C.3) with D(x) being the dumping function, D(x) = x sinh x .This model has five parameters: three amplitudes {A 1 , A 2 , A 3 }, the scale k s where the oscillations start, and the damping parameter x s .The set of window functions W (j) i are given below.

Figure 19 :
Figure 19: Triangle plot of the posterior distribution for the 4 knots case of the local oscillatory feature reconstruction.The figure was obtained with the statistical python package GetDist.

Figure 21 :
Figure 21: Comparison of tests for a sampled cosmology (orange) and a fixed one (blue).Left figure: global test.Right figure: local test.

Figure 22 :
Figure 22: Galaxy power spectrum data for the SDSS LRG 04 (black points) in semilogarithmic scale represented as kP (0) g (k) (left panel) and logarithmic scale (right panel).In green the employed best fit of P (0)

Table 1 :
Priors used in this work.The sampled cosmological parameters {H 0 , Ω b , Ω c }

Table 2 :
[102,103]c −3 Densities n in units of 10 −3 h 3 Mpc −3 of low (left table) and high (right table) redshift objects for each bin of redshift z.The values of the densities depend on the photometric error δ z , and are estimated for low-z[101]and high-z[102,103]objects using eq.(3.2).