The local PNG bias of neutral Hydrogen, ${\rm H_I}$

We use separate universe simulations with the IllustrisTNG galaxy formation model to predict the local PNG bias parameters $b_\phi$ and $b_{\phi\delta}$ of atomic neutral hydrogen, ${\rm H_I}$. These parameters and their relation to the linear density bias parameter $b_1$ play a key role in observational constraints of the local PNG parameter $f_{\rm NL}$ using the ${\rm H_I}$ power spectrum and bispectrum. Our results show that the popular calculation based on the universality of the halo mass function overpredicts the $b_\phi(b_1)$ and $b_{\phi\delta}(b_1)$ relations measured in the simulations. In particular, our results show that at $z \lesssim 1$ the ${\rm H_I}$ power spectrum is more sensitive to $f_{\rm NL}$ compared to previously thought ($b_\phi$ is more negative), but is less sensitive at other epochs ($b_\phi$ is less positive). We discuss how this can be explained by the competition of physical effects such as that large-scale gravitational potentials with local PNG (i) accelerate the conversion of hydrogen to heavy elements by star formation, (ii) enhance the effects of baryonic feedback that eject the gas to regions more exposed to ionizing radiation, and (iii) promote the formation of denser structures that shield the ${\rm H_I}$ more efficiently. Our numerical results can be used to revise existing forecast studies on $f_{\rm NL}$ using 21cm line-intensity mapping data. Despite this first step towards predictions for the local PNG bias parameters of ${\rm H_I}$, we emphasize that more work is needed to assess their sensitivity on the assumed galaxy formation physics and ${\rm H_I}$ modeling strategy.


Introduction
The level of non-Gaussianity of the distribution of the energy density fluctuations generated during inflation holds key information about the particle content and physics of the early Universe [1,2]. The most popular and well-studied type of this primordial non-Gaussianity (PNG) is called local-type PNG, where the primordial gravitational potential φ is written as [3] with φ G being a Gaussian random field and f nl a constant that quantifies the leading-order departure from Gaussianity ( · · · indicates ensemble averaging). Different models of inflation make different predictions for f nl , and so constraining its value observationally lets us distinguish between them. The current tightest bound comes from the analysis of the cosmic microwave background (CMB) data from the Planck satellite, which constrained f nl = −0.9 ± 5.1 (1σ) [4]. Using large-scale structure data, the tightest constraint to date is f nl = −12 ± 21 (1σ) [5], and was obtained using quasars from the eBOSS survey (see also Ref. [6]). This bound is still looser than the CMB one, but the volumes spanned by next-generation large-scale structure surveys should let us reach precisions of order σ fnl ∼ 1, which would mark, even without a detection, an important landmark in our ability to distinguish between competing models of inflation [7,8]. Some of these future constraints on f nl will come from studies of the spatial distribution of (atomic) neutral hydrogen (H I ) in the Universe [9][10][11][12][13][14][15][16][17][18][19], which will be mapped by 21cm line-intensity mapping experiments such as BINGO [20], CHIME [21], HIRAX [22] and SKA [23]. Compared to traditional galaxy surveying where individual galaxies are identified on the sky and their redshifts are estimated spectroscopically or photometrically, the line-intensity mapping approach targets the integrated emission from all sources (resolved or not) in a given region of the Universe, with the radial information inferred through measurements in different frequency bands [24]. In the case of H I , the relevant line is the 21cm radiation emitted by its spin-flip transition. After the epoch of reionization z 6, most of the hydrogen in the Universe is ionized, but a significant amount of H I is still present in galaxies inside gas clouds that are sufficiently dense to shield the H I from the ionizing radiation. By skipping the need to resolve individual galaxies, line-intensity mapping surveys can scan very rapidly large areas of the sky with good redshift coverage, which is key to probe the large distance scales where f nl contributes the most; when completed, some of these surveys will have mapped the H I out to redshifts z ∼ 3 over about half of the sky. These experiments are also subject to distinct observational systematics compared to traditional galaxy surveys (e.g. the large foreground emission that needs to be subtracted), which will strongly establish the robustness of large-scale structure constraints on f nl if the different surveying techniques give consistent results [25][26][27].
The key observational signatures of local PNG on the H I distribution arise through a set of bias parameter terms that are generated if f nl = 0. In general, bias parameters are a set of redshiftdependent numbers that describe how the local abundance of a given tracer of the large-scale structure (dark matter halos, galaxies, H I , etc.) is modulated by different types of long-wavelength ( 50 − 100 Mpc) perturbations (see Ref. [28] for a comprehensive review). For the case of H I , and specifying already to the bias parameters we focus on in this paper, we can write [29][30][31][32] ρ HI (x, z) ⊃ρ HI (z) 1 + b 1 (z)δ m (x, z) + b φ (z)f nl φ(q) + b φδ (z)f nl φ(q)δ m (x, z) , (1.2) where ρ HI (x, z) is the local H I energy density,ρ HI (z) is its cosmic average, δ m (x, z) is a large-scale total matter density fluctuation and φ(q) is a large-scale primordial gravitational potential fluctuation (q is the initial Lagrangian coordinate associated with the final Eulerian coordinate x). This equation makes apparent the physical meaning of the bias parameters b 1 , b φ and b φδ as the response of the local H I energy density to the presence of the perturbation each multiplies, i.e., For example, b 1 quantifies how much more (b 1 > 0) or less (b 1 < 0) H I exists inside large-scale mass overdensities. These three parameters are the most important ones for observational constraints on f nl using the H I power spectrum (2-point function) and bispectrum (3-point function). Concretely, for the case of the H I power spectrum, and analogously to the case of the galaxy power spectrum [29], the f nl signature is where k is the wavenumber. 1 Similarly, the leading-order contributions to the bispectrum include a series of terms proportional to f nl b 3 1 , f nl b φ b 2 1 and f nl b φδ b 2 1 , whose amplitude is what can be used to constrain f nl (see e.g. Fig. 1 of Ref. [33] for a visualization of these contributions).
Since the observational signatures of f nl are effectively degenerate with the bias parameters, a solid understanding of the latter is critical to obtain the best possible constraints on local PNG from large-scale structure data (see Refs. [33][34][35] for recent discussions). However, as the bias parameters effectively describe how the long-wavelength environment impacts the H I distribution, they depend on the complicated interplay between gravity, reionization, star formation and stellar/black hole feedback, and are thus extremely challenging to predict theoretically. While b 1 can be fitted for directly from the data since it enters also through terms that do not multiply f nl , the same is not true for the local PNG parameters b φ and b φδ , which require assumptions to be made on in order to constrain f nl competitively. The standard way around this problem is based on the hope that, while the parameters b φ and b φδ themselves are expected to be extremely sensitive to the details of structure formation, their relation to b 1 may be more robust and easier to predict theoretically. Thus, if b φ and b φδ can be fixed in terms of b 1 , then determining the latter from the data breaks the degeneracy with f nl , which can then be constrained.
For example, the majority of the constraint studies on f nl in the literature (including those based on H I data) assume relations based on the universality of the halo mass function, from which it follows e.g. that b φ ∝ (b 1 − 1). Despite its widespread adoption however, there is no reason to expect these relations to hold exactly for real tracers of the large-scale structure like the H I , which motivates simulation-based works to refine these simple theoretical priors. Indeed, even for the case of dark matter halos in gravity-only simulations, we now know that b φ (b 1 ) [36,37] and b φδ (b 1 ) [35] are not perfectly described by the corresponding universality expressions. Further, using hydrodynamical simulations of galaxy formation, Refs. [35,38] showed recently that the b φ (b 1 ) and b φδ (b 1 ) relations of galaxies selected by a variety of criteria (including stellar-mass, black-hole mass accretion rate and color) are also not adequately described by the universality assumption, which needs to be revisited.
This motivates our main goal in this paper, which is to use hydrodynamical simulations to study the bias parameters b φ and b φδ of the H I distribution. The methodology we adopt in this paper is similar to that of Refs. [35,38], who used separate universe simulations of galaxy formation with the IllustrisTNG model to study the same bias parameters, but for the halo and galaxy distribution. One of our main results is that the b φ (b 1 ) and b φδ (b 1 ) relations of the H I are generically overpredicted by the results that are typically encountered in the literature assuming the above-mentioned universality expressions. We show this can be explained physically by the impact that long-wavelength f nl φ perturbations have on the H I content of dark matter halos, which can be understood as a new bias parameter (or response function) in the halo model. We will show results for just a single galaxy formation model and H I modeling recipe, and as a result, given the uncertainties we still have on these areas, and despite the intuition this will let us build already, the analysis in this paper should be regarded as the first step towards robust theoretical priors on the b φ (b 1 ) and b φδ (b 1 ) relations of the H I . We stress that these relations play a key role in observational constraint studies of f nl using H I data, which strongly motivates more works like this. The rest of this paper is organized as follows. In Sec. 2, we describe the galaxy formation simulations, the H I modeling strategy, and the numerical methods that we use to estimate the H I bias parameters b 1 , b φ and b φδ . We show and discuss our numerical results in Sec. 3, and summarize and conclude in Sec 4.

Methodology
In this section we describe the N -body simulations, the modeling of the H I distribution and the methods we use to estimate the H I bias parameters b 1 , b φ and b φδ defined in Eqs. (1.2) and (1.3).

Numerical simulation specifications
We obtain our numerical results using separate universe simulations run with the moving-mesh code AREPO [39] and IllustrisTNG as the galaxy formation model [40,41]. This model is characterized by a set of sub-grid prescriptions for physics including gas cooling, star formation, metal enrichment, stellar winds, supernovae feedback, and black hole feedback, that broadly reproduces a series of key observations including the low-redshift galaxy stellar mass function, the cosmic star formation history, the sizes and colors of galaxies and the gas fractions in galaxies and galaxy groups [42][43][44][45][46][47][48]. The fiducial cosmological parameters are: mean total matter density today Ω m0 = 0.3089, mean baryon density today Ω b0 = 0.0486, mean dark energy density today Ω Λ0 = 0.6911, dimensionless Hubble rate h = 0.6774, primordial scalar power spectrum amplitude A s = 2.068 × 10 −9 (at k pivot = 0.05/Mpc, corresponding to σ 8 (z = 0) = 0.816), and primordial scalar spectral index n s = 0.967.
As we describe below, we measure the H I bias parameters b φ and b φδ , respectively, as the response of the H I abundance and its linear bias parameter b 1 to long-wavelength f nl φ perturbations. Under the separate universe approach, this is equivalent to the response to changes in the cosmological parameter A s , and so we consider also two additional cosmologies called HighA s and LowA s , which share the same parameters as the fiducial except that A s → A s [1 + δ As ], where δ As = +0.05 for HighA s and δ As = −0.05 for LowA s . When we vary A s , we keep the sub-grid parameters of the IllustrisTNG model fixed also to their fiducial values in order to interpret our results on the impact of f nl φ perturbations at fixed galaxy formation physics.
We will show results obtained at two numerical resolutions. One is called TNG300-2 and is characterized by a cubic box with size L box = 205 Mpc/h and N p = 2 × 1250 3 initial dark matter and gas resolution elements. The other is higher-resolution, it is called TNG100-1.5, and it is characterized by L box = 75 Mpc/h and N p = 2 × 1250 3 . These separate universe simulations have been presented for the first time in Ref. [38] (to which we refer the reader for more details), and used in previous works already to study galaxy bias [35] and halo occupation distribution responses [49]. In our results below we analyse the simulation outputs at redshifts z = 0, 0.5, 1, 2, 3. We have also used the publicly available IllustrisTNG data for the resolutions dubbed TNG300-1 (L box = 205 Mpc/h, N p = 2 × 2500 3 ) and TNG100-1 (L box = 75 Mpc/h, N p = 2 × 1820 3 ) to carry out some numerical convergence and consistency checks of our modeling at the fiducial cosmology.
When we analyse the H I content in halos, we consider gravitationally-bound objects found using a friends-of-friends algorithm with linking length b = 0.2 times the mean dark matter interparticle distance. We quote as halo mass M h and total H I mass M HI the summed contribution from all halo member particles and cells, and consider objects that comprise at least N cell ≥ 50 gas cells.

Modeling of H I
Our strategy to model the H I distribution in our simulations follows largely that of Ref. [50]. In IllustrisTNG, the hydrogen gas mass fraction of each Voronoi cell starts off at its primordial value of f H = 0.76, and it decreases with time as stars form and enrich the interstellar medium with heavier metals. The first part of the modeling of H I involves determining the fraction f HN of all hydrogen that is neutral. For non-star-forming gas (which for the star formation model in IllustrisTNG [51] is all gas with local hydrogen number density n H < 0.106/cm 3 ), we use the split into neutral and ionized fractions that is calculated self-consistently during the simulation taking into account the redshift-dependent UV radiation background, attenuation due to self-shielding in higher-density gas regions, and ionizing radiation by local active galactic nuclei (AGN) [52].
The situation is more complicated for star-forming gas because f HN is not computed selfconsistently: the simulation output values are based on a mass-weighted temperature of the cold and hot gas phases, which is not a well-defined physical temperature. The authors of Ref. [50] go about this complication by setting the temperature of all star-forming gas cells to T = 10 4 K, and repeating the default calculation f HN at post-processing with this temperature value. Here, based on the discussion in Ref. [53], we follow an even simpler approach and consider all hydrogen of star forming cells to be neutral. This is based on the observation that the hot gas in star-forming regions is expected to be entirely ionized, the cold component is mostly neutral, and that the star formation model of Ref. [51] predicts this cold fraction to be within 0.9 and 1 (cf. the top right panel of Fig. 1 in Ref. [53] for an illustration of how good an approximation f HN = 1 is for star-forming gas).
Given the neutral hydrogren fraction, the second part of the modeling involves determining the fraction of H N that is in atomic H I and molecular H 2 form. As in Ref. [50], we use the Krumholz-McKee-Tumlinson (KMT) model [54][55][56] to determine the fraction f H2 of all neutral Hydrogen that is molecular: and where Z is the gas metalicity in solar units, σ d = Z ×10 −21 cm 2 is an estimate of the cross-section of dust, µ H is the mean hydrogen nucleus mass and Σ gas is the gas surface density. The latter can be calculated as Σ gas = λ J ρ gas , where λ J is the gas Jeans length (used here as a proxy for the size of gravitationally-bound gas clouds) and ρ gas is the gas volume density, or using a more sophiscated approach based on actually projecting the three-dimensional gas distribution in the simulation [53,57].
Here, we follow the approximate approach of Ref. [50] and evaluate the gas surface density as Σ gas = Rρ gas , with R = (3V cell /(4π)) 1/3 and V cell the volume of the Voronoi gas cell in Arepo. Finally, the atomic neutral hydrogen fraction we are interested in is given by f HI = 1 − f H2 , and we compute the H I energy density as ρ HI = f HI f HN ρ H , where ρ H is the total hydrogen density in each gas cell.
It should be noted that the KMT model is just one of several different approaches to model the H I and H 2 phases of neutral hydrogen in hydrodynamical simulations. More advanced analytical approaches can account in particular also for the contribution of young stars to the ionizing radiation field in the interstellar medium, and other approaches based on empirical observational correlations and calibration against high-resolution numerical simulations also exist; see Refs. [53,57] for a comparison of such different models using also IllustrisTNG. The simple KMT model can nonetheless be regarded as representative enough of the phenomenology of all these different approaches, and sufficient to our purpose here to begin to build intuition about the local PNG bias parameters of the H I . In fact, Fig. 13 of Ref. [53] shows that the dependence of f H2 on the local hydrogen density predicted by the KMT model is in between the predictions from many other models, which Ref. [57] The result is shown for the fiducial cosmology and for the TNG300-2 (blue), TNG100-1.5 (green), TNG300-1 (purple) and TNG100-1 (orange) resolutions. On the left, the symbols with error bars show a few representative observational estimates compiled in Table 2 of Ref. [58], and converted to our fiducial cosmology: black from Ref. [59], orange from Ref. [58], light green from Ref. [60], blue from Ref. [61], purple from Ref. [62], green from Ref. [63], cyan from Ref. [64] and red from Ref. [65] (see Ref. [57] for a discussion of the comparison between the simulation and these observational results). On the right, the grey dots mark the actual HI and total mass of the halos in the TNG300-1 simulation, and the black dashed line shows the three-parameter fit obtained in Ref. [50] using their modeling of HI in TNG100-1.
subsequently showed perform roughly in the same way when compared to the data. This justifies the adoption of the KMT model in this paper, but we note that future analyses of these H I bias parameters should eventually determine the impact of different H I modeling techniques.
The left panel of Fig. 1 shows the time evolution of the total H I energy density Ω HI (z) = ρ HI (z)/ρ c0 , where ρ c0 = 3H 2 0 /(8πG) is the critical energy density today. The result is shown for the fiducial cosmology of our TNG300-2 and TNG100-1.5 resolutions (blue and green), as well as for the original TNG300-1 and TNG100-1 simulations (purple and orange). For the latter two cases, our results agree very well with those obtained using the same simulation data in Refs. [50,57]. Recall, in particular, that the modeling in Ref. [50] differs from ours in that they recompute f HN in star-forming gas at post-processing, whereas we simply set it to unity: the good agreement between the two works thus illustrates the weak impact of this approximation. We have also explicitly checked that the H I content of halos in our simulations agrees very well with that shown in Ref. [50]. As an example, we show in the right panel of Fig. 1, the mean H I -to-total halo mass relation M HI (M h ) at z = 1. The black dashed curve shows the fit of Ref. [50] to their relation in the TNG100-1 simulation, which agrees very well with our result for the same resolution (orange line) for M h 10 10 M /h; the difference at lower halo masses is expected as the simple fit does not describe the details of the low-mass end of the distribution also in Ref. [50]. Figure 1 shows also that numerical resolution has a strong impact on the H I distribution. Generically, Ω HI decreases with decreasing resolution, as best seen by the markedly lower H I abundance in TNG300-2 compared to all other resolutions. Note however that TNG100-1.5 is a resolution intermediate to TNG100-1 and TNG300-1, but it has more H I at z < 1. The impact of numerical resolution can be understood at least partly by inspecting the behavior of the M HI (M h ) relation. For example, the right panel of Fig. 1 shows that this relation in TNG300-2 cuts off sharply for M h 5×10 9 M /h. This is because at this resolution, these objects are not as well resolved, and supernovae driven winds, tidal stripping and heating by UV background radiation become very efficient at removing gas, which gets more easily ionized once in the intergalactic medium (see Ref. [50] for an indepth discussion of this low-mass cutoff). Additionally, high-density gas clouds that can shield the H I from ionizing radiation are also harder to resolve overall in TNG300-2. At the higher mass end, starting from M h 10 12 M /h, the halos in TNG300-2 also display a lower amount of H I , which is partly due to these larger halos comprising also a larger number of poorly resolved subhalos, which for the reasons just discussed are not as H I rich. The results from the other numerical resolutions can be broadly explained along similar lines. We shall take these considerations around numerical resolution into account when we interpret our numerical findings below.

Bias parameter measurements
We evaluate the H I bias parameters b 1 , b φ and b φδ in a way that is in all analogous to how Ref. [35] evaluated the same parameters for halos and galaxies. Concerning b 1 , we estimate it as where P mHI and P mm are the H I -matter cross-power spectrum and matter power spectrum, respectively. The lowest wavenumbers probed by our simulations are k 0.03h/Mpc for TNG300-2 and k 0.08h/Mpc for TNG100-1.5, where the scale-dependence of the P mHI /P mm ratio can still be nonnegligible. To make our estimates of b 1 more robust then, we fit this ratio on scales k < 0.15h/Mpc for TNG300-2 and k < 0.3h/Mpc for TNG100-1.5 using b 1 + Ak 2 , where A is a nuisance parameter that absorbs the leading-order impact from scale-dependent corrections. 2 We show as error bars on b 1 the error estimated from the least-squares fitting method.
For the case of b φ , we estimate it using its definition as the response of the H I energy density to changes in the amplitude of the primordial gravitational potential φ in cosmologies with local PNG (cf. Eq. (1.3)). Concretely, if φ L is the amplitude of some long-wavelength primordial potential perturbation, then it can be shown using the separate universe ansatz [29,66] that local structure formation inside this perturbation is equivalent to global structure formation in a cosmology with the primordial scalar power spectrum amplitude A s rescalled as The bias parameter b φ can then be estimated by finite diferencing using our set of fiducial, HighA s and LowA s cosmologies as and where the superscripts indicate in which cosmology the H I energy density is measured (recall |δ As | = 0.05). With only one realization of the initial conditions we cannot quote errors on our measurements in a statistical ensemble sense, and bootstrapping or resampling methods are also not adequate approaches given our relatively small box sizes. We note however that b HighAs φ and b LowAs φ should be the same up to numerical noise, and so we will use their difference as a rough estimate of the error in our measurements. Note also that since the three cosmologies share the same phases of the initial conditions, sample variance errors on b φ cancel already to a large degree. Finally, we evaluate b φδ as Redshift where the second equality follows again from the separate universe equivalence between long-wavelength f nl φ perturbations and changes to A s . This equation can be derived from the definitions of b 1 and b φδ in Eq. (1.3) [35], noting that ρ HI b φδ = ∂ (ρ HI b 1 ) /∂(f nl φ). We evaluate the derivative of b 1 in Eq. (2.10) via finite differences (analogously to Eqs. (2.7)-(2.9)) using the values of b 1 estimated from the fiducial, HighA s and LowA s simulations using Eq. (2.5). Note that b φδ could also be estimated with separate universe simulations that account simultaneously for changes to A s and the mean cosmic matter density, but at the cost of running additional simulations to the ones we use in this paper.

Results
In this section we present and discuss our main H I bias results. We discuss first the parameter b φ and its relation to b 1 , and then do the same for the b φδ parameter. Our numerical values for these three bias parameters are listed in Tab. 1.

The b φ parameter of H I
The left panel of Fig. 2 shows the redshift evolution of the b φ values of H I from the TNG100-1.5 and TNG300-2 resolutions, as labeled. Both resolutions predict that b φ increases with increasing redshift, and that it is negative at z 1. The two resolutions are in relatively good agreement in their predictions, although there are visible differences at z = 0.5, z = 1 and z = 3. The right panel of Fig. 2 shows b φ plotted against the corresponding values of b 1 , where we note a similar level of agreement between the two resolutions, except perhaps again at z = 3 (rightmost point) where TNG300-2 displays a markedly lower value of b φ .

Halo model interpretation of the results
The halo model is a useful tool to guide the interpretation of our numerical results. Assuming that all of the H I is inside halos with some mass M h , then b φ can be written as [49] b HI where n h (M h ) is the differential halo mass function, M HI (M h ) is the mean H I mass in halos of mass M h (cf. Fig. 1), b h φ (M h ) is the local PNG bias of the halos and is the response of the M HI (M h ) relation to long-wavelength perturbations f nl φ. This response function is an ingredient that is often ignored in the literature, but which must be present in general to account for the modulation of M HI (M h ) by f nl φ, in the same way that b h φ accounts for the same modulation of the halo mass function; see Ref. [49] for more details. We have evaluated Eq. (3.1) using its ingredients measured from our TNG100-1.5 and TNG300-2 simulations. In doing so, we considered all halos with at least N cell = 50 gas cells, which for TNG100-1.5, accounts for over 99% of all of the H I at z < 2 . On the right, the solid grey line marks the universality relation, the dotted grey line is the expected relation for halos, the grey dashed line shows the recent-merger relation of Ref. [66], and the relation for stellar-mass selected galaxies in IllustrisTNG from Ref. [38] is shown by the grey dot-dashed line. and 96% at z = 3, and for TNG300-2, it accounts for over 99% at z < 1, 97% at z=2 and 91% at z=3. The result is shown by the dashed lines in the left panel of Fig. 2, which agree extremely well with the simulation measurements, as expected. 3 To focus on the impact of the response function R HI φ (M h ), we show as dotted lines on the left panel of Fig. 2 the outcome of Eq. (3.1), but with the contribution from R HI φ (M h ) set to zero; by comparing to the dashed line, which shows the full result, one can thus directly appreciate the importance of this response function. Concretely: 1. At z ≤ 1, the dashed lines are below the dotted lines, which indicates a negative net effect from R HI φ . That is, the halos become generically H I poorer inside f nl φ perturbations. 2. At z = 2, the dashed and dotted lines are comparable, which signals a smaller contribution from R HI φ . That is, the H I content of halos responds less strongly to f nl φ perturbations.
3. At z = 3, for TNG100-1.5, the net effect from R HI φ is positive, i.e., the f nl φ perturbations make the halos slightly H I richer (this is to a smaller extent already visible at z = 2). For TNG300-2, however, R HI φ has a very negative net effect (cf. dashed blue line below the dotted one). 4 We do not show all of the response function measurements for brevity, and also because they are still relatively noisy and more simulations are needed to aid a more detailed inspection of their massand redshift-dependence. Just as a single illustrative case, however, the left panel of Fig. 3 shows the response function R HI φ measured using our separate universe simulations at z = 1. Indeed, and as anticipated in point 1 above, the values of R HI φ are either negative or compatible with zero on the mass range that contributes sizeably to the integral of Eq. (3.1).

The connection to the total hydrogen distribution
To understand further these results, we investigated also the response of the total hydrogen content of the halos, i.e., is the mean mass in hydrogen (not just H I ) in halos of mass M h . This is shown in the middle panel of Fig. 3 for z = 1, where we observe that it has a shape and size that is similar to the response of the H I content on the left. That is, the suppression of the H I content in halos inside f nl φ perturbations at z = 1 follows to some degree the suppression of the total amount of hydrogen H in those halos. Physically, this behavior of R H φ is expected for at least two reasons, both associated with the enhanced structure formation inside f nl φ perturbations (or equivalently, given an increase in A s ): 1. First, as found in Ref. [38], these perturbations enhance star formation, which accelerates the transformation of the primordial hydrogen into stars and heavier metals. To help visualize this effect, the right panel of Fig. 3 shows the response R Metals For another perspective into these results, we show in Fig. 4 the redshift evolution of the response of the total mass, mass in H I , mass in all hydrogen and mass in metals, found in all halos in the mass bin M h ∈ 10 12 , 5 × 10 12 M /h, as labeled. Focusing first on the TNG300-2 results on the right, the black line shows that the f nl φ perturbations work to increase the total mass (except at z = 0 when it decreases just slightly), and do more so at higher redshift; this is a consequence of there being more halos in the bin, as well as each halo becoming also more massive. The figure shows also that while the mass in H I can increase for z 1 (cf. blue line), this increase is always smaller than that in total mass, which effectively causes a suppression of the H I -to-total mass relation, i.e., R HI φ < 0 (cf. Fig. 3). At this resolution, the behavior of the H I response is well traced by that of the total hydrogen shown in cyan color. Note also how the effects of enhanced star formation are visible at z ≥ 1 by the enhanced mass in metals (cf. orange line), but at lower redshift the stronger effects by black hole feedback inside f nl φ perturbations eventually work to remove metals from the halos, and can lead to the response of the mass in metals to drop below that of the total mass at z = 0.
The results from the TNG100-1.5 resolution on the left of Fig. 4 agree relatively well with those from TNG300-2, with the main noteworthy difference being that in TNG100-1.5 the behavior of the H I Figure 4. Response of the total mass (black), mass in HI (blue), mass in all hydrogen (cyan) and mass in metals (orange) found inside all halos in the total mass bin M h ∈ 10 12 , 5 × 10 12 M /h. The result is shown as a function of redshift, and on the left and right for TNG100-1.5 and TNG300-2, respectively. response is not as well traced by the total hydrogen response. Concretely, at z ≤ 1 the response of the H I mass is below that of the total mass, but still visibly above the response of the total hydrogen, i.e., the f nl φ perturbations still lower the H I -to-total mass relation of these halos, but lower the H-to-total mass relation even more. Further, at z ≥ 2 the f nl φ perturbations can actually lead to an increase in H I mass that slightly outweighs the increase in total mass. This could be associated with the fact that f nl φ perturbations promote the formation of denser gas clouds that can shield the hydrogen inside them more efficiently from the ionizing radiation. These dense structures are not as well resolved in TNG300-2, which can explain why the same effect is not visible at this lower resolution.
There are more physical effects that can drive differences between the impact that f nl φ perturbations have on the H I and hydrogen mass of halos. For example, related to point 2 above, for the case of the H I it is actually simply sufficient that the enhanced feedback or tidal stripping effects remove gas from inside high-density regions in the halo (and not completely from inside the halo altogether) for it to be more exposed to the ionizing radiation; this lowers R HI φ , but not R H φ . Further, f nl φ perturbations also modify the details of the evolution of AGN, and thus their contribution to the local ionizing radiation field in IllustrisTNG (which impacts the H I directly, but not the total hydrogen). In fact, here one should note that since stars form earlier inside f nl φ perturbations, reionization begins earlier as well. This effect is not accounted for in our results as we assumed the time-dependent UV background (the main driver of reionization in IllustrisTNG) to be the same in our fiducial, HighA s and LowA s cosmologies. This should not have a critical impact in our results as we focus on redshifts well after reionization is complete, but it is interesting to investigate the impact of this approximation in works more focused around the epoch of reionization 6 z 10. We defer to future work a more indepth investigation of the complicated interplay between these and other astrophysical effects, which we note will depend in general on the assumed galaxy formation model and H I modeling strategy.

Comparison to the universality relation
For b 1 1.5 (z 1), this is largely because of the negative values of R HI φ , which push the b φ (b 1 ) relation downwards. For b 1 1.5 (z 1) the result is a combination of the impact of R HI φ and the relation of the halos shown by the grey dotted line 6 , which is itself already below the universality relation. Note also that the predictions for the bias parameter b 1 also affect the b φ (b 1 ) relation. For comparison purposes only, the right panel of Fig. 2 shows also the recent-merger relation b φ = 2δ c (1 − 1.6) (dashed) derived by Ref. [66] and that is usually assumed in constraints using quasars [5,6], and the relation b φ = 2δ c (1 − 0.55) (dot-dashed), which was found in Ref. [38] to roughly describe the case for stellar-mass selected galaxies in IllustrisTNG.

The b φδ parameter of H I
Our results for the b φδ parameter of H I are displayed in Fig. 5, which has the same format as Fig. 2 discussed above for b φ . The first noteworthy point is that our b φδ estimates are noisier compared to b φ , which is expected since b φδ is a second-order bias parameter; this is especially dramatic in our results for TNG300-2 at z = 3. Despite the noise, there are still a few trends that one can discern from our results, in particular, that b φδ is generically a growing function of redshift (b φδ < 0 at z 2), and that for b 1 1.5, the simulation measurements predict a b φδ (b 1 ) relation that falls below the corresponding universality expression (grey solid line in Fig. 5). The latter is in this case given by where b 2 is the bias parameter associated second-order mass perturbations δ m (x, z) 2 , and which we evaluate using the polynomial fit b 2 (b 1 ) = 0.412 − 2.143b 1 + 0.929b 2 1 + 0.008b 3 1 of Ref. [72]. This fit was obtained for dark matter halos using separate universe simulations that do not assume universality of the halo mass function, but the impact of this is small for the purpose of our comparisons here (see Ref. [72] for a discussion).
Similarly to as discussed in the last section for b φ , here too the halo model is a useful tool to interpret our numerical results. The halo model expression for b φδ is given by [49] b HI where b h 1 , b h φ and b h φδ are the bias parameters of the halos, and in addition to R HI φ defined in Eq. (3.2), now the result depends also on two extra response functions of the H I -to-total halo mass relation defined as . (3.5) The set of separate universe simulations we use in this work does not let us estimate these response functions reliably, which keeps us from being able to evaluate Eq. (3.3) entirely. We evaluate it nonetheless with R HI 1 = R HI φδ = 0 and the R HI φ measured from the simulations, and the result is shown by the dashed lines in the left panel of Fig. 5. 7 The agreement between this halo model prediction and the simulation results is not perfect, but that is not surprising since the halo model result is incomplete; the difference between the dashed lines and the simulation measurements can in fact be used to estimate roughly the size of the combined contribution from b h φ R HI 1 + R HI φδ that was neglected. Concerning the comparison to the universality prediction, previous works in the literature [15,18] have estimated the value of b φδ for the H I using Eq. 2 (z 2) our simulation results begin to drop below the universality prediction, and so that this way of calculating the b φδ parameter in the literature overestimates (the result becomes less negative) the b φδ (b 1 ) relation of the H I in our simulations. We leave a more indepth discussion of the b φδ parameter of the H I distribution to future work, which would benefit from larger simulation volumes for improved statistics.

Summary and conclusions
Upcoming 21cm line-intensity mapping observations will determine the large-scale distribution of atomic neutral hydrogen H I , which is a very sensitive probe of the local PNG parameter f nl . These future data have the potential to tighten up the current best constraint on f nl using CMB data, and consequently, to improve on our knowledge of the physics of the early Universe and inflation. The main observational signatures appear in the power spectrum and bispectrum of the H I distribution and are proportional to terms like b 1 b φ f nl and b 2 1 b φδ f nl , where b 1 , b φ and b φδ are three bias parameters that describe how the H I energy density depends on large-scale mass overdensities δ m and gravitational potential perturbations f nl φ (cf. Eqs. (1.2) and (1.3)). Naturally, given the strong degeneracy with f nl , accurate and precise theoretical priors on the expected values of these parameters are of the utmost importance to infer the numerical value of f nl from the data. Our main goal in this paper was to take the first steps towards estimating these bias parameters using hydrodynamical simulations of cosmic structure formation.
Concretely, in this paper we coupled separate universe simulations of the IllustrisTNG galaxy formation model with the analytical KMT model of the split of neutral hydrogen into atomic (H I ) and molecular (H 2 ) (cf. Sec. 2.2) to predict the b φ (b 1 ) and b φδ (b 1 ) bias parameter relations of the H I . We used simulation data obtained at two numerical resolutions called TNG300-2 (L box = 205 Mpc/h, N p = 2 × 1250 3 ) and TNG100-1.5 (L box = 75 Mpc/h, N p = 2 × 1250 3 ), and discussed our numerical findings with the aid of the halo model by inspecting the behavior of the H I content of dark matter halos. Our main results can summarized as follows: • The values of b φ and b φδ grow with redshift and are negative at z 1 and z 2, respectively (cf. Tab. 1, and Figs. 2 and 5). Our two numerical resolutions agree also relatively well, despite some visible quantitative differences. 7 To perform this calculation we evaluate b h 1 using the fitting formulae of Ref. [71] and take b h φδ to be the dotted line on the right panel of Fig. 5. All other ingredients in Eq. (3.3) are directly measured from the simulations.
• The popular universality expressions for the b φ (b 1 ) and b φδ (b 1 ) relations overpredict our simulation measurements (cf. Figs. 2 and 5), indicating that the most common calculation found in the literature may currently overestimate the values of b φ and b φδ .
• The lower values w.r.t. the universality expressions are in part due to the negative response R HI φ of M HI (M h ) to f nl φ perturbations at z 1 (cf. Fig. 3), i.e., halos become generically H I poorer inside these perturbations. We discussed how this can be explained by the fact that these perturbations accelerate the conversion of hydrogen to heavier elements by star formation, and enhance the effects of feedback that ejects the gas out to regions where it is more easily ionized.
We note that although the impact of astrophysical uncertainties on f nl constraints with line-intensity mapping data has been investigated in past works [14,16,17], this was done only through parametrizations of the gas-to-halo connection, i.e. M HI (M h ) for the case of H I . Our finding here that response functions like R HI φ and R Metals φ can be generically nonzero introduces the environmental dependence of the gas-to-halo connection as an additional, new astrophysical ingredient whose impact on f nl constraints needs to be understood as well. This can be straightforwardly achieved already with the aid of Eqs. (3.1) and (3.3), together with parametrizations for response functions like R HI φ [49]. Taken at face value, is our finding that the universality-based calculations overpredict the b φ (b 1 ) and b φδ (b 1 ) relations in our simulations good or bad news for f nl constraints using H I data? This is hard to answer generically since it depends on the survey setup and redshift range considered. For example, focusing just on the case of b φ (b 1 ) for power spectrum analyses (cf. Fig. 2), for b 1 < 1, which corresponds roughly to z 0.5−1, the universality relation is negative, and since our b φ measurements are below it, they are larger in absolute value. This means the H I distribution is actually more sensitive to the effects of f nl . Conversely, however, starting from b 1 > 1, which corresponds roughly to z 0.5 − 1, the universality relation becomes positive, and so by assuming it, one overestimates the values of b φ , and consequently, the true constraining power of the data on f nl . One may argue that most of this constraining power comes from high redshift, where one can access the large distance scales where f nl contributes most sizeably, but these are also the scales that are expected to be more affected by foreground subtraction systematics. The takeaway is that there is strong motivation to revisit the impact of b φ (b 1 ) assumptions in existing forecast studies of f nl using H I data. Doing so will also serve the purpose to quantify the accuracy and precision with which future simulation-based works need to determine the local PNG H I bias parameters in order for line-intensity mapping surveys to meet a certain target constraining power on f nl (see Refs. [33,35] for discussions).
We emphasize that the modeling of the H I in cosmological hydrodynamical simulations is a field that is still under development and therefore with still many associated uncertainties. The results and discussion in this paper provide already very useful intuition and guidance for future works on the subject, but having been obtained at fixed galaxy formation model and H I modeling strategy, they should be regarded as just the first step towards more robust theoretical predictions for the b φ and b φδ parameters of the H I . The importance of these bias parameters in observational searches of local PNG strongly motivates giving continuation to the work started here, including with investigations of the sensitivity of the bias parameters to the assumed galaxy formation physics, and the exploration of more sophisticated H I modeling strategies.