
UNDERSTANDING THE FORMATION AND EVOLUTION OF INTERSTELLAR ICES: A BAYESIAN APPROACH

Antonios Makrymallis and Serena Viti

Published 2014 September 23. © 2014. The American Astronomical Society. All rights reserved.

Citation: Antonios Makrymallis and Serena Viti 2014 ApJ 794 45. DOI: 10.1088/0004-637X/794/1/45

ABSTRACT

Understanding the physical conditions of dark molecular clouds and star-forming regions is an inverse problem subject to complicated chemistry that varies nonlinearly with both time and the physical environment. In this paper, we apply a Bayesian approach based on a Markov chain Monte Carlo (MCMC) method to the nonlinear inverse problems encountered in astrochemical modeling. We use observations of ice and gas species in dark molecular clouds and a time-dependent, gas–grain chemical model to infer the values of the physical and chemical parameters that characterize quiescent regions of molecular clouds. We show evidence that in high-dimensional problems, MCMC algorithms provide a more efficient and complete solution than more classical strategies. The results of our MCMC method enable us to derive statistical estimates and uncertainties for the physical parameters of interest as a result of the Bayesian treatment.


1. INTRODUCTION

Molecular clouds are regions where extinction by dust is high (A_V ≳ 5 mag), temperatures are low (∼10 K), densities are high (nH ⩾ 10³ cm⁻³), and most of the gas is molecular. They contain higher density (>10⁴ cm⁻³) structures (Myers & Benson 1983), some of which may become gravitationally unstable and initiate the early stages of star formation. Understanding the life cycle of dark molecular clouds is very important for comprehending star formation and for gaining insight into the processes of the interstellar medium and, by extension, galaxy formation. Molecules provide a paramount tool for the analysis of the chemical and physical conditions of star-forming regions. Every stellar or planetary evolutionary stage is characterized by a chemical composition that reflects the physical processes of its phase.

In the dense cores of molecular clouds, molecules and atoms that were previously in the gas phase deplete onto the dust grains. For each atom or molecule, freeze-out (or depletion) depends on a complicated time-dependent, nonlinear chemistry that is strongly coupled to the physical environment. It is difficult to quantify depletion observationally (e.g., Christie et al. 2012). CO emission can be used to infer the fraction of the species that is in the form of icy mantles by taking the ratio of the observed CO to the abundance expected at a particular density in steady state, had freeze-out not occurred (e.g., Caselli et al. 1999). However, this not only assumes that the cores are in steady state, but also requires knowledge of the H2 density, as well as of the efficiency of the nonthermal desorption mechanisms that can return the depleted CO to the gas. Moreover, the CO depletion factor is not necessarily equivalent to the molecular gas depletion factor, because different species freeze and desorb at different rates with different sticking coefficients, which are mostly unknown.

The detection of water ice mantles in cold dark interstellar clouds and star-forming regions (Öberg et al. 2011) provides direct evidence that surface reactions on dust grains involving oxygen atoms form water molecules, which are then retained on the surface as water ice. Not all species undergo surface reactions when they stick to dust grains. For example, CO sticks efficiently to surfaces at temperatures below ∼25 K and is found to be abundant in the ices. Some of this CO can be converted to other species.

The relatively high abundances of CO2, CH3OH, and H2CO in ices relative to H2O in some clouds (Öberg et al. 2011; Whittet et al. 2011) indeed suggest that some processing of CO into these products is occurring, possibly due to irradiation by cosmic rays or by photons generated by cosmic rays inside the cloud. H2CO and CH3OH are successive stages in the surface hydrogenation of CO,

CO → HCO → H2CO → CH3O → CH3OH (successive H additions),

and, similarly, CO2 can be the result of the oxygenation of CO:

CO + O → CO2.

Some ices can be thermally returned to the gas phase when the gas temperature is higher than 20 K. At low gas temperatures, nonthermal desorption processes can also return molecules from the solid to the gas phase (e.g., Roberts et al. 2007). However, these mechanisms "compete" with those of freeze-out. The composition of the icy mantles is clearly time-dependent and highly sensitive to the initial conditions of the gas in any particular cloud. Hence, the ices on dust grain surfaces are of mixed composition and may reflect the local conditions and evolutionary history. In some dark molecular clouds, the ices are abundant, indicating that nonthermal desorption mechanisms may not be very efficient everywhere. The potential interconnection and linear or nonlinear correlation of these parameters with each other, or with additional unknown parameters, compounds the difficulty of determining and specifying the parameter network. The large parameter space, in combination with the number of parameters and the complexity of the physical system, makes parameter estimation highly challenging.

Increasingly detailed observations of molecular clouds and star-forming regions enable us to identify some of the most important processes at work. Chemical and radiative transfer models can transform molecular observations into powerful diagnostics of the evolution and distribution of the molecular gas. However, the results of these models depend on a number of parameters, or groups of parameters, that are usually poorly constrained. Moreover, deriving information about molecular clouds from observational information, even with well-established modeling codes, is an inverse problem that usually does not fulfill Hadamard's (1902) postulates of well-posedness: a solution may not exist, solutions may not be unique, and/or they may not depend continuously on the observational data. The first and second postulates simply state that for a well-posed problem a solution should exist and be unique. The third postulate holds that small changes in the observational data result in small changes in the solution. As shown later in Section 3.1, in typical astrochemical problems, only the first postulate holds, and we usually have to deal with nonlinear ill-posed inverse problems.

Employing sampling algorithms is a traditional approach to tackling inverse problems with large parameter spaces in many scientific fields. Bayesian statistical techniques and Monte Carlo sampling methods such as Markov chain Monte Carlo (MCMC) algorithms and nested sampling have flourished over the past decade in astrophysical data analysis (Christensen & Meyer 2000; Ford 2005; Fitzgerald et al. 2007; Feroz & Hobson 2008; Isella et al. 2009). A summary of a typical MCMC method and an application to quantifying uncertainty in stellar parameters using stellar codes is given by Bazot et al. (2012). To our knowledge, MCMC methods have never been applied in the framework of parameter estimation through astrochemical modeling. In this paper, we present a first astrochemical application combining gas–grain chemical modeling, molecular abundances, and a Bayesian statistical approach based on MCMC methodology.

The motivation of the present paper is to solve the inverse problem of deriving the physical conditions in interstellar molecular clouds; in particular, the gas density, cosmic ray ionization rate, radiation field, the rate of collapse, the freeze-out rate, and nonthermal desorption efficiency. In Section 2, we formulate a typical inverse problem for interstellar molecular clouds and describe the Bayesian method and the Metropolis–Hastings (MH) algorithm (an example of a wider class of MCMC techniques). In Section 3, we discuss the statistical results and the astrophysical consequences. Finally, in Section 4, we present our conclusions.

2. PARAMETER ESTIMATION

In this paper, we are interested in dense, cold, quiescent regions of molecular clouds where atoms and molecules in the gas phase freeze out onto the dust grains. The observed quantities are molecular abundances for solid- and gas-phase species. The parameters we want to estimate are the cloud density, nH, the cosmic ray ionization rate, ζ, the radiation field, G, the cloud collapse rate, Cf, the freeze-out rate, fr, and three nonthermal desorption efficiencies, ε, ϕ, and y, presented in Section 2.3. Due to the nature of the addressed inverse problem, the theoretical and modeled relationship between the parameters and the observed data is highly nonlinear. Therefore, we anticipate several degeneracies, as well as a multi-modal and non-Gaussian joint parameter distribution. Moreover, the parameters are not uniquely related to the observations. While the forward problem has (in deterministic physics) a unique solution, the inverse problem does not: different combinations of parameters can produce the same abundances. Furthermore, there are too many possible combinations of parameters to permit an exhaustive search.

Traditional approaches to tackling inverse problems of this nature fail to cope with these kinds of issues. Methods based on iteratively minimizing an appropriate distance, such as the χ² error, can become stuck in local minima and give degenerate solutions. Alternative approaches to finding a global solution, such as simulated annealing, would have some benefits; but since we are not just looking for the global optimum of our target distribution, the most comprehensive view is obtained by a Bayesian Monte Carlo sampling method. We selected the Bayesian MCMC approach over other methods that work equally well with complex and multi-modal target distributions (e.g., nested sampling), since MCMC constitutes a benchmark algorithm for Monte Carlo sampling and parameter estimation problems.

To overcome the challenges of an ill-posed nonlinear inverse problem, we adopted a Bayesian approach based on the MH algorithm. The Bayesian framework for inverse problems is based on the systematic modeling of all errors and uncertainties from the Bayesian viewpoint. This approach has much greater potential to solve difficult inverse problems with high noise levels and serious model uncertainties, and it also allows prior information to be incorporated. The Bayesian solution is the whole posterior distribution of the parameters; therefore, there is not one solution, but a set of possible values. The advantage of using the MCMC approach is that there is no restriction concerning the nonlinearity of the model. Moreover, an appropriate tuning of the MCMC parameters allows the algorithm to explore all modes of the target distribution. Finally, even though an exhaustive search through the parameter space remains infeasible, MCMC methods can effectively explore the joint posterior distribution of the parameters, since model computations are concentrated around regions of interest in the parameter space.

2.1. Bayesian Inverse Problem

Our aim is to obtain information about the physical parameters of a molecular cloud, $ \boldsymbol{\theta }=(\theta _1,\theta _2,\ldots,\theta _k)$, given measured molecular abundances $ \boldsymbol{\mathcal {Y}}=(\mathcal {Y}_1,\mathcal {Y}_2, \ldots,\mathcal {Y}_n)$. These quantities are related by a (forward) function f(·) that represents the physical and chemical processes in the cloud. The main challenge is that there is no closed-form function, f, mapping the parameters to the observations, which could be inverted. However, given a set of parameters, estimated abundance values for the species of interest can be computed with astrochemical models, denoted here as $\mathcal {C}(\cdot)$. The problem addressed in our case is how to estimate $\boldsymbol{\theta }$ from

$\boldsymbol{\mathcal {Y}} = \mathcal {C}(\boldsymbol{\theta }) + \boldsymbol{\varepsilon } \qquad (1)$

and, according to Idier (2008), this constitutes an inverse problem. The error term, $\boldsymbol{\varepsilon }$, represents both the observational noise and the modeling error between $\mathcal {C}(\cdot)$ and f(·).

We treat $\boldsymbol{\mathcal {Y}}$, $\boldsymbol{\theta }$, and $\boldsymbol{\varepsilon }$ as random variables and define the solution of the inverse problem to be the posterior probability distribution (PPD) of the parameters given the observations. This allows us to model the noise via its statistical properties, even though we do not know the exact instance of the noise entering our data. We can also optionally specify a priori the form of solutions that we believe to be more likely, through a prior distribution. Thereby, we can attach weights to multiple solutions that explain the data. This is the Bayesian approach to inverse problems.

We assume that we have K parameters θk and N solid-phase observable quantities $\mathcal {Y}_n$. The error εn on each observation $\mathcal {Y}_n$ is assumed to be normally distributed with variance $\sigma _n^2$. In addition, the observational errors are assumed to be independent. The $\sigma _n^2$ is taken to correspond to the uncertainty on $\mathcal {Y}_n$, which is dictated solely by the observation. The probability density function of the errors is given by

$p(\varepsilon _n) = \frac{1}{\sqrt{2\pi }\,\sigma _n} \exp \left(-\frac{\varepsilon _n^2}{2\sigma _n^2}\right).$

Using Equation (1), we can define the likelihood function, $\mathbb {L}$, of the observations given a model parameterized by a set of parameters as

$\mathbb {L}(\boldsymbol{\mathcal {Y}}|\boldsymbol{\theta }) = \prod _{n=1}^{N} \frac{1}{\sqrt{2\pi }\,\sigma _n} \exp \left(-\frac{(\mathcal {Y}_n - \mathcal {C}_n(\boldsymbol{\theta }))^2}{2\sigma _n^2}\right).$

In case any prior information about the unknown parameters is available, the Bayesian approach allows for this information to be taken into account. This information can be integrated through a prior probability distribution on the parameters, say $\pi (\boldsymbol{\theta })$. Then, a parameter estimation can be performed through the PPD, using Bayes' rule:

$\pi (\boldsymbol{\theta }|\boldsymbol{\mathcal {Y}}) = \frac{\mathbb {L}(\boldsymbol{\mathcal {Y}}|\boldsymbol{\theta })\,\pi (\boldsymbol{\theta })}{\int \mathbb {L}(\boldsymbol{\mathcal {Y}}|\boldsymbol{\theta }^{\prime })\,\pi (\boldsymbol{\theta }^{\prime })\,d\boldsymbol{\theta }^{\prime }} \qquad (2)$

The PPD expresses our uncertainty about the parameters after considering the observations and any prior information. The denominator is simply a normalization factor.
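To make Equations (1) and (2) concrete, the sketch below assembles an unnormalized log-PPD in code. The function `ucl_chem` is a toy stand-in for the real $\mathcal {C}(\cdot)$ code (its form is our invention, purely for illustration), and the uniform prior mirrors the definition domains of Table 1; since the denominator of Equation (2) is a constant, it can be dropped for sampling purposes.

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.normal(size=(4, 9))  # fixed random coefficients for the toy model

def ucl_chem(theta):
    """Toy stand-in for the chemical code C(.): some nonlinear map from the
    nine parameters to four predicted solid-phase fractional abundances."""
    return 1e-5 * (1.0 + np.tanh(A @ theta))

def log_likelihood(theta, y_obs, sigma):
    """Gaussian log-likelihood with independent errors (Section 2.1)."""
    r = (y_obs - ucl_chem(theta)) / sigma
    return -0.5 * np.sum(r**2) - np.sum(np.log(np.sqrt(2.0 * np.pi) * sigma))

def log_prior(theta, lo, hi):
    """Uniform (noninformative) prior on the definition domain of Table 1."""
    return 0.0 if np.all((theta >= lo) & (theta <= hi)) else -np.inf

def log_posterior(theta, y_obs, sigma, lo, hi):
    """Equation (2) in log form, up to the constant normalization."""
    lp = log_prior(theta, lo, hi)
    return lp if not np.isfinite(lp) else lp + log_likelihood(theta, y_obs, sigma)
```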

In reality, we cannot access the whole PPD directly, so computing parameter estimates or uncertainties is a hard task. MCMC methods allow us to sample from complex probability distributions efficiently and thereby approximate complex probability densities.

2.2. Markov Chain Monte Carlo

MCMC methods are a powerful class of algorithms that produce random samples distributed according to the distribution of interest. The importance and efficiency of MCMC methods lie in the fact that these samples can be used to approximate the probability density of the distribution by calculating it for only a feasible number of parameter values. Among the several possible implementations, we employ an MH sampling algorithm (Gilks et al. 1995). The MH algorithm enables us to explore the parameter space and efficiently approximate the PPD. A theoretical introduction to MCMC and MH is beyond the scope of this paper; however, in Appendix A, we briefly describe the MH algorithm and how MCMC is employed for parameter estimation in our case. Note that the tuning of the MH algorithm, as described in Appendix A, is crucial when aiming to approximate a possibly multi-modal and non-Gaussian distribution, as is the case in this study.

2.3. Parameter Space

The chemical modeling code used in this paper, denoted as $\mathcal {C}(\cdot)$ in Equation (1), is the UCL_CHEM time-dependent gas–grain chemical code (Viti et al. 2004), briefly described in Appendix B and references therein. Note that for each set of parameters, $\mathcal {C}(\cdot)$ provides us with a time series of chemical abundances. We choose to extract the chemical abundances of interest at the time points when the final density is reached and the cloud collapse has finished. Even though we ignore the previous time points, the time dependency is still taken into account and investigated through the exploration of different final density values.

The parameters of our chemical modeling code create a nine-dimensional (9D) parameter space for molecular clouds, as used in our MH algorithm and described in Table 1.

In a first attempt to employ a Bayesian approach for deriving branching ratios for poorly understood chemical reaction pathways, we also investigated the parameter r, which controls how much of the gas-phase oxygen turns into ice H2O or ice OH. Parameter r reflects the percentage of O that turns into H2O, so that 1 − r reflects the percentage of oxygen that turns into OH. Desorption efficiencies resulting from H2 formation on grains, direct cosmic ray heating, and cosmic-ray-induced photodesorption are determined by the parameters ε, ϕ, and y, as introduced and studied by Roberts et al. (2007). The freeze-out parameter, fr, in our code is effectively the sticking coefficient, a number in the range 0%–100% that adjusts the rate per unit volume at which species deplete onto the grains. For the free-fall collapse to a particular nH, we used the modified formula of Rawlings et al. (1992), where the parameter Cf is considered to be a retardation factor with a value less than one, to roughly mimic magnetic and/or rotational support, or an acceleration factor with a value greater than one, to simulate a collapse faster than free fall (e.g., due to external pressure); the functional form we have in mind is sketched below. Table 1 lists the set of physical parameters studied in this paper along with their definition domain, $\mathbb {D}_{{\theta _k}}$. The joint definition domain, $\mathbb {D}_{{\theta }}$, represents the parameter space to explore. The selected domain limits cover the theoretical range of possible values for molecular clouds where atoms and molecules deplete onto the dust, ensuring that extreme values are included.
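For orientation, we sketch the modified free-fall law referred to above, quoting the Rawlings et al. (1992) parameterization from memory with Cf inserted as the retardation/acceleration factor; the exact normalization should be checked against the original paper, and G here denotes the gravitational constant, not the radiation field:

$\frac{dn_{\rm H}}{dt} = C_f \left(\frac{n_{\rm H}^4}{n_0}\right)^{1/3} \left\lbrace 24\pi G m_{\rm H} n_0 \left[\left(\frac{n_{\rm H}}{n_0}\right)^{1/3} - 1\right]\right\rbrace ^{1/2},$

where $n_0$ is the initial gas density; Cf = 1 recovers free fall, Cf < 1 mimics retarded collapse, and Cf > 1 mimics accelerated collapse.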

Table 1. Parameter Definition Domain

Parameter   Unit                          Definition Domain $\mathbb {D}_{{\theta }}$
ζ           10⁻¹⁷ s⁻¹                     1–10
G           Habing                        1–10
nH          cm⁻³                          10⁴–10⁸
fr          ...                           0%–100%
Cf          ...                           0.5–3
ε           Yield per H2 formed           0.01–1
ϕ           Yield per cosmic ray impact   10²–10⁶
y           Yield per photon              10⁻³–10²
r           ...                           0%–100%


2.4. Observational Constraints

The observational constraints of our analysis are based on data from the existing literature. Even though in this application we are primarily interested in ices, we include both gas-phase and solid-phase observations. To avoid confusion, we will use $\boldsymbol{\mathcal {Y}}$ to denote a vector containing any observed quantity and, if required, we will specify whether we refer to solid-phase or gas-phase observations.

The solid-phase observations include column densities and visual extinction data for molecular clouds in front of field stars. Such sources often provide suitable opportunities to observe and study ices in quiescent regions of the clouds (e.g., Boogert et al. 2011). We used 31 observations of H2O, CH3OH, CO, and CO2 from 31 different regions of 16 different clouds found in the literature and summarized by Whittet et al. (2011). The data suggest some abundance variation, which was attributed to different evolutionary stages of different clouds. The scope of this paper is not to study the behavior of a specific cloud, but rather to obtain statistical insight into the dynamics of common cloud classes. Therefore, the observational data are transformed into fractional abundances with respect to total H nuclei, and the average value is then computed and used for our analysis.

In an attempt to minimize degeneracies, we introduce additional gas-phase abundances as an optional observational constraint. Due to the ill-posed nature of our problem, it is possible for our chemical model to end up with a solution space that fits the solid-phase observations perfectly, but with gas-phase abundances that are far from realistic. Hence, the addition of gas-phase observations can be considered a mathematical regularization through the introduction of additional prior information, which can be naturally integrated into our Bayesian approach. The gas species observations were collected from more than one study, attempting to match the clouds, regions, or evolutionary stage of the observational sources used for the solid-phase species. If we were to fit observations of a particular source, then, ideally, every observational gas-phase constraint should contribute to the regularization of our methodology. However, because here we are only attempting to explore a methodology, we found that three gas-phase species were adequate to provide insight into the efficiency of gas-phase species as a regularization factor. Abundances for NH3 and N2H+ were collected from Johnstone et al. (2010), and for HCO+ from Schöier et al. (2002). The gas-phase observations are in the form of fractional abundances with respect to total hydrogen nuclei. Table 2 lists the average molecular abundances for all the species along with their uncertainties. We emphasize again that the error on each observation $\mathcal {Y}_n$ is assumed to be normally distributed with a variance, $\sigma _n^2$, determined solely by the uncertainty reported in Table 2.

Table 2. Observational Constraints (Average Fractional Abundances)

Solid-phase Species              Gas-phase Species
H2O      7.47 ± 1.81             NH3      3.10 ± 2.24
CH3OH    0.23 ± 0.13             N2H+     0.068 ± 0.049
CO       1.14 ± 0.84             HCO+     0.20 ± 0.01
CO2      1.89 ± 0.79

Notes. The fractional abundances are with respect to H nuclei. Solid-phase abundances are in units of 10⁻⁵; gas-phase abundances are in units of 10⁻⁸.


2.5. Priors

We ran two sets of eight MCMC chains each, identical except for the prior distribution information. For the first set, the prior information is noninformative and takes the form of an acceptable range of possible values; therefore, $\pi (\boldsymbol{\theta })$ is uniformly distributed on $\mathbb {D}_{{\theta }}$, as listed in Table 1. In this case, the observational data, $\boldsymbol{\mathcal {Y}}$, refer only to the solid-phase molecular abundances, and the gas-phase species are ignored. In the second case, the prior information includes the observational constraints from the gas-phase species as well. Let $\boldsymbol{\mathcal {Y}}$ now include all of the observational constraints, $\boldsymbol{\mathcal {Y}}_s$ just the solid-phase, and $\boldsymbol{\mathcal {Y}}_g$ the gas-phase observational constraints. In that case, the PPD is defined as

$\pi (\boldsymbol{\theta }|\boldsymbol{\mathcal {Y}}) \propto \mathbb {L}(\boldsymbol{\mathcal {Y}}_s|\boldsymbol{\theta })\,\pi (\boldsymbol{\theta }). \qquad (3)$

The prior information is simply the likelihood function, $\mathbb {L}(\cdot)$, of $\boldsymbol{\mathcal {Y}}_g$, given a model parameterized by $\boldsymbol{\theta }$, since

$\pi (\boldsymbol{\theta }) = \mathbb {L}(\boldsymbol{\mathcal {Y}}_g|\boldsymbol{\theta }).$

Including prior information in this way is equivalent to attaching weight to the solutions that explain the gas-phase as well as the solid-phase chemistry.
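In code, this prior choice simply adds a second Gaussian log-likelihood term for the gas-phase species to the solid-phase term. A minimal sketch, assuming a hypothetical `model` function that returns predicted abundances for both phases:

```python
import numpy as np

def gaussian_loglike(y_model, y_obs, sigma):
    # Independent Gaussian errors, as assumed in Section 2.1.
    r = (y_obs - y_model) / sigma
    return -0.5 * np.sum(r**2) - np.sum(np.log(np.sqrt(2.0 * np.pi) * sigma))

def log_ppd_informative(theta, model, obs):
    """Log of Equation (3): the solid-phase likelihood multiplied by a prior
    that is itself the gas-phase likelihood, pi(theta) = L(Y_g | theta)."""
    y_solid, y_gas = model(theta)  # hypothetical: returns both phases
    return (gaussian_loglike(y_solid, obs["y_solid"], obs["sig_solid"])
            + gaussian_loglike(y_gas, obs["y_gas"], obs["sig_gas"]))
```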

2.6. Blind Benchmark Test

In order to quantitatively investigate the effectiveness of our method for astrochemical problems, we performed a benchmark test. This test is essentially our Bayesian analysis applied to synthetic observations produced by UCL_CHEM using a predefined set of parameters, $\boldsymbol{\theta _T}$. Once we have our synthetic observations, we apply our methodology, analyze the results, and check whether the true parameters are recovered. Knowing the solution to this test a priori allows us not only to validate the method, but also to critically perceive the nonlinear and ill-posed nature of our problem; this discussion can be found in Section 3.1. The particular set of parameters was chosen at random and includes values that are not far from the expected parameters or from well-accepted values in the literature. The parameter values used in the test can be found in Table 3.

Table 3. Blind Benchmark Test

Parameter   Unit                          Test Value
ζ           10⁻¹⁷ s⁻¹                     2.4
G           Habing                        2.6
nH          cm⁻³                          10⁵
fr          ...                           42%
Cf          ...                           1.3
ε           Yield per H2 formed           0.02
ϕ           Yield per cosmic ray impact   150
y           Yield per photon              0.1
r           ...                           75%


3. RESULTS

When quoting parameter estimation results, and especially multivariate results, it is convenient to reduce the parameter space to posterior intervals of single marginalized parameters. Figure 1 shows the nine one-dimensional (1D) marginalized PPDs of the parameters for the benchmark test using a uniform prior. In Figure 2, we present the nine 1D marginalized PPDs of the parameters and their 68% high density regions (HDRs), recovered from the uniform prior case. Figure 3 presents the same results for the informative prior case. HDRs indicate the parameter space where the probability density is highest. We refer readers seeking more details about marginal posterior probability functions and HDRs to Appendix C and references therein. In order to compare the two prior cases and quantify the level of constraint for each parameter, we introduce a measure of parameter constraint, the high density spread (HDS), which is defined as follows.

Figure 1. One-dimensional marginalized PPD for each of the nine parameters for the blind benchmark test. The plots show the Gaussian kernel density estimator of each probability density function. Dashed lines indicate the predefined parameter values $\boldsymbol{\theta _T}$ we wish to recover.

Figure 2. One-dimensional marginalized PPD for each of the nine parameters using the uniform noninformative prior. The plots show the Gaussian kernel density estimator of each probability density function. The darker regions indicate the 68% HDR.

Figure 3. One-dimensional marginalized PPD for each of the nine parameters using the informative prior from gas-phase chemistry. The plots show the Gaussian kernel density estimator of each probability density function. The darker regions indicate the 68% HDR.

Let |HDR| be the total width of an HDR of the density function of a parameter θk with definition domain $\mathbb {D}_{{\theta _k}}$, and let $|\mathbb {D}_{{\theta _k}}|$ be the width of that domain. Width is defined with respect to some simple measure, such as the Lebesgue measure (Lebesgue 1902). Then, the HDS is defined as

$\mathrm{HDS} = \frac{|\mathrm{HDR}|}{|\mathbb {D}_{{\theta _k}}|}.$

The HDS ratio can be perceived as an index of the level of uncertainty on a predefined definition domain: the higher it is, the less constrained a parameter is. Table 4 presents the HDS of each parameter for both priors used. Figure 4 shows the two-dimensional (2D) marginal PPDs for parameters that present statistical interest. Finally, Table 5 lists the statistical mean and standard deviation for the ∼35% HDR of the joint distribution for all nine parameters. The general statistical picture we get from Figures 2 and 3 shows that the distributions of all the parameters are far from Gaussian and most of them have more than one mode. Looking at the modes in physical units, we can also note that most of the density lies away from the limits of our definition domain in both cases, which validates our choice of $\mathbb {D}_{{\theta }}$.
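To illustrate how HDS can be extracted from MCMC output, the sketch below estimates a 68% HDR on a grid from a Gaussian kernel density estimate (the same estimator used for Figures 1-3) and divides its total width by the width of the domain. The bimodal chain at the bottom is synthetic, purely for demonstration:

```python
import numpy as np
from scipy.stats import gaussian_kde

def hds(samples, lo, hi, mass=0.68, grid_size=2048):
    """High density spread: |68% HDR| / |definition domain|."""
    grid = np.linspace(lo, hi, grid_size)
    dens = gaussian_kde(samples)(grid)
    cell = (hi - lo) / (grid_size - 1)
    # Fill grid cells from highest density downward until `mass` is covered;
    # the cells kept form the (possibly disjoint) HDR.
    order = np.argsort(dens)[::-1]
    cum = np.cumsum(dens[order]) * cell
    n_cells = np.searchsorted(cum, mass) + 1
    return n_cells * cell / (hi - lo)

# Synthetic bimodal chain: a concentrated density gives a small HDS.
rng = np.random.default_rng(1)
chain = np.concatenate([rng.normal(2.0, 0.3, 5000), rng.normal(7.0, 0.5, 3000)])
print(f"HDS = {100 * hds(chain, 0.0, 10.0):.0f}%")
```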

Figure 4. Two-dimensional marginalized posterior probability density functions. Warmer colors indicate higher probability density.

Table 4. High Density Spread

Parameter   HDS, Noninformative Prior (%)   HDS, Informative Prior (%)
ζ           48                              36
G           46                              30
nH          16                              9
fr          38                              28
Cf          45                              35
ε           50                              42
ϕ           33                              28
y           38                              33
r           43                              32

Note. The lower the value of HDS, the more constrained the parameter is.


Table 5. Mean and Standard Deviation for the Most Probable Mode of the Joint Distribution

Parameter   Unit                          Mean Value
ζ           10⁻¹⁷ s⁻¹                     8.39 (± 2.8)
G           Habing                        1.79 (± 1.27)
nH          cm⁻³                          4.07 (± 2.34) × 10⁴
fr          ...                           31 (± 21)%
Cf          ...                           1.18 (± 0.9)
ε           Yield per H2 formed           0.52 (± 0.35)
ϕ           Yield per cosmic ray impact   2.78 (± 1.12) × 10⁶
y           Yield per photon              3.35 (± 2.27) × 10⁻³
r           ...                           56 (± 23)%

Note. The mode corresponds to ∼35% HDR of the total joint distribution.


3.1. Blind Benchmark Test Results

The results of the test, as shown in Figure 1, reveal two important insights. First, the high probability density regions for all the parameters include, and hence recover, the true parameters. As we can see in Figure 1, all of the predefined parameter values lie under, or very close to, the highest density point of the marginal PPD. This result validates that the Bayesian approach makes accurate inferences based on the given observations and that the MH algorithm efficiently samples the solution space. Second, we can observe that, in many cases, there are additional high probability density regions. These regions highlight the ill-posed nature of our problem by indicating that different parameter sets can produce similar observations. Combining the two insights, we conclude that the Bayesian method with MCMC sampling efficiently explores the parameter space, revealing the solution regions that answer our ill-posed inverse problem. In addition, we conclude that, in order to constrain our solution space, we should introduce either numerical regularization factors (e.g., gas-phase species) or scientific prior knowledge.

3.2. Influence of Priors

A visual comparison of Figures 2 and 3 reveals what we can quantitatively observe in Table 4. With the noninformative uniform prior, the HDRs cover large sections of the distribution, in some cases reaching 50% of the definition domain. This means that most of the parameters are not constrained enough. The most statistically straightforward parameter is clearly nH, followed by ϕ and fr, which present distinct modes and relatively low HDS. G and ζ seem neither constrained nor relevant enough, while fr has a clear mode followed by a very heavy tail. The remaining parameters present high HDS, above 40%, with several disjoint HDRs, and do not allow us to reach credible conclusions. Including the prior information from the gas-phase species significantly changes the picture, as can be seen in both Figure 3 and Table 4. The HDRs become smaller and the parameters more constrained. The distribution of ζ is now denser around high values (>6), while G has to be low (<4). The nH remains well-constrained with even lower HDS, while the distribution of fr now clearly constrains the parameter to low domain values. The distribution of Cf is also significantly altered: not only has the HDS dropped, but a large portion of the density has also shifted from highly accelerated collapse regions to free-fall collapse regions. The nonthermal desorption mechanisms still present multi-modal behavior, but with significantly smaller HDRs; their distributions clearly highlight the nonlinear way these mechanisms act with, or against, each other. For r, the addition of informative prior information reduces the HDS as well, centralizing the density, but still slightly favoring the production of H2O over OH. Therefore, we conclude that the addition of gas-phase species as a regularization factor outperforms the use of a noninformative uniform prior distribution alone. The HDS is reduced by an average of ∼12%, which indicates an equivalent constraint on the parameter space. Section 3.3 discusses the statistical and numerical results of our analysis, while Section 3.4 discusses the astrophysical implications; in both, we concentrate only on the results of the informative prior case.

3.3. High Density Regions

The nH is clearly the most constrained physical parameter; the marginal density function reveals that most of the density lies between 2.2 × 10⁴ and 5 × 10⁴ cm⁻³. The ζ is constrained to values higher than 6 × 10⁻¹⁷ s⁻¹, while the G is constrained to values lower than 4 Habing. The HDR for fr stays between 20% and 45%, while Cf has one distinct HDR between 0.5 and 1.55 and one long heavy tail between two and three times the default free-fall rate. The ε presents two modes: the first HDR is between 0.4 and 0.8, and the second between 1.2 and 1.4. The marginal distribution of ϕ also presents two modes, one centered around 10⁵ and the other centered around 60. The marginal distribution of y presents two disjoint HDRs as well: the first indicates really low efficiency, of about 10⁻⁶, while the second indicates a slightly higher 2 × 10⁻³–8 × 10⁻². Finally, the distribution of the branching ratio parameter, r, shows high density between 40% and 70% of oxygen turning into ice water.

In Figure 4, we show the marginalized 2D PPDs for our parameters. Note that nH and fr are negatively dependent in a nearly linear way. On the other hand, G and nH seem to have a nonlinear positive correlation, hitting a plateau after a certain gas density. Similarly, fr and Cf may have a clear peak, but also show some evidence of a positive correlation. The relation between the cosmic ray desorption efficiency parameters, ϕ and y, reveals many distinct peaks throughout the domain space. Note that the marginalized PPD for the cosmic ray ionization rate and the parameter ϕ shows a clear bimodal structure; focusing only on the denser areas of the distribution, we can observe a potential nonlinear correlation between the cosmic rays and the efficiency of the cosmic-ray-related parameter ϕ. In general, however, ζ is evidently a parameter that is not sufficiently constrained. This is already obvious from the 1D marginal distribution of ζ, but the contrast in constraint between ζ and one of the most constrained parameters, such as nH, is depicted in Figure 4(f).

Due to the nonuniqueness of our solution space, examining the joint PPD provides useful insight. The dimensionality of the distribution makes visualization impossible; thus, we chose to extract the statistical mean and the standard deviation of each parameter from the most probable mode of the joint distribution. The joint distribution was approximated using a multivariate histogram, and the most probable mode was chosen in a heuristic way; it corresponds to ∼35% HDR of the whole PPD. The values for the mean and standard deviation are given in Table 5; a sketch of this procedure is given below. As expected, the most probable mode of the joint PPD agrees with the HDRs of the marginal parameter distributions. For the unimodal 1D marginalized distributions, the most probable mode coincides completely, while, for the multi-modal cases, it coincides with one of the modes. Hence, purely on the statistical interpretation, we conclude that a molecular cloud that matches the observed abundances should have a low nH, a low fr, and a low G. The ζ, on the other hand, is more likely to have high values, but the high standard deviation leaves room for significant variation. The collapse of the cloud may be slightly accelerated, while the branching ratio, r, slightly favors the branching into water, but with a high standard deviation. In terms of the nonthermal desorption efficiency parameters, we notice increased efficiency for all three. As a general result, we conclude that the 9D space of the joint distribution has multiple peaks. Both the marginalized distributions and the densest peak of the joint distribution indicate that some of the parameters (nH, G, fr, Cf) are well-constrained, while other parameters (ζ, r, ε, ϕ, y) present possible variation that implies further astrophysical or statistical implications.
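A sketch of the heuristic just described: approximate the joint PPD with a multivariate histogram of the chain, locate the densest bin, and take the mean and standard deviation of the samples falling in its neighborhood. The bin count and the one-bin padding are our illustrative choices, not necessarily those behind Table 5:

```python
import numpy as np

def most_probable_mode(chain, bins=4, pad=1):
    """chain: (n_samples, n_params) MCMC samples of the joint PPD.
    Note that the histogram has bins**n_params cells, so keep `bins` small."""
    hist, edges = np.histogramdd(chain, bins=bins)
    peak = np.unravel_index(np.argmax(hist), hist.shape)
    # Select samples inside the peak bin, padded by `pad` bins per axis.
    mask = np.ones(len(chain), dtype=bool)
    for k, e in enumerate(edges):
        lo = e[max(peak[k] - pad, 0)]
        hi = e[min(peak[k] + 1 + pad, len(e) - 1)]
        mask &= (chain[:, k] >= lo) & (chain[:, k] <= hi)
    sel = chain[mask]
    # Per-parameter mean, std, and the fraction of the samples captured.
    return sel.mean(axis=0), sel.std(axis=0), mask.mean()
```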

3.4. Astrophysical Consequences

Here, we discuss our results for each of the parameters with regard to their astrophysical implications.

nH. The derived credible intervals for the gas density are in very good agreement with the properties of typical collapsing dark clouds, clumps, and cores (Myers & Benson 1983; Benson & Myers 1989; Bacmann et al. 2002; Bergin & Tafalla 2007). Higher cloud densities (>10⁶ cm⁻³), which are usually expected in hot cores after the cloud has collapsed (van der Tak 2004), were explored, but showed nearly zero probability density in our analysis.

fr. Our study implies a depletion rate that is not high enough to dominate and is probably lower than 50%. Bacmann et al. (2002) suggest that freeze-out dominates when nH exceeds ∼3 × 10⁴ cm⁻³, which is marginally the case in our study. When freeze-out dominates and densities exceed ∼10⁵ cm⁻³, the abundance of CO ice is found to increase significantly toward typical gaseous values (∼10⁻⁴; Pontoppidan 2006; Bergin & Tafalla 2007). Furthermore, the ice water abundance is typically 5 × 10⁻⁵–9 × 10⁻⁵, and even higher at the highest densities (Pontoppidan et al. 2005). The ice CO and H2O abundances in our case, however, are about 0.5–1 dex lower. Therefore, along with the nH results, the lower freeze-out rate estimated by our analysis can be explained by a different evolutionary stage of the observed clouds. According to Fontani et al. (2012), low depletion values can also imply a cloud that is going to form less massive objects.

G. Our analysis shows that, in order to match the observed ice abundances, G must be comparable to the standard interstellar radiation field of 1 Draine, or ∼1.7 Habing (Draine 1978).

ζ. In dense gas, ζ is measured to be in the range 1–5 × 10⁻¹⁷ s⁻¹ (Bergin et al. 1999). However, considerable uncertainties have been reported in the literature, with derived values as high as 10⁻¹⁵ s⁻¹ attributed to X-rays from a central source (Doty et al. 2004). These discrepancies may depend on whether ζ is determined via H$_3^+$ or HCO+ measurements. However, Dalgarno (2006) argues that, given the latest evidence that the ζ range is narrow, between 10⁻¹⁶ and 10⁻¹⁵ s⁻¹, the question should focus not on why the estimates are different, but on why they are so similar. Our analysis confirms ζ values higher than the typical estimates and is even consistent with the 10⁻¹⁶ s⁻¹ estimates from the H$_3^+$ determination. Most importantly, our study indicates the high standard deviation of these values, highlighting that such variation should be expected. Theoretically, this is explained by the fact that cosmic rays lose energy while ionizing and exciting the gas through which they travel, in conjunction with the possible variation in the origin of the cosmic rays. Even though our astrochemical model does not account for the latter factors, our probabilistic approach reflects their impact.

Cf. Our study shows that the collapse of the cloud should follow the expected free-fall collapse. Higher Cf values present moderate probability density, which implies that the observational constraints could potentially also be matched by different, though less likely, sets of parameters (e.g., higher values for both Cf and fr).

ε, ϕ, y. The estimates for the desorption-from-H2-formation efficiency parameter (ε) are significantly higher than the value reported by Roberts et al. (2007) (ε < 0.1). The direct cosmic ray desorption efficiency (ϕ) presents two peak values: one agrees with Roberts et al. (2007) and is centered around 10⁵, while the second is centered around 60, which is lower than the lowest limit studied by Roberts et al. (2007). For the cosmic-ray-induced photodesorption efficiency (y), we also have two probable estimates: the first indicates really low efficiency, while the second presents a slightly higher efficiency that is still lower than the one estimated by Hartquist & Williams (1990) (y = 0.1), but consistent with the results of Öberg et al. (2009) for CO2. Our analysis, in general, indicates useful credible intervals for the nonthermal desorption efficiencies, highlighting that the reported nonlinearities can be tackled with further regularization factors, such as molecule-specific analysis and additional grain properties. Also note that our astrochemical model does not include direct UV photodesorption, which has recently been found to be efficient.

r. The branching ratio proved to be a parameter with high but anticipated variability. Its marginal probability distribution presents the most statistically normal behavior, with a mean that implies a shared branching ratio of oxygen freezing into ice H2O and ice OH, slightly favoring solid H2O. The first laboratory experiment to reproduce ice H2O formation (Dulieu et al. 2010) implied that the hydrogenation of oxygen is an important route for water formation. Furthermore, Cazaux et al. (2010) state that species such as OH are only transitory and quickly turn into ice water. However, they also state that ∼30% of the O arriving on the grain is released into the gas phase as OH, which can freeze back as H2O; this pathway is included in our model and can explain both the high water abundance on the grains and the shared branching ratio, r. Finally, the high branching ratio toward ice OH highlights the importance of ice OH for the production of ice CO2.

We now look at the correlations between our parameters, as presented in Figure 4. The relation between nH and depletion, or freeze-out, has been the subject of many studies (Bacmann et al. 2002; Christie et al. 2012; Fontani et al. 2012; Hocuk et al. 2014). They all conclude that the amount of depletion, the ice abundances, and the density of the cloud should all scale together (Rawlings et al. 1992). Even though our analysis suggests a clear anti-correlation between nH and fr (Figure 4(a)), this result is completely in line with the literature, since we are not analyzing the time evolution of the cloud, but instead focus on parameter fitting at specific time points. This negative correlation suggests that the lower the gas density, the higher the depletion needs to be in order to match the observed ice abundances. Our results are also in line with the negative correlation between the depletion factor and nH derived by Fontani et al. (2012) from CO observations. The positive correlation between nH and G depicted in Figure 4(b) confirms that the denser a cloud is, the higher the G values needed to match the observations. The plateau after a certain density value (∼7 × 10⁴ cm⁻³) indicates that the explored radiation field domain does not reach values high enough to penetrate the cloud beyond a density threshold. When Cf is increased, the freeze-out timescale decreases, since the final nH is reached more quickly. This reduced timescale requires higher fr values in order to reproduce the observed ice abundances, and this relation is depicted in Figure 4(c). Even though it is not very straightforward, the relation between the cosmic ray desorption efficiency parameters, ϕ and y, is very interesting (Figure 4(d)). In most cases, the desorption efficiency is either low or high for both the direct cosmic ray heating and the cosmic-ray-induced photodesorption cases. However, there is a significant peak where the direct cosmic ray impact is very efficient while the cosmic-ray-induced impact is very inefficient.

4. CONCLUSIONS

In this paper, we implemented a Bayesian MH parameter estimation analysis to solve a typical ill-posed inverse astrochemical problem. We have employed a chemical modeling code and solid-phase observations in order to get a holistic insight into the behavior of physical and chemical parameters that drive ice chemistry in dark molecular clouds. The main conclusions of this work are as follows.

  • 1.  
    The Bayesian method provides a systematic approach to solving nonlinear inverse problems with high noise levels and significant model uncertainties. The MCMC technique allows us to sample from complex probability distributions in an efficient way. As highlighted by our blind benchmark test, these methods successfully handle astrochemical ill-posed problems and reveal a more complete set of solution regions. By contrast, single-solution estimates derived from traditional approaches would not have provided a complete picture of the solution space and would have carried a high risk of degeneracy.
  • 2.  
    Our probabilistic approach to physical and chemical parameter estimation used a chemical network with deficiencies (especially for the grain part) and several assumptions. Nevertheless, the analysis both derived useful credible intervals and highlighted model deficiencies, implying even more promising results for tackling physical, chemical, and model uncertainties with up-to-date models and targeted astrophysical goals.
  • 3.  
    We confirm that the joint PPD of the solution space is highly nonlinear and multi-modal and that the 1D marginal PPDs of the parameters are far from Gaussian, highlighting the complexity of the problem.
  • 4.  
    Including abundances of gas-phase species as regularization factors, introduced as Bayesian priors, increases the parameter constraint efficiency by ∼12%. This result implies that observational regularization constraints can compensate for chemical code deficiencies. Increasing the number of gas-phase regularization factors should constrain the solution space even further.
  • 5.  
    We show that physical parameters, such as nH, G, and Cf, are highly constrained and their variation has a great impact on the derived ice abundances.
  • 6.  
    The high variation of ζ contradicts the theoretical ζ standard values in dense gas and indicates a larger credible interval instead.
  • 7.  
    Nonthermal desorption efficiencies act and counteract in a nonlinear way with each other or with ζ. This complex behavior should be analyzed with extra regularization factors.
  • 8.  
    Branching ratio parameters such as r can be successfully estimated through Bayesian MCMC methods. Our results, even though they show high variability, indicate that the detail or simplicity of the dust grain chemical network can be encapsulated and reflected as certainty or uncertainty, respectively.

This work is supported by the IMPACT fund. The authors also acknowledge STFC for computational support and thank the anonymous referee for useful suggestions that improved the paper.

APPENDIX A: MARKOV CHAIN MONTE CARLO

The MCMC framework uses a Markov chain to explore the parameter space and approximate the PPD. This chain consists of a series of states $\boldsymbol{\theta }^{(1)},\ldots,\boldsymbol{\theta }^{(t)}, \ldots, \boldsymbol{\theta }^{(T)}$, where the probability of $\boldsymbol{\theta }^{(t)}$ depends only on $\boldsymbol{\theta }^{(t-1)}$. MCMC methods require an algorithm for choosing states in the Markov chain in a random way. The MCMC sampler implemented for this paper is the MH algorithm, briefly outlined in the following pseudocode.

  • 1.  
    Select a starting point $\boldsymbol{\theta }^{(1)}$ from the parameter space. Then, for i = 2, 3, ... until convergence, repeat the following steps.
  • 2.  
    Propose a random set of parameters according to a proposal distribution, q, so that $\boldsymbol{\theta }^* \sim q(\boldsymbol{\theta }^*|\boldsymbol{\theta }^{(i-1)})$.
  • 3.  
    Calculate the posterior probability of the new parameters, $\pi (\boldsymbol{\theta }^*|\boldsymbol{\mathcal {Y}})$, using Equation (2).
  • 4.  
    Accept the new parameters with probability
    $a = \min \left(1, \frac{\pi (\boldsymbol{\theta }^*|\boldsymbol{\mathcal {Y}})\, q(\boldsymbol{\theta }^{(i-1)}|\boldsymbol{\theta }^*)}{\pi (\boldsymbol{\theta }^{(i-1)}|\boldsymbol{\mathcal {Y}})\, q(\boldsymbol{\theta }^*|\boldsymbol{\theta }^{(i-1)})}\right).$
  • 5.  
    Draw u ∼ Uniform(0, 1).
  • 6.  
    If u < a, then accept the proposal, $\boldsymbol{\theta }^{(i)} \leftarrow \boldsymbol{\theta }^*$; otherwise, reject the proposal and set $\boldsymbol{\theta }^{(i)} \leftarrow \boldsymbol{\theta }^{(i-1)}$.

The performance of the MH algorithm is highly dependent on the proposal distribution. The appropriate distribution should account for the complexity of the target distribution, but should still be computationally easy to draw samples from. In nonlinear problems such as ours, we expect a multi-modal, non-Gaussian target joint distribution. Non-Gaussianity is not a problem for MCMC algorithms. However, classical choices for the proposal distribution (i.e., a single Gaussian) can potentially prevent the MCMC from converging to the target distribution, since the transition of the chain from one mode to another becomes unlikely. In our specific case, following similar former choices (e.g., Andrieu & Doucet 1999; Bazot et al. 2012) and taking into account the characteristics of the expected target distribution, we chose the proposal distribution $q(\boldsymbol{\theta }^*|\boldsymbol{\theta }^{(t)})$ to be a mixture of two Gaussian distributions centered at $\boldsymbol{\theta }^{(t)}$ and a uniform distribution on $\mathbb {D}_{{\theta }}$. Hence, for each parameter $\theta ^*_k$, k = 1, ..., 9,

$q(\theta ^*_k|\theta ^{(t)}_k) = w_1\,\mathcal {N}\big (\theta ^{(t)}_k, \sigma ^2_{k,1}\big) + w_2\,\mathcal {N}\big (\theta ^{(t)}_k, \sigma ^2_{k,2}\big) + (1 - w_1 - w_2)\,\mathcal {U}\big (\mathbb {D}_{{\theta _k}}\big),$

where $w_1$ and $w_2$ are the mixture weights.

The values of $\sigma ^2_{k,1}$ and $\sigma ^2_{k,2}$ were selected based on test runs. We ran m = 8 independent Markov chains of length T = 200,000. Using parallel, independent chains makes it easier to understand the dependence of the MH performance on the initial parameter values we chose; parallel chains also provide insight into whether convergence has been reached. Convergence was additionally assessed using graphical diagnostics, and the length, T, of the chains was chosen to be comfortably larger than the point at which convergence was judged to have occurred. In our case, q(·) is a symmetric distribution. This means that $q(\boldsymbol{\theta }^{(t)}|\boldsymbol{\theta }^*) = q(\boldsymbol{\theta }^*|\boldsymbol{\theta }^{(t)})$, and the ratio in the acceptance probability, a, is simply the ratio of the PPD computed at $\boldsymbol{\theta }^*$ and at $\boldsymbol{\theta }^{(t)}$. In other words, parameters that increase the PPD are always accepted, while parameters that decrease the PPD are accepted randomly based on a.
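A compact sketch of the sampler described above: a random-walk MH loop whose proposal draws each parameter from a mixture of two Gaussians centered on the current state plus a uniform component on the domain. Because this mixture is symmetric in its two arguments, the acceptance ratio reduces to the PPD ratio; the weights and step sizes below are illustrative, not the tuned values used in the paper:

```python
import numpy as np

def propose(theta, lo, hi, sig1, sig2, w=(0.45, 0.45, 0.10), rng=None):
    """Per-parameter mixture proposal: N(theta_k, sig1_k^2), N(theta_k, sig2_k^2),
    or U(D_theta_k). The mixture is symmetric, so q(a|b) = q(b|a)."""
    rng = rng or np.random.default_rng()
    choice = rng.choice(3, size=theta.size, p=w)
    return np.where(choice == 0, rng.normal(theta, sig1),
           np.where(choice == 1, rng.normal(theta, sig2),
                    rng.uniform(lo, hi)))

def metropolis_hastings(log_ppd, theta0, lo, hi, sig1, sig2,
                        n_steps=200_000, seed=0):
    rng = np.random.default_rng(seed)
    chain = np.empty((n_steps, theta0.size))
    theta, lp = theta0, log_ppd(theta0)
    for i in range(n_steps):
        prop = propose(theta, lo, hi, sig1, sig2, rng=rng)
        if np.all((prop >= lo) & (prop <= hi)):  # stay inside D_theta
            lp_prop = log_ppd(prop)
            # Symmetric proposal: accept with probability min(1, PPD ratio).
            if np.log(rng.uniform()) < lp_prop - lp:
                theta, lp = prop, lp_prop
        chain[i] = theta  # a rejection repeats the current state
    return chain
```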

APPENDIX B: UCL_CHEM

In recent years, the molecular complexity of star-forming regions has led to the development of complex, multi-point, time-dependent, gas–grain chemical and photon-dominated models, which more accurately simulate the physics and chemistry of the observed interstellar material. The chemical modeling code used in this paper is the UCL_CHEM chemical code; a thorough description is given by Viti et al. (2004). UCL_CHEM is a time- and depth-dependent gas–grain chemical model that can be used to estimate the fractional abundances (with respect to hydrogen) of gas and surface species in every environment where molecules are present. The model includes both gas and surface reactions and determines molecular abundances in environments where not only the chemistry changes with time, but local variations in physical conditions also lead to variations in chemistry. Regardless of the object being modeled, the code always starts from the most diffuse state, where all of the gas is in atomic form, and evolves the gas to its final density. Depending on the temperature, atoms and molecules from the gas freeze onto the grains and hydrogenate where possible. The advantage of this approach is that the ice composition is not assumed; rather, it is derived by a time-dependent computation of the chemical evolution of the gas–dust interaction process.

The main categories of physical and chemical input parameters are the initial elemental abundances, the cosmic ray ionization rate (ζ), the radiation field strength (G), the gas density (nH), the dust grain characteristics, the freeze-out (species depletion rate), the desorption processes, and the reaction database. The initial fractional elemental abundances, relative to the total number of hydrogen nuclei, were taken to be 0.14, 4.0 × 10⁻⁴, 1.0 × 10⁻⁴, 7.0 × 10⁻⁵, 1.3 × 10⁻⁷, and 1.0 × 10⁻⁷ for helium, oxygen, carbon, nitrogen, sulphur, and magnesium, respectively (Sofia & Meyer 2001). The gas-phase network used by UCL_CHEM is based on the UMIST database (Millar et al. 2000), and our chemical network also includes surface reactions, as in Viti et al. (2004). In total, our network includes 208 species and 2391 gas and surface reactions. As output, the code computes the fractional abundances of all atomic and molecular species included in the network as a function of time.

APPENDIX C: EXPLORING THE POSTERIOR PROBABILITY DENSITY

The MH simulations provide us with the joint parameter PPD. However, because of the high dimensionality of the distribution, it is impossible to graphically represent the joint probability density. Therefore, we compute the marginal density for each parameter, or for a subset of parameters, by integrating the PPD over all the parameters except the ones we are interested in. For example, to obtain the joint marginal distribution of $\boldsymbol{\theta }_a =\lbrace n_{\rm H}, f_{\rm r}\rbrace$, we integrate over the rest of the parameters, $\boldsymbol{\theta }_b = \lbrace \zeta, G, C_{\rm f}, \epsilon, \phi, y, r\rbrace$:

$\pi (\boldsymbol{\theta }_a|\boldsymbol{\mathcal {Y}}) = \int \pi (\boldsymbol{\theta }_a, \boldsymbol{\theta }_b|\boldsymbol{\mathcal {Y}})\, d\boldsymbol{\theta }_b.$

The marginal probability distributions are visualized either with simple histograms for the case of univariate probabilities or with a bivariate histogram with an intensity map for the case of bivariate probabilities.

Traditionally, in order to explore the posterior distribution, typical Bayesian estimates such as the posterior mean are used. However, for multi-modal and/or non-Gaussian distributions, such point estimates are usually not meaningful. Instead, it is convenient to reduce the parameter space to HDRs, or credible intervals. HDR computation and graphical representation are thoroughly explained by Hyndman (1996). Following his paper, we briefly define the HDR as follows.

Let f(x) be the density function of a random variable, X. Then, the 100(1 − a)% HDR is the subset $R(f_a)$ of the sample space of X such that

$R(f_a) = \lbrace x : f(x) \ge f_a\rbrace,$

where $f_a$ is the largest constant such that $\Pr (X \in R(f_a)) \ge 1 - a$.

The above definition implies two very important properties: of all possible regions, HDRs occupy the smallest possible volume, and every point inside the regions has a probability density greater than or equal to that of every point outside them. HDRs are very useful for analyzing and characterizing multi-modal distributions; in such cases, an HDR might consist of several regions that are disjoint due to the number of modes. In the context of ice formation mechanisms, these HDRs are very useful statistical outcomes of the Bayesian approach, providing a precise quantitative measure of how the ice and gas observations and their uncertainties impact the cloud parameters.
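Hyndman (1996) also suggests a convenient way to estimate $f_a$ directly from posterior samples: evaluate the density estimate at the samples themselves and take the a-quantile of those values, since Pr(f(X) ≥ f_a) ≈ 1 − a. A minimal sketch for a 1D marginal, returning the possibly disjoint HDR intervals:

```python
import numpy as np
from scipy.stats import gaussian_kde

def hdr_intervals(samples, lo, hi, a=0.32, grid_size=2048):
    """Return the (possibly disjoint) intervals of the 100(1-a)% HDR."""
    kde = gaussian_kde(samples)
    f_a = np.quantile(kde(samples), a)  # density threshold (Hyndman 1996)
    grid = np.linspace(lo, hi, grid_size)
    inside = kde(grid) >= f_a
    # Split the grid into contiguous runs above/below the threshold.
    edges = np.flatnonzero(np.diff(inside.astype(int)))
    bounds = np.r_[0, edges + 1, grid_size]
    return [(grid[s], grid[e - 1]) for s, e in zip(bounds[:-1], bounds[1:])
            if inside[s]]

# Example on a synthetic bimodal posterior sample: expect two intervals.
rng = np.random.default_rng(2)
samples = np.concatenate([rng.normal(2.0, 0.3, 5000), rng.normal(7.0, 0.5, 3000)])
print(hdr_intervals(samples, 0.0, 10.0))
```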
