
A THEORETICAL FRAMEWORK FOR COMBINING TECHNIQUES THAT PROBE THE LINK BETWEEN GALAXIES AND DARK MATTER


Published 2011 August 11 © 2011. The American Astronomical Society. All rights reserved.
Citation: Alexie Leauthaud et al 2011 ApJ 738 45. DOI: 10.1088/0004-637X/738/1/45


ABSTRACT

We develop a theoretical framework that combines measurements of galaxy–galaxy lensing, galaxy clustering, and the galaxy stellar mass function in a self-consistent manner. While considerable effort has been invested in exploring each of these probes individually, attempts to combine them are still in their infancy. These combinations have the potential to elucidate the galaxy–dark matter connection and the galaxy formation physics responsible for it, as well as to constrain cosmological parameters and to test the nature of gravity. In this paper, we focus on a theoretical model that describes the galaxy–dark matter connection based on standard halo occupation distribution techniques. Several key modifications enable us to extract additional parameters that determine the stellar-to-halo mass relation and to simultaneously fit data from multiple probes while allowing for independent binning schemes for each probe. We construct mock catalogs from numerical simulations to investigate the effects of sample variance and covariance for each probe. Finally, we analyze how trends in each of the three observables impact the derived parameters of the model. In particular, we investigate various features of the observed galaxy stellar mass function (low-mass slope, "plateau," knee, and high-mass cutoff) and show how each feature is related to the underlying relationship between stellar and halo mass. We demonstrate that the observed "plateau" feature in the stellar mass function at $M_* \sim 2\times 10^{10}\,M_{\odot}$ is due to the transition that occurs in the stellar-to-halo mass relation at $M_h \sim 10^{12}\,M_{\odot}$ from a low-mass power-law regime to a sub-exponential function at higher stellar mass.


1. INTRODUCTION

Improved measurements of the link between galaxies and the dark matter distribution will benefit a variety of cosmological applications but will also provide important clues about the role that dark matter plays in the galaxy formation process. Although multiple techniques have been developed for this purpose, no single method has yet emerged as the ultimate tool and all suffer from various drawbacks. The goal of this paper is to develop the theoretical foundations required to combine multiple probes into a single tool that will provide more powerful constraints on the galaxy–dark matter connection. This paper extends and complements a growing body of work on this topic (Seljak 2000; Guzik & Seljak 2001, 2002; Berlind & Weinberg 2002; Tasitsiomi et al. 2004; Mandelbaum et al. 2005, 2006b; Yoo et al. 2006; Cacciato et al. 2009; Tinker et al. 2010).

At present, there are only two observational techniques capable of directly probing the dark matter halos of galaxies out to large radii (above 50 kpc): galaxy–galaxy lensing (e.g., Brainerd et al. 1996; McKay et al. 2001; Hoekstra et al. 2004; Sheldon et al. 2004; Mandelbaum et al. 2006a, 2006b; Heymans et al. 2006; Johnston et al. 2007; Leauthaud et al. 2010) and the kinematics of satellite galaxies (McKay et al. 2002; Prada et al. 2003; Brainerd & Specian 2003; van den Bosch et al. 2004; Conroy et al. 2007; Becker et al. 2007; Norberg et al. 2008; More et al. 2009, 2011). Galaxy–galaxy lensing (hereafter "g–g lensing") utilizes subtle distortions induced in the shapes and orientations of distant background galaxies in order to measure foreground mass distributions. The satellite kinematic method uses satellite galaxies as test particles to trace out the dark matter velocity field. Neither method can probe the halos of individual galaxies. Instead, both techniques must stack an ensemble of foreground galaxies in order to extract a signal. Nonetheless, with the advent of data sets large enough to provide statistically significant samples, improvements in photometric redshift techniques, and spectroscopic follow-up programs, both methods have emerged as powerful probes of the galaxy–dark matter connection and have truly evolved into mature techniques over the last decade.

In addition to these two direct probes, there are also several popular indirect methods to infer the galaxy–dark matter connection from the statistics of galaxy clustering. For example, numerous authors have employed a statistical model to describe the probability distribution P(N|Mh) that a halo of mass Mh is host to N galaxies above some threshold in luminosity or stellar mass. This statistical model, commonly known as the halo occupation distribution (HOD), has been considerably successful at interpreting the clustering properties of galaxies (e.g., Seljak 2000; Peacock & Smith 2000; Scoccimarro et al. 2001; Berlind & Weinberg 2002; Bullock et al. 2002; Zehavi et al. 2002, 2005, 2011; Zheng et al. 2005, 2007; Tinker et al. 2007; Wake et al. 2011; White et al. 2011). The HOD provides a description of the spatial distribution of galaxies at all scales, but it is usually inferred observationally by modeling measurements of the two-point correlation function of galaxies, ξgg(r). Since they were introduced a decade ago, HOD models have progressively increased in fidelity and complexity, owing both to stronger observational constraints and to the availability of larger, high-resolution cosmological N-body simulations of the dark matter. For example, analytical descriptions of the form and evolution of the halo mass function and the large-scale halo bias, both of which are key ingredients for HOD models, are approaching percent-level precision (e.g., Tinker et al. 2008, 2010). A variety of extensions to the basic HOD framework have also been proposed. For example, the conditional luminosity function Φ(L|Mh)dL specifies the average number of galaxies of luminosity L ± dL/2 that reside in a halo of mass Mh (e.g., Yang et al. 2003; van den Bosch et al. 2003b, 2007; Vale & Ostriker 2004, 2008; Cooray 2006) and the conditional stellar mass function Φ(M*|Mh)dM* describes the average number of galaxies with stellar masses in the range M* ± dM*/2 as a function of host halo mass Mh (e.g., Yang et al. 2009; Moster et al. 2010; Behroozi et al. 2010). Furthermore, a number of studies are also starting to take into account not only the simple expectation values of the underlying relations but also the scatter between the observable and halo mass (e.g., More et al. 2011; Behroozi et al. 2010; Moster et al. 2010), a crucial ingredient for a complete description of the galaxy–dark matter connection.

Finally, halo mass constraints from the galaxy stellar mass function (hereafter "SMF") have also been derived by assuming that there is a monotonic correspondence between halo mass (or circular velocity) and galaxy stellar mass (or luminosity; e.g., Kravtsov et al. 2004; Vale & Ostriker 2004, 2006; Tasitsiomi et al. 2004; Conroy & Wechsler 2009; Drory et al. 2009; Moster et al. 2010; Behroozi et al. 2010; Guo et al. 2010). This particular technique, often referred to as "abundance matching," is economical in terms of data requirements since it only considers the observed stellar mass (or luminosity) function. However, it requires prior knowledge about the mass distribution of halos (and substructure within those halos) from cosmological N-body simulations, as well as the assumption that field halos and sub-halos of the same halo mass contain galaxies of the same stellar mass.

While considerable effort has been invested in exploring each of these probes individually, attempts to combine them in a fully consistent way are still in their infancy. Nevertheless, savvy combinations hold great potential not only to elucidate the evolution of the galaxy–dark matter connection, and consequently the galaxy formation physics responsible for it, but also to constrain fundamental physics, including the cosmological model and the nature of gravity. For example, measurements of small-scale galaxy clustering alone do not yield cosmological constraints unless coupled with probes that are sensitive to the mass scales of dark matter halos (e.g., cluster mass-to-light ratios, satellite kinematics, g–g lensing, etc.; van den Bosch et al. 2003a; Tinker et al. 2005; Seljak et al. 2005). In particular, Yoo et al. (2006) and Cacciato et al. (2009) have shown that the combination of g–g lensing and galaxy clustering is sensitive to Ωm and σ8. Conceptually, this sensitivity arises from the fact that this particular combination simultaneously probes the shape and amplitude of the halo mass function at small scales and the overall matter density and the bias of the galaxy sample at large scales.

Other combinations can be sensitive to parameters in modified gravity theories. A generic metric theory of gravity has two scalar potentials: ϕ, which affects the clustering and dynamics of galaxies, and ψ, which affects the lensing of light around galaxies. Combining probes of g–g lensing with clustering and/or satellite dynamics allows a test of the general relativity (GR) prediction that ψ = ϕ as well as the Poisson equations which relate these potentials to the underlying density distribution. For example, Reyes et al. (2010) have used a combination of g–g lensing, galaxy clustering, and redshift space distortions to place limits on possible modifications to GR on ∼10 Mpc scales.

Tests of gravity on smaller scales, though complicated by the fact that structures have undergone nonlinear evolution, are interesting in several respects. First, all of our direct probes of the dark matter are either intrinsically limited to small scales (e.g., a few Mpc for satellite kinematics) or have significantly larger signals on small scales (e.g., below 10 Mpc for both cosmic shear and g–g lensing). Second, there are many alternative gravity theories which predict unique and interesting modifications on these scales (Smith 2009; Hui et al. 2009; Schmidt 2010; Jain & Khoury 2010).

In this paper, we develop the theoretical framework necessary to constrain the galaxy–dark matter connection by combining measurements of galaxy clustering, g–g lensing, and the galaxy SMF. The formalism outlined in this paper could also be applied to model satellite kinematics (see More et al. 2011). For this work, we adopt the standard HOD framework but with several key modifications. For instance, a procedure often adopted in clustering studies is to fit a set of HOD parameters (typically three to five) independently to the clustering signal and number density for each galaxy sample. However, adopting this strategy would require selecting a common binning scheme for all probes. In practice, we would like to avoid using a single binning scheme because various probes have different signal-to-noise ratio (S/N) requirements. We therefore modify the standard HOD model so that we can simultaneously fit data from multiple probes while allowing for independent binning schemes for each probe. Also, since we are interested in the galaxy–dark matter connection, we modify the HOD model so as to specifically include a parameterization for the stellar-to-halo mass relation (hereafter "SHMR"). In Leauthaud et al. (2011, hereafter Paper II), we demonstrate that this model provides an excellent fit to g–g lensing, galaxy clustering, and SMF measurements in the Cosmic Evolution Survey (COSMOS) from z = 0.2 to z = 1.0.

For 2 deg$^2$ surveys such as COSMOS, the finite sample size of the observational data set is also an important concern. We will present an estimate of the sample variance using mock surveys from numerical simulations. We will also estimate the covariance of the data for each observational measure. We will demonstrate that this is especially important for modeling the SMF, an effect that is usually not incorporated into most analyses. The finite sample size in COSMOS biases clustering measurements through the integral constraint, an effect we will also model through our mock surveys.

The layout of this paper is as follows. To begin with, we introduce the parametric form used to model the SHMR in Section 2. Next, in Section 3, we present the general HOD framework and our extensions to this model. In Section 4, we show how this model can be used to simultaneously fit g–g lensing, galaxy clustering, and SMF measurements. In Section 5, we describe the influence of each model parameter on the three observables. We then construct a set of mock catalogs designed to mimic the COSMOS survey and describe the behavior of the covariance matrices for the three probes in Section 6. Finally, we present our conclusions in Section 7.

We assume a WMAP5 ΛCDM cosmology with Ωm = 0.258, $\Omega _\Lambda =0.742$, $\Omega_b h^2 = 0.02273$, ns = 0.963, σ8 = 0.796, and H0 = 72 km s$^{-1}$ Mpc$^{-1}$ (Hinshaw et al. 2009). Unless stated otherwise, all distances are expressed in physical Mpc. The symbol Mh denotes halo mass. The halo radius is denoted by Rh. In this paper, halo mass is defined as $M_{200b}\equiv M(<R_{200b})=200\bar{\rho } \frac{4}{3}\pi R_{200b}^3$, where R200b is the radius at which the mean interior density is equal to 200 times the mean matter density ($\bar{\rho }$). We note however that our theoretical framework is valid for any reasonable choice of halo definition. Stellar mass is denoted by M*.

2. THE STELLAR-TO-HALO MASS RELATION FOR CENTRAL GALAXIES

To begin with, we present the mathematical function that we use to model the SHMR and describe the influence of each of the five parameters that regulate its shape. We will assume that the SHMR is specifically valid for "central" galaxies which are located by definition at the center of their parent halos. Dark matter halos also contain smaller bound density peaks that orbit around the center of the potential well. These substructures are commonly referred to as sub-halos; these sub-halos are the likely sites of "satellite" galaxies that have been accreted onto their parent halos. The abundance matching technique commonly assumes that satellite galaxies follow the same SHMR as centrals provided that halo mass is defined at the epoch when satellites were accreted onto their parent halos (Macc), rather than the current sub-halo mass (Conroy et al. 2006; Moster et al. 2010; Behroozi et al. 2010). However, this presupposes that satellite stellar growth occurs at a similar rate as centrals of equivalent halo mass (that is to say with a halo mass equal to Macc). Since one might expect that satellites and centrals experience distinct stellar growth rates, we model central and satellite galaxies separately in order to keep our model as general as possible.

In Section 3.2, we will show how the SHMR can be used to predict the central occupation function and then we will introduce the model for satellite galaxies in Section 3.3.

2.1. Functional Form for the SHMR

Let us consider the conditional SMF (the analog of the conditional luminosity function) which represents the number of galaxies with M* in the range M* ± dM*/2 at fixed halo mass and is denoted by Φ(M*|Mh) (e.g., Yang et al. 2009; Moster et al. 2010; Behroozi et al. 2010). The conditional SMF can be divided into a central component and a satellite component: Φ(M*|Mh) = Φc(M*|Mh) + Φs(M*|Mh). Φc(M*|Mh) is the conditional SMF for central galaxies, and it will be our mathematical representation of the SHMR. Note that in our model, the halo mass in the term Φs(M*|Mh) refers to the host halo mass.

In addition to the shape and evolution of the mean SHMR, astrophysical processes are expected to induce an intrinsic scatter in stellar mass at fixed halo mass, which is important to take into consideration when defining a functional form for Φc(M*|Mh). Another non-negligible source of scatter can be the measurement error associated with the determination of stellar masses. In the absence of strong observational or theoretical guidance for the form and magnitude of the total scatter (intrinsic plus measurement), we adopt a stochastic model where Φc(M*|Mh) is a log-normal probability distribution function (hereafter "PDF") with a log-normal scatter denoted by $\sigma _{\log M_{*}}$. Since we have assumed a log-normal functional form, Φc(M*|Mh) can be written as

Equation (1)

where $f_{\textsc {shmr}}$ represents the logarithmic mean of the stellar mass given the halo mass for the Φc distribution function. Equation (1) is normalized such that the integral of Φc(M*|Mh) over M* is equal to 1.

To model Φc we must specify a functional form for both $f_{\textsc {shmr}}$ and $\sigma _{\log M_{*}}$. There is increasing evidence to suggest that low- and high-mass galaxies have different stellar-to-halo mass ratios, probably as a result of multiple feedback mechanisms that operate at distinct mass scales and regulate star formation. We therefore require an SHMR that is flexible enough to capture such variations. We adopt the functional form presented in Behroozi et al. (2010, hereafter "B10"), which has been shown to reproduce the local Sloan Digital Sky Survey (SDSS) SMF using the abundance matching technique. In practice, $f_{\textsc {shmr}}(M_h)$ is mathematically defined via its inverse function:

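$$\log_{10}\left(f_{\textsc {shmr}}^{-1}(M_*)\right) = \log_{10}(M_h) = \log_{10}(M_1) + \beta \log_{10}\left(\frac{M_*}{M_{*,0}}\right) + \frac{\left(M_*/M_{*,0}\right)^{\delta}}{1 + \left(M_*/M_{*,0}\right)^{-\gamma}} - \frac{1}{2} \qquad (2)$$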

where M1 is a characteristic halo mass, $M_{*,0}$ is a characteristic stellar mass, β is the low-mass end slope, γ controls the transition region, and δ controls the massive end slope. Details regarding the justification of this functional form can be found in Section 3.4.3 of B10.
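As a minimal sketch of how this relation can be evaluated numerically, the Python function below assumes the B10 form $\log_{10}M_h = \log_{10}M_1 + \beta\log_{10}(M_*/M_{*,0}) + (M_*/M_{*,0})^{\delta}/[1+(M_*/M_{*,0})^{-\gamma}] - 1/2$. The function name is hypothetical, and the default parameter values are simply those quoted in Section 3.2 for Figures 2 and 3; they are used here for illustration only.

```python
import math

def log10_mh_from_mstar(log10_mstar, log10_m1=12.71, log10_mstar0=11.04,
                        beta=0.467, delta=0.62, gamma=1.89):
    """Inverse SHMR: log10 halo mass at a given log10 stellar mass (assumed B10 form)."""
    x = 10.0 ** (log10_mstar - log10_mstar0)         # M* / M*,0
    transition = x ** delta / (1.0 + x ** (-gamma))  # term that dominates at high M*
    return log10_m1 + beta * (log10_mstar - log10_mstar0) + transition - 0.5

# Tabulate the relation at a few illustrative stellar masses
for lm in (9.0, 10.0, 11.04, 11.5):
    print(f"log10(M*) = {lm:5.2f} -> log10(Mh) = {log10_mh_from_mstar(lm):.3f}")
```

With this form, the last two terms cancel at $M_* = M_{*,0}$, so the function returns $\log_{10}M_1$ there, and the logarithmic slope tends to β at low stellar mass.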

Note that a variety of similar functional forms have been proposed by previous authors. For example, the interested reader can look at Equation (2) in Moster et al. (2010) and Equation (20) in Yang et al. (2009).

In contrast to B10, we do not parameterize the redshift evolution of this functional form. Instead, in Paper II, we bin the data into three redshift bins and check for redshift evolution in the parameters a posteriori. Another difference with respect to B10 is that we assume Equation (2) is only relevant for central galaxies whereas B10 assume that the SHMR also applies to satellite galaxies, provided that the halo mass of a satellite galaxy is defined as Macc.

It is important to note that our SHMR traces the location of the mean-log stellar mass: $f_{\textsc {shmr}}(M_h)\equiv \langle \log _{10}(M_*(M_h))\rangle$. Other authors may report the mean stellar mass, 〈M*(Mh)〉, or even the mean halo mass at fixed stellar mass, 〈Mh(M*)〉 (e.g., Conroy et al. 2007). These averaging systems will yield different results in the presence of scatter. For example, 〈Mh(M*)〉 will be biased low compared to 〈log10(M*(Mh))〉 if $\sigma _{\log M_*}$ is non-zero. This bias will increase with $\sigma _{\log M_*}$ and for larger values of the high-mass slope of $f_{\textsc {shmr}}^{-1}$.
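To give a sense of scale for one of these differences, the sketch below compares $\langle \log_{10}M_*\rangle$ with $\log_{10}\langle M_*\rangle$ for a log-normal distribution at fixed halo mass. For a log-normal with scatter σ (in dex), the two differ by $\ln(10)\,\sigma^2/2$. The scatter of 0.25 dex, the mean of 10.5, and all variable names are illustrative assumptions rather than values taken from the analysis.

```python
import math
import random

sigma = 0.25     # assumed scatter in log10(M*) at fixed halo mass, in dex
mean_log = 10.5  # assumed <log10(M*)> at that halo mass (illustrative)

# Analytic offset between log10 of the mean and the mean of log10 for a log-normal
offset_analytic = math.log(10.0) * sigma ** 2 / 2.0

# Monte Carlo check with a fixed seed
rng = random.Random(42)
draws = [rng.gauss(mean_log, sigma) for _ in range(200_000)]
mean_of_log = sum(draws) / len(draws)
log_of_mean = math.log10(sum(10.0 ** d for d in draws) / len(draws))
offset_mc = log_of_mean - mean_of_log

print(f"analytic offset:    {offset_analytic:.4f} dex")
print(f"Monte Carlo offset: {offset_mc:.4f} dex")
```

For 0.25 dex of scatter, the mean stellar mass therefore sits about 0.07 dex above the mean-log stellar mass, and the offset grows quadratically with the scatter.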

In Figure 1, we illustrate the impact of the five parameters that determine $f_{\textsc {shmr}}$ on the shape of the SHMR. A brief description is as follows.

  1. M1 controls the characteristic halo mass; increasing M1 will result in larger halos hosting galaxies at a given stellar mass. In Figure 1 (which represents M* on the x-axis and Mh on the y-axis), M1 controls the y-axis amplitude (halo mass) of $f_{\textsc {shmr}}^{-1}$ so that a constant change in M1 leads to a constant up/down logarithmic shift in $f_{\textsc {shmr}}^{-1}$. Note that this constant logarithmic shift may not be visually obvious because the slope of the SHMR increases sharply at $M_* > 10^{11}\,M_{\odot}$.
  2. $M_{*,0}$ controls the characteristic stellar mass; increasing $M_{*,0}$ will result in smaller halos hosting galaxies at a given stellar mass. In Figure 1, $M_{*,0}$ controls the x-axis amplitude of $f_{\textsc {shmr}}^{-1}$.
  3. β controls the low-mass power-law slope of $f_{\textsc {shmr}}^{-1}$. When β increases, the low-mass end slope becomes steeper.
  4. The δ parameter regulates how rapidly $f_{\textsc {shmr}}^{-1}$ climbs at high M*. Indeed, $f_{\textsc {shmr}}^{-1}$ asymptotes to a sub-exponential function at high M*, which signifies that $f_{\textsc {shmr}}^{-1}$ climbs more rapidly than a power-law function but less rapidly than an exponential function (see discussion in B10).
  5. γ controls the transition regime between the low-mass power-law regime and the high-mass sub-exponential behavior. A larger value of γ corresponds to a sharper transition between the two regimes.

A quantity that is of particular interest is the mass (we refer here to both M* and Mh) at which the ratio Mh/M* reaches a minimum. This minimum is of noteworthy importance for galaxy formation models because it marks the mass at which the accumulated stellar growth of the central galaxy has been the most efficient. In this paper, and in subsequent papers, we will refer to the stellar mass, halo mass, and ratio at which this minimum occurs as the "pivot stellar mass," $M_*^{\rm piv}$, the "pivot halo mass," $M_h^{\rm piv}$, and the "pivot ratio," $(M_h/M_*)^{\rm piv}$. Note that $M_*^{\rm piv}$ and $M_h^{\rm piv}$ are not simply equal to $M_{*,0}$ and M1, respectively. Indeed, the mathematical formulation of the SHMR is such that the pivot masses depend on all five parameters. The three parameters that have the strongest effect on the pivot masses are M1, $M_{*,0}$, and γ. For example, as can be seen in the right-hand panel of Figure 1, $M_*^{\rm piv}$ and $M_h^{\rm piv}$ are inversely proportional to γ. To a lesser extent, the two remaining parameters, β and δ, also have a small influence on the pivot masses.
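Because the pivot masses have no simple closed form, one way to locate them is a direct numerical search for the minimum of $\log_{10}(M_h/M_*)$. The sketch below does this on a grid, assuming the B10 functional form for the inverse SHMR and the illustrative parameter set quoted in Section 3.2; the function names, grid limits, and grid resolution are hypothetical choices.

```python
import math

def log10_mh_from_mstar(log10_mstar, log10_m1=12.71, log10_mstar0=11.04,
                        beta=0.467, delta=0.62, gamma=1.89):
    """Inverse SHMR (assumed B10 form): log10 halo mass at a given log10 stellar mass."""
    x = 10.0 ** (log10_mstar - log10_mstar0)
    return (log10_m1 + beta * (log10_mstar - log10_mstar0)
            + x ** delta / (1.0 + x ** (-gamma)) - 0.5)

def pivot(n_grid=6001, lo=9.0, hi=12.0, **params):
    """Grid search for the stellar mass that minimizes log10(Mh / M*)."""
    best = None
    for i in range(n_grid):
        lm = lo + (hi - lo) * i / (n_grid - 1)
        log_ratio = log10_mh_from_mstar(lm, **params) - lm
        if best is None or log_ratio < best[2]:
            best = (lm, log_ratio + lm, log_ratio)
    return best  # (log10 pivot M*, log10 pivot Mh, log10 pivot ratio)

piv_mstar, piv_mh, piv_ratio = pivot()
print(f"pivot log10(M*) = {piv_mstar:.2f}, log10(Mh) = {piv_mh:.2f}, "
      f"log10(Mh/M*) = {piv_ratio:.2f}")

# The pivot depends on all five parameters; here only gamma is varied
for g in (1.0, 1.89, 3.0):
    ms, mh, _ = pivot(gamma=g)
    print(f"gamma = {g:4.2f}: pivot log10(M*) = {ms:.2f}, log10(Mh) = {mh:.2f}")
```

A grid search is used rather than a root finder on the derivative because it is robust to the choice of parameters and needs no starting guess; the resolution of the answer is set by the grid spacing (0.0005 dex here).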


Figure 1. Illustration of the influence of M1, $M_{*,0}$, β, δ, and γ on the shape of the SHMR. M1 controls the characteristic halo mass. $M_{*,0}$ controls the characteristic stellar mass. β controls the low-mass power-law slope. δ regulates how rapidly the SHMR climbs at high M*. γ controls the transition regime between the low-mass power-law regime and the high-mass sub-exponential behavior.


2.2. Scatter between Stellar and Halo Mass

We now turn our attention to the second component of Φc(M*|Mh), which is the scatter in stellar mass at fixed halo mass, $\sigma _{\log M_{*}}$. The total measured scatter will have two components: an intrinsic component (denoted by $\sigma _{\log M_{*}}^{\rm i}$) and a measurement error component due to redshift, photometry, and modeling uncertainties in stellar mass measurements (denoted by $\sigma _{\log M_{*}}^{\rm m}$). It is reasonable to assume that the intrinsic scatter component is independent of the measurement error component. Assuming Gaussian error distributions, we can write

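$$\left(\sigma_{\log M_{*}}\right)^2 = \left(\sigma_{\log M_{*}}^{\rm i}\right)^2 + \left(\sigma_{\log M_{*}}^{\rm m}\right)^2 \qquad (3)$$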

While the error distribution for the stellar mass estimate for any single galaxy may be non-Gaussian, in this work we are only concerned with stacked ensembles. B10 have tested that the error distribution for a stacked ensemble is Gaussian to good approximation and that small non-Gaussian wings in this distribution are not likely to affect this type of analysis.

In practice, since the data are always binned according to M*, the observables are actually sensitive to the scatter in halo mass at fixed stellar mass, which we denote by $\sigma _{\log M_h}$. It will therefore be useful to understand the link between $\sigma _{\log M_h}$ and $\sigma _{\log M_{*}}$. If the SHMR is a power law, the relationship between $\sigma _{\log M_h}$ and $\sigma _{\log M_{*}}$ is simply

$$\sigma_{\log M_h} = \frac{{\rm d}\log_{10}M_h}{{\rm d}\log_{10}M_*}\,\sigma_{\log M_{*}} \qquad (4)$$

For example, if there is a power-law relation between halo mass and stellar mass such that $M_h \propto M_*^{\eta}$ then $\sigma _{\log M_h}=\eta \times \sigma _{\log M_{*}}$. In our case, the SHMR behaves like a power law at low M*. At high M*, however, d(log10Mh)/d(log10M*) increases as a function of M*. Therefore, if $\sigma _{\log M_{*}}$ is constant, $\sigma _{\log M_h}$ will be equal to $\sigma _{\log M_h}=\beta \times \sigma _{\log M_{*}}$ at low M* but then will continuously increase with M* at a rate set by γ and δ.
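This behavior can be illustrated with a short numerical sketch that multiplies a constant $\sigma _{\log M_{*}}$ by the local logarithmic slope of the inverse SHMR. The code assumes the B10 functional form and the illustrative parameter set quoted in Section 3.2 (not the Paper II best fit), so the numbers it prints are indicative only; the function names and the finite-difference step are hypothetical choices.

```python
import math

def log10_mh_from_mstar(log10_mstar, log10_m1=12.71, log10_mstar0=11.04,
                        beta=0.467, delta=0.62, gamma=1.89):
    """Inverse SHMR (assumed B10 form): log10 halo mass at a given log10 stellar mass."""
    x = 10.0 ** (log10_mstar - log10_mstar0)
    return (log10_m1 + beta * (log10_mstar - log10_mstar0)
            + x ** delta / (1.0 + x ** (-gamma)) - 0.5)

def sigma_log_mh(log10_mstar, sigma_log_mstar=0.25, step=1e-4, **params):
    """Scatter in halo mass at fixed stellar mass: local slope times the scatter in M*."""
    slope = (log10_mh_from_mstar(log10_mstar + step, **params)
             - log10_mh_from_mstar(log10_mstar - step, **params)) / (2.0 * step)
    return slope * sigma_log_mstar

for lm in (9.0, 10.0, 11.0, 11.5):
    print(f"log10(M*) = {lm:4.1f}: sigma_logMh = {sigma_log_mh(lm):.3f} dex")
```

At low stellar mass the result tends to $\beta \times \sigma _{\log M_{*}}$, and it rises steeply once the sub-exponential term takes over.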

If we adopt the best-fit model to the SHMR from Paper II in the redshift range 0.22 < z < 0.48, we find that the power-law index of the SHMR increases steeply at log10(M*) > 11 so that $\sigma _{\log M_h}$ becomes quite large. For example, $\sigma _{\log M_h}\sim 0.46$ dex at log10(M*) = 11 and $\sigma _{\log M_h}\sim 0.7$ dex at log10(M*) = 11.5. In practical terms, this implies that the most massive galaxies do not necessarily live in the most massive halos. For example, a galaxy with $M_* \sim 2\times 10^{11}\,M_{\odot}$ could be the central galaxy of a group with $M_h \sim 10^{13}$–$10^{14}\,M_{\odot}$, or could also be the central galaxy of a cluster with $M_h > 10^{15}\,M_{\odot}$. The increase of $\sigma _{\log M_h}$ with M* will lead to a noticeable effect in the g–g lensing, clustering, and SMFs at large M* that is analogous to Eddington bias. This effect will be discussed in further detail in Section 5.

3. HOD FRAMEWORK

In this section, we show how Φc(M*|Mh) can be used to determine the central halo occupation function and introduce five new parameters to describe the satellite occupation function.

3.1. Halo Occupation Functions

In this paper, we assume that stellar mass is used to implement the HOD model since it is expected to be a more faithful tracer of halo mass than galaxy luminosity.

Consider a galaxy sample such that $M_* > M_{*}^{t_1}$ (a "threshold" sample). The central occupation function, denoted by $\langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle$, is the average number of central galaxies in this sample that are hosted by a halo of mass Mh. The satellite occupation function, denoted by $\langle N_{\rm sat}(M_h|M_{*}^{t_1})\rangle$, is the equivalent function for satellite galaxies.

In what follows, we focus on the appropriate equations for threshold samples. In Paper II, however, we will use "binned" samples ($M_{*}^{t_1} < M_* < M_{*}^{t_2}$) to calculate the g–g lensing and the SMF. We therefore note that the occupation functions for binned samples are trivially derived from the occupation function for threshold samples via

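$$\langle N_{\rm cen}(M_h|M_{*}^{t_1},M_{*}^{t_2})\rangle = \langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle - \langle N_{\rm cen}(M_h|M_{*}^{t_2})\rangle \qquad (5)$$

and

$$\langle N_{\rm sat}(M_h|M_{*}^{t_1},M_{*}^{t_2})\rangle = \langle N_{\rm sat}(M_h|M_{*}^{t_1})\rangle - \langle N_{\rm sat}(M_h|M_{*}^{t_2})\rangle \qquad (6)$$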

3.2. Functional Form for 〈Ncen〉

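For a threshold sample of galaxies, $\langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle$ is fully specified given Φc(M*|Mh) according to

$$\langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle = \int_{M_{*}^{t_1}}^{\infty} \Phi_c(M_*|M_h)\,{\rm d}M_* \qquad (7)$$

Because the integral of Φc(M*|Mh) over M* is equal to 1, $\langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle$ will vary between 0 and 1.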

To begin with, let us make the simplifying assumption that $\sigma _{\log M_{*}}$ is constant. Because Φc is parameterized as a log-normal distribution, the central occupation function can be analytically derived from Equation (7) by considering the cumulative distribution function of the Gaussian:

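$$\langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle = \frac{1}{2}\left[1 - {\rm erf}\left(\frac{\log_{10}(M_{*}^{t_1}) - \log_{10}(f_{\textsc {shmr}}(M_h))}{\sqrt{2}\,\sigma_{\log M_{*}}}\right)\right] \qquad (8)$$

where erf is the error function defined as

$${\rm erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,{\rm d}t \qquad (9)$$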

It is important to note that Equation (8) is only valid when $\sigma _{\log M_{*}}$ is constant. In the more general case where $\sigma _{\log M_{*}}$ varies, 〈Ncen〉 can nonetheless be calculated by numerically integrating Equation (7). In Paper II, we will consider cases in which $\sigma _{\log M_{*}}$ varies due to the effects of stellar-mass-dependent measurement errors. In this case, we will numerically integrate Equation (7) to calculate 〈Ncen〉 (see Section 4.2 in Paper II).
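For the constant-scatter case, the central occupation function can be sketched in a few lines of Python. The code below assumes (i) the B10 form for the inverse SHMR with the illustrative parameter set quoted later in this section, (ii) a log-normal Φc with constant scatter, so that the threshold occupation is the complementary Gaussian cumulative distribution in $\log_{10}M_*$, and (iii) a simple bisection to invert the monotonic SHMR. All function names and the bisection bounds are hypothetical.

```python
import math

def log10_mh_from_mstar(log10_mstar, log10_m1=12.71, log10_mstar0=11.04,
                        beta=0.467, delta=0.62, gamma=1.89):
    """Inverse SHMR (assumed B10 form): log10 halo mass at a given log10 stellar mass."""
    x = 10.0 ** (log10_mstar - log10_mstar0)
    return (log10_m1 + beta * (log10_mstar - log10_mstar0)
            + x ** delta / (1.0 + x ** (-gamma)) - 0.5)

def log10_fshmr(log10_mh, lo=0.0, hi=14.0, **params):
    """Invert the monotonic relation above by bisection: log10(M*) at a given log10(Mh)."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if log10_mh_from_mstar(mid, **params) < log10_mh:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def n_cen_threshold(log10_mh, log10_mstar_t1, sigma_log_mstar=0.25, **params):
    """Mean number of centrals above the stellar mass threshold (constant scatter)."""
    arg = ((log10_mstar_t1 - log10_fshmr(log10_mh, **params))
           / (math.sqrt(2.0) * sigma_log_mstar))
    return 0.5 * (1.0 - math.erf(arg))

def n_cen_bin(log10_mh, log10_mstar_t1, log10_mstar_t2, **kwargs):
    """Binned sample: difference of two threshold samples."""
    return (n_cen_threshold(log10_mh, log10_mstar_t1, **kwargs)
            - n_cen_threshold(log10_mh, log10_mstar_t2, **kwargs))

for lmh in (11.5, 12.0, 12.71, 13.5):
    print(f"log10(Mh) = {lmh:5.2f}: <Ncen>(M* > 10^11.04) = "
          f"{n_cen_threshold(lmh, 11.04):.3f}")
```

By construction, the occupation equals 1/2 at the halo mass where the mean-log stellar mass matches the threshold, and the binned occupation is non-negative because both thresholds share the same SHMR and scatter.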

We note that most readers may be more familiar with a simplified version of Equation (8) that assumes $f_{\textsc {shmr}}(M_h)$ is a power law. We will now describe the assumptions made in order to obtain the more commonly employed equation for 〈Ncen〉 from Equation (8).

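If we make the assumption that $f_{\textsc {shmr}}(M_h) \propto M_h^p$ and we define Mmin such that $M_{\rm min}\equiv f_{\textsc {shmr}}^{-1}(M_{*}^{t_1})$ (in other terms, Mmin is the inverse of the SHMR relation for the stellar mass threshold $M_{*}^{t_1}$) then using Equation (8) we can write

$$\langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle = \frac{1}{2}\left[1 - {\rm erf}\left(\frac{p\left[\log_{10}(M_{\rm min}) - \log_{10}(M_h)\right]}{\sqrt{2}\,\sigma_{\log M_{*}}}\right)\right] \qquad (10)$$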

If we now use the fact that erf(− x) = −erf(x) and if we define $\widetilde{\sigma }_{{\rm {log}}M}$ such that $\widetilde{\sigma }_{\log M}\equiv \sigma _{\log M_{*}}/p$ we can write

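$$\langle N_{\rm cen}(M_h|M_{*}^{t_1})\rangle = \frac{1}{2}\left[1 + {\rm erf}\left(\frac{\log_{10}(M_h) - \log_{10}(M_{\rm min})}{\sqrt{2}\,\widetilde{\sigma}_{\log M}}\right)\right] \qquad (11)$$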

which is a commonly employed formula for 〈Ncen〉. First, it is important to note that Equation (11) is only an approximation for 〈Ncen〉 for the case when the SHMR is a power law and is certainly not valid over a large range of stellar masses. Second, $\widetilde{\sigma }_{{\rm {log}}M}$ can be interpreted as the scatter in halo mass at fixed stellar mass if and only if the SHMR is a power law and if $\sigma _{\log M_{*}}$ is constant. Since there is accumulating evidence that the SHMR is not a single power law (and the same is in general true for the relationship between halo mass and galaxy luminosity), we recommend using Equation (8) instead of Equation (11).
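The equivalence of the two expressions in the pure power-law case can be checked numerically. The sketch below assumes a hypothetical power-law SHMR, $\log_{10}M_* = c + p\,\log_{10}M_h$ with arbitrary illustrative values of c and p, and assumes that the simplified form keeps the factor of $\sqrt{2}$ in the erf argument (some works absorb this factor into the definition of the scatter). The function names are not from the original text.

```python
import math

def n_cen_general(log10_mh, log10_mstar_t1, p, log10_norm, sigma_log_mstar):
    """Threshold occupation for an assumed power-law SHMR with log-normal scatter in M*."""
    log10_fshmr = log10_norm + p * log10_mh  # mean-log stellar mass at this halo mass
    arg = (log10_mstar_t1 - log10_fshmr) / (math.sqrt(2.0) * sigma_log_mstar)
    return 0.5 * (1.0 - math.erf(arg))

def n_cen_simplified(log10_mh, log10_mmin, sigma_tilde):
    """Simplified erf form written in terms of Mmin and the rescaled scatter."""
    arg = (log10_mh - log10_mmin) / (math.sqrt(2.0) * sigma_tilde)
    return 0.5 * (1.0 + math.erf(arg))

p, log10_norm, sigma = 2.0, -14.0, 0.25         # hypothetical power-law SHMR and scatter
log10_mstar_t1 = 10.0                            # illustrative stellar mass threshold
log10_mmin = (log10_mstar_t1 - log10_norm) / p   # halo mass where f_shmr equals the threshold
sigma_tilde = sigma / p                          # rescaled scatter

max_diff = 0.0
for lmh in (11.5, 12.0, 12.5, 13.0):
    a = n_cen_general(lmh, log10_mstar_t1, p, log10_norm, sigma)
    b = n_cen_simplified(lmh, log10_mmin, sigma_tilde)
    max_diff = max(max_diff, abs(a - b))
    print(f"log10(Mh) = {lmh:4.1f}: general = {a:.6f}, simplified = {b:.6f}")
```

The two forms agree to machine precision only because the slope p is constant here; once the slope varies with mass, no single pair of Mmin and rescaled scatter reproduces the general expression at all halo masses.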

Figures 2 and 3 illustrate the difference in 〈Ncen〉 when Equation (8) is used to describe clustering instead of Equation (11). To make these figures, we have assumed the parameter set: log10(M1) = 12.71, $\log_{10}(M_{*,0}) = 11.04$, β = 0.467, δ = 0.62, γ = 1.89, and $\sigma _{\log M_{*}}=0.25$. For each stellar mass threshold in Figure 2, we determine the values of Mmin and $\widetilde{\sigma }_{{\rm {log}}M}$ in Equation (11) such that the number density of central galaxies and the bias of those galaxies is the same as that achieved with Equation (8). Thus, our procedure mimics what one would obtain through analysis of the clustering and space density of such samples, assuming that the satellite occupation would be the same in either analysis.


Figure 2. Impact of the analytic model for the mean number of central galaxies in a given halo. Black lines show the form of 〈Ncen〉 for stellar mass threshold samples using Equation (8). Orange lines show 〈Ncen〉 when Equation (11) is used to describe the clustering instead of Equation (8). The parametric form of the SHMR is a power law at low M* and thus Equation (11) provides a reasonable description of 〈Ncen〉 at log10(M*) ≲ 10.5. At log10(M*) ≳ 10.5, however, 〈Ncen〉 deviates from a simple erf function and there are noticeable differences between the two proposed forms for 〈Ncen〉.


Figure 3. Difference between two analytic models for the mean number of central galaxies in a given halo. Assuming that Equation (8) correctly represents 〈Ncen〉, we evaluate the difference between the value for Mmin if Equation (11) is used (denoted here as $M_{\rm min}^{\rm Eq11}$) to fit clustering data and $f_{\textsc {shmr}}^{-1}(M_{*}^{t_1})$ (denoted here as $M_{\rm min}^{\rm Eq8}$). The difference is negligible for log10(Mmin) ≲ 12 but can be of order 10%–40% at higher masses.


Figure 2 reveals that because the SHMR has a sub-exponential behavior at log10(M*) ≳ 10.5, 〈Ncen〉 begins to deviate from a simple erf function for high stellar mass samples and therefore is not well described by Equation (11). Assuming that Equation (8) correctly represents 〈Ncen〉, the error made on Mmin can be of order 10%–40% at $M_h=f_{\textsc {shmr}}^{-1}(M_{*}^{t_1})\gtrsim 10^{12}\,M_{\odot }$ if Equation (11) is used to describe 〈Ncen〉 instead of Equation (8).

We note that this does not invalidate Equation (11) as a possible parameterization of the central occupation function. However, interpreting Mmin in Equation (11) as $f_{\textsc {shmr}}^{-1}(M_{*}^{t_1})$ can result in a 10%–40% error in the true mean halo mass (with larger errors for $\sigma _{\log M_{*}}>0.25$). Also, the "scatter" ($\widetilde{\sigma }_{{\rm {log}}M}$) constrained by this parameterization is not equal to the scatter in a log-normal distribution of stellar mass at fixed halo mass. Finally, one troublesome aspect of using the erf functional form in Equation (11) is that 〈Ncen〉 curves for different stellar mass thresholds may actually cross at low halo mass, implying the unphysical condition that halos of mass Mh have a "negative" number of galaxies between two threshold values. This is seen at $M_h \sim 10^{11.5}\,M_{\odot}$ in Figure 2. For values of $\sigma _{\log M_{*}}$ larger than what we have assumed here, this effect will occur at even higher halo mass. Using Equation (8) with a model for the SHMR prevents this from occurring, and 〈Ncen〉 for various galaxy samples can be calculated self-consistently.

3.3. Functional Form for 〈Nsat〉

In addition to the five parameters introduced to model 〈Ncen〉 and $\sigma _{\log M_{*}}$, we introduce five new parameters to model 〈Nsat〉. In order to simultaneously fit g–g lensing, clustering, and SMF measurements that employ different binning schemes, we require a model for 〈Nsat〉 that is independent of any given binning scheme. For this reason, we parameterize the satellite function with threshold samples. The number of satellite galaxies in a bin of stellar mass is determined by a simple difference of two threshold samples. This also eliminates the need to integrate over stellar mass, as required when working explicitly through the conditional SMF.

Numerical simulations demonstrate that the occupation of sub-halos (e.g., Kravtsov et al. 2004; Conroy et al. 2006) and satellite galaxies in cosmological hydrodynamic simulations (Zheng et al. 2005) follow a power law at high host halo mass, then fall off rapidly when the mean occupation becomes significantly less than unity. Thus, we parameterize the satellite occupation function as a power of host mass with an exponential cutoff and scaled to 〈Ncen〉 as follows:

Equation (12)

where αsat represents the power-law slope of the satellite mean occupation function, Msat defines the amplitude of the power law, and Mcut sets the scale of the exponential cutoff. Here, Mh refers to the host halo mass of satellite galaxies.

Observational analyses have demonstrated that there is a self-similarity in occupation functions such that Msat/Mmin ≈ constant for luminosity-defined samples (Zehavi et al. 2005, 2011; Zheng et al. 2007, 2009; Tinker et al. 2007; Abbas et al. 2010), where Mmin is taken from Equation (11) and is conceptually similar to $f_{\textsc {shmr}}^{-1}(M_{*}^{t_1})$ (modulo a 10%–40% difference as shown in Figure 3) and where $M_{*}^{t_1}$ is the stellar mass threshold of the sample. Instead of simply modeling Msat and Mcut as constant factors of $f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$, we add flexibility to our model by enabling Msat and Mcut to vary as power-law functions of $f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$:

Equation (13)

and

Equation (14)

Zheng et al. (2007) find that Msat/Mmin ∼ 18 for SDSS and Msat/Mmin ∼ 16 using luminosity-defined samples in DEEP2. For Bcut, the expectation is that the cutoff mass scale occurs at Mmin < Mcut < Msat, although it can be significantly smaller.
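The satellite parameterization described above can be sketched numerically. The snippet below is illustrative only. It assumes that the exponential cutoff enters as exp(−Mcut/Mh), that Msat and Mcut are both normalized at a pivot mass of $10^{12}\,M_{\odot }$, and that 〈Ncen〉 and $f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$ are supplied by the caller. The function names are hypothetical and are not taken from this paper.

```python
import numpy as np

PIVOT = 1e12  # assumed pivot mass [M_sun] for the M_sat and M_cut scalings


def scaled_mass(m_min, norm, slope):
    """Power-law scaling of M_sat or M_cut with m_min = f_shmr^{-1}(M_*^{t1})."""
    return PIVOT * norm * (m_min / PIVOT) ** slope


def n_sat(m_h, n_cen, m_min, b_sat, beta_sat, b_cut, beta_cut, alpha_sat=1.0):
    """Mean satellite occupation of a threshold sample (assumed functional form).

    m_h   : host halo mass [M_sun]
    n_cen : <N_cen> of the same threshold sample evaluated at m_h
    m_min : f_shmr^{-1}(M_*^{t1}) for the threshold [M_sun]
    """
    m_h = np.asarray(m_h, dtype=float)
    m_sat = scaled_mass(m_min, b_sat, beta_sat)
    m_cut = scaled_mass(m_min, b_cut, beta_cut)
    # Power law in host mass, exponential cutoff, scaled to <N_cen>.
    return n_cen * (m_h / m_sat) ** alpha_sat * np.exp(-m_cut / m_h)


def n_sat_bin(n_sat_t1, n_sat_t2):
    """Satellites in the bin t1 < M_* < t2 as a difference of two thresholds."""
    return n_sat_t1 - n_sat_t2
```

Because every sample is built from threshold occupations, the same parameter set can serve probes that use different stellar mass bins: each bin only requires two evaluations of `n_sat`.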

3.4. Total Stellar Mass as a Function of Halo Mass

Using this model, one can also compute the total amount of stellar mass in galaxies (summing the contribution from both centrals and satellites) as a function of halo mass Mh. To begin with, let us consider the total stellar mass as a function of halo mass in some stellar mass bin: $M_{*}^{\rm tot}(M_h|M_{*}^{t_1},M_{*}^{t_2})$. The expression for $M_{*}^{\rm tot}(M_h|M_{*}^{t_1},M_{*}^{t_2})$ is given by

Equation (15)

However, in the previous section we have only specified the analytic form for Φc (Equation (1)) but not for Φs. Indeed, in Section 3.3 we outlined an analytic model for 〈Nsat〉, but calculating the analytic derivative of 〈Nsat〉 with respect to the stellar mass threshold would be tedious. Fortunately, we do not need to know the functional form of Φs in order to evaluate Equation (15). Integrating by parts, we can rewrite Equation (15) in a more convenient form as follows:

Equation (16)

This equation provides us with a convenient way to calculate the total stellar mass locked up in galaxies with $M_{*}^{t_1} < M_{*} < M_{*}^{t_2}$ as a function of halo mass Mh.
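The integration-by-parts step can be checked numerically. The sketch below assumes only that a threshold occupation N(>M*) and the corresponding conditional SMF are related by Φ = −dN/dM*, so that the mass-weighted integral of Φ over a bin equals t1 N(t1) − t2 N(t2) plus the integral of N over the bin. The toy occupation exp(−M*) in arbitrary mass units, and all function names, are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Plain trapezoidal rule on a sampled function."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def total_mass_direct(phi, t1, t2, n_grid=2001):
    """Integrate M_* times the conditional SMF over the bin [t1, t2]."""
    m = np.linspace(t1, t2, n_grid)
    return trapezoid(phi(m) * m, m)


def total_mass_by_parts(n_thresh, t1, t2, n_grid=2001):
    """Same quantity using only the threshold occupation N(>M_*)."""
    m = np.linspace(t1, t2, n_grid)
    boundary = t1 * n_thresh(t1) - t2 * n_thresh(t2)
    return boundary + trapezoid(n_thresh(m), m)


# Toy threshold occupation and its (negative) derivative, arbitrary units.
n_toy = lambda m: np.exp(-m)
phi_toy = lambda m: np.exp(-m)

direct = total_mass_direct(phi_toy, 0.5, 3.0)
by_parts = total_mass_by_parts(n_toy, 0.5, 3.0)
```

The second routine never differentiates the occupation function, which is the practical advantage of the rewritten form when 〈Nsat〉 is only available for threshold samples.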

3.5. Summary of Model Parameters

In total, we have introduced six parameters to model the central occupation function (M1, M*, 0, β, δ, γ, $\sigma _{\log M_{*}}$) and five parameters to model the satellite occupation function (βsat, Bsat, βcut, Bcut, αsat). One could additionally introduce a model for the mass dependence of $\sigma _{\log M_{*}}$; if the scatter is instead assumed to be constant, the model has a total of 11 parameters. In Figure 8 of Paper II, we show the two-dimensional marginalized distributions for this parameter set using data from the COSMOS survey. The model described in this paper provides an excellent fit to COSMOS data, and Figure 8 of Paper II demonstrates that it is reasonably free of parameter degeneracies. A summary and description of these parameters can be found in Table 1. Figure 4 gives an illustration of the central and satellite occupation functions for galaxy samples in bins and thresholds of stellar mass.

Figure 4. Illustration of the occupation functions for various galaxy samples as a function of stellar mass. The upper panels represent binned galaxy samples, whereas the lower panels represent threshold samples. Left panels: central occupation function. Middle panels: satellite occupation function. Right panels: total occupation function where 〈Ntot〉 = 〈Ncen〉 + 〈Nsat〉. The parameters chosen for this HOD model correspond to the best-fit parameters from Paper II for 0.48 < z < 0.74.

Table 1. Parameters in Model

Parameter               Unit            Description                                              〈Ncen〉 or 〈Nsat〉
M1                      $M_{\odot }$    Characteristic halo mass in the SHMR                     〈Ncen〉
M*, 0                   $M_{\odot }$    Characteristic stellar mass in the SHMR                  〈Ncen〉
β                       None            Faint end slope in the SHMR                              〈Ncen〉
δ                       None            Controls massive end slope in the SHMR                   〈Ncen〉
γ                       None            Controls the transition regime in the SHMR               〈Ncen〉
$\sigma _{\log M_{*}}$  dex             Log-normal scatter in stellar mass at fixed halo mass    〈Ncen〉
βsat                    None            Slope of the scaling of Msat                             〈Nsat〉
Bsat                    None            Normalization of the scaling of Msat                     〈Nsat〉
βcut                    None            Slope of the scaling of Mcut                             〈Nsat〉
Bcut                    None            Normalization of the scaling of Mcut                     〈Nsat〉
αsat                    None            Power-law slope of the satellite occupation function     〈Nsat〉

4. HOW TO DERIVE THE SMF, G–G LENSING, AND CLUSTERING FROM THE MODEL

We now describe how the model outlined in the previous section yields analytic descriptions for the g–g lensing, clustering, and SMF, which can then be fit simultaneously to observations.

4.1. Analytical Model for the Stellar Mass Function

The SMF is typically calculated in bins in stellar mass. Let us consider the stellar mass bin $M_{*}^{t_1} < M_{*} < M_{*}^{t_2}$. The abundance of galaxies within this stellar mass bin, $\Phi _{\rm SMF}(M_{*}^{t_1},M_{*}^{t_2})$, is simply obtained from our model and the halo mass function, dn/dMh, according to

Equation (17)

where we recall that Φ(M*|Mh) represents the conditional SMF and 〈Ntot〉 is the total occupation function (including both satellites and centrals).

4.2. The Lensing Observable, ΔΣ

The shear signal induced by a given foreground mass distribution on a background source galaxy will depend on the transverse proper distance between the lens and the source and on the redshift configuration of the lens–source system. A lens with a projected surface mass density, Σ(r), will create a shear that is proportional to the surface mass density contrast, ΔΣ(r):

Equation (18)

Here, $\overline{\Sigma }(< r)$ is the mean surface density within proper radius r, $\overline{\Sigma }(r)$ is the azimuthally averaged surface density at radius r (e.g., Miralda-Escude 1991; Wilson et al. 2001), and γt is the tangentially projected shear. The geometry of the lens–source system intervenes through the critical surface mass density, Σcrit, which depends on the angular diameter distances to the lens (DOL), to the source (DOS), and between the lens and the source (DLS):

Equation (19)

where GN represents Newton's constant.
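As a rough numerical illustration, the critical surface mass density can be evaluated with the standard lensing expression $\Sigma _{\rm crit}=c^{2}D_{OS}/(4\pi G_N D_{OL}D_{LS})$, which is assumed here to match Equation (19). The sketch takes angular diameter distances in physical Mpc and returns $M_{\odot }\,{\rm pc}^{-2}$; the function names and the example distances are hypothetical.

```python
import math

C_KM_S = 299792.458   # speed of light [km/s]
G_MPC = 4.30091e-9    # Newton's constant [Mpc (km/s)^2 / M_sun]


def sigma_crit(d_ol, d_os, d_ls):
    """Critical surface mass density [M_sun / pc^2] for distances in Mpc."""
    per_mpc2 = C_KM_S**2 / (4.0 * math.pi * G_MPC) * d_os / (d_ol * d_ls)
    return per_mpc2 / 1e12  # 1 Mpc^2 = 1e12 pc^2


def tangential_shear(delta_sigma, d_ol, d_os, d_ls):
    """Shear produced by a surface mass density contrast [M_sun / pc^2]."""
    return delta_sigma / sigma_crit(d_ol, d_os, d_ls)


# Hypothetical lens-source configuration (distances in Mpc).
example = sigma_crit(1000.0, 1500.0, 800.0)
```

For this configuration the critical density is of order a few thousand $M_{\odot }\,{\rm pc}^{-2}$, so a contrast of ΔΣ ∼ 30 $M_{\odot }\,{\rm pc}^{-2}$ would correspond to a shear of about one percent.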

4.3. Relationship between ΔΣ, the Density Field, and Correlation Functions

Consider two different populations characterized, respectively, by δa and δb. The two-point cross-correlation function of δa and δb at comoving position $\vec{r}_{\rm co}$ is given by

Equation (20)

For example, if δg and δdm are, respectively, the overdensities of galaxies and dark matter, then we can characterize their relative distributions via the galaxy-mass cross-correlation function, which is denoted by ξgm and is equal to

Equation (21)

Similarly, ξgg refers to the galaxy autocorrelation function.

In the following, $\vec{r}_{\rm co}$ is the three-dimensional comoving distance, $\vec{r}_{||,{\rm co}}$ is the projected comoving line-of-sight distance, and $\vec{r}_{p,{\rm co}}$ is the projected comoving transverse distance:

Equation (22)

In Paper II, we will employ physical coordinates for g–g lensing measurements whereas for clustering we will use comoving coordinates. In previous work, Mandelbaum et al. (2006b) have used comoving coordinates and Johnston et al. (2007) have used physical coordinates. Therefore, our g–g lensing formulas more closely resemble those of Johnston et al. (2007). The relationship between comoving and physical distances is simply rco = rph(1 + z). In a similar fashion to Equation (22), we can write

Equation (23)

In comoving coordinates, the density field is $\rho (r_{\rm co},z)=\overline{\rho }(1+\xi _{\rm gm}(r_{\rm co},z))$, where $\overline{\rho }=\rho _{c,0} \Omega _{m,0}$ is the average density of matter in the universe. Since ξgm is often expressed in comoving coordinates, we first derive Σ in comoving units (denoted by Σco) and then we transform Σ into physical units (denoted by Σph) before computing ΔΣ. For a lens at redshift zL, the projected surface mass density, Σ, is obtained by integrating the three-dimensional density over the line of sight:

Equation (24)

where rp, co and r||, co refer, respectively, to the comoving transverse and line-of-sight distance from the lens. In principle, this integral should extend from the redshift of the observer (zO) to the redshift of the source (zS). However, ξgm falls off rapidly enough that in practice, the redshift evolution of ξgm can be neglected and integrating only out to a distance of r||, co = 50 Mpc is sufficient. Furthermore, for the purpose of computing ΔΣ, the constant term in Equation (24) can be dropped (due to the subtraction in ΔΣ) and the mean excess projected density $\overline{\Sigma }(r)$ can be approximated by the radial integral:

Equation (25)

The mean excess projected density in physical units is then $\overline{\Sigma }_{\rm ph} =\overline{\Sigma }_{\rm co}\times (1+z_L)^2$.

The average $\overline{\Sigma }$ within radius r is equal to

Equation (26)

Finally, ΔΣ is obtained by combining Equations (18), (25), and (26).
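The chain from ξgm to ΔΣ described in this section can be sketched as two numerical integrals. The code below is a minimal illustration under stated assumptions: the line-of-sight integral is taken symmetrically about the lens out to 50 Mpc, the constant term is dropped, a simple trapezoidal rule is used, and the inner radius `r_min` of the radial integral is an arbitrary numerical cutoff. All function names are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Plain trapezoidal rule on a sampled function."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def sigma_excess(r_p, xi, rho_bar, los_max=50.0, n_los=20001):
    """Excess projected density at transverse comoving radius r_p.

    xi      : callable giving xi_gm at a 3D comoving separation
    rho_bar : mean comoving matter density
    """
    r_los = np.linspace(0.0, los_max, n_los)
    # Factor 2: the integral covers both sides of the lens along the line of sight.
    return 2.0 * rho_bar * trapezoid(xi(np.sqrt(r_p**2 + r_los**2)), r_los)


def delta_sigma(r_p, sigma_of_r, r_min=1e-4, n_r=400):
    """Mean excess density inside r_p minus the excess density at r_p."""
    r = np.logspace(np.log10(r_min), np.log10(r_p), n_r)
    sig = np.array([sigma_of_r(x) for x in r])
    mean_inside = 2.0 / r_p**2 * trapezoid(sig * r, r)
    return mean_inside - sig[-1]


def comoving_to_physical(surface_density_co, z_lens):
    """Convert a comoving surface density to physical units."""
    return surface_density_co * (1.0 + z_lens) ** 2
```

When converting, the radii must be rescaled as well, using rph = rco/(1 + zL), so that the physical ΔΣ is tabulated against physical separations.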

4.4. Analytical Modeling of ξgg and w(θ)

Our model for calculating the autocorrelation function of galaxies is based on the model given in Tinker et al. (2005) (also see Zheng 2004). As described above, the HOD is broken into central and satellite galaxy occupation functions. Thus, in the HOD context, pairs of galaxies come from two distinct terms: pairs within a single halo and pairs between galaxies in two different halos. The total correlation function is

Equation (27)

where 1h and 2h refer to "one-halo" and "two-halo" terms, respectively. The one-halo correlation function is written as

Equation (28)

where $\bar{n}_g$ is the space density of galaxies in the sample being modeled, 〈N(N − 1)〉M is the second moment of the distribution of galaxies within halos as a function of halo mass, and F'(x) is the radial distribution of pairs within the halo normalized to unity. Within a halo, pairs of galaxies can be between the central galaxy and a satellite, or between two satellite galaxies. The radial pair profile is different for these two combinations, thus we express their relative contributions to ξ1hgg(rco) as

Equation (29)

where F'cs is the pair distribution for central–satellite pairs and F'ss is the equivalent for satellite–satellite pairs. The former is related to the density profile of satellite galaxies, and the latter is related to the density profile convolved with itself (analytic expressions for this convolution can be found in Sheth et al. 2001). Here we assume that the radial distribution of satellite galaxies is the same as the dark matter, for which we assume the profile form of Navarro–Frenk–White (NFW; Navarro et al. 1997) using the mass–concentration relation of Muñoz-Cuartas et al. (2010). Since we assume that satellites trace the dark matter, F'cs will be equal to the quantity F'c that we will introduce in Equation (35) and F'ss will be equal to F's.

Central galaxies only exist as one or zero objects in a halo, thus they have no second moment. For the second moment of satellite galaxies, we assume Poisson statistics about 〈Nsat〉, which is in good agreement with results from numerical simulations (Kravtsov et al. 2004; Zheng et al. 2005). Possible deviations from Poisson behavior (Busha et al. 2010; Boylan-Kolchin et al. 2010) mainly affect clustering for galaxy samples where a majority of the satellites originate in halos with Mh < Msat because in this case, 〈Nsat〉 drops to 〈Nsat〉 ≲ 1. Luminous red galaxies (LRGs) fall into this category, for example. Indeed, satellite galaxies in LRG samples mainly originate in Mh < Msat halos because Msat is close to the exponential cutoff in the halo mass function (for LRGs, $M_{\rm sat} \sim 4\times 10^{14}\,M_{\odot }$). However, for the types of samples that we consider in Paper II, deviations from Poisson statistics should not significantly affect the clustering predictions of the HOD.

A detailed description of the two-halo term can be found in Tinker et al. (2005). Briefly, in the regime where r > Rh of massive halos, this term can be expressed as

Equation (30)

where ξm(rco) is the nonlinear matter correlation function and bg is the large-scale bias of galaxies in the sample, and ζ(rco) is the scale dependence of dark matter halo bias. For ξm(rco), we use the fitting function of Smith et al. (2003). For ζ(rco), we use the fitting function of Tinker et al. (2005). The galaxy bias is computed from the HOD by

Equation (31)

where dn/dMh is the halo mass function for which we use Tinker et al. (2008), and b(Mh) is the halo bias function for which we use Tinker et al. (2010).
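The bias calculation is an occupation-weighted average of the halo bias over the mass function, normalized by the galaxy number density. The sketch below assumes that the mass function, halo bias, and total occupation have already been tabulated on a common halo mass grid. It does not implement the Tinker et al. fitting functions, and the function names and toy inputs are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Plain trapezoidal rule on a sampled function."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def galaxy_density_and_bias(m_h, dn_dm, halo_bias, n_tot):
    """Return (n_g, b_g) for a sample with total occupation n_tot(M_h).

    m_h       : halo mass grid
    dn_dm     : halo mass function dn/dM_h on that grid
    halo_bias : halo bias b(M_h) on that grid
    n_tot     : <N_cen> + <N_sat> on that grid
    """
    n_g = trapezoid(n_tot * dn_dm, m_h)
    b_g = trapezoid(halo_bias * n_tot * dn_dm, m_h) / n_g
    return n_g, b_g


# Toy inputs: a power-law mass function and a bias that rises with mass.
m_grid = np.logspace(11.0, 15.0, 200)
dn_dm_toy = m_grid ** -2.0
bias_toy = 1.0 + (m_grid / 1e13) ** 0.5
n_g_all, b_g_all = galaxy_density_and_bias(m_grid, dn_dm_toy, bias_toy, np.ones_like(m_grid))
n_g_hi, b_g_hi = galaxy_density_and_bias(m_grid, dn_dm_toy, bias_toy, (m_grid > 1e13).astype(float))
```

In this toy setup, restricting the occupation to massive halos lowers the number density and raises the bias, which is the qualitative behavior expected for high stellar mass samples.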

In the regime where r < Rh, Equation (30) breaks down due to the effects of halo exclusion; i.e., the effect that the center of one halo cannot exist within the virial radius of another halo (and still be considered a "two-halo" pair). This is explained in detail in Tinker et al. (2005). Because the mass function and bias relation used in this analysis are taken from numerical results based on spherical overdensity (SO) halo catalogs (Tinker et al. 2008, 2010), the halo exclusion must be modified to match this halo definition. In the SO halo finding algorithm of Tinker et al. (2008), halos are allowed to overlap so long as the center of one halo is not contained within the radius of another halo. Thus, the minimum separation of two halos with radii R1 ≥ R2 is R1, rather than the sum of the two radii, as done in Tinker et al. (2005). For a projected statistic like w(θ), this makes only a small difference in clustering at the one-halo to two-halo transition, but significantly speeds up computation of the two-halo term.

Once we have calculated ξgg(rco) for a given HOD model, we compute the observable w(θ) by

Equation (32)

where N(z) is the normalized redshift distribution of the galaxy sample, rco is the comoving radial coordinate at redshift z, and $dr_{\rm co}/dz = (c/H_0)/\sqrt{\Omega _{\rm m}(1+z)^3+\Omega _\Lambda }$.
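The distance-redshift factor quoted above is straightforward to evaluate. The following sketch implements the expression for drco/dz as written in the text and integrates it to obtain the comoving distance; the cosmological parameter values passed in are left to the caller, and the function names are hypothetical. With H0 = 100 km s−1 (Mpc/h)−1, distances are returned in h−1 Mpc.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light [km/s]


def dr_dz(z, omega_m, omega_l, h0=100.0):
    """Comoving distance per unit redshift, (c/H0) / sqrt(Om (1+z)^3 + OL)."""
    z = np.asarray(z, dtype=float)
    return (C_KM_S / h0) / np.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)


def comoving_distance(z, omega_m, omega_l, h0=100.0, n_steps=2001):
    """Integrate dr/dz from 0 to z with the trapezoidal rule."""
    grid = np.linspace(0.0, z, n_steps)
    y = dr_dz(grid, omega_m, omega_l, h0)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(grid)))
```

A projection such as w(θ) needs both quantities: `comoving_distance` converts an angle into a transverse separation at each redshift, and `dr_dz` relates the redshift distribution N(z) to a distribution in comoving distance.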

Figure 5 shows the breakdown of the angular correlation function into the one-halo and two-halo terms for low-mass and high-mass galaxy samples.

Figure 5. Angular correlation function for two different stellar mass thresholds. The filled points are the full HOD calculation for w(θ). The different curves break the calculation into its distinct parts: the solid curve is the one-halo term and the dotted curve is the two-halo term. We further break the one-halo term into the relative contribution of central–satellite galaxy pairs (dash-dot curve) and satellite–satellite galaxy pairs (long-dash curve). For more massive galaxy samples, the one-halo term is more prominent and it is dominated by central–satellite pair counts.

4.5. Analytical Modeling of ξgm and ΔΣ

We have shown in Section 4.3 that the lensing observable, ΔΣ, can be obtained from ξgm by performing two integrals (Equations (25) and (26)). Following the approach of Yoo et al. (2006), ξgm is computed from the HOD and ΔΣ is obtained by combining Equations (18), (25), and (26). Thus, ΔΣ is fully specified given our model. Note that the calculation of ξgm is performed in comoving units and then the projections of Equations (25) and (26) to obtain ΔΣ are performed in physical units (to match our measured g–g lensing signal which is computed in physical units in Paper II).

Typically, ξgm is decomposed into a one-halo and a two-halo term,

Equation (33)

where the one-halo term represents galaxy–matter pairs from single halos and is dominant on small scales (≲1 Mpc) and the two-halo term corresponds to pairs from distinct halos and is dominant on larger scales (≳1 Mpc). The one-halo term is obtained according to

Equation (34)

Further details about the origin of Equation (34) are presented in the Appendix.

The product $\overline{n}_g F^{\prime }$ is split into two terms:

Equation (35)

where F'c is linked to the density profiles of dark matter halos (see the Appendix) and F's is related to the convolution of the dark matter density profile and the satellite galaxy distribution. To calculate F'c we assume spherical NFW profiles truncated at Rh and we adopt the Muñoz-Cuartas et al. (2010) mass–concentration relation for a WMAP5 cosmology. To calculate F's we assume that the satellite galaxy distribution follows the dark matter distribution. Given this assumption, F's is simply related to the convolution of the NFW profile with itself.

In this paper, we neglect the contribution to ΔΣ from sub-halos; Yoo et al. (2006) have shown this component to be negligible at the 10% level.

We calculate the two-halo term as described in Yoo et al. (2006), with two major exceptions. First, as stated in the previous section, we are using the halo exclusion from the SO halo definition. Second, because the halo mass function of Tinker et al. (2008) and the halo bias function of Tinker et al. (2010) are normalized such that integrating over all Mh produces the mean matter density and a bias of unity, there is no need to employ the "break mass" from Yoo et al. (2006, see their Equation (16)). In the limit where r > Rh of massive halos,

Equation (36)

analogous to Equation (30).

In addition to the two terms presented above, we add another component to the modeling of ΔΣ which is absent in Yoo et al. (2006), namely the contribution to ΔΣ from the baryons of the central galaxy which can be non-negligible on very small scales (<50 kpc). Although the baryons typically follow Sérsic profiles (Sérsic 1963), at the scales of interest for this study, well above a few effective radii (>20 kpc), the lensing contribution of the baryons can be modeled by a simple point source, scaled to 〈M*〉, the average stellar mass of the galaxies in the sample:

Equation (37)

In total, the final g–g lensing signal is modeled as the sum of three terms: ΔΣtot = ΔΣstellar + ΔΣ1h + ΔΣ2h. Note that the one-halo and two-halo terms can also be decomposed into central and satellite contributions, but for simplicity, we have grouped these terms together.
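For a point mass, the surface density vanishes at any r > 0 while the mean surface density inside r is M/(πr²), so the lensing contrast of the stellar term falls as r−2. The sketch below assumes that this is the form taken by Equation (37), with the mass set to 〈M*〉; the function names and the example values are hypothetical.

```python
import math


def delta_sigma_stellar(r_mpc, mean_mstar):
    """Point-mass lensing term [M_sun / pc^2].

    r_mpc      : physical transverse separation [Mpc]
    mean_mstar : average stellar mass of the lens sample [M_sun]
    """
    r_pc = r_mpc * 1e6
    return mean_mstar / (math.pi * r_pc**2)


def delta_sigma_total(ds_stellar, ds_1h, ds_2h):
    """Sum of the stellar, one-halo, and two-halo contributions."""
    return ds_stellar + ds_1h + ds_2h


# Hypothetical example: <M_*> = 1e11 M_sun evaluated at 50 kpc.
example = delta_sigma_stellar(0.05, 1e11)
```

For this example the stellar term is roughly 13 $M_{\odot }\,{\rm pc}^{-2}$ at 50 kpc and drops by a factor of 100 at 500 kpc, which illustrates why it matters only on the smallest scales.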

In order to illustrate the various terms that contribute to the g–g lensing signal, we have plotted the signal in Figure 6.

Figure 6. Illustration of the various terms that contribute to a g–g lensing signal. The solid black curve shows the total g–g lensing signal which can be decomposed into a sum of terms that contribute at various scales. On small scales (∼10 kpc), the signal is dominated by the baryonic content of galaxies represented by the red curve (dotted). At intermediate radii (∼200 kpc), dark matter halos come into play as shown by the blue (dashed) and magenta (dash-dot) curves. The former represents an NFW profile and the latter is due to a contribution from satellite galaxies. On large scales (>3 Mpc), the g–g lensing follows the dark matter linear autocorrelation function scaled by the bias factor as depicted by the gray curve (triple-dot-dashed). It is interesting to note that the total g–g lensing signal is roughly a power law, despite the fact that the various contributing components deviate strongly from power laws.

5. INFLUENCE OF THE MODEL PARAMETERS ON THE OBSERVABLES

In the previous section, we have outlined how our model can be used to analytically predict the SMF, g–g lensing, and clustering signals. We will now investigate how each parameter in the model affects the three observables. For this exercise, we adopt the best-fit model parameters for 0.48 < z < 0.74 from Paper II and we vary each parameter in turn by 2σ around the best-fit model. For this section, we assume that $\sigma _{\log M_{*}}$ is constant with M*. We also assume that αsat is constant and we set αsat = 1 in this section since this is also the assumption that we make in Paper II. In total, we therefore study the effects of 10 parameters: M1, M*, 0, β, δ, γ, $\sigma _{\log M_{*}}$, βsat, Bsat, βcut, and Bcut. The results are shown in Figures 7 and 8 and are described in further detail below.

Figure 7. Effect of varying each of the 10 parameters in turn by 2σ around the best-fit model from Paper II for 0.48 < z < 0.74, where σ is the fitted error on the parameter in question. In the left panels, we show how the predicted w(θ) signal varies for log10(M*) > 11.1 (a high stellar mass threshold). In the middle panels, we show how the predicted ΔΣ signal varies for 11.29 < log10(M*) < 12 (a high stellar mass bin) and the right panels show the predicted variations for the SMF down to log10(M*) = 9.3. To highlight the effects of the parameter variation, in all three cases we have plotted the model minus the fiducial best-fit model divided by the fiducial best-fit model. The data point with an error bar represents the typical error bar for each observable for a COSMOS-like survey.

Figure 8. Same as Figure 7 but the left panels now show w(θ) for log10(M*) > 9.3 (a low stellar mass threshold) and the middle panels now show ΔΣ for 9.8 < log10(M*) < 10.3 (a low stellar mass bin). The right panels (depicting the SMF) are identical to Figure 7.

5.1. Effect of Parameters on the SMF

The influence of each model parameter on the observed SMF is shown in the right-hand column of Figure 7 (the same column is reproduced in Figure 8). The data point with an error bar represents the typical error bar for a COSMOS-like survey, where the error bar includes sample variance computed from a series of mock catalogs (described in the following section). The first point worth mentioning here is that the errors on the SMF are relatively small compared to those on the clustering and the lensing. It is always the case that the measurement of a one-point statistic from a given set of data is more precise than the measurement of a two-point (or higher) statistic. This implies that the SMF will in general play an important role in constraining the parameters of the SHMR. However, in follow-up work we will investigate the sensitivity of this probe combination to, for example, cosmological parameters and models of modified gravity. In these studies, the clustering and the g–g lensing will play a critical role despite their typically larger error bars.

A second noteworthy point in Figure 7 is that the SMF appears to be sensitive to all 10 parameters, whereas certain parameters such as M*, 0 and β have very little effect on the clustering and lensing signals. Coupled with the fact that the SMF has fairly small error bars, this implies that the SMF will have quite a lot of constraining power on the overall model compared to the g–g lensing for example.

The effects of M1 and M*, 0 on the SMF are fairly intuitive. M1 roughly induces an up/down shift in the amplitude of the SMF and M*, 0 corresponds to a left/right shift in the SMF. A larger value of Bsat implies that there are fewer satellite galaxies in high-mass halos for a given stellar mass threshold. Thus, we see that an increase in Bsat corresponds to a decrease in the amplitude of the SMF due to the fact that the contribution from satellite galaxies has decreased. Similar arguments apply to Bcut. From Figure 7 we can anticipate that if the SMF were only used to constrain this model, degeneracies would occur between Bsat, Bcut, and M1. Fortunately, the satellite parameters have a significant effect on both the clustering and the g–g lensing so this degeneracy should be broken when all three probes are used in conjunction.

In Figure 9, we highlight the effects of four particular parameters on the SMF: β, γ, δ, and $\sigma _{\log M_{*}}$. The β parameter affects the low-mass slope of the SMF so that a larger value of β corresponds to a steeper low-mass slope in the SMF. The γ parameter (we recall that this controls the transition region of the SHMR as can be seen from Figure 1) has an interesting effect since it regulates a "plateau" feature in the SMF at log10(M*) ∼ 10.5. In fact, this feature in the SMF has been noticed already and discussed in detail, for example, in Drory et al. (2009). In Drory et al. (2009), this feature was described as a "dip" because at these scales, the SMF is below the best-fit Schechter function. However, we note that "dip" is a somewhat misleading name for this feature since it could also be taken to mean that dN/dlog10(M*) does not decrease monotonically with M*. A close inspection of our SMFs in Paper II shows that there is no evidence in the data for an actual "dip" in dN/dlog10(M*). Instead, the data are more consistent with a flattening of dN/dlog10(M*) around log10(M*) ∼ 10.5. Therefore, we would like to suggest that this feature should be described as a "plateau" in the SMF rather than a "dip."

Figure 9. Effects of β, δ, γ, and $\sigma _{\log M_{*}}$ on the shape of the observed SMF. Left upper panel: β determines the low-mass slope of the SMF. Right upper panel: δ affects the knee and the high-mass slope of the SMF. Left lower panel: γ affects the knee of the SMF but γ also affects the "plateau" feature that has been observed in the SMF at 10 < log10(M*) < 10.5 (see discussion and references in Drory et al. 2009). Right lower panel: $\sigma _{\log M_{*}}$ affects the high-mass slope of the SMF but also affects the "plateau" feature. A larger value of $\sigma _{\log M_{*}}$ leads to an inflated observed SMF at the high-mass end. This effect is also commonly referred to as Eddington bias.

In Figure 10 we show the link between the dark matter halo mass function, the SHMR, and the SMF. In this figure, we have used the fact that dN/dlog10M* = dN/dlog10Mh × (dlog10Mh/dlog10M*) so that the various functions can be linked "by eye" by drawing a box between the four different panels. We have illustrated how to link the various functions with the dashed lines in Figure 10 at the scale of the pivot stellar mass. Figure 10 shows that the "plateau" feature is caused by the transition that occurs in the SHMR at $M_h \sim 10^{12}\,M_{\odot }$ from a low-mass power-law regime to a sub-exponential function at higher stellar mass.

Figure 10. Illustration of the link between the halo mass function (upper left panel), the SHMR (upper right panel), and the SMF (lower right panel). The contribution to the SMF from satellite galaxies has been subtracted so that the SMF in the lower right panel only depicts the contribution from central galaxies. The dotted line shows the SMF when $\sigma _{\log M_{*}}=0$. This figure is designed so that all three quantities can be linked by drawing a "box" between the four panels as shown here by example with the dashed lines at the location of the pivot stellar mass. We note that this will only work when the SMF with zero scatter (dotted line) is considered. This figure shows, for example, that the location of the pivot stellar mass is coincident with the location of the "plateau" feature in the SMF.

Finally, the scatter in stellar mass at fixed halo mass has a noticeable effect on the SMF at the high-mass end, which is also commonly referred to as Eddington bias. A larger value of $\sigma _{\log M_{*}}$ will lead to an inflated observed SMF at large stellar masses.
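The Eddington bias can be illustrated with a toy calculation. The sketch below convolves a Schechter-like SMF with Gaussian scatter in log10(M*) and compares the result with the scatter-free function. The Schechter parameters, the integration grid, and the function names are hypothetical and are not fitted to any data in this paper.

```python
import numpy as np


def schechter_log(log_m, log_m_star=10.8, alpha=-1.2, phi_star=1e-3):
    """Toy Schechter function per dex in stellar mass (hypothetical parameters)."""
    x = 10.0 ** (log_m - log_m_star)
    return np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)


def scattered_smf(log_m, sigma, n_grid=4001):
    """Toy SMF at log_m after Gaussian scatter of width sigma [dex]."""
    x = np.linspace(7.0, 13.5, n_grid)  # intrinsic log10 stellar mass grid
    kernel = np.exp(-0.5 * ((log_m - x) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
    y = schechter_log(x) * kernel
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


# Ratio of scattered to intrinsic abundance at a low and a high stellar mass.
ratio_low = scattered_smf(9.5, 0.2) / schechter_log(9.5)
ratio_high = scattered_smf(11.6, 0.2) / schechter_log(11.6)
```

Where the SMF is shallow the scatter moves comparable numbers of galaxies in both directions and the ratio stays close to unity; beyond the knee far more galaxies scatter up than down, so the ratio rises well above one and grows with the assumed scatter.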

5.2. Effect of Parameters on G–G Lensing

The g–g lensing signal is dominated by the central one-halo term roughly on scales below 0.3 Mpc and by the satellite one-halo term roughly on scales above 0.3 Mpc and below a few Mpc (see Figure 6). Thus, the four parameters that regulate the satellite occupation function (Bsat, Bcut, βsat, βcut) have a strong scale-dependent effect on the g–g lensing signal. For example, Bsat controls the power-law amplitude of 〈Nsat〉. A smaller value of Bsat will reduce the ratio $M_{\rm sat}/f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$ and will therefore increase the number of satellites in a sample. This will lead to an increase in the g–g lensing signal from 0.3 to 1 Mpc due to the increased amplitude of the one-halo satellite term. Similarly, a smaller value of βsat will also reduce the ratio $M_{\rm sat}/f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$ and consequently will increase the one-halo satellite term. This effect is more pronounced in Figure 8, which illustrates a low stellar mass sample, compared to Figure 7, which illustrates a high stellar mass sample.

Another parameter worth discussing here is $\sigma _{\log M_{*}}$. Figure 7 demonstrates that $\sigma _{\log M_{*}}$ has a stronger effect on the lensing signal for high stellar mass samples compared to low stellar mass samples. As discussed in Section 2.2, this is simply due to the fact that the data are binned according to M*. The observables are therefore sensitive to the scatter in halo mass at fixed stellar mass, $\sigma _{\log M_h}$. At fixed $\sigma _{\log M_{*}}$, $\sigma _{\log M_h}$ will increase with M*. As a result, the effects of scatter are more prominent in g–g lensing measurements for high stellar mass samples.

5.3. Effect of Parameters on Clustering

To first order, the SMF and the clustering of galaxies are tethered; more massive halos are both more clustered and less abundant. This is also true of galaxies because rare, massive galaxies live in such halos. If the amplitude of the SMF increases, the clustering as a function of stellar mass decreases. This is especially true at masses above the knee in the SMF. From Figure 9, increasing δ or $\sigma _{\log M_{*}}$ increases the abundance of high-mass galaxies. Given that the number of halos is fixed, this can only mean that massive galaxies are occupying less massive, less clustered halos.

There are several parameters that have a direct influence on the clustering of galaxies without changing the SMF appreciably. The parameters Bsat and βsat are the most important in this regard. They control the "shoulder" in the HOD, defined conceptually as the increase in halo mass, relative to $f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$, before satellites begin to enter the sample. Quantitatively, this is expressed as the ratio $M_{\rm sat} /f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$, as shown in Equation (12). Reducing this ratio increases the number of satellite galaxies in a sample, which in turn increases the large-scale bias of a sample and significantly enhances the clustering within the one-halo term. The parameters Bcut and βcut have a more subtle effect on clustering. If the cutoff mass, defined by Equation (14), is below $f^{-1}_{\textsc {shmr}}(M_\ast ^{t_1})$, then Mcut has no effect on clustering. But as Mcut increases, satellite galaxies are removed from low-mass halos. If the density of satellites is held fixed, increasing Mcut redistributes satellites into more massive halos. This will increase the large-scale bias and change the shape of the one-halo term such that the correlation function deviates from a pure power-law form (see the Appendix in Zheng et al. 2009).

6. MOCK CATALOGS, SAMPLE VARIANCE, AND COVARIANCE

In this section, we construct mock catalogs in order to investigate the effects of sample variance and covariance associated with measurements of g–g lensing, clustering, and the SMF. Sample variance occurs due to the finite nature of the volume encompassed by any given survey. Because of limited volume, any given survey may yield a biased measurement of the number density of galaxies and halos compared to the full universe. The error bars on all three observables must therefore reflect this additional source of error. Also, the data points in all three observables will be correlated to some degree. Consider the SMF for example. A region of space with high matter density will have an increased abundance of galaxies of nearly all masses. For w(θ), because it is a projection of ξgg(r)—which is itself a correlated quantity—multiple physical scales will contribute to each bin in θ. For ΔΣ, we will show that the data points are correlated on scales where satellite galaxies contribute to the lensing signal.

To investigate both the sample variance and the covariance associated with all three observables, we use numerical simulations to construct a series of mock catalogs for a COSMOS-like survey. Since the volume of COSMOS is relatively small, the effects of variance and covariance will be quite apparent (whereas the effects would decrease if we simulated a larger fiducial survey) and so COSMOS is well suited for our purpose. In addition, we will also use these mock catalogs in Paper II to analyze the actual COSMOS data.

COSMOS-like mocks are created from a single simulation (named "Consuelo") 420 $h^{-1}$ Mpc on a side, resolved with $1400^3$ particles, with a particle mass of $1.87 \times 10^{9}\,h^{-1}\,M_\odot$.$^{8}$ This simulation can robustly resolve halos with masses above $\sim 10^{11}\,h^{-1}\,M_\odot$ and is part of the LasDamas suite$^{9}$ (C. McBride et al. in preparation). We create mocks for three redshift intervals: z1 = [0.22, 0.48], z2 = [0.48, 0.74], and z3 = [0.74, 1]. For each redshift interval, we construct a series of mocks created from random lines of sight through the simulation volume that have the same area as COSMOS and the same comoving length for the given redshift slice. This yields 405 independent mocks for the z1 bin, 172 mocks for the z2 bin, and 109 mocks for the z3 bin. For each redshift bin, mocks are created from the simulation output at the median redshift of the bin.

Halos within the simulation are identified with the friends-of-friends halo finder (Davis et al. 1985) with a linking length of b = 0.2. For each redshift interval, halos are populated with our best-fit model from Paper II. We use the mock-to-mock variance and covariance to estimate a covariance matrix for the SMF, for w(θ) (using a series of stellar mass thresholds), and for ΔΣ (using a series of stellar mass bins). Although the data points of the different observables will be correlated with one another to some degree (as will the different samples in w(θ) and ΔΣ), we ignore that cross-covariance because we do not have enough simulation volume to estimate the full ("uber") covariance matrix of all N data points in each redshift bin.
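The mock-to-mock estimate described above can be illustrated with a short sketch. This is not the pipeline used for the COSMOS mocks; it assumes only that each mock yields one measurement vector (for example, the SMF in a set of mass bins) and applies the standard sample covariance. The function names and the toy data (a common density shift plus independent noise) are hypothetical and chosen purely for illustration.

```python
import numpy as np

def mock_covariance(measurements):
    """Sample covariance across mocks; input shape is (n_mocks, n_bins)."""
    x = np.asarray(measurements, dtype=float)
    n_mocks = x.shape[0]
    delta = x - x.mean(axis=0)              # deviation of each mock from the mean
    return delta.T @ delta / (n_mocks - 1)  # unbiased estimator

def correlation_matrix(cov):
    """Correlation coefficients r_ij = C_ij / sqrt(C_ii * C_jj)."""
    sigma = np.sqrt(np.diag(cov))
    return cov / np.outer(sigma, sigma)

# Toy data (hypothetical): 405 mocks, 8 mass bins. A shared density
# fluctuation moves every bin together; independent noise is added per bin.
rng = np.random.default_rng(0)
n_mocks, n_bins = 405, 8
common = rng.normal(0.0, 0.10, size=(n_mocks, 1))
noise = rng.normal(0.0, 0.05, size=(n_mocks, n_bins))
toy_log_smf = -2.5 + common + noise

cov = mock_covariance(toy_log_smf)
corr = correlation_matrix(cov)
```

In this toy setup the shared fluctuation produces large positive off-diagonal correlation coefficients, which is the qualitative behavior expected when sample variance shifts the whole SMF up or down.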

Figure 11 shows the correlation coefficient matrix for the SMF in three redshift bins for a COSMOS-like survey. The first-order effect of sample variance on the SMF is to correlate all of the data points so that globally, the SMF will shift up and down for different realizations of a COSMOS-like survey.

Figure 11. Correlation coefficient matrix for the SMF in three redshift bins for a COSMOS-like survey. The effect of sample variance on the SMF is to correlate all of the data points so that globally, the SMF will shift up and down for different realizations of a COSMOS-like survey.

Figure 12 shows the correlation coefficient matrix for the galaxy clustering for 0.22 < z < 0.48 and for three stellar mass thresholds: log10(M*) > 9.3, log10(M*) > 10.3, and log10(M*) > 11.1. The data are more correlated at larger scales where galaxy pairs come from the two-halo term. As shown earlier, clustering at these scales is proportional to the matter clustering ξm(r). Patches of the universe that exist in an over- or under-density tend to have higher or lower clustering in their matter. This will be reflected in the clustering of the halos and thus the two-halo term for the galaxies. In the one-halo term, Poisson fluctuations of the number of satellites become more important and the data are less correlated at these scales. Overall, as the density of the galaxy sample becomes smaller, shot noise will dominate on all scales. This can be seen in the progression from left to right in the examples in Figure 12.

Figure 12. Correlation coefficient matrix for galaxy clustering for 0.22 < z < 0.48 and for several stellar mass thresholds.

Figure 13 illustrates the effect of sample variance on g–g lensing signals for various stellar mass bins and for 0.22 < z < 0.48. Figure 14 shows the associated correlation coefficient matrices. The key point to note here is that the sample variance for g–g lensing is dominated by the one-halo satellite term on scales of about 100 kpc to 1 Mpc. The impact of this term becomes more apparent in galaxy samples with lower stellar masses as the contribution from the one-halo central term decreases. The fact that the one-halo satellite term has a large sample variance compared to the one-halo central term can be understood as follows. Consider a sample of galaxies in a given stellar mass bin. The galaxies that are satellites in this sample will tend to live in more massive halos than the galaxies that are centrals (this can be seen in Figure 4 for example). Since more massive halos are more rare than less massive halos at fixed survey volume, this explains the large one-halo satellite sample variance.

Figure 13. Illustration of the effects of sample variance on the g–g lensing signals for a COSMOS-like survey. The light gray region represents the 2σ variation between mocks and the dark gray region represents the 1σ variation between mocks. Sample variance mainly affects the one-halo satellite term of the g–g lensing signal. For example, the large variance that can be seen in the bottom right panel (8.7 < log10(M*) < 9.2) from 100 kpc to 1 Mpc is due to the one-halo satellite term. This can be explained by the fact that, for a given stellar mass sample, the parent halos of satellite galaxies are more rare at fixed volume than the halos of central galaxies.

Figure 14. Correlation coefficient matrix for g–g lensing for 0.22 < z < 0.48 and for several bins in stellar mass.

Finally, we also use mock catalogs to estimate the effects of the integral constraint (IC; Groth & Peebles 1977) on clustering measurements for a small area survey. Due to spatial fluctuations in the number density of galaxies, the mean correlation function measured from an ensemble of samples will be smaller than the correlation function measured from a single contiguous sample whose volume equals the summed volume of the ensemble. This attenuation of w(θ) becomes relevant on angular scales that are significant compared to the sample size. For large surveys like the SDSS, the IC is not an issue on scales of interest. For a pencil-beam survey like COSMOS, however, the IC must be taken into account when modeling the clustering. We estimate the IC correction to our w(θ) measurements through the use of the mock galaxy distributions described previously. The results are shown in Figure 15. For COSMOS, our fitting functions for the IC correction are

Equation (38)

and

Equation (39)

for 0.22 < z < 0.48 and 0.48 < z < 0.74, respectively, where θ is expressed in arcseconds. For 0.74 < z < 1 there is sufficient volume such that $f_{\rm IC} = 1$. We note that these fitting functions are only valid for θ < $10^3$ arcseconds.
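As an illustration of how such a correction can be measured from mocks, the sketch below computes the ratio of w(θ) from a large-area mock to the mean w(θ) of many small-area mocks, which is the quantity plotted in the bottom right panel of Figure 15. It is a minimal sketch and not the authors' code: the power-law w(θ), the additive offset used to mimic the IC, and the noise level are all hypothetical toy values.

```python
import numpy as np

def ic_ratio(w_large_area, w_small_mocks):
    """Ratio of the large-area w(theta) to the mean over small-area mocks.

    w_large_area  : shape (n_theta,)
    w_small_mocks : shape (n_mocks, n_theta)
    """
    mean_small = np.asarray(w_small_mocks, dtype=float).mean(axis=0)
    return np.asarray(w_large_area, dtype=float) / mean_small

# Toy inputs (hypothetical): a power-law w(theta) for the large-area mock,
# and small-area mocks biased low by a constant offset plus small noise.
theta = np.logspace(0.5, 3.0, 10)   # arcseconds
w_true = 2.0 * theta ** -0.8
ic_offset = 0.002                   # hypothetical additive IC offset
rng = np.random.default_rng(1)
w_small = w_true - ic_offset + rng.normal(0.0, 1e-4, size=(172, theta.size))

f_ic = ic_ratio(w_true, w_small)
```

In this toy case the ratio stays close to unity on small scales and grows toward large θ, where the offset becomes comparable to w(θ) itself. A smooth function fitted to such a ratio would play the role of the fitting functions quoted above.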

Figure 15. Mean and dispersion of the angular clustering of galaxies as a function of stellar mass threshold in our mock COSMOS simulations. The results shown here are for 0.48 < z < 0.74. The filled circles in each panel show w(θ) for a single mock with ≳ 10 times the area of COSMOS itself. The difference between the large-area mock and the mean of the COSMOS mocks is due to the IC. The bottom right panel shows the ratio of w(θ) for the large-area mock to the mean of the COSMOS mocks. The solid curve is a fitting function to account for the IC. For 0.22 < z < 0.48, the effect of the IC is stronger, while for 0.74 < z < 1.0 there is sufficient volume such that the IC correction is unity on all scales measured.

7. SUMMARY AND CONCLUSIONS

The goal of this paper is to develop the theoretical framework necessary to combine measurements of galaxy–galaxy lensing, galaxy clustering, and the galaxy SMF into a single and more robust probe of the galaxy–dark matter connection. We have achieved this goal by introducing several key modifications to the standard HOD framework. To begin with, we have modified the standard HOD model so as to fit all three probes simultaneously and independently of the selected binning scheme. Next, since we are interested in the galaxy–dark matter connection, we have also modified the HOD model so as to specifically include the SHMR. In a companion paper (Leauthaud et al. 2011b) we demonstrate that the model presented here provides an excellent fit to galaxy–galaxy lensing, galaxy clustering, and SMFs measured in the COSMOS survey from z = 0.2 to z = 1.0.

Nonetheless, while the promise of combined dark matter probes in studying galaxy formation, gravity, and cosmology is clear, we must ensure that our parametric description of the SHMR is sophisticated enough to capture its possible behavior. There are a number of questions that remain to be answered in order to achieve this goal. For example, is P(M*|Mh) well described by a log-normal distribution, and is the scatter in P(M*|Mh) constant or does it vary with halo mass? Do the parameters that describe P(M*|Mh) vary with redshift and galaxy type? Can we marginalize over uncertainties related to the shapes and concentrations of dark matter halos? What exactly do we learn from various probe combinations? The challenges are steep, but with increasingly large data sets from surveys such as the Dark Energy Survey, the Large Synoptic Survey Telescope, the Hyper Suprime-Cam survey, and Euclid, refined and sophisticated models can be built and constrained by the data. Although the model presented in this paper is sophisticated enough to describe COSMOS data, it is clear that further refinements will be necessary given the statistical precision of upcoming surveys. Improving models such as the one presented in this paper by using insights provided, for example, by semi-analytic models of galaxy formation and dark matter N-body simulations, is clearly a worthy pursuit.

We thank Jaiyul Yoo for help with the Appendix and Kevin Bundy for useful discussions and for reading the manuscript. A.L. acknowledges support from the Chamberlain Fellowship at LBNL and from the Berkeley Center for Cosmological Physics. This research received partial support from the U.S. Department of Energy under contract No. DE-AC02-76SF00515. R.H.W. and P.S.B. received additional support from NASA Program HST-AR-12159.A, provided through a grant from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS5-26555. M.T.B. and R.H.W. also thank their collaborators on the LasDamas project for critical input on the Consuelo simulation, which was performed on the Orange cluster at SLAC.

APPENDIX: FURTHER DETAILS ON THE ORIGIN OF EQUATION (34)

We consider it useful to provide some more details on the origin of Equation (34) and in particular on the link between the NFW profile and F'c and F's. This might be useful for those who are not familiar with the notations of galaxy clustering studies. To begin with, consider a central galaxy that is associated with an NFW dark matter halo at redshift zL and with halo mass M. The NFW profile, $\rho _{\textsc {nfw}}$, is given by

Equation (A1)

where δ is a characteristic (dimensionless) density and Rs is the NFW scale radius. The relation between δ and the NFW concentration parameter c is

Equation (A2)

where Δ is a chosen overdensity (for example, Δ is often set to 200). The NFW radius is denoted by Rh and is equal to Rh = c × Rs. The projected surface mass density of this lens, Σ, is computed by taking the integral of $\rho _{\textsc {nfw}}$ over the line of sight:

Equation (A3)

Analytical expressions for the projection of $\rho _{\textsc {nfw}}$ to Σ can be found in Wright & Brainerd (2000) for example.
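For readers who prefer a numerical check, the sketch below evaluates an NFW profile and its line-of-sight projection. It assumes the commonly used forms $\rho_{\rm nfw}(r) = \delta\,\rho_{\rm ref} / [(r/R_s)(1 + r/R_s)^2]$ and $\delta = (\Delta/3)\,c^3 / [\ln(1+c) - c/(1+c)]$, with an untruncated profile. The reference density `rho_ref`, the integration grid, and all function names are assumptions made for illustration, and the exact conventions of Equations (A1)–(A3) may differ.

```python
import numpy as np

def delta_c(c, overdensity=200.0):
    """Characteristic density for concentration c (assumed standard form)."""
    return (overdensity / 3.0) * c**3 / (np.log(1.0 + c) - c / (1.0 + c))

def rho_nfw(r, r_s, c, rho_ref, overdensity=200.0):
    """NFW density at 3D radius r, in units of rho_ref."""
    x = r / r_s
    return delta_c(c, overdensity) * rho_ref / (x * (1.0 + x) ** 2)

def sigma_nfw_numeric(R, r_s, c, rho_ref, overdensity=200.0):
    """Projected surface density at projected radius R.

    Integrates rho along the line of sight z with a trapezoid rule on a
    log-spaced grid; the factor 2 accounts for the z < 0 half.
    """
    z = np.concatenate(([0.0], np.logspace(-4, 4, 4001))) * r_s
    rho = rho_nfw(np.sqrt(R**2 + z**2), r_s, c, rho_ref, overdensity)
    return 2.0 * np.sum(0.5 * (rho[1:] + rho[:-1]) * np.diff(z))

# Hypothetical halo in arbitrary units.
r_s, c, rho_ref = 0.2, 5.0, 1.0
sigma_at_rs = sigma_nfw_numeric(r_s, r_s, c, rho_ref)
```

Under these assumptions the analytic projection at R = Rs reduces to $2\,\delta\,\rho_{\rm ref} R_s / 3$, which gives a convenient consistency check on the numerical integral.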

Instead of a single galaxy, now consider an ensemble of central galaxies characterized by the central occupation function 〈Ncen〉. We define $\overline{n}_c$ such that

Equation (A4)

The probability that a galaxy in this selection lives in a halo of mass M is

Equation (A5)
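The two quantities just introduced can be sketched numerically. The example assumes the usual HOD forms, namely that $\overline{n}_c$ is the integral of the halo mass function weighted by 〈Ncen〉 and that the probability of finding a selected galaxy in a halo of mass M is that weighted mass function divided by $\overline{n}_c$. The mass function, the error-function form of 〈Ncen〉, and every parameter value below are hypothetical toy choices, not the model of this paper.

```python
import math
import numpy as np

def toy_mass_function(m, m_star=1e13):
    """Hypothetical dn/dM: power law with an exponential cutoff."""
    return 1e-3 / m_star * (m / m_star) ** -1.9 * np.exp(-m / m_star)

def n_cen(m, log_m_min=12.0, sigma=0.2):
    """Hypothetical central occupation: smooth step in log10(M)."""
    arg = (np.log10(m) - log_m_min) / (math.sqrt(2.0) * sigma)
    return 0.5 * (1.0 + np.vectorize(math.erf)(arg))

def trapezoid(y, x):
    """Plain trapezoid rule, to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

masses = np.logspace(10, 16, 2001)                 # halo masses, arbitrary units
weight = toy_mass_function(masses) * n_cen(masses)

n_c = trapezoid(weight, masses)                    # number density of centrals
p_of_m = weight / n_c                              # probability density in M
mean_log_m = trapezoid(p_of_m * np.log10(masses), masses)
```

By construction `p_of_m` integrates to unity over the mass grid. Averaging any halo-level quantity, such as a projected NFW profile, with this weight gives the ensemble average over the galaxy sample.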

The average surface mass density of the galaxy ensemble is

Equation (A6)

Let us now define F'c such that

Equation (A7)

Combining Equations (A6) and (A7), we obtain

Equation (A8)

Finally, Equation (34) is obtained by considering a galaxy sample that contains both central and satellite galaxies. In this case, $\overline{n}_g$ is defined as

Equation (A9)

F's is defined in a similar fashion to Equation (A7) but $\rho _{\textsc {nfw}}$ is replaced with the convolution of the NFW profile with itself. Analytic expressions for the convolution of the truncated NFW profile with itself can be found in the Appendix of Sheth et al. (2001).

Footnotes

  • Scatter is quoted as the standard deviation of the logarithm base 10 of the stellar mass at fixed halo mass.

  • Note that some authors consider the comoving critical surface mass density, which has an extra factor of $(1 + z)^{-2}$ with respect to ours.

  • In this paragraph, numbers are quoted for $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$.

  • Details regarding this simulation can be found at http://lss.phy.vanderbilt.edu/lasdamas/simulations.html.
