A DIRECT DYNAMICAL MEASUREMENT OF THE MILKY WAY'S DISK SURFACE DENSITY PROFILE, DISK SCALE LENGTH, AND DARK MATTER PROFILE AT 4 kpc ≲ R ≲ 9 kpc

Jo Bovy; Hans-Walter Rix

doi:10.1088/0004-637X/779/2/115

1. INTRODUCTION

The mass distribution of the Milky Way inside R ∼ 10 kpc is uncertain in many respects. While good measurements exist of the rotation curve, these do not allow the separation of the major Galactic components: bulge, halo, and stellar and gas disks. Hence the relative contributions as well as the density profiles of these components are still very uncertain. Determining these is fundamental to our understanding of the Milky Way's formation and evolution and to evaluating the importance of dark matter within the visible extent of galaxies. Because the Milky Way is in many ways an anchor point for empirical relations used to determine the physical properties of external galaxies, this ultimately has a large impact on the study of the low-redshift universe (Rix & Bovy 2013).

The structure, size, and mass of the Milky Way's disk in particular are poorly known. Measurements of the radial scale length of the disk over the last few decades have produced values anywhere in between 2 kpc and 5 kpc (see, e.g., Sackett 1997 for an overview). While photometric measurements have improved with the advent of wide-area surveys with accurate photometry, systematic uncertainties in the distance scale significantly limit the precision and accuracy of photometrically measured scale lengths (e.g., Jurić et al. 2008), and such measurements only trace the radial luminosity profile of the tracer stars rather than the mass profile of the disk. This is especially problematic as it has recently become clear that the stellar disk contains abundance-distinct subcomponents with scale lengths ranging from 2 kpc to >4.5 kpc (Bovy et al. 2012d), and in particular that the thicker disk components in the Milky Way have a much shorter scale length than the thin disk (Bensby et al. 2011; Bovy et al. 2012d; Cheng et al. 2012). Thus, more than ever it is important to measure the mass-weighted radial profile, which can be achieved using the dynamics of tracer populations. There are currently no plausible dynamical measurements of the scale length of the disk. While there are global potential fits to a variety of dynamical data (e.g., McMillan 2011), these cannot measure the scale length without strong prior assumptions as there have been no dynamical data and models that constrain the disk scale length to date; because of this, the disk's scale length is the most important unknown in disentangling the contributions from the disk and the dark halo to the mass distribution near the disk (Dehnen & Binney 1998).

In external galaxies, rotation curves are often decomposed into disk and halo contributions using the maximum-disk hypothesis (van Albada & Sancisi 1986): the disk component is assumed to contribute the maximum amount possible without exceeding the rotation curve anywhere; in practice this means that such maximal disks contribute 85% ± 10% of the rotation velocity at the peak of the disk rotation curve, allowing for the contribution of a bulge and a cored halo (Sackett 1997). Observational evidence for or against the maximum-disk hypothesis is scant: e.g., Courteau & Rix (1999) argue against spiral galaxy disks being generically maximal based on the residuals from the Tully–Fisher relation, while detailed gas kinematics shows that some disks, particularly fast rotators, are maximal (e.g., Weiner et al. 2001; Kranz et al. 2003; see Section 6.3 for more discussion of this). Ultimately, this is due to the fact that directly measuring and decomposing the mass profile of external disks is close to impossible. However, because mass-to-light ratios for external spiral galaxies are calibrated using the maximum-disk hypothesis (e.g., Bell & de Jong 2001), the fate of the maximum-disk hypothesis is intimately connected with many conclusions regarding the stellar-mass content of the low-redshift universe (e.g., Bell et al. 2003; Li & White 2009). Whether the Milky Way has a maximum disk is unknown (Sackett 1997), and determining whether it does is important since it can provide a robust data point for the discussion of the maximum-disk hypothesis and because it could be used to calibrate dynamical measurements of the radial profile of external galaxies. Because the local midplane density and the local surface density are reasonably well measured (e.g., Kuijken & Gilmore 1989b; Holmberg & Flynn 2000), whether the Milky Way's disk is maximal hinges on whether it has a short stellar disk mass scale length.

Because the rotation curve reflects the sum of the radial in-plane accelerations contributed by the various Galactic components, precise measurements of the radial profile and mass of only the disk also allows the radial profile of the dark-matter halo to be constrained. Measurements of the local density of dark matter have significantly improved recently (Bovy & Tremaine 2012; Zhang et al. 2013), but there are no dynamical constraints on the radial profile of dark matter at R ≲ 10 kpc. The large microlensing optical depth measured toward the bulge argues for a relatively shallow halo profile (Binney & Evans 2001), although this constraint has been relaxed compared to Binney & Evans (2001) due to the downward revision of the optical depth (e.g., Popowski et al. 2005) and of the local dark matter density since 2001 (Bovy & Tremaine 2012; Zhang et al. 2013). Measuring the radial profile of dark matter in the inner Milky Way would be invaluable for analyses of dark-matter indirect detection toward the Galactic center and for comparing the Milky Way to the predictions from cosmological simulations: A Milky-Way-like galaxy is presumed to have an inner dark-matter profile close to the Navarro–Frenk–White (NFW; Navarro et al. 1997) profile seen in simulations or steeper due to the adiabatic contraction that is expected to occur when the Milky Way's massive disk forms within the dark matter halo.

In this article, we constrain the mass distribution in the inner Milky Way by measuring the vertical force at |Z| ≈ 1 kpc as a function of radius between R ≈ 4 kpc and ≈9 kpc. Because the vertical force and the surface density are approximately proportional near the disk (Kuijken & Gilmore 1989a; Bovy & Tremaine 2012), this quite directly measures the radial surface-density profile near the disk, allowing the mass-weighted scale length of the disk to be obtained without any prior assumptions derived from stellar number counts; we measure the radial profile of the mass rather than that of the luminosity.

We measure the vertical force as a function of radius using modeling based on simple, parameterized, six-dimensional (phase-space), three-action distribution-functions (DFs) developed by Binney (2010, 2012b) and McMillan & Binney (2013). In Ting et al. (2013) we showed that this approach should work well when applied to subsets of Galactic disk stars with very similar elemental abundances, so-called mono-abundance populations (MAPs). In a recent series of articles, we (Bovy et al. 2012c, 2012d) demonstrated that these constitute structurally simple disk populations. Such populations can hence be described well by few-parameter families of DFs, which allows a full likelihood-based exploration of the joint posterior distribution function (PDF) of dynamical (gravitational potential) and DF parameters. In this article, we apply this methodology for the first time to a real data set, G-type dwarfs from the Sloan Digital Sky Survey (SDSS)/SEGUE survey (Yanny et al. 2009). Each MAP, i.e., each distinct, abundance-selected stellar subset, can act as a separate tracer population of the gravitational potential. We focus our efforts in this article on measuring the dynamical quantity of interest that is most robustly determined for each MAP. This turns out to be the vertical force at 1.1 kpc at a single Galactocentric radius for each MAP, where the radius is determined directly from the dynamical PDF for each MAP; because the MAPs span a wide range in orbital distributions, this leads to measurements between R ≈ 4 kpc and R ≈ 9 kpc for 43 MAPs.

We use these new measurements in combination with constraints on the rotation curve, primarily from terminal-velocity measurements, and with measurements of the local vertical profile of the vertical force to constrain the mass distribution at R ≲ 10 kpc and its decomposition into disk and dark halo contributions. Because our measurements are at 1.1 kpc, we strongly constrain the properties of the Milky Way's disk, finding a relatively short mass scale length of 2.15 ± 0.14 kpc and deriving precise measurements of the local column density of stars and of the total mass in the disk. We find that the Milky Way's disk is maximal, providing about 70% of the rotational support at 2.2 radial scale lengths (V_{c, disk}/V_{c, total} = 0.83 ± 0.04; the rotational support is (V_{c, disk}/V_{c, total})² = 0.69 ± 0.06). As the halo contributes very little to the rotation curve and the vertical force at 1 kpc, we only weakly constrain the radial profile of the dark matter, but we nevertheless obtain a first constraint: α < 1.53 at 95% confidence for a model in which ρ_DM(r; ≈R₀)∝1/r^α.

The outline of this article is as follows. We describe the SEGUE G-dwarf data in Section 2. In Section 3 we explain in detail the various steps involved in the dynamical modeling of abundance-selected populations of stars using a DF model. We present the results from this modeling in Section 4.1 and discuss how we turn these results into a measurement of the surface density to 1.1 kpc at a single radius for each MAP. We show the resulting surface-density and vertical-force profiles in Section 4.2 and discuss systematics associated with this measurement in Section 4.3. In Section 5, we use these new measurements together with additional dynamical data to fit mass models of the inner Milky Way. We discuss the implications of these new measurements in Section 6. We recapitulate our main conclusions in Section 7 and look ahead to the dynamical analysis of Gaia data. Appendix A gives technical details of the calculation of the effective survey volume required in the dynamical modeling, and Appendix B presents various detailed comparisons between the spatial and kinematical distribution of the data and that of our best-fit dynamical models. We recommend that readers not interested in the technical details of the dynamical models read the introduction of Section 3, which briefly summarizes the main ingredients of the modeling, then Section 4.2 for the measurement of the vertical-force profile, as well as Section 5 and following sections for the implications of these new measurements for the mass distribution in the inner Milky Way. In this article, we use BO12a to collectively refer to the set of Bovy et al. (2012b, 2012c, 2012d) articles, appending b,c, or d to indicate specific articles in that series.

2. SDSS/SEGUE DATA

The data used in this article are largely the same as the SEGUE G-dwarf data used in BO12a. We summarize the main properties of these data with a focus on what is new. We refer the reader to BO12c and BO12d for a full description of the G-star data used here.

2.1. G-dwarf Data

The G-star sample used in this analysis is identical to the one used in BO12d, but we now also use the stars' line-of-sight velocities and proper motions explicitly to calculate the vertical velocities and errors, as in BO12c. For a detailed description of the sample, we refer to BO12c, BO12d, Yanny et al. (2009), and Lee et al. (2011b). The G-dwarf sample is the largest of the systematically targeted subsamples in SEGUE to explore the Galactic disk. They are the brightest tracers whose main-sequence lifetime is larger than the expected disk age (≈10 Gyr) for all but the most metal-rich stars with [Fe/H] ≳ 0.2 dex. Their rich metal-line spectrum affords velocity determinations good to ∼5 to 10 km s⁻¹(Yanny et al. 2009), as well as good abundance ([α/Fe], [Fe/H]) determinations (δ_[α/Fe] ∼ 0.1 dex, δ_[Fe/H] ∼ 0.2 dex;⁴ Lee et al. 2008a, 2008b; Allende Prieto et al. 2008; Schlesinger et al. 2010; Lee et al. 2011a; Smolinski et al. 2011; BO12c). The distances to the sample stars range from 0.6 kpc to nearly 4 kpc, with stars somewhat closer to the disk plane being sampled by the lines of sight at lower Galactic latitudes. The effective minimal distance limit of the stars (600 pc) implies that the vertical heights below one scale height (|Z| < h_z) of the thinner disk components is not well sampled by the G-dwarf data. As in BO12a, we employ a signal-to-noise ratio cut of S/N > 15.

We transform distances and velocities to the Galactocentric rest-frame by assuming that the Sun's displacement from the midplane is 25 pc toward the north Galactic pole (Chen et al. 2001; Jurić et al. 2008), that the Sun is located at 8 kpc from the Galactic center (e.g., Ghez et al. 2008; Gillessen et al. 2009; Bovy et al. 2009), and that the Sun's vertical peculiar velocity with respect to the Galactic center is 7.25 km s⁻¹ (Schönrich et al. 2010). In particular, we note that none of the results in this article are affected by the details of the Sun's peculiar motion in the plane of the Galaxy. We use the uncertainties on the distances, line-of-sight velocities, and proper motions to calculate the uncertainty in the vertical velocities, but otherwise we assume that distance uncertainties are negligible (which is a good approximation because the typical distance uncertainty of ∼10% is much smaller than any Galactic spatial gradient). Uncertainties are typically 10% in distance, ∼3.5 mas yr⁻¹ in proper motion (Munn et al. 2004), and 5–10 km s⁻¹ in line-of-sight velocity. The distance errors are obtained by marginalizing over the color and apparent-magnitude errors that enter the photometric distance relation (Equation (A7) of Ivezić et al. 2008, I08 hereafter, see also BO12d), which is assumed to have an intrinsic scatter of 0.1 mag in the distance modulus. These uncertainties lead to tangential velocity uncertainties that are ∼50 km s⁻¹ at 3 kpc and a smaller contribution than that to the vertical velocity uncertainty (depending on Galactic latitude). Stars in our sample at such large distances are either far from the plane as part of a population with a large velocity dispersion or closer to the Galactic center, where the velocity dispersion is larger by ∼50% because it rises exponentially with a scale length of ∼7 kpc (BO12c). Therefore, the intrinsic dispersion is larger than the typical velocity-measurement uncertainty for all populations analyzed below.

Motivated by the analysis in BO12a and Ting et al. (2013), we treat MAPs as independent tracer populations of the Galactic potential, whose fits provide independent constraints on the potential. Operationally MAPs are defined here as stars whose [Fe/H] and [α/Fe] lie within bins 0.1 dex and 0.05 dex wide, as in BO12a.

One significant change to the distances that we made for the current analysis is that we have placed the distances on the scale of An et al. (2009, An09 hereafter), which was used in many other analyses of the SEGUE G and K-dwarf samples (e.g., Schlesinger et al. 2012; Zhang et al. 2013) and which appears to provide more accurate distances from the kinematical tests of Schönrich et al. (2012). These distances are typically ∼7% smaller than the distances calculated using the I08 relation. Because our method for correcting for color and metallicity biases in the sample selection requires a simple photometric distance relation (see Section 3 and Appendix A below), we apply a single distance factor to the stars in each MAP calculated as the mean distance modulus offset between the I08 and An09 relations for the stars in the MAP, taking into account the MAP's g − r and [Fe/H] distributions (see Figure 1 of BO12d for the difference between the I08 and An09 distance scales as a function of g − r and [Fe/H]). These distance factors are shown in Figure 1. As discussed below, we have checked that our measurements of the surface density as a function of radius are affected by only a few percent by the difference between these two distance scales and are therefore much less sensitive to distance systematics than might be assumed (but we stress that the An09 distances are more accurate and therefore the correct distances to use).

**Figure 1.** Distance factors applied to the Ivezić et al. (2008) photometric distance relation to put it on the distance scale of An et al. (2009). These factors are primarily a function of [Fe/H] but are calculated based on the mean [Fe/H] of each MAP.
Download figure:
Standard image High-resolution image

Because dynamical modeling such as that performed in this article essentially succeeds by finding the gravitational potential that makes the observed velocities consistent with their observed spatial distribution in a dynamical steady state, the detailed understanding of the data sampling function that was crucial for BO12b and BO12d is again an important ingredient for the present analysis. We refer the reader to Appendix A of BO12d for a full description of the SEGUE G-star selection function.

We begin with a SEGUE G-dwarf sample that has about 28,000 stars with acceptably well-determined measurements, but here we only use those 23,767 stars that fall within well-populated "mono-abundance" bins in the ([Fe/H],[α/Fe]) plane (BO12d; a bin is well-populated when it contains more than 100 stars and the maximum number of stars in a bin is 789). Furthermore, we exclude MAPs whose best-fit scale length in BO12d was larger than 4.85 kpc. We exclude these MAPs because their radial profile cannot be well-fit with the simple DF ansatz that we make below. These MAPs are all [Fe/H]-poor and have [α/Fe] close to solar; they are primarily associated with the outer disk (see Figure 7 in BO12d). This removes 17 MAPs and leaves 16,653 stars in 45 MAPs. For reasons discussed below, we remove 2 MAPs from this sample, such that the final data sample consists of 16,269 stars in 43 MAPs.

3. ANALYSIS METHOD

While stellar dynamics overall is a long- and well-established field (Binney & Tremaine 2008), the best ways to model large discrete data sets with full phase-space information are still being developed (cf. McMillan & Binney 2012; Rix & Bovy 2013). In what follows, we analyze each MAP separately and independently with the end goal of constraining the total surface density to Z = 1.1 kpc— $\Sigma (R,|Z| \le 1.1\,\mathrm{kpc}) \equiv \int _{-1.1\,\mathrm{kpc}}^{1.1\,\mathrm{kpc}}{d}Z \rho (R,Z)$ , abbreviated as Σ_1.1(R)—at a single Galactocentric radius for each MAP. In this and the following section we describe all of the steps involved in fitting each MAP as a population in a dynamical steady state in order to measure the gravitational potential and how we translate this into a single, robust measurement of Σ_1.1 at a single radius for each MAP. Our measurements of Σ_1.1(R) assume that the rotation curve is flat; if this is not the case, then we show in Section 4.3 that we still robustly measure the vertical force at 1.1 kpc.

The basic ingredients of the fitting procedure are as follows.

1.
Each MAP is fit as a population of stars drawn from a quasi-isothermal DF (qDF; Binney 2010; Binney & McMillan 2011; Ting et al. 2013). This form for the DF has a number of free parameters related to the radial density profile of the tracer population and the radial and vertical velocity dispersions and how they change with R. The qDF is described in detail in Section 3.1.
2.
The qDF is a function of the orbital actions J = (J_R, J_ϕ, J_Z) ≡ (J_R, L_Z, J_Z). For the fit we only need the (x, v) → J transformation, which for the disk orbits in our sample can be efficiently, accurately, and precisely computed using a Stäckel approximation of the potential for any reasonable form of the potential near the Galactic plane (Binney 2012a). This procedure is briefly summarized in Section 3.1.
3.
To calculate the orbital actions we require a full three-dimensional (3D) model for the Milky Way's potential. We use a bulge + stellar disk + gas disk + dark halo model that a priori has many free parameters. For the present application we fix at fiducial values all but two of these parameters, the stellar disk scale length, and the relative halo-to-disk contribution to the radial force (= circular velocity squared) at R₀. These two parameters are varied to scan through a wide range of Σ_1.1(R) including all a priori reasonable values for the range of Galactocentric radii that we are interested in (4 kpc < R < 10 kpc). In Section 4.3 we show explicitly that changing the parameters that are held fixed in the fiducial fit does not significantly impact our measurement of Σ_1.1(R), and our measurement of the vertical force at 1.1 kpc at a single Galactocentric radius is particularly free from systematics. Our model for the Milky Way's potential is described in detail in Section 3.2.
4.
We determine the probability of the position–velocity data (the likelihood) for each set of DF + potential parameters using an expanded version of the methodology described in Section 3.1 of BO12d, carefully accounting for the sample selection. The calculation of the effective survey volume consists of a nine-dimensional integral, whose fast computation is discussed in Appendix A. We only use the vertical velocity component of the data and marginalize the six-dimensional qDF over the velocity components in the plane, because the planar motions do not constrain the vertical force, and they may be affected by nonaxisymmetric streaming motions. The fitting procedure is discussed in Section 3.3.
5.
For each MAP we determine the full joint PDF for the DF and the potential parameters. By marginalizing this over the DF parameters we obtain constraints on the potential that include uncertainties in the determination of the correct DF. The details on how we explore the PDF are given in Section 3.4. From the properties of the DF we determine the Galactocentric radius R at which Σ_1.1(R) is best constrained for each MAP, and we calculate Σ_1.1(R) and its uncertainty δΣ_1.1(R) from the PDF. This is described in Section 4.1.

We now describe these various steps in more detail.

3.1. Distribution Function Model

We use the qDF first proposed by Binney (2010) and later refined by Binney & McMillan (2011) to fit the dynamics of a single MAP. BO12a found that the radial and vertical density profiles of individual MAPs are well represented by single exponentials and that the vertical velocity distribution is isothermal, i.e., the vertical velocity dispersion is constant with height above the plane. Further investigation of the Galactocentric radial velocities showed that the radial velocity dispersion is also isothermal (unpublished). As shown in Ting et al. (2013), these properties of MAPs can be reproduced by assuming that the DF is a qDF of the form of Binney & McMillan (2011). Therefore, in what follows we fit each MAP as a single qDF. This is a simple way of including abundance information to separate different populations into the general approach followed by Binney (2012b), where disk stars with any abundance were fit using a superposition of a large number of qDFs. In the current application, abundances are used to group stars that presumably can be fit using just a single qDF. The requirement that a MAP can be modeled using a single qDF drives our fine-grained pixelization of the ([Fe/H],[α/Fe]) plane.

The form for the qDF as it is used here is that of Binney & McMillan (2011):

$\begin{equation} \mathrm{qDF}(\mathbf {x},\mathbf {v}) = f_{\sigma _R}(J_R,L_Z) \times \frac{\nu }{2\pi \sigma _Z^2}\,\exp \left(-\frac{\nu J_Z}{\sigma _Z^2(R_c)}\right), \end{equation} \tag{ 1 }$

where $f_{\sigma _R}$ is given by

$\begin{eqnarray} f_{\sigma _R}(J_R,L_Z) & =& \left.\frac{\Omega n(R_c)}{\pi \sigma _R(R_c)^2 \kappa }\right|_{R_c}\times \left[1+\tanh (L_z/L_0)\right]\nonumber\\ && \times \,\exp \left(-\frac{\kappa J_R}{\sigma _R^2(R_c)}\right). \end{eqnarray} \tag{ 2 }$

Here κ, Ω, and ν are the epicycle, circular, and vertical frequencies (Binney & Tremaine 2008); R_c is the radius of the circular orbit with angular momentum L_Z. We include the factor in Equation (2) containing the tanh to eliminate stars on counterrotating orbits following Binney & McMillan (2011), but we also explicitly set the DF to zero for counter-rotating orbits. n, σ_R, and σ_Z are free functions of R_c, which indirectly determine the radial profiles of the tracer density, the radial velocity dispersion, and the vertical dispersion. However, it should be noted that these are merely scale profiles, as opposed to the actual, physical profiles that can be calculated by taking the appropriate moments of the DF. In principle these three functions can take any form, but based on observations of the Milky Way and external galaxies (e.g., Lewis & Freeman 1989; BO12c) we assume that each of these functions is an exponential

$\begin{eqnarray} n(R_c) & \propto & \exp \left(-R_c/h_R\right),\qquad\quad\ \ \ \ \end{eqnarray} \tag{ 3 }$

$\begin{eqnarray} \ \sigma _R(R_c) & =& \sigma _{R,0}\,\exp (-\left[R_c-R_0\right]/h_{\sigma _R}), \end{eqnarray} \tag{ 4 }$

$\begin{eqnarray} \sigma _Z(R_c) & =& \sigma _{Z,0}\,\exp (-\left[R_c-R_0\right]/h_{\sigma _Z}). \end{eqnarray} \tag{ 5 }$

To evaluate the qDF for a given position and velocity we need to calculate the actions for that position and velocity, (x, v) → J, given Φ(R, Z). We calculate the actions using the approximate algorithm of Binney (2012a), where the actions are calculated using an (implicit) approximation of the potential as a Stäckel potential. For the sake of a self-contained description, we summarize the relevant parts of the algorithm here. This algorithm proceeds by introducing prolate confocal coordinates (u, v)

$\begin{equation} R = \Delta \,\sinh u\,\sin v; \qquad z = \Delta \,\cosh u \cos v. \end{equation} \tag{ 6 }$

Binney (2012a) discusses how the choice of Δ influences the precision with which the actions are calculated. Based on this discussion, we fix Δ to 0.45 R₀ for all potentials. The momenta in this new coordinate system are given by

$\begin{eqnarray} p_u & = \Delta \,(p_R\,\cosh u \,\sin v + p_z\,\sinh u \cos v), \end{eqnarray} \tag{ 7 }$

$\begin{eqnarray} p_v & = \Delta \,(p_R\,\sinh u \,\cos v - p_z\,\cosh u \sin v). \end{eqnarray} \tag{ 8 }$

If the potential is of the Stäckel form Φ_S(u, v) = (U(u) − V(v))/(sinh ²u + sin ²v), then these momenta are functions of their associated coordinate only:

$\begin{eqnarray} \frac{p_u^2}{2\,\Delta ^2} & = E\,\sinh ^2 u - I_3 - U(u) - \frac{L_Z^2}{2\,\Delta ^2\sinh ^2 u}, \end{eqnarray} \tag{ 9 }$

$\begin{eqnarray} \frac{p_v^2}{2\,\Delta ^2} & = E\,\sin ^2 v + I_3 + V(v) - \frac{L_Z^2}{2\,\Delta ^2\sin ^2 v},\quad\ \end{eqnarray} \tag{ 10 }$

where E is the energy and I₃ is a constant of the motion (the third integral). This allows one to calculate the actions through a single integration:

$\begin{equation} J_R= \frac{1}{\pi } \int _{u_\mathrm{min}}^{u_{\mathrm{max}}} {d}u \,p_u(u),\quad J_Z= \frac{2}{\pi } \int _{v_\mathrm{min}}^{\pi } {d}v \,p_v(v).\quad \end{equation} \tag{ 11 }$

The crucial ingredient of Binney's (2012a) algorithm is that although it appears from Equations (9)–(10) that we require an explicit form for U(u) and V(v)—and thus an explicit representation of the potential in Stäckel form, as in Sanders (2012)—if the potential is close to a Stäckel potential then U(u) can be rewritten as U(u₀) + δU(u), where δU(u) ≡ U(u) − U(u₀) and u₀ is a reference value for u that does not affect the calculated actions. Similarly V(v) can be rewritten as V(π/2) + δV(v). These δU(u) and δV(v) can be calculated directly from the potential as

$\begin{eqnarray} \delta U(u) & =& (\sinh ^2 u+\sin ^2 v)\Phi (u, v)\qquad\qquad\qquad\quad\ \ \end{eqnarray} \tag{ 12 }$

$\begin{eqnarray} && \qquad - (\sinh ^2 u_0 +\sin ^2 v)\Phi (u_0, v),\qquad\qquad \end{eqnarray} \tag{ 13 }$

$\begin{eqnarray} \delta V(v) & =& \cosh ^2 u \Phi (u,\pi /2)-(\sinh ^2 u+\sin ^2 v)\Phi (u, v), \nonumber\\ \end{eqnarray} \tag{ 14 }$

as (sinh ²u + sin ²v)Φ(u, v) = U(u) − V(v) for a Stäckel potential. If the potential is close to a Stäckel potential, then the dependence of δU(u) on v and that of δV(v) on u is small. Thus they can be calculated for a chosen reference value of v and u, respectively.

While we cannot calculate U(u₀) and V(π/2) directly without an explicit representation of the potential in Stäckel form, we can compute the combinations I₃ + U(u₀) and I₃ + V(π/2) from the current position and velocity of the orbit (the position and velocity at which the actions are required) if we set u₀ to the value corresponding to the given position

$\begin{eqnarray} I_3 + U(u_0) & = E \sinh ^2 u - \frac{p_u^2}{2\,\Delta ^2} - \frac{L_Z^2}{2\,\Delta \sinh ^2 u}, \end{eqnarray} \tag{ 15 }$

$\begin{eqnarray} I_3 + V(\pi /2) & = -E \sin ^2 v + \frac{p_v^2}{2\,\Delta ^2} + \frac{L_Z^2}{2\,\Delta \sin ^2 v}. \quad \end{eqnarray} \tag{ 16 }$

In what follows we always use this algorithm to calculate the actions rather than constructing an interpolation look-up table for a grid of integrals of the motion. The range of the integrals in Equations (11) is determined using Brent's (1973) method for finding the zeros of the integrands. The integrals are then calculated using 10th order Gauss–Legendre quadrature.

The distribution of the actions of the 16,269 sample stars calculated in a fiducial potential is shown in Figure 2. The diagonal edges in the distribution of J_R and L_Z (the radial action and the angular momentum; see below) are due to the fact that circular orbits far from the solar neighborhood (i.e., outside of the bounds of our sample) are not represented in our data set. Similarly, in the distribution of the vertical action J_Z, stars with small vertical actions (∼small vertical energies) are missing in our sample because of the effective lower limit on the vertical height of the stars in our sample; stars with low vertical action do not reach high enough above the plane of the disk to be included in our sample.

The DF as a function of the actions for a fiducial set of parameters is shown in Figure 3. A comparison between this distribution and the distribution of the data actions in Figure 2 shows that the data's action distribution is heavily affected by selection biases related to the spatial sampling of the data.

**Figure 3.** Quasi-isothermal DF as a function of the actions. The qDF of Equation (1) is shown as a function of *L_Z* and *J_R* for *J_Z* = 0 and as a function of *J_R* and *J_Z* for *L_Z*/8 kpc = 200 km s⁻¹. The qDF parameters are chosen to mimic those of one of the α-old MAPs considered in this article: *h_R* = 2 kpc, σ_Z(R₀) = 66 km s⁻¹, σ_R(R₀) = 55 km s⁻¹, $h_{\sigma _Z} = 7\,\mathrm{kpc}$ , and $h_{\sigma _R} = 8\,\mathrm{kpc}$ (see Figure 5 for how these scale parameters relate to the physical scale length and dispersion parameters). The potential is the same as that used in Figure 2.
Download figure:
Standard image High-resolution image

The density, mean velocity, velocity dispersions, etc, of the qDF can be calculated as velocity moments of the 6D qDF:

$\begin{eqnarray} \quad \nu _*(R,Z) & = \int {d}\mathbf {v}\,\mathrm{qDF}(R,Z,\mathbf {v}), \end{eqnarray} \tag{ 17 }$

$\begin{eqnarray} \nu _*(R,Z)\,\langle \mathbf {v}\rangle (R,Z) & = \int {d}\mathbf {v}\,\mathbf {v}\,\mathrm{qDF}(R,Z,\mathbf {v}),\qquad \end{eqnarray} \tag{ 18 }$

$\begin{eqnarray} \nu _*(R,Z)\,\langle \mathbf {v}^2 \rangle (R,Z) & = \int {d}\mathbf {v}\,\mathbf {v}^2\,\mathrm{qDF}(R,Z,\mathbf {v})\,;\qquad \end{eqnarray} \tag{ 19 }$

these can be straightforwardly transformed into the velocity dispersions. In what follows these velocity moments are calculated using 20th order Gauss–Legendre quadrature between −4σ_R(R) and 4 σ_R(R) in the radial velocity V_R, −4 σ_Z(R); 4 σ_Z(R) in the vertical velocity V_Z; and 0 and 3 V_c(R₀)/2 in the rotational velocity V_T. The dispersions σ_R(R) and σ_Z(R) in these expressions for the ranges are calculated using the scale dispersion profiles from Equations (3)–(5). We use actions calculated using the adiabatic approximation in a few instances below; in this case we use 20th order Gauss–Legendre quadrature between 0 and 4σ_{R, Z}(R) because in the adiabatic approximation the qDF is perfectly symmetric in V_R and V_Z separately, which is not the case when using the more correct Stäckel actions (see below).

In Figure 4 we show the vertical and radial tracer density profile for a fiducial set of qDF parameters, as well as the tilt of the velocity profile and the vertical velocity dispersion as a function of height. We calculate these using the Stäckel approximation for the actions as well as the simpler, but less precise, adiabatic approximation (e.g., Binney 2010). The vertical and radial density profiles are close to exponential, but the vertical profile gradually flattens closer to the plane of the disk. The inset in the leftmost panel shows that the tracer density flares slightly. The right panel shows that the qDF is indeed close to isothermal as the vertical velocity dispersion only increases by a few km s⁻¹ over 5 kpc; the same holds for the radial velocity dispersion (see also Ting et al. 2013).

The difference between the Stäckel and adiabatic approximations is small for most observables, except for the tilt of the velocity ellipsoid. The tilt is equal to zero when using the adiabatic approximation because this assumes that vertical and planar motions are close to decoupled, implying that there is no correlation between vertical and radial oscillations along the orbit. The Stäckel approximation does not make this assumption, and it correctly captures the correlation between the radial and vertical oscillations for stars that reach large heights above the plane (≳ 500 pc); the gray line shows the tilt if the velocity ellipsoid is pointing toward the Galactic center and the tilt included in the qDF falls only slightly short of this. Also shown in the third panel of Figure 4 is the measurement of the tilt of the velocity ellipsoid from Siebert et al. (2008), which agrees well with the tilt that results from the qDF if the actions are calculated in the Stäckel approximation.

As discussed above, the density and dispersion scale profiles in Equations (3)–(5) are not the physical profiles, and the actual density and dispersion profiles as calculated using Equations (17) are slightly different. In what follows we use the results from BO12a for the dispersions at R₀ to determine a likely range for the qDF parameters for each MAP used in a grid-based exploration of the joint DF–potential PDF. Therefore, we need to determine how the scale profiles relate to the physical profiles. In Figure 5 we show these relations calculated in a potential similar to potential II in Binney & Tremaine (2008). We use this relation to calculate the approximate scale profiles that give physical profiles similar to the best-fit profiles found in BO12a for the radial density and the radial and vertical velocity dispersion at R₀. These estimates are shown in Figure 6.

**Figure 5.** Physical properties of the quasi-isothermal DFs vs. their scale parameters. The left panel shows the radial scale length measured over ΔR = R₀/3 around R₀ vs. the scale length scale parameter for various values of the radial dispersion. The middle and right panels show the vertical and radial dispersion, respectively, at (R, Z) = (R₀, 800 pc) vs. the dispersion scale parameters for various values of the q radial scale length scale parameter. We assume that $\sigma ^{\mathrm{in}}_R / \sigma ^{\mathrm{in}}_Z = \sqrt{3}$ , $h_{\sigma _Z} = 7\,\mathrm{kpc}$ , and $h_{\sigma _R} = 8\,\mathrm{kpc}$ for all DF models in this figure.
Download figure:
Standard image High-resolution image

**Figure 5.** Physical properties of the quasi-isothermal DFs vs. their scale parameters. The left panel shows the radial scale length measured over ΔR = R₀/3 around R₀ vs. the scale length scale parameter for various values of the radial dispersion. The middle and right panels show the vertical and radial dispersion, respectively, at (R, Z) = (R₀, 800 pc) vs. the dispersion scale parameters for various values of the q radial scale length scale parameter. We assume that $\sigma ^{\mathrm{in}}_R / \sigma ^{\mathrm{in}}_Z = \sqrt{3}$ , $h_{\sigma _Z} = 7\,\mathrm{kpc}$ , and $h_{\sigma _R} = 8\,\mathrm{kpc}$ for all DF models in this figure.
Download figure:
Standard image High-resolution image

**Figure 6.** Estimates of the parameters of the quasi-isothermal DF that describes each MAP, based on the results from the previous, nondynamical fits in BO12c and BO12d. The previous results for the dispersions were smoothed before estimating the DF parameters. The estimated parameters are shown for all MAPs considered in BO12a; only MAPs with BO12d radial scale lengths <4.85 kpc are included in the dynamical modeling in this article (that is, the MAPs that are dark brown in the leftmost panel are excluded).
Download figure:
Standard image High-resolution image

3.2. Milky Way Potential Model

While for the present article we are primarily interested in Σ_1.1(R), the modeling of each MAP using the qDF requires a full 3D model for the Galactic potential. We use a four-component model consisting of a Hernquist bulge with a mass of 4 × 10⁹ M_☉ and a scale radius of 600 pc⁵; a stellar exponential disk component characterized by a (single, effective) scale height z_h and scale length R_d; a spherical power-law dark halo; and a gas disk modeled as an exponential disk with a local surface density of 13 M_☉ pc⁻², a scale height of 130 pc, and a scale length fixed to be twice that of the stellar disk (following Binney & Tremaine 2008). We do not include a stellar halo component, as the mass of the stellar halo is negligible compared to the other components.

In the qDF fits to data of individual MAPs, the circular velocity curve, the epicycle, and the vertical frequencies are calculated for each potential on a logarithmic grid in R between 0.01 R₀ and 20 R₀. The potential is calculated on the same radial grid and on a linear grid in Z between 0 and R₀. These values are interpolated using two-and one-dimensional cubic B-splines, and the interpolating functions are used to first calculate the actions for a given (x, v) and then to evaluate the qDF.

The calculation of the bulge and halo potential and frequencies is straightforward. The potential and forces of an exponential disk with density $\rho (R,Z) = \rho _d\,e^{-R/R_d-|Z|/z_h}$ are calculated using the expressions in Appendix A of Kuijken & Gilmore (1989a), which we repeat here for completeness:

$\begin{eqnarray} \Phi (R,Z) & =& {-}4\pi \,G\rho _d\frac{z_h}{R_d} \int _0^\infty {d}k\,J_0(kR)\,\left(k^2+\frac{1}{R_d^2}\right)^{-3/2}\nonumber \\ & & \times \,\frac{e^{-k|Z|} - k z_h e^{-|Z|/z_h}}{1-(k\,z_h)^2}, \end{eqnarray} \tag{ 20 }$

$\begin{eqnarray} F_R(R,Z) & =& {-}4\pi \,G\rho _d\frac{z_h}{R_d} \int _0^\infty {d}k\,k\,J_1(kR)\,\left(k^2+\frac{1}{R_d^2}\right)^{-3/2}\nonumber \\ & & \times \,\frac{e^{-k|Z|} - k z_h e^{-|Z|/z_h}}{1-(k\,z_h)^2}, \end{eqnarray} \tag{ 21 }$

$\begin{eqnarray} F_Z(R,Z) & =& {-}\mathrm{sign}(Z)\,4\pi \,G\rho _d\frac{z_h}{R_d} \int _0^\infty {d}k\,k\,J_0(kR)\nonumber \\ && \times \,\left(k^2+\frac{1}{R_d^2}\right)^{-3/2}\,\frac{e^{-k|Z|} - e^{-|Z|/z_h}}{1-(k\,z_h)^2},\qquad\quad\ \ \end{eqnarray} \tag{ 22 }$

and similar expressions for the second derivative of the potential. The integrals in these expressions contain strongly oscillating Bessel functions J_i(·), especially when evaluating them near the Galactic plane. We calculate these integrals using 10-point Gauss–Legendre integration between each of the zeros of the relevant Bessel function out to k = 2/z_h (4/z_h for the radial force) for R ⩾ R₀ and k = 2 R₀/(R z_h) (4 R₀/(R z_h) for the radial force) for R < R₀. At large radii (R > 6 R₀) the exponential disk potential is approximated as Keplerian.

The free parameters of the model for the gravitational potential are parameterized using (1) and (2) the stellar disk mass scale length R_d and mass scale height z_h, (3) the relative contribution of the halo to the stellar-disk+halo contribution to the radial force at R₀, (4) the circular velocity V_c at R₀ (this sets the overall amplitude of the potential), and (5) the logarithmic derivative of the circular velocity at R₀ with respect to R, dln V_c(R₀)/dln R (i.e., the "flatness" of the rotation curve). All of the other parameters of the potential can be calculated in terms of these basic parameters. We perform all the calculations involving the actions and qDF by first rescaling the potential such that V_c(R₀) ≡ 1 at R₀ = 1; this requires rescaling the data positions by R₀ and velocities by V_c and rescaling the input parameters of the qDF similarly. The PDF (see below) then needs to be divided by appropriate factors of V_c(R₀) and R₀ to have the correct units.

In the fits of a qDF to individual MAPs below we fix all of the five basic parameters except for the stellar disk scale length and the relative halo-to-disk contribution to the radial force at R₀. This proved necessary because exploring the joint qDF+potential parameters PDF for each MAP is computationally expensive such that two potential parameters is the maximum that can currently be efficiently explored (see below for more discussion of this). In our fiducial model we set the stellar disk mass scale height to 400 pc (this is the mass-weighted mean scale height calculated based on the star-count decomposition of BO12b, i.e., it is the mean of the histogram shown in Figure 2 of BO12b). We also fix the potential amplitude to be such that V_c = 230 km s⁻¹ and the rotation curve to be flat at R₀, i.e., dln V_c(R₀)dln R = 0. While these may appear to be strong assumptions, we stress that the main goal of the MAP fits is to measure the surface density Σ_1.1 at 1.1 kpc at the Galactocentric radius where this is best determined for each MAP. As such, the range of allowed Σ_1.1 is what really matters. Figure 7 shows the range of Σ_1.1(R₀) in the fiducial model; it is clear that any reasonable Σ_1.1(R₀) (cf. Bovy & Tremaine 2012; Zhang et al. 2013) is included in the fiducial model. Figure 8 shows the range of Σ_1.1(R) between 4 kpc and 10 kpc included in the fiducial model, and this is wide enough to encompass all a priori reasonable Σ_1.1(R). We illustrate below that changing the potential parameters that we keep fixed in the fiducial model does not significantly change our results.

**Figure 7.** Range of surface densities spanned by our fiducial model for the Milky Way's potential near the plane of the disk, when varying the two free potential parameters (see Section 3.2): the stellar disk scale length and the relative contribution from the halo to the stellar-disk+halo contribution to the circular velocity at R₀. The colors represent surface densities at R₀, while the black, solid and white, dashed contours indicate lines of constant surface density at 5 kpc and 11 kpc, respectively. The white squares show the actual grid points included in the fiducial model (see Table 1).
Download figure:
Standard image High-resolution image

**Figure 8.** Range of surface densities spanned by our fiducial potential model as a function of radius when varying the two potential parameters. This figure shows the range of surface densities at every R corresponding to the same models as shown in Figure 7. Our fiducial model includes the a priori reasonable range of surface densities at all R between 4 kpc and 10 kpc. A typical value (and the final measurement in Section 5) of Σ_1.1(R₀) is indicated by the solar symbol.
Download figure:
Standard image High-resolution image

3.3. Fitting Procedure

In Section 3.1 of BO12d we described a method for fitting the density profiles of a sample of stars using the individual positions of the stars in a likelihood-based approach and correcting for the effects of various selection biases. Here we expand this methodology to fitting both the positions and velocities to a 6D DF. As described in Section 3.1, we model the DF with a dynamical equilibrium model specified by the qDF, which is a function of the actions J only, that is f(x, v) ≡ f(J[x, v]). We work in the approximation that the selection function has only a spatial (or magnitude) dependence, which is the case for the G-dwarf sample used in this article.

We fit a qDF to the various MAPs by taking into account the fact that the observed stellar number counts do not reflect the underlying (volume-complete) stellar distribution, but are instead strongly shaped by (1) the strongly position-dependent selection fraction of stars with spectra (see Figure 11 in BO12d), (2) the need to use photometric distances to relate the underlying phase-space model to the observed magnitude distribution (as the magnitude-limited SEGUE sample corresponds to a color- and metallicity-dependent distance-limited sample), and (3) the pencil-beam nature of the survey. As in BO12d, we model the observed distribution of stars as a Poisson sampling of the underlying population. The details of this method are the same as that described in Section 3.1 of BO12d, and we refer the reader to that paper for a detailed discussion.

Since the photometric distance estimates D depend on the g − r color, metallicity [Fe/H], and apparent r-band magnitude, and because the selection function is a function of position, r, and g − r, we need to model the observed density of stars in color–magnitude–metallicity–phase (position–velocity) space, λ(l, b, d, v, r, g − r, [Fe/H]). This density of stars can be written as

$\begin{eqnarray} &&\lambda (l,b,D,\mathbf {v},r,g-r,[\mathrm{Fe/H}])\nonumber\\ &&\quad= \rho (r,g-r,[\mathrm{Fe/H}]|R,Z,\phi) \times \mathrm{qDF}(\mathbf {x},\mathbf {v}) \nonumber\\ &&\qquad\times |J(R,Z,\phi;l,b,D)| \times S(\mathrm{plate},r,g-r). \qquad \end{eqnarray} \tag{ 23 }$

Here, (R, Z, ϕ) are Galactocentric cylindrical coordinates corresponding to rectangular coordinates x ≡ (X, Y, Z), which can be calculated from (l, b, D). The factor ρ(r, g − r, [Fe/H]|R, Z, ϕ) is the number density in magnitude–color–metallicity space as a function of position (see further discussion in Appendix B of BO12d). The |J(R, z; l, b, D)| is a Jacobian term because of the (X, Y, Z) → (l, b, D) coordinate transformation; the factor S(plate, r, g − r) is the selection function as given in Equation (A2) of BO12d. Finally, qDF(x, v) is the underlying DF of the sample. While we model this DF as being a function of the actions J only, we stress that the DF always remains a density in (x, v)-space.⁶ The parameters $\mathbf {p}_\Phi$ of the gravitational potential Φ enter the observed density through the dependence of J on Φ. We denote the parameters of the DF using $\mathbf {p}_{\mathrm{DF}}\equiv (h_R,\sigma _Z(R_0), h_{\sigma _Z}, \sigma _R(R_0), h_{\sigma _R})$ .

The likelihood of a given model for the DF and potential Φ is given by that of a Poisson process with rate parameter λ. After marginalizing over the amplitude of the DF with a logarithmically flat prior (which effectively normalizes the DF), the likelihood of a single data point i given by

$\begin{eqnarray} \ln \mathcal {L}_{i,\mathrm{DF}} &=& \ln \mathrm{qDF}(\mathbf {J}[\mathbf {x}_i,\mathbf {v}_i]|\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}}) \nonumber\\ && {-}\ln \int {d}l \,{d}b\, {d}D\, {d}\mathbf {v}\,{d}r\, {d}(g-r) \,{d}[\mathrm{Fe/H}] \nonumber\\ && \lambda (l,b,D,\mathbf {v},r,g-r,[\mathrm{Fe/H}]|\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}}). \qquad \end{eqnarray} \tag{ 24 }$

The second term normalizes the density λ over the observed volume. In Appendix A we describe how we calculate the nine-dimensional normalization integral efficiently. For comparison with observed velocities, we convolve the likelihood in Equation (24) with the observational velocity uncertainties.

Because we are primarily interested in measuring the surface density Σ_1.1(R), the dynamics of MAPs in the plane of the Milky Way is mostly irrelevant as it does not contain much information about the vertical surface density. Moreover, nonaxisymmetric perturbations to the Galactic potential (e.g., those from spiral structure or the bar) strongly affect the kinematics in the plane of the Galaxy, such that nonaxisymmetric planar kinematics (V_R and V_T) could bias the fit of an axisymmetric model to the MAP data. Practically, discarding the planar kinematics also allows us to fix the parameters of the qDF related to the radial velocity dispersion, as they only affect the distribution of velocities in the plane. This reduces the number of qDF parameters from five to three, thus significantly reducing the computations involved in evaluating the likelihood (see Section 3.4 below). Therefore, we do not consider the planar kinematics and marginalize the likelihood over V_R and V_T for each star; we only need to convolve the likelihood with the observational uncertainty in the vertical velocity (since we assume that distance uncertainties are only important as far as the velocity is concerned). We perform this convolution using 20th order Gauss–Legendre quadrature over the range $(V^i_Z-4\delta \,V^i_Z,V^i_Z+4\delta \,V^i_Z)$ , where $V^i_Z$ and $\delta \,V^i_Z$ are the observed velocity and its uncertainty. Thus, the likelihood for a single data point i becomes

$\begin{eqnarray} \ln \mathcal {L}_{i,\mathrm{DF}} &=& \ln \int _{V^i_Z-4\,\delta \,V^i_Z}^{V^i_Z+4\,\delta \,V^i_Z}{d}V_Z\,\nonumber\\ &&\times \int {d}V_R \,{d}V_T\, \mathrm{qDF}(\mathbf {J}[\mathbf {x}_i,\mathbf {v}_i]|\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}})\nonumber\\ && {-}\ln \int {d}l \,{d}b\, {d}D\, {d}\mathbf {v}\,{d}r\, {d}(g-r) \,{d}[\mathrm{Fe/H}]\nonumber\\ &&\lambda (l,b,D,\mathbf {v},r,g-r,[\mathrm{Fe/H}]|\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}}). \qquad \end{eqnarray} \tag{ 25 }$

We also add an outlier model consisting of a constant spatial density and a Gaussian velocity distribution with a dispersion of 100 km s⁻¹ exp (− (R − R₀)/6 kpc) For MAPs with best-fit scale heights from BO12d larger than 500 pc, this outlier velocity dispersion is multiplied by two. This outlier model mimics a halo population. The outlier fraction p_out is added as a free parameter. In the analysis below, the outlier fraction is small for most MAPs; only the most α-old, [Fe/H]-poor MAPs have outlier fractions of ∼10%, because their intrinsic dispersions are large. Most importantly, there is no correlation between the outlier fraction and the inferred value of Σ_1.1(R), such that it does not affect our results. The full likelihood is then given by

$\begin{eqnarray} \ln \mathcal {L} & = \sum _i \ln [(1-p_{\mathrm{out}})\,\mathcal {L}_{i,\mathrm{DF}} + p_{\mathrm{out}}\,\mathcal {L}_{i,\mathrm{out}}], \end{eqnarray} \tag{ 26 }$

where $\mathcal {L}_{i,\mathrm{out}}$ is the likelihood contribution from the outlier model.

3.4. Further Implementation Details

Using the methodology described in the previous section, we explore the joint PDF of the potential and qDF parameters for each MAP separately. We do this by calculating the PDF on a fixed grid specified below. This section gives the details of the specific implementation choices made to explore the joint PDF of potential and qDF parameters in a computationally efficient manner. The main steps of the implementation are shown graphically in Figure 9.

**Figure 9.** Graphical representation of the steps involved in determining the joint PDF of potential and qDF parameters for the data of each MAP. These steps are described in detail in Section 3, and the implementation in particular is discussed in Section 3.4. The main loop steps through the different potential models of Table 1, for which all of the dynamical quantities involved in evaluating the qDF (actions and rotational, epicycle, and rotational frequencies) are precalculated. An inner loop then goes through the different qDF models of Table 2 and outlier fractions to determine the likelihood of the potential+qDF models for the data.
Download figure:
Standard image High-resolution image

The potential grid consists of 8 stellar disk scale lengths ranging from 2 kpc to 3.4 kpc and 16 relative halo-to-disk contributions to $V^2_c(R_0)$ , ranging from 0 to 1. We do not consider models in which the halo power-law is smaller than 0 or larger than 3; these regions are blank in Figure 7. The list of potential parameters, the range over which the two basic parameters are varied, and the values at which the other potential parameters are fixed are given in Table 1 for all of the potential models considered below.

Table 1. Potential Models for Which the Analysis Described in Section 3 to Measure K_{Z, 1.1}(R) is Performed

	R_d	z_h	f_h	V_c	$\frac{{d}\ln V_c(R_0)}{{d}\ln R}$
Fiducial model	8 values from 2 kpc to 3.4 kpc	400 pc	16 values from 0 to 1	230 km s⁻¹	0.0
Systematics: high V_c	8 values from 2 kpc to 3.4 kpc	400 pc	16 values from 0 to 1	250 km s⁻¹	0.0
Systematics: low V_c	8 values from 2 kpc to 3.4 kpc	400 pc	16 values from 0 to 1	210 km s⁻¹	0.0
Systematics: low z_h	8 values from 2 kpc to 3.4 kpc	200 pc	16 values from 0 to 1	230 km s⁻¹	0.0
Systematics: falling rot. curve	8 values from 2 kpc to 3.4 kpc	400 pc	16 values from 0 to 1	230 km s⁻¹	−0.1
Systematics: rising rot. curve	8 values from 2 kpc to 3.4 kpc	400 pc	16 values from 0 to 1	230 km s⁻¹	0.1

Download table as: ASCII Typeset image

The qDF as described in Section 3.1 has nominally five free parameters. However, as discussed in the previous section, we marginalize the likelihood over the radial and azimuthal velocity components V_R and V_T, such that the parameters describing the radial dispersion profile—σ_R(R₀) and $h_{\sigma _R}$ —are not important for the data modeling. Therefore, we fix σ_R(R₀) at the values estimated based on fits of the radial dispersion similar to those performed in BO12c (see Figure 6) and we set $h_{\sigma _R} = 8\,\mathrm{kpc}$ . This leaves three free qDF parameters for each MAP: h_R, σ_Z(R₀), and $h_{\sigma _Z}$ . Table 2 lists all five qDF parameters and the range over which they are varied in the analysis, as described below.

Table 2. Parameters of the qDF and the Range Over Which They Are Varied in the Analysis

Parameter	Range Considered
h_R/8 kpc	8 values log.-spaced between −1.8 and 0.9
σ_Z(R₀)	16 values log.-spaced over range of width 0.36 around estimate in Figure 6
$h_{\sigma _Z}$	8 values log.-spaced between ln 4 kpc and ln 16 kpc
σ_R(R₀)	fixed at estimate in Figure 6
$h_{\sigma _R}$	fixed at 8 kpc

Download table as: ASCII Typeset image

For each MAP we consider the same range of ln h_R/8 kpc: eight values from −1.86 to 0.9. The range of the vertical dispersion scale length is also the same for each MAP: eight values logarithmically spaced between 4 kpc and 16 kpc. The range of σ_Z(R₀) is set individually for each MAP based on the estimates for σ_Z(R₀) from Figure 6. Sixteen values are logarithmically spaced over a range of width 0.36 around the estimated ln σ_Z(R₀). The central values were slightly adjusted based on visual inspection of the PDFs such that the best-fit does not occur at the edge of the parameter range; these adjustments are typically less than 20%.

In practice, for computational speed reasons, the likelihood calculation is structured as shown in Figure 9. We first set up an interpolation grid for the potential and the rotational, epicycle, and vertical frequencies as well as the guiding-star radii. These grids are then used to quickly evaluate the gravitational potential when calculating the actions and the interpolated frequencies and guiding-star radii are used in the fast evaluation of the qDF (see below). This step is necessary because directly evaluating the contribution to the potential coming from the stellar and gas exponential disks would be prohibitively computationally expensive. The explored parameters for each MAP are then separated in three classes: (1) the potential parameters $\mathbf {p}_\Phi$ , (2) the three qDF parameters, and (3) the outlier fraction.

Once the parameters of the potential are specified, all of the actions that are involved in evaluating the likelihood in Equation (25) are precomputed. This includes all of the actions involved in the calculation of the effective survey volume (Equation (A2)) and the actions for the data that are used to convolve the likelihood with the vertical velocity uncertainty (Equation (25)). As discussed above, these integrals over the velocities are performed over a range specified by the model dispersions (those that are explicit parameters of the qDF). In order to be able to reuse the actions for multiple qDF parameter sets, we calculate the integrals over the velocity, always using ranges based on the estimated dispersions from Figure 6. As the dispersions that are considered during the PDF exploration are typically within ≲ 30% of these estimates and we integrate out to 4σ, this procedure works well for all of the considered qDF parameters. It also makes sure that there is no stochastic noise in the likelihood evaluation, which could be large if the velocity and/or spatial integrals are performed using Monte Carlo integration (see discussion in McMillan & Binney 2013; Ting et al. 2013). We also precalculate the rotational, epicycle, and vertical frequencies as well as the guiding star radii associated with all of the points at which the qDF has to be evaluated in the course of a single likelihood evaluation (cf. the definition of the qDF in Equation (1)). Once the actions, frequencies, and guiding star radii are calculated, the likelihood can be quickly evaluated for different sets of qDF parameters, by evaluating the qDF using the precalculated values and resumming the integrals.

With all of the actions etc. precalculated we then vary the qDF parameters (second loop in Figure 9). For any given qDF parameter set we calculate the effective survey volume and evaluate the likelihood related to the qDF for all of the data points. The outlier fraction can then be varied very quickly, and we use a grid of 25 outlier fractions ranging from 0% to 50% (innermost loop in Figure 9).

We find that by using Gauss–Legendre quadrature to perform the velocity integrals, we can evaluate the effective survey volume accurately enough (that is, to an accuracy ≪0.001%, sufficiently small such that this is not a source of noise in the likelihood) using ≈2 × 10⁶ action evaluations. This is less than the ≈10⁷ evaluations required by McMillan & Binney (2013), and this leads to a significant reduction in the time required for a single likelihood evaluation, as setting up the grid of actions takes up a significant amount of time. The reason Gauss–Legendre quadrature works well is that the qDF is a very smooth function, and it is easy to estimate the range over which it is nonzero. Similarly, we have found that Gauss–Legendre quadrature is far superior to Monte Carlo sampling for convolving the likelihood with the velocity uncertainty. We can accurately calculate the convolved likelihood with 20th order Gauss–Legendre quadrature, whereas when even using thousands of Monte Carlo samples the convolution was not well-behaved (even though we used the same Monte Carlo samples for all different parameter settings to remove numerical noise, as advocated in Ting et al. 2013). Future analyses of Gaia data will still be mostly in the regime where distance uncertainties are only important through their influence on the velocity uncertainties. As 3D Gauss–Legendre quadrature could be adequately performed using ∼10⁵ evaluations of the integrand, Gauss–Legendre quadrature will likely still be superior to Monte Carlo integration.

The computational time is dominated both by setting up the grid of actions and frequencies for a given potential and calculating the effective survey volume for a given set of qDF parameters. Using all of the speed-ups described in this section, the exploration of the PDF still requires ∼3 days using eight cpus for a single MAP. Each MAP's PDF can be explored independently, and all of the steps involved in the likelihood evaluation can be easily parallelized such that the availability of sufficient computational resources is the main limitation for exploring a wider range of potential models. The combined computational cost for all steps of the analysis described in this article was 20,000 cpu-hours, or ≈1 cpu-hour per datapoint.

4. RESULTS

In this section we present the results of fitting each MAP with a qDF model to measure Σ_1.1 at the Galactocentric radius R at which this quantity is best constrained for each MAP. In Section 4.1 we discuss the joint constraints on the dynamical parameters of the gravitational potential model and on the qDF parameters for each MAP individually. We then describe how we determine the radius for each MAP at which Σ_1.1 is best constrained for that MAP. In Section 4.2 we present the Σ_1.1(R) profile that results from combining the individual fits to the different MAPs. In Section 4.3 we show that the Σ_1.1(R) profile that we measure using the fiducial model for the gravitational potential does not significantly change if we change the parameters of the fiducial potential. We also demonstrate that the impact of systematic distance uncertainties is small.

4.1. Results for Individual MAPs and Determination of Σ_1.1(R)

As discussed in Section 3, we fit each set of MAP data using a qDF model, and we determine the joint PDF of two dynamical parameters + three qDF parameters on a grid in all of the parameters. We thus obtain constraints on the basic dynamical parameters, disk scale length, and relative halo-to-disk contribution to $V_c^2(R_0)$ after marginalizing over the parameters of the qDF (which is a weak form of the "summing over all possible DFs" of Margorrian 2006). We can calculate the PDF of derived dynamical parameters, such as the surface density at a given radius, the density of the dark halo, etc., by appropriate transformations of the PDF of the basic dynamical parameters (with the understanding that all of the different derived parameters will be strongly correlated because there are only two free parameters). Similarly, we can marginalize over the dynamical parameters to constrain the qDF parameters for the individual MAPs, but this is not the focus of this article.

Figure 10 shows as examples the PDFs resulting from the fits to two MAPs for the basic dynamical parameters as well as PDFs for the qDF parameters related to the vertical velocity dispersion, σ_Z(R₀) and $h_{\sigma _Z}$ . We only show PDFs for two MAPs in Figure 10, and we only show two different 2D marginalized PDFs to give a sense of the PDFs and how they are different for metal-poor and metal-rich MAPs. We have visually inspected all such PDFs as well as other 2D projections for all individual MAPs. We have done this for the fit of the fiducial potential model as well as for all of the systematics checks in Section 4.3. All PDFs were found to be sensible, although in a few cases the range of σ_Z(R₀) had to be changed and the PDF had to be recomputed as the best-fitting value of σ_Z(R₀) was found to lie on the boundary of the σ_Z(R₀) grid. We were unable to adjust the range of σ_Z(R₀) within an acceptably small number of iterations for only one of the 45 MAPS in our initial sample, and we dropped this bin from further consideration: this MAP is located at [Fe/H] = −0.25, [α/Fe] = 0.275, and it contained only 129 data points.

To assess whether the qDF model is a good fit to the observed spatial density and kinematics of the MAPs we have performed extensive direct comparisons between the data and the best-fitting model for each MAP. We do this by projecting the best-fitting model into the space of observables and take into account the various selection biases affecting the data. These data–model comparisons are described and discussed in detail in Appendix B. Overall, we find that the spatial distribution and the vertical kinematics of the data are very well fit by the qDF models for individual MAPs. However, we remove the MAP at [Fe/H] = −0.05 and [α/Fe] = 0.125 from further consideration because the detailed comparisons between the data and the model indicates that no good fit is obtained for this MAP; this MAP contains 255 data points. All 43 other MAPs, containing 16,269 data points, are included in the results described below.

In principle, the optimal way of constraining the mass distribution using individual MAPs would be to multiply together the individual PDFs for the dynamical parameters. However, these PDFs of the individual MAPs are not mutually consistent, e.g., the joint PDF for all α-old MAPs is inconsistent with the joint PDF for all metal-rich, α-young MAPs. This is a reflection of the fact that our fiducial model for the gravitational potential, with only two free parameters, is too restrictive, i.e., it does not contain a model that is consistent with all of the individual MAPs. However, the two-parameter potential model is general enough to provide a measurement of the gravitational force in the small range of radii for each MAP that most affects the MAP's orbital distribution.

The PDF of the dynamical parameters in Figure 10 and those of all other MAPs are characterized by a strong degeneracy between the two basic dynamical parameters. A comparison with Figure 7 shows that this degeneracy is along lines of constant surface density, which is expected as the vertical kinematics of an individual MAP primarily constrains the total surface density. A comparison with Figure 7 also makes it clear that the direction of the degeneracy is indicative of the radius at which the surface density is best constrained by the data of an individual MAP. We focus on the surface density at a height of 1.1 kpc because that is close to the mode of the distribution of distances-from-the-mid-plane of the data for each MAP, which is set through the combination of the intrinsic vertical distribution and the SEGUE selection procedure.

Formally, we can calculate the radius at which Σ_1.1 is best determined by calculating the correlation between the disk scale length and Σ_1.1 as a function of R and finding the radius at which this correlation is minimized. This radius serves as the point around which Σ_1.1 pivots as the model's scale length is changed. As such, it is the radius at which Σ_1.1 is most robustly determined. From the PDF of the dynamical parameters, we can then calculate the mean Σ_1.1(R) and its uncertainty at that radius. Figure 11 illustrates this procedure.

**Figure 11.** Illustration of the procedure used to determine the radius R at which each MAP best determines Σ_1.1(R), shown for the more metal-poor MAP of Figure 10. The top panel shows the Σ_1.1(R) profile corresponding to the best-fit relative halo contribution to $V_c^2(R_0)$ for each of the eight values of the disk scale length considered in the fiducial potential model. The bottom panel shows the posterior correlation between the disk scale length and Σ_1.1 as a function of radius. The radius R at which a given MAP measures Σ_1.1 is that for which the correlation between the disk scale length and Σ_1.1 is minimized. The top panel shows that this is the radius around which the inferred Σ_1.1(R) profile pivots when changing the assumed disk scale length.
Download figure:
Standard image High-resolution image

**Figure 11.** Illustration of the procedure used to determine the radius R at which each MAP best determines Σ_1.1(R), shown for the more metal-poor MAP of Figure 10. The top panel shows the Σ_1.1(R) profile corresponding to the best-fit relative halo contribution to $V_c^2(R_0)$ for each of the eight values of the disk scale length considered in the fiducial potential model. The bottom panel shows the posterior correlation between the disk scale length and Σ_1.1 as a function of radius. The radius R at which a given MAP measures Σ_1.1 is that for which the correlation between the disk scale length and Σ_1.1 is minimized. The top panel shows that this is the radius around which the inferred Σ_1.1(R) profile pivots when changing the assumed disk scale length.
Download figure:
Standard image High-resolution image

Figure 10 shows that the degeneracy in the PDF of the dynamical parameters, and therefore the correlation between Σ_1.1 and the disk scale length, is different for different MAPs. Thus, the radius at which Σ_1.1 is best determined is different for the various MAPs. This is plausible as stars of a given MAP, found in the solar vicinity, are drawn from tracer populations of very different scale lengths (BO12d). We expect that MAPs with abundances that are close to solar measure Σ_1.1 at a radius that is close to the solar circle because these stars dominate the local population of stars and they spend much of their orbits close to the solar circle. Likewise, we expect that the α-old MAPs, which have short scale lengths and large velocity dispersion (BO12a), spend a much larger fraction of their orbit at smaller radii and therefore measure Σ_1.1 closer to the Galactic center.

To test whether this expectation is borne out in practice, and therefore whether the procedure of determining the radius at which Σ_1.1 is best measured makes sense, we have calculated the median of the mean orbital radii of the data in each MAP. We use a simplified model for the Milky Way's gravitational potential that is the same as that used in Section 5.4 and Figure 7 of BO12d, but the details of the mass model for the Milky Way used are unimportant for the calculation of the mean orbital radii. As we are interested in the vertical dynamics, we time-average over the orbits weighting by the midplane density (= vertical oscillation frequency squared). These median mean-orbital-radii for each MAP are plotted against the radii with the most robust Σ_1.1 determination in Figure 12, color-coded by [α/Fe].

It is clear from Figure 12 that there is a strong correlation between the radii calculated in these two entirely different manners (apart from a single outlier in the lower-right corner). Therefore, we can be confident that the procedure for calculating the radius at which each MAP best measures Σ_1.1 provides a good definition for this radius. The color-coding confirms the expectation that α-old MAPs constrain Σ_1.1 at radii much smaller than R₀, while α-young MAPs measure Σ_1.1 at positions close to the Sun's. MAPs that are intermediate between these two measure Σ_1.1 at intermediate radii. One of the more α-young MAPs in Figure 12 falls far from the relation defined by the other MAPs (best Σ_1.1 radius of 4.5 kpc for a mean orbital radius of 7.5 kpc). However, this single MAP does not influence any of the results obtained below. Below, we quote results using the best radius as determined from the PDF; if we instead require the measurements of the vertical force to be at the median of the mean orbital radii of the stars in each MAP, the best-fit surface-density and vertical-force profiles are unchanged, albeit with slightly increased uncertainties on the scale lengths.

4.2. The Surface Density at 1.1 kpc between 4 kpc and 9 kpc as Measured by MAPs

Even though the data from the different MAPs were sampled by the same survey centered on the solar radius (see Figure 7 in Rix & Bovy 2013), they constrain Σ_1.1(R) at quite different radii. As a consequence, we can measure the Galactic disk's surface density profile. For each MAP we calculate the mean Σ_1.1(R) and its uncertainty δΣ_1.1(R) at the best radius R from moments of the PDF of the dynamical parameters. Thus, we condense each MAP's dynamical information into the best single measurement of the Milky Way's mass distribution for that MAP. The surface densities thus measured are shown in Figure 13 and tabulated in Table 3.

**Figure 13.** Surface density as a function of radius as measured by SEGUE G-dwarf MAPs. Each point represents a measurement based on the data from one MAP; each MAP measures the surface density Σ(R, |Z| ⩽ 1.1 kpc) at the radius R where the correlation between the model stellar disk scale length and Σ(R₀, |Z| ⩽ 1.1 kpc) is minimized in the MAP's PDF (see Figures 10 and 11). The gray curve is an exponential fit with Σ(R₀, |Z| ⩽ 1.1 kpc) = 69 M_☉ pc⁻² and a scale length of 2.5 kpc. Points are color-coded by the [α/Fe] of the MAP bin, from [α/Fe] = 0.025 (dark blue) to [α/Fe] = 0.475 (red/brown).
Download figure:
Standard image High-resolution image

Table 3. Measured Surface Density and Vertical Force at Different Galactocentric Radii

[Fe/H]	[α/Fe]	R	Σ_1.1(R)	δΣ_1.1(R)	R₀ − R	K_{Z, 1.1}(R)	δK_{Z, 1.1}(R)
(dex)	(dex)	(kpc)	(M_☉ pc⁻²)	(M_☉ pc⁻²)	(kpc)	(2πG M_☉ pc⁻²)	(2πG M_☉ pc⁻²)
−1.25	0.425	4.63	256.0	50.4	3.37	217.6	41.5
−1.15	0.425	4.68	270.9	44.3	3.32	230.6	36.8
−1.05	0.375	6.71	89.7	20.3	1.29	84.2	18.5
−1.05	0.425	4.77	244.2	40.1	3.23	209.0	33.2
−0.95	0.325	4.59	228.4	48.3	3.41	194.7	38.6
−0.95	0.375	5.04	207.6	34.9	2.96	180.5	29.2
−0.95	0.425	5.22	204.3	30.1	2.78	179.0	25.6
−0.95	0.475	4.68	247.9	41.3	3.32	211.4	33.7
−0.85	0.275	7.38	65.0	11.9	0.62	62.2	11.2
−0.85	0.325	6.53	104.9	19.8	1.47	97.5	18.0
−0.85	0.375	6.62	118.1	14.0	1.38	109.8	12.9
−0.85	0.425	6.66	127.6	14.0	1.34	119.0	12.9
−0.85	0.475	5.08	202.8	35.9	2.92	176.8	30.1
−0.75	0.275	7.42	64.4	11.6	0.58	61.7	11.0
−0.75	0.325	6.53	125.0	16.2	1.47	115.9	14.8
−0.75	0.375	7.11	97.3	8.0	0.89	92.0	7.5
−0.75	0.425	6.71	130.6	11.8	1.29	121.9	10.9
−0.75	0.475	6.53	103.3	19.8	1.47	96.1	18.0
−0.65	0.275	7.20	81.4	12.6	0.80	77.3	11.8
−0.65	0.325	7.20	100.2	9.9	0.80	95.2	9.3
−0.65	0.375	6.34	159.2	12.7	1.66	146.3	11.5
−0.65	0.425	6.79	108.0	14.5	1.21	101.2	13.4
−0.55	0.275	7.29	89.2	9.6	0.71	84.9	9.1
−0.55	0.325	7.29	93.1	7.8	0.71	88.5	7.4
−0.55	0.375	7.02	104.7	8.7	0.98	98.7	8.2
−0.55	0.425	6.57	108.3	18.7	1.43	100.8	17.1
−0.45	0.225	7.56	77.6	8.5	0.44	74.5	8.1
−0.45	0.275	6.75	122.4	12.6	1.25	114.3	11.7
−0.45	0.325	6.84	115.7	10.1	1.16	108.4	9.4
−0.45	0.375	5.54	189.6	24.1	2.46	168.6	20.9
−0.35	0.225	7.29	91.5	10.2	0.71	87.2	9.6
−0.35	0.275	6.57	150.9	13.5	1.43	140.1	12.4
−0.35	0.325	5.67	190.6	22.3	2.33	170.6	19.5
−0.25	0.175	7.70	75.6	8.4	0.30	72.8	8.0
−0.25	0.225	7.88	64.6	8.5	0.12	62.4	8.2
−0.15	0.125	7.70	76.7	8.1	0.30	73.9	7.8
−0.15	0.175	6.08	161.8	19.8	1.92	147.4	17.7
−0.05	0.025	6.57	121.9	16.4	1.43	113.2	14.9
−0.05	0.075	7.92	71.4	7.0	0.08	69.2	6.8
0.05	0.025	8.55	54.7	4.9	-0.55	53.4	4.7
0.05	0.075	7.20	106.8	10.0	0.80	101.3	9.4
0.15	0.025	6.03	145.4	20.9	1.97	132.4	18.8
0.25	0.025	4.82	240.3	42.9	3.18	206.2	35.7

Notes. Each row gives the measurement of the surface density Σ_1.1 up to |Z| = 1.1 kpc along with its uncertainty δΣ_1.1 obtained from a single MAP, specified by its central [Fe/H] and [α/Fe]. The radius is that where the correlation between the inferred surface density and the potential model's disk scale length is minimal; this is the radius at which the surface density is best measured by a MAP (see Figure 11). The last two columns give the alternative measurement of the vertical force K_{Z, 1.1} at 1.1 kpc and its uncertainty δK_{Z, 1.1}. The sixth column gives the difference between the radius at which Σ_1.1 or K_{Z, 1.1} is measured and R₀ (to be held constant when using our measurements with a different value of R₀).

Download table as: ASCII Typeset image

Figure 13 shows that the MAPs provide a highly precise measurement of Σ_1.1(R) between 4.5 kpc and 9 kpc and that the results from all 43 MAPs are in excellent agreement with each other. From a purely phenomenological perspective this measurement of Σ_1.1(R) can be well-characterized by a single exponential:

$\begin{equation} \Sigma _{1.1}(R) = 69\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.5\,\mathrm{kpc}\right), \end{equation} \tag{ 27 }$

with formal uncertainties of ∼2.5% in the normalization and 4% in the scale length. However, we have no reason to expect that the real physical Σ_1.1(R) is a single exponential, as the mass distribution is made up of various disk components and the halo. We discuss this further in Section 5 below, where we fit mass models.

An alternative way of presenting our results is as measurements of the vertical force at 1.1 kpc, which we denote by K_{Z, 1.1} instead. These measurements are shown in Figure 14 and are also tabulated in Table 3. As with the measurements of Σ_1.1(R), the radial dependence of the vertical force is well-fit by an exponential as

$\begin{equation} \frac{K_{Z,1.1}(R)}{2\pi G} = 67\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.7\,\mathrm{kpc}\right), \end{equation} \tag{ 28 }$

again with formal uncertainties of ∼2.5% in the normalization and 4% in the scale length. As expected based on the vertical Jeans equation and shown explicitly in the following section, our K_{Z, 1.1} measurements are more robust with respect to changes in the assumptions about the derivative of the circular velocity (i.e., the "slope of the rotation curve"). Therefore, when using our measurements to constrain dynamical models of the Milky Way, it may be preferable to use the K_{Z, 1.1}(R) measurements rather than the Σ_1.1, especially when considering models where the rotation curve is significantly nonflat. This is the approach we follow in Section 5 below.

**Figure 14.** Vertical force at 1.1 kpc as a function of radius as measured by SEGUE G dwarf MAPs. This figure represents the same measurements as in Figure 13, but expressed as vertical forces. The gray curve is an exponential fit with *K_Z*(R₀, |Z| = 1.1 kpc)/2πG = 67 M_☉ pc⁻² and a scale length of 2.7 kpc.
Download figure:
Standard image High-resolution image

We have assumed R₀ = 8 kpc throughout the analysis. The effect of changing R₀ is to shift our measurements of Σ_1.1(R) and K_{Z, 1.1}(R) such as to keep R − R₀ constant (i.e., our measurements are at a fixed distance from R₀; see also Section 4.3). Therefore, our measurements can be shifted to a different value of R₀ by keeping R − R₀ (see Table 3) constant.

The measured Σ_1.1(R) or K_{Z, 1.1}(R) in Figures 13 and 14 and tabulated in Table 3 are 43 new, independent constraints on the Milky Way's mass distribution and gravitational potential. They can be combined with other constraints, such as measurements of the vertical mass distribution as in Bovy & Tremaine (2012) and Zhang et al. (2013), and constraints on the rotation curve from the terminal velocity curve or based on stellar tracers. This is the approach that we take in Section 5, where we show that the MAP measurements of Σ_1.1(R) quite directly constrain the disk scale length and through combination with data on the rotation curve can be used to disentangle the disk and halo contributions to the mass distribution in the inner Milky Way.

4.3. Variations on the Fiducial Model

The results in the previous section are obtained in the fiducial model in which we use the An09 distance scale for the G-type SDSS dwarfs and fix the parameters of the gravitational potential model such that V_c(R₀) = 230 km s⁻¹, z_h = 400 pc, and dln V_c(R₀)/dln R = 0. In this section we determine the impact of these choices and find them to be small and largely inconsequential for the measurement of Σ_1.1(R) and especially K_{Z, 1.1}(R), and for the dynamical analysis in Section 5 below.

We assess the impact of systematic distance uncertainties by performing the analysis of the MAP dynamics when using the distances of I08 instead of the An09 distances in our fiducial model. As discussed in Section 2, the I08 distances are typically 7% larger than the An09 distances to G-type dwarfs, with the detailed relative distance factors given in Figure 1. As discussed in BO12d, the impact of unresolved binaries is such that distances would be ∼6% larger, so the difference between the results obtained using I08 distances and those derived from An09 is a good proxy for understanding the impact of typical systematic distance uncertainties affecting the data.

We expect the impact of systematic distance uncertainties to be such that $\Sigma _{1.1}^{\mathrm{Ivezic}} / \Sigma _{1.1}^{\mathrm{An}} = (d^\mathrm{Ivezic} / d^{\mathrm{An}})^\alpha$ , with −1 ⩽ α ⩽ 1 and similar for K_{Z, 1.1}. This is because roughly $\Sigma _{1.1}\propto \sigma _Z^2 / h_Z$ , such that if the measurement is dominated by proper motions, such as when one observes the vertical motions of disk stars far from the Sun, Σ_1.1∝d. Similarly, if the measurement primarily depends on the line-of-sight velocities, which is the case when measuring Σ_1.1 at R₀ by looking at the motions of stars near the Galactic poles, Σ_1.1∝1/d. Because the measurement of Σ_1.1 has contributions from both proper motions and line-of-sight velocities, we expect Σ_1.1 to scale with distance in a manner that is in between these two extremes. We expect the scaling to be different for different MAPs, as MAPs that measure Σ_1.1 at smaller R rely more on proper motions than MAPs that measure Σ_1.1 closer to the solar circle.

We show the comparison between the measured Σ_1.1 obtained using the I08 distances with those derived from the An09 distances in Figure 15. The difference in the measured Σ_1.1 is shown for each MAP at the radius at which the MAP best measures Σ_1.1 determined for the fiducial model. The dashed lines show the approximate locus of the linear and inverse scalings with distance discussed in the previous paragraph. This figure shows that the impact of the systematic distance shift between the two distance scales is small for the majority of the MAPs. MAPs that measure Σ_1.1 at R < 5.5 kpc rely more on proper motions and their measured Σ_1.1 therefore scales close to linear with the distance scale. Most of the MAPs that measure Σ_1.1 at R > 5.5 kpc are only affected at the few percent level by the uncertainty in the distance scale. K_{Z, 1.1} is affected in the same way as Σ_1.1.

**Figure 15.** The impact of distance systematics. This figure compares the measured Σ_1.1 for each MAP when using distances on the An09 scale with those when using the I08 distances. The differences are only a few percent for most MAPs that measure Σ_1.1 at 6 kpc ≲ R ≲ 9 kpc, while the differences are almost 10% for MAPs that measure Σ_1.1 at R ⩽ 5.5 kpc. The dashed lines show the approximate locus where the impact of using the I08 distances results in a linear increase in Σ_1.1 ( $\Sigma _{1.1}^{\mathrm{Ivezic}} / \Sigma _{1.1}^{\mathrm{An}} = (d^\mathrm{Ivezic} / d^{\mathrm{An}})$ ), as expected when the measurement is dominated by proper motions or in a linear decrease in Σ_1.1 ( $\Sigma _{1.1}^{\mathrm{Ivezic}} / \Sigma _{1.1}^{\mathrm{An}} = (d^\mathrm{Ivezic} / d^{\mathrm{An}})^{-1}$ ), as expected when the measurement is dominated by line-of-sight velocities. Overall, the impact of systematic distance uncertainties is much smaller than the statistical uncertainties, and the best-fitting exponential approximation to Σ_1.1(R) changes only by 2% in the normalization and scale length (see the text).
Download figure:
Standard image High-resolution image

The overall impact of the change in the distance scale between that of I08 and An09 on Σ_1.1(R) is small. It is clear from Figure 15 that the effect of using the larger distances is to slightly steepen the Σ_1.1(R) profile. The best-fit exponential approximations to Σ_1.1(R) and K_{Z, 1.1}(R) obtained using the I08 scale, using the optimal radii derived from the PDFs for the I08 scale, are

$\begin{eqnarray} \Sigma _{1.1}(R) & =& 70\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.5\,\mathrm{kpc}\right)\nonumber \\ &&\qquad \mathrm{(I08\ distance\ scale)}, \end{eqnarray} \tag{ 29 }$

$\begin{eqnarray} \frac{K_{Z,1.1}(R)}{2\pi G} & =& 68\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.7\,\mathrm{kpc}\right)\quad\nonumber \\ &&\qquad \mathrm{(I08\ distance\ scale)}, \qquad \end{eqnarray} \tag{ 30 }$

which is very close and within the statistical uncertainties of the result obtained using the An09 scale (Equation (27)). Therefore, the impact of systematic distance uncertainties on the measurement of Σ_1.1(R) and K_{Z, 1.1}(R) is insignificant, at the level of ≈2%.

To determine the impact of the rather constrained fiducial model for the gravitational potential, we repeat the measurement of Σ_1.1(R) for different choices for the most important fixed parameters of the potential. Foremost among these is the normalization of the potential, characterized by the local circular velocity in our parameterization. We have fixed this to V_c(R₀) = 230 km s⁻¹ in the fiducial model. In the top panel of Figure 16 we compare the results when using V_c(R₀) = 250 km s⁻¹ to those obtained for the fiducial model, again at the radius determined using the fiducial model. We see that the impact of changing the potential normalization is close to zero for almost all MAPs, and in all cases the shift is much smaller than the random uncertainties. The dashed line in this figure shows the expected difference if $\Sigma _{1.1}\propto V_c^2$ , the expected scaling if the measured surface-density were wholly dependent on the normalization of the potential (or equivalently, of the rotation curve), that is, if we were not really measuring Σ_1.1 or K_{Z, 1.1}. K_{Z, 1.1}(R) behaves the same when changing V_c(R₀). Similarly, in the second panel we compare the results obtained using V_c(R₀) = 210 km s⁻¹ to those from the fiducial model. These two sets of results are again nearly indistinguishable and the difference is much smaller than the $\Sigma _{1.1}\propto V_c^2$ expectation (dashed line). Note in this case that the measured value for V_c(R₀) = 230 km s⁻¹ for the bins that measure Σ_1.1 at R ≈ 5 kpc is at the edge of the prior range for V_c(R₀) = 210 km s⁻¹, which artificially lowers the inferred Σ_1.1 for these MAPs (this is clear from an inspection of the PDFs). The best-fitting exponential approximations to Σ_1.1(R) and K_{Z, 1.1}(R) measured from the V_c(R₀) = 250 km s⁻¹ results are

$\begin{eqnarray} \Sigma _{1.1}(R) & =& 72\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.5\,\mathrm{kpc}\right)\nonumber \\ &&\qquad (V_c(R_0) = 250\,\mathrm{km\ s}^{-1}), \end{eqnarray} \tag{ 31 }$

$\begin{eqnarray} \frac{K_{Z,1.1}(R)}{2\pi G} & =& 69\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.7\,\mathrm{kpc}\right)\quad\nonumber \\ &&\qquad (V_c(R_0) = 250\,\mathrm{km\ s}^{-1}), \qquad \end{eqnarray} \tag{ 32 }$

and those from the V_c(R₀) = 210 km s⁻¹ results are

$\begin{eqnarray} \Sigma _{1.1}(R) & =& 65\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.5\,\mathrm{kpc}\right)\nonumber \\ &&\qquad (V_c(R_0) = 210\,\mathrm{km\ s}^{-1}), \end{eqnarray} \tag{ 33 }$

$\begin{eqnarray} \frac{K_{Z,1.1}(R)}{2\pi G} & =& 63\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.8\,\mathrm{kpc}\right)\quad\nonumber \\ &&\qquad (V_c(R_0) = 210\,\mathrm{km\ s}^{-1}), \qquad \end{eqnarray} \tag{ 34 }$

Therefore, the scale length of Σ_1.1(R) hardly changes and the normalization is only affected by a few percent, which is similar to the statistical uncertainties in the fit.

**Figure 16.** The influence of the most important fixed parameters of the fiducial potential model. This figure shows the (logarithmic) difference between the surface-densities $\Sigma _{1.1}^{\mathrm{fid}}$ inferred with the fiducial model for the potential (*V_c* = 230 km s⁻¹, *z_h* = 400 pc, and dln *V_c*(R₀)/dln R = 0) and those inferred with a model where one of these three parameters is changed. The top two panels show the influence of the normalization of the potential *V_c*, which is negligible and much smaller than the naive expectation that $\Sigma _{1.1}^{\mathrm{alt}} / \Sigma _{1.1}^{\mathrm{fid}} = (V_c^{\mathrm{alt}} / V_c^{\mathrm{fid}})^2$ if the surface densities simply scaled with $V_c^2$ (shown by the dashed line). The middle panel shows the outcome of changing the assumed disk scale height *z_h*. The bottom two panels explore the effect of changing the shape of the rotation curve (parameterized using the local logarithmic slope dln *V_c*(R₀)/dln R). The only significant systematic in measuring Σ_1.1 is that related to changing the shape of the rotation curve, which for measurements near R₀ is close to the naive expectation that the effect of changing the slope of the rotation curve is the addition of $|Z|\,({d}V^2_c / {d}R)/2\pi G R$ (dashed curves). However, the change in K_{Z, 1.1}, indicated by the gray diamond symbols, is still small (the change in K_{Z, 1.1} is the same as that for Σ_1.1 for the upper three panels). Thus, the measurement of K_{Z, 1.1}(R) is unaffected by systematics, while the Σ_1.1(R) measurement is robust if the rotation curve is locally close to flat.
Download figure:
Standard image High-resolution image

**Figure 16.** The influence of the most important fixed parameters of the fiducial potential model. This figure shows the (logarithmic) difference between the surface-densities $\Sigma _{1.1}^{\mathrm{fid}}$ inferred with the fiducial model for the potential (*V_c* = 230 km s⁻¹, *z_h* = 400 pc, and dln *V_c*(R₀)/dln R = 0) and those inferred with a model where one of these three parameters is changed. The top two panels show the influence of the normalization of the potential *V_c*, which is negligible and much smaller than the naive expectation that $\Sigma _{1.1}^{\mathrm{alt}} / \Sigma _{1.1}^{\mathrm{fid}} = (V_c^{\mathrm{alt}} / V_c^{\mathrm{fid}})^2$ if the surface densities simply scaled with $V_c^2$ (shown by the dashed line). The middle panel shows the outcome of changing the assumed disk scale height *z_h*. The bottom two panels explore the effect of changing the shape of the rotation curve (parameterized using the local logarithmic slope dln *V_c*(R₀)/dln R). The only significant systematic in measuring Σ_1.1 is that related to changing the shape of the rotation curve, which for measurements near R₀ is close to the naive expectation that the effect of changing the slope of the rotation curve is the addition of $|Z|\,({d}V^2_c / {d}R)/2\pi G R$ (dashed curves). However, the change in K_{Z, 1.1}, indicated by the gray diamond symbols, is still small (the change in K_{Z, 1.1} is the same as that for Σ_1.1 for the upper three panels). Thus, the measurement of K_{Z, 1.1}(R) is unaffected by systematics, while the Σ_1.1(R) measurement is robust if the rotation curve is locally close to flat.
Download figure:
Standard image High-resolution image

We have also performed an extreme test where we set V_c(R₀) = 280 km s⁻¹, far beyond any reasonable setting for this parameter. Setting V_c(R₀) to 280 km s⁻¹ shifts the prior on Σ_1.1(R) in Figure 8 upward, such that many of the best-fit Σ_1.1 in Figure 13 lie on the edge or slightly outside the prior range. We find that the measured Σ_1.1 for V_c(R₀) are all on the lower edge of the prior, such that they clearly indicate that Σ_1.1 is in reality lower than what is allowed by the V_c = 280 km s⁻¹ prior. Even in this extreme case (which should not be trusted, since all measured Σ_1.1 are on the edge of the prior), the scale length of the best-fitting exponential is 2.5 kpc and the normalization only changes to 76 M_☉ pc⁻². This change in normalization is only 10%, while $V_c^2$ has changed by 50%. This and the results for V_c = 210 km s⁻¹ and V_c = 250 km s⁻¹ confirm that we are primarily measuring Σ_1.1 (or K_{Z, 1.1}), such that the assumed normalization of the potential is unimportant as long as it permits the inclusion of the actual Σ_1.1.

In the fiducial model, the scale height of the stellar disk is fixed to z_h = 400 pc, which is the mass-weighted scale height inferred from the MAP decomposition of BO12a and agrees with dynamical measurements of the disk scale height based on Σ(R₀, Z) (Siebert et al. 2003; Zhang et al. 2013). However, the mass-weighted scale height of the disk is still quite uncertain. Because we are primarily measuring Σ_1.1 using the vertical kinematics, we do not expect our results to depend on the disk scale height, which for any reasonable value changes the vertical distribution of disk matter below approximately 1 kpc, but not above it (as at most a few disk M_☉ pc⁻² are above 1 kpc). We have repeated the analysis above using a scale height of 200 pc, which is at the low end of plausible values of this parameter. The middle panel of Figure 16 demonstrates that the inferred Σ_1.1 for all MAPs barely change when using a different stellar-disk scale height, and all changes are within the random uncertainties of the measurement. The inferred Σ_1.1(R) and K_{Z, 1.1}(R) are almost exactly the same as those measured using the fiducial potential model:

$\begin{eqnarray} \Sigma _{1.1}(R) & =& 70\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.6\,\mathrm{kpc}\right)\nonumber \\ &&\qquad\mathrm{(scale\ height}\ = 200\,\mathrm{pc}), \end{eqnarray} \tag{ 35 }$

and

$\begin{eqnarray} \frac{K_{Z,1.1}(R)}{2\pi G} & =& 67\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.7\,\mathrm{kpc}\right)\quad\nonumber \\ &&\qquad \mathrm{(scale\ height}\ = 200\,\mathrm{pc}). \qquad \end{eqnarray} \tag{ 36 }$

The final important parameter of the gravitational-potential model is the slope of the rotation curve, set to dln V_c(R₀)/dln R = 0 in the fiducial model. From the vertical Poisson equation it is clear that the slope of the rotation curve changes the surface density measured from vertical kinematics by adding approximately $|Z|\,({d}V^2_c / {d}R)/2\pi G R$ (Kuijken & Gilmore 1989a; Bovy & Tremaine 2012). Locally this amounts to a systematic uncertainty of

${\approx }7\,M_\odot \,\mathrm{pc}^{-2} (|Z| / 1.1\,\mathrm{kpc}) ([{d}\ln V_c / {d}\ln R]/0.1)$

$\qquad \quad (V_c / 230\,\mathrm{km\ s}^{-1})^2(R_0 / 8\,\mathrm{kpc})^{-2}$

given the current uncertainty in the slope of the rotation curve (see below). However, we expect the vertical force K_Z to be unaffected by this because the vertical Jeans equation shows that K_Z is constrained by the vertical dynamics only and that it does not depend on the slope of the rotation curve.

The bottom two panels of Figure 16 show the change in the measured Σ_1.1 and K_{Z, 1.1} (gray diamond symbols) when changing the assumed slope of the rotation curve. This changes the slope of the model rotation curve at all R and the dashed lines show the expectation from the simple calculation given in the previous paragraph assuming that the change is such that dln V_c/dln R = +/− 0.1 at all R. It is clear that the measurements of Σ_1.1 are strongly and systematically affected by the change in the slope of the rotation curve, especially near R₀. The inferred Σ_1.1(R) profile changes by about twice the random uncertainties:

$\begin{eqnarray} \Sigma _{1.1}(R) & =& 65\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.3\,\mathrm{kpc}\right)\nonumber \\ &&\qquad ({d}\ln V_c(R_0) / {d}\ln R= -0.1), \end{eqnarray} \tag{ 37 }$

$\begin{eqnarray} \Sigma _{1.1}(R) & = & 72\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.7\,\mathrm{kpc}\right)\nonumber \\ &&\qquad ({d}\ln V_c(R_0) / {d}\ln R= 0.1). \end{eqnarray} \tag{ 38 }$

However, the measured K_{Z, 1.1} is much less affected by the change in the slope of the rotation curve, and the change in K_{Z, 1.1} is well within the random uncertainties for all MAPs. The inferred K_{Z, 1.1}(R) profile changes only by about 1σ:

$\begin{eqnarray} \frac{K_{Z,1.1}(R)}{2\pi G} & =& 69\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.6\,\mathrm{kpc}\right)\nonumber \\ & &\qquad ({d}\ln V_c(R_0) / {d}\ln R = -0.1), \end{eqnarray} \tag{ 39 }$

$\begin{eqnarray} \frac{K_{Z,1.1}(R)}{2\pi G} & = & 63\,M_\odot \,\mathrm{pc}^{-2}\,\exp \left(-(R-R_0)/ 2.8\,\mathrm{kpc}\right)\nonumber \\ &&\qquad ({d}\ln V_c(R_0) / {d}\ln R = 0.1). \end{eqnarray} \tag{ 40 }$

Therefore, the inferred K_{Z, 1.1} are more robust to changes in the assumed rotation curve. When using our measurements in analyses where the rotation curve is varied or assumed to be very different from the flat rotation curve in our fiducial model, we recommend using the K_{Z, 1.1} measurements instead of the Σ_1.1 measurements.

We have also investigated the effect of changing the value of R₀. One would expect that our measurements, which are based on a volume centered on the Sun, are at constant distances from the assumed R₀. Therefore, we should find that a measurement of Σ_1.1 or K_{Z, 1.1} at R for R₀ = 8 kpc is at R + ΔR, where ΔR = R₀ − 8 kpc, for a different value of R₀. We have performed the full Σ_1.1 measurement for the two MAPs in Figure 10 using R₀ = 8.5 kpc and have found that the inferred Σ_1.1 and K_{Z, 1.1} is indeed shifted by R₀ − 8 kpc = 0.5 kpc for these MAPs, such that their measured Σ_1.1 and K_{Z, 1.1} are the same as for the fiducial R₀, but at Galactocentric radii that are 0.5 kpc larger. We are therefore confident that this holds for the measurements from all of the MAPs. Table 3 contains a column that lists the radius R₀ − R at which Σ_1.1 and K_{Z, 1.1} is measured; this value should be kept constant when changing the assumed value of R₀.

The effect of changing R₀ is only to shift the measured Σ_1.1(R) and K_{Z, 1.1}(R) profiles without changing their (logarithmic) slopes. Below, we are primarily interested in determining the mass scale length of the stellar disk, and this measurement is mainly dependent on the logarithmic slope of the surface density profile. As this slope does not depend on the assumed value of R₀, we do not vary R₀ in the analysis below.

5. MEASUREMENT OF THE DISK SCALE LENGTH AND CONSTRAINTS ON THE MASS PROFILE IN THE INNER MILKY WAY

The measurement of the vertical force profile between 4.5 kpc and 9 kpc presented in the previous section provides a strong new constraint on mass models for the inner Milky Way. We explore these constraints in this section. We find that the stellar disk scale length is largely constrained by the measurements of K_{Z, 1.1}(R), but we also consider additional data on the rotation curve and the local vertical mass distribution to separate the disk and halo contributions to the total mass. These additional data are described in Section 5.1. The results of mass-model fits to these data are presented in Section 5.2.

5.1. Additional Data

As additional constraints on the mass distribution in the inner Milky Way, we use data on the rotation curve and the vertical mass distribution near the Sun. The rotation-curve data that we use are in the form of terminal velocities measured through H i and CO emission. In the fourth quadrant we use the H i data from McClure-Griffiths & Dickey (2007) and in the first quadrant we use the CO data from Clemens (1985), both binned into bins of Δl = 1°. These terminal velocity measurements are shown in the bottom panel of Figure 17. Because the terminal velocities are characterized by wiggles on the order of 10 km s⁻¹ that are likely due to the effects of nonaxisymmetry, we model these data very conservatively by assuming that they have an uncertainty of 7 km s⁻¹ correlated over Δsin l = 0.125 ≈ 1 kpc. This makes sure that wiggles typical of nonaxisymmetry do not affect our results. Because the terminal velocities depend on the Sun's peculiar velocity in the plane of the Milky Way through terms proportional to sin l and cos l, we further marginalize over such terms in the fits below, so that any effects of nonaxisymmetry and the Sun's peculiar velocity do not affect the analysis.

**Figure 17.** The upper panel shows K_{Z, 1.1}(R) of the best-fit mass model for the inner Milky Way when fitting the K_{Z, 1.1}(R) measurements from this article, the terminal velocity data, the measurement of the slope of the rotation curve from Bovy et al. (2012a), and the measurement of the local contribution from the stellar disk from Zhang et al. (2013). The black dots are the K_{Z, 1.1} measurements from this article (Figure 14 and Table 3), and the measurement from Zhang et al. (2013) is indicated by a gray diamond. The lower panel compares the terminal velocity curve for the best-fitting mass model with the terminal velocity data of Clemens (1985; longitudes 40° to 80°) and McClure-Griffiths & Dickey (2007; longitudes −80° to −40°).
Download figure:
Standard image High-resolution image

We further use constraints on the local slope of the rotation curve. Such constraints exists, e.g., from the measurement of the Oort constants (e.g., Feast & Whitelock 1997), from the kinematics of masers in the disk of the Milky Way (Reid et al. 2009; Bovy et al. 2009), or from the kinematics of stars throughout the disk (Bovy et al. 2012a). Since the measurement of dln V_c(R₀)/dln R of Bovy et al. (2012a) encompasses the results from these different analyses, we use it to represent the current uncertainty in the logarithmic slope of the rotation curve. We represent the Bovy et al. (2012a) measurement with the simple analytic form

$\begin{eqnarray} p\left(\frac{{d}\ln V_c(R_0)}{{d}\ln R} \right)& = 0\qquad \mathrm{if}\ \frac{{d}\ln V_c(R_0)}{{d}\ln R} > 0.04,\\ p\left(\frac{{d}\ln V_c(R_0)}{{d}\ln R} \right)& = W\,e^{-W},\ \mathrm{otherwise},\nonumber \\ & \quad \mathrm{where}\ W = \left(1 - \frac{1}{0.04}\,\frac{{d}\ln V_c(R_0)}{{d}\ln R}\right),\nonumber \end{eqnarray} \tag{ 41 }$

which approximately matches the PDF for dln V_c(R₀)/dln R of Bovy et al. (2012a). We stress that we do not use any measurements of the circular velocity itself. This way we can assess what direct measurements of the mass distribution combined with measurements of the shape of the rotation curve (but not its normalization) constrain the circular velocity to be.

Finally, we also include measurements of the vertical mass distribution near the Sun, Σ(R₀, Z). A number of such measurements exist in the literature and we use the results from Zhang et al. (2013), which are the best measurements of Σ(R₀, Z) to date and are consistent with all other measurements. We only use the measurement of K_{Z, 1.1}(R₀) = 67 ± 6 M_☉ pc⁻² and the measurement of the stellar disk surface density Σ_*(R₀) = 42 ± 5 M_☉ pc⁻². The measurement of K_{Z, 1.1}(R₀) is consistent with our much tighter measurement in Section 4.2 and therefore it does not contain much additional information, but the measurement of the stellar disk surface density is useful for separating the contribution from the disk and the dark halo to the mass budget.

5.2. Results

We use the same mass model as described in Section 3.2, except that we replace the bulge model with an exponentially cut off power-law with a power-law exponent of −1.8, a cut-off radius of 1.9 kpc, and a mass of 6 × 10⁹ M_☉, because this is a more realistic model for the mass distribution of the bulge (see Binney & Tremaine 2008; McMillan 2011). We fit this to various combinations of (1) the measurements of K_{Z, 1.1}(R) of Section 4.2, (2) the terminal velocity data, (3) the constraint on the local slope of the rotation curve from Equation (41), and (4) the measurements of K_{Z, 1.1}(R₀) and Σ_*(R₀) from Zhang et al. (2013). We now vary all five of the basic parameters of the mass model, that is, the stellar disk mass scale length and scale height, the circular velocity, the relative halo-to-disk contribution to $V_c^2(R_0)$ , and the local dln V_c/dln R. None of the considered data really constrains the stellar scale height, but its value is uncorrelated with the value of the other parameters; we let it vary between 100 pc and 500 pc. PDFs showing the primary results from these fits are shown in Figures 18–20. A comparison between the best-fit model using all of the dynamical data and the K_{Z, 1.1}(R) measurements of this article and the terminal velocities is shown in Figure 17.

Figure 18 shows the joint PDF for the stellar disk scale length and the contribution of the stellar disk to the circular velocity at 2.2 scale lengths (≈ the peak of the disk rotation curve). The latter parameter determines whether the Milky Way's disk is maximal using the definition of Sackett (1997), according to which a disk is maximal when its contribution to V_c at 2.2 scale lengths is 85% ± 10%. We show the constraints from different combinations of the dynamical constraints. The red curves show the contours of the PDF based on the K_{Z, 1.1}(R) measurements from this article only, while the black curves give the constraints based on all of the dynamical data. It is clear that the measurements of K_{Z, 1.1}(R) from this article alone measure the scale length, largely independent of the contribution to the mass from the other Galactic components. Figure 18 also shows that the additional dynamical data from Section 5.1 does not constrain the stellar disk scale length. The joint PDF for the scale length and disk-maximality parameter based on the rotation curve and Σ(R₀, Z) does show the familiar relation that for the disk to be maximal, the disk scale length needs to be small.

5.2.1. Constraints on the Galactic Disk Mass Distribution

Therefore, we conclude that the K_{Z, 1.1}(R) measurements from this article are the single most important existing dynamical constraint on the stellar disk scale length of the Milky Way. The result from the combined fit to all dynamical data gives

$\begin{equation} \mathrm{stellar\ disk\ scale\ length} = 2.15\pm 0.14\,\mathrm{kpc}. \end{equation} \tag{ 42 }$

The combination of the K_{Z, 1.1}(R) measurements and the additional dynamical data shows that the disk is maximal since

$\begin{equation} \frac{V_{c,*}}{V_c} |_{2.2\,R_d} = 0.83\pm 0.04. \end{equation} \tag{ 43 }$

These measurements of the stellar disk scale length and its contribution to the rotation curve allow us to derive the mass of the disk. We can measure the surface density of the stellar disk because the additional data on the rotation curve described in Section 5.1 combined with the K_{Z, 1.1}(R) measurements from this article allow us to disentangle the stellar and dark-halo contributions to Σ_1.1(R). We find that the surface density of the stellar disk at R₀ is Σ_*(R₀) = 38 ± 4 M_☉ pc⁻², for a total surface density to 1.1 kpc of Σ_1.1(R₀) = 68 ± 4 M_☉ pc⁻², 13 M_☉ pc⁻² of which is assumed to be in the thin interstellar medium (ISM) layer. As a consequence of ours being a full 3D dynamical model, this measurement of Σ_1.1(R₀) is a real measurement of Σ_1.1(R₀) as opposed to a measurement of K_{Z, 1.1}(R₀) converted to Σ_1.1(R₀). The fact that it agrees so well with the local normalization of our measured Σ_1.1(R) profile in Equation (27) is due to the fact that the local slope of the circular velocity curve is very close to flat in our best-fit model (see below). Our measurement of the local surface density to 1.1 kpc is consistent with all previous measurements (which are really K_{Z, 1.1}(R₀) measurements) but with a smaller uncertainty (Kuijken & Gilmore 1989b; Siebert et al. 2003; Holmberg & Flynn 2004; Bienaymé et al. 2006; Garbari et al. 2012; Zhang et al. 2013). We note in particular that the measurement of Garbari et al. (2012) of Σ_1.1(R₀) = 105 ± 24 M_☉ pc⁻² is consistent with our measurement; our much smaller errorbar leads to a much tighter measurement of the local dark matter density (see below).

As mentioned above, none of the dynamical data in Section 5.1 constrain the mass scale height of the stellar disk: we obtain a flat PDF for z_h over the prior range 100 pc < z_h < 500 pc. To illustrate how z_h can be constrained using measurements of the (surface density) at heights different from |Z| = 1.1 kpc, we fit the dynamical data from Section 5.1 together with the constraint on the total midplane density at R₀ from Holmberg & Flynn (2000): ρ_total(R₀, |Z| = 0) = 0.102 ± 0.010 M_☉ pc⁻³. While all of the other dynamical parameters are unchanged, in this case we do constrain the mass scale height: z_h = 370 ± 60 pc, which is in good agreement with the measurements from star counts (z_h ≈ 400 pc, see above and BO12d) and constraints from the vertical profile of K_Z(R₀, Z) (z_h < 430 pc at 84% confidence; Zhang et al. 2013).

Using our measurements of the disk mass scale length and the local disk normalization, we can derive the total mass of the disk to the extent that a characterization of the disk with a single scale length makes sense. We find that the total stellar disk mass is M_* = 4.6 ± 0.3 × 10¹⁰ M_☉. Under our assumption that the local ISM column density is 13 M_☉ pc⁻² and the ISM layer has a scale length twice that of the stellar disk, the ISM contributes ≈0.7 × 10¹⁰ M_☉ for a total (stars+gas) disk mass of M_disk = 5.3 ± 0.4 × 10¹⁰ M_☉. Note that in our model the ISM layer's scale length is tied to that of the stellar disk and the local normalization is fixed. Therefore, the mass of the ISM layer is perfectly correlated with that of the stellar disk, and the uncertainty in the total disk mass would be 0.3 × 10¹⁰ M_☉. For this reason, we have added an uncertainty of 0.3 × 10¹⁰ M_☉, or almost 50% of the ISM mass, in quadrature to the formal uncertainty in the disk mass. The bulge contributes another ≈6 × 10⁹ M_☉ (Binney & Tremaine 2008), while the stellar halo mass is negligible, such that the total baryonic mass of the Milky Way is

$\begin{equation} M_{\mathrm{baryonic}} = 5.9\pm 0.5\times 10^{10}\,M_\odot, \end{equation} \tag{ 44 }$

assuming an uncertainty of 0.3 × 10¹⁰ M_☉ on the bulge mass. To derive these masses, we have assumed that R₀ = 8 kpc. Changing R₀ to a different value does not change the measurements made in this subsection or in subsequent subsections, except for the total masses, which all increase by 1.5 × 10¹⁰ M_☉ when increasing R₀ to 8.5 kpc; this change is approximately linear in (R₀ − 8 kpc). The change is so large because of the short scale length of the disk: a small change in R₀ leads to a big change in R₀/R_d. All other measurements are independent of R₀, in particular the disk scale length, disk maximality V_{c, disk}/V_{c, total} at R = 2.2 R_d, and the local surface densities.

Our finding that the Milky Way's disk has a short scale length raises the question of whether such a massive disk is stable to axisymmetric perturbations, i.e., whether Toomre's Q > 1 (Toomre 1964). We do not discuss this here in detail as this needs to be looked at carefully, accounting for the mix of different components of the disk and their radial dispersion profiles (which are poorly constrained currently), as well as for the finite thickness of the disk.

5.2.2. Constraints on the Dark Matter Halo

A plausible way toward measuring the local dark halo density profile is to combine the rotation curve, which measures the total mass as a function of radius but is relatively insensitive to the flattening and therefore cannot separate the disk's contribution from the halo's contribution, with independent measurements of the disk contribution as a function of radius, as provided in this article. The constraints on the halo parameters—the local normalization ρ_DM(R₀) and power-law index α in ρ_DM∝1/r^α—from our data marginalized over all other mass-model parameters (including V_c(R₀)) are shown in Figure 19. We show the constraints from measurements of the vertical dynamics (K_{Z, 1.1}(R) and Σ_*(R₀)) and those from the rotation curve (terminal velocities and dln V_c/dln R) alone; neither of these constrains the radial profile of the dark halo, and the entire prior range 0 < α < 3 is allowed at 2σ. However, the combination of these two dynamical probes allows us to put a first constraint on the radial profile. The combination gives

$\begin{equation} \alpha \le 1.53\quad \ \mathrm{at}\ 4\,\mathrm{kpc}<R<9\,\mathrm{kpc}\quad (\mathrm{95\%\ confidence}). \end{equation} \tag{ 45 }$

This encompasses a cored halo as well as an NFW halo, although very steep halo density profiles are ruled out. We constrain the local dark matter density to be ρ_DM(R₀) = 0.008 ± 0.0025 M_☉ pc⁻³, consistent with more direct measurements of this quantity (e.g., Bovy & Tremaine 2012; Zhang et al. 2013), which measure the dark matter density using the vertical dependence of K_Z(R₀, |Z|) (≈Σ(R₀, Z) rather than the radial dependence of K_{Z, 1.1} as we do here.

**Figure 19.** Contours of the joint PDF for the local dark-matter halo density and its local logarithmic radial slope. The halo is modeled with a power-law density profile ρ_DM(r) = ρ_DM(R₀, Z = 0) (R₀/r)^α. Neither the measurements of the vertical mass profile (local and K_{Z, 1.1}(R)) nor the measurements of the shape of the circular velocity curve (terminal velocities V_term and local slope dln *V_c*(R₀)/dln R) constrain the radial profile of the halo very much. However, the combination of these two dynamical probes give a first constraint on the local halo profile: α < 1.53 at 95% confidence. This combination is different from multiplying the red and yellow PDFs, as both of these PDFs are marginalized over the other parameters of the Galactic potential. In particular, the steep halo-density peaks in the red and yellow PDFs correspond to very different and mutually inconsistent slopes of the rotation curve, which is why they are disfavored in the combined PDF.
Download figure:
Standard image High-resolution image

5.2.3. Constraints on the Rotation Curve

Finally, while we do not measure the local circular velocity in a direct manner, we do constrain it indirectly in its guise of providing the normalization of the forces in our model and thus setting the mass scale. A comparison between the value of V_c(R₀) measured in this way and more direct measurements can therefore test whether the vertical dynamics is consistent with the planar dynamics. Figure 20 shows the joint PDF for the local V_c and the local logarithmic slope of the rotation curve, dln V_c/dln R. The constraints from the terminal velocities alone show the familiar degeneracy, indicating that the terminal velocities only measure a combination of V_c and the slope of V_c(R). The constraints from the vertical dynamics (red curves) have a different degeneracy and strongly disfavor rising rotation curves. The combination of the terminal velocities and the vertical dynamics therefore measures the properties of the circular velocity curve, and we find that V_c = 218 ± 10 km s⁻¹ and dln V_c(R₀)/dln R = −0.06 ± 0.05, consistent with a flat rotation curve. These measurements are consistent with the recent APOGEE measurements of Bovy et al. (2012a), which are shown for comparison. They are also consistent with the measurement of the angular rotation frequency at the Sun by Feast & Whitelock (1997) who found V_c/R₀ = 27.19 ± 0.87 km s⁻¹ kpc⁻¹ and dln V_c(R₀)/dln R = −0.09 ± 0.05. We emphasize that our measurement of V_c in this article does not rely on the Sun's peculiar rotational velocity. For a further discussion of how a measurement of V_c = 218 km s⁻¹ compares with the literature we refer the reader to Section 5.3 of Bovy et al. (2012a); suffice it to say that all previous measurements are consistent with this measurement. A combination of the data considered in this article with the APOGEE results gives V_c = 219 ± 4 km s⁻¹ and dln V_c(R₀)/dln R = −0.06 ± 0.04. As the measurements of this article and the APOGEE measurements are very different in the way that they probe the dynamics of the disk, the fact that these two measurements agree on V_c strongly argues that V_c ≈ 220 km s⁻¹.

**Figure 20.** Joint PDF for the circular velocity at R₀ and the local logarithmic slope of the circular velocity curve. The terminal velocity curve alone constrains *V_c* to be around 220 km s⁻¹ for a flat rotation curve and a larger (smaller) *V_c* for a rising (falling) rotation curve). The measurements of the vertical mass distribution (red curves) give the opposite constraint: for *V_c* to be larger than 220 km s⁻¹, the rotation curve needs to be falling near R₀. The combination of rotation-curve shape constraints and surface-density measurements requires *V_c* = 218 ± 10 km s⁻¹ and a gently falling rotation curve dln *V_c*/dln R = −0.06 ± 0.05. This is consistent with the recent direct measurements of these quantities from stellar kinematics in the plane by APOGEE (Bovy et al. 2012a), which are shown for comparison.
Download figure:
Standard image High-resolution image

Figure 21 shows a different representation of all of the results described in this section. Shown are the total rotation curve and its decomposition into stellar-disk and halo contributions. The total and stellar-disk rotation curves are quite tightly constrained by our dynamical data. This is mostly due to our precise measurement of the stellar disk scale length, which was made possible by our measurements of K_{Z, 1.1}(R) over 4.5 kpc < R < 9 kpc. The dark halo contributes significantly less to V_c(R) than the stellar-disk at all R < 10 kpc. Figure 21 decidedly shows that we have for the first time clearly—and through direct dynamical measurement—separated the disk and halo contributions to the Milky Way's rotation curve.

6. DISCUSSION

6.1. First Dynamical Measurement of the Milky Way's Scale Length

We believe that this article presents the first dynamical measurement of the Milky Way disk's mass profile. Other measurements of the scale length are either based on star counts and it is therefore unclear whether they trace all of the mass in the disk (e.g., Jurić et al. 2008; BO12d), or they are based on previous dynamical data that leave the scale length essentially unconstrained (Dehnen & Binney 1998; Figure 18) unless strong priors are used (e.g., McMillan 2011). It turns out that our best-fit model for the mass distribution in the inner 10 kpc of the Milky Way is similar to that of model I in Binney & Tremaine (2008), which has a maximal disk with a scale length of 2 kpc.

If star counts do trace the underlying mass distribution, then we can compare our dynamically inferred scale length with that measured from star counts. There have been many measurements over the last few decades of the radial scale lengths of the thin- and thick-disk components spanning a wide range between 2 and 5 kpc. These measurements have greatly improved over the last few years with the advent of larger-area surveys with precise multiband photometry leading to better photometric distances. For example, Gould et al. (1996) measured a scale length of 3.0 ± 0.4 kpc from Hubble Space Telescope (HST) star counts of M dwarfs, whose distribution is expected to trace that of the underlying stellar mass, and Jurić et al. (2008) found from an analysis of SDSS star counts that the thin disk scale length is 2.6 kpc. Both of these are somewhat larger than the scale length measured in this article. This offset may be due to systematic uncertainties in the photometric distances used by analyses of star counts. Another, in our view more likely, explanation is that these analyses did not take into account that the radial scale lengths of different stellar disk components vary strongly. In particular, the scale lengths of the old, thick components of the disk are only ≈2 kpc (Bensby et al. 2011; BO12d; Cheng et al. 2012). This lowers the scale length of the mass profile compared with that of the thin-disk components, an effect which is larger at R < R₀, where most of our Σ_1.1 measurements lie.

A proper comparison between the dynamically inferred scale length and that measured from star counts should therefore take into account the full, complex structure of the disk. BO12a showed that the disk is made up of many components with scale lengths ranging from 2 kpc for the oldest populations to >4.5 kpc for the younger populations. The mass scale length of the stellar disk is determined by the sum of all of these components, whose combined density profile defines an effective disk scale length at every radius. This effective disk scale length based on the reassembly of the MAP decomposition of BO12a is shown in Figure 22 (the MAP scale lengths of BO12d have been re-scaled to the An09 distance scale). The stellar mass profile derived from summing over all MAPs is not a single exponential, but has a profile whose effective disk scale length increases smoothly from ≈2 kpc in the inner disk to ≈3 kpc at R = 12 kpc. The dynamical measurement from this article compares well with that inferred from star counts: the dynamical estimate falls short of the star-counts measurement by about 1σ. Nevertheless, this comparison lends credence to the interpretation that the stars in the Galactic disk are indeed the dominant contributors to the "dynamically inferred disk mass" derived here. With measurements in the outer disk at R > 10 kpc, the MAP star-counts model predicts that one should dynamically measure a mass scale length >2.5 kpc.

**Figure 22.** Comparison between the effective disk scale length determined from star counts and the dynamically measured disk mass scale length from this article. The prediction from star counts is obtained by taking the scale length and stellar surface density measurements of MAPs of BO12a and measuring the scale length of the effective radial profile that is obtained by summing over all MAPs. The errorbar on the radius of the dynamically measured scale length indicates the range over which it is measured in this article (cf. Figure 13). The scale length determined by star counts is in excellent agreement with the dynamically measured scale length.
Download figure:
Standard image High-resolution image

We can further compare our dynamical measurement of the stellar-disk surface density at R₀, Σ_*(R₀) = 38 ± 4 M_☉ pc⁻², with estimates based on direct star counts. Estimates of the total amount of ordinary stellar matter are Σ_visible(R₀) ≈ 30 M_☉ pc⁻² (Flynn et al. 2006; BO12b) with an additional 7 M_☉ pc⁻² contributed by stellar remnants and brown dwarfs (Flynn et al. 2006). These estimates have uncertainties of a few M_☉ pc⁻², which includes the uncertainty in the shape of the initial mass function at low masses (see BO12b). Thus, our dynamical measurement of the disk mass properties is in excellent agreement with direct star-counts.

That the dynamical and star-counts measurements of the stellar-disk scale length and the local stellar surface density agree so well is evidence that basically all of the mass in the stellar disk is accounted for by ordinary stellar matter orbiting under the influence of Newtonian gravity. Our results leave little room for a dark disk component in the Milky Way (Read et al. 2008) because the presence of such a component would increase our estimates of the local surface density and the mass scale length. Similarly, in the MOND theory of modified gravity the local surface-density to 1.1 kpc is predicted to be enhanced with respect to the contribution from baryonic matter by ≳ 60%⁷ (Nipoti et al. 2007; Bienaymé et al. 2009; Famaey & McGaugh 2012), such that we expect for Σ_baryonic(R₀) = 51 ± 4 M_☉ pc⁻² that $\Sigma _{1.1}^{\mathrm{MOND}}(R_0) \gtrsim 82\pm 6\,M_\odot \,\mathrm{pc}^2$ . This prediction is in ≳ 2σ tension with our measurement of Σ_1.1(R₀) = 68 ± 4 M_☉ pc⁻². Furthermore, in MOND the dynamically inferred disk scale length (i.e., that inferred from K_Z(R) measurements) is predicted to be 25% larger than that measured from star counts (Bienaymé et al. 2009), so we would expect to dynamically measure a scale length of ≈2.9 kpc. This is 5σ removed from our measurement of the mass scale length. We emphasize that these are preliminary tests of MOND based on the Bekenstein–Milgrom (Bekenstein & Milgrom 1984) formulation of MOND and that these tests should more fully take into account the uncertain structure of the baryonic disk. However, it is clear that the measurements from this article and further improved measurements of the vertical forces in the Milky Way are key to improved tests of modified gravity models such as MOND on galaxy scales.

6.2. The Mass of the Galactic Disk

Our measurement of the mass scale length of the disk now makes our measurement of the Galactic disk mass the most accurate estimate to date. Our measurement of M_* = 4.6 ± 0.3 × 10¹⁰ M_☉ compares well with the range found by Flynn et al. (2006; who also assume R₀ = 8 kpc), who estimated the stellar-disk mass as a function of an assumed scale length. Our measurement of the scale length falls at the lower end of their considered range where their stellar disk mass is ≈5 × 10¹⁰ M_☉. Our estimate of the baryonic mass in the Galaxy of M_baryonic = 5.9 ± 0.5 × 10¹⁰ M_☉ also agrees with their estimate of M_baryonic = 6.1 ± 0.5 × 10¹⁰ M_☉. However, it is important to note that ours is a purely dynamical measurement, while Flynn's value relies on assumptions about the stellar mass function in the disk and about the manner in which mass traces light. Both measurements are similarly affected by the uncertainty in R₀.

6.3. Disk Maximality and the Milky Way Compared to External Galaxies

The measurements of the Milky Way's disk mass scale length and stellar mass from this article allow us to make more precise comparisons of the Milky Way with external galaxies, e.g., through the Tully–Fisher relation, initially put forth by Flynn et al. (2006). A main source of systematic uncertainty in Flynn et al. (2006) was the uncertainty in the value of the radial scale length, which a priori could have been anywhere between 2 kpc and 5 kpc. Our measurement of R_d = 2.15 ± 0.14 kpc removes this source of uncertainty. Using Flynn et al.'s (2006) calculations of the total I-band luminosity of the Milky Way we find that $L_I^{\mathrm{disk}} \approx 3.5\times 10^{10}\,L_\odot$ (M_I = −22.2). Using a bulge luminosity of 10¹⁰ L_☉ (Kent et al. 1991), we find that

$\begin{equation} L_I \approx 4.5\times 10^{10}\,L_\odot, \ \mathrm{or}\qquad M_I \approx -22.5. \end{equation} \tag{ 46 }$

Combined with our measurement that V_c = 218 ± 10 km s⁻¹, this allows us to place the Milky Way onto the Tully–Fisher relation defined by nearby disk galaxies: it falls well within the 1σ scatter (e.g., Dale et al. 1999) as this relation predicts M_I = −22.8 for V_c = 220 km s⁻¹ with a scatter of 0.4 mag. Therefore, from the point of view of the Tully–Fisher relation, the Milky Way is a typical galaxy.

The decomposition of the Milky Way's rotation curve into the contributions from the stellar disk and the dark-matter halo in Section 5.2 has shown that the Milky Way's disk is maximal by the definition of Sackett (1997). This appears to be in conflict with arguments based on the lack of correlation between Tully–Fisher velocity residuals and disk size for external galaxies (Courteau & Rix 1999), which have been used to argue against disks being maximal. However, whether this really shows that all disks are submaximal is far from clear, as dynamical modeling of gas kinematics in various galaxies and constraints from spiral structure have shown that some disks, especially those with V_c > 200 km s⁻¹, are maximal (Athanassoula et al. 1987; Weiner et al. 2001; Kranz et al. 2003), while being consistent with the considerations of Courteau & Rix (1999). Measurements of the vertical velocity dispersions of external galaxies have been interpreted to indicate that disks are substantially submaximal (e.g., Bottema 1997 and more recently Kregel et al. 2005 and Bershady et al. 2011). Such measurements rely on the same dynamical principles as those employed in this article's measurement, but these are much more difficult to apply in external galaxies without strong assumptions. It is essentially impossible to measure both the scale height and the vertical velocity dispersion for any individual external galaxy. Additionally, velocity dispersions obtained from integrated light do not trace the older, dynamically relaxed stellar populations very well. As such, these measurements are afflicted with systematic uncertainties, which have not been sufficiently investigated. The fact that the Milky Way would appear substantially submaximal in the analysis of Bershady et al. (2011) while the detailed dynamical modeling in this article indicates otherwise may be a sign of these systematic uncertainties.⁸

In Figure 23 we compare our measurements of the Milky Way disk's properties to those derived from a sample of 81 disk-dominated galaxies from Pizagno et al. (2005). We follow Gnedin et al. (2007) in using the measurements of the disk scale lengths, stellar masses, and circular velocities at 2.2 disk scale lengths to derive the disk's contribution to the rotation curve at 2.2 disk scale lengths. In detail, we scale the stellar masses down by 20% to correct them for the presence of any bulge component and we lower the observed scale lengths by 10% as these scale lengths are measured from g- and r-band images, which are typically larger than the K-band scale length, which in turn better traces the stellar mass (de Jong 1996). The top panel shows that the Milky Way falls nicely within the stellar-mass–disk-maximality relation derived from the Pizagno et al. (2005) data. However, the Milky Way's scale length is quite different from that of external galaxies with similar stellar masses: for its stellar mass, the Milky Way would be expected to have a scale length of ≈4.5 kpc, albeit with a large scatter of ≈2 kpc This is clear from the color-coding of the points in the top panel of Figure 23. Following Gnedin et al. (2007), we also consider the relation between the disk's surface density ( $\equiv M_* \,R_d^{-2}$ ) and the fraction of the circular velocity contributed by the disk. This is shown in the bottom panel of Figure 23. The short scale length that we measure for the Milky Way has the effect of putting the Milky Way at the upper edge of the observed surface densities, but the Milky Way still falls along the general trend determined by external galaxies.

**Figure 23.** The Milky Way's disk properties compared to 81 external galaxies from Pizagno et al. (2005). The top panel shows the relation between stellar mass and the disk's contribution to the rotation velocity at 2.2 scale lengths (the extent to which the disk is maximal, see Figure 18) Points are color-coded by the radial scale length. The bottom panel shows the relation between the surface density ( $\equiv M_*\,R_d^{-2}$ ) and the disk maximality. The Milky Way falls along the general trends defined by external galaxies, except that its scale length appears short compared to that of similar external galaxies.
Download figure:
Standard image High-resolution image

Therefore, if the Milky Way is atypical in any way, it is because of its short scale length. However, the mass-weighted scale length is only poorly measured for external galaxies, because variations in the mass-to-light ratio with radius may hamper the photometrically measured disk profiles.

6.4. MAPs as Dynamically Phase-mixed Populations

The analysis in this article has used the properties of MAPs as measured in BO12a to justify modeling the MAPs as phase-mixed, steady-state stellar populations that lend themselves to simple dynamical models. This has worked well and all MAPs lead to a consistent measurement of the vertical mass distribution near the disk. The single-qDF-per-MAP model provides a good fit to the spatial and kinematic properties of MAPs, as explicitly shown in the detailed comparisons between the data and the best-fit dynamical models in Appendix B. Thus, the MAPs are indeed well-described by the action-based qDF, as first proposed by Ting et al. (2013), although with the caveat that we have not modeled the radial and azimuthal velocities.

The fact that all MAPs can be described by the qDF adds further to the evidence that MAPs are simple dynamical building blocks of the (local) disk, as first proposed by BO12a. In particular the MAPs with scale heights and velocity dispersions intermediate between those of the canonical thin and thick disk are real dynamical populations. If these intermediate MAPs would have arisen because of abundance errors, as was already strongly argued against based on their observed spatial properties and kinematics in BO12a, we would not expect to be able to model them using a consistent dynamical model. For example, we would expect that the vertical profile measurements would be dominated by the stars in the thin component, while the dispersion measurements would be dominated by the stars in the thicker, kinematically warmer component, which would strongly overestimate the surface density (by a factor of three or more). The fact that this does not happen strongly argues that the intermediate MAPs are real phase-mixed stellar populations in a dynamical steady state.

7. CONCLUSIONS

In this article we have used six-dimensional dynamical fitting employing three-action-based DFs in order to model abundance-selected stellar populations from the SEGUE survey, fully accounting for the selection function and deriving dynamical constraints while marginalizing over properties of the DF. We have used this in particular to obtain a measurement of Σ_1.1 (or K_{Z, 1.1}) at a single Galactocentric radius for each MAP, thus dynamically measuring, for the first time, the radial profile of the surface density near the disk. These measurements are given in Table 3, and they present stringent new constraints on the mass distribution in the inner Milky Way.

We have used these new measurements of K_{Z, 1.1}(R) in addition to (weak) existing measurements of the terminal velocity curve between 4 kpc and R₀, the contribution of the disk to the local surface density, and the local slope of the rotation curve to constrain the gravitational potential between 4 kpc and 10 kpc. We find that our new measurements of K_{Z, 1.1}(R) provide the only dynamical constraint able to measure the dynamical (mass-weighted) disk scale length; in combination with the other dynamical constraints we can measure the properties of the Milky Way's disk to great precision and find

$\begin{eqnarray*} \mathrm{stellar\ disk\ scale\ length} & = 2.15\pm 0.14\,\mathrm{kpc},\\ \Sigma _{*}(R_0) & = 38\pm 4\,M_\odot \,\mathrm{pc}^{-2},\\ \Sigma _{\mathrm{disk}}(R_0) & = 51\pm 4\,M_\odot \,\mathrm{pc}^{-2}, \end{eqnarray*}$

where Σ_*(R₀) includes the contributions from ordinary stellar matter, stellar remnants, and brown dwarfs and Σ_disk(R₀) = Σ_ISM(R₀) + Σ_*(R₀). We further find that

$\begin{eqnarray*} M_{*} & = 4.6\pm 0.3\,(\mathrm{ran.})\,\pm 1.5\,(\mathrm{syst.})\,\times 10^{10}\,M_\odot,\\ M_{\mathrm{disk}} & = 5.3\pm 0.4\,(\mathrm{ran.})\,\pm 1.5\,(\mathrm{syst.})\,\times 10^{10}\,M_\odot,\\ M_{\mathrm{baryonic}} & = 5.9\pm 0.5\,(\mathrm{ran.})\,\pm 1.5\,(\mathrm{syst.})\,\times 10^{10}\,M_\odot . \end{eqnarray*}$

The systematic uncertainty is due to the uncertainty in R₀: increasing R₀ from our fiducial value of 8 kpc to 8.5 kpc increases the estimated masses by 1.5 × 10¹⁰ M_☉. The disk mass M_disk includes the mass of the stellar disk, M_*, as well as the mass of the ISM layer; M_baryonic is the total baryonic mass of the Milky Way, including the stellar and ISM disks and the bulge.

These direct dynamical measurements of the stellar disk's properties are in good agreement with measurements derived from star counts, leaving little room for dark matter in a disk-like configuration. With a scale length this short, the Milky Way's disk is maximal by the definition of Sackett (1997): we find that V_{c, *}/V_c = 0.83 ± 0.04 at 2.2 disk scale lengths.

This article's measurement of the disk mass will also be particularly valuable when measuring the dark halo's flattening from constraints on the potential at larger heights. For example, Koposov et al. (2010) measured the total potential flattening at ≈8 kpc from the plane from fitting an orbit to the cold GD-1 stream, but found that the uncertainty in the mass of the disk did not allow for this to be turned into an interesting constraint on the halo's flattening. Using our measurement of the disk mass, the GD-1 data indicate that the halo density flattening is ${\approx }0.7^{+0.3}_{-0.15}$ , but it is clear that a more rigorous combination of these measurements and further progress in the dynamical fitting of tidal streams (e.g., Sanders & Binney 2013) are necessary to robustly measure the halo's flattening.

These measurements of the disk's properties allow us to separate the contribution from the disk and the halo to the rotation curve, as shown in Figure 21. The halo does not contribute much to the Milky Way's rotation curve at R < 10 kpc. In turn this means that our constraints on the radial profile of the dark halo are relatively weak. Nevertheless, these are the first dynamical constraints on the radial distribution of dark matter at R < 10 kpc, and we find for a model ρ_DM(r; ≈R₀)∝1/r^α that α < 1.53 at 95% confidence (where ≈R₀ indicates that our measurement is based on dynamical data at 4 kpc ≲ R ≲ 9 kpc). Further measurements of the vertical mass distribution (e.g., Σ(R) at |Z| ≈ 2 kpc and |Z| ≈ 3 kpc) would significantly improve this constraint.

The dynamical modeling performed in this article is complex, computationally expensive, and currently limited to exploring only a few parameters of the gravitational potential. It provides a full generative modeling framework in which selection effects, observational uncertainties, nuisance parameters, outlier models, and other data issues can be naturally included using a likelihood-based approach. We have dealt with the computational complexity by focusing our efforts on the measurement of a single dynamical constraint derived from each dynamical subsample of stellar tracers (MAPs): the vertical force at 1.1 kpc. However, it is clear that the fitting procedure to each MAP can yield much more information than this. In particular, each subsample could be used to also measure the vertical profile of the vertical force within the SEGUE sample's spatial volume, which should lead to better constraints on the radial profile of the dark-matter halo. Ultimately, this is what is required for the optimal analysis of Gaia data. The analysis in this article has demonstrated 3D, three-action dynamical modeling of disk populations using observations of individual stars in the context of a real data set with all of the complexities of a nontrivial selection function and data uncertainties. But it is clear that further development of this technique, especially in more efficiently exploring the dynamical PDF, is necessary before the Gaia data arrive.

It is a pleasure to thank James Binney, Chris Flynn, Yuan-Sen Ting, Scott Tremaine, and the Oxford dynamics group for helpful comments and assistance. We thank the SEGUE team for their efforts in producing the SEGUE data set. Computations were run on the Institute for Advanced Study's School of Natural Sciences' Aurora cluster. J.B. was supported by NASA through Hubble Fellowship grant HST-HF-51285.01 from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Incorporated, under NASA contract NAS5-26555. J.B. and H.-W.R. acknowledge support from SFB 881 (A3) funded by the German Research Foundation DFG.

Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS website is http://www.sdss.org/.

APPENDIX A: CALCULATION OF THE EFFECTIVE SURVEY VOLUME

In this Appendix, we describe how we can efficiently calculate the eight-dimensional normalization integral (the effective survey volume) in the likelihood in Equation (25). This calculation is similar to that in the case of the density fits in BO12d as described in their Appendix B, and we start from their Equation (B4), which is rewritten to include the velocity integration

$\begin{eqnarray} &&\int {d}l\,{d}b\,{d}r\,{d}D\, {d}\mathbf {v}\,{d}(g-r) \,{d}[\mathrm{Fe/H}]\,\nonumber\\ &&\quad\times \lambda (l,b,D,\mathbf {v},r,g-r,[\mathrm{Fe/H}]|\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}})\nonumber\\ &&\qquad = A_p\,\sum _{\mathrm{plates}\ p} \sum _{D}\,D^2\,\nu _*(R,z|l,b,D,\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}})\,\nonumber\\ &&\qquad\quad\times \sum _{g-r} \,\sum _{[\mathrm{Fe/H}]}\,\rho ^c(g-r)\,\rho ^{[\mathrm{Fe/H}]}([\mathrm{Fe/H}])\,\nonumber\\ &&\qquad\quad\times S(p,r[g-r,[\mathrm{Fe/H}],D],g-r), \end{eqnarray} \tag{ A1 }$

where

$\begin{equation} \nu _*(R,z|l,b,D,\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}}) = \int {d}\mathbf {v}f(\mathbf {J}[\mathbf {x},\mathbf {v}])|l,b,D,\mathbf {p}_\Phi,\mathbf {p}_{\mathrm{DF}}), \end{equation} \tag{ A2 }$

is the spatial density predicted by the DF.

To evaluate the expression in Equation (A1) efficiently, we split the calculation in two steps: (1) we calculate the predicted spatial density in Equation (A2) on a grid in (R, Z) and interpolate it, and (2) we use the interpolated density to evaluate the sums in Equation (A1). We perform the velocity integration in Equation (A2) by using 20th order Gauss–Legendre integration in each direction: from 0 to 3V_c(R₀)/2 for the tangential velocity and from −4σ to 4σ for the radial and vertical velocity. "σ" is calculated using the scale dispersion profiles of the DF (see Equations (3)–(5)). We calculate the density on a grid of 16 × 16 points ranging from 4 to 15 kpc in R and from 0 to 4 R₀/5 in Z and interpolate using 3rd order two-dimensional spline interpolation.

APPENDIX B: DETAILED DATA VERSUS MODEL COMPARISONS

In this Appendix, we describe the results from detailed comparisons between the best-fit dynamical models for individual MAPs obtained using the methodology described in Section 3 and the SEGUE data. As in BO12d, we do this in a space close to that of the raw data: star counts uncorrected for selection effects. Even though our model is a full generative model that could predict the distribution of any combination of r, g − r, [Fe/H], x, v to be compared with the data, because we are mainly concerned with measuring Σ_1.1(R) we focus here on (1) vertical number counts and (2) the distribution of vertical velocities, both as a function of radius as these are the main data ingredients for constraining Σ_1.1(R). As described in Section 3, we fit individual MAPs that typically only have a few hundred stars, making visual comparisons between the data and the model highly susceptible to the influence of Poisson noise. For this reason and because showing comparisons for 43 different subsamples would take up the better part of an ApJ volume, we group MAPs into three main divisions with similar DFs. These are (1) α-old MAPs, i.e., all MAPs with [α/Fe] > 0.3 dex; these MAPs all have short radial scale lengths around 2 kpc, and they are vertically thick; and (2) α-young MAPs, i.e., those with [Fe/H] > −0.1 dex; these are vertically thin and have longer scale lengths. Finally, we consider MAPs intermediate between (1) and (2), which are all of the MAPs with [α/Fe] < 0.3 dex and [Fe/H] < −0.1 dex. We calculate the predicted number counts for each MAP within a subdivision separately and plot the sum of all predictions, weighted by the relative number of data points in different MAPs.

The subdivision of α-old MAPs is by far the largest of the three subdivisions, allowing us to closely investigate the goodness of the dynamical fit as a function of R. Figure 24 shows the comparison between the predicted number counts as a function of distance from the plane and the data, in 10 different radial bins that each contain about 1000 stars. The number counts for these α-old populations are reproduced excellently by our dynamical model to almost every single wiggle. As α-old MAPs primarily measure Σ_1.1(R) at R ≈ 6 kpc (see Figure 12), we also include the predictions from models with very different Σ_1.1(6 kpc) from the best-fit model. All three models shown predict almost the same number-counts profile.

**Figure 24.** Comparison between the vertical distribution of the data and the best-fit dynamical model for α-old G-dwarf MAPs. The α-old MAPs in this figure are all of those with [α/Fe] > 0.3; these all have best-fit radial scale lengths from BO12d around 2 kpc. The predictions are calculated for each MAP separately, based on each MAP's best-fit qDF and potential parameters, and they are combined to provide better statistics for the comparison (because each MAP only contains a few hundred data points). The comparison is shown for 10 different radial bins that each contain about 1000 data points. The dashed and dotted lines are alternative models that have Σ_1.1(6 kpc) = 192 M_☉ pc⁻² and Σ_1.1(6 kpc) = 108 M_☉ pc⁻², respectively, choosing the best-fitting qDF parameters for each MAP corresponding to those potentials. The vertical distribution of the data is excellently matched for all radial bins for all three models, indicating that the vertical density distribution of the data is so informative that it has to be matched by all potential models, regardless of whether the velocity dispersion is matched (see Figure 25). The best-fitting model is essentially that for which the best-fitting qDF parameters for the tracer density profile also provide the best-fit for the velocity distribution.
Download figure:
Standard image High-resolution image

The distribution of vertical velocities as a function of position for 10 different bins in distance from the midplane is shown in Figure 25. These distributions show excellent agreement between the data and the best-fit model over the full range of distances from the plane, from |Z| ≈ 500 pc to |Z| ≈ 4500 pc. The alternative models slightly overestimate or underestimate the width of the distribution, depending on whether their disk is too heavy or too light. From Figures 24 and 25 it is clear that the number-count measurements are the most informative, such that the dynamical fit works essentially as follows: (1) for a given potential, the qDF parameters are adjusted to match the observed number counts without regard as to whether the vertical velocities are fit well; and (2) the best-fitting potential is the one for which these qDF parameters also fit the vertical velocity distribution. This is essentially the same procedure that was used more explicitly by Kuijken & Gilmore (1989a).

Figures 26 and 27 show similar comparisons for the intermediate MAPs. These demonstrate that the vertical number counts are matched well, although with a slight overprediction of the scale height around R = 8 kpc. Heavier and lighter disks (with changes to Σ_1.1 around 7 kpc for intermediate MAPs) again predict almost the same vertical profile. Figure 27 shows that the distribution of vertical velocities as a function of height is excellently fit by the best-fit dynamical model; the predictions of the heavier and lighter disks are obviously ruled out for these populations.

**Figure 26.** Same as Figure 24, but for intermediate MAPs ([α/Fe] < −0.3 and [Fe/H] < −0.1). The alternative dashed and dotted models now correspond to Σ_1.1(7 kpc) = 138 M_☉ pc⁻² and Σ_1.1(7 kpc) = 65 M_☉ pc⁻².
Download figure:
Standard image High-resolution image

**Figure 27.** Same as Figure 25, but for intermediate MAPs. The alternative models are described in the caption of Figure 26.
Download figure:
Standard image High-resolution image

Finally, Figures 28 and 29 show data–model comparisons for the α-young populations, split only into two radial and vertical bins here because only about 2000 stars have [Fe/H] > −0.1 dex in our selection of the SEGUE G-type dwarfs. The comparison is therefore not as fine-grained as for the α-old and intermediate divisions, but the correspondence between the data and the model are similar as for the other divisions: the vertical number-count profiles are predicted to be similar for the best-fit potential and the heavier- and lighter-disk alternative models, and the distribution of the vertical velocities clearly prefers the best-fit model over the alternatives.

**Figure 28.** Same as Figure 24, but for α-young MAPs ([Fe/H] > −0.1). The alternative dashed and dotted models now correspond to Σ_1.1(8 kpc) = 87 M_☉ pc⁻² and Σ_1.1(8 kpc) = 45 M_☉ pc⁻².
Download figure:
Standard image High-resolution image

**Figure 29.** Same as Figure 25, but for α-young MAPs. The alternative models are described in the caption of Figure 28.
Download figure:
Standard image High-resolution image

The fact that both the vertical number counts and the distribution of vertical velocities is matched in detail by our three-action based qDF model proves that a description of these populations as dynamically relaxed populations makes sense. However, the detailed comparison of the vertical number-counts of the best-fit dynamical model and the data for all three subdivisions shows a slight underprediction of the number counts at |Z| ≈ 1 kpc and R ≈ 8 kpc, which is likely due to substructure that is not captured by our model. Clearly, investigation of the residuals from the best-fit dynamical models for MAPs in the future will be useful for determining the fine-grained orbital structure of MAPs, which may contain traces of dynamical or accreted substructures.

A DIRECT DYNAMICAL MEASUREMENT OF THE MILKY WAY'S DISK SURFACE DENSITY PROFILE, DISK SCALE LENGTH, AND DARK MATTER PROFILE AT 4 kpc ≲ R ≲ 9 kpc

Article metrics

Permissions

Author e-mails

Author affiliations

Author notes

Dates

ABSTRACT

1. INTRODUCTION

2. SDSS/SEGUE DATA

2.1. G-dwarf Data

3. ANALYSIS METHOD

3.1. Distribution Function Model

3.2. Milky Way Potential Model

3.3. Fitting Procedure

3.4. Further Implementation Details

4. RESULTS

4.1. Results for Individual MAPs and Determination of Σ_1.1(R)

4.2. The Surface Density at 1.1 kpc between 4 kpc and 9 kpc as Measured by MAPs

4.3. Variations on the Fiducial Model

5. MEASUREMENT OF THE DISK SCALE LENGTH AND CONSTRAINTS ON THE MASS PROFILE IN THE INNER MILKY WAY

5.1. Additional Data

5.2. Results

5.2.1. Constraints on the Galactic Disk Mass Distribution

5.2.2. Constraints on the Dark Matter Halo

5.2.3. Constraints on the Rotation Curve

6. DISCUSSION

6.1. First Dynamical Measurement of the Milky Way's Scale Length

6.2. The Mass of the Galactic Disk

6.3. Disk Maximality and the Milky Way Compared to External Galaxies

6.4. MAPs as Dynamically Phase-mixed Populations

7. CONCLUSIONS

APPENDIX A: CALCULATION OF THE EFFECTIVE SURVEY VOLUME

APPENDIX B: DETAILED DATA VERSUS MODEL COMPARISONS

Footnotes

A DIRECT DYNAMICAL MEASUREMENT OF THE MILKY WAY'S DISK SURFACE DENSITY PROFILE, DISK SCALE LENGTH, AND DARK MATTER PROFILE AT 4 kpc ≲ R ≲ 9 kpc

Article metrics

Permissions

Share this article

Author e-mails

Author affiliations

Author notes

Dates

ABSTRACT

1. INTRODUCTION

2. SDSS/SEGUE DATA

2.1. G-dwarf Data

3. ANALYSIS METHOD

3.1. Distribution Function Model

3.2. Milky Way Potential Model

3.3. Fitting Procedure

3.4. Further Implementation Details

4. RESULTS

4.1. Results for Individual MAPs and Determination of Σ1.1(R)

4.2. The Surface Density at 1.1 kpc between 4 kpc and 9 kpc as Measured by MAPs

4.3. Variations on the Fiducial Model

5. MEASUREMENT OF THE DISK SCALE LENGTH AND CONSTRAINTS ON THE MASS PROFILE IN THE INNER MILKY WAY

5.1. Additional Data

5.2. Results

5.2.1. Constraints on the Galactic Disk Mass Distribution

5.2.2. Constraints on the Dark Matter Halo

5.2.3. Constraints on the Rotation Curve

6. DISCUSSION

6.1. First Dynamical Measurement of the Milky Way's Scale Length

6.2. The Mass of the Galactic Disk

6.3. Disk Maximality and the Milky Way Compared to External Galaxies

6.4. MAPs as Dynamically Phase-mixed Populations

7. CONCLUSIONS

APPENDIX A: CALCULATION OF THE EFFECTIVE SURVEY VOLUME

APPENDIX B: DETAILED DATA VERSUS MODEL COMPARISONS

Footnotes

4.1. Results for Individual MAPs and Determination of Σ_1.1(R)