HSTPROMO Internal Proper Motion Kinematics of Dwarf Spheroidal Galaxies: I. Velocity Anisotropy and Dark Matter Cusp Slope of Draco

We analyze four epochs of HST imaging over 18 years for the Draco dwarf spheroidal galaxy. We measure precise proper motions (PMs) for hundreds of stars and combine these with existing line-of-sight (LOS) velocities. This provides the first radially-resolved 3D velocity dispersion profiles for any dwarf galaxy. These constrain the intrinsic velocity anisotropy and resolve the mass-anisotropy degeneracy. We solve the Jeans equations in oblate axisymmetric geometry to infer the mass profile. We find the velocity dispersion to be radially anisotropic along the symmetry axis and tangentially anisotropic in the equatorial plane, with a globally-averaged value $\overline{\beta_{\mathrm B}}=-0.20^{+ 0.28}_{- 0.53}$, (where $1 - \beta_{\mathrm B} \equiv \langle v_{\mathrm{ tan}}^2 \rangle / \langle v_{\mathrm{ rad}}^2 \rangle$ in 3D). The logarithmic dark matter (DM) density slope over the observed radial range, $\Gamma_{\mathrm{ dark}}$, is $-0.83^{+ 0.32}_{- 0.37}$, consistent with the inner cusp predicted in $\Lambda$CDM cosmology. As expected given Draco's low mass and ancient star formation history, it does not appear to have been dissolved by baryonic processes. We rule out cores larger than 487, 717, 942 pc at respective 1-, 2-, 3-$\sigma$ confidence, thus imposing important constraints on the self-interacting DM cross-section. Spherical models yield biased estimates for both the velocity anisotropy and the inferred slope. The circular velocity at our outermost data point (900 pc) is $24.19^{+ 6.31}_{- 2.97} \ \mathrm{km~s^{-1}}s$. We infer a dynamical distance of $75.37^{+ 4.73}_{- 4.00}$ kpc, and show that Draco has a modest LOS rotation, with $\left= 0.22 \pm 0.09$. Our results provide a new stringent test of the so-called `cusp-core' problem that can be readily extended to other dwarfs.


INTRODUCTION
Decades of astrophysical evidence support the notion that most of the matter in the Universe is dark.However, the nature of this dark matter (DM) remains a mystery.The most likely candidate is some form of cold Corresponding author: Eduardo Vitral evitral@stsci.eduDM (CDM), consisting of collisionless particles that cannot (yet) be detected directly, but that interact through gravity.
Some of the best systems to study DM are the "classical" dwarf spheroidal galaxies (dSphs) in the Milky Way (MW).They are strongly DM dominated (Pryor & Kormendy 1990), and have a large number of bright stars that can be resolved due to their proximity.The stars' motions contain information about the gravitational potential in which they move, and thus a large observational effort has been invested in obtaining their line-of-sight (LOS) velocities (v LOS , e.g.Tolstoy et al. 2004;Walker et al. 2007;Gilmore et al. 2022).Results from analyzing these data have been inconclusive about some CDM predictions.A conspicuous example of this is the so-called "cusp-core problem": the tension around the predicted and observed DM mass-density profiles of galaxies.CDM halos in collisionless cosmological Nbody simulations follow a nearly universal mass-density profile that increases and diverges toward the center, forming a 'cusp' (Navarro, Frenk, & White 1997).In contrast, observations of some dSphs favor shallower density profile slopes, consistent with a constant density 'core' at the center (e.g.Battaglia et al. 2008;Walker & Peñarrubia 2011;Amorisco & Evans 2012;Brownsberger & Randall 2021).
Various solutions have been proposed to explain this and other discrepancies.Some propose fundamental changes in the nature of DM, such as warm DM (WDM), e.g.sterile neutrinos and gravitinos, that predict lower central DM densities and cored profiles (Dalcanton & Hogan 2001), or self-interacting DM (SIDM) for which DM particles in the central region thermalize via collisions and thereby form a cored profile (e.g.Sameie et al. 2020).Others include the impact of baryons, which may transform cusps into cores by transferring energy and mass to the outer parts of the halos, e.g.via supernova feedback (Read & Gilmore 2005;Pontzen & Governato 2012;Brooks & Zolotov 2014), or star formation events (Read et al. 2018).Recent studies have also found that the orientation of a galaxy with respect to the viewer has a large impact on the derived velocity dispersion, resulting in a range of density slopes fitting the data (Genina et al. 2018).
Significant uncertainties are introduced by the fact that most observational studies are based solely on v LOS measurements, which constrain only one component of motion.Consequently, interpretations rely on substantial assumptions, in particular that v LOS is representative of the three-dimensional (3D) velocities. 1 These assumptions have been challenged by alternatives implying that the inferred, excessive dynamical mass-to-light ratios could be due to e.g.modified gravity (McGaugh & Wolf 2010) or out-of-equilibrium dynamics caused by tidal interaction with the MW (Klessen & Kroupa 1998;Hammer et al. 2018), although the latter is hard to explain for satellites on orbits having higher pericenter val-ues reported by Li et al. (2021), Battaglia et al. (2022) and Pace et al. (2022) from Gaia-based systemic proper motions.
Multiple techniques have been used to model v LOS dispersion (σ LOS ) profiles and, thus, constrain mass density profiles of dSphs.Examples include Jeans models (Walker et al. 2009;Zhu et al. 2016;Read & Steger 2017), distribution function (DF) fitting (Wilkinson et al. 2002;Vasiliev 2019), and Schwarzschild orbit superposition modeling (Breddels et al. 2013;Kowalczyk et al. 2019), each with their own strengths and weaknesses.However, all modeling techniques face the same problem: When only v LOS are used, there is a strong degeneracy between the mass density profile ρ(r) and the velocity anisotropy profile β(r), which quantifies differences in velocity dispersions in orthogonal directions (Binney & Tremaine 1987;Binney & Mamon 1982).Some models mitigate this degeneracy by restricting parameter space or using higher-order moments (Vasiliev 2019;Genina et al. 2020;Read et al. 2021), but having only the LOS component of motion fundamentally limits what can be achieved.
The key to progress is to measure the internal proper motion (PM) kinematics of stars.The radial and tangential PM components directly measure the projected velocity dispersion anisotropy, which, under assumptions of inclination and intrinsic shape, uniquely determines β(r) without requiring any dynamical modeling (e.g.van der Marel & Anderson 2010).This makes PMs crucial for dynamical modeling of dSphs, with models making use of PMs performing consistently better than those based solely on v LOS (Read et al. 2021).Different techniques can be used to measure internal PM kinematics, but all of them require combining two or more epochs of observations to determine PMs of individual stars.At typical distances of MW dSphs, the only feasible instruments currently available for measuring individual PMs are Gaia, the Hubble Space Telescope (HST), and JWST.
Gaia has been tremendously successful in revolutionizing our view of the MW and its satellites, but the relatively shallow limiting magnitude (G∼21 mag) and its large PM uncertainties for typical dSphs stars in the MW halo (e.g.Pace et al. 2022;Vitral 2021) hinder its use for a direct measurement of internal PM dispersions (Martínez-García et al. 2021).An alternative is to combine Gaia astrometry with that from another instrument (e.g.HST) to achieve longer time baselines, and thus lower PM uncertainties.This procedure has been applied in Massari et al. (2017Massari et al. ( , 2020) ) and del Pino et al. (2022).However, even with the Gaia-end-of-mission PM uncertainties reduced by a factor of 3, the number of stars available to measure PM dispersion profiles will always be confined to those near the tip of the red giant branch due to the limiting magnitude of Gaia.This is insufficient to discriminate between cusp and core models (Strigari et al. 2018;Guerra et al. 2023).
While the JWST time baseline is still too short (due to its recent launch) for a robust JWST vs. JWST PM computation, comparing positions of stars in images obtained with the same detectors onboard HST over time is the best means to obtain precise PMs of thousands of individual stars.HST is exquisitely well suited for astrometric and PM science, due to is stability, high spatial resolution, and well-determined point spread functions (PSFs) and geometric distortions.By combining two or more epochs of space-based imaging, it is possible to measure precise internal PMs in nearby stellar systems (e.g.Libralato et al. 2022).
In the current work, we combine 18 years of HST data, mostly obtained in the context of our HSTPROMO (High-Resolution Space Telescope PROper MOtion) Collaboration,2 to measure PMs of hundreds of stars in the Draco dSph.With this, we measure its internal PM dispersion profile for the first time and thus provide unprecedent constraints on its DM density slope.We describe the datasets we used in Section 2, we explain the methods used to analyze the data in Section 3, and we present our results in Section 4. We comment on the robustness of our findings in Section 5, and discuss and conclude our work in Sections 6 and 7, respectively.Throughout the paper, we use lower case r to denote 3D distances, and upper case R to denote 2D projected distances.

DRACO DATA AND GENERAL CHARACTERISTICS
The Draco dSph is an excellent candidate to test the predictions of CDM scenarios.Its star formation shut down long ago (∼10 Gyr; Aparicio et al. 2001), making it a prime candidate for hosting a 'pristine' DM cusp, unaffected by baryonic processes (Read et al. 2018).Interestingly, the stellar mass of Draco is clearly below the limit where stellar feedback, as implemented in current cosmological simulations, should still produce a core (Fitts et al. 2017). 3Furthermore, Draco is one of the most DM dominated satellites of the MW (Kleyna et al. 2002), and seemingly unaffected by Galactic tides that might heat its velocity dispersion profile (Odenkirchen et al. 2001;Ségall et al. 2007).
Recent efforts to infer the DM density of Draco from Jeans modeling of LOS velocities have yielded similar results: Read et al. (2018) fit rotation-less spherical Jeans models combined with higher-order LOS moments and report a DM density slope at 150 pc of −0.95 +0.50  −0.46 (95% intervals); Hayashi et al. (2020) applied rotation-less axisymmetric models to LOS data, and report a cusp with 'high probability' and a formal measurement of the asymptotic DM slope of −1.03 +0.14  −0.15 (68% intervals).Meanwhile, when formulating dynamical mass estimators based on PM dispersions, Lazar & Bullock (2020) found the then available data to be insufficient for the purpose of constraining the asymptotic DM slope.Below, we describe the main characteristics of the new datasets we employ, and how those are able to grasp the dynamical status of Draco in more detail and with better accuracy.

Center
The quoted center for Draco in the McConnachie (2012) catalog is the one from Wilson (1955), when the dSph was discovered.After that, more detailed sky surveys have allowed further refinement of this measurement.In particular, Odenkirchen et al. (2001) and Martin et al. (2008) used SDSS data to estimate the values quoted in Table 1, and Vitral (2021) computed its center from Gaia EDR3 data assuming a Plummer (Plummer 1911) spherical model.In this paper, we compute it again from Gaia EDR3, but using a more refined version of the Vitral (2021) algorithm, which allows for an elliptical Plummer distribution (see Appendix A for analytical expressions of density profiles).The overall parameterizations are thus similar to the ones reported in Vitral (2021), with the exception of the elliptical Plummer, which adds two extra free parameters to the fit: (i) a projected angle θ in the sky and (ii) the ellipticity implied by the minor-axis scale length of the projected ellipse.
Our fit was performed by a Monte Carlo Markov Chain (MCMC) routine that uses the software emcee (Foreman- Mackey et al. 2013).We selected the most probable values from the joint MCMC posterior chain as the parameters, and assigned uncertainties based on its difference to the 16 th − 84 th percentiles of the respective posterior distribution.Our best fit center, projected angle and projected ellipticity are listed in Table 1.Overall, the fits agree very well with the estimates from Wilson (1955); Odenkirchen et al. (2001) and Martin et al. (2008).

Surface Density Profile
Further on, we will set up not only axisymmetric Jeans models, but also spherical models to fit our dataset (see Section 3).For that purpose, it is of interest to know the best scale radius of the observed data, assuming a spherical density profile, so that we can set reasonable priors to our models.Following Massari et al. (2020); Hayashi et al. (2020), we assume a Plummer model.We derive the Plummer scale radius of Draco using Gaia EDR3 data, using the same formalism as in Vitral (2021), and with the (α 0 , δ 0 ) centers calculated in Section 2.1.1.
Figure 1 displays the goodness-of-fit of our spherical Plummer profile to the Gaia EDR3 data.This satisfactory agreement yields a 3D half-number radius of 10.4 +0.3  −0.4 arcmin, which lies between the values of 10.0 +0.3 −0.2 and 12.4 +1.6 −1.2 arcmin, estimated by Martin et al. (2008) and Odenkirchen et al. (2001) with SDSS data, respectively, for an exponential model, and a Sérsic model.

Inclination
Due to the elliptical projected shape of Draco, we choose to model it as an oblate spheroid with a flattening parameter (i.e.intrinsic axial ratio) q.This relates to the projected axial ratio of Draco, q p , through the equation (Binney & Tremaine 1987), where i is the inclination of the spheroid (see Section B.2 below, where an edge-on model is defined to have i = 90 • ).We derive q p from the ellipticity value in Table 1, which yields q p = 0.745 ± 0.051.The intrinsic axial ratio q is not known, but its probability distribution can be assumed to follow the general flattening probability distribution of oblate elliptical galaxies in the nearby Universe.This is given by equation ( 4) from Lambas, Maddox, & Loveday (1992), (2) with q ∈ [0, 0.9].This can be used to obtain a probability distribution function (PDF) for the inclination of Draco, as follows: 1. We draw N values4 of flattening from the PDF in Lambas et al. (1992), which we label q all .
2. We draw N inclination values according to i all = arccos (U), where U is the uniform distribution within the [0, 1] interval.This ensures that the inclinations are sampled uniformly on the surface of a unit sphere from face-on (i = 0 • ) to edge-on (i = 90 • ) cases.3.For each of those inclinations, we compute the respective projected flattening from Eq. ( 1), using the q all values previously drawn, and we label these as q p, all .
4. From those (q p, all , i all ) pairs, we keep the ones that satisfy |q p, all − q p, Draco | < 0.01.
The remaining pairs from the last step above yield the projection of the inclination PDF onto the observed projected axial ratio of Draco, which is depicted in Figure 2. The resulting [16,50,84]

Line-of-Sight Velocities
Draco has been the subject of many observational campaigns to obtain LOS velocities of samples of individual stars (e.g.Armandroff et al. 1995;Wilkinson et al. 2004;Walker et al. 2015).Recently, Walker et al. (2023) provided the most complete catalog of dwarf galaxy LOS kinematics, including also metallicities and stellar parameters.Here, we make use of this catalog to complement our PM dataset.In this Section, we study some of the main aspects of this LOS dataset, including its interloper contribution, the implied galaxy rotation, and the influence of binaries on the inferred kinematics.

Interloper cleaning
To best interpret our results based on LOS data, we need to remove interlopers (essentially, stars in the foreground and background).Hence, we perform a multidimensional mixture model to assign membership probabilities to each star in our subset, and then select it (or not) based on a threshold probability.
We first narrow our study to catalog stars that satisfy good obs == 1, as suggested in Walker et al. (2023,  section 5).This essentially removes stars having high v LOS uncertainties.Next, we select the parametrizations that we use to model the joint PDF of Draco stars (tracers) and interlopers, in each dimension of the data (i.e.v LOS , T eff , log g,5 [Fe/H], [Mg/Fe]).The tracer PDF of v LOS , log g, [Fe/H] and [Mg/Fe] were modeled as a Gaussian, while the tracer PDF of T eff was modeled as a double Gaussian.The interloper PDF of log g, [Fe/H] and [Mg/Fe] were modeled as a triple Gaussian, while the interloper PDF of v LOS and T eff were modeled as a log-Gauss and double Gaussian distributions, respectively.These choices of PDFs were done so as to maximize the goodness-of-fit.We fitted this multi-dimensional distribution through an MCMC routine on discrete data and considered the region between the 16 th − 84 th percentiles of each posterior distribution as the uncertainty on our fits, and the most probable values from the joint MCMC posterior chain as the best parameters.Figure 3 showcases the goodness of our fit, projected on the v LOS −[Fe/H] dimensions.From our fits, we assigned as Draco members the stars having a membership probability higher than 99%, which removes most of the stars beyond a little more than 3-σ LOS from the bulk v LOS .This final subset was composed of 435 stars with v LOS data and uncertainties smaller than the value of σ LOS .Our measured value for the bulk LOS motion, µ LOS ,6 is presented in Table 1.For this subset, we chose not to correct for perspective effects caused by Draco's bulk motion, since those have negligible effects.Indeed, the RMS correction for the sample stars implied by eq. ( 13) of van der Marel et al. ( 2002) is only 0.2 km s −1 , while the rotation and velocity dispersion profiles inferred further in this work change by at most 0.1 km s −1 , which is well below their respective measurement uncertainties.

Rotation
Like most galaxies (e.g.Martínez-García et al. 2023), it is possible that Draco possesses detectable mean rotation.We estimate here the rotation fraction of Draco, ⟨v/σ⟩, from its v LOS .
We partition the LOS data into six concentric annuli on the sky, all of which having the nearly the same amount N ∼ 50 of stars.We then perform a sinusoidal fit with free amplitude for each partition (i.e. to the re-spective v LOS vs. projected angle quantities), and free mean velocity and phase.The mean and phase 7 are forced to be the same for all annuli (thus, in total, 8 free parameters).The measured amplitude and σ LOS per annulus allow us to construct the rotation profile displayed in Figure 4, which has a mean ⟨v LOS /σ LOS ⟩ = 0.22±0.09averaged over all radii (listed in Table 1).For visualization, the blue line in the Figure displays the best fit using a parametrization of the form which increases linearly at small projected radii, and falls as a power-law at higher projected radii.
Our results lie between those found by Hargreaves et al. (1996) and Kleyna et al. (2002), who used less complete datasets than ours.They found a rotation amplitude of 0.7 km s −1 around Draco's minor axis and of 6 km s −1 at 30 arcmin, respectively.Meanwhile, our (v/σ) rot fit predicts v/σ = 0.4 at 30 arcmin, which translates to a rotation amplitude, v rot , of less than 4 km s −1 at this projected radius.Our fit also agrees well with Martínez-García et al. (2021), where internal rotation was confirmed using Gaia EDR3 PMs.From those results, we conclude that although Draco has some mean rotation, it is small when compared to the overall velocity dispersion, especially at inner radii where most of the data are concentrated (see Figure 1 for comparison).

Binaries
Per construction, measurements of v LOS from spectral lines are subject to Doppler shifts from unresolved binary motion.As a consequence, single epoch LOS measurements can carry an overestimated σ LOS and thus an overestimation of the system's total mass due to binary motion.Meanwhile, given the multi-epoch requirement of PM measurements, those end up averaging the motion of unresolved binaries to zero, such that the mentioned overestimation becomes negligible.For example, while Bianchini et al. (2016) showed that globular star clusters with unresolved binary fractions up to 7 The phase is taken with respect to the position angle of the minor axis (see column 5 from Table 1).We found that while the rotation curve was robustly constrained by the data, the exact angle of the rotation axis was not.The fits showed angle variations between annuli, and the best-fit angle also depended on the exact choice of annuli, both in excess of the formal uncertainties.While a kinematic axis intermediate between the major and minor photometric axis appeared formally preferred by the fits, we concluded after experimentation that an oblate model with rotation around the photometric minor axis was acceptable.In any case, the data do show a preference for one spin sign (i.e.receding relative velocities on the Western longitudes) rather than another.50% should introduce changes < 6% on the PM velocity dispersion, Pianta et al. (2022) recently performed simulations of dSphs to argue that one could reach much higher changes when using only LOS data.Such an undesirable effect can be almost completely corrected by obtaining multi-epoch LOS observations, as recently argued by Wang et al. (2023).Given Draco's high binary fraction of 50% (Spencer et al. 2018), we perform here a multi-epoch test to gauge the influence of unresolved binaries on our cleaned Draco LOS dataset.To do so, we plot the radial profile of σ LOS for groups of LOS data constructed from a different number of epochs.If unresolved binaries are to affect our LOS data as proposed by Pianta et al. (2022), then one should expect multi-epoch velocity dispersions to be considerably smaller than the ones computed from single epoch exposures.
Figure 5 displays our multi-epoch comparison, where velocity dispersion profiles are computed according to van der Marel & Anderson (2010, appendix A) and Vitral et al. (2023a, section 3.2.1).The number of stars having multi-epoch observations is scarcer, and thus we have fewer radial bins for those.In any case, all our multi-epoch radial dispersion profiles agree within 1-σ to the single-epoch measurement.Hence, Figure 5 reassures us that for the cleaned LOS subset of Walker et al. (2023), the effects of unresolved binaries in the velocity dispersion of Draco are within the statistical uncertainties.Finally, we revisit this conclusion in Section 5.1, where we compare our results for subsets with and without LOS data.

Observations and Astrometric Catalogs
For our new PM measurements of Draco stars, we used multi-epoch HST ACS/WFC imaging data.Descriptions about field locations and observations during earlier epochs for our target fields F1, and F2 are provided in Sohn et al. (2017).The field locations are also shown in Figure 6.In summary, the F1 field had three epochs of imaging data obtained in 2004, 2006, and 2013, while F2 had two epochs of imaging data obtained in 2004 and 2012.All fields were observed once again on October-November 2022 through our HST program GO-16737 (Sohn et al. 2021) using the same filter (F606W), telescope pointing, and orientation as in the previous epochs. 8In this latest epoch, we obtained 15 individual exposures with each exposure lasting 430 seconds for each field.
The data analysis largely followed the procedures described in Bellini et al. (2018) and Libralato et al. (2018).Here, we provide only a high-level outline of the PM derivation process and refer the reader to those papers for more details about the methodology.We downloaded the flat-fielded flt.fits images of all target fields for each epoch from the Mikulski Archive for Space Telescopes (MAST) and processed them using the hst1pass program (Anderson 2022) to derive a position and a flux for each star in each exposure.Instead of working on the flc.fits images that are corrected for charge transfer efficiency (CTE) losses, we utilized the table-based CTE correction option in hst1pass, which is an improved version of the ones used in previous works (Anderson 2022, Anderson in prep.).We applied corrections to the positions using the ACS/WFC geometric distortions based on Kozhurina-Platais et al. (2015); these were further extended to include time-dependent distortion variations beyond 2020 (V.Kozhurina-Platais, private communication).
For each field, we constructed a "master frame" using the average positions of stars from the repeated first-epoch exposures.The (X, Y ) axes of these master frames were aligned with (α, δ) by registering the stellar positions to the Gaia DR3 astrometric system.We aligned the positions of stars from the other epochs to these master frames using a six-parameter linear transformation, and determined average positions for each epoch.By construction, this procedure aligns the star fields between different epochs leading to zero PM on average for the Draco dSph stars themselves.This does not affect our results since we are mostly interested in measuring the internal velocity dispersion on the plane of the sky (see Section 2.3.3 below for more discussion of this topic).Uncertainties on the average positions were determined from the repeated measurements as the root mean square divided by the square-root of the number of exposures.In the end, for each field per epoch, we prepared a catalog that includes positions of stars measured as described above as well as average instrumental9 F606W and F814W magnitudes (from the 2012-2013-epoch data) output by hst1pass.

Photometric cleaning
Once our observations are reduced and we have the master frame (X, Y ) positions of sources at each epoch, for each field, we first perform a photometric cleaning of the data.The goal of this step is mainly (i) to remove interlopers, (ii) to remove background galaxies, and (iii) to remove stars associated with poor photometry that might bias our PM analysis.Points (i) and (ii -iii) are performed independently, and we further select stars that simultaneously survived both sets of cleaning cuts.Point (i) is accomplished by performing a cleaning on the color-magnitude diagram (CMD) of each field, at each epoch.We use a friends-of-friends procedure where we assign as an interloper a star whose distances to other stars in the CMD10 is greater than typical distances in the subset.To do so, we define, after inspection of the CMD-distance distribution at each epoch, fiducial distance-thresholds to use in this cleaning.Since this step is likely to remove bright stars on the tip of the red giant branch and the horizontal branch (this is because they do not have a high number of neighbors), we reintroduce them to the cleaned subset.They are likely dSph members and would be filtered in further steps if they are not.As an example, Figure 7 (upper panels) display the results of this CMD cleaning for the three epochs of Field 2.
Next, to address point (ii), we remove sources likely to be background galaxies, which lie on the upper side of the QFIT -F606W diagram, 11 departing from the bulk set of stars.This step is performed with a similar friends-of-friends analysis as for point (i), with different distance thresholds per field and per magnitude range.Finally, we proceed to point (iii) by removing stars that satisfy QFIT≥ 0.2, as they are associated with poor PSF fits.The final QFIT cleaning of our subset is displayed in the lower panels of Figure 7.

Local corrections
After having a photometrically-cleaned subset which is also devoid, at least to a large extent, of interlopers, we proceed to compute the PM of each star in our subset.Essentially, the raw PMs are computed by a leastsquares line fit of the master frame (X, Y ) positions as a function of the epoch time.We use the numpy.polyfitroutine from Python, assuming the (X, Y ) uncertain- 11 The QFIT parameter is a combined measure of goodness of fit and S/N (see Anderson et al. 2006;Libralato et al. 2014 for details).
ties calculated in Section 2.3.1, and no χ 2 re-scaling. 12We store the χ 2 of the fit for later data-cleaning.
The raw PMs may contain low-level systematic effects related to the CTE issues of HST's degrading chargecoupled devices (CCD), as well as from subtle variations in geometrical distortion between epochs.As a result, some regions of the observed fields may present systematically higher/lower PMs.This problem has been previously reported, for instance, in Bellini et al. (2014) and Libralato et al. (2022).We display this for our Field 1 in the left panels of Figure 8.The best procedure to correct for these effects is to perform a local PM correction that shifts those regions back to the bulk PM of the field.In practice, we follow the procedures laid out in previous works (e.g.Bellini et al. 2014;Libralato et al. 2022) that have constructed HST PMs by looping over each star and removing the median PM of a local net of the ten closest 13 stars.This process adds an extra layer of uncertainties (basically the uncertainty on the median, 14 which we add quadratically to the original PM uncertainty), but successfully renders the PM dataset more homogeneous.The right panels of Figure 8 show that is successfully removes most of the systematic effects.
This local correction step removes not only streaming artifacts from the data, but also any variations in mean streaming intrinsic to the galaxy.However, it preserves the local velocity dispersion, which is most critical to perform mass-modeling.We verified with axisymmetric rotating mock datasets that this step does not significantly change the second order velocity moment of the data (changes remain smaller than ∼ 5% for all possible inclinations).Moreoever, our axisymmetric model fitting in Section 4.2.2 below explicitly accounts for the fact that any mean streaming in the PM directions is not observationally constrained.

Sky coordinates
As explained in Section 2.3.1, our master frame (X, Y ) positions are already aligned, per construction, with 12 Essentially, the χ 2 quantity is defined as with N free = 2 (i.e. a line) and N being the number of epochs used in the fit. 13Here, 'closest' refers to (X, Y ) spatial positions.We verified that the systematics observed in Figure 8 pertained mostly to geometrical distortions (rather then CTE), where distances in magnitude space are not relevant and could instead bring farther away stars into the local net. 14The uncertainty on the median for a Gaussian distribution is given by (Kenney & Keeping 1963), where σ and n are the distribution standard deviation and number of samples, which we fix to ten.sky coordinates, by using Gaia reference frames.This means that our PMs computed in (X, Y ) directions are straightforwardly converted as µ α, * = −µ X and µ δ = µ Y , where we denote µ α, * ≡ µ α cos δ.
To convert (X, Y ) positions to (α, δ), we first perform a naive translation/rotation such that the coordinates match approximately the true ones.Next, we select brighter sources and associate them with Gaia EDR3 catalog sources.We perform a final translation/rotation to minimize the logarithm of the sum of distances between those matches, and apply the respective conversion parameters to our dataset.We verified that our matches are performed correctly by comparison to the Draco subset from del Pino et al. ( 2022).

Outliers and underestimated errors
After the conversion to sky coordinates, our PM subset is nearly ready for use.However, there might still be hidden interlopers in the data with unusually high PMs, or stars with underestimated errors that might bias our results.
A rapid test to probe the number of such stars is to fit the PM distribution with a Gaussian (in both radial and tangential directions), and to compare the fraction of stars beyond 3-σ to the fraction ∼ 0.27% predicted in this Gaussian.When performing this exercise, we observe that the fraction of stars in the wings of the distribution increases as we consider stars with higher PM uncertainties.This not only shows that we could be encompassing interlopers, but also points to the possibility of underestimated errors towards fainter stars.
When measuring the velocity dispersion of a subset (as explained in Section 2.3.6),we are actually fitting the quadratic sum of the intrinsic dispersion and the errors associated with the tracers.If the errors are underestimated, the intrinsic dispersion will be overestimated.Figure 9 shows the PM velocity dispersion (i.e.σ POS , where 'POS' stands for 'plane-of-sky') of stars with maximum PM uncertainty ϵ lim .The curves show that the fitted σ POS starts to increase as we include stars with errors beyond ϵ lim ≳ 0.024 mas yr −1 , roughly equal to the intrinsic velocity dispersion of the galaxy.Those are fainter/high-magnitude stars that likely have underestimated errors.Hence, for further analysis we removed all stars whose PM uncertainties exceed the threshold of ϵ lim = 0.024 mas yr −1 .In Section 5.3 we further test the impact of this choice.
Given the possible issues related to stars with large PMs, we decided to also impose a 3-σ cut on our PM sample.This can jointly remove unwanted interlopers and remaining stars with underestimated errors.Comparison of the solid and opaque lines in Figure 9 shows Blue curves relate to Field 1, while the red ones relate to Field 2. Opaque curves relate to the PM subsets without the 3-σ cleaning explained in Section 2.3.5, while the solid curves show results for the PM subsets with such cut.For reference, we display a x = y line in dashed green.This plot shows that it is important to impose a cut in PM errors at ϵ lim ∼ 0.024 mas yr −1 .Inclusion of stars with PM uncertainties in excess of the intrinsic galaxy dispersion yields an overestimated value for the galaxy dispersion.
that this does not strongly change the inferred σ POS .Nonetheless, the downside of any velocity cut is that it yields a slight underestimate of the true velocity dispersion (essentially, ∼ 99.73% of the true value for a 3-σ cut of a Gaussian).To assure that this does not bias our dynamical modeling, which depends in part on comparison of LOS and PM kinematics, we performed the same cut in our LOS dataset (on top of the previous membership probability cut).This did not significantly change the LOS dataset, which already had a cut within a few σ due to the larger fraction of interlopers.Our final dataset is the most complete and accurate PM catalog of a dSph to date, comprising 364 well measured stars.Comparatively, Figure 10 shows that it comprises nearly ten times more stars than in Massari et al. (2020, orange squares), twice more stars than in del Pino et al. (2022, blue squares), and it reaches much deeper magnitudes than both datasets could ever do given their necessity for Gaia measurements.Moreover, the uncertainties in our PM measurements are all below the local PM dispersion (see dashed gray line), compared to no such stars in both previous studies.This improvement is particularly important, because it is difficult to accurately constrain the PM dispersion of a galaxy based on individual PM measurements with uncertainties that do not resolve this dispersion (which is further compounded by known Gaia systematics, e.g.Fardal et al. 2021 andVasiliev &Baumgardt 2021).

Proper Motion Dispersion Profiles
Having constructed the PM catalog, we proceed to construct velocity dispersion profiles that will be used throughout the next sections.As in Section 2.2.3, all our computations of the dispersion σ of a given random variable follow the recipe presented in van der Marel & Anderson (2010), also recently employed in Vitral et al. (2023a).This consists of a maximum likelihood fit of a Gaussian distribution to the data, aiming to recover the respective standard deviation of the fit.The bias and uncertainty of such an estimate (e.g.Kenney & Keeping 1951) are corrected in a Monte Carlo sense, where we analyze numerous pseudo-data sets in the same fashion as the real data (see appendix A from van der Marel & Anderson 2010 for details).
For spherically symmetric models of Draco, the velocity dispersion profile can be written as a function of the projected distance to the galaxy's center, R. We thus create logarithmically-separated data bins in R whenever we need to visualize σ.In practice, our spherical modeling deals with discrete data (see Section 3.1 below), such that the bin choices we use to visualize our results do not actually matter for the fitting procedure.For the axisymmetric case however, σ will also depend on the projected angle ξ of the data bin, defined as the angle between a given point and the projected majoraxis of the galaxy.Besides, our fitting approach in this case is frequentist (see Section 3.2), such that the binning process requires more attention.
Our LOS kinematics are based on fits of all position angles along an annulus, with the rotation amplitude in Figure 4 pertaining to the value on the kinematic major  2023) for the leftmost panels, and from our HST program in the rightmost panels.Model predictions and the adopted galaxy distance (which we use to convert mas yr −1 to km s −1 in the rightmost panel) are from our axisymmetric JamPy MCMC fits for i = 57.1 • as discussed in Section 4.2.2 below.Our best fit (as defined in Section 3.2) is depicted as a black solid line, which we interpolate with respect to projected radius R from the actual data R values (this is done for visualization purposes, since there is also a dependence with the projected angle ξ).The percentile from our MCMC chains are color-coded as in the color bar scheme, in the right.
axis. 15 Instead, for the PMs we have measurements only for specific fields (see Figure 6) that span a small range of position angles.Therefore, whenever sampling velocity moments as explained in Section 3.2, these moments are averaged over all sky angles for the LOS, while we take the mean sky angle of each radial bin for the POS directions (namely, POSr for the plane-of-sky radial direction and POSt for the plane-of-sky tangential direction).We use the major axis (i.e.ξ = 0 • ) for comparison to the rotation amplitude.
The inferred velocity dispersion profiles in the three orthogonal directions are shown in Figure 11, together with the LOS rotation curve, all with similar x and y-scales. 16This provides, for the first time, radially-resolved 3D velocity dispersion profiles for any dwarf galaxy.Focusing on the observations, we note that the radial PM dispersion is considerably higher than the tangential PM dispersion.Averaged over all radii probed, ⟨σ POSt ⟩ / ⟨σ POSr ⟩ = 0.80 ± 0.08.The ratio of the LOS dispersion to the PM dispersion is somewhat closer to unity, ⟨σ LOS ⟩ / ⟨σ POS ⟩ = 1.08 ± 0.09, where σ POS represents an average over both PM directions.The first ratio is independent of galaxy distance, while the second is inversely proportional to it.
The tight observational constraints on ratios like these enable dynamical models of the kinds discussed in Section 3 below to strongly constrain the structure of Draco.To understand why, consider first the ratio σ POSt /σ POSr , which is a measure of the projected velocity dispersion anisotropy in the plane of the sky.In spherical geometry, Leonard & Merritt (1989) and van der Marel & Anderson (2010) both showed that there is a direct relation between this projected anisotropy and the intrinsic three-dimensional velocity dispersion anisotropy.In Appendix B.3 we use scale-free dynamical models of the type discussed in de Bruijne, van der Marel, & de Zeeuw (1996) to show that the same is expected to hold in axisymmetric geometry.The details of the relation depend on quantities that are constrained by observational data, such as the projected axial ratio of the system, the position angle of the tracers on the sky, the radial profiles of the luminous and dark matter densities, the viewing inclination of the galaxy, etc.But in essence, σ POSt /σ POSr is a diluted measure (i.e.brought closer to unity due to projection effects) of the intrinsic 3D ratio σ tan /σ r (see Figure 16 in the appendix).Thus, the observed ⟨σ POSt ⟩ / ⟨σ POSr ⟩ implies the presence of radial velocity dispersion anisotropy in Draco.With suitable dynamical modeling, quantitative constraints are obtained on the shape of the 3D velocity dispersion tensor.This then breaks the mass-anisotropy degeneracy that plagues modeling of LOS velocities alone (Binney & Mamon 1982), so that the DM density profile can be determined.And with the 3D anisotropy known, the ratio σ LOS /σ POS allows a kinematic determination of the galaxy distance (as done previously for globular clusters, e.g.Watkins et al. 2015a).

METHODS
Various techniques have been used to model the velocity dispersion profiles of dSphs and, thus, constrain their mass density profiles (see introduction).In this work, we employ multiple techniques to exploit our dataset.This helps us to understand any modeling uncertainties, and makes best use of the different codes available in the literature.We summarize them below.

Spherical Jeans modeling: MAMPOSSt-PM
Although Draco, like many other dSphs, is a flattened system (e.g.Table 1 and Figure 6), previous studies have, in general, considered spherical models to fit its internal kinematics (e.g.Read et al. 2018;Massari et al. 2020).Hence, it is useful to perform a similar kind of modeling if one wants to better interpret and compare previous results that assumed sphericity.
We perform spherical mass modeling with the Bayesian code MAMPOSSt-PM (Mamon & Vitral in prep.), which is an extension of MAMPOSSt (Mamon et al. 2013) to handle PMs in addition to line-of-sight velocities.MAMPOSSt-PM is briefly described in section 2 of Vitral & Mamon (2021), and was tested by Read et al. (2021), who showed that MAMPOSSt-PM reproduced well the radial profiles of mass density and velocity anisotropy of mock dwarf spheroidal galaxies.MAMPOSSt-PM is also a faster code than its massmodeling counterparts (see Table 2 from Read et al. 2021), which allows us to probe a wide range of dynamical models in less time, which is useful when defining priors and fitting boundaries (see Section 3.2 below).

General formalism of MAMPOSSt-PM
MAMPOSSt-PM fits models for the radial profiles of total mass and the velocity anisotropy of the visible stars to the distribution of these stars in projected phase space.The local velocity ellipsoid is assumed to be an anisotropic Gaussian, whose axes are aligned with the spherical coordinates.The sizes of the axes are obtained by solving the spherical Jeans equation (Binney 1980) assuming a given mass profile M (r) and velocity anisotropy profile β B (r), for a previously determined mass density profile ν(r) for the kinematic tracers (here stars).The term ν σ 2 r is the dynamical pressure that counteracts gravity.17The (Binney 1980) velocity anisotropy ('anisotropy' for short) is defined as: where the v 2 i are the second-order velocity moments in spherical polar coordinates.In both spherical and axisymmetric geometry, the first moments ⟨v r ⟩ = ⟨v θ ⟩ = 0, so that the corresponding velocity dispersions satisfy The first azimuthal moment ⟨v ϕ ⟩ need not generally be zero, so that in general v 2 ϕ = σ 2 ϕ + ⟨v ϕ ⟩ 2 .
Our spherical models are constructed to have ⟨v ϕ ⟩ = 0, but in the axisymmetric models that we present later we do allow for the possibility of mean rotation.
In MAMPOSSt-PM, the likelihood is written where the conditional probability of measuring a velocity v i at projected radius R i is the mean of the local ve- (7) MAMPOSSt-PM determines the marginal distributions of the free parameters and their covariances by running the MCMC routine CosmoMC18 (Lewis & Bridle 2002).
In practice, we use 6 MCMC chains run in parallel and stop the exploration of parameter space after one of the chains reaches a number of steps N steps = 10 000 N free , where N free is the number of free parameters of the model.We discard the first 3000 N free steps of each MCMC chain, which are associated with a burn-in phase.From the resulting chain values, we assign uncertainties to our best likelihood parameters using the 16th and 84th percentiles of the respective posterior distribution.If the fit lay below (above) those boundaries, we extended the uncertainty down to (up to) the minimum (maximum) chain value.

Parametrizations & Priors of MAMPOSSt-PM
MAMPOSSt-PM is a parametric code that fits discrete data.The motivations for our choices of parametrization are described further below.Our choice of priors on the other hand is performed to maximize the entropy of the posterior probability distribution.This can be done by assigning flat priors whenever we assume no previous knowledge on a specific parameter, or Gaussian priors whenever we trust a previous measurement, from a different dataset, with reported mean and uncertainty.
The anisotropic runs of MAMPOSSt-PM use the generalization (hereafter gOM) of the Osipkov-Merritt model (Osipkov 1979;Merritt 1985, Eq. [8a]) or the generalization (hereafter gTiret) of the (Tiret et al. 2007, Eq. [8b]) model for the velocity anisotropy profile: where r β is the anisotropy radius.We fit β 0 and β ∞ using flat priors, from −1.99 to 1.99 to the symmetrized quantity19 β sym = β/(1 − β/2), while fixing r β to the scale radius of the luminous tracers.20 Notice that a constant-anisotropy case is obtained by fixing β ∞ = β 0 .The mass density of the luminous tracers is chosen to be a Plummer model, similarly to what was adopted in Massari et al. (2020) and Hayashi et al. (2020), and also supported by our previous fits of Gaia ERD3 data (see Fig 1 in Section 2.1.2).We fit the Plummer r −2 radius21 with Gaussian priors, using the mean and uncertainty from our fits.The total luminous mass of Draco, M ⋆ , was estimated by Martin et al. (2008) from its CMD, by assuming either a Kroupa et al. (1993) or a Salpeter (1955) initial mass function (IMF).We estimate the mean and variance of log M ⋆ from both of those values, and use it as a Gaussian prior, which encompasses both estimates within 1-σ.
We test numerous parametrizations for the DM density profile, including: • A generalized Plummer model, which is a special case of the αβγ model by Zhao (1996), with α = 2 and β = 5.
• The Kazantzidis et al. (2004b) model, which is motivated from N -body simulations of tidally stripped cuspy DM halos.
• The generalized NFW profile, motivated from cosmological simulations by Navarro et al. (1997).
• The Einasto (1965) profile, which was used recently by Jiao et al. (2023) to fit the MW DM halo.
These density models are all listed in Appendix A, and depend on three quantities: a scale radius,22 a total DM mass (M dark , or M −2 for the generalized NFW model), and an inner slope γ (n index for the Einasto model).We assume flat priors for all these variables: Finally, we set Gaussian priors for the bulk v LOS of Draco, using our estimate depicted in Table 1, and we also set Gaussian priors for the distance modulus, defined as µ 0 = 5 log (D/[kpc]) + 10.The mean and uncertainty on the distance modulus are derived by propagating the value and respective uncertainty on the RR Lyrae estimate from Bonanos et al. (2004), which yields µ 0 = 19.398± 0.156.

Axisymmetric Jeans modeling: JamPy
To model our dataset under the assumption of an oblate axisymmetric galaxy, we use the publicly available code JamPy (Cappellari 2008(Cappellari , 2020)), tailored to the analysis of axisymmetric systems.This software was shown to reproduce well the dynamics of mock oblate dSphs with rotation (Sedain & Kacharov 2023), and has been applied in Zhu et al. (2024) to recover DM structural parameters of thousands of galaxies.

General formalism of JamPy
We use the version of JamPy in which the velocity ellipsoid is aligned with spherical coordinates, given that we assume a spherical global potential, which would be only minimally altered by Draco's luminous axisymmetric component.This configuration of JamPy considers the Jeans equations for rotating oblate systems (e.g.Bacon, Simien, & Monnet 1983) where Φ is the gravitational potential, the symbol ⟨.⟩ indicates the distribution function-averaged quantity, and finally, β J is defined as where the second equality assumes that ⟨v θ ⟩ = ⟨v r ⟩ = 0, as in MAMPOSSt-PM.Because of symmetry and continuity, axisymmetric models always have ⟨v ϕ ⟩ = 0 and v 2 ϕ = v 2 θ along the symmetric axis.Hence, along the symmetry axis, β as defined by equation 5 equals β J .Models with β J = 0 yield the same predicted second velocity moments as models in which the distribution function f (E, Lz) does not depend on a third integral.Such models have been widely used for fitting data of axisymmetric systems (e.g.van der Marel 1991).Away from the symmetry axis, models with β J = 0 do not have an isotropic velocity dispersion tensor.
Beyond the non-sphericity, another main difference between MAMPOSSt-PM and JamPy is that the latter allows us to model Draco's rotation, by not imposing ⟨v ϕ ⟩ = 0 throughout the whole system (this equality was also imposed in many previous analyses such as Read et al. 2018;Hayashi et al. 2020;Massari et al. 2020).The first moment in the ϕ direction relates to the second order moment in the radial direction through where the rotation parameter Ω is introduced.This parameter was named γ in Cappellari ( 2020), but we change this notation to avoid confusion with the inner slope of the DM mass density, which uses the same symbol.
From those equations, JamPy samples projected velocity moments that we use to compute the respective quantities in the LOS and POS directions.To fit our dataset, we employ an MCMC chain using the emcee routine that minimizes the χ 2 , defined as where χ 2 x = (x data − x model ) 2 i /ϵ 2 x, i .In Eq. ( 12), the four χ 2 terms pertain to the LOS, POSr and POSt velocity dispersions at a given (R, ξ)24 point, while the last term pertains to the first order moment of the LOS velocity on the major axis. 25e then maximize the log-probability of our dataset with the set of parameters Θ, defined as ln Pr{Θ} = −χ 2 /2, along with respective priors defined further below in Section 3.2.2.Our MCMC routine sets a maximum of 10 000 iterations per fit, which we run in parallel in 64 CPUs.In those configurations, each run takes ∼ 3 days to complete, and we perform it for a different set of possible inclinations from the PDF derived in Section 2.1.3.We discard the burn-in phase by removing the first 3 000 steps of the chain and visually checking that the chains remain stable further on.

Parametrizations & Priors of JamPy
As mentioned above, the timescales to run converging JamPy models are drastically longer than respective MAMPOSSt-PM runs, which can be explained by both the software languages employed in each code (Python vs. Fortran, respectively), as well as the choice of parametrizations: While MAMPOSSt-PM uses analytical parametrizations for a set of different models, JamPy assumes Multi-Gaussian Expansions (MGE) to model both the potential and the stellar distribution, which allows for more general density profiles at the expense of more time.
Therefore, we use our results from the spherical Jeans modeling to assist our fitting choices with JamPy.For instance, since we observed no significant preference for a particular DM density parametrization in our spherical modeling results (see Section 4.1), we here decide to use only the generalized Plummer profile, as its analytical expressions are more easily handled when building MGEs in Python.In the absence of external constraints on the geometrical shape of Draco's DM halo, we continue to assume that it is spherical, 26 even when the luminous density is chosen to be axisymmetric (so as to fit the observed projected shape of Draco).Although we use the same priors as MAMPOSSt-PM for the DM density parameters 27 and Draco's distance, we fix the stellar density parameters, 28 as we observed no departure from the mean MAMPOSSt-PM Gaussian priors.In addition, we assume the β J parameter to be a constant, 29 since we show in Section 4 that the data does not prefer more general profiles such as the Osipkov-Merrit generalization of Eq. (8a).In-  (2014).For robustness purposes, we also ran a test with a gOMlike parametrization for β J and observed no departure from the constant case.
deed, because MAMPOSSt-PM assumes no rotation and spherical symmetry, such that σ θ = σ ϕ , its velocity anisotropy parameter is equal to JamPy's β J .Finally, we observed that when assuming a spatially constant rotation parameter Ω, we could not fit well enough Draco's observed rotation curve.Hence, we assume the more general behavior We fit (Ω 0 , Ω ∞ ) by assuming flat priors from −1.99 to 1.99 to the symmetrized quantity Ω sym = Ω/(1 − Ω/2), while fixing r Ω to the luminous scale radius.
As a consistency check that the fitting strategy of the different Jeans modeling algorithms we use do not strongly diverge from each other, we compared two constant-anisotropy runs from MAMPOSSt-PM and JamPy for a spherical geometry, and confirmed that both the inferred velocity anisotropy and the DM density slope differ by much less than their respective 1-σ uncertainties. 30

Practical Quantities
Given the choices of different DM parametrizations to compare with, and the fact that our data is not complete at all radii, we define here practical quantities to help us interpret our results.For example, the total DM mass (or M −2 for NFW profiles) is not a well-constrained quantity, since our data does not really allow us to constrain in detail the shape of the DM density at large radii.Hence, a more suitable parameter to display and use for comparison purposes is the DM mass up to a fiducial radius.We do so, by displaying further in Tables 2 and 3 the variable M dark (r = R max ) -i.e. the total mass of DM up to the maximum projected radius in our LOS+PM dataset, namely R max = 900 pc.To aid in the interpretation of this quantity we also compute, at this same radius, the circular velocity which depends on the total 31 cumulative mass up to a certain radius r, and on the gravitational constant G.
Similarly, due to the restricted spatial extent of our data, the inner dark matter slope parameter γ dark , or the respective Einasto index n dark , may not reflect accurately our fits and uncertainties of the DM slope where 30 Precisely, we measure ∆β = 0.04 and ∆γ = 0.06, while the uncertainty on each parameter for the spherical case is of the order of ∼ 0.15 and ∼ 1, respectively. 31The total mass is a sum of the luminous and dark components.
we are actually able to constrain it -i.e.where we have both PM and LOS data.We therefore define an effective density slope parameter as where ρ(r) is the DM density.The r min variable is defined as the minimum projected radius in the data where PM information is available.We define r max = min(R max,PM , r ΛCDM /3), where R max,PM is the maximum projected radius in the data where PM information is available and r ΛCDM is the scale radius of a NFW profile as expected for DM halos in ΛCDM simulations 32 assuming low-mass stellar components (Read et al. 2017).
In practice, the radial limits over which we average the logarithmic DM density slope are r min = 42 pc and r max = 297 pc, which translate to roughly r min = 1.9 arcmin and r max = 22.5 arcmin.If one considers the scale radius r ΛCDM , the conversion between Γ dark and γ dark in our PM radial range is such that cusp (γ dark = −1) and cored (γ dark = 0) values translate to Γ dark = −1.2 and Γ dark = −0.38,respectively, for a generalized NFW profile.For a generalized Plummer profile such as used in our axisymmetric fits, the respective numbers are Γ dark = −1.07 and Γ dark = −0.14 for cusp and cored models, thus providing a more subtle difference.

Statistical tools
We employ Bayesian evidence to compare our different MAMPOSSt-PM mass-anisotropy models and correct for over-and under-fitting.This model selection involves comparing the maximum log posteriors using Bayesian information criteria.We use the corrected Akaike Information Criterion (derived by Sugiura 1978 and independently by Hurvich & Tsai 1989 who demonstrated its utility for a wide range of models) where AIC is the original Akaike Information Criterion (Akaike 1973) 32 We choose this value as reference because we wish to compare our observables to what is predicted from theory, while the (1/3) factor is added with the intent of removing the part of the predicted density profile that has a cuspier drop due to the transition from the inner to the outer density profile.
and where L MLE is the maximum likelihood estimate found when exploring the parameter space, N free is the number of free parameters, and N data the number of data points.We prefer AICc to the other popular simple Bayesian evidence model, the Bayes Information Criterion (BIC, Schwarz 1978), because AICc is more robust for situations where the true model is not among the tested ones (for example our choice of a Plummer density profile for the stellar component is purely empirical and not theoretically motivated), in contrast with BIC (Burnham & Anderson 2002).
The likelihood (given the data) of one model relative to a reference one is (Akaike 1983 and we use it to infer likelihood probabilities.In general ∆AICc differences ≳ 4 (i.e. a confidence level ≳ 85%, according to eq. [18] above) are required to prefer one model over another.It is also important to mention that such diagnostics are purely statistical, and do not account for intrinsic astrophysical phenomena that might favor or disfavor a particular model.

Spherical modeling
We first present the results of spherical Jeans modeling with MAMPOSSt-PM.Key outcomes are listed in Table 2. Our velocity dispersion goodness-of-fits for the spherical case were very similar to the ones presented in Figure 11 for the case of axisymmetric models, so we do not show the spherical fits separately (caveat: our spherical modeling neglects rotation, so the spherical model predictions in the lower left panel are zero).Below, we detail our results.

Dark Matter density parametrization
The first four lines of Table 2 address the comparison between the four DM density parametrizations we use: Kazantzidis et al. (2004b) listed as GKAZ, Einasto (1965) listed as EIN, generalized Plummer (1911) with free inner slope listed as GPLU, and, finally, the generalized Navarro et al. (1997) with free inner slope listed as GNFW.We refer the reader back to Section 3.1.2for the motivations of each parametrization and now focus on the practical fitting results.
A comparison of these models' AICc yields a modest preference for the GKAZ profile, followed by EIN, GPLU and GNFW.However, the AICc differences reach, at most, 0.34 between the GKAZ and NFW profiles, which translates to the GKAZ model being ∼ 1.2 times more likely than GNFW (from Eq. [18]).This is definitely not enough to robustly distinguish those models, meaning that they all fit the data equally well.This is true even though the GPLU yields a slightly more cuspy DM profile, but not significantly so given the high uncertainties associated with this parameter when using spherical models.Therefore, given the analytical simplicity of the GPLU profile and its physical meaning, 33 we choose to use this model for further tests, as well as when modeling Draco with JamPy further on.

Velocity anisotropy parametrization
Models 5 and 6 from Table 2 display our tests with different velocity anisotropy parametrizations, specifically the gOM and gTiret generalizations with free inner and outer anisotropy values.The results show that although the inner velocity anisotropy tends to agree with the constant anisotropy case (i.e.preferring radial anisotropy), the velocity anisotropy at infinity prefers a more tangential behavior.Nevertheless, the uncertainties associated with this parameter are extremely large and basically 33 Models such as GNFW do not have a finite mass, while EIN models are not able to reproduce centrally decreasing density profiles.
encompass very radially anisotropic cases as well.Since the anisotropy at large radii is not well constrained by the data, the improvement in the fit is not enough for us to actually prefer more general models such as this (i.e.AICc is higher).Hence we use the constant anisotropy case here and, when performing axisymmetric modeling, we use it as our standard model.More general anisotropy parameterizations could over-fit the data.

Cusp vs. Core under the spherical assumption
One of the main goals of our dynamical modeling is to constrain the DM slope of Draco.From our fits that leave this slope as a free variable, we observe a general preference midway between a cored and a cuspy profile where the PM data is present (i.e.column 12).A different test is then to directly compare two mass models, one with a fixed inner cusp (i.e.γ = −1) and another with a fixed core (i.e.γ = 0).We perform those runs and display them in Table 2 as models 7 and 8.
The cusped and cored models prefer velocity anisotropies that are consistent with each other at the 1σ levels, with cuspy models preferring slightly lower values.The AICc comparison between these two models shows a significant preference for the cored case, which is ∼ 6 times more likely than the cuspy counterpart, or equivalently, rules out the latter with nearly 85% probability (from Eq. [18]).This is opposite to what was found by the spherical modeling from Massari et al. (2020), although the much better completeness and accuracy of our dataset (cf. Figure 10) can easily account for such differences.In the next section, we move on to test this result under the more suitable geometric assumption of axisymmetry.

Velocity anisotropy under the spherical assumption
Along with the measurement of the DM slope, our study is the first to consistently constrain the velocity anisotropy of Draco.As discussed in Section 2.3.6, the projected anisotropy is directly measured by the data, independently of any model assumption.However, to recover the intrinsic 3D value of β, assumptions are required; here, we analyze this problem under the consideration of spherical symmetry.
Throughout all models in Table 2, the velocity anisotropy β B remains within the 1-σ range of ∼ 0.2 − 0.6.This means that under spherical assumptions, our data consistently constrain the orbital shapes in Draco to be radially anisotropic.Previous PM studies of Draco (Massari et al. 2020;del Pino et al. 2022) hint towards a similar behavior, although their error bars are too high to discard tangential orbits at a 1-σ level in the former work.Given the unprecedentedly low PM uncertainties of our dataset, we are able to rule out negative values of β B in model 3 with 98.5% confidence, if one assumes sphericity.

Axisymmetric modeling
Having explored spherical models, we now move on to more realistic axisymmetric models.Because JamPy is much slower than the equivalent MAMPOSSt-PM MCMC routines, we are forced here to reduce our set of dynamical models.We thus use the priors and assumptions specified in Section 3.2.2.Specifically, informed by the results of the spherical models, we focus on axisymmetric luminous models with a Plummer density distribution and constant β J as function of radius, embedded in a spherical dark halo with a generalized Plummer profile.

Inclination dependence
We ran JamPy over a set of eleven inclinations linearly spaced from 43 • to 90 • (edge-on), which encompasses Draco's inclination PDF computed in Section 2.1.3,and display the results in Table 3.In addition to the parameter β J defined by Eq. ( 10), we also list the globally-averaged Binney anisotropy β B .The latter was obtained by first calculating the mass-weighted second velocity moments averaged over the entire system, and then substituting those into Eq.( 5).
Overall, the results for Ω ∞ and r dark slightly increase with inclination, while other parameters tend to decrease with i.In particular, models that are closer to edge-on prefer more cuspy DM profiles, lower velocity anisotropies β J (still positive, but closer to β J = 0) and lower dynamical distances.
In spite of these correlations, we can provide a global estimate and uncertainty for every model parameter, integrated over the inclination distribution that we calculated for Draco in Section 2.1.3.That is, for a parameter Θ, we apply where p(Θ) is the final posterior distribution integrated over all inclinations, p(Θ|i) is the posterior distribution obtained from the MCMC chain, and p(i)di is the probability of falling into a certain inclination range, which we compute empirically from the distribution computed in Section 2.1.3.In practice, the integral above is a discrete sum over the inclination ranges probed by the eleven values we calculated.We use the same procedure to obtain the most probable value of every parameter, i.e. we average the results in Table 3 weighted by the respective inclination probabilities.Finally, we define the confidence regions as the 16th and 84th percentiles of the numerically computed p(Θ).

Data-Model Comparison
The parameter estimates thus marginalized over all inclinations are listed in the last row of Table 3.They resemble the estimates for the cases i = 52.4• and i = 57.1 • .We choose the case of i = 57.1 • as a canonical model for display, as it represents a value between the median (56.2 • ) and the mean (58.9 • ) of the inclination PDF.In Figure 11 we display, for this inclination, the data-model comparison for the three velocity dispersion components, and the LOS rotation velocity amplitude.Figure 12 shows the dark matter density profile we estimate over the range where we calculate its slope, as well as the circular velocity over the entire data range.
Figure 11 shows a very satisfactory fit from JamPy, which helps to strengthen both our parametrization choices and our conclusions further on.Figure 12 shows that both the circular velocity and DM cusp (down to ∼ 100 pc) are well constrained by our models.
We verified that our results do not depend sensitively on the adopted parameterization for ρ ⋆ .For this we ran a comparison i = 57.1 • model in which the stellar (10) Globally-averaged βB velocity anisotropy, as defined in Eq. ( 5); (11) Circular velocity at maximum projected data radius, in km s −1 ; The uncertainties are based on the 16th and 84th percentiles of the marginal distributions, unless the maximum likelihood solution was outside that boundary, in which case the uncertainties are based on the minimum or maximum value of the MCMC chain.The last row displays the integrated estimate for each parameter, averaged over the inclination probability distribution of Draco, as described in Section 4.2.1.The baryonic mass and respective Plummer major axis are fixed in all cases to 4.7 × 10 5 M⊙ and 9.1 arcmin.

Inferred Quantities
We display in Figure 13 the posterior distributions for and correlation between the parameters Θ = {Γ dark , β B }, marginalized over all inclinations as described in Section 4.2.1.The Figure shows that models with a core generally require a more radial velocity anisotropy to fit the data, consistent with what has been found in other contexts (e.g. Figure 3 in van der Marel et al. 2000).Our averaged slope estimate, Γ dark = −0.83+0.32  −0.37 , is consistent with a classic ΛCDM slope, even though the posterior distribution we derive has a long tail towards larger (including even positive) Γ dark values.The globally-averaged Binney β B is well below the parameter β J , defined by Eq. ( 10) for all in-clinations.This is because β B depends on v 2 ϕ , while β J does not.Axisymmetric models generally have v 2 ϕ increasing from the symmetry axis towards the equatorial plane (see Appendix B.4).So while β B = β J on the symmetry axis, instead β B ≤ β J in the equatorial plane.While Table 3 shows that our best-fit models have β J positive and increasing with decreasing inclination, the inferred value of β B depends less on inclination to within the statistical uncertainties.The overall anisotropy marginalized over inclinations is β B = −0.20 +0.28  −0.53 .So our best-fit models are radially anisotropic on the symmetry axis, and tangentially anisotropic in the equatorial plane.When integrated over the entire meridional (R, z) plane they are tangentially anisotropic, but still statistically consistent with isotropy.
Similarly, we plot in Figure 14 an equivalent case for the parameters Θ = {γ dark , log r dark }, with r dark in pc.As expected from our conversions between Γ dark and γ dark in Section 3.3, one has a remarkable agreement between the peak of γ dark 's PDF and a classic ΛCDM slope.Besides, our uncertainties on γ dark are consistent with what is expected from PM datasets having a similar number of stars as ours, as argued in Guerra et al. (2023).More importantly, this figure allows us to probe the core radius that our 3D data is able to constrain: The upper panel shows the dark matter density profiles from our JamPy MCMC chains for i = 57.1 • , color-coded according to their respective percentiles.We display the most probable solution (as defined in Section 3.2) as a black line.The lower panel shows the circular velocity curves, defined by Eq. ( 14), for the same models, as well as their maximum value in green.
The radial extent of the upper panel covers the extent of our PM data (where Γ dark was computed), while the lower panel covers the extent of our entire dataset.
while negative asymptotic slopes agree with a large set of DM scale radii, positive slopes require that the respective core (or even a drop in the density) be limited within ≲ 1 kpc.Indeed, upon analyses of our MCMC chains, cores larger than 487 pc, 717 pc and 942 pc are ruled out at 1-, 2-and 3-σ confidence, 35 respectively.For reference, the scale radius predicted by ΛCDM, previously mentioned in Section 3.3, equals 1.06 kpc.
The circular velocity at our outermost data point in our best-fit models is v Rmax circ = 24.19+6.31  −2.97 km s −1 .Similarly, the maximum value of the circular velocity is v max circ (r) = 27.78+10.85 −5.97 km s −1 .This v max circ (r) measurement is generally higher than most previous calculations, namely (Strigari et al. 2007, 15 − 35 km s −1 ), (Martinez 2015, 18.2 +3.2 −1.6 km s −1 ) and (Massari et al. 2020, 10.2 − 17.0 km s −1 ).Correspondingly, the dark mass in 35 These numbers are derived upon selecting the elements of the corner plot in Figure 14 that correspond to γ dark ≥ 0, and retrieving the values that encompasses 68%, 95% and 99.7% of the respective log r dark distribution.Final posterior probability distributions of Draco's asymptotic dark matter density slope, γ dark , and the logarithm of the dark matter scale radius, log r dark , with r dark in pc.The plotting details are similar to the ones in Figure 13.The results rule out a core larger than 942 pc at 3-σ confidence.
our models is higher as well.It is difficult to precisely determine where such differences could come from, but one could speculate that this relates to different completeness of the respective datasets used in each work.Indeed, the circular velocity values measured by the likewise axisymmetric modeling from Hayashi et al. (2020, figure 9) over similar radial ranges lie closer to ours (i.e.∼ 25 − 30 km s −1 ).
From Table 3, one sees that higher heliocentric distances of Draco are usually related to lower inclinations and more cored models, and vice-versa.Our estimate of Draco's distance, D = 75.37+4.73  −4.00 kpc, provides the first dynamical distance for this dwarf.Comparatively, Bonanos et al. (2004) measured 75.8 ± 5.44 kpc using a set of 146 RR Lyrae stars, Aparicio et al. (2001) found 80 ± 7 kpc from analyses of the magnitude of the horizontal branch at the RR Lyrae instability strip, and Muraveva et al. (2020) reported 80.5 ± 2.6 kpc when using 285 RR Lyrae stars.Our measurement is thus comparable to and competitive with other literature results based on stellar population methods.Thus, high-quality astrometric data also provides a valuable validation of standard distance determination techniques.

Spherical vs. axisymmetric models
Until recently, there were no PM dispersion profiles available for internal mass modeling of dSphs.Hence, methods employed to analyze LOS velocities had to make substantial assumptions to remove degeneracies in the data.Among other things, models usually assumed spherical geometry (e.g.Wilkinson et al. 2002;Read et al. 2018;Massari et al. 2020), with the important exception of Hayashi et al. (2020).The velocity moments that are derived from the Jeans equations then depend only on the projected radius to the system's center.Instead, observed quantities in axisymmetric models depend on the position angle on the sky.We have found that for the new PM dataset presented here, axisymmetric models yield substantially different results from spherical models.This is true especially for the quantities most of interest, namely the dark matter cusp slope and the velocity anisotropy.Axisymmetric models imply lower anisotropy β B and higher cusp slope Γ dark .Hence, it is critically important to construct axisymmetric models that properly take position angle dependencies into account.
In Appendices B.3 and B.4, we again use the scalefree dynamical models of the type discussed in de Bruijne et al. (1996) to explain this result.Figure 16 in the appendix shows that there is a tight monotonic relation between the PM anisotropy σ POSt /σ POSr integrated over the sky, and the globally-integrated intrinsic β B .This relation is very similar for spherical and axisymmetric models.However, for Draco we have not measured σ POSt /σ POSr over the entire projected image of the galaxy, but only for two fields along the major axis (see Figure 6).Figure 17 in the Appendix shows that in axisymmetric geometry, σ POSt /σ POSr is not constant with position angle on the sky.Instead, it is much lower on the major axis than on the minor axis.Hence, a spherical model that assumes that σ POSt /σ POSr is the same everywhere as measured on the major axis will overestimate the radial anisotropy β B .So while our axisymmetric models imply that β B = −0.20 +0.28  −0.53 , our best-fit canonical spherical model (model 3 in Table 2) has instead β B = 0.39 +0.13  −0.14 .As mentioned in Section 4.1.4,the latter is consistent with previous studies that assumed sphericity (e.g.Massari et al. 2020;del Pino et al. 2022).The higher β B in spherical models translates to a shallower DM slope, given the correlation in Figure 13.
Systematic biases of this type are lessened, but not necessarily erased, when using more spatially complete datasets.This is something to keep in mind as we enter a new era of galactic PMs: although spherical models are less costly and thus helpful to understand general choices of parametrization and priors, robust results and respective conclusions can only be obtained when considering more complex models that take into account the real shape of the galaxy (this has also been argued previously by Genina et al. 2018).
Even in our case, there are still degeneracies that are not fully taken into account.Specifically, as highlighted in Cappellari (2020), the deprojection of the surface brightness to obtain the intrinsic luminosity density is not unique unless the axisymmetric galaxy is seen edgeon (Rybicki 1987;Kochanek & Rybicki 1996).The degeneracy increases considerably when the galaxy is seen at low inclinations (Romanowsky & Kochanek 1997;van den Bosch 1997;Magorrian 1999).Our results for Γ dark do not depend strongly on the assumed inclination (cd.Table 3).But it should be kept in mind that at low inclination other deprojections of the luminous density may be possible that differ from the Plummer models assumed here.

The effect of binaries
As explained in Section 2.2.3, recent works have proposed that the LOS velocity dispersion from dwarf galaxies could be inflated due to the presence of unresolved binaries from single-epoch exposures.In particular, such inflation could translate to mass overestimation, and could eventually bias our DM measurements.Beyond the tests performed in Section 2.2.3, which have shown that the LOS velocity dispersion remains consistent when using data with different number of epochs, another independent test is to compare mass-modeling with either LOS+PM data (case i) or with PM data alone (case ii).Due to the lack of high quality PM data, such a test has never been performed before.Our new state-of-the-art PM catalog can thus shed light onto this question.
Model 9 in Table 2 depicts the results from spherical Jeans modeling with PM data alone, but with the same priors as in our preferred model 3 (LOS+PM).Despite the slight increase of Poisson uncertainties, due to the fact that the PM subset alone has about half as many stars as the LOS+PM subset, the overall predictions of model 9 agree within 1-σ to all the predictions from model 3. Therefore, we conclude from both the tests performed in the current Section and those from Section 2.2.3, that our dynamical modeling results for Draco do not have significant biases due to the presence of unresolved binaries.Although we can only constrain this for the specific galaxy analyzed here, this is an important result that sets the stage for future interpretations on the impact of unresolved binaries on the velocity dispersion profiles of other dwarf galaxies.

The effect of tides
To derive mass estimates and density profile shapes, our analyses assume that Draco is in dynamical equilibrium.However, a number of recent studies have suggested that the excessive mass-to-light ratios measured in dSphs could be due to out-of-equilibrium dynamics, which in turn inflate the velocity dispersion (e.g.Klessen & Kroupa 1998;Hammer et al. 2018).Thus, it is of interest to gauge the possibility that our conclusions could be biased by MW tidal effects.
On this subject, the past literature strongly supports that Draco is a galaxy with no clear signs of tidal disruption: Both Odenkirchen et al. (2001) and Ségall et al. (2007) have found no evident stellar streams, asymmetric disturbances or density breaks that are characteristic of a tidally-perturbed system.Orbital analyses from Sohn et al. (2017) support that the last pericenter passage of Draco, when accounting for a massive Large Magellanic Cloud (see their table 5), happens ∼ 4 Gyr ago, at an average closest MW distance of ∼ 100 kpc, while Pace et al. (2022) report a pericenter of r peri = 58.0+11.4 −9.5 kpc and Battaglia et al. (2022) report r peri = 51.7 +4.1 −6.1 kpc (37.6 +4.2 −4.4 kpc) for a lighter (heavier) MW, mostly from Gaia DR3 data.
For the lower pericenter distances found by Battaglia et al. (2022) and Pace et al. (2022), one could envision the gas-expansion scenario proposed by Hammer et al. (2024) and Wang et al. (2024), which could produce higher velocity dispersion profiles than the stellar component alone (although still with typically ≲ half of the values observed in Figure 11).However, Sohn et al. (2017) also provide arguments (see their figure 4) against a mean radial expansion of the stellar component.Hence, there is no indication that strong tidal effects could be biasing our results, thus making Draco an ideal galaxy for our equilibrium-based dynamical modeling.

Error threshold
As discussed in Section 2.3.5, dynamical modeling should be careful when including data points with PM uncertainties higher than the system's intrinsic velocity dispersion.The larger the PM uncertainties, the more important it is that they are known very accurately.This is difficult to guarantee, since not all sources of observational uncertainty are always known or easily quantified.If the PM uncertainties are underestimated, then the galaxy PM dispersion will be overestimated, which biases the dynamical modeling results.Hence, in our analysis we have only included stars for which the observational PM uncertainty is smaller than 0.024 mas yr −1 , similar to the intrinsic PM dispersion of the galaxy (Section 2.3.5).Also, previous works with globular cluster PM data have often considered only low PM-error stars when deriving PM dispersion profiles from Gaia (Baumgardt et al. 2019;Vitral et al. 2022) and HST (Bellini et al. 2014;Watkins et al. 2015b).
To further test our choice of a PM error threshold, we also performed, for comparison, a MAMPOSSt-PM analysis with only the subset of stars meeting a lower threshold of 0.022 mas yr −1 .We display this run as model 10 in Table 2, which should be compared to its counterpart with the higher error threshold, model 3. Model 10 has 239 stars with measured PMs, compared to 364 in model 3.This implies a decrease of 35% in statistical completeness, and results in somewhat larger uncertainties for all inferred model parameters.Within 1-σ all parameters agree between these two models.This further supports that our results are robust to our choice of PM error threshold, and that there is no indication of biases due to underestimated errors in our standard dataset.

Higher-Order Velocity Moments
The shape of the LOS velocity distribution can in principle be used to obtain a constraint on the velocity anisotropy that is independent of our Jeans results (van der Marel & Franx 1993).However, detailed modeling of this shape requires more complicated modeling techniques than those presented here (e.g.Chanamé, Kleyna, & van der Marel 2008), which is outside the scope of the present paper.Nonetheless, we describe in Appendix B.5 that approximate modeling is again possible with the scale-free dynamical models of the type discussed in de Bruijne et al. (1996).Figure 18 in the Appendix shows that plausible axisymmetric models exist that both: (a) have values of β B consistent with those derived from our Jeans models; and (b) predict a kurtosis for the LOS velocity distribution that is consistent with the observed value.Hence, we would not expect that future detailed modeling of higher-order moments would change our conclusions about Draco's velocity anisotropy, and hence its mass distribution.However, per Figure 18, it might be helpful to constrain Draco's viewing inclination, as well as details of its phase-space distribution function.

COSMOLOGICAL IMPLICATIONS AND FUTURE WORK
Our newly measured asymptotic DM slope for Draco is γ dark = −0.71+0.44  −0.49 , consistent with the behavior expected in ΛCDM (Navarro et al. 1997), especially when comparing the most likely results from our MCMC chains in Figure 14 (in other words, the peak of the respective posterior PDF).Instead, various early studies of dSph galaxies found that observations favored shallow inner DM density profile slopes, consistent with a constant density 'core' at the center (e.g.Battaglia et al. 2008;Walker & Peñarrubia 2011;Amorisco & Evans 2012;Brownsberger & Randall 2021), 36 and inconsistent with predictions of standard ΛCDM cosmology, from DM-only simulations.So if future PM studies of other dSph galaxies were to support our findings, then this supports the standard cosmological hypothesis that the DM in the Universe behaves as a cold particle.The cold aspect of the DM particles would cause them to clump towards the deeper regions of the potential well, and form diverging cusps such as the one we measure.
Early suggestions of DM cores in dwarf galaxies inspired some studies to propose fundamental changes in the nature of DM, such as warm DM (WDM), e.g.sterile neutrinos and gravitinos, that predict lower central DM densities and cored profiles (Dalcanton & Hogan 2001), or SIDM for which DM particles in the central region thermalize via collisions and thereby form a cored profile (e.g.Sameie et al. 2020).The fact that our study 36 Notice that such analyses considered the Sculptor and Fornax dwarfs, which have a higher stellar content, from whence it is not completely excluded that a cuspy DM density profile could have been lowered by baryonic effects.
can rule out cored profiles larger than 942 pc at a 3-σ level thus imposes useful constraints on the SIDM crosssection. 37Independent evidence for the presence of DM cores in gas-rich star-forming dwarf galaxies exists on the basis of HI rotation curve modeling (e.g.Moore 1994;Flores & Primack 1994;Burkert 1995).A likely explanation for this in the context of standard ΛCDM cosmology is the impact of baryons on the DM density profile.This may transform DM cusps into cores by transferring energy and mass to the outer parts of the halos, e.g.via supernova feedback (Read & Gilmore 2005;Pontzen & Governato 2012;Brooks & Zolotov 2014), or star formation events (Read et al. 2018).Our findings in the present paper suggest that these processes have not been important for Draco.However, that is not unexpected.Draco is below the limit where stellar feedback as implemented in current cosmological simulations should still produce a small core (Fitts et al. 2017).Moreover, its star formation shut down long ago (∼10 Gyr; Aparicio et al. 2001).Hence, our work does not provide insight into how the DM profiles of higher-mass or star-forming galaxies may be impacted by baryonic physics.
To further probe how the variety of proposed theoretical mechanisms to form a core compare to observations, it will be essential to measure and model the 3D dynamics of dwarfs with different characteristics than Draco.For example, dSphs having more troubled dynamical pasts, having suffered close encounters with other satellites or the Milky Way itself, or instead dSphs with star formation histories that shut down more recently.Efforts in these directions are already underway -we have acquired multi-epoch datasets of the Sculptor dSph galaxy from the same HST programs described here (Sohn et al. 2021), and are in the process of also acquiring analogous HST data for the Ursa Minor dSph galaxy (Vitral et al. 2023b).Moreover, we have ongoing programs on JWST to further extend our time baselines for both Draco and Sculptor (van der Marel et al. 2023).The Nancy Grace Roman Space Telescope will provide the opportunity to further extend such studies over large fields of view (Han et al. 2023).This will help lift various degeneracies that have so far complicated a full exploration of existing discrepancies between observations and theoretical predictions for dwarf galaxies.

CONCLUSIONS
37 Specific constraints on how the core radius relates to the DM cross-section and particle mass can be found in Read et al. (2018) and (Macciò et al. 2012, see 2013 erratum), respectively.
For many decades, 3D velocity datasets of the internal kinematics of dwarf galaxies were only conceivable through numerical simulations.Thanks to HST's record time in operation, and its exquisite astrometric capabilities, we can now study other galaxies using plane-of-sky velocities that are not simulated, but observed.With four epochs of HST observations of the Draco dwarf spheroidal galaxy spanning an 18 year temporal baseline, we measure precise proper motions for hundreds of stars, with uncertainties below the intrinsic velocity dispersion of the galaxy.This provides the most precise PM dataset of Draco to date.We make this dataset publicly available as online material,38 so that it can also be used for other studies than those described here.
By combining the PMs with existing line-of-sight velocities, we derive for the first time radially-resolved 3D velocity dispersion profiles for any dwarf galaxy.With suitable modeling, these directly constrain the intrinsic velocity anisotropy of the galaxy, and resolve the mass-anisotropy degeneracy that often plagues dynamical modeling.
To fit the measurements and infer the radial mass profile, we solve the Jeans equations in both spherical and axisymmetric geometries.The latter provides the first axisymmetric modeling of any pressure-supported external galaxy that is observationally constrained by all three orthogonal components of the stellar velocity field.
The viewing inclination of the galaxy is not constrained by the data.So we marginalize our modeling results over all possible inclinations, informed by the overall distribution of projected galaxy shapes for elliptical galaxies in the nearby Universe.None of our conclusions depend sensitively on the actual inclination.
Below, we summarize our main findings for the Draco dSph galaxy, as well as the cosmological implications of our study.
• We provide new estimates of the galaxy center, based on elliptical Bayesian fits to the Gaia EDR3 stellar counts (see Table 1).
• We determine the line-of-sight rotation curve of the galaxy (see Figure 4) and find that it has measurable rotation.Over the radial region covered by the LOS dataset, ⟨v/σ⟩ = 0.22 ± 0.09.
• We show that the impact of unresolved binaries on the LOS data is negligible, and does not significantly alter the dynamical modeling results.
• For the PMs in our HST fields along the projected major axis, we measure averaged observed velocity dispersion ratios of ⟨σ POSt ⟩ / ⟨σ POSr ⟩ = 0.80±0.08 and ⟨σ LOS ⟩ / ⟨σ POS ⟩ = 1.08 ± 0.09, where σ POS represents an average over both PM directions.The first ratio is independent of galaxy distance, while the second is inversely proportional to it. 39he tight observational constraints on ratios like these enable dynamical models to strongly constrain the structure of Draco.
• Axisymmetric models imply that the velocity dispersion tensor of the galaxy is radially anisotropic along the symmetry axis, with β J = 0.56 +0.25 −0.42 .This is similar to the anisotropy everywhere in our best-fit spherical model (β B = 0.39 +0.13  −0.14 ).However, the best-fit axisymmetric models are tangentially anisotropic in the equatorial plane, as required to maintain hydrostatic equilibrium in an oblate system.Their globally averaged anisotropy is β B = −0.20 +0.28  −0.53 , so tangentially anisotropic, but still statistically consistent with isotropy.
• Construction of axisymmetric models is essential for flattened galaxies.This is particularly important for PM datasets such as those presented here, which do not cover all position angles.Spherical models then yield biased estimates for both the velocity anisotropy and the inferred cusp slope.
• We infer the DM density slope averaged over the spatial range for which we have PM measurements, Γ dark , to be −0.83+0.32 −0.37 , and an asymptotic DM density slope of γ dark = −0.71+0.44  −0.49 .Cores larger than 487 pc, 717 pc and 942 pc are ruled out at 1-, 2-and 3-σ confidence, respectively.The data do not have the constraining power to distinguish between different plausible parametrizations for the cusped DM density profile (see Table 2).
• The measured slope is in good agreement with ΛCDM predictions, 40 given that our measurements fall well within the break radius of the DM density profile predicted by cosmological simulations.Our best likelihood results corroborate the idea that DM is formed by some sort of cold particle.An asymptotic core is marginally inconsistent with the data at 89.5% confidence, when marginalized over all other quantities, arguing against mod-ified DM scenarios such as warm DM or SIDM.Nonetheless, a small asymptotic core cannot be effectively ruled out.
• The measured cusp slope provides no evidence that it has been lowered through baryonic feedback processes, although this cannot be ruled out either.However, this is not unexpected given Draco's low mass and ancient star formation history.This provides no insight into how the DM profiles of higher-mass or star-forming galaxies may be impacted by baryonic physics.
• We measure Draco's DM halo to have a mass of 1.20 +0.79 −0.28 × 10 8 M ⊙ at the outermost data point (R max = 900 pc), where the circular velocity reaches v circ = 24.19+6.31  −2.97 km s −1 .The maximum circular velocity in our best-fit models is v max circ (r) = 27.78+10.85 −5.97 km s −1 .• We infer a dynamical distance of 75.37 +4.73  −4.00 kpc.This is consistent with estimates obtained in the literature using other methods.Our ∼ 6% distance uncertainty is competitive with the uncertainties inherent to galaxy distances based on stellar evolution and the cosmic distance ladder.This is similar to the situation for globular clusters (Watkins et al. 2015a).
We have obtained here one of the most reliable constraints to date on the DM density profiles of dwarf galaxies.The results lessen the tension around the 'cusp-core' problem, and give further credence to standard ΛCDM cosmology.Our study is just a first step into the realm of 3D axisymmetric dynamics of dSphs.The methods we used can be largely generalized to other systems, and further studies on both Draco and other galaxies are already underway.Hence, many substantial advances are likely to be made in this area in the coming years.
8. ACKNOWLEDGMENTS Support for this work was provided by NASA through grants for programs GO-12966 and GO-16737 from the Space Telescope Science Institute (STScI), which is operated by the Association of Universities for Research in Astronomy (AURA), Inc., under NASA contract NAS5-26555, and through funding to the JWST Telescope Scientist Team (PI: M. Mountain) through grant 80NSSC20K0586.M.G.W. acknowledges support from National Science Foundation (NSF) grants AST-1909584 and AST-2206046.We thank Jorge Peñarrubia for thoughtful suggestions concerning our models, and Justin Read for sharing his Python algorithm to compute M ⋆ − M 200 relations as published in Read et al. (2017).The Digitized Sky Surveys were produced at the Space Telescope Science Institute under U.S. Government grant NAG W-2166.The images of these surveys are based on photographic data obtained using the Oschin Schmidt Telescope on Palomar Mountain and the UK Schmidt Telescope.The plates were processed into the present compressed digital form with the permission of these institutions.This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia),processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium).Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.
This project is part of the HSTPROMO (Highresolution Space Telescope PROper MOtion) Collaboration (https://www.stsci.edu/∼ marel/hstpromo.html),a set of projects aimed at improving our dynamical understanding of stars, clusters, and galaxies in the nearby Universe through measurement and interpretation of PMs from HST, Gaia, and other space observatories.We thank the collaboration members for the sharing of their ideas and software.exp − r a .(A4) The generalized Navarro et al. (1997, NFW) profile, where M −2 stands for the cumulative mass within the radius where the condition d log ρ/d log r = −2 is satisfied.The Einasto (1965) profile, where n is the Einasto index.We also use the Sérsic (1963Sérsic ( , 1968) model, where R is the projected distance to the source center, R e is the effective radius containing half of the projected luminosity and n is the Sérsic index.The term b n is a function of n, obtained by solving the equation: where γ(a, x) = x 0 t a−1 e −t dt is the lower incomplete gamma function.This model does not have an exact analytical deprojection into volume density (see Graham & Driver 2005 for a review).Instead, we use a combination of the most precise analytical approximations for different domains of Sérsic indexes and radii.This combination is described thoroughly in appendix A from Vitral & Mamon (2021), and essentially uses the methods from Lima Neto et al. (1999) plus Simonneau &Prada (2004), andVitral &Mamon (2020).

B.1. Model Parameters and Distribution Functions
The Jeans models in Section 3 are largely based on numerical solution of ordinary differential equations.This means that it is somewhat complicated to quickly or approximately explore how the intrinsic properties or predicted observables of the models depend on its key parameters.For the latter purpose we have found it more convenient to resort to a simpler class of models, namely the scale-free models introduced by de Bruijne, van der Marel, & de Zeeuw (1996).
These models consider the case of an axisymmetric luminous density with axial ratio q and radial dependence ρ ⋆ ∝ r −κ ,41 embedded in a spherical gravitational potential with circular velocity v circ ∝ r −δ/2 .The case of a logarithmic potential corresponds to δ = 0, while a Kepler potential corresponds to δ = 1.
For these potentials, there are different classes of phase-space distribution functions (DFs) that can yield hydrostatic equilibrium in the same geometry and with the same globally-averaged anisotropy, but with a different variation of the anisotropy in the meridional (R, z) plane.Specifically, de Bruijne et al. (1996) defined distinct Case I and Case II DFs.In the Case I DFs (for which f (E, L z ) models form a subset) the quantity β B , as defined in Eq. ( 5), is more tangential in the equatorial plane than on the symmetry axis.In the Case II DFs, Figure 15.Geometry of the system: Set of coordinate definitions.The astronomical source is inclined by an angle i and the line-of-sight direction is the same than the z ′ direction depicted above.
instead, β B is constant throughout the system.The intrinsic and projected quantities of interest for these DFs can be expressed semi-analytically, so that they can be computed quickly.
A software package called scalefree originally written by one of us (RvdM) for the de Bruijne et al. (1996) paper is available for this purpose.The scale-free models have the additional advantage that the full DF is known, so that higher-order moments can be calculated in addition to the second-order moments constrained by the Jeans equations (see Appendix B.5).
The scale-free models exactly reproduce the asymptotic large-radii limit of our Jeans models in Section 3, at intrinsic and projected radii well in excess of the scale radii r ⋆ and r dark of the luminous and dark matter (see e.g.Table 2).But they also provide a reasonable approximation at the intermediate radii where we have PM measurements, if the power-law slopes κ and δ are set to reproduce the average slopes of the luminous density and circular velocity over those radii.
From exploration of the known properties of our bestfit Jeans models in Section 3, we found that the combination of κ = 2 and δ = 0 (i.e. an axisymmetric isothermal luminous density in a spherical isothermal potential) is reasonable over the relevant radial range in Draco.We show below some predictions of such models as a function of the geometry (spherical, or axisymmetric of a given axial ratio and inclination), of the type of DF, and of the globally-averaged β B .The latter can be calculated numerically for given β p , where β p is a pa-rameter that enters into the analytical DFs and which controls the amount of anisotropy in the models.42

B.2. Evaluation of Proper Motions
The de Bruijne et al. (1996) paper addressed the evaluation of projected line-of-sight velocities of the scalefree models, but not the calculation of PMs, which were not observationally accessible at the time.We therefore extended their formalism to plane-of-sky (POS) velocities. 43igure 15 shows the general geometry of the problem, with the same conventions as in Evans & de Zeeuw (1994).The Cartesian frame (x, y, z) is aligned with the axisymmetric luminous density, so that the z-axis is its symmetry axis.The (r, θ, ϕ) are spherical coordinates in this system.A second Cartesian frame (x ′ , y ′ , z ′ ) is introduced so that the (x ′ , y ′ ) frame corresponds to the plane of the sky, with x ′ ≡ y along the projected major axis of the luminous density, and z ′ along the LOS direction.The viewing inclination i is the angle between the z and z ′ axes.The coordinates are related through The velocity along any coordinate direction in the (x ′ , y ′ , z ′ ) frame can be expressed as a combination of radial, tangential and azimuthal components, (B10) For the LOS direction dir=z ′ , the C i terms are given by eqs (49) in de Bruijne et al. (1996) C r = sin θ sin i cos ϕ + cos θ cos i , C θ = cos θ sin i cos ϕ − sin θ cos i , (B11b) The velocities in the radial and tangential directions on the plane of the sky can be written as Figure 17.Position-angle dependence of kinematics: Rows show the predictions of scale-free models for the same three geometries as in Figure 16, in each case chosen to have globally-averaged anisotropy βB = −0.20.The left two panels show the variations of σr, σ θ , and σ ϕ with angle θ ′ in the meridional (R, z) plane (θ ′ ≡ 90 • − θ = 0 in the equatorial plane).The right two panels show the variations of the projected σLOS, σPOSr, and σPOSt with the angle ξ in the plane of the sky (ξ = 0 on the projected major axis).Here σ is used as shorthand for v 2 1/2 .Velocities are expressed in the dimensionless units defined in section 2.1 of de Bruijne et al. (1996).The first and third columns show Case I DFs, while the second and fourth columns show Case II DFs.In axisymmetric geometry there are important variations of the kinematical observables with position angle on the sky, which are ignored when observations are interpreted with spherical models.This can introduce biases in model results.

B.4. Position-Angle Dependence
For this paper, we have obtained new HST PM data primarily along the projected major axis.So while Figure 16 shows the quantity σ POSt /σ POSr integrated over all position angles, it is important to assess also how the kinematics vary with position angle in the models.The three rows of Figure 17 show scalefree model predictions for the same three geometries as in Figure 16.For each of the geometries we show models with two different types of DF (Case I and Case II), but always with β B = −0.20,consistent with the best fit from our axisymmetric Jeans models in Table 3.The left two panels show the variations of the internal kinematics in the meridional (R, z) plane, while the right two panels show the variations of the projected kinematics in the plane of the sky.
The main difference in the internal kinematics predictions for the axisymmetric models between the two types of DF is in the variation of σ r with θ ′ (≡ 90 • − θ).For either DF, σ ϕ increases from the symmetry axis to the major axis (as before, we use in this discussion σ as shorthand for v 2 1/2 ).This is a direct consequence of the tensor virial theorem, which requires that overall the system has more dynamical pressure parallel to the equatorial plane than perpendicular to it, so as to support its flattened shape.The right two panels show that in projection, σ POSr decreases from the major to the minor axis, while σ POSt increases from the major to the minor axis.These behaviors are nearly independent of inclination (at fixed projected axial ratio) and of the specific type of DF.These variations are ignored when a spherical model is constructed, because the projected quantities are then independent of position angle.This can introduce important biases, as discussed in Section 4.2.4.
While Figure 17 pertains to a specific set of parameter combinations, we verified that different parameter combinations yield qualitatively similar conclusions.For example, when considering flatter models with lower axial ratio (either intrinsically or in projection), the variations from major to minor axis become more pronounced than in Figure 17, but otherwise remain qualitatively similar.

B.5. Higher-order moments
Our analysis in this paper has been based on the second (and first) velocity moments that enter into the Jeans equations of hydrostatic equilibrium.However, for the observed LOS velocities we have enough measurements, with small enough error bars, to make it possible to also determine higher-order moments.Specifically, we calculated both the fourth-order Gauss-Hermite moment h 4 (as in van der Marel & Franx 1993) and the kurtosis K LOS of the overall LOS velocity distribution.The kurtosis is defined here as where as before, v k indicates a k-th order velocity moment.We use all position angles on the sky and center the LOS velocity distribution on the bulk systemic velocity listed in Table 1.Hence, odd moments are generally statistically consistent with zero.The observations then imply that h 4 = 0.065 ± 0.036 and K LOS = 0.21 ± 0.25.Both of these quantities are positive, indicating that the LOS velocity distribution is more centrally peaked than a Gaussian.The second-order Jeans equations cannot be used to interpret these measurements.However, the scale-free DF models predict all higher-order moments v k for any give set of model parameters.Reconstructing the full velocity distribution and Gauss-Hermite moments from the higher-order moments is not a straightfoward inversion problem (de Bruijne et al. 1996), so we restrict data-model comparisons here to the kurtosis.Figure 18 shows the predicted kurtosis as function of the globally averaged Binney β B , defined by Eq. ( 5).As in Figure 16, we show predictions for different geometries, with Case I DFs in the left panel and Case II DFs in the right panel.We now also include a flatter axisymmetric model with true axial ratio q = 0.45, which at viewing inclination 48.3 • projects to the observed axial ratio for Draco.The horizontal orange-grey band shows the observed kurtosis with its uncertainty.For reference, the vertical green band shows the 68% confidence band on β B for Draco from our axisymmetric Jeans models (bottom row of Table 3).
The predictions in the Figure show that more radially anisotropic models predict more peaked LOS velocity distributions (higher kurtosis), as has been previously established (e.g.van der Marel & Franx 1993).For the Case II DFs, the predictions are more-or-less independent of geometry.But for the Case I DFs, flatter axisymmetric models (seen at lower inclinations) predict higher kurtosis values.Hence, higher-order moments can help both to determine the viewing inclination and the full structure of the DF.More sophisticated dynamical mod- We now also include a flatter axisymmetric model with true axial ratio q = 0.45, which at viewing inclination 48.3 • projects to the observed axial ratio for Draco.We show in dashed black the observed kurtosis along with the percentiles of its distribution, color-coded as indicated on the right.The βB inferred from our axisymmetric Jeans models, with its 68% confidence region, is depicted in green.
Plausible models exist that match all constraints (i.e. have predictions within the intersection of the orange and green bands).
eling is required to fully exploit this information (e.g.Chanamé et al. 2008).Nonetheless, we note here that an axisymmetric Case I scale-free model at the median inclination of 57.1 • predicts the observed K LOS = 0.21 when β B = −0.07.The latter is fully consistent with the constraints that we have obtained from our axisymmetric Jeans models.Hence, there is no reason to expect that detailed modeling of higher-order moments would alter the conslusions about Draco's velocity anisotropy, and hence its mass distribution, that we have drawn in this paper.

Figure 1 .
Figure 1.Surface density: Goodness of fit of a Plummer Plummer (1911) spherical model to Gaia EDR3 data of Draco, with significant field-star interlopers.The surface density fits follow the formalism detailed in Vitral (2021), which assumes a constant contribution of field stars.

Figure 2 .
Figure 2. Inclinations: Probability distribution function of inclinations for Draco, computed according to the methodology described in Section 2.1.3.The dashed red lines indicate the n-th percentiles of the distribution, with n spaced from zero to a hundred, on intervals of ten.

Figure 3 .
Figure 3. Interloper cleaning: Goodness of fit of our multi-dimensional mixture model, projected on the vLOS and [Fe/H] dimensions.The central panel shows the stars from the Walker et al. (2023) catalog on the sky region of Draco, in the vLOS -[Fe/H] plane, color-coded by their Draco membership probability.The upper and right panels display the histogram of this plane projected on the vLOS and [Fe/H] dimensions, respectively.On the corner of each side panel, we represent the median uncertainty on the respective dimension by an error bar.Stars with P memb > 0.99 were selected for inclusion in our analysis.

Figure 5 .
Figure 5. Impact of unresolved binaries: Multi-epoch velocity dispersion profiles (as a function of projected radii) for groups of LOS data constructed from a different number of epochs.The number of stars having multi-epoch observations is scarcer, and thus we have fewer radial bins for those.The excellent agreement within 1-σ between all N epoch , and the fact that the σLOS profiles associated with higher N epoch are not considerably smaller than the N epoch = 1 subset give us good confidence that unresolved binaries do not affect our mass estimates beyond the statistical uncertainties.

Figure 7 .
Figure 7. Photometric cleaning: The upper panels show the CMDs of our initial interloper cleaning of Field 2 (using instrumental magnitudes), with the original stars in the subset in black, and the retained ones in orange.The lower panels show the QFIT-based cleaning of Field 2, with the original stars in the subset in black, and the retained ones in orange.The plots attest to the effectiveness of our friendsof-friends photometric and interloper cleaning.

Figure 8 .
Figure 8. Local corrections: Effect of local proper motion (PM) corrections on our dataset (here, shown for Field 1).The panels show the master frame (X, Y ) positions of the sources, color-coded according to the PM in each direction (for reference, Draco's typical PM dispersion peaks around 0.03 mas yr −1 ).The left panels show clearly artificial PM shifts related to uncorrected residual CTE and geometric distortion effects, especially around the detector edges and boundaries.The right panels show our corrected sample, which has a generally homogeneous PM distribution throughout the whole field, centered around null PMs.

Figure 9 .
Figure9.Underestimated errors: Measured proper motion (PM) dispersion, σPOS, as a function of the maximum PM uncertainty in the dataset, ϵ lim .Blue curves relate to Field 1, while the red ones relate to Field 2. Opaque curves relate to the PM subsets without the 3-σ cleaning explained in Section 2.3.5, while the solid curves show results for the PM subsets with such cut.For reference, we display a x = y line in dashed green.This plot shows that it is important to impose a cut in PM errors at ϵ lim ∼ 0.024 mas yr −1 .Inclusion of stars with PM uncertainties in excess of the intrinsic galaxy dispersion yields an overestimated value for the galaxy dispersion.

Figure 10 .
Figure 10.Data overview: (A) Zoom-in of studied fields.Black dots indicate stars that have PM accuracies good enough to fulfill our scientific goals (i.e.ϵPOS ≲ σPOS) and that were used in our dynamical modeling, while the gray dots indicate stars that do not, or that failed our multiple steps of data cleaning.The HST+Gaia sample compiled with GaiaHub (del Pino et al. 2022) is marked in blue, while the orange stars depict the dataset used by Massari et al. (2020).(B) CMDs based on HST data (using instrumental magnitudes), using the same symbols as (A).(C) F606W magnitude as a function of the 1D PM error (calculated as in eq.[B2] from Lindegren et al. 2018), using the same symbols as (A), with a dashed line representing the ϵPOS threshold we use.Our dataset comprises hundreds of stars with ϵPOS ≲ σPOS, compared to zero such stars in previous PM analyses of Draco.

Figure 11 .
Figure 11.Observations and model comparison: The quantities showcased areupper left: velocity dispersion in the line-ofsight; upper right: plane-of-sky velocity dispersion in radial direction; lower left: line-of-sight rotation amplitude; lower right: plane-of-sky velocity dispersion in tangential direction.The black circles and error bars represent the data, computed from the catalog of Walker et al. (2023) for the leftmost panels, and from our HST program in the rightmost panels.Model predictions and the adopted galaxy distance (which we use to convert mas yr −1 to km s −1 in the rightmost panel) are from our axisymmetric JamPy MCMC fits for i = 57.1 • as discussed in Section 4.2.2 below.Our best fit (as defined in Section 3.2) is depicted as a black solid line, which we interpolate with respect to projected radius R from the actual data R values (this is done for visualization purposes, since there is also a dependence with the projected angle ξ).The percentile from our MCMC chains are color-coded as in the color bar scheme, in the right.

26
While cosmological simulations tend to favor generally triaxial DM halos (e.g.Jing & Suto 2000;Kazantzidis et al. 2004a), recent observational studies of the MW DM halo support a quasispherical potential within the inner ∼ 30 kpc(Hattori, Valluri, & Vasiliev 2021;Wegg, Gerhard, & Bieth 2019).27With exception of the DM scale radius, to which we allow a larger prior towards higher radii. 28This means we fix M⋆ = 4.7 × 10 5 M ⊙ , and the major axis of the projected density as the value fitted in Section 2.1.1,namely a maj,⋆ = 9.1 arcmin, where a maj is the respective Plummer major axis (see Appendix A). 29 A similar assumption is also present inHayashi et al. (2020), who base their choices on cosmological simulations by Vera-Ciro et al.

Figure 12 .
Figure12.Dark matter density and circular velocity: The upper panel shows the dark matter density profiles from our JamPy MCMC chains for i = 57.1 • , color-coded according to their respective percentiles.We display the most probable solution (as defined in Section 3.2) as a black line.The lower panel shows the circular velocity curves, defined by Eq. (14), for the same models, as well as their maximum value in green.The radial extent of the upper panel covers the extent of our PM data (where Γ dark was computed), while the lower panel covers the extent of our entire dataset.

Figure 14 .
Figure 13.Averaged dark matter slope and velocity anisotropy: Final posterior probability distributions of Draco's dark matter density slope, Γ dark , averaged over the range where we have PM data, and the globally-averaged βB velocity anisotropy (see Eq. [5]).The distributions were marginalized over inclinations as described in Section 4.2.1, and further smoothed for simple visualization purposes.The values highlighted correspond to the best likelihood estimates (title texts, arrows, and cross), and the respective 16th and 84th percentiles (dashed lines).The results imply a classic ΛCDM-like slope in the Draco Sph galaxy, and a marginal preference for overall tangential velocity anisotropy.

Figure 18 .
Figure18.Kurtosis of the grand-total LOS velocity distribution as a function of the globally averaged velocity anisotropy βB.As in Figure16, different curves show predictions of scale-free models for different geometries, for Case I (left panel) and Case II (right panel) DFs, respectively.We now also include a flatter axisymmetric model with true axial ratio q = 0.45, which at viewing inclination 48.3 • projects to the observed axial ratio for Draco.We show in dashed black the observed kurtosis along with the percentiles of its distribution, color-coded as indicated on the right.The βB inferred from our axisymmetric Jeans models, with its 68% confidence region, is depicted in green.Plausible models exist that match all constraints (i.e. have predictions within the intersection of the orange and green bands).

Table 1 .
Overview of Draco parameters.
Figure 4. Rotation: Rotation profile of Draco, as a function of its projected radius, computed with LOS data.The black dots and error bars show the v/σ ratio per bin, while the red and blue curves (and respective uncertainty regions) show, respectively, the overall mean through the Draco field, and the best fit of Eq. (3).We add a dashed-gray line to represent the case of no rotation.From this plot, we conclude that Draco has a modest amount of rotation.

Table 2 .
Main results of the MAMPOSSt-PM spherical Jeans modeling.
Tiret et al. (2007)itt (1985))ID; (2) Dark matter parametrization: "GPLU" for a generalizedPlummer (1911)model with free inner slope, "GKAZ" for a generalizedKazantzidis et al. (2004b)model with free inner slope, "GNFW" for a generalizedNavarro, Frenk, & White (1997)model with free inner slope, and "EIN" for theEinasto (1965)model; (3) Test type: "ρ dark " when testing different parametrizations for the DM density profile, "βgOM" for a generalizedOsipkov (1979)-Merritt (1985)parametrization of the velocity anisotropy profile, "βgTiret" for a generalizedTiret et al. (2007)parametrization of the velocity anisotropy profile, "Cusp" when forcing an inner density slope of −1 for the dark matter, "Core" when forcing a cored model for the dark matter, "PM" when only using PMs (no LOS data) and "ϵ" when using a lower PM error threshold; (4) Heliocentric distance, in kpc; (5) anisotropy value at r = 0; (6) anisotropy value at infinity (only for models with variable anisotropy); (7) Plummer scale radius of the stellar component, in 10 2 pc; (8) Total mass of the stellar component, in 10 5 M⊙; (9) Dark matter scale radius, in 10 2 pc; (10) Dark matter mass at the maximum projected data radius, in 10 8 M⊙; (11) Dark matter asymptotic density slope or Einasto index n; (12) Dark matter density slope averaged over the spatial range where PMs are available; (13) Difference in AICc relative to model 1.Listed uncertainties are based on the 16th and 84th percentiles of the marginal distributions, unless the maximum likelihood solution was outside that boundary, in which case the uncertainties are related to the minimum or maximum value of the MCMC chain.We did not consider the AICc diagnostic when the data set was different from the respective standard model.

Table 3 .
Main results of the JamPy axisymmetric Jeans modeling.