$\texttt{galkin}$: a new compilation of the Milky Way rotation curve data

We present $\texttt{galkin}$, a novel compilation of kinematic measurements tracing the rotation curve of our Galaxy, together with a tool to treat the data. The compilation is optimised for Galactocentric radii between 3 and 20 kpc and includes the kinematics of gas, stars and masers in a total of 2780 measurements carefully collected from almost four decades of literature. A simple, user-friendly tool is provided to select, treat and retrieve the data of all source references considered. This tool is especially designed to facilitate the use of kinematic data in dynamical studies of the Milky Way, with applications ranging from dark matter constraints to tests of modified gravity.


I. INTRODUCTION
The rotation curve of a spiral galaxy provides far-reaching insight into its properties, as has been explored extensively for decades (see e.g. Refs. [1–5]). Data on the rotation curve of our Galaxy, a spiral itself, have also been available for several decades [6–11]. However, the data are scattered throughout the literature and groups of references are often neglected. We therefore set out to assemble a comprehensive compilation of the decades-long observational effort to pinpoint the rotation curve of the Milky Way. The compilation, named galkin, improves upon existing ones (e.g. Refs. [12,13]) in several respects, most notably: (i) an enlarged database of observations appropriately treated for unified use, and (ii) the release of a simple out-of-the-box tool to retrieve the data. This compilation was presented in Ref. [14] and later adopted in other works in the literature. Without venturing into any analysis of the Galactic structure or dynamics (as done in galpy [15]), here we provide instead a thorough description of the data sets (Sec. II) as well as the features of an out-of-the-box tool (Sec. III) to access the database and output the desired data for independent analyses. The open source code provided is simple and flexible, and can be easily modified to include new data sets or other types of measurements. The latter feature is particularly relevant on the eve of the precision era soon to be introduced by the Gaia satellite [16] and an array of optical and near-infrared ground-based surveys such as APOGEE-2 [17,18], GALAH [19], WEAVE [20] and 4MOST [21]. Our compilation can be regarded as a step forward in unifying the current state of the art, yet it is certainly open to further inclusions; please see our own extensive caveats and notes throughout the manuscript. We encourage the community to adopt galkin and participate in its extension as new data sets arise.

II. COMPILATION
The rationale behind our literature survey is to be as exhaustive as possible, but still selective enough to put together a clean and reliable sample of kinematic tracers. For that reason, we have decided to exclude tracers with kinematic distance determination only, significant asymmetric drift or large random motions. Although we have taken the utmost care to make the compilation exhaustive, we cannot exclude that some data sets may be missing; the reader wishing to add extra data sets is welcome to do so by modifying the open source code described in Sec. III. Our compilation is focussed on the range of Galactocentric radii $R = 3-20$ kpc and is not intended for use far outside this range. In particular, compilations dedicated to the innermost regions of the Milky Way are available elsewhere in the literature [39]. However, we warn the reader that the use of kinematic tracers at $R \lesssim 2-5$ kpc may be problematic for certain applications [40].
Note as well that kinematic tracers beyond the ones in our compilation exist and are discussed at length in the literature, including tracers in the outer halo [13,41–44], disc [45–48] and off the Galactic plane [49–53]. For careful analyses of the implications of these and other tracers on the mass distribution of the Milky Way, we refer the reader to the works cited above and also to Refs. [54–67], where the authors have often built their own tracer compilations. Partial compilations are easily reproducible with the galkin tool given the possibility to (de-)select individual references. Again, we remind the interested reader that our code galkin is modular and easily modifiable by the user for the addition of new data sets.
Tab. I recaps the key features of each data set. We refer to the original source references and in particular to the supplementary information of Ref. [14] for in-depth descriptions of the different object types. In order to obtain a clean sample, we have imposed various data selection cuts on the available sources, following closely the recommendations of each original reference (note, however, that the interested user can easily override these cuts by editing the source code directly). The Appendix at the end of the present paper gives a full account of our data selection and treatment for each reference listed in Tab. I.
Our compilation consists of 2780 individual tracers distributed in Galactocentric radius $R$, Galactic longitude $\ell$ and height $z$ above the Galactic plane as shown in Fig. 1.
[...] to report measurements of $v^{\rm los}_h$ in terms of the line-of-sight velocity in the local standard of rest (LSR) $v^{\rm los}_{\rm lsr}$ for a fixed peculiar solar motion $(U, V, W)_\odot$. In these cases, we infer $v^{\rm los}_h$ by subtracting the peculiar solar motion used in the reference from the reported $v^{\rm los}_{\rm lsr,0}$ (cf. the Appendix). Once $v^{\rm los}_h$ is obtained, it is added to the adopted peculiar solar motion to get the final LSR line-of-sight velocity $v^{\rm los}_{\rm lsr}$. Each object has an associated measurement $(\ell, b, d \pm \Delta d, v^{\rm los}_{\rm lsr} \pm \Delta v^{\rm los}_{\rm lsr})$. The corresponding Galactocentric radius follows from simple geometry as
$$R = \left( d^2 \cos^2 b + R_0^2 - 2 R_0 d \cos b \cos \ell \right)^{1/2}, \tag{1}$$
where $R_0$ is the distance of the Sun to the Galactic centre. Under the assumption of circular orbits, the angular circular velocity of the object $\omega_c$ is found by inverting
$$v^{\rm los}_{\rm lsr} = \left( R_0\, \omega_c - v_0 \right) \cos b \sin \ell, \tag{2}$$
where $v_0$ is the local circular velocity. The uncertainties on $d$ and $v^{\rm los}_{\rm lsr}$ are propagated to $R$ and $\omega_c$, respectively. We also provide the familiar circular velocity $v_c \equiv R\,\omega_c$ and corresponding uncertainties, but note that the errors of $R$ and $v_c$ are strongly positively correlated, while those of $R$ and $\omega_c$ are independent. All uncertainties currently implemented in galkin are symmetric, following the information available in each reference; future data might provide the full distribution of observables, which would then be treated in upcoming versions of galkin and would be of great value for Bayesian studies.
The procedure described above is common to all object types in Tab. I, with some modifications in two cases. For terminal velocities, we set $b = 0$ and $R = R_0 |\sin \ell|$ (or, equivalently, $d = R_0 |\cos \ell|$) in Eqs. (1) and (2), and each measurement reads $(\ell, v^{\rm los}_{\rm lsr} \pm \Delta v^{\rm los}_{\rm lsr})$. For the HI thickness method, the measured quantity is $W \equiv R_0\, \omega_c - v_0$, so each data point is defined by $(R/R_0 \pm \Delta R/R_0, W \pm \Delta W)$, cf. Refs. [24,69].
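As an illustration of the geometry and inversion described in this section, the following Python sketch converts a single measurement into $(R, \omega_c, v_c)$ with uncertainties. It is not taken from the galkin source code: the function name, the default values $R_0 = 8$ kpc and $v_0 = 230$ km/s, and the first-order (linear) error propagation are assumptions made here for illustration only.

```python
import math

def tracer_to_rotation_curve(l_deg, b_deg, d, dd, v_lsr, dv_lsr,
                             R0=8.0, v0=230.0):
    """Hypothetical helper (not part of galkin).

    Inputs: Galactic coordinates in degrees, distance d +- dd in kpc,
    LSR line-of-sight velocity v_lsr +- dv_lsr in km/s.
    R0 (kpc) and v0 (km/s) are illustrative defaults, not galkin's.
    Returns (R, dR, omega, domega, vc, dvc).
    """
    l = math.radians(l_deg)
    b = math.radians(b_deg)

    # Galactocentric radius from simple geometry.
    R = math.sqrt(d**2 * math.cos(b)**2 + R0**2
                  - 2.0 * R0 * d * math.cos(b) * math.cos(l))
    # Linear propagation of the distance error: dR = |dR/dd| * dd.
    dR = abs(d * math.cos(b)**2 - R0 * math.cos(b) * math.cos(l)) / R * dd

    # Invert v_lsr = (R0 * omega - v0) * cos(b) * sin(l) for omega.
    proj = math.cos(b) * math.sin(l)
    if abs(proj) < 1e-12:
        raise ValueError("line of sight carries no rotation information")
    omega = (v_lsr / proj + v0) / R0
    domega = dv_lsr / (R0 * abs(proj))

    # Circular velocity; the errors of R and omega are treated as independent.
    vc = R * omega
    dvc = math.hypot(omega * dR, R * domega)
    return R, dR, omega, domega, vc, dvc
```

For instance, `tracer_to_rotation_curve(90.0, 0.0, 6.0, 0.5, -46.0, 4.0)` describes a tracer in the Galactic plane at $\ell = 90^\circ$ and returns $R$ in kpc, $\omega_c$ in km/s/kpc and $v_c$ in km/s.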
We also process the proper motions $\mu_{\ell^*}$, $\mu_b$ when available, as is the case for open clusters and masers. All other details of data treatment are exhaustively documented in the Appendix.

III. TOOL DESCRIPTION
We now turn to the description of the galkin tool, whose function is to allow the user to access the data described in Sec. II in a customisable way. The tool is written in Python and the code package is available through our GitHub page, github.com/galkintool/galkin.
The distribution has a very simple structure. The parent directory contains the setup file setup.py together with the usual README file. The galkin package is provided under galkin/, where galkin/data/ contains the original data as extracted from the corresponding references (see footnote 2), and example scripts can be found under bin/. The main dependencies and installation instructions are described in the README file, whereas here we simply summarise the usage of the tool assuming it is properly installed and running. Note that galkin adopts astropy [77] for coordinate conversion if required by the user, but the code can also be used without installing this package, which is the default option.
The goal of galkin is to provide ready-to-use data files containing all the necessary kinematic tracer information for constraining the rotation curve of the Galaxy. The user may choose the values for the Galactic parameters $(R_0, v_0)$ and $(U, V, W)_\odot$, as well as (de)select either entire classes of tracers (gas, stars, masers) or single references in Tab. I. It is also possible to add a given systematic uncertainty to the line-of-sight velocity of each tracer. This is done with the script bin/galkin_data.py, either through a graphical interface (see Fig. 3) by launching the tool from bin/ with the command
    python galkin_data.py
or through a customisable input file by typing
    python galkin_data.py inputpars.txt
The code then processes the original data sets of the selected references, converting each data set consistently for the chosen values of the Galactic parameters. The output is stored in three data files: bin/output/posdata.dat with the position information of each tracer, namely $(R, d, \ell, b, x, y, z)$; bin/output/vcdata.dat with the rotation curve measurements, namely $(R, \Delta R, v_c, \Delta v_c, \omega_c, \Delta\omega_c)$; and bin/output/rawdata.dat with the raw data measurements, namely $(v^{\rm los}, \Delta v^{\rm los}, \mu_{\ell^*}, \Delta\mu_{\ell^*}, \mu_b, \Delta\mu_b)$.
Footnote 2: Note that the main point of this tool is to handle in a unified way data from references using different Galactic parameters, different definitions and often overlapping in observed sources. The data files under data/ contain the original published data of each reference; please do not use these files directly unless you are fully aware of all details of each reference.
In the first two files, the values of the Galactic parameters chosen by the user are reported in the first line and the source reference for each tracer is indicated in the last column. For testing purposes, we provide under bin/output/ the sample files corresponding to our baseline choice of Galactic parameters used to produce Fig. 2 (for the entire database and for single classes of objects, separately) as well as for a variation of the Galactic parameters (for the entire database only). The tool also includes the script bin/galkin_plotter.py to read and visualise the output described above. This can be launched from bin/ with the command
    python galkin_plotter.py output/vcdata.dat output/posdata.dat
which produces a set of demonstrative plots including the spatial distribution of the tracers and the inferred rotation curve.
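Readers who prefer to load the rotation curve output into their own analysis may find the following Python sketch useful. It relies only on the layout stated above (parameters in the first line, six numerical columns, source reference in the last column); the whitespace delimiter, the function name and the sample text are assumptions made here and are not taken from the galkin distribution, so the actual files should be inspected before reuse.

```python
# Column names follow the order (R, dR, vc, dvc, omega_c, domega_c) given in the text.
COLUMNS = ("R", "dR", "vc", "dvc", "omega", "domega")

def parse_vcdata(text):
    """Hypothetical parser for vcdata.dat-like content.

    Assumes whitespace-separated columns, a first line holding the Galactic
    parameters, and a trailing reference label that may contain spaces.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    records = []
    for ln in rows:
        parts = ln.split(None, len(COLUMNS))        # keep the label in one piece
        values = [float(x) for x in parts[:len(COLUMNS)]]
        label = parts[len(COLUMNS)].strip() if len(parts) > len(COLUMNS) else ""
        records.append(dict(zip(COLUMNS, values), ref=label))
    return header, records

# Synthetic content for illustration only (not real galkin output).
SAMPLE = """R0=8.0 v0=230.0 (synthetic header line)
5.0 0.5 220.0 10.0 44.0 2.0 Ref A
10.0 1.0 230.0 12.0 23.0 1.2 Ref B
"""

header, records = parse_vcdata(SAMPLE)
```

Each record is a plain dictionary, so selecting tracers by reference or by radius reduces to a list comprehension over `records`.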
Finally, we provide the script bin/galkin_data_fast.py in order to illustrate how to use galkin inside another code without dealing with input or output files. This is a faster version of the data processing pipeline that is specifically designed for applications that need to use galkin repeatedly, e.g. in scans over Galactic parameters.
The structure of galkin is purposely minimal and modular. The code can be easily adapted to replace or modify existing data sets or single tracers and to add new data sets as they become available. The idea behind galkin is to provide an up-to-date and user-friendly compilation of tracers of the kinematics of the Milky Way over the coming years.
Acknowledgements. M. P. acknowledges the support from Wenner-Gren Stiftelserna in Stockholm; F. I. from the Simons Foundation and FAPESP process 2014/11070-2.

Appendix: Data treatment
Tracers of the rotation curve of the Milky Way usually adopted in the literature can be roughly divided into three categories: gas kinematics, kinematics of stellar objects, and masers. For each of these classes of objects, different methodologies are used to infer positions, distances and kinematics. Without attempting here a review of the properties of each tracer type, we point to the supplementary information of Ref. [14], where this collection of data was first presented. The reader will find there an in-depth description of all tracer types as well as appropriate references to the original literature. In this Appendix, we recap the original source references and fully document our data treatment. Note that halo stars are not included in the current version of galkin, due to the additional assumptions needed. Future versions of the code will properly include halo stars as an extra tracer type.
Tab. I displays the key features of all data sets used in galkin. In the following we provide a detailed description of the treatment applied to each data set. Equation, figure and table numbering refers to the original source references.

a. HI terminal velocities
Fich+ '89 [10] From Tab. 2, we take the terminal velocities $v^{\rm los}_{\rm lsr,0}$ measured at different longitudes $\ell$ and assume an overall velocity uncertainty $\Delta v^{\rm los}_{\rm lsr} = 4.5$ km/s following Sec. II.b.i. The reported velocities $v^{\rm los}_{\rm lsr,0}$ correspond to a peculiar solar motion $(U, V, W)_\odot = (10.3, 15.3, 7.7)$ km/s, which is consistently subtracted off in our compilation (see footnote 3). We further correct for the pe[...]
Footnote 3: Note that the original reference fails to indicate explicitly the adopted peculiar solar motion. Following Ref. [74], we assume the adopted value is the old standard solar motion as defined in the Appendix of Ref. [74].
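The subtraction and re-addition of the peculiar solar motion can be sketched in Python as follows. This is an illustration rather than the galkin implementation: the function names are hypothetical, the sign convention (U towards the Galactic centre, V along Galactic rotation, W towards the north Galactic pole) is the common one and is assumed here, and the "new" solar motion in the usage note is an arbitrary illustrative choice.

```python
import math

def solar_projection(l_deg, b_deg, U, V, W):
    """Projection of the peculiar solar motion (km/s) on the line of sight.

    Assumed convention: U towards the Galactic centre, V along Galactic
    rotation, W towards the north Galactic pole.
    """
    l = math.radians(l_deg)
    b = math.radians(b_deg)
    return (U * math.cos(b) * math.cos(l)
            + V * math.cos(b) * math.sin(l)
            + W * math.sin(b))

def convert_lsr(v_lsr_ref, l_deg, b_deg, uvw_ref, uvw_new):
    """Hypothetical helper: re-express an LSR velocity for a new solar motion."""
    # Step 1: remove the solar motion adopted in the source reference.
    v_helio = v_lsr_ref - solar_projection(l_deg, b_deg, *uvw_ref)
    # Step 2: add the solar motion chosen by the user.
    return v_helio + solar_projection(l_deg, b_deg, *uvw_new)

# Solar motion quoted in the text for the older references (km/s).
OLD_STANDARD = (10.3, 15.3, 7.7)
```

As a usage example, `convert_lsr(50.0, 90.0, 0.0, OLD_STANDARD, (11.1, 12.24, 7.25))` shifts a velocity reported at $\ell = 90^\circ$, $b = 0$ by the difference of the two V components only, since U and W do not project on that line of sight.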
Malhotra '95 [22] From Fig. 7, we take the terminal velocities $v^{\rm los}_{\rm lsr,0}$ measured at different Galactocentric radii $R/R_0$ and assume an overall velocity uncertainty $\Delta v^{\rm los}_{\rm lsr} = 9$ km/s in line with the dispersions computed in Sec. 3.4 for both the first and fourth quadrants. The Galactocentric radii are converted into longitudes through $R = R_0 |\sin \ell|$ depending on the quadrant (first quadrant: circles and triangles in top panel of Fig. 7; fourth quadrant: squares in bottom panel of Fig. 7). The reported velocities $v^{\rm los}_{\rm lsr,0}$ correspond to a peculiar solar motion $(U, V, W)_\odot = (10.3, 15.3, 7.7)$ km/s, which is consistently subtracted off in our compilation (see footnote 3).
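The conversion from $R/R_0$ to longitude at the tangent point can be sketched in Python as below. The function name is hypothetical and the mapping of the fourth quadrant to $\ell = 360^\circ - \arcsin(R/R_0)$ is the natural reading of $R = R_0 |\sin \ell|$, stated here as an assumption rather than as the galkin implementation.

```python
import math

def longitude_from_radius(r_over_R0, quadrant):
    """Hypothetical helper: tangent-point longitude (degrees) from R/R0.

    quadrant = 1 gives 0 <= l <= 90 deg; quadrant = 4 gives 270 <= l <= 360 deg.
    """
    if not 0.0 <= r_over_R0 <= 1.0:
        raise ValueError("tangent points require 0 <= R/R0 <= 1")
    l_first = math.degrees(math.asin(r_over_R0))
    if quadrant == 1:
        return l_first
    if quadrant == 4:
        return 360.0 - l_first
    raise ValueError("terminal velocities exist only in quadrants 1 and 4")
```

For example, $R/R_0 = 0.5$ corresponds to $\ell = 30^\circ$ in the first quadrant and $\ell = 330^\circ$ in the fourth.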

McClure-Griffiths & Dickey '07 [23] From Tab. 1 (online version), we take the terminal velocities $v^{\rm los}_{\rm lsr,0}$ measured at different longitudes $\ell$. According to Secs. 3.3 and 4.3.1 and the caption of Fig. 8, the velocity uncertainty amounts to $\Delta v^{\rm los}_{\rm lsr} = 1$ km/s for $\ell < 325^\circ$, $\Delta v^{\rm los}_{\rm lsr} = 3$ km/s for $\ell > 332.5^\circ$ and $\Delta v^{\rm los}_{\rm lsr} = 10$ km/s for $\ell = 327.5^\circ - 332^\circ$; we conservatively assume $\Delta v^{\rm los}_{\rm lsr} = 10$ km/s for the whole range $\ell = 325^\circ - 332.5^\circ$. The reported velocities $v^{\rm los}_{\rm lsr,0}$ correspond to a peculiar solar motion $(U, V, W)_\odot = (10.3, 15.3, 7.7)$ km/s, which is consistently subtracted off in our compilation (see footnote 3). We exclude from the full data set regions with discrete HI clouds at $\ell = 306^\circ \pm 1^\circ$, $312^\circ \pm 0.5^\circ$, $320^\circ \pm 0.5^\circ$ (cf. Sec. 4; the width of the intervals is fixed a posteriori in order to eliminate the spikes in velocity) and also the region at $|\sin \ell| > 0.95$ because there $\Delta v^{\rm los}_{\rm lsr} / v^{\rm los}_{\rm lsr,0} \sim 1$ (cf. Sec. 4.2; this last cut is actually already performed in the online version of Tab. 1).

b. HI thickness method
Honma & Sofue '97 [24] From Tab. 1, we take the Galactocentric radii $R/R_0$ fitted through the thickness method for different velocities $W \equiv R_0\, \omega_c - v_0$ and assume an overall uncertainty $\Delta W = 5.8$ km/s following Sec. 2.4. The authors of this reference find that the method of Merrifield '92 [69] (i.e. method 1 in Tab. 1) covers the largest range of $R/R_0$ and is the most accurate, so we use the results of that method for our compilation.

c. CO terminal velocities
Burton & Gordon '78 [7] From Fig. 2, we take the terminal velocities $v^{\rm los}_{\rm lsr,0}$ measured at different longitudes $\ell$, where $v^{\rm los}_{\rm lsr,0}$ already includes the line width correction, cf. Sec. 3 and Eq. (4a). The resolution of the original CO data presented in this reference is $\Delta v^{\rm los}_{\rm lsr} = 1.3$ km/s, while it amounts to $\Delta v^{\rm los}_{\rm lsr} = 2.6$ km/s for the other data sets [78,79] plotted in Fig. 2 (cf. Sec. 2); therefore, we conservatively assume $\Delta v^{\rm los}_{\rm lsr} = 2.6$ km/s throughout. The reported velocities $v^{\rm los}_{\rm lsr,0}$ correspond to a peculiar solar motion $(U, V, W)_\odot = (10.3, 15.3, 7.7)$ km/s, which is consistently subtracted off in our compilation.
[...] Fig. 1 (top panel), we take the terminal velocities $v^{\rm los}_{\rm lsr,0}$ measured at different longitudes $\ell$ and assume an overall velocity uncertainty $\Delta v^{\rm los}_{\rm lsr} = 3$ km/s following Sec. [...]
Hou+ '09 [28] From Tab. A1, we take the distances $d \pm \Delta d$ and line-of-sight velocities $v^{\rm los}_{\rm lsr,0}$ corresponding to objects towards different Galactic coordinates $(\ell, b)$ and assume an overall velocity uncertainty $\Delta v^{\rm los}_{\rm lsr} = 3$ km/s. The reported velocities $v^{\rm los}_{\rm lsr,0}$ correspond to a peculiar solar motion $(U, V, W)_\odot = (10.3, 15.3, 7.7)$ km/s, which is consistently subtracted off in our compilation (see footnote 3). We exclude from the full data set objects that have no stellar distance and are not at the tangent points, objects near the Galactic centre ($\ell = 345^\circ - 15^\circ$) or anticentre ($\ell = 165^\circ - 195^\circ$), and objects in (or coincident with objects in) Brand & Blitz '93.

e. Giant molecular clouds
Hou+ '09 [28] From Tab. A2, we take the distances $d$ and line-of-sight velocities $v^{\rm los}_{\rm lsr,0}$ corresponding to objects towards different Galactic coordinates $(\ell, b)$ and adopt an overall velocity uncertainty $\Delta v^{\rm los}_{\rm lsr} = 3$ km/s and an overall relative distance uncertainty $\Delta d/d = 20\%$. The reported velocities $v^{\rm los}_{\rm lsr,0}$ correspond to a peculiar solar motion $(U, V, W)_\odot = (10.3, 15.3, 7.7)$ km/s, which is consistently subtracted off in our compilation (see footnote 3). We exclude from the full data set objects without stellar distances and objects near the Galactic centre ($\ell = 345^\circ - 15^\circ$) or anticentre ($\ell = 165^\circ - 195^\circ$).

b. Planetary nebulae
Durand+ '98 [30] From Tab. 2 (online version), we take the heliocentric line-of-sight velocities $v^{\rm los}_h \pm \Delta v^{\rm los}_h$ corresponding to objects towards different equatorial coordinates $(\alpha, \delta)$ (then converted to Galactic coordinates $(\ell, b)$). The distances $d$ are found by cross-matching Tab. 2 to Tabs. 1 and 3 of Ref. [82], which report individually determined and statistical distances, respectively. When available, we prefer the individually determined distances (i.e. Tab. 1 of Ref. [82]), based on the methods presented in Ref. [83] with a relative uncertainty ranging from 15 to 40% (cf. Sec. 3.3 in Ref. [83]), so an overall relative distance uncertainty $\Delta d/d = 25\%$ is adopted; when not possible, we use the statistical distance scale (i.e. Tab. 3 of Ref. [82]), for which the relative uncertainty is $\Delta d/d = 30\%$ (cf. Sec. 5.2 in Ref. [82]) [...]
[...] [11] From Tab. 3, we take the distance moduli $\mu$ (FW) and heliocentric line-of-sight velocities $v^{\rm los}_h$ corresponding to objects towards different Galactic coordinates $(\ell, b)$ and assume an overall distance modulus uncertainty $\Delta\mu = 0.23$ mag following Sec. 11.3. The distance moduli are converted to distances through $d/{\rm pc} = 10^{\mu/5+1}$. The velocity uncertainty reads $\Delta v^{\rm los}_h = (\sigma_1^2 + \sigma_2^2)^{1/2}$, where $\sigma_1 = 1, 2.5, 5$ km/s depending on the method used to calculate the radial velocity (cf. Sec. 11.3) and $\sigma_2 \sim 11.1$ km/s is the contribution of the velocity ellipsoid (cf. Sec. 11.3). We further correct for the K-term of $-1.81$ km/s, cf. Sec. 11.6. We use the reduced sample of 266 stars (cf. Sec. 11.4 and Tab. 4), excluding objects near the Galactic anticentre ($\ell = 160^\circ - 200^\circ$) and two further objects due to high residual velocity.
Pont+ '97 [31] From Tab. 1, we take the heliocentric line-of-sight velocities $v^{\rm los}_h$, period $P$ and colours $V$ and $B - V$ corresponding to objects towards different Galactic coordinates $(\ell, b)$. The distance moduli are found with the period-luminosity-colour relation described in Ref. [11] (FW) and the zero point given in Sec. 3.1; an overall distance modulus uncertainty $\Delta\mu = 0.21$ mag is assumed following Sec. 3.3. The distance moduli are converted to distances through $d/{\rm pc} = 10^{\mu/5+1}$. The velocity uncertainty reads $\Delta v^{\rm los}_h = (\sigma_1^2 + \sigma_2^2)^{1/2}$, where $\sigma_1 = 1$ (2.5) km/s for $\geq 10$ ($< 10$) velocity measurements (cf. Tab. 1) and $\sigma_2 \sim 11.1$ km/s is the contribution of the velocity ellipsoid (cf. Sec. 11.3 of Ref. [11]). We further add a 6 km/s systematic uncertainty (cf. Sec. 6; see also Sec. 5.3) to the derived circular velocity. We exclude from the full data set objects near the Galactic anticentre ($\ell = 160^\circ - 200^\circ$) and objects with no measured radial velocity or no $B - V$.
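The two conversions used for the distance-modulus samples above can be sketched in Python as follows. The function names are hypothetical, and the first-order propagation $\Delta d = d\,(\ln 10 / 5)\,\Delta\mu$ is an assumption made here for illustration; the text only specifies the relation between $d$ and $\mu$ and the quadrature sum.

```python
import math

def distance_from_modulus(mu, dmu):
    """Hypothetical helper: distance in pc from d/pc = 10**(mu/5 + 1).

    The uncertainty uses linear propagation, dd = d * ln(10)/5 * dmu,
    which is an assumption of this sketch.
    """
    d_pc = 10.0 ** (mu / 5.0 + 1.0)
    dd_pc = d_pc * math.log(10.0) / 5.0 * dmu
    return d_pc, dd_pc

def velocity_uncertainty(sigma_1, sigma_2=11.1):
    """Quadrature sum of the measurement error and the velocity-ellipsoid term (km/s)."""
    return math.hypot(sigma_1, sigma_2)
```

For a distance modulus $\mu = 10$ mag with $\Delta\mu = 0.21$ mag, `distance_from_modulus(10.0, 0.21)` gives $d = 1000$ pc with a relative uncertainty of about 10%, while `velocity_uncertainty(2.5)` combines $\sigma_1 = 2.5$ km/s with $\sigma_2 = 11.1$ km/s.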

d. Carbon stars
Demers & Battinelli '07 [32] From Tab. 4 (online version), we take the heliocentric line-of-sight velocities $v^{\rm los}_h \pm \Delta\tilde{v}^{\rm los}_h$ corresponding to objects towards different Galactic coordinates $(\ell, b)$ and add a random motion of 20 km/s in quadrature to $\Delta\tilde{v}^{\rm los}_h$ following Sec. 5.1. From Tab. 1 (online version), we take the distances $d$ and assume an overall relative distance uncertainty $\Delta d/d = 10\%$ following Sec. 5.1. We exclude from the full data set objects near the Galactic anticentre ($\ell = 170^\circ - 190^\circ$); stars no. 20 and 23 (nearby fast stars; cf. Sec. 5.2); and stars no. 17, 18, 27, 28, 35, 42, 52, 56, 58 and 60 (possibly belonging to Canis Major; cf. Sec. 5.2).
Battinelli+ '13 [33] From Tab. 1, we take the Galactocentric radii $\tilde{R}$ (computed with $R_0 = 7.62$ kpc) and heliocentric line-of-sight velocities $v^{\rm los}_h \pm \Delta\tilde{v}^{\rm los}_h$ corresponding to objects towards different Galactic coordinates $(\ell, b)$ and add a random motion of 20 km/s in quadrature to $\Delta\tilde{v}^{\rm los}_h$ following Sec. 5.1 of Ref. [32]. The Galactocentric radii $\tilde{R}$ are converted to heliocentric distances $d$ by inverting Eq. (1) using $R_0 = 7.62$ kpc. We exclude from the full data set star no. 712 (probably a member of the Sagittarius dwarf galaxy; cf. Sec. 4).