Milky Way Cepheid Standards for Measuring Cosmic Distances and Application to Gaia DR2: Implications for the Hubble Constant

We present HST photometry of a selected sample of 50 long-period, low-extinction Milky Way Cepheids measured on the same WFC3 F555W, F814W, and F160W-band photometric system as extragalactic Cepheids in SN Ia hosts. These bright Cepheids were observed with the WFC3 spatial scanning mode in the optical and near-infrared to mitigate saturation and reduce pixel-to-pixel calibration errors to reach a mean photometric error of 5 millimags per observation. We use the new Gaia DR2 parallaxes and HST photometry to simultaneously constrain the cosmic distance scale and to measure the DR2 parallax zeropoint offset appropriate for Cepheids. We find a value for the zeropoint offset of -46 +/- 13 muas or +/- 6 muas for a fixed distance scale, higher than found from quasars, as expected, for these brighter and redder sources. The precision of the distance scale from DR2 has been reduced by a factor of 2.5 due to the need to independently determine the parallax offset. The best fit distance scale is 1.006 +/- 0.033, relative to the scale from Riess et al 2016 with H0=73.24 km/s/Mpc used to predict the parallaxes photometrically, and is inconsistent with the scale needed to match the Planck 2016 CMB data combined with LCDM at the 2.9 sigma confidence level (99.6%). At 96.5% confidence we find that the formal DR2 errors may be underestimated as indicated. We identify additional error associated with the use of augmented Cepheid samples utilizing ground-based photometry and discuss their likely origins. Including the DR2 parallaxes with all prior distance ladder data raises the current tension between the late and early Universe route to the Hubble constant to 3.8 sigma (99.99 %). With the final expected precision from Gaia, the sample of 50 Cepheids with HST photometry will limit to 0.5% the contribution of the first rung of the distance ladder to the uncertainty in the Hubble constant.


INTRODUCTION
Measurements of cosmic distances from standard candles form a cornerstone of our cosmological model. Yet even the best available standard candles require calibration of their absolute brightnesses via geometric distance measurements. Trigonometric parallax is the "gold standard" of such geometric distance measurements: the simplest, the most direct, and the most assumption-free. Previously, most Milky Way (MW) stars, including known examples of rare stars, were well out of range of even state-of-the-art 0.3-1 milliarcsecond (mas) parallax measurements from Hipparcos and the Fine Guidance Sensor (FGS) aboard the Hubble Space Telescope (HST). Now, we are entering a "golden age" of parallax determinations as most of the MW's stars will come into parallax range from relative astrometry measured at the µas level by the ESA mission Gaia (Gaia Collaboration et al. 2016a,b, 2018). The parallaxes of long-period Cepheids are among the most coveted because these variables can be seen with HST in the host galaxies of Type Ia supernovae (SNe Ia) at D < 50 Mpc and used to calibrate their luminosities and the expansion rate of the Universe (Riess et al. 2016, hereafter R16). Benedict et al. (2007) used the FGS on HST to measure parallaxes to 9 of the 10 Cepheids in the MW known at D < 0.5 kpc with individual precision of 8% and a sample mean error of 2.5%. However, all but one of these had periods < 10 days, a range where Cepheids are too faint to be observed in most SN Ia hosts. Nearly all of the long-period Cepheids lie at D > 1 kpc, demanding a parallax precision better than 100 µas for a useful measurement. Spatial scanning with HST's WFC3 has provided relative astrometry with 30-40 µas precision to extend the useful range of Cepheid parallaxes to 2-4 kpc, measuring 8 with P ≥ 10 days with an error in the mean distance of 3% and providing a calibration more applicable to extragalactic Cepheid samples (Riess et al. 2014; Casertano et al. 2016; Riess et al. 2018).
The Gaia mission is expected to measure the parallaxes of hundreds of MW Cepheids with a precision of 5-10 µas by the end of the mission. Such parallax measurements would support a ∼ 1% determination of the Hubble constant (H0) provided the possible precision of the calibration of Cepheid luminosities is not squandered by photometric inaccuracy. To retain the precision of Gaia's Cepheid parallaxes when they are used as standard candles it is necessary to measure their mean brightness on the same photometric systems used to measure their extragalactic counterparts. By using such purely differential flux measurements of Cepheids along the distance ladder, it is possible to circumvent systematic uncertainties related to zeropoints and transmission functions which otherwise incur a systematic uncertainty of ∼ 2-3% in the determination of H0, nearly twice the target goal, even before including additional uncertainties along the distance ladder.
To forge this photometric bridge, in HST Cycle 20 (2012) we began observing MW Cepheid "Standards" among the set of 70 with P > 8 days, A_H < 0.4 mag, V > 6 mag and expected distances of D < 7 kpc, criteria which yield the most useful sample for calibration of the thousands of extragalactic Cepheids observed in the hosts of SNe Ia. These extragalactic Cepheids across the hosts of 19 SNe Ia and in NGC 4258 have all been observed in the near-infrared (NIR) with WFC3-IR in filter F160W (similar to the H band) to reduce systematics caused by reddening and metallicity, and in optical colors F555W (similar to the V band) and F814W (similar to the I band), to form a reddening-free distance measure (Riess et al. 2016). To measure the much closer and brighter MW Cepheids on the same photometric system and mitigate saturation, we used very fast spatial scans, moving the telescope during the observation so that the target covers a long, nearly vertical line over the detector. We use a scan speed of 7.5″/sec, corresponding to an effective exposure time of 0.005 sec in the visible and 0.02 sec in the infrared, much shorter than the minimum effective exposure times possible with the WFC3 hardware. Scanning observations are also free from the variations and uncertainties in shutter flight time (for F555W and F814W with WFC3-UVIS) that affect very short pointed observations (Sahu et al. 2015). Spatial scans offer the additional advantage of varying the position of the source on the detector, which averages down pixel-to-pixel errors in the flat fields, and can also be used to vary the pixel phase, reducing the uncertainty from undersampled point-spread-function photometry. Finally, unlike ground-based photometry, which relies on calibrators in the same region of the sky, HST can measure the photometry of MW Cepheids over the whole sky, without concern about regional variations in calibrators.
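The effective exposure times quoted above follow from dividing the pixel size by the scan rate. The sketch below reproduces them to within rounding; the pixel scales are approximate nominal values for the two WFC3 channels that we assume here, since they are not stated in the text, and the function name is purely illustrative.

```python
# Effective exposure time of a spatial scan: the time a source dwells on one pixel.
# ASSUMPTION: the pixel scales below (arcsec per pixel) are approximate nominal
# values for the WFC3 channels; they are not quoted in the text.
UVIS_PIXEL_ARCSEC = 0.04
IR_PIXEL_ARCSEC = 0.13


def effective_exposure(pixel_arcsec: float, scan_rate_arcsec_per_s: float) -> float:
    """Return the per-pixel dwell time in seconds (pixel size / scan rate)."""
    return pixel_arcsec / scan_rate_arcsec_per_s


scan_rate = 7.5  # arcsec per second, as quoted in the text
t_uvis = effective_exposure(UVIS_PIXEL_ARCSEC, scan_rate)
t_ir = effective_exposure(IR_PIXEL_ARCSEC, scan_rate)
print(f"UVIS: {t_uvis:.4f} s, IR: {t_ir:.4f} s")
```

Under these assumed pixel scales the dwell times come out near 0.005 s and 0.02 s, in line with the values given above.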
Most of the observations were obtained in HST "SNAP" mode, which selects a subset of the targets to be observed based on scheduling convenience, thus essentially randomly with respect to intrinsic Cepheid properties. Observations were obtained for a total of 50 of the 70 Cepheids, which therefore constitute an unbiased subset of the full sample.
In §2 we present the 3-filter spatial scan photometry of the 50 MW Cepheids observed in our HST programs, and compare them internally as well as with ground-based measurements in corresponding passbands. In §3 we carry out an analysis of the recently released Gaia DR2 parallaxes for our targets; using the precise and accurate HST photometry, we verify the existence and magnitude of a zeropoint offset for the Gaia parallaxes, and at the same time test current measurements of H0. In §4 we discuss these results and the nature of the zeropoint issue.

MILKY WAY CEPHEID STANDARDS
The 50 MW Cepheids collected here were observed photometrically in several HST programs: GO-12879, GO-13334, GO-13686, GO-13678, GO-14206, and GO-14268 include photometric and astrometric measurements for 18 of the targets, while GO-13335 and GO-13928 were purely photometric SNAP programs. For 8 of these targets, the photometric measurements have been reported in Riess et al. (2018); the photometry of the other 42 targets follows the same procedures. Here we summarize the key steps for convenience; the full description is in Riess et al. (2018).
1. Fluxes are measured from the amplitude of the fits of the line-spread function to the extracted signal at every position along the scan; a 15-pixel minirow across the scan is used to perform the fit. The flux is divided by the effective exposure time, i.e., the pixel size divided by the scan rate. Pairs of direct and scanning mode images are used to calibrate out possible errors in the pixel size and scan rate, and provide the aperture correction applicable between scanning and staring mode observations. This offset has an error in the mean of 0.002-0.003 mag, depending on the filter.
2. We multiply the measured flux by the local (relative) pixel area using the same pixel area map used for photometry of all point sources in staring mode; this corrects from flux per unit area to actual flux.
3. We then need to correct for the differing sizes of the pixel length along the scan (Y) direction, which changes the effective exposure time seen at each location along the scan. This step partially reverses the correction in step 2; the net result of steps 2 and 3 is to multiply the fitted amplitude by only the relative pixel size perpendicular to the scan direction.
Riess et al. (2018) compare pairs of scans of MW Cepheids in back-to-back exposures, and demonstrate a mean photometric error per scan observation of 0.007, 0.003, and 0.001 mag in F160W, F555W, and F814W, respectively. For the sample of 50 Cepheids presented here, the mean number of epochs per filter is between 2 and 3.
4. Finally, we apply a correction for the light curve phase, i.e., the difference between each Cepheid's magnitude at the observed phase and the magnitude at the epoch of mean intensity of its light curve. These phase corrections are derived from ground-based light curves of these Cepheids in filters with wavelengths best corresponding to the WFC3 filters, with their sources given in a table in the Appendix. The phase corrections are calculated in the HST system after the ground-based light curves are transformed to this system using the transformations given in Riess et al. (2016). Because the phase corrections are relative quantities, they do not have a zeropoint and they do not change the zeropoint of the light curves, which remain on the HST WFC3 natural system. The uncertainties in these phase corrections depend on the quality of the ground-based light curves; the average uncertainty in the magnitude corrections is 0.024, 0.020, and 0.018 mag per epoch in F555W, F814W, and F160W, respectively. The empirical scatter between multiple measurements for the same target (typically 4-5 for the targets in Riess et al. 2018) is consistent with the estimated uncertainties.
The mean uncertainty in the light-curve mean magnitude for these 50 Cepheids is 0.021, 0.018, and 0.015 mag in F555W, F814W, and F160W, respectively. The internal agreement between individual epochs, each corrected to the mean (for Cepheids with 3 or more epochs), is shown in Figure 1 and includes both the photometry errors and the phase-correction uncertainties.
For distance measurements and for the determination of H0, it is useful to convert these three bands to the reddening-free Wesenheit magnitudes (Madore 1982) used by R16 for measuring extragalactic Cepheids in the hosts of Type Ia supernovae:
$m_H^W = m_{F160W} - 0.386\,(m_{F555W} - m_{F814W})$.
These 50 m_H^W values have a mean uncertainty of 0.019 mag, including photometric measurement errors, phase corrections, and error propagation to the Wesenheit magnitude, corresponding to approximately 1% in distance; at the mean expected parallax of 400 µas this represents a mean uncertainty of 4 µas in the predicted parallax. At this level of precision, both the breadth of the instability strip at 0.04-0.08 mag in m_H^W (Macri et al. 2015; Persson et al. 2004) and the expected parallax uncertainties by the end of the Gaia mission (5-14 µas) will still dominate the determination of individual Cepheid luminosities. Some of these Cepheids have been suggested as possible binaries, but in general we do not automatically exclude possible binaries from consideration. At a typical Milky Way Cepheid distance of 2.5 kpc, companion separations of less than 0.1″ for the HST WFC3 UVIS channel, or < 400 AU, are unresolved and thus included with the measured Cepheid flux. This contribution, while small, is statistically matched in extragalactic Cepheids and thus cancels in the use of Cepheid fluxes along the distance ladder. For wider binaries (400-4000 AU), Anderson & Riess (2017) estimate that the effect on the photometric calibration of Cepheids is on the order of 0.004% (in distance) and thus negligible.
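The conversion from a magnitude uncertainty to a fractional distance (and parallax) uncertainty quoted above can be checked with a few lines. This is a minimal sketch using the standard first-order relation σ_d/d = (ln 10 / 5) σ_m; the function names are illustrative and not part of any pipeline described here.

```python
import math


def frac_distance_error(sigma_mag: float) -> float:
    """First-order fractional distance (or parallax) error for a magnitude error."""
    return math.log(10.0) / 5.0 * sigma_mag


def parallax_error_uas(sigma_mag: float, parallax_uas: float) -> float:
    """Parallax uncertainty (micro-arcsec) implied by a magnitude uncertainty."""
    return frac_distance_error(sigma_mag) * parallax_uas


frac = frac_distance_error(0.019)       # mean Wesenheit uncertainty from the text
sig = parallax_error_uas(0.019, 400.0)  # at the mean expected parallax of 400 uas
print(f"fractional distance error: {100 * frac:.2f}%")
print(f"parallax error: {sig:.1f} uas")
```

The result is just under 1% in distance and between 3 and 4 µas in parallax, consistent with the rounded figures given in the text.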
In Table 1 we provide the photometric measurements of these 50 Cepheids for WFC3 F555W, F814W, F160W, and m_H^W. Employing the derived periods in Table 1 and the Leavitt law calibration of Riess et al. (2016), we compute the expected parallaxes, in mas, given in Table 1. With negligible uncertainties in the periods, the mean uncertainties in the predicted parallaxes are ∼ 2-3% in distance due to the width of the instability strip. These expected parallaxes are on the scale in which H0 = 73.24 km s−1 Mpc−1 as obtained in the same best-fit solution from Riess et al. (2016). This set of photometry offers a number of distinct advantages over ground-based magnitudes. By measuring all Cepheids along the distance ladder (and in both hemispheres) with a single, stable photometric system, HST WFC3, we can largely eliminate the propagation of zeropoint and bandpass uncertainties among Cepheid flux measurements. This is especially important in the NIR where individual system zeropoints are typically based on only a handful of historical standards, systematic uncertainties are ∼ 0.02-0.03 mag (Riess 2011), and the relative systematic differences between two systems can be expected to be ∼ 0.03-0.04 mag. To illustrate these differences, we compare the HST WFC3 system photometry with ground-based equivalents, transformed into the same system using the conversions of Riess et al. (2016). Ground-based observations in V, I, J, H were obtained from the sources listed in the Appendix, and are rather inhomogeneous but the best available; the NIR measurements are primarily from three sources: Monson & Pierce (2011), Laney & Stobie (1992), and new observations obtained at CTIO. In Figure 2 we compare the ground-based mean magnitudes to the HST WFC3 values. We find mean differences (in the direction Ground − HST) and sample dispersions (SD) in F555W, F814W, F160W, and m_H^W of 0.024 mag (SD = 0.032 mag), 0.038 mag (SD = 0.027 mag), −0.056 mag (SD = 0.048 mag), and −0.051 mag (SD = 0.052 mag), respectively; a few outliers (4, 2, 1, and 1, respectively) are marked in Figure 2.
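As an illustration of how a photometric parallax follows from an apparent Wesenheit magnitude and a period, the sketch below chains a linear period-luminosity relation to the distance modulus. The slope and intercept used in the example are illustrative placeholders only, not the R16 fit values, and the function names are hypothetical.

```python
import math


def leavitt_abs_mag(period_days: float, slope: float, intercept: float) -> float:
    """Absolute magnitude from a linear P-L relation pivoted at log P = 1."""
    return intercept + slope * (math.log10(period_days) - 1.0)


def photometric_parallax_mas(m_wes: float, abs_mag: float) -> float:
    """Parallax in mas from the distance modulus mu = m - M."""
    mu = m_wes - abs_mag
    return 10.0 ** (-0.2 * (mu - 10.0))


# PLACEHOLDER P-L parameters and magnitudes, chosen only to give a round example.
M = leavitt_abs_mag(10.0, slope=-3.3, intercept=-5.9)
pi_phot = photometric_parallax_mas(6.1, M)
print(f"M = {M:.2f} mag, predicted parallax = {pi_phot:.3f} mas")
```

With these placeholder inputs the distance modulus is 12.0 mag, i.e. a distance of about 2.5 kpc and a parallax of about 0.4 mas, typical of the sample described here.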
In the following we use only the reddening-free Wesenheit magnitude m_H^W. Restricting our analysis to HST magnitudes limits the sample of usable Cepheids; over 200 more have ground-based photometry, and in principle could be used for the same type of analysis, as was done for DR1 in Casertano et al. (2017). However, our sample is close to complete for the most relevant Cepheids, those with long periods. Moreover, at the much higher precision of DR2 vs. DR1 parallaxes (roughly 40 vs. 300 µas per Cepheid), further reduced by averaging across a Cepheid sample, photometric errors and systematics become dominant, as we will demonstrate in §3. Even at the current (DR2) precision, the quality of photometric information is paramount to obtain the best possible information from Gaia parallaxes.
Notes to Table 1:
a Does not include the addition of 0.052 ± 0.014 mag to correct for the count-rate nonlinearity (CRNL) over 6.4 dex between MW and extragalactic Cepheids.
b Includes the addition of 0.052 ± 0.014 mag to correct for the CRNL over 6.4 dex between MW and extragalactic Cepheids.
M_H^W is the absolute Wesenheit magnitude determined from the Cepheid period and the distance scale from Riess et al. (2016), where H0 = 73.24 km s−1 Mpc−1, as discussed in the text.
* Not used in the final analysis; see text.

Gaia DR2
The Gaia mission (Prusti 2012; Gaia Collaboration et al. 2016a,b, 2018) is well positioned to revolutionize our knowledge of the luminosity scale of various stellar types, including those used to set the cosmic distance scale. By mission end, Gaia parallaxes for the Cepheids in our sample are expected to have errors of 5-14 µas, about 2% of their typical parallax; with tens of objects, the uncertainty in the Cepheid luminosity calibration will be ≪ 1%, negligible in the error budget for the local measurement of H0. With the release of DR2 (Gaia Collaboration et al. 2018, hereafter G18), the nominal statistical parallax errors for the Cepheids in our sample were expected to drop from ∼ 300 µas, typical of DR1, to ∼ 40 µas. These Cepheids are all in the brightness range 6.05 < G < 11.70 (mean magnitude; G is the natural passband of the Gaia astrometric detectors). These are fainter than the saturation limit at the shortest gating interval used (TDI gate 4, 16 lines; Gaia Collaboration et al. 2016a), and thus are not expected to be significantly affected by saturation effects. However, Lindegren et al. (2018, hereafter L18) and online material accompanying DR2 identify significant systematic uncertainties which substantially reduce the present leverage of the DR2 Cepheid parallax measurements.
Perhaps the most significant issue with DR2 parallaxes identified in L18 is the existence of a significant parallax zeropoint error, i.e., a number which must be subtracted from all Gaia DR2 parallaxes. In principle, large-angle astrometric measurements, such as those carried out by Hipparcos and Gaia, yield an absolute parallax measurement, without the need for a correction from relative to absolute parallax. In contrast, narrow-angle parallax measurements, such as those using HST (e.g., Benedict et al. 2007; Riess et al. 2014, 2018; Casertano et al. 2017; Brown et al. 2018), are only sensitive to relative parallaxes of stars within the same field, and require a correction to absolute parallax, often based on astrophysical information. However, as pointed out, e.g., by Michalik & Lindegren (2016), instrumental uncertainties associated with monitoring the large angle between observing planes can lead to systematic errors in the determined parallaxes. Specifically for Gaia, a variation in basic angle with period equal to the spin period of the satellite is difficult to correct on the basis of self-calibration procedures; in particular, Butkevich et al. (2017) show that the effect produced by a periodic variation of this nature is almost degenerate with a global shift of the parallaxes, resulting in a whole-sky systematic offset, i.e., a zeropoint error. Indeed, L18 consider the measured parallaxes for a carefully selected sample of over 500,000 quasars, whose parallaxes are expected to be extremely small, and find that they have a mean value of −29 µas, with a small dependence on color and ecliptic latitude (their Fig. 7). According to L18, "the actual offset applicable for a given combination of magnitude, colour, and position may be different by several tens of µas". The quasars are primarily faint (G > 17 mag); thus, a possible magnitude dependence, suggested in their Figure 7 (left panel), is difficult to investigate.
The distribution of corrected parallaxes π_corr = π_meas + 29 µas is fairly consistent with a normal distribution if their nominal errors are increased by ∼ 8% (see L18, Fig. 8). We will return to the issue of the parallax zeropoint error later in this section.
Other potential systematics identified by G18 and L18 include:
• Uncharacterized systematic errors dominate over the ideally available precision in the post-fit astrometric residuals (see L18, Figure 9) in DR2 by a large factor for G < 12 mag, with the discrepancy increasing for brighter magnitudes. Note that a systematic deviation of parallax measurements as a function of Cepheid brightness would be somewhat degenerate with a luminosity scale determination because brightness is partially correlated with distance.
• A small proportion of individual parallaxes are "corrupted"; they can generally be identified by large positive or negative values and must be discarded.
• The statistical uncertainties may be underestimated by up to ∼ 30% for stars with G < 12.
We note that the spatial correlation of parallax errors on the sky for DR2 is not very significant for the Cepheids in our sample, for which only two pairs are separated by less than 10 degrees.
In light of these issues, there is likely no unique way to model the Cepheid sample while using the DR2 results to determine their luminosity scale. Rather, we take a cautious, "common sense" approach to illustrate what such an analysis can reveal at present. We anticipate a reduction of these systematic uncertainties through independent analyses of other classes of objects and from future Gaia data releases.
As a first, exploratory step we plot the DR2 parallaxes of the Cepheids in Table 1 against their uncertainties in Figure 3. It is immediately obvious that three of the 50 (RY-Vel, RW-Cam, and SV-Per) have anomalously high formal uncertainties, and two of their parallaxes define the extrema for the set. (Note that because of the excess noise formalism (Lindegren et al. 2012), large errors are often indicative of poor adherence to the model used, in this case a five-parameter, single-star astrometric model.) One of the three (RW-Cam) was also an outlier in the lower precision DR1 data (Casertano et al. 2017). For SV-Per and RW-Cam, our HST spatial scan data demonstrate the presence of a companion within 0.3″ of the Cepheid; see insets in Figure 3. Both have reported UV excess from IUE spectra consistent with B8III companions (Evans 1994). The companions are the likely source of the anomalous astrometric solution. All three objects are excluded from further analysis.
An additional, independent test can be carried out thanks to the existence of HST parallax measurements for 19 Cepheids, obtained using the Fine Guidance Sensor (Benedict et al. 2007) or WFC3 spatial scanning (Riess et al. 2018). The comparison between HST and Gaia DR2 parallaxes is shown in Figure 4 vs. their mean Gaia G magnitude; error bars combine the errors in the HST and DR2 parallaxes. Two Cepheids, Y-Sgr and Delta-Cep, were excluded from this comparison because their Gaia DR2 values were negative, indicating they are likely corrupted. Delta-Cep is also a binary (Anderson et al. 2015), and its orbit will eventually be included in the Gaia solution in later releases. For one Cepheid, ℓ-Car, the DR2 G magnitude was unrealistically faint, so we plot it at its ground-based value. The agreement is good for Cepheids with G > 6 mag, 7 of 8 of which fall within 1σ, but it becomes quite poor for Cepheids with G < 6 mag (even excluding the 3 mentioned), with just 1 of 11 within 1σ. This suggests, as expected, that around G ≈ 6 mag, where the Gaia detectors are known to saturate, the DR2 parallaxes become much less reliable. We conclude that even with the possible maximum 30% enhancement of the Gaia DR2 errors, the Cepheid parallaxes at G < 6 mag are not yet sufficiently well understood and should not be used in any quantitative analyses. To be safe, we also exclude from further comparisons the Cepheid T-Mon: with a mean magnitude G ≈ 6.1, it is the brightest in our sample (brighter by 0.3 mag in our own F555W data than the next one). Because of brightness variations in Cepheids, T-Mon is likely to have exceeded the saturation limit during some of its epochs of astrometric observations. The comparison of the 8 Cepheids with G > 6 mag yields a DR2 parallax offset of −90 ± 21 µas, but the comparison is strongly impacted by SS CMa. Excluding SS CMa yields an offset of −55 ± 25 µas.
Although this comparison is informative and consistent with the subsequent analyses, we do not make explicit use of the HST parallaxes when comparing the Gaia parallaxes to their photometric predictions, in order to retain independence from the expectations based on prior work, including Riess et al. (2018).
After excluding four Cepheids with too large formal uncertainties (RY-Vel, RW-Cam, and SV-Per) or too close to the saturation threshold (T-Mon), we are left with 46 Cepheids with HST photometry and reliable Gaia DR2 parallaxes and uncertainties. Following the approach of Casertano et al. (2017), we determine for each Cepheid the expected parallax based on its photometry and the absolute magnitude derived from the Leavitt law (Leavitt & Pickering 1912), calibrated by Riess et al. (2016) in the same photometric system. This parallax has a typical uncertainty of only a few percent. Figure 5 compares the measured DR2 parallax with the expected value; the comparison is made in parallax space to avoid issues related to the conversion of low SNR parallaxes to magnitudes (Hanson 1979) which otherwise skews their likelihood in magnitude space (see also recommendations in Luri et al. 2018).
As expected, the DR2 parallaxes are offset, on average, with respect to the predicted values. However, a cursory examination of Figure 5, and basic statistics on the differences between predicted and measured values, suggest a zeropoint offset in the same direction but somewhat larger (in absolute value) than the value of −29 µas for quasars reported by Lindegren et al. (2018). Therefore, we proceed to constrain the parallax zeropoint internally from our sample, as discussed above. Fortunately, the Cepheids in our sample have a fairly narrow range of color (a dispersion in F555W − F814W of 0.28 mag) and magnitude, so we will assume zeropoint variations are small across our sample.
At the same time, we consider a possible rescaling of the photometrically-predicted distances (parallaxes) because these make direct use of the calibration of the distance ladder and attendant value of the Hubble constant, H0 = 73.24 ± 1.7 km s−1 Mpc−1, from Riess et al. (2016). A degree of tension exists between this value of H0 and the one determined from Planck cosmic microwave background (CMB) data in concert with the Λ-cold-dark-matter (ΛCDM) model, which yields H0 = 66.93 ± 0.62 km s−1 Mpc−1 (Planck Collaboration et al. 2016). To the extent the DR2 data permit an independent determination of the Cepheid luminosity calibration, they can also help distinguish between these values of H0.
Therefore we seek to optimize the value of
$\chi^2 = \sum_i \left(\pi_{\mathrm{DR2},i} - \alpha\,\pi_{\mathrm{phot},i} - zp\right)^2 / \sigma_i^2$
with the two free parameters, α and zp, representing the cosmic distance scale from DR2 relative to H0 = 73.24 km s−1 Mpc−1 and the parallax zeropoint appropriate for the DR2 Cepheid measurements, in the direction of measured minus predicted parallaxes (consistent with the definition of L18).
To determine the individual σ i we add in quadrature the photometric parallax uncertainty (mean of 0.02 mag or 4 µas in parallax), the intrinsic width of the NIR Wesenheit P − L (0.05 mag or a mean of 18 µas in parallax) and the nominal parallax uncertainty as given in the DR2 release. The mean of these σ i is 39 µas (median 35 µas).
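The quadrature sum described above can be written out explicitly. In this sketch the first two terms are the mean values quoted in the text, while the DR2 term of 34 µas is a hypothetical value chosen only for illustration; per-Cepheid values would come from the catalogue.

```python
import math


def total_sigma_uas(sigma_phot: float, sigma_pl: float, sigma_dr2: float) -> float:
    """Combine independent error terms (all in micro-arcsec) in quadrature."""
    return math.sqrt(sigma_phot**2 + sigma_pl**2 + sigma_dr2**2)


# 4 uas (photometric) and 18 uas (P-L width) are the means quoted in the text;
# 34 uas is a HYPOTHETICAL nominal DR2 uncertainty used only for this example.
sigma = total_sigma_uas(4.0, 18.0, 34.0)
print(f"combined uncertainty: {sigma:.1f} uas")
```

With this hypothetical DR2 term the combined value is close to the mean σ_i reported above, which shows that the catalogue uncertainty is the largest single contribution.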
Minimizing the value of χ² gives values of zp = −46 ± 13 µas and α = 1.006 ± 0.033, with a value of χ² = 62.5 for 44 degrees of freedom. Confidence regions for the two parameters are shown in Figure 6. Although these two parameters are correlated, the range of Cepheid parallaxes, from 0.2 to 1 mas, breaks to some extent the degeneracy between α and zp and allows for their separate determination.
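Because the model is linear in α and zp, the minimization has a closed-form weighted least-squares solution. The following is a minimal sketch of such a fit, not the analysis code used for the results above; the input arrays are synthetic, noise-free values constructed for illustration, and the function name is hypothetical.

```python
import math


def fit_scale_and_zeropoint(pi_pred, pi_obs, sigma):
    """Weighted linear fit of pi_obs = alpha * pi_pred + zp.

    Returns (alpha, zp, alpha_err, zp_err, chi2).
    """
    w = [1.0 / s**2 for s in sigma]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, pi_pred))
    Sy = sum(wi * y for wi, y in zip(w, pi_obs))
    Sxx = sum(wi * x * x for wi, x in zip(w, pi_pred))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, pi_pred, pi_obs))
    det = S * Sxx - Sx**2
    alpha = (S * Sxy - Sx * Sy) / det
    zp = (Sxx * Sy - Sx * Sxy) / det
    chi2 = sum(wi * (y - alpha * x - zp) ** 2
               for wi, x, y in zip(w, pi_pred, pi_obs))
    return alpha, zp, math.sqrt(S / det), math.sqrt(Sxx / det), chi2


# SYNTHETIC data (mas): predictions spanning 0.2-1 mas, offset by -0.046 mas.
pi_pred = [0.2, 0.3, 0.4, 0.5, 0.7, 1.0]
pi_obs = [x - 0.046 for x in pi_pred]
sigma = [0.04] * len(pi_pred)
alpha, zp, alpha_err, zp_err, chi2 = fit_scale_and_zeropoint(pi_pred, pi_obs, sigma)
print(f"alpha = {alpha:.3f} +/- {alpha_err:.3f}, zp = {1000 * zp:.0f} uas")
```

The formal errors depend on the spread of predicted parallaxes through the determinant term: a wider range of parallaxes separates the slope from the intercept more cleanly, which is the degeneracy-breaking effect noted above.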
This parallax zeropoint error, somewhat different (larger in absolute value) from the value determined by L18, is significant in comparison with the formal DR2 uncertainties of our sources. The uncertainty in the parallax zeropoint error has the potential to impact significantly any astronomical analysis based on DR2 parallaxes, especially when multiple sources are used, thus in principle reducing statistical uncertainties. The potential dependence of the zeropoint error on source properties suggested by L18, such as color or (possibly) magnitude, is especially relevant, as it suggests that applying the nominal L18 zeropoint correction, which was determined for blue, faint objects, may not be optimal for objects with different characteristics. Indeed, in Lindegren (2018) the offset measured from quasars appears to increase at the brighter end (14 < G < 16 mag) and at the redder end (G_BP − G_RP > 1 mag), where the offset fluctuates around a higher value of −50 µas, both directions that apply to Cepheids. The online documentation also indicates that the estimated parallax zeropoint depends on the sample of sources examined (Arenou et al. 2018), and the value determined in L18 should not be used to "correct" the catalogue parallax values.
A more precise constraint on the zeropoint offset for use in other studies may be derived by fixing the value of α (e.g., to unity based on other geometric distance measurements to Cepheids from R16 which have a mean error of 1.4%) which results in a constraint of −46 ± 6 µas. That this value is more than 3σ from the value derived from relatively bluer and fainter quasars from L18 reinforces their finding that the parallax zeropoint offset can vary with sources' position, magnitude and color, all quite different between MW Cepheids and quasars but in the right direction as suggested by the brightest and reddest quasars.
The value of α is quite consistent with unity, indicating that the predicted parallaxes, after accounting for the offset, are in good agreement with DR2, affirming the cosmic distance scale or the value of H0 used to predict the parallaxes from Riess et al. (2016). On the other hand, this value of α is inconsistent with the value of α = 0.91 needed to rescale the parallaxes to match the Planck CMB + ΛCDM value of H0 at the 2.9σ confidence level (99.6% likelihood). Including the 8 Cepheids with HST parallaxes from Riess et al. (2018) to help constrain the parallax offset gives a result of α = 1.035 ± 0.029, and α = 1.010 ± 0.029 after excluding the one Cepheid with a large difference between HST and Gaia DR2, SS CMa. These are 4.4σ and 3.4σ from the Planck CMB + ΛCDM value of H0, respectively.
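The arithmetic behind the Planck comparison can be sketched as follows. This assumes a simple Gaussian treatment in which only the uncertainty in α enters, and it uses the rounded value α = 0.91 quoted in the text; the variable names are illustrative.

```python
import math

h0_local = 73.24    # km/s/Mpc, scale used to predict the parallaxes
h0_planck = 66.93   # km/s/Mpc, Planck CMB + LCDM value quoted in the text

# Predicted parallaxes scale in proportion to H0, so matching Planck
# requires rescaling them by the ratio of the two values.
alpha_planck = h0_planck / h0_local

alpha_fit, alpha_err = 1.006, 0.033
# ASSUMPTION: Gaussian errors, with only the alpha uncertainty included.
n_sigma = (alpha_fit - 0.91) / alpha_err
confidence = math.erf(n_sigma / math.sqrt(2.0))  # two-sided Gaussian level

print(f"alpha needed for Planck: {alpha_planck:.3f}")
print(f"separation: {n_sigma:.1f} sigma ({100 * confidence:.1f}%)")
```

The ratio of the two Hubble constants rounds to 0.91, and the separation from the fitted α corresponds to the 2.9σ (99.6%) level stated above.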
We also note that the value of χ² appears somewhat high for the 46 Cepheids and 2 fitted parameters (44 degrees of freedom), a value which would be exceeded by chance in 3.5% of trials. The bottom of Figure 6 shows the residuals from the best fit versus Gaia DR2 G magnitudes with a dispersion of 43 µas. No trend is apparent, nor are there any outliers: the largest deviation is 2.3σ, as expected for a sample of 46 Cepheids and below the threshold for outlier rejection (Chauvenet's criterion would suggest a threshold of 2.6σ for a sample with 46 objects).
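Both statistical statements above can be checked with the standard library. The sketch below evaluates the χ² tail probability using the finite series that is exact for an even number of degrees of freedom, and computes a Chauvenet threshold under the common convention that a point is rejected when fewer than half an object is expected beyond it; other conventions give slightly different thresholds. Function names are illustrative.

```python
import math
from statistics import NormalDist


def chi2_sf_even_dof(x: float, dof: int) -> float:
    """Upper-tail probability of chi-square; exact series for even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form requires a positive even dof")
    half = x / 2.0
    term = 1.0   # j = 0 term of sum_{j < dof/2} half**j / j!
    total = 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chauvenet_threshold(n: int) -> float:
    """Deviation (in sigma) beyond which fewer than 0.5 of n points are expected."""
    return NormalDist().inv_cdf(1.0 - 1.0 / (4.0 * n))


p_value = chi2_sf_even_dof(62.5, 44)
threshold = chauvenet_threshold(46)
print(f"P(chi2 > 62.5 | 44 dof) = {100 * p_value:.1f}%")
print(f"Chauvenet threshold for N = 46: {threshold:.2f} sigma")
```

The tail probability comes out near 3.5%, and the threshold falls between 2.5σ and 2.6σ under this convention, so the largest observed deviation of 2.3σ would not be rejected.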
If we consider the high χ² to be an indication of additional variance in the data, a promising source is suggested by L18 and other material accompanying DR2, which states that for bright targets (G < 12 mag) formal errors may be underestimated by up to 30%. Rescaling the DR2 parallax errors by (χ²_dof)^(1/2) ≈ 1.19 raises the mean error to 46 µas. We refit the model and find zp = −47 ± 16 µas and α = 1.008 ± 0.039 with a value of χ² = 45.0; the inconsistency with Planck CMB + ΛCDM is 2.6σ. For the expanded errors there is now no Cepheid with a deviation > 2σ. Because we would expect between 2 and 3 such Cepheids, one might argue that the expanded errors are now too large. However, we think this identifies the range of reasonability for fitting these data.
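The rescaling factor follows directly from the reduced χ² of the original fit. In this short sketch the factor is applied to the mean total uncertainty as a simplification; the text rescales the DR2 errors themselves, so the second number is only indicative.

```python
import math

chi2, dof = 62.5, 44                # values from the original fit
scale = math.sqrt(chi2 / dof)       # square root of the reduced chi-square

mean_sigma_uas = 39.0               # mean sigma_i quoted in the text
# SIMPLIFICATION: the factor is applied to the mean total error here.
rescaled_uas = scale * mean_sigma_uas

print(f"scale factor: {scale:.2f}")
print(f"rescaled mean error: {rescaled_uas:.0f} uas")
```

A factor of about 1.19 lies within the up-to-30% underestimate that the DR2 documentation allows for bright sources.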
Unfortunately, the cost of needing to measure the appropriate zeropoint offset from the Cepheid sample is (painfully) large. The marginalized uncertainty in α is 0.033, providing a 3.3% independent calibration of the cosmic distance scale. This is 2.5 times the uncertainty that would otherwise result from the formal parallax uncertainties and full knowledge of the parallax zeropoint, a precision that would be better than has ever been achieved in the local Universe. As an illustration, in Figure 8 we use the constraint of −53 ± 2.6 µas on the parallax zeropoint offset calculated from 3475 red giants with Kepler-based asteroseismic estimates of radii and parallaxes from Zinn et al. (2018). The mean color of this red giant sample well matches the Cepheids (greater optical extinction of the Cepheids compensates their bluer color). The red giant mean magnitudes are a few magnitudes fainter than the Cepheid mean but much closer than the quasar sample used by Lindegren et al. (2018). It is therefore not surprising that the constraint from the red giants is quite consistent with the Cepheids. We have chosen not to formally include it in the determination of H0 due to its model-dependence. However, this external constraint, which is five times more precise than the internal constraint, demonstrates the value of independent knowledge of the DR2 offset term. Making use of it reduces the uncertainty in the distance scale for the HST Cepheid sample to ∼ 1.3%.
We are optimistic that future Gaia data releases will resolve the uncertainty of the parallax zeropoint offset while producing parallax measurements near the expectations for the end of the mission. With these and the HST photometry presented here we would expect to reach the full potential precision of ∼ 0.5% from this Cepheid sample.

Ground-Based Sample, Caveats
To improve the constraint on zp and α we might consider using a larger sample of MW Cepheids, though the augmentation of the sample would need to rely exclusively on ground-based photometry. There are compilations of ground-based Cepheid photometry which could augment the HST sample by an additional ∼ 150-250 Cepheids, for example from van Leeuwen et al. (2007).
However, because the HST sample of 50 presented here was selected to have P > 8 days, A H < 0.4 mag, V > 6 mag, and D < 6 kpc, and is largely (> 70%) complete (with selection made randomly by the scheduling of HST), an expanded sample would be dominated by Cepheids which necessarily violate these criteria. Most would have A H > 0.4 or P < 8 days, with resulting negative consequences.
On average, such shorter period Cepheids are bluer in their mean B − V by ∼ 0.2 mag which might alter their parallax zeropoints relative to the redder HST sample. In addition, these Cepheids are a couple of magnitudes fainter, so that their astrometric observations are more easily contaminated by a companion.
Further, Cepheids with P < 8 days would have periods shorter than the range they would be used to calibrate, i.e., that of the Cepheids visible in distant SN Ia hosts, putting too great a reliance on the linearity of the P-L relation. Adding Cepheids with A H > 0.4 would lead to larger magnitude errors due to variations in the reddening law.
Additional loss in precision is expected from the use of ground-based photometry in lieu of HST photometry for the expanded sample. Ground-based photometry covering two hemispheres is by necessity quite inhomogeneous, especially in the NIR where limited standards are available. The few truly wide-angle surveys to date in the NIR lack the time sampling needed to determine the mean of the light curves. Moreover, in the NIR, the correspondence between the ground-based H-band filter and the WFC3 F160W filter is particularly poor, as indicated by the large color term of ∼ 0.2 mag per mag of J − H color which has been measured between the two (Riess 2011). We find that systematic errors are readily apparent between different sources of ground-based NIR Cepheid photometry. A comparison of the 79 Cepheids in common between the two most recent compilations, Monson & Pierce (2011) and van Leeuwen et al. (2007), shows a gradient of 0.015 mag per mag at > 3σ confidence in H and a mean difference of 0.08 mag in J.
We also note that the comparison of the HST system photometry for the 50 Cepheids presented here with its ground-based equivalent (see Appendix for sources) indicated a systematic difference of −0.051 mag (Ground − HST) in the Wesenheit magnitudes used to measure distances. The size of this systematic error would likely depend on the specific ground-based system used, or on the mixture of systems for heterogeneous collections.
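To put the size of this offset in context, a shift in apparent magnitude at fixed absolute calibration maps directly onto the inferred distance through the distance modulus. The sketch below applies that standard relation to the −0.051 mag value; the function name is hypothetical.

```python
def distance_ratio_from_mag_offset(delta_mag):
    """Ratio of inferred distances for a magnitude offset delta_mag.

    Follows from the distance modulus mu = 5 * log10(d / 10 pc):
    shifting mu by delta_mag multiplies d by 10**(0.2 * delta_mag).
    """
    return 10.0 ** (0.2 * delta_mag)

offset = -0.051  # mag, Ground - HST, as quoted in the text
ratio = distance_ratio_from_mag_offset(offset)
print(f"distance ratio = {ratio:.4f} ({100 * (ratio - 1):+.1f}%)")
```

An uncorrected offset of this size would therefore bias photometric distances by roughly 2%, which is comparable to the precision being sought from this sample.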
Lastly, without the use of the high resolution HST data one would lose the means to test for contamination of parallaxes by nearby companions as illustrated in two of three cases in § 2.
To better compare the size of the uncertainties associated with the HST Cepheid sample and an augmentation to it from ground-based data, we produced a Cepheid sample comprised of all the Cepheids with photometry from a single source: NIR mean magnitudes from Monson & Pierce (2011) for Northern Cepheids and V and I-band mean magnitudes from Berdnikov et al. (2000), excluding objects in the HST sample, leaving 86 additional Cepheids. For these we included the −0.051 mag offset identified between the ground and HST measurements of m_H^W and the reduced CRNL of 0.036 mag (reduced to the 4.5 dex that applies from extragalactic Cepheids to HST system standards; see notes in Table 1) that would apply between the ground and extragalactic Cepheids. A basic comparison to the DR2 parallaxes is shown in Figure 7. The augmented sample, though in rough agreement with the HST sample, has far greater errors, with a dispersion of differences (after removing the parallax zeropoint offset) of 99 µas, 2.3 times that of the HST sample. Even removing the most deviant points leaves a high dispersion of 60-70 µas. This level of uncertainty is far greater than we can model by increasing the Gaia DR2 parallax errors, even by the maximum suggested value of 30%, as it would require ∼ 70%. It is hard to realistically characterize the source of this additional variance and whether it may belie other important dependencies, and we therefore decided to not make further use of an augmented sample.
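The dispersion comparison above, including the effect of discarding the most deviant points, can be sketched as follows. The residuals used here are invented for illustration and are not the measured values; the function name is hypothetical, and the dispersion is taken as a simple population standard deviation.

```python
import statistics

def dispersion_after_clipping(residuals, n_drop=0):
    """Standard deviation after dropping the n_drop largest |residuals|."""
    kept = sorted(residuals, key=abs)[: len(residuals) - n_drop]
    return statistics.pstdev(kept)

# Hypothetical parallax residuals in micro-arcseconds (illustration only).
residuals = [-40.0, 25.0, 60.0, -15.0, 300.0, -250.0, 10.0, -30.0]

full = dispersion_after_clipping(residuals)
clipped = dispersion_after_clipping(residuals, n_drop=2)
print(f"all points: {full:.0f} uas, two most deviant removed: {clipped:.0f} uas")

# Ratio of the quoted dispersions: augmented sample vs. HST sample.
print(f"99 / 43 = {99 / 43:.1f}")
```

Clipping a few extreme points lowers the dispersion, but as the text notes, the remaining scatter in the augmented sample is still well above that of the HST sample.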

The Zeropoint Error in DR2 Parallaxes
We have presented an analysis of the Gaia DR2 parallax values and their uncertainties for a carefully selected sample of 50 Cepheids with precise, consistent photometry obtained with HST using spatial scanning observations. The photometry for most of these Cepheids is published here for the first time (Table 1). The high accuracy of the Leavitt law calibration for these Cepheids, obtained by R16 on the basis of several independent anchors, allows us to predict their parallaxes with uncertainties much smaller than those of the DR2 parallaxes. We also consider a larger sample of Cepheids covering a broader range of magnitudes and properties but with only ground-based photometry, and a small sample of Cepheids for which HST parallaxes have been published.
Our first conclusion is that the Gaia DR2 parallaxes for our Cepheids, in the Gaia magnitude range 6 < G < 12 mag, are generally in good agreement with their predicted values. We confirm the existence of a zeropoint parallax error indicated in the Gaia release material; however, we find a somewhat larger (more negative) value for the zeropoint, −46 ± 13 µas, where the uncertainty includes marginalizing over the possible recalibration of the Leavitt law on the basis of DR2 parallaxes alone. Using the R16 period-luminosity calibration without rescaling, the value of the parallax zeropoint inferred from this sample of Cepheids is −46 ± 6 µas. The difference between our estimated zeropoint and the value obtained by L18 from quasars suggests that the zeropoint does depend on magnitude, color, or position on the sky (all are different for Cepheids), as suggested by L18; online DR2 documentation similarly states that the zeropoint depends on the sample used. We emphasize the need to include zeropoint uncertainties in any analysis based on DR2 parallaxes; an independent determination of the parallax zeropoint should be carried out for any data for which this is possible, for example, via asteroseismology (De Ridder et al. 2016) and eclipsing binaries (Stassun & Torres 2016). We also suggest a possible increase of the formal DR2 errors for stars in this range by about 19%, with modest (96.5%) significance.
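For a fixed distance scale, the zeropoint reduces to a weighted mean of the differences between measured and predicted parallaxes. The sketch below shows that estimator under stated assumptions: Gaussian errors, 47 stars, and a uniform error of 40 µas per star. These inputs are hypothetical stand-ins chosen to be of the same order as the sample, not the actual data, and the function name is illustrative.

```python
import math

def weighted_mean_offset(pi_obs, pi_pred, sigma):
    """Inverse-variance weighted mean of (pi_obs - pi_pred) and its error."""
    weights = [1.0 / s**2 for s in sigma]
    total = sum(weights)
    mean = sum(w * (o - p)
               for w, o, p in zip(weights, pi_obs, pi_pred)) / total
    return mean, 1.0 / math.sqrt(total)

# Hypothetical parallaxes in micro-arcseconds (not the paper's data).
pi_pred = [200.0 + 12.0 * i for i in range(47)]
pi_obs = [p - 46.0 for p in pi_pred]   # constant offset, no noise
sigma = [40.0] * 47                    # assumed uniform error

zp, zp_err = weighted_mean_offset(pi_obs, pi_pred, sigma)
print(f"zp = {zp:.0f} +/- {zp_err:.1f} uas")
```

With errors of a few tens of µas on each of several dozen stars, the error on the mean offset comes out at a few µas, the same order as the ± 6 µas quoted for the fixed-scale case.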
Comparison of DR2 with HST parallaxes suggests that parallaxes for bright stars (G < 6) may be unreliable, consistent with the large residuals L18 find for bright stars. This conclusion is reinforced by the analysis of the larger sample of Cepheids with ground-based photometry. We also find that at the level of precision of DR2 parallaxes, existing ground-based photometry is of insufficient quality to take full advantage of the parallax information; photometric errors are likely underestimated, possibly because of systematic offsets between systems and between standards in different parts of the sky. This will be even more true with future releases, when Gaia precision is expected to improve significantly and zeropoint issues will likely be addressed. We recommend that only Cepheids with accurate, high-quality photometry, free of systematic effects, be used in the calibration of the Leavitt law with the precision enabled by Gaia DR2 and beyond.

Implications for Determination of the Hubble Constant
The results presented here may be evaluated as (another) independent test of the scale of the local determination of H 0 from Riess et al. (2016) or as an augmentation to that measurement. As an independent test, the results from constraining α reaffirm the present "tension" between the local determination of H 0 and that based on Planck CMB data in concert with ΛCDM (Planck Collaboration et al. 2016). This test is similar in outcome to the one from Riess et al. (2018), which employed measurements of 8 parallaxes of long-period Cepheids using spatial scanning on HST and reached a mean precision of 45 µas, similar to the Gaia DR2 formal precision. The key difference is that the present study uses 5 times as many Cepheids, but that statistical advantage is largely given back by the need to determine the offset in the Gaia DR2 Cepheid parallaxes.
Including the new MW parallaxes from HST and Gaia with the rest of the data from Riess et al. (2016) changes the value of H 0 slightly to 73.52 ± 1.62 km/s/Mpc (including systematics discussed in R16) and increases the tension to 3.8σ. While we have chosen not to formally use the Zinn et al. (2018) external constraint on the parallax offset based on Red Giants due to its model-dependence, we note that including it would result in H 0 = 73.83 ± 1.48 km/s/Mpc and would raise the tension to 4.3σ, thus illustrating the leverage that such knowledge of the offset provides.
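The quoted tension levels follow from a simple comparison of two Gaussian measurements. The sketch below assumes independent Gaussian errors and takes the Planck + ΛCDM value to be 66.93 ± 0.62 km/s/Mpc; that number is not stated in this section and is an assumption here, as are the function names.

```python
import math

def tension_sigma(h0_a, err_a, h0_b, err_b):
    """Difference of two independent Gaussian measurements in sigma units."""
    return abs(h0_a - h0_b) / math.hypot(err_a, err_b)

def two_sided_confidence(n_sigma):
    """Two-sided Gaussian confidence level for a given number of sigma."""
    return math.erf(n_sigma / math.sqrt(2.0))

# Assumed Planck 2016 + LCDM prediction (not quoted in this section).
PLANCK_H0, PLANCK_ERR = 66.93, 0.62

baseline = tension_sigma(73.52, 1.62, PLANCK_H0, PLANCK_ERR)
with_red_giants = tension_sigma(73.83, 1.48, PLANCK_H0, PLANCK_ERR)

print(f"baseline: {baseline:.1f} sigma "
      f"({100 * two_sided_confidence(baseline):.2f}%)")
print(f"with Red Giant constraint: {with_red_giants:.1f} sigma")
```

Under these assumptions the two cases evaluate to about 3.8σ and 4.3σ, matching the values given in the text.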
Undoubtedly the greater benefit derived from these two new sets of parallaxes is as independent tests of the luminosity calibration derived from the masers in NGC 4258 (Humphreys et al. 2013; Riess et al. 2016), the detached eclipsing binaries in the LMC (Pietrzyński et al. 2013), and shorter period, nearer MW parallaxes (Benedict et al. 2007; van Leeuwen et al. 2007). It is very difficult to imagine an unknown significant systematic error which would affect all 5 sources of Cepheid luminosity calibration to a comparable level.
With improved parallaxes from Gaia in the future and better knowledge of their zeropoint and with observations of Cepheids in new hosts of Type Ia supernovae (now underway), a target precision for H 0 of ∼ 1% is not out of reach and would be an invaluable aid for resolving the source of the present tension.
database, operated at CDS, Strasbourg, France. Research at Lick Observatory is partially supported by a generous gift from Google.
The HST data used in this paper are available at the MAST archive http://dx.doi.org/10.17909/T9G40B

Sources for Ground-based Phase Corrections
Compared to the Riess et al. (2018) analysis, we make use of additional ground measurements from the ASAS-SN (Shappee et al. 2014) web interface (Kochanek et al. 2017), Berdnikov et al. (2000), Berdnikov et al. (2007), Berdnikov et al. (2015), and van Leeuwen et al. (2007).
Note (a): The labels are described in Table 3. NA indicates no ground data available.
Figure 3. For Gaia DR2, reported values of π and σ_π for the sample of 50 Milky Way Cepheids with HST WFC3 system photometry. The Gaia team reports DR2 has "A small proportion of sources with corrupted parallaxes indicated by the occurrence of apparently very significant large positive or negative values." We identify three Cepheids whose parallaxes are likely corrupted as they appear far from the rest in this space (SV-Per, RW-Cam, and RY-Vel). For SV-Per and RW-Cam, our WFC3 spatial scans (insets) reveal a close companion within 0.2′′ of the HST line-spread function, which is the likely source of the corruption. These 3 are excluded from further analysis.
(Benedict et al. 2007) for G < 6 or WFC3 spatial scanning for G > 6 (Riess et al. 2018). Two Cepheids (Y-Sgr and Delta-Cep) were excluded because their Gaia DR2 values were negative, and one (ℓ Car) was extremely large, indicating they are corrupted. The agreement is good for G > 6 mag (7 of 8 within 1σ) but poor at G < 6 mag (1 of 10 within 1σ), indicating that at G < 6 mag, where the Gaia detectors saturate, the DR2 parallaxes become unreliable.
Table 1, the Cepheid periods, and the P-L parameters given by R16. A zeropoint offset, as indicated (dashed), is readily apparent with otherwise good agreement.
Figure 6. For the HST sample of 50 Milky Way Cepheids, a sample with long periods, low extinction, and homogeneous photometry, we determined the best match between the measured Gaia DR2 parallaxes and those predicted photometrically from their photometry, periods, and the SHOES distance ladder of Riess et al. (2016). We allow two free parameters to account for the parallax zeropoint offset, zp, and a rescaling of the distance ladder, α. We find a significant zeropoint offset of −46 ± 13 µas and a rescaling of the SHOES distance ladder of 1.006 ± 0.033. The rescaling parameter is inconsistent at the 2.9σ confidence level (99.6%) with the value needed to match Planck + ΛCDM (Planck Collaboration et al. 2016). The lower panel shows the residuals from the best fit.
Figure 7. The HST sample of 50 Cepheids presented here was selected to have P > 8 days, A H < 0.4 mag, V > 6 mag, and expected distances of D < 6 kpc. It is 70% complete (by random selection of the HST schedule) and has a dispersion of 43 µas, comparable to expectations. A nonoverlapping sample of 86 Cepheids with photometry compiled from a single source for each ground-based system (see text) shows much greater dispersion, 99 µas or 68 µas after discarding the two most deviant (or 60 µas after discarding the four most deviant), far more dispersion than the DR2 errors can explain. The text discusses reasons why such samples may be unreliable.
Figure 8. Same as Figure 6 except now including a constraint of −53 ± 2.6 µas on the parallax zeropoint offset calculated from 3475 Red Giants with Kepler-based asteroseismic estimates of radii and parallaxes from Zinn et al. (2018). The constraint is intended only to illustrate the reduction in uncertainty in the distance scale that is possible with independent knowledge of the offset term.