Mean Estimate Distances for Galaxies with Multiple Estimates in NED-D

Numerous research topics rely on an improved cosmic distance scale (e.g., cosmology, gravitational waves), and the NASA/IPAC Extragalactic Database of Distances (NED-D) supports those efforts by tabulating multiple redshift-independent distances for 12,000 galaxies (e.g., Large Magellanic Cloud (LMC) zero-point). Six methods for securing a mean estimate distance (MED) from the data are presented (e.g., indicator and Decision Tree). All six MEDs yield surprisingly consistent distances for the cases examined, including for the key benchmark LMC and M106 galaxies. The results underscore the utility of the NED-D MEDs in bolstering the cosmic distance scale and facilitating the identification of systematic trends.


Introduction
Redshift-independent distances (hereafter distances) tied to multiple indicators are beneficial for gravitational wave cross-matching, and for establishing the cosmic distance scale, peculiar velocity flows, and the Hubble constant. Consequently, the NASA/IPAC Extragalactic Database of Distances (NED-D) was created in part to serve as a resource that hosts such pertinent information (Steer et al. 2017). NED-D is the largest compilation of extragalactic distances, containing the majority of published estimates since 1980. Currently, distances for 150,000 galaxies are available, and 12,000 of those have multiple distances based on a total of 78,000 estimates. Those estimates are tied to at least 77 separate indicators.
Mean distances presently cited in NED are inferred from an unweighted average of all distances per galaxy, as published. No corrections are applied to account for differences in zero-point or distance indicators, nor are outliers removed. The key objective of the present study is to report on the implementation of diverse methods for estimating the mean distance, thereby establishing a multifaceted mean estimate distance (MED) procedure. For example, a best estimate distance (BED; Harris et al. 2010) approach applied to Cen A (NGC 5128) resulted in an error-weighted mean of 3.8 ± 0.3 Mpc. Specifically, Harris et al. (2010) selected the single most precise and recent distance for each of four selected primary indicators. Eight primary indicators and numerous additional distances are available for that galaxy in NED-D. Mean distances cited in certain other compilations also follow a BED approach; for example, a weighted mean for three estimates based on two primary indicators is provided for M101 (NGC 5457) in the Extragalactic Distance Database (EDD; Tully et al. 2016). For that galaxy NED-D features 112 estimates based on six primary indicators. Other compilations include the Updated Nearby Galaxy Catalog (UNGC; Karachentsev, Makarov, & Kaisina 2013), and the HyperLEDA catalog (Makarov, Prugniel, & Terekhova 2014).
It is desirable that an enhanced MED approach be relayed to researchers when selecting extragalactic distance estimates. Six means shall be presented to researchers: an unweighted mean of the (1) distance estimates or (2) indicators; a weighted mean of the indicators based on either (3) distance error or (4) date of publication; and a Decision Tree mean involving either (5) a BED approach based on selected estimates per indicator or (6) a mean of indicators weighted by preference. A seventh mean that combines a subset of the aforementioned is likewise provided.
This study is organized as follows. In Section 2, descriptions are provided for the different indicators and indicator categories, the placement of estimates onto a common scale, and the clipping of estimate outliers. The different MEDs are described in Section 3, where they are evaluated for the LMC, the primary extragalactic distance scale zero-point, and for 40 Messier galaxies including M106, an alternate distance scale zero-point galaxy. Conclusions regarding the determination of mean distances for galaxies with multiple estimates are summarized in Section 4.

Distance Indicators, Distance Scales, and Outliers
The NED compilation of distances (NED-D) is described in Steer et al. 2017. Briefly, the distances include published peer-reviewed estimates since 1980, as well as some vetted non-peer-reviewed distances. At least 77 distance indicators are currently in use (Table 1), and hence NED facilitates the identification of systematic uncertainties.
The placement of indicators into categories has been revised. Distances were previously classified as primary standard candle, primary standard ruler, or secondary. A new category is introduced to recognize that certain indicators are neither primary nor secondary. The added category accounts for 19 tertiary indicators that are imprecise, and includes distances based on the Infrared Astronomical Satellite (IRAS) indicator. The precision of primary and secondary estimates is typically 10% and 20%, respectively. Accordingly, primary distances are weighted four times greater than secondary estimates (e.g., Tully et al. 2013).
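The 4:1 ratio is what inverse-variance weighting yields for typical precisions of 10% and 20%. The following is a minimal Python sketch of that arithmetic, assuming inverse-variance weighting is the rationale; the function names are illustrative and are not part of NED-D.

```python
def inverse_variance_weight(fractional_precision):
    """Weight proportional to 1 / sigma^2 for a fractional uncertainty."""
    return 1.0 / fractional_precision ** 2


def combine_primary_secondary(primary_mean, secondary_mean,
                              primary_precision=0.10,
                              secondary_precision=0.20):
    """Weighted mean of a primary and a secondary mean distance.

    The default precisions are the typical values quoted in the text;
    treating them as Gaussian sigmas is an assumption of this sketch.
    """
    w_p = inverse_variance_weight(primary_precision)
    w_s = inverse_variance_weight(secondary_precision)
    return (w_p * primary_mean + w_s * secondary_mean) / (w_p + w_s)


# Ratio of primary to secondary weight for 10% versus 20% precision.
ratio = inverse_variance_weight(0.10) / inverse_variance_weight(0.20)
print(round(ratio, 6))  # 4.0
```

With these defaults, a primary mean of 10 Mpc and a secondary mean of 20 Mpc would combine to 12 Mpc.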
Distances in NED can be tied to different scales, and can assume either a Hubble constant or LMC modulus. Distances can be placed onto a homogenized scale, however, by making use of the two ancillary data columns provided. Distances based on a Hubble constant offset from H0 = 70 km s⁻¹ Mpc⁻¹ are noted in an ancillary column. Similarly, distances based on an LMC modulus offset from μ0 = 18.50 mag are likewise noted.
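A minimal Python sketch of such a homogenization is given here, under stated assumptions: distances that depend on the Hubble constant scale as 1/H0, and a distance modulus calibrated against the LMC shifts by the offset between the assumed LMC modulus and 18.50 mag. The function names and the way the assumed values are passed in are hypothetical; the actual layout of the NED-D ancillary columns is not reproduced.

```python
import math

H0_REFERENCE = 70.0            # km/s/Mpc, reference value cited in the text
LMC_MODULUS_REFERENCE = 18.50  # mag, reference value cited in the text


def modulus_to_mpc(modulus):
    """Convert a distance modulus (mag) to a distance in Mpc."""
    return 10.0 ** (modulus / 5.0 - 5.0)


def mpc_to_modulus(distance_mpc):
    """Convert a distance in Mpc to a distance modulus (mag)."""
    return 5.0 * math.log10(distance_mpc) + 25.0


def rescale_hubble(distance_mpc, h0_assumed):
    """Rescale an H0-dependent distance to H0 = 70 (distance ~ 1/H0)."""
    return distance_mpc * h0_assumed / H0_REFERENCE


def rescale_lmc(modulus, lmc_modulus_assumed):
    """Shift a modulus to the scale on which the LMC modulus is 18.50 mag."""
    return modulus + (LMC_MODULUS_REFERENCE - lmc_modulus_assumed)


# A distance of 100 Mpc published with H0 = 75 grows on the H0 = 70 scale.
print(rescale_hubble(100.0, 75.0))
# A modulus of 29.00 mag tied to an LMC modulus of 18.40 mag shifts upward.
print(rescale_lmc(29.00, 18.40))
```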
Published uncertainty estimates are inhomogeneous, and may be tied to a weighted approach, standard deviation, standard error, formal uncertainties tied to least-squares fitting routines, a quadrature sum of standard error and systematic uncertainties, etc. For this study the uncertainties are adopted verbatim and no adjustments are made for the aforementioned differences. For all MEDs computed the standard deviation is cited, even for weighted mean approaches.

Mean Distances for the LMC and M106
MEDs were determined once tertiary distances were discarded, primary and secondary distances were placed on a common scale, and 3σ outliers were excluded. The MED methods employed are summarized in Table 2.
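The outlier step can be illustrated with a short Python sketch. It assumes a single clipping pass about the sample mean using the sample standard deviation; the text does not state whether the clipping is iterated or which estimator of scatter is used, so both choices are assumptions, and `clip_outliers` is a hypothetical name.

```python
import statistics


def clip_outliers(distances, n_sigma=3.0):
    """Return the distances lying within n_sigma standard deviations of the mean.

    Single pass, sample standard deviation (n - 1 in the denominator).
    """
    if len(distances) < 3:
        return list(distances)  # too few points to define a meaningful scatter
    mean = statistics.mean(distances)
    sigma = statistics.stdev(distances)
    if sigma == 0:
        return list(distances)
    return [d for d in distances if abs(d - mean) <= n_sigma * sigma]


# Toy sample (kpc): 21 values near 50 plus one discrepant estimate.
toy_sample = [49.0, 50.0, 51.0] * 7 + [80.0]
clipped = clip_outliers(toy_sample)
print(len(toy_sample), len(clipped))
```

Note that a single deviant point can only exceed 3σ when the sample is large enough (more than about ten values), so clipping of this kind has no effect on galaxies with only a handful of estimates.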
The unweighted mean of the 940 primary distance estimates to the LMC implies 49.57 ± 3.03 kpc (MED 1). The unweighted mean of the mean distances for each of 17 primary indicators is 50.08 ± 1.58 kpc (MED 2). The unweighted mean of the error-weighted mean distances for each of 17 indicators is 50.29 ± 1.37 kpc (MED 3), and importantly, the uncertainty cited here is the standard deviation rather than the canonical weighted uncertainty. The unweighted mean of the date-weighted mean distances for each of 17 indicators is 50.31 ± 1.46 kpc (MED 4). The date-weighting scheme adopted for MED 4 is 1.258^n, where n is the number of years between publication and 1980. Estimated distances to the LMC, including means for each primary indicator and based on each MED, are shown in Figure 1.
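The four means above can be sketched in Python as follows. This is an illustration under stated assumptions, not the NED Team program: "error-weighted" is taken to mean inverse-variance weights, the cited scatter is taken to be the sample standard deviation, and the record layout (`indicator`, `distance`, `error`, `year`) and the toy values are hypothetical. The factor 1.258^n is taken from the text; it corresponds to roughly a tenfold increase in weight per decade.

```python
import statistics
from collections import defaultdict

BASE_YEAR = 1980
DATE_BASE = 1.258  # per-year weighting factor quoted in the text


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def med1(estimates):
    """MED 1: unweighted mean and standard deviation of all estimates."""
    d = [e["distance"] for e in estimates]
    return statistics.mean(d), statistics.stdev(d)


def med_per_indicator(estimates, weight_fn):
    """Weighted mean per indicator, then unweighted mean across indicators."""
    groups = defaultdict(list)
    for e in estimates:
        groups[e["indicator"]].append(e)
    means = [
        weighted_mean([e["distance"] for e in g], [weight_fn(e) for e in g])
        for g in groups.values()
    ]
    return statistics.mean(means), statistics.stdev(means)


def med2(estimates):
    """MED 2: unweighted mean of the unweighted indicator means."""
    return med_per_indicator(estimates, lambda e: 1.0)


def med3(estimates):
    """MED 3: indicator means weighted by 1 / error^2 (assumed form)."""
    return med_per_indicator(estimates, lambda e: 1.0 / e["error"] ** 2)


def med4(estimates):
    """MED 4: indicator means weighted by 1.258^n, n = years since 1980."""
    return med_per_indicator(
        estimates, lambda e: DATE_BASE ** (e["year"] - BASE_YEAR))


# Toy data in kpc; not drawn from NED-D.
sample = [
    {"indicator": "Cepheids", "distance": 49.0, "error": 1.0, "year": 2000},
    {"indicator": "Cepheids", "distance": 51.0, "error": 2.0, "year": 2010},
    {"indicator": "TRGB", "distance": 50.0, "error": 1.0, "year": 2005},
    {"indicator": "TRGB", "distance": 52.0, "error": 1.0, "year": 2015},
]
for name, fn in [("MED 1", med1), ("MED 2", med2),
                 ("MED 3", med3), ("MED 4", med4)]:
    mean, sd = fn(sample)
    print(f"{name}: {mean:.2f} +/- {sd:.2f} kpc")
```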
Regarding method 4, attention has been drawn to the fact that over time, distances and indicators improve in precision and accuracy (e.g., Helou & Madore 1988; de Grijs, Wicker, & Bono 2014). Decade-over-decade improvement in the standard deviation among Hubble constants published since 1980 demonstrates this, as shown in Figure 2. Controversy in the 1980s over whether the Hubble constant was close to 50 or 100 km s⁻¹ Mpc⁻¹ has been reduced to whether it is closer to 68 or 73, depending on whether global values based on the cosmic microwave background radiation (Bennett et al. 2013; Planck Collaboration et al. 2018) or local values based on Cepheid-calibrated Type Ia supernovae (Freedman et al. 2012; Riess et al. 2016) are assumed. Published Hubble constant estimates in the most recent decade favor the global value, but do not rule out the local one. Interestingly, one of the teams behind one of the local values has recently found more conclusive evidence for the global value, based on an independent calibration of the distance scale using the tip of the red giant branch (TRGB) indicator (Freedman et al. 2019).
To within the standard deviation, error weighting and date weighting do not impact the mean distance obtained. Two Decision Tree approaches were evaluated. The first follows a BED approach based on selecting the single most precise estimate per indicator and, in case of a tie, the most recent. For the LMC, this provides a distance of 49.90 ± 1.78 kpc (MED 5). The second Decision Tree approach weights each indicator based on a precomputed ranking. The subjective ranking is relayed in Table 1. For the LMC the result is 49.33 ± 1.48 kpc (MED 6). Both Decision Tree approaches thus also supply consistent distances to within the standard deviations. Table 3 hosts the mean distances to the LMC based on each of the 17 indicators, and each MED. A seventh mean is likewise employed, and combines the literature-preferred BED approach of selecting estimates (MED 5) with a weighted mean of MEDs 2, 3, and 4. Method 1 is not included because it exhibits the most scatter and is the mean of distances regardless of indicator, while method 6 is excluded owing to its increased subjectivity. The unweighted (MED 2), error-weighted (MED 3), and date-weighted (MED 4) means were combined with weights of 1:2:4, respectively. The result is a combined MED 7 estimate of 50.09 ± 1.61 kpc (m-M = 18.499 ± 0.069 mag).
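The selection rule behind MED 5 (most precise estimate per indicator, with ties broken by the most recent publication) can be sketched in Python as below. Several points are assumptions of this sketch rather than statements from the text: precision is compared using the quoted absolute error, the selected estimates are averaged without weights, and the record layout and toy values are hypothetical. The ranking-based MED 6 and the combined MED 7 are not reproduced, since the text does not give enough detail to do so.

```python
import statistics


def select_best_per_indicator(estimates):
    """Pick one estimate per indicator: smallest error, then latest year."""
    best = {}
    for est in estimates:
        key = (est["error"], -est["year"])
        current = best.get(est["indicator"])
        if current is None or key < (current["error"], -current["year"]):
            best[est["indicator"]] = est
    return best


def med5(estimates):
    """Unweighted mean and standard deviation of the selected estimates."""
    chosen = select_best_per_indicator(estimates).values()
    distances = [e["distance"] for e in chosen]
    return statistics.mean(distances), statistics.stdev(distances)


# Toy data in kpc; not drawn from NED-D.
bed_sample = [
    {"indicator": "Cepheids", "distance": 49.0, "error": 1.0, "year": 2000},
    {"indicator": "Cepheids", "distance": 51.0, "error": 1.0, "year": 2010},
    {"indicator": "TRGB", "distance": 50.0, "error": 0.5, "year": 2005},
    {"indicator": "TRGB", "distance": 52.0, "error": 1.0, "year": 2015},
]
selected = select_best_per_indicator(bed_sample)
bed_mean, bed_sd = med5(bed_sample)
print(selected["Cepheids"]["year"], selected["TRGB"]["year"])
print(f"{bed_mean:.2f} +/- {bed_sd:.2f} kpc")
```

In the toy data the two Cepheid estimates tie on error, so the more recent one is selected, while the TRGB choice is decided by precision alone.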
For the LMC all six methods and the combined method 7 produce a consistent distance, and the standard deviation for each MED is ~3%. The mean distances for the LMC support the Pietrzynski et al. (2019) finding. Pietrzynski et al. (2019) obtained 49.59 ± 0.09 (stat.) ± 0.54 (syst.) kpc based on 20 eclipsing binaries. The canonical LMC distance of 50.1 ± 2.5 kpc adopted by the Hubble Space Telescope Key Project (Freedman et al. 2001), with an accuracy of 5%, remains within 1% of the eclipsing binary determination as well as all but one of the MED estimates.
A comparison was likewise carried out on 40 Messier galaxies. M106 is of particular interest as an alternate zero-point, and the primary standard ruler megamaser-based distance is 7.54 ± 0.23 Mpc (Riess et al. 2016). For M106 there are 112 distances, of which 3 are tertiary and excluded. Another 3 are discarded as 3σ outliers, leaving 106 distances. Those include 8 distances based on megamasers, for which only the Riess et al. 2016 result is examined. The different MED methods again produce consistent distances to within the standard deviations, resulting in a mean for the six MEDs based on primary indicators of 7.49 ± 0.02 Mpc. Results are presented in Table 4, and displayed in Figure 3. Table 5 hosts the mean distances to 40 Messier galaxies based on MEDs 2, 3, 4, and 5. In that table, primary and secondary distances are combined and weighted at 4:1. All mean distance estimates for the 40 galaxies were calculated manually.
To determine MEDs for the entire ensemble of ~272,000 NED distances, a Python program was created by a visiting member of the NED Team (Michael Randall). The results were subsequently compared with redshift-based distances. The latter sample featured galaxies with distances greater than 5 Mpc, and heliocentric recessional velocities greater than 300 km s⁻¹. That excludes nearby galaxies with high peculiar velocities, and galaxies with low or negative recessional velocities. The comparison was also limited to galaxies with a heliocentric recessional velocity of v < 32,000 km s⁻¹, and with mean distances within 1,000 Mpc. Linear distances were determined assuming a Hubble constant of H0 = 70 km s⁻¹ Mpc⁻¹. Note that only galaxies with multiple distance estimates are viable for MED evaluations.
For the 11,699 galaxies available, the mean redshift-based distance is 85.3 Mpc, and the mean redshift-independent distance from the six MEDs is 87.5 Mpc. Again, the six MEDs and the combined MED 7 provide distances consistent to within the standard deviations. Mean distances for the ensemble based on all six MEDs and an added seventh method are presented in the machine-readable version of Table 6, and are inferred from 78,228 eligible distances. A representative sample is shown here in Table 6, for guidance. A Hubble graph for the 11,699 galaxies, and their positions in galactic coordinates, are shown in Figures 4 and 5, respectively.
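The selection cuts and the linear redshift-based distance used in this comparison can be sketched in Python as follows. The thresholds and H0 = 70 are taken from the text; the function names, the treatment of the 1,000 Mpc bound as inclusive, and the toy inputs are assumptions of this sketch, which is not the NED Team program.

```python
H0 = 70.0  # km/s/Mpc, value assumed in the text


def hubble_distance_mpc(velocity_km_s, h0=H0):
    """Linear redshift-based distance, d = v / H0."""
    return velocity_km_s / h0


def passes_cuts(mean_distance_mpc, velocity_km_s):
    """Apply the distance and heliocentric velocity cuts quoted in the text."""
    return (5.0 < mean_distance_mpc <= 1000.0
            and 300.0 < velocity_km_s < 32000.0)


# Toy galaxies as (mean redshift-independent distance in Mpc, velocity in km/s).
toy_galaxies = [(3.0, 500.0), (50.0, 3500.0), (50.0, 40000.0), (95.0, 7000.0)]
kept = [(d, v) for d, v in toy_galaxies if passes_cuts(d, v)]
for d, v in kept:
    print(d, hubble_distance_mpc(v))
```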

Summary and Discussion
Establishing reliable distances for galaxies with primary and secondary distances is a first step in calibrating indicators, providing an improved distance scale and Hubble constant, and aiding determination of the latter's evolution. As a result, the NED team evaluated six methods to estimate MEDs, with the aim, in part, of providing fellow researchers additional pertinent information that may facilitate the identification of systematic trends. Those MEDs are summarized in Table 2. All six MEDs produce consistent distances to within the standard deviations for the LMC, M106, all 40 Messier galaxies, and in general among all galaxies with multiple distances in NED (n = 11,699).
In this first benchmarking of the cited approaches, the MED distances determined for the LMC are consistent with one another and agree with the Pietrzynski et al. (2019) result to within 1%. For M106 the distances computed are likewise consistent and within 1% of the fiducial megamaser-based estimate (Riess et al. 2016).
New distances, inferred from indicators that include the Fundamental Plane and Brightest Cluster Galaxy methods, will be provided for ~320,000 galaxies in an update planned for 120,000 galaxies with distances (Saulder et al. 2016). Those galaxies will benefit from MEDs, since each will possess on average four estimates based on at least two indicators.
Repeated consistency among MEDs in multiple applications increases confidence in the cosmological distance scale and the estimates it is based on. It indicates both are surprisingly free from unknown systematic errors, unless such unknowns cancel fortuitously. Overall, NED-D is pertinent for a diverse suite of research topics, such as aiding those on the Swope observatory team to quickly identify NGC 4993 as the host of gravitational wave GW170817 (Drout et al. 2017).
The anonymous referee was as important to this article as the author, who appreciates and admires the dedication. It has been an honor to serve with members of the NED Team, past and present, including Kay Baker, Ben H. P. Chan Additional generous support to IS from the Carnegie Institution of Canada is also gratefully appreciated.

a Messier 102 ID as NGC 5866 to be confirmed.
b The last row, Source = "Total/Mean," includes the total number of observations and the mean of 40 values listed in each column of the table.